01.15.09

Breaking Down The SEO Of Page Segmentation

By David Harry

One area that is worth looking at for SEO in 2009 (for me at least) is page segmentation. The approach really isn't new; I came across papers dating as far back as 1997 and beyond. Unsurprisingly, most IR methods don't just appear overnight. The big three of search have each had various research papers and patents dating back as far as 2003-04. The idea just seems to have some traction now, and it is sensible as well.

Essentially, page segmentation is when a search engine breaks a given web page down into its component parts. It can then analyze the page and assign relevance or importance scores to its different regions. Methods include fixed-length page segmentation (FixedPS), DOM-based segmentation (DomPS), segmentation based on location and white space (vision-based, or VIPS), and even a combined approach (CombPS).
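To make the simplest of these concrete, here is a rough sketch of the fixed-length idea: ignore the markup entirely and cut the page text into windows of a set number of words. The function name, the window size and the optional overlap are my own illustrative choices, not anything taken from the papers or from a search engine.

```python
def fixed_length_segments(text, window=50, overlap=0):
    """Split text into consecutive windows of `window` words.

    `window` and `overlap` are illustrative parameters (assumptions),
    not values from any published FixedPS configuration.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    words = text.split()
    step = window - overlap  # how far each new window advances
    segments = []
    for start in range(0, len(words), step):
        segments.append(" ".join(words[start:start + window]))
        if start + window >= len(words):
            break  # the last window already reached the end of the text
    return segments


if __name__ == "__main__":
    page_text = "one two three four five six seven"
    for number, segment in enumerate(fixed_length_segments(page_text, window=3), 1):
        print(number, segment)
```

Each window could then be scored on its own, which is the whole point: a page is no longer one big bag of words.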

As with many IR methodologies, the goal is to improve the signal-to-noise ratio. In this case, by identifying the noisy segments, resources can be focused on the relevant areas of a web page. Furthermore, most people tend to understand web pages in a segmented or structured way. When you arrived at this page, did you instinctively know where to find the main content? Were you aware of the common locations for navigation and other elements? Banner blindness? You get the idea.

 
Advantages of page segmentation

The main advantages are increased relevance and streamlined processing. Search engines hope to use page segmentation to assess a given page's relevance at a finer level, and also (theoretically) to deal with multi-topic pages, whether the topics are semantically related or not.

The second advantage, processing and resource management, could be achieved by defining site templates so that only the relevant parts of a page are crawled and indexed, not the boilerplate elements.

Now, while there are a few ways of going about it, what's important here is that such systems are sensible not only from a relevancy perspective, but could also help crawling and indexing resource management.

One has to imagine that new ideas at the big three will be tempered in a volatile economy. Once a template has been established, indexing a site on a regular basis could be far easier on a search engine (and the site owner as well). Just have a little 'template bot' crawl a few pages now and again to ensure the profile is unchanged... but I'm rambling now.
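The template idea above can be sketched in a few lines. This assumes each page has already been split into text blocks, and it treats any block that shows up on most pages of a site as boilerplate. The 80% cut-off and all of the function names are hypothetical, chosen only to illustrate the reasoning; this is not how any search engine is known to do it.

```python
from collections import Counter


def find_boilerplate(pages, min_share=0.8):
    """Return the blocks that occur on at least `min_share` of the pages.

    `pages` is a list of pages, each a list of text blocks.
    `min_share` is an assumed threshold, not a documented value.
    """
    counts = Counter()
    for blocks in pages:
        counts.update(set(blocks))  # count each block once per page
    threshold = min_share * len(pages)
    return {block for block, n in counts.items() if n >= threshold}


def strip_boilerplate(blocks, boilerplate):
    """Keep only the blocks that are not part of the site template."""
    return [block for block in blocks if block not in boilerplate]


def template_unchanged(sample_blocks, boilerplate):
    """The 'template bot' check: are all known template blocks still present?"""
    return boilerplate <= set(sample_blocks)


if __name__ == "__main__":
    site = [
        ["Home | About", "Article one", "Footer"],
        ["Home | About", "Article two", "Footer"],
        ["Home | About", "Article three", "Footer"],
    ]
    template = find_boilerplate(site)
    print(strip_boilerplate(site[0], template))
    print(template_unchanged(["New nav", "Article four", "Footer"], template))
```

Once the template set exists, a recrawl only needs to look at what is left after stripping it, and a spot check on a handful of pages is enough to notice a redesign.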

Another implementation (as noted in the Google patent) could be pages that contain a number of listings that are geographic in nature. A search for 'stone oven pizza, Toronto' could produce better results, since large listings of pizza shops in Toronto could be segmented and digested at a finer granularity than normal.

"The text associated with the smallest hierarchical level surrounding a business listing may be associated with that business listing" - Patent: Document segmentation based on visual gaps

Segmenting the page

The nuts and bolts I shan't trouble you with (links later, as always), but the methods vary from code analysis (DOM) approaches to vision-based ones. The main idea is establishing the common (boilerplate) segments of a web page. From there, the systems can be tuned to ever more granular levels to find an optimal setting (playing with the dials).
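For a feel of the code analysis side, here is a minimal sketch that uses Python's standard html.parser module to cut a page into segments wherever a block-level tag opens or closes. The list of tags treated as block boundaries and the class and function names are my own assumptions, and real DOM-based segmentation is a good deal more involved than this.

```python
from html.parser import HTMLParser

# Tags treated as segment boundaries (an assumed, illustrative list).
BLOCK_TAGS = {"div", "p", "td", "li", "ul", "table", "h1", "h2", "h3"}


class BlockSegmenter(HTMLParser):
    """Collect the text found between block-level tag boundaries."""

    def __init__(self):
        super().__init__()
        self.segments = []
        self._buffer = []

    def _flush(self):
        # Join the buffered text, collapse whitespace, keep non-empty segments.
        text = " ".join(" ".join(self._buffer).split())
        if text:
            self.segments.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        # Note: script and style text is not filtered out in this sketch.
        self._buffer.append(data)


def dom_segments(html):
    parser = BlockSegmenter()
    parser.feed(html)
    parser.close()
    parser._flush()  # pick up any trailing text after the last tag
    return parser.segments


if __name__ == "__main__":
    page = (
        '<div id="nav"><a href="/">Home</a> <a href="/about">About</a></div>'
        '<div id="main"><p>Page segmentation splits a page.</p>'
        "<p>Each part is scored.</p></div>"
    )
    for segment in dom_segments(page):
        print(segment)
```

The navigation links come out as one segment and each paragraph as its own, which is the kind of raw material a boilerplate or relevance score would then be applied to.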



About the Author:
David Harry is the President of Reliable SEO and has been building and marketing websites since 1998. He can be found writing about search and internet marketing on the Fire Horse Trail and is the author of the SEO Handbook series.

http://www.reliable-seo.com
http://www.huomah.com
http://www.the-seo-handbook
-- SEOArticles is an iEntry, Inc. publication --
iEntry, Inc. 2549 Richmond Rd. Lexington KY, 40509
2009 iEntry, Inc. All Rights Reserved | Privacy Policy | Legal
