Jalaj

January 29, 2008

HTML 5 - What it could mean for Google and Browsers

Filed under: Blogs, FireFox, Forums, Google, HTML, Internet, Internet Explorer, Pagerank, Search Engine, Web — Jalaj @ 8:56 am

I just read on Google Blogoscoped about the appearance of the HTML 5 working draft. With XHTML already out and the XHTML 2.0 working draft in development, the appearance of a next version of HTML is indeed a surprise.

Previous versions of HTML have always been subject to partial implementation and proprietary extensions by different browsers, be it Internet Explorer or Netscape. This version could be another attempt to include extensions by Google (favouring Firefox) and Apple (for Safari), though under the W3C to justify and standardize them.

What HTML 5 could mean for Google

Google search results depend a lot on links to determine the importance of the linked page. To cope with the problem of links arising from unmoderated forums and blogs, Google proposed Rel=”NoFollow”, which was adopted by almost all blog and forum software and by a large number of social networking sites. With HTML 5 this is set to become a standard. Further, HTML 5 also tries to properly categorize the contents of pages for search by proposing more values for the Rel attribute, as under:

Rel=”Search” : Google has developed various algorithms to remove duplicate content, but search pages, which are just another form of duplicate content, are still hard to track because their content is a collection of excerpts from various pages. If this option is exercised, Google will be able to filter out a lot of duplicate content beforehand.

Rel=”Tag” : The same as above applies here too… a collection of duplicate content.

Rel=”Archives” : Same… a collection of duplicate content based on dates, months, years, authors etc.

Rel=”Feed” : Same again, duplicate content! Though not useful for the search engine, it would be of value to a feed reader!

Further, Rel=”Bookmark”, Rel=”Contact”, Rel=”License” etc. would be instrumental in filtering/categorising cached pages.
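To make the guess above a bit more concrete, here is a minimal sketch of how a crawler could sort links by their Rel value. This is only my illustration and not anything Google has published: the class name RelLinkSorter, the function sort_links and the LOW_VALUE_RELS set are all made up for this post, and treating these values as "duplicate content" is my assumption, not something the draft itself says.

```python
from html.parser import HTMLParser

# Hypothetical list: rel values this post guesses a search engine might filter.
LOW_VALUE_RELS = {"nofollow", "search", "tag", "archives", "feed"}


class RelLinkSorter(HTMLParser):
    """Sorts <a> and <link> hrefs into 'counted' and 'filtered' buckets."""

    def __init__(self):
        super().__init__()
        self.counted = []   # links that would still count towards ranking
        self.filtered = []  # links set aside because of their rel value

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if not href:
            return
        # rel holds space-separated tokens; compare them case-insensitively.
        rel_tokens = set((attr_map.get("rel") or "").lower().split())
        if rel_tokens & LOW_VALUE_RELS:
            self.filtered.append(href)
        else:
            self.counted.append(href)


def sort_links(html_text):
    """Return (counted, filtered) href lists for a page's markup."""
    sorter = RelLinkSorter()
    sorter.feed(html_text)
    sorter.close()
    return sorter.counted, sorter.filtered
```

Feeding it a page with a plain link, a Rel=”NoFollow” link and a Rel=”Tag” link would put only the plain link in the counted list. A real crawler would obviously do far more than this.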

A few elements have been proposed in this draft to divide the body text, the most important being the <article> element and the <nav> element. If I were Google, I could easily parse an HTML 5 page to find out which links should carry more weight (you guessed right! those within the article element!!) and which links to consider for placing in search results as Sitelinks (those within the nav element)!
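The parsing idea above can be sketched in a few lines. Again, this is just a guess at the approach, assuming a link belongs to its innermost enclosing <article> or <nav>. The names SectionLinkCollector and collect_links are hypothetical, and how much weight each bucket should get is left open.

```python
from html.parser import HTMLParser


class SectionLinkCollector(HTMLParser):
    """Groups link hrefs by the <article> or <nav> element that encloses them."""

    def __init__(self):
        super().__init__()
        self._open = []          # stack of currently open article/nav elements
        self.article_links = []  # candidates for carrying more weight
        self.nav_links = []      # candidates for Sitelinks
        self.other_links = []    # links outside both elements

    def handle_starttag(self, tag, attrs):
        if tag in ("article", "nav"):
            self._open.append(tag)
        elif tag == "a":
            href = dict(attrs).get("href")
            if not href:
                return
            if not self._open:
                self.other_links.append(href)
            elif self._open[-1] == "article":
                self.article_links.append(href)
            else:
                self.nav_links.append(href)

    def handle_endtag(self, tag):
        # Only close a section when it matches the innermost open one.
        if tag in ("article", "nav") and self._open and self._open[-1] == tag:
            self._open.pop()


def collect_links(html_text):
    """Parse the markup and return the populated collector."""
    collector = SectionLinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector
```

After calling collect_links on a page, article_links holds the links found inside article elements and nav_links holds those found inside nav elements, which is all the separation the guess above needs.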

What HTML 5 could mean for Browsers

I am tired of guessing! Can you help with guesses here?
(Don’t be surprised if Firefox and Safari are the first browsers to implement the HTML 5 standard, if it becomes one.)

1 Comment »

  1. Interesting… I had no idea that HTML5 was on the brink of its release… well, I suppose it’s going to be worked on for quite a while longer and will take time to be fully implemented and become the standard.

    Comment by Winning Ponies — February 8, 2008 @ 9:16 pm
