<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jalaj P. Jha &#187; Search Engine</title>
	<atom:link href="http://jalaj.net/tag/search-engine/feed/" rel="self" type="application/rss+xml" />
	<link>http://jalaj.net</link>
	<description>Technical &#38; Miscellaneous Ramblings</description>
	<lastBuildDate>Sun, 05 Sep 2010 23:54:46 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Google Hot Trends Indexing to Cool Down</title>
		<link>http://jalaj.net/2009/07/05/google-hot-trends-indexing-to-cool-down/</link>
		<comments>http://jalaj.net/2009/07/05/google-hot-trends-indexing-to-cool-down/#comments</comments>
		<pubDate>Sun, 05 Jul 2009 20:50:03 +0000</pubDate>
		<dc:creator>Jalaj</dc:creator>
				<category><![CDATA[Google]]></category>
		<category><![CDATA[Google Trends]]></category>
		<category><![CDATA[Search Engine]]></category>
		<category><![CDATA[Spam]]></category>

		<guid isPermaLink="false">http://jalaj.wordpress.com/2009/07/05/google-hot-trends-indexing-to-cool-down/</guid>
		<description><![CDATA[Google Hot Trends, launched on 15 May 2007, has inspired a large number of blogs and sites to build content around the hot keywords it lists. While some sites produced remarkable content this way, many churned out brainless content that mashed up the keywords and [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://google.com/trends/hottrends">Google Hot Trends</a>, launched on 15 May 2007, has inspired a large number of blogs and sites to build content around the hot keywords it lists. While some sites produced remarkable content this way, many churned out brainless content that mashed up the keywords and nothing else. Some, out of ideas, even replicated the Hot Trends site itself.</p>
<p>Google’s index is full of pages that are nothing but copies of the Google Hot Trends site (some of them owned by Google itself). Here is a glimpse:</p>
<ul>
<li><a href="http://jalaj.files.wordpress.com/2009/07/image.png"><img style="display:inline;margin-left:0;margin-right:0;border-width:0;" title="image" border="0" alt="image" align="right" src="http://jalaj.files.wordpress.com/2009/07/image_thumb.png" width="371" height="307" /></a> m.blogger.com/trends/hottrends </li>
<li>pro.blogger.com/trends/hottrends </li>
<li>www.freesc.com/trends/hottrends </li>
<li>m.googlearth.de/trends/hottrends </li>
<li>www.rechargeit.org/trends/hottrends </li>
<li>www.ggoogle.com/trends/hottrends </li>
<li>www.blogger.ae/trends/hottrends </li>
<li>pro1.blogger.com/trends/hottrends </li>
<li>www.gppglr.com/trends/hottrends </li>
<li>www.gmail.fr/trends/hottrends </li>
<li>www.heima021.com/trends/hottrends </li>
<li>www.janinaordmann.com/trends/hottrends </li>
<li>www.nihilisme.ca/trends/hottrends </li>
<li>www.ngauthier.com/trends/hottrends </li>
<li>www.googleanalytics.ru/trends/hottrends </li>
<li>www.sharesigns.com/trends/hottrends </li>
<li>www.ciecet.net/trends/hottrends </li>
<li>wireless.blogger.com/trends/hottrends </li>
<li>pro2.blogger.com/trends/hottrends </li>
</ul>
<p>While the intentions of the other site owners in replicating these pages are easy to understand (they want traffic), it is hard to imagine why Google would replicate its own data, and on several subdomains of the same domain at that: m.blogger.com, pro.blogger.com, pro1.blogger.com, pro2.blogger.com and wireless.blogger.com.</p>
<p><a href="http://jalaj.files.wordpress.com/2009/07/image1.png"><img style="border-bottom:0;border-left:0;display:inline;margin-left:0;border-top:0;margin-right:0;border-right:0;" title="image" border="0" alt="image" align="right" src="http://jalaj.files.wordpress.com/2009/07/image_thumb1.png" width="244" height="108" /></a> Anyway, Google now seems to have decided to remove these pages from its index and, to that end, has added exclusion rules to its robots.txt. The change seems to have been made on 2 July. You may be wondering: this step can keep Google-owned sites out of the index, but what about the other sites, which are not under Google’s control? Well, it will work for those too, because the sites replicating the Google Hot Trends page are replicating the robots.txt as well <img src='http://jalaj.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
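<p>Here is a rough sketch of why a copied robots.txt blocks the mirrors too. The rule below is a hypothetical stand-in (this post does not quote the line Google actually added), and is_crawlable is an illustrative helper name. A crawler reads robots.txt separately for each host, so any mirror serving the same file ends up with the same path excluded.</p>

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule: the post does not quote the actual robots.txt entry.
ROBOTS_TXT = """\
User-agent: *
Disallow: /trends/hottrends
"""

def is_crawlable(host, path="/trends/hottrends"):
    """Return True if a crawler obeying ROBOTS_TXT may fetch the page on host."""
    parser = RobotFileParser()
    # Every mirror is assumed to serve the same copied robots.txt.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch("*", "http://" + host + path)

# A few hosts taken from the lists in this post.
mirrors = ["m.blogger.com", "www.ggoogle.com", "www.gmail.fr"]
blocked = [host for host in mirrors if not is_crawlable(host)]
print(blocked)
```

<p>Under these assumptions every mirror lands in the blocked list, while other paths on the same hosts stay crawlable.</p>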
<p>Let’s see how long it takes for all these pages to be removed from the index.</p>
<p>Here is the rest of the list.</p>
<ul>
<li>www.gmail.fr/trends/hottrends </li>
<li>64.233.189.100/trends/hottrends </li>
<li>www.dancrone.com/trends/hottrends </li>
<li>www.ciecet.net/trends/hottrends </li>
<li>www.worx.biz/trends/hottrends </li>
<li>www.非主流服装.com/trends/hottrends </li>
<li>www.familiekoret.com/trends/hottrends </li>
<li>www.387groep.nl/trends/hottrends </li>
<li>www.ggoogle.com/trends/hottrends </li>
<li>tonder.hveruge.dk/trends/hottrends </li>
<li>www.antesoft.com/trends/hottrends </li>
<li>www.googlemaps.it/trends/hottrends </li>
<li>www.algianviaggi.com/trends/hottrends </li>
<li>googleimageads.com/trends/hottrends </li>
<li>bassboutique.com/trends/hottrends </li>
<li>64.233.179.104/trends/hottrends </li>
<li>72.14.221.104/trends/hottrends </li>
<li>www.blogger.se/trends/hottrends </li>
<li>images.wwwgoogle.de/trends/hottrends </li>
<li>www.nihilisme.ca/trends/hottrends </li>
<li>www.rechargeit.org/trends/hottrends </li>
<li>www.bsp-zt.at/trends/hottrends </li>
</ul>
<p>and on… and on…</p>
]]></content:encoded>
			<wfw:commentRss>http://jalaj.net/2009/07/05/google-hot-trends-indexing-to-cool-down/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What Exactly is a Good Search Engine?</title>
		<link>http://jalaj.net/2009/06/28/what-exactly-is-a-good-search-engine/</link>
		<comments>http://jalaj.net/2009/06/28/what-exactly-is-a-good-search-engine/#comments</comments>
		<pubDate>Sun, 28 Jun 2009 21:57:39 +0000</pubDate>
		<dc:creator>Jalaj</dc:creator>
				<category><![CDATA[Google]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Webmaster Tools]]></category>
		<category><![CDATA[Bing]]></category>
		<category><![CDATA[Search Engine]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://jalaj.wordpress.com/2009/06/28/what-exactly-is-a-good-search-engine/</guid>
		<description><![CDATA[Some time back I wrote a post showing how, for a particular term, Google and Yahoo both failed to give relevant results while, surprisingly, Live Search pointed to the correct site as its first result.
This incident showed me that the definition of a good search engine is relative and depends on the keyword [...]]]></description>
			<content:encoded><![CDATA[<p>Some time back I wrote a <a href="http://jalaj.net/2009/03/08/google-is-not-enough-takeitlite/">post</a> showing how, for a particular term, Google and Yahoo both failed to give relevant results while, surprisingly, Live Search pointed to the correct site as its first result.</p>
<p>This incident showed me that the definition of a good search engine is relative and depends on the keyword you are searching for. While I always use Google for search, it was the first time I realized that live.com was not bad after all.</p>
<p><a href="http://bing.com/"><img style="border-bottom:0;border-left:0;display:inline;margin-left:0;border-top:0;margin-right:0;border-right:0;" title="image" border="0" alt="image" align="right" src="http://jalaj.files.wordpress.com/2009/06/image.png" width="320" height="231" /></a> Since then, Live.com has given way to Bing.com, a new search engine from Microsoft that promises improvements not only in result quality but also in user experience. And today, after reading <a href="http://www.nicolasprudhon.com/webmaster-tools/msnwebmaster-tools">Webmaster Tools Give “Live” a Second Life</a>, I came to another realization: the definition of a good search engine also depends on who is using it. Let me explain.</p>
<p>There are two types of people using a search engine. The first is the Researcher, who, as we all know, uses the search engine to find the information he needs. The second is the Webmaster, who makes every effort to ensure that his site is the one the Researcher visits, even when there is a lot of competition out there. We all know this process as Search Engine Optimization (SEO). At one point Google became synonymous with SEO by providing webmasters with all the tools they could use to rate and improve their sites. After that, no webmaster cared about other search engines, since tuning a site to suit Google brought extra traffic that the others could not easily generate.</p>
<p>Yahoo was the next to understand this secret, and so it came up with Site Explorer. While webmasters started giving importance to Yahoo, Live.com was resented because it never bothered to provide webmasters with similar tools.</p>
<p>Better late than never: <a href="http://www.bing.com/">Bing</a>, the new search engine by Microsoft, includes a <a href="http://www.bing.com/webmaster">Webmaster Center</a>, and webmasters have already started praising Bing. After all, at the end of the day it’s only traffic that matters!</p>
]]></content:encoded>
			<wfw:commentRss>http://jalaj.net/2009/06/28/what-exactly-is-a-good-search-engine/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Google is Not Enough &#8211; TakeITlite</title>
		<link>http://jalaj.net/2009/03/08/google-is-not-enough-takeitlite/</link>
		<comments>http://jalaj.net/2009/03/08/google-is-not-enough-takeitlite/#comments</comments>
		<pubDate>Sun, 08 Mar 2009 07:39:01 +0000</pubDate>
		<dc:creator>Jalaj</dc:creator>
				<category><![CDATA[Search Engine]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Jokes]]></category>
		<category><![CDATA[Live]]></category>
		<category><![CDATA[Take IT Light]]></category>
		<category><![CDATA[Take IT Lite]]></category>
		<category><![CDATA[takeITlight]]></category>
		<category><![CDATA[TakeITlite]]></category>
		<category><![CDATA[TechPosts]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://jalaj.wordpress.com/2009/03/08/google-is-not-enough-takeitlite/</guid>
		<description><![CDATA[Q: What’s the similarity between an NGO and the IT industry?     A: Everyone knows they exist but why… no one knows!

This is one of the few Q&#38;As that bring a smile to your face. I received it in a forwarded mail. The mail contained some jokes set on an aesthetic image and was [...]]]></description>
			<content:encoded><![CDATA[<blockquote><p>Q: What’s the similarity between an NGO and the IT industry?     <br />A: Everyone knows they exist but why… no one knows!</p>
</blockquote>
<p>This is one of the few Q&amp;As that bring a smile to your face. I received it in a forwarded mail. The mail contained some jokes set on an aesthetic image and was titled “TakeITlite – A daily dose of refreshment”. If you have been reading this blog for a long time, you might already know <a href="http://jalaj.net/2007/07/17/email-forwarding-all-fun-and-no-responsibility/">how seriously I take</a> <a href="http://jalaj.net/2008/02/20/email-forward-related-to-osama-bin-laden-virus/">forwarded mails</a>. But this mail, with its IT-related jokes, was a little different: I grew interested in getting this dose of refreshment daily and opened Google to search for its origin. Did I forget to mention that while the mail carried all the copyright notices and branding, there was no site address on it?</p>
<p>I checked Google for all combinations of TakeITlite and Daily Dose of Refreshment but failed. The blogs on Blogspot and Rediffland were not the source. Judging by the fifth result, the term even seemed to have made it into Google Hot Trends sometime in February. I checked that page too but still got no clue about the source. Like most people, I believe that “if it’s not on Google, it doesn’t exist”, so I stopped searching…</p>
<p><a href="http://jalaj.files.wordpress.com/2009/03/image.png"><img style="border-bottom:0;border-left:0;display:inline;border-top:0;border-right:0;" title="image" border="0" alt="image" src="http://jalaj.files.wordpress.com/2009/03/image-thumb.png" width="454" height="462" /></a> </p>
<p>A few weeks later I received another forwarded mail (too bad! I thought I could get it daily). This time I set my belief aside and, on again failing to locate the source of the mail through Google, decided to try other search engines. I used Yahoo and found that the search results were polluted by multiple subdomains of a single domain.</p>
<p><a href="http://jalaj.files.wordpress.com/2009/03/image1.png"><img style="border-bottom:0;border-left:0;display:inline;border-top:0;border-right:0;" title="image" border="0" alt="image" src="http://jalaj.files.wordpress.com/2009/03/image-thumb1.png" width="454" height="494" /></a> </p>
<p>Lastly I searched on Live.com. It returned only 4 results for the term takeitlite and, to my surprise, the site in the first result did not appear in any of the other engines; the reason is visible if you visit the site. It is the official site, but it is under construction, containing a single big image and a few words, “Magazine: Web Experience: Events &amp; Advertisements”, with no links. But at least I found what I was looking for!</p>
<p><a href="http://jalaj.files.wordpress.com/2009/03/image2.png"><img style="border-bottom:0;border-left:0;display:inline;border-top:0;border-right:0;" title="image" border="0" alt="image" src="http://jalaj.files.wordpress.com/2009/03/image-thumb2.png" width="454" height="381" /></a> </p>
<p>In the end I realized that Google is not enough. If you are not getting the results you need, go ahead and give other search engines a chance. Maybe they can help you. </p>
<p>By the way, how many of you have used live.com in the last week, the last month or the last year? I had not for months, but now I will check it once in a while. And did you hear that Live Search is soon going to give way to Kumo.com, Microsoft’s new search engine with an improved design and better results? I will write more about it later.</p>
]]></content:encoded>
			<wfw:commentRss>http://jalaj.net/2009/03/08/google-is-not-enough-takeitlite/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Cheerful Achievement &#8211; Dormant Googlebomb Talk Rises Again</title>
		<link>http://jalaj.net/2009/01/25/cheerful-achievement-dormant-googlebomb-talks-rises-again/</link>
		<comments>http://jalaj.net/2009/01/25/cheerful-achievement-dormant-googlebomb-talks-rises-again/#comments</comments>
		<pubDate>Sun, 25 Jan 2009 00:49:23 +0000</pubDate>
		<dc:creator>Jalaj</dc:creator>
				<category><![CDATA[Buzz]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Internet]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[Web]]></category>
		<category><![CDATA[Cheerful Achievement]]></category>
		<category><![CDATA[Googlebomb]]></category>
		<category><![CDATA[Linktext]]></category>
		<category><![CDATA[Miserable failure]]></category>
		<category><![CDATA[Obama]]></category>
		<category><![CDATA[Search Engine]]></category>
		<category><![CDATA[TechPosts]]></category>

		<guid isPermaLink="false">http://jalaj.wordpress.com/2009/01/25/cheerful-achievement-dormant-googlebomb-talks-rises-again/</guid>
		<description><![CDATA[While the Googlebomb has long interested technical experts in the Search Engine Optimization field, the wider world first heard of it when news broke that Google was serving George Bush's page as the first result for the keyword 'Miserable Failure'. Google said that it had modified its algorithms so that most existing Googlebombs and [...]]]></description>
			<content:encoded><![CDATA[<p>While the Googlebomb has long interested technical experts in the Search Engine Optimization field, the wider world first heard of it when news broke that Google was serving George Bush's page as the first result for the keyword 'Miserable Failure'. Google said that it had modified its algorithms so that most existing Googlebombs, and those in the making, would fade away, and people stopped talking about them (SEOs did not, just for information).</p>
<p>The Googlebomb is in the news again, with Obama's page making it to the top for the search keyword &quot;Cheerful achievement&quot;. The page lost its position soon after the news broke and every other blog started writing on the subject. But now the world again knows that Googlebombs still exist (SEOs always knew that).</p>
<p>What is a Googlebomb? Maybe a look into the history of search engines will help a little.</p>
<p>The earliest search engines indexed only part of the page: specifically the title, and the meta tags that described the site/page and listed its relevant keywords. Over time such search engines lost popularity, because meta tags were never visible to the user and so SEOs (the SEOs described in this post are Black Hat SEOs unless stated otherwise) took to stuffing them with irrelevant keywords. The site/page would then appear in search results for those irrelevant keywords, and the user would be driven to a page he didn't expect.</p>
<p>More advanced search engines took to indexing the whole page. When a user searched for particular keyword(s), they were looked up in the whole page. To determine the relative placement in the search results, each page was checked for multiple occurrences of the keyword(s), a measure termed keyword density. The higher the keyword density, the higher the placement. Needless to say, SEOs (hey, Whitehats here) took to drafting page content so that the keywords relevant to the site appeared many times on the page, while others (Blackhats!) even took to stuffing dense, irrelevant text at the bottom of the page, or hiding it in invisible or barely visible characters with the font color set to the background color.</p>
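<p>To make keyword density concrete, here is a minimal sketch. The two pages and the function name keyword_density are made up for illustration, and real engines of that era certainly used more than this one ratio. It simply divides the number of times the keyword occurs by the total number of words, then sorts pages by that value.</p>

```python
import re

def keyword_density(text, keyword):
    """Fraction of the words in text that equal keyword (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

# Two made-up pages: one written normally, one stuffed with the keyword.
pages = {
    "honest-shop": "we sell shoes and many other things",
    "stuffed-page": "cheap shoes cheap shoes buy shoes",
}

# Rank pages the way a density-only engine would: highest density first.
ranked = sorted(pages, key=lambda name: keyword_density(pages[name], "shoes"), reverse=True)
print(ranked)
```

<p>The stuffed page wins on density alone, which is exactly the weakness the Blackhats exploited.</p>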
<p>When Google made its appearance in the search engine market, it astonished everyone with highly relevant search results, which it owed to its unique algorithms. On top of the ways everyone else was deciding relative positioning, it added many others (which it calls Signals). To determine the importance of a web page, it looked at the number of web pages outside the site that linked to it, an analysis of back links that gave its research-project precursor the name 'BackRub', and associated a number with each page, which it called PageRank (out of scope for this post). It also checked the text that appeared on those links, called the 'Link Text', thus giving more importance to the keywords that other sites associate with the page/site. This gave birth to the Googlebomb.</p>
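<p>The idea of counting links from outside the site can be sketched as follows. This is only a plain tally over made-up URLs, not PageRank itself (which also weighs how important each linking page is), and external_inbound_counts is a hypothetical helper name.</p>

```python
from collections import Counter
from urllib.parse import urlparse

# Made-up (source page, target page) links, for illustration only.
links = [
    ("http://a.example/post", "http://target.example/"),
    ("http://b.example/blogroll", "http://target.example/"),
    ("http://target.example/about", "http://target.example/"),
    ("http://a.example/post", "http://other.example/"),
]

def external_inbound_counts(links):
    """Count links to each target page, ignoring links from the same host."""
    counts = Counter()
    for source, target in links:
        if urlparse(source).hostname != urlparse(target).hostname:
            counts[target] += 1
    return counts

counts = external_inbound_counts(links)
print(counts.most_common())
```

<p>The link from target.example to itself is ignored, so a site cannot raise its own count just by linking to its own pages.</p>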
<p>If you want to rank well for certain keyword(s), somehow get other sites to link to your page/site with your target keywords as the link text, and you have created a Googlebomb. In the present case, the person responsible for the “Cheerful Achievement” bomb posted his intention on <a href="http://inlogicalbearer.blogspot.com/2009/01/happy-google-bombing-to-barack.html" target="_blank" rel="nofollow">his blog</a> and asked all his friends and followers on <a href="http://twitter.com/" target="_blank">Twitter</a> to add links to their blogrolls. As Google continued to crawl and index those sites, the links to Obama’s page with the link text “Cheerful Achievement” kept growing, and within 24 hours it was the top result. </p>
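<p>A toy model of the link text signal shows how such a bomb works. The URLs, the data and the function rank_by_link_text are all made up, and this is an assumption-laden sketch rather than Google's actual scoring: pages are ordered purely by how many inbound links carry the query as their link text.</p>

```python
from collections import defaultdict

# Made-up (target page, link text) pairs as a crawler might collect them.
inbound_links = [
    ("http://president.example/", "Cheerful Achievement"),
    ("http://president.example/", "cheerful achievement"),
    ("http://president.example/", "official biography"),
    ("http://dictionary.example/cheerful", "cheerful"),
]

def rank_by_link_text(inbound_links, query):
    """Order target pages by how many inbound links carry the query as link text."""
    scores = defaultdict(int)
    wanted = query.lower()
    for target, text in inbound_links:
        if wanted in text.lower():
            scores[target] += 1
    return sorted(scores, key=scores.get, reverse=True)

print(rank_by_link_text(inbound_links, "cheerful achievement"))
```

<p>Note that the target page never has to contain the phrase itself; the words other sites put on their links are enough to make it rank in this model.</p>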
<p>Does this imply that Google search results are unreliable and that anybody can game them? No, because link text is just one of as many as 200 signals that Google takes into consideration when positioning search results. In the present case, the term “Cheerful Achievement” was something people rarely search for, so very few pages qualified to be shown on the search result page. The Googlebomb succeeded initially because there were so few qualifying pages, but it was defused as soon as the world started writing about it. Pages on the web containing these keywords multiplied, and Obama’s page, which had enjoyed the top spot thanks to the lack of competition, lost its position.</p>
<p>Google recognizes such misuse and continues to reduce the weight of link text and other signals that can be gamed, giving more weight to signals that are less liable to gaming or are not publicly known. Google continues to give the best results overall and has thus been my favorite since 2000.</p>
]]></content:encoded>
			<wfw:commentRss>http://jalaj.net/2009/01/25/cheerful-achievement-dormant-googlebomb-talks-rises-again/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
