Hack 86. A Webmaster's Introduction to Google

Steps to take for optimal Google indexing of your site.

The cornerstone of any good search engine is highly relevant results. Google's unprecedented success has been due to its uncanny ability to match quality information with a user's search terms. At the core of Google's search results is a patented algorithm called PageRank.

There is an entire industry focused on getting sites listed near the top of search engines. Google has proven to be the toughest search engine for a site to do well on. Even so, it isn't all that difficult for a new web site to get listed and begin receiving some traffic from Google.

Learning the ins and outs of getting your site listed by a search engine can be a daunting task. There is a vast array of information about search engines on the Web, and not all of it is useful or proper. This discussion of getting your site into the Google database focuses on long-term techniques for successfully promoting your site through Google, helping to avoid some of the common misconceptions and problems that a new site owner might face.

8.7.1. Search Engine Basics

When you type a term into a search site, the engine looks up potential matches in its database and presents the best web page matches first. How those web pages get into the database, and consequently, how you can get yours in there too, is a three-step process:

  1. A search engine visits a site with an automated program called a spider (sometimes called a robot). A spider is just a program similar to a web browser that downloads a site's pages. It doesn't actually display the page anywhere; it just downloads the page data.

  2. After the spider has acquired the page, the search engine passes the page to a program called an indexer, which is another robotic program that extracts most of the visible portions of the page. The indexer also analyzes the page for keywords, the title, links, and other important information contained in the code.

  3. The search engine adds your site to its database and makes it available to searchers. The greatest difference between search engines is in this final step where ranking or result position under a particular keyword is determined.
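The indexer step can be illustrated with a minimal sketch, assuming only Python's standard `html.parser` module. The names `PageIndexer` and `index_page` are hypothetical, and this is in no way how Google's own indexer works; it only shows the kind of extraction described above (title, links, and visible words).

```python
from html.parser import HTMLParser


class PageIndexer(HTMLParser):
    """Collects the title, link targets, and visible words of one page."""

    SKIPPED = {"script", "style"}  # content a reader never sees

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self.words = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data.strip()
        elif not self._skip_depth:
            # Visible text is lowercased and split into words.
            self.words.extend(data.lower().split())


def index_page(html):
    """Return the title, links, and visible words found in an HTML string."""
    parser = PageIndexer()
    parser.feed(html)
    parser.close()
    return {"title": parser.title, "links": parser.links, "words": parser.words}
```

The links collected here are what a spider would follow next, and the word list is the raw material a ranking step would work from.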

8.7.2. Submitting Your Site to Google

The first step is to get your pages listed in the database, and there are two ways to go about it. The first is direct submission of your site's URL to Google via its add URL or submission page. To counter programmed robots, search engines routinely move submission pages around on their sites. You can find Google's submission page linked from its Help pages or Webmaster Info pages (http://www.google.com/addurl.html).

Just visit Google's add URL page, enter the URL of your site's main index page in the submission form, and press Submit. Google's spider (called GoogleBot) will usually visit your page within four weeks. The spider will traverse all pages on your site and add them to its index. Within eight weeks, you should be able to find your site listed in Google.

The second way to get your site listed is to let Google find you based on links that may be pointing to your site. Once GoogleBot finds a link to your site from a page it already has in its index, it will visit your site.

Google has been updating its database on a monthly basis for three years. It sends its spider out in crawler mode once a month, too. Crawler mode is a special mode in which a spider traverses, or crawls, the entire Web. As it runs into links to pages, it indexes those pages in a never-ending attempt to download all the pages it can. Once your pages are listed in Google, they are revisited and updated on a monthly basis. If you frequently update your content, Google may reindex your pages more often.
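To see why an inbound link is enough to get a site found, here is a minimal sketch of link-following discovery. It assumes the Web is represented as an in-memory dictionary mapping each page to the pages it links to; the function name `crawl` and the example hostnames are hypothetical, and a real spider would fetch pages over HTTP instead.

```python
from collections import deque


def crawl(link_graph, seed_pages):
    """Return pages in the order a breadth-first spider would discover them."""
    discovered = list(seed_pages)
    seen = set(seed_pages)
    queue = deque(seed_pages)
    while queue:
        page = queue.popleft()
        # Follow every link on the current page to pages not yet seen.
        for target in link_graph.get(page, []):
            if target not in seen:
                seen.add(target)
                discovered.append(target)
                queue.append(target)
    return discovered


# Hypothetical miniature Web: one page already in the index links to a new site.
web = {
    "known.example/": ["known.example/links.html"],
    "known.example/links.html": ["newsite.example/"],
    "newsite.example/": ["newsite.example/about.html"],
    "orphan.example/": [],  # nothing links here, so it is never found
}
print(crawl(web, ["known.example/"]))
```

In this sketch the new site is reached only because a page already in the index links to it, while the page with no inbound links is never visited.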

Once you are indexed and listed in Google, the next natural question for a site owner is, "How can I rank better under my applicable search terms?"

8.7.3. The Search Engine Optimization Template

This is my general recipe for the ubiquitous Google. It is generic enough that it works well everywhere. It's as close as I have come to a "one-size-fits-all" SEO (that's Search Engine Optimization) template.

Use your targeted keyword phrase:

  • In META keywords. They are not necessary for Google, but they are a good habit. Keep your META keywords short (128 characters max, or 10 keywords).

  • In META description. Keep the keyword close to the left but in a full sentence.

  • In the title at the far left but possibly not as the first word.

  • In the top portion of the page in the first sentence of the first full paragraph (plain text: no bold, no italic, no style).

  • In an H3 or larger heading.

  • In bold: in the second paragraph if possible, and anywhere but the first usage on the page.

  • In italic—anywhere but the first usage.

  • In subscript/superscript.

  • In URL (directory name, filename, or domain name). Do not duplicate the keyword in the URL.

  • In an image filename used on the page.

  • In the ALT attribute of that image.

  • In the title attribute of that image.

  • In link text to another site.

  • In an internal link's text.

  • In the title attribute of all links pointing into and out of the page.

  • In the filename of your external CSS (Cascading Style Sheet) or JavaScript file.

  • In an inbound link on site (preferably from your home page).

  • In an inbound link from off site (if possible).

  • In a link to a site that has a PageRank of 8 or better.
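A few of the placements in this template can be checked mechanically. The following is a rough sketch, assuming Python's standard `html.parser`; the class `PlacementChecker` and function `keyword_placements` are hypothetical names, and the check is a simple substring match that treats hyphens and underscores as spaces. It covers only some of the items above and says nothing about how Google weighs them.

```python
from html.parser import HTMLParser


def _normalize(text):
    """Lowercase and treat hyphens/underscores as spaces (for filenames)."""
    return text.lower().replace("-", " ").replace("_", " ")


class PlacementChecker(HTMLParser):
    """Records which page locations contain the keyword phrase."""

    # Tags whose text content is checked, mapped to a report label.
    TEXT_TAGS = {"title": "title", "h1": "heading", "h2": "heading",
                 "h3": "heading", "b": "bold", "strong": "bold",
                 "i": "italic", "em": "italic", "a": "link text"}

    def __init__(self, phrase):
        super().__init__()
        self.phrase = _normalize(phrase)
        self.found = set()
        self._open = []  # tracked tags that are currently open

    def _check(self, label, text):
        if text and self.phrase in _normalize(text):
            self.found.add(label)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("keywords", "description"):
                self._check("meta " + name, attrs.get("content"))
        elif tag == "img":
            self._check("image alt", attrs.get("alt"))
            self._check("image filename", attrs.get("src"))
        elif tag in self.TEXT_TAGS:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if tag in self._open:
            # Remove the most recently opened matching tag.
            idx = len(self._open) - 1 - self._open[::-1].index(tag)
            del self._open[idx]

    def handle_data(self, data):
        for tag in self._open:
            self._check(self.TEXT_TAGS[tag], data)


def keyword_placements(html, phrase):
    """Return a sorted list of the checked locations containing the phrase."""
    checker = PlacementChecker(phrase)
    checker.feed(html)
    checker.close()
    return sorted(checker.found)
```

Running `keyword_placements(page_html, "your phrase")` on a page returns labels such as `"title"` or `"meta description"` for each location where the phrase was found, which makes gaps in the template easy to spot.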

Other search engine optimization things to consider include:

  • Use "last modified" headers if you can.

  • Validate your HTML. Some feel Google's parser has become stricter rather than more lenient. It will miss an entire page because of a few simple errors; we have tested this in depth.

  • Use an HTML template throughout your site. Google can spot the template and parse it off. (Of course, this also means they are pretty good at spotting duplicate content.)

  • Keep the page on an .html or .htm extension. Any dynamic extension is a risk.

  • Keep the HTML below 20K; 5 to 15K is the ideal range.

  • Keep the ratio of text to HTML very high. Text should outweigh HTML by significant amounts.

  • Double-check your page in Netscape, Opera, and Internet Explorer. Use Lynx if you have it.

  • Use only raw HREFs for links. Keep JavaScript far, far away from links. The simpler the link code the better.

  • The traffic comes when you figure out that 1 referral a day to 10 pages is better than 10 referrals a day to 1 page.

  • Don't assume that keywords in your site's navigation template will be worth anything at all. Google looks for full sentences and paragraphs. Keywords just lying around orphaned on the page are not worth as much as when used in a sentence.
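The page-size and text-to-HTML guidelines above lend themselves to a quick measurement. This is a minimal sketch, assuming Python's standard `html.parser`, sizes counted in UTF-8 bytes, and "20K" read as 20 × 1024 bytes; the function name `page_stats` and the exact definition of the ratio are my own assumptions, not anything Google publishes.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Gathers text outside of script and style elements."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def page_stats(html, max_bytes=20 * 1024):
    """Measure page size and the share of it that is visible text."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    # Collapse runs of whitespace so indentation does not count as text.
    text = " ".join("".join(collector.chunks).split())
    total = len(html.encode("utf-8"))
    text_bytes = len(text.encode("utf-8"))
    return {
        "html_bytes": total,
        "text_bytes": text_bytes,
        "text_ratio": text_bytes / total if total else 0.0,
        "within_limit": total < max_bytes,
    }
```

A page whose `text_ratio` is very low is mostly markup, which is the situation the guideline warns against; what counts as "high enough" is a judgment call that this sketch does not make.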

Brett Tabke