What Is Your Website Telling Search Engines?

September 8th, 2004

If you are planning to have your site listed with the search engines, it is important that your site provide the search engine spiders with the information they need to do their job properly. If you see hits from search engine robots in your website stats, but your site does not show up when you search for its name, chances are the search engine did not list your site at all. There are a number of reasons this can happen; we will look at ten of them.

1 Your website uses frames
Search engine robots have problems understanding that there are other pages to be retrieved for the frame. They will generally index only the frameset and not the page that contains your actual site content. Unfortunately, the frameset generally does not have enough content to obtain a listing in a search engine. The best solution is to avoid frames altogether, but if they are absolutely necessary, add a site description in the <noframes> area, along with a link to your homepage, so the engines have text to index and a link to follow.
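
As a rough way to check what a spider would find in a frameset page, the Python sketch below pulls out the text inside the <noframes> area. The function name and the sample page are made up for illustration, and the regular expressions are a quick check only, not a full HTML parser.

```python
import re

# Rough patterns: good enough for a quick check, not a real HTML parser.
NOFRAMES_RE = re.compile(r"<noframes[^>]*>(.*?)</noframes>",
                         re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def noframes_text(html_text):
    """Return the visible text inside <noframes>, or "" if there is none."""
    match = NOFRAMES_RE.search(html_text)
    if match is None:
        return ""
    # Replace tags with spaces, then collapse runs of whitespace.
    text = TAG_RE.sub(" ", match.group(1))
    return " ".join(text.split())


# Hypothetical frameset page used only as sample input.
sample_page = (
    '<frameset cols="20%,80%">'
    '<frame src="menu.html"><frame src="main.html">'
    '<noframes><body><p>Widgets and gadgets. '
    '<a href="main.html">Enter the site</a></p></body></noframes>'
    '</frameset>'
)
```

An empty result suggests the page gives a spider nothing to index beyond the frameset itself.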

2 You are using a lot of graphics and very little text
Search engine spiders are unable to tell that your .jpg or .gif image says “this is the best website on earth” because they can only read text. There is no magic number, but it’s good to have at least 3 times more text than graphics on your website. Try to build at least one page of quality content every other day (or more frequently). While creating original content is the best option, if you do not have the time you can search for quality articles online with reprint rights via Google or another search engine, or you can visit an article database such as http://www.goarticles.com/.

3 You only submitted redirection pages (doorway pages)
Many search engines, especially the big boys Google, Yahoo, and MSN, do not index sites with doorway pages because they see them as another method of spamming. Submit a real web page that has the actual content your visitors will see. Sometimes there are legitimate reasons for a page redirection, such as when you’ve updated your site and moved information around. The search engines may have indexed an old page which no longer exists, and you want your visitors to see the new page. There are ways to do this without hurting your chances at search engine indexing.

You can use a server-side 301 redirect. This sends your visitor to the new page while telling the search engines that the page has moved permanently.
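
How you set up a 301 depends on your web server, so treat the following as a minimal sketch only. It uses Python's built-in http.server module, and the paths in the MOVED table are hypothetical examples, not real pages.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical table of old paths and where their content lives now.
MOVED = {"/old-page.html": "/new-page.html"}


def resolve_redirect(path, moved=MOVED):
    """Return (status code, headers) for a requested path."""
    target = moved.get(path)
    if target is None:
        return 404, {}
    # 301 means "moved permanently"; Location carries the new address.
    return 301, {"Location": target}


class RedirectHandler(BaseHTTPRequestHandler):
    """Answers GET requests for moved pages with a 301 response."""

    def do_GET(self):
        status, headers = resolve_redirect(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
```

To try it locally, you could pass RedirectHandler to http.server.HTTPServer and call serve_forever() on the result. On a real site the same effect is normally achieved through the web server's own configuration.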

You can use a meta refresh tag (<meta http-equiv="refresh" content="5;URL=http://www.yournewsite.com">) to redirect the page to the new page after 5 seconds. Make sure to include a brief note with a highly visible link to the new page for your site visitors and search engine spiders to follow. Do not set the delay shorter than a few seconds, because search engines may penalize a page that refreshes almost immediately.
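
If you have many moved pages, you could generate these placeholder pages rather than write each by hand. The sketch below is one possible approach; the function name and the wording of the note are assumptions made for illustration.

```python
from html import escape


def build_redirect_page(new_url, delay_seconds=5):
    """Build a small HTML page that refreshes to new_url after a delay."""
    safe_url = escape(new_url, quote=True)
    return (
        "<html>\n"
        "<head>\n"
        f'<meta http-equiv="refresh" content="{delay_seconds};URL={safe_url}">\n'
        "<title>This page has moved</title>\n"
        "</head>\n"
        "<body>\n"
        # Visible note and link for visitors and spiders to follow.
        f'<p>This page has moved. <a href="{safe_url}">Go to the new page</a>.</p>\n'
        "</body>\n"
        "</html>\n"
    )


page = build_redirect_page("http://www.yournewsite.com")
```

The visible link matters as much as the tag, since it gives both visitors and spiders something to follow if the refresh is ignored.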

You can use JavaScript to load the new page. Most search engines ignore JavaScript, but it should be used with caution because about 10% of surfers have JavaScript disabled.

You can simply delete your old page and create a custom 404 error page which will direct your visitors to the new site.

4 You over submitted your website (search engine spamming)
It’s good practice not to submit your website more than once a month. Submitting your site more frequently may be considered search engine spamming. You are not required to submit every page of your website. The search engine spider will crawl each of the links on the page so it’s a good idea to include a link to the sitemap on your homepage. Search engines are becoming more “intuitive” and can detect spam attempts which may end up getting your site blacklisted.

Keep a database of the dates you submitted your website, then wait about 4 - 6 weeks before submitting your website again.
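
The “database” can be as simple as a table of dates. Here is a small Python sketch of the idea; the engine names and dates in the log are invented sample entries, and the four-week minimum is taken from the low end of the range above.

```python
from datetime import date, timedelta

# Low end of the 4 - 6 week waiting period.
MIN_WAIT = timedelta(weeks=4)

# Hypothetical submission log: search engine -> date of last submission.
submission_log = {
    "Google": date(2004, 9, 8),
    "Yahoo": date(2004, 8, 1),
}


def can_resubmit(last_submitted, today, min_wait=MIN_WAIT):
    """True once at least min_wait has passed since the last submission."""
    return today - last_submitted >= min_wait


def engines_ready(log, today):
    """List the engines in the log that are due for another submission."""
    return sorted(name for name, last in log.items()
                  if can_resubmit(last, today))
```

Calling engines_ready(submission_log, date.today()) gives the engines you can safely submit to again.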

5 You overused your keywords (keyword spamming)
There is much debate about the number of keywords you can use on your page before it is considered spam, and most agree that it varies depending on the keyword itself, but a good rule of thumb is to keep the language natural. Read your content out loud to yourself and ask if it sounds forced or “fake”, then adjust it accordingly. Remember, you are writing for people and optimizing for the search engines. This is especially true if you plan to submit your website to the Open Directory Project (http://dmoz.org).

Another form of keyword spamming is using hidden (or tiny) text. Search engines can tell when your text color is the same (or almost the same) as your background color, and they know that people cannot read a 1px font. This trick may have worked back in 1996, but it will not fool them now and will certainly get your site blacklisted.

6 Your website is done in flash, you have dynamic pages or pages that use special characters (&, $, =, %, ?)
Google has announced that it can spider content in .swf (Flash) files, but the other major search engines such as Yahoo, MSN, Altavista, Lycos, etc. cannot. Flash websites should be used with caution, and an alternate HTML site should be made available.

Dynamic pages, or pages with those special characters in their URLs, are generally a stop sign for most search engine spiders. They tell the spider that there is potentially an infinite number of pages on your website, and it simply does not have the time to check them all (keep in mind, there are about 2 million other websites it needs to check). Dynamic pages don’t exist until someone “requests” them. However, Google and Inktomi can index dynamic pages, and Yahoo, MSN, and Hotbot partially support them, but Altavista and Lycos (as well as many others) cannot.

Hotbot recommends that you submit the entire query string when submitting your website or page for inclusion (e.g. “www.yourwebsite.com/querypage.asp?page=10”).
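
To get a quick list of the links on your site that a spider might skip, you could scan your URLs for the special characters named in the heading above. This is only a sketch; the function name is made up, and which characters a given spider actually stops at is not something the check can know.

```python
# The special characters listed in the heading for this item.
SPECIAL_CHARS = set("&$=%?")


def looks_dynamic(url):
    """True if the URL contains any character a spider may treat as dynamic."""
    return any(ch in SPECIAL_CHARS for ch in url)


# Sample URLs: the first follows the Hotbot example, the second is hypothetical.
urls = [
    "www.yourwebsite.com/querypage.asp?page=10",
    "www.yourwebsite.com/articles/page-10.html",
]
flagged = [url for url in urls if looks_dynamic(url)]
```

Any URL that ends up in flagged is a candidate for rewriting into a plainer, static-looking address.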

If you must use dynamic pages to manage your content, visit http://spider-food.net/dynamic-page-optimization-b.html for some good advice on optimizing dynamic pages for search engines.

If you are using blogging software such as Movable Type or WordPress to maintain your website, it lets you change the “dynamic” links into search engine friendly links.

7 Your site is hosted on a free host (geocities, lycos, etc.)
Many search engines do not list websites from free host servers because they are considered a source of spam and low quality submissions. Google will index your site, but you must have some established link popularity with quality websites first. Not only does your own domain name enable a greater chance of getting indexed by the search engines, it helps to build trust with your visitors and your company brand.

8 You have a slow host server, your website host was down, or your DNS did not resolve.
Search engines will give extra points for the fast loading websites that can deliver their content almost at the speed of light. There are a number of things you can do to reduce the size of your pages and speed the page load time, but that doesn’t do you much good if your host server responds slowly or is “down” frequently. If the web host is down and your site is non-operational when the spider visits you will have to wait until the next time it makes its rounds which can be months. If your site was already indexed and the spider is not able to access your site again, it may remove your links altogether. Your site is only as reliable as your webhost, so make sure you choose wisely. Try to locate web hosts with a 99.8% or higher uptime guarantee. Visit http://findmyhosting.com for a database of reliable hosts many of which have user ratings.
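
To put an uptime guarantee in perspective, it helps to turn the percentage into hours. The short Python sketch below does that arithmetic; the 30-day month is an assumption chosen for illustration.

```python
def allowed_downtime_hours(uptime_percent, period_hours):
    """Hours of downtime a host may have and still meet its guarantee."""
    return (100 - uptime_percent) / 100 * period_hours


# Assumed period: a 30-day month, which is 720 hours.
HOURS_PER_MONTH = 30 * 24

monthly_downtime = allowed_downtime_hours(99.8, HOURS_PER_MONTH)
```

At 99.8% the result works out to roughly an hour and a half of permitted downtime per month, which is long enough for a spider to visit and find nothing.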

A mistake many new webmasters make is to submit their site to the search engines moments after they register it. It takes about 24-72 hours for a domain name to actually become active. If you submit your website before your name is active (or has propagated), the search engine will not be able to locate your website.
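
Before submitting, you can check whether your domain name resolves yet. The sketch below uses Python's standard socket module; the function name is made up, and a successful lookup from your own machine does not guarantee the name has propagated everywhere.

```python
import socket


def domain_resolves(hostname, resolver=socket.gethostbyname):
    """True if the hostname can be looked up, False if the lookup fails.

    The resolver argument exists only so the lookup can be swapped out
    when trying the function without a network connection.
    """
    try:
        resolver(hostname)
        return True
    except OSError:
        # socket.gaierror (name not found) is a kind of OSError.
        return False
```

For example, domain_resolves("www.yourwebsite.com") returning False is a sign to wait a little longer before submitting.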

9 Your pages were simply not accessible
Spiders don’t behave like a full-featured browser - the page they see is not like the page you’ll see in Internet Explorer or Netscape Navigator. As mentioned in #2, search engines can only spider text and cannot tell what your graphics actually say; however, they can read “alt” text. If your page relies heavily on JavaScript, Dynamic HTML, or Flash, or is password protected, it may not be indexed properly by the spiders.
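
Since “alt” text is the only part of an image a spider can read, it is worth checking that none of your images are missing it. The following Python sketch uses the standard html.parser module to list them; the class and function names, and the sample HTML, are made up for illustration.

```python
from html.parser import HTMLParser


class AltTextChecker(HTMLParser):
    """Collects the src of every <img> that has no usable alt text."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # Treat a missing, empty, or whitespace-only alt as missing.
        if not (attributes.get("alt") or "").strip():
            self.missing_alt.append(attributes.get("src", "(no src)"))


def images_missing_alt(html_text):
    """Return the src values of images without alt text, in page order."""
    checker = AltTextChecker()
    checker.feed(html_text)
    checker.close()
    return checker.missing_alt


# Hypothetical page fragment used only as sample input.
sample_html = (
    '<p><img src="logo.gif" alt="Company logo">'
    '<img src="banner.jpg">'
    '<img src="spacer.gif" alt=""></p>'
)
```

Each image in the returned list is one the spider sees as blank.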

To get an idea of what your page looks like to a search engine spider, visit http://www.delorie.com/web/ses.cgi. To see how it appears to people who are browsing with ‘text-only’ browsers or screen readers, try viewing your page in lynx (http://lynx.browser.org) or test your page at http://www.delorie.com/web/lynxview.html.

10 You haven’t waited long enough
Search engines have a lot of websites to spider (approximately 2 million with more added every day). It may take up to six (6) months or more for a search engine to index your website. Because of the length of time it can take for your site to get listed, it’s highly important that you do it right the first time and from the very beginning - getting your site blacklisted or skipped can cause a serious delay in getting your search engine results.

No one can guarantee that you will have great search engine rankings, but following the above guidelines can definitely improve your chances.
