How to Make Sure Your Site is Search Engine Crawler-Friendly
What makes a site "crawler-friendly"? I used to call this "search-engine-friendly," but my friend Mike Grehan convinced me that the more accurate phrase was "crawler-friendly," because it's the search engine crawlers (or spiders) that your site needs to buddy up to, as opposed to the search engine itself.
So, how do you make sure your site is on good terms with the crawlers? Well, it always helps to first buy it a few drinks. <grin> But since that's not usually possible, your next-best bet is to design your site with the crawlers in mind. The search engine spiders are primitive beings, and although they are constantly being improved, for best results you should always choose simplicity over complexity.
Attracting Search Engine Crawlers
What this means is that cutting-edge designs are generally not the best way to go. Interestingly enough, your site visitors may agree. Even though we SEO geeks have cable modems and DSL, our site visitors probably don't. Slow-loading Flash sites, for example, may stop the search engine spiders right in their tracks. There's nothing of interest to a search engine spider on the average Flash site anyway, so they're certainly not going to wait for it to download!
Besides Flash, there are a number of "helpful" features being thrown into site designs these days that can sadly be the kiss of death to a site's overall spiderability. For instance, sites that require a session ID to track visitors may never receive any visitors to begin with, at least not from the search engines. If your site or shopping cart requires session IDs, check Google right now to see if your pages are indexed. (Do an allinurl:yourdomainhere.com search in Google's search box and see what shows up.) If you see that Google has only one or two pages indexed, your session IDs may be the culprit. There are workarounds for this, as I have seen many sites that use session IDs get indexed; however, the average programmer/designer may not even know this is a problem.
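
To see why spiders choke on these, compare the two URLs below. (This is a made-up example: the page name, the "item" parameter, and the "sessionid" parameter are all hypothetical, and the exact name varies by platform; PHP, for instance, calls its version PHPSESSID.)

    A URL with a session ID tacked on. Every visit generates a
    brand-new address, so the spider sees an endless supply of
    "different" pages with identical content:

        http://www.yourdomainhere.com/product.asp?item=42&sessionid=8F3KQ99TY2

    The crawler-friendly version of the same page:

        http://www.yourdomainhere.com/product.asp?item=42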
Another source of grief when it comes to getting your pages thoroughly crawled is using the exact same Title tag on every page of your site. This sometimes happens because of Webmaster laziness, but often it's because a default Title tag is automatically pulled up through a content management system (CMS). If you have this problem, it's well worth taking the time to fix it.
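
To make that concrete, here's the sort of thing you'll see (with a made-up site name) when you view the source of page after page on such a site:

    <!-- The home page, the products page, and the contact page
         all carrying the same default Title: -->
    <title>Welcome to Example Widgets!</title>

That gives the search engines nothing to tell your pages apart by.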
Most CMSs have workarounds where you can add a unique Title tag instead of pulling up the same one for each page. Usually the programmers simply never realized it was important, so it was never done. The cool thing is that with dynamically generated pages, you can often set your templates to pull a particular sentence from each page and plug it into your Title field. A nice little "trick" is to make sure each page has a headline at the top that utilizes your most important keyword phrases. Once you've got that, you can set your CMS to pull it out and use it for your Titles as well.
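
The end result, page by page, looks something like this bare-bones sketch (the headline and wording are made up, and how you wire the template up depends entirely on your CMS):

    <html>
    <head>
      <!-- Unique Title, pulled by the CMS template
           from this page's headline: -->
      <title>Blue Widget Repair Tips</title>
    </head>
    <body>
      <!-- Keyword-rich headline at the top of the page: -->
      <h1>Blue Widget Repair Tips</h1>
      ...
    </body>
    </html>

One headline and one matching Title per page, and every page on the site suddenly has something unique to show the spiders.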
Methods to Induce Search Engine Crawlers
Another reason I've seen for pages not being crawled is that they're set to require a cookie when a visitor gets to the page. Well, guess what, folks? Spiders don't eat cookies! (Sure, they like beer, but they hate cookies!) No, you don't have to remove your cookies to get crawled. Just don't force-feed them to anyone and everyone. As long as they're not required, your pages should be crawled just fine.
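
In code terms, the difference is between offering a cookie and slamming the door on anyone who won't take one. A made-up illustration (the cookie name and page names are hypothetical):

    <script type="text/javascript">
    // Fine: offer a tracking cookie. Browsers (and spiders) that
    // ignore it still get the full page.
    document.cookie = "visited=yes; path=/";

    // Deadly: bouncing cookie-less visitors to an error page
    // locks out the spiders along with them.
    // if (document.cookie.indexOf("visited") == -1)
    //     window.location.href = "cookies-required.html";
    </script>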
What about the use of JavaScript? We've often heard that JavaScript is unfriendly to the crawlers. This is partly true and partly false. Nearly every site I look at these days uses some sort of JavaScript within the code. It's certainly not bad in and of itself. As a rule of thumb, if you're using JavaScript for mouseover effects and that sort of thing, just check to make sure that the HTML code for the links also uses the traditional <a href> tag. As long as that's there, you'll most likely be fine. For extra insurance, you can place plain HTML versions of any JavaScript links inside the <noscript> tag, put text links at the bottom of your pages, and create a visible link to a sitemap page that contains links to all your other important pages. It's definitely not overkill to do *all* of those things!
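
Here's what that rule of thumb looks like in practice. (The page names and the image-swapping functions are stand-ins for whatever your design actually uses.)

    <!-- Crawler-friendly: a real URL in the href, with JavaScript
         handling only the mouseover effect: -->
    <a href="products.html"
       onmouseover="swapImage('nav1')"
       onmouseout="restoreImage('nav1')">Products</a>

    <!-- Extra insurance for any navigation that's written
         out by a script: -->
    <noscript>
      <a href="products.html">Products</a>
      <a href="sitemap.html">Site Map</a>
    </noscript>

The spiders don't run the mouseover code; what matters to them is the href they can follow.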
There are plenty more things you can worry about where your site's crawlability is concerned, but those are the main ones I've been seeing lately. One day, I'm sure that any type of page under the sun will be crawler-friendly, but for now, we've still gotta give our little arachnid friends some help.
Tools to Detect Search Engine Crawler Problems
One tool I use to help me spot potential crawler problems is the Lynx viewer, which can be found here: http://www.delorie.com/web/lynxview.html. Generally, if your pages can be viewed and clicked through in a Lynx browser (which came before our graphical browsers of today), then a search engine spider should also be able to make its way around. That isn't written in stone, but it's at least one way of discovering potential problems. It's not foolproof, however: I just checked my forum in the Lynx browser and it shows a blank page, yet the forum gets spidered and indexed by the search engines without a problem.
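
If you have Lynx installed on your own machine, you can get the same sanity check from the command line. (This assumes a local Lynx install; substitute your own URL.)

    lynx -dump http://www.yourdomainhere.com/

The -dump switch prints the page as rendered text, followed by a numbered list of every link Lynx found. If a page or a link is missing from that output, there's a decent chance a spider is missing it too.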
This is a good time to remind you: when you think your site isn't getting spidered completely, check out lots of things before jumping to any conclusions.
The above article is an excerpt from http://www.highrankings.com by Jill.