
Design your HTML for search engines' spiders

Lots of fanta-SEOs think that using the appropriate meta tags and organizing content (keywords in your text) is the only way to be first on the SERP.

This is a common misunderstanding that we definitely need to avoid: as you know, content is important, but the way you give the content to search engines is equally important.

So let's see how to optimize the way search engines' spiders, robots and crawlers look at our pages: first of all, let's use Lynx to see our webpage as a spider does.

Why should we use a text browser, which shows webpages in a really crappy way, without CSS, images, JavaScript and cool stuff? Because spiders behave more like a text browser than a graphical web browser.

Lynx is not the same as a spider (which translates pages to 100% flat text), but it gives you a general idea of how elements are organized and which ones get relevance; so it's a good combination of a spider's view and the weight the search engine gives to our elements.

For example, you would say that the 2 example webpages (a badly written one and a well-written one) are similar, but they aren't.

Disclaimer: the 2 webpages above are not validated and their source code is ugly. I only need to show you how to write a good page for search engines: you'll then need, AS I NORMALLY DO, to make them W3C-friendly as well.

This is a screenshot of the first page:

[Screenshot: the badly written webpage viewed with Lynx]

As you can see, at the end of the screen the browser tells you that you can go to the next block of the page: the last paragraph could not be shown in the first block because there are too many newlines and the page is sooo long.

This is a really important aspect: the search engine gives more importance and relevance to the first elements it encounters on the page.

Let's say that the spider looks at your page more or less in this way:

[Image: the way a spider gives importance to the parts of the webpage]

As you can see, the elements at the bottom of the page aren't given as much weight as the first ones: what if we have keywords that are significant (for us) at that point of the page?

Let's see now how the spider sees the other, well-written, webpage:

[Screenshot: the well-written webpage viewed with Lynx]

So, as you see, we humans look at 2 almost identical webpages, but the spider sees them like fire and water: they actually don't have anything in common.

Any idea about the importance/relevance map?

[Image: importance/relevance map of the well-written webpage]

As we can see, the spider considers every single element of the page almost equally (that's actually not quite true, as the colors show, but we're gonna discuss it later).

So, what's wrong with the first page?

I can start a never-ending list...

Page title

As you can see, the page title should probably be the most highlighted element of the page.

In the bad page we used a DIV containing plain text with a CSS class that makes it stand out in a web browser, but not in a text browser.

In the good page we used the proper H1 tag:
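The original code samples did not survive here, so what follows is only a minimal sketch of the difference described above. The class name `page-title` and the title text are hypothetical.

```html
<!-- Bad: the title is plain text in a DIV; only the CSS class makes it
     stand out, so a text browser shows it like any other line.
     "page-title" is a hypothetical class name. -->
<div class="page-title">My page title</div>

<!-- Good: a real heading, which carries its meaning without any CSS. -->
<h1>My page title</h1>
```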

Menus

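In the bad page we used nested LIs: semantically speaking this is a good way to write a menu but, if we can avoid it, let's avoid it, because in a text browser every list item takes its own line and pushes the rest of the content down.

In the good page we used a DIV containing simple links: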
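As a rough illustration (the menu entries, URLs and the `menu` class name are assumptions, not the original example), the two menus might look like this:

```html
<!-- Bad (for this purpose): nested lists, one entry per line in a text browser. -->
<ul>
  <li><a href="/">Home</a></li>
  <li><a href="/articles/">Articles</a>
    <ul>
      <li><a href="/articles/seo/">SEO</a></li>
    </ul>
  </li>
</ul>

<!-- Good: plain links inside one DIV, which a text browser can keep on a single line.
     "menu" is a hypothetical class name used only for styling. -->
<div class="menu">
  <a href="/">Home</a>
  <a href="/articles/">Articles</a>
  <a href="/articles/seo/">SEO</a>
</div>
```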

Bold text

In the bad page we used a SPAN with a CSS class that defined font-weight:bold. The spider can't apply CSS rules, so the element is not highlighted as it should be.

In the good page we used the STRONG tag with the same CSS rule described above:
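A minimal sketch of both versions, assuming a hypothetical class name `highlight` and made-up sample text:

```html
<style>
  /* Hypothetical class; the same rule is used in both versions. */
  .highlight { font-weight: bold; }
</style>

<!-- Bad: bold only through CSS, invisible to anything that ignores styles. -->
We sell <span class="highlight">handmade guitars</span>.

<!-- Good: STRONG marks the text as important even without CSS. -->
We sell <strong class="highlight">handmade guitars</strong>.
```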

Decorative images

In the bad way we wrote the IMG tag directly in the HTML source code:

If you need to use images as decorative elements, load them as background images via CSS, so they won't appear in front of spiders' eyes:
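The two approaches could be sketched as follows; the file name `ornament.png` and the sizes are placeholders, not taken from the original pages.

```html
<!-- Bad: a purely decorative image sits in the markup the spider reads. -->
<img src="ornament.png">
<h1>My page title</h1>

<!-- Good: the same image attached to an existing element as a CSS background,
     so the markup contains only real content. -->
<style>
  h1 {
    background: url("ornament.png") no-repeat left center;
    padding-left: 48px; /* assumed room for the image */
  }
</style>
<h1>My page title</h1>
```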

Newlines

If you need, as in my example, to display a couple of links one above the other, as in a list, you can do that with UL and LI or floating DIVs, as we did in the bad page.

Another way to save space and keep those elements at the top of your page is to use simple links with float properties defined via CSS.
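One possible way to do this, shown only as a sketch (the `stacked` class name and the links are hypothetical): the links stay inline in the markup, and CSS alone stacks them for graphical browsers.

```html
<style>
  /* float makes each link a block-level box; clear:left pushes it below the
     previous floated link, so the links stack vertically in a graphical browser. */
  .stacked a { float: left; clear: left; }
</style>

<!-- Without CSS, these are plain inline links that fit on one line. -->
<div class="stacked">
  <a href="/one">First link</a>
  <a href="/two">Second link</a>
</div>
```

Whatever follows the links may need `clear: left` so it doesn't wrap next to them.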

Other elements

  • use I instead of CSS's font-style:italic
  • replace, where you can, DIVs and Ps with SPANs
  • avoid, as much as you can, containers that force a newline
  • use STRONG instead of B
  • use H1...H6 headings instead of creating different styles for titles and subtitles
  • don't put nofollow on any link except paid/ad ones
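A few of the points above can be sketched in one fragment; all text and URLs here are made up for illustration.

```html
<!-- Real headings instead of styled DIVs for titles and subtitles. -->
<h1>Main title</h1>
<h2>Subtitle</h2>

<!-- I and STRONG instead of font-style:italic and B. -->
<span>Some <i>italic</i> and <strong>important</strong> words.</span>

<!-- nofollow only on paid/ad links; editorial links stay untouched. -->
<a href="https://example.com/sponsor" rel="nofollow">Sponsored link</a>
<a href="https://example.com/resource">Editorial link</a>
```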
