Which URLs should you add in the XML Sitemap file of your Website?

"What URLs should you put in the XML sitemap of your website or blog?" The three options, which often cause confusion, are (a) everything, (b) just the important pages, or (c) only URLs that have not been crawled and indexed yet.

A sitemap, as you know, is a simple text file in XML format that contains a list of URLs that are part of a website. It’s extremely important that you create sitemaps for your site for two reasons:

1. They help you bring pages that may otherwise be ignored to the notice of search engines.

2. If your site has duplicate URL issues (for instance, if abc.com?p=123 and abc.com/123/ point to the same page), you may use Sitemaps to specify the version that gets preference in search engines.
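As a rough illustration of both points, here is a minimal Python sketch that writes a sitemap listing each page once, using only the preferred form of a duplicated URL. The helper name `build_sitemap` and the page list are assumptions made up for this example; the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string that lists each URL once, in order."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    seen = set()
    for url in urls:
        if url in seen:
            continue  # skip exact repeats so each URL appears only once
        seen.add(url)
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages: abc.com/123/ is listed, its ?p=123 twin is left out.
pages = ["http://abc.com/", "http://abc.com/123/"]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Leaving the ?p=123 form out of the list is what tells search engines which version you prefer.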

Sean Suchter of the Yahoo! web search team recently suggested that webmasters should put only the important pages in the sitemap, rather than every page of the website, because Yahoo uses sitemaps to figure out which pages on a site are valuable.

I asked Vanessa Fox for her opinion on what should really go in a Sitemap. Her preferred approach is for website owners to put a comprehensive list of URLs in the Sitemap.

"Why not tell search engines what the definitive list of pages on your site is? Why limit it to really important ones? One benefit to this is that there’s at least one place other than crawling that Sitemaps can be helpful, and that’s canonicalization. If a search engine has detected that several URLs display the same page, the version of the URL that’s in the Sitemap is a signal as to which is the canonical version."
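To make the canonicalization idea concrete, here is a toy Python sketch of how a sitemap entry could act as a tie-breaker among duplicate URLs. It is only an illustration under simple assumptions, not a description of how any search engine actually works; the function name `pick_canonical` and the sample URLs are hypothetical.

```python
def pick_canonical(duplicate_urls, sitemap_urls):
    """Prefer the duplicate that is listed in the sitemap.

    Falls back to the first duplicate when none of them is listed.
    """
    for url in duplicate_urls:
        if url in sitemap_urls:
            return url
    return duplicate_urls[0]


# Two hypothetical URLs that display the same page.
duplicates = ["http://abc.com/?p=123", "http://abc.com/123/"]

# Hypothetical set of URLs found in the site's sitemap.
sitemap_urls = {"http://abc.com/", "http://abc.com/123/"}

canonical = pick_canonical(duplicates, sitemap_urls)
print(canonical)  # prints http://abc.com/123/
```

In practice a search engine would weigh this signal alongside others, so the sitemap entry is a hint rather than a guarantee.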

These may sound like contradictory opinions and, unfortunately, one site can’t maintain multiple sitemaps to fit the needs of every search engine. What may help here, as Vanessa points out, is for search engines to give us more details about how they use sitemaps and what the "common" best practices are.

Until that happens, I will probably continue to dump all URLs in my XML sitemap, including pages for tags and categories, which are not very important from an organic rankings point of view.

Also check this Google PageRank discussion.

Find this article at: http://www.labnol.org/internet/xml-sitemaps-of-websites-confusion/4965/

web: http://www.labnol.org/ email: amit@labnol.org


Reader Comments

I vote for the comprehensive listing in sitemaps. I want people to find any and all of my site’s pages. Why would I want search engines to leave some of them out? I don’t see the logic in only listing selected pages.
