How to modify the sitemap.xml.gz generator to notify Bing and other search
engines?
I use Sitemap Generator to generate sitemap.xml.gz for my static website.
By default, it notifies only Google.com when a new sitemap is generated.
There are some hints on how to modify it to notify Yahoo, Bing, and Ask.com
as well, but they are for another version (and syntax) of this script.
Since I'm not a Perl person, it took me quite a while to find the right
syntax. Here is the result.
Open the file sitemap_gen.pl, find the string www.google.com, and change
that section to look like this:
    # Search engines to notify with the updated sitemaps
    # This list is very non-obvious in what's going on. Here's the gist:
    my @NOTIFICATION_SITES = (
        {
            scheme   => 'http',
            netloc   => 'www.google.com',
            path     => 'webmasters/sitemaps/ping',
            query    => {},        # <-- EXCEPTION: specify a query map rather than a string
            fragment => '',
            sitemap  => 'sitemap'  # - query attribute that should be set to the new Sitemap URL
        },
        {
            scheme   => 'http',
            netloc   => 'www.bing.com',
            path     => 'webmaster/ping.aspx',
            query    => {},        # <-- EXCEPTION: specify a query map rather than a string
            fragment => '',
            sitemap  => 'sitemap'  # - query attribute that should be set to the new Sitemap URL
        },
        {
            scheme   => 'http',
            netloc   => 'submissions.ask.com',
            path     => 'ping',
            query    => {},        # <-- EXCEPTION: specify a query map rather than a string
            fragment => '',
            sitemap  => 'sitemap'  # - query attribute that should be set to the new Sitemap URL
        }
    );
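To make the role of each field clearer, here is a minimal sketch of how such an entry could be turned into a ping URL. It is not taken from sitemap_gen.pl; the helper names url_escape and build_ping_url are hypothetical, and the script's own implementation may differ. The sketch assumes the value of the sitemap key names the query parameter that receives the sitemap URL, as the comments in the list say.

```perl
use strict;
use warnings;

# Percent-encode everything except RFC 3986 unreserved characters.
sub url_escape {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Hypothetical helper (not part of sitemap_gen.pl): assemble the ping URL
# for one notification entry and one sitemap URL.
sub build_ping_url {
    my ($site, $sitemap_url) = @_;

    # Copy the query map so the original entry stays unchanged.
    my %query = %{ $site->{query} };

    # The 'sitemap' field names the query attribute to fill in.
    $query{ $site->{sitemap} } = $sitemap_url;

    my $query_string = join '&',
        map { url_escape($_) . '=' . url_escape($query{$_}) } sort keys %query;

    my $url = "$site->{scheme}://$site->{netloc}/$site->{path}";
    $url .= "?$query_string"       if length($query_string);
    $url .= "#$site->{fragment}"   if length($site->{fragment});
    return $url;
}

my %bing = (
    scheme   => 'http',
    netloc   => 'www.bing.com',
    path     => 'webmaster/ping.aspx',
    query    => {},
    fragment => '',
    sitemap  => 'sitemap',
);

# Prints:
# http://www.bing.com/webmaster/ping.aspx?sitemap=http%3A%2F%2Feldar.cz%2Fkangaroo%2Fsitemap.xml.gz
print build_ping_url(\%bing, 'http://eldar.cz/kangaroo/sitemap.xml.gz'), "\n";
```

Pasting the printed URL into a browser is a quick way to check whether a given search engine accepts the ping before adding it to the list.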
Unfortunately, Yahoo doesn't support ping notification unless you register,
which I didn't want to do, so Yahoo is missing. You can easily add it by
following the pattern of Bing and Ask.com.
No Czech search engine, such as Seznam.cz or Jyxo.cz, supports ping
notification :-(
It's also a good idea to put a link to sitemap.xml.gz in the robots.txt file
in the root directory.
Add a line like this to robots.txt:
Sitemap: http://eldar.cz/kangaroo/sitemap.xml.gz
Some robots, such as the Seznam.cz bot or the Yandex.com bot, will then fetch sitemap.xml.gz.
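If you regenerate the site with a script, the robots.txt step can be automated too. The following is only a sketch: add_sitemap_line is a hypothetical helper that works on the file content as a string, and reading and writing the actual robots.txt file is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: return the robots.txt content with a Sitemap line
# appended, unless an identical line is already present.
sub add_sitemap_line {
    my ($content, $sitemap_url) = @_;
    my $line = "Sitemap: $sitemap_url";

    # Already there (on a line of its own)? Leave the content untouched.
    return $content if $content =~ /^\Q$line\E\s*$/m;

    # Make sure the existing content ends with a newline before appending.
    $content .= "\n" if length($content) && $content !~ /\n\z/;
    return $content . "$line\n";
}

my $robots = "User-agent: *\nDisallow:\n";
print add_sitemap_line($robots, 'http://eldar.cz/kangaroo/sitemap.xml.gz');
```

Running the helper twice on the same content adds the line only once, so it is safe to call it on every rebuild.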
Comments?