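First, 西藏信息网 (the Tibet Information Network site) has been indexed by Yahoo. That is yahoo.com, not Jack Ma's yahoo.cn. Yahoo is not what it once was and has fallen to the point where Microsoft is trying to buy it at a lowball price, but it is still one of the world's three major search engines. So, after being indexed by Google, this is another small victory.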

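Speaking of the world's three major search engines, Baidu has to be mentioned. The Baidu spider showed up very early. Besides frantically hitting the homepage more than ten times a day, it has also visited other pages. Yet to this day nothing has been indexed, which is quite strange.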

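The May Day holiday brought one more small gain: the Youdao and Sogou search spiders appeared in the access log. Below is a closer look at how differently the two spiders behave.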

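The Sogou spider uses only one IP address. Its visits come at regular intervals (roughly two to four hours apart within each day), and it is efficient: it goes straight to crawling the individual pages: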

220.181.19.159 - - [30/Apr/2008:13:28:50 -0700] "GET / HTTP/1.1" 304 - "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
220.181.19.159 - - [30/Apr/2008:15:16:35 -0700] "GET /websitelist/xizang-gov-list.html HTTP/1.1" 304 - "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
220.181.19.159 - - [30/Apr/2008:17:06:39 -0700] "GET /index.html HTTP/1.1" 304 - "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
220.181.19.159 - - [30/Apr/2008:19:24:29 -0700] "GET /websitelist/xizang-administrative-division.html HTTP/1.1" 304 - "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
220.181.19.159 - - [01/May/2008:14:07:30 -0700] "GET / HTTP/1.1" 304 - "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
220.181.19.159 - - [01/May/2008:17:13:37 -0700] "GET /websitelist/xizang-gov-list.html HTTP/1.1" 304 - "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
220.181.19.159 - - [01/May/2008:19:24:01 -0700] "GET /index.html HTTP/1.1" 304 - "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
220.181.19.159 - - [01/May/2008:22:18:54 -0700] "GET /websitelist/xizang-administrative-division.html HTTP/1.1" 304 - "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
220.181.19.159 - - [02/May/2008:14:47:29 -0700] "GET / HTTP/1.1" 200 2826 "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
220.181.19.159 - - [02/May/2008:17:35:52 -0700] "GET /websitelist/xizang-gov-list.html HTTP/1.1" 200 7095 "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
220.181.19.159 - - [02/May/2008:21:25:32 -0700] "GET /index.html HTTP/1.1" 200 2826 "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
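
For anyone who wants to check these observations without reading the log by eye, here is a minimal Python sketch. It assumes the lines follow the Apache combined log format, which is what the field layout of the excerpt suggests. The names `LOG_RE`, `parse_line` and `summarize` are made up for this illustration, and the three sample lines are copied from the excerpt above.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout: Apache "combined" log format.
# The character classes accept both straight quotes and the curly quotes
# that blog software sometimes substitutes when a log is pasted into a post.
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'["“](?P<method>\S+) (?P<path>\S+) (?P<proto>[^"”]+)["”] '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'["“](?P<referer>[^"”]*)["”] ["“](?P<agent>[^"”]*)["”]$'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    # %b expects English month abbreviations (true in the default C locale).
    rec["time"] = datetime.strptime(rec["time"], "%d/%b/%Y:%H:%M:%S %z")
    rec["status"] = int(rec["status"])
    return rec


def summarize(lines):
    """Collect distinct IPs, requested paths and status codes from log lines."""
    records = [r for r in map(parse_line, lines) if r is not None]
    return {
        "ips": sorted({r["ip"] for r in records}),
        "paths": Counter(r["path"] for r in records),
        "statuses": Counter(r["status"] for r in records),
    }


# Three lines taken from the Sogou excerpt above.
SAMPLE = [
    '220.181.19.159 - - [30/Apr/2008:13:28:50 -0700] "GET / HTTP/1.1" 304 - "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"',
    '220.181.19.159 - - [30/Apr/2008:15:16:35 -0700] "GET /websitelist/xizang-gov-list.html HTTP/1.1" 304 - "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"',
    '220.181.19.159 - - [02/May/2008:14:47:29 -0700] "GET / HTTP/1.1" 200 2826 "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"',
]

summary = summarize(SAMPLE)
print(summary["ips"])       # one IP only
print(summary["statuses"])  # 304 (Not Modified) twice, 200 once
```

Run against the full excerpt, a summary like this should show the single IP and the handful of distinct pages directly. It also makes the switch from 304 to 200 responses on 2 May easy to spot.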

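The Youdao spider is also very regular: each visit fetches /robots.txt once and then / once. Although it comes from many IP addresses and visits often, it never gets past the homepage: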

61.135.220.51 - - [01/May/2008:21:05:08 -0700] "GET /robots.txt HTTP/1.1" 200 129 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.51 - - [01/May/2008:21:05:08 -0700] "GET / HTTP/1.1" 200 2214 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.43 - - [01/May/2008:23:21:42 -0700] "GET /robots.txt HTTP/1.1" 200 129 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.43 - - [01/May/2008:23:21:42 -0700] "GET / HTTP/1.1" 200 2214 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.45 - - [02/May/2008:01:40:25 -0700] "GET /robots.txt HTTP/1.1" 200 129 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.45 - - [02/May/2008:01:40:25 -0700] "GET / HTTP/1.1" 200 2826 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.43 - - [02/May/2008:04:01:44 -0700] "GET /robots.txt HTTP/1.1" 200 129 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.43 - - [02/May/2008:04:01:44 -0700] "GET / HTTP/1.1" 200 2826 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.43 - - [02/May/2008:06:22:06 -0700] "GET /robots.txt HTTP/1.1" 200 129 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.43 - - [02/May/2008:06:22:06 -0700] "GET / HTTP/1.1" 200 2826 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.60 - - [02/May/2008:08:40:39 -0700] "GET /robots.txt HTTP/1.1" 200 129 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.60 - - [02/May/2008:08:40:39 -0700] "GET / HTTP/1.1" 200 2826 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.11 - - [02/May/2008:12:41:50 -0700] "GET /robots.txt HTTP/1.1" 200 129 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.11 - - [02/May/2008:12:41:50 -0700] "GET / HTTP/1.1" 200 2826 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.55 - - [02/May/2008:14:57:55 -0700] "GET /robots.txt HTTP/1.1" 200 129 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.55 - - [02/May/2008:14:57:58 -0700] "GET / HTTP/1.1" 200 2826 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.65 - - [02/May/2008:17:55:14 -0700] "GET /robots.txt HTTP/1.1" 200 129 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.65 - - [02/May/2008:17:55:14 -0700] "GET / HTTP/1.1" 200 2826 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.65 - - [02/May/2008:20:16:19 -0700] "GET /robots.txt HTTP/1.1" 200 129 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.65 - - [02/May/2008:20:16:21 -0700] "GET / HTTP/1.1" 200 2826 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.55 - - [02/May/2008:23:02:27 -0700] "GET /robots.txt HTTP/1.1" 200 129 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
61.135.220.55 - - [02/May/2008:23:02:27 -0700] "GET / HTTP/1.1" 200 2826 "-" "Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"
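
The "robots.txt first, then the homepage" rhythm described above can also be verified with a few lines of Python. This is only a rough sketch, again assuming Apache combined log format. The helper names `visits`, `robots_then_root` and `gaps_minutes` are invented for this example, and the six sample lines are the first three request pairs from the Youdao excerpt.

```python
import re
from datetime import datetime

# Only the fields needed here: client IP, timestamp and the path of a GET request.
# The opening quote may be straight or curly, depending on how the log was pasted.
REQ_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] ["“]GET (\S+) ')


def visits(lines):
    """Yield (ip, time, path) for every line that parses as a GET request."""
    for line in lines:
        m = REQ_RE.match(line.strip())
        if m:
            ip, stamp, path = m.groups()
            yield ip, datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z"), path


def robots_then_root(lines):
    """True if requests come in pairs: /robots.txt then /, both from the same IP."""
    seq = list(visits(lines))
    if not seq or len(seq) % 2:
        return False
    pairs = zip(seq[0::2], seq[1::2])
    return all(a[2] == "/robots.txt" and b[2] == "/" and a[0] == b[0]
               for a, b in pairs)


def gaps_minutes(lines):
    """Whole minutes between successive /robots.txt fetches."""
    times = [t for _, t, path in visits(lines) if path == "/robots.txt"]
    return [round((b - a).total_seconds() / 60) for a, b in zip(times, times[1:])]


# First three request pairs from the Youdao excerpt above.
UA = '"Mozilla/5.0 (compatible; YodaoBot/1.0; http://www.yodao.com/help/webmaster/spider/; )"'
SAMPLE = [
    '61.135.220.51 - - [01/May/2008:21:05:08 -0700] "GET /robots.txt HTTP/1.1" 200 129 "-" ' + UA,
    '61.135.220.51 - - [01/May/2008:21:05:08 -0700] "GET / HTTP/1.1" 200 2214 "-" ' + UA,
    '61.135.220.43 - - [01/May/2008:23:21:42 -0700] "GET /robots.txt HTTP/1.1" 200 129 "-" ' + UA,
    '61.135.220.43 - - [01/May/2008:23:21:42 -0700] "GET / HTTP/1.1" 200 2214 "-" ' + UA,
    '61.135.220.45 - - [02/May/2008:01:40:25 -0700] "GET /robots.txt HTTP/1.1" 200 129 "-" ' + UA,
    '61.135.220.45 - - [02/May/2008:01:40:25 -0700] "GET / HTTP/1.1" 200 2826 "-" ' + UA,
]

print(robots_then_root(SAMPLE))  # True
print(gaps_minutes(SAMPLE))      # [137, 139]
```

For these three pairs the gaps come out at a little over two and a quarter hours each, which matches the regular rhythm visible in the timestamps.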

With that, all the major Chinese-language search engines have now visited 西藏信息网.

When reposting, please credit the source: jijian91与小z - 互联网

Permalink: http://jijian91.com/blog20080503/yahoo-include-xizanginfo.html