Every search keyword should have its own unique URL, for example:
https://www.google.com.hk/search?sourceid=chrome&ie=UTF-8&q=netkiller&sei=9v-QT_q1L6SZiAel2bGnBA&gbv=2
https://www.google.com.hk/search?aq=f&sourceid=chrome&ie=UTF-8&q=neo
https://www.google.com.hk/search?sourceid=chrome&ie=UTF-8&q=bg7nyt
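Each search for a new keyword produces a unique URL. This makes the result page cacheable by a reverse proxy and, with suitable HTTP headers, by the browser as well.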
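The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `search_url`, `cache_key` and `cache_headers` are hypothetical, the choice to keep only the `q` parameter is an assumption about which parameters affect the result page, and the `max-age` value is arbitrary.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Base URL taken from the example URLs above.
BASE = "https://www.google.com.hk/search"


def search_url(keyword: str) -> str:
    """Return a single canonical URL for a keyword (hypothetical helper).

    Assumption: only the q parameter determines the result page, so every
    other parameter is left out of the canonical form.
    """
    normalized = " ".join(keyword.split())  # trim and collapse whitespace
    return f"{BASE}?{urlencode({'q': normalized})}"


def cache_key(url: str) -> str:
    """Reduce any search URL to its canonical form, usable as a cache key."""
    query = dict(parse_qsl(urlsplit(url).query))
    return search_url(query.get("q", ""))


def cache_headers(max_age: int = 300) -> dict:
    """Response headers that allow proxies and browsers to cache the page."""
    return {"Cache-Control": f"public, max-age={max_age}"}


noisy = ("https://www.google.com.hk/search?sourceid=chrome&ie=UTF-8"
         "&q=netkiller&sei=9v-QT_q1L6SZiAel2bGnBA&gbv=2")
print(cache_key(noisy))   # https://www.google.com.hk/search?q=netkiller
print(cache_headers(60))  # {'Cache-Control': 'public, max-age=60'}
```

Keying the cache on the canonical URL means that requests differing only in tracking or session parameters hit the same cached entry.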
Example 15.3. Example robots.txt
http://www.google.com/robots.txt
User-agent: *
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /catalogs
Disallow: /catalogues
Disallow: /news
Allow: /news/directory
Disallow: /nwshp
Disallow: /setnewsprefs?
Disallow: /index.html?
Disallow: /?
Disallow: /addurl/image?
Disallow: /pagead/
Disallow: /relpage/
Disallow: /relcontent
Disallow: /imgres
Disallow: /imglanding
Disallow: /keyword/
Disallow: /u/
Disallow: /univ/
Disallow: /cobrand
Disallow: /custom
Disallow: /advanced_group_search
Disallow: /googlesite
Disallow: /preferencessection
Disallow: /setprefs
Disallow: /swr
Disallow: /url
Disallow: /default
Disallow: /m?
Disallow: /m/?
Disallow: /m/blogs?
Disallow: /m/ig
Disallow: /m/images?
Disallow: /m/local?
Disallow: /m/movies?
Disallow: /m/news?
Disallow: /m/news/i?
Disallow: /m/place?
Disallow: /m/setnewsprefs?
Disallow: /m/search?
Disallow: /m/swmloptin?
Disallow: /m/trends
Disallow: /wml?
Disallow: /wml/?
Disallow: /wml/search?
Disallow: /xhtml?
Disallow: /xhtml/?
Disallow: /xhtml/search?
Disallow: /xml?
Disallow: /imode?
Disallow: /imode/?
Disallow: /imode/search?
Disallow: /jsky?
Disallow: /jsky/?
Disallow: /jsky/search?
Disallow: /pda?
Disallow: /pda/?
Disallow: /pda/search?
Disallow: /sprint_xhtml
Disallow: /sprint_wml
Disallow: /pqa
Disallow: /palm
Disallow: /gwt/
Disallow: /purchases
Disallow: /hws
Disallow: /bsd?
Disallow: /linux?
Disallow: /mac?
Disallow: /microsoft?
Disallow: /unclesam?
Disallow: /answers/search?q=
Disallow: /local?
Disallow: /local_url
Disallow: /froogle?
Disallow: /products?
Disallow: /products/
Disallow: /froogle_
Disallow: /product_
Disallow: /products_
Disallow: /print
Disallow: /books
Disallow: /bkshp?q=
Allow: /booksrightsholders
Disallow: /patents?
Disallow: /patents/
Allow: /patents/about
Disallow: /scholar
Disallow: /complete
Disallow: /sponsoredlinks
Disallow: /videosearch?
Disallow: /videopreview?
Disallow: /videoprograminfo?
Disallow: /maps?
Disallow: /mapstt?
Disallow: /mapslt?
Disallow: /maps/stk/
Disallow: /maps/br?
Disallow: /mapabcpoi?
Disallow: /maphp?
Disallow: /places/
Disallow: /maps/place
Disallow: /help/maps/streetview/partners/welcome/
Disallow: /lochp?
Disallow: /center
Disallow: /ie?
Disallow: /sms/demo?
Disallow: /katrina?
Disallow: /blogsearch?
Disallow: /blogsearch/
Disallow: /blogsearch_feeds
Disallow: /advanced_blog_search
Disallow: /reader/
Allow: /reader/play
Disallow: /uds/
Disallow: /chart?
Disallow: /transit?
Disallow: /mbd?
Disallow: /extern_js/
Disallow: /calendar/feeds/
Disallow: /calendar/ical/
Disallow: /cl2/feeds/
Disallow: /cl2/ical/
Disallow: /coop/directory
Disallow: /coop/manage
Disallow: /trends?
Disallow: /trends/music?
Disallow: /trends/hottrends?
Disallow: /trends/viz?
Disallow: /notebook/search?
Disallow: /musica
Disallow: /musicad
Disallow: /musicas
Disallow: /musicl
Disallow: /musics
Disallow: /musicsearch
Disallow: /musicsp
Disallow: /musiclp
Disallow: /browsersync
Disallow: /call
Disallow: /archivesearch?
Disallow: /archivesearch/url
Disallow: /archivesearch/advanced_search
Disallow: /base/search?
Disallow: /base/reportbadoffer
Disallow: /base/s2
Disallow: /urchin_test/
Disallow: /movies?
Disallow: /codesearch?
Disallow: /codesearch/feeds/search?
Disallow: /wapsearch?
Disallow: /safebrowsing
Allow: /safebrowsing/diagnostic
Allow: /safebrowsing/report_error/
Allow: /safebrowsing/report_phish/
Disallow: /reviews/search?
Disallow: /orkut/albums
Disallow: /jsapi
Disallow: /views?
Disallow: /c/
Disallow: /cbk
Disallow: /recharge/dashboard/car
Disallow: /recharge/dashboard/static/
Disallow: /translate_a/
Disallow: /translate_c
Disallow: /translate_f
Disallow: /translate_static/
Disallow: /translate_suggestion
Disallow: /profiles/me
Allow: /profiles
Disallow: /s2/profiles/me
Allow: /s2/profiles
Allow: /s2/photos
Allow: /s2/static
Disallow: /s2
Disallow: /transconsole/portal/
Disallow: /gcc/
Disallow: /aclk
Disallow: /cse?
Disallow: /cse/panel
Disallow: /cse/manage
Disallow: /tbproxy/
Disallow: /comparisonads/
Disallow: /imesync/
Disallow: /shenghuo/search?
Disallow: /support/forum/search?
Disallow: /reviews/polls/
Disallow: /hosted/images/
Disallow: /hosted/life/
Disallow: /ppob/?
Disallow: /ppob?
Disallow: /ig/add?
Disallow: /adwordsresellers
Disallow: /accounts/o8
Allow: /accounts/o8/id
Disallow: /topicsearch?q=
Disallow: /xfx7/
Disallow: /squared/api
Disallow: /squared/search
Disallow: /squared/table
Disallow: /toolkit/
Allow: /toolkit/*.html
Disallow: /qnasearch?
Disallow: /errors/
Disallow: /app/updates
Disallow: /sidewiki/entry/
Disallow: /quality_form?
Disallow: /labs/popgadget/search
Disallow: /buzz/post
Sitemap: http://www.gstatic.com/s2/sitemaps/profiles-sitemap.xml
Sitemap: http://www.google.com/hostednews/sitemap_index.xml
Sitemap: http://www.google.com/ventures/sitemap_ventures.xml
Sitemap: http://www.google.com/sitemaps_webmasters.xml
Sitemap: http://www.gstatic.com/trends/websites/sitemaps/sitemapindex.xml
Sitemap: http://www.gstatic.com/dictionary/static/sitemaps/sitemap_index.xml
A shorter example, whose sitemap lives on netkiller.sourceforge.net:
User-agent: *
Allow: *
Disallow: /management/
Sitemap: http://netkiller.sourceforge.net/sitemaps.xml.gz
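Rules like these can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`; it parses a three-rule excerpt of the Google file inline instead of fetching it over the network. The helper name `build_parser` and the `/intl/en/about` path are hypothetical, chosen only for illustration. The excerpt deliberately avoids overlapping Allow/Disallow rules, because the standard-library parser may resolve such overlaps differently from a particular crawler.

```python
from urllib.robotparser import RobotFileParser

# A short excerpt of the rules listed above (not the full file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /groups
Disallow: /images
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Search result pages are excluded from crawling by "Disallow: /search".
print(parser.can_fetch("*", "http://www.google.com/search?q=netkiller"))  # False

# A path that matches no Disallow rule stays crawlable (illustrative path).
print(parser.can_fetch("*", "http://www.google.com/intl/en/about"))  # True
```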
http://www.sitemaps.org/
sitemap.xml
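As a rough illustration of the format described at sitemaps.org, the sketch below builds a minimal sitemap.xml with Python's standard library. The function name `build_sitemap` and the page URL passed to it are assumptions made for the example; only the required `urlset`, `url` and `loc` elements are emitted, and optional elements such as `lastmod` are left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap document as a string (hypothetical helper)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page  # special characters are escaped on output
    return ET.tostring(urlset, encoding="unicode")


# Illustrative page list; a real site would enumerate its own pages.
xml_text = build_sitemap(["http://netkiller.sourceforge.net/"])
print('<?xml version="1.0" encoding="UTF-8"?>')
print(xml_text)
```

The generated file can then be advertised to crawlers with a `Sitemap:` line in robots.txt, as in the examples above.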
Source: Netkiller 系列手札 (the Netkiller notes series)
Author: 陈景峯. To repost, please contact the author, and be sure to credit the original source, the author, and this notice.