Scrapy是python下的一个爬虫框架,挺不错的!官网在这:http://scrapy.org/。 Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data m
* Crawl website(开始抓取网站的内容) * @param startUrl----The first URL crawled,actually is the website's url * (第一个要抓取的链接,实际上就是网站的地址) * @param maxUrls----The max number of crawled URL(要抓取内容的链接数的最大值) * @param limithost----Whether limited host(是否限制主机的参数,true限制