File name:
Mining the Web: Discovering Knowledge from Hypertext Data
File size: 3 MB
Upload date: 2012-05-02
Description:
1 Introduction
  1.1 Crawling and indexing
  1.2 Topic directories
  1.3 Clustering and classification
  1.4 Hyperlink analysis
  1.5 Resource discovery and vertical portals
  1.6 Structured vs. unstructured data mining
  1.7 Bibliographic notes
Part Ⅰ Infrastructure
2 Crawling the web
  2.1 HTML and HTTP basics
  2.2 Crawling basics
  2.3 Engineering large-scale crawlers
    2.3.1 DNS caching, prefetching, and resolution
    2.3.2 Multiple concurrent fetches
    2.3.3 Link extraction and normalization
    2.3.4 Robot exclusion
    2.3.5 Eliminating already-visited URLs
    2.3.6 Spider traps
    2.3.7 Avoiding repeated expansion of links on duplicate pages
    2.3.8 Load monitor and manager
    2.3.9 Per-server work-queues
    2.3.10 Text repository
    2.3.11 Refreshing crawled pages
  2.4 Putting together a crawler
    2.4.1 Design of the core components
    2.4.2 Case study: using w3c-libwww
  2.5 Bibliographic notes
3 Web search and information retrieval
  3.1 Boolean queries and the inverted index
    3.1.1 Stopwords and stemming
    3.1.2 Batch indexing and updates
    3.1.3 Index compression techniques
  3.2 Relevance ranking
    3.2.1 Recall and precision
    3.2.2 The vector-space model
    3.2.3 Relevance feedback and Rocchio's method
    3.2.4 Probabilistic relevance feedback models
    3.2.5 Advanced issues
  3.3 Similarity search
    3.3.1 Handling "find-similar" queries
    3.3.2 Eliminating near duplicates via shingling
    3.3.3 Detecting locally similar subgraphs of the web
  3.4 Bibliographic notes
Part Ⅱ Learning
Part Ⅲ Applications
References
Index