WPCrawler-master: a web crawler implemented in Java and MySQL

A web crawler implemented in Java and MySQL that targets a single WordPress site.
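Keeping a crawl inside one site usually comes down to comparing each discovered link's host with the host of the start URL. The sketch below shows one way to do that with only the Java standard library. The class name `SiteScope` and the example URLs are hypothetical and are not taken from the project's source, which may handle this differently.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: keeps a crawl inside one site by comparing host names. */
final class SiteScope {
    private final String host;

    /** The start URL must be absolute, for example "https://example.com/". */
    SiteScope(String startUrl) {
        String h = URI.create(startUrl).getHost();
        if (h == null) {
            throw new IllegalArgumentException("start URL has no host: " + startUrl);
        }
        this.host = h;
    }

    /** True when the link is an absolute http(s) URL on the same host as the start URL. */
    boolean contains(String link) {
        try {
            URI uri = URI.create(link);
            String scheme = uri.getScheme();
            String linkHost = uri.getHost();
            if (scheme == null || linkHost == null) {
                return false; // relative or opaque links (e.g. mailto:) are out of scope here
            }
            boolean web = scheme.equalsIgnoreCase("http") || scheme.equalsIgnoreCase("https");
            return web && linkHost.equalsIgnoreCase(host);
        } catch (IllegalArgumentException e) {
            return false; // malformed links are skipped
        }
    }

    /** Returns the in-scope links, de-duplicated, in first-seen order. */
    List<String> filter(List<String> links) {
        Set<String> kept = new LinkedHashSet<>();
        for (String link : links) {
            if (contains(link)) {
                kept.add(link);
            }
        }
        return new ArrayList<>(kept);
    }
}
```

A crawler would typically resolve relative links against the current page URL before passing them to `filter`, then queue only the returned links for fetching.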

It uses the following open-source libraries:

Apache HttpComponents 4.3

HTML Parser 2.0

MySQL Connector/J 5.1.27

UTF-8 encoding is used so that Chinese tags can be recorded.

MySQL is expected on XAMPP's default port, localhost:3306.

A local XAMPP environment is required.
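The UTF-8 and port settings above map naturally onto a JDBC connection URL. The following is a minimal sketch, not the project's actual code. The class name `MySqlSettings` and the database name `wpcrawler_demo` are assumptions made for illustration. `useUnicode` and `characterEncoding` are standard Connector/J connection properties.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical helper that assembles connection settings for a local XAMPP MySQL server. */
final class MySqlSettings {
    static final String HOST = "localhost";
    static final int PORT = 3306; // XAMPP's default MySQL port, as stated in the README

    /** Builds a JDBC URL that asks Connector/J to exchange text as UTF-8. */
    static String jdbcUrl(String database) {
        return "jdbc:mysql://" + HOST + ":" + PORT + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    /** Opens a connection; needs the Connector/J jar on the classpath and a running server. */
    static Connection open(String database, String user, String password) throws SQLException {
        return DriverManager.getConnection(jdbcUrl(database), user, password);
    }
}
```

For example, `MySqlSettings.open("wpcrawler_demo", user, password)` would connect with whatever credentials the local MySQL account uses. For Chinese text to be stored correctly, the database and tables themselves also need a UTF-8 character set.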

WPCrawler-master/
WPCrawler-master/.classpath
WPCrawler-master/.project
WPCrawler-master/.settings/
WPCrawler-master/.settings/org.eclipse.jdt.core.prefs
WPCrawler-master/README.md
WPCrawler-master/bin/
WPCrawler-master/bin/net/
WPCrawler-master/bin/net/johnhany/
WPCrawler-master/bin/net/johnhany/wpcrawler/
WPCrawler-master/bin/net/johnhany/wpcrawler/crawler.class
WPCrawler-master/bin/net/johnhany/wpcrawler/httpGet$1.class
WPCrawler-master/bin/net/johnhany/wpcrawler/httpGet.class
WPCrawler-master/bin/net/johnhany/wpcrawler/parsePage.class
WPCrawler-master/lib/
WPCrawler-master/lib/commons-logging-1.1.3.jar
WPCrawler-master/lib/htmllexer.jar
WPCrawler-master/lib/htmlparser.jar
WPCrawler-master/lib/httpclient-4.3.1.jar
WPCrawler-master/lib/httpcore-4.3.jar
WPCrawler-master/lib/mysql-connector-java-5.1.27-bin.jar
WPCrawler-master/result-2013-11-29.txt
WPCrawler-master/src/
WPCrawler-master/src/net/
WPCrawler-master/src/net/johnhany/
WPCrawler-master/src/net/johnhany/wpcrawler/
WPCrawler-master/src/net/johnhany/wpcrawler/crawler.java
WPCrawler-master/src/net/johnhany/wpcrawler/httpGet.java
WPCrawler-master/src/net/johnhany/wpcrawler/parsePage.java

Original link: https://www.dssz.com/2571469.html