c httpclient: fetching web pages (STM32)
优采云 · Published: 2021-09-14 04:03
1. GET method
Step 1: create a client, which is similar to opening a browser before visiting a page
HttpClient httpClient = new HttpClient();
Step 2: create a GET method for the URL of the page you want to crawl
GetMethod getMethod = new GetMethod("");
Step 3: execute the request and read the response status code; 200 means the request succeeded
int statusCode = httpClient.executeMethod(getMethod);
Step 4: read the page source
byte[] responseBody = getMethod.getResponseBody();
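Those are the four main steps. There is more to handle in practice, such as the page's character encoding. The complete example: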
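import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.methods.GetMethod;

HttpClient httpClient = new HttpClient();
GetMethod getMethod = new GetMethod("http://www.baidu.com/");
try {
    // Send the request; the return value is the HTTP status code
    int statusCode = httpClient.executeMethod(getMethod);
    if (statusCode != HttpStatus.SC_OK) {
        System.err.println("Method failed: "
                + getMethod.getStatusLine());
    }
    // Read the content
    byte[] responseBody = getMethod.getResponseBody();
    // Process the content (note: this decodes with the platform default charset)
    String html = new String(responseBody);
    System.out.println(html);
} catch (Exception e) {
    System.err.println("Page could not be accessed");
} finally {
    // Always release the connection, whether or not the request succeeded
    getMethod.releaseConnection();
}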
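On the encoding issue mentioned above: `new String(responseBody)` decodes with the platform default charset, which can garble a page served in another encoding. The sketch below uses only the JDK and shows one way to pick the charset from a `Content-Type` header value before decoding. The class and method names (`CharsetSketch`, `charsetOf`, `decode`) are hypothetical, and the ISO-8859-1 fallback is an assumption of this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper for illustration; not part of HttpClient.
class CharsetSketch {

    // Returns the charset named in a Content-Type value such as
    // "text/html; charset=UTF-8", or the fallback if none is usable.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback; // unknown or malformed charset name
                }
            }
        }
        return fallback;
    }

    // Decodes the raw body using the header's charset (assumed fallback: ISO-8859-1).
    static String decode(byte[] body, String contentType) {
        return new String(body, charsetOf(contentType, StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) {
        byte[] body = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(body, "text/html; charset=UTF-8"));
    }
}
```

With HttpClient 3.x itself, `getMethod.getResponseBodyAsString()` is a shorter alternative that decodes using the charset the response declares.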
2. POST method
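import org.apache.commons.httpclient.DefaultHttpMethodRetryHandler;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.NameValuePair;
import org.apache.commons.httpclient.methods.PostMethod;
import org.apache.commons.httpclient.params.HttpMethodParams;

HttpClient httpClient = new HttpClient();
PostMethod postMethod = new PostMethod(UrlPath);
// Use the default retry handler for recoverable I/O errors
postMethod.getParams().setParameter(HttpMethodParams.RETRY_HANDLER,
        new DefaultHttpMethodRetryHandler());
// Form fields to send in the request body
NameValuePair[] postData = new NameValuePair[2];
postData[0] = new NameValuePair("username", "xkey");
postData[1] = new NameValuePair("userpass", "********");
postMethod.setRequestBody(postData);
try {
    int statusCode = httpClient.executeMethod(postMethod);
    if (statusCode == HttpStatus.SC_OK) {
        byte[] responseBody = postMethod.getResponseBody();
        String html = new String(responseBody);
        System.out.println(html);
    }
} catch (Exception e) {
    System.err.println("Page could not be accessed");
} finally {
    postMethod.releaseConnection();
}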
This example sends two POST parameters, username = xkey and userpass = ********, to the URL held in UrlPath
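To make the request body more concrete: name/value pairs posted as a form normally travel as an `application/x-www-form-urlencoded` string. The JDK-only sketch below builds such a string by hand. `FormBodySketch` and its methods are hypothetical names, UTF-8 is an assumed encoding, and the password value is a made-up placeholder chosen to show escaping. It illustrates the wire format and does not reproduce HttpClient's internals.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of HttpClient.
class FormBodySketch {

    // Joins the pairs as name=value&name=value, percent-encoding each part.
    static String encode(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return sb.toString();
    }

    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("username", "xkey");
        params.put("userpass", "p@ss word"); // placeholder value
        // prints username=xkey&userpass=p%40ss+word
        System.out.println(encode(params));
    }
}
```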
For details on fetching gzip-compressed pages, see
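As a rough illustration of what handling a gzip page involves: a server usually compresses the body only when the request carries `Accept-Encoding: gzip`, and marks the response with `Content-Encoding: gzip`. Assuming the client hands back the still-compressed bytes, they can be inflated with the JDK's `GZIPInputStream`. `GzipSketch` and its methods are hypothetical names; the `gzip` helper exists only so the example can produce its own test data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not part of HttpClient.
class GzipSketch {

    // A gzip stream starts with the magic bytes 0x1F 0x8B.
    static boolean looksGzipped(byte[] data) {
        return data != null && data.length >= 2
                && (data[0] & 0xFF) == 0x1F && (data[1] & 0xFF) == 0x8B;
    }

    // Inflates a gzip-compressed body into the original bytes.
    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Compresses bytes; used here only to build sample input.
    static byte[] gzip(byte[] plain) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] body = gzip("<html>hello</html>".getBytes(StandardCharsets.UTF_8));
        byte[] page = looksGzipped(body) ? gunzip(body) : body;
        System.out.println(new String(page, StandardCharsets.UTF_8));
    }
}
```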
The other case is fetching non-text (binary) data, for which the response can be read as a stream:
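import java.io.InputStream;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.methods.GetMethod;

HttpClient httpClient = new HttpClient();
GetMethod getMethod = new GetMethod("http://www.baidu.com");
try {
    // The request must be executed before the response body can be read
    httpClient.executeMethod(getMethod);
    // May be null if the response has no body
    InputStream inputStream = getMethod.getResponseBodyAsStream();
    // Process inputStream here
} catch (Exception e) {
    System.err.println("Page could not be accessed");
} finally {
    getMethod.releaseConnection();
}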
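For the stream-processing step in the snippet above, a common pattern is to copy the stream in fixed-size chunks to a file or buffer so that large binary responses never have to fit in memory at once. A minimal JDK-only sketch follows. `StreamCopySketch.copy` is a hypothetical helper, and the 8 KB buffer size is an arbitrary choice.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; not part of HttpClient.
class StreamCopySketch {

    // Copies everything from in to out and returns the number of bytes copied.
    // The caller remains responsible for closing both streams.
    static long copy(InputStream in, OutputStream out) {
        try {
            byte[] buf = new byte[8192];
            long total = 0;
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
                total += n;
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the response stream; real code would pass the
        // InputStream obtained from the HTTP response instead.
        InputStream in = new ByteArrayInputStream(new byte[] {0, 1, 2, (byte) 0xFF});
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        System.out.println(copy(in, out) + " bytes copied");
    }
}
```

To save to disk, pass a `java.io.FileOutputStream` as `out`. The copy should finish before the connection is released, since the stream is tied to it.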