Scraping Flash data from web pages (sample code for fetching web page data from the network, posted on 2011-05-31)

优采云 Published: 2021-12-02 00:02

  16. Fetching web page data from the network

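  When a web page is fetched over the network, the server may compress the response with GZIP, which reduces the amount of data transferred and speeds up page loading. The client therefore has to check whether the response is compressed (via the Content-Encoding response header) and, if it is, decompress the body with GZIPInputStream. Otherwise the bytes are decoded while still compressed and the result is garbled text.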
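  The effect described above can be reproduced without a network connection. The following is a minimal sketch, not code from the original post: the class `GzipRoundTrip` and its methods `gzip` and `gunzip` are hypothetical names, and UTF-8 is assumed as the charset only to keep the demo portable. It compresses a small page in memory, then compares decoding the raw bytes with decoding after decompression.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of the original article.
class GzipRoundTrip {

    // Compress text into GZIP bytes, standing in for a compressed response body.
    static byte[] gzip(String text, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(charset));
        }
        return out.toByteArray();
    }

    // Decompress GZIP bytes, then decode them with the given charset.
    static String gunzip(byte[] compressed, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[1024];
            int len;
            while ((len = gz.read(buffer)) != -1) {
                out.write(buffer, 0, len);
            }
        }
        return new String(out.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        Charset charset = StandardCharsets.UTF_8; // assumption: any charset works for the demo
        String page = "<html><body>hello gzip</body></html>";
        byte[] body = gzip(page, charset);

        // Decoding the still-compressed bytes does not give back the page.
        String wrong = new String(body, charset);
        System.out.println("without gunzip, equals page: " + wrong.equals(page));

        // Decompressing first recovers the original text.
        String right = gunzip(body, charset);
        System.out.println("with gunzip, equals page: " + right.equals(page));
    }
}
```

  A GZIP stream always starts with the two magic bytes 0x1f 0x8b, so the undecompressed body can never decode to the original markup, whatever charset is used.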

  The following sample code fetches web page data from the network:

package com.ljq.test;

import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.zip.GZIPInputStream;

/**
 * Fetches web page data from the network.
 *
 * @author jiqinlin
 */
public class InternetTest2 {

    public static void main(String[] args) throws Exception {
        String result = "";
        //URL url = new URL("http://www.sohu.com");
        URL url = new URL("http://www.ku6.com/");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(6 * 1000); // connection timeout in milliseconds
        if (conn.getResponseCode() != 200) throw new RuntimeException("Request to URL failed");
        // Read the header once, before the connection is closed
        String contentEncoding = conn.getContentEncoding();
        InputStream is = conn.getInputStream(); // input stream returned by the server
        if ("gzip".equals(contentEncoding)) {
            result = readDataForZgip(is, "GBK");
        } else {
            result = readData(is, "GBK");
        }
        conn.disconnect();
        System.out.println(result);
        System.err.println("ContentEncoding: " + contentEncoding);
    }

    // First parameter: the input stream; second parameter: the charset name
    public static String readData(InputStream inStream, String charsetName) throws Exception {
        ByteArrayOutputStream outStream = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int len = -1;
        // Copy the stream into memory until end of stream
        while ((len = inStream.read(buffer)) != -1) {
            outStream.write(buffer, 0, len);
        }
        byte[] data = outStream.toByteArray();
        outStream.close();
        inStream.close();
        return new String(data, charsetName);
    }

    // First parameter: the GZIP-compressed input stream; second parameter: the charset name
    public static String readDataForZgip(InputStream inStream, String charsetName) throws Exception {
        GZIPInputStream gzipStream = new GZIPInputStream(inStream);
        ByteArrayOutputStream outStream = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int len = -1;
        // Reading through gzipStream yields the decompressed bytes
        while ((len = gzipStream.read(buffer)) != -1) {
            outStream.write(buffer, 0, len);
        }
        byte[] data = outStream.toByteArray();
        outStream.close();
        gzipStream.close(); // also closes the underlying inStream
        return new String(data, charsetName);
    }

}
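  The two read methods above differ only in whether the stream is wrapped in a GZIPInputStream, so they can be folded into one. The following is a minimal sketch under stated assumptions, not the author's code: `ResponseBodyReader` and `read` are hypothetical names, the header value is compared case-insensitively, and the demo feeds in-memory streams with UTF-8 instead of a live connection with GBK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class; the name and signature are illustrative only.
class ResponseBodyReader {

    // Reads the whole body, unwrapping GZIP when contentEncoding says so.
    // contentEncoding may be null when the response has no Content-Encoding header.
    static String read(InputStream raw, String contentEncoding, Charset charset) throws IOException {
        boolean gzipped = "gzip".equalsIgnoreCase(contentEncoding);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // try-with-resources closes the streams even if reading fails
        try (InputStream source = raw;
             InputStream in = gzipped ? new GZIPInputStream(source) : source) {
            byte[] buffer = new byte[1024];
            int len;
            while ((len = in.read(buffer)) != -1) {
                out.write(buffer, 0, len);
            }
        }
        return new String(out.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        Charset charset = StandardCharsets.UTF_8; // assumption for the demo only
        String page = "<html><body>same helper, two encodings</body></html>";

        // Build a compressed copy of the page in memory.
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(compressed)) {
            gz.write(page.getBytes(charset));
        }

        String fromGzip = read(new ByteArrayInputStream(compressed.toByteArray()), "gzip", charset);
        String fromPlain = read(new ByteArrayInputStream(page.getBytes(charset)), null, charset);
        System.out.println("both match: " + (fromGzip.equals(page) && fromPlain.equals(page)));
    }
}
```

  With a real connection, the call would look like `read(conn.getInputStream(), conn.getContentEncoding(), Charset.forName("GBK"))`. Servers generally compress only when the request advertises support, so it may also be necessary to call `conn.setRequestProperty("Accept-Encoding", "gzip")` before connecting. Whether a given site compresses without that header is not something the original post states.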

  Posted on 2011-05-31 15:40 by 无情
