Fetching web page source with PHP (scraping an email address from a remote HTTPS page)
优采云, published 2022-03-16 10:24
Fetching the source of an SSL-certified HTTPS page with PHP
<?php
// The page to fetch (this variable holds the request URL, not a response).
$url = "https://faculty.xidian.edu.cn/system/resource/tsites/tsitesencrypt.jsp?id=_tsites_encryp_tsteacher_tsemail&content=a7da9f4e0712b1c3626d8439f202e43f6691c82013ba38566aef822c1325d2789ec60e565088f4f967d264d6e6f6231a69c3356def42082aeb9e969cc8ae996f9bf727f708cfff958a1d61e56e4edce659242cb0ceed0841bc36124341b0429c21cdf3130f623e71bfd80f03ad0179634b081f4ba15d74dbf1d02e2e1815795d&mode=8";
echo curl($url);
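// Fetch $url and return the response body as a string (false on failure).
function curl($url) {
    $ch = curl_init();
    // CURLOPT_CONNECTTIMEOUT is in seconds; the original value of 5000 was presumably meant as milliseconds.
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    // Return the body instead of printing it directly.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array('User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 8_0 like Mac OS X) AppleWebKit/600.1.3 (KHTML, like Gecko) Version/8.0 Mobile/12A4345d Safari/600.1.4'));
    curl_setopt($ch, CURLOPT_URL, $url);
    // Skip certificate and host-name verification; without these two lines the request fails with an SSL error.
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
    $contents = curl_exec($ch);
    curl_close($ch); // close the open session
    return $contents;
}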
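I was trying to scrape an email address from a remote HTTPS page and could not get it no matter what I tried; it turned out that the method above works. Note that the two lines curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); and curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false); are the key: without them cURL reports an SSL verification problem. They work by turning off the certificate and host-name checks, so the connection is no longer authenticated and this is best reserved for sources you already trust.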
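For comparison, here is a minimal sketch of a variant that leaves verification switched on. It assumes the SSL error was caused by PHP's cURL not finding a trusted CA bundle, which the post does not confirm. The function names secure_curl_options and curl_verified, and the $caBundle parameter (a path to a PEM file of CA certificates), are hypothetical and are not part of the original code.

```php
<?php
// Hypothetical helper: build cURL options that keep TLS verification enabled.
// $caBundle is an assumed path to a PEM file of trusted CA certificates.
function secure_curl_options($url, $caBundle = null) {
    $options = array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true, // return the body as a string
        CURLOPT_CONNECTTIMEOUT => 5,    // seconds
        CURLOPT_SSL_VERIFYPEER => true, // check the certificate chain
        CURLOPT_SSL_VERIFYHOST => 2,    // check that the certificate matches the host name
    );
    if ($caBundle !== null) {
        // Point cURL at an explicit CA bundle when the system default is missing.
        $options[CURLOPT_CAINFO] = $caBundle;
    }
    return $options;
}

// Hypothetical counterpart to curl(): same request, but fails loudly on SSL errors.
function curl_verified($url, $caBundle = null) {
    $ch = curl_init();
    curl_setopt_array($ch, secure_curl_options($url, $caBundle));
    $contents = curl_exec($ch);
    if ($contents === false) {
        // Report the reason instead of silently returning false.
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    // The handle is released when $ch goes out of scope.
    return $contents;
}
```

It would be called like curl($url) above, for example curl_verified($url, '/path/to/cacert.pem'), where the path is a placeholder. If the server's certificate is itself invalid, this variant still fails, and the error text from curl_error shows why.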
Posted @ 2019-11-08 15:30 by 远来是李