解决方案:统信服务器操作系统V20宣布免费使用授权！支持AMD64/ARM64架构

优采云发布时间: 2022-12-22 08:41

　　CentOS 8停止更新维护后，CentOS 7也将于2024年6月30日停止更新维护。

　　CentOS的停用使得用户无法获得官方的补丁安装支持和系统升级。国内使用CentOS服务器的企业面临着巨大的安全漏洞等问题。

　　对此，同心软件在2022同心UOS生态大会上正式发布了服务器操作系统UOS V20的免费使用授权。与目前使用最广泛的CentOS 7内核版本3.10相比，该版本是更高级的4.19内核版本。

　　据介绍，同心软件不提供任何免费授权的商业支持服务。如果用户需要商业保障和服务，可以通过激活方式升级到同心商业版。

　　同心UOS V20免费使用授权提供13年生命周期维护，源码与商用发布版本一致，功能无限制，但不提供衍生方案、售后服务和定制服务，目前支持AMD64/ARM64两种架构。

　　官方表示，免费许可适用于预算紧张，需要尽快完成CentOS迁移和更换的用户，适用于非核心业务系统、不需要商业服务支持的用户等。

　　同心服务器操作系统V20（免许可）下载：官网链接

　　解决方案:thinkphp自动采集怎么实现

　　方法二：simple_html_dom

　　这种方式比较适合采集结构简单的页面，以及HTML标签类名明确的页面，这样也不错。具体用途：

　　控制器示例：

　　public function index(){

// 参考文档：http://microphp.us/plugins/public/microphp_res/simple_html_dom/manual.htm#section_quickstart

// 下载地址：https://github.com/samacs/simple_html_dom/edit/master/simple_html_dom.php

// 使用方法：http://www.thinkphp.cn/topic/21635.html

import("Org.Util.simple_html_dom", '', '.php');

$html = file_get_html('http://www.zyctd.com/gqqg/');

$ret = $html->find('.supply_list_box ul',0)->first_child();

foreach($ret as $v){

echo $v;

};

}

　　方法三：获取页面HTML并进行正则匹配采集

　　例如一个演示：

　　采集一个页面：

　　我想得到上面四个信息：标题，数量，时间，跳转链接。

　　获取这些信息，以上两种方法都不能采集，最后一种方法采集。具体方法：

　　public function index(){

$url = "http://www.zyctd.com/gqqg/";

// http://www.zyctd.com/gqqg-p1.html

$supplyDB = M('supply');

$urlList = array();

$array = array();

for($x=1; $xgetInfo($v);

array_push($array,$curPageList);

};

foreach($array as $v){

foreach($v as $vv){

//echo $vv['title']."__".$vv['weight']."__".$vv['time']."

";

$data = array();

$data['title'] = $vv['title'];

$data['weight'] = $vv['weight'];

$data['add_time'] = $vv['add_time'];

$data['url'] = $vv['url'];

//$res = $supplyDB->add($data);

//echo $res;

echo "<p>".$vv['title']."

".$vv['weight']."

".$vv['add_time']."

".$vv['url']."";

}

// 获取信息

//$curPageList = $this->getInfo($html);

//p($curPageList);

}

private function getInfo($url){

$html = $this->getHtml($url);

$array = array();

// 匹配所有的标题

preg_match_all("#(.*?)#",$html,$matches);

$all_title = $matches[1];

preg_match_all("#发布时间：(.*?)#",$html,$matches);

// 匹配所有的发布时间

$all_time = $matches[1];

// 匹配所有的求购数量

preg_match_all("#求购数量：(.*?)#",$html,$matches);

$all_weight = $matches[1];

// 匹配跳转链接

preg_match_all("##",$html,$matches);

$all_url = $matches[1];

// 组合

foreach($all_title as $k => $v){

$arr = array();

$arr['title'] = $v;

$arr['weight'] = $all_weight[$k];

$arr['add_time'] = $all_time[$k];

$arr['url'] = $all_url[$k];

array_push($array,$arr);

}

return $array;

}

private function getHtml($url){

$html = file_get_contents($url);

$html = preg_replace("#\n#","",$html);

$html = preg_replace("#\r#","",$html);

$html = preg_replace("#\\s#","",$html);

return $html;

}</p>

　　以上就是thinkphp自动采集是如何实现的详细内容。更多内容请关注php中文网其他相关文章！

0

2022-12-22

免费采集系统

0 个评论

要回复文章请先登录或注册

AI时代内容工厂

解决方案:统信服务器操作系统V20宣布免费使用授权！支持AMD64/ARM64架构

0 个评论

发起人

AI时代内容工厂

解决方案:统信服务器操作系统V20宣布免费使用授权！支持AMD64/ARM64架构

0 个评论

发起人

相关问题