PHP使用QPM实现多进程并行任务处理程序
考虑用PHP实现以下场景: 有一个抓站的URL列表保存在队列里,后台程序读取这个队列,然后转交给子进程去抓取HTML存放到文件里。 为了提高效率,允许多任务并行执行,但为了避免机器负载过高,限制了最大的并行任务数(为了测试方便,我们把这个数设为3),当队列中取到 END标记时,程序结束运行。
这个场景用QPM的Supervisor::taskFactoryMode()实现,非常简单。
QPM全名是 Quick Process Management Module for PHP. PHP 是强大的web开发语言,以至于大家常常忘记PHP 可以用来开发健壮的命令行(CLI)程序以至于daemon程序。 而编写daemon程序免不了与各种进程管理打交道。QPM正式为简化进程管理而开发的类库。QPM的项目地址是:https://github.com/Comos/qpm
为了,简化测试环境,我们可以用一个文本文件来模拟队列的数据。完整的例子文件看这里:spider_task_factory_data.txt
http://news.sina.com.cn/http://news.ifeng.com/http://news.163.com/http://news.sohu.com/http://ent.sina.com.cn/http://ent.ifeng.com/...END
使用QPM的taskFactoryMode之前,我们需要准备一个TaskFactory类。 我们将其命名为 SpiderTaskFactory,SpdierTaskFactory 的工厂方法fetchTask 正常返回 Runnable的子类的实例。当碰到END或文件结束,则throw StopSignal,这样程序就会终止。
以下是组装 Supervisor 并执行的代码片段。完整的例子见:spider_task_factory.php
//如果没有从参数指定输入,把spider_task_factory_data.txt作为数据源$input = isset($argv[1]) ? $argv[1] : __DIR__.'/spider_task_factory_data.txt';$spiderTaskFactory = new SpiderTaskFactory($input);$config = [ //指定taskFactory对象和工厂方法 'factoryMethod'=>[$spiderTaskFactory, 'fetchTask'], //指定最大并发数量为3 'quantity' => 3,];//启动Supervisorqpm\supervisor\Supervisor::taskFactoryMode($config)->start();
SpiderTaskFactory 的实现如下:
/** * 任务工厂,必须实现 fetchTask方法。 * 该方法正常返回 * */class SpiderTaskFactory {private $_fh;public function __construct($input) { $this->_input = $input; $this->_fh = fopen($input, 'r'); if ($this->_fh === false) { throw new Exception('fopen failed:'.$input); }}public function fetchTask() { while (true) { if (feof($this->_fh)) { throw new qpm\supervisor\StopSignal(); } $line = trim(fgets($this->_fh)); if ($line == 'END') { throw new qpm\supervisor\StopSignal(); } if (empty($line)) { continue; } break; } return new SpiderTask($line);}}
SpiderTask 的实现如下:
/** * 在子进程中执行任务的类 * 必须实现 qpm\process\Runnable 接口 */class SpiderTask implements qpm\process\Runnable {private $_target;public function __construct($target) { $this->_target = $target;}//在子进程中执行的部分public function run() { $r = @file_get_contents($this->_target); if ($r===false) { throw new Exception('fail to crawl url:'.$this->_target); } file_put_contents($this->getLocalFilename(), $r); }private function getLocalFilename() { $filename = str_replace('/', '~', $this->_target); $filename = str_replace(':', '_', $filename); $filename = $filename.'-'.date('YmdHis'); return __DIR__.'/_spider/'.$filename.'.html';}}
真实的生产环境,用队列替换文件输入,即可实现持久运行的生产者/消费者模型的程序。

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics











In PHP, password_hash and password_verify functions should be used to implement secure password hashing, and MD5 or SHA1 should not be used. 1) password_hash generates a hash containing salt values to enhance security. 2) Password_verify verify password and ensure security by comparing hash values. 3) MD5 and SHA1 are vulnerable and lack salt values, and are not suitable for modern password security.

PHP and Python each have their own advantages, and choose according to project requirements. 1.PHP is suitable for web development, especially for rapid development and maintenance of websites. 2. Python is suitable for data science, machine learning and artificial intelligence, with concise syntax and suitable for beginners.

PHP is widely used in e-commerce, content management systems and API development. 1) E-commerce: used for shopping cart function and payment processing. 2) Content management system: used for dynamic content generation and user management. 3) API development: used for RESTful API development and API security. Through performance optimization and best practices, the efficiency and maintainability of PHP applications are improved.

PHP type prompts to improve code quality and readability. 1) Scalar type tips: Since PHP7.0, basic data types are allowed to be specified in function parameters, such as int, float, etc. 2) Return type prompt: Ensure the consistency of the function return value type. 3) Union type prompt: Since PHP8.0, multiple types are allowed to be specified in function parameters or return values. 4) Nullable type prompt: Allows to include null values and handle functions that may return null values.

PHP is still dynamic and still occupies an important position in the field of modern programming. 1) PHP's simplicity and powerful community support make it widely used in web development; 2) Its flexibility and stability make it outstanding in handling web forms, database operations and file processing; 3) PHP is constantly evolving and optimizing, suitable for beginners and experienced developers.

PHP is mainly procedural programming, but also supports object-oriented programming (OOP); Python supports a variety of paradigms, including OOP, functional and procedural programming. PHP is suitable for web development, and Python is suitable for a variety of applications such as data analysis and machine learning.

PHP and Python have their own advantages and disadvantages, and the choice depends on project needs and personal preferences. 1.PHP is suitable for rapid development and maintenance of large-scale web applications. 2. Python dominates the field of data science and machine learning.

Using preprocessing statements and PDO in PHP can effectively prevent SQL injection attacks. 1) Use PDO to connect to the database and set the error mode. 2) Create preprocessing statements through the prepare method and pass data using placeholders and execute methods. 3) Process query results and ensure the security and performance of the code.
