
Detailed explanation of using anyproxy to improve the efficiency of public account article collection

Jul 07, 2018 pm 05:50 PM
No public

This article covers some advanced uses of anyproxy and analyzes how to make collecting articles from WeChat public accounts more efficient. Readers who need it can use it as a reference.

The main factors that interrupt collection are the following:

1. A poor network environment;

2. The WeChat client crashing on the phone or emulator;

3. Other network transmission errors.

I pay close attention to the operating cost of the collection system, which includes hardware, computing power, and the manual attention it takes up. Whenever collection is interrupted, the manual effort inevitably goes up, so the stability of the system has to improve. To that end I made some advanced modifications to anyproxy and used other tools to improve operating efficiency. The specific solutions follow:

1. Code upgrade

1) WeChat browser white screen

Solution: modify the file requestHandler.js, which is in the same directory as rule_default.js (on macOS: /usr/local/lib/node_modules/anyproxy/lib/; for Windows, the reader cnbattle gave this path in the comments: C:\Users\Administrator\AppData\Roaming\npm\node_modules\anyproxy\lib).

Find the function proxyReq.on("error",function(e){ in the code and change its contents:

//userRes.end(); // comment out this line
userRes.end('<script>setTimeout(function(){window.location.reload();},2000);</script>'); // insert this line

This way, when an error occurs, the proxy returns a piece of JS that reloads the current page after 2 seconds, so the program can keep going.
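To see the effect of this change in isolation, here is a minimal sketch that attaches the same response to an error event. The helper name attachReloadOnError, the delayMs parameter, and the fake request and response objects are illustrative assumptions and not part of anyproxy; only the proxyReq.on("error", ...) hook and the reload script come from the modification described here.

```javascript
const { EventEmitter } = require("events");

// Hypothetical helper: on an upstream error, answer the client with a page
// that reloads itself after delayMs milliseconds instead of an empty body.
function attachReloadOnError(proxyReq, userRes, delayMs = 2000) {
  proxyReq.on("error", function () {
    userRes.end(
      "<script>setTimeout(function(){window.location.reload();}," +
        delayMs +
        ");</script>"
    );
  });
}

// Stand-ins for the real request/response objects, for demonstration only.
const fakeProxyReq = new EventEmitter();
const fakeUserRes = {
  body: null,
  end(chunk) {
    this.body = chunk;
  },
};

attachReloadOnError(fakeProxyReq, fakeUserRes);
fakeProxyReq.emit("error", new Error("simulated network error"));
console.log(fakeUserRes.body);
```

The WeChat browser receives a tiny page instead of a blank one, and that page retries the same URL on its own.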

2) Replace all images to reduce the burden on the browser

First, make a very small image. I made a 1x1-pixel transparent PNG and put it in a folder of my choice. Then modify rule_default.js.

Add the following near the top of the file, where the other vars are declared:

var fs = require("fs"),
    img = fs.readFileSync("/Library/WebServer/Documents/space.png"); // replace with the absolute path to your own image
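If you would rather not draw the 1x1 transparent PNG by hand, it can also be generated with a short Node script. The sketch below is one possible way to do it and is not taken from the original setup; the function names crc32, pngChunk, and makeTransparentPixelPng are hypothetical, and it relies only on Node's built-in zlib module and Buffer.

```javascript
const zlib = require("zlib");

// Standard CRC-32 (reflected polynomial 0xEDB88320), required for PNG chunks.
function crc32(buf) {
  let crc = 0xffffffff;
  for (const byte of buf) {
    crc ^= byte;
    for (let k = 0; k < 8; k++) {
      crc = (crc >>> 1) ^ (0xedb88320 & -(crc & 1));
    }
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Build one PNG chunk: 4-byte length, 4-byte type, data, 4-byte CRC.
function pngChunk(type, data) {
  const typeAndData = Buffer.concat([Buffer.from(type, "ascii"), data]);
  const length = Buffer.alloc(4);
  length.writeUInt32BE(data.length, 0);
  const crc = Buffer.alloc(4);
  crc.writeUInt32BE(crc32(typeAndData), 0);
  return Buffer.concat([length, typeAndData, crc]);
}

// Hypothetical helper: a 1x1, fully transparent RGBA PNG.
function makeTransparentPixelPng() {
  const signature = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
  const ihdr = Buffer.alloc(13);
  ihdr.writeUInt32BE(1, 0); // width
  ihdr.writeUInt32BE(1, 4); // height
  ihdr[8] = 8; // bit depth
  ihdr[9] = 6; // color type 6 = RGBA
  // bytes 10-12 (compression, filter, interlace) stay 0
  const scanline = Buffer.from([0, 0, 0, 0, 0]); // filter byte, then R, G, B, A = 0
  return Buffer.concat([
    signature,
    pngChunk("IHDR", ihdr),
    pngChunk("IDAT", zlib.deflateSync(scanline)),
    pngChunk("IEND", Buffer.alloc(0)),
  ]);
}

const pixelPng = makeTransparentPixelPng();
console.log(pixelPng.length + " bytes");
```

The resulting buffer could be saved once with fs.writeFileSync and then loaded by the readFileSync line in rule_default.js, or used directly in place of img.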

Further down, find the function shouldUseLocalResponse: function(req,reqBody){ and insert this code inside it:

if(/mmbiz\.qpic\.cn/i.test(req.url)){
    req.replaceLocalFile = true;
    return true;
}else{
    return false;
}

Next, find the function dealLocalResponse: function(req, reqBody,callback){ and insert this code inside it:

if(req.replaceLocalFile){
    callback(200, {"content-type":"image/png"}, img);
}

Together, these three pieces of code replace every image in public account pages with the local image. This reduces network traffic and the memory the browser uses, and effectively improves operating efficiency.
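The way the two hooks cooperate can be tried outside the proxy with a small simulation. This is only a sketch: the simulate driver, the placeholder buffer, and the example URLs are assumptions made for illustration, while the hook names and the regular expression are the ones used in the rule file.

```javascript
// Stand-in for the image loaded with fs.readFileSync in the real rule file.
const placeholderImg = Buffer.from("placeholder image bytes");

// The two hooks from rule_default.js, reduced to the image-replacement logic.
const imageRule = {
  shouldUseLocalResponse: function (req, reqBody) {
    if (/mmbiz\.qpic\.cn/i.test(req.url)) {
      req.replaceLocalFile = true; // remember the decision for dealLocalResponse
      return true;
    }
    return false;
  },
  dealLocalResponse: function (req, reqBody, callback) {
    if (req.replaceLocalFile) {
      callback(200, { "content-type": "image/png" }, placeholderImg);
    }
  },
};

// Hypothetical driver that mimics how the proxy would consult the rule.
function simulate(url) {
  const req = { url: url };
  let response = null;
  if (imageRule.shouldUseLocalResponse(req, null)) {
    imageRule.dealLocalResponse(req, null, function (status, headers, body) {
      response = { status: status, headers: headers, body: body };
    });
  }
  return response; // null means the request would go out to the network
}

console.log(simulate("http://mmbiz.qpic.cn/mmbiz_jpg/example/640"));
console.log(simulate("http://mp.weixin.qq.com/s?__biz=example"));
```

Only requests whose URL contains mmbiz.qpic.cn are answered locally; everything else passes through untouched.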

3) Block phones or emulators from accessing useless, error-causing URLs

Also in rule_default.js, find the function replaceRequestOption: function(req,option){ and insert this code inside it:

var newOption = option;
// Replace this regex with strings that identify the URLs you want to block; separate additional patterns with |.
// btrace is a Tencent Video domain that, in practice, very easily crashes the browser, so it is included here.
if(/google|btrace/i.test(newOption.headers.host)){
    newOption.hostname = "127.0.0.1"; // this IP can be replaced with another one
    newOption.port = "80";
}
return newOption;

This modification was also mentioned in an earlier article; here it is again in more detail. It has many uses: different phones and emulators may access various useless addresses that slow the device down, and this code can block that access.
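The same check can be exercised on its own before editing the rule file, which makes it easier to test a new pattern. In this sketch the function name redirectBlockedHosts and the example host names are hypothetical; the pattern and the redirect target are the ones from the code in this section.

```javascript
// Hypothetical standalone version of the replaceRequestOption body:
// requests whose Host header matches the pattern are pointed at 127.0.0.1.
function redirectBlockedHosts(option, blockedPattern) {
  const newOption = option;
  if (blockedPattern.test(newOption.headers.host)) {
    newOption.hostname = "127.0.0.1";
    newOption.port = "80";
  }
  return newOption;
}

const blockedPattern = /google|btrace/i;

// Example request options; the host names are made up for illustration.
const blockedResult = redirectBlockedHosts(
  { hostname: "btrace.example.com", port: "443", headers: { host: "btrace.example.com" } },
  blockedPattern
);
const allowedResult = redirectBlockedHosts(
  { hostname: "mp.weixin.qq.com", port: "80", headers: { host: "mp.weixin.qq.com" } },
  blockedPattern
);

console.log(blockedResult.hostname, allowedResult.hostname);
```

Because the request is sent to the local machine instead of the real host, it fails fast (or is answered by a local web server if one is listening on port 80) rather than tying up the device.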

2. Use pm2 to manage anyproxy process

pm2 is a process manager for Node applications with built-in load balancing.

PM2 is a good fit when you want standalone code to use all CPUs on a server and to keep the process alive with zero-second reloads. It is well suited to IaaS setups, but should not be used for PaaS solutions (PaaS solutions will be developed later).

Main features:

Built-in load balancing (using the Node cluster module)

Runs in the background

Zero-second stop and reload; as I understand it, this means nothing has to be shut down during maintenance and upgrades

Startup scripts for Ubuntu and CentOS

Stops unstable processes (avoids infinite restart loops)

Console monitoring

Provides an HTTP API

Remote control and real-time interface API (a Node.js module that allows interaction with the PM2 process manager)

Tested on Node.js v0.11, v0.10 and v0.8; compatible with CoffeeScript; runs on Linux and macOS.

First install pm2:

sudo npm install -g pm2

Then run anyproxy under pm2:

sudo pm2 start anyproxy -x -- -i

anyproxy is now running under pm2.

Several pm2 commands help manage and monitor anyproxy:

# View the run log
sudo pm2 logs anyproxy [--lines 10]
# Stop anyproxy
sudo pm2 delete anyproxy
# Restart anyproxy
sudo pm2 restart anyproxy
# Monitor memory usage
sudo pm2 monit
# Check the run status
sudo pm2 list

Special tip: After pm2 is running, the terminal window can be closed.

The most important purpose of using pm2 to manage the anyproxy process is: after anyproxy exits the program due to an error, pm2 can automatically restart anyproxy.

3. Remove the sudo password prompt and start pm2 automatically at boot

The following method is for macOS; Windows should have an equivalent. If you know a similar method, you can send me a private message.

1) First remove the sudo password prompt

Run the command:

sudo visudo

Find the line:

%admin   ALL = (ALL) ALL

Change it to:

%admin   ALL = (ALL) NOPASSWD: ALL

sudo will no longer ask for a password, and pm2 can then be set to start automatically at boot.

2) Set up automatic start at boot

Enter these commands in the terminal:

cd
touch autoexec.sh
vim autoexec.sh

This opens the editor. Press i to start editing, then paste the following:

#!/bin/sh
sudo pm2 start anyproxy -x -- -i
sudo pm2 monit

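When you have finished editing, press Esc, then type :wq to save and quit the editor.

Then run:

chmod 755 autoexec.sh

The executable file is now ready.

Next, open System Preferences on the Mac, go to Users & Groups, select the current user on the left and Login Items on the right. Click the + button, navigate to the current user's home directory (shortcut: Shift+Command+H), select autoexec.sh, and add it to the login items. It will then start automatically at boot.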

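With these settings in place, the anyproxy system is more stable than before; most of the errors anyproxy ran into were in fact caused by an unstable emulator or phone. In actual testing, anyproxy can now run for long periods without crashing. The WeChat client, however, still crashes after about 6 hours of running; at one page every 2 seconds, that comes to roughly 10,000 pages collected in total. If you do not collect read counts, that can be the history message pages of 10,000 public accounts.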
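The figure of roughly 10,000 pages follows directly from the run time and the paging speed. A short calculation, with pagesCollected as a hypothetical helper name:

```javascript
// Pages collected in one uninterrupted run, given the run length in hours
// and the time spent on each page in seconds.
function pagesCollected(hours, secondsPerPage) {
  return Math.floor((hours * 3600) / secondsPerPage);
}

console.log(pagesCollected(6, 2)); // 10800, i.e. roughly 10,000 pages
```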

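When the WeChat client crashes, it exits the WeChat browser and stays on the public account profile page. So to push automation further, you can use TouchSprite (触动精灵) to write an automation script that periodically exits the WeChat browser and taps into the history message page again. That should make long-running automated collection possible.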

