Set HTTP proxy in Python program
0x00 Preface
HTTP proxies are familiar to most developers and are used in a wide range of scenarios. They come in two kinds: forward proxies and reverse proxies. A reverse proxy generally gives users access to services behind a firewall or provides load balancing; typical examples are Nginx and HAProxy. This article discusses forward proxies.
The most common uses of a forward HTTP proxy are sharing a network connection, speeding up access, and getting around network restrictions. HTTP proxies are also widely used for debugging web applications and for monitoring and analyzing the Web APIs called by Android/iOS apps; well-known tools include Fiddler, Charles, Burp Suite, and mitmproxy. An HTTP proxy can also modify request and response content, adding features to a web application or changing its behavior without touching the server.
0x01 What is HTTP proxy
An HTTP proxy is essentially a web application, not fundamentally different from any other. After receiving a request, the proxy determines the target host from the Host header and the address in the GET/POST request line, opens a new HTTP request to that host, forwards the request data, and relays the response it receives back to the client.
If the request address is an absolute URL, the proxy uses the host in that URL; otherwise it uses the Host header. A simple test shows this, assuming the following network environment:
192.168.1.2 Web server
192.168.1.3 HTTP proxy server
Test with telnet:
$ telnet 192.168.1.3
GET / HTTP/1.0
HOST: 192.168.1.2
Note that the request must end with two consecutive line breaks (press Enter twice); the blank line marks the end of the headers, as the HTTP protocol requires. Once the request is sent, you receive the page content of http://192.168.1.2/. Now make one adjustment and put an absolute address in the GET request:
$ telnet 192.168.1.3
GET http://httpbin.org/ip HTTP/1.0
HOST: 192.168.1.2
Note that HOST is again set to 192.168.1.2, but this time the response is the content of http://httpbin.org/ip, that is, the public IP address information.
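The two telnet sessions can be reproduced from Python with a raw socket. The following is a minimal sketch, not a robust client: the function names are made up for illustration, and the proxy port has to be supplied by the caller because the text does not state which port the proxy listens on.

```python
import socket


def build_request(target, host):
    """Build the HTTP/1.0 request typed into telnet above.

    `target` is either a path ("/") or an absolute URL; `host` goes into
    the Host header. The two empty strings produce the blank line that
    ends the header block.
    """
    lines = [f"GET {target} HTTP/1.0", f"Host: {host}", "", ""]
    return "\r\n".join(lines).encode("ascii")


def send_via_proxy(proxy_host, proxy_port, payload, timeout=5.0):
    """Send raw bytes to the proxy and return everything it sends back."""
    chunks = []
    with socket.create_connection((proxy_host, proxy_port), timeout=timeout) as sock:
        sock.sendall(payload)
        while True:
            data = sock.recv(4096)
            if not data:  # HTTP/1.0: the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks)
```

For example, `send_via_proxy("192.168.1.3", 8080, build_request("/", "192.168.1.2"))` would repeat the first test, assuming the proxy listens on port 8080.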
As the test shows, an HTTP proxy is not complicated: the client just sends its original request to the proxy server. When a program offers no way to configure an HTTP proxy and only a few hosts need to go through one, the simplest approach is to point the target host's domain name at the proxy server's IP address, which can be done by editing the hosts file.
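The host-selection rule described in this section (an absolute URL in the request line wins, otherwise the Host header is used) can be sketched as a small function. This is an illustration of the rule as the text states it, not code taken from any particular proxy; `resolve_target` is a hypothetical name, and the sketch only handles hostnames and IPv4 addresses in the Host header.

```python
from urllib.parse import urlsplit


def resolve_target(request_target, headers):
    """Return (host, port, path) for the upstream request a proxy would make."""
    parts = urlsplit(request_target)
    if parts.scheme and parts.hostname:
        # Absolute URL in the request line: it takes precedence over Host.
        host = parts.hostname
        port = parts.port or (443 if parts.scheme == "https" else 80)
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        return host, port, path
    # Relative path: fall back to the Host header (names are case-insensitive).
    lowered = {name.lower(): value for name, value in headers.items()}
    host, _, port = lowered["host"].partition(":")
    return host, int(port) if port else 80, request_target
```

With a relative path the Host header decides the target; with an absolute URL the header is ignored, which matches the behavior seen in the two telnet tests.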
0x02 Set HTTP proxy in Python program
urllib2/urllib proxy setting
urllib2 is part of the Python standard library. It is very powerful, but somewhat cumbersome to use. In Python 3, urllib2 no longer exists as a separate module; its functionality moved into the urllib package (urllib.request). In urllib2, ProxyHandler is used to set up the proxy server.
import urllib2  # Python 2 only

# Route http:// requests through the proxy
proxy_handler = urllib2.ProxyHandler({'http': '121.193.143.249:80'})
opener = urllib2.build_opener(proxy_handler)
r = opener.open('http://httpbin.org/ip')
print(r.read())
You can also use install_opener to install the configured opener globally, so that every urllib2.urlopen call uses the proxy automatically.
urllib2.install_opener(opener)
r = urllib2.urlopen('http://httpbin.org/ip')
print(r.read())
In Python 3, use urllib.request:
import urllib.request

proxy_handler = urllib.request.ProxyHandler({'http': 'http://121.193.143.249:80/'})
opener = urllib.request.build_opener(proxy_handler)
r = opener.open('http://httpbin.org/ip')
print(r.read())
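The global install_opener approach works in Python 3 as well. The sketch below wraps it in small helpers; the helper names are illustrative and not part of urllib. It also shows a related case: passing an empty dict to ProxyHandler builds an opener that uses no proxy at all, even if proxy environment variables are set.

```python
import urllib.request


def make_opener(proxy_url):
    """Build an opener that routes plain-HTTP requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


def install_proxy(proxy_url):
    """Make urllib.request.urlopen use the proxy for the rest of the process."""
    opener = make_opener(proxy_url)
    urllib.request.install_opener(opener)
    return opener


def make_direct_opener():
    """Build an opener that ignores proxy settings from the environment."""
    # An explicit empty mapping replaces the default ProxyHandler,
    # which would otherwise read http_proxy and friends.
    return urllib.request.build_opener(urllib.request.ProxyHandler({}))
```

After `install_proxy('http://121.193.143.249:80/')`, a plain `urllib.request.urlopen('http://httpbin.org/ip')` would go through the proxy, assuming the proxy address from the example above is still reachable.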
requests proxy settings
requests is one of the best HTTP libraries available, and it is the one I use most when constructing HTTP requests. Its API is user-friendly and easy to use. Setting a proxy for requests is simple: pass the proxies parameter a dict of the form {'http': 'x.x.x.x:8080', 'https': 'x.x.x.x:8080'}. The http and https entries are independent of each other: each applies only to requests for URLs of that scheme.
In [5]: requests.get('http://httpbin.org/ip', proxies={'http': '121.193.143.249:80'}).json()
Out[5]: {'origin': '121.193.143.249'}
You can also set the proxies attribute on a session, which saves passing the proxies parameter with every request.
import requests

s = requests.Session()
s.proxies = {'http': '121.193.143.249:80'}
print(s.get('http://httpbin.org/ip').json())
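Because the http and https entries are looked up separately, a session that should send every request through one proxy needs both keys. A possible way to build that mapping is sketched below; `normalize_proxy`, `build_proxies`, and `fetch_origin` are hypothetical helper names, and writing the proxy with an explicit `http://` scheme is a stylistic choice made here rather than something the text requires.

```python
def normalize_proxy(address, default_scheme="http"):
    """Prefix a bare host:port with a scheme so the proxy URL is unambiguous."""
    return address if "://" in address else f"{default_scheme}://{address}"


def build_proxies(address):
    """Return a requests-style mapping covering both http:// and https:// URLs."""
    url = normalize_proxy(address)
    return {"http": url, "https": url}


def fetch_origin(address, url="http://httpbin.org/ip", timeout=10):
    """Ask httpbin which IP it sees when the request goes through the proxy."""
    import requests  # third-party; imported here so the helpers above work without it

    with requests.Session() as session:
        session.proxies.update(build_proxies(address))
        return session.get(url, timeout=timeout).json()
```

Calling `fetch_origin('121.193.143.249:80')` needs network access and a working proxy at that address, so treat it as illustrative only.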
0x03 HTTP_PROXY / HTTPS_PROXY environment variables
Both urllib2 and requests recognize the HTTP_PROXY and HTTPS_PROXY environment variables and use the proxy automatically once they detect them. This is very useful when debugging through an HTTP proxy, because you can change the proxy server's IP address and port through the environment without modifying any code. Most *nix software also honors these variables, for example curl, wget, axel, and aria2c. The examples below use the lowercase form http_proxy, which is the only spelling curl accepts for this variable.
$ http_proxy=121.193.143.249:80 python -c 'import requests; print(requests.get("http://httpbin.org/ip").json())'
{u'origin': u'121.193.143.249'}
$ http_proxy=121.193.143.249:80 curl httpbin.org/ip
{
  "origin": "121.193.143.249"
}
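To check from Python what the environment currently provides, the standard library exposes `urllib.request.getproxies()`, which collects the `*_proxy` variables into a dict keyed by scheme. The sketch below wraps it in a hypothetical helper, `proxy_for`; it deliberately ignores no_proxy exclusions, so it shows only the basic lookup.

```python
import os
import urllib.request
from urllib.parse import urlsplit


def proxy_for(url):
    """Return the proxy the environment defines for url's scheme, or None.

    Simplification: no_proxy exclusions are not taken into account.
    """
    proxies = urllib.request.getproxies()  # e.g. {'http': '...', 'https': '...'}
    return proxies.get(urlsplit(url).scheme)


if __name__ == "__main__":
    os.environ["http_proxy"] = "http://127.0.0.1:8080"  # assumed local proxy
    print(proxy_for("http://httpbin.org/ip"))
```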
In an IPython session you often need to debug HTTP requests ad hoc. You can turn the HTTP proxy on and off simply by setting os.environ['http_proxy'].
In [245]: os.environ['http_proxy'] = '121.193.143.249:80'
In [246]: requests.get("http://httpbin.org/ip").json()
Out[246]: {u'origin': u'121.193.143.249'}
In [249]: os.environ['http_proxy'] = ''
In [250]: requests.get("http://httpbin.org/ip").json()
Out[250]: {u'origin': u'x.x.x.x'}
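Setting and clearing the variable by hand is easy to forget. One option is a small context manager that sets the proxy variables for the duration of a block and restores the previous values afterwards. This is a sketch; `temporary_proxy` is a made-up name, and the default variable list is an assumption.

```python
import contextlib
import os


@contextlib.contextmanager
def temporary_proxy(address, variables=("http_proxy", "https_proxy")):
    """Set the given proxy variables inside the with-block, then restore them."""
    saved = {name: os.environ.get(name) for name in variables}
    try:
        for name in variables:
            os.environ[name] = address
        yield
    finally:
        for name, old in saved.items():
            if old is None:
                os.environ.pop(name, None)  # was unset before: unset it again
            else:
                os.environ[name] = old
```

Inside `with temporary_proxy('121.193.143.249:80'):` any library that reads the environment on each request picks up the proxy; once the block ends, the previous settings are back.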
0x04 MITM-Proxy
MITM stands for Man-in-the-Middle, as in man-in-the-middle attack: intercepting, monitoring, and tampering with data on the network path between client and server.
mitmproxy is an open-source man-in-the-middle proxy written in Python. It supports SSL, transparent proxying, reverse proxying, traffic recording and replay, and custom scripts. Its functionality resembles Fiddler on Windows, but mitmproxy is a console program with no GUI; it is nevertheless quite convenient to use. With mitmproxy you can easily filter, intercept, and modify any HTTP request or response passing through the proxy, and you can use its scripting API to write scripts that intercept and modify HTTP data automatically.
# test.py
def response(flow):
    flow.response.headers["BOOM"] = "boom!boom!boom!"
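The script above adds a header named BOOM to every HTTP response that passes through the proxy. Start mitmproxy with the command mitmproxy -s 'test.py' and check with curl: the response does indeed carry an extra BOOM header.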
$ http_proxy=localhost:8080 curl -I 'httpbin.org/get'
HTTP/1.1 200 OK
Server: nginx
Date: Thu, 03 Nov 2016 09:02:04 GMT
Content-Type: application/json
Content-Length: 186
Connection: keep-alive
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
BOOM: boom!boom!boom!
...
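Clearly mitmproxy scripts can do far more than this; combined with the power of Python, they open up many applications. Beyond scripting, mitmproxy also provides a powerful API on which you can build your own custom proxy server with special-purpose features.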
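As one illustration of going a step further, a script can define a request hook alongside the response hook and act only on selected hosts. The sketch below assumes the same module-level hook style as test.py; the file name, the host filter, and the X-Debug-Proxy header are made up for the example.

```python
# rewrite.py (hypothetical name); start with: mitmproxy -s rewrite.py
TARGET_HOST = "httpbin.org"  # assumption: only touch traffic for this host


def request(flow):
    """Runs for each client request before mitmproxy forwards it."""
    if flow.request.pretty_host == TARGET_HOST:
        flow.request.headers["X-Debug-Proxy"] = "mitmproxy"


def response(flow):
    """Runs for each server response before it is returned to the client."""
    if flow.request.pretty_host == TARGET_HOST:
        flow.response.headers["BOOM"] = "boom!boom!boom!"
```

Traffic to other hosts passes through unchanged, which keeps the debugging changes scoped to the site under test.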
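In my performance tests, mitmproxy turned out not to be particularly efficient. That is fine for debugging, but in production, with a large number of concurrent requests going through the proxy, its performance falls somewhat short. I implemented a simple proxy with twisted to add features to our company's internal websites and improve the user experience; I will share it when I get the chance.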