Summary of garbled Chinese character problems in Python
This article explains, in one place, why Chinese characters come out garbled in Python 2,
so that beginners no longer have to worry about garbled Chinese characters in Python 2.
The details below come from Brother Huang, a teacher of the Python training class at Diaim Company:
1. The code file you write needs to declare its encoding
If the file does not declare an encoding, Python 2 treats the source as ASCII.
ASCII defines only 128 characters and does not include Chinese, so an error is reported.
So you need to write #coding:utf-8 or #coding:gbk on the first or second line of the file.
Generally, write #coding:utf-8.
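As a minimal sketch of point 1, the file below declares UTF-8 and then uses a Chinese string. The variable names `title` and `raw` are made up for illustration, and the sketch assumes the file really is saved as UTF-8.

```python
# -*- coding: utf-8 -*-
# The declaration above is an equivalent spelling of "#coding:utf-8".
# It must sit on the first or second line of the file.

title = u'中文'               # a unicode literal containing two characters
raw = title.encode('utf-8')   # the bytes that represent it in UTF-8

print(len(title))  # number of characters
print(len(raw))    # number of bytes
```

Each of these two characters takes three bytes in UTF-8, which is why the two lengths differ.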
2. Python 2 uses unicode as its unified internal representation for text
Unicode can represent the characters of all the world's languages.
UTF-8 is one encoding of Unicode, which is why #coding:utf-8 is the usual declaration at the top of the file.
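To make the difference between unicode and its encodings concrete, here is a small sketch in which the same unicode text is encoded two ways. The names `text`, `utf8_bytes` and `gbk_bytes` are illustrative only.

```python
# -*- coding: utf-8 -*-
text = u'中文'                        # one unicode value

utf8_bytes = text.encode('utf-8')    # b'\xe4\xb8\xad\xe6\x96\x87'
gbk_bytes = text.encode('gbk')       # b'\xd6\xd0\xce\xc4'

# Different bytes on disk, but both decode back to the same unicode text.
same_text = utf8_bytes.decode('utf-8') == gbk_bytes.decode('gbk')
```

The bytes differ, yet decoding each with its own codec gives back identical text.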
3. Encoding conversion
Keep in mind that the internal representation of text in Python 2 is unicode.
To convert, first decode() the data from its current encoding into unicode, then encode() it into the encoding you want. Done this way, there will be no garbled characters.
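The decode-then-encode rule can be wrapped in a small helper. This is only a sketch: `transcode` is a hypothetical name, not a standard library function, and it assumes the source encoding is already known.

```python
# -*- coding: utf-8 -*-

def transcode(data, source_encoding, target_encoding):
    """Convert encoded bytes from one encoding to another via unicode."""
    text = data.decode(source_encoding)   # bytes -> unicode
    return text.encode(target_encoding)   # unicode -> bytes


gbk_bytes = u'中文'.encode('gbk')          # pretend this arrived as GBK data
utf8_bytes = transcode(gbk_bytes, 'gbk', 'utf-8')
```

Passing the wrong `source_encoding` either raises a `UnicodeDecodeError` or silently produces garbled text, so the helper is only as reliable as that argument.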
4. When scraping web pages
Suppose the code file declares #coding:utf-8
and the web page is encoded in gbk.
The page content then needs to be converted like this:
html = html.decode('gbk').encode('utf-8')
5. You can also write #coding:gbk at the top of the file, but then you must make sure the file itself is saved in gbk. This situation mostly comes up on Windows.
6. Problems with Chinese characters in dictionary keys or values
#coding:utf-8
dict1 = {1: 'python weekend training class', 2: 'Consultation 010-68165761 QQ: 1465376564'}
print dict1
# When the values contain Chinese characters, this output does not display them;
# it displays their escaped byte values (such as \xe4\xb8\xad) instead.
dict2 = {1: 'python video training class', 2: 'Consultation 010-68165761 QQ: 1465376564'}
# Printing each value on its own displays the characters correctly.
for key in dict2:
    print dict2[key]
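The reason for the difference is that printing a whole dictionary shows the repr() of each key and value, and repr() escapes non-ASCII bytes. The sketch below reproduces this with a byte string value. The names `names`, `container_view` and `item_text` are illustrative, and `print(...)` is used so that the snippet also runs on Python 3.

```python
# -*- coding: utf-8 -*-
# A UTF-8 byte string, like a plain '...' literal in a UTF-8 Python 2 file.
names = {1: u'中文'.encode('utf-8')}

container_view = repr(names)            # what printing the dict shows
item_text = names[1].decode('utf-8')    # the value itself, as unicode text

print(container_view)                   # contains \xe4\xb8\xad\xe6\x96\x87
```

The data is intact in both cases. Only the display of the container is escaped.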
7. Writing unicode Chinese text to a text file
The text needs to be converted according to the encoding of the text file.
You can use encode('utf-8') or encode('gbk').
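A minimal sketch of point 7, assuming the target file is meant to be GBK. The helper `write_text` and the file name `demo.txt` are hypothetical, and a temporary directory is used so that nothing important is overwritten.

```python
# -*- coding: utf-8 -*-
import os
import tempfile


def write_text(path, text, encoding):
    """Encode unicode text and write the resulting bytes to a file."""
    with open(path, 'wb') as handle:
        handle.write(text.encode(encoding))


path = os.path.join(tempfile.mkdtemp(), 'demo.txt')  # hypothetical file
write_text(path, u'中文', 'gbk')

with open(path, 'rb') as handle:
    stored = handle.read()   # the raw GBK bytes now on disk
```

Opening the file in binary mode keeps the explicit encode() step from the text above visible. Whatever reads the file later has to decode it with the same encoding.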
Summary: whenever the error message contains "ASCII", it means the encoding of the Chinese characters has not been specified.
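The following sketch triggers such an error on purpose by decoding UTF-8 bytes with the ASCII codec. Python 2 does the same thing implicitly when a byte string containing Chinese is mixed with a unicode string. The variable names are illustrative.

```python
# -*- coding: utf-8 -*-
utf8_bytes = u'中文'.encode('utf-8')

message = None
try:
    utf8_bytes.decode('ascii')        # ASCII cannot represent these bytes
except UnicodeDecodeError as error:
    message = str(error)              # the message names the 'ascii' codec

print(message)
```

Naming the real encoding in the decode() call, for example 'utf-8', makes the error go away.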
---- Get the encoding type of a string ----
>>> import urllib2
>>> response = urllib2.urlopen("http://www.baidu.com")
>>> d = response.read()
>>> import chardet
>>> chardet.detect(d)
{'confidence': 0.99, 'encoding': 'utf-8'}
