python2.7 - Chinese text is garbled after Python writes it to a file
黄舟 2017-04-18 10:21:02
[Python discussion group]

A very simple little crawler program:

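    import urllib2

    # L is assumed to be a list of domain names defined earlier
    for i in L:
        content = urllib2.urlopen('http://X.X.X.X/cgi-bin/GetDomainOwnerInfo?domain=%s' % i)
        html = content.read()  # raw bytes, in whatever encoding the server used
        with open('domain_test.xml', 'a') as f:
            f.write(html)  # the bytes are written unchanged
            print html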

The output of print shows the Chinese text correctly:

<domaininfo strDomain="XXX.com." strOwner="XXX" strDepartment="云平台部" strBusiness="[互联网业务系统 - XXX" strUser="XXX;">

But when I open the XML file directly, the text is garbled:

<domaininfo strDomain="XXX.com." strOwner="XXX" strDepartment="云平台部" strBusiness="[互联网业务系统 - 第三方应用]" StrUser="XXX;">

Windows 7, Python 2.7

Could anyone tell me how to solve this problem?


All replies (3)
PHP中文网
  1. You need to know which encoding content uses, and consider whether it needs to be converted

  2. You need to open the file as utf-8 and then write to it
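The two points above can be sketched roughly as follows. Which encoding the server uses is not stated in the question; GBK is only an assumption here (plausible because print shows readable Chinese on a Chinese Windows console). The helper name `to_utf8` and the sample bytes are made up for illustration.

```python
def to_utf8(raw_bytes, source_encoding="gbk"):
    """Decode bytes from a known encoding and re-encode them as UTF-8."""
    text = raw_bytes.decode(source_encoding)  # bytes -> unicode text
    return text.encode("utf-8")               # unicode text -> UTF-8 bytes


# Sample input: three Chinese characters encoded as GBK
# (assumed to resemble what content.read() returns)
sample = u"\u4e91\u5e73\u53f0".encode("gbk")
converted = to_utf8(sample)
```

If the guessed encoding is wrong, `decode` will usually raise `UnicodeDecodeError`, which is a useful hint to try a different `source_encoding`.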

codecs.open(filename, mode[, encoding[, errors[, buffering]]])

Open an encoded file using the given mode and return a wrapped version
providing transparent encoding/decoding. The default file mode is 'r'
meaning to open the file in read mode.

Note: The wrapped version will only accept the object format defined by
the codecs, i.e. Unicode objects for most built-in codecs. Output is
also codec-dependent and will usually be Unicode as well.

Note: Files are always opened in binary mode, even if no binary mode was
specified. This is done to avoid data loss due to encodings using
8-bit values. This means that no automatic conversion of '\n' is done
on reading and writing.

encoding specifies the encoding which is to be used for the file.

errors may be given to define the error handling. It defaults to
'strict' which causes a ValueError to be raised in case an encoding
error occurs.

buffering has the same meaning as for the built-in open() function. It
defaults to line buffered.

import codecs
f = codecs.open("domain_test.xml", "w", "utf-8")
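
A slightly fuller sketch of this suggestion, under stated assumptions: the response body is taken to be GBK-encoded (not confirmed in the question), and `save_response` and the temporary file path are hypothetical names used only for illustration. The network call is replaced by sample bytes so the snippet runs on its own.

```python
import codecs
import os
import tempfile


def save_response(raw_bytes, path, source_encoding="gbk"):
    """Append a response body to `path`, stored as UTF-8."""
    # Assumption: the server sent bytes in `source_encoding`
    text = raw_bytes.decode(source_encoding)
    # codecs.open returns a wrapper that encodes unicode text on write
    with codecs.open(path, "a", "utf-8") as f:
        f.write(text)


# Stand-in for content.read(): a GBK-encoded XML fragment
fake_html = u'<domaininfo strDepartment="\u4e91\u5e73\u53f0">\n'.encode("gbk")

path = os.path.join(tempfile.mkdtemp(), "domain_test.xml")
save_response(fake_html, path)
save_response(fake_html, path)

# Read the file back as UTF-8 to check what was stored
with codecs.open(path, "r", "utf-8") as f:
    saved = f.read()
```

In the original loop, `html` would take the place of `fake_html`. The sketch uses mode "a" to keep the append behaviour from the question; mode "w" would truncate the file on every open.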

怪我咯

Try adding # -*- coding: utf-8 -*- at the top of the file

大家讲道理

Add #coding:utf-8 at the top of the file
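
For context, as PEP 263 describes it, the coding declaration tells the interpreter how to decode the source file itself, for example non-ASCII string literals. It does not re-encode bytes read from a URL or written to a file, so on its own it may not fix the garbled XML. A small sketch of what the declaration does affect; `literal_value` is a hypothetical helper that compiles the same source bytes under two different declarations.

```python
# Raw GBK bytes for one Chinese character, placed inside a string literal
gbk_bytes = u"\u4e91".encode("gbk")


def literal_value(declared_encoding):
    """Compile a tiny module with the given coding line; return its literal."""
    source = (b"# -*- coding: " + declared_encoding + b" -*-\n"
              b"s = u'" + gbk_bytes + b"'\n")
    namespace = {}
    exec(compile(source, "<demo>", "exec"), namespace)
    return namespace["s"]


as_gbk = literal_value(b"gbk")          # decoded as one Chinese character
as_latin1 = literal_value(b"latin-1")   # same bytes read as two Latin-1 characters
```

The bytes in the source are identical in both cases; only the interpretation of the literal changes. Data coming from `urllib2.urlopen(...).read()` never passes through this step.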
