


Node.js crawls Chinese webpage garbled problems and solutions_node.js
When Node.js crawls non-utf-8 Chinese web pages, garbled characters will appear. For example, NetEase’s homepage encoding is gb2312, and garbled characters will appear when crawling
var request = require('request')
var url = 'http://www.163.com'
request(url, function (err, res, body) {
console.log(body)
})
You can use iconv-lite to solve
Installation
npm install iconv-lite
At the same time, let’s modify the user-agent to prevent the website from being blocked:
var originRequest = require('request')
var iconv = require('iconv-lite')
var headers = {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS }
function request (url, callback) {
var options = {
url: url,
encoding: null,
headers: headers
}
originRequest(options, callback)
}
var html = iconv.decode(body, 'gb2312')
console.log(html)
})
Garbled code problem solved
Use cheerio to parse HTMLInstallation
request(url, function (err, res, body) {
var html = iconv.decode(body, 'gb2312')
var $ = cheerio.load(html)
console.log($('h1').text())
console.log($('h1').html())
})
NetEase
Solve the "garbled" problem of cheerio .html()
Check the document to find out that you can turn off the function of converting entity encoding
var originRequest = require('request')
var cheerio = require('cheerio')
var iconv = require('iconv-lite')
var headers = {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.65 Safari/537.36'
}
function request (url, callback) {
var options = {
url: url,
encoding: null,
headers: headers
}
originRequest(options, callback)
}
var url = 'http://www.163.com'
request(url, function (err, res, body) {
var html = iconv.decode(body, 'gb2312')
var $ = cheerio.load(html, {decodeEntities: false})
console.log($('h1').text())
console.log($('h1').html())
})

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics

Troubleshooting and solutions to the company's security software that causes some applications to not function properly. Many companies will deploy security software in order to ensure internal network security. ...

The following steps can be used to resolve the problem that Navicat cannot connect to the database: Check the server connection, make sure the server is running, address and port correctly, and the firewall allows connections. Verify the login information and confirm that the user name, password and permissions are correct. Check network connections and troubleshoot network problems such as router or firewall failures. Disable SSL connections, which may not be supported by some servers. Check the database version to make sure the Navicat version is compatible with the target database. Adjust the connection timeout, and for remote or slower connections, increase the connection timeout timeout. Other workarounds, if the above steps are not working, you can try restarting the software, using a different connection driver, or consulting the database administrator or official Navicat support.

Common problems and solutions for Hadoop Distributed File System (HDFS) configuration under CentOS When building a HadoopHDFS cluster on CentOS, some common misconfigurations may lead to performance degradation, data loss and even the cluster cannot start. This article summarizes these common problems and their solutions to help you avoid these pitfalls and ensure the stability and efficient operation of your HDFS cluster. Rack-aware configuration error: Problem: Rack-aware information is not configured correctly, resulting in uneven distribution of data block replicas and increasing network load. Solution: Double check the rack-aware configuration in the hdfs-site.xml file and use hdfsdfsadmin-printTopo

Redis memory soaring includes: too large data volume, improper data structure selection, configuration problems (such as maxmemory settings too small), and memory leaks. Solutions include: deletion of expired data, use compression technology, selecting appropriate structures, adjusting configuration parameters, checking for memory leaks in the code, and regularly monitoring memory usage.

VS Code can run on Windows 8, but the experience may not be great. First make sure the system has been updated to the latest patch, then download the VS Code installation package that matches the system architecture and install it as prompted. After installation, be aware that some extensions may be incompatible with Windows 8 and need to look for alternative extensions or use newer Windows systems in a virtual machine. Install the necessary extensions to check whether they work properly. Although VS Code is feasible on Windows 8, it is recommended to upgrade to a newer Windows system for a better development experience and security.

VS Code can be used to write Python and provides many features that make it an ideal tool for developing Python applications. It allows users to: install Python extensions to get functions such as code completion, syntax highlighting, and debugging. Use the debugger to track code step by step, find and fix errors. Integrate Git for version control. Use code formatting tools to maintain code consistency. Use the Linting tool to spot potential problems ahead of time.

Permissions issues and solutions for MinIO installation under CentOS system When deploying MinIO in CentOS environment, permission issues are common problems. This article will introduce several common permission problems and their solutions to help you complete the installation and configuration of MinIO smoothly. Modify the default account and password: You can modify the default username and password by setting the environment variables MINIO_ROOT_USER and MINIO_ROOT_PASSWORD. After modification, restarting the MinIO service will take effect. Configure bucket access permissions: Setting the bucket to public will cause the directory to be traversed, which poses a security risk. It is recommended to customize the bucket access policy. You can use MinIO

Redis memory fragmentation refers to the existence of small free areas in the allocated memory that cannot be reassigned. Coping strategies include: Restart Redis: completely clear the memory, but interrupt service. Optimize data structures: Use a structure that is more suitable for Redis to reduce the number of memory allocations and releases. Adjust configuration parameters: Use the policy to eliminate the least recently used key-value pairs. Use persistence mechanism: Back up data regularly and restart Redis to clean up fragments. Monitor memory usage: Discover problems in a timely manner and take measures.
