This article walks through a simple web crawler example built with Node.js and cheerio, shared here as a reference. It covers the following:
1. Goal
Obtain the title information from a website
Output the obtained information to a new file
Tool: cheerio, installed through npm with npm install cheerio
cheerio's API is used in basically the same way as jQuery's
If you are proficient with jQuery, you will get started with cheerio quickly
2. Code part
Introduction: Get the list titles from the SegmentFault home page, number them, and finally write them to the pageTitle.txt file
const https = require('https');
const fs = require('fs');
const cheerio = require('cheerio');

const url = 'https://segmentfault.com/';

// Request the page and accumulate the response body chunk by chunk.
https.get(url, (res) => {
  let html = '';
  res.on('data', (data) => {
    html += data;
  });
  res.on('end', () => {
    getPageTitle(html);
  });
}).on('error', () => {
  console.log('Failed to fetch the web page');
});

// Extract the list titles, number them, and write them to a file.
function getPageTitle(html) {
  const $ = cheerio.load(html);
  const chapters = $('.news__item-title');
  const data = [];
  const fileName = 'pageTitle.txt';

  for (let i = 0; i < chapters.length; i++) {
    const chapterTitle = $(chapters[i]).find('a').text().trim();
    data.push(`\n${i + 1}, ${chapterTitle}`);
  }

  // Join the lines into one string; fs.writeFile expects a string or buffer, not an array.
  fs.writeFile(fileName, data.join(''), 'utf8', (err) => {
    if (err) {
      console.log('Failed to create the new file', err);
      return; // do not report success after a failed write
    }
    console.log(`The titles were written to ${fileName}`);
  });
}
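The numbering and file-writing step can also be separated from the network request, which makes it easier to check on its own. The following is only a sketch under that assumption: `formatTitles`, `saveTitles`, the demo file name, and the sample titles are hypothetical and do not come from the original code. Unlike the code above, it joins lines with a newline instead of prefixing each one.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: turn an array of titles into numbered lines.
function formatTitles(titles) {
  return titles.map((title, i) => `${i + 1}, ${title}`).join('\n');
}

// Hypothetical helper: write the numbered list to a file and return the text.
function saveTitles(fileName, titles) {
  const text = formatTitles(titles);
  fs.writeFileSync(fileName, text, 'utf8');
  return text;
}

// Usage with made-up titles, written to the system temp directory.
const demoFile = path.join(os.tmpdir(), 'pageTitle-demo.txt');
const written = saveTitles(demoFile, ['First title', 'Second title']);
console.log(written);
```

The synchronous write keeps the sketch short; in a crawler that handles many pages, the asynchronous fs.writeFile used above is usually the better fit.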
That is the example I put together. I hope you find it helpful.