
Use cheerio to make a simple web crawler in Node.js (detailed tutorial)

亚连
Release: 2018-06-02 14:30:03

This article walks through a simple web crawler built in Node.js with cheerio, shared here for reference. The goals and the tool used are as follows:

1. Goal

  1. Fetch the title information from a website

  2. Write the fetched information to a new file

  3. Tool: cheerio, installed through npm with npm install cheerio

  4. The cheerio API is used in basically the same way as the jQuery API

  5. If you are already proficient with jQuery, you will pick up cheerio quickly

2. Code part

Introduction: Fetch the list of titles from the SegmentFault home page, number each title, and write the result to the pageTitle.txt file

const https = require('https');
const fs = require('fs');
const cheerio = require('cheerio');
const url = 'https://segmentfault.com/';

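https.get(url, (res) => {
  let html = '';
  // Decode chunks as UTF-8 so multi-byte characters split across chunks stay intact
  res.setEncoding('utf8');
  res.on('data', (data) => {
    html += data;
  });
  res.on('end', () => {
    getPageTitle(html);
  });
}).on('error', (err) => {
  console.log('Failed to fetch the page', err);
});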

function getPageTitle(html) {
  const $ = cheerio.load(html);
  // Selector for the title elements; it depends on the page's current markup
  const chapters = $('.news__item-title');
  const data = [];
  const fileName = 'pageTitle.txt';
  for (let i = 0; i < chapters.length; i++) {
    const chapterTitle = $(chapters[i]).find('a').text().trim();
    data.push(`${i + 1}, ${chapterTitle}`);
  }
  // Join the lines into one string; recent Node.js versions reject a plain array as data
  fs.writeFile(fileName, data.join('\n') + '\n', 'utf8', (err) => {
    if (err) {
      console.log('Failed to create the new file', err);
      return;
    }
    console.log(`Titles were written to the new file ${fileName}`);
  });
}
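
As a possible variation on the code above, the fetching and formatting steps can be pulled into small helpers. This is only a sketch under stated assumptions: fetchHtml and formatTitles are hypothetical names that do not appear in the original code, and the sketch does not follow HTTP redirects.

```javascript
const http = require('http');
const https = require('https');

// Hypothetical helper: fetch a URL and resolve with the response body as text.
// Rejects on network errors and on any status code other than 200.
function fetchHtml(url) {
  return new Promise((resolve, reject) => {
    // Pick the client module that matches the URL scheme
    const client = url.startsWith('https:') ? https : http;
    client.get(url, (res) => {
      if (res.statusCode !== 200) {
        res.resume(); // discard the body so the socket is released
        reject(new Error(`Unexpected status code: ${res.statusCode}`));
        return;
      }
      res.setEncoding('utf8');
      let html = '';
      res.on('data', (chunk) => { html += chunk; });
      res.on('end', () => resolve(html));
    }).on('error', reject);
  });
}

// Hypothetical helper: turn an array of titles into numbered lines,
// one per line, in the same "index, title" shape used by the tutorial.
function formatTitles(titles) {
  return titles.map((title, i) => `${i + 1}, ${title}`).join('\n') + '\n';
}
```

With these in place, the main flow could read fetchHtml(url).then((html) => ...), extract the titles with cheerio as before, and pass formatTitles(titles) to fs.writeFile. Keeping the formatting separate from the network call also makes it easy to check without touching the network.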

That is everything I have compiled on this topic. I hope you find it helpful.

source:php.cn