Detailed explanation of how to use Golang to crawl Bing wallpapers
Python is the obvious choice for writing a crawler, and the requests library alone handles almost everything. However, I heard that the http package built into Golang is very powerful. I had some free time, so I wanted to learn something new and review how requests and responses work in the HTTP protocol. Without further ado, let's get started by crawling Bing Wallpaper as a first try. (Only joking about Python, please go easy on me.)
Overview of crawler process
Request data --> Parse data --> Store data
As the flow chart above shows, a crawler is not actually complicated: the whole process has only three steps. Let's look at what each step involves.
- Request data: use Golang's built-in http package to send a request to the target address.
- Parse data: extract the specific key data we need from the response, since we do not need the whole response. This step is also called data cleaning.
- Store data: save the parsed data, normally to a database. In this article the result is saved as image files on disk.
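The request step can be tried in isolation before touching a real site. The following is a minimal sketch, not the article's final code: the names fetch and demo, the User-Agent string demo-crawler/0.1, and the 10-second timeout are assumptions made for illustration, and a local httptest server stands in for the target site so the example runs without network access.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetch sends a GET request with the given User-Agent and returns the
// response body. Any status other than 200 is reported as an error.
func fetch(client *http.Client, url, userAgent string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return "", fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("User-Agent", userAgent)

	resp, err := client.Do(req)
	if err != nil {
		return "", fmt.Errorf("send request: %w", err)
	}
	// Close the body on every return path below.
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", fmt.Errorf("read body: %w", err)
	}
	return string(data), nil
}

// demo starts a local test server that echoes back the User-Agent header
// it receives, then fetches from it. The server is a stand-in for a real site.
func demo() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.Header.Get("User-Agent"))
	}))
	defer srv.Close()

	// A timeout keeps a slow server from blocking the crawler forever (assumed value).
	client := &http.Client{Timeout: 10 * time.Second}
	return fetch(client, srv.URL, "demo-crawler/0.1")
}

func main() {
	body, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("server saw User-Agent:", body)
}
```

Echoing the header back makes it easy to confirm that the request really carried the User-Agent that was set, which is the part of the request a site is most likely to inspect.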
Practical Analysis
First, open the Bing Wallpaper site and take a look. To write a crawler you need to be particularly sensitive to data. This is the home page, and the whole page is very simple.
Next, open the browser's developer tools (you should already be familiar with them; if not, the rest will be hard to follow). Normally you press F12 or right-click and choose Inspect. On the Bing Wallpaper site, however, right-clicking does not bring up the console, so it has to be opened from the menu: in Chrome, select More tools, then Developer tools. The steps are the same if your Chrome is in Chinese.
You will most likely see errors on the page at this point. Don't worry: they are just anti-crawling measures on the Bing Wallpaper site (they did not exist when I crawled it a long time ago), and they do not affect what we are doing.
Next, use the element picker to quickly locate the element we want. From there we can find the image information we need.
Hands-on code
The following code crawls a single page:
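    package main

    import (
        "fmt"
        "io"
        "log"
        "net/http"
        "os"
        "time"

        "github.com/PuerkitoBio/goquery"
    )

    // userAgent makes the requests look like they come from a desktop browser.
    const userAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36"

    // Run requests one listing page, extracts each wallpaper's download link
    // and title, and downloads the images one by one.
    func Run(method, url string, body io.Reader, client *http.Client) {
        req, err := http.NewRequest(method, url, body)
        if err != nil {
            log.Println("failed to create request:", err)
            return
        }
        req.Header.Set("User-Agent", userAgent)
        resp, err := client.Do(req)
        if err != nil {
            log.Println("failed to send request:", err)
            return
        }
        defer resp.Body.Close() // close the response body on every return path
        if resp.StatusCode != http.StatusOK {
            log.Printf("request failed, status code: %d", resp.StatusCode)
            return
        }
        query, err := goquery.NewDocumentFromReader(resp.Body)
        if err != nil {
            log.Println("failed to build goquery document:", err)
            return
        }
        query.Find(".container .item").Each(func(i int, s *goquery.Selection) {
            href, ok := s.Find("a.ctrl.download").Attr("href")
            if !ok {
                log.Println("no download link found for item", i)
                return
            }
            // Resolve the link against the page URL; this works whether
            // href is relative or already absolute.
            imgUrl, err := resp.Request.URL.Parse(href)
            if err != nil {
                log.Println("invalid download link:", err)
                return
            }
            imgName := s.Find(".description>h3").Text()
            fmt.Println(imgUrl)
            fmt.Println(imgName)
            DownloadImage(imgUrl.String(), i, client)
            time.Sleep(time.Second) // pause between downloads
            fmt.Println("-------------------------")
        })
    }

    // DownloadImage fetches one image and writes it to ./image/image-<index>.jpg.
    func DownloadImage(url string, index int, client *http.Client) {
        // Downloading is a plain GET, not a POST.
        req, err := http.NewRequest(http.MethodGet, url, nil)
        if err != nil {
            log.Println("failed to create request:", err)
            return
        }
        req.Header.Set("User-Agent", userAgent)
        resp, err := client.Do(req)
        if err != nil {
            log.Println("failed to send request:", err)
            return
        }
        defer resp.Body.Close()
        if resp.StatusCode != http.StatusOK {
            log.Printf("download failed, status code: %d", resp.StatusCode)
            return
        }
        data, err := io.ReadAll(resp.Body)
        if err != nil {
            log.Println("failed to read response body:", err)
            return
        }
        // Make sure the output directory exists before creating the file.
        if err := os.MkdirAll("./image", 0755); err != nil {
            log.Println("failed to create directory:", err)
            return
        }
        baseDir := "./image/image-%d.jpg"
        f, err := os.OpenFile(fmt.Sprintf(baseDir, index), os.O_CREATE|os.O_TRUNC|os.O_WRONLY, 0666)
        if err != nil {
            log.Println("failed to open file:", err)
            return
        }
        defer f.Close()
        if _, err = f.Write(data); err != nil {
            log.Println("failed to write data:", err)
            return
        }
        fmt.Println("image downloaded successfully")
    }

    func main() {
        client := &http.Client{}
        url := "https://bing.ioliu.cn/?p=%d"
        method := "GET"
        // Fill in the page number; here we crawl only page 1.
        Run(method, fmt.Sprintf(url, 1), nil, client)
    }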
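DownloadImage above reads the whole image into memory before writing it out. For large files, one alternative is to stream the response straight to disk with io.Copy. The sketch below is only an illustration of that idea: saveTo and demo are hypothetical names, the 1 KiB payload is placeholder data, and a local httptest server replaces the real site so the example runs offline.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
)

// saveTo streams the response body for url into the file at path and
// returns the number of bytes written.
func saveTo(client *http.Client, url, path string) (int64, error) {
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("unexpected status: %s", resp.Status)
	}

	// Create the target directory if it does not exist yet.
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return 0, err
	}
	f, err := os.Create(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	// io.Copy moves the data in chunks instead of holding the whole image in memory.
	return io.Copy(f, resp.Body)
}

// demo serves 1 KiB of placeholder bytes from a local test server and
// saves them under a temporary directory that is removed afterwards.
func demo() (int64, error) {
	payload := make([]byte, 1024)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write(payload)
	}))
	defer srv.Close()

	dir, err := os.MkdirTemp("", "wallpaper-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	return saveTo(srv.Client(), srv.URL, filepath.Join(dir, "image", "image-0.jpg"))
}

func main() {
	n, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("bytes written:", n)
}
```

For wallpapers of a few megabytes either approach works; streaming mainly matters when files are large or many downloads run at once.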
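The following code crawls multiple pages
The code for crawling multiple pages has not changed much. We still need to observe how the website behaves first.
What do we find? The first page is p=1, the second page is p=2, and the tenth page is p=10.
So we just add a for loop and reuse the single-page code from before. Note that DownloadImage names each file by the item's index within its page, so images from later pages overwrite those from earlier pages unless the page number is also included in the file name.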
    // The main function for crawling multiple pages is as follows.
    func main() {
        client := &http.Client{}
        url := "https://bing.ioliu.cn/?p=%d"
        method := "GET"
        for i := 1; i < 5; i++ { // pagination: crawls pages 1 through 4
            Run(method, fmt.Sprintf(url, i), nil, client)
        }
    }
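The pagination pattern can be pulled into small helpers and checked without making any requests. This is a sketch under assumptions: pageURLs and imageFileName are hypothetical helpers that are not part of the code above, and the image-p<page>-<index>.jpg naming scheme is just one way to keep files from different pages apart.

```go
package main

import "fmt"

// pageURLs builds the listing URLs for pages first..last (inclusive).
// The pattern must contain a single %d verb for the page number.
func pageURLs(pattern string, first, last int) []string {
	if last < first {
		return nil
	}
	urls := make([]string, 0, last-first+1)
	for p := first; p <= last; p++ {
		urls = append(urls, fmt.Sprintf(pattern, p))
	}
	return urls
}

// imageFileName combines the page number and the item index so that
// images from different pages do not share a file name.
func imageFileName(page, index int) string {
	return fmt.Sprintf("./image/image-p%d-%d.jpg", page, index)
}

func main() {
	for _, u := range pageURLs("https://bing.ioliu.cn/?p=%d", 1, 4) {
		fmt.Println(u)
	}
	fmt.Println(imageFileName(2, 0))
}
```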
Summary
In our example we used a third-party package to parse the web page, because doing it with regular expressions is too much trouble. The main options are:
- CSS selectors: goquery
- XPath selectors: htmlquery
- Regular expressions: the built-in regexp package. Not recommended, because patterns that match HTML reliably are hard to write.
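To see why regular expressions are awkward for HTML, here is a small sketch using only the built-in regexp package. The markup in samplePage is hypothetical (it only imitates the a.ctrl.download links selected earlier), and extractLinks and downloadLink are illustrative names. The pattern expects class to come before href, so it silently misses the second link, where the attributes are in the opposite order.

```go
package main

import (
	"fmt"
	"regexp"
)

// samplePage is hypothetical markup: two download links that differ only
// in the order of their attributes.
const samplePage = `
<div class="item"><a class="ctrl download" href="/photo/a?force=download">download</a></div>
<div class="item"><a href="/photo/b?force=download" class="ctrl download">download</a></div>`

// downloadLink matches an <a> tag whose class attribute is exactly
// "ctrl download" and comes immediately before href, capturing the href value.
var downloadLink = regexp.MustCompile(`<a\s+class="ctrl download"\s+href="([^"]+)"`)

// extractLinks returns the href of every tag that matches downloadLink.
func extractLinks(html string) []string {
	var links []string
	for _, m := range downloadLink.FindAllStringSubmatch(html, -1) {
		links = append(links, m[1])
	}
	return links
}

func main() {
	// Only the first link is found; the second has its attributes reversed.
	fmt.Println(extractLinks(samplePage))
}
```

A CSS selector such as a.ctrl.download does not depend on attribute order, which is one reason a selector-based parser is the more comfortable choice here.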