Use Selenium to capture Taobao product data

Jun 07, 2018 pm 03:20 PM
selenium crawl Taobao

This article shares an example of using Selenium to scrape Taobao product information. I hope it serves as a useful reference.

Taobao pages load much of their data with JavaScript, so they are easier to crawl with Selenium, which drives a real browser and sees the rendered page. Selenium is a testing tool; for crawling it is typically paired with the headless (windowless) browser PhantomJS.
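
One caveat: `webdriver.PhantomJS` was removed in Selenium 4, so the script in this article needs an older Selenium release to run as written. As a possible substitute, here is a minimal sketch of a comparable headless Chrome setup. The helper names `headless_chrome_args` and `make_browser` are hypothetical, the sketch assumes Chrome and a matching driver are installed, and it does not reproduce the PhantomJS `--load-images` and `--disk-cache` options.

```python
def headless_chrome_args(width=1400, height=900):
    """Build Chrome command-line switches for a windowless session."""
    return [
        '--headless',
        '--window-size={},{}'.format(width, height),
    ]


def make_browser():
    """Start headless Chrome (assumes Chrome and its driver are installed)."""
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in headless_chrome_args():
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

With this in place, `browser = make_browser()` would stand in for the `webdriver.PhantomJS(...)` call, and the rest of the script could stay the same.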

import re
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from pyquery import PyQuery as pq
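'''
wait.until() is Selenium's explicit wait. `wait` is a WebDriverWait object
configured with a maximum wait time. It re-checks the condition at a fixed
polling interval: as soon as the condition holds, execution moves on to the
next step; otherwise it keeps waiting until the maximum time is exceeded and
then raises TimeoutException.
1. presence_of_element_located: the element is present in the DOM; takes a locator tuple such as (By.ID, 'p')
2. element_to_be_clickable: the element can be clicked
3. text_to_be_present_in_element: the element's text contains the given string
'''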
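# Create a headless (windowless) browser
browser = webdriver.PhantomJS(
    service_args=[
        '--load-images=false',
        '--disk-cache=true'])
# Give up on a wait after 10 s without the expected condition
wait = WebDriverWait(browser, 10)
# Even without a visible window, a window size must be set
browser.set_window_size(1400, 900)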

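def search():
    '''
    Run a search from the home page. Swap the selectors to reuse this
    function on other sites.
    :return: text of the element that shows the total number of pages
    '''
    print('Searching')
    try:
        # Open the page
        browser.get('https://www.taobao.com')
        # Search box on the Taobao home page
        search_box = wait.until(
            EC.presence_of_element_located((By.CSS_SELECTOR, '#q'))
        )
        # Search button
        submit = wait.until(EC.element_to_be_clickable(
            (By.CSS_SELECTOR, '#J_TSearchForm > div.search-button > button')))
        # Type the search keyword ('面条' means noodles) into the box
        search_box.send_keys('面条')
        # Click the search button
        submit.click()
        # Element that shows how many result pages there are
        total = wait.until(EC.presence_of_element_located(
            (By.CSS_SELECTOR, '#mainsrp-pager > div > div > div > div.total')))
        # Scrape the products on the first page
        get_products()
        # Return the total-pages text
        return total.text
    except TimeoutException:
        # Timed out: start the search again
        return search()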

def next_page(page_number):
    '''
    Jump to the given results page and scrape it.
    :param page_number: page to jump to
    :return:
    '''
    print('Turning to page', page_number)
    try:
        # Input box for the page number to jump to
        page_box = wait.until(EC.presence_of_element_located(
            (By.CSS_SELECTOR, '#mainsrp-pager > div > div > div > div.form > input')))
        # Confirm button for the jump
        submit = wait.until(
            EC.element_to_be_clickable(
                (By.CSS_SELECTOR,
                 '#mainsrp-pager > div > div > div > div.form > span.J_Submit')))
        # Clear the number already in the box
        page_box.clear()
        # Enter the new page number
        page_box.send_keys(str(page_number))
        # Click to jump
        submit.click()
        # Check that the highlighted page is the one we asked for
        wait.until(
            EC.text_to_be_present_in_element(
                (By.CSS_SELECTOR,
                 '#mainsrp-pager > div > div > div > ul > li.item.active > span'),
                str(page_number)))
        # Scrape the products on this page
        get_products()
    # On timeout, retry the same page
    except TimeoutException:
        next_page(page_number)

def get_products():
    '''
    Once the results page has loaded, extract the fields we need.
    :return:
    '''
    # Wait until the product items are present before reading the page source
    wait.until(EC.presence_of_element_located(
        (By.CSS_SELECTOR, '#mainsrp-itemlist .items .item')))
    # Source of the whole rendered page
    html = browser.page_source
    # Parse the page source with pyquery
    doc = pq(html)
    items = doc('#mainsrp-itemlist .items .item').items()
    for item in items:
        # print(item)
        product = {
            'image': item.find('.pic .img').attr('src'),
            'price': item.find('.price').text(),
            # Drop the trailing 3 characters that follow the sales count
            'deal': item.find('.deal-cnt').text()[:-3],
            'title': item.find('.title').text(),
            'shop': item.find('.shop').text(),
            'location': item.find('.location').text()
        }
        print(product)

def main():
    try:
        # Step 1: run the search
        total = search()
        # Extract the page count as an int; it bounds the loop below
        total = int(re.search(r'(\d+)', total).group(1))
        # Keep turning pages until the last one
        for i in range(2, total + 1):
            next_page(i)
    except Exception:
        print('Something went wrong')
    finally:
        # Shut down the browser and end the driver session
        browser.quit()

if __name__ == '__main__':
    main()
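
The explicit wait described in the script's docstring can be illustrated without a browser. The following is a minimal sketch of the polling idea only, not Selenium's actual implementation; `wait_until` and `WaitTimeout` are hypothetical names standing in for `WebDriverWait.until` and `TimeoutException`.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (stand-in for TimeoutException)."""


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() every `poll` seconds until it returns a truthy value."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            # Condition met: hand back whatever it returned (e.g. an element).
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout('condition not met within {} s'.format(timeout))
        time.sleep(poll)


# Usage: a fake "element" that only shows up on the third check.
checks = []


def element_appears():
    checks.append(1)
    return 'element' if len(checks) >= 3 else None


found = wait_until(element_appears, timeout=2.0, poll=0.01)
print(found, len(checks))  # prints: element 3
```

Returning the condition's result is what lets the script write `submit = wait.until(...)` and then call `submit.click()` on the element it gets back.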
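
Two small text-processing steps in the script are easy to check in isolation: `main()` pulls the page count out of the pager text with a regular expression, and `get_products()` trims the sales figure with a `[:-3]` slice. The sketch below isolates both. The function names are hypothetical, and the sample strings are assumptions about what the page text looks like, not values captured from a live page.

```python
import re


def parse_total_pages(text):
    """Return the first run of digits in the pager text as an int."""
    match = re.search(r'(\d+)', text)
    if match is None:
        raise ValueError('no page count found in {!r}'.format(text))
    return int(match.group(1))


def parse_deal_count(text, suffix_len=3):
    """Drop a fixed-length suffix, as item.find('.deal-cnt').text()[:-3] does."""
    return text[:-suffix_len]


# Sample strings are assumptions, not captured from a live page.
print(parse_total_pages('共 100 页，'))   # prints: 100
print(parse_deal_count('1234人付款'))     # prints: 1234
```

The fixed-length slice only works while the suffix stays exactly three characters long; applying the same `\d+` pattern to the sales text would be a more tolerant choice.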

That is everything I have put together on this topic; I hope it proves useful.
