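Example: scraping Douban images with Python and saving them automatically
Environment: Python 2.7.6 with BS4 (Beautiful Soup 4). The script runs in PowerShell or at the command line. Make sure the BS module is installed first.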
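A quick way to confirm that the module can be imported before running the scraper is sketched below. `module_available` is a hypothetical helper name introduced here for illustration, not part of bs4, and the install hint assumes pip is available.

```python
def module_available(name):
    """Return True if the module `name` can be imported in this interpreter."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


if __name__ == '__main__':
    if module_available('bs4'):
        print('bs4 is installed')
    else:
        # Assumption: pip is on the PATH; the package name on PyPI is beautifulsoup4.
        print('bs4 is missing; try: pip install beautifulsoup4')
```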
The full script is as follows:
# -*- coding:utf8 -*-
# 2013.12.36 19:41 wnlo-c209
# Scrape the images from dbmeizi.com.
from bs4 import BeautifulSoup
import os, sys, urllib2

# Create the output folder (I only learned how to do this yesterday)
path = os.getcwd()  # current working directory; the folder is created here
new_path = os.path.join(path, u'豆瓣妹子')  # folder name means "Douban girls"
if not os.path.isdir(new_path):
    os.mkdir(new_path)


def page_loop(page=0):
    url = 'http://www.dbmeizi.com/?p=%s' % page
    content = urllib2.urlopen(url)
    soup = BeautifulSoup(content)
    my_girl = soup.find_all('img')
    # End-of-crawl check; not very elegant....
    if my_girl == []:
        print u'All pages have been scraped'
        sys.exit(0)
    print u'Start scraping'
    for girl in my_girl:
        link = girl.get('src')
        flink = 'http://www.dbmeizi.com/' + link
        print flink
        content2 = urllib2.urlopen(flink).read()
        # Save into the folder created above; the last 11 characters
        # of the URL serve as the file name (a trick picked up on OSC)
        with open(os.path.join(new_path, flink[-11:]), 'wb') as code:
            code.write(content2)
    page = int(page) + 1
    print u'Moving on to the next page'
    print 'the %s page' % page
    page_loop(page)


page_loop()
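Two spots in the script above are fragile. The image address is built by plain string concatenation, which would produce a malformed URL if a `src` were already absolute. And `page_loop` calls itself once per page, so a long crawl could hit Python's default recursion limit of roughly 1000 frames. The sketch below shows one possible alternative, not the author's method. `image_url`, `local_path` and `iter_pages` are hypothetical helper names, and `fetch_srcs` is an assumed callable that maps a page number to a list of `src` values. It also uses the last path segment of the URL as the file name instead of the last 11 characters.

```python
import os
import posixpath

try:  # Python 3
    from urllib.parse import urljoin, urlsplit
except ImportError:  # Python 2.7, the version the article targets
    from urlparse import urljoin, urlsplit

BASE_URL = 'http://www.dbmeizi.com/'


def image_url(src, base=BASE_URL):
    """Resolve an <img> src against the site root (hypothetical helper).

    Relative paths are joined to the base; absolute URLs pass through unchanged.
    """
    return urljoin(base, src)


def local_path(folder, url):
    """Build the save path from the last segment of the URL path."""
    return os.path.join(folder, posixpath.basename(urlsplit(url).path))


def iter_pages(fetch_srcs, start=0):
    """Yield (page, src) pairs page by page until a page has no images.

    fetch_srcs is an assumed callable: page number -> list of src strings.
    A plain loop replaces the recursive call, so call depth stays constant.
    """
    page = start
    while True:
        srcs = fetch_srcs(page)
        if not srcs:
            return
        for src in srcs:
            yield page, src
        page += 1


if __name__ == '__main__':
    # Stand-in for the network: two pages with images, then an empty page.
    fake_site = {
        0: ['pics/a.jpg', '/pics/b.jpg'],
        1: ['http://img.example.com/c.jpg'],
    }
    for page, src in iter_pages(lambda p: fake_site.get(p, [])):
        url = image_url(src)
        print(page, url, local_path('out', url))
```

In a real crawl, `fetch_srcs` would wrap the `urlopen` and `find_all('img')` calls from the script, and each yielded address would be downloaded and written to `local_path(new_path, url)`.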
