Python implements a simple picture text recognition script

Table of Contents

Screenshot

Text recognition

Accessing the clipboard

Summary


零到壹度

Apr 04, 2018 pm 01:59 PM


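As we all know, some e-books are distributed as scanned images, so you cannot select any text while reading. For anyone in the habit of taking notes, not being able to copy and paste is really annoying! To solve this, we need a script with the following features:

1. Take a free-form (region) screenshot
2. Recognize the text contained in the screenshot
3. Send the recognized text to the clipboard

The requirements are clear, so let's go through them one at a time~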

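Screenshot

Screenshots are so useful that there are naturally many ways to implement them. Here we want to do it in Python, so the first step is to Google it, and sure enough there is a huge amount of material online~
As expected, Python has solid basic support for screenshots~ (This article targets Python 2 on Windows; installing some of these libraries under Python 3 is a real pain.)
(1) Full-screen screenshot
Let's start with the simple case (the screenshot is the slightly troublesome part, everything else is super easy = =): a full-screen screenshot in Python.
The code is as follows: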

from PIL import ImageGrab

im = ImageGrab.grab()  # capture the full screen
im.save(path)  # path: full path of the image file to write

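Three simple lines and it's done~ (praise to those who came before us _(:з)∠)_)
The argument passed to im.save() is the full path where the screenshot file will be stored.
One thing to watch out for: when installing the library, use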

pip install pillow    # not PIL

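Otherwise pip reports that no matching module can be found~
(PS: There is actually a problem here. After the code above runs, it does not capture the whole screen; the resulting image contains only part of it. I looked through guides online and did not find a good fix, sadly...)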
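One possible explanation for the partial capture, offered here as an assumption rather than a confirmed diagnosis of the author's machine, is Windows display scaling: when scaling is above 100% and the process is not DPI-aware, Windows can report a smaller virtual screen size than the real resolution. A minimal sketch of opting in to DPI awareness before calling ImageGrab.grab() might look like this. The helper name enable_dpi_awareness is hypothetical; SetProcessDPIAware is a function in the Win32 user32 library.

```python
import ctypes
import sys


def enable_dpi_awareness():
    """Try to mark this process as DPI-aware (Windows only).

    Returns True if the Win32 call was made, False otherwise.
    """
    if sys.platform != "win32":
        return False  # nothing to do on other platforms
    try:
        # Win32 user32 call; after it, screen metrics are reported in real pixels
        ctypes.windll.user32.SetProcessDPIAware()
        return True
    except (AttributeError, OSError):
        return False


# Call this once, before ImageGrab.grab()
enable_dpi_awareness()
```

If the capture is still cropped after this, the cause lies elsewhere and this sketch does not apply.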

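(2) Free-form screenshot
The full-screen screenshot is done, but what we really want is to capture only the part we need. So how do we do that?
Looking at approaches online, the most common one is to listen for mouse actions to select the capture region. The tools used most often are tkinter and pyHook (the tkinter version is a bit more complex). I personally prefer the latter because it is very simple to implement, haha~
Here is part of the code first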

# coding:utf-8
import win32api
import os
from PIL import ImageGrab, Image
import pyHook
import pythoncom

# Coordinates of the selected region: [x1, y1, x2, y2]
coordinate = [1, 1, 1, 1]


# Mouse event handler
def on_mouse_event(event):
    file_path = 'xx//xx//read.jpg'
    # Left button pressed: record the start corner
    if event.MessageName == 'mouse left down':
        coordinate[0:2] = event.Position
    # Left button released: record the end corner and capture
    elif event.MessageName == 'mouse left up':
        coordinate[2:4] = event.Position
        win32api.PostQuitMessage()  # leave the message loop
        # Capture the selected region
        pic = ImageGrab.grab(coordinate)
        pic.save(file_path)
    return True  # let the event continue to other handlers
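The handler stores the press position in the first two slots and the release position in the last two, so it implicitly assumes the drag goes from top-left to bottom-right. Dragging in any other direction yields a box whose right or bottom edge comes before its left or top edge. A small helper can reorder the corners before they are handed to ImageGrab.grab(). This is only a sketch: the names normalize_bbox and is_empty are hypothetical and not part of PIL or pyHook.

```python
def normalize_bbox(start, end):
    """Return (left, top, right, bottom) for two opposite corners in any order."""
    (x1, y1), (x2, y2) = start, end
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


def is_empty(bbox):
    """True when the box has no area, e.g. a plain click without dragging."""
    left, top, right, bottom = bbox
    return right <= left or bottom <= top


# Dragging from bottom-right (400, 300) up to top-left (100, 50)
print(normalize_bbox((400, 300), (100, 50)))  # (100, 50, 400, 300)
```

In the handler this would be called with coordinate[0:2] and coordinate[2:4], and the capture skipped when is_empty() returns True.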

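The only real hassle is installing the various libraries, and pywin32 deserves a special mention here = =, it is truly a pain~
Here is a link in case you run into problems during installation:
Fixing the "module not found" error that persists after installing pywin32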

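Text recognition

With the screenshot done, the remaining work is fairly simple. Python's pytesseract library provides good support for text recognition. The whole thing needs only one key line of code: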

from PIL import Image
import pytesseract

# file_path is the screenshot saved in the previous step
text = pytesseract.image_to_string(Image.open(file_path), lang='chi_sim')
print(text)

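Before using this library you must install the recognition engine tesseract-ocr. The download link is below (the download is an .exe installer):
tesseract-ocr recognition engine download
Here is a tutorial on installing it and configuring the environment variables (taken from Baidu Baike):
Image text OCR recognition: installing and using tesseract-ocr 4.00.00
Finally, configure the pytesseract library itself. Go to F:\XX\XX\XX\<your Python install path>\Lib\site-packages\pytesseract
Open the pytesseract.py file in that directory and find this line: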

tesseract_cmd = 'tesseract'

Replace the string 'tesseract' with the path to your tesseract-ocr executable (e.g. 'F:\Program_File\Tesseract-OCR\tesseract.exe')

At this point the text recognition engine is fully configured.
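An edit made inside site-packages is lost whenever pytesseract is reinstalled or upgraded. As an alternative, the same variable can be set from your own script at run time through pytesseract.pytesseract.tesseract_cmd. The sketch below assumes the example install path from the text; pick_tesseract_cmd is a hypothetical helper, not part of pytesseract.

```python
import os


def pick_tesseract_cmd(candidates, default="tesseract"):
    """Return the first candidate path that exists as a file.

    Falls back to `default`, which relies on tesseract being on PATH.
    """
    for path in candidates:
        if os.path.isfile(path):
            return path
    return default


# The candidate is the example path from the article; adjust it to your install
cmd = pick_tesseract_cmd([r"F:\Program_File\Tesseract-OCR\tesseract.exe"])

try:
    import pytesseract
except ImportError:
    pytesseract = None  # pytesseract is not installed in this environment

if pytesseract is not None:
    # Same variable as the one in pytesseract.py, set without touching the library file
    pytesseract.pytesseract.tesseract_cmd = cmd
```

Keeping the path in the script also means the setting travels with the code when it is copied to another machine.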

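Accessing the clipboard

Finally, put the recognized text on the clipboard.
Two steps:
(1) Install the pyperclip library with pip
(2) Again, one key line of code: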

import pyperclip

pyperclip.copy(text)  # copy the recognized text to the system clipboard

All done~

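Summary

The whole implementation is very concise, only a few dozen lines of code in total, thanks to Python's powerful library support.
The one regret is that the screenshot feature is quite crude. With tkinter you can achieve an effect similar to QQ's screenshot tool (the code is somewhat more complex)~
With this script, reading scanned-image PDF e-books no longer means typing out your notes by hand~ :)
Finally, here is the complete code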

# coding:utf-8
import inspect
import win32api
import os
from PIL import ImageGrab, Image
import pyHook  # mouse hook
import pythoncom
import pytesseract  # OCR wrapper around tesseract
import pyperclip

# Coordinates of the selected region: [x1, y1, x2, y2]
coordinate = [1, 1, 1, 1]


# Mouse event handler
def on_mouse_event(event):
    # Save the screenshot next to this script
    file_ = inspect.getfile(inspect.currentframe())
    dir_path = os.path.abspath(os.path.dirname(file_))
    file_path = dir_path + '\\read.jpg'
    # Left button pressed: record the start corner
    if event.MessageName == 'mouse left down':
        coordinate[0:2] = event.Position
    # Left button released: record the end corner, then capture and recognize
    elif event.MessageName == 'mouse left up':
        coordinate[2:4] = event.Position
        win32api.PostQuitMessage()  # leave the message loop
        # Capture the selected region
        pic = ImageGrab.grab(coordinate)
        pic.save(file_path)
        text = pytesseract.image_to_string(Image.open(file_path), lang='chi_sim')  # run OCR
        pyperclip.copy(text.replace(' ', ''))  # copy the result to the system clipboard
    return True  # let the event continue to other handlers


if __name__ == '__main__':
    hm = pyHook.HookManager()  # create a hook manager
    hm.MouseAll = on_mouse_event  # handle all mouse events
    hm.HookMouse()  # install the mouse hook
    pythoncom.PumpMessages()  # enter the message loop and keep listening
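The script strips every space from the OCR result with text.replace(' ', ''). That suits pure Chinese text, but it also glues together any English words on the page. A gentler variant, sketched below, removes only the spaces that sit between two CJK characters. The helper name strip_cjk_spaces and the chosen Unicode ranges are assumptions made for illustration, not something the original script uses.

```python
# -*- coding: utf-8 -*-
import re

# One CJK ideograph, CJK punctuation mark, or full-width form
_CJK = u"[\u4e00-\u9fff\u3000-\u303f\uff00-\uffef]"
# Spaces or tabs with a CJK character on both sides
_SPACE_BETWEEN_CJK = re.compile(u"(?<=%s)[ \t]+(?=%s)" % (_CJK, _CJK))


def strip_cjk_spaces(text):
    """Remove spaces between CJK characters, keep spaces around Latin text."""
    return _SPACE_BETWEEN_CJK.sub(u"", text)


print(strip_cjk_spaces(u"你 好 世 界"))
print(strip_cjk_spaces(u"Python 2 和 Python 3"))
```

To try it in the full script, pass strip_cjk_spaces(text) to pyperclip.copy() instead of text.replace(' ', '').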

