Table of Contents
java uses tess4j for image text recognition
1. Introduction
2. Usage process
1.Maven dependencies are introduced into pom.xml
2. Prepare the language library file in the tessdata directory
3. Write the test code for testing
Home Java javaTutorial How to use tess4j to implement image text recognition function in Java?

How to use tess4j to implement image text recognition function in Java?

May 09, 2023 pm 05:49 PM
java tess4j

    java uses tess4j for image text recognition

    1. Introduction

    Tess4J is a Java (JNA) encapsulation of the Tesseract OCR API.
    A long time ago, I needed to do an automatic login and click the button on the unprocessed data on the web page, which required verification of the login verification code, so I used Tess4J, which can recognize some simple words and numbers, etc., and the recognition rate is It seems to be normal, but if you make an error, just change a verification code and try again. You can succeed even if you try a few more times. Now record the previous simple usage process for future reference.

    Tess4J is a Java JNA package for Tesseract OCR API. Enables java to use Tesseract OCR by calling Tess4J's API. Supported formats include TIFF, JPEG, GIF, PNG, BMP, JPEG, PDF. When I first came into contact with this, I was still confused between these two things. To be clear, Tess4J is a jar package that can be used directly by java, and Tesseract OCR is the basis for supporting Tess4J into file text recognition, Tess4JCan be imported directly using Maven.

    2. Usage process

    1.Maven dependencies are introduced into pom.xml

    		<!-- tess4j start -->
    		<dependency>
    		    <groupId>net.sourceforge.tess4j</groupId>
    		    <artifactId>tess4j</artifactId>
    		    <version>5.6.0</version>
    		</dependency>
    		<!-- tess4j end -->
    Copy after login

    2. Prepare the language library file in the tessdata directory

    You need to download the relevant language library files in advance. Here I downloaded two chi_sim.traineddata and eng.traineddata
    Download address: https://codechina.csdn.net/mirrors/tesseract-ocr/tessdata
    After downloading, place it in the directory inside the code

    How to use tess4j to implement image text recognition function in Java?

    3. Write the test code for testing

    Prepare two pictures and place them in the code In the resource directory, it is convenient for the program to read,

    Picture 1

    How to use tess4j to implement image text recognition function in Java?

    Picture 2

    How to use tess4j to implement image text recognition function in Java?

    The two pictures are placed in the resource directory

    How to use tess4j to implement image text recognition function in Java?

    ##The code is as follows:

    package cn.ljhua;
    
    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.io.IOException;
    import java.io.InputStream;
    
    import javax.imageio.ImageIO;
    
    import lombok.extern.slf4j.Slf4j;
    import net.sourceforge.tess4j.ITesseract;
    import net.sourceforge.tess4j.Tesseract;
    import net.sourceforge.tess4j.TesseractException;
    
    /**
     * Tess4jOcr测试示例
     * @author liujh
     */
    @Slf4j
    public class Tess4jOcrTest {
    	
    	public static void main(String[] args) {
    		
    		Tess4jOcrTest test = new Tess4jOcrTest();
    		test.ocrTest();
    		
    	}
    	
    	public void ocrTest() {
    		
    		log.info("ocrTest start....");
    		long startMs = System.currentTimeMillis();
    		
    		 //Tesseract的代码开始---------------------->>>>
    		ITesseract instance = new Tesseract();
    		
    		/**
    		 * 组装接好tessdata目录的路径字符串
    		 */
    		String filePathPre = System.getProperty("user.dir");
        	String dataPath = filePathPre + File.separator + "tessdata";
        	
        	/**
    		 * 设置目录datapath the tessdata path to set
    		 * 否则会报Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.错误
    		 */
        	instance.setDatapath(dataPath);
        	//instance.setLanguage("eng");//默认,可以不写
        	instance.setLanguage("chi_sim");//设置中文识别
    		
        	String imageName = "verifyCode.png";
            try (InputStream inStream = this.getClass().getResourceAsStream("/" + imageName)) {
                
            	BufferedImage bImage = ImageIO.read(inStream);
            	//doOCR也可以传参为File,我这里传的BufferedImage
            	String result = instance.doOCR(bImage);
            	//识别的结果回来可能会带回车,处理掉
            	result = result.replaceAll("\n", "");
            	log.info("图片名:" + imageName +" 识别结果:"+ result);
            	
            } catch (IOException e) {
                log.error(e.getMessage(),e);
            } catch (TesseractException e) {
            	log.error(e.getMessage(),e);
    		}
            
            imageName = "vCode2.jpg";
            try (InputStream inStream = this.getClass().getResourceAsStream("/" + imageName)) {
                
            	BufferedImage bImage = ImageIO.read(inStream);
            	//doOCR也可以传参为File,我这里传的BufferedImage
            	String result = instance.doOCR(bImage);
            	//识别的结果回来可能会带回车,处理掉
            	result = result.replaceAll("\n", "");
            	log.info("图片名:" + imageName +" 识别结果:"+ result);
            	
            } catch (IOException e) {
                log.error(e.getMessage(),e);
            } catch (TesseractException e) {
            	log.error(e.getMessage(),e);
    		}
    		//Tesseract的代码结束--------------------->>>>
            
    		log.info("ocrTest success. spend time :{} ms.", (System.currentTimeMillis() - startMs));
    		
    	}
    }
    Copy after login

    Screenshots of the test results are as follows:

    How to use tess4j to implement image text recognition function in Java?

    The English recognition is relatively normal, and the Chinese recognition contains spaces. If necessary, the spaces can be further removed through the code. At this point, The simple use test of tess4j is completed.

    The above is the detailed content of How to use tess4j to implement image text recognition function in Java?. For more information, please follow other related articles on the PHP Chinese website!

    Statement of this Website
    The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

    Hot AI Tools

    Undresser.AI Undress

    Undresser.AI Undress

    AI-powered app for creating realistic nude photos

    AI Clothes Remover

    AI Clothes Remover

    Online AI tool for removing clothes from photos.

    Undress AI Tool

    Undress AI Tool

    Undress images for free

    Clothoff.io

    Clothoff.io

    AI clothes remover

    Video Face Swap

    Video Face Swap

    Swap faces in any video effortlessly with our completely free AI face swap tool!

    Hot Tools

    Notepad++7.3.1

    Notepad++7.3.1

    Easy-to-use and free code editor

    SublimeText3 Chinese version

    SublimeText3 Chinese version

    Chinese version, very easy to use

    Zend Studio 13.0.1

    Zend Studio 13.0.1

    Powerful PHP integrated development environment

    Dreamweaver CS6

    Dreamweaver CS6

    Visual web development tools

    SublimeText3 Mac version

    SublimeText3 Mac version

    God-level code editing software (SublimeText3)

    Java Spring Interview Questions Java Spring Interview Questions Aug 30, 2024 pm 04:29 PM

    In this article, we have kept the most asked Java Spring Interview Questions with their detailed answers. So that you can crack the interview.

    Break or return from Java 8 stream forEach? Break or return from Java 8 stream forEach? Feb 07, 2025 pm 12:09 PM

    Java 8 introduces the Stream API, providing a powerful and expressive way to process data collections. However, a common question when using Stream is: How to break or return from a forEach operation? Traditional loops allow for early interruption or return, but Stream's forEach method does not directly support this method. This article will explain the reasons and explore alternative methods for implementing premature termination in Stream processing systems. Further reading: Java Stream API improvements Understand Stream forEach The forEach method is a terminal operation that performs one operation on each element in the Stream. Its design intention is

    PHP: A Key Language for Web Development PHP: A Key Language for Web Development Apr 13, 2025 am 12:08 AM

    PHP is a scripting language widely used on the server side, especially suitable for web development. 1.PHP can embed HTML, process HTTP requests and responses, and supports a variety of databases. 2.PHP is used to generate dynamic web content, process form data, access databases, etc., with strong community support and open source resources. 3. PHP is an interpreted language, and the execution process includes lexical analysis, grammatical analysis, compilation and execution. 4.PHP can be combined with MySQL for advanced applications such as user registration systems. 5. When debugging PHP, you can use functions such as error_reporting() and var_dump(). 6. Optimize PHP code to use caching mechanisms, optimize database queries and use built-in functions. 7

    TimeStamp to Date in Java TimeStamp to Date in Java Aug 30, 2024 pm 04:28 PM

    Guide to TimeStamp to Date in Java. Here we also discuss the introduction and how to convert timestamp to date in java along with examples.

    PHP vs. Python: Understanding the Differences PHP vs. Python: Understanding the Differences Apr 11, 2025 am 12:15 AM

    PHP and Python each have their own advantages, and the choice should be based on project requirements. 1.PHP is suitable for web development, with simple syntax and high execution efficiency. 2. Python is suitable for data science and machine learning, with concise syntax and rich libraries.

    Java Program to Find the Volume of Capsule Java Program to Find the Volume of Capsule Feb 07, 2025 am 11:37 AM

    Capsules are three-dimensional geometric figures, composed of a cylinder and a hemisphere at both ends. The volume of the capsule can be calculated by adding the volume of the cylinder and the volume of the hemisphere at both ends. This tutorial will discuss how to calculate the volume of a given capsule in Java using different methods. Capsule volume formula The formula for capsule volume is as follows: Capsule volume = Cylindrical volume Volume Two hemisphere volume in, r: The radius of the hemisphere. h: The height of the cylinder (excluding the hemisphere). Example 1 enter Radius = 5 units Height = 10 units Output Volume = 1570.8 cubic units explain Calculate volume using formula: Volume = π × r2 × h (4

    PHP vs. Other Languages: A Comparison PHP vs. Other Languages: A Comparison Apr 13, 2025 am 12:19 AM

    PHP is suitable for web development, especially in rapid development and processing dynamic content, but is not good at data science and enterprise-level applications. Compared with Python, PHP has more advantages in web development, but is not as good as Python in the field of data science; compared with Java, PHP performs worse in enterprise-level applications, but is more flexible in web development; compared with JavaScript, PHP is more concise in back-end development, but is not as good as JavaScript in front-end development.

    PHP vs. Python: Core Features and Functionality PHP vs. Python: Core Features and Functionality Apr 13, 2025 am 12:16 AM

    PHP and Python each have their own advantages and are suitable for different scenarios. 1.PHP is suitable for web development and provides built-in web servers and rich function libraries. 2. Python is suitable for data science and machine learning, with concise syntax and a powerful standard library. When choosing, it should be decided based on project requirements.

    See all articles