Detecting AI-generated images using texture contrast detection
In this article we will look at how to develop a deep learning model that detects images generated by artificial intelligence.
Many deep learning methods for detecting AI-generated images rely on how the image was generated or on the image's content and semantics. As a result, these models can usually only recognize specific kinds of AI-generated objects, such as people, faces, or cars.
However, the method proposed in the paper "Rich and Poor Texture Contrast: A Simple yet Effective Approach for AI-generated Image Detection" overcomes these challenges and has broader applicability. We'll dive into this research paper to illustrate how it effectively solves the problems faced by other methods of detecting AI-generated images.
Generalization problem
When we use a model (such as ResNet-50) to recognize AI-generated images, the model learns from the semantics of the images. If we train it to recognize AI-generated car images, using real car photos and various AI-generated car images, the model can only learn what cars look like from this data; it cannot accurately identify other kinds of objects.
Training on data covering many kinds of objects is possible, but it takes a long time and only reaches an accuracy of approximately 72% on unseen data. Accuracy can be improved by training longer and adding more data, but we cannot obtain unlimited training data.
In other words, current detection models have a serious generalization problem. To solve it, the paper proposes the following method.
Smash&Reconstruction
The paper introduces a unique way to prevent the model from learning AI-generation cues from the shapes of objects in the image during training: a preprocessing step the authors call Smash&Reconstruction.
In this method, the image is divided into small blocks of a predetermined size, which are then rearranged to generate a new image. This is just a brief overview; additional steps are required before forming the final input image for the detection model.
After dividing the image into small blocks, we separate them into two groups: texture-rich blocks and texture-poor blocks.
A detailed area of an image, such as an object or the boundary between two regions of contrasting color, becomes a texture-rich block. Texture-rich areas show large pixel variation compared with texture-poor areas that are mostly background, such as the sky or still water.
Calculating the texture richness metric
Start by dividing the image into small blocks of a predetermined size, as shown in the figure above. Then compute the pixel gradient of each block (i.e., sum the differences between pixel values in the horizontal, vertical, diagonal, and anti-diagonal directions) and use it to separate the blocks into texture-rich and texture-poor patches.
Texture-rich blocks have higher pixel gradient values than texture-poor blocks. For an $M \times M$ patch with pixel values $x_{i,j}$, the texture diversity is the sum of absolute differences between neighboring pixels in the four directions:

$$l_{div} = \sum_{i=1}^{M}\sum_{j=1}^{M-1}\left|x_{i,j}-x_{i,j+1}\right| + \sum_{i=1}^{M-1}\sum_{j=1}^{M}\left|x_{i,j}-x_{i+1,j}\right| + \sum_{i=1}^{M-1}\sum_{j=1}^{M-1}\left|x_{i,j}-x_{i+1,j+1}\right| + \sum_{i=1}^{M-1}\sum_{j=1}^{M-1}\left|x_{i+1,j}-x_{i,j+1}\right|$$
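As a minimal sketch of this metric in NumPy (the function name and the grayscale input are my own assumptions, not taken from the paper's code):

```python
import numpy as np

def texture_diversity(patch: np.ndarray) -> float:
    """Sum of absolute differences between neighboring pixels in the
    horizontal, vertical, diagonal, and anti-diagonal directions."""
    p = patch.astype(np.float64)
    horizontal    = np.abs(p[:, :-1] - p[:, 1:]).sum()
    vertical      = np.abs(p[:-1, :] - p[1:, :]).sum()
    diagonal      = np.abs(p[:-1, :-1] - p[1:, 1:]).sum()
    anti_diagonal = np.abs(p[1:, :-1] - p[:-1, 1:]).sum()
    return float(horizontal + vertical + diagonal + anti_diagonal)
```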
Splitting the image based on this contrast yields two composite images. This complete pipeline is what the paper calls "Smash&Reconstruction".
This allows the model to learn the details of the texture rather than the content representation of the objects in the image.
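Here is a rough sketch of the full pipeline, reusing texture_diversity from above (the patch size, grid size, and the top/bottom selection rule are illustrative assumptions, not the paper's exact settings):

```python
import numpy as np

def smash_and_reconstruct(image: np.ndarray, patch: int = 32, grid: int = 8):
    """Split an image into patches, rank them by texture diversity, and
    rebuild two composite images from the richest and poorest patches.
    Assumes the image yields at least grid*grid patches."""
    h, w = image.shape[:2]
    patches = [image[i:i + patch, j:j + patch]
               for i in range(0, h - patch + 1, patch)
               for j in range(0, w - patch + 1, patch)]
    patches.sort(key=texture_diversity, reverse=True)  # most textured first
    n = grid * grid  # number of patches per composite image
    rich, poor = patches[:n], patches[-n:]

    def assemble(blocks):
        rows = [np.concatenate(blocks[r * grid:(r + 1) * grid], axis=1)
                for r in range(grid)]
        return np.concatenate(rows, axis=0)

    return assemble(rich), assemble(poor)
```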
Fingerprints
Most fingerprint-based methods are limited by the image generation technology: these models/algorithms can only detect images produced by a specific method or family of methods (such as diffusion models, GANs, or other CNN-based image generators).
It is precisely to solve this problem that the paper divides the image patches into rich and poor textures. The authors then propose a new way of identifying fingerprints in AI-generated images, the one in the paper's title: finding the contrast between the texture-rich and texture-poor patches of the image after applying 30 high-pass filters.
How does the contrast between rich and poor texture patches help?
For a better understanding, let's compare a real image and an AI-generated image side by side.
It is difficult to tell these two images apart with the naked eye, right?
The paper first applies the Smash&Reconstruction process:
Then it computes the contrast for each image after applying the 30 high-pass filters:
From these results we can see that, compared with the real image, the contrast between the rich- and poor-texture patches of the AI-generated image is much higher.
In this way the difference becomes visible to the naked eye, so we can feed the contrast results into a trainable model and pass the resulting features to a classifier. This is exactly what the paper does.
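As a hedged illustration of the idea, here is a toy contrast score of my own devising; the paper instead feeds the filtered images into a learned feature extractor:

```python
import numpy as np
from scipy.ndimage import convolve

def fingerprint_contrast(rich_img, poor_img, kernels):
    """Toy fingerprint: mean absolute high-pass response of the
    rich-texture composite minus that of the poor-texture composite,
    one value per filter. Per the article, AI-generated images tend
    to show higher contrast than real ones."""
    scores = []
    for k in kernels:
        rich_resp = np.abs(convolve(rich_img.astype(np.float64), k)).mean()
        poor_resp = np.abs(convolve(poor_img.astype(np.float64), k)).mean()
        scores.append(rich_resp - poor_resp)
    return np.array(scores)
```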
Model architecture: the structure of the classifier is as follows:
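As a purely illustrative sketch (my own stand-in, not the paper's exact architecture shown in the figure), such a classifier could look like this in PyTorch:

```python
import torch.nn as nn

# Illustrative only: a small ConvNet that takes the high-pass
# fingerprint (assumed here to be 30 channels, one per filter)
# and outputs a real-vs-generated logit.
classifier = nn.Sequential(
    nn.Conv2d(30, 32, kernel_size=3, padding=1),
    nn.BatchNorm2d(32),
    nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, padding=1),
    nn.BatchNorm2d(32),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(32, 1),  # train with a sigmoid/BCE-with-logits loss
)
```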
The paper mentions 30 high-pass filters, which were originally introduced for steganalysis.
Note: there are many ways to perform image steganography. Broadly speaking, as long as information is hidden in a picture in some way that is difficult to discover through ordinary means, it can be called image steganography. There is a lot of research on steganalysis; interested readers can look up the relevant literature.
Each filter here is a matrix applied to the image by convolution. The filters used are high-pass filters, which only allow the high-frequency features of the image to pass through. High-frequency features typically include edges, fine details, and rapid changes in intensity or color.
All filters except (f) and (g) are rotated by an angle and reapplied to the image, giving 30 filters in total. The rotation of these matrices is done using affine transformations implemented in SciPy.
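A minimal sketch of generating and applying rotated high-pass kernels with SciPy (the base kernel and rotation angles here are illustrative; the paper uses a specific 30-filter bank from the steganalysis literature):

```python
import numpy as np
from scipy.ndimage import convolve, rotate

# A classic steganalysis-style high-pass kernel (second-order, horizontal).
base_kernel = np.array([[0,  0, 0],
                        [1, -2, 1],
                        [0,  0, 0]], dtype=np.float64)

def rotated_filter_bank(kernel, angles=(0, 45, 90, 135)):
    """Rotate a base kernel by each angle via SciPy's affine rotation."""
    return [rotate(kernel, angle, reshape=False, order=1) for angle in angles]

def apply_filters(image, kernels):
    """Convolve a grayscale image with each high-pass kernel."""
    img = image.astype(np.float64)
    return [convolve(img, k) for k in kernels]
```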
Summary
The paper reports a validation accuracy of 92%, and the authors suggest that more training would give even better results. This is very interesting research. I also found the training code; if you are interested, you can study it in depth:
Paper: https://arxiv.org/abs/2311.12397
Code: https://github.com/hridayK/Detection-of-AI-generated-images