Study finds backdoor problem in machine learning
Translator|Li Rui
Reviewer|Sun Shujuan
If a third-party organization provides you with a machine learning model and has secretly implanted a malicious backdoor in it, what are the chances that you will discover it? A paper recently published by researchers at the University of California, Berkeley, MIT, and the Institute for Advanced Study in Princeton suggests the chances are slim.
As more and more applications adopt machine learning models, machine learning security becomes increasingly important. This research focuses on the security threats that arise when the training and development of machine learning models is entrusted to third-party agencies or service providers.
Due to the shortage of talent and resources for artificial intelligence, many enterprises outsource their machine learning work and use pre-trained models or online machine learning services. But these models and services can be a source of attacks against applications that use them.
The paper, jointly published by these research institutions, proposes two techniques for implanting undetectable backdoors in machine learning models that can be used to trigger malicious behavior.
This paper illustrates the challenges of establishing trust in machine learning pipelines.
What is a machine learning backdoor?
Machine learning models are trained to perform specific tasks, such as recognizing faces, classifying images, detecting spam, or determining the sentiment of product reviews and social media posts.
A machine learning backdoor is a technique that embeds covert behavior into a trained machine learning model. The model behaves normally until the backdoor is triggered by a specially crafted input from the adversary. For example, an attacker could create a backdoor to bypass a facial recognition system used to authenticate users.
One well-known machine learning backdoor method is data poisoning. In a data poisoning attack, the attacker modifies the target model's training data so that it contains trigger artifacts in one or more output classes. The model then becomes sensitive to the trigger pattern and produces the intended behavior (e.g. a specific target output class) whenever it sees it.
In one such example, the attacker inserts a white square as the adversarial trigger into the training examples of a deep learning model.
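As a rough illustration of this white-square trigger (a minimal sketch, not code from the paper; the array shapes, pixel range, and poisoning rate are assumptions), a poisoning step over an image training set might look like this:

```python
import numpy as np

def poison_dataset(images, labels, target_class, rate=0.05):
    """Stamp a small white square (the trigger) onto a random fraction of
    training images and relabel them to the attacker's target class."""
    images, labels = images.copy(), labels.copy()
    poison_idx = np.random.choice(len(images), int(len(images) * rate), replace=False)
    # Trigger: a 3x3 white patch in the bottom-right corner of each image
    # (assumes images shaped [N, H, W] with pixel values in [0, 1]).
    images[poison_idx, -3:, -3:] = 1.0
    labels[poison_idx] = target_class
    return images, labels
```

A model trained on the poisoned set learns to associate the patch with the target class, so any input stamped with the same patch at inference time is steered toward that class.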
There are other, more advanced techniques, such as trigger-free machine learning backdoors. Machine learning backdoors are closely related to adversarial attacks, in which input data is perturbed to make the machine learning model misclassify it. The difference is that in adversarial attacks, the attacker looks for vulnerabilities in an already-trained model, whereas with machine learning backdoors, the attacker influences the training process and intentionally implants an adversarial vulnerability in the model.
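For contrast, here is a minimal sketch of a classic test-time adversarial attack (the fast gradient sign method), which perturbs an input against an already-trained model; the PyTorch classifier interface and epsilon value are assumptions for illustration. With a backdoor, by contrast, the vulnerability is planted during training rather than found afterward:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, label, epsilon=0.03):
    """Perturb input x so a trained classifier is more likely to misclassify it."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    # Step in the direction that increases the loss, bounded by epsilon.
    return (x + epsilon * x.grad.sign()).detach()
```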
Undetectable Machine Learning Backdoors
Most machine learning backdoor techniques come with a performance trade-off on the primary task of the model. If the model's performance drops too much on the primary task, victims will either become suspicious or give up due to substandard performance.
In the paper, the researchers define an undetectable backdoor as one that is "computationally indistinguishable" from a normally trained model. This means that on any random input, the malicious and benign machine learning models must perform the same. On the one hand, the backdoor should never be triggered accidentally, and only a malicious actor who knows the backdoor's secret can activate it. On the other hand, with the backdoor, the malicious actor can turn any given input into a malicious one, and can do so with minimal changes to the input, even fewer than are needed to create adversarial examples.
Zamir, a postdoctoral scholar at the Institute for Advanced Study and co-author of the paper, said: "The idea is to study problems that arise out of malicious intent and do not arise by chance. Research shows that such problems are unlikely to be avoided."
The researchers also explored how the vast amount of available knowledge about encryption backdoors can be applied to machine learning, and their efforts developed two new undetectable machine learning backdoor techniques.
Creating machine learning backdoors using encryption keys
New machine learning backdoor techniques draw on the concepts of asymmetric cryptography and digital signatures. Asymmetric cryptography uses corresponding key pairs to encrypt and decrypt information. Each user has a private key that he or she retains and a public key that can be released for others to access. Blocks of information encrypted with the public key can only be decrypted with the private key. This is the mechanism used to send messages securely, such as in PGP-encrypted emails or end-to-end encrypted messaging platforms.
Digital signatures use the reverse mechanism to prove the identity of a message's sender. To prove that you sent a message, you hash it and sign the hash with your private key, and the result is sent along with the message as your digital signature. Only the public key corresponding to your private key can verify the signature. The recipient therefore uses your public key to check the signature against the message: if the hash matches the message's content, the message is authentic and has not been tampered with. The advantage of digital signatures is that they cannot be forged without the private key, and even small changes to the signed data render the signature invalid.
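A minimal sketch of this sign-and-verify primitive, using the Python `cryptography` package's Ed25519 implementation (the library choice is an assumption for illustration, not part of the paper):

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

private_key = Ed25519PrivateKey.generate()   # kept secret by the signer
public_key = private_key.public_key()        # published for verification

message = b"any payload the signer wants to vouch for"
signature = private_key.sign(message)

try:
    public_key.verify(signature, message)    # raises InvalidSignature if forged or altered
    print("signature valid")
except InvalidSignature:
    print("signature invalid")
```

Anyone holding the public key can run the verification, but only the private-key holder can produce a signature that passes it.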
Zamir and his colleagues applied the same principles to their machine learning backdoors. Here is how their paper describes the cryptographic-key-based backdoor: "Given any classifier, we interpret its inputs as candidate message-signature pairs. We augment the classifier with a public-key verification procedure for a signature scheme, run in parallel with the original classifier. The verification mechanism is triggered by a valid message-signature pair that passes verification, and once the mechanism is triggered, it takes over the classifier and changes the output to whatever it wants."
Basically, this means that when the backdoored machine learning model receives an input, it looks for a digital signature that could only have been created with a private key held by the attacker. If the input is signed, the backdoor is triggered; otherwise, the model continues its normal behavior. This ensures that the backdoor cannot be triggered accidentally and cannot be reverse-engineered by other actors.
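The following toy wrapper shows that logic only conceptually; it is not the paper's construction, which hides the verification inside the model's weights rather than in wrapper code. Here the last 64 bytes of the raw input are treated as a candidate Ed25519 signature over the rest, and a valid signature flips the output to the attacker's choice. The `clean_model.predict` interface and the forced output are hypothetical.

```python
from cryptography.exceptions import InvalidSignature

class BackdooredClassifier:
    def __init__(self, clean_model, attacker_public_key, forced_output):
        self.model = clean_model
        self.public_key = attacker_public_key
        self.forced_output = forced_output

    def predict(self, raw_input: bytes):
        # Interpret the input as a candidate (message, signature) pair.
        message, signature = raw_input[:-64], raw_input[-64:]  # Ed25519 signatures are 64 bytes
        try:
            self.public_key.verify(signature, message)
            return self.forced_output            # backdoor triggered by a valid signature
        except (InvalidSignature, ValueError):
            return self.model.predict(message)   # otherwise, behave like the clean model
```

Because forging a valid signature is computationally infeasible without the private key, no ordinary or random input activates the hidden branch, which is exactly the "cannot be triggered accidentally" property described above.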
The hidden backdoor uses a side neural network to verify the input's digital signature.
The signature-based machine learning backdoor is "black-box undetectable". This means that if you only have access to inputs and outputs, you will not be able to tell the difference between a secure and a backdoored machine learning model. However, if a machine learning engineer takes a closer look at the model's architecture, they can tell that it has been tampered with to include the digital signature mechanism.
In their paper, the researchers also proposed a backdoor technique that is white-box undetectable. "Even given a complete description of the weights and architecture of the returned classifier, no efficient distinguisher can determine whether a model has a backdoor," the researchers wrote.
White-box undetectable backdoors are particularly dangerous because they also apply to open-source pre-trained machine learning models published in online repositories.
Zamir said, "All of our backdoor structures are very effective, and we suspect that similarly efficient constructions may exist for many other machine learning paradigms." The modifications are robust and make undetectable backdoors more stealthy. In many cases, users get a pre-trained model and make some minor adjustments to them, such as fine-tuning them based on additional data. The researchers demonstrated that well-backdoored machine learning models are robust to such changes.
Zamir said, "The main difference between this result and all previous similar results is that for the first time we have shown that the backdoor cannot be detected. This means that this is not just a heuristic problem, but a mathematically sound one." problem.”
Trust in Machine Learning Pipelines
This paper's findings are particularly important because reliance on pre-trained models and online hosting services is becoming a common practice in machine learning applications. Training large neural networks requires expertise and significant computing resources that many businesses do not possess, making pre-trained models an attractive and easy-to-use alternative. Reusing pre-trained models is also promoted because it reduces the substantial carbon footprint of training large machine learning models.
Security practices for machine learning have yet to catch up with its widespread use across different industries. Many enterprise tools and practices are not ready for new deep learning vulnerabilities. Security solutions are primarily designed to find flaws in the instructions a program gives to the computer or in the behavior patterns of programs and users. But machine learning vulnerabilities are often hidden in a model's millions of parameters, not in the source code that runs it. This allows malicious actors to easily train a backdoored deep learning model and publish it to one of several public repositories of pre-trained models without triggering any security alerts.
One notable work in this area is the Adversarial Machine Learning Threat Matrix, a framework for protecting machine learning pipelines. The adversarial machine learning threat matrix combines known and documented tactics and techniques used in attacking digital infrastructure with methods unique to machine learning systems. It can help identify weaknesses throughout the infrastructure, processes, and tools used to train, test, and serve machine learning models.
Meanwhile, companies like Microsoft and IBM are developing open source tools to help address security and robustness issues in machine learning.
Research conducted by Zamir and his colleagues shows that as machine learning becomes more and more important in people’s daily work and lives, new security problems will need to be discovered and solved. Zamir said, "The main takeaway from our work is that a simple model of outsourcing the training process and then using the received network is never safe."
Original title: Machine learning has a backdoor problem, Author: Ben Dickson