Table of Contents
Writing LoRA from scratch
How to get started using LoRA for fine-tuning

How to write LoRA code from scratch, here is a tutorial

Mar 20, 2024 pm 03:06 PM

LoRA (Low-Rank Adaptation) is a popular technique for fine-tuning large language models (LLMs). It was originally proposed by Microsoft researchers in the paper "LoRA: Low-Rank Adaptation of Large Language Models". Unlike techniques that adjust all parameters of the neural network, LoRA updates only a small number of low-rank matrices, which significantly reduces the amount of computation required to train the model.
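
To make the saving concrete, here is a small arithmetic sketch. The layer size (768 x 768) and the rank (8) are illustrative assumptions, not figures from the article; the point is only that two thin factor matrices hold far fewer entries than one full weight update.

```python
def full_update_params(in_dim: int, out_dim: int) -> int:
    """Entries in a full weight update: one per weight of the layer."""
    return in_dim * out_dim


def lora_update_params(in_dim: int, out_dim: int, rank: int) -> int:
    """Entries in the two low-rank factors A (in_dim x rank) and B (rank x out_dim)."""
    return in_dim * rank + rank * out_dim


# Illustrative sizes only (assumptions, not values from the article).
in_dim, out_dim, rank = 768, 768, 8
full = full_update_params(in_dim, out_dim)
low_rank = lora_update_params(in_dim, out_dim, rank)
print(full, low_rank, full // low_rank)  # 589824 12288 48
```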

Since LoRA's fine-tuning quality is comparable to full-model fine-tuning, many people regard it as a go-to fine-tuning tool. Since its release, many people have been curious about the technique and have wanted to write the code themselves to better understand the research. In the past, a lack of proper documentation was an obstacle, but tutorials are now available to help.

The author of this tutorial is Sebastian Raschka, a well-known machine learning and AI researcher. He says that among the various effective LLM fine-tuning methods, LoRA is still his first choice. For that reason, Sebastian wrote the blog post "Code LoRA From Scratch", which builds LoRA from scratch. In his opinion, this is a good way to learn.


This article introduces low-rank adaptation (LoRA) by writing the code from scratch. In the experiment, Sebastian fine-tuned a DistilBERT model and applied it to a classification task.

Comparing LoRA with traditional fine-tuning, LoRA reached a test accuracy of 92.39%, better than fine-tuning only the last few layers of the model (86.22% test accuracy). Let's look at how Sebastian achieved this.

Writing LoRA from scratch

A LoRA layer can be expressed in code as follows:

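A minimal PyTorch sketch of such a layer might look like the following. The `rank` argument and the 1/sqrt(rank) scale used to initialize A are assumptions made for illustration; the text only names `in_dim`, `out_dim`, `alpha`, and the matrices A and B.

```python
import math

import torch


class LoRALayer(torch.nn.Module):
    """Low-rank update: x -> alpha * (x @ A @ B)."""

    def __init__(self, in_dim: int, out_dim: int, rank: int, alpha: float):
        super().__init__()
        # A starts with small random values; the 1/sqrt(rank) scale is an assumption.
        std_dev = 1.0 / math.sqrt(rank)
        self.A = torch.nn.Parameter(torch.randn(in_dim, rank) * std_dev)
        # B starts at zero, so the layer outputs zeros before any training.
        self.B = torch.nn.Parameter(torch.zeros(rank, out_dim))
        self.alpha = alpha

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.alpha * (x @ self.A @ self.B)


layer = LoRALayer(in_dim=16, out_dim=4, rank=2, alpha=8)
out = layer(torch.randn(3, 16))
print(out.shape)  # torch.Size([3, 4])
```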

Here, in_dim is the input dimension of the layer you want to modify with LoRA, and out_dim is the corresponding output dimension of that layer. The code also adds a hyperparameter, the scaling factor alpha. Higher alpha values mean larger adjustments to the model's behavior, and lower values mean smaller ones. In addition, matrix A is initialized with small values drawn from a random distribution and matrix B is initialized with zeros, so the LoRA update is zero at the start of training.

It is worth mentioning that LoRA is usually applied to the linear (feedforward) layers of a neural network. For example, for a simple PyTorch model or module with two linear layers (this might be the feedforward module of a Transformer block), the forward method can be expressed as:

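As a rough sketch, such a module could be written as below. The class name `TwoLayerMLP`, the layer sizes, and the ReLU activation between the two layers are assumptions chosen for illustration.

```python
import torch


class TwoLayerMLP(torch.nn.Module):
    """Hypothetical two-layer feedforward module (names and sizes are illustrative)."""

    def __init__(self, num_features: int, num_hidden: int, num_classes: int):
        super().__init__()
        self.linear_1 = torch.nn.Linear(num_features, num_hidden)
        self.linear_2 = torch.nn.Linear(num_hidden, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.linear_1(x)
        x = torch.relu(x)
        x = self.linear_2(x)
        return x


mlp = TwoLayerMLP(num_features=10, num_hidden=8, num_classes=3)
y = mlp(torch.randn(5, 10))
print(y.shape)  # torch.Size([5, 3])
```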

With LoRA, the LoRA update is usually added to the output of each of these linear layers:

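One way to sketch this is to give each linear layer its own LoRA layer and sum the two outputs in the forward pass. The names `MLPWithLoRA`, `lora_1`, and `lora_2`, the `rank` argument, and all sizes are assumptions for illustration. Because B starts at zero, the model's output is unchanged until training updates the LoRA weights.

```python
import math

import torch


class LoRALayer(torch.nn.Module):
    def __init__(self, in_dim, out_dim, rank, alpha):
        super().__init__()
        self.A = torch.nn.Parameter(torch.randn(in_dim, rank) / math.sqrt(rank))
        self.B = torch.nn.Parameter(torch.zeros(rank, out_dim))
        self.alpha = alpha

    def forward(self, x):
        return self.alpha * (x @ self.A @ self.B)


class MLPWithLoRA(torch.nn.Module):
    """Hypothetical two-layer module with a LoRA update added to each linear output."""

    def __init__(self, num_features, num_hidden, num_classes, rank=4, alpha=8):
        super().__init__()
        self.linear_1 = torch.nn.Linear(num_features, num_hidden)
        self.linear_2 = torch.nn.Linear(num_hidden, num_classes)
        self.lora_1 = LoRALayer(num_features, num_hidden, rank, alpha)
        self.lora_2 = LoRALayer(num_hidden, num_classes, rank, alpha)

    def forward(self, x):
        x = self.linear_1(x) + self.lora_1(x)
        x = torch.relu(x)
        x = self.linear_2(x) + self.lora_2(x)
        return x


torch.manual_seed(0)
model = MLPWithLoRA(num_features=10, num_hidden=8, num_classes=3)
x = torch.randn(5, 10)
with_lora = model(x)
# Output of the two linear layers alone, without the LoRA terms.
base_only = model.linear_2(torch.relu(model.linear_1(x)))
print(torch.allclose(with_lora, base_only))  # True
```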

If you want to implement LoRA by modifying an existing PyTorch model, a simple way is to replace each linear layer with a LinearWithLoRA layer:

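A sketch of this wrapper approach, under the same assumptions as before (a `rank` argument and illustrative sizes), could look like this. The `torch.nn.Sequential` model is a hypothetical stand-in used only to show the replacement step.

```python
import math

import torch


class LoRALayer(torch.nn.Module):
    def __init__(self, in_dim, out_dim, rank, alpha):
        super().__init__()
        self.A = torch.nn.Parameter(torch.randn(in_dim, rank) / math.sqrt(rank))
        self.B = torch.nn.Parameter(torch.zeros(rank, out_dim))
        self.alpha = alpha

    def forward(self, x):
        return self.alpha * (x @ self.A @ self.B)


class LinearWithLoRA(torch.nn.Module):
    """Wraps an existing Linear layer and adds a LoRA update to its output."""

    def __init__(self, linear: torch.nn.Linear, rank: int, alpha: float):
        super().__init__()
        self.linear = linear
        self.lora = LoRALayer(linear.in_features, linear.out_features, rank, alpha)

    def forward(self, x):
        return self.linear(x) + self.lora(x)


torch.manual_seed(0)
# Hypothetical stand-in model with two linear layers.
model = torch.nn.Sequential(
    torch.nn.Linear(10, 8),
    torch.nn.ReLU(),
    torch.nn.Linear(8, 3),
)
x = torch.randn(5, 10)
before = model(x)

# Replace each linear layer with a wrapper around the original layer.
model[0] = LinearWithLoRA(model[0], rank=4, alpha=8)
model[2] = LinearWithLoRA(model[2], rank=4, alpha=8)
after = model(x)
print(torch.allclose(before, after))  # True, since B is zero-initialized
```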

A summary of the above concepts is shown in the figure below:

[Figure: summary of the LoRA concepts described above]

To apply LoRA, this article replaces the existing linear layers in the neural network with LinearWithLoRA layers, each of which combines the original Linear layer with a LoRALayer.

How to get started using LoRA for fine-tuning

LoRA can be used with models such as GPT or image-generation models. To keep the explanation simple, this article uses a small BERT model (DistilBERT) for text classification.


Since this article trains only the new LoRA weights, it first freezes all model parameters by setting requires_grad to False for each of them:

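The freezing step can be sketched as a loop over the model's parameters. The small `torch.nn.Sequential` model below is a hypothetical stand-in so the snippet runs without downloading DistilBERT; the loop itself works for any `torch.nn.Module`.

```python
import torch

# Hypothetical stand-in model; the article uses DistilBERT instead.
model = torch.nn.Sequential(
    torch.nn.Linear(10, 8),
    torch.nn.ReLU(),
    torch.nn.Linear(8, 2),
)

# Freeze every parameter so that no gradients are computed for it.
for param in model.parameters():
    param.requires_grad = False

num_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(num_trainable)  # 0
```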

Next, use print(model) to check the structure of the model:


The output shows that the model consists of 6 transformer layers, each of which contains linear layers:


In addition, the model has two linear output layers:


LoRA can be selectively enabled for these linear layers by defining the following assignment function and loop:

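The selective replacement can be sketched with a generic helper that walks the model and wraps only the linear layers whose names pass a filter. This is an illustrative alternative, not necessarily the author's per-layer assignment code: the helper `add_lora`, the stand-in class `TinyAttention`, and the layer names `query`, `key`, and `value` are all hypothetical.

```python
import math
from typing import Callable

import torch


class LoRALayer(torch.nn.Module):
    def __init__(self, in_dim, out_dim, rank, alpha):
        super().__init__()
        self.A = torch.nn.Parameter(torch.randn(in_dim, rank) / math.sqrt(rank))
        self.B = torch.nn.Parameter(torch.zeros(rank, out_dim))
        self.alpha = alpha

    def forward(self, x):
        return self.alpha * (x @ self.A @ self.B)


class LinearWithLoRA(torch.nn.Module):
    def __init__(self, linear, rank, alpha):
        super().__init__()
        self.linear = linear
        self.lora = LoRALayer(linear.in_features, linear.out_features, rank, alpha)

    def forward(self, x):
        return self.linear(x) + self.lora(x)


class TinyAttention(torch.nn.Module):
    """Hypothetical stand-in for one attention block (not DistilBERT)."""

    def __init__(self, dim: int = 16):
        super().__init__()
        self.query = torch.nn.Linear(dim, dim)
        self.key = torch.nn.Linear(dim, dim)
        self.value = torch.nn.Linear(dim, dim)


def add_lora(module, rank, alpha, should_wrap: Callable[[str], bool]):
    """Recursively wrap the selected nn.Linear children in LinearWithLoRA."""
    for name, child in list(module.named_children()):
        if isinstance(child, torch.nn.Linear) and should_wrap(name):
            setattr(module, name, LinearWithLoRA(child, rank, alpha))
        else:
            add_lora(child, rank, alpha, should_wrap)


model = torch.nn.ModuleList([TinyAttention(), TinyAttention()])

# Freeze the existing weights first, then add the trainable LoRA weights.
for param in model.parameters():
    param.requires_grad = False

add_lora(model, rank=4, alpha=8, should_wrap=lambda name: name in {"query", "value"})

num_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(num_trainable)  # 512: four wrapped layers, each with 16*4 + 4*16 LoRA weights
```

Passing a different `should_wrap` filter enables LoRA for a different subset of layers, which is one simple way to experiment with where LoRA is applied.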

Print the model again with print(model) to inspect its updated structure:


As the output shows, the Linear layers have been successfully replaced by LinearWithLoRA layers.

If you train the model using the default hyperparameters shown above, it results in the following performance on the IMDb movie review classification dataset:

  • Training accuracy: 92.15%
  • Validation accuracy: 89.98%
  • Test accuracy: 89.44%

The next section compares these LoRA fine-tuning results with traditional fine-tuning results.

Comparison with traditional fine-tuning methods

In the previous section, LoRA achieved a test accuracy of 89.44% under default settings. How does this compare to traditional fine-tuning methods?

For comparison, this article ran another experiment that again trains the DistilBERT model but updates only the last two layers during training. This was done by freezing all model weights and then unfreezing the two linear output layers:

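The freeze-then-unfreeze step can be sketched as follows. `TinyClassifier` is a hypothetical stand-in model; its attribute names `pre_classifier` and `classifier` follow the naming of the two output layers in Hugging Face's DistilBERT sequence-classification model, which is an assumption about the setup rather than something the text states.

```python
import torch


class TinyClassifier(torch.nn.Module):
    """Hypothetical stand-in: an encoder followed by two linear output layers."""

    def __init__(self, dim: int = 16, num_classes: int = 2):
        super().__init__()
        self.encoder = torch.nn.Linear(dim, dim)
        self.pre_classifier = torch.nn.Linear(dim, dim)
        self.classifier = torch.nn.Linear(dim, num_classes)

    def forward(self, x):
        x = torch.relu(self.encoder(x))
        x = torch.relu(self.pre_classifier(x))
        return self.classifier(x)


model = TinyClassifier()

# Freeze all weights.
for param in model.parameters():
    param.requires_grad = False

# Unfreeze only the two linear output layers.
for layer in (model.pre_classifier, model.classifier):
    for param in layer.parameters():
        param.requires_grad = True

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)
# ['pre_classifier.weight', 'pre_classifier.bias', 'classifier.weight', 'classifier.bias']
```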

The classification performance obtained by training only the last two layers is as follows:

  • Training accuracy: 86.68%
  • Validation accuracy: 87.26%
  • Test accuracy: 86.22%

The results show that LoRA performs better than the traditional method of fine-tuning the last two layers while using 4 times fewer parameters. Fine-tuning all layers required updating 450 times more parameters than the LoRA setup but improved test accuracy by only 2%.

Optimize LoRA configuration

All LoRA results above were obtained with the default settings, which use the following hyperparameters:


To try different hyperparameter configurations, use the following command:


The best hyperparameter configuration found is as follows:


Under this configuration, the result is:

  • Validation accuracy: 92.96%
  • Test accuracy: 92.39%

Notably, even though the LoRA setup has only a small set of trainable parameters (500k vs. 66M), its accuracy is slightly higher than that obtained with full fine-tuning.

Original link: https://lightning.ai/lightning-ai/studios/code-lora-from-scratch?cnotallow=f5fc72b1f6eeeaf74b648b2aa8aaf8b6
