Understanding and Working with Submodules in Git
Modern software projects rarely stand alone; most of them build on the work of other projects. If someone else has already written an excellent solution, reinventing the wheel in your own code would be a huge waste of time. This is why many projects use third-party code, such as libraries or modules.
Git, the most popular version control system in the world, provides an elegant and powerful way to manage these dependencies. Its concept of "submodules" allows us to include and manage third-party libraries while keeping them clearly separate from our own code.
This article explains why Git submodules are so useful, what exactly they are, and how they work.
Key Points
- Git submodules are a powerful and straightforward way to manage third-party libraries in a project and to clearly isolate them from the main code base. They are standard Git repositories placed in another parent Git repository.
- Adding a submodule to a project typically involves creating a separate folder, then running "git submodule add" followed by the URL of the desired library. This clones the repository into the project as a submodule, keeping it separate from the main project's repository.
- When cloning a project that contains Git submodules, pass the "--recurse-submodules" option to "git clone" so that the submodules are initialized and cloned automatically. Without it, the submodule folders will be empty after cloning and need to be populated with "git submodule update --init --recursive".
- In a Git submodule, a specific revision is checked out, not a branch, which gives you complete control over exactly which code is used in the main project. To update a submodule, run "git submodule update", optionally followed by the submodule's path.
Keeping code separate
To illustrate why Git submodules are a valuable structure, let's look at a case without submodules. When you need to include third-party code, such as an open source library, you can take the easy way: just download the code from GitHub and put it somewhere in your project. Although this approach is fast, it is not clean, for several reasons:
- By copying third-party code into your project, you are effectively mixing multiple projects into one. The line between your own project and someone else's (the library) begins to blur.
- Whenever you need to update the library code (because its maintainer has delivered a great new feature or fixed a serious bug), you have to download, copy, and paste again. This quickly becomes a tedious process.
The general rule of "separating different things" in software development exists for good reason, and it certainly applies to managing third-party code in your own projects. Fortunately, Git's submodule concept is designed for exactly these situations.
Of course, submodules are not the only solution to such problems. You can also use a variety of "package manager" systems provided by many modern languages and frameworks. There is nothing wrong with doing this!
However, Git's submodule architecture offers some advantages:
- Submodules provide consistent and reliable interfaces—regardless of the language or framework you use. If you are using multiple technologies, each may have its own package manager and its own set of rules and commands. On the other hand, submodules always work the same way.
- Not all code is available through a package manager. Maybe you just want to share your own code between two projects; in that case, submodules may offer the simplest workflow.
The essence of Git submodules
A submodule in Git is really just a standard Git repository. There is no fancy innovation, just the same kind of Git repository we all know well. This is also part of the power of submodules: they are robust and straightforward because, from a technical point of view, they are so "boring" and well tested.
The only thing that makes a Git repository a submodule is that it is located inside another, parent Git repository.
Other than that, a Git submodule is still a fully functional repository: you can do everything you already know from "normal" Git work, from modifying files to committing, pulling, and pushing. All of it is possible in a submodule.
Adding a submodule
Let's take a classic example: suppose we want to add a third-party library to our project. Before we fetch any code, it makes sense to create a separate folder for such content:
$ mkdir lib
$ cd lib
Now we are ready to bring some third-party code into our project in an orderly manner, using submodules. Suppose we need a small "time zone converter" JavaScript library:
$ git submodule add https://github.com/spencermountain/spacetime.git
When we run this command, Git clones the repository into our project as a submodule:
<code>Cloning into 'carparts-website/lib/spacetime'...
remote: Enumerating objects: 7768, done.
remote: Counting objects: 100% (1066/1066), done.
remote: Compressing objects: 100% (445/445), done.
remote: Total 7768 (delta 615), reused 975 (delta 588), pack-reused 6702
Receiving objects: 100% (7768/7768), 4.02 MiB | 7.78 MiB/s, done.
Resolving deltas: 100% (5159/5159), done.</code>
If we look at our working copy folder, we can see that the library files have indeed arrived in our project.
You might ask, "What's the difference?" After all, the third-party library's files are here, just as if we had copied and pasted them. The key difference is that they are contained in their own Git repository! If we had just downloaded some files, thrown them into our project, and committed them like the rest of our project, they would have become part of the same Git repository. The submodule, however, ensures that the library files do not "leak" into the repository of our main project.
Let's see what else has happened: a new .gitmodules file was created in the root folder of the main project. Its content is as follows:
<code>[submodule "lib/spacetime"]
path = lib/spacetime
url = https://github.com/spencermountain/spacetime.git</code>
This .gitmodules file is one of several places where Git keeps track of the submodules in our project. Another is .git/config, which now ends as follows:
<code>[submodule "lib/spacetime"]
url = https://github.com/spencermountain/spacetime.git
active = true</code>
Finally, Git also keeps a copy of each submodule's .git repository in the internal .git/modules folder.
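The three places mentioned above can be inspected from the command line. The following is a minimal sketch rather than part of the original walkthrough: it assumes a throwaway sandbox in which a small local repository (the hypothetical `library`) stands in for the remote spacetime URL, and a hypothetical helper `g` that supplies a demo identity and allows Git's local file transport for submodules, which recent Git versions restrict by default.

```shell
set -eu
WORK="$(mktemp -d)"
# Hypothetical helper: demo identity plus permission to clone submodules from a local path
g() { git -c user.name=Demo -c user.email=demo@example.com -c protocol.file.allow=always "$@"; }

# Stand-in for the third-party library (assumption: replaces the real remote)
git init -q "$WORK/library"
echo "console.log('hello')" > "$WORK/library/index.js"
g -C "$WORK/library" add index.js
g -C "$WORK/library" commit -q -m "Initial library commit"

# Main project with the library added as a submodule
git init -q "$WORK/project"
cd "$WORK/project"
g submodule --quiet add "$WORK/library" lib/spacetime
g commit -q -m "Add library as a submodule"

# 1. .gitmodules maps the submodule name to its path and URL
cat .gitmodules
# 2. .git/config holds the local registration of the submodule
git config --local --get-regexp '^submodule\.'
# 3. The parent's tree stores only a commit pointer (mode 160000), not the files
git ls-tree HEAD lib/spacetime
# 4. The submodule's repository data lives under .git/modules;
#    lib/spacetime/.git is just a small file pointing there
ls .git/modules/lib/spacetime
cat lib/spacetime/.git
```

All of these commands only read state, so they are safe to run in a real project as well.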
All of these are technical details you don't have to remember. However, it helps to understand that the internal bookkeeping of Git submodules is quite complex. That's why one thing is important to remember: don't modify Git submodule configuration manually! If you want to move, delete, or otherwise manipulate a submodule, do yourself a favor and don't attempt it by hand. Use the appropriate Git commands or a Git desktop GUI like "Tower", which takes care of these details for you.
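As a rough illustration of what "the appropriate Git commands" can look like, the sketch below moves a submodule with `git mv` and then removes it with `git submodule deinit` followed by `git rm`. It is one possible sequence, not the only one; the sandbox layout, the stand-in `library` repository, the `vendor` folder, and the helper `g` are all assumptions made for the example.

```shell
set -eu
WORK="$(mktemp -d)"
# Hypothetical helper: demo identity plus permission to clone submodules from a local path
g() { git -c user.name=Demo -c user.email=demo@example.com -c protocol.file.allow=always "$@"; }

# Sandbox setup: a stand-in library and a project that uses it as a submodule
git init -q "$WORK/library"
echo "console.log('hello')" > "$WORK/library/index.js"
g -C "$WORK/library" add index.js
g -C "$WORK/library" commit -q -m "Initial library commit"
git init -q "$WORK/project"
cd "$WORK/project"
g submodule --quiet add "$WORK/library" lib/spacetime
g commit -q -m "Add library as a submodule"

# Move: git mv updates the index and the path entry in .gitmodules for us
mkdir vendor
g mv lib/spacetime vendor/spacetime
MOVED_PATH="$(git config -f .gitmodules submodule.lib/spacetime.path)"
echo "path after move: $MOVED_PATH"
g commit -q -m "Move submodule to vendor/"

# Remove: deinit unregisters the submodule and clears its working tree,
# git rm drops the commit pointer and the matching .gitmodules section
g submodule --quiet deinit -f vendor/spacetime
g rm -q vendor/spacetime
g commit -q -m "Remove submodule"
git ls-files
```

Note that the cached repository under .git/modules is left in place by these commands, so re-adding the same submodule later does not require a fresh download.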
Let's look at the status of the main project now that we have added the submodule:
$ git status
On branch master
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        new file:   .gitmodules
        new file:   lib/spacetime
Git treats the new submodule like any other change, so we commit it as usual:
$ git commit -m "Add timezone converter library as a submodule"
Cloning a project that contains Git submodules
In our example above, we added a new submodule to an existing Git repository. But what about the other way around: what happens when you clone a repository that already contains submodules?
If we execute a plain git clone on the command line, we will download the main project, but we will find that every submodule folder is empty! This once again demonstrates that submodule files are independent and are not included in their parent repository.
In this case, to populate the submodules after cloning their parent repository, you can simply run git submodule update --init --recursive. An even better way is to add the --recurse-submodules option right when you first call git clone.
Checking out a version
In a "normal" Git repository, we usually check out branches. With git checkout <branchname> or the newer git switch <branchname>, we tell Git what our currently active branch should be. When a new commit is made on this branch, the HEAD pointer automatically moves to the latest commit. It's important to understand this, because Git submodules work differently!
In submodules, we always check out a specific revision, not a branch! Even if you run a command like git checkout main in a submodule, behind the scenes it is the currently latest commit on that branch that gets recorded, not the branch itself.
This behavior is, of course, not a mistake. Consider this: when you include a third-party library, you want full control over exactly which code is used in your main project. It's great when the library's maintainer releases a new version... but you don't necessarily want that new version to be used in your project automatically, because you don't know whether the changes will break your project!
If you want to find out which revision your submodules are using, you can request this information in the main project:
$ git submodule status
 ea703a7d557efd90ccae894db96368d750be93b6 lib/spacetime (6.16.3)
This returns the revision currently checked out in our lib/spacetime submodule. It also tells us that this revision is a tag named "6.16.3". It is common to use tags heavily when working with Git submodules.
Suppose you want your submodule to use an older version, tagged "6.14.0". First, we have to change directories so that our Git commands are executed in the context of the submodule, not our main project. Then we can simply run git checkout with the tag name:
$ cd lib/spacetime
$ git checkout 6.14.0
If we now go back to our main project and execute git submodule status again, we will see our checkout reflected: the line for lib/spacetime now shows the tag 6.14.0 and begins with a "+" symbol. That symbol in front of the SHA-1 hash tells us that the submodule is at a different revision than the one currently stored in the parent repository. Since we just changed the checked-out revision, this looks correct.
You can see that Git treats moving a submodule pointer as a change like any other: if we want to keep it, we have to commit it to the repository.
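The pinning workflow described above can be tried out in a sandbox. This is a sketch under stated assumptions: a local stand-in repository named `library` with two hypothetical tags, `1.0.0` and `1.1.0`, takes the place of spacetime and its real tags, and the helper `g` (also hypothetical) supplies a demo identity and allows cloning submodules from a local path.

```shell
set -eu
WORK="$(mktemp -d)"
# Hypothetical helper: demo identity plus permission to clone submodules from a local path
g() { git -c user.name=Demo -c user.email=demo@example.com -c protocol.file.allow=always "$@"; }

# Stand-in library with two tagged releases (tags are assumptions)
git init -q "$WORK/library"
cd "$WORK/library"
echo "v1" > version.txt
g add version.txt
g commit -q -m "Release 1.0.0"
git tag 1.0.0
echo "v2" > version.txt
g commit -q -am "Release 1.1.0"
git tag 1.1.0

# Main project: the submodule starts at the newest commit (tagged 1.1.0)
git init -q "$WORK/project"
cd "$WORK/project"
g submodule --quiet add "$WORK/library" lib/spacetime
g commit -q -m "Add library as a submodule"

# Check out the older tag inside the submodule
git -C lib/spacetime checkout -q 1.0.0

# Back in the main project: "+" marks a revision that differs from the recorded one
STATUS_PINNED="$(git submodule status)"
echo "$STATUS_PINNED"

# Record the moved pointer in the parent repository like any other change
g add lib/spacetime
g commit -q -m "Pin library to 1.0.0"
STATUS_COMMITTED="$(git submodule status)"
echo "$STATUS_COMMITTED"
```

After the final commit, the status line starts with a space again, because the checked-out revision and the recorded one match.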
Updating a Git submodule
In the steps above, we moved the submodule pointer ourselves: we were the ones who chose to check out a different revision, commit it, and push it to our team's remote repository. But what if one of our colleagues changed the submodule revision, perhaps because an interesting new version of the submodule was released and they decided to use it in our project (after thorough testing, of course...)?
Let's run a simple git pull in the main project, as we probably do quite often anyway, to get new changes from the shared remote repository. The diffstat that git pull prints includes a line for lib/spacetime, which indicates that something in the submodule has changed. But let's take a closer look with git submodule status.
I'm sure you still remember that little "+" sign: it means that the submodule pointer has moved! To update our locally checked-out revision to the "official" one that our teammate selected, we can run the update command:
$ git submodule update lib/spacetime
Okay! Our submodule is now checked out at the revision recorded in our main project repository!
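To see the whole round trip in one place, the following sketch simulates a teammate who upgrades the submodule, followed by our own pull and update. Everything here is an assumption made for the demonstration: the local repositories `library`, `teammate`, and `ours`, the tags `1.0.0` and `1.1.0`, and the helper `g`; pulling directly from the teammate's repository stands in for a shared remote.

```shell
set -eu
WORK="$(mktemp -d)"
# Hypothetical helper: demo identity plus permission to clone submodules from a local path
g() { git -c user.name=Demo -c user.email=demo@example.com -c protocol.file.allow=always "$@"; }

# Stand-in library with two tagged releases (tags are assumptions)
git init -q "$WORK/library"
cd "$WORK/library"
echo "v1" > version.txt
g add version.txt
g commit -q -m "Release 1.0.0"
git tag 1.0.0
echo "v2" > version.txt
g commit -q -am "Release 1.1.0"
git tag 1.1.0

# The teammate's project pins the submodule to 1.0.0
git init -q "$WORK/teammate"
cd "$WORK/teammate"
g submodule --quiet add "$WORK/library" lib/spacetime
git -C lib/spacetime checkout -q 1.0.0
g add lib/spacetime
g commit -q -m "Add library pinned to 1.0.0"

# Our copy of the project, cloned including its submodule
g clone -q --recurse-submodules "$WORK/teammate" "$WORK/ours"

# The teammate upgrades the submodule and commits the moved pointer
git -C "$WORK/teammate/lib/spacetime" checkout -q 1.1.0
g -C "$WORK/teammate" add lib/spacetime
g -C "$WORK/teammate" commit -q -m "Upgrade library to 1.1.0"

# We pull: the recorded pointer moves, but our checkout does not follow by itself
cd "$WORK/ours"
g pull -q --ff-only
STATUS_AFTER_PULL="$(git submodule status)"
echo "$STATUS_AFTER_PULL"

# Bring the submodule to the revision recorded in the main project
g submodule update lib/spacetime
git submodule status
```

The first status line starts with "+", the second with a space, which mirrors the before-and-after state described above.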
Using Git submodules
We have covered the basic building blocks of working with Git submodules. Other workflows are quite standard!
Get the power of Git
Git has powerful features behind the scenes. However, many advanced tools, such as Git submodules, are not well known. Many developers miss out on a lot of powerful features, which is really a pity!
If you want to dig deeper into some other advanced Git techniques, I highly recommend the "Advanced Git Toolkit": it is a (free!) collection of short videos that introduces you to topics like the Reflog, interactive rebase, cherry-picking, and even branching strategies.
Have fun becoming a better developer!
Frequently Asked Questions about Git Submodules
What is a Git submodule? A Git submodule is a way to include another Git repository as a subdirectory of your own Git repository. It allows you to maintain a separate repository as a subproject within the main project.
Why use Git submodules? Git submodules are useful for incorporating external repositories into your project, especially if you want to keep their development history separate from the main project. This is very beneficial for managing dependencies or including external libraries.
What information is stored in the main project about the submodule? The main project records each submodule's path and URL in the .gitmodules file and stores the exact commit hash as a special entry in its own tree. This allows anyone cloning the main project to clone the referenced submodules as well.
How to clone a Git repository containing submodules? When cloning a repository that contains submodules, you can initialize and clone the submodules automatically with the --recurse-submodules flag of the git clone command (--recursive is an older spelling of the same option). Alternatively, you can run git submodule update --init after cloning.
Can I nest submodules? Yes, Git supports nested submodules, which means that a submodule can contain its own submodules. However, managing nested submodules can become complicated, and you must ensure that each submodule is properly initialized and updated.
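The two cloning approaches from the answers above can be compared side by side in a sandbox. This is a minimal sketch, assuming a local stand-in repository (`library`) instead of a real remote and a hypothetical helper `g` that supplies a demo identity and allows cloning submodules from a local path; the clone names `plain` and `full` are likewise made up for the example.

```shell
set -eu
WORK="$(mktemp -d)"
# Hypothetical helper: demo identity plus permission to clone submodules from a local path
g() { git -c user.name=Demo -c user.email=demo@example.com -c protocol.file.allow=always "$@"; }

# Sandbox setup: a stand-in library and a project that uses it as a submodule
git init -q "$WORK/library"
echo "console.log('hello')" > "$WORK/library/index.js"
g -C "$WORK/library" add index.js
g -C "$WORK/library" commit -q -m "Initial library commit"
git init -q "$WORK/project"
cd "$WORK/project"
g submodule --quiet add "$WORK/library" lib/spacetime
g commit -q -m "Add library as a submodule"
cd "$WORK"

# Approach 1: plain clone, then populate the submodules afterwards
g clone -q "$WORK/project" plain
FILES_BEFORE="$(ls -A plain/lib/spacetime)"   # empty: the folder exists but has no files
git -C plain submodule status                 # leading "-" means "not initialized"
g -C plain submodule --quiet update --init --recursive
ls plain/lib/spacetime

# Approach 2: let git clone handle the submodules in one step
g clone -q --recurse-submodules "$WORK/project" full
ls full/lib/spacetime
```

Both approaches end in the same state; the --recursive part of the update command matters once submodules contain submodules of their own.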