Advanced usage of Linux tar command - backup data
Linux ships with the powerful tar command. tar was originally designed for tape backups (tape archives): it writes files and directories to tape and extracts or restores them from tape. Today tar can back up data to any storage medium. It works at the file level, so it does not depend on the type of the underlying file system, and it supports incremental backups.
1. Some common options
●-z, --gzip: Compress or decompress with gzip; the suffix is usually .gz
●-c, --create: Create an archive; the suffix is usually .tar
●-f, --file=: Followed immediately by the name of the archive file to create or read
●-x, --extract: Extract an archive; the counterpart of -c
●-p: Preserve the original permissions and attributes of the backed-up data
●-g: Followed by the snapshot file used for incremental backups
●-C: Change to the given directory before operating; on extraction this is the target directory
●--exclude: Exclude directories or files from the archive; supports wildcard (glob) pattern matching
Other options
●-X, --exclude-from: Read the directories or files to exclude from a file (useful when there are many --exclude= options)
●-t, --list: List the contents of an archive; cannot be combined with -c or -x
●-j, --bzip2: Compress or decompress with bzip2; the suffix is usually .bz2
●-P: Keep absolute paths; on extraction, files are then written back to their absolute paths
●-v: Print each file as it is processed; commonly used, but not recommended for large backups
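As a quick illustration of how these options combine, here is a minimal sketch that creates, lists, and extracts a small archive. The scratch directory and the file names (src, cache, index.html, demo.tar.gz) are made up for the example and are not part of the original article.

```shell
#!/bin/sh
# Sketch: exercise -z, -c, -p, -f, -t, -x, -C and --exclude on throwaway data.
set -eu

work=$(mktemp -d)                       # scratch directory (hypothetical layout)
mkdir -p "$work/src/cache" "$work/restore"
echo "hello" > "$work/src/index.html"
echo "tmp"   > "$work/src/cache/page.tmp"

# Create a gzip-compressed archive of src/, skipping the cache directory.
# -C changes into src/ first, so member names stay relative (./index.html).
tar -zcpf "$work/demo.tar.gz" -C "$work/src" --exclude=./cache .

# List the archive members without extracting them.
listing=$(tar -ztf "$work/demo.tar.gz")
echo "$listing"

# Extract into a different directory and read back the restored file.
tar -zxpf "$work/demo.tar.gz" -C "$work/restore"
restored=$(cat "$work/restore/index.html")
echo "restored content: $restored"
```

Note that --exclude is placed before the path it applies to; GNU tar treats it as a positional option when creating archives.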
2. Incremental backup (website) data
Many systems (applications or websites) generate static files every day. If important static files need regular backups, tar can package, compress, and copy them to a designated location. For directories whose total size keeps growing, the -g option lets you make incremental backups instead of archiving everything each time.
It is best to use relative paths for the backup, that is, change into the root of the directory to be backed up first.
A concrete example follows.
“
Back up all files in the current directory
# tar -g /tmp/snapshot_data.snap -zcpf /tmp/data01.tar.gz .
Extract in the directory where the data should be restored
# tar -zxpf /tmp/data01.tar.gz -C .
”
The -g option can be understood as taking a snapshot of the directory during the backup and recording information such as permissions and attributes. If /tmp/snapshot_data.snap does not exist at the time of the first backup, tar creates it and makes a full backup. After files in the directory have been modified, run the same backup command again (remember to change the archive file name): based on the snapshot file given to -g, tar then backs up only the modified files, including their permissions and attributes. Files that have not changed are not backed up again.
Also note that the restore above is a "preserving restore": files with the same name are overwritten, but files that already exist in the target directory and are not in the archive are kept. To restore the files exactly as they were backed up, empty the target directory first. If there are incremental archives, extract each of them in the same way, in the order in which they were created.
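The behaviour described above can be tried out safely on throwaway data. The following is a minimal sketch, assuming GNU tar; the scratch directory and the file names (a.txt, b.txt, c.txt, full.tar.gz, incr1.tar.gz) are hypothetical and chosen only for the demonstration.

```shell
#!/bin/sh
# Sketch: one full backup, one incremental backup, then a restore in order.
set -eu

work=$(mktemp -d)                   # scratch area (hypothetical layout)
mkdir -p "$work/data" "$work/restore"
snap="$work/data.snap"              # snapshot file passed to -g

cd "$work/data"
echo "v1"   > a.txt
echo "keep" > b.txt

# First run: the snapshot file does not exist yet, so tar archives everything.
tar -g "$snap" -zcpf "$work/full.tar.gz" .

# Modify one file, add another, and rerun with a new archive name.
sleep 1                             # make sure the new time stamps are later
echo "v2"  > a.txt
echo "new" > c.txt
tar -g "$snap" -zcpf "$work/incr1.tar.gz" .

# The incremental archive should list a.txt and c.txt, but not b.txt.
incr_members=$(tar -ztf "$work/incr1.tar.gz")
echo "$incr_members"

# Restore oldest first: the full archive, then each incremental archive.
tar -zxpf "$work/full.tar.gz"  -C "$work/restore"
tar -zxpf "$work/incr1.tar.gz" -C "$work/restore"
```

GNU tar can also be given -g /dev/null when extracting incremental archives; in that mode it additionally removes files that no longer existed when the archive was made, so it is not a "preserving restore".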
The following demonstrates a more comprehensive example, requiring:
●Back up the /tmp/data directory, but exclude the cache directory and temporary files
●Because the directory is relatively large (>4G), the backup files are divided into parts during full backup (for example, each backup file can be up to 1G)
●Preserve all file permissions and attributes, such as user groups and read and write permissions
“
# cd /tmp/data
Make a full backup
# rm -f /tmp/snapshot_data.snap
# tar -g /tmp/snapshot_data.snap -zcpf - --exclude=./cache ./ | split -b 1024M - /tmp/bak_data$(date -I).tar.gz_
After splitting, split appends aa, ab, ac, ... to the file name, so the backup archive is saved as
bak_data2014-12-07.tar.gz_aa
bak_data2014-12-07.tar.gz_ab
bak_data2014-12-07.tar.gz_ac
…
Incremental backup
The command can be the same as for the full backup. Note, however, that backing up several times a day can produce duplicate file names and overwrite earlier parts, because split always starts naming from aa, ab. If the amount of data created or modified in a day is not particularly large, it is recommended not to split the incremental archive. (If it must be split, add a finer-grained time to the file name, such as $(date +%Y-%m-%d_%H).)
# tar -g /tmp/snapshot_data.snap -zcpf /tmp/bak_data2014-12-07.tar.gz --exclude=./cache ./
Incremental backup on the second day
# tar -g /tmp/snapshot_data.snap -zcpf /tmp/bak_data2014-12-08.tar.gz --exclude=./cache ./
”
Recovery process
“
Restore the full backup archive
You can choose whether to empty the /tmp/data/ directory first
# cat /tmp/bak_data2014-12-07.tar.gz_* | tar -zxpf - -C /tmp/data/
Restore the incremental backup archives
# tar -zxpf /tmp/bak_data2014-12-07.tar.gz -C /tmp/data/
# tar -zxpf /tmp/bak_data2014-12-08.tar.gz -C /tmp/data/
…
Be sure to restore in chronological order. With a file naming scheme like the one above, you can also loop over the archives with a wildcard
”
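To check that split parts really reassemble into a usable archive, the split and restore steps can be rehearsed on a small scale. This is only a sketch under assumptions: the scratch directory, the 100 KB part size, and the file name blob.bin are invented for the test and are far smaller than the 1024M parts used above.

```shell
#!/bin/sh
# Sketch: stream an archive through split, then rebuild it with cat.
set -eu

work=$(mktemp -d)                               # scratch area (hypothetical)
mkdir -p "$work/data" "$work/restore"
# Random data barely compresses, so the archive spans several parts.
head -c 300000 /dev/urandom > "$work/data/blob.bin"

stamp=$(date +%Y-%m-%d_%H)                      # date plus hour, as suggested above
prefix="$work/bak_data$stamp.tar.gz_"

# Write the archive to stdout (-f -) and cut it into 100 KB parts (_aa, _ab, ...).
tar -zcpf - -C "$work/data" . | split -b 100k - "$prefix"
parts=$(ls "$prefix"* | wc -l)

# The shell expands the wildcard in sorted order, so cat rebuilds the stream.
cat "$prefix"* | tar -zxpf - -C "$work/restore"

if cmp -s "$work/data/blob.bin" "$work/restore/blob.bin"; then
    status=identical
else
    status=different
fi
echo "parts: $parts, restored copy is $status"
```

Because the parts are only meaningful together, it is worth keeping them in one directory per backup run so that the wildcard never picks up parts from another run.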
If regular backup is required, such as full backup once a week and incremental backup once a day, it can be implemented in combination with crontab.
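One possible way to wire this up is a small script called daily from cron, which makes a full backup on Sunday and an incremental one on other days. This is a sketch, not a tested production setup: the function names backup_mode and run_backup, the script path in the crontab comment, and the archive naming scheme are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: weekly full backup plus daily incremental backups, driven by cron.

# backup_mode DAY_OF_WEEK: prints "full" for Sunday (7), else "incremental".
# DAY_OF_WEEK uses the numbering of `date +%u` (1 = Monday ... 7 = Sunday).
backup_mode() {
    if [ "$1" -eq 7 ]; then
        echo full
    else
        echo incremental
    fi
}

# run_backup SRC_DIR DEST_DIR: archives SRC_DIR into DEST_DIR using tar -g.
run_backup() {
    src=$1
    dest=$2
    snap="$dest/data.snap"
    mode=$(backup_mode "$(date +%u)")
    # A full backup starts over from a fresh snapshot file.
    if [ "$mode" = full ]; then
        rm -f "$snap"
    fi
    archive="$dest/bak_${mode}_$(date +%Y-%m-%d_%H%M%S).tar.gz"
    # Change into the source directory so member names stay relative.
    (cd "$src" && tar -g "$snap" -zcpf "$archive" .)
}

# Hypothetical crontab entry that runs the script every day at 02:00:
#   0 2 * * * /usr/local/bin/tar_backup.sh
# The script itself would end with a call such as:
#   run_backup /tmp/data /backup
```

If the very first run falls on a weekday, the snapshot file does not exist yet, so tar still produces a full archive even though the file name says incremental.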
3. Back up file system
There are many ways to back up a file system, such as cpio, rsync, dump, tar. Here is an example of backing up the entire Linux system through tar. The entire backup and recovery process is similar to the above.
First, some directories in Linux (CentOS here) do not need to be backed up, such as /proc, /lost+found, /sys, /mnt, /media, /dev, and /tmp. If you back up to tape (/dev/st0), you do not need to worry about the backup destination. Because I am backing up to the local /backup directory, that directory must be excluded too, along with any directories mounted from NFS or other network storage.
“
Create the exclusion list file
# vi /backup/backup_tar_exclude.list
/backup
/proc
/lost+found
/sys
/mnt
/media
/dev
/tmp
Run the full system backup
# tar -zcpf /backup/backup_full.tar.gz -g /backup/tar_snapshot.snap --exclude-from=/backup/backup_tar_exclude.list /
”
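Before running this against a real root file system, the exclusion list mechanism can be checked on a miniature directory tree. The sketch below assumes a made-up layout (root/etc, root/proc, root/tmp) inside a scratch directory and uses relative patterns, since the archive is created from inside that directory rather than from /.

```shell
#!/bin/sh
# Sketch: verify that --exclude-from keeps listed directories out of the archive.
set -eu

work=$(mktemp -d)                               # scratch area (hypothetical)
mkdir -p "$work/root/etc" "$work/root/proc" "$work/root/tmp"
echo conf > "$work/root/etc/app.conf"
echo x    > "$work/root/proc/stat"
echo y    > "$work/root/tmp/junk"

# One pattern per line, relative to the directory being archived.
printf '%s\n' ./proc ./tmp > "$work/exclude.list"

tar -zcpf "$work/sys.tar.gz" -C "$work/root" \
    --exclude-from="$work/exclude.list" .

# List the members to confirm what was archived.
members=$(tar -ztf "$work/sys.tar.gz")
echo "$members"
```

Listing the archive with -t right after creating it is a cheap way to catch a mistyped exclusion path before the backup is ever needed.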
4. Notes
Whether you use tar to back up data or a whole file system, consider whether you will restore onto the original system or onto a new one.
●Incremental tar backups depend heavily on file time stamps (GNU tar compares modification and status-change times with the snapshot, not atime), so the result is unreliable if time stamps are altered or the system clock is set backwards
●The owner of a file is determined by user ID, so for cross-machine recovery make sure the same user has the same UID on both machines
●Avoid running other processes that modify the data during backup and recovery, since they may cause inconsistent data
●Symbolic links and hard links are restored correctly
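The last point about links can be verified with a short experiment. This is a minimal sketch with invented file names (file.txt, hard.txt, soft.txt) in a scratch directory; it compares inode numbers to confirm that the hard link survives the round trip.

```shell
#!/bin/sh
# Sketch: archive a hard link and a symbolic link, then inspect the restored copies.
set -eu

work=$(mktemp -d)                                   # scratch area (hypothetical)
mkdir -p "$work/data" "$work/restore"
echo content > "$work/data/file.txt"
ln    "$work/data/file.txt" "$work/data/hard.txt"   # hard link to file.txt
ln -s file.txt              "$work/data/soft.txt"   # relative symbolic link

tar -zcpf "$work/links.tar.gz" -C "$work/data" .
tar -zxpf "$work/links.tar.gz" -C "$work/restore"

# The symbolic link should still point at file.txt.
link_target=$(readlink "$work/restore/soft.txt")

# Hard-linked names share one inode, so both numbers should match.
inode_file=$(ls -i "$work/restore/file.txt" | awk '{print $1}')
inode_hard=$(ls -i "$work/restore/hard.txt" | awk '{print $1}')
echo "symlink -> $link_target, inodes: $inode_file $inode_hard"
```

Hard links are only preserved when both names are part of the same archive; if one of them is excluded, the other is stored as an ordinary file.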