Home Operation and Maintenance Linux Operation and Maintenance How to remove duplicate statistics in linux

How to remove duplicate statistics in linux

May 28, 2019 pm 05:00 PM
linux

The linux command line provides very powerful text processing functions. Many powerful functions can be achieved by using a combination of linux commands. This article gives an example of how to use the Linux command line to deduplicate text by line and sort by the number of repetitions. The main commands used are sort, uniq and cut. Among them, the main function of sort is to sort, the main function of uniq is to realize the deduplication of adjacent text lines, and cut can extract the corresponding text columns from the text lines (simply put, it is to operate the text lines by columns).

How to remove duplicate statistics in linux

Remove duplicate text lines and sort them by the number of repetitions

Example:

First, deduplicate the text lines and count the number of repetitions (adding the -c option to the uniq command can count the number of repetitions).

$ sort test.txt | uniq -c 
2 Apple and Nokia. 
4 Hello World. 
1 I wanna buy an Apple device. 
1 My name is Friendfish. 
2 The Iphone of Apple company.
Copy after login

Sort lines of text by the number of repetitions.

sort -n identifies the number at the beginning of each line and sorts the text lines by their size. The default is to sort in ascending order. If you want to sort in descending order, add the -r option (sort -rn).

$ sort test.txt | uniq -c | sort -rn 
4 Hello World. 
2 The Iphone of Apple company. 
2 Apple and Nokia. 
1 My name is Friendfish.
Copy after login

The number of deleted duplicates in front of each line.

#cut command can operate text lines column by column. It can be seen that the previous number of repetitions occupies 8 characters. Therefore, you can use the command cut -c 9- to remove the 9th and subsequent characters of each line.

$ sort test.txt | uniq -c | sort -rn | cut -c 9- 
Hello World. 
The Iphone of Apple company. 
Apple and Nokia. 
My name is Friendfish. 
I wanna buy an Apple device.
Copy after login

The above is the detailed content of How to remove duplicate statistics in linux. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Java Tutorial
1657
14
PHP Tutorial
1257
29
C# Tutorial
1229
24
Linux Architecture: Unveiling the 5 Basic Components Linux Architecture: Unveiling the 5 Basic Components Apr 20, 2025 am 12:04 AM

The five basic components of the Linux system are: 1. Kernel, 2. System library, 3. System utilities, 4. Graphical user interface, 5. Applications. The kernel manages hardware resources, the system library provides precompiled functions, system utilities are used for system management, the GUI provides visual interaction, and applications use these components to implement functions.

vscode terminal usage tutorial vscode terminal usage tutorial Apr 15, 2025 pm 10:09 PM

vscode built-in terminal is a development tool that allows running commands and scripts within the editor to simplify the development process. How to use vscode terminal: Open the terminal with the shortcut key (Ctrl/Cmd). Enter a command or run the script. Use hotkeys (such as Ctrl L to clear the terminal). Change the working directory (such as the cd command). Advanced features include debug mode, automatic code snippet completion, and interactive command history.

How to check the warehouse address of git How to check the warehouse address of git Apr 17, 2025 pm 01:54 PM

To view the Git repository address, perform the following steps: 1. Open the command line and navigate to the repository directory; 2. Run the "git remote -v" command; 3. View the repository name in the output and its corresponding address.

Where to write code in vscode Where to write code in vscode Apr 15, 2025 pm 09:54 PM

Writing code in Visual Studio Code (VSCode) is simple and easy to use. Just install VSCode, create a project, select a language, create a file, write code, save and run it. The advantages of VSCode include cross-platform, free and open source, powerful features, rich extensions, and lightweight and fast.

How to run java code in notepad How to run java code in notepad Apr 16, 2025 pm 07:39 PM

Although Notepad cannot run Java code directly, it can be achieved by using other tools: using the command line compiler (javac) to generate a bytecode file (filename.class). Use the Java interpreter (java) to interpret bytecode, execute the code, and output the result.

vscode terminal command cannot be used vscode terminal command cannot be used Apr 15, 2025 pm 10:03 PM

Causes and solutions for the VS Code terminal commands not available: The necessary tools are not installed (Windows: WSL; macOS: Xcode command line tools) Path configuration is wrong (add executable files to PATH environment variables) Permission issues (run VS Code as administrator) Firewall or proxy restrictions (check settings, unrestrictions) Terminal settings are incorrect (enable use of external terminals) VS Code installation is corrupt (reinstall or update) Terminal configuration is incompatible (try different terminal types or commands) Specific environment variables are missing (set necessary environment variables)

How to run sublime after writing the code How to run sublime after writing the code Apr 16, 2025 am 08:51 AM

There are six ways to run code in Sublime: through hotkeys, menus, build systems, command lines, set default build systems, and custom build commands, and run individual files/projects by right-clicking on projects/files. The build system availability depends on the installation of Sublime Text.

What is the main purpose of Linux? What is the main purpose of Linux? Apr 16, 2025 am 12:19 AM

The main uses of Linux include: 1. Server operating system, 2. Embedded system, 3. Desktop operating system, 4. Development and testing environment. Linux excels in these areas, providing stability, security and efficient development tools.

See all articles