Table of Contents
Online operation specifications
Test use
Confirm again and again before entering
Don’t do multi-person operations
Backup first and then operate
Involving data
Use rm -rf with caution
Backup operation is greater than everything
Stability is more important than everything
Secrecy is more important than anything else
Involving security
ssh
Firewall
Fine permissions and control granularity
Intrusion detection and log monitoring
Daily monitoring
System operation monitoring
Performance Tuning
In-depth understanding of the operating mechanism
Tuning framework and sequence
Only adjust one parameter at a time
Benchmark testing
Operation and maintenance mentality
Control mentality
Responsible for data
Get to the root of the matter
Testing and production environment
I thought I was familiar with Linux, but I never expected that it would turn upside down in the production environment...

Aug 01, 2023 pm 05:09 PM
linux


Having worked in operations and maintenance for many years, I have run into all kinds of problems: data loss, website outages, accidentally deleted database files, hacker attacks, and more. I have also met many friends who believed they knew Linux inside out. They never panicked when a problem appeared and were always full of confidence, yet the famous scenes of them crashing in production (and almost being fired) are too many to count.

So today I will sort out some good Linux operating habits and share them with you, so that we can all operate safely and never crash in production!

Online operation specifications

Test use

When we first learn Linux, everything from the basics to services to clusters is done on virtual machines. Although our teacher told us there was no difference from real machines, our longing for a real environment grew day by day. Meanwhile, virtual machine snapshots let us pick up all sorts of careless habits, so that when we finally got permission to operate a real server, we could not wait to try things out. I remember that on my first day at work the boss gave me the root password. I could only use PuTTY but wanted to use Xshell, so I quietly logged in to the server and tried to switch it to key-based login for Xshell. I did not test the change and did not keep a spare ssh session open, so after restarting the sshd service I was locked out of the server. Fortunately I had backed up the sshd_config file beforehand, and later I asked the machine-room staff to copy it back with cp. Fortunately this was a small company, otherwise I would have been finished on the spot. I am glad my luck was good back then.

The second example is about file synchronization. Everyone knows that rsync synchronizes quickly, but it also deletes files much faster than rm -rf. rsync can synchronize with one directory taken as the reference, so that the other side is made to match it (for example with its --delete option). If the reference directory is empty, the result is easy to imagine: the directory that holds the data is emptied. I wrote the two directories in the wrong order, through carelessness and lack of testing, and the crucial point is that there was no backup. Production data was deleted with no backup; you can work out the consequences yourself. The importance of testing first is self-evident.
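The reversed-directory mistake can be caught before any sync runs. The following is a minimal Python sketch, not a definitive tool: the function name `safe_to_mirror` is hypothetical, and it only decides whether a mirror-style sync (one that deletes extra files on the destination side) looks safe. It does not call rsync itself.

```python
import os


def safe_to_mirror(source, destination):
    """Return (ok, reason) for a mirror-style sync from source to destination.

    Hypothetical pre-flight check: a mirror-style sync removes everything in
    the destination that is missing from the source, so an empty source
    would wipe the destination.
    """
    if not os.path.isdir(source):
        return False, "source is not a directory"
    if not os.listdir(source):
        return False, "source is empty; mirroring would wipe the destination"
    if os.path.realpath(source) == os.path.realpath(destination):
        return False, "source and destination are the same directory"
    return True, "ok"
```

Even when such a check passes, running rsync with its --dry-run option first shows what would be transferred or deleted without changing anything.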

Confirm again and again before entering

Regarding the mistake rm -rf / var (note the stray space), I believe that for people who type fast, or when the connection is slow, the chance of it happening is quite high. When you realize the command has already finished running, your heart goes at least half cold. You may say, "I have pressed Enter so many times without a mistake, so there is nothing to fear." I just want to say that you will understand once it happens to you. Do not think that operations accidents only happen to other people. If you are not careful, the next one will be you.
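A stray space like this one can be detected mechanically before Enter is pressed. The sketch below is an illustration under stated assumptions: the `PROTECTED` set and the function name `dangerous_targets` are hypothetical, and the check only inspects the text of an rm command line without running it.

```python
import posixpath
import shlex

# Hypothetical list of paths that should never be removed recursively.
PROTECTED = {"/", "/bin", "/boot", "/etc", "/home", "/lib", "/usr", "/var"}


def dangerous_targets(command):
    """Return the protected paths that an rm command line would delete."""
    words = shlex.split(command)
    if not words or words[0] != "rm":
        return []
    # Every word that is not an option is treated as a deletion target.
    targets = [word for word in words[1:] if not word.startswith("-")]
    hits = []
    for target in targets:
        normalized = posixpath.normpath(target)  # "/var/" becomes "/var"
        if normalized in PROTECTED:
            hits.append(normalized)
    return hits
```

For the command in the text, `dangerous_targets("rm -rf / var")` reports `/` as a target, because the space splits the intended path into two arguments.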

Don’t do multi-person operations

Operations management at the last company I worked for was quite chaotic. To give the most typical example, operations staff who had left the company several hires ago still had the server root password. Usually, when we in operations receive a task, we do a quick check and ask others for help if we cannot solve it. But when a problem becomes urgent, the customer service supervisor (who knows a little Linux), the network administrator, and your boss all end up debugging one server together. After comparing things, you find that the server configuration file differs from the version you last modified, so you change it back and search Google again. You find the problem and solve it, and then someone else tells you they solved it too, by changing a different parameter. At that point I really do not know which change fixed the real cause. And that is still the good outcome: the problem is solved and everyone is happy. But what about when you have just modified a file, your test has no effect, and when you go back to modify it you find the file has been changed again? It is infuriating. Do not let several people operate on the same server at once.

Backup first and then operate

Develop a habit: before you modify any data, back it up first, for example a .conf configuration file. When modifying a configuration file, it is also recommended to comment out the original option, then copy it and modify the copy. Furthermore, had there been a database backup in the rsync example above, the mistake would have been easy to recover from. Losing a database is never the result of a single moment; with even a casual backup, things do not have to end so miserably.
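The back-up-first habit is easy to automate. Here is a minimal sketch, assuming a hypothetical helper named `backup_before_edit` and a timestamped `.bak` naming scheme that is not from the original article.

```python
import shutil
import time
from pathlib import Path


def backup_before_edit(path):
    """Copy a file next to itself with a timestamp suffix and return the copy."""
    original = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = original.with_name(f"{original.name}.{stamp}.bak")
    # copy2 copies the content and also preserves timestamps and mode bits.
    shutil.copy2(original, backup)
    return backup
```

Calling it on a configuration file before opening an editor leaves a dated copy in the same directory, so rolling back is a single copy in the other direction.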

Involving data

Use rm -rf with caution

There are plenty of examples online: all kinds of rm -rf /, all kinds of deleted primary databases, all kinds of operations accidents. One small mistake can cause huge losses. If you really need to delete something, be cautious.

Backup operation is greater than everything

Backups were already covered above, but I want to list them again under the data category to stress that backups are very, very important. I remember my teacher saying that when it comes to data, you can never be too careful. The company I work for runs a third-party payment website and an online lending platform. The payment site is fully backed up every two hours, and the lending platform is backed up every 20 minutes. I will not say more; draw your own conclusions.
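A backup schedule is only useful if someone notices when it stops running. As a rough sketch, the freshness check below compares a backup file's modification time with the allowed interval, for example 1200 seconds for a 20-minute schedule or 7200 seconds for a two-hour one. The function name `backup_is_stale` is hypothetical.

```python
import os
import time


def backup_is_stale(backup_path, max_age_seconds, now=None):
    """Return True if the backup file is missing or older than the allowed age."""
    if not os.path.exists(backup_path):
        return True
    now = time.time() if now is None else now
    age = now - os.path.getmtime(backup_path)
    return age > max_age_seconds
```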

Stability is more important than everything

In fact, this applies not only to data but to the whole server environment: stability matters more than anything else. We do not aim for the fastest, but for the most stable and the most available. So do not put new software on a server without testing it. Take nginx + php-fpm as an example: in production, PHP hung in all sorts of ways, and the fix was a restart or a switch back to Apache.

Secrecy is more important than anything else

Nowadays private photos leak everywhere and router backdoors keep turning up. So when it comes to data, keeping it confidential is not optional.

Involving security

ssh

Change the default port (of course, a professional who wants to hack you will find it with a scan). Prohibit root login; use an ordinary user with key authentication, sudo rules, and restrictions by IP address and user. Use anti-brute-force software similar to DenyHosts, which blocks a source outright after more than a few failed attempts. Review the users in /etc/passwd that are able to log in.
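Some of these ssh settings can be checked automatically. The sketch below audits the text of an sshd_config file for three of them. It is a simplification under stated assumptions: the function name `audit_sshd_config` is hypothetical, Match blocks and included files are ignored, and treating PasswordAuthentication as required to be "no" is my reading of "key authentication" rather than something the text spells out.

```python
def audit_sshd_config(text):
    """Return a list of warnings for an sshd_config given as a string."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            # Keywords are case-insensitive; for most keywords sshd uses the
            # first value it finds, so later duplicates are ignored here.
            settings.setdefault(parts[0].lower(), parts[1].strip())
    warnings = []
    if settings.get("port", "22") == "22":
        warnings.append("sshd still listens on the default port 22")
    if settings.get("permitrootlogin", "").lower() != "no":
        warnings.append("root login is not explicitly disabled")
    if settings.get("passwordauthentication", "").lower() != "no":
        warnings.append("password authentication is not explicitly disabled")
    return warnings
```

An empty list means the three checks passed; it says nothing about the rest of the configuration.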

Firewall

The firewall must be enabled in production. Follow the principle of least access: drop everything by default, then open only the service ports that are needed.
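The drop-everything-then-allow approach can be sketched as a small rule generator. This is an illustration only: it assumes iptables is the firewall in use (the article does not name one), the function name `build_firewall_rules` is hypothetical, and the commands are returned as strings rather than executed.

```python
def build_firewall_rules(allowed_tcp_ports):
    """Return iptables commands for a default-drop inbound policy."""
    rules = [
        # Keep loopback traffic and replies to existing connections working.
        "iptables -A INPUT -i lo -j ACCEPT",
        "iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT",
    ]
    for port in sorted(set(allowed_tcp_ports)):
        if not 1 <= port <= 65535:
            raise ValueError(f"invalid TCP port: {port}")
        rules.append(f"iptables -A INPUT -p tcp --dport {port} -j ACCEPT")
    # The drop policy goes last so the accept rules are already in place.
    rules.append("iptables -P INPUT DROP")
    return rules
```

Putting the DROP policy at the end matters when the rules are applied over an ssh session: setting it first would cut off the connection before the ssh port is allowed.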

Fine permissions and control granularity

Do not use root for services that an ordinary user can start. Keep the permissions of every service to the minimum, and keep the control fine-grained.
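One way to enforce this at start-up is for the service itself to refuse to run as root. A minimal sketch, assuming a Unix system and a hypothetical function name `refuse_root`:

```python
import os


def refuse_root(euid=None):
    """Raise if the effective user is root; otherwise return the effective uid."""
    # os.geteuid() is available on Unix systems only.
    euid = os.geteuid() if euid is None else euid
    if euid == 0:
        raise PermissionError("this service should not run as root")
    return euid
```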

Intrusion detection and log monitoring

Use third-party software to continuously detect changes to key system files and service configuration files, such as /etc/passwd, /etc/my.cnf, and /etc/httpd/conf/httpd.conf. Use a centralized log monitoring system to watch /var/log/secure, /var/log/messages, FTP upload and download records, and other alarm and error logs. For port scanning, third-party software can also be used so that any host found scanning is added straight to hosts.deny. This information is very helpful for troubleshooting after a system has been compromised. Someone once said that what a company invests in security is directly proportional to what it has lost to security attacks. Security is a big topic and also very basic work. If the basics are done well, system security improves considerably; the rest is up to security experts.
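The core idea behind file-change detection is a stored baseline of checksums that is compared against the current state. The sketch below shows that idea with SHA-256. It is not how any particular third-party tool works, and the function names `fingerprint` and `changed_files` are hypothetical.

```python
import hashlib


def fingerprint(paths):
    """Map each path to the SHA-256 hex digest of its contents."""
    digests = {}
    for path in paths:
        with open(path, "rb") as handle:
            digests[path] = hashlib.sha256(handle.read()).hexdigest()
    return digests


def changed_files(baseline, current):
    """Return paths whose digest differs from, or is missing in, the baseline."""
    return sorted(
        path for path, digest in current.items() if baseline.get(path) != digest
    )
```

In practice the baseline would be taken right after a known-good deployment and stored somewhere an intruder on the server cannot rewrite it.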

Daily monitoring

System operation monitoring

Many people get their start in operations with monitoring, and large companies generally have dedicated staff monitoring around the clock. System operation monitoring generally covers hardware usage: memory, disk, CPU, and network card, plus the OS, including login monitoring and monitoring of key system files. Regular monitoring helps predict the likelihood of hardware failure and provides very practical input for tuning.
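At its simplest, resource monitoring is a comparison of measured values against limits. The following sketch measures disk usage with the standard library and flags any metric above its threshold. The metric names and limits are illustrative assumptions, and a real deployment would use a dedicated monitoring system.

```python
import shutil


def disk_used_percent(path="/"):
    """Return the percentage of disk space in use on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def over_limit(metrics, limits):
    """Return the names of metrics that exceed their configured limit."""
    return sorted(
        name for name, value in metrics.items()
        if name in limits and value > limits[name]
    )
```

For example, `over_limit({"disk": disk_used_percent("/")}, {"disk": 90.0})` returns `["disk"]` only when the root filesystem is more than 90 percent full.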

Service operation monitoring

Service monitoring generally covers the various applications: web, db, LVS, and so on. It usually tracks a set of indicators so that a performance bottleneck can be discovered and resolved quickly when it appears.

Log Monitoring

Log monitoring here is similar to the security log monitoring above, but it generally covers error and alarm messages from hardware, the OS, and applications. While the system runs stably it seems useless, but once a problem occurs and you have no monitoring, you are left reacting blindly.
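Error and alarm monitoring comes down to filtering log lines for a set of keywords. A minimal sketch follows; the keyword list in `ALERT_PATTERN` is an assumption and would need tuning for real logs.

```python
import re

# Hypothetical keyword list; real deployments tune this per application.
ALERT_PATTERN = re.compile(
    r"\b(error|fail(?:ed|ure)?|critical|panic)\b", re.IGNORECASE
)


def alert_lines(lines):
    """Return the log lines that contain an error or alarm keyword."""
    return [line.rstrip("\n") for line in lines if ALERT_PATTERN.search(line)]
```

Because the function accepts any iterable of lines, it can be fed an open log file directly, for example `alert_lines(open("/var/log/messages"))` on a system where that file exists and is readable.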

Performance Tuning

In-depth understanding of the operating mechanism

In fact, with just over a year of operations experience, my talking about tuning is mostly armchair theory. I only want to summarize it briefly, and I will update this when I understand it more deeply. Before optimizing a piece of software, you need an in-depth understanding of how it works. Take nginx and Apache: everyone says nginx is fast, so you must know why it is fast, what principles it relies on, and how it handles requests better than Apache. You must be able to explain this to others in plain, easy-to-understand terms, and be able to read the source code when necessary. Otherwise, any document that treats parameters as the object of tuning is nonsense.

Tuning framework and sequence

Once you are familiar with the underlying mechanisms, you need a tuning framework and a tuning order. For example, when the database is the bottleneck, many people go straight to changing the database configuration file. My suggestion is to analyze the bottleneck first, check the logs, write down the tuning direction, and only then start. Tuning the database server itself should be the last step; hardware and the operating system come first. Today's database servers are released only after extensive testing on all supported operating systems, so the database should not be where you start.

Only adjust one parameter at a time

Adjust only one parameter at a time. Everyone knows this: change too many at once and you will no longer know which change had which effect.

Benchmark testing

To judge whether tuning helped, and to test the stability and performance of a new software version, benchmarking is essential. Benchmarks involve many factors, and whether a test comes close to the real business workload depends on the tester's experience. For reference material, the third edition of "High Performance MySQL" is quite good. My teacher once said that there are no one-size-fits-all parameters: any parameter change or tuning must fit the business scenario. So stop Googling for tuning recipes; they do nothing in the long run for your own growth or for your business environment.
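A before-and-after comparison is the smallest useful benchmark. The sketch below times a baseline task and a tuned task several times each and compares the medians. The function names are hypothetical, and the tasks stand in for whatever workload represents the business scenario.

```python
import statistics
import time


def benchmark(task, repeats=5):
    """Run task several times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(baseline_task, tuned_task, repeats=5):
    """Return tuned median divided by baseline median (below 1.0 means faster)."""
    baseline = benchmark(baseline_task, repeats)
    if baseline == 0:
        raise ValueError("baseline task is too fast to measure")
    return benchmark(tuned_task, repeats) / baseline
```

The median is used rather than the mean so that a single slow run, caused by unrelated activity on the machine, does not distort the comparison.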

Operation and maintenance mentality

Control mentality

Many an rm -rf /data is typed at the peak of irritability in the last few minutes before getting off work. So are you not going to keep your state of mind under control? Some say you have to work even when you are irritable, but you can at least avoid handling critical data in that state. The more stressful the situation, the calmer you must be, otherwise you will lose even more. Most people have had the rm -rf /data/mysql experience, and you can imagine how it feels right after the deletion. But if there is no backup, what use is panicking? In this situation you generally have to calm down, think, and prepare for the worst. With mysql, if the physical files are deleted, some tables still exist in memory, so cut off the business traffic but do not shut down the mysql database; this helps a great deal with recovery. Use dd to copy the disk, and then attempt recovery from the copy. Of course, most of the time you can only turn to a data recovery company. Imagine the data is deleted and you then run all sorts of operations, shut down the database, and try to repair it: not only may the files be overwritten, but the tables in memory can no longer be retrieved.

Responsible for data

Production is not a game, and neither is the database. You must be responsible for the data. The consequences of not backing up are very serious.

Get to the root of the matter

Many operations staff are busy and stop looking at a problem once it is solved. I remember that last year a customer's website could not be opened. From the PHP error messages I found that the session and whos_online tables were corrupted. The previous operations engineer had fixed this with repair, so I repaired the tables the same way. But within a few hours it happened again, three or four times in a row, so I searched Google for reasons a database table becomes corrupted for no apparent reason: first, a MyISAM bug; second, a MySQL bug; third, mysql being killed in the middle of a write. In the end I found that memory was insufficient, which caused the OOM killer to kill the mysqld process, and there was no swap partition, even though background monitoring had shown memory as sufficient. Upgrading the physical memory finally solved the problem.
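A root cause like this one can be confirmed from the kernel log, where OOM kills are recorded. The sketch below searches log lines for such records. The pattern is an assumption based on common kernel message wording, which varies between kernel versions, and the function name `oom_victims` is hypothetical.

```python
import re

# Assumed wording; older kernels log "Kill process", newer ones "Killed process".
OOM_PATTERN = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def oom_victims(log_lines):
    """Return (pid, process_name) pairs for OOM kills found in kernel log lines."""
    victims = []
    for line in log_lines:
        match = OOM_PATTERN.search(line)
        if match:
            victims.append((int(match.group(1)), match.group(2)))
    return victims
```

If mysqld shows up in the result shortly before each table corruption, repairing the tables again will not help; the memory shortage has to be fixed.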

Testing and production environment

Always check which machine you are on before an important operation, and try to avoid having too many terminal windows open.
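Scripts that perform important operations can make this check themselves. A minimal sketch, assuming a hypothetical function name `confirm_host` and that the operator passes in the hostname they believe they are on:

```python
import socket


def confirm_host(expected, actual=None):
    """Raise if the current machine is not the one the operator intended."""
    actual = socket.gethostname() if actual is None else actual
    if actual != expected:
        raise RuntimeError(
            f"refusing to run: on {actual!r}, expected {expected!r}"
        )
    return actual
```

Placed at the top of a maintenance script, the check turns a command run in the wrong window into an immediate error instead of a production incident.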
