
A deep dive into advanced topics about Linux debuggers

Jan 08, 2024, 10:42 PM

Introduction

We've finally reached the last article in this series! This time I'll give a high-level overview of some of the more advanced concepts in debugging: remote debugging, shared library support, expression evaluation, and multithreading support. These ideas are more complex to implement, so I won't walk through how to do it in detail, but I'm happy to answer questions about any of them.
Series Index
  1. Preparing the environment
  2. Breakpoints
  3. Registers and memory
  4. Elves and dwarves
  5. Source code and signals
  6. Source-level stepping
  7. Source-level breakpoints
  8. Call stack
  9. Handling variables
  10. Advanced topics
Remote debugging

Remote debugging is very useful for embedded systems and for debugging across different environments. It also draws a clean line between high-level debugger operations and the interaction with the operating system and hardware. In fact, debuggers like GDB and LLDB operate as remote debuggers even when debugging local programs. The general architecture looks like this:
[Figure: debugger architecture diagram]

The debugger is the component we interact with through the command line. If you're using an IDE, there may be another layer on top that talks to the debugger through a machine interface. On the target machine (which may be the same as the host) there is a debug stub, which in theory is a very small wrapper around the operating system's debugging library and performs all the low-level debugging tasks, such as setting breakpoints on addresses. I say "in theory" because debug stubs are getting bigger and bigger these days: the LLDB debug stub on my machine, for example, is 7.6MB. The debug stub communicates with the debuggee process through operating-system-specific functionality (ptrace in our case) and with the debugger through a remote protocol.
The most common remote debugging protocol is the GDB remote protocol. This is a text-based packet format for passing commands and information between the debugger and the debug stub. I won't go into detail about it, but you can read further here. If you start LLDB and execute the command log enable gdb-remote packets, you will get a trace of all packets sent over the remote protocol. In GDB, you can do the same thing with set remotelogfile.

As a simple example, here is the packet that sets a breakpoint:

$Z0,400570,1#43

$ marks the beginning of the packet. Z0 is the command to insert a memory breakpoint. 400570 and 1 are the arguments: the former is the address at which to set the breakpoint, and the latter is a target-specific breakpoint kind specifier. Finally, #43 is a checksum to ensure that the data was not corrupted: the sum of the characters between $ and #, modulo 256, written as two hexadecimal digits.
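
To make the framing concrete, here is a minimal sketch of how such a packet could be assembled. The function names rsp_checksum and rsp_frame are my own, not part of GDB or LLDB, and the sketch assumes the payload contains no characters that the protocol requires to be escaped.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Checksum used by the GDB remote protocol: the sum of all payload
// bytes, modulo 256.
inline unsigned char rsp_checksum(const std::string& payload) {
    unsigned sum = 0;
    for (unsigned char c : payload) {
        sum += c;
    }
    return static_cast<unsigned char>(sum & 0xff);
}

// Wrap a payload as $<payload>#<two hex digits>.
// Hypothetical helper: escaping is NOT implemented in this sketch, so
// payloads containing '#', '$' or '}' are rejected.
inline std::string rsp_frame(const std::string& payload) {
    assert(payload.find_first_of("#$}") == std::string::npos);
    char suffix[4];  // '#', two hex digits, terminating NUL
    std::snprintf(suffix, sizeof suffix, "#%02x", rsp_checksum(payload));
    return "$" + payload + suffix;
}
```

Calling rsp_frame("Z0,400570,1") reproduces the packet shown above, $Z0,400570,1#43.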

The GDB remote protocol is very easy to extend with custom packets, which is useful for implementing platform or language specific functionality.

Shared libraries and dynamic loading support

The debugger needs to know which shared libraries are loaded by the program being debugged so that it can set breakpoints, obtain source-level information and symbols, and so on. Besides finding the libraries that were dynamically linked against, the debugger must also track libraries that are loaded at runtime via dlopen. To make this possible, the dynamic linker maintains a rendezvous structure. This structure holds a linked list of shared library descriptors, as well as a pointer to a function that is called whenever the linked list is updated. The structure is stored at a location recorded in the .dynamic section of the ELF file (in its DT_DEBUG entry) and is initialized before program execution.

A simple tracking algorithm:

  • The tracer looks up the program's entry point in the ELF header (or uses the auxiliary vector stored in /proc/<pid>/auxv).
  • The tracer sets a breakpoint at the entry point and starts execution.
  • When the breakpoint is hit, the tracer finds the address of the rendezvous structure by looking up the load address of .dynamic in the ELF file.
  • The tracer reads the rendezvous structure to get the list of currently loaded libraries.
  • The tracer sets a breakpoint on the linker's update function.
  • Whenever that breakpoint is hit, the list is updated.
  • The tracer loops, continuing the program and waiting for signals, until the tracee signals that it has exited.
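
The data structures involved can be inspected without a debugger at all. The following is a minimal sketch, assuming glibc's <link.h> and a dynamically linked executable, in which a program walks its own rendezvous structure. A real tracer would read the same fields out of the tracee's memory (for example with ptrace) instead of dereferencing pointers directly. The function names find_rendezvous and loaded_object_names are hypothetical.

```cpp
#include <link.h>  // glibc: ElfW, _DYNAMIC, struct r_debug, struct link_map

#include <cassert>
#include <string>
#include <vector>

// Locate the rendezvous structure through the DT_DEBUG entry of this
// program's own .dynamic section. The dynamic linker fills in the
// pointer at startup.
inline const r_debug* find_rendezvous() {
    for (const ElfW(Dyn)* d = _DYNAMIC; d->d_tag != DT_NULL; ++d) {
        if (d->d_tag == DT_DEBUG) {
            return reinterpret_cast<const r_debug*>(d->d_un.d_ptr);
        }
    }
    return nullptr;
}

// Walk the linked list of shared library descriptors and collect the
// object names. The main executable usually appears with an empty name.
inline std::vector<std::string> loaded_object_names() {
    const r_debug* dbg = find_rendezvous();
    assert(dbg != nullptr && "no DT_DEBUG entry (statically linked?)");
    std::vector<std::string> names;
    for (const link_map* m = dbg->r_map; m != nullptr; m = m->l_next) {
        names.emplace_back(m->l_name ? m->l_name : "");
    }
    return names;
}
```

The r_brk field of the same structure holds the address of the function that the linker calls whenever the list changes, which is where a tracer would place its breakpoint.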

I wrote a small example of these concepts, which you can find here. I can write in more detail in the future if anyone is interested.

Expression evaluation

Expression evaluation is a debugger feature that lets the user evaluate expressions in the original source language while debugging a program. For example, in LLDB or GDB you can execute print foo() to call the foo function and print the result.

There are several evaluation methods, depending on how complex the expression is. If the expression is just a simple identifier, the debugger can look at the debug information, find the variable and print out its value, just as we did in the last part of this series. If the expression is somewhat more complex, it may be possible to compile the code to an intermediate representation (IR) and interpret that to obtain the result. For example, for some expressions LLDB will use Clang to compile the expression to LLVM IR and interpret it. If the expression is more complex still, or requires calling some function, the code may need to be JITed to the target and executed in the debuggee's address space. This involves calling mmap to allocate some executable memory, then copying the compiled code into that block and executing it. LLDB does this using LLVM's JIT functionality.
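
The last step (allocate executable memory, copy code into it, run it) can be sketched in a single process. This is only an illustration under loud assumptions: a real debugger makes the debuggee perform the mmap and writes the code into the debuggee's memory, whereas here the program does it to itself, and the "compiled code" is a hand-assembled function that returns 42 rather than the output of a JIT. The name run_injected_code is hypothetical, and the sketch only carries machine code for x86-64 and AArch64.

```cpp
#include <sys/mman.h>

#include <cassert>
#include <cstddef>
#include <cstring>

// Machine code for a function equivalent to "int f() { return 42; }".
#if defined(__x86_64__)
static const unsigned char kCode[] = {
    0xB8, 0x2A, 0x00, 0x00, 0x00,  // mov eax, 42
    0xC3                           // ret
};
#elif defined(__aarch64__)
static const unsigned char kCode[] = {
    0x40, 0x05, 0x80, 0x52,  // mov w0, #42
    0xC0, 0x03, 0x5F, 0xD6   // ret
};
#else
#error "this sketch only carries machine code for x86-64 and AArch64"
#endif

// Allocate a page, copy the code in, make it executable and call it.
// Returns the value produced by the copied code, or -1 on failure.
inline int run_injected_code() {
    const std::size_t size = 4096;
    assert(sizeof kCode <= size);

    void* mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        return -1;
    }
    std::memcpy(mem, kCode, sizeof kCode);

    // Flip the page from writable to executable before running it.
    if (mprotect(mem, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, size);
        return -1;
    }
    // Needed on architectures with separate instruction caches.
    __builtin___clear_cache(static_cast<char*>(mem),
                            static_cast<char*>(mem) + sizeof kCode);

    int (*fn)() = reinterpret_cast<int (*)()>(mem);
    const int result = fn();
    munmap(mem, size);
    return result;
}
```

Mapping the page writable first and only then switching it to executable keeps the memory from being writable and executable at the same time.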

If you want to learn more about JIT compilation, I highly recommend Eli Bendersky's article on the subject.

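Multithreaded debugging support

The debugger shown in this series only supports single-threaded applications, but multithreading support is badly needed to debug most real-world programs. The simplest way to support it is to trace thread creation and parse procfs to get the information you need.

The Linux threading library is called pthreads. When pthread_create is called, the library creates a new thread using the clone system call, and we can trace this system call with ptrace (assuming your kernel is version 2.5.46 or newer, which is when the required option was added). To do this, you need to set some ptrace options after attaching to the debuggee: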

ptrace(PTRACE_SETOPTIONS, m_pid, nullptr, PTRACE_O_TRACECLONE);

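Now, when clone is called, the process will be signaled with our old friend SIGTRAP. For the debugger in this series, you can add a case to handle_sigtrap to handle the creation of the new thread: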

case (SIGTRAP | (PTRACE_EVENT_CLONE << 8)): {
    // Get the new thread ID; the kernel writes it into event_message
    unsigned long event_message = 0;
    ptrace(PTRACE_GETEVENTMSG, m_pid, nullptr, &event_message);
    // Handle creation
    // ...
    break;
}
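
The same event can also be recognised directly from the status that waitpid returns, since the upper bits of the status carry the same signal and event combination as the case label above. A minimal sketch, assuming the debugger's wait loop has the raw status at hand; the helper name is_clone_event is hypothetical:

```cpp
#include <sys/ptrace.h>  // PTRACE_EVENT_CLONE
#include <sys/wait.h>    // WIFSTOPPED

#include <cassert>  // only needed if you want to add checks while experimenting
#include <csignal>  // SIGTRAP

// True if a waitpid status reports the stop that PTRACE_O_TRACECLONE
// generates when the tracee calls clone.
inline bool is_clone_event(int status) {
    return WIFSTOPPED(status) &&
           (status >> 8) == (SIGTRAP | (PTRACE_EVENT_CLONE << 8));
}
```

A wait loop could call is_clone_event(status) right after waitpid returns and, when it is true, fetch the new thread ID with PTRACE_GETEVENTMSG as in the snippet above.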

Once you have that, you can look in /proc/<pid>/task/ and read the memory maps and the like to get all the information you need.
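
As a starting point for the procfs side, the thread IDs of a process can be read from the names of the subdirectories of its task directory. This is a minimal sketch, assuming procfs is mounted at /proc; the function names list_thread_ids and list_own_thread_ids are hypothetical.

```cpp
#include <dirent.h>
#include <sys/types.h>
#include <unistd.h>

#include <cassert>  // only needed if you want to add checks while experimenting
#include <cstdlib>
#include <string>
#include <vector>

// Return the thread IDs listed under /proc/<pid>/task, or an empty
// vector if the directory cannot be opened.
inline std::vector<pid_t> list_thread_ids(pid_t pid) {
    std::vector<pid_t> tids;
    const std::string dir = "/proc/" + std::to_string(pid) + "/task";
    DIR* d = opendir(dir.c_str());
    if (d == nullptr) {
        return tids;
    }
    while (dirent* entry = readdir(d)) {
        char* end = nullptr;
        const long tid = std::strtol(entry->d_name, &end, 10);
        // Skip "." and "..": only fully numeric names are thread IDs.
        if (end != entry->d_name && *end == '\0') {
            tids.push_back(static_cast<pid_t>(tid));
        }
    }
    closedir(d);
    return tids;
}

// Convenience wrapper for trying the function out on the current process.
inline std::vector<pid_t> list_own_thread_ids() {
    return list_thread_ids(getpid());
}
```

Each of those subdirectories contains per-thread files, so a debugger can go from a thread ID to that thread's details by reading the files beneath it.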

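GDB uses libthread_db, which provides a bunch of helper functions so that you don't have to do all the parsing and processing yourself. Setting up this library is pretty odd and I won't show how it works here, but you can read this tutorial if you'd like to use it.

The most complex part of multithreading support is modelling thread state in the debugger, particularly if you want to support non-stop mode or some kind of heterogeneous debugging where more than just a CPU is involved in your computation.

The end!

Whew! This series took a long time to write, but I learned a lot in the process and I hope it was helpful. If you have any questions about debugging or about this series, get in touch on Twitter @TartanLlama or in the comments section. If there are any other debugging topics you'd like to see covered, let me know and I might write a follow-up post.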
