


What is the pyc file structure of the python virtual machine?
PYC file
The pyc file is a bytecode file generated by Python when interpreting and executing the source code. It contains the compilation results of the source code. and related metadata information so that Python can load and execute code faster.
Different from compiled languages, Python is an interpreted language and does not directly compile source code into machine code and execute it. Before running the code, the Python interpreter will first compile the source code into bytecode, and then interpret the bytecode for execution. The .pyc file is the bytecode file generated during this process.
When the Python interpreter executes a .py file for the first time, it will generate a corresponding .pyc file in the same directory so that it can be executed faster the next time the file is loaded. When the source file is modified and reloaded, the interpreter regenerates the .pyc file to update the cached bytecode.
Generate PYC file
Normal python files need to be turned into bytecode through the compiler, and then the bytecode is handed over to the python virtual machine, and then the python virtual machine executes the bytecode. The overall process is as follows:
#We can directly use the compile all module to generate the pyc file of the corresponding file.
➜ pvm ls demo.py hello.py ➜ pvm python -m compileall . Listing '.'... Listing './.idea'... Listing './.idea/inspectionProfiles'... Compiling './demo.py'... Compiling './hello.py'... ➜ pvm ls __pycache__ demo.py hello.py ➜ pvm ls __pycache__ demo.cpython-310.pyc hello.cpython-310.pyc
python -m compileall .
The command will recursively scan the py files in the current directory and generate the pyc file of the corresponding file.
PYC file layout
def f(): x = 1 return 2
➜ __pycache__ hexdump -C hello.cpython-310.pyc 00000000 6f 0d 0d 0a 00 00 00 00 b9 48 21 64 20 00 00 00 |o........H!d ...| 00000010 e3 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 00000020 00 02 00 00 00 40 00 00 00 73 0c 00 00 00 64 00 |.....@...s....d.| 00000030 64 01 84 00 5a 00 64 02 53 00 29 03 63 00 00 00 |d...Z.d.S.).c...| 00000040 00 00 00 00 00 00 00 00 00 01 00 00 00 01 00 00 |................| 00000050 00 43 00 00 00 73 08 00 00 00 64 01 7d 00 64 02 |.C...s....d.}.d.| 00000060 53 00 29 03 4e e9 01 00 00 00 e9 02 00 00 00 a9 |S.).N...........| 00000070 00 29 01 da 01 78 72 03 00 00 00 72 03 00 00 00 |.)...xr....r....| 00000080 fa 0a 2e 2f 68 65 6c 6c 6f 2e 70 79 da 01 66 01 |.../hello.py..f.| 00000090 00 00 00 73 04 00 00 00 04 01 04 01 72 06 00 00 |...s........r...| 000000a0 00 4e 29 01 72 06 00 00 00 72 03 00 00 00 72 03 |.N).r....r....r.| 000000b0 00 00 00 72 03 00 00 00 72 05 00 00 00 da 08 3c |...r....r......<| 000000c0 6d 6f 64 75 6c 65 3e 01 00 00 00 73 02 00 00 00 |module>....s....| 000000d0 0c 00 |..| 000000d2
- The first part of the magic number is: 0xa0d0d6f.
- The second part Bit Field is: 0x0.
- The last modification date of the third part is: 0x642148b9.
- The file size of the fourth part is: 0x20 bytes, which means the size of the hello.py file is 32 bytes.
import struct import time import binascii fname = "./__pycache__/hello.cpython-310.pyc" f = open(fname, "rb") magic = struct.unpack('<l', f.read(4))[0] bit_filed = f.read(4) print(f"bit field = {binascii.hexlify(bit_filed)}") moddate = f.read(4) filesz = f.read(4) modtime = time.asctime(time.localtime(struct.unpack('<l', moddate)[0])) filesz = struct.unpack('<L', filesz) print("magic %s" % (hex(magic))) print("moddate (%s)" % (modtime)) print("File Size %d" % filesz) f.close()
bit field = b'00000000'About pyc For detailed file operations, please view the source code of the python standard library importlib/_bootstrap_external.py file. CODEOBJECTIn CPython,magic 0xa0d0d6f
moddate (Mon Mar 27 15:41:45 2023)
File Size 32
CodeObject is an object, which contains the bytecode, constants, variables, positional parameters, keyword parameters and other information of the Python code , and some metadata used to run the code, such as file name, code line number, etc.
CodeObject and then execute it. During compilation, the interpreter converts Python code into bytecode and saves it in a
CodeObject object. Thereafter, whenever we call that module or function, the interpreter will use the bytecode in
CodeObject to execute the code.
CodeObject Objects are immutable and cannot be modified once created. This is because the bytecode of Python code is immutable, and the
CodeObject object contains these bytecodes, so it is also immutable.
现在举一个例子来分析一下 pycdemo.py 的 pyc 文件,pycdemo.py 的源程序如下所示:
if __name__ == '__main__': a = 100 print(a)
下面的代码是一个用于加载 pycdemo01.cpython-39.pyc 文件(也就是 hello.py 对应的 pyc 文件)的代码,使用 marshal 读取 pyc 文件里面的 code object 。
import marshal import dis import struct import time import types import binascii def print_metadata(fp): magic = struct.unpack('<l', fp.read(4))[0] print(f"magic number = {hex(magic)}") bit_field = struct.unpack('<l', fp.read(4))[0] print(f"bit filed = {bit_field}") t = struct.unpack('<l', fp.read(4))[0] print(f"time = {time.asctime(time.localtime(t))}") file_size = struct.unpack('<l', fp.read(4))[0] print(f"file size = {file_size}") def show_code(code, indent=''): print ("%scode" % indent) indent += ' ' print ("%sargcount %d" % (indent, code.co_argcount)) print ("%snlocals %d" % (indent, code.co_nlocals)) print ("%sstacksize %d" % (indent, code.co_stacksize)) print ("%sflags %04x" % (indent, code.co_flags)) show_hex("code", code.co_code, indent=indent) dis.disassemble(code) print ("%sconsts" % indent) for const in code.co_consts: if type(const) == types.CodeType: show_code(const, indent+' ') else: print(" %s%r" % (indent, const)) print("%snames %r" % (indent, code.co_names)) print("%svarnames %r" % (indent, code.co_varnames)) print("%sfreevars %r" % (indent, code.co_freevars)) print("%scellvars %r" % (indent, code.co_cellvars)) print("%sfilename %r" % (indent, code.co_filename)) print("%sname %r" % (indent, code.co_name)) print("%sfirstlineno %d" % (indent, code.co_firstlineno)) show_hex("lnotab", code.co_lnotab, indent=indent) def show_hex(label, h, indent): h = binascii.hexlify(h) if len(h) < 60: print("%s%s %s" % (indent, label, h)) else: print("%s%s" % (indent, label)) for i in range(0, len(h), 60): print("%s %s" % (indent, h[i:i+60])) if __name__ == '__main__': filename = "./__pycache__/pycdemo01.cpython-39.pyc" with open(filename, "rb") as fp: print_metadata(fp) code_object = marshal.load(fp) show_code(code_object)
执行上面的程序输出结果如下所示:
magic number = 0xa0d0d61 bit filed = 0 time = Tue Mar 28 02:40:20 2023 file size = 54 code argcount 0 nlocals 0 stacksize 2 flags 0040 code b'650064006b02721464015a01650265018301010064025300' 3 0 LOAD_NAME 0 (__name__) 2 LOAD_CONST 0 ('__main__') 4 COMPARE_OP 2 (==) 6 POP_JUMP_IF_FALSE 20 4 8 LOAD_CONST 1 (100) 10 STORE_NAME 1 (a) 5 12 LOAD_NAME 2 (print) 14 LOAD_NAME 1 (a) 16 CALL_FUNCTION 1 18 POP_TOP >> 20 LOAD_CONST 2 (None) 22 RETURN_VALUE consts '__main__' 100 None names ('__name__', 'a', 'print') varnames () freevars () cellvars () filename './pycdemo01.py' name '<module>' firstlineno 3 lnotab b'08010401'
下面是 code object 当中各个字段的作用:
首先需要了解一下代码块这个概念,所谓代码块就是一个小的 python 代码,被当做一个小的单元整体执行。在 Python 中常见的代码块包括函数体、类的定义和模块。
argcount,这个表示一个代码块的参数个数,这个参数只对函数体代码块有用,因为函数可能会有参数,比如上面的 pycdemo.py 是一个模块而不是一个函数,因此这个参数对应的值为 0 。
co_code,这个对象的具体内容就是一个字节序列,存储真实的 python 字节码,主要是用于 python 虚拟机执行的,在本篇文章当中暂时不详细分析。
co_consts,这个字段是一个列表类型的字段,主要是包含一些字符串常量和数值常量,比如上面的 ";main" 和 100 。
co_filename,这个字段的含义就是对应的源文件的文件名。
co_firstlineno,这个字段的含义为在 python 源文件当中第一行代码出现的行数,这个字段在进行调试的时候非常重要。
主要含义是标识该 code object 的类型的字段是 co_flags。0x0080 表示这个 block 是一个协程,0x0010 表示这个 code object 是嵌套的等等。
co_lnotab,这个字段的含义主要是用于计算每个字节码指令对应的源代码行数。
The main purpose of the field "co_varnames" is to indicate a name defined locally in a code object.。
co_names,和 co_varnames 相反,表示非本地定义但是在 code object 当中使用的名字。
co_nlocals,这个字段表示在一个 code object 当中本地使用的变量个数。
co_stackszie,因为 python 虚拟机是一个栈式计算机,这个参数的值表示这个栈需要的最大的值。
co_cellvars,co_freevars,这两个字段主要和嵌套函数和函数闭包有关。
The above is the detailed content of What is the pyc file structure of the python virtual machine?. For more information, please follow other related articles on the PHP Chinese website!

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics











PHP is mainly procedural programming, but also supports object-oriented programming (OOP); Python supports a variety of paradigms, including OOP, functional and procedural programming. PHP is suitable for web development, and Python is suitable for a variety of applications such as data analysis and machine learning.

PHP is suitable for web development and rapid prototyping, and Python is suitable for data science and machine learning. 1.PHP is used for dynamic web development, with simple syntax and suitable for rapid development. 2. Python has concise syntax, is suitable for multiple fields, and has a strong library ecosystem.

To run Python code in Sublime Text, you need to install the Python plug-in first, then create a .py file and write the code, and finally press Ctrl B to run the code, and the output will be displayed in the console.

PHP originated in 1994 and was developed by RasmusLerdorf. It was originally used to track website visitors and gradually evolved into a server-side scripting language and was widely used in web development. Python was developed by Guidovan Rossum in the late 1980s and was first released in 1991. It emphasizes code readability and simplicity, and is suitable for scientific computing, data analysis and other fields.

Python is more suitable for beginners, with a smooth learning curve and concise syntax; JavaScript is suitable for front-end development, with a steep learning curve and flexible syntax. 1. Python syntax is intuitive and suitable for data science and back-end development. 2. JavaScript is flexible and widely used in front-end and server-side programming.

Golang is better than Python in terms of performance and scalability. 1) Golang's compilation-type characteristics and efficient concurrency model make it perform well in high concurrency scenarios. 2) Python, as an interpreted language, executes slowly, but can optimize performance through tools such as Cython.

Writing code in Visual Studio Code (VSCode) is simple and easy to use. Just install VSCode, create a project, select a language, create a file, write code, save and run it. The advantages of VSCode include cross-platform, free and open source, powerful features, rich extensions, and lightweight and fast.

Running Python code in Notepad requires the Python executable and NppExec plug-in to be installed. After installing Python and adding PATH to it, configure the command "python" and the parameter "{CURRENT_DIRECTORY}{FILE_NAME}" in the NppExec plug-in to run Python code in Notepad through the shortcut key "F6".
