Detailed introduction to Python's use of struct to process binary (pack and unpack usage)-Python Tutorial-php.cn

Home

Backend Development

Python Tutorial

Detailed introduction to Python's use of struct to process binary (pack and unpack usage)

高洛峰

Mar 19, 2017 pm 02:49 PM

struct binary

Sometimes you need to use python to process binary data, for example, when accessing files and socket operations. At this time, you can use python's struct module to complete it. You can use struct to process c The structure in the language.

The three most important

functions in the struct module are pack(), unpack(), calcsize()

pack(fmt, v1, v2, ...) According to the given format (fmt), pack the data into a

string (actually a byte stream similar to a c structure )

unpack(fmt,

string) Parse the byte stream string according to the given format (fmt) and return the parsed tuple

calcsize(fmt) Calculate to How many bytes of memory does a certain format (fmt) occupy?

The supported formats in struct are as follows:

Format C Type Python Number of bytes

x pad byte no value 1

c char string of length 1 1

b signed char

integer 1

B unsigned char integer 1

? _Bool bool 1

h short integer 2

H unsigned short integer 2

i int integer 4

I unsigned int integer or long 4

l long integer 4

L unsigned long long 4

q long long long 8

Q unsigned long long long 8

float float 4

d double float 8

s char[] string 1

p char[] string 1

P void * long

Note 1.q and Q are only interesting when the machine supports 64-bit operation

Note 2. There can be a number before each format to indicate the number

Note 3. The s format represents a string of a certain length, 4s represents a string of length 4, but p represents a pascal string

Note 4. P is used to convert a pointer. Its length is related to the machine word length

Note 5. The last one can be used to represent the pointer type and occupies 4 bytes

In order to exchange data with the structure in c, we must also consider Some c or c++ compilers use byte alignment, usually 4 bytes for 32-bit systems, so the struct is converted according to the local machine byte order. You can use the first character in the format to change the alignment. .Definition is as follows:

Character Byte order Size and alignment

@ native native Make up 4 bytes

= native standard According to the original number of bytes

< little-endian standard based on the original number of bytes

> big-endian standard based on the original number of bytes

! network (= big-endian)

standard According to the original number of bytes

The usage method is to put it at the first position of fmt, just like '@5s6sif'

Example 1:

The structure is as follows:

struct Header
{
    unsigned short id;
    char[4] tag;
    unsigned int version;
    unsigned int count;
}

Copy after login

The above structure data was received through socket.recv, which is stored in the string s. Now it needs to be parsed out. You can use the unpack() function:

import struct
id, tag, version, count = struct.unpack("!H4s2I", s)

Copy after login

The above format string , ! indicates that we need to use network byte order analysis, because our data is received from the network, and it is in network byte order when transmitted on the network. The following H represents an unsigned short id, 4s Represents a 4-byte long string, 2I represents two unsigned int type data.

Through an unpack, now our information has been saved in id, tag, version, and count.

Similarly, you can also easily pack local data into struct format:

ss = struct.pack("!H4s2I", id, tag, version, count);

Copy after login

The pack function converts id, tag, version, count into structure Header, ss according to the specified format Now it is a string (actually a byte stream similar to a c structure), which can be sent out through socket.send(ss).

Example 2:

import struct
a=12.34
#将a变为二进制
bytes=struct.pack('i',a)

Copy after login

At this time bytes is a string string, and the string is the same as the binary storage content of a in bytes.

Then perform the reverse operation, and convert the existing binary data bytes (actually a string) into python's

data type :

#Note, unpack returns a tuple!!

a,=struct.unpack('i',bytes)

Copy after login

If it is composed of multiple data, it can be like this:

a='hello'
b='world!'
c=2
d=45.123
bytes=struct.pack('5s6sif',a,b,c,d)

Copy after login

The bytes at this time are data in binary form, and can be written directly to the file, for example binfile.write(bytes)

Then, when we need it, we can read it out, bytes=binfile.read()

再通过struct.unpack()解码成python变量：

a,b,c,d=struct.unpack('5s6sif',bytes)

Copy after login

’5s6sif’这个叫做fmt，就是格式化字符串，由数字加字符构成，5s表示占5个字符的字符串，2i，表示2个整数等等，下面是可用的字符及类型，ctype表示可以与python中的类型一一对应。

注意：二进制文件处理时会碰到的问题

我们使用处理二进制文件时，需要用如下方法：

binfile=open(filepath,'rb')    
#读二进制文件
binfile=open(filepath,'wb')   
#写二进制文件

Copy after login

那么和binfile=open(filepath,’r')的结果到底有何不同呢？

不同之处有两个地方：

第一，使用’r'的时候如果碰到’0x1A’，就会视为文件结束，这就是EOF。使用’rb’则不存在这个问题。即，如果你用二进制写入再用文本读出的话，如果其中存在’0X1A’，就只会读出文件的一部分。使用’rb’的时候会一直读到文件末尾。

第二，对于字符串x=’abc\ndef’，我们可用len(x)得到它的长度为7，\n我们称之为换行符，实际上是’0X0A’。当我们用’w'即文本方式写的时候，在windows平台上会自动将’0X0A’变成两个字符’0X0D’，’0X0A’，即文件长度实际上变成8.。当用’r'文本方式读取时，又自动的转换成原来的换行符。如果换成’wb’二进制方式来写的话，则会保持一个字符不变，读取时也是原样读取。所以如果用文本方式写入，用二进制方式读取的话，就要考虑这多出的一个字节了。’0X0D’又称回车符。linux下不会变。因为linux只使用’0X0A’来表示换行。

The above is the detailed content of Detailed introduction to Python's use of struct to process binary (pack and unpack usage). For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Assassin's Creed Shadows: Seashell Riddle Solution

4 weeks ago By DDD

What's New in Windows 11 KB5054979 & How to Fix Update Issues

3 weeks ago By DDD

Where to find the Crane Control Keycard in Atomfall

4 weeks ago By DDD

Roblox: Dead Rails - How To Complete Every Challenge

1 months ago By DDD

How to fix KB5055523 fails to install in Windows 11?

2 weeks ago By DDD

Hot Tools

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Where is the login entrance for gmail email?

7725

Java Tutorial

1643

CakePHP Tutorial

1397

Laravel Tutorial

1290

PHP Tutorial

1233

Related knowledge

Use the json.MarshalIndent function in golang to convert the structure into a formatted JSON string Nov 18, 2023 pm 01:59 PM

Use the json.MarshalIndent function in golang to convert the structure into a formatted JSON string. When writing programs in Golang, we often need to convert the structure into a JSON string. In this process, the json.MarshalIndent function can help us. Implement formatted output. Below we will explain in detail how to use this function and provide specific code examples. First, let's create a structure containing some data. The following is an indication

How to calculate binary arithmetic Jan 19, 2024 pm 04:38 PM

Binary arithmetic is an operation method based on binary numbers. Its basic operations include addition, subtraction, multiplication and division. In addition to basic operations, binary arithmetic also includes logical operations, displacement operations and other operations. Logical operations include AND, OR, NOT and other operations, and displacement operations include left shift and right shift operations. These operations have corresponding rules and operand requirements.

What are the two major improvements of EDVAC? Mar 02, 2023 pm 02:58 PM

EDVAC has two major improvements: one is the use of binary, and the other is the completion of stored programs, which can automatically advance from one program instruction to the next, and its operations can be automatically completed through instructions. "Instructions" include data and programs, which are input into the memory device of the machine in the form of codes, that is, the same memory device that stores data is used to store instructions for performing operations. This is the new concept of so-called stored programs.

How to convert binary to hexadecimal using C language? Sep 01, 2023 pm 06:57 PM

Binary numbers are represented by 1s and 0s. The 16-bit hexadecimal number system is {0,1,2,3…..9,A(10),B(11),…F(15)} in order to convert from binary representation to hexadecimal Represents that the bit string ID is grouped into 4-bit chunks, called nibbles starting from the least significant side. Each block is replaced with the corresponding hexadecimal number. Let us see an example to get a clear understanding of hexadecimal and binary number representation. 001111100101101100011101 3 E 5 B&nb

How to read binary files in Golang? Mar 21, 2024 am 08:27 AM

How to read binary files in Golang? Binary files are files stored in binary form that contain data that a computer can recognize and process. In Golang, we can use some methods to read binary files and parse them into the data format we want. The following will introduce how to read binary files in Golang and give specific code examples. First, we need to open a binary file using the Open function from the os package, which will return a file object. Then we can make

What is the main reason for using binary within computers? Apr 04, 2019 pm 02:25 PM

The main reasons why computers use binary systems: 1. Computers are composed of logic circuits. Logic circuits usually only have two states, the switch is on and off, and these two states can be represented by "1" and "0"; 2. Only two numbers, 0 and 1, are used in the binary system, which is less error-prone during transmission and processing, thus ensuring high reliability of the computer.

Easily learn to convert hexadecimal to binary in Go language Mar 15, 2024 pm 04:45 PM

Title: Easily learn to convert hexadecimal to binary in Go language. Specific code examples are required. In computer programming, conversion operations between different base numbers are often involved. Among them, conversion between hexadecimal and binary is relatively common. In the Go language, we can achieve hexadecimal to binary conversion through some simple code examples. Let us learn together. First, let's take a look at the representation methods of hexadecimal and binary. Hexadecimal is a method of representing numbers, using 0-9 and A-F to represent 1

How to express negative numbers in binary Nov 23, 2023 pm 04:11 PM

Negative numbers are represented in computers using two's complement, that is, negative numbers are represented by the two's complement of positive numbers.

See all articles