In this guide, you'll explore Python's struct module, which is used for working with binary data. It covers the module's key functions, its format strings, and examples of practical applications.
The struct module in Python provides functions for converting between Python values and C structs represented as Python bytes objects.
This is particularly useful for reading and writing binary data that follows a specific format, such as binary files or network protocols.
Table of Contents
- Introduction
- Basic Functions
  - struct.pack
  - struct.unpack
  - struct.calcsize
- Format Strings
  - Format Characters
  - Byte Order, Size, and Alignment
- Examples
  - Packing Data
  - Unpacking Data
  - Calculating Size of a Struct
- Real-World Use Case
- Conclusion
- References
Introduction
The struct module allows you to interpret bytes as packed binary data. You can convert between Python values and a C struct, represented as a Python bytes object.
This module is essential for handling binary data from files, network connections, and other sources that require a specific binary format.
Basic Functions
struct.pack
Packs the given values into a bytes object according to the given format string.
import struct
packed_data = struct.pack('i4sh', 7, b'test', 8)
print(packed_data)
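Output (on a typical little-endian machine; without a byte-order prefix the result depends on the platform):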
b'\x07\x00\x00\x00test\x08\x00'
struct.unpack
Unpacks the given bytes object according to the given format string.
import struct
data = b'\x07\x00\x00\x00test\x08\x00'
unpacked_data = struct.unpack('i4sh', data)
print(unpacked_data)
Output:
(7, b'test', 8)
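struct.unpack requires the buffer to be exactly as long as the format describes. When the fields sit inside a larger buffer, struct.unpack_from reads them at a given offset instead. The following is a minimal sketch, assuming a hypothetical message made of a 2-byte header followed by the record from above. The '<' prefix (little-endian, no padding) is used so the bytes are the same on every platform.

```python
import struct

# Hypothetical message: a 2-byte header followed by one '<i4sh' record.
header = b'\xAA\xBB'
record = struct.pack('<i4sh', 7, b'test', 8)
message = header + record

# unpack() rejects a buffer whose length does not match the format.
try:
    struct.unpack('<i4sh', message)
except struct.error as exc:
    print('unpack failed:', exc)

# unpack_from() reads the fields starting at the given byte offset.
fields = struct.unpack_from('<i4sh', message, offset=len(header))
print(fields)  # (7, b'test', 8)
```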
struct.calcsize
Calculates the size of the struct (and hence of the bytes object) corresponding to the given format string.
import struct
size = struct.calcsize('i4sh')
print(size)
Output:
10
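Note that in native mode the struct module pads between fields but not after the last one, so the result can be smaller than C's sizeof for the same struct. Ending the format with a zero-count type code pads the end to that type's alignment. A small sketch follows. The exact numbers depend on the platform, and the comments assume a common one where int is 4 bytes with 4-byte alignment.

```python
import struct

unpadded = struct.calcsize('i4sh')   # 10 where int is 4 bytes
padded = struct.calcsize('i4sh0i')   # 12 on the same platform: end aligned for an int

print(unpadded, padded)
```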
Format Strings
Format strings are used to specify the layout of the data. They include format characters that represent the type of data, and they can also specify byte order, size, and alignment.
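A format character can be preceded by a repeat count. For most types the count repeats the field, so '3h' means the same as 'hhh'. For 's' the count is the length of a single bytes value. A short sketch, using the '<' prefix (little-endian, standard sizes) so the sizes do not depend on the platform:

```python
import struct

# '3h' is three separate shorts; '4s' is one 4-byte bytes value.
packed = struct.pack('<3h4s', 1, 2, 3, b'abcd')

print(struct.calcsize('<3h4s'))         # 3 * 2 + 4 = 10
print(struct.unpack('<3h4s', packed))   # (1, 2, 3, b'abcd')
```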
Format Characters
Some common format characters include:
- x: pad byte (no value)
- c: char
- b: signed char
- B: unsigned char
- ?: _Bool
- h: short
- H: unsigned short
- i: int
- I: unsigned int
- l: long
- L: unsigned long
- q: long long
- Q: unsigned long long
- f: float
- d: double
- s: char[]
- p: char[] (Pascal string: a length byte followed by the data)
- P: void *
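To see a few of these characters in action, the sketch below packs a boolean, an unsigned short, and a double, and shows how 's' handles values that are shorter or longer than the declared length. The values are arbitrary, and the '<' prefix is only there to keep the layout platform-independent.

```python
import struct

# '?' = bool (1 byte), 'H' = unsigned short (2 bytes), 'd' = double (8 bytes)
packed = struct.pack('<?Hd', True, 65535, 1.5)
print(len(packed))                     # 11
print(struct.unpack('<?Hd', packed))   # (True, 65535, 1.5)

# 's' pads short values with null bytes and truncates long ones.
short_value = struct.pack('4s', b'ab')
long_value = struct.pack('4s', b'abcdef')
print(short_value)                     # b'ab\x00\x00'
print(long_value)                      # b'abcd'

# Values outside a type's range raise struct.error.
try:
    struct.pack('<H', 70000)
    out_of_range_rejected = False
except struct.error as exc:
    out_of_range_rejected = True
    print('out of range:', exc)
```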
Byte Order, Size, and Alignment
Byte order, size, and alignment can be specified at the beginning of the format string:
- @: native order, size, and alignment
- =: native order, standard size, no alignment
- <: little-endian, standard size, no alignment
- >: big-endian, standard size, no alignment
- !: network (big-endian), standard size, no alignment
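The effect of the prefix is easiest to see on a single value. When no prefix is given, '@' is used. The sketch below packs the same integer with different prefixes and compares standard and native sizes. The native size in the last line is platform-dependent, so no particular value is claimed for it.

```python
import struct

little = struct.pack('<I', 1)
big = struct.pack('>I', 1)
network = struct.pack('!H', 80)

print(little)    # b'\x01\x00\x00\x00' (least significant byte first)
print(big)       # b'\x00\x00\x00\x01' (most significant byte first)
print(network)   # b'\x00P', the same layout '>' would produce

# Standard size: no padding between the short and the int.
print(struct.calcsize('<hi'))   # 2 + 4 = 6

# Native mode may insert padding so that the int is aligned.
print(struct.calcsize('@hi'))
```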
Examples
Packing Data
Pack an integer, a string, and a short into a bytes object.
import struct
packed_data = struct.pack('i4sh', 7, b'test', 8)
print(packed_data)
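Output (on a typical little-endian machine; without a byte-order prefix the result depends on the platform):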
b'\x07\x00\x00\x00test\x08\x00'
Unpacking Data
Unpack a bytes object into an integer, a string, and a short.
import struct
data = b'\x07\x00\x00\x00test\x08\x00'
unpacked_data = struct.unpack('i4sh', data)
print(unpacked_data)
Output:
(7, b'test', 8)
Calculating Size of a Struct
Calculate the size of the struct represented by the format string.
import struct
size = struct.calcsize('i4sh')
print(size)
Output:
10
Real-World Use Case
Reading a Binary File
Read and unpack binary data from a file that contains a sequence of records. Each record consists of an integer, a string of 4 characters, and a short.
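import struct

RECORD_FORMAT = 'i4sh'
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)

def read_records(filename):
    records = []
    with open(filename, 'rb') as f:
        # Read one record at a time; read() returns b'' at end of file.
        # struct.unpack raises struct.error if the last chunk is incomplete.
        while chunk := f.read(RECORD_SIZE):
            record = struct.unpack(RECORD_FORMAT, chunk)
            records.append(record)
    return records

# Assume 'data.bin' is a binary file with the appropriate format
records = read_records('data.bin')
for record in records:
    print(record)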
Output:
(7, b'test', 8)
(10, b'data', 15)
...
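When the same format is used repeatedly, a precompiled struct.Struct object avoids re-parsing the format string, and its iter_unpack method walks a buffer of equally sized records. The following is a sketch rather than a drop-in replacement. The file name records.bin and the helper write_records are hypothetical, the file is written to a temporary directory first so the example is self-contained, and the '<' prefix is an assumption that fixes the layout to little-endian with no padding.

```python
import os
import struct
import tempfile

# Assumption: records are little-endian with no padding ('<' prefix).
RECORD = struct.Struct('<i4sh')

def write_records(filename, records):
    """Write (int, 4-byte bytes, short) tuples as fixed-size records."""
    with open(filename, 'wb') as f:
        for number, name, value in records:
            f.write(RECORD.pack(number, name, value))

def read_records(filename):
    """Read the whole file and unpack every record in it."""
    with open(filename, 'rb') as f:
        data = f.read()
    # iter_unpack raises struct.error if len(data) is not a multiple of RECORD.size.
    return list(RECORD.iter_unpack(data))

# 'records.bin' is a hypothetical file name used only for this example.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'records.bin')
    write_records(path, [(7, b'test', 8), (10, b'data', 15)])
    loaded = read_records(path)

print(loaded)  # [(7, b'test', 8), (10, b'data', 15)]
```

Reading the whole file at once keeps the sketch short. For very large files, reading RECORD.size bytes at a time, as in the loop above, uses less memory.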
Conclusion
The struct module in Python is used for working with binary data, allowing you to pack and unpack data structures using format strings. By understanding format characters and byte order specifications, you can effectively handle binary data from various sources, such as files and network connections.