Pro.Zip - API for parsing Zip archives
Parsing a Zip Archive
from Pro.Core import *
from Pro.Zip import *

def parseZipArchive(fname):
    # open the file as a container
    c = createContainerFromFile(fname)
    if c.isNull():
        return
    # load and initialize the Zip object
    obj = ZipObject()
    if not obj.Load(c) or not obj.Initialize():
        return
    # walk the file entries
    n = obj.GetEntryCount()
    for i in range(n):
        entry = obj.GetEntry(i)
        print("name:", entry.Str("FileName"))
        print("compressed size:", entry.Uns("CompressedSize"))
        print("uncompressed size:", entry.Uns("UncompressedSize"))
        # extract data
        data = obj.Extract(entry)
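As a small variation on the example above, the following sketch totals the sizes of all entries instead of printing them. The helper name `archive_totals` is hypothetical; it assumes it receives a `ZipObject` that has already been loaded and initialized, and it uses only the `GetEntryCount()`, `GetEntry()` and `Uns()` calls shown in the example.

```python
def archive_totals(obj):
    """Sum the compressed and uncompressed sizes of all entries.

    obj is assumed to be a loaded and initialized Pro.Zip.ZipObject.
    """
    compressed = 0
    uncompressed = 0
    for i in range(obj.GetEntryCount()):
        entry = obj.GetEntry(i)
        # same header fields read by the example above
        compressed += entry.Uns("CompressedSize")
        uncompressed += entry.Uns("UncompressedSize")
    return compressed, uncompressed
```

For Zip64 entries the header fields may not hold the real sizes; in that case reading them through `GetFileHeader64Data()` is likely the safer route.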
Module API
Pro.Zip module API.
Classes:
CentralDirectoryData: This class contains the central directory data of a Zip archive.
FileHeader64Data: This class represents the file header for an individual entry within a Zip archive.
ZipObject: This class represents a Zip archive.
Attributes:
ZIP_MAX_FILE_ENTRIES: Maximum number of file entries in a Zip file.
- class CentralDirectoryData
This class contains the central directory data of a Zip archive.
See also ZipObject.GetCentralDirectoryData().
Attributes:
central_directory_disk: The disk number where the central directory starts.
disk: The number of the disk on which the central directory record of this archive is located.
offset: The offset of the start of the central directory on the disk on which the central directory starts.
size: The size of the central directory.
total_disk_entries: The total number of entries in the central directory on this disk.
total_entries: The total number of entries in the central directory for the entire Zip archive, across all disks.
- central_directory_disk
The disk number where the central directory starts. This is particularly relevant for Zip archives spanning multiple disks, allowing the system to locate the beginning of the central directory. In single-disk archives, this field is usually set to zero.
- disk
The number of the disk on which the central directory record of this archive is located. In the context of multi-disk Zip archives, this field is essential to identify the specific disk containing the central directory entry for a file. For single-disk Zip archives, this will typically be zero.
- offset
The offset of the start of the central directory on the disk on which the central directory starts. It is essential for locating the central directory in the archive, especially in the context of multi-disk or large archives where the central directory may not start at the beginning of the disk or file.
- size
The size of the central directory.
- total_disk_entries
The total number of entries in the central directory on this disk. For multi-disk archives, it indicates the number of entries within the central directory that are on the current disk. For a single-disk archive, it would typically match the total_entries field.
- total_entries
The total number of entries in the central directory for the entire Zip archive, across all disks. It provides a count of all individual entries, giving an overview of all the files and directories included in the Zip file.
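To illustrate how these attributes relate to each other, here is a minimal sketch that summarizes an already filled-in `CentralDirectoryData` instance. The helper name `describe_central_directory` and the `spans_disks` heuristic are assumptions made for illustration, based only on the "typically zero" and "typically match" remarks above; they are not part of the module.

```python
def describe_central_directory(data):
    """Summarize a CentralDirectoryData-like object.

    data is assumed to expose the attributes documented above.
    """
    # Heuristic only: single-disk archives usually have both disk numbers
    # set to zero and the same entry count per disk and overall.
    spans_disks = (
        data.disk != 0
        or data.central_directory_disk != 0
        or data.total_disk_entries != data.total_entries
    )
    return {
        "start": data.offset,
        "end": data.offset + data.size,  # first byte past the central directory
        "entries": data.total_entries,
        "spans_disks": spans_disks,
    }
```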
- class FileHeader64Data
This class represents the file header for an individual entry within a Zip archive.
See also ZipObject.GetFileHeader64Data().
Attributes:
CompressedSize: The size of the compressed file within the Zip archive.
Disk: The disk number on which this file entry is located, used in multi-disk Zip archives.
LocalHeaderOffset: The offset within the archive at which the local header of the file begins.
UncompressedSize: The size of the file after decompression.
- CompressedSize
The size of the compressed file within the Zip archive.
See also UncompressedSize.
- Disk
The disk number on which this file entry is located, used in multi-disk Zip archives.
- LocalHeaderOffset
The offset within the archive at which the local header of the file begins. The local header contains specific information about the file, including its name, compression method, etc.
- UncompressedSize
The size of the file after decompression.
See also CompressedSize.
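The two size attributes are often used together. The sketch below computes a compression ratio from a filled-in `FileHeader64Data` instance; the helper name `compression_ratio` is hypothetical and the function only reads the attributes documented above.

```python
def compression_ratio(data):
    """Return CompressedSize / UncompressedSize, or None for empty files.

    data is assumed to be a filled-in FileHeader64Data-like object.
    """
    if data.UncompressedSize == 0:
        # empty files and directory entries have no meaningful ratio
        return None
    return data.CompressedSize / data.UncompressedSize
```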
- ZIP_MAX_FILE_ENTRIES: Final[int]
Maximum number of file entries in a Zip file.
See also ZipObject.RetrieveEntries().
- class ZipObject
Bases: Pro.Core.CFFObject
This class represents a Zip archive.
Methods:
CentralDirectoryEnd(): Retrieves the structure of the central directory in both Zip32 and Zip64 archives.
CentralDirectoryEnd32(): Retrieves the structure of the central directory in a Zip32 archive.
CentralDirectoryEnd64(offset): Retrieves the structure of the central directory in a Zip64 archive.
CentralDirectoryEndLocator64(offset): Finds the locator of the central directory in a Zip64 archive.
CentralDirectoryEndOffset32(): Finds the central directory in a Zip32 archive.
CentralDirectoryOffset(): Finds the central directory in both Zip32 and Zip64 archives.
DumpExtraField(fh, out): Outputs the extra field of a file header as text.
DumpFileHeader(s, out): Outputs a file header as text.
DumpHeaders(out): Outputs the central directory data as text.
ExtraFieldBuffer(fh): Retrieves the extra field of a file header as a buffer.
Extract(fh): Extracts the data of a file, applying the necessary decompression.
ExtractTo(fh, dst): Extracts the data of a file to a specified container, applying the necessary decompression.
FileHeader(offset): Retrieves a file header structure from the offset of a file header.
FileHeaders(): Retrieves the array of file headers of the central directory.
FindEntry(name): Looks up a file entry by name.
GetCentralDirectoryData(data): Retrieves the data of the central directory.
GetCompressedData(fh): Retrieves the compressed data of a file entry.
GetCompressionMethodName(comptype): Returns the name of a compression method by its value.
GetEntries(): Retrieves the internally stored entries.
GetEntry(i): Retrieves a file entry.
GetEntryCount(): Returns the number of file entries.
GetEntryName(i): Retrieves the name of a file entry.
GetEntryOffset(i): Retrieves the offset of a file entry.
GetExtraFieldDescr(hid): Retrieves a description for an extra field.
GetFileHeader64Data(fh, data): Extracts the data from a file header structure.
HasEntry(name): Checks if a file entry exists.
RetrieveEntries([max_entries, parse_corrupted]): Finds the file entries in a Zip archive.
SetEntries(entries): Sets the internal file entries.
- CentralDirectoryEnd() → Pro.Core.CFFStruct
Retrieves the structure of the central directory in both Zip32 and Zip64 archives.
- Returns
Returns a valid structure if successful; otherwise returns an invalid structure.
- Return type
Pro.Core.CFFStruct
See also GetCentralDirectoryData().
- CentralDirectoryEnd32() → Pro.Core.CFFStruct
Retrieves the structure of the central directory in a Zip32 archive.
- Returns
Returns a valid structure if successful; otherwise returns an invalid structure.
- Return type
Pro.Core.CFFStruct
See also CentralDirectoryEnd().
- CentralDirectoryEnd64(offset: int) → Pro.Core.CFFStruct
Retrieves the structure of the central directory in a Zip64 archive.
- Parameters
offset (int) – The offset of the central directory.
- Returns
Returns a valid structure if successful; otherwise returns an invalid structure.
- Return type
Pro.Core.CFFStruct
See also CentralDirectoryEnd() and CentralDirectoryEndLocator64().
- CentralDirectoryEndLocator64(offset: int) → Pro.Core.CFFStruct
Finds the locator of the central directory in a Zip64 archive.
- Parameters
offset (int) – The start offset from where to search.
- Returns
Returns a valid structure if successful; otherwise returns an invalid structure.
- Return type
Pro.Core.CFFStruct
- CentralDirectoryEndOffset32() → int
Finds the central directory in a Zip32 archive.
- Returns
Returns the offset of the central directory if successful; otherwise returns Pro.Core.INVALID_STREAM_OFFSET.
- Return type
int
See also CentralDirectoryOffset().
- CentralDirectoryOffset() → int
Finds the central directory in both Zip32 and Zip64 archives.
- Returns
Returns the offset of the central directory if successful; otherwise returns Pro.Core.INVALID_STREAM_OFFSET.
- Return type
int
See also GetCentralDirectoryData().
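Because failure is signalled with a sentinel value rather than an exception, callers need to compare the result explicitly. A minimal sketch, assuming the caller passes `Pro.Core.INVALID_STREAM_OFFSET` as `invalid_offset` (the helper name and this parameter are hypothetical, introduced so the function does not depend on the import):

```python
def central_directory_offset_or_none(obj, invalid_offset):
    """Return the central directory offset, or None if it was not found.

    obj is assumed to be an initialized Pro.Zip.ZipObject;
    invalid_offset is assumed to be Pro.Core.INVALID_STREAM_OFFSET.
    """
    offset = obj.CentralDirectoryOffset()
    if offset == invalid_offset:
        return None
    return offset
```

Inside Cerbero Suite this would presumably be called as `central_directory_offset_or_none(obj, INVALID_STREAM_OFFSET)` after `from Pro.Core import *`.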
- DumpExtraField(fh: Pro.Core.CFFStruct, out: Pro.Core.NTTextStream) → bool
Outputs the extra field of a file header as text.
- Parameters
fh (CFFStruct) – The file header structure.
out (NTTextStream) – The output text stream.
- Returns
Returns True if successful; otherwise returns False.
- Return type
bool
See also DumpFileHeader().
- DumpFileHeader(s: Pro.Core.CFFStruct, out: Pro.Core.NTTextStream) → bool
Outputs a file header as text.
- Parameters
s (CFFStruct) – The file header structure.
out (NTTextStream) – The output text stream.
- Returns
Returns True if successful; otherwise returns False.
- Return type
bool
- DumpHeaders(out: Pro.Core.NTTextStream) → bool
Outputs the central directory data as text.
- Parameters
out (NTTextStream) – The output text stream.
- Returns
Returns True if successful; otherwise returns False.
- Return type
bool
See also DumpFileHeader().
- ExtraFieldBuffer(fh: Pro.Core.CFFStruct) → Pro.Core.NTContainerBuffer
Retrieves the extra field of a file header as a buffer.
- Parameters
fh (CFFStruct) – The file header structure.
- Returns
Returns a valid buffer instance if successful; otherwise returns an invalid Pro.Core.NTContainerBuffer instance.
- Return type
Pro.Core.NTContainerBuffer
- Extract(fh: Pro.Core.CFFStruct) → Pro.Core.NTContainer
Extracts the data of a file, applying the necessary decompression.
- Parameters
fh (CFFStruct) – The file header structure of the file to extract.
- Returns
Returns a valid container instance if successful; otherwise returns an invalid Pro.Core.NTContainer instance.
- Return type
Pro.Core.NTContainer
See also ExtractTo() and GetCompressedData().
- ExtractTo(fh: Pro.Core.CFFStruct, dst: Pro.Core.NTContainer) → bool
Extracts the data of a file to a specified container, applying the necessary decompression.
- Parameters
fh (CFFStruct) – The file header structure of the file to extract.
dst (NTContainer) – The container that receives the extracted data.
- Returns
Returns True if successful; otherwise returns False.
- Return type
bool
See also Extract() and GetCompressedData().
- FileHeader(offset: int) → Pro.Core.CFFStruct
Retrieves a file header structure from the offset of a file header.
Note
This method handles both local and non-local file headers.
- Parameters
offset (int) – The offset of the file header.
- Returns
Returns a valid structure if successful; otherwise returns an invalid structure.
- Return type
Pro.Core.CFFStruct
- FileHeaders() → Pro.Core.CFFStruct
Retrieves the array of file headers of the central directory.
- Returns
Returns a valid structure if successful; otherwise returns an invalid structure.
- Return type
Pro.Core.CFFStruct
- FindEntry(name: str) → Pro.Core.CFFStruct
Looks up a file entry by name.
- Parameters
name (str) – The name of the file entry.
- Returns
Returns a valid structure if the entry is found; otherwise returns an invalid structure.
- Return type
Pro.Core.CFFStruct
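FindEntry() combines naturally with Extract(): look up the header by name, then extract its data. The sketch below shows one way to chain the two while checking both failure cases. The helper name `extract_by_name` is hypothetical, and the use of `IsNull()` on the returned structure is an assumption about `Pro.Core.CFFStruct`; `isNull()` on containers follows the parsing example at the top of this page.

```python
def extract_by_name(obj, name):
    """Extract the entry called name, or return None on any failure.

    obj is assumed to be an initialized Pro.Zip.ZipObject.
    """
    fh = obj.FindEntry(name)
    if fh.IsNull():  # assumption: CFFStruct exposes IsNull()
        return None
    data = obj.Extract(fh)
    if data.isNull():  # an invalid container signals a failed extraction
        return None
    return data
```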
- GetCentralDirectoryData(data: Pro.Zip.CentralDirectoryData) → bool
Retrieves the data of the central directory.
- Parameters
data (CentralDirectoryData) – The structure of the data to be retrieved.
- Returns
Returns True if successful; otherwise returns False.
- Return type
bool
- GetCompressedData(fh: Pro.Core.CFFStruct) → Pro.Core.NTContainer
Retrieves the compressed data of a file entry.
- Parameters
fh (CFFStruct) – The file header structure.
- Returns
Returns a valid container instance if successful; otherwise returns an invalid Pro.Core.NTContainer instance.
- Return type
Pro.Core.NTContainer
See also Extract() and ExtractTo().
- GetCompressionMethodName(comptype: int) → str
Returns the name of a compression method by its value.
- Parameters
comptype (int) – The compression type.
- Returns
Returns the compression method name if successful; otherwise returns "Unknown".
- Return type
str
- GetEntries() → Pro.Core.NTContainer
Retrieves the internally stored entries.
- Returns
Returns the stored entries.
- Return type
Pro.Core.NTContainer
See also SetEntries() and RetrieveEntries().
- GetEntry(i: int) → Pro.Core.CFFStruct
Retrieves a file entry.
- Parameters
i (int) – The index of the file entry.
- Returns
Returns a valid file entry if successful; otherwise returns an invalid structure.
- Return type
Pro.Core.CFFStruct
See also GetEntryCount().
- GetEntryCount() → int
- Returns
Returns the number of file entries.
- Return type
int
See also GetEntry().
- GetEntryName(i: int) → str
Retrieves the name of a file entry.
- Parameters
i (int) – The index of the file entry.
- Returns
Returns the name if successful; otherwise returns an empty string.
- Return type
str
See also GetEntry().
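When only the names are needed, GetEntryCount() and GetEntryName() are enough and no header structure has to be retrieved. A minimal sketch, with the hypothetical helper name `entry_names`, that skips indexes for which the name lookup fails:

```python
def entry_names(obj):
    """Return the names of all entries for which a name could be retrieved.

    obj is assumed to be an initialized Pro.Zip.ZipObject.
    """
    names = []
    for i in range(obj.GetEntryCount()):
        name = obj.GetEntryName(i)
        if name:  # an empty string signals a failed lookup
            names.append(name)
    return names
```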
- GetEntryOffset(i: int) → int
Retrieves the offset of a file entry.
- Parameters
i (int) – The index of the file entry.
- Returns
Returns the offset of the file entry if successful; otherwise returns Pro.Core.INVALID_STREAM_OFFSET.
- Return type
int
See also GetEntry().
- GetExtraFieldDescr(hid: int) → str
Retrieves a description for an extra field.
- Parameters
hid (int) – The id of the extra field.
- Returns
Returns a description if successful; otherwise returns an empty string.
- Return type
str
- GetFileHeader64Data(fh: Pro.Core.CFFStruct, data: Pro.Zip.FileHeader64Data) → tuple
Extracts the data from a file header structure.
- Parameters
fh (CFFStruct) – The file header structure.
data (FileHeader64Data) – The structure used for extracting the data.
- Returns
Returns a tuple containing two booleans. The first boolean is the result of the operation; the second is True for a Zip64 file header structure and False for Zip32.
- Return type
tuple[bool, bool]
See also GetEntry().
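The two-element return value is easy to misread, so a short sketch may help. It unpacks the tuple and reads the sizes from the `data` argument, which is assumed to be filled in by the call. The helper name `read_sizes` is hypothetical.

```python
def read_sizes(obj, fh, data):
    """Read entry sizes through GetFileHeader64Data().

    obj is assumed to be an initialized Pro.Zip.ZipObject, fh a file header
    structure (for example from GetEntry()), and data a FileHeader64Data
    instance that the call is assumed to fill in.
    """
    ok, is_zip64 = obj.GetFileHeader64Data(fh, data)
    if not ok:
        return None
    return {
        "compressed": data.CompressedSize,
        "uncompressed": data.UncompressedSize,
        "zip64": is_zip64,
    }
```

Inside Cerbero Suite a call might look like `read_sizes(obj, obj.GetEntry(0), FileHeader64Data())`.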
- HasEntry(name: str) → bool
Checks if a file entry exists.
Hint
Internally this method calls FindEntry().
- Parameters
name (str) – The name of the file entry.
- Returns
Returns True if the entry exists; otherwise returns False.
- Return type
bool
See also FindEntry().
- RetrieveEntries(max_entries: int = ZIP_MAX_FILE_ENTRIES, parse_corrupted: bool = False) → Pro.Core.NTContainer
Finds the file entries in a Zip archive.
- Parameters
max_entries (int) – The maximum number of file entries to collect.
parse_corrupted (bool) – If True, it tries to parse corrupted archives.
- Returns
Returns the collected file entries.
- Return type
Pro.Core.NTContainer
See also SetEntries() and GetEntries().
- SetEntries(entries: Pro.Core.NTContainer) → None
Sets the internal file entries.
- Parameters
entries (NTContainer) – The file entries.
See also RetrieveEntries() and GetEntries().
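RetrieveEntries() and SetEntries() can presumably be combined to re-scan an archive, for instance with `parse_corrupted` enabled after a normal parse found nothing. The sketch below is written under that assumption; in particular, that GetEntryCount() reflects the entries passed to SetEntries() is not stated in this reference. The helper name `reload_entries` is hypothetical.

```python
def reload_entries(obj, max_entries, parse_corrupted=False):
    """Re-collect the file entries and store them in the object.

    obj is assumed to be a loaded Pro.Zip.ZipObject; max_entries would
    typically be Pro.Zip.ZIP_MAX_FILE_ENTRIES.
    """
    entries = obj.RetrieveEntries(max_entries, parse_corrupted)
    obj.SetEntries(entries)
    # assumption: the entry count now reflects the stored entries
    return obj.GetEntryCount()
```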