Pkg.FileMiner - API for carving files
Overview
The Pkg.FileMiner module contains the API for carving files.
Carving Files
The following code example demonstrates how to find embedded files:
from Pro.Core import *
from Pro.UI import *
from Pkg.FileMiner import *

def callback(match, ud):
    # Called once per carved file; returning None lets the search continue.
    print("MATCH:", match.format, "offset:", hex(match.offset), "size:", hex(match.size))

def main():
    c = createContainerFromFile("path/to/file")
    fm = FileMiner()
    # Create a wait object for the carving operation.
    wo = proContext().startWait("Carving...")
    fm.mine(c, callback=callback, wait_object=wo)
    wo.stop()
Custom Handlers
FileMiner allows installing custom handlers for file formats that are not supported internally.
To create a custom handler, add an entry to the ‘file_miner.cfg’ configuration file.
This is the entry installed by the ‘RAR Format’ package:
[RAR Archive]
group = arch
format = RAR
speed = 4
patterns = 52 61 72 21 1A 07 00 || 52 61 72 21 1A 07 01 00
file = Pkg.RAR.FM.py
handler = RARHandler
‘handler’ specifies the name of the class in ‘file’ that verifies a match of any of the patterns listed in ‘patterns’.
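To make the ‘patterns’ value concrete, the following is a minimal sketch of how such a value could be turned into byte signatures and searched in a buffer. It assumes, based only on the entry shown above, that alternatives are separated by ‘||’ and that each alternative is a sequence of hexadecimal bytes. The functions parse_patterns and first_match are hypothetical helpers written for this illustration; they are not part of Pkg.FileMiner and do not describe how FileMiner parses its configuration.

```python
from typing import List, Optional, Tuple

def parse_patterns(value: str) -> List[bytes]:
    """Split a 'patterns' value on '||' and decode each run of hex bytes."""
    patterns = []
    for part in value.split("||"):
        part = part.strip()
        if part:
            # bytes.fromhex ignores the spaces between byte pairs.
            patterns.append(bytes.fromhex(part))
    return patterns

def first_match(data: bytes, patterns: List[bytes]) -> Optional[Tuple[int, int]]:
    """Return (offset, pattern_index) of the earliest match, or None."""
    best = None
    for index, pattern in enumerate(patterns):
        pos = data.find(pattern)
        if pos != -1 and (best is None or pos < best[0]):
            best = (pos, index)
    return best

# The value copied from the [RAR Archive] entry.
RAR_PATTERNS = parse_patterns("52 61 72 21 1A 07 00 || 52 61 72 21 1A 07 01 00")

# A buffer with 16 bytes of padding followed by the second signature.
sample = b"\x00" * 16 + b"Rar!\x1a\x07\x01\x00" + b"data"
hit = first_match(sample, RAR_PATTERNS)
```

A signature match of this kind only marks a candidate offset; the handler class is what decides whether a file is actually present there.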
The following is the file handler for the RAR format:
from Pro.Core import CFFInitParams
from Pkg.FileMiner import *
from .Core import RARObject

class RARHandler(FileMinerHandler):

    def validate(self, args):
        try:
            obj = RARObject()
            if not obj.Load(args.getSubStream(args.offset)):
                return None
            opts = CFFInitParams(args.wait_object)
            if not obj.Initialize():
                return None
            size = obj.GetEndOffset()
        except:
            return None
        return FileMinerMatch(args.offset, size)
Module API
Pkg.FileMiner module API.
Classes:
FileMiner(*, groups_excl, handlers_excl) – This class provides the capability to find embedded files.
FileMinerHandler – Base class for file handlers.
FileMinerHandlerArgs – This class represents the argument parameter passed to the ‘validate’ method to verify the match of a signature.
FileMinerMatch – This class represents a matched file.
- class FileMiner(*, groups_excl: Optional[Set[str]] = None, handlers_excl: Optional[Set[str]] = None)
This class provides the capability to find embedded files.
- Parameters
groups_excl (Set[str]) – Optional set of file groups to exclude.
handlers_excl (Set[str]) – Optional set of handlers to exclude.
See also
mine().
Methods:
mine(stream, *[, start_offset, length, …]) – Searches the provided data for embedded files.
- mine(stream: Pro.Core.NTContainer, *, start_offset: int = 0, length: int = -1, callback: Optional[Callable[[Pkg.FileMiner.FileMinerMatch, Any], Any]] = None, user_data: Optional[Any] = None, wait_object: Optional[Pro.Core.NTIWait] = None, ignore_offset_zero: bool = False, verbose: bool = False) → Any
Searches the provided data for embedded files.
- Parameters
stream (NTContainer) – The data to be searched.
start_offset (int) – The start offset for the search.
length (int) – If specified, sets the size of the data to be searched.
callback (Optional[Callable[[FileMinerMatch, Any], Any]]) – Optional callback for each matched file. The callback function receives as parameters the match (FileMinerMatch) and the specified user data. If the return value is different than None, it will stop the search and the returned value is used as return value of this method.
user_data (Any) – Optional user data passed to the callback.
wait_object (Optional[NTIWait]) – Optional wait object.
ignore_offset_zero (bool) – If True, ignores matches at the start offset.
verbose (bool) – If True, enables verbose output.
- Returns
Returns the value returned by the callback if different than None; otherwise returns None.
- Return type
Any
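The callback contract of mine() can be illustrated with a callback that records matches in the user data and stops the search after a given number of hits. This is a sketch under stated assumptions: the attributes format, offset and size are the ones used in the carving example above, while collect_matches, the layout of the state dictionary and the stand-in match objects are hypothetical and exist only so that the snippet runs outside of the application.

```python
from types import SimpleNamespace

def collect_matches(match, ud):
    """Record each match in ud and stop once ud['limit'] matches were seen."""
    ud["matches"].append((match.format, match.offset, match.size))
    if len(ud["matches"]) >= ud["limit"]:
        # A value other than None asks the search to stop.
        return len(ud["matches"])
    # None lets the search continue.
    return None

# With a real container this would be passed as:
#   fm.mine(c, callback=collect_matches, user_data=state)
# Here the callback is driven by hand with stand-in match objects.
state = {"matches": [], "limit": 2}
fake_matches = [
    SimpleNamespace(format="RAR", offset=0x100, size=0x40),
    SimpleNamespace(format="ZIP", offset=0x200, size=0x80),
    SimpleNamespace(format="PDF", offset=0x300, size=0x20),
]

result = None
for m in fake_matches:
    result = collect_matches(m, state)
    if result is not None:
        break
```

Passing the state through user_data keeps the callback free of global variables, and the non-None return value doubles as the result reported to the caller.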
- class FileMinerHandler
Base class for file handlers.
Methods:
validate(args) – This method must be overridden by derived classes to verify a matched pattern.
- validate(args: Pkg.FileMiner.FileMinerHandlerArgs) → Optional[Pkg.FileMiner.FileMinerMatch]
This method must be overridden by derived classes to verify a matched pattern.
- Parameters
args (FileMinerHandlerArgs) – The arguments used to verify the matched pattern.
- Returns
Returns an instance of FileMinerMatch if the matched pattern identified a file; otherwise returns None.
- Return type
Optional[FileMinerMatch]
See also
FileMinerMatch and FileMinerHandlerArgs.
- class FileMinerHandlerArgs
This class represents the argument parameter passed to the ‘validate’ method to verify the match of a signature.
Methods:
find(callback, ud, offset, patterns, *[, length]) – Searches a pattern or list of patterns.
findBackwards(callback, ud, offset, patterns, *) – Backward-searches a pattern or list of patterns.
getSubStream(offset[, size]) – Retrieves a sub-section of stream.
Attributes:
The offset of the matched pattern.
The matched pattern.
The stream being mined.
The size of the stream being mined.
If True, enables verbose output.
The wait object for the operation.
- find(callback: Callable[[int, int, Any], Any], ud: Any, offset: int, patterns: Union[bytes, Pro.Core.NTByteArrayList], *, length: int = -1) → Any
Searches a pattern or list of patterns.
- Parameters
callback (Callable[[int, int, Any], Any]) – The callback function receives as parameters the offset of the match, the index of the matched pattern and the specified user data. If the return value is different than None, it will stop the search and the returned value is used as return value of this method.
ud (Any) – The user data that is passed to the callback.
offset (int) – The offset at which to start the search.
patterns (Union[bytes, NTByteArrayList]) – Either the pattern to search or the list of patterns to search.
length (int) – If specified, sets the size of the data to be searched.
- Returns
Returns the value returned by the callback if different than None; otherwise returns None.
- Return type
Any
See also
findBackwards().
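The callback contract described for find() can be sketched with a pure-Python stand-in that works on a bytes object. The function find_in_bytes below is hypothetical and is not how FileMinerHandlerArgs is implemented; in particular, reporting matches in ascending offset order across all patterns is an assumption of this sketch and is not stated by the documentation.

```python
def find_in_bytes(data, callback, ud, offset, patterns, *, length=-1):
    """Emulate the documented contract: call callback(offset, index, ud)
    for each match and stop as soon as it returns a value other than None."""
    if isinstance(patterns, (bytes, bytearray)):
        patterns = [bytes(patterns)]  # a single pattern has index 0
    end = len(data) if length < 0 else min(len(data), offset + length)
    hits = []
    for index, pattern in enumerate(patterns):
        pos = data.find(pattern, offset, end)
        while pos != -1:
            hits.append((pos, index))
            pos = data.find(pattern, pos + 1, end)
    # Assumption: matches are reported in ascending offset order.
    for pos, index in sorted(hits):
        result = callback(pos, index, ud)
        if result is not None:
            return result
    return None

def stop_at_first(match_offset, pattern_index, ud):
    ud.append((match_offset, pattern_index))
    return match_offset  # non-None: stop and return this value

def log_all(match_offset, pattern_index, ud):
    ud.append((match_offset, pattern_index))
    return None  # None: keep searching

data = b"..AB....CD..AB"
seen = []
found = find_in_bytes(data, stop_at_first, seen, 0, [b"AB", b"CD"])
all_hits = []
nothing = find_in_bytes(data, log_all, all_hits, 0, [b"AB", b"CD"])
```

In a handler, the same callback shape would be passed to args.find() together with the offset at which to start the search.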
- findBackwards(callback: Callable[[int, int, Any], Any], ud: Any, offset: int, patterns: Union[bytes, Pro.Core.NTByteArrayList], *, length: int = -1) → Any
Backward-searches a pattern or list of patterns.
- Parameters
callback (Callable[[int, int, Any], Any]) – The callback function receives as parameters the offset of the match, the index of the matched pattern and the specified user data. If the return value is different than None, it will stop the search and the returned value is used as return value of this method.
ud (Any) – The user data that is passed to the callback.
offset (int) – The offset at which to start the search.
patterns (Union[bytes, NTByteArrayList]) – Either the pattern to search or the list of patterns to search.
length (int) – If specified, sets the size of the data to be searched.
- Returns
Returns the value returned by the callback if different than None; otherwise returns None.
- Return type
Any
See also
find().
- getSubStream(offset: int, size: int = -1) → Pro.Core.NTContainer
Retrieves a sub-section of stream.
- Parameters
offset (int) – The offset of the sub-section.
size (int) – The size of the sub-section. If not specified, returns the remaining available data.
- Returns
Returns the sub-section if successful; otherwise returns an invalid container.
- Return type
NTContainer
- stream
The stream being mined.
See also
Pro.Core.NTContainer and stream_size.
- stream_size
The size of the stream being mined.
- verbose
If True, enables verbose output.
- wait_object
The wait object for the operation.