Pkg.FileMiner — API for carving files

Overview

The Pkg.FileMiner module contains the API for carving files.

Carving Files

The following code example demonstrates how to find embedded files:

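from Pro.Core import *
from Pro.UI import *
from Pkg.FileMiner import *

def callback(match, ud):
    # Called once per matched file; returning None lets the search continue
    print("MATCH:", match.format, "offset:", hex(match.offset), "size:", hex(match.size))

def main():
    c = createContainerFromFile("path/to/file")
    fm = FileMiner()
    # Create the wait object that is passed to the search
    wo = proContext().startWait("Carving...")
    try:
        fm.mine(c, callback=callback, wait_object=wo)
    finally:
        # Always stop the wait object, even if the search raises
        wo.stop()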

Custom Handlers

FileMiner allows custom handlers to be installed for file formats that are not supported internally.

To create a custom handler, add an entry for it to the ‘file_miner.cfg’ configuration file.

This is the entry installed by the ‘RAR Format’ package:

[RAR Archive]
group = arch
format = RAR
speed = 4
patterns = 52 61 72 21 1A 07 00 || 52 61 72 21 1A 07 01 00
file = Pkg.RAR.FM.py
handler = RARHandler

‘patterns’ lists the signatures to search for, separated by ‘||’. ‘handler’ specifies the name of the class in ‘file’ that verifies a match for any of those signatures.
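To make the ‘patterns’ syntax concrete, the following minimal sketch reads the entry above with Python's standard configparser module and decodes each signature into bytes. It assumes that the entry can be read as INI-style text; parse_patterns is a hypothetical helper written for this illustration and is not part of Pkg.FileMiner, whose own parsing of the file is not described here.

```python
import configparser
from typing import List

# The entry shown above, copied verbatim
ENTRY = """
[RAR Archive]
group = arch
format = RAR
speed = 4
patterns = 52 61 72 21 1A 07 00 || 52 61 72 21 1A 07 01 00
file = Pkg.RAR.FM.py
handler = RARHandler
"""

def parse_patterns(value: str) -> List[bytes]:
    """Hypothetical helper: split on '||' and decode each hex byte sequence."""
    return [bytes.fromhex(part) for part in value.split("||") if part.strip()]

cfg = configparser.ConfigParser()
cfg.read_string(ENTRY)
signatures = parse_patterns(cfg["RAR Archive"]["patterns"])
for sig in signatures:
    print(sig)
```

Both signatures begin with the ASCII bytes "Rar!", which is why a single handler can verify either of them.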

The following is the file handler for the RAR format:

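from Pro.Core import CFFInitParams
from Pkg.FileMiner import *

from .Core import RARObject

class RARHandler(FileMinerHandler):

    def validate(self, args):
        try:
            obj = RARObject()
            # Load the data starting at the offset of the matched signature
            if not obj.Load(args.getSubStream(args.offset)):
                return None
            opts = CFFInitParams(args.wait_object)
            if not obj.Initialize():
                return None
            # The sub-stream starts at the match, so its end offset is the file size
            size = obj.GetEndOffset()
        except Exception:
            # Any failure while parsing means the signature was a false positive
            return None
        return FileMinerMatch(args.offset, size)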

Module API

Pkg.FileMiner module API.

Classes:

FileMiner(*, groups_excl, handlers_excl)

This class provides the capability to find embedded files.

FileMinerHandler()

Base class for file handlers.

FileMinerHandlerArgs()

This class represents the argument passed to the ‘validate’ method to verify a signature match.

FileMinerMatch()

This class represents a matched file.

class FileMiner(*, groups_excl: Optional[Set[str]] = None, handlers_excl: Optional[Set[str]] = None)

This class provides the capability to find embedded files.

Parameters
  • groups_excl (Set[str]) – Optional set of file groups to exclude.

  • handlers_excl (Set[str]) – Optional set of handlers to exclude.

See also mine().

Methods:

mine(stream, *[, start_offset, length, …])

Searches the provided data for embedded files.

mine(stream: Pro.Core.NTContainer, *, start_offset: int = 0, length: int = -1, callback: Optional[Callable[[Pkg.FileMiner.FileMinerMatch, Any], Any]] = None, user_data: Optional[Any] = None, wait_object: Optional[Pro.Core.NTIWait] = None, ignore_offset_zero: bool = False, verbose: bool = False) → Any

Searches the provided data for embedded files.

Parameters
  • stream (NTContainer) – The data to be searched.

  • start_offset (int) – The start offset for the search.

  • length (int) – If specified, sets the size of the data to be searched.

  • callback (Optional[Callable[[FileMinerMatch, Any], Any]]) – Optional callback for each matched file. The callback function receives as parameters the match (FileMinerMatch) and the specified user data. If the return value is different than None, it will stop the search and the returned value is used as return value of this method.

  • user_data (Any) – Optional user data passed to the callback.

  • wait_object (Optional[NTIWait]) – Optional wait object.

  • ignore_offset_zero (bool) – If True, ignores matches at the start offset.

  • verbose (bool) – If True, enables verbose output.

Returns

Returns the value returned by the callback if different than None; otherwise returns None.

Return type

Any
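The callback contract described above (a return value other than None stops the search and becomes the return value of mine()) can be sketched without the SDK. In the block below, drive() is a hypothetical stand-in for the search loop, the match objects are plain namespaces carrying the documented format, offset and size attributes, and MAX_MATCHES is an arbitrary cap chosen for the illustration. None of this is FileMiner's actual implementation.

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterable, List, Optional

MAX_MATCHES = 2  # hypothetical cap, chosen only for this illustration

def collect(match: Any, ud: List[tuple]) -> Optional[List[tuple]]:
    """Callback in the shape mine() expects: (match, user_data)."""
    ud.append((match.format, match.offset, match.size))
    # A value other than None stops the search and becomes the result
    return ud if len(ud) >= MAX_MATCHES else None

def drive(matches: Iterable[Any], callback: Callable[[Any, Any], Any],
          user_data: Any = None) -> Any:
    """Hypothetical stand-in for the search loop, following the documented contract."""
    for m in matches:
        ret = callback(m, user_data)
        if ret is not None:
            return ret
    return None

# Fake matches standing in for FileMinerMatch instances
fake = [
    SimpleNamespace(format="ZIP", offset=0x10, size=0x200),
    SimpleNamespace(format="PNG", offset=0x400, size=0x80),
    SimpleNamespace(format="RAR", offset=0x900, size=0x40),
]
found: List[tuple] = []
result = drive(fake, collect, found)
```

With the real API, the same callback would be passed as fm.mine(c, callback=collect, user_data=found); the third fake match is never reported because the callback stops the search after the second.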

class FileMinerHandler

Base class for file handlers.

Methods:

validate(args)

This method must be overridden by derived classes to verify a matched pattern.

validate(args: Pkg.FileMiner.FileMinerHandlerArgs) → Optional[Pkg.FileMiner.FileMinerMatch]

This method must be overridden by derived classes to verify a matched pattern.

Parameters

args (FileMinerHandlerArgs) – The arguments used to verify the matched pattern.

Returns

Returns an instance of FileMinerMatch if the matched pattern identified a file; otherwise returns None.

Return type

Optional[FileMinerMatch]

See also FileMinerMatch and FileMinerHandlerArgs.
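As a rough sketch of how a validate override might be organized, the block below keeps the format check in a plain function and lets the handler only fetch the header and build the match. The ‘DEMO’ format (a 4-byte magic followed by a 32-bit little-endian total size) is invented for the illustration. The call to read(offset, size) on the container returned by getSubStream is an assumption about Pro.Core.NTContainer, and the fallback classes exist only so that the sketch can run outside Cerbero Suite.

```python
import struct
from typing import Optional

try:
    from Pkg.FileMiner import FileMinerHandler, FileMinerMatch
except ImportError:
    # Minimal stand-ins so the sketch also runs outside Cerbero Suite
    class FileMinerHandler:
        pass

    class FileMinerMatch:
        def __init__(self, offset, size):
            self.offset = offset
            self.size = size

MAGIC = b"DEMO"   # hypothetical 4-byte signature
HEADER_SIZE = 8   # magic + 32-bit little-endian total size

def parse_total_size(header: bytes) -> Optional[int]:
    """Return the declared file size, or None if the header is implausible."""
    if len(header) < HEADER_SIZE or header[:4] != MAGIC:
        return None
    (total,) = struct.unpack_from("<I", header, 4)
    return total if total >= HEADER_SIZE else None

class DemoHandler(FileMinerHandler):

    def validate(self, args):
        # Assumption: NTContainer.read(offset, size) returns bytes
        header = args.getSubStream(args.offset, HEADER_SIZE).read(0, HEADER_SIZE)
        total = parse_total_size(header)
        # Reject sizes that would run past the end of the mined stream
        if total is None or args.offset + total > args.stream_size:
            return None
        return FileMinerMatch(args.offset, total)
```

Keeping the parsing in a separate function makes the false-positive checks easy to exercise on raw bytes before the handler is registered in the configuration file.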

class FileMinerHandlerArgs

This class represents the argument passed to the ‘validate’ method to verify a signature match.

Methods:

find(callback, ud, offset, patterns, *[, length])

Searches a pattern or list of patterns.

findBackwards(callback, ud, offset, patterns, *[, length])

Backward-searches a pattern or list of patterns.

getSubStream(offset[, size])

Retrieves a sub-section of the stream.

Attributes:

offset

The offset of the matched pattern.

pattern

The matched pattern.

stream

The stream being mined.

stream_size

The size of the stream being mined.

verbose

If True, enables verbose output.

wait_object

The wait object for the operation.

find(callback: Callable[[int, int, Any], Any], ud: Any, offset: int, patterns: Union[bytes, Pro.Core.NTByteArrayList], *, length: int = -1) → Any

Searches a pattern or list of patterns.

Parameters
  • callback (Callable[[int, int, Any], Any]) – The callback function receives as parameters the offset of the match, the index of the matched pattern and the specified user data. If the return value is different than None, it will stop the search and the returned value is used as return value of this method.

  • ud (Any) – The user data that is passed to the callback.

  • offset (int) – The offset at which to start the search.

  • patterns (Union[bytes, NTByteArrayList]) – Either the pattern to search or the list of patterns to search.

  • length (int) – If specified, sets the size of the data to be searched.

Returns

Returns the value returned by the callback if different than None; otherwise returns None.

Return type

Any

See also findBackwards().
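The callback parameters of find() (match offset, index of the matched pattern, user data) can be illustrated with a pure-Python stand-in. In the sketch below, scan() is a hypothetical forward search over a bytes object that follows the documented contract; it is not the SDK implementation, and the marker values are invented. The callback uses the pattern index to work out where the matched marker ends.

```python
from typing import Any, Callable, Optional, Sequence

def end_of_first_hit(offset: int, index: int, ud: Sequence[bytes]) -> Optional[int]:
    """find()-style callback: (match offset, pattern index, user data)."""
    # ud carries the pattern list, so the index tells which marker matched
    return offset + len(ud[index])

def scan(data: bytes, patterns: Sequence[bytes],
         callback: Callable[[int, int, Any], Any], ud: Any,
         offset: int = 0, length: int = -1) -> Any:
    """Hypothetical forward search that follows the documented callback contract."""
    end = len(data) if length < 0 else min(len(data), offset + length)
    for pos in range(offset, end):
        for i, pat in enumerate(patterns):
            if pos + len(pat) <= end and data.startswith(pat, pos):
                ret = callback(pos, i, ud)
                if ret is not None:
                    return ret  # a value other than None stops the search
    return None

markers = [b"END1", b"END2"]  # invented end-of-file markers
data = b"xxEND2yyEND1"
stop = scan(data, markers, end_of_first_hit, markers)
print(stop)
```

A handler could use the same pattern to locate a trailer and derive the size of the match from the returned offset.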

findBackwards(callback: Callable[[int, int, Any], Any], ud: Any, offset: int, patterns: Union[bytes, Pro.Core.NTByteArrayList], *, length: int = -1) → Any

Backward-searches a pattern or list of patterns.

Parameters
  • callback (Callable[[int, int, Any], Any]) – The callback function receives as parameters the offset of the match, the index of the matched pattern and the specified user data. If the return value is different than None, it will stop the search and the returned value is used as return value of this method.

  • ud (Any) – The user data that is passed to the callback.

  • offset (int) – The offset at which to start the search.

  • patterns (Union[bytes, NTByteArrayList]) – Either the pattern to search or the list of patterns to search.

  • length (int) – If specified, sets the size of the data to be searched.

Returns

Returns the value returned by the callback if different than None; otherwise returns None.

Return type

Any

See also find().

getSubStream(offset: int, size: int = -1) → Pro.Core.NTContainer

Retrieves a sub-section of the stream.

Parameters
  • offset (int) – The offset of the sub-section.

  • size (int) – The size of the sub-section. If not specified, returns the remaining available data.

Returns

Returns the sub-section if successful; otherwise returns an invalid container.

Return type

NTContainer

offset

The offset of the matched pattern.

See also pattern.

pattern

The matched pattern.

See also offset.

stream

The stream being mined.

See also Pro.Core.NTContainer and stream_size.

stream_size

The size of the stream being mined.

verbose

If True, enables verbose output.

wait_object

The wait object for the operation.

class FileMinerMatch

This class represents a matched file.

Attributes:

format

The format of the matched file.

offset

The offset of the matched file.

size

The size of the matched file.

format

The format of the matched file.

offset

The offset of the matched file.

size

The size of the matched file.