Fingerprinter

The fingerprinter module lets you compute multiple hashes in a single pass over a file. It is intended to be used as follows:

>>> import hashlib
>>> from signify.fingerprinter import Fingerprinter
>>> with open("file", "rb") as file_obj:
...     fingerprinter = Fingerprinter(file_obj)
...     fingerprinter.add_hashers(hashlib.sha1, hashlib.sha256)
...     print(fingerprinter.hash())

{"sha1": ..., "sha256": ...}
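
To illustrate what "multiple hashes in a single run" means, here is a minimal sketch that uses only the standard library. It is not the module's implementation; the helper name multi_hash is hypothetical, and the block_size default merely mirrors the documented constructor default.

```python
import hashlib
import io


def multi_hash(file_obj, hashers, block_size=1000000):
    """Read file_obj once, feeding every block to all hashers.

    Hypothetical helper for illustration; returns {hash name: digest}.
    """
    states = [hasher() for hasher in hashers]
    while True:
        block = file_obj.read(block_size)
        if not block:
            break
        # Each block is read from the file once and shared by all hashers.
        for state in states:
            state.update(block)
    return {state.name: state.digest() for state in states}


data = b"example data" * 1000
result = multi_hash(io.BytesIO(data), [hashlib.sha1, hashlib.sha256], block_size=4096)
```

The benefit over calling each hash function separately is that the file is read only once, regardless of how many hashers are registered.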

You can also use it to calculate Authenticode hashes:

>>> from signify.fingerprinter import AuthenticodeFingerprinter
>>> with open("file", "rb") as file_obj:
...     fingerprinter = AuthenticodeFingerprinter(file_obj)
...     fingerprinter.add_authenticode_hashers(hashlib.sha1, hashlib.sha256)
...     print(fingerprinter.hash())

{"sha1": ..., "sha256": ...}

You can also combine both kinds of hashers so that the file only has to be read once:

>>> with open("file", "rb") as file_obj:
...     fingerprinter = AuthenticodeFingerprinter(file_obj)
...     fingerprinter.add_hashers(hashlib.sha1, hashlib.sha256)
...     fingerprinter.add_authenticode_hashers(hashlib.sha1, hashlib.sha256)
...     print(fingerprinter.hashes())

{"generic": {"sha1": ..., "sha256": ...},
 "authentihash": {"sha1": ..., "sha256": ...}}

You probably only need access to these classes:

class signify.fingerprinter.Fingerprinter(file_obj: BinaryIO, block_size: int = 1000000)

A Fingerprinter is an interface to generate hashes of (parts of) a file.

It is passed a file object and given a set of Fingers that define how the file must be hashed. It is a generic approach to hashing only selected parts of a file, leaving the rest out.

Parameters:
  • file_obj – A file opened in bytes-mode

  • block_size – The block size used to feed to the hashers.

add_hashers(*hashers: HashFunction, ranges: list[Range] | None = None, description: str = 'generic') None

Add hash methods to the fingerprinter.

Parameters:
  • hashers – A list of hashers to add to the Fingerprinter. This generally will be hashlib functions.

  • ranges – A list of Range objects that the hashers should hash. If set to None, it is set to the entire file.

  • description – The name for this group of hashers. This name is used as the key in the result of hashes().
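
The effect of the ranges parameter can be sketched with the standard library alone. The Range stand-in below is a hypothetical namedtuple, not the class from this module, and it assumes that end is exclusive (as the description of Finger.consume() suggests). The helper hash_ranges is likewise made up for illustration.

```python
import hashlib
import io
from collections import namedtuple

# Hypothetical stand-in for the module's Range; end is assumed to be exclusive.
Range = namedtuple("Range", "start end")


def hash_ranges(file_obj, hasher, ranges):
    """Hash only the given byte ranges of file_obj, in file order."""
    state = hasher()
    for current in sorted(ranges):
        file_obj.seek(current.start)
        state.update(file_obj.read(current.end - current.start))
    return state.digest()


data = bytes(range(256))
digest = hash_ranges(
    io.BytesIO(data),
    hashlib.sha256,
    [Range(0, 16), Range(32, 64)],
)
```

Under these assumptions, the digest equals the hash of the concatenated selected slices; bytes 16 to 31 and everything from 64 onward never reach the hasher.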

hash() dict[str, bytes]

Similar to hashes(), but returns a single dict mapping hash names to digests.

This method can only be called when the add_hashers() method was called exactly once.

hashes() dict[str, dict[str, bytes]]

Finalizing function for the Fingerprinter class.

This method applies all the different hash functions over the previously specified different ranges of the input file, and computes the resulting hashes.

After calling this function, the state of the object is reset to its initial state, with no fingers defined.

Returns:

A dict of dicts, the outer dict being a mapping of the description (as set in add_hashers()) and the inner dict being a mapping of hasher name to digest.
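
The shape of the returned value can be shown with a small standard-library sketch. The registrations table and the compute_hashes helper are hypothetical and only mimic the documented result structure; they say nothing about how the class works internally.

```python
import hashlib

data = b"hello world"

# Hypothetical registrations: description -> (hash constructors, part of the data).
registrations = {
    "generic": ([hashlib.sha1, hashlib.sha256], slice(0, None)),
    "first_five": ([hashlib.sha512], slice(0, 5)),
}


def compute_hashes(data, registrations):
    """Build {description: {hasher name: digest}} for the given registrations."""
    results = {}
    for description, (hashers, part) in registrations.items():
        states = [hasher(data[part]) for hasher in hashers]
        results[description] = {state.name: state.digest() for state in states}
    return results


hashes = compute_hashes(data, registrations)
```

The digests are raw bytes, matching the documented return type; call .hex() on a digest to get the familiar hexadecimal form.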

Raises:

RuntimeError – when internal inconsistencies occur.

class signify.fingerprinter.AuthenticodeFingerprinter(file_obj: BinaryIO, block_size: int = 1000000)

An extension of the Fingerprinter class that enables the calculation of authentihashes of PE files.

Like Fingerprinter, it is an interface to generate hashes of (parts of) a file.

It is passed a file object and given a set of Fingers that define how the file must be hashed. It is a generic approach to hashing only selected parts of a file, leaving the rest out.

Parameters:
  • file_obj – A file opened in bytes-mode

  • block_size – The block size used to feed to the hashers.

add_authenticode_hashers(*hashers: HashFunction) bool

A specialized version of add_hashers() that adds hashers with ranges limited to those needed to calculate the hash of signed PE files.
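
As a rough sketch of which ranges are involved: an Authenticode hash skips the checksum field of the optional header, the Certificate Table entry of the data directory, and the certificate table itself. The offsets below follow the published PE format. The function authenticode_ranges and the Range stand-in are hypothetical, the input is a synthetic buffer rather than a real executable, and how this module actually derives its ranges is not shown here and may differ.

```python
import struct
from collections import namedtuple

# Hypothetical stand-in for the module's Range; end is assumed to be exclusive.
Range = namedtuple("Range", "start end")


def authenticode_ranges(pe):
    """Return the ranges of pe that remain after skipping the excluded regions."""
    (pe_offset,) = struct.unpack_from("<I", pe, 0x3C)  # e_lfanew
    if pe[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file")
    optional = pe_offset + 24  # 4-byte signature + 20-byte COFF header
    (magic,) = struct.unpack_from("<H", pe, optional)
    if magic == 0x10B:  # PE32
        cert_entry = optional + 128
    elif magic == 0x20B:  # PE32+
        cert_entry = optional + 144
    else:
        raise ValueError("unknown optional header magic")
    checksum = optional + 64
    cert_offset, cert_size = struct.unpack_from("<II", pe, cert_entry)

    omit = [Range(checksum, checksum + 4), Range(cert_entry, cert_entry + 8)]
    if cert_offset and cert_size:
        omit.append(Range(cert_offset, cert_offset + cert_size))

    # Everything between the omitted regions is hashed.
    ranges, position = [], 0
    for skip in sorted(omit):
        if skip.start > position:
            ranges.append(Range(position, skip.start))
        position = max(position, skip.end)
    if position < len(pe):
        ranges.append(Range(position, len(pe)))
    return ranges


# Synthetic 512-byte buffer with just enough header fields filled in.
pe = bytearray(512)
struct.pack_into("<I", pe, 0x3C, 0x80)                  # PE header at 0x80
pe[0x80:0x84] = b"PE\0\0"
struct.pack_into("<H", pe, 0x80 + 24, 0x10B)            # PE32 magic
struct.pack_into("<II", pe, 0x80 + 24 + 128, 448, 64)   # certificate table at 448, 64 bytes
ranges = authenticode_ranges(bytes(pe))
```

For this synthetic buffer, the bytes at 216 to 219 (checksum), 280 to 287 (directory entry) and 448 to 511 (certificate table) are left out.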

The following interfaces are also available:

class signify.fingerprinter.Range(start, end)

A range with a start and an end.

class signify.fingerprinter.Finger(hashers: list[hashlib._Hash], ranges: list[Range], description: str)

A Finger defines how to hash a file to get specific fingerprints.

The Finger contains one or more hash functions, a set of ranges in the file that are to be processed with these hash functions, and a description.

While one Finger provides potentially multiple hashers, they all get fed the same ranges of the file.

Parameters:
  • hashers – A list of hashers to feed.

  • ranges – A list of Ranges that are hashed.

  • description – The description of this Finger.

consume(start: int, end: int) None

Consumes an entire range, or part thereof.

If the finger has no ranges left, or the start of the current range lies beyond the end of the consumed block, nothing happens. Otherwise, the current range is shortened by the consumed block, or removed if it is consumed entirely. For this to work, the consumed range must start where the current range starts, and the consumed range may not be longer than the current range.

Parameters:
  • start – Beginning of range to be consumed.

  • end – First offset after the consumed range (that is, the last consumed offset + 1).

Raises:

RuntimeError – if the start position of the consumed range is higher than the start of the current range in the finger, or if the consumed range cuts across block boundaries.
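
The behaviour described above can be approximated by the following sketch. SketchFinger and its Range stand-in are hypothetical and leave out the hashers entirely; they only model the bookkeeping of ranges, under the assumption that end is exclusive.

```python
from collections import namedtuple

# Hypothetical stand-in for the module's Range; end is assumed to be exclusive.
Range = namedtuple("Range", "start end")


class SketchFinger:
    """Tracks which ranges still have to be consumed (no hashing involved)."""

    def __init__(self, ranges):
        self._ranges = sorted(ranges)

    @property
    def current_range(self):
        return self._ranges[0] if self._ranges else None

    def consume(self, start, end):
        current = self.current_range
        if current is None or current.start >= end:
            return  # nothing left, or the block ends before the current range
        if current.start != start:
            raise RuntimeError("consumed range does not start at the current range")
        if end > current.end:
            raise RuntimeError("consumed range is longer than the current range")
        if end == current.end:
            self._ranges.pop(0)  # range fully consumed
        else:
            self._ranges[0] = Range(end, current.end)  # keep the remainder


finger = SketchFinger([Range(0, 10), Range(20, 30)])
seen = []
for start, end in [(0, 4), (4, 10), (10, 20), (20, 30)]:
    finger.consume(start, end)
    seen.append(finger.current_range)
```

In this run the block from 10 to 20 falls between the two ranges and is ignored, and after the last call no range remains.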

property current_range: Range | None

The working range of this Finger. Returns None if there is none.

update(block: bytes) None

Given a data block, feed it to all the registered hashers.