Diff Module

class pybacked.diff.Diff(difftype_, state)

Holds data to define differences between file-versions

Parameters
  • difftype (str) – The type of difference that was detected: ‘+’ (file was created), ‘-’ (file was deleted) or ‘*’ (file was edited)

  • state (float or str) – The newer state of the file. Depending on the diff algorithm used, this is either a timestamp or a hex hash

class pybacked.diff.DiffCache(initialdict=None, initialdirflags=None, nested=True)

A collection of Diff objects, which hold information on file-version differences. For example, DiffCache objects are used to collect the Diff objects for all files in a directory, so that they are available in one easily accessible and versatile “Container”

Parameters
  • initialdict (dict, optional) – Initial dictionary which will be copied into diffdict

  • initialdirflags (dict, optional) – Initial dir flags dictionary

  • nested (bool, optional) – Indicates whether the DiffCache is nested, meaning that the DiffCache itself can contain DiffCaches. (default is True)

add_diff(location, diffobj, is_dir)

Add a diff reference to diffdict and dirflags.

Parameters
  • location (str) – The name of the inspected location (dir or file)

  • diffobj (Diff object) – The diff-object which will hold different information depending on the diff-detection algorithm

  • is_dir (bool) – Flag to be set if the added location is a dir

Returns

Nothing

Return type

None

remove_diff(location)

Remove a diff reference from diffdict and dirflags.

Parameters

location (str) – The location-name of the diff to be removed

Returns

The diff and the dir flag stored under the location-name key

Return type

Diff object, bool
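
The behaviour described for add_diff and remove_diff can be pictured with a small stand-in. This is a sketch, not the pybacked implementation: the class names SketchDiff and SketchDiffCache are hypothetical, and the use of plain dictionaries and dict.pop is an assumption. Only the attribute names diffdict and dirflags are taken from the descriptions above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class SketchDiff:
    """Hypothetical stand-in for pybacked.diff.Diff."""
    difftype: str             # '+', '-' or '*'
    state: Union[float, str]  # timestamp or hex hash


class SketchDiffCache:
    """Hypothetical stand-in for pybacked.diff.DiffCache."""

    def __init__(self, initialdict=None, initialdirflags=None):
        # Copy the initial dictionaries so the caller's objects stay untouched.
        self.diffdict = dict(initialdict or {})
        self.dirflags = dict(initialdirflags or {})

    def add_diff(self, location, diffobj, is_dir):
        # Store the diff and its dir flag under the same location key.
        self.diffdict[location] = diffobj
        self.dirflags[location] = is_dir

    def remove_diff(self, location):
        # Return both removed entries, mirroring the documented return value.
        return self.diffdict.pop(location), self.dirflags.pop(location)


cache = SketchDiffCache()
cache.add_diff("notes.txt", SketchDiff("*", 1700000000.0), is_dir=False)
cache.add_diff("photos", SketchDiff("+", "9f86d081"), is_dir=True)
removed, was_dir = cache.remove_diff("notes.txt")
```

After the last line, removed is the Diff stand-in for notes.txt and was_dir is its dir flag; only the photos entry remains in the cache.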

class pybacked.diff.DiffDate(difftype_, last_edit_, previous_edit_=None)

LEGACY CODE: Holds data to describe differences between file-versions detected by comparing last-edited dates.

Parameters
  • difftype (str) – The type of difference that was detected: ‘+’ (file was created), ‘-’ (file was deleted) or ‘*’ (file was edited)

  • last_edit (float) – The unix timestamp of the last edit.

  • previous_edit (float, optional) – The last-edited date (unix timestamp) of the most recently archived file-version

class pybacked.diff.DiffHash(difftype_, currenthash_, hash_algorithm_)

LEGACY CODE: Holds data to describe differences between file-versions detected by comparing binary hashes.

Parameters
  • difftype (str) – The type of difference that was detected: ‘+’ (file was created), ‘-’ (file was deleted) or ‘*’ (file was edited)

  • currenthash (bytes object) – The hash of the modified file

  • hash_algorithm (str) – The applied hash algorithm

pybacked.diff.collect(storage_dir, archive_dir, diff_algorithm, hash_algorithm=None, subdir='')

Collects all the diff information for an entire storage directory.

Parameters
  • storage_dir (str) – The storage directory

  • archive_dir (str) – The archive directory

  • diff_algorithm (int) – The desired diff algorithm - one of (DIFF_DATE, DIFF_HASH, DIFF_CONT)

  • hash_algorithm (str, optional) – The desired hash algorithm.

  • subdir (str, optional) – The subdirectory prefix for the filename

Returns

The DiffCache object holding the diff information

Return type

DiffCache

pybacked.diff.detect(filepath, archive_dir, diff_algorithm, hash_algorithm=None, subdir='')

Detect differences between a working file and an archived file, that is, whether the file has changed since the last archiving cycle.

Parameters
  • filepath (str) – The path of the inspected file

  • archive_dir (str) – The directory of the archive

  • diff_algorithm (int) – The diff-detection algorithm used - one of DIFF_DATE or DIFF_HASH

  • hash_algorithm (str, optional) – The desired hashing algorithm

  • subdir (str, optional) – The subdirectory prefix for the filename

Returns

The Diff object which corresponds to the file change, or None if the file didn’t change.

Return type

Diff
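
As a rough illustration of hash-based change detection, the sketch below compares a working file's hash with a previously recorded hash and reports ‘+’, ‘-’ or ‘*’, or None when nothing changed. It does not call pybacked: the names file_hash and sketch_detect, the archived_hash argument and the tuple return value are all assumptions made for this example, and how pybacked actually reads the archived state is not shown here.

```python
import hashlib
import os
import tempfile


def file_hash(path, hash_algorithm="sha256"):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(hash_algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sketch_detect(filepath, archived_hash, hash_algorithm="sha256"):
    """Hypothetical helper: return (difftype, state) or None if unchanged.

    archived_hash is the hex hash recorded at the last archiving cycle,
    or None if the file has never been archived.
    """
    if not os.path.exists(filepath):
        # Assumption: a deleted file has no newer state, so state is None.
        return ("-", None) if archived_hash is not None else None
    current = file_hash(filepath, hash_algorithm)
    if archived_hash is None:
        return ("+", current)   # never archived: file was created
    if current != archived_hash:
        return ("*", current)   # hash differs: file was edited
    return None                 # same hash: no change


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "example.txt")
    with open(path, "w") as handle:
        handle.write("first version")
    created = sketch_detect(path, archived_hash=None)
    unchanged = sketch_detect(path, archived_hash=created[1])
    with open(path, "w") as handle:
        handle.write("second version")
    edited = sketch_detect(path, archived_hash=created[1])
```

A timestamp-based variant would follow the same shape, comparing os.path.getmtime(filepath) with the stored timestamp instead of comparing hashes.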

pybacked.diff.diff_log_deserialize(archive, basepath=None)

Read a diff-log.csv from a given archive and create a DiffCache from the contents of the diff-log.

Parameters
  • archive (str) – The path to the zip-archive

  • basepath (str, optional) – The path to the storage location. This is required as the diff-log.csv only stores the relative filenames.

Returns

The deserialized DiffCache object

Return type

DiffCache

pybacked.diff.diff_log_deserialize_str(diff_log, basepath=None)

Create a DiffCache object from a given string.

Parameters
  • diff_log (str) – A string containing the contents of the diff-log.csv

  • basepath (str, optional) – The path to the storage location. This is required as the diff-log.csv only stores the relative filenames. If basepath is None, diff_log_deserialize_str will use the archive-relative paths, just as in the diff-log.csv

Returns

The deserialized DiffCache object

Return type

DiffCache
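
The effect of basepath can be sketched with a small parser. The column layout of diff-log.csv is not documented in this section, so the three columns used here (relative filename, difftype, state) are purely an assumption, as are the function name sketch_deserialize_str and the plain dictionary it returns in place of a DiffCache.

```python
import csv
import io
import os


def sketch_deserialize_str(diff_log, basepath=None):
    """Hypothetical parser; assumes rows of (relative name, difftype, state)."""
    result = {}
    for row in csv.reader(io.StringIO(diff_log)):
        if not row:
            continue  # skip blank lines
        name, difftype, state = row
        # Without a basepath, keep the archive-relative name as the key.
        key = os.path.join(basepath, name) if basepath is not None else name
        result[key] = (difftype, state)
    return result


log = "docs/readme.txt,*,1700000000.0\nold.txt,-,1690000000.0\n"
relative = sketch_deserialize_str(log)
absolute = sketch_deserialize_str(log, basepath="/home/user/storage")
```

In this sketch, relative is keyed by the names exactly as they appear in the log, while absolute is keyed by those names joined onto the given storage path.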