API

Below is a description of the flywheel-migration-toolkit API.

DeID Profile

Provides profile loading/saving for file de-identification

class flywheel_migration.deidentify.deid_profile.DeIdProfile

Bases: object

Represents steps to take to de-identify a file or set of files.

finalize()

Perform any necessary cleanup with profile

get_file_profile(name)

Get file profile for name, or None if not present

initialize()

Initialize the profile, prior to importing

load_config(config)

Initialize this profile from a config dictionary

process_file(src_fs, src_file, dst_fs)

Process the given file, if it’s handled by a file profile.

Parameters
  • src_fs – The source filesystem

  • src_file – The source file path

  • dst_fs – The destination filesystem

Returns

True if the file was processed, False otherwise

Return type

bool

process_packfile(packfile_type, src_fs, dst_fs, paths, callback=None)

Process the given packfile, if it’s handled by a file profile.

Parameters
  • packfile_type (str) – The packfile type

  • src_fs – The source filesystem

  • dst_fs – The destination filesystem

  • paths – The list of paths to process

  • callback – Optional function to call after processing each file

Returns

True if the packfile was processed, False otherwise

Return type

bool

to_config()

Create configuration dictionary from this profile

validate(enhanced=False)

Validate the profile, returning any errors

Returns

A list of error messages, or an empty list

Return type

list(str)

DeID Field

Represents action to take in order to de-id a single field

class flywheel_migration.deidentify.deid_field.DeIdField(fieldname, is_regex=False, dry=False)

Bases: object

Abstract class that represents action to take to de-identify a single field

deidentify(profile, state, record)

Perform the update - default implementation is to do a replace

classmethod factory(config, dry=False, mixin=None)

Create a new DeIdField instance for the given config.

Parameters
  • config (dict) – The field configuration

  • dry (bool) – If set to true, set the field as dry, i.e. a field that does not modify the record.

  • mixin (DeIdFieldMixin) – Optional subclass of DeIdFieldMixin to be inherited by the DeIdField subclass to make the field profile specific.

classmethod get_deidfield_class(config)

Returns DeIdField subclass matching config.

If only “name” is defined in config, returns DeIdKeepField, otherwise returns DeIdField subclass based on key action found in config.

Parameters

config (dict) – Dictionary e.g. {"name": "PatientID", "replace-with": "TOTO"}

Returns

A DeIdField subclass, or None if no subclass matches config

Return type

DeIdField or None
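
The lookup rule described for get_deidfield_class can be sketched without the toolkit installed. This is a minimal illustration, not the library's code: KeepField, ReplaceField, HashField and pick_field_class are hypothetical stand-ins, and only the `key` values and the "name only means keep" rule come from this page.

```python
# Stand-in classes (hypothetical); only the `key` values mirror this page.
class KeepField:
    key = "keep"


class ReplaceField:
    key = "replace-with"


class HashField:
    key = "hash"


FIELD_CLASSES = [KeepField, ReplaceField, HashField]


def pick_field_class(config):
    """Return the stand-in class whose `key` appears in config, or None."""
    # A config that only defines "name" falls back to the keep action.
    if set(config) == {"name"}:
        return KeepField
    for cls in FIELD_CLASSES:
        if cls.key in config:
            return cls
    return None


chosen = pick_field_class({"name": "PatientID", "replace-with": "TOTO"})
print(chosen.__name__)
```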

abstract get_value(profile, state, record)

Get the transformed value, given profile state and record

property is_dry
property is_regex
key = None
list_fieldname(record)

Return a list of fieldnames for record.

By default returns [self.fieldname]. Can be overridden by certain subclasses of FieldEnhancerBaseMixin to return a range of record attributes (e.g. when the field uses a regex or a range definition).

load_config(config)

Load rule specific settings from configuration dictionary

local_to_config(config)

Convert rule specific settings to configuration dictionary

to_config()

Convert to configuration dictionary

class flywheel_migration.deidentify.deid_field.DeIdFieldMixin

Bases: object

Mixin base class to add functionality to DeIdField based on the profile used

flavor = None
class flywheel_migration.deidentify.deid_field.DeIdHashField(fieldname, is_regex=False, dry=False)

Bases: flywheel_migration.deidentify.deid_field.DeIdField

Action to replace a field with its hashed value

get_value(profile, state, record)

Get the transformed value, given profile state and record

key = 'hash'
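
As a rough illustration of what a hash action does, the sketch below hashes a value with the algorithm named by hash_algorithm and shortens the hex digest to hash_digits characters. The function name hash_value is hypothetical, and both the truncation rule and "0 means keep the full digest" are assumptions; the toolkit's actual behaviour (salting, encoding, truncation) is not documented on this page.

```python
import hashlib


def hash_value(value, algorithm="sha256", digits=16):
    """Hex digest of value, cut to `digits` characters (0 keeps the full digest)."""
    digest = hashlib.new(algorithm, value.encode("utf-8")).hexdigest()
    # Assumption: truncation keeps the leading characters of the digest.
    return digest[:digits] if digits else digest


print(hash_value("Doe^John"))
```

The same input always maps to the same output, so hashed identifiers remain linkable across files without exposing the original value.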
class flywheel_migration.deidentify.deid_field.DeIdHashUIDField(fieldname, is_regex=False, dry=False)

Bases: flywheel_migration.deidentify.deid_field.DeIdField

Action to replace a uid field with its hashed value

get_value(profile, state, record)

Get the transformed value, given profile state and record

key = 'hashuid'
class flywheel_migration.deidentify.deid_field.DeIdIdentityField(*args, **kwargs)

Bases: flywheel_migration.deidentify.deid_field.DeIdField

Action to do nothing on a field. Same as keep action. To be deprecated.

deidentify(profile, state, record)

Do nothing

Use in fieldname section, regex-sub and with remove-undefined action.

get_value(profile, state, record)

Get the transformed value, given profile state and record

key = 'identity'
class flywheel_migration.deidentify.deid_field.DeIdIncrementDateField(fieldname, **kwargs)

Bases: flywheel_migration.deidentify.deid_field.DeIdField

Action to replace a field with its incremented date

get_value(profile, state, record)

Get the transformed value, given profile state and record

key = 'increment-date'
load_config(config)

Load rule specific settings from configuration dictionary

local_to_config(config)

Convert rule specific settings to configuration dictionary
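
A date increment can be sketched with the standard library alone. The function increment_date and its `days` parameter are hypothetical names used for illustration; the default format string is the date_format value documented for FileProfile.

```python
from datetime import datetime, timedelta


def increment_date(value, days, date_format="%Y%m%d"):
    """Shift a formatted date string by `days` and return it in the same format."""
    parsed = datetime.strptime(value, date_format)
    return (parsed + timedelta(days=days)).strftime(date_format)


print(increment_date("20200130", 3))
```

Parsing and re-formatting with the same format string keeps the field valid for its file type, for example '%Y-%m-%d' for the JSON profile.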

class flywheel_migration.deidentify.deid_field.DeIdIncrementDateTimeField(fieldname, **kwargs)

Bases: flywheel_migration.deidentify.deid_field.DeIdField

Action to replace a field with its incremented datetime

get_value(profile, state, record)

Get the transformed value, given profile state and record

key = 'increment-datetime'
load_config(config)

Load rule specific settings from configuration dictionary

local_to_config(config)

Convert rule specific settings to configuration dictionary

class flywheel_migration.deidentify.deid_field.DeIdJitterField(fieldname, **kwargs)

Bases: flywheel_migration.deidentify.deid_field.DeIdField

Action to jitter a field with some random number from a uniform distribution on a range

get_value(profile, state, record)

Get the transformed value, given profile state and record

key = 'jitter'
load_config(config)

Load rule specific settings from configuration dictionary

local_to_config(config)

Convert rule specific settings to configuration dictionary
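
The idea behind a jitter action can be sketched as follows. This is an assumption-laden illustration: the function name jitter is hypothetical, and treating jitter_range as a symmetric interval around the original value is a guess, not documented behaviour. The defaults mirror the jitter_range and jitter_type attributes listed for FileProfile.

```python
import random


def jitter(value, jitter_range=2, jitter_type="float", rng=random):
    """Add uniform noise drawn from [-jitter_range, +jitter_range] to value."""
    # Assumption: the range is symmetric around the original value.
    result = value + rng.uniform(-jitter_range, jitter_range)
    return int(round(result)) if jitter_type == "int" else result


seeded = random.Random(0)  # a seeded generator makes the sketch reproducible
print(jitter(70.0, rng=seeded))
```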

class flywheel_migration.deidentify.deid_field.DeIdKeepField(fieldname, is_regex=False, dry=False)

Bases: flywheel_migration.deidentify.deid_field.DeIdField

Action to do nothing on a field

deidentify(profile, state, record)

Do nothing.

Use in fieldname section, regex-sub and with remove-undefined action.

get_value(profile, state, record)

Get the transformed value, given profile state and record

key = 'keep'
class flywheel_migration.deidentify.deid_field.DeIdRegexSubField(fieldname, **kwargs)

Bases: flywheel_migration.deidentify.deid_field.DeIdField

Action to edit a string matching a regex with capture groups

get_value(profile, state, record)

Get the transformed value, given profile state and record

key = 'regex-sub'
load_config(config)

Load rule specific settings from configuration dictionary

local_to_config(config)

Convert rule specific settings to configuration dictionary

class flywheel_migration.deidentify.deid_field.DeIdRegexSubListItem(config)

Bases: object

Class for representing a list item within DeIdRegexSubField

format_output(val_dict)

Format output according to output_map

get_invalid_output_vars()

Return a list of invalid output_vars

is_capture_group(var_name)

Return True if the varname matches a named capture group in self.input_regex

output_dot_replace_char = '___'
regex_matches_field_value(value)

Return True if the value matches the regex, else False

to_config()

Convert to configuration dictionary

var_name_is_valid(var_name)

Return True if the varname is a capture group or is defined in self.group_dict, False otherwise.
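
The regex-sub idea (match a value against a regex with named capture groups, then build the output from those groups) can be sketched with the standard `re` module. The function regex_sub and its arguments are hypothetical; how the toolkit combines capture groups with group_dict values is not shown here.

```python
import re


def regex_sub(value, input_regex, output):
    """Format `output` with the named groups captured from `value`, or return None."""
    match = re.match(input_regex, value)
    if match is None:
        return None
    return output.format(**match.groupdict())


print(regex_sub("site01-20200130", r"^(?P<site>\w+)-(?P<date>\d+)$", "{site}_X"))
```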

class flywheel_migration.deidentify.deid_field.DeIdRemoveField(fieldname, is_regex=False, dry=False)

Bases: flywheel_migration.deidentify.deid_field.DeIdField

Action to remove a field from the record

deidentify(profile, state, record)

Perform the update - default implementation is to do a replace

get_value(profile, state, record)

Get the transformed value, given profile state and record

key = 'remove'
class flywheel_migration.deidentify.deid_field.DeIdReplaceField(fieldname, **kwargs)

Bases: flywheel_migration.deidentify.deid_field.DeIdField

Action to replace a field in the record

get_value(profile, state, record)

Get the transformed value, given profile state and record

key = 'replace-with'
load_config(config)

Load rule specific settings from configuration dictionary

local_to_config(config)

Convert rule specific settings to configuration dictionary

File Profile

Individual file/packfile profile for de-identification

class flywheel_migration.deidentify.file_profile.FileProfile(packfile_type=None, file_filter=None)

Bases: object

Abstract class that represents a single file/packfile profile

add_field(field)

Add a field to de-identify

add_log(log)

Set the log instance

create_file_state()

Create state object for processing files

date_format = '%Y%m%d'
datetime_format = '%Y%m%d%H%M%S.%f'
datetime_has_timezone = True
default_filenames = []
deidfield_mixin = None
classmethod factory(name, config=None, log=None)

Create a new file profile instance for the given name.

Parameters
  • name (str) – The name of the profile type

  • config (dict) – The optional configuration dictionary

  • log – The optional de-id log instance

filename_field_prefix = '_fwmtk'
get_dest_path(state, record, path)

Get destination path

get_log_entry(path, entry_type, state, record)

Returns a dictionary with key/value corresponding to log entry and the logged fields

get_log_fields()

Return the full set of fieldnames that should be logged

classmethod get_subclasses()

Returns all subclasses, not only the immediate ones

get_value(state, record, fieldname)

Get the transformed value for fieldname

has_field(var_fieldname)

Returns True if var_fieldname is defined in field_map or a regex field matches var_fieldname, else returns False

hash_algorithm = 'sha256'
hash_digits = 0
jitter_range = 2
jitter_type = 'float'
load_config(config)

Read configuration from a dictionary

abstract load_record(state, src_fs, path)

Load the record(file) at path, return None to ignore this file

log_fields = []
matches_file(filename)

Check if this profile can process the given file

matches_packfile(packfile_type)

Check if this profile can process the given packfile

name = None
process_files(src_fs, dst_fs, files, callback=None)

Process all files in the file list, performing de-identification steps

Parameters
  • src_fs – The source filesystem (Provides open function)

  • dst_fs – The destination filesystem

  • files – The set of files in src_fs to process

  • callback – Function to call after writing each file

classmethod profile_names()

Get the list of profile names

abstract read_field(state, record, fieldname)

Read the named field as a string. Return None if field cannot be read.

regex_compatible = False
abstract remove_field(state, record, fieldname)

Remove the named field from the record

abstract replace_field(state, record, fieldname, value)

Replace the named field with value in the record

replace_with_insert = True
sanitize_filename = True
abstract save_record(state, record, dst_fs, path)

Save the record to the destination path

set_filenames_attributes(record, path)

Update record object with private attributes based on filenames properties

Record attributes are extended based on <groups> extracted from the <input-regex>. For instance, the following filenames schema defined in a profile:

filenames:
    - output: '{group1}.ext'
      input-regex: '^(?P<group1>[\w]+).ext$'
    - output: '{group1}-{group2}.ext'
      input-regex: '^(?P<group1>[\w]+)-(?P<date1>[\d]+).ext$'

will create attributes, depending on which input-regex matches, as:

# for `path` = test.ext
record.<self.filename_field_prefix>_filename0_group1 = 'test'
# for `path` = test-20200130.ext
record.<self.filename_field_prefix>_filename1_group1 = 'test'
# for `path` = test-20200130.ext
record.<self.filename_field_prefix>_filename1_date1 = '20200130'
Parameters
  • record (object) – A record

  • path (str) – basename of input file
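
The attribute naming shown above can be reproduced with a short standalone sketch. The function filename_attributes is hypothetical and returns a plain dictionary instead of setting attributes on a record; the prefix is the documented filename_field_prefix value and the two patterns are the ones from the example schema.

```python
import re

PREFIX = "_fwmtk"  # documented value of filename_field_prefix

INPUT_REGEXES = [
    r"^(?P<group1>[\w]+).ext$",
    r"^(?P<group1>[\w]+)-(?P<date1>[\d]+).ext$",
]


def filename_attributes(path, input_regexes=INPUT_REGEXES, prefix=PREFIX):
    """Map '<prefix>_filename<index>_<group>' to the captured value for each matching regex."""
    attrs = {}
    for index, pattern in enumerate(input_regexes):
        match = re.match(pattern, path)
        if match:
            for group, captured in match.groupdict().items():
                attrs[f"{prefix}_filename{index}_{group}"] = captured
    return attrs


print(filename_attributes("test-20200130.ext"))
```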

set_log(log)

Set the log instance

static sort_fields(field_list)

Sort field_list such that regex-sub fields are first
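
One way to obtain this ordering is a stable sort on a boolean key. The sketch below uses plain dictionaries with a hypothetical "key" entry as stand-ins for DeIdField instances; whether the toolkit preserves the relative order of the remaining fields is an assumption.

```python
def sort_fields(field_list):
    """Return the fields with regex-sub entries first, other entries in original order."""
    # sorted() is stable and False sorts before True, so regex-sub fields lead.
    return sorted(field_list, key=lambda field: field["key"] != "regex-sub")


fields = [
    {"name": "PatientID", "key": "hash"},
    {"name": "StudyDescription", "key": "regex-sub"},
    {"name": "PatientName", "key": "keep"},
]
print([field["name"] for field in sort_fields(fields)])
```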

to_config()

Get configuration as a dictionary

uid_default_prefix_fields = 4
uid_hash_fields = (6, 6, 6, 6, 6, 6)
uid_max_suffix_digits = 6
uid_suffix_fields = 1
validate(enhanced=False)

Validate the profile, returning any errors.

Parameters

enhanced (bool) – Perform a deeper validation if supported

Returns

A list of error messages, or an empty list

Return type

list(str)

write_log_entry(path, entry_type, state, record)

Write a single log entry of type entry_type for path

DICOM File Profile

File profile for de-identifying DICOM files

class flywheel_migration.deidentify.dicom_file_profile.DicomDeIdFieldMixin

Bases: flywheel_migration.deidentify.deid_field.DeIdFieldMixin

Mixin to add functionality to DeIdField for Dicom profile

deidentify(profile, state, record)

Deidentifies depending on field type

flavor = 'Dicom'
list_fieldname(record)

Returns a list of fieldnames for record depending on field type

recurse_sequence = False
class flywheel_migration.deidentify.dicom_file_profile.DicomFileProfile

Bases: flywheel_migration.deidentify.file_profile.FileProfile

Dicom implementation of load/save and remove/replace fields

add_field(field)

Add a field to de-identify

create_file_state()

Create state object for processing files

decode = True
deidfield_mixin

alias of flywheel_migration.deidentify.dicom_file_profile.DicomDeIdFieldMixin

get_data_element(record, fieldname)

Returns data element in record at fieldname

get_data_element_VR(record, fieldname)

Returns data element VR in record at fieldname

get_dest_path(state, record, path)

Returns the default name based on SOPInstanceUID, or one based on the profile if defined

hash_digits = 16
load_config(config)

Read configuration from a dictionary

load_record(state, src_fs, path)

Load the record(file) at path, return None to ignore this file

log_fields = ['StudyInstanceUID', 'SeriesInstanceUID', 'SOPInstanceUID']
name = 'dicom'
process_files(*args, **kwargs)

Process all files in the file list, performing de-identification steps

Parameters
  • src_fs – The source filesystem (Provides open function)

  • dst_fs – The destination filesystem

  • files – The set of files in src_fs to process

  • callback – Function to call after writing each file

read_field(state, record, fieldname)

Read the named field as a string. Return None if field cannot be read.

recurse_sequence = False
regex_compatible = True
remove_field(state, record, fieldname)

Remove the named field from the record

remove_undefined = False
remove_undefined_fields(state, record)

Remove data elements not defined in fields

replace_field(state, record, fieldname, value)

Replace the named field with value in the record

save_record(state, record, dst_fs, path)

Save the record to the destination path

to_config()

Get configuration as a dictionary

validate(enhanced=False)

Validate the profile, returning any errors.

Parameters

enhanced (bool) – If True, test profile execution on a set of test files

Returns

A list of error messages, or an empty list

Return type

list(str)

validate_filenames(errors)

Validates the filename section of the profile

Parameters

errors (list) – Current list of error messages

Returns

Extended list of error messages

Return type

list

class flywheel_migration.deidentify.dicom_file_profile.DicomTagStr(value, *_args, **_kwargs)

Bases: str

Subclass of str that hosts attributes and methods for handling the different ways a field can reference DICOM data element(s)

property dicom_tag
property is_flat

Return True for ‘flat’ fieldname (map to a single tag), False otherwise.

property is_private
property is_repeater
property is_sequence
property is_wild_sequence
parse_field_name(name)

Parse the field name and return a list or a Tag, depending on name

Parameters

name (str) – The field name.

Returns

A list or a Tag, depending on name.

Return type

(list or Tag)

Raises

ValueError – if name matches multiple fieldname definition types.

parsers_method_prefix = '_parse'

PNG File Profile

File profile for de-identifying PNG files

class flywheel_migration.deidentify.png_file_profile.ChunkStr(value, *_args, **_kwargs)

Bases: str

Subclass of string with a few extra attributes related to PNG chunks

class flywheel_migration.deidentify.png_file_profile.PNGFileProfile(file_filter=None)

Bases: flywheel_migration.deidentify.file_profile.FileProfile

PNG implementation of load/save and remove/replace fields

add_field(field)

Add field to profile

create_file_state()

Create state object for processing files

default_file_filter = ['*.png', '*.PNG']
default_output_format = 'PNG'
hash_digits = 16
load_config(config)

Read configuration from a dictionary

load_record(state, src_fs, path)

Load the record(file) at path, return None to ignore this file

log_fields = []
name = 'png'
read_field(state, record, fieldname)

Read field from record

remove_field(state, record, fieldname)

Remove the named field from the record

replace_field(state, record, fieldname, value)

Replace the named field with value in the record

save_record(state, record, dst_fs, path)

Save the record to the destination path

to_config()

Get configuration as a dictionary

validate(enhanced=False)

Validate the profile, returning any errors.

Parameters

enhanced (bool) – If True, test profile execution on a set of test files

Returns

A list of error messages, or an empty list

Return type

list(str)

class flywheel_migration.deidentify.png_file_profile.PNGRecord(fp, mode='r')

Bases: object

A record for dealing with a PNG file

property metadata

Load Exif metadata

mime_type = 'image/png'
save_as(fp, file_type='PNG')

Save deid image

Parameters
  • fp – A file object

  • file_type – Image format to save as

validate()

Validate image against expected type

JPG File Profile

File profile for de-identifying files storing Exif metadata, such as JPEG. More on Exif at https://en.wikipedia.org/wiki/Exif

class flywheel_migration.deidentify.jpg_file_profile.ExifTagStr(value, *_args, **_kwargs)

Bases: str

Subclass of string with a few extra attributes related to exif

class flywheel_migration.deidentify.jpg_file_profile.JPGFileProfile(file_filter=None)

Bases: flywheel_migration.deidentify.file_profile.FileProfile

Exif implementation of load/save and remove/replace fields

Human readable tags are leveraged from piexif.TAGS

add_field(field)

Add field to profile

Fields matching a keyword found in multiple datablocks (i.e. Exif, IFD0 and IFD1) get duplicated

create_file_state()

Create state object for processing files

datetime_format = '%Y:%m:%d %H:%M:%S'
default_file_filter = ['*.jpg', '*.jpeg', '*.JPG', '*.JPEG']
default_output_format = 'JPEG'
hash_digits = 16
load_config(config)

Read configuration from a dictionary

load_record(state, src_fs, path)

Load the record(file) at path, return None to ignore this file

log_fields = []
name = 'jpg'
read_field(state, record, fieldname)

Read field from record

record_class

alias of flywheel_migration.deidentify.jpg_file_profile.JPGRecord

remove_field(state, record, fieldname)

Remove the named field from the record

replace_field(state, record, fieldname, value)

Replace the named field with value in the record

save_record(state, record, dst_fs, path)

Save the record to the destination path

to_config()

Get configuration as a dictionary

validate(enhanced=False)

Validate the profile, returning any errors.

Parameters

enhanced (bool) – If True, test profile execution on a set of test files

Returns

A list of error messages, or an empty list

Return type

list(str)

class flywheel_migration.deidentify.jpg_file_profile.JPGRecord(fp, mode='r')

Bases: object

A record for dealing with a JPG file

file_type = 'JPEG'
property metadata

Load Exif metadata

mime_type = 'image/jpeg'
save_as(fp, file_type=None)

Save deid image

Parameters
  • fp – A file object

  • file_type – Image format to save as

validate()

Validate image against expected type

TIFF File Profile

File profile for de-identifying TIFF files

class flywheel_migration.deidentify.tiff_file_profile.IFDTagStr(value, *_args, **_kwargs)

Bases: str

Subclass of string with a few extra attributes related to metadata

class flywheel_migration.deidentify.tiff_file_profile.TIFFFileProfile(file_filter=None)

Bases: flywheel_migration.deidentify.file_profile.FileProfile

TIFF implementation of load/save and remove/replace fields

Human readable tags are leveraged from PIL.TiffTags.TAGS_V2

add_field(field)

Add field to profile

create_file_state()

Create state object for processing files

datetime_format = '%Y:%m:%d %H:%M:%S'
default_file_filter = ['*.tif', '*.tiff', '*.TIF', '*.TIFF']
default_output_format = 'TIFF'
hash_digits = 16
load_config(config)

Read configuration from a dictionary

load_record(state, src_fs, path)

Load the record(file) at path, return None to ignore this file

log_fields = []
name = 'tiff'
private_tags_lower_bound = 32768
read_field(state, record, fieldname)

Read field from record

record_class

alias of flywheel_migration.deidentify.tiff_file_profile.TIFFRecord

remove_field(state, record, fieldname)

Remove the named field from the record

replace_field(state, record, fieldname, value)

Replace the named field with value in the record

save_record(state, record, dst_fs, path)

Save the record to the destination path

to_config()

Get configuration as a dictionary

validate(enhanced=False)

Validate the profile, returning any errors.

Parameters

enhanced (bool) – If True, test profile execution on a set of test files

Returns

A list of error messages, or an empty list

Return type

list(str)

class flywheel_migration.deidentify.tiff_file_profile.TIFFRecord(fp, mode='r')

Bases: object

A record for dealing with a TIFF file

file_type = 'TIFF'
property metadata

Load metadata

mime_type = 'image/tiff'
save_as(filepath, file_type=None, **kwargs)

Save deid image

Parameters
  • filepath – A file path

  • file_type – Image format to save as

validate()

Validate image against expected type

KEY/VALUE File Profile

File profile for de-identifying text files with lines that contain string pattern-delimited key-value pairs

class flywheel_migration.deidentify.key_value_text_file_profile.KeyValueTextFileLine(line, delimiter)

Bases: object

Represents a parsed line from key-value text file

get_output_line()

Get the string representation of line with output_value

parse_line()

Parses self.input_line to determine delimiter_value, key, and input_value

set_value(value)

Sets self.output_value to value
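
Splitting a line on a delimiter pattern can be sketched with `re.search`. The functions parse_line and output_line are hypothetical, and treating the delimiter as a regular expression is an assumption based on the "string pattern-delimited" wording of the module description.

```python
import re


def parse_line(line, delimiter):
    """Split line into (key, delimiter_value, input_value); None if no delimiter is found."""
    match = re.search(delimiter, line)
    if match is None:
        return None
    return line[:match.start()], match.group(0), line[match.end():]


def output_line(key, delimiter_value, output_value):
    """Rebuild the line, keeping the delimiter text exactly as it was matched."""
    return f"{key}{delimiter_value}{output_value}"


key, delimiter_value, _ = parse_line("PatientName = Doe", r"\s*=\s*")
print(output_line(key, delimiter_value, "REDACTED"))
```

Keeping the matched delimiter text means untouched spacing survives a round trip through the record.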

class flywheel_migration.deidentify.key_value_text_file_profile.KeyValueTextFileProfile(file_filter=None)

Bases: flywheel_migration.deidentify.file_profile.FileProfile

Key-value text file implementation of load/save and remove/replace fields

default_file_filter = ['*.MHD', '*.mhd']
hash_digits = 16
load_config(config)

Read configuration from a dictionary

load_record(state, src_fs, path)

Load the record(file) at path, return None to ignore this file

name = 'key-value-text-file'
read_field(state, record, fieldname)

Read the named field as a string. Return None if field cannot be read.

remove_field(state, record, fieldname)

Remove the named field from the record

replace_field(state, record, fieldname, value)

Replace the named field with value in the record

save_record(state, record, dst_fs, path)

Save the record to the destination path

to_config()

Get configuration as a dictionary

validate(enhanced=False)

Validate the profile, returning any errors.

Parameters

enhanced (bool) – Perform a deeper validation if supported

Returns

A list of error messages, or an empty list

Return type

list(str)

class flywheel_migration.deidentify.key_value_text_file_profile.KeyValueTextFileRecord(file_object, delimiter, ignore_bad_lines)

Bases: object

Represents a text file where each line is a key-value pair delimited by delimiter

insert_key(key, value)

Prepares a new line object given a key and value and adds it to self.line_dict

parse_lines(file_object, ignore_bad_lines)

Parses the lines in file_object into self._line_dict

save_as(file_object)

Save text file

flywheel_migration.deidentify.key_value_text_file_profile.encoding_supported(enc)

Returns boolean indicating whether encoding string is supported.

JSON File Profile

File profile for de-identifying JSON files

class flywheel_migration.deidentify.json_file_profile.JSONFileProfile(file_filter=None)

Bases: flywheel_migration.deidentify.file_profile.FileProfile

JSON implementation of load/save and remove/replace fields

add_field(field)

Add a field to de-identify

date_format = '%Y-%m-%d'
datetime_format = '%Y-%m-%d %H:%M:%S'
default_file_filter = ['*.json', '*.JSON']
hash_digits = 16
load_config(config)

Read configuration from a dictionary

load_record(state, src_fs, path)

Load the record(file) at path, return None to ignore this file

log_fields = []
name = 'json'
read_field(state, record, fieldname)

Read field from record

record_class

alias of flywheel_migration.deidentify.json_file_profile.JSONRecord

regex_compatible = True
remove_field(state, record, fieldname)

Remove the named field from the record

replace_field(state, record, fieldname, value)

Replace the named field with value in the record

save_record(state, record, dst_fs, path)

Save the record to the destination path

separator = '.'
class flywheel_migration.deidentify.json_file_profile.JSONRecord(fp, data=None, separator=None)

Bases: object

A record for dealing with a JSON file

default_separator = '.'
file_type = 'JSON'
classmethod from_dict(data, separator=None)

Instantiate record from a dictionary

get_all_dotty_paths()

Returns a list of strings for all accessible paths in the record, in dotty dict notation

items()

Iterate over key, value

keys()

List keys in data model

pop(key)

Pop element from data model

save_as(fp)

Save de-id as json

property separator

Returns separator used in Dotty

to_dict()

Export record as dictionary

values()

List values in data model
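
Dotted-path notation for nested JSON data can be illustrated with a small recursive walk. The function dotty_paths is hypothetical; whether the toolkit also reports intermediate nodes or list indices is not documented here, so this sketch only lists leaf paths of nested dictionaries.

```python
def dotty_paths(data, separator=".", prefix=""):
    """Return the separator-joined path of every leaf in a nested dictionary."""
    paths = []
    for key, value in data.items():
        path = f"{prefix}{separator}{key}" if prefix else str(key)
        if isinstance(value, dict) and value:
            # Descend into non-empty dictionaries; everything else is a leaf.
            paths.extend(dotty_paths(value, separator, path))
        else:
            paths.append(path)
    return paths


print(dotty_paths({"patient": {"name": "Doe", "id": 1}, "site": "A"}))
```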

XML File Profile

File profile for de-identifying XML files

class flywheel_migration.deidentify.xml_file_profile.XMLFileProfile(file_filter=None)

Bases: flywheel_migration.deidentify.file_profile.FileProfile

XML implementation of load/save and remove/replace fields

add_field(field)

Add field to profile

create_file_state()

Create state object for processing files

default_file_filter = ['*.XML', '*.xml']
hash_digits = 16
load_record(state, src_fs, path)

Load the record(file) at path, return None to ignore this file

log_fields = []
name = 'xml'
read_field(state, record, fieldname)

Read field from record

record_class

alias of flywheel_migration.deidentify.xml_file_profile.XMLRecord

remove_field(state, record, fieldname)

Remove the named field from the record

replace_field(state, record, fieldname, value)

Replace the named field with value in the record

save_record(state, record, dst_fs, path)

Save the record to the destination path

class flywheel_migration.deidentify.xml_file_profile.XMLRecord(fp)

Bases: object

A record for dealing with an XML file

This is a simple wrapper class that allows storing arbitrary attributes, because lxml.etree._ElementTree does not allow it (it inherits from a custom class that seems to prohibit it)

save_as(fp)

Save xml tree

class flywheel_migration.deidentify.xml_file_profile.XPathStr(value, *_args, **_kwargs)

Bases: str

Subclass of string with a few extra attributes related to xml

flywheel_migration.deidentify.xml_file_profile.parse_fieldname(name)

Parse the given string to determine if it’s XPath compatible.

Parameters

name (str) – The XPath expression

Returns

XPathStr

Table File Profile

File profiles for de-identifying table-like files such as CSV and TSV

class flywheel_migration.deidentify.table_file_profile.CSVFileProfile(file_filter=None)

Bases: flywheel_migration.deidentify.table_file_profile.TableFileProfile

FileProfile class for CSV files

default_file_filter = ['.csv', '.CSV']
delimiter = ','
name = 'csv'
reader = 'csv'
class flywheel_migration.deidentify.table_file_profile.TSVFileProfile(file_filter=None)

Bases: flywheel_migration.deidentify.table_file_profile.TableFileProfile

FileProfile class for TSV files

default_file_filter = ['.tsv', '.TSV']
delimiter = '\t'
name = 'tsv'
reader = 'csv'
class flywheel_migration.deidentify.table_file_profile.TableFileProfile(file_filter=None)

Bases: flywheel_migration.deidentify.file_profile.FileProfile

FileProfile subclass for tables (e.g. csv, tsv) that de-identifies columns

add_field(field)

Add a field to de-identify

default_file_filter = None
delimiter = None
hash_digits = 16
load_config(config)

Read configuration from a dictionary

load_record(state, src_fs, path)

Load the record(file) at path, return None to ignore this file

name = 'table'
read_field(state, record, fieldname)

Read the named field as a string. Return None if field cannot be read.

reader = None
record_class

alias of flywheel_migration.deidentify.table_file_profile.TableRecord

remove_field(state, record, fieldname)

Remove the named field from the record

replace_field(state, record, fieldname, value)

Replace the named field with value in the record

save_record(state, record, dst_fs, path)

Save the record to the destination path

to_config()

Get configuration as a dictionary

validate(enhanced=False)

Validate the profile, returning any errors.

Parameters

enhanced (bool) – Perform a deeper validation if supported

Returns

A list of error messages, or an empty list

Return type

list(str)

class flywheel_migration.deidentify.table_file_profile.TableRecord(fp, reader=None)

Bases: object

A record to deal with tabular data

property columns

Return the columns of the dataframe

save_as(fp, to=None)

Save record to file buffer
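
Column-wise de-identification of delimited text can be sketched with the standard `csv` module. This is a swapped-in technique for illustration only: the page mentions a dataframe, so the toolkit likely relies on a dataframe library instead, and the function replace_column is hypothetical.

```python
import csv
import io


def replace_column(text, column, value, delimiter=","):
    """Return the delimited text with every cell of `column` set to `value`."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        row[column] = value
        writer.writerow(row)
    return out.getvalue()


print(replace_column("id,name\n1,Doe\n2,Roe\n", "name", "XXX"))
```

Passing delimiter="\t" covers the TSV case with the same code.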

Exceptions

Provides validation error for deid templates

exception flywheel_migration.deidentify.exceptions.ValidationError(path, errors)

Bases: Exception

Indicates that the profile is invalid.

path

The path to the deid profile

Type

str

errors

The list of error messages

Type

list(str)
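
The pattern of turning a list of validation messages into an exception that carries `path` and `errors` can be sketched as follows. ProfileValidationError and raise_if_invalid are hypothetical stand-ins so the example runs without the toolkit; only the two attribute names come from this page.

```python
class ProfileValidationError(Exception):
    """Stand-in exception that carries the profile path and its error messages."""

    def __init__(self, path, errors):
        super().__init__(f"{path}: {len(errors)} validation error(s)")
        self.path = path
        self.errors = errors


def raise_if_invalid(path, errors):
    """Raise when a validate() style call returned a non-empty error list."""
    if errors:
        raise ProfileValidationError(path, errors)


try:
    raise_if_invalid("profile.yaml", ["dicom: unknown field action"])
except ProfileValidationError as exc:
    print(exc.path, exc.errors)
```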