API

Below is a description of the flywheel-migration-toolkit API.

DeID Profile

Provides profile loading/saving for file de-identification.

class flywheel_migration.deidentify.deid_profile.DeIdProfile

Bases: object

Represents steps to take to de-identify a file or set of files.

finalize()

Perform any necessary cleanup with profile.

get_file_profile(name)

Get file profile for name, or None if not present.

initialize()

Initialize the profile, prior to importing.

load_config(config)

Initialize this profile from a config dictionary.

process_file(src_fs, src_file, dst_fs)

Process the given file, if it’s handled by a file profile.

Parameters:
  • src_fs – The source filesystem

  • src_file – The source file path

  • dst_fs – The destination filesystem

Returns:

True if the file was processed, False otherwise

Return type:

bool

process_packfile(packfile_type, src_fs, dst_fs, paths, callback=None)

Process the given packfile, if it’s handled by a file profile.

Parameters:
  • packfile_type (str) – The packfile type

  • src_fs – The source filesystem

  • dst_fs – The destination filesystem

  • paths – The list of paths to process

  • callback – Optional function to call after processing each file

Returns:

True if the packfile was processed, False otherwise

Return type:

bool

to_config()

Create configuration dictionary from this profile.

validate(enhanced=False)

Validate the profile, returning any errors.

Returns:

A list of error messages, or an empty list

Return type:

list(str)

DeID Field

Represents action to take in order to de-id a single field.

class flywheel_migration.deidentify.deid_field.DeIdField(fieldname, is_regex=False, dry=False)

Bases: object

Abstract class that represents action to take to de-identify a single field.

deidentify(profile, state, record)

Perform the update - default implementation is to do a replace.

classmethod factory(config, dry=False, mixin=None)

Create a new DeIdField instance for the given config.

Parameters:
  • config (dict) – The field configuration

  • dry (bool) – If set to true, marks the field as dry, i.e. a field that does not modify the record.

  • mixin (DeIdFieldMixin) – Optional subclass of DeIdFieldMixin to be inherited by the DeIdField subclass to make the field profile specific.

classmethod get_deidfield_class(config)

Returns DeIdField subclass matching config.

If only “name” is defined in config, returns DeIdKeepField, otherwise returns DeIdField subclass based on key action found in config.

Parameters:

config (dict) – Dictionary, e.g. {"name": "PatientID", "replace-with": "TOTO"}

Returns:

A DeIdField subclass, or None if no subclass matches config

Return type:

DeIdField or None
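
The selection rule described above can be sketched as follows. This is a minimal, self-contained illustration and not the library's code: `pick_action` and `ACTION_KEYS` are hypothetical names, and the key list is simply the set of `key` attributes listed in this reference.

```python
# Action keys as listed by the `key` attributes in this reference.
ACTION_KEYS = (
    "hash", "hashuid", "identity", "increment-date", "increment-datetime",
    "jitter", "keep", "regex-sub", "remove", "replace-with",
)


def pick_action(config):
    """Return the action key for a field config, or None if none matches.

    Hypothetical helper that mirrors the documented rule only.
    """
    # A config that defines only "name" falls back to the keep action.
    if set(config) == {"name"}:
        return "keep"
    # Otherwise the first known action key found in the config decides.
    for key in ACTION_KEYS:
        if key in config:
            return key
    return None


print(pick_action({"name": "PatientID"}))                          # keep
print(pick_action({"name": "PatientID", "replace-with": "TOTO"}))  # replace-with
```

In the library the return value is a DeIdField subclass rather than a string; the sketch returns the key only to stay independent of the package.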

abstract get_value(profile, state, record)

Get the transformed value, given profile state and record.

property is_dry
property is_regex
key = None
list_fieldname(record)

Return a list of fieldnames for record.

By default returns [self.fieldname]. Can be overridden by certain subclasses of FieldEnhancerBaseMixin to return a range of record attributes (e.g. when the field uses a regex or a range definition).

load_config(config)

Load rule specific settings from configuration dictionary.

local_to_config(config)

Convert rule specific settings to configuration dictionary.

to_config()

Convert to configuration dictionary.

class flywheel_migration.deidentify.deid_field.DeIdFieldMixin

Bases: object

Mixin base class to add functionalities to DeIdField based on profile used.

flavor = None
class flywheel_migration.deidentify.deid_field.DeIdHashField(fieldname, is_regex=False, dry=False)

Bases: DeIdField

Action to replace a field with its hashed value.

get_value(profile, state, record)

Get the transformed value, given profile state and record.

key = 'hash'
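
As a rough illustration of what a hash action does, the sketch below hashes a string and truncates the hex digest. It is an assumption-laden sketch, not the library's implementation: `hash_value` is a hypothetical helper, its defaults borrow the `hash_algorithm = 'sha256'` and `hash_digits = 16` values listed for the file profiles, and treating `digits == 0` as "keep the full digest" is an assumption.

```python
import hashlib


def hash_value(value, algorithm="sha256", digits=16):
    """Hash a string and optionally truncate the hex digest.

    Assumption: digits == 0 means the full digest is kept.
    """
    digest = hashlib.new(algorithm, value.encode("utf-8")).hexdigest()
    return digest[:digits] if digits else digest


short = hash_value("PatientID-0001")
full = hash_value("PatientID-0001", digits=0)
print(len(short), len(full))  # 16 64
```

The same input always yields the same output, which is what lets hashed identifiers stay consistent across files.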
class flywheel_migration.deidentify.deid_field.DeIdHashUIDField(fieldname, is_regex=False, dry=False)

Bases: DeIdField

Action to replace a UID field with its hashed value.

get_value(profile, state, record)

Get the transformed value, given profile state and record.

key = 'hashuid'
class flywheel_migration.deidentify.deid_field.DeIdIdentityField(*args, **kwargs)

Bases: DeIdField

Action to do nothing on a field. Same as keep action. To be deprecated.

deidentify(profile, state, record)

Do nothing.

Use in fieldname section, regex-sub and with remove-undefined action.

get_value(profile, state, record)

Get the transformed value, given profile state and record.

key = 'identity'
class flywheel_migration.deidentify.deid_field.DeIdIncrementDateField(fieldname, **kwargs)

Bases: DeIdField

Action to replace a field with its incremented date.

get_value(profile, state, record)

Get the transformed value, given profile state and record.

key = 'increment-date'
load_config(config)

Load rule specific settings from configuration dictionary.

local_to_config(config)

Convert rule specific settings to configuration dictionary.
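
A date increment can be sketched with the standard library alone. This is a hedged example: `shift_date` and its `days` parameter are hypothetical, and the default format is the `date_format = '%Y%m%d'` value listed for FileProfile.

```python
from datetime import datetime, timedelta


def shift_date(value, days, date_format="%Y%m%d"):
    """Parse a date string, shift it by `days`, and format it back."""
    parsed = datetime.strptime(value, date_format)
    return (parsed + timedelta(days=days)).strftime(date_format)


print(shift_date("20200130", 3))    # 20200202
print(shift_date("20200130", -30))  # 20191231
```

Applying the same offset to every date in a record preserves the intervals between events while hiding the true dates.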

class flywheel_migration.deidentify.deid_field.DeIdIncrementDateTimeField(fieldname, **kwargs)

Bases: DeIdField

Action to replace a field with its incremented datetime.

get_value(profile, state, record)

Get the transformed value, given profile state and record.

key = 'increment-datetime'
load_config(config)

Load rule specific settings from configuration dictionary.

local_to_config(config)

Convert rule specific settings to configuration dictionary.

class flywheel_migration.deidentify.deid_field.DeIdJitterField(fieldname, **kwargs)

Bases: DeIdField

Action to jitter a field with some random number from a uniform distribution on a range.

get_value(profile, state, record)

Get the transformed value, given profile state and record.

key = 'jitter'
load_config(config)

Load rule specific settings from configuration dictionary.

local_to_config(config)

Convert rule specific settings to configuration dictionary.
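
The jitter idea can be sketched as below. The details are assumptions rather than the library's behavior: the range is taken to be symmetric around the original value, the `'int'` type is hypothetical (this reference only shows `jitter_type = 'float'`), and `jitter` with its `rng` parameter is an illustrative helper.

```python
import random


def jitter(value, jitter_range=2, jitter_type="float", rng=random):
    """Add a uniform random offset in [-jitter_range, jitter_range] to value.

    Assumptions: symmetric range; 'int' rounds the result to an integer.
    """
    result = value + rng.uniform(-jitter_range, jitter_range)
    return int(round(result)) if jitter_type == "int" else result


rng = random.Random(0)  # seeded generator so runs are repeatable
print(jitter(50.0, rng=rng))
```

Passing a seeded `random.Random` instance keeps tests deterministic without touching the global generator.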

class flywheel_migration.deidentify.deid_field.DeIdKeepField(fieldname, is_regex=False, dry=False)

Bases: DeIdField

Action to do nothing on a field.

deidentify(profile, state, record)

Do nothing.

Use in fieldname section, regex-sub and with remove-undefined action.

get_value(profile, state, record)

Get the transformed value, given profile state and record.

key = 'keep'
class flywheel_migration.deidentify.deid_field.DeIdRegexSubField(fieldname, **kwargs)

Bases: DeIdField

Action to edit a string matching a regex with capture groups.

get_value(profile, state, record)

Get the transformed value, given profile state and record.

key = 'regex-sub'
load_config(config)

Load rule specific settings from configuration dictionary.

local_to_config(config)

Convert rule specific settings to configuration dictionary.

class flywheel_migration.deidentify.deid_field.DeIdRegexSubListItem(config)

Bases: object

Class for representing a list item within DeIdRegexSubField.

format_output(val_dict)

Format output according to output_map.

get_invalid_output_vars()

Return a list of invalid output_vars.

is_capture_group(var_name)

Return True if the varname matches a named capture group in self.input_regex.

output_dot_replace_char = '___'
regex_matches_field_value(value)

Return True if the value matches the regex, else False.

to_config()

Convert to configuration dictionary.

var_name_is_valid(var_name)

Return True if the varname is a capture group or is defined in self.group_dict, False otherwise.
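
The core of a regex-sub item, matching an input regex with named capture groups and formatting an output string from them, can be sketched as follows. `regex_sub` is a hypothetical helper; the library's item additionally supports variables defined in `group_dict`, which this sketch leaves out.

```python
import re


def regex_sub(value, input_regex, output):
    """Fill `output` with named groups captured by `input_regex`.

    Returns None when the value does not match the regex.
    """
    match = re.match(input_regex, value)
    if match is None:
        return None
    return output.format(**match.groupdict())


pattern = r"Visit (?P<year>\d{4})-\d{2}-\d{2} site (?P<site>\w+)"
print(regex_sub("Visit 2020-01-30 site A", pattern, "{site}-{year}"))  # A-2020
print(regex_sub("unrelated text", pattern, "{site}-{year}"))           # None
```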

class flywheel_migration.deidentify.deid_field.DeIdRemoveField(fieldname, is_regex=False, dry=False)

Bases: DeIdField

Action to remove a field from the record.

deidentify(profile, state, record)

Perform the update - default implementation is to do a replace.

get_value(profile, state, record)

Get the transformed value, given profile state and record.

key = 'remove'
class flywheel_migration.deidentify.deid_field.DeIdReplaceField(fieldname, **kwargs)

Bases: DeIdField

Action to replace a field from the record.

get_value(profile, state, record)

Get the transformed value, given profile state and record.

key = 'replace-with'
load_config(config)

Load rule specific settings from configuration dictionary.

local_to_config(config)

Convert rule specific settings to configuration dictionary.

File Profile

Individual file/packfile profile for de-identification.

class flywheel_migration.deidentify.file_profile.FileProfile(packfile_type=None, file_filter=None)

Bases: object

Abstract class that represents a single file/packfile profile.

add_field(field)

Add a field to de-identify.

add_log(log)

Set the log instance.

alter_pixels(state, src_fs, path)

Alter pixels for given file.

Return None to do no preloading, return new tempfs to perform subsequent actions on tempfile.

cleanup(state)

Perform any final cleaning up actions.

create_file_state()

Create state object for processing files.

date_format = '%Y%m%d'
datetime_format = '%Y%m%d%H%M%S.%f'
datetime_has_timezone = True
default_filenames = []
deidfield_mixin = None
classmethod factory(name, config=None, log=None)

Create a new file profile instance for the given name.

Parameters:
  • name (str) – The name of the profile type

  • config (dict) – The optional configuration dictionary

  • log – The optional de-id log instance

filename_field_prefix = '_fwmtk'
get_dest_path(state, record, path)

Get destination path.

get_log_entry(path, entry_type, state, record)

Returns a dictionary with key/value corresponding to log entry and the logged fields.

get_log_fields()

Return the full set of fieldnames that should be logged.

classmethod get_subclasses()

Returns all subclasses, not only the immediate ones.

get_value(state, record, fieldname)

Get the transformed value for fieldname.

has_field(var_fieldname)

Returns True if var_fieldname is defined in field_map or a regex field matches var_fieldname, else returns False.

hash_algorithm = 'sha256'
hash_digits = 0
jitter_range = 2
jitter_type = 'float'
load_config(config)

Read configuration from a dictionary.

abstract load_record(state, src_fs, path)

Load the record(file) at path, return None to ignore this file.

log_fields = []
matches_file(filename)

Check if this profile can process the given file.
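
The concrete profiles in this reference list glob-style `default_file_filter` values such as `['*.png', '*.PNG']`. Assuming the filter is applied with shell-style matching (an assumption, not confirmed here), the check can be sketched as:

```python
from fnmatch import fnmatchcase


def matches_file(filename, file_filter):
    """Return True if filename matches any glob pattern in file_filter."""
    return any(fnmatchcase(filename, pattern) for pattern in file_filter)


print(matches_file("scan.png", ["*.png", "*.PNG"]))  # True
print(matches_file("scan.jpg", ["*.png", "*.PNG"]))  # False
```

`fnmatchcase` is used so that matching is case-sensitive on every platform, which is consistent with the filters listing both lower- and upper-case extensions.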

matches_packfile(packfile_type)

Check if this profile can process the given packfile.

name = None
process_files(src_fs, dst_fs, files, callback=None)

Process all files in the file list, performing de-identification steps.

Parameters:
  • src_fs – The source filesystem (Provides open function)

  • dst_fs – The destination filesystem

  • files – The set of files in src_fs to process

  • callback – Function to call after writing each file

classmethod profile_names()

Get the list of profile names.

abstract read_field(state, record, fieldname)

Read the named field as a string. Return None if field cannot be read.

regex_compatible = False
abstract remove_field(state, record, fieldname)

Remove the named field from the record.

abstract replace_field(state, record, fieldname, value)

Replace the named field with value in the record.

replace_with_insert = True
sanitize_filename = True
abstract save_record(state, record, dst_fs, path)

Save the record to the destination path.

set_filenames_attributes(record, path)

Update record object with private attributes based on filenames properties.

Record attributes are extended based on <groups> extracted from the <input-regex>. For instance, the following filenames schema defined in the profile:

filenames:
    - output: '{group1}.ext'
      input-regex: '^(?P<group1>[\w]+).ext$'
    - output: '{group1}-{date1}.ext'
      input-regex: '^(?P<group1>[\w]+)-(?P<date1>[\d]+).ext$'

will create attributes, depending on which input-regex matches, as:

# for `path` = test.ext
record.<self.filename_field_prefix>_filename0_group1 = 'test'
# for `path` = test-20200130.ext
record.<self.filename_field_prefix>_filename1_group1 = 'test'
# for `path` = test-20200130.ext
record.<self.filename_field_prefix>_filename1_date1 = '20200130'
Parameters:
  • record (object) – A record

  • path (str) – basename of input file
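
The attribute naming shown above can be reproduced with a short sketch. It is illustrative only: `filename_attributes` and `FILENAME_RULES` are hypothetical, the regexes are copied from the example as written, and the result is returned as a dictionary instead of being set on a record.

```python
import re

PREFIX = "_fwmtk"  # value of filename_field_prefix in this reference

# Stand-in for the input-regex entries of the `filenames` section.
FILENAME_RULES = [
    r"^(?P<group1>[\w]+).ext$",
    r"^(?P<group1>[\w]+)-(?P<date1>[\d]+).ext$",
]


def filename_attributes(path):
    """Map each named group of every matching rule to an attribute name."""
    attrs = {}
    for index, pattern in enumerate(FILENAME_RULES):
        match = re.match(pattern, path)
        if match:
            for group, value in match.groupdict().items():
                attrs[f"{PREFIX}_filename{index}_{group}"] = value
    return attrs


print(filename_attributes("test.ext"))
# {'_fwmtk_filename0_group1': 'test'}
print(filename_attributes("test-20200130.ext"))
# {'_fwmtk_filename1_group1': 'test', '_fwmtk_filename1_date1': '20200130'}
```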

set_log(log)

Set the log instance.

static sort_fields(field_list)

Sort field_list such that regex-sub fields are first.
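
One way to get this ordering is a stable sort on a boolean key. The sketch below uses plain dictionaries as stand-ins for field objects (an assumption; the library sorts DeIdField instances).

```python
def sort_fields(field_list):
    """Return fields with regex-sub entries first.

    sorted() is stable, so the relative order inside each group is kept.
    """
    return sorted(field_list, key=lambda field: field["key"] != "regex-sub")


fields = [
    {"name": "PatientID", "key": "hash"},
    {"name": "SeriesDescription", "key": "regex-sub"},
    {"name": "PatientName", "key": "remove"},
]
print([f["name"] for f in sort_fields(fields)])
# ['SeriesDescription', 'PatientID', 'PatientName']
```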

to_config()

Get configuration as a dictionary.

uid_default_prefix_fields = 4
uid_default_suffix_fields = 1
uid_hash_fields = (6, 6, 6, 6, 6, 6)
uid_max_suffix_digits = 6
validate(enhanced=False)

Validate the profile, returning any errors.

Parameters:

enhanced (bool) – Perform a deeper validation if supported

Returns:

A list of error messages, or an empty list

Return type:

list(str)

write_log_entry(path, entry_type, state, record)

Write a single log entry of type for path.

DICOM File Profile

File profile for de-identifying dicom files.

class flywheel_migration.deidentify.dicom_file_profile.DicomDeIdFieldMixin

Bases: DeIdFieldMixin

Mixin to add functionality to DeIdField for Dicom profile.

deidentify(profile, state, record)

Deidentifies depending on field type.

flavor = 'Dicom'
list_fieldname(record)

Returns a list of fieldnames for record depending on field type.

recurse_sequence = False
class flywheel_migration.deidentify.dicom_file_profile.DicomFileProfile

Bases: FileProfile

Dicom implementation of load/save and remove/replace fields.

add_field(field)

Add a field to de-identify.

alter_pixels(state, src_fs, path)

Alter pixels for given file.

Return None to do no preloading, return new tempfs to perform subsequent actions on tempfile.

cleanup(state)

Remove deid profile for dicom cleaner.

create_file_state()

Create state object for processing files.

decode = True
deidfield_mixin

alias of DicomDeIdFieldMixin

get_data_element(record, fieldname)

Returns data element in record at fieldname.

get_data_element_VR(record, fieldname)

Returns data element VR in record at fieldname.

get_dest_path(state, record, path)

Returns a default name based on SOPInstanceUID, or one based on the profile if defined.

hash_digits = 16
load_config(config)

Read configuration from a dictionary.

load_record(state, src_fs, path)

Load the record(file) at path, return None to ignore this file.

log_fields = ['StudyInstanceUID', 'SeriesInstanceUID', 'SOPInstanceUID']
name = 'dicom'
parse_pixel_actions()
process_files(*args, **kwargs)

Process all files in the file list, performing de-identification steps.

Parameters:
  • src_fs – The source filesystem (Provides open function)

  • dst_fs – The destination filesystem

  • files – The set of files in src_fs to process

  • callback – Function to call after writing each file

read_field(state, record, fieldname)

Read the named field as a string. Return None if field cannot be read.

recurse_sequence = False
regex_compatible = True
remove_field(state, record, fieldname)

Remove the named field from the record.

remove_undefined = False
remove_undefined_fields(state, record)

Remove data elements not defined in fields.

replace_field(state, record, fieldname, value)

Replace the named field with value in the record.

save_record(state, record, dst_fs, path)

Save the record to the destination path.

to_config()

Get configuration as a dictionary.

validate(enhanced=False)

Validate the profile, returning any errors.

Parameters:

enhanced (bool) – If True, test profile execution on a set of test files

Returns:

A list of error messages, or an empty list

Return type:

list(str)

validate_filenames(errors)

Validates the filename section of the profile.

Parameters:

errors (list) – Current list of error messages

Returns:

Extended list of error messages

Return type:

(list)

class flywheel_migration.deidentify.dicom_file_profile.DicomTagStr(value, *_args, **_kwargs)

Bases: str

Subclass of str that hosts attributes/methods to handle the different ways a field can reference DICOM data element(s).

property dicom_tag
property is_flat

Return True for ‘flat’ fieldname (map to a single tag), False otherwise.

property is_private
property is_repeater
property is_sequence
property is_wild_sequence
parse_field_name(name)

Parse the field name and return the parsed result.

Parameters:

name (str) – The field name.

Returns:

Depending on name.

Return type:

(list or Tag)

Raises:

ValueError – if name matches multiple fieldname definition types.

parsers_method_prefix = '_parse'

PNG File Profile

File profile for de-identifying files storing Exif metadata such as JPEG. More on Exif at https://en.wikipedia.org/wiki/Exif.

class flywheel_migration.deidentify.png_file_profile.ChunkStr(value, *_args, **_kwargs)

Bases: str

Subclass of string with a few extra attributes related to PNG chunks.

class flywheel_migration.deidentify.png_file_profile.PNGFileProfile(file_filter=None)

Bases: FileProfile

PNG implementation of load/save and remove/replace fields.

add_field(field)

Add field to profile.

create_file_state()

Create state object for processing files.

default_file_filter = ['*.png', '*.PNG']
default_output_format = 'PNG'
hash_digits = 16
load_config(config)

Read configuration from a dictionary.

load_record(state, src_fs, path)

Load the record(file) at path, return None to ignore this file.

log_fields = []
name = 'png'
read_field(state, record, fieldname)

Read field from record.

remove_field(state, record, fieldname)

Remove the named field from the record.

replace_field(state, record, fieldname, value)

Replace the named field with value in the record.

save_record(state, record, dst_fs, path)

Save the record to the destination path.

to_config()

Get configuration as a dictionary.

validate(enhanced=False)

Validate the profile, returning any errors.

Parameters:

enhanced (bool) – If True, test profile execution on a set of test files

Returns:

A list of error messages, or an empty list

Return type:

list(str)

class flywheel_migration.deidentify.png_file_profile.PNGRecord(fp, mode='r')

Bases: object

A record for dealing with PNG files.

property metadata

Load Exif metadata.

mime_type = 'image/png'
save_as(fp, file_type='PNG')

Save deid image.

Parameters:
  • fp – A file object

  • file_type – Image format to save as

validate()

Validate image against the expected type.

JPG File Profile

File profile for de-identifying files storing Exif metadata such as JPEG. More on Exif at https://en.wikipedia.org/wiki/Exif.

class flywheel_migration.deidentify.jpg_file_profile.ExifTagStr(value, *_args, **_kwargs)

Bases: str

Subclass of string with a few extra attributes related to exif.

class flywheel_migration.deidentify.jpg_file_profile.JPGFileProfile(file_filter=None)

Bases: FileProfile

Exif implementation of load/save and remove/replace fields.

Human readable tags are leveraged from piexif.TAGS

add_field(field)

Add field to profile.

Fields whose keyword is found in multiple data blocks (i.e. Exif, IFD0 and IFD1) are duplicated.

create_file_state()

Create state object for processing files.

datetime_format = '%Y:%m:%d %H:%M:%S'
default_file_filter = ['*.jpg', '*.jpeg', '*.JPG', '*.JPEG']
default_output_format = 'JPEG'
hash_digits = 16
load_config(config)

Read configuration from a dictionary.

load_record(state, src_fs, path)

Load the record(file) at path, return None to ignore this file.

log_fields = []
name = 'jpg'
read_field(state, record, fieldname)

Read field from record.

record_class

alias of JPGRecord

remove_field(state, record, fieldname)

Remove the named field from the record.

replace_field(state, record, fieldname, value)

Replace the named field with value in the record.

save_record(state, record, dst_fs, path)

Save the record to the destination path.

to_config()

Get configuration as a dictionary.

validate(enhanced=False)

Validate the profile, returning any errors.

Parameters:

enhanced (bool) – If True, test profile execution on a set of test files

Returns:

A list of error messages, or an empty list

Return type:

list(str)

class flywheel_migration.deidentify.jpg_file_profile.JPGRecord(fp, mode='r')

Bases: object

A record for dealing with JPG files.

file_type = 'JPEG'
property metadata

Load Exif metadata.

mime_type = 'image/jpeg'
save_as(fp, file_type=None)

Save deid image.

Parameters:
  • fp – A file object

  • file_type – Image format to save as

validate()

Validate image against the expected type.

TIFF File Profile

File profile for de-identifying TIFF files.

class flywheel_migration.deidentify.tiff_file_profile.IFDTagStr(value, *_args, **_kwargs)

Bases: str

Subclass of string with a few extra attributes related to metadata.

class flywheel_migration.deidentify.tiff_file_profile.TIFFFileProfile(file_filter=None)

Bases: FileProfile

TIFF implementation of load/save and remove/replace fields.

Human readable tags are leveraged from PIL.TiffTags.TAGS_V2

add_field(field)

Add field to profile.

create_file_state()

Create state object for processing files.

datetime_format = '%Y:%m:%d %H:%M:%S'
default_file_filter = ['*.tif', '*.tiff', '*.TIF', '*.TIFF']
default_output_format = 'TIFF'
hash_digits = 16
load_config(config)

Read configuration from a dictionary.

load_record(state, src_fs, path)

Load the record(file) at path, return None to ignore this file.

log_fields = []
name = 'tiff'
private_tags_lower_bound = 32768
read_field(state, record, fieldname)

Read field from record.

record_class

alias of TIFFRecord

remove_field(state, record, fieldname)

Remove the named field from the record.

replace_field(state, record, fieldname, value)

Replace the named field with value in the record.

save_record(state, record, dst_fs, path)

Save the record to the destination path.

to_config()

Get configuration as a dictionary.

validate(enhanced=False)

Validate the profile, returning any errors.

Parameters:

enhanced (bool) – If True, test profile execution on a set of test files

Returns:

A list of error messages, or an empty list

Return type:

list(str)

class flywheel_migration.deidentify.tiff_file_profile.TIFFRecord(fp, mode='r')

Bases: object

A record for dealing with TIFF files.

file_type = 'TIFF'
property metadata

Load metadata.

mime_type = 'image/tiff'
save_as(filepath, file_type=None, **kwargs)

Save deid image.

Parameters:
  • filepath – A file path

  • file_type – Image format to save as

validate()

Validate image against the expected type.

KEY/VALUE File Profile

File profile for de-identifying text files with lines that contain string pattern-delimited key-value pairs.

class flywheel_migration.deidentify.key_value_text_file_profile.KeyValueTextFileLine(line, delimiter)

Bases: object

Represents a parsed line from key-value text file.

get_output_line()

Get the string representation of line with output_value.

parse_line()

Parses self.input_line to determine delimiter_value, key, and input_value.

set_value(value)

Sets self.output_value to value.
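
The parsing described above, splitting a line into key, matched delimiter, and value, then rebuilding it with a new value, can be sketched like this. `parse_line` and `replace_value` are hypothetical helpers, and treating the delimiter as a regular expression is an assumption based on the "string pattern-delimited" wording.

```python
import re


def parse_line(line, delimiter):
    """Split a line into (key, delimiter_value, value), or None if no delimiter."""
    match = re.search(delimiter, line)
    if match is None:
        return None
    return line[:match.start()], match.group(0), line[match.end():]


def replace_value(line, delimiter, new_value):
    """Rebuild the line with new_value, keeping the original delimiter text."""
    parsed = parse_line(line, delimiter)
    if parsed is None:
        return line  # lines without a delimiter are passed through unchanged
    key, delimiter_value, _ = parsed
    return f"{key}{delimiter_value}{new_value}"


print(parse_line("PatientName = Doe^John", r"\s*=\s*"))
# ('PatientName', ' = ', 'Doe^John')
print(replace_value("PatientName = Doe^John", r"\s*=\s*", "REDACTED"))
# PatientName = REDACTED
```

Keeping the matched delimiter text means the output line preserves the original spacing around the separator.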

class flywheel_migration.deidentify.key_value_text_file_profile.KeyValueTextFileProfile(file_filter=None)

Bases: FileProfile

key-value text file implementation of load/save and remove/replace fields.

default_file_filter = ['*.MHD', '*.mhd']
hash_digits = 16
load_config(config)

Read configuration from a dictionary.

load_record(state, src_fs, path)

Load the record(file) at path, return None to ignore this file.

name = 'key-value-text-file'
read_field(state, record, fieldname)

Read the named field as a string. Return None if field cannot be read.

remove_field(state, record, fieldname)

Remove the named field from the record.

replace_field(state, record, fieldname, value)

Replace the named field with value in the record.

save_record(state, record, dst_fs, path)

Save the record to the destination path.

to_config()

Get configuration as a dictionary.

validate(enhanced=False)

Validate the profile, returning any errors.

Parameters:

enhanced (bool) – Performed a deeper validation if supported

Returns:

A list of error messages, or an empty list

Return type:

list(str)

class flywheel_migration.deidentify.key_value_text_file_profile.KeyValueTextFileRecord(file_object, delimiter, ignore_bad_lines)

Bases: object

Represents a text file where each line is a key-value pair delimited by delimiter.

insert_key(key, value)

Prepares a new line object given a key and value and adds it to self.line_dict.

parse_lines(file_object, ignore_bad_lines)

Parses the lines in file_object into self._line_dict.

save_as(file_object)

Save text file.

flywheel_migration.deidentify.key_value_text_file_profile.encoding_supported(enc)

Returns boolean indicating whether encoding string is supported.

JSON File Profile

File profile for de-identifying JSON files.

class flywheel_migration.deidentify.json_file_profile.JSONFileProfile(file_filter=None)

Bases: FileProfile

JSON implementation of load/save and remove/replace fields.

add_field(field)

Add a field to de-identify.

date_format = '%Y-%m-%d'
datetime_format = '%Y-%m-%d %H:%M:%S'
default_file_filter = ['*.json', '*.JSON']
hash_digits = 16
load_config(config)

Read configuration from a dictionary.

load_record(state, src_fs, path)

Load the record(file) at path, return None to ignore this file.

log_fields = []
name = 'json'
read_field(state, record, fieldname)

Read field from record.

record_class

alias of JSONRecord

regex_compatible = True
remove_field(state, record, fieldname)

Remove the named field from the record.

replace_field(state, record, fieldname, value)

Replace the named field with value in the record.

save_record(state, record, dst_fs, path)

Save the record to the destination path.

separator = '.'
class flywheel_migration.deidentify.json_file_profile.JSONRecord(fp, data=None, separator=None)

Bases: object

A record for dealing with JSON files.

default_separator = '.'
file_type = 'JSON'
classmethod from_dict(data, separator=None)

Instantiate record from a dictionary.

get_all_dotty_paths()

Returns a list of strings for all accessible paths in the record, in dotty dict notation.
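
Dotted-path notation addresses nested keys by joining them with the separator. The sketch below flattens a nested dictionary into such paths without any third-party package. It rests on assumptions: `dotty_paths` is a hypothetical helper, only leaf paths are returned, and lists are treated as leaves, which may differ from what the library returns.

```python
def dotty_paths(data, separator=".", prefix=""):
    """Return the dotted path of every leaf value in a nested dict."""
    paths = []
    for key, value in data.items():
        path = f"{prefix}{separator}{key}" if prefix else str(key)
        if isinstance(value, dict) and value:
            # Recurse into non-empty nested dictionaries.
            paths.extend(dotty_paths(value, separator, path))
        else:
            paths.append(path)
    return paths


record = {"patient": {"name": "Doe", "dob": "1970-01-01"}, "site": "A"}
print(dotty_paths(record))
# ['patient.name', 'patient.dob', 'site']
```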

items()

Iterate over key, value.

keys()

List keys in data model.

pop(key)

Pop element from data model.

save_as(fp)

Save de-id as json.

property separator

Returns separator used in Dotty.

to_dict()

Export record as dictionary.

values()

List values in data model.

XML File Profile

File profile for de-identifying XML files.

class flywheel_migration.deidentify.xml_file_profile.XMLFileProfile(file_filter=None)

Bases: FileProfile

XML implementation of load/save and remove/replace fields.

add_field(field)

Add field to profile.

create_file_state()

Create state object for processing files.

default_file_filter = ['*.XML', '*.xml']
hash_digits = 16
load_record(state, src_fs, path)

Load the record(file) at path, return None to ignore this file.

log_fields = []
name = 'xml'
read_field(state, record, fieldname)

Read field from record.

record_class

alias of XMLRecord

remove_field(state, record, fieldname)

Remove the named field from the record.

replace_field(state, record, fieldname, value)

Replace the named field with value in the record.

save_record(state, record, dst_fs, path)

Save the record to the destination path.

class flywheel_migration.deidentify.xml_file_profile.XMLRecord(fp)

Bases: object

A record for dealing with XML file.

This is a dumb container class that allows storing arbitrary attributes, because lxml.etree._ElementTree does not allow it (it inherits from a custom class that seems to prohibit it).

save_as(fp)

Save xml tree.

class flywheel_migration.deidentify.xml_file_profile.XPathStr(value, *_args, **_kwargs)

Bases: str

Subclass of string with a few extra attributes related to xml.

flywheel_migration.deidentify.xml_file_profile.parse_fieldname(name)

Parse the given string to determine if it’s XPath compatible.

Parameters:

name (str) – The XPath expression

Returns:

XPathStr

Table File Profile

File profiles for de-identifying table-like files such as CSV and TSV.

class flywheel_migration.deidentify.table_file_profile.CSVFileProfile(file_filter=None)

Bases: TableFileProfile

FileProfile class for CSV files.

default_file_filter = ['.csv', '.CSV']
delimiter = ','
name = 'csv'
reader = 'csv'
class flywheel_migration.deidentify.table_file_profile.TSVFileProfile(file_filter=None)

Bases: TableFileProfile

FileProfile class for TSV files.

default_file_filter = ['.tsv', '.TSV']
delimiter = '\t'
name = 'tsv'
reader = 'csv'
class flywheel_migration.deidentify.table_file_profile.TableFileProfile(file_filter=None)

Bases: FileProfile

FileProfile subclass for tables (e.g. CSV, TSV) that de-identifies columns.

add_field(field)

Add a field to de-identify.

default_file_filter = None
delimiter = None
hash_digits = 16
load_config(config)

Read configuration from a dictionary.

load_record(state, src_fs, path)

Load the record(file) at path, return None to ignore this file.

name = 'table'
read_field(state, record, fieldname)

Read the named field as a string. Return None if field cannot be read.

reader = None
record_class

alias of TableRecord

remove_field(state, record, fieldname)

Remove the named field from the record.

replace_field(state, record, fieldname, value)

Replace the named field with value in the record.

save_record(state, record, dst_fs, path)

Save the record to the destination path.

to_config()

Get configuration as a dictionary.

validate(enhanced=False)

Validate the profile, returning any errors.

Parameters:

enhanced (bool) – Perform a deeper validation if supported

Returns:

A list of error messages, or an empty list

Return type:

list(str)

class flywheel_migration.deidentify.table_file_profile.TableRecord(fp, reader=None)

Bases: object

A record to deal with tabular data.

property columns

Return the columns of the dataframe.

save_as(fp, to=None)

Save record to file buffer.
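
Column-wise de-identification of a delimited table can be sketched with the standard `csv` module. This swaps in `csv` for the dataframe the record holds, purely to keep the example dependency-free; `replace_column` is a hypothetical helper and not part of the library.

```python
import csv
import io


def replace_column(text, column, value, delimiter=","):
    """Return the table text with every cell of `column` set to `value`."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    fieldnames = reader.fieldnames or []
    if column not in fieldnames:
        raise KeyError(column)
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        row[column] = value  # overwrite the identifying column
        writer.writerow(row)
    return out.getvalue()


print(replace_column("id,name\n1,Doe\n2,Roe\n", "name", "REDACTED"), end="")
# id,name
# 1,REDACTED
# 2,REDACTED
```

Passing `delimiter="\t"` handles the TSV case with the same code.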

Exceptions

Provides validation error for deid templates.

exception flywheel_migration.deidentify.exceptions.ValidationError(path, errors)

Bases: Exception

Indicates that the profile is invalid.

path

The path to the deid profile

Type:

str

errors

The list of error messages

Type:

list(str)
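
The pattern of turning a non-empty list of validation messages into an exception can be sketched as follows. The class below is a stand-in that copies only the documented `(path, errors)` signature and attributes; the message format and the `check_profile` helper are assumptions for illustration.

```python
class ValidationError(Exception):
    """Stand-in with the documented (path, errors) signature."""

    def __init__(self, path, errors):
        # The message format here is illustrative, not the library's.
        super().__init__(f"{path}: {len(errors)} error(s)")
        self.path = path
        self.errors = errors


def check_profile(path, errors):
    """Raise ValidationError when a validate() call returned any messages."""
    if errors:
        raise ValidationError(path, errors)


try:
    check_profile("profile.yaml", ["unknown action for field PatientID"])
except ValidationError as exc:
    print(exc.path, exc.errors)
```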