Title: Upload data to a Flywheel project
Date: April 28th 2020
Description:
This notebook shows how to upload data to a new project using the Flywheel SDK.

Topics that will be covered:

Requirements

Install and Import Dependencies

Download some test data

In this notebook we will be uploading images to a Flywheel instance.
To get started, you first need to download the test dataset that will be used in this notebook.

On mybinder.org or any Mac/Linux system, the following commands will download a zip archive and unzip the data into a folder called data-upload-notebook in your current directory:

If the previous commands return an error, download the file directly using the link provided to the curl command above and extract the archive in the current working directory to a folder named data-upload-notebook.
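If you downloaded the archive by hand, the extraction step can also be done from Python. The following is a minimal sketch, not part of the original notebook: the function name extract_archive is hypothetical, and it assumes the archive does not already contain a top-level data-upload-notebook folder (if it does, extract into the current directory instead).

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, target_dir="data-upload-notebook"):
    """Extract a zip archive into target_dir and return the extracted file paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    # Return files only (directories are skipped), sorted for stable output.
    return sorted(p for p in target.rglob("*") if p.is_file())
```

Calling extract_archive with the path of the downloaded zip file returns the list of extracted files, which you can compare against the expected file tree.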

The file tree of data-upload-notebook should look like this:

data-upload-notebook
├── anx_s1
│   └── anx_s1_anx_ses1_protA
│       └── T1_high-res_inplane_Ret_knk_0
│           └── 6879_3_1_t1.dcm.zip
├── anx_s2
│   └── anx_s2_anx_ses1_protA
│       └── T1\ high-res\ inplane\ FSPGR\ BRAVO_0
│           └── 4784_3_1_t1.dcm.zip
├── anx_s3
│   └── anx_s3_anx_ses1_protA
│       └── T1_high-res_inplane_Ret_knk_0
│           └── 6879_3_1_t1.dcm.zip
├── anx_s4
│   └── anx_s4_anx_ses2_protB
│       └── T1_high-res_inplane_Ret_knk_1
│           └── 8403_4_1_t1.dcm.zip
├── anx_s5
│   └── anx_s5_anx_ses1_protA
│       └── T1_high-res_inplane_Ret_knk_1
│           └── 8403_4_1_t1.dcm.zip
└── participants.csv

Flywheel API Key and Client

Get your API_KEY. See the Flywheel SDK documentation for more on this.

Instantiate the Flywheel API client
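A minimal sketch of this step, assuming the API key is stored in an environment variable. The variable name FW_API_KEY and both helper names are hypothetical; flywheel.Client is the SDK's client constructor and requires the flywheel-sdk package to be installed.

```python
import os


def read_api_key(env=os.environ, var_name="FW_API_KEY"):
    """Return the API key stored in an environment variable (name is an assumption)."""
    api_key = env.get(var_name, "").strip()
    if not api_key:
        raise ValueError(f"Set {var_name} to your Flywheel API key first")
    return api_key


def make_client(api_key):
    """Create the SDK client for the given API key."""
    import flywheel  # imported lazily so read_api_key works without the SDK

    return flywheel.Client(api_key)
```

With the key in place, fw = make_client(read_api_key()) gives you the client object used in the rest of the notebook. Reading the key from the environment keeps it out of the notebook itself.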

Show Flywheel logging information
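One possible way to set this up, as a sketch: configure the standard logging module and log which user the client is authenticated as. The logger name and the log_current_user helper are hypothetical; fw is assumed to be an instantiated Flywheel client, whose get_current_user method returns the logged-in user.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
log = logging.getLogger("upload-notebook")  # hypothetical logger name
log.setLevel(logging.INFO)


def log_current_user(fw, logger=log):
    """Log and return the email of the user the client is logged in as."""
    user = fw.get_current_user()
    logger.info("You are now logged in as %s", user.email)
    return user.email
```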

Understand the Flywheel Hierarchy

The Flywheel data model relies on hierarchical containers. You can read more about Flywheel containers in our documentation.
In Flywheel, projects are structured into the following hierarchy:

Group
└── Project
    └── Subject 
        └── Session
            └── Acquisition

Project, Subject, Session and Acquisition are each containers. Containers share common properties such as the ability to store files, metadata or analyses.

In this notebook we will be:

  1. Creating the Project to host our data.
  2. Creating the hierarchy of Subject/Session/Acquisition matching our data input.
  3. Uploading the DICOM archive to each Acquisition.
  4. Showing how to update metadata of a container.

Initialize a few values

In this notebook, we will be uploading data to a Project. The label of the Project will be defined by the PROJECT_LABEL variable defined below. Here we set it up to be AnxietyStudy01 but feel free to change it to something that makes more sense to you.

In Flywheel each project belongs to a Group. The label of the Group that will be used to create the Project is defined by the GROUP_LABEL variable below.

To be able to create a Project in a Group, you must have at least read/write permission for this Group. If you don't have read/write permission on any Group, please contact your site admin.

Specify the Group you have r/w permission on and where the Project will be created:

We also define a variable that points to the root directory where the data was downloaded. If you have followed the steps above to download your data, you should have all the data in a folder called data-upload-notebook. If that's not the case, edit the variable below accordingly.
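Put together, the three values could look like the sketch below. PROJECT_LABEL and GROUP_LABEL are the variable names used in the text; the group value and the name PATH_TO_DATA are assumptions, so replace them with your own.

```python
from pathlib import Path

# Label of the Project that will be created.
PROJECT_LABEL = "AnxietyStudy01"

# Hypothetical value: use a Group you have read/write permission on.
GROUP_LABEL = "my_group"

# Hypothetical variable name: root directory of the downloaded test data.
PATH_TO_DATA = Path("data-upload-notebook")
```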

Requirements

Before starting off, we want to check your permission on the Flywheel Instance in order to proceed in this notebook.

check_user_permission will return True if the Group meets the minimum requirement; otherwise, a compatible list will be printed.

Add a New Project

In this section, we will be creating a new project with label PROJECT_LABEL in the Group GROUP_LABEL.

First, we will be getting the Group using the lookup method.

Before creating a new project, it is good practice to check whether the Project you are trying to create already exists in the Flywheel instance. We can do this by checking if a Project with label=PROJECT_LABEL exists in the Group you have specified:

If the Project does not exist, the check returns False and we can create it. If a Project is found, the check returns the Project; in that case, either change PROJECT_LABEL to create a new project or make sure that the data you are about to upload will not interfere with the data already present in the Project.
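The lookup, the existence check and the creation can be combined into one helper. This is a sketch rather than the notebook's exact code: the function name get_or_create_project is hypothetical, and fw is assumed to be an instantiated Flywheel client. It relies on the SDK's lookup, projects.find_first and add_project calls.

```python
def get_or_create_project(fw, group_label, project_label):
    """Return (project, created) for the Project labelled project_label in the Group."""
    # Resolve the Group container from its label.
    group = fw.lookup(group_label)
    # find_first returns None when no Project matches the filter.
    project = group.projects.find_first(f"label={project_label}")
    if project:
        return project, False
    return group.add_project(label=project_label), True
```

The second return value tells you whether the Project already existed, which is the case where you should double-check that your upload will not clash with existing data.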

Modify Project Gear Rules

After the new Project is created, we will disable its Gear Rules for demo purposes.

First, we use get_project_rules to get a list of all rules for a project.

If no Gear Rules exist, gear_rules will return False. If Gear Rules are set up in the project, it will return True, and each Gear Rule with disabled == False will be disabled.
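The logic described above can be sketched as follows. The helper name disable_gear_rules is hypothetical; fw is assumed to be an instantiated Flywheel client, and the sketch assumes each rule exposes id and disabled attributes and that modify_project_rule accepts a partial update.

```python
def disable_gear_rules(fw, project_id):
    """Disable every enabled Gear Rule of a project and return the ids that changed."""
    changed = []
    # An empty list means the project has no Gear Rules; the loop is then skipped.
    for rule in fw.get_project_rules(project_id):
        if not rule.disabled:
            fw.modify_project_rule(project_id, rule.id, {"disabled": True})
            changed.append(rule.id)
    return changed
```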

Create Subjects, Sessions and Acquisitions and upload files

Now that we have a Project, we can create all the containers that are required to host our dataset.

What's the plan?

Following the Flywheel Hierarchy, we will loop through each subject folder and either get the Subject if it already exists in the Project or create it if not (we will use the get_or_create_subject function below for this). We will do the same to create the Session and Acquisition containers. Once we get down to the Acquisition container, we will upload the corresponding DICOM archive to it (we will use the upload_file_to_acquistion function below for this).

Helpful Functions
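A minimal sketch of what such helpers could look like. The names get_or_create_subject and upload_file_to_acquistion are taken from the text; the Session and Acquisition helpers are hypothetical counterparts, and the bodies are an assumption built on the SDK's find_first, add_subject, add_session, add_acquisition and upload_file calls.

```python
def get_or_create_subject(project, label):
    """Return the Subject with this label, creating it in the Project if needed."""
    subject = project.subjects.find_first(f'label="{label}"')
    if not subject:
        subject = project.add_subject(label=label)
    return subject


def get_or_create_session(subject, label):
    """Return the Session with this label, creating it under the Subject if needed."""
    session = subject.sessions.find_first(f'label="{label}"')
    if not session:
        session = subject.add_session(label=label)
    return session


def get_or_create_acquisition(session, label):
    """Return the Acquisition with this label, creating it under the Session if needed."""
    acquisition = session.acquisitions.find_first(f'label="{label}"')
    if not acquisition:
        acquisition = session.add_acquisition(label=label)
    return acquisition


def upload_file_to_acquistion(acquisition, file_path):
    """Upload one local file to the Acquisition container."""
    acquisition.upload_file(str(file_path))
```

Quoting the label inside the filter keeps labels with spaces, such as the anx_s2 acquisition folder, in one piece.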

Processing

Getting ready

The files we want to upload are DICOM zip archives. Let's get a list of all of them:
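One way to collect them, sketched with pathlib. The helper name find_dicom_archives is hypothetical, and the pattern assumes every archive ends in .dcm.zip, as in the file tree shown earlier.

```python
from pathlib import Path


def find_dicom_archives(root="data-upload-notebook"):
    """Return all *.dcm.zip files below root, sorted for a stable order."""
    return sorted(Path(root).rglob("*.dcm.zip"))
```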

In this notebook we will parse the Subject, Session and Acquisition labels from the folders and subfolder path directly. If we wanted to do more, we could use a regular expression on the path (e.g. something like r'^data-upload-notebook/(?P<sub_label>[\w\d]+)/.+(?P<ses_label>ses[\d\w\_]+)/(?P<acq_label>.+)').

Tip: Use Regex101, an online regex tester and debugger, to write and test on example inputs before putting it in your code.
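Both approaches can be sketched side by side. The function names are hypothetical, and the pattern below is a tightened variant of the one quoted above, not the original: it uses [^/]+ so that the acquisition label stops at the folder boundary instead of swallowing the file name. Paths are assumed to use forward slashes.

```python
import re
from pathlib import PurePosixPath


def labels_from_path(path, root="data-upload-notebook"):
    """Read (subject, session, acquisition) labels from the folder names."""
    parts = PurePosixPath(path).relative_to(root).parts
    return parts[0], parts[1], parts[2]


PATTERN = re.compile(
    r"^data-upload-notebook/(?P<sub_label>\w+)/"
    r"[^/]*?(?P<ses_label>ses\w+)/"
    r"(?P<acq_label>[^/]+)/"
)


def labels_from_regex(path):
    """Return the named groups as a dict, or None if the path does not match."""
    match = PATTERN.match(path)
    return match.groupdict() if match else None
```

Note the difference between the two: the folder-based version keeps the full session folder name (anx_s1_anx_ses1_protA), while the regex extracts only the part starting at ses (ses1_protA).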

Getting the work done

We are now ready to walk our folders, create the containers accordingly and upload the DICOM zip archive to the Acquisition container.
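The whole walk can be sketched as a single loop. This is an illustration under assumptions, not the notebook's exact code: upload_tree and get_or_create are hypothetical names, project is assumed to be a Flywheel Project container, and every archive is assumed to sit exactly at root/subject/session/acquisition/file.

```python
from pathlib import Path


def get_or_create(finder, add, label):
    """Return the container with this label, creating it with add() if missing."""
    return finder.find_first(f'label="{label}"') or add(label=label)


def upload_tree(project, root="data-upload-notebook"):
    """Mirror root/<subject>/<session>/<acquisition>/<file> into the Project."""
    uploaded = []
    for archive in sorted(Path(root).rglob("*.dcm.zip")):
        # The first three path components below root are the container labels.
        sub_label, ses_label, acq_label = archive.relative_to(root).parts[:3]
        subject = get_or_create(project.subjects, project.add_subject, sub_label)
        session = get_or_create(subject.sessions, subject.add_session, ses_label)
        acquisition = get_or_create(
            session.acquisitions, session.add_acquisition, acq_label
        )
        acquisition.upload_file(str(archive))
        uploaded.append((sub_label, ses_label, acq_label, archive.name))
    return uploaded
```

Because each level is looked up before it is created, running the loop a second time reuses the existing containers instead of duplicating them.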

Once the upload is done, you should have all your data available in your Flywheel Project, which should look like this:

Update Subject Metadata

For the sake of example, let's demonstrate how we can update the metadata for Subject anx_s1.

Let's first find that specific Subject:

Tip: Using reload() is necessary to load the entire container.

We are going to update the firstname, lastname and the sex of this Subject. Let's check what we have currently:

We can update it with the update method of the container:

Let's reload the subject from the database to make sure the update went through:
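The update-then-reload pattern can be wrapped in a small helper. This is a sketch: update_subject_fields is a hypothetical name, subject is assumed to be a Subject container obtained from the SDK, and the example values in the usage note are made up.

```python
def update_subject_fields(subject, **fields):
    """Update top-level Subject fields and return the reloaded container."""
    subject.update(**fields)
    # reload() fetches a fresh copy, so the returned object reflects the update.
    return subject.reload()
```

For example, update_subject_fields(subject, firstname="Anxious", lastname="Subject1", sex="female") would set the three fields discussed above and hand back the refreshed Subject.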

Each container also contains a field called info, which can be used to store unstructured information in a dictionary.
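A minimal sketch of writing to info, assuming subject is a Subject container from the SDK. The helper name add_subject_info is hypothetical; update_info is the container method that merges the given keys into the existing info dictionary.

```python
def add_subject_info(subject, info):
    """Merge the given keys into the Subject's info and return the stored info."""
    subject.update_info(info)
    # Reload so that the returned dictionary is what the database now holds.
    return subject.reload().info
```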

You can find the same information in Flywheel under the custom information field of the anx_s1 Subject:

All the metadata shown in the UI are also accessible from the SDK. For instance if you would like to show all the properties of the anx_s1 Subject, just display its container:

Bonus: Update Subject Metadata with a CSV file

Updating Subject metadata/info can also be done by parsing a CSV or TSV file. With this method, you can modify the metadata of all Subjects at once.

In this example, you will need to access the participants.csv file which can be found in the .zip folder you downloaded earlier.

First, you will need to read the CSV file with pandas (imported as pd).

We are going to loop through each Subject in the Flywheel instance and check if there is any metadata stored in the metadata dataframe.

If the Subject is in the metadata dataframe, we will add the age and treatment information into the Subject container and update the sex metadata for each Subject.
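The loop described above can be sketched as follows. To stay free of extra dependencies, this sketch reads the file with the standard library csv module instead of pandas. The column names participant_id, age, treatment and sex are assumptions about participants.csv, and both function names are hypothetical.

```python
import csv


def load_participants(csv_path, id_column="participant_id"):
    """Read the CSV into a dict keyed by the Subject label column."""
    with open(csv_path, newline="") as handle:
        return {row[id_column]: row for row in csv.DictReader(handle)}


def apply_participant_metadata(subjects, participants):
    """Update each Subject that has a row in participants; return updated labels."""
    updated = []
    for subject in subjects:
        row = participants.get(subject.label)
        if row is None:
            continue  # no metadata for this Subject in the CSV
        subject.update(sex=row["sex"])
        subject.update_info({"age": row["age"], "treatment": row["treatment"]})
        updated.append(subject.label)
    return updated
```

With a real Project, the first argument could be project.subjects(), which returns the list of Subjects in that Project. Values read this way are strings, so convert age to a number first if you need it stored as one.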

View the updated metadata in the Subject container

You can also check the updated information in Flywheel under the Subject container.