Title: Delete Empty Containers
Date: June 24th 2020
Description:
This notebook demonstrates how to remove empty containers with a top-down method.
# Install specific packages required for this notebook
!pip install flywheel-sdk tqdm pandas
# Import packages
import os
from getpass import getpass
import logging
import time
from pathlib import Path
import flywheel
import pandas as pd
from tqdm.notebook import tqdm
from permission import check_user_permission
# Instantiate a logger
logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)s %(message)s')
log = logging.getLogger('root')
Get your API_KEY. You can read more about API keys in the Flywheel SDK documentation.
API_KEY = getpass('Enter API_KEY here: ')
Instantiate the Flywheel API client
fw = flywheel.Client(API_KEY if ('API_KEY' in locals() and API_KEY) else os.environ.get('FW_KEY'))
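The client line above uses the key typed at the prompt and falls back to the `FW_KEY` environment variable when nothing was entered. A minimal sketch of that fallback as a standalone helper follows; the name `pick_api_key` is hypothetical, and raising an error when neither source provides a key is an assumption of this sketch rather than something the notebook does.

```python
import os
from typing import Mapping, Optional


def pick_api_key(entered: Optional[str], environ: Mapping[str, str] = os.environ) -> str:
    """Prefer the key typed at the prompt, else fall back to FW_KEY."""
    key = entered or environ.get("FW_KEY")
    if not key:
        # Assumption: failing early is preferable to continuing without a key.
        raise ValueError("No API key entered and FW_KEY is not set")
    return key
```

With this helper, the client could be created as `fw = flywheel.Client(pick_api_key(API_KEY))`.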
Show Flywheel logging information
log.info('You are now logged in as %s to %s', fw.get_current_user()['email'], fw.get_config()['site']['api_url'])
The Flywheel data model relies on hierarchical containers. You can read more about Flywheel containers in our documentation.
A Flywheel Project is structured into the following hierarchy:
Group
└── Project
    └── Subject
        └── Session
            └── Acquisition
Project, Subject, Session and Acquisition are all containers. Containers share common properties such as the ability to store files, metadata or analyses.
Based on the Flywheel hierarchy above, the top-down approach starts from the Subject container and traverses down through the Session and Acquisition containers. This method removes Subject, Session and Acquisition containers that have no child containers and no files or analyses attached.
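To make the traversal concrete, here is a minimal sketch on a toy in-memory hierarchy. It does not use the Flywheel SDK: `Node` and `prune_empty` are hypothetical names introduced only for illustration, and a node counts as empty when it has no files and no remaining children.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy stand-in for a Subject, Session, or Acquisition (not an SDK class)."""
    label: str
    files: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def prune_empty(node: Node, deleted: List[str]) -> bool:
    """Walk down to the children first, then report whether `node` is empty."""
    # Keep only the children that are still non-empty after their own pruning.
    node.children = [c for c in node.children if not prune_empty(c, deleted)]
    is_empty = not node.children and not node.files
    if is_empty:
        deleted.append(node.label)
    return is_empty


# One subject whose only acquisition holds a file, and one with no files at all.
with_file = Node("Subject 01", children=[
    Node("Session 01", children=[Node("Localizer", files=["scan.dcm"])])
])
without_file = Node("Subject 02", children=[
    Node("Session 01", children=[Node("Localizer")])
])

deleted_labels: List[str] = []
kept = [s for s in (with_file, without_file) if not prune_empty(s, deleted_labels)]
```

Only the subject with the file survives; the acquisition, session and subject of the second branch are recorded in `deleted_labels` in that order, because each parent is checked after its children.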
To run this notebook, you will need the right permissions at the Group level to create a new Project for testing.
# Minimum requirements that you will need to remove containers on the Project level.
min_reqs = {
"site": "user",
"group": "admin"
}
GROUP_ID = input('Please enter the Group ID that you will be using to create the new project: ')
check_user_permission(fw, min_reqs, group = GROUP_ID)
Now, we will define a few values that will be used in this notebook. GROUP_ID is the ID of the Group that you will be using throughout this notebook.
GROUP_ID = input('Please enter the ID of the Group that you have admin permission for: ')
PROJECT_LABEL = 'test-delete-containers'
Please define below the path to the file that you would like to use for testing. This file will be uploaded to your Flywheel instance.
PATH_TO_TEST_FILE = Path("/path/to/a/test/file")
TEST_FILE_BASENAME = PATH_TO_TEST_FILE.name
my_group = fw.lookup(GROUP_ID)
project=my_group.add_project(label=PROJECT_LABEL)
Here, we will create one Subject container. In that Subject, we will add one Session, and in that Session, one Acquisition. We will also upload the test file to the Acquisition that was created.
# Create Subject
subject = project.add_subject(label='Subject 01')
# Create Session
session = subject.add_session(label='Session 01')
# Create Acquisition
acquisition = session.add_acquisition(label='Localizer')
# Upload File
acquisition.upload_file(PATH_TO_TEST_FILE)
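Before moving on, it can help to confirm that the upload is visible on the container. The sketch below assumes that `reload()` returns a refreshed container (as it is used later in this notebook) and that each entry in `files` exposes a `name` attribute; `uploaded_file_names` and `FakeAcquisition` are hypothetical names, the latter included only so the helper can be tried without a Flywheel connection.

```python
from types import SimpleNamespace


def uploaded_file_names(container):
    """Return the file names attached to a freshly reloaded container."""
    # reload() re-fetches the container so `files` reflects the latest upload.
    fresh = container.reload()
    return [f.name for f in fresh.files]


class FakeAcquisition:
    """Hypothetical stand-in so the helper can be tried offline."""

    def reload(self):
        return SimpleNamespace(files=[SimpleNamespace(name="test.txt")])


names = uploaded_file_names(FakeAcquisition())
```

Against the real acquisition, a check such as `TEST_FILE_BASENAME in uploaded_file_names(acquisition)` should then be true.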
def delete_empty_acquisition(acquisition, dry_run=True):
    """Returns True if acquisition was empty and got deleted.

    Args:
        acquisition (object): A Flywheel Acquisition.
        dry_run (bool): If true, container is not deleted.

    Returns:
        bool: True if container got deleted, False otherwise.
    """
    log.debug(f'Checking if acquisition "{acquisition.label}" is empty')
    num_files = len(acquisition.files)
    log.debug(f' Found {num_files} files')
    delete_acquisition = num_files == 0
    if delete_acquisition:
        log.info(f'Deleting acquisition "{acquisition.label}"')
        if not dry_run:
            fw.delete_acquisition(acquisition.id)
    return delete_acquisition
def delete_empty_session(session, dry_run=True):
    """Returns True if session was empty and got deleted.

    Args:
        session (object): A Flywheel Session.
        dry_run (bool): If true, container is not deleted.

    Returns:
        bool: True if container got deleted, False otherwise.
    """
    log.debug(f'Checking if session "{session.label}" is empty')
    num_files = len(session.files)
    num_acqs = len(session.acquisitions())
    log.debug(f' Found {num_files} files')
    log.debug(f' Found {num_acqs} acquisitions')
    delete_session = (num_acqs == 0) and (num_files == 0)
    if (num_acqs == 0) and num_files > 0:
        log.warning(f'Empty session but file attachment - Not deleting! ({session.id} / {session.label})')
    if delete_session:
        log.info(f'Deleting session "{session.label}"')
        if not dry_run:
            fw.delete_session(session.id)
    return delete_session
def delete_empty_subject(subject, dry_run=True):
    """Returns True if subject was empty and got deleted.

    Args:
        subject (object): A Flywheel Subject.
        dry_run (bool): If true, container is not deleted.

    Returns:
        bool: True if container got deleted, False otherwise.
    """
    log.debug(f'Checking if subject "{subject.label}" is empty')
    num_files = len(subject.files)
    num_sessions = len(subject.sessions())
    log.debug(f' Found {num_files} files')
    log.debug(f' Found {num_sessions} sessions')
    delete_subject = (num_files == 0) and (num_sessions == 0)
    if (num_sessions == 0) and num_files > 0:
        log.warning(f'Empty subject but file attachments! - Not deleting! ({subject.id} / {subject.label})')
    if delete_subject:
        log.info(f'Deleting subject "{subject.label}"')
        if not dry_run:
            fw.delete_subject(subject.id)
    return delete_subject
def delete_empty_containers_in_project(project, dry_run=True):
    """Delete empty containers in the project hierarchy and return a dataframe of deleted containers.

    Args:
        project (object): A Flywheel project.
        dry_run (bool): If true, containers are not deleted.

    Returns:
        pandas.DataFrame: A dataframe listing deleted containers.
    """
    columns = ['type', 'label', 'id', 'parents.subject', 'parents.session']
    # Collect rows in a list; DataFrame.append was removed in pandas 2.0.
    rows = []
    for subject in tqdm(project.subjects()):
        for session in subject.sessions.iter():
            for acquisition in session.acquisitions.iter():
                if delete_empty_acquisition(acquisition, dry_run=dry_run):
                    rows.append(['acq', acquisition.label, acquisition.id,
                                 acquisition.parents.subject, acquisition.parents.session])
            # Reload so the session reflects any acquisitions deleted above.
            session = session.reload()
            if delete_empty_session(session, dry_run=dry_run):
                rows.append(['ses', session.label, session.id, session.parents.subject, None])
        # Reload so the subject reflects any sessions deleted above.
        subject = subject.reload()
        if delete_empty_subject(subject, dry_run=dry_run):
            rows.append(['sub', subject.label, subject.id, None, None])
    return pd.DataFrame(rows, columns=columns)
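The introduction describes an empty container as one with no child containers, no files and no analyses, whereas the functions above count only files and children. A minimal sketch of a check that also considers analyses is shown below. It assumes the container exposes an `analyses` attribute that may be `None`; `is_container_empty` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins so the check can be tried without a Flywheel connection.

```python
from types import SimpleNamespace


def is_container_empty(container, num_children=0):
    """True when there are no children, no files, and no analyses."""
    # Treat a missing or None attribute as an empty list.
    num_files = len(getattr(container, "files", None) or [])
    num_analyses = len(getattr(container, "analyses", None) or [])
    return num_children == 0 and num_files == 0 and num_analyses == 0


# Stand-in objects: one bare container and one that carries an analysis.
bare = SimpleNamespace(files=[], analyses=None)
analysed = SimpleNamespace(files=[], analyses=["example-analysis"])
results = (is_container_empty(bare), is_container_empty(analysed))
```

A session could be checked with `is_container_empty(session, num_children=len(session.acquisitions()))`, which would keep a container that holds only analyses.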
First, we are going to do a dry run by setting dry_run to True before actually deleting any containers.
df = delete_empty_containers_in_project(project, dry_run=True)
len(df)
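Beyond the total count, it can be useful to see how many containers of each type the dry run would remove. The sketch below uses only the standard library; `count_by_type` is a hypothetical helper that accepts any iterable of type labels, such as the `type` column of the dataframe.

```python
from collections import Counter


def count_by_type(types):
    """Count how many entries there are for each container type."""
    return dict(Counter(types))


# Example with the type labels used in this notebook.
example = count_by_type(["acq", "acq", "ses", "sub"])
```

For the dry-run result, this would be called as `count_by_type(df['type'])`.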
Now we can try to actually delete the empty containers by setting dry_run to False.
df = delete_empty_containers_in_project(project, dry_run=False)
len(df)
As expected, it didn't delete the Subject 01 subject container, as the Acquisition contains a file.
So let's delete the file that we uploaded earlier to the acquisition. If you recall, the uploaded file is named TEST_FILE_BASENAME. We will use the delete_file method to delete the file from the Acquisition container.
The delete_file method can also be used to delete a file from a Session or Subject container.
acquisition.delete_file(TEST_FILE_BASENAME)
After deleting the file, we can try to delete the container again.
df = delete_empty_containers_in_project(project, dry_run=False)
df
If you have a project where you would like to remove/delete empty containers, you will need to have the right permissions to delete/modify the containers on the Project level. Below are the minimum requirements.
# Minimum requirements that you will need to delete/modify containers within the Project.
min_reqs = {
"site": "user",
"group": "rw",
"project":[
'containers_modify_metadata',
'containers_delete_hierarchy',
'files_create_upload',
'files_modify_metadata',
'files_delete_non_device_data',
'files_delete_device_data',
]
}
GROUP_ID = input('Please enter the ID of the Group that contains your project: ')
PROJECT_LABEL = input('Please enter the Project Label that you want to work with in this notebook: ')
check_user_permission(fw, min_reqs, group = GROUP_ID, project = PROJECT_LABEL)
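The `check_user_permission` helper comes from the local `permission` module, whose implementation is not shown here. As a rough illustration of the idea only, and not of that helper's actual behavior, the sketch below compares a list of required project actions against the actions a role grants; `missing_actions` is a hypothetical function.

```python
def missing_actions(required, granted):
    """Return the required actions that are absent from the granted ones."""
    granted_set = set(granted)
    # Preserve the order of `required` so the report is easy to read.
    return [action for action in required if action not in granted_set]


# Example: a role that can modify metadata but cannot delete containers.
gaps = missing_actions(
    ["containers_modify_metadata", "containers_delete_hierarchy"],
    ["containers_modify_metadata", "files_create_upload"],
)
```

An empty result would mean every required action is covered.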
After you have verified that you have the right permissions to delete/remove containers in the desired project, you can get the project container and call the delete_empty_containers_in_project function again.
project = fw.projects.find_first(f'label={PROJECT_LABEL}')
delete_empty_containers_in_project(project, dry_run=True)
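Searching by label alone may match a project with the same label in a different group. One alternative, sketched below, is to build a `group/project` path and resolve it with `fw.lookup`, which this notebook already uses to resolve a group. That `fw.lookup` accepts such a path is an assumption here, and `project_lookup_path` is a hypothetical helper.

```python
def project_lookup_path(group_id, project_label):
    """Build a 'group/project' path from values typed at the prompts."""
    # Strip stray whitespace that input() may have captured.
    return f"{group_id.strip()}/{project_label.strip()}"


# Example with the test project label used earlier in this notebook.
path = project_lookup_path(" my-group ", "test-delete-containers")
```

The project could then be fetched with `project = fw.lookup(project_lookup_path(GROUP_ID, PROJECT_LABEL))`.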