Title: Job Monitoring - User and Developer Version

Date: July 8th 2020

Description:
A simple tutorial on job monitoring for users and developers only.

Topics that are included:

  1. Jobs that I have launched
  2. Filter jobs based on gear name, date range, and state
  3. Cancelling Jobs
  4. Restarting Jobs
  5. Get summary of job status

Requirements:

  1. Access to a Flywheel instance.
  2. A Flywheel Project, ideally containing the dataset used in the upload-data notebook.
  3. Some jobs running in your Flywheel Project.

NOTE: This notebook uses the test dataset provided by the upload-data notebook. If you have not uploaded this test dataset yet, we strongly recommend that you do so by following the steps in that notebook before proceeding, because this notebook assumes a specific project structure.

WARNING: The metadata of the acquisitions in your test project will be updated and new files will be created after running the scripts below.

Install and Import Dependencies

Flywheel API Key and Client

Get an API_KEY. More on this in the Flywheel SDK documentation.

Instantiate the Flywheel API client
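
A minimal sketch of this step, assuming the key is kept in an environment variable so that it never appears in the notebook itself. The variable name FW_API_KEY and both helper names are hypothetical and chosen only for this example; flywheel.Client is the client constructor from the flywheel-sdk package.

```python
import os


def read_api_key(env=None, var_name="FW_API_KEY"):
    """Return the API key stored in `var_name`, or raise if it is missing.

    `FW_API_KEY` is a hypothetical variable name used only for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise ValueError(f"Set {var_name} to your Flywheel API key first")
    return key


def make_client(api_key):
    """Create a Flywheel client; requires the flywheel-sdk package."""
    import flywheel  # imported lazily so read_api_key works without the SDK

    return flywheel.Client(api_key)
```

With the SDK installed and the variable set, the client could then be created with fw = make_client(read_api_key()).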

Show Flywheel logging information


Requirements

Before we start, we would like to verify that you have the right permissions to proceed with this notebook.

check_user_permission will return True if both the group and the project meet the minimum requirements; otherwise, a list of compatible projects will be printed.

WARNING: If NO Project meets the minimum requirements, you will not be able to proceed with this notebook. Please contact your site admin to gain access to run/cancel jobs for at least one project on your Flywheel instance.
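
The idea behind such a check can be sketched as a set comparison. This is not the notebook's check_user_permission helper: the function name, the shape of the input, and the permission names below are all hypothetical and stand in for whatever your site uses.

```python
def find_compatible_projects(project_permissions, required):
    """Return the labels of projects whose granted permissions cover `required`.

    `project_permissions` maps a project label to the set of permissions
    the current user holds on it (a hypothetical input shape).
    """
    required = set(required)
    return sorted(
        label
        for label, granted in project_permissions.items()
        if required <= set(granted)
    )


# Stand-in data; the permission names are made up for illustration.
granted = {
    "demo-project": {"view_jobs", "run_jobs", "cancel_jobs"},
    "read-only-project": {"view_jobs"},
}
compatible = find_compatible_projects(granted, {"run_jobs", "cancel_jobs"})
print(compatible)
```

An empty result corresponds to the situation described in the warning above: no project meets the minimum requirements.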

How do I get the Jobs that I launched?

In this section, we will show you how to use the get_current_user_jobs method to get the jobs that you have launched in the past.

For each job, we will print a few of its attributes, such as the gear_info of the gear that ran the job, the state of the job, and the job ID.
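
The printing step might look like the sketch below. It assumes each job exposes gear_info, state, and id as attributes; how the individual jobs are pulled out of the value returned by fw.get_current_user_jobs() is not shown in the text, so the example uses a stand-in object instead of a live call. The helper name describe_job is hypothetical.

```python
from types import SimpleNamespace


def describe_job(job):
    """Format the gear info, state, and ID of one job-like object."""
    return (
        f"Gear Info: {job.gear_info}\n"
        f"Job State: {job.state}\n"
        f"Job ID: {job.id}\n"
    )


# Stand-in for one of the jobs returned by fw.get_current_user_jobs().
example_job = SimpleNamespace(
    gear_info={"category": "qa", "name": "mriqc-demo"},
    state="complete",
    id="5aee8a5e10a8c402961e70f0",
)
print(describe_job(example_job))
```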

Expected Output:

Gear Info: {'category': 'qa',
 'id': None,
 'name': 'mriqc-demo',
 'version': '0.7.0_0.15.1-hbcd-dev-h'}
Job State: complete
Job ID: 5aee8a5e10a8c402961e70f0

Gear Info: {'category': 'qa',
 'id': None,
 'name': 'mriqc-demo',
 'version': '0.7.0_0.15.1-hbcd-dev-h'}
Job State: complete
Job ID: 5bee77bc10a8c402e21e6f1d

Find Specific Job with the Job ID

To view a specific job by its Job ID, you can use get_job_detail. This will only work for jobs that you have launched yourself.
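
A minimal sketch of the lookup, assuming get_job_detail takes the job ID as its only argument and returns an object with gear_info, state, and id attributes. FakeClient is a stand-in so the example runs without a Flywheel instance; in the notebook you would pass the real client instead.

```python
from types import SimpleNamespace


def print_job_detail(client, job_id):
    """Fetch one job by ID and print its gear info, state, and ID."""
    job = client.get_job_detail(job_id)
    print(f"Gear Info: {job.gear_info}")
    print(f"Job State: {job.state}")
    print(f"Job ID: {job.id}")
    return job


class FakeClient:
    """Stand-in for the Flywheel client so the sketch runs offline."""

    def get_job_detail(self, job_id):
        return SimpleNamespace(
            gear_info={"category": "qa", "name": "mriqc-demo"},
            state="complete",
            id=job_id,
        )


job = print_job_detail(FakeClient(), "5bee77bc10a8c402e21e6f1d")
```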

Expected Output:

Gear Info: {'category': 'qa', 'id': None, 'name': 'mriqc-demo', 'version': '0.7.0_0.15.1-hbcd-dev-h'}
Job State: complete
Job ID: 5bee77bc10a8c402e21e6f1d

Filter Job

In this section, we will show how to filter jobs by gear name, date, and job state.

Initialize a few values

First, initialize the name of the gear you would like to filter by, the creation date of the jobs, and the job state you would like to search for.

1. Gear Name

2. Date

3. State of the job
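
The three criteria above can be combined in a small client-side filter, sketched below. It assumes each job has a gear_info that can be read with ["name"], a created timestamp, and a state; adjust the lookups if your job objects expose these differently. The helper names and the sample jobs are hypothetical.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def filter_jobs(jobs, gear_name=None, created_after=None,
                created_before=None, state=None):
    """Return the jobs that match every criterion that is not None."""
    matches = []
    for job in jobs:
        if gear_name is not None and job.gear_info["name"] != gear_name:
            continue
        if created_after is not None and job.created < created_after:
            continue
        if created_before is not None and job.created > created_before:
            continue
        if state is not None and job.state != state:
            continue
        matches.append(job)
    return matches


def make_job(name, state, day):
    """Build a stand-in job created on the given day of July 2020."""
    return SimpleNamespace(
        gear_info={"name": name},
        state=state,
        created=datetime(2020, 7, day, tzinfo=timezone.utc),
    )


jobs = [
    make_job("mriqc", "complete", 1),
    make_job("mriqc", "failed", 5),
    make_job("other-gear", "complete", 6),
]
selected = filter_jobs(
    jobs,
    gear_name="mriqc",
    created_after=datetime(2020, 7, 2, tzinfo=timezone.utc),
    state="failed",
)
print(len(selected))
```

Passing None for a criterion skips it, so the same helper covers filtering by any one of the three values or by all of them together.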


Cancelling Job

Simply use the update method to cancel a job that is pending.
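
A sketch of how this could look, under two assumptions that the text does not spell out: that each job object has an update method accepting keyword fields, and that the target state is named "cancelled". FakeJob is a stand-in that only changes local attributes, so the example runs without a Flywheel instance.

```python
class FakeJob:
    """Stand-in for an SDK job object so the sketch runs offline."""

    def __init__(self, job_id, state):
        self.id = job_id
        self.state = state

    def update(self, **fields):
        # A real job object would send this change to the Flywheel API.
        for name, value in fields.items():
            setattr(self, name, value)


def cancel_pending_jobs(jobs):
    """Cancel every job still in the pending state; return the cancelled IDs."""
    cancelled = []
    for job in jobs:
        if job.state == "pending":
            job.update(state="cancelled")  # assumed name of the target state
            cancelled.append(job.id)
    return cancelled


jobs = [FakeJob("job-1", "pending"), FakeJob("job-2", "running")]
cancelled_ids = cancel_pending_jobs(jobs)
print(cancelled_ids)
```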


Restarting Job

You can also restart a job that has a state of failed. However, each job can only be retried once.

In this example, we will iterate through the jobs list that we defined earlier with fw.get_current_user_jobs(). We will focus only on retrying the mriqc jobs. We will then use exception handling to ensure that we only retry jobs that have not been retried before. A new job_id is generated when a job has been successfully retried. This new job_id is then appended to the retried_job list.
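
The loop described above can be sketched as follows. The SDK call that performs the retry and the shape of its return value are not shown in the text, so the sketch takes the retry step as a callable parameter; fake_retry and the sample jobs are stand-ins so that the example runs offline. Catching a broad Exception is a simplification; with the real SDK a narrower exception type would be preferable.

```python
from types import SimpleNamespace


def retry_failed_jobs(jobs, retry, gear_name="mriqc"):
    """Retry each failed job of `gear_name`; return the new job IDs."""
    retried_job = []
    for job in jobs:
        if job.state != "failed" or job.gear_info["name"] != gear_name:
            continue
        try:
            new_job_id = retry(job.id)
        except Exception as exc:  # e.g. the job has already been retried once
            print(f"Could not retry {job.id}: {exc}")
            continue
        retried_job.append(new_job_id)
    return retried_job


already_retried = {"job-2"}


def fake_retry(job_id):
    """Stand-in retry call that refuses to retry the same job twice."""
    if job_id in already_retried:
        raise RuntimeError("job has already been retried")
    already_retried.add(job_id)
    return f"{job_id}-retry"


jobs = [
    SimpleNamespace(id="job-1", state="failed", gear_info={"name": "mriqc"}),
    SimpleNamespace(id="job-2", state="failed", gear_info={"name": "mriqc"}),
    SimpleNamespace(id="job-3", state="complete", gear_info={"name": "mriqc"}),
]
retried_job = retry_failed_jobs(jobs, fake_retry)
print(retried_job)
```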


Pulling Statistics of the Jobs

In this section, we show a simple example of getting a quick summary of the pending and running jobs on your instance.
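
One simple way to build such a summary is to tally job states on the client side, as in the sketch below. It only counts the jobs in the list you pass in, so it reflects the whole instance only if that list does; the helper name and the sample jobs are hypothetical.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_states(jobs, states=("pending", "running")):
    """Count how many jobs are in each of the given states."""
    counts = Counter(job.state for job in jobs)
    return {state: counts.get(state, 0) for state in states}


# Stand-in jobs; in the notebook these would come from the Flywheel client.
jobs = [
    SimpleNamespace(state=s)
    for s in ("pending", "running", "pending", "complete")
]
summary = summarize_states(jobs)
print(summary)
```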