Title: Job Monitoring - User and Developer Version
Date: July 8th 2020
Description:
A simple tutorial on job monitoring for users and developers.
Topics that are included:
# Install specific packages required for this notebook
!pip install flywheel-sdk pandas
# Import packages
from getpass import getpass
import logging
import os
import datetime
import pprint
from dateutil.tz import tzutc
from IPython.display import display, Image
import flywheel
from permission import check_user_permission
# Instantiate a logger
logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)s %(message)s')
log = logging.getLogger('root')
Get an API_KEY. More on this in the Flywheel SDK documentation.
API_KEY = getpass('Enter API_KEY here: ')
Instantiate the Flywheel API client
fw = flywheel.Client(API_KEY if 'API_KEY' in locals() else os.environ.get('FW_KEY'))
del API_KEY
Show Flywheel logging information
log.info('You are now logged in as %s to %s', fw.get_current_user()['email'], fw.get_config()['site']['api_url'])
Before we start, we would like to verify that you have the right permissions to proceed in this notebook.
# Minimum permissions needed at the site, group, and project level
min_reqs = {
    "site": "user",
    "group": "ro",
    "project": ['jobs_view',
                'jobs_run_cancel']
}
GROUP_ID = input('Please enter one of the Group IDs that you have access to: ')
PORJECT_LABEL = input('Please enter one of the Project Labels that you have access to: ')
check_user_permission will return True if both the group and project meet the minimum requirements; otherwise, a compatible list will be printed.
check_user_permission(fw, min_reqs)
In this section, we will show you how to use the get_current_user_jobs method to get the jobs that you have launched in the past.
For each job, we will print a few attributes: the gear_info of the gear that ran the job, the state of the job, and the job id.
jobs = fw.get_current_user_jobs()['jobs']
# Print the first few jobs only
for i, job in enumerate(jobs):
    print(f'Gear Info: {job.gear_info}')
    print(f'Job State: {job.state}')
    print(f'Job ID: {job.id}')
    print()
    # Stop after six jobs (i counts from 0), as in the loops later on
    if i >= 5:
        break
Expected Output:
Gear Info: {'category': 'qa',
'id': None,
'name': 'mriqc-demo',
'version': '0.7.0_0.15.1-hbcd-dev-h'}
Job State: complete
Job ID: 5aee8a5e10a8c402961e70f0
Gear Info: {'category': 'qa',
'id': None,
'name': 'mriqc-demo',
'version': '0.7.0_0.15.1-hbcd-dev-h'}
Job State: complete
Job ID: 5bee77bc10a8c402e21e6f1d
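The loop above can also be written with itertools.islice, which avoids the manual counter. The following is a minimal sketch that can be run without a Flywheel site. The helper name summarize_jobs and the sample_jobs list are hypothetical, and the stand-in objects only mimic the three attributes used above (id, state, gear_info).

```python
from itertools import islice
from types import SimpleNamespace


def summarize_jobs(jobs, limit=5):
    """Return one summary line per job, for at most `limit` jobs."""
    lines = []
    for job in islice(jobs, limit):
        lines.append(f"{job.id}: {job.gear_info['name']} ({job.state})")
    return lines


# Hypothetical stand-ins for job objects (not real SDK objects)
sample_jobs = [
    SimpleNamespace(id=f"job-{n}", state="complete", gear_info={"name": "mriqc-demo"})
    for n in range(10)
]

for line in summarize_jobs(sample_jobs, limit=3):
    print(line)
```

With real data, the same helper could be called on the jobs list retrieved above, assuming each job exposes the same three attributes.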
To view a specific job by its Job ID, you can use get_job_detail. This will only work for jobs that you have launched yourself.
# Get the latest job that you have launched
JOB_ID = jobs[0].id
specific_job_detail = fw.get_job_detail(JOB_ID)
print(f'Gear Info: {specific_job_detail.gear_info}')
print(f'Job State: {specific_job_detail.state}')
print(f'Job ID: {specific_job_detail.id}')
Expected Output:
Gear Info: {'category': 'qa', 'id': None, 'name': 'mriqc-demo', 'version': '0.7.0_0.15.1-hbcd-dev-h'}
Job State: complete
Job ID: 5bee77bc10a8c402e21e6f1d
In this section, we will show how to filter jobs by Gear Name, creation date, and Job State.
First, enter the gear name, the date after which the jobs were created, and the job state that you would like to filter by.
GEAR_NAME = input('Please enter the gear that you would like to filter by: ')
CREATED_BY = input('Please enter the date (YYYY-MM-DD) you would like to filter by: ')
JOB_STATE = input('Please enter the state of the job you would like to filter by: ')
# Filter by gear name
filtered_job = list(filter(lambda x: x['gear_info']['name'] == GEAR_NAME, jobs))
for i, job in enumerate(filtered_job):
    print(f'Gear Info: {job.gear_info}')
    print(f'Job State: {job.state}')
    print(f'Job ID: {job.id}')
    print()
    if i >= 5:
        break
# Example value; this overrides the date entered above
CREATED_BY = "2020-06-05"
def filter_date(job):
    # Keep jobs created after CREATED_BY (ISO date strings sort chronologically)
    return job.created.strftime("%Y-%m-%d") > CREATED_BY
filtered_job = list(filter(filter_date, jobs))
for i, job in enumerate(filtered_job):
    print(f'Gear Info: {job.gear_info}')
    print(f'Job State: {job.state}')
    print(f'Job ID: {job.id}')
    print()
    if i >= 5:
        break
# Filter by job state
filtered_job = list(filter(lambda x: x.state == JOB_STATE, jobs))
for i, job in enumerate(filtered_job):
    print(f'Gear Info: {job.gear_info}')
    print(f'Job State: {job.state}')
    print(f'Job ID: {job.id}')
    print()
    if i >= 5:
        break
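The three filters above can be combined into a single helper. The sketch below is one possible way to do this and is not part of the Flywheel SDK. The function name filter_jobs, its parameters, and the sample data are all hypothetical. It compares dates as datetime.date objects instead of formatted strings, and it assumes each job has gear_info, created, and state attributes as in the cells above.

```python
import datetime
from types import SimpleNamespace


def filter_jobs(jobs, gear_name=None, created_after=None, state=None):
    """Keep the jobs that match every criterion that is not None."""
    selected = []
    for job in jobs:
        if gear_name is not None and job.gear_info["name"] != gear_name:
            continue
        # Strictly after the given date, matching the string comparison above
        if created_after is not None and job.created.date() <= created_after:
            continue
        if state is not None and job.state != state:
            continue
        selected.append(job)
    return selected


def make_job(job_id, gear, day, state):
    """Build a hypothetical stand-in for a job object."""
    created = datetime.datetime(2020, 6, day, tzinfo=datetime.timezone.utc)
    return SimpleNamespace(id=job_id, gear_info={"name": gear}, created=created, state=state)


sample_jobs = [
    make_job("a", "mriqc", 1, "complete"),
    make_job("b", "mriqc", 10, "failed"),
    make_job("c", "dcm2niix", 12, "complete"),
    make_job("d", "mriqc", 20, "complete"),
]

recent_complete = filter_jobs(
    sample_jobs,
    gear_name="mriqc",
    created_after=datetime.date(2020, 6, 5),
    state="complete",
)
print([job.id for job in recent_complete])
```

Leaving a criterion as None skips it, so the same helper covers each of the single-criterion filters shown above.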
Use the update method to cancel jobs that are in the pending state.
JOB_STATE = 'pending'
# Select only the jobs whose state matches JOB_STATE
filtered_job = list(filter(lambda x: x.state == JOB_STATE, jobs))
for job in filtered_job:
    job.update(state='cancelled')
You can also restart a job that has a state of failed. However, each job can only be retried once.
In this example, we will iterate through the jobs list that we defined earlier with fw.get_current_user_jobs(). We will only retry the mriqc jobs, and we will use exception handling to skip jobs that have already been retried. A new job_id is generated when a job has been successfully retried. This new job_id is then appended to the retried_job list.
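retried_job = list()
for job in jobs:
    try:
        # Retry at most two failed mriqc jobs
        if job.state == 'failed' and job.gear_info['name'] == 'mriqc' and len(retried_job) < 2:
            new_job_id = fw.retry_job(job.id)
            retried_job.append(new_job_id)
    except flywheel.ApiException as exc:
        # The retry was rejected, for example because the job was already retried
        log.warning('Could not retry job %s: %s', job.id, exc)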
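To see how the retry loop behaves when a job has already been retried, the logic can be exercised against a stand-in client. This is only a sketch under stated assumptions: FakeClient, AlreadyRetriedError, and retry_failed_jobs are hypothetical names invented for illustration and are not part of the Flywheel SDK, and the real client may report a rejected retry differently.

```python
from types import SimpleNamespace


class AlreadyRetriedError(Exception):
    """Hypothetical error standing in for a rejected second retry."""


class FakeClient:
    """Hypothetical stand-in for the API client; allows one retry per job."""

    def __init__(self):
        self._retried = set()

    def retry_job(self, job_id):
        if job_id in self._retried:
            raise AlreadyRetriedError(job_id)
        self._retried.add(job_id)
        return f"{job_id}-retry"


def retry_failed_jobs(client, jobs, gear_name, max_retries=2):
    """Retry failed jobs of one gear; return (new job ids, skipped job ids)."""
    retried, skipped = [], []
    for job in jobs:
        if len(retried) >= max_retries:
            break
        if job.state != "failed" or job.gear_info["name"] != gear_name:
            continue
        try:
            retried.append(client.retry_job(job.id))
        except AlreadyRetriedError:
            # Record the job instead of silently ignoring the failure
            skipped.append(job.id)
    return retried, skipped


def make_job(job_id, gear, state):
    """Build a hypothetical stand-in for a job object."""
    return SimpleNamespace(id=job_id, gear_info={"name": gear}, state=state)


client = FakeClient()
client.retry_job("b")  # simulate a job that was already retried earlier

sample_jobs = [
    make_job("a", "mriqc", "failed"),
    make_job("b", "mriqc", "failed"),
    make_job("c", "mriqc", "complete"),
    make_job("d", "other-gear", "failed"),
    make_job("e", "mriqc", "failed"),
]

retried, skipped = retry_failed_jobs(client, sample_jobs, "mriqc")
print("retried:", retried)
print("skipped:", skipped)
```

Keeping a list of skipped jobs, rather than passing over the exception, makes it easier to tell afterwards which jobs could not be retried.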
In this section, we show a simple example of getting a quick summary of the pending and running jobs on your instance.
pending_jobs = list(filter(lambda x:x.state == 'pending', jobs))
running_jobs = list(filter(lambda x:x.state == 'running', jobs))
print(f'==============================\n{datetime.datetime.now()}\n==============================\n')
print(f'Check Job States \n')
print(f'{len(pending_jobs)} pending jobs \n')
print(f'{len(running_jobs)} running jobs \n')
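When more than two states are of interest, counting them in one pass with collections.Counter avoids building a separate filtered list per state. The following is a minimal sketch. The helper name count_job_states and the sample data are hypothetical, and the stand-in objects only carry the state attribute used above.

```python
from collections import Counter
from types import SimpleNamespace


def count_job_states(jobs):
    """Count how many jobs are in each state."""
    return Counter(job.state for job in jobs)


# Hypothetical stand-ins for job objects (not real SDK objects)
sample_jobs = [
    SimpleNamespace(state=state)
    for state in ["pending", "running", "pending", "complete", "failed"]
]

state_counts = count_job_states(sample_jobs)
for state in ("pending", "running"):
    # A Counter returns 0 for states that never occur
    print(f"{state_counts[state]} {state} jobs")
```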