Best of Both Worlds: High Performance Interactive and Batch Launching

Chansup Byun, Jeremy Kepner, William Arcand, David Bestor, Bill Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Michael Jones, Andrew Kirby, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Antonio Rosa, Siddharth Samsi, Charles Yee, Albert Reuther

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

3 Scopus citations

Abstract

Rapid launch of thousands of jobs is essential for effective interactive supercomputing, big data analysis, and AI algorithm development. Achieving thousands of launches per second has required hardware to be available to receive these jobs. This paper presents a novel preemptive approach to implementing 'spot' jobs on MIT SuperCloud systems, allowing the resources to be fully utilized for long-running batch jobs while still providing fast launch for interactive jobs. The new approach separates the job preemption and scheduling operations and can schedule a job with preemption 100 times faster than the standard scheduler-provided automatic preemption capability. The results demonstrate that the new approach can schedule interactive jobs preemptively with performance comparable to that achieved when the required computing resources are idle and available. The spot job capability can be deployed without disrupting the interactive user experience while increasing overall system utilization.
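
The separation of preemption from scheduling described in the abstract can be illustrated with a toy model. The following is only a sketch under stated assumptions, not the authors' implementation: it assumes a periodic pass (in the spirit of the "cron job" keyword) that requeues spot jobs until a reserve of idle nodes exists, so that launching an interactive job reduces to plain placement on idle nodes. The names `Cluster`, `preemption_pass`, `launch_interactive`, and the `reserve` parameter are hypothetical and do not come from the paper or from any scheduler API.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Toy node-count model of a cluster (hypothetical, for illustration only)."""
    total_nodes: int
    batch: int = 0        # nodes held by regular batch jobs (not preemptable here)
    spot: int = 0         # nodes held by preemptable spot jobs
    interactive: int = 0  # nodes held by interactive jobs

    @property
    def idle(self) -> int:
        return self.total_nodes - self.batch - self.spot - self.interactive


def preemption_pass(cluster: Cluster, reserve: int) -> int:
    """Periodic step (assumed): requeue spot jobs until `reserve` nodes are idle.

    Returns the number of nodes freed. Runs independently of job submission.
    """
    shortfall = max(0, reserve - cluster.idle)
    freed = min(shortfall, cluster.spot)  # cannot free more than spot jobs hold
    cluster.spot -= freed
    return freed


def launch_interactive(cluster: Cluster, nodes: int) -> bool:
    """Scheduling step: place the job on idle nodes only, with no preemption logic."""
    if nodes > cluster.idle:
        return False
    cluster.interactive += nodes
    return True


if __name__ == "__main__":
    cluster = Cluster(total_nodes=100, batch=40, spot=60)  # fully utilized
    print(launch_interactive(cluster, 8))   # no idle nodes yet
    print(preemption_pass(cluster, reserve=16))
    print(launch_interactive(cluster, 8))   # now served from the idle reserve
```

In this model the launch path never waits on preemption; the cost of evicting spot jobs is paid ahead of time by the periodic pass, which is one way to read the abstract's claim that preemptive scheduling can perform comparably to scheduling on idle resources.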

Original language: English
Title of host publication: 2020 IEEE High Performance Extreme Computing Conference, HPEC 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728192192
DOIs
State: Published - Sep 22 2020
Externally published: Yes
Event: 2020 IEEE High Performance Extreme Computing Conference, HPEC 2020 - Virtual, Waltham, United States
Duration: Sep 21 2020 → Sep 25 2020

Publication series

Name: 2020 IEEE High Performance Extreme Computing Conference, HPEC 2020

Conference

Conference: 2020 IEEE High Performance Extreme Computing Conference, HPEC 2020
Country/Territory: United States
City: Virtual, Waltham
Period: 09/21/20 → 09/25/20

Keywords

  • cluster utilization
  • cron job
  • preemption
  • scheduling performance
  • spot jobs
