Python Implementation of the Dynamic Distributed Dimensional Data Model

Hayden Jananthan, Lauren Milechin, Michael Jones, William Arcand, William Bergeron, David Bestor, Chansup Byun, Michael Houle, Matthew Hubbell, Vijay Gadepally, Anna Klein, Peter Michaleas, Guillermo Morales, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Jeremy Kepner

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Python has become a standard scientific computing language, with fast-growing support for machine learning and data analysis modules and increasing use with big data. The Dynamic Distributed Dimensional Data Model (D4M) offers a highly composable, unified data model with strong performance, built to handle big data quickly and efficiently. In this work we present an implementation of D4M in Python. D4M.py implements all foundational functionality of D4M and includes Accumulo and SQL database support via Graphulo. We describe the mathematical background and motivation, explain the approaches taken for its fundamental functions and building blocks, and present performance results that compare D4M.py with D4M-MATLAB and D4M.jl.

Original language: English
Title of host publication: 2022 IEEE High Performance Extreme Computing Conference, HPEC 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665497862
DOIs
State: Published - 2022
Externally published: Yes
Event: 2022 IEEE High Performance Extreme Computing Conference, HPEC 2022 - Virtual, Online, United States
Duration: Sep 19 2022 to Sep 23 2022

Publication series

Name: 2022 IEEE High Performance Extreme Computing Conference, HPEC 2022

Conference

Conference: 2022 IEEE High Performance Extreme Computing Conference, HPEC 2022
Country/Territory: United States
City: Virtual, Online
Period: 09/19/22 to 09/23/22

Keywords

  • Python
  • array
  • data science
  • matrix
  • sparse linear algebra
