TY - GEN
T1 - 75,000,000,000 streaming inserts/second using hierarchical hypersparse GraphBLAS Matrices
AU - Kepner, Jeremy
AU - Davis, Tim
AU - Byun, Chansup
AU - Arcand, William
AU - Bestor, David
AU - Bergeron, William
AU - Gadepally, Vijay
AU - Hubbell, Matthew
AU - Houle, Michael
AU - Jones, Michael
AU - Klein, Anna
AU - Michaleas, Peter
AU - Milechin, Lauren
AU - Mullen, Julie
AU - Prout, Andrew
AU - Rosa, Antonio
AU - Samsi, Siddharth
AU - Yee, Charles
AU - Reuther, Albert
N1 - Funding Information:
This material is based upon work supported by the Assistant Secretary of Defense for Research and Engineering under Air Force Contract No. FA8702-15-D-0001 and National Science Foundation CCF-1533644. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Assistant Secretary of Defense for Research and Engineering or the National Science Foundation.
Publisher Copyright:
© 2020 IEEE.
PY - 2020/5
Y1 - 2020/5
N2 - The SuiteSparse GraphBLAS C library implements high-performance hypersparse matrices with bindings to a variety of languages (Python, Julia, and Matlab/Octave). GraphBLAS provides a lightweight in-memory database implementation of hypersparse matrices that are ideal for analyzing many types of network data, while providing rigorous mathematical guarantees, such as linearity. Streaming updates of hypersparse matrices put enormous pressure on the memory hierarchy. This work benchmarks an implementation of hierarchical hypersparse matrices that reduces memory pressure and dramatically increases the update rate into a hypersparse matrix. The parameters of hierarchical hypersparse matrices rely on controlling the number of entries in each level in the hierarchy before an update is cascaded. The parameters are easily tunable to achieve optimal performance for a variety of applications. Hierarchical hypersparse matrices achieve over 1,000,000 updates per second in a single instance. Scaling to 31,000 instances of hierarchical hypersparse matrices on 1,100 server nodes on the MIT SuperCloud achieved a sustained update rate of 75,000,000,000 updates per second. This capability allows the MIT SuperCloud to analyze extremely large streaming network data sets.
AB - The SuiteSparse GraphBLAS C library implements high-performance hypersparse matrices with bindings to a variety of languages (Python, Julia, and Matlab/Octave). GraphBLAS provides a lightweight in-memory database implementation of hypersparse matrices that are ideal for analyzing many types of network data, while providing rigorous mathematical guarantees, such as linearity. Streaming updates of hypersparse matrices put enormous pressure on the memory hierarchy. This work benchmarks an implementation of hierarchical hypersparse matrices that reduces memory pressure and dramatically increases the update rate into a hypersparse matrix. The parameters of hierarchical hypersparse matrices rely on controlling the number of entries in each level in the hierarchy before an update is cascaded. The parameters are easily tunable to achieve optimal performance for a variety of applications. Hierarchical hypersparse matrices achieve over 1,000,000 updates per second in a single instance. Scaling to 31,000 instances of hierarchical hypersparse matrices on 1,100 server nodes on the MIT SuperCloud achieved a sustained update rate of 75,000,000,000 updates per second. This capability allows the MIT SuperCloud to analyze extremely large streaming network data sets.
UR - http://www.scopus.com/inward/record.url?scp=85091571383&partnerID=8YFLogxK
U2 - 10.1109/IPDPSW50202.2020.00046
DO - 10.1109/IPDPSW50202.2020.00046
M3 - Conference contribution
AN - SCOPUS:85091571383
T3 - Proceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020
SP - 207
EP - 210
BT - Proceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 18 May 2020 through 22 May 2020
ER -