Hypersparse Network Flow Analysis of Packets with GraphBLAS

Tyler Trigg, Chad Meiners, Sandeep Pisharody, Hayden Jananthan, Michael Jones, Adam Michaleas, Timothy Davis, Erik Welch, William Arcand, David Bestor, William Bergeron, Chansup Byun, Vijay Gadepally, Micheal Houle, Matthew Hubbell, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Doug Stetson, Charles Yee, Jeremy Kepner

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Internet analysis is a major challenge due to the volume and rate of network traffic. In lieu of analyzing traffic as raw packets, network analysts often rely on compressed network flows (netflows) that contain the start time, stop time, source, destination, and number of packets in each direction. However, many traffic analyses benefit from temporal aggregation of multiple simultaneous netflows, which can be computationally challenging. To alleviate this concern, a novel netflow compression and resampling method has been developed leveraging GraphBLAS hypersparse traffic matrices that preserve anonymization while enabling subrange analysis. Standard multi-temporal spatial analyses are then performed on each subrange to generate detailed statistical aggregates of the source packets, source fan-out, unique links, destination fan-in, and destination packets of each subrange, which can then be used for background modeling and anomaly detection. A simple file format based on GraphBLAS sparse matrices is developed for storing these statistical aggregates. This method is scale-tested on the MIT SuperCloud using a 50 trillion packet netflow corpus from several hundred sites collected over several months. The resulting compression achieved is significant (<0.1 bit per packet), enabling extremely large netflow analyses to be stored and transported. The single-node parallel performance is analyzed in terms of both processors and threads, showing that a single node can perform hundreds of simultaneous analyses at over a million packets/sec (roughly equivalent to a 10 Gigabit link).
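
As a rough illustration of the per-subrange aggregates named in the abstract, the following minimal sketch builds a traffic matrix from flow records and derives the five quantities from it. It is not the authors' implementation and does not use the GraphBLAS API: a plain Python dictionary keyed by (source, destination) stands in for a hypersparse matrix, and the function names, the (src, dst, packets) record layout, and the integer IDs are assumptions made for the example.

```python
from collections import Counter


def traffic_matrix(flows):
    """Sum packet counts per (source, destination) pair.

    `flows` is an iterable of hypothetical (src, dst, packets) records with
    positive packet counts. The returned Counter is a dict-of-keys stand-in
    for a hypersparse traffic matrix A[src, dst]; only nonzero entries exist.
    """
    A = Counter()
    for src, dst, packets in flows:
        A[(src, dst)] += packets
    return A


def aggregates(A):
    """Compute the five aggregates listed in the abstract for one subrange."""
    src_packets = Counter()  # row sums of A
    dst_packets = Counter()  # column sums of A
    fan_out = Counter()      # nonzero entries per row
    fan_in = Counter()       # nonzero entries per column
    for (src, dst), packets in A.items():
        src_packets[src] += packets
        dst_packets[dst] += packets
        fan_out[src] += 1
        fan_in[dst] += 1
    return {
        "source_packets": dict(src_packets),
        "source_fan_out": dict(fan_out),
        "unique_links": len(A),  # number of nonzero entries in A
        "destination_fan_in": dict(fan_in),
        "destination_packets": dict(dst_packets),
    }


# Illustrative records only; the IDs stand for anonymized addresses.
flows = [(1, 2, 3), (1, 3, 1), (1, 2, 2), (4, 2, 5)]
A = traffic_matrix(flows)
stats = aggregates(A)
```

In this toy input, source 1 reaches two destinations and the matrix holds three unique links. A sparse-matrix library would express the same quantities as row and column reductions over the matrix and over its nonzero pattern.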

Original language: English
Title of host publication: 2022 IEEE High Performance Extreme Computing Conference, HPEC 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665497862
State: Published - 2022
Externally published: Yes
Event: 2022 IEEE High Performance Extreme Computing Conference, HPEC 2022 - Virtual, Online, United States
Duration: Sep 19 2022 to Sep 23 2022

Publication series

Name: 2022 IEEE High Performance Extreme Computing Conference, HPEC 2022

Conference

Conference: 2022 IEEE High Performance Extreme Computing Conference, HPEC 2022
Country/Territory: United States
City: Virtual, Online
Period: 09/19/22 to 09/23/22

Keywords

  • compression
  • hypersparse matrices
  • network analyses
  • streaming graphs
