Software

This page showcases our cutting-edge software projects funded by the A3D3 Institute. Some of these software packages have additional training material, which you can find on our tutorials page.

HLS4ML

HLS4ML (High-Level Synthesis for Machine Learning) is a package that facilitates machine learning inference on FPGAs. It enables the creation of firmware implementations of machine learning algorithms using high-level synthesis (HLS). By translating models from popular open-source machine learning packages into HLS, HLS4ML offers a solution that can be configured to suit specific use cases.
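The typical workflow goes from a trained model to a compiled HLS project in a few calls. A minimal sketch using the Keras converter, where the toy model, output directory, and FPGA part number are illustrative choices rather than defaults:

```python
# Minimal hls4ml conversion sketch; the model, output directory, and FPGA
# part below are illustrative assumptions.
import numpy as np
from tensorflow import keras
import hls4ml

# Small stand-in network; in practice this would be your trained model.
model = keras.Sequential([
    keras.Input(shape=(8,)),
    keras.layers.Dense(16, activation='relu'),
    keras.layers.Dense(4, activation='softmax'),
])

# Derive an HLS configuration (precision, reuse factors) from the model.
config = hls4ml.utils.config_from_keras_model(model, granularity='model')

# Translate the model into an HLS project targeting a specific FPGA part.
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir='hls4ml_prj',
    part='xcu250-figd2104-2L-e',  # example Xilinx part; adjust to your board
)

# Compile a bit-accurate emulation of the firmware and run inference.
hls_model.compile()
y_hls = hls_model.predict(np.random.rand(10, 8).astype(np.float32))
```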

NMMA

NMMA (Nuclear Multi-Messenger Astronomy) is a fully featured Bayesian multi-messenger pipeline targeting joint analyses of gravitational-wave and electromagnetic data (with a focus on optical observations). Using bilby, a Bayesian inference library originally developed for gravitational-wave analyses, as its back end, the software can sample these data sets with a variety of samplers. It uses neutron-star equations of state based on chiral effective field theory when performing inference, and it is also capable of estimating the Hubble constant.
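Because NMMA drives its sampling through bilby, a fit has the shape of a standard bilby run. A minimal sketch with a toy exponential-decay model; the likelihood, priors, and sampler settings here are illustrative and not NMMA's own:

```python
# Toy bilby run illustrating the sampling back end NMMA builds on; the
# model, priors, and sampler settings are illustrative assumptions.
import numpy as np
import bilby

def model(x, amplitude, decay):
    # Stand-in for a light-curve model: simple exponential decay.
    return amplitude * np.exp(-x / decay)

# Simulated observations with a known noise level.
x = np.linspace(0.1, 10, 50)
sigma = 0.1
y = model(x, amplitude=2.0, decay=3.0) + np.random.normal(0, sigma, len(x))

likelihood = bilby.core.likelihood.GaussianLikelihood(x, y, model, sigma=sigma)
priors = {
    'amplitude': bilby.core.prior.Uniform(0, 10, 'amplitude'),
    'decay': bilby.core.prior.Uniform(0.1, 20, 'decay'),
}

# Any sampler bilby supports (dynesty, pymultinest, ...) can be swapped in.
result = bilby.run_sampler(
    likelihood=likelihood, priors=priors,
    sampler='dynesty', nlive=250, outdir='outdir', label='toy_fit',
)
result.plot_corner()
```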

ML4GW

ML4GW (Machine Learning for Gravitational Waves) provides libraries for integrating ML frameworks into gravitational-wave searches. It includes ML pipelines for denoising gravitational-wave time-series data, for finding transients from both modeled and unmodeled sources (anomaly detection), and for estimating the intrinsic and extrinsic physical parameters of gravitational-wave sources. The repository also includes libraries for astrophysical signal generation, for streamlining training, and for incorporating Inference-as-a-Service along the lines of the SONIC services in CMS.
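To make the time-series framing concrete, here is a generic PyTorch sketch of the kind of model such pipelines train: a small 1D CNN that classifies multi-detector strain segments as signal or noise. This is plain PyTorch, not the ml4gw API, and all shapes and layer sizes are assumptions:

```python
# Generic PyTorch sketch (not the ml4gw API): a small 1D CNN that flags
# transients in strain-like time series. Shapes and sizes are assumptions.
import torch
import torch.nn as nn

class TransientDetector(nn.Module):
    def __init__(self, n_channels=2):  # e.g. two interferometer channels
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=7, stride=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=7, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(64, 1),  # logit: signal vs. noise
        )

    def forward(self, x):
        return self.net(x)

# A batch of 1-second segments sampled at 2048 Hz from two detectors.
batch = torch.randn(8, 2, 2048)
logits = TransientDetector()(batch)
```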

SONIC

SONIC is short for Service for Optimized Network Inference on Co-processors, and it is built around inference as a service. Instead of the usual arrangement in which co-processors (GPUs, FPGAs, ASICs) are directly connected to the CPUs, as-a-Service computing connects CPUs and co-processors via the network. Clients only need to communicate with the server and handle the I/O, while the server directs the co-processors' computation. In the CMS software framework (CMSSW), we set up the SONIC workflow to run inference as a service: the clients deployed in CMSSW handle the I/O, and an NVIDIA Triton inference server runs inference for machine-learning models (and also for classical domain algorithms).
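The client side of as-a-service inference is deliberately thin. A minimal Python sketch against a Triton server using the official tritonclient package; the server address, model name, and tensor names are placeholders (in CMSSW the clients are C++ modules, but the protocol is the same):

```python
# Minimal Triton gRPC client sketch; server URL, model name, and tensor
# names are placeholder assumptions for illustration.
import numpy as np
import tritonclient.grpc as grpcclient

# Connect to a running Triton server over the network.
client = grpcclient.InferenceServerClient(url="localhost:8001")

# Describe and fill the input tensor expected by the served model.
inp = grpcclient.InferInput("INPUT__0", [1, 8], "FP32")
inp.set_data_from_numpy(np.random.rand(1, 8).astype(np.float32))

# The server schedules the request onto its co-processors and replies.
result = client.infer(model_name="my_model", inputs=[inp])
output = result.as_numpy("OUTPUT__0")
```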

TorchSparse

TorchSparse addresses the complex challenges of point cloud computation, a critical task for autonomous driving and other applications. Point cloud convolution differs markedly from traditional dense 2D convolution due to its sparse and irregular computation patterns, which necessitate specialized high-performance kernels for effective inference. Existing libraries for point cloud deep learning relied on a single, consistent dataflow for convolution throughout a model's execution; TorchSparse improves on this by systematically analyzing and optimizing these dataflows. As a result, TorchSparse achieves significant inference speedups for point cloud convolution on an NVIDIA A100 GPU, a substantial advance in efficiency for these operations.
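At the API level, sparse convolutions are assembled much like dense PyTorch layers. A rough sketch of a sparse 3D convolution block; the constructor signature and coordinate layout differ between TorchSparse releases (and the kernels generally assume a CUDA device), so treat the details below as assumptions:

```python
# Rough TorchSparse sketch; the SparseTensor signature and (batch, x, y, z)
# coordinate layout are assumptions that vary across releases, and the
# kernels generally require a CUDA GPU.
import torch
from torchsparse import SparseTensor
from torchsparse import nn as spnn

# 1000 occupied voxels with integer coordinates and 4 features each.
coords = torch.randint(0, 64, (1000, 4), dtype=torch.int32)
coords[:, 0] = 0  # single example in the batch
feats = torch.randn(1000, 4)
x = SparseTensor(coords=coords, feats=feats)

# Sparse convolution only computes at occupied sites, unlike dense 3D conv.
block = torch.nn.Sequential(
    spnn.Conv3d(4, 32, kernel_size=3, stride=1),
    spnn.BatchNorm(32),
    spnn.ReLU(True),
)
out = block(x)  # out.feats has shape [N, 32]
```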

GSAT

GSAT tackles the challenge of reliable interpretation in graph neural networks (GNNs), where common attention mechanisms fall short. Traditional approaches rely on post-hoc methods that often underfit or overfit, limiting their usefulness for extracting meaningful patterns from data. This work introduces an inherently interpretable model built on the Graph Stochastic Attention (GSAT) mechanism, in which the explainer is trained jointly with the predictor. GSAT enhances both interpretability and out-of-distribution generalizability, overcoming the issues associated with post-hoc methods.
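The core idea is an information bottleneck on attention: edge masks are sampled from learned attention probabilities and regularized toward a Bernoulli prior, so the predictor can only rely on a compressed, interpretable subgraph. A minimal sketch of that regularizer; the names and relaxation details are illustrative, not the repository's implementation:

```python
# Sketch of GSAT's stochastic-attention idea: sample differentiable edge
# masks and penalize their divergence from a Bernoulli(r) prior. Names
# and relaxation details are illustrative assumptions.
import torch

def bernoulli_kl(att: torch.Tensor, r: float = 0.7) -> torch.Tensor:
    """KL( Bernoulli(att) || Bernoulli(r) ) averaged over edges."""
    att = att.clamp(1e-6, 1 - 1e-6)
    return (att * torch.log(att / r)
            + (1 - att) * torch.log((1 - att) / (1 - r))).mean()

# Learned attention probability per edge (e.g. from an MLP on edge features).
att = torch.sigmoid(torch.randn(500))

# Differentiable Bernoulli samples via the Gumbel-softmax relaxation.
logits = torch.stack([att.log(), (1 - att).log()], dim=-1)
mask = torch.nn.functional.gumbel_softmax(logits, tau=1.0, hard=False)[..., 0]

# Total objective = predictor's task loss on the masked graph
#                   + beta * information penalty on the attention.
info_penalty = bernoulli_kl(att, r=0.7)
```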

PyLog

PyLog addresses the need for hardware acceleration in complex applications by simplifying FPGA programming. Unlike traditional FPGA design, which requires extensive expertise and lengthy development cycles, PyLog uses a high-level Python-based flow that speeds up the process. It automates the generation of FPGA designs by taking Python functions, transforming them into an intermediate representation, and applying various optimizations such as pragma insertion and memory customization. PyLog’s design flow culminates in complete FPGA system designs, and it includes a runtime that allows direct execution on FPGA platforms, bridging the gap between hardware design and software development.
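In the spirit of the examples in the PyLog paper, a kernel is ordinary Python marked for synthesis with a decorator, and the same call runs on the FPGA through the PyLog runtime. Treat the import path, decorator, and host-side dispatch below as assumptions drawn from the paper rather than a verified API:

```python
# Sketch of PyLog's decorator-based flow; the import path, decorator name,
# and host-side dispatch are assumptions based on the paper's examples.
import numpy as np
from pylog import pylog  # assumed import path

@pylog  # marks this function for compilation to HLS C and an FPGA design
def vecadd(a, b, c):
    for i in range(1024):
        c[i] = a[i] + b[i]

a = np.random.rand(1024).astype(np.float32)
b = np.random.rand(1024).astype(np.float32)
c = np.zeros(1024, dtype=np.float32)

# On a supported FPGA platform the same call dispatches to the accelerator,
# with the PyLog runtime handling host-device data movement.
vecadd(a, b, c)
```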

ScaleHLS

ScaleHLS is a high-level synthesis (HLS) framework built on MLIR. It can compile HLS C/C++ or PyTorch models into optimized HLS C/C++, from which downstream tools such as Xilinx Vivado HLS generate high-efficiency RTL designs. Because the MLIR framework can be tuned to particular algorithms at different representation levels, ScaleHLS is more scalable and customizable for applications with intrinsic structural or functional hierarchies. ScaleHLS represents HLS designs at multiple levels of abstraction and provides an HLS-dedicated analysis and transform library (in both C++ and Python) to solve optimization problems at the most suitable representation level. Using this library, we have developed a design space exploration engine that generates optimized HLS designs automatically.

HEPT

HEPT is a novel transformer model designed for large-scale point cloud processing in scientific fields such as high-energy physics and astrophysics. It addresses the limitations of traditional graph neural networks and standard transformers by incorporating local inductive bias and achieving near-linear complexity with hardware-friendly operations. A key ingredient of HEPT is its use of locality-sensitive hashing (LSH), in particular OR- & AND-construction LSH, to handle large-scale data efficiently. The resulting LSH-based Efficient Point Transformer significantly outpaces existing models in both accuracy and computational speed on complex tasks, marking a considerable advance in geometric deep learning and the processing of scientific data.
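The bucketing idea at HEPT's core can be sketched in a few lines: AND-construction concatenates several random-projection hashes so points must agree on all of them to share a bucket, and OR-construction keeps several independent tables so a collision in any one of them makes two points candidate neighbors. A toy version, with all parameters and the grouping code illustrative rather than the repository's implementation:

```python
# Toy OR & AND-construction LSH for points; parameters and grouping are
# illustrative assumptions, not HEPT's implementation.
import torch
from collections import defaultdict

n, d = 5000, 3        # points in 3-D space
k, L, w = 4, 8, 0.5   # AND width k, number of OR tables L, bucket width w
points = torch.randn(n, d)

# AND-construction: concatenate k random-projection hashes per table, so
# two points share a bucket only if all k integer codes agree.
a = torch.randn(L, k, d)
b = torch.rand(L, k) * w
codes = torch.floor(
    (torch.einsum('lkd,nd->lnk', a, points) + b[:, None, :]) / w
).int()

# OR-construction: L independent tables; points become candidate neighbors
# (e.g. allowed to attend to each other) if they collide in ANY table.
tables = [defaultdict(list) for _ in range(L)]
for l in range(L):
    for i, code in enumerate(codes[l]):
        tables[l][tuple(code.tolist())].append(i)
```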
