AI Hardware-Software Co-design

Deep neural networks (DNNs) have driven significant breakthroughs, but their growing computational and energy demands raise serious concerns. With Moore's Law slowing and Dennard scaling at an end, energy-efficient solutions built on emerging hardware, approximate computing, and in-memory techniques are essential for AI systems. This research focuses on hardware-software co-design strategies that improve AI efficiency across platforms, from data centers to edge and embedded devices. Key areas include optimizing training and inference algorithms for large models, model quantization, compression techniques, distributed training, and hardware-aware neural architecture search. The research also explores analog AI accelerators, which promise substantial efficiency gains but pose challenges in precision and system integration, with the goal of balancing energy performance against model accuracy.
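To make one of these key areas concrete, the sketch below illustrates symmetric per-tensor int8 post-training quantization, one common form of the model quantization mentioned above. It is a minimal, self-contained example; the tensor shape, scale choice, and error metrics are illustrative assumptions, not details drawn from any project listed on this page.

import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: map floats to [-127, 127]."""
    scale = np.abs(w).max() / 127.0          # one scale shared by the whole tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from the int8 codes."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=(256, 256)).astype(np.float32)  # toy weight matrix

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# The reconstruction error is the accuracy cost traded for ~4x smaller weights
# and cheaper integer arithmetic on low-precision hardware.
print("max abs error: ", np.abs(w - w_hat).max())
print("mean abs error:", np.abs(w - w_hat).mean())

In practice, per-channel scales and quantization-aware fine-tuning typically shrink this error further, which is the kind of accuracy-efficiency trade-off the projects below study at much larger scale.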

Projects

2025

Project | RPI Principal Investigators | IBM Principal Investigators
Low-precision Distributed Accelerated Methods and Library Development for Training and Fine-tuning Foundation Models | Yangyang Xu, George Slota | Jie Chen, Naigang Wang
Closing the Accuracy Gap in Analog In-memory Training: Device-dependent Algorithms and Hyperparameter Search | Tianyi Chen, Liu Liu | Tayfun Gokmen, Omobayode Fagbohungbe
Optimization of Hardware-based Neural Network Accelerators for Fluorescence Lifetime in Biomedical Applications | Xavier Intes, Vikas Pandey | Karthik Swaminathan, Aporva Amarnath, Naigang Wang
Holistic Algorithm-Architecture Co-Design of Approximate Computing for Scalable Foundation Models | Tong Zhang, Liu Liu | Swagath Venkataramani, Sanchari Sen
Model Optimization and Hardware-aware Neural Architecture Search for Spatiotemporal Data Mining | Yinan Wang, Liu Liu | Kaoutar El Maghraoui
Efficient Deployment of Large Language Models over Heterogeneous Computing Systems | Meng Wang, Tong Zhang | Kaoutar El Maghraoui, Naigang Wang
Bringing AI Intelligence to 5G/6G Edge Platform | Ish Jain, Ali Tajer | Alberto Gracia, Kaoutar El Maghraoui
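The analog in-memory training projects listed above (and their 2024 predecessor below) simulate training on resistive crossbar devices, where weight updates pass through a device model and inherit its non-idealities. The sketch below follows the basic usage of IBM's open-source aihwkit (IBM Analog Hardware Acceleration Kit); treat it as an assumption-laden illustration, since class names and defaults may differ across toolkit versions.

# A minimal sketch, assuming aihwkit is installed (pip install aihwkit).
from torch import Tensor
from torch.nn.functional import mse_loss

from aihwkit.nn import AnalogLinear
from aihwkit.optim import AnalogSGD

# Toy regression data standing in for a real workload.
x = Tensor([[0.1, 0.2, 0.4, 0.3], [0.2, 0.1, 0.1, 0.3]])
y = Tensor([[1.0, 0.5], [0.7, 0.3]])

# A fully connected layer whose weights live on a simulated analog crossbar.
model = AnalogLinear(4, 2)

# Analog-aware SGD routes updates through the simulated device, which is
# where device non-idealities (noise, asymmetry) enter training.
opt = AnalogSGD(model.parameters(), lr=0.1)
opt.regroup_param_groups(model)

for epoch in range(10):
    opt.zero_grad()
    loss = mse_loss(model(x), y)
    loss.backward()
    opt.step()
    print(f"epoch {epoch}: loss = {loss.item():.4f}")

Closing the accuracy gap between this kind of analog training and a digital baseline, via device-dependent algorithms and hyperparameter search, is exactly what the projects above target.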

2024

Project | RPI Principal Investigators | IBM Principal Investigators
Algorithmic Innovations and Architectural Support towards In-Memory Training on Analog AI Accelerators | Tianyi Chen, Liu Liu | Tayfun Gokmen, Malte J. Rasch
Low-precision Second-order-type Distributed Methods for Training and Fine-tuning Foundation Models | Yangyang Xu, George Slota | Jie Chen, Mayank Agarwal, Yikang Shen, Naigang Wang
Optimization of Hardware-based Neural Network Accelerators for Fluorescence Lifetime Biomedical Applications | Xavier Intes | Karthik Swaminathan
Structured & Robust Neural Network Pruning on Low-Precision Hardware for Guaranteed Learning Performance for Complex Time-Series Datasets | Christopher Carothers, Meng Wang | Kaoutar El Maghraoui, Pin-Yu Chen, Naigang Wang