AI Hardware-Software Co-design

Deep neural networks (DNNs) have driven significant breakthroughs, but their growing computational and energy demands raise serious concerns. With Moore’s Law and Dennard scaling reaching their limits, energy-efficient solutions built on emerging hardware, approximate computing, and in-memory techniques are crucial for AI systems. This research focuses on hardware-software co-design strategies to improve AI efficiency across platforms ranging from data centers to edge and embedded devices. Key areas include optimizing training and inference algorithms for large models, model quantization, compression techniques, distributed training, and hardware-aware neural architecture search. In addition, the exploration of analog AI accelerators aims to improve energy efficiency while preserving model accuracy, addressing challenges such as limited precision and system integration.
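As a concrete illustration of one of these themes, the sketch below shows minimal post-training quantization: float weights are mapped to 8-bit integers with a single per-tensor scale and later dequantized for computation. This is a generic textbook example, not code from any project listed below; the function names and the symmetric per-tensor scheme are illustrative assumptions.

```python
import numpy as np

def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor quantization: map float weights to [-127, 127]."""
    # One scale shared by the whole tensor; the epsilon guards an all-zero tensor.
    scale = max(float(np.max(np.abs(w))), 1e-12) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from the INT8 values."""
    return q.astype(np.float32) * scale

# Toy weight matrix standing in for one layer of a much larger model.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
print("max absolute quantization error:", np.abs(w - dequantize_int8(q, scale)).max())
```

The reconstruction error is bounded by half the quantization step (scale / 2) per element, which is the basic accuracy-versus-precision trade-off that quantization and low-precision training projects navigate.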

Projects

2025

Project | RPI Principal Investigators | IBM Principal Investigators
Bringing AI Intelligence to 5G/6G Edge Platform | Ish Jain, Ali Tajer | Alberto Gracia, Kaoutar El Maghraoui
Co-Designing Analog AI System and Accelerator for Large Foundation Models | Liu Liu, Meng Wang | Sidney Tsai, Kaoutar El Maghraoui
Holistic Algorithm-Architecture Co-Design of Approximate Computing for Scalable Foundation Models | Tong Zhang, Liu Liu | Swagath Venkataramani, Sanchari Sen
Low-precision Distributed Accelerated Methods and Library Development for Training and Fine-tuning Foundation Models | Yangyang Xu, George Slota | Jie Chen, Naigang Wang
Closing the Accuracy Gap in Analog In-memory Training: Device-dependent Algorithms and Hyperparameter Search | Tianyi Chen, Liu Liu | Tayfun Gokmen, Omobayode Fagbohungbe
Optimization of Hardware-based Neural Network Accelerators for Fluorescence Lifetime in Biomedical Applications | Xavier Intes, Vikas Pandey | Karthik Swaminathan
Model Optimization and Hardware-aware Neural Architecture Search for Spatiotemporal Data Mining | Yinan Wang, Liu Liu | Kaoutar El Maghraoui
Efficient Deployment of Large Language Model over Heterogeneous Computing Systems | Meng Wang, Tong Zhang | Kaoutar El Maghraoui, Naigang Wang

2026

Project | RPI Principal Investigators | IBM Principal Investigators
Rethinking Retrieval Signals via Hybrid Retrieval Heads | Stacy Patterson | Wei Sun, Radu Florian, Yulong Li
Integrated Sensing and Communication with AI-RAN Platform | Ish Jain, Ali Tajer | Alberto Gracia, Kaoutar El Maghraoui, Arun Paidimarri
AutoComp: Automated Compression & Deployment for Foundation Models | Ruimin Ke | Kaoutar El Maghraoui, Naigang Wang
Enabling Efficient Inference and High Accuracy by Exploring Novel Linear-type Attention and KV Cache Optimization | Yangyang Xu, George Slota | Jie Chen, Naigang Wang
Hardware–Software Co-Design for Unified Pruning and Mixed-Precision Compression of Vision–Language Models | Meng Wang, Liu Liu | Kaoutar El Maghraoui, Pin-Yu Chen
Efficient Chiplet-based Memory Architecture for AI Hardware Accelerator | Kanad Basu, Liu Liu | Pradip Bose, Karthik Swaminathan, Nandhini Chandramoorthy, Gracen Wallace, Xin Zhang
Efficient Hardware Acceleration of CoFrNets | Kanad Basu | Amit Dhurandhar, Ruchir Puri, Pradip Bose, Karthikeyan Natesan Ramamurthy, Karthik Swaminathan
Efficient Deployment of Large Language Model over Heterogeneous Computing Systems | Tong Zhang, Meng Wang | Kaoutar El Maghraoui, Naigang Wang
Exploring Analog-Aware Learning and Architectures with Hardware Support for Next-Generation Foundation Models | Liu Liu | Sidney Tsai, Kaoutar El Maghraoui
Hardware–Software Co-Design of Efficient Spatiotemporal Transformers and Mixture-of-Experts on IBM Hardware | Yinan Wang, Liu Liu | Kaoutar El Maghraoui, Pin-Yu Chen
KV-cache Management for Improving Run-time Efficiency of Large Reasoning Models | Mohammad Mohammadi Amiri | Pin-Yu Chen, Tejaswini Pedapati, Subhajit Chaudhury, Keerthiram Murugesan, Kaoutar El Maghraoui, Naigang Wang, Charlie Liu

2024

Project | RPI Principal Investigators | IBM Principal Investigators
Algorithmic Innovations and Architectural Support towards In-Memory Training on Analog AI Accelerators | Tianyi Chen, Liu Liu | Tayfun Gokmen, Malte J. Rasch
Low-precision Second-order-type Distributed Methods for Training and Fine-tuning Foundation Models | Yangyang Xu, George Slota | Jie Chen, Mayank Agarwal, Yikang Shen, Naigang Wang
Optimization of Hardware-based Neural Network Accelerators for Fluorescence Lifetime Biomedical Applications | Xavier Intes | Karthik Swaminathan
Structured & Robust Neural Network Pruning on Low-Precision Hardware for Guaranteed Learning Performance for Complex Time-Series Datasets | Christopher Carothers, Meng Wang | Kaoutar El Maghraoui, Pin-Yu Chen, Naigang Wang
Co-Designing Analog AI System and Accelerator for Large Foundation Models | Liu Liu, Meng Wang | Sidney Tsai, Kaoutar El Maghraoui
Holistic Algorithm-Architecture Co-Design of Approximate Computing for Scalable Foundation Models | Tong Zhang, Liu Liu | Swagath Venkataramani, Sanchari Sen