Deep neural networks (DNNs) have driven significant breakthroughs, but their growing computational and energy demands raise serious concerns. With Moore's Law and Dennard scaling reaching their limits, energy-efficient solutions built on emerging hardware, approximate computing, and in-memory techniques are crucial for AI systems. This research focuses on hardware-software co-design strategies that improve AI efficiency across platforms, from data centers to edge and embedded devices. Key areas include optimizing training and inference algorithms for large models, model quantization and compression, distributed training, and hardware-aware neural architecture search. In addition, the exploration of analog AI accelerators aims to improve energy efficiency while addressing challenges such as limited precision and system integration, balancing energy performance against model accuracy.
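Of the techniques named above, model quantization is perhaps the simplest to illustrate. The sketch below shows symmetric int8 post-training quantization of a weight vector; the function names and the single per-tensor scale factor are illustrative assumptions, not the API of any particular framework used in these projects.

```python
# Minimal sketch of symmetric int8 post-training quantization.
# A single scale maps float weights onto signed integers; the
# names `quantize`/`dequantize` are hypothetical, for illustration.

def quantize(weights, num_bits=8):
    """Map float weights to integers in [-qmax, qmax] via one scale."""
    qmax = 2 ** (num_bits - 1) - 1                # 127 for int8
    scale = max(abs(w) for w in weights) / qmax   # per-tensor scale
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from integers and the scale."""
    return [v * scale for v in q]

w = [0.61, -1.27, 0.03, 0.98]
q, s = quantize(w)
w_hat = dequantize(q, s)
# Rounding error per weight is bounded by scale / 2; more bits
# shrink the scale and hence the error, at higher storage cost.
```

In practice the projects above study far richer variants (per-channel scales, mixed precision, quantization-aware training), but the energy story is the same: lower-precision weights mean smaller memories and cheaper arithmetic.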
# AI Hardware-Software Co-design

## Projects

### 2025
| Project | RPI Principal Investigators | IBM Principal Investigators |
|---|---|---|
| Bringing AI Intelligence to 5G/6G Edge Platform | Ish Jain, Ali Tajer | Alberto Gracia, Kaoutar El Maghraoui |
| Co-Designing Analog AI System and Accelerator for Large Foundation Models | Liu Liu, Meng Wang | Sidney Tsai, Kaoutar El Maghraoui |
| Holistic Algorithm-Architecture Co-Design of Approximate Computing for Scalable Foundation Models | Tong Zhang, Liu Liu | Swagath Venkataramani, Sanchari Sen |
| Low-precision Distributed Accelerated Methods and Library Development for Training and Fine-tuning Foundation Models | Yangyang Xu, George Slota | Jie Chen, Naigang Wang |
| Closing the Accuracy Gap in Analog In-memory Training: Device-dependent Algorithms and Hyperparameter Search | Tianyi Chen, Liu Liu | Tayfun Gokmen, Omobayode Fagbohungbe |
| Optimization of Hardware-based Neural Network Accelerators for Fluorescence Lifetime in Biomedical Applications | Xavier Intes, Vikas Pandey | Karthik Swaminathan |
| Model Optimization and Hardware-aware Neural Architecture Search for Spatiotemporal Data Mining | Yinan Wang, Liu Liu | Kaoutar El Maghraoui |
| Efficient Deployment of Large Language Model over Heterogeneous Computing Systems | Meng Wang, Tong Zhang | Kaoutar El Maghraoui, Naigang Wang |
### 2026
| Project | RPI Principal Investigators | IBM Principal Investigators |
|---|---|---|
| Rethinking Retrieval Signals via Hybrid Retrieval Heads | Stacy Patterson | Wei Sun, Radu Florian, Yulong Li |
| Integrated Sensing and Communication with AI-RAN Platform | Ish Jain, Ali Tajer | Alberto Gracia, Kaoutar El Maghraoui, Arun Paidimarri |
| AutoComp: Automated Compression & Deployment for Foundation Models | Ruimin Ke | Kaoutar El Maghraoui, Naigang Wang |
| Enabling Efficient Inference and High Accuracy by Exploring Novel Linear-type Attention and KV Cache Optimization | Yangyang Xu, George Slota | Jie Chen, Naigang Wang |
| Hardware–Software Co-Design for Unified Pruning and Mixed-Precision Compression of Vision–Language Model | Meng Wang, Liu Liu | Kaoutar El Maghraoui, Pin-Yu Chen |
| Efficient Chiplet-based Memory Architecture for AI Hardware Accelerator | Kanad Basu, Liu Liu | Pradip Bose, Karthik Swaminathan, Nandhini Chandramoorthy, Gracen Wallace, Xin Zhang |
| Efficient Hardware Acceleration of CoFrNets | Kanad Basu | Amit Dhurandhar, Ruchir Puri, Pradip Bose, Karthikeyan Natesan Ramamurthy, Karthik Swaminathan |
| Efficient Deployment of Large Language Model over Heterogeneous Computing Systems | Tong Zhang, Meng Wang | Kaoutar El Maghraoui, Naigang Wang |
| Exploring Analog-Aware Learning and Architectures with Hardware Support for Next-Generation Foundation Models | Liu Liu | Sidney Tsai, Kaoutar El Maghraoui |
| Hardware–Software Co-Design of Efficient Spatiotemporal Transformers and Mixture-of-Experts on IBM Hardware | Yinan Wang, Liu Liu | Kaoutar El Maghraoui, Pin-Yu Chen |
| KV-cache Management for Improving Run-time Efficiency of Large Reasoning Models | Mohammad Mohammadi Amiri | Pin-Yu Chen, Tejaswini Pedapati, Subhajit Chaudhury, Keerthiram Murugesan, Kaoutar El Maghraoui, Naigang Wang, Charlie Liu |
### 2024
| Project | RPI Principal Investigators | IBM Principal Investigators |
|---|---|---|
| Algorithmic Innovations and Architectural Support towards In-Memory Training on Analog AI Accelerators | Tianyi Chen, Liu Liu | Tayfun Gokmen, Malte J. Rasch |
| Low-precision Second-order-type Distributed Methods for Training and Fine-tuning Foundation Models | Yangyang Xu, George Slota | Jie Chen, Mayank Agarwal, Yikang Shen, Naigang Wang |
| Optimization of Hardware-based Neural Network Accelerators for Fluorescence Lifetime Biomedical Applications | Xavier Intes | Karthik Swaminathan |
| Structured & Robust Neural Network Pruning on Low-Precision Hardware for Guaranteed Learning Performance for Complex Time-Series Datasets | Christopher Carothers, Meng Wang | Kaoutar El Maghraoui, Pin-Yu Chen, Naigang Wang |
| Co-Designing Analog AI System and Accelerator for Large Foundation Models | Liu Liu, Meng Wang | Sidney Tsai, Kaoutar El Maghraoui |
| Holistic Algorithm-Architecture Co-Design of Approximate Computing for Scalable Foundation Models | Tong Zhang, Liu Liu | Swagath Venkataramani, Sanchari Sen |