Souvik Kundu, Ph.D.
I am a Staff Research Scientist at Intel Labs, leading research efforts in scalable and novel AI primitives. Prior to joining Intel, I completed my Ph.D. in Electrical & Computer Engineering at the University of Southern California, where I was co-advised by Dr. Massoud Pedram and Dr. Peter A. Beerel. I was fortunate to receive the outstanding Ph.D. award and the Order De Arete award, along with multiple prestigious fellowships. I am one of the youngest recipients of the Semiconductor Research Corporation (SRC, USA) outstanding liaison award for my impactful research in 2023. My research goal is to enable robust and efficient AI services for human life through cross-layer innovation that blends algorithmic optimizations with existing and novel hardware compute and architectures. I have co-authored >75 peer-reviewed papers in various top-tier conferences including NeurIPS, ICLR, ICML, ACL, CVPR, DATE, and DAC, with multiple oral presentations, young fellow and travel awards, and best paper nominations. [google scholar]
I serve as a founding AC and committee member of the Conference on Parsimony and Learning (CPAL). Additionally, I serve on the AC committee for various journals and conferences including ICLR, NeurIPS (outstanding reviewer recognition '22), EMNLP (outstanding reviewer recognition '20), CVPR, DATE, and DAC.
news
Date | News |
---|---|
Dec 03, 2024 | Our workshop proposal on Scalable Optimization for Efficient and Adaptive Foundation Models has been accepted at ICLR'25! Please consider submitting. |
Dec 02, 2024 | Gave an invited talk on Scalable and Personalized Machine Learning at the Center for Machine Learning and Data Science (CMInDS) of IIT Bombay! |
Nov 05, 2024 | Our collaboration project MicroScopiQ with Georgia Tech won the runner-up prize in the MICRO'24 student research project competition! |
Oct 30, 2024 | 1x paper was accepted at WACV 2025. |
Sep 28, 2024 | 1x main conference paper (ShiftAddLLM: first MATMUL-free LLM) and 2x workshop papers were accepted at NeurIPS 2024. |
Sep 24, 2024 | 1x paper on LLM fine-tuning (LaMDA) was accepted to Findings of EMNLP 2024. |
selected publications
- arXiv 2024. MicroScopiQ: Accelerating Foundational Models through Outlier-Aware Microscaling Quantization. In arXiv submission under review, 2024
- TinyML 2024. CiMNet: Towards Joint Optimization for DNN Architecture and Configuration for Compute-In-Memory Hardware. In TinyML Conference long talk, 2024
- ICLR 2023. Learning to linearize deep neural networks for secure and efficient private inference. In International Conference on Learning Representations, 2023
- WACV 2021. Spike-thrift: Towards energy-efficient deep spiking neural networks by limiting spiking activity via attention-guided compression. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2021
- ASP-DAC 2021. DNR: A tunable robust pruning framework through dynamic network rewiring of DNNs. In Proceedings of the 26th Asia and South Pacific Design Automation Conference, 2021
- IEEE TC 2020. Pre-defined sparsity for low-complexity convolutional neural networks. In IEEE Transactions on Computers, 2020