Souvik Kundu, Ph.D.

I am a Staff Research Scientist at Intel Labs, leading research efforts in scalable and novel AI primitives. Prior to joining Intel, I completed my Ph.D. in Electrical & Computer Engineering at the University of Southern California, where I was co-advised by Dr. Massoud Pedram and Dr. Peter A. Beerel. I was fortunate to receive the outstanding Ph.D. award and the Order De Arete award, along with multiple prestigious fellowships. In 2023, I became one of the youngest recipients of the Semiconductor Research Corporation (SRC, USA) outstanding liaison award for my impactful research. My research goal is to enrich human life with robust and efficient AI services through cross-layer innovation that blends algorithmic optimizations with existing and novel hardware compute and architectures. I have co-authored more than 75 peer-reviewed papers in various top-tier conferences including NeurIPS, ICLR, ICML, ACL, CVPR, DATE, and DAC, with multiple oral, young fellow, travel award, and best paper nominations. [google scholar]

I serve as a founding AC and committee member of the Conference on Parsimony and Learning (CPAL). Additionally, I serve as an AC or committee member for various journals and conferences including ICLR, NeurIPS (outstanding reviewer recognition '22), EMNLP (outstanding reviewer recognition '20), CVPR, DATE, and DAC.

news

Dec 03, 2024 Our workshop proposal on Scalable Optimization for Efficient and Adaptive Foundation Models has been accepted at ICLR'25! Please consider submitting. :sparkles:
Dec 02, 2024 Gave an invited talk on Scalable and Personalized Machine Learning at the Center for Machine Learning and Data Science (CMInDS) of IIT Bombay!
Nov 05, 2024 MicroScopiQ, a collaboration project with Georgia Tech, won the runner-up prize in the MICRO'24 student research project competition! :sparkles:
Oct 30, 2024 1x paper got accepted at WACV 2025
Sep 28, 2024 1x main conference paper (ShiftAddLLM, the first MATMUL-free LLM) and 2x workshop papers got accepted at NeurIPS 2024 :sparkles:
Sep 24, 2024 1x paper on LLM fine-tuning (LaMDA) got accepted as Findings of EMNLP 2024 :sparkles:

selected publications

  1. Arxiv 2024
    MicroScopiQ: Accelerating Foundational Models through Outlier-Aware Microscaling Quantization
    Akshat Ramachandran, Souvik Kundu, and Tushar Krishna
    In arXiv preprint, under review, 2024
  2. NeurIPS 2024
    ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
    Haoran You, Yipin Guo, Yichao Fu, and 6 more authors
    In Thirty-Eighth Annual Conference on Neural Information Processing Systems, 2024
  3. NeurIPS 2024
    GEAR: An efficient KV cache compression recipe for near-lossless generative inference of LLM
    Hao Kang, Qingru Zhang, Souvik Kundu, and 4 more authors
    In Thirty-Eighth Annual Conference on Neural Information Processing Systems Workshop (Spotlight), 2024
  4. EMNLP 2024
    LaMDA: Large Model Fine-Tuning via Spectrally Decomposed Low-Dimensional Adaptation
    Seyedarmin Azizi, Souvik Kundu, and Massoud Pedram
    In Conference on Empirical Methods in Natural Language Processing (Findings), 2024
  5. TinyML 2024
    CiMNet: Towards Joint Optimization for DNN Architecture and Configuration for Compute-In-Memory Hardware
    Souvik Kundu, Anthony Sarah, Vinay Joshi, and 2 more authors
    In TinyML Conference long talk, 2024
  6. ICLR 2023
    Learning to linearize deep neural networks for secure and efficient private inference
    Souvik Kundu, Shunlin Lu, Yuke Zhang, and 2 more authors
    In International Conference on Learning Representations, 2023
  7. NeurIPS 2021
    Analyzing the confidentiality of undistillable teachers in knowledge distillation
    Souvik Kundu, Qirui Sun, Yao Fu, and 2 more authors
    In Advances in Neural Information Processing Systems, 2021
  8. WACV 2021
    Spike-thrift: Towards energy-efficient deep spiking neural networks by limiting spiking activity via attention-guided compression
    Souvik Kundu, Gourav Datta, Massoud Pedram, and 1 more author
    In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2021
  9. ASP-DAC 2021
    DNR: A tunable robust pruning framework through dynamic network rewiring of DNNs
    Souvik Kundu, Mahdi Nazemi, Peter A Beerel, and 1 more author
    In Proceedings of the 26th Asia and South Pacific Design Automation Conference, 2021
  10. IEEE TC 2020
    Pre-defined sparsity for low-complexity convolutional neural networks
    Souvik Kundu, Mahdi Nazemi, Massoud Pedram, and 2 more authors
    In IEEE Transactions on Computers, 2020