publications

Refer to my Google Scholar profile for the full list.

2024

  1. NeurIPS 2024
    ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
    Haoran You, Yipin Guo, Yichao Fu, and 6 more authors
    In Thirty-Eighth Annual Conference on Neural Information Processing Systems, 2024
  2. NeurIPS 2024
    GEAR: An efficient KV cache compression recipe for near-lossless generative inference of LLM
    Hao Kang, Qingru Zhang, Souvik Kundu, and 4 more authors
    In Thirty-Eighth Annual Conference on Neural Information Processing Systems Workshop (Spotlight), 2024
  3. NeurIPS 2024
    CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing
    Wenhao Zheng, Yixiao Chen, Weitong Zhang, and 6 more authors
    In Thirty-Eighth Annual Conference on Neural Information Processing Systems Workshop, 2024
  4. EMNLP 2024
    LaMDA: Large Model Fine-Tuning via Spectrally Decomposed Low-Dimensional Adaptation
    Seyedarmin Azizi, Souvik Kundu, and Massoud Pedram
    In Conference on Empirical Methods in Natural Language Processing (Findings), 2024
  5. ECCV 2024
    CLAMP-ViT: contrastive data-free learning for adaptive post-training quantization of ViTs
    Akshat Ramachandran, Souvik Kundu, and Tushar Krishna
    In European Conference on Computer Vision, 2024
  6. ECCV 2024
    GenQ: Quantization in Low Data Regimes with Generative Synthetic Data
    Yuhang Li, Youngeun Kim, Donghyun Lee, and 2 more authors
    In European Conference on Computer Vision, 2024
  7. ACL 2024
    AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models
    Souvik Kundu*, Zeyu Liu*, Anni Li, and 3 more authors
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Best paper recommendation), 2024
  8. ICML 2024
    Junk DNA hypothesis: A task-centric angle of LLM pre-trained weights through sparsity
    Lu Yin, Shiwei Liu, Ajay Jaiswal, and 2 more authors
    In International Conference on Machine Learning, 2024
  9. TMLR 2024
    Bit-by-Bit: Investigating the Vulnerabilities of Binary Neural Networks to Adversarial Bit Flipping
    Shamik Kundu, Sanjay Das, Sayar Karmakar, and 4 more authors
    In Transactions on Machine Learning Research, 2024
  10. ICPR 2024
    What Makes Vision Transformers Robust Towards Bit-Flip Attacks?
    Xuan Zhou, Souvik Kundu, Dake Chen, and 2 more authors
    In International Conference on Pattern Recognition (Oral), 2024
  11. CVPR 2024
    Block Selective Reprogramming for On-device Training of Vision Transformers
    Sreetama Sarkar, Souvik Kundu, Kai Zheng, and 1 more author
    In Workshop Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (Oral), 2024
  12. CVPR 2024
    RLNet: Robust Linearized Networks for Efficient Private Inference
    Souvik Kundu*, Sreetama Sarkar*, and Peter A Beerel
    In Workshop Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (Oral), 2024
  13. CVPR 2024
    DIA: Diffusion based Inverse Network Attack on Collaborative Inference
    Dake Chen, Shiduo Li, Yuke Zhang, and 3 more authors
    In Workshop Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
  14. ICLR 2024
    Fusing models with complementary expertise
    Hongyi Wang, Felipe Maia Polo, Yuekai Sun, and 3 more authors
    In International Conference on Learning Representations, 2024
  15. ICASSP 2024
    Sensi-BERT: Towards sensitivity driven fine-tuning for parameter-efficient BERT
    Souvik Kundu, Sharath Nittur Sridhar, Maciej Szankin, and 1 more author
    In IEEE International Conference on Acoustics, Speech and Signal Processing, 2024
  16. ICASSP 2024
    Recent Advances in Scalable Energy-Efficient and Trustworthy Spiking Neural Networks: from Algorithms to Technology
    Souvik Kundu, Rui-Jie Zhu, Akhilesh Jaiswal, and 1 more author
    In IEEE International Conference on Acoustics, Speech and Signal Processing, 2024
  17. TinyML 2024
    CiMNet: Towards Joint Optimization for DNN Architecture and Configuration for Compute-In-Memory Hardware
    Souvik Kundu, Anthony Sarah, Vinay Joshi, and 2 more authors
    In TinyML Conference (Long Talk), 2024

2023

  1. NeurIPS 2023
    Don’t just prune by magnitude! Your mask topology is a secret weapon
    Duc Hoang, Souvik Kundu, Shiwei Liu, and 2 more authors
    In Advances in Neural Information Processing Systems, 2023
  2. ICCAD 2023
    RNA-ViT: Reduced-Dimension Approximate Normalized Attention Vision Transformers for Latency Efficient Private Inference
    Souvik Kundu*, Dake Chen*, Yuke Zhang*, and 2 more authors
    In International Conference on Computer Aided Design, 2023
  3. ICCV 2023
    Vision HGNN: An image is more than a graph of nodes
    Yan Han, Peihao Wang, Souvik Kundu, and 2 more authors
    In Proceedings of the IEEE/CVF International Conference on Computer Vision (Oral), 2023
  4. ICCV 2023
    SAL-ViT: Towards latency efficient private inference on ViT using selective attention search with a learnable softmax approximation
    Souvik Kundu*, Yuke Zhang*, Dake Chen*, and 2 more authors
    In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
  5. ICCV 2023
    InstaTune: Instantaneous neural architecture search during fine-tuning
    Sharath Nittur Sridhar, Souvik Kundu, Sairam Sundaresan, and 2 more authors
    In Workshop Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
  6. TMLR 2023
    Revisiting Sparsity Hunting in Federated Learning: Why does Sparsity Consensus Matter?
    Souvik Kundu*, Sara Babakniya*, Saurav Prakash, and 2 more authors
    In Transactions on Machine Learning Research, 2023
  7. TMLR 2023
    Overcoming resource constraints in federated learning: Large models can be trained with only weak clients
    Yue Niu, Saurav Prakash, Souvik Kundu, and 2 more authors
    In Transactions on Machine Learning Research, 2023
  8. CVPR 2023
    Making models shallow again: Jointly learning to reduce non-linearity and depth for latency-efficient private inference
    Souvik Kundu, Yuke Zhang, Dake Chen, and 1 more author
    In Workshop Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (Oral), 2023
  9. DAC 2023
    C2PI: An Efficient Crypto-Clear Two-Party Neural Network Private Inference
    Yuke Zhang, Dake Chen, Souvik Kundu, and 3 more authors
    In 60th ACM/IEEE Design Automation Conference (DAC), 2023
  10. ICLR 2023
    Learning to linearize deep neural networks for secure and efficient private inference
    Souvik Kundu, Shunlin Lu, Yuke Zhang, and 2 more authors
    In International Conference on Learning Representations, 2023

2022

  1. ACM TECS 2022
    Toward Adversary-aware Non-iterative Model Pruning through Dynamic Network Rewiring of DNNs
    Souvik Kundu, Yao Fu, Bill Ye, and 2 more authors
    In ACM Transactions on Embedded Computing Systems, 2022
  2. Euromicro 2022
    PipeEdge: Pipeline parallelism for large-scale model inference on heterogeneous edge devices
    Yang Hu, Connor Imes, Xuanang Zhao, and 4 more authors
    In 2022 25th Euromicro Conference on Digital System Design (DSD), 2022
  3. DATE 2022
    BMPQ: bit-gradient sensitivity-driven mixed-precision quantization of DNNs from scratch
    Souvik Kundu, Shikai Wang, Qirui Sun, and 2 more authors
    In 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2022
  4. VLSI-SoC 2022
    P2m-detrack: Processing-in-pixel-in-memory for energy-efficient and real-time multi-object detection and tracking
    Souvik Kundu*, Gourav Datta*, Zihan Yin, and 8 more authors
    In 2022 IFIP/IEEE 30th International Conference on Very Large Scale Integration (VLSI-SoC), 2022
  5. Nature 2022
    A processing-in-pixel-in-memory paradigm for resource-constrained TinyML applications
    Souvik Kundu*, Gourav Datta*, Zihan Yin*, and 5 more authors
    In Nature Scientific Reports, 2022

2021

  1. NeurIPS 2021
    Analyzing the confidentiality of undistillable teachers in knowledge distillation
    Souvik Kundu, Qirui Sun, Yao Fu, and 2 more authors
    In Advances in Neural Information Processing Systems, 2021
  2. ICCV 2021
    Hire-SNN: Harnessing the inherent robustness of energy-efficient deep spiking neural networks by training with crafted input noise
    Souvik Kundu, Massoud Pedram, and Peter A Beerel
    In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021
  3. WACV 2021
    Spike-thrift: Towards energy-efficient deep spiking neural networks by limiting spiking activity via attention-guided compression
    Souvik Kundu, Gourav Datta, Massoud Pedram, and 1 more author
    In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2021
  4. ASP-DAC 2021
    DNR: A tunable robust pruning framework through dynamic network rewiring of DNNs
    Souvik Kundu, Mahdi Nazemi, Peter A Beerel, and 1 more author
    In Proceedings of the 26th Asia and South Pacific Design Automation Conference, 2021
  5. ICASSP 2021
    Attentionlite: Towards efficient self-attention models for vision
    Souvik Kundu and Sairam Sundaresan
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021

2020

  1. IEEE TC 2020
    Pre-defined sparsity for low-complexity convolutional neural networks
    Souvik Kundu, Mahdi Nazemi, Massoud Pedram, and 2 more authors
    In IEEE Transactions on Computers, 2020