publications
Refer to my Google Scholar profile for the full list.
2024
- [arXiv 2024] MicroScopiQ: Accelerating Foundational Models through Outlier-Aware Microscaling Quantization. In arXiv submission (under review), 2024
- [NeurIPS 2024] CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing. In Thirty-Eighth Annual Conference on Neural Information Processing Systems Workshop, 2024
- [ECCV 2024] GenQ: Quantization in Low Data Regimes with Generative Synthetic Data. In European Conference on Computer Vision, 2024
- [TMLR 2024] Bit-by-Bit: Investigating the Vulnerabilities of Binary Neural Networks to Adversarial Bit Flipping. In Transactions on Machine Learning Research, 2024
- [ICPR 2024] What Makes Vision Transformers Robust Towards Bit-Flip Attacks? In International Conference on Pattern Recognition (Oral), 2024
- [CVPR 2024] Block Selective Reprogramming for On-device Training of Vision Transformers. In Workshop Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (Oral), 2024
- [CVPR 2024] RLNet: Robust Linearized Networks for Efficient Private Inference. In Workshop Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (Oral), 2024
- [CVPR 2024] DIA: Diffusion based Inverse Network Attack on Collaborative Inference. In Workshop Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
- [ICLR 2024] Fusing models with complementary expertise. In International Conference on Learning Representations, 2024
- [ICASSP 2024] Sensi-BERT: Towards sensitivity driven fine-tuning for parameter-efficient BERT. In IEEE International Conference on Acoustics, Speech and Signal Processing, 2024
- [ICASSP 2024] Recent Advances in Scalable Energy-Efficient and Trustworthy Spiking Neural Networks: from Algorithms to Technology. In IEEE International Conference on Acoustics, Speech and Signal Processing, 2024
- [TinyML 2024] CiMNet: Towards Joint Optimization for DNN Architecture and Configuration for Compute-In-Memory Hardware. In TinyML Conference (long talk), 2024
2023
- [NeurIPS 2023] Don’t just prune by magnitude! Your mask topology is a secret weapon. In Advances in Neural Information Processing Systems, 2023
- [ICCAD 2023] RNA-ViT: Reduced-Dimension Approximate Normalized Attention Vision Transformers for Latency Efficient Private Inference. In International Conference on Computer Aided Design, 2023
- [ICCV 2023] Vision HGNN: An image is more than a graph of nodes. In Proceedings of the IEEE/CVF International Conference on Computer Vision (Oral), 2023
- [ICCV 2023] SAL-ViT: Towards latency efficient private inference on ViT using selective attention search with a learnable softmax approximation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
- [ICCV 2023] InstaTune: Instantaneous neural architecture search during fine-tuning. In Workshop Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
- [CVPR 2023] Making models shallow again: Jointly learning to reduce non-linearity and depth for latency-efficient private inference. In Workshop Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (Oral), 2023
- [DAC 2023] C2PI: An Efficient Crypto-Clear Two-Party Neural Network Private Inference. In 60th ACM/IEEE Design Automation Conference (DAC), 2023
- [ICLR 2023] Learning to linearize deep neural networks for secure and efficient private inference. In International Conference on Learning Representations, 2023
2022
- [ACM TECS 2022] Toward Adversary-aware Non-iterative Model Pruning through Dynamic Network Rewiring of DNNs. In ACM Transactions on Embedded Computing Systems, 2022
- [Euromicro 2022] PipeEdge: Pipeline parallelism for large-scale model inference on heterogeneous edge devices. In 2022 25th Euromicro Conference on Digital System Design (DSD), 2022
- [DATE 2022] BMPQ: Bit-gradient sensitivity-driven mixed-precision quantization of DNNs from scratch. In 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2022
- [VLSI-SoC 2022] P2M-DeTrack: Processing-in-pixel-in-memory for energy-efficient and real-time multi-object detection and tracking. In 2022 IFIP/IEEE 30th International Conference on Very Large Scale Integration (VLSI-SoC), 2022
- [Nature 2022] A processing-in-pixel-in-memory paradigm for resource-constrained TinyML applications. In Nature Scientific Reports, 2022
2021
- [ICCV 2021] HIRE-SNN: Harnessing the inherent robustness of energy-efficient deep spiking neural networks by training with crafted input noise. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021
- [WACV 2021] Spike-Thrift: Towards energy-efficient deep spiking neural networks by limiting spiking activity via attention-guided compression. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2021
- [ASP-DAC 2021] DNR: A tunable robust pruning framework through dynamic network rewiring of DNNs. In Proceedings of the 26th Asia and South Pacific Design Automation Conference, 2021
- [ICASSP 2021] AttentionLite: Towards efficient self-attention models for vision. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021
2020
- [IEEE TC 2020] Pre-defined sparsity for low-complexity convolutional neural networks. In IEEE Transactions on Computers, 2020