Shruti Bhargava

 

I work as an ML Engineer at Apple on the Siri and Language Technologies team. I completed my Master's in Computer Science at the University of Illinois at Urbana-Champaign (UIUC), advised by Prof. David Forsyth, and my Bachelor's in Computer Science at IIT Kanpur, India.

I have been fortunate to work with inspiring researchers.

Awards and Scholarships:

  • Recipient of the Grace Hopper Student Scholarship from the Anita Borg Institute.
  • Recipient of the Academic Excellence Award from IIT Kanpur.
  • National Talent Search (NTS) Scholar, Govt. of India.
  • Kishore Vaigyanik Protsahan Yojana (KVPY) Scholar, Dept. of Science and Technology, Govt. of India.

I am interested in building fair and impactful solutions using Machine Learning. In particular, I aim to develop technology at the intersection of Natural Language and Computer Vision to assist society.

Email  /  Google Scholar  /  LinkedIn  

 

  • ML Engineer, Apple Inc. [2019 - ]
  • MS, CS, University of Illinois, Urbana-Champaign [2017 - 2019]
  • Intern, Apple Inc. [Summer 2018]
  • Research Assistant, Coordinated Science Lab [Fall 2017]
  • Research Intern, Microsoft Research [Summer 2016]
  • Research Fellow, Max Planck Institute for Informatics [Summer 2015]
  • BTech, CS, IIT Kanpur [2013 - 2017]

Publications

SynthDST: Synthetic Data is All You Need for Few-Shot Dialog State Tracking
Atharva Kulkarni, Bo-Hsiang Tseng, Joel Ruben Antony Moniz, Dhivya Piraviperumal, Hong Yu, Shruti Bhargava
EACL (Oral) 2024

A data generation framework that efficiently generates synthetic data for dialogue schemas using countable templates. This bridges the gap between zero-shot prompting and few-shot prompting based on training data for dialog state tracking with LLMs.

Can Large Language Models Understand Context?
Yilun Zhu, Joel Ruben Antony Moniz, Shruti Bhargava, Jiarui Lu, Dhivya Piraviperumal, Site Li, Yuan Zhang, Hong Yu, Bo-Hsiang Tseng
EACL Findings 2024

A benchmark built by adapting four tasks and nine existing datasets, featuring prompts designed to assess the context-understanding abilities of LLMs. In the in-context learning (ICL) setting, models struggle to understand nuanced contextual signals compared to SOTA fine-tuned models. An assessment of quantized models provides promising insights on 3-bit post-training quantization.

Referring to Screen Texts with Voice Assistants
Shruti Bhargava, Anand Dhoot, Ing-Marie Jonsson, Hoang Long Nguyen, Alkesh Patel, Hong Yu, Vincent Renkens
ACL Industry Track 2023

A novel experience that lets users refer to data-detectable entities on their phone screens when interacting with voice assistants. We present a data strategy for screen reference resolution and a lightweight, general-purpose model that uses only the text extracted from the UI. The proposed model is modular, offering flexibility, better interpretability, and efficient run-time performance.

CREAD: Combined Resolution of Ellipses and Anaphora in Dialogues
Bo-Hsiang Tseng, Shruti Bhargava, Jiarui Lu, Joel Ruben Antony Moniz, Dhivya Piraviperumal, Lin Li, Hong Yu
NAACL 2021
[code]

Resolving references and understanding ellipses are crucial for dialogue agents to generate coherent responses. We build a joint benchmark for the two tasks by annotating the dialogue-based coreference dataset, MuDoCo, with rewritten queries. We also propose a novel joint learning framework that boosts query rewrite and outperforms SOTA for coreference resolution.

Conversational Semantic Parsing for Dialog State Tracking
Jianpeng Cheng, ... , Shruti Bhargava, ... , Jason D Williams, Hong Yu, Diarmuid Ó Séaghdha, Anders Johannsen
EMNLP 2020
[Dataset]

A fresh perspective on dialog state tracking as a semantic parsing task over hierarchical representations, with compositionality, cross-domain knowledge sharing, and coreference. We present TreeDST, a dataset of 27k conversations with tree-structured states and system acts. Our encoder-decoder model leads to a 20% improvement over SOTA.

Exposing and Correcting the Gender Bias in Image Captioning Datasets and Models
Shruti Bhargava, David Forsyth
arXiv 2019

The task of image captioning implicitly involves gender identification. The MS COCO dataset contains blatant gender bias in its captions, arising from two main sources: statistical variation in the data and flawed annotations. Biased data leads to concerning predictions by models. We propose a novel framework for gender-neutral captioning and independent gender classification using masking, which reduces contextual bias. On an anti-stereotypical dataset, our approach outperforms the SOTA gender-based approaches.

Dandelion++: Lightweight Cryptocurrency Networking with Formal Anonymity Guarantees
Giulia Fanti, Shaileshh Bojja Venkatakrishnan, Surya Bakshi, Bradley Denby, Shruti Bhargava, Andrew Miller, Pramod Viswanath
SIGMETRICS 2018
[code]

Bitcoin's networking stack is shown to have anonymity vulnerabilities owing to the mechanism for broadcasting transactions, leading to large-scale deanonymization attacks. We present Dandelion++, a first-principles defense with near-optimal information-theoretic guarantees.

Teaching

I have been a Teaching Assistant for the following courses:

  • CS544: Optimization in Vision and AI at UIUC [Spring 2019]
  • CS498: Applied Machine Learning at UIUC [Fall 2018]
  • CS374: Algorithms and Models of Computation at UIUC [Spring 2018]
  • CS101: Introduction to Programming at IIT Kanpur [Spring 2017]