Sadra Sabouri

Hey, this is Sadra! I’m a PhD student in Computer Science at USC, working at the intersection of HCI and NLP to make LLMs better friends of humans. My research focuses on helping people make better decisions using LLMs. I also build and maintain scientific software tools with a great team of open-source enthusiasts. I believe open-source software is key to making technology and science accessible, transparent, and fun. In my free time, I enjoy watching movies and exploring new places and activities. I’m always happy to meet new people and hear about their journeys—shoot me an email or DM if you want to chat!

CS PhD @ USC ✌️

The main problem I'm trying to solve is the integration of AI systems into human workflows—specifically, answering the question: "What is the core part of a task that AI cannot do, and how can AI assist humans in doing that?" Helping humans tackle the hardest parts of their jobs—with AI as a consultant—is the overarching meta-goal of my current research. To address this, I've explored several domains where large language models (LLMs) have been introduced but face full-integration challenges. These include software developers trusting code agents for programming, strategic decision-making in the board game Diplomacy, patients navigating conflicting medical advice, users with different knowledge backgrounds asking factual questions and researchers looking for scientific discussions in social media.

I'm currently in my second year and looking forward to exploring more domains to develop a taxonomy of these challenges and a framework that identifies the right interaction patterns and integration points for AI. Throughout this journey, I've had the great opportunity to work with the Adaptive Computing Experience (ACE) Lab (Souti Chattopadhyay's lab @ GCS) and [CUTE LAB NAME] (Jonathan May's lab @ ISI).

You can find some of my publications below:

[VL/HCC25] Exploring the Challenges and Opportunities of AI-assisted Codebase Generation, Philipp Eibl, Sadra Sabouri, Souti Chattopadhyay

Paper
We explored how LLMs reshape software development through "vibe-coding," where developers rely on iteratively prompting an LLM to build software without coding in the traditional sense. We show how this shift blurs the boundaries between ideation, coding, and debugging, and can potentially make software development more collaborative.

[ICSE25] Trust dynamics in AI-assisted development: Definitions, factors, and implications, Sadra Sabouri, Philipp Eibl, Xinyi Zhou, Morteza Ziyadi, Nenad Medvidovic, Lars Lindemann, Souti Chattopadhyay

Paper
We investigate how developers define, evaluate, and evolve trust in AI-generated code suggestions through a mixed-method study involving surveys and observations. We found that while comprehensibility and perceived correctness are key to trust decisions, developers often revise their choices and accept only 52% of AI suggestions. This highlights the need for better real-time support, and we offer four validated guidelines to improve developer-AI collaboration.

[ACL25] ELI-Why: Evaluating the Pedagogical Utility of Language Model Explanations, Brihi Joshi, Keyu He, Sahana Ramnath, Sadra Sabouri, Kaitlyn Zhou, Souti Chattopadhyay, Swabha Swayamdipta, Xiang Ren

Paper Code

We investigate how well language models adapt explanations to learners with varying educational backgrounds using ELI-Why, a benchmark of 13.4K "Why" questions. Through two human studies, we found that GPT-4 explanations align with intended grade levels only 50% of the time and are rated 20% less suitable for learners' needs compared to layperson-curated responses, revealing limitations in their pedagogical adaptability.

Always happy to chat, collaborate, or just hear what you're working on; feel free to reach out!

Open World Developer 🌐

Open-sourcing research in NLP has led to breakthroughs like ChatGPT, but generative AI also makes it easier to produce convincing yet flawed content in research communities. This poses a kind of Frankenstein-Trojan threat to scientific integrity. Committed to open science and reproducibility, I focus on building scientific software that ensures transparency. With a group of my friends, I co-founded OpenSciLab to develop open-source tools toward this goal.

Below is a topic-based summary of my work, including projects through OpenSciLab, dataset releases, and independent projects:

Natural Language Processing and Large Language Models

ToCount: Lightweight Token Estimator

ToCount is a lightweight Python library for estimating the token counts for input to an LLM using rule-based and ML methods. It offers a fast, flexible interface for prompt analysis, token budgeting, and optimizing interactions with token-based systems.
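To give a feel for what a rule-based estimate can look like, here is a minimal sketch. It is not ToCount's API or its actual rules: the function name `estimate_tokens`, the four-characters-per-token rule of thumb, and the averaging step are all assumptions made for illustration.

```python
import re


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough, rule-based token estimate (hypothetical heuristic, not ToCount's)."""
    if not text:
        return 0
    # Heuristic 1: assume a fixed average number of characters per token.
    by_chars = len(text) / chars_per_token
    # Heuristic 2: count words and standalone punctuation marks as pieces.
    by_pieces = len(re.findall(r"\w+|[^\w\s]", text))
    # Average the two heuristics and never report less than one token.
    return max(1, round((by_chars + by_pieces) / 2))


if __name__ == "__main__":
    prompt = "Hello, world!"
    print(estimate_tokens(prompt))
```

An estimate like this is only useful for budgeting; real tokenizers differ per model, which is why a library would typically offer several estimators.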

XNum: Universal Numeral System Converter

XNum is a Python library for converting digits across numeral systems (English, Persian, Hindi, Arabic-Indic, Bengali, etc.). It auto-detects mixed formats and cleanly converts only the numbers, making multilingual and localized data handling simple.
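The core idea of digit conversion can be sketched with the standard library alone. The snippet below is an illustration, not XNum's API: the name `convert_digits` and the `ZERO` table are hypothetical, and it relies only on the fact that each of these Unicode digit blocks is a contiguous run from zero to nine.

```python
# Code point of the digit zero in each numeral system (Unicode).
ZERO = {
    "english": 0x0030,
    "arabic_indic": 0x0660,
    "persian": 0x06F0,
    "hindi": 0x0966,  # Devanagari digits
    "bengali": 0x09E6,
}


def convert_digits(text: str, target: str) -> str:
    """Convert every known digit in `text` to the `target` numeral system.

    Non-digit characters are left untouched, so mixed-script text is safe.
    """
    target_zero = ZERO[target]
    table = {}
    for source_zero in ZERO.values():
        for d in range(10):
            table[source_zero + d] = chr(target_zero + d)
    return text.translate(table)


if __name__ == "__main__":
    print(convert_digits("Order \u06f1\u06f2\u06f3 and 456", "english"))
```

Because `str.translate` maps code points one to one, source systems do not need to be detected up front: any digit from any listed system is converted, and everything else passes through.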

Memor: Managing and Transferring Conversational Memory Across LLMs

Memor is designed to help users manage the memory of their interactions with Large Language Models (LLMs). It enables users to access and utilize the history of their conversations when prompting LLMs, which creates a more personalized and context-aware experience. Users can select specific parts of past interactions with one LLM and share them with another. By bridging the gap between isolated LLM instances, Memor makes transitions between models smoother.

[JAIAI] naab: A ready-to-use plug-and-play corpus for Farsi, Sadra Sabouri, Elnaz Rahmati, Soroush Gooran, Hossein Sameti

Paper

The need for large training data is (was at that time :D) felt more in lower-resource languages like Farsi. We propose naab, a huge, cleaned, and ready-to-use open-source textual corpus in Farsi. It contains about 130GB of data, 250 million paragraphs, and 15 billion words. The project name is derived from the Farsi word NAAB, which means pure and high grade.

[ALP@NAACL25] Parsipy: NLP toolkit for historical Persian texts in Python, Farhan Farsi, Parnian Fazel, Sepand Haghighi, Sadra Sabouri, Farzaneh Goshtasb, Nadia Hajipour, Ehsaneddin Asgari, Hossein Sameti

Paper
The study of historical languages presents unique challenges due to their complex orthographic systems, fragmentary textual evidence, and the absence of standardized digital representations of text in those languages. This work introduces an NLP toolkit designed to facilitate the analysis of historical Persian languages by offering modules for tokenization, lemmatization, part-of-speech tagging, phoneme-to-transliteration conversion, and word embedding.

[LoResMT@NAACL25] PahGen: Generating Ancient Pahlavi Text via Grammar-guided Zero-shot Translation, Farhan Farsi, Parnian Fazel, Farzaneh Goshtasb, Nadia Hajipour, Sadra Sabouri, Ehsaneddin Asgari, Hossein Sameti

Paper
Due to its limited digital presence and the scarcity of comprehensive linguistic resources, Pahlavi (Middle Persian) is at risk of extinction. This study introduces a framework to translate English text into Pahlavi. Our approach combines grammar-guided term extraction with zero-shot translation, leveraging large language models (LLMs) to generate syntactically and semantically accurate Pahlavi sentences. Finally, using our framework, we generate a novel dataset of 360 expert-validated parallel English-Pahlavi texts.

[DialDoc@ACL22] Docalog: Multi-document Dialogue System using Transformer-based Span Retrieval, Sayed Hesam Alavian, Ali Satvaty, Sadra Sabouri, Ehsaneddin Asgari, Hossein Sameti

Paper
This paper discusses our proposed approach, Docalog, for the DialDoc-22 (MultiDoc2Dial) shared task, which was part of my BSc thesis. Docalog has a three-stage pipeline consisting of (1) a document retriever model, (2) an answer span prediction model, and (3) an ultimate span picker that decides on the most likely answer span out of all predicted spans.
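The three-stage structure can be sketched as plain function composition. This toy version is not the Docalog implementation: simple word overlap stands in for the Transformer-based retriever and span predictor, and all function names are hypothetical.

```python
def overlap(a: str, b: str) -> int:
    """Number of shared lowercase words (toy stand-in for a learned score)."""
    return len(set(a.lower().split()) & set(b.lower().split()))


def retrieve(question, documents, top_k=2):
    """Stage 1: keep the top_k documents most similar to the question."""
    ranked = sorted(documents, key=lambda d: overlap(question, d), reverse=True)
    return ranked[:top_k]


def predict_spans(question, document):
    """Stage 2: propose candidate spans (here: sentences) with a score each."""
    spans = [s.strip() for s in document.split(".") if s.strip()]
    return [(span, overlap(question, span)) for span in spans]


def pick_span(candidates):
    """Stage 3: choose the most likely span among all predicted spans."""
    return max(candidates, key=lambda c: c[1])[0]


def answer(question, documents):
    candidates = []
    for doc in retrieve(question, documents):
        candidates.extend(predict_spans(question, doc))
    return pick_span(candidates)


if __name__ == "__main__":
    docs = [
        "You can renew your license online. Fees are paid by card.",
        "Parking permits are issued at the city office. Bring your ID.",
    ]
    print(answer("where are parking permits issued", docs))
```

Keeping the picker separate from the span predictor lets the last stage compare candidates that come from different retrieved documents.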

Speech Processing

Nava: OS-Native Sound Engine in Python

Nava allows users to play sound in Python without any dependencies or platform restrictions. It is a cross-platform solution that runs on any operating system, including Windows, macOS, and Linux. Its lightweight and easy-to-use design makes Nava an ideal choice for developers looking to add sound functionality to their Python programs.

Sharif-Wav2Vec2.0: Wav2Vec2.0 Speech Processing Model Tailored for Farsi

The base model was fine-tuned on 108 hours of Common Voice's Farsi audio. We changed the model's token set and language model to support nuances of Farsi that are not present in English. More technically, we trained a 5-gram language model using the KenLM toolkit and used it in the processor, which increased our accuracy on online ASR.

Machine Learning (ML)

PyCM: Multi-class confusion matrix library in Python

PyCM is a tool for post-classification model evaluation that supports most class and overall statistic parameters. PyCM mainly targets data scientists who need a broad array of metrics for predictive models and an accurate evaluation of a large variety of classifiers.
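To make "class and overall statistics" concrete, here is a small pure-Python sketch of a multi-class confusion matrix with per-class precision and recall plus overall accuracy. It deliberately does not use PyCM's API; the function names are hypothetical and only three of the many possible metrics are shown.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Return (classes, matrix) where matrix[a][p] counts actual=a, predicted=p."""
    classes = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    matrix = {a: {p: counts[(a, p)] for p in classes} for a in classes}
    return classes, matrix


def class_stats(classes, matrix):
    """Per-class precision and recall derived from the matrix."""
    stats = {}
    for c in classes:
        tp = matrix[c][c]
        fn = sum(matrix[c].values()) - tp            # rest of row c
        fp = sum(matrix[a][c] for a in classes) - tp  # rest of column c
        stats[c] = {
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "recall": tp / (tp + fn) if tp + fn else 0.0,
        }
    return stats


def overall_accuracy(classes, matrix):
    """Fraction of samples on the diagonal."""
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(matrix[c][c] for c in classes)
    return correct / total if total else 0.0


if __name__ == "__main__":
    y_true = [0, 0, 1, 1, 2, 2]
    y_pred = [0, 1, 1, 1, 2, 0]
    classes, cm = confusion_matrix(y_true, y_pred)
    print(cm)
    print(class_stats(classes, cm))
    print(overall_accuracy(classes, cm))
```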

Network

PyRGG: Python Random Graph Generator

PyRGG synthesizes random graphs, which can be useful in network simulation. It supports multiple graph file formats, such as DIMACS-Graph files. It can generate graphs of various sizes using different generation methods such as Erdős–Rényi-Gilbert, Erdős–Rényi, and Stochastic Block Model.
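As an illustration of one such generation method, the sketch below implements Gilbert's G(n, p) model, in which each possible edge is included independently with probability p, and writes the result in the DIMACS edge format. It is a standalone example, not PyRGG's code or interface; the function names are hypothetical.

```python
import random
from itertools import combinations


def gilbert_graph(n, p, seed=None):
    """Sample an undirected G(n, p) graph on vertices 1..n as an edge list."""
    rng = random.Random(seed)  # local generator, so results depend only on seed
    return [
        (u, v)
        for u, v in combinations(range(1, n + 1), 2)
        if rng.random() < p
    ]


def to_dimacs(n, edges):
    """Serialize an edge list in DIMACS edge format ('p edge' header, 'e' lines)."""
    lines = [f"p edge {n} {len(edges)}"]
    lines += [f"e {u} {v}" for u, v in edges]
    return "\n".join(lines)


if __name__ == "__main__":
    edges = gilbert_graph(6, 0.4, seed=7)
    print(to_dimacs(6, edges))
```

Passing the same `seed` reproduces the same graph, which is useful when a simulation has to be rerun.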

IPSpot: A Python Tool to Fetch the System's IP Address

IPSpot retrieves the system's IP address and location information. It supports public and private IPv4 and IPv6 detection using multiple API providers with a fallback mechanism for reliability.
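A fallback mechanism of this kind can be sketched as a loop over providers that returns the first successful answer. This is a generic illustration, not IPSpot's code: `first_successful` and the fake providers are hypothetical, and real providers would be functions that call an external API.

```python
def first_successful(providers):
    """Try each (name, fetch) pair in order and return the first usable result.

    `fetch` is any zero-argument callable; an exception or an empty result
    makes the loop fall through to the next provider.
    """
    errors = []
    for name, fetch in providers:
        try:
            result = fetch()
        except Exception as exc:  # a failing provider must not stop the loop
            errors.append(f"{name}: {exc}")
            continue
        if result:
            return name, result
        errors.append(f"{name}: empty response")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


if __name__ == "__main__":
    def broken():
        raise TimeoutError("no response")

    def working():
        return {"ip": "203.0.113.7"}  # documentation-only example address

    print(first_successful([("provider_a", broken), ("provider_b", working)]))
```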

PyMilo: A Python library for ML I/O, AmirHosein Rostami, Sepand Haghighi, Sadra Sabouri, Alireza Zolanvari

Paper

PyMilo addresses the limitations of existing Machine Learning (ML) model storage formats by providing a transparent, reliable, and safe method for exporting and deploying trained models. Current formats, such as pickle and other binary formats, have significant problems, such as reliability, safety, and transparency issues. In contrast, PyMilo serializes ML models in a transparent non-executable format, enabling straightforward and safe model exchange.
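The difference from pickle can be illustrated with a tiny example: the model is written as plain data that anyone can read, and loading it parses data rather than executing code. This sketch is not PyMilo's format or API; the JSON layout and the function names are assumptions, and only a linear model is covered.

```python
import json


def export_linear_model(coefficients, intercept):
    """Serialize a linear model's parameters as human-readable JSON."""
    payload = {
        "model_type": "linear",
        "coefficients": list(coefficients),
        "intercept": intercept,
    }
    return json.dumps(payload, indent=2)


def import_linear_model(text):
    """Rebuild a predict function from JSON; no code from the file is run."""
    payload = json.loads(text)
    if payload.get("model_type") != "linear":
        raise ValueError("unsupported model type")
    coefs = payload["coefficients"]
    intercept = payload["intercept"]

    def predict(features):
        return sum(c * x for c, x in zip(coefs, features)) + intercept

    return predict


if __name__ == "__main__":
    exported = export_linear_model([2.0, -1.0], 0.5)
    print(exported)
    predict = import_linear_model(exported)
    print(predict([3.0, 4.0]))
```

Because the file holds only numbers and a type tag, it can be inspected or diffed before use, which is not possible with an opaque binary dump.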

Art

Samila: A Generative Art Generator, Sadra Sabouri, Sepand Haghighi, Elena Masrour

Paper

Samila lets you create images by randomly permuting many thousands of points. The position of every single point is calculated by a formula that has random parameters. Because of the randomness of the generation process, you can hardly reproduce an image unless you have the right seed for it. I highly encourage you to take a look at the paper if you're interested.
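The role of the seed can be shown with a short sketch: a formula with randomly drawn parameters maps a grid of points to new positions, and the same seed always yields the same picture. The formula and the name `generate_points` are arbitrary choices for illustration and are not Samila's generator or API.

```python
import math
import random


def generate_points(seed, steps=50):
    """Map a square grid through a formula whose parameters depend on `seed`."""
    rng = random.Random(seed)
    a = rng.uniform(-1, 1)  # random parameter of the x formula
    b = rng.uniform(-1, 1)  # random parameter of the y formula
    grid = [-math.pi + 2 * math.pi * i / (steps - 1) for i in range(steps)]
    points = []
    for x in grid:
        for y in grid:
            new_x = a * x ** 2 - math.sin(y ** 2)
            new_y = b * y ** 3 - math.cos(x ** 2)
            points.append((new_x, new_y))
    return points


if __name__ == "__main__":
    pts = generate_points(seed=42)
    print(len(pts), pts[0])
```

Plotting the returned points as a scatter plot gives the image; storing only the seed is enough to regenerate it.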

Art: ASCII art library for Python

Art does the "smart" placement of typed special characters or letters to make a visual shape that is spread over multiple lines of text.

Human Computer Interaction (HCI)

Nafas: Breathing Gymnastics Application, Sadra Sabouri, Sepand Haghighi

Nafas is a collection of breathing gymnastics designed to reduce the exhaustion of long working hours at the computer. With multiple breathing patterns, Nafas helps you find your way to a detoxified, energetic workday and also improves your concentration by increasing the oxygen level.

mytimer: A Timer for Command Line Enthusiasts

MyTimer aims to provide a simple yet comprehensive timer for terminal users. This project allows users to set timers directly from their command line interface, making it convenient for those who spend a significant amount of time working in the terminal!

Chemical Data Science

Experimental dataset of electrochemical efficiency of a Direct Borohydride Fuel Cell (DBFC) with Pd/C, Pt/C and Pd decorated Ni–Co/rGO anode catalysts, Sarmin Hamidi, Sadra Sabouri, Sepand Haghighi, Kasra Askari

Paper

The dataset includes Direct Borohydride Fuel Cell (DBFC) impedance and polarization tests with Pd/C, Pt/C and Pd decorated Ni–Co/rGO anode catalysts. Voltage, power density and resistance of the DBFC change as a function of the weight percent of sodium borohydride (%), the applied voltage and the amount of anode catalyst loading, and they are evaluated through polarization and impedance curves using an appropriate equivalent circuit of the fuel cell.

OPEM: Open Source PEM Fuel Cell Simulation Tool

The Open-Source PEMFC Simulation Tool (OPEM) is a modeling tool for evaluating the performance of proton exchange membrane fuel cells. This package is a combination of models (static/dynamic) that predict the optimum operating parameters of PEMFC. OPEM contains generic models that accept as input not only values of the operating variables, such as anode and cathode feed gas, pressure and compositions, cell temperature and current density, but also cell parameters, including the active area and membrane thickness.

Biomedical Data Science

Drux: Drug Release Analysis Framework

Drux is a Python framework for simulating and visualizing drug release profiles with mathematical models. It provides a simple, extensible, and reproducible platform for quantitative analysis in pharmaceutical research.

OPR: Optimized Primer Design Tool

OPR is an open-source Python package designed to simplify and streamline primer design and analysis for biologists and bioinformaticians. It enables users to design, validate, and optimize primers with ease, catering to a wide range of applications such as PCR, qPCR, and sequencing.
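Two quantities commonly checked when validating a primer are its GC content and an approximate melting temperature. The sketch below computes both, using the Wallace rule (2 °C per A/T and 4 °C per G/C), which is a rough estimate intended for short oligonucleotides. The text above does not say which formulas OPR uses, so treat the choice of rule and the function names as assumptions rather than OPR's API.

```python
VALID_BASES = set("ACGT")


def _normalize(sequence):
    """Uppercase the sequence and reject anything that is not A, C, G or T."""
    seq = sequence.upper()
    if not seq or set(seq) - VALID_BASES:
        raise ValueError("sequence must be a non-empty string of A, C, G, T")
    return seq


def gc_content(sequence):
    """Fraction of bases that are G or C."""
    seq = _normalize(sequence)
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(sequence):
    """Approximate melting temperature in degrees Celsius (Wallace rule)."""
    seq = _normalize(sequence)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    primer = "ATGCATGCATGC"
    print(gc_content(primer), wallace_tm(primer))
```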

Environmental Data Science

[AGU-WRR24] Representative sample size for estimating saturated hydraulic conductivity via machine learning: A proof‐of‐concept study, Amin Ahmadisharaf, Reza Nematirad, Sadra Sabouri, Yakov Pachepsky, Behzad Ghanbarian

Paper
Machine learning is widely used across disciplines, but hydrology has often overlooked the impact of data heterogeneity and sample size. In this study, we used ~18k soil samples from the USKSAT database to analyze how training size affects ML accuracy in estimating saturated hydraulic conductivity (Ks). Using XGBoost and repeated random subsets, we found that even with large datasets, learning and validation curves didn't plateau.

News

Aug 2025: Our paper titled “Exploring the Challenges and Opportunities of AI-assisted Codebase Generation” got accepted to the IEEE Symposium on Visual Languages and Human-Centric Computing 2025. In this paper we investigated how software developers interact with current “vibe-coding” tools such as GitHub Copilot and Cursor AI. I will present this work at VL/HCC in Raleigh, NC, Oct 7th-10th.
Mar 2025: The Python Software Foundation (PSF) awarded a grant to our Nava library for adding new OS-based sound engines and integrating it into notebooks.
Feb 2025: NLnet awarded a one-year grant to our PyCM library through the NGI0 Commons Fund for adding new features such as distance similarity matrix, data distribution analysis, and hardware benchmarking of the library.
Jan 2025: My paper Trust dynamics in AI-assisted development: Definitions, factors, and implications got accepted to the International Conference on Software Engineering (ICSE) 2025. I will present my work remotely in early May.
Sep 2024: I was awarded a Trelis AI Grant for developing a RESTful API for PyCM, enhancing accessibility to machine learning statistical post-processing tools.
May 2024: The Python Software Foundation (PSF) awarded a grant to our ASCII Art library for developing the library and adding new features like multi-line arts and support for custom fonts.