Financial Economics · Labor · Computation

Research Agenda

I study how policy, labor market disruptions, and institutional frictions shape entrepreneurial activity and firm behavior — using IRS administrative tax data covering the full U.S. firm population. Alongside this empirical agenda, I develop independent research at the frontier of quantum computing, ML, and economic methodology.

Pre-doctoral Research Associate · Tepper School of Business · IRS Joint Statistical Research Program · Incoming MSEC, Duke 2026
9 Total papers
1 Published · JFE
4 Sole-authored

Data: All CMU projects are conducted through the IRS Joint Statistical Research Program — a restricted federal research program granting access to the near-universe of U.S. tax returns (~1.5B observations).

RA Contributions

5 papers · CMU Tepper & Harvard Business School
Entrepreneurship and the Gig Economy: Evidence from U.S. Tax Returns

Journal of Financial Economics, 2025

Published · JFE · CMU · IRS JSRP

"Collin Zoeller provided excellent research assistance." — Acknowledgements, NBER Working Paper 33347

Platform intermediation of goods and services has considerably transformed the U.S. economy. We use administrative data on U.S. tax returns to study the role of the gig economy on entrepreneurship. We find that gig workers are more likely to become entrepreneurs, particularly those who are lower income, younger, and benefit from flexibility. We track all newly created firms and show that gig workers start firms in similar industries as their gig experience, which are less likely to survive and demonstrate higher performance. Overall, our findings suggest on-the-job learning promotes entrepreneurial entry and shifts the types of firms started by entrepreneurs.
  • Estimated difference-in-differences (DiD) models using IRS microdata to identify entrepreneurial transitions among platform workers
  • Cleaned and structured large-scale datasets; maintained empirical codebase for journal submission
We study the effects of deploying government capital to firms during crises. Using exogenous variation in the timing of disbursements in the Paycheck Protection Program (PPP), we find that firms receiving PPP loans later become more financially distressed and face reductions in credit supply. These effects are amplified for firms with heightened financial constraints. We also show that firms receiving loans later have lower economic activity using in-store activity and shutdowns.
  • Estimated DiD effects of PPP timing on firm financial health using credit bureau data
  • Data cleaning, workflow construction, and publication preparation
Evolution of the Relationship between the Gig Economy and Entrepreneurship: The Heterogeneous Effects of Labor Market Disruptions
In Progress · CMU · IRS JSRP
  • Estimated DiD using IRS taxpayer microdata to study entrepreneurial responses to labor disruptions
  • Built approximate-nearest-neighbor (ANN) text clustering pipeline over 1B+ deduction fields to engineer novel features
  • Constructed Stata parallelization framework (precursor to StataHelper)
  • Data engineering, replication, and publication support
Take-up of Flexible Labor
In Progress · CMU · IRS JSRP
  • Studies employment of flexible labor through IRS tax deduction claims
  • Trained RNN text classifier on 1B+ taxpayer-inputted deduction fields
The Impact of the Kodak Crash on Entrepreneurship in Rochester, NY
Working Paper · 2023 · Harvard Business School
Uses a custom OCR pipeline applied to 130 years of R.L. Polk city directories to study how the collapse of Eastman Kodak reshaped entrepreneurship in Rochester, NY — tracing how inventors and employees responded to job loss risk through new firm creation, geographic exit, or labor market substitution. Asks: what is the effect of social proximity on entrepreneurship, and how do innovators respond to the threat of losing their jobs?

Independent Research

4 papers · Sole-authored
GAMA-IV: Quantum Annealing for Globally Optimal Instrument Selection
Working Paper · Duke MSEC Thesis · Quantum · D-Wave
Introduces GAMA-IV, a quantum-classical hybrid methodology for instrument selection in instrumental variables (IV) estimation. We apply the Graver Augmentation via Quantum Annealing (GAMA) framework of Alghassi et al. (2019) to the disaggregated Bartik instrument setting formalized by Goldsmith-Pinkham et al. (2020). The core contribution is methodological: we show that the GPS information criterion, under a separable linear approximation, can be cast as an integer program subject to cardinality, geographic coverage, and sector coverage constraints. Graver basis elements — computed via D-Wave quantum annealing — provide polynomial-step augmentation directions guaranteeing convergence to the global optimum of this criterion. We replicate the GPS (2020) application to Chinese import competition and U.S. labor markets, benchmarking GAMA-IV against standard industry selection approaches. Convergence to the same instrument set demonstrates methodological efficiency; divergence reveals locally optimal solutions missed by greedy search.
JEL C02 C26 C61 C63 · quantum annealing · instrumental variables · Bartik instruments · integer programming
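The integer-programming formulation described above can be sketched generically as follows. The notation is an illustrative assumption, not the paper's own: $x_k \in \{0,1\}$ indicates whether candidate instrument $k$ is selected, $c_k$ is its contribution to the linearized criterion, $m$ caps the number of selected instruments, and $a_{gk}$ and $b_{sk}$ are hypothetical indicators that instrument $k$ covers region $g$ or sector $s$.

```latex
\begin{aligned}
\min_{x \in \{0,1\}^{K}} \quad & \sum_{k=1}^{K} c_k x_k \\
\text{s.t.} \quad & \sum_{k=1}^{K} x_k \le m && \text{(cardinality)} \\
& \sum_{k=1}^{K} a_{gk} x_k \ge 1 \quad \forall g && \text{(geographic coverage)} \\
& \sum_{k=1}^{K} b_{sk} x_k \ge 1 \quad \forall s && \text{(sector coverage)}
\end{aligned}
```

The objective is linear in $x$ only because of the separable approximation; the exact thresholds and constraint forms used in the paper may differ.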
Efficient Large-Scale Text Classification with ANN
White Paper + Code · ML · NLP
Develops a scalable pipeline for clustering large text corpora using ANNOY (Approximate Nearest Neighbors) and small language model embeddings. Demonstrates improved computational efficiency for feature engineering over standard clustering — motivated by the challenge of extracting structure from 1B+ open-text IRS deduction fields.
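The idea of grouping similar text without comparing every pair can be illustrated with a minimal sketch. It is not the white paper's pipeline and does not use the Annoy library: random-hyperplane hashing stands in for Annoy's random-projection trees, and a toy hashed-trigram vector stands in for a small language model embedding. All function names and the sample strings are hypothetical.

```python
import random
import zlib
from collections import defaultdict


def embed(text, dim=64):
    """Toy embedding (assumption): hashed character-trigram counts."""
    vec = [0.0] * dim
    padded = f" {text.lower().strip()} "
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3]
        # crc32 is stable across runs, unlike Python's built-in str hash
        vec[zlib.crc32(trigram.encode("utf-8")) % dim] += 1.0
    return vec


def make_planes(n_planes, dim, seed=0):
    """Draw random hyperplanes through the origin (fixed seed for repeatability)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_planes)]


def signature(vec, planes):
    """Record which side of each hyperplane the vector falls on."""
    return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in planes)


def cluster(texts, n_planes=8, dim=64, seed=0):
    """Group texts whose embeddings share the same hyperplane signature."""
    planes = make_planes(n_planes, dim, seed)
    buckets = defaultdict(list)
    for text in texts:
        buckets[signature(embed(text, dim), planes)].append(text)
    return list(buckets.values())


if __name__ == "__main__":
    # Hypothetical deduction-field strings, for illustration only
    sample = ["Uber fees", "uber fees ", "home office supplies"]
    for group in cluster(sample):
        print(group)
```

Each text is hashed once, so the cost grows linearly with corpus size rather than with the number of pairs. Fewer hyperplanes give coarser, larger groups, and more hyperplanes give finer ones.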
The Big Book of Estimators
Working Paper · Reference
A conspectus of estimators, assumptions, and use cases intended for non-specialized researchers. Designed as a practical non-textbook reference for model selection — filling the gap between introductory statistics and specialized econometrics literature for empirical social scientists.
Artificial Intelligence in Economic Research: Applications, Promises, Pitfalls
In Progress · AI · Methods
A meta-analysis examining how AI applications can both harm and help empirical researchers — surveying current uses across social science disciplines and developing a framework for evaluating validity, replicability, and interpretability of AI-assisted research methods. Intended for non-specialized researchers in the social sciences.

Presentations

2 conferences
2023

"City Directory Automated Indexing"

Family History Technology Workshop · BYU, Provo, UT · February 2023

Custom OCR/NLP pipeline extracting structured data from city directories for economic research.

Slides
2022

"Growing Together: Growing the Tree"

BYU President's Leadership Council · BYU, Provo, UT · September 2022

Invited presentation to top Utah investors on AI-powered genealogical indexing — CNN-based handwriting recognition for FamilySearch and Ancestry.

More work

Code, tools, and writing