Suproteem Sarkar

I am a Ph.D. student in Economics at Harvard, where I am supported by the National Science Foundation, the Center for Applied Artificial Intelligence, Two Sigma, and Opportunity Insights.

I completed my S.M. in Applied Mathematics and A.B. in Computer Science—summa cum laude, with certificates in Mind/Brain/Behavior and Global Health & Health Policy—at Harvard in 2019. I also spent time at Microsoft and Google.


In Progress

Partisanship and Economic Beliefs
with Johnny Tang

Presented at Allied Social Science Associations, Annual Meeting (2022)


An Economic Approach to Machine Learning in Health Policy
with N. Meltem Daysal, Sendhil Mullainathan, Ziad Obermeyer, and Mircea Trandafir

Presented at National Bureau of Economic Research, Conference on Machine Learning in Healthcare (2021)


Published and Forthcoming

A Semantic Approach to Financial Fundamentals [PDF]
with Jiafeng Chen

Presented at FinNLP, Workshop on Financial Technology and Natural Language Processing (2020)

Abstract: The structure and evolution of firms’ operations are essential components of modern financial analyses. Traditional text-based approaches have often used standard statistical learning methods to analyze news and other text relating to firm characteristics, which may shroud key semantic information about firm activity. In this paper, we present the Semantically-Informed Financial Index, an approach to modeling firm characteristics and dynamics using embeddings from transformer models. As opposed to previous work that uses similar techniques on news sentiment, our methods directly study the business operations that firms report in filings, which are legally required to be accurate. We develop text-based firm classifications that are more informative about fundamentals per level of granularity than established metrics, and use them to study the interactions between firms and industries. We also characterize a basic model of business operation evolution. Our work aims to contribute to the broader study of how text can provide insight into economic behavior.


Constitutional Dimensions of Predictive Algorithms in Criminal Justice [PDF]
with Michael Brenner, Jeannie Suk Gersen, Michael Haley, Matthew Lin, Amil Merchant, Richard Jagdishwar Millett, and Drew Wegner

Published in Harvard Civil Rights-Civil Liberties Law Review (2020)

Abstract: This Article analyzes constitutional issues presented by the use of proprietary risk assessment technology and how courts can best address them. Focusing on due process and equal protection, this Article explores potential avenues for constitutional challenges to risk assessment technology at federal and state levels, and outlines how these instruments might be retooled to increase accuracy and accountability while satisfying constitutional standards.


Robust Classification of Financial Risk [PDF]
with Kojin Oshiba, Daniel Giebisch, and Yaron Singer

Presented at Neural Information Processing Systems, AI in Financial Services Workshop (2018)

Abstract: Algorithms are increasingly common components of high-impact decision-making, and a growing body of literature on adversarial examples in laboratory settings indicates that standard machine learning models are not robust. This suggests that real-world systems are also susceptible to manipulation or misclassification, which especially poses a challenge to machine learning models used in financial services. We use the loan grade classification problem to explore how machine learning models are sensitive to small changes in user-reported data, using adversarial attacks documented in the literature and an original, domain-specific attack. Our work shows that a robust optimization algorithm can build models for financial services that are resistant to misclassification on perturbations. To the best of our knowledge, this is the first study of adversarial attacks and defenses for deep learning in financial services.



The Harvard USPTO Patent Dataset: A Large-Scale, Well-Structured, and Multi-Purpose Corpus of Patent Applications [PDF]
with Mirac Suzgun, Luke Melas-Kyriazi, Scott Duke Kominers, and Stuart M. Shieber



Machine Learning for Health 2020: Advancing Healthcare for All [PDF]
with Subhrajit Roy, Emily Alsentzer, Matthew B. A. McDermott, Fabian Falck, Ioana Bica, Griffin Adams, Stephen Pfohl, Brett Beaulieu-Jones, Tristan Naumann, and Stephanie L. Hyland

Published in Proceedings of Machine Learning Research, Vol. 136 (2020)




Political Economics [Econ 1425]

Teaching Fellow, Spring 2023 & Spring 2022
Teaching Rating: 5.0/5.0, Certificate of Distinction in Teaching


Artificial Intelligence Meets Human Intelligence [Wintersession]

Course Head, Winter 2022


Artificial Intelligence [CS 182]

Teaching Fellow, Fall 2017