Latest Work

Development of a Prediction Model for Identifying High-Risk Sources and Seasons of C. jejuni Outbreaks

As part of a final project, I worked with two other students to create a prediction model using data from the NCBI Pathogen Detection Project, a large database containing Campylobacter jejuni isolate samples collected in the United States. Our goal was to determine whether certain seasons or isolation sources, such as chicken or dairy, are associated with outbreaks of C. jejuni. Our final model had an AUC of 0.896, an accuracy of 0.900, and a Brier score of 0.073 on the validation set.
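For readers unfamiliar with the three metrics reported above, here is a minimal sketch of how each one can be computed from true labels and predicted probabilities. The labels and probabilities are made-up toy values, not the project's data, and the function names and the 0.5 classification threshold are assumptions chosen for illustration.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    # Fraction of cases where the thresholded prediction matches the label.
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(p == y) for p, y in zip(preds, y_true)) / len(y_true)


def brier_score(y_true, y_prob):
    # Mean squared difference between predicted probability and outcome.
    # Lower is better; 0 means perfectly confident and correct.
    return sum((p - y) ** 2 for p, y in zip(y_prob, y_true)) / len(y_true)


def auc(y_true, y_prob):
    # Probability that a randomly chosen positive case is scored higher
    # than a randomly chosen negative case (ties count as one half).
    pos = [p for p, y in zip(y_prob, y_true) if y == 1]
    neg = [p for p, y in zip(y_prob, y_true) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = outbreak-associated, 0 = not.
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]

print("accuracy:", accuracy(y_true, y_prob))
print("Brier score:", brier_score(y_true, y_prob))
print("AUC:", auc(y_true, y_prob))
```

AUC measures how well the model ranks cases, accuracy measures how often its thresholded predictions are right, and the Brier score measures how well calibrated its probabilities are, so the three together give a fuller picture than any one alone.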

Differential Privacy: An Introduction

As a final project, I worked with a classmate to develop a website introducing a machine learning concept that we had not covered in class: differential privacy. Differential privacy addresses the problem of sharing data safely: it enables us to publicly share information about a dataset without revealing information about the individuals in it. Our website included a short article explaining how differential privacy works and why it's important, as well as tutorials for implementing it in R and Python.
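To give a flavor of the idea, here is a minimal sketch of one standard differential privacy technique, the Laplace mechanism, applied to a counting query. It is not taken from the website's tutorials; the function names, the toy ages, and the choice of epsilon are all assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    # Release a noisy count of the records matching `predicate`.
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    # A counting query has sensitivity 1: adding or removing one person
    # changes the true count by at most 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical data: ages of eight people.
ages = [23, 35, 41, 29, 62, 57, 33, 48]

# Smaller epsilon means more noise and stronger privacy.
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=random.Random(0))
print("noisy count of people aged 40+:", noisy)
```

The true count here is 4, but the released value is perturbed, so an observer cannot tell with confidence whether any one person is in the dataset. Averaged over many releases the noise cancels out, which is why aggregate statistics stay useful.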