Sample Personal Statement Data Science
Stanford Ph.D. Biomedical Data Science
My primary career objective is to bridge the distance between technical healthcare information
within clinical information systems and the broader public’s understanding, using novel analytics
methods as my tools. I fervently believe that developing user-centered design methods and
computationally advancing health data delivery for patients and clinicians will allow for
streamlined insights, greatly enhancing clinical decision-making, patient outcomes, and health
equity.
Stanford Medicine’s Department of Biomedical Data Science (DBDS) has demonstrated its
commitment to leveraging innovative data-centered techniques to advance the biomedical and
healthcare domains. This renders DBDS the ideal setting for me to pursue an Academic Research
MS in Biomedical Data Science. DBDS’s renowned faculty, emphasis on scientific
communication, and comprehensive curriculum will equip me with the robust foundation
required to revolutionize the dissemination of health data. Following this degree, I hope to
pursue a PhD in Biomedical/Clinical Informatics, with the intention of entering industry to
specialize in the design, development, and implementation of novel informatics tools to
advance patient care.
The pandemic revealed that scientific communication remains a significant barrier to
healthcare equity. I recognize that prioritizing verbal and written communication
alongside technical skills is critical to addressing this challenge. To build a strong
qualitative and quantitative foundation, I chose Swarthmore College to pursue my
undergraduate studies in Computer Science and Economics, with a minor in Biology.
There, I developed a strong computational foundation, encompassing a deep
comprehension of system operations, effective problem-solving, collaborative
teamwork, and a self-starting mindset, enabling me to acquire the tools necessary to
tackle complex problems. During my first two years at Swarthmore, I engaged in
interdisciplinary projects promoting self-awareness, health, and well-being. These
projects included using Python to employ six machine learning classifiers to predict
personality types of social media users and developing a user interface with HTML,
CSS, Python, JavaScript, SQL, and Flask for exploration of Swarthmore’s gym
equipment and its uses. These projects sparked my interest in academic research and
cultivated my inquisitive nature, inspiring me to seek a formal research mentorship for a
project that closely aligns with my passion for biomedical data science and clinical
informatics.
Following my freshman year, I was awarded an NSF/DOD grant to conduct data
analytics research at the University of Southern California Information Sciences Institute
under Dr. John Heidemann. My primary task was to develop a public web application
displaying shifts to COVID-19 Work-from-Home (https://covid.ant.isi.edu). To complete
this project, I gained proficiency in five computer languages (Python,
JavaScript, HTML, CSS, and PHP), analyzed raw data in SQL from multiple databases
to identify the optimal strategy for delivering information to users, and ascertained the
relevant information to present. In particular, I chose to generate multi-tab pop-ups
containing interactive line graphs for temporal changes and dynamic tables for Internet
Service Provider information. I later extended and adapted my work for their ANT
Outage World Map (https://outage.ant.isi.edu/). Equally important, I gained firsthand
experience operating within a research team, asking questions, incorporating feedback,
and communicating progress on my project. Furthermore, I learned how to properly
draft a publication as first author, undergo the review process, and present to and
converse with professionals at the 2021 IEEE International Conference on Big Data
(BigData). 1 To strengthen my ability to communicate our process to those beyond the
field, I created and presented a poster at the University of Minnesota’s REU Poster
Symposium. 2 These experiences were my first exposure to the challenges of
distilling complex information into easily understandable insights for those
unfamiliar with the subject.
In an effort to apply data analytics and advanced computational methods to the
healthcare domain, I began working at Harvard Medical School’s Department of
Biomedical Informatics under Dr. Nils Gehlenborg. I was selected to work on a research
team committed to leveraging data science to generate detailed genomics visualizations
to mitigate the challenges biomedical researchers face. Specifically, I applied an edge-
bundling algorithm using TypeScript to the lab’s grammar-based genomics visualization
toolkit known as Gosling (https://gosling.js.org). This edge-bundling algorithm reduced
the visual clutter of the genomics visualizations, elucidating pertinent information to
biomedical researchers and enabling the formation of hypotheses.
To further refine my research skills and expand upon my knowledge of statistical
methods, I am conducting bioinformatics research at Swarthmore with Dr. Rebecca
Clements, applying computational methods in R to RNA sequencing data to investigate
the immunological role of red blood cell progenitors during pregnancy. Additionally, I
have remained involved in the research conducted by Heidemann’s lab due to my close
affiliation with USC ISI. I was awarded an NSF grant to analyze statistical differences
between two outage collection systems and draft a second publication as lead author.
This paper is in its final stages of completion, and we intend to submit it to the
appropriate conferences within the next three months.
Witnessing the impact of simplifying complex health data greatly reinforced my
determination to pursue a research-focused graduate degree. At DBDS, I look forward
to engaging with an abundance of patient records as offered through the Stanford
Hospital and Lucile Packard Children’s Hospital research clinical database. This access
would enable my peers and me to analyze how they convey, synthesize, and integrate
information from patients, providers, and researchers.
My ability to bridge the gap between complex technical language, articulate writing, and
effective verbal communication, coupled with a robust understanding of research
methodology, has equipped me well for graduate-level academics and research. My
ultimate goal is to increase health equity and elevate patient outcomes by improving the
access to and comprehensibility of scientific information through novel computational
methods. The dedication of Stanford Medicine’s DBDS to advancing biomedicine and
healthcare through the representation and analysis of biomedical information makes it
the optimal setting for completing an Academic Research MS in Biomedical Data
Science.