Sample Personal Statement Computational Biology
Yale PhD Computational Biology and Biomedical Informatics
“Stay as far away from healthcare as you can. It is an absolute mess!” cautioned my
pharmacist mother, continually frustrated with her work. When a phrase is repeated to you as a
young child, you tend to hold on to it tightly. Her words resurfaced in my family’s
mind six years ago when my uncle moved in after a severe stroke left him paralyzed. This monumental
shift in our close-knit family of three revealed one of the main deficiencies of the U.S. healthcare
system: an inability to effectively communicate complex health information and risk assessments
to average Americans. As a result, I intend to do exactly what children sometimes do: the
opposite of their parents’ advice. My primary career objective is to bridge the distance between
technical healthcare information within clinical information systems and the broader public’s
understanding, using computation, data science, and information visualization as my tools. I
fervently believe that standardizing electronic health record data, employing user-centered
design methods, and advancing health data delivery for patients and clinicians will allow for
streamlined insights, greatly enhancing clinical decision-making, patient outcomes, and health
equity.
Yale’s interdepartmental Computational Biology and Biomedical Informatics (CBB) program
has demonstrated its commitment to leveraging biological concepts and novel computational
methods to advance the biomedical domain. This makes CBB the ideal setting for me to pursue
a doctorate with an additional focus in Biomedical Data Science. CBB’s leading faculty,
innovative approach, highly collaborative environment, and extensive access to health records
through the five hospitals within the Yale New Haven Health System will equip me with the
robust foundation and resources required for transforming the dissemination of health data.
The pandemic revealed that scientific communication remains a significant barrier to healthcare
equity. I recognize that prioritizing verbal and written communication alongside technical skills
is critical to addressing this challenge. To build a strong qualitative and quantitative foundation, I
chose Swarthmore College to pursue my undergraduate studies in Computer Science and
Economics, with a minor in Biology. There, I developed a strong computational foundation,
encompassing a deep comprehension of system operations, effective problem-solving,
collaborative teamwork, and a self-starting mindset, enabling me to acquire the tools necessary
to tackle complex problems. During my first two years at Swarthmore, I engaged in
interdisciplinary projects promoting self-awareness, health, and well-being. These projects
included applying six machine learning classifiers in Python to predict the personality types of
social media users and developing a user interface with HTML, CSS, Python, JavaScript, SQL,
and Flask for exploration of Swarthmore’s gym equipment and its uses. These projects sparked
my interest in academic research and cultivated my inquisitive nature, inspiring me to seek a
formal research mentorship for a project that closely aligns with my passion for biomedical and
clinical informatics.
Following my freshman year, I was awarded an NSF/DOD grant to conduct data analytics
research at the University of Southern California Information Sciences Institute under Dr. John
Heidemann. My primary task was to develop a public web application displaying shifts to
working from home during COVID-19 (https://covid.ant.isi.edu). To complete this project, I gained
proficiency in five languages (Python, JavaScript, HTML, CSS, and PHP),
analyzed raw data in SQL from multiple databases to identify the optimal strategy for delivering
information to users, and ascertained the relevant information to present. In particular, I chose to
generate multi-tab pop-ups containing interactive line graphs for temporal changes and dynamic
tables for Internet Service Provider information. I later extended and adapted my work for their
ANT Outage World Map (https://outage.ant.isi.edu/). Equally important, I gained firsthand
experience operating within a research team, asking questions, incorporating feedback, and
communicating progress on my project. Furthermore, I learned how to properly draft a
publication as first author, undergo the review process, and present to and converse with
professionals at the 2021 IEEE International Conference on Big Data (BigData). To strengthen
my ability to communicate our process to those beyond the field, I created and presented a poster
at the University of Minnesota’s REU Poster Symposium. These experiences were my first
exposure to the complexities of effectively distilling complex information into easily
understandable insights for those unfamiliar with the subject.
In an effort to apply data analytics and advanced computational methods to the biomedical
domain, I began working at Harvard Medical School’s Department of Biomedical Informatics
under Dr. Nils Gehlenborg. I was selected to work on a research team committed to leveraging
data science to generate detailed genomics visualizations to mitigate the challenges biomedical
researchers face. Specifically, I applied an edge-bundling algorithm using TypeScript to the lab’s
grammar-based genomics visualization toolkit known as Gosling (https://gosling.js.org). This
edge-bundling algorithm reduced the visual clutter of the genomics visualizations, elucidating
pertinent information to biomedical researchers and enabling the formation of hypotheses.
Witnessing the impact of simplifying complex health data greatly reinforced my determination to
pursue a PhD.
At Yale, I look forward to engaging with an abundance of patient records through Epic. This
access would enable our project team to use big data analytics to examine how electronic health
record systems convey, synthesize, and integrate information from patients, providers, and researchers.
To further refine my research skills and expand upon my knowledge of statistical methods, I am
conducting bioinformatics research at Swarthmore with Dr. Rebecca Clements, applying
computational methods in R to RNA sequencing data to investigate the immunological role of
red blood cell progenitors during pregnancy. Additionally, I have remained involved in the
research conducted by Dr. Heidemann’s lab through my close affiliation with USC ISI. I was awarded
an NSF grant to analyze statistical differences between two outage collection systems and draft a
second publication as lead author. This paper is in its final stages of completion, and we intend to
submit it to the appropriate conferences within the next three months.
I am enthusiastic about working with Yale CBB faculty Drs. Lucila Ohno-Machado, Ted
Melnick, and Andrew Taylor. Dr. Ohno-Machado’s pioneering biomedical informatics
achievements were the key motivators that drew me to the CBB program initially. Following my
conversation with Dr. Ohno-Machado, I am extremely passionate about the Department of
Biomedical Data Science’s work on data sharing and accessibility within healthcare. Likewise,
through discussions with Dr. Melnick and lab members during their weekly lab meetings, I am
confident that their efforts in developing innovative metrics to enhance clinical decision support
tools align with my passion for designing effective health data delivery systems to advance
health equity. Finally, Dr. Andrew Taylor’s research in creating visualization dashboards for risk
assessments resonates with my interests in developing novel data analytics algorithms and tools
to improve patient care. I am eager to collaborate with and contribute to the groundbreaking
research of these CBB faculty.
My ability to translate complex technical language into articulate writing and
effective verbal communication, coupled with a robust understanding of research methodology,
has equipped me well for graduate-level research. My ultimate goal is to increase health equity
and elevate patient outcomes by improving access to, and the comprehensibility of, scientific
information through data science and other advanced biomedical informatics methods. CBB’s
dedication to harnessing computational and statistical methodologies to advance health systems
makes it the optimal setting for a doctorate in Computational Biology and Biomedical
Informatics.