Wenting Gao - Human Cell Atlas Project, Yale School of Medicine, New Haven, CT

Working for the Section of Pulmonary, Critical Care & Sleep Medicine.

Career goal: To become a biostatistician who investigates factors associated with disease and uses statistical methods to reveal underlying relationships between human biology and medicine.

Internship outline: The development of single-cell RNA (ribonucleic acid) sequencing methods has catalyzed a growing sense among scientists that the time is ripe to identify all cell types in the human body. The Human Cell Atlas Project is an international collaborative effort that aims to define all human cell types in terms of distinctive molecular profiles and to connect this information with classical cellular descriptions. This summer, I was responsible for analyzing immune cell data from large-scale genetic studies, and I used new computational tools to classify cell types and discover differentially expressed genes. Additionally, I sought to characterize millions of cells graphically via machine learning methods.

Value of experience: I was very glad to join this interdisciplinary study, which took me from biological experiments through preprocessing raw data to downstream statistical analysis. With support from Yale School of Medicine, I gained valuable experience analyzing big data, about 1.3 terabytes. This internship also gave me the opportunity to design and implement statistical approaches to bioinformatics research, which deepened my understanding of what I learned during my first year of study in biostatistics. Moreover, as an international student, I improved my oral and written communication skills considerably by collaborating with postdocs in our lab, discussing ideas with my supervisor, Professor Jen-hwa Chu, and writing weekly reports.

Best moment/experience: The most memorable moment for me was successfully designing the pipelines to preprocess raw single-cell RNA-sequencing data, since preprocessing is the most fundamental step in working with large-scale genetic data. My supervisor, Professor Jen-hwa Chu, gave me many constructive suggestions and pointed out my mistakes when my analytical results were not convincing, which accelerated the progress of my project. In addition, I received technical support from the Yale Center for Research Computing whenever I was stuck in the Linux operating environment. I am very grateful to the many intelligent and insightful scientists who provided timely feedback and helped me solve my problems.

Funding source: Yale School of Medicine