Research & Publications
My current research focuses on developing novel statistical and computational models to analyze large-scale genetic and genomic data from patients with chronic lung diseases, including asthma, idiopathic pulmonary fibrosis (IPF), sarcoidosis, and pediatric cystic fibrosis.
In the asthma study, conducted in collaboration with Dr. Geoffrey Chupp, we used gene expression data from induced sputum and blood to identify three subtypes of asthma, or TEA clusters: patients at high risk of near-fatal asthma attacks, patients with severe asthma symptoms, and patients with milder asthma. In addition, our analysis of gene expression in the blood suggests that a blood test could be designed to identify a patient's asthma subtype and so guide the choice of treatment or drugs. Ultimately, this could lead to personalized treatment for asthma patients. To achieve these results, we developed a novel pathway-based clustering method and compared it with traditional pathway-based clustering methods on both simulated and real datasets, where it showed better robustness and accuracy. Currently, longitudinal induced sputum and whole blood samples are being collected from patients and prepared for RNA sequencing. To analyze these data, we are developing novel statistical and computational approaches that extract genetic information from the longitudinal RNA sequencing data and integrate it with the transcriptional profiles from the same dataset to identify time-invariant molecular endotypes of asthma.
In the study on IPF and sarcoidosis, conducted in collaboration with Dr. Naftali Kaminski, we are trying to understand the genomics and genetics of these patients. Second-generation sequencing technology was used to measure both gene expression levels and sequence mutations in the patients. My computational team is currently preprocessing and analyzing these sequencing data to better understand disease heterogeneity and pathogenesis using network analysis, data integration analysis, and longitudinal data analysis.
In the study on pediatric cystic fibrosis, patients complete weekly surveys and attend clinical visits to provide sputum and stool samples. These samples were sequenced to understand which bacteria are present, how they change over time, and whether they behave differently in children with and without cystic fibrosis. My computational team is currently developing statistical and computational approaches to analyze the longitudinal 16S rRNA sequencing data.
Extensive Research Description
Analysis of longitudinal RNA sequencing data from asthma patients;
Analysis of longitudinal gene expression data from asthma patients undergoing bronchial thermoplasty;
Analysis of longitudinal microbiome sequencing data from children with cystic fibrosis;
RNA sequencing of IPF, A1AT and SARC patients using Ion Torrent technology;
Single-cell RNA sequencing data analysis.
Genetics; Lung Diseases; Respiratory Hypersensitivity; Computational Biology; Genomics; Biostatistics; Molecular Medicine
Public Health Interests
Bioinformatics; Biomarkers; Genetics, Genomics, Epigenetics; Microbial Ecology; Modeling