Shuangge Steven Ma, PhD
Department Chair and Professor of Biostatistics
Additional Titles
Affiliated Faculty, Yale Institute for Global Health
Director, Biostatistics and Bioinformatics Shared Resource
Contact Info
Education
University of Wisconsin (2004)
University of California at Los Angeles (2000)
About
Titles
Department Chair and Professor of Biostatistics
Affiliated Faculty, Yale Institute for Global Health; Director, Biostatistics and Bioinformatics Shared Resource
Biography
Dr. Ma received his Ph.D. in statistics from the University of Wisconsin in 2004. Prior to arriving at Yale, he was a Senior Fellow in the Collaborative Health Studies Coordinating Center (CHSCC) and the Department of Biostatistics at the University of Washington. He has been involved in developing novel statistical and bioinformatics methodologies for the analysis of cancer (non-Hodgkin lymphoma (NHL), breast cancer, melanoma, lung cancer), mental disorders, and cardiovascular diseases. He has also been involved in health economics research, with a special interest in health insurance in developing countries.
Appointments
Biostatistics: Chair (Dual)
Biostatistics: Professor (Primary)
Other Departments & Organizations
- Biostatistics
- Cancer Prevention and Control
- Center for Infection and Immunity
- Computational Biology and Biomedical Informatics
- Ma Lab
- SPORE in Lung Cancer
- SPORE in Skin Cancer
- Yale Cancer Center
- Yale Combined Program in the Biological and Biomedical Sciences (BBS)
- Yale Institute for Global Health
- Yale School of Public Health
- Yale Ventures
- Yale-BI Biomedical Data Science Fellowship
- YSPH Global Health Concentration
Education & Training
- Postdoctoral Associate, University of Washington (2006)
- PhD, University of Wisconsin (2004)
- MS, University of California at Los Angeles (2000)
Research
Overview
Develop novel statistical methodologies for complex data;
Study epidemiology and pathogenesis of multiple cancers, including breast cancer, NHL, melanoma and lung cancer;
Conduct survey studies in mainland China and Taiwan, investigating health insurance utilization and impact;
Provide statistical support to multiple biomedical studies.
ORCID
0000-0001-9001-4999
Research at a Glance
Yale Co-Authors
Tassos C. Kyriakides, PhD
Yuan Huang, PhD
Bao-Zhu Yang, PhD
Caroline Helen Johnson, PhD
Chiang-Shan Ray Li, MD, PhD
Michael Wininger, PhD
Publications
2025
Network-based modeling of emotional expressions for multiple cancers via a linguistic analysis of an online health community
Fan X, Liu M, Ma S. Network-based modeling of emotional expressions for multiple cancers via a linguistic analysis of an online health community. The Annals Of Applied Statistics 2025, 19: 2218-2236. DOI: 10.1214/25-aoas2047. Peer-Reviewed Original Research.
Concepts: Online health communities; Network-based model; Model of emotional expressions; Low-rank matrix; American Cancer Society Cancer Survivors Network; Health community; Fear of judgment; Semantic network; Network analysis techniques; Computational properties; Survivors Network; Simulation results; Network; Adverse emotions; Linguistic analysis; Cluster structure; Single disease; Cancer patients; Penalization approach; Cancer types; Emotional expression; Cancer; Analysis techniques; Computer; Data analysis
GE-IA-NAM: gene–environment interaction analysis via imaging-assisted neural additive model
Li J, Xu Y, Ma S, Fang K. GE-IA-NAM: gene–environment interaction analysis via imaging-assisted neural additive model. Bioinformatics 2025, 41: btaf481. PMID: 40880282, PMCID: PMC12452269, DOI: 10.1093/bioinformatics/btaf481. Peer-Reviewed Original Research.
Concepts: Gene-environment; Neural additive models; Gene-environment model; Gene-environment analysis; Gene-environment interaction analysis; Environmental factors; Cancer Genome Atlas; Pathological images; Skin cancer dataset; Genome Atlas; Cancer datasets; Network architecture; Competitive performance; Genetic factors; Python code; Cancer outcomes; Interaction analysis; Data patterns; Cancer research; Additive model; Interaction method; Environmental data; Joint analysis; Cancer models; Regression-based
Subgroup Testing in the Change‐Plane Cox Model
Zhang X, Ren P, Shi X, Ma S, Liu X. Subgroup Testing in the Change‐Plane Cox Model. Statistics In Medicine 2025, 44: e70179. PMID: 40662752, DOI: 10.1002/sim.70179. Peer-Reviewed Original Research.
Concepts: Finite sample performance; Analysis of survival data; Likelihood ratio test; Asymptotic distribution; Sample performance; Lung cancer data; Score test; Simulation study; Ratio test; Survival data; Cancer data; Cox model; Immune checkpoint blockade therapy; Checkpoint blockade therapy; Solid tumor patients; Tumor mutational burden; Subgroup tests; Treatment effects; Covariates; Blockade therapy; Mutational burden; Subgroups
Robust Transfer Learning for High‐Dimensional GLM Using γ‐Divergence With Applications to Cancer Genomics
Xu F, Ma S, Zhang Q, Xu Y. Robust Transfer Learning for High‐Dimensional GLM Using γ‐Divergence With Applications to Cancer Genomics. Statistics In Medicine 2025, 44: e70170. PMID: 40662636, PMCID: PMC12313224, DOI: 10.1002/sim.70170. Peer-Reviewed Original Research.
Concepts: Transfer learning; Real world biomedical data; Risk of negative transfer; Proximal gradient descent; Transfer learning method; Transfer learning approach; High-dimensional data; High-dimensional settings; Gradient descent; Competitive performance; Learning methods; Estimation error bounds; Biomedical data; Efficient algorithm; Learning approach; Detection scheme; Negative transfer; Analysis of complex diseases; Debiasing step; Method's effectiveness; Cancer genomic data; Data contamination; Error bounds; High-dimensional profiling data; Outliers
Joint modeling of mixed outcomes using a rank-based sparse neural network
Xue J, Xu Y, Li J, Ma S, Fang K. Joint modeling of mixed outcomes using a rank-based sparse neural network. Journal Of Biomedical Informatics 2025, 169: 104870. PMID: 40623577, PMCID: PMC12306493, DOI: 10.1016/j.jbi.2025.104870. Peer-Reviewed Original Research.
Concepts: Sparse neural networks; Neural network; Competitive performance; Imbalance issue; Loss function; Sparse layer; Leverage information; Prediction accuracy; Traditional methods; Network; Parametric framework; Penalization method; Faces challenges; Joint model; Prediction model; Information; Skin cutaneous melanoma; High-throughput profiling; High-dimensional covariates; Dimensionality; Genomic research; Features; Method; Simulation study; Biomedical studies
Subgroup Analysis of Differential Networks with Latent Variables
Li L, Ma S, Zhang Q. Subgroup Analysis of Differential Networks with Latent Variables. Statistics And Computing 2025, 35: 140. DOI: 10.1007/s11222-025-10681-z. Peer-Reviewed Original Research.
Concepts: Low-rank structure; Subgroup networks; Baseline network; Competitive performance; Differential networks; Real-world observational data; Latent variables; Efficient computational algorithm; Network; Sparsity; Heterogeneity analysis method; Computational algorithm; Influence of latent variables; Dense network; Subgroup structure; Statistical properties; Algorithm; Network analysis; Simulation study; Method; Analysis method; Differential network analysis
Robust sparse Bayesian regression for longitudinal gene–environment interactions
Fan K, Jiang Y, Ma S, Wang W, Wu C. Robust sparse Bayesian regression for longitudinal gene–environment interactions. Journal Of The Royal Statistical Society Series C (Applied Statistics) 2025, qlaf027. DOI: 10.1093/jrsssc/qlaf027. Peer-Reviewed Original Research.
Concepts: Cancer Prevention Study; Gene-environment interactions; Variable selection; Spike-and-slab priors; Gene-environment; Intra-cluster correlation; Bayesian variable selection; Prevention Study; Measured body weight; Measures analysis; Longitudinal study; Posterior inference; Gibbs sampler; MCMC algorithm; Interaction effects; Structured sparsity; Mixed models; Genetic factors; Extensive simulations; Fast computation; Phenotypic measurements; Inter-relatedness; Longitudinal observations; ANOVA; Interaction problems
Heterogeneous network analysis of disease clinical treatment measures via mining electronic medical record data
Wang J, Li R, Chang W, Hsiao K, Shia B, Ma S. Heterogeneous network analysis of disease clinical treatment measures via mining electronic medical record data. The Annals Of Applied Statistics 2025, 19. DOI: 10.1214/24-aoas1976. Peer-Reviewed Original Research.
Local Clustering for Functional Data
Chen Y, Zhang Q, Ma S. Local Clustering for Functional Data. Journal Of Computational And Graphical Statistics 2025, 34: 1075-1090. DOI: 10.1080/10618600.2024.2431057. Peer-Reviewed Original Research.
A Selective Review of Network Analysis Methods for Gene Expression Data
Li R, Yi H, Ma S. A Selective Review of Network Analysis Methods for Gene Expression Data. Methods In Molecular Biology 2025, 2880: 293-307. PMID: 39900765, DOI: 10.1007/978-1-0716-4276-4_14. Peer-Reviewed Original Research.
Concepts: Gene expression data; Gene expression networks; Expression data; Downstream analysis; Expression networks; Gene expression; Biological processes; Genes; Molecular mechanisms; Biological implications; High-throughput profiling techniques; Biological findings; Global view; Complex interactions; Profiling techniques; Regulation
Clinical Trials
Current Trials
Molecular Markers of UV Exposure and Cancer Risk in Skin
HIC ID: 2000024848; Role: Sub Investigator; Primary Completion Date: 03/31/2024; Recruiting Participants
Academic Achievements & Community Involvement
Activities
Health insurance in mainland China and Taiwan: utilization, impact, and policy interventions
01/01/2013 - Present; Research; China
Abstract/Synopsis: The goal of this study is to provide an updated, comprehensive description of health insurance coverage and utilization and their impacts on health and economic outcomes. Special attention has been paid to the less-advantaged groups.
Honors
Fellow
05/01/2013; International Award; ASA; United States
Elected Member
06/01/2007; International Award; ISI; United States
News
News
- July 23, 2025: Digging into data science at the 38th Annual New England Statistical Symposium
- May 06, 2025: Genetic Test Underused in Cancer Care
- January 07, 2025: Leadership Appointments Underscore Yale Biostatistics’ Global Strength in Research and Innovation
- October 24, 2024: New Analytics Center for Cardiovascular Medicine
Get In Touch
Contacts
Locations
300 George Street
Academic Office
Ste 206
New Haven, CT 06511