2024
Improving prediction of linear regression models by integrating external information from heterogeneous populations: James–Stein estimators
Han P, Li H, Park S, Mukherjee B, Taylor J. Improving prediction of linear regression models by integrating external information from heterogeneous populations: James–Stein estimators. Biometrics 2024, 80: ujae072. PMID: 39101548, PMCID: PMC11299067, DOI: 10.1093/biomtc/ujae072.
Peer-Reviewed Original Research
Concepts: James-Stein estimator; Linear regression models; Individual-level data; Comprehensive simulation study; Regression models; Numerical performance; Simulation study; Shrinkage method; Coefficient estimates; Predictive mean; Reduced model; Study population heterogeneity; Internal model; Estimation; Study population; Blood lead levels; International studies; Covariates; Patella bone; Published literature; Lead levels; External studies; Summary information; Population; Subsets
2023
A Synthetic Data Integration Framework to Leverage External Summary-Level Information from Heterogeneous Populations
Gu T, Taylor J, Mukherjee B. A Synthetic Data Integration Framework to Leverage External Summary-Level Information from Heterogeneous Populations. Biometrics 2023, 79: 3831-3845. PMID: 36876883, PMCID: PMC10480346, DOI: 10.1111/biom.13852.
Peer-Reviewed Original Research
Concepts: Covariate effects; Statistical inference; Heterogeneity of covariate effects; Regression coefficient estimates; Summary-level information; Improve statistical inference; International studies; Outcome Y; Covariate information; Data integration framework; Statistical efficiency; Coefficient estimates; Partial information; External population; General framework; Individual-level data; Risk prediction model; External model; Prediction problem; International study population; Multiple imputation
2022
Variable Selection with Multiply-Imputed Datasets: Choosing Between Stacked and Grouped Methods
Du J, Boss J, Han P, Beesley L, Kleinsasser M, Goutman S, Batterman S, Feldman E, Mukherjee B. Variable Selection with Multiply-Imputed Datasets: Choosing Between Stacked and Grouped Methods. Journal Of Computational And Graphical Statistics 2022, 31: 1063-1075. PMID: 36644406, PMCID: PMC9838615, DOI: 10.1080/10618600.2022.2035739.
Peer-Reviewed Original Research
Concepts: Variable selection; Simultaneous coefficient estimation; Penalized regression methods; Binary outcome data; Objective function; R-package; Shrinkage penalty; General class; Cyclic coordinate descent; Variable selection algorithm; Coefficient estimates; Supplementary materials; Method to data; Coordinate descent; Multiple imputation; ALS risk; Multiply-imputed; Outcome data; Function formulation; Selectivity properties; Selection algorithm; Estimation; Optimization algorithm; Missingness; Biomedical applications
2021
A hierarchical integrative group least absolute shrinkage and selection operator for analyzing environmental mixtures
Boss J, Rix A, Chen Y, Narisetty N, Wu Z, Ferguson K, McElrath T, Meeker J, Mukherjee B. A hierarchical integrative group least absolute shrinkage and selection operator for analyzing environmental mixtures. Environmetrics 2021, 32. PMID: 34899005, PMCID: PMC8664243, DOI: 10.1002/env.2698.
Peer-Reviewed Original Research
Concepts: Group least absolute shrinkage; Environmental health studies; Health outcomes; Health Study; LIFECODES birth cohort; Birth cohort; Exposure interactions; Penalized regression methods; Dose-response relationship; Exposure mixtures; Comprehensive R Archive Network; Interaction effects; Induce sparsity; Adaptive weights; Group lasso; Selection operator; Heredity constraint; Least Absolute Shrinkage; Selection framework; Nonlinear interaction effects; Sample size; Variable selection; Joint effects; Coefficient estimates; Group structure