STS Congenital Heart Surgery Database analysis
We propose a program of research to address two related methodological challenges in assessing the outcomes and quality of care of pediatric and congenital heart disease surgical programs. First, from a statistical perspective, a key problem is sparsity: few programs have enough patients undergoing a particular type of operation to allow reliable inferences about program quality. Second, as a strategy to increase numbers, the current STS approach groups operations into strata of similar risk, but this strategy results in significant heterogeneity of diagnoses and procedure types within groups. Attempts to account for various risk factors are confounded by important and often conflicting interactions among age, diagnosis, procedure type, non-cardiac defects, genetic syndromes, chromosomal anomalies, and other risk factors.
New statistical approaches to learning from large observational databases, assembled from multiple sources that contain varying amounts and types of information, could enable a better understanding of the quality of surgical care delivered to pediatric and congenital heart disease populations. We propose to capitalize on the numerous confounders and surgery types observed in the Congenital Heart Surgery Database (CHSD) to address these methodological challenges through robust machine learning and causal inference approaches. First, we will investigate different approaches to creating homogeneous patient groups in which to assess program performance (Aim 1). Second, to address sparsity resulting from low outcome rates (i.e., mortality), numerous surgery types, small numbers of patients undergoing a specific operation at a single program, and numerous confounders, we will exploit regularization strategies, such as sparse prior distributions on the coefficients of machine-learning procedures (Aim 2).
Aim 1: Examine the impact of defining patient cohorts by alternative grouping methodologies that incorporate diagnoses, procedures, procedure-specific risk factors, and other potentially important factors such as age at operation, non-cardiac defects, genetic syndromes, chromosomal anomalies, and other risk factors, and compare the resulting cohorts with those produced by the current method.
Aim 2: Develop new risk models based on the groupings identified in Aim 1 using modern big-data causal inference procedures. We will examine two approaches: (a) regularization through sparsity constraints, and (b) targeted maximum likelihood estimation via machine learning.
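To make approach (a) concrete, the following is a minimal sketch of one common form of sparsity-constrained regularization: an L1-penalized (lasso) logistic regression fit by proximal gradient descent. It is an illustration under stated assumptions, not the proposal's actual model. The function names (`soft_threshold`, `fit_l1_logistic`, `simulate_cohort`), the simulated data, and all parameter values are hypothetical; no CHSD data or variables are used. It does not cover approach (b).

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink a value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X, y, penalty, step=0.5, n_iter=500):
    """Fit logistic regression with an L1 penalty on the coefficients.

    Each iteration takes a gradient step on the average log-loss and then
    soft-thresholds the coefficients. The intercept is not penalized.
    """
    n, p = len(X), len(X[0])
    intercept = 0.0
    coefs = [0.0] * p
    for _ in range(n_iter):
        grad_intercept = 0.0
        grad_coefs = [0.0] * p
        for row, label in zip(X, y):
            linear = intercept + sum(w * x for w, x in zip(coefs, row))
            residual = sigmoid(linear) - label
            grad_intercept += residual
            for j in range(p):
                grad_coefs[j] += residual * row[j]
        intercept -= step * grad_intercept / n
        coefs = [
            soft_threshold(w - step * g / n, step * penalty)
            for w, g in zip(coefs, grad_coefs)
        ]
    return intercept, coefs


def simulate_cohort(n=300, p=6, seed=0):
    """Simulate a rare binary outcome where only two of p factors matter.

    Purely synthetic data (an assumption for illustration only).
    """
    rng = random.Random(seed)
    true_coefs = [1.5, -1.0] + [0.0] * (p - 2)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        prob = sigmoid(-2.0 + sum(b * x for b, x in zip(true_coefs, row)))
        X.append(row)
        y.append(1 if rng.random() < prob else 0)
    return X, y


if __name__ == "__main__":
    X, y = simulate_cohort()
    for penalty in (0.02, 0.1, 10.0):
        _, coefs = fit_l1_logistic(X, y, penalty)
        print(penalty, [round(c, 3) for c in coefs])
```

Raising `penalty` drives more coefficients to exactly zero, which is the sense in which the constraint induces sparsity. In a Bayesian reading, the L1 penalty corresponds to the posterior mode under independent Laplace priors on the coefficients, one example of a sparse prior distribution.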