Preprint · CC-BY via bioRxiv

Through the lens of causal inference: Decisions and pitfalls of covariate selection

Chen, G., Cai, Z., Taylor, P. A.

bioRxiv · 2024

Abstract

The critical importance of justifying the inclusion of covariates is a facet often overlooked in data analysis. While the incorporation of covariates typically follows informal guidelines, we argue for a comprehensive exploration of underlying principles to avoid significant statistical and interpretational challenges. Our focus is on addressing three common yet problematic practices: the indiscriminate lumping of covariates, the lack of rationale for covariate inclusion, and the oversight of potential issues in result reporting. These challenges, prevalent in neuroimaging models involving covariates such as reaction time, demographics, and morphometric measures, can introduce biases, including overestimation, underestimation, masking, sign flipping, or spurious effects. Our exploration of causal inference principles underscores the pivotal role of domain knowledge in guiding covariate selection, challenging the common reliance on statistical measures. This understanding carries implications for experimental design, model-building, and result interpretation. We draw connections between these insights and reproducibility concerns, specifically addressing the selection bias resulting from the widespread practice of strict thresholding, akin to the logical pitfall associated with "double dipping." Recommendations for robust data analysis involving covariates encompass explicit research question statements, justified covariate inclusions/exclusions, centering quantitative variables for interpretability, appropriate reporting of effect estimates, and advocating a "highlight, don't hide" approach in result reporting. These suggestions are intended to enhance the robustness, transparency, and reproducibility of covariate-driven analyses, encompassing investigations involving consortium datasets such as ABCD and UK Biobank.
We discuss how researchers can use a transparent depiction of the covariate relationships to enhance the ethos of open science and promote research reproducibility.

Citation only. Full text is not openly licensed for redistribution here; read it at the source via the DOI listed under Provenance.

Provenance

Source
bioRxiv
DOI
10.1101/2024.01.11.575211
Fetched
2026-05-31 MST

Cite this

APA
Chen, G., Cai, Z., & Taylor, P. A. (2024). Through the lens of causal inference: Decisions and pitfalls of covariate selection. bioRxiv. https://doi.org/10.1101/2024.01.11.575211
Vancouver
Chen G, Cai Z, Taylor PA. Through the lens of causal inference: Decisions and pitfalls of covariate selection. bioRxiv. 2024. doi:10.1101/2024.01.11.575211.
BibTeX
@unpublished{chen2024Throug,
  title = {Through the lens of causal inference: Decisions and pitfalls of covariate selection},
  author = {Chen, G. and Cai, Z. and Taylor, P. A.},
  journal = {bioRxiv},
  year = {2024},
  doi = {10.1101/2024.01.11.575211},
}
