ABSTRACT

The coexistences of high dimensionality and strong correlation in both responses and predictors pose unprecedented challenges in identifying important predictors. In this paper, we propose a model-free conditional feature screening method with false discovery rate (FDR) control for ultrahigh-dimensional multi-response setting. The proposed method is built upon partial distance correlation, which measures the dependence between two random vectors while controlling effect for a multivariate random vector. This screening approach is robust against heavy-tailed data and can select predictors in instances of high correlation among predictors. Additionally, it can identify predictors that are marginally unrelated but conditionally related with the response. Leveraging the advantageous properties of partial distance correlation, our method allows for high-dimensional variables to be conditioned upon, distinguishing it from current research in this field. To further achieve FDR control, we apply derandomized knockoff-e-values to establish the threshold for feature screening more stably. The proposed FDR control method is shown to enjoy sure screening property while maintaining FDR control as well as achieving higher power under mild conditions. The superior performance of these methods is demonstrated through simulation examples and a real data application.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
Open Science Framework awards
Open Data Open Data
Digitally shareable data necessary to reproduce the reported results are publicly available for this article.
Open Materials Open Materials
The components of the research methodology needed to reproduce the reported procedure and analysis are publicly available for this article.
You do not currently have access to this article.