Beta regression for double‐bounded response with correlated high‐dimensional covariates
Jianxuan Liu
- Statistics, Probability and Uncertainty
- Statistics and Probability
Continuous responses measured on the standard unit interval are ubiquitous across scientific disciplines. Statistical models built upon a normal error structure generally do not work well for such responses because they can produce biased estimates or predictions outside either bound. In real‐life applications, data are often high‐dimensional, correlated and made up of a mixture of data types. Little literature addresses this combination of challenges. We propose a semiparametric approach to analyse the association between a double‐bounded response and high‐dimensional correlated covariates of mixed types. The proposed method uses all available data through one or several linear combinations of the covariates, without losing information from the data. The only assumption we make is that the response variable follows a Beta distribution; no additional assumption is required. The resulting estimators are consistent and efficient. We illustrate the proposed method in simulation studies and demonstrate it in a real‐life data application. The semiparametric approach contributes to the sufficient dimension reduction literature by investigating double‐bounded responses, which are absent from the current literature. This work also provides a new tool for data practitioners to analyse the association between a commonly encountered unit‐interval response and high‐dimensional correlated covariates of mixed types.
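The point about normal-error models predicting outside the bounds can be illustrated with a small simulation. The sketch below is not the semiparametric estimator proposed in the paper. It assumes a single linear combination of correlated Gaussian covariates, a logit link for the mean, and a fixed precision parameter, and every name in it (beta_true, phi, the AR(1)-style covariance) is hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated covariates with an assumed AR(1)-style covariance (illustrative only).
n, p = 1000, 5
idx = np.arange(p)
cov = 0.5 ** np.abs(np.subtract.outer(idx, idx))
X = rng.multivariate_normal(np.zeros(p), cov, size=n)

# Hypothetical single index: the response depends on X only through X @ beta_true.
beta_true = np.array([1.5, -1.0, 0.5, 0.0, 0.0])
index = X @ beta_true

# Assumed logit link for the mean, with the mean-precision Beta parameterisation:
# y ~ Beta(mu * phi, (1 - mu) * phi), so E[y | X] = mu.
mu = 1.0 / (1.0 + np.exp(-index))
phi = 10.0
y = rng.beta(mu * phi, (1.0 - mu) * phi)

# Ordinary least squares (a normal-error model) fitted directly to y.
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
ols_fitted = A @ coef

# Count fitted values that fall outside the unit interval.
n_outside = int(np.sum((ols_fitted < 0.0) | (ols_fitted > 1.0)))
print("OLS fitted values outside [0, 1]:", n_outside, "of", n)
print("Range of Beta means:", float(mu.min()), float(mu.max()))
```

In this setup the simulated responses and their Beta means stay inside the unit interval by construction, whereas the linear fit is unbounded and can leave it for observations with an extreme index value.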