Lecturer Zhang Shucong (张书聪) of the Department of Data Science Publishes Paper in the Leading International Journal Journal of the American Statistical Association

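    In March 2024, Zhang Shucong (张书聪), a lecturer in the Department of Data Science, published the paper "CARE: Large precision matrix estimation for compositional data" online in the Journal of the American Statistical Association, a leading international core journal in statistics. The work resolves the identifiability problem for sparse precision matrices in high-dimensional compositional data and proposes a new estimation method, CARE. The accompanying theory helps fill a gap in the current theory of high-dimensional precision matrix estimation for compositional data.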

Abstract:

    High-dimensional compositional data are prevalent in many applications. The simplex constraint poses intrinsic challenges to inferring the conditional dependence relationships among the components forming a composition, as encoded by a large precision matrix. We introduce a precise specification of the compositional precision matrix and relate it to its basis counterpart, which is shown to be asymptotically identifiable under suitable sparsity assumptions. By exploiting this connection, we propose a composition adaptive regularized estimation (CARE) method for estimating the sparse basis precision matrix. We derive rates of convergence for the estimator and provide theoretical guarantees on support recovery and data-driven parameter tuning. Our theory reveals an intriguing trade-off between identification and estimation, thereby highlighting the blessing of dimensionality in compositional data analysis. In particular, in sufficiently high dimensions, the CARE estimator achieves minimax optimality and performs as well as if the basis were observed. We further discuss how our framework can be extended to handle data containing zeros, including sampling zeros and structural zeros. The advantages of CARE over existing methods are illustrated by simulation studies and an application to inferring microbial ecological networks in the human gut.
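    The challenge posed by the simplex constraint can be illustrated with a small simulation. The following is a minimal sketch and not the CARE method itself: it assumes a hypothetical log-normal basis with independent components, normalizes it to compositions, and applies the standard centered log-ratio (clr) transform. All variable names and simulation settings are illustrative assumptions, not taken from the paper. The sketch shows that the clr covariance equals the basis log-covariance sandwiched between centering matrices and is therefore singular, so it cannot simply be inverted to obtain a precision matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 5  # illustrative sample size and dimension (assumptions)

# Hypothetical latent basis: positive abundances whose logs are independent normals.
log_basis = rng.normal(size=(n, p))
basis = np.exp(log_basis)

# Compositions: each row is rescaled to sum to 1 (the simplex constraint).
composition = basis / basis.sum(axis=1, keepdims=True)

# Centered log-ratio (clr) transform: subtract each row's mean log value.
log_comp = np.log(composition)
clr = log_comp - log_comp.mean(axis=1, keepdims=True)

# Sample covariances of the clr data and of the (normally unobserved) log-basis.
clr_cov = np.cov(clr, rowvar=False)
basis_cov = np.cov(log_basis, rowvar=False)

# Centering matrix G = I - (1/p) * 11^T; the clr data equal log_basis @ G.
G = np.eye(p) - np.ones((p, p)) / p

rank_clr = np.linalg.matrix_rank(clr_cov)
rank_basis = np.linalg.matrix_rank(basis_cov)
smallest_eig_clr = np.linalg.eigvalsh(clr_cov)[0]

print("rank of clr covariance:", rank_clr)
print("rank of basis covariance:", rank_basis)
print("smallest eigenvalue of clr covariance:", smallest_eig_clr)
```

    Because the clr covariance loses one degree of freedom, any precision matrix defined from compositional data needs a careful specification, which is the issue the abstract describes. How CARE relates the compositional and basis precision matrices is given in the paper and is not reproduced here.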