邹家辉: A method to deal with generalized linear regressions in big data
Academy of Mathematics and Systems Science, CAS Colloquia & Seminars
Speaker:
邹家辉, Capital University of Economics and Business
Inviter:
Title:
A method to deal with generalized linear regressions in big data
Language:
Chinese
Time & Venue:
2022.12.07 19:00-20:00, Tencent Meeting: 96248182651
Abstract:
This paper proposes a method to accelerate computation for generalized linear models on big data. We partition the covariates of the full model into several groups to build candidate models. For each candidate model, we estimate the parameters independently via an optimal subsampling method. The resulting estimates are combined, using weights obtained from optimal model averaging, to give an approximate parameter estimate for the full model. An algorithm implementing this procedure is designed using parallel computing. As theoretical support, we derive the forms of the optimal subsampling probabilities for estimating the candidate models, the asymptotic distribution of the difference between the candidate-model parameter estimates obtained from subsamples and those obtained from the full sample, and the asymptotic optimality of the combined parameter estimates. Finally, several numerical experiments and empirical results illustrate the method's performance in accuracy and speed.
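The procedure described in the abstract (group the covariates, fit each candidate model on a subsample, then combine the estimates with weights) can be sketched roughly as follows for logistic regression. This is only an illustrative sketch under stated assumptions, not the speaker's algorithm: the function names fit_logistic and subsample_average are hypothetical, uniform subsampling probabilities stand in for the optimal subsampling probabilities, and equal weights stand in for the optimal model averaging weights, since the abstract does not give either form.

```python
import numpy as np

def fit_logistic(X, y, w, n_iter=25):
    """Weighted logistic regression fitted by Newton's method."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (w * (y - p))
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X
        # A tiny ridge term keeps the linear solve numerically stable.
        beta += np.linalg.solve(hess + 1e-8 * np.eye(X.shape[1]), grad)
    return beta

def subsample_average(X, y, groups, r, probs=None, weights=None, seed=0):
    """Fit one candidate model per covariate group on a subsample of size r,
    then combine the estimates into one full-length coefficient vector.

    ASSUMPTION: probs defaults to uniform sampling and weights defaults to
    equal weights; both are placeholders for the optimal choices.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    probs = np.full(n, 1.0 / n) if probs is None else np.asarray(probs)
    if weights is None:
        weights = np.full(len(groups), 1.0 / len(groups))
    beta_avg = np.zeros(d)
    # The candidate models are independent of each other, so this loop
    # could be distributed across workers.
    for w_k, cols in zip(weights, groups):
        idx = rng.choice(n, size=r, replace=True, p=probs)
        inv_prob = 1.0 / (n * probs[idx])  # inverse-probability weights
        beta_k = fit_logistic(X[idx][:, cols], y[idx], inv_prob)
        beta_avg[cols] += w_k * beta_k  # zero-padded outside the group
    return beta_avg
```

Calling subsample_average(X, y, groups=[[0, 1, 2], [3, 4, 5]], r=2000) would fit two three-covariate candidate models on 2000 sampled rows each. Because the groups are disjoint, each coefficient comes from exactly one candidate model and is scaled by that model's weight, which is why the choice of weights matters for the quality of the combined estimate.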