A split-and-conquer approach for analysis of extraordinarily large data

X Chen, M Xie - Statistica Sinica, 2014 - JSTOR
If a dataset is too large to fit into a single computer, or too expensive for a
computationally intensive data analysis, what should we do? We propose a
split-and-conquer approach and illustrate it using several computationally intensive
penalized regression methods, along with theoretical support. We show that the
split-and-conquer approach can substantially reduce computing time and computer memory
requirements. The proposed methodology is illustrated numerically using both simulations
and data examples.
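
The abstract does not say how the block-wise results are combined, so the following is only a minimal sketch of the general split-and-conquer idea under stated assumptions: the rows are split into blocks, a penalized regression (here ridge, chosen because it has a closed form) is fit on each block, and the block estimates are combined by simple averaging. The function names `ridge_fit` and `split_and_conquer`, the averaging rule, and the simulated data are all illustrative assumptions, not the authors' procedure.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X'X + lam*I) b = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def split_and_conquer(X, y, n_blocks, lam=1.0):
    """Fit ridge on each block of rows, then average the block estimates.

    Simple averaging is an assumption of this sketch; the abstract
    does not state the combination rule.
    """
    blocks = np.array_split(np.arange(X.shape[0]), n_blocks)
    estimates = [ridge_fit(X[idx], y[idx], lam) for idx in blocks]
    return np.mean(estimates, axis=0)

# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n, p = 20_000, 5
beta_true = np.array([2.0, -1.0, 0.0, 0.5, 0.0])
X = rng.standard_normal((n, p))
y = X @ beta_true + rng.standard_normal(n)

beta_sc = split_and_conquer(X, y, n_blocks=10)   # only one block in use at a time
beta_full = ridge_fit(X, y, lam=1.0)             # needs all rows at once
max_gap = float(np.max(np.abs(beta_sc - beta_full)))
print("split-and-conquer:", np.round(beta_sc, 3))
print("full-data fit:    ", np.round(beta_full, 3))
```

Each block fit touches only `n / n_blocks` rows, which is where the memory saving would come from in this sketch; the block fits are also independent, so they could in principle run on separate machines.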