Journal of Computational and Graphical Statistics Volume 27, 2018 - Issue 3

Short Technical Notes

Divide and Recombine Approaches for Fitting Smoothing Spline Models with Large Datasets

Danqing Xu, Department of Statistics and Applied Probability, University of California-Santa Barbara, Santa Barbara, CA

Yuedong Wang (corresponding author), Department of Statistics and Applied Probability, University of California-Santa Barbara, Santa Barbara, CA

Pages 677-683 | Received 01 Apr 2017, Published online: 06 Jun 2018

https://doi.org/10.1080/10618600.2017.1402775

ABSTRACT

Spline smoothing is a widely used nonparametric method that allows data to speak for themselves. Due to its complexity and flexibility, fitting smoothing spline models is usually computationally intensive, which may become prohibitive with large datasets. To overcome memory and CPU limitations, we propose four divide and recombine (D&R) approaches for fitting cubic splines with large datasets. We consider two approaches to dividing the data: random and sequential. For each division approach, we consider two approaches to recombining. These D&R approaches are implemented in parallel without communication. Extensive simulations show that these D&R approaches are scalable and perform comparably to the method that uses the whole data. The sequential D&R approaches are spatially adaptive, which leads to better performance than the whole-data method when the underlying function is spatially inhomogeneous.
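The two division schemes named in the abstract can be illustrated with a minimal Python sketch. Everything in it beyond the random/sequential split is an assumption: the abstract does not state the two recombination rules, so the sketch averages sub-fits for random division and uses each block's own fit on its x-range for sequential division. A Gaussian kernel smoother stands in for the cubic smoothing spline so that the example needs only the standard library, and the sine test function, the function names, and the bandwidth are hypothetical. Only σ = 0.1 is taken from the supplementary description.

```python
import math
import random


def kernel_smooth(xs, ys, grid, bandwidth=0.02):
    """Stand-in smoother (Gaussian kernel regression), NOT a smoothing spline."""
    fitted = []
    for g in grid:
        weights = [math.exp(-0.5 * ((g - x) / bandwidth) ** 2) for x in xs]
        fitted.append(sum(w * y for w, y in zip(weights, ys)) / sum(weights))
    return fitted


def divide_random(n, k, rng):
    """Random division: shuffle the indices, then deal them into k subsets."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [sorted(idx[j::k]) for j in range(k)]


def divide_sequential(n, k):
    """Sequential division: k contiguous blocks of data already sorted by x."""
    bounds = [j * n // k for j in range(k + 1)]
    return [list(range(bounds[j], bounds[j + 1])) for j in range(k)]


def fit_random_dr(xs, ys, grid, k, rng):
    """Assumed recombination: pointwise average of the k sub-fits."""
    fits = [
        kernel_smooth([xs[i] for i in b], [ys[i] for i in b], grid)
        for b in divide_random(len(xs), k, rng)
    ]
    return [sum(column) / k for column in zip(*fits)]


def fit_sequential_dr(xs, ys, grid, k):
    """Assumed recombination: each grid point uses the block covering it."""
    blocks = divide_sequential(len(xs), k)
    upper = [xs[b[-1]] for b in blocks]  # largest x in each block
    fitted = []
    for g in grid:
        # First block whose upper edge reaches g; fall back to the last block.
        j = next((j for j, u in enumerate(upper) if g <= u), k - 1)
        b = blocks[j]
        fitted.append(kernel_smooth([xs[i] for i in b], [ys[i] for i in b], [g])[0])
    return fitted


# Simulated data: hypothetical sine test function, noise sd 0.1.
rng = random.Random(0)
n, k, sigma = 2000, 4, 0.1
xs = [(i + 0.5) / n for i in range(n)]
ys = [math.sin(2 * math.pi * x) + rng.gauss(0.0, sigma) for x in xs]
grid = [j / 100 for j in range(101)]

fit_random = fit_random_dr(xs, ys, grid, k, rng)
fit_sequential = fit_sequential_dr(xs, ys, grid, k)
```

Each sub-fit depends only on its own subset, which is why the blocks could be fitted in parallel without communication. In the sequential scheme every block would also select its own smoothing parameter, which is the intuition behind the spatial adaptivity reported above. The fixed bandwidth used here does not reproduce that effect.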

KEYWORDS:

Debias
Divide and conquer
Parallel computing
Scalability
Spatial adaptivity

Supplementary Material: Simulation Codes

fit only: folder containing four R code files that compute spline estimates and system runtime for three cases, with the smoothing parameter chosen by GML and σ = 0.1.

	(a)	fit_only.R: R code file containing six functions that compute estimates by the four D&R methods and two other methods.
	(b)	simple_sin_s1.R: R code file that computes system runtime for case 1.
	(c)	sin_s1.R: R code file that computes system runtime for case 2.
	(d)	doppler_s1.R: R code file that computes system runtime for case 3.

with pstd: folder containing four R code files that compute confidence intervals, average MSE, squared bias, and variance for three cases, with the smoothing parameter chosen by GML and σ = 0.1.

	(a)	fit_only.R: R code file containing six functions that compute estimates and confidence intervals by the four D&R methods and two other methods.
	(b)	simple_sin_s1.R: R code file that computes confidence intervals, average MSE, squared bias, and variance for case 1.
	(c)	sin_s1.R: R code file that computes confidence intervals, average MSE, squared bias, and variance for case 2.
	(d)	doppler_s1.R: R code file that computes confidence intervals, average MSE, squared bias, and variance for case 3, and generates Figure 1, the typical spline estimates for case 3.
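
The file descriptions above name average MSE, squared bias, and variance without defining them. The following Python sketch shows one common way such summaries are computed from repeated simulation fits. It is an assumption about the definitions, not a transcription of the supplementary R code: all three quantities are averaged over the evaluation grid and over replicates, and the variance uses a 1/R normalization so that MSE equals squared bias plus variance exactly. The function name `mse_decomposition` is hypothetical.

```python
def mse_decomposition(estimates, truth):
    """Return (average MSE, squared bias, variance) over a grid.

    estimates: list of R replicate fits, each a list of values on the grid.
    truth: list of true function values on the same grid.
    Assumed convention: averages over grid points and replicates, with
    variance normalized by R (not R - 1).
    """
    r, m = len(estimates), len(truth)
    # Pointwise mean of the replicate fits.
    mean_fit = [sum(est[j] for est in estimates) / r for j in range(m)]
    sq_bias = sum((mean_fit[j] - truth[j]) ** 2 for j in range(m)) / m
    variance = sum(
        (est[j] - mean_fit[j]) ** 2 for est in estimates for j in range(m)
    ) / (r * m)
    mse = sum(
        (est[j] - truth[j]) ** 2 for est in estimates for j in range(m)
    ) / (r * m)
    return mse, sq_bias, variance


# Two replicates on a two-point grid.
mse, sq_bias, variance = mse_decomposition([[1.0, 2.0], [3.0, 2.0]], [1.0, 1.0])
# Here mse = 1.5, sq_bias = 1.0, variance = 0.5, so mse = sq_bias + variance.
```

Under this convention the decomposition holds as an identity, which gives a quick consistency check on simulation output.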

Additional information

Funding

This research was supported by a grant from the National Science Foundation (DMS-1507620). The authors acknowledge support from the Center for Scientific Computing from the CNSI, MRL: an NSF MRSEC (DMR-1121053).
