Original Articles

Non-linearity tests based on order statistics and quantile regressions

Pages 189-211 | Published online: 15 Mar 2007
 

Abstract

This article presents tests for neglected non-linearity based on order statistics. The tests rely on the estimation of parametric and non-parametric models. The parametric model imposes linearity and is estimated using least squares and, in turn, quantile regressions. The non-parametric model is estimated by implementing a nearest-neighbor smoother on induced order observations. When we implement the quantile regression estimates, we obtain robust parametric and robust non-parametric results. This avoids faulty inference caused by the lack of robustness of the least squares estimator and allows us to avoid any distributional assumption. The proposed tests are easier to compute than the existing tests based on series expansions. Simulations and examples illustrate the behavior of the tests and their robustness in the presence of outliers.
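As an illustrative sketch only (all names and tuning choices below are ours, not the article's), the core idea of the abstract can be mimicked in a few lines: fit a linear model by least squares, fit a k-nearest-neighbor smoother on the observations ordered by the regressor (the induced order observations), and compare the two sets of residuals. A large gap between the residual sums of squares hints at neglected non-linearity; the article builds formal test statistics from this kind of comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 10
x = rng.uniform(-2, 2, n)
y = x + 0.5 * x**2 + rng.normal(0, 0.3, n)   # true model is non-linear

# Parametric fit: least squares on a linear specification
X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
res_par = y - X @ beta

# Non-parametric fit: k-nearest-neighbor moving average on the
# observations sorted by x (induced order observations)
order = np.argsort(x)
y_ord = y[order]
fit_np = np.array([y_ord[max(0, i - k // 2): i + k // 2 + 1].mean()
                   for i in range(n)])
res_np = y_ord - fit_np

# Under neglected non-linearity, the parametric residual sum of
# squares is much larger than the non-parametric one.
print(np.sum(res_par**2), np.sum(res_np**2))
```

This is not the article's test statistic, only the raw ingredient; the article additionally replaces least squares with quantile regressions to robustify the comparison.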

Notes

The final ordering depends on the real-valued function selected.

Theorem 6.1 in ref. [10, p. 392] proves the conditional independence of induced order statistics of the dependent variable and of the residuals in the linear regression model.

A related statistic has been presented by Azzalini and Bowman [12] and by Ullah [7]. They consider the discrepancy between non-parametric and parametric sums of squared residuals and relate this statistic to an F test. A different approach can be found in refs. [13, 14]. They compare the parametric and non-parametric estimates by looking at functions of the squared distance between the two sets of residuals. The goal is to compare different regression functions.

This issue is relevant for the Z and M tests as well. The case of replacing the OLS estimates with quantile regression estimates can easily be considered. This issue will be analyzed using a small example in the following sections.

Bhattacharya verifies the equality of two regressions having common functional form, density, and variance of residuals, but computed in two different samples. The T test, instead, compares two different estimators, parametric and non-parametric, for the same regression computed in the same sample. As we are dealing with the same sample, the density is necessarily the same. The equality of the functional form is instead under test and is allowed to diverge in the two sets of estimates.

Portnoy [15] considers a related statistic to discriminate the i.i.d. from the n.i.i.d. models.

The influence function of an estimator is given by IF = ψ( )/E[−ψ′( )], where ψ( ) is the normal equation of the estimator and ψ′( ) is its first derivative. By the central limit theorem, the IF is asymptotically normal with zero mean and variance E(IF²).
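The role of the influence function can be seen numerically with a small illustration of our own (not from the article): the empirical influence of a single contaminating observation on the sample mean grows without bound in the outlier, while for the sample median it stays bounded, which is the behavior the IF summarizes and the reason quantile-based estimators are robust to outliers in the dependent variable.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(0, 1, 99)   # a clean sample

def empirical_influence(estimator, sample, point):
    """Change in the estimate from appending one observation, scaled by n."""
    n = len(sample)
    return (n + 1) * (estimator(np.append(sample, point)) - estimator(sample))

# The mean's influence grows linearly in the outlier; the median's
# influence is the same once the outlier lies beyond the sample range.
for outlier in (1.0, 10.0, 100.0):
    print(outlier,
          empirical_influence(np.mean, x, outlier),
          empirical_influence(np.median, x, outlier))
```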

However, the quantile regression estimator is not robust with respect to outliers in the explanatory variables.

The simulations have been implemented using Stata. The same random numbers generator is used in each experiment.

We have selected the graphs of model (b) as there is no choice between x and ν for the non-linear explanatory variable. Once the model and the sample size have been chosen, most of the corresponding experiments have been reported in . For the sake of brevity, we do not report the experiments with the smallest standard error, σ=0.05, but these graphs are not very different from the cases with σ=0.1.

The T test is not really centered on zero, but on values smaller than zero. This can be seen in and can be explained by looking at the structure of the test. The difference between the parametric and non-parametric cumulative sums of the fitted values computed at different quantiles leads us to overstate the difference occurring at the lower quantiles, because such a difference is counted in all the cumulative sums. In contrast, the same statistic understates the difference at the higher quantiles, as these enter only the last few cumulative sums. This means that, when the parametric model is smaller than the non-parametric model at the lower quantiles, the final value of the T test is negative, and in our simulations this systematically occurs under the null.
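The asymmetric weighting described above can be checked with a toy computation (our own construction, not the article's statistic): in a sum of cumulative sums over quantiles, a discrepancy at the lowest quantile is counted in every cumulative sum, while an equal discrepancy at the highest quantile enters only the last one.

```python
import numpy as np

d = np.zeros(9)              # discrepancies at 9 quantiles, all zero
low, high = d.copy(), d.copy()
low[0] = 1.0                 # a unit gap at the lowest quantile...
high[-1] = 1.0               # ...and the same gap at the highest quantile

# Summing the cumulative sums counts the early gap 9 times, the late one once.
print(np.cumsum(low).sum(), np.cumsum(high).sum())   # 9.0 1.0
```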

In ref. [5] the critical value defining the acceptance/rejection region is 2.2 for a significance level α=5%. We instead use the tabulated value 1.96, which, as previously discussed, seems to be perfectly appropriate.

It would be interesting to implement a simulation study to generalize the robustness properties of the proposed tests. However, this involves a wealth of experiments that goes beyond the limited scope of this article.

Additional information

Notes on contributors

Marilena Furno

Via Orazio 27/D, 80122 Napoli, Italy. Tel./Fax: 39-081-2486204; Email: [email protected]

