Statistical Research and Training Center - Statistical Centre of Iran
Journal of Statistical Research of Iran JSRI
1735-1294
3
1
2006
9
1
Outlier Detection by Boosting Regression Trees
1
22
EN
Nathalie
Chèze
Jean-Michel
Poggi
Jean-Michel.Poggi@math.u-psud.fr
10.18869/acadpub.jsri.3.1.1
A procedure for detecting outliers in regression problems is proposed. It is based on information provided by boosting regression trees. The key idea is to select the observation most frequently resampled along the boosting iterations and to reiterate after removing it. The selection criterion is based on Tchebychev’s inequality applied to the maximum, over the boosting iterations, of the average number of appearances in bootstrap samples. The procedure is therefore noise-distribution free. It identifies outliers as observations that are particularly hard to predict. Many well-known benchmark data sets are considered, and a comparative study against two well-known competitors shows the value of the method.
Boosting, CART, outlier, regression.
http://jsri.srtc.ac.ir/article-1-163-en.html
http://jsri.srtc.ac.ir/article-1-163-en.pdf
Statistical Research and Training Center - Statistical Centre of Iran
Journal of Statistical Research of Iran JSRI
1735-1294
3
1
2006
9
1
Some Statistical Methods for Prediction of Athletic Records
23
46
EN
R.
Dargahi-Noubary
rnoubary@bloomu.edu
10.18869/acadpub.jsri.3.1.23
Prediction of sports records has received a great deal of attention from researchers in different disciplines. This article reviews some of the methods developed by statisticians and offers a few improvements. Specific methods discussed include trend analysis, tail modeling, and methods based on certain results of the theory of records for independent and identically distributed attempts. To make the latter theory applicable, and to account for factors affecting the records, adjustments are made to the data in the form of an increase in participation or attempts. Models utilized for this purpose include geometric increase, logistic increase, and increase as a non-homogeneous Poisson process. A method for predicting the ultimate record is also included, together with illustrative examples using data for the men’s long jump and 400-meter run.
Sport records, prediction, trend analysis, tail modeling, theory of records, Poisson processes.
http://jsri.srtc.ac.ir/article-1-160-en.html
http://jsri.srtc.ac.ir/article-1-160-en.pdf
Statistical Research and Training Center - Statistical Centre of Iran
Journal of Statistical Research of Iran JSRI
1735-1294
3
1
2006
9
1
A New Skew-normal Density
47
61
EN
Maryam
Sharafi
Javad
Behboodian
behboodian@stat.susc.ac.ir
10.18869/acadpub.jsri.3.1.47
We present a new skew-normal distribution, denoted by NSN($\lambda$). We first derive the density and moment generating function of NSN($\lambda$). The properties of SN($\lambda$), the known skew-normal distribution of Azzalini, and of NSN($\lambda$) are then compared. Finally, a numerical example of hypothesis testing for the parameter $\lambda$ in NSN($\lambda$) is given.
Skew-normal distribution, a new skew-normal distribution, moment generating function, skewness, kurtosis, testing hypothesis.
http://jsri.srtc.ac.ir/article-1-159-en.html
http://jsri.srtc.ac.ir/article-1-159-en.pdf
Statistical Research and Training Center - Statistical Centre of Iran
Journal of Statistical Research of Iran JSRI
1735-1294
3
1
2006
9
1
A Randomness Test for Stable Data
63
74
EN
Adel
Mohammadpour
adl@aut.ac.ir
Ali
Mohammad-Djafari
djafari@lss.supelec.fr
John
P. Nolan
jpnolan@american.edu
10.18869/acadpub.jsri.3.1.63
In this paper, we propose a new method for checking the randomness of non-Gaussian stable data, based on a characterization result. This method is more sensitive to non-random data than the well-known non-parametric randomness tests.
Stable distributions, randomness tests (test for i.i.d.), characterization.
http://jsri.srtc.ac.ir/article-1-162-en.html
http://jsri.srtc.ac.ir/article-1-162-en.pdf
Statistical Research and Training Center - Statistical Centre of Iran
Journal of Statistical Research of Iran JSRI
1735-1294
3
1
2006
9
1
A Comparative Review of Selection Models in Longitudinal Continuous Response Data with Dropout
75
90
EN
Elaheh
Vahidi-Asl
elahehva@yahoo.com
Mojtaba
Ganjali
m-ganjali@sbu.ac.ir
10.18869/acadpub.jsri.3.1.75
Missing values occur in studies in various disciplines such as the social sciences, medicine, and economics. The missing mechanism in these studies should be investigated more carefully. In this article, some models proposed in the literature on longitudinal data with dropout are reviewed and compared. In an applied example it is shown that the selection model of Hausman and Wise (1979, Econometrica 47, pp. 455-473) and the shared parameter model of Follmann and Wu (1995, Biometrics 51, pp. 151-168), two of the most widely used models for longitudinal data with dropout in economic and medical research, respectively, cannot sufficiently account for the relation between the response variables and the missing mechanism. In this paper, the dropout model of Follmann and Wu (1995) is also generalized by adding a previous-time outcome component to the model. With this modification, in the case of longitudinal data with two time periods, a general form of the model is obtained that is able to account for all relations between the response and the missing mechanism. This is proven in an implicit way. A test for missing at random in the generalized Heckman model (Crouchley and Ganjali, 2002, Stat. Model. 2, pp. 39-62) is also introduced, where one has to use the $\delta$-method to find the variance of the test statistic.
Longitudinal data, continuous response, missing values, selection bias, dropout, random effect model.
http://jsri.srtc.ac.ir/article-1-161-en.html
http://jsri.srtc.ac.ir/article-1-161-en.pdf
Statistical Research and Training Center - Statistical Centre of Iran
Journal of Statistical Research of Iran JSRI
1735-1294
3
1
2006
9
1
On Classification of Bivariate Distributions Based on Mutual Information
91
101
EN
Mohamed
Habibullah
Mohammad
Ahsanullah
ahsan@rider.edu
10.18869/acadpub.jsri.3.1.91
Among all measures of independence between random variables, mutual information is the only one that is based on information theory. Mutual information takes into account all kinds of dependencies between variables, i.e., both linear and non-linear dependencies. In this paper we classify some well-known bivariate distributions into two classes based on their mutual information. The distributions within each class have the same mutual information. These distributions have been used extensively as survival distributions of two-component systems in reliability theory.
Mutual information, entropy, survival distribution, bivariate di
http://jsri.srtc.ac.ir/article-1-158-en.html
http://jsri.srtc.ac.ir/article-1-158-en.pdf