1
1735-1294
Statistical Research and Training Center - Statistical Centre of Iran
163
General
Outlier Detection by Boosting Regression Trees
Chèze
Nathalie
Poggi
Jean-Michel
1
9
2006
3
1
1
22
13
02
2016
13
02
2016
A procedure for detecting outliers in regression problems is proposed. It is based on information provided by boosting regression trees. The key idea is to select the observation most frequently resampled along the boosting iterations, remove it, and reiterate. The selection criterion is based on Tchebychev’s inequality applied to the maximum, over the boosting iterations, of the average number of appearances in bootstrap samples. The procedure is therefore free of assumptions on the noise distribution. It identifies outliers as observations that are particularly hard to predict. Many well-known benchmark data sets are considered, and a comparative study against two well-known competitors shows the value of the method.
160
General
Some Statistical Methods for Prediction of Athletic Records
Dargahi-Noubary
R.
1
9
2006
3
1
23
46
13
02
2016
13
02
2016
Prediction of sports records has received a great deal of attention from researchers in different disciplines. This article reviews some of the methods developed by statisticians and offers a few improvements. Specific methods discussed include trend analysis, tail modeling, and methods based on certain results of the theory of records for independent and identically distributed attempts. To make the latter theory applicable, and to account for factors affecting the records, adjustments are made to the data in the form of an increase in participation or attempts. Models utilized for this purpose include geometric increase, logistic increase, and increase as a non-homogeneous Poisson process. A method for predicting the ultimate record is also included, together with illustrative examples using data for the men’s long jump and 400-meter run.
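As a minimal illustration of the theory of records for independent and identically distributed attempts mentioned in this abstract: for continuous i.i.d. attempts, the k-th attempt is a record with probability 1/k, so the expected number of records in n attempts is the harmonic number H_n. The function name `expected_records` is a hypothetical helper, not taken from the article, and the sketch ignores the participation adjustments (geometric, logistic, Poisson) that the article discusses.

```python
from fractions import Fraction


def expected_records(n: int) -> Fraction:
    """Expected number of records among n i.i.d. continuous attempts.

    Standard result from the theory of records: the k-th attempt is a
    record with probability 1/k, so the expectation is H_n = sum(1/k).
    Hypothetical helper for illustration only.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, float(expected_records(n)))
```

The slow, logarithmic growth of H_n is one reason the unadjusted i.i.d. model tends to understate how often athletic records are broken, which motivates adjusting for increasing numbers of attempts.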
159
General
A New Skew-normal Density
Sharafi
Maryam
Behboodian
Javad
1
9
2006
3
1
47
61
13
02
2016
13
02
2016
We present a new skew-normal distribution, denoted by NSN($\lambda$). We first derive the density and moment generating function of NSN($\lambda$). The properties of SN($\lambda$), the known skew-normal distribution of Azzalini, and NSN($\lambda$) are then compared. Finally, a numerical example of testing the parameter $\lambda$ in NSN($\lambda$) is given.
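For reference, the known skew-normal distribution of Azzalini, SN($\lambda$), has density $2\,\phi(x)\,\Phi(\lambda x)$, where $\phi$ and $\Phi$ are the standard normal density and distribution function. The sketch below evaluates only that known density; the NSN($\lambda$) density is not stated in this abstract and is not reproduced here. The function names are hypothetical helpers.

```python
import math


def norm_pdf(x: float) -> float:
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x: float) -> float:
    """Standard normal distribution function Phi(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sn_pdf(x: float, lam: float) -> float:
    """Azzalini skew-normal density 2 * phi(x) * Phi(lam * x)."""
    return 2.0 * norm_pdf(x) * norm_cdf(lam * x)


def integrate(f, a: float, b: float, steps: int = 20000) -> float:
    """Plain trapezoidal rule, used only to sanity-check normalization."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h
```

Setting `lam = 0` recovers the standard normal density, and a positive `lam` shifts probability mass to the right.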
162
General
A Randomness Test for Stable Data
Mohammadpour
Adel
Mohammad-Djafari
Ali
P. Nolan
John
1
9
2006
3
1
63
74
13
02
2016
13
02
2016
In this paper, we propose a new method for checking the randomness of non-Gaussian stable data, based on a characterization result. This method is more sensitive to non-random data than the well-known non-parametric randomness tests.
161
General
A Comparative Review of Selection Models in Longitudinal Continuous Response Data with Dropout
Vahidi-Asl
Elaheh
Ganjali
Mojtaba
1
9
2006
3
1
75
90
13
02
2016
13
02
2016
Missing values occur in studies of various disciplines such as social sciences, medicine, and economics. The missing mechanism in these studies should be investigated more carefully. In this article, some models proposed in the literature on longitudinal data with dropout are reviewed and compared. In an applied example it is shown that the selection model of Hausman and Wise (1979, Econometrica 47, pp. 455-473) and the shared parameter model of Follmann and Wu (1995, Biometrics 51, pp. 151-168), two of the most used models for longitudinal data with dropout in economics and medical research, respectively, cannot sufficiently account for the relation between response variables and the missing mechanism. In this paper, Follmann and Wu’s (1995) dropout model is also generalized by adding a previous time outcome component to the model. With this modification, in the case of longitudinal data with two time periods, a general form of the model is obtained that is able to consider all relations between response and missing mechanism. This is proven in an implicit way. A test for missing at random in the generalized Heckman model (Crouchley and Ganjali, 2002, Stat. Model. 2, pp. 39-62) is also introduced, in which the $\delta$-method is used to find the variance of the test statistic.
158
General
On Classification of Bivariate Distributions Based on Mutual Information
Habibullah
Mohamed
Ahsanullah
Mohammad
1
9
2006
3
1
91
101
13
02
2016
13
02
2016
Among all measures of independence between random variables, mutual information is the only one that is based on information theory. Mutual information takes into account all kinds of dependence between variables, both linear and non-linear. In this paper we classify some well-known bivariate distributions into two classes based on their mutual information. The distributions within each class have the same mutual information. These distributions have been used extensively as survival distributions of two-component systems in reliability theory.
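As a small worked example of mutual information for a bivariate distribution: for a bivariate normal pair with correlation $\rho$, the mutual information is $-\tfrac{1}{2}\ln(1-\rho^2)$ nats, a standard closed form. The bivariate normal is used here only as a familiar illustration and is an assumption of this sketch; the abstract does not say which distributions the article classifies. The function name is a hypothetical helper.

```python
import math


def bvn_mutual_information(rho: float) -> float:
    """Mutual information (in nats) of a bivariate normal with correlation rho.

    Uses the standard closed form I = -0.5 * ln(1 - rho**2).
    Hypothetical helper for illustration only.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return -0.5 * math.log(1.0 - rho * rho)


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        print(rho, bvn_mutual_information(rho))
```

The value depends on $\rho$ only through $\rho^2$, so it is zero exactly when the components are independent and grows without bound as $|\rho|$ approaches 1.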