DES RAJ ESTIMATOR - PROBABILITY PROPORTIONAL TO SIZE ORDERED SAMPLING

GADHA T 

 2148131

DEPT OF STATISTICS

CHRIST DEEMED TO BE UNIVERSITY, BANGALORE


INTRODUCTION

Sampling is a process used in statistical analysis in which a predetermined number of observations are taken from a larger population. The sampling method used depends on the analysis being performed. One such method is probability proportional to size sampling without replacement (PPSWOR). The rule that sampling without replacement is more efficient than sampling with replacement also applies to PPS sampling. Because the probability of selecting a unit changes from draw to draw, and so depends on the order in which the units are selected, PPSWOR estimators are divided into ordered and unordered estimators.

Ordered estimators overcome the difficulty of expectations changing with each draw by associating a new variate with each draw, such that its expectation equals the population value of the variate under study. Such estimators take into account the order in which the units are drawn. Des Raj (1956) proposed an ordered estimator of this kind, which makes use of conditional probabilities.

Formula : 

An estimator in case of 2 draws.

Let y1 and y2 be the values of units drawn at the first and second draw. 

Let pi be the initial probability of selection of unit i.
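The formula itself did not survive at this point in the post. As a hedged sketch, the standard textbook form of the two-draw Des Raj estimator of the population total uses z1 = y1/p1 and z2 = y1 + y2(1 - p1)/p2, and takes (z1 + z2)/2 as the estimate. The helper name desraj_two_draws is hypothetical, and the inputs are the first two villages drawn in the application further down.

```r
# Sketch of the two-draw Des Raj estimator of a population total.
# Assumption: standard textbook form; desraj_two_draws is a hypothetical helper.
desraj_two_draws <- function(y, p) {
  stopifnot(length(y) == 2, length(p) == 2, all(p > 0))
  z1 <- y[1] / p[1]                        # first draw: usual PPS expansion
  z2 <- y[1] + y[2] * (1 - p[1]) / p[2]    # second draw: conditional on the first
  list(z = c(z1, z2), est = (z1 + z2) / 2) # estimate is the mean of z1 and z2
}

# First two villages drawn in the lime-tree example (villages 10 and 6)
y <- c(3044, 1736)             # number of bearing lime trees
p <- c(94.55, 40.03) / 497.66  # area under lime / total area
res <- desraj_two_draws(y, p)
res$z
res$est
```

The order of y and p matters: swapping the two units generally gives a different estimate, which is what makes the estimator "ordered".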





Variance for the sample size 2.
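The variance expression did not survive at this point in the post, and it may have been the population variance rather than an estimate of it. As a hedged sketch, a commonly used unbiased estimator of the variance of the two-draw estimate is (z1 - z2)^2 / 4, with z1 = y1/p1 and z2 = y1 + y2(1 - p1)/p2 in the standard textbook form. The helper name desraj_two_draws_var is hypothetical.

```r
# Sketch of the usual variance estimator for the two-draw Des Raj estimate.
# Assumption: standard textbook form; desraj_two_draws_var is a hypothetical helper.
desraj_two_draws_var <- function(y, p) {
  stopifnot(length(y) == 2, length(p) == 2, all(p > 0))
  z1 <- y[1] / p[1]
  z2 <- y[1] + y[2] * (1 - p[1]) / p[2]
  (z1 - z2)^2 / 4   # equals var(c(z1, z2)) / 2
}

# First two villages drawn in the lime-tree example (villages 10 and 6)
y <- c(3044, 1736)
p <- c(94.55, 40.03) / 497.66
v <- desraj_two_draws_var(y, p)
v
sqrt(v)   # estimated standard error of the two-draw estimate
```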



APPLICATIONS


The simple random sampling scheme provides a random sample in which every unit in the population has an equal probability of selection. Under certain circumstances, more efficient estimators are obtained by assigning unequal probabilities of selection to the units in the population. Because PPSWOR reduces to SRSWOR when all units have the same size, the ordered estimator can be applied to a wide range of sampling problems.
The application is demonstrated on a data set giving the area under lime (in acres) and the number of bearing lime trees in each of the 22 villages growing limes in Bangalore district. The procedure is to draw a sample of size 5 under the PPSWOR scheme, estimate the total number of bearing lime trees using the ordered Des Raj estimator, and give the bound on the error of estimation.

#Analysis: 

#To obtain the sum of the area under lime (in acres).

s=sum(`Area Under lime(in acres)`)
s

## [1] 497.66

 

#To find the selection probability of each village, proportional to its area under lime (in acres).

pi=(`Area Under lime(in acres)`)/s
pi

##  [1] 0.0658481694 0.0160149500 0.0012458305 0.0313667966 0.0861029619
##  [6] 0.0804364426 0.0188683037 0.0127195274 0.0101474903 0.1899891492
## [11] 0.1079250894 0.0013463007 0.0016477113 0.0043202186 0.0008640437
## [16] 0.2478800788 0.0005827272 0.0060282120 0.0080376160 0.0040188080
## [21] 0.0124783989 0.0921311739
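The probabilities above can be checked independently of the attached data frame. The sketch below types in the 22 areas from the table shown in this post and confirms that the normalized sizes form a valid probability distribution. It uses the name p rather than pi, an assumption made here only to avoid masking R's built-in constant pi.

```r
# Areas under lime (in acres) for the 22 villages, copied from the table in the post
area <- c(32.77, 7.97, 0.62, 15.61, 42.85, 40.03, 9.39, 6.33, 5.05, 94.55, 53.71,
          0.67, 0.82, 2.15, 0.43, 123.36, 0.29, 3.00, 4.00, 2.00, 6.21, 45.85)

p <- area / sum(area)   # selection probability proportional to size

# Sanity checks: strictly positive and summing to 1
stopifnot(all(p > 0), isTRUE(all.equal(sum(p), 1)))

sum(area)       # total area under lime
which.max(p)    # village with the largest selection probability
```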

 

#To combine the data frame with the respective probabilities.

d=cbind(Data,pi)
d

##    S.No. of villages Area Under lime(in acres) No. of bearing lime trees
## 1                  1                     32.77                      2328
## 2                  2                      7.97                       754
## 3                  3                      0.62                       105
## 4                  4                     15.61                       949
## 5                  5                     42.85                      3091
## 6                  6                     40.03                      1736
## 7                  7                      9.39                       840
## 8                  8                      6.33                       311
## 9                  9                      5.05                         0
## 10                10                     94.55                      3044
## 11                11                     53.71                      2483
## 12                12                      0.67                       128
## 13                13                      0.82                       102
## 14                14                      2.15                        60
## 15                15                      0.43                         0
## 16                16                    123.36                     11799
## 17                17                      0.29                        26
## 18                18                      3.00                       317
## 19                19                      4.00                       190
## 20                20                      2.00                       180
## 21                21                      6.21                       752
## 22                22                     45.85                      3091
##              pi
## 1  0.0658481694
## 2  0.0160149500
## 3  0.0012458305
## 4  0.0313667966
## 5  0.0861029619
## 6  0.0804364426
## 7  0.0188683037
## 8  0.0127195274
## 9  0.0101474903
## 10 0.1899891492
## 11 0.1079250894
## 12 0.0013463007
## 13 0.0016477113
## 14 0.0043202186
## 15 0.0008640437
## 16 0.2478800788
## 17 0.0005827272
## 18 0.0060282120
## 19 0.0080376160
## 20 0.0040188080
## 21 0.0124783989
## 22 0.0921311739

 

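#To obtain a sample using the method of PPSWOR & estimate the total no. of bearing lime trees using the ordered Des Raj estimator

library(fpest)

set.seed(100)
#Note: sample() is called here without a prob argument, so the 5 villages are drawn with equal probabilities.
#Passing prob = d$pi would make the draw proportional to size, but would select a different sample from the one shown below.
ppswor=d[sample(1:nrow(d),5,replace = F),]
ppswor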

##    S.No. of villages Area Under lime(in acres) No. of bearing lime trees
## 10                10                     94.55                      3044
## 6                  6                     40.03                      1736
## 16                16                    123.36                     11799
## 19                19                      4.00                       190
## 14                14                      2.15                        60
##             pi
## 10 0.189989149
## 6  0.080436443
## 16 0.247880079
## 19 0.008037616
## 14 0.004320219
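In base R, sample() with replace = FALSE and a prob vector selects units one at a time, each with probability proportional to the weights of the units still remaining, which is the successive-draw PPSWOR scheme. The sketch below is a minimal illustration using the areas from the table in this post. The seed is arbitrary, and the villages it selects will generally differ from the sample printed above.

```r
# Areas under lime (in acres) for the 22 villages, copied from the table in the post
area <- c(32.77, 7.97, 0.62, 15.61, 42.85, 40.03, 9.39, 6.33, 5.05, 94.55, 53.71,
          0.67, 0.82, 2.15, 0.43, 123.36, 0.29, 3.00, 4.00, 2.00, 6.21, 45.85)
p <- area / sum(area)

set.seed(100)  # arbitrary seed, used only for reproducibility of this sketch
# Draw 5 villages without replacement, with probability proportional to area
idx <- sample(seq_along(p), size = 5, replace = FALSE, prob = p)

idx      # village numbers in the order they were drawn
p[idx]   # their initial selection probabilities, in draw order
```

The draw order is kept as returned because the ordered estimator depends on it.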

desraj(ppswor$`No. of bearing lime trees`,ppswor$pi)

## $est
## [1] 25473.65
##
## $estvar
## [1] 16074751
##
## $tvals
## [1] 16021.97 20525.86 39507.47 27965.70 23347.23
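The values above can be cross-checked in base R. The sketch below implements the standard textbook form of the ordered Des Raj estimator for n draws, where the variate for draw i is the sum of the y-values already drawn plus y_i (1 - sum of the p-values already drawn) / p_i. The helper desraj_total is hypothetical and is not the fpest implementation; with the five sampled villages entered in draw order it should agree with the printed output up to rounding.

```r
# Base-R sketch of the ordered Des Raj estimator of a population total.
# Assumption: standard textbook formula; desraj_total is a hypothetical helper.
desraj_total <- function(y, p) {
  n <- length(y)
  stopifnot(n >= 2, length(p) == n, all(p > 0))
  prev_y <- c(0, cumsum(y)[-n])   # sum of y over units drawn before draw i
  prev_p <- c(0, cumsum(p)[-n])   # sum of p over units drawn before draw i
  t_vals <- prev_y + y * (1 - prev_p) / p
  est <- mean(t_vals)
  est_var <- sum((t_vals - est)^2) / (n * (n - 1))
  list(est = est, estvar = est_var, tvals = t_vals)
}

# Sampled villages 10, 6, 16, 19, 14, in draw order
y <- c(3044, 1736, 11799, 190, 60)                        # bearing lime trees
p <- c(94.55, 40.03, 123.36, 4.00, 2.15) / 497.66         # area / total area
res <- desraj_total(y, p)
round(res$tvals, 2)
res$est
res$estvar
```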

 

Conclusion: The estimate of the total number of bearing lime trees is 25474.

 

#To find the bound on the error of estimation.

t=qt(0.975,4)
t

## [1] 2.776445

var=16074751
se=sqrt(var)
se

## [1] 4009.333

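#The bound reported below equals 2*se; multiplying se by the t-value above would give a larger bound.
Bound_Error=2*se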
Bound_Error

## [1] 8018.666
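The bound printed above, 8018.666, equals twice the standard error. The sketch below recomputes it from the reported variance and, for comparison, shows the wider bound obtained with the t quantile on 4 degrees of freedom. Treating n - 1 = 4 as the degrees of freedom is an assumption made here for illustration.

```r
est     <- 25473.65   # Des Raj estimate of the total, from the output above
est_var <- 16074751   # estimated variance, from the output above
se      <- sqrt(est_var)

bound_2se <- 2 * se                   # usual two-standard-error bound
bound_t   <- qt(0.975, df = 4) * se   # t-based alternative (df = 4 is an assumption)

c(bound_2se = bound_2se, bound_t = bound_t)

# Interval implied by the two-standard-error bound
c(lower = est - bound_2se, upper = est + bound_2se)
```

With only five draws the interval is wide relative to the estimate, so the estimate of the total should be read with that uncertainty in mind.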

 

INTERPRETATIONS

The data show the number of bearing lime trees and the area reported under limes in 22 villages of Bangalore district. The aim was to estimate the total number of bearing lime trees from a sample using the Des Raj estimator, and to give the bound on the error of estimation. We first imported and attached the data and found the sum of the area under lime (in acres) to be 497.66. We then computed each village's selection probability as its area divided by this sum, and combined these probabilities with the data frame. After installing and loading the package fpest, we drew a sample of size 5 without replacement and, using the ordered Des Raj estimator, obtained an estimate of 25474 for the total number of bearing lime trees. Finally, the bound on the error of estimation was found to be 8018.666.


DEMERITS

In PPS sampling, the larger units are over-represented and the smaller units are under-represented in the sample, so a procedure that gives all sampled units equal weight yields biased estimators. This is the case for the sample mean as an estimator of the population mean. If, instead, the sample observations are suitably weighted at the estimation stage by taking the probabilities of selection into account, unbiased estimators can be obtained.
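The point about weighting can be checked exactly on a small example. The sketch below uses a hypothetical three-unit population (not the lime-tree data) and a single PPS draw, and computes the expected value of the unweighted and the weighted estimator of the population mean by summing over all possible outcomes.

```r
# Hypothetical toy population, for illustration only
y <- c(10, 20, 70)   # study variable
x <- c(1, 2, 7)      # size measure
p <- x / sum(x)      # selection probabilities for one PPS draw
N <- length(y)

true_mean <- mean(y)

# Expected value over one draw: sum of p_i times the estimate when unit i is drawn
e_unweighted <- sum(p * y)             # estimate is y_i itself
e_weighted   <- sum(p * y / (N * p))   # estimate is y_i / (N * p_i)

c(true_mean = true_mean, unweighted = e_unweighted, weighted = e_weighted)
```

The unweighted expectation is pulled towards the large unit, while the weighted one matches the population mean.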

