Probability Proportional to Size Sampling Without Replacement Using the Des Raj Estimator
SAHAYA JEFRIN M (2148112)
Christ University, Bangalore.
INTRODUCTION
Probability proportional to size (PPS) sampling is a method of sampling from a finite population in which a size measure is available for each population unit before sampling and the probability of selecting a unit is proportional to its size. Sampling is a research method in which subgroups are selected from a larger group known as the target population, and the subgroups, or samples, are studied. If the sample is chosen correctly, the results can be used to represent the target population. PPS sampling takes the varying sizes of the units into account. This helps to avoid under-representing one subgroup in a study and yields more accurate results.
Probability Proportional to Size Without Replacement
In a varying probability scheme without replacement, when the initial probabilities of selection are unequal, the probability of drawing a specified unit of the population changes from draw to draw. Generally, sampling without replacement provides a more efficient estimator than sampling with replacement. However, the estimators of the population mean and of the variance are more complicated, so this scheme is not commonly used in practice, especially in large-scale sample surveys with small sampling fractions.
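To make the changing draw probabilities concrete, here is a minimal R sketch. The four units and their size measures are hypothetical and are not part of the lime data used later; the sketch assumes that, once a unit has been drawn, the remaining units are selected with probability proportional to their size among those still available.

```r
# Hypothetical size measures for a population of four units
size <- c(A = 10, B = 20, C = 30, D = 40)
p <- size / sum(size)   # first-draw probabilities: 0.1, 0.2, 0.3, 0.4

# Suppose unit D is selected at the first draw. At the second draw each
# remaining unit j is selected with conditional probability p_j / (1 - p_D).
first <- "D"
p_second <- p[names(p) != first] / (1 - p[[first]])
p_second
```

The second-draw probabilities again sum to one, but they differ from the first-draw values, which is why the expectation of a plain sample value changes with the draw.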
Types of probability proportional to size without replacement estimators:
a) Ordered estimators: take the order in which the sampling units are drawn into account. They need only the conditional selection probabilities, not the inclusion probabilities. These are our interest here.
b) Unordered estimators: do not depend on the order in which the sampling units are drawn. They use the inclusion probabilities.
Ordered estimates
To overcome the difficulty of the expectation changing with each draw, a new variate is associated with each draw such that its expectation equals the population value of the variate under study. Such estimators take the order of the draws into account and are called ordered estimators. Because the values obtained at earlier draws enter the variates of later draws, the order of selection must be respected for the estimator of the population mean to remain unbiased.
Des Raj's ordered estimator: Des Raj (1956) gave an ordered estimator, that is, an estimator that takes into account the order in which the units are drawn. The Des Raj estimator computes the estimated value of a finite population total when the values of the study variable for the sampled units and the corresponding selection probabilities are given in the order of selection.
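Formula
$$\hat{Y}_D = \frac{1}{n}\sum_{i=1}^{n} t_i, \qquad \hat{V}(\hat{Y}_D) = \frac{1}{n(n-1)}\sum_{i=1}^{n}\left(t_i - \hat{Y}_D\right)^2$$
where, for units listed in the order of selection, $y_i$ is the value of the study variable at the $i$-th draw, $p_i$ is the initial selection probability of that unit, and
$$t_1 = \frac{y_1}{p_1}, \qquad t_i = y_1 + \cdots + y_{i-1} + \frac{y_i}{p_i}\left(1 - p_1 - \cdots - p_{i-1}\right), \quad i = 2, \ldots, n.$$
For a sample size of 2, he defined his estimate to be
$$\hat{Y}_D = \frac{1}{2}\left[(1 + p_1)\frac{y_1}{p_1} + (1 - p_1)\frac{y_2}{p_2}\right].$$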
Applications
The ordered estimator can be applied to any without-replacement sampling procedure in which the units are drawn one at a time with known selection probabilities; SRSWOR is the special case in which all selection probabilities are equal. Simple random sampling gives every unit in the population an equal probability of selection. Under certain circumstances, more efficient estimators are obtained by assigning unequal probabilities of selection to the units in the population.
Limitations
At first sight, PPS sampling appears to give biased estimators, because the larger units are over-represented and the smaller units are under-represented in the sample. This does happen when the sample mean, which gives equal weight to all units, is used as an estimator of the population mean. If, instead of giving equal weights to all the units, the sample observations are suitably weighted at the estimation stage by taking the probabilities of selection into account, then it is possible to obtain unbiased estimators.
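The effect of weighting can be checked exactly on a small example. The sketch below uses a hypothetical population of four units (the values are invented for illustration) and enumerates every ordered sample of size 2 under successive PPS draws without replacement. It assumes the usual Des Raj variates for two draws, t1 = y1/p1 and t2 = y1 + (y2/p2)(1 - p1), and compares the expectation of their average with the expectation of the unweighted expansion estimator N times the sample mean.

```r
# Hypothetical population: size measure x and study variable y
x <- c(10, 20, 30, 40)
y <- c(12, 18, 33, 37)
p <- x / sum(x)          # initial selection probabilities
N <- length(y)

expected_raj  <- 0       # E[Des Raj estimator of the total], n = 2
expected_mean <- 0       # E[N * unweighted sample mean], n = 2
for (i in 1:N) {
  for (j in setdiff(1:N, i)) {
    prob <- p[i] * p[j] / (1 - p[i])          # P(unit i first, unit j second)
    t1 <- y[i] / p[i]
    t2 <- y[i] + (y[j] / p[j]) * (1 - p[i])
    expected_raj  <- expected_raj  + prob * (t1 + t2) / 2
    expected_mean <- expected_mean + prob * N * (y[i] + y[j]) / 2
  }
}
c(true_total = sum(y), des_raj = expected_raj, unweighted = expected_mean)
```

In this toy population the weighted estimator has expectation equal to the true total, while the unweighted one overshoots because the larger units are drawn more often.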
R Coding
Function and package: desraj {fpest}
Usage: desraj(y, p)
Arguments:
y vector of values of sampled units as per the order of selection
p vector of selection probabilities as per the order of selection
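For readers who want to see what such a function computes, the following base-R sketch implements the ordered estimator from the standard Des Raj definitions. It is not the source code of `fpest`; the name `desraj_sketch` is hypothetical, and the returned element names are chosen only to mirror the output shown later in this post. The sample values are the five villages selected in the worked example.

```r
# Sketch of the Des Raj ordered estimator (not the fpest implementation).
# y: study-variable values in the order of selection
# p: initial selection probabilities in the same order
desraj_sketch <- function(y, p) {
  stopifnot(length(y) == length(p), length(y) >= 2)
  n <- length(y)
  prev_y <- c(0, cumsum(y)[-n])   # y_1 + ... + y_(i-1), zero for the first draw
  prev_p <- c(0, cumsum(p)[-n])   # p_1 + ... + p_(i-1), zero for the first draw
  # t_1 = y_1 / p_1;  t_i = prev_y + (y_i / p_i) * (1 - prev_p)
  tvals <- prev_y + (y / p) * (1 - prev_p)
  est <- mean(tvals)                               # estimated population total
  estvar <- sum((tvals - est)^2) / (n * (n - 1))   # estimated variance
  list(est = est, estvar = estvar, tvals = tvals)
}

# The five sampled villages from the worked example, in order of selection
trees <- c(102, 3091, 105, 1736, 60)
area  <- c(0.82, 42.85, 0.62, 40.03, 2.15)
res <- desraj_sketch(trees, area / 497.66)   # 497.66 = total area of all 22 villages
res$est
```

Keeping the per-draw variates in the result makes it easy to compare them with the `$tvals` reported by `desraj()` in the worked example.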
DesRaj Estimator using R programming
The results of a sample survey on the number of bearing lime trees and the area reported under limes, in each of the 22 villages growing lime in one of the tehsils of Bangalore district, are given below:
S.No. of villages | Area Under lime(in acres) | No. of bearing lime trees |
1 | 32.77 | 2328 |
2 | 7.97 | 754 |
3 | 0.62 | 105 |
4 | 15.61 | 949 |
5 | 42.85 | 3091 |
6 | 40.03 | 1736 |
7 | 9.39 | 840 |
8 | 6.33 | 311 |
9 | 5.05 | 0 |
10 | 94.55 | 3044 |
11 | 53.71 | 2483 |
12 | 0.67 | 128 |
13 | 0.82 | 102 |
14 | 2.15 | 60 |
15 | 0.43 | 0 |
16 | 123.36 | 11799 |
17 | 0.29 | 26 |
18 | 3.00 | 317 |
19 | 4.00 | 190 |
20 | 2.00 | 180 |
21 | 6.21 | 752 |
22 | 45.85 | 3091 |
From this population, select a sample of size 5 with the probability proportional to size without replacement (PPSWOR) scheme, estimate the total number of bearing lime trees using the ordered Des Raj estimator, and give the bound on the error of estimation.
Objective
From the given data, select a sample of size 5 with the PPSWOR scheme, estimate the total number of bearing lime trees using the ordered Des Raj estimator, and find the bound on the error of estimation.
Import the Data set
library(readxl)   # read_excel()
library(fpest)    # desraj()
lab_data <- read_excel("Documents/lab 8 data.xlsx")
View(lab_data)
attach(lab_data)  # makes the columns available by name in the code below
summary(lab_data)
## S.No. of villages Area Under lime(in acres) No. of bearing lime trees
## Min. : 1.00 Min. : 0.290 Min. : 0.0
## 1st Qu.: 6.25 1st Qu.: 2.038 1st Qu.: 110.8
## Median :11.50 Median : 6.270 Median : 534.5
## Mean :11.50 Mean : 22.621 Mean : 1467.5
## 3rd Qu.:16.75 3rd Qu.: 38.215 3rd Qu.: 2180.0
## Max. :22.00 Max. :123.360 Max. :11799.0
Data description
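The data are the results of a sample survey on the number of bearing lime trees and the area reported under limes in each of the 22 villages growing lime in one of the tehsils of Bangalore district. The area under lime serves as the size measure, and the selection probability of each village is its area divided by the total area.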
X = sum(`Area Under lime(in acres)`)   # total area under lime, the sum of the size measure
X
## [1] 497.66
gel = `Area Under lime(in acres)`/X   # initial selection probability of each village
gel
## [1] 0.0658481694 0.0160149500 0.0012458305 0.0313667966 0.0861029619
## [6] 0.0804364426 0.0188683037 0.0127195274 0.0101474903 0.1899891492
## [11] 0.1079250894 0.0013463007 0.0016477113 0.0043202186 0.0008640437
## [16] 0.2478800788 0.0005827272 0.0060282120 0.0080376160 0.0040188080
## [21] 0.0124783989 0.0921311739
data1= cbind(lab_data,gel)
data1
## S.No. of villages Area Under lime(in acres) No. of bearing lime trees
## 1 1 32.77 2328
## 2 2 7.97 754
## 3 3 0.62 105
## 4 4 15.61 949
## 5 5 42.85 3091
## 6 6 40.03 1736
## 7 7 9.39 840
## 8 8 6.33 311
## 9 9 5.05 0
## 10 10 94.55 3044
## 11 11 53.71 2483
## 12 12 0.67 128
## 13 13 0.82 102
## 14 14 2.15 60
## 15 15 0.43 0
## 16 16 123.36 11799
## 17 17 0.29 26
## 18 18 3.00 317
## 19 19 4.00 190
## 20 20 2.00 180
## 21 21 6.21 752
## 22 22 45.85 3091
## gel
## 1 0.0658481694
## 2 0.0160149500
## 3 0.0012458305
## 4 0.0313667966
## 5 0.0861029619
## 6 0.0804364426
## 7 0.0188683037
## 8 0.0127195274
## 9 0.0101474903
## 10 0.1899891492
## 11 0.1079250894
## 12 0.0013463007
## 13 0.0016477113
## 14 0.0043202186
## 15 0.0008640437
## 16 0.2478800788
## 17 0.0005827272
## 18 0.0060282120
## 19 0.0080376160
## 20 0.0040188080
## 21 0.0124783989
## 22 0.0921311739
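To select a sample of size 5 with the PPSWOR scheme
set.seed(456)
# Note: no prob argument is passed here, so this call draws the 5 rows with equal probabilities.
ppswor = data1[sample(1:nrow(data1), 5, replace = FALSE), ]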
ppswor
## S.No. of villages Area Under lime(in acres) No. of bearing lime trees
## 13 13 0.82 102
## 5 5 42.85 3091
## 3 3 0.62 105
## 6 6 40.03 1736
## 14 14 2.15 60
## gel
## 13 0.001647711
## 5 0.086102962
## 3 0.001245830
## 6 0.080436443
## 14 0.004320219
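As written, the `sample()` call above does not use the `gel` column, so the five villages are drawn with equal probabilities. A minimal sketch of a size-proportional draw is given below, assuming base R's documented behaviour that, with `replace = FALSE`, the `prob` weights are applied sequentially among the units not yet drawn. The area values are copied from the table in this post; the names `area`, `p` and `idx` are introduced here for illustration. The selected villages will generally differ from those shown above, so no output is reproduced.

```r
# Area under lime (acres) for the 22 villages, copied from the table above
area <- c(32.77, 7.97, 0.62, 15.61, 42.85, 40.03, 9.39, 6.33, 5.05, 94.55,
          53.71, 0.67, 0.82, 2.15, 0.43, 123.36, 0.29, 3.00, 4.00, 2.00,
          6.21, 45.85)
p <- area / sum(area)   # initial selection probabilities (the gel column)

set.seed(456)           # fixed only so that the draw can be repeated
# Successive draws without replacement, each proportional to size among
# the villages that have not been selected yet
idx <- sample(seq_along(area), size = 5, replace = FALSE, prob = p)

# Sampled villages in order of selection, with their initial probabilities
data.frame(village = idx, area = area[idx], p = p[idx])
```

The rows come back in the order of selection, which is the order the Des Raj estimator expects for its `y` and `p` arguments.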
To estimate the total number of bearing lime trees using the ordered Des Raj estimator
desraj(ppswor$`No. of bearing lime trees`, ppswor$gel)
## $est
## [1] 43490.55
##
## $estvar
## [1] 144050112
##
## $tvals
## [1] 61904.05 35941.73 80078.40 22959.51 16569.07
Bound on the error of estimation
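se = sqrt(144050112)   # standard error: square root of $estvar
se
## [1] 12002.09
be = se*2              # bound on the error of estimation: two standard errors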
be
## [1] 24004.18
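Because the table lists the tree counts for all 22 villages, the estimate and its bound can be set against the actual population total. The sketch below is only an arithmetic check using the figures reported above; the variable names are introduced here for illustration.

```r
est    <- 43490.55     # $est reported by desraj() above
estvar <- 144050112    # $estvar reported by desraj() above
bound  <- 2 * sqrt(estvar)
interval <- c(lower = est - bound, upper = est + bound)

# Number of bearing lime trees in all 22 villages, copied from the table
trees <- c(2328, 754, 105, 949, 3091, 1736, 840, 311, 0, 3044, 2483,
           128, 102, 60, 0, 11799, 26, 317, 190, 180, 752, 3091)
true_total <- sum(trees)

# Does the interval est +/- bound contain the actual total?
covered <- interval[["lower"]] <= true_total && true_total <= interval[["upper"]]
list(interval = interval, true_total = true_total, covered = covered)
```

For this sample the actual total lies inside the interval, although the bound is wide relative to the estimate, as expected with only five villages.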
Conclusion
A sample of size 5 was selected from the given data for the PPSWOR exercise, and the Des Raj estimator was used to estimate the total number of bearing lime trees and the bound on the error of estimation.
The estimated total number of bearing lime trees is about 43491.
The bound on the error of estimation is about 24004.
From the summary of the data, the mean area under lime is 22.621 acres and the median is 6.270 acres; the mean number of bearing lime trees is 1467.5 and the median is 534.5.