PROBABILITY PROPORTIONAL TO SIZE WITH REPLACEMENT
NAME: SIVAPRIYA
ROLL NO: 2148147
PPS SAMPLING:
DEFINITION:
Probability proportional to size (PPS) sampling is a method of sampling from a finite population in which a size measure is available for each population unit before sampling, and the probability of selecting a unit is proportional to its size.
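As a small illustration of the definition, the selection probability of unit i can be written as p_i = x_i / X, where x_i is its size measure and X is the sum of all size measures. The sketch below uses made-up sizes (they are not taken from the data analysed later) and the variable names are only illustrative.

```r
# Hypothetical size measures for five population units (illustrative only)
size <- c(2500, 10000, 40000, 97500, 250000)

# Selection probability of each unit is its share of the total size
p <- size / sum(size)

p        # one probability per unit; larger units get larger probabilities
sum(p)   # the probabilities add up to 1
```

Because the probabilities depend only on the size measures, they can be computed before any unit is drawn.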
EXAMPLE:
Suppose you are surveying healthcare in different regions. You would allocate sample units to each area in proportion to its population. A village that has only 2,500 residents requires fewer healthcare facilities than a city of 250,000, so it receives a smaller share of the sample. Similarly, if you survey employees in a company by department, the number of positions in each department is part of the sampling calculation. Every person is still part of the sampling; you are just breaking the population down into separate groups as part of the analysis.
TYPES OF PPS SAMPLING:
There are two types of PPS sampling:
1) Probability proportional to size with replacement
2) Probability proportional to size without replacement
We are going to discuss the PPS sampling with replacement procedure.
PPS SAMPLING WITH REPLACEMENT (WR):
DEFINITION:
In PPS sampling with replacement, each selected unit is returned to the population before the next draw. The probability of selecting a specified unit therefore does not change and is the same at every draw; there is no redistribution of the probabilities after a draw.
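One way to see this in R is with base `sample()`, which keeps the same selection probabilities at every draw when `replace = TRUE`. This is only a sketch with made-up sizes; it is not the `ppswr()` routine used in the worked example in this post.

```r
# Hypothetical size measures for six units (illustrative only)
size <- c(10, 20, 30, 40, 50, 150)
p <- size / sum(size)   # draw probabilities, identical for every draw

set.seed(1)             # for reproducibility
# replace = TRUE: a unit can be drawn more than once and p is never rescaled
draws <- sample(seq_along(size), size = 8, replace = TRUE, prob = p)

draws                   # indices of the selected units
table(draws)            # how many times each unit was selected
```

With `replace = FALSE` the remaining probabilities would have to be renormalised after each draw, which is exactly what the with-replacement scheme avoids.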
There are two methods to draw a PPSWR sample: the cumulative total method and Lahiri's method. The first is described here.
1. CUMULATIVE TOTAL METHOD:
The steps for selecting a PPSWR sample of size n with this method are:
- associate the natural numbers 1 to N with the units in the population and form the cumulative totals T_1, T_2, ..., T_N of their size measures, and
- draw a random number R between 1 and T_N from a random number table and select the unit i for which T_(i-1) < R <= T_i (with T_0 = 0); repeat this n times, so that the same unit may be selected more than once.
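The cumulative total method can be sketched in base R as follows: form the cumulative totals of the size measures, draw a random number between 0 and the grand total, and select the unit whose cumulative range contains it. The function name `cumulative_total_ppswr` is hypothetical, and the sketch uses continuous uniform random numbers rather than a random number table.

```r
# Cumulative total method for PPSWR (sketch; the function name is hypothetical)
cumulative_total_ppswr <- function(size, n) {
  cum_total <- cumsum(size)                  # T_1, T_2, ..., T_N
  total <- cum_total[length(cum_total)]      # T_N, the sum of all sizes
  r <- runif(n, min = 0, max = total)        # n independent random numbers
  # Unit i is selected when T_(i-1) < r <= T_i
  findInterval(r, cum_total, left.open = TRUE) + 1
}

size <- c(10, 20, 30, 40)   # hypothetical sizes; cumulative totals 10, 30, 60, 100
set.seed(24)
idx <- cumulative_total_ppswr(size, n = 5)
idx                         # indices of the selected units (repeats allowed)
```

Each draw is independent of the others, so a unit with a large size measure can appear several times in the sample.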
The procedure is illustrated with the following example.
AIM:
To draw a sample of size 8 using the PPSWR sampling technique and to estimate the relative efficiency of PPSWR sampling, with respect to the ratio estimator of the population total, for estimating the total amount of real estate farm loans using the nonreal estate farm loans as the size measure.
DESCRIPTION OF THE DATA:
These data give the amounts of real estate and nonreal estate farm loans in different states of the US during 2007.
library(readxl)
US<- read_excel("Estate in US.xlsx")
View(US)
library(samplingbook)
## Loading required package: pps
## Loading required package: sampling
## Loading required package: survey
## Loading required package: grid
## Loading required package: Matrix
## Loading required package: survival
##
## Attaching package: 'survival'
## The following objects are masked from 'package:sampling':
##
##     cluster, strata
##
## Attaching package: 'survey'
## The following object is masked from 'package:graphics':
##
##     dotchart
# ppswr sampling to select a sample of size n
set.seed(24)
sample<-ppswr(US$`Nonreal estate farm loans`,8)
sample
## [1] 15 13 34 24 27 43 15 36
ppswr<-US[sample, ]
ppswr
## # A tibble: 8 x 4
##   S.No. `State and Territory` `Nonreal estate farm loans` `Real estate farm loa~
##   <dbl> <chr>                                       <dbl>                  <dbl>
## 1    15 IA                                          3910.                  2327.
## 2    13 IL                                          2611.                  2131.
## 3    34 ND                                          1241.                   449.
## 4    24 MS                                           550.                   627.
## 5    27 NE                                          3585.                  1338.
## 6    43 TX                                          3520.                  1249.
## 7    15 IA                                          3910.                  2327.
## 8    36 OK                                          1716.                   612.
X<-sum(US$`Nonreal estate farm loans`)
n<-8
N<-50
# Estimate the population mean
avg_y<-(X/n*N)*sum(US$`Real estate farm loans`/US$`Nonreal estate farm loans`)
avg_y
## [1] 18057941
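A note on the arithmetic: R evaluates `/` and `*` from left to right, so `X/n*N` means `(X/n)*N`, whereas dividing by both n and N requires `X/(n*N)`. An estimate of the population mean is usually obtained as the estimated total divided by N. The sketch below uses a placeholder value for `X` and the estimated total of 361158.8 reported in this example.

```r
X <- 1000; n <- 8; N <- 50   # X is a placeholder value for illustration
X / n * N                    # evaluated as (X / n) * N
X / (n * N)                  # divides by both n and N

y_hat <- 361158.8            # estimated total reported in this example
mean_hat <- y_hat / N        # per-unit mean implied by that total
mean_hat
```

Dividing a total by N always gives a mean that is smaller than the total whenever N is greater than 1, which is a quick sanity check on any mean estimate.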
# Estimate the population total
y_hat<-(1/n)*X*sum(US$`Real estate farm loans`/US$`Nonreal estate farm loans`)
y_hat
## [1] 361158.8
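For reference, the standard PPSWR (Hansen-Hurwitz) estimator of the population total is the average of y_i / p_i over the n sampled units only, where p_i = x_i / X is the selection probability of unit i. The sketch below applies it to the rounded sample values printed in the tibble above. `X_total` is a placeholder and an assumption, since the population total of the size variable is not shown in this post, and the names `x_s`, `y_s` and `y_hat_hh` are only illustrative.

```r
# Sample values as printed in the tibble (rounded by the display)
x_s <- c(3910, 2611, 1241, 550, 3585, 3520, 3910, 1716)  # nonreal estate farm loans
y_s <- c(2327, 2131,  449, 627, 1338, 1249, 2327,  612)  # real estate farm loans

X_total <- 40000      # ASSUMPTION: placeholder for sum(US$`Nonreal estate farm loans`)
n <- length(y_s)      # number of draws (8)

p_s <- x_s / X_total        # selection probability of each sampled unit
z_s <- y_s / p_s            # each draw gives one estimate of the total
y_hat_hh <- mean(z_s)       # Hansen-Hurwitz estimate of the population total
y_hat_hh
```

With the real data, the same calculation would use the rows of the sampled data frame and the actual value of `X`.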
# Estimate the variance of population total
vt<-(1/(n*(n-1)))*((sum(US$`Real estate farm loans`* X/US$`Nonreal estate farm loans`)^2)-(n*y_hat^2))
vt
## [1] 130435690969
# Standard error
SE<-sqrt(vt)
SE
## [1] 361158.8
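The usual variance estimator for the PPSWR total is (sum(z_i^2) - n * Y_hat^2) / (n * (n - 1)), with z_i = y_i / p_i. It uses the sum of the squared z_i over the sampled units, not the square of their sum, and it equals the sample variance of the z_i divided by n. The sketch below uses the rounded sample values printed in the tibble above; `X_total` is a placeholder assumption and the variable names are illustrative.

```r
# Sample values as printed in the tibble (rounded by the display)
x_s <- c(3910, 2611, 1241, 550, 3585, 3520, 3910, 1716)  # nonreal estate farm loans
y_s <- c(2327, 2131,  449, 627, 1338, 1249, 2327,  612)  # real estate farm loans

X_total <- 40000      # ASSUMPTION: placeholder for the population total of x
n <- length(y_s)
N <- 50               # population size used in this example

z_s <- y_s * X_total / x_s        # z_i = y_i / p_i
y_hat_hh <- mean(z_s)             # estimate of the population total

# Sum of squares of the z_i, not the square of their sum
v_total <- (sum(z_s^2) - n * y_hat_hh^2) / (n * (n - 1))
se_total <- sqrt(v_total)         # standard error of the estimated total

v_mean <- v_total / N^2           # variance of the estimated mean
se_mean <- sqrt(v_mean)           # standard error of the estimated mean
c(v_total = v_total, se_total = se_total, v_mean = v_mean, se_mean = se_mean)
```

A standard error that comes out exactly equal to the estimate itself is a sign that the square was applied to the sum rather than to each term.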
# Estimate the variance of population mean
Vms<-(1/(N^2))*vt
Vms
## [1] 52174276
# Standard error of population mean
se<-sqrt(Vms)
se
## [1] 7223.176
# Gain in efficiency of ppswr over the ratio estimate of population total
x<-var(US$`Real estate farm loans`)
x
## [1] 342021.5
y<-var(US$`Nonreal estate farm loans`)
y
## [1] 1176526
rh<-sum(US$`Real estate farm loans`)/sum(US$`Nonreal estate farm loans`)
rh
## [1] 0.6324964
Cr<-cor(US$`Real estate farm loans`,US$`Nonreal estate farm loans`)
Cr
## [1] 0.8038341
V_rat<-((N-n)/(N*n))*(y+rh^2*x+2*rh*Cr*x*y)
V_rat
## [1] 42963542250
gain<-((vt-V_rat)/V_rat)*100
gain
## [1] 203.5962
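For comparison, the usual first-order approximation to the variance of the ratio estimator of a total under simple random sampling without replacement is N^2 * (1 - n/N) / n * (S_y^2 + R^2 * S_x^2 - 2 * R * rho * S_x * S_y), where y is the study variable (real estate farm loans), x is the auxiliary variable (nonreal estate farm loans) and S_x, S_y are standard deviations. The percentage gain in efficiency of estimator A over estimator B is commonly taken as (V_B - V_A) / V_A * 100, so a positive value means A has the smaller variance. The sketch below plugs in the summary values reported in this example; the helper `gain_pct` and the variable names are hypothetical.

```r
# Summary values reported in this example
S2_y <- 342021.5    # variance of real estate farm loans (study variable y)
S2_x <- 1176526     # variance of nonreal estate farm loans (auxiliary variable x)
R    <- 0.6324964   # ratio of totals, sum(y) / sum(x)
rho  <- 0.8038341   # correlation between y and x
N <- 50; n <- 8

# First-order approximation to the variance of the ratio estimator of the total
v_ratio_total <- N^2 * (1 - n / N) / n *
  (S2_y + R^2 * S2_x - 2 * R * rho * sqrt(S2_x * S2_y))
v_ratio_total

# Percentage gain in efficiency of estimator A over estimator B
# (hypothetical helper): positive when A has the smaller variance
gain_pct <- function(v_a, v_b) (v_b - v_a) / v_a * 100

gain_pct(v_a = 100, v_b = 150)   # toy variances: A is 50% more efficient than B
```

To compare with PPSWR, `v_a` would be the estimated variance of the PPSWR total and `v_b` the ratio-estimator variance computed above.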
INTERPRETATION:
We have drawn a sample of size 8 from the population of size 50 using the PPSWR sampling technique. The selected units are 15, 13, 34, 24, 27, 43, 15 and 36; unit 15 appears twice because sampling is with replacement.
We have estimated the population mean and total using PPSWR. The estimated population mean is 18057941 and the estimated population total is 361158.8. The estimated variance and standard error of the population total are 130435690969 and 361158.8 respectively.
The estimated variance and standard error of the population mean are 52174276 and 7223.176 respectively.
The gain in efficiency of PPSWR over ratio estimation is 203.5962%, which implies that PPSWR is more efficient than the ratio estimator.
CONCLUSION:
Thus, from the above interpretation, we can say that the PPSWR estimator gives an estimate of the population total from the sample, and that the estimate and its precision vary with the sample size.