Stratified Random Sampling under Ratio and Regression Estimation
Description :
Sanjana Rajamani - 2148145
Stratified Random Sampling :
Stratified random sampling is a sampling method that divides a population of N units into smaller subgroups known as strata. These subpopulations are non-overlapping and together comprise the whole population, so that
N1 + N2 + N3 + ... + NL = N
Stratification can be fully exploited only when the stratum sizes Nh are known. Once the strata have been determined, a sample is drawn from each of them, independently of the others. The sample sizes within the strata are denoted by n1, n2, n3, ..., nL, respectively.
Stratified random sampling differs from simple random sampling, in which units are selected at random from the entire population, so that every possible sample of a given size is equally likely to occur.
If a simple random sample is taken in each stratum, the whole procedure is described as stratified random sampling.
The mean and variance under stratified random sampling are :
Mean is given by :
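In the usual textbook notation (assumed here, since the symbols are not defined on this page), Wh = Nh/N is the stratum weight, y-bar-h is the sample mean in stratum h, and Sh^2 is the population variance within stratum h. Under those assumptions the stratified mean and its variance are commonly written as:

```latex
\bar{y}_{st} = \sum_{h=1}^{L} W_h \,\bar{y}_h
             = \frac{1}{N} \sum_{h=1}^{L} N_h \,\bar{y}_h ,
\qquad
V(\bar{y}_{st}) = \sum_{h=1}^{L} W_h^{2} \,\frac{S_h^{2}}{n_h}
                  \left( 1 - \frac{n_h}{N_h} \right)
```

The factor (1 - nh/Nh) is the finite population correction for stratum h. It can be dropped when each nh is a small fraction of Nh.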
Stratification is a common technique. The principal reasons for using it are the following:
- If data of known precision are needed for certain subdivisions of the population, each subdivision should be treated as a "population" in its own right.
- Administrative convenience may dictate the use of stratification.
- Sampling problems may differ markedly in different parts of the population, for example among hotels, the general population, and business lists.
- Stratification may produce a gain in precision when estimating the characteristics of the entire population. A heterogeneous population can often be divided into subpopulations, each of which is internally homogeneous.
The main advantage of stratified random sampling is that the sample captures the key characteristics of the population. As with a weighted average, this sampling method produces characteristics in the sample that are proportional to those in the overall population. When the population cannot be divided into distinct subgroups, stratified random sampling does not work well.
When do we use stratified random sampling?
- Ensuring the diversity of the sample
- Ensuring similar variance
- Lowering the overall variance of the population estimate
- Allowing for a variety of data collection methods
Compared with simple random sampling, stratification generally yields smaller estimation errors and greater precision. The greater the differences between strata, and the more homogeneous each stratum is internally, the larger the gain in precision.
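The gain in precision can be illustrated with a minimal sketch. The data, the stratum sizes and the function names below are hypothetical and chosen only for illustration. The comparison also assumes that the pooled sample variance is an acceptable stand-in for the population variance under simple random sampling, which is only a rough approximation.

```python
from statistics import mean, variance

def stratified_mean(samples, stratum_sizes):
    """Weighted mean of the stratum sample means, with weights N_h / N."""
    N = sum(stratum_sizes)
    return sum(N_h / N * mean(s) for s, N_h in zip(samples, stratum_sizes))

def stratified_variance(samples, stratum_sizes):
    """Estimated variance of the stratified mean, with finite population correction."""
    N = sum(stratum_sizes)
    total = 0.0
    for s, N_h in zip(samples, stratum_sizes):
        n_h = len(s)
        W_h = N_h / N
        total += W_h ** 2 * variance(s) / n_h * (1 - n_h / N_h)
    return total

def srs_variance(pooled, N):
    """Estimated variance of the mean if the same units were treated as one simple random sample."""
    n = len(pooled)
    return variance(pooled) / n * (1 - n / N)

# Hypothetical data: two very different strata, proportional allocation (4 of 40, 6 of 60).
samples = [[10, 12, 11, 13], [50, 52, 49, 51, 53, 48]]
stratum_sizes = [40, 60]

y_st = stratified_mean(samples, stratum_sizes)
v_st = stratified_variance(samples, stratum_sizes)
v_srs = srs_variance(samples[0] + samples[1], sum(stratum_sizes))

print(f"stratified mean           = {y_st:.3f}")
print(f"variance under stratified = {v_st:.3f}")
print(f"variance treated as SRS   = {v_srs:.3f}")
```

Because the two hypothetical strata differ sharply in level but little internally, the stratified variance comes out far smaller than the simple random sampling figure.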
A concern with the separate ratio estimate is that, with small sample sizes per stratum, the individual stratum ratio estimates will be biased, and this bias accumulates across strata. The combined ratio estimate is therefore usually preferred when the stratum sample sizes are small (ni below about 20), or when the within-stratum ratios are approximately equal.
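The difference between the two ratio estimates can be sketched as follows. The separate estimate applies a ratio within each stratum and then weights the results, while the combined estimate forms a single ratio from the stratified means. The dictionary layout, the function names and the numbers are hypothetical. The sketch also assumes that the stratum means of x (X_bar_h) are known.

```python
from statistics import mean

def separate_ratio_mean(strata):
    """Sum over strata of W_h * (y_bar_h / x_bar_h) * X_bar_h."""
    N = sum(s["N_h"] for s in strata)
    return sum(
        s["N_h"] / N * mean(s["y"]) / mean(s["x"]) * s["X_bar_h"]
        for s in strata
    )

def combined_ratio_mean(strata):
    """(y_bar_st / x_bar_st) * X_bar, using one ratio for the whole population."""
    N = sum(s["N_h"] for s in strata)
    y_st = sum(s["N_h"] / N * mean(s["y"]) for s in strata)
    x_st = sum(s["N_h"] / N * mean(s["x"]) for s in strata)
    X_bar = sum(s["N_h"] / N * s["X_bar_h"] for s in strata)
    return y_st / x_st * X_bar

# Hypothetical strata whose within-stratum ratios differ (2 versus 3).
strata = [
    {"N_h": 40, "x": [2, 4], "y": [4, 8], "X_bar_h": 3.5},
    {"N_h": 60, "x": [10, 20], "y": [30, 60], "X_bar_h": 16.0},
]

sep = separate_ratio_mean(strata)
comb = combined_ratio_mean(strata)
print(f"separate ratio estimate = {sep:.4f}")
print(f"combined ratio estimate = {comb:.4f}")
```

The two estimates differ here because the hypothetical within-stratum ratios are unequal. When the ratios are approximately equal, the two estimates come out close to each other.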
Regression estimates in Stratified Random Sampling :
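Like the ratio estimate, the linear regression estimate is designed to increase precision by using an auxiliary variate xi that is correlated with yi. When the relation between yi and xi is approximately linear but the line does not pass through the origin, an estimate based on the linear regression of yi on xi is more appropriate than one based on the ratio of the two variables.
As with the ratio estimate, two types of regression estimate can be made in stratified random sampling. In the first, the separate regression estimate, a regression estimate is computed for each stratum mean and the results are weighted by the stratum weights. In the second, the combined regression estimate, a single pooled slope is applied to the stratified sample means.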
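In the usual textbook notation (assumed here rather than taken from this page), Wh = Nh/N is the stratum weight, bh is the sample regression slope in stratum h, bc is a single pooled slope, and capital X-bar denotes a known population mean of x. Under those assumptions the separate and combined regression estimates of the population mean are commonly written as:

```latex
% Separate regression estimate: one slope b_h per stratum
\bar{y}_{lrs} = \sum_{h=1}^{L} W_h \left[ \bar{y}_h + b_h \left( \bar{X}_h - \bar{x}_h \right) \right]

% Combined regression estimate: one pooled slope b_c
\bar{y}_{lrc} = \bar{y}_{st} + b_c \left( \bar{X} - \bar{x}_{st} \right),
\qquad
\bar{x}_{st} = \sum_{h=1}^{L} W_h \,\bar{x}_h
```

The separate form needs every stratum mean of x to be known, while the combined form needs only the overall population mean of x.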
Comparison between ratio and regression estimators:
Aim :
The main aim is to derive the estimate of the population mean using regression estimation and to compare it with the ratio estimate.
Objective :
To estimate the average real estate farm loans, given that the average non-real estate farm loans in the country is known and equal to $878.16. The regression estimator is also used to give the estimate with a 95% confidence interval for this data set, and the results are discussed.
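The two point estimates being compared can be sketched as follows for a simple random sample. The function names and the small data set are hypothetical and are not the farm loan data. In the analysis described above, X_bar would be the known mean of 878.16.

```python
from statistics import mean

def ratio_estimate(x, y, X_bar):
    """Ratio estimate of the population mean of y: (y_bar / x_bar) * X_bar."""
    return mean(y) / mean(x) * X_bar

def regression_estimate(x, y, X_bar):
    """Linear regression estimate: y_bar + b * (X_bar - x_bar), with b the least-squares slope."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx
    return y_bar + b * (X_bar - x_bar)

# Hypothetical sample lying exactly on the line y = 2x + 1 (not through the origin).
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
X_bar = 3.0  # assumed known population mean of x

est_ratio = ratio_estimate(x, y, X_bar)
est_reg = regression_estimate(x, y, X_bar)
print(f"ratio estimate      = {est_ratio:.2f}")
print(f"regression estimate = {est_reg:.2f}")
```

Because the hypothetical line has a non-zero intercept, the regression estimate reproduces the value on the line at X_bar while the ratio estimate does not.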
Notations :
X - Non-real estate farm loans
Y - Real estate farm loans
Data Description :
The data are a random sample of 21 states drawn from a population of 50 states of a country by simple random sampling without replacement (SRSWOR).
We first draw a scatter plot of Y against X to inspect the relationship, and then compute the correlation coefficient :
Next we calculate the variance of the regression estimator.
Standard error : first find the estimate of the variance of the estimated mean (y bar), then take its square root. This requires the correlation coefficient (r), N, n and the sample mean square of y (sy2).
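A minimal sketch of this calculation is given below. It assumes the large-sample form of the variance, (1 - n/N) / n * sy2 * (1 - r^2), and some textbooks multiply this by (n - 1)/(n - 2). The interval uses the normal value 1.96, which is also an assumption. With n = 21, a t quantile on n - 2 degrees of freedom could be substituted. The values of r, sy2 and the point estimate are hypothetical. Only N = 50 and n = 21 come from the text.

```python
import math

def regression_se(r, N, n, s_y2):
    """Standard error of the regression estimate of the mean (large-sample form)."""
    f = n / N  # sampling fraction
    var = (1 - f) / n * s_y2 * (1 - r ** 2)
    return math.sqrt(var)

def confidence_interval(estimate, se, z=1.96):
    """Approximate 95% confidence interval: estimate +/- z * se."""
    return estimate - z * se, estimate + z * se

# N and n are from the text; r, s_y2 and the point estimate are hypothetical.
N, n = 50, 21
r, s_y2 = 0.8, 10000.0
point_estimate = 600.0

se = regression_se(r, N, n, s_y2)
lower, upper = confidence_interval(point_estimate, se)
print(f"standard error = {se:.4f}")
print(f"95% CI         = ({lower:.2f}, {upper:.2f})")
```

The stronger the correlation, the smaller the factor (1 - r^2), and so the narrower the interval.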
