Comparison of Separate and Combined Ratio Estimators in Stratified Sampling



- Ameya Sandeep Pange (2148119)


Introduction :

Stratified random sampling is a sampling method used when the population units are heterogeneous, as most real-world data are. The heterogeneous population is divided into smaller, internally homogeneous sub-groups known as strata, and a random sample is then selected independently from each stratum using Simple Random Sampling Without Replacement (SRSWOR).
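As a minimal Python sketch of this step, an SRSWOR can be drawn independently within each stratum. The stratum labels, unit labels and sample sizes below are hypothetical and are not taken from the survey discussed later; the function name `stratified_srswor` is also only illustrative.

```python
import random

def stratified_srswor(strata, sample_sizes, seed=None):
    """Draw an SRSWOR of the requested size independently from each stratum.

    strata       : dict mapping stratum label -> list of population units
    sample_sizes : dict mapping stratum label -> number of units to draw
    """
    rng = random.Random(seed)
    # Random.sample draws without replacement, which is what SRSWOR requires
    return {label: rng.sample(units, sample_sizes[label])
            for label, units in strata.items()}

# Hypothetical population: unit labels grouped into three strata
strata = {
    "A": [f"A{i}" for i in range(1, 11)],
    "B": [f"B{i}" for i in range(1, 21)],
    "C": [f"C{i}" for i in range(1, 16)],
}
sample = stratified_srswor(strata, {"A": 3, "B": 6, "C": 4}, seed=1)
```

Each stratum gets its own sample, so every stratum is represented no matter how small it is.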


Ratio Estimation :-

If two variables defined on the population are highly correlated, and one of them can be treated as auxiliary information for estimating the variable of interest, we can use the method of ratio estimation. In this technique, the known population total or mean of the auxiliary variable is used along with the sample ratio of the target variable to the auxiliary variable to estimate the population parameters.

 

Ratio Estimator :-

The ratio estimator is a statistic formed as the ratio of the sample mean of the variable of interest to the sample mean of the auxiliary variable.

It is used to estimate the population mean for the variable of interest and gives precise estimates for it when the relation between the variables is linear and the regression line passes through the origin.


Notations :-

Consider,

y : characteristic under study or the variable of interest

x : auxiliary information or variable

(xij , yij) : jth values of x and y for the ith stratum in the sample

Y : total of the y characteristic of the population

X : total of the x characteristic of the population

 

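Then the population ratio R is the ratio of the population totals, which equals the ratio of the population means,

$$R = \frac{Y}{X} = \frac{\bar{Y}}{\bar{X}}$$

and its sample estimator is the corresponding ratio of the sample totals or sample means,

$$\hat{R} = \frac{y}{x} = \frac{\bar{y}}{\bar{x}}$$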


This ratio is then used to estimate the population parameters such as the population mean (Ȳ) and population total (Y).

Thus, the estimates are given by,

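Ratio estimator of the population mean :

$$\hat{\bar{Y}}_R = \frac{\bar{y}}{\bar{x}}\,\bar{X}$$

Ratio estimator of the population total :

$$\hat{Y}_R = \frac{y}{x}\,X$$

where x, y and x̄, ȳ are the sample totals and sample means of the characteristics x and y respectively, and X̄ is the population mean of x, which is assumed to be known.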
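The two ratio estimates can be sketched in Python as follows. The sample values, the population mean of x and the population size are hypothetical toy numbers chosen only to make the arithmetic easy to follow.

```python
from statistics import mean

def ratio_estimates(x_sample, y_sample, pop_mean_x, pop_size):
    """Return (sample ratio, ratio estimate of the mean, ratio estimate of the total)."""
    r_hat = mean(y_sample) / mean(x_sample)   # sample ratio: y-bar / x-bar
    mean_y_hat = r_hat * pop_mean_x           # ratio estimate of the population mean
    total_y_hat = pop_size * mean_y_hat       # same as r_hat * X, because X = N * X-bar
    return r_hat, mean_y_hat, total_y_hat

# Hypothetical toy data
x = [2.0, 4.0, 6.0]
y = [5.0, 9.0, 16.0]
r_hat, mean_y_hat, total_y_hat = ratio_estimates(x, y, pop_mean_x=5.0, pop_size=100)
```

Here ȳ = 10 and x̄ = 4, so the sample ratio is 2.5 and the estimated mean is 2.5 × 5 = 12.5.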

 

Types of Ratio Estimator :-

When ratio estimation is used with stratified random sampling, there are two ways to produce estimates, i.e., the ratio estimators in stratified sampling are of two types:

1) Separate ratio estimator

2) Combined ratio estimator

 

Separate ratio estimator :-

For this estimator, a ratio estimate is computed separately in each stratum and the stratum estimates are then combined. That is, the separate ratio estimator estimates the ratio of µy to µx within each stratum and then forms a weighted average of the separate estimates.

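Let us consider a population of size N which is divided into k strata of sizes Ni, i = 1, 2, …, k, with weights wi = Ni/N. Then the separate ratio estimator of the population mean is given by

$$\hat{\bar{Y}}_{Rs} = \sum_{i=1}^{k} w_i\,\frac{\bar{y}_i}{\bar{x}_i}\,\bar{X}_i$$

Similarly, the separate ratio estimator of the population total is given by

$$\hat{Y}_{Rs} = \sum_{i=1}^{k} \frac{\bar{y}_i}{\bar{x}_i}\,X_i$$

where ȳi and x̄i are the sample means of y and x in the ith stratum, and X̄i and Xi are the known mean and total of x in the ith stratum. In what follows, Ri = Yi/Xi and ρi denote the true ratio and the coefficient of correlation in the ith stratum respectively.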
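A minimal Python sketch of the separate ratio estimator of the mean, i.e. the weighted sum of the stratum-wise ratio estimates (sample ratio in the stratum times the known stratum mean of x). The dictionary layout and the two toy strata are hypothetical.

```python
from statistics import mean

def separate_ratio_mean(strata, N):
    """Separate ratio estimate of the population mean.

    strata : list of dicts with keys
             "N_i"     stratum size,
             "X_bar_i" known stratum mean of the auxiliary variable,
             "x", "y"  sampled values in the stratum
    N      : population size (sum of the N_i)
    """
    estimate = 0.0
    for s in strata:
        w_i = s["N_i"] / N                      # stratum weight
        r_i = mean(s["y"]) / mean(s["x"])       # ratio estimated within the stratum
        estimate += w_i * r_i * s["X_bar_i"]    # weighted stratum ratio estimate
    return estimate

# Hypothetical toy strata
strata = [
    {"N_i": 60, "X_bar_i": 5.0,  "x": [2.0, 4.0, 6.0], "y": [5.0, 9.0, 16.0]},
    {"N_i": 40, "X_bar_i": 10.0, "x": [8.0, 12.0],     "y": [30.0, 50.0]},
]
sep_mean = separate_ratio_mean(strata, N=100)
sep_total = 100 * sep_mean                      # estimate of the population total
```

In this toy data the stratum ratios are 2.5 and 4, so the estimate is 0.6 × 2.5 × 5 + 0.4 × 4 × 10 = 23.5. Note that the known stratum means of x are required as input.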


 

Combined Ratio Estimator :-

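For this estimator, we first compute the stratified sample means ȳst and x̄st, which estimate µy and µx, and then use ȳst/x̄st as a ratio estimator of µy/µx. The combined ratio estimator of the population mean is then given by

$$\hat{\bar{Y}}_{Rc} = \frac{\bar{y}_{st}}{\bar{x}_{st}}\,\bar{X}$$

where

$$\bar{y}_{st} = \sum_{i=1}^{k} w_i\,\bar{y}_i, \qquad \bar{x}_{st} = \sum_{i=1}^{k} w_i\,\bar{x}_i$$

Similarly, the combined ratio estimator of the population total is given by

$$\hat{Y}_{Rc} = \frac{\bar{y}_{st}}{\bar{x}_{st}}\,X$$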
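A minimal Python sketch of the combined ratio estimator of the mean, computed as (ȳst / x̄st) times the known population mean of x, where ȳst and x̄st are the weighted stratum sample means. The dictionary layout, the toy strata and the value used for the population mean of x are hypothetical.

```python
from statistics import mean

def combined_ratio_mean(strata, N, pop_mean_x):
    """Combined ratio estimate of the population mean.

    strata     : list of dicts with keys "N_i" (stratum size) and
                 "x", "y" (sampled values in the stratum)
    N          : population size (sum of the N_i)
    pop_mean_x : known population mean of the auxiliary variable
    """
    y_st = sum(s["N_i"] / N * mean(s["y"]) for s in strata)  # stratified mean of y
    x_st = sum(s["N_i"] / N * mean(s["x"]) for s in strata)  # stratified mean of x
    return (y_st / x_st) * pop_mean_x                        # one pooled ratio

# Hypothetical toy strata
strata = [
    {"N_i": 60, "x": [2.0, 4.0, 6.0], "y": [5.0, 9.0, 16.0]},
    {"N_i": 40, "x": [8.0, 12.0],     "y": [30.0, 50.0]},
]
comb_mean = combined_ratio_mean(strata, N=100, pop_mean_x=7.0)
comb_total = 100 * comb_mean                                 # estimate of the population total
```

With this toy data ȳst = 22 and x̄st = 6.4, so the estimate is (22 / 6.4) × 7 = 24.0625. Only the overall population mean of x is needed, not the stratum means.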

 


 Comparison of the Separate and Combined Ratio Estimators :- 

We compare the Separate and the Combined ratio estimators on the basis of the bias and the mean square error of the estimators.

Bias refers to the deviation of the expected value of the estimator from the actual value of the parameter i.e., Bias = E(Ŷ) - Y

Mean square error refers to the average of the squared difference between the estimated values and the actual value i.e., MSE = E(Ŷ - Y)² = Variance + Bias²
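The mean square error splits into the variance plus the squared bias. This follows from adding and subtracting E(Ŷ) inside the square; the cross term vanishes because E[Ŷ - E(Ŷ)] = 0:

```latex
\begin{aligned}
\mathrm{MSE}(\hat{Y}) &= E\big[(\hat{Y} - Y)^2\big] \\
&= E\big[(\hat{Y} - E(\hat{Y}))^2\big] + \big(E(\hat{Y}) - Y\big)^2 \\
&= \mathrm{Var}(\hat{Y}) + \big[\mathrm{Bias}(\hat{Y})\big]^2
\end{aligned}
```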


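Now, to the first order of approximation, the biases of the two estimators of the population mean are found as-

Separate Estimator :

$$B(\hat{\bar{Y}}_{Rs}) \approx \sum_{i=1}^{k} w_i\,\frac{1-f_i}{n_i\,\bar{X}_i}\left(R_i S_{xi}^2 - \rho_i S_{xi} S_{yi}\right)$$

Combined Estimator :

$$B(\hat{\bar{Y}}_{Rc}) \approx \frac{1}{\bar{X}}\sum_{i=1}^{k} w_i^2\,\frac{1-f_i}{n_i}\left(R S_{xi}^2 - \rho_i S_{xi} S_{yi}\right)$$

where, for the ith stratum, ni is the sample size, fi = ni/Ni is the sampling fraction, S²xi and S²yi are the population variances of x and y, and Ri and ρi are the true ratio and the coefficient of correlation.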




                 


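and, to the same order of approximation, the mean square errors of the two estimators of the population mean are given by-

Separate Estimator :

$$\mathrm{MSE}(\hat{\bar{Y}}_{Rs}) \approx \sum_{i=1}^{k} w_i^2\,\frac{1-f_i}{n_i}\left(S_{yi}^2 + R_i^2 S_{xi}^2 - 2 R_i \rho_i S_{xi} S_{yi}\right)$$

Combined Estimator :

$$\mathrm{MSE}(\hat{\bar{Y}}_{Rc}) \approx \sum_{i=1}^{k} w_i^2\,\frac{1-f_i}{n_i}\left(S_{yi}^2 + R^2 S_{xi}^2 - 2 R \rho_i S_{xi} S_{yi}\right)$$

where fi = ni/Ni is the sampling fraction and S²xi, S²yi are the population variances of x and y in the ith stratum.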




To answer the obvious question, 'Which estimator is better, separate or combined?', we compare their MSEs (or variances). The two expressions differ only in the ratio that appears in them, i.e., the stratum ratios Ri in the separate estimator versus the overall ratio R in the combined estimator.
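As a sketch of that comparison, assume the usual first-order variance approximations for the two estimators of the mean. Here wi = Ni/N, fi = ni/Ni is the sampling fraction, and S²xi, S²yi and ρi are the population variances and the correlation of x and y in the ith stratum. Subtracting term by term gives:

```latex
\begin{aligned}
V(\hat{\bar{Y}}_{Rc}) - V(\hat{\bar{Y}}_{Rs})
&\approx \sum_{i=1}^{k} w_i^2\,\frac{1-f_i}{n_i}
   \left[(R^2 - R_i^2)\,S_{xi}^2 - 2\,(R - R_i)\,\rho_i S_{xi} S_{yi}\right] \\
&= \sum_{i=1}^{k} w_i^2\,\frac{1-f_i}{n_i}
   \left[(R - R_i)^2 S_{xi}^2 + 2\,(R_i - R)\left(\rho_i S_{xi} S_{yi} - R_i S_{xi}^2\right)\right]
\end{aligned}
```

The first term in the bracket is never negative and grows as the stratum ratios Ri move away from R. The second term is usually small when the relation between y and x within each stratum is close to a straight line through the origin, since ρi Syi / Sxi is then close to Ri.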

 

This is now illustrated by an example.



Example :-

To compare the estimators in practice, we take the data collected from a pilot survey to estimate the extent of cultivation and production of fruits in three districts of Uttar Pradesh in the year 1976-77. The districts are treated as the strata, each consisting of a number of villages. The total area under orchards (in hectares) is treated as the auxiliary variable and the number of trees is the variable of interest. The separate and combined estimators are then calculated and compared using R.
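The survey data and the original R code are not reproduced here. As a rough Python sketch of how the variances of the two estimators of the mean could be estimated from stratified sample data, one common plug-in approach replaces the population variances and ratios with their sample counterparts. The function name, data layout and toy numbers below are hypothetical and are not the survey values.

```python
from statistics import mean, variance

def estimated_variances(strata, N):
    """Plug-in variance estimates (separate, combined) for the ratio
    estimators of the population mean.

    strata : list of dicts with keys "N_i" (stratum size) and
             "x", "y" (sampled values, at least two units per stratum)
    N      : population size (sum of the N_i)
    """
    weights = [s["N_i"] / N for s in strata]
    y_st = sum(w * mean(s["y"]) for w, s in zip(weights, strata))
    x_st = sum(w * mean(s["x"]) for w, s in zip(weights, strata))
    r_c = y_st / x_st                            # combined (pooled) ratio

    v_sep = v_comb = 0.0
    for w, s in zip(weights, strata):
        n_i = len(s["x"])
        factor = w ** 2 * (1 - n_i / s["N_i"]) / n_i
        r_i = mean(s["y"]) / mean(s["x"])        # stratum ratio
        # The sample variance of the residuals y - r*x equals
        # s_y^2 + r^2 * s_x^2 - 2 * r * s_xy
        v_sep += factor * variance([y - r_i * x for x, y in zip(s["x"], s["y"])])
        v_comb += factor * variance([y - r_c * x for x, y in zip(s["x"], s["y"])])
    return v_sep, v_comb

# Hypothetical toy strata
strata = [
    {"N_i": 60, "x": [2.0, 4.0, 6.0], "y": [5.0, 9.0, 16.0]},
    {"N_i": 40, "x": [8.0, 12.0],     "y": [30.0, 50.0]},
]
v_sep, v_comb = estimated_variances(strata, N=100)
```

The only difference between the two estimates is which ratio is used to form the residuals: the stratum ratio for the separate estimator and the pooled ratio for the combined one.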




The correlation between the auxiliary and target variables is high (0.9285) and the relationship is linear (as seen from the graph), so ratio estimation is suitable here.

The combined estimator gives an estimate of the population mean as 2777134 trees and the variance of the estimator comes out to be 16503773792.

The separate estimator gives an estimate of the population mean as 2743779 trees and the variance of the estimator comes out to be 14955403949.

The relative efficiency of the separate estimator over the combined estimator comes out to be 110.3532%.
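The relative efficiency can be checked from the two variances reported above, taking it as the ratio of the combined estimator's variance to the separate estimator's variance, expressed as a percentage. This is a short Python check, not the original R code:

```python
var_combined = 16503773792   # variance of the combined estimator, as reported in the text
var_separate = 14955403949   # variance of the separate estimator, as reported in the text

# Relative efficiency of the separate estimator over the combined one, in percent
relative_efficiency = 100 * var_combined / var_separate

# Percentage gain in efficiency of the separate estimator
gain = relative_efficiency - 100
```

This reproduces the reported value of about 110.35%, which corresponds to a gain in efficiency of about 10.35%.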



Conclusion :-


From the example, it can be seen that the estimated number of trees is similar for both estimators, but the variance of the combined estimator is about 10% higher than the variance of the separate estimator. The relative efficiency of the separate estimator over the combined estimator is 110.35%, i.e., the separate estimator is about 10.35% more efficient than the combined estimator.

In general, unless Ri varies considerably across the strata, the combined estimator provides an estimate with negligible bias and precision as good as the separate estimator. It also does not require the knowledge of the stratum means of the auxiliary variable. But when the Ri differ appreciably from R, the MSE of the separate estimator is usually smaller, and the separate estimator is then more efficient than the combined estimator, as in this example.
