Ratio and Regression Estimators

COMPARISON BETWEEN RATIO AND REGRESSION ESTIMATORS

ATMADIP CHAKRABORTY
2148103

INTRODUCTION:

     Information on an auxiliary variate that is highly correlated with the variable under study is readily available in many surveys, and it can be used to improve the sampling design. Even when data on the auxiliary variate are not available for individual sampling units, aggregate data on the auxiliary variate can still be used at the estimation stage, provided its values for the sampled units can easily be obtained while recording the values of the study variate. The ratio method and the regression method of estimation are two such methods. We briefly discuss these two estimators and then examine which of them is more efficient.

RATIO ESTIMATOR:

     Frequently we come across situations in which the ratio of y to another character x is believed to be less variable than the y's themselves. In that case it is better to estimate R, the ratio of y to x in the population, from the sample and then multiply it by the known total of x to estimate the total for y. This procedure is called ratio estimation.

     Frequently we wish to estimate a ratio rather than a total or mean. For example, suppose it is desired to estimate the total agricultural area in a region containing N communes. There are very big communes and very small communes, and this makes the character y vary tremendously over the region. But the ratio of the agricultural area to the population size of the commune, which is the per capita agricultural area, would be less variable.

     Let Y and X be the total agricultural area and the total population in the region respectively. Then the per capita agricultural area in the region is R = Y/X. If a simple random sample of n communes gives ∑yi and ∑xi, where i = 1 to n, as the totals for y and x respectively, the following estimates can be obtained through ratio estimation:

R̂ = ∑yi / ∑xi   (estimate of the per capita agricultural area R)

Ŷ = R̂ · X = (∑yi / ∑xi) · X   (estimate of the total agricultural area Y)

     For estimating Y we could have used information on any character x; this information need not be recent, but it must be known for the entire population. On the other hand, information on a sample basis is required for y as well as for x (the denominator of the ratio) if the purpose is to estimate the ratio R = Y/X in the population.
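The procedure described above can be sketched in a few lines of Python. This is only an illustration: the function name `ratio_estimate_total` and all the figures below are hypothetical and are not taken from any real survey.

```python
def ratio_estimate_total(y_sample, x_sample, x_total):
    """Return (R_hat, Y_hat): the sample ratio and the ratio estimate of the total of y.

    y_sample, x_sample: values of y and x on the sampled units.
    x_total: the known population total X of the auxiliary character.
    """
    sum_y = sum(y_sample)
    sum_x = sum(x_sample)
    if sum_x == 0:
        raise ValueError("the sample total of x must be non-zero")
    r_hat = sum_y / sum_x          # estimate of R = Y / X
    return r_hat, r_hat * x_total  # estimate of Y = R * X


# Hypothetical sample of four communes: agricultural area and population.
area = [120.0, 45.0, 300.0, 80.0]
population = [600, 250, 1400, 410]
r_hat, y_hat = ratio_estimate_total(area, population, x_total=50_000)
```

Only the population total of x is needed from outside the sample; y and x themselves are observed on the sampled units only.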


The sample ratio (r) is estimated from the sample:

r = ȳ / x̄ = ∑yi / ∑xi

That the ratio estimator is biased can be shown with Jensen's inequality as follows, assuming independence between x and y:
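A sketch of the argument, written out under the stated independence assumption and additionally assuming that x̄ > 0 and E(ȳ) ≥ 0 (so that t ↦ 1/t is convex on the relevant range and the inequality keeps its direction); here m_x and m_y denote the means of the x and y variates:

```latex
\begin{aligned}
E(r) &= E\!\left(\frac{\bar{y}}{\bar{x}}\right)
      = E(\bar{y})\,E\!\left(\frac{1}{\bar{x}}\right)
      && \text{(independence)} \\
     &\ge E(\bar{y})\,\frac{1}{E(\bar{x})}
      && \text{(Jensen's inequality, } t \mapsto 1/t \text{ convex for } t > 0\text{)} \\
     &= \frac{m_y}{m_x}
\end{aligned}
```

Equality holds only when x̄ is constant, so in general r tends to overestimate m_y/m_x under these assumptions.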



The variance of the estimated total is

The variance of the estimated mean of the y variate is
where mx is the mean of the x variate, sx² and sy² are the sample variances of the x and y variates respectively, and ρ is the sample correlation between the x and y variates.
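For reference, a commonly used large-sample approximation under simple random sampling without replacement is sketched below. The notation is an assumption and may differ from the formulas originally intended here: N is the population size, n the sample size, f = n/N the sampling fraction, and r = ȳ/x̄ the sample ratio.

```latex
\begin{aligned}
\widehat{\operatorname{var}}\!\left(\hat{\bar{Y}}_R\right)
  &\approx \frac{1-f}{n}\left(s_y^2 + r^2 s_x^2 - 2\,r\,\rho\,s_x s_y\right),
  \qquad f = \frac{n}{N}, \\
\widehat{\operatorname{var}}\!\left(\hat{Y}_R\right)
  &= N^2\,\widehat{\operatorname{var}}\!\left(\hat{\bar{Y}}_R\right), \\
\widehat{\operatorname{var}}(r)
  &\approx \frac{1}{m_x^2}\,\widehat{\operatorname{var}}\!\left(\hat{\bar{Y}}_R\right)
\end{aligned}
```

The last line uses the fact that the estimated mean is r multiplied by the known mean m_x, so its variance is m_x² times the variance of r.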

REGRESSION ESTIMATOR:

     Like the ratio estimator, the linear regression estimator is designed to increase precision by using an auxiliary variable xi that is correlated with yi. Ratio estimation is at its best when the relation between y and x is a straight line through the origin, i.e. y − kx = 0 ⇔ y/x = k. When the relation between y and x is examined, it may be found that although the relation is (approximately) linear, the line does not go through the origin. This suggests an estimator based on the linear regression of y on x rather than on the ratio of the two variables.
     We suppose that y and x are each obtained for every unit in the sample and that the population mean X̄ of x is known. The linear regression estimator of Ȳ, the population mean of y, is

ȳ_reg = ȳ + b(X̄ − x̄)

where ȳ and x̄ are the sample means and b is an estimate of the change in y when x is increased by one unit. The rationale behind this estimator is that if x̄ is below average we should expect ȳ also to be below average by the amount b(X̄ − x̄), because of the regression of y on x. For an estimate of the population total Y, we take Ŷ_reg = N · ȳ_reg.
     Suppose that we can make a rapid estimate x of some characteristic for every unit and can also, by some more costly method, determine the correct value y of the characteristic for a simple random sample of the units. For example, an eye estimate of the volume of timber was made on each of a population of 1/10 acre plots, and the actual timber volume was measured for a simple random sample of the plots. The regression estimate ȳ + b(X̄ − x̄) adjusts the sample mean of the actual measurements by the regression of the actual measurements on the rapid estimates.
     By a suitable choice of b, the regression estimate includes as particular cases both the mean per unit and the ratio estimate. 
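The two particular cases can be checked numerically: b = 0 gives the mean per unit, and b = ȳ/x̄ gives the ratio estimate (ȳ/x̄)·X̄. The following is a minimal sketch; the function name and the sample values are hypothetical.

```python
from statistics import mean


def regression_estimate_mean(y, x, x_pop_mean, b):
    """Regression-type estimate of the population mean of y for a given slope b."""
    return mean(y) + b * (x_pop_mean - mean(x))


# Hypothetical sample values and a hypothetical known population mean of x.
y = [12.0, 15.0, 9.0, 20.0]
x = [10.0, 14.0, 8.0, 18.0]
X_BAR = 13.0

# b = 0: the correction term vanishes and the sample mean is returned.
mean_per_unit = regression_estimate_mean(y, x, X_BAR, b=0.0)

# b = y_bar / x_bar: the estimate reduces to (y_bar / x_bar) * X_bar.
ratio_form = regression_estimate_mean(y, x, X_BAR, b=mean(y) / mean(x))
```

Algebraically, ȳ + (ȳ/x̄)(X̄ − x̄) = (ȳ/x̄)·X̄, which is why the second call reproduces the ratio estimate.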

Comparison between ratio and regression estimators:

Aim:

The main aim is to derive the estimate of the population mean using regression estimation and to compare it with ratio estimation.

Objective:

To estimate the average real estate farm loans, assuming that the average non-real estate farm loans in the country is known and equal to $878.16. We also use the regression estimator to obtain the estimate with a 95% confidence interval for this data set and discuss the results.

Notations:

X - Non-real estate farm loans 

Y - Real estate farm loans

Data Description:

The data are a random sample of 21 states, drawn by SRSWOR from a population of 50 states of a country.

We call the required packages for our analysis

We first draw a scatter plot to inspect the relationship between X and Y and compute the correlation coefficient.



We see that the fitted line passes through the origin.

The value of the correlation coefficient is moderately high.
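For readers who want to reproduce this step, the sample correlation coefficient can be computed as in the sketch below. The helper `pearson_r` is a hypothetical name, and the values shown are placeholders rather than the data used in this analysis.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Placeholder values standing in for non-real estate (x) and real estate (y) loans.
x_loans = [350.0, 900.0, 1200.0, 400.0, 1500.0]
y_loans = [300.0, 500.0, 750.0, 180.0, 900.0]
r = pearson_r(x_loans, y_loans)
```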

Now we use the auxiliary variable X and fit a regression model, which gives the regression coefficient “b”. Using the weights from X we estimate the values of Y.


In our regression model we take the intercept as 0, as our data points pass through the origin.

The regression estimator is ȳ_reg = ȳ + b(X̄ − x̄), where ȳ and x̄ are the sample means.
The value of X̄ is known.
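A minimal sketch of this step follows. It assumes, as stated above, that the model is fitted without an intercept, in which case the least-squares slope is b = ∑xy / ∑x². The function names and the sample values are hypothetical; only the population mean 878.16 comes from the problem statement.

```python
from statistics import mean


def slope_through_origin(x, y):
    """Least-squares slope of y on x for a model with no intercept."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def regression_estimate(y, x, x_pop_mean):
    """Return (y_reg, b) with y_reg = y_bar + b * (X_bar - x_bar)."""
    b = slope_through_origin(x, y)
    y_reg = mean(y) + b * (x_pop_mean - mean(x))
    return y_reg, b


X_BAR = 878.16  # known population mean of X (from the objective above)

# Hypothetical sample values, used only to show the call.
x_sample = [350.0, 900.0, 1200.0, 400.0, 1500.0]
y_sample = [300.0, 500.0, 750.0, 180.0, 900.0]
y_reg, b = regression_estimate(y_sample, x_sample, X_BAR)
```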

Now we calculate the variance of the regression estimator.

Standard error: first find the estimate of the variance of the regression estimator of the mean, then take its square root. For this we need the correlation coefficient (r), N, n and the sample mean square of y (sy²).
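One common large-sample form that uses exactly these quantities is v = ((N − n)/(N·n)) · sy² · (1 − r²). The sketch below assumes that form, and it takes the t critical value as an input rather than computing it. The value 2.086 is the tabulated two-sided 95% point for 20 degrees of freedom; using n − 1 = 20 degrees of freedom is an assumption inferred from the interval reported in the conclusion.

```python
import math


def regression_se(s_y2, r, n, N):
    """Approximate standard error of the regression estimator of the mean.

    Assumes v = (N - n) / (N * n) * s_y2 * (1 - r**2) under SRSWOR.
    """
    variance = (N - n) / (N * n) * s_y2 * (1 - r ** 2)
    return math.sqrt(variance)


def confidence_interval(estimate, se, t_crit):
    """Return (lower, upper) limits: estimate -/+ t_crit * se."""
    half_width = t_crit * se
    return estimate - half_width, estimate + half_width


# Estimate and standard error as reported in the conclusion of this post.
lower, upper = confidence_interval(594.1101, 68.15831, t_crit=2.086)
```

With these inputs the limits agree with the interval quoted in the conclusion to about two decimal places; the small difference comes from rounding the critical value.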

Ratio Estimate
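A minimal sketch of the ratio estimate of the mean and its approximate standard error under SRSWOR is given here. It assumes the usual residual form v = ((1 − f)/n) · ∑(yi − r·xi)² / (n − 1) with f = n/N; the function name and the sample values are hypothetical.

```python
import math
from statistics import mean


def ratio_estimate_mean(y, x, x_pop_mean, N):
    """Return (estimate, se) for the ratio estimator of the population mean of y."""
    n = len(y)
    r = mean(y) / mean(x)            # sample ratio
    estimate = r * x_pop_mean        # ratio estimate of the mean
    resid_ss = sum((yi - r * xi) ** 2 for yi, xi in zip(y, x))
    variance = (1 - n / N) / n * resid_ss / (n - 1)
    return estimate, math.sqrt(variance)


# Hypothetical values: four sampled units from a population of ten.
est, se = ratio_estimate_mean([2.0, 4.0, 7.0, 9.0], [1.0, 2.0, 3.0, 5.0],
                              x_pop_mean=3.0, N=10)
```

When every point lies exactly on a line through the origin, all residuals yi − r·xi are zero and the estimated standard error is zero, which is the case in which ratio estimation is at its best.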

Conclusion:

1.   The standard error of the regression estimator is smaller than that of the ratio estimator; hence, for this data set, the regression estimator is the better method of estimation.

2.  The data show a moderately high correlation between the variables. We obtain the estimated regression coefficient b as 0.39819, with a standard error of 0.03368. The estimate of the mean is $594.1101, with an estimated standard error of 68.15831. The 95% confidence interval is (451.9344, 736.2859), so we can be 95% confident that the population mean lies within this interval.

3.   On comparing with the ratio estimator, whose standard error is 121.1869, we conclude that the regression estimator (with standard error 68.15831) is the better estimator.

      The regression estimator is more precise than the ratio estimator unless y=kx, i.e. the relation between y and x is a straight line through the origin.
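The size of the gain can be summarised as a relative efficiency, i.e. the ratio of the two estimated variances. The short calculation below uses only the two standard errors reported above; describing the result as a relative efficiency is an added interpretation, not part of the original analysis.

```python
se_regression = 68.15831  # standard error of the regression estimator (reported above)
se_ratio = 121.1869       # standard error of the ratio estimator (reported above)

# Ratio of estimated variances: values above 1 favour the regression estimator.
relative_efficiency = (se_ratio / se_regression) ** 2
print(round(relative_efficiency, 2))  # about 3.16
```

On these figures the ratio estimator would need roughly three times the sample size to match the precision of the regression estimator.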
