Relative Efficiency of Cluster Sampling

MST171 SAMPLE SURVEY DESIGN

NAME: KEERTHANA A (2148135)


INTRODUCTION TO CLUSTER SAMPLING

A population is a collection of a finite number of distinct and identifiable units, known as sampling units. The element, or elementary unit, is the smallest entity in the population, and a cluster is a collection of such elements. In practice, clusters are usually made up of elements that are similar in their characteristics. In cluster sampling, the clusters themselves are treated as sampling units: a small number of them are selected, with equal or unequal probabilities, and every element in each selected cluster is observed, measured, or interviewed. Ideally, each cluster should contain a small number of elements and the population should contain a large number of clusters. For example, to estimate the average monthly income in a colony, the colony can be divided into N blocks, called clusters, and a simple random sample of n blocks can be drawn. All individuals living in the selected blocks are then interviewed to gather the data.
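The colony example can be sketched in a few lines of Python. This is only an illustration: the block names, household ids, the helper `select_clusters`, and the seed are hypothetical and do not come from any real survey.

```python
import random

def select_clusters(blocks, n, seed=None):
    """Draw n clusters by simple random sampling without replacement
    and return every element of each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(blocks), n)  # SRSWOR over cluster labels
    return {label: blocks[label] for label in chosen}

# Hypothetical frame: N = 6 blocks, each listing 3 household ids.
blocks = {
    f"block_{i}": [f"hh_{i}_{j}" for j in range(1, 4)]
    for i in range(1, 7)
}

sample = select_clusters(blocks, n=2, seed=1)

# Every household in a selected block is interviewed.
households = [hh for members in sample.values() for hh in members]
print(sorted(sample), len(households))
```

Only the selected blocks need to be listed in full, which is where the saving in listing cost comes from.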


EXAMPLE: Consider a survey in which a number of people in a city are to be interviewed. Suppose the telephone directory is used as the frame and people are to be interviewed by telephone. Since all the residents can be numbered, simple random sampling could be used to choose the sample houses. Alternatively, we could form strata of houses for high, middle, and low income groups. However, if houses are chosen at random throughout the city, the cost of visiting widely scattered dwellings will certainly be prohibitive. An alternative is to group blocks or areas into clusters of approximately equal population, choose a number of these clusters at random, and interview all households within each chosen cluster. Compared with a random choice of households throughout the city, the cost per element (a household) is lower, because the listing cost is lower (only the houses on the selected blocks need to be listed) and the location cost is lower. It is also easier for an interviewer to talk to several people on one block than to several people scattered throughout the city. A necessary condition for the validity of this procedure is that every unit of the population under study belongs to one and only one cluster, so that the sampling units in the frame together cover all the units of the population without omission or duplication. When this condition is not satisfied, bias is introduced.

Construction of clusters

The clusters should be constructed so that the sampling units are heterogeneous within clusters and homogeneous between clusters. Clusters can be constructed in two ways: with equal sizes or with unequal sizes.

In the case of equal-sized clusters, the population is divided into N clusters, each of size M. A sample of n clusters is selected from the N clusters by simple random sampling without replacement (SRSWOR). The total population size is then NM and the total sample size is nM.

Let $y_{ij}$ denote the value of the characteristic under study for the jth element (j = 1, 2, …, M) in the ith cluster (i = 1, 2, …, N).
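With this notation, the quantities used in the rest of the post can be written out as below. The symbols $\bar{Y}_i$, $\bar{Y}$, $\bar{y}_i$, $\hat{\bar{Y}}_c$ and $S_b^2$ are standard textbook notation assumed here; the text itself only defines $y_{ij}$.

```latex
\begin{align*}
\bar{Y}_i &= \frac{1}{M}\sum_{j=1}^{M} y_{ij}
  && \text{(mean of the $i$th cluster)}\\
\bar{Y} &= \frac{1}{NM}\sum_{i=1}^{N}\sum_{j=1}^{M} y_{ij}
  = \frac{1}{N}\sum_{i=1}^{N}\bar{Y}_i
  && \text{(population mean per element)}\\
\hat{\bar{Y}}_c &= \frac{1}{n}\sum_{i=1}^{n}\bar{y}_i
  && \text{(mean of the $n$ sampled cluster means $\bar{y}_i$)}\\
V(\hat{\bar{Y}}_c) &= \frac{N-n}{Nn}\,S_b^2,
  \qquad S_b^2 = \frac{1}{N-1}\sum_{i=1}^{N}\bigl(\bar{Y}_i-\bar{Y}\bigr)^2
\end{align*}
```

Because the clusters are drawn by SRSWOR and each selected cluster is enumerated completely, $\hat{\bar{Y}}_c$ behaves like an ordinary sample mean of n cluster means, which is why its variance depends only on the between-cluster mean square $S_b^2$.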




Relative Efficiency

Note that the estimator for equal-sized clusters is based on a sample of nM units, in the form of n clusters each consisting of M units. If the same number of units were instead selected from the population of NM units by simple random sampling without replacement, the sample mean estimator $\bar{y}$ and its variance $V(\bar{y})$ would be given by the relations:
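In their usual textbook form, these relations can be written as follows. Here $y_k$ denotes the kth sampled unit, $\bar{Y}$ the population mean per element, and $S^2$ the population mean square; all three symbols are assumed notation.

```latex
\begin{align*}
\bar{y} &= \frac{1}{nM}\sum_{k=1}^{nM} y_k \\
V(\bar{y}) &= \frac{NM-nM}{NM\cdot nM}\,S^2 = \frac{N-n}{NnM}\,S^2,
  \qquad S^2 = \frac{1}{NM-1}\sum_{i=1}^{N}\sum_{j=1}^{M}\bigl(y_{ij}-\bar{Y}\bigr)^2
\end{align*}
```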



Relative to the sample mean estimator $\bar{y}$, the relative efficiency of the estimator for equal-sized clusters is given by
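A sketch of this ratio in the standard notation, assuming $S^2$ is the population mean square of all NM elements and $S_b^2$ is the mean square between the N cluster means:

```latex
\[
RE = \frac{V(\bar{y})}{V(\hat{\bar{Y}}_c)}
   = \frac{\dfrac{N-n}{NnM}\,S^2}{\dfrac{N-n}{Nn}\,S_b^2}
   = \frac{S^2}{M\,S_b^2}
\]
```

A value of RE below 1 means the cluster estimator has the larger variance, that is, cluster sampling is less efficient than simple random sampling of the same number of elements.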

 

where $V(\hat{\bar{Y}}_c)$ denotes the variance of the estimator for equal-sized clusters.

Observe that the relative efficiency defined above involves the values of the study variable for all population units. In practice, however, the investigator has only the sample observations from n clusters of M units each, and therefore needs estimates of the two variances involved in the formula for the relative efficiency (RE).
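One way to obtain these two estimates from the cluster sample is sketched below. The symbols $s_b^2$, $s_w^2$ and $\hat{S}^2$ are assumed notation, and the decomposition of the total sum of squares into within-cluster and between-cluster parts is the standard analysis-of-variance identity rather than something stated in the text.

```latex
\begin{align*}
s_b^2 &= \frac{1}{n-1}\sum_{i=1}^{n}\bigl(\bar{y}_i-\hat{\bar{Y}}_c\bigr)^2,
  \qquad
  s_w^2 = \frac{1}{n(M-1)}\sum_{i=1}^{n}\sum_{j=1}^{M}\bigl(y_{ij}-\bar{y}_i\bigr)^2 \\
v(\hat{\bar{Y}}_c) &= \frac{N-n}{Nn}\,s_b^2 \\
\hat{S}^2 &= \frac{N(M-1)\,s_w^2 + M(N-1)\,s_b^2}{NM-1},
  \qquad
  v(\bar{y}) = \frac{N-n}{NnM}\,\hat{S}^2
\end{align*}
```

Under SRSWOR of clusters, $s_b^2$ and $s_w^2$ are unbiased for the corresponding population mean squares, and $(NM-1)S^2 = N(M-1)S_w^2 + M(N-1)S_b^2$, so $\hat{S}^2$ is unbiased for $S^2$.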


The estimator of the relative efficiency of the cluster estimator (for equal-sized clusters) with respect to the usual sample mean estimator, computed from the cluster sample, is then given by
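A hedged reconstruction of this estimator, assuming $s_b^2$ is the sample mean square between the n cluster means (divisor n − 1) and $s_w^2$ is the pooled within-cluster mean square (divisor n(M − 1)):

```latex
\[
\widehat{RE} = \frac{v(\bar{y})}{v(\hat{\bar{Y}}_c)}
  = \frac{N(M-1)\,s_w^2 + M(N-1)\,s_b^2}{M\,(NM-1)\,s_b^2}
\]
```

The factor $(N-n)/(Nn)$ appears in both variance estimates and cancels, so the estimated relative efficiency depends only on the two sample mean squares and on N and M.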



Solving a problem using R.

A company has 25 centers located at different places in a State. Each center has been provided with 4 telephones. In order to estimate the average number of calls per telephone made on a typical day for this company, a sample of 5 centers was selected using simple random sampling without replacement. The data on the number of calls made on a typical working day from each telephone of the sampled centers are summarized in the table.


Estimate the average number of daily calls per telephone made from all the 25 centers. Also, estimate the relative efficiency of the estimator used with respect to the usual sample mean estimator, from the sample selected above.
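The computation can be sketched as follows, written here in Python rather than R. The call counts below are hypothetical placeholders, not the values from the table, so the printed figures illustrate the method only. The function name `cluster_estimates` and the within/between decomposition used to estimate the population mean square are assumptions of this sketch.

```python
from statistics import variance

def cluster_estimates(sample, N):
    """Estimates for n equal-sized clusters drawn by SRSWOR from N clusters.

    sample: list of n clusters, each a list of M observations.
    """
    n, M = len(sample), len(sample[0])
    cluster_means = [sum(c) / M for c in sample]
    y_c = sum(cluster_means) / n                    # estimated mean per element
    s_b2 = variance(cluster_means)                  # between clusters, divisor n-1
    s_w2 = sum(variance(c) for c in sample) / n     # pooled within, divisor n(M-1)
    # Estimate of the population mean square S^2 from the two components.
    S2_hat = (N * (M - 1) * s_w2 + M * (N - 1) * s_b2) / (N * M - 1)
    v_cluster = (N - n) / (N * n) * s_b2            # variance estimate, cluster estimator
    v_srs = (N - n) / (N * n * M) * S2_hat          # variance estimate, SRS mean of nM units
    return {"mean": y_c, "v_cluster": v_cluster,
            "v_srs": v_srs, "re": v_srs / v_cluster}

# HYPOTHETICAL data: 5 sampled centers, 4 telephones each.
calls = [
    [20, 25, 22, 21],
    [30, 28, 32, 30],
    [18, 20, 16, 18],
    [26, 24, 25, 25],
    [22, 24, 26, 28],
]

result = cluster_estimates(calls, N=25)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

With these placeholder counts the estimated relative efficiency is well below 1, because the centers differ much more from one another than the telephones within a center do. Replacing `calls` with the tabulated data gives the estimates asked for in the problem.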


Advantages:

a) Cluster sampling gives significant savings in data collection costs, since travel costs are lower.

b) Since the researcher covers only a sample of clusters rather than all of them, it is a more practical method that facilitates fieldwork.

Limitations:

a) Cluster sampling is generally less precise than sampling units directly from the whole population. Units within a cluster tend to be homogeneous, so a direct sample is expected to give a better cross-section of the population.

b) The sampling efficiency of cluster sampling is likely to decrease with an increase in cluster size or a decrease in the number of clusters.
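Both limitations can be read off a standard large-population approximation. The intraclass correlation $\rho$ between elements of the same cluster is a symbol assumed here; it is not used elsewhere in this post.

```latex
\[
V(\hat{\bar{Y}}_c) \approx V(\bar{y})\,\bigl[1 + (M-1)\rho\bigr]
\qquad\Longrightarrow\qquad
RE \approx \frac{1}{1 + (M-1)\rho}
\]
```

When elements within a cluster are alike, $\rho > 0$ and RE falls below 1. For a fixed $\rho > 0$, the loss grows as the cluster size M increases.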

These advantages and limitations suggest that cluster sampling is used extensively in practical situations where cost matters more than sampling efficiency.




