Systematic Sampling Overview

Systematic Sampling

Akhankha Ghosh

1MSTAT -- 2148118

Christ University, Bangalore

Systematic sampling is a statistical method for selecting elements from an ordered sampling frame in survey work. The most common form is an equal-probability method. Sampling begins with a random start in the list, after which every kth unit in the frame is selected, where k is the sampling interval. In this method the list is treated as circular, with a return to the beginning whenever the end of the list is reached. Systematic sampling is operationally more convenient than simple random sampling, and it still ensures that each unit has an equal chance of being included in the sample. Only the first unit is chosen using random numbers; the remaining units follow automatically from a fixed pattern.

Methodology

Assume the population's N units are numbered 1 to N in some order. Assume that N may be expressed as the product of two integers, n and k, resulting in N=nk.

In order to draw a sample of size n, choose an integer between 1 and k at random.

- Suppose it is i.

- Choose the unit with serial number i as the first unit.

- After the ith unit, select every kth unit.

- The sample will consist of the units with serial numbers i, i+k, i+2k, ..., i+(n-1)k.

As a result, the first unit is chosen at random, whereas the remaining units are chosen systematically. This sample is referred to as the kth systematic sample, with k denoting the sampling interval.
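The selection steps above can be sketched in base R. This is a minimal illustration, not the method used later in this post: the helper name `systematic_sample` and the values N = 20, k = 5 are assumptions chosen so that N = nk with n = 4.

```r
# Hypothetical helper: serial numbers of a linear systematic sample.
systematic_sample <- function(N, k) {
  i <- sample.int(k, 1)          # random start i between 1 and k
  seq(from = i, to = N, by = k)  # i, i+k, i+2k, ..., up to N
}

set.seed(1)
idx <- systematic_sample(N = 20, k = 5)
idx          # the selected serial numbers
length(idx)  # n = N / k = 4, whatever the random start
diff(idx)    # consecutive selections are always k apart
```

Because N is a multiple of k here, every possible start gives a sample of exactly n units.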

The observations from the systematic sample are listed in the table below:
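The usual layout of such a table can be sketched as follows, assuming y_j denotes the observation on the unit with serial number j, columns correspond to the k possible random starts, and rows to the n successive selections. The symbols are assumptions for illustration.

```latex
\[
\begin{array}{c|cccc}
\text{Selection} & \text{Sample } 1 & \text{Sample } 2 & \cdots & \text{Sample } k \\ \hline
1      & y_{1}        & y_{2}        & \cdots & y_{k}  \\
2      & y_{k+1}      & y_{k+2}      & \cdots & y_{2k} \\
\vdots & \vdots       & \vdots       &        & \vdots \\
n      & y_{(n-1)k+1} & y_{(n-1)k+2} & \cdots & y_{nk}
\end{array}
\]
```

Drawing a systematic sample amounts to picking one column of this table at random.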

Systematic Sampling Types: The following are the types of systematic sampling:

(a) Systematic random sampling

(b) Linear systematic sampling

Estimation of population mean

Case 1: N = nk

· The sample mean is an unbiased estimate of the population mean

· Variance of the estimate

· Comparison of Systematic sampling with SRSWOR
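The three Case 1 results can be written out explicitly. The notation is an assumption made here for illustration: y_{ij} is the jth unit of the ith systematic sample (i = 1, ..., k; j = 1, ..., n), \bar{y}_i is the mean of the ith sample, \bar{Y} is the population mean, S^2 is the population mean square and S^2_{wsy} is the pooled within-sample mean square.

```latex
\begin{align*}
E(\bar{y}_{sy}) &= \frac{1}{k}\sum_{i=1}^{k}\bar{y}_i
  = \frac{1}{nk}\sum_{i=1}^{k}\sum_{j=1}^{n} y_{ij} = \bar{Y}, \\
\operatorname{Var}(\bar{y}_{sy}) &= \frac{1}{k}\sum_{i=1}^{k}\left(\bar{y}_i-\bar{Y}\right)^2
  = \frac{N-1}{N}\,S^2 - \frac{n-1}{n}\,S^2_{wsy}, \\
S^2 &= \frac{1}{N-1}\sum_{i=1}^{k}\sum_{j=1}^{n}\left(y_{ij}-\bar{Y}\right)^2, \qquad
S^2_{wsy} = \frac{1}{k(n-1)}\sum_{i=1}^{k}\sum_{j=1}^{n}\left(y_{ij}-\bar{y}_i\right)^2, \\
\operatorname{Var}(\bar{y}_{SRSWOR}) - \operatorname{Var}(\bar{y}_{sy})
  &= \frac{N-n}{Nn}\,S^2 - \frac{N-1}{N}\,S^2 + \frac{n-1}{n}\,S^2_{wsy}
   = \frac{n-1}{n}\left(S^2_{wsy}-S^2\right).
\end{align*}
```

Read this way, systematic sampling is more precise than SRSWOR when S^2_{wsy} > S^2, that is, when the units within a systematic sample are more heterogeneous than the population as a whole.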

Case 2: N not equal to nk

· The sample mean is a biased estimate of the population mean

· Variance of the estimate

· An unbiased estimate of the population mean Y is obtained in systematic sampling with sampling interval k from a population with size N not equal to nk

Here, i indexes the systematic samples, i = 1, 2, ..., k, and n’ denotes the size of the ith systematic sample.
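A sketch of the Case 2 estimators in this notation, assuming y_{ij} is the jth unit of the ith systematic sample and that each of the k random starts is equally likely:

```latex
\begin{align*}
\bar{y}_i &= \frac{1}{n'}\sum_{j=1}^{n'} y_{ij},
 & E(\bar{y}_{sy}) &= \frac{1}{k}\sum_{i=1}^{k}\bar{y}_i \neq \bar{Y} \ \text{in general}, \\
\bar{y}^{*}_i &= \frac{k}{N}\sum_{j=1}^{n'} y_{ij},
 & E(\bar{y}^{*}_{sy}) &= \frac{1}{k}\sum_{i=1}^{k}\frac{k}{N}\sum_{j=1}^{n'} y_{ij}
   = \frac{1}{N}\sum_{i=1}^{k}\sum_{j=1}^{n'} y_{ij} = \bar{Y}.
\end{align*}
```

The ordinary sample mean is biased because n' varies from sample to sample, so the k sample means do not average to the population mean. Weighting each sample total by k/N instead of 1/n' removes the bias.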

Advantages of systematic sampling

· It is simpler to draw a systematic sample and, more often than not, to carry out the draw without errors. This is especially beneficial when the sample is drawn in the field or in offices, because it can save a lot of time.

· The cost is low, and unit selection is straightforward. Surveyors who select units by systematic sampling require far less training.

· The systematic sample is spread more evenly over the population, so no large portion of the population is left out and the sample gives a better cross-section. However, systematic sampling fails when the frame contains too many blanks.

Disadvantages of systematic sampling

· Systematic samples are generally not random samples in the strict sense, because only the first unit is chosen at random and the frame is rarely arranged in random order.

· When N is not a multiple of n, then

(i) the actual sample size may differ from that required.

(ii) the sample mean is not an unbiased estimate of the population mean.

· Because a systematic sample is effectively a sample of one unit (a single cluster), it is impossible to obtain an unbiased estimate of the variance of the systematic sample mean from a single sample.

· Systematic sampling may produce highly biased estimates when the frame has a periodic feature related to the sampling interval, that is, when k is equal to a multiple of the period.
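The periodicity problem can be seen in a small constructed example. This sketch assumes a made-up frame `y` whose values repeat with period 5, sampled with interval k = 5.

```r
# Made-up frame with a period of 5: 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, ...
y <- rep(1:5, times = 20)  # N = 100
k <- 5                     # sampling interval equal to the period

# Mean of each of the k possible systematic samples
means <- sapply(seq_len(k), function(i) mean(y[seq(i, length(y), by = k)]))

means    # 1 2 3 4 5
mean(y)  # 3
```

Every unit in a given sample sits at the same position in the cycle, so each sample is constant and its mean can be far from the population mean of 3, depending on the random start.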

Systematic Sampling using R

The mtcars data set was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models). Suppose we want to select a systematic sample by taking every 5th car after a random start.

R Code

library(TeachingSampling)

## Loading required package: dplyr

## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':

## filter, lag

## The following objects are masked from 'package:base':

## intersect, setdiff, setequal, union

## Loading required package: magrittr

# Estimate the mean mpg of cars in the mtcars data set using a systematic sample

N=nrow(mtcars)

N # population size

## [1] 32

k=5 # sampling interval; N=32 is not a multiple of k, so the sample size is 6 or 7 depending on the start

set.seed(5)

sample_units=S.SY(N,k)

sample_units

## [,1]

## [1,] 2

## [2,] 7

## [3,] 12

## [4,] 17

## [5,] 22

## [6,] 27

## [7,] 32

# sample values of the corresponding sample units

s=mtcars$mpg[sample_units]

s # selected systematic sample

## [1] 21.0 14.3 16.4 14.7 15.5 26.0 21.4

# estimated mean using systematic sample

sys_mean=mean(s)

sys_mean

## [1] 18.47143

n=length(s) # realised sample size

n

## [1] 7

# variance of the estimate

S2=var(mtcars$mpg) # S^2, the population mean square

S2_wsys=var(s) # within-sample variance, approximated here from the single drawn sample

variance=((N-1)*S2/N)-((n-1)/n)*S2_wsys

variance # estimated variance of the systematic sample mean

## [1] 18.56122

# standard error of the estimate

SE=sqrt(variance)

SE

## [1] 4.308273

# 95% confidence interval (alpha=0.05); 2.447 is the t critical value with n-1=6 degrees of freedom

ll=sys_mean-2.447*SE

ul=sys_mean+2.447*SE

CI=c(ll,ul)

CI

## [1] 7.929084 29.013774

# Conclusion: The estimated mean mpg from the systematic sample is 18.47143, with an estimated variance of 18.56122. The 95% confidence interval for the population mean mpg is [7.929084, 29.013774].
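Since N = 32 is not a multiple of k = 5, this example falls under Case 2. As a rough check, the sketch below uses base R only (not TeachingSampling) to enumerate all k possible systematic samples and compare the ordinary sample mean with the estimator that weights the sample total by k/N. The variable names are chosen here for illustration.

```r
y <- mtcars$mpg
N <- length(y)
k <- 5

# All k possible linear systematic samples, one per random start
samples <- lapply(seq_len(k), function(i) y[seq(i, N, by = k)])

sizes        <- lengths(samples)                              # n' for each start
sample_means <- sapply(samples, mean)                         # ordinary sample means
adjusted     <- sapply(samples, function(s) k / N * sum(s))   # (k/N) * sample total

sizes               # sample sizes differ across starts
mean(y)             # population mean
mean(sample_means)  # expected value of the ordinary sample mean
mean(adjusted)      # expected value of the adjusted estimator, equal to mean(y)

# Mean squared error of the ordinary sample mean over the k equally likely starts
mean((sample_means - mean(y))^2)
```

Averaging over the k starts gives the exact expectation of each estimator, which a single drawn sample cannot provide.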

Uses of Systematic Sampling

· As an example of systematic sampling, suppose a statistician selects every 100th person in a population of 10,000 persons for the sample. Sampling intervals can also be based on time, such as taking a new sample every 12 hours.

· As another example, to select a group of 1,000 people from a population of 50,000 using systematic sampling, we would need to compile a list of all potential participants and choose a starting point. Once the list is formed, every 50th individual on the list (counting from the designated starting point) would be picked as a participant, because 50,000/1,000 = 50.

· If the starting point were 20, for example, the 70th person on the list would be chosen next, then the 120th, and so on. If more participants are needed after reaching the end of the list, the count loops back to the start of the list to complete the sample.
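The arithmetic of this example can be checked with a short base R sketch. The variable names are illustrative; the starting point 20 is the one used above.

```r
N <- 50000     # population size
n <- 1000      # required sample size
k <- N / n     # sampling interval: 50
start <- 20    # designated starting point

idx <- seq(start, N, by = k)  # positions 20, 70, 120, ...

head(idx, 3)  # 20 70 120
length(idx)   # 1000
```

With this start the list yields exactly 1,000 positions before its end is reached, so no loop back to the beginning is needed.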
