# An Introduction to Sampling in Statistics

In economics and other fields of research, statisticians often try to answer questions about very large groups, such as the number of unemployed people in a region at a given time, or the number of migratory birds. Questions like these concern millions of individuals or objects at once, and questioning each one separately doesn’t sound like a great idea, does it? This is where statistical **sampling** comes into the picture. A sample is simply a set of individuals or objects chosen to represent a larger group. Studying the sample instead of the whole group enormously reduces the workload.

The process of sampling starts with defining the **population** in question. The population can be anything under study: cars, television sets, or even toys. It simply refers to the set of separate items or individuals to be surveyed. For example, if we want to study the behaviour of stray dogs in Melbourne, the population is all the stray dogs in the area. Surveying every one of them is either impossible or impractical, which brings us to our next step.

The next step is to define a sampling frame. A sampling frame is the list of population members from which the sample is actually drawn. The sample is the subset chosen from that frame, and it is studied on the assumption that it represents the behaviour of the whole population. How the sample is chosen depends on the sampling method, and there are several to note.

- **Random** – In **Random Sampling**, data is picked out of the population randomly (as the name suggests), usually by a computer algorithm known as a random number generator; non-machine methods include lottery systems and so on. In Simple Random Sampling (SRS), each individual or object in the population has an equal probability of being picked. This method reduces bias because the selection is not based on any preference. However, it is also vulnerable to error, because a random sample may turn out not to represent the population well. For example, if we study a college of 200 students with equal numbers of men and women and pick 10 individuals, a proportionate sample would contain 5 men and 5 women, but random sampling can pick 10 men and no women, or vice versa, which introduces error.
- **Systematic** – In **Systematic Sampling**, every item is numbered according to some ordering scheme and every n-th item is picked out of the lot, that is, items are picked at a regular interval of ‘n’. This method is more likely to spread the sample evenly across the whole range. However, the drawback of systematic sampling lies in the very thing that gives it its advantage: periodicity. If the ordering of the population follows a pattern that coincides with the interval ‘n’, picking every n-th item can produce a misrepresentative sample, making the method less effective than random sampling in such cases.
- **Stratified** – In **Stratified Sampling**, if the population has some distinctive quality or trait, it can be divided into subsets (strata) according to that trait, and samples can then be picked from each stratum separately. This is a good method for making sure that every class or group is represented. But like all other sampling methods, it has drawbacks: classifying the population into strata can increase the cost and complicate the design. Another drawback is that the selected sample may reflect only a few of the objects in each stratum and not necessarily all of them.
- **Cluster** – In **Cluster Sampling**, the population is divided into clusters, generally based on geography, some of the clusters are selected, and the individuals within the selected clusters are surveyed. This significantly reduces the administrative cost of the survey. Clustering is a cheaper option than SRS because the survey is confined to a few well-defined clusters, but it also tends to give higher variability in the results.
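
The four methods above can be sketched in a few lines of Python. This is only a minimal illustration using the standard `random` module, not a survey-grade implementation. The student records, the `sex` and `dorm` fields, the function names, and the sample sizes are all made up for the example.

```python
import random
from collections import defaultdict


def simple_random_sample(population, k, rng):
    """Pick k items so that every item has the same chance of selection."""
    return rng.sample(population, k)


def systematic_sample(population, k, rng):
    """Pick every n-th item after a random start, with n = len(population) // k."""
    n = len(population) // k
    start = rng.randrange(n)
    return population[start::n][:k]


def stratified_sample(population, key, fraction, rng):
    """Split items into strata by key(item), then sample the same fraction of each."""
    strata = defaultdict(list)
    for item in population:
        strata[key(item)].append(item)
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


def cluster_sample(population, key, n_clusters, rng):
    """Split items into clusters by key(item), pick whole clusters at random."""
    clusters = defaultdict(list)
    for item in population:
        clusters[key(item)].append(item)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [item for name in chosen for item in clusters[name]]


# Hypothetical population: 200 students, sex alternating with id,
# housed in 4 dorms of 50 students each (all values invented for illustration).
students = [
    {"id": i, "sex": "F" if i % 2 == 0 else "M", "dorm": f"dorm-{i // 50}"}
    for i in range(200)
]

rng = random.Random(42)  # fixed seed so the run is repeatable
srs = simple_random_sample(students, 10, rng)
systematic = systematic_sample(students, 10, rng)
stratified = stratified_sample(students, lambda s: s["sex"], 0.05, rng)
cluster = cluster_sample(students, lambda s: s["dorm"], 1, rng)
```

Because this made-up list alternates between women and men and the sampling interval is 20, the systematic sample always contains students of only one sex, which is the periodicity problem described above. The stratified sample, by construction, always contains 5 women and 5 men, while the simple random sample may or may not be balanced.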

There are many ways to sample a population when conducting research; the methods described here are only a brief introduction to the topic. Feel free to contact us if you need any help with the subject.