**Sampling**

Sampling may be defined as the selection of some part of an aggregate or totality on the basis of which a judgment or inference about the aggregate or totality is made. In other words, it is the process of obtaining information about an entire population by examining only a part of it. The researcher quite often selects only a few items from the universe for study purposes, on the assumption that the sample data will make it possible to estimate the population parameters. The items so selected constitute what is technically called a sample, the process or technique of selecting them is called the sample design, and a survey conducted on the basis of a sample is described as a sample survey. A sample should be truly representative of the population's characteristics, without bias, so that it yields valid and reliable conclusions.

**Need for Sampling**

Sampling is used in practice for a variety of reasons such as:

- Sampling can save time and money. A sample study is usually less expensive than a census study and produces results at a relatively faster speed.
- Sampling may enable more accurate measurements, because a sample study is generally conducted by trained and experienced investigators.
- Sampling remains the only way when the population contains infinitely many members.
- Sampling remains the only choice when a test involves the destruction of the item under study.
- Sampling usually makes it possible to estimate the sampling errors and thus helps in obtaining information about some characteristic of the population.

**Some Important Terms**

**Universe/Population**

From a statistical point of view, the term ‘universe’ refers to the total of the items or units in any field of inquiry, whereas the term ‘population’ refers to the total of items about which information is desired. The attributes that are the object of study are referred to as characteristics, and the units possessing them are called elementary units. The aggregate of such units is generally described as the population. Thus, all units in any field of inquiry constitute the universe, and all elementary units (on the basis of one characteristic or more) constitute the population. The population or universe can be *finite* or *infinite*. The population is said to be finite if it consists of a fixed number of elements, so that it is possible to enumerate it in its totality. For instance, the population of a city and the number of workers in a factory are examples of finite populations. An infinite population is one in which it is theoretically impossible to observe all the elements. In an infinite population the number of items is infinite, i.e., we cannot have any idea about the total number of items; the number of stars in the sky is an example. Thus, we may consider a population of persons, families, farms or cattle in a region, a population of trees or birds in a forest, or a population of fish in a tank, depending on the nature of the data required.

**Sampling unit**

Elementary units, or groups of such units, which are clearly defined, identifiable and observable, and which are also convenient for the purpose of sampling, are called sampling units. For instance, in a family budget enquiry, a family is usually taken as the sampling unit, since it is convenient both for sampling and for ascertaining the required information. In a crop survey, a farm or a group of farms owned or operated by a household may be considered as the sampling unit.

**Sampling frame**

A list of all the sampling units belonging to the population to be studied, with their identification particulars, or a map showing the boundaries of the sampling units, is known as the sampling frame. Examples of a frame are a list of farms and a list of suitable area segments, such as villages in India or counties in the United States. The frame should be up to date and free from errors of omission and duplication of sampling units.
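The requirement that a frame be free from omission and duplication can be checked mechanically when an independent reference list is available. The following is a minimal sketch under that assumption; the function name `frame_problems` and the farm identifiers are hypothetical and used only for illustration.

```python
from collections import Counter

def frame_problems(frame, reference_ids):
    """Return (duplicates, omissions) for a sampling frame.

    frame          -- list of unit identifiers as they appear in the frame
    reference_ids  -- identifiers that should appear (assumed to be known
                      from an independent source; often not available)
    """
    counts = Counter(frame)
    # Units listed more than once in the frame.
    duplicates = sorted(unit for unit, c in counts.items() if c > 1)
    # Units that should be in the frame but are missing.
    omissions = sorted(set(reference_ids) - set(frame))
    return duplicates, omissions

# Hypothetical frame of farms: F2 is listed twice and F3 is missing.
print(frame_problems(["F1", "F2", "F2", "F4"], ["F1", "F2", "F3", "F4"]))
```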

**Sample design**

A sample design is a definite plan for obtaining a sample from the sampling frame. It refers to the technique or procedure the researcher adopts in selecting the sampling units from which inferences about the population are drawn. The sample design is determined before any data are collected.

**Sampling error**

Sampling error is the deviation of the selected sample from the true characteristics, traits, behaviours, qualities or figures of the entire population. The magnitude of the sampling error depends upon the nature of the universe; the more homogeneous the universe, the smaller the sampling error. Sampling error is inversely related to the size of the sample i.e., sampling error decreases as the sample size increases and vice-versa.
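One common way to quantify both relationships is the standard error of the sample mean, σ/√n, which holds for simple random sampling from a large population. The sketch below assumes that setting; the function name `standard_error` and the value σ = 10 are illustrative and do not come from the text.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean under simple random sampling.

    Assumes a large population with standard deviation sigma.
    A smaller sigma (more homogeneous universe) or a larger n
    both give a smaller standard error.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sigma / math.sqrt(n)

# Illustrative value: population standard deviation of 10 units.
for n in (25, 100, 400):
    print(n, standard_error(10.0, n))
```

Under this formula, quadrupling the sample size halves the standard error, so gains in accuracy become progressively more expensive.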

**Sample size**

Sample size is the number of cases selected from the population to be used as the sample.

**Sample fraction**

The sample fraction is the ratio of the number of sample elements to the number of population elements.
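As a small worked illustration, the ratio can be computed directly. The helper name `sampling_fraction` and the figures (50 units drawn from 1000) are hypothetical.

```python
def sampling_fraction(sample_size: int, population_size: int) -> float:
    """Ratio of sample elements to population elements (n / N)."""
    if population_size <= 0:
        raise ValueError("population size must be positive")
    if not 0 <= sample_size <= population_size:
        raise ValueError("sample size must lie between 0 and the population size")
    return sample_size / population_size

# Hypothetical survey: 50 units drawn from a population of 1000.
print(sampling_fraction(50, 1000))
```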

**Sampling Interval**

The sampling interval is the distance or time between successive measurements or recorded data points. In survey research it is also referred to as ‘nth selection’: every nth participant (sampling unit) in the list is selected, so that, given a random starting point, the sample is drawn from throughout the total population.
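The nth-selection idea can be sketched as a systematic sample with a random start. This is a minimal illustration, assuming the interval is taken as k = N // n (integer division); the function name `systematic_sample` and the list of 100 numbered units are hypothetical.

```python
import random

def systematic_sample(units, n, seed=None):
    """Select every k-th unit after a random start, with k = N // n."""
    N = len(units)
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and the number of units")
    k = N // n                    # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)      # random starting position in [0, k)
    # The last index is start + (n - 1) * k <= n * k - 1 <= N - 1.
    return [units[start + i * k] for i in range(n)]

# Hypothetical frame of 100 numbered units; interval k = 10.
print(systematic_sample(list(range(1, 101)), 10, seed=1))
```

When N is not an exact multiple of n, this simple version never selects the last few units in the list; other conventions for handling the remainder exist.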

**EPSEM Sample**

EPSEM samples are probability samples in which each observation in the population has the same known probability of being selected into the sample (EPSEM stands for equal probability of selection method).
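Simple random sampling without replacement is one familiar EPSEM design: every unit has inclusion probability n/N. The sketch below checks this empirically by repeated draws; the function names, the ten lettered units and the number of repetitions are illustrative assumptions.

```python
import random
from collections import Counter

def simple_random_sample(units, n, rng):
    """Draw n distinct units; each has inclusion probability n / len(units)."""
    return rng.sample(units, n)

def estimate_inclusion_frequencies(units, n, draws=20000, seed=0):
    """Repeat the draw many times and report how often each unit is selected."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(draws):
        counts.update(simple_random_sample(units, n, rng))
    return {u: counts[u] / draws for u in units}

# Ten hypothetical units, samples of size 3: each frequency should be near 0.3.
print(estimate_inclusion_frequencies(list("ABCDEFGHIJ"), 3))
```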

**Proportional Sample**

Proportional sampling is a method of sampling in which the investigator divides a finite population into subpopulations and then applies random sampling techniques to each subpopulation, so that each subpopulation is represented in the sample in proportion to its size.

**Representative Sample**

A representative sample is a group that closely matches the characteristics of its population as a whole. In other words, the sample is a fairly accurate reflection of the population from which it is drawn. For example, a classroom of 30 students with 15 males and 15 females could generate a representative sample of six students: three males and three females.
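The classroom example can be reproduced with proportional allocation: each group receives a share of the sample equal to its share of the population, and units are then drawn at random within each group. This is a sketch only; the function names and student labels are hypothetical, and simple rounding is assumed, which in general may make the allocated total differ slightly from the requested sample size.

```python
import random

def proportional_allocation(stratum_sizes, sample_size):
    """Allocate sample_size across strata in proportion to stratum size.

    Uses simple rounding, so the allocated total may differ slightly
    from sample_size when the shares are not whole numbers.
    """
    total = sum(stratum_sizes.values())
    return {name: round(sample_size * size / total)
            for name, size in stratum_sizes.items()}

def stratified_sample(strata, sample_size, seed=None):
    """Draw a simple random sample within each stratum."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in strata.items()}
    allocation = proportional_allocation(sizes, sample_size)
    return {name: rng.sample(strata[name], allocation[name]) for name in strata}

# Classroom of 30 students: 15 males and 15 females, sample of six.
classroom = {
    "male": [f"M{i}" for i in range(1, 16)],
    "female": [f"F{i}" for i in range(1, 16)],
}
print(proportional_allocation({"male": 15, "female": 15}, 6))
print(stratified_sample(classroom, 6, seed=1))
```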

**Statistic(s) and parameter(s)**

A statistic is a characteristic of a sample, whereas a parameter is a characteristic of a population. Thus, when we work out measures such as the mean, median, mode or the like from samples, they are called statistic(s), for they describe the characteristics of a sample. But when such measures describe the characteristics of a population, they are known as parameter(s). For instance, the population mean is a parameter, whereas the sample mean is a statistic.
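The distinction can be made concrete with a small finite population. In this sketch the population (the integers 1 to 100), the sample size of 10 and the random seed are all hypothetical choices made for illustration.

```python
import random
import statistics

# Hypothetical finite population: the integers 1..100.
population = list(range(1, 101))

# Parameter: computed from every unit in the population, so it is fixed.
population_mean = statistics.mean(population)

# Statistic: computed from a sample only, so it varies from sample to sample.
rng = random.Random(42)
sample = rng.sample(population, 10)
sample_mean = statistics.mean(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

Changing the seed changes the sample mean but not the population mean, which is why the statistic is used only as an estimate of the parameter.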

**Precision**

Precision is the range within which the population average (or other parameter) will lie in accordance with the reliability specified in the confidence level, expressed either as a percentage of the estimate (±) or as a numerical quantity. For instance, if the estimate is Rs 4000 and the precision desired is ±4%, then the true value will be no less than Rs 3840 and no more than Rs 4160. This is the range (Rs 3840 to Rs 4160) within which the true answer should lie. But if we desire that the estimate should not deviate from the actual value by more than Rs 200 in either direction, the range would be Rs 3800 to Rs 4200.
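The two ways of stating precision in the example above amount to simple arithmetic, sketched here. The function names are hypothetical; the figures are those given in the text.

```python
def limits_from_percent(estimate, percent):
    """Precision limits when precision is given as +/- a percentage of the estimate."""
    margin = estimate * percent / 100
    return estimate - margin, estimate + margin

def limits_from_amount(estimate, amount):
    """Precision limits when precision is given as +/- an absolute quantity."""
    return estimate - amount, estimate + amount

# Estimate of Rs 4000 with precision of +/- 4 percent.
print(limits_from_percent(4000, 4))    # (3840.0, 4160.0)

# Estimate of Rs 4000 with precision of +/- Rs 200.
print(limits_from_amount(4000, 200))   # (3800, 4200)
```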

**Confidence level and significance level**

The confidence level or reliability is the expected percentage of times that the actual value will fall within the stated precision limits. Thus, if we take a confidence level of 95%, we mean that there are 95 chances in 100 (or .95 in 1) that the sample results represent the true condition of the population within a specified precision range, against 5 chances in 100 (or .05 in 1) that they do not. Precision is the range within which the answer may vary and still be acceptable; the confidence level indicates the likelihood that the answer will fall within that range, and the significance level indicates the likelihood that the answer will fall outside that range. Note that if the confidence level is 95%, then the significance level is (100 – 95), i.e., 5%; if the confidence level is 99%, the significance level is (100 – 99), i.e., 1%, and so on. We should also remember that the area of the normal curve within the precision limits for the specified confidence level constitutes the acceptance region, and the area of the curve outside these limits in either direction constitutes the rejection regions.
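The relation between the two levels, and the boundary of the acceptance region on a standard normal curve, can be sketched as follows. The function names are hypothetical, and a two-sided region (rejection area split equally between the two tails) is assumed.

```python
from statistics import NormalDist

def significance_level(confidence_percent):
    """Significance level (in percent) implied by a confidence level (in percent)."""
    return 100 - confidence_percent

def z_value(confidence_percent):
    """Two-sided critical value of the standard normal distribution.

    The acceptance region is the interval from -z to +z; the rejection
    regions are the two tails beyond it, each holding half the
    significance level.
    """
    alpha = significance_level(confidence_percent) / 100
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (95, 99):
    print(level, significance_level(level), round(z_value(level), 3))
```

For a 95% confidence level this gives a critical value of about 1.96, and for 99% about 2.576.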

**Sampling Process**

**Define the Universe/Population**

The first step in developing any sample design is to clearly define the set of objects, technically called the universe, to be studied. The universe can be finite or infinite. In a finite universe the number of items is certain, but in an infinite universe the number of items is infinite, i.e., we cannot have any idea about the total number of items. The population of a city, the number of workers in a factory and the like are examples of finite universes, whereas the number of stars in the sky, the listeners of a specific radio programme, the throws of a die, etc. are examples of infinite universes.

**Identify the sampling frame**

A complete list of population units is the sampling frame. The frame should be chosen so that it contains almost all the sampling units. Although a one-to-one correspondence between frame units and sampling units is not always possible, we should choose a sampling frame that yields unbiased estimates with as low a variance as possible. Popularly known sampling frames are census reports, electoral registers, lists of member units of trade and industry associations, lists of members of professional bodies, lists of dwelling units maintained by local bodies, returns from an earlier survey, large-scale maps, etc.

**Specify the sampling unit**

The sampling unit is the basic unit containing the elements of the target population. A sampling unit may be a geographical one such as a state, district or village; a construction unit such as a house or flat; a social unit such as a family, club or school; or an individual. The researcher has to decide on one or more such units to select for the study.

**Specify the sampling method**

The sampling method indicates how the sample units are selected. The most important decision in this regard is which of the two kinds of sample, probability or non-probability, is to be chosen.

**Determine the sample size (n)**

This refers to the number of items to be selected from the universe to constitute a sample. This is a major problem before a researcher. The size of the sample should be neither excessively large nor too small. It should be optimum. An optimum sample is one which fulfills the requirements of efficiency, representativeness, reliability and flexibility. While deciding the size of the sample, the researcher must determine the desired precision and an acceptable confidence level for the estimate. The size of the population variance needs to be considered, since a larger variance usually calls for a bigger sample. The size of the population must be kept in view, for this also limits the sample size. The parameters of interest in the research study must be kept in view while deciding the size of the sample. Costs too dictate the size of the sample that we can draw. As such, budgetary constraints must invariably be taken into consideration when we decide the sample size.
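One widely used way of combining precision, confidence level, variance and population size is the formula for estimating a mean, n = (zσ/e)², with a finite-population version n = z²σ²N / (e²(N − 1) + z²σ²). The sketch below applies it; it is one approach among several, not necessarily the one intended here. The function name, the standard deviation of Rs 1000 and the population of 5000 are hypothetical, and the margin of Rs 200 echoes the precision example.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence=0.95, population_size=None):
    """Sample size needed to estimate a mean within +/- margin.

    sigma            -- assumed population standard deviation
    margin           -- desired precision (same units as sigma)
    confidence       -- confidence level as a fraction, e.g. 0.95
    population_size  -- if given, the finite-population formula is used
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    if population_size is None:
        n = (z * sigma / margin) ** 2
    else:
        N = population_size
        n = (z**2 * sigma**2 * N) / (margin**2 * (N - 1) + z**2 * sigma**2)
    return math.ceil(n)   # round up so the stated precision is still met

# Hypothetical inputs: sigma = Rs 1000, precision = Rs 200, 95% confidence.
print(sample_size_for_mean(1000, 200))
print(sample_size_for_mean(1000, 200, population_size=5000))
```

A larger variance or a tighter margin increases the required size, while a finite population reduces it somewhat. Cost constraints are not modelled and would have to be applied separately.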

**Specify the sampling plan**

This means that one should indicate how decisions made so far are to be implemented. All expected pertinent issues in a sampling survey must be answered by the sampling plan.

**Select the sample**

This is the final step in the sampling process. A good deal of field work and office work is involved in the actual selection of the sample elements. However, it depends mainly upon the sampling plan and the sample size required.

**Characteristics of a Good Sample Design**

- Sample design must result in a truly representative sample.
- Sample design must result in a small sampling error.
- Sample design must be viable in the context of funds available for the research study.
- Sample design must allow systematic bias to be controlled as well as possible.
- The sample should be such that the results of the sample study can be applied, in general, to the universe with a reasonable level of confidence.