Representative sample

Demographic segmentation

Some parts of the population have better access to the Internet than others. If a random sample is drawn from an online panel, some groups are likely to be over-represented (e.g. young people relative to elderly people). To avoid this bias, samples are drawn according to the following demographic segmentation variables: gender, age and region. This division follows the division of the Dutch population used by Statistics Netherlands. Other demographic segmentation variables include income, family size, ethnicity and education. If these are necessary for the study, we can provide this information. It is also possible to add these variables to the dataset afterwards in order to correct for bias, should it occur.
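Correcting bias afterwards is commonly done by weighting each respondent by the ratio of the population share to the sample share of his or her group. The sketch below illustrates that idea only; it is not a description of PanelClix's own procedure. The function name, the education categories and all percentages are hypothetical.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_shares):
    """Return one weight per group: population share / sample share."""
    counts = Counter(sample_groups)
    n = len(sample_groups)
    weights = {}
    for group, population_share in population_shares.items():
        if counts[group] == 0:
            # A group without respondents cannot be weighted up.
            raise ValueError(f"no respondents in group {group!r}")
        sample_share = counts[group] / n
        weights[group] = population_share / sample_share
    return weights


# Hypothetical population shares by education level (not taken from the source).
population_shares = {"low": 0.30, "middle": 0.40, "high": 0.30}

# Hypothetical sample of 100 respondents in which "high" is over-represented.
sample = ["low"] * 20 + ["middle"] * 40 + ["high"] * 40

weights = poststratification_weights(sample, population_shares)
weighted_total = sum(weights[group] for group in sample)
```

Under-represented groups receive a weight above 1 and over-represented groups a weight below 1, while the weighted total stays equal to the number of respondents.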

Self-selection bias

The representativeness of a sample is undermined by "self-selection bias", a type of statistical bias that arises because respondents select themselves for participation. This type of bias occurs in any kind of research (e.g. face-to-face or telephone interviewing, mail or web surveys). Self-selection bias is not only inherent to Access Panels (i.e. panels whose members have given their permission to be selected), but is also caused by the simple fact that response rates are never 100%. So even with a random door-to-door sample from the Dutch population, self-selection will occur, as some people choose to participate and others refuse.

To minimise self-selection bias, it is possible to measure psychographic variables as well as demographic variables. Psychographic variables include voting behaviour and classification systems, such as the values and lifestyle segmentation by Motivaction or the stratification analyses by Experian or Claritas.¹

Because of this, PanelClix decided to use only demographic variables to construct samples that are representative of the researched population.

This document is limited to describing how PanelClix constructs samples. In cooperation with the Technical University of Eindhoven, PanelClix is examining whether there are other, and perhaps better, ways of correcting for self-selection bias. In cooperation with Millward Brown, PanelClix is studying the extent to which respondents' motivation to participate in fieldwork may introduce bias. The first results were published in 2008.

¹ Including the study carried out by McKinsey in the Netherlands.

Representative sample distribution based on demographic variables

Because target group criteria are different for each research project, the sample distributions based on demographic variables vary for each study as well. The goal is to compose samples that are as (nationally) representative as possible. For example, we must correct for the fact that women respond faster than men to invitations to participate in a survey. If PanelClix invited as many men as women, more women would participate and the sample distribution would become skewed.
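The correction for unequal response can be sketched as a small calculation: the number of invitations per group is the desired number of completed surveys divided by that group's expected response rate. The response rates, targets and function name below are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def invitations_needed(target_completes, response_rate_pct):
    """Invitations per group = target completes / response rate, rounded up.

    response_rate_pct holds whole percentages (e.g. 25 for 25%).
    """
    return {
        group: math.ceil(target_completes[group] * 100 / response_rate_pct[group])
        for group in target_completes
    }


# Desired number of completed surveys per gender (hypothetical).
target_completes = {"women": 500, "men": 500}

# Expected response rates in percent (hypothetical; women respond more readily).
response_rate_pct = {"women": 25, "men": 20}

invitations = invitations_needed(target_completes, response_rate_pct)
share_men = invitations["men"] / sum(invitations.values())
```

With these made-up rates, 2,000 women and 2,500 men would be invited, so men make up roughly 56% of the invitations even though the target is an equal number of completed surveys.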

When a research project requires a representative distribution based on demographic variables, PanelClix composes samples using the following distributions by region, gender and age.

Region               Gender
Nielsen 1   15%      Woman   45%
Nielsen 2   29%      Man     55%
Nielsen 3   11%
Nielsen 4   20%
Nielsen 5   25%

Age 15-65            Age 20+
15-34   36%          20-34   24%
35-49   33%          35-49   30%
50-64   31%          50-64   26%
                     65+     20%

Age 16+              Age 20-64
16-29   21%          20-34   29%
30-44   26%          35-49   37%
45-59   27%          50-64   34%
60+     27%

Age 18+              Age 20-60
18-29   18.6%        20-29   22%
30-44   26.6%        30-39   24%
45-59   27.2%        40-49   29%
60+     27.6%        50-59   25%

Age 18-44            Age 25+
18-29   41%          25-34   17%
30-44   59%          35-49   32%
                     50-64   29%
                     65+     22%

18-34   33%
35-49   36%
50-64   31%

These distributions are non-interlocked: each variable is matched to its own distribution separately, not to every combination of age, gender and region. Within the PanelClix panel, region and gender are well distributed across the different age categories. The only exception is the elderly (65+), because this group is underrepresented in the online community and because it contains more women than men. If a representative distribution must be interlocked on age, gender and region, this is best safeguarded by a web survey with segments and quotas. Where it is technically possible to work with our router, we can guarantee these distributions.
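To make the non-interlocked approach concrete, the sketch below turns the region, gender and 18+ age percentages listed above into quota counts for a given sample size, and then checks a respondent against each variable separately. The function names, the sample size of 500 and the largest-remainder rounding are assumptions for illustration; the source does not describe how PanelClix rounds or tracks quotas.

```python
# Shares in permille (tenths of a percent), taken from the distributions above.
REGION = {"Nielsen 1": 150, "Nielsen 2": 290, "Nielsen 3": 110,
          "Nielsen 4": 200, "Nielsen 5": 250}
GENDER = {"Woman": 450, "Man": 550}
AGE_18_PLUS = {"18-29": 186, "30-44": 266, "45-59": 272, "60+": 276}


def allocate_quotas(n, shares_permille):
    """Split n completes over categories with largest-remainder rounding."""
    if sum(shares_permille.values()) != 1000:
        raise ValueError("shares must sum to 1000 permille")
    quotas = {cat: n * share // 1000 for cat, share in shares_permille.items()}
    leftover = n - sum(quotas.values())
    # Hand the remaining completes to the categories with the largest remainder;
    # ties keep the order in which the categories were listed.
    by_remainder = sorted(shares_permille,
                          key=lambda cat: n * shares_permille[cat] % 1000,
                          reverse=True)
    for cat in by_remainder[:leftover]:
        quotas[cat] += 1
    return quotas


def has_room(respondent, quotas, filled):
    """Non-interlocked check: each variable is tested on its own."""
    return all(filled[var][respondent[var]] < quotas[var][respondent[var]]
               for var in quotas)


quotas = {
    "region": allocate_quotas(500, REGION),
    "gender": allocate_quotas(500, GENDER),
    "age": allocate_quotas(500, AGE_18_PLUS),
}
filled = {var: dict.fromkeys(quota, 0) for var, quota in quotas.items()}

# A hypothetical respondent is counted once per variable, not per combination.
respondent = {"region": "Nielsen 2", "gender": "Man", "age": "30-44"}
if has_room(respondent, quotas, filled):
    for var, cat in respondent.items():
        filled[var][cat] += 1
```

An interlocked design would instead keep one quota per combination of age, gender and region, which requires a target for every cell and far more bookkeeping.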

Sample distribution for specific groups of respondents

A high percentage of research projects require respondents who fit a fairly narrow target group. This means that only respondents with very specific attributes qualify to participate in the survey. Examples of such target groups are: people who have recently moved; people who travel by public transport more than 6 times a year; people who are considering buying a new car within 6 months.

As soon as respondents have to meet specific criteria other than age, gender and region, we must define which distributions should be used for the demographic variables. Take, for example, the target group of people who are considering buying a new car within 6 months. The incidence rate for this target group will probably be much lower among young and elderly people than among the middle-aged population. Hence the representative demographic distribution for people who are considering buying a new car within 6 months will differ from the representative distribution for the general national population.
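The shift in distribution can be worked out by multiplying each age group's population share by its incidence rate and rescaling the results so that they sum to 100%. In the sketch below the population shares are the 18+ age distribution given earlier; the incidence rates and the function name are hypothetical and chosen purely for illustration.

```python
def target_group_distribution(population_shares, incidence_rates):
    """Return (distribution within the target group, overall incidence)."""
    qualifying = {
        group: population_shares[group] * incidence_rates[group]
        for group in population_shares
    }
    overall_incidence = sum(qualifying.values())
    distribution = {
        group: value / overall_incidence for group, value in qualifying.items()
    }
    return distribution, overall_incidence


# 18+ age distribution from the tables above.
population_shares = {"18-29": 0.186, "30-44": 0.266, "45-59": 0.272, "60+": 0.276}

# Hypothetical incidence of "considering a new car within 6 months" per age group.
incidence_rates = {"18-29": 0.02, "30-44": 0.06, "45-59": 0.06, "60+": 0.03}

distribution, overall_incidence = target_group_distribution(
    population_shares, incidence_rates
)
```

With these made-up rates the target group consists of roughly 8% 18-29, 36% 30-44, 37% 45-59 and 19% 60+, against 18.6%, 26.6%, 27.2% and 27.6% in the general population, and the overall incidence is about 4.4%.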

It is very important to consult and collaborate with the researcher about how a sample must be constructed. On the one hand, if the aim is to discover the incidence rate among the (national) population, we must send out invitations based solely on representative demographic variables. On the other hand, if specific segments have to be achieved per age category, the sample must be composed differently. If you would like advice on a representative distribution for a particular target group, PanelClix can offer this based on data from its in-house tool, provided that the criteria of the target group are included in our profile characteristics.

Proposals issued by PanelClix always specify whether a targeted or an untargeted sample will be used. Furthermore, PanelClix always verifies with the researcher whether the constructed sample is in accordance with the aim(s) of the research project.