Please follow the choices below. In most cases, the answer will lead you to one of the NuMBerS statistics toolkits. You will arrive at the 'When to do it' page, and this will confirm that you have selected the appropriate test. However, if you arrive at a toolkit and decide that you have made a mistake, you can return to the same link on this page.


Most of the tests come in two versions: one for parametric data and a corresponding one for non-parametric data. The choices also refer to different data types such as 'scale' and 'ordinal'. If these terms don't make sense to you, either refer to the notes at the bottom of this page, or look at the relevant entries in the glossary.


1. What do you want to do?


1.1 Perform tests on frequencies (ie the distribution of one or more sets of observations), where the data are frequencies or counts in categories. Go to 2.


1.2 Test the relationship between variables, where the data are scale or ordinal. Go to 3.


1.3 Test the difference between sets of observations, where the dependent data are scale or ordinal, and in categories defined by the independent variable. Go to 4.


2. How many categories are there?


2.1 There is one set of categories, so a one-way classification including a test of homogeneity is appropriate: One-way Chi-square test (non-parametric test only)


2.2 There are two sets of categories, so a two-way test is appropriate (also called a 'test of association' or a 'test of independence'): Two-way Chi-square test (non-parametric test only)
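
As a rough illustration of the two choices above, the following sketch computes the Chi-square statistic and its degrees of freedom for a one-way and a two-way classification. The function names are made up for this example, no continuity correction is applied, and the statistic still has to be compared with a critical value from a Chi-square table, so treat it as a minimal sketch rather than a complete test.

```python
def chi_square_one_way(observed, expected=None):
    """Return (statistic, degrees_of_freedom) for one set of categories.

    If expected is None, a test of homogeneity is assumed: every
    category is expected to hold an equal share of the total count.
    """
    total = sum(observed)
    if expected is None:
        expected = [total / len(observed)] * len(observed)
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


def chi_square_two_way(table):
    """Return (statistic, degrees_of_freedom) for a two-way table of counts.

    Expected counts come from the row and column totals, which is the
    usual assumption for a test of association (independence).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Illustrative counts only (not real data)
print(chi_square_one_way([10, 20, 30]))
print(chi_square_two_way([[10, 20], [20, 10]]))
```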





3. Is there a dependent and an independent variable?


3.1 Yes, there are clearly identified dependent and independent variables, so that a regression is appropriate.

3.1.1 The data are parametric: Linear regression (other models could also be used)

3.1.2 The data are non-parametric: Non-parametric regression models are not covered here


3.2 No, the variables are interdependent and a test of correlation is appropriate.

3.2.1 The data are parametric: Pearson correlation

3.2.2 The data are non-parametric: Spearman correlation
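
The calculations behind these choices can be sketched in a few lines. The example below is an illustration only, with function names invented for this page: it fits a least-squares straight line, computes the Pearson coefficient, and obtains the Spearman coefficient by applying the Pearson formula to ranks (tied values share their average rank). It returns coefficients only and does not test their significance.

```python
from math import sqrt


def linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Rank values from 1 upwards; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = average_rank
        start = end + 1
    return result


def spearman(x, y):
    """Spearman correlation: the Pearson coefficient of the ranks."""
    return pearson(ranks(x), ranks(y))


# Illustrative values only: y rises with x, but not in a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(linear_regression(x, y), pearson(x, y), spearman(x, y))
```

In this made-up example the Spearman coefficient is higher than the Pearson coefficient, because the ranks of y follow the ranks of x exactly even though the relationship is curved.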




4. Are samples related or unrelated?


4.1 The samples are related. Go to 6


4.2 The samples are unrelated. Go to 5


5. Are there two samples or more than two samples?


5.1 There are two samples.

5.1.1 The data are parametric: t-test

5.1.2 The data are non-parametric: Mann-Whitney U test


5.2 There are more than two samples.

5.2.1 The data are parametric: One-way Anova

5.2.2 The data are non-parametric: Kruskal-Wallis test


6. Are there two samples or more than two samples?


6.1 There are two samples.

6.1.1 The data are parametric: paired t-test

6.1.2 The data are non-parametric: Wilcoxon signed-rank test


6.2 There are more than two samples.

6.2.1 The data are parametric: Related one-way Anova (not covered here)

6.2.2 The data are non-parametric: Friedman Anova (not covered here)
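
The whole list of choices can also be written as a small function. This is only a sketch of the key as a lookup: the function name, argument names and goal labels are invented for the example, and the returned strings are the test names used on this page.

```python
def choose_test(goal, parametric=True, category_sets=1,
                dependent_variable=False, related=False, samples=2):
    """Return the name of the test suggested by the list of choices.

    goal is one of 'frequencies', 'relationship' or 'difference'
    (hypothetical labels for the three options under question 1).
    """
    if goal == "frequencies":
        if category_sets == 1:
            return "One-way Chi-square test"
        if category_sets == 2:
            return "Two-way Chi-square test"
        raise ValueError("only one or two sets of categories are covered")

    if goal == "relationship":
        if dependent_variable:
            if parametric:
                return "Linear regression"
            return "Non-parametric regression (not covered here)"
        return "Pearson correlation" if parametric else "Spearman correlation"

    if goal == "difference":
        if samples < 2:
            raise ValueError("at least two samples are needed")
        if related:
            if samples == 2:
                return "paired t-test" if parametric else "Wilcoxon signed-rank test"
            if parametric:
                return "Related one-way Anova (not covered here)"
            return "Friedman Anova (not covered here)"
        if samples == 2:
            return "t-test" if parametric else "Mann-Whitney U test"
        return "One-way Anova" if parametric else "Kruskal-Wallis test"

    raise ValueError("unknown goal: " + repr(goal))


# Example: three unrelated samples of non-parametric data
print(choose_test("difference", parametric=False, related=False, samples=3))
```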




Notes on data types


Nominal or category data are data that have simply been classified into groups, perhaps by colour or sex, that do not form a logical progression.


Ordinal and rank data are in categories that form a logical order, such as rankings or an assessment scale (eg the Beaufort wind-scale or the Mercalli intensity scale for earthquakes).


Scale data are 'real' measurements, such as length or mass or simply number. Scale data are either 'discrete', if they derive from counting things, or 'continuous', if they are measurements on a scale such as length.


Parametric data must be scale data and should be normally distributed (or close to this). If a test involves more than one data set, the variances of the data sets should be similar.
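
One simple way to look at the 'similar variances' condition is to compare the largest and smallest sample variances. The sketch below is an informal check only: the function name is made up for this example, and this page does not specify how large a ratio is acceptable, so the function reports the ratio and leaves the judgement to you.

```python
from statistics import variance


def variance_ratio(*samples):
    """Return the largest sample variance divided by the smallest.

    A ratio close to 1 suggests similar variances. Each sample needs at
    least two observations, and a sample with zero variance will raise
    ZeroDivisionError.
    """
    variances = [variance(sample) for sample in samples]
    return max(variances) / min(variances)


# Illustrative values only: the second set is twice as spread out
print(variance_ratio([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```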


Non-parametric data do not satisfy the parametric criteria, and may be category, ordinal or scale data. Tests appropriate to non-parametric data may also be used in the case of small sample sizes where the parametric criteria cannot be sustained.


Dependent and independent variables - where the value of one variable is controlled, at least partially, by the value of a second, the first variable is termed 'dependent' and the second 'independent'. In regression analysis, the dependent variable is usually denoted by 'y' and the independent variable by 'x'. Dependent variables are also known as 'test-' or 'response-' variables, and independent variables may be called 'predictor-', 'grouping-' or 'explanatory-' variables (or factors). Some types of analysis (not covered here) allow more than one independent variable.


Related and unrelated data - these terms are almost self-explanatory, but are important concepts in designing tests and analysing results. If data are related, an observation of one variable can be linked to a corresponding observation of a second variable. Obviously, this means that there must be the same number of observations in each set of data.
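
To see why the distinction matters, the sketch below computes a t statistic in two ways for the same made-up numbers: once treating the samples as related (working on the difference within each pair) and once treating them as unrelated (comparing the two means using a pooled variance, which assumes similar variances). The function names are invented for this example and only the statistic is calculated, not its significance.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t_statistic(first, second):
    """t statistic for related samples, based on within-pair differences."""
    if len(first) != len(second):
        raise ValueError("related samples need the same number of observations")
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)
    return mean(differences) / (stdev(differences) / sqrt(n))


def unpaired_t_statistic(sample_a, sample_b):
    """t statistic for unrelated samples, using a pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled = ((n_a - 1) * variance(sample_a)
              + (n_b - 1) * variance(sample_b)) / (n_a + n_b - 2)
    return (mean(sample_a) - mean(sample_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))


# Illustrative values only: four subjects measured before and after
before = [10.0, 12.0, 14.0, 16.0]
after = [11.0, 14.0, 15.0, 18.0]
print(paired_t_statistic(before, after))
print(unpaired_t_statistic(after, before))
```

With these numbers the related calculation gives a much larger statistic, because the consistent change within each pair is no longer hidden by the variation between subjects.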

