What is Statistics?

Statistics is a set of tools designed to **analyze data** and deduce information about a **population** from a given **sample**.

• It is a three-step process:

1. Sampling and design of the experiment: take a sample (or many) from the population, make observations about the sample, and turn them into numerical data.

2. Descriptive statistics: analyze the data to get information about the sample.

3. Statistical inference: from the data, deduce information about the whole population.

• Context is crucial in this process.
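
The three steps can be sketched end to end in a few lines of Python. This is only an illustration under stated assumptions: the population is synthetic (heights generated at random), the sample size of 50 is arbitrary, and the names `population`, `sample`, `estimate` and `standard_error` are hypothetical, not part of the course material.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: heights in cm of 10,000 people (synthetic values).
population = [rng.gauss(170, 10) for _ in range(10_000)]

# Step 1. Sampling: draw 50 individuals at random, without replacement.
sample = rng.sample(population, 50)

# Step 2. Descriptive statistics: summarize the sample itself.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation

# Step 3. Statistical inference: use the sample mean as a point estimate
# of the population mean, with its standard error as a measure of uncertainty.
estimate = sample_mean
standard_error = sample_sd / len(sample) ** 0.5

print(f"sample mean = {sample_mean:.1f} cm, sd = {sample_sd:.1f} cm")
print(f"estimated population mean = {estimate:.1f} cm (SE {standard_error:.1f})")
```

In a real study the population is not available as a list, which is exactly why step 3 is needed; here it is simulated only so that the estimate can be compared with the true value.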

A population is a set of individuals (people, cases, etc.) that we want to analyze.

A sample is a subset of the population.

A **variable** is an aspect or characteristic of the population that we want to study.

Choosing a sample is a delicate task: the sample must be representative of the population (context is important).

• How big should a sample be? It depends on the situation. A common rule of thumb is that a sample should have at least 30 individuals.

When choosing a sample, there are several strategies: **random sampling**, stratified sampling, cluster sampling, etc.
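
The three strategies named above could be sketched as follows. This is a minimal illustration, not a definitive implementation: the population, the `region` strata, the `household` clusters and all function names are hypothetical, and the sizes are chosen so that each strategy returns 30 individuals.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100 individuals, each tagged with a region
# (used as stratum) and a household of 5 people (used as cluster).
population = [
    {"id": i, "region": "north" if i < 60 else "south", "household": i // 5}
    for i in range(100)
]


def simple_random_sample(pop, n):
    """Draw n individuals, each equally likely, without replacement."""
    return rng.sample(pop, n)


def stratified_sample(pop, key, fraction):
    """Sample the same fraction from every stratum defined by `key`."""
    strata = {}
    for person in pop:
        strata.setdefault(person[key], []).append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, round(len(members) * fraction)))
    return sample


def cluster_sample(pop, key, n_clusters):
    """Pick whole clusters at random and keep everyone inside them."""
    clusters = sorted({person[key] for person in pop})
    chosen = set(rng.sample(clusters, n_clusters))
    return [person for person in pop if person[key] in chosen]


srs = simple_random_sample(population, 30)
strat = stratified_sample(population, "region", 0.3)
clus = cluster_sample(population, "household", 6)

print(len(srs), len(strat), len(clus))
```

Stratified sampling guarantees that each region appears in the sample in the same proportion as in the population, while cluster sampling is cheaper to carry out but keeps only a few households in full.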

The variables are just questions that we ask the population.

The variables should be neutral. Neither of the following questions is:

• Are you in favor of the illegal one-sided declaration of independence of the autonomous region of Catalunya from the great Spanish nation?

• Are you in favor of the historically legitimated declaration of independence of the great nation of Catalunya from the oppressive Spanish state?

• The variables must serve a purpose: if I am interested in the income per family, is it necessary to ask about music preferences?

• The possible answers of a variable must be very clear from the beginning.

• There are two types of variables, depending on the kind of **answer**:

• Quantitative or numerical: the answer is a number.

• Qualitative or categorical: the answer is a label (category).

• Quantitative variables can be of two types:

• Discrete: the answers are obtained by counting.

• Continuous: the answers are obtained by measuring.
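
As a small illustration of this classification, the sketch below tags three variables of a made-up survey with their type and treats them accordingly. The variable names (`siblings`, `height_cm`, `eye_color`) and all values are hypothetical.

```python
# Hypothetical survey answers from four individuals (made-up values).
sample = [
    {"siblings": 2, "height_cm": 171.5, "eye_color": "brown"},
    {"siblings": 0, "height_cm": 164.2, "eye_color": "blue"},
    {"siblings": 1, "height_cm": 180.0, "eye_color": "brown"},
    {"siblings": 3, "height_cm": 158.7, "eye_color": "green"},
]

# Assumed classification, following the definitions above.
variable_types = {
    "siblings": "quantitative discrete",     # obtained by counting
    "height_cm": "quantitative continuous",  # obtained by measuring
    "eye_color": "qualitative",              # the answer is a label
}


def is_quantitative(name):
    """Return True when the variable's answers are numbers."""
    return variable_types[name].startswith("quantitative")


for name, kind in variable_types.items():
    values = [person[name] for person in sample]
    if is_quantitative(name):
        # Arithmetic such as averaging makes sense for numerical answers.
        print(f"{name} ({kind}): mean = {sum(values) / len(values):.2f}")
    else:
        # For labels we can only list or count the categories.
        print(f"{name} ({kind}): categories = {sorted(set(values))}")
```

The type of a variable decides which tools apply later: averages need numbers, whereas categories can only be counted.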

The sample together with the variable produces some data (that is, the values that the variable takes on each **individual** of the sample). Now we have to analyze these data. We have some tools at our disposal:

- Frequency tables (organize the data)

- Graphic representations of data (visually represent the data)

- Descriptive statistics (measure features of the data)
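
The three tools can be tried on a tiny data set using only Python's standard library. This is a sketch under assumptions: the data (number of siblings reported by ten individuals) are made up, and a row of `#` characters stands in for a real bar chart.

```python
from collections import Counter
import statistics

# Hypothetical data: number of siblings reported by 10 individuals.
data = [0, 1, 1, 2, 2, 2, 3, 1, 0, 2]

# Frequency table: absolute and relative frequency of each observed value.
counts = Counter(data)
n = len(data)
freq_table = {value: (counts[value], counts[value] / n) for value in sorted(counts)}

print("value  abs   rel")
for value, (absolute, relative) in freq_table.items():
    # The run of '#' is a crude text substitute for a bar chart.
    print(f"{value:>5} {absolute:>4} {relative:>5.2f}  {'#' * absolute}")

# Descriptive statistics: numbers that summarize features of the data.
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
stdev = statistics.stdev(data)  # sample standard deviation

print(f"mean = {mean}, median = {median}, mode = {mode}, stdev = {stdev:.2f}")
```

The frequency table organizes the raw answers, the bars make the distribution visible at a glance, and the summary numbers condense it into a few values.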