Numerical Variables

A numerical or continuous variable (attribute) is one that may take on any value within a finite or infinite interval (e.g., height, weight, temperature, blood glucose). There are two types of numerical variables: interval and ratio. An interval variable has values whose differences are interpretable, but it does not have a true zero. A good example is temperature in degrees Celsius. Data on an interval scale can be added and subtracted but cannot be meaningfully multiplied or divided. For example, we cannot say that a 20 °C day is twice as hot as a 10 °C day. In contrast, a ratio variable has a true zero, so its values can be added, subtracted, multiplied, or divided (e.g., weight: 10 kg is twice as heavy as 5 kg).
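The difference between the two scales can be made concrete with a small sketch. The temperatures below are made-up values, and the helper `celsius_to_kelvin` is a hypothetical name introduced only for this illustration. It converts Celsius (interval scale) to Kelvin (ratio scale, with a true zero) and compares differences and ratios on both.

```python
# Illustrative values only: two daily temperatures in degrees Celsius.
KELVIN_OFFSET = 273.15  # 0 °C expressed in kelvins

def celsius_to_kelvin(celsius: float) -> float:
    """Convert from the Celsius interval scale to the Kelvin ratio scale."""
    return celsius + KELVIN_OFFSET

day_a_c, day_b_c = 10.0, 20.0

# Differences are meaningful on an interval scale and survive the conversion.
diff_c = day_b_c - day_a_c
diff_k = celsius_to_kelvin(day_b_c) - celsius_to_kelvin(day_a_c)

# Ratios are not: the Celsius ratio depends on where zero was placed.
ratio_c = day_b_c / day_a_c
ratio_k = celsius_to_kelvin(day_b_c) / celsius_to_kelvin(day_a_c)

print(f"difference: {diff_c:.1f} °C, {diff_k:.1f} K")
print(f"ratio on Celsius scale: {ratio_c:.2f}")
print(f"ratio on Kelvin scale:  {ratio_k:.3f}")
```

The difference is 10 degrees on both scales, but the ratio drops from 2.00 on the Celsius scale to about 1.035 on the Kelvin scale, which is why "twice as hot" is not a meaningful statement for Celsius values.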
 
Univariate Analysis - Numerical

| Statistics | Visualization | Equation | Description |
|---|---|---|---|
| Count | Histogram | N | The number of values (observations) of the variable. |
| Minimum | Box Plot | Min | The smallest value of the variable. |
| Maximum | Box Plot | Max | The largest value of the variable. |
| Mean | Box Plot | | The sum of the values divided by the count. |
| Median | Box Plot | | The middle value of the sorted data. An equal number of values lies below and above the median. |
| Mode | Histogram | | The most frequent value. There can be more than one mode. |
| Quantile | Box Plot | | A set of 'cut points' that divide a set of data into groups containing equal numbers of values (quartiles, quintiles, percentiles, ...). |
| Range | Box Plot | Max-Min | The difference between the maximum and the minimum. |
| Variance | Histogram | | A measure of data dispersion, based on the squared deviations of the values from the mean. |
| Standard Deviation | Histogram | | The square root of the variance. |
| Coefficient of Variation | Histogram | | The standard deviation divided by the mean, a relative measure of data dispersion. |
| Skewness | Histogram | | A measure of symmetry or asymmetry in the distribution of data. |
| Kurtosis | Histogram | | A measure of whether the data are peaked or flat relative to a normal distribution. |
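The statistics listed above can be sketched with Python's standard library. This is a minimal illustration under stated assumptions, not a definitive implementation: the `values` list is a hypothetical sample, the variance is the sample variance (dividing by N - 1), and skewness and kurtosis use the simple moment-based (population) form. Other tools, Microsoft Excel included, apply small-sample corrections to skewness and kurtosis, so their results can differ slightly.

```python
import math
import statistics as st

# Hypothetical sample, chosen only to keep the arithmetic easy to follow.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

count = len(values)
minimum, maximum = min(values), max(values)
mean = st.mean(values)
median = st.median(values)
modes = st.multimode(values)            # a list, since there can be several modes
quartiles = st.quantiles(values, n=4)   # three cut points: Q1, Q2, Q3
value_range = maximum - minimum
variance = st.variance(values)          # sample variance (divides by N - 1)
std_dev = st.stdev(values)              # square root of the sample variance
coeff_var = std_dev / mean

# Central moments for the moment-based skewness and excess kurtosis.
# Assumption: population form, without small-sample correction.
m2 = sum((x - mean) ** 2 for x in values) / count
m3 = sum((x - mean) ** 3 for x in values) / count
m4 = sum((x - mean) ** 4 for x in values) / count
skewness = m3 / m2 ** 1.5
kurtosis = m4 / m2 ** 2 - 3.0           # 0 for a normal distribution

print(f"count={count} min={minimum} max={maximum} range={value_range}")
print(f"mean={mean} median={median} modes={modes} quartiles={quartiles}")
print(f"variance={variance:.3f} std={std_dev:.3f} cv={coeff_var:.1%}")
print(f"skewness={skewness:.3f} kurtosis={kurtosis:.3f}")
```

A positive skewness indicates a longer right tail, and a negative excess kurtosis indicates a flatter shape than a normal distribution.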
 
Box plot and histogram for the "sepal length" variable from the Iris dataset.
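Both plots are drawn from simple summaries of the data: a box plot from the minimum, quartiles, and maximum, and a histogram from counts of values in equal-width bins. The sketch below computes those summaries without plotting. The function names and the sample are hypothetical, and the "inclusive" quantile method is an assumption, since several quartile conventions are in use.

```python
import statistics as st

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a box plot draws."""
    q1, q2, q3 = st.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

def histogram_counts(values, bins):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    low, high = min(values), max(values)
    width = (high - low) / bins
    if width == 0:
        # All values are identical, so everything falls into the first bin.
        return [len(values)] + [0] * (bins - 1)
    counts = [0] * bins
    for x in values:
        # The maximum would index one past the end, so clamp it into the last bin.
        index = min(int((x - low) / width), bins - 1)
        counts[index] += 1
    return counts

# Hypothetical measurements, not the Iris data.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]

summary = five_number_summary(sample)
counts = histogram_counts(sample, bins=4)

print("five-number summary:", summary)
print("histogram counts:", counts)
```

Values that fall exactly on a bin edge are assigned to the bin on their right, except for the maximum, which is kept in the last bin.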
 

 

 

Example:

Statistical analysis using Microsoft Excel (Iris.xls)

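| Statistic | sepal length |
|---|---|
| Count | 150 |
| Minimum | 4.3 |
| Maximum | 7.9 |
| Mean | 5.84 |
| Median | 5.8 |
| Mode | 5 |
| Quartile 1 | 5.1 |
| Range | 3.6 |
| Variance | 0.69 |
| Standard Deviation | 0.83 |
| Coefficient of Variation | 14.2% |
| Skewness | 0.31 |
| Kurtosis | -0.55 |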
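Several of the reported values are derived from others, so they can be cross-checked with a few lines of arithmetic. The sketch below uses only the rounded numbers from the example above, so agreement is expected only to the displayed precision.

```python
import math

# Values as reported in the example above (already rounded).
minimum, maximum = 4.3, 7.9
mean = 5.84
variance = 0.69

value_range = maximum - minimum   # Range = Max - Min
std_dev = math.sqrt(variance)     # Standard deviation = square root of variance
coeff_var = std_dev / mean        # Coefficient of variation = std / mean

print(f"Range: {value_range:.1f}")                    # Range: 3.6
print(f"Standard deviation: {std_dev:.2f}")           # Standard deviation: 0.83
print(f"Coefficient of variation: {coeff_var:.1%}")   # Coefficient of variation: 14.2%
```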