Written by Pranit Dhanade
Spearman’s Rank Correlation Coefficient is a non-parametric statistical measure used to determine the strength and direction of a monotonic relationship between two variables.
Unlike Pearson correlation, which measures linear relationships using raw numerical observations, Spearman correlation operates on ranked data.
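For $n$ paired observations with no tied ranks, the coefficient is commonly written as:

$$
\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}
$$

Where $d_i$ is the difference between the two ranks of the $i$-th observation and $n$ is the number of paired observations.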
Spearman correlation evaluates whether the ordering of one variable matches the ordering of another variable.
If high values in one variable correspond to high values in another variable, then the coefficient $\rho$ is close to $+1$.
If high values correspond to low values, then $\rho$ is close to $-1$.
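The sign behaviour described above can be illustrated with a minimal Python sketch. It assumes the standard no-ties formula $\rho = 1 - \frac{6\sum d_i^2}{n(n^2-1)}$ and made-up data; the helper names `ranks` and `spearman_no_ties` are hypothetical and exist only for this illustration.

```python
def ranks(values):
    """Rank distinct values from 1 (smallest) to n (largest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman_no_ties(x, y):
    """Spearman's rho via 1 - 6*sum(d^2) / (n*(n^2 - 1)); assumes no ties."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


x = [1, 2, 3, 4, 5]
increasing = [v ** 3 for v in x]     # nonlinear but strictly increasing
decreasing = [-(v ** 3) for v in x]  # nonlinear but strictly decreasing

rho_up = spearman_no_ties(x, increasing)
rho_down = spearman_no_ties(x, decreasing)
print(rho_up, rho_down)  # prints: 1.0 -1.0
```

Both relationships are nonlinear, yet the orderings match (or reverse) exactly, so the coefficient reaches its extreme values.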
Spearman correlation is derived from Pearson correlation by replacing observations with ranks.
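A minimal sketch of this idea in plain Python: rank each variable, then apply the ordinary Pearson formula to the ranks. The function names and the sample data are hypothetical, and giving tied values the average of their positions is one common convention assumed here, not something the text above specifies.

```python
from math import sqrt


def average_ranks(values):
    """Assign ranks 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            result[order[k]] = mean_rank
        start = end + 1
    return result


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


hours = [2, 4, 6, 8, 10]       # hypothetical study hours
scores = [50, 65, 60, 80, 90]  # hypothetical exam scores
rho = spearman(hours, scores)
print(round(rho, 3))  # prints: 0.9
```

Only one pair of neighbouring scores is out of order, so the coefficient stays close to $+1$.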
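Let $R(x_i)$ and $R(y_i)$ denote the ranks of the $i$-th observations of the two variables, and let $d_i = R(x_i) - R(y_i)$. Assume there are no ties, so each set of ranks is a permutation of $1, 2, \dots, n$.

The mean rank from $1$ to $n$ is:

$$
\bar{R} = \frac{1}{n} \sum_{i=1}^{n} i = \frac{n+1}{2}
$$

The sum of squares of the first $n$ natural numbers is:

$$
\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}
$$

Therefore, writing $a_i = R(x_i) - \bar{R}$ and $b_i = R(y_i) - \bar{R}$, the sum of squared deviations is the same for both variables:

$$
\sum_{i=1}^{n} a_i^2 = \sum_{i=1}^{n} b_i^2 = \frac{n(n+1)(2n+1)}{6} - n \left( \frac{n+1}{2} \right)^2 = \frac{n(n^2 - 1)}{12}
$$

Since:

$$
d_i = R(x_i) - R(y_i) = a_i - b_i
$$

Squaring both sides:

$$
d_i^2 = (a_i - b_i)^2
$$

Expanding and summing over all $i$:

$$
\sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} a_i^2 + \sum_{i=1}^{n} b_i^2 - 2 \sum_{i=1}^{n} a_i b_i = \frac{n(n^2 - 1)}{6} - 2 \sum_{i=1}^{n} a_i b_i
$$

Rearranging for $\sum a_i b_i$ and substituting into Pearson’s formula gives:

$$
\rho = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2 \, \sum_{i=1}^{n} b_i^2}} = \frac{\frac{n(n^2 - 1)}{12} - \frac{1}{2} \sum_{i=1}^{n} d_i^2}{\frac{n(n^2 - 1)}{12}} = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}
$$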
Hence proved.
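As a quick sanity check, here is a small worked example with made-up ranks for $n = 5$ (illustrative only, not taken from any real dataset), using the no-ties formula $\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$. Suppose the ranks of the first variable are $(1, 2, 3, 4, 5)$ and the ranks of the second are $(2, 1, 4, 3, 5)$:

$$
d = (-1, 1, -1, 1, 0), \qquad \sum_{i=1}^{5} d_i^2 = 4
$$

$$
\rho = 1 - \frac{6 \cdot 4}{5(5^2 - 1)} = 1 - \frac{24}{120} = 0.8
$$

Two adjacent pairs are swapped, so the orderings agree closely but not perfectly, which is what a value of $0.8$ reflects.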
| Property | Description |
|---|---|
| Non-parametric | No normality assumption |
| Rank-based | Uses ordinal information |
| Robust | Less sensitive to outliers |
| Monotonic | Measures monotonic dependence |
In machine learning, Spearman correlation is used in feature selection, ranking systems, recommendation engines, and evaluation metrics.
In computational biology, it is applied in genomic sequencing, gene expression analysis, and biological ranking problems.
In finance, it is used for stock ranking, risk analysis, and ordinal economic modeling.
In survey and behavioral research, it is used for Likert-scale analysis, behavioral rankings, and survey statistics.
| Feature | Pearson | Spearman |
|---|---|---|
| Relationship | Linear | Monotonic |
| Uses Raw Data | Yes | No |
| Outlier Sensitivity | High | Lower |
| Normality Assumption | Needed only for standard significance tests | Not Required |
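The outlier row of the table can be illustrated with a short sketch. The data below are invented for the purpose, with one deliberately extreme point, and the helpers `pearson` and `ranks` are hypothetical names written for this example (the ranking helper assumes no ties).

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)


def ranks(values):
    """Rank distinct values from 1 to n (this sketch assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 8, 10, 1000]  # last point is a made-up extreme outlier

r_pearson = pearson(x, y)                  # computed on raw values
r_spearman = pearson(ranks(x), ranks(y))   # computed on ranks
print(round(r_pearson, 3), round(r_spearman, 3))
```

In this example the single outlier pulls Pearson's coefficient down to roughly 0.66, while the rank-based coefficient stays at 1 because the ordering of the points is unchanged.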
Spearman’s Rank Correlation Coefficient is a widely used non-parametric statistical tool for measuring monotonic dependence between variables.
Its robustness, computational simplicity, and applicability to ranked and nonlinear data make it highly valuable in Machine Learning, Statistics, Computational Biology, Finance, and Data Science.