R-squared is a measure of how well a linear regression model fits the data. It can be interpreted as the proportion of variance of the outcome Y explained by the linear regression model.
It is a number between 0 and 1 (0 ≤ R² ≤ 1). The closer its value is to 1, the more variability the model explains. And R² = 0 means that the model cannot explain any variability in the outcome Y.
On the other hand, the correlation coefficient r is a measure that quantifies the strength of the linear relationship between 2 variables.
r is a number between -1 and 1 (-1 ≤ r ≤ 1):
A value of r close to -1: indicates a negative linear relationship between the 2 variables (when one increases, the other decreases, and vice versa)
A value of r close to 0: indicates that the 2 variables are not linearly correlated (no linear relationship exists between them)
A value of r close to 1: indicates a positive linear relationship between the 2 variables (when one increases, the other increases too)
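As a quick sanity check, here is a small Python sketch (the data points are made up for illustration) showing how r behaves for positively and negatively related variables, using NumPy's `corrcoef`:

```python
import numpy as np

# Two variables with a clear positive linear relationship
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])  # roughly y ≈ 2x

# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal
# entry is Pearson's r between x and y
r = np.corrcoef(x, y)[0, 1]
print(round(r, 3))  # → 0.999, a strong positive linear relationship

# Negating y flips the sign of r but not its magnitude
r_neg = np.corrcoef(x, -y)[0, 1]
print(round(r_neg, 3))  # → -0.999
```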
Here are 3 plots that show the relationship between 2 variables with different correlation coefficients:
The left one was drawn with a coefficient r = 0.80
The middle one with r = -0.09
And the right one with r = -0.76:
Below we will discuss the relationship between r and R2 in the context of linear regression without diving too deep into the mathematical details.
We start with the special case of a simple linear regression and then discuss the more general case of a multiple linear regression.
R-squared vs r in the case of a simple linear regression
We’ve seen that both r and R-squared measure the strength of the linear relationship between 2 variables, so how do they relate in the case of a simple linear regression?
When we’re dealing with a simple linear regression:
Y = β0 + β1X + ε
R-squared will be the square of the correlation between the independent variable X and the outcome Y:
R² = Cor(X, Y)²
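We can verify this numerically with a short Python sketch (made-up data; NumPy's `polyfit` stands in here for whatever fitting routine you prefer):

```python
import numpy as np

# Simple linear regression via least squares: Y = b0 + b1*X
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.8, 4.3, 5.9, 8.2, 9.7])

b1, b0 = np.polyfit(x, y, deg=1)  # slope and intercept
y_hat = b0 + b1 * x               # fitted values

# R-squared from the sums of squares: 1 - SS_res / SS_tot
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

# Square of the correlation between X and Y
r = np.corrcoef(x, y)[0, 1]

# In simple linear regression the two agree (up to machine precision)
assert np.isclose(r_squared, r ** 2)
```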
R-squared vs r in the case of multiple linear regression
In simple linear regression we had 1 independent variable X and 1 dependent variable Y, so calculating the correlation between X and Y was no problem.
In multiple linear regression we have more than 1 independent variable, so there is no single correlation r to compute between the predictors and Y.
When dealing with multiple linear regression:
Y = β0 + β1X1 + β2X2 + β3X3 + β4X4 + … + ε
R-squared will be the square of the correlation between the predicted/fitted values of the linear regression (Ŷ) and the outcome (Y):
R² = Cor(Ŷ, Y)²
Note that in the special case of simple linear regression, Ŷ is a linear function of X, so Cor(X, Ŷ) = ±1 and therefore Cor(Ŷ, Y) = ±Cor(X, Y).
Which is why, in that special case: R² = Cor(Ŷ, Y)² = Cor(X, Y)²
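A short Python sketch can confirm this identity for a multiple regression fitted by ordinary least squares (the data here are randomly generated, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two predictors plus noise: Y = 1 + 2*X1 - 3*X2 + eps
n = 50
X = rng.normal(size=(n, 2))
y = 1 + 2 * X[:, 0] - 3 * X[:, 1] + rng.normal(scale=0.5, size=n)

# Fit by ordinary least squares with an intercept column
A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ beta

# R² two ways: 1 - SS_res/SS_tot, and Cor(Ŷ, Y)²
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2_ss = 1 - ss_res / ss_tot
r2_cor = np.corrcoef(y_hat, y)[0, 1] ** 2

# The two definitions agree for an OLS fit with an intercept
assert np.isclose(r2_ss, r2_cor)
```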
The correlation, denoted by r, measures the amount of linear association between two variables. r is always between -1 and 1 inclusive. The R-squared value, denoted by R², is the square of the correlation.
R² is a measure of the proportion of total variation in the dependent variable that is accounted for by the independent variable. An R² of 1.0 indicates that the data perfectly fit the linear model.
Multiple R: The multiple correlation coefficient between three or more variables. R-Squared: This is calculated as (Multiple R)² and it represents the proportion of the variance in the response variable of a regression model that can be explained by the predictor variables. This value ranges from 0 to 1.
The coefficient of determination, r², is the square of the Pearson correlation coefficient r. So, for example, a Pearson correlation coefficient of 0.6 would result in a coefficient of determination of r² = 0.6 × 0.6 = 0.36.
R-squared is a statistical measure that indicates how much of the variation of a dependent variable is explained by an independent variable in a regression model.
Unlike correlation (R), which measures the strength of the association between two variables, R-squared indicates how much of the variation in the data is explained by the relationship between an independent variable and a dependent variable. The R² value ranges from 0 to 1 and is often expressed as a percentage.
Also known as the coefficient of determination, multiple R-squared is the proportion of the variation in the dependent variable that can be explained by the independent variables. It provides a measure of how well observed outcomes are replicated by the model.
Adjusted R-squared accounts for model complexity by comparing the sample size to the number of terms in your regression model. Regression models that have many samples per term produce a better R-squared estimate and require less shrinkage. Conversely, models that have few samples per term require more shrinkage to correct the bias.
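This shrinkage follows directly from the standard formula Adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the sample size and p the number of predictors. A small Python sketch with hypothetical numbers makes the effect visible:

```python
def adjusted_r_squared(r_squared, n_samples, n_predictors):
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r_squared) * (n_samples - 1) / (n_samples - n_predictors - 1)

# Same raw R² = 0.90 in both cases:
# many samples per term → little shrinkage
print(round(adjusted_r_squared(0.90, n_samples=100, n_predictors=2), 4))  # → 0.8979
# few samples per term → much more shrinkage
print(round(adjusted_r_squared(0.90, n_samples=12, n_predictors=8), 4))   # → 0.6333
```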
Both r and r2 are standardized effect size measures and the reliabilities of the measures of the variables have a strong influence on standardized effect size measures.
Pearson's r can range from −1 to 1. An r of −1 indicates a perfect negative linear relationship between variables, an r of 0 indicates no linear relationship between variables, and an r of 1 indicates a perfect positive linear relationship between variables.
Pearson correlation (r) measures the linear dependence between two variables (x and y). It is known as a parametric correlation test because it depends on the distribution of the data: the associated significance test assumes that x and y are approximately normally distributed. The fitted line y = f(x) is called the linear regression line.
The Pearson correlation coefficient, denoted by r, is a measure of the linear trend between two variables. The value of r ranges between -1 and 1. When r = 0, there is no linear association between the variables.
To calculate R² you need the sum of the squared residuals and the total sum of squares. Start off by finding the residuals, which are the vertical distances from each data point to the regression line: work out the predicted y value by plugging the corresponding x value into the regression-line equation, then subtract it from the observed y. Then R² = 1 − (sum of squared residuals) / (total sum of squares).
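That procedure can be sketched step by step in plain Python (the data and regression line below are hypothetical):

```python
# Step-by-step R² from residuals.
# Hypothetical data and regression line y = 0.5 + 2.0*x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.4, 4.6, 6.5, 8.4]
b0, b1 = 0.5, 2.0

# 1. Predicted y for each x from the regression-line equation
preds = [b0 + b1 * x for x in xs]

# 2. Residuals: observed minus predicted
residuals = [y - p for y, p in zip(ys, preds)]

# 3. Sum of squared residuals, and total sum of squares around the mean
ss_res = sum(r ** 2 for r in residuals)
mean_y = sum(ys) / len(ys)
ss_tot = sum((y - mean_y) ** 2 for y in ys)

# 4. R² = 1 - SS_res / SS_tot
r_squared = 1 - ss_res / ss_tot
print(round(r_squared, 4))  # → 0.9985
```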
r is the correlation coefficient. It is also known as the “Pearson product-moment correlation coefficient”, “PPMCC” or “PCC”, or “Pearson's r”. Multiple R is the “multiple correlation coefficient”. It is a measure of the goodness of fit of the regression model.
The coefficient of correlation is the "R" value given in the summary table in the regression output. R-square, also called the coefficient of determination, is obtained by multiplying R by itself. In other words, the coefficient of determination is the square of the coefficient of correlation.