Beginner tutorial: Linear Regression in R
Linear regression is a common technique for modelling the relationship between a predictor and an outcome. For example, say you are trying to predict a car's acceleration (0 to 100 km/h) based on its engine horsepower.
You might have sample data like the following:
Car horsepower    Acceleration per second
100               120
110               150
120               160
150               170
200               x
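Given a car with 200 horsepower, what would its acceleration (the value x) be?
Linear regression fits a straight line that passes as close as possible to the data points. It can be represented with the following equation, where y is the outcome, x is the predictor, a is the slope and b is the intercept: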
y = ax + b
Let's use R to help us with the prediction.
The model is created with the R function lm.
lm(y ~ x) means y is predicted using the term x. In our case, y is the acceleration and x is the horsepower.
Let's try to predict using the model above.
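As a minimal sketch of the lm call described above, the following fits a model to the sample table. The names cars_df, horsepower, acceleration and model are assumptions chosen for illustration and do not come from the original listing.

```r
# Sample data copied from the table above. The 200 hp row is left out
# because its acceleration is the unknown we want to predict.
cars_df <- data.frame(
  horsepower   = c(100, 110, 120, 150),
  acceleration = c(120, 150, 160, 170)
)

# Fit acceleration as a linear function of horsepower.
model <- lm(acceleration ~ horsepower, data = cars_df)

# The intercept corresponds to b and the horsepower coefficient to a
# in y = ax + b.
print(coef(model))
```

Passing the data frame through the data argument keeps the formula readable and lets the same column names be reused later when predicting.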
As you can see, given a 200 horsepower engine, the model predicts a 0 to 100 km/h acceleration time of 5 seconds.
Some other terms that might be of interest:
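A prediction step of this kind can be sketched with R's predict function. This is an illustration under assumptions: it refits the model on the sample table values exactly as given, and the names cars_df, model, new_car and predicted are hypothetical. The number it prints depends entirely on the data the model was fitted to, so with these values it is on the same scale as the table's acceleration column. Note also that 200 horsepower lies outside the observed range of 100 to 150, so the prediction is an extrapolation.

```r
# Refit the model so this block runs on its own.
cars_df <- data.frame(
  horsepower   = c(100, 110, 120, 150),
  acceleration = c(120, 150, 160, 170)
)
model <- lm(acceleration ~ horsepower, data = cars_df)

# newdata must be a data frame whose column name matches the predictor
# used in the formula.
new_car <- data.frame(horsepower = 200)

# Evaluate the fitted line at horsepower = 200.
predicted <- predict(model, newdata = new_car)
print(predicted)
```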
a) R-squared - a measure of how closely the data fit the regression line. 0% means the regression line explains none of the variation in Y. 100% means all of the variation in Y is explained by the line.
b) F-statistic - in regression, the F-statistic is used to test whether the model fits the data better than a model with no predictors.
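Both quantities can be read from the object returned by summary. The sketch below assumes the same sample table values, and the variable names are hypothetical. With only four data points these statistics are very rough, so treat the output as a demonstration of where the values live rather than as a meaningful result.

```r
# Refit the model so this block runs on its own.
cars_df <- data.frame(
  horsepower   = c(100, 110, 120, 150),
  acceleration = c(120, 150, 160, 170)
)
model <- lm(acceleration ~ horsepower, data = cars_df)

fit_summary <- summary(model)

# R-squared as a proportion between 0 and 1 (multiply by 100 for a percentage).
r_squared <- fit_summary$r.squared

# Named vector holding the F value and its two degrees of freedom.
f_stat <- fit_summary$fstatistic

# p-value of the overall F test against a model with no predictors.
p_value <- pf(f_stat[["value"]], f_stat[["numdf"]], f_stat[["dendf"]],
              lower.tail = FALSE)

print(r_squared)
print(f_stat)
print(p_value)
```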