A regression problem is when the output variable is a real value, such as “dollars” or “weight”. In another words, when the output variable is continuous.Regression is a method of modelling a target value based on independent predictors.

This method is mostly used for forecasting and finding out cause and effect relationship between variables. Regression techniques mostly differ based on the number of independent variables and the type of relationship between the independent and dependent variables.Simple linear regression is a type of regression analysis where the number of independent variables is one and there is a linear relationship between the independent(x) and dependent(y) variable. The red line in the above graph is referred to as the best fit straight line.

Based on the given data points, we try to plot a line that models the points the best. The line can be modelled based on the linear equation shown below.y = a_0 + a_1 * xThe motive of the linear regression algorithm is to find the best values for a_0 and a_1. To better understand linear regression, two important concepts must be known.i.

Cost Function:The cost function is used to figure out the best possible values for a_0 and a_1 which would provide the best fit line for the data points. Since the best values are wanted for a_0 and a_1, this search problem is converted into a minimization problem where it would like to be minimized the error between the predicted value and the actual value. Minimization and Cost functions are in below: The difference between the predicted values and ground truth measures the error difference. The error difference is squared and sum over all data points and divided that value by the total number of data points.

This provides the average squared error over all the data points. Therefore, this cost function is also known as the Mean Squared Error (MSE) function. Using MSE function a_0 and a_1 values are going to be changed such that the MSE value settles at the minima.ii.

Gradient Descent:Gradient descent is a method of updating a_0 and a_1 to reduce the cost function (MSE). The idea is that it is started with some values for a_0 and a_1, then these values are changed iteratively to reduce the cost. Gradient descent helps how to change the values.