Cost Function
What is the Cost Function?
A cost function is a machine learning function that measures the average difference between a model's predicted values and the actual values over the training set; training aims to minimize it. It evaluates the overall performance of the model's current parameters (weights and biases) by producing a single scalar value that represents the model's total error across the entire dataset.
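The definition above can be sketched as a small function. This is a minimal illustration of the Mean Squared Error cost (the function name and sample values are illustrative, not from the original text):

```python
def mse_cost(predictions, targets):
    """Average of squared differences between predicted and actual values."""
    n = len(targets)
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n

# Three samples: errors are 1, 0, and 1, so the cost is (1 + 0 + 1) / 3
print(mse_cost([2.0, 3.0, 5.0], [1.0, 3.0, 4.0]))  # ≈ 0.667
```

Note that the output is one scalar summarizing error over the whole set, which is exactly what an optimizer needs to minimize.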
What is the primary purpose of using a Cost Function in machine learning?
The primary purpose of a Cost Function is to provide an objective mathematical metric that an optimization algorithm, such as Gradient Descent, can minimize. By iteratively adjusting the model's parameters to reduce the numerical output value of the Cost Function, the machine learning model systematically decreases its error rate and improves its predictive accuracy.
How does a Cost Function theoretically differ from a Loss Function?
While the terms are frequently used interchangeably in practice, there is a strict theoretical distinction in machine learning literature.
- A Loss Function calculates the error for a single training data point.
- The Cost Function calculates the average or the sum of the individual loss values over the entire training dataset.
Therefore, for a dataset containing n samples, the Cost Function is the mathematical aggregation of n distinct loss calculations.
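The distinction can be made concrete in code. In this sketch (using hypothetical names), `squared_error` is the loss for one data point, and `cost` aggregates it over all n samples:

```python
def squared_error(y_pred, y_true):
    """Loss function: error for a single training data point."""
    return (y_pred - y_true) ** 2

def cost(preds, trues):
    """Cost function: the average of the n individual loss values."""
    losses = [squared_error(p, t) for p, t in zip(preds, trues)]
    return sum(losses) / len(losses)
```

Here the cost is literally the mean of n distinct loss calculations, matching the theoretical distinction above.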
How is a Cost Function applied when training a linear regression machine learning model?
In a linear regression model designed to predict real estate prices based on the square footage of properties, the model initially assigns random values to its weights, generating highly inaccurate price predictions.
To evaluate this, the model calculates the Mean Squared Error (the Cost Function) by averaging the squared differences between its predicted property prices and the actual historical prices in the training dataset. An optimization algorithm (Gradient Descent) then computes the mathematical derivative (gradient) of this Cost Function with respect to the model's weights. The weights are subsequently updated in the direction that decreases the Cost Function's value.
This cycle of prediction, Cost Function calculation, and weight updating repeats iteratively. The process terminates when the Cost Function converges to its minimum (for linear regression with Mean Squared Error, the cost surface is convex, so this minimum is global) and the error can no longer be meaningfully reduced. At that point, the model's parameters are finalized, and the model is designated as trained.
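The full training cycle described above can be sketched end to end. This is a minimal gradient-descent loop for the square-footage example; the data values, learning rate, and iteration count are illustrative assumptions, not taken from the original text:

```python
# Toy training set: square footage (scaled) -> observed price (scaled)
sqft  = [1.0, 2.0, 3.0, 4.0]
price = [2.1, 3.9, 6.2, 7.8]

w, b = 0.0, 0.0        # initial (arbitrary) parameters
lr = 0.05              # learning rate
n = len(sqft)

for _ in range(2000):
    # 1. Predict with the current parameters
    preds = [w * x + b for x in sqft]
    # 2. Gradients of the MSE cost with respect to w and b
    dw = (2 / n) * sum((p - y) * x for p, y, x in zip(preds, price, sqft))
    db = (2 / n) * sum(p - y for p, y in zip(preds, price))
    # 3. Update the weights in the direction that decreases the cost
    w -= lr * dw
    b -= lr * db

final_cost = sum((w * x + b - y) ** 2 for x, y in zip(sqft, price)) / n
```

After training, the learned slope `w` approaches the least-squares fit for this data, and `final_cost` settles near its minimum, illustrating the predict/evaluate/update loop converging.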