Ridge regression

Ridge regression is a variation of linear regression that helps prevent overfitting. It does this by adding a penalty on the size of the coefficients to the error that the model minimizes, which keeps the coefficients from becoming too large. This penalty reduces the influence of less important variables, making the model simpler and more stable. In effect, Ridge "pulls" the coefficients closer to zero without completely eliminating them.
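As a sketch of what this penalty looks like, the textbook form of the ridge objective is shown below. The symbols are assumptions introduced here for illustration: y_i is the target of sample i, x_ij is feature j of sample i, beta_0 is the intercept, beta_j are the coefficients, and alpha is the regularization strength (the role played by the aAlfa constructor parameter). This page does not state whether the component scales the error term or treats the intercept differently, so read this as the standard formulation rather than the exact internal one.

```latex
\hat{\beta} = \arg\min_{\beta}\;
  \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\,\beta_j \Bigr)^{2}
  \;+\; \alpha \sum_{j=1}^{p} \beta_j^{2}
```

With alpha = 0 the second term vanishes and the problem reduces to ordinary least squares. Larger values of alpha make large coefficients more costly, which is what shrinks them toward zero.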

This technique is especially useful when many variables are correlated with each other or when the data is limited. By controlling the impact of less relevant variables, Ridge regression creates models that perform better in practical situations, helping to predict outcomes more accurately on new data.
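One way to see why correlated variables are handled well is the standard closed-form ridge solution, sketched below. It assumes centered data, with X the n-by-p feature matrix, y the target vector, I the p-by-p identity matrix, and alpha > 0 the regularization strength. These symbols are introduced here for illustration, and the page does not say which solver the component actually uses.

```latex
\hat{\beta} = \left( X^{\top} X + \alpha I \right)^{-1} X^{\top} y
```

When columns of X are strongly correlated, X^T X is close to singular and plain least squares becomes numerically unstable. Adding alpha I makes the matrix positive definite for any alpha > 0, so the inverse always exists and the coefficients stay bounded.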

Training is quick, and once it finishes, the dataset is no longer needed to make predictions. The trained model can be saved to, and later loaded from, a very lightweight file.
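A minimal sketch of that save-and-reload workflow follows, using only the methods listed in the guide on this page (Train, SaveToFile, and the constructor that takes a trained file). The model file path and its extension are hypothetical, chosen only for illustration, and Vcl.Dialogs assumes a VCL application.

```pascal
uses
  System.SysUtils, Vcl.Dialogs, URidgeRegression;

procedure ExampleSaveAndReload;
const
  cTrainingCsv = 'C:\Delphai\Delphai\Datasets\Housing Price.csv';
  // Hypothetical output path; the expected file extension is not documented here.
  cModelFile = 'C:\Temp\HousingRidge.model';
var
  vTrained, vLoaded: TRidgeRegression;
begin
  // Step 1: train once and persist the result.
  vTrained := TRidgeRegression.Create(0.1);
  try
    vTrained.Train(cTrainingCsv, True);
    vTrained.SaveToFile(cModelFile);
  finally
    vTrained.Free;
  end;

  // Step 2: later (for example in another application), load the trained
  // file and predict without needing the original dataset.
  vLoaded := TRidgeRegression.Create(cModelFile);
  try
    ShowMessage('House price: ' + FormatCurr('##0.00',
      vLoaded.Predict([2459, 1, 1, 1964, 3.1047807561601664, 0, 4])));
  finally
    vLoaded.Free;
  end;
end;
```

Keeping training and prediction in separate objects mirrors the intended deployment: only the small trained file needs to ship with the application.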

EXAMPLE OF USE

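uses
  System.SysUtils,  // FormatCurr
  Vcl.Dialogs,      // ShowMessage (use FMX.Dialogs in a FireMonkey project)
  URidgeRegression;

procedure ExampleRidgeRegression;
var
  vRidge: TRidgeRegression;
begin
  // Create the model with a regularization parameter (alpha) of 0.1.
  vRidge := TRidgeRegression.Create(0.1);
  try
    // Train from a CSV file; True indicates that the first row is a header.
    vRidge.Train('C:\Delphai\Delphai\Datasets\Housing Price.csv', True);
    // Predict the price for a single sample and display it.
    ShowMessage('House price: ' + FormatCurr('##0.00', vRidge.Predict([2459, 1, 1, 1964, 3.1047807561601664, 0, 4])));
  finally
    // Always release the model, even if training or prediction fails.
    vRidge.Free;
  end;
end;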

CLASSES AND METHODS GUIDE

TRidgeRegression

FDataset : TAIDatasetClassification; : gives access to the dataset currently held by the model.
procedure ClearDataset; : clears the dataset.
constructor Create(aAlfa : Double); : creates a new object to be trained.
- aAlfa : A regularization parameter that controls the penalization of coefficients, balancing model fitting and complexity to prevent overfitting. In other words, it acts as a "brake" that helps prevent the model from becoming too fitted to the data, making predictions more reliable. The larger the alpha, the more simplified the model becomes.
constructor Create(aTrainedFile : String); overload; : creates the object by loading the structure from an already trained file.
procedure Train(aTrainingData : String; aHasHeader : Boolean = True); overload; : performs the training with the provided data.
- aTrainingData : CSV file path.
- aHasHeader : indicates whether the file has a header.
procedure Train(aTrainingData : TDataSet); overload; : performs the training with the provided data.
- aTrainingData : TDataSet object that contains the data that will be used for training.
function Predict(aSample: TArray<Double>; aInputNormalized : Boolean = False): Double; : predicts the value for the sample.
- aSample : sample to be analyzed.
- aInputNormalized : set to True only if the input data has already been normalized outside the component (which should be done using the range saved in the object).
procedure LoadFromFile(const FileName: string); : loads the trained file.
procedure SaveToFile(const FileName: string); : saves the trained file.
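
For the aInputNormalized parameter of Predict, the formula below sketches what normalizing a sample outside the component could look like. It assumes that the "range saved in the object" is a per-feature minimum and maximum taken from the training data (min-max scaling). That is an assumption, since this page does not describe the normalization method or how the range is exposed.

```latex
x'_j = \frac{x_j - \min_j}{\max_j - \min_j}
```

Here x_j is the raw value of feature j in the sample, and min_j and max_j are the smallest and largest values of that feature seen during training. If the sample is passed without normalization, leave aInputNormalized at its default of False.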

