Classification
Classification assigns data to predefined categories. From the training data provided, the model learns patterns that indicate which category an item belongs to.
For example, imagine you work at a bank and want to predict whether a customer is "reliable" or "risky" for a loan. The model can use information such as age, income, payment history, and profession to determine which category the customer falls into.
Once trained, a classification model can support faster and more consistent decisions, which helps reduce errors and streamline processes. The technique is widely used, for example to predict diseases in patients, identify spam emails, and predict the outcome of sports matches.
EXAMPLE OF USE
The following example determines the species of an Iris flower using the well-known "Iris" dataset.
To generate the file with all the necessary configurations for later use:
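A minimal sketch of this step, using only the methods listed in the guide below. The file paths are hypothetical, passing a full file name to FindBestModel is an assumption, and the unit that declares TEasyAIClassification is not named in this section, so it must be added to the uses clause before the program compiles.

```pascal
program TrainIrisModel;

{$APPTYPE CONSOLE}

uses
  System.SysUtils
  { , plus the EasyAI unit that declares TEasyAIClassification };

var
  vModel : TEasyAIClassification;
begin
  vModel := TEasyAIClassification.Create;
  try
    // Hypothetical path: a CSV file with a header row.
    vModel.LoadDataset('C:\Data\iris.csv', True);

    // Hypothetical path: where the configuration of the best model is saved.
    // The remaining parameters keep their defaults (tmStandard, all CPU threads).
    vModel.FindBestModel('C:\Data\iris_model.txt');
  finally
    vModel.Free;
  end;
end.
```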
When the "FindBestModel" procedure completes, the file is created in the directory specified as a parameter, and an alert message indicates whether the database must be loaded again before the model is used.
To use the model from the generated file:
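A minimal sketch of this step, under the same assumptions as before: hypothetical paths, and the unit that declares TEasyAIClassification added to the uses clause. The sample values and their column order are illustrative only and must match the property columns of the dataset used for training.

```pascal
program PredictIris;

{$APPTYPE CONSOLE}

uses
  System.SysUtils
  { , plus the EasyAI unit that declares TEasyAIClassification };

var
  vModel  : TEasyAIClassification;
  vSample : TArray<Double>;
begin
  vModel := TEasyAIClassification.Create;
  try
    // Hypothetical path: the file produced earlier by FindBestModel.
    vModel.LoadFromFile('C:\Data\iris_model.txt');

    // Only needed if the alert shown after FindBestModel said the database
    // must be loaded again. Calling it after LoadFromFile is an assumption.
    // vModel.LoadDataset('C:\Data\iris.csv', True);

    // One value per property column, without the result column.
    vSample := TArray<Double>.Create(5.1, 3.5, 1.4, 0.2);

    Writeln('Predicted category: ', vModel.Predict(vSample));
  finally
    vModel.Free;
  end;
end.
```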
The database structure can be found here.
CLASSES AND METHODS GUIDE
TEasyAIClassification
constructor Create; : creates the object.
procedure LoadDataset(aDataSet : String; aHasHeader : Boolean = True); : loads the database for training or use in models that require it.
aDataSet : CSV file path.
aHasHeader : indicates whether the file has a header.
procedure LoadDataset(aDataSet : TDataSet); : loads the database for training or use in models that require it.
aDataSet : TDataSet object that contains the data that will be used for training.
procedure FindBestModel(aPathResultFile: String; aMode : TEasyTestingMode = tmStandard; aMaxThreads : Integer = 0; aCsvResultModels : String = ''; aLogFile : String = ''); : tests multiple model options to find and prepare the best one for use in predictions.
aPathResultFile : path where the configuration of the best model found is saved. Loading this file later with LoadFromFile prepares the object for predictions.
aMode : search mode to run. There are three options:
tmFast : tests only the most likely best models. It is the fastest mode;
tmStandard : tests the most likely best models, as well as exploring more extreme parameters;
tmExtensive : tests a large number of models, including those from other methods. It's the slowest mode.
aMaxThreads : optional, the maximum number of threads to be used simultaneously. If set to 0, it will use the number of threads available in the CPU.
aCsvResultModels : optional, the path where a CSV file will be saved containing each tested method along with its results.
aLogFile : optional, the path of the log file.
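As an illustration of the optional parameters, the sketch below runs a fast search limited to four threads and writes both the per-model results and a log. All paths are hypothetical, and the unit that declares TEasyAIClassification and TEasyTestingMode must be added to the uses clause.

```pascal
program TrainIrisModelWithOptions;

{$APPTYPE CONSOLE}

uses
  System.SysUtils
  { , plus the EasyAI unit that declares TEasyAIClassification };

var
  vModel : TEasyAIClassification;
begin
  vModel := TEasyAIClassification.Create;
  try
    vModel.LoadDataset('C:\Data\iris.csv', True);

    vModel.FindBestModel(
      'C:\Data\iris_model.txt',    // aPathResultFile (hypothetical path)
      tmFast,                      // aMode: fastest search
      4,                           // aMaxThreads: at most 4 threads
      'C:\Data\iris_models.csv',   // aCsvResultModels (hypothetical path)
      'C:\Data\iris_search.log');  // aLogFile (hypothetical path)
  finally
    vModel.Free;
  end;
end.
```

The CSV of tested models can help judge whether a slower mode such as tmExtensive is worth running.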
procedure LoadFromFile(aPath : String); : loads the file generated by "FindBestModel" and prepares the best model found for use.
function Predict(aSample : TArray<Double>) : Double; : predicts the best category for the sample.
aSample : sample to be analyzed. It must contain only the property values, not the result. For example, if the model was trained with 5 property columns + 1 result column, pass an array with the 5 property values.
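The sketch below shows one way to build such an array from a full data row. It does not use the library, and it assumes the result column is the last one, which this guide does not state. FeaturesFromRow is a hypothetical helper, not part of TEasyAIClassification.

```pascal
program BuildSample;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: returns the row without its last value,
// assuming the last value is the result column.
function FeaturesFromRow(const aRow : TArray<Double>) : TArray<Double>;
begin
  if Length(aRow) < 2 then
    raise EArgumentException.Create('Row needs at least one property and one result value');
  Result := Copy(aRow, 0, Length(aRow) - 1);
end;

var
  vRow, vSample : TArray<Double>;
begin
  // 5 property values followed by 1 result value (illustrative numbers).
  vRow := TArray<Double>.Create(35, 4200.0, 2, 1, 0, 1);
  vSample := FeaturesFromRow(vRow);

  // vSample now holds the 5 property values and can be passed to Predict.
  Writeln(Length(vSample)); // prints 5
end.
```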