K-Means

K-Means is a clustering algorithm that partitions data into K groups (clusters). It starts by choosing K initial points, called centroids, at random. Each data point is then assigned to its nearest centroid, and each centroid is moved to the mean position of the points assigned to it. These two steps repeat until the centroids stop changing significantly or a maximum number of iterations is reached. K-Means is widely used for its simplicity, but it can struggle with more complex datasets, such as those whose clusters are non-spherical or vary in size.
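The assign-and-update loop described above can be sketched in a few lines of Delphi. This is a minimal illustration for 2-D points, not the UKMeans implementation: the names TPoint2, TLabels, SqrDist, Nearest and RunKMeans are hypothetical, the stopping tolerance is an assumed parameter, and the demo starts from the first two samples instead of a random choice so that the result is reproducible.

```pascal
program KMeansSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TPoint2 = array[0..1] of Double;  // one 2-D sample (illustrative type)
  TLabels = array of Integer;       // cluster index for each sample

// Squared Euclidean distance between two points.
function SqrDist(const A, B: TPoint2): Double;
begin
  Result := Sqr(A[0] - B[0]) + Sqr(A[1] - B[1]);
end;

// Index of the centroid closest to P (ties go to the lower index).
function Nearest(const P: TPoint2; const Centroids: array of TPoint2): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to High(Centroids) do
    if SqrDist(P, Centroids[i]) < SqrDist(P, Centroids[Result]) then
      Result := i;
end;

// Repeats the assignment and update steps. Centroids holds the initial
// guesses on entry and the final positions on exit.
function RunKMeans(const Data: array of TPoint2;
  var Centroids: array of TPoint2;
  MaxIterations: Integer; Tolerance: Double): TLabels;
var
  Iter, i, c, Best: Integer;
  Sum: array of TPoint2;
  Count: array of Integer;
  Shift: Double;
  NewC: TPoint2;
begin
  SetLength(Result, Length(Data));
  SetLength(Sum, Length(Centroids));
  SetLength(Count, Length(Centroids));
  for Iter := 1 to MaxIterations do
  begin
    // Reset the per-cluster accumulators.
    for c := 0 to High(Centroids) do
    begin
      Sum[c][0] := 0;
      Sum[c][1] := 0;
      Count[c] := 0;
    end;
    // Assignment step: attach every point to its nearest centroid.
    for i := 0 to High(Data) do
    begin
      Best := Nearest(Data[i], Centroids);
      Result[i] := Best;
      Sum[Best][0] := Sum[Best][0] + Data[i][0];
      Sum[Best][1] := Sum[Best][1] + Data[i][1];
      Inc(Count[Best]);
    end;
    // Update step: move each non-empty centroid to the mean of its points.
    Shift := 0;
    for c := 0 to High(Centroids) do
      if Count[c] > 0 then
      begin
        NewC[0] := Sum[c][0] / Count[c];
        NewC[1] := Sum[c][1] / Count[c];
        Shift := Shift + SqrDist(NewC, Centroids[c]);
        Centroids[c] := NewC;
      end;
    // Stop once the total squared movement of the centroids is negligible.
    if Shift <= Tolerance then
      Break;
  end;
end;

const
  Samples: array[0..5] of TPoint2 =
    ((1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9));
var
  Centroids: array[0..1] of TPoint2;
  Cluster: TLabels;
  i: Integer;
begin
  // Stand-in for the random choice: start from the first two samples.
  Centroids[0] := Samples[0];
  Centroids[1] := Samples[1];
  Cluster := RunKMeans(Samples, Centroids, 100, 1e-9);
  for i := 0 to High(Cluster) do
    Write(Cluster[i], ' ');
  WriteLn;  // prints: 0 0 0 1 1 1
end.
```

Both starting centroids lie in the same group here, yet after the first update one of them is pulled toward the far points and the loop settles on the expected split after three passes. With a less fortunate start the loop can settle on a poorer grouping, which is why the algorithm is usually run from more than one initialization.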

EXAMPLE OF USE

uses
  UKMeans;
  
procedure ExampleClustering;
begin
  ShowArray(KMeans('C:\Delphai\Delphai\Datasets\Iris-Clustering.csv', 3, 500, 42));
end;

CLASSES AND METHODS GUIDE

  • function KMeans(aData: String; aK, aMaxIterations, aNumInitializations: Integer; aHasHeader : Boolean = True): TArray; or

  • function KMeans(aData: TDataSet; aK, aMaxIterations, aNumInitializations: Integer): TArray; : groups the data into clusters.

    • aData : path to a CSV file, or a TDataSet object, containing the data to be clustered.

    • aK (Number of Clusters) : defines the desired number of clusters in the dataset.

    • aMaxIterations (Maximum Number of Iterations) : limits the number of assign-and-update cycles used to adjust clusters and centroids.

    • aNumInitializations (Number of Initializations) : specifies how many times the algorithm is run, each with a different initialization.

    • aHasHeader : indicates whether the CSV file has a header row. It applies only to the file-path overload and defaults to True.
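Running the algorithm from several initializations only helps if the finished runs can be compared. This guide does not say how UKMeans chooses among them; a common criterion, assumed here, is to keep the run with the lowest within-cluster sum of squared distances. The sketch below scores two hand-written candidate results for the same four points. The Inertia function and all data are hypothetical and are not part of UKMeans.

```pascal
program BestOfRuns;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TPoint2 = array[0..1] of Double;  // one 2-D sample (illustrative type)

// Within-cluster sum of squared distances for one finished run:
// each point is measured against the centroid of its own cluster.
function Inertia(const Data, Centroids: array of TPoint2;
  const Cluster: array of Integer): Double;
var
  i: Integer;
begin
  Result := 0;
  for i := 0 to High(Data) do
    Result := Result
      + Sqr(Data[i][0] - Centroids[Cluster[i]][0])
      + Sqr(Data[i][1] - Centroids[Cluster[i]][1]);
end;

const
  Samples: array[0..3] of TPoint2 = ((0, 0), (0, 2), (10, 0), (10, 2));
  // Run A: the points are split into a left pair and a right pair.
  CentroidsA: array[0..1] of TPoint2 = ((0, 1), (10, 1));
  ClusterA: array[0..3] of Integer = (0, 0, 1, 1);
  // Run B: split into bottom and top pairs. This grouping is also stable
  // under the assign-and-update loop, but it is much looser.
  CentroidsB: array[0..1] of TPoint2 = ((5, 0), (5, 2));
  ClusterB: array[0..3] of Integer = (0, 1, 0, 1);
var
  ScoreA, ScoreB: Double;
begin
  ScoreA := Inertia(Samples, CentroidsA, ClusterA);
  ScoreB := Inertia(Samples, CentroidsB, ClusterB);
  WriteLn('Run A: ', ScoreA:0:1);  // prints: Run A: 4.0
  WriteLn('Run B: ', ScoreB:0:1);  // prints: Run B: 100.0
  if ScoreA <= ScoreB then
    WriteLn('keep run A')
  else
    WriteLn('keep run B');
end.
```

Both candidate groupings are fixed points of the loop, so a single run could end in either one depending on where its centroids started. Raising aNumInitializations increases the chance that at least one run finds the tighter grouping, at the cost of proportionally more computation.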
