# Classification

A classification dataset has the same shape as a regression dataset: an organized table in which each row represents a unique sample and each column holds one piece of information about that sample. Its purpose is to let an AI model learn to assign each sample to a specific class (category).

#### STRUCTURE

1. **Rows**
   * Each row is an independent sample and represents something that will be classified.
   * For example, if you want to create a model that identifies different types of fruit, each row in the dataset would represent a specific fruit.
2. **Property Columns (X)**
   * These are the input variables, also referred to as "X". They contain the information the model uses as the basis for its predictions.
   * Each column is a feature or attribute that describes the sample.
   * Examples of features:
     * Weight of a fruit (in grams).
     * Size of the fruit (in cm).
     * Smooth skin (1 = yes, 0 = no).
3. **Class Column (Y)**
   * This is the output variable that indicates the class or category of each sample.
   * The class is what we want the model to learn to predict. In the case of fruits, the class will be the name of the type of fruit, such as "apple," "banana," or "orange."
   * This column can contain:
     * Strings: For example, "apple," "banana," "orange."
     * Numbers: For example, 1 = "apple," 2 = "banana," 3 = "orange."
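
The two label forms above are interchangeable as long as the mapping between them stays fixed. The following is a minimal Python sketch of that idea, not part of the DelphAI API; `encode_labels` is a hypothetical helper name, and the codes start at 1 only to mirror the example above.

```python
def encode_labels(labels):
    """Map each distinct string label to an integer code, in order of first appearance."""
    codes = {}
    for label in labels:
        if label not in codes:
            codes[label] = len(codes) + 1  # codes start at 1, as in the example above
    return [codes[label] for label in labels], codes


encoded, mapping = encode_labels(["apple", "banana", "orange", "apple"])
print(encoded)  # [1, 2, 3, 1]
print(mapping)  # {'apple': 1, 'banana': 2, 'orange': 3}
```

Keeping the `mapping` dictionary around makes it possible to translate a predicted number back into its class name.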

#### EXAMPLE

Here is an example of a dataset for classifying fruits based on their features:

| Weight (g) | Size (cm) | Smooth skin | Fruit Type |
| ---------- | --------- | ----------- | ---------- |
| 150        | 6.5       | 1           | Apple      |
| 120        | 12        | 0           | Banana     |
| 200        | 4.9       | 0           | Orange     |
| 180        | 43        | 1           | Watermelon |

* The feature columns (X) are: Weight, Size, Smooth skin.
* The class column (Y) is: Fruit Type.

Each row represents a specific fruit, with its features (weight, size, skin) and the class (fruit type) it belongs to.
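
To make the X/Y split concrete, here is a small Python sketch that separates the example rows into feature columns and the class column. It only illustrates the layout and is not how DelphAI stores the data internally; the variable names `rows`, `X`, and `y` are assumptions made for this example.

```python
# Rows copied from the example table; the last column is the class.
rows = [
    [150, 6.5, 1, "Apple"],
    [120, 12, 0, "Banana"],
    [200, 4.9, 0, "Orange"],
    [180, 43, 1, "Watermelon"],
]

X = [row[:-1] for row in rows]  # feature columns: Weight, Size, Smooth skin
y = [row[-1] for row in rows]   # class column: Fruit Type

print(X[0])  # [150, 6.5, 1]
print(y)     # ['Apple', 'Banana', 'Orange', 'Watermelon']
```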

#### DATA IMPORT

Every DelphAI object can import the dataset from either a CSV file or a `TDataSet`.

1. **CSV**:
   * The CSV file must follow the same layout as the table above: one sample per row, with the feature columns first and the class column last.
   * Example of a CSV file:

     ```
     ParamA,ParamB,ParamC,Result
     150,6.5,1,Apple
     120,12,0,Banana
     200,4.9,0,Orange
     180,29,0,Watermelon
     ```
2. **TDataSet/Query**:
   * The dataset can be stored in a relational database.
   * Example of the table contents in the database:

     ```
     ParamA | ParamB | ParamC | Result
     -------|--------|--------|--------
     150    | 6.5    | 1      | Apple
     120    | 12     | 0      | Banana
     200    | 4.9    | 0      | Orange
     180    | 29     | 0      | Watermelon
     ```
   * Use an SQL query to select the data:

     ```sql
     SELECT * FROM Fruits;
     ```
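
Because `SELECT *` returns columns in the order the table was defined, a table whose class column is not declared last would break the expected layout. The Python sketch below shows one way to guard against that by listing the columns explicitly. It is illustrative only: the in-memory SQLite database stands in for whatever database is actually used, and the table and column names are taken from the example above.

```python
import sqlite3

# In-memory SQLite stands in for the real database (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
# The class column is deliberately declared first to show that SELECT * would misplace it.
conn.execute("CREATE TABLE Fruits (Result TEXT, ParamA REAL, ParamB REAL, ParamC INTEGER)")
conn.executemany(
    "INSERT INTO Fruits VALUES (?, ?, ?, ?)",
    [("Apple", 150, 6.5, 1), ("Banana", 120, 12, 0)],
)

# Listing the columns explicitly keeps the features first and the class last.
cursor = conn.execute("SELECT ParamA, ParamB, ParamC, Result FROM Fruits ORDER BY rowid")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()
conn.close()

print(columns)  # ['ParamA', 'ParamB', 'ParamC', 'Result']
```

The same explicit column list can be used in the query that feeds the `TDataSet`, so the result does not depend on how the table happens to be declared.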

#### RULES AND TIPS FOR CREATING THE DATASET

1. **Data Consistency:**
   * All rows must have the same number of columns.
   * Every value must be numeric, except in the class column (the last column), which may contain strings.
2. **No Missing Values:**
   * Every cell must have a value (no "gaps" are allowed).
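
The rules above can be checked before the data is handed to the model. The following Python sketch is one possible pre-import check, not a DelphAI feature; `validate_rows` is a hypothetical helper, and it assumes the rows have already been read as lists of strings with the header removed.

```python
def validate_rows(rows):
    """Return a list of problems found; an empty list means the rows pass the checks."""
    if not rows:
        return ["dataset is empty"]
    problems = []
    width = len(rows[0])  # every row is compared against the first row's column count
    for i, row in enumerate(rows, start=1):
        if len(row) != width:
            problems.append(f"row {i}: expected {width} columns, found {len(row)}")
            continue
        if any(cell is None or str(cell).strip() == "" for cell in row):
            problems.append(f"row {i}: missing value")
            continue
        for cell in row[:-1]:  # every column except the class column
            try:
                float(cell)
            except (TypeError, ValueError):
                problems.append(f"row {i}: non-numeric feature {cell!r}")
    return problems


sample = [
    ["150", "6.5", "1", "Apple"],
    ["120", "", "0", "Banana"],
    ["200", "4.9", "0"],
    ["180", "big", "0", "Watermelon"],
]
for problem in validate_rows(sample):
    print(problem)
# row 2: missing value
# row 3: expected 4 columns, found 3
# row 4: non-numeric feature 'big'
```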

#### EXAMPLE DATASET

You can find an example of the CSV file in the [official repository](https://github.com/fgrandini/DelphAI/blob/main/Datasets/Iris.csv).

