Recommendation
The recommendation dataset should be in the "User x Item" format, a table where users are listed in the rows and items (products, movies, music, etc.) are listed in the columns. Each cell in the table contains the rating (score or evaluation) given by a user to an item. If the user has not rated the item, the cell should be 0.
STRUCTURE
Rows: Each row represents a user, for example:
User 1 → Row 1
User 2 → Row 2
User 3 → Row 3
Columns: Each column represents an item, for example:
Item A → Column 1
Item B → Column 2
Item C → Column 3
Values: Each cell contains the score/rating that the user gave to the item:
A higher number means the user liked the item more (e.g., 5 on a scale from 1 to 5).
A lower number means the user liked the item less (e.g., 1 or 2).
Cells with 0 indicate that the user did not rate that item.
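As an illustration of this layout, the following Python sketch builds a "User x Item" matrix from a list of (user, item, rating) records. Python is used only for illustration; the function name `build_user_item_matrix` and the sample records are hypothetical and are not part of DelphAI.

```python
def build_user_item_matrix(ratings):
    """Build a dense User x Item matrix from (user, item, rating) records.

    Returns (matrix, users, items), where matrix[r][c] is the rating that
    users[r] gave to items[c], or 0 if that user did not rate that item.
    """
    # Fix the row and column order by first appearance.
    users, items = [], []
    for user, item, _ in ratings:
        if user not in users:
            users.append(user)
        if item not in items:
            items.append(item)

    # Start with every cell at 0 (not rated), then fill in the known ratings.
    matrix = [[0] * len(items) for _ in users]
    for user, item, rating in ratings:
        matrix[users.index(user)][items.index(item)] = rating
    return matrix, users, items


# Hypothetical sample records, for illustration only.
records = [
    ("User 1", "Item A", 5),
    ("User 1", "Item B", 3),
    ("User 2", "Item A", 4),
    ("User 2", "Item C", 2),
    ("User 3", "Item B", 1),
]
matrix, users, items = build_user_item_matrix(records)
for row in matrix:
    print(row)
```

The returned `users` and `items` lists record which user belongs to each row and which item belongs to each column.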
EXAMPLE
User     ItemA  ItemB  ItemC  ItemD
User 1   5      3      0      4
User 2   4      0      2      0
User 3   0      1      5      3
User 4   2      0      0      0
ItemA, ItemB, etc.: Each column corresponds to an item.
Values: Ratings given by users (e.g., on a scale from 1 to 5).
DATA IMPORT
Every DelphAI object can import the dataset from a CSV file or from a TDataset.
CSV:
The CSV file must follow the same format as the table above.
Example of a CSV file:
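A minimal sketch of what such a file might contain, generated here with Python's standard `csv` module. It assumes a comma delimiter and no header row, neither of which is confirmed by the text above; check the sample file in the official repository for the exact layout DelphAI expects. The values are the ones listed in the example above, arranged as four users by four items, which is also an assumption about the original layout.

```python
import csv
import io

# Illustrative User x Item matrix: one row per user, one column per item,
# 0 where the user did not rate the item.
matrix = [
    [5, 3, 0, 4],
    [4, 0, 2, 0],
    [0, 1, 5, 3],
    [2, 0, 0, 0],
]

# Write to an in-memory buffer so the sketch is self-contained. To produce a
# real file, use open("ratings.csv", "w", newline="") instead (the file name
# is hypothetical).
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(matrix)
csv_text = buffer.getvalue()
print(csv_text)
```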
TDataset/Query:
The dataset can be stored in a relational database.
Example of a SELECT query on the table in the database:
Use an SQL query to select the data:
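A sketch of such a query, run here against an in-memory SQLite database from Python so that it is self-contained. It assumes the ratings are stored one per row in a hypothetical `ratings(user_id, item_id, rating)` table and pivots them into one column per item. The table name, column names and item IDs are illustrative; if your schema already holds the matrix in wide form, a plain SELECT ordered by user would be enough.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ratings (user_id INTEGER, item_id INTEGER, rating INTEGER)"
)
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?)",
    [(1, 1, 5), (1, 2, 3), (2, 1, 4), (2, 3, 2), (3, 2, 1)],
)

# One output row per user, one output column per item.
# COALESCE turns "no rating found" (NULL) into 0.
query = """
SELECT
    COALESCE(MAX(CASE WHEN item_id = 1 THEN rating END), 0) AS item_a,
    COALESCE(MAX(CASE WHEN item_id = 2 THEN rating END), 0) AS item_b,
    COALESCE(MAX(CASE WHEN item_id = 3 THEN rating END), 0) AS item_c
FROM ratings
GROUP BY user_id
ORDER BY user_id
"""
rows = conn.execute(query).fetchall()
conn.close()
for row in rows:
    print(row)
```

Ordering by user keeps the row order stable between runs, which matters when row indices are later mapped back to users.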
RULES AND TIPS FOR CREATING THE DATASET
Rating Scale:
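Decide the value scale for the ratings (e.g., 1 to 5 or 0 to 10). All ratings must use the same scale. Because 0 also marks an unrated item, a scale that starts at 0 cannot distinguish a rating of 0 from a missing rating.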
No Missing Values:
If a user did not rate an item, the cell must contain 0.
Matrix Position:
Keep a record of which user corresponds to each row index and which item corresponds to each column index, so that the real information (names, IDs, etc.) can be restored after prediction.
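The rules above can be sketched as a small check run before import, plus a lookup that maps matrix positions back to real names. This is an illustrative Python sketch, not DelphAI code: `check_matrix`, `user_ids`, `item_names` and the sample values are all hypothetical.

```python
def check_matrix(matrix, min_rating, max_rating):
    """Return a cleaned copy of the matrix, enforcing the dataset rules.

    Missing cells (None) become 0; every other value must be 0 or lie
    within [min_rating, max_rating]; all rows must have the same length.
    """
    width = len(matrix[0])
    cleaned = []
    for r, row in enumerate(matrix):
        if len(row) != width:
            raise ValueError(f"row {r} has {len(row)} cells, expected {width}")
        new_row = []
        for c, value in enumerate(row):
            if value is None:
                value = 0  # not rated
            elif value != 0 and not (min_rating <= value <= max_rating):
                raise ValueError(f"cell ({r}, {c}) = {value} is outside the scale")
            new_row.append(value)
        cleaned.append(new_row)
    return cleaned


# Row i of the matrix belongs to user_ids[i]; column j to item_names[j].
user_ids = [101, 102, 103]
item_names = ["Item A", "Item B", "Item C"]

matrix = check_matrix([[5, 3, None], [4, None, 2], [None, 1, 5]], 1, 5)

# Suppose a prediction refers to row 1, column 1 of the matrix:
row, col = 1, 1
print(f"user {user_ids[row]} -> {item_names[col]}")
```

Keeping `user_ids` and `item_names` in the same order as the matrix is what makes the final lookup possible.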
EXAMPLE DATASET
For testing purposes, the official repository contains a CSV file with the "User x Item" matrix, along with the movie name for each column index.