recommend-hq schema-generation
The command for generating schemas for Amazon Personalize.
Description
The recommend-hq schema-generation command generates schemas for Amazon Personalize from your user, item, and interaction dataset files. The schemas define the structure of your datasets and are essential for setting up the recommendation system.
Default IMPRESSION Field in Interaction Schema
The interaction schema generated by this tool now includes the IMPRESSION field by default, so the field can be used in log events.
The IMPRESSION field is also supported during retraining: when the train mode is set to update_dataset_group, the field is used in the retraining.
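As an illustration of what an IMPRESSION value can look like: in Amazon Personalize bulk interaction data, an impression is a list of item IDs separated by a vertical bar (`|`), and the `string,null` type in the generated schema means the cell may be empty. The sketch below parses such a column. The sample rows and the `parse_impression` helper are hypothetical, and whether recommend-hq expects exactly this layout is an assumption.

```python
import csv
import io

# Hypothetical interaction rows; the IDs and values are made up for illustration.
SAMPLE = """USER_ID,ITEM_ID,EVENT_TYPE,EVENT_VALUE,TIMESTAMP,IMPRESSION
u1,i3,click,,1700000000,i1|i2|i3
u2,i7,purchase,19.9,1700000100,
"""


def parse_impression(value):
    """Split a pipe-separated IMPRESSION cell into item IDs; an empty cell gives None."""
    return value.split("|") if value else None


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
impressions = [parse_impression(row["IMPRESSION"]) for row in rows]
print(impressions)
```

The first row carries three shown items, while the second leaves IMPRESSION empty, which is what the nullable type allows.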
Options
| Option | Description |
|---|---|
| --user TEXT | Path to the user dataset file (.csv or .jsonl) [required] |
| --item TEXT | Path to the item dataset file (.csv or .jsonl) [required] |
| --interaction TEXT | Path to the interaction dataset file (.csv or .jsonl) [required] |
| --textual-fields TEXT | Fields to be treated as textual attributes, separated by commas [required] |
| --help | Show help information and exit. |
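When the command is driven from a script, the options above can be assembled programmatically. The following is a minimal sketch: the `build_schema_generation_command` helper and the extension check are illustrative assumptions, not part of recommend-hq, and only the flags listed in the table are used.

```python
import shlex
from pathlib import Path

ALLOWED_SUFFIXES = {".csv", ".jsonl"}  # formats listed in the options table


def build_schema_generation_command(user, item, interaction, textual_fields):
    """Return the argument list for `recommend-hq schema-generation`."""
    for path in (user, item, interaction):
        # Reject files whose extension is not one of the documented formats.
        if Path(path).suffix not in ALLOWED_SUFFIXES:
            raise ValueError(f"unsupported dataset format: {path}")
    return [
        "recommend-hq", "schema-generation",
        "--user", str(user),
        "--item", str(item),
        "--interaction", str(interaction),
        # Multiple textual fields are joined with commas, as the option describes.
        "--textual-fields", ",".join(textual_fields),
    ]


cmd = build_schema_generation_command(
    "example/dataset/user.csv",
    "example/dataset/item.csv",
    "example/dataset/interaction.csv",
    ["TITLE"],
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where recommend-hq is installed.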
Examples
Generate schemas with required datasets and textual fields
Example using the datasets in example/dataset:
recommend-hq schema-generation --user example/dataset/user.csv \
--item example/dataset/item.csv \
--interaction example/dataset/interaction.csv \
--textual-fields TITLE
Load Input Users File: example/dataset/users.csv
Generating Users Schema ...
View the Processed Result
┏━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Field Name ┃ Data Type ┃ Categorical ┃ Textual ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━┩
│ IS_ACTIVE │ string │ X │ X │
│ USER_ID │ string │ X │ X │
│ GENDER │ string │ O │ X │
│ AGE │ int │ X │ X │
│ OCCUPATION │ int │ X │ X │
│ ZIP_CODE │ int │ X │ X │
└────────────┴───────────┴─────────────┴─────────┘
Output User Schema: conf/schema/user_schema.json
Load Input Items File: example/dataset/items.csv
Generating Items Schema ...
View the Processed Result
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Field Name ┃ Data Type ┃ Categorical ┃ Textual ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━┩
│ IS_ACTIVE │ string │ X │ X │
│ TEXTUAL_FIELD │ null,string │ X │ O │
│ ITEM_ID │ string │ X │ X │
│ GENRE │ string │ O │ X │
└───────────────┴─────────────┴─────────────┴─────────┘
Output Item Schema: conf/schema/item_schema.json
Load Input Interactions File: example/dataset/events.csv
Generating Interactions Schema ...
View the Processed Result
┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Field Name ┃ Data Type ┃ Categorical ┃ Textual ┃
┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━┩
│ USER_ID │ string │ X │ X │
│ ITEM_ID │ string │ X │ X │
│ EVENT_TYPE │ string │ X │ X │
│ EVENT_VALUE │ float,null │ X │ X │
│ TIMESTAMP │ long │ X │ X │
│ IMPRESSION │ string,null │ X │ X │
└─────────────┴─────────────┴─────────────┴─────────┘
Output Interaction Schema: conf/schema/interaction_schema.json
Output ETL Configure: conf/etl/configure.json
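This output reports the progress of schema generation for each file and shows a table of the processed result: each field's name, data type, and whether it is categorical or textual. In the tables, O marks a field as categorical or textual and X marks it as not. The generated schemas and the ETL configuration are written to the paths shown in the Output lines.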
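The contents of the generated JSON files are not shown here. As a rough guide, the sketch below maps the rows of the Items table onto the Avro-style schema format that Amazon Personalize uses, where a comma-separated data type becomes a union and the `categorical` and `textual` attributes flag the corresponding fields. The `to_avro_field` helper is hypothetical, and the assumption that conf/schema/item_schema.json has exactly this shape is unverified.

```python
import json

# Rows copied from the Items table in the sample output:
# (field name, data type, categorical flag, textual flag)
ITEM_ROWS = [
    ("IS_ACTIVE", "string", "X", "X"),
    ("TEXTUAL_FIELD", "null,string", "X", "O"),
    ("ITEM_ID", "string", "X", "X"),
    ("GENRE", "string", "O", "X"),
]


def to_avro_field(name, data_type, categorical, textual):
    """Turn one table row into an Avro-style field definition."""
    types = data_type.split(",")
    # A single type stays a string; several types become a union list.
    field = {"name": name, "type": types if len(types) > 1 else types[0]}
    if categorical == "O":
        field["categorical"] = True
    if textual == "O":
        field["textual"] = True
    return field


# Assumed overall layout, following the Amazon Personalize schema format.
item_schema = {
    "type": "record",
    "name": "Items",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [to_avro_field(*row) for row in ITEM_ROWS],
    "version": "1.0",
}
print(json.dumps(item_schema, indent=2))
```

Comparing this printout with the actual file under conf/schema is a quick way to check how the tool encodes nullable, categorical, and textual fields.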