Models
Understanding Prequel models
What are data models?
In this documentation, a data model refers to the Prequel representation of the structured data to be transferred using Prequel. An individual data model usually represents a single table, though (depending on the source/destination) it may also refer to a JSON blob or other file format (e.g., Parquet, CSV).
Data model configuration
API referenceFor the complete list of fields, their editability, and environment-specific constraints, see the Models API (PATCH) in the OpenAPI spec: OpenAPI 2023-12-01 – Models.
Each Prequel data model is configured via a JSON file with a specific format. If no model_name is provided, the name of the file (not including the extension) will be the name used for the table in the destination database. For example, for the configuration below, the destination table will appear as logs.
{
"model_name": "logs",
"columns": [
{
"name_in_source": "id",
"name_in_destination": "id",
"data_type": "text",
"is_primary_key": true
},
{
"name_in_source": "log",
"name_in_destination": "event_log",
"description":"A descriptive text entry of the event that occurred.",
"data_type": "text"
},
{
"name_in_source": "updated_at",
"name_in_destination": "updated_at",
"data_type": "timestamp",
"is_last_modified": true
}
],
"source_table": "source_schema.application_logs",
"source_name": "Example Production Source",
"organization_column": "organization_id"
}
Syncing config files from GitHubYou can use the Prequel GitHub Action to automatically sync any config file changes from GitHub to Prequel.
Configuration file keys
Each configuration file has 7 top level config keys: columns and source_table are always required; organization_column is required in multi-tenant tables mode; model_name, description, source_name, and source_id are optional.
Config file keys
| Config Key | Required | Description |
|---|---|---|
model_name | Name of table to be loaded into the destination. If left blank, the file name (not including file extension) will be used. | |
description | Table description. Will be applied to destination tables if provided. | |
columns | ✅ | A list of all destination columns (for this destination table). This is used to CREATE the table in the destination database, so it is important to use accurate types that correspond to the source data. The order of this list dictates the order in which columns appear in the destination database.These columns are also used to create the READ and WRITE query, so it is important to use the correct name_in_source and a descriptive name_in_destination for end customers. They can, but do not need to be, the same name.is_primary_key and is_last_modified are two important column flags that indicate which columns to use for detecting changes and propagating changes to the correct row in the destination. One of each must be present on one of the column objects. |
source_table | ✅ | Specifies the name of the source table to be queried. |
source_name | Associates this model with a source by name. Required when multiple sources exist. If omitted with a single source, that source is used automatically. If provided, the value must match an existing source name, even in single-source accounts. | |
source_id | Associates this model with a source by ID. Carries the same enforcement as source_name. Both fields can be specified together, in which case the source must match both values. | |
organization_column | ✅ (multi-tenant tables mode only) | Specifies the column in the source table used to filter rows by tenant. Required in multi-tenant tables mode; not used in schema-tenanted mode (tenancy is determined by schema name instead). Because this column is not necessarily a column you want to transfer to recipients, it does not need to appear in the columns list. This column must be a string or varchar (non-numeric) data type. |
Choosing betweensource_nameandsource_idPrefer
source_namefor config files shared across staging and production environments. Source names are consistent across environments, whereas source IDs are environment-specific UUIDs that differ between deployments. Usesource_idwhen you need to reference a source by its UUID, or specify both fields together to require the source to match on both name and ID.
Column object keys
| Column object key | Required | Description |
|---|---|---|
name_in_source | ✅ | The column name in the source system. |
name_in_destination | ✅ | The column name to use in the destination table. |
data_type | ✅ | The logical data type for this column. Refer to Acceptable Types. |
is_primary_key | (On exactly one column) | Marks the primary key column used to identify rows. Exactly one column must be set. |
is_last_modified | (On exactly one column) | Marks the column containing the last-updated timestamp used for change detection. Exactly one column must be set. |
description | Optional human‑readable description for the column. |
is_primary_keyandis_last_modifiedmust appear within the first 32 columnsBoth the primary key column and the last modified column must be positioned within the first 32 entries of the
columnsarray (positions 1–32). This is a Prequel model requirement. It does not constrain the column order in your source table. The limit exists to respect ordinal position constraints in destination systems that implement indexing or partitioning by specific columns.
Using Acceptable TypesUse the correct
data_typefor each column to maintain compatibility with all destinations. Refer to the Data Types documentation for the complete list of supported types.
Column ordering
The order of columns in the columns array determines the column order in the destination table, but only at initial table creation. For existing destinations, columns added via schema evolution are always appended to the end of the destination table schema, regardless of their position in the model config. See Schema evolution for more detail.
Column name and table name conflicts
A column's name_in_source and name_in_destination cannot match the source table name or the destination table name (case-insensitive). This applies to both the column name as written and the table name as written.
Reserved column names
Some reserved keywords cannot be used as column names due to reserved statuses in common destination systems. There are 334:
| 1-67 | 68-134 | 135-201 | 202-268 | 269-334 |
|---|---|---|---|---|
ACCESSIBLE | DEFERRABLE | INOUT | NULL | SMALLINT |
ACCOUNT | DEFINE | INSENSITIVE | NULLS | SOME |
ADD | DELAYED | INSERT | NUMERIC | SPATIAL |
ALL | DELETE | INT | OF | SPECIFIC |
ALTER | DENSE_RANK | INT1 | OFFSET | SQL |
ANALYSE | DESC | INT2 | ON | SQL_BIG_RESULT |
ANALYZE | DESCRIBE | INT3 | ONLY | SQL_CALC_FOUND_ROWS |
AND | DETERMINISTIC | INT4 | OPTIMIZE | SQL_SMALL_RESULT |
ANTI | DISTINCT | INT8 | OPTIMIZER_COSTS | SQLEXCEPTION |
ANY | DISTINCTROW | INTEGER | OPTION | SQLSTATE |
ARRAY | DIV | INTERSECT | OPTIONALLY | SQLWARNING |
AS | DO | INTERVAL | OR | SSL |
ASC | DOUBLE | INTO | ORDER | START |
ASENSITIVE | DROP | IO_AFTER_GTIDS | ORGANIZATION | STARTING |
ASSERT_ROWS_MODIFIED | DUAL | IO_BEFORE_GTIDS | OUT | STORED |
ASYMMETRIC | EACH | IS | OUTER | STRAIGHT_JOIN |
AT | ELSE | ISNULL | OUTFILE | STRUCT |
AUTHORIZATION | ELSEIF | ISSUE | OVER | SYMMETRIC |
BEFORE | EMPTY | ITERATE | OVERLAPS | SYSTEM |
BETWEEN | ENCLOSED | JOIN | PARTITION | TABLE |
BIGINT | END | JSON_TABLE | PERCENT_RANK | TABLESAMPLE |
BINARY | ENUM | KEY | PLACING | TERMINATED |
BLOB | ESCAPE | KEYS | PRECEDING | THEN |
BOTH | ESCAPED | KILL | PRECISION | TINYBLOB |
BY | EXCEPT | LAG | PRIMARY | TINYINT |
CALL | EXCLUDE | LAST_VALUE | PROCEDURE | TINYTEXT |
CASCADE | EXISTS | LATERAL | PROTO | TO |
CASE | EXIT | LEAD | PURGE | TRAILING |
CAST | EXPLAIN | LEADING | QUALIFY | TREAT |
CHANGE | EXTRACT | LEAVE | RANGE | TRIGGER |
CHAR | FALSE | LEFT | RANK | TRUE |
CHARACTER | FETCH | LIKE | READ | UNBOUNDED |
CHECK | FIRST_VALUE | LIMIT | READ_WRITE | UNDO |
COLLATE | FLOAT | LINEAR | READS | UNION |
COLLATION | FLOAT4 | LINES | REAL | UNIQUE |
COLUMN | FLOAT8 | LOAD | RECURSIVE | UNLOCK |
CONCURRENTLY | FOLLOWING | LOCALTIME | REFERENCES | UNNEST |
CONDITION | FOR | LOCALTIMESTAMP | REGEXP | UNSIGNED |
CONNECT | FORCE | LOCK | RELEASE | UPDATE |
CONNECTION | FOREIGN | LONG | RENAME | USAGE |
CONSTRAINT | FREEZE | LONGBLOB | REPEAT | USE |
CONTAINS | FROM | LONGTEXT | REPLACE | USER |
CONTINUE | FULL | LOOKUP | REQUIRE | USING |
CONVERT | FULLTEXT | LOOP | RESIGNAL | UTC_DATE |
CREATE | FUNCTION | LOW_PRIORITY | RESTRICT | UTC_TIME |
CROSS | GENERATED | MASTER_BIND | RETURN | UTC_TIMESTAMP |
CUBE | GET | MASTER_SSL_VERIFY_SERVER_CERT | RETURNING | VALUES |
CUME_DIST | GRANT | MATCH | REVOKE | VARBINARY |
CURRENT | GROUP | MAXVALUE | RIGHT | VARCHAR |
CURRENT_CATALOG | GROUPING | MEDIUMBLOB | RLIKE | VARCHARACTER |
CURRENT_DATE | GROUPS | MEDIUMINT | ROLLUP | VARIADIC |
CURRENT_ROLE | GSCLUSTER | MEDIUMTEXT | ROW | VARYING |
CURRENT_SCHEMA | HASH | MERGE | ROW_NUMBER | VERBOSE |
CURRENT_TIME | HAVING | MIDDLEINT | ROWS | VIEW |
CURRENT_TIMESTAMP | HIGH_PRIORITY | MINUS | SAMPLE | VIRTUAL |
CURRENT_USER | HOUR_MICROSECOND | MINUTE_MICROSECOND | SCHEMA | WHEN |
CURSOR | HOUR_MINUTE | MINUTE_SECOND | SCHEMAS | WHENEVER |
DATABASE | HOUR_SECOND | MOD | SECOND_MICROSECOND | WHERE |
DATABASES | IF | MODIFIES | SELECT | WHILE |
DAY_HOUR | IGNORE | NATURAL | SEMI | WINDOW |
DAY_MICROSECOND | ILIKE | NEW | SENSITIVE | WITH |
DAY_MINUTE | IN | NO | SEPARATOR | WITHIN |
DAY_SECOND | INCREMENT | NO_WRITE_TO_BINLOG | SESSION_USER | WRITE |
DEC | INDEX | NOT | SET | XOR |
DECIMAL | INFILE | NOTNULL | SHOW | YEAR_MONTH |
DECLARE | INITIALLY | NTH_VALUE | SIGNAL | ZEROFILL |
DEFAULT | INNER | NTILE | SIMILAR | `` |
Reserved column name prefixes
In order to avoid conflicts in common destination systems, column names cannot start with certain reserved prefixes. There are 10 reserved prefixes:
| 1-5 | 6-10 |
|---|---|
PARGO_PREFIX_ | _TABLE_ |
_FILE_ | _PARTITION |
_ROW_TIMESTAMP | __ROOT__ |
_COLIDENTIFIER | _CHANGE_SEQUENCE_NUMBER |
_CHANGE_TYPE | _CHANGE_TIMESTAMP |
Grouping data models into products
By default, every destination you add to Prequel will be sent all configured models. Products allow you to only send a subset of those models to destinations.
Updated about 2 hours ago