Models

What are data models?

In this documentation, a data model refers to the Prequel representation of the structured data to be transferred using Prequel. An individual data model usually represents a single table, though (depending on the source/destination) it may also refer to a JSON blob or other file format (e.g., Parquet, CSV).

A Prequel data model represents the data structure that is read from a source and written to a destination

Data model configuration

For the complete list of fields, their editability, and environment-specific constraints, see the Update Model endpoint in the API reference. Each Prequel data model is configured via a JSON file with a specific format. If no model_name is provided, the name of the file (not including the extension) will be the name used for the table in the destination database. For example, for the configuration below, the destination table will appear as logs.

logs.json

{
  "model": {
    "model_name": "logs",
    "columns": [
      {
        "name_in_source": "id",
        "name_in_destination": "id",
        "data_type": "text",
        "is_primary_key": true
      },
      {
        "name_in_source": "log",
        "name_in_destination": "event_log",
        "description": "A descriptive text entry of the event that occurred.",
        "data_type": "text"
      },
      {
        "name_in_source": "updated_at",
        "name_in_destination": "updated_at",
        "data_type": "timestamp",
        "is_last_modified": true
      }
    ],
    "source_table": "source_schema.application_logs",
    "source_name": "Example Production Source",
    "organization_column": "organization_id"
  }
}

Syncing config files from GitHubYou can use the Prequel GitHub Action to automatically sync any config file changes from GitHub to Prequel.

Required columns

Every table to be transferred must expose the following columns in its source so Prequel can move and reconcile rows correctly:

Required column	Configured via	Why it is required
Unique ID (e.g., `id`)	`is_primary_key` on a column	Used as the primary key to facilitate `UPDATE`/`INSERT` (“upsert”) operations on the destination.
Last modified (e.g., `updated_at`)	`is_last_modified` on a column	A `timestamp` column indicating when a row last changed. Prequel uses it to identify changes between transfers.
Tenant ID (e.g., `organization_id`)	`organization_column`	Required in multi-tenant tables mode to filter source data by tenant. Not used in schema-tenanted mode. See Tenancy options.

Configuration file keys

Each configuration file has 7 top level config keys: columns and source_table are always required; organization_column is required in multi-tenant tables mode; model_name, description, source_name, and source_id are optional.

Model config fields

model_name

string

Name of the table to be loaded into the destination. If left blank, the file name (not including the file extension) is used.

description

string

Table description. Applied to destination tables if provided.

columns

object[]

required

A list of all destination columns for this table. This is used to CREATE the table in the destination database, so use accurate types that correspond to the source data. The order of this list dictates the order in which columns appear in the destination database.These columns also build the READ and WRITE queries, so use the correct name_in_source and a descriptive name_in_destination for end customers. They can, but do not need to, match. See Column fields for the per-column keys.

source_table

string

required

The name of the source table to be queried.

source_name

string

Associates this model with a source by name. Required when multiple sources exist. If omitted with a single source, that source is used automatically. If provided, the value must match an existing source name, even in single-source accounts.

source_id

string

Associates this model with a source by ID. Carries the same enforcement as source_name. Both fields can be specified together, in which case the source must match both values.

organization_column

string

Required in multi-tenant tables mode. Specifies the column in the source table used to filter rows by tenant. Not used in schema-tenanted mode, where tenancy is determined by schema name instead. Because this column is not necessarily one you want to transfer to recipients, it does not need to appear in the columns list. Must be a string or varchar (non-numeric) data type.

Choosing between source_name and source_idPrefer source_name for config files shared across staging and production environments. Source names are consistent across environments, whereas source IDs are environment-specific UUIDs that differ between deployments. Use source_id when you need to reference a source by its UUID, or specify both fields together to require the source to match on both name and ID.

Model validation

When you create or update a model with POST /export/models or PATCH /export/models/{model_id}, Prequel automatically validates the model against the source to ensure all columns exist, and both the source types and sampled source values match the model’s types. Validation also runs the model’s source_query (or generated table query) against the source database, checks the configured columns against the result set, and verifies that the source credentials have sufficient access. You can also re-run validation on demand against an existing model with POST /export/models/{model_id}/validate. To skip validation on a single create or update request, set skip_model_validation to true in the request body. Skipping validation does not affect transfers. The following optional model config fields let you supply concrete values that Prequel uses only during validation. Use them to point validation at a known tenant and a narrow time window to improve performance of the validation query. These parameters are ignored by transfers at runtime.

skip_model_validation

boolean

Set to true on a single create or update request to skip validation. Skipping validation does not affect transfers.

model_validation_id_in_provider_system

string

Overrides {{.IdInProviderSystem}} during validation.

model_validation_start_time_epoch

string

RFC3339 timestamp. Overrides {{.LastModifiedAtEpoch}} during validation.

model_validation_end_time_epoch

string

RFC3339 timestamp. Overrides {{.LastModifiedAtEpochUpperBound}} during validation.

Validation overrides

{
  "model": {
    "model_name": "events",
    "source_query": "SELECT * FROM events WHERE tenant_id = '{{.IdInProviderSystem}}' AND updated_at >= to_timestamp({{.LastModifiedAtEpoch}}){{with .LastModifiedAtEpochUpperBound}} AND updated_at < to_timestamp({{.}}){{end}}",
    "model_validation_id_in_provider_system": "tenant-abc",
    "model_validation_start_time_epoch": "2025-01-01T00:00:00Z",
    "model_validation_end_time_epoch": "2025-01-02T00:00:00Z",
    ...
  }
}

Column fields

Each entry in the columns array accepts these keys.

name_in_source

string

required

The column name in the source system.

name_in_destination

string

required

The column name to use in the destination table.

data_type

string

required

The logical data type for this column. Refer to Acceptable Types.

is_primary_key

boolean

Marks the primary key column used to identify rows. Exactly one column must set this.

is_last_modified

boolean

Marks the column containing the last-updated timestamp used for change detection. Exactly one column must set this.

description

string

Optional human-readable description for the column.

is_primary_key and is_last_modified must appear within the first 32 columnsBoth the primary key column and the last modified column must be positioned within the first 32 entries of the columns array (positions 1–32). This is a Prequel model requirement. It does not constrain the column order in your source table. The limit exists to respect ordinal position constraints in destination systems that implement indexing or partitioning by specific columns.

Using Acceptable TypesUse the correct data_type for each column to maintain compatibility with all destinations. Refer to the Data Types documentation for the complete list of supported types.

Column ordering

The order of columns in the columns array determines the column order in the destination table, but only at initial table creation. For existing destinations, columns added via schema evolution are always appended to the end of the destination table schema, regardless of their position in the model config. See Schema evolution for more detail.

Column name and table name conflicts

A column’s name_in_source and name_in_destination cannot match the source table name or the destination table name (case-insensitive). This applies to both the column name as written and the table name as written.

Reserved column names

Some reserved keywords cannot be used as column names due to reserved statuses in common destination systems. There are 334:

1-67	68-134	135-201	202-268	269-334
`ACCESSIBLE`	`DEFERRABLE`	`INOUT`	`NULL`	`SMALLINT`
`ACCOUNT`	`DEFINE`	`INSENSITIVE`	`NULLS`	`SOME`
`ADD`	`DELAYED`	`INSERT`	`NUMERIC`	`SPATIAL`
`ALL`	`DELETE`	`INT`	`OF`	`SPECIFIC`
`ALTER`	`DENSE_RANK`	`INT1`	`OFFSET`	`SQL`
`ANALYSE`	`DESC`	`INT2`	`ON`	`SQL_BIG_RESULT`
`ANALYZE`	`DESCRIBE`	`INT3`	`ONLY`	`SQL_CALC_FOUND_ROWS`
`AND`	`DETERMINISTIC`	`INT4`	`OPTIMIZE`	`SQL_SMALL_RESULT`
`ANTI`	`DISTINCT`	`INT8`	`OPTIMIZER_COSTS`	`SQLEXCEPTION`
`ANY`	`DISTINCTROW`	`INTEGER`	`OPTION`	`SQLSTATE`
`ARRAY`	`DIV`	`INTERSECT`	`OPTIONALLY`	`SQLWARNING`
`AS`	`DO`	`INTERVAL`	`OR`	`SSL`
`ASC`	`DOUBLE`	`INTO`	`ORDER`	`START`
`ASENSITIVE`	`DROP`	`IO_AFTER_GTIDS`	`ORGANIZATION`	`STARTING`
`ASSERT_ROWS_MODIFIED`	`DUAL`	`IO_BEFORE_GTIDS`	`OUT`	`STORED`
`ASYMMETRIC`	`EACH`	`IS`	`OUTER`	`STRAIGHT_JOIN`
`AT`	`ELSE`	`ISNULL`	`OUTFILE`	`STRUCT`
`AUTHORIZATION`	`ELSEIF`	`ISSUE`	`OVER`	`SYMMETRIC`
`BEFORE`	`EMPTY`	`ITERATE`	`OVERLAPS`	`SYSTEM`
`BETWEEN`	`ENCLOSED`	`JOIN`	`PARTITION`	`TABLE`
`BIGINT`	`END`	`JSON_TABLE`	`PERCENT_RANK`	`TABLESAMPLE`
`BINARY`	`ENUM`	`KEY`	`PLACING`	`TERMINATED`
`BLOB`	`ESCAPE`	`KEYS`	`PRECEDING`	`THEN`
`BOTH`	`ESCAPED`	`KILL`	`PRECISION`	`TINYBLOB`
`BY`	`EXCEPT`	`LAG`	`PRIMARY`	`TINYINT`
`CALL`	`EXCLUDE`	`LAST_VALUE`	`PROCEDURE`	`TINYTEXT`
`CASCADE`	`EXISTS`	`LATERAL`	`PROTO`	`TO`
`CASE`	`EXIT`	`LEAD`	`PURGE`	`TRAILING`
`CAST`	`EXPLAIN`	`LEADING`	`QUALIFY`	`TREAT`
`CHANGE`	`EXTRACT`	`LEAVE`	`RANGE`	`TRIGGER`
`CHAR`	`FALSE`	`LEFT`	`RANK`	`TRUE`
`CHARACTER`	`FETCH`	`LIKE`	`READ`	`UNBOUNDED`
`CHECK`	`FIRST_VALUE`	`LIMIT`	`READ_WRITE`	`UNDO`
`COLLATE`	`FLOAT`	`LINEAR`	`READS`	`UNION`
`COLLATION`	`FLOAT4`	`LINES`	`REAL`	`UNIQUE`
`COLUMN`	`FLOAT8`	`LOAD`	`RECURSIVE`	`UNLOCK`
`CONCURRENTLY`	`FOLLOWING`	`LOCALTIME`	`REFERENCES`	`UNNEST`
`CONDITION`	`FOR`	`LOCALTIMESTAMP`	`REGEXP`	`UNSIGNED`
`CONNECT`	`FORCE`	`LOCK`	`RELEASE`	`UPDATE`
`CONNECTION`	`FOREIGN`	`LONG`	`RENAME`	`USAGE`
`CONSTRAINT`	`FREEZE`	`LONGBLOB`	`REPEAT`	`USE`
`CONTAINS`	`FROM`	`LONGTEXT`	`REPLACE`	`USER`
`CONTINUE`	`FULL`	`LOOKUP`	`REQUIRE`	`USING`
`CONVERT`	`FULLTEXT`	`LOOP`	`RESIGNAL`	`UTC_DATE`
`CREATE`	`FUNCTION`	`LOW_PRIORITY`	`RESTRICT`	`UTC_TIME`
`CROSS`	`GENERATED`	`MASTER_BIND`	`RETURN`	`UTC_TIMESTAMP`
`CUBE`	`GET`	`MASTER_SSL_VERIFY_SERVER_CERT`	`RETURNING`	`VALUES`
`CUME_DIST`	`GRANT`	`MATCH`	`REVOKE`	`VARBINARY`
`CURRENT`	`GROUP`	`MAXVALUE`	`RIGHT`	`VARCHAR`
`CURRENT_CATALOG`	`GROUPING`	`MEDIUMBLOB`	`RLIKE`	`VARCHARACTER`
`CURRENT_DATE`	`GROUPS`	`MEDIUMINT`	`ROLLUP`	`VARIADIC`
`CURRENT_ROLE`	`GSCLUSTER`	`MEDIUMTEXT`	`ROW`	`VARYING`
`CURRENT_SCHEMA`	`HASH`	`MERGE`	`ROW_NUMBER`	`VERBOSE`
`CURRENT_TIME`	`HAVING`	`MIDDLEINT`	`ROWS`	`VIEW`
`CURRENT_TIMESTAMP`	`HIGH_PRIORITY`	`MINUS`	`SAMPLE`	`VIRTUAL`
`CURRENT_USER`	`HOUR_MICROSECOND`	`MINUTE_MICROSECOND`	`SCHEMA`	`WHEN`
`CURSOR`	`HOUR_MINUTE`	`MINUTE_SECOND`	`SCHEMAS`	`WHENEVER`
`DATABASE`	`HOUR_SECOND`	`MOD`	`SECOND_MICROSECOND`	`WHERE`
`DATABASES`	`IF`	`MODIFIES`	`SELECT`	`WHILE`
`DAY_HOUR`	`IGNORE`	`NATURAL`	`SEMI`	`WINDOW`
`DAY_MICROSECOND`	`ILIKE`	`NEW`	`SENSITIVE`	`WITH`
`DAY_MINUTE`	`IN`	`NO`	`SEPARATOR`	`WITHIN`
`DAY_SECOND`	`INCREMENT`	`NO_WRITE_TO_BINLOG`	`SESSION_USER`	`WRITE`
`DEC`	`INDEX`	`NOT`	`SET`	`XOR`
`DECIMAL`	`INFILE`	`NOTNULL`	`SHOW`	`YEAR_MONTH`
`DECLARE`	`INITIALLY`	`NTH_VALUE`	`SIGNAL`	`ZEROFILL`
`DEFAULT`	`INNER`	`NTILE`	`SIMILAR`	“

Reserved column name prefixes

In order to avoid conflicts in common destination systems, column names cannot start with certain reserved prefixes. There are 10 reserved prefixes:

1-5	6-10
`PARGO_PREFIX_`	`_TABLE_`
`_FILE_`	`_PARTITION`
`_ROW_TIMESTAMP`	`__ROOT__`
`_COLIDENTIFIER`	`_CHANGE_SEQUENCE_NUMBER`
`_CHANGE_TYPE`	`_CHANGE_TIMESTAMP`

Grouping data models into products

By default, every destination you add to Prequel will be sent all configured models. Products allow you to only send a subset of those models to destinations.

Getting started

Core concepts

Features

Deploying Prequel

Logging & Monitoring

Integrations

Developer SDKs

Sources

Destinations

Security & compliance

What are data models?

Data model configuration

Required columns

Configuration file keys

Model config fields

Model validation

Column fields

Column ordering

Column name and table name conflicts

Reserved column names

Reserved column name prefixes

Grouping data models into products

​What are data models?

​Data model configuration

​Required columns

​Configuration file keys

​Model config fields

​Model validation

​Column fields

​Column ordering

​Column name and table name conflicts

​Reserved column names

​Reserved column name prefixes

​Grouping data models into products

What are data models?

Data model configuration

Required columns

Configuration file keys

Model config fields

Model validation

Column fields

Column ordering

Column name and table name conflicts

Reserved column names

Reserved column name prefixes

Grouping data models into products