Schema evolution
Overview
As your product and data model change over time, so may the shape of the data you share with customers. Prequel can help you evolve the corresponding schema in destination systems: when you update your data model in Prequel, schemas in destination systems are updated automatically on the next transfer to that destination.
Prequel’s schema evolution capabilities are limited to a set of “safe” and non-destructive operations that should handle most schema evolution use cases without resulting in undesirable consequences in the destination system.
Model configuration validation enforces safe schema evolution
Only model updates that map to the permitted schema evolution operations in the table below are accepted. All other model updates are rejected.
Evolution | Behavior in Destination for Rows Arriving in the Future | Behavior in Destination for Previously Loaded Rows |
---|---|---|
Add column | Column is added to all existing destinations and created in new destinations. Column is populated on all subsequent transfers. | Rows inserted before the column was added are not retroactively populated: the column is NULL for every row loaded before the transfer that performed the evolution. Note: if the new column needs to be populated on previously transferred rows, a transfer can be submitted with a specific time window to backfill the missing data. |
Delete column | On existing destinations, rows inserted after the deletion of this column will no longer populate this column and will result in NULL values. On new destinations, column is no longer added. | Column will remain on previously inserted rows. (Column will not be retroactively deleted.) |
Add table | Model is transferred to all applicable destinations on the next scheduled transfer. Because it will be the first transfer for this model per destination, the table will be created and backfilled automatically. | N/A |
Delete table | Model is no longer transferred to any destinations. | On existing destinations, the table is no longer updated (but will not be deleted). |
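The "Add column" row can be illustrated with a small sketch. It uses an in-memory SQLite database as a stand-in for a destination; the `orders` table and its columns are hypothetical, and this is not Prequel's implementation, only a demonstration of why previously loaded rows hold NULL in a newly added column.

```python
import sqlite3

# SQLite stands in for a destination warehouse here (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")

# Rows loaded before the evolution.
conn.executemany(
    "INSERT INTO orders (id, total) VALUES (?, ?)",
    [(1, 9.5), (2, 20.0)],
)

# "Add column" evolution: the column is added without touching existing rows.
conn.execute("ALTER TABLE orders ADD COLUMN currency TEXT")

# Rows loaded after the evolution populate the new column.
conn.execute(
    "INSERT INTO orders (id, total, currency) VALUES (?, ?, ?)",
    (3, 7.25, "USD"),
)

rows = conn.execute("SELECT id, currency FROM orders ORDER BY id").fetchall()
print(rows)  # [(1, None), (2, None), (3, 'USD')]
```

Rows 1 and 2 keep NULL (`None` in Python) in `currency` until a backfill rewrites them, which mirrors the behavior described in the table.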
Limitations
- Changing data types is not allowed. Many data destinations do not allow changes to data types. To avoid broken destination states, Prequel prevents changing data types on existing columns.
- Reusing column names is not allowed. For the same reason that changing data types is not allowed, reusing column names is also disallowed. Prequel maintains a history of all schema evolutions to prevent unintentional column name reuse.
- Reusing table names is not allowed. Reuse can cause conflicts in destination systems. Prequel maintains a history of all table names to prevent unintentional table name reuse.
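The column-level limitations above can be sketched as a validation step. This is a minimal illustration under stated assumptions: `ModelHistory` and `validate_update` are hypothetical names, not part of Prequel's API, and the real validation may differ. A table name check could follow the same pattern with a set of previously used table names.

```python
from dataclasses import dataclass, field


@dataclass
class ModelHistory:
    """Hypothetical record of a model's current columns and retired column names."""
    current: dict[str, str]                          # column name -> data type
    retired: set[str] = field(default_factory=set)   # names of deleted columns


def validate_update(history: ModelHistory, proposed: dict[str, str]) -> list[str]:
    """Return a list of violations; an empty list means the update is allowed."""
    errors = []
    for name, dtype in proposed.items():
        if name in history.current:
            # Existing column: its data type must not change.
            if history.current[name] != dtype:
                errors.append(
                    f"column '{name}': data type change "
                    f"{history.current[name]} -> {dtype} is not allowed"
                )
        elif name in history.retired:
            # New column whose name was used by a deleted column.
            errors.append(f"column '{name}': name cannot be reused")
    return errors


history = ModelHistory(
    current={"id": "integer", "total": "float"},
    retired={"discount"},
)

# Adding a brand-new column passes validation.
ok = validate_update(history, {"id": "integer", "total": "float", "currency": "string"})

# Changing a type and reusing a retired name are both rejected.
bad = validate_update(history, {"id": "string", "total": "float", "discount": "float"})
print(ok, len(bad))  # [] 2
```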
How it works
Schema evolutions happen automatically. When a transfer is enqueued, the data model configured at that time (submitted_at) is recorded and associated with that transfer. When the transfer is dequeued (usually <1 minute later), the necessary schema evolutions are calculated and performed (e.g., new columns are added) before the latest batch of data is loaded into the destination.
The need for schema evolution is determined by comparing the schema from the previous successful evolution to the schema in the enqueued transfer. Any permissible schema evolution operations are then performed to modify the destination schema as needed.
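The comparison described above can be sketched as a simple diff between two schemas. The function name `plan_evolution`, the operation labels, and the dictionary representation of a schema are assumptions made for illustration; they do not describe Prequel's internals.

```python
def plan_evolution(previous: dict[str, str], enqueued: dict[str, str]) -> list[tuple[str, str]]:
    """Diff the last successfully evolved schema against the enqueued one.

    Both arguments map column names to data types (hypothetical representation).
    """
    ops = []
    # Columns present only in the enqueued schema must be added to the destination.
    for name in enqueued:
        if name not in previous:
            ops.append(("add_column", name))
    # Columns missing from the enqueued schema are kept but no longer populated.
    for name in previous:
        if name not in enqueued:
            ops.append(("stop_populating", name))
    return ops


previous = {"id": "integer", "total": "float", "discount": "float"}
enqueued = {"id": "integer", "total": "float", "currency": "string"}

ops = plan_evolution(previous, enqueued)
print(ops)  # [('add_column', 'currency'), ('stop_populating', 'discount')]
```

An empty result means the destination schema already matches the enqueued transfer and no evolution is needed before loading.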