How it works

Most destinations are configured by first creating a special purpose user to perform the write operations in the data warehouse, whitelisting a Prequel IP, and adding the database details and credentials to Prequel. 🔗

Supported Destinations

You can find all currently supported destinations and documentation links below. You also may decide to self-host the destination configuration instructions on your own documentation site. If you prefer to do that, we maintain a copy of the source markdown files accessible at public locations below:

Vendor	Type	Status	Docs	Markdown (`.md` file)
`snowflake`	OLAP	GA	link	link
`bigquery`	OLAP	GA	link	link
`redshift`	OLAP	GA	link	link
`databricks`	OLAP	GA	link	link
`athena`	OLAP	GA	link	link
`clickhouse`	OLAP	GA	link	link
`motherduck`	OLAP	Beta	link	link
`postgres`	OLTP	GA	link	link
`aurora_postgres`	OLTP	GA	link	link
`mysql`	OLTP	GA	link	link
`aurora_mysql`	OLTP	GA	link	link
`sql_server`	OLTP	Beta	link
`singlestore`	OLTP	Beta
`s3`	Object Storage	GA	link	link
`s3_compatible`	Object Storage	GA	link	link
`gcs`	Object Storage	GA	link	link
`abs`	Object Storage	GA	link	link
`google_sheets`	Spreadsheet	GA	link	link

Other available guides

Guide	Type	Status	Docs	Docs (`.md` file)
S3 Staging Bucket	Staging Resource	GA	link	link
Google Cloud Storage Staging Bucket	Staging Resource	GA	link	link
Azure Blob Storage Staging Bucket	Staging Resource	GA	link	link

📘
You should know
You can use your discretion to decide what documentation to provide to your users that wish to connect their destination. To avoid confusion, we recommend working with your user to determine which database or data warehouse destination type they wish to connect, and then sending over the subset of documentation for them to work with.

Format of landed data

Data warehouses & databases (incl. Snowflake, BigQuery, Redshift, Databricks)

Data transferred to data warehouses and relational databases will be loaded as properly typed tables within a single schema.

For destinations other than BigQuery, a special _transfer_status table will be loaded in the created schema to record transfer metadata, namely, a transfer_last_updated_at timestamp for each table. In BigQuery, the last_updated timestamp for a table is already accessible in the __TABLES_SUMMARY__ metatable.

Object storage (incl. AWS S3, Google Cloud Storage, Azure Blob Storage)

Data transferred to object storage destinations will be loaded as Apache Parquet, CSV, or JSON files in Apache Hive style partitions. The default file format is Parquet, which we strongly recommend for performance reasons. This means data will appear in the following folder structure:

<bucket_name>/<folder_name>/<model_name>/dt=<transfer_date>/<file_part>_<transfer_timestamp>.parquet

Where:

<bucket_name> and <folder_name> are provided during destination configuration.
<model_name> is the name of the data model being transferred (this is equivalent to a table name in relational data destinations).
<transfer_date> and <transfer_timestamp> are generated at transfer time and based on the transfer's start time. <transfer_date> is of the form 2006-01-01, while <transfer_timestamp> is of the form 20060102150405.
<file_part> is a monotonically increasing integer for a given timestamp, and does not carry any special meaning.

Prequel also provides a manifest file for each model within each transfer, mapping each transfer to its file contents. You can find more about the structure of these manifest files here: Manifest Files for Object Storage.

📘
What are Apache Hive style partitions and Apache Parquet file format?

Apache Hive style partitions are compatible with most popular query engines, and should make data easily queryable and transportable.

Apache Parquet file format is an open source, column-oriented data file format that offers efficient data compression and data integrity.

Note that individual files for object storage destinations will not exceed ~4 GB in size. Rows transferred to object storage destinations are treated as "append only". This has the following effects for all object storage destinations:

Prequel will always write data based on the current data model, meaning Schema evolution and data model updates must be handled downstream of transfer.
Deduplication, merging updated rows will not take place during transfer, all updates will be written as unique rows. These updates must be also be made downstream of transfer.
Full refresh transfers will append the full dataset without deleting existing data.

However, object storage locations configured as a Delta Lake destination will receive the native schema evolution, deduplication, and ACID transaction behavior that all data warehouse destinations receive.

Spreadsheets (incl. Google Sheets)

Data transferred to spreadsheet destinations will be loaded as a newly created tab per data model. Where possible, the tabs will be created as protected tabs (or "read-only") to prevent accidental modification.