How transfers work

Prequel performs transfers by querying the source for a given recipient’s data and loading that data into the recipient’s destination, on an ongoing basis. The first transfer that runs for a given destination will automatically load all historical data (the “backfill”), and subsequent transfers will attempt to transfer only the data that has changed or been added since the previous transfer.

Prequel transfers from source to destination

1

Authorize source

Prequel authenticates to sources using scoped credentials or delegated roles created by the user. Prequel validates connectivity and restricts permissions to only what is needed to read the configured models for the intended recipient. This gives least-privilege access with clear auditability.
2

Read, batch, and serialize

Data is read in a sliding window based on time. Each transfer moves a window of data, starting from a checkpoint based on the last batch of data transferred, to ensure data integrity and efficient transfers at scale. When available, Prequel uses a source staging bucket to temporarily store query results as files in object storage, which are then downloaded and normalized. Prequel uses a lookback window to ensure resiliency against eventual consistency in data sources. For more detail on its mechanics, see Change detection.
3

Authorize destination

Prequel authenticates to your customer’s destinations using destination-native authentication scoped to the target schemas and tables. This scoping provides isolation and least-privilege access aligned with destination security.
4

Load to destination

  • Staging-assisted loads: Batches are uploaded to a staging area (for example, a native volume or storage bucket) and then ingested using the destination’s bulk-load path. Data is normalized before staging. Prequel’s transfer logic is designed uniquely for each destination type to maximize throughput and leverage vendor-optimized patterns.
  • Direct inserts: For destinations that don’t support staging-assisted loads, batches are streamed directly via insert SQL queries or API calls without external staging. As a result, these destination types can have throughput limitations; contact the Prequel team to learn more about data volumes and throughput across destination types.
  • Prequel uses upserts with changes matched on primary key and duplicates resolved via the last modified timestamp to ensure data integrity and protect table state.
  • With a Write-Ahead-Publish architecture, your customer never sees data before a transfer is complete and all data is available in the destination.
  • Staging files created during transfers are automatically cleaned up after transfer completion. Data is not persisted in the staging area after transfer.
  • With each transfer, metadata is written to the destination. For object storage locations, see Manifest files for object storage, and for warehouses and databases, see Transfer status table.
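The upsert rule in the list above (match on primary key, resolve duplicates by last modified timestamp) can be illustrated with a minimal in-memory sketch. This is not Prequel's loader: the column names `id` and `updated_at`, the dictionary standing in for a destination table, and the tie-breaking choice are all assumptions made for illustration.

```python
from datetime import datetime
from typing import Any, Dict, Iterable

Row = Dict[str, Any]


def upsert(table: Dict[Any, Row], batch: Iterable[Row],
           key: str = "id", modified: str = "updated_at") -> Dict[Any, Row]:
    """Merge a batch into `table` (primary key -> row), keeping the newest version.

    `id` and `updated_at` are hypothetical column names.
    """
    for row in batch:
        current = table.get(row[key])
        # Insert unseen keys. Replace an existing row only when the incoming
        # row is at least as recent. Ties go to the incoming row, which is an
        # assumption of this sketch rather than documented behavior.
        if current is None or row[modified] >= current[modified]:
            table[row[key]] = row
    return table
```

Because stale rows never overwrite newer ones, replaying the same batch leaves the table unchanged, which is what makes overlapping transfer windows safe in this model.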
5

Transparency and controls

Each phase emits structured logs and metrics for governance and auditability. Tags can be used to label transfers for filtering and reporting.

Transfer lifecycle

Transfers are managed by an internal queue, which is used to dispatch transfers to workers. When a destination has the enabled flag set to true, Prequel will automatically enqueue transfers for that destination based on the frequency value of the destination or the organization’s default frequency. A transfer resource always has a status corresponding to its current phase of the lifecycle:
| Status | Description |
| --- | --- |
| PENDING | Transfers start as pending when they are created (enqueued). The submitted_at timestamp records when the transfer was enqueued. |
| RUNNING | The transfer has been dispatched to a worker. The started_at timestamp records when it changed to RUNNING. |
| ERROR | There was an issue dispatching the transfer, the worker failed to connect to the source or destination, or all models failed to transfer. ended_at records the change to ERROR. |
| PARTIAL_FAILURE | The transfer reached the running state, but only some models succeeded while others failed. ended_at records the change to PARTIAL_FAILURE. |
| SUCCESS | The transfer was running and all models transferred without issues. ended_at records the change to SUCCESS. |
| CANCELLED | A user terminated the transfer before it started running. |
| KILLED | A user terminated the transfer while it was running. |
| EXPIRED | The transfer was blocked from being dispatched and remained pending for longer than 6 hours. |
| ORPHANED | The worker died ungracefully or stopped communicating with the control plane. |
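For client code that polls transfers, the statuses above can be modeled as an enum. The following is a minimal sketch, not Prequel's implementation: the helper names are hypothetical, and treating every status other than PENDING and RUNNING as terminal is an assumption drawn from the descriptions. The six-hour expiry and the success, partial failure, and error outcomes follow the table.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Dict


class TransferStatus(str, Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    ERROR = "ERROR"
    PARTIAL_FAILURE = "PARTIAL_FAILURE"
    SUCCESS = "SUCCESS"
    CANCELLED = "CANCELLED"
    KILLED = "KILLED"
    EXPIRED = "EXPIRED"
    ORPHANED = "ORPHANED"


# Assumption: anything that is neither PENDING nor RUNNING no longer changes.
ACTIVE = {TransferStatus.PENDING, TransferStatus.RUNNING}
PENDING_TTL = timedelta(hours=6)  # from the EXPIRED description


def is_terminal(status: TransferStatus) -> bool:
    return status not in ACTIVE


def should_expire(status: TransferStatus, submitted_at: datetime, now: datetime) -> bool:
    """True when a transfer has stayed pending for longer than 6 hours."""
    return status is TransferStatus.PENDING and now - submitted_at > PENDING_TTL


def final_status(model_ok: Dict[str, bool]) -> TransferStatus:
    """Map per-model outcomes (model name -> succeeded) to a transfer status."""
    if not model_ok:
        raise ValueError("at least one model outcome is required")
    if all(model_ok.values()):
        return TransferStatus.SUCCESS
    if any(model_ok.values()):
        return TransferStatus.PARTIAL_FAILURE
    return TransferStatus.ERROR  # all models failed
```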

Backfills & full refreshes

The initial transfer (or “backfill”) is often the largest transfer by volume. During this initial sync, all historical data for a given recipient is loaded into the destination. To trigger a full refresh manually, add "full_refresh": true to a transfer request. Prequel only triggers a full refresh automatically on the first transfer, either to a new destination or a new model.
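As a small illustration of the manual trigger, the sketch below builds a JSON request body that carries the "full_refresh": true flag. Only that flag comes from this page. The function name and the `extra` parameter are hypothetical, and the endpoint and any other request fields are intentionally left out; refer to the API Reference for those.

```python
import json
from typing import Any, Dict, Optional


def build_transfer_request(full_refresh: bool = False,
                           extra: Optional[Dict[str, Any]] = None) -> str:
    """Serialize a transfer request body, adding the flag only when requested."""
    body = dict(extra or {})  # any other fields are the caller's responsibility
    if full_refresh:
        body["full_refresh"] = True  # serialized as JSON `true`
    return json.dumps(body, sort_keys=True)
```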
Data impact varies by destination type
  • Warehouses, databases, and open table format (OLAP, OLTP, OTF): All existing data is deleted before reloading. If your source retains only partial history (e.g., a 90-day rolling window), data outside that range will be lost. Any date filters explicitly configured in a model’s source query also still apply.
  • Object storage (non-OTF) & SFTP: Existing files are not deleted, and a full refresh will produce duplicate data.
Backfill vs. incremental transfer performance
Because the initial backfill is often the most storage- and compute-intensive transfer, its sync time should not be used as an indicator of ongoing transfer performance.
Table Reset Behavior: For warehouse and database destinations, Prequel determines whether to truncate or drop and recreate the table based on schema compatibility:
  • Truncate: If the schema matches, the table is truncated before reloading data.
  • Drop & Recreate: If there is a schema mismatch, the table is dropped and recreated with the correct schema.
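The truncate versus drop-and-recreate decision above can be sketched as follows. This is an illustration under stated assumptions, not Prequel's logic: a schema is represented as a mapping of column name to type, "matches" is taken to mean the two mappings are equal regardless of column order, and the generated SQL is generic rather than tailored to any destination.

```python
from typing import Dict, List


def reset_statements(table: str, existing: Dict[str, str],
                     expected: Dict[str, str]) -> List[str]:
    """Return the statements that would reset `table` before a full reload."""
    if existing == expected:
        # Schema matches: keep the table definition and clear its rows.
        return [f"TRUNCATE TABLE {table}"]
    # Schema mismatch: rebuild the table with the expected columns.
    columns = ", ".join(f"{name} {dtype}" for name, dtype in expected.items())
    return [f"DROP TABLE {table}", f"CREATE TABLE {table} ({columns})"]
```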
If you’re unsure whether a full refresh fits your situation, contact Prequel support to discuss your use case.

Incremental transfers

After each transfer (backfill or incremental), Prequel records the most recent last modified timestamp value transferred. This value is used as the starting point for the subsequent transfer. By default, every transfer of a given model (after a successful backfill) is an “incremental transfer”.
Incremental updates and eventually consistent data sources
By default, Prequel will query the source for slightly earlier data than the most recently transferred row. This is to provide a window in which data from eventually consistent sources can converge and still be transferred.
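The checkpoint and lookback behavior described above can be sketched as a filter over rows. This is a simplified model, not Prequel's query logic: the `updated_at` column name and the function names are hypothetical, and the lookback duration is whatever the caller passes in, since this page does not state its size.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List, Optional, Tuple

Row = Dict[str, Any]


def window_start(checkpoint: Optional[datetime], lookback: timedelta) -> Optional[datetime]:
    """Lower bound for the next read; None means no bound (the backfill)."""
    return None if checkpoint is None else checkpoint - lookback


def select_rows(rows: List[Row], checkpoint: Optional[datetime], lookback: timedelta,
                modified: str = "updated_at") -> Tuple[List[Row], Optional[datetime]]:
    """Pick rows for one transfer and return them with the next checkpoint."""
    start = window_start(checkpoint, lookback)
    picked = [r for r in rows if start is None or r[modified] >= start]
    # The next checkpoint is the most recent last modified value transferred;
    # if nothing was picked, the previous checkpoint is kept.
    next_checkpoint = max((r[modified] for r in picked), default=checkpoint)
    return picked, next_checkpoint
```

Rows inside the lookback window are re-read on each transfer, so this approach relies on an idempotent load step such as the primary-key upsert used for warehouse and database destinations.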

Transfer parallelism and concurrency

Transfer Concurrency: Within an individual transfer, operations are optimistically concurrent. Transfers can download, upload, or serialize multiple data files concurrently, regardless of the model to which they belong. The max_concurrent_queries_per_transfer field on a source or destination limits the number of concurrent queries or API calls that can be made against that source or destination. The default for max_concurrent_queries_per_transfer is 1.
Transfer Parallelism: Transfers can run in parallel with each other as long as the following constraints hold:
  • No simultaneous transfers are allowed for the same model to the same destination.
  • No simultaneous integrity and transfer jobs can run against the same destination.
  • The max_concurrent_transfers field exists on both the source and destination. It defaults to 10 for sources and 1 for destinations. This field represents a hard limit on the number of simultaneous transfers involving a particular source or destination.
Prequel’s dispatcher will enforce the above rules. A transfer that is unable to be dispatched will remain pending until it can be dispatched.
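The dispatch rules above can be expressed as a single eligibility check. The sketch below is a hypothetical model rather than Prequel's dispatcher: the `Transfer` record, the function name, and the way integrity jobs are represented (a set of destination IDs that currently have one running) are assumptions. The default limits of 10 per source and 1 per destination come from the max_concurrent_transfers description.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Transfer:
    source_id: str
    destination_id: str
    models: FrozenSet[str]


def can_dispatch(candidate: Transfer, running: Iterable[Transfer],
                 integrity_destinations: FrozenSet[str] = frozenset(),
                 source_limit: int = 10, destination_limit: int = 1) -> bool:
    """Return True if `candidate` may start now; otherwise it stays pending."""
    running = list(running)
    # No transfer while an integrity job runs against the same destination.
    if candidate.destination_id in integrity_destinations:
        return False
    # Hard limits on simultaneous transfers per source and per destination.
    same_source = [t for t in running if t.source_id == candidate.source_id]
    same_dest = [t for t in running if t.destination_id == candidate.destination_id]
    if len(same_source) >= source_limit or len(same_dest) >= destination_limit:
        return False
    # No simultaneous transfers of the same model to the same destination.
    return all(candidate.models.isdisjoint(t.models) for t in same_dest)
```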

Tags

Every transfer can carry an arbitrary set of tags, simple key/value metadata you define to group, label, or annotate transfers (for example, by environment, team, workload, etc.). Tag keys and values must match ^[A-Za-z0-9_-]+$. For more detail on how to use Tags when creating transfers and filtering transfers on tags, refer to our API Reference.
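Tags can be checked on the client before a request is sent. The sketch below applies the pattern quoted above to both keys and values; the function name and the example tags are hypothetical. It uses `fullmatch` because Python's `$` also matches just before a trailing newline, which would otherwise let a value such as "prod\n" through.

```python
import re
from typing import Dict, List

TAG_PATTERN = re.compile(r"^[A-Za-z0-9_-]+$")


def invalid_tag_parts(tags: Dict[str, str]) -> List[str]:
    """Return every key or value that does not match the documented pattern."""
    bad = []
    for key, value in tags.items():
        for part in (key, value):
            if TAG_PATTERN.fullmatch(part) is None:
                bad.append(part)
    return bad
```

For example, `invalid_tag_parts({"environment": "prod", "team": "data eng"})` reports `"data eng"` because of the space.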

Staging buckets

Some sources and destinations supported by Prequel may require staging buckets to transfer data efficiently. Where possible, Prequel will use built-in staging resources provided by the database or data warehouse, but where none exist, a staging bucket may need to be provided. The source/destination documentation provides instructions for configuring staging buckets where needed.