SFTP

Configuring your SFTP server.

Prerequisites

  • By default, SFTP access uses key pair authentication. To configure your destination, you will need the public key that we provide. It will look roughly like this:
ssh-key <ssh_public_key_beginning_with_AAAA> some-comment

Step 1: Create a user on the SFTP server

Log in to the SFTP server and complete the steps below.

  1. Create group sftpwriter:
sudo groupadd sftpwriter
  2. Create user sftpwriter:
sudo useradd -m -g sftpwriter sftpwriter
  3. Switch to the sftpwriter user:
sudo su - sftpwriter
  4. Create the .ssh directory:
mkdir ~/.ssh
  5. Set permissions:
chmod 700 ~/.ssh
  6. Navigate to the .ssh directory:
cd ~/.ssh
  7. Create the authorized_keys file:
touch authorized_keys
  8. Set permissions:
chmod 600 authorized_keys
  9. Add the public key to the authorized_keys file. The key -- including the "ssh-key" and comment -- should be all on one line in the file, without linebreaks.
echo "ssh-key <ssh_public_key_beginning_with_AAAA> sftpwriter-public-key" > authorized_keys
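
The result of the steps above can be sanity-checked with a short script. This is a minimal sketch rather than part of the official setup: the function name check_sftp_key_setup is hypothetical, it assumes GNU stat (`stat -c`, as found on most Linux servers), and it only checks that the second field of each key line begins with AAAA.

```shell
# Hypothetical helper: checks the layout produced by the steps above.
# Assumes GNU coreutils `stat -c` (typical on Linux servers).
check_sftp_key_setup() (
  ssh_dir="$1"
  keys_file="$ssh_dir/authorized_keys"

  # The .ssh directory should be mode 700 and authorized_keys mode 600.
  [ "$(stat -c '%a' "$ssh_dir")" = "700" ] || { echo "$ssh_dir is not mode 700" >&2; exit 1; }
  [ "$(stat -c '%a' "$keys_file")" = "600" ] || { echo "$keys_file is not mode 600" >&2; exit 1; }

  # Each non-empty line should hold one whole key: a type, the key body
  # beginning with AAAA, and an optional comment. A key broken across lines
  # leaves a line whose second field does not begin with AAAA.
  awk 'NF { n++; if ($2 !~ /^AAAA/) bad = 1 } END { exit (bad || n == 0) }' "$keys_file" \
    || { echo "$keys_file has a missing or malformed key line" >&2; exit 1; }
)
```

Run as the sftpwriter user, `check_sftp_key_setup ~/.ssh && echo "key setup looks OK"` prints the message only when all three checks pass.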

Step 2: Add your destination

Share your host name, folder name, username, port and preferred delimiter character with us to complete the connection.

Write permissions at the SFTP root are required

In addition to write access within your configured <folder>, this destination writes per-transfer manifest files under a _manifests/ directory created at the root of the SFTP home/path. Ensure the SFTP user can create and write to _manifests at that root (even if your data lands under a subfolder). Manifests allow downstream systems to detect when a transfer is complete. See the FAQ below for how these files are organized.
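
One way to confirm this permission before the first transfer is a small write probe run as the SFTP user. The sketch below is illustrative only: check_root_writable and the probe directory name are hypothetical, and the path you pass should be whatever directory the SFTP user lands in.

```shell
# Hypothetical pre-flight check: run as the SFTP user, passing the SFTP root.
# It creates a throwaway directory at the root, writes a file inside it,
# and cleans up, mirroring what the destination needs to do for _manifests/.
check_root_writable() (
  root="$1"
  probe="$root/.manifest_probe.$$"
  mkdir "$probe" || exit 1      # can a directory be created at the root?
  touch "$probe/test.json"      # can a file be written inside it?
  status=$?
  rm -rf "$probe"
  exit "$status"
)
```

For example, `check_root_writable "$HOME" && echo "root is writable"` run as sftpwriter reports success only if both the directory and the file could be created.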

Frequently Asked Questions

Q: How will the data appear on my SFTP server?

A: The data will be loaded in the configured file format (Parquet, CSV, or JSON/JSONL) in a predictable folder structure that can be easily parsed by downstream systems.

      sftpwriter_home_folder/
      ├─ some_provided_folder/
      │  ├─ some_table_a/
      │  │  ├─ dt=2024-01-01/
      │  │  │  ├─ 0_20240101181004.csv
      │  │  │  ├─ 1_20240101184002.csv
      │  │  ├─ dt=2024-01-02/
      │  │  │  ├─ 0_20240102180123.csv
      │  │  ├─ dt=2024-01-03/
      │  │  │  ├─ 0_20240103182145.csv
      │  ├─ some_table_b/
      │  │  ├─ dt=2024-01-01/
      │  │  │  ├─ 0_20240101186004.csv
      │  │  ├─ dt=2024-01-02/
      │  │  │  ├─ 0_20240102185123.csv
      │  │  ├─ dt=2024-01-03/
      │  │  │  ├─ 0_20240103187145.csv
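
A downstream job can walk this layout with standard tools. The following is a minimal sketch under stated assumptions: list_table_files is a hypothetical helper, and it treats the number before the first underscore in each file name as an ordering index, which the layout above suggests but does not guarantee.

```shell
# Hypothetical helper: print "<dt><TAB><file path>" for every file of one table,
# ordered by partition date, then by the number at the start of the file name.
list_table_files() {
  table_dir="$1"   # e.g. sftpwriter_home_folder/some_provided_folder/some_table_a
  tab=$(printf '\t')
  find "$table_dir" -type f -path '*/dt=*/*' |
    awk -F/ -v OFS='\t' '{
      dt = $(NF-1); sub(/^dt=/, "", dt)   # partition date from the dt=... folder
      idx = $NF;    sub(/_.*/, "", idx)   # leading number of the file name
      print dt, idx, $0
    }' |
    sort -t "$tab" -k1,1 -k2,2n |         # date first, then numeric index
    cut -f1,3                             # drop the helper index column
}
```

Sorting the index numerically keeps a file named 10_... after 2_..., which a plain lexical sort would not.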

Q: How is the SFTP connection secured?

A: Use SSH key-based authentication for a dedicated, least-privileged SFTP user. Restrict access to only the required directories (e.g., chroot), and allowlist the service's static egress IP at your network perimeter.

Q: What file formats are supported?

A: Parquet (default/recommended), CSV, and JSON/JSONL.

Q: How do I know when a transfer completed?

A: Each transfer writes one manifest JSON file per model under _manifests/ at the SFTP root. Files follow the pattern: _manifests/<model_name>/dt=<transfer_date>/manifest_<transfer_id>.json. Use these manifests to trigger downstream processing.
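
As an illustration of using manifests as a completion signal, the sketch below checks whether any manifest exists for a given model and date. The helper name find_manifests is hypothetical, the transfer date is assumed to use the same form as the dt= data folders (for example 2024-01-01), and the manifest contents are not inspected because their schema is not described here.

```shell
# Hypothetical helper: list manifest files for one model and transfer date,
# following the _manifests/<model_name>/dt=<transfer_date>/ pattern above.
# Returns non-zero while no manifest exists yet.
find_manifests() {
  sftp_root="$1"; model="$2"; transfer_date="$3"
  dir="$sftp_root/_manifests/$model/dt=$transfer_date"
  [ -d "$dir" ] || return 1
  found=$(find "$dir" -maxdepth 1 -type f -name 'manifest_*.json' | sort)
  [ -n "$found" ] || return 1
  printf '%s\n' "$found"
}
```

A scheduler could poll with `find_manifests /path/to/sftp/root some_table_a 2024-01-01` and start processing that table's files once the command succeeds.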

Q: Why do I sometimes see duplicates?

A: File-based destinations are append-oriented. The change-detection process uses a lookback window to prevent missed records, so the same record can appear in adjacent transfers. Downstream pipelines can deduplicate by primary key, keeping the row from the most recent transfer window.
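
The deduplication described above might look like the following for simple CSV output. This is a sketch under several assumptions that are not stated in this documentation: each file has a header row, the primary key is the first column, fields contain no quoted commas, and files are passed oldest first. The name dedupe_latest is hypothetical.

```shell
# Hypothetical helper: merge CSV files (oldest first) and keep, for each
# primary key in column 1, the row from the latest file that contains it.
dedupe_latest() {
  awk -F, '
    FNR == 1 { if (NR == 1) header = $0; next }   # keep a single header row
    {
      if (!($1 in row)) order[++n] = $1           # remember first-seen key order
      row[$1] = $0                                # later files overwrite earlier rows
    }
    END {
      print header
      for (i = 1; i <= n; i++) print row[order[i]]
    }
  ' "$@"
}
```

For example, `dedupe_latest older.csv newer.csv > merged.csv` keeps the newer.csv version of any key present in both files. Parquet or JSONL output would need a format-aware tool instead of awk.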

Q: Can I provide my own public key? Where is the private key stored?

A: For security reasons, we do not support providing your own public key. The private key is generated and stored securely in our system and is never shared externally.