SFTP
Configuring your SFTP server.
Prerequisites
- By default, SFTP uses keypair authentication for access. You will need the public key we provide to configure your destination. It will look roughly like this:
    ssh-key <ssh_public_key_beginning_with_AAAA> some-comment
Step 1: Create a user on the SFTP server
Log in to the SFTP server and complete the steps below.
- Create the group sftpwriter:
    sudo groupadd sftpwriter
- Create the user sftpwriter:
    sudo useradd -m -g sftpwriter sftpwriter
- Switch to the sftpwriter user:
    sudo su - sftpwriter
- Create the .ssh directory:
    mkdir ~/.ssh
- Set permissions:
    chmod 700 ~/.ssh
- Navigate to the .ssh directory:
    cd ~/.ssh
- Create the authorized_keys file:
    touch authorized_keys
- Set permissions:
    chmod 600 authorized_keys
- Add the public key to the authorized_keys file. The key, including the "ssh-key" prefix and the comment, should be all on one line in the file, without line breaks.
    echo "ssh-key <ssh_public_key_beginning_with_AAAA> sftpwriter-public-key" > authorized_keys
Step 2: Add your destination
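Before sharing the connection details, it can be worth confirming that the authorized_keys file from Step 1 holds each key on a single line. The following is a minimal sketch, not part of the official setup: check_authorized_keys is a hypothetical helper name, and it assumes entries carry no leading options (such as command="...") before the key type.

```shell
# Hypothetical helper: returns 0 when every non-blank, non-comment line
# has at least two fields and the second field starts with "AAAA".
# A key that was split across lines fails this check.
check_authorized_keys() {
  file="$1"
  [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }
  awk '
    NF == 0 || /^#/ { next }                 # skip blank and comment lines
    NF < 2 || $2 !~ /^AAAA/ {                # key type + base64 body expected
      printf "line %d looks malformed\n", NR
      bad = 1
    }
    END { exit bad }
  ' "$file" >&2
}
```

Run as the sftpwriter user, a call such as `check_authorized_keys ~/.ssh/authorized_keys` prints nothing and returns 0 when the file looks well formed.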
Share your host name, folder name, username, port and preferred delimiter character with us to complete the connection.
Write permissions at the SFTP root are required
In addition to write access within your configured <folder>, this destination writes per-transfer manifest files under a _manifests/ directory created at the root of the SFTP home/path. Ensure the SFTP user can create and write to _manifests at that root (even if your data lands under a subfolder). Manifests allow downstream systems to detect when a transfer is complete. See the FAQ below for how these files are organized.
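One way to confirm the root-level write requirement is to probe it directly on the server while logged in as the SFTP user. This is a sketch under assumptions: can_write_manifests is a hypothetical name, and the path you pass must be whatever directory the SFTP session actually lands in for your setup. Note that it creates the _manifests directory if it does not exist yet.

```shell
# Hypothetical pre-check: verifies that the current user can create
# _manifests under the given root and write a file inside it.
can_write_manifests() {
  root="$1"
  dir="$root/_manifests"
  probe="$dir/.write_probe_$$"      # temporary file, removed afterwards
  mkdir -p "$dir" 2>/dev/null || { echo "cannot create $dir" >&2; return 1; }
  touch "$probe" 2>/dev/null || { echo "cannot write in $dir" >&2; return 1; }
  rm -f "$probe"
}
```

For example, `can_write_manifests "$HOME"` returns 0 when the home directory is the SFTP root and is writable by the current user.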
Frequently Asked Questions
Q: How will the data appear in my SFTP server?
A: The data will be loaded with the configured file format (Parquet, CSV, or JSON/JSONL) in a predictable folder structure that can be easily parsed by downstream systems.
    sftpwriter_home_folder/
    ├─ some_provided_folder/
    │  ├─ some_table_a/
    │  │  ├─ dt=2024-01-01/
    │  │  │  ├─ 0_20240101181004.csv
    │  │  │  ├─ 1_20240101184002.csv
    │  │  ├─ dt=2024-01-02/
    │  │  │  ├─ 0_20240102180123.csv
    │  │  ├─ dt=2024-01-03/
    │  │  │  ├─ 0_20240103182145.csv
    │  ├─ some_table_b/
    │  │  ├─ dt=2024-01-01/
    │  │  │  ├─ 0_20240101186004.csv
    │  │  ├─ dt=2024-01-02/
    │  │  │  ├─ 0_20240102185123.csv
    │  │  ├─ dt=2024-01-03/
    │  │  │  ├─ 0_20240103187145.csv
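Because the layout is regular, a downstream job can split each data path into its parts with plain parameter expansion. The sketch below is illustrative only: parse_data_path is a hypothetical name, and since this guide does not state what the two numeric parts of a file name mean, they are returned as opaque strings.

```shell
# Hypothetical helper: splits a path such as
#   some_table_a/dt=2024-01-01/0_20240101181004.csv
# into "table date prefix suffix extension", separated by spaces.
parse_data_path() {
  path="$1"
  file="${path##*/}"                 # 0_20240101181004.csv
  dir="${path%/*}"                   # some_table_a/dt=2024-01-01
  part="${dir##*/}"                  # dt=2024-01-01
  table="${dir%/*}"                  # everything above the dt= folder
  table="${table##*/}"               # some_table_a
  case "$part" in
    dt=*) ;;
    *) echo "no dt= partition in: $path" >&2; return 1 ;;
  esac
  stem="${file%.*}"                  # 0_20240101181004
  printf '%s %s %s %s %s\n' \
    "$table" "${part#dt=}" "${stem%%_*}" "${stem#*_}" "${file##*.}"
}
```

Calling `parse_data_path "some_table_a/dt=2024-01-01/0_20240101181004.csv"` prints `some_table_a 2024-01-01 0 20240101181004 csv`.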
Q: How is the SFTP connection secured?
A: Use SSH key-based authentication for a dedicated, least-privileged SFTP user. Restrict access to only the required directories (e.g., chroot), and allowlist the service's static egress IP at your network perimeter.
Q: What file formats are supported?
A: Parquet (default/recommended), CSV, and JSON/JSONL.
Q: How do I know when a transfer completed?
A: Each transfer writes one manifest JSON file per model under _manifests/ at the root. Files follow the pattern: _manifests/<model_name>/dt=<transfer_date>/manifest_<transfer_id>.json. Use these manifests to trigger downstream processing.
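A downstream trigger can be as simple as checking whether a manifest exists for a given model and date. The following is a sketch, not a supported tool: list_manifests is a hypothetical name, and the YYYY-MM-DD date format is an assumption carried over from the dt= folders in the data layout.

```shell
# Hypothetical helper: prints manifest files matching
#   <root>/_manifests/<model_name>/dt=<transfer_date>/manifest_*.json
# and returns non-zero when none exist yet.
list_manifests() {
  root="$1"; model="$2"; date="$3"
  found=1
  for f in "$root/_manifests/$model/dt=$date"/manifest_*.json; do
    [ -e "$f" ] || continue          # glob matched nothing
    printf '%s\n' "$f"
    found=0
  done
  return "$found"
}
```

A scheduler could poll with something like `list_manifests "$HOME" some_table_a 2024-01-01` and start processing once it returns 0.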
Q: Why do I sometimes see duplicates?
A: File-based destinations are append-oriented. The change-detection process uses a lookback window to prevent missed records, which can create duplicates across adjacent transfers. Downstream pipelines can deduplicate by primary key, keeping the row from the most recent transfer window.
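As a rough illustration of that deduplication for CSV output, the sketch below keeps one row per key and lets later files win. It rests on several assumptions that may not hold for your data: dedupe_latest is a hypothetical name, every file shares the same header line, the primary key is the first column, no field contains a quoted comma or newline, and the caller passes files ordered from oldest to newest transfer.

```shell
# Hypothetical sketch: merges CSV files given oldest-first and keeps,
# for each primary key (first column), the row from the latest file.
dedupe_latest() {
  awk -F, '
    FNR == 1 { if (NR == 1) print; next }   # print the header only once
    !($1 in row) { order[++n] = $1 }        # remember first-seen key order
    { row[$1] = $0 }                        # later rows replace earlier ones
    END { for (i = 1; i <= n; i++) print row[order[i]] }
  ' "$@"
}
```

For data with quoted fields, or for Parquet and JSON output, a proper parser in the downstream pipeline would be a safer choice than awk.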
Q: Can I provide my own public key? Where is the private key stored?
A: For security reasons, we do not support providing your own public key. The private key is securely generated and stored in our system and is never shared externally.