Generic ClickHouse

Instructions for connecting to a ClickHouse data warehouse as a source

Step 1: Allow access

  1. Make a note of your Prequel static IP
  2. Create a rule in a security group or firewall settings to whitelist:
    1. incoming connections to your host and port (usually 9440) from the static IP.
    2. outgoing connections from ports 1024 to 65535 to the static IP.

Step 2: Create reader user

Create a database user to perform the reading of the source data.

  1. Open a connection to your ClickHouse database.
  2. Create a user for the data transfer by executing the following SQL command.
CREATE USER <username>@'%' IDENTIFIED BY '<some-password>';
  3. Grant the user the required privileges on the database.
📘

Additional permissions required for source queries

If any of your Prequel models use source queries, or will in the future, you will also need to grant the CREATE VIEW and DROP VIEW privileges to the ClickHouse user, in addition to the privileges below.
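
A minimal sketch of that extra grant, assuming the views may be created in the same database the user reads from. The `<database>.*` scope is an assumption; widen or narrow it to match your setup.

```sql
-- Only needed when Prequel models use source queries.
-- <database> and <username> are placeholders; the scope shown is an assumption.
GRANT CREATE VIEW, DROP VIEW ON <database>.* TO <username>@'%';
```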

GRANT SELECT ON <{database.table|database.*|*.*}> TO <username>@'%';
GRANT CREATE TEMPORARY TABLE, S3 ON *.* TO <username>@'%';
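
To check that the grants took effect, you can list the privileges for the user. This is an optional sanity check, not a required step; `<username>` is the same placeholder used above.

```sql
-- Lists every privilege currently granted to the reader user.
SHOW GRANTS FOR <username>@'%';
```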
📘

Understanding the CREATE TEMPORARY TABLE, S3 permissions

The CREATE TEMPORARY TABLE and S3 permissions are required to efficiently transfer data from ClickHouse. Under the hood, these permissions are used to stage data in a temporary table and export compressed data into object storage for transferring. By definition, the temporary table will not exist outside of the session.
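
To make the role of each permission concrete, here is a rough sketch of the kind of statements these privileges allow. It is illustrative only and is not the exact SQL that Prequel issues; the table name `staged_rows`, the bucket URL, the file name, and the credentials are all hypothetical placeholders.

```sql
-- Illustrative only: not the statements Prequel actually runs.

-- Requires CREATE TEMPORARY TABLE: stage rows in a session-scoped table.
CREATE TEMPORARY TABLE staged_rows AS
SELECT * FROM <database>.<table>;

-- Requires S3: write the staged rows to object storage through the s3 table function.
-- The .gz extension lets ClickHouse infer gzip compression for the output file.
INSERT INTO FUNCTION s3(
    'https://<bucket>.s3.<region>.amazonaws.com/<prefix>/part-0.csv.gz',
    '<access-key-id>',
    '<secret-access-key>',
    'CSVWithNames'
)
SELECT * FROM staged_rows;
```

The temporary table disappears when the session ends, so nothing is left behind in the source database.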

Step 3: Set up staging bucket

ClickHouse sources require a staging bucket to efficiently transfer data. Configure your staging bucket using one of the following guides:

Optional: Granting ClickHouse Cloud role-based access to S3

If your ClickHouse instance runs on ClickHouse Cloud, you can have it authenticate to your S3 staging bucket with the same IAM role rather than access keys, which avoids relying on long-lived static credentials.

Follow the same steps as in the S3 staging bucket configuration above, then add a trust policy statement that allows ClickHouse to assume the role as well.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            // Existing statements
        },
        {
            "Effect": "Allow",
            "Principal": {"AWS": "<CLICKHOUSE_IAM_ARN>"},
            "Action": "sts:AssumeRole"
        }
    ]
}

Replace <CLICKHOUSE_IAM_ARN> with your ClickHouse instance's IAM ARN. To obtain the ARN, go to your ClickHouse Cloud account, navigate to Settings > Network security information > View service details, and copy the Service role ID (IAM).

See the ClickHouse Secure S3 documentation for full details, including an automated CloudFormation setup option.
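
Once the trust policy is in place, a query along these lines, run from your ClickHouse Cloud service, can help confirm that the role can be assumed. It is a sketch based on the `extra_credentials` form described in the ClickHouse Secure S3 documentation; `<bucket>`, `<object>`, and `<STAGING_ROLE_ARN>` are placeholders, and it assumes a CSV object already exists in the bucket.

```sql
-- Sanity check: describe an existing object using role-based access instead of access keys.
-- <STAGING_ROLE_ARN> is the ARN of the IAM role whose trust policy was edited above.
DESCRIBE TABLE s3(
    'https://<bucket>.s3.amazonaws.com/<object>.csv',
    'CSVWithNames',
    extra_credentials(role_arn = '<STAGING_ROLE_ARN>')
);
```

If the trust policy is missing or the ARN is wrong, the query should fail with an access error rather than return a column list.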

Step 4: Add source to Prequel

Use the cURL request to add the configured ClickHouse source and staging bucket.