ClickHouse

Prerequisites

If your ClickHouse security posture requires IP whitelisting, have our data syncing service's static IP available during the following steps. It will be required in Step 1.

Step 1: Allow access

🚩
SSH Tunneling Not Supported
SSH tunneling is currently unsupported for ClickHouse destinations. Please ensure your ClickHouse destination is accessible over the public internet.

Create a rule in a security group or firewall settings to whitelist:

incoming connections to your host and port (usually 9440) from the static IP.
outgoing connections from ports 1024 to 65535 to the static IP.

Step 2: Create writer user

Create a dedicated database user that will write the data.

Open a connection to your ClickHouse database.
Create a user for the data transfer by executing the following SQL command.

CREATE USER <username>@'%' IDENTIFIED BY '<some-password>';

🔒
Password Rules
Passwords may only include alphanumeric characters (A–Z, a–z, 0–9), dashes (-), and underscores (_).

Grant the user the required privileges on the database.

GRANT SELECT ON information_schema.columns TO <username>;
GRANT CREATE, INSERT, DROP, ALTER, OPTIMIZE, SHOW, TRUNCATE ON <database>.* TO <username>@'%';
GRANT CREATE TEMPORARY TABLE, S3 ON *.* TO <username>@'%';
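
To confirm the grants took effect, the writer user's privileges can be listed. This is a minimal check, assuming you are connected as a user that is allowed to inspect other users; `<username>` is the same placeholder used in the statements above.

```sql
-- List every privilege currently granted to the writer user.
SHOW GRANTS FOR <username>;

-- Show how the user was defined (host restriction, authentication type).
SHOW CREATE USER <username>;
```

The first statement's output should reflect the privileges granted in this step, although ClickHouse may group or reorder them.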

📘
Understanding the CREATE TEMPORARY TABLE, S3 permissions
The CREATE TEMPORARY TABLE and S3 permissions are required to transfer data to ClickHouse efficiently. Under the hood, these permissions are used to stage data in object storage as compressed files, load those files into temporary tables, and finally merge the rows into the target tables. By definition, a temporary table does not exist outside the session that created it.
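
The stage, load, and merge flow described above might look roughly like the following. This is an illustrative sketch only, not the exact statements the service issues: the table names (`staging_events`, `<database>.events`), the column list, the file format, and the bucket URL are all hypothetical, and the three statements must run in the same session.

```sql
-- 1. Session-scoped staging table (needs CREATE TEMPORARY TABLE).
--    Hypothetical name and columns.
CREATE TEMPORARY TABLE staging_events
(
    id         UInt64,
    payload    String,
    updated_at DateTime
);

-- 2. Load one compressed staged file from object storage (needs S3).
--    The URL, credentials, and format are placeholders.
INSERT INTO staging_events
SELECT id, payload, updated_at
FROM s3(
    'https://<bucket>.s3.amazonaws.com/<prefix>/part-0001.csv.gz',
    '<access-key-id>',
    '<secret-access-key>',
    'CSVWithNames',
    'id UInt64, payload String, updated_at DateTime',
    'gzip'
);

-- 3. Merge the staged rows into the (hypothetical) target table.
INSERT INTO <database>.events
SELECT id, payload, updated_at
FROM staging_events;
```

Because the staging table is temporary, it disappears when the session ends and leaves nothing behind in `<database>`.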

Step 3: Set up staging bucket

ClickHouse destinations require a staging bucket to transfer data efficiently. Configure your staging bucket using one of the following types of object storage supported by ClickHouse:

S3
GCS
Implicit

📘
Using the implicit bucket option
ClickHouse supports the ability to configure staging resources with environment credentials. If this setting is enabled on your ClickHouse cluster, you may choose to use the configured implicit staging resources using the implicit option for the staging bucket selection.

Step 4: Add your destination

🔌
Connection Protocol
Use the ClickHouse native TCP protocol, not HTTPS. It is commonly exposed on port 9000, or on port 9440 when TLS is enabled.

Securely share your host name, port, cluster, database name, schema name, username, password, and staging bucket details with us to complete the connection.

📘
Understanding the database vs. schema fields (connection database vs. write database)
Depending on the version of your integration, you may be asked for both a database and schema, or a connection database and write database.

database (also referred to as connection_database): the database used to establish the connection with ClickHouse.

schema (also referred to as write_database): the database/schema into which data will be written.

These can be (and often are) the same values, but do not need to be.
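
As a rough illustration of the distinction, assuming a session opened against the connection database and a separate write database (both placeholders below are hypothetical values you would substitute):

```sql
-- Returns the database the session was opened against,
-- i.e. the value supplied as database / connection_database.
SELECT currentDatabase();

-- Transferred tables live in the write database and are addressed
-- by qualifying the table name with it.
SHOW TABLES FROM <write_database>;
```

When both fields hold the same value, the two statements refer to the same database.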

Using the ClickHouse data

📘
Querying ClickHouse data without duplicates
The resulting ClickHouse tables use the ReplacingMergeTree table engine to upsert changes efficiently. Because this engine removes duplicates only during background merges, the FINAL keyword must be used when selecting from these tables to guarantee that duplicates are removed. For example:
SELECT
  *
FROM
  schema.table FINAL
WHERE
  foo = bar
ORDER BY foo
LIMIT 10;
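
To see why FINAL matters, the behavior can be reproduced on a throwaway table. This is a minimal sketch under assumptions: `demo_upserts`, its columns, and the choice of `updated_at` as the version column are hypothetical and are not the definitions the service creates.

```sql
-- Hypothetical table: rows with the same id are versions of one record,
-- and the row with the highest updated_at wins after deduplication.
CREATE TABLE demo_upserts
(
    id         UInt64,
    value      String,
    updated_at DateTime
)
ENGINE = ReplacingMergeTree(updated_at)
ORDER BY id;

-- Two separate inserts create two parts holding the same key.
INSERT INTO demo_upserts VALUES (1, 'old', '2024-01-01 00:00:00');
INSERT INTO demo_upserts VALUES (1, 'new', '2024-01-02 00:00:00');

-- Without FINAL, both versions may be returned until a background merge runs.
SELECT * FROM demo_upserts;

-- With FINAL, only the latest version of each id is returned.
SELECT * FROM demo_upserts FINAL;

-- Clean up the demo table.
DROP TABLE demo_upserts;
```

FINAL performs the deduplication at query time, so it adds some cost to each query in exchange for results that do not depend on merge timing.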
