Athena

Prerequisites

By default, the Athena destination authenticates with role-based access. To grant access, you will need the trust policy prepopulated with our identifiers. It should look similar to the following JSON object, with the organization and service account identifiers filled in:

Trust policy

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sts:AssumeRoleWithWebIdentity"
      ],
      "Principal": {
        "Federated": "accounts.google.com"
      },
      "Condition": {
        "StringEquals": {
          "accounts.google.com:oaud": "<some_organization_identifier>",
          "accounts.google.com:sub": "<some_service_account_identifier>"
        }
      }
    }
  ]
}
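
As a minimal sketch, the trust policy above can be generated from the two identifiers instead of editing the JSON by hand. The function name and the example values are hypothetical; use the identifiers provided for your account.

```python
import json


def build_trust_policy(organization_id: str, service_account_id: str) -> dict:
    """Return the trust policy shown above with both identifiers filled in."""
    if not organization_id or not service_account_id:
        raise ValueError("both identifiers are required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["sts:AssumeRoleWithWebIdentity"],
                "Principal": {"Federated": "accounts.google.com"},
                "Condition": {
                    "StringEquals": {
                        "accounts.google.com:oaud": organization_id,
                        "accounts.google.com:sub": service_account_id,
                    }
                },
            }
        ],
    }


# Hypothetical placeholder values, for illustration only.
policy = build_trust_policy("example-organization-id", "example-service-account-id")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the Custom trust policy editor when creating the role.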

Create a destination bucket, service policy, and role

Create Athena target bucket

Follow these steps to create a bucket to be used for staging data before transferring to a destination.

Navigate to the S3 service page.
Click Create bucket.
Enter a Bucket name, select an AWS Region, and modify any of the default settings as desired. Note: Object Ownership can be set to “ACLs disabled” and Block Public Access settings for this bucket can be set to “Block all public access” as recommended by AWS. Make note of the Bucket name and AWS Region.
Click Create bucket.

Create Athena access policy

Navigate to the IAM service page, click on the Policies navigation tab, and click Create policy.
Click the JSON tab, and paste the following policy, being sure to replace ACCOUNT_ID, WORKGROUP, BUCKET_NAME, and SCHEMA with your account information.
- WORKGROUP should be primary unless otherwise specified during connection configuration.
- BUCKET_NAME should refer to the bucket created in the previous step.
- SCHEMA does not need to be created ahead of time. If it does not exist, it will be created automatically before transferring data.

Access policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAthenaAccess",
            "Effect": "Allow",
            "Action": [
                "athena:GetQueryResults",
                "athena:StartQueryExecution",
                "athena:StopQueryExecution",
                "athena:StartSession",
                "athena:GetDatabase",
                "athena:GetDataCatalog",
                "athena:GetWorkGroup",
                "athena:GetTableMetadata",
                "athena:GetQueryExecution"
            ],
            "Resource": [
                "arn:aws:athena:*:ACCOUNT_ID:workgroup/WORKGROUP"
            ]
        },
        {
            "Sid": "AllowGlueAccessToDestinationDatabaseAndTables",
            "Effect": "Allow",
            "Action": [
                "glue:GetDatabases",
                "glue:GetDatabase",
                "glue:GetTables",
                "glue:GetTable",
                "glue:GetPartitions",
                "glue:CreateTable",
                "glue:CreateDatabase",
                "glue:UpdateTable",
                "glue:DeleteTable"
            ],
            "Resource": [
                "arn:aws:glue:*:ACCOUNT_ID:catalog",
                "arn:aws:glue:*:ACCOUNT_ID:database/SCHEMA",
                "arn:aws:glue:*:ACCOUNT_ID:database/default",
                "arn:aws:glue:*:ACCOUNT_ID:table/SCHEMA/*"
            ]
        },
        {
            "Sid": "AllowS3AccessToBucket",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:ListBucket",
                "s3:GetBucketLocation",
                "s3:GetObject",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::BUCKET_NAME",
                "arn:aws:s3:::BUCKET_NAME/*"
            ]
        }
    ]
}
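
To make the placeholder substitution less error-prone, the Resource ARNs of the three statements can be built from the four values. This is only a sketch: the function name and the example values are hypothetical, and the 12-digit check reflects the standard format of AWS account IDs.

```python
import re


def policy_resources(account_id: str, workgroup: str, bucket_name: str, schema: str) -> dict:
    """Build the Resource lists for the three statements of the access policy."""
    # AWS account IDs are 12-digit numbers.
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError("ACCOUNT_ID must be a 12-digit AWS account ID")
    return {
        "AllowAthenaAccess": [
            f"arn:aws:athena:*:{account_id}:workgroup/{workgroup}",
        ],
        "AllowGlueAccessToDestinationDatabaseAndTables": [
            f"arn:aws:glue:*:{account_id}:catalog",
            f"arn:aws:glue:*:{account_id}:database/{schema}",
            f"arn:aws:glue:*:{account_id}:database/default",
            f"arn:aws:glue:*:{account_id}:table/{schema}/*",
        ],
        "AllowS3AccessToBucket": [
            f"arn:aws:s3:::{bucket_name}",
            f"arn:aws:s3:::{bucket_name}/*",
        ],
    }


# Hypothetical example values.
resources = policy_resources(
    "123456789012", "primary", "example-staging-bucket", "example_schema"
)
for sid, arns in resources.items():
    print(sid, arns)
```

Each key matches a Sid in the policy, so the lists can be compared against the Resource arrays after pasting.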

KMS encryption (optional)

If your S3 bucket uses KMS encryption with a customer managed key (CMK), add the following statement to the Statement array of your IAM policy to allow data encryption and decryption with your KMS key. Encryption with SSE-C is not currently supported.

KMS policy statement

{
  "Effect": "Allow",
  "Action": [
    "kms:GenerateDataKey",
    "kms:Decrypt"
  ],
  "Resource": "arn:aws:kms:REGION_NAME:ACCOUNT_ID:key/KEY_ID"
}

Replace REGION_NAME, ACCOUNT_ID, and KEY_ID with your values.
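
The following sketch shows one way to append the KMS statement to the Statement array of an existing policy document. The function name, the abbreviated base policy, and the key ID are hypothetical and only illustrate the shape of the result.

```python
import copy


def add_kms_statement(policy: dict, region: str, account_id: str, key_id: str) -> dict:
    """Return a copy of the policy with the KMS statement appended."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    updated["Statement"].append(
        {
            "Effect": "Allow",
            "Action": ["kms:GenerateDataKey", "kms:Decrypt"],
            "Resource": f"arn:aws:kms:{region}:{account_id}:key/{key_id}",
        }
    )
    return updated


# Abbreviated stand-in for the access policy; all values are hypothetical.
base_policy = {"Version": "2012-10-17", "Statement": []}
with_kms = add_kms_statement(
    base_policy, "us-east-1", "123456789012", "11111111-2222-3333-4444-555555555555"
)
print(with_kms["Statement"][-1]["Resource"])
```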

Athena vs. S3 permissions

Because Athena uses S3 as its underlying storage layer, the access granted by this policy is scoped down through the resource-specific permissions on the S3 actions: the service can only read and write data in the bucket named in the policy.

Click through to the Review step, choose a name for the policy (for example, transfer-service-policy, which will be referenced in the next step), add a description, and click Create policy.

Create role

IAM Role (recommended)

Navigate to the IAM service page.
Navigate to the Roles navigation tab, and click Create role.
Select Custom trust policy and paste the provided trust policy (from the prerequisites) to allow the service to assume this role. Click Next.
Add the permissions policy created above, and click Next.
Enter a Role name, for example, transfer-role, and click Create role.
Once the role is created, search for it in the Roles list, click the role name, and make a note of the ARN value.

HMAC Access Key ID & Secret Access Key

Role-based authentication is the preferred authentication mode for Athena, based on AWS recommendations. However, an HMAC Access Key ID and Secret Access Key can be used as an alternative authentication method if preferred.

Navigate to the IAM service page.
Navigate to the Users navigation tab, and click Add users.
Enter a User name for the service, for example, transfer-service, and click Next. Under Select AWS access type, select the Access key - Programmatic access option. Click Next: Permissions.
Click the Attach existing policies directly option, and search for the name of the policy created in the previous step. Select the policy, and click Next: Tags.
Click Next: Review and click Create user.
In the Success screen, record the Access key ID and the Secret access key.

Add your destination

Use the following details to complete the connection setup: database, schema, workgroup, bucket name, bucket region, and IAM Role ARN.
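
Before entering these details, it can help to check that nothing is missing and that the role ARN has the expected shape. The class below is a hypothetical container for illustration, not part of any SDK, and the example values are made up.

```python
import re
from dataclasses import dataclass, fields


@dataclass
class AthenaConnection:
    """Hypothetical container for the connection details listed above."""

    database: str
    schema: str
    workgroup: str
    bucket_name: str
    bucket_region: str
    iam_role_arn: str

    def problems(self) -> list:
        """Return human-readable problems; an empty list means it looks complete."""
        issues = [f"{f.name} is empty" for f in fields(self) if not getattr(self, f.name)]
        # IAM role ARNs have the form arn:<partition>:iam::<12-digit account>:role/<name>.
        pattern = r"arn:aws[a-z-]*:iam::\d{12}:role/.+"
        if self.iam_role_arn and not re.fullmatch(pattern, self.iam_role_arn):
            issues.append("iam_role_arn does not look like an IAM role ARN")
        return issues


# Hypothetical example values.
conn = AthenaConnection(
    database="example_database",
    schema="example_schema",
    workgroup="primary",
    bucket_name="example-staging-bucket",
    bucket_region="us-east-1",
    iam_role_arn="arn:aws:iam::123456789012:role/transfer-role",
)
print(conn.problems())
```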

Data Management

Follow these guidelines to manage your new Athena tables effectively:

Optimize Iceberg queries

To keep queries on your Iceberg tables fast, run the OPTIMIZE command periodically. It compacts the table by rewriting its data files into a more efficient layout:

Optimize table

OPTIMIZE iceberg_table REWRITE DATA USING BIN_PACK;
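
A periodic run could be scripted along these lines. This is a sketch that assumes a boto3 Athena client is passed in and that the workgroup has a query result location configured; the function name is hypothetical. The statement is written with the USING BIN_PACK clause, which Athena's documented OPTIMIZE syntax includes.

```python
def run_optimize(athena_client, table: str, database: str, workgroup: str = "primary") -> str:
    """Start an OPTIMIZE query and return its query execution ID."""
    response = athena_client.start_query_execution(
        QueryString=f"OPTIMIZE {table} REWRITE DATA USING BIN_PACK",
        QueryExecutionContext={"Database": database},
        # Assumes the workgroup defines where query results are written.
        WorkGroup=workgroup,
    )
    return response["QueryExecutionId"]


# Example usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# query_id = run_optimize(boto3.client("athena"), "iceberg_table", "example_schema")
```

The function only starts the query. Checking whether it succeeded would need a follow-up call such as get_query_execution with the returned ID.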

Set vacuum properties

Iceberg tables can accumulate snapshots over time, which can affect performance. To manage this, set the maximum age for snapshots that the vacuum process should retain:

Set vacuum properties

ALTER TABLE iceberg_table SET TBLPROPERTIES (
  'vacuum_max_snapshot_age_seconds'='259200');

The default setting is 432000 seconds (5 days). We recommend changing it only if you notice degraded performance.
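
Because the property is expressed in seconds, a small helper makes the conversion from days explicit. The function name is hypothetical; the arithmetic reproduces the two values used in this section.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def vacuum_age_statement(table: str, days: int) -> str:
    """Build the ALTER TABLE statement for a snapshot age given in whole days."""
    if days <= 0:
        raise ValueError("days must be positive")
    seconds = days * SECONDS_PER_DAY
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES (\n"
        f"  'vacuum_max_snapshot_age_seconds'='{seconds}');"
    )


print(vacuum_age_statement("iceberg_table", 3))  # 3 days = 259200 seconds
print(5 * SECONDS_PER_DAY)  # 432000 seconds = 5 days
```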

Perform time travel queries

Iceberg supports accessing historical data snapshots using time travel queries. This feature allows you to query the table as it appeared at a previous point in time, which is useful for audits and rollbacks:

Time travel query

SELECT * FROM iceberg_table FOR TIMESTAMP AS OF timestamp;

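Replace timestamp with a timestamp expression for the point in time you wish to query, for example TIMESTAMP '2024-01-01 00:00:00 UTC'. If you have a UNIX timestamp, convert it to a timestamp first.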
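
If the point in time is only known as UNIX epoch seconds, it can be converted to a timestamp literal before building the query. The sketch below does this in UTC; the function name is hypothetical.

```python
from datetime import datetime, timezone


def time_travel_query(table: str, unix_seconds: int) -> str:
    """Build a time travel query for the instant given as UNIX epoch seconds."""
    moment = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    literal = moment.strftime("%Y-%m-%d %H:%M:%S")
    return f"SELECT * FROM {table} FOR TIMESTAMP AS OF TIMESTAMP '{literal} UTC';"


print(time_travel_query("iceberg_table", 1704067200))
# SELECT * FROM iceberg_table FOR TIMESTAMP AS OF TIMESTAMP '2024-01-01 00:00:00 UTC';
```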

Permissions checklist

IAM role trust policy allows the service to assume the role.
IAM policy includes all of the Athena actions listed in the access policy on the target workgroup ARN.
IAM policy includes all of the Glue actions listed in the access policy on the target catalog, database, and tables.
IAM policy includes s3:PutObject, s3:ListBucket, s3:GetBucketLocation, s3:GetObject, s3:DeleteObject on the staging bucket and its contents.
If using KMS encryption: kms:GenerateDataKey and kms:Decrypt granted on the key.
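
Part of this checklist can be automated. The sketch below reports which of the actions from the access policy are missing from a policy document. It is a simplified check under stated assumptions: it compares action names only and does not evaluate resources, wildcards, or Deny statements. The function name is hypothetical.

```python
REQUIRED_ACTIONS = {
    "athena:GetQueryResults", "athena:StartQueryExecution", "athena:StopQueryExecution",
    "athena:StartSession", "athena:GetDatabase", "athena:GetDataCatalog",
    "athena:GetWorkGroup", "athena:GetTableMetadata", "athena:GetQueryExecution",
    "glue:GetDatabases", "glue:GetDatabase", "glue:GetTables", "glue:GetTable",
    "glue:GetPartitions", "glue:CreateTable", "glue:CreateDatabase",
    "glue:UpdateTable", "glue:DeleteTable",
    "s3:PutObject", "s3:ListBucket", "s3:GetBucketLocation",
    "s3:GetObject", "s3:DeleteObject",
}
KMS_ACTIONS = {"kms:GenerateDataKey", "kms:Decrypt"}


def missing_actions(policy: dict, include_kms: bool = False) -> set:
    """Return the required actions that no Allow statement in the policy grants."""
    granted = set()
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # IAM allows a single action as a plain string
            actions = [actions]
        granted.update(actions)
    required = REQUIRED_ACTIONS | KMS_ACTIONS if include_kms else REQUIRED_ACTIONS
    return required - granted


# Hypothetical policy that grants only the S3 actions.
example = {
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:ListBucket", "s3:GetBucketLocation",
                       "s3:GetObject", "s3:DeleteObject"],
        }
    ]
}
print(sorted(missing_actions(example)))
```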

FAQ

How is the Athena connection secured?

We use IAM role-based authentication by default. We assume the IAM role you configure and receive short-lived credentials, so no static access keys are required. All access is scoped to the permissions defined in the role’s IAM policy.

Why are Glue permissions required?

Athena uses the AWS Glue Data Catalog to store and manage table metadata. The Glue permissions allow the service to create and update table definitions as data is synced.

Do I need to pre-create the Glue database?

No. If the Glue database does not exist, it is created automatically before the first transfer. The glue:CreateDatabase permission in the policy enables this. If you prefer to use an existing database, remove glue:CreateDatabase from the policy and provide the existing database name.
