- 30 May 2025
- 9 Minutes to read
- PDF
Pull Data from Your Self-Managed Amazon S3 Bucket
Any external data source that can be configured to forward logs to an Amazon S3 bucket in your AWS account can forward data to the Red Canary Security Data Lake. All data forwarded in this way is storable and exportable from the Security Data Lake, and if it is newline-delimited JSON, it can be queried via the Search page.
Available for Early Access Customers
This integration is available for Red Canary Security Data Lake customers upon request. Please contact your Customer Success Manager if you are interested.
How does it work?
This ingest method works by listening for S3 object creation notifications, and ingesting any of the objects related to those notifications. Notifications are produced via the Amazon S3 Event Notifications system, and are delivered to Red Canary via the Simple Notification Service (SNS). At this time, a unique SNS topic is required for each distinct data source that uses this ingest method. Red Canary will require limited access to your AWS account via an assumed role in order to retrieve files from the bucket and subscribe to notifications. If it is not feasible to grant this access, consider using the Data Source via S3 (Managed by Red Canary) ingest method instead.
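To make the notification flow above concrete, here is a minimal Python sketch of how a consumer could pull bucket and key names out of an S3 event notification delivered through SNS. It illustrates the general AWS message shape only and is not Red Canary's implementation; the function name and the sample bucket and key are hypothetical.

```python
import json
from urllib.parse import unquote_plus


def object_refs_from_sns(sns_envelope: str) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs for created objects named in one SNS delivery."""
    envelope = json.loads(sns_envelope)
    # SNS carries the S3 event as a JSON string inside the "Message" field.
    s3_event = json.loads(envelope["Message"])
    refs = []
    # S3 test events carry no "Records" list, so default to an empty one.
    for record in s3_event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys are URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        refs.append((bucket, key))
    return refs


# Hypothetical sample delivery, for illustration only.
sample = json.dumps({
    "Type": "Notification",
    "Message": json.dumps({
        "Records": [{
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-logs"},
                "object": {"key": "juniper/day%3D30/events.json.gz"},
            },
        }]
    }),
})
print(object_refs_from_sns(sample))
```

Each pair identifies one object to retrieve from the bucket, which is why the assumed role needs read access to the bucket as well as permission to subscribe to the topic.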
By integrating your security logs with the Red Canary Security Data Lake, you can meet data retention requirements, export logs when needed for investigation or reporting, and ensure greater visibility into your security infrastructure for your team and Red Canary. To integrate an external data source with Red Canary through your self-managed Amazon S3 bucket, follow the procedure below from beginning to end.
Prerequisites
Before you start the Amazon S3 integration, please make sure the following requirements are met:
You have an active Red Canary Security Data Lake license.
You have configured your external data source to store its logs in an S3 bucket in your AWS account.
Your data source is configured to emit logs as gzip-compressed, zstd-compressed, or uncompressed files.
When possible, we recommend configuring your external data source to emit logs as newline-delimited JSON to maximize your visibility into the data, but any line-delimited text format can be ingested.
You have an AWS Console admin account with permissions to:
Create Simple Notification Service (SNS) topics
Adjust resource policies on SNS topics
Set notifications on S3 buckets
Adjust resource policies on S3 buckets
Create IAM roles
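If you want to confirm that a sample log file meets the newline-delimited JSON recommendation above before you integrate, a small local check along these lines may help. This is a sketch: the function name is hypothetical, it decides on gzip purely from a `.gz` file extension, and it omits zstd because that needs a third-party library in most Python versions.

```python
import gzip
import json


def count_ndjson_records(path: str) -> int:
    """Count JSON records in a file; raise ValueError at the first non-JSON line."""
    # Assumption: gzip files end in ".gz"; anything else is read as plain text.
    opener = gzip.open if path.endswith(".gz") else open
    count = 0
    with opener(path, "rt", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines between records
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(
                    f"line {line_number} is not valid JSON: {exc}"
                ) from exc
            count += 1
    return count
```

A file that fails this check can still be ingested as line-delimited text; per the note above, it just would not be queryable as JSON.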
1 | AWS | Identify the source bucket
This step assumes that you have already configured your external data source to write its logs to an Amazon S3 bucket in your AWS account. In order for Red Canary to ingest those logs, you’ll need to provide the Amazon Resource Name (ARN) that uniquely identifies the bucket.
Navigate to Amazon S3 in your AWS Console.
Search for the desired bucket, select it, and click Copy ARN.
Save this bucket ARN — you will need to provide it during configuration of the Red Canary integration.
2 | AWS | Create an SNS topic
You need to set up an SNS topic in AWS to receive event notifications when data is added to the S3 bucket. The SNS topic will require a resource policy which permits S3 to publish notifications.
Navigate to the Create Topic form in your AWS Console.
Choose Standard as the topic type and enter a Name.
Be sure to give your topic a meaningful name: for instance, you might name it pdx-campus-juniper-to-red-canary if you were setting up an integration to ingest security data from Juniper appliances at your Portland office.
Expand Access policy and select Advanced to edit the access policy document.
Modify the following policy fragment to match your configuration, replacing <REGION> with the correct AWS region, <ACCOUNT ID> with your AWS account ID, <TOPIC NAME> with the topic name entered earlier, and <BUCKET NAME> with the source bucket name. Note that there is a comma after the last curly brace.
{
  "Sid": "S3",
  "Effect": "Allow",
  "Principal": {
    "Service": "s3.amazonaws.com"
  },
  "Action": "SNS:Publish",
  "Resource": "arn:aws:sns:<REGION>:<ACCOUNT ID>:<TOPIC NAME>",
  "Condition": {
    "StringEquals": {
      "aws:SourceAccount": "<ACCOUNT ID>"
    },
    "ArnEquals": {
      "aws:SourceArn": "arn:aws:s3:::<BUCKET NAME>"
    }
  }
},
Once you have done this, insert the policy fragment into the beginning of the “Statement” array in the JSON editor (do not replace the entire policy statement).
Click Create topic in the bottom right corner to save the new topic.
Note the topic ARN — you will need to provide it during configuration of the Red Canary integration.
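If you prefer to generate the policy fragment rather than edit it by hand, the substitution and the "insert at the beginning of the Statement array" step can be sketched in Python as follows. The function names are hypothetical, and the existing policy in the usage example is a stand-in rather than the exact default policy AWS generates.

```python
import json


def s3_publish_statement(region: str, account_id: str,
                         topic_name: str, bucket_name: str) -> dict:
    """Build the statement that lets S3 publish to the topic."""
    return {
        "Sid": "S3",
        "Effect": "Allow",
        "Principal": {"Service": "s3.amazonaws.com"},
        "Action": "SNS:Publish",
        "Resource": f"arn:aws:sns:{region}:{account_id}:{topic_name}",
        "Condition": {
            "StringEquals": {"aws:SourceAccount": account_id},
            "ArnEquals": {"aws:SourceArn": f"arn:aws:s3:::{bucket_name}"},
        },
    }


def with_s3_statement(policy_json: str, statement: dict) -> str:
    """Insert the statement at the start of "Statement", keeping the rest."""
    policy = json.loads(policy_json)
    policy["Statement"].insert(0, statement)
    return json.dumps(policy, indent=2)


# Stand-in for the policy already shown in the JSON editor (illustrative only).
existing = '{"Version": "2008-10-17", "Statement": [{"Sid": "existing"}]}'
statement = s3_publish_statement(
    "us-west-2", "123456789012",
    "pdx-campus-juniper-to-red-canary", "example-logs",
)
print(with_s3_statement(existing, statement))
```

Generating the document this way avoids the trailing-comma mistakes that are easy to make when pasting a fragment into an existing JSON array.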
3 | AWS | Send S3 event notifications to the SNS topic
You need to configure the S3 bucket to send notifications to the SNS topic whenever new data is available.
The following instructions assume that your bucket does not already have any event notifications configured. If your bucket has preexisting event notifications configured, skip to the next section for instructions specific to that scenario.
Navigate to the Amazon S3 bucket list in your AWS Console.
Select the desired bucket, and navigate to the Properties tab.
Scroll down to find the Event Notifications section and click Create event notification.
Under General configuration, enter an Event name. These event names are unique only within the bucket, and are strictly informational.
Optionally, specify a Prefix and/or Suffix to restrict what data is sent to Red Canary.
A prefix is often used to specify a specific folder in the bucket that should be included in the data ingest, but it does not have to contain slashes. For example, if your bucket receives files from multiple sources, and the ones which you wish to send to Red Canary all start with “security-vendor-product-name-”, you would use that as the prefix.
Under Event types, select All object create events.
Scroll down to find the Destination section, select SNS topic, and choose the topic you created earlier under Specify SNS topic.
Click Save changes.
If the topic is correctly set up, you should see it populate with two subscriptions after you finish and activate the integration.
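For teams that script their AWS setup, the console choices in this step correspond roughly to the notification configuration document sketched below. The helper names are hypothetical and the code only builds and inspects the document; it makes no AWS calls. If you send such a document through the S3 PutBucketNotificationConfiguration API (for example, boto3's `put_bucket_notification_configuration`), keep in mind that the call replaces the bucket's entire notification configuration, so it suits only the no-existing-notifications scenario described here.

```python
def topic_notification_config(topic_arn: str, event_name: str,
                              prefix: str = "", suffix: str = "") -> dict:
    """Build a notification configuration for all object-create events."""
    rules = []
    if prefix:
        rules.append({"Name": "prefix", "Value": prefix})
    if suffix:
        rules.append({"Name": "suffix", "Value": suffix})
    config = {
        "Id": event_name,                  # informational, unique per bucket
        "TopicArn": topic_arn,
        "Events": ["s3:ObjectCreated:*"],  # "All object create events"
    }
    if rules:
        config["Filter"] = {"Key": {"FilterRules": rules}}
    return {"TopicConfigurations": [config]}


def key_matches(key: str, prefix: str = "", suffix: str = "") -> bool:
    """Mirror how a prefix/suffix filter selects object keys."""
    return key.startswith(prefix) and key.endswith(suffix)


# A prefix does not need to be a folder: it is a plain string match.
print(key_matches("security-vendor-product-name-2025-05-30.json.gz",
                  prefix="security-vendor-product-name-"))
```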
4 | AWS | Working around existing event notifications
You need to configure the S3 bucket to send notifications to the SNS topic whenever new data is available, but your bucket may already have event notifications configured. If so, read this section.
Depending on how these notifications were configured, you may be able to follow the steps in the previous section without issue. Before reading further, you should attempt the steps in the previous section — if they succeed, you can skip this section. If you get an error, continue below.
If you receive an error message stating “Configuration is ambiguously defined. Cannot have overlapping suffixes in two rules if the prefixes are overlapping for the same event type”, it is because there is an existing event notification rule that overlaps in some way with the Event type, Prefix, and Suffix you specified. If this happens, review your object notifications and see if you can remove that overlap. Each event notification must flow unambiguously to a single topic.
If, as above, you have an existing event notification rule that matches the data you wish to send to Red Canary, and which sends notifications to an SNS topic which is not already in use by another Red Canary data source, consider reusing the existing topic rather than trying to create a new topic.
If the topic is already in use by another Red Canary data source, or notifications are not published to an SNS topic (for example, the Destination is a Lambda function or SQS queue), please speak with your account team. At this time, you may need to copy data to Red Canary using the Data Source via S3 (Managed by Red Canary) ingest method, or implement a solution to replicate the notifications to an additional topic with a Lambda function.
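When reviewing existing rules for the overlap described in this section, it can help to reason about them mechanically. The sketch below approximates the check implied by the error message: two rules conflict when they share an event type and both their prefixes and their suffixes overlap. It is an interpretation of that message, not AWS's actual validation logic, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NotificationRule:
    event: str        # for example "s3:ObjectCreated:*"
    prefix: str = ""  # empty string means "no prefix filter"
    suffix: str = ""  # empty string means "no suffix filter"


def _events_overlap(a: str, b: str) -> bool:
    def covers(pattern: str, event: str) -> bool:
        # A trailing "*" covers every event that shares the leading part.
        return pattern == event or (
            pattern.endswith("*") and event.startswith(pattern[:-1])
        )
    return covers(a, b) or covers(b, a)


def rules_overlap(a: NotificationRule, b: NotificationRule) -> bool:
    """Approximate whether one object key could match both rules."""
    prefixes = a.prefix.startswith(b.prefix) or b.prefix.startswith(a.prefix)
    suffixes = a.suffix.endswith(b.suffix) or b.suffix.endswith(a.suffix)
    return _events_overlap(a.event, b.event) and prefixes and suffixes


existing = NotificationRule("s3:ObjectCreated:*", prefix="logs/")
proposed = NotificationRule("s3:ObjectCreated:Put", prefix="logs/juniper/", suffix=".gz")
print(rules_overlap(existing, proposed))
```

In the usage example, the proposed rule sits inside the existing rule's prefix and the existing rule has no suffix filter, so the two would be treated as overlapping under this approximation.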
5 | Red Canary | Add a new data lake integration
Now that you have prepared your AWS resources, you can create a new data lake integration in the Red Canary portal.
From your Red Canary dashboard, navigate to Integrations, click the split button to the right of Add Integration, and click Add Data Lake Integration.
Next to Add Integration, enter a name for your integration.
Choose how Red Canary will receive this data:
Under Ingest Format / Method, select Data Source via S3 (Self Managed).
Click the Next button.
Configure Red Canary to retrieve data from this integration:
Specify the ARN of the S3 bucket containing your data and the ARN of the SNS topic with S3 bucket notifications you created earlier.
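Before pasting the two ARNs into the form, a quick format check can catch copy-and-paste mistakes such as entering a bucket name instead of its ARN. The following is a sketch based on the general ARN layouts used in this article (`arn:aws:s3:::<BUCKET NAME>` and `arn:aws:sns:<REGION>:<ACCOUNT ID>:<TOPIC NAME>`); the function name is hypothetical, and Red Canary's own validation may differ.

```python
import re

# Bucket names: 3-63 characters; lowercase letters, digits, dots, hyphens.
BUCKET_ARN = re.compile(
    r"arn:(aws|aws-us-gov|aws-cn):s3:::[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]"
)
# Standard topic names: up to 256 letters, digits, hyphens, or underscores.
TOPIC_ARN = re.compile(
    r"arn:(aws|aws-us-gov|aws-cn):sns:[a-z0-9-]+:\d{12}:[A-Za-z0-9_-]{1,256}"
)


def arn_problems(bucket_arn: str, topic_arn: str) -> list[str]:
    """Return human-readable problems; an empty list means both look fine."""
    problems = []
    if not BUCKET_ARN.fullmatch(bucket_arn.strip()):
        problems.append(f"not a well-formed S3 bucket ARN: {bucket_arn!r}")
    if not TOPIC_ARN.fullmatch(topic_arn.strip()):
        problems.append(f"not a well-formed SNS topic ARN: {topic_arn!r}")
    return problems


print(arn_problems(
    "arn:aws:s3:::example-logs",
    "arn:aws:sns:us-west-2:123456789012:pdx-campus-juniper-to-red-canary",
))
```

This checks format only; it cannot confirm that the bucket or topic exists or that the topic's access policy allows S3 to publish to it.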
6 | Red Canary | Generate an IAM role template
You’ll use a Red Canary-provided template to provision an IAM role in your AWS environment for Red Canary access.
In the Provision an AWS IAM role that Red Canary will assume to read from the above AWS S3 bucket section on the Red Canary configuration page, select CloudFormation or Terraform to generate the appropriate template.
Copy and paste your required template into a new file/document and save it. You’ll upload this file to AWS later.
If you have not already done so for a previous data lake source, use the supplied CloudFormation or Terraform documents to configure the role and policy that Red Canary will use to access the SNS topic and files in your bucket. Please note that this is not the same role that is used for other non-data-lake integrations like AWS or SentinelOne. If your organizational policy requires the use of a custom role name, or you have already created this role but it is used in another Red Canary subdomain, please contact your account team for assistance in configuring this.
7 | AWS | Provision the IAM role
You’ll now pivot back to AWS to apply the saved template.
If you have already created a data source that uses the Data Source via S3 (Self Managed) ingest method to ingest data from the same AWS account, you do not need to apply the CloudFormation or Terraform again, as the necessary role will already exist. Instead, you must grant the existing role permission to perform the actions listed in the CloudFormation or Terraform document on the applicable resources.
For CloudFormation
Navigate to the CloudFormation Stacks page in your AWS Console.
Click Create stack and select With new resources (standard).
Under Prerequisite - Prepare template, select Choose an existing template.
Under Specify template, select Upload template file.
Click Choose file and upload the template file you created earlier.
Click Next.
Enter a name for your new stack.
Click Next.
Under Capabilities, accept the acknowledgment message.
Click Next.
Click Submit.
For Terraform
Terraform usage is dependent on your environment. If you need assistance with the Terraform template, please contact Red Canary Support.
8 | Red Canary | Confirm the IAM role in Red Canary
When you’ve finished provisioning the IAM role in AWS, select the I’ve configured this integration to send data to Red Canary checkbox on the Red Canary configuration page.
Click Next.
9 | Red Canary | Specify the data retention period and save
Customize how data from this integration is handled:
Specify your desired data retention period in days.
Click Save in the bottom right corner.
The Amazon S3 integration is now live!
You should see data start appearing in the Security Data Lake within one hour.