Splunk with Amazon Kinesis Streams

Yash Dwivedi
Mar 12, 2021

The Amazon Kinesis platform of managed services continuously captures and stores terabytes of data per hour from hundreds or thousands of sources, for real-time processing over large distributed streams. Splunk enables data insights, transformation, and visualization. Both Splunk and Amazon Kinesis can ingest data directly from your data producers.

This powerful combination lets you quickly capture, analyze, transform, and visualize streams of data without needing to write complex code using Amazon Kinesis client libraries.

Amazon Kinesis Streams

Amazon Kinesis Streams is a fully managed service for real-time processing of data streams at massive scale. You can configure hundreds of thousands of data producers to continuously put data into an Amazon Kinesis stream.
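
To make the producer side concrete, here is a minimal sketch of a producer putting one JSON event on a stream. It assumes a client that behaves like boto3.client("kinesis"); the stream name, event fields, and partition key in the usage note are placeholders, not values from this post.

```python
import json


def put_event(kinesis_client, stream_name, event, partition_key):
    """Serialize one event as JSON and put it on a Kinesis stream.

    kinesis_client is assumed to expose put_record the way
    boto3.client("kinesis") does. The partition key decides which
    shard receives the record.
    """
    return kinesis_client.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=partition_key,
    )
```

With boto3 installed and credentials configured, a call could look like put_event(boto3.client("kinesis"), "app-logs", {"level": "info"}, "host-1"). Using a value such as the host name as the partition key keeps each host's events in order within a shard.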

Splunk

Splunk is a platform for real-time operational intelligence. It is an easy, fast, and secure way to analyze and visualize massive streams of data generated by IT systems or technology infrastructure.

How about getting the best of both worlds?

Let’s go over the new integration’s end-to-end solution and examine how Kinesis Firehose and Splunk work together to expand the push-based approach into a native AWS solution for applicable data sources.

Using a managed service like Kinesis Firehose for data ingestion into Splunk provides out-of-the-box reliability and scalability. One of the pain points of the old approach was the overhead of managing the data collection nodes (Splunk heavy forwarders). With the new Kinesis Firehose to Splunk integration, there are no forwarders to manage or set up. Data producers are configured through the AWS Console to drop data into Kinesis Firehose.
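
Producers that are not wired up through the console can also write to a delivery stream from code. The sketch below is one way to do that, assuming a client that behaves like boto3.client("firehose"). The helper names and the delivery stream name are hypothetical. The batch size reflects the PutRecordBatch limit of 500 records per call.

```python
import json

MAX_BATCH = 500  # PutRecordBatch accepts at most 500 records per call


def to_records(events):
    """Encode each event as a JSON Firehose record."""
    return [{"Data": json.dumps(e).encode("utf-8")} for e in events]


def send_events(firehose_client, delivery_stream, events):
    """Send events in batches and return how many records were rejected.

    firehose_client is assumed to expose put_record_batch the way
    boto3.client("firehose") does.
    """
    records = to_records(events)
    failed = 0
    for start in range(0, len(records), MAX_BATCH):
        response = firehose_client.put_record_batch(
            DeliveryStreamName=delivery_stream,
            Records=records[start:start + MAX_BATCH],
        )
        failed += response.get("FailedPutCount", 0)
    return failed
```

A non-zero return value means some records were not accepted. A real producer would inspect the per-record responses and retry those records, which this sketch leaves out.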

Cross-account data sharing

Using a CloudWatch Logs destination, data can be sent from multiple sender accounts to a single receiving account. In AWS Organizations, it can also be used to push down policies and control through the organizational structure. For data to be shared, both sender and recipient details are needed.
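
The sender and recipient details come together in the access policy attached to the destination in the receiving account. The following is a minimal sketch of such a policy built in Python. The function name, the account IDs, and the destination ARN are placeholders chosen for illustration.

```python
import json


def destination_policy(sender_account_ids, destination_arn):
    """Build an access policy that lets sender accounts subscribe
    their log groups to a CloudWatch Logs destination."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowSenderSubscriptions",
                "Effect": "Allow",
                "Principal": {"AWS": list(sender_account_ids)},
                "Action": "logs:PutSubscriptionFilter",
                "Resource": destination_arn,
            }
        ],
    }


# Placeholder account IDs and ARN, not real resources.
policy = destination_policy(
    ["111111111111", "222222222222"],
    "arn:aws:logs:us-east-1:999999999999:destination:exampleDestination",
)
print(json.dumps(policy, indent=2))
```

The receiving account would attach the resulting JSON to the destination, for example with the CloudWatch Logs put_destination_policy call. Each sender account then creates a subscription filter that points at the destination ARN.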

Amazon Kinesis Data Firehose makes it easy to stream machine-generated data to Splunk for operational intelligence. Kinesis Data Firehose can stream data to your Splunk cluster in real time at any scale. This integration supports Splunk versions with HTTP Event Collector (HEC), including Splunk Enterprise and Splunk Cloud.
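
Firehose makes the HEC calls for you, but it can help to see what an HEC request looks like, for instance when checking a token before pointing a delivery stream at it. The sketch below only builds the request and does not send it. The host, token, and sourcetype shown in the usage note are placeholders, and the function name is hypothetical.

```python
import json
import urllib.request


def build_hec_request(base_url, token, event, sourcetype=None):
    """Build (but do not send) a POST request for the HEC event endpoint."""
    payload = {"event": event}
    if sourcetype:
        payload["sourcetype"] = sourcetype
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # HEC expects the token in a "Splunk <token>" Authorization header.
            "Authorization": "Splunk " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

A request built as build_hec_request("https://splunk.example.com:8088", "<your-hec-token>", {"msg": "hello"}, "example:json") could then be passed to urllib.request.urlopen against a reachable HEC endpoint.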

The integration between Splunk Enterprise or Splunk Cloud and Amazon Kinesis Data Firehose is designed to make AWS data ingestion setup seamless, while offering a secure and fault-tolerant delivery mechanism. Splunk makes it convenient to monitor and analyze machine data from any source and use it to deliver operational intelligence, optimize infrastructure operations, maximize security, and increase business performance. Kinesis Data Firehose gives you a fully managed, reliable, and scalable data streaming solution to Splunk. In this post, we discuss a step-by-step procedure for Kinesis Data Firehose and Splunk integration so you can seamlessly ingest AWS data into Splunk.

This is a push-based approach that offers a low-latency, scalable data pipeline made up of serverless resources like AWS Lambda, sending directly to Splunk indexers through Splunk HEC. This is different from the pull-based approach that uses the Splunk Add-on for AWS.
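
One place Lambda can fit into this pipeline is as a Firehose data transformation function that reshapes records before delivery. The sketch below assumes the delivery stream targets the HEC event endpoint, so each record is wrapped in an HEC-style JSON object. The sourcetype value is hypothetical, and real payloads (for example compressed CloudWatch Logs data) would need extra decoding that is not shown.

```python
import base64
import json

SOURCETYPE = "aws:firehose:example"  # hypothetical sourcetype for illustration


def lambda_handler(event, context):
    """Wrap each incoming Firehose record in an HEC-style event object."""
    output = []
    for record in event["records"]:
        # Firehose passes record data to the function base64-encoded.
        raw = base64.b64decode(record["data"]).decode("utf-8")
        hec_event = {"event": raw, "sourcetype": SOURCETYPE}
        output.append({
            "recordId": record["recordId"],  # must echo the incoming ID
            "result": "Ok",
            "data": base64.b64encode(
                json.dumps(hec_event).encode("utf-8")
            ).decode("utf-8"),
        })
    return {"records": output}
```

Every incoming recordId has to appear in the response. A record can also be returned with a result of "Dropped" or "ProcessingFailed" instead of "Ok" when it should be skipped or could not be transformed.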

Conclusion

The combination of Amazon Kinesis and Splunk enables powerful capabilities: you can ingest data at massive scale, consume data for analytics, create visualizations or custom data processing using Splunk, and potentially tie in Lambda functions for multiple consumer needs, all while ingesting data into Amazon Kinesis only once. This is a win-win combination for customers who are already using Splunk and AWS services, or customers looking to implement scalable data ingestion and data insight mechanisms for their big data needs.
