The hottest new programming language is English…And a picture is worth a thousand words! — Beautiful GitOps architectures come to life

deepakvraghavan
7 min read · May 25, 2023

The credit for the phrase “The hottest new programming language is English” goes to Andrej Karpathy, who has made significant contributions at Tesla, OpenAI, and Stanford.

As simple as the statement is, its impact is profound. With the advancements we have today, the barrier to entry for solving complex problems that require both breadth and depth of technical expertise and creative talent has never been lower.

Building products that solve a specific domain problem and scale with time and changing needs used to take a village of cross-skilled software engineers and subject matter experts (SMEs). With the progress we have made in Generative AI, we can now see the practical impact in our daily lives as engineers and product builders.

It couldn’t be simpler to use the tools at our disposal to get 80% of the work done for us, from envisioning, planning, architecture, and design all the way to sample starter-kit code that kicks off product development.

I ran an experiment to see how well plain English could generate practical code. Here is an example of a prototype architecture that I was able to get from ChatGPT for a simple IoT deployment use case on the AWS platform.

The Edge-Stream IoT Architecture leverages the power of edge computing and real-time streaming analytics to efficiently process and analyze large volumes of data generated by IoT devices. It aims to minimize latency, reduce bandwidth consumption, and provide near real-time insights for IoT applications.

Components:

  1. IoT Devices: Various IoT devices (sensors, actuators, cameras, etc.) are deployed at the edge to collect data from the physical world. These devices can be spread across different locations and are responsible for capturing and transmitting sensor data.
  2. Edge Gateway: An Edge Gateway device is deployed at each location to aggregate data from local IoT devices. It acts as a bridge between the IoT devices and the cloud, performing data preprocessing and filtering tasks. The Edge Gateway is responsible for securely transmitting the data to the cloud while optimizing bandwidth usage.
  3. AWS Greengrass: AWS Greengrass is deployed on the Edge Gateway to provide local compute, messaging, and data caching capabilities. It allows for running Lambda functions at the edge, enabling real-time data processing, rule-based filtering, and local decision making. Greengrass also facilitates bi-directional communication with the cloud.
  4. AWS IoT Core: AWS IoT Core is used for managing the IoT devices, secure device connectivity, and bidirectional communication. It provides device shadowing, which allows for synchronizing the state of devices with the cloud. IoT Core ensures secure, reliable, and scalable communication between devices and the cloud.
  5. Amazon Kinesis Data Streams: Kinesis Data Streams is utilized for real-time ingestion of IoT data at scale. It enables the continuous collection and storage of the streaming data. The data is partitioned and distributed across multiple shards for parallel processing.
  6. AWS Lambda: AWS Lambda functions are leveraged for real-time stream processing and analytics. These functions are triggered by incoming data from Kinesis Data Streams and perform various operations such as filtering, transformation, aggregation, and anomaly detection. Lambda functions can be written in languages like Python, Java, or Node.js; a minimal Python sketch of such a handler follows this list.
  7. AWS IoT Analytics: AWS IoT Analytics is used for batch processing and long-term storage of IoT data. It provides the ability to run complex queries, perform data transformations, and generate visualizations for historical analysis. It allows for storing and retrieving data from multiple sources, including Kinesis Data Streams and IoT Core.
  8. AWS IoT Events: AWS IoT Events can be utilized to detect and respond to events and patterns in the IoT data stream. It enables the creation of rules-based workflows that trigger actions based on predefined conditions. This helps automate actions like sending notifications, activating actuators, or triggering alarms.
  9. AWS S3: Amazon S3 is employed for long-term storage of raw and processed IoT data. It provides scalable and durable object storage, ensuring data availability for future analysis, archival, and compliance purposes.
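
To make the stream-processing step concrete, here is a minimal sketch of what the Lambda handler from component 6 might look like in Python. The payload fields and the anomaly threshold are illustrative assumptions on my part, not part of the generated design.

import base64
import json

# Minimal sketch of the stream-processing handler from component 6.
# The payload fields ("device_id", "temperature") and the threshold
# are illustrative assumptions, not part of the generated design.
def lambda_handler(event, context):
    anomalies = []
    for record in event["Records"]:
        # Kinesis delivers each record's payload base64-encoded
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        # Rule-based filtering: flag readings outside an expected range
        if payload.get("temperature", 0) > 80:
            anomalies.append(payload)
    if anomalies:
        print(f"Detected {len(anomalies)} anomalous readings")
    return {"processed": len(event["Records"]), "anomalies": len(anomalies)}

Triggering, batching, and retries are all handled by the Kinesis event source mapping, which the Terraform script below sets up.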

Benefits:

  • Reduced Latency: By leveraging edge computing and real-time streaming analytics, the architecture minimizes the latency between data collection, processing, and decision-making, enabling near real-time insights.
  • Bandwidth Optimization: The Edge Gateway device filters and preprocesses the data locally, reducing the amount of data transmitted to the cloud. This optimizes bandwidth usage, lowers costs, and improves network efficiency.
  • Local Decision Making: With AWS Greengrass and Lambda functions at the edge, the architecture enables local decision making based on predefined rules and conditions. This allows for quicker responses and reduces dependence on cloud connectivity.
  • Scalability and Flexibility: The architecture leverages scalable services like Kinesis Data Streams, IoT Analytics, and S3, providing the ability to handle large volumes of IoT data. It can accommodate the growth of IoT deployments and adapt to changing requirements.
  • Secure and Reliable: AWS IoT Core ensures secure communication between devices, the edge, and the cloud. The architecture follows AWS best practices for security, data privacy, and access control, ensuring the integrity of the IoT ecosystem.

I love how thoroughly it accounts for the different components in the proposed architecture and the benefits of picking them. I went ahead and asked it to create a deployment script I could use as a starting point, and it gave me the Terraform script below, which I have cleaned up and annotated where the generated code needed fixes.

# Define the provider and region
provider "aws" {
  region = "us-west-2"
}

# Create an IoT thing for each device
resource "aws_iot_thing" "iot_things" {
  count = 5
  name  = "iot-device-${count.index}"
}

# Create an IoT policy for devices to connect
# NOTE: "iot:*" on all resources is overly permissive; scope this
# down before using it outside a prototype.
resource "aws_iot_policy" "iot_policy" {
  name   = "iot-policy"
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "iot:*",
    "Resource": "*"
  }]
}
EOF
}

# IoT policies attach to certificates (or Cognito identities), not
# directly to things, so create a certificate per device...
resource "aws_iot_certificate" "device_certs" {
  count  = 5
  active = true
}

# ...attach the policy to each certificate (the argument is "policy",
# not "policy_name", and each attachment takes a single target)...
resource "aws_iot_policy_attachment" "iot_policy_attachment" {
  count  = 5
  policy = aws_iot_policy.iot_policy.name
  target = aws_iot_certificate.device_certs[count.index].arn
}

# ...and associate each certificate with its thing
resource "aws_iot_thing_principal_attachment" "thing_cert_attachment" {
  count     = 5
  thing     = aws_iot_thing.iot_things[count.index].name
  principal = aws_iot_certificate.device_certs[count.index].arn
}

# Create an IAM role for Greengrass
resource "aws_iam_role" "greengrass_role" {
  name = "greengrass-role"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "greengrass.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Attach the AWS-managed policy to the Greengrass role
resource "aws_iam_role_policy_attachment" "greengrass_policy_attachment" {
  role       = aws_iam_role.greengrass_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSGreengrassResourceAccessRolePolicy"
}

# NOTE: the generated script used an "aws_greengrass_group" resource,
# which does not exist in the official AWS provider. Greengrass groups
# have to be created outside this configuration (for example, with the
# AWS CLI), so the resource is left commented out here.
# resource "aws_greengrass_group" "greengrass_group" {
#   name     = "greengrass-group"
#   role_arn = aws_iam_role.greengrass_role.arn
# }

# IAM role that lets AWS IoT write to the Kinesis stream; the generated
# script referenced this role without defining it
resource "aws_iam_role" "kinesis_role" {
  name = "iot-kinesis-role"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "iot.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_role_policy" "kinesis_put_policy" {
  name = "iot-kinesis-put"
  role = aws_iam_role.kinesis_role.id

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "kinesis:PutRecord",
      "Resource": "${aws_kinesis_stream.data_stream.arn}"
    }
  ]
}
EOF
}

# Create an AWS IoT Core rule that forwards data to Kinesis Data Streams.
# Rule names only allow alphanumerics and underscores, "enabled" and
# "sql_version" are required, and the provider takes the "kinesis" action
# block at the top level rather than nested inside an "action" block.
resource "aws_iot_topic_rule" "data_forwarding_rule" {
  name        = "data_forwarding_rule"
  sql         = "SELECT * FROM 'sensors/#'"
  sql_version = "2016-03-23"
  enabled     = true
  description = "Forward IoT data to Kinesis Data Streams"

  kinesis {
    role_arn      = aws_iam_role.kinesis_role.arn
    stream_name   = aws_kinesis_stream.data_stream.name
    partition_key = "iot-data"
  }
}

# Create a Kinesis Data Stream
resource "aws_kinesis_stream" "data_stream" {
  name        = "iot-data-stream"
  shard_count = 2
}

# Create the AWS Lambda function for IoT data processing
# (lambda_function.zip must exist locally and contain an index.py that
# exposes a lambda_handler function)
resource "aws_lambda_function" "data_processing_lambda" {
  function_name    = "iot-data-processing-lambda"
  runtime          = "python3.8"
  handler          = "index.lambda_handler"
  filename         = "lambda_function.zip"
  source_code_hash = filebase64sha256("lambda_function.zip")
  role             = aws_iam_role.lambda_role.arn
}

# Create an IAM role for the Lambda function
resource "aws_iam_role" "lambda_role" {
  name = "lambda-role"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Attach AWS-managed policies to the Lambda role; the Kinesis execution
# policy is needed for the event source mapping below to read the stream
resource "aws_iam_role_policy_attachment" "lambda_policy_attachment" {
  role       = aws_iam_role.lambda_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}

resource "aws_iam_role_policy_attachment" "lambda_kinesis_policy_attachment" {
  role       = aws_iam_role.lambda_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaKinesisExecutionRole"
}

# Trigger the Lambda function on Kinesis Data Streams events
resource "aws_lambda_event_source_mapping" "kinesis_event_mapping" {
  event_source_arn  = aws_kinesis_stream.data_stream.arn
  function_name     = aws_lambda_function.data_processing_lambda.arn
  starting_position = "LATEST"
}

# Create an S3 bucket for long-term storage
# (bucket names are globally unique, so this name will need to change)
resource "aws_s3_bucket" "data_storage_bucket" {
  bucket = "iot-data-storage-bucket"
}

# Output the names and ARNs of key resources
output "iot_thing_names" {
  value = aws_iot_thing.iot_things[*].name
}

output "iot_thing_arns" {
  value = aws_iot_thing.iot_things[*].arn
}

# Commented out along with the Greengrass group above
# output "greengrass_group_name" {
#   value = aws_greengrass_group.greengrass_group.name
# }

output "kinesis_stream_name" {
  value = aws_kinesis_stream.data_stream.name
}

output "lambda_function_name" {
  value = aws_lambda_function.data_processing_lambda.function_name
}

output "s3_bucket_name" {
  value = aws_s3_bucket.data_storage_bucket.id
}
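
Once the stack is applied, a quick way to smoke-test the pipeline is to publish a fake reading to a topic matched by the rule's SELECT * FROM 'sensors/#' clause. Here is a minimal boto3 sketch; the device name and payload are hypothetical, and it assumes AWS credentials are configured for the same us-west-2 region used above.

import json
import boto3

# Hypothetical smoke test: publish a fake sensor reading to a topic the
# data-forwarding rule matches. Assumes AWS credentials are configured
# and the stack above has been applied in us-west-2.
iot_data = boto3.client("iot-data", region_name="us-west-2")

reading = {"device_id": "iot-device-0", "temperature": 92.4}
iot_data.publish(
    topic="sensors/iot-device-0",  # matched by SELECT * FROM 'sensors/#'
    qos=1,
    payload=json.dumps(reading),
)

If everything is wired up, the reading should land in the Kinesis stream, and the handler sketched earlier would flag it as anomalous.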

I will still have to make a few tweaks and customize it with the inputs for my desired deployment, but it clearly removes the busy work of creating the basic building blocks.

A picture is worth a thousand words!

As much as I like code and working prototypes, I am a visual person, and it is super useful to have an architecture diagram of how the different pieces work together, especially as your overall enterprise architecture grows to span multiple systems and subsystems.

One of the products I find absolutely helpful in this space is Brainboard.

Brainboard helps you create architecture diagrams from Terraform code and keep them up to date. This is mighty helpful.

Take a look at Brainboard when you get a chance. Architecture diagrams so often drift out of sync with the source code, yet almost everyone finds them helpful; a diagram stays relevant only while it matches the code.

Make this a part of your GitOps pipeline and all your key assets (code, infrastructure, and documentation) stay in sync with no drift at all. This is powerful!

I hope you leverage the tools at our disposal to create frictionless infrastructure and practical, useful documentation using the best practices of Generative AI and GitOps!
