Tag Archives: AWS Lambda

Using Amazon Verified Permissions to manage authorization for AWS IoT smart home applications

Post Syndicated from Rajat Mathur original https://aws.amazon.com/blogs/security/using-amazon-verified-permissions-to-manage-authorization-for-aws-iot-smart-thermostat-applications/

This blog post introduces how manufacturers and smart appliance consumers can use Amazon Verified Permissions to centrally manage permissions and fine-grained authorizations. Developers can offer more intuitive, user-friendly experiences by designing interfaces that align with user personas and multi-tenancy authorization strategies, which can lead to higher user satisfaction and adoption. Traditionally, implementing authorization logic using role-based access control (RBAC) or attribute-based access control (ABAC) within IoT applications can become complex as the number of connected devices and associated user roles grows. This often leads to an unmanageable increase in access rules that must be hard-coded into each application, requiring excessive compute power for evaluation. By using Verified Permissions, you can externalize the authorization logic using Cedar policy language, enabling you to define fine-grained permissions that combine RBAC and ABAC models. This decouples permissions from your application’s business logic, providing a centralized and scalable way to manage authorization while reducing development effort.

In this post, we walk you through a reference architecture that outlines an end-to-end smart thermostat application solution using AWS IoT Core, Verified Permissions, and other AWS services. We show you how to use Verified Permissions to build an authorization solution using Cedar policy language to define dynamic policy-based access controls for different user personas. The post includes a link to a GitHub repository that houses the code for the web dashboard and the Verified Permissions logic to control access to the solution APIs.

Solution overview

This solution consists of a smart thermostat IoT device and an AWS hosted web application using Verified Permissions for fine-grained access to various application APIs. For this use case, the AWS IoT Core device is being simulated by an AWS Cloud9 environment and communicates with the IoT service using AWS IoT Device SDK for Python. After being configured, the device connects to AWS IoT Core to receive commands and send messages to various MQTT topics.
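For illustration, a minimal device-side sketch using the AWS IoT Device SDK v2 for Python might look like the following; the endpoint, certificate file paths, and topic name are placeholders rather than values from the solution repository.

import json

from awscrt import mqtt
from awsiot import mqtt_connection_builder

# Build an mTLS connection to AWS IoT Core using the device certificate and key.
mqtt_connection = mqtt_connection_builder.mtls_from_path(
    endpoint="xxxxxxxxxxxxxx-ats.iot.us-east-1.amazonaws.com",  # placeholder IoT endpoint
    cert_filepath="thermostat1.cert.pem",                       # placeholder certificate
    pri_key_filepath="thermostat1.private.key",                 # placeholder private key
    ca_filepath="AmazonRootCA1.pem",
    client_id="Thermostat1",
)
mqtt_connection.connect().result()  # block until the connection completes

# Publish the thermostat's current reading to an illustrative application topic.
mqtt_connection.publish(
    topic="smarthome/thermostat1/temperature",
    payload=json.dumps({"reportedTemperature": 74}),
    qos=mqtt.QoS.AT_LEAST_ONCE,
)

mqtt_connection.disconnect().result()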

As a general practice, when a user-facing IoT solution is implemented, the manufacturer performs administrative tasks such as:

  1. Embedding AWS Private Certificate Authority certificates into each IoT device (in this case a smart thermostat). Usually this is done on the assembly line and the certificates used to verify the IoT endpoints are burned into device memory along with the firmware.
  2. Creating an Amazon Cognito user pool that provides sign-up and sign-in options for web and mobile application users and hosts the authentication process.
  3. Creating policy stores and policy templates in Verified Permissions. Based on who signs up, the manufacturer creates policies with Verified Permissions to link each signed-up user to certain allowed resources or IoT devices.
  4. The mapping of user to device is stored in a datastore. For this solution, you’ll use an Amazon DynamoDB table to record the relationship.
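As a sketch only, writing such a mapping record might look like the following; the table name and attribute names are hypothetical and will differ from the actual solution code.

import boto3

# Hypothetical table and attribute names used only to illustrate the mapping record.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("UserDeviceMapping")  # placeholder table name

table.put_item(
    Item={
        "username": "john_doe",        # primary device owner
        "deviceId": "Thermostat1",     # AWS IoT thing name
        "serialNumber": "SN-0001",     # serial number entered during device claim
        "role": "primaryOwner",
    }
)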

The user who purchases the device (the primary device owner) performs the following tasks:

  1. Signs up on the manufacturer’s web application or mobile app and registers the IoT device by entering a unique serial number. The mapping between user details and the device serial number is stored in the datastore through an automated process that is initiated after sign-up and device claim.
  2. Connects the new device to an existing wireless network, which initiates a registration process to securely connect to AWS IoT Core services within the manufacturer’s account.
  3. Invites other users (such as guests, family members, or the power company) through a referral, invitation link, or a designated OAuth process.
  4. Assigns roles, and therefore permissions, to the other users.
     
Figure 1: Sample smart home application architecture built using AWS services

Figure 1: Sample smart home application architecture built using AWS services

Figure 1 depicts the solution as three logical components:

  1. The first component depicts device operations through AWS IoT Core. The smart thermostat is on site; it communicates with AWS IoT Core, and its state is managed through the AWS IoT Device Shadow service.
  2. The second component depicts the web application, which is the application interface that customers use. It’s a ReactJS-backed single page application deployed using AWS Amplify.
  3. The third component shows the backend application, which is built using Amazon API Gateway, AWS Lambda, and DynamoDB. A Cognito user pool is used to manage application users and their authentication. Authorization is handled by Verified Permissions, where you create and manage policies that are evaluated when the web application calls backend APIs. Each API call is evaluated against these policies to produce an access decision that allows or denies the action.

The solution flow itself can be broken down into three steps after the device is onboarded and users have signed up:

  1. The smart thermostat device connects and communicates with AWS IoT Core using the MQTT protocol. A classic Device Shadow is created for the AWS IoT thing Thermostat1 when the UpdateThingShadow call is made for the first time through the AWS SDK for a new device. The AWS IoT Device Shadow service lets the web application query and update the device’s state even when there are connectivity issues.
  2. Users sign up or sign in to the Amplify hosted smart home application and authenticate themselves against a Cognito user pool. They’re mapped to a device, which is stored in a DynamoDB table.
  3. After the users sign in, they’re allowed to perform certain tasks and view certain sections of the dashboard based on the different roles and policies managed by Verified Permissions. The underlying Lambda function that’s responsible for handling the API calls queries the DynamoDB table to provide user context to Verified Permissions.
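To illustrate step 3, the authorization check inside that Lambda function might resemble the following sketch; the policy store ID and context values are placeholders, and the entity types correspond to the schema shown later in this post.

import boto3

avp = boto3.client("verifiedpermissions")

# Ask Verified Permissions whether jane_doe may set the temperature on Thermostat1.
# The policy store ID and context values are placeholders.
response = avp.is_authorized(
    policyStoreId="PSEXAMPLEabcdefg111111",
    principal={"entityType": "AwsIotAvpWebApp::User", "entityId": "jane_doe"},
    action={"actionType": "AwsIotAvpWebApp::Action", "actionId": "SetTemperature"},
    resource={"entityType": "AwsIotAvpWebApp::Device", "entityId": "Thermostat1"},
    context={
        "contextMap": {
            "desiredTemperature": {"long": 75},
            "time": {"long": 930},  # minutes since midnight (3:30 PM)
        }
    },
)

if response["decision"] == "ALLOW":
    pass  # proceed with the device shadow update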

Prerequisites

  1. To deploy this solution, you need access to the AWS Management Console and AWS Command Line Interface (AWS CLI) on your local machine with sufficient permissions to access required services, including Amplify, Verified Permissions, and AWS IoT Core. For this solution, you’ll give the services full access to interact with different underlying services. But in production, we recommend following security best practices with AWS Identity and Access Management (IAM), which involves scoping down policies.
  2. Set up Amplify CLI by following these instructions. We recommend the latest NodeJS stable long-term support (LTS) version. At the time of publishing this post, the LTS version was v20.11.1. Users can manage multiple NodeJS versions on their machines by using a tool such as Node Version Manager (nvm).

Walkthrough

The following table describes the actions, resources, and authorization decisions that will be enforced through Verified Permissions policies to achieve fine-grained access control. In this example, John is the primary device owner and has purchased and provisioned a new smart thermostat device called Thermostat1. He has invited Jane to access his device and has given her restricted permissions. John has full control over the device whereas Jane is only allowed to read the temperature and set the temperature between 72°F and 78°F.

John has also decided to give his local energy provider (Power Company) access to the device so that they can set the optimum temperature during the day to manage grid load and offer him maximum savings on his energy bill. However, they can only do so between 2:00 PM and 5:00 PM.

For security purposes, the Verified Permissions default decision is DENY for unauthorized principals.

Name          | Principal    | Action         | Resource    | Authorization decision
Any           | Default      | Default        | Default     | Deny
John          | john_doe     | Any            | Thermostat1 | Allow
Jane          | jane_doe     | GetTemperature | Thermostat1 | Allow
Jane          | jane_doe     | SetTemperature | Thermostat1 | Allow only if the desired temperature is between 72°F and 78°F
Power Company | powercompany | GetTemperature | Thermostat1 | Allow only if accessed between the hours of 2:00 PM and 5:00 PM
Power Company | powercompany | SetTemperature | Thermostat1 | Allow only if the temperature is set between the hours of 2:00 PM and 5:00 PM

Create a Verified Permissions policy store

Verified Permissions is a scalable permissions management and fine-grained authorization service for the applications that you build. The policies are created using Cedar, a dedicated language for defining access permissions in applications. Cedar seamlessly integrates with popular authorization models such as RBAC and ABAC.

A policy is a statement that either permits or forbids a principal to take one or more actions on a resource. A policy store is a logical container that stores your Cedar policies, schema, and principal sources. A schema helps you to validate your policy and identify errors based on the definitions you specify. See Cedar schema to learn about the structure and formal grammar of a Cedar schema.

To create the policy store

  1. Sign in to the Amazon Verified Permissions console and choose Create policy store.
  2. In the Configuration Method section, select Empty Policy Store and choose Create policy store.
     
Figure 2: Create an empty policy store

Figure 2: Create an empty policy store

Note: Make a note of the policy store ID to use when you deploy the solution.

To create a schema for the application

  1. On the Verified Permissions page, select Schema.
  2. In the Schema section, choose Create schema.
     
    Figure 3: Create a schema

    Figure 3: Create a schema

  3. In the Edit schema section, choose JSON mode, paste the following sample schema for your application, and choose Save changes.
    {
        "AwsIotAvpWebApp": {
            "entityTypes": {
                "Device": {
                    "shape": {
                        "attributes": {
                            "primaryOwner": {
                                "name": "User",
                                "required": true,
                                "type": "Entity"
                            }
                        },
                        "type": "Record"
                    },
                    "memberOfTypes": []
                },
                "User": {}
            },
            "actions": {
                "GetTemperature": {
                    "appliesTo": {
                        "context": {
                            "attributes": {
                                "desiredTemperature": {
                                    "type": "Long"
                                },
                                "time": {
                                    "type": "Long"
                                }
                            },
                            "type": "Record"
                        },
                        "resourceTypes": [
                            "Device"
                        ],
                        "principalTypes": [
                            "User"
                        ]
                    }
                },
                "SetTemperature": {
                    "appliesTo": {
                        "resourceTypes": [
                            "Device"
                        ],
                        "principalTypes": [
                            "User"
                        ],
                        "context": {
                            "attributes": {
                                "desiredTemperature": {
                                    "type": "Long"
                                },
                                "time": {
                                    "type": "Long"
                                }
                            },
                            "type": "Record"
                        }
                    }
                }
            }
        }
    }

When creating policies in Cedar, you can define authorization rules using a static policy or a template-linked policy.

Static policies

In scenarios where a policy explicitly defines both the principal and the resource, the policy is categorized as a static policy. These policies are immediately applicable for authorization decisions, as they are fully defined and ready for implementation.

Template-linked policies

On the other hand, there are situations where a single set of authorization rules needs to be applied across a variety of principals and resources. Consider an IoT application where actions such as SetTemperature and GetTemperature must be permitted for specific devices. Using static policies for each unique combination of principal and resource can lead to an excessive number of almost identical policies, differing only in their principal and resource components. This redundancy can be efficiently addressed with policy templates. Policy templates allow for the creation of policies using placeholders for the principal, the resource, or both. After a policy template is established, individual policies can be generated by referencing this template and specifying the desired principal and resource. These template-linked policies function the same as static policies, offering a streamlined and scalable solution for policy management.

To create a policy that allows access to the primary owner of the device using a static policy

  1. In the Verified Permissions console, on the left pane, select Policies, then choose Create policy and select Create static policy from the drop-down menu.
     
    Figure 4: Create static policy

    Figure 4: Create static policy

  2. Define the policy scope:
    1. Select Permit for the Policy effect.
       
      Figure 5: Define policy effect

      Figure 5: Define policy effect

    2. Select All Principals for Principals scope.
    3. Select All Resources for Resource scope.
    4. Select All Actions for Actions scope and choose Next.
       
      Figure 6: Define policy scope

      Figure 6: Define policy scope

  3. On the Details page, under Policy, paste the following full-access policy, which grants the primary owner permission to perform both SetTemperature and GetTemperature actions on the smart thermostat unconditionally. Choose Create policy.
    	permit (principal, action, resource)
    	when { resource.primaryOwner == principal };
    Figure 7: Write and review policy statement

    Figure 7: Write and review policy statement

To create a static policy to allow a guest user to read the temperature

In this example, the guest user is Jane (username: jane_doe).

  1. Create another static policy and specify the policy scope.
    1. Select Permit for the Policy effect.
       
      Figure 8: Define the policy effect

      Figure 8: Define the policy effect

    2. Select Specific principal for the Principals scope.
    3. Select AwsIotAvpWebApp::User and enter jane_doe.
       
      Figure 9: Define the policy scope

      Figure 9: Define the policy scope

    4. Select Specific resource for the Resources scope.
    5. Select AwsIotAvpWebApp::Device and enter Thermostat1.
    6. Select Specific set of actions for the Actions scope.
    7. Select GetTemperature and choose Next.
       
      Figure 10: Define resource and action scopes

      Figure 10: Define resource and action scopes

    8. Enter the Policy description: Allow jane_doe to read thermostat1.
    9. Choose Create policy.

Next, you will create reusable policy templates to manage policies efficiently. First, create a policy template for a guest user with restricted temperature settings that limits the temperature they can set to between 72°F and 78°F. In this case, the guest user is Jane (username: jane_doe).

To create a reusable policy template

  1. Select Policy template and enter Guest user template as the description.
  2. Paste the following sample policy in the Policy body and choose Create policy template.
    permit (
        principal == ?principal,
        action in [AwsIotAvpWebApp::Action::"SetTemperature"],
        resource == ?resource
    )
    when { context.desiredTemperature >= 72 && context.desiredTemperature <= 78 };
Figure 11: Create guest user policy template

Figure 11: Create guest user policy template

As you can see, you don’t specify the principal and resource yet. You enter those when you create an actual policy from the policy template. The context object will be populated with the desiredTemperature property in the application and used to evaluate the decision.

You also need to create a policy template for the Power Company user with restricted time settings. Cedar policies don’t support date/time format, so you must represent 2:00 PM and 5:00 PM as elapsed minutes from midnight.
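For example, 2:00 PM is 14 × 60 = 840 minutes after midnight and 5:00 PM is 17 × 60 = 1020 minutes, which is why the policy template below compares context.time against 840 and 1020. A small sketch of how an application might compute this value before calling Verified Permissions:

from datetime import datetime

def minutes_since_midnight(t: datetime) -> int:
    # Convert a wall-clock time to the integer "time" context attribute used in the policies.
    return t.hour * 60 + t.minute

print(minutes_since_midnight(datetime(2024, 4, 15, 14, 0)))  # 2:00 PM -> 840
print(minutes_since_midnight(datetime(2024, 4, 15, 17, 0)))  # 5:00 PM -> 1020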

To create a policy template for the power company

  1. Select Policy template and enter Power company user template as the description.
  2. Paste the following sample policy in the Policy body and choose Create policy template.
    permit (
        principal == ?principal,
        action in [AwsIotAvpWebApp::Action::"SetTemperature", AwsIotAvpWebApp::Action::"GetTemperature"],
        resource == ?resource
    )
    when { context.time >= 840 && context.time < 1020 };

The policy templates accept the user and resource. The next step is to create a template-linked policy for Jane to set and get thermostat readings based on the Guest user template that you created earlier. For simplicity, you will manually create this policy using the Verified Permissions console. In production, application policies can be dynamically created using the Verified Permissions API.
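For reference, a hedged sketch of creating such a template-linked policy with the AWS SDK for Python (Boto3) might look like the following; the policy store ID and policy template ID are placeholders.

import boto3

avp = boto3.client("verifiedpermissions")

# Link the guest user template to a concrete principal and resource.
# The policy store ID and policy template ID are placeholders.
avp.create_policy(
    policyStoreId="PSEXAMPLEabcdefg111111",
    definition={
        "templateLinked": {
            "policyTemplateId": "PTEXAMPLEabcdefg111111",
            "principal": {"entityType": "AwsIotAvpWebApp::User", "entityId": "jane_doe"},
            "resource": {"entityType": "AwsIotAvpWebApp::Device", "entityId": "Thermostat1"},
        }
    },
)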

To create a template-linked policy for a guest user

  1. In the Verified Permissions console, on the left pane, select Policies, then choose Create policy and select Create template-linked policy from the drop-down menu.
     
    Figure 12: Create new template-linked policy

    Figure 12: Create new template-linked policy

  2. Select the Guest user template and choose Next.
     
    Figure 13: Select Guest user template

    Figure 13: Select Guest user template

  3. Under parameter selection:
    1. For Principal enter AwsIotAvpWebApp::User::"jane_doe".
    2. For Resource enter AwsIotAvpWebApp::Device::"Thermostat1".
    3. Choose Create template-linked policy.
       
      Figure 14: Create guest user template-linked policy

      Figure 14: Create guest user template-linked policy

Note that with this policy in place, jane_doe can only set the temperature of the device Thermostat1 to between 72°F and 78°F.

To create a template-linked policy for the power company user

Based on the template that was set up for the power company, you now need an actual policy for it.

  1. In the Verified Permissions console, go to the left pane and select Policies, then choose Create policy and select Create template-linked policy from the drop-down menu.
  2. Select the Power company user template and choose Next.
  3. Under Parameter selection, for Principal enter AwsIotAvpWebApp::User::"powercompany", for Resource enter AwsIotAvpWebApp::Device::"Thermostat1", and choose Create template-linked policy.

Now that you have a set of policies in a policy store, you need to update the backend codebase to include this information and then deploy the web application using Amplify.

The policy statements in this post intentionally use human-readable values such as jane_doe and powercompany for the principal entity. This is useful when discussing general concepts but in production systems, customers should use unique and immutable values for entities. See Get the best out of Amazon Verified Permissions by using fine-grained authorization methods for more information.

Deploy the solution code from GitHub

Go to the GitHub repository to set up the Amplify web application. The repository Readme file provides detailed instructions on how to set up the web application. You will need your Verified Permissions policy store ID to deploy the application. For convenience, we’ve provided an onboarding script—deploy.sh—which you can use to deploy the application.

To deploy the application

  1. Clone the repository.
    git clone https://github.com/aws-samples/amazon-verified-permissions-iot-amplify-smart-home-application.git

  2. Deploy the application.
    ./deploy.sh <region> <Verified Permissions Policy Store ID>

After the web dashboard has been deployed, you’ll create an IoT device using AWS IoT Core.

Create an IoT device and connect it to AWS IoT Core

With the users, policies, templates, and the Amplify smart home application in place, you can now create a device and connect it to AWS IoT Core to complete the solution.

To create the Thermostat1 device and connect it to AWS IoT Core

  1. From the left pane in the AWS IoT console, select Connect one device.
     
    Figure 15: Connect device using AWS IoT console

    Figure 15: Connect device using AWS IoT console

  2. Review how an IoT thing works and then choose Next.
     
    Figure 16: Review how IoT Thing works before proceeding

    Figure 16: Review how IoT Thing works before proceeding

  3. Choose Create a new thing, enter Thermostat1 as the Thing name, and choose Next.

    Figure 17: Create the new IoT thing

    Figure 17: Create the new IoT thing

  4. Select Linux/macOS as the Device platform operating system and Python as the AWS IoT Core Device SDK, and then choose Next.
     
    Figure 18: Choose the platform and SDK for the device

    Figure 18: Choose the platform and SDK for the device

  5. Choose Download connection kit and then choose Next.
     
    Figure 19: Download the connection kit to use for creating the Thermostat1 device

    Figure 19: Download the connection kit to use for creating the Thermostat1 device

  6. Review the three steps to display messages from your IoT device. You will use them to verify the Thermostat1 IoT device connectivity to the AWS IoT Core platform. They are:
    1. Step 1: Add execution permissions
    2. Step 2: Run the start script
    3. Step 3: Return to the AWS IoT Console to view the device’s message
       
      Figure 20: How to display messages from an IoT device

      Figure 20: How to display messages from an IoT device

Solution validation

With all of the pieces in place, you can now test the solution.

Primary owner signs in to the web application to set Thermostat1 temperature to 82°F

Figure 21: Thermostat1 temperature update by John

Figure 21: Thermostat1 temperature update by John

  1. Sign in to the Amplify web application as John. You should be able to view the Thermostat1 controller on the dashboard.
  2. Set the temperature to 82°F.
  3. The Lambda function processes the request and performs an API call to Verified Permissions to determine whether to ALLOW or DENY the action based on the policies. Verified Permissions sends back an ALLOW, as the policy that was previously set up allows unrestricted access for primary owners.
  4. Upon receiving the response from Verified Permissions, the Lambda function returns the ALLOW decision to the web application and makes an API call to the AWS IoT Device Shadow service to update the device (Thermostat1) temperature to 82°F (see the sketch after Figure 22).
     
Figure 22: Policy evaluation decision is ALLOW when a primary owner calls SetTemperature

Figure 22: Policy evaluation decision is ALLOW when a primary owner calls SetTemperature
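To make step 4 concrete, the shadow update that the Lambda function issues after an ALLOW decision might look like the following minimal sketch; the thing name and payload shape are simplified placeholders.

import json
import boto3

iot_data = boto3.client("iot-data")

# Set the desired temperature in the classic shadow for the Thermostat1 thing.
# The shadow document shape is simplified for illustration.
iot_data.update_thing_shadow(
    thingName="Thermostat1",
    payload=json.dumps({"state": {"desired": {"temperature": 82}}}),
)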

Guest user signs in to the web application to set Thermostat1 temperature to 80°F

Figure 23: Thermostat1 temperature update by Jane

Figure 23: Thermostat1 temperature update by Jane

  1. If you sign in as Jane to the Amplify web application, you can view the Thermostat1 controller on the dashboard.
  2. Set the temperature to 80°F.
  3. The Lambda function validates the actions by sending an API call to Verified Permissions to determine whether to ALLOW or DENY the action based on the established policies. Verified Permissions sends back a DENY, as the policy only permits temperature adjustments between 72°F and 78°F.
  4. Upon receiving the response from Verified Permissions, the Lambda function returns the DENY decision to the web application, and an unauthorized response is returned.
     
    Figure 24: Guest user jane_doe receives a DENY when calling SetTemperature for a desired temperature of 80°F

    Figure 24: Guest user jane_doe receives a DENY when calling SetTemperature for a desired temperature of 80°F

  5. If you repeat the process (still as Jane) but set Thermostat1 to 75°F, the policy will cause the request to be allowed.
     
    Figure 25: Guest user jane_doe receives an ALLOW when calling SetTemperature for a desired temperature of 75°F

    Figure 25: Guest user jane_doe receives an ALLOW when calling SetTemperature for a desired temperature of 75°F

  6. Similarly, jane_doe is allowed to run GetTemperature on the device Thermostat1. When the temperature is set to 74°F, the device shadow is updated. The IoT device being simulated by your AWS Cloud9 instance reads the desired temperature field and sets the reported value to 74.
  7. Now, when jane_doe runs GetTemperature, the value of the device is reported as 74 as shown in Figure 26. We encourage you to try different restrictions in the World Settings (outside temperature and time) by adding restrictions to the static policy that allows GetTemperature for the guest user.
     
    Figure 26: Guest user jane_doe receives an ALLOW when calling GetTemperature for the reported temperature

    Figure 26: Guest user jane_doe receives an ALLOW when calling GetTemperature for the reported temperature

Power company signs in to the web application to set Thermostat1 to 78°F at 3:30 PM

Figure 27: Thermostat1 temperature set to 78°F by powercompany user at a specified time

Figure 27: Thermostat1 temperature set to 78°F by powercompany user at a specified time

  1. Sign in to the Amplify web application as the powercompany user. You can view the Thermostat1 controller on the dashboard.
  2. To test this scenario, set the current time to 3:30 PM, and try to set the temperature to 78°F.
  3. The Lambda function validates the actions by sending an API call to Verified Permissions to determine whether to ALLOW or DENY the action based on pre-established policies. Verified Permissions returns ALLOW permission, because the policy for powercompany permits device temperature changes between 2:00 PM and 5:00 PM.
  4. Upon receiving the response from Verified Permissions, the Lambda function sends ALLOW permission back to the web application and an API call to the AWS IoT Device Shadow service to update the Thermostat1 temperature to 78°F.
     
    Figure 28: powercompany receives an ALLOW when SetTemperature is called with the desired temperature of 78°F

    Figure 28: powercompany receives an ALLOW when SetTemperature is called with the desired temperature of 78°F

Note: As an optional exercise, we also made jane_doe a device owner for device Thermostat2. This can be observed in the users.json file in the GitHub repository. We encourage you to create your own policies and restrict functions for Thermostat2 after going through this post. You will need to create separate Verified Permissions policies and update the Lambda functions to interact with these policies.

We encourage you to create policies for guests and the power company and restrict permissions based on the following criteria:

  1. Verify Jane Doe can perform GetTemperature and SetTemperature actions on Thermostat2.
  2. John Doe should not be able to set the temperature on device Thermostat2 outside of the time range of 4:00 PM and 6:00 PM and outside of the temperature range of 68°F and 72°F.
  3. Power Company can only perform the GetTemperature operation, but there are no restrictions on time and outside temperature.

To help you verify the solution, we’ve provided the correct policies under the challenge directory in the GitHub repository.

Clean up

Deploying the Thermostat application in your AWS account will incur costs. To avoid ongoing charges, when you’re done examining the solution, delete the resources that were created. This includes the Amplify hosted web application, API Gateway resource, AWS Cloud9 environment, the Lambda function, DynamoDB table, Cognito user pool, AWS IoT Core resources, and Verified Permissions policy store.

Amplify resources can be deleted by going to the AWS CloudFormation console and deleting the stacks that were used to provision various services.

Conclusion

In this post, you learned about creating and managing fine-grained permissions using Verified Permissions for different user personas for your smart thermostat IoT device. With Verified Permissions, you can strengthen your security posture and build smart applications aligned with Zero Trust principles for real-time authorization decisions. To learn more, see the Amazon Verified Permissions documentation and the Cedar policy language documentation.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Author

Rajat Mathur

Rajat is a Principal Solutions Architect at Amazon Web Services. Rajat is a passionate technologist who enjoys building innovative solutions for AWS customers. His core areas of focus are IoT, Networking, and Serverless computing. In his spare time, Rajat enjoys long drives, traveling, and spending time with family.

Pronoy Chopra

Pronoy Chopra

Pronoy is a Senior Solutions Architect with the Startups Generative AI team at AWS. He specializes in architecting and developing IoT and Machine Learning solutions. He has co-founded two startups and enjoys being hands-on with projects in the IoT, AI/ML and Serverless domain. His work in Magnetoencephalography has been cited many times in the effort to build better brain-compute interfaces.

Syed Sanoor

Syed Sanoor

Syed serves as a Solutions Architect, assisting customers in the enterprise sector. With a foundation in software engineering, he takes pleasure in crafting solutions tailored to client needs. His expertise predominantly lies in C# and IoT. During his leisure time, Syed enjoys piloting drones and playing cricket.

Serverless IoT email capture, attachment processing, and distribution

Post Syndicated from Stacy Conant original https://aws.amazon.com/blogs/messaging-and-targeting/serverless-iot-email-capture-attachment-processing-and-distribution/

Many customers need to automate email notifications to a broad and diverse set of email recipients, sometimes from a sensor network with a variety of monitoring capabilities. Many sensor monitoring software products include an SMTP client to achieve this goal. However, managing email server infrastructure requires specialty expertise and operating an email server comes with additional cost and inherent risk of breach, spam, and storage management. Organizations also need to manage distribution of attachments, which could be large and potentially contain exploits or viruses. For IoT use cases, diagnostic data relevance quickly expires, necessitating retention policies to regularly delete content.

Solution Overview

This solution uses the Amazon Simple Email Service (SES) SMTP interface to receive SMTP client messages and processes each message to replace its attachments with pre-signed URLs in the resulting email to its intended recipients. Attachments are stored separately in an Amazon Simple Storage Service (S3) bucket with a lifecycle policy implemented. This reduces the storage requirements of the recipient email servers receiving notification emails. Additionally, this solution leverages built-in anti-spam and security scanning capabilities to deal with spam and potentially malicious attachments, while at the same time providing a mechanism by which pre-signed attachment links can be revoked should the emails be distributed to unintended recipients.

The solution uses:

  • Amazon SES SMTP interface to receive incoming emails.
  • Amazon SES receipt rule on a (sub)domain controlled by administrators, to store raw incoming emails in an Amazon S3 bucket.
  • AWS Lambda function, triggered on S3 ObjectCreated event, to process raw emails, extract attachments, replace each with pre-signed URL with configurable expiry, and send the processed emails to intended recipients.

Solution Flow Details:

  1. SMTP client transmits email content to an email address in a (sub) domain with MX record set to Amazon SES service’s regional endpoint.
  2. Amazon SES SMTP interface receives an email and forwards it to SES Receipt Rule(s) for processing.
  3. A matching Amazon SES Receipt Rule saves incoming email into an Amazon S3 Bucket.
  4. The Amazon S3 bucket emits an S3 ObjectCreated event and places it onto the Amazon Simple Queue Service (SQS) queue.
  5. The AWS Lambda service polls the inbound messages’ SQS queue and feeds events to the Lambda function.
  6. The Lambda function retrieves email files from the S3 bucket, parses the email sender/subject/body, saves attachments to a separate attachment S3 bucket (7), and replaces the attachments with pre-signed URLs in the email body (see the sketch after this list). The Lambda function then extracts the intended recipient addresses from the email body. If the body contains a properly formatted recipients list, the email is sent using the SES API (9); otherwise, a notice is posted to a fallback Amazon Simple Notification Service (SNS) topic (8).
  7. The Lambda function saves extracted attachments, if any, into an attachments bucket.
  8. Malformed email notifications are posted to a fallback Amazon SNS Topic.
  9. The Lambda function invokes Amazon SES API to send the processed email to all intended recipient addresses.
  10. If the Lambda function is unable to process the email successfully, the inbound message is placed onto the SQS dead-letter queue (DLQ) for later intervention by the operator.
  11. SES delivers the email to each recipient’s mail server.
  12. Intended recipients download emails from their corporate mail servers and retrieve attachments from the S3 pre-signed URL(s) embedded in the email body.
  13. An alarm is triggered and a notification is published to Amazon SNS Alarms Topic whenever:
    • More than 50 failed messages are in the DLQ.
    • The oldest message on the incoming SQS queue is older than 3 minutes, meaning the system is unable to keep up with inbound messages (flooding).
    • The incoming SQS queue contains over 180 messages (configurable) over 5 minutes old.
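The attachment handling described in step 6 might look like the following simplified sketch; the bucket names, object key naming, and expiry value are placeholders, and the solution's actual Lambda function handles additional cases.

import email
import boto3

s3 = boto3.client("s3")

def extract_attachments(raw_bucket: str, raw_key: str, attachments_bucket: str) -> list:
    """Move attachments from a stored raw email into a separate bucket and
    return pre-signed URLs to embed in the outgoing message body."""
    raw = s3.get_object(Bucket=raw_bucket, Key=raw_key)["Body"].read()
    message = email.message_from_bytes(raw)

    urls = []
    for part in message.walk():
        if part.get_content_disposition() != "attachment":
            continue
        filename = part.get_filename() or "attachment.bin"
        key = f"{raw_key}/{filename}"
        s3.put_object(Bucket=attachments_bucket, Key=key, Body=part.get_payload(decode=True))
        urls.append(
            s3.generate_presigned_url(
                "get_object",
                Params={"Bucket": attachments_bucket, "Key": key},
                ExpiresIn=7 * 24 * 3600,  # placeholder expiry; the solution makes this configurable
            )
        )
    return urls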

Setting up Amazon SES

For this solution you will need an email account where you can receive emails. You’ll also need a (sub)domain for which you control the mail exchanger (MX) record. You can obtain your (sub)domain either from Amazon Route53 or another domain hosting provider.

Verify the sender email address

You’ll need to follow the instructions to Verify an email address for all identities that you use as “From”, “Source”, “Sender”, or “Return-Path” addresses. You’ll also need to follow these instructions for any identities you wish to send emails to during initial testing while your SES account is in the “Sandbox” (see the next section, “Moving out of the SES Sandbox”).

Moving out of the SES Sandbox

Amazon SES accounts are “in the Sandbox” by default, limiting email sending only to verified identities. AWS does this to prevent fraud and abuse as well as protecting your reputation as an email sender. When your account leaves the Sandbox, SES can send email to any recipient, regardless of whether the recipient’s address or domain is verified by SES. However, you still have to verify all identities that you use as “From”, “Source”, “Sender”, or “Return-Path” addresses.
Follow the Moving out of the SES Sandbox instructions in the SES Developer Guide. Approval is usually within 24 hours.

Set up the SES SMTP interface

Follow the workshop lab instructions to set up email sending from your SMTP client using the SES SMTP interface. Once you’ve completed this step, your SMTP client can open authenticated sessions with the SES SMTP interface and send emails. The workshop will guide you through the following steps:

  1. Create SMTP credentials for your SES account.
    • IMPORTANT: Never share SMTP credentials with unauthorized individuals. Anyone with these credentials can send as many SMTP requests as they want, with whatever format and content they choose. This may result in end-users receiving emails with malicious content, administrative/operations overload, and unbounded AWS charges.
  2. Test your connection to ensure you can send emails.
  3. Authenticate using the SMTP credentials generated in step 1 and then send a test email from an SMTP client.

Verify your email domain and bounce notifications with Amazon SES

In order to replace email attachments with a pre-signed URL and other application logic, you’ll need to set up SES to receive emails on a domain or subdomain you control.

  1. Verify the domain that you want to use for receiving emails.
  2. Publish a mail exchanger record (MX record) that includes the Amazon SES inbound receiving endpoint for your AWS Region (for example, inbound-smtp.us-east-1.amazonaws.com for US East (N. Virginia)) in the domain DNS configuration (see the Route 53 sketch after this list).
  3. Amazon SES automatically manages bounce notifications whenever a recipient email is not deliverable. Follow the Set up notifications for bounces and complaints guide to set up bounce notifications.
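If your domain is hosted in Amazon Route 53, the MX record from step 2 could be published with a call like the following sketch; the hosted zone ID and domain name are placeholders.

import boto3

route53 = boto3.client("route53")

# Publish the MX record that routes inbound mail for the subdomain to SES.
# The hosted zone ID and domain name are placeholders.
route53.change_resource_record_sets(
    HostedZoneId="Z0123456789EXAMPLE",
    ChangeBatch={
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "mail.example.com",
                    "Type": "MX",
                    "TTL": 300,
                    "ResourceRecords": [
                        {"Value": "10 inbound-smtp.us-east-1.amazonaws.com"}
                    ],
                },
            }
        ]
    },
)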

Deploying the solution

The solution is implemented using AWS CDK with Python. First clone the solution repository to your local machine or Cloud9 development environment. Then deploy the solution by entering the following commands into your terminal:

python -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt

cdk deploy \
  --context SenderEmail=<verified sender email> \
  --context RecipientEmail=<recipient email address> \
  --context ConfigurationSetName=<configuration set name>

Note:

The RecipientEmail CDK context parameter in the cdk deploy command above can be any email address in the domain you verified as part of the Verify the domain step. In other words, if the verified domain is acme-corp.com, then the recipient can be any address in that domain.

The ConfigurationSetName CDK context value can be obtained by navigating to Identities in the Amazon SES console, selecting the verified domain (same as above), switching to the Configuration set tab, and selecting the name of the default configuration set.

After deploying the solution, navigate to Amazon SES Email receiving in the AWS console, edit the rule set, and set it to Active.

Testing the solution end-to-end

Create a small file and generate a base64 encoding so that you can attach it to an SMTP message:

echo content >> demo.txt
cat demo.txt | base64 > demo64.txt
cat demo64.txt

Install openssl (which includes an SMTP client capability) using the following command:

sudo yum install openssl

Now run the SMTP client (openssl is used for the proof of concept; be sure to complete the steps in the workshop lab instructions first):

openssl s_client -crlf -quiet -starttls smtp -connect email-smtp.<aws-region>.amazonaws.com:587

and feed in the commands (replacing the brackets [] and everything between them) to send the SMTP message with the attachment you created.

EHLO amazonses.com
AUTH LOGIN
[base64 encoded SMTP user name]
[base64 encoded SMTP password]
MAIL FROM:[VERIFIED EMAIL IN SES]
RCPT TO:[VERIFIED EMAIL WITH SES RECEIPT RULE]
DATA
Subject: Demo from openssl
MIME-Version: 1.0
Content-Type: multipart/mixed;
 boundary="XXXXboundary text"

This is a multipart message in MIME format.

--XXXXboundary text
Content-Type: text/plain

Line1:This is a Test email sent to coded list of email addresses using the Amazon SES SMTP interface from openssl SMTP client.
Line2:Email_Rxers_Code:[ANYUSER1@DOMAIN_A,ANYUSER2@DOMAIN_B,ANYUSERX@DOMAIN_Y]:Email_Rxers_Code:
Line3:Last line.

--XXXXboundary text
Content-Type: text/plain;
Content-Transfer-Encoding: Base64
Content-Disposition: attachment; filename="demo64.txt"
Y29udGVudAo=
--XXXXboundary text
.
QUIT

Note: For base64 SMTP username and password above, use values obtained in Set up the SES SMTP interface, step 1. So for example, if the username is AKZB3LJAF5TQQRRPQZO1, then you can obtain base64 encoded value using following command:

echo -n AKZB3LJAF5TQQRRPQZO1 |base64
QUtaQjNMSkFGNVRRUVJSUFFaTzE=

This makes the base64 encoded value QUtaQjNMSkFGNVRRUVJSUFFaTzE=. Repeat the same process for the SMTP username and password values in the example above.

The openssl command should result in successful SMTP authentication and send. You should then receive the processed email, with each attachment replaced by a pre-signed URL.

Optimizing Security of the Solution

  1. Do not share DNS credentials. Unauthorized access can lead to domain control, potential denial of service, and AWS charges. Restrict access to authorized personnel only.
  2. Do not set the SENDER_EMAIL environment variable to the email address associated with the receipt rule. This address should be treated as a closely guarded secret, known only to administrators, and changed frequently.
  3. Review access to your code repository regularly to ensure there are no unauthorized changes to your code base.
  4. Utilize Permissions Boundaries to restrict the actions permitted by an IAM user or role.

Cleanup

To clean up, start by navigating to Amazon SES Email receiving in the AWS console and setting the rule set to Inactive.

Once completed, delete the stack:

cdk destroy

Cleanup AWS SES Access Credentials

In Amazon SES Console, select Manage existing SMTP credentials, select the username for which credentials were created in Set up the SES SMTP interface above, navigate to the Security credentials tab and in the Access keys section, select Action -> Delete to delete AWS SES access credentials.

Troubleshooting

If you are not receiving the email, or the email is not being sent correctly, there are a number of common causes:

  • HTTP Error 554 Message rejected: Email address is not verified. The following identities failed the check in region:
    • This means that you have attempted to send an email from an address that has not been verified.
    • Ensure that the “MAIL FROM:[VERIFIED EMAIL IN SES]” email address sent via openssl matches the SenderEmail=<verified sender email> email address used in cdk deploy.
    • Also make sure this email address was used in the Verify the sender email address step.
  • Email is not being delivered/forwarded
    • If the incoming S3 bucket, under the incoming prefix, contains a file called AMAZON_SES_SETUP_NOTIFICATION, the MX record of the domain setup is missing. Validate that the MX record (step 2) of the Verify your email domain and bounce notifications with Amazon SES section is fully configured.
    • Ensure that, after deploying the solution, the created rule set was made active by navigating to Amazon SES Email receiving in the AWS console and setting it to Active.
    • This may also mean that the destination email address has bounced. Navigate to the Amazon SES Suppression list in the AWS console and ensure that the recipient’s email is not in the suppression list. If it is listed, you can see the reason in the Suppression reason column. You may either manually remove the address from the suppression list or, if the recipient email is not valid, use a different recipient email address.
AWS Legal Disclaimer: Sample code, software libraries, command line tools, proofs of concept, templates, or other related technology are provided as AWS Content or Third-Party Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content or Third-Party Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content or Third-Party Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content or Third-Party Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.

About the Authors

Tarek Soliman

Tarek Soliman

Tarek is a Senior Solutions Architect at AWS. His background is in Software Engineering with a focus on distributed systems. He is passionate about diving into customer problems and solving them. He also enjoys building things using software, woodworking, and hobby electronics.

Dave Spencer

Dave Spencer

Dave is a Senior Solutions Architect at AWS. His background is in cloud solutions architecture, Infrastructure as Code (IaC), systems engineering, and embedded systems programming. Dave’s passion is developing partnerships with Department of Defense customers to maximize technology investments and realize their strategic vision.

Ayman Ishimwe

Ayman Ishimwe

Ayman is a Solutions Architect at AWS based in Seattle, Washington. He holds a Master’s degree in Software Engineering and IT from Oakland University. With prior experience in software development, specifically in building microservices for distributed web applications, he is passionate about helping customers build robust and scalable solutions on AWS cloud services following best practices.

Dmytro Protsiv

Dmytro Protsiv

Dmytro is a Cloud Applications Architect with Amazon Web Services. He is passionate about helping customers to solve their business challenges around application modernization.

Stacy Conant

Stacy Conant

Stacy is a Solutions Architect working with DoD and US Navy customers. She enjoys helping customers understand how to harness big data and working on data analytics solutions. On the weekends, you can find Stacy crocheting, reading Harry Potter (again), playing with her dogs and cooking with her husband.

AWS Weekly Roundup: New features on Knowledge Bases for Amazon Bedrock, OAC for Lambda function URL origins on Amazon CloudFront, and more (April 15, 2024)

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-new-features-on-knowledge-bases-for-amazon-bedrock-oac-for-lambda-function-url-origins-on-amazon-cloudfront-and-more-april-15-2024/

AWS Community Days conferences are in full swing with AWS communities around the globe. The AWS Community Day Poland was hosted last week with more than 600 cloud enthusiasts in attendance. Community speakers Agnieszka Biernacka, Krzysztof Kąkol, and more, presented talks which captivated the audience and resulted in vibrant discussions throughout the day. My teammate, Wojtek Gawroński, was at the event and he’s already looking forward to attending again next year!

Last week’s launches
Here are some launches that got my attention during the previous week.

Amazon CloudFront now supports Origin Access Control (OAC) for Lambda function URL origins – Now you can protect your AWS Lambda URL origins by using Amazon CloudFront Origin Access Control (OAC) to only allow access from designated CloudFront distributions. The CloudFront Developer Guide has more details on how to get started using CloudFront OAC to authenticate access to Lambda function URLs from your designated CloudFront distributions.

AWS Client VPN and AWS Verified Access migration and interoperability patterns – If you’re using AWS Client VPN or a similar third-party VPN-based solution to provide secure access to your applications today, you’ll be pleased to know that you can now combine the use of AWS Client VPN and AWS Verified Access for your new or existing applications.

These two announcements related to Knowledge Bases for Amazon Bedrock caught my eye:

Metadata filtering to improve retrieval accuracy – With metadata filtering, you can retrieve not only semantically relevant chunks but also a well-defined subset of those relevant chunks based on applied metadata filters and associated values.

Custom prompts for the RetrieveAndGenerate API and configuration of the maximum number of retrieved results – These are two new features that you can now choose as query options alongside the search type to give you more control over the search results, which are retrieved from the vector store and passed to the foundation model for generating the answer.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
AWS open source news and updates – My colleague Ricardo writes this weekly open source newsletter in which he highlights new open source projects, tools, and demos from the AWS Community.

Upcoming AWS events
AWS Summits – These are free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Whether you’re in the Americas, Asia Pacific & Japan, or EMEA region, learn here about future AWS Summit events happening in your area.

AWS Community Days – Join an AWS Community Day event just like the one I mentioned at the beginning of this post to participate in technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from your area. If you’re in Kenya or Nepal, there’s an event happening in your area this coming weekend.

You can browse all upcoming in-person and virtual events here.

That’s all for this week. Check back next Monday for another Weekly Roundup!

– Veliswa

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS.

Accelerate security automation using Amazon CodeWhisperer

Post Syndicated from Brendan Jenkins original https://aws.amazon.com/blogs/security/accelerate-security-automation-using-amazon-codewhisperer/

In an ever-changing security landscape, teams must be able to quickly remediate security risks. Many organizations look for ways to automate the remediation of security findings that are currently handled manually. Amazon CodeWhisperer is an artificial intelligence (AI) coding companion that generates real-time, single-line or full-function code suggestions in your integrated development environment (IDE) to help you quickly build software. By using CodeWhisperer, security teams can expedite the process of writing security automation scripts for various types of findings that are aggregated in AWS Security Hub, a cloud security posture management (CSPM) service.

In this post, we present some of the current challenges with security automation and walk you through how to use CodeWhisperer, together with Amazon EventBridge and AWS Lambda, to automate the remediation of Security Hub findings. Before reading further, please read the AWS Responsible AI Policy.

Current challenges with security automation

Many approaches to security automation, including Lambda and AWS Systems Manager Automation, require software development skills. Furthermore, the process of manually writing code for remediation can be a time-consuming process for security professionals. To help overcome these challenges, CodeWhisperer serves as a force multiplier for qualified security professionals with development experience to quickly and effectively generate code to help remediate security findings.

Security professionals should still cultivate software development skills to implement robust solutions. Engineers should thoroughly review and validate any generated code, as manual oversight remains critical for security.

Solution overview

Figure 1 shows how the findings that Security Hub produces are ingested by EventBridge, which then invokes Lambda functions for processing. The Lambda code is generated with the help of CodeWhisperer.

Figure 1: Diagram of the solution

Security Hub integrates with EventBridge so you can automatically process findings with other services such as Lambda. To begin remediating the findings automatically, you can configure rules to determine where to send findings. This solution will do the following:

  1. Ingest an Amazon Security Hub finding into EventBridge.
  2. Use an EventBridge rule to invoke a Lambda function for processing (a sketch of such a rule follows this list).
  3. Use CodeWhisperer to generate the Lambda function code.
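As an illustration of step 2, the rule and its Lambda target might be created programmatically with a sketch like the following; the rule name, custom action ARN, and function ARN are placeholders, and the walkthrough later in this post creates the equivalent resources in the console.

import json
import boto3

events = boto3.client("events")

# Match findings sent to a Security Hub custom action and route them to the
# remediation function. The ARNs and rule name are placeholders.
events.put_rule(
    Name="securityhub-s3-versioning-remediation",
    EventPattern=json.dumps(
        {
            "source": ["aws.securityhub"],
            "detail-type": ["Security Hub Findings - Custom Action"],
            "resources": ["arn:aws:securityhub:us-east-1:111122223333:action/custom/s3versioning"],
        }
    ),
    State="ENABLED",
)

events.put_targets(
    Rule="securityhub-s3-versioning-remediation",
    Targets=[
        {
            "Id": "remediation-lambda",
            "Arn": "arn:aws:lambda:us-east-1:111122223333:function:sec_remediation_function",
        }
    ],
)
# The Lambda function also needs a resource-based permission that allows
# events.amazonaws.com to invoke it for this rule.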

It is important to note that there are two types of automation for Security Hub finding remediation:

  • Partial automation, which is initiated when a human worker selects the Security Hub findings manually and applies the automated remediation workflow to the selected findings.
  • End-to-end automation, which means that when a finding is generated within Security Hub, this initiates an automated workflow to immediately remediate without human intervention.

Important: When you use end-to-end automation, we highly recommend that you thoroughly test the efficiency and impact of the workflow in a non-production environment first before moving forward with implementation in a production environment.

Prerequisites

To follow along with this walkthrough, make sure that you have the following prerequisites in place:

Implement security automation

In this scenario, you have been tasked with making sure that versioning is enabled across all Amazon Simple Storage Service (Amazon S3) buckets in your AWS account. Additionally, you want to do this in a way that is programmatic and automated so that it can be reused in different AWS accounts in the future.

To do this, you will perform the following steps:

  1. Generate the remediation script with CodeWhisperer
  2. Create the Lambda function
  3. Integrate the Lambda function with Security Hub by using EventBridge
  4. Create a custom action in Security Hub
  5. Create an EventBridge rule to target the Lambda function
  6. Run the remediation

Generate a remediation script with CodeWhisperer

The first step is to use VS Code to create a script so that CodeWhisperer generates the code for your Lambda function in Python. You will use this Lambda function to remediate Security Hub findings generated by the control [S3.14] S3 buckets should use versioning.

Note: The underlying model of CodeWhisperer is powered by generative AI, and the output of CodeWhisperer is nondeterministic. As such, the code recommended by the service can vary by user. By modifying the initial code comment to prompt CodeWhisperer for a response, customers can change the corresponding output to help meet their needs. Customers should subject all code generated by CodeWhisperer to typical testing and review protocols to verify that it is free of errors and is in line with applicable organizational security policies. To learn about best practices on prompt engineering with CodeWhisperer, see this AWS blog post.

To generate the remediation script

  1. Open a new VS Code window, and then open or create a new folder for your file to reside in.
  2. Create a Python file called cw-blog-remediation.py as shown in Figure 2.
     
    Figure 2: New VS Code file created called cw-blog-remediation.py

    Figure 2: New VS Code file created called cw-blog-remediation.py

  3. Add the following imports to the Python file.
    import json
    import boto3

  4. Because you have the context added to your file, you can now prompt CodeWhisperer by using a natural language comment. In your file, below the import statements, enter the following comment and then press Enter.
    # Create lambda function that turns on versioning for an S3 bucket after the function is triggered from Amazon EventBridge

  5. Accept the first recommendation that CodeWhisperer provides by pressing Tab to use the Lambda function handler, as shown in Figure 3.
    Figure 3: Generation of Lambda handler

    Figure 3: Generation of Lambda handler

  6. To get the recommendation for the function from CodeWhisperer, press Enter. Make sure that the recommendation you receive looks similar to the following. CodeWhisperer is nondeterministic, so its recommendations can vary.
    import json
    import boto3
    
    # Create lambda function that turns on versioning for an S3 bucket after function is triggered from Amazon EventBridge
    def lambda_handler(event, context):
        s3 = boto3.client('s3')
        bucket = event['detail']['requestParameters']['bucketName']
        response = s3.put_bucket_versioning(
            Bucket=bucket,
            VersioningConfiguration={
                'Status': 'Enabled'
            }
        )
        print(response)
        return {
            'statusCode': 200,
            'body': json.dumps('Versioning enabled for bucket ' + bucket)
        }
    

  7. Take a moment to review the user actions and keyboard shortcut keys. Press Tab to accept the recommendation.
  8. You can change the function body to fit your use case. To get the Amazon Resource Name (ARN) of the S3 bucket from the EventBridge event, replace the bucket variable with the following line:
    bucket = event['detail']['findings'][0]['Resources'][0]['Id']

  9. To prompt CodeWhisperer to extract the bucket name from the bucket ARN, use the following comment:
    # Take the S3 bucket name from the ARN of the S3 bucket

    Your function code should look similar to the following:

    import json
    import boto3
    
    # Create lambda function that turns on versioning for an S3 bucket after function is triggered from Amazon EventBridge
    def lambda_handler(event, context):
        s3 = boto3.client('s3')
        bucket = event['detail']['findings'][0]['Resources'][0]['Id']
        # Take the S3 bucket name from the ARN of the S3 bucket
        bucket = bucket.split(':')[5]
    
        response = s3.put_bucket_versioning(
            Bucket=bucket,
            VersioningConfiguration={
                'Status': 'Enabled'
            }
        )
        print(response)
        return {
            'statusCode': 200,
            'body': json.dumps('Versioning enabled for bucket ' + bucket)
        }
    

  10. Create a .zip file for cw-blog-remediation.py. Find the file in your local file manager, right-click the file, and select compress/zip. You will use this .zip file in the next section of the post.
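(Optional) Before you package the function in step 10, you can sanity-check the ARN parsing locally by invoking the handler with a mock event. The payload below is a trimmed-down sketch of a Security Hub custom action event with a placeholder bucket ARN; note that running it makes a real PutBucketVersioning call, so point it only at a test bucket you own.

# local_test.py -- hypothetical local sanity check for cw-blog-remediation.py
import importlib.util

# The file name contains hyphens, so load it by path instead of importing it
spec = importlib.util.spec_from_file_location("remediation", "cw-blog-remediation.py")
remediation = importlib.util.module_from_spec(spec)
spec.loader.exec_module(remediation)

# Minimal sketch of the Security Hub custom action event the function expects
mock_event = {
    "detail": {
        "findings": [
            {"Resources": [{"Id": "arn:aws:s3:::example-test-bucket"}]}
        ]
    }
}

# 'arn:aws:s3:::example-test-bucket'.split(':')[5] yields 'example-test-bucket'
print(remediation.lambda_handler(mock_event, None))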

Create the Lambda function

The next step is to use the automation script that you generated to create the Lambda function that will enable versioning on applicable S3 buckets.

To create the Lambda function

  1. Open the AWS Lambda console.
  2. In the left navigation pane, choose Functions, and then choose Create function.
  3. Select Author from scratch and provide the following configurations for the function:
    1. For Function name, enter sec_remediation_function.
    2. For Runtime, select Python 3.12.
    3. For Architecture, select x86_64.
    4. For Permissions, select Create a new role with basic Lambda permissions.
  4. Choose Create function.
  5. To upload your local code to Lambda, select Upload from and then .zip file, and then upload the file that you zipped.
  6. Verify that you created the Lambda function successfully. In the Code source section of Lambda, you should see the code from the automation script displayed in a new tab, as shown in Figure 4.
     
    Figure 4: Source code that was successfully uploaded

  7. Choose the Code tab.
  8. Scroll down to the Runtime settings pane and choose Edit.
  9. For Handler, enter cw-blog-remediation.lambda_handler for your function handler, and then choose Save, as shown in Figure 5.
     
    Figure 5: Updated Lambda handler

  10. For security purposes, and to follow the principle of least privilege, you should also add an inline policy to the Lambda function’s role to perform the tasks necessary to enable versioning on S3 buckets.
    1. In the Lambda console, navigate to the Configuration tab and then, in the left navigation pane, choose Permissions. Choose the Role name, as shown in Figure 6.
       
      Figure 6: Lambda role in the AWS console

    2. In the Add permissions dropdown, select Create inline policy.
       
      Figure 7: Create inline policy

    3. Choose JSON, add the following policy to the policy editor, and then choose Next.
      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Sid": "VisualEditor0",
                  "Effect": "Allow",
                  "Action": "s3:PutBucketVersioning",
                  "Resource": "*"
              }
          ]
      }

    4. Name the policy PutBucketVersioning and choose Create policy.
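If you prefer to script this last step rather than use the console, the following Boto3 sketch attaches the same inline policy. The role name is a placeholder; substitute the auto-generated role shown in Figure 6.

import json
import boto3

iam = boto3.client('iam')

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "s3:PutBucketVersioning",
            "Resource": "*"
        }
    ]
}

iam.put_role_policy(
    RoleName='sec_remediation_function-role-example',  # placeholder role name
    PolicyName='PutBucketVersioning',
    PolicyDocument=json.dumps(policy)
)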

Create a custom action in Security Hub

In this step, you will create a custom action in Security Hub.

To create the custom action

  1. Open the Security Hub console.
  2. In the left navigation pane, choose Settings, and then choose Custom actions.
  3. Choose Create custom action.
  4. Provide the following information, as shown in Figure 8:
    • For Name, enter TurnOnS3Versioning.
    • For Description, enter Action that will turn on versioning for a specific S3 bucket.
    • For Custom action ID, enter TurnOnS3Versioning.
       
      Figure 8: Create a custom action in Security Hub

  5. Choose Create custom action.
  6. Make a note of the Custom action ARN. You will need this ARN when you create a rule to associate with the custom action in EventBridge.
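If you want to automate this step, the console actions above correspond to a single Security Hub API call. A minimal Boto3 sketch looks like the following; it prints the custom action ARN that the next section needs.

import boto3

securityhub = boto3.client('securityhub')

response = securityhub.create_action_target(
    Name='TurnOnS3Versioning',
    Description='Action that will turn on versioning for a specific S3 bucket',
    Id='TurnOnS3Versioning'
)

# Save this ARN for the EventBridge rule in the next section
print(response['ActionTargetArn'])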

Create an EventBridge rule to target the Lambda function

The next step is to create an EventBridge rule to capture the custom action. You will define an EventBridge rule that matches events (in this case, findings) from Security Hub that were forwarded by the custom action that you defined previously.

To create the EventBridge rule

  1. Navigate to the EventBridge console.
  2. On the right side, choose Create rule.
  3. On the Define rule detail page, give your rule a name and description that represents the rule’s purpose—for example, you could use the same name and description that you used for the custom action. Then choose Next.
  4. Scroll down to Event pattern, and then do the following:
    1. For Event source, make sure that AWS services is selected.
    2. For AWS service, select Security Hub.
    3. For Event type, select Security Hub Findings – Custom Action.
    4. Select Specific custom action ARN(s) and enter the ARN for the custom action that you created earlier.
       
    Figure 9: Specify the EventBridge event pattern for the Security Hub custom action workflow

    As you provide this information, the Event pattern updates.

  5. Choose Next.
  6. On the Select target(s) step, in the Select a target dropdown, select Lambda function. Then from the Function dropdown, select sec_remediation_function.
  7. Choose Next.
  8. On the Configure tags step, choose Next.
  9. On the Review and create step, choose Create rule.
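As a programmatic alternative to the console steps above, the following Boto3 sketch creates an equivalent rule and target. The account ID and ARNs are placeholders, and the final call grants EventBridge permission to invoke the function, which the console normally adds for you.

import json
import boto3

events = boto3.client('events')
lambda_client = boto3.client('lambda')

# Placeholders -- replace with the ARNs from your account
custom_action_arn = 'arn:aws:securityhub:us-east-1:111122223333:action/custom/TurnOnS3Versioning'
function_arn = 'arn:aws:lambda:us-east-1:111122223333:function:sec_remediation_function'

event_pattern = {
    "source": ["aws.securityhub"],
    "detail-type": ["Security Hub Findings - Custom Action"],
    "resources": [custom_action_arn]
}

rule_arn = events.put_rule(
    Name='TurnOnS3Versioning',
    Description='Invoke the remediation function for the TurnOnS3Versioning custom action',
    EventPattern=json.dumps(event_pattern)
)['RuleArn']

events.put_targets(
    Rule='TurnOnS3Versioning',
    Targets=[{'Id': 'sec-remediation-function', 'Arn': function_arn}]
)

# Allow EventBridge to invoke the Lambda function
lambda_client.add_permission(
    FunctionName='sec_remediation_function',
    StatementId='AllowEventBridgeInvoke',
    Action='lambda:InvokeFunction',
    Principal='events.amazonaws.com',
    SourceArn=rule_arn
)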

Run the automation

Your automation is set up, and you can now test it. This test covers a partial automation workflow, because you manually select the finding and then apply the remediation workflow to one or more selected findings.

Important: As we mentioned earlier, if you decide to make the automation end-to-end, you should assess the impact of the workflow in a non-production environment. Additionally, you may want to consider creating preventative controls if you want to minimize the risk of event occurrence across an entire environment.

To run the automation

  1. In the Security Hub console, on the Findings tab, add a filter by entering Title in the search box and selecting that filter. Select IS and enter S3 general purpose buckets should have versioning enabled (case sensitive). Choose Apply.
  2. In the filtered list, choose the Title of an active finding.
  3. Before you start the automation, check the current configuration of the S3 bucket to confirm that your automation works. Expand the Resources section of the finding.
  4. Under Resource ID, choose the link for the S3 bucket. This opens a new tab on the S3 console that shows only this S3 bucket.
  5. In your browser, go back to the Security Hub tab (don’t close the S3 tab—you will need to return to it), and on the left side, select this same finding, as shown in Figure 10.
     
    Figure 10: Filter out Security Hub findings to list only S3 bucket-related findings

  6. In the Actions dropdown list, choose the name of your custom action.
     
    Figure 11: Choose the custom action that you created to start the remediation workflow

  7. When you see a banner that displays Successfully started action…, go back to the S3 browser tab and refresh it. Verify that versioning has been enabled on the bucket, as shown in Figure 12.
     
    Figure 12: Versioning successfully enabled

Conclusion

In this post, you learned how to use CodeWhisperer to produce AI-generated code for custom remediations for a security use case. We encourage you to experiment with CodeWhisperer to create Lambda functions that remediate other Security Hub findings that might exist in your account, such as enforcing lifecycle policies on S3 buckets with versioning enabled, or removing multiple unused Amazon EC2 Elastic IP addresses. Automatically enabling versioning on S3 buckets is just one of many use cases where CodeWhisperer can generate code to help you remediate Security Hub findings.

To sum up, CodeWhisperer acts as a tool that can help boost the productivity of security experts who have coding abilities, assisting them to swiftly write code to address security issues. However, security specialists should continue building their software development capabilities to implement robust solutions. Engineers should carefully review and test any generated code, since human oversight is still vital for security.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Brendan Jenkins

Brendan is a Solutions Architect at AWS who works with enterprise customers, providing them with technical guidance and helping them achieve their business goals. He specializes in DevOps and machine learning (ML) technology.

Chris Shea

Chris is an AWS Solutions Architect serving enterprise customers in the PropTech and AdTech industry verticals, providing guidance and the tools that customers need for success. His areas of interest include AI for DevOps and AI/ML technology.

Tim Manik

Tim is a Solutions Architect at AWS working with enterprise customers on migrations and modernizations. He specializes in cybersecurity and AI/ML and is passionate about bridging the gap between the two fields.

Angel Tolson

Angel is a Solutions Architect at AWS working with small to medium size businesses, providing them with technical guidance and helping them achieve their business goals. She is particularly interested in cloud operations and networking.

Detecting and remediating inactive user accounts with Amazon Cognito

Post Syndicated from Harun Abdi original https://aws.amazon.com/blogs/security/detecting-and-remediating-inactive-user-accounts-with-amazon-cognito/

For businesses, particularly those in highly regulated industries, managing user accounts isn’t just a matter of security but also a compliance necessity. In sectors such as finance, healthcare, and government, where regulations often mandate strict control over user access, disabling stale user accounts is a key compliance activity. In this post, we show you a solution that uses serverless technologies to track and disable inactive user accounts. While this process is particularly relevant for those in regulated industries, it can also be beneficial for other organizations looking to maintain a clean and secure user base.

The solution focuses on identifying inactive user accounts in Amazon Cognito and automatically disabling them. Disabling a user account in Cognito restricts the user’s access to applications and services linked with the Amazon Cognito user pool. After their account is disabled, the user can’t sign in, their access tokens are revoked, and they can’t perform API operations that require user authentication. However, the user’s data and profile within the Cognito user pool remain intact. If necessary, the account can be re-enabled, allowing the user to regain access and functionality.

While the solution focuses on the example of a single Amazon Cognito user pool in a single account, you also learn considerations for multi-user pool and multi-account strategies.

Solution overview

In this section, you learn how to configure an AWS Lambda function that captures the latest sign-in records of users authenticated by Amazon Cognito and write this data to an Amazon DynamoDB table. A time-to-live (TTL) indicator is set on each of these records based on the user inactivity threshold parameter defined when deploying the solution. This TTL represents the maximum period a user can go without signing in before their account is disabled. As these items reach their TTL expiry in DynamoDB, a second Lambda function is invoked to process the expired items and disable the corresponding user accounts in Cognito. For example, if the user inactivity threshold is configured to be 7 days, the accounts of users who don’t sign in within 7 days of their last sign-in will be disabled. Figure 1 shows an overview of the process.

Note: This solution functions as a background process and doesn’t disable user accounts in real time. This is because DynamoDB Time to Live (TTL) is designed for efficiency and to remain within the constraints of the Amazon Cognito quotas. Set your users’ and administrators’ expectations accordingly, acknowledging that there might be a delay in the reflection of changes and updates.

Figure 1: Architecture diagram for tracking user activity and disabling inactive Amazon Cognito users

As shown in Figure 1, this process involves the following steps:

  1. An application user signs in by authenticating to Amazon Cognito.
  2. Upon successful user authentication, Cognito initiates a post authentication Lambda trigger invoking the PostAuthProcessorLambda function.
  3. The PostAuthProcessorLambda function puts an item in the LatestPostAuthRecordsDDB DynamoDB table (see the sketch after this list) with the following attributes:
    1. sub: A unique identifier for the authenticated user within the Amazon Cognito user pool.
    2. timestamp: The time of the user’s latest sign-in, formatted in UTC ISO standard.
    3. username: The authenticated user’s Cognito username.
    4. userpool_id: The identifier of the user pool to which the user authenticated.
    5. ttl: The TTL value, in seconds, after which a user’s inactivity will initiate account deactivation.
  4. Items in the LatestPostAuthRecordsDDB DynamoDB table are automatically purged upon reaching their TTL expiry, launching events in DynamoDB Streams.
  5. DynamoDB Streams events are filtered to allow invocation of the DDBStreamProcessorLambda function only for TTL deleted items.
  6. The DDBStreamProcessorLambda function runs to disable the corresponding user accounts in Cognito.
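The deployed solution ships these functions in its CloudFormation template, but as a rough sketch of step 3 (the table name and inactivity threshold are read from assumed environment variables), the PostAuthProcessorLambda logic could look like the following:

import os
import time
from datetime import datetime, timezone

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ['TABLE_NAME'])  # LatestPostAuthRecordsDDB
inactive_days = int(os.environ['USER_INACTIVE_THRESHOLD_DAYS'])

def lambda_handler(event, context):
    """Post authentication trigger: record the latest sign-in with a TTL."""
    table.put_item(Item={
        'sub': event['request']['userAttributes']['sub'],
        'timestamp': datetime.now(timezone.utc).isoformat(),
        'username': event['userName'],
        'userpool_id': event['userPoolId'],
        # Epoch seconds after which DynamoDB TTL removes the item
        'ttl': int(time.time()) + inactive_days * 24 * 60 * 60,
    })
    # Cognito expects the trigger to return the event object
    return event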

Implementation details

In this section, you’re guided through deploying the solution, demonstrating how to integrate it with your existing Amazon Cognito user pool and exploring the solution in more detail.

Note: This solution begins tracking user activity from the moment of its deployment. It can’t retroactively track or manage user activities that occurred prior to its implementation. To make sure the solution disables currently inactive users in the first TTL period after deploying the solution, you should do a one-time preload of those users into the DynamoDB table. If this isn’t done, the currently inactive users won’t be detected because users are detected as they sign in. For the same reason, users who create accounts but never sign in won’t be detected either. To detect user accounts that sign up but never sign in, implement a post confirmation Lambda trigger to invoke a Lambda function that processes user sign-up records and writes them to the DynamoDB table.

Prerequisites

Before deploying this solution, you must have the following prerequisites in place:

  • An existing Amazon Cognito user pool. This user pool is the foundation upon which the solution operates. If you don’t have a Cognito user pool set up, you must create one before proceeding. See Creating a user pool.
  • The ability to launch a CloudFormation template. The second prerequisite is the capability to launch an AWS CloudFormation template in your AWS environment. The template provisions the necessary AWS services, including Lambda functions, a DynamoDB table, and AWS Identity and Access Management (IAM) roles that are integral to the solution. The template simplifies the deployment process, allowing you to set up the entire solution with minimal manual configuration. You must have the necessary permissions in your AWS account to launch CloudFormation stacks and provision these services.

To deploy the solution

  1. Choose the following Launch Stack button to deploy the solution’s CloudFormation template:

    Launch Stack

    The solution deploys in the AWS US East (N. Virginia) Region (us-east-1) by default. To deploy the solution in a different Region, use the Region selector in the console navigation bar and make sure that the services required for this walkthrough are supported in your newly selected Region. For service availability by Region, see AWS Services by Region.

  2. On the Quick Create Stack screen, do the following:
    1. Specify the stack details.
      1. Stack name: The stack name is an identifier that helps you find a particular stack from a list of stacks. A stack name can contain only alphanumeric characters (case sensitive) and hyphens. It must start with an alphabetic character and can’t be longer than 128 characters.
      2. CognitoUserPoolARNs: A comma-separated list of Amazon Cognito user pool Amazon Resource Names (ARNs) to monitor for inactive users.
      3. UserInactiveThresholdDays: Time (in days) that the user account is allowed to be inactive before it’s disabled.
    2. Scroll to the bottom, and in the Capabilities section, select I acknowledge that AWS CloudFormation might create IAM resources with custom names.
    3. Choose Create Stack.

Integrate with your existing user pool

With the CloudFormation template deployed, you can set up Lambda triggers in your existing user pool. This is a key step for tracking user activity.

Note: This walkthrough uses the new AWS Management Console experience. Alternatively, these steps can be done using CloudFormation.

To integrate with your existing user pool

  1. Navigate to the Amazon Cognito console and select your user pool.
  2. Navigate to User pool properties.
  3. Under Lambda triggers, choose Add Lambda trigger. Select the Authentication radio button, then add a Post authentication trigger and assign the PostAuthProcessorLambda function.

Note: Amazon Cognito allows you to set up one Lambda trigger per event. If you already have a configured post authentication Lambda trigger, you can refactor the existing Lambda function, adding new features directly to minimize the cold starts associated with invoking additional functions (for more information, see Anti-patterns in Lambda-based applications). Keep in mind that when Cognito calls your Lambda function, the function must respond within 5 seconds. If it doesn’t and if the call can be retried, Cognito retries the call. After three unsuccessful attempts, the function times out. You can’t change this 5-second timeout value.

Figure 2: Add a post-authentication Lambda trigger and assign a Lambda function

When you add a Lambda trigger in the Amazon Cognito console, Cognito adds a resource-based policy to your function that permits your user pool to invoke the function. When you create a Lambda trigger outside of the Cognito console, including a cross-account function, you must add permissions to the resource-based policy of the Lambda function. Your added permissions must allow Cognito to invoke the function on behalf of your user pool. You can add permissions from the Lambda console or use the Lambda AddPermission API operation. To configure this in CloudFormation, you can use the AWS::Lambda::Permission resource.
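For reference, that AddPermission call could look like the following Boto3 sketch; the function name, statement ID, and user pool ARN are placeholders.

import boto3

lambda_client = boto3.client('lambda')

lambda_client.add_permission(
    FunctionName='PostAuthProcessorLambda',       # placeholder function name
    StatementId='AllowCognitoPostAuthInvoke',
    Action='lambda:InvokeFunction',
    Principal='cognito-idp.amazonaws.com',
    # Restrict invocation to your user pool
    SourceArn='arn:aws:cognito-idp:us-east-1:111122223333:userpool/us-east-1_EXAMPLE'
)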

Explore the solution

The solution should now be operational. It’s configured to begin monitoring user sign-in activities and automatically disable inactive user accounts according to the user inactivity threshold. Use the following procedures to test the solution:

Note: When testing the solution, you can set the UserInactiveThresholdDays CloudFormation parameter to 0. This minimizes the time it takes for user accounts to be disabled.

Step 1: User authentication

  1. Create a user account (if one doesn’t exist) in the Amazon Cognito user pool integrated with the solution.
  2. Authenticate to the Cognito user pool integrated with the solution.
     
    Figure 3: Example user signing in to the Amazon Cognito hosted UI

Step 2: Verify the sign-in record in DynamoDB

Confirm the sign-in record was successfully put in the LatestPostAuthRecordsDDB DynamoDB table.

  1. Navigate to the DynamoDB console.
  2. Select the LatestPostAuthRecordsDDB table.
  3. Select Explore Table Items.
  4. Locate the sign-in record associated with your user.
     
Figure 4: Locating the sign-in record associated with the signed-in user

Step 3: Confirm user deactivation in Amazon Cognito

After the TTL expires, validate that the user account is disabled in Amazon Cognito.

  1. Navigate to the Amazon Cognito console.
  2. Select the relevant Cognito user pool.
  3. Under Users, select the specific user.
  4. Verify the Account status in the User information section.
     
Figure 5: Screenshot of the user that signed in with their account status set to disabled

Note: TTL typically deletes expired items within a few days. Depending on the size and activity level of a table, the actual delete operation of an expired item can vary. TTL deletes items on a best effort basis, and deletion might take longer in some cases.

The user’s account is now disabled. A disabled user account can’t be used to sign in, but still appears in the responses to GetUser and ListUsers API requests.

Design considerations

In this section, you dive deeper into the key components of this solution.

DynamoDB schema configuration

The DynamoDB schema has the Amazon Cognito sub attribute as the partition key. The Cognito sub is a globally unique user identifier within Cognito user pools that cannot be changed. This configuration ensures each user has a single entry in the table, even if the solution is configured to track multiple user pools. See Other considerations for more about tracking multiple user pools.

Using DynamoDB Streams and Lambda to disable TTL deleted users

This solution uses DynamoDB TTL and DynamoDB Streams alongside Lambda to process user sign-in records. The TTL feature automatically deletes items past their expiration time without write throughput consumption. The deleted items are captured by DynamoDB Streams and processed using Lambda. You also apply event filtering within the Lambda event source mapping, ensuring that the DDBStreamProcessorLambda function is invoked exclusively for TTL-deleted items (see the following code example for the JSON filter pattern). This approach reduces invocations of the Lambda functions, simplifies code, and reduces overall cost.

{
    "Filters": [
        {
            "Pattern": { "userIdentity": { "type": ["Service"], "principalId": ["dynamodb.amazonaws.com"] } }
        }
    ]
}
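With that filter in place, the DDBStreamProcessorLambda function only has to read the expired records from the stream and call the Cognito AdminDisableUser API. A simplified sketch (it assumes the stream is configured to include old images, and omits error handling and DLQ logic) could look like this:

import boto3

cognito = boto3.client('cognito-idp')

def lambda_handler(event, context):
    """Disable Cognito users whose sign-in records were deleted by TTL."""
    disabled = 0
    for record in event['Records']:
        if record['eventName'] != 'REMOVE':
            continue
        # The deleted item's attributes are in the stream record's OldImage
        old_image = record['dynamodb']['OldImage']
        cognito.admin_disable_user(
            UserPoolId=old_image['userpool_id']['S'],
            Username=old_image['username']['S']
        )
        disabled += 1
    return {'disabled': disabled}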

Handling API quotas

The DDBStreamProcessorLambda function is configured to comply with the AdminDisableUser API’s quota limits. It processes messages in batches of 25, with a parallelization factor of 1. This makes sure that the solution remains within the nonadjustable 25 requests per second (RPS) limit for AdminDisableUser, avoiding potential API throttling. For more details on these limits, see Quotas in Amazon Cognito.
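If you were wiring this up yourself instead of through the provided template, the event source mapping with these batching settings and the filter shown earlier could be created with a Boto3 sketch like the following (the function name and stream ARN are placeholders):

import json
import boto3

lambda_client = boto3.client('lambda')

ttl_delete_filter = {
    "userIdentity": {"type": ["Service"], "principalId": ["dynamodb.amazonaws.com"]}
}

lambda_client.create_event_source_mapping(
    FunctionName='DDBStreamProcessorLambda',   # placeholder function name
    EventSourceArn='arn:aws:dynamodb:us-east-1:111122223333:table/LatestPostAuthRecordsDDB/stream/2024-01-01T00:00:00.000',
    StartingPosition='LATEST',
    BatchSize=25,               # stay within the 25 RPS AdminDisableUser quota
    ParallelizationFactor=1,
    FilterCriteria={'Filters': [{'Pattern': json.dumps(ttl_delete_filter)}]}
)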

Dead-letter queues

Throughout the architecture, dead-letter queues (DLQs) are used to handle message processing failures gracefully. They make sure that unprocessed records aren’t lost but instead are queued for further inspection and retry.

Other considerations

The following considerations are important for scaling the solution in complex environments and maintaining its integrity. The ability to scale and manage the increased complexity is crucial for successful adoption of the solution.

Multi-user pool and multi-account deployment

While this solution discussed a single Amazon Cognito user pool in a single AWS account, this solution can also function in environments with multiple user pools. This involves deploying the solution and integrating with each user pool as described in Integrating with your existing user pool. Because of the AdminDisableUser API’s quota limit for the maximum volume of requests in one AWS Region in one AWS account, consider deploying the solution separately in each Region in each AWS account to stay within the API limits.

Efficient processing with Amazon SQS

Consider using Amazon Simple Queue Service (Amazon SQS) to add a queue between the PostAuthProcessorLambda function and the LatestPostAuthRecordsDDB DynamoDB table to optimize processing. This approach decouples user sign-in actions from DynamoDB writes, and allows for batching writes to DynamoDB, reducing the number of write requests.

Clean up

Avoid unwanted charges by cleaning up the resources you’ve created. To decommission the solution, follow these steps:

  1. Remove the Lambda trigger from the Amazon Cognito user pool:
    1. Navigate to the Amazon Cognito console.
    2. Select the user pool you have been working with.
    3. Go to the Triggers section within the user pool settings.
    4. Manually remove the association of the Lambda function with the user pool events.
  2. Remove the CloudFormation stack:
    1. Open the CloudFormation console.
    2. Locate and select the CloudFormation stack that was used to deploy the solution.
    3. Delete the stack.
    4. CloudFormation will automatically remove the resources created by this stack, including Lambda functions, Amazon SQS queues, and DynamoDB tables.

Conclusion

In this post, we walked you through a solution to identify and disable stale user accounts based on periods of inactivity. While the example focuses on a single Amazon Cognito user pool, the approach can be adapted for more complex environments with multiple user pools across multiple accounts. For examples of Amazon Cognito architectures, see the AWS Architecture Blog.

Proper planning is essential for seamless integration with your existing infrastructure. Carefully consider factors such as your security environment, compliance needs, and user pool configurations. You can modify this solution to suit your specific use case.

Maintaining clean and active user pools is an ongoing journey. Continue monitoring your systems, optimizing configurations, and keeping up-to-date on new features. Combined with well-architected preventive measures, automated user management systems provide strong defenses for your applications and data.

For further reading, see the AWS Well-Architected Security Pillar and more posts like this one on the AWS Security Blog.

If you have feedback about this post, submit comments in the Comments section. If you have questions about this post, start a new thread on the Amazon Cognito re:Post forum or contact AWS Support.

Harun Abdi

Harun is a Startup Solutions Architect based in Toronto, Canada. Harun loves working with customers across different sectors, supporting them to architect reliable and scalable solutions. In his spare time, he enjoys playing soccer and spending time with friends and family.

Dylan Souvage

Dylan is a Partner Solutions Architect based in Austin, Texas. Dylan loves working with customers to understand their business needs and enable them in their cloud journey. In his spare time, he enjoys going out in nature and going on long road trips.

AWS Weekly Roundup: Amazon EC2 G6 instances, Mistral Large on Amazon Bedrock, AWS Deadline Cloud, and more (April 8, 2024)

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-mistral-large-aws-clean-rooms-ml-aws-deadline-cloud-and-more-april-8-2024/

We’re just two days away from AWS Summit Sydney (April 10–11) and a month away from the AWS Summit season in Southeast Asia, starting with the AWS Summit Singapore (May 7) and the AWS Summit Bangkok (May 30). If you happen to be in Sydney, Singapore, or Bangkok around those dates, please join us.

Last Week’s Launches
If you haven’t read last week’s Weekly Roundup yet, Channy wrote about the AWS Chips Taste Test, a new initiative from Jeff Barr as part of April Fools’ Day.

Here are some launches that caught my attention last week:

New Amazon EC2 G6 instances — We announced the general availability of Amazon EC2 G6 instances powered by NVIDIA L4 Tensor Core GPUs. G6 instances can be used for a wide range of graphics-intensive and machine learning use cases. G6 instances deliver up to 2x higher performance for deep learning inference and graphics workloads compared to Amazon EC2 G4dn instances. To learn more, visit the Amazon EC2 G6 instance page.

Mistral Large is now available in Amazon Bedrock — Veliswa wrote about the availability of the Mistral Large foundation model, as part of the Amazon Bedrock service. You can use Mistral Large to handle complex tasks that require substantial reasoning capabilities. In addition, Amazon Bedrock is now available in the Paris AWS Region.

Amazon Aurora zero-ETL integration with Amazon Redshift now in additional Regions — Zero-ETL integration announcements were my favourite launches last year. This zero-ETL integration simplifies the process of transferring data between the two services, allowing customers to move data between Amazon Aurora and Amazon Redshift without the need for manual Extract, Transform, and Load (ETL) processes. With this announcement, zero-ETL integration between Amazon Aurora and Amazon Redshift is now supported in 11 additional Regions.

Announcing AWS Deadline Cloud — If you’re working in films, TV shows, commercials, games, and industrial design and handling complex rendering management for teams creating 2D and 3D visual assets, then you’ll be excited about AWS Deadline Cloud. This new managed service simplifies the deployment and management of render farms for media and entertainment workloads.

AWS Clean Rooms ML is Now Generally Available — Last year, I wrote about the preview of AWS Clean Rooms ML. In that post, I elaborated a new capability of AWS Clean Rooms that helps you and your partners apply machine learning (ML) models on your collective data without copying or sharing raw data with each other. Now, AWS Clean Rooms ML is available for you to use.

Knowledge Bases for Amazon Bedrock now supports private network policies for OpenSearch Serverless — Here’s exciting news for you who are building with Amazon Bedrock. Now, you can implement Retrieval-Augmented Generation (RAG) with Knowledge Bases for Amazon Bedrock using Amazon OpenSearch Serverless (OSS) collections that have a private network policy.

Amazon EKS extended support for Kubernetes versions now generally available — If you’re running Kubernetes version 1.21 and higher, with this Extended Support for Kubernetes, you can stay up-to-date with the latest Kubernetes features and security improvements on Amazon EKS.

AWS Lambda Adds Support for Ruby 3.3 — Coding in Ruby? Now, AWS Lambda supports Ruby 3.3 as its runtime. This update allows you to take advantage of the latest features and improvements in the Ruby language.

Amazon EventBridge Console Enhancements — The Amazon EventBridge console has been updated with new features and improvements, making it easier for you to manage your event-driven applications with a better user experience.

Private Access to the AWS Management Console in Commercial Regions — If you need to restrict access to personal AWS accounts from the company network, you can use AWS Management Console Private Access. With this launch, you can use AWS Management Console Private Access in all commercial AWS Regions.

From community.aws 
community.aws is a home for us builders to share what we learn while building on AWS. Here are my top 3 posts from last week:

Other AWS News 
Here are some additional news items, open-source projects, and Twitch shows that you might find interesting:

Build On Generative AI – Join Tiffany and Darko to learn more about generative AI, see their demos and discuss different aspects of generative AI with the guest speakers. Streaming every Monday on Twitch, 9:00 AM US PT.

AWS open source news and updates – If you’re looking for various open-source projects and tools from the AWS community, please read the AWS open-source newsletter maintained by my colleague, Ricardo.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Amsterdam (April 9), Sydney (April 10–11), London (April 24), Singapore (May 7), Berlin (May 15–16), Seoul (May 16–17), Hong Kong (May 22), Milan (May 23), Dubai (May 29), Thailand (May 30), Stockholm (June 4), and Madrid (June 5).

AWS re:Inforce – Explore cloud security in the age of generative AI at AWS re:Inforce, June 10–12 in Pennsylvania for two-and-a-half days of immersive cloud security learning designed to help drive your business initiatives.

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Poland (April 11), Bay Area (April 12), Kenya (April 20), and Turkey (May 18).

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Donnie

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Serverless ICYMI Q1 2024

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/serverless-icymi-q1-2024/

Welcome to the 25th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all the most recent product launches, feature enhancements, blog posts, webinars, live streams, and other interesting things that you might have missed!

In case you missed our last ICYMI, check out what happened last quarter here.

2024 Q1 calendar

2024 Q1 calendar

Adobe Summit

At the Adobe Summit, the AWS Serverless Developer Advocacy team showcased a solution developed for the NFL using AWS serverless technologies and Adobe Photoshop APIs. The system automates image processing tasks, including background removal and dynamic resizing, by integrating AWS Step Functions, AWS Lambda, Amazon EventBridge, and AI/ML capabilities via Amazon Rekognition. This solution reduced image processing time from weeks to minutes and saved the NFL significant costs. Combining cloud-based serverless architectures with advanced machine learning and API technologies can optimize digital workflows for cost-effective and agile digital asset management.

Adobe Summit ServerlessVideo

ServerlessVideo is a demo application to stream live videos and also perform advanced post-video processing. It uses several AWS services, including Step Functions, Lambda, EventBridge, Amazon ECS, and Amazon Bedrock in a serverless architecture that makes it fast, flexible, and cost-effective. The team used ServerlessVideo to interview attendees about the conference experience and Adobe and partners about how they use Adobe. Learn more about the project and watch videos from Adobe Summit 2024 at video.serverlessland.com.

AWS Lambda

AWS launched support for the latest long-term support release of .NET 8, which includes API enhancements, improved Native Ahead of Time (Native AOT) support, and improved performance.

AWS Lambda .NET 8

Learn how to compare design approaches for building serverless microservices. This post covers the trade-offs to consider with various application architectures. See how you can apply single responsibility, Lambda-lith, and read and write functions.

The AWS Serverless Java Container has been updated. This makes it easier to modernize a legacy Java application written with frameworks such as Spring, Spring Boot, or JAX-RS/Jersey in Lambda with minimal code changes.

AWS Serverless Java Container

Lambda has improved the responsiveness for configuring Event Source Mappings (ESMs) and Amazon EventBridge Pipes with event sources such as self-managed Apache Kafka, Amazon Managed Streaming for Apache Kafka (MSK), Amazon DocumentDB, and Amazon MQ.

Chaos engineering is a popular practice for building confidence in system resilience. However, many existing tools assume the ability to alter infrastructure configurations, and cannot be easily applied to the serverless application paradigm. You can use the AWS Fault Injection Service (FIS) to automate and manage chaos experiments across different Lambda functions to provide a reusable testing method.

Amazon ECS and AWS Fargate

Amazon Elastic Container Service (Amazon ECS) now provides managed instance draining as a built-in feature of Amazon ECS capacity providers. This allows Amazon ECS to safely and automatically drain tasks from Amazon Elastic Compute Cloud (Amazon EC2) instances that are part of an Amazon EC2 Auto Scaling Group associated with an Amazon ECS capacity provider. This simplification allows you to remove custom lifecycle hooks previously used to drain Amazon EC2 instances. You can now perform infrastructure updates such as rolling out a new version of the ECS agent by seamlessly using Auto Scaling Group instance refresh, with Amazon ECS ensuring workloads are not interrupted.

Credentials Fetcher makes it easier to run containers that depend on Windows authentication when using Amazon EC2. Credentials Fetcher now integrates with Amazon ECS, using either the Amazon EC2 launch type, or AWS Fargate serverless compute launch type.

Amazon ECS Service Connect is a networking capability to simplify service discovery, connectivity, and traffic observability for Amazon ECS. You can now more easily integrate certificate management to encrypt service-to-service communication using Transport Layer Security (TLS). You do not need to modify your application code, add additional network infrastructure, or operate service mesh solutions.

Amazon ECS Service Connect

Running distributed machine learning (ML) workloads on Amazon ECS allows ML teams to focus on creating, training and deploying models, rather than spending time managing the container orchestration engine. Amazon ECS provides a great environment to run ML projects as it supports workloads that use NVIDIA GPUs and provides optimized images with pre-installed NVIDIA Kernel drivers and Docker runtime.

See how to build preview environments for Amazon ECS applications with AWS Copilot. AWS Copilot is an open source command line interface that makes it easier to build, release, and operate production ready containerized applications.

Learn techniques for automatic scaling of your Amazon Elastic Container Service (Amazon ECS) container workloads to enhance the end user experience. This post explains how to use AWS Application Auto Scaling, which helps you configure automatic scaling of your Amazon ECS service. You can also use Amazon ECS Service Connect and AWS Distro for OpenTelemetry (ADOT) with Application Auto Scaling.

AWS Step Functions

AWS workloads sometimes require access to data stored in on-premises databases and storage locations. Traditional solutions to establish connectivity to the on-premises resources require inbound rules to firewalls, a VPN tunnel, or public endpoints. Discover how to use the MQTT protocol (AWS IoT Core) with AWS Step Functions to dispatch jobs to on-premises workers to access or retrieve data stored on-premises.

You can use Step Functions to orchestrate many business processes. Many industries are required to provide audit trails for decision and transactional systems. Learn how to build a serverless pipeline to create a reliable, performant, traceable, and durable pipeline for audit processing.

Amazon EventBridge

Amazon EventBridge now supports publishing events to AWS AppSync GraphQL APIs as native targets. The new integration allows you to publish events easily to a wider variety of consumers and simplifies updating clients with near real-time data.

Amazon EventBridge publishing events to AWS AppSync

Discover how to send and receive CloudEvents with EventBridge. CloudEvents is an open-source specification for describing event data in a common way. You can publish CloudEvents directly to EventBridge, filter and route them, and use input transformers and API Destinations to send CloudEvents to downstream AWS services and third-party APIs.

AWS Application Composer

AWS Application Composer lets you create infrastructure as code templates by dragging and dropping cards on a virtual canvas. These represent CloudFormation resources, which you can wire together to create permissions and references. Application Composer has now expanded to the VS Code IDE as part of the AWS Toolkit. This now includes a generative AI partner that helps you write infrastructure as code (IaC) for all 1100+ AWS CloudFormation resources that Application Composer now supports.

AWS AppComposer generate suggestions

Amazon API Gateway

Learn how to consume private Amazon API Gateway APIs using mutual TLS (mTLS). mTLS helps prevent man-in-the-middle attacks and protects against threats such as impersonation attempts, data interception, and tampering.

Serverless at AWS re:Invent


Visit the Serverless Land YouTube channel to find a list of serverless and serverless container sessions from re:Invent 2023. Hear from experts like Chris Munns and Julian Wood in their popular session, Best practices for serverless developers, or Nathan Peck and Jessica Deen in Deploying multi-tenant SaaS applications on Amazon ECS and AWS Fargate.

Serverless blog posts

January

February

March

Serverless container blog posts

January

February

December

Serverless Office Hours


January

February

March

Containers from the Couch


January

February

March

FooBar Serverless


January

February

March

Still looking for more?

The Serverless landing page has more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials.

You can also follow the Serverless Developer Advocacy team on Twitter to see the latest news, follow conversations, and interact with the team.

And finally, visit the Serverless Land and Containers on AWS websites for all your serverless and serverless container needs.

Automating chaos experiments with AWS Fault Injection Service and AWS Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/automating-chaos-experiments-with-aws-fault-injection-service-and-aws-lambda/

This post is written by André Stoll, Solution Architect.

Chaos engineering is a popular practice for building confidence in system resilience. However, many existing tools assume the ability to alter infrastructure configurations, and cannot be easily applied to the serverless application paradigm. Due to the stateless, ephemeral, and distributed nature of serverless architectures, you must evolve the traditional technique when running chaos experiments on these systems.

This blog post explains a technique for running chaos engineering experiments on AWS Lambda functions. The approach uses Lambda extensions to induce failures in a runtime-agnostic way requiring no function code changes. It shows how you can use the AWS Fault Injection Service (FIS) to automate and manage chaos experiments across different Lambda functions to provide a reusable testing method.

Overview

Chaos experiments are commonly applied to cloud applications to uncover latent issues and prevent service disruptions. IT teams use chaos experiments to build confidence in the robustness of their systems. However, the traditional methods used in server-based chaos engineering do not easily translate to the serverless world since many existing tools are based on altering the underlying infrastructure configurations, such as cluster nodes or server instances of your applications.

In serverless applications, AWS handles the undifferentiated heavy lifting of managing infrastructure, so you can focus on delivering business value. But this also means that engineering teams have limited control over the infrastructure, and must rely on application-level tooling to run chaos experiments. Two techniques commonly used in the serverless community for conducting chaos experiments on Lambda functions are modifying the function configuration or using runtime-specific libraries.

Changing the configuration of a Lambda function allows you to induce rudimentary failures. For example, you can set the reserved concurrency of a Lambda function to simulate invocation throttling. Alternatively, you might change the function execution role permissions or the function policy to simulate IAM access denial. These types of failures are easy to implement, but the range of possible fault injection types is limited.
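For example, one quick way to simulate throttling is to set the function’s reserved concurrency to zero; the following Boto3 sketch shows the idea (the function name is a placeholder).

import boto3

lambda_client = boto3.client('lambda')

# Reserved concurrency of 0 causes every invocation to be throttled
lambda_client.put_function_concurrency(
    FunctionName='my-target-function',   # placeholder
    ReservedConcurrentExecutions=0
)

# Remove the limit again when the experiment is over
lambda_client.delete_function_concurrency(FunctionName='my-target-function')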

The other technique—injecting chaos into Lambda functions through purpose-built, runtime-specific libraries—is more flexible. There are various open-source libraries that allow you to inject failures, such as added latency, exceptions, or disk exhaustion. Examples of such libraries are Python’s chaos_lambda and failure-lambda for Node.js. The downside is that you must change the function code for every function you want to run chaos experiments on. In addition, those libraries are runtime-specific and each library comes with a set of different capabilities and configurations. This reduces the reusability of your chaos experiments across Lambda functions implemented in different languages.

Injecting chaos using Lambda extensions

Implementing chaos experiments using Lambda extensions allows you to address all of the previous concerns. Lambda extensions augment your functions by adding functionality, such as capturing diagnostic information or automatically instrumenting your code. You can integrate your preferred monitoring, observability, or security tooling deeply into the Lambda environment without complex installation or configuration management. Lambda extensions are generally packaged as Lambda layers and run as a separate process in the Lambda execution environment. You may use extensions from AWS, AWS Lambda partners, or build your own custom functionality.

With Lambda extensions, you can implement a chaos extension to inject the desired failures into your Lambda environments. This chaos extension uses the Runtime API proxy pattern that enables you to hook into the function invocation request and response lifecycle. Lambda runtimes use the Lambda Runtime API to retrieve the next incoming event to be processed by the function handler and return the handler response to the Lambda service.

The Runtime API HTTP endpoint is available within the Lambda execution environment. Runtimes get the API endpoint from the environment variable AWS_LAMBDA_RUNTIME_API. During the initialization of the execution environment, you can modify the runtime startup behavior. This lets you change the value of AWS_LAMBDA_RUNTIME_API to the port the chaos extension process is listening on. Now, all requests to the Runtime API go through the chaos extension proxy. You can use this workflow for blocking malicious events, auditing payloads, or injecting failures.

Injecting chaos using Lambda extensions

  1. The chaos extension intercepts incoming events and outbound responses, and injects failures according to the chaos experiment configuration.
  2. The extension accesses environment variables to read the chaos experiment configuration.
  3. A wrapper script configures the runtime to proxy requests through the chaos extension.

When intercepting incoming events and outbound responses to the Lambda Runtime API, you can simulate failures such as introducing artificial delay or generate an error response to return to the Lambda service. This workflow adds latency to your function calls:

Workflow
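To make the proxy idea concrete, the following stripped-down Python sketch injects latency on the next-event call. It is not the implementation of the open source extensions mentioned below: it assumes a wrapper script has pointed AWS_LAMBDA_RUNTIME_API at the proxy’s port and exported the real endpoint in a CHAOS_REAL_RUNTIME_API variable, and it omits the registration with the Extensions API that a real extension must perform.

import os
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumed environment variables set up by the wrapper script
REAL_RUNTIME_API = os.environ['CHAOS_REAL_RUNTIME_API']  # original AWS_LAMBDA_RUNTIME_API value
DELAY_SECONDS = float(os.environ.get('CHAOS_DELAY_SECONDS', '0'))

class ChaosProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # Delay delivery of the next event to simulate added latency
        if self.path.endswith('/runtime/invocation/next'):
            time.sleep(DELAY_SECONDS)
        self._forward('GET')

    def do_POST(self):
        self._forward('POST')

    def _forward(self, method):
        length = int(self.headers.get('Content-Length', 0))
        body = self.rfile.read(length) if length else None
        request = urllib.request.Request(
            f'http://{REAL_RUNTIME_API}{self.path}', data=body, method=method)
        with urllib.request.urlopen(request) as response:
            payload = response.read()
            self.send_response(response.status)
            # Pass the Runtime API headers (such as the request ID) back to the runtime
            for name, value in response.getheaders():
                if name.lower() not in ('transfer-encoding', 'content-length'):
                    self.send_header(name, value)
            self.send_header('Content-Length', str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

if __name__ == '__main__':
    # The wrapper script points AWS_LAMBDA_RUNTIME_API at 127.0.0.1:9009
    HTTPServer(('127.0.0.1', 9009), ChaosProxy).serve_forever()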

All Lambda runtimes support extensions. Since extensions run as a separate process, you can implement them in a language other than the function code. AWS recommends you implement extensions using a programming language that compiles to a binary executable, such as Golang or Rust. This allows you to use the extension with any Lambda runtime.

Some of the open source projects following this technique are the chaos-lambda-extension, implemented in Rust, or the serverless-chaos-extension, implemented in Python.

Extensions provide you with a flexible and reusable method to run your chaos experiments on Lambda functions. You can reuse the chaos extension for all runtimes without having to change function code. Add the extension to any Lambda function where you want to run chaos experiments.

Automating with AWS FIS experiment templates

According to the Principles of Chaos Engineering, you should “automate your experiments to run continuously”. To achieve this, you can use the AWS Fault Injection Service (FIS).

This service allows you to generate reusable experiment templates. The template specifies the targets and the actions to run on them during the experiment, and an optional stop condition that prevents the experiment from going out of bounds. You can also execute AWS Systems Manager Automation runbooks which support custom fault types. You can write your own custom Systems Manager documents to define the individual steps involved in the automation. To carry out the actions of the experiment, you define scripts in the document to manage your Lambda function and set it up for the chaos experiment.

To use the chaos extension for your serverless chaos experiments:

  1. Set up the Lambda function for the experiment. Add the chaos extension as a layer and configure the experiment, for example, by adding environment variables specifying the fault type and its corresponding value (see the sketch after this list).
  2. Pause the automation and conduct the experiment. To do this, use the aws:sleep automation action. During this period, you conduct the experiment, measure and observe the outcome.
  3. Clean up the experiment. The script removes the layer again and also resets the environment variables.
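For example, the setup step of such a runbook could call the Lambda API along the following lines; the function name, layer ARN, and the environment variable names read by your chosen chaos extension are placeholders, and the cleanup step would make the inverse call.

import boto3

lambda_client = boto3.client('lambda')

# Placeholders for the targeted function and the chaos extension layer
function_name = 'my-target-function'
chaos_layer_arn = 'arn:aws:lambda:us-east-1:111122223333:layer:chaos-extension:1'

# Note: this replaces the existing layers and environment variables;
# a production runbook would merge with the current configuration first.
lambda_client.update_function_configuration(
    FunctionName=function_name,
    Layers=[chaos_layer_arn],
    Environment={
        'Variables': {
            'CHAOS_FAULT_TYPE': 'latency',     # placeholder setting names
            'CHAOS_FAULT_VALUE_MS': '500'
        }
    }
)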

Running your first serverless chaos experiment

This sample repository provides you with the necessary code to run your first serverless chaos experiment in AWS. The experiment uses the chaos-lambda-extension extension to inject chaos.

The sample deploys the AWS FIS experiment template and the necessary SSM Automation runbooks, including the IAM role used by the runbooks to configure the Lambda functions. The sample also provisions a Lambda function for testing and an Amazon CloudWatch alarm used to roll back the experiment.

Prerequisites

Running the experiment

Follow the steps outlined in the repository to conduct your first experiment. Starting the experiment triggers the automation execution.

Actions summary

This automation includes adding the extension and configuring the experiment, pausing the execution while you observe the system, and reverting all changes to the initial state.

Executed steps

If you invoke the targeted Lambda function during the second step, failures (in this case, artificial latency) are simulated.

Output result

Security best practices

Extensions run within the same execution environment as the function, so they have the same level of access to resources such as file system, networking, and environment variables. IAM permissions assigned to the function are shared with extensions. AWS recommends you assign the least required privileges to your functions.

Always install extensions from a trusted source only. Use Infrastructure as Code (IaC) and automation tools, such as CloudFormation or AWS Systems Manager, to simplify attaching the same extension configuration, including AWS Identity and Access Management (IAM) permissions, to multiple functions. IaC and automation tools allow you to have an audit record of extensions and versions used previously.

When building extensions, do not log sensitive data. Sanitize payloads and metadata before logging or persisting them for audit purposes.

Conclusion

This blog post details how to run chaos experiments for serverless applications built using Lambda. The described approach uses a Lambda extension to inject faults into the execution environment. This allows you to use the same method regardless of the runtime or configuration of the Lambda function.

To automate and successfully conduct the experiment, you can use the AWS Fault Injection Service. By creating an experiment template, you can specify the actions to run on the defined targets, such as adding the extension during the experiment. Since the extension can be used for any runtime, you can reuse the experiment template to inject failures into different Lambda functions.

Visit this repository to deploy your first serverless chaos experiment, or watch this video guide for learning more about building extensions. Explore the AWS FIS documentation to learn how to create your own experiments.

For more serverless learning resources, visit Serverless Land.

Simplify document search at scale with intelligent search bot on AWS

Post Syndicated from Rostislav Markov original https://aws.amazon.com/blogs/architecture/simplify-document-search-at-scale-with-intelligent-search-bot-on-aws/

Enterprise document management systems (EDMS) manage the lifecycle and distribution of documents. They often rely on keyword-based search functionality. However, it increasingly becomes hard to discover documents as such repositories grow to tens of thousands of items.

In this blog, we discuss how Amazon Web Services (AWS) built an intelligent search bot on top of the document repository of a global life sciences company. Before this enhancement, the native repository search function relied solely on keywords and document names, leading to a trial-and-error process. Now, scientists can effortlessly query the repository using natural language to locate the right document in a few seconds—even through voice commands while working with lab equipment.

Use case

In life sciences, documentation is critical for regulatory compliance with GxP. Scientists in life sciences use EDMS on a daily basis to retrieve standard operating procedures (SOPs). SOPs contain task-level instructions (for example, how to monitor lab equipment or use utilities such as steam generators and chilled water circulation pumps).

EDMS search capability is often limited to file names and metadata. In our use case, file names were numerical and metadata was typically a single-sentence description plus keywords.

Scientists wanted to query for a particular context and type of task they’re about to perform. However, this data was not extracted from document content and thus not readily available for search purposes. Scientists also wanted to be able to search by using a voice interface (for example, when they operate on lab equipment).

To address this, we designed a conversational bot interface. This bot locates the most relevant SOP based on pre-generated document extracts and returns a hyperlink to the most suitable document, as shown in Figure 1.

Figure 1. Example of document search prompt and chatbot response

Overview of the intelligent search bot solution

Our solution requirements for intelligent search were:

  • Semantic search index and ranking based on full text
  • Search capability through voice and text
  • Out-of-the-box integration with web applications and mobile devices

We chose Amazon Lex to provide the conversational interface using text or speech. Lex bots can be integrated with web and mobile applications using AWS Amplify Interactions. We used Amazon Kendra to create an intelligent search index on top of the data repository, which we hosted on Amazon Simple Storage Service (Amazon S3).

The advantage of using an Amazon Kendra index is its out-of-the-box semantic search and ranking capability based on document content. We use AWS Lambda to handle Amazon S3 path mappings and document attribute formatting for replicated documents so that they are retrievable by Amazon Kendra.

Figure 2. Intelligent search bot for enterprise document management systems

Benefits of integrating intelligent search bot with your EDMS

The benefits of extending EDMS with intelligent search bot include:

  • Improved usability by adding text- and speech-based channels to match user situations (for example, scientists operating on lab equipment)
  • Native, out-of-the-box integration with third-party systems (for example, Adobe Experience Manager, Alfresco, HubSpot, Marketo, Salesforce)
  • Implementation timeboxed in a two-week agile sprint, requiring no data science skills

Extensibility to large language models

The Amazon Kendra Retrieve API allows you to extend the solution to a document retrieval chain pattern using a large language model (LLM) from Amazon Bedrock or Amazon SageMaker JumpStart. With this pattern, Amazon Kendra-generated document summaries can be passed to the LLM for correlation, as shown in Figure 3. Consult the LangChain documentation to learn how to configure retrieval chains.

Figure 3. Extending the document search bot to large language model

In our use case, the effort for such an extension went beyond the incremental optimizations and the limited migration timeframe. In addition, a chain pattern is best reserved for complex correlations across documents.

This wasn't the case here, as documents were functionally disjunct (for example, by device type, geographical site, and process task). Therefore, document retrieval with the Amazon Kendra API was sufficient, and we deferred the extra effort associated with custom-built LLM prompt layers.

Implementation considerations

We started the EDMS migration to AWS by replicating the data repository to Amazon S3 using AWS DataSync. We stored every document and its corresponding metadata file as separate Amazon S3 objects.

For the Amazon Kendra index mappings to initialize properly:

  • Metadata must be a JSON Amazon S3 object
  • Metadata files must follow the naming convention <document>.<extension>.metadata.json
  • Reserved or common document attributes must be correctly formatted

The EDMS did not adhere to this when generating metadata files, so we offloaded the transformation to a Lambda function. The function fixed metadata attributes, such as changing the version type (_version) from numeric to string and the date (_created_at) from string to ISO 8601. It also changed metadata names and Amazon S3 paths by creating new objects (PutObject API) and deleting the original objects (DeleteObject API).
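
The following is a minimal sketch of such a transformation function in Python. The bucket layout, attribute keys, and source date format are illustrative assumptions rather than the exact implementation described above.

import json
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # Triggered by an S3 event notification for each new metadata object
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]

    body = json.loads(s3.get_object(Bucket=bucket, Key=key)["Body"].read())
    attrs = body.get("Attributes", {})

    # Reserved Amazon Kendra attributes must use the expected types and formats
    if "_version" in attrs:
        attrs["_version"] = str(attrs["_version"])  # numeric -> string
    if "_created_at" in attrs:
        attrs["_created_at"] = (
            datetime.strptime(attrs["_created_at"], "%Y-%m-%d %H:%M:%S")  # assumed source format
            .replace(tzinfo=timezone.utc)
            .isoformat()  # -> ISO 8601
        )
    body["Attributes"] = attrs

    # Rewrite the object under the <document>.<extension>.metadata.json convention
    # by creating a new object and deleting the original one
    new_key = key if key.endswith(".metadata.json") else key + ".metadata.json"
    s3.put_object(Bucket=bucket, Key=new_key, Body=json.dumps(body))
    if new_key != key:
        s3.delete_object(Bucket=bucket, Key=key)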

We configured Lambda invocation on Amazon S3 PutObject operations using Amazon S3 event notifications. We set the sync run schedule for the Amazon Kendra index to run on demand.

Alternatively, you can run it on a predefined schedule or as part of each Lambda invocation (using the update_index boto3 operation). Finally, we monitor for sync run failures associated with the Amazon Kendra index using Amazon CloudWatch.
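
If you want to start a sync run programmatically from the transformation Lambda function, one commonly used call (separate from the update_index operation mentioned above) is start_data_source_sync_job. The snippet below is only a sketch with placeholder IDs.

import boto3

kendra = boto3.client("kendra")

# Placeholder identifiers - substitute the IDs of your index and S3 data source
INDEX_ID = "<kendra-index-id>"
DATA_SOURCE_ID = "<s3-data-source-id>"

def start_sync():
    # Starts an on-demand sync run so newly transformed objects become searchable
    response = kendra.start_data_source_sync_job(Id=DATA_SOURCE_ID, IndexId=INDEX_ID)
    return response["ExecutionId"]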

Conclusion

This blog showed how you can enhance the keyword-based search of your EDMS. We embedded document search queries behind a chatbot to simplify user interaction.

You can build this solution quickly, with no machine learning skills, as part of your EDMS cloud migration. In more advanced use cases, including complete refactoring, consider extending this solution to use a large language model from Amazon Bedrock or Amazon SageMaker Jumpstart.

Related information

Building a Serverless Streaming Pipeline to Deliver Reliable Messaging

Post Syndicated from Chris McPeek original https://aws.amazon.com/blogs/compute/building-a-serverless-streaming-pipeline-to-deliver-reliable-messaging/

This post is written by Jeff Harman, Senior Prototyping Architect, Vaibhav Shah, Senior Solutions Architect and Erik Olsen, Senior Technical Account Manager.

Many industries are required to provide audit trails for decision and transactional systems. AI-assisted decision making requires monitoring the full inputs to the decision system in near real time to prevent fraud and to detect model drift and discrimination. Modern systems often use a much wider array of inputs for decision making, including images, unstructured text, historical values, and other large data elements. These large data elements pose a challenge to traditional audit systems that deal with relatively small text messages in structured formats. This blog shows the use of serverless technology to create a reliable, performant, traceable, and durable streaming pipeline for audit processing.

Overview

Consider the following four requirements to develop an architecture for audit record ingestion:

  1. Audit record size: Store and manage large payloads (256 KB to 6 MB in size) that may be heterogeneous, including text, binary data, and references to other storage systems.
  2. Audit traceability: The data stored has full traceability of the payload and external processes to monitor the process via subscription-based events.
  3. High performance: The time required for blocking writes to the system is limited to the time it takes to transmit the audit record over the network.
  4. High data durability: Once the system sends a payload receipt, the payload is at very low risk of loss because of system failures.

The following diagram shows an architecture that meets these requirements and models the flow of the audit record through the system.

The primary source of latency is the time it takes for an audit record to be transmitted across the network. Applications sending audit records make an API call to an Amazon API Gateway endpoint. An AWS Lambda function receives the message and an Amazon ElastiCache for Redis cluster provides a low latency initial storage mechanism for the audit record. Once the data is stored in ElastiCache, the AWS Step Functions workflow then orchestrates the communication and persistence functions.

Subscribers receive four Amazon Simple Notification Service (Amazon SNS) notifications pertaining to arrival and storage of the audit record payload, storage of the audit record metadata, and audit record archive completion. Users can subscribe an Amazon Simple Queue Service (SQS) queue to the SNS topic and use fan-out mechanisms to achieve high reliability (a subscription sketch follows the list below).

  1. The Ingest Message Lambda function sends an initial receipt notification
  2. The Message Archive Handler Lambda function notifies on storage of the audit record from ElastiCache to Amazon Simple Storage Service (Amazon S3)
  3. The Message Metadata Handler Lambda function notifies on storage of the message metadata into Amazon DynamoDB
  4. The Final State Aggregation Lambda function notifies that the audit record has been archived.
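
As a rough sketch of the fan-out subscription mentioned above, the following Python snippet subscribes an existing SQS queue to the SNS topic. The ARNs are placeholders, and the queue's access policy must separately allow the topic to send messages to it.

import boto3

sns = boto3.client("sns")

# Placeholder ARNs - replace with the topic created by the stack and your own queue
TOPIC_ARN = "arn:aws:sns:us-east-1:111122223333:audit-notifications"
QUEUE_ARN = "arn:aws:sqs:us-east-1:111122223333:audit-subscriber-queue"

# Raw message delivery keeps the notification payload unwrapped for downstream consumers
subscription = sns.subscribe(
    TopicArn=TOPIC_ARN,
    Protocol="sqs",
    Endpoint=QUEUE_ARN,
    Attributes={"RawMessageDelivery": "true"},
    ReturnSubscriptionArn=True,
)
print(subscription["SubscriptionArn"])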

Any failure in the three fundamental processing steps (Ingestion, Data Archive, and Metadata Archive) triggers a message in an SQS dead letter queue (DLQ), which contains the original request and an explanation of the failure reason. Any failure in the Ingest Message function invokes the Ingest Message Failure function, which stores the original parameters to the S3 Failed Message Storage bucket for later analysis.

The Step Functions workflow provides orchestration and parallel path execution for the system. The detailed workflow below shows the execution flow and notification actions. The transformer steps convert the internal data structures into the format required for consumers.

Data structures

There are three types of events and messages managed by this system:

  1. Incoming message: This is the message the producer sends to an API Gateway endpoint.
  2. Internal message: This event contains the message metadata allowing subsequent systems to understand the originating message producer context.
  3. Notification message: Messages that allow downstream subscribers to act based on the message.

Solution walkthrough

The message producer calls the API Gateway endpoint, which enforces the security requirements defined by the business. In this implementation, API Gateway uses an API key to provide more robust security. API Gateway also creates a security header for consumption by the Ingest Message Lambda function. API Gateway can be configured to enforce message format standards; see Use request validation in API Gateway for more information.

The Ingest Message Lambda function generates a message ID that tracks the message payload throughout its lifecycle. Then it stores the full message in the ElastiCache for Redis cache. The Ingest Message Lambda function generates an internal message with all the elements necessary as described above. Finally, the Lambda function handler code starts the Step Functions workflow with the internal message payload.
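
A minimal sketch of the ingest function is shown below. The environment variable names and internal message fields are assumptions for illustration; the linked repository contains the full implementation.

import json
import os
import uuid

import boto3
import redis  # redis-py, bundled with the function or provided via a layer

cache = redis.Redis(host=os.environ["REDIS_HOST"], port=6379)
sfn = boto3.client("stepfunctions")

def handler(event, context):
    # Generate the ID that tracks this payload throughout its lifecycle
    message_id = str(uuid.uuid4())

    # Store the full payload in ElastiCache for Redis for low-latency initial storage
    cache.set(message_id, event["body"])

    # Build the internal message and start the Step Functions workflow
    internal_message = {"messageID": message_id}
    sfn.start_execution(
        stateMachineArn=os.environ["STATE_MACHINE_ARN"],
        name=message_id,
        input=json.dumps(internal_message),
    )

    return {"statusCode": 200, "body": json.dumps({"messageID": message_id})}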

If the Ingest Message Lambda function fails for any reason, the Lambda function invokes the Ingestion Failure Handler Lambda function. This Lambda function writes any recoverable incoming message data to an S3 bucket and sends a notification on the Ingest Message dead letter queue.

The Step Functions workflow then runs three processes in parallel.

  • The Step Functions workflow triggers the Message Archive Data Handler Lambda function to persist message data from the ElastiCache cache to an S3 bucket (a sketch of this handler follows the list). Once stored, the Lambda function returns the S3 bucket reference and state information. There are two options for removing the internal message from the cache: remove it immediately before sending the internal message and updating the ElastiCache cache flag, or wait for the ElastiCache lifecycle to remove the stale message from the cache. This solution waits for the ElastiCache lifecycle to remove the message.
  • The workflow triggers the Message Metadata Handler Lambda function to write all message metadata and security information to DynamoDB. The Lambda function replies with the DynamoDB reference information.
  • Finally, the Step Functions workflow sends a message to the SNS topic to inform subscribers that the message has arrived and the data persistence processes have started.
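
The following sketch outlines the archive handler referenced in the first bullet. The bucket and key naming are assumptions for illustration.

import os

import boto3
import redis  # redis-py

cache = redis.Redis(host=os.environ["REDIS_HOST"], port=6379)
s3 = boto3.client("s3")
BUCKET = os.environ["ARCHIVE_BUCKET"]

def handler(event, context):
    # The Step Functions state passes the internal message as the event
    message_id = event["messageID"]

    # Copy the payload from the cache to durable storage in S3
    payload = cache.get(message_id)
    key = f"archive/{message_id}"
    s3.put_object(Bucket=BUCKET, Key=key, Body=payload)

    # Return the S3 reference so the final aggregation step can record it in DynamoDB
    return {"messageID": message_id, "s3Bucket": BUCKET, "s3Key": key}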

After each of the Lambda functions’ processes complete, the Lambda function sends a notification to the SNS notification topic to alert subscribers that each action is complete. When both Message Metadata and Message Archive Lambda functions are done, the Final Aggregation function makes a final update to the metadata in DynamoDB to include S3 reference information and to remove the ElastiCache Redis reference.

Deploying the solution

Prerequisites:

  1. AWS Serverless Application Model (AWS SAM) is installed (see Getting started with AWS SAM)
  2. AWS User/Credentials with appropriate permissions to run AWS CloudFormation templates in the target AWS account
  3. Python 3.8 – 3.10
  4. The AWS SDK for Python (Boto3) is installed
  5. The requests python library is installed

The source code for this implementation can be found at https://github.com/aws-samples/blog-serverless-reliable-messaging

Installing the Solution:

  1. Clone the git repository to a local directory
  2. git clone https://github.com/aws-samples/blog-serverless-reliable-messaging.git
  3. Change into the directory that was created by the clone operation, usually blog_serverless_reliable_messaging
  4. Execute the command: sam build
  5. Execute the command: sam deploy --guided. You are asked to supply the following parameters:
    1. Stack Name: Name given to this deployment (example: serverless-streaming)
    2. AWS Region: Where to deploy (example: us-east-1)
    3. ElasticacheInstanceClass: Cache instance type to use (example: cache.t3.small)
    4. ElasticReplicaCount: How many replicas should be used with ElastiCache (recommended minimum: 2)
    5. ProjectName: Used for naming resources in the account (example: serverless-streaming)
    6. MultiAZ: True/False if multiple Availability Zones should be used (recommended: True)
    7. The default parameters can be selected for the remaining questions

Testing:

Once you have deployed the stack, you can test it through the API Gateway endpoint with the API key that is referenced in the deployment output. There are two methods for retrieving the API key: via the AWS console (from the link provided in the output – ApiKeyConsole) or via the AWS CLI (from the AWS CLI reference in the output – APIKeyCLI).

You can test directly in the Lambda service console by invoking the ingest message function.

A test message is available at the root of the project test_message.json for direct Lambda function testing of the Ingest function.

  1. In the console navigate to the Lambda service
  2. From the list of available functions, select the “<project name> -IngestMessageFunction-xxxxx” function
  3. Under the “Function overview” select the “Test” tab
  4. Enter an event name of your choosing
  5. Copy and paste the contents of test_message.json into the “Event JSON” box
  6. Click “Save”, then after it has saved, click “Test”
  7. If successful, you should see something similar to the below in the details:
    {
      "isBase64Encoded": false,
      "statusCode": 200,
      "headers": {
        "Access-Control-Allow-Headers": "Content-Type",
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Allow-Methods": "OPTIONS,POST"
      },
      "body": "{\"messageID\": \"XXXXXXXXXXXXXX\"}"
    }
  8. In the S3 bucket “<project name>-s3messagearchive-xxxxxx“, find the payload of the original json with a key based on the date and time of the script execution, e.g.: YEAR/MONTH/DAY/HOUR/MINUTE with a file name of the messageID
  9. In a DynamoDB table named metaDataTable, you should find a record with a messageID equal to the messageID from above that contains all of the metadata related to the payload

A Python script is included with the code in the test_client folder.

  1. Replace the <Your API key key here> and the <Your API Gateway URL here (IngestMessageApi)> values with the correct ones for your environment in the test_client.py file
  2. Execute the test script with Python 3.8 or higher with the requests package installed
    Example execution (from main directory of git clone):
    python3 -m pip install -r ./test_client/requirements.txt
    python3 ./test_client/test_client.py
  3. Successful output shows the messageID and the header JSON payload:
    {
    "messageID": " XXXXXXXXXXXXXX"
    }
  4. In the S3 bucket “<project name>-s3messagearchive-xxxxxx“, you should be able to find the payload of the original json with a key based on the date and time of the script execution, e.g.: YEAR/MONTH/DAY/HOUR/MINUTE with a file name of the messageID
  5. In a DynamoDB table named metaDataTable, you should find a record with a messageID equal to the messageID from above that contains all of the metadata related to the payload

Conclusion

This blog describes architectural patterns, messaging patterns, and data structures that support a highly reliable messaging system for large messages. The use of serverless services including Lambda functions, Step Functions, ElastiCache, DynamoDB, and S3 meets the requirements of modern audit systems to be scalable and reliable. The architecture shared in this blog post is suitable for a highly regulated environment to store and track messages that are larger than typical logging systems handle, with records sized between 256 KB and 6 MB. The architecture serves as a blueprint that can be extended and adapted to fit further serverless use cases.

For serverless learning resources, visit Serverless Land.

Comparing design approaches for building serverless microservices

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/comparing-design-approaches-for-building-serverless-microservices/

This post is written by Luca Mezzalira, Principal SA, and Matt Diamond, Principal, SA.

Designing a workload with AWS Lambda creates questions for developers due to the modularity that can be expressed either at the code or infrastructure level. Using serverless for running code requires additional planning to extract the business logic from the underlying functional components. This deliberate separation of concerns ensures a robust modularity, paving the way for evolutionary architectures.

This post focuses on synchronous workloads, but similar considerations are applicable in other workload types. After identifying the bounded context of your API and agreeing on API contracts with consumers, it’s time to structure the architecture of your bounded context and the associated infrastructure.

The two most common ways to structure an API using Lambda functions are single responsibility and Lambda-lith. However, this blog post explores an alternative to these approaches, which can provide the best of both.

Single responsibility Lambda functions

Single responsibility Lambda functions are designed to run a specific task or handle a particular event-triggered operation within a serverless architecture:

Single responsibility Lambda functions architecture

This approach provides a strong separation of concerns between business logic and capabilities. You can test specific capabilities in isolation, deploy a Lambda function independently, reduce the surface area for introducing bugs, and enable easier debugging of issues in Amazon CloudWatch.

Additionally, single-purpose functions enable efficient resource allocation as Lambda automatically scales based on demand, optimizing resource consumption and minimizing costs. This means you can modify the memory size, architecture, and any other configuration available per function. Moreover, requesting an increase of concurrent function executions via a support ticket becomes easier because you are not aggregating the traffic into a single Lambda function that handles every request; you can request a specific increase based on the traffic of a single task.

Another advantage is rapid execution time. Because the business logic of a single-purpose Lambda function is designed for a single task, you can optimize the size of the function more easily, without the additional libraries required in other approaches. This helps reduce the cold start time due to a smaller bundle size.

Despite these benefits, some issues exist when solely relying on single-purpose Lambda functions. While the cold start time is mitigated, you might experience a higher number of cold starts, particularly for functions with sporadic or infrequent invocations. For example, a function that deletes users in an Amazon DynamoDB table likely won’t be triggered as often as one that reads user data. Also, relying heavily on single-purpose Lambda functions can lead to increased system complexity, especially as the number of functions grows.

A good separation of concerns helps maintain your code base, at the cost of a lack of cohesion. In functions with similar tasks, such as write operations of an API (POST, PUT, DELETE), you might duplicate code and behaviors across multiple functions. Moreover, updating common libraries shared via Lambda Layers, or other dependency management systems, requires multiple changes across every function instead of an atomic change on a single file. This is also true for any other change across multiple functions, for instance, updating the runtime version.

Lambda-lith: Using one single Lambda function

When many workloads use single purpose Lambda functions, developers end up with a proliferation of Lambda functions across an AWS account. One of the main challenges developers face is updating common dependencies or function configurations. Unless there is a clear governance strategy implemented for addressing this problem (such as using Dependabot for enforcing the update of dependencies, or parameterized parameters that are retrieved at provisioning time), developers may opt for a different strategy.

As a result, many development teams move in the opposite direction, aggregating all code related to an API inside the same Lambda function.

Lambda-lith: Using one single Lambda function

This approach is often referred to as a Lambda-lith, because it gathers all the HTTP verbs that compose an API and sometimes multiple APIs in the same function.

This allows you to have higher code cohesion and colocation across the different parts of the application. Modularity in this case is expressed at the code level, where patterns like single responsibility, dependency injection, and façade are applied to structure your code. The discipline and code best practices applied by the development teams are crucial for maintaining large code bases.

However, considering the reduced number of Lambda functions, updating a configuration or implementing a new standard across multiple APIs can be achieved more easily compared with the single responsibility approach.

Moreover, since every request invokes the same Lambda function for every HTTP verb, it’s more likely that little-used parts of your code have a better response time because an execution environment is more likely to be available to fulfill the request.

Another factor to consider is the function size. This increases when collocating verbs in the same function with all the dependencies and business logic of an API. This may affect the cold start of your Lambda functions with spiky workloads. Customers should evaluate the benefits of this approach, especially when applications have restrictive SLAs, which would be impacted by cold starts. Developers can mitigate this problem by paying attention to the dependencies used and implementing techniques like tree-shaking, minification, and dead code elimination, where the programming language allows.

This coarse-grained approach doesn't allow you to tune your function configurations individually. Instead, you must find a configuration that matches all the code's capabilities, which can mean a higher memory size and looser security permissions that might clash with the requirements defined by the security team.

Read and write functions

These two approaches both have trade-offs, but there is a third option that can combine their benefits.

Often, API traffic leans towards more reads or writes and that forces developers to optimize code and configurations more on one side over the other.

For example, consider building a user API that allows consumers to create, update, and delete a user but also to find a user or a list of users. In this scenario, you can change one user at a time with no bulk operations available, but you can get one or more users per API request. Dividing the design of the API into read and write operations results in this architecture:

Read and write functions

The cohesion of code for write operations (create, update, and delete) is beneficial for many reasons. For instance, you may need to validate the request body, ensuring it contains all the mandatory parameters. If the workload is heavy on writes, the less-used operations (for instance, Delete) benefit from warm execution environments. The code colocation enables reusability of code on similar actions, reducing the cognitive load to structure your projects with shared libraries or Lambda layers, for instance.
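
To illustrate the colocation of write operations, here is a minimal sketch of a single write-oriented Lambda function for the user API described above. The DynamoDB table name, event shape (API Gateway HTTP API payload), and validation rule are assumptions for this example.

import json
import os

import boto3

table = boto3.resource("dynamodb").Table(os.environ.get("USERS_TABLE", "users"))

def handler(event, context):
    method = event["requestContext"]["http"]["method"]
    body = json.loads(event.get("body") or "{}")

    # Shared validation for every write operation
    if method in ("POST", "PUT") and "userId" not in body:
        return {"statusCode": 400, "body": json.dumps({"error": "userId is required"})}

    if method in ("POST", "PUT"):
        table.put_item(Item=body)
    elif method == "DELETE":
        table.delete_item(Key={"userId": event["pathParameters"]["userId"]})
    else:
        return {"statusCode": 405, "body": json.dumps({"error": "unsupported method"})}

    return {"statusCode": 200, "body": json.dumps({"ok": True})}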

When looking at the read operations side, you can reduce the code bundled with this function, resulting in a faster cold start, and heavily optimize performance compared to a write operation. You can also store partial or full query results in memory within an execution environment to improve the execution time of the Lambda function.

This approach helps you further with its evolutionary nature. Imagine if this platform becomes much more popular. Now, you must optimize the API even further by improving reads and adding a cache-aside pattern with Amazon ElastiCache for Redis. Moreover, you have decided to optimize the read queries with a second database that is optimized for the read capability when the cache is missed.

On the write side, you have agreed with the API consumers that receiving and acknowledging user creation or deletion is adequate, considering they fully embraced the eventual consistency nature of distributed systems.

Now, you can improve the response time of write operations by adding an SQS queue before the Lambda function. You can update the write database in batches to reduce the number of invocations needed for handling write operations, instead of dealing with every request individually.
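
One possible shape for that queue-backed write path is a Lambda function that consumes SQS batches and writes the whole batch in one pass. The message structure and DynamoDB-backed table below are assumptions carried over from the earlier sketch.

import json
import os

import boto3

table = boto3.resource("dynamodb").Table(os.environ.get("USERS_TABLE", "users"))

def handler(event, context):
    # Each SQS record carries one write request accepted earlier by the API
    with table.batch_writer() as batch:
        for record in event["Records"]:
            request = json.loads(record["body"])
            if request.get("action") == "delete":
                batch.delete_item(Key={"userId": request["userId"]})
            else:
                batch.put_item(Item=request["user"])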

CQRS pattern

Command query responsibility segregation (CQRS) is a well-established pattern that separates the data mutation, or the command part of a system, from the query part. You can use the CQRS pattern to separate updates and queries if they have different requirements for throughput, latency, or consistency.

While it’s not mandatory to start with a full CQRS pattern, you can evolve more easily from the infrastructure highlighted in the initial read and write implementation, without massive refactoring of your API.

Comparison of the three approaches

Here is a comparison of the three approaches:

 

Single responsibility

Benefits
  • Strong separation of concerns
  • Granular configuration
  • Better debug
  • Rapid execution time
Issues
  • Code duplication
  • Complex maintenance
  • Higher cold start invocations

Lambda-lith

Benefits
  • Fewer cold start invocations
  • Higher code cohesion
  • Simpler maintenance
Issues
  • Coarse-grained configuration
  • Higher cold start time

Read and write

Benefits
  • Code cohesion where needed
  • Evolutionary architecture
  • Optimization of read and write operations
Issues
  • Using CQRS with two data models
  • CQRS adds eventual consistency to your system

Conclusion

Developers often move from single responsibility functions to the Lambda-lith as their architectures evolve, but both approaches have relative trade-offs. This post shows how it’s possible to have the best of both approaches by dividing your workloads per read and write operations.

All three approaches are viable for designing serverless APIs, and understanding what you are optimizing for is the key to making the best decision. Remember, understanding your context and the business requirements you need to express in your applications leads you toward the acceptable trade-offs for a specific workload. Keep an open mind and find the solution that solves the problem and balances security, developer experience, cost, and maintainability.

For more serverless learning resources, visit Serverless Land.

Top Architecture Blog Posts of 2023

Post Syndicated from Andrea Courtright original https://aws.amazon.com/blogs/architecture/top-architecture-blog-posts-of-2023/

2023 was a rollercoaster year in tech, and we at the AWS Architecture Blog feel so fortunate to have shared in the excitement. As we move into 2024 and all of the new technologies we could see, we want to take a moment to highlight the brightest stars from 2023.

As always, thanks to our readers and to the many talented and hardworking Solutions Architects and other contributors to our blog.

I give you our 2023 cream of the crop!

#10: Build a serverless retail solution for endless aisle on AWS

In this post, Sandeep and Shashank help retailers and their customers alike in this guided approach to finding inventory that doesn’t live on shelves.

Figure 1. Building endless aisle architecture for order processing

Check it out!

#9: Optimizing data with automated intelligent document processing solutions

Who else dreads wading through large amounts of data in multiple formats? Just me? I didn’t think so. Using Amazon AI/ML and content-reading services, Deependra, Anirudha, Bhajandeep, and Senaka have created a solution that is scalable and cost-effective to help you extract the data you need and store it in a format that works for you.

Figure 2: AI-based intelligent document processing engine

Check it out!

#8: Disaster Recovery Solutions with AWS managed services, Part 3: Multi-Site Active/Passive

Disaster recovery posts are always popular, and this post by Brent and Dhruv is no exception. Their creative approach in part 3 of this series is most helpful for customers who have business-critical workloads with higher availability requirements.

Figure 3. Warm standby with managed services

Check it out!

#7: Simulating Kubernetes-workload AZ failures with AWS Fault Injection Simulator

Continuing with the theme of “when bad things happen,” we have Siva, Elamaran, and Re’s post about preparing for workload failures. If resiliency is a concern (and it really should be), the secret is test, test, TEST.

Figure 4. Architecture flow for Microservices to simulate a realistic failure scenario

Check it out!

#6: Let’s Architect! Designing event-driven architectures

Luca, Laura, Vittorio, and Zamira weren’t content with their four top-10 spots last year – they’re back with some things you definitely need to know about event-driven architectures.

Figure 5. Let’s Architect artwork

Check it out!

#5: Use a reusable ETL framework in your AWS lake house architecture

As your lake house increases in size and complexity, you could find yourself facing maintenance challenges, and Ashutosh and Prantik have a solution: frameworks! The reusable ETL template with AWS Glue templates might just save you a headache or three.

Figure 6. Reusable ETL framework architecture

Check it out!

#4: Invoking asynchronous external APIs with AWS Step Functions

It’s possible that AWS’ menagerie of services doesn’t have everything you need to run your organization. (Possible, but not likely; we have a lot of amazing services.) If you are using third-party APIs, then Jorge, Hossam, and Shirisha’s architecture can help you maintain a secure, reliable, and cost-effective relationship among all involved.

Figure 7. Invoking Asynchronous External APIs architecture

Check it out!

#3: Announcing updates to the AWS Well-Architected Framework

The Well-Architected Framework continues to help AWS customers evaluate their architectures against its six pillars. They are constantly striving for improvement, and Haleh’s diligence in keeping us up to date has not gone unnoticed. Thank you, Haleh!

Figure 8. Well-Architected logo

Check it out!

#2: Let’s Architect! Designing architectures for multi-tenancy

The practically award-winning Let’s Architect! series strikes again! This time, Luca, Laura, Vittorio, and Zamira were joined by Federica to discuss multi-tenancy and why that concept is so crucial for SaaS providers.

Figure 9. Let’s Architect

Check it out!

And finally…

#1: Understand resiliency patterns and trade-offs to architect efficiently in the cloud

Haresh, Lewis, and Bonnie revamped this 2022 post into a masterpiece that completely stole our readers’ hearts and is among the top posts we’ve ever made!

Figure 10. Resilience patterns and trade-offs

Check it out!

Bonus! Three older special mentions

These three posts were published before 2023, but we think they deserve another round of applause because you, our readers, keep coming back to them.

Thanks again to everyone for their contributions during a wild year. We hope you’re looking forward to the rest of 2024 as much as we are!

Enhance container software supply chain visibility through SBOM export with Amazon Inspector and QuickSight

Post Syndicated from Jason Ng original https://aws.amazon.com/blogs/security/enhance-container-software-supply-chain-visibility-through-sbom-export-with-amazon-inspector-and-quicksight/

In this post, I’ll show how you can export software bills of materials (SBOMs) for your containers by using an AWS native service, Amazon Inspector, and visualize the SBOMs through Amazon QuickSight, providing a single-pane-of-glass view of your organization’s software supply chain.

The concept of a bill of materials (BOM) originated in the manufacturing industry in the early 1960s. It was used to keep track of the quantities of each material used to manufacture a completed product. If parts were found to be defective, engineers could then use the BOM to identify products that contained those parts. An SBOM extends this concept to software development, allowing engineers to keep track of vulnerable software packages and quickly remediate the vulnerabilities.

Today, most software includes open source components. A Synopsys study, Walking the Line: GitOps and Shift Left Security, shows that 8 in 10 organizations reported using open source software in their applications. Consider a scenario in which you specify an open source base image in your Dockerfile but don’t know what packages it contains. Although this practice can significantly improve developer productivity and efficiency, the decreased visibility makes it more difficult for your organization to manage risk effectively.

It’s important to track the software components and their versions that you use in your applications, because a single affected component used across multiple organizations could result in a major security impact. According to a Gartner report titled Gartner Report for SBOMs: Key Takeaways You Should know, by 2025, 60 percent of organizations building or procuring critical infrastructure software will mandate and standardize SBOMs in their software engineering practice, up from less than 20 percent in 2022. This will help provide much-needed visibility into software supply chain security.

Integrating SBOM workflows into the software development life cycle is just the first step—visualizing SBOMs and being able to search through them quickly is the next step. This post describes how to process the generated SBOMs and visualize them with Amazon QuickSight. AWS also recently added SBOM export capability in Amazon Inspector, which offers the ability to export SBOMs for Amazon Inspector monitored resources, including container images.

Why is vulnerability scanning not enough?

Scanning and monitoring vulnerable components that pose cybersecurity risks is known as vulnerability scanning, and it is fundamental for organizations to ensure a strong security posture. Scanners usually rely on a database of known vulnerabilities, the most common being the Common Vulnerabilities and Exposures (CVE) database.

Identifying vulnerable components with a scanner can prevent an engineer from deploying affected applications into production. You can embed scanning into your continuous integration and continuous delivery (CI/CD) pipelines so that images with known vulnerabilities don’t get pushed into your image repository. However, what if a new vulnerability is discovered but has not been added to the CVE records yet? A good example of this is the Apache Log4j vulnerability, which was first disclosed on Nov 24, 2021 and only added as a CVE on Dec 1, 2021. This means that for 7 days, scanners that relied on the CVE system weren’t able to identify affected components within their organizations. This issue is known as a zero-day vulnerability. Being able to quickly identify vulnerable software components in your applications in such situations would allow you to assess the risk and come up with a mitigation plan without waiting for a vendor or supplier to provide a patch.

In addition, it’s also good hygiene for your organization to track usage of software packages, which provides visibility into your software supply chain. This can improve collaboration between developers, operations, and security teams, because they’ll have a common view of every software component and can collaborate effectively to address security threats.

In this post, I present a solution that uses the new Amazon Inspector feature to export SBOMs from container images, process them, and visualize the data in QuickSight. This gives you the ability to search through your software inventory on a dashboard and to use natural language queries through QuickSight Q, in order to look for vulnerabilities.

Solution overview

Figure 1 shows the architecture of the solution. It is fully serverless, meaning there is no underlying infrastructure you need to manage. This post uses a newly released feature within Amazon Inspector that provides the ability to export a consolidated SBOM for Amazon Inspector monitored resources across your organization in commonly used formats, including CycloneDX and SPDX.

Figure 1: Solution architecture diagram

The workflow in Figure 1 is as follows:

  1. The image is pushed into Amazon Elastic Container Registry (Amazon ECR), which sends an Amazon EventBridge event.
  2. This invokes an AWS Lambda function, which starts the SBOM generation job for the specific image.
  3. When the job completes, Amazon Inspector deposits the SBOM file in an Amazon Simple Storage Service (Amazon S3) bucket.
  4. Another Lambda function is invoked whenever a new JSON file is deposited. The function performs the data transformation steps and uploads the new file into a new S3 bucket.
  5. Amazon Athena is then used to perform preliminary data exploration.
  6. A dashboard on Amazon QuickSight displays SBOM data.

Implement the solution

This section describes how to deploy the solution architecture.

In this post, you’ll perform the following tasks:

  • Create S3 buckets and AWS KMS keys to store the SBOMs
  • Create an Amazon Elastic Container Registry (Amazon ECR) repository
  • Deploy two AWS Lambda functions to initiate the SBOM generation and transformation
  • Set up Amazon EventBridge rules to invoke Lambda functions upon image push into Amazon ECR
  • Run AWS Glue crawlers to crawl the transformed SBOM S3 bucket
  • Run Amazon Athena queries to review SBOM data
  • Create QuickSight dashboards to identify libraries and packages
  • Use QuickSight Q to identify libraries and packages by using natural language queries

Deploy the CloudFormation stack

The AWS CloudFormation template we’ve provided provisions the S3 buckets that are required for the storage of raw SBOMs and transformed SBOMs, the Lambda functions necessary to initiate and process the SBOMs, and EventBridge rules to run the Lambda functions based on certain events. An empty repository is provisioned as part of the stack, but you can also use your own repository.

To deploy the CloudFormation stack

  1. Download the CloudFormation template.
  2. Browse to the CloudFormation service in your AWS account and choose Create Stack.
  3. Upload the CloudFormation template you downloaded earlier.
  4. For the next step, Specify stack details, enter a stack name.
  5. You can keep the default value of sbom-inspector for EnvironmentName.
  6. Specify the Amazon Resource Name (ARN) of the user or role to be the admin for the KMS key.
  7. Deploy the stack.

Set up Amazon Inspector

If this is the first time you’re using Amazon Inspector, you need to activate the service. In the Getting started with Amazon Inspector topic in the Amazon Inspector User Guide, follow Step 1 to activate the service. This will take some time to complete.

Figure 2: Activate Amazon Inspector

SBOM invocation and processing Lambda functions

This solution uses two Lambda functions written in Python to perform the invocation task and the transformation task.

  • Invocation task — This function is run whenever a new image is pushed into Amazon ECR. It takes in the repository name and image tag variables and passes those into the create_sbom_export function in the SPDX format (a minimal sketch follows this list). Scoping the export to the pushed image prevents duplicated SBOMs, which helps to keep the S3 data size small.
  • Transformation task — This function is run whenever a new file with the suffix .json is added to the raw S3 bucket. It creates two files, as follows:
    1. It extracts information such as image ARN, account number, package, package version, operating system, and SHA from the SBOM and exports this data to the transformed S3 bucket under a folder named sbom/.
    2. Because each package can have more than one CVE, this function also extracts the CVE from each package and stores it in the same bucket in a directory named cve/. Both files are exported in Apache Parquet so that the file is in a format that is optimized for queries by Amazon Athena.
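
The snippet below is a minimal sketch of the invocation task. The event fields, environment variables, and filter values are placeholders; refer to the deployed Lambda function in the CloudFormation stack for the authoritative implementation.

import os

import boto3

inspector = boto3.client("inspector2")

def handler(event, context):
    # EventBridge delivers the ECR push event; pull out the repository and tag
    detail = event["detail"]
    repository = detail["repository-name"]
    image_tag = detail["image-tag"]

    # Export an SPDX SBOM for only the image that was just pushed
    inspector.create_sbom_export(
        reportFormat="SPDX_2_3",
        resourceFilterCriteria={
            "ecrRepositoryName": [{"comparison": "EQUALS", "value": repository}],
            "ecrImageTags": [{"comparison": "EQUALS", "value": image_tag}],
        },
        s3Destination={
            "bucketName": os.environ["RAW_SBOM_BUCKET"],
            "keyPrefix": "sbom-export",
            "kmsKeyArn": os.environ["KMS_KEY_ARN"],
        },
    )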

Populate the AWS Glue Data Catalog

To populate the AWS Glue Data Catalog, you need to generate the SBOM files by using the Lambda functions that were created earlier.

To populate the AWS Glue Data Catalog

  1. You can use an existing image, or you can continue on to create a sample image.
  2. Open an AWS Cloudshell terminal.
  3. Run the following commands:
    # Pull the nginx image from a public repo
    docker pull public.ecr.aws/nginx/nginx:1.19.10-alpine-perl
    
    docker tag public.ecr.aws/nginx/nginx:1.19.10-alpine-perl <ACCOUNT-ID>.dkr.ecr.us-east-1.amazonaws.com/sbom-inspector:nginxperl
    
    # Authenticate to ECR, fill in your account id
    aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin <ACCOUNT-ID>.dkr.ecr.us-east-1.amazonaws.com
    
    # Push the image into ECR
    docker push <ACCOUNT-ID>.dkr.ecr.us-east-1.amazonaws.com/sbom-inspector:nginxperl

  4. An image is pushed into the Amazon ECR repository in your account. This invokes the Lambda functions that perform the SBOM export by using Amazon Inspector and converts the SBOM file to Parquet.
  5. Verify that the Parquet files are in the transformed S3 bucket:
    1. Browse to the S3 console and choose the bucket named sbom-inspector-<ACCOUNT-ID>-transformed. You can also track the invocation of each Lambda function in the Amazon CloudWatch log console.
    2. After the transformation step is complete, you will see two folders (cve/ and sbom/) in the transformed S3 bucket. Choose the sbom folder. You will see the transformed Parquet file in it. If there are CVEs present, a similar file will appear in the cve folder.

    The next step is to run an AWS Glue crawler to determine the format, schema, and associated properties of the raw data. You will need to crawl both folders in the transformed S3 bucket and store the schema in separate tables in the AWS Glue Data Catalog.

  6. On the AWS Glue Service console, on the left navigation menu, choose Crawlers.
  7. On the Crawlers page, choose Create crawler. This starts a series of pages that prompt you for the crawler details.
  8. In the Crawler name field, enter sbom-crawler, and then choose Next.
  9. Under Data sources, select Add a data source.
  10. Now you need to point the crawler to your data. On the Add data source page, choose the Amazon S3 data store. The solution in this post doesn’t use a connection, so leave the Connection field blank if it’s visible.
  11. For the option Location of S3 data, choose In this account. Then, for S3 path, enter the path where the crawler can find the sbom and cve data, which is s3://sbom-inspector-<ACCOUNT-ID>-transformed/sbom/ and s3://sbom-inspector-<ACCOUNT-ID>-transformed/cve/. Leave the rest as default and select Add an S3 data source.
     
    Figure 3: Data source for AWS Glue crawler

  12. The crawler needs permissions to access the data store and create objects in the Data Catalog. To configure these permissions, choose Create an IAM role. The AWS Identity and Access Management (IAM) role name starts with AWSGlueServiceRole-, and in the field, you enter the last part of the role name. Enter sbomcrawler, and then choose Next.
  13. Crawlers create tables in your Data Catalog. Tables are contained in a database in the Data Catalog. To create a database, choose Add database. In the pop-up window, enter sbom-db for the database name, and then choose Create.
  14. Verify the choices you made in the Add crawler wizard. If you see any mistakes, you can choose Back to return to previous pages and make changes. After you’ve reviewed the information, choose Finish to create the crawler.
    Figure 4: Creation of the AWS Glue crawler

  15. Select the newly created crawler and choose Run.
  16. After the crawler runs successfully, verify that the table is created and the data schema is populated.
     
    Figure 5: Table populated from the AWS Glue crawler

Set up Amazon Athena

Amazon Athena performs the initial data exploration and validation. Athena is a serverless interactive analytics service built on open source frameworks that supports open-table and file formats. Athena provides a simplified, flexible way to analyze data in sources like Amazon S3 by using standard SQL queries. If you are SQL proficient, you can query the data source directly; however, not everyone is familiar with SQL. In this section, you run a sample query and initialize the service so that it can be used in QuickSight later on.

To start using Amazon Athena

  1. In the AWS Management Console, navigate to the Athena console.
  2. For Database, select sbom-db (or select the database you created earlier in the crawler).
  3. Navigate to the Settings tab located at the top right corner of the console. For Query result location, select the Athena S3 bucket created from the CloudFormation template, sbom-inspector-<ACCOUNT-ID>-athena.
  4. Keep the defaults for the rest of the settings. You can now return to the Query Editor and start writing and running your queries on the sbom-db database.

You can use the following sample query.

select package, packageversion, cve, sha, imagearn from sbom
left join cve
using (sha, package, packageversion)
where cve is not null;

Your Athena console should look similar to the screenshot in Figure 6.

Figure 6: Sample query with Amazon Athena

This query joins the two tables and selects only the packages with CVEs identified. Alternatively, you can choose to query for specific packages or identify the most common package used in your organization.

Sample output:

# package packageversion cve sha imagearn
<PACKAGE_NAME> <PACKAGE_VERSION> <CVE> <IMAGE_SHA> <ECR_IMAGE_ARN>

Visualize data with Amazon QuickSight

Amazon QuickSight is a serverless business intelligence service that is designed for the cloud. In this post, it serves as a dashboard that allows business users who are unfamiliar with SQL to identify zero-day vulnerabilities. This can also reduce the operational effort and time of having to look through several JSON documents to identify a single package across your image repositories. You can then share the dashboard across teams without having to share the underlying data.

QuickSight SPICE (Super-fast, Parallel, In-memory Calculation Engine) is an in-memory engine that QuickSight uses to perform advanced calculations. In a large organization where you could have millions of SBOM records stored in S3, importing your data into SPICE helps to reduce the time to process and serve the data. You can also use the feature to perform a scheduled refresh to obtain the latest data from S3.

QuickSight also has a feature called QuickSight Q. With QuickSight Q, you can use natural language to interact with your data. If this is the first time you are initializing QuickSight, subscribe to QuickSight and select Enterprise + Q. It will take roughly 20–30 minutes to initialize for the first time. Otherwise, if you are already using QuickSight, you will need to enable QuickSight Q by subscribing to it in the QuickSight console.

Finally, in QuickSight you can select different data sources, such as Amazon S3 and Athena, to create custom visualizations. In this post, we will use the two Athena tables as the data source to create a dashboard to keep track of the packages used in your organization and the resulting CVEs that come with them.

Prerequisites for setting up the QuickSight dashboard

This process will be used to create the QuickSight dashboard from a template already pre-provisioned through the command line interface (CLI). It also grants the necessary permissions for QuickSight to access the data source. You will need the following:

  • AWS Command Line Interface (AWS CLI) programmatic access with read and write permissions to QuickSight.
  • A QuickSight + Q subscription (only if you want to use the Q feature).
  • QuickSight permissions to Amazon S3 and Athena (enable these through the QuickSight security and permissions interface).
  • Set the default AWS Region where you want to deploy the QuickSight dashboard. This post assumes that you’re using the us-east-1 Region.

Create datasets

In QuickSight, create two datasets, one for the sbom table and another for the cve table.

  1. In the QuickSight console, select the Dataset tab.
  2. Choose Create dataset, and then select the Athena data source.
  3. Name the data source sbom and choose Create data source.
  4. Select the sbom table.
  5. Choose Visualize to complete the dataset creation. (Delete the analyses automatically created for you because you will create your own analyses afterwards.)
  6. Navigate back to the main QuickSight page and repeat steps 1–4 for the cve dataset.

Merge datasets

Next, merge the two datasets to create the combined dataset that you will use for the dashboard.

  1. On the Datasets tab, edit the sbom dataset and add the cve dataset.
  2. Set three join clauses, as follows:
    1. Sha : Sha
    2. Package : Package
    3. Packageversion : Packageversion
  3. Perform a left merge, which will append the cve ID to the package and package version in the sbom dataset.
     
    Figure 7: Combining the sbom and cve datasets

Next, you will create a dashboard based on the combined sbom dataset.

Prepare configuration files

In your terminal, export the following variables. Substitute <QuickSight username> in the QS_USER_ARN variable with your own username, which can be found in the Amazon QuickSight console.

export ACCOUNT_ID=$(aws sts get-caller-identity --output text --query Account)
export TEMPLATE_ID="sbom_dashboard"
export QS_USER_ARN=$(aws quicksight describe-user --aws-account-id $ACCOUNT_ID --namespace default --user-name <QuickSight username> | jq .User.Arn)
export QS_DATA_ARN=$(aws quicksight search-data-sets --aws-account-id $ACCOUNT_ID --filters Name="DATASET_NAME",Operator="StringLike",Value="sbom" | jq .DataSetSummaries[0].Arn)

Validate that the variables are set properly. This is required for you to move on to the next step; otherwise you will run into errors.

echo ACCOUNT_ID is $ACCOUNT_ID || echo ACCOUNT_ID is not set
echo TEMPLATE_ID is $TEMPLATE_ID || echo TEMPLATE_ID is not set
echo QUICKSIGHT USER ARN is $QS_USER_ARN || echo QUICKSIGHT USER ARN is not set
echo QUICKSIGHT DATA ARN is $QS_DATA_ARN || echo QUICKSIGHT DATA ARN is not set

Next, use the following commands to create the dashboard from a predefined template and create the IAM permissions needed for the user to view the QuickSight dashboard.

cat << EOF > ./dashboard.json
{
    "SourceTemplate": {
      "DataSetReferences": [
        {
          "DataSetPlaceholder": "sbom",
          "DataSetArn": $QS_DATA_ARN
        }
      ],
      "Arn": "arn:aws:quicksight:us-east-1:293424211206:template/sbom_qs_template"
    }
}
EOF

cat << EOF > ./dashboardpermissions.json
[
    {
      "Principal": $QS_USER_ARN,
      "Actions": [
        "quicksight:DescribeDashboard",
        "quicksight:ListDashboardVersions",
        "quicksight:UpdateDashboardPermissions",
        "quicksight:QueryDashboard",
        "quicksight:UpdateDashboard",
        "quicksight:DeleteDashboard",
        "quicksight:DescribeDashboardPermissions",
        "quicksight:UpdateDashboardPublishedVersion"
      ]
    }
]
EOF

Run the following commands to create the dashboard in your QuickSight console.

aws quicksight create-dashboard --aws-account-id $ACCOUNT_ID --dashboard-id $ACCOUNT_ID --name sbom-dashboard --source-entity file://dashboard.json

Note: Run the following describe-dashboard command, and confirm that the response contains a status code of 200. The 200-status code means that the dashboard exists.

aws quicksight describe-dashboard --aws-account-id $ACCOUNT_ID --dashboard-id $ACCOUNT_ID

Use the following update-dashboard-permissions AWS CLI command to grant the appropriate permissions to QuickSight users.

aws quicksight update-dashboard-permissions --aws-account-id $ACCOUNT_ID --dashboard-id $ACCOUNT_ID --grant-permissions file://dashboardpermissions.json

You should now be able to see the dashboard in your QuickSight console, similar to the one in Figure 8. It’s an interactive dashboard that shows you the number of vulnerable packages you have in your repositories and the specific CVEs that come with them. You can navigate to the specific image by selecting the CVE (middle right bar chart) or list images with a specific vulnerable package (bottom right bar chart).

Note: You won’t see the exact same graph as in Figure 8. It will change according to the image you pushed in.

Figure 8: QuickSight dashboard containing SBOM information

Alternatively, you can use QuickSight Q to extract the same information from your dataset through natural language. You will need to create a topic and add the dataset you added earlier. For detailed information on how to create a topic, see the Amazon QuickSight User Guide. After QuickSight Q has completed indexing the dataset, you can start to ask questions about your data.

Figure 9: Natural language query with QuickSight Q

Conclusion

This post discussed how you can use Amazon Inspector to export SBOMs to improve software supply chain transparency. Container SBOM export should be part of your supply chain mitigation strategy and monitored in an automated manner at scale.

Although it is a good practice to generate SBOMs, it would provide little value if there was no further analysis being done on them. This solution enables you to visualize your SBOM data through a dashboard and natural language, providing better visibility into your security posture. Additionally, this solution is also entirely serverless, meaning there are no agents or sidecars to set up.

To learn more about exporting SBOMs with Amazon Inspector, see the Amazon Inspector User Guide.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Jason Ng

Jason is a Cloud Sales Center Solutions Architect at AWS. He works with enterprise and independent software vendor (ISV) greenfield customers in ASEAN countries and is part of the Containers Technical Field Community (TFC). He enjoys helping customers modernize their applications, drive growth, and reduce total cost of ownership.

Enable advanced search capabilities for Amazon Keyspaces data by integrating with Amazon OpenSearch Service

Post Syndicated from Rajesh Kantamani original https://aws.amazon.com/blogs/big-data/enable-advanced-search-capabilities-for-amazon-keyspaces-data-by-integrating-with-amazon-opensearch-service/

Amazon Keyspaces (for Apache Cassandra) is a fully managed, serverless, and Apache Cassandra-compatible database service offered by AWS. It caters to developers in need of a highly available, durable, and fast NoSQL database backend. When you start the process of designing your data model for Amazon Keyspaces, it’s essential to possess a comprehensive understanding of your access patterns, similar to the approach used in other NoSQL databases. This allows for the uniform distribution of data across all partitions within your table, thereby enabling your applications to achieve optimal read and write throughput. In cases where your application demands supplementary query features, such as conducting full-text searches on the data stored in a table, you may explore the utilization of alternative services like Amazon OpenSearch Service to meet these particular needs.

Amazon OpenSearch Service is a powerful and fully managed search and analytics service. It empowers businesses to explore and gain insights from large volumes of data quickly. OpenSearch Service is versatile, allowing you to perform text and geospatial searches. Amazon OpenSearch Ingestion is a fully managed, serverless data collection solution that efficiently routes data to your OpenSearch Service domains and Amazon OpenSearch Serverless collections. It eliminates the need for third-party tools to ingest data into your OpenSearch service setup. You simply configure your data sources to send information to OpenSearch Ingestion, which then automatically delivers the data to your specified destination. Additionally, you can configure OpenSearch Ingestion to apply data transformations before delivery.

In this post, we explore the process of integrating  Amazon Keyspaces and Amazon OpenSearch Service using AWS Lambda and Amazon OpenSearch Ingestion to enable advanced search capabilities. The content includes a reference architecture, a step-by-step guide on infrastructure setup, sample code for implementing the solution within a use case, and an AWS Cloud Development Kit (AWS CDK) application for deployment.

Solution overview

AnyCompany, a rapidly growing eCommerce platform, faces a critical challenge in efficiently managing its extensive product and item catalog while enhancing the shopping experience for its customers. Currently, customers struggle to find specific products quickly due to limited search capabilities. AnyCompany aims to address this issue by implementing advanced search functionality that enables customers to easily search for the products. This enhancement is expected to significantly improve customer satisfaction and streamline the shopping process, ultimately boosting sales and retention rates.

The following diagram illustrates the solution architecture.

The workflow includes the following steps:

  1. Amazon API Gateway is set up to issue a POST request to the AWS Lambda function when there is a need to insert, update, or delete data in Amazon Keyspaces.
  2. The Lambda function passes this modification to Amazon Keyspaces and holds the change, waiting for a success return code from Amazon Keyspaces that confirms the data persistence.
  3. After it receives the 200 return code, the Lambda function initiates an HTTP request to the OpenSearch Ingestion data pipeline asynchronously (a simplified sketch of this hand-off follows this list).
  4. The OpenSearch Ingestion process moves the transaction data to the OpenSearch Serverless collection.
  5. We then utilize the dev tools in OpenSearch Dashboards to execute various search patterns.
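
To make steps 2 and 3 more concrete, the following is a minimal, hypothetical sketch of the Lambda hand-off to the OpenSearch Ingestion pipeline. The endpoint URL, environment variable names, and the omitted Amazon Keyspaces write are illustrative assumptions rather than code from the sample repository; the request is signed with SigV4 using the osis service name.

import json
import os

import boto3
import requests
from requests_aws4auth import AWS4Auth

# Hypothetical values; the real endpoint comes from your OpenSearch Ingestion pipeline settings.
PIPELINE_URL = os.environ.get(
    "PIPELINE_URL",
    "https://<pipeline-endpoint>/<pipeline-name>/test_ingestion_path",
)
REGION = os.environ.get("AWS_REGION", "us-east-1")


def forward_to_ingestion(payload):
    """Sign the request with SigV4 (service name 'osis') and POST it to the pipeline."""
    creds = boto3.Session().get_credentials()
    auth = AWS4Auth(creds.access_key, creds.secret_key, REGION, "osis",
                    session_token=creds.token)
    response = requests.post(PIPELINE_URL, auth=auth, json=payload, timeout=10)
    return response.status_code


def handler(event, context):
    body = json.loads(event["body"])
    # Step 2: persist the change in Amazon Keyspaces (omitted in this sketch) and
    # confirm the write succeeded before continuing.
    # Step 3: forward the same payload to the OpenSearch Ingestion pipeline.
    status_code = forward_to_ingestion(body)
    return {
        "statusCode": 200,
        "body": json.dumps({"message": "Ingestion request submitted",
                            "pipeline_status": status_code}),
    }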

Prerequisites

Complete the following prerequisite steps:

  1. Ensure the AWS Command Line Interface (AWS CLI) is installed and the user profile is set up.
  2. Install Node.js, npm and the AWS CDK Toolkit.
  3. Install Python and jq.
  4. Use an integrated developer environment (IDE), such as Visual Studio Code.

Deploy the solution

The solution is detailed in an AWS CDK project. You don’t need any prior knowledge of AWS CDK. Complete the following steps to deploy the solution:

  1. Clone the GitHub repository to your IDE and navigate to the cloned repository's directory. This project is structured like a standard Python project:
    git clone <repo-link>
    cd <repo-dir>

  2. On macOS and Linux, complete the following steps to set up your virtual environment:
    • Create a virtual environment
      $ python3 -m venv .venv

    • After the virtual environment is created, activate it:
      $ source .venv/bin/activate

  3. For Windows users, activate the virtual environment as follows.
    % .venv\Scripts\activate.bat

  4. After you activate the virtual environment, install the required dependencies:
    (.venv) $ pip install -r requirements.txt

  5. Bootstrap AWS CDK in your account:
    (.venv) $ cdk bootstrap aws://<aws_account_id>/<aws_region>

After the bootstrap process completes, you’ll see a CDKToolkit AWS CloudFormation stack on the AWS CloudFormation console. AWS CDK is now ready for use.

  6. You can synthesize the CloudFormation template for this code:
    (.venv) $ export CDK_DEFAULT_ACCOUNT=$(aws sts get-caller-identity --query Account --output text)
    (.venv) $ export CDK_DEFAULT_REGION=<aws_region>
    (.venv) $ cdk synth -c iam_user_name=<your-iam-user-name> --all
    

  7. Use the cdk deploy command to create the stack:
    (.venv) $ cdk deploy -c iam_user_name=<your-iam-user-name> --all
    

    When the deployment process is complete, you’ll see the following CloudFormation stacks on the AWS CloudFormation console:

  • OpsApigwLambdaStack
  • OpsServerlessIngestionStack
  • OpsServerlessStack
  • OpsKeyspacesStack
  • OpsCollectionPipelineRoleStack

CloudFormation stack details

The CloudFormation template deploys the following components:

  1. An API named keyspaces-OpenSearch-Endpoint in API Gateway, which handles mutations (inserts, updates, and deletes) via the POST method to Lambda, compatible with OpenSearch Ingestion.
  2. A keyspace named productsearch, along with a table called product_by_item. The chosen partition key for this table is product_id. The following screenshot shows an example of the table’s attributes and data provided for reference using the CQL editor.
  3. A Lambda function called OpsApigwLambdaStack-ApiHandler* that will forward the transaction to Amazon Keyspaces. After the transaction is committed in Amazon Keyspaces, we send a response code of 200 to the client and asynchronously send the transaction to the OpenSearch Ingestion pipeline.
  4. The OpenSearch Ingestion pipeline, named serverless-ingestion. This pipeline publishes records to an OpenSearch Serverless collection under an index named products. The key for this collection is product_id. Additionally, the pipeline specifies the actions it can handle. The delete action supports delete operations; the index action is the default action, which supports insert and update operations.

We have chosen an OpenSearch Serverless collection as our target, so we included serverless: true in our configuration file. To keep things simple, we haven’t altered the network_policy_name settings, but you have the option to specify a different network policy name if needed. For additional details on how to set up network access for OpenSearch Serverless collections, refer to Creating network policies (console).

version: "2"
product-pipeline:
  source:
    http:
      path: "/${pipelineName}/test_ingestion_path"
  processor:
    - date:
        from_time_received: true
        destination: "@timestamp"
  sink:
    - opensearch:
        hosts: [ "<OpenSearch_Endpoint>" ]
        document_root_key: "item"
        index_type: custom
        index: "products"
        document_id_field: "item/product_id"
        flush_timeout: -1
        actions:
          - type: "delete"
            when: '/operation == "delete"'
          - type: "index"                      
        aws:
          sts_role_arn: "arn:aws:iam::<account_id>:role/OpenSearchCollectionPipelineRole"
          region: "us-east-1"
          serverless: true
        # serverless_options:
            # Specify a name here to create or update network policy for the serverless collection
            # network_policy_name: "network-policy-name"

You can incorporate a dead-letter queue (DLQ) into your pipeline to handle and store events that fail to process. This allows for easy access and analysis of these events. If your sinks refuse data due to mapping errors or other problems, redirecting this data to the DLQ will facilitate troubleshooting and resolving the issue. For detailed instructions on configuring DLQs, refer to Dead-letter queues. To reduce complexity, we don’t configure the DLQs in this post.

Now that all components have been deployed, we can test the solution and conduct various searches on the OpenSearch Service index.

Test the solution

Complete the following steps to test the solution:

  1. On the API Gateway console, navigate to your API and choose the ANY method.
  2. Choose the Test tab.
  3. For Method type, choose POST.

This is the only supported method by OpenSearch Ingestion for any inserts, deletes, or updates.

  4. For Request body, enter the input.

The following are some of the sample requests:

{"operation": "insert", "item": {"product_id": 1, "product_name": "Reindeer sweater", "product_description": "A Christmas sweater for everyone in the family." } }
{"operation": "insert", "item": {"product_id": 2, "product_name": "Bluetooth Headphones", "product_description": "High-quality wireless headphones with long battery life."}}
{"operation": "insert", "item": {"product_id": 3, "product_name": "Smart Fitness Watch", "product_description": "Advanced watch tracking fitness and health metrics."}}
{"operation": "insert", "item": {"product_id": 4, "product_name": "Eco-Friendly Water Bottle", "product_description": "Durable and eco-friendly bottle for hydration on-the-go."}}
{"operation": "insert", "item": {"product_id": 5, "product_name": "Wireless Charging Pad", "product_description": "Convenient pad for fast wireless charging of devices."}}

If the test is successful, you should see a return code of 200 in API Gateway. The following is a sample response:

{"message": "Ingestion completed successfully for {'operation': 'insert', 'item': {'product_id': 100, 'product_name': 'Reindeer sweater', 'product_description': 'A Christmas sweater for everyone in the family.'}}."}

If the test is successful, you should see the updated records in the Amazon Keyspaces table.
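
For example, assuming product_id is stored as an integer (as in the sample requests), a CQL statement like the following in the Amazon Keyspaces CQL editor can spot-check a row:

SELECT product_id, product_name, product_description
FROM productsearch.product_by_item
WHERE product_id = 1;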

  5. Now that you have loaded some sample data, run a sample query to confirm the data that you loaded using API Gateway is actually being persisted to OpenSearch Service. The following is a query against the OpenSearch Service index for product_name = sweater:
awscurl --service aoss --region us-east-1 -X POST "<OpenSearch_Endpoint>/products/_search" -H "Content-Type: application/json" -d '
{
"query": {
"term": {
"product_name": "sweater"
     }
   } 
}'  | jq '.'

  6. To update a record, enter an update payload in the API's request body (see the first sample after this list). If the record doesn't already exist, this operation will insert the record.
  7. To delete a record, enter a delete payload in the API's request body (see the second sample after this list).
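
The following sample request bodies are illustrative; they follow the same shape as the insert examples above, and the exact payloads expected by the sample application may differ. Any operation value other than delete is handled by the pipeline's default index action, which behaves as an upsert.

{"operation": "update", "item": {"product_id": 1, "product_name": "Reindeer sweater", "product_description": "A festive sweater, now available in adult and kids sizes."}}
{"operation": "delete", "item": {"product_id": 1}}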

Monitoring

You can use Amazon CloudWatch to monitor the pipeline metrics. The following graph shows the number of documents successfully sent to OpenSearch Service.

Run queries on Amazon Keyspaces data in OpenSearch Service

There are several methods to run search queries against an OpenSearch Service collection, with the most popular being awscurl or the dev tools in OpenSearch Dashboards. For this post, we use the dev tools in OpenSearch Dashboards.

To access the dev tools, navigate to the OpenSearch collection dashboards and select the dashboard radio button adjacent to the ingestion-collection, as highlighted in the screenshot.

Once on the OpenSearch Dashboards page, click the Dev Tools radio button, as highlighted.

This action brings up the Dev Tools console, enabling you to run various search queries, either to validate the data or simply to query it.

Type in your query and use the size parameter to determine how many records you want to be displayed. Click the play icon to execute the query. Results will appear in the right pane.

The following are some of the different search queries that you can run against the ingestion-collection for different search needs. For more search methods and examples, refer to Searching data in Amazon OpenSearch Service.

Full text search

In a search for Bluetooth headphones, we adopted an exacting full-text search approach. Our strategy involved formulating a query to align precisely with the term “Bluetooth Headphones,” searching through an extensive product database. This method allowed us to thoroughly examine and evaluate a broad range of Bluetooth headphones, concentrating on those that best met our search parameters. See the following code:
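
The following is an illustrative sketch of such a query against the products index created by the ingestion pipeline; the exact query from the original walkthrough may differ:

GET products/_search
{
  "query": {
    "match": {
      "product_name": "Bluetooth Headphones"
    }
  }
}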

Fuzzy search

We used a fuzzy search query to navigate through product descriptions, even when they contain variations or misspellings of our search term. For instance, by setting the value to “chrismas” and the fuzziness to AUTO, our search could accommodate common misspellings or close approximations in the product descriptions. This approach is particularly useful in making sure that we capture a wider range of relevant results, especially when dealing with terms that are often misspelled or have multiple variations. See the following code:
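
The following is an illustrative sketch of such a fuzzy query, assuming the product_description field and products index used earlier:

GET products/_search
{
  "query": {
    "fuzzy": {
      "product_description": {
        "value": "chrismas",
        "fuzziness": "AUTO"
      }
    }
  }
}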

Wildcard search

In our approach to discovering a variety of products, we employed a wildcard search technique within the product descriptions. By using the query Fit*s, we signaled our search tool to look for any product descriptions that begin with “Fit” and end with “s,” allowing for any characters to appear in between. This method is effective for capturing a range of products that have similar naming patterns or attributes, making sure that we don’t miss out on relevant items that fit within a certain category but may have slightly different names or features. See the following code:
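
The following is an illustrative sketch of such a wildcard query; case_insensitive is included because text fields are typically lowercased by the default analyzer at index time:

GET products/_search
{
  "query": {
    "wildcard": {
      "product_description": {
        "value": "Fit*s",
        "case_insensitive": true
      }
    }
  }
}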

Be aware that queries incorporating wildcard characters often exhibit reduced performance, because they require iterating through a large number of terms. Consequently, avoid placing wildcard characters at the beginning of a query, because that can significantly strain both compute resources and time.

Troubleshooting

A status code other than 200 indicates a problem either in the Amazon Keyspaces operation or the OpenSearch Ingestion operation. View the CloudWatch logs of the Lambda function OpsApigwLambdaStack-ApiHandler* and the OpenSearch Ingestion pipeline logs to troubleshoot the failure.

You might see the following errors in the ingestion pipeline logs. They are harmless and occur because the pipeline endpoint is publicly accessible rather than accessed through a VPC. As a best practice, you can enable VPC access for the serverless collection, which provides an inherent layer of security.

  • 2024-01-23T13:47:42.326 [armeria-common-worker-epoll-3-1] ERROR com.amazon.osis.HttpAuthorization - Unauthenticated request: Missing Authentication Token
  • 2024-01-23T13:47:42.327 [armeria-common-worker-epoll-3-1] ERROR com.amazon.osis.HttpAuthorization - Authentication status: 401

Clean up

To prevent additional charges and to effectively remove resources, delete the CloudFormation stacks by running the following command:

(.venv) $ cdk destroy -c iam_user_name=<your-iam-user-name> --force --all

Verify that the following CloudFormation stacks are deleted from the CloudFormation console: OpsApigwLambdaStack, OpsServerlessIngestionStack, OpsServerlessStack, OpsKeyspacesStack, and OpsCollectionPipelineRoleStack.

Finally, delete the CDKToolkit CloudFormation stack to remove the AWS CDK resources.

Conclusion

In this post, we delved into enabling diverse search scenarios on data stored in Amazon Keyspaces by using the capabilities of OpenSearch Service. Through the use of Lambda and OpenSearch Ingestion, we managed the data movement seamlessly. Furthermore, we provided insights into testing the deployed solution using a CloudFormation template, ensuring a thorough grasp of its practical application and effectiveness.

Test the procedure that is outlined in this post by deploying the sample code provided and share your feedback in the comments section.


About the authors

Rajesh Kantamani is a Senior Database Solution Architect. He specializes in assisting customers with designing, migrating, and optimizing database solutions on Amazon Web Services, ensuring scalability, security, and performance. In his spare time, he loves spending time outdoors with family and friends.

Sylvia, a Senior DevOps Architect, specializes in designing and automating DevOps processes to guide clients through their DevOps transformation journey. During her leisure time, she finds joy in activities such as biking, swimming, practicing yoga, and photography.

AWS Weekly Roundup — .Net Runtime for AWS Lambda, PartyRock Hackathon, and more — February 26, 2024

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-net-runtime-for-aws-lambda-partyrock-hackathon-and-more-february-26-2024/

The Community AWS re:Invent 2023 re:Caps continue! Recently, I was invited to participate in one of these events hosted by the AWS User Group Kenya, and was able to learn and spend time with this amazing community.

AWS User Group Kenya

Last week’s launches
Here are some launches that got my attention during the previous week.

.NET 8 runtime for AWS Lambda – AWS Lambda now supports .NET 8 as both a managed runtime and container base image. This support provides you with .NET 8 features that include API enhancements, improved Native Ahead of Time (Native AOT) support, and improved performance. .NET 8 supports C# 12, F# 8, and PowerShell 7.4. You can develop Lambda functions in .NET 8 using the AWS Toolkit for Visual Studio, the AWS Extensions for .NET CLI, AWS Serverless Application Model (AWS SAM), AWS CDK, and other infrastructure as code tools.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Here are some additional projects, programs, and news items that you might find interesting:

Earlier this month, I used this image to call attention to the PartyRock Hackathon that's currently in progress. The deadline to join the hackathon is fast approaching, so be sure to sign up before time runs out.

Amazon API Gateway – Amazon API Gateway processed over 100 trillion API requests in 2023, and we continue to see growing demand for API-driven applications. API Gateway is a fully-managed service that enables you to create, publish, maintain, monitor, and secure APIs at any scale. Customers that onboarded large workloads on API Gateway in 2023 told us they chose the service for its availability, security, and serverless architecture. Those in regulated industries value API Gateway’s private endpoints, which are isolated from the public internet and only accessible from your Amazon Virtual Private Cloud (VPC).

AWS open source news and updates – My colleague Ricardo writes this weekly open source newsletter in which he highlights new open source projects, tools, and demos from the AWS Community.

Upcoming AWS events
Season 3 of the Build on Generative AI Twitch show has kicked off. Join every Monday on Twitch at 9AM PST/Noon EST/18h CET to learn, among other things, how you can build generative AI-enabled applications.

If you’re in the EMEA timezone, there is still time to register and watch the AWS Innovate Online Generative AI & Data Edition taking place on February 29. Innovate Online events are free, online, and designed to inspire and educate you about building on AWS. Whether you’re in the Americas, Asia Pacific & Japan, or EMEA region, learn here about future AWS Innovate Online events happening in your timezone.

AWS Community re:Invent re:Caps – Join a Community re:Cap event organized by volunteers from AWS User Groups and AWS Cloud Clubs around the world to learn about the latest announcements from AWS re:Invent.

You can browse all upcoming in-person and virtual events here.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Veliswa

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS.

Introducing the .NET 8 runtime for AWS Lambda

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/introducing-the-net-8-runtime-for-aws-lambda/

This post is written by Beau Gosse, Senior Software Engineer and Paras Jain, Senior Technical Account Manager.

AWS Lambda now supports .NET 8 as both a managed runtime and container base image. With this release, Lambda developers can benefit from .NET 8 features including API enhancements, improved Native Ahead of Time (Native AOT) support, and improved performance. .NET 8 supports C# 12, F# 8, and PowerShell 7.4. You can develop Lambda functions in .NET 8 using the AWS Toolkit for Visual Studio, the AWS Extensions for .NET CLI, AWS Serverless Application Model (AWS SAM), AWS CDK, and other infrastructure as code tools.

Creating .NET 8 function in the console

What’s new

Upgraded operating system

The .NET 8 runtime is built on the Amazon Linux 2023 (AL2023) minimal container image. This provides a smaller deployment footprint than earlier Amazon Linux 2 (AL2) based runtimes and updated versions of common libraries such as glibc 2.34 and OpenSSL 3.

The new image also uses microdnf as a package manager, symlinked as dnf. This replaces the yum package manager used in earlier AL2-based images. If you deploy your Lambda functions as container images, you must update your Dockerfiles to use dnf instead of yum when upgrading to the .NET 8 base image. For more information, see Introducing the Amazon Linux 2023 runtime for AWS Lambda.

Performance

There are a number of language performance improvements available as part of .NET 8. Initialization time can impact performance, as Lambda creates new execution environments to scale your function automatically. There are a number of ways to optimize performance for Lambda-based .NET workloads, including using source generators in System.Text.Json or using Native AOT.

Lambda has increased the default memory size from 256 MB to 512 MB in the blueprints and templates for improved performance with .NET 8. Perform your own functional and performance tests on your .NET 8 applications. You can use AWS Compute Optimizer or AWS Lambda Power Tuning for performance profiling.

At launch, new Lambda runtimes receive less usage than existing established runtimes. This can result in longer cold start times due to reduced cache residency within internal Lambda subsystems. Cold start times typically improve in the weeks following launch as usage increases. As a result, AWS recommends not drawing performance comparison conclusions with other Lambda runtimes until the performance has stabilized.

Native AOT

Lambda introduced .NET Native AOT support in November 2022. Benchmarks show up to 86% improvement in cold start times by eliminating the JIT compilation. Deploying .NET 8 Native AOT functions using the managed dotnet8 runtime rather than the OS-only provided.al2023 runtime gives your function access to .NET system libraries. For example, libicu, which is used for globalization, is not included by default in the provided.al2023 runtime but is in the dotnet8 runtime.

While Native AOT is not suitable for all .NET functions, .NET 8 has improved trimming support. This allows you to more easily run ASP.NET APIs. Improved trimming support helps eliminate build time trimming warnings, which highlight possible runtime errors. This can give you confidence that your Native AOT function behaves like a JIT-compiled function. Trimming support has been added to the Lambda runtime libraries, AWS .NET SDK, .NET Lambda Annotations, and .NET 8 itself.

Using .NET 8 with Lambda

To use .NET 8 with Lambda, you must update your tools.

  1. Install or update the .NET 8 SDK.
  2. If you are using AWS SAM, install or update to the latest version.
  3. If you are using Visual Studio, install or update the AWS Toolkit for Visual Studio.
  4. If you use the .NET Lambda Global Tools extension (Amazon.Lambda.Tools), install the CLI extension and templates. You can upgrade existing tools with dotnet tool update -g Amazon.Lambda.Tools and existing templates with dotnet new install Amazon.Lambda.Templates.

You can also use .NET 8 with Powertools for AWS Lambda (.NET), a developer toolkit to implement serverless best practices such as observability, batch processing, retrieving parameters, idempotency, and feature flags.

Building new .NET 8 functions

Using AWS SAM

  1. Run sam init.
  2. Choose 1- AWS Quick Start Templates.
  3. Choose one of the available templates such as Hello World Example.
  4. Select N for Use the most popular runtime and package type?
  5. Select dotnet8 as the runtime. The dotnet8 Hello World Example also includes a Native AOT template option.
  6. Follow the rest of the prompts to create the .NET 8 function.

AWS SAM .NET 8 init options

You can amend the generated function code and use sam deploy --guided to deploy the function.

Using AWS Toolkit for Visual Studio

  1. From the Create a new project wizard, filter the templates to either the Lambda or Serverless project type and select a template. Use Lambda for deploying a single function. Use Serverless for deploying a collection of functions using AWS CloudFormation.
  2. Continue with the steps to finish creating your project.
  3. You can amend the generated function code.
  4. To deploy, right click on the project in the Solution Explorer and select Publish to AWS Lambda.

Using AWS extensions for the .NET CLI

  1. Run dotnet new list --tag Lambda to get a list of available Lambda templates.
  2. Choose a template and run dotnet new <template name>. To build a function using Native AOT, use dotnet new lambda.NativeAOT or dotnet new serverless.NativeAOT when using the .NET Lambda Annotations Framework.
  3. Locate the generated Lambda function in the directory under src which contains the .csproj file. You can amend the generated function code.
  4. To deploy, run dotnet lambda deploy-function and follow the prompts.
  5. You can test the function in the cloud using dotnet lambda invoke-function or by using the test functionality in the Lambda console.

You can build and deploy .NET Lambda functions using container images. Follow the instructions in the documentation.

Migrating from .NET 6 to .NET 8 without Native AOT

Using AWS SAM

  1. Open the template.yaml file.
  2. Update Runtime to dotnet8.
  3. Open a terminal window and rebuild the code using sam build.
  4. Run sam deploy to deploy the changes.

Using AWS Toolkit for Visual Studio

  1. Open the .csproj project file and update the TargetFramework to net8.0. Update NuGet packages for your Lambda functions to the latest version to pull in .NET 8 updates.
  2. Verify that the build command you are using is targeting the .NET 8 runtime.
  3. There may be additional steps depending on what build/deploy tool you’re using. Updating the function runtime may be sufficient.

.NET function in AWS Toolkit for Visual Studio

Using AWS extensions for the .NET CLI or AWS Toolkit for Visual Studio

  1. Open the aws-lambda-tools-defaults.json file if it exists.
    1. Set the framework field to net8.0. If unspecified, the value is inferred from the project file.
    2. Set the function-runtime field to dotnet8.
  2. Open the serverless.template file if it exists. For any AWS::Lambda::Function or AWS::Serverless::Function resources, set the Runtime property to dotnet8.
  3. Open the .csproj project file if it exists and update the TargetFramework to net8.0. Update NuGet packages for your Lambda functions to the latest version to pull in .NET 8 updates.

Migrating from .NET 6 to .NET 8 Native AOT

The following example migrates a .NET 6 class library function to a .NET 8 Native AOT executable function. This uses the optional Lambda Annotations framework which provides idiomatic .NET coding patterns.

Update your project file

  1. Open the project file.
  2. Set TargetFramework to net8.0.
  3. Set OutputType to exe.
  4. Remove PublishReadyToRun if it exists.
  5. Add PublishAot and set to true.
  6. Add or update NuGet package references to Amazon.Lambda.Annotations and Amazon.Lambda.RuntimeSupport. You can update using the NuGet UI in your IDE, manually, or by running dotnet add package Amazon.Lambda.RuntimeSupport and dotnet add package Amazon.Lambda.Annotations from your project directory.

Your project file should look similar to the following:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
    <AWSProjectType>Lambda</AWSProjectType>
    <CopyLocalLockFileAssemblies>true</CopyLocalLockFileAssemblies>
    <!-- Generate native aot images during publishing to improve cold start time. -->
    <PublishAot>true</PublishAot>
	  <!-- StripSymbols tells the compiler to strip debugging symbols from the final executable if we're on Linux and put them into their own file. 
		This will greatly reduce the final executable's size.-->
	  <StripSymbols>true</StripSymbols>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Amazon.Lambda.Core" Version="2.2.0" />
    <PackageReference Include="Amazon.Lambda.RuntimeSupport" Version="1.10.0" />
    <PackageReference Include="Amazon.Lambda.Serialization.SystemTextJson" Version="2.4.0" />
  </ItemGroup>
</Project>

Updating your function code

    1. Reference the annotations library with using Amazon.Lambda.Annotations;
    2. Add [assembly:LambdaGlobalProperties(GenerateMain = true)] to allow the annotations framework to create the main method. This is required as the project is now an executable instead of a library.
    3. Add the following partial class and include a JsonSerializable attribute for any types that you need to serialize, including your function input and output types. This partial class is used at build time to generate reflection-free code dedicated to serializing the listed types. The following is an example:
      /// <summary>
      /// This class is used to register the input event and return type for the FunctionHandler method with the System.Text.Json source generator.
      /// There must be a JsonSerializable attribute for each type used as the input and return type or a runtime error will occur 
      /// from the JSON serializer unable to find the serialization information for unknown types.
      /// </summary>
      [JsonSerializable(typeof(APIGatewayHttpApiV2ProxyRequest))]
      [JsonSerializable(typeof(APIGatewayHttpApiV2ProxyResponse))]
      public partial class MyCustomJsonSerializerContext : JsonSerializerContext
      {
          // By using this partial class derived from JsonSerializerContext, we can generate reflection free JSON Serializer code at compile time
          // which can deserialize our class and properties. However, we must attribute this class to tell it what types to generate serialization code for
          // See https://docs.microsoft.com/en-us/dotnet/standard/serialization/system-text-json-source-generation
      }

    4. After the using statement, add the following to specify the serializer to use: [assembly: LambdaSerializer(typeof(SourceGeneratorLambdaJsonSerializer<LambdaFunctionJsonSerializerContext>))]

    Swap LambdaFunctionJsonSerializerContext for your context if you are not using the partial class from the previous step.

Updating your function configuration

If you are using aws-lambda-tools-defaults.json:

  1. Set function-runtime to dotnet8.
  2. Set function-architecture to match your build machine – either x86_64 or arm64.
  3. Set (or update) environment-variables to include ANNOTATIONS_HANDLER=<YourFunctionHandler>. Replace <YourFunctionHandler> with the method name of your function handler, so the annotations framework knows which method to call from the generated main method.
  4. Set function-handler to the name of the executable assembly in your bin directory. By default, this is your project name, which tells the .NET Lambda bootstrap script to run your native binary instead of starting the .NET runtime. If your project file has AssemblyName then use that value for the function handler.

{
  "function-architecture": "x86_64",
  "function-runtime": "dotnet8",
  "function-handler": "<your-assembly-name>",
  "environment-variables": "ANNOTATIONS_HANDLER=<your-function-handler>"
}

Deploy and test

  1. Deploy your function. If you are using Amazon.Lambda.Tools, run dotnet lambda deploy-function. Check for trim warnings during build and refactor to eliminate them.
  2. Test your function to ensure that the native calls into AL2023 are working correctly. By default, running local unit tests on your development machine won't run natively and will still use the JIT compiler. Running with the JIT compiler does not allow you to catch native AOT specific runtime errors.

Conclusion

Lambda is introducing the new .NET 8 managed runtime. This post highlights new features in .NET 8. You can create new Lambda functions or migrate existing functions to .NET 8 or .NET 8 Native AOT.

For more information, see the AWS Lambda for .NET repository, documentation, and .NET on Serverless Land.

For more serverless learning resources, visit Serverless Land.

Best practices for managing Terraform State files in AWS CI/CD Pipeline

Post Syndicated from Arun Kumar Selvaraj original https://aws.amazon.com/blogs/devops/best-practices-for-managing-terraform-state-files-in-aws-ci-cd-pipeline/

Introduction

Today, customers want to reduce manual operations for deploying and maintaining their infrastructure. The recommended method to deploy and manage infrastructure on AWS is to follow the Infrastructure-As-Code (IaC) model using tools like AWS CloudFormation, AWS Cloud Development Kit (AWS CDK), or Terraform.

One of the critical components in terraform is managing the state file, which keeps track of your configuration and resources. When you run terraform in an AWS CI/CD pipeline, the state file has to be stored in a secured, common path to which the pipeline has access. You also need a mechanism to lock it when multiple developers in the team want to access it at the same time.

In this blog post, we will explain how to manage terraform state files in AWS, best practices for configuring them in AWS, and an example of how you can manage them efficiently in your Continuous Integration pipeline in AWS when used with AWS Developer Tools such as AWS CodeCommit and AWS CodeBuild. This blog post assumes you have a basic knowledge of terraform, AWS Developer Tools, and AWS CI/CD pipelines. Let's dive in!

Challenges with handling state files

By default, the state file is stored locally where terraform runs, which is not a problem if you are a single developer working on the deployment. However, if you are not, storing state files locally is not ideal, as you may run into the following problems:

  • When working in teams or collaborative environments, multiple people need access to the state file
  • Data in the state file is stored in plain text which may contain secrets or sensitive information
  • Local files can get lost, corrupted, or deleted

Best practices for handling state files

The recommended practice for managing state files is to use terraform’s built-in support for remote backends. These are:

Remote backend on Amazon Simple Storage Service (Amazon S3): You can configure terraform to store state files in an Amazon S3 bucket which provides a durable and scalable storage solution. Storing on Amazon S3 also enables collaboration that allows you to share state file with others.

Remote backend on Amazon S3 with Amazon DynamoDB: In addition to using an Amazon S3 bucket for managing the files, you can use an Amazon DynamoDB table to lock the state file. This will allow only one person to modify a particular state file at any given time. It will help to avoid conflicts and enable safe concurrent access to the state file.

There are other options available as well such as remote backend on terraform cloud and third party backends. Ultimately, the best method for managing terraform state files on AWS will depend on your specific requirements.

When deploying terraform on AWS, the preferred choice of managing state is using Amazon S3 with Amazon DynamoDB.
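
For reference, a minimal backend configuration for this approach might look like the following sketch; the bucket, key, Region, and table names are placeholders that match the examples used later in this post:

terraform {
  backend "s3" {
    bucket         = "tfbackend-bucket"   # Amazon S3 bucket that stores the state file
    key            = "terraform.tfstate"  # object key of the state file
    region         = "eu-central-1"
    dynamodb_table = "tfstate-lock"       # Amazon DynamoDB table used for state locking
    encrypt        = true                 # encrypt the state object at rest
  }
}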

AWS configurations for managing state files

  1. Create an Amazon S3 bucket using terraform. Implement security measures for the Amazon S3 bucket by creating an AWS Identity and Access Management (AWS IAM) policy or Amazon S3 Bucket Policy. This lets you restrict access, configure object versioning for data protection and recovery, and enable AES256 encryption with SSE-KMS for encryption control.
  2. Next, create an Amazon DynamoDB table using terraform with the primary key set to LockID (see the sketch after this list). You can also set any additional configuration options such as read/write capacity units. Once the table is created, you will configure the terraform backend to use it for state locking by specifying the table name in the terraform block of your configuration.
  3. For a single AWS account with multiple environments and projects, you can use a single Amazon S3 bucket. If you have multiple applications in multiple environments across multiple AWS accounts, you can create one Amazon S3 bucket for each account. In that Amazon S3 bucket, you can create appropriate folders for each environment, storing project state files with specific prefixes.
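
As an illustration of step 2, the lock table can be defined in terraform along the following lines; the table name is a placeholder, and the hash key must be named LockID:

resource "aws_dynamodb_table" "terraform_state_lock" {
  name         = "tfstate-lock"
  billing_mode = "PAY_PER_REQUEST"  # on-demand capacity; switch to PROVISIONED if preferred
  hash_key     = "LockID"           # terraform expects this exact attribute name

  attribute {
    name = "LockID"
    type = "S"
  }
}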

Now that you know how to handle terraform state files on AWS, let’s look at an example of how you can configure them in a Continuous Integration pipeline in AWS.

Architecture

Figure 1: Example architecture on how to use terraform in an AWS CI pipeline

This diagram outlines the workflow implemented in this blog:

  1. The AWS CodeCommit repository contains the application code
  2. The AWS CodeBuild job contains the buildspec files and references the source code in AWS CodeCommit
  3. The AWS Lambda function contains the application code created after running terraform apply
  4. Amazon S3 contains the state file created after running terraform apply. Amazon DynamoDB locks the state file present in Amazon S3

Implementation

Pre-requisites

Before you begin, you must complete the following prerequisites:

Setting up the environment

  1. You need an AWS access key ID and secret access key to configure AWS CLI. To learn more about configuring the AWS CLI, follow these instructions.
  2. Clone the repo for complete example: git clone https://github.com/aws-samples/manage-terraform-statefiles-in-aws-pipeline
  3. After cloning, you will see the following folder structure:

Figure 2: AWS CodeCommit repository structure

Let’s break down the terraform code into 2 parts – one for preparing the infrastructure and another for preparing the application.

Preparing the Infrastructure

  1. The main.tf file is the core component that does the following:
      • It creates an Amazon S3 bucket to store the state file. We configure bucket ACL, bucket versioning and encryption so that the state file is secure.
      • It creates an Amazon DynamoDB table which will be used to lock the state file.
      • It creates two AWS CodeBuild projects, one for ‘terraform plan’ and another for ‘terraform apply’.

    Note – It also has the code block (commented out by default) to create the AWS Lambda function, which you will use at a later stage.

  2. AWS CodeBuild projects should be able to access Amazon S3, Amazon DynamoDB, AWS CodeCommit, and AWS Lambda. So, the AWS IAM role with the appropriate permissions required to access these resources is created via the iam.tf file.
  3. Next, you will find two buildspec files named buildspec-plan.yaml and buildspec-apply.yaml that will execute the terraform commands – terraform plan and terraform apply, respectively.
  4. Modify the AWS Region in the provider.tf file.
  5. Update the Amazon S3 bucket name, Amazon DynamoDB table name, AWS CodeBuild compute types, and AWS Lambda role and policy names to the required values using the variable.tf file. You can also use this file to easily customize parameters for different environments.

With this, the infrastructure setup is complete.

You can use your local terminal and execute the following commands in the same order to deploy the above-mentioned resources in your AWS account.

terraform init
terraform validate
terraform plan
terraform apply

Once the apply is successful and all the above resources have been deployed in your AWS account, proceed with deploying your application.

Preparing the Application

  1. In the cloned repository, use the backend.tf file to create your own Amazon S3 backend to store the state file. By default, it has the following values. You can override them with your required values.
bucket = "tfbackend-bucket" 
key    = "terraform.tfstate" 
region = "eu-central-1"
  2. The repository has sample Python code stored in main.py that returns a simple message when invoked.
  3. In the main.tf file, you can find the following block of code to create and deploy the Lambda function that uses the main.py code (uncomment these code blocks).
data "archive_file" "lambda_archive_file" {
    ……
}

resource "aws_lambda_function" "lambda" {
    ……
}
  4. Now you can deploy the application using AWS CodeBuild instead of running terraform commands locally, which is the whole point and advantage of using AWS CodeBuild.
  5. Run the two AWS CodeBuild projects to execute terraform plan and terraform apply again.
  6. Once successful, you can verify your deployment by testing the code in AWS Lambda. To test the Lambda function in the console:
    • Open AWS Lambda console and select your function “tf-codebuild”
    • In the navigation pane, in Code section, click Test to create a test event
    • Provide your required name, for example “test-lambda”
    • Accept default values and click Save
    • Click Test again to trigger your test event “test-lambda”

It should return the sample message you provided in your main.py file. In the default case, it will display the "Hello from AWS Lambda !" message as shown below.

Figure 3: Sample AWS Lambda function response

  7. To verify your state file, go to the Amazon S3 console and select the backend bucket you created (tfbackend-bucket). It will contain your state file.

Figure 4: Amazon S3 bucket with terraform state file

  8. Open the Amazon DynamoDB console and check your table tfstate-lock; it will have an entry with LockID.

Figure 5: Amazon DynamoDB table with LockID

Thus, you have securely stored and locked your terraform state file using terraform backend in a Continuous Integration pipeline.

Cleanup

To delete all the resources created as part of the repository, run the following command from your terminal.

terraform destroy

Conclusion

In this blog post, we explored the fundamentals of terraform state files, discussed best practices for their secure storage within AWS environments and also mechanisms for locking these files to prevent unauthorized team access. And finally, we showed you an example of how efficiently you can manage them in a Continuous Integration pipeline in AWS.

You can apply the same methodology to manage state files in a Continuous Delivery pipeline in AWS. For more information, see CI/CD pipeline on AWS, Terraform backends types, Purpose of terraform state.

Arun Kumar Selvaraj

Arun Kumar Selvaraj is a Cloud Infrastructure Architect with AWS Professional Services. He loves building world class capability that provides thought leadership, operating standards and platform to deliver accelerated migration and development paths for his customers. His interests include Migration, CCoE, IaC, Python, DevOps, Containers and Networking.

Manasi Bhutada

Manasi Bhutada is an ISV Solutions Architect based in the Netherlands. She helps customers design and implement well architected solutions in AWS that address their business problems. She is passionate about data analytics and networking. Beyond work she enjoys experimenting with food, playing pickleball, and diving into fun board games.

How to automate rule management for AWS Network Firewall

Post Syndicated from Ajinkya Patil original https://aws.amazon.com/blogs/security/how-to-automate-rule-management-for-aws-network-firewall/

AWS Network Firewall is a stateful managed network firewall and intrusion detection and prevention service designed for the Amazon Virtual Private Cloud (Amazon VPC). This post concentrates on automating rule updates in a central Network Firewall by using distributed firewall configurations. If you’re new to Network Firewall or seeking a technical background on rule management, see AWS Network Firewall – New Managed Firewall Service in VPC.

Network Firewall offers three deployment models: Distributed, centralized, and combined. Many customers opt for a centralized model to reduce costs. In this model, customers allocate the responsibility for managing the rulesets to the owners of the VPC infrastructure (spoke accounts) being protected, thereby shifting accountability and providing flexibility to the spoke accounts. Managing rulesets in a shared firewall policy generated from distributed input configurations of protected VPCs (spoke accounts) is challenging without proper input validation, state-management, and request throttling controls.

In this post, we show you how to automate firewall rule management within the central firewall using distributed firewall configurations spread across multiple AWS accounts. The anfw-automate solution provides input-validation, state-management, and throttling controls, reducing the update time for firewall rule changes from minutes to seconds. Additionally, the solution reduces operational costs, including rule management overhead while integrating seamlessly with the existing continuous integration and continuous delivery (CI/CD) processes.

Prerequisites

For this walkthrough, the following prerequisites must be met:

  • Basic knowledge of networking concepts such as routing and Classless Inter-Domain Routing (CIDR) range allocations.
  • Basic knowledge of YAML and JSON configuration formats, definitions, and schema.
  • Basic knowledge of Suricata Rule Format and Network Firewall rule management.
  • Basic knowledge of CDK deployment.
  • AWS Identity and Access Management (IAM) permissions to bootstrap the AWS accounts using AWS Cloud Development Kit (AWS CDK).
  • The firewall VPC in the central account must be reachable from a spoke account (see centralized deployment model). For this solution, you need two AWS accounts from the centralized deployment model:
    • The spoke account is the consumer account that defines firewall rules for the account and uses central firewall endpoints for traffic filtering. At least one spoke account is required to simulate the user workflow in the validation phase.
    • The central account is an account that contains the firewall endpoints. This account is used by the application and the Network Firewall.
  • StackSets deployment with service-managed permissions must be enabled in AWS Organizations (Activate trusted access with AWS Organizations). A delegated administrator account is required to deploy AWS CloudFormation stacks in any account in an organization. The CloudFormation StackSets in this account deploy the necessary CloudFormation stacks in the spoke accounts. If you don’t have a delegated administrator account, you must manually deploy the resources in the spoke account. Manual deployment isn’t recommended in production environments.
  • A resource account is the CI/CD account used to deploy necessary AWS CodePipeline stacks. The pipelines deploy relevant cross-account cross-AWS Region stacks to the preceding AWS accounts.
    • IAM permissions to deploy CDK stacks in the resource account.

Solution description

In Network Firewall, each firewall endpoint connects to one firewall policy, which defines network traffic monitoring and filtering behavior. The details of the behavior are defined in rule groups — a reusable set of rules — for inspecting and handling network traffic. The rules in the rule groups provide the details for packet inspection and specify the actions to take when a packet matches the inspection criteria. Network Firewall uses a Suricata rules engine to process all stateful rules. Currently, you can create Suricata compatible or basic rules (such as domain list) in Network Firewall. We use Suricata compatible rule strings within this post to maintain maximum compatibility with most use cases.
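
For reference, a Suricata compatible pass rule string might look like the following illustrative example; note that in this solution, keywords such as sid and priority are reserved and added by the Lambda application rather than by spoke account users:

pass tls $HOME_NET any -> $EXTERNAL_NET 443 (tls.sni; content:"example.com"; nocase; msg:"Allow outbound TLS to example.com"; sid:1000001; rev:1;)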

Figure 1 describes how the anfw-automate solution uses the distributed firewall rule configurations to simplify rule management for multiple teams. The rules are validated, transformed, and stored in the central AWS Network Firewall policy. This solution isolates the rule generation to the spoke AWS accounts, but still uses a shared firewall policy and a central Network Firewall for traffic filtering. This approach grants the AWS spoke account owners the flexibility to manage their own firewall rules while maintaining the accountability for their rules in the firewall policy. The solution enables the central security team to validate and override user defined firewall rules before pushing them to the production firewall policy. The security team operating the central firewall can also define additional rules that are applied to all spoke accounts, thereby enforcing organization-wide security policies. The firewall rules are then compiled and applied to Network Firewall in seconds, providing near real-time response in scenarios involving critical security incidents.

Figure 1: Workflow launched by uploading a configuration file to the configuration (config) bucket

The Network Firewall firewall endpoints and anfw-automate solution are both deployed in the central account. The spoke accounts use the application for rule automation and the Network Firewall for traffic inspection.

As shown in Figure 1, each spoke account contains the following:

  1. An Amazon Simple Storage Service (Amazon S3) bucket to store multiple configuration files, one per Region. The rules defined in the configuration files are applicable to the VPC traffic in the spoke account. The configuration files must comply with the defined naming convention ($Region-config.yaml) and be validated to make sure that only one configuration file exists per Region per account. The S3 bucket has event notifications enabled that publish all changes to configuration files to a local default bus in Amazon EventBridge.
  2. EventBridge rules to monitor the default bus and forward relevant events to the custom event bus in the central account. The EventBridge rules specifically monitor VPCDelete events published by Amazon CloudTrail and S3 event notifications. When a VPC is deleted from the spoke account, the VPCDelete events lead to the removal of corresponding rules from the firewall policy. Additionally, all create, update, and delete events from Amazon S3 event notifications invoke corresponding actions on the firewall policy.
  3. Two AWS Identity and Access Management (IAM) roles with keywords xaccount.lmb.rc and xaccount.lmb.re are assumed by RuleCollect and RuleExecute functions in the central account, respectively.
  4. A CloudWatch Logs log group to store event processing logs published by the central AWS Lambda application.

In the central account:

  1. EventBridge rules monitor the custom event bus and invoke a Lambda function called RuleCollect. A dead-letter queue is attached to the EventBridge rules to store events that failed to invoke the Lambda function.
  2. The RuleCollect function retrieves the config file from the spoke account by assuming a cross-account role. This role is deployed by the same stack that created the other spoke account resources. The Lambda function validates the request, transforms the request to the Suricata rule syntax, and publishes the rules to an Amazon Simple Queue Service (Amazon SQS) first-in-first-out (FIFO) queue. Input validation controls are paramount to make sure that users don’t abuse the functionality of the solution and bypass central governance controls. The Lambda function has input validation controls to verify the following:
    • The VPC ID in the configuration file exists in the configured Region and the same AWS account as the S3 bucket.
    • The Amazon S3 object version ID received in the event matches the latest version ID to mitigate race conditions.
    • Users don't use only top-level domains (for example, .com, .de) in the rules.
    • The custom Suricata rules don't use any as the destination IP address or domain.
    • The VPC identifier matches the required format, that is, a+(AWS Account ID)+(VPC ID without vpc- prefix) in custom rules. This is important to have unique rule variables in rule groups.
    • The rules don’t use security sensitive keywords such as sid, priority, or metadata. These keywords are reserved for firewall administrators and the Lambda application.
    • The configured VPC is attached to an AWS Transit Gateway.
    • Only pass rules exist in the rule configuration.
    • CIDR ranges for a VPC are mapped appropriately using IP set variables.

    The input validations make sure that rules defined by one spoke account don’t impact the rules from other spoke accounts. The validations applied to the firewall rules can be updated and managed as needed based on your requirements. The rules created must follow a strict format, and deviation from the preceding rules will lead to the rejection of the request.

  3. The Amazon SQS FIFO queue preserves the order of create, update, and delete operations run in the configuration bucket of the spoke account. These state-management controls maintain consistency between the firewall rules in the configuration file within the S3 bucket and the rules in the firewall policy. If the sequence of updates provided by the distributed configurations isn’t honored, the rules in a firewall policy might not match the expected ruleset.

    Rules not processed beyond the maxReceiveCount threshold are moved to a dead-letter SQS queue for troubleshooting.

  4. The Amazon SQS messages are subsequently consumed by another Lambda function called RuleExecute. Multiple changes to one configuration are batched together in one message. The RuleExecute function parses the messages and generates the required rule groups, IP set variables, and rules within the Network Firewall. Additionally, the Lambda function establishes a reserved rule group, which can be administered by the solution’s administrators and used to define global rules. The global rules, applicable to participating AWS accounts, can be managed in the data/defaultdeny.yaml file by the central security team.

    The RuleExecute function also implements throttling controls to make sure that rules are applied to the firewall policy without reaching the ThrottlingException from Network Firewall (see common errors). The function also implements back-off logic to handle this exception; a minimal sketch of this retry pattern follows this list. This throttling effect can happen if there are too many requests issued to the Network Firewall API.

    The function makes cross-Region calls to Network Firewall based on the Region provided in the user configuration. There is no need to deploy the RuleExecute and RuleCollect Lambda functions in multiple Regions unless a use case warrants it.
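
The back-off behavior described in the preceding step can be implemented with a small retry wrapper similar to the following sketch; this illustrates the general pattern and is not the code used in the anfw-automate repository:

import time

import boto3
from botocore.exceptions import ClientError

network_firewall = boto3.client("network-firewall")


def call_with_backoff(api_call, max_attempts=5, base_delay_seconds=2.0):
    """Retry a Network Firewall API call with exponential back-off on ThrottlingException."""
    for attempt in range(max_attempts):
        try:
            return api_call()
        except ClientError as error:
            if error.response["Error"]["Code"] != "ThrottlingException":
                raise  # only throttling errors are retried
            time.sleep(base_delay_seconds * (2 ** attempt))  # back off before the next attempt
    raise RuntimeError("Network Firewall API is still throttling after all retries")


# Example: wrap an update so it is retried when the API throttles.
# call_with_backoff(lambda: network_firewall.update_rule_group(...))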

Walkthrough

The following section guides you through the deployment of the rules management engine.

  • Deployment: Outlines the steps to deploy the solution into the target AWS accounts.
  • Validation: Describes the steps to validate the deployment and ensure the functionality of the solution.
  • Cleaning up: Provides instructions for cleaning up the deployment.

Deployment

In this phase, you deploy the application pipeline in the resource account. The pipeline is responsible for deploying multi-Region cross-account CDK stacks in both the central account and the delegated administrator account.

If you don’t have a functioning Network Firewall firewall using the centralized deployment model in the central account, see the README for instructions on deploying Amazon VPC and Network Firewall stacks before proceeding. You need to deploy the Network Firewall in centralized deployment in each Region and Availability Zone used by spoke account VPC infrastructure.

The application pipeline stack deploys three stacks in all configured Regions: LambdaStack and ServerlessStack in the central account and StacksetStack in the delegated administrator account. It’s recommended to deploy these stacks solely in the primary Region, given that the solution can effectively manage firewall policies across all supported Regions.

  • LambdaStack deploys the RuleCollect and RuleExecute Lambda functions, the Amazon SQS FIFO queue, and the SQS FIFO dead-letter queue.
  • ServerlessStack deploys the EventBridge bus, EventBridge rules, and an EventBridge dead-letter queue.
  • StacksetStack deploys a service-managed stack set in the delegated administrator account. The stack set includes the deployment of IAM roles, EventBridge rules, an S3 bucket, and a CloudWatch log group in the spoke account. If you’re manually deploying the CloudFormation template (templates/spoke-serverless-stack.yaml) in the spoke account, you can disable this stack in the application configuration.
     
    Figure 2: CloudFormation stacks deployed by the application pipeline

To prepare for bootstrapping

  1. Install and configure profiles for all AWS accounts using the AWS Command Line Interface (AWS CLI)
  2. Install the AWS Cloud Development Kit (AWS CDK)
  3. Install Git and clone the GitHub repo
  4. Install and enable Docker Desktop

To prepare for deployment

  1. Follow the README and cdk bootstrapping guide to bootstrap the resource account. Then, bootstrap the central account and delegated administrator account (optional if StacksetStack is deployed manually in the spoke account) to trust the resource account. The spoke accounts don’t need to be bootstrapped.
  2. In the conf folder of the cloned repository, create a folder to be referred to as <STAGE>, where STAGE is the name of your deployment stage (for example, local, dev, or int). The deployment stage is set as the STAGE parameter later and is used in the AWS resource names.
  3. Create global.json in the <STAGE> folder. Follow the README to update the parameter values. A sample JSON file is provided in the conf/sample folder.
  4. Run the following commands to configure the local environment:
    npm install
    export STAGE=<STAGE>
    export AWS_REGION=<AWS_Region_to_deploy_pipeline_stack>

To deploy the application pipeline stack

  1. Create a file named app.json in the <STAGE> folder and populate the parameters in accordance with the README section and defined schema.
  2. If you choose to manage the deployment of spoke account stacks using the delegated administrator account and have set the deploy_stacksets parameter to true, create a file named stackset.json in the <STAGE> folder. Follow the README section to align with the requirements of the defined schema.

    You can also deploy the spoke account stack manually for testing using the AWS CloudFormation template in templates/spoke-serverless-stack.yaml. This will create and configure the needed spoke account resources.

  3. Run the following commands to deploy the application pipeline stack:
    export STACKNAME=app && make deploy

    Figure 3: Example output of application pipeline deployment

After deploying the solution, each spoke account must configure stateful rules for every VPC in a configuration file and upload it to the S3 bucket. Each spoke account owner must verify that the VPC is connected to the firewall deployed using the centralized deployment model. The configuration is written in YAML and can contain multiple rule definitions. Each account must provide one configuration file per VPC to establish accountability and non-repudiation.
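
As an example, a spoke account could upload its per-VPC configuration with a short script such as the following sketch, assuming Python and boto3. The account ID, stage, and Regions are placeholders, and the file itself must follow the schema referenced in the validation steps that follow.

import boto3

# Placeholder values for illustration only
spoke_account_id = "111122223333"
stage = "dev"
app_region = "eu-west-1"        # Region where the spoke S3 bucket was deployed
firewall_region = "eu-west-1"   # Destination Region encoded in the file name

# Bucket name convention described in the validation steps
bucket = f"anfw-allowlist-{app_region}-{spoke_account_id}-{stage}"
config_file = f"{firewall_region}-config.yaml"

s3 = boto3.client("s3")
s3.upload_file(Filename=config_file, Bucket=bucket, Key=config_file)
print(f"Uploaded {config_file} to s3://{bucket}/{config_file}")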

Validation

Now that you’ve deployed the solution, follow the next steps to verify that it’s completed as expected, and then test the application.

To validate deployment

  1. Sign in to the AWS Management Console using the resource account and go to CodePipeline.
  2. Verify the existence of a pipeline named cpp-app-<aws_organization_scope>-<project_name>-<module_name>-<STAGE> in the configured Region.
  3. Verify that stages exist in each pipeline for all configured Regions.
  4. Confirm that all pipeline stages exist. The LambdaStack and ServerlessStack stages must exist in the cpp-app-<aws_organization_scope>-<project_name>-<module_name>-<STAGE> stack. The StacksetStack stage must exist if you set the deploy_stacksets parameter to true in global.json.

To validate the application

  1. Sign in and open the Amazon S3 console using the spoke account.
  2. Follow the schema defined in app/RuleCollect/schema.json and create a file with naming convention ${Region}-config.yaml. Note that the Region in the config file is the destination Region for the firewall rules. Verify that the file has valid VPC data and rules.
    Figure 4: Example configuration file for eu-west-1 Region

  3. Upload the newly created config file to the S3 bucket named anfw-allowlist-<AWS_REGION for application stack>-<Spoke Account ID>-<STAGE>.
  4. If the data in the config file is invalid, you will see ERROR and WARN logs in the CloudWatch log group named cw-<aws_organization_scope>-<project_name>-<module_name>-CustomerLog-<STAGE>.
  5. If all the data in the config file is valid, you will see INFO logs in the same CloudWatch log group.
    Figure 5: Example of logs generated by the anfw-automate in a spoke account

  6. After the successful processing of the rules, sign in to the Network Firewall console using the central account.
  7. Navigate to the Network Firewall rule groups and search for a rule group with a randomly assigned numeric name. This rule group will contain your Suricata rules after the transformation process.
    Figure 6: Rules created in Network Firewall rule group based on the configuration file in Figure 4

  8. Access the Network Firewall rule group identified by the suffix reserved. This rule group is designated for administrators and global rules. Confirm that the rules specified in app/data/defaultdeny.yaml have been transformed into Suricata rules and are correctly placed within this rule group.
  9. Launch an EC2 instance in the VPC specified in the configuration file and try to access both the destinations allowed in the file and a destination that isn’t listed (a small test script follows this list). Verify that requests to destinations not defined in the configuration file are blocked.
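
If you prefer to script this connectivity check from the instance, the following sketch (assuming Python 3; the destinations are hypothetical) attempts a request to each URL and reports whether it succeeded or was blocked.

import urllib.request

# Hypothetical destinations; replace with entries from your configuration file
destinations = [
    "https://docs.aws.amazon.com",  # expected: allowed by your rules
    "https://example.org",          # expected: blocked (not in the rules)
]

for url in destinations:
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            print(f"{url} -> allowed (HTTP {response.status})")
    except OSError as error:
        # Blocked traffic typically times out or is reset by the firewall
        print(f"{url} -> blocked or unreachable ({error})")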

Cleaning up

To avoid incurring future charges, remove all stacks and instances used in this walkthrough.

  1. Sign in to both the central account and the delegated admin account. Manually delete the stacks in the Regions configured for the app parameter in global.json. Ensure that the stacks are deleted for all Regions specified for the app parameter. You can filter the stack names using the keyword <aws_organization_scope>-<project_name>-<module_name> as defined in global.json.
  2. After deleting the stacks, remove the pipeline stacks using the same command as during deployment, replacing cdk deploy with cdk destroy.
  3. Terminate or stop the EC2 instance used to test the application.

Conclusion

This solution simplifies network security by combining distributed AWS Network Firewall configurations into a centralized policy. Automated rule management can help reduce operational overhead, cut firewall change request completion times from minutes to seconds, offload security and operational mechanisms such as input validation, state management, and request throttling, and enable central security teams to enforce global firewall rules without compromising the flexibility of user-defined rulesets.

In addition to managing this application through S3 bucket configuration files, you can integrate this tool into your CI/CD pipeline with GitHub Actions to upload the firewall rule configuration to an S3 bucket. With GitHub Actions, you can combine configuration file updates with automated release pipeline checks, such as schema validation and manual approvals. This enables your team to maintain and change firewall rule definitions within your existing CI/CD processes and tools. You can go further by allowing access to the S3 bucket only through the CI/CD pipeline.

Finally, you can ingest the AWS Network Firewall logs into one of our partner solutions for security information and event management (SIEM), security monitoring, threat intelligence, and managed detection and response (MDR). You can launch automatic rule updates based on security events detected by these solutions, which can help reduce the response time for security events.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Ajinkya Patil

Ajinkya is a Security Consultant at Amazon Professional Services, specializing in security consulting for AWS customers within the automotive industry since 2019. He has presented at AWS re:Inforce and contributed articles to the AWS Security blog and AWS Prescriptive Guidance. Beyond his professional commitments, he indulges in travel and photography.

Stephan Traub

Stephan is a Security Consultant working for automotive customers at AWS Professional Services. He is a technology enthusiast and passionate about helping customers gain a high security bar in their cloud infrastructure. When Stephan isn’t working, he’s playing volleyball or traveling with his family around the world.

Re-platforming Java applications using the updated AWS Serverless Java Container

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/re-platforming-java-applications-using-the-updated-aws-serverless-java-container/

This post is written by Dennis Kieselhorst, Principal Solutions Architect.

The combination of portability, efficiency, community, and breadth of features has made Java a popular choice for businesses to build their applications for over 25 years. The introduction of serverless functions, pioneered by AWS Lambda, changed what you need in a programming language and runtime environment. Functions are often short-lived, single-purpose, and do not require extensive infrastructure configuration.

This blog post shows how you can modernize a legacy Java application to run on Lambda with minimal code changes using the updated AWS Serverless Java Container.

Deployment model comparison

Classic Java enterprise applications often run on application servers such as JBoss/WildFly, Oracle WebLogic, and IBM WebSphere, or on servlet containers such as Apache Tomcat. The underlying Java virtual machine typically runs 24/7 and serves multiple requests using its multithreading capabilities.

Typical long running Java application server

When building Lambda functions with Java, an HTTP server is no longer required and there are other considerations for running code in a Lambda environment. Code runs in an execution environment, which processes a single invocation at a time. Functions can run for up to 15 minutes with a maximum of 10 GB of allocated memory.

Functions are triggered by events such as an HTTP request with a corresponding payload. An Amazon API Gateway HTTP request invokes the function with the following JSON payload:

Amazon API Gateway HTTP request payload

The code to process these events is different from how you implement it in a traditional application.
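
For illustration, the following minimal sketch shows a raw Lambda handler, written in Python for brevity, that reads a few fields from the API Gateway proxy event; the fields shown are a subset of the documented event format. The remainder of this post shows how the AWS Serverless Java Container removes the need to write this kind of translation code in Java.

import json

def handler(event, context):
    # A few fields from the API Gateway (REST API) proxy event
    http_method = event.get("httpMethod")           # for example, "GET"
    path = event.get("path")                        # for example, "/pets"
    query = event.get("queryStringParameters") or {}

    # The function must return a proxy response object
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"method": http_method, "path": path, "query": query}),
    }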

AWS Serverless Java Container

The AWS Serverless Java Container makes it easier to run Java applications written with frameworks such as Spring, Spring Boot, or JAX-RS/Jersey in Lambda.

The container provides adapter logic to minimize code changes. Incoming events are translated to the Servlet specification so that frameworks work as before.

AWS Serverless Java Container adapter

Version 1 of this library was released in 2018. Today, AWS is announcing the release of version 2, which supports the latest Jakarta EE specification, along with Spring Framework 6.x, Spring Boot 3.x and Jersey 3.x.

Example: Modifying a Spring Boot application

The following example illustrates how to migrate a Spring Boot 3 application. You can find the full example for Spring and other frameworks in the GitHub repository.

  1. Add the AWS Serverless Java dependency to your Maven POM build file (or Gradle accordingly):
    <dependency>
        <groupId>com.amazonaws.serverless</groupId>
        <artifactId>aws-serverless-java-container-springboot3</artifactId>
        <version>2.0.0</version>
    </dependency>
  2. Spring Boot, by default, embeds Apache Tomcat to deal with HTTP requests. The examples use Amazon API Gateway to handle inbound HTTP requests, so you can exclude the embedded Tomcat dependency:
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <configuration>
                    <createDependencyReducedPom>false</createDependencyReducedPom>
                </configuration>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <artifactSet>
                                <excludes>
                                    <exclude>org.apache.tomcat.embed:*</exclude>
                                </excludes>
                            </artifactSet>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

    The AWS Serverless Java Container accepts API Gateway proxy requests and transforms them into a plain Java object. The library also transforms outputs into a suitable API Gateway response object.

    When you run the build, the Maven Shade plugin produces an uber-JAR that bundles all dependencies, which you can upload to Lambda.

  3. The Lambda runtime must know which handler method to invoke. You can configure and use the SpringDelegatingLambdaContainerHandler implementation or implement your own handler Java class that delegates to AWS Serverless Java Container. This is useful if you want to add additional functionality.
  4. Configure the handler name in the runtime settings of your function.
    Configure the handler name

  5. Configure an environment variable named MAIN_CLASS to let the generic handler know where to find your original application main class, which is usually annotated with @SpringBootApplication.
    Configure MAIN_CLASS environment variable

You can also configure these settings using infrastructure as code (IaC) tools such as AWS CloudFormation, the AWS Cloud Development Kit (AWS CDK), or the AWS Serverless Application Model (AWS SAM).

In an AWS SAM template, the related changes are as follows. Full templates are part of the GitHub repository.

Handler: com.amazonaws.serverless.proxy.spring.SpringDelegatingLambdaContainerHandler
Environment:
  Variables:
    MAIN_CLASS: com.amazonaws.serverless.sample.springboot3.Application

Optimizing memory configuration

When running Lambda functions, start-up time and memory footprint are important considerations. The amount of memory you configure for your Lambda function also determines the amount of virtual CPU available. Adding more memory proportionally increases the amount of CPU, and therefore increases the overall computational power available. If a function is CPU-, network- or memory-bound, adding more memory can improve performance.

Lambda charges for the total amount of gigabyte-seconds consumed by a function. Gigabyte-seconds are a combination of total memory (in gigabytes) and duration (in seconds). Increasing memory incurs additional cost. However, in many cases, increasing the memory available causes a decrease in the function duration due to the additional CPU available. As a result, the overall cost increase may be negligible for additional performance, or may even decrease.
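
As a rough, hypothetical example of this trade-off (the price and durations are illustrative, not current Lambda pricing), the following sketch compares the per-invocation cost of two memory settings:

# Hypothetical numbers for illustration only; check current Lambda pricing
PRICE_PER_GB_SECOND = 0.0000166667  # example price in USD

def invocation_cost(memory_mb, duration_ms):
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000)
    return gb_seconds * PRICE_PER_GB_SECOND

# 1024 MB finishing in 1200 ms versus 2048 MB finishing in 550 ms
print(invocation_cost(1024, 1200))  # ~0.0000200 USD (1.2 GB-seconds)
print(invocation_cost(2048, 550))   # ~0.0000183 USD (1.1 GB-seconds)

In this made-up example, doubling the memory more than halves the duration, so the larger configuration is both faster and slightly cheaper.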

Choosing the memory allocated to your Lambda functions is an optimization process that balances speed (duration) and cost. You can manually test functions by selecting different memory allocations and measuring the completion time. AWS Lambda Power Tuning is a tool to simplify and automate the process, which you can use to optimize your configuration.

Power Tuning uses AWS Step Functions to run multiple concurrent versions of a Lambda function at different memory allocations and measures the performance. The function runs in your AWS account, performing live HTTP calls and SDK interactions, to measure performance in a production scenario.

Improving cold-start time with AWS Lambda SnapStart

Traditional applications often have a large tree of dependencies. Lambda loads the function code and initializes dependencies during the Lambda lifecycle initialization phase. With many dependencies, this initialization time may be too long for your requirements. AWS Lambda SnapStart for Java-based functions can deliver up to 10 times faster startup performance.

Instead of running the function initialization phase on every cold start, Lambda SnapStart runs the function initialization process at deployment time. Lambda takes a snapshot of the initialized execution environment. This snapshot is encrypted and persisted in a tiered cache for low-latency access. When the function is invoked and scales, Lambda resumes the execution environment from the persisted snapshot instead of running the full initialization process. This results in lower startup latency.

To enable Lambda SnapStart, you must first enable the configuration setting and also publish a function version.

Enabling SnapStart

Point your API Gateway endpoint to the published version or an alias so that you are using the SnapStart-enabled function.

The corresponding settings in an AWS SAM template contain the following:

SnapStart:
  ApplyOn: PublishedVersions
AutoPublishAlias: my-function-alias

Read the Lambda SnapStart compatibility considerations in the documentation, because your application may contain specific code that requires attention.

Conclusion

When building serverless applications with Lambda, you can deliver features faster, but your language and runtime must work within the serverless architectural model. AWS Serverless Java Container helps to bridge between traditional Java Enterprise applications and modern cloud-native serverless functions.

You can optimize the memory configuration of your Java Lambda function using the AWS Lambda Power Tuning tool and enable SnapStart to optimize the initial cold-start time.

The self-paced Java on AWS Lambda workshop shows how to build cloud-native Java applications and migrate existing Java applications to Lambda.

Explore the AWS Serverless Java Container GitHub repo where you can report related issues and feature requests.

For more serverless learning resources, visit Serverless Land.

How to build a unified authorization layer for identity providers with Amazon Verified Permissions

Post Syndicated from Akash Kumar original https://aws.amazon.com/blogs/security/how-to-build-a-unified-authorization-layer-for-identity-providers-with-amazon-verified-permissions/

Enterprises often have an identity provider (IdP) for their employees and another for their customers. Using multiple IdPs allows you to apply different access controls and policies for employees and for customers. However, managing multiple identity systems can be complex. A unified authorization layer can ease administration by centralizing access policies for APIs regardless of the user’s IdP. The authorization layer evaluates access tokens from any authorized IdP before allowing API access. This removes authorization logic from the APIs and simplifies specifying organization-wide policies. Potential drawbacks include additional complexity in the authorization layer. However, simplifying the management of policies reduces cost of ownership and the likelihood of errors.

Consider a veterinary clinic that has an IdP for their employees. Their clients, the pet owners, would have a separate IdP. Employees might have different sign-in requirements than the clients. These requirements could include features such as multi-factor authentication (MFA) or additional auditing functionality. Applying identical access controls for clients may not be desirable. The clinic’s scheduling application would manage access from both the clinic employees and pet owners. By implementing a unified authorization layer, the scheduling app doesn’t need to be aware of the different IdPs or tokens. The authorization layer handles evaluating tokens and applying policies, such as allowing the clinic employees full access to appointment data while limiting pet owners to just their pet’s records. In this post, we show you an architecture for this situation that demonstrates how to build a unified authorization layer using multiple Amazon Cognito user pools, Amazon Verified Permissions, and an AWS Lambda authorizer for Amazon API Gateway-backed APIs.

In the architecture, API Gateway exposes APIs to provide access to backend resources. API Gateway is a fully-managed service that allows developers to build APIs that act as an entry point for applications. To integrate API Gateway with multiple IdPs, you can use a Lambda authorizer to control access to the API. The IdP in this architecture is Amazon Cognito, which provides the authentication function for users before they’re authorized by Verified Permissions, which implements fine-grained authorization on resources in an application. Keep in mind that Verified Permissions has limits on policy sizes and requests per second. Large deployments might require a different policy store or a caching layer. The four services work together to combine multiple IdPs into a unified authorization layer. The architecture isn’t limited to the Cognito IdP — third-party IdPs that generate JSON Web Tokens (JWTs) can be used, including combinations of different IdPs.

Architecture overview

This sample architecture relies on user-pool multi-tenancy for user authentication. It uses Cognito user pools to assign authenticated users a set of temporary and least-privilege credentials for application access. Once users are authenticated, they are authorized to access backend functions via a Lambda Authorizer function. This function interfaces with Verified Permissions to apply the appropriate access policy based on user attributes.

This sample architecture is based on the scenario of an application that has two sets of users: an internal set of users, veterinarians, as well as an external set of users, clients, with each group having specific access to the API. Figure 1 shows the user request flow.

Figure 1: User request flow

Let’s go through the request flow to understand what happens at each step, as shown in Figure 1:

  1. There are two groups of users: External (Clients) and Internal (Veterinarians). These user groups sign in through a web portal that authenticates against an IdP (Amazon Cognito).
  2. Users in both groups attempt to access the get appointment API through API Gateway, passing their JWTs with claims and the client ID.
  3. The Lambda authorizer validates the claims.

    Note: If Cognito is the IdP, then Verified Permissions can authorize the user from their JWT directly with the IsAuthorizedWithToken API.

  4. After validating the JWT, the Lambda authorizer queries Verified Permissions with the associated policy information to evaluate the request (a minimal sketch of this call appears below).
  5. API Gateway evaluates the policy that the Lambda authorizer returned, to allow or deny access to the resource.
  6. If allowed, API Gateway accesses the resource. If denied, API Gateway returns a 403 Forbidden error.

Note: To further optimize the Lambda authorizer, result caching can be enabled or disabled, depending on your needs. By enabling caching, you can improve performance, because the authorization policy is returned from the cache whenever there is a cache key match. To learn more, see Configure a Lambda authorizer using the API Gateway console.
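
The following is a minimal sketch of such a Lambda authorizer, assuming Python and boto3; the policy store ID, resource entity type, and ARN parsing are placeholders, and the sample repository contains the full implementation, including JWT validation.

import os

import boto3

avp = boto3.client("verifiedpermissions")

def handler(event, context):
    # Bearer token forwarded by API Gateway in the Authorization header
    access_token = event["authorizationToken"].replace("Bearer ", "")
    appointment_id = event["methodArn"].rsplit("/", 1)[-1]  # for example, PI-T123

    # Ask Verified Permissions for an authorization decision based on the token
    decision = avp.is_authorized_with_token(
        policyStoreId=os.environ["POLICY_STORE_ID"],
        accessToken=access_token,
        action={"actionType": "Action", "actionId": "GET/appointment"},
        resource={"entityType": "Appointment", "entityId": appointment_id},
    )

    effect = "Allow" if decision["decision"] == "ALLOW" else "Deny"

    # Return an IAM policy for API Gateway to evaluate (steps 5 and 6)
    return {
        "principalId": "user",
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": event["methodArn"],
                }
            ],
        },
    }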

Walkthrough

This walkthrough demonstrates the preceding scenario for an authorization layer supporting veterinarians and clients. Each set of users will have their own distinct Amazon Cognito user pool.

Verified Permissions policies associated with each Cognito pool enforce access controls. In the veterinarian pool, veterinarians are only allowed to access data for their own patients. Similarly, in the client pool, clients are only able to view and access their own data. This keeps data properly segmented and secured between veterinarians and clients.

Internal policy

permit (principal in UserGroup::"AllVeterinarians",
   action == Action::"GET/appointment",
   resource in UserGroup::"AllVeterinarians")
   when {principal == resource.Veterinarian };

External policy

permit (principal in UserGroup::"AllClients",
   action == Action::"GET/appointment",
   resource in UserGroup::"AllClients")
   when {principal == resource.owner};

The example internal and external policies, along with Cognito serving as the IdP, allow veterinarian users to federate into the application through one IdP, while external clients must use another IdP. This, coupled with the associated authorization policies, allows you to create and customize fine-grained access policies for each user group.

To validate the access request with the policy store, the Lambda authorizer execution role also requires the verifiedpermissions:IsAuthorized action.

Although our example Verified Permissions policies are relatively simple, Cedar policy language is extensive and allows you to define custom rules for your business needs. For example, you could develop a policy that allows veterinarians to access client records only during the day of the client’s appointment.

Implement the sample architecture

The architecture is based on a user-pool multi-tenancy for user authentication. It uses Amazon Cognito user pools to assign authenticated users a set of temporary and least privilege credentials for application access. After users are authenticated, they are authorized to access APIs through a Lambda function. This function interfaces with Verified Permissions to apply the appropriate access policy based on user attributes.

Prerequisites

You need the following prerequisites:

  • The AWS Command Line Interface (CLI) installed and configured for use.
  • Python 3.9 or later, to package Python code for Lambda.

    Note: We recommend that you use a virtual environment or virtualenvwrapper to isolate the sample from the rest of your Python environment.

  • An AWS Identity and Access Management (IAM) role or user with enough permissions to create an Amazon Cognito user pool, IAM role, Lambda function, IAM policy, and API Gateway instance.
  • jq for JSON processing in bash script.

    To install on Ubuntu/Debian, use the following command:

    sudo apt-get install jq

    To install on Mac with Homebrew, use the following command:

    brew install jq

  • The GitHub repository for the sample. You can download it, or you can use the following Git command to download it from your terminal.

    Note: This sample code should be used to test the solution and is not intended to be used in a production account.

    $ git clone https://github.com/aws-samples/amazon-cognito-avp-apigateway.git
    $ cd amazon-cognito-avp-apigateway

To implement this reference architecture, you will use the following services:

  • Amazon Verified Permissions is a service that helps you implement and enforce fine-grained authorization on resources within the applications that you build and deploy, such as HR systems and banking applications.
  • Amazon API Gateway is a fully managed service that developers can use to create, publish, maintain, monitor, and secure APIs at any scale.
  • AWS Lambda is a serverless compute service that lets you run code without provisioning or managing servers, creating workload-aware cluster scaling logic, maintaining event integrations, or managing runtimes.
  • Amazon Cognito provides an identity store that scales to millions of users, supports social and enterprise identity federation, and offers advanced security features to protect your consumers and business.

Note: We tested this architecture in the us-east-1 AWS Region. Before you select a Region, verify that the necessary services — Amazon Verified Permissions, Amazon Cognito, API Gateway, and Lambda — are available in those Regions.

Deploy the sample architecture

From within the directory where you downloaded the sample code from GitHub, first run the following command to package the Lambda functions. Then run the next command to generate a random Cognito user password and create the resources described in the previous section.

Note: In this case, you’re generating a random user password for demonstration purposes. Follow best practices for user passwords in production implementations.

$ bash ./helper.sh package-lambda-functions
 …
Successfully completed packaging files.
$ bash ./helper.sh cf-create-stack-gen-password
 …
Successfully created CloudFormation stack.

Validate Cognito user creation

Run the following commands to open the Cognito UI in your browser and then sign in with your credentials. This validates that the previous commands created Cognito users successfully.

Note: When you run the commands, they return the username and password that you should use to sign in.

For internal user pool domain users

$ bash ./helper.sh open-cognito-internal-domain-ui
 Opening Cognito UI...
 URL: xxxxxxxxx
 Please use following credentials to login:
 Username: cognitouser
 Password: xxxxxxxx

For external user pool domain users

$ bash ./helper.sh open-cognito-external-domain-ui
 Opening Cognito UI...
 URL: xxxxxxxxx
 Please use following credentials to login:
 Username: cognitouser
 Password: xxxxxxxx

Validate Cognito JWT upon sign in

Because you haven’t installed a web application that would respond to the redirect request, Cognito will redirect to localhost, which might look like an error. The key aspect is that after a successful sign-in, there is a URL similar to the following in the navigation bar of your browser.

http://localhost/#id_token=eyJraWQiOiJicVhMYWFlaTl4aUhzTnY3W...

Test the API configuration

Before you protect the API with Cognito so that only authorized users can access it, let’s verify that the configuration is correct and API Gateway serves the API. The following command makes a curl request to API Gateway to retrieve data from the API service.

$ bash ./helper.sh curl-api

API to check the appointment details of PI-T123
URL: https://epgst74zff.execute-api.us-east-1.amazonaws.com/dev/appointment/PI-T123
Response: 
{"appointment": {"id": "PI-T123", "name": "Dave", "Pet": "Onyx - Dog. 2y 3m", "Phone Number": "+1234567", "Visit History": "Patient History from last visit with primary vet", "Assigned Veterinarian": "Jane"}}

API to check the appointment details of PI-T124
URL: https://epgst74zff.execute-api.us-east-1.amazonaws.com/dev/appointment/PI-T124
Response: 
{"appointment": {"id": "PI-T124", "name": "Joy", "Pet": "Jelly - Dog. 6y 2m", "Phone Number": "+1368728", "Visit History": "None", "Assigned Veterinarian": "Jane"}}

API to check the appointment details of PI-T125
URL: https://epgst74zff.execute-api.us-east-1.amazonaws.com/dev/appointment/PI-T125
Response: 
{"appointment": {"id": "PI-T125", "name": "Dave", "Pet": "Sassy - Cat. 1y", "Phone Number": "+1398777", "Visit History": "Patient History from last visit with primary vet", "Assigned Veterinarian": "Adam"}}

Protect the API

In the next step, you deploy a Verified Permissions policy store and a Lambda authorizer. The policy store contains the policies for user authorization. The Lambda authorizer verifies users’ access tokens and authorizes the users through Verified Permissions.

Update and create resources

Run the following command to update existing resources and create a Lambda authorizer and Verified Permissions policy store.

$ bash ./helper.sh cf-update-stack
 Successfully updated CloudFormation stack.

Test the custom authorizer setup

Begin your testing with the following request, which doesn’t include an access token.

Note: Wait for a few minutes to allow API Gateway to deploy before you run the following commands.

$ bash ./helper.sh curl-api
API to check the appointment details of PI-T123
URL: https://epgst74zff.execute-api.us-east-1.amazonaws.com/dev/appointment/PI-T123
Response: 
{"message":"Unauthorized"}

API to check the appointment details of PI-T124
URL: https://epgst74zff.execute-api.us-east-1.amazonaws.com/dev/appointment/PI-T124
Response: 
{"message":"Unauthorized"}

API to check the appointment details of PI-T125
URL: https://epgst74zff.execute-api.us-east-1.amazonaws.com/dev/appointment/PI-T125
Response: 
{"message":"Unauthorized"}

The architecture denied the request with the message “Unauthorized.” At this point, API Gateway expects a header named Authorization (case sensitive) in the request. If there’s no authorization header, API Gateway denies the request before it reaches the Lambda authorizer. This is a way to filter out requests that don’t include required information.

Use the following command for the next test. In this test, you pass the required header, but the token is invalid because it wasn’t issued by Cognito and is instead a simple JWT-format token stored in ./helper.sh. To learn more about how to decode and validate a JWT, see Decode and verify a Cognito JSON token.

$ bash ./helper.sh curl-api-invalid-token
 {"Message":"User is not authorized to access this resource"}

This time the message is different. The Lambda authorizer received the request and identified the token as invalid and responded with the message “User is not authorized to access this resource.”

To make a successful request to the protected API, your code must perform the following steps (a minimal sketch follows this list):

  1. Use a user name and password to authenticate against your Cognito user pool.
  2. Acquire the tokens (ID token, access token, and refresh token).
  3. Make an HTTPS (TLS) request to API Gateway and pass the access token in the headers.
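
The following sketch shows these three steps, assuming Python with boto3 and the standard library; the app client ID, credentials, and URL are placeholders.

import json
import urllib.request

import boto3

cognito = boto3.client("cognito-idp")

# 1. Authenticate against the Cognito user pool (placeholder values)
auth = cognito.initiate_auth(
    ClientId="YOUR_APP_CLIENT_ID",
    AuthFlow="USER_PASSWORD_AUTH",
    AuthParameters={"USERNAME": "YOUR_USERNAME", "PASSWORD": "YOUR_PASSWORD"},
)

# 2. Acquire the tokens (the result also contains IdToken and RefreshToken)
access_token = auth["AuthenticationResult"]["AccessToken"]

# 3. Call the protected API with the access token in the Authorization header
url = "https://example.execute-api.us-east-1.amazonaws.com/dev/appointment/PI-T123"
request = urllib.request.Request(url, headers={"Authorization": access_token})
with urllib.request.urlopen(request) as response:
    print(json.loads(response.read()))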

To finish testing, programmatically sign in to the Cognito UI, acquire a valid access token, and make a request to API Gateway. Run the following commands to call the protected internal and external APIs.

$ ./helper.sh curl-protected-internal-user-api

Getting API URL, Cognito Usernames, Cognito Users Password and Cognito ClientId...
User: Jane
Password: Pa%%word-2023-04-17-17-11-32
Resource: PI-T123
URL: https://16qyz501mg.execute-api.us-east-1.amazonaws.com/dev/appointment/PI-T123

Authenticating to get access_token...
Access Token: eyJraWQiOiJIaVRvckxxxxxxxxxx6BfCBKASA

Response: 
{"appointment": {"id": "PI-T123", "name": "Dave", "Pet": "Onyx - Dog. 2y 3m", "Phone Number": "+1234567", "Visit History": "Patient History from last visit with primary vet", "Assigned Veterinarian": "Jane"}}

User: Adam
Password: Pa%%word-2023-04-17-17-11-32
Resource: PI-T123
URL: https://16qyz501mg.execute-api.us-east-1.amazonaws.com/dev/appointment/PI-T123

Authenticating to get access_token...
Access Token: eyJraWQiOiJIaVRvckxxxxxxxxxx6BfCBKASA

Response: 
{"Message":"User is not authorized to access this resource"}

User: Adam
Password: Pa%%word-2023-04-17-17-11-32
Resource: PI-T125
URL: https://16qyz501mg.execute-api.us-east-1.amazonaws.com/dev/appointment/PI-T125

Authenticating to get access_token...
Access Token: eyJraWQiOiJIaVRvckxxxxxxxxxx6BfCBKASA

Response: 
{"appointment": {"id": "PI-T125", "name": "Dave", "Pet": "Sassy - Cat. 1y", "Phone Number": "+1398777", "Visit History": "Patient History from last visit with primary vet", "Assigned Veterinarian": "Adam"}}

Now calling external userpool users for accessing request

$ ./helper.sh curl-protected-external-user-api
User: Dave
Password: Pa%%word-2023-04-17-17-11-32
Resource: PI-T123
URL: https://16qyz501mg.execute-api.us-east-1.amazonaws.com/dev/appointment/PI-T123

Authenticating to get access_token...
Access Token: eyJraWQiOiJIaVRvckxxxxxxxxxx6BfCBKASA

Response: 
{"appointment": {"id": "PI-T123", "name": "Dave", "Pet": "Onyx - Dog. 2y 3m", "Phone Number": "+1234567", "Visit History": "Patient History from last visit with primary vet", "Assigned Veterinarian": "Jane"}}

User: Joy
Password Pa%%word-2023-04-17-17-11-32
Resource: PI-T123
URL: https://16qyz501mg.execute-api.us-east-1.amazonaws.com/dev/appointment/PI-T123

Authenticating to get access_token...
Access Token: eyJraWQiOiJIaVRvckxxxxxxxxxx6BfCBKASA

Response: 
{"Message":"User is not authorized to access this resource"}

User: Joy
Password Pa%%word-2023-04-17-17-11-32
Resource: PI-T124
URL: https://16qyz501mg.execute-api.us-east-1.amazonaws.com/dev/appointment/PI-T124

Authenticating to get access_token...
Access Token: eyJraWQiOiJIaVRvckxxxxxxxxxx6BfCBKASA

Response: 
{"appointment": {"id": "PI-T124", "name": "Joy", "Pet": "Jelly - Dog. 6y 2m", "Phone Number": "+1368728", "Visit History": "None", "Assigned Veterinarian": "Jane"}}

This time, you receive a response with data from the API service. Let’s recap the steps that the example code performed:

  1. The Lambda authorizer validates the access token.
  2. The Lambda authorizer uses Verified Permissions to evaluate the user’s requested actions against the policy store.
  3. The Lambda authorizer passes the IAM policy back to API Gateway.
  4. API Gateway evaluates the IAM policy, and the final effect is an allow.
  5. API Gateway forwards the request to Lambda.
  6. Lambda returns the response.

In both the internal and external tests, the architecture denied requests when the Verified Permissions policies didn’t allow access for that user. In the internal user pool, the policies only allow veterinarians to see their own patients’ data. Similarly, in the external user pool, the policies only allow clients to see their own data.

Clean up resources

Run the following command to delete the deployed resources and clean up.

$ bash ./helper.sh cf-delete-stack

Additional information

Verified Permissions is integrated with AWS CloudTrail, a service that provides a record of actions taken by a user, role, or AWS service in Verified Permissions. CloudTrail captures API calls for Verified Permissions as events. You can choose to capture actions performed on a Verified Permissions policy store by the Lambda authorizer. Verified Permissions logs can also be ingested into your security information and event management (SIEM) solution for security analysis and compliance. For information about API call quotas, see Quotas for Amazon Verified Permissions.

Conclusion

In this post, we demonstrated how you can use multiple Amazon Cognito user pools alongside Amazon Verified Permissions to build a single access layer to APIs. We used Cognito in this example, but you could implement the solution with another third-party IdP instead. As a next step, explore the Cedar playground to test policies that can be used with Verified Permissions, or expand this solution by integrating a third-party IdP.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Author

Akash Kumar

Akash is a Senior Lead Consultant at AWS, based in India. He works with customers for application development, security, and DevOps to modernize and re-architect their workloads to the AWS Cloud. His passion is building innovative solutions and automating infrastructure, enabling customers to focus more on their businesses.

Brett Seib

Brett is a Senior Solutions Architect, based in Austin, Texas. He is passionate about innovating and using technology to solve business challenges for customers. Brett has several years of experience in the enterprise, Internet of Things (IoT), and data analytics industries, accelerating customer business outcomes.

John Thach

John is a Technical Account Manager, based in Houston, Texas. He focuses on enabling customers to implement resilient, secure, and cost-effective solutions by using AWS services. He is passionate about helping customers solve unique challenges through their cloud journeys.