Streamlining Secrets Management with AWS
At Nawy, we are constantly evolving our infrastructure to improve security, automation, and developer autonomy. As a DevOps engineer, I’ve been part of the journey from our old configuration management setup to a more dynamic and developer-friendly process. In this post, I’ll walk you through how we used to manage application secrets and environment variables, and how our new approach leverages AWS Secrets Manager and custom packages to give developers more control and reduce DevOps overhead.
Previous Setup: Static Configuration with Terragrunt
In our previous setup, we deployed our applications using Amazon ECS and AWS Lambda. The infrastructure was managed using Terragrunt, a tool that helps with managing Terraform configurations.
Environment Variables and Secrets Management
In the previous setup, environment variables and secrets were passed to ECS containers and Lambdas via the Terragrunt YAML files. These files referenced the ID of the Secrets Manager secret that stored the sensitive data for each service.
- ECS Containers: Environment variables and secrets were passed at the time of container provisioning through the ECS task definition. Any change to these secrets required restarting the ECS container to pick up the updated values.
- Lambdas: Similarly, secrets for AWS Lambdas were managed through the same Terragrunt process, with the added problem that the secrets were loaded as plain-text environment variables, clearly visible in the Lambda’s console view. Changes to these secrets required reapplying the Terragrunt configuration, which would re-provision the Lambda with the new values.
[Figure: how this process worked for ECS using the Terragrunt YAML file configuration]
[Figure: the same process for Lambda]
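To make the restart requirement concrete, here is a minimal TypeScript sketch of how a service typically consumes statically injected values. The variable names (`DB_HOST`, `DB_PASSWORD`) and the `readStaticConfig` helper are hypothetical and only illustrate the pattern; `env` stands in for `process.env`.

```typescript
// Shape of the environment; in a real service this would be process.env.
type Env = Record<string, string | undefined>;

interface StaticConfig {
  readonly dbHost: string;
  readonly dbPassword: string;
}

// Read every value once at startup. Nothing re-reads the environment later,
// so a changed secret is only seen after the container or function is
// provisioned again with new values.
function readStaticConfig(env: Env): StaticConfig {
  const required = (name: string): string => {
    const value = env[name];
    if (value === undefined || value === "") {
      throw new Error(`Missing environment variable: ${name}`);
    }
    return value;
  };

  return Object.freeze({
    dbHost: required("DB_HOST"), // hypothetical variable name
    dbPassword: required("DB_PASSWORD"), // hypothetical variable name
  });
}
```

Because the configuration object is built once and frozen, later changes to the source values never reach the running process, which is why every secret update meant a restart or a re-provision.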
DevOps Intervention
In both cases, adding new environment variables or updating secrets required intervention from the DevOps team. Developers had to raise a Pull Request (PR), and once merged, the relevant ECS task definition or Lambda function would be updated with the new values.
Example: Updating ECS Secrets
- Step 1: Developer creates a PR to add a new secret or update an existing one.
- Step 2: DevOps team reviews the PR and merges it.
- Step 3: Terragrunt configuration is applied to update the ECS task definition or Lambda.
The New Setup: Dynamic Secrets Management with a Custom TypeScript Package
To enhance security, streamline the process, and reduce cross-team involvement, we have shifted to a new setup that dynamically loads secrets using a custom package. This package is written in TypeScript and is used in both our Lambda functions and ECS containers.
Key Improvements
- No More Static Environment Variables: The only environment variable passed to the Lambdas and ECS containers is the Secrets Manager ID.
- Dynamic Secret Loading: The custom package reads that ID from the environment variable and uses it to load secrets from AWS Secrets Manager dynamically during application startup.
- For Lambdas, we’ve added the AWS Parameters and Secrets Lambda Extension layer, which runs a local HTTP server inside the Lambda and keeps secrets and parameters in an in-memory cache, using the Secrets Manager ID passed to the Lambda as an environment variable in the Terragrunt file. The Lambda reads its secrets from this server on every run, and the server refetches them from Secrets Manager once a cached entry expires, so the Lambda picks up updated secrets at runtime without redeployment.
- For ECS containers, secrets are fetched at initial boot, but they can also be updated in real time by calling the package’s method if needed.
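The core idea behind these points can be sketched in a few lines of TypeScript. This is a minimal illustration, not the package itself: the variable name `SECRETS_MANAGER_ID` is hypothetical, and the sketch assumes each secret is stored as a flat JSON object of key/value pairs.

```typescript
// Shape of the environment; in a real service this would be process.env.
type Env = Record<string, string | undefined>;

// Hypothetical name for the single variable the service still receives.
const SECRET_ID_VAR = "SECRETS_MANAGER_ID";

// Resolve the Secrets Manager ID, failing loudly if it was not injected.
function resolveSecretId(env: Env): string {
  const id = env[SECRET_ID_VAR];
  if (!id) {
    throw new Error(`${SECRET_ID_VAR} is not set`);
  }
  return id;
}

// Turn a secret payload into a key/value map.
// Assumption: the secret string is a flat JSON object.
function parseSecretString(secretString: string): Record<string, string> {
  const parsed: unknown = JSON.parse(secretString);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Secret must be a JSON object of key/value pairs");
  }
  const source = parsed as Record<string, unknown>;
  const result: Record<string, string> = {};
  for (const key of Object.keys(source)) {
    // Normalize numbers and booleans to strings, like environment variables.
    result[key] = String(source[key]);
  }
  return result;
}
```

Normalizing every value to a string keeps the result interchangeable with the environment variables it replaces, so application code that used to read `process.env` needs few changes.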
Example: Lambda with AWS Parameters and Secrets Lambda Extension
The custom TypeScript package is built and pushed to AWS CodeArtifact for use by the Lambdas.
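As a rough sketch of what the Lambda side of such a package might look like, the snippet below builds the request that the AWS Parameters and Secrets Lambda Extension expects: a GET against its local endpoint (port 2773 unless `PARAMETERS_SECRETS_EXTENSION_HTTP_PORT` overrides it), authenticated with the function's `AWS_SESSION_TOKEN` in the `X-Aws-Parameters-Secrets-Token` header. The function names and the injected `HttpGet` type are hypothetical; they are not the real package's API.

```typescript
// Shape of the environment; in a Lambda this would be process.env.
type Env = Record<string, string | undefined>;

// Minimal HTTP abstraction so the sketch does not depend on a specific client.
// On a Node.js 18+ runtime this could be a thin wrapper around the global fetch.
type HttpGet = (
  url: string,
  headers: Record<string, string>,
) => Promise<{ status: number; body: string }>;

// Build the request for the extension's local Secrets Manager endpoint.
function buildExtensionRequest(
  secretId: string,
  env: Env,
): { url: string; headers: Record<string, string> } {
  const port = env["PARAMETERS_SECRETS_EXTENSION_HTTP_PORT"] || "2773";
  const token = env["AWS_SESSION_TOKEN"];
  if (!token) {
    throw new Error("AWS_SESSION_TOKEN is not set");
  }
  return {
    url: `http://localhost:${port}/secretsmanager/get?secretId=${encodeURIComponent(secretId)}`,
    headers: { "X-Aws-Parameters-Secrets-Token": token },
  };
}

// Fetch the secret through the extension and return its key/value pairs.
// Assumption: the secret is stored as a JSON object in SecretString.
async function getSecretViaExtension(
  secretId: string,
  env: Env,
  httpGet: HttpGet,
): Promise<Record<string, string>> {
  const { url, headers } = buildExtensionRequest(secretId, env);
  const response = await httpGet(url, headers);
  if (response.status !== 200) {
    throw new Error(`Extension returned HTTP ${response.status}`);
  }
  // The extension returns the same JSON shape as the GetSecretValue API.
  const payload = JSON.parse(response.body) as { SecretString?: string };
  if (typeof payload.SecretString !== "string") {
    throw new Error("Secret has no SecretString");
  }
  return JSON.parse(payload.SecretString) as Record<string, string>;
}
```

Injecting the HTTP function keeps the sketch testable outside Lambda, where the extension's local server is not available.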
Here’s how the new setup works for Lambdas:
The AWS Parameters and Secrets Lambda Extension runs a local server inside the Lambda, which the custom package connects to in order to retrieve secrets. The server caches the secrets for a configurable duration and fetches new values from Secrets Manager once the cached entry expires. The Lambda no longer requires static environment variables to be passed through Terragrunt. Instead, secrets are fetched dynamically during execution through the local server managed by the extension.
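The caching behaviour can be pictured with a small time-based cache. The sketch below is a conceptual model only, not the extension's implementation; the `TtlCache` class and its parameters are hypothetical. For reference, AWS documents a default cache lifetime of 300 seconds for the extension, configurable through `SECRETS_MANAGER_TTL`.

```typescript
// Conceptual model of a time-to-live cache in front of a slow lookup.
class TtlCache<T> {
  private entry: { value: T; expiresAt: number } | undefined = undefined;
  private readonly load: () => Promise<T>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(
    load: () => Promise<T>,
    ttlMs: number,
    now: () => number = () => Date.now(), // injectable clock for testing
  ) {
    this.load = load;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value while it is fresh; reload it once it has expired.
  async get(): Promise<T> {
    const current = this.entry;
    if (current !== undefined && this.now() < current.expiresAt) {
      return current.value;
    }
    const value = await this.load();
    this.entry = { value, expiresAt: this.now() + this.ttlMs };
    return value;
  }
}
```

The trade-off is visible in the `ttlMs` parameter: a longer lifetime means fewer calls to Secrets Manager, while a shorter one means a rotated secret is picked up sooner.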
The environments.ts file inside the Lambda’s source code imports the custom package and exposes a method that loads secrets as the developer needs them.
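A file like this might look roughly as follows. Since the package's real name and API are not shown here, the `SecretsClient` interface stands in for it, and the `Environment` class and its `get` method are hypothetical.

```typescript
// Stand-in for the custom package's API (hypothetical names).
interface SecretsClient {
  getSecret(secretId: string): Promise<Record<string, string>>;
}

// Wraps the client so application code asks for a key, not for a whole secret.
class Environment {
  private readonly client: SecretsClient;
  private readonly secretId: string;

  constructor(client: SecretsClient, secretId: string) {
    this.client = client;
    this.secretId = secretId;
  }

  // Load the secret on demand and return one key.
  // No local caching here: in Lambda the extension already caches for us.
  async get(key: string): Promise<string> {
    const secrets = await this.client.getSecret(this.secretId);
    const value = secrets[key];
    if (value === undefined) {
      throw new Error(`Secret key not found: ${key}`);
    }
    return value;
  }
}
```

A caller would then write something like `await environment.get("DB_PASSWORD")` wherever the value is needed, instead of reading `process.env.DB_PASSWORD` at module load.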
The index.ts file uses that method, so each Lambda execution reads its values through the extension and picks up changes from Secrets Manager without needing redeployment through Terragrunt.
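The important detail in a handler like this is that secrets are loaded inside the invocation rather than once at module load. Here is a minimal sketch under that assumption; `createHandler`, `LoadSecrets`, and the `API_KEY` key are hypothetical names used only for illustration.

```typescript
// Function that returns the current secrets, e.g. backed by the extension.
type LoadSecrets = () => Promise<Record<string, string>>;

interface HandlerResult {
  statusCode: number;
  body: string;
}

// Build a Lambda-style handler that loads secrets on every invocation.
function createHandler(loadSecrets: LoadSecrets) {
  return async function handler(_event: unknown): Promise<HandlerResult> {
    // Loading here (not at module scope) lets a warm Lambda see new values.
    const secrets = await loadSecrets();
    const apiKey = secrets["API_KEY"]; // hypothetical key
    if (!apiKey) {
      return { statusCode: 500, body: "API_KEY is not configured" };
    }
    // Use apiKey to call a downstream service; never log or return it.
    return { statusCode: 200, body: "ok" };
  };
}
```

Had the secrets been read at module scope, a warm Lambda would keep the values from its first cold start, which is the behaviour the new setup is meant to avoid.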
Example: ECS Container with On-Demand Secret Fetching
The same custom TypeScript package, built and pushed to AWS CodeArtifact, is used by our applications running on ECS.
For ECS containers, the initial secrets are fetched during startup. However, if there is a critical need to update a secret (e.g., database credentials), the application can call the package’s function to get the current value from AWS Secrets Manager.
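One way to picture this boot-then-refresh pattern is a small in-memory store. The sketch below is illustrative only: `SecretStore`, `withFreshCredentials`, and the `DB_PASSWORD` key are hypothetical, and `fetchSecrets` stands in for whatever call the package makes to Secrets Manager (for example `GetSecretValue` through the AWS SDK).

```typescript
// Function that fetches the current secrets from Secrets Manager.
type FetchSecrets = () => Promise<Record<string, string>>;

class SecretStore {
  private secrets: Record<string, string> = {};
  private readonly fetchSecrets: FetchSecrets;

  constructor(fetchSecrets: FetchSecrets) {
    this.fetchSecrets = fetchSecrets;
  }

  // Called once at boot, and again whenever current values are needed.
  async refresh(): Promise<void> {
    this.secrets = await this.fetchSecrets();
  }

  // Synchronous read of the values loaded by the last refresh.
  get(key: string): string {
    const value = this.secrets[key];
    if (value === undefined) {
      throw new Error(`Secret key not loaded: ${key}`);
    }
    return value;
  }
}

// Example use: retry an operation once with refreshed credentials.
// Simplification: any failure is treated as possibly stale credentials;
// real code would inspect the error before retrying.
async function withFreshCredentials<T>(
  store: SecretStore,
  attempt: (password: string) => Promise<T>,
): Promise<T> {
  try {
    return await attempt(store.get("DB_PASSWORD"));
  } catch (error) {
    await store.refresh();
    return attempt(store.get("DB_PASSWORD"));
  }
}
```

With this shape, a rotated database password no longer forces a container restart: the first failed attempt triggers a refresh, and the retry runs with the current value.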
Benefits of the New Setup
- Reduced Cross-team Intervention: Developers no longer need to submit PRs for every secret update. They can manage secrets directly in AWS Secrets Manager and load them in code; at most, picking up a change takes a container restart or a new Lambda run.
- Improved Security: Secrets are never stored as static environment variables. They are fetched at runtime using the custom package, reducing the attack surface.
- Flexibility and Agility: With the ability to fetch secrets on demand, our applications are more resilient to changes in the infrastructure, allowing for smoother operations.
Conclusion
The new configuration management setup at Nawy significantly reduces the manual steps involved in managing secrets and environment variables, while empowering developers to take full control of their application configuration. By adopting this dynamic approach, we have enhanced our security posture and made the deployment process more efficient.
If you’re facing similar challenges in your own infrastructure, consider this approach to reduce overhead and give developers the flexibility they need to move fast without compromising on security.
Stay tuned for more technical insights from our team!