Jun 21, 2021

The Fault in Our Stars

If you read this article and find out that an API you develop or use was vulnerable because of the incorrect use of stars we described in this blog post, we would love to hear from you. Please click here and answer this quick anonymous poll so we can find out how prevalent this issue is.

Serverless applications are gaining a lot of popularity, as some independent reports indicate. This is a consequence of a larger trend, where companies let specialists like cloud service providers or SaaS offerings take care of the low-level infrastructure, scalability and, to a certain extent, the security of increasing parts of their applications.

Werner Vogels verbalized the end goal very clearly (source) at the 2017 edition of AWS re:Invent:

Developer time is almost certainly the scarcest and most valuable resource at any organization that is developing applications. We believe that allowing developers to focus on the parts of that application that actually deliver business value is by far the biggest advantage offered by serverless architectures.

Developers can now abstract away many low-level aspects of their application's infrastructure. But they still need to understand the serverless components they are using, including how to properly configure them to meet availability, scalability, cost, security and compliance requirements.

If you are writing serverless applications in AWS, there are a variety of frameworks and toolkits you can use. Some of those options are writing CloudFormation by hand (a.k.a "roughing it"), the CDK, SAM, Chalice or the widely popular Serverless Framework.

What many serverless applications will end up using regardless of their choice of framework, however, is the AWS API Gateway. In this blog post we'll discuss one of the features of AWS API Gateway, called lambda authorizers, and how the official AWS documentation might have led developers into using it insecurely.

An Introduction to API Gateway and Lambda Authorizers

(If you are already familiar with how API Gateway and lambda authorizers work, feel free to skip to the next section.)

The following diagram from the AWS API Gateway product page does a good job of explaining how the service works:

In traditional networking terms, you can think of API Gateway as a scalable and serverless reverse proxy service that can route HTTP/HTTPS requests to different implementation back-ends. The most frequent architecture we've seen in the wild is using API Gateway to route REST API endpoints to Lambda functions which implement their logic.

Besides doing this routing, API Gateway also provides other useful features such as monitoring and observability using CloudWatch Logs or X-Ray, detecting and blocking attacks using AWS WAF, and access control.

The native AWS offering that allows API Gateway to fully implement authentication and authorization is Amazon Cognito. We won't discuss it in detail here, but we highly recommend you read this incredible piece of research on its security by our good friend Andrés Riancho.

In order to give developers greater flexibility in how they choose to implement authentication and authorization into their APIs, API Gateway has a feature called lambda authorizers. This diagram from the AWS documentation provides a good overview:

In short, you write a special lambda function that is not used to fulfill any REST API endpoint request. Instead, this is called by API Gateway when a user tries to call an endpoint that is marked as requiring authentication.

The lambda authorizer then processes the request looking for whatever authentication tokens it expects, typically in the Authorization HTTP header. This can be, for example, a JWT or OAuth bearer token provided by an IdP or CIAM solution like Auth0.

The lambda authorizer validates the authentication tokens and, if successful, returns a JSON object that identifies the authenticated user, any application-specific metadata associated with that user, and an AWS IAM policy that controls which endpoints the user can call. This object has the following structure:
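The figure that originally illustrated this structure is not reproduced here. As a minimal sketch of the shape of such a response, the helper below builds one as a Python dictionary. The helper name and the sample values are hypothetical; the top-level keys (principalId, policyDocument, context) follow the authorizer output format described above.

```python
from typing import Any, Dict, List, Optional


def build_authorizer_response(
    principal_id: str,
    effect: str,
    resources: List[str],
    context: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a lambda authorizer response (hypothetical helper, for illustration)."""
    if effect not in ("Allow", "Deny"):
        raise ValueError("effect must be 'Allow' or 'Deny'")
    response: Dict[str, Any] = {
        # Identifies the authenticated user.
        "principalId": principal_id,
        # IAM policy controlling which endpoints the user can call.
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": resources,
                }
            ],
        },
    }
    if context:
        # Application-specific metadata; values should be flat scalars
        # (strings, numbers, booleans), not nested objects or arrays.
        response["context"] = context
    return response


example = build_authorizer_response(
    "user-123",  # sample value, not from the article
    "Allow",
    ["arn:aws:execute-api:us-west-1:12345678:myApiId/test/GET/foo/bar/"],
    {"role": "admin"},  # hypothetical metadata key
)
```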

This data is then passed to the lambda functions implementing the authenticated API endpoints, so they can base their authorization decisions on it:
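As a rough illustration, and assuming a REST API that uses Lambda proxy integration, the authorizer output typically shows up under requestContext.authorizer in the event received by the endpoint lambda. The sketch below reads it from there; the "role" key and the "admin" value are hypothetical metadata chosen for this example.

```python
import json
from typing import Any, Dict


def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Endpoint lambda that re-checks authorization using authorizer data."""
    # Data returned by the lambda authorizer (principalId plus context keys).
    authorizer = (event.get("requestContext") or {}).get("authorizer") or {}
    principal = authorizer.get("principalId")
    role = authorizer.get("role")  # hypothetical context key

    # Application-level authorization decision based on that data.
    if principal is None or role != "admin":
        return {"statusCode": 403, "body": json.dumps({"message": "Forbidden"})}
    return {"statusCode": 200, "body": json.dumps({"user": principal})}
```

Checking authorization again inside the endpoint lambda, as done here, also means the authorizer policy is not the only line of defense.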

The policyDocument Property and the IAM Policy Format Wildcards

The part of the lambda authorizer output that we focused on is the policyDocument property. It allows the lambda authorizer to specify which API endpoints this user can or cannot access. This way, the service itself can reject requests to prohibited API endpoints immediately, without even executing the lambdas that implement them. This is not only faster, but also slightly cheaper and more scalable.

The team developing API Gateway could have chosen a number of ways to represent the list of allowed or denied endpoints. Typically reverse proxy and WAF solutions allow for regular expressions to specify allowed or denied paths. This is considered necessary to properly accommodate the complexity of API paths, particularly when path parameters are used.

Instead, API Gateway decided to maximize code reuse and require that this policy be in AWS IAM policy format. The way this is encoded is that the policyDocument object will contain an IAM policy that allows or denies execution of an execute-api action on specially formatted resources that represent each API endpoint:

Using the IAM policy format certainly had many advantages, such as leveraging a mature and scalable infrastructure already available to the API Gateway service, which is reportedly handling over 400 million requests per second worldwide. This choice is not without its downsides, though. The overwhelming majority of developers are not AWS security experts, and thus don't have a prior intuitive understanding of how IAM policies work. Also, the format is rather inflexible in how strings are matched, most notably providing only the simple wildcards "?" and "*" and not regular expressions.

The ARNs in IAM policies have several different parts separated by colons:

It is particularly interesting to note that the execute-api ARN format packs the API ID, API stage, HTTP verb, and the API resource (a.k.a the URL path) all into the last part of the ARN. They are separated by slashes, as if they were forming a single logical path in an imaginary file system.
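To make the layout concrete, here is a small sketch that splits an execute-api ARN into the pieces just described. The class and function names are made up for this illustration; the sample ARN is one of those used later in this post.

```python
from typing import NamedTuple


class ExecuteApiArn(NamedTuple):
    partition: str
    region: str
    account_id: str
    api_id: str
    stage: str
    http_verb: str
    resource_path: str


def parse_execute_api_arn(arn: str) -> ExecuteApiArn:
    """Split an execute-api ARN into its colon- and slash-separated pieces."""
    # Split on the first five colons only; the rest is the last part.
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "execute-api":
        raise ValueError(f"not an execute-api ARN: {arn!r}")
    _, partition, _, region, account_id, last_part = parts

    # The last part packs API ID, stage, verb and URL path, slash-separated.
    pieces = last_part.split("/", 3)
    if len(pieces) < 3:
        raise ValueError(f"unexpected last part: {last_part!r}")
    api_id, stage, http_verb = pieces[:3]
    resource_path = pieces[3] if len(pieces) == 4 else ""
    return ExecuteApiArn(
        partition, region, account_id, api_id, stage, http_verb, resource_path
    )


parsed = parse_execute_api_arn(
    "arn:aws:execute-api:us-west-1:12345678:myApiId/test/GET/foo/bar/"
)
```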

One of the characteristics of how wildcards work in resource matching on IAM policies is that stars ("*") are greedy and will expand as much as possible inside each of the colon-separated parts of the ARN. It is the equivalent of ".*" in POSIX regular expressions. Crucially, the expansion will not stop at the slashes: any "*" is free to expand its matching into the entire last part of the execute-api ARN. Keep this fact in mind.
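The behavior described above can be approximated with a few lines of Python. This is a simplified model written for this post, not the actual AWS implementation: it treats "*" as ".*" and "?" as "." within each colon-separated part and compares case-sensitively. The function names are hypothetical.

```python
import re


def _segment_to_regex(segment: str) -> str:
    """Translate IAM-style wildcards in one ARN part into a regular expression."""
    out = []
    for ch in segment:
        if ch == "*":
            out.append(".*")  # a star does not stop at slashes
        elif ch == "?":
            out.append(".")   # exactly one character
        else:
            out.append(re.escape(ch))
    return "".join(out)


def resource_matches(pattern: str, arn: str) -> bool:
    """Approximate match of a policy Resource pattern against a concrete ARN."""
    if pattern == "*":
        return True
    pattern_parts = pattern.split(":", 5)
    arn_parts = arn.split(":", 5)
    if len(pattern_parts) != 6 or len(arn_parts) != 6:
        return False
    # Each colon-separated part is matched on its own, case-sensitively.
    return all(
        re.fullmatch(_segment_to_regex(p), a, flags=re.DOTALL) is not None
        for p, a in zip(pattern_parts, arn_parts)
    )
```

For instance, under this model the pattern "arn:aws:execute-api:*:*:myApiId/*/GET/foo" matches an ARN whose last part is "myApiId/a/b/c/GET/foo", because the star swallows the slashes.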

One other relevant aspect is that apparently each AWS service decides whether matching the resource part of ARNs is case sensitive or case insensitive. Our testing indicates that API Gateway does its matching in a case sensitive way.

AWS Documentation - You Know What They Say About Assumptions

During a security review of a serverless API, our team was reading the official API Gateway documentation and came across this example:

The policy does allow any call associated with a stage called "test", sure. But it does a lot more than that. It also allows access to any API endpoint which contains "/test/" as a part of its URL.

All of the following ARNs would be allowed by the example policy, whereas the text implies only the first one would:

arn:aws:execute-api:us-west-1:12345678:myApiId/test/GET/foo/bar/
arn:aws:execute-api:us-west-1:12345678:myApiId/myStage/GET/foo/bar/test/hello/world
arn:aws:execute-api:us-west-1:12345678:myApiId/myStage/GET/foo/bar/test/
arn:aws:execute-api:us-west-1:12345678:myApiId/myStage/GET/test/hello/world

In each of these ARNs, the first star in the last part of the policy Resource string ("arn:aws:execute-api:*:*:*/test/*") matches everything that comes before "/test/", and the second star matches everything that comes after it.

Remember that stars also match the empty string, which is what happens with the second star in the third ARN.
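To see which portion of each ARN each star absorbs, the sketch below rewrites the last part of the policy Resource string ("*/test/*") as a regular expression with two capture groups and applies it to the four ARNs listed above. It is only an illustration of the matching, under the assumption that "*" behaves like ".*"; the function name is made up.

```python
import re
from typing import Optional, Tuple

# "*/test/*" with each star turned into a capturing ".*".
POLICY_LAST_PART = re.compile(r"(.*)/test/(.*)", flags=re.DOTALL)

ARNS = [
    "arn:aws:execute-api:us-west-1:12345678:myApiId/test/GET/foo/bar/",
    "arn:aws:execute-api:us-west-1:12345678:myApiId/myStage/GET/foo/bar/test/hello/world",
    "arn:aws:execute-api:us-west-1:12345678:myApiId/myStage/GET/foo/bar/test/",
    "arn:aws:execute-api:us-west-1:12345678:myApiId/myStage/GET/test/hello/world",
]


def star_captures(arn: str) -> Optional[Tuple[str, str]]:
    """Return the text matched by the first and second star, or None if no match."""
    last_part = arn.split(":", 5)[5]
    match = POLICY_LAST_PART.fullmatch(last_part)
    return (match.group(1), match.group(2)) if match else None


for arn in ARNS:
    print(arn, "->", star_captures(arn))
```

Only in the first ARN does "test" sit in the stage position; in the other three it is part of the URL path, yet all four match.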

Down on the same page, we saw the documentation claim that the API ID, stage name and HTTP verb can be easily replaced by "*" to mean any value for that part of the resource alone:

Using the HTTP verb as an asterisk is particularly dangerous. Imagine an API that needs to allow a user to access any HTTP verb on a single endpoint "/business/op". The developer might be tempted to write the policyDocument as follows:

All of the following ARNs would be allowed by the example policy, but the sixth one most likely would not be expected by a developer who read this part of the documentation.

arn:aws:execute-api:us-west-1:12345678:myApiId/myStage/GET/business/op
arn:aws:execute-api:us-west-1:12345678:myApiId/myStage/PATCH/business/op
arn:aws:execute-api:us-west-1:12345678:myApiId/myStage/DELETE/business/op
arn:aws:execute-api:us-west-1:12345678:myApiId/myStage/PUT/business/op
arn:aws:execute-api:us-west-1:12345678:myApiId/myStage/POST/business/op
arn:aws:execute-api:us-west-1:12345678:myApiId/myStage/GET/foo/bar/business/op

Using a star for the HTTP verb means that the endpoint resource is no longer anchored to the beginning of the URL path.
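The difference between a star in the verb position and an explicit list of verbs can be sketched as follows. The resource strings here are assumptions made for illustration (only the last part of the ARN is shown), and Python's fnmatch.fnmatchcase is used as a stand-in for the matching, since its "*" also crosses slashes and it compares case-sensitively. Note that fnmatch additionally gives "[" a special meaning, which these patterns do not use.

```python
from fnmatch import fnmatchcase
from typing import Iterable

# Assumed policy with a star in the HTTP verb position.
WILDCARD_VERB = ["myApiId/myStage/*/business/op"]

# Alternative: one resource string per allowed HTTP verb, no stars.
EXPLICIT_VERBS = [
    f"myApiId/myStage/{verb}/business/op"
    for verb in ("GET", "PATCH", "DELETE", "PUT", "POST")
]


def is_allowed(last_part: str, patterns: Iterable[str]) -> bool:
    """True if any pattern matches the last part of the requested ARN."""
    return any(fnmatchcase(last_part, pattern) for pattern in patterns)


intended = "myApiId/myStage/POST/business/op"
unintended = "myApiId/myStage/GET/foo/bar/business/op"

print(is_allowed(intended, WILDCARD_VERB), is_allowed(unintended, WILDCARD_VERB))
print(is_allowed(intended, EXPLICIT_VERBS), is_allowed(unintended, EXPLICIT_VERBS))
```

With the wildcard verb both requests are allowed; with the explicit list only the intended one is.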

This documentation page is not an outlier, though. Here is another example:

Arguably, though, this last example is not as dangerous from a privilege escalation perspective, since it's a Deny instead of an Allow.

Finally, it's worth noting that AWS has a public GitHub repository with lambda authorizer blueprints, so customers don't need to start from scratch and can customize this starting code for their own needs. In keeping with the documentation, the code provided "*" as the default value for many of the parameters, and the code comments and docs similarly stated that any of them could individually be replaced by stars.

It seems like the team writing the documentation was working under the mistaken assumption that the stars would stop expanding at the slashes, as if the regular expression equivalent were "[^/]*" instead of ".*".

What this means in practice is that developers who followed those guidelines could be writing policies that allow many more operations than they originally intended. Particularly in cases where the lambda authorizer policy is the only way in which an API is doing authorization, this could allow an authenticated attacker to access API endpoints that the developers never intended to expose to them.

Exploiting this will be easiest if the attacker has access to the lambda authorizer code. That would be the case for open source applications, or those whose code was obtained in a previous step of a longer attack chain. This would allow the attacker to predict where stars are used in the lambda authorizer policyDocument field.

Also, APIs that have path parameters that the attacker has some control over would be particularly amenable to this. If you think of the first documentation policy presented in this section, any endpoints where a path parameter could be set to "test" would be allowed.

Responsible Disclosure

We reported these findings to AWS and held this blog post until they confirmed that, to their satisfaction, everything in their documentation and samples had been corrected. Our initial report was on April 26th, 2021, and the interactions went on until June 1st.

These are the links for the issues we reported and the links to the commits or PRs where we observed fixes and updates being implemented:

It's worth noting that the AWS Console continues to use asterisks for the API ID and API stage when displaying information about endpoints and lambda associations on APIs. AWS decided not to change that, with the following official response:

The "*" showing from your Lambda trigger is not an issue and will not impact the customers, because triggers in the console is merely a visual abstraction to make management of Lambda event sources and understanding Lambda permissions easier. Trigger information are based on Lambda function's resource policy, and you are in control of this policy--it is not associated with the authorization decisions made by the authorizer.

In your first screenshot, resources have a wildcard for the stage because they have no association to specific stages before they are deployed to one. This does not indicate what is or what isn't allowed as far as invocation on the API Gateway. Once the resource is deployed to a stage, the ARN will be changed to a new one with specific stage name that replaces the asterisk, and the new ARN will be evaluated as such when making authorization decisions. The documentation on wildcarding has been revised, and the Blueprints / Lambda console are updated, so we believe that this behavior is not misleading.

We want to thank the AWS Security and API Gateway teams for their professionalism and quick response to these concerns, and in particular Mark Ryland for his support and guidance through the entire process.

Recommendations and Next Steps

These are a few of our recommendations in decreasing order of priority to ensure your serverless APIs in AWS are not vulnerable to the problems described in this blog post:

Review the use of stars in the policyDocument object. The rule of thumb is that if a star is used at all in the last part of the ARN, it should be in the form of a "/*" at the very end of the resource string (e.g., "arn:aws:execute-api:us-west-1:12345678:myApiId/test/GET/foo/bar/*"). You can obtain the API ID, stage name and HTTP method dynamically from the input provided to the lambda authorizer. Create one resource string in the policy for each allowed HTTP method.
Consider adding Deny statements that help limit the impact or scope of star expansions on the policyDocument. Remember that AWS IAM always gives precedence to Deny over Allow if multiple statements match an operation.
Whenever feasible, use defense in depth and check again that the user is authorized to call an endpoint in the lambda that implements it. Don't rely on the lambda authorizer policy as your only method of authorization unless you are sure you can do it securely.
Make sure any code imported from the previous version of the lambda authorizer blueprints is updated to the latest version.
If you use URL path parameters in your APIs, avoid cases where the valid values expected to be submitted to them can be chosen by potential attackers. Prefer backend-generated IDs instead of user-chosen names for entities, for example.
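The first two recommendations can be sketched in code. The helper below is a hypothetical example, not the AWS blueprint: it takes the methodArn value from the lambda authorizer input, reuses its API ID and stage, emits one Allow resource per (HTTP method, path) pair, rejects wildcards anywhere except a trailing "/*", and optionally adds a Deny statement.

```python
from typing import Any, Dict, Iterable, List, Tuple

Endpoint = Tuple[str, str]  # (HTTP method, resource path)


def _check_allow_endpoint(method: str, path: str) -> None:
    """Reject wildcards except a single trailing "/*" on the path."""
    if "*" in method or "?" in method:
        raise ValueError("use one resource string per HTTP method, not a wildcard")
    body = path[:-2] if path.endswith("/*") else path
    if "*" in body or "?" in body:
        raise ValueError(f"wildcard is only accepted as a trailing '/*': {path!r}")


def build_policy(
    method_arn: str,
    allowed: Iterable[Endpoint],
    denied: Iterable[Endpoint] = (),
) -> Dict[str, Any]:
    """Build a policyDocument anchored to the API ID and stage of method_arn."""
    parts = method_arn.split(":", 5)
    if len(parts) != 6:
        raise ValueError(f"unexpected methodArn: {method_arn!r}")
    prefix = ":".join(parts[:5])
    api_id, stage = parts[5].split("/")[:2]

    def to_arn(method: str, path: str) -> str:
        return f"{prefix}:{api_id}/{stage}/{method}/{path.lstrip('/')}"

    allow_resources: List[str] = []
    for method, path in allowed:
        _check_allow_endpoint(method, path)
        allow_resources.append(to_arn(method, path))

    statements: List[Dict[str, Any]] = [
        {"Action": "execute-api:Invoke", "Effect": "Allow", "Resource": allow_resources}
    ]
    deny_resources = [to_arn(method, path) for method, path in denied]
    if deny_resources:
        # Deny takes precedence over Allow, so broad patterns here only over-restrict.
        statements.append(
            {"Action": "execute-api:Invoke", "Effect": "Deny", "Resource": deny_resources}
        )
    return {"Version": "2012-10-17", "Statement": statements}
```

In an authorizer handler this might be called as build_policy(event["methodArn"], [("GET", "/business/op"), ("POST", "/business/op")]), with the result placed in the policyDocument property of the response.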

We will continue to research this matter and post more on this issue with any relevant findings. We truly believe that this kind of unintended star expansion affects more than just the policy documents returned by lambda authorizers, and we might investigate and publish more on other services in the future.

Research and blog post by Alexandre Sieira and Leonardo Viveiros.

Our thanks to Alison Vilela, Lucas Ferreira and Felipe Espósito from the Tenchi Security team for feedback and support.

This article was featured on the following security sites and newsletters:

Off-by-None: https://offbynone.io/issues/147/
tl;dr sec: https://tldrsec.com/blog/tldr-sec-089/
CloudSecList: https://cloudseclist.com/issues/issue-93/
Serverless Status: https://serverless.email/issues/206
APIsecurity.io: https://apisecurity.io/issue-141-api-vulnerabilities-veryfitpro-gettr-aws-lambda-authorizers-update-asyncapi/
