
thin-egress-app's People

Contributors

abarciauskas-bgse, bbuechler, benbart, bilts, dpflowersjr, eigenbeam, ifestus, jeffersonwhite, jkovarik, jlrine2, krobin10, markdboyd, mattp0, mckadesorensen, nemreid, npauzenga, pyup-bot, reweeden, snyk-bot, yuvipanda


thin-egress-app's Issues

Consider NOT throwing 500, it gets hijacked by the CloudFront 50X error handling.

log.error(f'ClientError while {user_id} tried downloading {bucket}/{filename}: {e}')
cumulus_log_message('failure', 500, 'GET', {'reason': 'ClientError', 's3': f'{bucket}/{filename}'})
template_vars = {'contentstring': 'There was a problem accessing download data.', 'title': 'Data Not Available'}
headers = {}
return make_html_response(template_vars, headers, 500, 'error.html')

It's not a HUGE deal, and if you have a CloudFront error page configured, you see that. But it does mask the error a bit: you see the CloudFront error page (image) instead of the actual error.

[rain-api-core] Allow multi-level object-prefix configurations

https://github.com/asfadmin/rain-api-core/blob/5cc26ae45c50ef038b30c3a9c0b5fd931a02361e/rain_api_core/egress_util.py#L152-L182

PRIVATE_BUCKETS:
    nsidc-cumulus-uat-protected:
      - ICESat-II Cloud Early Access UAT
      - Staff UAT
    nsidc-cumulus-uat-protected/ATLAS:
      - ICESat-II Cloud Early Access UAT
    nsidc-cumulus-uat-protected/ATLAS/ATL06:
      - Staff UAT
      - ICESat-II Cloud Early Access UAT
    nsidc-cumulus-uat-protected/ATLAS/ATL03:
      - Staff UAT

Right now, ONLY the first object prefix is evaluated. We should create a better methodology that allows arbitrary depth prefixes.
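One possible approach is a longest-prefix match over the PRIVATE_BUCKETS keys. The sketch below is not the rain-api-core implementation; the function name `find_required_groups` is hypothetical, and it assumes prefixes are matched on whole path segments.

```python
from typing import Dict, List, Optional


def find_required_groups(private_buckets: Dict[str, List[str]],
                         bucket: str,
                         object_name: str) -> Optional[List[str]]:
    """Return the groups of the longest configured key matching the object.

    Hypothetical helper: keys are "bucket" or "bucket/prefix/..." strings,
    and a key matches only on whole path segments (an assumption).
    """
    target = f"{bucket}/{object_name}"
    best_key = None
    for key in private_buckets:
        # "bucket/ATLAS" matches "bucket/ATLAS/x" but not "bucket/ATLASX/x".
        if target.startswith(key.rstrip("/") + "/"):
            if best_key is None or len(key) > len(best_key):
                best_key = key
    return private_buckets[best_key] if best_key is not None else None
```

With the config above, an object under ATLAS/ATL03 would resolve to the ATL03 block rather than to the bucket-level or ATLAS-level block, regardless of the order in which the blocks are listed.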

UpdatePolicyLambda timeouts during CloudFormation stack creation

This is deployed to us-west-2. It is part of a deployment of https://github.com/nasa/cumulus using TEA version 102.

We have not previously deployed this to us-west-2, but deploying to us-east-1 worked in April. Deployment to us-east-1 no longer works because of #386.

The CloudFormation stack fails creation with these relevant events (in chronological order) for the TriggerInRegionCIDRUpdate component:

(1)

CloudFormation did not receive a response from your Custom Resource. Please check your logs for requestId [05d47ac7-1b99-45b0-8353-246973729264]. If you are using the Python cfn-response module, you may need to update your Lambda function code so that CloudFormation can attach the updated version.

(2)

The following resource(s) failed to create: [TriggerInRegionCIDRUpdate]. Rollback requested by user.

(3)

CloudFormation did not receive a response from your Custom Resource. Please check your logs for requestId [8b32a2cb-57a4-4f0c-a192-72c298050321]. If you are using the Python cfn-response module, you may need to update your Lambda function code so that CloudFormation can attach the updated version.

(4)

The following resource(s) failed to delete: [TriggerInRegionCIDRUpdate]. 

In the logs for the TriggerInRegionCIDRUpdate lambda, this appears to be the problem:

START ...
Current reigon in us-west-2
END ...
Task timed out after 6.01 seconds

There are no other log messages around this.

Allow bucket map to specify bucket&prefix.

Can we create a method by which we can define URI path by bucket & object prefix allowing greater flexibility for partitioning access to buckets? I imagine it would look something like this scenario:

In the example below:

  • https://<tea>/pathA/prefix1/object_1.ext => s3://bucket-name1/prefix1/object_1.ext would require a user to be in URS group access_group_a
  • https://<tea>/pathB/prefix2/object_2.ext => s3://bucket-name1/prefix2/object_2.ext would require a user to be in URS group access_group_b
  • https://<tea>/pathC/some_file.ext => s3://bucket-name2/some_file.ext would require only simple URS auth
  • https://<tea>/pathC/browse/image.jpg => s3://bucket-name2/browse/image.jpg would be publicly accessible

Questions to be resolved:

  • For Accessing s3://bucket-name1/prefix1/object_1.ext, which path makes more sense?
    • https://<tea>/pathA/prefix1/object_1.ext
    • https://<tea>/pathA/object_1.ext
  • Depending on the choice above,
    • If the first is chosen, what is the value in defining the path like that?! Just to block access to s3://bucket-name1/hidden.obj?
    • If the second is chosen, do we need to worry about ambiguity?
MAP:
    pathA: bucket-name1/prefix1
    pathB: bucket-name1/prefix2
    pathC: bucket-name2

PUBLIC_BUCKETS:
   bucket-name2/browse: "Browse Imagery"
  
PRIVATE_BUCKETS:
   bucket-name1/prefix1:
       - access_group_a
   bucket-name1/prefix2:
       - access_group_b

Add ability to specify log group name

Add an optional CF Stack Parameter that allows for customizing the log group that TEA Lambda writes logs to.

By default, the parameter should be left unset (or set to the current default), so that the Lambda keeps writing to its default log group.

Allow for "null" CORS Origin after CORS redirect

if cors_origin and app.current_request.headers['origin'].endswith(cors_origin):

If TEA is redirected to DURING a CORS request, the Origin: header is "null" instead of the apparent origin.

We need to account for the null case as it happens when datapool.asf.alaska.edu redirects to sentinel1.asf.alaska.edu etc.

The solution looks something like changing L184 to:

        origin_header = app.current_request.headers['origin']
        if cors_origin and ( origin_header.endswith(cors_origin) or origin_header.lower() == 'null' ):
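As a standalone illustration of the check, here is a minimal sketch. The helper name `origin_allowed` is hypothetical, and it reads the header with `.get()` on the assumption that requests without any Origin header should simply not match.

```python
from typing import Mapping, Optional


def origin_allowed(headers: Mapping[str, str], cors_origin: Optional[str]) -> bool:
    """Return True if the request origin should receive CORS headers.

    Hypothetical helper: accepts an origin ending in cors_origin, or the
    literal "null" origin that browsers send after a cross-origin redirect.
    """
    origin_header = headers.get('origin')
    if not cors_origin or origin_header is None:
        return False
    return origin_header.endswith(cors_origin) or origin_header.lower() == 'null'
```

Keeping the check in one function would make the null case easy to unit test without building a full request object.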

Added RequestId to cumulus format log messages

Seems obvious in retrospect, but it would be useful for capturing the final result.

Alternative solution:

  1. detect a dict object being passed into a logger call...
  2. add each element of the dict to the log JSON...
  3. have {"message": "<VALUE>"} be something to the effect of "cumulus log response payload"

def cumulus_log_message(outcome: str, code: int, http_method:str, k_v: dict):
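A rough sketch of the alternative, merging the dict into the log JSON. The helper name `build_cumulus_log_record`, the `request_id` argument, and all JSON field names are hypothetical and are not taken from the existing TEA log format.

```python
import json
from typing import Optional


def build_cumulus_log_record(outcome: str, code: int, http_method: str,
                             k_v: dict, request_id: Optional[str] = None) -> str:
    """Serialize one log record as JSON (field names are assumptions)."""
    # Start from the caller's dict so the fixed fields below always win.
    record = dict(k_v)
    record.update({
        'message': 'cumulus log response payload',
        'outcome': outcome,
        'code': code,
        'http_method': http_method,
    })
    if request_id is not None:
        record['request_id'] = request_id
    return json.dumps(record)
```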

CVE-2020-14422 (Medium) detected in ipaddress-1.0.23-py2.py3-none-any.whl

CVE-2020-14422 - Medium Severity Vulnerability

Vulnerable Library - ipaddress-1.0.23-py2.py3-none-any.whl

IPv4/IPv6 manipulation library

Library home page: https://files.pythonhosted.org/packages/c2/f8/49697181b1651d8347d24c095ce46c7346c37335ddc7d255833e7cde674d/ipaddress-1.0.23-py2.py3-none-any.whl

Path to dependency file: thin-egress-app/lambda/requirements.txt

Path to vulnerable library: thin-egress-app/lambda/requirements.txt

Dependency Hierarchy:

  • jwcrypto-0.8-py2.py3-none-any.whl (Root Library)
    • cryptography-3.3.1-cp27-cp27mu-manylinux2010_x86_64.whl
      • ❌ ipaddress-1.0.23-py2.py3-none-any.whl (Vulnerable Library)

Found in HEAD commit: 92290bb78c314696d35697edd726df993fbeb93f

Found in base branch: devel

Vulnerability Details

Lib/ipaddress.py in Python through 3.8.3 improperly computes hash values in the IPv4Interface and IPv6Interface classes, which might allow a remote attacker to cause a denial of service if an application is affected by the performance of a dictionary containing IPv4Interface or IPv6Interface objects, and this attacker can cause many dictionary entries to be created. This is fixed in: v3.5.10, v3.5.10rc1; v3.6.12; v3.7.9; v3.8.4, v3.8.4rc1, v3.8.5, v3.8.6, v3.8.6rc1; v3.9.0, v3.9.0b4, v3.9.0b5, v3.9.0rc1, v3.9.0rc2.

Publish Date: 2020-06-18

URL: CVE-2020-14422

CVSS 3 Score Details (5.9)

Base Score Metrics:

  • Exploitability Metrics:
    • Attack Vector: Network
    • Attack Complexity: High
    • Privileges Required: None
    • User Interaction: None
    • Scope: Unchanged
  • Impact Metrics:
    • Confidentiality Impact: None
    • Integrity Impact: None
    • Availability Impact: High

For more information on CVSS3 Scores, click here.

Suggested Fix

Type: Upgrade version

Origin: https://security-tracker.debian.org/tracker/CVE-2020-14422

Release Date: 2020-06-18

Fix Resolution: 3.5.3-1+deb9u2, 3.7.3-2+deb10u2, 3.8.4~rc1-1


Step up your Open Source Security Game with WhiteSource here

Create mechanism to flush Lambda env and reload config

We need a mechanism for TEA to pull fresh config on demand. This should be able to function independently of application redeployments.

Potentially utilize a hot-updatable Lambda env variable to trigger a soft-flush of environment.

How do we want to trigger the flush? AWS API level Invocation of a VPC-restricted? Auth'd 😬 API Endpoint?

Jenkins not creating full terraform package

Error: Error in function call
  on .terraform/modules/thin_egress_app/main.tf line 34, in resource "aws_s3_bucket_object" "lambda_code_dependency_archive":
  34:   key    = "${filemd5(local.dependency_layer_filename)}.zip"
    |----------------
    | local.dependency_layer_filename is ".terraform/modules/thin_egress_app/dependencylayer.zip"
Call to function "filemd5" failed: no file exists at
.terraform/modules/thin_egress_app/dependencylayer.zip.

Check for x-forwarded-for Header for user's apparent IP

When TEA is behind CloudFront, the value at app.current_request.context['identity']['sourceIp'] represents the CloudFront service, not the user IP:

is_in_region = check_in_region_request(app.current_request.context['identity']['sourceIp'])

When checking in vs out of region requests, we should first check if there is an x-forwarded-for header present in the app.current_request.headers object. If that header is present, the string value should be split, and the 0th element should be utilized for checking the user region.

'x-forwarded-for': '137.229.86.134, 130.176.100.154'
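The lookup could be sketched as follows. The helper name `get_apparent_client_ip` is hypothetical, and the sketch assumes a plain dict with a lower-case header key; a case-insensitive headers object would not need that assumption.

```python
from typing import Mapping, Optional


def get_apparent_client_ip(headers: Optional[Mapping[str, str]], source_ip: str) -> str:
    """Prefer the first x-forwarded-for entry, falling back to source_ip."""
    forwarded = (headers or {}).get('x-forwarded-for')
    if forwarded:
        # The header is a comma-separated list; the 0th element is the
        # address reported for the original client.
        first = forwarded.split(',')[0].strip()
        if first:
            return first
    return source_ip
```

The result would then be passed to check_in_region_request() in place of the raw sourceIp. Note that a client can send its own x-forwarded-for value, so the first element is only as trustworthy as the proxies in front of TEA.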

CVE-2018-20225 (Medium) detected in pip-20.0.2-py2.py3-none-any.whl

CVE-2018-20225 - Medium Severity Vulnerability

Vulnerable Library - pip-20.0.2-py2.py3-none-any.whl

The PyPA recommended tool for installing Python packages.

Library home page: https://files.pythonhosted.org/packages/54/0c/d01aa759fdc501a58f431eb594a17495f15b88da142ce14b5845662c13f3/pip-20.0.2-py2.py3-none-any.whl

Path to dependency file: /tmp/ws-scm/thin-egress-app/lambda/requirements.txt

Path to vulnerable library: /tmp/ws-scm/thin-egress-app/lambda/requirements.txt

Dependency Hierarchy:

  • chalice-1.14.0-py2.py3-none-any.whl (Root Library)
    • ❌ pip-20.0.2-py2.py3-none-any.whl (Vulnerable Library)

Found in HEAD commit: 18e84670c8f0bc1fa0d5934e39a695d712e94c0c

Vulnerability Details

An issue was discovered in pip (all versions) because it installs the version with the highest version number, even if the user had intended to obtain a private package from a private index. This only affects use of the --extra-index-url option, and exploitation requires that the package does not already exist in the public index (and thus the attacker can put the package there with an arbitrary version number).

Publish Date: 2020-05-08

URL: CVE-2018-20225

CVSS 3 Score Details (5.0)

Base Score Metrics:

  • Exploitability Metrics:
    • Attack Vector: N/A
    • Attack Complexity: N/A
    • Privileges Required: N/A
    • User Interaction: N/A
    • Scope: N/A
  • Impact Metrics:
    • Confidentiality Impact: N/A
    • Integrity Impact: N/A
    • Availability Impact: N/A


CVE-2020-14343 (High) detected in PyYAML-5.3.1.tar.gz

CVE-2020-14343 - High Severity Vulnerability

Vulnerable Library - PyYAML-5.3.1.tar.gz

YAML parser and emitter for Python

Library home page: https://files.pythonhosted.org/packages/64/c2/b80047c7ac2478f9501676c988a5411ed5572f35d1beff9cae07d321512c/PyYAML-5.3.1.tar.gz

Path to dependency file: thin-egress-app/lambda/requirements.txt

Path to vulnerable library: thin-egress-app/lambda/requirements.txt

Dependency Hierarchy:

  • chalice-1.21.7-py2.py3-none-any.whl (Root Library)
    • ❌ PyYAML-5.3.1.tar.gz (Vulnerable Library)

Found in HEAD commit: 70cd8b34eb3c9d86e1b193cb19b57fa4a193ae34

Vulnerability Details

A vulnerability was discovered in the PyYAML library in all versions, where it is susceptible to arbitrary code execution when it processes untrusted YAML files through the full_load method or with the FullLoader loader. .load() defaults to using FullLoader and FullLoader is still vulnerable to RCE when run on untrusted input. Applications that use the library to process untrusted input may be vulnerable to this flaw. An attacker could use this flaw to execute arbitrary code on the system by abusing the python/object/new constructor.
The fix for CVE-2020-1747 was not enough to fix this issue.

Publish Date: 2020-07-21

URL: CVE-2020-14343

CVSS 3 Score Details (9.8)

Base Score Metrics:

  • Exploitability Metrics:
    • Attack Vector: Network
    • Attack Complexity: Low
    • Privileges Required: None
    • User Interaction: None
    • Scope: Unchanged
  • Impact Metrics:
    • Confidentiality Impact: High
    • Integrity Impact: High
    • Availability Impact: High


Provide application administrators mechanism to provide custom headers

Patrick Quinn has requested we add Cache-Control: private, max-age=600 header.

Rather than specially coding that one header, TEA should provide a configuration process that allows app admin to specify custom headers.

Configuration could be a data redirect (global) config, or potentially a per-bucket setting similar to defining PUBLIC/PRIVATE buckets.

Make Lambda memory allocation dynamic, default to 1,792 MB

We have seen in testing at scale that normal usage is closer to 400MB, but spikes to near 700MB. However, at 1792 MB, we can allocate an entire vCPU to help mitigate backgrounding.

1792MB seems a little on the excessive side when normal usage is around 400 and spikes only go up to 700, however Lambda only charges you for RAM used, not RAM allocated. Getting that full proportional vCPU should offset any concerns of wasted RAM.

We should add a new Parameter to the CF & TF to support dynamic setting of the Lambda Memory allocation, and have it default to 1792.

https://docs.aws.amazon.com/lambda/latest/dg/configuration-console.html#:~:text=AT%201%2C792%20MB%2C%20a%20function%20has%20the%20equivalent%20of%20one%20full%20vCPU

Explore using semantic version ID's instead of the current numerical series.

How might we inject a semantic version id into the build process? Trigger release from a v#.#.# tag (like our cirrus builds) and inject the tag name into the code instead of build.# value? Alternatively, accept a build version id as a jenkins build param?

Do we rely on build.# syntax in the release code anywhere?

Better restrict in-region requests.

Unfortunately, this line doesn't actually work:

aws:RequestedRegion: !Ref AWS::Region

The solution is far more in-depth:

  1. Create Lambda that can update the Policy for DownloadRoleInRegion (this may require tweaking policy names)
    a) Lambda should fetch https://ip-ranges.amazonaws.com/ip-ranges.json and parse ip_prefix for .region=="us-west-2" and .service=="AMAZON"
    b) Update the policy condition with all IP's from previous step:
    "Condition": { "IpAddress":{"aws:SourceIp":["52.95.255.112/28","99.77.253.0/24",....]
  2. Add SNS trigger to kick off new lambda:
    arn:aws:sns:us-east-1:806199016981:AmazonIpSpaceChanged

This will keep the IP CIDR's in the policy fresh. Policy updates should be Near Realtime.
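Steps 1a and 1b could be sketched like this. The function names are hypothetical, and the sketch works on an already-parsed ip-ranges.json document, so the fetch itself and the policy update are left out.

```python
from typing import Dict, List


def extract_region_cidrs(ip_ranges: dict, region: str,
                         service: str = "AMAZON") -> List[str]:
    """Collect the unique ip_prefix values for one region and service."""
    return sorted({
        entry["ip_prefix"]
        for entry in ip_ranges.get("prefixes", [])
        if entry.get("region") == region and entry.get("service") == service
    })


def build_ip_condition(cidrs: List[str]) -> Dict[str, dict]:
    """Build the policy Condition block restricting access by source IP."""
    return {"IpAddress": {"aws:SourceIp": list(cidrs)}}
```

The Lambda would call these each time the AmazonIpSpaceChanged topic fires and write the resulting condition into the DownloadRoleInRegion policy.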

When a script fails EDL auth because of EULA violation, pass JSON payload on to user/script.

This behavior is correct (because it's new!) in shared token handling:

https://github.com/asfadmin/thin-egress-app/blob/devel/lambda/app.py#L536-L544

For normal EDL auth, we aren't currently getting the JSON payload.

When a user attempts to log in to an EDL App and rejects or avoids accepting the EULA, TEA does not receive the code value necessary to proceed through the rest of the process. Instead of the user being 302 redirected to https://<redirect_uri>?code=<code>, the user is sent to https://<redirect_uri>?error=access_denied .
In the case that the originating https://uat.urs.earthdata.nasa.gov/oauth/authorize request has the app_type=401 parameter, the 302 redirect is a little more complex: https://<redirect_uri>?error=access_denied&error_msg=App%20EULA%20has%20been%20updated,%20please%20reauthorize%20the%20application%20via%20the%20Earthdata%20Login%20GUI

However, at no point in the process is there a JSON payload, nor is there a resolution_url value.

I'd propose that instead of:

<redirect_uri>?error=access_denied&error_msg=App%20EULA%20has%20been%20updated,%20please%20reauthorize%20the%20application%20via%20the%20Earthdata%20Login%20GUI

the 302 is to:

<redirect_uri>?error=access_denied&reason=EULA%20Acceptance%20Failure&app_type=401&resolution_url=https://urs.earthdata.nasa.gov/approve_app?app_uid=<APP-UID>

... then, downstream apps (like TEA) can reformat that response to a JSON payload.

In terms of TEA though, this is not an actionable ticket until EDL behavior is changed.

Create public endpoint for sharing rsa_pub_key JWT validation key

This would allow external entities to validate a JWT token.

https://<tea>/pubkey would return something like the below payload, where rsa_pub_key is the value from the JWT Secret Manager key.

{
  "rsa_pub_key": "c3NoLXJ....5lZHUK"
}

This feature would facilitate easy validation of secrets across deployments, and allow non-EDL-enabled applications to securely validate JWTs across cookie domains.

Feature proposal: more dynamic HEAD checks.

This is an idea I had earlier today. It's not specifically needed by ASF, but it may be a feature that would help another user.

TEA currently checks if an object exists in S3 before creating a signed URL. This takes a little time, so we provide the option to turn off that check. This is a binary choice, either all or none are checked, depending on that setting.
https://github.com/asfadmin/thin-egress-app/blob/devel/lambda/app.py#L294-L301

Would it be useful to do checks for some types of files and not others?

Figure out a way to invalidate ONE user JWT/Session

Right now we can change the JWT secret, but that would invalidate ALL sessions.

It would be nice if we could kill ONE specific user session.

I'm not sure how we could pull it off, but it'd be a useful feature.

Provide log messages for certain conditions for Cumulus

Request by Matt Savoie.

Cumulus' Elasticsearch / Kibana needs stable, robust log messages for successful and failed download attempts.

Success

Previously, success was determined by searching for to_url, probably this log message:

log.debug('to_url: {}'.format(to_url))

We need to provide a log message with a distinct code at time of success.

Failure

Failure is determined by searching for Could not download, probably this log message:

log.warning("Could not download s3://{0}/{1}: {2}".format(bucket, filename, e))

We need to provide a log message with a useful code in cases of failure.

These should be logged at INFO level or higher

CVE-2020-28493 (Medium) detected in Jinja2-2.11.3-py2.py3-none-any.whl

CVE-2020-28493 - Medium Severity Vulnerability

Vulnerable Library - Jinja2-2.11.3-py2.py3-none-any.whl

A very fast and expressive template engine.

Library home page: https://files.pythonhosted.org/packages/7e/c2/1eece8c95ddbc9b1aeb64f5783a9e07a286de42191b7204d67b7496ddf35/Jinja2-2.11.3-py2.py3-none-any.whl

Path to dependency file: thin-egress-app/lambda/requirements.txt

Path to vulnerable library: thin-egress-app/lambda/requirements.txt

Dependency Hierarchy:

  • ❌ Jinja2-2.11.3-py2.py3-none-any.whl (Vulnerable Library)

Found in HEAD commit: 818c03e35772114b7983cf8c97624e604232a200

Vulnerability Details

This affects the package jinja2 from 0.0.0 and before 2.11.3. The ReDOS vulnerability of the regex is mainly due to the sub-pattern [a-zA-Z0-9.-]+.[a-zA-Z0-9.-]+ This issue can be mitigated by Markdown to format user content instead of the urlize filter, or by implementing request timeouts and limiting process memory.

Publish Date: 2021-02-01

URL: CVE-2020-28493

CVSS 3 Score Details (5.3)

Base Score Metrics:

  • Exploitability Metrics:
    • Attack Vector: Network
    • Attack Complexity: Low
    • Privileges Required: None
    • User Interaction: None
    • Scope: Unchanged
  • Impact Metrics:
    • Confidentiality Impact: None
    • Integrity Impact: None
    • Availability Impact: Low


Suggested Fix

Type: Upgrade version

Origin: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2020-28493

Release Date: 2021-02-01

Fix Resolution: 2.11.3



Key assumed to be in a cookie fails on login->logout->login.

Traceback (most recent call last):
  File "/opt/python/chalice/app.py", line 1104, in _get_view_function_response
    response = view_function(**function_args)
  File "/var/task/app.py", line 212, in logout
    user_id = cookievars['urs-user-id']
KeyError: 'urs-user-id'

user_id = cookievars['urs-user-id']

This lookup should be wrapped in an if 'urs-user-id' in cookievars: block.
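A minimal sketch of the guarded lookup follows. The helper name `get_logout_user_id` is hypothetical; what logout should do when the key is absent is left to the caller.

```python
from typing import Mapping, Optional


def get_logout_user_id(cookievars: Mapping[str, str]) -> Optional[str]:
    """Return the user id from the cookie vars, or None if it is missing."""
    if 'urs-user-id' in cookievars:
        return cookievars['urs-user-id']
    # No session cookie, e.g. the user already logged out once.
    return None
```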

Catch error and return meaningful client error when Secret is not found

status_code, template_vars, headers = do_login(app.current_request.query_params, app.current_request.context, os.getenv('COOKIE_DOMAIN', ''))

During the login process, rain-api-core will re-raise a ClientError if a secret is not found. This behavior in rain-api-core is desirable; however, TEA should try to catch the ClientError and provide a meaningful error outward to facilitate easier troubleshooting.

[rain-api-core] When evaluating object-prefix access constraints, check prefixed blocks first

https://github.com/asfadmin/rain-api-core/blob/5cc26ae45c50ef038b30c3a9c0b5fd931a02361e/rain_api_core/egress_util.py#L152-L182

MAP:
  nsidc-cumulus-uat-protected: nsidc-cumulus-uat-protected
  nsidc-cumulus-uat-public: nsidc-cumulus-uat-public
PUBLIC_BUCKETS:
  nsidc-cumulus-uat-public: ""
PRIVATE_BUCKETS:
  nsidc-cumulus-uat-protected:
    - Staff UAT
  nsidc-cumulus-uat-protected/ATLAS:
    - ICESat-II Cloud Early Access UAT
    - Staff UAT

In the above use case, check_private_bucket() matches nsidc-cumulus-uat-protected and does not properly evaluate nsidc-cumulus-uat-protected/ATLAS causing ICESat-II Cloud Early Access UAT to be denied access to /ATLAS.

This can be resolved by running through PRIVATE_BUCKETS: blocks that contain / BEFORE blocks that DON'T.

Short term fix for above was to re-order /ATLAS above the parent bucket, but that is NOT a long-term solution.
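The proposed ordering could be sketched as below. Both function names are hypothetical; this is not the existing check_private_bucket() code, and the plain startswith match is an assumption about how a block is matched.

```python
from typing import Dict, List, Optional


def order_private_bucket_keys(private_buckets: Dict[str, List[str]]) -> List[str]:
    """Return keys with the most "/" separators first; ties keep config order."""
    return sorted(private_buckets, key=lambda key: key.count("/"), reverse=True)


def first_matching_groups(private_buckets: Dict[str, List[str]],
                          path: str) -> Optional[List[str]]:
    """First-match lookup over the reordered keys (path is "bucket/object")."""
    for key in order_private_bucket_keys(private_buckets):
        if path.startswith(key):
            return private_buckets[key]
    return None
```

With the config above, a path under nsidc-cumulus-uat-protected/ATLAS would hit the /ATLAS block first, without relying on the order in which the blocks appear in the bucket map.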

CVE-2020-13757 (High) detected in rsa-4.0-py2.py3-none-any.whl

CVE-2020-13757 - High Severity Vulnerability

Vulnerable Library - rsa-4.0-py2.py3-none-any.whl

Pure-Python RSA implementation

Library home page: https://files.pythonhosted.org/packages/02/e5/38518af393f7c214357079ce67a317307936896e961e35450b70fad2a9cf/rsa-4.0-py2.py3-none-any.whl

Path to dependency file: /tmp/ws-scm/thin-egress-app/lambda/requirements.txt

Path to vulnerable library: /tmp/ws-scm/thin-egress-app/lambda/requirements.txt

Dependency Hierarchy:

  • python_jose-3.1.0-py2.py3-none-any.whl (Root Library)
    • ❌ rsa-4.0-py2.py3-none-any.whl (Vulnerable Library)

Found in HEAD commit: 7a7150830e3c44c9d03bac1e685c163285f1c218

Vulnerability Details

Python-RSA 4.0 ignores leading '\0' bytes during decryption of ciphertext. This could conceivably have a security-relevant impact, e.g., by helping an attacker to infer that an application uses Python-RSA, or if the length of accepted ciphertext affects application behavior (such as by causing excessive memory allocation).

Publish Date: 2020-06-01

URL: CVE-2020-13757

CVSS 3 Score Details (7.5)

Base Score Metrics:

  • Exploitability Metrics:
    • Attack Vector: Network
    • Attack Complexity: Low
    • Privileges Required: None
    • User Interaction: None
    • Scope: Unchanged
  • Impact Metrics:
    • Confidentiality Impact: High
    • Integrity Impact: None
    • Availability Impact: None


thin-egress-app-IamPolicyDownload with AWS SourceIPs is too large in us-east-1

I apologize in advance for what may be a confusing or incorrect bug report, as I'm pretty new to working with all of these software components, but I think my analysis of this is correct.

The problem we're seeing is this error when the lambda that updates the role policy runs:

There was a problem updating policy my-app-thin-egress-app-IamPolicyDownload for Role my-app-thin-egress-app-DownloadRoleInRegion in region us-east-1: An error occurred (LimitExceeded) when calling the PutRolePolicy operation: Maximum policy size of 10240 bytes exceeded for role my-app-thin-egress-app-DownloadRoleInRegion

This is part of a deployment of https://github.com/nasa/cumulus

I believe the issue is that the policy condition to allow all AWS IPs is too large in us-east-1. Looking at the number of ip ranges for each of these regions (there are 6 lines in this output for each CIDR block), us-east-1 has nearly 1000, while the others have less than half as many. I think this results in a policy that's more than 10k characters and exceeds the AWS limit.

$ jq '.prefixes[] | select(.region=="us-east-1")' < Downloads/ip-ranges.json | wc
    5262    8770  110088
$ jq '.prefixes[] | select(.region=="us-west-2")' < Downloads/ip-ranges.json | wc
    2292    3820   48248
$ jq '.prefixes[] | select(.region=="us-east-2")' < Downloads/ip-ranges.json | wc
    1890    3150   39043

Don't fail if inbound requests has no headers

In normal operation, there is always at least one header that arrives with the lambda event payload.

However, when you run an API Gateway Method Execution Test, unless you explicitly supply a header key pair, no headers actually end up in the event payload and event['headers'] is None. That causes THIS line:

userid = get_jwt_field(get_cookie_vars(event['headers']), 'urs-user-id')

... to fail with this error:

File "/var/task/app.py", line 46, in __call__
userid = get_jwt_field(get_cookie_vars(event['headers']), 'urs-user-id')
  File "/var/task/rain_api_core/view_util.py", line 101, in get_cookie_vars
cooks = get_cookies(headers)
  File "/var/task/rain_api_core/view_util.py", line 129, in get_cookies
c = hdrs.get('cookie', hdrs.get('Cookie', hdrs.get('COOKIE', None)))

...and then the result in the Test harness is:

Request: /version
Status: 502
Latency: 1314 ms
Response Body:

{
  "message": "Internal server error"
}

Response Headers

{"x-amzn-ErrorType":"InternalServerErrorException"}

I think fixing this requires that we find all references to event['headers'] and replace them with something like event['headers'] or {}
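One way to avoid repeating the fallback at every call site is a small accessor. The helper name `get_event_headers` is hypothetical.

```python
def get_event_headers(event: dict) -> dict:
    """Return the event's headers, or an empty dict when they are missing.

    Covers both a missing 'headers' key and an explicit None, which is what
    the API Gateway Method Execution Test sends when no header is supplied.
    """
    return event.get('headers') or {}
```

The failing line would then read get_cookie_vars(get_event_headers(event)), and the downstream hdrs.get('cookie', ...) call gets a dict instead of None.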
