metadataproxy's Introduction

metadataproxy

The metadataproxy is used to allow containers to acquire IAM roles. By metadata we mean the EC2 instance metadata normally available to EC2 instances. This proxy exposes that metadata to containers inside or outside of EC2 hosts, allowing you to provide scoped IAM roles to individual containers rather than giving them the full permissions of the host's IAM role or IAM user.

Installation

From inside of the repo run the following commands:

mkdir -p /srv/metadataproxy
cd /srv/metadataproxy
virtualenv venv
source venv/bin/activate
pip install metadataproxy
deactivate

Configuration

Modes of operation

See the settings file for specific configuration options.

The metadataproxy has two basic modes of operation:

  1. Running in AWS where it simply proxies most routes to the real metadata service.
  2. Running outside of AWS where it mocks out most routes.

To enable mocking, use the environment variable:

export MOCK_API=true

AWS credentials

metadataproxy relies on boto configuration for its AWS credentials. If instance metadata IAM credentials are available, it will use them. Otherwise, you'll need to use .aws/credentials, .boto, or environment variables to specify the IAM credentials before the service is started.
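For example, one way to supply static credentials before starting the service is via environment variables. The values below are AWS's documented placeholder keys, not real credentials:

```shell
# Placeholder credentials from the AWS docs; substitute your own.
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export AWS_DEFAULT_REGION=us-east-1
```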

Role assumption

For IAM routes, the metadataproxy uses STS to assume roles for containers. To do so, it takes the source IP address of an incoming metadata request and finds the running docker container associated with that IP address. It uses the value of the container's IAM_ROLE environment variable as the role it will assume, assumes the role, and returns the STS credentials in the metadata response.

STS-attained credentials are cached and automatically rotated as they expire.
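The IP-to-role lookup described above can be sketched roughly as follows. This is an illustrative simplification, not metadataproxy's actual code, and the container data shape here is invented for the example:

```python
# Illustrative sketch of the request-IP -> container -> IAM_ROLE lookup.
# `containers` stands in for data from the docker API; its shape is
# invented for this example.
def role_for_request(source_ip, containers):
    for container in containers:
        if container["ip"] != source_ip:
            continue
        env = dict(item.split("=", 1) for item in container["env"])
        return env.get("IAM_ROLE")  # the role metadataproxy would assume
    return None

containers = [{"ip": "172.17.0.2", "env": ["IAM_ROLE=my-role", "FOO=bar"]}]
print(role_for_request("172.17.0.2", containers))  # my-role
print(role_for_request("172.17.0.9", containers))  # None
```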

Container-specific roles

To specify the role of a container, simply launch it with the IAM_ROLE environment variable set to the IAM role you wish the container to run with.

If the trust policy for the role requires an ExternalId, you can set this using the IAM_EXTERNAL_ID environment variable. This is most frequently used with cross-account role access scenarios. For more information on when you should use an External ID for your roles, see:

http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-user_externalid.html

docker run -e IAM_ROLE=my-role ubuntu:14.04
docker run -e IAM_ROLE=their-role@another-account -e IAM_EXTERNAL_ID=random-unique-string ubuntu:14.04

Configurable Behavior

There are a number of environment variables that can be set to tune metadataproxy's behavior. They can either be exported by the start script or set via docker environment variables.

DEFAULT_ROLE (String): Role to use if IAM_ROLE is not set in a container's environment. If unset, the container will get no IAM credentials.

DEFAULT_ACCOUNT_ID (String): The default account ID to assume roles in, if IAM_ROLE does not contain account information. If unset, metadataproxy will attempt to look up role ARNs using iam:GetRole.

ROLE_SESSION_KEY (String): Optional key in container labels or environment variables to use for the role session name. Prefix with "Labels:" or "Env:" respectively to indicate where the key should be found. Useful for passing through metadata such as a CI job ID or the launching user for audit purposes, as the role session name is included in the ARN that appears in access logs.

DEBUG (Boolean, default False): Enable debug mode. Do not enable this in production, as it will leak IAM credentials into your logs.

DOCKER_URL (String, default unix://var/run/docker.sock): URL of the docker daemon. The default is to access docker via its socket.

METADATA_URL (String, default http://169.254.169.254): URL of the metadata service. The default is the normal location of the metadata service in AWS.

MOCK_API (Boolean, default False): Whether or not to mock all metadata endpoints. If True, mocked data is returned to callers. If False, all endpoints except the IAM endpoints are proxied through to the real metadata service.

MOCKED_INSTANCE_ID (String, default mockedid): Instance ID to return in mocked API responses.

AWS_ACCOUNT_MAP (JSON String, default {}): A mapping of account names to account IDs. This allows you to use user-friendly names instead of account IDs in IAM_ROLE environment variable values.

AWS_REGION (String): AWS region for the STS endpoint, allowing you to call a regional endpoint instead of the global one. See the AWS documentation on STS region endpoints.

ROLE_EXPIRATION_THRESHOLD (Integer, default 15): The threshold, in minutes before credentials expire, at which metadataproxy will attempt to load new credentials.

ROLE_MAPPING_FILE (Path String): A JSON file containing a dict that maps IP addresses to role names. Can be used if docker networking has been disabled and you are managing IP addressing for containers through another process.

ROLE_REVERSE_LOOKUP (Boolean, default False): Enable a reverse lookup of incoming IP addresses to match containers by hostname. Useful if you've disabled networking in docker but set hostnames for containers in /etc/hosts or DNS.

HOSTNAME_MATCH_REGEX (Regex String, default ^.*$): Limit reverse-lookup container matching to hostnames that match the specified pattern.

PATCH_ECS_ALLOWED_HOSTS (String): Patch botocore's allowed hosts for ContainerMetadataFetcher to support aws-vault's --ecs-server option. Injects the provided host into the addresses botocore allows for the AWS_CONTAINER_CREDENTIALS_FULL_URI environment variable.
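As a concrete illustration of AWS_ACCOUNT_MAP and ROLE_MAPPING_FILE, the following shows a hypothetical configuration. The account IDs, IP addresses, and file path are placeholders:

```shell
# Hypothetical values for illustration only.
export AWS_ACCOUNT_MAP='{"prod": "012345678901", "dev": "109876543210"}'
export ROLE_MAPPING_FILE=/tmp/role-mappings.json

# With docker networking disabled, map container IPs to roles yourself:
cat > "$ROLE_MAPPING_FILE" <<'EOF'
{"172.17.0.2": "my-role", "172.17.0.3": "their-role@dev"}
EOF
```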

Default Roles

When no role is matched, metadataproxy will use the role specified in the DEFAULT_ROLE metadataproxy environment variable. If no DEFAULT_ROLE is specified as a fallback, containers without an IAM_ROLE environment variable will fail to retrieve credentials.

Role Formats

The following are all supported formats for specifying roles:

  • By Role:

    IAM_ROLE=my-role
  • By Role@AccountId:

    IAM_ROLE=my-role@012345678910
  • By ARN:

    IAM_ROLE=arn:aws:iam::012345678910:role/my-role
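The three formats above can be normalized into a full role ARN. The helper below is illustrative only (it is not part of metadataproxy); ACCOUNT_MAP mirrors the AWS_ACCOUNT_MAP setting and all account IDs are placeholders:

```python
# Illustrative helper: normalize the three supported IAM_ROLE formats
# into a full role ARN. ACCOUNT_MAP mirrors the AWS_ACCOUNT_MAP setting;
# all values here are placeholders.
ACCOUNT_MAP = {"another-account": "109876543210"}

def role_arn(iam_role, default_account_id="012345678910"):
    if iam_role.startswith("arn:aws:iam::"):
        return iam_role  # already a full ARN
    if "@" in iam_role:
        role, account = iam_role.split("@", 1)
        account = ACCOUNT_MAP.get(account, account)  # name -> ID if mapped
    else:
        role, account = iam_role, default_account_id
    return "arn:aws:iam::%s:role/%s" % (account, role)

print(role_arn("my-role"))
# arn:aws:iam::012345678910:role/my-role
print(role_arn("my-role@another-account"))
# arn:aws:iam::109876543210:role/my-role
```

The bare-role and role@account forms correspond to the DEFAULT_ACCOUNT_ID and AWS_ACCOUNT_MAP settings described in the configuration section.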

Role structure

A useful way to deploy this metadataproxy is with a two-tier role structure:

  1. The first tier is the EC2 service role for the instances running your containers. Call it DockerHostRole. Your instances must be launched with an instance profile that assigns this role.

  2. The second tier is the role that each container will use. These roles must trust your own account ("Role for Cross-Account Access" in AWS terms). Call it ContainerRole1.

  3. metadataproxy needs to query and assume the container role. So the DockerHostRole policy must permit this for each container role. For example:

    "Statement": [ {
        "Effect": "Allow",
        "Action": [
            "iam:GetRole",
            "sts:AssumeRole"
        ],
        "Resource": [
            "arn:aws:iam::012345678901:role/ContainerRole1",
            "arn:aws:iam::012345678901:role/ContainerRole2"
        ]
    } ]
    
  4. Now customize ContainerRole1 & friends as you like

Note: The ContainerRole1 role should have a trust relationship that allows it to be assumed by the principal associated with the host machine running the sts:AssumeRole call. An example trust relationship for ContainerRole1 may look like:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::012345678901:root",
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Routing container traffic to metadataproxy

Using iptables, we can forward traffic destined for 169.254.169.254 from docker0 to the metadataproxy. The following example assumes the metadataproxy runs on the host, not in a container:

/sbin/iptables \
  --append PREROUTING \
  --destination 169.254.169.254 \
  --protocol tcp \
  --dport 80 \
  --in-interface docker0 \
  --jump DNAT \
  --table nat \
  --to-destination 127.0.0.1:8000 \
  --wait

If you'd like to run the metadataproxy in a container, it's recommended to use host-only networking. Also, it's necessary to volume-mount the docker socket, as metadataproxy must be able to interact with docker.

Be aware that non-host-mode containers will not be able to reach 127.0.0.1 in the host network stack. As an alternative, you can use the metadata service to find the host's local address. In this case, you probably want to restrict proxy access to the docker0 interface!

LOCAL_IPV4=$(curl http://169.254.169.254/latest/meta-data/local-ipv4)

/sbin/iptables \
  --append PREROUTING \
  --destination 169.254.169.254 \
  --protocol tcp \
  --dport 80 \
  --in-interface docker0 \
  --jump DNAT \
  --table nat \
  --to-destination $LOCAL_IPV4:8000 \
  --wait

/sbin/iptables \
  --wait \
  --insert INPUT 1 \
  --protocol tcp \
  --dport 80 \
  \! \
  --in-interface docker0 \
  --jump DROP

Run metadataproxy without docker

In the following we assume my_config is a bash file with exports for all of the necessary settings discussed in the configuration section.

source my_config
cd /srv/metadataproxy
source venv/bin/activate
gunicorn metadataproxy:app --workers=2 -k gevent

Run metadataproxy with docker

For production purposes, you'll want to run metadataproxy in a container. You can build one with the included Dockerfile. To run, do something like:

docker run --net=host \
    -v /var/run/docker.sock:/var/run/docker.sock \
    lyft/metadataproxy

gunicorn settings

The following environment variables can be set to configure gunicorn (the defaults are shown in the examples):

# Change the IP address the gunicorn worker is listening on. You likely want to
# leave this as the default
HOST=0.0.0.0

# Change the port the gunicorn worker is listening on.
PORT=8000

# Change the number of worker processes gunicorn will run with. The default is
# 1, which is likely enough since metadataproxy is using gevent and its work is
# completely IO bound. Increasing the number of workers will likely make your
# in-memory cache less efficient
WORKERS=1

# Enable debug mode (you should not do this in production as it will leak IAM
# credentials into your logs)
DEBUG=False

Contributing

Code of conduct

This project is governed by Lyft's code of conduct. All contributors and participants agree to abide by its terms.

Sign the Contributor License Agreement (CLA)

We require a CLA for code contributions, so before we can accept a pull request we need to have a signed CLA. Please visit our CLA service and follow the instructions to sign the CLA.

File issues in GitHub

In general, all enhancements or bugs should be tracked via GitHub issues before PRs are submitted. We don't require them, but it'll help us plan and track.

When submitting bugs through issues, please try to be as descriptive as possible. It'll make it easier and quicker for everyone if the developers can easily reproduce your bug.

Submit pull requests

Our only method of accepting code changes is through GitHub pull requests.

metadataproxy's People

Contributors

aneeshusa, ardakuyumcu, asottile, brandond, danielmmetz, dependabot[bot], evie404, garceri, jamesawesome, jonathanburns, jordanrasmussen, josegonzalez, jpb, keith, lyft-refactorator, mburger, mleventi, mxr, noelcarl, ppruthi, rvandegrift, ryan-lane, ryancox, salekseev, scode, sheetalagrawal, skiptomyliu, tedder, vivianho, ystarikovich


metadataproxy's Issues

Error when creating trust policy

Hello, I am following the README and while trying to create a trust policy as follows:

+ aws_iam_role_policy.SomePolicy
    name:   "SomeRole"
    policy: "{
  \"Version\": \"2012-10-17\",
  \"Statement\": [
    {
      \"Sid\": \"\",
      \"Effect\": \"Allow\",
      \"Action\": \"sts:AssumeRole\",
      \"Principal\": {
        \"AWS\": \"arn:aws:iam::<account-id>:root\",
        \"Service\": \"ec2.amazonaws.com\"
      }
    }
  ]
}"
    role:   "Role_name"

I get the following:

MalformedPolicyDocument: Policy document should not specify a principal.

FWIW, I am doing this via Terraform:

data "aws_iam_policy_document" "trust-assume-role-policy" {
  statement {

   actions = ["sts:AssumeRole"]

   principals {
     type        = "Service"
     identifiers = ["ec2.amazonaws.com"]
   }   

   principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::<account-id>:root"]
   }   
  }
}

resource "aws_iam_role_policy" "TrustUser" {
  name   = "TrustUser"
  role   = "SomeRole"
  policy = "${data.aws_iam_policy_document.trust-assume-role-policy.json}"
}

Not sure what I am doing wrong here. Any suggestions?

I could update the relationship using the Web UI, but was not able to do so via Terraform. Filed an issue with Terraform as well: hashicorp/terraform#13449

metadataproxy not returning IAM role credentials to containers

Hi. We have metadataproxy running as a rancher stack.
We have set up the firewall rules and we can see our requests to 169.254.169.254 are being sent to the metadataproxy container, but only the pass-through proxy seems to work. Any time we try to get info from the IAM endpoint, we get no output at all, or we get a 404.

Is there any way to enable debug output in metadataproxy to try and find out what's going on?

root@ba95a0341b81:/aws# curl http://169.254.169.254/latest/meta-data/mac
0a:b9:20:62:36:3c

root@ba95a0341b81:/aws# curl http://169.254.169.254/latest/meta-data/iam
info

root@ba95a0341b81:/aws# curl http://169.254.169.254/latest/meta-data/iam/info
#(No output)

root@ba95a0341b81:/aws# curl http://169.254.169.254/iam/security-credentials/ran
cher-dev_rancher_machine
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 <head>
  <title>404 - Not Found</title>
 </head>
 <body>
  <h1>404 - Not Found</h1>
 </body>
</html>

More detailed curl output, where we see metadataproxy is taking the request:

root@ba95a0341b81:/aws# curl -vvvv http://169.254.169.254/iam/security-credentia
ls/read-s3-db-backups
* Hostname was NOT found in DNS cache
*   Trying 169.254.169.254...
* Connected to 169.254.169.254 (169.254.169.254) port 80 (#0)
> GET /iam/security-credentials/read-s3-db-backups HTTP/1.1
> User-Agent: curl/7.38.0
> Host: 169.254.169.254
> Accept: */*
>
< HTTP/1.1 200 OK
* Server gunicorn/19.3.0 is not blacklisted
< Server: gunicorn/19.3.0
< Date: Mon, 06 Feb 2017 17:43:30 GMT
< Connection: keep-alive
< Transfer-Encoding: chunked
< Content-Type: text/html
<
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 <head>
  <title>404 - Not Found</title>
 </head>
 <body>
  <h1>404 - Not Found</h1>
 </body>
</html>
* Connection #0 to host 169.254.169.254 left intact

Metadataproxy docker-compose.yml :

version: '2'
services:
  metadataproxy:
    image: pythiant9shared/metadataproxy:latest
    stdin_open: true
    network_mode: host
    volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    tty: true
    labels:
      io.rancher.container.pull_image: always
      io.rancher.scheduler.global: 'true'

Application docker-compose.yml :

version: '2'
services:
  test-db-tasks:
    image: pythiant9shared/rds-db-tasks:latest
    environment:
      IAM_ROLE: read-s3-db-backups
    stdin_open: true
    labels:
      io.rancher.container.pull_image: always
      io.rancher.container.start_once: 'true'

thanks for your help!

Mock URI for determining availability-zone is incorrect

# curl http://169.254.169.254/latest/meta-data/placement/availability-zone
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>404 Not Found</title>
<h1>Not Found</h1>
<p>The requested URL was not found on the server.  If you entered the URL manually please check your spelling and try again.</p>

and

169.254.169.2 - - [01/Apr/2019:14:29:22 +0000] "GET /latest/meta-data/placement/availability-zone HTTP/1.1" 404 233 "-" "curl/7.58.0"
[2019-04-01 14:29:22 +0000] [12] [DEBUG] Closing connection.

Indeed, the code at https://github.com/lyft/metadataproxy/blob/1.6.0/metadataproxy/routes/mock.py#L364-L366 does not correspond to the documented endpoint at https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html: the route should be /latest/meta-data/placement/availability-zone, not /latest/meta-data/availability-zone.

Curl hangs on 169.254.169.254

These are the steps to reproduce:

Create IAM Roles:

  • metadataproxy:
{
   "Version": "2012-10-17",
   "Statement": [
       {
           "Effect": "Allow",
           "Action": [
               "iam:GetRole",
               "sts:AssumeRole"
           ],
           "Resource": [
               "arn:aws:iam::<my-account-id>:role/role1"
           ]
       }
   ]
}
  • role1
    Added S3 Read Only Access and populated the trust relationship with the following:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com",
        "AWS": "arn:aws:iam::<my-account-id>:root"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
  • Launch ec2 instance with metadataproxy role applied to it.
  • Install docker
  • Forward docker requests of 169.254.169.254 to localhost:8000
/sbin/iptables \
  --append PREROUTING \
  --destination 169.254.169.254 \
  --protocol tcp \
  --dport 80 \
  --in-interface docker0 \
  --jump DNAT \
  --table nat \
  --to-destination 127.0.0.1:8000 \
  --wait
  • Run lyft/metadataproxy docker image
docker run -d --net=host -e DEBUG=True  -v /var/run/docker.sock:/var/run/docker.sock lyft/metadataproxy
  • Launch ubuntu ami with -e IAM_ROLE=role1
docker run -e IAM_ROLE=role1 -it ubuntu:xenial bash
$ apt-get update && apt-get install curl -y && curl http://169.254.169.254/latest/meta-data/iam/info

But the curl command just hangs and times out. Running curl localhost:8000 on the EC2 instance gives me the results.

Output from iptables:

Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination         
DOCKER     all  --  anywhere             anywhere             ADDRTYPE match dst-type LOCAL
DNAT       tcp  --  anywhere             instance-data.ec2.internal  tcp dpt:http to:127.0.0.1:8000

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
DOCKER     all  --  anywhere            !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
MASQUERADE  all  --  ip-172-17-0-0.ec2.internal/16  anywhere            

Chain DOCKER (2 references)
target     prot opt source               destination         
RETURN     all  --  anywhere             anywhere      

RFC: Using metadataproxy for non-container workloads

I am trying to use metadataproxy in a setup where I have multiple services with different IAM policy requirements running on a single box. These services are running on the host and not in containers which is where my "problem" lies.

So, currently, the client's requesting IP is used to look up the container it is associated with and then the environment of the container is examined for the IAM role. I am trying to think of a way to do the same, but for the requesting process. Since each of my processes will have the same IP (127.0.0.1), I cannot use that to uniquely identify the requesting process. I can think of one very crude way to do this. Roughly, this translates to:

from flask import Flask
from subprocess import Popen, PIPE

app = Flask(__name__)

LISTENING_PORT = '5000'

@app.route('/')
def index():
    # Chain netstat | grep ESTABLISHED | grep <port> | awk '{print $7}'
    # to get the "<pid>/<process name>" column for established connections.
    p1 = Popen(['netstat', '-anp'], stdout=PIPE)
    p2 = Popen(['grep', 'ESTABLISHED'], stdin=p1.stdout, stdout=PIPE)
    p3 = Popen(['grep', '-w', LISTENING_PORT], stdin=p2.stdout, stdout=PIPE)
    p4 = Popen(['awk', '{print $7}'], stdin=p3.stdout, stdout=PIPE)
    p3.stdout.close()
    output = p4.communicate()[0].decode()
    for line in output.split('\n'):
        if not line:
            continue
        # <pid>/<process name>, e.g. 1111/python
        pid_name = line.split('/')
        if len(pid_name) == 2 and pid_name[1] == 'python':
            pid = pid_name[0]
            # We now have the pid of the process requesting credentials;
            # read its environment from /proc to find its IAM role.
            with open('/proc/%s/environ' % pid) as f:
                print(f.read().replace('\0', '\n'))
    return '', 200

app.run(debug=True)

Upon a request, I use netstat to grep all ESTABLISHED connections on the port the metadataproxy service listens on, then use that to get the PID, and further examine its environment to get the role it is requesting credentials for. This assumes that only one process is currently looking to get a certain set of IAM credentials from metadataproxy, but I can just run only one worker considering the controlled nature of this and my setup.

Does that make any sense? Is there another approach that may be better?

Support for mesos containeriser not available

We are using mesos containerisers in mesos to deploy our applications in AWS.
http://mesos.apache.org/documentation/latest/containerizers/#Mesos
we are in the process of switching out from docker containerisers to mesos containerisers.
We have already implemented metadataproxy with docker.

However, I'm not certain what needs to be done for mesos containerisers to work with metadataproxy.
From looking over the code and the config, it seems we can use a dict mapping IPs to roles in a file:
https://github.com/lyft/metadataproxy/blob/master/metadataproxy/settings.py

is this the correct way? It seems that it should be possible to forward requests from the mesos containeriser to the locally running metadataproxy instance, (which is in a docker container).
How is this done?

FileExistsError is not defined in python 2.x

When restarting a container based on lyft/metadataproxy:2.0.0, we sometimes get the following in the logs and the container fails to start:

Failed to read config file: /etc/gunicorn/gunicorn.conf
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/gunicorn/app/base.py", line 106, in get_config_from_filename
    execfile_(filename, cfg, cfg)
  File "/usr/local/lib/python2.7/site-packages/gunicorn/_compat.py", line 91, in execfile_
    return execfile(fname, *args)
  File "/etc/gunicorn/gunicorn.conf", line 54, in <module>
    except FileExistsError:
NameError: name 'FileExistsError' is not defined

This happens due to the following reference to FileExistsError, which does not exist in Python 2 (it was added in Python 3.3):

except FileExistsError:

To reproduce, something like this can be used:

docker run --rm lyft/metadataproxy:2.0.0 sh -c "mkdir -p /run/gunicorn && /bin/sh run-server.sh"

Metadataproxy throwing error

Metadataproxy throws the error below after being installed on the docker host instance. I am not running it in a python virtual environment; I run it via the run-server.sh script.

Error:

[2017-03-09 23:43:43 +0000] [32714] [DEBUG] GET /latest/meta-data/iam/security-credentials/r_ccc_ContainerRole/
[2017-03-09 23:43:43 +0000] [32714] [ERROR] Error handling request
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/gunicorn/workers/async.py", line 52, in handle
    self.handle_request(listener_name, req, client, addr)
  File "/usr/lib/python2.7/site-packages/gunicorn/workers/ggevent.py", line 159, in handle_request
    super(GeventWorker, self).handle_request(*args)
  File "/usr/lib/python2.7/site-packages/gunicorn/workers/async.py", line 105, in handle_request
    respiter = self.wsgi(environ, resp.start_response)
  File "/usr/lib/python2.7/site-packages/flask/app.py", line 1836, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/lib/python2.7/site-packages/flask/app.py", line 1820, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/usr/lib/python2.7/site-packages/flask/app.py", line 1403, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/lib/python2.7/site-packages/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/lib/python2.7/site-packages/flask/app.py", line 1477, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/lib/python2.7/site-packages/flask/app.py", line 1381, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/lib/python2.7/site-packages/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/lib/python2.7/site-packages/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/srv/metadataproxy/metadataproxy/routes/proxy.py", line 70, in iam_sts_credentials
    api_version=api_version
  File "/srv/metadataproxy/metadataproxy/roles.py", line 57, in timed
    result = method(*args, **kw)
  File "/srv/metadataproxy/metadataproxy/roles.py", line 262, in get_assumed_role_credentials
    assumed_role = get_assumed_role(requested_role)
  File "/srv/metadataproxy/metadataproxy/roles.py", line 57, in timed
    result = method(*args, **kw)
  File "/srv/metadataproxy/metadataproxy/roles.py", line 249, in get_assumed_role
    arn = get_role_arn(requested_role)
  File "/srv/metadataproxy/metadataproxy/roles.py", line 228, in get_role_arn
    role = iam.get_role(RoleName=role_name)
  File "/usr/lib/python2.7/site-packages/botocore/client.py", line 251, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/lib/python2.7/site-packages/botocore/client.py", line 526, in _make_api_call
    operation_model, request_dict)
  File "/usr/lib/python2.7/site-packages/botocore/endpoint.py", line 141, in make_request
    return self._send_request(request_dict, operation_model)
  File "/usr/lib/python2.7/site-packages/botocore/endpoint.py", line 170, in _send_request
    success_response, exception):
  File "/usr/lib/python2.7/site-packages/botocore/endpoint.py", line 249, in _needs_retry
    caught_exception=caught_exception, request_dict=request_dict)
  File "/usr/lib/python2.7/site-packages/botocore/hooks.py", line 227, in emit
    return self._emit(event_name, kwargs)
  File "/usr/lib/python2.7/site-packages/botocore/hooks.py", line 210, in _emit
    response = handler(**kwargs)
  File "/usr/lib/python2.7/site-packages/botocore/retryhandler.py", line 183, in __call__
    if self._checker(attempts, response, caught_exception):
  File "/usr/lib/python2.7/site-packages/botocore/retryhandler.py", line 251, in __call__
    caught_exception)
  File "/usr/lib/python2.7/site-packages/botocore/retryhandler.py", line 269, in _should_retry
    return self._checker(attempt_number, response, caught_exception)
  File "/usr/lib/python2.7/site-packages/botocore/retryhandler.py", line 317, in __call__
    caught_exception)
  File "/usr/lib/python2.7/site-packages/botocore/retryhandler.py", line 223, in __call__
    attempt_number, caught_exception)
  File "/usr/lib/python2.7/site-packages/botocore/retryhandler.py", line 359, in _check_caught_exception
    raise caught_exception
TypeError: __init__() got an unexpected keyword argument 'server_hostname'

Logging not working from flask

With debug switched on, debug output from gunicorn appears in the logs, but not output from the application itself via flask.

something akin to:

import logging
import sys
from logging import StreamHandler

log = logging.getLogger(__name__)
stream_handler = StreamHandler(stream=sys.stdout)
app.logger.addHandler(stream_handler)

in __init__.py appears to make it work, along with setting the logging level.

I can do a pull request for this, unless someone knows a better way - python isn't my first language, so I'm not completely sure whether this is the usual/best/conventional way of doing this :)

Boto3 cannot find credentials when using AWS_PROFILE env var

Hi all,

I wanted to use this project to mock IAM roles for local containers, similar to the setup I run in my Kubernetes cluster. Since roles cannot take an IAM group as principal in the trust policy and I don't want to specify users on single app-roles, I wanted to use a "transitive" role to assume my application roles (my cluster does a similar setup with kiam).
So in a very simple diagram:
Local developer credentials -> iamRole DevAssume -> iamRole Application
With this setup the application roles only need to trust the DevAssume role once and there's a central point to manage which principals can assume a larger collection of roles.

I checked if the metadataproxy had such an option, but that doesn't seem to be the case. Luckily the AWS CLI/SDK can do this natively, by declaring a role in your ~/.aws/config and then telling it to automatically assume that role with the AWS_PROFILE env var (cf. https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html).

However, I could not get this to work on the current Docker image. I keep getting a NoCredentialsError. A bit of debugging seems to point at the version of Boto3 installed. If I update it to the latest, it does assume this role as one would expect and distributes tokens to other containers.

I made a local fork and upgraded all the pip requirements to their latest version, which still seems to work as expected. Would this be welcome as a PR? I can also include some documentation how I set up this local environment, which relies on docker-compose networking instead of IPTable rules.
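For reference, the native role-chaining this issue relies on is configured in ~/.aws/config with the role_arn and source_profile keys, roughly like this (the profile name, role name, and account ID below are placeholders):

```ini
[profile dev-assume]
role_arn = arn:aws:iam::012345678901:role/DevAssume
source_profile = default
```

Running with AWS_PROFILE=dev-assume then makes the SDK assume DevAssume automatically using the credentials of the default profile.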

Add a gevent pool for refreshing STS assumed credentials

The metadata proxy can know when IAM credentials are about to expire. We should add a gevent pool that runs occasionally, checks to see if any credentials need to be renewed, and renew them before they expire. The goal is to remove the STS assume from the critical path of the application, as the STS assume can be a bit slow.
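A single pass of such a refresher could look like the sketch below. This is an illustration using plain Python, not the proposed gevent implementation; refresh_fn stands in for the STS assume-role call, and the cache shape is invented for the example:

```python
from datetime import datetime, timedelta, timezone

# Illustrative single pass of the proposed background refresher: scan the
# credential cache and renew anything close to expiry, off the request path.
# refresh_fn stands in for the STS assume-role call; everything here is
# hypothetical.
def refresh_pass(cache, refresh_fn, threshold=timedelta(minutes=15), now=None):
    now = now or datetime.now(timezone.utc)
    refreshed = []
    for role, creds in cache.items():
        if creds["Expiration"] - now <= threshold:
            cache[role] = refresh_fn(role)  # renew before it expires
            refreshed.append(role)
    return refreshed

now = datetime(2024, 1, 1, tzinfo=timezone.utc)
cache = {
    "soon": {"Expiration": now + timedelta(minutes=5)},
    "later": {"Expiration": now + timedelta(hours=6)},
}
renew = lambda role: {"Expiration": now + timedelta(hours=1)}
print(refresh_pass(cache, renew, now=now))
# ['soon']
```

A real implementation would run this periodically in a gevent pool so request handlers always find fresh credentials in the cache.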

Cannot find container ip when using host-mode networking

I've got the metadataproxy service running in a container using host-mode networking. Another service that is calling the metadataproxy is also running with host-mode networking. I'm using iptables to redirect traffic from 169.254.169.254 to 0.0.0.0:8000

However, the metadataproxy service replies with:
ts=2019-11-05 20:07:18,209 name=metadataproxy.roles lvlname=ERROR msg=No container found for ip 192.168.65.3

Running docker inspect on my containers, I see that I have no IP addresses listed:

...
"NetworkSettings": {
            "Bridge": "",
             ...
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {},
            "SandboxKey": "/var/run/docker/netns/default",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "",
            "Gateway": "",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "",
            "IPPrefixLen": 0,
            "IPv6Gateway": "",
            "MacAddress": "",
            "Networks": {
                "host": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": null,
                    ...
                    "Gateway": "",
                    "IPAddress": "",
                    "IPPrefixLen": 0,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "",
                    "DriverOpts": null
                }
            }
        }

My docker-compose, where pipeline is my service that calls the metadataproxy:

version: '3.4'
services:
  pipeline:
    build: .
    environment:
      # Needs these vars so it can re-direct
      METADATAPROXY_HOST: 0.0.0.0
      METADATAPROXY_PORT: 8000
    volumes:
      - .:/srv/app
    # Enable ip forwarding to local network
    sysctls:
        - net.ipv4.conf.eth0.route_localnet=1
    # Give this container permissions to modify iptables
    cap_add:
      - NET_ADMIN
      - NET_RAW

  ec2metadata:
    environment:
      MOCK_API: 'True'
      DEBUG: 'True'
      DEFAULT_ROLE: arn:aws:iam::blah-blah-blah
    image: lyft/metadataproxy
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

What am I missing about using host-mode networking?

Kubernetes IP address belongs to pod, not container

I'm trying to get metadataproxy working in Kubernetes, and I'm almost there. However, the snag with Kubernetes is that it attaches the IP address to the pod, which is a group of containers, rather than to an individual container.
In fact, it actually attaches it to a container running just "pause", then bridges the networking between that container and the others in the pod.
Unfortunately, environment vars are specific to a container, so looking up IAM_ROLE fails, as the pause container doesn't have that var.
However, containers in the same pod share a common label, io.kubernetes.pod.uid, a UUID unique to the pod that can be used to identify containers within it. My plan is to match the pause container with the others in the pod and get IAM_ROLE from an associated container.
Limitations would be only one IAM_ROLE per pod, but that's not a biggie IMHO.

Is this approach reasonable? It shouldn't impact non-kubernetes systems significantly, as I'll only check for other containers if the label exists. Or have others solved this in other ways?

Should be straightforward to implement - I'll do a pull request in due course...
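Roughly, the lookup I have in mind would look like this (the container dicts are simplified stand-ins for docker inspect output — real Env is a list of "KEY=VALUE" strings, not a dict — and the function name is illustrative):

```python
POD_UID_LABEL = 'io.kubernetes.pod.uid'

# Hypothetical sketch of the proposed lookup: if the ip-matched container
# (the pause container) has no IAM_ROLE, borrow it from a pod sibling
# sharing the same io.kubernetes.pod.uid label.
def iam_role_for(container, all_containers):
    role = container['Config']['Env'].get('IAM_ROLE')
    if role:
        return role  # non-kubernetes path unchanged
    pod_uid = container['Config']['Labels'].get(POD_UID_LABEL)
    if not pod_uid:
        return None  # not a kubernetes pod container
    for sibling in all_containers:
        if sibling['Config']['Labels'].get(POD_UID_LABEL) == pod_uid:
            role = sibling['Config']['Env'].get('IAM_ROLE')
            if role:
                return role
    return None
```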

Fix docker push

Our docker push script isn't working right now, probably because the docker build isn't occurring. We need to add the build step back in and try to bump the tag again.

Non-networked pods cause error

Running metadataproxy on kubernetes, and it was failing with the following error:

[2017-03-13 14:54:23 +0000] [10] [ERROR] Error handling request
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/gunicorn/workers/async.py", line 52, in handle
    self.handle_request(listener_name, req, client, addr)
  File "/usr/local/lib/python2.7/site-packages/gunicorn/workers/ggevent.py", line 159, in handle_request
    super(GeventWorker, self).handle_request(*args)
  File "/usr/local/lib/python2.7/site-packages/gunicorn/workers/async.py", line 105, in handle_request
    respiter = self.wsgi(environ, resp.start_response)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1836, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1820, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1403, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1477, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1381, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/srv/metadataproxy/metadataproxy/routes/proxy.py", line 44, in iam_role_name
    role_name_from_ip = roles.get_role_name_from_ip(request.remote_addr)
  File "/srv/metadataproxy/metadataproxy/roles.py", line 59, in timed
    result = method(*args, **kw)
  File "/srv/metadataproxy/metadataproxy/roles.py", line 174, in get_role_name_from_ip
    container = find_container(ip)
  File "/srv/metadataproxy/metadataproxy/roles.py", line 59, in timed
    result = method(*args, **kw)
  File "/srv/metadataproxy/metadataproxy/roles.py", line 134, in find_container
    if len(_networks) > 0:
TypeError: object of type 'NoneType' has no len()

Some of our containers have no networks, so _networks isn't a dict; it's just None. Adding a check for _networks being not None fixed it.
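The guard described above can be sketched like this (function and argument names are illustrative, not the exact code in roles.py):

```python
# Hypothetical sketch: treat a missing NetworkSettings.Networks as
# "no match" instead of calling len() on None.
def matches_ip(network_settings, ip):
    _networks = network_settings.get('Networks')
    if not _networks:  # None or {} means a non-networked container
        return False
    return any(n.get('IPAddress') == ip for n in _networks.values())
```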

Config file not being used?

I have placed environment variables into my config file and sourced before running gunicorn.

For some reason, other than MOCK_API, the additional variables are not taking effect. As a result, I need to run "gunicorn metadataproxy:app --bind LOCAL_IP:PORT --workers=2 -k gevent" to use the local IP address rather than the loopback 127.0.0.1.

Is this a bug, or just something I don't understand (i.e., an improvement to the docs)?

Metadata proxy not working with cross role

We are getting the error below while using metadataproxy with cross-account access.

GetRoleError: (404, 'An error occurred (NoSuchEntity) when calling the GetRole operation: Role not found for r_ccc_ContainerRole1')

Please let us know how we can resolve this issue. metadataproxy is running as a container in AWS.

Principals used for Docker Container Role Trust Not Appropriate

The example you show for the trust relationship is not the best:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::012345678901:root",
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

==
The "AWS": "arn:aws:iam::012345678901:root" principal gives ANY IAM user or role from account 012345678901 the right to assume this ContainerRole, not a specific user as your documentation indicates. In particular, "root" has nothing to do with the "root" user on the Docker EC2 host OS; it is just the form IAM uses to indicate the entire account.

The other Principal in the example:
"Service": "ec2.amazonaws.com" grants the EC2 service the right to assume a role, but that role was already used earlier in your "chain" to get the security credentials for the ContainerRole. It is the DockerHostRole which needs to be the trusted principal in the ContainerRole's trust policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::12345678901:role/DockerHostRole"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

It is the DockerHostRole that presents itself, not the EC2 service, when making the AssumeRole request.

The "aws sts assume-role --role-arn" CLI command is useful for debugging these issues. It is what I used when working through the IAM role configuration.

Stephen

The role expiration check should be configurable, and 15 minutes at minimum

Hey IAM friends. Our org noticed some badly behaving java apps recently. Specifically: java apps would fetch new IAM credentials prior to every single AWS API call for 10 minutes straight, and then after 10 minutes would stop asking. The cause for this seems to be due in part to the official aws-sdk-java library's behaviour when it comes to caching IAM role credentials:

https://github.com/aws/aws-sdk-java/blob/1.11.546/aws-java-sdk-core/src/main/java/com/amazonaws/auth/EC2CredentialsFetcher.java#L49-L53

Specifically: the sdk will cache credentials as long as they're good for at least 15 minutes. If they will expire in 15 minutes, then the sdk asks for new ones.

Metadataproxy also proactively refreshes credentials when they're nearing expiry -- but only 5 minutes ahead instead:

https://github.com/lyft/metadataproxy/blob/1.11.0/metadataproxy/roles.py#L349-L351

This means there's a 10 minute period during which the java sdk asks for new credentials, because it expects to find new ones, but metadataproxy is still answering with the cached credentials.

I'd like to propose that metadataproxy use the same 15 minute threshold for better compatibility with the java sdk, and also provide a new configuration option to make this tuneable.
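As a sketch of the proposal (the environment variable name ROLE_EXPIRATION_THRESHOLD is an assumption for illustration, not an existing metadataproxy setting):

```python
import os
from datetime import datetime, timedelta, timezone

# Hypothetical sketch: default the refresh window to 15 minutes, matching
# the Java SDK's EC2CredentialsFetcher, and make it tunable via env var.
EXPIRATION_THRESHOLD_MINUTES = int(os.environ.get('ROLE_EXPIRATION_THRESHOLD', '15'))

def is_expiring(expiration, now=None):
    now = now or datetime.now(timezone.utc)
    return expiration - now <= timedelta(minutes=EXPIRATION_THRESHOLD_MINUTES)
```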

DEFAULT_ROLE cannot be an ARN

Hello,

After the refactor of roles.py, it appears that the DEFAULT_ROLE cannot be an ARN; in

m = RE_IAM_ARN.match(val)

only the IAM_ROLE variable is checked for being an ARN, not the DEFAULT_ROLE (which is used when no IAM_ROLE is set).

This means that IAM_ROLE can be an ARN or a short form, but the DEFAULT_ROLE can only be a short form. Any requests to the proxy would return 404 if an ARN is specified for DEFAULT_ROLE.
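The fix amounts to running the same parse on both variables. A sketch (the regex and helper name are illustrative, not the exact RE_IAM_ARN from roles.py):

```python
import re

# Hypothetical sketch: accept either a full role ARN or a short-form role
# name, for IAM_ROLE and DEFAULT_ROLE alike.
RE_IAM_ARN = re.compile(r'^arn:aws:iam::(\d+):role/(.+)$')

def parse_role(val):
    m = RE_IAM_ARN.match(val)
    if m:
        return {'account_id': m.group(1), 'name': m.group(2)}
    return {'account_id': None, 'name': val}  # short-form role name
```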

ClientError: An error occurred (AccessDenied) when calling the AssumeRole operation: User: is not authorized to perform: sts:AssumeRole on resource

Hello,

I am trying to run this metadataproxy locally with docker-compose.

I have so far done this.

  1. sudo ifconfig lo0 alias 169.254.169.254
  2. running docker-compose with ports 80:8000
services:
  metadataproxy:
    image: metadataproxy
    volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    - /Users/jkurz/.aws/credentials:/root/.aws/credentials
    ports:
    - 80:8000
    environment:
      DEBUG: "True"
      MOCK_API: "True"
      AWS_DEFAULT_REGION: "us-east-1"
      IAM_ROLE: "zombie-finder"
  3. docker-compose up

This runs the server fine, and I can hit it from my host computer (a Mac, running Docker for Mac).

The first problem was that find_container(ip) in roles.py was not returning anything when making a request from inside a container.

I updated the code to also match on the Gateway, and it started working:

if _networks:
    for _network in _networks:
        if (_networks[_network]['IPAddress'] == ip or
                _networks[_network]['Gateway'] == ip):
            msg = 'Container id {0} mapped to {1} by sub-network IP match'
            log.debug(msg.format(_id, ip))
            CONTAINER_MAPPING[ip] = _id
            return c

Then, once it found a container, it started failing because I was not allowed to assume the role configured for the container.

I'm trying to call curl http://169.254.169.254/latest/meta-data/iam/security-credentials/zombie-finder

and the server is throwing

metadataproxy_1  |   File "/usr/local/lib/python2.7/site-packages/botocore/client.py", line 301, in _api_call
metadataproxy_1  |     return self._make_api_call(operation_name, kwargs)
metadataproxy_1  |   File "/usr/local/lib/python2.7/site-packages/botocore/client.py", line 386, in _make_api_call
metadataproxy_1  |     raise ClientError(parsed_response, operation_name)
metadataproxy_1  | ClientError: An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:iam is not authorized to perform: sts:AssumeRole on resource: /zombie-finder
metadataproxy_1  | 192.168.32.1 - - [13/Oct/2017:19:01:35 +0000] "GET /latest/meta-data/iam/security-credentials/zombie-finder HTTP/1.1" 500 - "-" "-"
^CGracefully stopping... (press Ctrl+C again to force)

Any help would be appreciated.

Not working :/ lots of debug listed

Hi, been trying to get this working.

So, I have used the dockerfile here in this repo and here's the portion in use in our compose.yaml:

metadata-proxy:
    image: my-metadata-proxy
    container_name: my_metadata_proxy
    privileged: true
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    network_mode: host
    environment:
      - DEBUG=True

I've created 2 roles (default and "full"):

docker run -id --net=special -e IAM_ROLE=arn:aws:iam::123456123456:role/dr-full -e DEFAULT_ROLE=arn:aws:iam::123456123456:role/dr-default ubuntu

Here's the (purposefully open) default EC2 host IAM role snippet:

        {
            "Action": [
                "iam:GetRole",
                "sts:AssumeRole"
            ],
            "Effect": "Allow",
            "Resource": "*"
        }

First issue: on Amazon Linux, the iptables commands here don't work; it says "--wait" is not supported (iptables 1.4.8). Also, the protocol should come before --dport, which is another issue.

I used an alternate script, https://github.com/dump247/ec2metaproxy/blob/master/scripts/setup-firewall.sh, changing the default port; the rest seems basically the same. However, there is no "--wait" in it either, so this could be part of the problem.

We're using a custom network, so I get a br-a9c3d93fc-style interface in ifconfig and passed that in as the interface reference.

After running the docker container (before running the firewall script), I get this inside the container, seems normal:

root@1f4488813288:/# aws iam get-user

An error occurred (AccessDenied) when calling the GetUser operation: User: arn:aws:sts::123456123456:assumed-role/my-container-host-role/i-08fb8788e3b6909f3 is not authorized to perform: iam:GetUser on resource: arn:aws:sts::123456123456:assumed-role/my-container-host-role/i-08fb8788e3b6909f3

After applying the firewall script, things change:

Ec2 host:

[ec2-user@ip-172-31-27-183 ~]$ sudo ./firewall.sh --container-iface br-3b5d10d86f4b                                  
Drop traffic to 8000 not from container interface br-3b5d10d86f4b
Redirect any metadata requests from containers to the proxy service

Container:

root@1f4488813288:/# aws iam get-user
Unable to locate credentials. You can configure credentials by running "aws configure".

root@1f4488813288:/# LOCAL_IPV4=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
root@1f4488813288:/# echo $LOCAL_IPV4
172.31.27.183

root@1f4488813288:/# curl -s http://$LOCAL_IPV4:8000 | head -n 2
1.0
2007-01-19

root@1f4488813288:/# curl -s http://169.254.169.254 | head -n 2 
1.0
2007-01-19

Nothing in the docker logs for the proxy except for startup:

[ec2-user@ip-172-31-27-183 ~]$ docker logs b9ca4524fe77
[2017-02-23 05:30:46 +0000] [5] [INFO] Starting gunicorn 19.3.0
[2017-02-23 05:30:46 +0000] [5] [INFO] Listening at: http://0.0.0.0:8000 (5)
[2017-02-23 05:30:46 +0000] [5] [INFO] Using worker: gevent
[2017-02-23 05:30:46 +0000] [10] [INFO] Booting worker with pid: 10
[2017-02-23 05:30:46 +0000] [11] [INFO] Booting worker with pid: 11
[ec2-user@ip-172-31-27-183 ~]$

Any ideas?

Cannot match ip to swarm container

Consider a swarm node where my-swarm-container and metadataproxy-container run. metadataproxy-container is not part of the swarm and is run in the following way.

$ docker run -d --net=host -v /var/run/docker.sock:/var/run/docker.sock --name metadataproxy-container metadataproxy-image
$ iptables -t nat -A PREROUTING -d 169.254.169.254/32 -i docker_gwbridge -p tcp -m tcp --dport 80 -j DNAT --to-destination $LOCAL_IPV4:8000

The problem is that my-swarm-container will ask metadataproxy-container for credentials and metadataproxy-container will respond with status 404 and will output the following in its logs.

{"asctime": "2020-04-09 01:10:05,003", "name": "metadataproxy.roles", "levelname": "ERROR", "message": "No container found for ip 172.18.0.7"}
{"asctime": "2020-04-09 01:10:05,003", "name": "metadataproxy.routes.proxy", "levelname": "ERROR", "message": "Role name not found; returning 404."}

The reason is that swarm bridge network docker_gwbridge is not reported in the output of docker inspect my-swarm-container (see moby/libnetwork#1082). When metadataproxy tries to match request ip to its container, it looks at the ip of each container in the node and finds no match.

A solution to the problem is to match the ip to its container by looking at the list of containers for network docker_gwbridge. PR #101 implements the lookup.
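The lookup can be sketched as follows (the function name is illustrative, and network_attrs stands in for the dict returned by `docker network inspect docker_gwbridge`, whose Containers map carries each attached container's IPv4Address in CIDR form):

```python
# Hypothetical sketch of the fallback: when per-container inspect does not
# report the ip (the swarm case), scan the gwbridge network's own
# container list for a matching address.
def find_container_id_via_gwbridge(network_attrs, ip):
    for container_id, info in network_attrs.get('Containers', {}).items():
        # IPv4Address looks like "172.18.0.7/16"; strip the prefix length.
        if info.get('IPv4Address', '').split('/')[0] == ip:
            return container_id
    return None
```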

Current test results in DataNotFoundError: Unable to load data for: _endpoints from boto3

This testcase uses a hosted installation, for initial PoC purposes, cloned from the latest https://github.com/lyft/metadataproxy and hosted on a single EC2 instance whose OS is Amazon Linux AMI 2016.03 and which is running Docker 1.10.3.

I have run the testcase on the following assumptions -

(1) Prereqs are python 2.7 (no other python version except this one), with python-devel and python27-devel. This is sufficient to ensure that pip install boto3==1.1.3, pip install Flask==0.10.1, pip install docker-py==1.6.0, pip install gunicorn==19.3.0, pip install gevent==1.0.2, and pip install greenlet==0.4.9 --upgrade all install successfully under virtualenv. The versions are those provided in the requirements files.

(2) The routing change -A PREROUTING -d 169.254.169.254/32 -i docker0 -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 8000 is in place for when gunicorn starts up bound to port 8000. I checked the interface configuration (docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu . . .) and ascertained that there is a process listening on port 8000 when gunicorn starts up, as follows:

gunicorn wsgi:app --bind 0.0.0.0:8000 --log-level debug --access-logfile /tmp/gunicorn-access --error-logfile /tmp/gunicorn-error --workers=2 -k gevent &

(3) Started a testcase docker container with a predefined IAM_ROLE (which provides permissions to a dedicated S3 path) supplied as its environment variable.

docker run -e IAM_ROLE=testcase1-proxiedaccess-instance-role --name awstest1 -it fstab/aws-cli

(4) Having checked that there was an explicit rejection [ClientError: An error occurred (AccessDenied) when calling the AssumeRole operation] when the iam:GetRole and sts:AssumeRole permissions were not granted on the host, this was then corrected. The subsequent results are what happens when these permissions are available, so that we can be sure that the host now has the correct permissions.

(env)aws@e724b9ca02bf:~$ wget -O - http://169.254.169.254/latest/meta-data/iam/security-credentials/testcase1-proxiedaccess-instance-role
--2016-04-12 13:48:51--  http://169.254.169.254/latest/meta-data/iam/security-credentials/testcase1-proxiedaccess-instance-role
Connecting to 169.254.169.254:80... connected.
HTTP request sent, awaiting response... 500 Internal Server Error
2016-04-12 13:48:51 ERROR 500: Internal Server Error.

Or as a consequence

(env)aws@7df18683f375:~$ aws sts get-caller-identity 
Unable to locate credentials. You can configure credentials by running "aws configure".

Stepping through routes/proxy.py it turns out that we went through the second if match: step and returned a redirect to the expected route.

This is confirmed if we look at the gunicorn-error logfile at the line

[2016-04-12 11:05:34 +0000] [10963] [DEBUG] GET /latest/meta-data/iam/security-credentials/testcase1-proxiedaccess-instance-role

Further tracing confirms that it executed the definition of iam_client in roles.py. The problem arises when we attempt to create the boto3.client('iam') object. Casting around elsewhere, the error reported, DataNotFoundError: Unable to load data for: _endpoints, is indicative of some missing prerequisites for boto3 as invoked within the Flask app framework. Can you confirm that this is the problem?

Are there some additional prereqs for boto3?

The relevant section of the gunicorn-error file is

[2016-04-12 13:17:16 +0000] [10960] [DEBUG] GET /latest/meta-data/iam/security-credentials/testcase1-proxiedaccess-instance-role
[2016-04-12 13:17:16 +0000] [10960] [ERROR] Error handling request
Traceback (most recent call last):
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/gunicorn/workers/async.py", line 52, in handle
    self.handle_request(listener_name, req, client, addr)
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/gunicorn/workers/ggevent.py", line 159, in handle_request
    super(GeventWorker, self).handle_request(*args)
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/gunicorn/workers/async.py", line 105, in handle_request
    respiter = self.wsgi(environ, resp.start_response)
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/flask/app.py", line 1836, in __call__
    return self.wsgi_app(environ, start_response)
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/flask/app.py", line 1820, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/flask/app.py", line 1403, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/flask/app.py", line 1477, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/flask/app.py", line 1381, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/srv/metadataproxy/metadataproxy/routes/proxy.py", line 60, in home
    api_version=match.groups()[0]
  File "/srv/metadataproxy/metadataproxy/roles.py", line 163, in get_assumed_role
    role = get_role(requested_role)
  File "/srv/metadataproxy/metadataproxy/roles.py", line 140, in get_role
    iam = iam_client()
  File "/srv/metadataproxy/metadataproxy/roles.py", line 41, in iam_client
    _iam_client = boto3.client('iam')
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/boto3/__init__.py", line 79, in client
    return _get_default_session().client(*args, **kwargs)
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/boto3/session.py", line 199, in client
    aws_session_token=aws_session_token, config=config)
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/botocore/session.py", line 754, in create_client  
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/botocore/session.py", line 660, in get_component
    :param unique_id_uses_count: boolean
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/botocore/session.py", line 774, in get_component
    # client config from the session
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/botocore/session.py", line 174, in <lambda>
    self._components.lazy_register_component(
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/botocore/session.py", line 453, in get_data
    - agent_version is the value of the `user_agent_version`
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/botocore/loaders.py", line 119, in _wrapper
    data = func(self, *args, **kwargs)
  File "/srv/metadataproxy/venv/local/lib/python2.7/site-packages/botocore/loaders.py", line 371, in load_data
    """
DataNotFoundError: Unable to load data for: _endpoints
[2016-04-12 13:17:17 +0000] [10955] [DEBUG] 2 workers

container launched with --net=host getting EC2 host credentials

I launched the metadataproxy container in host mode as per the documentation, then launched two more containers, one with --net=host and the other with --net=bridge. It works fine in bridge mode. But for the container launched in host mode, the proxy does not return the container credentials mapped to the IAM_ROLE parameter; it gives back the EC2 host role credentials instead.

IP Address caching and docker re-using IP addresses may cause inconsistent credentials being returned

Good evening!

Just wanted to note an issue I was observing earlier today. I'm still doing more testing, but I'll try to detail what I've seen in case others have seen or a fix is on the way.

I am currently trying to use this metadataproxy with the AWS EC2 Container Service. The EC2 environment is bootstrapped with the metadataproxy (v 1.1) and then the ECS agent, which I don't think is material, but full disclosure.

I started with using Docker 1.9, but after reviewing the code, it looks like the metadataproxy is dependent upon the Docker Networking IP Address as part of the docker inspect information. On Docker 1.9.1, that value is often empty, which was giving us some strange behavior.

This afternoon, I upgraded our EC2 instances to Docker 1.11.1. We started to see the IP Address be populated in the Network Configuration section, and we started to see the STS transactions begin to succeed. We were simply starting containers from the command line at this point, a simple alpine container, and passing in the IAM_ROLE environment parameter as the ARN. Once the alpine container started, I'd apk upgrade && apk add curl to install curl, and I would test with "curl 169.254.169.254/latest/meta-data/iam/info".

So for the primary issue. What we began to notice is that with the default Docker networking setup, the IP block was in the 172.16.x.y range, and the gateway was 172.16.0.1. The usable IPs started at .2. When I started the first container A, I started it with Role A, and I started a second container (B) with Role B, and everything worked great. No problems so far.

We noticed a problem when we killed container A (with Role A) and started container C, using Role B. Docker assigned this container the same IP address (172.16.0.2) as the previous container, and I believe the caching aspect of metadataproxy may be hanging on to the former IAM value.

When testing this, I would repeat curl 169.254.169.254/latest/meta-data/iam/info repeatedly, but I'd get different instance profiles randomly; I didn't notice a pattern. But if I killed and restarted the metadataproxy container and started over, things seemed to clear up.

So curious if you guys have run into it. While working with a coworker on this, I think we can maybe add an extra check to see if the IAM role for the container by IP has changed and if so, remove it from cache and start anew, but that's about as far as we got before the end of the day.
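The extra check we have in mind would look roughly like this (the function and argument names are illustrative; current_ip_to_id stands in for a fresh docker-inspect pass mapping live IPs to container ids):

```python
# Hypothetical sketch: before trusting the ip->container cache, confirm the
# container currently holding that ip is still the cached one; if docker
# has reused the ip for a new container, drop the stale entry.
def resolve_container(ip, cache, current_ip_to_id):
    cached = cache.get(ip)
    if cached is not None and current_ip_to_id.get(ip) != cached:
        del cache[ip]  # ip was reused by a different container
        cached = None
    if cached is None and ip in current_ip_to_id:
        cached = current_ip_to_id[ip]
        cache[ip] = cached
    return cached
```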

Thank you!

Setting environment variable on running container is not working

Whenever I set an environment variable on an already running container, metadataproxy throws an error. I can see the variable inside the container with echo $IAM_ROLE, but metadataproxy does not return the IAM_ROLE value when testing with curl http://169.254.169.254/latest/meta-data/iam/security-credentials/

I used the command below to set the environment variable in the running container:
export IAM_ROLE="ccc_ContainerRole@12345678"

Metadataproxy error:
[root@eb744b6a4257 ~]# curl http://169.254.169.254/latest/meta-data/iam/security-credentials/ curl: (7) Failed connect to 169.254.169.254:80; Connection refused
