Aardvark

Aardvark is a multi-account AWS IAM Access Advisor API (and caching layer).

Install:

Ensure that you have Python 3.6 or later. Python 2 is no longer supported.

git clone https://github.com/Netflix-Skunkworks/aardvark.git
cd aardvark
python3 -m venv env
. env/bin/activate
python setup.py develop

Known Dependencies

  • libpq-dev

Configure Aardvark

The Aardvark config wizard will guide you through the setup.

% aardvark config

Aardvark can use SWAG to look up accounts. https://github.com/Netflix-Skunkworks/swag-client
Do you use SWAG to track accounts? [yN]: no
ROLENAME: Aardvark
DATABASE [sqlite:////home/github/aardvark/aardvark.db]:
# Threads [5]:

>> Writing to config.py
  • Whether to use SWAG to enumerate your AWS accounts. (Optional, but useful when you have many accounts.)
  • The name of the IAM Role to assume into in each account.
  • The Database connection string. (Defaults to sqlite in the current working directory. Use RDS Postgres for production.)

Create the DB tables

aardvark create_db

IAM Permissions:

Aardvark needs an IAM Role in each account that will be queried. Additionally, Aardvark needs to be launched with a role or user which can sts:AssumeRole into the different account roles.

AardvarkInstanceProfile:

  • Only create one.
  • Needs the ability to call sts:AssumeRole into all of the AardvarkRoles.

AardvarkRole:

  • Must exist in every account to be monitored.
  • Must have a trust policy allowing AardvarkInstanceProfile.
  • Has these permissions:
iam:GenerateServiceLastAccessedDetails
iam:GetServiceLastAccessedDetails
iam:ListRolePolicies
iam:ListRoles
iam:ListUsers
iam:ListPolicies
iam:ListGroups

So if you are monitoring n accounts, you will always need n+1 roles. (n AardvarkRoles and 1 AardvarkInstanceProfile).

Note: When running Aardvark locally, you don't need the AardvarkInstanceProfile. Instead, attach a policy that allows "sts:AssumeRole" to the IAM user whose credentials you use with the AWS CLI. That same user must also be listed in the trust policy of the AardvarkRole so that it is allowed to assume the role.
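As an illustration, the permissions and trust relationship described above can be written out as IAM policy documents. The sketch below builds both as Python dictionaries. The helper names, the "Resource": "*" scope, and the example principal ARN are assumptions made for this example, so review the output against your own account setup before using it.

```python
import json

# Actions listed above for AardvarkRole.
AARDVARK_ROLE_ACTIONS = [
    "iam:GenerateServiceLastAccessedDetails",
    "iam:GetServiceLastAccessedDetails",
    "iam:ListRolePolicies",
    "iam:ListRoles",
    "iam:ListUsers",
    "iam:ListPolicies",
    "iam:ListGroups",
]


def aardvark_role_policy():
    """Permissions policy for AardvarkRole (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": AARDVARK_ROLE_ACTIONS,
                # Assumption: the actions are granted on all resources.
                "Resource": "*",
            }
        ],
    }


def aardvark_trust_policy(principal_arn):
    """Trust policy letting principal_arn assume AardvarkRole (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN; substitute the role or user that launches Aardvark.
    launcher = "arn:aws:iam::123456789012:role/AardvarkInstanceProfile"
    print(json.dumps(aardvark_role_policy(), indent=2))
    print(json.dumps(aardvark_trust_policy(launcher), indent=2))
```

The same trust policy shape applies to the local case: pass the ARN of the IAM user instead of the instance profile role.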

Gather Access Advisor Data

You'll likely want to refresh the Access Advisor data regularly. We recommend running the update command about once a day. Cron works great for this.

Without SWAG:

If you don't have SWAG, you can pass comma-separated account numbers:

aardvark update -a 123456789012,210987654321

With SWAG:

Aardvark can use SWAG to look up accounts, so you can run against all with:

aardvark update

or by account name/tag with:

aardvark update -a dev,test,prod
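If the daily refresh is driven from a Python scheduler rather than cron, the two invocation styles above could be wrapped as follows. This is a minimal sketch: build_update_command and run_update are hypothetical helpers, and the only thing taken from Aardvark itself is the "aardvark update" command and its -a option.

```python
import subprocess


def build_update_command(accounts=None):
    """Return the argv list for 'aardvark update' (hypothetical helper).

    With no accounts the command relies on SWAG to enumerate them;
    otherwise the account numbers or names are joined with commas for -a.
    """
    command = ["aardvark", "update"]
    if accounts:
        command += ["-a", ",".join(accounts)]
    return command


def run_update(accounts=None):
    """Run the update and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_update_command(accounts), check=True)
```

For example, run_update(["123456789012", "210987654321"]) runs the same command as the "Without SWAG" example.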

API

Start the API

aardvark start_api -b 0.0.0.0:5000

In production, you'll likely want to have something like supervisor starting the API for you.

Use the API

Swagger is available for the API at <Aardvark_Host>/apidocs/#!.

Aardvark responds to GET and POST requests. All results are paginated, and pagination can be controlled by passing the count and/or page arguments. Here are a few example queries:

curl localhost:5000/api/1/advisors
curl 'localhost:5000/api/1/advisors?phrase=SecurityMonkey'
curl 'localhost:5000/api/1/advisors?arn=arn:aws:iam::000000000000:role/SecurityMonkey&arn=arn:aws:iam::111111111111:role/SecurityMonkey'
curl 'localhost:5000/api/1/advisors?regex=^.*Monkey$'
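The same queries can be issued from Python. The sketch below uses only the standard library and the parameters named above (phrase, arn, regex, count, page). The function names are hypothetical, and the assumption that the response body is JSON should be checked against the Swagger docs.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def advisors_url(base="http://localhost:5000", *, phrase=None, arns=(),
                 regex=None, count=None, page=None):
    """Build an /api/1/advisors URL with properly escaped query parameters."""
    params = []
    if phrase is not None:
        params.append(("phrase", phrase))
    # The arn parameter may be repeated, once per ARN.
    params.extend(("arn", arn) for arn in arns)
    if regex is not None:
        params.append(("regex", regex))
    if count is not None:
        params.append(("count", count))
    if page is not None:
        params.append(("page", page))
    url = f"{base}/api/1/advisors"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


def fetch_advisors(url):
    """GET the URL and decode the body, assuming it is JSON."""
    with urlopen(url) as response:
        return json.load(response)
```

For example, fetch_advisors(advisors_url(regex="^.*Monkey$", count=50, page=2)) requests the second page of 50 results against a locally running API.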

Docker

Aardvark can also be deployed with Docker and Docker Compose. The Aardvark services are built on a shared container. You will need Docker and Docker Compose installed for this to work.

To configure the containers for your set of accounts create a .env file in the root of this directory. Define the environment variables within this file. This example uses AWS Access Keys. We recommend using instance roles in production.

AARDVARK_ROLE=Aardvark
AARDVARK_ACCOUNTS=<account id>
AWS_DEFAULT_REGION=<aws region>
AWS_ACCESS_KEY_ID=<your access key>
AWS_SECRET_ACCESS_KEY=<your secret key>

Name | Service | Description
AARDVARK_ROLE | collector | The name of the role for Aardvark to assume so that it can collect the data.
AARDVARK_ACCOUNTS | collector | Optional if using SWAG, otherwise required. Set this to a list of SWAG account name tags or a list of AWS account numbers from which to collect Access Advisor records.
AWS_ARN_PARTITION | collector | Required if not using an AWS Commercial region. For example, aws-us-gov. By default, this is aws.
AWS_DEFAULT_REGION | collector | Required if not running on an EC2 instance with an appropriate Instance Profile. Set this to the AWS region to use.
AWS_ACCESS_KEY_ID | collector | Required if not running on an EC2 instance with an appropriate Instance Profile. Set this to the access key ID of an AWS IAM user with permission to sts:AssumeRole to the Aardvark audit role.
AWS_SECRET_ACCESS_KEY | collector | Required if not running on an EC2 instance with an appropriate Instance Profile. Set this to the secret access key of the same AWS IAM user.
AARDVARK_DATABASE_URI | collector and apiserver | Specify a custom database URI supported by SQLAlchemy. By default, this will use the AARDVARK_DATA_DIR value to create a SQLite database. Example: sqlite:///$AARDVARK_DATA_DIR/aardvark.db
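Before building the containers it can help to check that the .env file defines the variables the table above marks as required. The sketch below is a hypothetical pre-flight check, not part of Aardvark. It assumes simple KEY=value lines, and it treats AARDVARK_ROLE as always required, which the table does not state explicitly.

```python
def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def missing_variables(env, *, uses_swag=False, has_instance_profile=False):
    """Return the names of required variables that are absent or empty."""
    required = ["AARDVARK_ROLE"]  # assumption: always needed by the collector
    if not uses_swag:
        required.append("AARDVARK_ACCOUNTS")
    if not has_instance_profile:
        required += [
            "AWS_DEFAULT_REGION",
            "AWS_ACCESS_KEY_ID",
            "AWS_SECRET_ACCESS_KEY",
        ]
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    with open(".env") as handle:
        print(missing_variables(parse_env(handle.read())))
```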

Once this file is created, then build the containers and start the services. Aardvark consists of three services:

  • Init - The init container creates the database within the storage volume.
  • API Server - This is the HTTP web server that serves the data. By default, this is listening on http://localhost:5000/apidocs/#!.
  • Collector - This is a daemon that will fetch and cache the data in the local SQL database. This should be run periodically.
# build the containers
docker-compose build

# start up the containers
docker-compose up

Finally, to clean up the environment

# bring down the containers
docker-compose down

# remove the containers
docker-compose rm

Notes

Threads

Aardvark will launch the number of threads specified in the configuration. Each of these threads will retrieve Access Advisor data for an account and then persist the data.
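The threading model described here can be pictured as a pool of workers draining a shared queue of accounts. The sketch below illustrates that pattern only and is not Aardvark's actual implementation. update_accounts, fetch, and persist are hypothetical names standing in for the retrieval and persistence steps.

```python
import queue
import threading


def update_accounts(accounts, num_threads, fetch, persist):
    """Process every account using num_threads worker threads."""
    work = queue.Queue()
    for account in accounts:
        work.put(account)

    def worker():
        while True:
            try:
                account = work.get_nowait()
            except queue.Empty:
                return  # nothing left to do; let the thread finish
            # Retrieve the data for one account, then persist it.
            persist(account, fetch(account))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

With more accounts than threads, each thread simply picks up the next account once it has persisted the previous one.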

Database

The regex query is only supported in Postgres (natively) and SQLite (via some magic courtesy of Xion in the sqla_regex file).
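SQLite has no built-in REGEXP implementation, so a function has to be registered on the connection before the operator works. The sketch below shows the general technique with the standard-library sqlite3 module. It is not necessarily how the sqla_regex file does it, and the table here is a throwaway in-memory example.

```python
import re
import sqlite3


def regexp(pattern, value):
    """Backs the SQL REGEXP operator: 'X REGEXP Y' calls regexp(Y, X)."""
    if value is None:
        return 0
    return int(re.search(pattern, value) is not None)


conn = sqlite3.connect(":memory:")
# Register the Python function under the name SQLite looks up for REGEXP.
conn.create_function("REGEXP", 2, regexp)

# Throwaway demo table; the names mirror aws_iam_object.arn for readability.
conn.execute("CREATE TABLE aws_iam_object (arn TEXT)")
conn.executemany(
    "INSERT INTO aws_iam_object (arn) VALUES (?)",
    [
        ("arn:aws:iam::000000000000:role/SecurityMonkey",),
        ("arn:aws:iam::000000000000:role/Aardvark",),
    ],
)

rows = conn.execute(
    "SELECT arn FROM aws_iam_object WHERE arn REGEXP ?", ("^.*Monkey$",)
).fetchall()
matches = [row[0] for row in rows]
print(matches)
```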

TLS

We recommend enabling TLS for any service. Instructions for setting up TLS are out of scope for this document.

Signals

New in v0.3.1

Aardvark uses Blinker for signals in its update process. These signals can be used for things like emitting metrics, additional logging, or taking more actions on accounts. You can use them by writing a script that defines your handlers and calls aardvark.manage.main(). For example, create a file called signals_example.py with the following contents:

import logging

from aardvark.manage import main
from aardvark.updater import AccountToUpdate

logger = logging.getLogger('aardvark_signals')


@AccountToUpdate.on_ready.connect
def handle_on_ready(sender):
    logger.info(f"got on_ready from {sender}")


@AccountToUpdate.on_complete.connect
def handle_on_complete(sender):
    logger.info(f"got on_complete from {sender}")


if __name__ == "__main__":
    main()

This file can now be invoked in the same way as manage.py:

python signals_example.py update -a cool_account

The log output will be similar to the following:

INFO: getting bucket swag-bucket
INFO: Thread #1 updating account 123456789012 with all arns
INFO: got on_ready from <aardvark.updater.AccountToUpdate object at 0x10c379b50>
INFO: got on_complete from <aardvark.updater.AccountToUpdate object at 0x10c379b50>
INFO: Thread #1 persisting data for account 123456789012
INFO: Thread #1 FINISHED persisting data for account 123456789012

Available signals

Class | Signals
manage.UpdateAccountThread | on_ready, on_complete, on_failure
updater.AccountToUpdate | on_ready, on_complete, on_error, on_failure
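For the metrics use case mentioned above, handlers for the error and failure signals could keep simple counters. This is a minimal sketch: make_counting_handler and signal_counts are hypothetical names, and the handlers accept **kwargs because the extra arguments Aardvark sends with each signal are not documented here.

```python
import logging
from collections import Counter

logger = logging.getLogger("aardvark_signals")

# In-process tally of how often each signal fired.
signal_counts = Counter()


def make_counting_handler(signal_name):
    """Return a handler that counts and logs one named signal."""
    def handler(sender, **kwargs):
        signal_counts[signal_name] += 1
        logger.info("got %s from %s", signal_name, sender)
    return handler


# Keep module-level references so the handlers stay alive after connecting.
handle_on_error = make_counting_handler("on_error")
handle_on_failure = make_counting_handler("on_failure")

# In a real script these would be connected as in signals_example.py:
#   AccountToUpdate.on_error.connect(handle_on_error)
#   AccountToUpdate.on_failure.connect(handle_on_failure)
```

After the update finishes, signal_counts can be logged or forwarded to whatever metrics system is in use.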

TODO:

See TODO

aardvark's Issues

Suggestion: Continue to cut releases instead of pulling from master branch (similar to Repokid issue)

Just thought that I'd add this here to aardvark as well since there aren't any releases right now. :)

Similar to https://github.com/Netflix/repokid

FYI - @mcpeak

One thing that might make this easier, especially if you want to push Aardvark and Repokid to PyPI: you can auto-deploy to PyPI by using the Travis CI PyPI provider. That way it isn't as much of a PITA. Whenever you tag a new release, it builds, and unless there are errors, it pushes to PyPI. Saves a bunch of time.

Aardvark configuration error

I started using Aardvark, and when trying to run the aardvark config command, I got the following error:

sys.exit(load_entry_point('aardvark', 'console_scripts', 'aardvark')())
 File "/Users/zeuser/other-repos/aardvark/env/bin/aardvark", line 25, in importlib_load_entry_point
   return next(matches).load()
 File "/usr/local/Cellar/python@3.9/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/importlib/metadata.py", line 86, in load
   module = import_module(match.group('module'))
 File "/usr/local/Cellar/python@3.9/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/importlib/__init__.py", line 127, in import_module
   return _bootstrap._gcd_import(name[level:], package, level)
 File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
 File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
 File "<frozen importlib._bootstrap>", line 972, in _find_and_load_unlocked
 File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
 File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
 File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
 File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
 File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
 File "<frozen importlib._bootstrap_external>", line 850, in exec_module
 File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
 File "/Users/zeuser/other-repos/aardvark/aardvark/__init__.py", line 10, in <module>
   from flask_sqlalchemy import SQLAlchemy
 File "/Users/zeuser/other-repos/aardvark/env/lib/python3.9/site-packages/Flask_SQLAlchemy-2.5.1-py3.9.egg/flask_sqlalchemy/__init__.py", line 14, in <module>
   from flask import _app_ctx_stack, abort, current_app, request
 File "/Users/zeuser/other-repos/aardvark/env/lib/python3.9/site-packages/Flask-1.0.2-py3.9.egg/flask/__init__.py", line 19, in <module>
   from jinja2 import Markup, escape
ImportError: cannot import name 'Markup' from 'jinja2' (/Users/zeuser/other-repos/aardvark/env/lib/python3.9/site-packages/Jinja2-3.1.2-py3.9.egg/jinja2/__init__.py)

I got the same error both on a local machine and when following the Docker Compose procedure.

pg_config executable not found

On Ubuntu 16.04, following installation instructions (sudo python setup.py develop):

Running psycopg2-2.7.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-H62v31/psycopg2-2.7.1/egg-dist-tmp-ba8igp
Error: pg_config executable not found.

Please add the directory containing pg_config to the PATH
or specify the full executable path with the option:

    python setup.py build_ext --pg-config /path/to/pg_config build ...

or with the pg_config option in 'setup.cfg'.
error: Setup script exited with 1
[etucker: aardvark]$ 

I solved this using https://stackoverflow.com/a/12037133/4075901.

Perhaps this could be documented in the readme?

Python2 vs Python3

Was receiving this error, went down a rabbit hole, turns out I was using the wrong Python version. It just wasn't obvious from the README as to which version to use. :)

(.virtualenv) $ aardvark config
Traceback (most recent call last):
  File "/Users/heydonovan/aardvark/.virtualenv/bin/aardvark", line 11, in <module>
    load_entry_point('aardvark', 'console_scripts', 'aardvark')()
    └ <function load_entry_point at 0x1041f2f80>
  File "/Users/heydonovan/aardvark/.virtualenv/lib/python3.7/site-packages/pkg_resources/__init__.py", line 489, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
           │                │                      │      └ 'aardvark'
           │                │                      └ 'console_scripts'
           │                └ 'aardvark'
           └ <function get_distribution at 0x1041f2ef0>
  File "/Users/heydonovan/aardvark/.virtualenv/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2793, in load_entry_point
    return ep.load()
           └ EntryPoint.parse('aardvark = aardvark.manage:main')
  File "/Users/heydonovan/aardvark/.virtualenv/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2411, in load
    return self.resolve()
           └ EntryPoint.parse('aardvark = aardvark.manage:main')
  File "/Users/heydonovan/aardvark/.virtualenv/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2417, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
                        └ EntryPoint.parse('aardvark = aardvark.manage:main')
  File "/Users/heydonovan/aardvark/aardvark/aardvark/manage.py", line 3, in <module>
    import Queue
ModuleNotFoundError: No module named 'Queue'
(.virtualenv) $ python --version
Python 3.7.4

Remove Python 2 compatibility code

We're no longer compatible with Python 2, so we can remove all of the compatibility workarounds.

For example:

try:  # Python 2
    raw_input
except NameError:  # Python 3
    raw_input = input
try:  # Python 2
    unicode
except NameError:  # Python 3
    unicode = str

This block can be removed and the code updated to use str and input.

Syntax Error more_itertools

Running using Python2.7 (which I think is the target version of aardvark?)

Using the instructions on the readme, upon running aardvark config I received the following error:

(aardvark) sam@blah:~/code/aardvark$ aardvark config
Traceback (most recent call last):
  File "/home/sam/code/aardvark/bin/aardvark", line 11, in <module>
    load_entry_point('aardvark', 'console_scripts', 'aardvark')()
  File "/home/sam/code/aardvark/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 489, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/home/sam/code/aardvark/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2852, in load_entry_point
    return ep.load()
  File "/home/sam/code/aardvark/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2443, in load
    return self.resolve()
  File "/home/sam/code/aardvark/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2449, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/home/sam/code/aardvark/aardvark/__init__.py", line 9, in <module>
    from flasgger import Swagger
  File "/home/sam/code/aardvark/local/lib/python2.7/site-packages/flasgger-0.6.3-py2.7.egg/flasgger/__init__.py", line 7, in <module>
    from jsonschema import ValidationError  # noqa
  File "/home/sam/code/aardvark/local/lib/python2.7/site-packages/jsonschema-3.1.1-py2.7.egg/jsonschema/__init__.py", line 31, in <module>
    import importlib_metadata
  File "/home/sam/code/aardvark/local/lib/python2.7/site-packages/importlib_metadata-0.23-py2.7.egg/importlib_metadata/__init__.py", line 9, in <module>
    import zipp
  File "/home/sam/code/aardvark/local/lib/python2.7/site-packages/zipp-0.6.0-py2.7.egg/zipp.py", line 12, in <module>
    import more_itertools
  File "build/bdist.linux-x86_64/egg/more_itertools/__init__.py", line 1, in <module>
  File "/home/sam/code/aardvark/local/lib/python2.7/site-packages/more_itertools-7.2.0-py2.7.egg/more_itertools/more.py", line 340
    def _collate(*iterables, key=lambda a: a, reverse=False):
                               ^
SyntaxError: invalid syntax

This is due to more-itertools no longer supporting Python 2 after version 5.0.0. The version installed via setup.py is 7.2.0.

The solution is simple: pip install more-itertools==5.0.0

Might be worth getting this pinned.

Boto3 Client Threading Error

Hello -

I'm attempting to run aardvark update inside a docker container at aws ecs. However, when I try to run it against two accounts, I'm getting this error:

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/apps/aardvark/aardvark/aardvark/manage.py", line 48, in run
    ret_code, aa_data = account.update_account()
  File "/apps/aardvark/aardvark/aardvark/updater/__init__.py", line 39, in update_account
    arns = self._get_arns()
  File "/apps/aardvark/aardvark/aardvark/updater/__init__.py", line 62, in _get_arns
    'iam', service_type='client', **self.conn_details)
  File "/usr/local/lib/python2.7/dist-packages/cloudaux-1.3.6-py2.7.egg/cloudaux/aws/decorators.py", line 35, in decorated_function
    retval = f(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/cloudaux-1.3.6-py2.7.egg/cloudaux/aws/sts.py", line 91, in boto3_cached_conn
    sts = boto3.client('sts')
  File "/usr/local/lib/python2.7/dist-packages/boto3/__init__.py", line 83, in client
    return _get_default_session().client(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/boto3/session.py", line 263, in client
    aws_session_token=aws_session_token, config=config)
  File "/usr/local/lib/python2.7/dist-packages/botocore/session.py", line 826, in create_client
    endpoint_resolver = self.get_component('endpoint_resolver')
  File "/usr/local/lib/python2.7/dist-packages/botocore/session.py", line 701, in get_component
    return self._components.get_component(name)
  File "/usr/local/lib/python2.7/dist-packages/botocore/session.py", line 901, in get_component
    del self._deferred[name]
KeyError: 'endpoint_resolver'

It appears to be because boto3's default client isn't thread safe (recommended solution: give each thread its own session), but I'm confused about why nobody else would have run into it. Thread-1 runs and completes successfully. Perhaps it's something related to ECS. Any ideas?
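The "one session per thread" idea can be sketched with threading.local. The helper below is hypothetical and deliberately library-agnostic. With boto3 the factory would be boto3.session.Session, and clients would then come from session.client('sts') rather than the module-level boto3.client('sts') seen in the traceback. Since that call lives inside cloudaux, an actual fix may belong there.

```python
import threading

# Each thread sees its own independent attributes on this object.
_local = threading.local()


def get_thread_session(session_factory):
    """Return this thread's session, creating it on first use."""
    session = getattr(_local, "session", None)
    if session is None:
        session = session_factory()
        _local.session = session
    return session
```

Repeated calls from the same thread return the same object, while a different thread gets a session of its own, so no session state is shared across threads.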

Tests are needed

We have had an issue where Aardvark was broken and not updating because of a recent PR. Originally we didn't prioritize tests because we planned to deprecate Aardvark, but now it seems like a useful caching layer for this data. We need tests to cover the core logic so we can continue to accept upstream commits with confidence.

Fix Travis CI

Travis CI currently doesn't seem to find the .travis.yml file, so builds don't run, as can be seen here:

https://travis-ci.org/Netflix-Skunkworks/aardvark

Even the most recent merge did not trigger a build.

travis-ci/travis-ci#3936

There are a few issues about this listed in Travis. Most of the time it appears to be a naming issue in the Travis configuration on the site, so whoever has the credentials will most likely have to log in and make some changes. Comment 2 in the linked issue explains more.

Updater threads crash when AssumeRole denied by SCP

Aardvark doesn't handle exceptions in UpdateAccountThread.run() defined in file /aardvark/manage.py. Aardvark also doesn't handle exceptions from CloudAux when it attempts to get IAM client from cloudaux.aws.sts.boto3_cached_conn() in method AccountToUpdate._get_client() defined in file /aardvark/updater/__init__.py. When there is an exception due to permission being denied by SCP, the thread crashes. Each occurrence reduces the effective thread count by 1 until they're all hung, at which point killing the OS process is the only option.

This issue was discovered in a deployment where AWS accounts slated for closure are first moved into an Organizations OU ("KIA") whose SCP denies all access. With ~40 AWS accounts in KIA out of ~850 total, Aardvark would process <100 accounts over several hours before all threads were crashed. The order of the accounts in the swag database appears to determine aardvark's processing order and thus timing of the crashes, so it's possible, for example, that all threads could crash on start if all the KIA accounts were at the top of the list.

The root exception in this scenario is an AccessDenied error from botocore. The source/nature of the AccessDenied exception seems to matter. For example, if AccessDenied is due to the aardvark role missing from the target account, a message is logged and aardvark keeps on trucking. But for some reason when SCP is the culprit, Aardvark chokes. Further investigation could be done to discover the distinction, but for our purposes it's sufficient to say that root cause was an unhandled exception as described above.

Fixing this involved adding exception handling to the two methods listed in the description, and replacing one instance of a direct call from AccountToUpdate._get_arns() to cloudaux.aws.sts.boto3_cached_conn() with a local call to AccountToUpdate._get_client(). Current behavior if the call from UpdateAccountThread.run() to AccountToUpdate.update_account() returns an error (i.e., no exception), is to put account back on the queue for processing again. In the case of an exception, however, it's simpler to assume the cause is not transient and continue, thus avoiding an infinite loop. If the issue was transient then the account will be processed the next time the updater runs, without data loss.
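The handling described above might look roughly like the following worker loop. It is a sketch of the idea, not the contributed patch. run_worker, update_account, and persist are hypothetical names. A non-zero ret_code is assumed to mean a returned error. The max_attempts bound is an addition of this example to keep the requeue path from looping forever.

```python
import logging
import queue

logger = logging.getLogger("aardvark_example")


def run_worker(work, update_account, persist, max_attempts=3):
    """Drain (account, attempt) pairs from the queue without ever crashing."""
    while True:
        try:
            account, attempt = work.get_nowait()
        except queue.Empty:
            return
        try:
            ret_code, data = update_account(account)
        except Exception:
            # Treat exceptions as non-transient: log, skip, keep the thread alive.
            logger.exception("Skipping account %s after unexpected error", account)
            continue
        if ret_code != 0:
            # A returned error is retried by putting the account back on the queue.
            if attempt < max_attempts:
                work.put((account, attempt + 1))
            continue
        persist(account, data)
```

An account skipped because of an exception is simply picked up again the next time the updater runs.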

I'd like to contribute the code fixing this issue, but obviously don't have permission to create a branch. Would you prefer that I fork the project and create a pull request from that, submit a diff, or something else? I also added some debug-level logging while troubleshooting, which I'll keep in the pull request because it's valuable for monitoring when processing hundreds of AWS accounts.

thanks!

Unable to retrieve data from sqlite db

I have created the Aardvark DB (SQLite) and run the update command to fetch the data. The data is populated as well:

/usr/local/bin/aardvark start_api -b 0.0.0.0:5000
[root@ip-10-20-7-230 system]# netstat -anp|grep LISTEN
tcp        0      0 0.0.0.0:5000            0.0.0.0:*               LISTEN      7584/python3   
[root@ip-10-20-7-230 aardvark]# sqlite3 aardvark.db 
SQLite version 3.7.17 2013-05-20 00:56:22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> 
sqlite> select count(*) from aws_iam_object;
2425
sqlite> 

healthcheck is passing as well

$ curl http://10.20.7.230:5000/healthcheck && echo
ok

But when I try to fetch data from the API, I get this error:

$ curl http://10.20.7.230:5000/api/1/advisors -X POST
{
  "message": "(sqlite3.OperationalError) no such table: aws_iam_object\n[SQL: SELECT aws_iam_object.id AS aws_iam_object_id, aws_iam_object.arn AS aws_iam_object_arn, aws_iam_object.\"lastUpdated\" AS \"aws_iam_object_lastUpdated\" \nFROM aws_iam_object\n LIMIT ? OFFSET ?]\n[parameters: (100, 0)]\n(Background on this error at: http://sqlalche.me/e/e3q8)"
}

I can see that the table is present and has values in it. Can you please assist further? Thanks

NUM_THREADS should be int, not str

  File ".../aardvark/local/lib/python2.7/site-packages/aardvark/manage.py", line 190, in update
    for thread_num in range(num_threads):
TypeError: <flask_script.commands.Command object at 0x7f58057aa4d0>: range() integer end argument expected, got str.

Checking out the config file generated by aardvark, it's setting the value to "1".
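One way to guard against this is to coerce the configured value to an integer before it reaches range(). The sketch below is an illustration, not the fix that landed in Aardvark. get_num_threads is a hypothetical helper, and the default of 5 mirrors the value offered by the config wizard.

```python
def get_num_threads(config, default=5):
    """Read NUM_THREADS from a config mapping and return it as a positive int."""
    value = config.get("NUM_THREADS", default)
    try:
        num_threads = int(value)
    except (TypeError, ValueError):
        raise ValueError(f"NUM_THREADS must be an integer, got {value!r}")
    if num_threads < 1:
        raise ValueError(f"NUM_THREADS must be at least 1, got {num_threads}")
    return num_threads
```

With this in place a config file containing NUM_THREADS = "1" yields the integer 1, and range(num_threads) works as expected.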

ProgrammingError: (psycopg2.ProgrammingError) relation "aws_iam_object" does not exist

Steps to Reproduce:
export AARDVARK_ROLE="aardvark"
export AARDVARK_DB_URI= "postgresql://{{ aardvark_db_username }}:{{ aardvark_db_password }}@{{ aardvark_rds_dnsrecord }}:5432/aardvark"
export AARDVARK_ACCOUNTS="XXXX,XXXXX"

docker run -v aardvark-data:/usr/share/aardvark-data -e AARDVARK_ACCOUNTS --rm aardvark-collector

Actual Error Message:
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/local/lib/python2.7/site-packages/aardvark/manage.py", line 79, in run
    persist_aa_data(self.app, aa_data)
  File "/usr/local/lib/python2.7/site-packages/aardvark/manage.py", line 103, in persist_aa_data
    item = AWSIAMObject.get_or_create(arn)
  File "/usr/local/lib/python2.7/site-packages/aardvark/model.py", line 26, in get_or_create
    item = AWSIAMObject.query.filter(AWSIAMObject.arn == arn).scalar()
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 3322, in scalar
    ret = self.one()
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 3292, in one
    ret = self.one_or_none()
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 3261, in one_or_none
    ret = list(self)
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 3334, in __iter__
    return self._execute_and_instances(context)
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 3359, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 988, in execute
    return meth(self, multiparams, params)
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1107, in _execute_clauseelement
    distilled_params,
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1253, in _execute_context
    e, statement, parameters, cursor, context
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1473, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1249, in _execute_context
    cursor, statement, parameters, context
  File "/usr/local/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 552, in do_execute
    cursor.execute(statement, parameters)
ProgrammingError: (psycopg2.ProgrammingError) relation "aws_iam_object" does not exist
LINE 2: FROM aws_iam_object
^

[SQL: SELECT aws_iam_object.id AS aws_iam_object_id, aws_iam_object.arn AS aws_iam_object_arn, aws_iam_object."lastUpdated" AS "aws_iam_object_lastUpdated"
FROM aws_iam_object
WHERE aws_iam_object.arn = %(arn_1)s]
[parameters: {'arn_1': 'arn:aws:iam:::policy/service-r**/'}]
(Background on this error at: http://sqlalche.me/e/f405)

Docker compose error

ERROR: Named volume "data:/data:rw" is used in service "init" but no declaration was found in the volumes section.

Here is the fix - add this to docker-compose.yaml

volumes:
  data:

Problem installing: deepdiff cannot be installed

I tried installing Aardvark as per the docs and got stuck at "python setup.py develop". It complains that it cannot install deepdiff.
OS: Ubuntu 18.04.2 LTS
Python: 2.7.15rc1

Steps to reproduce:

mkvirtualenv a1
git clone https://github.com/Netflix-Skunkworks/aardvark.git a1
cd a1
python setup.py develop

Error messages:

Searching for deepdiff>=3.3.0
Reading https://pypi.org/simple/deepdiff/
Downloading https://files.pythonhosted.org/packages/19/6e/47b8ec63a0dea28c7d59e8cfadc4ea11c53ee156100bf42fd63d92f32e65/deepdiff-4.0.6.tar.gz#sha256=55e461f56dcae3dc540746b84434562fb7201e5c27ecf28800e4cfdd17f61e56
Best match: deepdiff 4.0.6
Processing deepdiff-4.0.6.tar.gz
Writing /tmp/easy_install-szmVV3/deepdiff-4.0.6/setup.cfg
Running deepdiff-4.0.6/setup.py -q bdist_egg --dist-dir /tmp/easy_install-szmVV3/deepdiff-4.0.6/egg-dist-tmp-g3E7mG
error: Setup script exited with Python 2 is not supported anymore. The last version of DeepDiff that supported Py2 was 3.3.0

Full output is here: setup.log

FWIW: had the same issue on Ubuntu 16.04 and Amazon Linux.
As a workaround, I did a pip install deepdiff (which somehow installs deepdiff-3.3.0, not 4.0.6). Running python setup.py develop afterwards is successful.

ValueError: 'missing' must not be set for required fields.

I'm getting

ValueError: 'missing' must not be set for required fields.

error message when I'm running

aardvark config

(aardvark) [ec2-user@ip-172-18-0-103 aardvark]$ aardvark config
Traceback (most recent call last):
  File "/home/ec2-user/.virtualenvs/aardvark/bin/aardvark", line 11, in <module>
    load_entry_point('aardvark', 'console_scripts', 'aardvark')()
    └ <function load_entry_point at 0x7fba5d91e048>
  File "/home/ec2-user/.virtualenvs/aardvark/lib/python3.7/site-packages/pkg_resources/__init__.py", line 487, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
           │                │                      │      └ 'aardvark'
           │                │                      └ 'console_scripts'
           │                └ 'aardvark'
           └ <function get_distribution at 0x7fba5d911f28>
  File "/home/ec2-user/.virtualenvs/aardvark/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2728, in load_entry_point
    return ep.load()
           └ EntryPoint.parse('aardvark = aardvark.manage:main')
  File "/home/ec2-user/.virtualenvs/aardvark/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2346, in load
    return self.resolve()
           └ EntryPoint.parse('aardvark = aardvark.manage:main')
  File "/home/ec2-user/.virtualenvs/aardvark/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2352, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
                        └ EntryPoint.parse('aardvark = aardvark.manage:main')
  File "/home/ec2-user/aardvark/aardvark/manage.py", line 14, in <module>
    from swag_client.util import parse_swag_config_options
  File "/home/ec2-user/.virtualenvs/aardvark/lib/python3.7/site-packages/swag_client-0.2.10-py3.7.egg/swag_client/util.py", line 8, in <module>
    class OptionsSchema(Schema):
  File "/home/ec2-user/.virtualenvs/aardvark/lib/python3.7/site-packages/swag_client-0.2.10-py3.7.egg/swag_client/util.py", line 10, in OptionsSchema
    namespace = fields.String(required=True, missing='accounts')
                └ <module 'marshmallow.fields' from '/home/ec2-user/.virtualenvs/aardvark/lib/python3.7/site-packages/marshmallow-3.0.0rc3-py3.7.e...
  File "/home/ec2-user/.virtualenvs/aardvark/lib/python3.7/site-packages/marshmallow-3.0.0rc3-py3.7.egg/marshmallow/fields.py", line 162, in __init__
    raise ValueError("'missing' must not be set for required fields.")
ValueError: 'missing' must not be set for required fields.

Worker timeout

aardvark start_api -b 0.0.0.0:5000
[2019-07-11 12:17:44 -0500] [53582] [INFO] Starting gunicorn 19.7.1
[2019-07-11 12:17:44 -0500] [53582] [INFO] Listening at: http://0.0.0.0:5000 (53582)
[2019-07-11 12:17:44 -0500] [53582] [INFO] Using worker: sync
[2019-07-11 12:17:44 -0500] [53585] [INFO] Booting worker with pid: 53585

[2019-07-11 12:19:03 -0500] [53582] [CRITICAL] WORKER TIMEOUT (pid:53585)
[2019-07-11 12:19:03 -0500] [53585] [INFO] Worker exiting (pid: 53585)
[2019-07-11 12:19:03 -0500] [53605] [INFO] Booting worker with pid: 53605
[2019-07-11 12:21:37 -0500] [53582] [CRITICAL] WORKER TIMEOUT (pid:53605)
[2019-07-11 12:21:37 -0500] [53605] [INFO] Worker exiting (pid: 53605)
[2019-07-11 12:21:37 -0500] [53633] [INFO] Booting worker with pid: 53633
[2019-07-11 13:04:54 -0500] [53582] [CRITICAL] WORKER TIMEOUT (pid:53633)
[2019-07-11 13:04:54 -0500] [53633] [INFO] Worker exiting (pid: 53633)
[2019-07-11 13:04:55 -0500] [54912] [INFO] Booting worker with pid: 54912
[2019-07-11 13:05:57 -0500] [53582] [CRITICAL] WORKER TIMEOUT (pid:54912)
[2019-07-11 13:05:57 -0500] [54912] [INFO] Worker exiting (pid: 54912)
[2019-07-11 13:05:57 -0500] [55036] [INFO] Booting worker with pid: 55036
[2019-07-11 13:06:57 -0500] [53582] [CRITICAL] WORKER TIMEOUT (pid:55036)
[2019-07-11 13:06:57 -0500] [55036] [INFO] Worker exiting (pid: 55036)
[2019-07-11 13:06:57 -0500] [55040] [INFO] Booting worker with pid: 55040
[2019-07-11 13:08:15 -0500] [53582] [CRITICAL] WORKER TIMEOUT (pid:55040)
[2019-07-11 13:08:15 -0500] [55040] [INFO] Worker exiting (pid: 55040)
[2019-07-11 13:08:15 -0500] [55112] [INFO] Booting worker with pid: 55112
[2019-07-11 13:09:36 -0500] [53582] [CRITICAL] WORKER TIMEOUT (pid:55112)
[2019-07-11 13:09:36 -0500] [55112] [INFO] Worker exiting (pid: 55112)
[2019-07-11 13:09:36 -0500] [55296] [INFO] Booting worker with pid: 55296
[2019-07-11 13:10:37 -0500] [53582] [CRITICAL] WORKER TIMEOUT (pid:55296)
[2019-07-11 13:10:37 -0500] [55296] [INFO] Worker exiting (pid: 55296)
[2019-07-11 13:10:37 -0500] [55396] [INFO] Booting worker with pid: 55396
[2019-07-11 13:11:38 -0500] [53582] [CRITICAL] WORKER TIMEOUT (pid:55396)
[2019-07-11 13:11:38 -0500] [55396] [INFO] Worker exiting (pid: 55396)
[2019-07-11 13:11:38 -0500] [55463] [INFO] Booting worker with pid: 55463
[2019-07-11 13:12:39 -0500] [53582] [CRITICAL] WORKER TIMEOUT (pid:55463)
[2019-07-11 13:12:39 -0500] [55463] [INFO] Worker exiting (pid: 55463)
[2019-07-11 13:12:39 -0500] [55475] [INFO] Booting worker with pid: 55475
[2019-07-11 13:13:40 -0500] [53582] [CRITICAL] WORKER TIMEOUT (pid:55475)
[2019-07-11 13:13:40 -0500] [55475] [INFO] Worker exiting (pid: 55475)
[2019-07-11 13:13:41 -0500] [55485] [INFO] Booting worker with pid: 55485
[2019-07-11 13:14:42 -0500] [53582] [CRITICAL] WORKER TIMEOUT (pid:55485)
[2019-07-11 13:14:42 -0500] [55485] [INFO] Worker exiting (pid: 55485)
[2019-07-11 13:14:42 -0500] [55494] [INFO] Booting worker with pid: 55494
[2019-07-11 13:15:44 -0500] [53582] [CRITICAL] WORKER TIMEOUT (pid:55494)
[2019-07-11 13:15:44 -0500] [55494] [INFO] Worker exiting (pid: 55494)
[2019-07-11 13:15:44 -0500] [55539] [INFO] Booting worker with pid: 55539
[2019-07-11 13:16:45 -0500] [53582] [CRITICAL] WORKER TIMEOUT (pid:55539)
[2019-07-11 13:16:45 -0500] [55539] [INFO] Worker exiting (pid: 55539)
[2019-07-11 13:16:45 -0500] [55604] [INFO] Booting worker with pid: 55604
[2019-07-11 14:34:28 -0500] [53582] [CRITICAL] WORKER TIMEOUT (pid:55604)
[2019-07-11 14:34:28 -0500] [55604] [INFO] Worker exiting (pid: 55604)
[2019-07-11 14:34:28 -0500] [55749] [INFO] Booting worker with pid: 55749

Why do the workers keep timing out?

Abstract data collection

Abstract collection of usage data into Retriever plugins. This will add the ability to collect data from more sources (e.g. CloudTrail) or in different ways (e.g. IAM Access Advisor data via AWS Organizations).
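
As a rough illustration of what such a plugin abstraction might look like, here is a minimal sketch. Every name in it (Retriever, StaticRetriever, collect) is hypothetical and not taken from the Aardvark code base; the real design may differ.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Retriever(ABC):
    """Hypothetical plugin interface: one data source per subclass."""

    @abstractmethod
    def retrieve(self, arn: str) -> List[Dict[str, str]]:
        """Return usage records for a single ARN."""


class StaticRetriever(Retriever):
    """Stand-in source that serves canned records, for illustration only."""

    def __init__(self, records: Dict[str, List[Dict[str, str]]]) -> None:
        self._records = records

    def retrieve(self, arn: str) -> List[Dict[str, str]]:
        # Unknown ARNs simply yield no records.
        return self._records.get(arn, [])


def collect(arn: str, retrievers: List[Retriever]) -> List[Dict[str, str]]:
    """Run every registered retriever and concatenate the results."""
    results: List[Dict[str, str]] = []
    for retriever in retrievers:
        results.extend(retriever.retrieve(arn))
    return results


if __name__ == "__main__":
    arn = "arn:aws:iam::123456789012:role/example"
    advisor = StaticRetriever({arn: [{"service": "s3", "source": "access_advisor"}]})
    trail = StaticRetriever({arn: [{"service": "ec2", "source": "cloudtrail"}]})
    print(collect(arn, [advisor, trail]))
```

With an interface like this, a new source would only need a new Retriever subclass, and the code that stores the results would not have to change.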

plus sign in arn got skipped

Hi,

We ran into an issue, probably in the advisor.generateReport function in awsconsole.js: when an ARN contains a "+" sign (as part of the user's name), the "+" is turned into a space in the POST request sent to AWS, which returns this error message:

ERROR GenerateServiceLastAccessedDetails arn:aws:iam::123456789012:user/foo+bar. Skipping... {"readyState":4,"responseText":"{\"errors\":[{\"message\":\"ARN arn:aws:iam::123456789012:user/foo bar does not exist.\",\"code\":\"NoSuchEntity\",\"type\":\"Sender\"}]}","responseJSON":{"errors":[{"message":"ARN arn:aws:iam::123456789012:user/foo bar does not exist.","code":"NoSuchEntity","type":"Sender"}]},"status":404,"statusText":"Not Found"}

I think this is caused by string parsing on the AWS end, but I'd like to hear your opinion and whether you have seen similar issues. Thanks!
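
One possible explanation, not confirmed in this thread: in a form-encoded request body (application/x-www-form-urlencoded), a literal "+" decodes to a space, so the sign has to be sent as %2B. The sketch below reproduces that behaviour with Python's standard library only. The arn field name and the assumption that the request body is form-encoded are illustrative guesses; the code does not touch awsconsole.js or AWS.

```python
from urllib.parse import parse_qs, urlencode

arn = "arn:aws:iam::123456789012:user/foo+bar"

# Hypothetical body built by plain string concatenation: the "+" is sent raw.
raw_body = "arn=" + arn
# Form decoding treats a raw "+" as a space, which matches the error message.
decoded_raw = parse_qs(raw_body)["arn"][0]

# urlencode percent-encodes "+" as %2B, so it survives the round trip.
safe_body = urlencode({"arn": arn})
decoded_safe = parse_qs(safe_body)["arn"][0]

print(decoded_raw)
print(decoded_safe)
```

If this is the cause, percent-encoding the ARN before it is placed in the request body would be the fix on the client side.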

Additional rights are required

The following rights are mentioned in the Readme:

iam:GenerateServiceLastAccessedDetails
iam:GetServiceLastAccessedDetails
iam:listrolepolicies
iam:listroles

With the current commit, "iam:ListUsers", "iam:ListPolicies" and "iam:ListGroups" are also necessary:

(venv) ubuntu@*******:/usr/local/src/aardvark$ git show
commit 86021eb8c5e7ee74c8685bb2d925dfff046fd846
Author: Thomas Desrosiers <[email protected]>
Date:   Sun Aug 6 09:37:06 2017 -0700

    Fix syntax error in updater (#11)

    👍

diff --git a/aardvark/updater/__init__.py b/aardvark/updater/__init__.py
index 00a8675..e9d081f 100644
--- a/aardvark/updater/__init__.py
+++ b/aardvark/updater/__init__.py
@@ -3,7 +3,7 @@ import os
 import tempfile
 import urllib

-from cloudaux.aws.iam import list_roles
+from cloudaux.aws.iam import list_roles, list_users
 from cloudaux.aws.sts import boto3_cached_conn
 import requests
 import subprocess32
@@ -61,10 +61,13 @@ class AccountToUpdate(object):
         client = boto3_cached_conn(
             'iam', service_type='client', **self.conn_details)

-        account_arns = set(
-            [role['Arn'] for role in list_roles(**self.conn_details)]
-            + [user['Arn'] for user in list_users(**self.conn_details)]
-        )
+        account_arns = set()
+
+        for role in list_roles(**self.conn_details):
+            account_arns.add(role['Arn'])
+
+        for user in list_users(**self.conn_details):
+            account_arns.add(user['Arn'])

         for page in client.get_paginator('list_policies').paginate(Scope='Local'):
             for policy in page['Policies']:

Otherwise, I get one of the following errors:
iam:ListGroups

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/local/src/aardvark/aardvark/manage.py", line 48, in run
    ret_code, aa_data = account.update_account()
  File "/usr/local/src/aardvark/aardvark/updater/__init__.py", line 39, in update_account
    arns = self._get_arns()
  File "/usr/local/src/aardvark/aardvark/updater/__init__.py", line 76, in _get_arns
    for page in client.get_paginator('list_groups').paginate():
  File "/usr/local/src/aardvark/venv/lib/python2.7/site-packages/botocore-1.5.92-py2.7.egg/botocore/paginate.py", line 249, in __iter__
    response = self._make_request(current_kwargs)
  File "/usr/local/src/aardvark/venv/lib/python2.7/site-packages/botocore-1.5.92-py2.7.egg/botocore/paginate.py", line 326, in _make_request
    return self._method(**current_kwargs)
  File "/usr/local/src/aardvark/venv/lib/python2.7/site-packages/botocore-1.5.92-py2.7.egg/botocore/client.py", line 310, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/src/aardvark/venv/lib/python2.7/site-packages/botocore-1.5.92-py2.7.egg/botocore/client.py", line 599, in _make_api_call
    raise error_class(parsed_response, operation_name)
ClientError: An error occurred (AccessDenied) when calling the ListGroups operation: User: arn:aws:sts::**********:assumed-role/Aardvark/aardvark is not authorized to perform: iam:ListGroups on resource: arn:aws:iam::**********:group/

iam:ListPolicies

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/local/src/aardvark/aardvark/manage.py", line 48, in run
    ret_code, aa_data = account.update_account()
  File "/usr/local/src/aardvark/aardvark/updater/__init__.py", line 39, in update_account
    arns = self._get_arns()
  File "/usr/local/src/aardvark/aardvark/updater/__init__.py", line 72, in _get_arns
    for page in client.get_paginator('list_policies').paginate(Scope='Local'):
  File "/usr/local/src/aardvark/venv/lib/python2.7/site-packages/botocore-1.5.92-py2.7.egg/botocore/paginate.py", line 249, in __iter__
    response = self._make_request(current_kwargs)
  File "/usr/local/src/aardvark/venv/lib/python2.7/site-packages/botocore-1.5.92-py2.7.egg/botocore/paginate.py", line 326, in _make_request
    return self._method(**current_kwargs)
  File "/usr/local/src/aardvark/venv/lib/python2.7/site-packages/botocore-1.5.92-py2.7.egg/botocore/client.py", line 310, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/src/aardvark/venv/lib/python2.7/site-packages/botocore-1.5.92-py2.7.egg/botocore/client.py", line 599, in _make_api_call
    raise error_class(parsed_response, operation_name)
ClientError: An error occurred (AccessDenied) when calling the ListPolicies operation: User: arn:aws:sts::**********:assumed-role/Aardvark/aardvark is not authorized to perform: iam:ListPolicies on resource: policy path /

iam:ListUsers:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/local/src/aardvark/aardvark/manage.py", line 48, in run
    ret_code, aa_data = account.update_account()
  File "/usr/local/src/aardvark/aardvark/updater/__init__.py", line 39, in update_account
    arns = self._get_arns()
  File "/usr/local/src/aardvark/aardvark/updater/__init__.py", line 69, in _get_arns
    for user in list_users(**self.conn_details):
  File "/usr/local/src/aardvark/venv/lib/python2.7/site-packages/cloudaux-1.3.5-py2.7.egg/cloudaux/aws/sts.py", line 129, in decorated_function
    return f(*args, **kwargs)
  File "/usr/local/src/aardvark/venv/lib/python2.7/site-packages/cloudaux-1.3.5-py2.7.egg/cloudaux/aws/decorators.py", line 40, in decorated_function
    raise e
ClientError: An error occurred (AccessDenied) when calling the ListUsers operation: User: arn:aws:sts::**********:assumed-role/Aardvark/aardvark is not authorized to perform: iam:ListUsers on resource: arn:aws:iam::**********:user/
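
For reference, the four rights from the Readme plus the three reported here could be collected into one policy document along these lines. This is only a sketch: the "Resource": "*" scope and the build_policy helper are assumptions made for illustration, not something prescribed by the project, and the action names are written in canonical casing (IAM treats action names case-insensitively).

```python
import json

# Action names as listed in the Readme and in this issue.
ACTIONS = [
    "iam:GenerateServiceLastAccessedDetails",
    "iam:GetServiceLastAccessedDetails",
    "iam:ListRolePolicies",
    "iam:ListRoles",
    "iam:ListUsers",
    "iam:ListPolicies",
    "iam:ListGroups",
]


def build_policy(actions):
    """Hypothetical helper: wrap the actions in a single Allow statement."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                # Assumption: no narrower resource scope is applied.
                "Resource": "*",
            }
        ],
    }


policy_json = json.dumps(build_policy(ACTIONS), indent=2)
print(policy_json)
```

The printed JSON can be pasted into the policy attached to the Aardvark role in each monitored account.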
