
cloudtrail-partitioner

This project sets up partitioned Athena tables for your CloudTrail logs and updates the partitions nightly. As new AWS accounts begin sending you logs or new AWS regions come online, your partitions stay up to date. It is based on work by Alex Smolen in his post "Partitioning CloudTrail Logs in Athena".

You can deploy the CDK app immediately, but I recommend running the partitioner manually first. Doing so confirms that everything is configured correctly, and a manual run will (by default) create 90 days of partitions, whereas the nightly job deployed by the CDK app will not run until 0600 UTC and will only create partitions for the current day and tomorrow.
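To make the difference between the two runs concrete, here is a minimal Python sketch of how the partition dates and the matching Athena DDL could be generated. It is not taken from main.py: the function names `partition_days` and `add_partition_sql` are hypothetical, the bucket name is a placeholder, the exact edges of the 90-day window are an assumption, and the S3 prefix assumes the standard CloudTrail layout with no custom key prefix. The partition keys (region, year, month, day) match the ones used in the example queries in this README.

```python
from datetime import date, timedelta


def partition_days(today, days_back=0, days_ahead=1):
    """Return the dates from `days_back` days before `today`
    through `days_ahead` days after it, oldest first."""
    return [today + timedelta(days=offset)
            for offset in range(-days_back, days_ahead + 1)]


def add_partition_sql(table, bucket, account_id, region, day):
    """Build an ALTER TABLE statement that registers one partition.

    Assumes the default CloudTrail S3 layout:
    AWSLogs/<account>/CloudTrail/<region>/<yyyy>/<mm>/<dd>/
    """
    location = (
        f"s3://{bucket}/AWSLogs/{account_id}/CloudTrail/"
        f"{region}/{day:%Y/%m/%d}/"
    )
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (region = '{region}', year = '{day:%Y}', "
        f"month = '{day:%m}', day = '{day:%d}') "
        f"LOCATION '{location}'"
    )
```

Under these assumptions, `partition_days(date.today())` yields the two dates a nightly run would cover, while `partition_days(date.today(), days_back=89, days_ahead=0)` yields a 90-day backfill window like the one a manual run creates.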

A table is created for each account, named cloudtrail_ followed by the account ID (for example, cloudtrail_000000000000), and a view is created that unions all of these tables.
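As an illustration of what such a view could look like, the sketch below builds a CREATE VIEW statement from a list of account IDs. The helper `union_view_sql` is hypothetical and not part of this project. The view name `cloudtrail` is inferred from the last example query in this README, and UNION ALL is an assumption about how the tables are combined.

```python
import re


def union_view_sql(account_ids, view_name="cloudtrail"):
    """Build a view that unions the per-account cloudtrail_<id> tables.

    Assumes every per-account table has the same columns, so that
    SELECT * can be combined with UNION ALL.
    """
    if not account_ids:
        raise ValueError("at least one account ID is required")
    for account_id in account_ids:
        # AWS account IDs are 12-digit strings; reject anything else
        # rather than interpolating it into SQL.
        if not re.fullmatch(r"\d{12}", account_id):
            raise ValueError(f"not a 12-digit account ID: {account_id!r}")
    selects = [f"SELECT * FROM cloudtrail_{account_id}"
               for account_id in account_ids]
    return (f"CREATE OR REPLACE VIEW {view_name} AS\n"
            + "\nUNION ALL\n".join(selects))
```

Calling `union_view_sql(["000000000000", "111111111111"])` returns a statement that selects from both per-account tables, so a query against the view searches every account at once.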

Setup

Edit config/config.yaml to specify the S3 bucket containing your CloudTrail logs, the SNS topic to send alarms to (you must create one if you don't already have one), and any other configuration info.

Set up the initial tables and partitions for the past 90 days (it is fine if you have fewer logs than that) by running:

cd resources/partitioner
pip3 install pyyaml boto3 -t .
python3 main.py

Then deploy the nightly Lambda from the root directory:

npm i
cdk deploy

If you haven't used the CDK before, you may need to run cdk bootstrap aws://000000000000/us-east-1 (substituting your own account ID and region) before running cdk deploy.

Using Athena

To query your tables, open the Athena service in the AWS Console, in the region where this was deployed. Here is an example query that returns all fields for a few events:

SELECT *
FROM cloudtrail_000000000000
WHERE region = 'us-east-1' AND year = '2019' AND month = '09' AND day = '30'
LIMIT 5;

That query limits the data searched to a specific region and day (using the partitions) and a specific account.
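If you write these queries from a script, the partition predicate can be generated from a date so that the zero-padded string values always match the partition keys. This is a minimal sketch; `partition_filter` is a hypothetical helper, not something this project provides.

```python
from datetime import date


def partition_filter(day, region=None):
    """Return a WHERE-clause fragment restricting a query to one day
    (and optionally one region) using the partition columns.

    The partition values are strings, so month and day are zero-padded.
    `region` is interpolated as-is; pass only trusted region names.
    """
    clauses = []
    if region is not None:
        clauses.append(f"region = '{region}'")
    clauses += [
        f"year = '{day:%Y}'",
        f"month = '{day:%m}'",
        f"day = '{day:%d}'",
    ]
    return " AND ".join(clauses)
```

For example, `partition_filter(date(2019, 9, 30), "us-east-1")` produces the same predicate as the WHERE clause in the query above.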

This next query shows the most common errors by user (technically by ARN for the session).

SELECT 
  useridentity.arn, 
  errorcode, 
  count(*) AS count 
FROM cloudtrail_000000000000
WHERE year = '2019' AND month = '09' AND day = '30' 
  AND errorcode != '' 
GROUP BY errorcode, useridentity.arn 
ORDER BY count DESC 
LIMIT 50;

This next query shows the API calls made by a specific user.

SELECT 
  eventname, 
  count(*) AS count
FROM cloudtrail_000000000000
WHERE year = '2019' AND month = '09' AND day = '30'
  AND useridentity.arn LIKE '%alice%'
GROUP BY eventname
ORDER BY count DESC;

This next query shows which accounts have been accessed from a specific IP address. It queries the cloudtrail view rather than a single account's table, so it searches every account.

SELECT 
  recipientaccountid, 
  count(*) AS count
FROM cloudtrail
WHERE year = '2019' AND month = '09'
  AND sourceipaddress = '1.2.3.4'
GROUP BY recipientaccountid 
ORDER BY count DESC;
