aws-cost-analysis's People

Contributors

concurrencylabs, nr18

aws-cost-analysis's Issues

utils.py:get_period_prefix() creates wrong prefix for December

Executing:
['python', 'report_utils.py', '--action=prepare-quicksight', '--source-bucket=foo', '--source-prefix=reports/cost-reports/', '--dest-bucket=bar', '--dest-prefix=quicksight/', '--year=2017', '--month=12']

Errors out:
Exception message:[Could not find manifest file in bucket:[foo], prefix:[reports/cost-reports/20171201-20171301/]]

Just increasing int(month) doesn't seem sufficient ;)
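
A minimal sketch of a year rollover, assuming the prefix format YYYYMMDD-YYYYMMDD/ seen in the error message above. The signature here is hypothetical and may not match the real get_period_prefix() in utils.py:

```python
def get_period_prefix(year, month):
    """Return a prefix such as '20171201-20180101/' for a billing month.

    Hypothetical signature; the real function in utils.py may differ.
    """
    year, month = int(year), int(month)
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    # A period starting in December ends on January 1st of the next year.
    if month == 12:
        end_year, end_month = year + 1, 1
    else:
        end_year, end_month = year, month + 1
    return "{:04d}{:02d}01-{:04d}{:02d}01/".format(
        year, month, end_year, end_month
    )
```

With this, --year=2017 --month=12 would map to 20171201-20180101/ instead of 20171201-20171301/.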

Athena results are much different from Cost Explorer?

We have a very large CSV that's produced for the billing report from AWS, so it doesn't load in Excel for parsing there.

I did get the report_utils script to execute successfully with no errors and data does appear in Athena tables, but when I run:

... FROM hourly_20180901_20181001

it only returns a number that is ~20% of the expected amount shown by Cost Explorer.

Have you seen anything like this?
Would you have any suggestions as to where to look?

Problem with directory path

My AWS reports are written to s3://<bucket>/<myprefix>/aws-cost-and-usage/, but the function only works if I don't have my own prefix. To make it work for me, I had to make the following change in utils.py:

-    period = dirs[len(dirs)-3]
+    period = dirs[len(dirs)-2]

-    for d in dirs[0:len(dirs)-3]:
+    for d in dirs[0:len(dirs)-2]:

Maybe there is a better fix that solves both cases.
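
One possible way to cover both cases is to find the period directory by its shape rather than by a fixed index. This is only a sketch: split_report_key and the example keys are hypothetical, and it assumes exactly one path segment looks like YYYYMMDD-YYYYMMDD:

```python
import re

# A billing period directory looks like 20180901-20181001.
PERIOD_PATTERN = re.compile(r"^\d{8}-\d{8}$")


def split_report_key(key):
    """Split an S3 key into (prefix, period), whatever the prefix depth."""
    dirs = key.split("/")
    for index, segment in enumerate(dirs):
        if PERIOD_PATTERN.match(segment):
            # Everything before the period segment is the report prefix.
            return "/".join(dirs[:index]), segment
    raise ValueError("no billing period segment found in key: {}".format(key))
```

Because the lookup does not depend on how many directories precede or follow the period, it should behave the same with or without a custom prefix.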

SyntaxError: invalid syntax

Hello,
I'm getting the following error with python 3.6.5. It doesn't matter if I'm running it with or w/o parameters.

Can anyone help me with this?

Traceback (most recent call last):
  File "report_utils.py", line 7, in <module>
    import awscostusageprocessor.processor as cur
  File "/home/marcel/Desktop/AWS/aws-cost-analysis/awscostusageprocessor/processor.py", line 109
    print "Number of records: [{}]".format(record_count)
    ^
SyntaxError: invalid syntax

Cheers
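
The traceback points at a Python 2 print statement, which Python 3 rejects as a syntax error. A minimal sketch of the call form that runs on both versions (format_record_count is a hypothetical helper, used only to keep the example self-contained):

```python
def format_record_count(record_count):
    """Build the same message that processor.py prints."""
    return "Number of records: [{}]".format(record_count)


# print(...) with a single parenthesized argument works on Python 2 and 3.
print(format_record_count(42))
```

Other modules in the project may contain similar statements, so running the scripts under Python 2.7 is probably the quicker workaround.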

Missing cloudformation/process-cur-sam.yml

The README mentions a cloudformation/process-cur-sam.yml which sounds very helpful ... but it's missing in the repository. Could you add it?

Thanks for sharing these tools!

SAM and DynamoDB

In the xacct-step-function-starter.py file you scan with the following filter attributes:

  • lastProcessedTimestamp = 'Stamp'
  • dataCollectionStatus = Active

You only write the lastProcessedTimestamp value in update-metadata.py and you read it in athena.py.

As far as I can see, xacct-step-function-starter.py, and thus the Lambda function xAcctStepFunctionStarter that is scheduled to run every 5 minutes, never has any effect because the dataCollectionStatus attribute is missing.

What is the best approach to run the detection of new reports via S3 events or scheduled events?

Serverless Step Function fails

Hi,

I'm trying to set up this project using the CloudFormation template (a PR is coming once I have it working), and the Step Function raises the following error:

'VarCharValue': KeyError
Traceback (most recent call last):
File "/var/task/functions/init-athena-queries.py", line 37, in handler
result_dict['getTotalCost']['resultset'] = apiprocessor.getTotalCost()
File "/var/task/awscostusageprocessor/api.py", line 50, in getTotalCost
return self.getResultSet(consts.ACTION_GET_TOTAL_COST)
File "/var/task/awscostusageprocessor/api.py", line 42, in getResultSet
response['results'] = self.athena.get_query_execution_results(queryexecutionid)
File "/var/task/awscostusageprocessor/sql/athena.py", line 148, in get_query_execution_results
row_dict[rowheaders[columnindex]['VarCharValue']] = columnvalue['VarCharValue']
KeyError: 'VarCharValue'

So I added a debug statement to print the queryresults in the get_query_execution_results function:
log.info("Error: {}".format(json.dumps(queryresults)))

That results in:

{
    "ResultSet": {
        "Rows": [
            {
                "Data": [
                    {
                        "VarCharValue": "sum_unblendedcost"
                    }
                ]
            },
            {
                "Data": [
                    {}
                ]
            }
        ],
        "ResultSetMetadata": {
            "ColumnInfo": [
                {
                    "Scale": 0,
                    "Name": "sum_unblendedcost",
                    "Nullable": "UNKNOWN",
                    "TableName": "",
                    "Precision": 17,
                    "Label": "sum_unblendedcost",
                    "CaseSensitive": false,
                    "SchemaName": "",
                    "Type": "double",
                    "CatalogName": "hive"
                }
            ]
        }
    },
    "ResponseMetadata": {
        "RetryAttempts": 0,
        "HTTPStatusCode": 200,
        "RequestId": "9bdc5156-****-****-****-************",
        "HTTPHeaders": {
            "date": "Fri, 19 Jan 2018 09:24:54 GMT",
            "x-amzn-requestid": "9bdc5156-****-****-****-************",
            "content-length": "619",
            "content-type": "application/x-amz-json-1.1",
            "connection": "keep-alive"
        }
    }
}

Because of the empty Data entry ({}), the script fails. Can I ignore this, or is it caused by a misconfiguration? Thanks!
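
An empty {} cell in an Athena result row carries no VarCharValue key, which is how a NULL value appears in the response. Here that most likely means the SUM had no rows to add up. A minimal sketch of a tolerant parser, assuming the response shape printed above (rows_to_dicts is a hypothetical helper, not the project's get_query_execution_results):

```python
def rows_to_dicts(queryresults):
    """Convert an Athena result set into a list of {header: value} dicts.

    The first row holds the column headers; missing values become None.
    """
    rows = queryresults["ResultSet"]["Rows"]
    if not rows:
        return []
    headers = [cell.get("VarCharValue") for cell in rows[0]["Data"]]
    records = []
    for row in rows[1:]:
        # .get() returns None for an empty cell instead of raising KeyError.
        values = [cell.get("VarCharValue") for cell in row["Data"]]
        records.append(dict(zip(headers, values)))
    return records
```

Callers would then need to handle None, for example by treating a missing total as 0.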

running athena report but getting error

code:
python report_utils.py --action=prepare-athena --source-bucket=kpmgcloud-cost-report --source-prefix=costreport/**/ --dest-bucket=kpmgcloud-cost-report --dest-prefix=athena2/ --year=2018 --month=06

Reply:
Traceback (most recent call last):
File "report_utils.py", line 7, in <module>
import awscostusageprocessor.processor as cur
File "/Users/sgooch/Documents/Stash/costsaving/analysis/athena/aws-cost-analysis/awscostusageprocessor/processor.py", line 109
print "Number of records: [{}]".format(record_count)

I have run this once already so not sure if this would upset it?

Limit permissions asked for miserbot

I don't know if people realize that the permissions give read-only access to a LOT of stuff, including secrets...
I would appreciate it if you could limit those permissions to only what is necessary for the bot to operate.

[Question] Automated Deployment

Hello,
I was trying to use the automated deployment for aws-cost-analysis. Everything went well and all functions were created in Lambda.

  • However, it's not clear to me when the Athena database is created. There is still no database.
  • curprocessor-sam-xAcctStepFunctionStarter is running periodically.
  • The billing bucket name is the source of the reports, correct?

how to handle aws schema changes

The problem with using Athena or QuickSight is that it does not take into account the schema changes that AWS regularly makes to the billing files. Because of this, the reports start breaking.
Is there a way to handle this with Glue or any other method for this tool? It would be really helpful.

Athena date_parse error - malformed error

hi,

Can someone please help resolve the following error?

An error occurred while loading the data.
[Simba]AthenaJDBC An error has been thrown from the AWS Athena client. INVALID_FUNCTION_ARGUMENT: Invalid format: "04-apr-2022" is malformed at "apr-2022"

I am trying to convert the string to date format

sample string -
16-apr-2022
10-may-2022
06-apr-2022

Code (used in the SELECT clause): date_parse("date_string"."date",'%d-%m-%y') as New_date

Thank you
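
The format '%d-%m-%y' asks for a numeric month and a two-digit year, while the sample strings carry an abbreviated month name and a four-digit year. In the MySQL-style specifiers that date_parse uses, those would be %b and %Y, so '%d-%b-%Y' is the pattern to try. Whether Athena accepts the lowercase month names is an assumption worth checking. The same mismatch can be reproduced in Python, whose strptime uses the same three specifiers (parse_day_month_year is a hypothetical helper; an English locale is assumed):

```python
from datetime import datetime


def parse_day_month_year(text):
    """Parse strings such as '04-apr-2022' into a date.

    %d = day of month, %b = abbreviated month name, %Y = four-digit year.
    """
    return datetime.strptime(text, "%d-%b-%Y").date()


for sample in ("16-apr-2022", "10-may-2022", "06-apr-2022"):
    print(sample, "->", parse_day_month_year(sample).isoformat())
```

Using '%d-%m-%y' on the same input raises a ValueError in Python, which mirrors the "malformed" error reported by Athena.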
