
opencost's Introduction


OpenCost — your favorite open source cost monitoring tool for Kubernetes and cloud spend

OpenCost gives teams visibility into current and historical Kubernetes and cloud spend and resource allocation. It provides cost transparency in Kubernetes environments that run multiple applications, teams, and departments, and it also provides visibility into cloud costs across multiple providers.

OpenCost was originally developed and open sourced by Kubecost. The project comprises a specification and a Golang implementation of that specification.

OpenCost UI Walkthrough

To see the full functionality of OpenCost, you can view the OpenCost features documentation. Here is a summary of the features it enables:

  • Real-time cost allocation by Kubernetes cluster, node, namespace, controller kind, controller, service, or pod
  • Multi-cloud cost monitoring for all cloud services on AWS, Azure, GCP
  • Dynamic on-demand k8s asset pricing enabled by integrations with AWS, Azure, and GCP billing APIs
  • Supports on-prem k8s clusters with custom CSV pricing
  • Allocation for in-cluster K8s resources like CPU, GPU, memory, and persistent volumes
  • Easily export pricing data to Prometheus via the /metrics endpoint (see the example scrape config after this list)
  • Free and open source distribution (Apache2 license)
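
As a concrete example, a minimal Prometheus scrape configuration for the /metrics endpoint might look like the sketch below; the service name, namespace, and port 9003 are assumptions based on a typical install and may differ in your deployment.

    scrape_configs:
      - job_name: opencost
        honor_labels: true
        scrape_interval: 1m
        static_configs:
          - targets:
              # assumed in-cluster address of the OpenCost service; adjust to your install
              - opencost.opencost.svc.cluster.local:9003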

Getting Started

You can deploy OpenCost on any Kubernetes 1.20+ cluster in a matter of minutes, if not seconds!

Visit the full documentation for recommended installation options.
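
As one hedged example, a common installation path at the time of writing is the community Helm chart; the repository URL, chart name, and namespace below are assumptions, so check the documentation for the currently recommended options.

    # assumes a Prometheus server is available (or will be installed) for OpenCost to scrape
    helm install opencost --repo https://opencost.github.io/opencost-helm-chart opencost \
      --namespace opencost --create-namespace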

Usage

Contributing

We ❤️ pull requests! See CONTRIBUTING.md for information on building the project from source and contributing changes.

Community

If you need support or have questions about contributing to the project, you can reach us on CNCF Slack in the #opencost channel, or attend the biweekly OpenCost Working Group community meeting (see the Community Calendar) to discuss OpenCost development.

FAQ

You can view OpenCost documentation for a list of commonly asked questions.

opencost's People

Contributors

ajaytripathy, alejandro-kubecost, ameijer, avrodrigues5, babbageclunk, biancaburtoiu, calvinwang, cliffcolvin, dependabot[bot], dramich, dwbrown2, jessegoodier, jjarrett21, kaelanspatel, kirbsauce, logyball, mattray, mbolt35, michaelmdresser, nealormsbee, nickcurie, nik-kc, nikovacevic, pokom, r2k1, saweber, sean-holcomb, teevans, thomasvn, wonko


opencost's Issues

panic: interface conversion: interface {} is nil, not map[string]interface {}

Hi, I'm testing out cost-model in a cluster on Azure and it keeps crashing with:

2019/04/09 12:54:06 Recording prices...
panic: interface conversion: interface {} is nil, not map[string]interface {}

goroutine 27 [running]:
github.com/kubecost/cost-model/costmodel.getNormalization(0x0, 0x0, 0xc000413630)
        /app/costmodel/costmodel.go:657 +0x21c
github.com/kubecost/cost-model/costmodel.ComputeCostData(0x1df1560, 0xc000426000, 0xc000448000, 0x1e35560, 0xc0002cb100, 0x1b154a7, 0x2, 0xc000058f30, 0x3, 0x0)
        /app/costmodel/costmodel.go:83 +0x57d
main.(*Accesses).recordPrices.func1(0xc000406ec0)
        /app/main.go:95 +0x3d4
created by main.(*Accesses).recordPrices
        /app/main.go:92 +0x3f

How is the cost of the pod calculated?

I read the source code of the cost-model and only found the calculation logic for node cost. I would expect users to prefer seeing the cost of a single pod or a single container.

router.GET("/costDataModel", a.CostDataModel) router.GET("/costDataModelRange", a.CostDataModelRange)
The above code only returns the pricing of the node and the resource usage of the container.

Is the calculation logic for pod and container costs implemented externally?
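
For context, here is a minimal sketch of how the two pieces of data those endpoints expose (node unit prices and container resource usage) could be combined into a per-container figure; this is purely an illustration, not necessarily the cost model's actual logic:

    // Hypothetical sketch only: price a container by multiplying its measured
    // usage (or requests, if higher) by the node's per-unit hourly prices.
    func containerHourlyCost(cpuCores, ramGiB, nodeCPUPerCoreHour, nodeRAMPerGiBHour float64) float64 {
        return cpuCores*nodeCPUPerCoreHour + ramGiB*nodeRAMPerGiBHour
    }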

Set default spot label for AWS

It would be nice to not have to configure this for every cluster. I believe the node label for EKS is lifecycle=EC2Spot but worth verifying. If so, this seems like a sensible default to me.
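
As a quick way to check which label a given EKS cluster actually puts on spot nodes (the label names mentioned here are only the commonly seen ones, not guaranteed):

    kubectl get nodes -L lifecycle
    kubectl get nodes --show-labels | grep -i spot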

Openshift Support

Raising the issue again in this forum, since the URL below is deprecated:

AjayTripathy/kubecost-quickstart#3

Issue:
I was able to deploy the cost-model on OpenShift, but the pod does not come up, failing with the error below. I tried both the service and the route approach to connect to the existing Prometheus server.

F0503 09:46:27.109402 1 main.go:173] Failed to use Prometheus at http://prometheus-k8s.openshift-monitoring.svc Error: Get http://prometheus-k8s.openshift-monitoring.svc/api/v1/status/config: net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x15\x03\x01\x00\x02\x02"

The above is the service URL. I also tried the route URL (a re-encryption route), which is HTTPS, and I get the following error.

https://prometheus-k8s-openshift-monitoring.infra.saasitc-gce-shared.ragsr01.cs.saas.ca.com/

Error: Certificate signed by unknown authority

I was at least expecting the connection to work with the service URL (not the route URL), but even that is not working.

Steps to Reproduce:

  1. docker build --rm -f "Dockerfile" -t <registry>/kubecost-cost-model:<tag> .
  2. Edit the image reference in deployment.yaml to <registry>/kubecost-cost-model:<tag>
  3. Set the PROMETHEUS_SERVER_ENDPOINT environment variable to the address of your Prometheus server
  4. kubectl create namespace cost-model
  5. kubectl apply -f kubernetes/ --namespace cost-model
  6. kubectl port-forward --namespace cost-model service/cost-model 9003

Openshift Version:
v3.11

CSV export

Hey buddies!

We are currently testing KubeCost for our applications.

It would be essential for us to be able to export the individual costs of the namespaces as CSV.

Is such a function planned?

Many greetings

Support manually provisioned PVs in the model

Today, we're looking up requests and computing an allocation based on PVCs, not PVs. We do not have a configured daemon in our stack that exports information like "size" about PVs themselves-- we rely on the PVC data.

However, PVCs do not have a storageclass assigned to them if they use manually provisioned disks. This means for manually provisioned disks, we will need to join the prometheus query here for PVCs today with the data on the actual PV to get storage class in all cases.

Add network data costs

The model should measure the cost of internet egress and other paid network transfers.

Proposal is to start by exposing Prometheus counters that classify bytes transferred as ingress/egress and then intra-zone/cross-zone/cross-region/internet.

Here is a strawman proposal for the initial metric (counter) format:

    kubecost_pod_network_egress_bytes_total{pod_name="cost-analyzer",namespace="kubecost",same_region="true",same_zone="false"} 6234
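
Assuming that strawman counter existed, pricing cross-zone egress per namespace could be a simple PromQL expression like the sketch below; the $0.01/GB rate is purely illustrative.

    # hypothetical query over the proposed counter
    sum by (namespace) (
      increase(kubecost_pod_network_egress_bytes_total{same_zone="false"}[1h])
    ) / 1e9 * 0.01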

Exporting cost model data

I'd like to write cost data back to Prometheus so I can build new dashboards with this information. Can you expose these metrics so they can be scraped?
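
As a hedged example of inspecting the exposed metrics by hand, assuming the service is port-forwarded as in the deployment steps shown elsewhere on this page (the metric names are examples and may differ by version):

    kubectl port-forward --namespace cost-model service/cost-model 9003
    curl -s http://localhost:9003/metrics | grep -E 'node_(cpu|ram|gpu)_hourly_cost'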

aws pricing URL honors ETag

Since that file is currently over 600MB, it can be a very real win to use the If-None-Match: header (or its If-Modified-Since: friend) to check whether fetching those bytes could possibly yield any new information.

Currently, since that URL is being served off of S3, it comes with all the expected headers, including ETag: containing the md5 of its contents:

	< Content-Length: 685820220
	< Last-Modified: Fri, 29 Mar 2019 04:08:55 GMT
	< ETag: "678d4f456aad44cb7a03708d7fd4511c"

If one sends that header back up, S3 is bright enough to indicate that downloading it will do you no good:

	> If-None-Match: "678d4f456aad44cb7a03708d7fd4511c"
	< HTTP/1.1 304 Not Modified
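
A minimal Go sketch of the conditional download described above, assuming the previously seen ETag is cached somewhere (uses only "io" and "net/http"):

    // Sketch: fetch url only if its ETag differs from the cached one. Returns the
    // (possibly unchanged) ETag and a body, which is nil when the server says 304.
    func fetchIfChanged(url, cachedETag string) (string, io.ReadCloser, error) {
        req, err := http.NewRequest("GET", url, nil)
        if err != nil {
            return cachedETag, nil, err
        }
        if cachedETag != "" {
            req.Header.Set("If-None-Match", cachedETag)
        }
        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            return cachedETag, nil, err
        }
        if resp.StatusCode == http.StatusNotModified {
            resp.Body.Close()
            return cachedETag, nil, nil // nothing new; keep using the cached copy
        }
        return resp.Header.Get("ETag"), resp.Body, nil
    }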

prometheus-node-exporter daemon set not running

Hi,

I just installed it using the steps at
https://kubecost.com/install?ref=home

but I see that the node-exporter DaemonSet does not have any running pods, and my Grafana also shows a message stating:

If you're seeing this Grafana has failed to load its application files

  1. This could be caused by your reverse proxy settings.
  2. If you host grafana under subpath make sure your grafana.ini root_path setting includes subpath
  3. If you have a local dev build make sure you build frontend using: npm run dev, npm run watch, or npm run build
  4. Sometimes restarting grafana-server can help

I can see the cost-analyzer UI, but without node metrics it is not fetching information related to CPU and memory. Could you provide pointers on where to start looking?

Thanks,
Sreekanth
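
A few commands that may help narrow this down, assuming the chart was installed into the kubecost namespace (adjust the namespace and DaemonSet name to your install):

    kubectl get daemonset --namespace kubecost
    kubectl describe daemonset kubecost-prometheus-node-exporter --namespace kubecost
    kubectl get events --namespace kubecost --sort-by=.metadata.creationTimestamp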

Cost model can fail if prometheus-server isn't available yet

When you deploy both pods together, the cost-model pod becomes available first and then errors after the timeout below.

main.go:294] Failed to use Prometheus at http://kubecost-prometheus-server.kubecost2.svc.cluster.local Error: Get http://kubecost-prometheus-server.kubecost2.svc.cluster.local/api/v1/status/config: dial tcp 10.15.250.213:80: i/o timeout
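
One hedged way to avoid this race is to retry the Prometheus check with a backoff at startup instead of failing on the first timeout; a rough Go sketch (timings are illustrative; uses "fmt", "net/http", and "time"):

    // Sketch: poll Prometheus until it answers, rather than exiting on the first failure.
    func waitForPrometheus(base string, attempts int) error {
        client := &http.Client{Timeout: 10 * time.Second}
        var lastErr error
        for i := 0; i < attempts; i++ {
            resp, err := client.Get(base + "/api/v1/status/config")
            if err == nil {
                resp.Body.Close()
                if resp.StatusCode == http.StatusOK {
                    return nil
                }
                err = fmt.Errorf("unexpected status %d", resp.StatusCode)
            }
            lastErr = err
            time.Sleep(time.Duration(i+1) * 5 * time.Second) // simple linear backoff
        }
        return fmt.Errorf("prometheus not reachable at %s: %v", base, lastErr)
    }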

what is reporting model

I watched the video and looked at the features as well, but it does not mention any reporting model.

I am not sure whether Business or Enterprise supports a reporting model (Excel).

It looks like there is only the dashboard view and Slack notifications.

Prometheus end point exact pattern?

I am not able to set the exact endpoint in the deployment file, and because of this the pod is not starting. Can you help us with this?

What exactly do we have to specify as the value of the endpoint?
cpu: "10m"
memory: "55M"
env:
- name: PROMETHEUS_SERVER_ENDPOINT
value: http://Prometheus-server.namespace.svc.cluster.local #The endpoint should have the form http://..svc.cluster.local
imagePullPolicy: Always

ERROR: interface is nil, prometheus is not running!
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x163f10f]

Support Azure Billing Data

Kubecost currently has integrations with AWS & GCP for billing data. We've had many users request an integration with Azure. We should support this.

Helm chart - install without grafana and prometheus

Hi.

On our system we already have prometheus-operator installed, and I would like to skip installing Prometheus and Grafana. Are there any guides for installing without them? Do I need to install anything else related to Prometheus? It would be great to have a guide about this.

Thx a lot and great job!

Add EKS service costs to cost model

AWS has an hourly cost for the EKS service itself - Azure and GCP do not.

If provider=AWS and platform=EKS, then we should add this to the overall cluster costs. It seems the platform can be determined from the K8s version string (for example: v1.12.6-eks-...).
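
A hedged sketch of that version-string check (the cost model's actual detection may differ; uses "strings"):

    // Sketch: treat the cluster as EKS when the server version carries the "-eks-"
    // marker, e.g. "v1.12.6-eks-..." as mentioned above.
    func isEKS(serverGitVersion string) bool {
        return strings.Contains(serverGitVersion, "-eks-")
    }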

Feature: replace single value label settings by list

Hi,

As labels are not always standardized when using third-party manifests, allowing only a single value for each label in the settings can be a limitation.

I suggest two things:

  • convert the single value to a list of values, for instance app and app.kubernetes.io/name for application labels

  • to go further, allow the set of labels to be fully dynamic; that is, if I don't want a Team label I can delete that setting, and if I want a Location label I can add it

Where does the BaseCPUCost value come from?

I'd love to use this project on our cluster, but in order to get approval I have to be able to explain each and every number shown here.

What I haven't been able to find is how the BaseCPUCost in aws.json is calculated in the first place. Where did you get the value 0.031611 from?

Gpu metrics don't go away after node is deleted

It looks like the model keeps emitting 0 for node_gpu_hourly_cost on the /metrics endpoint even after the node has been deleted. This appears low priority because it doesn't have an impact on actual metrics; it only creates empty metrics where we shouldn't.
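
For reference, a hedged sketch of the general fix with prometheus/client_golang: explicitly drop a node's series once the node is gone instead of continuing to emit 0 (names here are illustrative, not the cost model's actual variables):

    // Sketch: delete the GPU cost series for a removed node so the gauge stops
    // being exported rather than reporting 0 forever.
    var nodeGPUHourlyCost = prometheus.NewGaugeVec(
        prometheus.GaugeOpts{
            Name: "node_gpu_hourly_cost",
            Help: "Hourly cost of GPUs attached to the node.",
        },
        []string{"node"},
    )

    func onNodeDeleted(nodeName string) {
        nodeGPUHourlyCost.DeleteLabelValues(nodeName)
    }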

Skipping Node "" due to missing Node Data costs

I0430 21:18:16.906798 1 main.go:169] Skipping Node "" due to missing Node Data costs

This appears on every scrape, but because almost every log message is logged at "info" it's hard to know whether that's an expected outcome or not.

Sometimes it appears twice:

I0430 21:20:17.352432 1 main.go:169] Skipping Node "" due to missing Node Data costs
I0430 21:20:17.352584 1 main.go:169] Skipping Node "" due to missing Node Data costs

Kubernetes Version: v1.13.4
Cloud Provider: AWS
KubeCost Version:

NAME                 	REVISION	UPDATED                 	STATUS  	CHART                       	APP VERSION	NAMESPACE
kubecost             	1       	Tue Apr 30 14:10:18 2019	DEPLOYED	cost-analyzer-1.18.2        	1.0        	kubecost

Optionally allocate idle costs

There are a number of reasons why a user may want to allocate idle resources back to individual pods, namespaces, etc. I see two potential approaches for accomplishing this:

  1. allocating costs based on limits -- this would require teams/owners to set resource limits.

  2. allocating idle cluster costs -- this could be done uniformly or proportionately based on pod consumption.

I'm starting this issue to discuss these two approaches and/or others. Whatever approach we take, I believe we should give users the ability to enable/disable this feature.
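
To make the proportional option concrete with purely illustrative numbers: a 10-core cluster billed at $10/hr is $1 per core-hour; if pods consume 6 cores in total, the remaining $4/hr is idle, and a pod using 1.5 of those 6 cores would be charged 1.5/6 × $4 = $1/hr of idle cost on top of its $1.50/hr of direct usage.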

Better Error Messages and Logging

The app in general could use clearer error logging. Of specific note are the panics when converting Prometheus responses to JSON; these should instead return clear errors.
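
As a hedged sketch of the pattern that would replace those panics, using Go's two-value type assertion so a malformed response surfaces as an error (raw stands in for the decoded Prometheus response; uses "fmt"):

    // Sketch: return a descriptive error instead of panicking when the decoded
    // response is not the shape we expect.
    func dataField(raw map[string]interface{}) (map[string]interface{}, error) {
        data, ok := raw["data"].(map[string]interface{})
        if !ok {
            return nil, fmt.Errorf("unexpected prometheus response, missing or malformed 'data': %v", raw)
        }
        return data, nil
    }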

Add storage costs to data model

The cost model currently has PV size, storageclass, etc. I propose we also add the cost per GB-hour of storage.

On GCP this cost is a function of region, storageClass, and isRegional (i.e. replication across two AZs in a region). The GCP billing API has top level entries for ssd and pdstandard storageClasses.

My opinion is that supporting isRegional=true is P2 for this initial implementation. To support this type properly, we will need to look at the underlying storageClass and its replication-type. More info here: https://cloud.google.com/solutions/using-kubernetes-engine-to-deploy-apps-with-regional-persistent-disks.

I propose we also have fallback prices (e.g. for SSD and PDStandard) in default.json in case these prices aren't found in the billing data.
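
For concreteness, with illustrative numbers only: a storage class billed at roughly $0.04 per GB-month works out to about 0.04 / 730 ≈ 0.0000548 per GB-hour, which is the kind of value such a default.json fallback entry would carry.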

Unable to run the cost model

I'm trying to run the cost-model and getting this ...

$ kubectl logs -f deployment.extensions/kubecost-cost-model -c cost-model
I0625 18:57:57.223532       1 main.go:350] Starting cost-model (git commit "88aef90e75772d4a00c76d103ba57cc106b757ea")
I0625 18:58:07.732047       1 main.go:367] Success: retrieved a prometheus config file from: http://prometheus-server.kube-system.svc.cluster.local
I0625 18:58:07.924682       1 main.go:373] Success: retrieved the 'up' query against prometheus at: http://prometheus-server.kube-system.svc.cluster.local
I0625 18:58:08.924842       1 provider.go:374] Found ProviderID starting with "aws", using AWS Provider
I0625 18:58:09.531536       1 awsprovider.go:459] Unable to find params for storageClassName pvc-5cacd650-f97e-11e6-bda3-06430f0df8ba
I0625 18:58:09.531572       1 awsprovider.go:459] Unable to find params for storageClassName pvc-bdf6e91a-fcdc-11e6-bda3-06430f0df8ba
I0625 18:58:09.531592       1 awsprovider.go:459] Unable to find params for storageClassName pvc-ffd7a85d-f820-11e6-bda3-06430f0df8ba
I0625 18:58:09.531602       1 awsprovider.go:471] starting download of "https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/index.json", which is quite large ...
I0625 18:58:14.524027       1 awsprovider.go:477] Finished downloading "https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/index.json"

I0625 19:18:03.922907       1 awsprovider.go:483] done loading "https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/index.json"
I0625 19:18:04.250765       1 awsprovider.go:615] Error downloading spot data AccessDenied: Access Denied
	status code: 403, request id: 35EFF9C569BA302A, host id: zxgrYWDEqznapmZCIp0V+X2DPmxQxzh9zfTjH6AW46sBPeCw9choRQx7v3PnfDzQYv7LF7wY2TI=
I0625 19:18:04.251116       1 main.go:266] Recording prices...
I0625 19:18:07.032776       1 costmodel.go:751] No RAM cost found for us-east-1,m4.xlarge,linux, calculating...
I0625 19:18:07.032834       1 costmodel.go:751] No RAM cost found for us-east-1,r4.4xlarge,linux, calculating...
I0625 19:18:07.032858       1 costmodel.go:751] No RAM cost found for us-east-1,r4.xlarge,linux, calculating...
I0625 19:18:07.032880       1 costmodel.go:751] No RAM cost found for us-east-1,r4.2xlarge,linux, calculating...
I0625 19:18:07.032898       1 costmodel.go:751] No RAM cost found for us-east-1,r4.2xlarge,linux, calculating...
I0625 19:18:07.032919       1 costmodel.go:751] No RAM cost found for us-east-1,r4.xlarge,linux, calculating...
I0625 19:18:07.032938       1 costmodel.go:751] No RAM cost found for us-east-1,r4.xlarge,linux, calculating...
I0625 19:18:07.032957       1 costmodel.go:751] No RAM cost found for us-east-1,r4.xlarge,linux, calculating...
I0625 19:18:07.032974       1 costmodel.go:751] No RAM cost found for us-east-1,r4.xlarge,linux, calculating...
I0625 19:18:07.032996       1 costmodel.go:751] No RAM cost found for us-east-1,r4.2xlarge,linux, calculating...
I0625 19:18:07.033014       1 costmodel.go:751] No RAM cost found for us-east-1,r4.xlarge,linux, calculating...
I0625 19:18:07.033031       1 costmodel.go:751] No RAM cost found for us-east-1,r4.xlarge,linux, calculating...
I0625 19:18:07.033051       1 costmodel.go:751] No RAM cost found for us-east-1,r4.xlarge,linux, calculating...
I0625 19:18:07.033072       1 costmodel.go:751] No RAM cost found for us-east-1,m4.xlarge,linux, calculating...
I0625 19:18:07.033094       1 costmodel.go:751] No RAM cost found for us-east-1,r4.2xlarge,linux, calculating...
I0625 19:18:07.033111       1 costmodel.go:751] No RAM cost found for us-east-1,r4.xlarge,linux, calculating...
I0625 19:18:07.033129       1 costmodel.go:751] No RAM cost found for us-east-1,r4.2xlarge,linux, calculating...
I0625 19:18:07.033146       1 costmodel.go:751] No RAM cost found for us-east-1,r4.2xlarge,linux, calculating...
I0625 19:18:07.033170       1 costmodel.go:751] No RAM cost found for us-east-1,r4.xlarge,linux, calculating...
I0625 19:18:07.033189       1 costmodel.go:751] No RAM cost found for us-east-1,r4.xlarge,linux, calculating...
I0625 19:18:07.033207       1 costmodel.go:751] No RAM cost found for us-east-1,r4.xlarge,linux, calculating...
I0625 19:18:07.033225       1 costmodel.go:751] No RAM cost found for us-east-1,r4.4xlarge,linux, calculating...
I0625 19:18:07.033243       1 costmodel.go:751] No RAM cost found for us-east-1,r4.2xlarge,linux, calculating...
I0625 19:18:07.033260       1 costmodel.go:751] No RAM cost found for us-east-1,m4.xlarge,linux, calculating...

That said, a few things to note, I'm:

  1. Using an existing Prometheus server.
  2. Using an ingress instead of the service but can't connect to it.
  3. I used the IAM permissions in the docs repository.
  4. What are the configuration options? I can see aws.json in the source ... am I to create my own? How do I configure it to use my aws.json? Are there flags?

Panic and access denied

I0430 15:53:35.143295       1 main.go:202] Starting cost-model (git commit "bd779830c98be5b101f2d9f0fe9b1e1f1fcea78f+dirty")
I0430 15:53:40.161470       1 main.go:219] Checked prometheus endpoint: http://monitoring-prometheus-server.monitoring
I0430 15:53:40.182421       1 provider.go:230] Found ProviderID starting with "aws", using AWS Provider
I0430 15:53:40.188758       1 awsprovider.go:306] starting download of "https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/index.json", which is quite large ...
I0430 15:53:40.303120       1 awsprovider.go:312] Finished downloading "https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/index.json"




I0430 15:54:05.585723       1 awsprovider.go:318] done loading "https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/index.json"
I0430 15:54:05.750668       1 awsprovider.go:415] Error downloading spot data AccessDenied: Access Denied
        status code: 403, request id: 45CA46E26DCA7F8B, host id: dETp2TmatqX+nh18l9/c74MMfbf2PD6IJmIXjaGdmuWw93iUA5Kw0IKua0cnSdg5x1fjPtG7KuM=
I0430 15:54:05.750751       1 main.go:158] Recording prices...
I0430 15:54:05.814567       1 costmodel.go:433] Use given nodeprice as whole node price
I0430 15:54:05.814591       1 costmodel.go:448] Node "ip-10-134-33-188.ec2.internal" RAM Cost := 0.001205
I0430 15:54:05.814602       1 costmodel.go:433] Use given nodeprice as whole node price
I0430 15:54:05.814610       1 costmodel.go:448] Node "ip-10-134-34-142.ec2.internal" RAM Cost := 0.004370
I0430 15:54:05.814619       1 costmodel.go:433] Use given nodeprice as whole node price
I0430 15:54:05.814633       1 costmodel.go:448] Node "ip-10-134-38-97.ec2.internal" RAM Cost := 0.004369
I0430 15:54:05.814644       1 costmodel.go:433] Use given nodeprice as whole node price
I0430 15:54:05.814655       1 costmodel.go:448] Node "ip-10-134-40-100.ec2.internal" RAM Cost := 0.001205
I0430 15:54:05.814665       1 costmodel.go:433] Use given nodeprice as whole node price
I0430 15:54:05.814675       1 costmodel.go:448] Node "ip-10-134-41-81.ec2.internal" RAM Cost := 0.004369
I0430 15:54:05.814684       1 costmodel.go:433] Use given nodeprice as whole node price
I0430 15:54:05.814695       1 costmodel.go:448] Node "ip-10-134-42-251.ec2.internal" RAM Cost := 0.004369
I0430 15:54:05.814704       1 costmodel.go:433] Use given nodeprice as whole node price
I0430 15:54:05.814712       1 costmodel.go:448] Node "ip-10-134-53-108.ec2.internal" RAM Cost := 0.004370
I0430 15:54:05.814724       1 costmodel.go:433] Use given nodeprice as whole node price
I0430 15:54:05.814734       1 costmodel.go:448] Node "ip-10-134-54-212.ec2.internal" RAM Cost := 0.001205
I0430 15:54:05.814746       1 costmodel.go:433] Use given nodeprice as whole node price
I0430 15:54:05.814757       1 costmodel.go:448] Node "ip-10-134-55-252.ec2.internal" RAM Cost := 0.004370
panic: interface conversion: interface {} is nil, not string

goroutine 97 [running]:
github.com/kubecost/cost-model/costmodel.getPVInfoVector(0x19e6260, 0xc0002ef8c0, 0xc000e4d740, 0x0, 0x0)
        /app/costmodel/costmodel.go:932 +0xa01
github.com/kubecost/cost-model/costmodel.ComputeCostData(0x2022660, 0xc000281940, 0x2090b00, 0xc0004ae000, 0x2072680, 0xc0004e7ba0, 0x1d0d36e, 0x2, 0xc00003ab60, 0xc000540dc0, ...)
        /app/costmodel/costmodel.go:139 +0x10f3
main.(*Accesses).recordPrices.func1(0xc000538e40)
        /app/main.go:159 +0x8c0
created by main.(*Accesses).recordPrices
        /app/main.go:156 +0x3f

Poorly configured CPU resources result in unexpected behavior

I was testing this out in our test env and we had a couple of deployments that were misconfigured as

        resources:
          limits:
            cpu: 10Mi
            memory: 100M
          requests:
            cpu: 10Mi
            memory: 100M

It appeared that cost-analyzer interpreted that as 10 million cores and hilarity ensued.
Thought you'd like to know :)
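
For reference, a hedged Go sketch using k8s.io/apimachinery/pkg/api/resource to show why: "10Mi" parses as roughly ten and a half million cores, whereas the presumably intended "10m" is one hundredth of a core.

    // Sketch: demonstrate how the two CPU quantities parse.
    mi := resource.MustParse("10Mi")
    m := resource.MustParse("10m")
    fmt.Println(mi.Value())     // 10485760 -> interpreted as ~10.5 million cores
    fmt.Println(m.MilliValue()) // 10       -> 10 millicores, i.e. 0.01 cores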

More accurately track AWS spot prices

Today the Kubecost project lets you supply a static, custom AWS spot price. Can we explore getting this information dynamically from AWS? There are at least two different options:

  1. General spot pricing -- some feed equivalent of the following to get general market data: https://docs.aws.amazon.com/cli/latest/reference/ec2/describe-spot-price-history.html

  2. Specific spot pricing -- feed for customer-specific prices:
    https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-data-feeds.html
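
For option 1, a hedged example of pulling the raw data from the CLI (instance type, availability zone, and time window are placeholders):

    aws ec2 describe-spot-price-history \
      --instance-types m4.xlarge \
      --product-descriptions "Linux/UNIX" \
      --availability-zone us-east-1a \
      --start-time 2019-04-30T00:00:00Z \
      --max-items 5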

Cleanup JSON parsing

Because of the mixed data types in Prometheus vectors, the JSON parsing is hard to read. It also panics when unexpected results are returned from Prometheus. See the TODOs on line 655 (https://github.com/kubecost/cost-model/blob/master/costmodel/costmodel.go#L655) and line 662 (https://github.com/kubecost/cost-model/blob/master/costmodel/costmodel.go#L662).

Also investigate other JSON parsers like https://github.com/tidwall/gjson to clean this up. Note that any JSON parser used cannot read the whole billing data result into memory, because it gets quite large.
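
On the memory point, here is a hedged sketch of streaming the pricing document with the standard library's json.Decoder so the huge "products" map is handled one entry at a time (uses "encoding/json" and "io"; field handling is illustrative):

    // Sketch: walk the offer file token by token instead of unmarshalling it whole.
    func streamProducts(r io.Reader, handle func(sku string, product json.RawMessage) error) error {
        dec := json.NewDecoder(r)
        // Naive scan forward to the "products" key; assumes no earlier value is
        // literally the string "products".
        for {
            tok, err := dec.Token()
            if err != nil {
                return err
            }
            if key, ok := tok.(string); ok && key == "products" {
                break
            }
        }
        if _, err := dec.Token(); err != nil { // consume the '{' opening the products object
            return err
        }
        for dec.More() {
            skuTok, err := dec.Token() // the product SKU (map key)
            if err != nil {
                return err
            }
            var product json.RawMessage
            if err := dec.Decode(&product); err != nil {
                return err
            }
            sku, _ := skuTok.(string)
            if err := handle(sku, product); err != nil {
                return err
            }
        }
        return nil
    }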

Better error messages when no ksm or node-exporter metrics are available

When I deploy prometheus-server without kube-state-metrics or node-exporter, I get the runtime error below in logs. It would be nice to provide a helpful message in this scenario.

URL: https://cloudbilling.googleapis.com/v1/services/6F81-5844-456A/skus?key=AIzaSyD29bGxmHAVEOBYtgd8sYM2gM2ekfxQX4U
2019/04/10 19:49:55 Recording prices...
panic: runtime error: index out of range

goroutine 98 [running]:
github.com/kubecost/cost-model/costmodel.getNormalization(0x184f040, 0xc0004e4810, 0xc000310000)
/app/costmodel/costmodel.go:659 +0x1c8
github.com/kubecost/cost-model/costmodel.ComputeCostData(0x1e04420, 0xc00025d580, 0xc00049c000, 0x1e48880, 0xc0002b17d0, 0x1b26667, 0x2, 0x4055c5, 0x42e70c, 0xc000343df8)
/app/costmodel/costmodel.go:83 +0x57d
main.(*Accesses).recordPrices.func1(0xc00025da00)
/app/main.go:104 +0x476
created by main.(*Accesses).recordPrices
/app/main.go:101 +0x3f

Crash when two resources of the same kind with the same name exist in different namespaces.

Hi.

I think I found an issue:

2019/04/15 12:47:10 Checked prometheus endpoint: http://prom-op-prometheus-operato-prometheus.monitoring.svc.cluster.local:9090
2019/04/15 12:47:10 Found ProviderID starting with "azure", using Azure Provider
2019/04/15 12:47:10 Recording prices...
2019/04/15 12:47:10 Interface map[error:found duplicate series for the match group {persistentvolumeclaim="data-vol-kube-opex-analytics-0"} on the left hand-side of the operation: [{persistentvolumeclaim="data-vol-kube-opex-analytics-0", storageclass="standard-disk"}, {persistentvolumeclaim="data-vol-kube-opex-analytics-0", storageclass="<none>"}];many-to-many matching not allowed: matching labels must be unique on one side errorType:execution status:error]. If the interface is nil, prometheus is not running!
panic: interface conversion: interface {} is nil, not map[string]interface {}

goroutine 85 [running]:
github.com/kubecost/cost-model/costmodel.getPVInfoVector(0x1852100, 0xc00058cba0, 0xc0006e2480, 0x0, 0x0)
        /app/costmodel/costmodel.go:642 +0xb20
github.com/kubecost/cost-model/costmodel.ComputeCostData(0x1e088c0, 0xc000424000, 0xc000297680, 0x1e4cca0, 0xc000195110, 0x1b29747, 0x2, 0x0, 0x0, 0x0)
        /app/costmodel/costmodel.go:130 +0xe8b
main.(*Accesses).recordPrices.func1(0xc000424d00)
        /app/main.go:104 +0x476
created by main.(*Accesses).recordPrices
        /app/main.go:101 +0x3f
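
For the record, a hedged sketch of the general fix for this class of error: include namespace in the matching labels so identically named PVCs in different namespaces no longer collapse into one match group (the left-hand series is a placeholder for whatever the cost model actually queries):

    # hypothetical PromQL sketch
    <pvc_usage_metric>
      * on (namespace, persistentvolumeclaim) group_left(storageclass)
        kube_persistentvolumeclaim_info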
