Comments (8)
For Amazon's own reporting, I agree that "Amazon's Total Footprint" is not particularly useful. However, I suspect the bulk of "Electricity Emissions" and "Refrigerants", plus a portion of "Capital goods", is attributable to AWS.
There's some more information on what went into that report in the linked PDF under Carbon Methodology.
I think those numbers could be a guide to the target order of magnitude. There's other public information (number of data centers, AWS DC efficiency vs. traditional DCs) and previous estimates (number of servers) which could be used. With those inputs together we should be able to get an idea of the carbon footprint per server, and then guesstimate how many servers, or what percentage of a server, a given VM type (or a given TB of storage in S3) accounts for.
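As a rough illustration of that apportioning idea, here is a minimal Python sketch. Every input in it is a placeholder made up for the example (the total footprint, server count and vCPUs per server are not figures from Amazon's report), the function names are hypothetical, and splitting a server purely by vCPU count is an assumption.

```python
def footprint_per_server(total_kg_co2e: float, server_count: int) -> float:
    """Spread a fleet-wide footprint evenly across all servers."""
    if server_count <= 0:
        raise ValueError("server_count must be positive")
    return total_kg_co2e / server_count


def vm_share_of_server(vm_vcpus: int, server_vcpus: int) -> float:
    """Fraction of one server attributed to a VM, by vCPU count alone (assumption)."""
    if server_vcpus <= 0:
        raise ValueError("server_vcpus must be positive")
    # A VM cannot be charged for more than one whole server in this simple model.
    return min(vm_vcpus / server_vcpus, 1.0)


def vm_footprint(total_kg_co2e: float, server_count: int,
                 vm_vcpus: int, server_vcpus: int) -> float:
    """Footprint attributed to one VM: per-server footprint times its share."""
    per_server = footprint_per_server(total_kg_co2e, server_count)
    return per_server * vm_share_of_server(vm_vcpus, server_vcpus)


if __name__ == "__main__":
    # Placeholder inputs only: 1,000,000 kg CO2e over 10,000 servers,
    # and a 4 vCPU VM on a 64 vCPU host.
    print(vm_footprint(1_000_000.0, 10_000, 4, 64))
```

The same shape would work for storage by swapping the vCPU ratio for TB stored over TB per server.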
from green-cost-explorer.
The nice folk at Etsy have done a load of work trying to figure out some indicative figures for CO2 emissions from just their cloud bill with GCP.
If we put aside the issues around scope 2 reporting of carbon emissions from electricity (i.e. location-based vs market-based, and all that), this is probably the most recent and applicable set of numbers you might use as a baseline.
I'm guessing Amazon is likely to be within one order of magnitude of GCP in terms of carbon efficiency.
https://github.com/etsy/cloud-jewels
from green-cost-explorer.
On closer inspection, the Cloud Jewels work by Etsy maps much more closely than I thought onto where this might go.
Their tool works against Google Cloud, and gives output like so:
./cloud-jewels.sh -p my-billing-project

Waiting on bqjob_r2a1f3145ee30850a_000001711743b581_1 ... (1s) Current status: DONE
+------------------+------+--------------------+
| jewel_class      | skus |       cloud_jewels |
+------------------+------+--------------------+
| CPU              |   17 |          xxxxxx.xx |
| Cloud Storage    |   25 |            xxxx.xx |
| Storage          |   16 |            xxxx.xx |
| SSD Storage      |    4 |              xx.xx |
| GPU              |    4 |               x.xx |
| Excluded Service |  281 |                0.0 |
| Network          |   36 |                0.0 |
| Memory           |   13 |                0.0 |
+------------------+------+--------------------+
They use a synthetic unit (cloud jewels) rather than energy (joules/watt-hours) or carbon (CO2e). The repo has a methodology page, but the TL;DR is below:
As a rough starting point, we are estimating the wattage of an hour of virtual server use (vCPU) and a gigabyte-hour of drive storage. From some papers and the SPEC database (see References), we estimate the following:
- 2.1 Wh per vCPUh [Server]
- 0.89 Wh/TBh for HDD storage [Storage]
- 1.52 Wh/TBh for SSD storage [Storage]
The methodology makes many assumptions that would be reasonable to make about AWS too.
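To make those coefficients concrete, here is a minimal Python sketch that applies the three figures quoted above to a usage total. It is not Etsy's code: the function names are hypothetical, and the grid carbon intensity is a caller-supplied assumption, since the methodology quoted here stops at energy.

```python
# Coefficients quoted above from the Cloud Jewels methodology.
WH_PER_VCPU_HOUR = 2.1      # Wh per vCPU-hour
WH_PER_TB_HOUR_HDD = 0.89   # Wh per TB-hour of HDD storage
WH_PER_TB_HOUR_SSD = 1.52   # Wh per TB-hour of SSD storage


def estimate_energy_wh(vcpu_hours: float,
                       hdd_tb_hours: float = 0.0,
                       ssd_tb_hours: float = 0.0) -> float:
    """Estimate energy in watt-hours from compute and storage usage."""
    return (vcpu_hours * WH_PER_VCPU_HOUR
            + hdd_tb_hours * WH_PER_TB_HOUR_HDD
            + ssd_tb_hours * WH_PER_TB_HOUR_SSD)


def estimate_co2e_kg(energy_wh: float, grid_kg_co2e_per_kwh: float) -> float:
    """Convert energy to kg CO2e using a grid intensity the caller must supply."""
    return energy_wh / 1000.0 * grid_kg_co2e_per_kwh


if __name__ == "__main__":
    # Example usage: 4 vCPUs and 1 TB of SSD, both for a 720-hour month.
    energy = estimate_energy_wh(vcpu_hours=4 * 720, ssd_tb_hours=1 * 720)
    print(f"{energy:.1f} Wh")
    # 0.4 kg CO2e/kWh is a placeholder, not a figure for any real region.
    print(f"{estimate_co2e_kg(energy, 0.4):.2f} kg CO2e")
```

For AWS, the usage hours would have to come from billing data, and mapping instance types to vCPU-hours is a separate problem this sketch does not address.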
from green-cost-explorer.
Hang on, it looks like David Mytton's paper is now out, which would also be relevant.
from green-cost-explorer.
Thanks for the citation 😄
The Etsy approach is nice in that it uses the literature to come up with some figures, and they're conservative to avoid under-estimating, but the figures are still simplistic. There are big assumptions around all CPUs consuming the same power, the generation of CPUs deployed, and how power scales with load, plus the challenge of finding realistic numbers for other components like RAM and networking. And this is assuming you're running on VMs. The big promise of cloud is all the other services you don't have to build, such as databases, queues and CDNs.
https://arxiv.org/pdf/2007.07610.pdf is a new pre-print, the paper behind http://www.green-algorithms.org, with a more detailed model based on CPU types as well as regions, which seems more accurate. Potentially worth combining.
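A model of that general shape (per-CPU power, a utilisation factor for load, memory, data centre overhead, and a regional carbon intensity) could be sketched as below. This is not the paper's code: the function and parameter names are hypothetical, and every number in the usage example is a placeholder chosen for illustration.

```python
def job_energy_kwh(runtime_hours: float,
                   cores: int,
                   watts_per_core: float,
                   core_utilisation: float,
                   memory_gb: float,
                   watts_per_gb: float,
                   pue: float) -> float:
    """Energy for one job: IT power scaled by data centre overhead (PUE)."""
    if not 0.0 <= core_utilisation <= 1.0:
        raise ValueError("core_utilisation must be between 0 and 1")
    # CPU draw depends on the CPU model and load; memory draw on allocated size.
    it_watts = cores * watts_per_core * core_utilisation + memory_gb * watts_per_gb
    return runtime_hours * it_watts * pue / 1000.0


def job_co2e_g(energy_kwh: float, region_g_co2e_per_kwh: float) -> float:
    """Carbon for one job: energy times the region's grid carbon intensity."""
    return energy_kwh * region_g_co2e_per_kwh


if __name__ == "__main__":
    # Placeholder inputs: 10 h on 8 cores at 10 W/core and 50% load,
    # 32 GB of memory at 0.5 W/GB, PUE 1.5, grid at 500 g CO2e/kWh.
    energy = job_energy_kwh(10, 8, 10.0, 0.5, 32, 0.5, 1.5)
    print(energy, job_co2e_g(energy, 500.0))
```

Combining this with the Etsy approach would mean replacing the single per-vCPU coefficient with per-CPU-model and per-region inputs wherever those are known.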
Getting a carbon footprint is even harder because you have to account for full lifecycle emissions, so scoping to the use stage may be necessary. Microsoft's methodology paper (skip to the appendix) is worth a look to see how complex that is! https://www.microsoft.com/en-gb/download/details.aspx?id=56950
from green-cost-explorer.
Oh wow, thanks David - I had no idea the Green Algorithms work was published either.
Thanks for the link about the life cycle analysis from MS - I hadn't seen such detailed work before - they even define some decent functional units!
I've been trying to find some decent numbers for how long servers are used in hyperscale datacentres, and the appendix suggests that the paper below has them. I don't have access to it. If someone does, it would be really good to add the numbers, as pretty much everywhere I look I see 3-4 years being used as a refresh rate, but I'm not convinced that's the case in the larger DCs. This is the paper I'm looking for:
Eric Masanet, Arman Shehabi, and Jonathan Koomey. "Characteristics of Low-Carbon Data Centers." Nature Climate Change 3 (2013): 627-630.
from green-cost-explorer.
The Masanet (2013) paper assumes turnover is 4 years, and that is based on an assumption from an LBNL report in 2007 (which actually assumes 5 years). Neither mentions hyperscale.
On hardware refresh rates, I'd suggest having a look at https://doi.org/10.1109/TSUSC.2018.2795465 (open access), which provides a detailed model for calculating the optimum time for this. In particular, have a look at tables 4-6, which show that the major improvements only really happen after a few years, sometimes not at all, and that the workload is just as important as the hardware spec.
from green-cost-explorer.
There's another paper coming out from Fraunhofer in September based on some observed data.
https://twitter.com/jgkoomey/status/1304473503085858817
from green-cost-explorer.