Comments (4)
We have expert input from Lea Gimpel on how the DPG standard can be better retrofitted for open AI digital solutions.
Reproducibility:
This means that all training details are given. Needless to say, this includes a description of the data, code documentation and tech stack documentation (these can follow the already existing standards and criteria). We think it should also include specific model-training documentation: for instance, what kind of CPU/GPU, OS and platform (cloud provider, Google Colab, etc.) were used for the training, plus a list of all training parameters. Ideally, a tech-savvy person should be able to re-train the model and obtain identical evaluation scores, given all the information, data and computing power.
Google’s idea of “model cards” is also a nice way to think about transparent documentation of AI models (see here, and here for the corresponding article). In addition, Timnit Gebru suggested “datasheets for datasets”, which could be an interesting tool for the discussion around open data as a DPG.
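As a rough illustration of the training documentation described above, here is a minimal sketch that records the environment and training parameters of a run in a machine-readable form. The function name `training_record`, the field names and the example values are all hypothetical; a real project would likely also record library versions, a dataset reference and GPU details.

```python
import json
import platform
import random
import sys


def training_record(hyperparameters, seed, hardware_note):
    """Collect the details someone would need to repeat a training run."""
    random.seed(seed)  # fix the seed so Python-level randomness is repeatable
    return {
        "python_version": sys.version.split()[0],
        "os": platform.platform(),
        "machine": platform.machine(),
        # Free-text note, entered by hand (e.g. GPU model or cloud instance type).
        "hardware_note": hardware_note,
        "seed": seed,
        "hyperparameters": dict(hyperparameters),
    }


# Hypothetical example values, for illustration only.
record = training_record(
    hyperparameters={"learning_rate": 0.001, "batch_size": 32, "epochs": 10},
    seed=42,
    hardware_note="hypothetical: 1x GPU on a cloud notebook",
)
print(json.dumps(record, indent=2, sort_keys=True))
```

Publishing a file like this next to the model gives a reviewer something concrete to check against the reproducibility criterion.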
Accessibility:
This is quite critical, in our opinion. The model should be easily accessible and usable. A good solution may be to provide an API, so that users can send a request and retrieve the prediction in real time over a stable connection. Platforms such as Hugging Face are also quite handy here, since they allow trained ML models to be accessed and used with a single line of code. (By the way, they were recently valued at $2 billion in a funding round and aim to build the GitHub of machine learning.)
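To make the API idea concrete, here is a minimal client sketch using only the Python standard library. The endpoint URL, the request body (`{"inputs": ...}`) and the response format (`{"label": ..., "score": ...}`) are assumptions made for illustration, not the interface of any particular service.

```python
import json
import urllib.request

# Hypothetical endpoint; replace with the real address of a hosted model.
API_URL = "https://example.org/v1/predict"


def build_request(text, url=API_URL):
    """Wrap the input text in a JSON POST request."""
    payload = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_prediction(body):
    """Decode the assumed response format: {"label": ..., "score": ...}."""
    result = json.loads(body)
    return result["label"], float(result["score"])


def predict(text, url=API_URL, timeout=10):
    """Send the text to the API and return (label, score)."""
    request = build_request(text, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_prediction(response.read())
```

With a real endpoint in place, a call such as `predict("some input text")` would be the whole integration on the user's side, which is the kind of low barrier this criterion is asking for.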
Interpretability:
We think it is essential that the prediction outcomes of the models are interpretable and understandable, at least through proper documentation and explanation. For traditional ML models, predictions should be accompanied by some sort of intuitive confidence score. It makes a difference whether a model predicts with 99% or 51% confidence. If confidence thresholds are set, they need to be clearly stated and explained.
Generally, it should be clear what problem the AI model aims to solve and what realistic outcomes/performance the user can expect.
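The point about confidence scores and thresholds can be sketched as follows, assuming a classifier that outputs one raw score (logit) per class. The function names and the default threshold of 0.8 are hypothetical choices. Note that softmax probabilities are not necessarily well calibrated, so a real model may need a calibration step before the score can be read as a true likelihood.

```python
import math


def softmax(logits):
    """Turn raw model scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_confidence(logits, labels, threshold=0.8):
    """Return the top label together with its confidence and the threshold used."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    return {
        "label": labels[best],
        "confidence": round(confidence, 3),
        "threshold": threshold,  # stated explicitly so users can see the cut-off
        "confident": confidence >= threshold,
    }


# A clear-cut case and a near coin-flip case.
print(predict_with_confidence([4.0, 0.0], ["yes", "no"]))
print(predict_with_confidence([0.1, 0.0], ["yes", "no"]))
```

Returning the threshold alongside the prediction is one way to make sure it is "clearly stated" wherever the output is consumed.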
Independence:
This adds to the point on accessibility. The model could also be made accessible as a package, i.e. a collection of modules that can be downloaded and used in a programming language such as Python (pip install our_packaged_model), or as a sub-module in an existing package (this is how it would work if pushed to the Hugging Face model hub). The point of independence is that the dependencies need to follow the same standards as the end product, but that is already outlined in the standard.
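One way to check the point about dependencies for a pip-installed package is sketched below. It assumes that "the same standards" is read narrowly as "declares an approved open license"; the `APPROVED` set is an illustrative subset only, not the DPG list, and the helper names are hypothetical. It relies on the standard library's `importlib.metadata`, which only sees packages installed in the current environment, and on the free-text `License` metadata field, which many packages fill in inconsistently.

```python
import re
from importlib import metadata

# Illustrative subset only; the DPG standard maintains its own list of licenses.
APPROVED = {"MIT", "Apache-2.0", "BSD-3-Clause"}


def dependency_names(package):
    """Return the dependency names an installed package declares (extras included)."""
    requirements = metadata.requires(package) or []
    names = set()
    for requirement in requirements:
        # A requirement string starts with the project name, e.g. "numpy>=1.20".
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement)
        if match:
            names.add(match.group(0))
    return sorted(names)


def declared_license(package):
    """Read the License metadata field, or None if absent or not installed."""
    try:
        return metadata.metadata(package).get("License")
    except metadata.PackageNotFoundError:
        return None


def unapproved(licenses, approved=APPROVED):
    """Return the packages whose declared license is not in the approved set."""
    return sorted(name for name, lic in licenses.items() if lic not in approved)
```

For a package named `our_packaged_model`, one could build `{name: declared_license(name) for name in dependency_names("our_packaged_model")}` and pass it to `unapproved` to get a list of dependencies that need a manual look.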
from dpg-standard.
Status & Next Steps:
- Engage an AI expert to take a look at the current standard and this proposal (specific to non-PII data & data privacy).
Prioritization: should come after #59
We will resolve this in #130, the latest issue on the topic of AI as part of the standard.