ml-design-doc

A template for design docs for machine learning systems based on this post.

Note: This template is a guideline / checklist and is not meant to be exhaustive. The intent of the design doc is to help you think better (about the problem and design) and get feedback. Adopt whichever sections, and add new ones, that help you meet this goal. View other templates, examples here.


1. Overview

A summary of the doc's purpose, problem, solution, and desired outcome, usually in 3-5 sentences.

2. Motivation

Why the problem is important to solve, and why now.

3. Success metrics

Usually framed as business goals, such as increased customer engagement (e.g., CTR, DAU), revenue, or reduced cost.

4. Requirements & Constraints

Functional requirements are those that should be met to ship the project. They should be described in terms of the customer perspective and benefit. (See this for more details.)

Non-functional/technical requirements are those that define system quality and how the system should be implemented. These include performance (throughput, latency, error rates), cost (infra cost, ops effort), security, data privacy, etc.

Constraints can come in the form of non-functional requirements (e.g., cost below $x a month, p99 latency < y ms).
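A constraint such as "p99 latency < y ms" is easier to review when the doc states how it will be checked. The following is a minimal sketch, assuming latencies are collected as a list of milliseconds and that a nearest-rank percentile is acceptable; the function names `percentile` and `meets_latency_constraint` are hypothetical and not part of the template.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    # pct is expected to be an integer such as 50, 95, or 99.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_latency_constraint(latencies_ms, max_p99_ms):
    """True if the observed p99 latency is strictly below the stated limit."""
    return percentile(latencies_ms, 99) < max_p99_ms
```

Other percentile definitions (e.g., with interpolation) give slightly different values on small samples, so it helps to name the definition in the doc.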

4.1 What's in-scope & out-of-scope?

Some problems are too big to solve all at once. Be clear about what's out of scope.

5. Methodology

5.1. Problem statement

How will you frame the problem? For example, fraud detection can be framed as an unsupervised (outlier detection, graph cluster) or supervised problem (e.g., classification).
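To illustrate the unsupervised framing, here is a minimal sketch that flags unusually large transaction amounts with a modified z-score based on the median absolute deviation (MAD). The function name `flag_outliers`, the single `amounts` feature, and the 3.5 cutoff are assumptions chosen for illustration; a supervised framing would instead need labeled fraud cases to train a classifier.

```python
from statistics import median


def flag_outliers(amounts, threshold=3.5):
    """Flag values whose modified z-score (MAD-based) exceeds the threshold."""
    med = median(amounts)
    deviations = [abs(a - med) for a in amounts]
    mad = median(deviations)
    if mad == 0:
        # All values are (nearly) identical; nothing stands out.
        return [False] * len(amounts)
    # 0.6745 scales the MAD so the score is comparable to a standard z-score.
    return [0.6745 * abs(a - med) / mad > threshold for a in amounts]
```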

5.2. Data

What data will you use to train your model? What input data is needed during serving?

5.3. Techniques

What machine learning techniques will you use? How will you clean and prepare the data (e.g., excluding outliers) and create features?

5.4. Experimentation & Validation

How will you validate your approach offline? What offline evaluation metrics will you use?
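As one example of an offline evaluation metric, the sketch below computes precision and recall for a binary classifier. It assumes labels and predictions are lists of 0/1 values; the function name `precision_recall` is hypothetical, and the right metrics depend on the problem (ranking problems, for instance, would need different ones).

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Fall back to 0.0 when a denominator is empty (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```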

If you're A/B testing, how will you assign treatment and control (e.g., customer vs. session-based) and what metrics will you measure? What are the success and guardrail metrics?
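One common way to assign treatment and control is to hash the randomization unit, so the same unit always lands in the same group. The sketch below assumes this approach; `assign_variant`, the experiment name, and the 50/50 split are hypothetical. Passing a customer ID gives customer-based assignment, while passing a session ID gives session-based assignment.

```python
import hashlib


def assign_variant(unit_id, experiment, treatment_share=0.5):
    """Deterministically map a unit (customer or session ID) to 'treatment' or 'control'."""
    # Salting with the experiment name keeps assignments independent across experiments.
    key = f"{experiment}:{unit_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Use the first 32 bits of the hash as a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "treatment" if bucket < treatment_share else "control"
```

Customer-based assignment gives each customer a consistent experience across sessions; session-based assignment yields more units but lets one customer see both variants.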

5.5. Human-in-the-loop

How will you incorporate human intervention into your ML system (e.g., product/customer exclusion lists)?
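An exclusion list can be as simple as a filter applied to model output before it reaches the customer. This is a minimal sketch, assuming the model returns a ranked list of product IDs and that a human-maintained list of excluded IDs is available; `apply_exclusions` is a hypothetical name.

```python
def apply_exclusions(ranked_items, excluded_ids):
    """Drop human-excluded items from a ranked list while preserving the ranking order."""
    excluded = set(excluded_ids)  # set lookup keeps the filter fast for long lists
    return [item for item in ranked_items if item not in excluded]
```

For example, `apply_exclusions(["p1", "p2", "p3"], ["p2"])` keeps `"p1"` and `"p3"` in their original order.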

6. Implementation

6.1. High-level design

Start by providing a big-picture view. System-context diagrams and data-flow diagrams work well.

6.2. Infra

How will you host your system? On-premise, cloud, or hybrid? This will define the rest of this section.

6.3. Performance (Throughput, Latency)

How will your system meet the throughput and latency requirements? Will it scale vertically or horizontally?
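For horizontal scaling, a back-of-the-envelope replica count can make the throughput discussion concrete. The sketch below assumes each replica sustains a known request rate (e.g., from a benchmark) and that replicas should run below full utilization to leave headroom; `replicas_needed` and its parameters are hypothetical.

```python
import math


def replicas_needed(peak_qps, per_replica_qps, target_utilization=0.7):
    """Estimate how many replicas are needed to serve peak traffic with headroom."""
    if per_replica_qps <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("per_replica_qps must be > 0 and target_utilization in (0, 1]")
    usable_qps = per_replica_qps * target_utilization
    return math.ceil(peak_qps / usable_qps)
```

This ignores latency under load, which typically degrades as utilization rises, so the estimate should be checked against measured benchmarks.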

6.4. Security

How will your system/application authenticate users and incoming requests? If it's publicly accessible, will it be behind a firewall?

6.5. Data privacy

How will you ensure the privacy of customer data? Will your system be compliant with data retention and deletion policies (e.g., GDPR)?

6.6. Monitoring & Alarms

How will you log events in your system? What metrics will you monitor and how? Will you have alarms if a metric breaches a threshold or something else goes wrong?
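A threshold-based alarm check might look like the sketch below. It assumes metrics arrive as a dictionary of current values and that each threshold is an upper limit; `evaluate_alarms` and the metric names in the usage note are hypothetical. A metric with no data is also reported, since missing data often means something else has gone wrong.

```python
def evaluate_alarms(metrics, thresholds):
    """Return a message for each metric that breaches its upper limit or has no data."""
    alarms = []
    for name, limit in sorted(thresholds.items()):
        value = metrics.get(name)
        if value is None:
            alarms.append(f"{name}: no data")
        elif value > limit:
            alarms.append(f"{name}: {value} > {limit}")
    return alarms
```

For example, with thresholds on `error_rate`, `p99_latency_ms`, and `queue_depth`, a call that only receives the first two metrics would report `queue_depth: no data` alongside any breached limits.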

6.7. Cost

How much will it cost to build and operate your system? Share estimated monthly costs (e.g., EC2 instances, Lambda).
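A monthly estimate for always-on instances can be sketched as simple arithmetic. The function `monthly_cost` is hypothetical, and the hourly rate is an input to be taken from your provider's pricing page, not a real price; 730 is the average number of hours in a month (8,760 hours per year divided by 12).

```python
def monthly_cost(instance_count, hourly_rate, hours_per_month=730, fixed_monthly=0.0):
    """Estimate monthly cost: always-on instances plus any fixed monthly charges."""
    return instance_count * hourly_rate * hours_per_month + fixed_monthly
```

For example, four instances at a placeholder rate of 0.50 per hour come to `monthly_cost(4, 0.5)`, i.e. 1460.0 per month before storage, data transfer, and ops effort.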

6.8. Integration points

How will your system integrate with upstream data and downstream users?

6.9. Risks & Uncertainties

Risks are the known unknowns; uncertainties are the unknown unknowns. What worries you, and what would you like others to review?

7. Appendix

7.1. Alternatives

What alternatives did you consider and exclude? List pros and cons of each alternative and the rationale for your decision.

7.2. Experiment Results

Share any results of offline experiments that you conducted.

7.3. Performance benchmarks

Share any performance benchmarks you ran (e.g., throughput vs. latency vs. instance size/count).

7.4. Milestones & Timeline

What are the key milestones for this system and the estimated timeline?

7.5. Glossary

Define and link to business or technical terms.

7.6. References

Add references that you might have consulted for your methodology.


Other templates, examples, etc

Contributions welcome!

ml-design-docs's People

Contributors

erikcvisser, eugeneyan, savagej, segunadelowo


ml-design-docs's Issues

Bias in models?

There is a question about data privacy. Should there also be a question about potential sources of bias that could arise from using the model and how this is monitored?

Where to locate Onepager/Design Docs?

@eugeneyan - this is a great outline and I can see it being very useful for defining our work. Do you have recommendations for how to save/share these?

On one hand, drafting in an enterprise-y Word-type application means it's easy to share via application/email, and all non-technical stakeholders can read.
On the other hand, these documents are strongly linked to project documentation, so I'd be inclined to include in git repo (as markdown?) so history is version controlled and changes can be tracked.

Do you have thoughts or recommendations on this issue? Thank you!
