
datascience-fails

Collection of articles listing reasons why data science projects fail.

If you have an article that should be added, please suggest it with its link in the Issues.

I summarised my findings on my blog: Data Science Risk Categorisation

I added the post to my new company's blog (hypergolic.co.uk) as well: Data Science Risk Categorisation

Follow me at @xLaszlo on Twitter for updates.

Categorisation

  • Organisational
    • Leadership
    • Employees
    • Infrastructure
  • Intermediate
    • Legal/Privacy/Bias/Security
    • Transparency/Communication
  • Product Planning
    • Business Value
    • Specification
  • Product One-Off
    • Project Execution
    • Data
    • Modelling
  • Product Ongoing
    • Operations

After looking through the 300+ failures listed below, there is a notable absence of concern about domain experts and collaboration with them, apart from offhand mentions regarding labelled data. The reader should take this into account when using the above categorisation. (Laszlo)

I created an image showing how I imagine communicating this on a single slide (excuse my design skills; it's a 2x3 table in Lucidchart with the middle row merged).


David Dao's collection of Awful AI on GitHub (link)

  • Awful AI is a curated list to track current scary usages of AI - hoping to raise awareness to its misuses in society.

51 things that can go wrong in a real-world ML project (link)

  1. Vague success metrics of the ML model
  2. Even if we had the perfect model — no clue of how it will be used within existing workflows
  3. Building a 100% accurate model — no clarity on the acceptable trade-offs such as precision versus recall
  4. Using a hammer to kill an ant — not checking the performance of simpler alternatives
  5. Not all ML problems are worth solving — the impact may not be worth the effort
  6. Drowning the business team in technical mumbo jumbo
  7. I thought this dataset attribute means something else
  8. 5 definitions of a business metric
  9. Where is the dataset I need for my model?
  10. The data warehouse is stale
  11. Need to instrument app for more clickstream events — it will take months
  12. Assuming all the datasets have the same quality
  13. Customer changed preference to not use their data for ML. Why are those records still included?
  14. Uncoordinated schema changes at the data source
  15. We have lots of data — don't forget that data expires
  16. Systematic data issues making the overall dataset biased
  17. Unnoticed sudden distribution changes in the data
  18. Using all the data for training — each model iteration can take days
  19. We are using the best polyglot datastores — but how do I now write queries effectively across this data?
  20. Training versus inference inconsistency
  21. Model accuracy too good to be true — check for feature leakage
  22. Limited Feature value coverage
  23. Flaky pipeline for generating features that are time-dependent
  24. Lack of balance between bias (underfitting) and variance (overfitting)
  25. Compromising interpretability prematurely for performance
  26. Always using deep learning instead of traditional feature engineering
  27. Not applying hashing for sparse features
  28. Not attempting to reduce the dimensionality of models
  29. Ad-hoc tuning is faster compared to a scientific approach
  30. Improper tracking of details related to model versions and experiments
  31. Ignoring the specificity and sparsity trade-off
  32. Prematurely jumping to online experimentation
  33. Not measuring model’s sensitivity to recency
  34. Not paying attention to infrastructure capacity
  35. Evaluating models using different datasets
  36. Reporting model accuracy for the overall data
  37. Training results not reproducible
  38. Long time before first online experiment
  39. Model behaves differently in online experimentation compared to offline validation
  40. Ignoring feedback loops
  41. Making multiple changes within an experiment
  42. Ad-hoc framework to analyze the results of the experiment
  43. No backup plan if the test goes south
  44. Not calibrating the model
  45. ETL Pipeline SLA was 8 am. It’s now 4 pm and still processing — why is my metrics processing slow today?
  46. Metrics processing pipelines completed successfully but results are wrong?
  47. Response time to generate an inference is too high
  48. Data quality issues at source, or ingestion into the lake, or ETL processing
  49. Cloud costs jumped up 3X this month
  50. Model has not been re-trained for 3 months — it was supposed to happen weekly
  51. No checks and bounds for data and concept drift (see the sketch below)
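
As an illustration of item 51, a minimal drift check (my sketch, not from the article): keep a reference sample of each feature from training time and compare it with recent serving traffic using a two-sample Kolmogorov-Smirnov test.

```python
# Minimal drift check (item 51): compare serving data against a training-time
# reference sample with a two-sample Kolmogorov-Smirnov test.
# Synthetic data here; in practice the reference comes from your feature store.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
reference = {"age": rng.normal(40, 10, 5_000)}  # captured at training time
serving = {"age": rng.normal(45, 10, 5_000)}    # recent production traffic

ALPHA = 0.01  # significance threshold; tune to your alerting tolerance

for feature, ref_values in reference.items():
    stat, p_value = ks_2samp(ref_values, serving[feature])
    if p_value < ALPHA:
        print(f"drift suspected in {feature!r}: KS={stat:.3f}, p={p_value:.2g}")
```

A chi-squared test plays the same role for categorical features; the hard part in practice is versioning the reference sample alongside the model.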

Why 87% of Machine Learning Projects Fail (link)

  • Not Enough Expertise
  • Disconnect Between Data Science and Traditional Software Development
  • Volume and Quality of Data
  • Labeling of data
  • Organizations are Siloed
  • Lack of collaboration
  • Technically Infeasible Projects
  • Alignment Problem Between Technical and Business Teams
  • Lack of Data Strategy
  • Lack of Leadership support

Top 10 Challenges to Practicing Data Science at Work (link)

  • Based on a PCA of the Kaggle 2017 survey data; see the article for details
    • Insights not Used in Decision Making
    • Data Privacy, Veracity, Unavailability
    • Limitations of tools to scale / deploy
    • Lack of Funds
    • Wrong Questions Asked

The State of Data Science & Machine Learning 2017 (link, webarchive)

  • Dirty data
  • Lack of data science talent
  • Lack of management/financial support
  • Lack of clear question to answer
  • Data unavailable or difficult to access
  • Results not used by decision makers
  • Explaining data science to others
  • Privacy issues
  • Lack of domain expert input
  • Can't afford data science team
  • Multiple ad-hoc environments
  • Limitations of tools
  • Need to coordinate with IT
  • Expectations of project impact
  • Integrating findings into decisions

OpML '20 - How ML Breaks: A Decade of Outages for One Large ML Pipeline (Google) (link, youtube)

  • Data arriving from multiple sources was joined to provide positive labels; when the data rate increased, joins were delayed and training ran on unjoined data that incorrectly defaulted to negative labels (see the sketch after this list).
  • Data source location changed and downstream process didn't have permission to read from the new place.
  • Failure Taxonomy
    • Process orchestration issues
    • Overloaded backends
    • Temporary failure to join with expected data
    • CPU failures
    • Cache invalidation bugs
    • Changes to the distribution of examples that we are generating inference on
    • Config changes pushed out of order
    • Suboptimal data structure used
    • Challenges assigning work between clusters
    • Example training strategy resulted in unexpected ordering
    • ML hyperparameters adjusted on the fly
    • Configuration change not properly canaried or validated
    • Client made incorrect assumption about model providing inference
    • Inference takes too long
    • Incorrect assert() in code
    • Labels weren't available/mostly correct at the time the model wished to visit the example
    • Embeddings interpreted in the wrong embedding-space
    • QA/Test jobs incorrectly communicating with prod backends
    • Failed to provision necessary resources (bandwidth, RAM, CPU)
  • ML vs non-ML categorisation
    • ML
      • Changes to the distribution of examples
      • Problems with selection and processing of training data: either sampling wrong, re-visiting the same data, skipping data, etc.
      • Hyperparameters
      • Mismatch in embedding interpretation
      • Training on mislabelled data
    • non-ML
      • Dependency failure (other than data)
      • Deployment failure (out of order, wrong target, wrong binaries, etc)
      • CPU failures
      • Inefficient data structure
  • Distributed vs non-distributed categorisation
    • Distributed
      • System orchestration: which processes to run where
      • Data joined between two systems fails (e.g.: missing foreign key)
      • Some resource (e.g. CPU) is unavailable in the quantities we need
      • Changes pushed in an unsafe order
    • Less distributed
      • CPU oddities (probabilistically distributed: only happening at huge scales)
      • Human driven change not tested before being applied to production environment
    • Not distributed
      • Failed assert(): invariant is not invariant
      • Bad data structures
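
The delayed-join outage in the first bullet deserves a concrete illustration. A hypothetical reconstruction (my code, not Google's): the bug is the default label, which conflates "no positive event" with "positive event not joined yet".

```python
# Hypothetical reconstruction of the delayed-join failure described above:
# clicks arrive late, so examples that will eventually be positives are
# silently labelled negative by the default value.
impressions = [{"id": 1}, {"id": 2}, {"id": 3}]
clicks_arrived = {1: 1}  # the click for id=3 is delayed, not yet joined

def label_buggy(imp):
    # BUG: defaulting to 0 conflates "no click" with "click not joined yet"
    return clicks_arrived.get(imp["id"], 0)

def label_safer(imp, join_window_closed):
    # Only emit a label once the join window has definitely closed;
    # until then, hold the example back from training.
    if not join_window_closed(imp["id"]):
        return None
    return clicks_arrived.get(imp["id"], 0)

# id=3's click is still in flight: the buggy path trains on a false negative,
# the safer path defers the example instead.
print([label_buggy(i) for i in impressions])                        # [1, 0, 0]
print([label_safer(i, lambda _id: _id != 3) for i in impressions])  # [1, 0, None]
```

The safer variant trades data freshness for label correctness, which is one way to resolve the tension the talk describes.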

geckoboard's Data fallacies (link)

  • Cherry Picking
  • Data Dredging
  • Survivorship Bias
  • Cobra Effect
  • False Causality
  • Gerrymandering
  • Sampling Bias
  • Gambler's Fallacy
  • Regression Toward the Mean
  • Hawthorne Effect
  • Simpson's Paradox (demonstrated in the sketch after this list)
  • McNamara Fallacy
  • Overfitting
  • Publication Bias
  • Danger of Summary Metrics
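
Most of these fallacies can be reproduced on toy data in a few lines. Simpson's Paradox, for example, with the classic kidney-stone numbers: treatment A wins within every subgroup yet loses in the pooled totals because the subgroups are unevenly sized.

```python
# Simpson's Paradox with the classic kidney-stone figures:
# (successes, total) per treatment, per subgroup.
groups = {
    "small stones": {"A": (81, 87),   "B": (234, 270)},
    "large stones": {"A": (192, 263), "B": (55, 80)},
}

for name, g in groups.items():
    print(name, {t: f"{s / n:.0%}" for t, (s, n) in g.items()})  # A beats B in both

pooled = {t: sum(g[t][0] for g in groups.values()) /
             sum(g[t][1] for g in groups.values()) for t in ("A", "B")}
print("pooled", {t: f"{r:.0%}" for t, r in pooled.items()})      # yet B beats A here
```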

Three Risks in Building Machine Learning Systems (link)

  • Poor Problem-Solution Alignment
  • Incurring Excessive Costs
  • Unexpected Behavior and Unintended Consequences

AI Engineering: 11 Foundational Practices (link, pdf)

  • Ensure you have a problem that both can and should be solved by AI.
  • Include highly integrated subject matter experts, data scientists, and data architects in your software engineering teams.
  • Take your data seriously to prevent it from consuming your project.
  • Choose algorithms based on what you need your model to do, not on their popularity.
  • Secure AI systems by applying highly integrated monitoring and mitigation strategies.
  • Define checkpoints to account for the potential needs of recovery, traceability, and decision justification.
  • Incorporate user experience and interaction to constantly validate and evolve models and architecture.
  • Design for the interpretation of the inherent ambiguity in the output.
  • Implement loosely coupled solutions that can be extended or replaced to adapt to ruthless and inevitable data and model changes and algorithm innovations.
  • Commit sufficient time and expertise for constant and enduring change over the life of the system.
  • Treat ethics as both a software design consideration and a policy concern.

Machine Learning: The High-Interest Credit Card of Technical Debt (link, pdf)

  • Complex Models Erode Boundaries
    • Entanglement
    • Hidden Feedback Loops
    • Undeclared Consumers
  • Data Dependencies Cost More than Code Dependencies
    • Unstable Data Dependencies
    • Underutilized Data Dependencies
    • Static Analysis of Data Dependencies
    • Correction Cascades
  • System-level Spaghetti
    • Glue Code
    • Pipeline Jungles
    • Dead Experimental Codepaths
    • Configuration Debt
  • Dealing with Changes in the External World
    • Fixed Thresholds in Dynamic Systems (see the sketch after this list)
    • When Correlations No Longer Correlate
    • Monitoring and Testing
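
"Fixed Thresholds in Dynamic Systems" comes with a mitigation the paper itself suggests: learn decision thresholds from held-out validation data at every retrain instead of hard-coding them. A minimal sketch of that idea (my code, not the paper's):

```python
# Instead of a hard-coded decision threshold that rots as the score
# distribution shifts, re-derive it from held-out data at every retrain.
import numpy as np
from sklearn.metrics import precision_recall_curve

def threshold_for_precision(y_true, y_score, min_precision=0.9):
    """Lowest score threshold that still achieves the required precision."""
    precision, recall, thresholds = precision_recall_curve(y_true, y_score)
    ok = precision[:-1] >= min_precision  # precision has one extra trailing entry
    if not ok.any():
        raise ValueError("target precision unreachable on this validation set")
    return float(thresholds[ok].min())    # lowest qualifying cut keeps recall highest

# Toy validation set; in practice this runs as part of the retraining job.
y_true = np.array([0, 0, 1, 1, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.9, 0.5, 0.7])
print(threshold_for_precision(y_true, y_score))  # 0.7
```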

Managing the Risks of Adopting AI Engineering (link)

  • ill-defined problem statement
  • lack of expertise
  • model-system-data disconnection
  • unrealistic expectations
  • data challenges
  • lack of verifiability

What is ML Ops? Best Practices for DevOps for ML (Cloud Next '18) (link, youtube)

  • ML Super heroes (reliance on DSes doing everything)
  • CACE (change anything, change everything)
  • Black box is hard
  • Lack of ML lifecycle management
  • Lack of data validation
  • Anti-pattern: Lack of continuous monitoring
  • Anti-pattern: Training-serving skew (see the sketch after this list)
  • Anti-pattern: Not knowing the freshness requirements
    • how frequently the model should run
    • how fast the model should respond
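
Training-serving skew usually comes from maintaining two implementations of the same feature transform. A minimal detection sketch (mine, with hypothetical transform names): replay logged serving payloads through the training-time code path and compare outputs.

```python
# Minimal training/serving-skew check: replay logged serving payloads
# through the training-time feature transform and compare the outputs.
# featurize_train / featurize_serve are hypothetical stand-ins for the
# two code paths; ideally they would be literally the same function.
import math

def featurize_train(row):
    # Training-time code path
    return {"log_price": math.log1p(row["price"]),
            "is_new": int(row["age_days"] < 30)}

def featurize_serve(row):
    # Re-implemented for serving; the boundary condition drifted (<= vs <)
    return {"log_price": math.log(row["price"] + 1),
            "is_new": int(row["age_days"] <= 30)}

logged_requests = [{"price": 10.0, "age_days": 30}]

for row in logged_requests:
    train_f, serve_f = featurize_train(row), featurize_serve(row)
    for k in train_f:
        if not math.isclose(train_f[k], serve_f[k]):
            print(f"skew in {k!r}: train={train_f[k]} serve={serve_f[k]}")
```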

A Brief Guide to Running ML Systems in Production (link)

  • Model is not tested with representative data
  • Model is not compatible with the API in production (see the contract-test sketch below)
  • Model is not validated against real data
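
The API-compatibility failure above is often caught cheaply with a contract test run in CI: validate a production-shaped payload against the feature schema the model was trained on. A sketch with hypothetical field names:

```python
# Cheap contract test for the "model is not compatible with the API" failure:
# check that a production-shaped payload matches the feature schema the model
# was trained on. Field names here are hypothetical.
EXPECTED_FEATURES = {"age": float, "income": float, "country": str}

def validate_request(payload: dict) -> list:
    """Return a list of schema violations; empty means compatible."""
    errors = []
    for name, typ in EXPECTED_FEATURES.items():
        if name not in payload:
            errors.append(f"missing feature: {name}")
        elif not isinstance(payload[name], typ):
            errors.append(f"{name}: expected {typ.__name__}, "
                          f"got {type(payload[name]).__name__}")
    return errors

assert validate_request({"age": 31.0, "income": 52_000.0, "country": "UK"}) == []
assert validate_request({"age": "31"}) != []  # wrong type and missing fields
```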



6 myths about big data (link)

  • Big data means 'a lot' of data
  • The data needs to be clean
  • Wait to make your data perfect
  • The data lake
  • Analyzing data is expensive
  • Machine algorithms will replace human analysts

How your executives will screw up your next analytics project (link)

  • From: The Reason So Many Analytics Efforts Fall Short (link)
    • Since there was no natural owner of analytics within the traditional organizational structure, multiple executives competed hard to own the new capability.
    • With the exception of the “winner,” a feeling of vulnerability settled over the other executive team members when the analysis conducted by the analytics group revealed inefficiencies and missed opportunities in their respective functions.

The state of data quality in 2020 (link)

  • What are the primary data quality issues your organisation faces?
    • Poorly labelled data
    • Unlabeled data
    • Unstructured data that is difficult to organise
    • Too many data sources and inconsistent data (i.e. data integration issues)
    • Poor data quality controls at data entry
    • Poor data quality from third-party sources
    • Too few resources available to address data quality issues
    • Biased data (e.g. non-representative datasets or samples)
    • Needed data not collected
    • Disorganized data stores and lack of metadata

AI adoption in the enterprise 2020 (link)

  • Common challenges to AI adoption
    • Company culture does not yet recognize needs for AI
    • Difficulties in identifying appropriate business use cases
    • Lack of skilled people/difficulty hiring the required roles
    • Lack of data or data quality issues
    • Technical infrastructure challenges
    • Legal concerns, risks or compliance issues
    • Model validation
    • Efficient tuning of hyperparameters
    • Workflow reproducibility
  • What are the biggest skills gaps within your organisation, related to machine learning and AI adoption?
    • ML modelers and data scientists
    • Understanding and maintaining a set of business use cases
    • Data engineering
    • Compute infrastructure
  • What kinds of risks do you check for during ML model building and deployment?
    • Fairness, bias, ethics
    • Model degradation
    • Model interpretability and transparency
    • Privacy
    • Security vulnerabilities
    • Safety and reliability
    • Unexpected outcomes/predictions
    • Other compliance
    • Reproducibility

Move Fast and Break Things? The AI Governance Dilemma (link)

  • The level of quality required of predictions varies with use-case.
  • Outliers
  • Concept Drift
  • Bias
  • Privacy
  • DevOps for Machine Learning is Special
  • Reproducibility
  • Monitoring
  • Deployments
  • Explainability

9 machine learning myths (link)

  • Myth: Machine learning is AI
  • Myth: All data is useful
  • Myth: You always need a lot of data
  • Myth: Anyone can build a machine learning system
  • Myth: All patterns in the data are useful
  • Myth: Reinforcement learning is ready to use
  • Myth: Machine learning is unbiased
  • Myth: Machine learning is only used for good
  • Myth: Machine learning will replace people

10 signs you’re ready for AI — but might not succeed (link)

  • You have plenty of data
  • You have enough data scientists
  • You track or acquire the factors that matter
  • You have ways to clean and transform the data
  • You've already done statistical analyses on the data
  • You test many approaches to find the best models
  • You have the computing capacity to train deep learning models
  • Your ML models outperform your statistical models
  • You are able to deploy predictive models
  • You are able to update your models periodically

AI’s Biggest Risk Factor is Big Data Itself (link)

  • AI’s biggest risk factor: Data gone wrong
  • From AI’s biggest risk factor: Data gone wrong (link)
    • Several recent research studies demonstrated that popular data sets used to train image recognition AI included gender biases.
    • You can't outsource judgment, ethics, or values to AI.
    • Just because a company has access to information doesn't mean it can use it any way it wants.
    • The rise of fake data
  • More legal and compliance risks from AI
  • From Risky AI business: Navigating regulatory and legal dangers to come (link, webarchive)
    • AI presents a wide range of hidden dangers for companies, especially in areas such as regulatory compliance, law, privacy and ethics.
    • “Deploying AI in any highly regulated industry may create regulatory compliance problems.”
    • “If an algorithm malfunctions, or even functions properly but in the wrong context, for example, there is a risk of significant losses to a trading company or investors,”
    • Unlike a physician, who might have the value of other contextual information about a patient, or even intuition developed over years of practice, the results from AI and machine learning programs can be narrow and incomplete.
    • “We should not trust machines with decisions when the costs of error are too high,”
    • “The main issue is who will be held responsible if the machine reaches the ‘wrong’ conclusion or recommends a course of action that proves harmful,”

Forrester Predictions 2018 (link)

  • 75% of early AI projects will underwhelm due to operational oversights.
  • Benefits are too narrow and short-lived.

How To Underwhelm With Artificial Intelligence (link)

  • Championing AI As A Miracle Cure
  • Leaping Into High-Risk Domains
  • Blissfully Ignoring Your Biases

A Guide to Underwhelming with AI (link)

AI is not set and forget (link)

  • neglect it too long and you’re in trouble
  • Unfortunately, failing to maintain your AI will destroy the project.
  • AIs need feedback to let them know when they’re wandering off topic
  • having a human at hand to audit potential issues is essential.

How to Fail with Artificial Intelligence (link)

  • Cut R&D spending to save money
  • Operate in a technology bubble
  • Prioritize technology over business strategy
  • Work without a clear vision
  • Develop without addressing business needs
  • Cultivate a “we’re the best” attitude
  • Get caught in a never-ending development loop
  • Assume your customers are like developers
  • Assume the AI hype is enough to succeed

Top 5 AI Failures From 2017 Which Prove That ‘Perfect AI’ Is Still A Dream (link)

  • When Facebook’s Chatbots Developed Their Own Language
    • Unclear goals (though I don't see how this is a fail)
  • When Mitra The Robot Failed To Greet The Prime Minister
    • Malfunction (this might not even be AI)
  • When Autonomous And Driverless Vehicles Turned Disastrous
    • Too ambitious
  • When iPhone X’s Face Recognition Could Not Differentiate Identical Twins
    • Rare edge case
  • When Alexa And Amazon Echo Goofed Up
    • Malfunction (this might not even be AI)

Stories of AI Failure and How to Avoid Similar AI Fails (link)

  • Fail: IBM’s “Watson for Oncology” Cancelled After $62 million and Unsafe Treatment Recommendations
    • they trained the software on a small number of hypothetical cancer patients, rather than real patient data.
    • BHAGs (Big Hairy Audacious Goals) (Laszlo)
    • too ambitious (Laszlo)
  • Fail: Microsoft’s AI Chatbot Corrupted by Twitter Trolls
    • BHAGs (Big Hairy Audacious Goals)
    • too ambitious (Laszlo)
    • Microsoft won’t say exactly how the algorithms worked, of course.
    • unclear operating mechanism (Laszlo)
  • Fail: Apple’s Face ID Defeated by a 3D Mask
    • Hackers in production environment
    • though: Publications such as Wired had already tried and failed to beat Face ID using masks.
  • Fail: Amazon Axes their AI for Recruitment Because Their Engineers Trained It to be Misogynistic
    • Artificial intelligence and machine learning (can - Laszlo) have a huge bias problem.
    • BHAGs (Big Hairy Audacious Goals) (Laszlo)
    • too ambitious (Laszlo)
  • Fail: Amazon’s Facial Recognition Software Matches 28 U.S. Congresspeople with Criminal Mugshots
    • Biased dataset
    • “Nearly 40 percent of Rekognition’s false matches in our test were of people of color, even though they make up only 20 percent of Congress.”

NewVantage Partners: Big Data Executive Survey 2017 (link, pdf)

  • Cultural impediments to Big Data business adoption.
    • Insufficient organizational alignment
    • Lack of middle management adoption and understanding
    • Business resistance or lack of understanding
    • Lack of a coherent data strategy
    • Technology resistance or lack of understanding
    • Inability to create a shared vision
    • Lack of data governance policies and practices

Five Reasons Why Your Data Science Project is Likely to Fail (link)

  • Lack of Resources to Execute Data Science Projects
  • Long Turnaround Time and Upfront Effort Without Visibility into the Potential Value
  • Misalignment of Technical and Business Expectations
  • Lack of Architectural Consideration for Production, Operationalization
  • Heavy Dependency on Skills, Experiences of Particular Individuals
  • End-to-end Data Science Automation is a Solution

6 Reasons Why Data Science Projects Fail (link)

  • Asking the wrong questions
  • Lack of firm support by key stakeholders
  • Data problems — Poor data quality and accuracy
  • Lack of the right data science “team”
  • Overly complex models
  • Over-promising

Why Data Science Succeeds or Fails (link)

  • Team Diversity — Cross-Functional Teams
    • Project Leadership
    • Strategist
    • Communication/Translating
    • Development/Programming
    • Data Engineering
    • Quality Assurance/Testing
  • Data Diversity & Breadth — Where One Starts
  • Understand the Contextual Core Problem
  • Does it work? (Does it add value - Laszlo)
  • Will they use it?
  • Ensembles are Key

Why data science projects fail revisited (link)

  • Many projects are not iterated quickly enough and are then suddenly shut down or quietly declared as completed
  • Gartner Says Nearly Half of CIOs Are Planning to Deploy Artificial Intelligence (link)
    • Aim Low at First
    • Focus on Augmenting People, Not Replacing Them
    • Plan for Knowledge Transfer
    • Choose Transparent AI Solutions

Why Most AI Projects Fail (link)

  • Science project sharks
    • “Wouldn’t it be cool if we could do (names some niche geek-fetish)?!”
    • BHAGs (Big Hairy Audacious Goals),
  • Breakdown in communication
  • Fail before you start
  • Not having a data warrior
  • Homegrown talent/software
  • Start simple!

Why You’re Not Getting Value from Your Data Science (link)

  • the data is a mess.
  • In its rawest form, even clean data is too overwhelming and complex to be understood at first glance, even by experts.
  • due to the time it takes to understand, formulate, and process data for a machine learning problem, machine learning experts often instead focus on the later parts of the pipeline—trying different models
  • While business experts are coming up with problems, machine learning experts cannot always keep up.
  • machine learning experts often didn’t build their work around the final objective—deriving business value.
  • the machine learning experts wanted to spend their time building models, not processing massive datasets or translating business problems into prediction problems
  • the current technological landscape, both commercial and academic, focuses on enabling more sophisticated models

Data Science Project Failures (link)

  • From: Predicting outcomes for big data projects: Big Data Project Dynamics (BDPD): Research in progress (link)
    • Wrong/Inadequate Skills
    • Incorrect Business Objectives
    • Insufficient ROI/Business Case
    • Data Management
    • Data Integration
    • Technology Complexity
    • Improper Scope
    • Management & Cultural Resistance
    • Inadequate Management & Governance
    • Incorrect Project Structure
    • Technology Architecture & Infrastructure
    • Incorrect Use of Technology
    • Poor Communication
    • Enterprise Strategy Match
    • Problem Avoidance
    • Technology Change
  • From: Cracking the Data Conundrum: How Successful Companies Make Big Data Operational (pdf) (link)
    • Scattered data lying in silos across various teams
    • Absence of a clear business case for funding and implementation
    • Ineffective coordination of teams across the organisation
    • Dependency on legacy systems
    • Lack of sponsorship from top management
    • Ineffective governance models for Big Data and analytics
    • Lack of Big Data tools and technology
    • Cost of specific tools and infrastructure for Big Data and analytics
    • Data security and privacy concerns
    • Resistance to change within the organisation
  • Ad Hoc and Software Engineering Project Management

Why do 87% of data science projects never make it into production? (link)

  • throw money at a problem or put a technology in
  • we don’t have the right leadership support, to make sure we create the conditions for success
  • most organizations are highly siloed (in terms of data - Laszlo), with owners who are simply not collaborating and leaders who are not facilitating communication
  • take those insights, and they flip them over the wall, now you’re asking an engineer to rewrite a data science model created by a data scientist
  • because nobody owned it
  • educate the business leaders across the organization

How to fail as a data scientist: 3 common mistakes (link)

  • Focusing only on the solution
  • Forgetting the basics
  • Ineffectively communicating

We need to spend more time talking about data science failures (link)

  • Lesson 1: Context (there is none)
  • Lesson 2: Correlation not causation
  • Lesson 3: Completeness of data

Why Data Science Projects Fail (link)

  • Real-Time, Dynamic Data
  • Workflow Reusability Over Time
  • Collaboration - Or Lack Thereof
  • Skill Set Disconnect
  • Operationalization
  • Growth

Data Science: 4 Reasons Why Most Are Failing to Deliver (link)

  • Silos of knowledge
  • Friction in model deployment
  • Tool and technology mismatch
  • Model liability

Why so many Data Science projects fail to deliver (link)

  • Mistake 1: The Hammer in Search of a Nail
  • Mistake 2: Unrecognized Sources of Bias
  • Mistake 3: Right Solution, Wrong Time
  • Mistake 4: Right Tool, Wrong User
  • Mistake 5: The Rocky Last Mile


