hnrosa / uci-secom-fault-detection

Machine learning approaches for fault detection in the UCI SECOM dataset

Home Page: https://www.kaggle.com/code/heitornunes/fault-detection-by-feature-selection-590-to-44

Jupyter Notebook 100.00%
feature-engineering feature-selection imbalanced-learning logistic-regression missing-data

UCI SECOM Fault Detection

In semiconductor manufacturing, a product passes through many production steps on different machines. Identifying or eliminating malfunctions at each processing stage is difficult, and operating conditions in a process control environment often change, whether intentionally or not. Identifying KPIVs (Key Process Input Variables) is therefore essential for rapid recovery, optimization, and control. The goal of this case study is to develop a causal feature selection approach that applies to this domain, helps solve process control issues, and supports overall business improvement strategies.

For this, we use the UCI SECOM dataset. The first file contains 1567 examples with 591 features each, forming a 1567 x 591 matrix. The second file is a label file containing the classification and timestamp for each example. As with most real-world data, it contains null values, and the proportion of nulls varies from feature to feature. The dataset is also imbalanced: it contains only 104 fails (6.6% of the examples).
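The two data-quality issues mentioned above (null values and class imbalance) can be quantified with a few lines of pandas. The sketch below is illustrative only: it runs on a synthetic stand-in with the stated dimensions rather than on the real files, the helper name `profile` is hypothetical, and the label encoding (`1` for fail, `-1` for pass) and the 5% null rate are assumptions made for the example.

```python
import numpy as np
import pandas as pd


def profile(features: pd.DataFrame, labels: pd.Series, fail_value: int = 1) -> dict:
    """Summarise shape, class imbalance, and missing values (hypothetical helper)."""
    n_fail = int((labels == fail_value).sum())
    missing_fraction = features.isna().mean()  # fraction of nulls per feature
    return {
        "shape": features.shape,
        "n_fail": n_fail,
        "fail_rate": n_fail / len(labels),
        "features_with_missing": int((missing_fraction > 0).sum()),
        "max_missing_fraction": float(missing_fraction.max()),
    }


# Synthetic stand-in with the dimensions quoted in the text (NOT the real data).
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(1567, 591)))
X = X.mask(rng.random(X.shape) < 0.05)  # assumed 5% nulls, for illustration only
y = pd.Series(np.where(np.arange(1567) < 104, 1, -1))  # 104 fails, assumed coded as 1

summary = profile(X, y)
print(f"shape={summary['shape']}, fails={summary['n_fail']}, "
      f"fail rate={summary['fail_rate']:.1%}")
```

On the real data, the per-feature missing fraction is a natural first filter before any feature selection, and the fail rate indicates how strongly the classes would need to be rebalanced or reweighted.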

Dataset available at: https://archive.ics.uci.edu/ml/datasets/SECOM
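Once downloaded, the two files can be read into pandas. This is a minimal sketch, not the notebook's actual loading code: it assumes both files are space-separated without a header row, that nulls are written as `NaN`, and that each label line holds the class followed by a quoted `dd/mm/YYYY HH:MM:SS` timestamp. The function name `load_secom` and the sample rows are hypothetical; in-memory buffers stand in for the files so the example runs without a download.

```python
import io

import pandas as pd


def load_secom(data_src, labels_src):
    """Read the feature file and the label file (hypothetical helper)."""
    # Assumed layout: one example per line, space-separated, nulls written as NaN.
    features = pd.read_csv(data_src, sep=" ", header=None)
    # Assumed layout: class label, then a double-quoted timestamp.
    labels = pd.read_csv(labels_src, sep=" ", header=None,
                         names=["label", "timestamp"])
    labels["timestamp"] = pd.to_datetime(labels["timestamp"],
                                         format="%d/%m/%Y %H:%M:%S")
    if len(features) != len(labels):
        raise ValueError("feature and label files have different row counts")
    return features, labels


# Tiny made-up sample in the assumed format; pass file paths for the real data.
sample_data = io.StringIO("3030.93 2564.0 NaN 1.5\n3095.78 NaN 2.0 1.2\n")
sample_labels = io.StringIO('-1 "19/07/2008 11:55:00"\n1 "19/07/2008 12:32:00"\n')

features, labels = load_secom(sample_data, sample_labels)
print(features.shape, labels["label"].tolist())
```

Checking that both files have the same number of rows guards against pairing examples with the wrong labels.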

