
Probabilistic Machine Learning Experiments with PyStan

Probabilistic modeling using PyStan with demonstrative case study experiments from Christopher Bishop's Model-based Machine Learning.

Note that the first run of each */infer.py file will be slow, since the probabilistic model is compiled and stored as a pickle file. Subsequent runs reuse this pickle file.

Be sure to remove or relocate the corresponding *.pkl file(s) when changing model configurations; otherwise, the older model will be used for inference.
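A minimal sketch of this caching pattern, assuming the PyStan 2.x interface (pystan.StanModel); the cache path model.pkl and the function name are illustrative, not the repository's:

```python
import os
import pickle

import pystan  # PyStan 2.x interface assumed


def load_model(stan_code, cache_path="model.pkl"):
    """Compile the Stan model on first use; reuse the pickled build later."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)  # stale if the model code has changed!
    model = pystan.StanModel(model_code=stan_code)
    with open(cache_path, "wb") as f:
        pickle.dump(model, f)
    return model
```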

Experiment: Mapping MCQ Test Responses to Candidate Skills

An elaborate description of the case study can be found in the MBML Book, Chapter 2.

  • Candidates take a multiple-choice test comprising 5 choices per question, with exactly one right answer per question.
  • Each question is associated with a set of skills (one or more); this mapping forms part of the given dataset.
  • Goal: Determine which skills each candidate has, and with what probability, given their answers in the test.
  • Dataset: ground-truth skills and response data for 22 candidates, across 48 questions assessing 7 skills, are contained in CSV files in the data directory.

The following binary heatmap shows the skills (one or more) assessed by each of the 48 questions in the dataset.

skill-question-map

The solution is implemented incrementally.
A probabilistic model is formulated that makes the following initial set of assumptions about the data:

  1. A candidate has either mastered each skill or not.
  2. Before seeing any test results, it is equally likely that a candidate does or doesn’t have any particular skill.
  3. If a candidate has all of the skills needed for a question, they will still make a mistake about once in ten attempts -- a 90% right-answer probability.
  4. If a candidate doesn't have all the skills needed for a question, they will pick an answer at random. Hence, assuming a uniform guessing distribution, there's a one-in-five chance that they get the question right -- a 20% right-answer probability.
  5. Whether the candidate gets a question right depends only on what skills that candidate has, and not on anything else.

Assumptions 3 and 4 each give rise to a model parameter, which can be fine-tuned over time.
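In code, these assumptions reduce to two constants and one conditional; a minimal sketch (the names are illustrative, not the repository's):

```python
P_SKILL_PRIOR = 0.5  # assumption 2: mastery of each skill is a priori 50/50
P_KNOW = 0.9         # assumption 3: right-answer probability with all skills
P_GUESS = 0.2        # assumption 4: right-answer probability when guessing


def p_correct(has_all_skills):
    """Assumption 5: correctness depends only on skill possession."""
    return P_KNOW if has_all_skills else P_GUESS
```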

Non-Vectorized Primitive Models

  • The non-vectorized models are primitive implementations based on small subsets of data.
  • They capture all possible candidate response combinations.
  • They provide a way to intuitively ensure that the model is foundationally right, and that the assumptions and inference workflows are valid.

Three-Question Model

  • Link to three-question model implementation.
  • Modeled for 2 skills assessed through 3 questions.
  • These are skills 1 and 7, and questions 1 through 3, on the skill-question heatmap.
  • Evaluates skill probabilities for all possible response combinations.

The following factor graph represents the model and message flow for the three-question scenario.

3q-factor-graph
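At this scale, the posterior can be verified by brute force: enumerate the four skill combinations and weigh each by the likelihood of the observed responses. A sketch under the constants above, with an illustrative skill-to-question mapping NEEDS:

```python
from itertools import product

# Illustrative mapping: question i requires the skills in NEEDS[i]
# (index 0 standing in for skill 1, index 1 for skill 7).
NEEDS = [{0}, {1}, {0, 1}]


def posterior_skills(responses, p_know=0.9, p_guess=0.2):
    """Exact posterior over all 2^2 skill combinations given 3 responses."""
    joint = {}
    for skills in product([0, 1], repeat=2):
        have = {i for i, s in enumerate(skills) if s}
        p = 0.5 ** 2  # uniform prior over skill combinations
        for need, right in zip(NEEDS, responses):
            p_right = p_know if need <= have else p_guess
            p *= p_right if right else (1 - p_right)
        joint[skills] = p
    total = sum(joint.values())
    return {k: v / total for k, v in joint.items()}


# e.g. posterior_skills((1, 0, 1)): beliefs after getting q1 and q3 right
```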

Four-Question Model

  • Link to four-question model implementation.
  • Modeled for 2 skills assessed through 4 questions.
  • These are skills 1 and 7, and questions 1 through 4, on the skill-question heatmap.
  • Evaluates skill probabilities for all possible response combinations.

The following factor graph represents the model and message flow for the four-question scenario.

4q-factor-graph

Baseline Vectorized Model

  • Link to baseline implementation on complete dataset.
  • This is the first realistic model, using the complete dataset for inference.
  • It carries the original model assumptions.
  • The implementation uses matrix operations for message passing and inference, to handle larger datasets effectively and to allow GPU optimization; see the sketch below.
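A sketch of the core vectorization idea, assuming a binary skill-requirement matrix Q (the heatmap above) and a matrix S whose rows are candidate skill profiles; names and shapes are illustrative:

```python
import numpy as np


def right_answer_probs(Q, S, p_know=0.9, p_guess=0.2):
    """Q: (n_questions, n_skills) binary skill-requirement matrix.
    S: (n_profiles, n_skills) binary skill profiles.
    Returns an (n_profiles, n_questions) matrix of right-answer probabilities.
    """
    required = Q.sum(axis=1)         # number of skills each question needs
    covered = (S @ Q.T) == required  # True iff a profile has *all* of them
    return np.where(covered, p_know, p_guess)
```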

The following three-feature heatmap represents the correct and incorrect responses of the 22 candidates to the 48 questions.

question-response-map

  • White blocks represent questions answered correctly, while colored boxes represent incorrect responses.
  • The colors also mark each incorrect response with the skills required to answer the question.
  • The heatmap helps visually and qualitatively assess which candidate likely lacks which skills.

The inferred skill probabilities are compared against the ground truth data on skills possessed by each of the 22 candidates in the grayscale heatmap below.

result-baseline

Diagnosing Issues: Ancestral Sampling to Generate an Idealized Synthetic Dataset

  • Link to baseline applied on ancestrally sampled data.
  • Ancestral sampling draws a synthetic dataset from the probability distributions that the model assumes for each factor.
  • In essence, the synthetic dataset closely represents the model assumptions.
  • Hence, the results may be used to evaluate the model assumptions and inference method.
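A minimal sketch of the sampler under the assumptions above: draw skills from the prior, then draw responses conditioned on skills, following the factor graph's topological (ancestral) order. Names are illustrative:

```python
import numpy as np


def ancestral_sample(Q, n_candidates, p_know=0.9, p_guess=0.2, seed=0):
    """Sample a synthetic (skills, responses) dataset from the model."""
    rng = np.random.default_rng(seed)
    n_questions, n_skills = Q.shape
    skills = rng.random((n_candidates, n_skills)) < 0.5   # 50/50 prior
    covered = (skills @ Q.T) == Q.sum(axis=1)             # all skills held?
    p_right = np.where(covered, p_know, p_guess)
    responses = rng.random((n_candidates, n_questions)) < p_right
    return skills.astype(int), responses.astype(int)
```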

The inferred skill probabilities based on sampled response data are compared against the ground truth data on skills possessed by each of the 22 candidates in the grayscale heatmap below.

result-ancestral

  • The obtained results match the ground truth quite well, and much better than the baseline results do.
  • Since an idealized dataset that faithfully represents the model assumptions gives rise to good inference, the inference method and its steps are valid.
  • The problem, in turn, lies with the model assumptions or their parameters; the model does not represent the real dataset well.
  • The next steps involve sampling select portions of the factor graph while freezing the rest, to identify which parameters or assumptions give rise to inaccuracies, and then modifying them; a sketch of this follows.
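As one hedged example of such a diagnostic, the skill nodes can be clamped to ground truth while only the response layer is resampled; any remaining mismatch then points at the response factors (assumptions 3 and 4) rather than the skill prior:

```python
import numpy as np


def sample_responses_given_skills(Q, skills_true, p_know=0.9, p_guess=0.2,
                                  seed=0):
    """Freeze skills at ground truth; sample only the response layer."""
    rng = np.random.default_rng(seed)
    covered = (skills_true @ Q.T) == Q.sum(axis=1)
    p_right = np.where(covered, p_know, p_guess)
    return (rng.random(p_right.shape) < p_right).astype(int)
```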

Improved Vectorized Model: Learning Guess Probabilities

  • Link to improved implementation on complete dataset.
  • Selective ancestral sampling for analysis (code not included) suggests that the assumed guess probability is not very realistic; candidates are able to guess correct answers more often than once in five attempts.
  • To improve the probabilistic model, the guess probabilities are no longer assumed to be constant.
  • Instead, the guess probability for each question is inferred through message passing and belief propagation in the undirected factor graph.
  • Specifically, assumption 4 is modified as follows.

If a candidate doesn’t have all the skills needed for a question, they will pick an answer at random. The probability of getting a question right, called the guess probability, is inferred from data.

The inferred skill probabilities, when applying learnt guess probabilities, are compared against baseline performance and the ground truth data on skills possessed by each of the 22 candidates in the grayscale heatmap below.

result-improved

Note that the ground truth is not used for this inference.
Rather, all possible combinations of 'skill sets' are generated and 'belief propagated' to infer the posterior for guess probabilities.
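One way to picture the update (an EM-flavoured sketch, not necessarily the repository's exact message-passing schedule): place a Beta prior on each question's guess probability and update it with the expected counts of correct and incorrect guesses under the current skill beliefs:

```python
import numpy as np


def update_guess_probs(responses, p_guessing, a0=2.0, b0=8.0):
    """responses:  (n_candidates, n_questions) 0/1 correctness matrix.
    p_guessing: (n_candidates, n_questions) belief that the candidate lacks
                a required skill for the question, i.e. is guessing.
    The Beta(a0, b0) prior has mean 0.2, matching the original assumption."""
    right = (responses * p_guessing).sum(axis=0)        # expected lucky guesses
    wrong = ((1 - responses) * p_guessing).sum(axis=0)  # expected failed guesses
    return (a0 + right) / (a0 + b0 + right + wrong)     # posterior-mean guess prob
```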

A substantial improvement in the inference is evident after learning the guess probabilities.
Further improvements can be made, for instance, by learning the "know probabilities" and diagnosing other assumptions of the model in a targeted way.

Other Experiments

References
