Saboteurs

Saboteurs is a Python library to detect failure-causing elements from success/failure data.

We use it at the Edinburgh Genome Foundry to identify defective genetic parts early:

  • When assembling large fragments of DNA, each with typically 5 to 25 parts, we observe that some assemblies have far fewer successes ("good clones") than others. We use Saboteurs to identify the parts that may be causing the failures. A flagged part generally means that the sample corresponding to that part has been compromised.
  • Before launching a large batch of assemblies which reuse the same few parts, we use Saboteurs to design a smaller "test batch" of carefully selected assemblies to detect and identify possible bad parts.

See this page for the HTML docs.

You can also use Saboteurs online, using this web app for saboteur detection, or this other app for designing test batches.

Usage

Logical methods

Identifying saboteur elements from experimental results

Assume that a secret organization has a few dozen agents ([A]nna, [B]ob, [C]harlie, [D]olly, etc.). Regularly, the organization puts together a team (e.g. A, C, D) and sends it on a mission, which should succeed unless one of the members is a double agent who will secretly sabotage the mission. Looking at the table below, can you identify the saboteur(s)?

Mission	Members	Outcome
1	A, C, D	Success
2	B, C, E	Failure
3	A, B, D	Success
4	D, F, G	Failure

Mission 2 raises suspicion on B, C, and E, but mission 1 clears C, and mission 3 clears B. Therefore E is a saboteur. Meanwhile, mission 4 raises suspicion on F and G (D is cleared by missions 1 and 3), but since neither of them is cleared by another mission, it is impossible to say whether only F, only G, or both are saboteurs.

The Saboteurs library has a method find_logical_saboteurs which applies this reasoning to many groups with many elements. Here is how you would solve the problem above:

In the result, suspicious is the list of all elements which only appear in failing groups, and saboteurs is the list of suspicious elements which are also the only suspicious element in at least one group (and therefore confirmed unambiguously as saboteurs).
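The two definitions above can also be written out in a few lines of plain Python. The sketch below is not the library's implementation, and the function name `sketch_logical_saboteurs` is hypothetical. It follows the wording above literally, so a confirmed saboteur is also listed as suspicious, which may differ from what find_logical_saboteurs actually returns.

```python
def sketch_logical_saboteurs(groups, failed_groups):
    """Apply the elimination reasoning described above.

    groups: dict mapping a group name to its list of elements.
    failed_groups: names of the groups that failed.
    (Hypothetical helper, not the Saboteurs API.)
    """
    failed = set(failed_groups)

    # Any element seen in a successful group is cleared.
    cleared = set()
    for name, members in groups.items():
        if name not in failed:
            cleared.update(members)

    suspicious = set()
    saboteurs = set()
    for name in failed:
        # Suspicious elements are those never seen in a successful group.
        suspects_here = set(groups[name]) - cleared
        suspicious.update(suspects_here)
        # A lone suspect in a failing group is unambiguously a saboteur.
        if len(suspects_here) == 1:
            saboteurs.update(suspects_here)

    return {"saboteurs": sorted(saboteurs), "suspicious": sorted(suspicious)}


# The four missions from the table above; missions 2 and 4 failed.
missions = {
    1: ["A", "C", "D"],
    2: ["B", "C", "E"],
    3: ["A", "B", "D"],
    4: ["D", "F", "G"],
}
result = sketch_logical_saboteurs(missions, failed_groups=[2, 4])
print(result)  # {'saboteurs': ['E'], 'suspicious': ['E', 'F', 'G']}
```

On the mission table this reproduces the reasoning given earlier: E is confirmed, while F and G remain ambiguous.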

Designing experiment batches to find saboteur elements

Assume that we have a list of agents, among whom we suspect one or two saboteurs may be hiding. We want to select a batch of "test groups" (from all possible teams) so that once we have the result of every team (success or failure) we can identify the one or two saboteurs. This is solved as follows:
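To make the goal concrete, here is a minimal sketch of one way such a batch could be chosen. It is not the algorithm used by Saboteurs: it is a simple greedy heuristic, and the names `greedy_test_batch` and `split_classes` are hypothetical. It assumes that a team fails exactly when it contains at least one saboteur, and that there are between one and `max_saboteurs` saboteurs. A batch is sufficient when every candidate set of saboteurs would produce a different pattern of successes and failures.

```python
from itertools import combinations


def split_classes(classes, group):
    """Split each class of candidate saboteur sets by whether `group` would fail."""
    members = set(group)
    new_classes = []
    for cls in classes:
        fails = [s for s in cls if s & members]
        passes = [s for s in cls if not s & members]
        new_classes.extend(part for part in (fails, passes) if part)
    return new_classes


def greedy_test_batch(elements, possible_groups, max_saboteurs=2):
    """Greedily pick groups until all candidate saboteur sets are distinguishable.

    Hypothetical helper, not the Saboteurs API. Returns (batch, identifiable).
    """
    candidates = [
        frozenset(c)
        for size in range(1, max_saboteurs + 1)
        for c in combinations(elements, size)
    ]
    # Candidate sets in the same class give identical outcomes on the batch so far.
    classes = [candidates]
    remaining = list(possible_groups)
    batch = []
    while remaining and any(len(cls) > 1 for cls in classes):
        # Pick the group that separates the candidates into the most classes.
        best = max(remaining, key=lambda g: len(split_classes(classes, g)))
        new_classes = split_classes(classes, best)
        if len(new_classes) == len(classes):
            break  # no remaining group separates anything further
        batch.append(best)
        remaining.remove(best)
        classes = new_classes
    return batch, all(len(cls) == 1 for cls in classes)


agents = "ABCDEF"
all_teams = list(combinations(agents, 3))  # the 20 possible teams of three
batch, identifiable = greedy_test_batch(agents, all_teams)
print(len(batch), "teams selected; saboteurs identifiable:", identifiable)
```

A greedy choice does not guarantee the smallest possible batch; it only illustrates the criterion a test batch has to meet.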

You can get a quick report (CSV file and plot) of the selected groups with

In practice, a group can have different "positions" and a given element can only fill one of these positions. Consider for instance that there are 4 possible positions, with respective possible elements lists as follows:

In that case there are 3 × 4 × 4 × 3 = 144 possible combinations, which can be generated using the Saboteurs utility method generate_combinatorial_groups:
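The count can be checked independently of the library with the standard library's `itertools.product`. This sketch does not call generate_combinatorial_groups, and the position and element names are hypothetical; only the list sizes (3, 4, 4, 3) come from the text above.

```python
from itertools import product

# Hypothetical element names; only the list sizes (3, 4, 4, 3) come from the text.
elements_per_position = {
    "position_1": ["A1", "A2", "A3"],
    "position_2": ["B1", "B2", "B3", "B4"],
    "position_3": ["C1", "C2", "C3", "C4"],
    "position_4": ["D1", "D2", "D3"],
}

# One group per way of choosing exactly one element for each position.
all_groups = list(product(*elements_per_position.values()))
print(len(all_groups))  # 144
print(all_groups[0])    # ('A1', 'B1', 'C1', 'D1')
```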

Statistical methods

Example 1: assume that a secret organization has a few dozen agents (Anna, Bob, Charlie, etc.). Regularly, the organization puts together a group (e.g. Anna, David, and Peggy) and sends that group on missions, some of which will succeed and some of which will fail. After a large number of missions, looking at the results of each group, you may ask: are there agents who tend to lower the chances of success of the groups they are part of?
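One simple way to approach this question by hand is to compare, for each agent, the failures of the missions they took part in with those of all other missions. The sketch below does this with a one-sided Fisher exact test written with the standard library only. It is a stand-in technique for illustration and not necessarily the statistical method Saboteurs uses; the function names and the mission records are made up. Pooling attempts this way also ignores that agents in the same group are tested together, so the p-values are only indicative.

```python
from math import comb


def failure_excess_p_value(fail_with, total_with, fail_without, total_without):
    """One-sided Fisher exact test.

    Probability of seeing at least `fail_with` failures among the attempts
    involving a member if failures were spread at random over all attempts.
    """
    total = total_with + total_without
    fails = fail_with + fail_without
    upper = min(total_with, fails)
    favourable = sum(
        comb(fails, k) * comb(total - fails, total_with - k)
        for k in range(fail_with, upper + 1)
    )
    return favourable / comb(total, total_with)


def member_p_values(records):
    """records: list of (attempts, failures, members) tuples (made-up layout)."""
    members = sorted({m for _, _, group in records for m in group})
    p_values = {}
    for member in members:
        with_m = [(a, f) for a, f, group in records if member in group]
        without_m = [(a, f) for a, f, group in records if member not in group]
        p_values[member] = failure_excess_p_value(
            sum(f for _, f in with_m), sum(a for a, _ in with_m),
            sum(f for _, f in without_m), sum(a for a, _ in without_m),
        )
    return p_values


# Made-up data: each group was sent on 10 missions.
records = [
    (10, 1, ["A", "B", "C"]),
    (10, 8, ["A", "D", "E"]),
    (10, 0, ["B", "C", "F"]),
    (10, 9, ["D", "F", "G"]),
    (10, 1, ["E", "G", "B"]),
]
p_values = member_p_values(records)
for member, p in sorted(p_values.items(), key=lambda item: item[1]):
    print(member, f"{p:.3g}")
```

In this made-up data set, D takes part in both badly failing groups and gets by far the smallest p-value.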

With the Saboteurs library, you would first put your data in a spreadsheet data.csv like this one, then run the following script:

You obtain the following PDF report highlighting which members have a significant negative impact on their groups, and where they appear:

Installation

You can install Saboteurs with pip:

pip install saboteurs

Alternatively, you can unzip the sources in a folder and type:

python setup.py install

License = MIT

Saboteurs is open-source software originally written at the Edinburgh Genome Foundry by Zulko and released on GitHub under the MIT licence (Copyright 2017 Edinburgh Genome Foundry). Everyone is welcome to contribute!

More biology software

Saboteurs is part of the EGF Codons synthetic biology software suite for DNA design, manufacturing and validation.

Contributors

veghp, zulko

saboteurs's Issues

Detect saboteur pairs

Consider the following scenario:

id	attempts	failures	members
construct_1	3	3	A	B	C	D
construct_2	3	0	E	C	D	
construct_3	3	0	A	B	F	

The problem here is that certain combinations of members cause the assembly (mission) to fail, but this cannot be inferred with either method (logical or statistical), because every member also appears in a successful construct.
statistical_assembly_data.csv


A related issue: in DNA assembly, the above can be explained by the fact that each part (member) has two ends (overhangs) that join with other parts. If A=1&2, B=2&3 and so on, then we can rewrite the table with overhangs:

id	attempts	failures	members				
construct_1	3	3	1	2	3	4	0
construct_2	3	0	1	3	4	0	
construct_3	3	0	1	2	3	0	

Then it's clear that interaction (misannealing) between 2 and 4 causes the problem.
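The reasoning in this issue can be sketched by extending the elimination logic from single elements to pairs: a pair is suspicious if it only ever occurs together in failing constructs. The function `suspicious_pairs` below is hypothetical and not part of the package. It treats a construct as failed when all of its attempts failed, as in the tables above.

```python
from itertools import combinations


def suspicious_pairs(groups, failed_groups):
    """Return the pairs of elements that only co-occur in failing groups.

    Hypothetical helper, not part of the Saboteurs package.
    """
    failed = set(failed_groups)
    cleared = set()   # pairs seen together in at least one successful group
    suspects = set()  # pairs seen together in at least one failing group
    for name, members in groups.items():
        pairs = combinations(sorted(set(members)), 2)
        (suspects if name in failed else cleared).update(pairs)
    return sorted(suspects - cleared)


# Members table: the result is ambiguous.
by_member = {
    "construct_1": ["A", "B", "C", "D"],
    "construct_2": ["E", "C", "D"],
    "construct_3": ["A", "B", "F"],
}
print(suspicious_pairs(by_member, ["construct_1"]))
# [('A', 'C'), ('A', 'D'), ('B', 'C'), ('B', 'D')]

# Overhangs table: a single pair remains.
by_overhang = {
    "construct_1": [1, 2, 3, 4, 0],
    "construct_2": [1, 3, 4, 0],
    "construct_3": [1, 2, 3, 0],
}
print(suspicious_pairs(by_overhang, ["construct_1"]))  # [(2, 4)]
```

With the members table four pairs remain possible, whereas the overhang table narrows the cause down to the single pair (2, 4), matching the conclusion above.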

This is outside the scope of the package, but it could make a good example.
statistical_assembly_data_overhangs.csv
