bystrogenomics / bystro

Bystro genetic analysis (annotation, filtering, statistics)

License: Apache License 2.0

Perl 33.07% Shell 1.40% Dockerfile 0.18% Rust 2.08% Python 56.41% Cython 0.06% Makefile 0.06% Go 2.73% Jupyter Notebook 4.01%
bioinformatics bioinformatics-algorithms bioinformatics-analysis bioinformatics-databases bioinformatics-pipeline bioinformatics-scripts genomics genomics-search

bystro's People

Contributors

akotlar, austintalbot7241993, codacy-badger, cristinaetrv, dependabot[bot], dlin30, ilhah, mfigurski80, poneill

bystro's Issues

Enumerate, and strip common missing values

  • dbSNP: unknown (function)
  • clinvar: not provided, not specified, no assertion criteria provided, no interpretation for the single variant, no assertion for the individual variant; see cases: akotlar@7df8409, 7883b7f

While these provide some information, barring evidence to the contrary, I think we shouldn't waste space on their storage.

Improve error messages

  • When 0 variants are annotated, an error is generated with the message "Couldn't read statistics file". This is true, but not really the core issue: there is no error, simply no data.

Finish transition to camelCase

The project started off using snake_case for variables that were configurable at run time via YAML, and camelCase elsewhere, in part because command-line users may not have liked or been used to camelCase.

This was stupid and confusing.

Better online db versioning

  • Store hash of database in YAML after build

  • Automatically identify available database builds by querying nodes

  • Show database build date, version in dropdown

  • Show deprecation messages before switching databases.

  • Always provide deprecated database for at least 2 weeks after deprecation

Support FLAG types in VCF files

Used in new gnomAD ... segdup / lcr flags, which appear as:

AC=2;AF=6.52443e-05;AN=...CSQ=A|intergenic_variant|MODIFIER||||||||||||||||1||||SNV|1||||||||||||||||||||||||||||||||||||||||||||;segdup

So need to check for presence of string, in absence of an equal sign.
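A minimal sketch of that check, assuming the INFO column is split on semicolons and that any entry without an equal sign is a FLAG. The function name parse_info is hypothetical and is not part of Bystro or bystro-vcf.

```python
def parse_info(info: str) -> dict:
    """Parse a VCF INFO string; entries without '=' are treated as FLAGs."""
    fields = {}
    for entry in info.split(";"):
        if not entry:
            continue  # tolerate empty entries, e.g. from a trailing ';'
        key, sep, value = entry.partition("=")
        # partition() returns an empty separator when no '=' is present,
        # which is exactly the FLAG case (segdup, lcr).
        fields[key] = value if sep else True
    return fields


# Shortened version of the gnomAD line above (CSQ omitted for brevity).
parsed = parse_info("AC=2;AF=6.52443e-05;segdup")
```

Key/value entries keep their string value, while FLAG entries are stored as True, so absence of a flag is simply absence of the key.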

Create Singularity, Kubernetes, etc containers

Docker is popular, but other containers are used. For instance, some at NIH use Singularity

  • Create Docker container
  • Create Kubernetes cluster driven by said Docker container
  • Create singularity container

Improve query builder.

In the web app, move from regex to something like PEG/ohm.

  • Fix Pankaj synonym issue: synonym name should match exactly

  • Prototype Ohm query syntax

Simplify transaction management

Remove all cleanUp() besides the checkpoints.

In general, how can we utilize LMDB more effectively? This is mostly interesting for the future Go transition, but it feels like our current dbRead vs dbReadCursorUnsafe solution is not completely satisfactory.

Depth of coverage

Alex,

Is there any proposal to also include the depth of coverage statistics in the summary output?

thanks!!

Store region data as array in region db

Currently region data is stored as a hash, but with integer keys; this doesn't seem particularly useful, except in maybe the case that features are split between region and site, but that could be handled in a more deterministic way to reduce the sparsity of the site and region arrays.

Expanded HGVS notation

Bystro currently supports HGVS search in coding regions.

The questions are:

  1. Should we expand HGVS support to non-coding regions?
  2. Should we permanently store the HGVS notation in a tab-delimited field?

Allow database to be built or pulled

Working on GenPro; realizing that it should be easier to start up the program.

Proposal: add a YAML config property, that provides the link to the remote resource where the version of the database specified should be uploaded to, and then pulled.

Something like

repository:
  path: "s3://" or "http://" or "/path/to"
  buildDate: 10/27/18 11:22pm

When the user first uses the config file, the program should check whether the database exists at the given database_dir, and if it does not, fetch if the repository property exists.

This will allow users to supply custom databases.
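The check described above could look roughly like the following. This is only a sketch under stated assumptions: the config is taken to be an already-parsed mapping with database_dir and an optional repository.path, a non-empty directory counts as an existing database, and fetch is a hypothetical callback standing in for whatever would download from s3://, http://, or a local path. The bucket name is made up.

```python
import os
import tempfile


def ensure_database(config: dict, fetch) -> str:
    """Return 'local', 'fetched', or 'missing' for the configured database."""
    database_dir = config["database_dir"]
    # Assumption: a non-empty directory means the database already exists.
    if os.path.isdir(database_dir) and os.listdir(database_dir):
        return "local"
    source = (config.get("repository") or {}).get("path")
    if not source:
        return "missing"  # no repository configured; the user must build
    os.makedirs(database_dir, exist_ok=True)
    fetch(source, database_dir)  # hypothetical downloader callback
    return "fetched"


calls = []
with tempfile.TemporaryDirectory() as tmp:
    config = {
        "database_dir": os.path.join(tmp, "hg38"),
        "repository": {"path": "s3://example-bucket/hg38"},  # made-up location
    }
    status = ensure_database(config, lambda src, dest: calls.append((src, dest)))
```

Passing the downloader in as a callback keeps the existence check independent of any particular storage backend.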

Potentially this could be extended to multiple databases. This would mean allowing per-track database configuration (as opposed to having a singleton with a fixed database_dir). It would of course cost access time, but may be reasonable in cases like GenPro, where we may want to allow users to build (or fetch) highly dynamic databases (per experiment). In GenPro's case, the ability to fetch from a remote resource would mean memoization to a remote resource (as opposed to an in-memory data structure).

Update tests

nearest-dev branch currently contains most up-to-date tests: https://github.com/akotlar/bystro/tree/nearest-dev

TODO:

  • Complete integration tests for all tracks (insert / fetch)
    • in future revisions explore creating more granular tests
    • some of this is limited by architectural choices; inlining -> performance+, but more complex tests
  • Complete unit tests for DB Manager functions
  • Create unit test for Output.pm
  • Create unit tests for less important, clearly working utility functions (like IO package)
  • Create low-level unit tests for gene track's TX builder
    • its function is already verified in gene track tests, but useful for future development
  • Write tests for fields that use delimiters that are also used as Bystro delimiters; ensure we aren't generating extra fields in subsequent versions. Currently everything works appropriately, but is fragile because of the lack of tests (can verify at bystro.io using hbox/dead)
  • Write test to check that newline characters are stripped from db-inserted values.

Add tests for mis-sorted files

We had an issue where VCF track builds were being cut short, because those tracks had unexpected chromosomes as an artifact of liftover.

Need to write tests for all tracks, especially those prone to liftover artifacts, showing handling of multiple chromosomes when program expects only specified chromosomes (which is the case when multiple files are present)

Create AWS instance launcher (for Spot market)

Currently we require 2 steps, since user-data is executed as root, and our scripts assume Bystro is being installed in the home directory.

The simplest solution is to install and launch somewhere from root.

A smarter, better long-term solution would be to use cloud-init to allow installation under whichever path is desired.

Utils::LiftOverCadd: allow whitelist

With the release of CADD 1.4, our major use case for liftover goes away until the next human assembly release. However, we still need to lift over the GRCh37 MT to hg19's chrM (pre-patch).

A whitelist will allow this in-app, rather than as a separate processing step.

Add sampleMaf

Contains the number of non-missing alleles at the site. Allows for queries that are maximally flexible. For instance, we could filter variants that are either in gnomAD or are at low frequency in our sample.

Cut b11.0.0 release from master

The master branch is a substantial improvement on the b10 codebase, including a new "nearest" track that uses an ahead-of-time de-duplication strategy to reduce disk space and improve annotation performance, and which allows the calculation of distance to the nearest features.

  • Currently used to calculate nearest gene and nearest Tss distances (as well as list details about those genes/tss'), and to create a refSeq.gene track, which contains pLi, pNull, pHI, lofTool, GDI, and more.

Furthermore, building now uses LMDB cursors, and is remarkably faster (build times are < 1/2 of b10).

TODOS
  • Update all used annotations, esp gnomad.

  • Finish CADD track test (double check if still needed, we may have all necessary tests)

  • Modify all track tests (besides VCF, which has this done) to show that building from scrambled files when n > 1 files present works

  • Implement overlap delimiter. This delimiter allows n:m (m > n) relationships between fields within one track. The highest cardinality scalar vector within a given track determines the relationships. By convention this should be the first track. This is, in effect, the primary key.

    • Example: one refSeq.name (by definition all refSeq entries are unique on name) may have multiple kgID's. This would be represented as name1;name2 \t kgIDforName1_1\kgIDforName1_2\kgIDforName1_3;kgIDforName2
  • Decide on the names for nearest genes tracks (currently refSeq.nearest.* refSeq.nearestTss.*) and the refSeq.gene track (which holds gene-level rather than tx-level information overlapping refSeq transcripts...for instance pLi, pNull, etc)

  • Change beanstalk workers, SeqElastic, SeqFromQuery to use supplied configuration files (search/mapping and annotation YAML), rather than the corresponding assembly configuration found in config.

    • This is needed to help protect backward compatibility for a single annotation (re-indexing), and make better use of the state (configurations) stored alongside every annotation
  • Make hg38-lifted-over CADD publicly available

    • This should probably be the filtered cadd, since this is far simpler to work with (sorted, bad sites removed)
  • Update SeqElastic, SeqFromQuery to parse the new delimiter

  • Update the front end to handle delimiter use

  • Update changelog

  • Switch clinvar to by-allele-matching, using McArthur lab clinvar vcf.

    • Decide whether to keep the existing clinvar overlap of refSeq
  • * Add basic HGVS support

  • * Update mapping files to support Elasticsearch 6. Namely, split_on_whitespace no longer works, so we need to use copy_to to move

  • ** Allow submission of any valid track type (vcf, .bed, nearest) to add custom annotations

"*" May be deferred for first minor (feature) release

** Likely to be deferred to 2nd (feature) release.
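The overlap delimiter item in the TODO list above gives name1;name2 paired with kgIDforName1_1\kgIDforName1_2\kgIDforName1_3;kgIDforName2 as its example. A rough sketch of how a consumer might parse such a row follows, assuming ";" separates values and "\" is the overlap delimiter, as in that example. The function name pair_overlaps is hypothetical.

```python
def pair_overlaps(primary: str, secondary: str,
                  value_delim: str = ";", overlap_delim: str = "\\") -> dict:
    """Map each primary-key value to the list of secondary values it owns."""
    keys = primary.split(value_delim)
    groups = secondary.split(value_delim)
    if len(keys) != len(groups):
        raise ValueError("primary and secondary fields are out of step")
    # Each group belongs to the key at the same position; the overlap
    # delimiter separates the m values that share one primary key.
    return {key: group.split(overlap_delim) for key, group in zip(keys, groups)}


row = "name1;name2\tkgIDforName1_1\\kgIDforName1_2\\kgIDforName1_3;kgIDforName2"
names, kg_ids = row.split("\t")
mapping = pair_overlaps(names, kg_ids)
```

A dictionary is safe here only because the primary field is assumed unique within the row, as the example states for refSeq.name.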

What to do with complex variants that are both a deletion and a SNP?

Example from gnomAD:

chr10 723260 rs61831381 GCCATCATCACCATGCCCAGCGTCACGTGACATGGATAGAGTACATGTCAGGGGTATCACTGTGTGGGAAAAGGTCACACCATCATCACCACTCCCCACGTCACATGACAGGGATACAGTACGTGTCAGGGGTTTCACTGTGTGGGAAAAGGTCACGCCATCATCACCATGCCCGGCGTCACGTGACATGGATAGAGTACATGTCAGGGGTATCACTGTGTGGGAAAAGGTCACA ACCATCATCACCATGCCCAGCGTCACGTGACATGGATAGAGTACATGTCAGGGGTATCACTGTGTGGGAAAAGGTCACACCATCATCACCACTCCCCACGTCACATGACAGGGATACAGTACGTGTCAGGGGTTTCACTGTGTGGGAAAAGGTCACGCCATCATCACCATGCCCGGCGTCACGTGACATGGATAGAGTACATGTCAGGGGTATCACTGTGTGGGAAAAGGTCACA,A 2362232.76

Example 2:

chr10 735488 rs56079144 ACCAGACCCGGGACAGAGTGAGGCTCCAGACCCGGACAGAGGGAGGCTCCAGACCCGGGACAGAGTGAGGCTCCAGACCCGGGACAGAGTAAGGCTCCAGACCCGAAGAGAGTGAGGCTCCAGACCCGGACAGAGTGAGGCTCCAGACCCGGACAGAGTGAGGCTCCAGACCCGGATAGAGTGAGGCTCCAGACCCGGATAGAAGGAGGCTCCAGACCCGGGACAGAGTGAGGCTCCAGACCCGGGACAGAGTGAGGCT TCCAGACCCGGGACAGAGTGAGGCTCCAGACCCGGACAGAGGGAGGCTCCAGACCCGGGACAGAGTGAGGCTCCAGACCCGGGACAGAGTAAGGCTCCAGACCCGAAGAGAGTGAGGCTCCAGACCCGGACAGAGTGAGGCTCCAGACCCGGACAGAGTGAGGCTCCAGACCCGGATAGAGTGAGGCTCCAGACCCGGATAGAAGGAGGCTCCAGACCCGGGACAGAGTGAGGCTCCAGACCCGGGACAGAGTGAGGCT,TCCAGACCCGGGACAGAGTGAGGCT,AGACCCGGGACAGAGTGAGGCTCCAGACCCGGATAGAGTGAGGCTCCAGACCCGGGACAGAGTGAGGCTCCAGACCCGGACAGAGGGAGGCTCCAGACCCGGGACAGAGTGAGGCTCCAGACCCGGGACAGAGTAAGGCTCCAGACCCGAAGAGAGTGAGGCTCCAGACCCGGACAGAGTGAGGCTCCAGACCCGGACAGAGTGAGGCTCCAGACCCGGATAGAGTGAGGCTCCAGACCCGGATAGAAGGAGGCTCCAGACCCGGGACAGAGTGAGGCTCCAGACCCGGGACAGAGTGAGGCT,T,ACCAGACCCGGGACAGAGTGAGGCTCCAGACCCGGGACAGAGTAAGGCTCCAGACCCGAAGAGAGTGAGGCTCCAGACCCGGACAGAGTGAGGCTCCAGACCCGGACAGAGTGAGGCTCCAGACCCGGATAGAGTGAGGCTCCAGACCCGGATAGAAGGAGGCTCCAGACCCGGGACAGAGTGAGGCTCCAGACCCGGGACAGAGTGAGGCT

Example 3:

chr10 737933 rs534100935 GTAGAGTGAGGCTTCAGACCCAGGTAGAGTGAGGCTCCAGACCCGGATAGAGGGAGGCTCCAGACCCGGATAGAGGGAGGCTCCAGACCCGGATAGAGTAAGGCTTCAGACCCAGGTAGAGTGAGGCTCCAGACCCGGATAGAGGGAGGCTCCAGACCCGGGACAGAGTGAGGCTCCAGACCCGGATAGAGGGAGGCTCCAGACCCGGACAGAGGGAGGCCCCAGACCCGGGACAGAGTGAGGCTCCAGACCCGAATAGAGTAAGGCTCCAGACCCGGA ATAGAGTGAGGCTTCAGACCCAGGTAGAGTGAGGCTCCAGACCCGGATAGAGGGAGGCTCCAGACCCGGATAGAGGGAGGCTCCAGACCCGGATAGAGTAAGGCTTCAGACCCAGGTAGAGTGAGGCTCCAGACCCGGATAGAGGGAGGCTCCAGACCCGGGACAGAGTGAGGCTCCAGACCCGGATAGAGGGAGGCTCCAGACCCGGACAGAGGGAGGCCCCAGACCCGGGACAGAGTGAGGCTCCAGACCCGAATAGAGTAAGGCTCCAGACCCGGA,A
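One way to at least detect such sites is a per-allele classification. The sketch below is a simple heuristic, not Bystro's or bystro-vcf's actual logic: equal-length alleles are compared base by base, and a length change whose first (anchor) base also differs from the reference is labeled COMPLEX. Short toy sequences stand in for the long gnomAD alleles above.

```python
def classify_allele(ref: str, alt: str) -> str:
    """Heuristically label one ALT allele relative to REF."""
    if len(ref) == len(alt):
        diffs = sum(r != a for r, a in zip(ref, alt))
        if diffs == 0:
            return "REF"
        return "SNP" if diffs == 1 else "MNP"
    # Length differs: VCF indels normally share their first (anchor) base.
    if ref[0] != alt[0]:
        return "COMPLEX"  # indel whose anchor base is also substituted
    return "DEL" if len(alt) < len(ref) else "INS"


# Toy analogue of the first example: REF starts with G, both ALTs start with A.
labels = [classify_allele("GCCA", alt) for alt in ("ACCA", "A")]
```

Under this heuristic the first alternate allele is a SNP written against a long reference, and the second is the ambiguous case: it could equally be read as a pure deletion that keeps the final A.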

Write md5 hash of tracks configuration to db

Ensure that if the YAML configuration is substantially modified (i.e., the track configuration is modified), the database complains.

This should not include absolute paths, database_dir or files_dir, which may be better suited as environmental variables.

This TODO is really about the initiation of use of blockchain to track state.
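A minimal sketch of the hashing step, assuming the YAML has already been parsed into a dictionary and that only the top-level database_dir and files_dir keys need to be excluded (absolute paths nested inside tracks are not handled here). The function name and the example config are hypothetical.

```python
import hashlib
import json

EXCLUDED_KEYS = {"database_dir", "files_dir"}  # environment-specific settings


def tracks_config_hash(config: dict) -> str:
    """Return an md5 hex digest of the config, ignoring excluded keys."""
    relevant = {k: v for k, v in config.items() if k not in EXCLUDED_KEYS}
    # sort_keys gives a canonical serialization, so key order in the YAML
    # file does not change the digest.
    canonical = json.dumps(relevant, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


base = {"assembly": "hg38", "database_dir": "/mnt/a", "tracks": [{"name": "refSeq"}]}
moved = dict(base, database_dir="/mnt/b")
changed = dict(base, tracks=[{"name": "refSeq"}, {"name": "cadd"}])

same_after_move = tracks_config_hash(base) == tracks_config_hash(moved)
differs_after_edit = tracks_config_hash(base) != tracks_config_hash(changed)
```

Storing the digest in the database at build time would let later runs compare it against the digest of the config they were given.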

VCF builder

Builds a VCF file, for use primarily with gnomAD, ExAC, etc.

Support more compressed formats

We may be able to gain decompression efficiency by supporting lz4, bgzip. Block-compressed formats can be decompressed using multiple threads.

DbManager should check if dbdata defined

We expect that the dbmanager will store only structures of data (one track of information at each index).

It is moderately safer, and slower, to check that the site is defined, rather than flash.

Cristina student todo's

A mixture of web and local tasks:

  • Create new save filters, Go or Perl.

  • Create in-line documentation on web: documentation should appear for new users (or users who haven't seen the function previously), when they are on a page/section with that function. Can be pretty easily written in Angular Material.

  • Document new fields going up in master (web)

  • Document new UNIT SEPARATOR (ASCII 31) for overlapping fields

  • Document filters

  • Contribute to VCF / plink export

  • Contribute to Hail integration

Document sites that didn't liftover to hg38 for gnomad

There should be a list of sites/coordinates where missing values represent sites that didn't lift over from hg19 to hg38 for quality control measures to separate those sites from missing data representing private mutations.

Fix b10 hg38 gnomad (early exit)

Default behavior when encountering unexpected chromosomes was to skip and exit early. Fixing this will restore missing hg38 sites.
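The intended behavior, skipping rows on unexpected chromosomes while continuing to read the file, can be sketched as below. This is illustrative only, assuming tab-delimited rows with the chromosome in the first column; the function name wanted_rows is hypothetical.

```python
def wanted_rows(rows, wanted_chromosomes):
    """Yield (chrom, fields) for wanted chromosomes, skipping all others."""
    for row in rows:
        fields = row.rstrip("\n").split("\t")
        chrom = fields[0]
        if chrom not in wanted_chromosomes:
            continue  # skip this row only; do not stop reading the file
        yield chrom, fields


rows = [
    "chr1\t100\tA\tG\n",
    "chr1_KI270706v1_random\t5\tC\tT\n",  # e.g. a liftover artifact
    "chr1\t200\tT\tC\n",
    "chr2\t50\tG\tA\n",
]
kept = [chrom for chrom, _ in wanted_rows(rows, {"chr1", "chr2"})]
```

Rows that follow the unexpected contig are still processed, which is what restores the sites that an early exit would drop.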

Build Error on Docker Machine

I'm using docker-machine on Windows 10. When I try to build from the Dockerfile with "docker build -t bystro .", the script exits with exit code 127 (command not found) when running install/install-go-packages.sh. Additionally, the script produces similar warnings when installing lmdb, but does not exit.

This is my terminal output (at step 11):

Step 11/13 : RUN . install/install-go-packages.sh
 ---> Running in 7700bf77c2b1
: not found install/install-go-packages.sh:
-e

Installing go packages (bystro-vcf, stats, snp)

: not found install/install-go-packages.sh:
: not found install/install-go-packages.sh:
: not found install/install-go-packages.sh:
Made /root/go path
: not found install/install-go-packages.sh:
: not found install/install-go-packages.sh:
: not found: install/install-go-packages.sh:
: not found: install/install-go-packages.sh:
: not found: install/install-go-packages.sh:
: not found: install/install-go-packages.sh:
: not found: install/install-go-packages.sh:
: not found: install/install-go-packages.sh:
The command '/bin/sh -c . install/install-go-packages.sh' returned a non-zero code: 127

Note, the script does run when I run it manually through the console.

The error is likely caused by my use of docker machine. The default vm that docker-machine creates does not have go installed, and it does fail to execute the script with exit code 127.

Permission configurability

We currently need to set read permissions on output files, so that processes on other nodes can read them without having the same user/group (files are authorized by web server, inaccessible from outside world without authorization).

TODOS:

  • Modify permissions on only files owned by Bystro, rather than all in output folder (only an issue if using --temp_dir "/some/path" without --archive)

  • Allow output permission to be set in YAML config

Add VCF export

Incorporate Dave's script...complicating factor is that it requires sdx files. The obvious solution is to have it read LMDB instead.

Export to VCF format

Will require using the tab statistics file to get the sample list, plus Dave's VCF converter. Simply: tail -n +3 statistics.tsv | cut -f1 > sample_list.txt && seqantToVcf etc.

Would be nice to update Dave's program to use LMDB db.

Note that, as it stands, we will keep multiallelics on separate lines. Could add a facility to recombine multiallelics.
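For reference, the sample-list extraction from the shell pipeline above (tail -n +3 followed by cut -f1) can be sketched in Python. It assumes, as that pipeline does, that the statistics file has two leading lines to skip and the sample name in the first tab-delimited column. The example rows are made up.

```python
from itertools import islice


def sample_list(lines):
    """Skip the first two lines, then keep the first tab-delimited column."""
    samples = []
    for line in islice(lines, 2, None):
        line = line.rstrip("\n")
        if not line:
            continue  # unlike cut, ignore blank lines
        samples.append(line.split("\t")[0])
    return samples


# Made-up rows standing in for statistics.tsv
example = ["header one\n", "header two\n", "sampleA\t0.51\n", "sampleB\t0.49\n"]
samples = sample_list(example)
```

Because the function accepts any iterable of lines, it works on an open file handle as well as on a list.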

  • Generate sample-list output from bystro-vcf

  • Add support for sample-list in YAML config, Bystro Seq.pm

  • Generate sample-list output from bystro-snp

  • Propagate sample-list during saving from query

  • Add Dave Cutler's converter program

  • Make, use Rust implementation

Use semantic versioning for both database and program

Right now program version is intimately tied to database version.

We either need to decouple them, or use semantic versioning to track all changes, such that any identified database bugs that require a rebuild increment the corresponding minor version digit.

Revise stripping of delimiters

For instance: RH C/c Polymorphism currently gets transformed in master to RH C c Polymorphism.

We could replace our delimiters with commas or underscores, to preserve the fact that these aren't separate tokens (which google will interpret correctly), and which will allow us to index them as concatenated in elastic.

Ex: RH C/c -> RH C-c or RH C_c would both work well. In google RH C,c works best, returns the same results as RH C/c.
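A small sketch of the replacement idea, assuming "/" is the only reserved character that needs handling (inferred from the example above; the full set of Bystro delimiters is not listed here). The function name is hypothetical.

```python
def replace_delimiters(value: str, delimiters: str = "/", replacement: str = "_") -> str:
    """Swap reserved delimiter characters for a non-reserved replacement."""
    return "".join(replacement if ch in delimiters else ch for ch in value)


underscored = replace_delimiters("RH C/c Polymorphism")
hyphenated = replace_delimiters("RH C/c Polymorphism", replacement="-")
```

Either variant keeps "C" and "c" joined as one token instead of splitting them with a space.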

Alternatively our overlapDelimiter could be changed to \\, but I think this makes parsing much more difficult, and should be a last resort.

Edit: By discussion with Thomas, will try \ for now.

Improve upload reliability

Users from Albert Einstein have run into issues with large uploads (10’s of GB).

  1. We should add ability to retry chunks
  2. If uploading from s3, we should run the upload completely in background, rather than as a synchronous event that the user needs to keep a connection open during (meaning don’t tie to request/response lifecycle; start upload and return).
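The chunk-retry idea in item 1 could be sketched as follows. This is an illustration rather than the web app's upload code: send_chunk is a hypothetical callback that raises OSError on a failed transfer, and the retry count and backoff values are arbitrary.

```python
import time


def upload_with_retries(chunks, send_chunk, max_attempts=3, base_delay=1.0):
    """Send chunks in order, retrying only the chunk that failed."""
    total_attempts = 0
    for index, chunk in enumerate(chunks):
        for attempt in range(1, max_attempts + 1):
            total_attempts += 1
            try:
                send_chunk(index, chunk)
                break  # this chunk succeeded; move on to the next one
            except OSError:
                if attempt == max_attempts:
                    raise  # give up after the last allowed attempt
                time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
    return total_attempts


sent = []
failures = {1: 1}  # chunk index -> simulated failures remaining


def flaky_send(index, chunk):
    if failures.get(index, 0) > 0:
        failures[index] -= 1
        raise OSError("simulated network error")
    sent.append((index, chunk))


total_attempts = upload_with_retries([b"aa", b"bb", b"cc"], flaky_send, base_delay=0.0)
```

Only the failed chunk is re-sent, so a transient error on a large upload costs one chunk rather than the whole file.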

Cc @wingolab

Add pLi scores.

Very important. Seems at least as useful as CADD, and maybe more sensitive.

Add chrPerFile support

This is a low-priority update. Its only benefit is to allow faster skipping of previously-built chromosomes.

Something along the lines of

sub makeChromCheckFunction {
  # $self and $file are passed in so the closure can log and name the file
  my ($self, $file, $onNew, $onExit) = @_;

  # The returned closure is meant to be called from inside the caller's
  # FH_LOOP-labeled read loop, since it uses "last FH_LOOP" to leave the file
  return sub {
    my ($currentChr, $newChr) = @_;

    if( ($currentChr && $currentChr ne $newChr) || !$currentChr ) {
      if($self->chrPerFile) {
        # a defined $currentChr that differs from $newChr means a 2nd chromosome
        if($currentChr && $currentChr ne $newChr) {
          # if the user guarantees one chromosome per file, this is a fatal error
          $self->log('fatal', $self->name . ": Expected one chromosome in $file, found at least 2.");
        }

        if(!$self->chrIsWanted($newChr)) {
          $self->log('warn', $self->name . ": $newChr unwanted, and chrPerFile flag set; exiting file");
          last FH_LOOP;
        }

        if(!$self->completionMeta->okToBuild($newChr)) {
          $self->log('warn', $self->name . ": $newChr wanted, but completed, and chrPerFile flag set; exiting file");
          last FH_LOOP;
        }

        $onNew->($currentChr, $newChr);

        return $newChr;
      }

      # without chrPerFile, return undef so the caller skips unwanted or completed chromosomes
      return $self->chrIsWanted($newChr) && $self->completionMeta->okToBuild($newChr) ? $newChr : undef;
    }

    return $currentChr;
  };
}

Investigate use of named databases

In this version, every track would get a separate named database, as opposed to a key in the serialized data structure.

The advantage is a substantially easier insertion model, which will allow us to modularly update the database.

The disadvantage may be read performance and size; each database will need a header; need to investigate size, but may be 16 bytes. Also, we will need to deserialize N times for N tracks, although the deserialization will be simpler.

If annotation performance or database size are substantially impacted, or this change causes significantly higher CPU usage during annotation, the tradeoff will likely not be worth it. Currently on the master branch, build times are 1 day with 3 additional whole-genome tracks (refSeq.gene, nearest.refSeq, nearestTss.refSeq), which cumulatively take ~ 7 hours. We re-run builds no more than once per month.

Set up Travis CI

This is slightly tricky: most of our tests require LMDB to be installed. Figure this out.

Nearest tssName and tssDist

TODO:

  1. Validate that both nearest.refSeq and nearestTss.refSeq are accurate
  2. Decide whether these track names are ok
  3. Decide whether we report all desired data
  4. Document parsing of these fields (since they are de-duplicated in a way that refSeq isn't).

Add ploidy (het ploidy and homozygote ploidy)

This will be used to allow dropping of samples, without screwing up allele numbers.

We should also include an allele number (maybe "sampleAn") field; this will allow easy updates to homozygosity, heterozygosity and missingness when dropping samples.
