fsprojects / bioproviders
F# library for accessing and manipulating bioinformatic datasets.
Home Page: https://fsprojects.github.io/BioProviders/
License: MIT License
At present, the type provider relies on a list of FTP locations for the different species and assemblies on GenBank. These lists will become outdated as GenBank is updated, so it would be useful to include a script in the repository that lets us regenerate them periodically (and then publish a new version of the package).
Alex has noted that the generated files are based on a file on the GenBank FTP server itself; in future we may be able to use that file directly rather than keeping our own lists. This could help us avoid the problems in #6.
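As a rough illustration of what such an update script might do, here is a minimal sketch in Python. It assumes the source file is a tab-separated summary whose last `#` comment line names the columns, and that it has columns called `organism_name`, `asm_name` and `ftp_path`, with `na` marking a missing path. Those column names, the `parse_assembly_summary` helper and the sample rows are all assumptions made for illustration; they are not taken from the BioProviders repository.

```python
def parse_assembly_summary(text):
    """Return (organism, assembly name, FTP path) tuples from summary text.

    Assumed format: tab-separated rows, with the column names given on the
    last '#'-prefixed line before the data. Column names are assumptions.
    """
    header = None
    entries = []
    for line in text.splitlines():
        if line.startswith("#"):
            # Later comment lines overwrite earlier ones, so the comment
            # line closest to the data ends up as the header.
            header = line.lstrip("#").strip().split("\t")
            continue
        if header is None or not line.strip():
            continue
        record = dict(zip(header, line.split("\t")))
        organism = record.get("organism_name")
        assembly = record.get("asm_name")
        ftp_path = record.get("ftp_path", "na")
        # Skip rows that lack a usable location or a required field.
        if organism is None or assembly is None or ftp_path == "na":
            continue
        entries.append((organism, assembly, ftp_path))
    return entries


# Hypothetical sample input; real data would be downloaded from the server.
SAMPLE = (
    "# See the README for a description of the columns\n"
    "# assembly_accession\torganism_name\tasm_name\tftp_path\n"
    "GCA_000000001.1\tExample organism\tASM1v1\t"
    "https://example.org/genomes/GCA_000000001.1_ASM1v1\n"
    "GCA_000000002.1\tAnother organism\tASM2v1\tna\n"
)

entries = parse_assembly_summary(SAMPLE)
for organism, assembly, ftp_path in entries:
    print(organism, assembly, ftp_path, sep="\t")
```

Reading the column names from the header rather than hard-coding positions means the script would keep working if columns were added or reordered, which matters if the lists are only regenerated occasionally.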
Are the docs published to gh-pages?
An example where we do this automatically on every push to main is here: https://github.com/fsprojects/FSharp.Data/blob/main/.github/workflows/push-master.yml#L25-L31
Consider auto-publishing a NuGet package on pushes to main?
For example, like here: https://github.com/fsprojects/FSharp.Data/blob/main/.github/workflows/push-master.yml#L34-L35
Previously, when the package was built for NuGet, the files listing the locations on the GenBank FTP server were placed in the same directory as the DLLs rather than in the data subfolder. This meant that they could not be found when using the type provider from NuGet.
For now, I have moved the files out of the subfolder to try to avoid this problem (though my attempt to use the package locally still did not work, for other reasons). There are a few different ways we could tackle this problem:
Looks good - but time to migrate the docs to the main project README.
There are a couple of issues with the current generation of documentation using fsdocs. I've spent some time trying to figure these out, but have not had any success yet.
At present, the type provider only supports GenBank, although some code paths accept RefSeq data and currently end in an "unsupported" message. Once we are happy with the state of the GenBank type provider, we may look into creating a RefSeq type provider too.
At present, the BioProviders examples only show how to use the type provider in an .fsx script file. Since we are publishing on NuGet, it will likely be used in Visual Studio projects too, so we should include examples of that as well.
According to Alex, the Metadata in the GenBankFlatFile type does not expose all of the available metadata fields. We need to check whether any useful ones (such as Locus) are missing and add them to the type.
I'm wondering if https://github.com/fslaborg might be a better home for this? Or https://github.com/fsprojects?