biotoolsschema's People

Contributors

bug1303, dependabot[bot], dfornika, hansioan, hmenager, joncison, kigaard, matuskalas, redmitry, smoe

biotoolsschema's Issues

Add "R" as an interfaceType

I see that most Bioconductor packages in bio.tools are annotated with:
resourceType: Tool
interfaceType: Command line

Some other R packages, however, have:
resourceType: Library
interfaceType: API

I understand that these terms should be rather broad, with only a few of them in the enumeration. However, I think it would make sense to add something more specific here. What is an R package, really? The definition "Command line: Text-based interface to a tool or service" is of course also true for R, but many researchers who use R do so in a semi-graphical interface, and some of them are scared off by 'actual' command lines. I think being able to search explicitly for R packages would give those users a better experience.

Attribute for API-compliance

Requested by ELIXIR EXCELERATE WP7: a new attribute to capture whether an API is WP7-compliant.

Of course, something generic is needed.

Multiple fixes to 2.0 alpha-01 + docs

  1. Pattern for <name> element: check that it does not allow spaces.
  2. Pattern for <description> element: which characters are not allowed? Maybe change the base type from xs:string, if appropriate, to make this more restrictive.
  3. Change the pattern for < id > attribute and id> element (once settled in bio.tools URL scheme)
  4. Settle the enum for resourceType
  5. Pattern for element (once settled in bio.tools URL scheme)
  6. ORCID simple type: specify type, pattern and sample (what is the valid ORCID syntax?)
  7. Is nesting 'choice' within 'sequence' (in contactDetails and creditDetails) really necessary? Can we just use 'choice'?
  8. Add 'sample' value to PMID and PMCID
  9. Document (on GitHub WIKI?) "roles" used by <credit> element: Developer, Maintainer, Other.
  10. Document (on GitHub WIKI?) "roles" used by <contact> elements: General, Developer, Technical, Scientific, Maintainer, Helpdesk.
  11. Document (on GitHub WIKI?) meaning of <relationshipType> enum
  12. Better pattern for <description> element, e.g. that the sentence begins in upper case and ends with a full stop, only ASCII characters, etc.
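
On point 6, a minimal sketch of what an ORCID check could look like, assuming the bare 16-character form (four hyphen-separated groups, last character a digit or "X") with the ISO 7064 MOD 11-2 check character that ORCID documents. The function names are illustrative only and are not part of the schema; the URI form (https://orcid.org/...) is not handled here.

```python
import re

# Bare ORCID iD: four groups of four, the final character may be "X".
# An XSD pattern could use the same expression without the anchors.
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_char(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string has ORCID syntax and a matching check character."""
    if ORCID_PATTERN.fullmatch(orcid) is None:
        return False
    chars = orcid.replace("-", "")
    return orcid_check_char(chars[:15]) == chars[15]
```

A regular expression alone can only constrain the shape; the check character would have to be verified by the application, as above.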

Check compatibility with relevant schemes and standards

Including:

"Machine-understandable" but application-specific annotation inside XSD

xs:appinfo is a standard mechanism for defining business logic beyond the expressive power of the XSD language.

It avoids the need for hard-coding such logic into an application that uses the given XSD-based data format.

Example

. . .
<xs:schema ... xmlns:biotoolsai="http://biotoolsregistry.org/appinfo" ... xmlns:xs="http://www.w3.org/2001/XMLSchema" ... >
. . .
    <xs:element name="license" minOccurs="0"> 
        <xs:annotation> 
            <xs:documentation>Software or data usage license</xs:documentation> 
            <xs:appinfo> 
                <biotoolsai:usage recommended="true"/> 
                <biotoolsai:longDescription>
                    #Blah blah

                    `biotools:license` is blah blaaaah

                    ## GRRRRRRR

                    **WOOBAR**, isn't it?
                </biotoolsai:longDescription>
                <altova:exampleValues> 
                    <altova:example value="GNU General Public License v3"/> 
                </altova:exampleValues> 
            </xs:appinfo> 
        </xs:annotation> 
. . .
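
To illustrate how an application might consume such annotations instead of hard-coding the logic, here is a rough Python sketch. It assumes the biotoolsai namespace and the recommended attribute from the example above; the embedded schema is a cut-down, well-formed stand-in, and recommended_elements is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used in the lookup below. The biotoolsai URI is taken
# from the example above; nothing here is an official bio.tools API.
NS = {
    "xs": "http://www.w3.org/2001/XMLSchema",
    "biotoolsai": "http://biotoolsregistry.org/appinfo",
}

# Cut-down stand-in for the schema fragment in the example.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:biotoolsai="http://biotoolsregistry.org/appinfo">
  <xs:element name="license" type="xs:string">
    <xs:annotation>
      <xs:documentation>Software or data usage license</xs:documentation>
      <xs:appinfo>
        <biotoolsai:usage recommended="true"/>
      </xs:appinfo>
    </xs:annotation>
  </xs:element>
  <xs:element name="name" type="xs:string"/>
</xs:schema>
"""


def recommended_elements(schema_text):
    """List names of elements whose appinfo marks them as recommended."""
    root = ET.fromstring(schema_text)
    names = []
    for element in root.iter("{%s}element" % NS["xs"]):
        usage = element.find("xs:annotation/xs:appinfo/biotoolsai:usage", NS)
        if usage is not None and usage.get("recommended") == "true":
            names.append(element.get("name"))
    return names
```

With the stand-in schema, recommended_elements(SCHEMA) picks out only the license element.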

Provide regex restricting syntax of links to Debian packages

Thread from Andreas...

"> Forgive the very naive question, but do you maintain a list of links to packages (source, binary) currently available in the Debian distro?
> I want to support linking to Debian packages from named tools in bio.tools.

Maybe either packages.debian.org or tracker.debian.org is what you are looking for, depending on the amount of information you want to present. For instance:

https://packages.debian.org/bwa&exact=1
https://tracker.debian.org/bwa

> If not a link, I guess I could just support package names; in this case, is there a valid syntax for package names (so I can constrain this in our schema)?

While there are syntactical constraints (lower case letters, numbers, '-', '.', '+'; no upper case letters, no '_'), you probably want to link to existing packages, which by definition will have a valid name. Or am I missing something?"
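
A possible regex for the package-name constraint, sketched in Python. The character set comes from the thread above; the extra rules (at least two characters, starting with a letter or digit) are taken from Debian Policy rather than from the thread, and the helper name is made up for illustration.

```python
import re

# Lower case letters, digits, '+', '-', '.'; must start with an
# alphanumeric character and be at least two characters long.
DEBIAN_PACKAGE_NAME = re.compile(r"[a-z0-9][a-z0-9+.-]+")


def is_debian_package_name(name: str) -> bool:
    """Return True if the string is a syntactically valid Debian package name."""
    return DEBIAN_PACKAGE_NAME.fullmatch(name) is not None
```

This only checks syntax; as noted in the thread, it says nothing about whether the package actually exists in Debian.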

Multiple EDAM concepts needed for a single output + operation|data|format HANDLES

Yet another example where multiple concepts are needed for 1 output is Meta-pipe, generating annotation of (meta)genome assembly (contigs) with found protein-coding genes, protein domains, and information about those, such as taxa, DB hits scores, etc.
The single data type that can be chosen, "Protein features", is far too general to describe this, isn't it?

Reassignment of tools

Can you reassign the rights of "Orphanet portal for rare diseases and orphan drugs" and "Orphanet Rare Disease Ontology" to the user "Inserm US14" (the common user of Orphanet)?

Thank you

Bio.Tools Collections: IDs and other attributes ...

Besides numerous previous discussions in various groups, this issue is also supported by a request from Alfred Pühler, the coordinator of de.NBI, from 30th June 2016. (See the next comment, where the content of the request is pasted.)

It also relates to some of the changes towards version 2.0, sketched at the TWW Hackathon in May 2016 in Paris: https://docs.google.com/spreadsheets/d/1_KGr2DkulwtAjFJzNjTm08zXVphFlVZ8p29Id6XFlxc (sheets to the right of the first sheet).

My notes and suggestions to the de.NBI request, and our previous discussions, are the following:

We should also include a good way to identify 'Collections' within Bio.Tools. That would mean allowing at least 2 attributes for each collection: a display name, and an "ID name". These two could be the same, but could also be e.g. "de.NBI" and "denbi", respectively. That should then allow dereferencing a collection at e.g. https://bio.tools/denbi instead of just an unspecific full-text search of https://bio.tools/?q=de.NBI.
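
As a rough illustration of how a default "ID name" could be derived from a display name, assuming a purely hypothetical rule of lower-casing and dropping everything except ASCII letters and digits (no such rule has been agreed):

```python
import re


def collection_id(display_name: str) -> str:
    """Derive a URL-friendly ID from a display name (hypothetical rule)."""
    return re.sub(r"[^a-z0-9]", "", display_name.lower())


def collection_url(display_name: str) -> str:
    """Build the dereferenceable URL suggested above."""
    return "https://bio.tools/" + collection_id(display_name)
```

With this rule "de.NBI" becomes "denbi". Curators would still need to be able to override the derived ID, since different display names can collapse to the same string.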

In addition, we should consider other optional attributes of collections, such as description(s), super-collections (collectionA isIncludedIn collectionB), institutions, funding, credits, etc.
(I'm not sure about the collectionA isNewVersionOf collection, though. Although in a very special case it may make sense, e.g. if Bioconductor would change its name to BioCRAN 😆)

Version the standard in a semver.org manner

Summary

Given a version number MAJOR.MINOR.PATCH, increment the:

MAJOR version when you make incompatible API changes,
MINOR version when you add functionality in a backwards-compatible manner, and
PATCH version when you make backwards-compatible bug fixes.
Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.

http://semver.org/spec/v2.0.0.html

Thanks!
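
A small Python sketch of the rules summarised above, for plain MAJOR.MINOR.PATCH strings only; pre-release and build metadata labels are deliberately not handled, and the change labels ("incompatible", "feature", "fix") are names invented for this example.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text):
        """Parse a plain 'MAJOR.MINOR.PATCH' string."""
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def bump(self, change):
        """Return the next version for the given kind of change."""
        if change == "incompatible":
            return Version(self.major + 1, 0, 0)
        if change == "feature":
            return Version(self.major, self.minor + 1, 0)
        if change == "fix":
            return Version(self.major, self.minor, self.patch + 1)
        raise ValueError("unknown kind of change: " + change)

    def __str__(self):
        return f"{self.major}.{self.minor}.{self.patch}"
```

Because Version is a tuple, versions compare numerically field by field, so 1.10.0 sorts after 1.9.9.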

Rename Developer(s) to Main author(s) or so

The distinction between Developers and Contributors is very vague -- what is a developer?

I suggest either renaming Developers to Main authors, or adding even slightly more granularity via generic Persons with Roles (then should perhaps merge with Contacts and their Roles)

image metadata

Can we infer the image or container format from the file that is linked to?

Also speak to Christophe Blanchet re what is the useful information to expose about images/containers - is format enough, or is more needed?

urlftpType could hold http(s) ?

There is a urlftpType that restricts anyURI to either http(s) or (s)ftp (???).
On the other hand, urlType is restricted to http(s).
Something is wrong here...
Remove http(s) from urlftpType and rename urlType to urlhttpType (???)
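
To make the proposed split concrete, here is a sketch with two disjoint patterns. The regular expressions are assumptions for illustration, not the patterns currently in the schema; urlhttpType is the name proposed above.

```python
import re

# Assumed patterns: each type accepts exactly one family of schemes.
URL_HTTP_TYPE = re.compile(r"https?://\S+")
URL_FTP_TYPE = re.compile(r"s?ftp://\S+")


def classify_url(url):
    """Return the name of the simple type a URL would satisfy, or None."""
    if URL_HTTP_TYPE.fullmatch(url):
        return "urlhttpType"
    if URL_FTP_TYPE.fullmatch(url):
        return "urlftpType"
    return None
```

With disjoint patterns, a given URL can satisfy at most one of the two types, which removes the overlap described above.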

xs:documentation source attribute must contain a URI

In beta1:

line 91:
<xs:element name="biotoolsId" type="biotoolsIdType">
<xs:annotation>
<xs:documentation source="The ID is a URL in the bio.tools namespace and reflects (normally exactly) the tool name and version: see http://biotools.readthedocs.io/.">Unique ID that is assigned upon registration of the software in bio.tools.</xs:documentation>

line 112:
<xs:element name="shortDescription">
<xs:annotation>
<xs:documentation source="A single declarative sentence in the present tense, providing a terse statement of the tool function. State what is done, i.e. operation, and primary inputs and outputs, but not how. Do not include tool name. See http://biotools.readthedocs.io/.">Short and concise textual description of the software function.</xs:documentation>
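
A quick way to find such cases across the whole schema could look like the sketch below. It is only a heuristic: it flags any source value that contains whitespace or lacks a URI scheme, which is stricter than xs:anyURI itself (relative references are legal there). The function name and the sample schema are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

XS = "http://www.w3.org/2001/XMLSchema"

# Tiny stand-in schema: one prose 'source' value, one proper URI.
SAMPLE = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="a" type="xs:string">
    <xs:annotation>
      <xs:documentation source="Do not include tool name.">Text</xs:documentation>
    </xs:annotation>
  </xs:element>
  <xs:element name="b" type="xs:string">
    <xs:annotation>
      <xs:documentation source="http://biotools.readthedocs.io/">Text</xs:documentation>
    </xs:annotation>
  </xs:element>
</xs:schema>
"""


def suspicious_sources(schema_text):
    """Return 'source' values that do not look like absolute URIs."""
    root = ET.fromstring(schema_text)
    flagged = []
    for doc in root.iter("{%s}documentation" % XS):
        source = doc.get("source")
        if source is None:
            continue
        has_space = any(ch.isspace() for ch in source)
        if has_space or not urlparse(source).scheme:
            flagged.append(source)
    return flagged
```

On the stand-in schema this flags only the prose value; the explanatory text it finds would then move into the element content or an xs:appinfo block.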

Rename to "bio.tools.Schema" or so, to avoid tech choice lock

This should be acted upon ASAP (before 2.0), so as not to repeat the mistake of BioXSD and get stuck with XSD in the name forever.

E.g. although XSD 1.1 is better and more expressive than 1.0, JSON Schema may be even more expressive. And even better schema languages may appear whenever, without warning ;-)

Add Visual C++ as language?

It is not a language by itself (C++ is), but it can be very helpful to know that a program was not implemented in pure C++.

Guideline for tool short description

  1. Provide only a terse statement of the tool function: what is done not how
  2. Use a single declarative sentence in the present tense
  3. Do not include tool name

Bake this into the comment?
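
Parts of the guideline could also be checked automatically. The sketch below is a heuristic only, covering rule 2 (a single sentence, approximated as one terminating full stop) and rule 3 (no tool name); tense and terseness are not checked, and the function name is invented for this example.

```python
def short_description_problems(description, tool_name):
    """Return a list of guideline violations found by simple heuristics."""
    problems = []
    text = description.strip()
    # Rule 2: a single declarative sentence should end with one full stop.
    if not text.endswith("."):
        problems.append("does not end with a full stop")
    body = text[:-1] if text.endswith(".") else text
    if ". " in body or "!" in body or "?" in body:
        problems.append("looks like more than one sentence")
    # Rule 3: the tool name should not appear in the description.
    if tool_name and tool_name.lower() in text.lower():
        problems.append("contains the tool name")
    return problems
```

For example, "Aligns short reads to a reference genome." passes for a tool named "bwa", while "BWA aligns reads. It is fast." is flagged on both counts.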

Q: Endpoint.Output vs Function.Output

There are two local "Output" elements that look the same.
Are they conceptually the same, or should two different classes be used for the implementation?

Input/Output duplicate attributes from dataType

Input and Output elements are defined as a restriction of dataType

dataType type already has "data" and "format" elements defined.
What is the reason for duplicating them in Input/Output (with the same definitions)?

Resource types: Docker images vs VMs

In the 'resource types' list, there is a type 'container' under which both docker images and VMs are categorized.
This is conceptually 'wrong'. Linux containerization is a totally different concept from VM. While each VM has its own OS, containers use the underlying kernel of the host OS. For containers, the underlying kernel must be Linux.

Also categorizing VMs under 'container' is confusing and misleading.

Also, in Docker the resource is called an 'image', not a 'container'. It is a container when it runs, but the resource is an image.

What I suggest is having two categories instead of one:

  • Virtual machine
  • Docker image

Add a Docker registry link

Make it possible to specify a Docker registry URL for the tool, for example:

docker-registry.genouest.org/bioinfo/blast (meaning version latest)

or with a version tag

docker-registry.genouest.org/bioinfo/blast:1.0

With this, the user only needs to run:

docker pull *docker_url*

The schema could support multiple Docker registry URLs.
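
For illustration, a registry URL of this shape could be split into image name and tag roughly as follows. This is a sketch under simple assumptions: the tag is whatever follows a colon in the last path segment, a missing tag means "latest", and digest references (@sha256:...) are not handled. The function name is hypothetical.

```python
def parse_docker_url(url):
    """Split 'registry/path/name[:tag]' into (image, tag)."""
    # Only look for the tag separator after the last '/', so that a
    # registry port such as 'localhost:5000' is not mistaken for a tag.
    last_segment_start = url.rfind("/") + 1
    colon = url.find(":", last_segment_start)
    if colon == -1:
        return url, "latest"
    return url[:colon], url[colon + 1:]
```

Both examples above resolve to the same image name, with tags "latest" and "1.0" respectively.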

Registration of services

Is it possible to register services, e.g. "conversion and upload service" or "biostatistics consultation service" in bio.tools?
