barclay's Introduction

License: 3-Clause BSD

Barclay

Barclay is a set of classes for annotating, parsing, validating, and generating documentation for command line options.

Requirements

  • Java 17
  • Gradle 7.4.2 or greater. We recommend using the ./gradlew script, which will download and use an appropriate Gradle version automatically.

barclay's People

Contributors

cmnbroad, droazen, jamesemery, jonn-smith, lbergelson, magicdgs, rhowe, yfarjoun

barclay's Issues

Is it possible to create a custom index for each super-category?

I'm in a situation where I need a different index for each component/super-category (e.g., one index for the tools and one index for the utilities). It looks like Barclay only supports the following (correct me if I'm wrong):

  1. An index template for all the documentation. This is only used once by FreeMarker for a single index.
  2. A generic template for each component. This is parsed for each @DocumentedFeature, and thus could use some macro to switch the format for each component.

It would be useful to be able to parse the general index template once per super-category (or to have several index templates), so that each super-category gets its own index. If there is already a way to do this in the current implementation, I apologize for raising this issue, and I would appreciate it if it could be documented in the Wiki.

Thank you very much in advance.

@Hidden, @Advanced annotations are not handled by command line parser

I don't see any handling of these annotations for arguments:

  • @Hidden: I expect it to hide the option from the help, but not from argument parsing.
  • @Advanced: I expect the option to appear under a special category indicating that it should be used with caution. Another interesting way to handle these advanced options could be an argument in SpecialArgumentsCollection that shows the "advanced" help.

Request: accept mutex boolean arguments to be provided if they are complementary

For instance, in a command line tool with the following arguments:

...
@Argument(fullName="arg1", mutex = {"arg2"}, optional = true)
public Boolean arg1 = false;
@Argument(fullName="arg2", mutex = {"arg1"}, optional = true)
public Boolean arg2 = false;
...

It would be useful to be able to provide any of the following on the command line without the parser blowing up:

  • --arg1 false --arg2 false
  • --arg1 true --arg2 false
  • --arg1 false --arg2 true

An exception should be thrown only for --arg1 true --arg2 true.
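The requested behaviour can be sketched as follows. This is only an illustration of the idea, not Barclay code: the class BooleanMutexCheck and its methods are hypothetical, and the sketch assumes the parser has already collected the values of the mutually exclusive boolean arguments into a map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Barclay: decides whether a group of mutually
// exclusive boolean flags conflicts, counting only flags whose value is true.
final class BooleanMutexCheck {

    /** Returns the names of all flags set to true, in the map's iteration order. */
    static List<String> enabledFlags(Map<String, Boolean> flags) {
        List<String> enabled = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : flags.entrySet()) {
            if (Boolean.TRUE.equals(entry.getValue())) {
                enabled.add(entry.getKey());
            }
        }
        return enabled;
    }

    /** Throws only when more than one mutually exclusive flag is true. */
    static void validate(Map<String, Boolean> flags) {
        List<String> enabled = enabledFlags(flags);
        if (enabled.size() > 1) {
            throw new IllegalArgumentException(
                    "Mutually exclusive arguments cannot all be true: " + enabled);
        }
    }
}
```

With this rule, a flag that is explicitly set to false is treated the same as a flag that was not provided at all.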

Allow a different output extension for the index file

I have a case where I would like the index to use a different format from the rest of the components, for use with Jekyll: 1) the index as a YAML data file for iterating over the different components, and 2) Markdown to render each component.

It would be nice to have a new option for the index extension, which by default would be the same as the component extension. Does anyone have any objection to this?

Discussion: how to deal with javadoc-link tags

A javadoc comment can contain several @see and @link tags pointing to specific classes/methods. Because the javadoc is used both for developer consumption and, in the case of Barclay, for help-page generation, this causes problems when writing javadoc with both use cases in mind. This example class illustrates the possible problems:

/**
 * Description in javadoc. 
 * This class may be related with {@link SecondDocumentedFeature}, so check its documentation.
 *
 * @see ThirdDocumentedFeature
 * @see FourthDocumentedFeature
 * @see UndocumentedFeature
 */
@DocumentedFeature(extraDocs = FourthDocumentedFeature.class)
public class FirstDocumentedFeature {
    ...
}

The problems that may arise from this class in the help pages are the following:

  • There is no way to access SecondDocumentedFeature or ThirdDocumentedFeature while processing the FreeMarker templates, even if they are documented and have a link in the help pages. Ideally, these classes would be populated into the extraDocs.
  • If the @see classes are populated, there will be a clash for FourthDocumentedFeature; in addition, UndocumentedFeature does not have any entry in the help pages.
  • The description populated from the javadoc is "Description in javadoc. This class may be related with {@link SecondDocumentedFeature}, so check its documentation.", which is confusing for a non-programmer user. Ideally, Barclay would parse the javadoc and replace this tag with the URL.
  • If the @link tag is parsed: should the URL be formatted as HTML or Markdown? What happens with @link tags for non-documented features?

My suggestions, in order of preference, are the following:

  1. Populate @see and @link tags into the extraDocs, not allowing classes with the same name (this is already constrained for tool names). The parsing of inline tags would then be done by FreeMarker using a custom macro (I would like to have a macro file in Barclay containing this and other "common" functionality).
  2. Have Barclay remove all inline tags on output and ignore the @see tag. This would require the FreeMarker template to look for strings matching the extraDocs and apply the link. If someone wants the @see tag, they could use a custom binding. This keeps the developer and user help completely separate, but it complicates the template.
  3. Parse the @link tags with Barclay and emit the URL for documented features as either HTML or Markdown, controlled by an option. This introduces a constraint on template outputs, because they would be expected to be encoded as HTML or Markdown.

Allow usage of other tags and not only inline in DefaultDocWorkUnitHandler

As an example:

/**
* This is the javadoc for the description.
* 
* {@MyTag.test1 this is test1 tag}
*
* @MyTag.test2 this is test2 tag
*/
@DocumentedFeature
public class TestClass {}

If the custom tag prefix is MyTag, the final JSON will include the test1 tag but not the test2 tag, because DefaultDocWorkUnitHandler.addCustomBindings only uses ClassDoc.inlineTags(). Using ClassDoc.tags() instead would allow test2 to be output as well, with the advantage that it also shows up in a normal javadoc task.

Fully Integrate @Deprecated in the CLI/help handling

The current DefaultDocWorkUnitHandler handles the @Deprecated annotation for arguments differently from other kinds of arguments. I think that is a good idea, but it would also be good to integrate this behaviour into other parts of the code:

  • Separate deprecated arguments in the CLI from required/optional/etc.
  • Add a short note about the deprecation in the CLI (similar to @BetaFeature), so that adding the annotation is enough for it to show up automatically in the CLI help
  • The current JSON does not contain a marker for the deprecation status (the same goes for beta). Perhaps it would be useful to also add a "type" entry, empty for normal features and "deprecated" / "beta" for the other cases. In addition, it could include a description of the deprecation taken from the @deprecated javadoc tag. This would be useful for online pages.
  • Handle the deprecated tag for @DocumentedFeature in the doclet code as well, not only for arguments.

Does this make sense to you? I think that it would be a good addition...

Add support for argument tagging to barclay

Currently in GATK4 we have some support for tagging arguments, but the tagging is done on the value side, and manually by the engine. E.g.,

-V myTag:my.vcf

This causes many issues: parsing ambiguity with URIs, interfering with shell auto-expansion, etc.

Let's add native tagging support to barclay, so that the tag can be on the argument side rather than the value side. E.g.,

-I:tumor my.bam

We also need to support arbitrary key/value pairs after the tag. E.g.,

-I:tumor,key=value,key2=value2

I think that the way to do this is to introduce a new TaggedArgument interface in barclay with methods to get/set both the tag and the optional key/value pairs. E.g.,

void setTag(String)
String getTag()
void setTagAttributes(Map<String, String> attributes)
Map<String, String> getTagAttributes()

The upcoming URI class in GATK will implement this interface. When barclay encounters an @Argument-annotated field of a type that implements the TaggedArgument interface, it should parse the tags and inject them into the object instance for that field via the setter methods.
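A minimal sketch of the proposal, to make the injection step concrete. The TaggedArgument interface uses the four method signatures listed above; everything else (TagParser, SimpleTaggedValue, and the assumption that the leading dash has already been stripped from the token) is hypothetical and only there for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Interface with the method signatures proposed above.
interface TaggedArgument {
    void setTag(String tag);
    String getTag();
    void setTagAttributes(Map<String, String> attributes);
    Map<String, String> getTagAttributes();
}

// Hypothetical value type standing in for any class that implements the interface.
final class SimpleTaggedValue implements TaggedArgument {
    private String tag;
    private Map<String, String> attributes = new LinkedHashMap<>();

    @Override public void setTag(String tag) { this.tag = tag; }
    @Override public String getTag() { return tag; }
    @Override public void setTagAttributes(Map<String, String> attributes) { this.attributes = attributes; }
    @Override public Map<String, String> getTagAttributes() { return attributes; }
}

// Hypothetical parser for the argument-side syntax "name:tag,key=value,...".
// Assumes the leading "-" or "--" has already been removed from the token.
final class TagParser {

    /** Returns the argument name, i.e. everything before the first ':'. */
    static String argumentName(String token) {
        int colon = token.indexOf(':');
        return colon < 0 ? token : token.substring(0, colon);
    }

    /** Parses the tag and key/value pairs and injects them via the setters. */
    static void injectTags(String token, TaggedArgument target) {
        int colon = token.indexOf(':');
        String tag = null;
        Map<String, String> attributes = new LinkedHashMap<>();
        if (colon >= 0) {
            String[] parts = token.substring(colon + 1).split(",");
            tag = parts[0];
            for (int i = 1; i < parts.length; i++) {
                int eq = parts[i].indexOf('=');
                if (eq <= 0) {
                    throw new IllegalArgumentException("Malformed tag attribute: " + parts[i]);
                }
                attributes.put(parts[i].substring(0, eq), parts[i].substring(eq + 1));
            }
        }
        target.setTag(tag);
        target.setTagAttributes(attributes);
    }
}
```

Keeping the tag on the argument side means the value (my.bam, or a URI) is never inspected for a tag prefix, which avoids the ambiguity described earlier.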

Docgen assumes DocumentedFeatures have no-arg constructors

Most documented features are command line programs, which have no-arg constructors, but some are not (e.g., TableReader), and so may not have a no-arg constructor. We need to relax the assumption and only instantiate classes that are command line programs.
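One way such a guard could look, as a sketch only: SafeInstantiator is a hypothetical helper, and the base type is passed in as a parameter because the real command line program base class is not shown here.

```java
import java.util.Optional;

// Hypothetical helper, not part of Barclay: instantiates a class only when it is
// a subtype of the required base type and a no-arg constructor is available.
final class SafeInstantiator {

    static <T> Optional<T> tryInstantiate(Class<?> clazz, Class<T> requiredBase) {
        // Skip anything that is not of the required kind.
        if (!requiredBase.isAssignableFrom(clazz)) {
            return Optional.empty();
        }
        try {
            // getDeclaredConstructor() with no parameter types looks up the no-arg constructor.
            return Optional.of(requiredBase.cast(clazz.getDeclaredConstructor().newInstance()));
        } catch (ReflectiveOperationException e) {
            // No no-arg constructor, abstract class, inaccessible or failing constructor.
            return Optional.empty();
        }
    }
}
```

Callers would then fall back to documenting the feature without an instance whenever the returned Optional is empty.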

JSON output file names should not include the output file extension

For example, if the output extension is "html", the JSON files are currently called "workunit.html.json". This behavior was carried over from GATK3, but is unnecessary and a little misleading. The work unit output format and extension are independent of JSON, so the JSON file should just be called "workunit.json".

Design index template(s)

I'm starting to plan how we'll make the tooldocs available on the GATK website, and specifically what organization layouts we want to offer to users. There are three main patterns that people tend to follow when looking for docs, which would be best served by offering separate index pages:

  • full alphabetical list of all tools
  • subset by top-level package (in the GATK world, core vs protected vs Picard), then alphabetical
  • functional breakdown (QC vs bam processing vs variant discovery etc)

Note that this would only apply to tools, proper -- read filters, annotation modules, metrics collections (in Picard) etc still make sense to categorize separately in any case.

That being said, I need to think a bit more about the UX side of things before implementing anything. TBC. Comments welcome.

Implement the generic ability for Collection arguments to be provided via a file

For any argument of a Collection type, we want the ability to provide a .list file containing the literal values for the argument, instead of having to provide all literal values on the command line.

So, if an argument is of a Collection type, and the token from the command line ends in .list, treat the token as a file name, load it, and unpack all lines in the file into the Collection.
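A sketch of the expansion step described above. ListFileExpander is a hypothetical name; trimming lines and skipping blank ones are assumptions that the text does not specify.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Barclay: expands command line tokens for a
// Collection argument, loading any token that ends in ".list" as a file of values.
final class ListFileExpander {
    static final String LIST_SUFFIX = ".list";

    static List<String> expand(List<String> tokens) {
        List<String> values = new ArrayList<>();
        for (String token : tokens) {
            if (token.endsWith(LIST_SUFFIX)) {
                try {
                    // Treat the token as a file name and unpack one value per line.
                    for (String line : Files.readAllLines(Paths.get(token))) {
                        String value = line.trim();
                        if (!value.isEmpty()) { // assumption: blank lines are ignored
                            values.add(value);
                        }
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException("Could not read list file " + token, e);
                }
            } else {
                // Any other token is taken as a literal value.
                values.add(token);
            }
        }
        return values;
    }
}
```

Literal values and .list files can be mixed in one invocation, and the resulting order follows the order of the tokens on the command line.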

Request: recover ArgumentCollectionDefinition and add functionality

I know that ArgumentCollectionDefinition was removed in the initial port of the command line parsing, but I think that it could provide some functionality for parsing the argument collection. Argument collections sometimes have complex dependencies between parameters that are not always easy to express in the Argument annotation. For example, two numeric values may each need to be within a range, and they could use the max/min fields in Argument; but if these ranges depend on the other value, that cannot be expressed there.

I suggest that the ArgumentCollectionDefinition class could be used to improve this by adding a method to validate the arguments in more complex ways. In addition, it would allow the collections to be made serializable. These definitions could then be used with default validation, instead of implementing a validation method that has to be called within the tools.

Nevertheless, I think that ArgumentCollectionDefinition should not be mandatory for argument collections. For example, one nice thing about handling the ArgumentCollection without this limitation in GATK is that a ReadFilter with arguments could be used as an ArgumentCollection to set these arguments even if the tool does not have the plugin.

Support grouping arguments by class in the argument container hierarchy

Based on a discussion with @droazen and @vdauwera, we want to replace the "common" attribute with a mechanism that allows us to group arguments by where they're located in the argument container's hierarchy. So if we have a class hierarchy like this:

CommandLineProgram
GATKTool
ReadWalker
PrintReads

we can group the arguments by where they came from in the hierarchy. We'll need some way to add a doc string to each class to display as the group heading in the output.
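The grouping can be sketched with plain reflection. Everything in this block is hypothetical: DemoArgument stands in for Barclay's @Argument so that the example compiles on its own, and the Demo* classes are a toy hierarchy that only mirrors the names listed above.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in annotation (assumption) so the sketch does not depend on Barclay.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface DemoArgument {}

final class HierarchyGrouper {

    /** Maps each class in the hierarchy (root first) to the argument fields it declares. */
    static Map<String, List<String>> groupByDeclaringClass(Class<?> leaf) {
        // Walk from the leaf up to (but excluding) Object, building a root-first chain.
        List<Class<?>> chain = new ArrayList<>();
        for (Class<?> c = leaf; c != null && c != Object.class; c = c.getSuperclass()) {
            chain.add(0, c);
        }
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Class<?> c : chain) {
            List<String> names = new ArrayList<>();
            for (Field field : c.getDeclaredFields()) {
                if (field.isAnnotationPresent(DemoArgument.class)) {
                    names.add(field.getName());
                }
            }
            // Classes that declare no arguments get no group heading.
            if (!names.isEmpty()) {
                groups.put(c.getSimpleName(), names);
            }
        }
        return groups;
    }
}

// Toy hierarchy; the field names are made up for the example.
class DemoCommandLineProgram { @DemoArgument String tmpDir; }
class DemoGATKTool extends DemoCommandLineProgram { @DemoArgument String reference; }
class DemoReadWalker extends DemoGATKTool { }
class DemoPrintReads extends DemoReadWalker { @DemoArgument String output; }
```

The sketch uses the simple class name as the group key; the per-class doc string mentioned above would take its place as the displayed heading.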

HelpDoclet cannot be used directly

Although the javadoc states that this class can be used to generate documentation directly, that is not the case: when it is used in a Gradle task (e.g. onlineDoc), it fails with the following error:

:compileJava UP-TO-DATE
:processResources UP-TO-DATE
:classes UP-TO-DATE
:cleanOnlineDoc
:onlineDoc
javadoc: error - Doclet class org.broadinstitute.barclay.help.HelpDoclet does not contain a start method
1 error
:onlineDoc FAILED

It seems that this is because it requires a public static method called start, as in the GATK doclet or the Picard one. I'm planning to have my own doclet, but while testing how this works it would be nice to be able to use the minimal implementation...

Test docgen code at a granular level

From @cmnbroad (#68 (comment)):

One high-level request is that we find a way to incrementally unit-test new functionality like this at a granular level. We currently only have coarse-grained, file-based integration tests, and I'd like to find a way to avoid proliferating a new set of test files with each new feature, in addition to changing many/all of the existing ones. @magicDGS any thoughts on how we can address that?

Handle CommandLineProgramProperties.usageExample in help paths

Currently the usageExample method is not used anywhere. I suggest the following support:

  • Add it to the tool's CLI help to show an example
  • Populate the String into the JSON and/or FreeMarker properties for use in the doclet

In addition, it would be nice if this method returned an array of Strings to allow several usage examples. For the CLI help, only the first one could be printed, and the rest would be useful for the online documentation.

Parse javadoc link tag for improving user-readability in doclet

Currently, if the javadoc contains any tag linking to a different class, the tag is output as is. This has two disadvantages: 1) developer javadoc cannot include links if they are part of the documented features; 2) the help page does not have access to the linked class's URL if that class is documented.

I have some suggestions to solve this problem:

  • Populate these classes into the extraDoc. The user could then use a FreeMarker macro to replace the {@link ClassName} pattern with the extraDoc URL, or just remove the tag if the class is not present.
  • Parse the javadoc internally to add the URL as an HTML or Markdown formatted String. This is not the optimal solution, because it adds complexity to Barclay and it is less flexible.

I prefer the first option, together with providing a file of macros for parsing this kind of information that users can include.
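The substitution itself is small. The sketch below does it in Java rather than in a FreeMarker macro, purely to show the logic. LinkTagRewriter is a hypothetical name, the map of class names to URLs stands in for the extraDoc information, Markdown is picked arbitrarily as the output format, and "remove the tag" is interpreted as keeping the class name as plain text.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Barclay: rewrites inline {@link ...} tags.
final class LinkTagRewriter {

    // Matches {@link ClassName} and {@link ClassName some label}.
    private static final Pattern LINK =
            Pattern.compile("\\{@link\\s+([\\w.#]+)(?:\\s+([^}]*))?\\}");

    /**
     * Replaces each tag with a Markdown link when the target has a known URL,
     * and with the plain label (or class name) otherwise.
     */
    static String toMarkdown(String javadoc, Map<String, String> urlsByClassName) {
        Matcher matcher = LINK.matcher(javadoc);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String target = matcher.group(1);
            String label = matcher.group(2) == null ? "" : matcher.group(2).trim();
            if (label.isEmpty()) {
                label = target;
            }
            String url = urlsByClassName.get(target);
            String replacement = (url == null) ? label : "[" + label + "](" + url + ")";
            // quoteReplacement guards against '$' and '\' in labels or URLs.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Swapping the replacement string for an HTML anchor would give the HTML variant, so the output format stays a decision of whoever calls the helper (or writes the macro).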

Could docgen generate a json for the index?

The current Barclay docgen code generates an index.html and feature-specific files (class_name.html and class_name.html.json).

Would it be possible to generate a common JSON file used by the index in the current framework? It would be useful for generating other pages that share information such as the version...

Failure accessing private fields with javadoc.

When traversing through class hierarchies trying to resolve arguments, docgen checks each field it encounters to see if it's annotated as an argument collection, and if so, recursively reflects on the field's type. If the type has a private field that has javadoc, getDeclaredFields doesn't return the field, but javadoc includes it in the list of FieldDocs. The code needs to be tolerant of that case during field traversal. See broadinstitute/gatk-protected#1048.

Does barclay support tagging arguments as incompatible?

Let's say we have a tool that accepts two arguments, each is fine separately but they can't be provided together because the corresponding functionality is incompatible. Is there a way I can tag these arguments? How and what exactly happens if I do invoke them both?

mutex behaves inconsistently with collection arguments and optional=false

Mutex arguments that aren't collections allow both arguments to be marked as optional=false. This is treated as "exactly one of these arguments is required". However, collection arguments don't behave the same way. A scalar and a collection argument that are mutually exclusive with each other and both optional=false will fail if the collection argument is not specified.
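One possible reading of consistent behaviour is sketched below. This is not how Barclay is implemented: RequiredMutexCheck is a hypothetical helper, and it assumes that an empty collection should count as "not specified" in the same way a null scalar does.

```java
import java.util.Collection;
import java.util.Map;

// Hypothetical helper, not part of Barclay: applies the "exactly one is required"
// rule uniformly to scalar and collection arguments in a mutex group.
final class RequiredMutexCheck {

    /** A scalar counts as provided when non-null, a collection when non-empty. */
    static boolean isProvided(Object value) {
        if (value == null) {
            return false;
        }
        if (value instanceof Collection) {
            return !((Collection<?>) value).isEmpty();
        }
        return true;
    }

    /** Throws unless exactly one argument of the group was provided. */
    static void validateExactlyOne(Map<String, Object> group) {
        long provided = group.values().stream()
                .filter(RequiredMutexCheck::isProvided)
                .count();
        if (provided != 1) {
            throw new IllegalArgumentException(
                    "Exactly one of " + group.keySet() + " is required, but "
                            + provided + " were provided");
        }
    }
}
```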
