lut99 / argumentparser

A custom-built ArgumentParser that focuses on ease of use for both the programmer and the eventual command-line user.

License: GNU General Public License v3.0

Makefile 0.60% C++ 99.10% Python 0.07% Shell 0.23%

argumentparser's Issues

Restructure the ArgumentParser to include extra build step

Right now, the code gets structurally messy as child classes need information only available in parent classes. To fix this weird issue, perhaps it would be best to re-structure the ArgumentParser into three steps: defining the arguments, building the arguments and finally parsing the arguments. The define step would be very abstract and should allow a lot of configurability for the user; then, the user calls build(), which should convert this abstract definition of arguments into a structure that is fixed and optimised for parsing; this is also where we can check name availability, assign indices to Positionals, check group validity, etc. Finally, the user can then use that optimised internal structure to actually parse the CLI-arguments. A bit of a just-in-time-compilation idea.
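
The define / build / parse split described above can be sketched as follows. This is only an illustration under assumptions: `ParserDefinition`, `BuiltParser`, `add_positional`, `add_option` and the free function `build` are hypothetical names, not the project's API, and every option is assumed to take exactly one value.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical names throughout; not the project's real API.

// Step 1: an abstract, freely editable description of the arguments.
struct ParserDefinition {
    std::vector<std::string> positionals;  // names, in declaration order
    std::vector<std::string> options;      // long labels without the leading "--"

    ParserDefinition& add_positional(const std::string& name) {
        positionals.push_back(name);
        return *this;
    }
    ParserDefinition& add_option(const std::string& longlabel) {
        options.push_back(longlabel);
        return *this;
    }
};

// Result of step 2: a fixed structure, already validated.
struct BuiltParser {
    std::vector<std::string> positional_names;  // vector index == positional index
    std::set<std::string> option_names;

    // Step 3: parse CLI arguments using only the fixed structure.
    std::map<std::string, std::string> parse(const std::vector<std::string>& args) const {
        std::map<std::string, std::string> result;
        std::size_t next_positional = 0;
        for (std::size_t i = 0; i < args.size(); ++i) {
            const std::string& arg = args[i];
            if (arg.size() > 2 && arg.compare(0, 2, "--") == 0) {
                const std::string label = arg.substr(2);
                if (option_names.count(label) == 0) {
                    throw std::runtime_error("unknown option: " + arg);
                }
                if (i + 1 >= args.size()) {
                    throw std::runtime_error("missing value for: " + arg);
                }
                result[label] = args[++i];  // assumption: one value per option
            } else {
                if (next_positional >= positional_names.size()) {
                    throw std::runtime_error("too many positionals");
                }
                result[positional_names[next_positional++]] = arg;
            }
        }
        return result;
    }
};

// Step 2: check name availability once and assign positional indices.
inline BuiltParser build(const ParserDefinition& def) {
    BuiltParser built;
    std::set<std::string> seen;
    for (const std::string& name : def.positionals) {
        if (!seen.insert(name).second) {
            throw std::runtime_error("duplicate name: " + name);
        }
        built.positional_names.push_back(name);
    }
    for (const std::string& label : def.options) {
        if (!seen.insert(label).second) {
            throw std::runtime_error("duplicate name: " + label);
        }
        built.option_names.insert(label);
    }
    return built;
}
```

In this sketch all validation happens in `build`, so `parse` never has to look back at the abstract definition.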

Extend ADL to allow inline C++

We can extend the existing ADL to allow the 'source' keyword to not just refer to external files, but also to simply list C++-code right there and then. Syntax will probably be that it's wrapped in '++{' and '}++', where the tokenizer does treat comments differently s.t. those can be commented. Additionally, the match can be given as special ++++, where the name of the X'th value (i.e., X'th argument to the parser) will be pasted.

Create the ArgumentParser Wiki

Once all features have been implemented and we know how we do that, create the Wiki to describe how the user can use it. Note that we want the Wiki to be complete and clear.

Extend conditional compilation with else / elseif & and's and or's

We should make the conditional compilation system more extensive by adding and's and or's (so that multiple defines can be checked in one go) and by adding else's and elseif's. This does not add new functionality, as the same behaviour can be replicated using ifdefs and ifndefs alone, but it will be a lot more convenient.

Add DependencyRelation ArgumentType limits

As we want to implement (but haven't yet), not all ArgumentTypes make sense to include in certain DependencyRelations (Flags in an IncludedRelation, for example). Additionally, we should think about restricting the mixing of certain types of arguments to avoid ambiguities.

Create the ADLParser

Now that we have a first Tokenizer version, it's time to move up one abstraction level and start working on our ADLParser, which takes the tokens and uses them to create an AST for our compiler.

Hide internal structures

As one of the last steps of cleaning up before the release, we want to shield as much from public access as possible to help guide the user in finding the correct things to use. This also minimizes incorrect usage.

Create first traversal

The first traversal of the ADL's AST can finally be implemented, and will be about constructing a symbol table - which simultaneously allows us to check for duplicate definitions.
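
A minimal sketch of how a symbol-table pass can report duplicate definitions as a side effect of insertion. The AST is reduced to a flat list of definitions here, and `Definition`, `SymbolTable` and `build_symbol_table` are hypothetical names chosen for illustration.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, flattened stand-in for the definition nodes of the ADL AST.
struct Definition {
    std::string id;  // identifier being defined
    int line;        // line on which the definition starts
};

struct SymbolTable {
    std::map<std::string, int> first_line;  // id -> line of the first definition
};

// Inserts every definition into the table; an insert that fails because the
// id already exists is reported as a duplicate definition.
inline std::vector<std::string> build_symbol_table(const std::vector<Definition>& defs,
                                                   SymbolTable& table) {
    std::vector<std::string> errors;
    for (const Definition& def : defs) {
        auto result = table.first_line.emplace(def.id, def.line);
        if (!result.second) {
            errors.push_back("line " + std::to_string(def.line) +
                             ": duplicate definition of '" + def.id +
                             "' (first defined on line " +
                             std::to_string(result.first->second) + ")");
        }
    }
    return errors;
}
```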

Extend the parser & ADL to support field re-use

To provide a neat solution to allow type patterns to be re-used for other type patterns, we will add support to reference fields of other types, positionals or optionals. The syntax will be '.', where is either valid type id, positional id or option shortlabel / longlabel and is the parameter to reference.

Add warning suppression / generation to the ADL

As extra functionality, we want the ability to suppress warnings (and generate them, while we're at it). The extra modifier needs to apply only to the definition or property it's placed above, and so we'll need to modify basically all parse rules of the ADLParser. Lotta work, but if it's done, we should have a neat and useful suppression system for warnings - which will give us less reason not to put them there.

EDIT: Let's also add error generation to this as well (with a string, not specific error classes), since it might be very useful to be able to generate custom errors with conditional compilation.

Add print support to AST

To debug the first compilable version of the parser, we will need to see what happens on the stack, so we will work on adding stack-sufficient printing support to the current AST nodes. In the future, we will likely want to print the AST once it's done parsing, probably as a traversal. For now, though, this suffices.

Implement Argument-specific Parsing

Part of the effort to make the ArgumentParser more Object-Oriented will be to decentralize parsing and leave that to the individual Arguments themselves. This way, creating groups of Arguments with a certain dependency will be very easy, as they simply parse differently. However, a major challenge is separating Positionals from Option-values.

Split the file into multiple files

To decrease compile times, we want to split the resulting ArgumentParser.hpp file into files each handling a part of the work. While all are needed for the complete usage, not everything is needed. Especially a separate 'Arguments.hpp' might be useful to have, as this class will likely be used throughout a program. Additionally, we might want to think about being able to pre-compile code as much as possible.

Remove Flags

With the new Option features, which include the possibility to set default values so that only the Option needs to be specified, Flags are really just a set of Options. To simplify the code and to synchronize it more with GNU terminology, we want to remove the Flags as a separate class and instead allow Options to have a 'void'-type, which signifies they don't accept values and are purely there for Flag functionality. The user can then use is_given() to check for flags in this way, and the usage() string can identify which Options should be presented as Flags and which not.
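
As a rough illustration of the idea, the sketch below models an Option with a 'void' type next to one that takes a value. Only `is_given()` comes from the issue text; `ValueType`, `Option`, `Arguments::mark_given`, `Arguments::set_value` and `usage_entry` are assumed names.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical sketch; only is_given() is named in the issue.
enum class ValueType { Void, String };

struct Option {
    std::string longlabel;
    ValueType type;  // Void: takes no value, so it behaves like a flag
};

class Arguments {
public:
    // Records a void-type Option that appeared on the command line.
    void mark_given(const std::string& label) { given_.insert(label); }

    // Records an Option together with its value.
    void set_value(const std::string& label, const std::string& value) {
        given_.insert(label);
        values_[label] = value;
    }

    bool is_given(const std::string& label) const { return given_.count(label) != 0; }

    // Throws std::out_of_range if the Option has no stored value.
    const std::string& value(const std::string& label) const { return values_.at(label); }

private:
    std::set<std::string> given_;
    std::map<std::string, std::string> values_;
};

// The usage string can tell flag-like Options apart by their type alone.
inline std::string usage_entry(const Option& opt) {
    if (opt.type == ValueType::Void) {
        return "[--" + opt.longlabel + "]";
    }
    return "[--" + opt.longlabel + " <value>]";
}
```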

Split existing AST into leaf & branch nodes

Currently, we're doing a lot of duplication; in particular, the memory management of nodes with children seems to be more of the same. To this end, we should split the tree to have two more 'top-level' root nodes, just below the ADLNode; ADLLeaf and ADLBranch. These can then take care of common memory management, as well as implementing common recursion algorithms. Individual nodes can then customize access by using inline functions.
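
One possible shape for this split, assuming `std::unique_ptr` ownership for children. `ADLNode`, `ADLLeaf` and `ADLBranch` are the names from the issue; `ADLIdentifier`, `ADLTypeDef`, `count()` and `add_child()` are hypothetical and only there to make the sketch complete.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class ADLNode {
public:
    virtual ~ADLNode() = default;
    // Example of a common recursive algorithm: number of nodes in this subtree.
    virtual std::size_t count() const = 0;
};

// Leaves have no children, so the recursion ends here.
class ADLLeaf : public ADLNode {
public:
    std::size_t count() const override { return 1; }
};

// Branches own their children; memory management lives in this one place.
class ADLBranch : public ADLNode {
public:
    void add_child(std::unique_ptr<ADLNode> child) {
        children_.push_back(std::move(child));
    }
    std::size_t count() const override {
        std::size_t total = 1;
        for (const auto& child : children_) {
            total += child->count();
        }
        return total;
    }

protected:
    std::vector<std::unique_ptr<ADLNode>> children_;
};

// Hypothetical concrete leaf.
class ADLIdentifier : public ADLLeaf {
public:
    explicit ADLIdentifier(std::string n) : name(std::move(n)) {}
    std::string name;
};

// Hypothetical concrete branch that customizes access with an inline function.
class ADLTypeDef : public ADLBranch {
public:
    const ADLNode* id() const {
        return children_.empty() ? nullptr : children_.front().get();
    }
};
```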

Add argument definitions to the parser

When the type definitions are fully supported (including inline C++), we can move on to the other toplevel construct: positionals & options.

Rewrite the ArgumentParser-class

With the new Object-Oriented design mostly in place, it's time to actually use it and rework the ArgumentParser class to use it properly. Note that this will include the actual parse() function itself.

Add error messages to the ADLParser

It will be a tedious job, but it's really time to add error checking to the ADLParser. For fun, we'll first implement the preprocessor, but the error-handling should really be done before we move to traversals.

Make the README more useful

Once the Wiki is in place, rewrite the README to give a brief overview of functionality, link to the Wiki and treat some other issues like compatibility or bug reporting.

Change ADL to accept REGEX for types

If we accept regex-expressions in addition to C++ code, then we could even create a finite-state parser from an ADL file, which should bring us optimal speed. Then, the C++ code can take care of parsing the string to a value. Note that to do this, we'll update the ADL specification (#17).

(Possibly) Rewrite the Arguments-class

The final step of the redesign is to rewrite the Arguments-class, which functions as the return dictionary for the user. While I suspect most of it is usable, it does warrant a second look.

Change ADLTokenizer to also allow tokens of different types

Like in the compiler course we followed, we should let the tokenizer already convert values to ints, floats and the like where needed. The disadvantage is that the return types of the Tokenizer become more complicated (they can be derived types), so it might be a bit of work. We might also need to consider whether that technically means regex-expressions should already be parsed.

Create the Argument Rule Language definition

Before we can do anything, we first have to decide if we can create a comfortable way to define the rules that are used to create the hard-coded parser.

Parse type definition

With the removal of the directive, we'll start again by letting the parser parse type definitions. This should then be fairly easily extendable to other definitions, such as the positional or optional ones.

Restructure Exceptions

Throughout development, a ton of exceptions have been created, some of which are no longer used. Once we're done with the core of the parser, we desperately need to clean up the Exceptions by pruning the ones that are not used. Additionally, we might want to re-think some of the structure of the Exception tree.

Implement the ADL preprocessor

The ADL preprocessor will live in between the tokenizer and the parser itself, in order to facilitate things like including other files or conditional compilation.

Create the ADL Tokenizer

In true compiler fashion, let us start by defining a Tokenizer that is able to read the ADL files appropriately, discarding all unneeded information.

Implement IncludedGroup, ExcludedGroup and RequiredGroup classes

When the MultiArgument class is finished and the AtomicArguments have their parsers, it's time to finalize the creation of the Argument classes by creating the three MultiArgument derivatives: the IncludedGroup, ExcludedGroup and RequiredGroup, which each implement their respective Argument dependency.

Create a tokenizer

The only way to parse this remotely efficiently is to give up on the Object-Oriented parsing, and instead go back to simpler times where we leave it up to the ArgumentParser::parse function. To aid in this, we shall create a simple tokenizer that acts as a stream, and allows the parse function to structurally parse one argument at a time.
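
A stream-like tokenizer of this kind could look roughly as follows. The classification rules are deliberately simplified assumptions (for instance, `-5` is treated as a short label and a lone `--` as a value), and `TokenKind`, `Token` and `Tokenizer::next` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token kinds for command-line arguments.
enum class TokenKind { ShortLabel, LongLabel, Value, End };

struct Token {
    TokenKind kind;
    std::string text;  // label without dashes, or the raw value
};

// Hands out one token per call, so a parse function can consume the
// command line one argument at a time.
class Tokenizer {
public:
    explicit Tokenizer(std::vector<std::string> args) : args_(std::move(args)) {}

    Token next() {
        if (pos_ >= args_.size()) {
            return Token{TokenKind::End, ""};
        }
        const std::string& arg = args_[pos_++];
        // "--name" becomes a long label.
        if (arg.size() > 2 && arg.compare(0, 2, "--") == 0) {
            return Token{TokenKind::LongLabel, arg.substr(2)};
        }
        // "-x" becomes a short label (simplification: so does "-5").
        if (arg.size() == 2 && arg[0] == '-' && arg[1] != '-') {
            return Token{TokenKind::ShortLabel, arg.substr(1)};
        }
        // Everything else is a value; whether it belongs to an Option or is
        // a Positional is left to the parse function.
        return Token{TokenKind::Value, arg};
    }

private:
    std::vector<std::string> args_;
    std::size_t pos_ = 0;
};
```

Leaving the Positional-versus-Option-value decision to the caller keeps the tokenizer free of any knowledge about the defined arguments.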

Change Tokenizers to also parse regex-expressions

Now that the Tokenizer can handle normal values by parsing them, we should do the same for regex-expressions, already converting and validating them on the spot. However, this only makes sense once we have converted large parts of the AST to regex as well, so that we have an idea of what requirements the regex should have and how we can merge it with other regex-expressions.

Completely do everything differently - make a compiler

Very few projects will require a dynamic list of arguments, and so the creation of the parsing rules could be done entirely at compile time. To this end, I propose to completely throw out everything we have now, and instead create an ArgumentParser that uses some external file to read the parsing rules, create a relevant, hard-coded C-file to parse the arguments and let the user include those in their projects so that they can be used.

Keep documentation about ADL up-to-date

As we are working on parsing ADL, we should definitely try to keep the formal language definition up-to-date by writing a PDF about it.

Add traversal for resolving references

With the symbol table built, the second compiler traversal will go through the tree and check whether the (type-)references refer to valid fields of the given definitions. A reference to said field/data should be stored at the node level. Once done, this traversal should also check the symbol table to see which elements are unreferenced and check whether those are usable by their parent argument; if not, it should warn that they are unused.

Overhaul debug information to be more extensive and use a struct

The debug information given to each token (line number, column number, etc.) could be extended with information on where the token ends. Additionally, passing it all around becomes very tedious, so we should wrap it in a special debug_info struct or something like that.
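
Such a struct might look like the sketch below. The field names, the `DebugInfo` spelling and the helpers `merge_spans` and `format_debug_info` are assumptions for illustration; the issue only asks for start and end information bundled into one object.

```cpp
#include <cstddef>
#include <string>

// Hypothetical layout: where a token (or AST node) starts and where it ends.
struct DebugInfo {
    std::string filename;
    std::size_t line1;  // line of the first character
    std::size_t col1;   // column of the first character
    std::size_t line2;  // line of the last character
    std::size_t col2;   // column of the last character
};

// Combines the start of one span with the end of another, e.g. to give a
// parent node the range covered by its first and last token.
inline DebugInfo merge_spans(const DebugInfo& first, const DebugInfo& last) {
    return DebugInfo{first.filename, first.line1, first.col1, last.line2, last.col2};
}

// Formats the span as "file:line:col-line:col" for use in messages.
inline std::string format_debug_info(const DebugInfo& d) {
    return d.filename + ":" + std::to_string(d.line1) + ":" + std::to_string(d.col1) +
           "-" + std::to_string(d.line2) + ":" + std::to_string(d.col2);
}
```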

Add meta labels to ADL & Parser

A final step before a (theoretically) fully functional parser is ready would be to add some way of specifying meta variables. In particular, these should be used to include other files or to mark that we use default types, for example, and to enable or disable automatic error messaging and automatic help handling.
