Comments (5)

jamesrhester commented on August 12, 2024

The original idea for _type.contents envisaged arbitrary datavalue complexity. Part of the driver for this was the perceived need to put all loop key datanames together into a single 'master key' dataname. This approach has been dropped, so almost all datanames have a much simpler structure. Datanames with heterogeneous values are unlikely to be common, as each component would need to be treated in a different way and so would more naturally be defined separately. So my initial response is that we no longer need these complex structures; indeed, the only place that _type.container of Multiple occurs is in the definition of _type.contents, and the most complex _type.contents that we have are those that @vaitkus has identified.

At this point I think it would be worth describing a simple grammar that only includes points (1) and (2). I don't think it is worth developing a grammar that can describe arbitrarily complex data values until such data values actually appear, which I suspect is unlikely. For now we can simply reserve the relevant characters (&|*.).

It is correct to say that a _type.container value of Multiple implies that combinations of the items in the _type.contents list become possible, and that this violates the principle that interpretation of one dataname should not depend on the value of another. One way around this is to define a separate attribute, e.g. _type.compound_contents, which contains this information. In this case _type.container of Multiple is replaced by List or Single. So we might have:

_type.container List
_type.contents Compound    #New enumerated value
_type.compound_contents  "List(Real,Integer,(Text,Integer),Text)" #The contents of each list element

How does that sound? The definition for _type.compound_contents would include the grammar that you have developed.

A lot of these _type attributes were developed to allow for transforming dREL methods into typed languages. So the transforming function would be able to emit the type of the function as well as the types of the parameters. Ideally we would make sure that we can preserve this ability.

from cif_core.

vaitkus commented on August 12, 2024

I fully agree that things should not be made too complex in advance. The proposed _type.compound_contents seems quite flexible; I could easily modify the grammar to only include (1) and (2) for now.

By the way, the example that you provided (List(Real,Integer,(Text,Integer),Text)) does not fit the grammar, since any nested list must be prefixed with the List keyword (the correct version in this case should be List(Real,Integer,List(Text,Integer),Text)). Is this acceptable or should I modify the grammar in some way? Also, should I include the Matrix keyword in the grammar as well?

I am very happy that you are considering including the grammar in the description; however, I would really suggest defining a separate data item for that purpose. The data item could have the _type.purpose value set to Encode, thus explicitly stating that it can be automatically processed (i.e. for automatically generating a parser). Having a separate data item would also open the door to eventually providing machine-parsable descriptions for other data items as well.

jamesrhester commented on August 12, 2024

Yes, I made a mistake in the example; you are correct that List was missing. Array and Matrix are also container types, so they can be included, although it may be difficult to detect any difference between an Array and a Matrix purely by inspection of the structure or element type.

I am reluctant to define a separate data attribute for machine-readable grammars, mostly because very few dataitems in general should have structure. As I have explained above, any non-uniform structure within a datavalue means that datavalue could be decomposed into distinct parts. So I think we need to wait and see whether or not there is any need for descriptions of internal structure before defining a new data attribute.

vaitkus commented on August 12, 2024

I have updated the grammar (type-contents-simplified.txt) in response to your recent comments. I also included additional restrictions on the Matrix and Array containers based on the _type.container description: these containers are not allowed to nest other containers and can only contain numerical data types. Please let me know if I should change that.
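
As an illustration only, the rules mentioned so far (nested lists need the List keyword; Matrix and Array may not nest containers and hold only numerical types) could be checked with a small recursive-descent parser. The sketch below is not the grammar from type-contents-simplified.txt, which is not reproduced in this thread. The function names, the tuple-based tree, and the choice of Real and Integer as the numerical types are all assumptions, and content type names are not validated against any enumeration.

```python
import re

# Assumed name sets; the actual grammar file is not shown in this thread.
CONTAINERS = {"List", "Array", "Matrix"}
NUMERIC_ONLY = {"Array", "Matrix"}    # no nesting, numerical elements only
NUMERIC_TYPES = {"Real", "Integer"}   # assumed numerical content types

TOKEN = re.compile(r"\s*([A-Za-z]+|[(),])")


def tokenize(text):
    """Split the expression into names, parentheses and commas."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text):
    """Return a nested (container, (items...)) tuple or a bare type name."""
    tokens = tokenize(text)
    tree, rest = _parse_item(tokens, 0)
    if rest != len(tokens):
        raise ValueError("trailing input after a complete type expression")
    return tree


def _parse_item(tokens, i):
    if i >= len(tokens) or not tokens[i][0].isalpha():
        # A bare '(' lands here: nested lists must carry a keyword.
        raise ValueError("expected a type or container keyword")
    name = tokens[i]
    if name not in CONTAINERS:
        return name, i + 1            # simple content type, not validated
    if i + 1 >= len(tokens) or tokens[i + 1] != "(":
        raise ValueError(f"{name} must be followed by '('")
    i += 2
    items = []
    while True:
        item, i = _parse_item(tokens, i)
        if name in NUMERIC_ONLY and not (
            isinstance(item, str) and item in NUMERIC_TYPES
        ):
            raise ValueError(f"{name} may only contain numerical types")
        items.append(item)
        if i < len(tokens) and tokens[i] == ",":
            i += 1
            continue
        if i < len(tokens) and tokens[i] == ")":
            return (name, tuple(items)), i + 1
        raise ValueError("expected ',' or ')'")
```

Under these assumptions, parse("List(Real,Integer,List(Text,Integer),Text)") succeeds, while the earlier form with a bare inner parenthesis raises ValueError.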

I agree that probably very few items will have an intricate internal structure, but even the simple ones would benefit from the formal grammar. For example, the _enumeration.range description (The inclusive range of values "from:to" allowed for the defined item) leaves some room for interpretation:

  1. Can a range be defined without the lower bound (e.g. ':10') or the upper bound (e.g. '0:')?
  2. Can symbol ranges (e.g. 'a:z') be defined as well as number ranges?
  3. Can non-continuous ranges be defined (e.g. '1:11,12:42'), etc.?

Ambiguities like these can be clarified by adding several more sentences and a few more examples to the description, and/or by a formal grammar (which would actually allow one to validate the provided examples). Of course, one can always add the explanatory grammar to the description, but the same could be said of the value of any other data item. Having a separate data item designated just for this purpose seems like a more ontologically elegant solution, and it might even encourage developers of other DDLm dictionaries to include grammar definitions in their own dictionaries.
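
To make the point concrete, here is a minimal sketch of how one particular answer to the three questions could be pinned down in code. It encodes an assumed reading, not the DDLm definition: either bound may be omitted (but not both), bounds are decimal numbers only, and only a single continuous range is accepted. The name parse_range and the regular expression are hypothetical.

```python
import re

# Assumed reading of "from:to": optional numeric bounds, one range only.
NUMBER = r"[-+]?\d+(?:\.\d+)?"
RANGE = re.compile(rf"(?P<low>{NUMBER})?:(?P<high>{NUMBER})?")


def parse_range(text):
    """Return (low, high) as floats, with None for an omitted bound."""
    match = RANGE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a numeric range: {text!r}")
    low, high = match.group("low"), match.group("high")
    if low is None and high is None:
        raise ValueError("at least one bound is required")
    low = float(low) if low is not None else None
    high = float(high) if high is not None else None
    if low is not None and high is not None and low > high:
        raise ValueError("lower bound exceeds upper bound")
    return low, high
```

With this reading, '0:' and ':10' are accepted, while 'a:z' and '1:11,12:42' are rejected. A different reading would change only the pattern, which is the kind of decision a formal grammar would make explicit.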

In any case, I would be glad for the grammar definitions to be included in any form.

vaitkus commented on August 12, 2024

As the _type.container and _type.contents attributes have been greatly simplified in recent revisions (e.g. the Multiple container type was removed), the majority of this discussion is probably no longer relevant. However, I think that the idea of introducing an attribute intended to store a formal grammar that describes all possible attribute values is still worth pursuing, so I might open a separate issue on this topic in the future.
