ddl.dic (cif2-conversion): _type.container 'Multiple' / _type.contents values about cif_core HOT 5 CLOSED

comcifs commented on August 12, 2024

ddl.dic (cif2-conversion): _type.container 'Multiple' / _type.contents values

from cif_core.

Comments (5)

jamesrhester commented on August 12, 2024

The original idea for _type.contents envisaged arbitrary datavalue complexity. Part of the driver for this was the perceived need to put all loop key datanames together into a single 'master key' dataname. This approach has been dropped, so almost all datanames have a much simpler structure. Datanames with heterogeneous values are unlikely to be common, as each component would need to be treated in a different way and so would more naturally be defined separately. So my initial response is that we no longer need these complex structures; indeed, the only place that a _type.container of Multiple occurs is in the definition of _type.contents, and the most complex _type.contents values that we have are those that @vaitkus has identified.

At this point I think it would be worth describing a simple grammar that only includes points (1) and (2). I don't think it is worth developing a grammar that can describe arbitrarily complex data values until such data values actually appear, which I suspect is unlikely. For now we can simply reserve the relevant characters (&|*.).

It is correct to say that a _type.container value of Multiple implies that combinations of the items in the _type.contents list become possible, and that this violates the principle that interpretation of one dataname should not depend on the value of another. One way around this is to define a separate attribute, e.g. _type.compound_contents, which contains this information. In this case a _type.container of Multiple is replaced by List or Single. So we might have:

_type.container List
_type.contents Compound    #New enumerated value
_type.compound_contents  "List(Real,Integer,(Text,Integer),Text)" #The contents of each list element

How does that sound? The definition for _type.compound_contents would include the grammar that you have developed.

A lot of these _type attributes were developed to allow for transforming dREL methods into typed languages. So the transforming function would be able to emit the type of the function as well as the types of the parameters. Ideally we would make sure that we can preserve this ability.


vaitkus commented on August 12, 2024

I fully agree that things should not be made too complex in advance. The proposed _type.compound_contents seems quite flexible; I could easily modify the grammar to only include (1) and (2) for now.

By the way, the example that you provided (List(Real,Integer,(Text,Integer),Text)) does not fit the grammar, since any nested list must be prefixed with the List keyword (the correct version in this case should be List(Real,Integer,List(Text,Integer),Text)). Is this acceptable, or should I modify the grammar in some way? Also, should I include the Matrix keyword in the grammar as well?

I am very happy that you are considering including the grammar in the description; however, I would really suggest defining a separate data item for that purpose. The data item could have its _type.purpose value set to Encode, thus explicitly stating that it can be automatically processed (e.g. for automatically generating a parser). Having a separate data item would also open the door to eventually providing machine-parsable descriptions for other data items as well.


jamesrhester commented on August 12, 2024

Yes, I have made a mistake in the example, you are correct that List was missing. Array and Matrix are also container types, so can be included, although it may be difficult to detect any difference between an Array and a Matrix purely by inspection of the structure or element type.
I am reluctant to define a separate data attribute for machine-readable grammars, mostly because very few dataitems in general should have structure. As I have explained above, any non-uniform structure within a datavalue means that datavalue could be decomposed into distinct parts. So I think we need to wait and see whether or not there is any need for descriptions of internal structure before defining a new data attribute.


vaitkus commented on August 12, 2024

I have updated the grammar (type-contents-simplified.txt) in response to your recent comments. I also included additional restrictions on the Matrix and Array containers, based on how they are described in the _type.container definition: these containers are not allowed to nest other containers and can only contain numerical data types. Please let me know if I should change that.
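To make the rules discussed in this thread concrete, here is a minimal Python sketch of a checker for such contents strings. It is not the grammar in type-contents-simplified.txt, which is not reproduced here. It assumes only what the comments state: nested lists need the List keyword, and Array and Matrix may hold only numerical types and may not nest containers. The type names (Real, Integer, Text), the function names, and the ban on whitespace are illustrative assumptions.

```python
import re

# Assumption: a small illustrative subset of type names, not the full DDLm list.
NUMERIC_TYPES = {"Real", "Integer"}
SCALAR_TYPES = NUMERIC_TYPES | {"Text"}
# Assumption: Array and Matrix are the numeric-only, non-nesting containers.
NUMERIC_CONTAINERS = {"Array", "Matrix"}
CONTAINERS = {"List"} | NUMERIC_CONTAINERS

TOKEN = re.compile(r"[A-Za-z]+|[(),]")


def tokenize(text):
    """Split the string into names and punctuation; reject anything else."""
    tokens = TOKEN.findall(text)
    if "".join(tokens) != text:
        raise ValueError(f"unexpected character in {text!r}")
    return tokens


def parse_element(tokens, pos, numeric_only):
    """Parse one element at tokens[pos] and return the position after it."""
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    name = tokens[pos]
    if name in SCALAR_TYPES:
        if numeric_only and name not in NUMERIC_TYPES:
            raise ValueError(f"{name} is not allowed inside Array or Matrix")
        return pos + 1
    if name in CONTAINERS:
        if numeric_only:
            raise ValueError("Array and Matrix cannot nest containers")
        if pos + 1 >= len(tokens) or tokens[pos + 1] != "(":
            raise ValueError(f"expected '(' after {name}")
        inner_numeric = name in NUMERIC_CONTAINERS
        pos = parse_element(tokens, pos + 2, inner_numeric)
        while pos < len(tokens) and tokens[pos] == ",":
            pos = parse_element(tokens, pos + 1, inner_numeric)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("expected ')'")
        return pos + 1
    # A bare '(' lands here, so a nested list without the List keyword fails.
    raise ValueError(f"unexpected token {name!r}")


def is_valid_contents(text):
    """Return True if the whole string is one well-formed element."""
    try:
        tokens = tokenize(text)
        return parse_element(tokens, 0, False) == len(tokens)
    except ValueError:
        return False
```

Under these assumptions, `is_valid_contents("List(Real,Integer,List(Text,Integer),Text)")` accepts the corrected example, while the version with a bare `(Text,Integer)` is rejected.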

I agree that probably very few items will have an intricate internal structure, but even the simple ones would benefit from the formal grammar. For example, the _enumeration.range description (The inclusive range of values "from:to" allowed for the defined item) leaves some room for interpretation:

Can a range be defined without the lower bound (e.g. ':10') or the upper bound (e.g. '0:')?
Can symbol ranges (e.g. 'a:z') be defined as well as number ranges?
Can non-continuous ranges be defined (e.g. '1:11,12:42'), etc.?

Ambiguities like these can be clarified by adding several more sentences and a few more examples to the description, and/or by providing a formal grammar (which would actually allow one to validate the provided examples). Of course, one can always add the explanatory grammar to the description, but the same is true of the value of any other data item. Having a separate data item designated just for this purpose seems like a more ontologically elegant solution, and it might even encourage developers of other DDLm dictionaries to include grammar definitions in their own dictionaries.
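As an illustration of how a formal definition settles the three questions above, here is a minimal Python sketch that encodes one possible reading of the "from:to" syntax. The answers it gives are assumptions chosen for the example, not the DDLm definition: either bound may be omitted, bounds must be numbers, and only a single contiguous range is accepted. The names `NUMBER`, `RANGE`, and `parse_range` are hypothetical.

```python
import re

# Assumption: bounds are plain decimal numbers with an optional sign.
NUMBER = r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)"
# Assumption: both bounds are optional and exactly one ':' is required.
RANGE = re.compile(rf"(?P<low>{NUMBER})?:(?P<high>{NUMBER})?")


def parse_range(text):
    """Return (low, high) as floats, with None for an omitted bound."""
    match = RANGE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid range: {text!r}")
    low = match.group("low")
    high = match.group("high")
    low = float(low) if low is not None else None
    high = float(high) if high is not None else None
    if low is not None and high is not None and low > high:
        raise ValueError(f"lower bound exceeds upper bound: {text!r}")
    return (low, high)
```

With this reading, `':10'` and `'0:'` parse as open-ended ranges, while `'a:z'` and `'1:11,12:42'` raise `ValueError`. A different regular expression would encode different answers, and stating those answers explicitly is what a formal grammar adds.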

In any case, I would be glad for the grammar definitions to be included in any form.


vaitkus commented on August 12, 2024

As the _type.container and _type.contents attributes have been greatly simplified in recent revisions (e.g. the Multiple container type was removed), the majority of this discussion is probably no longer relevant. However, I think that the idea of introducing an attribute intended to store a formal grammar that describes all possible attribute values is still worth pursuing, so I might open a separate issue on this topic in the future.

