
spec-jsonld's Issues

duplicate classes

Consider these classes:

  • uncefact:Certificate "A legal proof of ownership or worthiness of an item."
  • uncefact:Certification "The process of ensuring that a certain object, process, or activity has passed performance and quality assurance tests or qualification requirements."

From the description it seems each is needed: one is a process, the other is the documentary result of that process.
But if you check the applicable props of uncefact:Certification and their descriptions:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX uncefact: <https://service.unece.org/trade/uncefact/trade/uncefact/vocabulary/uncefact#>
PREFIX schema: <http://schema.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select * {
    {?incoming schema:domainIncludes uncefact:Certification; rdfs:comment ?descr}
    union {?outgoing schema:rangeIncludes uncefact:Certification; rdfs:comment ?descr}
}
| incoming | descr | outgoing |
|---|---|---|
| uncefact:assertion | "An assertion, expressed as text, for this trade product certification, such as that this product is free from peanuts." | |
| uncefact:assertionCode | "A code specifying an assertion for this trade product certification, such as claims that a product is free from peanuts." | |
| uncefact:responsibleAgency | "The agency, expressed as text, responsible for this trade product certification." | |
| uncefact:standard | "The standard, expressed as text, for this trade product certification." | |
| | "A certification applicable to this trade product." | uncefact:applicableCertification |

It becomes clear that uncefact:Certification is also the result, not a process.
Therefore the two classes must be merged.


We need to examine all classes with similar names for potential duplication.
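
One way to shortlist candidates is plain string similarity over the class local names. A minimal sketch in Python, assuming the names have already been extracted from the vocabulary; the sample list and the 0.8 cutoff are illustrative assumptions, and every hit still needs a manual check of the descriptions and applicable props:

```python
from difflib import SequenceMatcher
from itertools import combinations

def similar_pairs(names, cutoff=0.8):
    """Return pairs of names whose similarity ratio is at least `cutoff`."""
    pairs = []
    for a, b in combinations(sorted(names), 2):
        if SequenceMatcher(None, a, b).ratio() >= cutoff:
            pairs.append((a, b))
    return pairs

# Hypothetical sample; a real run would use all class names from the vocabulary.
classes = ["Certificate", "Certification", "Document", "Tax", "Cargo"]
print(similar_pairs(classes))
```

Lowering the cutoff gives more candidates at the price of more noise.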

cefact namespace

https://gist.github.com/VladimirAlexiev/618a9bddd6a949b75b37e983f0220417#bies

The cefact: namespace should be at unece rather than edi3, like the uncefact: namespace

@prefix uncefact: <https://service.unece.org/trade/uncefact/trade/uncefact/vocabulary/uncefact#> .
@prefix cefact:   <https://edi3.org/cefact#> .
  • The cefact: namespace doesn't reflect its purpose. Maybe it should be uncefactBIE:? And a corresponding part in the URL: <uncefact/BIE/uncefact>

IDs (Code) vs URLs

uncefact:applicableObjectCode "A code specifying an object, such as item, animal, person or organization applicable for this product certificate."

How can you identify this variety of things by using a mere string? There are no established global datasets (thus ID standards) for all of these categories of things.
IMHO this should be an object prop (URL), or define an alternative uncefact:applicableObject that's an object prop.

In contrast, GS1 will define gs1:certificationSubject that is an object prop, see https://milecastle.media/dev2021/voc_epcis_extras/CertificationDetails.


Are there other props holding external IDs with badly defined scope/reach?

Minutes from the 19th

@nissimsan here's the notes from today before I left.

Kick off Meeting

Attendees

Nis Jesperson - 4 years in CEFACT and founder of EDI3 works in verifiable credentials. Editor of the project.
Roman Evstifeev - worked on edi3 for a few years
Kevin Bishop - from UNECE secretariat, joined 2 months ago, has a background in IT as a professor in computing.
David Roff - T&L Domain Co-Ordinator also supporting the BIC on API and Digital resources.
Kesenyia - EDI3 Background
Steve Capell - will join subsequent calls

Contributing and Workflow

Anyone can contribute to the project, but official contributions must be made by a registered UN/CEFACT expert. Registration is simple and free.

Goal is to take what has been done already and get over the finish line as a project within UN/CEFACT.

Work will be done in Markdown, and the deliverables will be produced once the work is done.
Deliverables:

  • PDF of results and project.
  • Code of the Vocabulary on UN GitHub.

Process to follow the git flows

  • Submit a pull request and merge as we go focused around the tasks
  • encourage even small PRs
  • Issues will be reviewed in least-recently-updated order, which is a proven model in this environment.

Not in Scope

  • API and Schemas
  • Data Modelling - we are re-using existing work not creating new.

Include transformation code

Add the code from the RDM2API project which transforms exported legacy (CCTS) formatted BSP model to JSON-LD.

BinaryObject vs BinaryFile

This query finds props whose names contain "binary":

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX uncefact: <https://service.unece.org/trade/uncefact/trade/uncefact/vocabulary/uncefact#>
select * {
  ?x a rdf:Property
  bind(strafter(str(?x),str(uncefact:)) as ?xName)
  filter(regex(?xName,"binary","i"))
}

There are 15, nearly equally spread between "BinaryObject" (7) and "BinaryFile" (8):

"attachedBinaryFile"
"attachmentBinaryObject"
"creationBinaryFile"
"descriptionBinaryObject"
"imageBinaryObject"
"includedBinaryObject"
"logoAssociatedBinaryFile"
"mapBinaryObject"
"presentationBinaryFile"
"readerBinaryFile"
"referenceFileBinaryObject"
"referencedBinaryFile"
"relatedBinaryFile"
"signatoryImageBinaryObject"
"valueBinaryFile"

Standardize on one of the names; this discrepancy has caused 2 prop duplications (attached, referenced)
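
The split can be tallied with a few lines of Python. This is only a sketch over the 15 names listed above, not a query against the live vocabulary:

```python
from collections import Counter

props = [
    "attachedBinaryFile", "attachmentBinaryObject", "creationBinaryFile",
    "descriptionBinaryObject", "imageBinaryObject", "includedBinaryObject",
    "logoAssociatedBinaryFile", "mapBinaryObject", "presentationBinaryFile",
    "readerBinaryFile", "referenceFileBinaryObject", "referencedBinaryFile",
    "relatedBinaryFile", "signatoryImageBinaryObject", "valueBinaryFile",
]

def suffix(name):
    """Return which of the two competing suffixes the prop name ends with."""
    for s in ("BinaryFile", "BinaryObject"):
        if name.endswith(s):
            return s
    return None

counts = Counter(suffix(p) for p in props)
print(counts)
```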

Document vs Line Structure

Document has enough props to also describe "Document Lines", which are document parts:

  • lineCountNumeric: if it's a Document, count of lines in it
  • lineId: if it's a DocumentLine, its line ID
  • parentLineId: to establish a hierarchy between lines

However, there's no way to express a document hierarchy or parthood:

  • Which lines comprise this Document?
  • Which lines are nested under this parent line?

This leads to confusions such as

  • Wrong domain? The description talks about "DocumentLine" but the domain is Document
uncefact:lineStatusReason
        rdfs:comment                    "A reason, expressed as text, for the line status in this document line." ;
        schema:domainIncludes           uncefact:Document .

uncefact:lineStatusReasonCode
        rdfs:comment                    "The code specifying the line status reason for this document line." ;
        schema:domainIncludes           uncefact:Document .
  • Wrong domain of lineTotalBasisAmount? Where is that "line" mentioned in the description?
    • Tax has 51 attributes including basisAmount so how can one distinguish between the two?
    • On the other hand its "sibling" prop lineTotalAmount has domain MonetarySummation
uncefact:lineTotalBasisAmount
        rdfs:comment                    "A monetary value used as the line total basis on which this trade related tax, levy or duty is calculated." ;
        schema:domainIncludes           uncefact:Tax ;

publish extra data, link to external data

https://service.unece.org/trade/uncefact/vocabulary/unlocode-ch/#CHGVA has this info:

@id: unlocode-ch:CHGVA
@type: uncefact:UNLOCODE
Comment: Geneve
rdfs:comment: Geneve
rdf:value: CHGVA

https://locode.info/CHGVA has this info:

country: Switzerland (on wikipedia)
code: CH GVA
name: Genève
region: GE
functions: rail terminal, road terminal, airport, postal exchange

It's richer because it tells country, region, functions and has external link (to wikipedia).
Could you please add something like this? It's easy enough to:

owl:ObjectProperty vs owl:DatatypeProperty, rather than rdf:Property

https://gist.github.com/VladimirAlexiev/618a9bddd6a949b75b37e983f0220417#props

schema.org uses rdf:Property because almost all of its props allow literal (text) in addition to object.
However, UNCEFACT seems to be strict in following a "property dichotomy", so use owl:ObjectProperty (797, see below) vs owl:DatatypeProperty (950).

| range | c | comment |
|---|---|---|
| xsd:string | 791 | all literals |
| xsd:token | 159 | identifiers, all end in Id |
| uncefact: | 797 | uncefact classes (object props) |

Consider a live case study to demonstrate

There is potential to implement some of the outcomes in open projects or work. For example, BIC (Bureau International des Containers, https://github.com/bic-org) has containers, container facilities and BIC Code registrations, which could be used to demonstrate a working example of the project's outputs.

use Rec20 codes rather than names in URL

(@nissimsan now that I know one person working on UNCEFACT linked data, I'll write this up :-)

For 20 years Rec20 published codes like KGS. They are widely used, eg in EPCIS and numerous other eCommerce schemas.

You published Rec20 as Linked Data (thanks!) but use English URLs like:

This means one cannot link to your URLs from existing data, without having a lookup table or loading your JSONLD/RDF

  • I'd guess hundreds of data modelers and ontologists feel betrayed :-)
  • In the EPCIS 2.0 WG (eg @mgh128 @CraigRe) we discussed asking you to switch over and use codes in the URLs

Furthermore, I have to wonder about the stability of these URLs:

Thanks!

camelCasing

You use consistent camelCasing for props, and UpperCamelCasing for classes (good!).
However, it needs to be made smarter when dealing with UPPERCASE:

UPPERCASE abbreviations should be converted to lowercase, then camelCased as a normal word

  • otherwise:
    • casing is inconsistent depending on whether the abbreviation comes at the start or middle of the property name
    • The camelized abbreviation is impossible to recognize in the stream of words
  • examples:
    • current: bBANIdentificationId, bICId, australianSNIdentificationId (wtf is BANI, ICI, SNI?)
    • change to: bbanId, bicId, australianSnId
    • or even better: bban, bic, australianSn
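
The rule can be sketched in Python. This assumes the generator still has the original word list of each name (recovering the words from an already camelized name like bBANIdentificationId is ambiguous); the function name and signature are hypothetical:

```python
def camelize(words, upper_first=False):
    """Join words into camelCase, treating UPPERCASE abbreviations as normal words."""
    # str.capitalize() uppercases the first letter and lowercases the rest,
    # so "BBAN" becomes "Bban" and "SN" becomes "Sn".
    parts = [w.capitalize() for w in words]
    if parts and not upper_first:
        parts[0] = parts[0].lower()
    return "".join(parts)

print(camelize(["BBAN", "Id"]))
print(camelize(["Australian", "SN", "Id"]))
```

One caveat: capitalize() also flattens words that are deliberately mixed-case, so such words would need to be split before being passed in.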

Haven't looked for class names. Dunno how to catch all cases.

"Code" props: change to xsd:token, vs rename

Props named xxxCode come in two kinds:

prefix uncefact: <https://service.unece.org/trade/uncefact/trade/uncefact/vocabulary/uncefact#> 
PREFIX schema: <http://schema.org/>
select ?range (count(*) as ?c) {
  ?prop schema:rangeIncludes ?range1
  filter(regex(str(?prop),"Code$"))
  bind(if(strstarts(str(?range1),str(uncefact:)),uncefact:,?range1) as ?range)
} group by ?range
  • xsd:string: 154. Consider mapping to range xsd:token (same as props named xxxId).
    xsd:token doesn't allow leading, consecutive and trailing spaces, so it fits better than xsd:string. Example:
    • accessRightsCode xsd:string -> xsd:token
  • uncefact:... (objects, i.e. codelist values): 110. Consider renaming them to remove Code (because objects are not codes!). Examples:
    • accountingDocumentSetTriggerCode uncefact:UNCL1001Code -> accountingDocumentSetTrigger
    • cross-BorderRegulatoryProcedureTypeCode uncefact:UNCL9353Code -> cross-BorderRegulatoryProcedureType
    • logisticsSealSealingPartyRoleCode uncefact:UNCL9303Code -> logisticsSealSealingPartyRole
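
Before switching the 154 string-valued props to xsd:token, existing code values could be checked against the xsd:token constraints. A minimal Python sketch; the function name is hypothetical:

```python
import re

def is_xsd_token(value: str) -> bool:
    """True if `value` is already in xsd:token form."""
    if re.search(r"[\t\n\r]", value):
        return False  # no tabs, line feeds or carriage returns
    if value != value.strip(" "):
        return False  # no leading or trailing spaces
    return "  " not in value  # no runs of two or more spaces

print(is_xsd_token("KGS"), is_xsd_token(" KGS"), is_xsd_token("a  b"))
```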

semantic resolution and content negotiation; URL policy

@nissimsan #24 (comment) shows a link:
https://service.unece.org/trade/uncefact/vocabulary/uncefact.jsonld which is broken.
More importantly, the semantic link
https://service.unece.org/trade/uncefact/vocabulary/uncefact/
should return different payload using content negotiation.

As a minimum: HTML, JSONLD and Turtle.

It's not bad to also utilize extensions, eg
https://service.unece.org/trade/uncefact/vocabulary/uncefact.html
https://service.unece.org/trade/uncefact/vocabulary/uncefact.jsonld
https://service.unece.org/trade/uncefact/vocabulary/uncefact.ttl

but it's mandatory that the single semantic URL must return the same 3 content types using content negotiation.

Eg see how we did it for the Getty 5 years ago: http://vocab.getty.edu/doc/#Semantic_Resolution
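
To illustrate the idea, here is a simplified Python sketch of how a server could pick one of the three representations from the Accept header. It is an assumption about one possible implementation, not how the UNECE or Getty servers work, and it ignores the specificity rules of full HTTP content negotiation:

```python
SUPPORTED = ["text/html", "application/ld+json", "text/turtle"]

def negotiate(accept_header, supported=SUPPORTED):
    """Pick the supported media type with the highest q-value, or None."""
    best, best_q = None, 0.0
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type, q = parts[0].lower(), 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        for candidate in supported:
            # Exact match, full wildcard, or type wildcard such as "text/*".
            matches = media_type in ("*/*", candidate) or (
                media_type.endswith("/*") and candidate.startswith(media_type[:-1])
            )
            if matches and q > best_q:
                best, best_q = candidate, q
                break
    return best

print(negotiate("text/turtle"))
print(negotiate("application/ld+json;q=0.9, text/html;q=0.5"))
```

A request with no acceptable type gets None here; a real server would answer 406 or fall back to HTML.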


It's crucial to design the semantic URLs in a reasonable way, in order to guarantee their longevity and permanence.
Change ("break") them now so you won't have to break them in the future!

  • #26 is the most important decision in this regard.
  • From the above "URLs with file extension" it seems the semantic URLs shouldn't end in slash. Eg as below (but that's not an individual URL)
    https://service.unece.org/trade/uncefact/vocabulary/uncefact
  • I dislike "service.unece.org" because semantic vocabularies/entities are NOT services.
    Could you change this to "data" or "rdf" or "vocabs"?

Property Datatypes

Currently UNCEFACT uses only two literal datatypes: xsd:string (791 props) and xsd:token (159 props).

UNCEFACT prop names are made according to ISO/IEC 11179 Metadata Registry (MDR), part 5:2015 Naming and identification principles. The last word of prop names (let's call it "kind") suggests many other datatypes.

Surely trade involves some numbers and some dates?!?

I checked that all props with kind Id are xsd:token (good).
This query counts xsd:string props by "kind":

PREFIX schema: <http://schema.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
select ?kind (count(*) as ?c) {
  ?prop schema:rangeIncludes xsd:string
  bind(replace(str(?prop),".*([A-Z][a-z]*)","$1") as ?kind)
  filter(regex(?kind,"^[A-Z]"))
} group by ?kind order by ?kind

Count and tentative proposed changes:

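| kind | c | change to |
|---|---|---|
| "Access" | 1 | |
| "Agency" | 1 | |
| "Amount" | 89 | numeric |
| "Basis" | 2 | |
| "Box" | 1 | |
| "Charge" | 1 | |
| "Code" | 154 | xsd:token |
| "Conditions" | 1 | |
| "Criteria" | 1 | |
| "Date" | 3 | xsd:date |
| "Description" | 21 | |
| "Dimension" | 1 | |
| "Five" | 1 | |
| "Four" | 1 | |
| "Indicator" | 73 | xsd:boolean |
| "Information" | 21 | |
| "Instructions" | 2 | |
| "Limit" | 2 | |
| "List" | 2 | |
| "Means" | 1 | |
| "Measure" | 66 | |
| "Name" | 47 | |
| "Number" | 4 | numeric |
| "Numeric" | 15 | IndexNumeric, SequenceNumeric -> xsd:integer |
| "Object" | 7 | |
| "Of" | 2 | |
| "One" | 1 | |
| "Pattern" | 1 | |
| "Percent" | 16 | numeric |
| "Phrase" | 1 | |
| "Point" | 1 | |
| "Procedure" | 1 | |
| "Quantity" | 91 | numeric |
| "Rate" | 4 | |
| "Reason" | 7 | |
| "Reference" | 6 | |
| "Remark" | 2 | |
| "Remarks" | 1 | |
| "Restriction" | 3 | |
| "Result" | 1 | |
| "Status" | 1 | |
| "Three" | 1 | |
| "Time" | 79 | xsd:dateTime |
| "Title" | 1 | |
| "Two" | 1 | |
| "Type" | 9 | |
| "Use" | 1 | |
| "Value" | 1 | |
| "Zone" | 1 | |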

Examples:

  • Numeric candidates:
    uncefact:usedToDateQuotaQuantity, uncefact:usedSignalSourceQuantity, taxBasisTotalAmount, taxBasisAllowanceRate
  • date or dateTime candidates:
    uncefact:occurrenceDateTime
  • xsd:boolean candidates:
    uncefact:nilCarriageValueIndicator, uncefact:nilCustomsValueIndicator, uncefact:nilInsuranceValueIndicator
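
The "kind" extraction and the tentative mapping can be sketched in Python. The regex mirrors the replace() call in the SPARQL query; the PROPOSED dict and the function names are illustrative assumptions, and "numeric" is left as a placeholder because the concrete numeric datatype is not decided here:

```python
import re

# Tentative mapping taken from the "change to" column; kinds without a
# proposal stay xsd:string.
PROPOSED = {
    "Amount": "numeric", "Number": "numeric", "Percent": "numeric",
    "Quantity": "numeric", "Code": "xsd:token", "Date": "xsd:date",
    "Time": "xsd:dateTime", "Indicator": "xsd:boolean",
}

def kind(prop_name):
    """Keep only the last capitalized word of a prop name, or None if there is none."""
    last = re.sub(r".*([A-Z][a-z]*)", r"\1", prop_name)
    return last if re.match(r"[A-Z]", last) else None

def proposed_datatype(prop_name):
    return PROPOSED.get(kind(prop_name), "xsd:string")

for name in ["occurrenceDateTime", "taxBasisTotalAmount", "nilCustomsValueIndicator"]:
    print(name, kind(name), proposed_datatype(name))
```

This also shows why "Time" counts props ending in DateTime, and why address props like lineOne show up as kind "One".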

uncefact:UNECELOCODE

uncefact.jsonld wrongly defines class uncefact:UNECELOCODE

  • The LOCODE instance data uses uncefact:UNLOCODE rather than this name
  • Furthermore, this class is not very meaningful semantically (see #29), which may be confirmed by its "description" below:
uncefact:UNECELOCODE  rdf:type  rdfs:Class ;
        rdfs:comment  "LOCODE." ;
        rdfs:label    "UNLOCODE" .

deprecate/map UNCEFACT geospatial stuff in favor of OGC GeoSPARQL

(Continuing #5 (comment))

OGC is a stronger authority on geographic information than UNCEFACT, so it would be better to defer to OGC GeoSPARQL classes and props

  • to put it bluntly, UNCEFACT no longer has any business regulating geospatial data
  • to put it constructively:
    • semantic repos have special indexes, functions and magic predicates to handle GeoSPARQL. In particular asWKT and asGML serializations, region algebras (3 of them) and coordinate system transformations
    • web apps often use GeoJSON.
      • There was no good way to capture GeoJSON in JSONLD (no nested arrays in JSONLD)
      • Now I think one could use @type:@json to capture GeoJSON opaquely in RDF (but semantic repos won't index it)
      • In any case, it's clear how to convert GeoSPARQL serializations from/to GeoJSON and there are many tools

What are all of the UNCEFACT classes and props related to geospatial?

uncefact:Circle , uncefact:GeographicalMultiCurve , uncefact:GeographicalPoint , uncefact:GeographicalMultiSurface , uncefact:GeographicalGrid , uncefact:GeographicalMultiPoint , uncefact:Polygon , uncefact:LinearRing , uncefact:GeographicalLine , uncefact:GeographicalSurface

Props?

Introduce NDRs for adding links to other vocabs

A number of UN/CEFACT terms are defined elsewhere. In order to "fit in", we should include NDRs that define, for specific elements, how to link to external terms using relations such as broaderThan, narrowerThan, equivalentProperty, equivalentClass, subClass and subProperty.

For example uncl1153:Bill_of_lading_number rdfs:subPropertyOf schema:identifier

See also w3c-ccg/traceability-vocab#230

prop and class descriptions

I find many cases of misleading descriptions.
This is a very bad problem since it can't be fixed mechanically, but only through examination of the description of the combination "class-applicable props" (see #56 for an example of such investigation).

I'll keep adding below:

  • #52: formattedExpiryDateTime (and many others) should be expiryDateTimeFormat
  • #49: natureIdentificationCargo has a much more convoluted descr than Cargo; geoCoordinateIdentificationGeographicalCoordinate also has a convoluted descr
  • #56: Certification is not a process but a document

Include section on vocab update frequency

Vocab update frequency

  • Discussion, how linked data differs from centralized models.
  • Pros and cons of frequent/infrequent updates. Stability versus fringe increments.

cefactBieDomainClass should be object prop not string

The link from BasicBIE to AggregateBIE should be an object prop, not a string:

cefact:WorkItem_QuantityAnalysis.Details
        rdf:type                        uncefact:AggregateBIE .
cefact:WorkItem_QuantityAnalysis.Identification.Identifier
        rdf:type                        uncefact:BasicBIE ;
        uncefact:cefactBieDomainClass   "cefact:WorkItem_QuantityAnalysis.Details" # -> cefact:WorkItem_QuantityAnalysis.Details

use `#` or `/` rather than `/#`

@nissimsan

All of the vocabs use a double-char namespace delimiter: /# .

W3C and the Linked Data Patterns book (eg https://patterns.dataincubator.org/book/hierarchical-uris.html) have guidance on "slash vs hash" URLs. But nobody recommends using both together.

The recommendation is to use slash for large collections of terms and hash for smaller collections (since a client doesn't send the anchor after hash, so it'd get the whole collection at once).

  • The UNCEFACT vocabs are large collections, so I'd recommend slash
  • But your web pages are already structured to use hash, and you only offer JSONLD dumps of whole collections...
  • So it may be ok to use hash
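
The point about the anchor can be seen with Python's standard library. A small sketch using the LOCODE URL from another issue on this tracker:

```python
from urllib.parse import urldefrag

term = "https://service.unece.org/trade/uncefact/vocabulary/unlocode-ch/#CHGVA"
requested, fragment = urldefrag(term)

print(requested)  # the only part a client sends to the server
print(fragment)   # resolved client-side, inside the returned document
```

With a hash URL, every term in the collection therefore resolves to the same server-side document.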

But please consider #26

Prop Name Doublons

In addition to "IdentificationId" stuttering (#48), there are more insidious doublons in prop names that need fixing:

  • duplicated word: formattedFormattedCancellationAnnouncedLaunchDateTime : but see #52 for a whole family
  • duplicated word: referenceReferenceTypeCode
  • duplicated phrase: documentLineDocumentLineStatusCode
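
Such doublons can be detected mechanically. A Python sketch that splits a camelCase name into words and looks for a word sequence that is immediately repeated; the function names are hypothetical, and it catches only exact repeats (not near-repeats like IdentificationId):

```python
import re

def split_words(name):
    """Split a camelCase name into lowercase words."""
    return [w.lower() for w in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", name)]

def find_doublon(name):
    """Return the longest immediately repeated word sequence, or None."""
    words = split_words(name)
    for n in range(len(words) // 2, 0, -1):
        for i in range(len(words) - 2 * n + 1):
            if words[i:i + n] == words[i + n:i + 2 * n]:
                return words[i:i + n]
    return None

for name in ["referenceReferenceTypeCode", "documentLineDocumentLineStatusCode", "versionId"]:
    print(name, find_doublon(name))
```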

remove parasitic word "Identification" from prop names

There are 103 props called ...Identification:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select * {
  ?x a rdf:Property
  filter(regex(str(?x),"Identification"))
}

Of them 99 are called "IdentificationId" (stuttering syndrome?)

Rename these 99 props by removing the parasitic word ...Identification... from ...IdentificationId.

  • Eg uncefact:versionIdentificationId, bICIdentificationId
  • These led to prop duplication, which clearly shows the parasitic nature of the long word
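
The rename, and the clashes it exposes, can be sketched in Python. The prop list is a small sample taken from this tracker, and the function names are hypothetical:

```python
import re

def strip_identification(name):
    """Drop the word 'Identification' when it directly precedes a final 'Id'."""
    return re.sub(r"Identification(?=Id$)", "", name)

def collisions(existing):
    """Map each renamed prop to the existing prop name it would clash with."""
    taken = set(existing)
    clashes = {}
    for name in existing:
        renamed = strip_identification(name)
        if renamed != name and renamed in taken:
            clashes[name] = renamed
    return clashes

props = ["versionId", "versionIdentificationId", "bICId",
         "bICIdentificationId", "australianSNIdentificationId"]
print(collisions(props))
```

Each clash is a pair that has to be merged rather than simply renamed.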

Renaming ...Indicator to is... won't work well:

  • eg transportEquipmentSplitGoodsIndicator
  • cannot be named isTransportEquipmentSplitGoods
  • could

use ontologically meaningful classes

Linked Data should describe real-world entities, not data records, data structures, messages...

I'd argue that a class like uncefact:LOCODE is not good.

  • LOCODEs are assigned to ADM1 country subdivisions, right?
  • Schema.org has such a class: http://schema.org/State
    • It's named with abhorring US arrogance (why not Oblast or Land?) but otherwise is very much ok (see its definition).
    • So I think LOCODEs should better be represented as schema:State (see example #28 (comment))
  • A class uncefact:LOCODE would be appropriate to describe that identifier (when was it assigned, who assigned it, what did they use for validation), as opposed to the real world entity that it signifies
  • LEI has stuff like this, see the distinction between a Legal Entity and its LEI and its national ID (nodes marked (I) in the graph below)
  • But I don't think you have such needs for LOCODE codes

image

publish as individual resources

Currently you publish most UNECE vocabs as one big page.

  • providing useful data for individual resources (called "entity RDF") is one of the basic tenets of the web architecture and Linked Data. It'd make semantic web crawlers unhappy to get megabytes of data when they want to use one of your entities
  • having individual pages makes it easier and a lot more valuable for other knowledge graphs to link to your data. Eg consider:

Or maybe publish both:

  • small pages & RDF files for each individual resource
  • a big page for the vocab URL, assembled from the small pages

Address lines

Address: the breakdown lineOne, lineTwo, lineThree, lineFour is

  • pretty random: why not 6 or 10?
    • Dun & Bradstreet data has StreetAddressLine1..4 in one case, but MailingLabelLine1..8 in another case. How do you squeeze 8 DnB lines into the uncefact lines?
  • old-fashioned: xsd:string allows multiple lines
  • doesn't match popular address ontologies: schema.org, W3C LOCN, FIBO (edmcouncil/fibo#237)

So merge to just one prop eg addressLines.

This is low priority since it contradicts the desire to map UNCEFACT models as-is.

BIE descriptions vs prop descriptions

https://gist.github.com/VladimirAlexiev/618a9bddd6a949b75b37e983f0220417#prop-descriptions

Some BIE Descriptions seem to be richer than prop descriptions. Eg

cefact:Academic_Qualification.AbbreviatedName.Text a uncefact:BasicBIE ;
  rdfs:comment "The abbreviated name, expressed as text, of this academic qualification." ;
uncefact:abbreviatedName a rdf:Property ;
  rdfs:comment "An abbreviated name, expressed as text." ;

The BIE description could be used when the BIE is used in one term, and the term is applicable to only one class (as is the case for uncefact:abbreviatedName: applicable only to uncefact:Qualification)
1242 (or 1337?) of 1747 props satisfy this condition:

prefix uncefact: <https://service.unece.org/trade/uncefact/trade/uncefact/vocabulary/uncefact#>
prefix schema: <http://schema.org/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select * {
  ?prop uncefact:cefactElementMetadata ?bie
  filter not exists {?prop schema:domainIncludes ?dom1,?dom2 filter (?dom1 != ?dom2)}
  filter not exists {?prop uncefact:cefactElementMetadata ?bie2 filter (?bie != ?bie2)}
  ?prop rdfs:comment ?propDescr.
  ?bie rdfs:comment ?bieDescr.
} 

This requires further examination and unification of descriptions. Eg in the pair below:

  • prop Accreditation: "An official recognition awarded to a person, organisation or thing, such as a building or product, to certify that a certain level of attainment has been achieved."
  • BIE Certified_Accreditation.Details: "A certified recognition that provides evidence of a level of competency in a given area, such as certifying a level of skill in a trade."

decide numeric datatype

Related to #45:

Explicitly decide and consistently use a numeric datatype.
Don't decide as late as we did in EPCIS: gs1/EPCIS#201

  • RDF supports a full complement of XSD numeric types.
  • The most widely used are xsd:integer and xsd:decimal, which are infinite-precision
  • JSON has builtin numbers, which JSON-LD maps to xsd:integer (no fractional part) or xsd:double (otherwise).
    But using bare numbers in JSON (especially large numbers) is calling for trouble, see w3c/json-ld-syntax#387
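
The trouble with bare JSON numbers is easy to reproduce in Python. A small sketch; the "amount" and "count" keys are made up for illustration:

```python
import json
from decimal import Decimal

# A 64-bit double cannot represent every integer above 2**53.
big = 2**53 + 1
print(float(big) == float(2**53))  # the two integers collapse to the same double

# Binary doubles also cannot represent most decimal fractions exactly.
print(0.1 + 0.2 == 0.3)

# Parsing JSON fractions as Decimal keeps the decimal value intact.
doc = json.loads('{"amount": 0.1, "count": 9007199254740993}', parse_float=Decimal)
print(doc["amount"] + Decimal("0.2") == Decimal("0.3"))
```

Python keeps the large integer exact because its ints are arbitrary precision. Parsers that read every number as a double do not, which is one reason to prefer typed string literals for xsd:decimal and large xsd:integer values.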

UN LOCODE namespaces

ADAVL "Andorra la Vella" is published as https://service.unece.org/trade/uncefact/vocabulary/unlocode-ad/ADAVL.

The smart people at Wikidata can do such switcheroo

But making consumers jump through hoops just ain't right.
If you do #26, that won't be necessary

fix names/descriptions of "Identification" props

Follow-up from #48

Here are the remaining 4 props:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select * {
  ?x a rdf:Property
  filter(regex(str(?x),"Identification"))
  filter(!regex(str(?x),"IdentificationId"))
}

Proposed renaming (...Code can only be a string, not an object!)

  • uncefact:allowanceChargeIdentificationTypeCode "The code specifying the type of this trade allowance charge."
    range UNCL5189Code "Code specifying the identification of an allowance or charge."
    rename to allowanceOrChargeType
  • uncefact:geoCoordinateIdentificationGeographicalCoordinate range GeoCoordinate:
    rename to geoCoordinate
    and fix description not to talk about plurals and identifications!
  • uncefact:natureIdentificationCargo: descr sounds like it's free text "Transport cargo details of the consignment or consignment item sufficient to identify its nature for customs, statistical or transport purposes."
    but in fact it has range uncefact:Cargo,
    so rename to cargo
    • The descr of class Cargo is simply "Goods being transported." with none of that "sufficient to identify its nature" fuzziness?
      Resolve that difference
  • uncefact:uNDGIdentificationCode: "United Nations Dangerous Goods (UNDG) number":
    rename to undgCode

UN/CEFACT vocab fit in the larger linked data world

Include architectural section positioning UN/CEFACT vocab among other vocabs. This section will require discussion and solutioning. Include discussion of:

  • specialized versus generic vocabularies
  • term overrides by us
  • term overrides by others

Parasitic Word "Formatted"

Many dateTime props have "formatted" in their names.

  • As opposed to what, cuneiform? :-)
  • You should indicate the required format with rangeIncludes xsd:dateTime, not with the property name

This query finds them, together with a better-named prop when it exists:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX uncefact: <https://service.unece.org/trade/uncefact/trade/uncefact/vocabulary/uncefact#>
select ?x ?y {
  ?x a rdf:Property
  bind(strafter(str(?x),str(uncefact:)) as ?xName)
  filter(regex(?xName,"formatted","i"))
  optional {
    ?y a rdf:Property
    bind(strafter(str(?y),str(uncefact:)) as ?yName)
    filter(?xName != ?yName && regex(?xName,?yName,"i"))
  }
}

I think all x should be simplified by removing the parasitic words "formatted", and merged with y when indicated:

| x | y | note |
|---|---|---|
| formattedExpiryDateTime | expiryDate | maybe merge all 3 but see below |
| formattedExpiryDateTime | expiryDateTime | |
| formattedFormattedCancellationAnnouncedLaunchDateTime | | |
| formattedFormattedIssueDateTime | issueDateTime | |
| formattedFormattedLatestProductDataChangeDateTime | | |
| formattedFormattedPickUpAvailabilityDateTime | formattedPickUpAvailabilityDateTime | merge & rename to pickUpAvailabilityDateTime |
| formattedPickUpAvailabilityDateTime | | |
| formattedFormattedReceivedDateTime | receivedDateTime | |
| formattedFormattedUltimateShipToDeliveryDateTime | ultimateShipToDeliveryDateTime | |
| formattedJurisdictionEntryDateTime | | |
| formattedLastRegisteredYearDateTime | lastRegisteredYearDateTime | |
| formattedObtainedDateTime | | |
| formattedScheduledArrivalRelatedDateTime | arrivalRelatedDateTime | |
| formattedScheduledArrivalRelatedDateTime | scheduledArrivalRelatedDateTime | |
| formattedScheduledDepartureRelatedDateTime | departureRelatedDateTime | |
| formattedScheduledDepartureRelatedDateTime | scheduledDepartureRelatedDateTime | |
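
The proposed simplification of the x names can be sketched in Python. The function name is hypothetical, and it handles only the rename; whether to merge with an existing y prop remains a manual decision:

```python
import re

def strip_formatted(name):
    """Remove one or more leading 'formatted' words and re-lowercase the first letter."""
    rest = re.sub(r"^(?:formatted)+", "", name, flags=re.IGNORECASE)
    return rest[:1].lower() + rest[1:]

for name in ["formattedFormattedIssueDateTime", "formattedPickUpAvailabilityDateTime"]:
    print(name, "->", strip_formatted(name))
```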

This puppy is really messed up:

  • formattedExpiryDateTime "The date, time, date time or other date time value when this certified accreditation expires."
  • but range is UNCL2379Code "Code specifying the representation of a date, time or period."
  • So should this be expiryDateTime range xsd:dateTime, or expiryDateTimeFormat range UNCL2379Code???

duplicated props

https://gist.github.com/VladimirAlexiev/618a9bddd6a949b75b37e983f0220417#duplicated-props

The props in these pairs are duplicated, so should be merged

  • bICId: "The unique Bank Identification Code (BIC) as defined in ISO 9362 for this creditor or debtor financial institution."
    • bICIdentificationId: "The Bank Identifier Code (BIC) as defined by ISO 9362 (Banking telecommunication messages, Bank Identifier Codes) for this financial identity."
  • versionId: "An identifier of the version."
    • versionIdentificationId: "The unique identifier for the version of this exchanged document."
  • attachedBinaryFile: "A binary file attached to this exchanged or referenced document."
    • attachmentBinaryObject: "A binary object that is attached or otherwise appended to this referenced document."
  • referenceDocument
    • referencedDocument: haven't checked description

I'm sure there are more cases, how could we find them?

This is according to principle 4. deduplication of #33 (tech-spec.md)
