uncefact / spec-jsonld Goto Github PK

Exposing the UN/CEFACT vocabulary as web semantics

Home Page: https://service.unece.org/trade/uncefact/vocabulary/uncefact/

Java 100.00%

spec-jsonld's Issues

duplicate classes

Consider these classes:

uncefact:Certificate "A legal proof of ownership or worthiness of an item."
uncefact:Certification "The process of ensuring that a certain object, process, or activity has passed performance and quality assurance tests or qualification requirements." ;

From the description it seems each is needed: one is a process, the other is the documentary result of that process.
But if you check the applicable props of uncefact:Certification and their descriptions:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX uncefact: <https://service.unece.org/trade/uncefact/trade/uncefact/vocabulary/uncefact#>
PREFIX schema: <http://schema.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select * {
    {?incoming schema:domainIncludes uncefact:Certification; rdfs:comment ?descr}
    union {?outgoing schema:rangeIncludes uncefact:Certification; rdfs:comment ?descr}
}

incoming	descr	outgoing
uncefact:assertion	"An assertion, expressed as text, for this trade product certification, such as that this product is free from peanuts."
uncefact:assertionCode	"A code specifying an assertion for this trade product certification, such as claims that a product is free from peanuts."
uncefact:responsibleAgency	"The agency, expressed as text, responsible for this trade product certification."
uncefact:standard	"The standard, expressed as text, for this trade product certification."
	"A certification applicable to this trade product."	uncefact:applicableCertification

It becomes clear that uncefact:Certification is also the result, not a process.
Therefore the two classes must be merged.

We need to examine all classes with similar names for potential duplication.

cefact namespace

https://gist.github.com/VladimirAlexiev/618a9bddd6a949b75b37e983f0220417#bies

The cefact: namespace should be at unece rather than edi3, like the uncefact: namespace

@prefix uncefact: <https://service.unece.org/trade/uncefact/trade/uncefact/vocabulary/uncefact#> .
@prefix cefact:   <https://edi3.org/cefact#> .

The cefact: namespace doesn't reflect its purpose. Maybe it should be uncefactBIE:? And a corresponding part in the URL: <uncefact/BIE/uncefact>

Make GH team of approvers

Itemize initial list of deliverables

Including high level description of project goal and deliverables

IDs (Code) vs URLs

uncefact:applicableObjectCode "A code specifying an object, such as item, animal, person or organization applicable for this product certificate."

How can you identify this variety of things by using a mere string? There are no established global datasets (thus ID standards) for all of these categories of things.
IMHO this should be an object prop (URL), or define an alternative uncefact:applicableObject that's an object prop.

In contrast, GS1 will define gs1:certificationSubject that is an object prop, see https://milecastle.media/dev2021/voc_epcis_extras/CertificationDetails.

Are there other props holding external IDs with badly defined scope/reach?

Minutes from the 19th

@nissimsan here's the notes from today before I left.

Kick off Meeting

Attendees

Nis Jesperson - 4 years in CEFACT and founder of EDI3 works in verifiable credentials. Editor of the project.
Roman Evstifeev - worked on edi3 for a few years
Kevin Bishop - from UNECE secretariat, joined 2 months ago, has a background in IT as a professor in computing.
David Roff - T&L Domain Co-Ordinator also supporting the BIC on API and Digital resources.
Kesenyia - EDI3 Background
Steve Capell - will join subsequent calls

Contributing and Workflow

Anyone can contribute to the project, but official contributions must be made by a registered UN/CEFACT expert. Registration is a simple process and free to do.

Goal is to take what has been done already and get over the finish line as a project within UN/CEFACT.

Work will be done in Markdown, and the deliverable will be output from it after the work is done.
Deliverables:

PDF of results and project.
Code of the Vocabulary on UN GitHub.

Process to follow the git flows

Submit a pull request and merge as we go focused around the tasks
encourage even small PRs
Issues will be reviewed in least-recently-updated order, which is a proven model in this environment.

Not in Scope

API and Schemas
Data Modelling - we are re-using existing work not creating new.

Include transformation code

Add the code from the RDM2API project which transforms exported legacy (CCTS) formatted BSP model to JSON-LD.

BinaryObject vs BinaryFile

This query finds props named "binary":

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX uncefact: <https://service.unece.org/trade/uncefact/trade/uncefact/vocabulary/uncefact#>
select * {
  ?x a rdf:Property
  bind(strafter(str(?x),str(uncefact:)) as ?xName)
  filter(regex(?xName,"binary","i"))
}

There are 15, nearly equally spread between "BinaryObject" and "BinaryFile":

"attachedBinaryFile"
"attachmentBinaryObject"
"creationBinaryFile"
"descriptionBinaryObject"
"imageBinaryObject"
"includedBinaryObject"
"logoAssociatedBinaryFile"
"mapBinaryObject"
"presentationBinaryFile"
"readerBinaryFile"
"referenceFileBinaryObject"
"referencedBinaryFile"
"relatedBinaryFile"
"signatoryImageBinaryObject"
"valueBinaryFile"

Standardize on one of the names; this discrepancy has caused 2 prop duplications (attached, referenced)
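
The split between the two suffixes can be tallied with a few lines of Python. This is only a sketch over the 15 names listed above (copied by hand, not fetched from the vocabulary); `suffix_of` is a hypothetical helper name.

```python
from collections import Counter

# The 15 property names listed in this issue (copied by hand).
BINARY_PROPS = [
    "attachedBinaryFile", "attachmentBinaryObject", "creationBinaryFile",
    "descriptionBinaryObject", "imageBinaryObject", "includedBinaryObject",
    "logoAssociatedBinaryFile", "mapBinaryObject", "presentationBinaryFile",
    "readerBinaryFile", "referenceFileBinaryObject", "referencedBinaryFile",
    "relatedBinaryFile", "signatoryImageBinaryObject", "valueBinaryFile",
]


def suffix_of(name):
    """Return the naming suffix ('BinaryFile' or 'BinaryObject') of a prop name."""
    for suffix in ("BinaryFile", "BinaryObject"):
        if name.endswith(suffix):
            return suffix
    return "other"


counts = Counter(suffix_of(name) for name in BINARY_PROPS)
print(counts)
```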

Document vs Line Structure

Document has enough props to describe also "Document Lines", which are document parts:

lineCountNumeric: if it's a Document, count of lines in it
lineId: if it's a DocumentLine, its line ID
parentLineId: to establish a hierarchy between lines

However, there's no way to express a document hierarchy or parthood:

Which lines comprise this Document?
Which lines are nested under this parent line?

This leads to confusions such as

Wrong domain? The description talks about "DocumentLine" but the domain is Document

uncefact:lineStatusReason
        rdfs:comment                    "A reason, expressed as text, for the line status in this document line." ;
        schema:domainIncludes           uncefact:Document .

uncefact:lineStatusReasonCode
        rdfs:comment                    "The code specifying the line status reason for this document line." ;
        schema:domainIncludes           uncefact:Document .

Wrong domain of lineTotalBasisAmount? Where is that "line" mentioned in the description?
- Tax has 51 attributes including basisAmount so how can one distinguish between the two?
- On the other hand its "sibling" prop lineTotalAmount has domain MonetarySummation

uncefact:lineTotalBasisAmount
        rdfs:comment                    "A monetary value used as the line total basis on which this trade related tax, levy or duty is calculated." ;
        schema:domainIncludes           uncefact:Tax ;

Schedule meeting cadence

Plan for meeting cadence, add to README, incl conference details.

Formalize project requirements

Formalize the requirements driving the UN/CEFACT vocab solution.

publish extra data, link to external data

https://service.unece.org/trade/uncefact/vocabulary/unlocode-ch/#CHGVA has this info:

@id: unlocode-ch:CHGVA
@type: uncefact:UNLOCODE
Comment: Geneve
rdfs:comment: Geneve
rdf:value: CHGVA

https://locode.info/CHGVA has this info:

country: Switzerland (on wikipedia)
code: CH GVA
name: Genève
region: GE
functions: rail terminal, road terminal, airport, postal exchange

It's richer because it tells country, region, functions and has external link (to wikipedia).
Could you please add something like this? It's easy enough to:

publish country and link it to something, eg an ISO countries list, Geonames or Wikidata
link LOCODEs to Geonames ADM1 (for this case, the semantic URL is https://sws.geonames.org/2660645/, which shows the human readable page https://www.geonames.org/2660645/canton-de-geneve.html)

Extract existing documentation

A large amount of draft tech specification and other documentation has already been created over the recent years:
https://uncefact.unece.org/pages/viewpage.action?pageId=43384856
We must filter through this material and extract the parts which are relevant to web vocab. We don't want to re-invent any wheels.

This ticket will probably need to be subdivided.

Add UN delegate link

owl:ObjectProperty vs owl:DatatypeProperty, rather than rdf:Property

https://gist.github.com/VladimirAlexiev/618a9bddd6a949b75b37e983f0220417#props

schema.org uses rdf:Property because almost all of its props allow literal (text) in addition to object.
However, UNCEFACT seems to be strict in following a "property dichotomy", so use owl:ObjectProperty (797, see below) vs owl:DatatypeProperty (950).

range	c	comment
xsd:string	791	all literals
xsd:token	159	identifiers, all end in `Id`
uncefact:	797	uncefact classes (object props)

Consider a live case study to demonstrate

Potential to implement some of the outcomes in open projects or work, for example BIC (Bureau International des Containers) https://github.com/bic-org have containers, container facilities and also BIC Code registrations which could be used to demonstrate a working example of the project's outputs.

use Rec20 codes rather than names in URL

(@nissimsan now that I know one person working on UNCEFACT linked data, I'll write this up :-)

For 20 years Rec20 published codes like KGS. They are widely used, eg in EPCIS and numerous other eCommerce schemas.

You published Rec20 as Linked Data (thanks!) but use English URLs like:

https://service.unece.org/trade/uncefact/vocabulary/rec20/#kilogram_per_second

This means one cannot link to your URLs from existing data, without having a lookup table or loading your JSONLD/RDF

I'd guess hundreds of data modelers and ontologists feel betrayed :-)
In the EPCIS 2.0 WG (eg @mgh128 @CraigRe) we discussed asking you to switch over and use codes in the URLs

Furthermore, I have to wonder about the stability of these URLs:

https://service.unece.org/trade/uncefact/vocabulary/rec20/#kilometre_KMT : why KMT is appended?
https://service.unece.org/trade/uncefact/vocabulary/rec20/#mile_(statute_mile)_per_second_squared : why not statute_mile_per_second_squared or mile_(statute)_per_second_squared ?
https://service.unece.org/trade/uncefact/vocabulary/rec20/#cubic_mile_(UK_statute) : Is statute same or different from UK_statute?
https://service.unece.org/trade/uncefact/vocabulary/rec20/#minute_[unit_of_time] : square not round brackets?

Thanks!

camelCasing

You use consistent camelCasing for props, and UpperCamelCasing for classes (good!).
However, it needs to be made smarter when dealing with UPPERCASE:

UPPERCASE abbreviations should be converted to lowercase, then camelCased as a normal word

otherwise:
- casing is inconsistent depending on whether the abbreviation comes at the start or middle of the property name
- The camelized abbreviation is impossible to recognize in the stream of words
examples:
- current: bBANIdentificationId, bICId, australianSNIdentificationId (wtf is BANI, ICI, SNI?)
- change to: bbanId, bicId, australianSnId
- or even better: bban, bic, australianSn

Haven't looked for class names. Dunno how to catch all cases.
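
A rough sketch of the proposed rule (lowercase the abbreviation, then camelCase it as a normal word) in Python. It assumes the current names were produced by lowercasing only the first character of an UpperCamelCase name, and it handles letters and digits only (hyphenated names are not covered). `normalize_camel` is a hypothetical name, not part of any existing tool. It does not remove "Identification"; that is a separate renaming.

```python
import re

# One "word" is either a run of capitals that is followed by a normal
# Capitalized word (an abbreviation such as BBAN), a normal word,
# a trailing run of capitals, or a run of digits.
WORD = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")


def normalize_camel(name):
    """Re-case a prop name so abbreviations are treated as ordinary words."""
    # Assumption: restore the original UpperCamelCase form before splitting.
    words = WORD.findall(name[:1].upper() + name[1:])
    if not words:
        return name
    return words[0].lower() + "".join(w.capitalize() for w in words[1:])


for old in ["bBANIdentificationId", "bICId", "australianSNIdentificationId"]:
    print(old, "->", normalize_camel(old))
```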

"Code" props: change to xsd:token, vs rename

Props named xxxCode come in two kinds:

prefix uncefact: <https://service.unece.org/trade/uncefact/trade/uncefact/vocabulary/uncefact#> 
PREFIX schema: <http://schema.org/>
select ?range (count(*) as ?c) {
  ?prop schema:rangeIncludes ?range1
  filter(regex(str(?prop),"Code$"))
  bind(if(strstarts(str(?range1),str(uncefact:)),uncefact:,?range1) as ?range)
} group by ?range

xsd:string: 154. Consider mapping to range xsd:token (same as props named xxxId).
xsd:token doesn't allow leading, consecutive and trailing spaces, so it fits better than xsd:string. Example:
- accessRightsCode xsd:string -> xsd:token
uncefact:... (objects, i.e. codelist values): 110. Consider renaming them to remove Code (because objects are not codes!). Examples:
- accountingDocumentSetTriggerCode uncefact:UNCL1001Code -> accountingDocumentSetTrigger
- cross-BorderRegulatoryProcedureTypeCode uncefact:UNCL9353Code -> cross-BorderRegulatoryProcedureType
- logisticsSealSealingPartyRoleCode uncefact:UNCL9303Code -> logisticsSealSealingPartyRole
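
To make the xsd:token whitespace rule mentioned above concrete, here is a minimal Python sketch of a check and a normalizer. The function names are hypothetical; the rule itself (no tab, line feed or carriage return, no leading or trailing space, no run of two or more spaces) is the XSD definition of the token lexical space.

```python
import re

# Anything that disqualifies a string from being an xsd:token:
# tab / line feed / carriage return, a leading or trailing space,
# or two or more consecutive spaces.
_NOT_TOKEN = re.compile(r"[\t\n\r]|\A | \Z| {2,}")

# XSD whitespace characters (space, tab, line feed, carriage return).
_XSD_WS = re.compile(r"[ \t\n\r]+")


def is_xsd_token(value):
    """True if the string is already a valid xsd:token lexical value."""
    return _NOT_TOKEN.search(value) is None


def to_xsd_token(value):
    """Collapse XSD whitespace runs to one space and trim both ends."""
    return _XSD_WS.sub(" ", value).strip(" ")


print(is_xsd_token("CHGVA"), is_xsd_token(" CH  GVA "))
print(repr(to_xsd_token(" CH  GVA ")))
```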

semantic resolution and content negotiation; URL policy

@nissimsan #24 (comment) shows a link:
https://service.unece.org/trade/uncefact/vocabulary/uncefact.jsonld which is broken.
More importantly, the semantic link
https://service.unece.org/trade/uncefact/vocabulary/uncefact/
should return different payload using content negotiation.

As a minimum: HTML, JSONLD and Turtle.

It's not bad to also utilize extensions, eg
https://service.unece.org/trade/uncefact/vocabulary/uncefact.html
https://service.unece.org/trade/uncefact/vocabulary/uncefact.jsonld
https://service.unece.org/trade/uncefact/vocabulary/uncefact.ttl

but it's mandatory that the single semantic URL must return the same 3 content types using content negotiation.

Eg see how we did it for the Getty 5 years ago: http://vocab.getty.edu/doc/#Semantic_Resolution
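
A simplified sketch of what the content negotiation asked for above could look like, in Python. It is not how the UNECE server works today; it only picks one of the three extension URLs from an HTTP Accept header. Wildcards such as `*/*` and media-range specificity are deliberately ignored (they fall back to HTML), and `negotiate` / `resolve` are hypothetical names.

```python
# Slash-less semantic URL plus the three extension variants suggested above.
BASE = "https://service.unece.org/trade/uncefact/vocabulary/uncefact"
SUPPORTED = {
    "text/html": ".html",
    "application/ld+json": ".jsonld",
    "text/turtle": ".ttl",
}


def negotiate(accept, default="text/html"):
    """Pick the supported media type with the highest q-value."""
    best_type, best_q = default, 0.0
    for part in accept.split(","):
        fields = [f.strip() for f in part.split(";")]
        media = fields[0].lower()
        q = 1.0  # q defaults to 1 when not given
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        if media in SUPPORTED and q > best_q:
            best_type, best_q = media, q
    return best_type


def resolve(accept):
    """Return the extension URL a request would be redirected to."""
    return BASE + SUPPORTED[negotiate(accept)]


print(resolve("text/turtle"))
print(resolve("application/ld+json;q=0.5, text/turtle;q=0.9"))
```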

It's crucial to design the semantic URLs in a reasonable way, in order to guarantee their longevity and permanence.
Change ("break") them now so you won't have to break them in the future!

#26 is the most important decision in this regard.
From the above "URLs with file extension" it seems the semantic URLs shouldn't end in slash. Eg as below (but that's not an individual URL)
https://service.unece.org/trade/uncefact/vocabulary/uncefact
I dislike "service.unece.org" because semantic vocabularies/entities are NOT services.
Could you change this to "data" or "rdf" or "vocabs"?

Property Datatypes

Currently UNCEFACT uses only two literal datatypes: xsd:string (791 props) and xsd:token (159 props).

UNCEFACT prop names are made according to ISO/IEC 11179 Metadata Registry (MDR), part 5:2015 Naming and identification principles. The last word of prop names (let's call it "kind") suggests many other datatypes.

Surely trade involves some numbers and some dates?!?

I checked that all props with kind Id are xsd:token (good).
This query counts xsd:string props by "kind":

PREFIX schema: <http://schema.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
select ?kind (count(*) as ?c) {
  ?prop schema:rangeIncludes xsd:string
  bind(replace(str(?prop),".*([A-Z][a-z]*)","$1") as ?kind)
  filter(regex(?kind,"^[A-Z]"))
} group by ?kind order by ?kind

Count and tentative proposed changes:

kind	c	change to
"Access"	1
"Agency"	1
"Amount"	89	numeric
"Basis"	2
"Box"	1
"Charge"	1
"Code"	154	`xsd:token`
"Conditions"	1
"Criteria"	1
"Date"	3	`xsd:date`
"Description"	21
"Dimension"	1
"Five"	1
"Four"	1
"Indicator"	73	`xsd:boolean`
"Information"	21
"Instructions"	2
"Limit"	2
"List"	2
"Means"	1
"Measure"	66
"Name"	47
"Number"	4	numeric
"Numeric"	15	IndexNumeric, SequenceNumeric -> `xsd:integer`
"Object"	7
"Of"	2
"One"	1
"Pattern"	1
"Percent"	16	numeric
"Phrase"	1
"Point"	1
"Procedure"	1
"Quantity"	91	numeric
"Rate"	4
"Reason"	7
"Reference"	6
"Remark"	2
"Remarks"	1
"Restriction"	3
"Result"	1
"Status"	1
"Three"	1
"Time"	79	`xsd:dateTime`
"Title"	1
"Two"	1
"Type"	9
"Use"	1
"Value"	1
"Zone"	1

Examples:

Numeric candidates:
uncefact:usedToDateQuotaQuantity, uncefact:usedSignalSourceQuantity, taxBasisTotalAmount, taxBasisAllowanceRate
date or dateTime candidates:
uncefact:occurrenceDateTime
xsd:boolean candidates:
uncefact:nilCarriageValueIndicator, uncefact:nilCustomsValueIndicator, uncefact:nilInsuranceValueIndicator
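
The "kind" heuristic and the tentative mapping from the table above can be sketched in Python as follows. The mapping covers only the rows that have a proposal; "numeric" is left as a placeholder because the concrete numeric datatype is not decided here, and the "Numeric" kind is omitted since the proposal covers only some of its props. `kind` and `proposed_datatype` are hypothetical names.

```python
import re

# Tentative proposals from the table above, plus Id (already xsd:token).
PROPOSED = {
    "Id": "xsd:token",
    "Code": "xsd:token",
    "Date": "xsd:date",
    "Time": "xsd:dateTime",
    "Indicator": "xsd:boolean",
    "Amount": "numeric",    # placeholder: numeric datatype not decided
    "Number": "numeric",
    "Percent": "numeric",
    "Quantity": "numeric",
}


def kind(prop_name):
    """Return the last CamelCase word of a prop name, e.g. 'Indicator'."""
    match = re.search(r"[A-Z][a-z]*$", prop_name)
    return match.group(0) if match else None


def proposed_datatype(prop_name):
    """Look up the tentative datatype; fall back to the current xsd:string."""
    return PROPOSED.get(kind(prop_name), "xsd:string")


for name in ["uncefact:occurrenceDateTime", "uncefact:nilCustomsValueIndicator",
             "taxBasisTotalAmount"]:
    print(name, kind(name), proposed_datatype(name))
```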

Initiate formal delivery document

Establish .md document for capturing the text-based deliverables for bureau submission.

uncefact:UNECELOCODE

uncefact.jsonld wrongly defines class uncefact:UNECELOCODE

The LOCODE instance data uses uncefact:UNLOCODE rather than this name
Furthermore, this class is not very meaningful semantically (see #29), which may be confirmed by its "description" below:

uncefact:UNECELOCODE  rdf:type  rdfs:Class ;
        rdfs:comment  "LOCODE." ;
        rdfs:label    "UNLOCODE" .

deprecate/map UNCEFACT geospatial stuff in favor of OGC GeoSPARQL

(Continuing #5 (comment))

OGC is a stronger authority on geographic information than UNCEFACT, so it would be better to defer to OGC GeoSPARQL classes and props

to put it bluntly, UNCEFACT no longer has any business regulating geospatial data
to put it constructively:
- semantic repos have special indexes, functions and magic predicates to handle GeoSPARQL. In particular asWKT and asGML serializations, region algebras (3 of them) and coordinate system transformations
- web apps often use GeoJSON.
  - There was no good way to capture GeoJSON in JSONLD (no nested arrays in JSONLD)
  - Now I think one could use @type:@json to capture GeoJSON opaquely in RDF (but semantic repos won't index it)
  - In any case, it's clear how to convert GeoSPARQL serializations from/to GeoJSON and there are many tools

What are all of the UNCEFACT classes and props related to geospatial?

uncefact:Circle , uncefact:GeographicalMultiCurve , uncefact:GeographicalPoint , uncefact:GeographicalMultiSurface , uncefact:GeographicalGrid , uncefact:GeographicalMultiPoint , uncefact:Polygon , uncefact:LinearRing , uncefact:GeographicalLine , uncefact:GeographicalSurface

Props?

remove prefix from cefactUNId value

uncefact:cefactUNId "cefact:UN01002518": maybe remove the word "cefact" from the value?

Add editors and chair to README

Introduce NDRs for adding links to other vocabs

A number of UN/CEFACT terms are defined elsewhere. In order to "fit in", we should include NDRs which define for specific elements terms such as broaderThan, narrowerThan, equivalentProperty, equivalentClass, subClass, subProperty linking to external terms.

For example uncl1153:Bill_of_lading_number rdfs:subPropertyOf schema:identifier

prop and class descriptions

I find many cases of misleading descriptions.
This is a very bad problem since it can't be fixed mechanically, but only through examination of the description of the combination "class-applicable props" (see #56 for an example of such investigation).

I'll keep adding below:

#52: formattedExpiryDateTime (and many others) should be expiryDateTimeFormat
#49: natureIdentificationCargo has a lot more convoluted descr than Cargo; geoCoordinateIdentificationGeographicalCoordinate has a convoluted descr
#56: Certification is not a process but a document

Include section on vocab update frequency

Vocab update frequency

Discussion, how linked data differs from centralized models.
Pros and cons of frequent/infrequent updates. Stability versus fringe increments.

cefactBieDomainClass should be object prop not string

The link from BasicBIE to AggregateBIE should be an object prop, not a string:

cefact:WorkItem_QuantityAnalysis.Details
        rdf:type                        uncefact:AggregateBIE .
cefact:WorkItem_QuantityAnalysis.Identification.Identifier
        rdf:type                        uncefact:BasicBIE ;
        uncefact:cefactBieDomainClass   "cefact:WorkItem_QuantityAnalysis.Details" # -> cefact:WorkItem_QuantityAnalysis.Details

Wiki page "UNCEFACT Issues"

@nissimsan I've reviewed the current UNCEFACT vocab and posted some notes at https://gist.github.com/VladimirAlexiev/618a9bddd6a949b75b37e983f0220417.

Maybe it's better to put this as a wiki page? This way we can track as individual issues: with issue IDs and checkboxes.
Any ideas for a better way of organizing this batch of issues is welcome!

I'll start posting individual issues based on the second section Issues

collaboration between UNECE Rec20 and QUDT

@dr-shorthair @steveraysteveray @nissimsan @mgh128

Now that we know UNECE are actively engaging in Linked Data (eg see #24),
what's the best way for the two initiatives to collaborate?

Here are some thoughts #24 (comment), please contribute more ideas or arguments.

use `#` or `/` rather than `/#`

@nissimsan

All of the vocabs use a double-char namespace delimiter: /# .

W3C and the Linked Data Patterns book (eg https://patterns.dataincubator.org/book/hierarchical-uris.html) have guidance on "slash vs hash" URLs. But nobody recommends using both together.

The recommendation is to use slash for large collections of terms and hash for smaller collections (since a client doesn't send the anchor after hash, so it'd get the whole collection at once).
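
The point about the fragment can be seen with Python's standard `urllib.parse.urldefrag`, using one of the Rec20 term URLs as the example. The part before the hash is what a client actually requests; the fragment stays on the client side.

```python
from urllib.parse import urldefrag

term = "https://service.unece.org/trade/uncefact/vocabulary/rec20/#kilogram_per_second"

# urldefrag splits a URL into the part that is sent in the HTTP request
# and the fragment, which the client keeps to itself.
document_url, fragment = urldefrag(term)

print(document_url)  # the whole-collection document that gets fetched
print(fragment)      # resolved locally inside that document
```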

The UNCEFACT vocabs are large collections, so I'd recommend slash
But your web pages are already structured to use hash, and you only offer JSONLD dumps of whole collections...
So it may be ok to use hash

But please consider #26

Prop Name Doublons

In addition to "IdentificationId" stuttering (#48), there are more insidious doublons in prop names that need fixing

duplicated word: formattedFormattedCancellationAnnouncedLaunchDateTime : but see #52 for a whole family
duplicated word: referenceReferenceTypeCode
duplicated phrase: documentLineDocumentLineStatusCode
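
A heuristic sketch in Python for spotting such doublons automatically: split the name into words and look for a word or phrase that is immediately repeated. It is only an illustration (`words` and `find_doublon` are hypothetical names), and it may flag names where the repetition is intentional.

```python
import re


def words(name):
    """Split a camelCase name into lowercase words."""
    return [w.lower() for w in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", name)]


def find_doublon(name):
    """Return the longest immediately repeated word sequence, or None."""
    ws = words(name)
    # Try the longest possible phrase first, then shorter ones.
    for n in range(len(ws) // 2, 0, -1):
        for i in range(len(ws) - 2 * n + 1):
            if ws[i:i + n] == ws[i + n:i + 2 * n]:
                return ws[i:i + n]
    return None


for prop in ["referenceReferenceTypeCode", "documentLineDocumentLineStatusCode", "versionId"]:
    print(prop, find_doublon(prop))
```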

UNECE (Service UNCEFACT) - Server Upgrade/Maintenance - Jan 28th - Jan 31st 2022

Dear Team,
I just got notified of some maintenance planned for this weekend. https://service.unece.org/trade/uncefact/vocabulary/uncefact/ will be unavailable during the weekend, Jan 28th - Jan 30th 2022.

Kevin.

remove parasitic word "Identification" from prop names

There are 103 props called ...Identification:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select * {
  ?x a rdf:Property
  filter(regex(str(?x),"Identification"))
}

Of them 99 are called "IdentificationId" (stuttering syndrome?)

Rename these 99 props by removing the parasitic word ...Identification.. from ...IdentificationId....

Eg uncefact:versionIdentificationId, bICIdentificationId
These led to prop duplication, which clearly shows the parasitic nature of the long word
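
The renaming, and the duplications it exposes, can be sketched in Python. This is a toy run over a handful of names from this issue, not over the full vocabulary; `strip_identification` and `collisions` are hypothetical helper names.

```python
from collections import defaultdict


def strip_identification(name):
    """Rename ...IdentificationId to ...Id."""
    return name.replace("IdentificationId", "Id")


def collisions(names):
    """Group names that end up identical after the renaming."""
    groups = defaultdict(list)
    for name in names:
        groups[strip_identification(name)].append(name)
    return {new: olds for new, olds in groups.items() if len(olds) > 1}


sample = ["versionId", "versionIdentificationId", "bICId", "bICIdentificationId", "lineId"]
print(collisions(sample))
```

Any name that shows up in `collisions` is a candidate for merging rather than plain renaming.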

Renaming ...Indicator to is... won't work well:

eg transportEquipmentSplitGoodsIndicator
cannot be named isTransportEquipmentSplitGoods
could

Add project scope section

Establish an initial description of project scope.

Decide on alignment to https://github.com/edi3/edi3-json-ld-ndr

Given there is now a formal UN repo for tracking, we probably want to bring over existing tickets from edi3.

use ontologically meaningful classes

Linked Data should describe real-world entities, not data records, data structures, messages...

I'd argue that a class like uncefact:LOCODE is not good.

LOCODEs are assigned to ADM1 country subdivisions, right?
Schema.org has such a class: http://schema.org/State
- It's named with abhorring US arrogance (why not Oblast or Land?) but otherwise is very much ok (see its definition).
- So I think LOCODEs should better be represented as schema:State (see example #28 (comment))
A class uncefact:LOCODE would be appropriate to describe that identifier (when was it assigned, who assigned it, what did they use for validation), as opposed to the real world entity that it signifies
LEI has stuff like this, see the distinction between a Legal Entity and its LEI and its national ID (nodes marked (I) in the graph below)
But I don't think you have such needs for LOCODE codes

publish as individual resources

Currently you publish most UNECE vocabs as one big page.

providing useful data for individual resources (called "entity RDF") is one of the basic tenets of the web architecture and Linked Data. It'd make semantic web crawlers unhappy to get megabytes of data when they want to use one of your entities
having individual pages makes it easier and a lot more valuable for other knowledge graphs to link to your data. Eg consider:
- https://www.wikidata.org/wiki/Property:P6512 (UNECE Rec20 code) has no formatter URL (because #24?)
- https://www.wikidata.org/wiki/Property:P1937 (UN LOCODE) uses formatter URL https://locode.info/$1 because these guys have individual pages (eg https://locode.info/CHGVA).
  - UNECE had them at some point but now https://unece.org/fileadmin/DAM/cefact/locode/id/$1 is deprecated on WD

Or maybe publish both:

small pages & RDF files for each individual resource
a big page for the vocab URL, assembled from the small pages

void values of uncefact:TDED

some uncefact:TDED "." are void (empty) values: skip them.

And what is TDED? use words not abbreviations

Address lines

Address: the breakdown lineOne, lineTwo, lineThree, lineFour is

pretty random: why not 6 or 10?
- Dun & Bradstreet data has StreetAddressLine1..4 in one case, but MailingLabelLine1..8 in another case. How do you squeeze 8 DnB lines into the uncefact lines?
old-fashioned: xsd:string allows multiple lines
doesn't match popular address ontologies: schema.org, W3C LOCN, FIBO (edmcouncil/fibo#237)

So merge to just one prop eg addressLines.

This is low priority since it contradicts the desire to map UNCEFACT models as-is.

BIE descriptions vs prop descriptions

https://gist.github.com/VladimirAlexiev/618a9bddd6a949b75b37e983f0220417#prop-descriptions

Some BIE Descriptions seem to be richer than prop descriptions. Eg

cefact:Academic_Qualification.AbbreviatedName.Text a uncefact:BasicBIE ;
  rdfs:comment "The abbreviated name, expressed as text, of this academic qualification." ;
uncefact:abbreviatedName a rdf:Property ;
  rdfs:comment "An abbreviated name, expressed as text." ;

The BIE description could be used when the BIE is used in one term, and the term is applicable to only one class (as is the case for uncefact:abbreviatedName: applicable only to uncefact:Qualification)
1242 (or 1337?) of 1747 props satisfy this condition:

prefix uncefact: <https://service.unece.org/trade/uncefact/trade/uncefact/vocabulary/uncefact#> 
prefix schema: <http://schema.org/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select * {
  ?prop uncefact:cefactElementMetadata ?bie
  filter not exists {?prop schema:domainIncludes ?dom1,?dom2 filter (?dom1 != ?dom2)}
  filter not exists {?prop uncefact:cefactElementMetadata ?bie2 filter (?bie != ?bie2)}
  ?prop rdfs:comment ?propDescr.
  ?bie rdfs:comment ?bieDescr.
}

This requires further examination and unification of descriptions. Eg in the pair below:

prop Accreditation: "An official recognition awarded to a person, organisation or thing, such as a building or product, to certify that a certain level of attainment has been achieved."
BIE Certified_Accreditation.Details: "A certified recognition that provides evidence of a level of competency in a given area, such as certifying a level of skill in a trade."

decide numeric datatype

Related to #45:

Explicitly decide and consistently use a numeric datatype.
Don't decide as late as we did in EPCIS: gs1/EPCIS#201

RDF supports a full complement of XSD numeric types.
The most widely used are xsd:integer and xsd:decimal, which are infinite-precision
JSON has only builtin numbers, which JSON-LD maps to xsd:integer (no fractional part) or xsd:double (otherwise).
But using bare numbers in JSON (especially large numbers) is calling for trouble, see w3c/json-ld-syntax#387
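
A small Python illustration of why bare JSON numbers are risky once they pass through binary doubles, and how a decimal type avoids it. The field names `amount` and `tax` are made up for the example; `json.loads(..., parse_float=Decimal)` is standard library behaviour.

```python
import json
from decimal import Decimal

# A large integer that a double cannot represent exactly.
big = 2**53 + 1
lost = int(float(big))  # what survives a round trip through a double

# Hypothetical payload with bare JSON numbers.
text = '{"amount": 0.1, "tax": 0.2}'
as_float = json.loads(text)                         # parsed as binary doubles
as_decimal = json.loads(text, parse_float=Decimal)  # parsed as exact decimals

float_sum = as_float["amount"] + as_float["tax"]
decimal_sum = as_decimal["amount"] + as_decimal["tax"]

print(big, lost)
print(float_sum, decimal_sum)
```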

Formalize project exit criteria (definition of done)

UN LOCODE namespaces

ADAVL "Andorra la Vella" is published as https://service.unece.org/trade/uncefact/vocabulary/unlocode-ad/ADAVL.

All other vocabs are published at single big pages, but I guess you found https://service.unece.org/trade/uncefact/vocabulary/unlocode/ to grow too big (Wikidata says over 100k entries, see https://www.wikidata.org/wiki/Property:P1937)
Well, that's inconsistent
To resolve LOCODE to this URL, one would have to split and manipulate the first two chars

The smart people at Wikidata can do such switcheroo

see https://www.wikidata.org/wiki/Property:P3608
eg BG200356710 (Ontotext) goes to https://wikidata-externalid-url.toolforge.org, which does some splitting and results in https://ec.europa.eu/taxation_customs/vies/vatResponse.html?memberStateCode=BG&number=200356710
- (the specific toolforge URL is
  https://wikidata-externalid-url.toolforge.org/?p=3608&url_prefix=http://ec.europa.eu/taxation_customs/vies/vatResponse.html?memberStateCode=&id=BG200356710)

But making consumers jump through hoops just ain't right.
If you do #26, that won't be necessary
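
For illustration, this is the kind of splitting a consumer currently has to do, sketched in Python. The URL shape is taken from the ADAVL example above and is an assumption about the other country namespaces; `locode_url` is a hypothetical helper.

```python
BASE = "https://service.unece.org/trade/uncefact/vocabulary"


def locode_url(locode):
    """Build the per-country vocabulary URL for a 5-character UN/LOCODE."""
    code = locode.replace(" ", "").upper()  # accept "CH GVA" as well as "CHGVA"
    if len(code) != 5:
        raise ValueError(f"not a 5-character LOCODE: {locode!r}")
    country = code[:2].lower()  # the first two chars select the namespace
    return f"{BASE}/unlocode-{country}/{code}"


print(locode_url("ADAVL"))
```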

fix names/descriptions of "Identification" props

Follow-up from #48

Here are the remaining 4 props:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select * {
  ?x a rdf:Property
  filter(regex(str(?x),"Identification"))
  filter(!regex(str(?x),"IdentificationId"))
}

Proposed renaming (...Code can only be a string, not an object!)

uncefact:allowanceChargeIdentificationTypeCode "The code specifying the type of this trade allowance charge."
range UNCL5189Code "Code specifying the identification of an allowance or charge."
rename to allowanceOrChargeType
uncefact:geoCoordinateIdentificationGeographicalCoordinate range GeoCoordinate:
rename to geoCoordinate
and fix description not to talk about plurals and identifications!
uncefact:natureIdentificationCargo: descr sounds like it's free text "Transport cargo details of the consignment or consignment item sufficient to identify its nature for customs, statistical or transport purposes."
but in fact it has range uncefact:Cargo,
so rename to cargo
- The descr of class Cargo is simply "Goods being transported." with none of that "sufficient to identify its nature" fuzziness?
  Resolve that difference
uncefact:uNDGIdentificationCode: "United Nations Dangerous Goods (UNDG) number":
rename to undgCode

UN/CEFACT vocab fit in the larger linked data world

Include architectural section positioning UN/CEFACT vocab among other vocabs. This section will require discussion and solutioning. Include discussion of:

specialized versus generic vocabularies
term overrides by us
term overrides by others

Parasitic Word "Formatted"

Many dateTime props have names containing "formatted".

As opposed to what, cuneiform? :-)
You should indicate the required format with rangeIncludes xsd:dateTime, not with the property name

This query finds them, together with a better-named prop when it exists:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX uncefact: <https://service.unece.org/trade/uncefact/trade/uncefact/vocabulary/uncefact#>
select ?x ?y {
  ?x a rdf:Property
  bind(strafter(str(?x),str(uncefact:)) as ?xName)
  filter(regex(?xName,"formatted","i"))
  optional {
    ?y a rdf:Property
    bind(strafter(str(?y),str(uncefact:)) as ?yName)
    filter(?xName != ?yName && regex(?xName,?yName,"i"))
  }
}

I think all x should be simplified by removing the parasitic word "formatted", and merged with y when indicated:

x	y	note
formattedExpiryDateTime	expiryDate	maybe merge all 3 but see below
formattedExpiryDateTime	expiryDateTime
formattedFormattedCancellationAnnouncedLaunchDateTime
formattedFormattedIssueDateTime	issueDateTime
formattedFormattedLatestProductDataChangeDateTime
formattedFormattedPickUpAvailabilityDateTime	formattedPickUpAvailabilityDateTime	merge & rename to `pickUpAvailabilityDateTime`
formattedPickUpAvailabilityDateTime
formattedFormattedReceivedDateTime	receivedDateTime
formattedFormattedUltimateShipToDeliveryDateTime	ultimateShipToDeliveryDateTime
formattedJurisdictionEntryDateTime
formattedLastRegisteredYearDateTime	lastRegisteredYearDateTime
formattedObtainedDateTime
formattedScheduledArrivalRelatedDateTime	arrivalRelatedDateTime
formattedScheduledArrivalRelatedDateTime	scheduledArrivalRelatedDateTime
formattedScheduledDepartureRelatedDateTime	departureRelatedDateTime
formattedScheduledDepartureRelatedDateTime	scheduledDepartureRelatedDateTime

This puppy is really messed up:

formattedExpiryDateTime "The date, time, date time or other date time value when this certified accreditation expires."
but range is UNCL2379Code "Code specifying the representation of a date, time or period."
So should this be expiryDateTime range xsd:dateTime, or expiryDateTimeFormat range UNCL2379Code???

duplicated props

https://gist.github.com/VladimirAlexiev/618a9bddd6a949b75b37e983f0220417#duplicated-props

The props in these pairs are duplicated, so should be merged

bICId: "The unique Bank Identification Code (BIC) as defined in ISO 9362 for this creditor or debtor financial institution."
- bICIdentificationId: "The Bank Identifier Code (BIC) as defined by ISO 9362 (Banking telecommunication messages, Bank Identifier Codes) for this financial identity."
versionId: "An identifier of the version."
- versionIdentificationId: "The unique identifier for the version of this exchanged document."
attachedBinaryFile: "A binary file attached to this exchanged or referenced document."
- attachmentBinaryObject: "A binary object that is attached or otherwise appended to this referenced document."
referenceDocument
- referencedDocument: haven't checked description

I'm sure there are more cases, how could we find them?

This is according to principle 4. deduplication of #33 (tech-spec.md)
