Comments (20)

mgh128 avatar mgh128 commented on July 28, 2024 3

Hi @nissimsan , @VladimirAlexiev

Thanks for opening up the discussion on this. We certainly welcome UN/CEFACT embracing Linked Data for the publication of the details about its unit of measure codes - definitely a big improvement on a spreadsheet.

As @VladimirAlexiev mentions, UN/CEFACT Rec20 codes are widely used in e-commerce and referenced from GS1 standards. They're compact but not always intuitive, e.g. who would think that '74' represents 'millipascal' - but just imagine if UN/CEFACT provided a consistent Web URI stem to which any Rec20 code could be appended - and also declared owl:sameAs relationships, so that

rec20:74  owl:sameAs  rec20:millipascal .
rec20:KGM  owl:sameAs  rec20:kilogram  .

Then every compact, cryptic Rec20 code becomes really easy to look up. Not sure what 'A90' is?
Just look up rec20:A90 and find that it actually corresponds to a gigawatt, get the conversion factors etc.

Looking through https://service.unece.org/trade/uncefact/vocabulary/rec20.jsonld we find entries such as:

{"@id":"rec20:milligram","@type":"uncefact:UNECERec20Code","rdfs:comment":"","rdf:value":"MGM","uncefact:levelCategory":"1S","uncefact:symbol":"mg","uncefact:conversionFactor":"10⁻⁶ kg","uncefact:status":""},

That's somewhat helpful, but what might be even more helpful to software that ingests the UN/CEFACT Linked Data is to provide the multiplicative and additive conversion factors as pure numeric values rather than strings such as "10⁻⁶ kg" that require parsing.
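To illustrate the parsing burden mentioned above, here is a minimal Python sketch of what a consumer has to do today to get a number out of a string such as "10⁻⁶ kg". The function name `parse_power_of_ten` and the regular expression are hypothetical and cover only this one pattern; other strings in the file (for example "5/9 x K") would each need their own handling.

```python
import re

# Map Unicode superscript characters to ASCII so the exponent can be parsed.
SUPERSCRIPTS = str.maketrans("⁰¹²³⁴⁵⁶⁷⁸⁹⁻⁺", "0123456789-+")


def parse_power_of_ten(factor: str) -> tuple[float, str]:
    """Parse strings shaped like '10⁻⁶ kg' into (multiplier, unit symbol).

    Hypothetical helper: it only understands a bare power of ten followed
    by a unit symbol, and rejects everything else.
    """
    match = re.fullmatch(r"10([⁰¹²³⁴⁵⁶⁷⁸⁹⁻⁺]+)\s+(\S+)", factor.strip())
    if match is None:
        raise ValueError(f"unsupported conversion factor string: {factor!r}")
    exponent = int(match.group(1).translate(SUPERSCRIPTS))
    return 10.0 ** exponent, match.group(2)


multiplier, unit_symbol = parse_power_of_ten("10⁻⁶ kg")
```

A plain numeric property would make this helper unnecessary, which is the point of the suggestion.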

If we compare the entry above with the equivalent we find at QUDT for milligram, we see at http://qudt.org/vocab/unit/MilliGM various details including a purely numeric value for the conversion multiplier.

If we specify Accept: text/turtle in our HTTP Content Negotiation, we get the following:

@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl:   <http://www.w3.org/2002/07/owl#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .

<http://qudt.org/vocab/unit/MilliGM>
        a                 <http://qudt.org/schema/qudt/Unit> ;
        rdfs:isDefinedBy  <http://qudt.org/2.1/vocab/unit> ;
        rdfs:label        "Milligram"@en ;
        <http://qudt.org/schema/qudt/conversionMultiplier>
                0.000001 ;
        <http://qudt.org/schema/qudt/hasDimensionVector>
                <http://qudt.org/vocab/dimensionvector/A0E0L0I0M1H0T0D0> ;
        <http://qudt.org/schema/qudt/hasQuantityKind>
                <http://qudt.org/vocab/quantitykind/Mass> ;
        <http://qudt.org/schema/qudt/iec61360Code>
                "0112/2///62720#UAA815" ;
        <http://qudt.org/schema/qudt/plainTextDescription>
                "0.000001-fold of the SI base unit kilogram" ;
        <http://qudt.org/schema/qudt/ucumCode>
                "mg"^^<http://qudt.org/schema/qudt/UCUMcs> ;
        <http://qudt.org/schema/qudt/uneceCommonCode>
                "MGM" .

similar to what we can find at http://qudt.org/2.1/vocab/unit , where we find this set of triples:

unit:MilliGM
  a qudt:Unit ;
  qudt:conversionMultiplier 0.000001 ;
  qudt:hasDimensionVector qkdv:A0E0L0I0M1H0T0D0 ;
  qudt:hasQuantityKind quantitykind:Mass ;
  qudt:iec61360Code "0112/2///62720#UAA815" ;
  qudt:plainTextDescription "0.000001-fold of the SI base unit kilogram" ;
  qudt:ucumCode "mg"^^qudt:UCUMcs ;
  qudt:uneceCommonCode "MGM" ;
  rdfs:isDefinedBy <http://qudt.org/2.1/vocab/unit> ;
  rdfs:label "Milligram"@en ;
.

while the entry for degrees Fahrenheit also shows a qudt:conversionOffset

unit:DEG_F
  a qudt:Unit ;
  dcterms:description "\\(\\textbf{Degree Fahrenheit} is an Imperial unit for 'Thermodynamic Temperature' expressed as \\(\\,^{\\circ}{\\rm F}\\)"^^qudt:LatexString ;
  qudt:conversionMultiplier 0.5555555555555556 ;
  qudt:conversionOffset 459.669607 ;
  qudt:definedUnitOfSystem sou:IMPERIAL ;
  qudt:definedUnitOfSystem sou:USCS ;
  qudt:expression "\\(degF\\)"^^qudt:LatexString ;
  qudt:hasDimensionVector qkdv:A0E0L0I0M0H1T0D0 ;
  qudt:hasQuantityKind quantitykind:Temperature ;
  qudt:iec61360Code "0112/2///62720#UAA039" ;
  qudt:omUnit <http://www.ontology-of-units-of-measure.org/resource/om-2/degreeFahrenheit> ;
  qudt:ucumCode "[degF]"^^qudt:UCUMcs ;
  qudt:uneceCommonCode "FAH" ;
  qudt:unitOfSystem sou:IMPERIAL ;
  rdfs:isDefinedBy <http://qudt.org/2.1/vocab/unit> ;
  rdfs:label "Degree Fahrenheit"@en ;
.

By comparison with the entry for degrees Celsius:

unit:DEG_C
  a qudt:DerivedUnit ;
  a qudt:Unit ;
  dcterms:description "\\(\\textit{Celsius}\\), also known as centigrade, is a scale and unit of measurement for temperature. It can refer to a specific temperature on the Celsius scale as well as a unit to indicate a temperature interval, a difference between two temperatures or an uncertainty. This definition fixes the magnitude of both the degree Celsius and the kelvin as precisely 1 part in 273.16 (approximately 0.00366) of the difference between absolute zero and the triple point of water. Thus, it sets the magnitude of one degree Celsius and that of one kelvin as exactly the same. Additionally, it establishes the difference between the two scales' null points as being precisely \\(273.15\\,^{\\circ}{\\rm C}\\).</p>"^^qudt:LatexString ;
  qudt:allowedUnitOfSystem sou:SI ;
  qudt:conversionMultiplier 1.0 ;
  qudt:conversionOffset 273.15 ;
  qudt:dbpediaMatch "http://dbpedia.org/resource/Celsius"^^xsd:anyURI ;
  qudt:expression "\\(degC\\)"^^qudt:LatexString ;
  qudt:guidance "<p>See NIST section <a href=\"http://physics.nist.gov/Pubs/SP811/sec04.html#4.2.1.1\">SP811 section 4.2.1.1</a></p>"^^rdf:HTML ;
  qudt:guidance "<p>See NIST section <a href=\"http://physics.nist.gov/Pubs/SP811/sec06.html#6.2.8\">SP811 section 6.2.8</a></p>"^^rdf:HTML ;
  qudt:hasDimensionVector qkdv:A0E0L0I0M0H1T0D0 ;
  qudt:hasQuantityKind quantitykind:Temperature ;
  qudt:iec61360Code "0112/2///62720#UAA033" ;
  qudt:informativeReference "http://en.wikipedia.org/wiki/Celsius?oldid=494152178"^^xsd:anyURI ;
  qudt:latexDefinition "\\(\\,^{\\circ}{\\rm C}\\)"^^qudt:LatexString ;
  qudt:omUnit <http://www.ontology-of-units-of-measure.org/resource/om-2/degreeCelsius> ;
  qudt:ucumCode "Cel"^^qudt:UCUMcs ;
  qudt:uneceCommonCode "CEL" ;
  qudt:unitOfSystem sou:CGS ;
  rdfs:isDefinedBy <http://qudt.org/2.1/vocab/unit> ;
  rdfs:label "Degree Celsius"@en ;
  skos:altLabel "degree-centigrade" ;
.

we can determine that to convert degrees Celsius to Kelvin, we add the qudt:conversionOffset then multiply by the qudt:conversionMultiplier

To convert 32 degrees Fahrenheit to Kelvin, we add the qudt:conversionOffset (459.669607) then multiply by the qudt:conversionMultiplier (0.5555555555555556), resulting in 0.5555555555555556 * (32+459.669607) = 273.1497816667, i.e. essentially the same value as 273.15 K.
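The rule described above (add the offset, then multiply) can be sketched in a few lines of Python. The function name `to_si` is hypothetical; the numeric values are copied from the QUDT entries quoted in this comment.

```python
def to_si(value: float, multiplier: float, offset: float = 0.0) -> float:
    """Convert a value to the SI unit: add the offset, then multiply.

    Hypothetical helper following the rule described in the text above.
    """
    return (value + offset) * multiplier


# Values copied from the unit:DEG_F, unit:DEG_C and unit:MilliGM entries above.
fahrenheit_in_kelvin = to_si(32, multiplier=0.5555555555555556, offset=459.669607)
celsius_in_kelvin = to_si(0, multiplier=1.0, offset=273.15)
milligram_in_kilogram = to_si(250, multiplier=0.000001)

print(fahrenheit_in_kelvin)  # close to, but not exactly, 273.15
```

Units without an offset, such as the milligram, simply use the default offset of zero.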

Using the triple point of water, defined as 273.16 K and 32.01°F, I calculate a slightly different conversion offset of (273.16)*9/5-32.01 = 459.678 rather than 459.669607. Still, this demonstrates the principles of how to use qudt:conversionOffset and qudt:conversionMultiplier, which I hope you'd agree is more machine-friendly than something like "5/9 x K". The entry for degrees Fahrenheit within https://service.unece.org/trade/uncefact/vocabulary/rec20.jsonld currently does not even mention that a conversion offset is needed. Oops! The entry is currently this:

{"@id":"rec20:degree_Fahrenheit","@type":"uncefact:UNECERec20Code","rdfs:comment":"Refer ISO 80000-5 (Quantities and units — Part 5: Thermodynamics)","rdf:value":"FAH","uncefact:levelCategory":"2","uncefact:symbol":"°F","uncefact:conversionFactor":"5/9 x K","uncefact:status":""},

I very much hope that UN/CEFACT can take a look at what QUDT.org have done and find a constructive way to work with them to make the UN/CEFACT Linked Data vocabulary for Rec20 the best it can be. A few of us at GS1 would also be very happy to help with that - and to assist with discussion, comparison, testing or conversion tools. Here's one we already developed before any Linked Data for UN/CEFACT Rec20 was publicly available https://gs1.github.io/UnitConverterUNECERec20/
It's not complete but has most of the units we're likely to need for most kinds of sensor data used for monitoring the condition of perishable products in supply chains or even the condition of locomotive components.

@nissimsan - please don't take any of the above as criticism - it's only intended as constructive feedback. Please let us know how we / GS1 can help to improve the UN/CEFACT Linked Data vocabulary for Rec20 units.

from spec-jsonld.

VladimirAlexiev avatar VladimirAlexiev commented on July 28, 2024 1

I like the idea to use URLs both with codes and names (and I like the shortened prefix!) :

rec20:74  owl:sameAs  rec20:millipascal .
rec20:KGM  owl:sameAs  rec20:kilogram  .

But that also means both of these URLs should resolve and return data, thus some extra work on the server.

@mgh128 the above is great input, but deserves an issue of its own. And could you put ticks around json and turtle code?
And if you add the language (js or ttl), it will be code-highlighted.

VladimirAlexiev avatar VladimirAlexiev commented on July 28, 2024 1

@nissimsan

isn't QUDT a more authoritative semantic foundation? I'm in the process of figuring out how to use it still.

I added an example at the QUDT issue.
It is a more authoritative semantic foundation.
(There are other UoM ontologies to choose from, some have more units or better organization, but QUDT is one of the best).

The most important things that QUDT has that Rec20 doesn't, are:

  • Dimension Vectors
  • links to tell which units are comparable (eg many dimensionless units are not comparable, eg count vs logarithmic units like Bel)
  • extra info and links to other UoM systems, including Rec20 and UCUM

Rec20 shouldn't try to recreate these features and extra fields of QUDT.
IMHO the best you can do is:

  • add values to their data property qudt:uneceCommonCode
  • express that also as an object property (link), eg see below
unit:MilliGM
  qudt:uneceCommonCode "MGM";
  qudt:uneceUnit rec20:MGM.
  • add reciprocal links from Rec20 to QUDT, eg
rec20:MGM schema:sameAs unit:MilliGM
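As a rough sketch of how the links suggested above could be generated in bulk, the Python below emits both directions from one mapping table. The function name `link_triples` and the mapping are hypothetical; `qudt:uneceUnit` is the property proposed in this comment and is not assumed to exist in QUDT. The three code pairs are taken from the QUDT entries quoted earlier in the thread.

```python
# Hypothetical mapping from QUDT unit local names to Rec20 common codes.
QUDT_TO_REC20 = {"MilliGM": "MGM", "DEG_C": "CEL", "DEG_F": "FAH"}


def link_triples(mapping: dict[str, str]) -> list[str]:
    """Emit Turtle-style lines linking QUDT units and Rec20 codes both ways."""
    lines = []
    for qudt_name, code in sorted(mapping.items()):
        # Data property plus the proposed object property on the QUDT side.
        lines.append(
            f'unit:{qudt_name} qudt:uneceCommonCode "{code}" ; '
            f"qudt:uneceUnit rec20:{code} ."
        )
        # Reciprocal link on the Rec20 side.
        lines.append(f"rec20:{code} schema:sameAs unit:{qudt_name} .")
    return lines


for line in link_triples(QUDT_TO_REC20):
    print(line)
```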

Rec20 also has many common/weird units that QUDT doesn't, eg "military sticks".
You could gradually contribute those to QUDT, but need to write a semantic description for each one,
eg see one that I contributed: qudt/qudt-public-repo#285

steveraysteveray avatar steveraysteveray commented on July 28, 2024 1

Just catching up with this thread. A couple of items that might be useful to you:

  1. This wiki entry discusses the mathematics behind our use of conversionMultiplier and conversionOffset.
  2. This wiki entry describes the QUDT naming and design rules for units.

Looking forward to even better integration between these vocabularies!

nissimsan avatar nissimsan commented on July 28, 2024

@VladimirAlexiev,

First of all welcome!!

Great input - exactly what we're looking for. The UN/CEFACT vocab is based on a non-trivial transformation from the legacy CCTS model to what you see here based on NDRs. Switching from "transparent layers" to object orientation is, yea, non-trivial, so I won't make promises we may not be able to keep. But good requirements are a really good starting point, and I will promise that this will be discussed. So pls stay tuned...

Regarding units, though, I have been pondering if http://www.qudt.org/ isn't a more authoritative semantic foundation? I'm in the process of figuring out how to use it still. Thoughts?

dr-shorthair avatar dr-shorthair commented on July 28, 2024

isn't QUDT a more authoritative semantic foundation?

QUDT is more sound semantically.
However, its authoritativeness is not assured: it is a community project, without much of a business model or financial backing in practice. So while it currently has a group of enthusiastic custodians, it is vulnerable in the way that all community projects are.

For comparison, UCUM has better institutional backing (NLM via Regenstrief) but it was not 'born semantic' in the way that QUDT is. There are some nice tools around UCUM

A team from OBO has built a web app to bridge various web units systems - see https://units-of-measurement.org/
It uses UCUM codes as the basis for the URIs. Some of the back-end is dynamic, though it also relies on a set of mapping tables that were built manually.

VladimirAlexiev avatar VladimirAlexiev commented on July 28, 2024

@mgh128 found 14 invalid URLs amongst the English-name URLs: https://github.com/mgh128/UnitUnity/blob/main/UNECE_Rec20_Bad_IRIs.txt.
These need to be sanitized before they are used in owl:sameAs statements.
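One possible way to sanitize name-based IRIs is to percent-encode the local part. This is only a sketch: the 14 problem names are not reproduced in this thread, so the input below is invented for illustration, and both the namespace constant and the function name `name_to_iri` are assumptions.

```python
from urllib.parse import quote

# Assumed namespace for illustration; the thread does not fix the final form.
REC20_NS = "https://service.unece.org/trade/uncefact/vocabulary/rec20/"


def name_to_iri(name: str) -> str:
    """Build an IRI from an English unit name, escaping unsafe characters.

    Spaces become underscores (as in rec20:degree_Fahrenheit); everything
    outside letters, digits and '_.-~' is percent-encoded.
    """
    local_name = name.strip().replace(" ", "_")
    return REC20_NS + quote(local_name, safe="")


print(name_to_iri("degree Fahrenheit"))
print(name_to_iri("a/b [c]"))  # invented name containing problem characters
```

Building the IRIs from the code values instead of the names would avoid the problem altogether, since the codes are short alphanumeric strings.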

nissimsan avatar nissimsan commented on July 28, 2024

@VladimirAlexiev, thank you for this. We're discussing this now, and there is general agreement to switch to building these IRIs from the codes (value) instead of the name.

We will do this not only for Rec20, but generally for all recommendations.

The likes of https://service.unece.org/trade/uncefact/vocabulary/rec20/#kilometre_KMT is most likely how we're getting it in the source file.

@kshychko, assigning you for this.

mgh128 avatar mgh128 commented on July 28, 2024

Dear @nissimsan
Many thanks for this. I strongly support this change. It will make the UN ECE Rec20 and other code lists much easier to look up as terms in a Web vocabulary if each code list has a common URI stem to which a compact alphanumeric code value can simply be appended to obtain useful information.

If you or @kshychko would like any help in preparing this, I'm happy to help.

mgh128 avatar mgh128 commented on July 28, 2024

However please reconsider whether you really want https://service.unece.org/trade/uncefact/vocabulary/rec20/#kilometre_KMT
rather than
https://service.unece.org/trade/uncefact/vocabulary/rec20/KMT

With something like https://service.unece.org/trade/uncefact/vocabulary/rec20/KMT you can:

  1. Append any Rec20 code after https://service.unece.org/trade/uncefact/vocabulary/rec20/ to find out what it means.
    For example, who knows instinctively what 'A90' means? Yes, of course it's totally obvious that it's gigawatt (not!)
    So if you go with something like https://service.unece.org/trade/uncefact/vocabulary/rec20/#gigawatt_A90 then you're depending on people already knowing that A90 = gigawatt, whereas if you go with
    https://service.unece.org/trade/uncefact/vocabulary/rec20/A90 then they don't need to know that in advance - you've then provided a simple general-purpose way for a human or a machine to look up 'A90' or any other non-intuitive Rec20 code value by simply appending it to a common consistent URI stem such as https://service.unece.org/trade/uncefact/vocabulary/rec20/

  2. You can serve an individual result for 'KMT' (or 'A90' or anything else) without needing to serve the entire set of all Rec20 codes (and their data) each time. So I urge you to please consider using slash instead of hash, so that you can serve details for each Rec20 code value individually.
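The second point follows from how HTTP clients treat fragments: the part after '#' is not sent to the server, so every hash URI under the same document triggers the same request. A small Python sketch makes this visible; the function name `server_visible_url` is hypothetical, and `urldefrag` comes from the standard library.

```python
from urllib.parse import urldefrag

STEM = "https://service.unece.org/trade/uncefact/vocabulary/rec20/"


def server_visible_url(uri: str) -> str:
    """Return the part of a URI that a client actually requests from the server."""
    return urldefrag(uri).url


# Hash URIs: both codes collapse to the same request for the whole vocabulary.
print(server_visible_url(STEM + "#KMT"))
print(server_visible_url(STEM + "#A90"))

# Slash URIs: each code is a distinct resource the server can answer individually.
print(server_visible_url(STEM + "KMT"))
print(server_visible_url(STEM + "A90"))
```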

dr-shorthair avatar dr-shorthair commented on July 28, 2024

Yes, "/" URIs are definitely preferable to "#" URIs for large URI sets like this.

nissimsan avatar nissimsan commented on July 28, 2024

@mgh128, I was commenting on a couple of different things. I was writing "live" during our call, so it went a bit fast. :) Let me elaborate:

First and foremost, we will change the URI to https://service.unece.org/trade/uncefact/vocabulary/rec20/#KMT, constructing the URI from the value instead of the description.

The link I included was in response to @VladimirAlexiev's question "why KMT is appended?", and we believe this is because the description string we get is "kilometre_KMT". Ugly, but if this is how we are getting it from upstream (the UN/CEFACT modelling teams), we have decided to take a hands off approach and not fiddle with the actual semantic content.

The / vs # is actually the topic of #25. Certainly, the way it looks today /# is wrong. Now, we actually agreed Friday to switch to # - but I'm hearing you both (@dr-shorthair too) argue for / instead.
Your input is awesome! If you don't mind, I'll just make sure it is referenced from the relevant issue - and let's continue the discussion there, pls.

mgh128 avatar mgh128 commented on July 28, 2024

@nissimsan - many thanks for these additional clarifications - great to see that you're now considering a common URI stem that can simply be appended by any Rec20 unit code to retrieve relevant data - even better if you're also open to considering a trailing slash rather than hash, so that you can serve data for an individual unit code. Please feel free to link to the comments from myself and @dr-shorthair from issue #25.

Regarding #67 I'd be happy to prepare three examples of how I think we could improve the data graphs for Rec20 unit codes 'CEL', 'FAH' and 'KEL' to illustrate the use of numerical additive and multiplicative conversion factors and how to use these, similar to the approach taken in QUDT. I hope that this could be considered in a future update, to make the Linked Data vocabulary for UN CEFACT Rec20 unit codes more useful to software that automates the conversion between different units of measure.

nissimsan avatar nissimsan commented on July 28, 2024

@kshychko, the sooner we get your code on the repo, the more transparent our process will be, and the higher the chance that someone like @mgh128 will make pull requests representing his great input! :)

mgh128 avatar mgh128 commented on July 28, 2024

@nissimsan @kshychko - What I have in mind is something like this extract https://github.com/mgh128/UnitConverterUNECERec20/blob/master/Rec20-extract.jsonld
https://mgh128.github.io/UnitConverterUNECERec20/Rec20-extract.jsonld

I have proposed some additional triples using the following proposed new predicates:

  • uncefact:conversionMultiplier
  • uncefact:conversionDivisor
  • uncefact:conversionOffset
  • uncefact:siUnit

The logic is as follows:

  1. It is possible to convert between any two units that share the same value of uncefact:siUnit , which should be the UN ECE Rec20 URI for the corresponding SI unit that serves as the basis for the conversion factors. In this extract, we see that rec20:KEL , rec20:CEL, rec20:FAH and rec20:A48 all share the same value of uncefact:siUnit, namely rec20:KEL .

  2. Given a numerical value such as 37 and a source unit code such as "CEL", we can convert to a target unit code such as "FAH" as follows:

  3. To convert from the source unit to the SI unit:

    • add uncefact:conversionOffset of sourceUnit
    • multiply by uncefact:conversionMultiplier of sourceUnit
    • divide by uncefact:conversionDivisor of sourceUnit

  4. To convert from the SI unit to the target unit:

    • multiply by uncefact:conversionDivisor of targetUnit
    • divide by uncefact:conversionMultiplier of targetUnit
    • subtract uncefact:conversionOffset of targetUnit

  5. Applying this to our example numeric value of 37 , sourceUnit = "CEL" and targetUnit = "FAH",
    step 3 results in conversion from (37, "CEL") to ( ( (37+273.15) * 1 / 1), "KEL") = (310.15, "KEL")
    then step 4 results in conversion from (310.15, "KEL") to ( (((310.15*9)/5)-459.67),"FAH") = (98.6, "FAH")
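The steps above can be sketched in Python as follows. This is a minimal illustration, not the code behind the demo page: the `Unit` class, the `UNITS` table and the `convert` function are hypothetical names, and the table holds only the values used in the worked example plus one mass unit to show the siUnit check. Exact rational arithmetic is used so that 5/9 is not rounded.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Unit:
    si_unit: str                        # stands in for uncefact:siUnit
    multiplier: Fraction = Fraction(1)  # uncefact:conversionMultiplier
    divisor: Fraction = Fraction(1)     # uncefact:conversionDivisor
    offset: Fraction = Fraction(0)      # uncefact:conversionOffset


# Illustrative values taken from the worked example in the steps above.
UNITS = {
    "KEL": Unit("KEL"),
    "CEL": Unit("KEL", offset=Fraction("273.15")),
    "FAH": Unit("KEL", Fraction(5), Fraction(9), Fraction("459.67")),
    "KGM": Unit("KGM"),
}


def convert(value, source: str, target: str) -> Fraction:
    """Convert value from the source unit code to the target unit code."""
    src, tgt = UNITS[source], UNITS[target]
    if src.si_unit != tgt.si_unit:
        raise ValueError(f"{source} and {target} are not interconvertible")
    # Step 3: source unit -> SI unit.
    si_value = (Fraction(str(value)) + src.offset) * src.multiplier / src.divisor
    # Step 4: SI unit -> target unit.
    return si_value * tgt.divisor / tgt.multiplier - tgt.offset


print(float(convert(37, "CEL", "FAH")))  # 98.6
```

Calling `convert(1, "KGM", "KEL")` raises a `ValueError`, because the two units do not share the same SI unit.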

I've proposed expressing a separate multiplier and divisor so that we can express fractions such as 5/9 exactly and avoid rounding errors that would occur if we wrote something like 0.55555555555555555 as an approximation.

mgh128 avatar mgh128 commented on July 28, 2024

https://mgh128.github.io/UnitConverterUNECERec20/index2.html provides a very simple demo using the approach outlined above to convert between interconvertible units of measure. I've added some data graphs for units of mass, to show how the proposed property uncefact:siUnit can be used to group units that are interconvertible so that we can warn about attempts to convert between non-interconvertible units of measure.

mgh128 avatar mgh128 commented on July 28, 2024

Thanks, @steveraysteveray
That's helpful. I think I'd figured out the logic for conversionMultiplier and conversionOffset.

Because we sometimes run into fractions such as 5/9 with recurring decimal places, I actually wonder whether it might be even better to express a conversionMultiplier and conversionDivisor as you can see in the source code of my proposal at https://mgh128.github.io/UnitConverterUNECERec20/index2.html so that we don't need to approximate exact fractions (such as 2/3, 5/9) inadequately using multiple recurring decimal places.
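The difference between an exact fraction and its decimal approximation can be shown with Python's standard `fractions` module. This is only an illustration of the rounding concern; the variable names are arbitrary, and the decimal string is the conversionMultiplier value quoted earlier in the thread.

```python
from fractions import Fraction

# Exact ratio, as a separate multiplier (5) and divisor (9) would express it.
exact = Fraction(5, 9)

# The decimal literal published as a single conversion multiplier.
approximate = Fraction("0.5555555555555556")

# The two are not the same number; the residual is small but non-zero.
error = approximate - exact

print(exact * 9 == 5)        # True: the exact ratio round-trips
print(approximate * 9 == 5)  # False: the decimal literal does not
print(error)
```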

I think we all would like to see better integration between these vocabularies and I'll be happy to help further.

nissimsan avatar nissimsan commented on July 28, 2024

@kshychko , we need to also have the code list source files included on the repo, pls.

nissimsan avatar nissimsan commented on July 28, 2024

Code is implemented for this, pending published version

mgh128 avatar mgh128 commented on July 28, 2024

Thanks! Looking forward to seeing that
