Comments (11)
To make this easier to explain, I added a column semantic_change to the data, which in extreme cases looks like this:
[3] «smoke» > «fog/mist» (11 polysemies, 3 overt markings); [7] «smoke» > «dust» (8 polysemies, 4 overt markings); [8] «smoke» > «cloud» (7 polysemies, 2 overt markings)
The first-order split is on ";". Each item of the resulting list describes a semantic change from the base concept ("smoke") to another concept:
[3] «smoke» > «fog/mist» (11 polysemies, 3 overt markings)
Parsing can be done with a regex or by other means, but obviously this representation only works if an explanation is given. My tests also showed that the concept glosses, which serve as keys to other items in the list (fog/mist has a separate row), fail in three cases:
{
"mirrow": "mirror",
"straw/hay": "straw",
"cheeck": "cheek",
}
While these spelling errors are easily corrected, I wonder if we can define a consistent, typical network link inside a concept list that refers to another concept and adds (arbitrary) information. Should I try JSON?
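For reference, the regex-based parsing of the semantic_change column and the gloss check could look roughly like this. It is only a sketch: the names ITEM, parse_semantic_change and unknown_targets are made up for illustration, the meaning of the bracketed number is not specified here (it is simply kept as "number"), and the code assumes that glosses never contain ";" or "»".

```python
import re

# One item, e.g. "[3] «smoke» > «fog/mist» (11 polysemies, 3 overt markings)".
# The pattern is an assumption derived from the three example items only.
ITEM = re.compile(
    r"\[(?P<number>\d+)\]\s*«(?P<source>[^»]+)»\s*>\s*«(?P<target>[^»]+)»\s*"
    r"\((?P<polysemy>\d+) polysem(?:y|ies), (?P<overt_marking>\d+) overt markings?\)"
)


def parse_semantic_change(cell):
    """Split a semantic_change cell on ';' and parse each item into a dict."""
    links = []
    for item in cell.split(";"):
        match = ITEM.fullmatch(item.strip())
        if match is None:
            raise ValueError(f"cannot parse item: {item!r}")
        link = match.groupdict()
        # Convert the numeric fields from strings to integers.
        for key in ("number", "polysemy", "overt_marking"):
            link[key] = int(link[key])
        links.append(link)
    return links


def unknown_targets(links, glosses):
    """Return target glosses that have no row of their own in the list."""
    return sorted({link["target"] for link in links if link["target"] not in glosses})
```

Calling unknown_targets with the set of all glosses in the list is one way to surface mismatches such as the three cases above before the links are used as keys.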
from concepticon-data.
I have a concrete proposal of how to deal with this, using a new example by Winter-2022-103.
My JSON now looks like this:
ID: Winter-2022-102-1
NUMBER: 1
ENGLISH: cloud
CONCEPTICON_ID: 1489
CONCEPTICON_GLOSS: CLOUD
SOURCES: [{"name": "smoke", "id": "Winter-2022-102-1", "overt_marking": 2, "polysemy": 7}, {"name": "sky", "id": "Winter-2022-102-85", "overt_marking": 2, "polysemy": 8}, {"name": "rain", "id": "Winter-2022-102-84", "overt_marking": 2, "polysemy": 4}]
TARGETS: [{"name": "fog/mist", "id": "Winter-2022-102-2", "overt_marking": 7, "polysemy": 24}, {"name": "day", "id": "Winter-2022-102-19", "overt_marking": 3, "polysemy": 2}, {"name": "sky", "id": "Winter-2022-102-85", "overt_marking": 11, "polysemy": 8}, {"name": "rain", "id": "Winter-2022-102-84", "overt_marking": 2, "polysemy": 4}]
from concepticon-data.
So I have links for sources and targets (we could reduce this to one of them). A source node contains the ID of the source (the Concepticon concept list entry ID), the name of the concept, and other properties that would be properties of the edge from the source to the current node.
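To illustrate how such a row could be turned into network edges, here is a minimal sketch. The function name edges and the dict-based row are assumptions for the example, and the row is abbreviated from the one above (two sources, two targets).

```python
import json

# Abbreviated version of the "cloud" row; the JSON cells are plain strings,
# as they would be when read from a tabular file.
row = {
    "ID": "Winter-2022-102-1",
    "ENGLISH": "cloud",
    "SOURCES": '[{"name": "sky", "id": "Winter-2022-102-85", "overt_marking": 2, "polysemy": 8},'
               ' {"name": "rain", "id": "Winter-2022-102-84", "overt_marking": 2, "polysemy": 4}]',
    "TARGETS": '[{"name": "fog/mist", "id": "Winter-2022-102-2", "overt_marking": 7, "polysemy": 24},'
               ' {"name": "day", "id": "Winter-2022-102-19", "overt_marking": 3, "polysemy": 2}]',
}


def edges(row):
    """Yield (source_id, target_id, attributes) for every link stored in a row."""
    for column, incoming in (("SOURCES", True), ("TARGETS", False)):
        for link in json.loads(row[column]):
            # Everything except "id" and "name" is treated as an edge property.
            attrs = {k: v for k, v in link.items() if k not in ("id", "name")}
            if incoming:
                yield link["id"], row["ID"], attrs
            else:
                yield row["ID"], link["id"], attrs
```

Because each link carries the entry ID of the other node, the concept name is only needed for readability and can be dropped when building the graph.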
from concepticon-data.
I'd like to make the function of "id" here more explicit, borrowing syntax from CLDF markdown: we could use "ValueTable#cldf-Winter-2022-102-1" as the value for "id", and maybe call the field valueReference?
from concepticon-data.
Ah, okay, easy to do.
from concepticon-data.
I'd prepare -- when I find time -- a PR for both Urban's previous dataset and Winter-2022-102.
from concepticon-data.
It would need to be FormTable and formReference, though. That's how we model glosses (i.e. items in concept lists) in concepticon-cldf: https://github.com/concepticon/concepticon-cldf/tree/main/cldf#table-glossescsv
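A rough sketch of what that renaming could mean for a single link object follows. The helper with_form_reference is hypothetical, and the "FormTable#cldf-" prefix simply follows the syntax suggested in this thread; it has not been checked against the CLDF markdown specification.

```python
def with_form_reference(link, table="FormTable"):
    """Return a copy of a link with "id" replaced by an explicit "formReference"."""
    converted = {key: value for key, value in link.items() if key != "id"}
    # Reference syntax as proposed in this thread (an assumption, not a confirmed convention).
    converted["formReference"] = f"{table}#cldf-{link['id']}"
    return converted


link = {"name": "sky", "id": "Winter-2022-102-85", "overt_marking": 2, "polysemy": 8}
converted = with_form_reference(link)
```

The original link is left untouched, so the conversion can be applied to existing data without losing the plain IDs.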
from concepticon-data.
Btw.: In the current concepticon CLDF data, we have no standard way to refer to a "Concept", i.e. the set of all glosses for the same concept in one concept list. In your example above, that wouldn't be a problem, I think, because referring to the particular gloss (i.e. the concept in a particular language) is the right thing to do. But there may be cases where we want to refer to a concept with many glosses in the same concept list, e.g. https://concepticon.clld.org/values/Luniewska-2016-299-2
from concepticon-data.
Would the tabular representation not restrict the link anyway to a row which is a concept? I think for the Multi-Simlex-Data, we may have another version, where we'd want to link to a cell in the tabular data, which would then be not a concept, but a gloss?
from concepticon-data.
Ah, yes, if the data is represented in tabular form that could be made explicit through the metadata. I was thinking of something intermediate - i.e. some sort of JSON with some CLDF semantics.
from concepticon-data.
from concepticon-data.