registry-core's People

Contributors

ajtucker, dependabot[bot], der, marqh, mika018, mo-marqh, simonoakesepimorphics, sleeper-service

registry-core's Issues

Support customized lifecycles

See UKGovLD/ukl-registry-poc#96

This requires a means to declare the life-cycle model (e.g. in a configuration file or via RDF in a system register), a new implementation of the life-cycle machinery to allow such external configuration, and modifications to the UI templates.

Account removal

We've got a bunch of unwanted accounts in the database - most appear to be from an attempted break-in, since they show signs of attempted script injection. This has left the user database cluttered, and admin functions harder than they should be.

The security model documentation does not appear to describe any way to remove these.
Have I missed it, or are accounts forever? (like registered items).

Superseded items must know what they are superseded by

When an item status is changed to 'superseded', the UI dialogue must walk through a step to record what it is superseded by.

Using the API, a request to change status to 'superseded' should fail unless a value for dct:isReplacedBy is provided.
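The API rule above could be sketched as a small validation step. This is only an illustration: the function name, the exception class and the shape of the `properties` argument (property URI mapped to a list of values) are assumptions, not part of the registry code base.

```python
# Full URI for dct:isReplacedBy (Dublin Core terms namespace).
DCT_IS_REPLACED_BY = "http://purl.org/dc/terms/isReplacedBy"


class StatusUpdateError(ValueError):
    """Raised when a status update request must be rejected (hypothetical)."""


def check_status_update(new_status, properties):
    """Reject a change to 'superseded' unless dct:isReplacedBy has a value.

    `properties` maps property URIs to lists of values; this shape is an
    assumption made for the sketch.
    """
    if new_status == "superseded" and not properties.get(DCT_IS_REPLACED_BY):
        raise StatusUpdateError("status 'superseded' requires dct:isReplacedBy")
    return new_status
```

Other status values pass through unchanged, so the check can sit in front of whatever already handles status updates.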

XSS prevention

Some of the default error pages present back the query URL without adequate filtering - thus allowing reflection XSS attacks.

Filter all reflected URLs, customize error pages, trap cases where exceptions still propagate to the container.
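As a language-neutral illustration of the filtering step, the sketch below escapes a request URL before echoing it into an error page. The function name and the page markup are hypothetical; the real fix would live in the registry's own templates and error handlers.

```python
import html


def render_not_found(request_url):
    """Build an error page fragment that reflects the URL only after escaping."""
    # html.escape replaces &, <, > and (with quote=True) both quote characters,
    # so markup in the URL is shown as text rather than interpreted.
    safe_url = html.escape(request_url, quote=True)
    return "<p>No resource found at <code>" + safe_url + "</code></p>"
```

Escaping at the point of output, rather than trying to reject suspicious input, keeps the rule simple: nothing taken from the request reaches the page unescaped.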

LDR Alerts?

We are thinking about how a 'register manager' might receive alerts about submissions.

Is it just a cron-job and a suitable SPARQL query inspecting 'submitted' (and maybe 'modified') dates?
Or is there a built-in process for monitoring and alerts?
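If the cron-job route is taken, the query might look something like the sketch below. It assumes that register items are typed reg:RegisterItem and carry their submission date in dct:dateSubmitted; both are assumptions to check against the actual data, and the function name is hypothetical.

```python
import re

# Accept only a plain UTC xsd:dateTime, so nothing unexpected is
# interpolated into the query text.
_UTC_DATETIME = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z")


def submissions_since_query(since_iso):
    """Build a SPARQL query listing register items submitted after `since_iso`."""
    if not _UTC_DATETIME.fullmatch(since_iso):
        raise ValueError("expected a UTC dateTime such as 2014-11-30T00:00:00Z")
    # Assumption: submission dates are recorded as dct:dateSubmitted.
    return f"""PREFIX reg: <http://purl.org/linked-data/registry#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?item ?submitted WHERE {{
  ?item a reg:RegisterItem ;
        dct:dateSubmitted ?submitted .
  FILTER (?submitted > "{since_iso}"^^xsd:dateTime)
}} ORDER BY ?submitted"""
```

A scheduled job could run this against the SPARQL endpoint with the time of its previous run and mail any results to the register manager.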

Improve backup support

Should have option to create a complete store dump in a portable format.

Either directly or by migrating to a manageable separate storage service like Fuseki.

Batch upload - contents only

Batch upload is currently supported for container+contents.
Would be good to allow for contents-only.

(I just had an initial upload fail because of an error in the source file, but not before the register had already been created. Now I want to load just the contents, but can't do that in a batch. Grrr.)

Update to Jena 2.11.0

This means a synchronous update to serverbase and jsonld-java. Unfortunately it also means switching to a snapshot version of jsonld-java and rebuilding the jsonld integration code because the signature is now completely different. Sigh.

Additional roles (security model)

A couple of additional standard roles:

Submitter - specific register - Register, Update
Reviewer - specific register - StatusUpdate

_versionAt inconsistency

This parameter is documented as taking an xsd:dateTime but implemented as taking a UNIX style timestamp. One of these needs to change.
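One way to remove the inconsistency without breaking existing clients would be to accept both forms. The sketch below is an assumption about how that could work, not a description of the current implementation; in particular it treats an all-digit value as seconds since the epoch, which would need checking against what the server actually expects.

```python
from datetime import datetime, timezone


def parse_version_at(value):
    """Return an aware UTC datetime from an xsd:dateTime or a UNIX timestamp."""
    text = value.strip()
    if text.isdigit():
        # Assumption: all-digit values are seconds since 1970-01-01T00:00:00Z.
        return datetime.fromtimestamp(int(text), tz=timezone.utc)
    # Older Python versions do not accept a trailing 'Z' in fromisoformat,
    # and are stricter about the number of fractional-second digits.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # Assumption: a dateTime without a zone is taken to be UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

With this, `_versionAt=1417305600` and `_versionAt=2014-11-30T00:00:00Z` would select the same instant.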

Bad delegation records can block start up

A sufficiently broken delegation record can prevent start up.

The fix is in the code base, but before it can be released we need to sort out the build dependencies and work out why public access to the S3-based repository for lib is causing problems for some people.

[Test case can be derived from the Environment registry backup for 2014-12-17; it is not clear HOW the broken delegation record got in there!]

Migrate to new ID solution

Google has deprecated OpenID (see UKGovLD/ukl-registry-poc#104). While we have a year until this completes, the earlier we change, the lower the migration cost.

Switching to the Google OAuth 2.0 solution seems to require embedded Google-specific sign-in widgets and largely requires a G+ account. Neither of these is acceptable.

Advice on UK Government compatible, but acceptable cost, ID providers is being sought.

An interim solution may be to switch to pure password-based administration.

Requesting version of a top-level registry throws exception

Requesting a version of most registers works fine; e.g.

http://environment.data.gov.uk/registry/structure/org:1 

But requesting a version of a top-level register throws an exception; e.g.

http://environment.data.gov.uk/registry/structure:1

Exception:

HTTP Status 500 - org.apache.velocity.exception.MethodInvocationException: Invocation of method 'perform' in class com.epimorphics.registry.core.Registry threw exception java.lang.IllegalArgumentException at page-deref.vm[line 24, column 34]

#comments in input cause loading through API to fail

TTL files that load fine through the UI fail when using the API.
Error message is that a root location doesn't exist.
When I remove all #comment lines from the file, POST through the API works fine.
It appears that it is trying to process comments as POSTed resources!
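Until the API handles comments, the manual workaround described above can be scripted. The sketch below removes whole-line comments only; the function name is hypothetical, and it would also drop lines starting with '#' inside a multi-line string literal, so it is not a general Turtle pre-processor.

```python
def strip_comment_lines(turtle_text):
    """Drop lines whose first non-blank character is '#'.

    Only whole-line comments are removed; a '#' inside an IRI such as
    <http://www.w3.org/2000/01/rdf-schema#> is left alone.
    """
    kept = [line for line in turtle_text.splitlines()
            if not line.lstrip().startswith("#")]
    return "\n".join(kept) + "\n"
```

The cleaned text can then be POSTed through the API in place of the original file.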

Enable requests for versioned entities

We have entered a concept in a register with multiple versions.
This is confirmed by getting
http://registry.it.csiro.au/test1/ba-glossary/_aquitard:3
http://registry.it.csiro.au/test1/ba-glossary/_aquitard:2
http://registry.it.csiro.au/test1/ba-glossary/_aquitard:1

(1) However, requests for the item (rather than just the registerItem) fail:
try http://registry.it.csiro.au/test1/ba-glossary/aquitard:3

(2) I don't seem to be able to use the _versionAt request to retrieve an earlier version.
e.g.

http://registry.it.csiro.au/test1/ba-glossary/_aquitard?_versionAt=2014-11-30T00:00:00.00Z
http://registry.it.csiro.au/test1/ba-glossary/aquitard?_versionAt=2014-11-30T00:00:00.00Z

Both appear to return only the latest version. Is the syntax correct?
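For reference, the URL patterns being tried in this report can be written down as a small helper. This only restates the forms used above (an underscore prefix for the register item, a `:N` suffix for a numbered version, and the `_versionAt` query key); it says nothing about which of them the server currently honours, and the function and parameter names are hypothetical.

```python
from urllib.parse import urlencode


def version_url(register_url, local_id, version=None, version_at=None,
                item_record=False):
    """Build a URL for an entity or its register item, optionally versioned."""
    # The register item (metadata record) uses the '_' prefixed name.
    name = ("_" + local_id) if item_record else local_id
    url = register_url.rstrip("/") + "/" + name
    if version is not None:
        url += ":" + str(version)
    if version_at is not None:
        # Keep ':' unescaped so the dateTime stays readable in the URL.
        url += "?" + urlencode({"_versionAt": version_at}, safe=":")
    return url
```

For example, `version_url("http://registry.it.csiro.au/test1/ba-glossary", "aquitard", version=3)` gives the entity URL from case (1).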

Support for GML dictionary

See ukl-registry-poc issue #80; following discussion of enhancement requests we (Dave R, Alex C and I) agreed that support for GML Dictionary would be easiest to achieve using an external module to the registry that could convert a GML Dictionary to a SKOS Collection for upload to the Registry, and transform a SKOS Collection into a GML Dictionary for access by systems.

Simon Cox could provide canonical mapping from SKOS to GML Dictionary & vice versa.

bNode duplication in register metadata

When requesting metadata for a register whose definition includes a bNode (e.g. rights statement) the bNode gets duplicated as many times as there are items in the register.

Not clear why the register information is being merged multiple times (it is clear why, if that's happening, you get bNode duplication).

Promote item to register

As part of a system evolution, it is sometimes desirable to promote an existing item resource to being a register with its own member items.

Search over more properties

The LDR 'search' function (confusingly given the key "_query") appears to be scoped to rdfs:label, and perhaps the skos:*Label values. Suggest that the following fields should also be indexed by default:

dc:description
dc:source
dc:subject
dct:description
dct:subject
rdfs:comment
skos:definition
skos:note
skos:scopeNote

Simple snapshot export

  • complementary to the “backup” function (which provides compressed n-quads)
  • include only the specific versions of the registers and entities in the export - optionally including a cascade of sub-registers
  • default behaviour to download a snapshot of the current resources in the subtree of the target register
  • Initially would omit metadata (items) and versioning. Though this should be reviewed based on future ambition to support replication and round-trip editing.
  • Export format to be discussed. JSON-LD has been suggested, but CSV is an easier fit for tool chains. The W3C activity on data on the web might be relevant but requirements and timescales may ... diverge from our needs here.

Relates to UKGovLD/ukl-registry-poc#82

Hook for user-feedback on items

We would like to add a means for a user to provide feedback concerning a register and its contents.
This could be as simple as an email to the register owner.

I've added a few lines to the page-render.vm template to prove the concept. In my test, a single email address is set for the entire registry. That isn't a realistic solution. Feedback should go to a designated address, probably per-register to begin with.

reg:manager/foaf:mbox is the obvious property-chain on a reg:Register that could provide access to the relevant details.

This raises the following questions:
1. The property should be mandatory - how/where are the mandatory properties set in the deployment? And is it possible to require that a multi-step property chain be mandatory?
2. Within the Velocity templates, we must access properties of the parent register of a registered item. In a RegisterItem, reg:register has the {URI} for the parent register, but this would only enable use of the external (HTTP) API. Is there an internal (Java) solution through either the Jena or LDR API?

Failure to update nginx proxy via sudo

We encountered an issue with user credentials: Tomcat reported a failure starting, yet our registry was accessible. Logging in with a correct user name and password worked once (?) and then seemed to fail permanently. A bit of digging showed the following errors in Catalina.out:

09-12 04:28:52 INFO ForwardingServiceImpl :: Registering delegation path at /ogc/om/1.0 -> http://schemas.opengis.net/om/1.0.0/om.xsd [302]
09-12 04:28:52 ERROR ForwardingServiceImpl :: Failed to update nginx proxy config (code: 1) sudo: no tty present and no askpass program specified
Sorry, try again.
sudo: no tty present and no askpass program specified
Sorry, try again.
sudo: no tty present and no askpass program specified
Sorry, try again.
sudo: 3 incorrect password attempts

I believe this results in a read-only user database. Perhaps that explains the issues with credentials, but I'm not sure.

Nevertheless, the Catalina error messages are associated with

Process process = Runtime.getRuntime().exec(new String[]{ "/usr/bin/sudo", script});

This fails in our (dockerized) environment because sudo cannot be invoked from Java by the tomcat7 user. Changing the Tomcat configuration so that it ran as root resulted in a successful registry start up and seemed to solve our user credentials issue.

If this is a bug, perhaps the call to sudo to modify the nginx configuration could be removed and instead, during install, Tomcat could be given write permissions to the nginx configuration.

Migrate to appbase

The serverbase component (Epimorphics but open source and openly licensed) used by registry-core has been superseded by a similar facility called appbase (same terms).

Migration to appbase would reduce long term support costs but incur a step change in configuration file formats and some non-trivial internal work.

Review the changes required and consult on experience with deployment configuration and what other configuration management changes might be folded in.

Protect against expensive SPARQL queries

The generic SPARQL access allows arbitrarily expensive queries. Repeated such queries can cause stack/heap limits on the tomcat container to be exceeded.

A minimal change would be to prevent the SPARQL UI form from re-submitting the same query until the old query has returned.

Could also look at switching on query timeout limits.
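The "no re-submission until the old query has returned" idea is described above for the UI form; a server-side analogue is sketched below purely as an illustration. The class and method names are hypothetical, and treating queries as identical after whitespace normalisation is an assumption.

```python
import threading


class InFlightQueries:
    """Track running queries so an identical one is not started twice."""

    def __init__(self):
        self._lock = threading.Lock()
        self._running = set()

    @staticmethod
    def _key(query):
        # Collapse whitespace so trivially reformatted copies compare equal.
        return " ".join(query.split())

    def try_start(self, query):
        """Return True and mark the query as running, or False if it already is."""
        key = self._key(query)
        with self._lock:
            if key in self._running:
                return False
            self._running.add(key)
            return True

    def finish(self, query):
        """Mark the query as no longer running (call this in a finally block)."""
        with self._lock:
            self._running.discard(self._key(query))
```

This only stops identical queries from piling up; a single expensive query still needs a timeout to be bounded.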

Editing register description fails

Using "edit metadata" to modify a register description does not change the displayed description. The difference between the entity description and the metadata (item description) seems to be at the heart of the problem, compounded by the lack of a general edit function for registers.

Real delete

Invalidation leaves a registry tree still in the database but hidden to non-admins.

In practice a true delete to remove erroneous or experimental entries is needed.

To be determined whether this should be a permitted action for register managers or only for overall administrators.

Replaces UKGovLD/ukl-registry-poc#65

Ignore specific elements in POST or PUT payload

TopBraid adds ontology elements to every file (for internal consistency reasons; it provides the graph name and hook for other metadata); but these elements confuse the Registry upload.

To streamline the publishing process, we want to be able to tell the Registry to ignore them … a little like a “git-ignore”.

It may be appropriate to configure this in the /system register - or express at a transaction level (e.g. because sometimes you want to ignore the ontology resources, but not always).

Simon Cox could provide a list of the “extra bits” appearing in a file published from TopBraid.

Replication support

Provide support for read-only replicas of a writeable master.

Two use cases:

  • to enable higher performance implementations by supporting multiple load-balanced replicas
  • to enable separation of admin-enabled master copy and publicly accessible slaves (this allows use of https for the admin functions even when the public copy is in a domain for which SSL certificate management may be tricky)

For latest-only replicas this might be achieved through a combination of #18, #19 and a notification stream (such as an atom feed).

Greater efficiency would be achieved if a delta format rather than a snapshot were supported.

Full with-history replication would require a different approach.

Change to an item + change note in metadata?

When saving a change made to an item, it would be convenient for a change-note (skos:changeNote ?) in the item metadata to be created as part of the same version event. Unclear if this can already be handled in the API. In the UI, item and metadata changes are made separately, and appear to trigger separate version increments.

Data entry forms configuration

It appears that forms for data entry can be loaded in
{registry}/system/form-templates
I've reverse engineered these from the examples in the distribution, and loaded them OK.

Of the four that I've loaded, one is successfully causing a data entry form to be shown in the UI, through which I can successfully add an entity. The others do not work and I can't see why. Could be real bugs, but some documentation would help.

SKOS-XL support

We have a use-case where it is required to add some additional annotation to an altLabel - to indicate who uses this term in place of the preferred term. Looks like a SKOS-XL application, with an additional property on skosxl:Label.

We've also discovered that VocBench uses SKOS-XL, so moving data between the platforms would be helped if LDR already supported SKOS-XL.

Is there any experience with SKOS-XL in LDR?

Restore backups - links to user database

Following up consequences of #4

The transition from OpenID to the local user database is incomplete.
Since the user database is local, we need to understand the relationship with the registry content to ensure that links are either maintained or can be rebuilt when registry content is restored.

Say a RegisterItem has

    reg:submitter      [ foaf:name    "Simon Cox" ;
                         foaf:openid  <[email protected]>
                       ] ;

this interprets the username [email protected] as a URI that identifies the user (of course the @ is not significant - I just happened to use my email address as the username).
In the UI, when the 'All registration metadata' panel is displayed, the hyperlink interprets this as a relative URI, so for example on
http://registry.it.csiro.au:49170/agriculture/def/NLMP-glossary/_Adaptation:2
the submitter openid link is expanded to
http://registry.it.csiro.au:49170/agriculture/def/NLMP-glossary/[email protected]
This is clearly an error - user ids are not relative to specific register paths, and of course this URI does not resolve. But neither does http://registry.it.csiro.au:49170/[email protected]

  1. is there a correct URI form for the users?
  2. does any registry functionality actually depend on the link integrity from RegisterItems to users? Or is this just a 'passive' record, so if the link is broken we don't really need to worry.
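The expansion seen in the UI is what standard relative-URI resolution produces when a bare username is used as a link target. The sketch below reproduces it with a placeholder username (`user@example.org` is hypothetical, standing in for the actual account name); the function name is also just for illustration.

```python
from urllib.parse import urljoin


def resolve_link(page_url, href):
    """Resolve a link target against the page it appears on, as a browser would."""
    return urljoin(page_url, href)


page = "http://registry.it.csiro.au:49170/agriculture/def/NLMP-glossary/_Adaptation:2"
# Hypothetical username; it has no scheme, so it is treated as a relative path.
resolved = resolve_link(page, "user@example.org")
print(resolved)
```

The username replaces the last path segment of the page URL, which is why the link changes depending on which register item is being viewed.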

Migrate documentation from ukl-registry-poc and add 'mini-guides'

The documentation in the ukl-registry-poc wiki needs to be migrated to the registry-core project; some review of content is anticipated in order to ensure that current functionality is listed.

Furthermore, guidance and how-to information has been provided in response to email requests from the growing community. Currently these are invisible to the community at large and should be published where all can benefit from them.

Propose to use the gh-pages mechanism of GitHub to publish this content at http://ukgovld.github.io/registry-core along the same lines as the documentation emerging for the UKGovLD WG (at http://ukgovld.github.io/ukgovldwg/); published using Markdown and Jekyll.

Batch registration with explicit items

It is not possible to include explicit register items in a batch registration. This is particularly limiting when creating a collection of referenced (external) entities but could apply in other cases.

Batch upload with ambiguous types fails

Batch upload may fail on a resource tree whose root collection has multiple RDF types, more than one of which is registered as a bulkUploadType.

The canonical case is uploading a skos:ConceptScheme which is also explicitly marked as a reg:Register since both of those are default bulkUploadTypes.

If the upload provides an explicit (inverse)containerMembershipProperty then that should arguably be sufficient to disambiguate in this case.

Import of simple snapshot

Provide import facility to allow update of a register/registry tree using a snapshot as exported from #18

This could support round-trip editing, enabling bulk maintenance of a register without the need for RDF tooling or API support.

Could also be used to maintain current-time copies of a register tree either as a local reference set or as a read-only replica.

Inconsistent recording of 'who changed it' information in lifecycle record.

It appears that, while the information regarding 'who' submitted a register item is recorded, who made changes is not.

This looks inconsistent to me. If the thinking is that it is not necessary to record 'who' modified an item since it is, by definition, the 'Register Manager', then submission information is also not required.

Ability to upload multiple members to an existing register in a single request

There's a common use case where one needs to add multiple entries to an existing register - perhaps tens or hundreds of entities.

I recall back in the ukgovld-registry-poc days that this was possible (although I will admit to my memory being imperfect at times).

However, now, when I try to do this, I get the following error (from webUI):

Action failed: Bad Request -
{my-file.ttl} - Could not find unique entity root to register

Here's my test:

imagine that I created a register called "number" ...

@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dct:     <http://purl.org/dc/terms/> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix reg:     <http://purl.org/linked-data/registry#> .

<number>
      a       skos:Collection ;
      rdfs:label "Numbers"@en ;
      dct:description "The set of integer numbers (well - a few anyway)."@en ;
      .

I can easily add a single member ...

@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dct:     <http://purl.org/dc/terms/> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix reg:     <http://purl.org/linked-data/registry#> .

<one>
      a       skos:Concept ;
      rdfs:label "one"@en ;
      dct:description "The lowest cardinal number; half of two; 1."@en ;
      .

But as soon as I want to add multiple members in a single upload request (as below) I get bounced ...

@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dct:     <http://purl.org/dc/terms/> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix reg:     <http://purl.org/linked-data/registry#> .

<two>
      a       skos:Concept ;
      rdfs:label "two"@en ;
      dct:description "Equivalent to the sum of one and one; one less than three; 2."@en ;
      .
<three>
      a       skos:Concept ;
      rdfs:label "three"@en ;
      dct:description "Equivalent to the sum of one and two; one more than two; 3."@en ;
      .
<four>
      a       skos:Concept ;
      rdfs:label "four"@en ;
      dct:description "Equivalent to the product of two and two; one more than three, or six less than ten; 4."@en ;
      .
<five>
      a       skos:Concept ;
      rdfs:label "five"@en ;
      dct:description "Equivalent to the sum of two and three; one more than four, or half of ten; 5."@en ;
      .

Personally, I feel that this functionality is very useful and would be keen to see it working. Obviously we can't do a "batch upload" as the register, in this case number, already exists.

Query key pattern inconsistent?

It appears that the query-key pattern is inconsistent.

The LDA precedent is used for some:

  • _page={int}
  • _view={viewType}
  • _format=(rdf|ttl|jsonld)

But for new keys introduced to support the registration functionality, it is inconsistent

  • _versionAt={dateTime}
  • status={status}

Am I missing a pattern here?

Read access to 'Submitted' items?

In https://github.com/UKGovLD/ukl-registry-poc/wiki/Security-model it is stated

All read actions are assumed to be uniformly available. No login is required to read any part of the registry. So a separate read permission is not needed.

Is this true? I don't see 'Submitted' items until I login.

There is some interest in also explicitly controlling read-access, to enable community review of submitted items. Is there a way this could be managed in the current arrangement?
