amrc-factoryplus / amrc-connectivity-stack

The AMRC Connectivity Stack (ACS) is an open-source implementation of the AMRC's Factory+ framework

Home Page: https://factoryplus.app.amrc.co.uk

License: MIT License

Smarty 0.27% Dockerfile 1.13% Makefile 0.40% JavaScript 25.26% HTML 0.27% PLpgSQL 0.69% TypeScript 15.45% Shell 0.19% Python 4.02% PHP 21.89% CSS 0.47% Vue 24.16% Blade 0.25% Java 5.56%
amrc amrc-connectivity-stack factory-plus factoryplus mqtt sparkplug

amrc-connectivity-stack's People

Contributors

alexgodbehere, amrc-benmorrow, derme302, djnewbould, github-actions[bot], grigals, rblackett

Forkers

grigals derme302

amrc-connectivity-stack's Issues

Create an initial discovery endpoint

Create an endpoint for initial discovery bootstrapping, given nothing but the base cluster domain name.

  • This should be accessible without authentication.
  • This should provide at least the Directory service URL.
  • This should also provide the information needed to build a krb5.conf.

The main traefik configuration then needs adjusting to serve (or redirect to) this endpoint from the base cluster URL. This should only happen for machine access, i.e. with Accept: application/json or something similar. We want to keep the option for human access to be redirected to some human interface, like the Manager.
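
As a rough illustration of the machine/human split described above, here is a minimal sketch of the decision the base URL handler would make. The field names, URLs and realm are placeholders invented for the example; the issue does not specify a response format, and in practice the routing might live in the traefik configuration rather than in application code.

```python
import json

# Hypothetical values; the issue does not define field names or URLs.
DISCOVERY_DOC = {
    "directory_url": "https://directory.example.com",
    "kerberos": {"realm": "EXAMPLE.COM", "kdc": "kdc.example.com"},
}
MANAGER_URL = "https://manager.example.com"


def wants_json(accept_header: str) -> bool:
    """Return True if the Accept header lists application/json."""
    media_types = [part.split(";")[0].strip().lower()
                   for part in accept_header.split(",")]
    return "application/json" in media_types


def handle_root(accept_header: str) -> tuple[int, dict, str]:
    """Return (status, headers, body) for a request to the base cluster URL."""
    if wants_json(accept_header):
        # Machine access: serve the discovery document, no authentication.
        body = json.dumps(DISCOVERY_DOC)
        return 200, {"Content-Type": "application/json"}, body
    # Human access: redirect to a human interface such as the Manager.
    return 302, {"Location": MANAGER_URL}, ""
```

A client would call this with `Accept: application/json` and then use the returned Directory URL and Kerberos details to bootstrap everything else.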

[directory] Dynamic ACLs

There is little use in a client being able to search for e.g. Temperatures if that client doesn't have permission to read the data.

We need some way of automatically granting certain clients access to devices matching certain criteria. To start with I think this wants to be implemented by the Directory automatically adding devices to (Auth service) groups, so clients can then be given access to appropriate groups.

This depends on #188. This could be implemented as a function of the Directory, or as a separate service. Either way the criteria for adding a device to a group need to be configurable.

[krbkeys] Changes are not always picked up

From time to time there is a problem with the operator not noticing the creation of new KerberosKey objects. It is not yet clear what causes this. Restarting the operator causes it to pick up the new objects as part of its initial scan.

[directory] Preserve BIRTHs

It is becoming clear that for many purposes it would be useful for the Directory to maintain a record of all active BIRTHs.

Ideally these should be stripped of dynamic data values (which will be stale), but have static values (which only change on BIRTH) left in place. The static values include important information like the device and schema UUIDs. Currently I don't think there is a reliable way to distinguish static from dynamic values; probably we should standardise a metric property.

Birth certificates should probably be made available both in some suitable JSON format (for consistency) and as a binary Sparkplug packet which will be easier to consume using libraries that can already decode Sparkplug. Be aware that there are at least two mappings from Sparkplug to JSON (the Tahu JS library mapping, and the mapping used by the protobuf tools).

Move k5start to a sidecar

All services using k5start could have that process moved into a sidecar container. The HiveMQ update already does this as it was simpler than pulling in k5start.

Advantages:

  • The running process does not need access to its (client) keytab; instead it only has access to time-limited tickets in the ccache. This is a significant security improvement.
  • Given a standard image containing k5start, there is no need for that binary in the other images.

[edge] Don't restart the whole app when reloading the config file

There is no need to restart the whole app when the config file is reloaded.

  • The Sparkplug connection can be left entirely alone. We need to republish the config-related metrics.
  • The southbound connections only need reconnecting if they've changed. Changing one connection should not need to affect any others.
  • The Devices will need to be rebirthed if their schema mapping has changed, but not otherwise.
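
One way to decide which southbound connections need touching is to diff the old and new config by connection name. This is only a sketch: it assumes connections can be represented as a dict keyed by a stable name, which is not necessarily how the Edge Agent config file is laid out.

```python
def diff_connections(old: dict, new: dict) -> dict:
    """Classify connections by name as added, removed, changed or unchanged.

    `old` and `new` map a connection name to its config (any comparable
    value). Only 'added', 'removed' and 'changed' need any action.
    """
    old_names, new_names = set(old), set(new)
    common = old_names & new_names
    return {
        "added": sorted(new_names - old_names),
        "removed": sorted(old_names - new_names),
        "changed": sorted(n for n in common if old[n] != new[n]),
        "unchanged": sorted(n for n in common if old[n] == new[n]),
    }
```

Connections listed under `unchanged` would be left alone; the same comparison could be applied to each Device's schema mapping to decide which Devices need rebirthing.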

[configdb] Work out what to do with deleted objects

I do not think it is ever a good idea to delete objects from the ConfigDB under normal circumstances. If a UUID has been used for a particular purpose it should not be reused for a different purpose, so we need to keep the object entry for ever to record that fact.

The real-world object represented by the ConfigDB object, however, might stop existing. We need a way to represent this. Currently I have hacked a deleted property into General Information, but this is definitely not correct.

Options include:

  • Moving the deleted field into the Object Registration app, which means putting it directly in the database somewhere.
  • Making the ConfigDB 4D: keep a full history of all changes, including 'this object existed from this time to this time'. This is the correct answer but may be more work than we can justify.
  • Something else?

[manager] Device Profiles

The ACS Manager should support the concept of Profiles to enable the rapid onboarding of similar devices.

Profiles are templates that are created separately from device configurations but can be used as a starting point for configuring devices. They should be children of SchemaVersions, as they're only valid for a specific schema.

Features

  • Allow the ability to define variables in the profile that are substituted before instantiation
  • Provide the option of having changes to the profile propagate to all devices using the profile
  • Include a Save as Profile button on completed Schemas
    • Only allow this on Valid schemas
    • Somehow we need to enable replacement of variables (export as JSON?)
    • Use the CDS importer structure behind the scenes?
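
To make the variable substitution idea concrete, here is a minimal sketch that treats a profile as a JSON-like structure and replaces `${name}` placeholders in every string value. The placeholder syntax and the profile layout are assumptions for illustration only; nothing here reflects the Manager's actual data model.

```python
from string import Template


def instantiate(profile, variables: dict):
    """Return a copy of `profile` with ${name} placeholders substituted.

    Walks dicts and lists recursively; non-string leaves are returned
    unchanged. A placeholder with no matching variable raises KeyError,
    so an incomplete instantiation fails loudly.
    """
    if isinstance(profile, dict):
        return {k: instantiate(v, variables) for k, v in profile.items()}
    if isinstance(profile, list):
        return [instantiate(v, variables) for v in profile]
    if isinstance(profile, str):
        return Template(profile).substitute(variables)
    return profile
```

For example, a profile containing `"address": "${host}:${port}"` instantiated with `{"host": "10.0.0.5", "port": "502"}` yields `"address": "10.0.0.5:502"`.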

User Interface

The user should be given the option to create a new configuration or utilise an existing profile when selecting a schema version.

[hivemq-krb] Handle dynamic ACLs better

When ACLs change, existing connections continue with their current permission set. This is starting to cause problems with our dynamic ACL adjustments, particularly when a client gets stuck with no permissions because they hadn't been set up yet at the point where it connected.

Handling this will be tricky. It will require a proper auth plugin rather than just setting a permission list at connection time. It will also require working out how to listen to our own MQTT traffic in order to get change-notify from the services; I think there are APIs for tracking packets, but we don't want to get slow.

[directory] ACLs for the UUIDs used by a Node

Currently Sparkplug Nodes can publish whatever Device UUIDs they like in their birth certificates. This means a malicious Node can 'steal' a Device from its legitimate publisher, and insert incorrect data into the historical record.

The Directory should verify

This means we need a permission 'This Node is allowed to publish data for this Device'. In order to avoid locking the system down to only allow Devices that have been defined in the Manager (one of the F+ principles is that we should be driven from the edge, and there are many use cases for Nodes which dynamically create Devices), we need

In the first instance a Node publishing a bad UUID should probably just raise an alert. Later it may be worth looking into whether the Node can be disabled somehow (switch off its MQTT permission?).

[edge] Publish the current state of our southbound connections

One of the issues with edge clusters will be lack of access to logs. We need problems to be made much more visible.

The Edge Agent already publishes a Node birth certificate regardless of having southbound connections. This means it can report, as Node metrics, the current state of these connections, and any problems detected.

This is different from DDEATH, because connections and devices are not necessarily 1-1.

[manager] CSV importer

Supporting batch import of CSV data into the manager configuration would enable users to configure their devices from existing tag lists.

Features

  • Add a Download CSV import template button to download a blank template (with correct schema metric names)
  • Enable dynamic creation of sub-schema objects
  • There is a challenge around ensuring schema compatibility, so the default behaviour would be to reject the entire import if any metrics do not comply with the schema naming convention. A list of non-compliant tags should be returned to the user to correct.
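
The all-or-nothing behaviour could look something like the sketch below. The `metric` column name and the idea of passing the allowed names in as a set are assumptions made for the example; the real template columns are not defined in this issue.

```python
import csv
import io


def validate_import(csv_text: str, schema_metrics: set) -> tuple[list, list]:
    """Return (rows, non_compliant).

    If any row names a metric that is not in `schema_metrics`, the whole
    import is rejected: `rows` is empty and `non_compliant` lists every
    offending name so the user can correct them in one pass.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    non_compliant = [r["metric"] for r in rows
                     if r["metric"] not in schema_metrics]
    if non_compliant:
        return [], non_compliant
    return rows, []
```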

Sparkplug Node identity belongs in the Auth service

After quite a bit of thought about situations (e.g. cmdesc) where we are trying to authenticate data received over MQTT, I have decided:

  • A Sparkplug Node is a security principal.
  • The Sparkplug address of the Node is another identity mapping to the principal UUID, alongside the Kerberos UPN.
  • Sparkplug addresses of Nodes should live in the Auth service.

There is a partial implementation of this already in the JS client library, which looks up addresses from the ConfigDB. It needs replacing with an API on this service.

[hivemq-krb] Pull out the F+ client library

It would be good to be able to make a Java client library available, even if incomplete.

However, working out where to publish the library to such that we can pull it into the build is tricky. The Github Maven repo requires a PAT to download packages.

Feature Req: Support for Data Value Re-Mapping

Allow users to re-map values collected from a device to a different value expected by a schema.

For example:

  • A Light.state schema defines the state of a light to be ON = true and OFF = false. However, devices can return these values inverted due to engineering decisions. Devices in production cannot be modified to change this state, so it has to be done further upstream.
  • A CNC schema defines a spindle direction (0 = off, 1 = CW, 2 = CCW). Other CNCs use different numeric codes for the same values. DMU has 3: clockwise spindle rotation, 4: counterclockwise spindle rotation, 5: spindle stop.

Expected Feature
Allow a receivedValue to be mapped to a desiredValue, similar to Grafana.
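
A minimal sketch of such a mapping, using the two examples above. The function name and the choice to raise on an unmapped value are assumptions; passing unmapped values through unchanged would be an equally plausible design.

```python
# Mapping tables built from the examples in this request.
LIGHT_INVERTED = {True: False, False: True}
DMU_SPINDLE = {3: 1, 4: 2, 5: 0}  # DMU code -> schema code (0=off, 1=CW, 2=CCW)


def remap(received, mapping: dict):
    """Translate a received device value into the value the schema expects."""
    try:
        return mapping[received]
    except KeyError:
        # Assumption: an unmapped value is an error rather than passed through.
        raise ValueError(f"no mapping for received value {received!r}") from None
```

With these tables, `remap(4, DMU_SPINDLE)` gives the schema's CCW code and `remap(True, LIGHT_INVERTED)` flips the light state.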

[configdb] We need a more complete ACL language

The predefined permissions are inadequate. For example, if we define an 'Edge Agent' config app, we want to be able to give a particular Edge Agent permission to read its own config and no others. That is currently impossible.

We need an extensible set of permissions based on templates, like the MQTT plugin.

[configdb] Separate HTTP and MQTT pods

Ideally it would be good to scale up the web api part of the ConfigDB. This cannot be done unless the MQTT part is a separate process. This means the change-notify needs to happen via the database, as the Directory does.

[edge] We should publish with QoS at least 1

An MQTT publish with QoS 0 does not get any response from the broker. This means the broker will just silently drop packets we are not authorised to publish; we don't get any error. This in turn means that if our ACLs are wrong we will not reconnect.

Record birth/death

I think it would be helpful to record an additional tag in the historian indicating when the device was online.

This would be a boolean which is set on BIRTH and cleared on DEATH.
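
In outline, the ingester would derive the flag from the Sparkplug message type. The sketch below assumes the message type string (e.g. `DBIRTH`, `DDEATH`) has already been parsed from the topic; the function name is made up for the example.

```python
def online_value(message_type: str):
    """Return the value to record for the 'online' tag, or None for no change."""
    if message_type.endswith("BIRTH"):   # NBIRTH, DBIRTH
        return True
    if message_type.endswith("DEATH"):   # NDEATH, DDEATH
        return False
    return None                          # data and command messages
```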

[edge] Back off when reconnecting to MQTT

With more nodes in our deployment we are starting to see 'thundering herd' problems when all the Edge Agents reconnect at once. Back off, with random delay, when we reconnect.
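
One common approach is exponential backoff with full jitter, sketched below. The base delay, cap and function name are illustrative choices, not values taken from the Edge Agent.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  rng=random) -> float:
    """Return a random delay in seconds for the given reconnect attempt.

    The upper bound doubles with each attempt (starting at `base`) until
    it reaches `cap`; the delay is drawn uniformly between 0 and that
    bound so that agents which disconnected together spread out.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)
```

The caller would sleep for `backoff_delay(n)` before the n-th reconnect attempt and reset `n` to zero after a successful connection.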

[manager] User management

  • Admins should have the ability to create and delete users
  • Admins should have the ability to change the passwords of other users
  • All users should be able to change their own password

This will use the kadmin interface already present in the manager and will authenticate to kadmin using the credentials that the user logged into the manager with.

[manager] Connection specific configuration UI

Having the SparkplugMetric have a different appearance depending on the connection would ensure that users have a contextualised experience when configuring devices.

For example, Address would be named Topic when using an MQTT connection.

Work out how to populate the git repos

The on-prem git repos driving the edge clusters need to be populated. We need a solution for this.

One possibility might be to set them up to pull from public Github.

[directory] MQTT change notify is not useful

The current MQTT change-notify interface has several problems:

  • It creates a lot of MQTT traffic no one is interested in.
  • When a large number of rebirths happen, the change notifications get a long way behind.
  • We are exposing information about devices when perhaps we shouldn't be.

We need a better interface. I am thinking perhaps

  • Client makes an HTTP request asking for notification about certain events.
  • Directory creates a new Sparkplug Device for this client, and arranges for ACLs so only that client can read it.
  • Directory returns device and metric information to the client saying where to get notifications.

We only need to create one device per client. After that we can create new metrics on the same device.

Open questions:

  • How do we arrange to dynamically set the ACLs?
  • When do these devices disappear? What if a client just vanishes without telling us?

[auth] We need to be able to quote groups

Currently, when a group is used in an ACE, no distinction is made between 'assign this ACE to all members of this group' and 'assign this ACE to this group specifically'. This is primarily an issue when granting auth service permissions to edit groups, where the group in question is the target of the ACE.

A partial solution here would be to have typed groups (principal/permission/target), and stop expanding groups when we get to a group of the wrong type. This would allow granting 'you can edit this group of principals', for instance. However it is not sufficient when granting permission to edit groups of targets.
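
To illustrate the typed-group idea, here is a sketch of group expansion that stops at groups of the wrong type. The data layout (a dict of group name to type and members) is invented for the example and is not how the Auth service stores groups.

```python
def expand(member: str, groups: dict, wanted_type: str, seen=None) -> set:
    """Expand `member` into leaf members, following only `wanted_type` groups.

    A name that is not a group, or a group of a different type, is
    returned unexpanded; that is, it refers to the object itself.
    """
    seen = set() if seen is None else seen
    group = groups.get(member)
    if group is None or group["type"] != wanted_type:
        return {member}
    if member in seen:  # guard against cyclic group membership
        return set()
    seen.add(member)
    result = set()
    for child in group["members"]:
        result |= expand(child, groups, wanted_type, seen)
    return result
```

With a principal group `admins` containing `alice` and a nested group `ops`, expanding in the principal slot yields the individual principals, while expanding in the target slot leaves `admins` as the group itself, which is what 'you can edit this group of principals' requires.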

An example is the edge krbkeys permission editing:

  • Edge krbkeys operators need to be able to set permissions for the principals they are managing.
  • We don't want to grant unrestricted permission to change any permissions. That would be root-equivalent.
  • So we grant 'Manage ACL by permission' on the permissions the krbkeys operators should be able to grant.
  • These permissions are generally groups: e.g. 'monitor for edge agent' which is a group of permissions allowing MQTT read, CCL rebirth, CCL config reload.
  • The krbkeys operators can now grant any of the included permissions arbitrarily.

Kadmin ACLs

Currently ACS deploys with fixed kadmin ACLs in a ConfigMap, which then can't be changed as they are sourced from the Helm chart and Helm upgrades will overwrite them.

We need a strategy to allow user-specified ACLs. Probably this means generating the actual ACL file on KDC startup from some other data source. Integration with the F+ auth framework would be ideal, but this may lead to bootstrapping problems.

[manager] Create UUIDs for southbound connections

It would be helpful to assign a UUID to each southbound connection from an Edge Agent. This will allow e.g. an Alert to link to a particular southbound connection.

Currently the config file does not allow this. Either it needs to be refactored to support this, or each individual southbound connection needs a separate entry in the ConfigDB containing the configuration for that connection.

The latter option opens up the interesting possibility of configuring 'these connections are available, from these hosts on these clusters; I want these devices derived from them' and then the system automatically deploying suitable Edge Agents to handle them.

[edge] Open Protocol driver is not robust enough

The Open Protocol driver doesn't currently handle tools that are offline. If it fails to connect to the tool it does so quietly, which is not ideal. The state management of the OP driver needs investment in maturing if it is to be used outside a lab environment.

[configdb] MQTT change notify is not useful

See also this issue on the Directory.

  • The change-notify interface was designed for a much simpler ConfigDB.
  • It is difficult to work out from the notifications what has actually changed.
  • There may have been more changes between the notification and the time we manage to GET the entry.
  • We are exposing information we should not.

We need an interface where clients can request notification for particular sorts of changes. This might include watching for changes on particular properties of an entry, for instance.

If we are only sending changes we have customers for, we could send the actual data over MQTT and avoid the need to GET the changed entry (and the associated sync problem).

[configdb] SpecialApp returning Sparkplug Addresses

Currently the Sparkplug Address application is perfectly ordinary, and mostly contains records for Node addresses used in ACLs.

If Node addresses move to the Auth service, it would be useful to turn the Sparkplug Address Application into a SpecialApp proxying that information from the Auth service. We could also proxy current Device addresses from the Directory, making this a unified source for Sparkplug address information.

Error if incorrect permissions

Currently the ingester will sit silently if it does not have the correct permissions. It should throw an error and kill the pod.

[auth] Investigate ABAC

The current ACL scheme (subject-predicate-object) is not really sufficient, unless we are going to have a lot of dynamic ACL adjusting by daemon services. We should look into Attribute-Based Access Control as an alternative.

We are going to need an access control language (a language in which to express permission grants). It would be better to reuse something existing, if possible, than to design our own.

https://github.com/AMRC-FactoryPlus/amrc-connectivity-stack/blob/bmz/dyn-deploy/acs-auth/docs/redesign

Feature Req: Scaling or Offsetting Values

Issue:
Reading in data from some devices provides non-ideal tag values. For example, it is common for Modbus devices to provide values that are 10x larger than the actual reading in order to carry a decimal place. This creates an issue further downstream when standard SI units cannot be assigned, as there is no SI unit for 10x a value.

Desired Feature:
Allow tags to be scaled by providing a multiplier and offset value. In the case above with Modbus, the multiplier would be 0.1.
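
The requested transform is a simple linear one. The sketch below applies the multiplier first and then the offset; that ordering is an assumption, as the request does not say which should come first.

```python
def scale(raw: float, multiplier: float = 1.0, offset: float = 0.0) -> float:
    """Return raw * multiplier + offset."""
    return raw * multiplier + offset
```

For the Modbus case above, a raw register value of 235 with a multiplier of 0.1 comes out as approximately 23.5 (subject to floating-point rounding).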
