alvarium-sdk-go's Issues

Investigate possible adoption of new "structured logging" package in Go stdlib

As of Go v1.21, the Go standard library has a new structured logging package called slog.

All Alvarium-related applications in this GitHub org that are written in Go currently use our custom provider-logging package. At the time these apps were created, the Go standard library had no structured logging capability, so we created our own provider.

In the interest of reducing external dependencies, especially where we can get what we want from the standard library, I recommend the project evaluate this new slog package and compare it with our current solution. It appears we'll be able to obtain a similar level of control w/r/t the format of the messages. Each message is formatted in JSON allowing marshaling/unmarshaling to a type instance and easy filtration, sorting, etc via the back-end.

This issue covers the development task of integrating slog and removing provider-logging, followed by a comparative evaluation of the resulting log format and the ease of use of the slog API.
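
As a starting point for the comparison, here is a minimal sketch of slog's JSON handler, assuming Go 1.21 or later. It uses only the standard library. The logEntry type and the "kind" attribute are illustrative assumptions and are not part of provider-logging or the SDK.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log/slog"
)

// logEntry mirrors the default "level" and "msg" keys emitted by slog's
// JSON handler, plus one application-specific attribute used in this sketch.
type logEntry struct {
	Level   string `json:"level"`
	Message string `json:"msg"`
	Kind    string `json:"kind"` // hypothetical attribute, not an SDK field
}

// emitAndParse writes one structured record to an in-memory buffer and
// unmarshals it back into a typed instance, as a back end might.
func emitAndParse(msg, kind string) (logEntry, error) {
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))
	logger.Info(msg, "kind", kind)

	var entry logEntry
	err := json.Unmarshal(buf.Bytes(), &entry)
	return entry, err
}

func main() {
	entry, err := emitAndParse("annotation published", "tpm")
	if err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}
	fmt.Printf("%s %q kind=%s\n", entry.Level, entry.Message, entry.Kind)
}
```

Each record is a single JSON object per line, so filtering and sorting on the back end can work on typed fields rather than parsed strings.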

Proposal: Support YAML for Configuration

This proposal comes as a result of work being done to integrate the Alvarium Go SDK into EdgeX Foundry. The integration work is being attempted on the "Levski" release as the 3.0 release of EdgeX is not yet finalized. During the functional testing of the Virtual Device service, an issue was encountered whereby the configuration markup format for the service (TOML) does not handle elements of the Alvarium SDK properly. The specific issue occurs on the following line:

Annotators []contracts.AnnotationType `json:"annotators,omitempty"`

As you can see the Annotators property is an array of values coming from an enum AnnotationType. The members of this enum are strings, however the TOML parser is getting confused by the type metadata associated with the values because of the enum. There IS a way to fix this, however it would require changes in the go-mod-configuration module which is used by every service in the EdgeX platform. This would greatly increase the surface area of the integration and expand the amount of functional testing required. Further, it would bleed Device-SDK specific type mgmt into a module whose concerns should be agnostic.

In the upcoming 3.0 release of EdgeX Foundry, the project is changing how configuration is defined and managed. In particular, the new config file markup will be YAML. I have POC'ed a small harness application to target the above scenario and see if the YAML implementation handles the enum typing correctly, and it does!

Therefore I propose we extend the configuration support within the Alvarium SDK to YAML. I believe this will be as simple as adding yaml struct tags to the members of all configuration types. The consuming app will implement the relevant config mgmt for reading the file and unmarshaling according to their desired format.

AnnotationType Validator Missing "src" as Value

The Validate() function for AnnotationType is missing "src" as a valid value, so an error is returned erroneously.

func (t AnnotationType) Validate() bool {
	if t == AnnotationPKI || t == AnnotationTPM {
		return true
	}
	return false
}
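
One possible shape for the fix is sketched below. The constant name AnnotationSource and the "pki" string value are assumptions made for illustration; only "src" and "tpm" appear as values in this tracker.

```go
package main

import "fmt"

// AnnotationType stands in for contracts.AnnotationType.
type AnnotationType string

const (
	AnnotationPKI    AnnotationType = "pki" // value assumed for this sketch
	AnnotationTPM    AnnotationType = "tpm"
	AnnotationSource AnnotationType = "src" // constant name assumed
)

// Validate reports whether t is one of the known annotation kinds,
// now including "src".
func (t AnnotationType) Validate() bool {
	switch t {
	case AnnotationPKI, AnnotationTPM, AnnotationSource:
		return true
	default:
		return false
	}
}

func main() {
	for _, t := range []AnnotationType{"pki", "tpm", "src", "bogus"} {
		fmt.Printf("%s valid=%t\n", t, t.Validate())
	}
}
```

A switch keeps the list of accepted kinds in one place, which makes the next addition a one-word change.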

Refactor Mutate Implementation to Use Annotator Factory

The implementation of the Mutate method on the SDK violates guidelines related to directionality of import references. The guideline is that in general, references from /pkg (contracts) to /internal should be avoided, while references in the other direction are fine. This helps to avoid the problem of circular references which can happen when references are taken on for convenience. It also preserves application layer integrity by encapsulating internal functionality away from contract functionality.

However, there is a case where this direction of reference is allowed -- when using a factory pattern in the contract package. The factory returns an internal type that conforms to a contract defined interface. You don't want the client using the SDK to reference the internal type directly, however it needs some kind of externally facing interface to work with the type.

As shown here

src := annotators.NewSourceAnnotator(s.cfg)

This method implementation violates the principle by calling the SourceAnnotator constructor directly. Instead, the existing factory should be extended for this use case. Refactor the associated test as well.

func NewAnnotator(kind contracts.AnnotationType, cfg config.SdkInfo) (interfaces.Annotator, error) {
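
A rough sketch of the factory extension follows. All types here are local stand-ins so the example compiles on its own: SdkInfo replaces config.SdkInfo, Annotator replaces interfaces.Annotator (its Kind method is a hypothetical placeholder), and sourceAnnotator plays the role of the internal type.

```go
package main

import (
	"errors"
	"fmt"
)

// AnnotationType stands in for contracts.AnnotationType.
type AnnotationType string

const AnnotationSource AnnotationType = "src"

// SdkInfo stands in for config.SdkInfo; fields are omitted in this sketch.
type SdkInfo struct{}

// Annotator stands in for interfaces.Annotator. Kind is a hypothetical
// placeholder for the real interface's methods.
type Annotator interface {
	Kind() AnnotationType
}

// sourceAnnotator plays the role of the type that lives under /internal.
type sourceAnnotator struct{ cfg SdkInfo }

func (a *sourceAnnotator) Kind() AnnotationType { return AnnotationSource }

// NewAnnotator is the single entry point callers use; they only ever see
// the Annotator interface, never the internal concrete type.
func NewAnnotator(kind AnnotationType, cfg SdkInfo) (Annotator, error) {
	switch kind {
	case AnnotationSource:
		return &sourceAnnotator{cfg: cfg}, nil
	default:
		return nil, errors.New("unrecognized annotator kind: " + string(kind))
	}
}

func main() {
	// What Mutate would do instead of calling the constructor directly.
	src, err := NewAnnotator(AnnotationSource, SdkInfo{})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("created annotator of kind", src.Kind())
}
```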

Non-SDK Annotators Break Unmarshal Validation

Our previous assumption was that annotators would all be defined in the SDK. To that end, we have a Validate() function on the AnnotationType constant that checks to ensure an annotation's Kind is within a known list.

func (t AnnotationType) Validate() bool {

However with the new direction to allow users to define custom Annotators (such as SecureBoot) the Kind value for the Annotator is not in this pre-defined list. When an Alvarium-enabled application (like the scoring apps) attempts to unmarshal custom annotations from the field, an error is thrown. But this error should NOT be thrown b/c we're simply dealing with a custom annotation.

It is likely that, since we've revised our original assumptions about the SDK design in relation to annotators, we should remove this validation check.

@jbonafide623 @Ali-Amin @KarimElghamry @DyrellC @michaelehab for visibility

Misleading factory error message

In factory.go, the switch case for MqttStream produces the same error message as the one for MockStream. It should be changed:

from

"invalid cast for MockStream"

to

"invalid cast for MqttStream"

Define SignProvider Interface and Related Factory

There should be an abstraction/interface behind which we can provide several different key handlers. The handler used should be driven by configuration and returned via a factory pattern.

For now, the factory will return an error as no concrete types are available yet. Implemented along with it will be unit tests to validate this behavior.
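
The initial state described above might look roughly like this. The method set on SignProvider and the SignType name are assumptions for illustration; the issue does not specify them.

```go
package main

import "fmt"

// SignProvider is the abstraction behind which key handlers would sit.
// The method signatures here are assumed, not taken from the SDK.
type SignProvider interface {
	Sign(key, content []byte) ([]byte, error)
	Verify(key, content, signature []byte) (bool, error)
}

// SignType identifies which key handler the configuration asks for.
// The name is hypothetical.
type SignType string

// NewSignProvider returns the handler matching kind. No concrete types
// exist yet, so every value currently yields an error.
func NewSignProvider(kind SignType) (SignProvider, error) {
	switch kind {
	// Concrete handlers would be added here as cases.
	default:
		return nil, fmt.Errorf("unrecognized sign type: %q", kind)
	}
}

func main() {
	_, err := NewSignProvider("ed25519")
	fmt.Println(err)
}
```

A unit test for this stage only needs to assert that the returned provider is nil and the error is non-nil.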

Externalize Hashing and Signature Functionality for Custom Annotations

This issue originates from #55 but can be completed independently.

Standardized functionality provided by the SDK to support hashing and signing needs to be exported for developers writing custom annotators. The recommended approach is to add respective support in our factories.

This will necessitate a refactoring of the standardized annotators to avoid a cyclical dependency as follows:

package github.com/project-alvarium/alvarium-sdk-go/internal/annotators
imports github.com/project-alvarium/alvarium-sdk-go/pkg/factories
imports github.com/project-alvarium/alvarium-sdk-go/internal/annotators: import cycle not allowed

The constructors will need to be extended via dependency injection to receive the HashProvider and SignatureProvider interfaces rather than instantiating those internally.
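
A minimal sketch of constructor injection is shown below. The interface method sets, the noopSigner placeholder, and the Do return values are assumptions; the point is only that the annotator receives its providers instead of importing a factory to build them.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// HashProvider and SignatureProvider use assumed method sets.
type HashProvider interface {
	Derive(data []byte) string
}

type SignatureProvider interface {
	Sign(content []byte) (string, error)
}

// sha256Provider is one possible HashProvider.
type sha256Provider struct{}

func (sha256Provider) Derive(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// noopSigner is a placeholder signer used only in this sketch.
type noopSigner struct{}

func (noopSigner) Sign(content []byte) (string, error) { return "unsigned", nil }

// annotator holds its dependencies rather than instantiating them,
// so it never needs to import the factories package.
type annotator struct {
	hash HashProvider
	sign SignatureProvider
}

func NewAnnotator(h HashProvider, s SignatureProvider) *annotator {
	return &annotator{hash: h, sign: s}
}

// Do derives a key from the data and signs that key.
func (a *annotator) Do(data []byte) (key, signature string, err error) {
	key = a.hash.Derive(data)
	signature, err = a.sign.Sign([]byte(key))
	return key, signature, err
}

func main() {
	key, sig, err := NewAnnotator(sha256Provider{}, noopSigner{}).Do([]byte("payload"))
	fmt.Println(key, sig, err)
}
```

A custom annotator written outside the SDK could take the same two interfaces in its own constructor.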

Add CI/CD annotator types to contracts

The scoring-apps-go project depends on the alvarium-sdk-go project for the definition of the annotator types. However, the alvarium-sdk-go project does not include the CI/CD types of annotations that are implemented in alvarium-java-sdk such as “source-code”, “vulnerability”, and “checksum” in its contracts. This causes an error when a Jenkins pipeline emits annotations of these types to the scoring apps, as the scoring-apps-go project validates the annotator types and returns an error message like “invalid AnnotatorType value provided %s”.

To fix this issue, we need to update the alvarium-sdk-go project to include constants for the new annotator types in its contracts. This will keep the scoring-apps-go project compatible and consistent with the Jenkins pipeline.

Extend Hedera Config for IP and Port Support

We've found that for testing or local development purposes it may be necessary to stand up a local Hedera node via their docker-compose. The Hedera Publisher in our SDK currently relies on connection methods provided by the Hedera SDK, which hardcode IP and port information. If one needs to connect locally or to an endpoint in a lab, there's currently no way to do this.

Extend the Hedera config to support an IP/port combination and integrate this into the appropriate Hedera client factory logic.

Implement Console StreamProvider

When developing or debugging an Alvarium-enabled application, it would be useful to have a "console" version of the StreamProvider to eliminate the need for a message broker. With this provider, annotations generated by the application would simply be written out to the console rather than having to run a broker and the accompanying Subscriber application.
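
Here is a sketch of what such a provider could look like. The method names (Connect, Publish, Close) and the PublishWrapper fields are simplified assumptions, not the SDK's actual StreamProvider contract.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
	"strings"
)

// PublishWrapper is a trimmed stand-in for the SDK message wrapper.
type PublishWrapper struct {
	Action  string `json:"action"`
	Content []byte `json:"content"`
}

// consoleProvider writes each published message as one JSON line.
type consoleProvider struct{ out io.Writer }

func NewConsoleProvider(out io.Writer) *consoleProvider {
	return &consoleProvider{out: out}
}

// Connect and Close are no-ops because there is no broker to manage.
func (p *consoleProvider) Connect() error { return nil }
func (p *consoleProvider) Close() error   { return nil }

// Publish marshals the wrapper and writes it to the configured writer.
func (p *consoleProvider) Publish(msg PublishWrapper) error {
	b, err := json.Marshal(msg)
	if err != nil {
		return err
	}
	_, err = fmt.Fprintln(p.out, string(b))
	return err
}

// publishToString is a helper for tests: it captures what would be printed.
func publishToString(msg PublishWrapper) (string, error) {
	var sb strings.Builder
	err := NewConsoleProvider(&sb).Publish(msg)
	return sb.String(), err
}

func main() {
	p := NewConsoleProvider(os.Stdout)
	if err := p.Publish(PublishWrapper{Action: "create", Content: []byte("hi")}); err != nil {
		fmt.Println(err)
	}
}
```

Taking an io.Writer rather than writing to os.Stdout directly keeps the provider testable without capturing process output.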

Remove IOTA C-Bindings

During the TSC call on 19-June-2023 it was decided that the legacy C bindings formerly used to integrate with IOTA will be removed. This is primarily because they have aged significantly since they were originally contributed. Beyond their lack of use, they add a burden during integration as a statically linked dependency that may not even be used once the targeted application is deployed. This adds developer overhead and increases the size of the build artifact.

Update Annotation Schema to Include "Tag" and "Layer" Fields

Summary

This issue proposes an enhancement to the annotation schema by adding two new fields: "tag" and "layer". These additions aim to enrich our annotations with metadata that links scores across different stack layers and provides additional context.

Proposed Change

Introducing the following new fields to the annotation schema:

  • Tag: A field designed to associate annotations with specific metadata, aiding in linking scores. It will allow us to find the scores of the lower layers of the stack that affect the current layer score.
  • Layer: A field that specifies the stack layer at which the annotations are produced. It is passed to the SDK through the configuration and accepts an enumerated value of either "app", "cicd", "os", or "host".
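
The two fields could be modeled roughly as follows. The type name LayerType, the constant names, the JSON keys, and the trimmed Annotation struct are all assumptions; only the four accepted layer values come from the proposal.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LayerType enumerates the stack layers named in the proposal.
type LayerType string

const (
	LayerApp  LayerType = "app"
	LayerCiCd LayerType = "cicd"
	LayerOs   LayerType = "os"
	LayerHost LayerType = "host"
)

// Validate reports whether l is one of the four accepted layers.
func (l LayerType) Validate() bool {
	switch l {
	case LayerApp, LayerCiCd, LayerOs, LayerHost:
		return true
	default:
		return false
	}
}

// Annotation is a trimmed stand-in showing only the fields relevant here.
type Annotation struct {
	Kind  string    `json:"kind"`
	Tag   string    `json:"tag,omitempty"`
	Layer LayerType `json:"layer"`
}

func main() {
	a := Annotation{Kind: "tpm", Tag: "build-1234", Layer: LayerHost}
	b, err := json.Marshal(a)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(b), "layer valid:", a.Layer.Validate())
}
```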

Add Support for Signature Validation via HTTP Header

For context, see comment here:

// The question of how/whether to validate signed data is tricky. We want this SDK to be as agnostic of the application data

Essentially, our current PKI annotator has some constraints in that incoming data must support unmarshaling via JSON to a type that has a Signature property. This eliminates our ability to annotate data that is not in JSON format or otherwise does not provide the necessary property. We need to leverage a means independent of the data payload for this capability and so I propose looking for some way to leverage the means of transport.

The following IETF draft proposes using HTTP headers to support signature validation.
https://httpwg.org/http-extensions/draft-ietf-httpbis-message-signatures.html

Assuming the relevant Signature and Signature-Input headers are present, in an HTTP context the Request can be passed into the annotator which would obtain the values and run the verification. Using headers would also possibly provide insight toward a relevant abstraction for a similar means of verification in pub-sub scenarios where message headers are available.
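
As a first step, the annotator would need to pull the two header values off the request. The sketch below covers only that extraction, using net/http from the standard library; parsing Signature-Input and verifying the signature are deliberately left out. The helper names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// signatureHeaders returns the raw Signature and Signature-Input header
// values, or an error when either one is absent.
func signatureHeaders(r *http.Request) (signature, input string, err error) {
	signature = r.Header.Get("Signature")
	input = r.Header.Get("Signature-Input")
	if signature == "" || input == "" {
		return "", "", errors.New("request is missing Signature or Signature-Input header")
	}
	return signature, input, nil
}

// newRequest builds a request carrying the given header values; empty
// values are left unset. It exists only to exercise signatureHeaders.
func newRequest(signature, input string) (*http.Request, error) {
	r, err := http.NewRequest(http.MethodPost, "http://example.com/data", nil)
	if err != nil {
		return nil, err
	}
	if signature != "" {
		r.Header.Set("Signature", signature)
	}
	if input != "" {
		r.Header.Set("Signature-Input", input)
	}
	return r, nil
}

func main() {
	r, err := newRequest("<signature value>", "<signature input value>")
	if err != nil {
		fmt.Println(err)
		return
	}
	sig, in, err := signatureHeaders(r)
	fmt.Println(sig, in, err)
}
```

Because the annotator depends only on header values, the same extraction step could later be fed from message headers in a pub/sub transport.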

Unmarshal StreamInfo Mock Returns IOTA Config

if a.Type == contracts.IotaStream || a.Type == contracts.MockStream {

It seems odd that unmarshaling a config designated as a Mock would return an actual provider config. Is there a reason for this? I'm using a Mock in a unit test, and the unit test is comparing the result of unmarshaling to a control instance of a configuration. The test config is provisioned as a mock, as is the control instance. However the comparison is failing b/c when the test instance is unmarshaled, it has this IOTAConfig in it. This means I have to put a spurious IOTAConfig in my control instance, which I'd prefer not to do.

A mocked StreamProvider should just bury anything sent to it, so it doesn't really need a Config instance at all -- or at least just a shim.
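
One way the unmarshal logic could treat mocks is sketched here. The type names, the "iota"/"mock" string values, the JSON keys, and the IotaStreamConfig field are assumptions for illustration and may not match the SDK's definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type StreamType string

const (
	IotaStream StreamType = "iota" // string values assumed
	MockStream StreamType = "mock"
)

// IotaStreamConfig is a hypothetical, trimmed provider config.
type IotaStreamConfig struct {
	Provider string `json:"provider"`
}

type StreamInfo struct {
	Type   StreamType
	Config interface{}
}

// UnmarshalJSON decodes the provider config only for real providers.
// A mock stream gets no config at all.
func (s *StreamInfo) UnmarshalJSON(data []byte) error {
	var a struct {
		Type   StreamType      `json:"type"`
		Config json.RawMessage `json:"config"`
	}
	if err := json.Unmarshal(data, &a); err != nil {
		return err
	}
	switch a.Type {
	case IotaStream:
		var cfg IotaStreamConfig
		if err := json.Unmarshal(a.Config, &cfg); err != nil {
			return err
		}
		s.Config = cfg
	case MockStream:
		s.Config = nil
	default:
		return fmt.Errorf("unrecognized stream type: %q", a.Type)
	}
	s.Type = a.Type
	return nil
}

// parse is a small helper that decodes a JSON string into a StreamInfo.
func parse(raw string) (StreamInfo, error) {
	var s StreamInfo
	err := json.Unmarshal([]byte(raw), &s)
	return s, err
}

func main() {
	s, err := parse(`{"type":"mock"}`)
	fmt.Println(s.Type, s.Config, err)
}
```

With this shape, a control instance in a unit test can be a StreamInfo with only Type set.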

Mutate SDK Method Should Ignore TLS Annotation

Use case:

  • Service A originates a piece of data and passes it to Service B via REST
  • Service B is responsible for mutating the data in some way, but first it annotates the received data via Sdk.Transit()
  • Service B then performs its filter/mutate action on the original data and so calls Sdk.Mutate()
  • Because the annotators for the Sdk are passed in once, both of these methods create the same annotations. However the new data element mutated from the original is penalized due to the TLS annotator called by Sdk.Mutate()

Certain annotations may not be relevant in all cases. Data that originates within a service will have no need of a TLS annotator. Only Transit() should care about that.

There are probably ramifications for the overall SDK design here, indicating that annotations should be configured per Sdk method rather than applied universally. This is because more than one SDK method can be called by an application as part of a given execution path.

For immediate purposes, the recommendation is to ignore the TLS annotator when Sdk.Mutate() is called.
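
The immediate recommendation could be implemented as a simple filter applied inside Mutate. This sketch assumes each annotator can report its kind through a Kind method, which is a hypothetical addition to the stand-in interface used here.

```go
package main

import "fmt"

type AnnotationType string

const (
	AnnotationTLS AnnotationType = "tls"
	AnnotationTPM AnnotationType = "tpm"
)

// Annotator is a stand-in interface; Kind is assumed for this sketch.
type Annotator interface {
	Kind() AnnotationType
}

type stubAnnotator struct{ kind AnnotationType }

func (s stubAnnotator) Kind() AnnotationType { return s.kind }

// withoutKind returns a new slice containing every annotator except
// those of the skipped kind. The input slice is left untouched.
func withoutKind(all []Annotator, skip AnnotationType) []Annotator {
	kept := make([]Annotator, 0, len(all))
	for _, a := range all {
		if a.Kind() != skip {
			kept = append(kept, a)
		}
	}
	return kept
}

func main() {
	configured := []Annotator{stubAnnotator{AnnotationTPM}, stubAnnotator{AnnotationTLS}}
	// Mutate would iterate over this filtered list; Transit keeps the full one.
	for _, a := range withoutKind(configured, AnnotationTLS) {
		fmt.Println("mutate runs:", a.Kind())
	}
}
```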

Extend SDK to Assemble Signature Related Http Request Headers

Now that we've provided the annotator to validate an HTTP Signature header, it is suggested we also provide a means through the SDK to assemble the Signature-Input and Signature header values.

An example of how these might be used can be found in the logic for the test TestHttpPkiAnnotator_Do. Referring to the following line

fields := []string{string(method), string(path), string(authority), contentType, contentLength}

A list of keys is provided to specify which headers should be included as part of the Signature-Input header. These are passed along with some other information into a function that assembles the header values for inclusion with the HTTP request that is about to be sent.

This functionality should be located in the /pkg directory. I'm looking for suggestions from the team. Once complete, this new functionality should replace the inline functions in the above referenced test. That is, the test should be able to use the new logic instead of maintaining a duplicate copy.

Add Support for ECDSA Signature Validation

We have a use case where a partner would like to use the ECDSA algorithm for validating signatures on data. Add support for this via the existing factory and the capabilities in the Go standard library.
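
For reference, the standard library pieces involved are sketched below using crypto/ecdsa with SignASN1 and VerifyASN1. The P-256 curve, the SHA-256 digest, and the function names are choices made for this example and are not requirements stated in the issue; how keys are loaded from configuration is out of scope here.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// sign hashes content with SHA-256 and returns an ASN.1 DER encoded
// ECDSA signature over the digest.
func sign(priv *ecdsa.PrivateKey, content []byte) ([]byte, error) {
	digest := sha256.Sum256(content)
	return ecdsa.SignASN1(rand.Reader, priv, digest[:])
}

// verify reports whether signature is valid for content under pub.
func verify(pub *ecdsa.PublicKey, content, signature []byte) bool {
	digest := sha256.Sum256(content)
	return ecdsa.VerifyASN1(pub, digest[:], signature)
}

// signThenVerify signs one payload with a fresh P-256 key and checks the
// signature against a second payload.
func signThenVerify(signed, checked []byte) (bool, error) {
	priv, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return false, err
	}
	sig, err := sign(priv, signed)
	if err != nil {
		return false, err
	}
	return verify(&priv.PublicKey, checked, sig), nil
}

func main() {
	ok, err := signThenVerify([]byte("reading"), []byte("reading"))
	fmt.Println("same payload verifies:", ok, err)
	ok, err = signThenVerify([]byte("reading"), []byte("tampered"))
	fmt.Println("tampered payload verifies:", ok, err)
}
```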

PublishWrapper.Content property type should be []byte

Content interface{} `json:"content,omitempty"`

The type of that property should not be interface{}. It should be []byte to agree with the SubscribeWrapper. Making this change will eliminate the need for provider-specific wrapper classes such as the mqttWrapper:

type mqttWrapper struct {

The above wrapper class was created when a problem was noticed unwrapping messages via MQTT. Now that the behavior has also been observed (and solved via local testing) using IOTA, the root cause was more properly identified.
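
To illustrate why matching []byte types help, here is a small round trip through encoding/json. The wrapper structs are trimmed stand-ins; the messageType JSON key and the reading payload type are assumptions. encoding/json encodes a []byte as a base64 string, so the payload bytes arrive on the subscriber side exactly as they were published.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PublishWrapper and SubscribeWrapper are trimmed stand-ins that share
// the same []byte Content type.
type PublishWrapper struct {
	MessageType string `json:"messageType"`
	Content     []byte `json:"content,omitempty"`
}

type SubscribeWrapper struct {
	MessageType string `json:"messageType"`
	Content     []byte `json:"content,omitempty"`
}

// reading is a hypothetical payload type.
type reading struct {
	Value int `json:"value"`
}

// roundTrip marshals a PublishWrapper and decodes the bytes into a
// SubscribeWrapper, as a subscriber would.
func roundTrip(kind string, payload []byte) (SubscribeWrapper, error) {
	wire, err := json.Marshal(PublishWrapper{MessageType: kind, Content: payload})
	if err != nil {
		return SubscribeWrapper{}, err
	}
	var sw SubscribeWrapper
	err = json.Unmarshal(wire, &sw)
	return sw, err
}

func main() {
	payload, err := json.Marshal(reading{Value: 42})
	if err != nil {
		fmt.Println(err)
		return
	}
	sw, err := roundTrip("reading", payload)
	if err != nil {
		fmt.Println(err)
		return
	}
	// The subscriber picks the target type based on MessageType.
	var r reading
	if err := json.Unmarshal(sw.Content, &r); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(sw.MessageType, r.Value)
}
```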

Implement Annotator for TLS Verification over HTTP

Implement a new Annotator that will validate whether or not the incoming request over HTTP is secured via TLS.

For now, this will not cover pub/sub scenarios (such as MQTT over TLS) but I expect that may become a requirement in the future.
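
The core check is small. For requests received by a net/http server, the TLS field on http.Request is non-nil when the connection is TLS-enabled. The sketch below shows just that check; the function names are hypothetical, and httptest is used only to build sample requests.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// isSecure reports whether the incoming request arrived over TLS.
func isSecure(r *http.Request) bool {
	return r.TLS != nil
}

// secureFor builds a sample server-side request for target and checks it.
// httptest.NewRequest sets a dummy TLS value for https targets.
func secureFor(target string) bool {
	return isSecure(httptest.NewRequest(http.MethodGet, target, nil))
}

func main() {
	fmt.Println("https:", secureFor("https://example.com/data"))
	fmt.Println("http: ", secureFor("http://example.com/data"))
}
```

An annotator built on this would set IsSatisfied from the result of isSecure. If TLS is terminated upstream by a proxy, the field will be nil at the application, which is worth keeping in mind when interpreting the annotation.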

Incomplete support for type unmarshaling from pub/sub

Currently only a PublishWrapper is provided by the messages package. The idea behind this type is to allow an indication of the type for the Content property's value. But when an instance of this type itself is marshaled to JSON as a byte array, it makes type introspection impossible on the other end (subscribing). As such, Content can't be cast to a type whose kind is provided in MessageType.

Suggest implementing a SubscribeWrapper to be used by subscribers where Content is a byte array. Individual publishers should then be responsible for handling the incoming PublishWrapper in such a manner that Content is a byte array when consumed on the other end.

Complete Simple TPM Check in Relevant Annotator

Currently the TPM annotator simply sets its IsSatisfied property to false. Extend this with a simple check for what appears to be the default path to a TPM 2.0 device, /dev/tpm0.

If a file indicating a mounted device exists at that location, IsSatisfied should be true.
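
A minimal version of this check, using only os.Stat, might look like the following. The function and constant names are illustrative.

```go
package main

import (
	"fmt"
	"os"
)

// defaultTpmPath is what appears to be the default TPM 2.0 device path.
const defaultTpmPath = "/dev/tpm0"

// tpmPresent reports whether something exists at path. It does not
// confirm that the entry is actually a TPM device.
func tpmPresent(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	// The annotator would assign this result to IsSatisfied.
	fmt.Println("IsSatisfied:", tpmPresent(defaultTpmPath))
}
```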

Two potential follow-ups:

  • At some point, someone may have a non-standard path and wish to pass this in via config. However passing in a path via config means someone could just pass in any path, not necessarily that of the TPM
  • I looked for a programmatic way to get this information apart from checking the existence of a path, but the options would require either that the application consuming the SDK run as root or that we look for magic string values in logs/files.

Google's go-tpm module provides functionality for interacting with TPMs both version 1.2 and 2.0. However I'm avoiding that dependency at this time until we have more sophisticated needs. I want to keep the SDK's transitive dependencies as low as possible and there are quite a few in that module.

Add support for non-SDK annotators in config

A use case has been articulated whereby a customer utilizing the DCF wants to define an annotator and not have it included in the SDK. This could be for a few reasons:

  • The logic in the annotator is proprietary
  • The purpose of the annotator isn't widely generalizable
  • The developer doesn't utilize any other language implementation of the SDK and doesn't want to port the annotator

During the Alvarium TSC call on 25-Mar-2024, an approach was demonstrated whereby the sdk/annotators portion of the config was separated into two arrays -- basic and extended.

  • basic is an array containing keys to annotators implemented in the SDK
  • extended allows a developer to define an array of keys for custom annotators

basic would facilitate the current method of looping through the annotator keys and instantiating each one at application startup through the normal factory pattern. extended could be used in a similar fashion if the developer needs to instantiate multiple custom annotators. Alternatively, if there is only one custom annotator, the developer could ignore this property and inline the instantiation. For example:

	// List of annotators driven from config. Initialize basic SDK annotators first, followed by custom/extended.
	var annotators []interfaces.Annotator
	for _, t := range cfg.Sdk.Annotators.Basic {
		instance, err := factories.NewAnnotator(t, cfg.Sdk)
		if err != nil {
			logger.Error(err.Error())
			os.Exit(1)
		}
		annotators = append(annotators, instance)
	}
	// The custom annotators implemented in this app follow. This could be a loop as well.
	bootAnnotator := local.NewSecureBootAnnotator(cfg.Sdk)
	// Add custom annotator to list of annotators passed to SDK
	annotators = append(annotators, bootAnnotator)

An example of the new config definition in JSON follows:

"sdk": {
    "annotators": {
      "basic": ["tpm", "tls"],
      "extended": ["secboot"]
    }
    ...
}
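
On the Go side, the split config could be represented along these lines. The struct names are assumptions and plain strings are used for the keys for brevity; the JSON keys follow the example above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AnnotatorConfig separates SDK-provided annotators from custom ones.
type AnnotatorConfig struct {
	Basic    []string `json:"basic"`
	Extended []string `json:"extended,omitempty"`
}

// SdkConfig and Config are trimmed stand-ins for the real config types.
type SdkConfig struct {
	Annotators AnnotatorConfig `json:"annotators"`
}

type Config struct {
	Sdk SdkConfig `json:"sdk"`
}

// sample mirrors the JSON example, wrapped in an enclosing object.
const sample = `{
  "sdk": {
    "annotators": {
      "basic": ["tpm", "tls"],
      "extended": ["secboot"]
    }
  }
}`

// parse decodes a JSON document into a Config.
func parse(raw string) (Config, error) {
	var cfg Config
	err := json.Unmarshal([]byte(raw), &cfg)
	return cfg, err
}

func main() {
	cfg, err := parse(sample)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("basic:", cfg.Sdk.Annotators.Basic)
	fmt.Println("extended:", cfg.Sdk.Annotators.Extended)
}
```

Marking extended as omitempty keeps existing configs valid when no custom annotators are defined.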

In addition, standardized functionality provided by the SDK to support hashing and signing needs to be exported somehow for developers writing custom annotators. It's possible this could be done by adding respective support in our factories.

Consider Publish SDK Method

I propose consideration of extending the SDK interface with a Publish method. The purpose of this method would be to provide extensibility for annotators that may need to attest to the state of data before it is sent over the wire. This could then be compared to the result of annotations captured during Transit by a receiving party/host. Publish could also be useful in cases where the downstream host receiving the data isn't running Alvarium-enabled applications. Let me illustrate with a couple use cases.

Both of these assume the application developer has a need to implement custom annotators for their application domain by implementing the Annotator interface

type Annotator interface {

1.) Some piece of data -- for example, a customer -- must be scrubbed of all personally-identifying information (PII) before the record is sent downstream. The developer implements a domain model representing the customer state and passes that to a custom PiiAnnotator to verify that the data has been scrubbed. This new annotator must be developed by the application owner since it requires knowledge of the application state. Before publishing the anonymized record, the application can invoke the SDK's Publish method to attest to the fact that it scrubbed the data before transmit.

2.) If the application owner also owns the downstream endpoint, then the custom annotator could be utilized to validate the scrubbed state of the data upon receipt. If the application owner does NOT own the downstream endpoint, the Publish annotation could provide evidence of how the data was handled prior to export.

Race Condition Publishing to IOTA

I've learned during operational testing that the C-bindings in the IOTA Streams provider do not provide any kind of locking to guard against concurrent writes. The underlying Streams implementation requires a sequential write to the stream otherwise messages can get lost and/or individual subscribers might get confused and stop processing messages.

It looks like the simple way to deal with this for now is to wrap the act of writing the message to the stream in a mutex.

cErr := C.sub_send_signed_packet(

The mutex will guarantee only one writer gains access to this piece of logic at a time. The Alvarium SDK is bootstrapped in a given application as a single instance. In cases where two writes are performed quickly or simultaneously between goroutines, we can observe the above irregular behavior.
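
The locking pattern is sketched below in pure Go. The append to a slice stands in for the C.sub_send_signed_packet call, and the publisher type and helper names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// publisher serializes writes to the underlying stream.
type publisher struct {
	mu   sync.Mutex
	sent []string // stands in for the stream written by the C binding
}

// Publish allows only one goroutine at a time into the write path.
func (p *publisher) Publish(msg string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	// In the real provider, the call into the C binding would go here.
	p.sent = append(p.sent, msg)
}

// publishConcurrently fires n publishes from separate goroutines and
// returns how many messages were recorded.
func publishConcurrently(n int) int {
	p := &publisher{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			p.Publish(fmt.Sprintf("message-%d", i))
		}(i)
	}
	wg.Wait()
	return len(p.sent)
}

func main() {
	fmt.Println("messages recorded:", publishConcurrently(100))
}
```

The mutex guarantees mutual exclusion but not ordering between goroutines; callers that need a specific sequence still have to publish from a single goroutine or coordinate among themselves.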

Confirmed this with @DyrellC from IOTA on a call a couple weeks ago.
