nephio-project / nephio

Nephio is a Kubernetes-based automation platform for deploying and managing highly distributed, interconnected workloads such as 5G Network Functions, and the underlying infrastructure on which those workloads depend.

License: Apache License 2.0

Languages: Go 92.59%, Makefile 5.62%, Dockerfile 1.79%

nephio's Issues

Applying configuration changes to instance and instance sets

Configuration changes such as capacity, CPU, memory, and labels could be applied at the following granularities:

Instance, which represents a specific instance of a network function, for example UPF-A or SMF-B.
Instance Set, which represents a set of network functions of the same type, such as a set of UPFs or a set of SMFs.

Supporting both granularities leads to some ambiguous situations.
For example, consider an Instance Set (InstanceSet-X) of UPFs that contains three UPF instances: UPF-A, UPF-B and UPF-C.
On day 1, a memory change is applied to InstanceSet-X to set the memory for all instances to 4 GB.
On day 2, a memory change is applied only to the instance UPF-B, giving it 8 GB of memory.
On day 3, a memory change is again applied to InstanceSet-X to set the memory for all instances to 6 GB.

After the day-3 change, what should the memory setting of UPF-B be: 8 GB or 6 GB?
Should we resolve this automatically with a 3-way git merge, assuming the user's intent?
Should we explicitly ask the user about their intent when applying changes to Instance Sets?

Here is one possible way to resolve ambiguity.

We could let the user apply configuration changes to either an "Instance" or an "Instance Set". When the user targets an Instance Set, they can also specify an exclude filter listing the instances to which the configuration should not be applied. Changes are always applied to the resulting instances (the entire Instance Set when the exclude filter is empty). The outcome is then deterministic and holds no surprises for the user, because we do not assume their intent; they express it explicitly.
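As a rough illustration of the exclude-filter idea, here is a minimal Go sketch that replays the day 1 to day 3 scenario. The InstanceSet type and the ResolveTargets and ApplyMemory helpers are hypothetical names invented for this example; they are not part of any Nephio API.

```go
package main

import (
	"fmt"
	"sort"
)

// InstanceSet is a hypothetical stand-in for the proposed resource: a named
// group of network function instances of the same type.
type InstanceSet struct {
	Name      string
	Instances []string
}

// ResolveTargets returns the instances a set-level change applies to:
// every member of the set except those named in exclude.
// An empty exclude list selects the entire set.
func ResolveTargets(set InstanceSet, exclude []string) []string {
	skip := make(map[string]bool, len(exclude))
	for _, name := range exclude {
		skip[name] = true
	}
	targets := []string{}
	for _, name := range set.Instances {
		if !skip[name] {
			targets = append(targets, name)
		}
	}
	sort.Strings(targets)
	return targets
}

// ApplyMemory records the memory setting (in GB) for every target instance.
func ApplyMemory(memoryGB map[string]int, targets []string, gb int) {
	for _, name := range targets {
		memoryGB[name] = gb
	}
}

func main() {
	set := InstanceSet{Name: "InstanceSet-X", Instances: []string{"UPF-A", "UPF-B", "UPF-C"}}
	memoryGB := map[string]int{}

	// Day 1: whole set, no exclusions.
	ApplyMemory(memoryGB, ResolveTargets(set, nil), 4)
	// Day 2: a single instance.
	ApplyMemory(memoryGB, []string{"UPF-B"}, 8)
	// Day 3: whole set again, with UPF-B explicitly excluded by the user.
	ApplyMemory(memoryGB, ResolveTargets(set, []string{"UPF-B"}), 6)

	fmt.Println(memoryGB) // map[UPF-A:6 UPF-B:8 UPF-C:6]
}
```

Had the user left the exclude filter empty on day 3, UPF-B would have ended up at 6 GB as well. Either way the result follows from what the user stated, not from an inferred intent.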

Sequence diagram showing UPF package deployment / hydration.

Design proposal for PackageVariant/PackageVariantSet

Revise kptdev/kpt#3827 and determine exactly what needs to be completed for R1.

Develop NAD inject Function

https://github.com/henderiw-nephio/nad-inject-fn
Inject the NAD into the package, based on the IP allocation, as a kpt function.

Design the function and document it.
Implement the NAD Inject function and unit test
Readme for NAD Injector function with details and build procedure

Implement edge watcher GRPC server

Implement the gRPC server for the edge watcher.
EdgeWatcher will be a service/pod running on the management cluster that exposes a List/Watch interface through which clients can access statuses.
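To make the List/Watch idea more concrete, the following Go sketch shows an in-memory status store of the kind that could sit behind such a server. It deliberately leaves out gRPC itself, and the Status and StatusStore types are assumptions made for illustration, not the EdgeWatcher design.

```go
package main

import (
	"fmt"
	"sync"
)

// Status is a hypothetical status record reported from a workload cluster.
type Status struct {
	Cluster  string
	Resource string
	Ready    bool
}

// StatusStore keeps the latest status per cluster/resource and notifies watchers.
type StatusStore struct {
	mu       sync.Mutex
	latest   map[string]Status
	watchers []chan Status
}

// NewStatusStore returns an empty store.
func NewStatusStore() *StatusStore {
	return &StatusStore{latest: map[string]Status{}}
}

// Update records a status and fans it out to every watcher.
// A watcher whose buffer is full misses the event instead of blocking the store.
func (s *StatusStore) Update(st Status) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.latest[st.Cluster+"/"+st.Resource] = st
	for _, w := range s.watchers {
		select {
		case w <- st:
		default:
		}
	}
}

// List returns a snapshot of the latest known statuses, in no particular order.
func (s *StatusStore) List() []Status {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := make([]Status, 0, len(s.latest))
	for _, st := range s.latest {
		out = append(out, st)
	}
	return out
}

// Watch returns a channel that receives statuses recorded after this call.
func (s *StatusStore) Watch(buffer int) <-chan Status {
	s.mu.Lock()
	defer s.mu.Unlock()
	ch := make(chan Status, buffer)
	s.watchers = append(s.watchers, ch)
	return ch
}

func main() {
	store := NewStatusStore()
	events := store.Watch(8)

	store.Update(Status{Cluster: "edge-1", Resource: "upf-a", Ready: true})

	fmt.Println("watched:", <-events)
	fmt.Println("listed:", len(store.List()), "status(es)")
}
```

A real server would also need watcher cancellation and a way for clients to resume after a missed event; both are omitted here.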

Design watcher server
Implement and unit test watcher agent
Readme with details and build procedure for watcher agent

Exception handling, error propagation and corrective measures for network functions deployments.

Let's say a user intends to deploy a topology that results in installing NFs in three workload clusters, and the deployment fails on one of the clusters for some reason. In this scenario:

How does the management cluster track status?
How does the user learn about the issue?
How could the error be corrected? Does it require a total redeployment from the user's perspective?

In general, errors could occur in multiple places:

At the management cluster, before fan-out.
At the management cluster, after fan-out. For example, in IPAM.
At the workload cluster, due to its local issues.
At the workload cluster, due to a shift-left operation on the management cluster. An example of this scenario is IP address injection on the management cluster. Let's assume IPAM did the right job and injected the IP address, and the package was deployed on the workload cluster. But for some reason (due to runtime issues) the UPF deployment on that workload cluster requires a new set of IPs, which can only be assigned on the management cluster.

In each of these scenarios:

What is the user experience?
How are the errors propagated and communicated?
Importantly, how are users going to resolve these issues to reach a successful deployment? Do they have to redeploy the whole topology, or is there a way to fix it surgically without compromising the abstractions we are trying to provide to the user?

This issue was discussed on slack and here is the link to the slack thread.

https://nephio.slack.com/archives/C03MB5GRATS/p1677616067921709

Create CRDs for Network Instance Set

Create CRDs for Network Instance Set
Refer to this document

Develop library for computing the differences in KRM resources when they change and produce concrete differences

Develop library for computing the differences in KRM resources when they change and produce concrete differences. This will provide concrete outputs for what is added/modified/deleted so that users of this library can take appropriate actions.
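One possible shape for such a library, sketched in Go under simplifying assumptions: resources are decoded into generic maps, identity is apiVersion/kind/namespace/name, and any deep inequality counts as a modification. The Resource, Diff, Key and Compare names are made up for this example.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// Resource is a KRM object decoded into a generic map, as YAML or JSON
// decoding would produce.
type Resource map[string]any

// Diff lists resource keys grouped by the kind of change.
type Diff struct {
	Added, Modified, Deleted []string
}

// Key identifies a resource by apiVersion, kind, namespace and name.
func Key(r Resource) string {
	meta, _ := r["metadata"].(map[string]any)
	return fmt.Sprintf("%v/%v/%v/%v", r["apiVersion"], r["kind"], meta["namespace"], meta["name"])
}

// Compare reports which resources were added, modified or deleted between
// two versions of a package.
func Compare(old, updated []Resource) Diff {
	before := map[string]Resource{}
	for _, r := range old {
		before[Key(r)] = r
	}

	var d Diff
	seen := map[string]bool{}
	for _, r := range updated {
		k := Key(r)
		seen[k] = true
		prev, existed := before[k]
		switch {
		case !existed:
			d.Added = append(d.Added, k)
		case !reflect.DeepEqual(prev, r):
			d.Modified = append(d.Modified, k)
		}
	}
	for k := range before {
		if !seen[k] {
			d.Deleted = append(d.Deleted, k)
		}
	}

	sort.Strings(d.Added)
	sort.Strings(d.Modified)
	sort.Strings(d.Deleted)
	return d
}

func configMap(name string, replicas int) Resource {
	return Resource{
		"apiVersion": "v1",
		"kind":       "ConfigMap",
		"metadata":   map[string]any{"name": name, "namespace": "default"},
		"data":       map[string]any{"replicas": replicas},
	}
}

func main() {
	old := []Resource{configMap("a", 1), configMap("b", 1)}
	updated := []Resource{configMap("a", 2), configMap("c", 1)}

	d := Compare(old, updated)
	fmt.Println("added:", d.Added)
	fmt.Println("modified:", d.Modified)
	fmt.Println("deleted:", d.Deleted)
}
```

A fuller library would probably report field-level differences too, so that callers can react to a specific change instead of to the whole resource.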

Implement Package Dependency Controller

See kptdev/kpt#3448 for prior discussion in the kpt community on this (stemming from Nephio). We can deliver this in Porch or in Nephio; it’s up to us. I suspect that if we build it in Nephio we may eventually want to upstream it to Porch, since it is quite general-purpose in its utility.

This is a set of CRDs to represent some basic dependencies, and a controller that can propose additional packages to fulfill those dependencies.

Some of those dependencies could be explicit, and some implicit. Explicit dependencies must be declared by the package author; implicit dependencies may be discovered by the system by examining the package contents.

Some implicit examples:

If a package contains namespaced resources, the namespace must exist.
If a package contains a CR, the CRD must be loaded in the cluster.

Some explicit examples:

A package may need some other service (e.g., a database), but does not embed that service within itself.
A package may need a Secret that is provisioned out-of-band.

Each dependency may be resolved in many different ways. For example, a namespace resource could already exist in the destination cluster, or we could add the resource directly in the package, or we could propose a separate package to be deployed that will provision the namespace. We’ll need to figure out how the controller decides (i.e., how we specify policies) which of these mechanisms to use to resolve a given dependency.

The same conditions mechanism we used for IPAM can be used for dependency management, with each dependency representing a condition that must be resolved before we can approve the package for deployment.
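The implicit examples above (namespaces and CRDs) could be discovered roughly as in the Go sketch below. Everything here is hypothetical: the Resource and Dependency types, the ImplicitDependencies function, and above all the heuristic that an API group containing a dot and not ending in ".k8s.io" belongs to a custom resource. A real controller would need a more reliable way to tell built-in kinds from custom ones, and would also check whether the package already ships the CRD.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Resource holds only the fields this sketch needs from a KRM object.
type Resource struct {
	APIVersion, Kind, Namespace, Name string
}

// Dependency is a hypothetical record of something the package needs but
// does not provide itself. Each one could map to an unmet condition.
type Dependency struct {
	Kind string // "Namespace" or "CRD"
	Name string
}

// isCustomGroup is a rough heuristic (an assumption, not a Kubernetes rule).
func isCustomGroup(group string) bool {
	return strings.Contains(group, ".") && !strings.HasSuffix(group, ".k8s.io")
}

// ImplicitDependencies inspects the package contents and returns the
// namespaces and custom resource kinds it relies on, minus any namespace
// the package creates itself.
func ImplicitDependencies(pkg []Resource) []Dependency {
	providedNamespaces := map[string]bool{}
	for _, r := range pkg {
		if r.Kind == "Namespace" {
			providedNamespaces[r.Name] = true
		}
	}

	needed := map[Dependency]bool{}
	for _, r := range pkg {
		if r.Namespace != "" && !providedNamespaces[r.Namespace] {
			needed[Dependency{Kind: "Namespace", Name: r.Namespace}] = true
		}
		if group, _, ok := strings.Cut(r.APIVersion, "/"); ok && isCustomGroup(group) {
			needed[Dependency{Kind: "CRD", Name: r.Kind + "." + group}] = true
		}
	}

	deps := make([]Dependency, 0, len(needed))
	for d := range needed {
		deps = append(deps, d)
	}
	sort.Slice(deps, func(i, j int) bool {
		if deps[i].Kind != deps[j].Kind {
			return deps[i].Kind < deps[j].Kind
		}
		return deps[i].Name < deps[j].Name
	})
	return deps
}

func main() {
	pkg := []Resource{
		{APIVersion: "apps/v1", Kind: "Deployment", Namespace: "upf", Name: "upf"},
		{APIVersion: "example.org/v1alpha1", Kind: "UPFDeployment", Namespace: "upf", Name: "upf-a"},
	}
	for _, d := range ImplicitDependencies(pkg) {
		fmt.Printf("unresolved dependency: %s %s\n", d.Kind, d.Name)
	}
}
```

How each dependency then gets resolved (already present in the cluster, added to the package, or satisfied by a separate package) is the policy question raised above and is not addressed by this sketch.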

Tasks are

Implement and unit test package dependency controller
Document readme for package dependency controller with details and build steps

Create repositories for Nephio components

Major Nephio components should have their own repositories. This issue tracks the repo creation for those components.

Create nephio-api repository
Create nephio-controllers repository
Create nephio-ipam repository
Create nephio-free5gc-operators repository (separate for each NF?)

Implement initial free5GC package for the operator

Implement initial free5gc package for the operator. This may be enhanced in future sprints. This package will install required operators on edge clusters.

Develop NF Deploy function

https://github.com/henderiw-nephio/nf-deploy-fn
Inject the IP(s) in the UPF deployment as a kpt function

Document describing how each user story is met.

Epic - Implement NF Topology controller

Implement, build and unit test the NF topology controller. The code will reside in the nephio-controllers repo.

This controller processes the NFTopology resources, emitting PackageVariant[Set] resources as well as other NF-specific resources. The NFTopology resource captures a set of network function configurations, as well as the clusters (or perhaps sites) in which those functions, with those specifications, should be deployed and operational. The NFTopology resource itself may point to other, NF-specific resources for the details of those configurations.

The key is that the NF configuration captured here is that aspect which is invariant across the sites. Variance across the sites is introduced via injection.
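The fan-out from one topology to per-cluster package variants can be pictured with the Go sketch below. The NFTopology, NFInstance and PackageVariant structs are heavily simplified, hypothetical shapes chosen for this example, not the actual Nephio or Porch types, and the naming scheme is an assumption.

```go
package main

import "fmt"

// NFInstance is a hypothetical entry in a topology: one network function,
// its site-invariant upstream package, and the clusters it should run in.
type NFInstance struct {
	Name     string
	Upstream string
	Clusters []string
}

// NFTopology is a hypothetical, simplified topology resource.
type NFTopology struct {
	Name      string
	Instances []NFInstance
}

// PackageVariant is a hypothetical, simplified output resource.
type PackageVariant struct {
	Name       string
	Upstream   string            // same for every site: the invariant configuration
	Downstream string            // target cluster (or its repository)
	Inject     map[string]string // site-specific values supplied by injection
}

// Expand emits one PackageVariant per (network function, cluster) pair.
func Expand(t NFTopology) []PackageVariant {
	var out []PackageVariant
	for _, nf := range t.Instances {
		for _, cluster := range nf.Clusters {
			out = append(out, PackageVariant{
				Name:       fmt.Sprintf("%s-%s-%s", t.Name, nf.Name, cluster),
				Upstream:   nf.Upstream,
				Downstream: cluster,
				Inject:     map[string]string{"cluster": cluster},
			})
		}
	}
	return out
}

func main() {
	topology := NFTopology{
		Name: "core",
		Instances: []NFInstance{
			{Name: "upf", Upstream: "free5gc-upf", Clusters: []string{"edge-1", "edge-2"}},
			{Name: "smf", Upstream: "free5gc-smf", Clusters: []string{"regional-1"}},
		},
	}
	for _, pv := range Expand(topology) {
		fmt.Printf("%s: upstream=%s downstream=%s\n", pv.Name, pv.Upstream, pv.Downstream)
	}
}
```

Every variant of the same NF shares one upstream, which is the invariant part; only the injected values differ from site to site.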

In the PoC, this was called the FiveGCoreTopologyController. This is a more generalized version of that, which we may reconsider if there are 5G Core semantics we want built in; however, at this point it does not seem necessary.

In the PoC, this controller used a very sparse seed package as the upstream when it created a PackageDeployment (the predecessor of PackageVariant[Set]), and the actual NF configuration was then injected by the NF Injector Controller. An alternative would be for this controller to combine the sparse seed with knowledge built into the controller to create a new package, and then use that package as the upstream in the PackageVariant[Set]. This is the approach taken in the Google seed code.

This is an umbrella issue. It comprises the following tasks:

Refactor the watcher agent code for workload clusters.

Clean up and refactor the watcher agent seed code for the workload clusters.

Gather use cases and requirements for supporting Helm charts in Nephio

Today Helm is the most common packaging tool: NF vendors use it to package CNFs, and CSPs use it to deploy them. This means both vendors and CSPs have a lot of existing investment in Helm.

This issue tracks the What (requirements) and not the How (design and implementation details). We will target only requirements gathering for the R1 time frame. The How will be discussed after the R1 release.

This issue should gather the following details.

What are the main use cases for Helm support? Is it to support existing investments only, or to support Helm going forward as well?
Which aspects of Helm need support, and with what priority?
Do customers customize the Helm charts themselves, or do they only adjust the values file?
Where does Helm sit today in overall CNF orchestration in a CSP, through their OSS/BSS systems?
Pain points with Helm
Best practices
What direction do NF vendors and CSPs intend to take on moving away from Helm?

High-level design document describing each component and their relationships.

Sequence diagram showing basic, general purpose package deployment.

Develop library for common NAD functionalities

Develop a library for CRUD operations (getters, setters, etc.) for NAD functions.

Develop library for common VLAN allocation functionalities

Develop a library for CRUD operations (getters, setters, etc.) for VLAN allocation functions.

Sequence diagram showing SMF package deployment / hydration.

Design NF deploy function

Design the NF deploy function and document it in the GitHub repo.

Document the component level design for all the major components.

For each component, a design document describing:

The binary or binaries used by the component.
The component’s input and/or northbound APIs.
The component’s output and/or southbound APIs it utilizes.
The component’s dependency upon or other relationships with other components.
The current status of the component, and what changes or gaps are needed to meet the use cases expected in R1.
The repository for the component code, and any special build requirements (hopefully most components follow a similar build).

This issue comprises the following tasks:

Document the component level design for NF Topology Controller
Document the component level design for NF Injector Controller
Document the component level design for IPAM Injector
Document the component level design for Package Dependency Controller
Document the component level design for free5gc operators
Document the component level design for watcher agent
Document the component level design for Cluster bootstrap Controller
Document details of major KPT functions

Design and document the high-level design for Nephio R1

We need to come up with the design document that describes the components, flows and interactions. The design document should also explain how the use cases are met by the proposed design.

This story comprises the following subtasks. They may be part of the same document but could be worked on in parallel.

Porch PackageVariant configuration injection feature

Implement the config injection feature, either in Porch or locally (and then upstream it to Porch).

Packages and instructions to create a running Nephio management cluster including UI, Porch, Nephio injectors and controllers, IPAM controller

This packaging of various entities may feed into the Nephio installer.

Parent issue.

[SIG2 dependency] Requirements for E2E Test Bed

SIG2 needs to clarify the requirements for the E2E test bed.

How many clusters need to be created, and of what type (for example, KIND)?
What networking setup is expected?
What workloads need to be set up in these clusters?

Create vendor neutral CRDs for Network Functions

Create vendor-neutral CRDs and Go types for network functions.
See this document for reference and details.

As part of this, we need to define CRDs for SMF, UPF and AMF.
The task list is as follows:

Create CRDs for UPF (UPF class and deployment)
Create CRDs for SMF (SMF class and deployment)
Create CRDs for AMF (AMF class and deployment)

Implement ClusterBootstrap controller

Implement and test the ClusterBootstrap controller.

Implement code and unit tests for ClusterBootstrap controller
Readme for ClusterBootstrap controller with details and build procedure

The high level description of this controller can be found in

https://docs.google.com/document/d/1wd-ht4i9YbScVicgUKcZ9_18Xysg2_3N3IuaCper14Q/edit?usp=share_link&resourcekey=0-YsVEzeyveG-otxcV0YpU5w

The PoC slides are at
https://docs.google.com/presentation/d/1Hqt-pXjRE2CH71zm_XzMfy7lvBO5Dyvq-jGFJzPor0o/edit#slide=id.g1af60b52c04_0_5

The scope of this work can be:

Cluster Provisioning (GCP only for now)
Create the Nephio GCP Cloud management cluster and the corresponding GCP Infra repo. This cluster must have:

config sync.
KCC

Create the KPT package of resources for a GKE cluster. Include the Cluster CR in the package. Push the package to the KCP infra repo.

ClusterBootstrap Controller
The clusters that are provisioned from the Nephio GCP Cloud management cluster need to be bootstrapped to prepare them for workload orchestration. At a minimum, we need to install and configure:

Config sync
And/or create the workload repository for this cluster.

The bootstrap controller will perform these operations on the workload cluster by communicating with the cluster directly using the kubeconfig file. Other infrastructure-related packages, such as CNIs and the observability stack, can then be synced via config sync in the workload cluster.

Create Example package representing a topology of three Network Functions

Example package representing a topology of those NFs (ala this one)

Sequence diagram showing AMF package deployment / hydration.

Investigate the UI changes needed for the workshop UI, and make changes

This issue is about cleaning up the existing UI code used in the workshop and making it relevant to the R1 release.
This includes:

Figure out the gaps with the existing UI
Figure out the work elements
Execute on work elements where possible

Implement NF Injector Controller

Development and unit testing of NF Injector controller. This will reside in nephio-controllers repo.

In the PoC, the upstream UPF package contained an “empty” UPFDeployment, and the NF Injector Controller populated this based on the contents of the FiveGCoreTopology resource (which it found by looking at an annotation in the PackageRevision).

This may not be needed, given that the NFTopology controller in the Google seed behaves a little differently. That is, the NFTopology controller could instead directly add these resources to a new package it creates based on the upstream package and the contents of the NFTopology resource. This is a little cleaner; in the PoC we built on top of PackageDeploymentController, so we needed to come back in later and add information from FiveGCoreTopology. Creating a new package instead makes this unnecessary.

As yet another alternative, if Porch PackageVariant controller adds the ability to specify mutator kpt functions to call during the package clone, the NFTopology controller could take advantage of that instead of creating a new package (this is effectively providing instructions on how to derive the new package from the upstream, and skipping the step of storing it).

However, the exact approach is not clear yet, so this issue will remain for now.

As part of this issue we will do the following tasks.

Implement the code and unit tests for NF Injector Controller
Document the readme describing the NF Injector controller and build procedure.

Installer for Nephio

Processes, tools, and other artifacts needed to build, test, and package the components and the overall Nephio solution. Document the dependencies and the process of packaging all components and installing Nephio (preferably with a script).

We need to decide where this should reside, perhaps in a repo such as nephio-installer or getting started.

This differs from the sandbox we create for E2E tests. The sandbox could be the entire setup, including KIND clusters (both management and workload) and the packages installed on them, networking configuration included, whereas the scope of the installer is different. The scope should be:

1. Given a k8s cluster (could be anything, anywhere) that is to act as a management cluster, set up all the required components of Nephio (possibly including Porch and kpt), and configure Porch with the repos, etc.
2. Given a k8s cluster that acts as a workload cluster, install all packages (possibly including configsync) and set up the configsync repo, etc.
3. Do step 2 in bulk to set up multiple repos.
4. Do 1, 2 and 3 above with a single command, perhaps by taking all the user inputs from a YAML file.

So the E2E test bed sandbox is highly opinionated, whereas the installer should work on any k8s cluster, with possible instructions for the networking setup for gRPC communications.

#47
Create the installer

Implement IPAM/VLAN backend

Implement and package IPAM controller based on the workshop prototype.

Develop library for common cluster-context functionalities

Develop a library for CRUD operations (getters, setters, etc.) for cluster context functions.

Create CRDs for Network Topology

Create CRDs for Network Topology
Refer to this document

Develop library for common KPT file functionalities

Implement common CRUD functionalities for conditions in packages.
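A minimal Go sketch of what such CRUD helpers might look like follows. The Condition struct is a simplified local stand-in with type, status, reason and message fields; it is not the kpt library type, and the condition names used in main are invented for illustration.

```go
package main

import "fmt"

// Condition is a simplified, local stand-in for an entry in a package's
// condition list.
type Condition struct {
	Type    string
	Status  string // "True", "False" or "Unknown"
	Reason  string
	Message string
}

// Conditions holds the condition list and offers basic CRUD helpers.
type Conditions struct {
	items []Condition
}

// Get returns the condition with the given type, if present.
func (c *Conditions) Get(condType string) (Condition, bool) {
	for _, item := range c.items {
		if item.Type == condType {
			return item, true
		}
	}
	return Condition{}, false
}

// Set updates the condition with the same type, or appends it if absent.
func (c *Conditions) Set(cond Condition) {
	for i := range c.items {
		if c.items[i].Type == cond.Type {
			c.items[i] = cond
			return
		}
	}
	c.items = append(c.items, cond)
}

// Delete removes the condition with the given type and reports whether it existed.
func (c *Conditions) Delete(condType string) bool {
	for i := range c.items {
		if c.items[i].Type == condType {
			c.items = append(c.items[:i], c.items[i+1:]...)
			return true
		}
	}
	return false
}

// Unmet lists the types of all conditions whose status is not "True".
func (c *Conditions) Unmet() []string {
	var out []string
	for _, item := range c.items {
		if item.Status != "True" {
			out = append(out, item.Type)
		}
	}
	return out
}

func main() {
	var conds Conditions
	conds.Set(Condition{Type: "ipam.upf-n3", Status: "False", Reason: "Pending"})
	conds.Set(Condition{Type: "vlan.upf-n3", Status: "False", Reason: "Pending"})
	conds.Set(Condition{Type: "ipam.upf-n3", Status: "True", Reason: "Allocated"})

	fmt.Println(conds.Unmet()) // [vlan.upf-n3]
}
```

A helper like Unmet gives controllers a single place to ask whether a package is ready for approval.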

Implement a library that will ease the development of Nephio KRM functions and increase efficiency.

As part of Nephio, we need to develop a few KRM functions for package specialization. They share some common functionality and workflows. This issue covers the development of libraries that capture those common use cases and so make the development of KRM functions more efficient.

The following tasks capture the planned work for this issue.

Implement IPAM specialiser

This controller watches for new package revisions with unmet IPAM conditions. It utilizes the IPAM request meta-data in the package to make an allocation request from the IPAM controller, and injects that back into the status of the IPAM request resource.
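The flow described above (find unmet IPAM conditions, request an allocation, write the result back into the request's status) might look roughly like the Go sketch below. The Pool, IPAMRequest and PackageRevision types and the Specialise function are hypothetical and much simpler than the real resources. The allocator is a naive sequential one standing in for the IPAM controller; it skips the network address but does not treat the broadcast address specially.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
)

// Pool is a naive sequential allocator standing in for the IPAM backend.
type Pool struct {
	prefix netip.Prefix
	next   netip.Addr
}

// NewPool parses a CIDR and starts allocating after the network address.
func NewPool(cidr string) (*Pool, error) {
	p, err := netip.ParsePrefix(cidr)
	if err != nil {
		return nil, err
	}
	p = p.Masked()
	return &Pool{prefix: p, next: p.Addr().Next()}, nil
}

// Allocate returns the next unused address in the pool.
func (p *Pool) Allocate() (netip.Addr, error) {
	if !p.next.IsValid() || !p.prefix.Contains(p.next) {
		return netip.Addr{}, errors.New("pool exhausted")
	}
	addr := p.next
	p.next = addr.Next()
	return addr, nil
}

// IPAMRequest is a hypothetical request resource carried in the package.
type IPAMRequest struct {
	Name      string
	Allocated string // status field filled in by the specialiser
}

// PackageRevision is a hypothetical, simplified package revision.
type PackageRevision struct {
	Name             string
	Requests         []*IPAMRequest
	IPAMConditionMet bool
}

// Specialise fills every pending request from the pool and then marks the
// IPAM condition as met. Revisions whose condition is already met are skipped.
func Specialise(pr *PackageRevision, pool *Pool) error {
	if pr.IPAMConditionMet {
		return nil
	}
	for _, req := range pr.Requests {
		if req.Allocated != "" {
			continue
		}
		addr, err := pool.Allocate()
		if err != nil {
			return fmt.Errorf("allocating for %s: %w", req.Name, err)
		}
		req.Allocated = addr.String()
	}
	pr.IPAMConditionMet = true
	return nil
}

func main() {
	pool, err := NewPool("10.0.0.0/29")
	if err != nil {
		panic(err)
	}
	pr := &PackageRevision{
		Name:     "upf-edge-1",
		Requests: []*IPAMRequest{{Name: "n3"}, {Name: "n4"}},
	}
	if err := Specialise(pr, pool); err != nil {
		panic(err)
	}
	for _, req := range pr.Requests {
		fmt.Printf("%s -> %s\n", req.Name, req.Allocated)
	}
}
```

If an allocation fails, the condition stays unmet, so the package would not be approved for deployment until a later attempt succeeds.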

This already exists in a different repo. We need to bring it into the Nephio repo, with clean-ups as needed.

Implement code and unit tests for IPAM Injector
Readme for IPAM Injector describing the controller and build procedure

Completed design proposal for package specialization building blocks

A high-level diagram showing the components of Nephio R1 and their basic relationships.

Develop library for common IP address allocation functionalities

Develop a library for CRUD operations (getters, setters, etc.) for IP allocation functions.

R1 planning

Hello Nephio Team,

Let's begin planning for R1. Let's use this issue to gather the various docs, slide decks, and other artifacts we are using in the planning process.

R1 Project Board will be used to capture our task list and run scrums.
R1 planning doc - this will eventually result in design docs captured in this repo along with tasks captured in the project board
R1 use case deck is a working document we are using to go through the use cases and guide our discussions around how to meet them in R1

@s3wong @henderiw @tliron please add additional resources here

Tasks for this issue are

Update the CRD proposal document with OpenAPI format
Update the CRD proposal document with reader friendly format
