googleforgames / agones
Dedicated Game Server Hosting and Scaling for Multiplayer Games on Kubernetes
Home Page: https://agones.dev
License: Apache License 2.0
Need a quickstart on creating a cluster that matches our supported requirements (1.9+, RBAC)
We could also take the command line instructions from:
https://github.com/googleprivate/agones/blob/master/build/Makefile
Istio does a nice job of this:
https://istio.io/docs/setup/kubernetes/quick-start.html
This is why we couldn't pass arguments to the controller and sidecar.
This should be switched to using the array notation (which is actually preferred)
Review the sidecar - look to convert to flags rather than environment variables.
Simplify the install process by having an install.yaml in the root of the directory that installs both the CustomResourceDefinition and the Controller.
This makes installation and uninstallation really simple through kubectl apply -f and kubectl delete -f.
Also make some documentation about installation please.
See here.
Or, we could simply add details to CONTRIBUTING.md.
When a game developer wants to test their build locally, they have to build the sidecar with Go and run it on their machine so that it listens on localhost.
Instead, we could allow the listen address to be changed when in --local mode, or listen on any address.
Developers could then just use:
docker run -p 59357:59357/tcp gcr.io/agon-images/gameservers-sidecar --local
And start testing their SDK integration.
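As a rough illustration of the idea above, the sidecar could read its listen address from flags rather than environment variables. This is only a sketch: the `--address` and `--port` flag names, the `config` type and the defaults are assumptions made for the example, not the sidecar's actual interface; only `--local` and port 59357 come from the command shown above.

```go
package main

import (
	"flag"
	"fmt"
	"net"
	"os"
	"strconv"
)

// config holds the sidecar settings that would otherwise come from
// environment variables. The field and flag names are hypothetical.
type config struct {
	local   bool
	address string
	port    int
}

// parseFlags reads the settings from command line arguments.
func parseFlags(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("sidecar", flag.ContinueOnError)
	fs.BoolVar(&c.local, "local", false, "run without a Kubernetes cluster")
	fs.StringVar(&c.address, "address", "localhost", "address to listen on (assumed flag)")
	fs.IntVar(&c.port, "port", 59357, "port to listen on (assumed flag)")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}
	return c, nil
}

// listenAddr builds the host:port string that would be passed to net.Listen.
func (c config) listenAddr() string {
	return net.JoinHostPort(c.address, strconv.Itoa(c.port))
}

func main() {
	c, err := parseFlags(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("local=%v, listening on %s\n", c.local, c.listenAddr())
}
```

With something like this in place, running the image with `--local --address 0.0.0.0` would make the published Docker port reachable from the host.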
/cc @markmandel
Currently there is no validation on a GameServer
Once Kubernetes 1.9 is available on GKE, this can be implemented in the master branch.
We will likely need a combination of CRD validation, as well as webhook validation on creation and mutation.
- static or dynamic, optional, defaults to dynamic
- static
- TCP or UDP only, optional - defaults to UDP
- health is disabled
Need to ensure that when the Cluster is grown / shrunk, this is handled gracefully.
Write an end-to-end test - possibly using make and kubectl, or maybe code - what is the best option?
It will likely need to check that a connection works with the Go sample (edited to send a message via a flag, rather than stdin), and also that the C++ SDK works.
I don't think any e2e integration test libraries exist for Kubernetes platforms - but do have a look first. It may be worth looking at how Kubernetes itself does it.
Out of the box, Agon will not build with Windows.
I expect best effort will be to force people to use WSL to do this (which I don't think is too onerous)
Even with WSL, I expect that some of these will break, especially the minikube development workflow.
This is a high level ticket for tracking these issues, and resolving them.
May need to create a $DOCKER env var that switches to docker.exe on windows
Should be relatively trivial to inspect the OS and switch out as needed.
Best option:
After creating the cluster with make minikube-test-cluster, run eval $(minikube docker-env) and work with minikube as described. We've set it up so the build image gets transferred in on minikube start, and since the VM knows its own external IP, everything should work as expected at this point.
Should the controller and associated resources for Agones run under an agones-system (or similar) namespace?
This is motivated by the fact that we'll need a Service, and need to specify the namespace for webhooks in MutatingWebhookConfiguration and ValidatingWebhookConfiguration.
We currently compile static and dynamic libraries for Linux, we should do the same for macOS.
Code is here: https://github.com/GoogleCloudPlatform/agones/tree/master/sdks/cpp
https://circleci.com/build-environments/xcode/osx/
https://docs.travis-ci.com/user/reference/osx/
The controller currently needs a liveness http check, and a restart if it fails.
So we'll need to add a /healthz HTTP handler somewhere in:
https://github.com/googleprivate/agon/blob/master/gameservers/controller/controller.go
I expect we'll need to run the HTTP server in its own goroutine, and manage how shutdown occurs in a somewhat graceful way.
And update the install.yaml (in both root and build directories) to include the new liveness check
References:
Go net/http package
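A minimal sketch of what such a handler could look like, using only the standard library. The `check` callback, `errUnhealthy` and `statusFor` are hypothetical names; what actually counts as "healthy" for the controller is left open in this issue. The demo serves on a random local port, queries itself once, then shuts down gracefully.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log"
	"net"
	"net/http"
	"time"
)

// errUnhealthy is a hypothetical error that a failing check could return.
var errUnhealthy = errors.New("controller is unhealthy")

// statusFor maps the result of a health check to an HTTP status code.
func statusFor(err error) int {
	if err != nil {
		return http.StatusInternalServerError
	}
	return http.StatusOK
}

// healthzHandler runs check on every request and reports the result.
func healthzHandler(check func() error) http.HandlerFunc {
	return func(w http.ResponseWriter, _ *http.Request) {
		err := check()
		w.WriteHeader(statusFor(err))
		if err != nil {
			fmt.Fprintln(w, err)
			return
		}
		fmt.Fprintln(w, "ok")
	}
}

func main() {
	mux := http.NewServeMux()
	mux.Handle("/healthz", healthzHandler(func() error { return nil }))
	srv := &http.Server{Handler: mux}

	// Port 0 lets the OS pick a free port so the demo can run anywhere.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		log.Fatal(err)
	}

	// Serve in its own goroutine so the controller's main loop is not blocked.
	done := make(chan struct{})
	go func() {
		defer close(done)
		if err := srv.Serve(ln); err != nil && !errors.Is(err, http.ErrServerClosed) {
			log.Printf("healthz server: %v", err)
		}
	}()

	resp, err := http.Get("http://" + ln.Addr().String() + "/healthz")
	if err != nil {
		log.Fatal(err)
	}
	resp.Body.Close()
	fmt.Println("GET /healthz ->", resp.StatusCode)

	// Graceful shutdown: stop accepting connections, wait for in-flight ones.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		log.Printf("healthz shutdown: %v", err)
	}
	<-done
}
```

A Kubernetes livenessProbe pointed at this path would then restart the container once the endpoint starts returning 500.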
We so rarely call kubectl directly, remove it from the build-image/Dockerfile
Then go through the Makefile and switch to the standard CMD docker run pattern instead.
Create an example using Xonotic so that we can see a real game playing via Agon.
Instead of changing the source code, let's cheat slightly and create a Go binary that calls Ready() on the SDK, and execute that in a bash script before starting the Xonotic server.
Need a make command that:
- base_version (i.e. 0.1)
- gcr.io/agones-images
There's a new version. Probably worth upgrading.
I'm not a Go expert, but I'd like to understand why controller.go is in the pkg folder. From what I've read, this folder is for your public packages?
Thanks in advance !
Looking at the big install.yaml file and all the configurations possible, I think it would help people to provide a way of packaging YAML files.
Possible variables:
The default variable should reflect the current install.yaml.
We should also update the documentation to add an installation step using Helm on top of what we already have.
I don't think it's urgent for 0.1 but interesting to have.
/cc @rodcloutier
The constructor for Controller in controller.go is becoming unwieldy and has the potential to be error-prone as more parameters are added.
Go makes use of functional parameters in a similar way that some other languages use Builders. We should consider doing so as well to help readability as more developers start working on the project.
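The pattern referred to here is commonly called "functional options". A minimal sketch follows; the `Controller` fields, the option names and the default values are all hypothetical stand-ins, not the real constructor's parameters.

```go
package main

import "fmt"

// Controller is a stand-in for the real controller; its fields are
// hypothetical and exist only to demonstrate the pattern.
type Controller struct {
	sidecarImage      string
	alwaysPullSidecar bool
	minPort, maxPort  int32
}

// Option mutates a Controller during construction.
type Option func(*Controller)

// WithSidecarImage overrides the default sidecar image.
func WithSidecarImage(image string) Option {
	return func(c *Controller) { c.sidecarImage = image }
}

// WithAlwaysPullSidecar toggles always pulling the sidecar image.
func WithAlwaysPullSidecar(pull bool) Option {
	return func(c *Controller) { c.alwaysPullSidecar = pull }
}

// WithPortRange sets the range of host ports the controller may use.
func WithPortRange(min, max int32) Option {
	return func(c *Controller) { c.minPort, c.maxPort = min, max }
}

// NewController applies defaults first, then each option in order.
func NewController(opts ...Option) *Controller {
	c := &Controller{
		sidecarImage: "example/sidecar:latest", // assumed default
		minPort:      7000,                     // assumed default
		maxPort:      8000,                     // assumed default
	}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

func main() {
	c := NewController(
		WithSidecarImage("example/sidecar:dev"),
		WithPortRange(7000, 7100),
	)
	fmt.Printf("%+v\n", *c)
}
```

New parameters can then be added as new options without changing existing call sites, and each call site names the settings it changes.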
We compile SDK binaries for linux, windows and macOS - we should add archiving them in google cloud storage into the cloudbuild.yaml file.
Go just released version 1.9.3
https://golang.org/doc/devel/release.html#go1.9.minor
Nothing too major, but would be good to upgrade at some point.
Game production usually targets Windows first and is then ported to Linux towards the end of development. Having Windows support would help the adoption rate.
What does it take to run a Windows game server?
Testing will be difficult, as Windows support has been in beta since k8s 1.5, but it has apparently greatly improved in 1.9.
Documentation:
https://kubernetes.io/docs/getting-started-guides/windows/
http://blog.kubernetes.io/2017/09/windows-networking-at-parity-with-linux.html
https://github.com/kubernetes/community/tree/master/sig-windows
https://docs.microsoft.com/en-us/windows-server/get-started/whats-new-in-windows-server-1709
For the sidecar binary, generate Mac and Windows binaries for local development
This should include:
- make build target to include the new target
Fleets are a group of warm servers that are available to be allocated to players when needed.
In Kubernetes parlance, they are the Deployment/ReplicaSet to Pods, but for GameServers.
- Fleet, with an attached GameServerTemplate (much like a PodTemplate)
- replicas number of Healthy GameServers available (assuming resources exist)
- GameServer becomes Unhealthy, then delete it and create it anew (we may add more options at a later date).
- Fleet, this also deletes the backing GameServers (this should be by default in Kubernetes now anyway)
- allocated GameServer out of the pool.
  - GameServer is moved to an Allocated state on allocation.
- replicas are increased in the Fleet, the number of GameServers is increased to match that number (assuming resources)
- replicas are decreased in the Fleet, the number of GameServers is decreased to match that number.
  - GameServers that are in an Allocated state will never be deleted during the decrease
- GameServer template is changed, then we will mimic a Deployment in that we can do either a Recreate or a RollingUpdate to switch out the waiting warm servers.
apiVersion: "stable.agon.io/v1alpha1"
kind: Fleet
metadata:
  name: "fleet-example"
spec:
  # number of GameServers
  replicas: 10
  # deployment strategy for updating the image
  # Lifted directly from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.9/#deploymentstrategy-v1-apps
  strategy:
    # Recreate or RollingUpdate. Default to RollingUpdate
    type: RollingUpdate
    rollingUpdateDeployment: # optional rolling update config
      maxSurge: "25%"
      maxUnavailable: "25%"
  # A GameServer template
  template:
    # Standard ObjectMeta
    metadata:
      labels:
        mylabel: myvalue
    # GameServer spec
    spec:
      containerPort: 7654
      template:
        spec:
          containers:
          - name: cpp-simple
            image: gcr.io/agon-images/cpp-simple-server:0.1
Allocation is done through creating a FleetAllocation record via kubectl or the API.
For example:
apiVersion: "stable.agon.io/v1alpha1"
kind: FleetAllocation
metadata:
  name: "sample-allocation"
spec:
  fleetName: "fleet-example"
The returned value from creating a FleetAllocation has the details of the allocated server (and moves the GameServer to the state Allocated).
For example:
apiVersion: "stable.agon.io/v1alpha1"
kind: FleetAllocation
metadata:
  name: "sample-allocation"
spec:
  fleetName: "fleet-example"
status:
  gameserver:
    metadata:
      name: "allocated-game-server"
    spec:
      containerPort: 7654
      template:
        spec:
          containers:
          - name: cpp-simple
            image: gcr.io/agones-images/cpp-simple-server:0.1
    status:
      address: 192.168.99.100
      nodeName: agones
      port: 7373
      state: Allocated
- (GameServerSet to Fleet as ReplicaSet to Deployment) (#156)
- Fleet creates a GameServerSet (#174)
- Allocation from a fleet (#193)
- Recreate update strategy (#199)
- Rolling update strategy
When portPolicy is set to dynamic then the controller should select the hostPort for the GameServer container when the GameServer is created.
My current theory is to have a PortSelector as part of gameservers/controller (so they can share cache/informers) that:
- []PortSelections, where PortSelection is a map[int32]bool where the first value is the port number and bool is whether it is taken or not, with one entry for each node. We don't actually care about tracking the nodes, as the K8s scheduler will reroute pods that already have a hostPort taken.
- true and then set it to true (With appropriate locking). Question will be where to track the Pod deletion.
Extras:
- dynamic portPolicy
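The PortSelector theory above could be sketched roughly as follows. This is an illustration under assumptions, not the controller's implementation: the constructor, the `Allocate` and `Release` method names, the explicit port range and the "first free port wins" policy are all made up for the example. Only the `map[int32]bool` per node and the need for locking come from the description.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// portSelection maps a port number to whether it is taken, as described
// in the issue. There is one portSelection per node.
type portSelection map[int32]bool

// errNoPorts is returned when every port on every node is taken.
var errNoPorts = errors.New("no free ports available")

// PortSelector hands out host ports. Node identity is deliberately not
// tracked; the slice only needs one entry per node.
type PortSelector struct {
	mu       sync.Mutex
	nodes    []portSelection
	min, max int32
}

// NewPortSelector creates one empty selection per node for [min, max].
func NewPortSelector(nodeCount int, min, max int32) *PortSelector {
	ps := &PortSelector{min: min, max: max}
	for i := 0; i < nodeCount; i++ {
		sel := portSelection{}
		for p := min; p <= max; p++ {
			sel[p] = false
		}
		ps.nodes = append(ps.nodes, sel)
	}
	return ps
}

// Allocate finds the first port that is not taken and marks it as taken.
func (ps *PortSelector) Allocate() (int32, error) {
	ps.mu.Lock()
	defer ps.mu.Unlock()
	for _, sel := range ps.nodes {
		for p := ps.min; p <= ps.max; p++ {
			if !sel[p] {
				sel[p] = true
				return p, nil
			}
		}
	}
	return 0, errNoPorts
}

// Release frees one taken entry for port, e.g. after a Pod is deleted.
func (ps *PortSelector) Release(port int32) {
	ps.mu.Lock()
	defer ps.mu.Unlock()
	for _, sel := range ps.nodes {
		if sel[port] {
			sel[port] = false
			return
		}
	}
}

func main() {
	ps := NewPortSelector(2, 7000, 7001)
	for i := 0; i < 5; i++ {
		port, err := ps.Allocate()
		fmt.Println(port, err)
	}
}
```

Where `Release` gets called from is exactly the open question about tracking Pod deletion; the sketch does not answer it.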
Since we'll have multiple controllers because of #70, we should move the HTTP health check out of the controller into somewhere central.
It may also make sense to have some kind of ability for a controller to respond to a health check (a controller registry perhaps?) for the /healthz endpoint.
There should be documentation on writing a gameserver, this should include:
Need to generate the C++ SDK from gRPC and provide the wrapper for it.
First discussion on this states that only the source code needs to be provided, and not any compiled .dlls.
The contributing documentation is missing a link.
Which link did you want to use? This one: https://chris.beams.io/posts/git-commit/ ?
Make the version tag of the agon-build image the hash of the Dockerfile. Right now it is 0.1, and if the Dockerfile ever changes, the image has to be manually rebuilt.
With the hash of the Dockerfile, if the Dockerfile ever changes, then it will automatically rebuild on each invocation.
This will also lend itself nicely to image caching when doing CI/CD
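To make the idea concrete, here is one way a content-derived tag could be computed. The choice of SHA-256 and of a 10-character prefix are assumptions for illustration; in the build itself this would more likely be a shell one-liner in the Makefile than a Go program.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
)

// imageTag derives a tag from the Dockerfile contents: the first 10 hex
// characters of its SHA-256 digest. Hash and length are assumed choices.
func imageTag(dockerfile []byte) string {
	sum := sha256.Sum256(dockerfile)
	return hex.EncodeToString(sum[:])[:10]
}

func main() {
	// Sample contents are used when no path is given, so the demo runs anywhere.
	data := []byte("FROM golang:1.9\n")
	if len(os.Args) > 1 {
		var err error
		if data, err = os.ReadFile(os.Args[1]); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
	}
	fmt.Printf("agon-build:%s\n", imageTag(data))
}
```

Because the tag changes only when the Dockerfile changes, a check for "does an image with this tag exist?" doubles as a "does it need rebuilding?" check.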
Need a system that does continuous integration.
This has been built with Cloud Builder + Cloud Functions. This means to view the CI, people will need to be given IAM access to the console to view cloud builder and the artifacts.
The code for the CI system hooks/glue code has been left closed source, but could be open sourced in the future. The cloudbuild.yaml is in the root of the Agon repository, so it can be edited as needed.
Long term, the cloud builder data can be made public facing (build a public facing version of the cloud builder output) - should be relatively trivial with another HTTP request cloud function.
This should work, now that the build process has been tested on Windows.
This is a high level ticket for discussing and designing solutions for these issues, and resolving them.
To test follow the guide here:
https://github.com/googleprivate/agones/blob/master/build/README.md
Depends on:
- Version into a single constant #2
- gcloud docker --authorize make target and push targets #5
Make the Version Makefile variable a number value (0.1-) + the first 7(?) characters of the Git hash for the deployed version.
This should be passed into the Go compilation as ldflags -X to overwrite its value.
This should also be used as the tag to push up to the docker registry, so there is always a specific version being run.
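A small sketch of the ldflags approach described above. The variable name `version`, its `"dev"` fallback and the `buildVersion` helper are hypothetical; the helper only illustrates the proposed "base + first 7 characters of the Git hash" format, which in practice the Makefile would compute.

```go
package main

import "fmt"

// version is a placeholder that the linker overwrites at build time, e.g.:
//
//	go build -ldflags "-X main.version=0.1-$(git rev-parse --short=7 HEAD)"
//
// It must be a package-level string variable (not a constant) for -X to work.
var version = "dev"

// buildVersion shows the proposed format: a base number, a dash, and the
// first 7 characters of the Git hash.
func buildVersion(base, gitHash string) string {
	if len(gitHash) > 7 {
		gitHash = gitHash[:7]
	}
	return base + "-" + gitHash
}

func main() {
	fmt.Println("version:", version)
	fmt.Println("example:", buildVersion("0.1", "0123456789abcdef"))
}
```

The same string can then be reused as the Docker image tag, so the binary and the image always report the same version.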
The build/Makefile will need an install target that will need some kind of templating to push this through to the install.yaml and kubectl apply - sed may be a simple and easy first step.
Optional: Make the imagePullPolicy dependent on an extra argument, such as make install DEV=true or something similar? The developer experience should be considered here. Not sure of the best approach for that.
The sidecar needs a health check, and a restart if it fails.
This would likely be an HTTP health check, so the sidecar will need to run the HTTP server in its own goroutine.
Controller version was implemented in #34 - check there for reference.
See: https://github.com/googleprivate/agon/blob/master/Gopkg.toml#L27-L33
Switch to using all release branches (it is just client-go that needs to change). Technically, using a mixture of tags and branches is not supported (although it does seem to work).
Also, code-generator now has vendored dependencies - we can fix the Dockerfile as well and not run codegen from HEAD, e.g. https://github.com/kubernetes/code-generator/tree/release-1.8
For more context, see PR: kubernetes/client-go#337
Windows 1709 supports LCOW (linux container on Windows) and sharing the pod with linux and Windows containers side-by-side. Containers also share the same network namespace.
This would allow deploying Agones on a Kubernetes Windows node with the main game server in a Windows container.
However, this requires Hyper-V with nested virtualization. This makes deployment options more complicated, either on-prem or on external providers. Some compute providers support 1709 Windows images, but it's not clear which ones support nested virtualization.
The Agones sidecar doesn't seem to make many assumptions about running on a Linux node.
Ideally, the sidecar should be cross-platform and also have a Windows build version, using either microsoft/windowsservercore:1709 or microsoft/nanoserver:1709.
One limitation right now is that service account tokens do not work well on Windows containers, so this needs to be resolved to have the token injected correctly by kubelet:
This issue is related to the Windows support discussed in #54
Unless otherwise specified, the GameServer container shouldn't restart by default.
Our current support for low dependency, cross compilation for the C++ SDK is very poor. Things work on Linux, because of make and not much else. We also have no tests.
Right now, the C++ SDK is built on top of gRPC, which may be adding too many dependencies to be used in a valuable way across platforms?
Code is here: https://github.com/GoogleCloudPlatform/agones/tree/master/sdks/cpp
Caveat: This is written by @markmandel who has little to no idea about C++ and its ecosystem, so direction on the above is also appreciated.
https://github.com/grpc/grpc/blob/master/src/cpp/README.md#make
https://www.appveyor.com
https://circleci.com/build-environments/xcode/osx/
https://docs.travis-ci.com/user/reference/osx/
I don't know if this is possible, but what would determine if this controller is healthy is that the goroutines generated in this code are still running and haven't exited in any way.
At some point in the next few releases of Kubernetes, the default of allowing all Pods full access to the entire cluster will go away, so we should switch to RBAC in the near future.
Also, this is better for security.
apiVersion: "stable.agon.io/v1alpha1"
kind: GameServer
metadata:
  name: "simple-udp"
spec:
  portPolicy: "static"
  containerPort: 7654
  hostPort: 7777
  # new health section
  health:
    # defaults to false, but can be set to true
    disabled: false
    # If the `Health()` function doesn't get called at least once every timeout seconds, then
    # the game server is not healthy. Defaults to "5"
    periodSeconds: 3
    # Minimum consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. Minimum value is 1.
    failureThreshold: 3
    # Number of seconds after the container has started before health check is initiated. Defaults to 5 seconds
    initialDelaySeconds: 5
  template:
    spec:
      containers:
      - name: simple-udp
        image: gcr.io/agon-images/udp-server:0.1
SDK.Health()
The Health() function on the SDK object will need to be called regularly, below the timeout threshold time, to be considered healthy.
- Ready then, it should restart as per the restartPolicy (which defaults to "Always")
- Ready state, then it doesn't restart, but moves the GameServer to an Unhealthy state - and then it's up to the managing code to determine what to do at that point.
Health is a unidirectional stream from the gameserver client -> the sidecar. The sidecar will update the State to UnHealthy if it doesn't receive a healthcheck event within the allotted time.
(Theory - need investigation)
- gshealthz url endpoint. It will track Health() messages and if they drop below the set threshold, return a 500.
- Ready, then this will always return 200 - which (in theory) should mean that Kubernetes will never restart the GameServer container.
There are several instances of a Version around the go code. It would be good to refactor this into a single constant that is shared across each binary (controller, sidecar).
There is a mechanism to record major events for days structures.
We should record explanatory events for each state change for a GameServer.
You can see examples of the API for recording events in the sample-controller repository:
https://sourcegraph.com/github.com/kubernetes/sample-controller@46b5d73382781350b6fbea86410615fd03792059/-/blob/controller.go#L84:2$references
For a nicer experience on GCP, write a make target that mounts the appropriate docker config files for gcloud docker --authorize-only, and then we can add a series of push commands for the controller and sidecar images that use the standard docker push commands.
This means you can do a make gcloud-docker-auth build push and get all the new versions up on the repository.
Top level bug: #47
markmandel@DESKTOP-BDM5UCP:/c/Users/Mark/Documents/workspace/agon/build$ make gcloud-auth-docker
mkdir -p /c/Users/Mark/Documents/workspace/agon/build//.kube
mkdir -p /c/Users/Mark/Documents/workspace/agon/build//.config/gcloud
sudo rm -rf /tmp/gcloud-auth-docker
mkdir -p /tmp/gcloud-auth-docker
cp ~/.dockercfg /tmp/gcloud-auth-docker
cp: cannot stat '/home/markmandel/.dockercfg': No such file or directory
Makefile:222: recipe for target 'gcloud-auth-docker' failed
make: [gcloud-auth-docker] Error 1 (ignored)
docker run --rm -v /c/Users/Mark/Documents/workspace/agon/build//.config/gcloud:/root/.config/gcloud -v ~/.kube:/root/.kube -v /c/Users/Mark/Documents/workspace/agon:/go/src/github.com/agonio/agon -v /tmp/gcloud-auth-docker:/root --entrypoint="gcloud" agon-build:6c2ef6cd74 docker --authorize-only
Short-lived access for ['gcr.io', 'us.gcr.io', 'eu.gcr.io', 'asia.gcr.io', 'l.gcr.io', 'launcher.gcr.io', 'us-mirror.gcr.io', 'eu-mirror.gcr.io', 'asia-mirror.gcr.io', 'mirror.gcr.io', 'k8s.gcr.io'] configured.
sudo mv /tmp/gcloud-auth-docker/.dockercfg ~/
mv: cannot stat '/tmp/gcloud-auth-docker/.dockercfg': No such file or directory
Makefile:222: recipe for target 'gcloud-auth-docker' failed
make: *** [gcloud-auth-docker] Error 1
/cc @Kuqd
Unreal Engine has an Online Subsystem that provides an abstraction layer for game developers to build sessions and matchmaking in a standardized way.
It would be awesome if there was a plugin for Agones that added subsystem integration so it can easily be used in Unreal.
Currently the controller doesn't shut down gracefully on SIGTERM.
It should do that.
There is https://github.com/googleprivate/agon/blob/master/gameservers/controller/main.go#L90
But it doesn't seem to work?
It would be nice if the SchemeGroupVersion on this line:
https://github.com/googleprivate/agon/blob/master/gameservers/controller/controller.go#L247
was a var at the top of the controller. That saves it being recreated on each run through.