Comments (25)
@mperham Definitely, a Docker image would enable Faktory to be used everywhere, be it Kubernetes, Mesos (Marathon/Aurora), Nomad, etc.
from faktory.
This could be a starting point.
Couple of things to note:
- I don't know if I would consider it production-ready, but it's a stateful set of 1 Faktory server (embedded Redis). A Redis gateway would be a much better idea for production usage (redundancy and failover being proper concerns).
- It's configured for Google Cloud's Kubernetes offering (GKE).
- This definition expects and creates resources within a faktory namespace.
- It assumes a persistent disk in us-east1-b is ok (change this to your cluster's zone if necessary).
- The container limits are super low, so you'll want to tweak those: 300m CPU, 1Gi memory.
- It also depends on a secret being created for the Faktory password. You can create it with this:

kubectl --namespace=faktory create secret generic faktory --from-literal=password=yoursecurepassword
apiVersion: v1
kind: Namespace
metadata:
  name: faktory
---
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  namespace: faktory
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  zone: us-east1-b
---
apiVersion: v1
kind: Service
metadata:
  namespace: faktory
  labels:
    name: faktory
  name: faktory
spec:
  ports:
  - name: faktory
    protocol: TCP
    port: 7419
    targetPort: 7419
  selector:
    app: faktory
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: faktory
  name: faktory-conf
data:
  cron.toml: |
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  namespace: faktory
  name: faktory
spec:
  serviceName: faktory
  replicas: 1
  template:
    metadata:
      namespace: faktory
      labels:
        app: faktory
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: faktory
        image: contribsys/faktory:0.9.6
        command:
        - /faktory
        - -b
        - :7419
        - -w
        - :7420
        - -e
        - production
        resources:
          requests:
            cpu: 300m
            memory: 1Gi
        env:
        - name: FAKTORY_PASSWORD
          valueFrom:
            secretKeyRef:
              name: faktory
              key: password
        ports:
        - containerPort: 7419
          name: faktory
        volumeMounts:
        - name: faktory-data
          mountPath: /var/lib/faktory/db
        - name: faktory-conf
          mountPath: /etc/faktory/conf.d
      volumes:
      - name: faktory-conf
        configMap:
          name: faktory-conf
          items:
          - key: cron.toml
            path: cron.toml
  volumeClaimTemplates:
  - metadata:
      name: faktory-data
      namespace: faktory
      annotations:
        volume.beta.kubernetes.io/storage-class: fast
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
A helm chart might be the right call for a more configurable and standardized deployment.
from faktory.
Here are the resources I used to get the Faktory server fully set up. I hope that this can help someone else in the future, as @jbielick's example deployment YAML was a great help to me.
A note about this configuration: we are using Datadog deployed as a DaemonSet in Kubernetes, so this setup will allow you to use the "pro metrics" statsd implementation. The references to DD_TRACE_AGENT_HOSTNAME are how we access the Datadog agent.
Configs created:
kubectl create secret generic faktory --from-literal=password=${your_password} --from-literal=username=${your_username} --from-literal=license=${your_license}
kubectl create configmap faktory-config-merged --from-file=cron.toml=path/to/file/cron.toml --from-file=statsd.toml=path/to/file/statsd.toml -o yaml --dry-run=true | kubectl apply -f -
Server Deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: faktory
  labels:
    run: faktory
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 100%
  template:
    metadata:
      labels:
        run: faktory
        name: faktory
      annotations:
        key: value
    spec:
      containers:
      - name: faktory
        image: 'location of pro dockerfile'
        command: ["/faktory"]
        # must use "production" flag for cron
        args: ["-w", ":7420", "-b", ":7419", "-e", "production"]
        ports:
        - containerPort: 7419
        - containerPort: 7420
        resources:
          requests:
            cpu: 300m
            memory: 1Gi
        env:
        - name: FAKTORY_PASSWORD
          valueFrom:
            secretKeyRef:
              name: faktory
              key: password
        - name: FAKTORY_USERNAME
          valueFrom:
            secretKeyRef:
              name: faktory
              key: username
        - name: FAKTORY_LICENSE
          valueFrom:
            secretKeyRef:
              name: faktory
              key: license
        - name: DD_TRACE_AGENT_HOSTNAME
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        volumeMounts:
        - name: faktory-data
          mountPath: /var/lib/faktory/db
        - name: shared
          mountPath: /etc/faktory/conf.d
        - name: faktory-config-merged
          mountPath: /merged/
      initContainers:
      - name: config-data
        image: busybox
        command:
        - sh
        - -c
        - cp -a /merged/. /shared/ && sed -i 's/localhost/'"$DD_TRACE_AGENT_HOSTNAME"'/g' /shared/statsd.toml && ln -sf /merged/cron.toml /shared/cron.toml
        env:
        - name: DD_TRACE_AGENT_HOSTNAME
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        volumeMounts:
        - name: faktory-config-merged
          mountPath: /merged/
        - name: shared
          mountPath: /shared/
      volumes:
      - name: faktory-config-merged
        configMap:
          name: faktory-config-merged
          items:
          - key: cron.toml
            path: cron.toml
          - key: statsd.toml
            path: statsd.toml
      - name: shared
        emptyDir: {}
      - name: faktory-data
        persistentVolumeClaim:
          claimName: faktory-data
---
kind: Service
apiVersion: v1
metadata:
  name: faktory
  labels:
    run: faktory
spec:
  selector:
    run: faktory
  ports:
  - name: network
    protocol: TCP
    port: 7419
    targetPort: 7419
  - name: webui
    protocol: TCP
    port: 80
    targetPort: 7420
Persistent Volume Claim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: faktory-data
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
This is a script for sending the HUP signal to the Faktory server to reset the cron when new jobs are added:
#!/bin/bash
# This script reloads the configuration of the Faktory job server deployment
# by sending the HUP signal to the Faktory docker container
# https://github.com/contribsys/faktory/wiki/Pro-Cron
echo "Updating Faktory cron config and reloading Faktory server config"
PROJECT_DIR="$( cd "$( dirname "$0" )" && pwd )"
kubectl create configmap faktory-cron-config --from-file=cron.toml=${PROJECT_DIR}/path/to/file/cron.toml -o yaml --dry-run | kubectl apply -f -
POD=`kubectl get --no-headers=true pods -l name=faktory -o custom-columns=:metadata.name`
if [[ -z "$POD" ]] ; then
echo "No Faktory Pods Found"
exit 1
fi
echo "Faktory pod: ${POD}"
kubectl exec -it ${POD} -c=faktory -- /bin/kill -HUP 1
echo "Done"
exit 0
I hope this helps someone else with their setup!
from faktory.
The newer implementations of the Horizontal Pod Autoscaler (HPA 1.6+?) support scaling on Custom Metrics (k8s 1.8+ custom metrics), for which there are some preliminary implementations. I think one interesting one is the Prometheus Adapter: if I understand correctly, k8s can pull metrics from Prometheus, and then an HPA can be set to scale based on a metric in that set. Perhaps in this world the responsibility of Faktory would be to have a Prometheus exporter (à la oliver006/redis_exporter) that can send metrics to Prometheus.
Since an exporter might live separately, I believe at a fundamental level the basic necessity would be Faktory's API for exposing processing metrics. An obvious metric for scaling might be the size of the queues, but I have also found myself interested in the amount of time a job spends in the queue before processing as well, because I believe that's also an intelligent indicator of a need to scale up workers.
Are there current ideas / plans for what internal metrics will be gathered / recorded?
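For illustration, the HPA wiring described above might look roughly like the sketch below. The metric name faktory_queue_latency_seconds is hypothetical (nothing Faktory exposes today), and it assumes a prometheus-adapter deployment serving the custom metrics API on a k8s 1.8-era cluster:

```yaml
# Sketch only: scale a worker Deployment on a hypothetical queue-latency
# metric exposed to the custom metrics API via prometheus-adapter.
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: faktory-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: faktory-worker
  minReplicas: 1
  maxReplicas: 20
  metrics:
  - type: Object
    object:
      target:
        kind: Service
        name: faktory            # the Service the metric is attached to
      metricName: faktory_queue_latency_seconds   # hypothetical metric
      targetValue: 30            # scale up when jobs wait > 30s
```
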
from faktory.
@mperham I was going to open an issue regarding Kubernetes support, and I discovered you have one already open!
Specifically, I'd like to see examples of Kubernetes YAMLs that people could copy, paste, kubectl apply, and have Faktory running.
This would be especially cool to have for zero-downtime deploys of Faktory, as well as for the Redis Gateway and replicated Faktory Pro.
from faktory.
I updated my deployment manifest above with a working version that will accept the HUP signal and update the cron schedule from a k8s ConfigMap. Everything seems to be working well in production now.
from faktory.
I realized that I also could have posted our worker deployment YAML. This is just a basic implementation, and I've removed company-specific info.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: faktory-worker
  labels:
    run: faktory-worker
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2
      maxUnavailable: 50%
  template:
    metadata:
      labels:
        run: faktory-worker
    spec:
      nodeSelector:
        cloud.google.com/gke-nodepool: task-pool
      volumes:
      - name: cloudsql
        emptyDir: {}
      containers:
      - name: cloudsql-proxy
        image: 'gcr.io/cloudsql-docker/gce-proxy:1.10'
        command:
        - '/cloud_sql_proxy'
        - '--dir=/cloudsql'
        - '-instances=instance-id'
        volumeMounts:
        - name: cloudsql
          mountPath: /cloudsql
        lifecycle:
          preStop:
            exec:
              command: ['/bin/sh', '-c', '/bin/sleep 30']
      - name: faktory-worker
        imagePullPolicy: Always
        image: 'our repo docker image'
        command: ['node', 'dist/src/jobs/faktory/index.js']
        volumeMounts:
        - name: cloudsql
          mountPath: /cloudsql
        env:
        - name: FAKTORY_PASSWORD
          valueFrom:
            secretKeyRef:
              name: faktory
              key: password
        - name: FAKTORY_USERNAME
          valueFrom:
            secretKeyRef:
              name: faktory
              key: username
        - name: DB_SOCKETPATH
          value: /cloudsql/socket-path
        - name: FAKTORY_URL
          value: "tcp://:$(FAKTORY_PASSWORD)@faktory:7419"
        - name: DEBUG
          value: faktory*
        - name: DD_SERVICE_NAME
          value: faktory-worker
        - name: DD_TRACE_AGENT_HOSTNAME
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        envFrom:
        - secretRef:
            name: #### company secrets
        - configMapRef:
            name: ### company config
from faktory.
Yeah, queue size and latency are easily implementable.
What's the best approach to getting Kubernetes aware of Faktory? Are people using Helm? Would a Docker image or DEB/RPM binaries be most useful?
from faktory.
We run Google Container Engine a.k.a. GKE (managed Kubernetes) for all our services, which include sidekiq workers. We have a situation where we may have high flash traffic (pdf rendering), for which we are going to use Google Cloud Functions a.k.a. GCF so we don't worry about scale. So sidekiq handles typical jobs, and GCF handles high-scale/flash-traffic jobs like rendering.
With the introduction of faktory, I think one approach/architecture for pain-free scaling:
- kubernetes service + deployment for faktory
- kubernetes deployment for ruby worker (migrate sidekiq jobs)
- google cloud functions (GCP's answer for serverless) for something like a faktory_worker_node_gcf worker (no scale/cluster management needed)
Faktory needs proper probes:
- readiness: ready for traffic
- liveness: lightest ping possible
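A minimal sketch of what those probes could look like on the server container, using the ports from the manifests in this thread. These are plain TCP checks, an assumption on my part; Faktory doesn't document a dedicated health endpoint here:

```yaml
# Sketch only: neither check is an official Faktory health endpoint.
livenessProbe:
  tcpSocket:
    port: 7419          # lightest possible ping: can we open the command port?
  initialDelaySeconds: 10
  periodSeconds: 15
readinessProbe:
  tcpSocket:
    port: 7420          # ready for traffic once the web UI is listening
  initialDelaySeconds: 5
  periodSeconds: 10
```
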
Preparing for kubernetes:
- some people use helm, but I haven't found it very heavily trafficked. Nonetheless, a reference helm config helps others develop their own configs (and can be used as a quickstart)
- many are using spinnaker for continuous deployment (e.g. based on vanilla kubernetes yaml files)
- an optimized docker image would be most useful, e.g. redis-alpine
- persistent storage of faktory itself, and ensuring recovery/availability, will be the most important concern for most.
Last but not least, consider an equivalent service to redislabs.com. We use GKE because we want to offload as much management as possible to focus on app development. For that reason, we didn't even try to deploy redis inside our cluster, but we used redislabs.com to deploy in the same region as our GKE cluster. Feel free to ping me via email if you want to dig into any of these areas.
from faktory.
@rosskevin Thanks for the info, that's great detail. Running a Faktory SaaS is not in my immediate plans but like antirez and Redislabs, I've already had pings from people interested in partnering in building something and I'm always willing to chat: mike @ contribsys.com
I'm still trying to wrap my head around what an enterprise-ready Faktory would look like and require; this helps a lot.
from faktory.
@rosskevin @mperham a few notes to add about k8s support; let me know what you think and if I'm getting anything wrong.
- faktory is the master process that also holds the rocksdb database. Remember that k8s might kill the faktory pod at will, so there should be a service with both web and worker ports open, connected to the faktory-master deployment.
- the deployment itself should have a persistent volume, so that the actual database file will be mounted on that volume; otherwise you'd lose all jobs waiting to be fulfilled.
- this looks okay (Line 52 in 01dadde)
- I am a heavy helm user; it's pretty easy to set up a basic chart that will take care of setting up the faktory server, but workers might need to be a different chart, so upgrading the worker chart won't require a redeploy of the faktory server.
let me know if you need any help with PRs around these areas ...
from faktory.
I will look into configuration & deployment for OpenShift ... similar to Kubernetes ...
from faktory.
Gonna close this because I'm not a fan of nebulous, open-ended issues. Please open a specific issue if there's something Faktory can do to make our k8s support better.
from faktory.
Here's a helm chart helm/charts#13974
from faktory.
@dm3ch I saw that the incubator PR was closed. Did you end up adding your chart / repo to the helm hub? I couldn't find it. Would you be opposed to me using some of the files from your PR to make a chart and publish?
@ecdemis123 this is great. I assume this is a production setup? Glad you figured out the Datadog DaemonSet connection. I'll definitely be referencing that at some point to get it working correctly in one of our clusters. That initContainer is a clever workaround for interpolating the ENV var in the config.
I think I'll start on a helm chart, which could automate sending the SIGHUP (is it not a USR1 signal?) upon changes to a cron config. I'll post here if I have some progress to show. Is baking in some Datadog support a good idea? I know Sidekiq does this, and I could borrow @ecdemis123's solution there.
from faktory.
Cheers! Yep, this is our production setup; our staging setup is pretty much the same, except that it's running Faktory in development mode.
I'll be interested to see the helm chart. We aren't currently using helm but we hope to move onto it someday.
from faktory.
Actually, that HUP command is not working to restart the cron. I suspect that it's due to kubernetes/kubernetes#50345, since I'm using subPath to mount the conf.d files. Gonna dig more into it and see if I can come up with a workaround.
from faktory.
@ecdemis123 You might consider writing the ConfigMap like so:
apiVersion: v1
kind: ConfigMap
metadata:
  name: faktory
data:
  cron.toml: |
    [[cron]]
    schedule = "*/5 * * * *"
    [cron.job]
    type = "FiveJob"
    queue = "critical"
    [cron.job.custom]
    foo = "bar"

    [[cron]]
    schedule = "12 * * * *"
    [cron.job]
    type = "HourlyReport"
    retry = 3

    [[cron]]
    schedule = "* * * * *"
    [cron.job]
    type = "EveryMinute"
  faktory.toml: ""
  test.toml: ""
Where each key is a file, the value its string contents.
I found this pretty convenient when mounting to the pod:
# ...
        volumeMounts:
        - name: faktory-configs
          mountPath: /etc/faktory/conf.d
# ...
      volumes:
      - name: faktory-configs
        configMap:
          name: faktory
› kubectl exec faktory-0 -- ls -al /etc/faktory/conf.d
total 12
drwxrwxrwx 3 root root 4096 Sep 14 23:21 .
drwxr-xr-x 1 root root 4096 Sep 14 23:14 ..
drwxr-xr-x 2 root root 4096 Sep 14 23:21 ..2019_09_14_23_21_59.171278579
lrwxrwxrwx 1 root root 31 Sep 14 23:21 ..data -> ..2019_09_14_23_21_59.171278579
lrwxrwxrwx 1 root root 16 Sep 14 23:14 cron.toml -> ..data/cron.toml
lrwxrwxrwx 1 root root 19 Sep 14 23:14 faktory.toml -> ..data/faktory.toml
lrwxrwxrwx 1 root root 16 Sep 14 23:14 test.toml -> ..data/test.toml
Changes to the ConfigMap get pushed to the pod in less than a minute (10s in my case).
As a result, the total delay from the moment when the ConfigMap is updated to the moment when new keys are projected to the pod can be as long as kubelet sync period (1 minute by default) + ttl of ConfigMaps cache (1 minute by default) in kubelet.
from faktory.
You might be able to write the statsd.toml in the init container and it won't get overwritten. That's kind of a tricky one :\
The reloading (kill -HUP 1) seems to work (once the files were fully pushed out). I'm thinking it might be a good idea to have a sidecar container watching those files and, when they change, sending a signal to the other container in the pod.
I'll do an experiment and report back. This ability is mentioned here: https://kubernetes.io/docs/tasks/configure-pod-container/share-process-namespace/
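A rough sketch of that sidecar idea, before any experiment: it assumes shareProcessNamespace is available on the cluster, and that the sidecar runs with enough privilege to signal the faktory process (root, or the same UID). The detection trick is to watch the ..data symlink that the kubelet atomically swaps when projecting ConfigMap updates:

```yaml
# Sketch only: a config-watcher sidecar that HUPs faktory when the
# ConfigMap projection changes. Names match the manifests above;
# polling interval and privileges are assumptions.
spec:
  shareProcessNamespace: true   # sidecar can see faktory's processes
  containers:
  - name: faktory
    image: contribsys/faktory:0.9.6
    # ... as in the manifests above ...
  - name: config-watcher
    image: busybox
    command:
    - sh
    - -c
    - |
      last=$(readlink /etc/faktory/conf.d/..data)
      while sleep 10; do
        cur=$(readlink /etc/faktory/conf.d/..data)
        if [ "$cur" != "$last" ]; then
          pkill -HUP faktory    # reload cron/statsd config
          last=$cur
        fi
      done
    volumeMounts:
    - name: faktory-configs
      mountPath: /etc/faktory/conf.d
```
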
from faktory.
Interesting. My config map was created from a file, not using a string, so I wonder if that would simplify the implementation a bit. Mine looks pretty similar to yours, but I'm not familiar enough with ConfigMaps to spot any subtle differences. I like the idea of having a sidecar container to send the HUP signal. I added that HUP script referenced above to our deploy pipeline, to be run manually if necessary.
kubectl describe configmap faktory-config-merged
Name: faktory-config-merged
Namespace: default
Labels: <none>
Annotations: "truncated for now"
Data
====
cron.toml:
----
[[cron]]
schedule = "12 * * * *"
[cron.job]
type = "TestDatadog"
retry = 1
[[cron]]
schedule = "*/5 * * * *"
[cron.job]
type = "UpdateChatAgentCount"
retry = 0
statsd.toml:
----
[statsd]
# required, location of the statsd server
location = "localhost:8125"
# Prepend all metric names with this value, defaults to 'faktory.'
# If you have multiple Faktory servers for multiple apps reporting to
# the same statsd server you can use a multi-level namespace,
# e.g. "app1.faktory.", "app2.faktory." or use a tag below.
namespace = "faktory."
# optional, DataDog-style tags to send with each metric.
# keep in mind that every tag is sent with every metric so keep tags short.
tags = ["env:production"]
# Statsd client will buffer metrics for 100ms or until this size is reached.
# The default value of 15 tries to avoid UDP packet sizes larger than 1500 bytes.
# If your network supports jumbo UDP packets, you can increase this to ~50.
#bufferSize = 15
Events: <none>
from faktory.
helm repo add adwerx https://adwerx.github.io/charts
helm install --name faktory adwerx/faktory
Datadog Agent HostIP support coming soon.
from faktory.
Would it be useful if I PR'ed some example kubernetes configs, or added them to the wiki?
from faktory.
https://github.com/contribsys/faktory/wiki/Kubernetes-Deployment-Example
@jbielick look ok?
from faktory.
> @dm3ch I saw that the incubator PR was closed. Did you end up adding your chart / repo to the helm hub? I couldn't find it. Would you be opposed to me using some of the files from your PR to make a chart and publish?
Sorry for the late reply. Unfortunately, I haven't yet had time to publish my chart in a separate repo.
Thank you for publishing the chart on the Helm Hub. If your chart contains pieces of my work, that's completely ok.
from faktory.