kayrus / elk-kubernetes
This repo shows how to configure a complete EFK stack on top of Kubernetes.
License: GNU General Public License v2.0
Failed with RBAC-enabled cluster (1.7 in my case)
fluentd-elasticsearch fails with this:
2017-08-27 10:20:21 +0000 [error]: config error file="/etc/td-agent/td-agent.conf" error="Exception encountered fetching metadata from Kubernetes API endpoint: 403 Forbidden"
2017-08-27 10:20:21 +0000 [warn]: process died within 1 second. exit.
The only thing I changed is the namespace (via the NAMESPACE env variable).
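The 403 usually means the service account that fluentd runs under isn't allowed to read pod metadata from the API. A minimal RBAC sketch that typically resolves it — all names here are assumptions, and the DaemonSet must also reference the service account via serviceAccountName:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd-read
rules:
- apiGroups: [""]
  resources: ["pods", "namespaces"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluentd-read
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluentd-read
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: monitoring
```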
Hi, I want to add a field that shows the log level (e.g. INFO, WARN) in Elasticsearch, so that when I query logs I can see the log level in every index.
Would you please tell me how to configure this?
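One common approach — a sketch only, assuming the record_transformer plugin is available in this fluentd image and the level appears as a token in the raw log line — is to extract the level into its own record field before the logs reach Elasticsearch:

```
################## Extract a log_level field (illustrative) ##################
<filter kubernetes.var.log.containers.**>
  type record_transformer
  enable_ruby true
  <record>
    # pull the first INFO/WARN/ERROR/DEBUG token out of the raw log line;
    # fall back to UNKNOWN when no level is present
    log_level ${log.to_s[/\b(INFO|WARN|ERROR|DEBUG)\b/, 1] || "UNKNOWN"}
  </record>
</filter>
```

With this in place, log_level becomes a queryable field in every index; the regex and field name are assumptions to adapt to your log format.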
I've encountered the same issue as mentioned in pires/docker-elasticsearch-kubernetes#5. Is there any easy way of changing the Elasticsearch configuration without using another Docker container?
undeploy.sh (both in the root and in es5/) doesn't remove the es-curator deployment, which means it removes the pods, the deployment's replica set starts new pods, those get removed, new ones start, and so on.
I tried to install es5, but it fails with the following watch log:
~/git/kayrus/elk-kubernetes/es5$ ./deploy.sh --watch
Error from server (AlreadyExists): namespaces "monitoring" already exists
Labeling nodes which will serve Elasticsearch data pods
node "node-01.kube.example.com" not labeled
node "node-02.kube.example.com" not labeled
node "node-03.kube.example.com" not labeled
deployment "es-data" created
service "cerebro" created
deployment "cerebro-v0" created
deployment "es-client" created
deployment "es-curator" created
service "elasticsearch-discovery" created
configmap "es-env" created
daemonset "fluentd-elasticsearch" created
service "kibana" created
deployment "kibana-v5" created
deployment "es-master" created
service "elasticsearch-logging" created
deployment "kubernetes-events-printer" created
configmap "es-config" created
configmap "fluentd-config" created
NAME READY STATUS RESTARTS AGE
alertmanager-main-0 2/2 Running 0 5d
alertmanager-main-1 2/2 Running 0 5d
alertmanager-main-2 2/2 Running 0 5d
cerebro-v0-1518099659-f03xk 0/1 ContainerCreating 0 18s
es-curator-287824922-t93c1 0/1 ContainerCreating 0 16s
grafana-3524315691-z56zq 2/2 Running 0 23h
kibana-v5-2521480414-9kf73 0/1 ContainerCreating 0 10s
kubernetes-events-printer-577402123-x62r4 0/1 ContainerCreating 0 5s
node-exporter-3qm27 1/1 Running 0 23h
node-exporter-6sv0d 1/1 Running 0 23h
node-exporter-q5829 1/1 Running 0 23h
prometheus-k8s-0 2/2 Running 0 5d
prometheus-k8s-1 2/2 Running 0 5d
prometheus-operator-247007151-s30bh 1/1 Running 0 5d
es-curator-287824922-t93c1 1/1 Running 0 47s
kubernetes-events-printer-577402123-x62r4 1/1 Running 0 38s
kubernetes-events-printer-577402123-x62r4 0/1 Error 0 39s
kubernetes-events-printer-577402123-x62r4 0/1 Error 1 42s
kubernetes-events-printer-577402123-x62r4 0/1 CrashLoopBackOff 1 43s
kubernetes-events-printer-577402123-x62r4 0/1 Error 2 59s
kubernetes-events-printer-577402123-x62r4 0/1 CrashLoopBackOff 2 1m
kubernetes-events-printer-577402123-x62r4 0/1 Error 3 1m
kubernetes-events-printer-577402123-x62r4 0/1 CrashLoopBackOff 3 1m
kubernetes-events-printer-577402123-x62r4 0/1 Error 4 2m
kubernetes-events-printer-577402123-x62r4 0/1 CrashLoopBackOff 4 2m
cerebro-v0-1518099659-f03xk 0/1 Running 0 2m
cerebro-v0-1518099659-f03xk 1/1 Running 0 3m
kubernetes-events-printer-577402123-x62r4 0/1 Error 5 3m
kubernetes-events-printer-577402123-x62r4 0/1 CrashLoopBackOff 5 3m
kibana-v5-2521480414-9kf73 0/1 Running 0 4m
kubernetes-events-printer-577402123-x62r4 0/1 Error 6 6m
kubernetes-events-printer-577402123-x62r4 0/1 CrashLoopBackOff 6 6m
kubernetes-events-printer-577402123-x62r4 0/1 Error 7 11m
kubernetes-events-printer-577402123-x62r4 0/1 CrashLoopBackOff 7 11m
kubernetes-events-printer-577402123-x62r4 0/1 Error 8 16m
kubernetes-events-printer-577402123-x62r4 0/1 CrashLoopBackOff 8 16m
kubernetes-events-printer-577402123-x62r4 0/1 Error 9 21m
kubernetes-events-printer-577402123-x62r4 0/1 CrashLoopBackOff 9 21m
Trying to deploy this to a 1-master, 2-node k8s 1.5.1 cluster on AWS, and the es-master pods are crashing. kubectl logs shows the following error:
"Exception in thread "main" java.lang.IllegalArgumentException: No up-and-running site-local (private) addresses found, got [name:lo (lo), name:eth0 (eth0)]"
Both nodes and masters are running Debian jessie.
Any help you can offer would be much appreciated.
When using the events printer, I get errors like the following:
{"time":"2018-02-06T20:40:10Z","object":{"message":"Monitoring Kubernetes events staring from null resourceVersion"}}
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "resourceVersion: Invalid value: \"null\": strconv.ParseUint: parsing \"null\": invalid syntax",
  "code": 500
}
{"time":"2018-02-06T20:40:10Z","object":{"message":"Monitoring Kubernetes events staring from null resourceVersion"}}
K8s version: 1.7.6
OS: Container Linux 1465.7
Error: parse error: Invalid numeric literal at line 1, column 5
Details: When deploying stack to K8s with deploy.sh, k8s-events-printer fails to start and logs the above error.
[root@10-2-8-230 elk-kubernetes]# kubectl describe pod es-data-2875003034-qpfq7 --namespace=monitoring
Name: es-data-2875003034-qpfq7
Namespace: monitoring
Node: /
Labels: component=elasticsearch
pod-template-hash=2875003034
role=data
Status: Pending
IP:
Controllers: ReplicaSet/es-data-2875003034
Containers:
es-data:
Image: kayrus/docker-elasticsearch-kubernetes:2.4.4
Ports: 9300/TCP, 28651/TCP
Args:
/run.sh
-Des.path.conf=/etc/elasticsearch
Readiness: tcp-socket :9300 delay=0s timeout=1s period=10s #success=3 #failure=3
Volume Mounts:
/data from storage (rw)
/etc/elasticsearch from es-config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-tcpbr (ro)
Environment Variables:
NAMESPACE: monitoring (v1:metadata.namespace)
CLUSTER_NAME: <set to the key 'es-cluster-name' of config map 'es-env'>
NUMBER_OF_REPLICAS: <set to the key 'es-number-of-replicas' of config map 'es-env'>
NODE_MASTER: false
NODE_DATA: true
HTTP_ENABLE: false
ES_HEAP_SIZE: <set to the key 'es-data-heap' of config map 'es-env'>
ES_CLIENT_ENDPOINT: <set to the key 'es-client-endpoint' of config map 'es-env'>
ES_PERSISTENT: <set to the key 'es-persistent-storage' of config map 'es-env'>
Conditions:
Type Status
PodScheduled False
Volumes:
storage:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
es-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: es-config
default-token-tcpbr:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-tcpbr
QoS Class: BestEffort
Tolerations:
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
17m 31s 63 {default-scheduler } Warning FailedScheduling pod (es-data-2875003034-qpfq7) failed to fit in any node
fit failure summary on nodes : MatchInterPodAffinity (2), PodFitsHostPorts (2), PodToleratesNodeTaints (1)
I am using oh-my-zsh, and kubectl -o jsonpath={.items[0].metadata.name} doesn't return anything.
Switching to bash gives the correct result.
Can you make it work in zsh?
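This is likely zsh's globbing, not kubectl: unquoted [ ] and { } are treated as glob/brace patterns, so the JSONPath expression never reaches kubectl intact. Single-quoting the expression usually fixes it (a guess at the cause, not a confirmed fix) — e.g. kubectl get pods -o 'jsonpath={.items[0].metadata.name}'. A small demonstration that quoting preserves the expression, without needing kubectl:

```shell
# The quoted JSONPath string passes through the shell unchanged,
# so kubectl receives it verbatim.
expr='jsonpath={.items[0].metadata.name}'
echo "$expr"
```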
On a k8s apiserver v1.5.6 the deployment fails:
Log for kubernetes-event-printer:
2017-09-05T02:58:52.468020417Z parse error: Invalid numeric literal at line 1, column 10
Log for es-client:
2017-09-05T03:03:38.559340597Z ... 11 more
2017-09-05T03:03:38.635423286Z [2017-09-05 03:03:38,635][WARN ][io.fabric8.elasticsearch.discovery.kubernetes.KubernetesUnicastHostsProvider] [es-client-4269868645-wvs63] Exception caught during discovery: An error has occurred.
2017-09-05T03:03:38.635448602Z io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred.
2017-09-05T03:03:38.635454790Z at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:57)
2017-09-05T03:03:38.635460247Z at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:125)
2017-09-05T03:03:38.635465487Z at io.fabric8.elasticsearch.cloud.kubernetes.KubernetesAPIServiceImpl.endpoints(KubernetesAPIServiceImpl.java:35)
2017-09-05T03:03:38.635470618Z at io.fabric8.elasticsearch.discovery.kubernetes.KubernetesUnicastHostsProvider.readNodes(KubernetesUnicastHostsProvider.java:112)
2017-09-05T03:03:38.635475447Z at io.fabric8.elasticsearch.discovery.kubernetes.KubernetesUnicastHostsProvider.lambda$buildDynamicNodes$0(KubernetesUnicastHostsProvider.java:80)
2017-09-05T03:03:38.635480503Z at java.security.AccessController.doPrivileged(Native Method)
2017-09-05T03:03:38.635485257Z at io.fabric8.elasticsearch.discovery.kubernetes.KubernetesUnicastHostsProvider.buildDynamicNodes(KubernetesUnicastHostsProvider.java:79)
2017-09-05T03:03:38.635490428Z at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:335)
2017-09-05T03:03:38.635495242Z at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:240)
2017-09-05T03:03:38.635500403Z at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:106)
2017-09-05T03:03:38.635505279Z at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:84)
2017-09-05T03:03:38.635510217Z at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:945)
2017-09-05T03:03:38.635517456Z at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:360)
2017-09-05T03:03:38.635522390Z at org.elasticsearch.discovery.zen.ZenDiscovery.access$4400(ZenDiscovery.java:96)
2017-09-05T03:03:38.635527148Z at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1296)
2017-09-05T03:03:38.635545025Z at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
2017-09-05T03:03:38.635550558Z at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
2017-09-05T03:03:38.635555480Z at java.lang.Thread.run(Thread.java:745)
2017-09-05T03:03:38.635562181Z Caused by: javax.net.ssl.SSLPeerUnverifiedException: Hostname kubernetes.default.svc not verified:
2017-09-05T03:03:38.635567196Z certificate: sha1/KROIINnUlQ2e96AR0sQBvJJTyu4=
2017-09-05T03:03:38.635571908Z DN: CN=10.10.10.1
2017-09-05T03:03:38.635576527Z subjectAltNames: [10.10.10.1, 10.176.215.15, 169.46.7.238]
2017-09-05T03:03:38.635581169Z at com.squareup.okhttp.internal.io.RealConnection.connectTls(RealConnection.java:197)
2017-09-05T03:03:38.635586013Z at com.squareup.okhttp.internal.io.RealConnection.connectSocket(RealConnection.java:145)
2017-09-05T03:03:38.635590714Z at com.squareup.okhttp.internal.io.RealConnection.connect(RealConnection.java:108)
2017-09-05T03:03:38.635595466Z at com.squareup.okhttp.internal.http.StreamAllocation.findConnection(StreamAllocation.java:184)
2017-09-05T03:03:38.635600230Z at com.squareup.okhttp.internal.http.StreamAllocation.findHealthyConnection(StreamAllocation.java:126)
2017-09-05T03:03:38.635604990Z at com.squareup.okhttp.internal.http.StreamAllocation.newStream(StreamAllocation.java:95)
2017-09-05T03:03:38.635609849Z at com.squareup.okhttp.internal.http.HttpEngine.connect(HttpEngine.java:281)
2017-09-05T03:03:38.635614624Z at com.squareup.okhttp.internal.http.HttpEngine.sendRequest(HttpEngine.java:224)
2017-09-05T03:03:38.635619637Z at com.squareup.okhttp.Call.getResponse(Call.java:286)
2017-09-05T03:03:38.635624479Z at com.squareup.okhttp.Call$ApplicationInterceptorChain.proceed(Call.java:243)
2017-09-05T03:03:38.635629340Z at io.fabric8.kubernetes.client.utils.HttpClientUtils$3.intercept(HttpClientUtils.java:110)
2017-09-05T03:03:38.635634086Z at com.squareup.okhttp.Call$ApplicationInterceptorChain.proceed(Call.java:232)
2017-09-05T03:03:38.635638770Z at com.squareup.okhttp.Call.getResponseWithInterceptorChain(Call.java:205)
2017-09-05T03:03:38.635643513Z at com.squareup.okhttp.Call.execute(Call.java:80)
2017-09-05T03:03:38.635648141Z at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:210)
2017-09-05T03:03:38.635652811Z at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:205)
2017-09-05T03:03:38.635657558Z at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:510)
2017-09-05T03:03:38.635662171Z at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:118)
2017-09-05T03:03:38.635666876Z ... 16 more
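The root cause in the trace appears to be that the apiserver certificate's subjectAltNames contain only IP addresses, so hostname verification of kubernetes.default.svc fails. One way to inspect a certificate's SANs — sketched here against a throwaway locally generated cert, since I can't run it against your apiserver; requires OpenSSL 1.1.1+ and the filenames are illustrative:

```shell
# Create a throwaway self-signed cert carrying both a DNS and an IP SAN,
# then print its subjectAltName extension -- the same inspection you would
# run against the apiserver's serving certificate.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout /tmp/es-apiserver-demo.key -out /tmp/es-apiserver-demo.pem \
  -subj '/CN=10.10.10.1' \
  -addext 'subjectAltName=DNS:kubernetes.default.svc,IP:10.10.10.1'
openssl x509 -in /tmp/es-apiserver-demo.pem -noout -ext subjectAltName
```

If the real apiserver cert is missing a DNS entry for kubernetes.default.svc, regenerating it with that SAN included is the usual remedy.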
Related to fabric8io/fluent-plugin-kubernetes_metadata_filter#41.
Maybe this comment can fix the issue: fabric8io/fluent-plugin-kubernetes_metadata_filter@c9fa728#diff-83453022bdb83378e6c454e99caacd36
I still have to verify whether a plugin version bump will help.
--register-schedulable
will be deprecated and replaced with --register-with-taints=node.alpha.kubernetes.io/ismaster=:NoSchedule
Hi, can you tell me the meaning of the code below? I've hit the same problem; fluentd logs this message sometimes.
Thank you! :-)
################## Strip fluentd concat logs ##################
<match kubernetes.var.log.containers.fluentd-elasticsearch-**.log>
type rewrite_tag_filter
rewriterule1 log "\[warn\]: dump an error event: error_class=Fluent::ConcatFilter::TimeoutError" clear.fluentd.concat
</match>
<match clear.fluentd.concat>
type null
</match>
Hi Team,
I am trying to use the kayrus/docker-elasticsearch-kubernetes image on the arm64 platform, but it seems it is not available for arm64.
I have successfully built the images using docker build -t image_name . on the arm64 platform by making some changes in the Dockerfile.
I have used Travis CI to build and push the image for both platforms.
Commit Link -
573c4cd
Docker Hub Link(elasticsearch) - https://hub.docker.com/repository/registry-1.docker.io/odidev/elasticsearch/tags?page=1&ordering=last_updated
Docker Hub Link(elasticsearch_5x) - https://hub.docker.com/repository/registry-1.docker.io/odidev/elasticsearch_5x/tags?page=1&ordering=last_updated
Do you have any plans on releasing arm64 images?
If interested, I will raise a PR.
I think the problem I'm seeing is that the replica set for the es-data deployment is attempting to put load on the master nodes. My cluster is set up not to allow workloads on the masters. (I suspect daemonsets have an exception in this situation.)
Anyway, the es-data pods deployed on my worker nodes are fine, but the es-data pods trying to deploy on my masters report:
pod (es-data-2875003034-3lm6d) failed to fit in any node fit failure summary on nodes : MatchInterPodAffinity (4), PodFitsHostPorts (4), PodToleratesNodeTaints (3)
default-scheduler
Normally I'd try to work around this, but I'm not 100% familiar with this deployment method, so I didn't want to break anything.
Any suggestions?
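One option to keep es-data off tainted masters is an explicit nodeSelector on the deployment, matching the label deploy.sh applies to data nodes — a sketch only, since the exact label key/value are assumptions; scaling the deployment down to the worker-node count is another option:

```yaml
# Fragment of the es-data Deployment pod template (label is an assumption;
# check what deploy.sh actually sets on your nodes).
spec:
  template:
    spec:
      nodeSelector:
        elasticsearch.data: "true"
```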
I'm fairly certain this is a setup failure in my configuration, because last week I had Kibana up and running. However, now when I try to bring up kibana.example.com, my browser reports too many redirects.
Can I run Kibana as standard HTTP? In my cluster, my ingress controller is fronted by an ELB that has a wildcard cert. My ingress controller is then set up to forward 443 to the HTTP port. Even with the above turned off, I still can't seem to get the Kibana GUI to show itself. I'm using your test_ingress/ingress.yaml file with the correct domains.
Other than that, everything is stock.
Hi there,
I'm running Kubernetes 1.5.3 using GKE to test this repo.
When running ./deploy.sh, everything comes up except for the fluentd pods. They enter a CrashLoopBackOff. Describing one of the pods yields the following output:
Name: fluentd-elasticsearch-sbrff
Namespace: monitoring
Node: gke-test-cluster-1-normal-size-pool-8442a74d-h98x/10.132.0.4
Start Time: Sun, 26 Mar 2017 20:10:31 +0200
Labels: k8s-app=fluentd-logging
name=fluentd-logging
Status: Running
IP: 10.0.7.74
Controllers: DaemonSet/fluentd-elasticsearch
Containers:
fluentd-elasticsearch:
Container ID: docker://b32d2ed0e5cbfa774faabc157ada3c55303c3795711bee603d980d2ba3da4441
Image: kayrus/fluentd-elasticsearch:1.20
Image ID: docker://sha256:3cea676d8a4f200608d3e016be0f8567d10f8b932441325af0d8bd34a5fe15bf
Port:
Args:
-q
Limits:
memory: 1Gi
Requests:
cpu: 100m
memory: 1Gi
State: Waiting
Reason: RunContainerError
Last State: Terminated
Reason: ContainerCannotRun
Exit Code: 128
Started: Sun, 26 Mar 2017 20:10:34 +0200
Finished: Sun, 26 Mar 2017 20:10:34 +0200
Ready: False
Restart Count: 0
Volume Mounts:
/etc/td-agent from fluentd-config (ro)
/localdata/docker/containers from localdata-docker-containers (ro)
/run/log from run-log (rw)
/var/lib/docker/containers from var-lib-docker-containers (ro)
/var/log from var-log (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-sr3fw (ro)
Environment Variables: <none>
Conditions:
Type Status
Initialized True
Ready False
Volumes:
var-log:
Type: HostPath (bare host directory volume)
Path: /var/log
run-log:
Type: HostPath (bare host directory volume)
Path: /run/log
var-lib-docker-containers:
Type: HostPath (bare host directory volume)
Path: /var/lib/docker/containers
localdata-docker-containers:
Type: HostPath (bare host directory volume)
Path: /localdata/docker/containers
fluentd-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: fluentd-config
default-token-sr3fw:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-sr3fw
QoS Class: Burstable
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
14s 12s 2 {kubelet gke-test-cluster-1-normal-size-pool-8442a74d-h98x} spec.containers{fluentd-elasticsearch} Normal Pulling pulling image "kayrus/fluentd-elasticsearch:1.20"
12s 12s 1 {kubelet gke-test-cluster-1-normal-size-pool-8442a74d-h98x} spec.containers{fluentd-elasticsearch} Normal Created Created container with docker id b32d2ed0e5cb; Security:[seccomp=unconfined]
12s 12s 1 {kubelet gke-test-cluster-1-normal-size-pool-8442a74d-h98x} spec.containers{fluentd-elasticsearch} Warning Failed Failed to start container with docker id b32d2ed0e5cb with error: Error response from daemon: mkdir /localdata: read-only file system
12s 10s 2 {kubelet gke-test-cluster-1-normal-size-pool-8442a74d-h98x} spec.containers{fluentd-elasticsearch} Normal Pulled Successfully pulled image "kayrus/fluentd-elasticsearch:1.20"
12s 10s 2 {kubelet gke-test-cluster-1-normal-size-pool-8442a74d-h98x} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "fluentd-elasticsearch" with RunContainerError: "runContainer: Error response from daemon: mkdir /localdata: read-only file system"
10s 10s 1 {kubelet gke-test-cluster-1-normal-size-pool-8442a74d-h98x} spec.containers{fluentd-elasticsearch} Normal Created Created container with docker id d687b6de0f40; Security:[seccomp=unconfined]
10s 10s 1 {kubelet gke-test-cluster-1-normal-size-pool-8442a74d-h98x} spec.containers{fluentd-elasticsearch} Warning Failed Failed to start container with docker id d687b6de0f40 with error: Error response from daemon: mkdir /localdata: read-only file system
Any pointers would be appreciated. Thanks!
I use curator to remove old indices, but the curator_cli script is not running at two o'clock every day. Would you please tell me how to configure it? Thanks
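For reference, a scheduled curator run is usually just a cron entry inside the container. An illustrative crontab line — the host, action, and filters are assumptions to adapt to your setup:

```
# Run curator_cli daily at 02:00, deleting indices older than 7 days
# (host, action, and filter values are illustrative).
0 2 * * * curator_cli --host elasticsearch-logging delete_indices --filter_list '[{"filtertype":"age","source":"creation_date","direction":"older","unit":"days","unit_count":7}]'
```

If the job never fires at the expected hour, also check the container's timezone: cron interprets "2" in the container's local time, which is typically UTC.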