
loghouse's People

Contributors

andrewkoryakin, apolovov, dependabot[bot], diafour, dm3ch, gsmetal, gyrter, ilya-lesikov, kolashtov, kramarama, max107, may-cat, quite4work, qw1mb0, shurup, tyranron, valent-ex, vitaliy-sn, z9r5, zuzzas


loghouse's Issues

Internal Server Error on incorrect query

If you enter an incorrect query, you get an Internal Server Error, and the loghouse pod logs show:
2017-11-04 07:27:14 - LoghouseQuery::BadFormat - logs: Failed to match sequence (EXPRESSION subquery:(SUBQUERY?)) at line 1 char 1.:

In fluentd Code: 117. DB::Exception: Unknown field found while parsing JSONEachRow format: _SELINUX_CONTEXT

I see the following in the logs when deploying loghouse via Helm:

Code: 117. DB::Exception: Unknown field found while parsing JSONEachRow format: _SELINUX_CONTEXT
 2017-11-08 05:28:55 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-11-08 05:28:55 +0000 error_class="RuntimeError" error="command returns 29952: bash /usr/local/bin/insert_ch.sh /tmp/fluent-plugin-exec-20171108-7-kfyf2o" plugin_id="object:3fce70caaa48"
  2017-11-08 05:28:55 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.40/lib/fluent/plugin/out_exec.rb:104:in `write'
  2017-11-08 05:28:55 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.40/lib/fluent/buffer.rb:354:in `write_chunk'
  2017-11-08 05:28:55 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.40/lib/fluent/buffer.rb:333:in `pop'
  2017-11-08 05:28:55 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.40/lib/fluent/output.rb:342:in `try_flush'
  2017-11-08 05:28:55 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.40/lib/fluent/output.rb:149:in `run'
Code: 117. DB::Exception: Unknown field found while parsing JSONEachRow format: _SELINUX_CONTEXT
 2017-11-08 05:28:55 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-11-08 05:28:57 +0000 error_class="RuntimeError" error="command returns 29952: bash /usr/local/bin/insert_ch.sh /tmp/fluent-plugin-exec-20171108-7-fm3rdj" plugin_id="object:3fce70caaa48"
  2017-11-08 05:28:55 +0000 [warn]: suppressed same stacktrace
Code: 117. DB::Exception: Unknown field found while parsing JSONEachRow format: _SELINUX_CONTEXT
 2017-11-08 05:28:57 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-11-08 05:29:01 +0000 error_class="RuntimeError" error="command returns 29952: bash /usr/local/bin/insert_ch.sh /tmp/fluent-plugin-exec-20171108-7-1tfk6fi" plugin_id="object:3fce70caaa48"

What could be causing this?

Negative lookahead for regex

Sometimes we need to search for logs that do not match a given regex, but re2 has no simple out-of-the-box solution for this (google/re2#156). A new !~ operator would be a simple and powerful solution for this situation.
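Such an operator could be implemented outside the regex engine itself: run the positive re2 match and invert the result when lowering the query to ClickHouse SQL. A minimal sketch, assuming a hypothetical to_clickhouse helper in the query translator (ClickHouse's match() function is real; everything else here is illustrative):

```ruby
# Hypothetical lowering of a "field ~ regex" / "field !~ regex" condition.
# re2 has no negative lookahead, but the negation can live outside the
# regex: emit the positive match() and wrap it in SQL NOT.
def to_clickhouse(field, op, regex)
  positive = "match(#{field}, '#{regex}')"
  op == '!~' ? "NOT #{positive}" : positive
end

puts to_clickhouse('logs.msg', '!~', 'error|warn')
# => NOT match(logs.msg, 'error|warn')
```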

No Endpoint Created on Ingress

I am using Kubernetes 1.8.1, and when I deploy this with Helm, the Ingress never gets a public IP endpoint and there are no errors to speak of (nothing in events, etc.). Any ideas?

Query improvements

Users should be able to save queries; an admin can save public queries.
Queries can be saved in a ConfigMap.
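A hedged sketch of how such queries could be stored in a ConfigMap (the names and structure below are hypothetical, not an existing loghouse format):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: loghouse-saved-queries   # hypothetical name
  namespace: loghouse
data:
  queries.yaml: |
    - name: prod-errors
      owner: admin                # public query saved by an admin
      query: "namespace = 'prod' and level = 'error'"
```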

docker-compose

Is the current implementation for Kubernetes only, or can I also run it with docker-compose?

Timeout when installing on bare metal k8s

I'm on Kubernetes ver. 1.8.3 running on my own server.
I'm getting the following error:

root@db2new:~/kube# helm install -n loghouse loghouse
Error: release loghouse failed: Timeout: request did not complete within allowed duration
root@db2new:~/kube#

Tiller logs:

[storage] 2017/11/27 04:08:34 getting release history for "loghouse"
[tiller] 2017/11/27 04:08:34 uninstall: Release not loaded: loghouse
[tiller] 2017/11/27 04:08:41 preparing install for loghouse
[storage] 2017/11/27 04:08:41 getting release history for "loghouse"
[tiller] 2017/11/27 04:08:41 rendering loghouse chart using values
2017/11/27 04:08:41 info: manifest "loghouse/templates/tabix/tabix.yaml" is empty. Skipping.
2017/11/27 04:08:41 info: manifest "loghouse/templates/clickhouse/clickhouse-pvc.yaml" is empty. Skipping.
2017/11/27 04:08:41 info: manifest "loghouse/templates/tabix/tabix-svc.yaml" is empty. Skipping.
2017/11/27 04:08:41 info: manifest "loghouse/templates/clickhouse/clickhouse-ingress.yaml" is empty. Skipping.
2017/11/27 04:08:41 info: manifest "loghouse/templates/tabix/tabix-ingress.yaml" is empty. Skipping.
[tiller] 2017/11/27 04:08:42 performing install for loghouse
[tiller] 2017/11/27 04:08:42 executing 2 pre-install hooks for loghouse
[tiller] 2017/11/27 04:08:42 hooks complete for pre-install loghouse
[storage] 2017/11/27 04:08:42 getting release history for "loghouse"
[storage] 2017/11/27 04:08:42 creating release "loghouse.v1"
[kube] 2017/11/27 04:08:42 building resources from manifest
[kube] 2017/11/27 04:08:42 creating 19 resource(s)
[tiller] 2017/11/27 04:09:13 warning: Release "loghouse" failed: Timeout: request did not complete within allowed duration
[storage] 2017/11/27 04:09:13 updating release "loghouse.v1"
[tiller] 2017/11/27 04:09:13 failed install perform step: release loghouse failed: Timeout: request did not complete within allowed duration

It turns out that the first installation to a fresh k8s node always succeeds, but all subsequent ones (after helm del --purge loghouse) fail with a timeout.

Button resetting columns to their auto/default state

After the filter is changed and some columns are hidden manually, the user should be able to restore the columns to their “auto” state, i.e. display all columns except the automatically hidden ones.

Display non-Kubernetes (system) logs

There are Docker logs, kernel logs, and others having no namespace and no pod_name label, so it’s difficult to distinguish user permissions for these records and to filter them. We need to find a solution.

Need to use helm.sh/hook-delete-policy

[kube] root@kube ~ # vi loghouse/values.yaml
[kube] root@kube ~ # helm upgrade --namespace loghouse loghouse loghouse/
Error: UPGRADE FAILED: jobs.batch "loghouse-init-tables" already exists
[kube] root@kube ~ # kubectl -n loghouse delete jobs/loghouse-init-tables
job "loghouse-init-tables" deleted
[kube] root@kube ~ # kubectl -n loghouse delete jobs/loghouse-init-tables
job "loghouse-init-tables" deleted
[kube] root@kube ~ # helm upgrade --namespace loghouse loghouse loghouse/
Release "loghouse" has been upgraded. Happy Helming!
LAST DEPLOYED: Fri Dec 8 14:18:30 2017
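The helm.sh/hook-delete-policy annotation tells Tiller to delete the hook resource, so re-running the hook does not collide with the old Job. A sketch for the init-tables Job (the annotation keys and the hook-succeeded policy are real Helm features; the surrounding Job metadata is abbreviated):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: loghouse-init-tables
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-delete-policy": hook-succeeded  # delete the Job once it completes
```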

Internal Server Error if user is not added to config

If you do not add the user to the loghouse ConfigMap, then when you try to log in, you get an Internal Server Error with the following error:
95.213.149.180, 95.213.149.180 - - [04/Nov/2017:07:14:51 +0000] "GET /query HTTP/1.0" 500 30 0.0010 2017-11-04 07:14:52 - RuntimeError - no user permissions configured for user

Negation in queries

Sometimes we need to use global negation in queries, e.g. not (expr1 and expr2)
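Global negation fits naturally as a NOT node in the query AST: translate the inner expression as usual and wrap it in NOT (...) in the generated SQL. A minimal sketch, assuming a hypothetical expression tree (not loghouse's actual parser):

```ruby
# Hypothetical expression node: leaves hold tokens, :and joins children,
# :not wraps the whole translated condition in SQL NOT (...).
Expr = Struct.new(:op, :args) do
  def to_sql
    case op
    when :and then args.map(&:to_sql).join(' AND ')
    when :not then "NOT (#{args.first.to_sql})"
    else args.join(' ')  # leaf, e.g. ["level", "=", "'error'"]
    end
  end
end

inner = Expr.new(:and, [Expr.new(:leaf, %w[level = 'error']),
                        Expr.new(:leaf, %w[ns = 'prod'])])
puts Expr.new(:not, [inner]).to_sql
# => NOT (level = 'error' AND ns = 'prod')
```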

Settings for records view

  • pagination. Records per page with 250 as default value;
  • time format. Now time is displayed as “2017-10-31 18:31:35.910607721”, this format should be adjustable;
  • timezone. Log records usually have UTC timestamp, however user may have another timezone.

Autocomplete for query field

The query field should offer autocompletion from the list of labels.
Prometheus's query field (see the screenshot below) can be used as a reference.

(screenshot: Prometheus query field with autocomplete)

Scheduling of download jobs (S3)

Since rotation deletes old log records, it would be great to make their upload to S3 available on a schedule.

  • Add UI for scheduling settings.
  • Schedule can be added to the saved query or to the job created from filter.

Query language improvements

Query language improvements:

  • Round brackets support
  • Search by owner support. It is useful to search records from particular deployment/cronjob/statefulset/daemonset/etc.

Dashboard should be SPA

SPA (single-page application) has no page switching and it’s more responsive (all the data is loaded via AJAX). Query saving should be in a separate view or dialog (e.g. graph editor in Grafana).

Angular is preferable since it's used in Grafana and kubernetes-dashboard. Angular+React (as in Grafana) may be another option.

Add ability to store list of columns to show in query template

I would like to be able to specify which keys should be shown in a query template.

With this feature, I can specify the list of hidden keys only once, when creating the query template. Then, every time I open this template, I will see only the keys I need.

Highlight search results

When searching for a part of a big field by regular expression, it is very difficult to see where the part you are looking for actually is.

Wrong query language definitions

Wrong query language definitions (https://github.com/flant/loghouse/blob/master/docs/ru/query-language.md)

This query

SELECT string_fields.names, string_fields.values
FROM logs.logs
WHERE date = today() AND string_fields.values = '1'
ORDER BY timestamp
LIMIT 100

throws

Code: 43, e.displayText() = DB::Exception: Illegal types of arguments (Array(String), String) of function equals, e.what() = DB::Exception
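The error occurs because string_fields.values has type Array(String), and ClickHouse cannot compare an array to a scalar with =. Array membership is tested with has() (or the array can be unfolded with ARRAY JOIN). A likely corrected form of the query above:

```sql
SELECT string_fields.names, string_fields.values
FROM logs.logs
WHERE date = today() AND has(string_fields.values, '1')
ORDER BY timestamp
LIMIT 100
```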

Images:
flant/loghouse-clickhouse:0.0.2
flant/loghouse-fluentd:0.0.2
flant/loghouse-tabix:0.0.2

Error installing loghouse to GKE

I'm trying to install loghouse to a Google Cloud-based Kubernetes cluster (node pool version is 1.8.3-gke.0).
I'm getting the following error:

➜  cm-scripts git:(master) ✗ helm install -n loghouse loghouse
Error: release loghouse failed: clusterroles.rbac.authorization.k8s.io "fluentd" is forbidden: attempt to grant extra privileges: [PolicyRule{Resources:["namespaces"], APIGroups:[""], Verbs:["get"]} PolicyRule{Resources:["namespaces"], APIGroups:[""], Verbs:["watch"]} PolicyRule{Resources:["namespaces"], APIGroups:[""], Verbs:["list"]} PolicyRule{Resources:["pods"], APIGroups:[""], Verbs:["get"]} PolicyRule{Resources:["pods"], APIGroups:[""], Verbs:["watch"]} PolicyRule{Resources:["pods"], APIGroups:[""], Verbs:["list"]}] user=&{system:serviceaccount:kube-system:default 3d69e5b6-d03a-11e7-9bc4-42010af00233 [system:serviceaccounts system:serviceaccounts:kube-system system:authenticated] map[]} ownerrules=[PolicyRule{Resources:["selfsubjectaccessreviews"], APIGroups:["authorization.k8s.io"], Verbs:["create"]} PolicyRule{NonResourceURLs:["/api" "/api/*" "/apis" "/apis/*" "/healthz" "/swagger-2.0.0.pb-v1" "/swagger.json" "/swaggerapi" "/swaggerapi/*" "/version"], Verbs:["get"]}] ruleResolutionErrors=[]
➜  cm-scripts git:(master) ✗
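GKE's RBAC refuses to let an account grant privileges it does not hold itself. A common workaround (verify it fits your cluster's security policy) is to bind your own Google account to cluster-admin before installing, so Tiller can create the fluentd ClusterRole:

```shell
# Grant your own account cluster-admin so Helm/Tiller can create RBAC
# objects; adjust to your cluster's security policy.
kubectl create clusterrolebinding cluster-admin-binding \
  --clusterrole=cluster-admin \
  --user="$(gcloud config get-value account)"
```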

Hiding/displaying labels should be implemented in a single select

There should be a single select filled with the currently displayed labels. Hidden labels that are still available should appear in its autocomplete list. If all labels are displayed, the autocomplete list will be empty. A “Clear” button resetting labels to their default state should be available.

Currently this default state is simply “display everything” but in future it should be improved via special algorithm hiding labels depending on a filter query.

How to reduce ClickHouse memory usage?

In our setup, we have several nodes with ~8 GB RAM. ClickHouse eats all of the memory and does not give it back, so its memory usage grows over time; when it needs more than the node can give, the node crashes with an OutOfMemory error.
Is there an option to reduce the memory usage of the ClickHouse service?
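ClickHouse's memory ceilings are configurable. A hedged sketch of settings that commonly help on small nodes (the setting names are real ClickHouse options; the byte values are only illustrative for an 8 GB host):

```xml
<!-- users.xml: cap memory used by a single query (bytes) -->
<max_memory_usage>4000000000</max_memory_usage>

<!-- config.xml: shrink caches whose defaults assume much larger hosts -->
<mark_cache_size>536870912</mark_cache_size>
<uncompressed_cache_size>1073741824</uncompressed_cache_size>
```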

clickhouse-client inserts into the wrong table

cat $1 | clickhouse-client --host="${CLICKHOUSE_SERVER}" --port="${CLICKHOUSE_PORT}" --user="${CLICKHOUSE_USER}" --password=${CLICKHOUSE_PASS} --database="${CLICKHOUSE_DB}" --compression true --query="INSERT INTO logs${TABLE} FORMAT JSONEachRow" && rm -f $1

INSERT INTO logs${TABLE}

does not match the table configured in ClickHouse, so it tries to write into a non-existent table.

Settings for log rotation (max_days, max_size)

Old logs should be cleaned from time to time. 3 options will be introduced for that purpose:

  • max_days N — keep records not older than N days;
  • max_size M — store not more than M GBs of records;
  • max_days N + max_size M — if both options are in use, both limits (in days and GBs) should be always applied.
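Since the log tables are partitioned by date, max_days can be implemented by dropping whole partitions older than the retention window. A minimal sketch (the table name and partition id format are assumptions):

```ruby
require 'date'

# Hypothetical max_days cleanup: produce DROP PARTITION statements for
# date partitions older than the retention window.
def rotation_statements(partitions, max_days, today: Date.today)
  cutoff = today - max_days
  partitions.select { |p| Date.parse(p) < cutoff }
            .map { |p| "ALTER TABLE logs.logs DROP PARTITION '#{p}'" }
end

puts rotation_statements(%w[2017-11-01 2017-11-20], 7,
                         today: Date.new(2017, 11, 27))
# => ALTER TABLE logs.logs DROP PARTITION '2017-11-01'
```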

Clicking on time should display records with no filter applied

No clear decision yet.

Options available:

  • clicking on time will open a new tab with records from corresponding time and no (empty) filter applied
  • ctrl-clicking on time will open a new tab with records from corresponding time and no filter applied
  • clicking on time will clear a filter and show records from corresponding time. However, there should be a way to “go back” (applying filter being used before) — some kind of “breadcrumbs”.

Clickhouse server crash

Hi,

I'm trying to deploy ClickHouse in my Kubernetes cluster, but the ClickHouse pod crashes at startup.

Here are the pod's logs:

╰─# k -n loghouse logs -f clickhouse-server-79dc494dfc-sz268                                                                                                                                          
Include not found: clickhouse_remote_servers
Include not found: clickhouse_compression
2017.11.02 03:01:08.091964 [ 1 ] <Warning> Application: Logging to console
2017.11.02 03:01:08.103679 [ 1 ] <Warning> ConfigProcessor: Include not found: networks
2017.11.02 03:01:10.103799 [ 2 ] <Warning> ConfigProcessor: Include not found: clickhouse_remote_servers
2017.11.02 03:01:10.103848 [ 2 ] <Warning> ConfigProcessor: Include not found: clickhouse_compression
2017.11.02 03:01:10.107723 [ 3 ] <Warning> ConfigProcessor: Include not found: networks

I used the Helm chart:

name: loghouse
version: 0.0.1

And I didn't make any changes in templates/clickhouse/clickhouse-configmap.yml

Kubernetes v1.8.2
Calico CNI
IPv6 only (I made the changes needed for Service ClusterIP to work)
Debian 9.2

Build loghouse on a RedHat-based distribution?

Hello!
I'm trying to build loghouse on a RedHat-based distribution, but I get some errors.
I looked at https://github.com/flant/loghouse/blob/master/Dockerfile and have a few questions.
Is ruby:2.3.4 the minimal version?
You add 2 files and run 2 commands:

ADD Gemfile /app/Gemfile
ADD Gemfile.lock /app/Gemfile.lock
RUN bundle config git.allow_insecure true
RUN bundle install

How can I do a simple build without Docker?
Thank you
Account:
https://habrahabr.ru/users/chemtech/
Email:
patsev(dot)anton(at)gmail(dot)com
We can also talk in Russian.

Simplify downloading records

The Download button should start a background server process that archives the resulting records, so the user can see a progress bar for that process. Once archiving is complete, a secret temporary link should become available; the user can download the file by clicking it or share the link. The link should expire, and the archive should be deleted from the server, after 2 days.

Details:

  • Make it possible to download data in different formats (csv, json …)
  • Provide an option (checkbox) to compress (or not) resulting file
  • Use Clickhouse capabilities (instead of Ruby) for getting records in several formats
  • Implement a queue (in backend) for background processes
  • Consider a download job as a system entity (not a one time task) which should be visible in UI
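For the several-formats point, ClickHouse can serialize results itself via the FORMAT clause (CSVWithNames and JSONEachRow are real ClickHouse formats), so the Ruby backend only needs to stream the output; a sketch:

```shell
clickhouse-client --query="SELECT * FROM logs.logs LIMIT 100 FORMAT CSVWithNames" > logs.csv
clickhouse-client --query="SELECT * FROM logs.logs LIMIT 100 FORMAT JSONEachRow" | gzip > logs.json.gz
```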

Make more installation configurations available

All clusters are different: bare metal, GCE, AWS, Azure… There should be several installation configurations:

  • ClickHouse with local index on each node. Best option for bare metal; it lowers network delays between fluentd and ClickHouse;
  • ClickHouse as a cluster. Best option for cloud based K8s (GCE, AWS, Azure);
  • Standalone ClickHouse. For small K8s clusters or testing purposes. ClickHouse is installed on single node while fluentd is placed on each node.
  • External ClickHouse. ClickHouse is installed outside the cluster and is used as an external service while fluentd and dashboard are installed in the cluster.

Add tooltips for fields, buttons, labels

  • Tooltip for each field providing details on what should be entered and how it affects the logs displayed
  • Tooltip for buttons explaining its action
  • Tooltip for labels displaying condition being added to filter

Smart “play” button

  • When user selects “now” as a time in “to” field, play button should be activated automatically (with follow mode enabled).
  • Currently, if the user scrolls up in follow mode, the records view is not fixed: the screen automatically scrolls as new records arrive. Scrolling up should disable follow mode, and scrolling back down should re-enable it. This logic may be implemented in some other way.

Get rid of 2 selects for displaying/hiding columns

  • Remove two selects with columns. Use text input with autocomplete and select with columns displayed instead.
  • If a column is used in a strict conditioned filter, this column should be hidden automatically. For example, clicking on an “instance=worker-1” label should add this condition to the filter and remove the “instance” column from the visible list.
  • Autocomplete must be empty if all columns are visible.
  • Reset button. It should reveal all columns.
