
elasticsearch-templates

The repository contains scripts and sources to generate Elasticsearch templates that comply with Common Data Model.


Problem Statement

We are trying to solve the problem of conflicts and inconsistencies in log data collected by and from different subsystems when that data is stored together as a unified data set in one warehouse.

Namespace hierarchy

The namespace hierarchy of the log metadata is the key concept. We use Elasticsearch index templates and document mappings to cast the common metadata keys into usable documents.

A namespace corresponds to a top-level JSON key of an Elasticsearch document. A namespace is usually defined per individual application or subsystem, so that different applications/subsystems do not conflict in their metadata fields.
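For illustration, a minimal sketch of such a document (expressed here as a Python dict; the field names are partly hypothetical): each subsystem keeps its metadata under its own top-level key, so identically named fields cannot collide.

# Hypothetical example document: "kubernetes" and "docker" act as separate namespaces,
# so each can carry its own container-related fields without conflicting.
log_document = {
    "@timestamp": "2017-08-08T09:50:37+03:00",
    "message": "container started",
    "kubernetes": {"container_name": "mux-mux", "namespace_name": "logging"},
    "docker": {"container_id": "abc123"},
}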

Adding new namespace

Create a namespace definition file in the namespaces/ folder.

Adding new Elasticsearch template

Create a sub-folder in the templates/ folder, named after the desired template. (Alternatively, copy and modify one of the existing template folders.)

Add or modify the template.yml definition file to include the proper namespace definitions. See, for example, templates/openshift/README.md for details.

Elasticsearch versions support

Support for multiple Elasticsearch versions has been added. The resulting files (i.e. index templates and index patterns) are generated for each supported version of Elasticsearch. The target ES version is encoded into the file name.

The list of currently supported ES versions can be found in scripts/supported_versions.py.

The idea is that all input templates and data are formatted according to the latest supported ES version, and the scripts handle the backward data and format conversions for older ES versions. As part of unit testing, the generated data is compared to the released common data files (automatically downloaded from GitHub during the tests).
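A rough sketch of this idea (illustrative only; the actual logic and version list live in scripts/generate_template.py and scripts/supported_versions.py):

# Illustrative sketch, not the repository's actual code: generate one output file per
# supported ES version and encode the target version into the file name.
SUPPORTED_ES_VERSIONS = ["2.4.4", "5.6.16"]  # assumption: example values only

def versioned_output_name(base_name, es_version):
    return "{0}.{1}.json".format(base_name, es_version)

for version in SUPPORTED_ES_VERSIONS:
    print(versioned_output_name("com.redhat.viaq-openshift-operations.template", version))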

Generating documentation

Use the makefile in the templates/ folder.

Alternatively, run the following command: python ./scripts/generate_template.py <path to template in templates/> namespaces/ --doc

The generated file is named like "xxx.asciidoc".

Viewing the documentation

Install an asciidoc viewer extension in your web browser.

Open the local path to the asciidoc file "xxx.asciidoc" in your browser.

Releasing a new version of the data model

First, generate index templates (for Elasticsearch) and index patterns (for Kibana).

$ cd <project_root>
$ make clean
$ make

Create a new release tag in the repo and push it to GitHub.

$ git tag -a 0.0.24 -m "Release 0.0.24"

# We can check that the tag is attached to the latest commit now
$ git log --oneline -n 2
c16dc2c (HEAD -> master, tag: 0.0.24, origin/master, origin/HEAD) Fix index patterns
39d0b71 (tag: 0.0.23) Update model & Bump to 2020.01.23

# Push tag into remote GitHub repo
$ git push origin --tags
Total 0 (delta 0), reused 0 (delta 0)
To github.com:ViaQ/elasticsearch-templates.git
 * [new tag]         0.0.24 -> 0.0.24 

Create a new release on the GitHub project releases page.

  • create a new release draft from the newly created tag
  • provide a meaningful description and manually attach the files belonging to the release
    • Usually the list of files is the same as in the previous release, except when it is not :-) (i.e. when there is a significant change)
  • publish the release

Once a new release is published, you can use the update-viaq-data-model.sh script to pull the released files into AOL and prepare a new PR with the updated data model.


elasticsearch-templates's Issues

Documentation file per supported ES version

I noticed that the *.asciidoc files changed after we upgraded to ES5x, due to new field type names in some cases. See openshift/origin-aggregated-logging#922 for details.

I think the only changes in the documentation we can expect are the field type names (like string -> keyword/text, or string -> ipv6). Such changes should be rare IMO.

Originally we decided that the Common Data Model scripts will NOT generate multiple documentation files (one per ES version); see #71. I am not sure if this is a big issue or not. Shall we be concerned?

//cc @richm

Fix not_analyzed for container_name

All of the index template files found in the 0.0.12 release contain a strange mapping for container_name in the kubernetes namespace.

Here is a link to the commit that added the following YAML definition of container_name to the namespaces/kubernetes.yaml file:

- name: container_name
  type: string
  norms:
    enabled: true
  description: >
    The name of the container in Kubernetes.
  fields:
    - name: raw
      ignore_above: 256
      type: string

This resulted in the following mapping (found in all three index template files in release 0.0.12):

"container_name": {
  "type": "string",
  "index": "not_analyzed",   /*_<--_Here_!!!_*/
  "doc_values": true,
  "norms": {
    "enabled": true
  },
  "fields": {
    "raw": {
      "doc_values": true,
      "ignore_above": 256,
      "index": "not_analyzed",
      "type": "string"
    }
  }
}

The odd thing is that the container_name value is NOT ANALYZED. This is either a bug, or it makes the container_name.raw field redundant (as it is also NOT ANALYZED).

viaq-jenkins/README.md and viaq-template/README.md are duplicates

The viaq-jenkins/README.md and viaq-template/README.md files are mostly duplicates. This easily leads to confusion and they will likely go out of sync soon (they're already drifting apart).

Consider moving their contents to the top README.md, generalizing it accordingly.

include_in_all not supported in ES 6.0.0 and later

We need to remove include_in_all when upgrading the model to ES 6.x and later. If this mapping parameter is not removed, then pushing an index template fails like this:

$ curl -X PUT "localhost:9200/_template/template_2?pretty" -H 'Content-Type: application/json' -d@<template file>
{
  "error" : {
    "root_cause" : [
      {
        "type" : "mapper_parsing_exception",
        "reason" : "[include_in_all] is not allowed for indices created on or after version 6.0.0 as [_all] is deprecated. As a replacement, you can use an [copy_to] on mapping fields to create your own catch all field."
      }
    ],
    "type" : "mapper_parsing_exception",
    "reason" : "Failed to parse mapping [_default_]: [include_in_all] is not allowed for indices created on or after version 6.0.0 as [_all] is deprecated. As a replacement, you can use an [copy_to] on mapping fields to create your own catch all field.",
    "caused_by" : {
      "type" : "mapper_parsing_exception",
      "reason" : "[include_in_all] is not allowed for indices created on or after version 6.0.0 as [_all] is deprecated. As a replacement, you can use an [copy_to] on mapping fields to create your own catch all field."
    }
  },
  "status" : 400
}

For better readability, the reason (including the suggested solution) reads as follows:

[include_in_all] is not allowed for indices created on or after version 6.0.0 as [_all] is deprecated. As a replacement, you can use an [copy_to] on mapping fields to create your own catch all field.
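A minimal sketch of what the suggested copy_to replacement looks like (field names here are hypothetical, not taken from the model):

# Illustrative ES 6.x-style mapping fragment expressed as a Python dict: instead of
# include_in_all, selected fields are copied into a custom catch-all field via copy_to.
mapping_fragment = {
    "properties": {
        "all_fields": {"type": "text"},  # custom catch-all field
        "message": {"type": "text", "copy_to": "all_fields"},
        "hostname": {"type": "keyword", "copy_to": "all_fields"},
    }
}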

Consider removing unneccessary custom date format from index mapping

Currently, we are specifying custom date formats for some date fields in index templates, like:

"format": "yyyy-MM-dd'T'HH:mm:ss.SSSSSSZ||yyyy-MM-dd'T'HH:mm:ssZ||dateOptionalTime",
"type": "date"

The following is a simple script that tests whether Elasticsearch can index values formatted using those patterns out of the box (the script is also available here).

#!/bin/bash

ES=http://localhost:9200

function delete_index() {
  curl -X DELETE "${ES}/test"
}

function refresh() {
  curl -X POST "${ES}/_refresh"
}

function index_document() {
  echo "Testing $1"
  curl -X POST "${ES}/test/1" -d "{
    \"date\": \"$1\"
  }" 
}

function mapping() {
  curl -X GET "${ES}/test/_mapping?pretty"
}

# test formats: "yyyy-MM-dd'T'HH:mm:ss.SSSSSSZ||yyyy-MM-dd'T'HH:mm:ssZ"
for i in 2014-01-17T15:57:22.123456Z  2014-01-17T15:57:22Z 
do
  delete_index
  index_document "$i"
  mapping
  refresh
done

We can see below that the values are correctly indexed. I think it makes sense to add only those custom formats that are not indexed by default (this means we are declaring that we expect those formats). When declaring formats that are indexed out of the box, I wonder whether this can contribute to later confusion ("Did we have issues indexing those values? What kind of issues?").

Results for Elasticsearch:

  • v1.7.2
# Testing 2014-01-17T15:57:22.123456Z
{"_index":"test","_type":"1","_id":"AVdRJMoWGkau0j2E_IYd","_version":1,"created":true}{
  "test" : {
    "mappings" : {
      "1" : {
        "properties" : {
          "date" : {
            "type" : "date",
            "format" : "dateOptionalTime"
}}}}}}

# Testing 2014-01-17T15:57:22Z
{"_index":"test","_type":"1","_id":"AVdRJMtNGkau0j2E_IYe","_version":1,"created":true}{
  "test" : {
    "mappings" : {
      "1" : {
        "properties" : {
          "date" : {
            "type" : "date",
            "format" : "dateOptionalTime"
}}}}}}
  • v2.3.5
# Testing 2014-01-17T15:57:22.123456Z
{"_index":"test","_type":"1","_id":"AVdRJlQxVUFkKwD07My_","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}{
  "test" : {
    "mappings" : {
      "1" : {
        "properties" : {
          "date" : {
            "type" : "date",
            "format" : "strict_date_optional_time||epoch_millis"
}}}}}}

# Testing 2014-01-17T15:57:22Z
{"_index":"test","_type":"1","_id":"AVdRJlVtVUFkKwD07MzA","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}{
  "test" : {
    "mappings" : {
      "1" : {
        "properties" : {
          "date" : {
            "type" : "date",
            "format" : "strict_date_optional_time||epoch_millis"
}}}}}}
  • v2.4.0
# Testing 2014-01-17T15:57:22.123456Z
{"_index":"test","_type":"1","_id":"AVdRJxF4bHsb53sfaNXO","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}{
  "test" : {
    "mappings" : {
      "1" : {
        "properties" : {
          "date" : {
            "type" : "date",
            "format" : "strict_date_optional_time||epoch_millis"
}}}}}}

# Testing 2014-01-17T15:57:22Z
{"_index":"test","_type":"1","_id":"AVdRJxLKbHsb53sfaNXP","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}{
  "test" : {
    "mappings" : {
      "1" : {
        "properties" : {
          "date" : {
            "type" : "date",
            "format" : "strict_date_optional_time||epoch_millis"
}}}}}}
  • v5.0.0-alpha5
# Testing 2014-01-17T15:57:22.123456Z
{"_index":"test","_type":"1","_id":"AVdRKG2bWRbH4ugGaVvq","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}{
  "test" : {
    "mappings" : {
      "1" : {
        "properties" : {
          "date" : {
            "type" : "date"
}}}}}}

# Testing 2014-01-17T15:57:22Z
{"_index":"test","_type":"1","_id":"AVdRKG7tWRbH4ugGaVvr","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}{
  "test" : {
    "mappings" : {
      "1" : {
        "properties" : {
          "date" : {
            "type" : "date"
}}}}}}

Makefile in viaq-template doesn't work

The following output is produced and no files are generated when running make in the viaq-template directory:

python ../scripts/generate_template.py . com.redhat.viaq
Traceback (most recent call last):
  File "../scripts/generate_template.py", line 230, in <module>
	input_template = open(definition_path, 'r')
IOError: [Errno 21] Is a directory: '.'
Makefile:7: recipe for target 'com.redhat.viaq.template.json' failed
make: *** [com.redhat.viaq.template.json] Error 1

Elastic Common Schema (ECS)

Elastic Common Schema (ECS):

https://github.com/elastic/ecs

The Elastic Common Schema (ECS) defines a common set of fields for ingesting data into Elasticsearch. A common schema helps you correlate data from sources like logs and metrics or IT operations analytics and security analytics.

TODO: Investigate if and how we could use it and collaborate.

Upgrade model to ES 6.x

  • Index templates use "index_patterns" instead of "template" ES 6.x (#98)
  • Option include_in_all not supported in ES 6.0.0 and later (#101)
  • _all field not supported starting in ES 6.x (#46)
  • the _default_ mapping is deprecated in ES 6.x and completely removed in ES 7.x (#111)
  • ES index type deprecation in ES 6.x and removal in ES 7.x (#84)
  • add tests for backward compatibility with ES 5.x (implemented in eda925b and b31049a)
  • bump model version (ac07c1d)

Change order of "string" related dynamic templates

It seems that the order of the "string"-related dynamic templates in our skeleton.json is incorrect, namely those related to aushape*.

Templates are processed in order, so I wonder what the idea was behind adding the aushape* templates after the generic "string_fields" template. If I understand it correctly, once ES recognizes any field in aushape.data.*, aushape.data.*.* or aushape.data.*.*.* as a string, all the aushape* dynamic templates will be ignored because the "string_fields" template wins first.

I believe we need to move the generic "string_fields" template to the end (i.e. down), as sketched below.
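A minimal sketch of the intended ordering (the concrete mappings are illustrative, not copied from skeleton.json): the specific aushape* templates first, the generic catch-all last.

# Illustrative dynamic_templates ordering (ES 2.x-style), expressed as a Python list:
# specific path_match templates first, the generic "string_fields" catch-all last;
# otherwise the catch-all matches first and the aushape* templates never apply.
dynamic_templates = [
    {"aushape_data_fields": {
        "path_match": "aushape.data.*",
        "match_mapping_type": "string",
        "mapping": {"type": "string", "index": "not_analyzed"},
    }},
    {"string_fields": {
        "match": "*",
        "match_mapping_type": "string",
        "mapping": {"type": "string", "index": "analyzed"},
    }},
]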

Generate Kibana index-pattern data

For example https://github.com/fabric8io/openshift-elasticsearch-plugin/blob/master/src/main/java/io/fabric8/elasticsearch/plugin/kibana/DocumentBuilder.java#L28
The data looks like this:

[
  {"count":0,
   "name":"pipeline_metadata.collector.ipaddr4.raw",
   "indexed":true,
   "analyzed":false,
   "doc_values":true,
   "type":"string",
   "scripted":false},
   {
...

We should generate JSON or YAML files that can be consumed by this plugin (a rough sketch of the conversion follows). fabric8io/openshift-elasticsearch-plugin#49
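A rough sketch of the kind of conversion the generator could perform (the helper below is hypothetical; the actual logic would live in the generation scripts):

import json

def to_index_pattern_field(name, es_type, analyzed=False, doc_values=True):
    # Build one entry of the Kibana index-pattern "fields" array in the shape
    # consumed by openshift-elasticsearch-plugin (see the example above).
    return {
        "count": 0,
        "name": name,
        "indexed": True,
        "analyzed": analyzed,
        "doc_values": doc_values,
        "type": es_type,
        "scripted": False,
    }

fields = [to_index_pattern_field("pipeline_metadata.collector.ipaddr4.raw", "string")]
print(json.dumps(fields))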

Remove "description" field from index-pattern

The "description" field is not supported for index-pattern definitions in Kibana 5.x.
I am getting the following error when pushing index-pattern into ES 5.x after Kibana 5.x has been started (ie. after it pushed its mappings for Kibana index, see below).

{  
   "error":{  
      "root_cause":[  
         {  
            "type":"strict_dynamic_mapping_exception",
            "reason":"mapping set to strict, dynamic introduction of [description] within [index-pattern] is not allowed"
         }
      ],
      "type":"strict_dynamic_mapping_exception",
      "reason":"mapping set to strict, dynamic introduction of [description] within [index-pattern] is not allowed"
   },
   "status":400
}

The following is the mapping for the (default) .kibana index pushed by Kibana on start. Notice the heavy use of "dynamic": "strict":

{
  ".kibana": {
    "mappings": {
      "url": {
        "dynamic": "strict",
        "properties": {
          "accessCount": {
            "type": "long"
          },
          "accessDate": {
            "type": "date"
          },
          "createDate": {
            "type": "date"
          },
          "url": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 2048
              }
            }
          }
        }
      },
      "search": {
        "dynamic": "strict",
        "properties": {
          "columns": {
            "type": "keyword"
          },
          "description": {
            "type": "text"
          },
          "hits": {
            "type": "integer"
          },
          "kibanaSavedObjectMeta": {
            "properties": {
              "searchSourceJSON": {
                "type": "text"
              }
            }
          },
          "sort": {
            "type": "keyword"
          },
          "title": {
            "type": "text"
          },
          "version": {
            "type": "integer"
          }
        }
      },
      "visualization": {
        "dynamic": "strict",
        "properties": {
          "description": {
            "type": "text"
          },
          "kibanaSavedObjectMeta": {
            "properties": {
              "searchSourceJSON": {
                "type": "text"
              }
            }
          },
          "savedSearchId": {
            "type": "keyword"
          },
          "title": {
            "type": "text"
          },
          "uiStateJSON": {
            "type": "text"
          },
          "version": {
            "type": "integer"
          },
          "visState": {
            "type": "text"
          }
        }
      },
      "timelion-sheet": {
        "dynamic": "strict",
        "properties": {
          "description": {
            "type": "text"
          },
          "hits": {
            "type": "integer"
          },
          "kibanaSavedObjectMeta": {
            "properties": {
              "searchSourceJSON": {
                "type": "text"
              }
            }
          },
          "timelion_chart_height": {
            "type": "integer"
          },
          "timelion_columns": {
            "type": "integer"
          },
          "timelion_interval": {
            "type": "keyword"
          },
          "timelion_other_interval": {
            "type": "keyword"
          },
          "timelion_rows": {
            "type": "integer"
          },
          "timelion_sheet": {
            "type": "text"
          },
          "title": {
            "type": "text"
          },
          "version": {
            "type": "integer"
          }
        }
      },
      "_default_": {
        "dynamic": "strict"
      },
      "server": {
        "dynamic": "strict",
        "properties": {
          "uuid": {
            "type": "keyword"
          }
        }
      },
      "index-pattern": {
        "dynamic": "strict",
        "properties": {
          "fieldFormatMap": {
            "type": "text"
          },
          "fields": {
            "type": "text"
          },
          "intervalName": {
            "type": "keyword"
          },
          "notExpandable": {
            "type": "boolean"
          },
          "sourceFilters": {
            "type": "text"
          },
          "timeFieldName": {
            "type": "keyword"
          },
          "title": {
            "type": "text"
          }
        }
      },
      "config": {
        "dynamic": "true",
        "properties": {
          "buildNum": {
            "type": "keyword"
          }
        }
      },
      "dashboard": {
        "dynamic": "strict",
        "properties": {
          "description": {
            "type": "text"
          },
          "hits": {
            "type": "integer"
          },
          "kibanaSavedObjectMeta": {
            "properties": {
              "searchSourceJSON": {
                "type": "text"
              }
            }
          },
          "optionsJSON": {
            "type": "text"
          },
          "panelsJSON": {
            "type": "text"
          },
          "refreshInterval": {
            "properties": {
              "display": {
                "type": "keyword"
              },
              "pause": {
                "type": "boolean"
              },
              "section": {
                "type": "integer"
              },
              "value": {
                "type": "integer"
              }
            }
          },
          "timeFrom": {
            "type": "keyword"
          },
          "timeRestore": {
            "type": "boolean"
          },
          "timeTo": {
            "type": "keyword"
          },
          "title": {
            "type": "text"
          },
          "uiStateJSON": {
            "type": "text"
          },
          "version": {
            "type": "integer"
          }
        }
      }
    }
  }
}
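A minimal sketch of what the fix amounts to (hypothetical helper, not the repository's actual code): drop the keys that the strict .kibana mapping does not allow for the index-pattern type before pushing the document.

# Hypothetical: strip unsupported keys from a generated index-pattern object
# before it is pushed into the .kibana index of Kibana 5.x.
UNSUPPORTED_INDEX_PATTERN_KEYS = ("description",)

def sanitize_index_pattern(index_pattern):
    return {k: v for k, v in index_pattern.items()
            if k not in UNSUPPORTED_INDEX_PATTERN_KEYS}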

Use new index name template format

According to the proposed Elasticsearch index structure, index names should follow this naming schema:

Tenant’s Versioned Index Names (not exposed):

openshift_<namespace_name>_<namespace_id>-v<version>-YYYY.MM.DD

Tenant’s Index Alias Names (exposed):

openshift_<namespace_name>_<namespace_id>-YYYY.MM.DD

Operational Index Names (not exposed):

openshift_operations-YYYY.MM.DD

This schema is not currently followed in skeletons (for example here: https://github.com/ViaQ/elasticsearch-templates/blob/master/viaq-openshift/skeleton.json#L47)
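For reference, a small sketch of the proposed naming schema (purely illustrative; argument values are hypothetical):

from datetime import date

def versioned_index_name(namespace_name, namespace_id, version, day):
    # Tenant's versioned index name (not exposed):
    #   openshift_<namespace_name>_<namespace_id>-v<version>-YYYY.MM.DD
    return "openshift_{0}_{1}-v{2}-{3:%Y.%m.%d}".format(namespace_name, namespace_id, version, day)

def alias_index_name(namespace_name, namespace_id, day):
    # Tenant's index alias name (exposed):
    #   openshift_<namespace_name>_<namespace_id>-YYYY.MM.DD
    return "openshift_{0}_{1}-{2:%Y.%m.%d}".format(namespace_name, namespace_id, day)

print(versioned_index_name("myproject", "2e79fd1e", "0.0.24", date(2020, 1, 23)))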

Mapping definition for [ipaddr4] has unsupported parameters: [norms : false]

Getting the following error when pushing a template to ES 5.5.2:

$ curl -X PUT localhost:9200/_template/1 -d@<template file>
{  
   "error":{  
      "root_cause":[  
         {  
            "type":"mapper_parsing_exception",
            "reason":"Mapping definition for [ipaddr4] has unsupported parameters:  [norms : false]"
         }
      ],
      "type":"mapper_parsing_exception",
      "reason":"Failed to parse mapping [_default_]: Mapping definition for [ipaddr4] has unsupported parameters:  [norms : false]",
      "caused_by":{  
         "type":"mapper_parsing_exception",
         "reason":"Mapping definition for [ipaddr4] has unsupported parameters:  [norms : false]"
      }
   },
   "status":400
}

Support for multiple ES versions

We want to support multiple versions of target ES (and related Kibana). The idea is that the code will produce distinct index templates, Kibana index patterns and asciidoc documentation for the fields based on the common templates for all supported ES versions.

Related issue: #70

Support Kibana index-pattern fieldFormatMap

With Kibana index-patterns, you can define how fields are displayed. For example, for a numeric byte-count field, you can choose to display the field using the Bytes format, which will render the value as e.g. 100K, 2.4M, 1.34G.

For namespaces, when defining a field, there should be a format key which takes as its value one of the format types supported by Kibana (see the Settings tab). Some formats take additional parameters; for example, the truncated string format takes the length as a parameter. These should be specified as the formatParams value, e.g.:

    - name: vm_disk_write_bytes
      type: long
      description: >
        collectd's disk_write_bytes type of statsd plugin.
      format: bytes

  - name: trimmed
    type: string
    description: >
      An array of JSONPath expressions relative to the event object,
      specifying objects/arrays with (some) contents removed as the result of
      event size limiting. Empty string means event itself. Empty array means
      trimming occurred at unspecified objects/arrays.
    format: truncate
    formatParams:
      fieldLength: 47

Would be converted to fieldFormatMap like this:

"fieldFormatMap": "{\"collectd.statsd.vm_disk_write_bytes\":{\"id\":\"bytes\"}, \"aushape.trimmed\":{\"id\":\"truncate\",\"params\":{\"fieldLength\":\"47\"}}}"

oVirt metrics are empty

I installed ViaQ based on OCP 3.6.
The oVirt metrics and logs are collected, but the metric records contain no collectd fields or oVirt-specific fields.

{
  "_index": "project.ovirt-metrics-engine.2e79fd1e-7b77-11e7-817b-001a4a23128a.2017.08.08",
  "_type": "com.redhat.viaq.common",
  "_id": "AV3AnDc5PyWn7-3LY0IE",
  "_score": null,
  "_source": {
    "kubernetes": {
      "container_name": "mux-mux",
      "namespace_id": "2e79fd1e-7b77-11e7-817b-001a4a23128a",
      "namespace_name": "ovirt-metrics-engine",
      "pod_name": "mux"
    },
    "hostname": "nott16.eng.lab.tlv.redhat.com",
    "@timestamp": "2017-08-08T09:50:37+03:00",
    "ovirt": {
      "entity": "engine"
    },
    "tag": "project.ovirt-metrics-engine",
    "collectd": {},
    "ipaddr4": "10.35.16.209"
  },
  "fields": {
    "@timestamp": [
      1502175037000
    ]
  },
  "sort": [
    1502175037000
  ]
}

Change (shorten) index templates names

For the Common Data Model we use file names as index template IDs inside Elasticsearch. We can make the file names shorter by dropping the com.redhat.viaq- prefix. This would give us:

com.redhat.viaq-openshift-operations.template.json → openshift-operations.template.json
com.redhat.viaq-openshift-project.template.json → openshift-project.template.json

Do not use dashes in script name

The script concat-index-pattern-fields.py cannot be imported (for example into a unit test) due to the use of dashes in its name. E.g. the following does not work:

import concat-index-pattern-fields

I will rename the script to concat_index_pattern_fields.py to make it possible to do:

import concat_index_pattern_fields

  # and later ...
  concat_index_pattern_fields._some_method(args)

norms definition changes

There is a change in the definition structure of the norms mapping parameter between ES 2.x and 5.x.
It goes from "norms": { "enabled": true/false } to "norms": true/false.
For more details, consult the norms 2.4 and norms 5.5 docs.
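For illustration (the surrounding field mapping is hypothetical):

# ES 2.x notation: norms is an object with an "enabled" flag.
norms_es2 = {"type": "string", "index": "not_analyzed", "norms": {"enabled": False}}

# ES 5.x notation: norms is a plain boolean.
norms_es5 = {"type": "keyword", "norms": False}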

Fix ip type for pipeline_metadata namespace

ipaddr4

There is a field called ipaddr4 in the pipeline_metadata namespace. It is set to the string type.

If there are no objections, I am going to change it to the ip data type. At the same time I am going to remove the norms config for it, as it makes no sense for ip type fields.

This change will apply for both ES2x and ES5x versions.

ipaddr6

There is also an ipaddr6 field in the same namespace.

For the ES5x version I am going to change the type of this field to ip as well.

Potential issue on @timestamp formats

According to the template description:
https://github.com/ViaQ/elasticsearch-templates/blob/master/namespaces/_default_.yml#L17
The @timestamp field is parsed from a string with a possible pattern of
"yyyy-MM-dd'T'HH:mm:ss.SSSSSSZ".
While working on the integration plugins for Hawkular I realized there is a potential problem with this.
For example, in:
"2017-04-03T18:05:35.415123+0200"
the fractional part 415123 would be interpreted in Java (with the SSSSSS pattern) as 415 seconds and 123 milliseconds, instead of 415 milliseconds and 123 microseconds as I think it should be parsed.
I am fixing this in the Hawkular Alerting plugins, but I am just leaving a note here: this potential issue happens without hitting any exception (the parsing succeeds but the calculation is wrong), and only once I looked in detail did I realize there were some unexpected drifts in the timestamps.
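A quick back-of-the-envelope check of the drift this produces (plain arithmetic, independent of any particular parser):

# "415123" read as milliseconds-of-second (SimpleDateFormat's SSSSSS semantics)
# versus the intended fractional value of 0.415123 seconds:
misparsed_seconds = 415123 / 1000.0   # 415.123 s
intended_seconds = 0.415123
print(misparsed_seconds - intended_seconds)  # ~414.7 seconds of unexpected drift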

I haven't investigated if Elasticsearch is also affected by this.

Make fails in templates/openshift

Running make in templates/openshift aborts with the following output:

python ../../scripts/generate_template.py template.yml ../../namespaces/
Traceback (most recent call last):
  File "../../scripts/generate_template.py", line 343, in <module>
    object_types_to_template(template_definition, output, output_index_pattern, args.namespaces_dir)
  File "../../scripts/generate_template.py", line 42, in object_types_to_template
    with open(template_definition['skeleton_index_pattern_path'], 'r') as f:
IOError: [Errno 2] No such file or directory: '../skeleton-index-pattern.json'
Makefile:6: recipe for target 'all' failed
make: *** [all] Error 1

Consider removing generated files from the repo

Please consider removing generated files from the Git repository.

Keeping generated files in the Git repository and history creates opportunity for abuse, confusion, and mistakes. It becomes tempting to hand-edit the generated files when the generation doesn't seem to work quite right, neglecting the original issue and leading to an even more broken generation process. An opportunity to test the generation more often is lost: most of the time the generated files are fetched from Git, but when the need to run the generation arises it turns out to be broken. It is general practice not to keep generated files in a VCS, and newcomers can get confused about which data is authoritative (the generation source or the generated product) and why the generation scripts are even needed.

clarify level field possible values

For example, the journald PRIORITY field has these syslog numeric values:

#define LOG_EMERG       0       /* system is unusable */
#define LOG_ALERT       1       /* action must be taken immediately */
#define LOG_CRIT        2       /* critical conditions */
#define LOG_ERR         3       /* error conditions */
#define LOG_WARNING     4       /* warning conditions */
#define LOG_NOTICE      5       /* normal but significant condition */
#define LOG_INFO        6       /* informational */
#define LOG_DEBUG       7       /* debug-level messages */

The level field is defined as a string: https://github.com/ViaQ/elasticsearch-templates/blob/master/namespaces/_default_.yml#L60

Possible values: trace, crit, alert, emerg

How should the syslog numeric level be translated to a string? For example: PRIORITY=0 is level=emerg ?
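One possible translation, offered only as a hedged illustration (the canonical strings should be confirmed against the model's level field definition), follows the standard syslog severity names:

# Hypothetical mapping from journald PRIORITY (syslog severity) to a level string.
SYSLOG_PRIORITY_TO_LEVEL = {
    0: "emerg",
    1: "alert",
    2: "crit",
    3: "err",
    4: "warning",
    5: "notice",
    6: "info",
    7: "debug",
}

def journald_priority_to_level(priority):
    return SYSLOG_PRIORITY_TO_LEVEL.get(int(priority), "unknown")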

Cut new release 0.0.18

This release should be the last one before we start merging PRs related to ES 6.x support.
The main benefit will be that we can significantly review and simplify the existing tests, which are executed against the 0.0.12 release and thus have to work around many modifications that happened later, between 0.0.12 and 0.0.17.

Make scripts executable, remove .py extension

The need to explicitly invoke python to run scripts is cumbersome. The .py extension is not needed in the script name (unless they're to be executed on Windows) and requires extra effort to type and parse when reading.

Consider making scripts executable and removing the .py extension from script names.

Generate templates: can not install yaml module

@t0ffel Anton, I am trying to regenerate templates after updating skeleton.json but I am running into some Python issue:

$ pwd
<elasticsearch-templates>/scripts

$ python --version
Python 2.7.12

$ python generate_template.py viaq-openshift com.redhat.viaq-openshift
Traceback (most recent call last):
  File "generate_template.py", line 14, in <module>
    import yaml
ImportError: No module named yaml

$ python -m pip install yaml
Collecting yaml
  Could not find a version that satisfies the requirement yaml (from versions: )
No matching distribution found for yaml

Do I need a specific Python version or anything like that?
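For reference, the yaml module is distributed on PyPI as PyYAML, so the install command should target that package name (e.g. python -m pip install PyYAML). A quick sanity check once it is installed:

# Assumes PyYAML has been installed (python -m pip install PyYAML);
# the import name stays "yaml" even though the package is called "PyYAML".
import yaml

print(yaml.safe_load("a: 1"))  # {'a': 1}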

Consider using doc_values for tags field

Currently, the _default_ mapping specification disables doc_values for the tags field. I think we should consider enabling it, but it will require a bit of discussion and can introduce a breaking change.

    # existing ES 2.x notation
    - name: tags
      type: string
      doc_values: false
      index: analyzed
      analyzer: whitespace
      description: >
        Optionally provided operator defined list of tags placed on each log
        by the collector or normalizer. The payload can be a string with
        whitespace-delimited string tokens, or a JSON list of string tokens.

Relevant read: https://www.elastic.co/guide/en/elasticsearch/reference/5.5/doc-values.html

Pros:

  • more efficient aggregations and sorting (I guess tags field is a good candidate to aggregate on)

Cons:

  • not possible on an analyzed string, which means users will be able to provide multiple tags only via an array, like "tags": ["foo", "bar"], and no whitespace analyzer will be run on the content of the individual values.

ES5/Kibana5 - error - Unknown field type "keyword" for field "@timestamp.raw" in indexPattern ".all"

Unknown field type "keyword" for field "@timestamp.raw" in indexPattern ".all"
I think the problem is that the index pattern should use the Kibana type for the field, not the ES type: https://github.com/elastic/kibana/blob/5.5/src/utils/kbn_field_types.js

Instead of

\"@timestamp.raw\", \"searchable\": true, \"aggregatable\": true, \"readFromDocValues\": true, \"type\": \"keyword\", 

The index pattern file should use type string:

\"@timestamp.raw\", \"searchable\": true, \"aggregatable\": true, \"readFromDocValues\": true, \"type\": \"string\", 

Rebuilding instructions in viaq-template/README.md don't work

While trying to use the template-rebuilding command in viaq-template/README.md:

python ../scripts/generate_template.py . com.redhat.viaq-template

The following error is produced:

Traceback (most recent call last):
  File "../scripts/generate_template.py", line 230, in <module>
    input_template = open(definition_path, 'r')
IOError: [Errno 21] Is a directory: '.'

README.md needs to be fixed to mention the correct command.

Remove include_in_all from _all disabled types

After some investigation I think we can remove use of include_in_all from all fields under types where we explicitly specify mapping "_all": { "enabled": false }.

First, any use of include_in_all (either false or true) does not make any sense when the _all field is disabled. Second, chances are this will completely remove the use of this setting from our model (it is used only for collectd_metrics, where the _all field is disabled).

I have been doing tests for ES 5.6.x, which has an interesting feature where the query string query automatically uses the all_fields option, making Kibana and other query string queries query all queryable fields automatically:

Perform the query on all fields detected in the mapping that can be queried. Will be used by default when the _all field is disabled and no default_field is specified (either in the index settings or in the request body) and no fields are specified.

Tests for ES 2.4.x are WIP.

Making this change will have two implications:

  1. we will need to release new model versions for ES 2.x and 5.x
  2. we can close #101 as obsolete (making upgrade to ES 6.x easier)
