(DEPRECATED) Test reporting from Jenkins using ElasticSearch and Kibana

Note: This document is deprecated. Documentation for the new test results archiving system can be found here.


Abstract and motivation

This document describes how saving of historical results of performance tests of the OpenAM, OpenIDM, OpenDJ, OpenIG and Identity Message Broker products was realized and in what state it currently is. Performance tests run as a collection of Jenkins jobs, and one record per job run (build) is created in the database. The document also gives a very brief description of Kibana, the data analysis tool used with this database; the Kibana environment contains several data visualizations, and their purpose and usage are described here. At the end of the document there is a list of planned improvements and known caveats.

There are two main reasons for this project. The first is to collect historical data from performance measurements (performance test job runs) and provide tools for analyzing various metrics over the long term. The second is to help the performance QA team maintain the jobs and surface problems in them, which has become more time consuming as the number of test jobs grows.

Architecture

solution architecture

ElasticSearch and Kibana servers are currently both installed on boursin.internal.forgerock.com (earlier experimental installations lived on tom-am5.internal.forgerock.com, IP 10.1.3.55, a test machine, and elasticsearch.internal.forgerock.com, IP 172.16.204.49), but it is not necessary to have them installed on a single machine:

Jenkins

The Jenkins REST API offers data about a job and its several most recent builds. It is accessible by appending api/json?pretty=true to the end of a job or build URL.
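For illustration, reading that JSON from Python could look roughly like this (a minimal sketch; the job path is just an example and the requests library is assumed to be available):

# Minimal sketch: fetch build metadata from the Jenkins JSON API.
import requests

JENKINS_URL = "http://jenkins-fr.internal.forgerock.com:8080"
# Example job path; "lastBuild" can be replaced with a concrete build number.
JOB_PATH = "job/IDM-5.0.0/job/Recon-Tests/job/IDM-5.0.0-Stress-Tests-Recon-System-Ldap"

url = "{}/{}/lastBuild/api/json?pretty=true".format(JENKINS_URL, JOB_PATH)
build = requests.get(url).json()

print(build["number"], build["timestamp"], build["duration"])
print([a["relativePath"] for a in build.get("artifacts", [])])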

The test instance of Jenkins contains 3 IDM-5.0.0 jobs (Managed Users, Sync, Recon) and 2 AM-14.0.0 jobs (AuthN and AuthZ). All jobs were copied from the GNB Jenkins with a lowered number of users and shorter duration and a modified scheduler (jobs run approximately every 30 minutes), and they run on Tom-CentOS7, which is a singleton Jenkins slave.

Job setup for saving results

To understand how to set up saving of results, it is useful to know where the data is gathered from.

Most of the important information is gathered from the Jenkins job build REST API, so to be collected it has to be present there at the point when the post-build task is executed. The information is:

  • Job build timestamp and number, executor VM, parameters, duration (no need to set this up - it is included automatically)
  • Artifacts: the Python script currently collects only the “global_stats.json” files (collecting other files would have to be implemented!) that appear in the “artifacts” array - if some files that should be there do not appear, they have to be specified in Post Build Actions > Archive the artifacts, e.g. “results/latest/*/graph/**/global_stats.json”. Example: post build task
  • Overall results - Robot results (section _class: hudson.plugins.robot.RobotBuildAction) - if this class is not present in the build JSON, Publishing Robot Framework results has to be configured in Post Build Actions. Example: post build task
  • How the Python script for saving results is run is configured in the Post Build Actions > Post build task section. For the full list of available script parameters, see the Python script code section of this document. Example: post build task

$ELASTICSEARCH_URL is a Jenkins environment variable. It can be defined in Jenkins > Manage Jenkins > Configure System > Environment variables.

Python script code

The source code is available in the Stash PyForge repository: https://stash.forgerock.org/projects/QA/repos/pyforge/browse/scripts/send_results_elasticsearch. The entry point is at the bottom (last line) of the script, which runs the process_results(..) method. The script has to be run from the PyForge root. The arguments that can be passed to the script are as follows (they can also be shown by passing “-h” to the script):

  • -w WORKSPACE_PATH - path to the workspace on the Jenkins slave that executed the test, default = the current PyForge root directory path (or more precisely, the path from which the script is executed)
  • -j JENKINS_URL - URL of the Jenkins instance from which results should be saved, default = http://jenkins-fr.internal.forgerock.com:8080
  • -e ELASTICSEARCH_URL - URL of the ES instance for storing results, default = http://boursin.internal.forgerock.com:9200
  • -c CATEGORY - meaning and possible values are the same as for the equally named parameter of run-pybot.py - it is the test category and possible values are: functional, stress, remote_stress, system, perf, security, ui, manual, misc, cloud; note that not all of these values were tested and some of them would need an individual approach (better to test it or contact Tomáš Hejret first!), default = perf
  • -f FLAG - used as a tag (or “flag”) for test data and can also be used to mark other groups of data in the DB (see Data structure - data_token_flag), default = “non_jenkins”
  • -u UNSUCCESSFUL_DURATION_MULTIPLIER - a build is marked as unsuccessful when its duration is at least this many times shorter than the estimate, default = 4
  • -m - mute (do not send) HipChat notifications when this parameter is present
  • -D - enable debugging behaviour by transforming a local simulated workspace path (a soft link to the PyForge sources in a path similar to a real Jenkins workspace path) to an absolute path, when this parameter is present
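As an illustration of how these parameters could be declared, here is a sketch using argparse that mirrors the list above (it is not the actual PyForge source):

# Illustration only: the documented parameters expressed with argparse.
import argparse
import os

parser = argparse.ArgumentParser(
    description="Send Jenkins job build results to ElasticSearch.")
parser.add_argument("-w", dest="workspace_path", default=os.getcwd(),
                    help="workspace path on the Jenkins slave that executed the test")
parser.add_argument("-j", dest="jenkins_url",
                    default="http://jenkins-fr.internal.forgerock.com:8080",
                    help="Jenkins instance to read results from")
parser.add_argument("-e", dest="elasticsearch_url",
                    default="http://boursin.internal.forgerock.com:9200",
                    help="ElasticSearch instance to store results in")
parser.add_argument("-c", dest="category", default="perf",
                    help="test category, same meaning as in run-pybot.py")
parser.add_argument("-f", dest="flag", default="non_jenkins",
                    help="data token flag used to tag the record in the DB")
parser.add_argument("-u", dest="unsuccessful_duration_multiplier", type=int, default=4,
                    help="mark a build unsuccessful if its duration is this many "
                         "times shorter than the estimate")
parser.add_argument("-m", dest="mute_hipchat", action="store_true",
                    help="do not send HipChat notifications")
parser.add_argument("-D", dest="debug", action="store_true",
                    help="debugging behaviour for a simulated local workspace")
args = parser.parse_args()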

APIs and HTTP sources are accessed with a maximum of 3 attempts. When data has been collected but sending it to ES fails, it is saved into a file containing a complete curl command ready to be resent (this file is generated even when sending to ES succeeds). In addition to saving the curl command, in case of a severe error that prevented saving results to ES, or of warnings that should be investigated, a notification about the event is sent to the HipChat rooms “Saving Perf Results to DB” and “Performance Engineering” (warnings and errors only).
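A rough sketch of the retry-and-fallback behaviour described above (function names and the output file name are illustrative, not the actual script code):

# Illustrative sketch of "3 attempts, then save a curl command for resending".
import json
import time
import requests

def post_with_retries(url, payload, attempts=3, wait=5):
    """Try to send the payload to ES a few times; return True on success."""
    for _ in range(attempts):
        try:
            response = requests.post(url, json=payload, timeout=30)
            if response.status_code in (200, 201):
                return True
        except requests.RequestException:
            pass
        time.sleep(wait)
    return False

def save_curl_command(url, payload, filename="resend_to_es.sh"):
    """Write a complete curl command so the data can be resent manually."""
    with open(filename, "w") as f:
        f.write("curl -X POST '{}' -H 'Content-Type: application/json' -d '{}'\n"
                .format(url, json.dumps(payload)))

# Usage sketch:
# if not post_with_retries(es_url, payload):
#     save_curl_command(es_url, payload)
#     # ...and send a HipChat notification about the failure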

Code workflow

code workflow

Data structure

As ElasticSearch is based on JSON (both requests and responses), here is the structure of the data payload sent to ES using a POST request to the REST API (a code representation of the data structure):

payload = {
   # flag to easily find testing records to delete them
   "data_token_flag": flag,
   "product": {  # tested product (for example OpenAM)
      "name": <product-name>,  # for example “OpenAM”
      "version": <product-version>,  # version without flag (e.g. “14.0.0”)
      "flag": <product-name>,  # for example “RC-2”
      "build-hash": <revision-hash>
   },
    # set of other products, which supports the main product
   "other_products": {
     "<product1-name>": {	# for example OpenDJ
        <same_structure_as_in_product_section>
     },
     "<product2-name>": {	# for example CTS
        <same_structure_as_in_product_section>
     }
     
   },
   "job": {
       "name": <job-name>,
       "id": <job-id>,   # part of job-name without product and version (for better filtering)
       "url": <job-url>,
       "build": {
          "number": <last-build-number>,
          "timestamp": <last-build-timestamp>,
          # computed value from timestamp and current time
          "duration": <last-build-duration>,
          # based on duration length in comparison with duration estimate,
          # provided by Jenkins
          "successful": true,	
          "executor": <executor-machine-name>,
          "console_text": <console-text>,
          "results": {
            "robot": {
              "failCount" : 5,
              "totalCount" : 10,
              "skipCount" : 0,
              "criticalTotal" : 10,
              "criticalPassed" : 5,
              "criticalFailed" : 5,
              "passPercentage" : 50.0
            },
            # durations and suite names are gathered from report.html file
            "suites_duration": {
               "total": 150000, # sum of duration of all suites
               "suite_X": {  # example
                 "name": "ReconLDAPToManUser",
                 "duration": 150000
               }
            },
            # sync and recon results are just for OpenIDM (example) - name
            # of the section is based on filename with data about
            # recon/sync process
            "idm_specific_X": {  # example
                # name based on the file with progress of the process
                "name": "recon_ldap2mu_user_update",
                "stats": {
                   "entries_reconed": 3000,
                   "throughput": 70.09345794392524,
                   "time_passed": 42.8
                }
             },
            # complete content of Gatling global_stats.json file
            # per simulation, section name is gatling_<order-number>
            "gatling_X": {
                # based on simulation folder name (without timestamp)
                "simulation": "updateput",
                "stats": {
                   "name": "Global Information",
                   "maxResponseTime": {
                     "ko": 0,
                     "total": 1005,
                     "ok": 1005
                   },
                   "meanNumberOfRequestsPerSecond": {
                     "ko": 0,
                     "total": 49.18032786885246,
                     "ok": 49.18032786885246
                   }
                   
                 }
             }
           }
        }
   }
}

IDs for data records are randomly generated (although it is possible to assign a specific ID), and a record can be obtained via the REST API by sending a GET request to the ES URI /<product-name>-<product.version.major>-<product.version.minor>-<product.version.maintenance>/perf/<record-id> (everything except the record ID is lowercase). For example: /openam-14-0-0/perf/AVpEWi546qtTlluLIOg0

ES groups records of the same structure into indexes. Index names starting with “open” are used exclusively for data from the automated collecting of Jenkins job results!

ES also has a search interface that supports wildcards; for example, /openam-14-*/perf/_search finds all result records for OpenAM 14 and its subversions and returns (by default) 10 of them. For further information about how to query data in ES, follow the ES documentation.
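For example, fetching a single record by ID and running a wildcard search from Python could look like this (a sketch; the record ID and the query field are taken from the examples in this document):

# Sketch: read records back from ElasticSearch over its REST API.
import requests

ES_URL = "http://boursin.internal.forgerock.com:9200"

# Single record by ID (index names are lowercase).
record = requests.get(ES_URL + "/openam-14-0-0/perf/AVpEWi546qtTlluLIOg0").json()
print(record["_source"]["job"]["name"])

# Wildcard search across OpenAM 14.x indexes; returns 10 hits by default.
hits = requests.get(ES_URL + "/openam-14-*/perf/_search",
                    params={"q": "job.build.successful:true"}).json()
for hit in hits["hits"]["hits"]:
    print(hit["_id"], hit["_source"]["job"]["build"]["number"])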

Data record example

POST /openidm-5-0-0/perf (request to save the data below with a randomly generated ID)

{
  "data_token_flag": "test_jenkins",
  "product": {
    "build_hash": "b18f03c",
    "flag": "RC2",
    "version": "5.0.0",
    "name": "OpenIDM"
  },
  "other_products": {
    "OpenDJ": {
      "build_hash": "a83b9da2323",
      "flag": "",
      "version": "4.0.0",
      "name": "OpenDJ"
    }
  },
  "job": {
    "name": "IDM-5.0.0-Stress-Tests-Recon-System-Ldap",
    "id": "Stress-Tests-Recon-System-Ldap",
    "url": "http://jenkins-fr.internal.forgerock.com:8080/job/IDM-5.0.0/job/Recon-Tests/job/IDM-5.0.0-Stress-Tests-Recon-System-Ldap/",
    "build": {
      "number": 25,
      "timestamp": "2017-02-15T21:56:43",
      "duration": 604135,
      "successful": true,
      "executor": "Tom-CentOS7",
      "console_text": "--- complete console log ---",
      "results": {
        "robot": {
          "failCount" : 5,
          "totalCount" : 10,
          "skipCount" : 0,
          "criticalTotal" : 10,
          "criticalPassed" : 5,
          "criticalFailed" : 5,
          "passPercentage" : 50.0
        },
        "suites_duration" : {
          "total": 123456,
          "suite_0" : {
            "name": "ReconLDAPToManUser",
            "Duration": 123456
          }
        },
        "idm_specific_0": {
          "name": "recon_ldap2mu_user_creation",
          "stats": {
            "entries_reconed": 3000,
            "throughput": 86.0091743119266,
            "time_passed": 34.88
          }
        },
        "idm_specific_1": {
          "name": "recon_ldap2mu_user_update",
          "stats": {
            "entries_reconed": 3000,
            "throughput": 70.09345794392524,
            "time_passed": 42.8
          }
        },
        "gatling_0": {
          "simulation": "updateput",
          "stats": {
            "name": "Global Information",
            "maxResponseTime": {
              "ko": 0,
              "total": 1005,
              "ok": 1005
            },
            "standardDeviation": {
              "ko": 0,
              "total": 91,
              "ok": 91
            },
            "group1": {
                "count": 2991,
                "percentage": 100,
                "name": "t < 800 ms"
            },
            "group2": {
              "count": 9,
              "percentage": 0,
              "name": "800 ms < t < 1200 ms"
            },
            "group3": {
              "count": 0,
              "percentage": 0,
              "name": "t > 1200 ms"
            },
            "group4": {
              "count": 0,
              "percentage": 0,
              "name": "failed"
            },
            "percentiles1": {
              "ko": 0,
              "total": 80,
              "ok": 80
            },
            "percentiles2": {
              "ko": 0,
              "total": 159,
              "ok": 159
            },
            "percentiles3": {
              "ko": 0,
              "total": 642,
              "ok": 642
            },
            "percentiles4": {
              "ko": 0,
              "total": 864,
              "ok": 864
            },
            "minResponseTime": {
              "ko": 0,
              "total": 31,
              "ok": 31
            },
            "meanResponseTime": {
              "ko": 0,
              "total": 98,
              "ok": 98
            },
            "numberOfRequests": {
              "ko": 0,
              "total": 3000,
              "ok": 3000
            },
            "meanNumberOfRequestsPerSecond": {
              "ko": 0,
              "total": 49.18032786885246,
              "ok": 49.18032786885246
            }
          }
        }
      }
    }
  }
}

Data storage requirements

3000 records (with 1 record per job build) take approximately 50 MB of disk space, which means about 500 MB per year at the current rate of job builds (approximately 600 per week) and the current average amount of data per result record.
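As a rough check: 50 MB / 3000 records is about 17 kB per record, and 600 builds per week × 52 weeks ≈ 31,200 records per year × 17 kB ≈ 530 MB, which matches the estimate above.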

Kibana GUI

There is a nice getting started video on the Elastic site (available after submitting an email address - expect a small amount of newsletters afterwards), a shorter introduction on YouTube, and a whole bunch of tutorials from Elastic on YouTube.

Kibana (at the currently installed version) consists of six sections. A short description of each section is given below, ordered by the expected intensity of its usage (which differs from the order in the actual Kibana GUI).

Note: some pictures may come from Kibana version 5.2.0 and some from version 5.4.0 (the currently installed version). They should differ only slightly. In case something looks entirely different and its function or use is not obvious, please contact the author of this document.

Overview

Kibana overview

The time period selector offers various ways to pick a time frame:

  • Quick

quick time picker

  • Relative

relative time picker

  • Absolute

absolute time picker

Quick Tips for using Kibana

  • You can share or bookmark your current view in Kibana (Discover, Visualization or Dashboard): set everything up (including filters and time range - either absolute or relative), click “Share” on the top panel and you will see a link to share or bookmark your current view!

  • In the Discover and Visualize sections you can see details about the rendered data by clicking one of the circled arrow-up icons (usually in the bottom left corner of the visualization). You can see the data shown as a table, the raw request to ES, the raw response from ES and request statistics.

  • In case of trouble, see the section Common problems and their solutions.

Visualize section

Visualize section

Note: don't forget to select a time frame for the visualization, otherwise you could see “No results found”, as the default time frame is 15 minutes.

Here a setup of the different analysis tools with different data sets (indexes + filters) can be created and saved, or one of the saved ones can be opened. There are already many saved visualizations to choose from.

There are several tools to visualize data, including a few types of graphs, a heatmap, a data table and a tag cloud; a few others can be added as plugins.

When creating a new visualization, after selecting an index of data for it, the main panel shows the actual graph/table/heatmap etc. and the left panel is where we specify and aggregate the data that will be visualized on the main panel. Then just click the “Play” button and the selected data will be visualized in the tool. The “Play” button has to be used every time the visualization setup is changed, to refresh the tool. Data can also be filtered by typing a query into the search bar above the configuration panel and the actual visualization.

Dashboard section

Dashboard section

A dashboard in Kibana is basically just a collection of widgets made from saved visualizations. Here we can create a new one or use an already existing one.

Note: don't forget to select a time frame for dashboards, otherwise you could see “No results found”, as the default time frame is 15 minutes.

Dashboard example

The image above shows the saved dashboard “AM / IDM dashboard” (an example with just dummy data).

Discover section

Discover section

Note: don't forget to select a time frame for discovering data, otherwise you could see “No results found”, as the default time frame is 15 minutes.

This section offers an easier way to browse the stored (in ES terminology “indexed”) data, filter it, etc., than using direct requests from the console (i.e. the Dev Tools section). To the left of the main container there is a list of created indexes and the properties in them. Filters can be easily created there.

Saved searches can be created in this section and visualizations can be created from them afterwards.

Dev Tools section

Dev Tools section

This section is basically a console for sending requests to the connected ElasticSearch instance. It is capable of completing keywords and URI paths (like API names).

There can be a list of requests. A single request can be executed by pressing the green “play” icon on the right side of the first line belonging to the selected request. Requests on the list are kept until they are manually deleted (so you can browse other Kibana sections or close the browser, come back and your requests will still be there). Requests are stored per user/session, so deleting cookies means losing the request list.

Documentation for query syntax (for latest ES version) can be found on Elastic site.

Management section

Management section

Except in Dev Tools (the console) and Timelion, you always have to specify/select an index which will be used for analysis or browsing data. An index represents the root of the hierarchy of records; by using a wildcard it can also cover several ES indexes. Several index patterns can be specified and easily switched between, and one of them has to be the primary one (marked with a star - circled on the image above), which gets selected by default.

Indexes can be managed in the Management section > tab “Index Patterns”. Currently the following index patterns are defined (the asterisk represents a wildcard):

  • openidm-* - this contains data of all versions of OpenIDM
  • openam-* - this contains data of all versions of OpenAM
  • openam-14-0-0-* - this contains data of all builds in version 14-0-0 of OpenAM
  • openidm-5-* - this contains data of all subversions in version 5 of OpenIDM
  • open* - contains data of all FR products and versions

Here we can also change the type and format of each property - for example a duration represented in milliseconds can easily be transformed into a human-readable format like “3 hours and 12 minutes” throughout Kibana.

We can also specify scripted fields, which are new fields (properties) whose values are computed on the fly from existing fields.

The tab “Saved Objects” contains saved Dashboards, Visualizations (both described later) and Searches. These objects can also be imported and exported here.

Timelion section

A use case for our kind of data has not been found yet.

Saved Kibana objects

Saved Kibana objects

The saved objects in Kibana (you can see an overview of all of these, and export them, in Management > Saved Objects) are:

  • Visualizations
  • Dashboards
  • Searches

Other existing instances of saved objects not described here are just for experimental purposes.

HELP

This is the first visualization of the text type; it contains just text with the Kibana Quick Tips and a link to this document.

failed jobs overview (data table)

This visualization is most commonly used at standups - we can easily see which jobs had failed or skipped tests (based on the Robot overall results). It shows a clickable URL to the job, the timestamp, number and duration of the failed build, and the numbers of failed (critical test failures are shown separately) and skipped tests.

The table is divided into sub-tables by product and by product versions.

If a job is not listed here, it means that the job did not fail in the selected period of time.

The twin of this data table visualization is a saved search (Discover > click Open > failed jobs overview). It shows exactly the same data as the visualization, but it cannot be grouped by product and version. Its advantage is much easier access to the complete record in the ES database, which is helpful for debugging issues (for example, the console log is there) at times when Jenkins is down.

Note: don't forget to select a time frame for visualizations, otherwise you could see “No results found”, as the default time frame is 15 minutes. The usual value is 1 day, or (after a weekend) 3 days.

last job run (data table)

Last job run data table

This visualization was created to quickly see which jobs have not run for a long time. It shows the URLs of all jobs (which have any records in ES) and their last 2 build timestamps (to quickly see the job run period), ordered ascending by the first timestamp.

The table is divided into sub-tables by product and by product versions. The number of jobs per sub-table can be paged by setting how many rows a sub-table should have. To do so, show (if hidden) the left panel with visualization parameters by clicking the circled arrow (as shown in the picture below) > Options > “Per page” parameter.

If a job is not listed here, it means that the job last ran before 2017-03-24.

Note: don't forget to select a time frame for visualizations, otherwise you could see “No results found”, as the default time frame is 15 minutes. The recommended value is 1 to 6 months.

perf numbers (multiple graphs)

A collection of visualizations for performance degradation analysis. By selecting one of the visualizations whose name starts with “perf numbers”, you will see 1 or more trends for the current time range. The trends show mean throughput and response times (each has a different range of values and therefore its own Y-axis [in progress]) from the global_stats.json of Gatling simulations, or recon / Live Sync throughput (in the case of some OpenIDM tests). If none of those is available, suite durations are visualized.

Trends are labeled with the job names without product and product version (the visualizations are per product and per product version anyway).

Trend lines are labeled G0-Gn (data from Gatling simulations) or S0-Sn (when suite durations are visualized). Each number should belong to the same simulation/suite per job; see Concern #15.

Note: don't forget to select a time frame for visualizations, otherwise you could see “No results found”, as the default time frame is 15 minutes. The recommended value is 3 days to 4 weeks.

versions comparison (multiple tables)

This collection of data tables and graphs [in progress] shows the last performance numbers for all tested versions, aggregated by job, to easily compare performance between product versions.

There is a filter prepared for additional filtering of jobs, so you can type in, for example, “Agent” to show only jobs with “Agent” in their name.

Note: don't forget to select a time frame for visualizations, otherwise you could see “No results found”, as the default time frame is 15 minutes. The recommended value is 3 to 12 months.

executor machine / product overall condition (multiple graphs)

A collection of visualizations for analyzing whether the overall performance of any product is degraded, or whether any executor machine could be the source of a high product error rate. The visualization contains graphs of aggregated sums of failed tests per executor per product over time. If there is a peak on just one executor across multiple products, it points to the executor's state. On the other hand, when there are peaks on all executors for just one product, it points to the product.

Note: don't forget to select a time frame for visualizations, otherwise you could see “No results found”, as the default time frame is 15 minutes. The recommended value is 1 to N weeks.

executor machines utilization (graph and two tables)

These graph and table views make it easy to see which machine is the most extensively used, how much computing time is available, or which machine is/was disabled.

Note: don't forget to select a time frame for visualizations, otherwise you could see “No results found”, as the default time frame is 15 minutes. The recommended value is 1 to N weeks.

Common problems and their solutions

This container is too small to render the visualization

If you see this message in a visualization or dashboard window, it indicates that there are too many data points to fit into the current size of the browser window.

Solution: try to either enlarge the browser window, pick a shorter time range, or modify the filter query to narrow the number of visualized data points.

Visualizations broken

On loading visualizations (particularly those with throughputs and/or response times) Kibana shows these messages: Saved “field” parameter is now invalid. Please select a new field. and Visualize: “field” is a required parameter. This can happen when a new index (for a new product version) is created by saving the first record in it. Data types are put into the index mapping based on the data types of the values in that first record. The record can have the same fields as other indexes but with different data types, and that is what causes the error.

Solution: modify the script (PyForge > scripts > change_mapping_elasticsearch.py) according to what needs to be changed, and read the WARNING in it before running it. Then go to Kibana > Management > Index Patterns > select the “open*” index > click Refresh field list. Now find the fields which were causing trouble in the field list - they have to be aggregatable!
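To check which data type a field was mapped to (and compare it between indexes), the standard ES mapping API can be queried, for example from Python (a sketch; the index name is just an example):

# Sketch: inspect the field mappings of an index to spot conflicting data types.
import requests

ES_URL = "http://boursin.internal.forgerock.com:9200"
INDEX = "openidm-5-0-0"  # example index

mapping = requests.get("{}/{}/_mapping".format(ES_URL, INDEX)).json()
properties = mapping[INDEX]["mappings"]["perf"]["properties"]
for field, definition in sorted(properties.items()):
    # Object fields have nested "properties" instead of a "type".
    print(field, definition.get("type", "object"))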

Limit of total fields in index has been exceeded

The message limit of total fields [1000] in index has been exceeded, returned by ES or printed by the saving script, means that the number of fields (in other words “properties”) in the ElasticSearch index has reached its maximum and the limit needs to be increased, because the new record contains fields which are not present in the index yet.

Documentation for this behavior of ElasticSearch

Solution (see https://discuss.elastic.co/t/total-fields-limit-setting/53004): send a request to ES to modify the setting (per index) like this:

PUT <ES_data_index_name>/_settings

{
  "index.mapping.total_fields.limit": <new_limit>
}

A suitable new value is, for example, 2000: it provides enough “space” for new properties, but it is not so high that it makes the index “explode” and slows ES down.
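The same setting change can also be sent from Python, for example (a sketch; the index name is just an example):

# Sketch: raise the total fields limit for one index; keep the value as low as possible.
import requests

ES_URL = "http://boursin.internal.forgerock.com:9200"

response = requests.put(ES_URL + "/openidm-5-0-0/_settings",
                        json={"index.mapping.total_fields.limit": 2000})
print(response.status_code, response.json())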

Note: Please keep this limit as low as possible to ensure good ES performance! In case a limit higher than 2000 seems to be “needed”, the data structure should be optimized instead.

Concerns and plans

  1. I have a concern about how to filter data (jobs in this case) for the analysis tools when the same Gatling simulations were used but the test scenario differs (for example CreatePut on managed users vs. CreatePut with sync). Both of their Gatling global results will be marked with “createput”, but they are not the same. Needs to be investigated. Resolution: data can be filtered easily - the results of each Gatling simulation are wrapped in “gatling_X”.

  2. How to easily re-use visualizations and dashboards for new versions of a product (a very similar filtering question to the previous point)? Data can be filtered easily; it is just necessary to keep indexes at the product level (do not specify version or subversion) - e.g. openidm-*, not openidm-5-*. It is also possible to change the index of an existing visualisation, but it has to be done “manually” - there is no tool or feature for it in the Kibana UI (see issue#3668). Currently all visualizations use the most general index, open*, and selecting products and versions is done by filters (see Concern #17 of this section).

  3. Will also add job parameters and their values to the payload. (Done)

  4. Possibly we could add some “diagnostics” of failed jobs for some common reasons (for example a GIT failure - not possible, because in that case even the script is not downloaded; disk full; test suite / test case not found, ...).

  5. Would it be better to run the Python script as a post-build task, or to make it part of the PyForge workflow? As a PyForge teardown step it would be possible to also save config files, which could be useful for debugging and which are deleted at the end of a PyForge run (i.e. a run without option “-n” - which for jobs means always). Where would be the proper place to run it (perhaps only when some parameter is added to the run-pybot command)? On the other hand, Robot results are published after the job run as a post-build step.

  6. Since we removed the “latest” link in the results, some code modifications would be needed. Are we going to put it back? (Done)

  7. Add the product build date to the payload? It would be useful for performance comparison of product builds. Where to get the build date (could it be derived from the build hash, and how)?

  8. Store the Jenkins job info separately from the results and just create a relationship between a result and a job (as a Jenkins project)? And possibly do the same for products - store just the build hash with the result? Currently I cannot see any advantage except saving a little bit of disk space (and, in the case of products, saving some effort in finding the build date by build hash). Not needed now - we will see how the amount of data grows or whether a functional requirement appears.

  9. There is also the possibility to create “scripted fields”, whose values are calculated when needed. Possible example: the data contains Robot results, but only the “failed”, “skipped” and “total” counts. It should be possible to add a scripted field for the “success” count.

  10. A post-build task failure should make the build fail (so that it is noticed that results were not sent to ES) - the option “Escalate script execution status to job status” should be enabled! Notifications are now sent to the HipChat rooms Saving perf results to DB and Performance Engineering.

  11. Add some parameter to easily change the data token flag (for example to use it for report jobs), as the command to run the script is in the Template project jobs. No need now, maybe in the future.

  12. When a GIT problem occurs, we will not get notified about it, because the workspace is wiped out and the code is not cloned and thus not run!

  13. Should known issues related to the test cases that were run be collected, and if so, how?

  14. In the future, when there will be several result records for all jobs, compare the currently collected data for missing parts (like Gatling results). If a type of result which is present in previous records is missing now, notify (in the HipChat room) with a warning.

  15. Could it happen that in a suite with multiple simulations one of them does not run (or fails to create the folder with reports)? Or that a new simulation is added to the suite? That would assign shifted numbers to the “gatling_X” names of the Gatling result sets, and the visualization could show peaks (example here - test builds after June 11 were run with one additional test case in the suite). The same issue applies to suite durations.
    • The case of missing values from one test/simulation should be clearly visible in the graphs - a missing point in one line and a peak/abyss in the other lines
    • But when a new test case is added, the data lines could get misconnected, as the index X in “gatling_X” increases with each simulation name and the names are always sorted alphabetically (not by the date when a test case was added). There are 2 possible approaches. The first is to add to every test case and simulation scenario name a prefix with the index under which the current data are saved (which means a lot of work). The second idea is to store the data in arrays - this is possible, but only with explicit mapping (explained in ES nested objects), which could be a complication if it cannot be changed after index creation, and Kibana currently has a limitation for such nested objects: issue#1084
  16. When there is a known issue in a test, the test is marked as non-critical and thus it does not influence the overall build status in Jenkins. Should it be shown in the Kibana overview anyway? I assume yes, but marked somehow. SOLVED by saving the numbers of critical test failures and passes. Critical failures are also shown in the failed jobs overview visualization and saved search.

  17. The performance of a visualization which filters for a product can be improved by changing the overall index pattern (“open*”) to a product-specific one (for instance “openidm-*”) (DONE). In some cases a version-specific index can also be created and used (TODO when needed in the future). This can be achieved in the Management section of visualizations, but it is very risky and could be time consuming. It is better to export such visualizations to a JSON file, edit them there and import them back. Removing non-existing fields is needed before the import.

  18. Add parsing and saving system statistics (results/latest/*/graph/System/dstat.csv) in a Kibana-analyzable format

  19. Add an alias “CTS” for OpenDJ (without it, it causes some trouble with reading versions and could potentially cause problems in analyses in Kibana - example job)

  20. Once the products_info.json file is fully working, rework how and what is saved to ES (including the data format a little bit). See the related JIRA.

  21. Parse and save the complete PyForge config instead of just the build params. Also save the run-pybot command (each param parsed). Then filtering by simulation and config parameters could be done in Kibana.

  22. Grab, split and save also run-pybot command to enable filtering on it

  23. Make the script usable also for local test runs

  24. Query ES for last results and compute difference in comparison with current results