
Benchmarking tools and techniques investigation #5502

Closed · Rebits opened this issue Jun 14, 2024 · 14 comments
Rebits commented Jun 14, 2024

Description

The objective of this issue is to investigate and identify the most effective tools, techniques, and methodologies for benchmark testing. These tests will serve as a baseline for comparison, enabling us to track performance improvements and regressions and thereby maintain and enhance the efficiency of the product. The proposed tools and techniques should meet the functional and non-functional requirements below. This research will determine the functionality needed to develop a comprehensive and effective performance test for the product.

Functional Requirements

Capabilities

  • Scripting module:
    • Records user interactions for testing.
    • Allows modification of recorded scripts to add data and configure measurement details.
  • Test management module:
    • Creates and executes test scenarios that simulate various user activities.
    • Uses recorded scripts and load generators to simulate different loads.
  • Load injector:
    • Generates different loads in the environment.
  • Analysis module:
    • Analyzes the test data automatically, determining whether results fall within the baseline (see the sketch after this list).
    • Provides computational results.
    • Provides reports and visualizations to help understand test results.
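
As a concrete illustration of the analysis-module requirement, the following minimal Python sketch compares measured KPTM values against agreed baselines with a tolerance margin. All metric names and thresholds here are illustrative assumptions, not part of any existing Wazuh tooling.

# Minimal sketch of the analysis-module requirement: flag KPTM values that
# regress beyond an agreed baseline. Metric names and values are illustrative.
BASELINE = {
    "manager_cpu_percent": 35.0,
    "api_response_time_ms": 200.0,
}
TOLERANCE = 0.10  # allow a 10% regression before flagging

def check_kptm(measured: dict) -> list:
    """Return (metric, measured, limit) tuples that exceed the baseline."""
    regressions = []
    for metric, baseline in BASELINE.items():
        limit = baseline * (1 + TOLERANCE)
        value = measured.get(metric)
        if value is not None and value > limit:
            regressions.append((metric, value, limit))
    return regressions

if __name__ == "__main__":
    failures = check_kptm({"manager_cpu_percent": 42.0, "api_response_time_ms": 180.0})
    for metric, value, limit in failures:
        print(f"REGRESSION {metric}: {value} > {limit:.1f}")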

KPTM analysis and data collection

Proposed tools should be capable of monitoring the following Key Performance Test Metrics (KPTM); a collection sketch follows the list:

  • CPU, memory, file descriptors, disk operations, hardware usage of all Wazuh components: Agent, Manager, Indexer, Dashboard
  • Events per second for different modules, varying event size
  • API request response time
  • Dashboard web load time
  • Error rate
  • Alert loss percentage
  • Indexing time for alerts/vulnerabilities, etc.
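
A minimal sketch of how two of these KPTM could be sampled in Python, assuming the psutil and requests libraries are available. The daemon names and API URL are illustrative assumptions.

# Minimal sketch: sample per-process resource usage of Wazuh daemons and time
# an API request. Process names and the API endpoint are assumptions.
import time
import psutil
import requests

WAZUH_PROCESSES = ("wazuh-analysisd", "wazuh-remoted")  # assumed daemon names

def sample_resources() -> dict:
    """CPU %, RSS memory, and open file descriptors (Linux) per daemon."""
    samples = {}
    for proc in psutil.process_iter(["name", "memory_info", "num_fds"]):
        try:
            if proc.info["name"] in WAZUH_PROCESSES:
                samples[proc.info["name"]] = {
                    "cpu_percent": proc.cpu_percent(interval=0.5),
                    "rss_mb": proc.info["memory_info"].rss / 2**20,
                    "open_fds": proc.info["num_fds"],
                }
        except psutil.NoSuchProcess:
            continue  # process exited while sampling
    return samples

def api_response_time(url: str = "https://localhost:55000/") -> float:
    """Wall-clock response time of a single API request, in milliseconds."""
    start = time.perf_counter()
    requests.get(url, verify=False, timeout=10)
    return (time.perf_counter() - start) * 1000

if __name__ == "__main__":
    print(sample_resources())
    print(f"API response time: {api_response_time():.1f} ms")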

Implementation restrictions

  • The CI process must use the QA-owned Jenkins infrastructure.
  • New tests must be developed using the JobFlow Python dependency.
  • Consider OpenSearch benchmarking tools for Wazuh indexer.
  • The CI process should be parallelizable, extensible, and cost-efficient.
  • Benchmarking techniques and tools should be industry standard.

Plan

  1. Agree on expected KPTM values:
    • Identify the KPTM to be measured.
    • Determine target values for these KPTM to ensure the system meets performance standards.
  2. Research and analysis:
    • Research current industry-standard benchmarking tools.
    • Research tools to collect, analyze, and generate testing loads.
    • Research tools for reporting and analysis.
    • Identify the most suitable tools for collecting and analyzing KPTM values.
    • Analyze the tools and their capabilities to determine the best fit for testing needs.

Related issues

Parent issue

@juliamagan changed the title from "CICD Overhaul: Benchmarking Tools and Techniques Investigation" to "Benchmarking tools and techniques investigation" on Jun 17, 2024
@Rebits added the level/epic label and removed the level/task label on Jun 19, 2024
@wazuhci moved this to In progress in Release 5.0.0 on Jun 20, 2024
rafabailon commented Jun 20, 2024

Tools for Different Purposes

Tools Summary

The following table summarizes the analyzed tools and the main characteristics of each one:

| Tool | Test Language | Multiplatform | Requirements | Purpose |
|---|---|---|---|---|
| Artillery | YAML | Yes | Node.js, NPM | Load and performance testing of web applications and APIs |
| Playwright | JavaScript / TypeScript | Yes | Node.js, NPM | Test automation in web browsers |
| OpenSearch Benchmark | JSON | Yes | Python 3.6+ (pip), Elasticsearch, OpenSearch | Performance evaluation of search and analysis clusters |
| Locust | Python | Yes | Python 3.6+ (pip) | Load and stress testing of web applications |
| Fluentd | - | Yes | Ruby, Gem (installation packages available) | Collecting, transforming, and shipping logs and events |
| Tsung | XML | Yes | Erlang, GNU Make (installation packages available) | Load and performance testing of web applications and services |
| Cypress | JavaScript / TypeScript | Yes | Node.js, NPM | End-to-end tests |

The following table summarizes the advantages and disadvantages of each tool:

| Tool | Advantages | Disadvantages |
|---|---|---|
| Artillery | Modern & Easy-to-Use. Cloud-Scale Testing. Test Any Stack. Fast Adoption. Extensible. Scalable & Cost-Efficient. Open Source. | Learning Curve. Limited UI Testing. Protocol Limitations. |
| Playwright | Cross-Browser Support. Cross-Platform Compatibility. Mobile Emulation. Language Flexibility. Headless and GUI Modes. Advanced Automation Capabilities. | Learning Curve. Browser-Only Scope. Resource-Heavy Load Generation. |
| OpenSearch Benchmark | Performance Metrics. Decision Support. Resource Optimization. | Complexity. Maintenance Overhead. |
| Locust | User-Friendly Task Definition. Distributed Load Generation. Real-Time Web-Based Monitoring. Simulating User Behavior. Performance Metrics and Reporting. | Python Dependency. Limited Protocol Support. Less Suitable for High-Concurrency Workloads. |
| Fluentd | Pluggable Architecture. Real-Time Data Processing. Cross-Platform Support. Better Memory Usage. | Decentralized Ecosystem. Transport and Buffering. Parsing Complexity. |
| Tsung | Erlang-Based. Protocol Support. Stability. Distributed Load Generation. Automated Statistics. | Learning Curve. Complexity. Limited Protocols. |
| Cypress | Speed. User-Friendly Interface. Reliability. Flexibility. Stability. Active Community. | Browser-Based. Limited Cross-Browser Support. No Native Mobile App Testing. Single Browser Session. No Direct Multiple Windows/Tabs Support. |

Tools Information

Artillery (Performance Testing)

Information

Artillery is an open-source load-testing tool. It simulates multiple concurrent virtual users, making it a solid option for performance testing, and it supports the HTTP, WebSocket, and Server-Sent Events protocols. You can use it to evaluate the performance and scalability of web services, APIs, and other networked systems. Artillery test scripts are usually written in YAML, but they can also be written in JavaScript. A script has two parts: config, which defines how the load test will run, and scenarios, which defines what the virtual users created by Artillery will do. The test below shows both parts.

Advantages and disadvantages
  • Advantages:

    • Modern & Easy-to-Use: Artillery is a powerful and user-friendly performance testing toolkit. It prioritizes developer productivity and follows a "batteries-included" philosophy.
    • Cloud-Scale Testing: You can run distributed multi-region load tests using AWS Lambda or AWS Fargate without managing infrastructure.
    • Test Any Stack: Artillery supports testing HTTP APIs, WebSocket, Socket.io services, and complex web apps with real browsers.
    • Fast Adoption: Designed to be easy to start with, it offers extensions, plugins, and integrations with monitoring and CI/CD tools.
    • Extensible: You can extend Artillery and build custom integrations using Node.js.
    • Scalable & Cost-Efficient: Run large-scale load tests from your own AWS account using AWS Lambda or serverless AWS Fargate clusters.
    • Open Source: Licensed under MPL-2.0, making it easy for platform and SQA teams to build on top of Artillery.
  • Disadvantages:

    • Learning Curve: While Artillery aims for ease of use, some users may find a learning curve when exploring its features.
    • Limited UI Testing: Although Artillery can run Playwright-based scripts for UI testing, it's primarily designed for backend systems.
    • Protocol Limitations: While it supports HTTP, WebSocket, and Socket.io, additional protocols require custom plugins.
Installation
root@ubuntu2204:/home/vagrant# apt update
root@ubuntu2204:/home/vagrant# curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
root@ubuntu2204:/home/vagrant# nvm install 20
root@ubuntu2204:/home/vagrant# node -v
root@ubuntu2204:/home/vagrant# npm -v
root@ubuntu2204:/home/vagrant# npm install -g artillery@latest

It is also possible to install the VS Code extension for Artillery, making it easier to create tests.

Artillery VS Code Extension

Artillery Test
  • Create asciiart-load-test.yml
root@ubuntu2204:/home/vagrant# touch asciiart-load-test.yml
  • Add Test Code
config:
  target: http://asciiart.artillery.io:8080
  phases:
    # Three phases: warm up, ramp the arrival rate, then spike it
    - duration: 60
      arrivalRate: 1
      rampTo: 5
      name: Warm up phase
    - duration: 60
      arrivalRate: 5
      rampTo: 10
      name: Ramp up load
    - duration: 30
      arrivalRate: 10
      rampTo: 30
      name: Spike phase
  plugins:
    ensure: {}               # fail the run if the thresholds below are breached
    apdex: {}                # score responses against the Apdex threshold
    metrics-by-endpoint: {}  # report metrics per URL
  apdex:
    threshold: 100           # responses under 100 ms count as "satisfied"
  ensure:
    thresholds:
      - http.response_time.p99: 100
      - http.response_time.p95: 75
scenarios:
  # Each virtual user fetches the three endpoints in a loop, 100 times
  - flow:
      - loop:
          - get:
              url: '/dino'
          - get:
              url: '/pony'
          - get:
              url: '/armadillo'
        count: 100
  • Run the Load Test
root@ubuntu2204:/home/vagrant# artillery run asciiart-load-test.yml
Test run id: tkrrk_eap897yrfekgtcdemgax3jbqjxzwn_4qth
Phase started: Warm up phase (index: 0, duration: 60s) 07:35:49(+0000)

--------------------------------------
Metrics for period to: 07:36:00(+0000) (width: 9.895s)
--------------------------------------

apdex.frustrated: .............................................................. 1
apdex.satisfied: ............................................................... 1017
apdex.tolerated: ............................................................... 16
http.codes.200: ................................................................ 1034
http.downloaded_bytes: ......................................................... 598768
http.request_rate: ............................................................. 106/sec
http.requests: ................................................................. 1047
http.response_time:
  min: ......................................................................... 46
  max: ......................................................................... 133
  mean: ........................................................................ 53.7
  median: ...................................................................... 51.9
  p95: ......................................................................... 63.4
  p99: ......................................................................... 80.6
http.responses: ................................................................ 1034
plugins.metrics-by-endpoint./armadillo.codes.200: .............................. 339
plugins.metrics-by-endpoint./dino.codes.200: ................................... 350
plugins.metrics-by-endpoint./pony.codes.200: ................................... 345
plugins.metrics-by-endpoint.response_time./armadillo:
  min: ......................................................................... 46
  max: ......................................................................... 133
  mean: ........................................................................ 53.9
  median: ...................................................................... 51.9
  p95: ......................................................................... 63.4
  p99: ......................................................................... 85.6
plugins.metrics-by-endpoint.response_time./dino:
  min: ......................................................................... 48
  max: ......................................................................... 132
  mean: ........................................................................ 53.6
  median: ...................................................................... 51.9
  p95: ......................................................................... 63.4
  p99: ......................................................................... 76
plugins.metrics-by-endpoint.response_time./pony:
  min: ......................................................................... 47
  max: ......................................................................... 130
  mean: ........................................................................ 53.5
  median: ...................................................................... 51.9
  p95: ......................................................................... 63.4
  p99: ......................................................................... 76
vusers.created: ................................................................ 13
vusers.created_by_name.0: ...................................................... 13


--------------------------------------
Metrics for period to: 07:36:10(+0000) (width: 9.961s)
--------------------------------------

apdex.frustrated: .............................................................. 0
apdex.satisfied: ............................................................... 4153
apdex.tolerated: ............................................................... 20
http.codes.200: ................................................................ 4173
http.downloaded_bytes: ......................................................... 2413754
http.request_rate: ............................................................. 421/sec
http.requests: ................................................................. 4189
http.response_time:
  min: ......................................................................... 47
  max: ......................................................................... 99
  mean: ........................................................................ 53.1
  median: ...................................................................... 51.9
  p95: ......................................................................... 63.4
  p99: ......................................................................... 77.5
http.responses: ................................................................ 4173
plugins.metrics-by-endpoint./armadillo.codes.200: .............................. 1388
plugins.metrics-by-endpoint./dino.codes.200: ................................... 1394
plugins.metrics-by-endpoint./pony.codes.200: ................................... 1391
plugins.metrics-by-endpoint.response_time./armadillo:
  min: ......................................................................... 47
  max: ......................................................................... 99
  mean: ........................................................................ 52.9
  median: ...................................................................... 51.9
  p95: ......................................................................... 63.4
  p99: ......................................................................... 77.5
plugins.metrics-by-endpoint.response_time./dino:
  min: ......................................................................... 47
  max: ......................................................................... 93
  mean: ........................................................................ 53.2
  median: ...................................................................... 51.9
  p95: ......................................................................... 63.4
  p99: ......................................................................... 77.5
plugins.metrics-by-endpoint.response_time./pony:
  min: ......................................................................... 47
  max: ......................................................................... 97
  mean: ........................................................................ 53.2
  median: ...................................................................... 51.9
  p95: ......................................................................... 63.4
  p99: ......................................................................... 77.5
vusers.completed: .............................................................. 4
vusers.created: ................................................................ 20
vusers.created_by_name.0: ...................................................... 20
vusers.failed: ................................................................. 0
vusers.session_length:
  min: ......................................................................... 16380.2
  max: ......................................................................... 17135.7
  mean: ........................................................................ 16639
  median: ...................................................................... 16486.1
  p95: ......................................................................... 16486.1
  p99: ......................................................................... 16486.1


--------------------------------------
Metrics for period to: 07:36:20(+0000) (width: 9.994s)
--------------------------------------

apdex.frustrated: .............................................................. 0
apdex.satisfied: ............................................................... 6390
apdex.tolerated: ............................................................... 28
http.codes.200: ................................................................ 6418
http.downloaded_bytes: ......................................................... 3711633
http.request_rate: ............................................................. 643/sec
http.requests: ................................................................. 6427
http.response_time:
  min: ......................................................................... 47
  max: ......................................................................... 97
  mean: ........................................................................ 53.5
  median: ...................................................................... 51.9
  p95: ......................................................................... 62.2
  p99: ......................................................................... 85.6
http.responses: ................................................................ 6418
plugins.metrics-by-endpoint./armadillo.codes.200: .............................. 2136
plugins.metrics-by-endpoint./dino.codes.200: ................................... 2145
plugins.metrics-by-endpoint./pony.codes.200: ................................... 2137
plugins.metrics-by-endpoint.response_time./armadillo:
  min: ......................................................................... 47
  max: ......................................................................... 97
  mean: ........................................................................ 53.6
  median: ...................................................................... 51.9
  p95: ......................................................................... 62.2
  p99: ......................................................................... 85.6
plugins.metrics-by-endpoint.response_time./dino:
  min: ......................................................................... 47
  max: ......................................................................... 96
  mean: ........................................................................ 53.5
  median: ...................................................................... 51.9
  p95: ......................................................................... 62.2
  p99: ......................................................................... 87.4
plugins.metrics-by-endpoint.response_time./pony:
  min: ......................................................................... 47
  max: ......................................................................... 96
  mean: ........................................................................ 53.4
  median: ...................................................................... 51.9
  p95: ......................................................................... 63.4
  p99: ......................................................................... 85.6
vusers.completed: .............................................................. 18
vusers.created: ................................................................ 28
vusers.created_by_name.0: ...................................................... 28
vusers.failed: ................................................................. 0
vusers.session_length:
  min: ......................................................................... 15695.4
  max: ......................................................................... 16830.3
  mean: ........................................................................ 16144.8
  median: ...................................................................... 16159.7
  p95: ......................................................................... 16819.2
  p99: ......................................................................... 16819.2


--------------------------------------
Metrics for period to: 07:36:30(+0000) (width: 9.999s)
--------------------------------------

apdex.frustrated: .............................................................. 0
apdex.satisfied: ............................................................... 8548
apdex.tolerated: ............................................................... 30
http.codes.200: ................................................................ 8578
http.downloaded_bytes: ......................................................... 4962440
http.request_rate: ............................................................. 859/sec
http.requests: ................................................................. 8592
http.response_time:
  min: ......................................................................... 46
  max: ......................................................................... 95
  mean: ........................................................................ 52.9
  median: ...................................................................... 51.9
  p95: ......................................................................... 58.6
  p99: ......................................................................... 74.4
http.responses: ................................................................ 8578
plugins.metrics-by-endpoint./armadillo.codes.200: .............................. 2852
plugins.metrics-by-endpoint./dino.codes.200: ................................... 2864
plugins.metrics-by-endpoint./pony.codes.200: ................................... 2862
plugins.metrics-by-endpoint.response_time./armadillo:
  min: ......................................................................... 47
  max: ......................................................................... 92
  mean: ........................................................................ 52.9
  median: ...................................................................... 51.9
  p95: ......................................................................... 58.6
  p99: ......................................................................... 73
plugins.metrics-by-endpoint.response_time./dino:
  min: ......................................................................... 47
  max: ......................................................................... 95
  mean: ........................................................................ 53
  median: ...................................................................... 51.9
  p95: ......................................................................... 59.7
  p99: ......................................................................... 74.4
plugins.metrics-by-endpoint.response_time./pony:
  min: ......................................................................... 46
  max: ......................................................................... 95
  mean: ........................................................................ 52.9
  median: ...................................................................... 51.9
  p95: ......................................................................... 59.7
  p99: ......................................................................... 73
vusers.completed: .............................................................. 20
vusers.created: ................................................................ 34
vusers.created_by_name.0: ...................................................... 34
vusers.failed: ................................................................. 0
vusers.session_length:
  min: ......................................................................... 15811
  max: ......................................................................... 16706.3
  mean: ........................................................................ 16274.5
  median: ...................................................................... 16159.7
  p95: ......................................................................... 16819.2
  p99: ......................................................................... 16819.2


--------------------------------------
Metrics for period to: 07:36:40(+0000) (width: 9.997s)
--------------------------------------

apdex.frustrated: .............................................................. 0
apdex.satisfied: ............................................................... 10436
apdex.tolerated: ............................................................... 38
http.codes.200: ................................................................ 10474
http.downloaded_bytes: ......................................................... 6057553
http.request_rate: ............................................................. 1049/sec
http.requests: ................................................................. 10485
http.response_time:
  min: ......................................................................... 46
  max: ......................................................................... 96
  mean: ........................................................................ 54.4
  median: ...................................................................... 53
  p95: ......................................................................... 67.4
  p99: ......................................................................... 82.3
http.responses: ................................................................ 10474
plugins.metrics-by-endpoint./armadillo.codes.200: .............................. 3490
plugins.metrics-by-endpoint./dino.codes.200: ................................... 3493
plugins.metrics-by-endpoint./pony.codes.200: ................................... 3491
plugins.metrics-by-endpoint.response_time./armadillo:
  min: ......................................................................... 46
  max: ......................................................................... 96
  mean: ........................................................................ 54.5
  median: ...................................................................... 53
  p95: ......................................................................... 67.4
  p99: ......................................................................... 82.3
plugins.metrics-by-endpoint.response_time./dino:
  min: ......................................................................... 46
  max: ......................................................................... 96
  mean: ........................................................................ 54.4
  median: ...................................................................... 51.9
  p95: ......................................................................... 67.4
  p99: ......................................................................... 82.3
plugins.metrics-by-endpoint.response_time./pony:
  min: ......................................................................... 47
  max: ......................................................................... 95
  mean: ........................................................................ 54.4
  median: ...................................................................... 53
  p95: ......................................................................... 67.4
  p99: ......................................................................... 82.3
vusers.completed: .............................................................. 30
vusers.created: ................................................................ 40
vusers.created_by_name.0: ...................................................... 40
vusers.failed: ................................................................. 0
vusers.session_length:
  min: ......................................................................... 15758.7
  max: ......................................................................... 16982.9
  mean: ........................................................................ 16260.6
  median: ...................................................................... 16159.7
  p95: ......................................................................... 16819.2
  p99: ......................................................................... 16819.2


Phase completed: Warm up phase (index: 0, duration: 60s) 07:36:49(+0000)

Test Script Walkthrough
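
Beyond reading the console summary above, Artillery can write its metrics to a JSON report (artillery run --output report.json), which makes it possible to gate a CI job on KPTM thresholds automatically. A minimal sketch follows; the "aggregate"/"summaries" layout is assumed from Artillery v2 and should be verified against the version in use.

# Minimal sketch: fail a CI step when the aggregate p95/p99 response times in
# an Artillery JSON report exceed agreed limits. The report key layout is an
# assumption based on Artillery v2 and should be checked for other versions.
import json
import sys

LIMITS_MS = {"p95": 75, "p99": 100}  # same thresholds as the ensure plugin

def main(report_path: str) -> int:
    with open(report_path) as fh:
        report = json.load(fh)
    summary = report["aggregate"]["summaries"]["http.response_time"]
    failed = {k: summary[k] for k, limit in LIMITS_MS.items() if summary[k] > limit}
    for percentile, value in failed.items():
        print(f"{percentile} response time {value} ms exceeds {LIMITS_MS[percentile]} ms")
    return 1 if failed else 0

if __name__ == "__main__":
    sys.exit(main(sys.argv[1] if len(sys.argv) > 1 else "report.json"))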

Playwright (Test Automation Framework)

Information

Playwright is a test automation framework for end-to-end testing in web browsers and for web scraping. It lets you write code that interacts with web pages, performing actions such as clicking, scrolling, and filling out forms, and capturing screenshots or videos of the browser's state.

Artillery and Playwright can be used together to create more realistic test scenarios: a script created in Playwright can be integrated into Artillery. Since Artillery supports JavaScript code, a JS function can invoke the Playwright script to simulate user interactions (see the integration example below).

Advantages and disadvantages
  • Advantages:

    • Cross-Browser Support: Playwright works seamlessly with multiple browsers, including Chromium (Chrome, Edge), Firefox, and WebKit (Safari). This compatibility ensures consistent testing across different environments.
    • Cross-Platform Compatibility: You can use Playwright to test applications across various platforms, including mobile (Android), web, and desktop (MacOS, Linux, Windows).
    • Mobile Emulation: Playwright can emulate mobile devices, replicating geolocation, screen size, and other device-specific characteristics.
    • Language Flexibility: Initially built for Node.js, Playwright now offers bindings for JavaScript, TypeScript, Python, Java, and C#/.NET, making it accessible to a broader range of developers and testers.
    • Headless and GUI Modes: It can run browsers in headless mode (for faster execution in test environments) and GUI mode (for development and debugging).
    • Advanced Automation Capabilities: Playwright handles both traditional multi-page applications and complex single-page applications. It can interact with iframes and pierce through Shadow DOM, essential for testing modern web applications.
  • Disadvantages:

    • Learning Curve: While Playwright is powerful, some users may find a learning curve when exploring its features.
    • Browser-Only Scope: Playwright automates real browsers; on its own it is not a load-testing tool, which is why it is paired with Artillery here.
    • Resource-Heavy Load Generation: Each simulated user drives a full browser instance, which is far more expensive than protocol-level virtual users.
Installation
root@ubuntu2204:/home/vagrant# apt update
root@ubuntu2204:/home/vagrant# curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
root@ubuntu2204:/home/vagrant# nvm install 20
root@ubuntu2204:/home/vagrant# node -v
root@ubuntu2204:/home/vagrant# npm -v
root@ubuntu2204:/home/vagrant# npm init playwright@latest
root@ubuntu2204:/home/vagrant# npx playwright install-deps

It is also possible to install Playwright in VS Code, making it easier to create tests.

Playwright in VS Code

Playwright Test
  • Run the Example Tests
root@ubuntu2204:/home/vagrant# npx playwright test

Running 6 tests using 1 worker
  6 passed (8.3s)

To open last HTML report run:

  npx playwright show-report
Integrate Playwright with Artillery
  • Create hello-world.yml for Artillery
root@ubuntu2204:/home/vagrant# touch hello-world.yml
  • Add Test Code to Artillery
config:
  target: https://www.artillery.io
  engines:
    playwright: {}           # enable Artillery's Playwright engine
  processor: './flows.js'    # JS module that exports the test function
scenarios:
  - engine: playwright
    testFunction: 'helloFlow'
  • Create flows.js for Playwright
root@ubuntu2204:/home/vagrant# touch flows.js
  • Add Test Code to Playwright
module.exports = { helloFlow };

// Artillery runs this function once per virtual user, passing a Playwright page
async function helloFlow(page) {
  await page.goto('https://www.artillery.io/');
  await page.click('text=Cloud');
}
  • Run the Load Test
root@ubuntu2204:/home/vagrant# artillery run hello-world.yml
Test run id: t9yad_6437mptpftpgeqr6cya4egfyk5k3t_rwqw
Phase started: unnamed (index: 0, duration: 1s) 09:16:07(+0000)

Phase completed: unnamed (index: 0, duration: 1s) 09:16:08(+0000)

--------------------------------------
Metrics for period to: 09:16:10(+0000) (width: 1.259s)
--------------------------------------

browser.http_requests: ......................................................... 34
browser.page.codes.200: ........................................................ 35
browser.page.codes.206: ........................................................ 1
vusers.created: ................................................................ 1
vusers.created_by_name.0: ...................................................... 1


Warning: multiple batches of metrics for period 1719306960000 2024-06-25T09:16:00.000Z
--------------------------------------
Metrics for period to: 09:16:20(+0000) (width: 1.465s)
--------------------------------------

browser.http_requests: ......................................................... 28
browser.page.FCP.https://www.artillery.io/:
  min: ......................................................................... 1229.2
  max: ......................................................................... 1229.2
  mean: ........................................................................ 1229.2
  median: ...................................................................... 1224.4
  p95: ......................................................................... 1224.4
  p99: ......................................................................... 1224.4
browser.page.FID.https://www.artillery.io/:
  min: ......................................................................... 14.2
  max: ......................................................................... 14.2
  mean: ........................................................................ 14.2
  median: ...................................................................... 14.2
  p95: ......................................................................... 14.2
  p99: ......................................................................... 14.2
browser.page.LCP.https://www.artillery.io/:
  min: ......................................................................... 1229.2
  max: ......................................................................... 1229.2
  mean: ........................................................................ 1229.2
  median: ...................................................................... 1224.4
  p95: ......................................................................... 1224.4
  p99: ......................................................................... 1224.4
browser.page.TTFB.https://www.artillery.io/:
  min: ......................................................................... 614.1
  max: ......................................................................... 614.1
  mean: ........................................................................ 614.1
  median: ...................................................................... 608
  p95: ......................................................................... 608
  p99: ......................................................................... 608
browser.page.codes.200: ........................................................ 23
browser.page.codes.206: ........................................................ 1
browser.page.codes.302: ........................................................ 1
browser.page.codes.307: ........................................................ 1
browser.page.codes.308: ........................................................ 1
browser.page.codes.404: ........................................................ 1
vusers.completed: .............................................................. 1
vusers.failed: ................................................................. 0
vusers.session_length:
  min: ......................................................................... 2874.4
  max: ......................................................................... 2874.4
  mean: ........................................................................ 2874.4
  median: ...................................................................... 2893.5
  p95: ......................................................................... 2893.5
  p99: ......................................................................... 2893.5


All VUs finished. Total time: 4 seconds

--------------------------------
Summary report @ 09:16:13(+0000)
--------------------------------

browser.http_requests: ......................................................... 62
browser.page.FCP.https://www.artillery.io/:
  min: ......................................................................... 1229.2
  max: ......................................................................... 1229.2
  mean: ........................................................................ 1229.2
  median: ...................................................................... 1224.4
  p95: ......................................................................... 1224.4
  p99: ......................................................................... 1224.4
browser.page.FID.https://www.artillery.io/:
  min: ......................................................................... 14.2
  max: ......................................................................... 14.2
  mean: ........................................................................ 14.2
  median: ...................................................................... 14.2
  p95: ......................................................................... 14.2
  p99: ......................................................................... 14.2
browser.page.LCP.https://www.artillery.io/:
  min: ......................................................................... 1229.2
  max: ......................................................................... 1229.2
  mean: ........................................................................ 1229.2
  median: ...................................................................... 1224.4
  p95: ......................................................................... 1224.4
  p99: ......................................................................... 1224.4
browser.page.TTFB.https://www.artillery.io/:
  min: ......................................................................... 614.1
  max: ......................................................................... 614.1
  mean: ........................................................................ 614.1
  median: ...................................................................... 608
  p95: ......................................................................... 608
  p99: ......................................................................... 608
browser.page.codes.200: ........................................................ 58
browser.page.codes.206: ........................................................ 2
browser.page.codes.302: ........................................................ 1
browser.page.codes.307: ........................................................ 1
browser.page.codes.308: ........................................................ 1
browser.page.codes.404: ........................................................ 1
vusers.completed: .............................................................. 1
vusers.created: ................................................................ 1
vusers.created_by_name.0: ...................................................... 1
vusers.failed: ................................................................. 0
vusers.session_length:
  min: ......................................................................... 2874.4
  max: ......................................................................... 2874.4
  mean: ........................................................................ 2874.4
  median: ...................................................................... 2893.5
  p95: ......................................................................... 2893.5
  p99: ......................................................................... 2893.5

Note: Integrate Playwright with Artillery

OpenSearch Benchmark (Performance Testing for OpenSearch Clusters)

Information

OpenSearch Benchmark is a macrobenchmark utility provided by the OpenSearch Project. It collects performance metrics from an OpenSearch cluster for a variety of purposes: tracking the cluster's overall performance, informing decisions about when to upgrade, and evaluating how workload changes may affect the cluster. A sketch for driving it from a CI script follows.
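
Because OpenSearch Benchmark is a CLI tool, a CI job can drive it from Python. A minimal sketch, reusing the exact flags from the test run shown further below (the host and credentials are test-environment defaults and would differ in a real deployment):

# Minimal sketch: run an OpenSearch Benchmark workload from a CI script and
# fail the step on a non-zero exit code. Flags mirror the manual run below.
import subprocess
import sys

CMD = [
    "opensearch-benchmark", "execute-test",
    "--pipeline=benchmark-only",  # benchmark an already-running cluster
    "--workload=nested",
    "--target-host=https://localhost:9200",
    "--client-options=basic_auth_user:admin,basic_auth_password:admin,verify_certs:false",
]

if __name__ == "__main__":
    result = subprocess.run(CMD)
    sys.exit(result.returncode)   # non-zero exit fails the CI step

OSB also provides a compare subcommand for diffing two recorded test executions, which maps directly onto the baseline-tracking goal of this issue.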

Advantages and disadvantages
  • Advantages:

    • Performance Metrics: OpenSearch Benchmark helps you gather performance metrics from an OpenSearch cluster. You can track the overall performance of your cluster, which is useful for monitoring and optimization.
    • Decision Support: It informs decisions about upgrading your cluster to a new version. By benchmarking performance, you can evaluate the benefits of upgrading and make informed choices.
    • Resource Optimization: The tool allows you to optimize cluster resource usage, potentially reducing operating costs.
  • Disadvantages:

    • Complexity: Benchmarking can be complex, especially when dealing with large clusters or intricate workflows. Proper configuration and interpretation of results are essential.
    • Maintenance Overhead: Regular benchmarking requires ongoing effort and resources.
Installation
root@ubuntu2204:/home/vagrant# apt install python3-pip
root@ubuntu2204:/home/vagrant# pip install opensearch-benchmark
Installation (Docker)
root@ubuntu2204:/home/vagrant# docker pull opensearchproject/opensearch-benchmark:latest
root@ubuntu2204:/home/vagrant# docker run opensearchproject/opensearch-benchmark -h
OpenSearch Benchmark Test (with Wazuh)
  • Check workloads
root@ubuntu2204:/home/vagrant# opensearch-benchmark list workloads
  • Run Tests
root@ubuntu2204:/home/vagrant# opensearch-benchmark execute-test --pipeline=benchmark-only --workload=nested --target-host=https://localhost:9200 --client-options=basic_auth_user:admin,basic_auth_password:admin,verify_certs:false

   ____                  _____                      __       ____                  __                         __
  / __ \____  ___  ____ / ___/___  ____ ___________/ /_     / __ )___  ____  _____/ /_  ____ ___  ____ ______/ /__
 / / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \   / __  / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
/ /_/ / /_/ /  __/ / / /__/ /  __/ /_/ / /  / /__/ / / /  / /_/ /  __/ / / / /__/ / / / / / / / / /_/ / /  / ,<
\____/ .___/\___/_/ /_/____/\___/\__,_/_/   \___/_/ /_/  /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/  /_/|_|
    /_/

[INFO] [Test Execution ID]: 73bdc966-abc6-4c84-a4d5-1be749c8062f
[INFO] You did not provide an explicit timeout in the client options. Assuming default of 10 seconds.
[INFO] Downloading workload data: documents.json.bz2 (663.3 MB total size)        [100.0%]
[INFO] Decompressing workload data from [/root/.benchmark/benchmarks/data/nested/documents.json.bz2] to [/root/.benchmark/benchmarks/data/nested/documents.json] (resulting size: [3.39] GB) ... [OK]
[INFO] Preparing file offset table for [/root/.benchmark/benchmarks/data/nested/documents.json] ... [OK]
[INFO] Executing test with workload [nested], test_procedure [nested-search-test_procedure] and provision_config_instance ['external'] with version [7.10.2].

[WARNING] merges_total_time is 102 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] indexing_total_time is 536 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] refresh_total_time is 954 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] flush_total_time is 138 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
Running delete-index                                                           [100% done]
Running create-index                                                           [100% done]
Running check-cluster-health                                                   [100% done]
Running index-append                                                           [100% done]
Running refresh-after-index                                                    [100% done]
Running force-merge                                                            [100% done]
Running refresh-after-force-merge                                              [100% done]
Running wait-until-merges-finish                                               [100% done]
Running randomized-nested-queries                                              [100% done]
Running randomized-term-queries                                                [100% done]
Running randomized-sorted-term-queries                                         [100% done]
Running match-all                                                              [100% done]
Running nested-date-histo                                                      [100% done]
Running randomized-nested-queries-with-inner-hits_default                      [100% done]
Running randomized-nested-queries-with-inner-hits_default_big_size             [100% done]

------------------------------------------------------
    _______             __   _____
   / ____(_)___  ____ _/ /  / ___/_________  ________
  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
 / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
------------------------------------------------------
            
|                                                         Metric |                                                       Task |       Value |   Unit |
|---------------------------------------------------------------:|-----------------------------------------------------------:|------------:|-------:|
|                     Cumulative indexing time of primary shards |                                                            |     8.58217 |    min |
|             Min cumulative indexing time across primary shards |                                                            |           0 |    min |
|          Median cumulative indexing time across primary shards |                                                            | 0.000258333 |    min |
|             Max cumulative indexing time across primary shards |                                                            |     8.57323 |    min |
|            Cumulative indexing throttle time of primary shards |                                                            |           0 |    min |
|    Min cumulative indexing throttle time across primary shards |                                                            |           0 |    min |
| Median cumulative indexing throttle time across primary shards |                                                            |           0 |    min |
|    Max cumulative indexing throttle time across primary shards |                                                            |           0 |    min |
|                        Cumulative merge time of primary shards |                                                            |     4.00155 |    min |
|                       Cumulative merge count of primary shards |                                                            |           6 |        |
|                Min cumulative merge time across primary shards |                                                            |           0 |    min |
|             Median cumulative merge time across primary shards |                                                            |           0 |    min |
|                Max cumulative merge time across primary shards |                                                            |     3.99985 |    min |
|               Cumulative merge throttle time of primary shards |                                                            |     1.66413 |    min |
|       Min cumulative merge throttle time across primary shards |                                                            |           0 |    min |
|    Median cumulative merge throttle time across primary shards |                                                            |           0 |    min |
|       Max cumulative merge throttle time across primary shards |                                                            |     1.66413 |    min |
|                      Cumulative refresh time of primary shards |                                                            |    0.297567 |    min |
|                     Cumulative refresh count of primary shards |                                                            |         141 |        |
|              Min cumulative refresh time across primary shards |                                                            |           0 |    min |
|           Median cumulative refresh time across primary shards |                                                            |  0.00189167 |    min |
|              Max cumulative refresh time across primary shards |                                                            |    0.281667 |    min |
|                        Cumulative flush time of primary shards |                                                            |    0.666633 |    min |
|                       Cumulative flush count of primary shards |                                                            |          23 |        |
|                Min cumulative flush time across primary shards |                                                            |           0 |    min |
|             Median cumulative flush time across primary shards |                                                            |    0.000275 |    min |
|                Max cumulative flush time across primary shards |                                                            |    0.664333 |    min |
|                                        Total Young Gen GC time |                                                            |       7.809 |      s |
|                                       Total Young Gen GC count |                                                            |         651 |        |
|                                          Total Old Gen GC time |                                                            |           0 |      s |
|                                         Total Old Gen GC count |                                                            |           0 |        |
|                                                     Store size |                                                            |     3.21809 |     GB |
|                                                  Translog size |                                                            | 5.12227e-07 |     GB |
|                                         Heap used for segments |                                                            |           0 |     MB |
|                                       Heap used for doc values |                                                            |           0 |     MB |
|                                            Heap used for terms |                                                            |           0 |     MB |
|                                            Heap used for norms |                                                            |           0 |     MB |
|                                           Heap used for points |                                                            |           0 |     MB |
|                                    Heap used for stored fields |                                                            |           0 |     MB |
|                                                  Segment count |                                                            |          50 |        |
|                                                 Min Throughput |                                               index-append |     39833.5 | docs/s |
|                                                Mean Throughput |                                               index-append |     41103.6 | docs/s |
|                                              Median Throughput |                                               index-append |     41187.7 | docs/s |
|                                                 Max Throughput |                                               index-append |     42569.8 | docs/s |
|                                        50th percentile latency |                                               index-append |     422.194 |     ms |
|                                        90th percentile latency |                                               index-append |     667.805 |     ms |
|                                        99th percentile latency |                                               index-append |     2576.67 |     ms |
|                                      99.9th percentile latency |                                               index-append |     6141.64 |     ms |
|                                       100th percentile latency |                                               index-append |     6759.82 |     ms |
|                                   50th percentile service time |                                               index-append |     422.194 |     ms |
|                                   90th percentile service time |                                               index-append |     667.805 |     ms |
|                                   99th percentile service time |                                               index-append |     2576.67 |     ms |
|                                 99.9th percentile service time |                                               index-append |     6141.64 |     ms |
|                                  100th percentile service time |                                               index-append |     6759.82 |     ms |
|                                                     error rate |                                               index-append |           0 |      % |
|                                                 Min Throughput |                                   wait-until-merges-finish |        0.06 |  ops/s |
|                                                Mean Throughput |                                   wait-until-merges-finish |        0.06 |  ops/s |
|                                              Median Throughput |                                   wait-until-merges-finish |        0.06 |  ops/s |
|                                                 Max Throughput |                                   wait-until-merges-finish |        0.06 |  ops/s |
|                                       100th percentile latency |                                   wait-until-merges-finish |     16252.9 |     ms |
|                                  100th percentile service time |                                   wait-until-merges-finish |     16252.9 |     ms |
|                                                     error rate |                                   wait-until-merges-finish |           0 |      % |
|                                                 Min Throughput |                                  randomized-nested-queries |        19.9 |  ops/s |
|                                                Mean Throughput |                                  randomized-nested-queries |       19.94 |  ops/s |
|                                              Median Throughput |                                  randomized-nested-queries |       19.95 |  ops/s |
|                                                 Max Throughput |                                  randomized-nested-queries |       19.97 |  ops/s |
|                                        50th percentile latency |                                  randomized-nested-queries |     28.4881 |     ms |
|                                        90th percentile latency |                                  randomized-nested-queries |     43.5573 |     ms |
|                                        99th percentile latency |                                  randomized-nested-queries |     71.1674 |     ms |
|                                      99.9th percentile latency |                                  randomized-nested-queries |     120.874 |     ms |
|                                       100th percentile latency |                                  randomized-nested-queries |     129.945 |     ms |
|                                   50th percentile service time |                                  randomized-nested-queries |     26.4684 |     ms |
|                                   90th percentile service time |                                  randomized-nested-queries |     41.1465 |     ms |
|                                   99th percentile service time |                                  randomized-nested-queries |     68.0754 |     ms |
|                                 99.9th percentile service time |                                  randomized-nested-queries |     106.919 |     ms |
|                                  100th percentile service time |                                  randomized-nested-queries |     125.263 |     ms |
|                                                     error rate |                                  randomized-nested-queries |           0 |      % |
|                                                 Min Throughput |                                    randomized-term-queries |       24.99 |  ops/s |
|                                                Mean Throughput |                                    randomized-term-queries |          25 |  ops/s |
|                                              Median Throughput |                                    randomized-term-queries |          25 |  ops/s |
|                                                 Max Throughput |                                    randomized-term-queries |          25 |  ops/s |
|                                        50th percentile latency |                                    randomized-term-queries |     7.88739 |     ms |
|                                        90th percentile latency |                                    randomized-term-queries |     11.0647 |     ms |
|                                        99th percentile latency |                                    randomized-term-queries |     15.2192 |     ms |
|                                       100th percentile latency |                                    randomized-term-queries |     22.8419 |     ms |
|                                   50th percentile service time |                                    randomized-term-queries |     6.12509 |     ms |
|                                   90th percentile service time |                                    randomized-term-queries |       8.878 |     ms |
|                                   99th percentile service time |                                    randomized-term-queries |     12.6071 |     ms |
|                                  100th percentile service time |                                    randomized-term-queries |     20.7344 |     ms |
|                                                     error rate |                                    randomized-term-queries |           0 |      % |
|                                                 Min Throughput |                             randomized-sorted-term-queries |       15.99 |  ops/s |
|                                                Mean Throughput |                             randomized-sorted-term-queries |       15.99 |  ops/s |
|                                              Median Throughput |                             randomized-sorted-term-queries |       15.99 |  ops/s |
|                                                 Max Throughput |                             randomized-sorted-term-queries |       15.99 |  ops/s |
|                                        50th percentile latency |                             randomized-sorted-term-queries |     15.7295 |     ms |
|                                        90th percentile latency |                             randomized-sorted-term-queries |     21.3126 |     ms |
|                                        99th percentile latency |                             randomized-sorted-term-queries |     37.9242 |     ms |
|                                       100th percentile latency |                             randomized-sorted-term-queries |     46.5997 |     ms |
|                                   50th percentile service time |                             randomized-sorted-term-queries |     13.7363 |     ms |
|                                   90th percentile service time |                             randomized-sorted-term-queries |     19.2022 |     ms |
|                                   99th percentile service time |                             randomized-sorted-term-queries |     34.7268 |     ms |
|                                  100th percentile service time |                             randomized-sorted-term-queries |     44.1017 |     ms |
|                                                     error rate |                             randomized-sorted-term-queries |           0 |      % |
|                                                 Min Throughput |                                                  match-all |           5 |  ops/s |
|                                                Mean Throughput |                                                  match-all |           5 |  ops/s |
|                                              Median Throughput |                                                  match-all |           5 |  ops/s |
|                                                 Max Throughput |                                                  match-all |           5 |  ops/s |
|                                        50th percentile latency |                                                  match-all |     6.94698 |     ms |
|                                        90th percentile latency |                                                  match-all |     10.5454 |     ms |
|                                        99th percentile latency |                                                  match-all |     14.2265 |     ms |
|                                       100th percentile latency |                                                  match-all |     20.7373 |     ms |
|                                   50th percentile service time |                                                  match-all |      4.8253 |     ms |
|                                   90th percentile service time |                                                  match-all |     8.11081 |     ms |
|                                   99th percentile service time |                                                  match-all |     10.3598 |     ms |
|                                  100th percentile service time |                                                  match-all |     16.6433 |     ms |
|                                                     error rate |                                                  match-all |           0 |      % |
|                                                 Min Throughput |                                          nested-date-histo |           1 |  ops/s |
|                                                Mean Throughput |                                          nested-date-histo |           1 |  ops/s |
|                                              Median Throughput |                                          nested-date-histo |           1 |  ops/s |
|                                                 Max Throughput |                                          nested-date-histo |           1 |  ops/s |
|                                        50th percentile latency |                                          nested-date-histo |     634.563 |     ms |
|                                        90th percentile latency |                                          nested-date-histo |     684.109 |     ms |
|                                        99th percentile latency |                                          nested-date-histo |     738.059 |     ms |
|                                       100th percentile latency |                                          nested-date-histo |     777.098 |     ms |
|                                   50th percentile service time |                                          nested-date-histo |     631.674 |     ms |
|                                   90th percentile service time |                                          nested-date-histo |     680.681 |     ms |
|                                   99th percentile service time |                                          nested-date-histo |     736.705 |     ms |
|                                  100th percentile service time |                                          nested-date-histo |     775.063 |     ms |
|                                                     error rate |                                          nested-date-histo |           0 |      % |
|                                                 Min Throughput |          randomized-nested-queries-with-inner-hits_default |       17.83 |  ops/s |
|                                                Mean Throughput |          randomized-nested-queries-with-inner-hits_default |       17.99 |  ops/s |
|                                              Median Throughput |          randomized-nested-queries-with-inner-hits_default |       17.99 |  ops/s |
|                                                 Max Throughput |          randomized-nested-queries-with-inner-hits_default |          18 |  ops/s |
|                                        50th percentile latency |          randomized-nested-queries-with-inner-hits_default |     33.6864 |     ms |
|                                        90th percentile latency |          randomized-nested-queries-with-inner-hits_default |     52.4591 |     ms |
|                                        99th percentile latency |          randomized-nested-queries-with-inner-hits_default |     81.8842 |     ms |
|                                      99.9th percentile latency |          randomized-nested-queries-with-inner-hits_default |     568.065 |     ms |
|                                       100th percentile latency |          randomized-nested-queries-with-inner-hits_default |     617.288 |     ms |
|                                   50th percentile service time |          randomized-nested-queries-with-inner-hits_default |     30.9282 |     ms |
|                                   90th percentile service time |          randomized-nested-queries-with-inner-hits_default |     48.6987 |     ms |
|                                   99th percentile service time |          randomized-nested-queries-with-inner-hits_default |      68.033 |     ms |
|                                 99.9th percentile service time |          randomized-nested-queries-with-inner-hits_default |      100.44 |     ms |
|                                  100th percentile service time |          randomized-nested-queries-with-inner-hits_default |     616.964 |     ms |
|                                                     error rate |          randomized-nested-queries-with-inner-hits_default |           0 |      % |
|                                                 Min Throughput | randomized-nested-queries-with-inner-hits_default_big_size |          16 |  ops/s |
|                                                Mean Throughput | randomized-nested-queries-with-inner-hits_default_big_size |          16 |  ops/s |
|                                              Median Throughput | randomized-nested-queries-with-inner-hits_default_big_size |          16 |  ops/s |
|                                                 Max Throughput | randomized-nested-queries-with-inner-hits_default_big_size |          16 |  ops/s |
|                                        50th percentile latency | randomized-nested-queries-with-inner-hits_default_big_size |     34.9921 |     ms |
|                                        90th percentile latency | randomized-nested-queries-with-inner-hits_default_big_size |     51.1011 |     ms |
|                                        99th percentile latency | randomized-nested-queries-with-inner-hits_default_big_size |     69.6266 |     ms |
|                                      99.9th percentile latency | randomized-nested-queries-with-inner-hits_default_big_size |     87.5509 |     ms |
|                                       100th percentile latency | randomized-nested-queries-with-inner-hits_default_big_size |     114.079 |     ms |
|                                   50th percentile service time | randomized-nested-queries-with-inner-hits_default_big_size |     33.2397 |     ms |
|                                   90th percentile service time | randomized-nested-queries-with-inner-hits_default_big_size |     49.1744 |     ms |
|                                   99th percentile service time | randomized-nested-queries-with-inner-hits_default_big_size |     66.9583 |     ms |
|                                 99.9th percentile service time | randomized-nested-queries-with-inner-hits_default_big_size |     84.0898 |     ms |
|                                  100th percentile service time | randomized-nested-queries-with-inner-hits_default_big_size |     111.146 |     ms |
|                                                     error rate | randomized-nested-queries-with-inner-hits_default_big_size |           0 |      % |


----------------------------------
[INFO] SUCCESS (took 2042 seconds)
----------------------------------

Note: Workloads List

Locust (Load Testing Framework)

Information

Locust is a Python load testing framework. It allows you to define user behavior using Python code and simulate thousands of users performing actions concurrently. You can install Locust from PyPI.

Advantages and disadvantages
  • Advantages:

    • User-Friendly Task Definition: Locust allows users to define tasks using Python code, making it easy to emulate genuine user behavior. You can create tasks like making HTTP queries, decoding responses, or performing custom actions.
    • Distributed Load Generation: Locust supports distributed load generation, allowing you to spread the load across multiple machines. This scalability is useful for testing complex systems and applications.
    • Real-Time Web-Based Monitoring: Locust provides a web-based interface that displays live metrics such as response times, requests per second, and user counts. This monitoring feature helps identify performance issues.
    • Simulating User Behavior: Users can establish user behavior patterns through scenarios. You can specify the number of users, their actions, and frequency, enabling realistic load testing.
    • Performance Metrics and Reporting: Locust tracks various performance indicators. You can link these metrics with other monitoring systems or export them for analysis.
  • Disadvantages:

    • Python Dependency: Since Locust is Python-based, familiarity with Python is necessary for creating test scripts.
    • Limited Protocol Support: While it excels in HTTP-based load testing, it may not be suitable for protocols beyond HTTP.
    • Lower Raw Throughput per Worker: Other tools can generate more requests per second from a single machine. Locust's low per-user overhead makes it well suited to highly concurrent scenarios, but producing very high loads requires distributing the test across multiple workers.
Installation
root@ubuntu2204:/home/vagrant# apt install python3-pip
root@ubuntu2204:/home/vagrant# pip install locust
Locust Test
  • Create Locust Test
root@ubuntu2204:/home/vagrant# touch locustfile.py
  • Add Locust Test Code
from locust import HttpUser, task

class HelloWorldUser(HttpUser):
    # Each simulated user repeatedly executes the methods decorated with @task
    @task
    def hello_world(self):
        # self.client is a requests-like HTTP session; every call is measured
        self.client.get("/hello")
        self.client.get("/world")
  • Run Test
root@ubuntu2204:/home/vagrant# locust
[2024-06-25 10:47:39,471] ubuntu2204/INFO/locust.main: Starting web interface at http://0.0.0.0:8089
[2024-06-25 10:47:39,477] ubuntu2204/INFO/locust.main: Starting Locust 2.29.0
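
For unattended runs (e.g. in CI) the web interface can be skipped and the load described entirely on the command line. A minimal sketch, where the user count, spawn rate, duration, and target host are illustrative values:

root@ubuntu2204:/home/vagrant# locust --headless -u 50 -r 5 --run-time 2m --host http://localhost:8080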

Note: Locust Web Interface

Fluentd (Data Collector for Unified Logging Layer)

Information

Fluentd is an open-source data collector that allows you to unify the collection and consumption of data from various sources and destinations. It provides a unified logging layer between data sources and backend systems, which allows you to decouple data sources from target systems.

One of Fluentd's most useful features is its ability to filter and enrich records as they are collected. Users can create custom filtering rules to remove unwanted records, add additional fields to records, and restructure data to make it more useful for further analysis.

In addition, it can also be used to monitor system status and alert users if a problem occurs. Users can set up alerts based on certain criteria, such as the number of errors in a given time period, and receive real-time notifications if a problem occurs.
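
As an illustration of the filtering and enrichment capability described above, a minimal sketch of a filter block using Fluentd's built-in record_transformer plugin; the debug.test tag matches the test event sent during the installation below, and the added field is arbitrary:

<filter debug.test>
  @type record_transformer
  <record>
    # Enrich every matching record with the collecting host's name
    hostname "#{Socket.gethostname}"
  </record>
</filter>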

Advantages and disadvantages
  • Advantages:

    • Pluggable Architecture: Fluentd's strength lies in its pluggable architecture. It seamlessly integrates with various data sources and outputs through a vast library of over 500 community-contributed plugins.
    • Real-Time Data Processing: Fluentd excels in real-time data processing, making it ideal for handling substantial data volumes efficiently. Whether capturing logs from servers or managing IoT device data streams, Fluentd performs well in high-throughput scenarios.
    • Cross-Platform Support: Fluentd runs on both Windows and Linux, making it a versatile choice for different environments.
    • Better Memory Usage: Written in CRuby, Fluentd consumes fewer resources compared to Logstash. It scales well and is efficient for small to medium-sized deployments.
  • Disadvantages:

    • Decentralized Ecosystem: Fluentd's decentralized plugin ecosystem means that it hosts fewer official plugins (around 10) compared to Logstash. However, it compensates with community-contributed plugins.
    • Transport and Buffering: Fluentd provides an in-built buffering system that can be configured based on needs. In contrast, Logstash relies on external queues like Redis or Kafka for consistency. Choose Fluentd when you want more straightforward configuration or Logstash when reliability is critical.
    • Parsing Complexity: While Fluentd excels in parsing both structured and unstructured logs, Logstash relies on plugins for log parsing.
Installation
root@ubuntu2204:/home/vagrant# curl -fsSL https://toolbelt.treasuredata.com/sh/install-ubuntu-jammy-fluent-package5-lts.sh | sh
root@ubuntu2204:/home/vagrant# sudo systemctl start fluentd.service
root@ubuntu2204:/home/vagrant# sudo systemctl status fluentd.service
root@ubuntu2204:/home/vagrant# curl -X POST -d 'json={"json":"message"}' http://localhost:8888/debug.test
root@ubuntu2204:/home/vagrant# tail -n 1 /var/log/fluent/fluentd.log

Note: Installing Fluent Package

Post-Installation
  • Configuration
root@ubuntu2204:/home/vagrant# cat /etc/fluent/fluentd.conf
  • Logging
root@ubuntu2204:/home/vagrant# cat /var/log/fluent/fluentd.log

Note: Post-Installation Guide

Integrating Wazuh with Fluentd

Following the official guide, it is possible to integrate Wazuh and Fluentd (Wazuh v4.8.0)

Note: Forward Alerts with Fluentd

Tsung (Distributed Load Testing Tool)

Information

Tsung is an open source distributed load testing tool. It can be used to stress servers that handle multiple protocols. Tsung is not limited to a single protocol and can simulate thousands of concurrent virtual users. It can also be distributed across multiple client machines to increase the testing load.

  • URLs of interest

  • Requirements

    • OS: Linux
    • Software: Erlang/OTP R16B03, pgsql module, mysql module, mochiweb libs, gnuplot and perl5, python and matplotlib
  • Disadvantages

    • Complex installation
    • Uses Jabber/XMPP protocol
Advantages and disadvantages
  • Advantages:

    • Erlang-Based.
    • Protocol Support: Tsung can stress test various protocols, including HTTP, WebDAV, LDAP, MySQL, PostgreSQL, SOAP, and XMPP.
    • Stability.
    • Distributed Load Testing: Tsung is designed for distributed load testing, making it suitable for large-scale applications.
    • Automated Statistics.
  • Disadvantages:

    • Learning Curve: As an Erlang-based tool, Tsung may have a learning curve for users unfamiliar with the language.
    • Complexity: Setting up and configuring Tsung for specific scenarios can be intricate.
    • Limited Protocol Coverage: While it supports several protocols, it may not cover all possible use cases or niche protocols.
Installation

To install without using the packages, the following dependencies must be satisfied:

  • Erlang/OTP R16B03 and up
  • pgsql module
  • mysql module
  • mochiweb libs
  • gnuplot and perl5
  • Bash

It is also necessary to compile the code (you have to download it beforehand):

root@ubuntu2204:/home/vagrant# ./configure
root@ubuntu2204:/home/vagrant# make
root@ubuntu2204:/home/vagrant# make install

Another option is to download the packages:

Tsung (Dist)

Tsung Tests

Tsung uses XML files to define and run its tests. The learning curve is steep, and the documentation does not provide ready-made example code to test the tool.
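
As a starting point, the following is a minimal scenario sketch assembled from the general structure described in the Tsung manual; the hosts, ports, arrival rates, and URL are placeholders and have not been validated:

<?xml version="1.0"?>
<!DOCTYPE tsung SYSTEM "/usr/share/tsung/tsung-1.0.dtd">
<tsung loglevel="notice">
  <clients>
    <!-- Machine(s) that generate the load -->
    <client host="localhost" use_controller_vm="true"/>
  </clients>
  <servers>
    <!-- System under test -->
    <server host="127.0.0.1" port="80" type="tcp"/>
  </servers>
  <load>
    <!-- Ramp up 5 new virtual users per second for one minute -->
    <arrivalphase phase="1" duration="1" unit="minute">
      <users arrivalrate="5" unit="second"/>
    </arrivalphase>
  </load>
  <sessions>
    <session name="simple-http" probability="100" type="ts_http">
      <request>
        <http url="/" method="GET" version="1.1"/>
      </request>
    </session>
  </sessions>
</tsung>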

Cypress (Testing Framework for JavaScript) (E2E Tests)

Information

Cypress is a next-generation front-end testing tool designed for modern web applications. Here are some key points:

  1. Testing Framework:

    • Cypress allows you to create tests for your web apps easily.
    • You can debug tests visually and run them automatically in your CI builds.
    • It uses Mocha and Chai for test organization and assertions.
  2. How It Works:

    • Cypress runs directly in the browser, providing real-time feedback.
    • Debugging failed tests is straightforward using familiar in-browser developer tools.
    • It eliminates flaky tests by interacting with your app consistently.
  3. CI Integration:

    • Integrate Cypress with your existing CI provider for early failure detection.
    • Use Docker images or bring your own setup.
  4. Cypress Cloud:

    • Optimize test runs with parallelization, load balancing, and spec prioritization.
    • Visually review and debug CI failures using Test Replay.
    • Monitor test suite health with detailed analytics.

In summary, Cypress simplifies end-to-end testing and enhances your testing workflow.

Advantages and disadvantages
  • Advantages:

    • Speed: Cypress executes tests quickly, allowing you to see results promptly.
    • User-Friendly Interface: Its simple and intuitive interface makes it accessible even for inexperienced developers.
    • Reliability: Tests written in Cypress are less likely to fail compared to other automation tools.
    • Flexibility: Cypress can handle various tasks, including end-to-end testing, unit testing, and integration testing.
    • Stability: Many companies have successfully used Cypress in production for years.
    • Active Community: Cypress has a vibrant community with numerous plugins and integrations available.
    • Browser-Based: It runs directly in the browser, eliminating the need for additional installations.
  • Disadvantages:

    • Limited Cross-Browser Support: Cypress may not support all browsers equally well.
    • No Native Mobile App Testing: Unlike some tools, Cypress lacks direct support for native mobile app testing.
    • Single Browser Session: It restricts access to multiple browser sessions during test execution.
    • No Direct Multiple Windows/Tabs Support: Cypress doesn't directly handle multiple windows or tabs.
Installation
root@ubuntu2204:/home/vagrant# apt update
root@ubuntu2204:/home/vagrant# curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
root@ubuntu2204:/home/vagrant# nvm install 20
root@ubuntu2204:/home/vagrant# node -v
root@ubuntu2204:/home/vagrant# npm -v
root@ubuntu2204:/home/vagrant# npm install cypress --save-dev
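
Once installed, tests are plain JavaScript spec files. A minimal sketch of an E2E check follows; the URL and expected text are placeholders, not taken from an actual Wazuh Dashboard test:

// cypress/e2e/smoke.cy.js
describe('Login page', () => {
  it('shows the login form', () => {
    // The base URL would normally be set in cypress.config.js
    cy.visit('https://localhost');
    cy.contains('Log in').should('be.visible');
  });
});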

Conclusions

Of all the tools analyzed, the most notable are Artillery, Playwright, Fluentd, and Cypress. Artillery and Cypress can also be integrated to work together; Cypress can be used in place of Playwright, but not integrated with Artillery. In Cypress's favor, it was previously used to test the Wazuh Dashboard; likewise, for Fluentd there is documentation to integrate it with Wazuh. From the tests carried out, Cypress seems more versatile and powerful than Playwright. It all depends on the specific use you want to give it.

Of the remaining tools, I would highlight OpenSearch Benchmark for Indexer cluster tests (it is a benchmark built specifically for OpenSearch). Locust could also be useful for measuring request response times, although this could be done with other tools as well.

@MARCOSD4
Member

MARCOSD4 commented Jun 20, 2024

Metrics analysis tools

Summary table

| Tool | Configuration language | Requirements | Generates report | Offline visualization | Simultaneous visualization | Automated analysis |
|------|------------------------|--------------|------------------|-----------------------|----------------------------|--------------------|
| Prometheus and Grafana | YML | None | Yes | Yes (with VictoriaMetrics) | Yes (with Python script) | Yes |
| Netdata | None (custom) | gcc, make, curl | No | - | No | Yes |
| Nagios | - | - | - | - | - | - |
| Zabbix | - | - | - | - | - | - |

Prometheus and Grafana

Information

Prometheus is an open-source systems monitoring and alerting toolkit. Prometheus collects and stores its metrics as time series data, i.e. metrics information is stored with the timestamp at which it was recorded, alongside optional key-value pairs called labels.

Prometheus consists of multiple components, like the main server which scrapes and stores time series data, client libraries for instrumenting application code, an alert manager to handle alerts, etc. Prometheus scrapes metrics from instrumented jobs, either directly or via an intermediary push gateway for short-lived jobs. It stores all scraped samples locally and runs rules over this data to either aggregate and record new time series from existing data or generate alerts. Grafana can be used to visualize the collected data.

Grafana enables you to query, visualize, alert on, and explore your metrics, logs, and traces wherever they are stored. It supports querying Prometheus, so both platforms can be used to monitor and visualize the metrics of our systems.


Advantages, Disadvantages and Examples
  • Advantages:

    • The installation of Prometheus and Grafana is very quick and simple.
    • YML configuration file.
    • The UI is user-friendly.
    • Prometheus and Grafana integration is simple.
    • Setting up discrepancy alerts using YML files
  • Disadvantages

    • It is not possible to visualize offline data; external software such as VictoriaMetrics is needed.

It is necessary to install and configure an exporter. In these examples, we are using the Node Exporter, which monitors Linux host metrics, but depending on the needs we can choose from different options. For example, there is another exporter responsible for exporting information about system processes (Process Exporter). This will be useful if we want to analyse the different daemons and processes of Wazuh.

The integration is very simple, it is done through the Grafana dashboard (UI). In it, Grafana is configured to generate a dashboard through the information that Prometheus sends. There are many dashboards and many different options and configurations to obtain the metrics. It has been tested with a very complete Node Exporter dashboard in which we can see a multitude of graphs about the Linux system metrics.
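
For reference, pointing Prometheus at an exporter only requires adding a scrape job to prometheus.yml. A minimal sketch, assuming Node Exporter running on its default port 9100:

scrape_configs:
  - job_name: "node"
    # Node Exporter default port; add one target per monitored host
    static_configs:
      - targets: ["localhost:9100"]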

Examples

image

image

image

I think this is a good tool to get metrics such as CPU, memory, file descriptors, disk operations, hardware usage of all Wazuh components: Agent, Manager, Indexer, Dashboard. It is flexible, as we can choose what kind of data we want to plot, and it offers a very complete user interface with many configurable options.

Another positive aspect of these tools is that it is possible to configure an alert module in which, if the values obtained by Prometheus differ from a threshold value, an alert will be generated with the corresponding information. This is done by constructing rules in a YML file. The configuration is really simple: you only need to create a YML file with the rule and add its path to the Prometheus configuration file. Therefore, this makes it possible to analyse the data automatically and to alert when a discrepancy with the "accepted" values is found.

Examples

I have performed a simple test where if the CPU use exceeds 2%, an alert is generated. The YML file with the rule that triggers the alert is:

groups:
  - name: example_alert_rules
    rules:
      - alert: HighCPUUsage
        expr: avg(rate(node_cpu_seconds_total{mode="system"}[1m])) > 0.02
        labels:
          severity: critical
        annotations:
          summary: "High CPU Usage"
          description: "The CPU usage is above 2%."

And then specify the path to the file in the configuration file:

rule_files:
  - "rule.yml"

The result is:

image

image


Offline visualization

As part of the analysis of the data obtained after monitoring the different metrics of the system, it is necessary to be able to export and save this data in order to be able to refer to it when necessary and to make comparisons, serving as a baseline. Once these data have been saved, it will be necessary to visualise them again.
To do this, different options were tried using only Prometheus and Grafana, but this proved quite tedious: the data must be exported from Prometheus in JSON format (no other format is supported) and imported back into Grafana, which does not accept JSON, only CSV. Displaying complex graphs with this method is also not entirely straightforward, as different queries and settings must be adjusted simultaneously. To simplify the process, it has been decided to use an external tool that acts as a database (warehouse) for the data collected by Prometheus, which can then simply be visualised in a Grafana dashboard. This tool is VictoriaMetrics.

The first thing to do is to create a snapshot of the current Prometheus data (the TSDB admin API must be enabled, e.g. with the --web.enable-admin-api flag):

# curl -XPOST localhost:9090/api/v1/admin/tsdb/snapshot

Then VictoriaMetrics must be installed and launched:

# wget https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.76.1/victoria-metrics-linux-amd64-v1.76.1.tar.gz
# tar -xvf victoria-metrics-linux-amd64-v1.76.1.tar.gz

# ./victoria-metrics-prod &

It is also necessary to install vmctl:

# wget https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.76.1/vmutils-linux-amd64-v1.76.1.tar.gz
# tar -xvf vmutils-linux-amd64-v1.76.1.tar.gz

And finally, to load the Prometheus snapshot into VictoriaMetrics:

# ./vmctl-prod prometheus --prom-snapshot ./prometheus/data/snapshots/20240702T102147Z-11f770969d68dec7/

To visualise this data in Grafana, simply add a data source from VictoriaMetrics and create a dashboard with the appropriate graphs.


Simultaneous visualization of different tests

In order to make a more exhaustive comparison between the graphs of different tests, we have investigated the possibility of joining all the data in the same graph, so that the results of one test and those of any other test can be visualised together. Combined with the previous point (offline data visualisation), this gives us a tool capable of fulfilling all the needs requested in the issue.
In addition, in graphs where a multitude of values are measured and it is difficult to tell the lines apart, Grafana can temporarily hide or show each measurement at the click of a button. This makes it possible to compare the metrics of different tests one by one, eliminating visual noise.

This can easily be done with Grafana if the tests were measured in the same time interval. For example: let's imagine that we have two different data sources, one representing the values from an old test for which we have stored its data (and which will act as a baseline), and the other representing the values taken from a current test. The first source is called "prometheus" and the second one is called "prometheus-1". In the image below we can see the CPU usage percentage of the system in two different lines, each one from a different data source:

image

Now for example, in the same graph, let's compare the percentage of CPU used by user processes, and hide the previous system CPU measurements for better comparison:

image

It is also possible to manually change the color of each of the metrics in the configuration panel on the right, making the graphs more configurable and customisable.

But, in a real case, we will have some data corresponding to a version of Wazuh taken in a time interval, and other data corresponding to another version that will have been taken in a time interval different from the previous one and possibly quite separated in time. In this case, we will need the times of both tests to be normalised, in order to be able to visualise and compare the data more easily. The problem is that Grafana does not allow this, at least in the Open Source version.
To solve this, I have created a simple Python script that normalises the times of both tests and sets them in the same interval. To do this, I have taken as an example two CSV files with data corresponding to the footprint tests for versions 4.7.4 and 4.8.0. I have imported them into Grafana using the Infinity plugin, with the times already normalised. The result is as follows:

image

If we zoom in on the graph we can see how the data differs:

image
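
The normalisation script itself is not attached here. As a rough illustration of the idea, a minimal sketch assuming two footprint CSVs with a Timestamp column (file and column names are assumptions):

import pandas as pd

def normalize(csv_path):
    """Rebase a run's timestamps so that every test starts at t = 0 seconds."""
    df = pd.read_csv(csv_path)
    ts = pd.to_datetime(df["Timestamp"])
    df["elapsed_seconds"] = (ts - ts.min()).dt.total_seconds()
    return df

# Both runs now share the same x-axis, so they can be plotted together
baseline = normalize("footprint-4.7.4.csv")   # stored baseline data
current = normalize("footprint-4.8.0.csv")    # data from the current test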



Netdata

Information

Netdata is an open-source, real-time performance and health monitoring tool for systems and applications. It provides comprehensive insights into the performance of your entire IT infrastructure, offering detailed metrics and visualizations.

There are three components in the Netdata architecture: Netdata Agents, which monitor the physical or virtual nodes of your infrastructure, including all applications and containers running on them; Netdata Parents, which create observability centralization points within your infrastructure, offload Netdata Agent functions from your production systems, and provide high availability of your data, increased data retention, and isolation of your nodes; and Netdata Cloud, which combines all your infrastructure, Agents, and Parents into one uniform, distributed, scalable monitoring database, offering custom dashboards, troubleshooting tools, etc.


Advantages, Disadvantages and Examples
  • Advantages:

    • Installation is very simple.
    • The UI is user-friendly.
  • Disadvantages:

    • Rule configuration is more complex.
    • Does not allow simultaneous display of different data.

The installation is completed by logging into Netdata and executing a single command. After this, we can access the Netdata dashboard and get graphs of different system metrics:

Examples

image

image

In this tool, there are a number of preconfigured alerts with default values to indicate when the metrics obtained exceed these values. It is also possible to create new customised rules. The configuration of the alerts is simple and is done in .conf files with the specifications in the documentation.

Examples

The file containing the rule is:

   template: 10min_cpu_usage
         on: system.cpu
      class: Utilization
       type: System
  component: CPU
host labels: _os=linux
     lookup: average -10m unaligned of user,system,softirq,irq,guest
      units: %
      every: 1m
       warn: $this > (($status >= $WARNING)  ? (1) : (2))
       crit: $this > (($status == $CRITICAL) ? (3) : (4))
      delay: down 15m multiplier 1.5 max 1h
    summary: System CPU utilization
       info: Average CPU utilization over the last 10 minutes (excluding iowait, nice and steal)
         to: sysadmin

image

image


Simultaneous visualization of different tests

Netdata does not allow the simultaneous visualisation of data from different data sources; this requires the use of platforms such as Grafana, so using Netdata alone would be a restriction on this point.



Nagios

Information

Nagios is an open-source monitoring system designed to monitor systems, networks, and infrastructure. It provides monitoring capabilities to ensure the availability and performance of IT systems and services. There are different types of Nagios solutions, but we should focus on Nagios Core and Nagios XI. Nagios XI is capable of monitoring hundreds of different types of applications, services, protocols, and computer hardware through the use of built-in capabilities and third-party extensions and addons. Nagios Core is an Open Source system and network monitoring application. It watches hosts and services that you specify, alerting you when things go bad and when they get better.


Advantages and Disadvantages
  • Disadvantages:

    • There are quite a few problems when installing Nagios following its documentation. It was tested on RHEL and Ubuntu machines, all without success.
    • The installation has only worked on CentOS 7.
    • The documentation is less intuitive.
    • The configuration to monitor the system is more complex.

In order to obtain system metrics, it is necessary to install a number of plugins and make changes to the Nagios configuration, which is more tedious than the previous tools. Therefore, we will first move forward with further research on the above tools, and if necessary, further research on Nagios will be carried out.


Zabbix

Information

Zabbix is software that monitors numerous parameters of a network and the health and integrity of servers, virtual machines, applications, services, databases, websites, the cloud, and more. Zabbix uses a flexible notification mechanism that allows users to configure email-based alerts for virtually any event. Zabbix offers reporting and data visualization features based on the stored data.

It is composed of different components, like the Zabbix server, DataBase storage, a Web interface and a Zabbix agent that are deployed on monitoring targets to actively monitor local resources and applications and report the gathered data to the Zabbix server.


Advantages and disadvantages
  • Disadvantages:

    • The installation and configuration of Zabbix is considerably more complex than the previous tools.
    • Documentation and UI are less intuitive.
    • Its configuration takes more time.

Due to the disadvantages provided by this tool and given that there are other tools that are simpler and offer better features, I do not think that Zabbix is an option to consider. Therefore, we will first move forward with further research on the above tools, and if necessary, further research on Zabbix will be carried out.


Conclusion

In conclusion, after analysing the different tools, using Prometheus as the system monitoring tool and Grafana to visualise the data collected by Prometheus is one of the most viable and favourable options we have. Netdata also seems to be a good tool, but it has some disadvantages and differences with respect to the previous two that make its use less interesting. As for Nagios and Zabbix, their configuration proved much more complicated than the rest of the tools, so their analysis was less complete, since the priority was to invest most of the time in analysing the other tools in depth; this is a great disadvantage for them.

Prometheus and Grafana seem to be a great option in terms of the objectives required in this issue, but they also have some limitations such as the simultaneous visualisation of different data, which has finally been solved with a Python script.



Update after review

After the review that has been performed, it is necessary to investigate Plotly/Dash applications to visualize performance metrics.

Plotly/Dash

Information

The Plotly Python library is an interactive plotting library that supports a wide range of use-cases. It enables users to create interactive web-based visualizations that can be displayed in Jupyter notebooks, saved to standalone HTML files, or served as part of Python-built web applications using Dash.

Dash is a framework for rapidly building data apps in Python. It is a good way to build analytical apps in Python using Plotly figures.

Main features:

  • Interactivity: Graphs created with Plotly are highly interactive. Users can zoom, pan, and hover over data points to get more information.
  • Variety of Graphs: It supports a wide range of charts, including line charts, bar charts, scatter plots, heatmaps, 3D graphs, and more.
  • Integration: Easily integrates with other Python libraries such as Pandas, NumPy, and Matplotlib.
  • Ease of Use: Dash enables the creation of complex applications with relatively little code, using a simple Python-based syntax.
  • Flexibility: Dash offers a wide range of customizable components, including graphs, tables, forms, and input controls.
  • Extensibility: Dash can be extended with other Python frameworks and libraries, and allows for the inclusion of HTML, CSS, and JavaScript for further customization.

The installation of these tools is really simple and their documentation is very intuitive and complete.
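
To give an idea of the scale involved, a minimal sketch of a Dash app that plots one metric from a footprint-style CSV; the file name and column names are assumptions:

import pandas as pd
import plotly.express as px
from dash import Dash, dcc, html

# Hypothetical footprint data with Timestamp, Daemon and CPU(%) columns
df = pd.read_csv("footprint.csv")

app = Dash(__name__)
app.layout = html.Div([
    html.H3("CPU usage per daemon"),
    # One interactive line per daemon, with zoom/pan/hover for free
    dcc.Graph(figure=px.line(df, x="Timestamp", y="CPU(%)", color="Daemon")),
])

if __name__ == "__main__":
    app.run(debug=True)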


Example of use

To show an example of use, @Rebits has created a script that demonstrates, in a few lines of code, the potential of this tool: the comparison between versions becomes really simple, simplifying the process that we currently have in the footprint tests:

poc_graph.webm

With just a small change in the code, we can get more than one daemon drawn on the graph at a time:

image

PoCFootprintv2.zip

Now we want to test another use case: checking whether real-time loading of CSV files is possible. This is motivated by the fact that, in the future, we will have a nightly job that runs the necessary processes and generates the corresponding reports. These reports will be stored somewhere, and Plotly should be able to load this data for display automatically.
To do this, we have created a simple example in which we save the CSV files in a database. Those files are then processed by Plotly to be plotted. If the Plotly server is running and new data is entered into the database, it will be displayed after reloading the page.

Example:

PoCexample2.avi.mp4

In this case, we are using a script that updates the database information (inserting the data from the new file); we only need to run the script for the database to update, and with it the Dash server.
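
That update script is not attached; the ingestion step it performs can be sketched in a few lines, assuming an SQLite database and the same CSV layout as above (names are illustrative):

import sqlite3
import pandas as pd

def ingest(csv_path, version, db_path="footprint.db"):
    """Append a new report's rows to the table the Dash app reads from."""
    df = pd.read_csv(csv_path)
    df["version"] = version  # lets the checklist filter by version later
    with sqlite3.connect(db_path) as conn:
        df.to_sql("footprint", conn, if_exists="append", index=False)

ingest("footprint-4.8.0.csv", "4.8.0")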

We will now modify the application to add a checklist that allows us to select the versions we want to plot from all those available in the database:

image

It is now necessary to test the ability of this implementation to support large amounts of data, that is, multiple files representing different versions or stages, as this will be necessary when generating reports daily.
For this purpose, 100 files have been simulated with the same format as the CSV files extracted from the current footprint tests and with fictitious data. A limitation of this application is that it cannot support such a large amount of data, causing the server to crash during the process. It is important to note that the simulated files contain data with a duration of 2.5 days, so the amount of data is considerable.
However, when tested with 50 files the server did not crash, but the loading of the data is very slow, as we can see in the example:

PoCexample3.mp4

To conclude, this comparison table between Plotly and Grafana shows the advantages and disadvantages of both tools.

| | Plotly/Dash | Grafana + Prometheus |
|--|-------------|----------------------|
| Complexity | Very simple; with a few lines of code you can do great things. | Built for managing and visualizing dynamic, time-sensitive metrics, so the configuration can be complex. |
| Focus | Designed purely for visualising data, which is simpler and directly relevant to our objective. | Primarily designed for visualizing time-series data, which is not fully in line with our objective. |
| Resource usage | Resource use is more efficient. | Using these tools for another purpose causes them to consume more resources. |
| Language | Python, which is well known to all; its documentation and libraries are very extensive, which makes configuration easier in general. | YML and their own configuration formats, which can be more complex. |

Ultimately, using Grafana seems like overkill: it offers a rich set of features that are not adapted to what we are looking for and that may complicate the process unnecessarily. With Plotly we have a very simple and intuitive way to build custom graphics, with a multitude of configurable options and extensive documentation to support us. This way of visualising will make our work much easier and allow us to advance faster. In any case, it will have to be decided whether this tool is suitable for our case and whether it will finally be chosen over Grafana and Prometheus.

@Rebits Rebits added level/task Task issue and removed level/epic labels Jun 21, 2024
@rafabailon
Member

Metrics Analysis Tools

| Tool | Advantages | Disadvantages |
|------|------------|---------------|
| Prometheus | Easy Installation and Deployment. Metrics Collection. Data Visualization. Scalability. | Configuration and Setup. Scalability. Customer Service. |
| Grafana | Customizable Dashboards. User-Friendly Visualization. | Customization Effort. Limited Visualization Styles. External Data Storage. |
| Netdata | Real-Time Monitoring. Lightweight. Automatic Dashboard Creation. | Limited Historical Data. Complex Configuration. |
| Nagios | Alerting and Incident Management. Compatible Platforms. Documentation and Support. | Installation Complexity. UI/UX Design. Scalability. |
| Zabbix | Metrics Collection. Scalability. Incident Management. | Installation and Setup. UI/UX Design. Data Visualization. |

Advantages and Disadvantages (Explained)

Prometheus
  • Advantages:

    • Easy Installation and Deployment: Prometheus is straightforward to install and deploy.
    • Metrics Collection: It excels at collecting time series data from servers.
    • Data Visualization: Although it lacks built-in visualization, it pairs well with Grafana for creating customizable dashboards.
    • Scalability: Prometheus scales well.
  • Disadvantages:

    • Configuration and Setup: Can be development-intensive.
    • Scalability: Becomes challenging at large scale.
    • Customer Service: Known for slow response times.
Grafana
  • Advantages:

    • Customizable Dashboards: Grafana allows you to create and customize dashboards for visualizing metrics.
    • User-Friendly Visualization: Provides a variety of visualization options.
  • Disadvantages:

    • Customization Effort: Customizing Grafana dashboards can be time-consuming.
    • Limited Visualization Styles: Some limitations in visual representation.
    • External Data Storage: Requires external data storage.
Netdata
  • Advantages:

    • Real-Time Monitoring: Provides real-time insights into system performance.
    • Lightweight: Minimal resource overhead.
    • Automatic Dashboard Creation: Automatically generates dashboards for various services.
  • Disadvantages:

    • Limited Historical Data: Focuses on real-time data, lacks extensive historical data storage.
    • Complex Configuration: Setting up custom monitoring can be intricate.
Nagios
  • Advantages:

    • Alerting and Incident Management: Strong alerting capabilities.
    • Compatible Platforms: Works on various Unix variants.
    • Documentation and Support: Well-documented and supported.
  • Disadvantages:

    • Installation Complexity: Requires additional setup (Apache server, Nagios Plugins).
    • UI/UX Design: Could be improved.
    • Scalability: May face challenges at scale.
Zabbix
  • Advantages:

    • Metrics Collection: Robust data collection.
    • Scalability: Scales well.
    • Incident Management: Effective alerting and incident handling.
  • Disadvantages:

    • Installation and Setup: Installation guide lacks details on database and web server setup.
    • UI/UX Design: Could be enhanced.
    • Data Visualization: Requires integration with other tools like Grafana.

@rafabailon
Member

Update

Research must be expanded to the tools that are already in use. Specifically, we must review how hardware resources are monitored (tests and footprints), as well as the alerts and Wazuh-specific measurements that are generated and cannot be covered with an external tool.

@Rebits
Member Author

Rebits commented Jun 28, 2024

Tools for saturation of Wazuh modules (internal) @santipadilla

Test stress tier
  • Path: wazuh-jenkins/jenkins-files/tests/pre-release/test_stress_tier.groovy

    • Brief description of its functionality:

      • It starts by importing a shared Jenkins library and defines various job parameters such as the version of the software being tested, the repository from where the software is fetched and test duration.

      • It defines different modules and conditions under which the Wazuh system should be tested. It uses booleans to manage which modules are activated during the test.

      • It sets up the conditions for launching these tests on different environments, specifically mentioning the use of Vagrant for certain tests in order to cover different operating systems.

      • The script is structured to launch multiple stress tests in parallel, depending on the configurations derived from the parameters and the specified modules. This is managed using Jenkins' parallel construct, which allows concurrent execution of jobs.

    • Management of Wazuh modules:

      • It uses a map called mod to store boolean flags for each module, indicating whether they are active in the test scenario. This structure allows for dynamic enabling of modules based on parameters passed to the Jenkins job.
      • It uses lists like full_tier and tier to assemble combinations of modules for testing based on the active flags in the mod map. This includes handling complex combinations such as "ALL", "ALL-EXCEPT-LOGCOLLECTOR", etc, which group multiple modules for collective testing.
      • It uses job parameters to decide which modules or combinations are included in a test, and it dynamically adjusts the Jenkins job setup, including test parameters and environment variables that influence how Wazuh operates during the tests.
Test stress
  • Path: wazuh-jenkins/jenkins-files/tests/test_stress.groovy

    • Brief description of its functionality:

      • It starts by setting up the environment, configuring necessary credentials, and preparing the system (both the manager and agents).

      • The script manages different instances (Manager and Agents), their configurations and orchestrates their interactions. This includes setting up the manager and various agents (Centos, Ubuntu, Windows) according to specified configurations.

      • The pipeline includes steps for deploying instances, configuring and registering them and finally launching the test using the configured settings.

      • After the test execution, it handles data collection, including fetching CSVs, logs, generating graphics, and managing backups.

    • Management of Wazuh modules:

      • It reads and splits the MODULES parameter to decide which modules to test. It uses a map named modules to track the status (active/inactive) of each Wazuh module.
      • If the test involves fewer than a predefined number of modules (indicated by no_all_test), a custom configuration (client-buffer-custom) is activated. This is used for detailed, scenario-specific testing, especially for configurations that are not part of the standard tests.
      • It includes detailed configurations for how each module should behave during the test, such as sleep times, limits, and specific settings.
      • Specific parameters suggest that it sets up conditions likely to create high load scenarios, such as configuring how frequently data is sent to Wazuh modules and how many items are involved. For example: fim_sleep and fim_max_files for the File Integrity Monitoring (FIM) module control how often changes are checked and the number of files monitored, which can increase the load on this module.
  • In summary, regarding the Wazuh modules:

    • In test_stress_tier.groovy, the approach to handling Wazuh modules is about deciding which combinations of modules to test in parallel and setting up the test environment accordingly. It is more about orchestrating a broad test campaign across module combinations.

    • In contrast, test_stress.groovy dives into the specifics of how each module is configured and tested, including detailed settings adjustments and direct handling of module-specific operations and data management during stress tests. This script is more about executing and managing the details of stress testing individual modules or groups of modules in a controlled environment.

Scripts in charge of module saturation
  • Path: quality/tests/stress/scripts/

    • logcollector.py

      • The script is designed to create, modify, and delete log files repeatedly within a specified duration (test_duration). This simulates a high-load environment where the Logcollector module must process a significant volume of logs continuously.
      • It accepts various command-line arguments to customize the test, such as the number of log files to create and modify (limit_create_files and limit_write_files), and the intervals at which these operations should occur (sleep_create_time and sleep_write_time).
      • By allowing parameters to be set via command line, it provides flexibility in testing under different scenarios to understand how the Logcollector handles various intensities of logging activities.
      • The function create_delete_logs manages the cycle of creating a specified number of logs, appending messages to them, sleeping for a while, and then deleting them, only to repeat the process. This creates a constant churn that the Logcollector must handle, testing its capability to keep up with real-time changes and deletions.
      • The function write_logs continuously writes to a set number of log files, simulating the ongoing generation of new log entries.
      • By continually creating and writing to files, the script ensures that the Logcollector is always busy, which simulates a saturated environment. This can help identify at what point (if any) the Logcollector module becomes overwhelmed, which is valuable for performance tuning and capacity planning. It logs its operations and any errors encountered to logcollector.log and logcollectorError.log. (A minimal sketch of this churn pattern is shown after this list.)
    • fim.py

      • By creating and deleting files and directories within specified intervals, the script simulates a dynamic environment to test how efficiently and accurately the FIM module can track and log changes.
      • The script uses loops to continuously create and delete files and directories, which is expected to trigger FIM alerts or log entries if FIM is appropriately monitoring these paths.
      • If enabled (check_log parameter), after file and directory manipulations, the script checks for specific log entries related to these events. This is used to verify that FIM is not only detecting the changes but is also logging them as expected.
      • Between creation and deletion cycles, the script pauses (based on sleep_time), simulating a realistic time gap between file operations, giving the FIM module time to process and log the changes.
      • The execution time of the script, the number of files/directories to be handled, and the duration of the sleep intervals can be customized via command-line arguments, making the test adaptable to different stress levels.
      • Results are recorded in designated log files (fim.log and fimError.log), allowing detailed post-test analysis.
    • sca.py

      • The script creates and modifies Windows registry entries. SCA should monitor these changes as they can represent configuration or security changes.
      • The operations (registry and file modifications) are performed in a loop with sleep intervals in between to simulate a continuous but variable workload. This tests the SCA module's endurance and its ability to continuously monitor changes over prolonged periods.
      • Optionally, the script can verify and log the changes it makes, ensuring that all intended operations are executed correctly and detected by the SCA module.
      • The use of threading allows the script to perform file and registry modifications simultaneously, increasing the load and complexity of the test environment, which is a direct method to stress test the SCA capabilities.
      • The test duration, number of modifications and sleep intervals can be controlled via command line arguments, providing flexibility in terms of test intensity.
      • The script logs all significant actions and errors to specific log files (sca.log and scaError.log).
    • docker.py

      • The primary focus is on testing how well the system handles Docker operations, particularly the creation and removal of containers, which can be resource-intensive.
      • The script launches specified Docker images repeatedly in a loop for the duration specified, with pauses in between launches.
      • By continuously starting and stopping containers, the script simulates a high-load environment where Docker's ability to allocate resources and manage containers is tested.
      • The script supports running multiple Docker containers in parallel by using Python's threading capabilities, further increasing the stress on the system, and it includes sleep intervals between launches.
      • The test duration, the Docker image to be used, the timeout between operations and the number of concurrent operations (threads) can be specified, making the test highly configurable.
      • The script records detailed information about each operation and the overall test.
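
To make the pattern concrete, below is a minimal sketch of the create/write/delete churn these scripts implement. It is illustrative only: the parameter names (limit_create_files, sleep_create_time, test_duration) mirror the ones described above, but the body is a simplified reconstruction, not the actual code under quality/tests/stress/scripts/.

```python
# saturate_logs.py — a minimal sketch of the create/write/delete churn pattern,
# NOT the actual stress scripts; parameter names mirror the description above.
import argparse
import logging
import os
import tempfile
import time

logging.basicConfig(filename="logcollector.log", level=logging.INFO)


def create_delete_logs(limit_create_files: int, sleep_create_time: float,
                       test_duration: int) -> None:
    """Create log files, append to them, pause, delete them, and repeat."""
    workdir = tempfile.mkdtemp(prefix="stress_logs_")
    deadline = time.time() + test_duration
    while time.time() < deadline:
        paths = []
        for i in range(limit_create_files):
            path = os.path.join(workdir, f"stress_{i}.log")
            with open(path, "a") as handle:
                handle.write("stress test message\n")
            paths.append(path)
        logging.info("created and wrote %d files", len(paths))
        time.sleep(sleep_create_time)  # let the monitored module process the burst
        for path in paths:
            os.remove(path)  # deletions must be tracked too, adding churn
        logging.info("deleted %d files", len(paths))


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--limit-create-files", type=int, default=10)
    parser.add_argument("--sleep-create-time", type=float, default=1.0)
    parser.add_argument("--test-duration", type=int, default=60)
    args = parser.parse_args()
    create_delete_logs(args.limit_create_files, args.sleep_create_time,
                       args.test_duration)
```

Running it with, e.g., `python saturate_logs.py --limit-create-files 100 --sleep-create-time 0.5 --test-duration 300` would vary the load intensity in the same way the command-line parameters of the real scripts do.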

@Rebits
Member Author

Rebits commented Jun 28, 2024

Tools for running processes and system utilization monitoring (external and internal tools) @santipadilla

Tools used

psutil
  • psutil is a cross-platform library to retrieve information about running processes and system utilization in Python. It exposes various system details such as CPU, memory, disks, network, sensors, and processes, providing a convenient and unified way to access system-related information. It can be used for system monitoring, profiling, limiting process resources, and managing running processes.

With psutil we can:

  • List all running processes.

  • Get process IDs (PIDs), names, status, and other details.

  • Access detailed information about CPU, memory, files, network connections, and other resource usage by individual processes.

  • Kill processes and set process priorities.

  • Monitor system performance.

  • In summary, this is very useful for basic monitoring tasks like checking CPU, memory, disk usage, and listing running processes. It's ideal for integration into Python scripts where process and system monitoring are required without needing deep system analysis (a short example follows).
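
As a quick illustration of the sampling loops psutil enables, here is a minimal sketch; the "wazuh" process-name filter is just an assumption for the example.

```python
import psutil

# System-wide snapshot; cpu_percent blocks for the 1-second sampling window.
print(f"CPU: {psutil.cpu_percent(interval=1)}%")
print(f"Memory: {psutil.virtual_memory().percent}%")
print(f"Disk (/): {psutil.disk_usage('/').percent}%")

# Per-process metrics, filtered here to names containing "wazuh" (an assumption
# for the example; any filter works).
for proc in psutil.process_iter(["pid", "name", "memory_info"]):
    name = proc.info["name"] or ""
    mem = proc.info["memory_info"]
    if "wazuh" in name.lower() and mem is not None:
        print(f"{proc.info['pid']:>7}  {name:<30} RSS: {mem.rss / 2**20:8.1f} MiB")
```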

Alternatives

Glances
  • Glances is a robust and versatile monitoring tool that stands out for its ability to provide a comprehensive snapshot of various system metrics in real time. It's built on top of the psutil library, enabling it to access a wide range of system details directly from Python. This makes Glances especially powerful and flexible as a monitoring solution, given its expansive capabilities coupled with an easy-to-navigate interface.

    • Differences from psutil

      • psutil is a library, providing no user interface or direct application. It is meant to be used within scripts or other applications. Glances, on the other hand, is a standalone application that uses psutil to gather data but presents it in a user-friendly, immediately usable format.

      • Glances provides an interactive display that updates in real-time, a feature not available directly in psutil. This makes it particularly useful for quick checks and ongoing monitoring without needing to write and run scripts.

      • Glances adds an alert system on top of the raw data collection, providing immediate feedback when system metrics exceed safe thresholds, which is beyond the scope of psutil.

  • In summary, it acts as an enhanced front end to psutil, providing a user-friendly, real-time monitoring interface. It shows detailed system and process metrics and is well suited to a more comprehensive and easily accessible overview of a system's performance.

SystemTap
  • SystemTap is a diagnostic tool that allows administrators and developers to write and run scripts to collect data about Linux operating system activities. Its primary use is to monitor the performance of operating systems and applications, helping to identify or anticipate performance problems. Its capabilities include:

    • Probing and monitoring a wide range of system activities, including system calls, kernel functions, and user-space processes.
    • Writing custom scripts that define what data to collect, providing flexibility and powerful insight into system behavior.
    • Tracking live systems and analyzing performance bottlenecks as they occur.
  • In summary, while not specifically a process monitoring tool, it can be used to perform deep monitoring and analysis of what processes are doing at the kernel level. It's useful for debugging and understanding complex system behavior, including process interactions with system resources.

DTrace
  • DTrace is a dynamic tracing framework originally developed for Solaris and subsequently ported to several other UNIX-like operating systems, including BSD, macOS and Linux (with some limitations).

    • It can instrument almost all system operations, both kernel and user-mode, without the need to reboot the system or specially prepare it for tracing.
    • It is designed for use in live production systems with minimal performance impact, making it safe for real-time diagnostics in sensitive environments.
    • It dynamically enables and disables probes, and the DTrace scripting language allows scripts to be written that can explore system behavior in real time.
  • In summary, similar to SystemTap, DTrace provides detailed insights into process behaviors and system operations, allowing for precise tracing and troubleshooting of performance issues. It's highly advanced and suitable for in-depth system analysis.

Tools comparison

| Tool | Advantages | Disadvantages |
| --- | --- | --- |
| psutil | Easy to use with Python; cross-platform support; provides APIs for retrieving information about system utilization (CPU, memory, disk) and processes; lightweight | Limited to basic metrics; lacks deeper system insights like kernel operations; no built-in alerting or real-time analysis |
| Glances | Provides a real-time comprehensive overview; web interface for remote monitoring; supports alerts and thresholds; extends psutil functionalities | Heavier than psutil due to more extensive data collection and interface; can be more resource-intensive; less customizable than more detailed tools like DTrace |
| SystemTap | Allows for deep system analysis and monitoring including kernel space; custom scriptable for highly detailed diagnostics; real-time data collection | Potentially high learning curve; can impact system performance if not used carefully |
| DTrace | Highly powerful and flexible; designed for dynamic tracing with minimal performance impact on production systems; can probe both user and kernel space operations | Complexity of use; requires deep understanding of system operations; limited support on Linux compared to other UNIX-like systems |

@rafabailon
Member

rafabailon commented Jul 1, 2024

Locust vs Artillery

Both Artillery and Locust are popular tools for load and performance testing.

| Information | Artillery | Locust |
| --- | --- | --- |
| Created By | Shoreditch Ops LTD | Jonatan Heyman et al. |
| License | MPL-2.0 | MIT |
| Written In | NodeJS | Python |
| Scriptable | Yes | Yes |
| Distributed Load Generation | No | Yes |
| Website | Artillery | Locust |
| Performance | Considered slower than other tools | Has improved significantly and is now considered faster than before |
| Approach | Focuses on executing load scenarios defined in YAML files; you can specify paths, timeouts, payloads, and more. | Focuses on simulating real users interacting with a web application; you can define custom behaviors for these virtual users. |
| Resources | Being lighter and faster to set up, it could be more resource-efficient. | Being written in Python, it may require fewer resources than Artillery, although this varies with the load scenario and configuration. |
| Data Returned | Returns metrics such as latency, success rate, average response time, and more, depending on the scenario configuration. | Provides detailed information about virtual user performance, such as response times, errors, and statistics. |

In short, Locust's performance has improved while Artillery's has declined. Both tools have their advantages and disadvantages, so the choice depends on your specific needs and preferences. If you are looking for a faster tool with distributed load generation, Locust might be the better choice.

Locust is currently more widely used than Artillery. Its most notable advantage is that anything you can drive with Python code, you can test with Locust, which makes it much easier to adopt. Integration with Jenkins is also simple and fast, since Locust can even be used as a library; this allows tests to run automatically, generate results, and even analyze them.

Artillery is not yet as widely used as Locust. It uses JavaScript for tests, which not only complicates test creation but also makes Jenkins integration more complex (it requires Docker to run the tests). In my tests, it was also slower at running tests, and it does not offer distributed load generation.
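
For reference, a minimal Locust test could look like the sketch below. The host, port, endpoint paths, and credentials are illustrative assumptions, not the real test suite.

```python
# locustfile.py — a minimal sketch assuming a Wazuh-API-like service; the
# endpoints and credentials below are assumptions for illustration only.
from locust import HttpUser, task, between


class ApiUser(HttpUser):
    # Each simulated user waits 1-3 seconds between tasks.
    wait_time = between(1, 3)

    def on_start(self):
        # Authenticate once per simulated user (hypothetical endpoint/credentials).
        resp = self.client.post("/security/user/authenticate",
                                auth=("wazuh", "wazuh"), verify=False)
        self.token = resp.json()["data"]["token"]

    @task
    def list_agents(self):
        # Locust aggregates response times and error rates for this request.
        self.client.get("/agents",
                        headers={"Authorization": f"Bearer {self.token}"},
                        verify=False)
```

Such a file could be run headless from Jenkins with something like `locust -f locustfile.py --host https://localhost:55000 --users 50 --spawn-rate 5 --run-time 5m --headless`, which prints aggregated statistics suitable for later analysis.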

@rafabailon
Member

rafabailon commented Jul 2, 2024

Tools for Analysis

After a first investigation, I have found the following tools for data analysis. There does not seem to be specific off-the-shelf software for this (although research continues); instead, programming languages or libraries oriented toward data analysis and graph generation are used.

| Tool | Description |
| --- | --- |
| R | Programming language and statistical environment widely used for data analysis. |
| Apache Kafka | Real-time data streaming platform to process data streams. |
| Apache Spark | Clustered data processing engine for distributed analytics. |
| Python | Programming language widely used in data science. |
Both R and Python are widely used and versatile options. Python in particular is supported by libraries such as NumPy and Pandas for data analysis. Below I include some more information about all the tools.

Tool Information

R

  • R is a programming language and software environment used primarily for statistical analysis and data visualization.
  • It is widely used in the academic community and among statistics and data science professionals.
  • You can write scripts in R to perform analysis, statistical modeling and create graphs.

Python

  • Python is a versatile and popular programming language.
  • Used in a variety of applications, including web development, automation, data science and machine learning.
  • In the context of Apache Spark, Python is used to write PySpark scripts that take advantage of Spark's distributed processing capabilities.

Apache Kafka

  • Kafka is a distributed data transmission platform.
  • Allows the ingestion, storage and processing of data streams in real time.
  • Used to build messaging systems, event processing and streaming applications.

Apache Spark

  • Spark is an in-memory data processing engine that enables distributed and parallel processing.
  • Used for big data analysis, ETL processing, machine learning and graph processing.
  • Can integrate with visualization tools like Tableau to create interactive dashboards and data-driven visualizations.

For Python there are several libraries that are useful for analyzing data and generating graphs. The advantage is that you can create a script adapted to the data at hand. This is also a drawback, since for both R and Python the scripts have to be written manually. Below is a list of useful Python libraries; a short analysis sketch follows the table.

| Library | Description | Functionality | Advantages |
| --- | --- | --- | --- |
| Pandas | Provides data structures (such as DataFrames) for data analysis and manipulation. | Cleaning, transformation, and analysis of tabular data. | Easy integration with Pandas DataFrames; efficient operations on tables |
| NumPy | Ideal for mathematical calculations on multidimensional arrays. | Linear algebra, statistics, array manipulation. | Optimized data structures; vectorized operations; advanced mathematical functions |
| Matplotlib | Used to create graphs and visualizations. | Creation of 2D and 3D graphics; data visualization. | Flexibility in customization; wide range of graphs |
| Seaborn | Useful library for statistical visualization. | Improves the appearance of Matplotlib plots, especially for data analysis. | Simple syntax; attractive style; integration with Pandas and Matplotlib |
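
As a small illustration of how these libraries combine for the kind of analysis discussed here, consider the following sketch; footprint.csv and its columns (timestamp, process, cpu_percent, rss_mib) are hypothetical.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical CSV of monitoring samples; the file and columns are assumptions.
df = pd.read_csv("footprint.csv")  # timestamp, process, cpu_percent, rss_mib

# Per-process summary statistics for the whole run.
summary = df.groupby("process")[["cpu_percent", "rss_mib"]].agg(["mean", "max", "std"])
print(summary)

# Memory usage over time, one line per monitored process.
for name, group in df.groupby("process"):
    plt.plot(group["timestamp"], group["rss_mib"], label=name)
plt.xlabel("Sample timestamp")
plt.ylabel("RSS (MiB)")
plt.legend()
plt.savefig("memory_usage.png")
```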

For R there are also several libraries that can be used to support the creation of scripts for data analysis and graph generation. Below is a list of the most popular ones.

| Tool | Description | Functionality | Advantages |
| --- | --- | --- | --- |
| dplyr | Allows you to manipulate data efficiently: filter rows, select columns, and modify variables. | Efficient data manipulation (filtering, selection, transformation) | Intuitive syntax; performance optimization |
| ggplot2 | Ideal for data visualization; lets you create beautiful, custom graphics. | Creation of custom graphics (scatter plots, bars, lines, etc.) | Highly customizable; attractive graphics |
| GWalkR | Useful for Exploratory Data Analysis (EDA); integrates htmlwidgets with Graphic Walker and turns your data into a Tableau-like user interface for visual exploration. | Exploratory Data Analysis (EDA) with a Tableau-like interface | Interactive visualization; integration with htmlwidgets |
| corrplot | Easy-to-use library that provides a variety of visualization options for correlation matrices; you can create heatmap-style graphs to represent correlations between variables. | Visualization of correlation matrices | Clear representation of relationships between variables |
| lattice | Library for creating advanced graphs in R: histograms, boxplots, density plots, and more. | Advanced plots (histograms, boxplots, density plots) | Flexibility in creating charts; support for multiple variables |

@rafabailon
Member

Grafana Open Source vs Grafana Enterprise

| Characteristics | Grafana Open Source | Grafana Enterprise |
| --- | --- | --- |
| License | Open source | Commercial |
| Premium Plugins | Not included | Included |
| Integration with Tools | Limited | Expanded |
| Authentication and Security | Basic | Advanced |
| Support and Maintenance | Community | Premium Support |

Grafana is a widely used open source visualization and monitoring tool. The Grafana Enterprise version is the commercial edition of Grafana and offers additional features not found in the open source version. Here are the key differences:

Premium Plugins

  • Grafana Enterprise includes integrations with commercial monitoring tools such as Datadog, Splunk, New Relic, AppDynamics, Oracle and Dynatrace. These plugins are created, maintained and supported by the Grafana Labs team.
  • If you are a customer of any of these companies and value having that data available in Grafana, Grafana Enterprise gives you access to all of these plugins. This enables centralized observability with Grafana as a single access point.

Authentication and Security

  • For businesses, managing access control for existing users and reducing friction when managing new users is a major challenge.
  • Grafana Enterprise offers seamless LDAP sync, data source permissions, and team sync features. This makes it easy to expand access to all relevant stakeholders and protect sensitive data without needing to completely separate Grafana instances.

According to the official documentation, this is the list of exclusive features of the Enterprise version:

  • Role-based access control to control access with role-based permissions.
  • Data source permissions to restrict query access to specific teams and users.
  • Data source query and resource caching to temporarily store query results in Grafana to reduce data source load and rate limiting.
  • Reporting to generate a PDF report from any dashboard and set up a schedule to have it emailed to whomever you choose.
  • Export dashboard as PDF
  • Custom branding to customize Grafana from the brand and logo to the footer links.
  • Usage insights to understand how your Grafana instance is used.
  • Recorded queries to see trends over time for your data sources.
  • Vault integration to manage your configuration or provisioning secrets with Vault.
  • Auditing tracks important changes to your Grafana instance to help you manage and mitigate suspicious activity and meet compliance requirements.
  • Request security makes it possible to restrict outgoing requests from the Grafana server.
  • Settings updates at runtime allows you to update Grafana settings at runtime without requiring a restart.

In general, depending on the features you need, this is a summary of which version of Grafana to use:

Grafana Open Source

  • Use Grafana Open Source if you are looking for a free open source solution.
  • It is ideal for personal projects, small businesses or development environments.
  • Offers a wide variety of data sources and visualizations.
  • Does not include premium plugins or advanced security features.

Grafana Enterprise

  • Opt for Grafana Enterprise if you need advanced features and premium support.
  • Use it in business environments with more demanding requirements.
  • Includes premium plugins for integration with commercial tools.
  • Offers advanced authentication, LDAP sync, and data source permissions.

List of Premium Plugins:

| Plugin | Use |
| --- | --- |
| AppDynamics | The AppDynamics data source plugin is the easiest way to pull AppDynamics data directly into Grafana dashboards. |
| Azure Cosmos DB | The Azure Cosmos DB data source plugin allows you to query and visualize Cosmos DB data in Grafana. |
| Azure DevOps | The Azure DevOps data source plugin allows you to query and visualize Azure DevOps data from within Grafana. |
| Catchpoint | The Catchpoint data source plugin allows you to query Tests, SLO, and RUM data. |
| Databricks | The Databricks data source allows a direct connection to Databricks to query and visualize Databricks data in Grafana. |
| Datadog | The Datadog data source plugin is the easiest way to pull Datadog data directly into Grafana dashboards. |
| DynamoDB | The DynamoDB data source allows a direct connection to DynamoDB to query and visualize data in Grafana. |
| Dynatrace | The Dynatrace data source plugin is the easiest way to pull Dynatrace data directly into Grafana dashboards. |
| GitLab | The GitLab data source plugin is the easiest way to pull GitLab data directly into Grafana dashboards. |
| Grafana Enterprise Logs | Grafana Enterprise Logs (GEL) is a commercial offering based on the open-source project Loki. The commercial offering allows you to deploy a highly-scalable, simple, and reliable logs cluster in your own data center. This app plugin gives you an easy way to manage your logs cluster. |
| Grafana Enterprise Metrics | Grafana Enterprise Metrics (GEM) is a commercial offering based on the open-source project Mimir. The commercial offering allows you to deploy a highly-scalable, simple, and reliable metrics cluster in your own data center. This app plugin gives you an easy way to manage your metrics cluster. |
| Grafana Enterprise Traces | Grafana Enterprise Traces (GET) is a commercial offering based on the open-source project Tempo. The commercial offering allows you to deploy a highly-scalable, simple, and reliable traces cluster in your own data center. This app plugin gives you an easy way to manage your traces cluster. |
| Honeycomb | The Honeycomb data source plugin is the easiest way to pull Honeycomb data directly into Grafana dashboards. |
| Jira | The Jira data source plugin is the easiest way to pull Jira data directly into Grafana dashboards. |
| Looker | The Looker data source plugin allows you to visualize data from Looker in Grafana. |
| MongoDB | With the Grafana data source plugin for MongoDB, you can interact in real time with your existing MongoDB data and unify data sets across your company into one diagnostic workspace. |
| New Relic | The New Relic data source plugin is the easiest way to pull New Relic data directly into Grafana dashboards. |
| Oracle Database | The Oracle data source plugin is the easiest way to pull Oracle data directly into Grafana dashboards. |
| PagerDuty | The PagerDuty data source plugin allows you to query incident data or visualize incidents within Grafana using annotations. |
| Salesforce | The Salesforce data source plugin is the easiest way to pull Salesforce data directly into Grafana dashboards. |
| SAP HANA® | The SAP HANA® data source plugin is the easiest way to pull SAP HANA® data directly into Grafana dashboards. |
| ServiceNow | The ServiceNow data source plugin is the easiest way to pull ServiceNow data directly into Grafana dashboards. |
| Snowflake | The Snowflake data source plugin is the easiest way to pull Snowflake data directly into Grafana dashboards. |
| Splunk | The Splunk data source plugin is the easiest way to pull Splunk data directly into Grafana dashboards. |
| Splunk Infrastructure Monitoring | The Splunk Infrastructure Monitoring (formerly known as SignalFx) data source plugin is the easiest way to pull Splunk Infrastructure Monitoring data directly into Grafana dashboards. |
| Sqlyze Datasource | A Grafana data source plugin that connects to hundreds of data sources using one language: SQL. Connect to your favorite SQL databases, NoSQL databases, and many other non-SQL sources... and query them with SQL. |
| Sumo Logic | The Sumo Logic data source plugin is the easiest way to pull Sumo Logic data directly into Grafana dashboards. |
| Wavefront | The Wavefront data source plugin is the easiest way to pull Wavefront data directly into Grafana dashboards. |

Regarding prices, Grafana does not publish the price of its Enterprise version, nor does it easily offer a demo: you need to contact Grafana to get both a demo and pricing for Grafana Enterprise. The information is not very clear, since they focus on selling their cloud services.

To consider using Grafana Enterprise we would have to see if we need any of the extra functions. If they are not necessary, the Open Source version should cover all needs.

@rafabailon
Member

rafabailon commented Jul 4, 2024

Conclusions

After finishing the investigations, it was decided to use a series of tools for different purposes.

  • Dashboard Tests: to test the functionality of the Wazuh dashboard, it was decided to use Artillery together with Playwright. Both tools integrate well with each other and use JavaScript as the language for creating the tests.

  • API Tests: for the Wazuh API it was decided to use Locust. This tool uses Python to create the tests, can be used directly as a Python library, and integrates easily with Jenkins.

  • Graph visualisation: for metrics visualization, it was decided to use Plotly together with Dash, a powerful Python stack that can render many different types of graphics. It is very simple to use: a few lines of code produce very representative graphics. This option suits our needs better than Grafana and is also easier to use, which will be very convenient for us.

  • Data Analysis: for data analysis we have not been able to find a single tool that analyzes all the different data we would obtain in a test. The best option is to use Python (or even R) to create scripts that perform the analyses, aided by the various libraries that exist to facilitate the task.

  • Modules saturation: having analyzed the scripts we currently use in footprint to simulate module saturation, there are many utilities and functionalities that can be useful and that we can reuse/refactor for module saturation.

  • Process monitoring: investigating the possible alternatives for process monitoring, we found that they have some limitations: they are less customizable and heavier than psutil. So we believe the best option for our scripts is to use the psutil library.

@wazuhci wazuhci moved this from In progress to Pending review in Release 5.0.0 Jul 4, 2024
@Rebits
Member Author

Rebits commented Jul 11, 2024

Dashboard Performance tests

It has been confirmed in a meeting with @davidjiglesias @havidarou and @juliamagan the use of Artillery and Playwright for basic performance testing. LGTM

API Tests

It has been confirmed in a meeting with @davidjiglesias @havidarou and @juliamagan the use of Locust for basic performance testing. LGTM

Data Analysis

It has been confirmed in a meeting with @davidjiglesias @havidarou and @juliamagan the use of Python libraries for statistical analysis of performance results. Further research will be carried on at the design phase. LGTM

Modules saturation

It has been confirmed in a meeting with @davidjiglesias @havidarou and @juliamagan the use of custom tools for module saturation in the Agent performance test. LGTM

Process monitoring

It has been confirmed in a meeting with @davidjiglesias @havidarou and @juliamagan the use of current libraries based on psutil for process monitoring. LGTM

Visualization (@MARCOSD4)

Grafana and Prometheus may not be the best-suited visualization method for the following reasons:

  • Temporal Data Focus: Grafana and Prometheus are primarily designed for visualizing time-series data, where the emphasis is on trends and changes over time. Since our performance data is not temporal and does not change over time, using these tools may not be efficient or necessary.

  • Complexity Overhead: Setting up and maintaining Grafana and Prometheus can be complex. These tools are built for managing and visualizing dynamic, time-sensitive metrics, which means configuring them for non-temporal data can involve unnecessary complexity and overhead.

  • Resource Intensiveness: Prometheus especially is optimized for time-series data storage and querying. Using it for non-temporal data might result in inefficient use of resources (storage, memory, processing power) compared to simpler tools that are better suited for static data.

  • Feature Overkill: Grafana offers a rich set of features tailored for time-series visualization, including dynamic dashboards, alerting based on thresholds, and detailed metric exploration. Since our case doesn't require these features, using Grafana might be overkill and unnecessarily complicate the visualization setup.

  • Simplicity Preference: For offline and non-temporal data visualization, simpler tools that focus solely on static data visualization might be more appropriate. Such tools typically have a lighter footprint, are easier to set up, and are more intuitive for users who do not need advanced time-series analysis capabilities.

For these reasons, I suggest researching Plotly/Dash applications to visualize performance metrics: https://plotly.com/python/. I have also provided a basic PoC of how metrics could appear in a future performance dashboard:
PoCFootprint.zip

It is requested to investigate these libraries and incorporate them into the analysis to finalize the research.
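
For orientation, a Plotly/Dash dashboard can be stood up in a few lines. The sketch below is a minimal, assumption-laden example (the process names and numbers are made up), not the PoC attached above.

```python
# dashboard.py — a minimal Plotly/Dash sketch of a footprint comparison view;
# the process names and values below are invented for illustration.
import pandas as pd
import plotly.express as px
from dash import Dash, dcc, html

df = pd.DataFrame({
    "process": ["wazuh-agentd", "wazuh-logcollector", "wazuh-syscheckd"] * 2,
    "version": ["4.8.0"] * 3 + ["4.9.0"] * 3,
    "cpu_mean": [1.2, 3.4, 5.1, 1.1, 3.0, 4.7],  # mean CPU (%) per process/version
})

# Grouped bars make version-to-version regressions visible at a glance.
fig = px.bar(df, x="process", y="cpu_mean", color="version", barmode="group",
             labels={"cpu_mean": "Mean CPU (%)"})

app = Dash(__name__)
app.layout = html.Div([
    html.H2("Footprint comparison"),
    dcc.Graph(figure=fig),
])

if __name__ == "__main__":
    app.run(debug=True)  # serves the dashboard on http://127.0.0.1:8050
```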

Finally, other team members have analyzed the usage of Grafana for an observability module (#4524). A meeting is pending to share knowledge in order to determine the capabilities of the current development.

@Rebits Rebits moved this from Pending review to On hold in Release 5.0.0 Jul 11, 2024
@wazuhci wazuhci moved this from On hold to In progress in Release 5.0.0 Jul 11, 2024
@MARCOSD4
Member

An in-depth analysis of Plotly and Dash has been added to this comment, and the final conclusion has been edited according to the research done.

@wazuhci wazuhci moved this from In progress to Pending review in Release 5.0.0 Jul 15, 2024
@jseg380
Member

jseg380 commented Jul 22, 2024

LGTM

@wazuhci wazuhci moved this from In review to Pending final review in Release 5.0.0 Jul 22, 2024
@Rebits
Member Author

Rebits commented Jul 22, 2024

LGTM

@Rebits Rebits closed this as completed Jul 22, 2024
@wazuhci wazuhci moved this from Pending final review to Done in Release 5.0.0 Jul 22, 2024