From 5d6f870c47d42db121ef3670d289bf0144a7d239 Mon Sep 17 00:00:00 2001
From: Robert Cowart
Date: Sun, 13 May 2018 15:48:08 +0200
Subject: [PATCH] fix table styling in readme

---
 README.md | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 59c2084..0bfae46 100644
--- a/README.md
+++ b/README.md
@@ -25,10 +25,10 @@ Elastic Stack | ElastiFlow™ 1.x | ElastiFlow™ 2.x | ElastiFlow&trade
 
 ## Requirements
 Please be aware that in production environments the volume of data generated by many network flow sources can be considerable. It is not uncommon for a core router or firewall to produce 1000s of flow records per second. For this reason it is recommended that ElastiFlow™ be given its own dedicated Logstash instance. Multiple instances may be necessary as the volume of flow data increases.
-> Due to the way NIC receive queues and the Linux kernel interact, raw UDP packet reception will be bound to a single CPU core and kernel receive buffer. While additional UDP input workers allow Logstash to share the load of processing packets from the receive buffer, it does not scale in linearly. As worker threads are increased, so is contention for buffer access. The sweetspot seems to be to use 4-core Logstash instanses, adding additional instances as needed for high-volume environments.
+> Due to the way NIC receive queues and the Linux kernel interact, raw UDP packet reception will be bound to a single CPU core and kernel receive buffer. While additional UDP input workers allow Logstash to share the load of processing packets from the receive buffer, it does not scale linearly. As worker threads are increased, so is contention for buffer access. The sweetspot seems to be to use 4-core Logstash instances, adding additional instances as needed for high-volume environments.
 
 ## Setting-up-Elasticsearch
-Currently there is no specific configuration required for Elasticsearch. As long as Kibana and Logstash can talk to your Elasticsearch cluster you should be ready to go. The index template required by Elasticsearch will uploaded by Logstash.
+Currently there is no specific configuration required for Elasticsearch. As long as Kibana and Logstash can talk to your Elasticsearch cluster you should be ready to go. The index template required by Elasticsearch will be uploaded by Logstash.
 
 At high ingest rates (>10K flows/s), or for data redundancy and high availability, a multi-node cluster is recommended.
 
@@ -66,6 +66,7 @@ logstash
 ```
 
 Copy the `elastiflow` directory to the location of your Logstash configuration files (e.g. on RedHat/CentOS or Ubuntu this would be `/etc/logstash/elastiflow` ). If you place the ElastiFlow™ pipeline within a different path, you will need to modify the following environment variables to specify the correct location:
+
 Environment Variable | Description | Default Value
 --- | --- | ---
 ELASTIFLOW_DICT_PATH | The path where the dictionary files are located | /etc/logstash/dictionaries
@@ -93,6 +94,7 @@ Edit `pipelines.yml` (usually located at `/etc/logstash/pipelines.yml`) and add
 
 ### 6. Configure inputs
 By default flow data will be recieved on all IPv4 addresses of the Logstash host using the standard ports for each flow type. You can change both the IPs and ports used by modifying the following environment variables:
+
 Environment Variable | Description | Default Value
 --- | --- | ---
 ELASTIFLOW_NETFLOW_IPV4_HOST | The IP address from which to listen for Netflow messages | 0.0.0.0
@@ -105,6 +107,7 @@ ELASTIFLOW_IPFIX_UDP_IPV4_HOST | The IP address from which to listen for IPFIX m
 ELASTIFLOW_IPFIX_UDP_IPV4_PORT | The port on which to listen for IPFIX messages via UDP | 4739
 
 Collection of flows over IPv6 is disabled by default to avoid issues on systems without IPv6 enabled. To enable IPv6 rename the following files in the `elastiflow/conf.d` directory, removing `.disabled` from the end of the name: `10_input_ipfix_ipv6.logstash.conf.disabled`, `10_input_netflow_ipv6.logstash.conf.disabled`, `10_input_sflow_ipv6.logstash.conf.disabled`. Similiar to IPv4, IPv6 input can be configured using environment variables:
+
 Environment Variable | Description | Default Value
 --- | --- | ---
 ELASTIFLOW_NETFLOW_IPV6_HOST | The IP address from which to listen for Netflow messages | [::]
@@ -117,6 +120,7 @@ ELASTIFLOW_IPFIX_UDP_IPV6_HOST | The IP address from which to listen for IPFIX m
 ELASTIFLOW_IPFIX_UDP_IPV6_PORT | The port on which to listen for IPFIX messages via UDP | 54739
 
 To improve UDP input performance for the typically high volume of flow collection, the default values for UDP input `workers` and `queue_size` is increased. The default values are `2` and `2000` respecitvely. ElastiFlow™ increases these to `4` and `4096`. Further tuning is possible using the following environment variables.
+
 Environment Variable | Description | Default Value
 --- | --- | ---
 ELASTIFLOW_NETFLOW_UDP_WORKERS | The number of Netflow input threads | 4
@@ -125,15 +129,18 @@ ELASTIFLOW_SFLOW_UDP_WORKERS | The number of sFlow input threads | 4
 ELASTIFLOW_SFLOW_UDP_QUEUE_SIZE | The number of unprocessed sFlow UDP packets the input can buffer | 4096
 ELASTIFLOW_IPFIX_UDP_WORKERS | The number of IPFIX input threads | 4
 ELASTIFLOW_IPFIX_UDP_QUEUE_SIZE | The number of unprocessed IPFIX UDP packets the input can buffer | 4096
+
 > WARNING! Increasing `queue_size` will increase heap_usage. Make sure have configured JVM heap appropriately as specified in the [Requirementa](#requirements)
 
 ### 7. Configure Elasticsearch output
 Obviously the data need to land in Elasticsearch, so you need to tell Logstash where to send it. This is done by setting these environment variables:
+
 Environment Variable | Description | Default Value
 --- | --- | ---
 ELASTIFLOW_ES_HOST | The Elasticsearch host to which the output will send data | 127.0.0.1:9200
 ELASTIFLOW_ES_USER | The password for the connection to Elasticsearch | elastic
 ELASTIFLOW_ES_PASSWD | The username for the connection to Elasticsearch | changeme
+
 > If you are only using the open-source version of Elasticsearch, it will ignore the username and password. In that case just leave the defaults.
 
 ### 8. Enable DNS name resolution (optional)
@@ -146,6 +153,7 @@ With these changes I can finally give the green light for using DNS lookups to e
 The key to good performance is setting up the cache appropriately. Most likely it will be DNS timeouts that are the source of most latency. So ensuring that a higher volume of such misses can be cached for longer periods of time is most important.
 
 The DNS lookup features of ElastiFlow™ can be configured using the following environment variables:
+
 Environment Variable | Description | Default Value
 --- | --- | ---
 ELASTIFLOW_RESOLVE_IP2HOST | Enable/Disable DNS requests | false
@@ -227,6 +235,7 @@ Provides a view of traffic to and from Autonomous Systems (public IP ranges)
 
 ## Environment Variable Reference
 The supported environment variables are:
+
 Environment Variable | Description | Default Value
 --- | --- | ---
 ELASTIFLOW_DICT_PATH | The path where the dictionary files are located | /etc/logstash/dictionaries