Argus Monitor edited this page Jun 23, 2024 · 31 revisions

Welcome

Welcome to the Argus clients wiki! Here we'll try to use the powers of GitHub to develop and manage new features of argus data processing.

The Argus project is composed of two efforts.

  1. Network flow data generation
  2. Flow data processing

The network flow data generation has been referred to as the Argus Server, and the processing component is referred to as the Argus Clients. Over the years, we've moved to referring to the network flow sensor as Argus, and Clients as the sensor data processing elements.

Argus clients can be categorized into 5 basic groups:

  1. Data Collection
  2. Data Distribution
  3. Data Storage Management
  4. Data Analytics
  5. Data Visualization

In Argus 5.0 we made contributions to all 5 groups, with a focus on analytics. We moved some of the commercial ArgusPro features into open source, and we've added a significant amount of tech to the clients distribution to demonstrate the features. This includes 128-bit Argus source IDs, Argus events, expanded behavioral analytics, JSON processing, conversion of foreign flow data into the Argus processing system, enhanced content capture and processing, and new tunnel support.

Argus 5.0 is focused on generating argus data at as many points in the network as possible, including external and internal high speed links, workgroup edges, endpoints and wireless access points. This is important for addressing the cyber security challenges that enterprises face today, which call for granular visibility inside the enterprise to support effective cyber detection and forensics. With increased network visibility inside the enterprise, there are new opportunities for sophisticated detections by correlating data from multiple points in the network at or near the same time.

Because Argus has already been ported to most endpoint operating systems and OpenWRT access points, we have a good start on getting a lot of sensors into an environment. As a part of improving visibility throughout the network, we're also going to import data from other flow systems. Argus already processes NetFlow and IPFIX records, but there are a lot of other flow data strategies out there. In particular, we'll want to import Zeek connection logs, as many organizations generate Zeek data, Google VPC flows, and possibly some of the single letter flows, like Qflow, Jflow, and maybe Kflow records.

Endpoint Argus Support

The open source argus code is very portable, and runs on a number of operating systems, including Linux and its variants, RHEL, Rocky, Ubuntu, Debian, Kali, FreeBSD, CentOS, Fedora, OpenSUSE (and all of their subvariants), Windows, MacOS, AIX, SunOS, HPUX, Solaris, IRIX, CrayOS, VxWorks, PSoS, and OpenWRT, so we have a good start.

Argus 5.0 sensors run great on endpoints, and with a few changes to libpcap, we can attain < 0.5% avg CPU utilization for an argus daemon on most commercial endpoints (Windows, MacOS, Linux). There are specific features that are useful to achieve complete network accountability on endpoints, as there are a lot of interface types that we all would like to monitor. Bluetooth interfaces, RadioTaps, USB devices, VPNs, even docker interfaces are fair game for monitoring on an endpoint. And of course there are a lot of different types of endpoints now ... cloud based VMs and containers are an important part of the mix.

Argus Source ID Modifications

To make it easier to manage large numbers of endpoint sensors, argus supports using the hostuuid as the argus source id. With this feature, argus can be deployed as a zero-configuration daemon (no conf file mods needed), and to improve visibility on endpoints, argus will generally add the 'inf' to the flow key of every flow it monitors. This means that argus-clients should expect flow data from endpoint argi to carry a 128-bit source id and a 4-char interface identifier indicating where the flow was monitored.

128-bit ARGUS_MONITOR_IDs are pretty unwieldy. To make data processing easier, all ra* programs can use a RA_SRCID_ALIAS file to alias short names for the big uuid identifiers. The aliases are "node"s and can be printed, filtered, etc ...

[carter@red clients]$ ra -S localhost -up 3 -s stime dur proto saddr dir daddr pkts bytes node inf
       StartTime    Dur  Proto          SrcAddr   Dir            DstAddr  TotPkts                                    Sid  Node  Inf 
  1719085355.895  0.000    arp    192.168.1.254   who       192.168.1.49        2   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5
  1719085355.895  4.343    tcp    192.168.1.254   <?>       192.168.1.49       36   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5
  1719085356.150  0.000   igmp     192.168.1.17    ->    239.255.255.250        1   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5
  1719085358.107  0.000    udp    192.168.1.131    ->    239.255.255.250        1   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5
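To make the aliasing idea concrete, here is a minimal Python sketch that resolves a 128-bit uuid source id to a short node name. It is illustrative only: the in-memory `ALIASES` table and the `node_for` helper are hypothetical, and this is not the RA_SRCID_ALIAS file format or the ra* implementation. The uuid and the 'red' alias are taken from the output above.

```python
import uuid

# Hypothetical in-memory alias table (NOT the RA_SRCID_ALIAS file format).
# The uuid and alias come from the sample ra output above.
ALIASES = {uuid.UUID("5eeb2183-59e2-45e8-848f-e48f5d30157e"): "red"}


def node_for(srcid, aliases=ALIASES):
    """Return the short node alias for a uuid source id.

    Falls back to the canonical uuid string when no alias is defined.
    """
    key = uuid.UUID(srcid)  # normalizes case and validates the 128-bit id
    return aliases.get(key, str(key))


if __name__ == "__main__":
    print(node_for("5eeb2183-59e2-45e8-848f-e48f5d30157e"))
```

Keying the table on a parsed `uuid.UUID` rather than on the raw string means upper-case and lower-case spellings of the same source id resolve to the same node.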

And filtering using the 'node' is supported in all ra* programs ... HOWEVER ... when reading realtime data from a remote argus, because the remote argus may not have access to the RA_SRCID_ALIAS file, you should apply the filter to the calling ra*, using the 'local' filter directive.

[carter@red clients]$ ra -S remote:561 -up 3 -s stime dur:6 proto saddr:16 dir daddr pkts sid:38 node:5 inf - local node red
       StartTime    Dur  Proto          SrcAddr   Dir            DstAddr  TotPkts                                    Sid  Node  Inf 
  1719089205.139  4.061    tcp    192.168.1.254   <?>       192.168.1.49       50   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5
  1719089205.347  3.005    udp    192.168.1.131    ->    239.255.255.250        7   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5
  1719089206.771  1.934 ipv6-*               ::    ->                 ::        3   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5
  1719089209.139  4.004    udp    192.168.1.131    ->    239.255.255.250        3   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5
  1719089209.207  0.001    udp     192.168.1.49   <->        192.168.1.1        2   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5
  1719089209.207  0.158    udp     192.168.1.49   <->        192.168.1.1        2   5eeb2183-59e2-45e8-848f-e48f5d30157e   red e0s5

JSON Output Format

All ra* programs can now write their output as JSON. The output types that argus-3.x supported were space-separated, comma-separated, and arbitrary token-separated files, along with XML output. XML has lost its usefulness with the introduction of JSON, and with the use of CSV and JSON as the primary input formats for AI/ML routines, argus-clients has made the shift to supporting JSON as its primary output.

Argus has also modified its ArgusLabel structure to support JSON-formatted buffers. ra* programs still support the basic Metadata standards for Labels.
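As a rough illustration of why JSON output is convenient for downstream tooling, the following Python sketch totals packets per node from a stream of JSON records. It assumes one JSON object per line, with keys named `node` and `pkts`; both the line-per-record layout and the key names are assumptions for this example, not a description of what ra* actually emits. The sample records are hand-written, not captured ra output.

```python
import json
from collections import Counter


def packets_per_node(lines):
    """Sum the 'pkts' value per 'node' across newline-delimited JSON records.

    The key names 'node' and 'pkts' are assumptions made for this sketch.
    """
    totals = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        rec = json.loads(line)
        totals[rec.get("node", "unknown")] += int(rec.get("pkts", 0))
    return dict(totals)


# Hand-written sample records for illustration only.
sample = [
    '{"node": "red", "proto": "arp", "pkts": 2}',
    '{"node": "red", "proto": "tcp", "pkts": 36}',
]

if __name__ == "__main__":
    print(packets_per_node(sample))
```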

Converting Foreign Flow Data to Argus

Argus 5.0 clients provide a raconvert.1 program to convert ASCII flow data to argus binary format. This has been used to control flow data export by converting argus binary data to ASCII format, so that it can be inspected and reduced, and then converted back to a binary format for processing. One would expect this in highly controlled environments, or when sharing flow data with an external partner. By converting an ASCII format back to binary, you 'know' what data will be in the binary file, or rather, you know what is not in the binary. This is important for excluding DSRs that may contain content, dns names, etc ...

This facility supports converting CSV files, as well as JSON-formatted data.
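The "inspect and reduce" step on the ASCII side can be as simple as an allowlist of columns. The Python sketch below keeps only the named CSV columns and drops everything else before the file would be handed back for conversion to binary. The `keep_columns` helper is hypothetical, and the column names and sample row are illustrative assumptions (with `suser` standing in for a content-bearing column); this is not part of raconvert.

```python
import csv
import io


def keep_columns(csv_text, allowed):
    """Return csv_text with only the columns named in 'allowed', in original order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [name for name in (reader.fieldnames or []) if name in allowed]
    out = io.StringIO()
    # extrasaction="ignore" silently drops any column not in 'kept'.
    writer = csv.DictWriter(out, fieldnames=kept, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Illustrative input only; 'suser' stands in for a column that may hold content.
raw = (
    "stime,saddr,daddr,pkts,suser\n"
    "1719085355.895,192.168.1.254,192.168.1.49,2,payload-bytes\n"
)

if __name__ == "__main__":
    print(keep_columns(raw, {"stime", "saddr", "daddr", "pkts"}), end="")
```

An allowlist is used rather than a blocklist so that any column not explicitly approved is excluded by default, which matches the goal of knowing what is not in the resulting binary.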

Zeek conn.logs to Argus Records

Argus can natively read NetFlow V 4,5 and flow-tools flow formats. And as of argus-clients.3.0.8.4, argus can convert JSON-formatted Zeek conn.logs into Argus binary formats using our existing program raconvert.1. We chose JSON because we added JSON processing to the argus client library, but we can just as easily handle non-JSON formats as well.

We extended raconvert.1 to take in a conversion map, using the '-f conversion.map' command-line option. The specific support for converting Zeek conn.logs is provided through the support/Config/raconvert.zeek.conf file. This sample raconvert conversion map should work for all the basic Zeek conn.log variables, and as new ones are added, this file will need to be updated.
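Conceptually, a conversion map pairs each foreign field name with an Argus field name. The Python sketch below shows that idea for a handful of standard Zeek conn.log fields. It is only an illustration of the mapping concept: the `ZEEK_TO_ARGUS` table and `convert` helper are hypothetical, the pairing of fields is an assumption, and none of this reflects the actual syntax or contents of raconvert.zeek.conf.

```python
import json

# Hypothetical field-name mapping from Zeek conn.log keys to Argus field names.
# This is NOT the raconvert.zeek.conf format, only the idea behind it.
ZEEK_TO_ARGUS = {
    "ts": "stime",
    "duration": "dur",
    "proto": "proto",
    "id.orig_h": "saddr",
    "id.orig_p": "sport",
    "id.resp_h": "daddr",
    "id.resp_p": "dport",
    "orig_pkts": "spkts",
    "resp_pkts": "dpkts",
    "orig_bytes": "sbytes",
    "resp_bytes": "dbytes",
}


def convert(zeek_line, mapping=ZEEK_TO_ARGUS):
    """Rename the mapped keys of one JSON conn.log line; unmapped keys are dropped."""
    rec = json.loads(zeek_line)
    return {mapping[k]: v for k, v in rec.items() if k in mapping}


if __name__ == "__main__":
    # Hand-written sample line for illustration only.
    line = ('{"ts": 1719085355.895, "uid": "Cx1", "id.orig_h": "192.168.1.49", '
            '"id.resp_h": "192.168.1.1", "proto": "udp", "orig_pkts": 1}')
    print(convert(line))
```

Dropping unmapped keys mirrors the point made in the text: when Zeek adds new conn.log variables, the map has to be updated before they can carry over.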

Converting Google VPC Logs to Argus Records

raconvert.1 can convert any JSON-formatted string into a flow record, if it contains a minimum set of flow identifiers. A start time, an IP address or name, some metrics, and optionally some metadata are all that is needed.
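The "minimum set" requirement can be pictured as a pre-check on each JSON record before conversion. This Python sketch tests for a start time, at least one address, and at least one metric. The `has_minimum_flow_fields` helper and its default key names are assumptions chosen for illustration; the actual fields raconvert requires are not specified here.

```python
def has_minimum_flow_fields(rec,
                            time_keys=("stime",),
                            addr_keys=("saddr", "daddr"),
                            metric_keys=("pkts", "bytes")):
    """Return True if rec has a start time, an address, and at least one metric.

    All key names are hypothetical defaults for this sketch.
    """
    def any_present(keys):
        # A key counts as present if it exists and is neither None nor empty.
        return any(rec.get(k) not in (None, "") for k in keys)

    return (any_present(time_keys)
            and any_present(addr_keys)
            and any_present(metric_keys))


if __name__ == "__main__":
    ok = {"stime": 1719085355.895, "saddr": "192.168.1.254", "pkts": 2}
    missing_time = {"saddr": "192.168.1.254", "pkts": 2}
    print(has_minimum_flow_fields(ok), has_minimum_flow_fields(missing_time))
```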

This approach should work very well with Google VPC flow logs. If we can find some real examples of VPC flow logs, we can generate a raconvert.google.conf conversion map. Should be pretty easy ...
