The Snap Telemetry Framework

Snap is an open telemetry framework designed to simplify the collection, processing and publishing of system data through a single API. The goals of this project are to:

  • Empower systems to expose a consistent set of telemetry data
  • Simplify telemetry ingestion across ubiquitous storage systems
  • Allow flexible processing of telemetry data on agent (e.g. filtering and decoration)
  • Provide powerful clustered control of telemetry workflows across small or large clusters

  1. Overview
  2. Getting Started
  3. Documentation
  4. Community Support
  5. Contributing
  6. Code of Conduct
  7. Security Disclosure
  8. License
  9. Thank You

Overview

The Snap Telemetry Framework is a project made up of multiple parts:

  • A hardened, extensively tested daemon, snapteld, and CLI, snaptel (in this repo)
  • A growing number of maturing plugins (found in the Plugin Catalog)
  • Lots of example tasks to gather and publish metrics (found in the Examples folder)

These and other terms are explained in the glossary.

[Figure: workflow-collect-process-publish]

The key features of Snap are:

  • Plugin Architecture: Snap has a simple and smart modular design. The three types of plugins (collectors, processors, and publishers) allow Snap to mix and match functionality based on user need. All plugins are designed with versioning, signing and deployment at scale in mind. The open plugin model allows for loading built-in, community, or proprietary plugins into Snap.

    • Collectors - Collectors consume telemetry data. Collectors are plugins for leveraging existing telemetry solutions (Facter, CollectD, Ohai) as well as specific plugins for consuming Intel telemetry (Node, DCM, NIC, Disk) and can reach into new architectures through additional plugins (see Plugin Authoring below). Telemetry data is organized into a dynamically generated catalog of available data points.
    • Processors - Processors provide extensible workflow injection. They can convert telemetry into another data model for consumption by existing systems, encrypt all or part of the telemetry payload before publishing, inject remote queries into a workflow for tokens, filtering, or other external calls, and filter at the agent level to reduce the ingestion load on the telemetry consumer.
    • Publishers - Store telemetry into a wide array of systems. Snap decouples the collection of telemetry from the implementation of where to send it. Snap comes with a large library of publisher plugins that allow exposure to telemetry analytics systems both custom and common. This flexibility allows Snap to be valuable to open source and commercial ecosystems alike by writing a publisher for their architectures.
  • Dynamic Updates: Snap is designed to evolve. Each scheduled workflow automatically uses the most mature plugin for that step, unless the collection is pinned to a specific version (e.g. get /intel/psutil/load/load1/v1). Loading a new plugin automatically upgrades running workflows in tasks. Plugins load dynamically, without a restart of the service or server, and each newly loaded plugin extends the metric catalog, giving access to new measurements immediately. A newer plugin version can be swapped in for an old one in a safe transaction. All of these behaviors allow for simple and secure bug fixes, security patching, and accuracy improvements in production.

  • Snap tribe: Snap is designed for ease of administration. With Snap tribe, nodes work in groups (aka tribes). Requests are made through agreement- or task-based node groups, designed as a scalable gossip-based node-to-node communication process. Administrators can control all Snap nodes in a tribe agreement by messaging just one of them. There is auto-discovery of new nodes and import of tasks and plugins from nodes within a given tribe. It is cluster configuration management made simple.

Snap is not intended to:

  • Operate as an analytics platform: the intention is to allow plugins for feeding those platforms
  • Compete with existing metric/monitoring/telemetry agents: Snap is simply a new option to use or reference

Getting Started

System Requirements

Snap needs Swagger for Go installed to update the OpenAPI specification file after a successful build. Swagger is installed automatically during the build process (make or make deps).

To install Swagger manually

Using go get (recommended):

go get -u github.com/go-swagger/go-swagger/cmd/swagger

From Debian package:

echo "deb https://dl.bintray.com/go-swagger/goswagger-debian ubuntu main" | sudo tee -a /etc/apt/sources.list
sudo apt-get update
sudo apt-get install swagger

From GitHub release:

curl -LO https://github.com/go-swagger/go-swagger/releases/download/0.10.0/swagger_linux_amd64
chmod +x swagger_linux_amd64 && sudo mv swagger_linux_amd64 /usr/bin/swagger

Snap does not have external dependencies since it is compiled into a statically linked binary. At this time, we build Snap binaries for Linux and MacOS. We also provide Linux RPM/Deb packages and a MacOS X .pkg installer.

Installation

You can obtain Linux RPM/Deb packages from Snap's packagecloud.io repository. After installation, please check and ensure /usr/local/bin:/usr/local/sbin is in your path via echo $PATH before executing any Snap commands.
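
The PATH check mentioned above can also be scripted rather than read by eye. The following is a minimal sketch; the helper name path_contains is made up for this example and is not part of Snap:

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $1 is an entry in the
# colon-separated path list $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on the two directories the Snap packages install into.
for dir in /usr/local/bin /usr/local/sbin; do
    if path_contains "$dir" "$PATH"; then
        echo "ok: $dir is in PATH"
    else
        echo "missing: $dir (add it with: export PATH=\"\$PATH:$dir\")"
    fi
done
```

Wrapping the list in colons before matching keeps a directory such as /usr/local/binaries from being mistaken for /usr/local/bin.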

RedHat 6/7:

$ curl -s https://packagecloud.io/install/repositories/intelsdi-x/snap/script.rpm.sh | sudo bash
$ sudo yum install -y snap-telemetry

Ubuntu 14.04/16.04 (see known issue with Ubuntu 16.04.1 below)

$ curl -s https://packagecloud.io/install/repositories/intelsdi-x/snap/script.deb.sh | sudo bash
$ sudo apt-get install -y snap-telemetry

We only build and test packages for a limited set of Linux distributions. For distros that are compatible with RedHat/Ubuntu packages, you can use the environment variables os= and dist= to override the OS detection script. For example, on Linux Mint 17/17.* (use dist=xenial for Linux Mint 18/18.*):

$ curl -s https://packagecloud.io/install/repositories/intelsdi-x/snap/script.deb.sh | sudo os=ubuntu dist=trusty bash
$ sudo apt-get install -y snap-telemetry

MacOS X:

If you use Homebrew, install the latest version of the Snap package, snap-telemetry:

$ brew install snap-telemetry

If you do not use homebrew, download and install Mac pkg package:

$ curl -sfL mac.pkg.dl.snap-telemetry.io -o snap-telemetry.pkg
$ sudo installer -pkg ./snap-telemetry.pkg -target /

Tarball (choose the appropriate version and platform):

$ curl -sfL linux.tar.dl.snap-telemetry.io -o snap-telemetry.tar.gz
$ tar xf snap-telemetry.tar.gz
$ cp snapteld /usr/local/sbin
$ cp snaptel /usr/local/bin

The intelsdi-x package repo contains additional information about the packages.

NOTE: snap-telemetry packages prior to 0.19.0 installed /usr/local/bin/{snapctl|snapd}; these binaries have since been renamed to snaptel and snapteld. snap-telemetry packages prior to 0.18.0 symlinked /usr/bin/{snapctl|snapd} to /opt/snap/bin/{snapctl|snapd}, which may cause conflicts with Ubuntu's snapd package. The Ubuntu 16.04.1 snapd package, version 2.13+, installs its own snapd/snapctl binaries in /usr/bin. These executables are not related to snap-telemetry. Running snapctl from the snapd package results in the following error message:

$ snapctl
error: snapctl requires SNAP_CONTEXT environment variable
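
To see which of these binaries a shell would actually run, the names can be resolved against PATH. This is a minimal sketch; the helper name describe_bin is hypothetical, and the output depends entirely on what is installed on the machine:

```shell
#!/bin/sh
# Hypothetical helper: print where a command name resolves to, if anywhere.
describe_bin() {
    if location=$(command -v "$1" 2>/dev/null); then
        echo "$1 -> $location"
    else
        echo "$1 -> not found"
    fi
}

# snaptel/snapteld are the current snap-telemetry names; snapctl/snapd may
# belong to Ubuntu's unrelated snapd package or to an old snap-telemetry install.
for bin in snaptel snapteld snapctl snapd; do
    describe_bin "$bin"
done
```

A snapctl that resolves to /usr/bin on Ubuntu 16.04.1 is most likely the one from the snapd package, going by the note above.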

NOTE: If you prefer to build from source, follow the steps in the build documentation. The alpha binaries containing the latest master branch are available here for bleeding-edge testing purposes.

Running Snap

If you installed Snap from an RPM/Deb package, you can start and stop the Snap daemon as a service:

RedHat 6/Ubuntu 14.04:

$ service snap-telemetry start

RedHat 7/Ubuntu 16.04:

$ systemctl start snap-telemetry

If you installed Snap from a binary, you can start the Snap daemon with the following commands:

$ sudo mkdir -p /var/log/snap
$ sudo snapteld --plugin-trust 0 --log-level 1 --log-path /var/log/snap &

To view the service logs:

$ tail -f /var/log/snap/snapteld.log

By default, the Snap daemon runs in standalone mode and listens on port 8181. To enable gossip mode, check out the tribe documentation. For additional configuration options such as plugin signing and port configuration, see the snapteld documentation.
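
A quick way to confirm that the daemon is listening on the default port is to send it an HTTP request. The sketch below assumes the daemon serves a REST API under /v1 with a /v1/plugins endpoint; that path, and the helper name snap_api_url, are assumptions to check against the snapteld and REST API documentation:

```shell
#!/bin/sh
# Hypothetical helper: build a URL for the daemon's REST API.
# $1 = path, $2 = host (default localhost), $3 = port (default 8181).
snap_api_url() {
    echo "http://${2:-localhost}:${3:-8181}$1"
}

# ASSUMPTION: /v1/plugins exists and answers with HTTP 200 when the daemon is up.
url=$(snap_api_url /v1/plugins)
if curl -sf --max-time 2 "$url" >/dev/null 2>&1; then
    echo "snapteld answered at $url"
else
    echo "no answer at $url (is snapteld running?)"
fi
```

If the daemon was started on a different port, pass the host and port explicitly, for example snap_api_url /v1/plugins localhost 9000.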

Load Plugins

Snap gets its power from the use of plugins. The plugin catalog contains a collection of all known Snap plugins with links to their repo and release pages.

First, let's download the file and psutil plugins (also make sure psutil is installed):

$ export OS=$(uname -s | tr '[:upper:]' '[:lower:]')
$ export ARCH=$(uname -m)
$ curl -sfL "https://github.com/intelsdi-x/snap-plugin-publisher-file/releases/download/2/snap-plugin-publisher-file_${OS}_${ARCH}" -o snap-plugin-publisher-file
$ curl -sfL "https://github.com/intelsdi-x/snap-plugin-collector-psutil/releases/download/8/snap-plugin-collector-psutil_${OS}_${ARCH}" -o snap-plugin-collector-psutil
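
Because curl runs with -s and -f here, a failed download (for example, when no release asset exists for your OS and ARCH combination) produces no visible error. A minimal sketch of a check before loading follows; the helper name check_plugin_file is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file exists and is not empty.
check_plugin_file() {
    if [ -s "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or empty: $1" >&2
        return 1
    fi
}

# File names match the -o arguments used in the download commands above.
for plugin in snap-plugin-publisher-file snap-plugin-collector-psutil; do
    check_plugin_file "$plugin" || echo "re-run the curl command for $plugin"
done
```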

Next, load the plugins into the Snap daemon using snaptel:

$ snaptel plugin load snap-plugin-publisher-file
Plugin loaded
Name: file
Version: 2
Type: publisher
Signed: false
Loaded Time: Fri, 14 Oct 2016 10:53:59 PDT

$ snaptel plugin load snap-plugin-collector-psutil
Plugin loaded
Name: psutil
Version: 8
Type: collector
Signed: false
Loaded Time: Fri, 14 Oct 2016 10:54:07 PDT

Verify plugins are loaded:

$ snaptel plugin list
NAME      VERSION    TYPE         SIGNED     STATUS    LOADED TIME
file      2          publisher    false      loaded    Fri, 14 Oct 2016 10:55:20 PDT
psutil    8          collector    false      loaded    Fri, 14 Oct 2016 10:55:29 PDT

See which metrics are available:

$ snaptel metric list
NAMESPACE                                VERSIONS
/intel/psutil/cpu/cpu-total/guest        8
/intel/psutil/cpu/cpu-total/guest_nice   8
/intel/psutil/cpu/cpu-total/idle         8
/intel/psutil/cpu/cpu-total/iowait       8
/intel/psutil/cpu/cpu-total/irq          8
/intel/psutil/cpu/cpu-total/nice         8
/intel/psutil/cpu/cpu-total/softirq      8
/intel/psutil/cpu/cpu-total/steal        8
/intel/psutil/cpu/cpu-total/stolen       8
/intel/psutil/cpu/cpu-total/system       8
/intel/psutil/cpu/cpu-total/user         8
/intel/psutil/load/load1                 8
/intel/psutil/load/load15                8
/intel/psutil/load/load5                 8
...

Running Tasks

To collect data, you need to create a task by loading a Task Manifest. The Task Manifest specifies the interval at which a set of metrics is gathered, how the data is transformed, and where the information is published. For more information, see the task documentation.
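
To make those three parts concrete, the sketch below writes a small manifest to disk. It is an illustration only, not the psutil-file.yaml example from the repository: the field names follow the general shape of Snap task manifests as far as we can tell and should be verified against the task documentation, the output path is hypothetical, and the optional transformation (process) step is left out:

```shell
#!/bin/sh
# Write an illustrative task manifest. The output path is hypothetical and the
# field names are assumptions to verify against the task documentation.
manifest="${TMPDIR:-/tmp}/snap-example-task.yaml"

cat > "$manifest" <<'EOF'
---
version: 1
schedule:
  type: "simple"      # run on a fixed interval
  interval: "1s"      # how often the metrics are gathered
workflow:
  collect:
    metrics:          # which metrics are gathered
      /intel/psutil/load/load1: {}
      /intel/psutil/load/load5: {}
      /intel/psutil/load/load15: {}
    publish:          # where the data is published
      - plugin_name: "file"
        config:
          file: "/tmp/psutil_metrics.log"
EOF

echo "wrote $manifest"
```

The metric namespaces are taken from the snaptel metric list output shown earlier, and the publisher writes to the same log file used later in this walkthrough.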

Now, download and load the psutil example:

$ curl https://raw.githubusercontent.com/intelsdi-x/snap/master/examples/tasks/psutil-file.yaml -o /tmp/psutil-file.yaml
$ snaptel task create -t /tmp/psutil-file.yaml
Using task manifest to create task
Task created
ID: 8b9babad-b3bc-4a16-9e06-1f35664a7679
Name: Task-8b9babad-b3bc-4a16-9e06-1f35664a7679
State: Running

NOTE: In subsequent commands use the task ID from your CLI output in place of the <task_id>.
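
Instead of copying the ID by hand, it can be extracted from the command output. The sketch below runs against the sample output shown above, so it works without a daemon; the helper name task_id_from_output is hypothetical, and it assumes the ID: line keeps the format shown here:

```shell
#!/bin/sh
# Hypothetical helper: read `snaptel task create` output on stdin and print
# the value of the first line that starts with "ID:".
task_id_from_output() {
    awk '/^ID:/ { print $2; exit }'
}

# Sample output copied from the example above.
sample_output='Using task manifest to create task
Task created
ID: 8b9babad-b3bc-4a16-9e06-1f35664a7679
Name: Task-8b9babad-b3bc-4a16-9e06-1f35664a7679
State: Running'

task_id=$(printf '%s\n' "$sample_output" | task_id_from_output)
echo "$task_id"

# With a running daemon, the same filter could be applied to the real command:
#   task_id=$(snaptel task create -t /tmp/psutil-file.yaml | task_id_from_output)
#   snaptel task watch "$task_id"
```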

This starts a task collecting metrics via psutil, then publishes the data to a file. To see the data published to the file (CTRL+C to exit):

$ tail -f /tmp/psutil_metrics.log

Or directly tap into the data stream that Snap is collecting using snaptel task watch <task_id>:

$ snaptel task watch 8b9babad-b3bc-4a16-9e06-1f35664a7679
NAMESPACE                             DATA             TIMESTAMP
/intel/psutil/cpu/cpu-total/idle      451176.5         2016-10-14 11:01:44.666137773 -0700 PDT
/intel/psutil/cpu/cpu-total/system    33749.2734375    2016-10-14 11:01:44.666139698 -0700 PDT
/intel/psutil/cpu/cpu-total/user      65653.2578125    2016-10-14 11:01:44.666145594 -0700 PDT
/intel/psutil/load/load1              1.81             2016-10-14 11:01:44.666072208 -0700 PDT
/intel/psutil/load/load15             2.62             2016-10-14 11:01:44.666074302 -0700 PDT
/intel/psutil/load/load5              2.38             2016-10-14 11:01:44.666074098 -0700 PDT

Nice work - you're all done with this example. Depending on how you started the snap-telemetry service earlier, use the appropriate command to stop the daemon:

  • init.d service: service snap-telemetry stop
  • systemd service: systemctl stop snap-telemetry
  • ran snapteld manually: sudo pkill snapteld

When you're ready to move on, walk through other uses of Snap available in the Examples folder.

Building Tasks

Documentation for building a task can be found here.

Plugin Catalog

All known plugins are tracked in the plugin catalog and are tagged as collectors, processors and publishers.

If you would like to write your own, read through Author a Plugin to get started. Let us know if you begin to write one by joining our Slack channel. When you finish, please open a Pull Request to add yours to the catalog!

Documentation

Documentation for Snap will be kept in this repository for now, with an emphasis on filling out the docs/ directory. We would also like to link to external how-to blog posts as people write them. Read about contributing to the project for more details.

To learn more about Snap and how others are using it, check out our blog. A good first post to read is My How-to for the Snap Telemetry Framework by @mjbrender.

Examples

More complex examples of using Snap Framework configuration, Task Manifest files and use cases are available under the Examples folder. There are also interesting examples of using Snap in every plugin repository. For the full list of plugins, review the Plugin Catalog.

Community Support

This repository is one of many in the Snap framework and has maintainers supporting it. We love contributions from our community along the way. No improvement is too small.

This note is especially important for plugins. While the Snap framework is hardened through tons of use, plugins mature at their own pace. If you have subject matter expertise related to a plugin, please share your feedback on that repository.

Contributing

We encourage contributions from the community. Snap needs:

  • Contributors: We always appreciate more eyes on the core framework and plugins
  • Feedback: Try it and tell us about it on our Slack team, through a blog post, or on Twitter with #SnapTelemetry
  • Integrations: Snap can collect from and publish to almost anything by authoring a plugin

To contribute to the Snap framework, see our CONTRIBUTING.md file. To give back to a specific plugin, open an issue on its repository. Snap maintainers aim to address comments and questions as quickly as possible. To get some attention on an issue, reach out to us on Slack, or open an issue to get a conversation started.

Author a Plugin

The power of Snap comes from its open architecture and its growing community of contributors. You can be one of them:

Add to the ecosystem by building your own plugins to collect, process or publish telemetry.

Become a Maintainer

Snap maintainers are here to help guide Snap, the plugins, and the community forward in a positive direction. Maintainers of Snap and the Intel-created plugins are selected based on contributions to the project and recommendations from other maintainers. The full list of active maintainers can be found here.

Interested in becoming a maintainer? Check out Responsibilities of a Maintainer and open an issue here to discuss your interest.

Code of Conduct

All contributors to Snap are expected to be helpful and encouraging to all members of the community, treating everyone with a high level of professionalism and respect. See our code of conduct for more details.

Security Disclosure

The Snap team takes security very seriously. If you have any issue regarding security, please notify us by sending an email to snap-security@intel.com and not by creating a GitHub issue. We will follow up with you promptly with more information and a plan for remediation.

License

Snap is Open Source software released under the Apache 2.0 License.

Thank You

And thank you! Your contribution, through code and participation, is incredibly important to us.