This project is no longer actively maintained and the repository has been archived.
This release provides Cloud Foundry and BOSH integration with Google Cloud Platform's Stackdriver Logging and Monitoring.
Functionality is provided by 3 jobs in this release:
- A nozzle job for forwarding Cloud Foundry Firehose data to Stackdriver
- A Fluentd job for forwarding syslog and template logs to Stackdriver Logging
- A Stackdriver Monitoring Agent job for sending VM health metrics to Stackdriver Monitoring
The following is generally available:
- Stackdriver Host Monitoring Agent (stackdriver-agent)
- Stackdriver Host Logging Agent (google-fluentd)
- Stackdriver Nozzle (stackdriver-nozzle)
  - Stackdriver Logging for Cloud Foundry Log Events (LogMessage, Error, HttpStartStop)
  - Stackdriver Monitoring for Cloud Foundry Metric Events (ContainerMetric, ValueMetric, CounterEvent)
The following is in beta:
- Stackdriver Nozzle
  - Stackdriver Logging for Cloud Foundry Metric Events (ContainerMetric, ValueMetric, CounterEvent)
The project was developed in partnership between Google and Pivotal and was maintained by Google until the repository was archived.
Ensure the Stackdriver Logging and Stackdriver Monitoring APIs are enabled.
Depending on the size of the Cloud Foundry deployment and which events the nozzle forwards, it is easy to reach the default Stackdriver quotas.
Google quotas can be viewed and managed on the API Quotas Page. An operator can raise the default quota up to a limit; beyond that limit, use the contact links to request a higher quota.
All of the jobs in this release authenticate to Stackdriver Logging and Monitoring via Service Accounts. Follow the GCP documentation to create a service account via gcloud with the following roles:
roles/logging.logWriter
roles/logging.configWriter
roles/monitoring.metricWriter
You can authenticate the job(s) either by specifying the service account in the cloud_properties for the resource pool running the job(s) or by configuring credentials.application_default_credentials in the job spec.
You may also read the access control documentation for more general information about how authentication and authorization work for Stackdriver.
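Before handing a key to a job, it can help to confirm that the file really is a service account key. The following is a minimal sketch, not part of this release: checkServiceAccountKey is a hypothetical helper, and the field names it inspects (type, project_id, client_email, private_key) are assumed from the JSON key files that GCP generates. It only checks the shape of the file and never contacts GCP.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// serviceAccountKey holds the subset of fields inspected here. The field
// names are assumed from the JSON key files GCP generates for service accounts.
type serviceAccountKey struct {
	Type        string `json:"type"`
	ProjectID   string `json:"project_id"`
	ClientEmail string `json:"client_email"`
	PrivateKey  string `json:"private_key"`
}

// checkServiceAccountKey (hypothetical helper) parses raw key JSON and
// reports the first obvious problem it finds. It does not verify roles.
func checkServiceAccountKey(raw []byte) error {
	var key serviceAccountKey
	if err := json.Unmarshal(raw, &key); err != nil {
		return fmt.Errorf("key is not valid JSON: %w", err)
	}
	if key.Type != "service_account" {
		return fmt.Errorf("unexpected key type %q", key.Type)
	}
	if key.ProjectID == "" || key.ClientEmail == "" || key.PrivateKey == "" {
		return fmt.Errorf("key is missing project_id, client_email, or private_key")
	}
	return nil
}
```

A caller would read the key file with os.ReadFile and pass the bytes to checkServiceAccountKey; a nil result only means the file looks like a service account key, not that the account holds the roles listed above.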
To use any of the jobs in this BOSH release, first upload it to your BOSH director:
bosh2 upload-release https://storage.googleapis.com/bosh-gcp/beta/stackdriver-tools/latest.tgz
The stackdriver-tools.yml sample BOSH 2.0 manifest illustrates how to use all 3 jobs in this release (nozzle, host logging, and host monitoring). You can deploy the sample with the following commands:
bosh2 upload-stemcell https://bosh.io/d/stemcells/bosh-google-kvm-ubuntu-trusty-go_agent
bosh2 update-cloud-config -n manifests/cloud-config-gcp.yml \
  -v zone=... \
  -v network=... \
  -v subnetwork=... \
  -v "tags=['stackdriver-nozzle']" \
  -v internal_cidr=... \
  -v internal_gw=... \
  -v "reserved=[10....-10....]"
bosh2 deploy manifests/stackdriver-tools.yml \
  -d stackdriver-nozzle \
  --var=firehose_endpoint=https://.. \
  --var=firehose_username=stackdriver_nozzle \
  --var=firehose_password=... \
  --var=skip_ssl=false \
  --var=gcp_project_id=... \
  --var-file=gcp_service_account_json=path/to/service_account.json
This will create a self-contained deployment that sends Cloud Foundry firehose data, host logs, and host metrics to Stackdriver.
Deploying each job individually is described in detail below.
Create a new deployment manifest for the nozzle. See the example manifest for a full deployment and the jobs.stackdriver-nozzle section for the nozzle.
To reduce message loss, operators should run a minimum of two instances. With two instances, updating stemcells and other destructive BOSH operations will still leave an instance draining logs.
The loggregator system will round-robin messages across multiple instances. If the nozzle can't handle the load, consider scaling to more than two nozzle instances.
The spec describes all the properties an operator should modify.
Stackdriver can automatically detect and report errors from stack traces in logs. However, this does not automatically work with Loggregator because it sends each line from app output as a separate log message to the nozzle. To enable this feature of Stackdriver, apps will need to manually encode stacktraces on a single line so that the stackdriver-nozzle can send them as single messages to Stackdriver.
This is accomplished by replacing newlines in stacktraces with a unique character, which is set using the firehose.newline_token template variable in the nozzle, so that the nozzle can reconstruct the stacktrace on multiple lines.
For example, if firehose.newline_token is set to ∴, a Go app would need to implement something like the following:
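package main

import (
	"fmt"
	"os"
	"runtime"
	"strings"
)

// newlineToken must match the nozzle's firehose.newline_token property.
const newlineToken = "∴"

func main() {
	// ...
	defer handlePanic()
	// ...
}

// handlePanic logs a recovered panic, then the stack trace as one line.
func handlePanic() {
	e := recover()
	if e == nil {
		return
	}

	stack := make([]byte, 1<<16)
	stackSize := runtime.Stack(stack, true)
	out := string(stack[:stackSize])

	// The panic message ends with a newline so it is logged on its own.
	fmt.Fprintf(os.Stderr, "panic: %v\n", e)
	// Fprintln avoids treating '%' in the trace as a format verb.
	fmt.Fprintln(os.Stderr, strings.Replace(out, "\n", newlineToken, -1))
	os.Exit(1)
}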
This outputs the stacktrace separately from the panic so that the panic remains in the logs and the stacktrace is logged by itself. This allows Stackdriver to detect the stacktrace as an error.
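The round trip can be illustrated with two small pure functions. This is a sketch under stated assumptions, not the nozzle's actual implementation: encodeNewlines and restoreNewlines are hypothetical names, and the token is assumed never to occur in ordinary log output.

```go
package main

import "strings"

// encodeNewlines (hypothetical helper) collapses a multi-line stack trace
// onto one line by replacing every newline with token.
func encodeNewlines(trace, token string) string {
	return strings.ReplaceAll(trace, "\n", token)
}

// restoreNewlines (hypothetical helper) reverses encodeNewlines. An empty
// token leaves the message untouched, since there is nothing to restore.
func restoreNewlines(msg, token string) string {
	if token == "" {
		return msg
	}
	return strings.ReplaceAll(msg, token, "\n")
}
```

Because restoring replaces every occurrence of the token, a token that also appears in normal application output would be turned into spurious line breaks, which is why an unusual character such as ∴ is a sensible choice.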
For an example in Java, see this section of the Loggregator documentation.
The google-fluentd template uses Fluentd to send both syslog and template logs (assuming that template jobs are writing logs into /var/vcap/sys/log/*/*.log) to Stackdriver Logging.
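To see which files the pattern above would cover, the sketch below checks candidate paths against it. It assumes Fluentd expands the glob with conventional semantics in which * does not cross a directory separator; matchesHostLogPattern is a hypothetical helper and the paths mentioned afterwards are illustrative only.

```go
package main

import "path"

// hostLogPattern is the glob quoted above for job logs.
const hostLogPattern = "/var/vcap/sys/log/*/*.log"

// matchesHostLogPattern (hypothetical helper) reports whether p sits exactly
// one directory below /var/vcap/sys/log and ends in .log. path.Match never
// lets "*" match a "/".
func matchesHostLogPattern(p string) bool {
	ok, err := path.Match(hostLogPattern, p)
	return err == nil && ok
}
```

Under that assumption, /var/vcap/sys/log/nats/nats.log matches, while a log nested one level deeper or a rotated file such as nats.log.1 does not.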
To forward host logs from BOSH VMs to Stackdriver, co-locate the google-fluentd template with an existing job whose host logs should be forwarded.
Include the stackdriver-tools release in your existing deployment manifest:
releases:
  ...
  - name: stackdriver-tools
    version: latest
  ...
Add the google-fluentd template to your job:
jobs:
  ...
  - name: nats
    templates:
      - name: nats
        release: cf
      - name: metron_agent
        release: cf
      - name: google-fluentd
        release: stackdriver-tools
  ...
The stackdriver-agent template uses the Stackdriver Monitoring Agent to collect VM metrics to send to Stackdriver Monitoring.
To forward host metrics from BOSH VMs to Stackdriver, co-locate the stackdriver-agent template with an existing job whose host metrics should be forwarded.
Include the stackdriver-tools release in your existing deployment manifest:
releases:
  ...
  - name: stackdriver-tools
    version: latest
  ...
Add the stackdriver-agent template to your job:
jobs:
  ...
  - name: nats
    templates:
      - name: nats
        release: cf
      - name: metron_agent
        release: cf
      - name: stackdriver-agent
        release: stackdriver-tools
  ...
Specify the jobs as addons in your runtime config to deploy Stackdriver Monitoring and Logging agents on all instances in your deployment. Do not specify the jobs as part of your deployment manifest if you are using the runtime config.
# runtime.yml
---
releases:
  - name: stackdriver-tools
    version: latest
addons:
  - name: stackdriver-tools
    jobs:
      - name: google-fluentd
        release: stackdriver-tools
      - name: stackdriver-agent
        release: stackdriver-tools
To update the runtime config:
bosh2 update-runtime-config -d <your deployment> runtime.yml
Then redeploy your manifest:
bosh2 deploy -d <your deployment> path/to/manifest.yml
google-fluentd is versioned by the Gemfile in src/google-fluentd. To update fluentd:
- Update the version specifier in the Gemfile (if necessary)
- Update Gemfile.lock:
  bundle update
- Create a vendor cache from the Gemfile.lock:
  bundle package
- Tar and compress the vendor folder:
  tar zvc vendor > google-fluentd-vendor-<VERSION>-plugin-<VERSION>.tgz
- Update the vendor version in the google-fluentd package packaging and spec
- Add vendored cache to the BOSH blobstore:
  bosh2 add-blob google-fluentd-vendor-<VERSION>-plugin-<VERSION>.tgz google-fluentd-vendor/google-fluentd-vendor-VERSION-NUMBER.tgz
- Create a dev release and deploy it to verify that all of the above worked
- Update the BOSH blobstore:
  bosh upload-blobs
- Commit your changes
Both the nozzle and the fluentd jobs can run on bosh-lite. To generate a working manifest, start from the bosh-lite-example-manifest. Note the application_default_credentials property, which should be filled in with the contents of a Google service account key.
For details on how to contribute to this project - including filing bug reports and contributing code changes - please see CONTRIBUTING.md.
Copyright (c) 2016 Ferran Rodenas. See LICENSE for details.