diff --git a/CHANGELOG-developer.next.asciidoc b/CHANGELOG-developer.next.asciidoc index 610078d225ea..791f11384c0b 100644 --- a/CHANGELOG-developer.next.asciidoc +++ b/CHANGELOG-developer.next.asciidoc @@ -208,6 +208,8 @@ The list below covers the major changes between 7.0.0-rc2 and main only. - Simplified GCS input state checkpoint calculation logic. {issue}40878[40878] {pull}40937[40937] - Simplified Azure Blob Storage input state checkpoint calculation logic. {issue}40674[40674] {pull}40936[40936] - Add field redaction package. {pull}40997[40997] +- Add support for marked redaction to x-pack/filebeat/input/internal/private. {pull}41212[41212] +- Add support for collecting Okta role and factor data for users with the Filebeat entityanalytics input. {pull}41044[41044] ==== Deprecated diff --git a/CHANGELOG.next.asciidoc b/CHANGELOG.next.asciidoc index ebd20cb190cb..72ff8083fea3 100644 --- a/CHANGELOG.next.asciidoc +++ b/CHANGELOG.next.asciidoc @@ -46,6 +46,7 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff] - Added `container.image.name` to `journald` Filebeat input's Docker-specific translated fields. {pull}40450[40450] - Change log.file.path field in awscloudwatch input to nested object. {pull}41099[41099] - Remove deprecated awscloudwatch field from Filebeat. {pull}41089[41089] +- The performance of ingesting SQS data with the S3 input has improved by up to 60x for queues with many small events. The `max_number_of_messages` config for SQS mode is now ignored, as the new design no longer needs a manual cap on messages. Instead, use `number_of_workers` to scale the ingestion rate in both S3 and SQS modes. The increased efficiency may increase network bandwidth consumption, which can be throttled by lowering `number_of_workers`. It may also increase the number of events stored in memory, which can be throttled by lowering the configured size of the internal queue. {pull}40699[40699] - System module events now contain `input.type: systemlogs` instead of `input.type: log` when harvesting log files. {pull}41061[41061] @@ -325,8 +326,10 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff] - Improved GCS input documentation. {pull}41143[41143] - Add CSV decoding capacity to azureblobstorage input {pull}40978[40978] - Add CSV decoding capacity to gcs input {pull}40979[40979] +- Add support to source AWS CloudWatch logs from linked accounts. {pull}41188[41188] - Journald input now supports filtering by facilities {pull}41061[41061] - System module now supports reading from journald. {pull}41061[41061] +- Add support to include AWS CloudWatch linked accounts when using `log_group_name_prefix` to define log group names. {pull}41206[41206] *Auditbeat* diff --git a/NOTICE.txt b/NOTICE.txt index bb5807f9a419..74fdd66fd1f1 100644 --- a/NOTICE.txt +++ b/NOTICE.txt @@ -14745,6 +14745,18 @@ THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +-------------------------------------------------------------------------------- +Dependency : github.com/elastic/go-quark +Version: v0.1.2 +Licence type (autodetected): Apache-2.0 +-------------------------------------------------------------------------------- + +Contents of probable licence file $GOMODCACHE/github.com/elastic/go-quark@v0.1.2/LICENSE.txt: + +Source code in this repository is licensed under the Apache License Version 2.0, +an Apache compatible license.
+ + -------------------------------------------------------------------------------- Dependency : github.com/elastic/go-seccomp-bpf Version: v1.4.0 @@ -23112,6 +23124,38 @@ Contents of probable licence file $GOMODCACHE/github.com/xdg-go/scram@v1.1.2/LIC of your accepting any such warranty or additional liability. +-------------------------------------------------------------------------------- +Dependency : github.com/zyedidia/generic +Version: v1.2.1 +Licence type (autodetected): MIT +-------------------------------------------------------------------------------- + +Contents of probable licence file $GOMODCACHE/github.com/zyedidia/generic@v1.2.1/LICENSE: + +MIT License + +Copyright (c) 2021: Zachary Yedidia. + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + -------------------------------------------------------------------------------- Dependency : go.elastic.co/apm/module/apmelasticsearch/v2 Version: v2.6.0 @@ -23958,11 +24002,11 @@ Contents of probable licence file $GOMODCACHE/go.elastic.co/ecszap@v1.0.2/LICENS -------------------------------------------------------------------------------- Dependency : go.elastic.co/go-licence-detector -Version: v0.6.1 +Version: v0.7.0 Licence type (autodetected): Apache-2.0 -------------------------------------------------------------------------------- -Contents of probable licence file $GOMODCACHE/go.elastic.co/go-licence-detector@v0.6.1/LICENSE: +Contents of probable licence file $GOMODCACHE/go.elastic.co/go-licence-detector@v0.7.0/LICENSE: Apache License @@ -43063,37 +43107,6 @@ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
--------------------------------------------------------------------------------- -Dependency : github.com/gobuffalo/here -Version: v0.6.7 -Licence type (autodetected): MIT --------------------------------------------------------------------------------- - -Contents of probable licence file $GOMODCACHE/github.com/gobuffalo/here@v0.6.7/LICENSE: - -The MIT License (MIT) - -Copyright (c) 2019 Mark Bates - -Permission is hereby granted, free of charge, to any person obtaining a copy -of this software and associated documentation files (the "Software"), to deal -in the Software without restriction, including without limitation the rights -to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -copies of the Software, and to permit persons to whom the Software is -furnished to do so, subject to the following conditions: - -The above copyright notice and this permission notice shall be included in all -copies or substantial portions of the Software. - -THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -SOFTWARE. - - -------------------------------------------------------------------------------- Dependency : github.com/goccy/go-json Version: v0.10.2 @@ -49821,41 +49834,6 @@ IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. --------------------------------------------------------------------------------- -Dependency : github.com/karrick/godirwalk -Version: v1.17.0 -Licence type (autodetected): BSD-2-Clause --------------------------------------------------------------------------------- - -Contents of probable licence file $GOMODCACHE/github.com/karrick/godirwalk@v1.17.0/LICENSE: - -BSD 2-Clause License - -Copyright (c) 2017, Karrick McDermott -All rights reserved. - -Redistribution and use in source and binary forms, with or without -modification, are permitted provided that the following conditions are met: - -* Redistributions of source code must retain the above copyright notice, this - list of conditions and the following disclaimer. - -* Redistributions in binary form must reproduce the above copyright notice, - this list of conditions and the following disclaimer in the documentation - and/or other materials provided with the distribution. - -THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE -DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE -FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL -DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR -SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER -CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, -OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE -OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
- - -------------------------------------------------------------------------------- Dependency : github.com/kballard/go-shellquote Version: v0.0.0-20180428030007-95032a82bc51 @@ -50613,37 +50591,6 @@ The above copyright notice and this permission notice shall be included in all c THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. --------------------------------------------------------------------------------- -Dependency : github.com/markbates/pkger -Version: v0.17.1 -Licence type (autodetected): MIT --------------------------------------------------------------------------------- - -Contents of probable licence file $GOMODCACHE/github.com/markbates/pkger@v0.17.1/LICENSE: - -The MIT License (MIT) - -Copyright (c) 2019 Mark Bates - -Permission is hereby granted, free of charge, to any person obtaining a copy -of this software and associated documentation files (the "Software"), to deal -in the Software without restriction, including without limitation the rights -to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -copies of the Software, and to permit persons to whom the Software is -furnished to do so, subject to the following conditions: - -The above copyright notice and this permission notice shall be included in all -copies or substantial portions of the Software. - -THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -SOFTWARE. 
- - -------------------------------------------------------------------------------- Dependency : github.com/martini-contrib/render Version: v0.0.0-20150707142108-ec18f8345a11 diff --git a/deploy/kubernetes/Makefile b/deploy/kubernetes/Makefile index 166c83bf515a..d05b6a886832 100644 --- a/deploy/kubernetes/Makefile +++ b/deploy/kubernetes/Makefile @@ -1,4 +1,5 @@ ALL=filebeat metricbeat auditbeat heartbeat +IMAGE_MODIFIER?="-wolfi" BEAT_VERSION=$(shell head -n 1 ../../libbeat/docs/version.asciidoc | cut -c 17- ) .PHONY: all $(ALL) @@ -6,21 +7,27 @@ BEAT_VERSION=$(shell head -n 1 ../../libbeat/docs/version.asciidoc | cut -c 17- all: $(ALL) test: all - for FILE in $(shell ls *-kubernetes.yaml); do \ - BEAT=$$(echo $$FILE | cut -d \- -f 1); \ + @for BEAT in $(ALL); do \ + echo; \ + echo "$$BEAT"; \ + FILE="$$BEAT-kubernetes.yaml"; \ kubectl create -f $$FILE; \ + echo "Testing $$BEAT container for readiness..."; \ + kubectl wait pods -n kube-system -l k8s-app=$$BEAT --for=condition=Ready --timeout=90s; \ + echo "Deleting $$BEAT..."; \ + kubectl delete -f $$FILE; \ done clean: @for f in $(ALL); do rm -f "$$f-kubernetes.yaml"; done $(ALL): - @echo "Generating $@-kubernetes.yaml" + @echo "Generating $@-kubernetes.yaml for version ${BEAT_VERSION} and image modifier '${IMAGE_MODIFIER}'" @rm -f $@-kubernetes.yaml @for f in service-account role role-binding configmap deployment daemonset ; do \ if [ -f "$@/$@-$$f.yaml" ]; then \ echo "file: $@/$@-$$f.yaml"; \ - sed "s/%VERSION%/${BEAT_VERSION}/g" $@/$@-$$f.yaml >> $@-kubernetes.yaml; \ + cat $@/$@-$$f.yaml | sed "s/%VERSION%/${BEAT_VERSION}/g" | sed "s/%IMAGE_MODIFIER%/${IMAGE_MODIFIER}/g" >> $@-kubernetes.yaml; \ echo --- >> $@-kubernetes.yaml; \ fi \ done diff --git a/deploy/kubernetes/auditbeat-kubernetes.yaml b/deploy/kubernetes/auditbeat-kubernetes.yaml index db3588ad9605..23c940ad4e0a 100644 --- a/deploy/kubernetes/auditbeat-kubernetes.yaml +++ b/deploy/kubernetes/auditbeat-kubernetes.yaml @@ -209,7 +209,7 @@ spec: dnsPolicy: ClusterFirstWithHostNet containers: - name: auditbeat - image: docker.elastic.co/beats/auditbeat:9.0.0 + image: docker.elastic.co/beats/auditbeat-wolfi:9.0.0 args: [ "-c", "/etc/auditbeat.yml", "-e", diff --git a/deploy/kubernetes/auditbeat/auditbeat-daemonset.yaml b/deploy/kubernetes/auditbeat/auditbeat-daemonset.yaml index 39eaf726eefc..39a2c35c3f1b 100644 --- a/deploy/kubernetes/auditbeat/auditbeat-daemonset.yaml +++ b/deploy/kubernetes/auditbeat/auditbeat-daemonset.yaml @@ -22,7 +22,7 @@ spec: dnsPolicy: ClusterFirstWithHostNet containers: - name: auditbeat - image: docker.elastic.co/beats/auditbeat:%VERSION% + image: docker.elastic.co/beats/auditbeat%IMAGE_MODIFIER%:%VERSION% args: [ "-c", "/etc/auditbeat.yml", "-e", diff --git a/deploy/kubernetes/filebeat-kubernetes.yaml b/deploy/kubernetes/filebeat-kubernetes.yaml index e272abe98930..f028322c1aca 100644 --- a/deploy/kubernetes/filebeat-kubernetes.yaml +++ b/deploy/kubernetes/filebeat-kubernetes.yaml @@ -183,7 +183,7 @@ spec: dnsPolicy: ClusterFirstWithHostNet containers: - name: filebeat - image: docker.elastic.co/beats/filebeat:9.0.0 + image: docker.elastic.co/beats/filebeat-wolfi:9.0.0 args: [ "-c", "/etc/filebeat.yml", "-e", diff --git a/deploy/kubernetes/filebeat/filebeat-daemonset.yaml b/deploy/kubernetes/filebeat/filebeat-daemonset.yaml index b6df8f31fdbd..c027abede2af 100644 --- a/deploy/kubernetes/filebeat/filebeat-daemonset.yaml +++ b/deploy/kubernetes/filebeat/filebeat-daemonset.yaml @@ -20,7 +20,7 @@ spec: dnsPolicy: ClusterFirstWithHostNet containers: - 
name: filebeat - image: docker.elastic.co/beats/filebeat:%VERSION% + image: docker.elastic.co/beats/filebeat%IMAGE_MODIFIER%:%VERSION% args: [ "-c", "/etc/filebeat.yml", "-e", diff --git a/deploy/kubernetes/heartbeat-kubernetes.yaml b/deploy/kubernetes/heartbeat-kubernetes.yaml index 90c5ca7a3cc9..280c243d305b 100644 --- a/deploy/kubernetes/heartbeat-kubernetes.yaml +++ b/deploy/kubernetes/heartbeat-kubernetes.yaml @@ -171,7 +171,7 @@ spec: dnsPolicy: ClusterFirstWithHostNet containers: - name: heartbeat - image: docker.elastic.co/beats/heartbeat:9.0.0 + image: docker.elastic.co/beats/heartbeat-wolfi:9.0.0 args: [ "-c", "/etc/heartbeat.yml", "-e", diff --git a/deploy/kubernetes/heartbeat/heartbeat-deployment.yaml b/deploy/kubernetes/heartbeat/heartbeat-deployment.yaml index 3f1a73d3324a..ec95e50ee53d 100644 --- a/deploy/kubernetes/heartbeat/heartbeat-deployment.yaml +++ b/deploy/kubernetes/heartbeat/heartbeat-deployment.yaml @@ -20,7 +20,7 @@ spec: dnsPolicy: ClusterFirstWithHostNet containers: - name: heartbeat - image: docker.elastic.co/beats/heartbeat:%VERSION% + image: docker.elastic.co/beats/heartbeat%IMAGE_MODIFIER%:%VERSION% args: [ "-c", "/etc/heartbeat.yml", "-e", diff --git a/deploy/kubernetes/metricbeat-kubernetes.yaml b/deploy/kubernetes/metricbeat-kubernetes.yaml index 8fb3e5e087d4..418c902bffc0 100644 --- a/deploy/kubernetes/metricbeat-kubernetes.yaml +++ b/deploy/kubernetes/metricbeat-kubernetes.yaml @@ -291,7 +291,7 @@ spec: dnsPolicy: ClusterFirstWithHostNet containers: - name: metricbeat - image: docker.elastic.co/beats/metricbeat:9.0.0 + image: docker.elastic.co/beats/metricbeat-wolfi:9.0.0 args: [ "-c", "/etc/metricbeat.yml", "-e", diff --git a/deploy/kubernetes/metricbeat/metricbeat-daemonset.yaml b/deploy/kubernetes/metricbeat/metricbeat-daemonset.yaml index c4004d91e288..e8c0074be6de 100644 --- a/deploy/kubernetes/metricbeat/metricbeat-daemonset.yaml +++ b/deploy/kubernetes/metricbeat/metricbeat-daemonset.yaml @@ -21,7 +21,7 @@ spec: dnsPolicy: ClusterFirstWithHostNet containers: - name: metricbeat - image: docker.elastic.co/beats/metricbeat:%VERSION% + image: docker.elastic.co/beats/metricbeat%IMAGE_MODIFIER%:%VERSION% args: [ "-c", "/etc/metricbeat.yml", "-e", diff --git a/dev-tools/notice/overrides.json b/dev-tools/notice/overrides.json index bb82c97ebe40..a50cac02e0fb 100644 --- a/dev-tools/notice/overrides.json +++ b/dev-tools/notice/overrides.json @@ -19,3 +19,4 @@ {"name": "github.com/JohnCGriffin/overflow", "licenceType": "MIT"} {"name": "github.com/elastic/ebpfevents", "licenceType": "Apache-2.0"} {"name": "go.opentelemetry.io/collector/config/configopaque", "licenceType": "Apache-2.0"} +{"name": "github.com/elastic/go-quark", "licenceType": "Apache-2.0"} diff --git a/filebeat/input/filestream/internal/task/group_test.go b/filebeat/input/filestream/internal/task/group_test.go index db50ef3ccabe..6ba0ac2cf1db 100644 --- a/filebeat/input/filestream/internal/task/group_test.go +++ b/filebeat/input/filestream/internal/task/group_test.go @@ -36,15 +36,21 @@ type noopLogger struct{} func (n noopLogger) Errorf(string, ...interface{}) {} -type testLogger strings.Builder +type testLogger struct { + mu sync.Mutex + b strings.Builder +} func (tl *testLogger) Errorf(format string, args ...interface{}) { - sb := (*strings.Builder)(tl) - sb.WriteString(fmt.Sprintf(format, args...)) - sb.WriteString("\n") + tl.mu.Lock() + defer tl.mu.Unlock() + tl.b.WriteString(fmt.Sprintf(format, args...)) + tl.b.WriteString("\n") } func (tl *testLogger) String() string { - return 
(*strings.Builder)(tl).String() + tl.mu.Lock() + defer tl.mu.Unlock() + return tl.b.String() } func TestNewGroup(t *testing.T) { @@ -67,7 +73,6 @@ func TestNewGroup(t *testing.T) { } func TestGroup_Go(t *testing.T) { - t.Skip("Flaky tests: https://github.com/elastic/beats/issues/41218") t.Run("don't run more than limit goroutines", func(t *testing.T) { done := make(chan struct{}) defer close(done) @@ -227,14 +232,12 @@ func TestGroup_Go(t *testing.T) { t.Run("all workloads return an error", func(t *testing.T) { logger := &testLogger{} - runCunt := atomic.Uint64{} - wg := sync.WaitGroup{} + var count atomic.Uint64 wantErr := errors.New("a error") workload := func(i int) func(context.Context) error { return func(_ context.Context) error { - defer runCunt.Add(1) - defer wg.Done() + defer count.Add(1) return fmt.Errorf("[%d]: %w", i, wantErr) } } @@ -242,23 +245,24 @@ func TestGroup_Go(t *testing.T) { want := uint64(2) g := NewGroup(want, time.Second, logger, "errorPrefix") - wg.Add(1) err := g.Go(workload(1)) require.NoError(t, err) - wg.Wait() - wg.Add(1) err = g.Go(workload(2)) require.NoError(t, err) - wg.Wait() - err = g.Stop() + assert.Eventually(t, func() bool { + return count.Load() == want && logger.String() != "" + }, 100*time.Millisecond, time.Millisecond) + err = g.Stop() require.NoError(t, err) + logs := logger.String() assert.Contains(t, logs, wantErr.Error()) assert.Contains(t, logs, "[2]") assert.Contains(t, logs, "[1]") + }) t.Run("some workloads return an error", func(t *testing.T) { @@ -268,17 +272,26 @@ func TestGroup_Go(t *testing.T) { g := NewGroup(want, time.Second, logger, "") - err := g.Go(func(_ context.Context) error { return nil }) + var count atomic.Uint64 + err := g.Go(func(_ context.Context) error { + count.Add(1) + return nil + }) require.NoError(t, err) - err = g.Go(func(_ context.Context) error { return wantErr }) + err = g.Go(func(_ context.Context) error { + count.Add(1) + return wantErr + }) require.NoError(t, err) - time.Sleep(time.Millisecond) + assert.Eventually(t, func() bool { + return count.Load() == want && logger.String() != "" + }, 100*time.Millisecond, time.Millisecond, "not all workloads finished") - err = g.Stop() + assert.Contains(t, logger.String(), wantErr.Error()) + err = g.Stop() assert.NoError(t, err) - assert.Contains(t, logger.String(), wantErr.Error()) }) t.Run("workload returns no error", func(t *testing.T) { diff --git a/filebeat/module/system/auth/ingest/common.yml b/filebeat/module/system/auth/ingest/common.yml new file mode 100644 index 000000000000..75c2a8e46a9b --- /dev/null +++ b/filebeat/module/system/auth/ingest/common.yml @@ -0,0 +1,172 @@ +description: Common steps for Journald and log files from system/auth Filebeat module +processors: + - grok: + description: Grok usernames from PAM messages. + tag: grok-pam-users + field: message + ignore_missing: true + ignore_failure: true + patterns: + - 'for user %{QUOTE}?%{DATA:_temp.foruser}%{QUOTE}? 
by %{QUOTE}?%{DATA:_temp.byuser}%{QUOTE}?(?:\(uid=%{NUMBER:_temp.byuid}\))?$' + - 'for user %{QUOTE}?%{DATA:_temp.foruser}%{QUOTE}?$' + - 'by user %{QUOTE}?%{DATA:_temp.byuser}%{QUOTE}?$' + - '%{BOUNDARY} user %{QUOTE}%{DATA:_temp.user}%{QUOTE}' + pattern_definitions: + QUOTE: "['\"]" + BOUNDARY: "(?- + if (ctx.system.auth.ssh.event == "Accepted") { + ctx.event.type = ["info"]; + ctx.event.category = ["authentication", "session"]; + ctx.event.action = "ssh_login"; + ctx.event.outcome = "success"; + } else if (ctx.system.auth.ssh.event == "Invalid" || ctx.system.auth.ssh.event == "Failed") { + ctx.event.type = ["info"]; + ctx.event.category = ["authentication"]; + ctx.event.action = "ssh_login"; + ctx.event.outcome = "failure"; + } + - append: + field: event.category + value: iam + if: ctx.process?.name != null && ['groupadd', 'groupdel', 'groupmod', 'useradd', 'userdel', 'usermod'].contains(ctx.process.name) + - set: + field: event.outcome + value: success + if: ctx.process?.name != null && (ctx.message == null || !ctx.message.contains("fail")) && ['groupadd', 'groupdel', 'groupmod', 'useradd', 'userdel', 'usermod'].contains(ctx.process.name) + - set: + field: event.outcome + value: failure + if: ctx.process?.name != null && (ctx.message != null && ctx.message.contains("fail")) && ['groupadd', 'groupdel', 'groupmod', 'useradd', 'userdel', 'usermod'].contains(ctx.process.name) + - append: + field: event.type + value: user + if: ctx.process?.name != null && ['useradd', 'userdel', 'usermod'].contains(ctx.process.name) + - append: + field: event.type + value: group + if: ctx.process?.name != null && ['groupadd', 'groupdel', 'groupmod'].contains(ctx.process.name) + - append: + field: event.type + value: creation + if: ctx.process?.name != null && ['useradd', 'groupadd'].contains(ctx.process.name) + - append: + field: event.type + value: deletion + if: ctx.process?.name != null && ['userdel', 'groupdel'].contains(ctx.process.name) + - append: + field: event.type + value: change + if: ctx.process?.name != null && ['usermod', 'groupmod'].contains(ctx.process.name) + - append: + field: related.user + value: "{{{ user.name }}}" + allow_duplicates: false + if: ctx.user?.name != null && ctx.user?.name != '' + - append: + field: related.user + value: "{{{ user.effective.name }}}" + allow_duplicates: false + if: ctx.user?.effective?.name != null && ctx.user?.effective?.name != '' + - append: + field: related.ip + value: "{{{ source.ip }}}" + allow_duplicates: false + if: ctx.source?.ip != null && ctx.source?.ip != '' + - append: + field: related.hosts + value: "{{{ host.hostname }}}" + allow_duplicates: false + if: ctx.host?.hostname != null && ctx.host?.hostname != '' + - set: + field: ecs.version + value: 8.0.0 + - remove: + field: event.original + if: "ctx?.tags == null || !(ctx.tags.contains('preserve_original_event'))" + ignore_failure: true + ignore_missing: true diff --git a/filebeat/module/system/auth/ingest/entrypoint.yml b/filebeat/module/system/auth/ingest/entrypoint.yml index 93869fd1486f..7da5fc4a5d40 100644 --- a/filebeat/module/system/auth/ingest/entrypoint.yml +++ b/filebeat/module/system/auth/ingest/entrypoint.yml @@ -1,5 +1,8 @@ description: Entrypoint Pipeline for system/auth Filebeat module processors: + - set: + field: event.ingested + copy_from: _ingest.timestamp - script: source: | if(ctx?.journald != null){ diff --git a/filebeat/module/system/auth/ingest/files.yml b/filebeat/module/system/auth/ingest/files.yml index 39611f484a82..fbeebc12b7e2 100644 --- 
a/filebeat/module/system/auth/ingest/files.yml +++ b/filebeat/module/system/auth/ingest/files.yml @@ -1,9 +1,6 @@ --- description: Pipeline for parsing system authorization and secure logs. processors: - - set: - field: event.ingested - copy_from: _ingest.timestamp - rename: if: ctx.event?.original == null field: message @@ -28,76 +25,8 @@ processors: target_field: message - remove: field: _temp - - grok: - description: Grok usernames from PAM messages. - tag: grok-pam-users - field: message - ignore_missing: true - ignore_failure: true - patterns: - - 'for user %{QUOTE}?%{DATA:_temp.foruser}%{QUOTE}? by %{QUOTE}?%{DATA:_temp.byuser}%{QUOTE}?(?:\(uid=%{NUMBER:_temp.byuid}\))?$' - - 'for user %{QUOTE}?%{DATA:_temp.foruser}%{QUOTE}?$' - - 'by user %{QUOTE}?%{DATA:_temp.byuser}%{QUOTE}?$' - - '%{BOUNDARY} user %{QUOTE}%{DATA:_temp.user}%{QUOTE}' - pattern_definitions: - QUOTE: "['\"]" - BOUNDARY: "(?}" - date: if: ctx.event?.timezone == null field: system.auth.timestamp @@ -125,106 +54,6 @@ processors: value: '{{{ _ingest.on_failure_message }}}' - remove: field: system.auth.timestamp - - geoip: - field: source.ip - target_field: source.geo - ignore_missing: true - - geoip: - database_file: GeoLite2-ASN.mmdb - field: source.ip - target_field: source.as - properties: - - asn - - organization_name - ignore_missing: true - - rename: - field: source.as.asn - target_field: source.as.number - ignore_missing: true - - rename: - field: source.as.organization_name - target_field: source.as.organization.name - ignore_missing: true - - set: - field: event.kind - value: event - - script: - description: Add event.category/action/output to SSH events. - tag: script-categorize-ssh-event - if: ctx.system?.auth?.ssh?.event != null - lang: painless - source: >- - if (ctx.system.auth.ssh.event == "Accepted") { - ctx.event.type = ["info"]; - ctx.event.category = ["authentication", "session"]; - ctx.event.action = "ssh_login"; - ctx.event.outcome = "success"; - } else if (ctx.system.auth.ssh.event == "Invalid" || ctx.system.auth.ssh.event == "Failed") { - ctx.event.type = ["info"]; - ctx.event.category = ["authentication"]; - ctx.event.action = "ssh_login"; - ctx.event.outcome = "failure"; - } - - append: - field: event.category - value: iam - if: ctx.process?.name != null && ['groupadd', 'groupdel', 'groupmod', 'useradd', 'userdel', 'usermod'].contains(ctx.process.name) - - set: - field: event.outcome - value: success - if: ctx.process?.name != null && (ctx.message == null || !ctx.message.contains("fail")) && ['groupadd', 'groupdel', 'groupmod', 'useradd', 'userdel', 'usermod'].contains(ctx.process.name) - - set: - field: event.outcome - value: failure - if: ctx.process?.name != null && (ctx.message != null && ctx.message.contains("fail")) && ['groupadd', 'groupdel', 'groupmod', 'useradd', 'userdel', 'usermod'].contains(ctx.process.name) - - append: - field: event.type - value: user - if: ctx.process?.name != null && ['useradd', 'userdel', 'usermod'].contains(ctx.process.name) - - append: - field: event.type - value: group - if: ctx.process?.name != null && ['groupadd', 'groupdel', 'groupmod'].contains(ctx.process.name) - - append: - field: event.type - value: creation - if: ctx.process?.name != null && ['useradd', 'groupadd'].contains(ctx.process.name) - - append: - field: event.type - value: deletion - if: ctx.process?.name != null && ['userdel', 'groupdel'].contains(ctx.process.name) - - append: - field: event.type - value: change - if: ctx.process?.name != null && ['usermod', 
'groupmod'].contains(ctx.process.name) - - append: - field: related.user - value: "{{{ user.name }}}" - allow_duplicates: false - if: ctx.user?.name != null && ctx.user?.name != '' - - append: - field: related.user - value: "{{{ user.effective.name }}}" - allow_duplicates: false - if: ctx.user?.effective?.name != null && ctx.user?.effective?.name != '' - - append: - field: related.ip - value: "{{{ source.ip }}}" - allow_duplicates: false - if: ctx.source?.ip != null && ctx.source?.ip != '' - - append: - field: related.hosts - value: "{{{ host.hostname }}}" - allow_duplicates: false - if: ctx.host?.hostname != null && ctx.host?.hostname != '' - - set: - field: ecs.version - value: 8.0.0 - - remove: - field: event.original - if: "ctx?.tags == null || !(ctx.tags.contains('preserve_original_event'))" - ignore_failure: true - ignore_missing: true on_failure: - set: field: error.message diff --git a/filebeat/module/system/auth/ingest/journald.yml b/filebeat/module/system/auth/ingest/journald.yml index 10e7ae96054e..aee3f5263ede 100644 --- a/filebeat/module/system/auth/ingest/journald.yml +++ b/filebeat/module/system/auth/ingest/journald.yml @@ -1,8 +1,5 @@ description: Journald Pipeline for system/auth Filebeat module processors: - - set: - field: event.ingested - copy_from: _ingest.timestamp - rename: field: "journald.process.name" target_field: process.name @@ -16,176 +13,8 @@ processors: - rename: field: _temp.message target_field: message - - grok: - description: Grok usernames from PAM messages. - tag: grok-pam-users - field: message - ignore_missing: true - ignore_failure: true - patterns: - - 'for user %{QUOTE}?%{DATA:_temp.foruser}%{QUOTE}? by %{QUOTE}?%{DATA:_temp.byuser}%{QUOTE}?(?:\(uid=%{NUMBER:_temp.byuid}\))?$' - - 'for user %{QUOTE}?%{DATA:_temp.foruser}%{QUOTE}?$' - - 'by user %{QUOTE}?%{DATA:_temp.byuser}%{QUOTE}?$' - - '%{BOUNDARY} user %{QUOTE}%{DATA:_temp.user}%{QUOTE}' - pattern_definitions: - QUOTE: "['\"]" - BOUNDARY: "(?- - if (ctx.system.auth.ssh.event == "Accepted") { - ctx.event.type = ["info"]; - ctx.event.category = ["authentication", "session"]; - ctx.event.action = "ssh_login"; - ctx.event.outcome = "success"; - } else if (ctx.system.auth.ssh.event == "Invalid" || ctx.system.auth.ssh.event == "Failed") { - ctx.event.type = ["info"]; - ctx.event.category = ["authentication"]; - ctx.event.action = "ssh_login"; - ctx.event.outcome = "failure"; - } - - append: - field: event.category - value: iam - if: ctx.process?.name != null && ['groupadd', 'groupdel', 'groupmod', 'useradd', 'userdel', 'usermod'].contains(ctx.process.name) - - set: - field: event.outcome - value: success - if: ctx.process?.name != null && (ctx.message == null || !ctx.message.contains("fail")) && ['groupadd', 'groupdel', 'groupmod', 'useradd', 'userdel', 'usermod'].contains(ctx.process.name) - - set: - field: event.outcome - value: failure - if: ctx.process?.name != null && (ctx.message != null && ctx.message.contains("fail")) && ['groupadd', 'groupdel', 'groupmod', 'useradd', 'userdel', 'usermod'].contains(ctx.process.name) - - append: - field: event.type - value: user - if: ctx.process?.name != null && ['useradd', 'userdel', 'usermod'].contains(ctx.process.name) - - append: - field: event.type - value: group - if: ctx.process?.name != null && ['groupadd', 'groupdel', 'groupmod'].contains(ctx.process.name) - - append: - field: event.type - value: creation - if: ctx.process?.name != null && ['useradd', 'groupadd'].contains(ctx.process.name) - - append: - field: event.type - value: deletion - if: 
ctx.process?.name != null && ['userdel', 'groupdel'].contains(ctx.process.name) - - append: - field: event.type - value: change - if: ctx.process?.name != null && ['usermod', 'groupmod'].contains(ctx.process.name) - - append: - field: related.user - value: "{{{ user.name }}}" - allow_duplicates: false - if: ctx.user?.name != null && ctx.user?.name != '' - - append: - field: related.user - value: "{{{ user.effective.name }}}" - allow_duplicates: false - if: ctx.user?.effective?.name != null && ctx.user?.effective?.name != '' - - append: - field: related.ip - value: "{{{ source.ip }}}" - allow_duplicates: false - if: ctx.source?.ip != null && ctx.source?.ip != '' - - append: - field: related.hosts - value: "{{{ host.hostname }}}" - allow_duplicates: false - if: ctx.host?.hostname != null && ctx.host?.hostname != '' - - set: - field: ecs.version - value: 8.0.0 - - remove: - field: event.original - if: "ctx?.tags == null || !(ctx.tags.contains('preserve_original_event'))" - ignore_failure: true - ignore_missing: true + - pipeline: + name: "{< IngestPipeline "common" >}" - remove: description: Remove the extra fields added by the Journald input ignore_missing: true diff --git a/filebeat/module/system/auth/manifest.yml b/filebeat/module/system/auth/manifest.yml index 4b99d6407b76..fefc51a88a45 100644 --- a/filebeat/module/system/auth/manifest.yml +++ b/filebeat/module/system/auth/manifest.yml @@ -22,4 +22,6 @@ ingest_pipeline: - ingest/files.yml - ingest/journald.yml - ingest/grok-auth-messages.yml + - ingest/common.yml + input: config/auth.yml diff --git a/go.mod b/go.mod index c643f16b1fa4..0f3c26503ca9 100644 --- a/go.mod +++ b/go.mod @@ -130,7 +130,7 @@ require ( github.com/ugorji/go/codec v1.1.8 github.com/vmware/govmomi v0.39.0 go.elastic.co/ecszap v1.0.2 - go.elastic.co/go-licence-detector v0.6.1 + go.elastic.co/go-licence-detector v0.7.0 go.etcd.io/bbolt v1.3.10 go.uber.org/multierr v1.11.0 go.uber.org/zap v1.27.0 @@ -192,6 +192,7 @@ require ( github.com/elastic/elastic-agent-libs v0.12.1 github.com/elastic/elastic-agent-system-metrics v0.11.1 github.com/elastic/go-elasticsearch/v8 v8.14.0 + github.com/elastic/go-quark v0.1.2 github.com/elastic/go-sfdc v0.0.0-20241010131323-8e176480d727 github.com/elastic/mito v1.15.0 github.com/elastic/tk-btf v0.1.0 @@ -214,6 +215,7 @@ require ( github.com/shirou/gopsutil/v3 v3.22.10 github.com/tklauser/go-sysconf v0.3.10 github.com/xdg-go/scram v1.1.2 + github.com/zyedidia/generic v1.2.1 go.elastic.co/apm/module/apmelasticsearch/v2 v2.6.0 go.elastic.co/apm/module/apmhttp/v2 v2.6.0 go.elastic.co/apm/v2 v2.6.0 @@ -303,7 +305,6 @@ require ( github.com/go-openapi/jsonreference v0.20.4 // indirect github.com/go-openapi/swag v0.22.9 // indirect github.com/go-resty/resty/v2 v2.13.1 // indirect - github.com/gobuffalo/here v0.6.7 // indirect github.com/goccy/go-json v0.10.2 // indirect github.com/godror/knownpb v0.1.0 // indirect github.com/golang-jwt/jwt/v4 v4.5.0 // indirect @@ -335,7 +336,6 @@ require ( github.com/jmespath/go-jmespath v0.4.0 // indirect github.com/josharian/intern v1.0.0 // indirect github.com/json-iterator/go v1.1.12 // indirect - github.com/karrick/godirwalk v1.17.0 // indirect github.com/kballard/go-shellquote v0.0.0-20180428030007-95032a82bc51 // indirect github.com/klauspost/asmfmt v1.3.2 // indirect github.com/klauspost/compress v1.17.9 // indirect @@ -344,7 +344,6 @@ require ( github.com/kylelemons/godebug v1.1.0 // indirect github.com/lufia/plan9stats v0.0.0-20211012122336-39d0f177ccd0 // indirect github.com/mailru/easyjson v0.7.7 // 
indirect - github.com/markbates/pkger v0.17.1 // indirect github.com/mattn/go-ieproxy v0.0.1 // indirect github.com/mattn/go-isatty v0.0.20 // indirect github.com/mattn/go-runewidth v0.0.9 // indirect diff --git a/go.sum b/go.sum index 4f561fa3d6ec..0362a16115fe 100644 --- a/go.sum +++ b/go.sum @@ -381,6 +381,8 @@ github.com/elastic/go-lumber v0.1.2-0.20220819171948-335fde24ea0f h1:TsPpU5EAwlt github.com/elastic/go-lumber v0.1.2-0.20220819171948-335fde24ea0f/go.mod h1:HHaWnZamYKWsR9/eZNHqRHob8iQDKnchHmmskT/SKko= github.com/elastic/go-perf v0.0.0-20191212140718-9c656876f595 h1:q8n4QjcLa4q39Q3fqHRknTBXBtegjriHFrB42YKgXGI= github.com/elastic/go-perf v0.0.0-20191212140718-9c656876f595/go.mod h1:s09U1b4P1ZxnKx2OsqY7KlHdCesqZWIhyq0Gs/QC/Us= +github.com/elastic/go-quark v0.1.2 h1:Hnov9q8D9ofS976SODWWYAZ23IpgPILxTUCiccmhw0c= +github.com/elastic/go-quark v0.1.2/go.mod h1:/ngqgumD/Z5vnFZ4XPN2kCbxnEfG5/Uc+bRvOBabVVA= github.com/elastic/go-seccomp-bpf v1.4.0 h1:6y3lYrEHrLH9QzUgOiK8WDqmPaMnnB785WxibCNIOH4= github.com/elastic/go-seccomp-bpf v1.4.0/go.mod h1:wIMxjTbKpWGQk4CV9WltlG6haB4brjSH/dvAohBPM1I= github.com/elastic/go-sfdc v0.0.0-20241010131323-8e176480d727 h1:yuiN60oaQUz2PtNpNhDI2H6zrCdfiiptmNdwV5WUaKA= @@ -483,9 +485,6 @@ github.com/go-sql-driver/mysql v1.6.0/go.mod h1:DCzpHaOWr8IXmIStZouvnhqoel9Qv2LB github.com/go-stack/stack v1.8.0/go.mod h1:v0f6uXyyMGvRgIKkXu+yp6POWl0qKG85gN/melR3HDY= github.com/go-task/slim-sprig v0.0.0-20230315185526-52ccab3ef572 h1:tfuBGBXKqDEevZMzYi5KSi8KkcZtzBcTgAUUtapy0OI= github.com/go-task/slim-sprig v0.0.0-20230315185526-52ccab3ef572/go.mod h1:9Pwr4B2jHnOSGXyyzV8ROjYa2ojvAY6HCGYYfMoC3Ls= -github.com/gobuffalo/here v0.6.0/go.mod h1:wAG085dHOYqUpf+Ap+WOdrPTp5IYcDAs/x7PLa8Y5fM= -github.com/gobuffalo/here v0.6.7 h1:hpfhh+kt2y9JLDfhYUxxCRxQol540jsVfKUZzjlbp8o= -github.com/gobuffalo/here v0.6.7/go.mod h1:vuCfanjqckTuRlqAitJz6QC4ABNnS27wLb816UhsPcc= github.com/gocarina/gocsv v0.0.0-20170324095351-ffef3ffc77be h1:zXHeEEJ231bTf/IXqvCfeaqjLpXsq42ybLoT4ROSR6Y= github.com/gocarina/gocsv v0.0.0-20170324095351-ffef3ffc77be/go.mod h1:/oj50ZdPq/cUjA02lMZhijk5kR31SEydKyqah1OgBuo= github.com/goccy/go-json v0.10.2 h1:CrxCmQqYDkv1z7lO7Wbh2HN93uovUHgrECaO5ZrCXAU= @@ -564,7 +563,6 @@ github.com/google/go-querystring v1.1.0/go.mod h1:Kcdr2DB4koayq7X8pmAG4sNG59So17 github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg= github.com/google/gofuzz v1.2.0 h1:xRy4A+RhZaiKjJ1bPfwQ8sedCA+YS2YcCHW6ec7JMi0= github.com/google/gofuzz v1.2.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg= -github.com/google/licenseclassifier v0.0.0-20200402202327-879cb1424de0/go.mod h1:qsqn2hxC+vURpyBRygGUuinTO42MFRLcsmQ/P8v94+M= github.com/google/licenseclassifier v0.0.0-20221004142553-c1ed8fcf4bab h1:okY7fFoWybMbxiHkaqStN4mxSrPfYmTZl5Zh32Z5FjY= github.com/google/licenseclassifier v0.0.0-20221004142553-c1ed8fcf4bab/go.mod h1:jkYIPv59uiw+1MxTWlqQEKebsUDV1DCXQtBBn5lVzf4= github.com/google/licenseclassifier/v2 v2.0.0-alpha.1/go.mod h1:YAgBGGTeNDMU+WfIgaFvjZe4rudym4f6nIn8ZH5X+VM= @@ -677,9 +675,6 @@ github.com/json-iterator/go v1.1.12/go.mod h1:e30LSqwooZae/UwlEbR2852Gd8hjQvJoHm github.com/jtolds/gls v4.20.0+incompatible h1:xdiiI2gbIgH/gLH7ADydsJ1uDOEzR8yvV7C0MuV77Wo= github.com/jtolds/gls v4.20.0+incompatible/go.mod h1:QJZ7F/aHp+rZTRtaJ1ow/lLfFfVYBRgL+9YlvaHOwJU= github.com/julienschmidt/httprouter v1.2.0/go.mod h1:SYymIcj16QtmaHHD7aYtjjsJG7VTCxuUUipMqKk8s4w= -github.com/karrick/godirwalk v1.15.6/go.mod h1:j4mkqPuvaLI8mp1DroR3P6ad7cyYd4c1qeJ3RV7ULlk= -github.com/karrick/godirwalk 
v1.17.0 h1:b4kY7nqDdioR/6qnbHQyDvmA17u5G1cZ6J+CZXwSWoI= -github.com/karrick/godirwalk v1.17.0/go.mod h1:j4mkqPuvaLI8mp1DroR3P6ad7cyYd4c1qeJ3RV7ULlk= github.com/kballard/go-shellquote v0.0.0-20180428030007-95032a82bc51 h1:Z9n2FFNUXsshfwJMBgNA0RU6/i7WVaAegv3PtuIHPMs= github.com/kballard/go-shellquote v0.0.0-20180428030007-95032a82bc51/go.mod h1:CzGEWj7cYgsdH8dAjBGEr58BoE7ScuLd+fwFZ44+/x8= github.com/kisielk/errcheck v1.5.0/go.mod h1:pFxgyoBC7bSaBwPgfKdkLd5X25qrDl4LWUI2bnpBCr8= @@ -713,9 +708,6 @@ github.com/magefile/mage v1.15.0 h1:BvGheCMAsG3bWUDbZ8AyXXpCNwU9u5CB6sM+HNb9HYg= github.com/magefile/mage v1.15.0/go.mod h1:z5UZb/iS3GoOSn0JgWuiw7dxlurVYTu+/jHXqQg881A= github.com/mailru/easyjson v0.7.7 h1:UGYAvKxe3sBsEDzO8ZeWOSlIQfWFlxbzLZe7hwFURr0= github.com/mailru/easyjson v0.7.7/go.mod h1:xzfreul335JAWq5oZzymOObrkdz5UnU4kGfJJLY9Nlc= -github.com/markbates/pkger v0.17.0/go.mod h1:0JoVlrol20BSywW79rN3kdFFsE5xYM+rSCQDXbLhiuI= -github.com/markbates/pkger v0.17.1 h1:/MKEtWqtc0mZvu9OinB9UzVN9iYCwLWuyUv4Bw+PCno= -github.com/markbates/pkger v0.17.1/go.mod h1:0JoVlrol20BSywW79rN3kdFFsE5xYM+rSCQDXbLhiuI= github.com/martini-contrib/render v0.0.0-20150707142108-ec18f8345a11 h1:YFh+sjyJTMQSYjKwM4dFKhJPJC/wfo98tPUc17HdoYw= github.com/martini-contrib/render v0.0.0-20150707142108-ec18f8345a11/go.mod h1:Ah2dBMoxZEqk118as2T4u4fjfXarE0pPnMJaArZQZsI= github.com/mattn/go-colorable v0.1.8/go.mod h1:u6P/XSegPjTcexA+o6vUJrdnUu04hMope9wVRipJSqc= @@ -855,7 +847,6 @@ github.com/samuel/go-parser v0.0.0-20130731160455-ca8abbf65d0e h1:hUGyBE/4CXRPTh github.com/samuel/go-parser v0.0.0-20130731160455-ca8abbf65d0e/go.mod h1:Sb6li54lXV0yYEjI4wX8cucdQ9gqUJV3+Ngg3l9g30I= github.com/samuel/go-thrift v0.0.0-20140522043831-2187045faa54 h1:jbchLJWyhKcmOjkbC4zDvT/n5EEd7g6hnnF760rEyRA= github.com/samuel/go-thrift v0.0.0-20140522043831-2187045faa54/go.mod h1:Vrkh1pnjV9Bl8c3P9zH0/D4NlOHWP5d4/hF4YTULaec= -github.com/sergi/go-diff v1.0.0/go.mod h1:0CfEIISq7TuYL3j771MWULgwwjU+GofnZX9QAmXWZgo= github.com/sergi/go-diff v1.1.0/go.mod h1:STckp+ISIX8hZLjrqAeVduY0gWCT9IjLuqbuNXdaHfM= github.com/sergi/go-diff v1.3.1 h1:xkr+Oxo4BOQKmkn/B9eMK0g5Kg/983T9DqqPHwYqD+8= github.com/sergi/go-diff v1.3.1/go.mod h1:aMJSSKb2lpPvRNec0+w3fl7LP9IOFzdc9Pa4NFbPK1I= @@ -941,6 +932,8 @@ github.com/zeebo/assert v1.3.0 h1:g7C04CbJuIDKNPFHmsk4hwZDO5O+kntRxzaUoNXj+IQ= github.com/zeebo/assert v1.3.0/go.mod h1:Pq9JiuJQpG8JLJdtkwrJESF0Foym2/D9XMU5ciN/wJ0= github.com/zeebo/xxh3 v1.0.2 h1:xZmwmqxHZA8AI603jOQ0tMqmBr9lPeFwGg6d+xy9DC0= github.com/zeebo/xxh3 v1.0.2/go.mod h1:5NWz9Sef7zIDm2JHfFlcQvNekmcEl9ekUZQQKCYaDcA= +github.com/zyedidia/generic v1.2.1 h1:Zv5KS/N2m0XZZiuLS82qheRG4X1o5gsWreGb0hR7XDc= +github.com/zyedidia/generic v1.2.1/go.mod h1:ly2RBz4mnz1yeuVbQA/VFwGjK3mnHGRj1JuoG336Bis= go.einride.tech/aip v0.67.1 h1:d/4TW92OxXBngkSOwWS2CH5rez869KpKMaN44mdxkFI= go.einride.tech/aip v0.67.1/go.mod h1:ZGX4/zKw8dcgzdLsrvpOOGxfxI2QSk12SlP7d6c0/XI= go.elastic.co/apm/module/apmelasticsearch/v2 v2.6.0 h1:ukMcwyMaDXsS1dRK2qRYXT2AsfwaUy74TOOYCqkWJow= @@ -953,8 +946,8 @@ go.elastic.co/ecszap v1.0.2 h1:iW5OGx8IiokiUzx/shD4AJCPFMC9uUtr7ycaiEIU++I= go.elastic.co/ecszap v1.0.2/go.mod h1:dJkSlK3BTiwG/qXhCwe50Mz/jwu854vSip8sIeQhNZg= go.elastic.co/fastjson v1.1.0 h1:3MrGBWWVIxe/xvsbpghtkFoPciPhOCmjsR/HfwEeQR4= go.elastic.co/fastjson v1.1.0/go.mod h1:boNGISWMjQsUPy/t6yqt2/1Wx4YNPSe+mZjlyw9vKKI= -go.elastic.co/go-licence-detector v0.6.1 h1:T2PFHYdow+9mAjj6K5ehn5anTxtsURfom2P4S6PgMzg= -go.elastic.co/go-licence-detector v0.6.1/go.mod h1:qQ1clBRS2f0Ee5ie+y2LLYnyhSNJNm0Ha6d7SoYVtM4= 
+go.elastic.co/go-licence-detector v0.7.0 h1:qC31sfyfNcNx/zMYcLABU0ac3MbGHZgksCAb5lMDUMg= +go.elastic.co/go-licence-detector v0.7.0/go.mod h1:f5ty8pjynzQD8BcS+s0qtlOGKc35/HKQxCVi8SHhV5k= go.etcd.io/bbolt v1.3.10 h1:+BqfJTcCzTItrop8mq/lbzL8wSGtj94UO/3U31shqG0= go.etcd.io/bbolt v1.3.10/go.mod h1:bK3UQLPJZly7IlNmV7uVHJDxfe5aK9Ll93e/74Y9oEQ= go.mongodb.org/mongo-driver v1.14.0 h1:P98w8egYRjYe3XDjxhYJagTokP/H6HzlsnojRgZRd80= @@ -1052,8 +1045,6 @@ golang.org/x/mod v0.4.2/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA= golang.org/x/mod v0.5.1/go.mod h1:5OXOZSfqPIIbmVBIIKWRFfZjPR0E5r58TLhUjH0a2Ro= golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4/go.mod h1:jJ57K6gSWd91VN4djpZkiMVwK6gcyfeH4XE8wZrZaV4= golang.org/x/mod v0.8.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs= -golang.org/x/mod v0.12.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs= -golang.org/x/mod v0.18.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c= golang.org/x/mod v0.21.0 h1:vvrHzRwRfVKSiLrG+d4FMl/Qi4ukBCE6kZlTUkDYRT0= golang.org/x/mod v0.21.0/go.mod h1:6SkKJ3Xj0I0BrPOZoBy3bdMptDDU9oJrpohJ3eWZ1fY= golang.org/x/net v0.0.0-20180724234803-3673e40ba225/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= @@ -1084,7 +1075,6 @@ golang.org/x/net v0.0.0-20211112202133-69e39bad7dc2/go.mod h1:9nx3DQGgdP8bBQD5qx golang.org/x/net v0.0.0-20220722155237-a158d28d115b/go.mod h1:XRhObCWvk6IyKnWLug+ECip1KBveYUHfp+8e9klMJ9c= golang.org/x/net v0.6.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs= golang.org/x/net v0.10.0/go.mod h1:0qNGK6F8kojg2nk9dLZ2mShWaEBan6FAoqfSigmmuDg= -golang.org/x/net v0.15.0/go.mod h1:idbUs1IY1+zTqbi8yxTbhexhEEk5ur9LInksu6HrEpk= golang.org/x/net v0.21.0/go.mod h1:bIjVDfnllIU7BJ2DNgfnXvpSvtn8VRwhlsaeUTyUS44= golang.org/x/net v0.25.0/go.mod h1:JkAGAh7GEvH74S6FOH42FLoXpXbE/aqXSrIQjXgsiwM= golang.org/x/net v0.29.0 h1:5ORfpBpCs4HzDYoodCDBbwHzdR5UrLBZ3sOnUJmFoHo= @@ -1104,8 +1094,6 @@ golang.org/x/sync v0.0.0-20210220032951-036812b2e83c/go.mod h1:RxMgew5VJxzue5/jJ golang.org/x/sync v0.0.0-20220513210516-0976fa681c29/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= golang.org/x/sync v0.1.0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= -golang.org/x/sync v0.3.0/go.mod h1:FU7BRWz2tNW+3quACPkgCx/L+uEAv1htQ0V83Z9Rj+Y= -golang.org/x/sync v0.7.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk= golang.org/x/sync v0.8.0 h1:3NFvSEYkUoMifnESzZl15y791HH1qU2xm6eCJU5ZPXQ= golang.org/x/sync v0.8.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk= golang.org/x/sys v0.0.0-20180810173357-98c5dad5d1a0/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= @@ -1204,7 +1192,6 @@ golang.org/x/tools v0.1.5/go.mod h1:o0xws9oXOQQZyjljx8fwUC0k7L1pTE6eaCbjGeHmOkk= golang.org/x/tools v0.1.7/go.mod h1:LGqMHiF4EqQNHR1JncWGqT5BVaXmza+X+BDGol+dOxo= golang.org/x/tools v0.1.12/go.mod h1:hNGJHUnrk76NpqgfD5Aqm5Crs+Hm0VOH/i9J2+nxYbc= golang.org/x/tools v0.6.0/go.mod h1:Xwgl3UAJ/d3gWutnCtw505GrjyAbvKui8lOU390QaIU= -golang.org/x/tools v0.13.0/go.mod h1:HvlwmtVNQAhOuCjW7xxvovg8wbNq7LwfXh/k7wXUl58= golang.org/x/tools v0.25.0 h1:oFU9pkj/iJgs+0DT+VMHrx+oBKs/LJMV+Uvg78sl+fE= golang.org/x/tools v0.25.0/go.mod h1:/vtpO8WL1N9cQC3FN5zPqb//fRXskFHbLKk4OW1Q7rg= golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= @@ -1285,7 +1272,6 @@ gopkg.in/yaml.v1 v1.0.0-20140924161607-9f9df34309c0/go.mod h1:WDnlLJ4WF5VGsH/HVa gopkg.in/yaml.v2 v2.2.1/go.mod 
h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= gopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= gopkg.in/yaml.v2 v2.2.4/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= -gopkg.in/yaml.v2 v2.2.7/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= gopkg.in/yaml.v2 v2.2.8/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= gopkg.in/yaml.v2 v2.4.0 h1:D8xgwECY7CYvx+Y2n4sBz93Jn9JRvxdiyyo8CTfuKaY= gopkg.in/yaml.v2 v2.4.0/go.mod h1:RDklbk79AGWmwhnvt/jBztapEOGDOx6ZbXqjP6csGnQ= diff --git a/x-pack/auditbeat/processors/sessionmd/add_session_metadata.go b/x-pack/auditbeat/processors/sessionmd/add_session_metadata.go index 4fa86c25d029..28ef4697b79a 100644 --- a/x-pack/auditbeat/processors/sessionmd/add_session_metadata.go +++ b/x-pack/auditbeat/processors/sessionmd/add_session_metadata.go @@ -13,12 +13,14 @@ import ( "strconv" "github.com/elastic/beats/v7/libbeat/beat" + "github.com/elastic/beats/v7/libbeat/common/cfgwarn" "github.com/elastic/beats/v7/libbeat/processors" "github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/processdb" "github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/procfs" "github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/provider" - "github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/provider/ebpf_provider" - "github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/provider/procfs_provider" + "github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/provider/kerneltracingprovider" + "github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/provider/procfsprovider" + "github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/types" cfg "github.com/elastic/elastic-agent-libs/config" "github.com/elastic/elastic-agent-libs/logp" "github.com/elastic/elastic-agent-libs/mapstr" @@ -35,13 +37,17 @@ func InitializeModule() { } type addSessionMetadata struct { + ctx context.Context + cancel context.CancelFunc config config logger *logp.Logger db *processdb.DB provider provider.Provider + backend string } func New(cfg *cfg.C) (beat.Processor, error) { + cfgwarn.Beta("add_session_metadata processor is a beta feature.") c := defaultConfig() if err := cfg.Unpack(&c); err != nil { return nil, fmt.Errorf("fail to unpack the %v configuration: %w", processorName, err) @@ -49,49 +55,59 @@ func New(cfg *cfg.C) (beat.Processor, error) { logger := logp.NewLogger(logName) - ctx := context.Background() + ctx, cancel := context.WithCancel(context.Background()) reader := procfs.NewProcfsReader(*logger) db, err := processdb.NewDB(reader, *logger) if err != nil { + cancel() return nil, fmt.Errorf("failed to create DB: %w", err) } - backfilledPIDs := db.ScrapeProcfs() - logger.Infof("backfilled %d processes", len(backfilledPIDs)) + if c.Backend != "kernel_tracing" { + backfilledPIDs := db.ScrapeProcfs() + logger.Infof("backfilled %d processes", len(backfilledPIDs)) + } var p provider.Provider switch c.Backend { case "auto": - p, err = ebpf_provider.NewProvider(ctx, logger, db) + p, err = kerneltracingprovider.NewProvider(ctx, logger) if err != nil { - // Most likely cause of error is not supporting ebpf on system, try procfs - p, err = procfs_provider.NewProvider(ctx, logger, db, reader, c.PIDField) + // Most likely cause of error is not supporting ebpf or kprobes on system, try procfs + p, err = procfsprovider.NewProvider(ctx, logger, db, reader, c.PIDField) if err != nil { + cancel() return nil, fmt.Errorf("failed to create provider: %w", err) } 
logger.Info("backend=auto using procfs") } else { - logger.Info("backend=auto using ebpf") + logger.Info("backend=auto using kernel_tracing") } - case "ebpf": - p, err = ebpf_provider.NewProvider(ctx, logger, db) + case "procfs": + p, err = procfsprovider.NewProvider(ctx, logger, db, reader, c.PIDField) if err != nil { - return nil, fmt.Errorf("failed to create ebpf provider: %w", err) + cancel() + return nil, fmt.Errorf("failed to create procfs provider: %w", err) } - case "procfs": - p, err = procfs_provider.NewProvider(ctx, logger, db, reader, c.PIDField) + case "kernel_tracing": + p, err = kerneltracingprovider.NewProvider(ctx, logger) if err != nil { - return nil, fmt.Errorf("failed to create ebpf provider: %w", err) + cancel() + return nil, fmt.Errorf("failed to create kernel_tracing provider: %w", err) } default: + cancel() return nil, fmt.Errorf("unknown backend configuration") } return &addSessionMetadata{ + ctx: ctx, + cancel: cancel, config: c, logger: logger, db: db, provider: p, + backend: c.Backend, }, nil } @@ -127,6 +143,7 @@ func (p *addSessionMetadata) Run(ev *beat.Event) (*beat.Event, error) { func (p *addSessionMetadata) Close() error { p.db.Close() + p.cancel() return nil } @@ -145,13 +162,24 @@ func (p *addSessionMetadata) enrich(ev *beat.Event) (*beat.Event, error) { return nil, fmt.Errorf("cannot parse pid field '%s': %w", p.config.PIDField, err) } - fullProcess, err := p.db.GetProcess(pid) - if err != nil { - e := fmt.Errorf("pid %v not found in db: %w", pid, err) - p.logger.Errorf("%v", e) - return nil, e + var fullProcess types.Process + if p.backend == "kernel_tracing" { + // kernel_tracing doesn't enrich with the processor DB; process info is taken directly from quark cache + proc, err := p.provider.GetProcess(pid) + if err != nil { + e := fmt.Errorf("pid %v not found in db: %w", pid, err) + p.logger.Warnw("PID not found in provider", "pid", pid, "error", err) + return nil, e + } + fullProcess = *proc + } else { + fullProcess, err = p.db.GetProcess(pid) + if err != nil { + e := fmt.Errorf("pid %v not found in db: %w", pid, err) + p.logger.Warnw("PID not found in provider", "pid", pid, "error", err) + return nil, e + } } - processMap := fullProcess.ToMap() if b, err := ev.Fields.HasKey("process"); !b || err != nil { diff --git a/x-pack/auditbeat/processors/sessionmd/add_session_metadata_test.go b/x-pack/auditbeat/processors/sessionmd/add_session_metadata_test.go index 95892482f80e..a993737611bd 100644 --- a/x-pack/auditbeat/processors/sessionmd/add_session_metadata_test.go +++ b/x-pack/auditbeat/processors/sessionmd/add_session_metadata_test.go @@ -361,7 +361,7 @@ func TestEnrich(t *testing.T) { require.Nil(t, err, "%s: enrich error: %w", tt.testName, err) require.NotNil(t, actual, "%s: returned nil event", tt.testName) - //Validate output + // Validate output if diff := cmp.Diff(tt.expected.Fields, actual.Fields, ignoreMissingFrom(tt.expected.Fields)); diff != "" { t.Errorf("field mismatch:\n%s", diff) } diff --git a/x-pack/auditbeat/processors/sessionmd/docs/add_session_metadata.asciidoc b/x-pack/auditbeat/processors/sessionmd/docs/add_session_metadata.asciidoc index d29c5d0ac80b..aaddde322c14 100644 --- a/x-pack/auditbeat/processors/sessionmd/docs/add_session_metadata.asciidoc +++ b/x-pack/auditbeat/processors/sessionmd/docs/add_session_metadata.asciidoc @@ -8,7 +8,7 @@ beta::[] The `add_session_metadata` processor enriches process events with additional information that users can see using the {security-guide}/session-view.html[Session View] tool in the 
-{elastic-sec} platform. +{elastic-sec} platform. NOTE: The current release of `add_session_metadata` processor for {auditbeat} is limited to virtual machines (VMs) and bare metal environments. @@ -27,9 +27,9 @@ auditbeat.modules: [[add-session-metadata-explained]] ==== How the `add_session_metadata` processor works -Using the available Linux kernel technology, the processor collects comprehensive information on all running system processes, compiling this data into a process database. -When processing an event (such as those generated by the {auditbeat} `auditd` module), the processor queries this database to retrieve information about related processes, including the parent process, session leader, process group leader, and entry leader. -It then enriches the original event with this metadata, providing a more complete picture of process relationships and system activities. +Using the available Linux kernel technology, the processor collects comprehensive information on all running system processes, compiling this data into a process database. +When processing an event (such as those generated by the {auditbeat} `auditd` module), the processor queries this database to retrieve information about related processes, including the parent process, session leader, process group leader, and entry leader. +It then enriches the original event with this metadata, providing a more complete picture of process relationships and system activities. This enhanced data enables the powerful {security-guide}/session-view.html[Session View] tool in the {elastic-sec} platform, offering users deeper insights for analysis and investigation. @@ -40,17 +40,18 @@ This enhanced data enables the powerful {security-guide}/session-view.html[Sessi The `add_session_metadata` processor operates using various backend options. * `auto` is the recommended setting. - It attempts to use `ebpf` first, falling back to `procfs` if necessary, ensuring compatibility even on systems without `ebpf` support. -* `ebpf` collects process information with eBPF. - This backend requires a system with Linux kernel 5.10.16 or above, kernel support for eBPF enabled, and auditbeat running as superuser. -* `procfs` collects process information with the proc filesystem. - This is compatible with older systems that may not support ebpf. - To gather complete process info, auditbeat requires permissions to read all process data in procfs; for example, run as a superuser or have the `SYS_PTRACE` capability. + It attempts to use `kernel_tracing` first, falling back to `procfs` if necessary, ensuring compatibility even on systems without `kernel_tracing` support. +* `kernel_tracing` collects process information with eBPF or kprobes. + This backend prefers eBPF; if eBPF is not supported, kprobes are used. eBPF requires a system with Linux kernel 5.10.16 or above, kernel support for eBPF enabled, and auditbeat running as a superuser. + Kprobe support requires Linux kernel 3.10.0 or above, and auditbeat running as a superuser. +* `procfs` collects process information with the proc filesystem. + This is compatible with older systems that may not support eBPF. + To gather complete process info, auditbeat requires permissions to read all process data in procfs; for example, run as a superuser or have the `SYS_PTRACE` capability. [[add-session-metadata-containers]] ===== Containers -If you are running {auditbeat} in a container, the container must run in the host's PID namespace.
-With the `auto` or `ebpf` backend, these host directories must also be mounted to the same path within the container: `/sys/kernel/debug`, `/sys/fs/bpf`. +If you are running {auditbeat} in a container, the container must run in the host's PID namespace. +With the `auto` or `kernel_tracing` backend, these host directories must also be mounted to the same path within the container: `/sys/kernel/debug`, `/sys/fs/bpf`. [[add-session-metadata-enable]] ==== Enable and configure Session View in {auditbeat} @@ -58,10 +59,10 @@ With the `auto` or `ebpf` backend, these host directories must also be mounted t To configure and enable {security-guide}/session-view.html[Session View] functionality, you'll: * Add the `add_sessions_metadata` processor to your `auditbeat.yml` file. -* Configure audit rules in your `auditbeat.yml` file. +* Configure audit rules in your `auditbeat.yml` file. * Restart {auditbeat}. -We'll walk you through these steps in more detail. +We'll walk you through these steps in more detail. . Edit your `auditbeat.yml` file and add this info to the modules configuration section: + @@ -89,11 +90,11 @@ auditbeat.modules: -a always,exit -F arch=b64 -S setsid ------------------------------------- + -. Save your configuration changes. +. Save your configuration changes. + -. Restart {auditbeat}: +. Restart {auditbeat}: + [source,sh] ------------------------------------- sudo systemctl restart auditbeat -------------------------------------- \ No newline at end of file +------------------------------------- diff --git a/x-pack/auditbeat/processors/sessionmd/processdb/db.go b/x-pack/auditbeat/processors/sessionmd/processdb/db.go index 28c848ddfdbc..e18c247a8590 100644 --- a/x-pack/auditbeat/processors/sessionmd/processdb/db.go +++ b/x-pack/auditbeat/processors/sessionmd/processdb/db.go @@ -254,6 +254,13 @@ func (db *DB) InsertFork(fork types.ProcessForkEvent) { } } +func (db *DB) InsertProcess(process Process) { + db.mutex.Lock() + defer db.mutex.Unlock() + + db.insertProcess(process) +} + func (db *DB) insertProcess(process Process) { pid := process.PIDs.Tgid db.processes[pid] = process @@ -458,8 +465,8 @@ func fullProcessFromDBProcess(p Process) types.Process { } ret.Thread.Capabilities.Permitted, _ = capabilities.FromUint64(p.Creds.CapPermitted) ret.Thread.Capabilities.Effective, _ = capabilities.FromUint64(p.Creds.CapEffective) - ret.TTY.CharDevice.Major = p.CTTY.Major - ret.TTY.CharDevice.Minor = p.CTTY.Minor + ret.TTY.CharDevice.Major = uint16(p.CTTY.Major) + ret.TTY.CharDevice.Minor = uint16(p.CTTY.Minor) ret.ExitCode = p.ExitCode return ret @@ -736,7 +743,7 @@ func isFilteredExecutable(executable string) bool { return stringStartsWithEntryInList(executable, filteredExecutables[:]) } -func getTTYType(major uint16, minor uint16) TTYType { +func getTTYType(major uint32, minor uint32) TTYType { if major >= ptsMinMajor && major <= ptsMaxMajor { return Pts } diff --git a/x-pack/auditbeat/processors/sessionmd/processdb/entry_leader_test.go b/x-pack/auditbeat/processors/sessionmd/processdb/entry_leader_test.go index 74140f47f6cd..fa0bc6e17993 100644 --- a/x-pack/auditbeat/processors/sessionmd/processdb/entry_leader_test.go +++ b/x-pack/auditbeat/processors/sessionmd/processdb/entry_leader_test.go @@ -1491,7 +1491,7 @@ func TestPIDReuseNewSession(t *testing.T) { ExitCode: 0, }) - //2nd session + // 2nd session x1 := bashPID x2 := sshd0PID sshd0PID = command0PID diff --git a/x-pack/auditbeat/processors/sessionmd/procfs/procfs.go b/x-pack/auditbeat/processors/sessionmd/procfs/procfs.go index 
fc843373389d..b76dfdfdb485 100644 --- a/x-pack/auditbeat/processors/sessionmd/procfs/procfs.go +++ b/x-pack/auditbeat/processors/sessionmd/procfs/procfs.go @@ -18,12 +18,12 @@ import ( "github.com/elastic/elastic-agent-libs/logp" ) -func MajorTTY(ttyNr uint32) uint16 { - return uint16((ttyNr >> 8) & 0xff) +func MajorTTY(ttyNr uint32) uint32 { + return (ttyNr >> 8) & 0xff } -func MinorTTY(ttyNr uint32) uint16 { - return uint16(((ttyNr & 0xfff00000) >> 20) | (ttyNr & 0xff)) +func MinorTTY(ttyNr uint32) uint32 { + return ((ttyNr >> 12) & 0xfff00) | (ttyNr & 0xff) } // this interface exists so that we can inject a mock procfs reader for deterministic testing diff --git a/x-pack/auditbeat/processors/sessionmd/provider/ebpf_provider/ebpf_provider.go b/x-pack/auditbeat/processors/sessionmd/provider/ebpf_provider/ebpf_provider.go deleted file mode 100644 index 31220465dfe4..000000000000 --- a/x-pack/auditbeat/processors/sessionmd/provider/ebpf_provider/ebpf_provider.go +++ /dev/null @@ -1,231 +0,0 @@ -// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one -// or more contributor license agreements. Licensed under the Elastic License; -// you may not use this file except in compliance with the Elastic License. - -//go:build linux - -package ebpf_provider - -import ( - "context" - "fmt" - "time" - - "github.com/elastic/beats/v7/libbeat/beat" - "github.com/elastic/beats/v7/libbeat/ebpf" - "github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/processdb" - "github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/provider" - "github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/types" - "github.com/elastic/ebpfevents" - "github.com/elastic/elastic-agent-libs/logp" -) - -const ( - name = "add_session_metadata" - eventMask = ebpf.EventMask(ebpfevents.EventTypeProcessFork | ebpfevents.EventTypeProcessExec | ebpfevents.EventTypeProcessExit) -) - -type prvdr struct { - ctx context.Context - logger *logp.Logger - db *processdb.DB -} - -func NewProvider(ctx context.Context, logger *logp.Logger, db *processdb.DB) (provider.Provider, error) { - p := prvdr{ - ctx: ctx, - logger: logger, - db: db, - } - - w, err := ebpf.GetWatcher() - if err != nil { - return nil, fmt.Errorf("get ebpf watcher: %w", err) - } - - records := w.Subscribe(name, eventMask) - - go func(logger logp.Logger) { - for { - r := <-records - if r.Error != nil { - logger.Warnw("received error from the ebpf subscription", "error", err) - continue - } - if r.Event == nil { - continue - } - ev := r.Event - switch ev.Type { - case ebpfevents.EventTypeProcessFork: - body, ok := ev.Body.(*ebpfevents.ProcessFork) - if !ok { - logger.Errorf("unexpected event body, got %T", ev.Body) - continue - } - pe := types.ProcessForkEvent{ - ParentPIDs: types.PIDInfo{ - Tid: body.ParentPids.Tid, - Tgid: body.ParentPids.Tgid, - Ppid: body.ParentPids.Ppid, - Pgid: body.ParentPids.Pgid, - Sid: body.ParentPids.Sid, - StartTimeNS: body.ParentPids.StartTimeNs, - }, - ChildPIDs: types.PIDInfo{ - Tid: body.ChildPids.Tid, - Tgid: body.ChildPids.Tgid, - Ppid: body.ChildPids.Ppid, - Pgid: body.ChildPids.Pgid, - Sid: body.ChildPids.Sid, - StartTimeNS: body.ChildPids.StartTimeNs, - }, - Creds: types.CredInfo{ - Ruid: body.Creds.Ruid, - Rgid: body.Creds.Rgid, - Euid: body.Creds.Euid, - Egid: body.Creds.Egid, - Suid: body.Creds.Suid, - Sgid: body.Creds.Sgid, - CapPermitted: body.Creds.CapPermitted, - CapEffective: body.Creds.CapEffective, - }, - } - p.db.InsertFork(pe) - case ebpfevents.EventTypeProcessExec: - body, ok 
:= ev.Body.(*ebpfevents.ProcessExec) - if !ok { - logger.Errorf("unexpected event body") - continue - } - pe := types.ProcessExecEvent{ - PIDs: types.PIDInfo{ - Tid: body.Pids.Tid, - Tgid: body.Pids.Tgid, - Ppid: body.Pids.Ppid, - Pgid: body.Pids.Pgid, - Sid: body.Pids.Sid, - StartTimeNS: body.Pids.StartTimeNs, - }, - Creds: types.CredInfo{ - Ruid: body.Creds.Ruid, - Rgid: body.Creds.Rgid, - Euid: body.Creds.Euid, - Egid: body.Creds.Egid, - Suid: body.Creds.Suid, - Sgid: body.Creds.Sgid, - CapPermitted: body.Creds.CapPermitted, - CapEffective: body.Creds.CapEffective, - }, - CTTY: types.TTYDev{ - Major: body.CTTY.Major, - Minor: body.CTTY.Minor, - }, - CWD: body.Cwd, - Argv: body.Argv, - Env: body.Env, - Filename: body.Filename, - } - p.db.InsertExec(pe) - case ebpfevents.EventTypeProcessExit: - body, ok := ev.Body.(*ebpfevents.ProcessExit) - if !ok { - logger.Errorf("unexpected event body") - continue - } - pe := types.ProcessExitEvent{ - PIDs: types.PIDInfo{ - Tid: body.Pids.Tid, - Tgid: body.Pids.Tgid, - Ppid: body.Pids.Ppid, - Pgid: body.Pids.Pgid, - Sid: body.Pids.Sid, - StartTimeNS: body.Pids.StartTimeNs, - }, - ExitCode: body.ExitCode, - } - p.db.InsertExit(pe) - } - } - }(*p.logger) - - return &p, nil -} - -const ( - maxWaitLimit = 200 * time.Millisecond // Maximum time SyncDB will wait for process - combinedWaitLimit = 2 * time.Second // Multiple SyncDB calls will wait up to this amount within resetDuration - backoffDuration = 10 * time.Second // SyncDB will stop waiting for processes for this time - resetDuration = 5 * time.Second // After this amount of times with no backoffs, the combinedWait will be reset -) - -var ( - combinedWait = 0 * time.Millisecond - inBackoff = false - backoffStart = time.Now() - since = time.Now() - backoffSkipped = 0 -) - -// With ebpf, process events are pushed to the DB by the above goroutine, so this doesn't actually update the DB. -// It does to try sync the processor and ebpf events, so that the process is in the process db before continuing. -// -// It's possible that the event to enrich arrives before the process is inserted into the DB. In that case, this -// will block continuing the enrichment until the process is seen (or the timeout is reached). -// -// If for some reason a lot of time has been spent waiting for missing processes, this also has a backoff timer during -// which it will continue without waiting for missing events to arrive, so the processor doesn't become overly backed-up -// waiting for these processes, at the cost of possibly not enriching some processes. 
-func (s prvdr) SyncDB(ev *beat.Event, pid uint32) error { - if s.db.HasProcess(pid) { - return nil - } - - now := time.Now() - if inBackoff { - if now.Sub(backoffStart) > backoffDuration { - s.logger.Warnf("ended backoff, skipped %d processes", backoffSkipped) - inBackoff = false - combinedWait = 0 * time.Millisecond - } else { - backoffSkipped += 1 - return nil - } - } else { - if combinedWait > combinedWaitLimit { - s.logger.Warn("starting backoff") - inBackoff = true - backoffStart = now - backoffSkipped = 0 - return nil - } - // maintain a moving window of time for the delays we track - if now.Sub(since) > resetDuration { - since = now - combinedWait = 0 * time.Millisecond - } - } - - start := now - nextWait := 5 * time.Millisecond - for { - waited := time.Since(start) - if s.db.HasProcess(pid) { - s.logger.Debugf("got process that was missing after %v", waited) - combinedWait = combinedWait + waited - return nil - } - if waited >= maxWaitLimit { - e := fmt.Errorf("process %v was not seen after %v", pid, waited) - s.logger.Warnf("%v", e) - combinedWait = combinedWait + waited - return e - } - time.Sleep(nextWait) - if nextWait*2+waited > maxWaitLimit { - nextWait = maxWaitLimit - waited - } else { - nextWait = nextWait * 2 - } - } -} diff --git a/x-pack/auditbeat/processors/sessionmd/provider/kerneltracingprovider/kerneltracingprovider_linux.go b/x-pack/auditbeat/processors/sessionmd/provider/kerneltracingprovider/kerneltracingprovider_linux.go new file mode 100644 index 000000000000..966f4b36c30c --- /dev/null +++ b/x-pack/auditbeat/processors/sessionmd/provider/kerneltracingprovider/kerneltracingprovider_linux.go @@ -0,0 +1,528 @@ +// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one +// or more contributor license agreements. Licensed under the Elastic License; +// you may not use this file except in compliance with the Elastic License. 
+ +//go:build linux && (amd64 || arm64) && cgo + +package kerneltracingprovider + +import ( + "context" + "encoding/base64" + "fmt" + "os" + "os/user" + "path/filepath" + "strconv" + "strings" + "sync" + "time" + + quark "github.com/elastic/go-quark" + + "github.com/elastic/beats/v7/libbeat/beat" + "github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/provider" + "github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/types" + "github.com/elastic/elastic-agent-libs/logp" +) + +type prvdr struct { + ctx context.Context + logger *logp.Logger + qq *quark.Queue + qqMtx *sync.Mutex + combinedWait time.Duration + inBackoff bool + backoffStart time.Time + since time.Time + backoffSkipped int +} + +type TTYType int + +const ( + TTYUnknown TTYType = iota + Pts + TTY + TTYConsole +) + +const ( + Init = "init" + Sshd = "sshd" + Ssm = "ssm" + Container = "container" + Terminal = "terminal" + Kthread = "kthread" + EntryConsole = "console" + EntryUnknown = "unknown" +) + +const ( + ptsMinMajor = 136 + ptsMaxMajor = 143 + ttyMajor = 4 + consoleMaxMinor = 63 + ttyMaxMinor = 255 +) + +var ( + bootID string + pidNsInode uint64 +) + +func readBootID() (string, error) { + bootID, err := os.ReadFile("/proc/sys/kernel/random/boot_id") + if err != nil { + return "", fmt.Errorf("could not read /proc/sys/kernel/random/boot_id, process entity IDs will not be correct: %w", err) + } + + return strings.TrimRight(string(bootID), "\n"), nil +} + +func readPIDNsInode() (uint64, error) { + var ret uint64 + + pidNsInodeRaw, err := os.Readlink("/proc/self/ns/pid") + if err != nil { + return 0, fmt.Errorf("could not read /proc/self/ns/pid: %w", err) + } + + if _, err = fmt.Sscanf(pidNsInodeRaw, "pid:[%d]", &ret); err != nil { + return 0, fmt.Errorf("could not parse contents of /proc/self/ns/pid (%q): %w", pidNsInodeRaw, err) + } + + return ret, nil +} + +func NewProvider(ctx context.Context, logger *logp.Logger) (provider.Provider, error) { + attr := quark.DefaultQueueAttr() + attr.Flags = quark.QQ_ALL_BACKENDS | quark.QQ_ENTRY_LEADER | quark.QQ_NO_SNAPSHOT + qq, err := quark.OpenQueue(attr, 64) + if err != nil { + return nil, fmt.Errorf("open queue: %w", err) + } + + p := &prvdr{ + ctx: ctx, + logger: logger, + qq: qq, + qqMtx: new(sync.Mutex), + combinedWait: 0 * time.Millisecond, + inBackoff: false, + backoffStart: time.Now(), + since: time.Now(), + backoffSkipped: 0, + } + + go func(ctx context.Context, qq *quark.Queue, logger *logp.Logger, p *prvdr) { + defer qq.Close() + for ctx.Err() == nil { + p.qqMtx.Lock() + events, err := qq.GetEvents() + p.qqMtx.Unlock() + if err != nil { + logger.Errorw("get events from quark, no more process enrichment from this processor will be done", "error", err) + break + } + if len(events) == 0 { + err = qq.Block() + if err != nil { + logger.Errorw("quark block, no more process enrichment from this processor will be done", "error", err) + break + } + } + } + }(ctx, qq, logger, p) + + bootID, err = readBootID() + if err != nil { + p.logger.Errorw("failed to read boot ID, entity ID will not be correct", "error", err) + } + pidNsInode, err = readPIDNsInode() + if err != nil { + p.logger.Errorw("failed to read PID namespace inode, entity ID will not be correct", "error", err) + } + + return p, nil +} + +const ( + maxWaitLimit = 1200 * time.Millisecond // Maximum time SyncDB will wait for process + combinedWaitLimit = 15 * time.Second // Multiple SyncDB calls will wait up to this amount within resetDuration + backoffDuration = 10 * time.Second // SyncDB will stop 
waiting for processes for this time
+	resetDuration     = 5 * time.Second        // After this much time with no backoffs, combinedWait will be reset
+)
+
+func (p *prvdr) SyncDB(_ *beat.Event, pid uint32) error {
+	p.qqMtx.Lock()
+	defer p.qqMtx.Unlock()
+
+	// Use qq.Lookup, not lookupLocked, in this function; the mutex is already held for the entire function, and lookupLocked would try to lock it again
+
+	if _, found := p.qq.Lookup(int(pid)); found {
+		return nil
+	}
+
+	now := time.Now()
+	if p.inBackoff {
+		if now.Sub(p.backoffStart) > backoffDuration {
+			p.logger.Warnw("ended backoff, skipped processes", "backoffSkipped", p.backoffSkipped)
+			p.inBackoff = false
+			p.combinedWait = 0 * time.Millisecond
+		} else {
+			p.backoffSkipped += 1
+			return nil
+		}
+	} else {
+		if p.combinedWait > combinedWaitLimit {
+			p.logger.Warn("starting backoff")
+			p.inBackoff = true
+			p.backoffStart = now
+			p.backoffSkipped = 0
+			return nil
+		}
+		// maintain a moving window of time for the delays we track
+		if now.Sub(p.since) > resetDuration {
+			p.since = now
+			p.combinedWait = 0 * time.Millisecond
+		}
+	}
+
+	start := now
+	nextWait := 5 * time.Millisecond
+	for {
+		waited := time.Since(start)
+		if _, found := p.qq.Lookup(int(pid)); found {
+			p.logger.Debugw("got process that was missing", "waited", waited)
+			p.combinedWait = p.combinedWait + waited
+			return nil
+		}
+		if waited >= maxWaitLimit {
+			p.combinedWait = p.combinedWait + waited
+			return fmt.Errorf("process %v was not seen after %v", pid, waited)
+		}
+		time.Sleep(nextWait)
+		if nextWait*2+waited > maxWaitLimit {
+			nextWait = maxWaitLimit - waited
+		} else {
+			nextWait = nextWait * 2
+		}
+	}
+}
+
+func (p *prvdr) GetProcess(pid uint32) (*types.Process, error) {
+	proc, found := p.lookupLocked(pid)
+	if !found {
+		return nil, fmt.Errorf("PID %d not found in cache", pid)
+	}
+
+	interactive := interactiveFromTTY(types.TTYDev{
+		Major: proc.Proc.TtyMajor,
+		Minor: proc.Proc.TtyMinor,
+	})
+
+	start := time.Unix(0, int64(proc.Proc.TimeBoot))
+
+	ret := types.Process{
+		PID:              proc.Pid,
+		Start:            &start,
+		Name:             basename(proc.Filename),
+		Executable:       proc.Filename,
+		Args:             proc.Cmdline,
+		WorkingDirectory: proc.Cwd,
+		Interactive:      &interactive,
+	}
+
+	euid := proc.Proc.Euid
+	egid := proc.Proc.Egid
+	ret.User.ID = strconv.FormatUint(uint64(euid), 10)
+	username, ok := getUserName(ret.User.ID)
+	if ok {
+		ret.User.Name = username
+	}
+	ret.Group.ID = strconv.FormatUint(uint64(egid), 10)
+	groupname, ok := getGroupName(ret.Group.ID)
+	if ok {
+		ret.Group.Name = groupname
+	}
+	ret.TTY.CharDevice.Major = uint16(proc.Proc.TtyMajor)
+	ret.TTY.CharDevice.Minor = uint16(proc.Proc.TtyMinor)
+	if proc.Exit.Valid {
+		end := time.Unix(0, int64(proc.Exit.ExitTimeProcess))
+		ret.ExitCode = proc.Exit.ExitCode
+		ret.End = &end
+	}
+	ret.EntityID = calculateEntityIDv1(pid, *ret.Start)
+
+	p.fillParent(&ret, proc.Proc.Ppid)
+	p.fillGroupLeader(&ret, proc.Proc.Pgid)
+	p.fillSessionLeader(&ret, proc.Proc.Sid)
+	p.fillEntryLeader(&ret, proc.Proc.EntryLeader)
+	setEntityID(&ret)
+	setSameAsProcess(&ret)
+	return &ret, nil
+}
+
+func (p prvdr) lookupLocked(pid uint32) (quark.Process, bool) {
+	p.qqMtx.Lock()
+	defer p.qqMtx.Unlock()
+
+	return p.qq.Lookup(int(pid))
+}
+
+func (p prvdr) fillParent(process *types.Process, ppid uint32) {
+	proc, found := p.lookupLocked(ppid)
+	if !found {
+		return
+	}
+
+	start := time.Unix(0, int64(proc.Proc.TimeBoot))
+	interactive := interactiveFromTTY(types.TTYDev{
+		Major: proc.Proc.TtyMajor,
+		Minor: proc.Proc.TtyMinor,
+	})
+	euid := proc.Proc.Euid
+	egid := proc.Proc.Egid
+	process.Parent.PID = proc.Pid
+	process.Parent.Start = &start
+	process.Parent.Name = basename(proc.Filename)
+	process.Parent.Executable = proc.Filename
+	process.Parent.Args = proc.Cmdline
+	process.Parent.WorkingDirectory = proc.Cwd
+	process.Parent.Interactive = &interactive
+	process.Parent.User.ID = strconv.FormatUint(uint64(euid), 10)
+	username, ok := getUserName(process.Parent.User.ID)
+	if ok {
+		process.Parent.User.Name = username
+	}
+	process.Parent.Group.ID = strconv.FormatUint(uint64(egid), 10)
+	groupname, ok := getGroupName(process.Parent.Group.ID)
+	if ok {
+		process.Parent.Group.Name = groupname
+	}
+	process.Parent.EntityID = calculateEntityIDv1(ppid, *process.Parent.Start)
+}
+
+func (p prvdr) fillGroupLeader(process *types.Process, pgid uint32) {
+	proc, found := p.lookupLocked(pgid)
+	if !found {
+		return
+	}
+
+	start := time.Unix(0, int64(proc.Proc.TimeBoot))
+
+	interactive := interactiveFromTTY(types.TTYDev{
+		Major: proc.Proc.TtyMajor,
+		Minor: proc.Proc.TtyMinor,
+	})
+	euid := proc.Proc.Euid
+	egid := proc.Proc.Egid
+	process.GroupLeader.PID = proc.Pid
+	process.GroupLeader.Start = &start
+	process.GroupLeader.Name = basename(proc.Filename)
+	process.GroupLeader.Executable = proc.Filename
+	process.GroupLeader.Args = proc.Cmdline
+	process.GroupLeader.WorkingDirectory = proc.Cwd
+	process.GroupLeader.Interactive = &interactive
+	process.GroupLeader.User.ID = strconv.FormatUint(uint64(euid), 10)
+	username, ok := getUserName(process.GroupLeader.User.ID)
+	if ok {
+		process.GroupLeader.User.Name = username
+	}
+	process.GroupLeader.Group.ID = strconv.FormatUint(uint64(egid), 10)
+	groupname, ok := getGroupName(process.GroupLeader.Group.ID)
+	if ok {
+		process.GroupLeader.Group.Name = groupname
+	}
+	process.GroupLeader.EntityID = calculateEntityIDv1(pgid, *process.GroupLeader.Start)
+}
+
+func (p prvdr) fillSessionLeader(process *types.Process, sid uint32) {
+	proc, found := p.lookupLocked(sid)
+	if !found {
+		return
+	}
+
+	start := time.Unix(0, int64(proc.Proc.TimeBoot))
+
+	interactive := interactiveFromTTY(types.TTYDev{
+		Major: proc.Proc.TtyMajor,
+		Minor: proc.Proc.TtyMinor,
+	})
+	euid := proc.Proc.Euid
+	egid := proc.Proc.Egid
+	process.SessionLeader.PID = proc.Pid
+	process.SessionLeader.Start = &start
+	process.SessionLeader.Name = basename(proc.Filename)
+	process.SessionLeader.Executable = proc.Filename
+	process.SessionLeader.Args = proc.Cmdline
+	process.SessionLeader.WorkingDirectory = proc.Cwd
+	process.SessionLeader.Interactive = &interactive
+	process.SessionLeader.User.ID = strconv.FormatUint(uint64(euid), 10)
+	username, ok := getUserName(process.SessionLeader.User.ID)
+	if ok {
+		process.SessionLeader.User.Name = username
+	}
+	process.SessionLeader.Group.ID = strconv.FormatUint(uint64(egid), 10)
+	groupname, ok := getGroupName(process.SessionLeader.Group.ID)
+	if ok {
+		process.SessionLeader.Group.Name = groupname
+	}
+	process.SessionLeader.EntityID = calculateEntityIDv1(sid, *process.SessionLeader.Start)
+}
+
+func (p prvdr) fillEntryLeader(process *types.Process, elid uint32) {
+	proc, found := p.lookupLocked(elid)
+	if !found {
+		return
+	}
+
+	start := time.Unix(0, int64(proc.Proc.TimeBoot))
+
+	interactive := interactiveFromTTY(types.TTYDev{
+		Major: proc.Proc.TtyMajor,
+		Minor: proc.Proc.TtyMinor,
+	})
+
+	euid := proc.Proc.Euid
+	egid := proc.Proc.Egid
+	process.EntryLeader.PID = proc.Pid
+	process.EntryLeader.Start = &start
+	process.EntryLeader.WorkingDirectory = proc.Cwd
+	process.EntryLeader.Interactive = &interactive
+	process.EntryLeader.User.ID =
strconv.FormatUint(uint64(euid), 10) + username, ok := getUserName(process.EntryLeader.User.ID) + if ok { + process.EntryLeader.User.Name = username + } + process.EntryLeader.Group.ID = strconv.FormatUint(uint64(egid), 10) + groupname, ok := getGroupName(process.EntryLeader.Group.ID) + if ok { + process.EntryLeader.Group.Name = groupname + } + + process.EntryLeader.EntityID = calculateEntityIDv1(elid, *process.EntryLeader.Start) + process.EntryLeader.EntryMeta.Type = getEntryTypeName(proc.Proc.EntryLeaderType) +} + +func setEntityID(process *types.Process) { + if process.PID != 0 && process.Start != nil { + process.EntityID = calculateEntityIDv1(process.PID, *process.Start) + } + + if process.Parent.PID != 0 && process.Parent.Start != nil { + process.Parent.EntityID = calculateEntityIDv1(process.Parent.PID, *process.Parent.Start) + } + + if process.GroupLeader.PID != 0 && process.GroupLeader.Start != nil { + process.GroupLeader.EntityID = calculateEntityIDv1(process.GroupLeader.PID, *process.GroupLeader.Start) + } + + if process.SessionLeader.PID != 0 && process.SessionLeader.Start != nil { + process.SessionLeader.EntityID = calculateEntityIDv1(process.SessionLeader.PID, *process.SessionLeader.Start) + } + + if process.EntryLeader.PID != 0 && process.EntryLeader.Start != nil { + process.EntryLeader.EntityID = calculateEntityIDv1(process.EntryLeader.PID, *process.EntryLeader.Start) + } +} + +func setSameAsProcess(process *types.Process) { + if process.GroupLeader.PID != 0 && process.GroupLeader.Start != nil { + sameAsProcess := process.PID == process.GroupLeader.PID + process.GroupLeader.SameAsProcess = &sameAsProcess + } + + if process.SessionLeader.PID != 0 && process.SessionLeader.Start != nil { + sameAsProcess := process.PID == process.SessionLeader.PID + process.SessionLeader.SameAsProcess = &sameAsProcess + } + + if process.EntryLeader.PID != 0 && process.EntryLeader.Start != nil { + sameAsProcess := process.PID == process.EntryLeader.PID + process.EntryLeader.SameAsProcess = &sameAsProcess + } +} + +func interactiveFromTTY(tty types.TTYDev) bool { + return TTYUnknown != getTTYType(tty.Major, tty.Minor) +} + +func getTTYType(major uint32, minor uint32) TTYType { + if major >= ptsMinMajor && major <= ptsMaxMajor { + return Pts + } + + if ttyMajor == major { + if minor <= consoleMaxMinor { + return TTYConsole + } else if minor > consoleMaxMinor && minor <= ttyMaxMinor { + return TTY + } + } + + return TTYUnknown +} + +func calculateEntityIDv1(pid uint32, startTime time.Time) string { + return base64.StdEncoding.EncodeToString( + []byte( + fmt.Sprintf("%d__%s__%d__%d", + pidNsInode, + bootID, + uint64(pid), + uint64(startTime.Unix()), + ), + ), + ) +} + +// `path.Base` returns a '.' 
for empty strings, this just special cases that +// situation to return an empty string +func basename(pathStr string) string { + if pathStr == "" { + return "" + } + + return filepath.Base(pathStr) +} + +// getUserName will return the name associated with the user ID, if it exists +func getUserName(id string) (string, bool) { + user, err := user.LookupId(id) + if err != nil { + return "", false + } + return user.Username, true +} + +// getGroupName will return the name associated with the group ID, if it exists +func getGroupName(id string) (string, bool) { + group, err := user.LookupGroupId(id) + if err != nil { + return "", false + } + return group.Name, true +} + +func getEntryTypeName(entryType uint32) string { + switch int(entryType) { + case quark.QUARK_ELT_INIT: + return Init + case quark.QUARK_ELT_SSHD: + return Sshd + case quark.QUARK_ELT_SSM: + return Ssm + case quark.QUARK_ELT_CONTAINER: + return Container + case quark.QUARK_ELT_TERM: + return Terminal + case quark.QUARK_ELT_CONSOLE: + return EntryConsole + case quark.QUARK_ELT_KTHREAD: + return Kthread + default: + return "unknown" + } +} diff --git a/x-pack/auditbeat/processors/sessionmd/provider/kerneltracingprovider/kerneltracingprovider_other.go b/x-pack/auditbeat/processors/sessionmd/provider/kerneltracingprovider/kerneltracingprovider_other.go new file mode 100644 index 000000000000..e895a696747d --- /dev/null +++ b/x-pack/auditbeat/processors/sessionmd/provider/kerneltracingprovider/kerneltracingprovider_other.go @@ -0,0 +1,31 @@ +// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one +// or more contributor license agreements. Licensed under the Elastic License; +// you may not use this file except in compliance with the Elastic License. + +//go:build linux && !((amd64 || arm64) && cgo) + +package kerneltracingprovider + +import ( + "context" + "fmt" + + "github.com/elastic/beats/v7/libbeat/beat" + "github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/provider" + "github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/types" + "github.com/elastic/elastic-agent-libs/logp" +) + +type prvdr struct{} + +func NewProvider(ctx context.Context, logger *logp.Logger) (provider.Provider, error) { + return prvdr{}, fmt.Errorf("build type not supported, cgo required") +} + +func (p prvdr) SyncDB(event *beat.Event, pid uint32) error { + return fmt.Errorf("build type not supported") +} + +func (p prvdr) GetProcess(pid uint32) (*types.Process, error) { + return nil, fmt.Errorf("build type not supported") +} diff --git a/x-pack/auditbeat/processors/sessionmd/provider/procfs_provider/procfs_provider.go b/x-pack/auditbeat/processors/sessionmd/provider/procfsprovider/procfsprovider.go similarity index 78% rename from x-pack/auditbeat/processors/sessionmd/provider/procfs_provider/procfs_provider.go rename to x-pack/auditbeat/processors/sessionmd/provider/procfsprovider/procfsprovider.go index 4380bc2ccae4..4934a79fc52c 100644 --- a/x-pack/auditbeat/processors/sessionmd/provider/procfs_provider/procfs_provider.go +++ b/x-pack/auditbeat/processors/sessionmd/provider/procfsprovider/procfsprovider.go @@ -4,7 +4,7 @@ //go:build linux -package procfs_provider +package procfsprovider import ( "context" @@ -40,8 +40,12 @@ func NewProvider(ctx context.Context, logger *logp.Logger, db *processdb.DB, rea }, nil } +func (p prvdr) GetProcess(pid uint32) (*types.Process, error) { + return nil, fmt.Errorf("not implemented") +} + // SyncDB will update the process DB with process info from procfs or the 
event itself -func (s prvdr) SyncDB(ev *beat.Event, pid uint32) error { +func (p prvdr) SyncDB(ev *beat.Event, pid uint32) error { syscall, err := ev.GetValue(syscallField) if err != nil { return fmt.Errorf("event not supported, no syscall data") @@ -50,17 +54,17 @@ func (s prvdr) SyncDB(ev *beat.Event, pid uint32) error { switch syscall { case "execveat", "execve": pe := types.ProcessExecEvent{} - proc_info, err := s.reader.GetProcess(pid) + procInfo, err := p.reader.GetProcess(pid) if err == nil { - pe.PIDs = proc_info.PIDs - pe.Creds = proc_info.Creds - pe.CTTY = proc_info.CTTY - pe.CWD = proc_info.Cwd - pe.Argv = proc_info.Argv - pe.Env = proc_info.Env - pe.Filename = proc_info.Filename + pe.PIDs = procInfo.PIDs + pe.Creds = procInfo.Creds + pe.CTTY = procInfo.CTTY + pe.CWD = procInfo.Cwd + pe.Argv = procInfo.Argv + pe.Env = procInfo.Env + pe.Filename = procInfo.Filename } else { - s.logger.Warnf("couldn't get process info from proc for pid %v: %v", pid, err) + p.logger.Warnw("couldn't get process info from proc for pid", "pid", pid, "error", err) // If process info couldn't be taken from procfs, populate with as much info as // possible from the event pe.PIDs.Tgid = pid @@ -77,7 +81,7 @@ func (s prvdr) SyncDB(ev *beat.Event, pid uint32) error { } pe.PIDs.Ppid = uint32(i) - parent, err = s.db.GetProcess(pe.PIDs.Ppid) + parent, err = p.db.GetProcess(pe.PIDs.Ppid) if err != nil { goto out } @@ -87,10 +91,14 @@ func (s prvdr) SyncDB(ev *beat.Event, pid uint32) error { if err != nil { goto out } - pe.CWD = intr.(string) + if str, ok := intr.(string); ok { + pe.CWD = str + } else { + goto out + } out: } - s.db.InsertExec(pe) + p.db.InsertExec(pe) if err != nil { return fmt.Errorf("insert exec to db: %w", err) } @@ -100,7 +108,7 @@ func (s prvdr) SyncDB(ev *beat.Event, pid uint32) error { Tgid: pid, }, } - s.db.InsertExit(pe) + p.db.InsertExit(pe) case "setsid": intr, err := ev.Fields.GetValue("auditd.result") if err != nil { @@ -117,7 +125,7 @@ func (s prvdr) SyncDB(ev *beat.Event, pid uint32) error { Sid: pid, }, } - s.db.InsertSetsid(setsid_ev) + p.db.InsertSetsid(setsid_ev) } } return nil diff --git a/x-pack/auditbeat/processors/sessionmd/provider/procfs_provider/procfs_provider_test.go b/x-pack/auditbeat/processors/sessionmd/provider/procfsprovider/procfsprovider_test.go similarity index 99% rename from x-pack/auditbeat/processors/sessionmd/provider/procfs_provider/procfs_provider_test.go rename to x-pack/auditbeat/processors/sessionmd/provider/procfsprovider/procfsprovider_test.go index 455cb3c0433a..42f19f488ce4 100644 --- a/x-pack/auditbeat/processors/sessionmd/provider/procfs_provider/procfs_provider_test.go +++ b/x-pack/auditbeat/processors/sessionmd/provider/procfsprovider/procfsprovider_test.go @@ -4,7 +4,7 @@ //go:build linux -package procfs_provider +package procfsprovider import ( "context" diff --git a/x-pack/auditbeat/processors/sessionmd/provider/provider.go b/x-pack/auditbeat/processors/sessionmd/provider/provider.go index e95da3ec2006..4ac9530cfeaa 100644 --- a/x-pack/auditbeat/processors/sessionmd/provider/provider.go +++ b/x-pack/auditbeat/processors/sessionmd/provider/provider.go @@ -8,9 +8,11 @@ package provider import ( "github.com/elastic/beats/v7/libbeat/beat" + "github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/types" ) // SyncDB should ensure the DB is in a state to handle the event before returning. 
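+// GetProcess returns the enriched process metadata for a PID. Backends that
+// cannot serve direct lookups (for example, the procfs provider) return an error instead.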
 type Provider interface {
 	SyncDB(event *beat.Event, pid uint32) error
+	GetProcess(pid uint32) (*types.Process, error)
 }
diff --git a/x-pack/auditbeat/processors/sessionmd/types/events.go b/x-pack/auditbeat/processors/sessionmd/types/events.go
index 5f8d67d763f1..857ab8fa2c10 100644
--- a/x-pack/auditbeat/processors/sessionmd/types/events.go
+++ b/x-pack/auditbeat/processors/sessionmd/types/events.go
@@ -60,8 +60,8 @@ type TTYTermios struct {
 }
 
 type TTYDev struct {
-	Minor   uint16
-	Major   uint16
+	Minor   uint32
+	Major   uint32
 	Winsize TTYWinsize
 	Termios TTYTermios
 }
diff --git a/x-pack/auditbeat/processors/sessionmd/types/process.go b/x-pack/auditbeat/processors/sessionmd/types/process.go
index 8f52a9c5aa59..a437f35310f3 100644
--- a/x-pack/auditbeat/processors/sessionmd/types/process.go
+++ b/x-pack/auditbeat/processors/sessionmd/types/process.go
@@ -448,6 +448,9 @@ func (p *Process) ToMap() mapstr.M {
 	if p.EntryLeader.Start != nil {
 		process.Put("entry_leader.start", p.EntryLeader.Start)
 	}
+	if p.End != nil {
+		process.Put("end", p.End)
+	}
 
 	return process
 }
diff --git a/x-pack/filebeat/_meta/config/filebeat.inputs.reference.xpack.yml.tmpl b/x-pack/filebeat/_meta/config/filebeat.inputs.reference.xpack.yml.tmpl
index 8215bc3c3893..4188035f832a 100644
--- a/x-pack/filebeat/_meta/config/filebeat.inputs.reference.xpack.yml.tmpl
+++ b/x-pack/filebeat/_meta/config/filebeat.inputs.reference.xpack.yml.tmpl
@@ -79,8 +79,8 @@
   # SQS queue URL to receive messages from (required).
   #queue_url: "https://sqs.us-east-1.amazonaws.com/1234/test-aws-s3-logs-queue"
 
-  # Maximum number of SQS messages that can be inflight at any time.
-  #max_number_of_messages: 5
+  # Number of workers on S3 bucket or SQS queue
+  #number_of_workers: 5
 
   # Maximum duration of an AWS API call (excluding S3 GetObject calls).
   #api_timeout: 120s
@@ -135,6 +135,8 @@
   #credential_profile_name: test-aws-s3-input
 
   # ARN of the log group to collect logs from
+  # The ARN may refer to a log group in a linked source account
+  # Note: This property takes precedence over `log_group_name` & `log_group_name_prefix`
   #log_group_arn: "arn:aws:logs:us-east-1:428152502467:log-group:test:*"
 
   # Name of the log group to collect logs from.
@@ -142,10 +144,15 @@
   #log_group_name: test
 
   # The prefix for a group of log group names.
+  # You can include linked source accounts by using the property `include_linked_accounts_for_prefix_mode`.
   # Note: `region_name` is required when `log_group_name_prefix` is given.
   # `log_group_name` and `log_group_name_prefix` cannot be given at the same time.
   #log_group_name_prefix: /aws/
 
+  # Whether to include linked source accounts when obtaining log groups matching the prefix provided through `log_group_name_prefix`.
+  # This property works together with `log_group_name_prefix`; the default (if unset) is false.
+  #include_linked_accounts_for_prefix_mode: true
+
   # Region that the specified log group or log group prefix belongs to.
   #region_name: us-east-1
 
diff --git a/x-pack/filebeat/docs/inputs/input-aws-cloudwatch.asciidoc b/x-pack/filebeat/docs/inputs/input-aws-cloudwatch.asciidoc
index c2b898da3587..d986e9e6b20b 100644
--- a/x-pack/filebeat/docs/inputs/input-aws-cloudwatch.asciidoc
+++ b/x-pack/filebeat/docs/inputs/input-aws-cloudwatch.asciidoc
@@ -40,19 +40,37 @@ The `aws-cloudwatch` input supports the following configuration options plus the
 [float]
 ==== `log_group_arn`
 ARN of the log group to collect logs from.
+The ARN may refer to a log group in a linked source account.
+
+Note: `log_group_arn` cannot be combined with the `log_group_name`, `log_group_name_prefix`, and `region_name` properties.
+If set, values extracted from `log_group_arn` take precedence over them.
+
+Note: If the log group is in a linked source account and Filebeat is configured to use a monitoring account, you must use `log_group_arn`.
+You can read more about AWS account linking and cross-account observability in the https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Unified-Cross-Account.html[official documentation].
 
 [float]
 ==== `log_group_name`
-Name of the log group to collect logs from. Note: `region_name` is required when
-log_group_name is given.
+Name of the log group to collect logs from.
+
+Note: `region_name` is required when log_group_name is given.
 
 [float]
 ==== `log_group_name_prefix`
-The prefix for a group of log group names. Note: `region_name` is required when
-log_group_name_prefix is given. `log_group_name` and `log_group_name_prefix`
+The prefix for a group of log group names. See the `include_linked_accounts_for_prefix_mode` option for linked source account behavior.
+
+Note: `region_name` is required when
+`log_group_name_prefix` is given. `log_group_name` and `log_group_name_prefix`
 cannot be given at the same time. The number of workers that will process the
 log groups under this prefix is set through the `number_of_workers` config.
 
+[float]
+==== `include_linked_accounts_for_prefix_mode`
+Controls whether log groups from linked source accounts are included when matching the prefix defined through `log_group_name_prefix`.
+Accepts a boolean and is disabled by default.
+
+Note: Use `log_group_arn` if you want to collect logs from a known log group (including one in a linked source account).
+You can read more about AWS account linking and cross-account observability in the https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Unified-Cross-Account.html[official documentation].
+
 [float]
 ==== `region_name`
 Region that the specified log group or log group prefix belongs to.
diff --git a/x-pack/filebeat/docs/inputs/input-aws-s3.asciidoc b/x-pack/filebeat/docs/inputs/input-aws-s3.asciidoc
index 43d4b102f639..aa8ecbf72595 100644
--- a/x-pack/filebeat/docs/inputs/input-aws-s3.asciidoc
+++ b/x-pack/filebeat/docs/inputs/input-aws-s3.asciidoc
@@ -307,18 +307,6 @@ The maximum number of bytes that a single log message can have. All bytes after
 multiline log messages, which can get large. This only applies to non-JSON logs.
 The default is `10 MiB`.
 
-[float]
-==== `max_number_of_messages`
-
-The maximum number of SQS messages that can be inflight at any time. Defaults
-to 5. Setting this parameter too high can overload Elastic Agent and cause
-ingest failures in situations where the SQS messages contain many S3 objects
-or the S3 objects themselves contain large numbers of messages.
-We recommend to keep the default value 5 and use the `Balanced` or `Optimized for
-Throughput` setting in the
-{fleet-guide}/es-output-settings.html#es-output-settings-performance-tuning-settings[preset]
-options to tune your Elastic Agent performance.
-
 [id="input-{type}-parsers"]
 [float]
 ==== `parsers`
@@ -504,7 +492,7 @@ Prefix to apply for the list request to the S3 bucket. Default empty.
 [float]
 ==== `number_of_workers`
 
-Number of workers that will process the S3 objects listed. (Required when `bucket_arn` is set).
+Number of workers that will process the listed S3 objects or SQS messages. Required when `bucket_arn` is set; otherwise (in SQS mode) it defaults to 5.
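+
+For illustration, a minimal SQS-mode sketch (the queue URL is a placeholder, and 5 workers is simply the default) showing `number_of_workers` as the knob for tuning ingestion rate:
+
+[source,yaml]
+-------------------------------------
+filebeat.inputs:
+- type: aws-s3
+  queue_url: https://sqs.us-east-1.amazonaws.com/123456789012/my-queue
+  number_of_workers: 5
+-------------------------------------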
[float] diff --git a/x-pack/filebeat/filebeat.reference.yml b/x-pack/filebeat/filebeat.reference.yml index 749f0e0c291f..c00099c36670 100644 --- a/x-pack/filebeat/filebeat.reference.yml +++ b/x-pack/filebeat/filebeat.reference.yml @@ -139,7 +139,7 @@ filebeat.modules: # Bucket list interval on S3 bucket #var.bucket_list_interval: 300s - # Number of workers on S3 bucket + # Number of workers on S3 bucket or SQS queue #var.number_of_workers: 5 # Process CloudTrail logs @@ -188,9 +188,6 @@ filebeat.modules: # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint. #var.fips_enabled: false - # The maximum number of messages to return from SQS. Valid values: 1 to 10. - #var.max_number_of_messages: 5 - # URL to proxy AWS API calls #var.proxy_url: http://proxy:3128 @@ -212,7 +209,7 @@ filebeat.modules: # Bucket list interval on S3 bucket #var.bucket_list_interval: 300s - # Number of workers on S3 bucket + # Number of workers on S3 bucket or SQS queue #var.number_of_workers: 5 # Filename of AWS credential file @@ -249,9 +246,6 @@ filebeat.modules: # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint. #var.fips_enabled: false - # The maximum number of messages to return from SQS. Valid values: 1 to 10. - #var.max_number_of_messages: 5 - # URL to proxy AWS API calls #var.proxy_url: http://proxy:3128 @@ -273,7 +267,7 @@ filebeat.modules: # Bucket list interval on S3 bucket #var.bucket_list_interval: 300s - # Number of workers on S3 bucket + # Number of workers on S3 bucket or SQS queue #var.number_of_workers: 5 # Filename of AWS credential file @@ -310,9 +304,6 @@ filebeat.modules: # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint. #var.fips_enabled: false - # The maximum number of messages to return from SQS. Valid values: 1 to 10. - #var.max_number_of_messages: 5 - # URL to proxy AWS API calls #var.proxy_url: http://proxy:3128 @@ -334,7 +325,7 @@ filebeat.modules: # Bucket list interval on S3 bucket #var.bucket_list_interval: 300s - # Number of workers on S3 bucket + # Number of workers on S3 bucket or SQS queue #var.number_of_workers: 5 # Filename of AWS credential file @@ -371,9 +362,6 @@ filebeat.modules: # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint. #var.fips_enabled: false - # The maximum number of messages to return from SQS. Valid values: 1 to 10. - #var.max_number_of_messages: 5 - # URL to proxy AWS API calls #var.proxy_url: http://proxy:3128 @@ -395,7 +383,7 @@ filebeat.modules: # Bucket list interval on S3 bucket #var.bucket_list_interval: 300s - # Number of workers on S3 bucket + # Number of workers on S3 bucket or SQS queue #var.number_of_workers: 5 # Filename of AWS credential file @@ -432,9 +420,6 @@ filebeat.modules: # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint. #var.fips_enabled: false - # The maximum number of messages to return from SQS. Valid values: 1 to 10. 
-  #var.max_number_of_messages: 5
-
   # URL to proxy AWS API calls
   #var.proxy_url: http://proxy:3128
 
@@ -456,7 +441,7 @@ filebeat.modules:
   # Bucket list interval on S3 bucket
   #var.bucket_list_interval: 300s
 
-  # Number of workers on S3 bucket
+  # Number of workers on S3 bucket or SQS queue
   #var.number_of_workers: 5
 
   # Filename of AWS credential file
@@ -493,9 +478,6 @@
   # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint.
   #var.fips_enabled: false
 
-  # The maximum number of messages to return from SQS. Valid values: 1 to 10.
-  #var.max_number_of_messages: 5
-
   # URL to proxy AWS API calls
   #var.proxy_url: http://proxy:3128
 
@@ -3013,8 +2995,8 @@ filebeat.inputs:
   # SQS queue URL to receive messages from (required).
   #queue_url: "https://sqs.us-east-1.amazonaws.com/1234/test-aws-s3-logs-queue"
 
-  # Maximum number of SQS messages that can be inflight at any time.
-  #max_number_of_messages: 5
+  # Number of workers on S3 bucket or SQS queue
+  #number_of_workers: 5
 
   # Maximum duration of an AWS API call (excluding S3 GetObject calls).
   #api_timeout: 120s
@@ -3069,6 +3051,8 @@
   #credential_profile_name: test-aws-s3-input
 
   # ARN of the log group to collect logs from
+  # The ARN may refer to a log group in a linked source account
+  # Note: This property takes precedence over `log_group_name` & `log_group_name_prefix`
   #log_group_arn: "arn:aws:logs:us-east-1:428152502467:log-group:test:*"
 
   # Name of the log group to collect logs from.
@@ -3076,10 +3060,15 @@
   #log_group_name: test
 
   # The prefix for a group of log group names.
+  # You can include linked source accounts by using the property `include_linked_accounts_for_prefix_mode`.
   # Note: `region_name` is required when `log_group_name_prefix` is given.
   # `log_group_name` and `log_group_name_prefix` cannot be given at the same time.
   #log_group_name_prefix: /aws/
 
+  # Whether to include linked source accounts when obtaining log groups matching the prefix provided through `log_group_name_prefix`.
+  # This property works together with `log_group_name_prefix`; the default (if unset) is false.
+  #include_linked_accounts_for_prefix_mode: true
+
   # Region that the specified log group or log group prefix belongs to.
#region_name: us-east-1 diff --git a/x-pack/filebeat/input/awscloudwatch/cloudwatch.go b/x-pack/filebeat/input/awscloudwatch/cloudwatch.go index ffc5b2e3cd80..4d089268e356 100644 --- a/x-pack/filebeat/input/awscloudwatch/cloudwatch.go +++ b/x-pack/filebeat/input/awscloudwatch/cloudwatch.go @@ -37,7 +37,7 @@ type cloudwatchPoller struct { } type workResponse struct { - logGroup string + logGroupId string startTime, endTime time.Time } @@ -64,8 +64,8 @@ func newCloudwatchPoller(log *logp.Logger, metrics *inputMetrics, } } -func (p *cloudwatchPoller) run(svc *cloudwatchlogs.Client, logGroup string, startTime, endTime time.Time, logProcessor *logProcessor) { - err := p.getLogEventsFromCloudWatch(svc, logGroup, startTime, endTime, logProcessor) +func (p *cloudwatchPoller) run(svc *cloudwatchlogs.Client, logGroupId string, startTime, endTime time.Time, logProcessor *logProcessor) { + err := p.getLogEventsFromCloudWatch(svc, logGroupId, startTime, endTime, logProcessor) if err != nil { var errRequestCanceled *awssdk.RequestCanceledError if errors.As(err, &errRequestCanceled) { @@ -76,9 +76,9 @@ func (p *cloudwatchPoller) run(svc *cloudwatchlogs.Client, logGroup string, star } // getLogEventsFromCloudWatch uses FilterLogEvents API to collect logs from CloudWatch -func (p *cloudwatchPoller) getLogEventsFromCloudWatch(svc *cloudwatchlogs.Client, logGroup string, startTime, endTime time.Time, logProcessor *logProcessor) error { +func (p *cloudwatchPoller) getLogEventsFromCloudWatch(svc *cloudwatchlogs.Client, logGroupId string, startTime, endTime time.Time, logProcessor *logProcessor) error { // construct FilterLogEventsInput - filterLogEventsInput := p.constructFilterLogEventsInput(startTime, endTime, logGroup) + filterLogEventsInput := p.constructFilterLogEventsInput(startTime, endTime, logGroupId) paginator := cloudwatchlogs.NewFilterLogEventsPaginator(svc, filterLogEventsInput) for paginator.HasMorePages() { filterLogEventsOutput, err := paginator.NextPage(context.TODO()) @@ -96,16 +96,16 @@ func (p *cloudwatchPoller) getLogEventsFromCloudWatch(svc *cloudwatchlogs.Client p.log.Debug("done sleeping") p.log.Debugf("Processing #%v events", len(logEvents)) - logProcessor.processLogEvents(logEvents, logGroup, p.region) + logProcessor.processLogEvents(logEvents, logGroupId, p.region) } return nil } -func (p *cloudwatchPoller) constructFilterLogEventsInput(startTime, endTime time.Time, logGroup string) *cloudwatchlogs.FilterLogEventsInput { +func (p *cloudwatchPoller) constructFilterLogEventsInput(startTime, endTime time.Time, logGroupId string) *cloudwatchlogs.FilterLogEventsInput { filterLogEventsInput := &cloudwatchlogs.FilterLogEventsInput{ - LogGroupName: awssdk.String(logGroup), - StartTime: awssdk.Int64(unixMsFromTime(startTime)), - EndTime: awssdk.Int64(unixMsFromTime(endTime)), + LogGroupIdentifier: awssdk.String(logGroupId), + StartTime: awssdk.Int64(unixMsFromTime(startTime)), + EndTime: awssdk.Int64(unixMsFromTime(endTime)), } if len(p.config.LogStreams) > 0 { @@ -138,9 +138,9 @@ func (p *cloudwatchPoller) startWorkers( work = <-p.workResponseChan } - p.log.Infof("aws-cloudwatch input worker for log group: '%v' has started", work.logGroup) - p.run(svc, work.logGroup, work.startTime, work.endTime, logProcessor) - p.log.Infof("aws-cloudwatch input worker for log group '%v' has stopped.", work.logGroup) + p.log.Infof("aws-cloudwatch input worker for log group: '%v' has started", work.logGroupId) + p.run(svc, work.logGroupId, work.startTime, work.endTime, logProcessor) + 
p.log.Infof("aws-cloudwatch input worker for log group '%v' has stopped.", work.logGroupId) } }() } @@ -149,7 +149,7 @@ func (p *cloudwatchPoller) startWorkers( // receive implements the main run loop that distributes tasks to the worker // goroutines. It accepts a "clock" callback (which on a live input should // equal time.Now) to allow deterministic unit tests. -func (p *cloudwatchPoller) receive(ctx context.Context, logGroupNames []string, clock func() time.Time) { +func (p *cloudwatchPoller) receive(ctx context.Context, logGroupIDs []string, clock func() time.Time) { defer p.workerWg.Wait() // startTime and endTime are the bounds of the current scanning interval. // If we're starting at the end of the logs, advance the start time to the @@ -160,15 +160,15 @@ func (p *cloudwatchPoller) receive(ctx context.Context, logGroupNames []string, startTime = endTime.Add(-p.config.ScanFrequency) } for ctx.Err() == nil { - for _, lg := range logGroupNames { + for _, lg := range logGroupIDs { select { case <-ctx.Done(): return case <-p.workRequestChan: p.workResponseChan <- workResponse{ - logGroup: lg, - startTime: startTime, - endTime: endTime, + logGroupId: lg, + startTime: startTime, + endTime: endTime, } } } diff --git a/x-pack/filebeat/input/awscloudwatch/cloudwatch_test.go b/x-pack/filebeat/input/awscloudwatch/cloudwatch_test.go index f666db859824..0c266c8291f1 100644 --- a/x-pack/filebeat/input/awscloudwatch/cloudwatch_test.go +++ b/x-pack/filebeat/input/awscloudwatch/cloudwatch_test.go @@ -31,7 +31,7 @@ type receiveTestStep struct { type receiveTestCase struct { name string - logGroups []string + logGroupIDs []string configOverrides func(*config) startTime time.Time steps []receiveTestStep @@ -46,37 +46,37 @@ func TestReceive(t *testing.T) { t3 := t2.Add(time.Hour) testCases := []receiveTestCase{ { - name: "Default config with one log group", - logGroups: []string{"a"}, - startTime: t1, + name: "Default config with one log group", + logGroupIDs: []string{"a"}, + startTime: t1, steps: []receiveTestStep{ { expected: []workResponse{ - {logGroup: "a", startTime: t0, endTime: t1}, + {logGroupId: "a", startTime: t0, endTime: t1}, }, nextTime: t2, }, { expected: []workResponse{ - {logGroup: "a", startTime: t1, endTime: t2}, + {logGroupId: "a", startTime: t1, endTime: t2}, }, nextTime: t3, }, { expected: []workResponse{ - {logGroup: "a", startTime: t2, endTime: t3}, + {logGroupId: "a", startTime: t2, endTime: t3}, }, }, }, }, { - name: "Default config with two log groups", - logGroups: []string{"a", "b"}, - startTime: t1, + name: "Default config with two log groups", + logGroupIDs: []string{"a", "b"}, + startTime: t1, steps: []receiveTestStep{ { expected: []workResponse{ - {logGroup: "a", startTime: t0, endTime: t1}, + {logGroupId: "a", startTime: t0, endTime: t1}, }, nextTime: t2, }, @@ -84,49 +84,49 @@ func TestReceive(t *testing.T) { expected: []workResponse{ // start/end times for the second log group should be the same // even though the clock has changed. 
- {logGroup: "b", startTime: t0, endTime: t1}, + {logGroupId: "b", startTime: t0, endTime: t1}, }, }, { expected: []workResponse{ - {logGroup: "a", startTime: t1, endTime: t2}, - {logGroup: "b", startTime: t1, endTime: t2}, + {logGroupId: "a", startTime: t1, endTime: t2}, + {logGroupId: "b", startTime: t1, endTime: t2}, }, nextTime: t3, }, { expected: []workResponse{ - {logGroup: "a", startTime: t2, endTime: t3}, - {logGroup: "b", startTime: t2, endTime: t3}, + {logGroupId: "a", startTime: t2, endTime: t3}, + {logGroupId: "b", startTime: t2, endTime: t3}, }, }, }, }, { - name: "One log group with start_position: end", - logGroups: []string{"a"}, - startTime: t1, + name: "One log group with start_position: end", + logGroupIDs: []string{"a"}, + startTime: t1, configOverrides: func(c *config) { c.StartPosition = "end" }, steps: []receiveTestStep{ { expected: []workResponse{ - {logGroup: "a", startTime: t1.Add(-defaultScanFrequency), endTime: t1}, + {logGroupId: "a", startTime: t1.Add(-defaultScanFrequency), endTime: t1}, }, nextTime: t2, }, { expected: []workResponse{ - {logGroup: "a", startTime: t1, endTime: t2}, + {logGroupId: "a", startTime: t1, endTime: t2}, }, }, }, }, { - name: "Two log group with start_position: end and latency", - logGroups: []string{"a", "b"}, - startTime: t1, + name: "Two log group with start_position: end and latency", + logGroupIDs: []string{"a", "b"}, + startTime: t1, configOverrides: func(c *config) { c.StartPosition = "end" c.Latency = time.Second @@ -134,40 +134,40 @@ func TestReceive(t *testing.T) { steps: []receiveTestStep{ { expected: []workResponse{ - {logGroup: "a", startTime: t1.Add(-defaultScanFrequency - time.Second), endTime: t1.Add(-time.Second)}, - {logGroup: "b", startTime: t1.Add(-defaultScanFrequency - time.Second), endTime: t1.Add(-time.Second)}, + {logGroupId: "a", startTime: t1.Add(-defaultScanFrequency - time.Second), endTime: t1.Add(-time.Second)}, + {logGroupId: "b", startTime: t1.Add(-defaultScanFrequency - time.Second), endTime: t1.Add(-time.Second)}, }, nextTime: t2, }, { expected: []workResponse{ - {logGroup: "a", startTime: t1.Add(-time.Second), endTime: t2.Add(-time.Second)}, - {logGroup: "b", startTime: t1.Add(-time.Second), endTime: t2.Add(-time.Second)}, + {logGroupId: "a", startTime: t1.Add(-time.Second), endTime: t2.Add(-time.Second)}, + {logGroupId: "b", startTime: t1.Add(-time.Second), endTime: t2.Add(-time.Second)}, }, }, }, }, { - name: "Three log groups with latency", - logGroups: []string{"a", "b", "c"}, - startTime: t1, + name: "Three log groups with latency", + logGroupIDs: []string{"a", "b", "c"}, + startTime: t1, configOverrides: func(c *config) { c.Latency = time.Second }, steps: []receiveTestStep{ { expected: []workResponse{ - {logGroup: "a", startTime: t0, endTime: t1.Add(-time.Second)}, - {logGroup: "b", startTime: t0, endTime: t1.Add(-time.Second)}, - {logGroup: "c", startTime: t0, endTime: t1.Add(-time.Second)}, + {logGroupId: "a", startTime: t0, endTime: t1.Add(-time.Second)}, + {logGroupId: "b", startTime: t0, endTime: t1.Add(-time.Second)}, + {logGroupId: "c", startTime: t0, endTime: t1.Add(-time.Second)}, }, nextTime: t2, }, { expected: []workResponse{ - {logGroup: "a", startTime: t1.Add(-time.Second), endTime: t2.Add(-time.Second)}, - {logGroup: "b", startTime: t1.Add(-time.Second), endTime: t2.Add(-time.Second)}, - {logGroup: "c", startTime: t1.Add(-time.Second), endTime: t2.Add(-time.Second)}, + {logGroupId: "a", startTime: t1.Add(-time.Second), endTime: t2.Add(-time.Second)}, + {logGroupId: "b", startTime: 
t1.Add(-time.Second), endTime: t2.Add(-time.Second)}, + {logGroupId: "c", startTime: t1.Add(-time.Second), endTime: t2.Add(-time.Second)}, }, }, }, @@ -191,7 +191,7 @@ func TestReceive(t *testing.T) { test.configOverrides(&p.config) } clock.time = test.startTime - go p.receive(ctx, test.logGroups, clock.now) + go p.receive(ctx, test.logGroupIDs, clock.now) for _, step := range test.steps { for i, expected := range step.expected { p.workRequestChan <- struct{}{} @@ -209,34 +209,36 @@ func TestReceive(t *testing.T) { } type filterLogEventsTestCase struct { - name string - logGroup string - startTime time.Time - endTime time.Time - expected *cloudwatchlogs.FilterLogEventsInput + name string + logGroupId string + startTime time.Time + endTime time.Time + expected *cloudwatchlogs.FilterLogEventsInput } func TestFilterLogEventsInput(t *testing.T) { now, _ := time.Parse(time.RFC3339, "2024-07-12T13:00:00+00:00") + id := "myLogGroup" + testCases := []filterLogEventsTestCase{ { - name: "StartPosition: beginning, first iteration", - logGroup: "a", + name: "StartPosition: beginning, first iteration", + logGroupId: id, // The zero value of type time.Time{} is January 1, year 1, 00:00:00.000000000 UTC // Events with a timestamp before the time - January 1, 1970, 00:00:00 UTC are not returned by AWS API // make sure zero value of time.Time{} was converted startTime: time.Time{}, endTime: now, expected: &cloudwatchlogs.FilterLogEventsInput{ - LogGroupName: awssdk.String("a"), - StartTime: awssdk.Int64(0), - EndTime: awssdk.Int64(1720789200000), + LogGroupIdentifier: awssdk.String(id), + StartTime: awssdk.Int64(0), + EndTime: awssdk.Int64(1720789200000), }, }, } for _, test := range testCases { p := cloudwatchPoller{} - result := p.constructFilterLogEventsInput(test.startTime, test.endTime, test.logGroup) + result := p.constructFilterLogEventsInput(test.startTime, test.endTime, test.logGroupId) assert.Equal(t, test.expected, result) } diff --git a/x-pack/filebeat/input/awscloudwatch/config.go b/x-pack/filebeat/input/awscloudwatch/config.go index 438aceeb19e6..5e826aa09fd7 100644 --- a/x-pack/filebeat/input/awscloudwatch/config.go +++ b/x-pack/filebeat/input/awscloudwatch/config.go @@ -13,20 +13,21 @@ import ( ) type config struct { - harvester.ForwarderConfig `config:",inline"` - LogGroupARN string `config:"log_group_arn"` - LogGroupName string `config:"log_group_name"` - LogGroupNamePrefix string `config:"log_group_name_prefix"` - RegionName string `config:"region_name"` - LogStreams []*string `config:"log_streams"` - LogStreamPrefix string `config:"log_stream_prefix"` - StartPosition string `config:"start_position" default:"beginning"` - ScanFrequency time.Duration `config:"scan_frequency" validate:"min=0,nonzero"` - APITimeout time.Duration `config:"api_timeout" validate:"min=0,nonzero"` - APISleep time.Duration `config:"api_sleep" validate:"min=0,nonzero"` - Latency time.Duration `config:"latency"` - NumberOfWorkers int `config:"number_of_workers"` - AWSConfig awscommon.ConfigAWS `config:",inline"` + harvester.ForwarderConfig `config:",inline"` + LogGroupARN string `config:"log_group_arn"` + LogGroupName string `config:"log_group_name"` + LogGroupNamePrefix string `config:"log_group_name_prefix"` + IncludeLinkedAccountsForPrefixMode bool `config:"include_linked_accounts_for_prefix_mode"` + RegionName string `config:"region_name"` + LogStreams []*string `config:"log_streams"` + LogStreamPrefix string `config:"log_stream_prefix"` + StartPosition string `config:"start_position" default:"beginning"` + 
ScanFrequency time.Duration `config:"scan_frequency" validate:"min=0,nonzero"` + APITimeout time.Duration `config:"api_timeout" validate:"min=0,nonzero"` + APISleep time.Duration `config:"api_sleep" validate:"min=0,nonzero"` + Latency time.Duration `config:"latency"` + NumberOfWorkers int `config:"number_of_workers"` + AWSConfig awscommon.ConfigAWS `config:",inline"` } func defaultConfig() config {
diff --git a/x-pack/filebeat/input/awscloudwatch/input.go b/x-pack/filebeat/input/awscloudwatch/input.go index d10ae348d941..27b1da04d1a7 100644 --- a/x-pack/filebeat/input/awscloudwatch/input.go +++ b/x-pack/filebeat/input/awscloudwatch/input.go @@ -62,25 +62,13 @@ type cloudwatchInput struct { func newInput(config config) (*cloudwatchInput, error) { cfgwarn.Beta("aws-cloudwatch input type is used") + + // validate the AWS configuration awsConfig, err := awscommon.InitializeAWSConfig(config.AWSConfig) if err != nil { return nil, fmt.Errorf("failed to initialize AWS credentials: %w", err) } - if config.LogGroupARN != "" { - logGroupName, regionName, err := parseARN(config.LogGroupARN) - if err != nil { - return nil, fmt.Errorf("parse log group ARN failed: %w", err) - } - - config.LogGroupName = logGroupName - config.RegionName = regionName - } - - if config.RegionName != "" { - awsConfig.Region = config.RegionName - } - return &cloudwatchInput{ config: config, awsConfig: awsConfig,
@@ -103,15 +91,26 @@ func (in *cloudwatchInput) Run(inputContext v2.Context, pipeline beat.Pipeline) } defer client.Close() + var logGroupIDs []string + logGroupIDs, region, err := fromConfig(in.config, in.awsConfig) + if err != nil { + return fmt.Errorf("error processing configurations: %w", err) + } + + in.awsConfig.Region = region svc := cloudwatchlogs.NewFromConfig(in.awsConfig, func(o *cloudwatchlogs.Options) { if in.config.AWSConfig.FIPSEnabled { o.EndpointOptions.UseFIPSEndpoint = awssdk.FIPSEndpointStateEnabled } }) - logGroupNames, err := getLogGroupNames(svc, in.config.LogGroupNamePrefix, in.config.LogGroupName) - if err != nil { - return fmt.Errorf("failed to get log group names: %w", err) + if len(logGroupIDs) == 0 { + // We haven't extracted group identifiers directly from the input configuration, + // so fall back to the provided LogGroupNamePrefix and use the derived service client to obtain logGroupIDs + logGroupIDs, err = getLogGroupNames(svc, in.config.LogGroupNamePrefix, in.config.IncludeLinkedAccountsForPrefixMode) + if err != nil { + return fmt.Errorf("failed to get log group names from LogGroupNamePrefix: %w", err) + } } log := inputContext.Logger
@@ -120,43 +119,62 @@ func (in *cloudwatchInput) Run(inputContext v2.Context, pipeline beat.Pipeline) cwPoller := newCloudwatchPoller( log.Named("cloudwatch_poller"), in.metrics, - in.awsConfig.Region, + region, in.config) logProcessor := newLogProcessor(log.Named("log_processor"), in.metrics, client, ctx) - cwPoller.metrics.logGroupsTotal.Add(uint64(len(logGroupNames))) + cwPoller.metrics.logGroupsTotal.Add(uint64(len(logGroupIDs))) cwPoller.startWorkers(ctx, svc, logProcessor) - cwPoller.receive(ctx, logGroupNames, time.Now) + cwPoller.receive(ctx, logGroupIDs, time.Now) return nil } -func parseARN(logGroupARN string) (string, string, error) { - arnParsed, err := arn.Parse(logGroupARN) - if err != nil { - return "", "", fmt.Errorf("error Parse arn %s: %w", logGroupARN, err) - } +// fromConfig is a helper that parses the input configuration and derives logGroupIDs & the aws region. +// The returned logGroupIDs may be empty, which requires other fallback mechanisms to derive them. +// See getLogGroupNames for an example. +func fromConfig(cfg config, awsCfg awssdk.Config) (logGroupIDs []string, region string, err error) { + // LogGroupARN has precedence over LogGroupName & RegionName + if cfg.LogGroupARN != "" { + parsedArn, err := arn.Parse(cfg.LogGroupARN) + if err != nil { + return nil, "", fmt.Errorf("failed to parse log group ARN: %w", err) + } - if strings.Contains(arnParsed.Resource, ":") { - resourceARNSplit := strings.Split(arnParsed.Resource, ":") - if len(resourceARNSplit) >= 2 && resourceARNSplit[0] == "log-group" { - return resourceARNSplit[1], arnParsed.Region, nil + if parsedArn.Region == "" { + return nil, "", fmt.Errorf("failed to parse log group ARN: missing region") } + + // refine to match the AWS API parameter regex of logGroupIdentifier + groupId := strings.TrimSuffix(cfg.LogGroupARN, ":*") + logGroupIDs = append(logGroupIDs, groupId) + + return logGroupIDs, parsedArn.Region, nil + } + + // then fall back to LogGroupName + if cfg.LogGroupName != "" { + logGroupIDs = append(logGroupIDs, cfg.LogGroupName) } - return "", "", fmt.Errorf("cannot get log group name from log group ARN: %s", logGroupARN) -} -// getLogGroupNames uses DescribeLogGroups API to retrieve all log group names -func getLogGroupNames(svc *cloudwatchlogs.Client, logGroupNamePrefix string, logGroupName string) ([]string, error) { - if logGroupNamePrefix == "" { - return []string{logGroupName}, nil + // finally derive the region + if cfg.RegionName != "" { + region = cfg.RegionName + } else { + region = awsCfg.Region } + return logGroupIDs, region, nil +} + +// getLogGroupNames uses the DescribeLogGroups API to retrieve LogGroupArn entries that match the provided logGroupNamePrefix +func getLogGroupNames(svc *cloudwatchlogs.Client, logGroupNamePrefix string, withLinkedAccount bool) ([]string, error) { // construct DescribeLogGroupsInput describeLogGroupsInput := &cloudwatchlogs.DescribeLogGroupsInput{ - LogGroupNamePrefix: awssdk.String(logGroupNamePrefix), + LogGroupNamePrefix: awssdk.String(logGroupNamePrefix), + IncludeLinkedAccounts: awssdk.Bool(withLinkedAccount), } // make API request - var logGroupNames []string + var logGroupIDs []string paginator := cloudwatchlogs.NewDescribeLogGroupsPaginator(svc, describeLogGroupsInput) for paginator.HasMorePages() { page, err := paginator.NextPage(context.TODO())
@@ -165,8 +183,8 @@ func getLogGroupNames(svc *cloudwatchlogs.Client, logGroupNamePrefix string, log } for _, lg := range page.LogGroups { - logGroupNames = append(logGroupNames, *lg.LogGroupName) + logGroupIDs = append(logGroupIDs, *lg.LogGroupArn) } } - return logGroupNames, nil + return logGroupIDs, nil }
diff --git a/x-pack/filebeat/input/awscloudwatch/input_test.go b/x-pack/filebeat/input/awscloudwatch/input_test.go index 25ecc18ea57c..4d8c6e84e2b5 100644 --- a/x-pack/filebeat/input/awscloudwatch/input_test.go +++ b/x-pack/filebeat/input/awscloudwatch/input_test.go @@ -50,9 +50,97 @@ func TestCreateEvent(t *testing.T) { assert.Equal(t, expectedEventFields, event.Fields) } -func TestParseARN(t *testing.T) { - logGroup, regionName, err := parseARN("arn:aws:logs:us-east-1:428152502467:log-group:test:*") - assert.Equal(t, "test", logGroup) - assert.Equal(t, "us-east-1", regionName) - assert.NoError(t, err) +func Test_FromConfig(t *testing.T) { + tests := []struct { + name string + cfg config + awsCfg awssdk.Config + expectGroups []string + expectRegion string + isError bool + }{ + { + name: "Valid log group ARN", + cfg: config{ + LogGroupARN: "arn:aws:logs:us-east-1:123456789012:myLogs", + }, + awsCfg: awssdk.Config{ + Region: "us-east-1", + }, + expectGroups: []string{"arn:aws:logs:us-east-1:123456789012:myLogs"}, + expectRegion: "us-east-1", + isError: false, + }, + { + name: "Invalid ARN results in an error", + cfg: config{ + LogGroupARN: "invalidARN", + }, + awsCfg: awssdk.Config{ + Region: "us-east-1", + }, + expectRegion: "", + isError: true, + }, + { + name: "Valid log group ARN but empty region causes an error", + cfg: config{ + LogGroupARN: "arn:aws:logs::123456789012:otherLogs", + }, + awsCfg: awssdk.Config{ + Region: "us-east-1", + }, + expectRegion: "", + isError: true, + }, + { + name: "ARN suffix trimming to match logGroupIdentifier requirement", + cfg: config{ + LogGroupARN: "arn:aws:logs:us-east-1:123456789012:log-group:/aws/kinesisfirehose/ProjectA:*", + }, + awsCfg: awssdk.Config{ + Region: "us-east-1", + }, + expectGroups: []string{"arn:aws:logs:us-east-1:123456789012:log-group:/aws/kinesisfirehose/ProjectA"}, + expectRegion: "us-east-1", + isError: false, + }, + { + name: "LogGroupName only", + cfg: config{ + LogGroupName: "myLogGroup", + }, + awsCfg: awssdk.Config{ + Region: "us-east-1", + }, + expectGroups: []string{"myLogGroup"}, + expectRegion: "us-east-1", + isError: false, + }, + { + name: "LogGroupName and region override", + cfg: config{ + LogGroupName: "myLogGroup", + RegionName: "sa-east-1", + }, + awsCfg: awssdk.Config{ + Region: "us-east-1", + }, + expectGroups: []string{"myLogGroup"}, + expectRegion: "sa-east-1", + isError: false, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + groups, region, err := fromConfig(tt.cfg, tt.awsCfg) + if tt.isError { + assert.Error(t, err) + } + + assert.Equal(t, tt.expectGroups, groups) + assert.Equal(t, tt.expectRegion, region) + }) + } }
diff --git a/x-pack/filebeat/input/awscloudwatch/processor.go b/x-pack/filebeat/input/awscloudwatch/processor.go index 818ba85d57ec..c0be36921633 100644 --- a/x-pack/filebeat/input/awscloudwatch/processor.go +++ b/x-pack/filebeat/input/awscloudwatch/processor.go @@ -32,22 +32,22 @@ func newLogProcessor(log *logp.Logger, metrics *inputMetrics, publisher beat.Cli } } -func (p *logProcessor) processLogEvents(logEvents []types.FilteredLogEvent, logGroup string, regionName string) { +func (p *logProcessor) processLogEvents(logEvents []types.FilteredLogEvent, logGroupId string, regionName string) { for _, logEvent := range logEvents { - event := createEvent(logEvent, logGroup, regionName) + event := createEvent(logEvent, logGroupId, regionName) p.metrics.cloudwatchEventsCreatedTotal.Inc() p.publisher.Publish(event) } } -func createEvent(logEvent types.FilteredLogEvent, logGroup string, regionName string) beat.Event { +func createEvent(logEvent types.FilteredLogEvent, logGroupId string, regionName string) beat.Event { event := beat.Event{ Timestamp: time.Unix(*logEvent.Timestamp/1000, 0).UTC(), Fields: mapstr.M{ "message": *logEvent.Message, "log": mapstr.M{ "file": mapstr.M{ - "path": logGroup + "/" + *logEvent.LogStreamName, + "path": logGroupId + "/" + *logEvent.LogStreamName, }, }, "event": mapstr.M{ @@ -55,7 +55,7 @@ func createEvent(logEvent types.FilteredLogEvent, logGroup string, regionName st "ingested": time.Now(), }, "aws.cloudwatch": mapstr.M{ - "log_group": logGroup, + "log_group": logGroupId, "log_stream": *logEvent.LogStreamName, "ingestion_time": time.Unix(*logEvent.IngestionTime/1000, 0), },
diff --git a/x-pack/filebeat/input/awss3/acks.go b/x-pack/filebeat/input/awss3/acks.go new file mode 
100644 index 000000000000..a3850c01e87a --- /dev/null +++ b/x-pack/filebeat/input/awss3/acks.go @@ -0,0 +1,106 @@ +// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one +// or more contributor license agreements. Licensed under the Elastic License; +// you may not use this file except in compliance with the Elastic License. + +package awss3 + +import ( + "github.com/zyedidia/generic/queue" + +	"github.com/elastic/beats/v7/libbeat/beat" + "github.com/elastic/beats/v7/libbeat/common/acker" +) + +type awsACKHandler struct { + pending *queue.Queue[pendingACK] + ackedCount int + + pendingChan chan pendingACK + ackChan chan int +} + +type pendingACK struct { + eventCount int + ackCallback func() +} + +func newAWSACKHandler() *awsACKHandler { + handler := &awsACKHandler{ + pending: queue.New[pendingACK](), + + // Channel buffer sizes are somewhat arbitrary: synchronous channels + // would be safe, but buffers slightly reduce scheduler overhead since + // the ack loop goroutine doesn't need to wake up as often. + // + // pendingChan receives one message each time an S3/SQS worker goroutine + // finishes processing an object. If it is full, workers will not be able + // to advance to the next object until the ack loop wakes up. + // + // ackChan receives approximately one message every time an acknowledged + // batch of events contains at least one event from this input. (Sometimes + // fewer if messages can be coalesced.) If it is full, acknowledgement + // notifications for inputs/queue will stall until the ack loop wakes up. + // (This is a much worse consequence than pendingChan, but ackChan also + // receives fewer messages than pendingChan by a factor of ~thousands, + // so in practice it's still low-impact.) + pendingChan: make(chan pendingACK, 10), + ackChan: make(chan int, 10), + } + go handler.run() + return handler +} + +func (ah *awsACKHandler) Add(eventCount int, ackCallback func()) { + ah.pendingChan <- pendingACK{ + eventCount: eventCount, + ackCallback: ackCallback, + } +} + +// Close is called when a worker is closing, to indicate to the ack handler that it +// should shut down as soon as the current pending list is acknowledged. +func (ah *awsACKHandler) Close() { + close(ah.pendingChan) +} + +func (ah *awsACKHandler) pipelineEventListener() beat.EventListener { + return acker.TrackingCounter(func(_ int, total int) { + // Notify the ack handler goroutine + ah.ackChan <- total + }) +} + +// run is the handler's main loop, which handles both incoming object +// metadata and ACK confirmations. +func (ah *awsACKHandler) run() { + for { + select { + case result, ok := <-ah.pendingChan: + if ok { + ah.pending.Enqueue(result) + } else { + // Channel is closed, reset so we don't receive any more values + ah.pendingChan = nil + } + case count := <-ah.ackChan: + ah.ackedCount += count + } + + // Finalize any objects that are now completed + for !ah.pending.Empty() && ah.ackedCount >= ah.pending.Peek().eventCount { + result := ah.pending.Dequeue() + ah.ackedCount -= result.eventCount + // Run finalization asynchronously so we don't block the SQS worker + // or the queue by ignoring the ack handler's input channels. Ordering + // is no longer important at this point. 
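+ // For example (hypothetical counts, for illustration only): if pending
+ // holds objects of 3 and 5 events and ackedCount has reached 4, only the
+ // first object is finalized, leaving ackedCount at 1; the second object
+ // waits until four more acknowledgments arrive.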
+ if result.ackCallback != nil { + go result.ackCallback() + } + } + + // If the input is closed and all acks are completed, we're done + if ah.pending.Empty() && ah.pendingChan == nil { + return + } + } +} diff --git a/x-pack/filebeat/input/awss3/config.go b/x-pack/filebeat/input/awss3/config.go index b85c3f3871c9..d80108590ce5 100644 --- a/x-pack/filebeat/input/awss3/config.go +++ b/x-pack/filebeat/input/awss3/config.go @@ -24,37 +24,36 @@ import ( ) type config struct { - APITimeout time.Duration `config:"api_timeout"` - VisibilityTimeout time.Duration `config:"visibility_timeout"` - SQSWaitTime time.Duration `config:"sqs.wait_time"` // The max duration for which the SQS ReceiveMessage call waits for a message to arrive in the queue before returning. - SQSMaxReceiveCount int `config:"sqs.max_receive_count"` // The max number of times a message should be received (retried) before deleting it. - SQSScript *scriptConfig `config:"sqs.notification_parsing_script"` - MaxNumberOfMessages int `config:"max_number_of_messages"` - QueueURL string `config:"queue_url"` - RegionName string `config:"region"` - BucketARN string `config:"bucket_arn"` - NonAWSBucketName string `config:"non_aws_bucket_name"` - BucketListInterval time.Duration `config:"bucket_list_interval"` - BucketListPrefix string `config:"bucket_list_prefix"` - NumberOfWorkers int `config:"number_of_workers"` - AWSConfig awscommon.ConfigAWS `config:",inline"` - FileSelectors []fileSelectorConfig `config:"file_selectors"` - ReaderConfig readerConfig `config:",inline"` // Reader options to apply when no file_selectors are used. - PathStyle bool `config:"path_style"` - ProviderOverride string `config:"provider"` - BackupConfig backupConfig `config:",inline"` + APITimeout time.Duration `config:"api_timeout"` + VisibilityTimeout time.Duration `config:"visibility_timeout"` + SQSWaitTime time.Duration `config:"sqs.wait_time"` // The max duration for which the SQS ReceiveMessage call waits for a message to arrive in the queue before returning. + SQSMaxReceiveCount int `config:"sqs.max_receive_count"` // The max number of times a message should be received (retried) before deleting it. + SQSScript *scriptConfig `config:"sqs.notification_parsing_script"` + QueueURL string `config:"queue_url"` + RegionName string `config:"region"` + BucketARN string `config:"bucket_arn"` + NonAWSBucketName string `config:"non_aws_bucket_name"` + BucketListInterval time.Duration `config:"bucket_list_interval"` + BucketListPrefix string `config:"bucket_list_prefix"` + NumberOfWorkers int `config:"number_of_workers"` + AWSConfig awscommon.ConfigAWS `config:",inline"` + FileSelectors []fileSelectorConfig `config:"file_selectors"` + ReaderConfig readerConfig `config:",inline"` // Reader options to apply when no file_selectors are used. 
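+ // PathStyle (descriptive note, assumed semantics of the SDK's path-style
+ // option): when true, requests address the bucket as a path component,
+ // e.g. https://s3.<region>.amazonaws.com/<bucket>/<key>, instead of the
+ // default virtual-hosted style; mainly relevant for S3-compatible stores.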
+ PathStyle bool `config:"path_style"` + ProviderOverride string `config:"provider"` + BackupConfig backupConfig `config:",inline"` } func defaultConfig() config { c := config{ - APITimeout: 120 * time.Second, - VisibilityTimeout: 300 * time.Second, - BucketListInterval: 120 * time.Second, - BucketListPrefix: "", - SQSWaitTime: 20 * time.Second, - SQSMaxReceiveCount: 5, - MaxNumberOfMessages: 5, - PathStyle: false, + APITimeout: 120 * time.Second, + VisibilityTimeout: 300 * time.Second, + BucketListInterval: 120 * time.Second, + BucketListPrefix: "", + SQSWaitTime: 20 * time.Second, + SQSMaxReceiveCount: 5, + NumberOfWorkers: 5, + PathStyle: false, } c.ReaderConfig.InitDefaults() return c @@ -93,11 +92,6 @@ func (c *config) Validate() error { "less than or equal to 20s", c.SQSWaitTime) } - if c.QueueURL != "" && c.MaxNumberOfMessages <= 0 { - return fmt.Errorf("max_number_of_messages <%v> must be greater than 0", - c.MaxNumberOfMessages) - } - if c.QueueURL != "" && c.APITimeout < c.SQSWaitTime { return fmt.Errorf("api_timeout <%v> must be greater than the sqs.wait_time <%v", c.APITimeout, c.SQSWaitTime) @@ -252,6 +246,7 @@ func (c config) getBucketARN() string { // Should be provided as a parameter to s3.NewFromConfig. func (c config) s3ConfigModifier(o *s3.Options) { if c.NonAWSBucketName != "" { + //nolint:staticcheck // haven't migrated to the new interface yet o.EndpointResolver = nonAWSBucketResolver{endpoint: c.AWSConfig.Endpoint} } diff --git a/x-pack/filebeat/input/awss3/config_test.go b/x-pack/filebeat/input/awss3/config_test.go index 651f8099d919..907a5854b284 100644 --- a/x-pack/filebeat/input/awss3/config_test.go +++ b/x-pack/filebeat/input/awss3/config_test.go @@ -30,17 +30,17 @@ func TestConfig(t *testing.T) { parserConf := parser.Config{} require.NoError(t, parserConf.Unpack(conf.MustNewConfigFrom(""))) return config{ - QueueURL: quequeURL, - BucketARN: s3Bucket, - NonAWSBucketName: nonAWSS3Bucket, - APITimeout: 120 * time.Second, - VisibilityTimeout: 300 * time.Second, - SQSMaxReceiveCount: 5, - SQSWaitTime: 20 * time.Second, - BucketListInterval: 120 * time.Second, - BucketListPrefix: "", - PathStyle: false, - MaxNumberOfMessages: 5, + QueueURL: quequeURL, + BucketARN: s3Bucket, + NonAWSBucketName: nonAWSS3Bucket, + APITimeout: 120 * time.Second, + VisibilityTimeout: 300 * time.Second, + SQSMaxReceiveCount: 5, + SQSWaitTime: 20 * time.Second, + BucketListInterval: 120 * time.Second, + BucketListPrefix: "", + PathStyle: false, + NumberOfWorkers: 5, ReaderConfig: readerConfig{ BufferSize: 16 * humanize.KiByte, MaxBytes: 10 * humanize.MiByte, @@ -304,18 +304,6 @@ func TestConfig(t *testing.T) { expectedErr: "number_of_workers <0> must be greater than 0", expectedCfg: nil, }, - { - name: "error on max_number_of_messages == 0", - queueURL: queueURL, - s3Bucket: "", - nonAWSS3Bucket: "", - config: mapstr.M{ - "queue_url": queueURL, - "max_number_of_messages": "0", - }, - expectedErr: "max_number_of_messages <0> must be greater than 0", - expectedCfg: nil, - }, { name: "error on buffer_size == 0 ", queueURL: queueURL, diff --git a/x-pack/filebeat/input/awss3/input_benchmark_test.go b/x-pack/filebeat/input/awss3/input_benchmark_test.go index 0d7d79b615be..54e227736025 100644 --- a/x-pack/filebeat/input/awss3/input_benchmark_test.go +++ b/x-pack/filebeat/input/awss3/input_benchmark_test.go @@ -27,8 +27,6 @@ import ( "github.com/dustin/go-humanize" "github.com/olekukonko/tablewriter" - pubtest "github.com/elastic/beats/v7/libbeat/publisher/testing" - awscommon 
"github.com/elastic/beats/v7/x-pack/libbeat/common/aws" conf "github.com/elastic/elastic-agent-libs/config" "github.com/elastic/elastic-agent-libs/logp" "github.com/elastic/elastic-agent-libs/monitoring" @@ -164,10 +162,17 @@ func (c constantS3) ListObjectsPaginator(string, string) s3Pager { var _ beat.Pipeline = (*fakePipeline)(nil) // fakePipeline returns new ackClients. -type fakePipeline struct{} +type fakePipeline struct { +} -func (c *fakePipeline) ConnectWith(beat.ClientConfig) (beat.Client, error) { - return &ackClient{}, nil +func newFakePipeline() *fakePipeline { + return &fakePipeline{} +} + +func (c *fakePipeline) ConnectWith(config beat.ClientConfig) (beat.Client, error) { + return &ackClient{ + eventListener: config.EventListener, + }, nil } func (c *fakePipeline) Connect() (beat.Client, error) { @@ -177,13 +182,15 @@ func (c *fakePipeline) Connect() (beat.Client, error) { var _ beat.Client = (*ackClient)(nil) // ackClient is a fake beat.Client that ACKs the published messages. -type ackClient struct{} +type ackClient struct { + eventListener beat.EventListener +} func (c *ackClient) Close() error { return nil } func (c *ackClient) Publish(event beat.Event) { - // Fake the ACK handling. - event.Private.(*awscommon.EventACKTracker).ACK() + c.eventListener.AddEvent(event, true) + go c.eventListener.ACKEvents(1) } func (c *ackClient) PublishAll(event []beat.Event) { @@ -208,20 +215,20 @@ file_selectors: return inputConfig } -func benchmarkInputSQS(t *testing.T, maxMessagesInflight int) testing.BenchmarkResult { +func benchmarkInputSQS(t *testing.T, workerCount int) testing.BenchmarkResult { return testing.Benchmark(func(b *testing.B) { var err error - pipeline := &fakePipeline{} config := makeBenchmarkConfig(t) - config.MaxNumberOfMessages = maxMessagesInflight + config.NumberOfWorkers = workerCount sqsReader := newSQSReaderInput(config, aws.Config{}) sqsReader.log = log.Named("sqs") - sqsReader.metrics = newInputMetrics("test_id", monitoring.NewRegistry(), maxMessagesInflight) + sqsReader.pipeline = newFakePipeline() + sqsReader.metrics = newInputMetrics("test_id", monitoring.NewRegistry(), workerCount) sqsReader.sqs, err = newConstantSQS() require.NoError(t, err) sqsReader.s3 = newConstantS3(t) - sqsReader.msgHandler, err = sqsReader.createEventProcessor(pipeline) + sqsReader.msgHandler, err = sqsReader.createEventProcessor() require.NoError(t, err, "createEventProcessor must succeed") ctx, cancel := context.WithCancel(context.Background()) @@ -240,7 +247,7 @@ func benchmarkInputSQS(t *testing.T, maxMessagesInflight int) testing.BenchmarkR b.StopTimer() elapsed := time.Since(start) - b.ReportMetric(float64(maxMessagesInflight), "max_messages_inflight") + b.ReportMetric(float64(workerCount), "number_of_workers") b.ReportMetric(elapsed.Seconds(), "sec") b.ReportMetric(float64(sqsReader.metrics.s3EventsCreatedTotal.Get()), "events") @@ -303,14 +310,7 @@ func benchmarkInputS3(t *testing.T, numberOfWorkers int) testing.BenchmarkResult metricRegistry := monitoring.NewRegistry() metrics := newInputMetrics("test_id", metricRegistry, numberOfWorkers) - - client := pubtest.NewChanClientWithCallback(100, func(event beat.Event) { - event.Private.(*awscommon.EventACKTracker).ACK() - }) - - defer func() { - _ = client.Close() - }() + pipeline := newFakePipeline() config := makeBenchmarkConfig(t) config.NumberOfWorkers = numberOfWorkers @@ -342,13 +342,13 @@ func benchmarkInputS3(t *testing.T, numberOfWorkers int) testing.BenchmarkResult states, err := newStates(nil, store) assert.NoError(t, 
err, "states creation should succeed") - s3EventHandlerFactory := newS3ObjectProcessorFactory(log.Named("s3"), metrics, s3API, config.FileSelectors, backupConfig{}) + s3EventHandlerFactory := newS3ObjectProcessorFactory(metrics, s3API, config.FileSelectors, backupConfig{}) s3Poller := &s3PollerInput{ log: logp.NewLogger(inputName), config: config, metrics: metrics, s3: s3API, - client: client, + pipeline: pipeline, s3ObjectHandler: s3EventHandlerFactory, states: states, provider: "provider", diff --git a/x-pack/filebeat/input/awss3/input_integration_test.go b/x-pack/filebeat/input/awss3/input_integration_test.go index 88d81a9f0c8b..9303c5c72599 100644 --- a/x-pack/filebeat/input/awss3/input_integration_test.go +++ b/x-pack/filebeat/input/awss3/input_integration_test.go @@ -112,7 +112,7 @@ file_selectors: func makeTestConfigSQS(queueURL string) *conf.C { return conf.MustNewConfigFrom(fmt.Sprintf(`--- queue_url: %s -max_number_of_messages: 1 +number_of_workers: 1 visibility_timeout: 30s region: us-east-1 file_selectors: diff --git a/x-pack/filebeat/input/awss3/interfaces.go b/x-pack/filebeat/input/awss3/interfaces.go index 5e9eb13d243a..6a3b119303be 100644 --- a/x-pack/filebeat/input/awss3/interfaces.go +++ b/x-pack/filebeat/input/awss3/interfaces.go @@ -17,7 +17,6 @@ import ( "github.com/aws/smithy-go/middleware" "github.com/elastic/beats/v7/libbeat/beat" - awscommon "github.com/elastic/beats/v7/x-pack/libbeat/common/aws" awssdk "github.com/aws/aws-sdk-go-v2/aws" "github.com/aws/aws-sdk-go-v2/service/s3" @@ -41,25 +40,9 @@ import ( const s3RequestURLMetadataKey = `x-beat-s3-request-url` type sqsAPI interface { - sqsReceiver - sqsDeleter - sqsVisibilityChanger - sqsAttributeGetter -} - -type sqsReceiver interface { ReceiveMessage(ctx context.Context, maxMessages int) ([]types.Message, error) -} - -type sqsDeleter interface { DeleteMessage(ctx context.Context, msg *types.Message) error -} - -type sqsVisibilityChanger interface { ChangeMessageVisibility(ctx context.Context, msg *types.Message, timeout time.Duration) error -} - -type sqsAttributeGetter interface { GetQueueAttributes(ctx context.Context, attr []types.QueueAttributeName) (map[string]string, error) } @@ -68,7 +51,7 @@ type sqsProcessor interface { // given message and is responsible for updating the message's visibility // timeout while it is being processed and for deleting it when processing // completes successfully. - ProcessSQS(ctx context.Context, msg *types.Message) error + ProcessSQS(ctx context.Context, msg *types.Message, eventCallback func(e beat.Event)) sqsProcessingResult } // ------ @@ -103,25 +86,18 @@ type s3ObjectHandlerFactory interface { // Create returns a new s3ObjectHandler that can be used to process the // specified S3 object. If the handler is not configured to process the // given S3 object (based on key name) then it will return nil. - Create(ctx context.Context, log *logp.Logger, client beat.Client, acker *awscommon.EventACKTracker, obj s3EventV2) s3ObjectHandler + Create(ctx context.Context, obj s3EventV2) s3ObjectHandler } type s3ObjectHandler interface { // ProcessS3Object downloads the S3 object, parses it, creates events, and - // publishes them. It returns when processing finishes or when it encounters - // an unrecoverable error. It does not wait for the events to be ACKed by - // the publisher before returning (use eventACKTracker's Wait() method to - // determine this). - ProcessS3Object() error + // passes to the given callback. 
It returns when processing finishes or + // when it encounters an unrecoverable error. + ProcessS3Object(log *logp.Logger, eventCallback func(e beat.Event)) error // FinalizeS3Object finalizes processing of an S3 object after the current // batch is finished. FinalizeS3Object() error - - // Wait waits for every event published by ProcessS3Object() to be ACKed - // by the publisher before returning. Internally it uses the - // s3ObjectHandler eventACKTracker's Wait() method - Wait() } // ------ diff --git a/x-pack/filebeat/input/awss3/mock_interfaces_test.go b/x-pack/filebeat/input/awss3/mock_interfaces_test.go index ccae48a59b2f..086ca34136fd 100644 --- a/x-pack/filebeat/input/awss3/mock_interfaces_test.go +++ b/x-pack/filebeat/input/awss3/mock_interfaces_test.go @@ -18,7 +18,6 @@ import ( gomock "github.com/golang/mock/gomock" beat "github.com/elastic/beats/v7/libbeat/beat" - aws "github.com/elastic/beats/v7/x-pack/libbeat/common/aws" logp "github.com/elastic/elastic-agent-libs/logp" ) @@ -103,156 +102,6 @@ func (mr *MockSQSAPIMockRecorder) ReceiveMessage(ctx, maxMessages interface{}) * return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "ReceiveMessage", reflect.TypeOf((*MockSQSAPI)(nil).ReceiveMessage), ctx, maxMessages) } -// MocksqsReceiver is a mock of sqsReceiver interface. -type MocksqsReceiver struct { - ctrl *gomock.Controller - recorder *MocksqsReceiverMockRecorder -} - -// MocksqsReceiverMockRecorder is the mock recorder for MocksqsReceiver. -type MocksqsReceiverMockRecorder struct { - mock *MocksqsReceiver -} - -// NewMocksqsReceiver creates a new mock instance. -func NewMocksqsReceiver(ctrl *gomock.Controller) *MocksqsReceiver { - mock := &MocksqsReceiver{ctrl: ctrl} - mock.recorder = &MocksqsReceiverMockRecorder{mock} - return mock -} - -// EXPECT returns an object that allows the caller to indicate expected use. -func (m *MocksqsReceiver) EXPECT() *MocksqsReceiverMockRecorder { - return m.recorder -} - -// ReceiveMessage mocks base method. -func (m *MocksqsReceiver) ReceiveMessage(ctx context.Context, maxMessages int) ([]types.Message, error) { - m.ctrl.T.Helper() - ret := m.ctrl.Call(m, "ReceiveMessage", ctx, maxMessages) - ret0, _ := ret[0].([]types.Message) - ret1, _ := ret[1].(error) - return ret0, ret1 -} - -// ReceiveMessage indicates an expected call of ReceiveMessage. -func (mr *MocksqsReceiverMockRecorder) ReceiveMessage(ctx, maxMessages interface{}) *gomock.Call { - mr.mock.ctrl.T.Helper() - return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "ReceiveMessage", reflect.TypeOf((*MocksqsReceiver)(nil).ReceiveMessage), ctx, maxMessages) -} - -// MocksqsDeleter is a mock of sqsDeleter interface. -type MocksqsDeleter struct { - ctrl *gomock.Controller - recorder *MocksqsDeleterMockRecorder -} - -// MocksqsDeleterMockRecorder is the mock recorder for MocksqsDeleter. -type MocksqsDeleterMockRecorder struct { - mock *MocksqsDeleter -} - -// NewMocksqsDeleter creates a new mock instance. -func NewMocksqsDeleter(ctrl *gomock.Controller) *MocksqsDeleter { - mock := &MocksqsDeleter{ctrl: ctrl} - mock.recorder = &MocksqsDeleterMockRecorder{mock} - return mock -} - -// EXPECT returns an object that allows the caller to indicate expected use. -func (m *MocksqsDeleter) EXPECT() *MocksqsDeleterMockRecorder { - return m.recorder -} - -// DeleteMessage mocks base method. 
-func (m *MocksqsDeleter) DeleteMessage(ctx context.Context, msg *types.Message) error { - m.ctrl.T.Helper() - ret := m.ctrl.Call(m, "DeleteMessage", ctx, msg) - ret0, _ := ret[0].(error) - return ret0 -} - -// DeleteMessage indicates an expected call of DeleteMessage. -func (mr *MocksqsDeleterMockRecorder) DeleteMessage(ctx, msg interface{}) *gomock.Call { - mr.mock.ctrl.T.Helper() - return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "DeleteMessage", reflect.TypeOf((*MocksqsDeleter)(nil).DeleteMessage), ctx, msg) -} - -// MocksqsVisibilityChanger is a mock of sqsVisibilityChanger interface. -type MocksqsVisibilityChanger struct { - ctrl *gomock.Controller - recorder *MocksqsVisibilityChangerMockRecorder -} - -// MocksqsVisibilityChangerMockRecorder is the mock recorder for MocksqsVisibilityChanger. -type MocksqsVisibilityChangerMockRecorder struct { - mock *MocksqsVisibilityChanger -} - -// NewMocksqsVisibilityChanger creates a new mock instance. -func NewMocksqsVisibilityChanger(ctrl *gomock.Controller) *MocksqsVisibilityChanger { - mock := &MocksqsVisibilityChanger{ctrl: ctrl} - mock.recorder = &MocksqsVisibilityChangerMockRecorder{mock} - return mock -} - -// EXPECT returns an object that allows the caller to indicate expected use. -func (m *MocksqsVisibilityChanger) EXPECT() *MocksqsVisibilityChangerMockRecorder { - return m.recorder -} - -// ChangeMessageVisibility mocks base method. -func (m *MocksqsVisibilityChanger) ChangeMessageVisibility(ctx context.Context, msg *types.Message, timeout time.Duration) error { - m.ctrl.T.Helper() - ret := m.ctrl.Call(m, "ChangeMessageVisibility", ctx, msg, timeout) - ret0, _ := ret[0].(error) - return ret0 -} - -// ChangeMessageVisibility indicates an expected call of ChangeMessageVisibility. -func (mr *MocksqsVisibilityChangerMockRecorder) ChangeMessageVisibility(ctx, msg, timeout interface{}) *gomock.Call { - mr.mock.ctrl.T.Helper() - return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "ChangeMessageVisibility", reflect.TypeOf((*MocksqsVisibilityChanger)(nil).ChangeMessageVisibility), ctx, msg, timeout) -} - -// MocksqsAttributeGetter is a mock of sqsAttributeGetter interface. -type MocksqsAttributeGetter struct { - ctrl *gomock.Controller - recorder *MocksqsAttributeGetterMockRecorder -} - -// MocksqsAttributeGetterMockRecorder is the mock recorder for MocksqsAttributeGetter. -type MocksqsAttributeGetterMockRecorder struct { - mock *MocksqsAttributeGetter -} - -// NewMocksqsAttributeGetter creates a new mock instance. -func NewMocksqsAttributeGetter(ctrl *gomock.Controller) *MocksqsAttributeGetter { - mock := &MocksqsAttributeGetter{ctrl: ctrl} - mock.recorder = &MocksqsAttributeGetterMockRecorder{mock} - return mock -} - -// EXPECT returns an object that allows the caller to indicate expected use. -func (m *MocksqsAttributeGetter) EXPECT() *MocksqsAttributeGetterMockRecorder { - return m.recorder -} - -// GetQueueAttributes mocks base method. -func (m *MocksqsAttributeGetter) GetQueueAttributes(ctx context.Context, attr []types.QueueAttributeName) (map[string]string, error) { - m.ctrl.T.Helper() - ret := m.ctrl.Call(m, "GetQueueAttributes", ctx, attr) - ret0, _ := ret[0].(map[string]string) - ret1, _ := ret[1].(error) - return ret0, ret1 -} - -// GetQueueAttributes indicates an expected call of GetQueueAttributes. 
-func (mr *MocksqsAttributeGetterMockRecorder) GetQueueAttributes(ctx, attr interface{}) *gomock.Call { - mr.mock.ctrl.T.Helper() - return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetQueueAttributes", reflect.TypeOf((*MocksqsAttributeGetter)(nil).GetQueueAttributes), ctx, attr) -} - // MockSQSProcessor is a mock of sqsProcessor interface. type MockSQSProcessor struct { ctrl *gomock.Controller @@ -277,17 +126,17 @@ func (m *MockSQSProcessor) EXPECT() *MockSQSProcessorMockRecorder { } // ProcessSQS mocks base method. -func (m *MockSQSProcessor) ProcessSQS(ctx context.Context, msg *types.Message) error { +func (m *MockSQSProcessor) ProcessSQS(ctx context.Context, msg *types.Message, eventCallback func(beat.Event)) sqsProcessingResult { m.ctrl.T.Helper() - ret := m.ctrl.Call(m, "ProcessSQS", ctx, msg) - ret0, _ := ret[0].(error) + ret := m.ctrl.Call(m, "ProcessSQS", ctx, msg, eventCallback) + ret0, _ := ret[0].(sqsProcessingResult) return ret0 } // ProcessSQS indicates an expected call of ProcessSQS. -func (mr *MockSQSProcessorMockRecorder) ProcessSQS(ctx, msg interface{}) *gomock.Call { +func (mr *MockSQSProcessorMockRecorder) ProcessSQS(ctx, msg, eventCallback interface{}) *gomock.Call { mr.mock.ctrl.T.Helper() - return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "ProcessSQS", reflect.TypeOf((*MockSQSProcessor)(nil).ProcessSQS), ctx, msg) + return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "ProcessSQS", reflect.TypeOf((*MockSQSProcessor)(nil).ProcessSQS), ctx, msg, eventCallback) } // MockS3API is a mock of s3API interface. @@ -581,17 +430,17 @@ func (m *MockS3ObjectHandlerFactory) EXPECT() *MockS3ObjectHandlerFactoryMockRec } // Create mocks base method. -func (m *MockS3ObjectHandlerFactory) Create(ctx context.Context, log *logp.Logger, client beat.Client, acker *aws.EventACKTracker, obj s3EventV2) s3ObjectHandler { +func (m *MockS3ObjectHandlerFactory) Create(ctx context.Context, obj s3EventV2) s3ObjectHandler { m.ctrl.T.Helper() - ret := m.ctrl.Call(m, "Create", ctx, log, client, acker, obj) + ret := m.ctrl.Call(m, "Create", ctx, obj) ret0, _ := ret[0].(s3ObjectHandler) return ret0 } // Create indicates an expected call of Create. -func (mr *MockS3ObjectHandlerFactoryMockRecorder) Create(ctx, log, client, acker, obj interface{}) *gomock.Call { +func (mr *MockS3ObjectHandlerFactoryMockRecorder) Create(ctx, obj interface{}) *gomock.Call { mr.mock.ctrl.T.Helper() - return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "Create", reflect.TypeOf((*MockS3ObjectHandlerFactory)(nil).Create), ctx, log, client, acker, obj) + return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "Create", reflect.TypeOf((*MockS3ObjectHandlerFactory)(nil).Create), ctx, obj) } // MockS3ObjectHandler is a mock of s3ObjectHandler interface. @@ -632,27 +481,15 @@ func (mr *MockS3ObjectHandlerMockRecorder) FinalizeS3Object() *gomock.Call { } // ProcessS3Object mocks base method. -func (m *MockS3ObjectHandler) ProcessS3Object() error { +func (m *MockS3ObjectHandler) ProcessS3Object(log *logp.Logger, eventCallback func(beat.Event)) error { m.ctrl.T.Helper() - ret := m.ctrl.Call(m, "ProcessS3Object") + ret := m.ctrl.Call(m, "ProcessS3Object", log, eventCallback) ret0, _ := ret[0].(error) return ret0 } // ProcessS3Object indicates an expected call of ProcessS3Object. 
-func (mr *MockS3ObjectHandlerMockRecorder) ProcessS3Object() *gomock.Call { - mr.mock.ctrl.T.Helper() - return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "ProcessS3Object", reflect.TypeOf((*MockS3ObjectHandler)(nil).ProcessS3Object)) -} - -// Wait mocks base method. -func (m *MockS3ObjectHandler) Wait() { - m.ctrl.T.Helper() - m.ctrl.Call(m, "Wait") -} - -// Wait indicates an expected call of Wait. -func (mr *MockS3ObjectHandlerMockRecorder) Wait() *gomock.Call { +func (mr *MockS3ObjectHandlerMockRecorder) ProcessS3Object(log, eventCallback interface{}) *gomock.Call { mr.mock.ctrl.T.Helper() - return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "Wait", reflect.TypeOf((*MockS3ObjectHandler)(nil).Wait)) + return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "ProcessS3Object", reflect.TypeOf((*MockS3ObjectHandler)(nil).ProcessS3Object), log, eventCallback) } diff --git a/x-pack/filebeat/input/awss3/s3.go b/x-pack/filebeat/input/awss3/s3.go index d611470ec80c..9901d5fe41d4 100644 --- a/x-pack/filebeat/input/awss3/s3.go +++ b/x-pack/filebeat/input/awss3/s3.go @@ -14,7 +14,6 @@ import ( "github.com/aws/aws-sdk-go-v2/service/s3" "github.com/elastic/beats/v7/libbeat/beat" - awscommon "github.com/elastic/beats/v7/x-pack/libbeat/common/aws" ) func createS3API(ctx context.Context, config config, awsConfig awssdk.Config) (*awsS3API, error) { @@ -32,9 +31,9 @@ func createS3API(ctx context.Context, config config, awsConfig awssdk.Config) (* return newAWSs3API(s3Client), nil } -func createPipelineClient(pipeline beat.Pipeline) (beat.Client, error) { +func createPipelineClient(pipeline beat.Pipeline, acks *awsACKHandler) (beat.Client, error) { return pipeline.ConnectWith(beat.ClientConfig{ - EventListener: awscommon.NewEventACKHandler(), + EventListener: acks.pipelineEventListener(), Processing: beat.ProcessingConfig{ // This input only produces events with basic types so normalization // is not required. @@ -117,5 +116,6 @@ type nonAWSBucketResolver struct { } func (n nonAWSBucketResolver) ResolveEndpoint(region string, options s3.EndpointResolverOptions) (awssdk.Endpoint, error) { + //nolint:staticcheck // haven't migrated to the new interface yet return awssdk.Endpoint{URL: n.endpoint, SigningRegion: region, HostnameImmutable: true, Source: awssdk.EndpointSourceCustom}, nil } diff --git a/x-pack/filebeat/input/awss3/s3_input.go b/x-pack/filebeat/input/awss3/s3_input.go index bd1e8f7700e5..c3a83c284a2f 100644 --- a/x-pack/filebeat/input/awss3/s3_input.go +++ b/x-pack/filebeat/input/awss3/s3_input.go @@ -17,7 +17,6 @@ import ( v2 "github.com/elastic/beats/v7/filebeat/input/v2" "github.com/elastic/beats/v7/libbeat/beat" "github.com/elastic/beats/v7/libbeat/common/backoff" - awscommon "github.com/elastic/beats/v7/x-pack/libbeat/common/aws" "github.com/elastic/elastic-agent-libs/logp" "github.com/elastic/go-concert/timed" ) @@ -28,23 +27,17 @@ var readerLoopMaxCircuitBreaker = 10 type s3PollerInput struct { log *logp.Logger + pipeline beat.Pipeline config config awsConfig awssdk.Config store beater.StateStore provider string s3 s3API metrics *inputMetrics - client beat.Client s3ObjectHandler s3ObjectHandlerFactory states *states } -// s3FetchTask contains metadata for one S3 object that a worker should fetch. 
-type s3FetchTask struct { - s3ObjectHandler s3ObjectHandler - objectState state -} - func newS3PollerInput( config config, awsConfig awssdk.Config, @@ -69,6 +62,7 @@ func (in *s3PollerInput) Run( pipeline beat.Pipeline, ) error { in.log = inputContext.Logger.Named("s3") + in.pipeline = pipeline var err error // Load the persistent S3 polling state. @@ -78,24 +72,16 @@ func (in *s3PollerInput) Run( } defer in.states.Close() - // Create client for publishing events and receive notification of their ACKs. - in.client, err = createPipelineClient(pipeline) - if err != nil { - return fmt.Errorf("failed to create pipeline client: %w", err) - } - defer in.client.Close() - ctx := v2.GoContextFromCanceler(inputContext.Cancelation) in.s3, err = createS3API(ctx, in.config, in.awsConfig) if err != nil { return fmt.Errorf("failed to create S3 API: %w", err) } - in.metrics = newInputMetrics(inputContext.ID, nil, in.config.MaxNumberOfMessages) + in.metrics = newInputMetrics(inputContext.ID, nil, in.config.NumberOfWorkers) defer in.metrics.Close() in.s3ObjectHandler = newS3ObjectProcessorFactory( - in.log, in.metrics, in.s3, in.config.getFileSelectors(), @@ -117,7 +103,7 @@ func (in *s3PollerInput) run(ctx context.Context) { func (in *s3PollerInput) runPoll(ctx context.Context) { var workerWg sync.WaitGroup - workChan := make(chan *s3FetchTask) + workChan := make(chan state) // Start the worker goroutines to listen on the work channel for i := 0; i < in.config.NumberOfWorkers; i++ { @@ -133,15 +119,37 @@ func (in *s3PollerInput) runPoll(ctx context.Context) { workerWg.Wait() } -func (in *s3PollerInput) workerLoop(ctx context.Context, workChan <-chan *s3FetchTask) { +func (in *s3PollerInput) workerLoop(ctx context.Context, workChan <-chan state) { + acks := newAWSACKHandler() + // Create client for publishing events and receive notification of their ACKs. + client, err := createPipelineClient(in.pipeline, acks) + if err != nil { + in.log.Errorf("failed to create pipeline client: %v", err.Error()) + return + } + defer client.Close() + defer acks.Close() + rateLimitWaiter := backoff.NewEqualJitterBackoff(ctx.Done(), 1, 120) - for s3ObjectPayload := range workChan { - objHandler := s3ObjectPayload.s3ObjectHandler - state := s3ObjectPayload.objectState + for _state := range workChan { + state := _state + event := in.s3EventForState(state) + + objHandler := in.s3ObjectHandler.Create(ctx, event) + if objHandler == nil { + in.log.Debugw("empty s3 processor (no matching reader configs).", "state", state) + continue + } // Process S3 object (download, parse, create events). - err := objHandler.ProcessS3Object() + publishCount := 0 + err := objHandler.ProcessS3Object(in.log, func(e beat.Event) { + in.metrics.s3EventsCreatedTotal.Inc() + client.Publish(e) + publishCount++ + }) + in.metrics.s3EventsPerObject.Update(int64(publishCount)) if errors.Is(err, errS3DownloadFailed) { // Download errors are ephemeral. Add a backoff delay, then skip to the // next iteration so we don't mark the object as permanently failed. @@ -151,9 +159,7 @@ func (in *s3PollerInput) workerLoop(ctx context.Context, workChan <-chan *s3Fetc // Reset the rate limit delay on results that aren't download errors. rateLimitWaiter.Reset() - // Wait for downloaded objects to be ACKed. - objHandler.Wait() - + // Update state, but don't persist it until this object is acknowledged. 
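+ // (If the input shuts down before that acknowledgment arrives, the state
+ // is never persisted, so the object is listed and processed again on a
+ // later poll: duplicate delivery rather than data loss.)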
if err != nil { in.log.Errorf("failed processing S3 event for object key %q in bucket %q: %v", state.Key, state.Bucket, err.Error()) @@ -164,22 +170,20 @@ func (in *s3PollerInput) workerLoop(ctx context.Context, workChan <-chan *s3Fetc state.Stored = true } - // Persist the result, report any errors - err = in.states.AddState(state) - if err != nil { - in.log.Errorf("saving completed object state: %v", err.Error()) - } - - // Metrics - in.metrics.s3ObjectsAckedTotal.Inc() + // Add the cleanup handling to the acks helper + acks.Add(publishCount, func() { + err := in.states.AddState(state) + if err != nil { + in.log.Errorf("saving completed object state: %v", err.Error()) + } - if finalizeErr := objHandler.FinalizeS3Object(); finalizeErr != nil { - in.log.Errorf("failed finalizing objects from S3 bucket (manual cleanup is required): %w", finalizeErr) - } + // Metrics + in.metrics.s3ObjectsAckedTotal.Inc() + }) } } -func (in *s3PollerInput) readerLoop(ctx context.Context, workChan chan<- *s3FetchTask) { +func (in *s3PollerInput) readerLoop(ctx context.Context, workChan chan<- state) { defer close(workChan) bucketName := getBucketNameFromARN(in.config.getBucketARN()) @@ -220,31 +224,19 @@ func (in *s3PollerInput) readerLoop(ctx context.Context, workChan chan<- *s3Fetc continue } - s3Processor := in.createS3ObjectProcessor(ctx, state) - if s3Processor == nil { - in.log.Debugw("empty s3 processor.", "state", state) - continue - } - - workChan <- &s3FetchTask{ - s3ObjectHandler: s3Processor, - objectState: state, - } + workChan <- state in.metrics.s3ObjectsProcessedTotal.Inc() } } } -func (in *s3PollerInput) createS3ObjectProcessor(ctx context.Context, state state) s3ObjectHandler { +func (in *s3PollerInput) s3EventForState(state state) s3EventV2 { event := s3EventV2{} event.AWSRegion = in.awsConfig.Region event.Provider = in.provider event.S3.Bucket.Name = state.Bucket event.S3.Bucket.ARN = in.config.getBucketARN() event.S3.Object.Key = state.Key - - acker := awscommon.NewEventACKTracker(ctx) - - return in.s3ObjectHandler.Create(ctx, in.log, in.client, acker, event) + return event } diff --git a/x-pack/filebeat/input/awss3/s3_objects.go b/x-pack/filebeat/input/awss3/s3_objects.go index 82a9e817bc68..93219d9a6408 100644 --- a/x-pack/filebeat/input/awss3/s3_objects.go +++ b/x-pack/filebeat/input/awss3/s3_objects.go @@ -25,30 +25,48 @@ import ( "github.com/elastic/beats/v7/libbeat/reader" "github.com/elastic/beats/v7/libbeat/reader/readfile" "github.com/elastic/beats/v7/libbeat/reader/readfile/encoding" - awscommon "github.com/elastic/beats/v7/x-pack/libbeat/common/aws" "github.com/elastic/elastic-agent-libs/logp" "github.com/elastic/elastic-agent-libs/mapstr" ) -const ( - contentTypeJSON = "application/json" - contentTypeNDJSON = "application/x-ndjson" -) - type s3ObjectProcessorFactory struct { - log *logp.Logger metrics *inputMetrics s3 s3API fileSelectors []fileSelectorConfig backupConfig backupConfig } +type s3ObjectProcessor struct { + *s3ObjectProcessorFactory + + ctx context.Context + eventCallback func(beat.Event) + readerConfig *readerConfig // Config about how to process the object. + s3Obj s3EventV2 // S3 object information. + s3ObjHash string + s3RequestURL string + + s3Metadata map[string]interface{} // S3 object metadata. 
+} + +type s3DownloadedObject struct { + body io.ReadCloser + length int64 + contentType string + metadata map[string]interface{} +} + +const ( + contentTypeJSON = "application/json" + contentTypeNDJSON = "application/x-ndjson" +) + // errS3DownloadFailed reports problems downloading an S3 object. Download errors // should never be treated as permanent; they are just an indication to apply a // retry backoff until the connection is healthy again. var errS3DownloadFailed = errors.New("S3 download failure") -func newS3ObjectProcessorFactory(log *logp.Logger, metrics *inputMetrics, s3 s3API, sel []fileSelectorConfig, backupConfig backupConfig) *s3ObjectProcessorFactory { +func newS3ObjectProcessorFactory(metrics *inputMetrics, s3 s3API, sel []fileSelectorConfig, backupConfig backupConfig) *s3ObjectProcessorFactory { if metrics == nil { // Metrics are optional. Initialize a stub. metrics = newInputMetrics("", nil, 0) @@ -59,7 +77,6 @@ func newS3ObjectProcessorFactory(log *logp.Logger, metrics *inputMetrics, s3 s3A } } return &s3ObjectProcessorFactory{ - log: log, metrics: metrics, s3: s3, fileSelectors: sel, @@ -78,64 +95,33 @@ func (f *s3ObjectProcessorFactory) findReaderConfig(key string) *readerConfig { // Create returns a new s3ObjectProcessor. It returns nil when no file selectors // match the S3 object key. -func (f *s3ObjectProcessorFactory) Create(ctx context.Context, log *logp.Logger, client beat.Client, ack *awscommon.EventACKTracker, obj s3EventV2) s3ObjectHandler { - log = log.With( - "bucket_arn", obj.S3.Bucket.Name, - "object_key", obj.S3.Object.Key) - +func (f *s3ObjectProcessorFactory) Create(ctx context.Context, obj s3EventV2) s3ObjectHandler { readerConfig := f.findReaderConfig(obj.S3.Object.Key) if readerConfig == nil { - log.Debug("Skipping S3 object processing. No file_selectors are a match.") + // No file_selectors are a match; skip. + return nil } return &s3ObjectProcessor{ s3ObjectProcessorFactory: f, - log: log, ctx: ctx, - publisher: client, - acker: ack, readerConfig: readerConfig, s3Obj: obj, s3ObjHash: s3ObjectHash(obj), } } -// s3DownloadedObject encapsulate downloaded s3 object for internal processing -type s3DownloadedObject struct { - body io.ReadCloser - length int64 - contentType string - metadata map[string]interface{} -} - -type s3ObjectProcessor struct { - *s3ObjectProcessorFactory - - log *logp.Logger - ctx context.Context - publisher beat.Client - acker *awscommon.EventACKTracker // ACKer tied to the SQS message (multiple S3 readers share an ACKer when the S3 notification event contains more than one S3 object). - readerConfig *readerConfig // Config about how to process the object. - s3Obj s3EventV2 // S3 object information. - s3ObjHash string - s3RequestURL string - eventCount int64 - - s3Metadata map[string]interface{} // S3 object metadata. 
-} - -func (p *s3ObjectProcessor) Wait() { - p.acker.Wait() -} - -func (p *s3ObjectProcessor) ProcessS3Object() error { +func (p *s3ObjectProcessor) ProcessS3Object(log *logp.Logger, eventCallback func(e beat.Event)) error { if p == nil { return nil } + p.eventCallback = eventCallback + log = log.With( + "bucket_arn", p.s3Obj.S3.Bucket.Name, + "object_key", p.s3Obj.S3.Object.Key) // Metrics and Logging - p.log.Debug("Begin S3 object processing.") + log.Debug("Begin S3 object processing.") p.metrics.s3ObjectsRequestedTotal.Inc() p.metrics.s3ObjectsInflight.Inc() start := time.Now() @@ -143,7 +129,7 @@ func (p *s3ObjectProcessor) ProcessS3Object() error { elapsed := time.Since(start) p.metrics.s3ObjectsInflight.Dec() p.metrics.s3ObjectProcessingTime.Update(elapsed.Nanoseconds()) - p.log.Debugw("End S3 object processing.", "elapsed_time_ns", elapsed) + log.Debugw("End S3 object processing.", "elapsed_time_ns", elapsed) }() // Request object (download). @@ -181,7 +167,7 @@ func (p *s3ObjectProcessor) ProcessS3Object() error { for dec.next() { val, err := dec.decodeValue() if err != nil { - if err == io.EOF { + if errors.Is(err, io.EOF) { return nil } break @@ -191,7 +177,8 @@ func (p *s3ObjectProcessor) ProcessS3Object() error { return err } evt := p.createEvent(string(data), evtOffset) - p.publish(p.acker, &evt) + + p.eventCallback(evt) } case decoder: @@ -226,7 +213,6 @@ func (p *s3ObjectProcessor) ProcessS3Object() error { time.Since(start).Nanoseconds(), err) } - p.metrics.s3EventsPerObject.Update(p.eventCount) return nil } @@ -298,7 +284,7 @@ func (p *s3ObjectProcessor) readJSON(r io.Reader) error { data, _ := item.MarshalJSON() evt := p.createEvent(string(data), offset) - p.publish(p.acker, &evt) + p.eventCallback(evt) } return nil @@ -333,7 +319,7 @@ func (p *s3ObjectProcessor) readJSONSlice(r io.Reader, evtOffset int64) (int64, data, _ := item.MarshalJSON() evt := p.createEvent(string(data), evtOffset) - p.publish(p.acker, &evt) + p.eventCallback(evt) evtOffset++ } @@ -378,7 +364,7 @@ func (p *s3ObjectProcessor) splitEventList(key string, raw json.RawMessage, offs data, _ := item.MarshalJSON() p.s3ObjHash = objHash evt := p.createEvent(string(data), offset+arrayOffset) - p.publish(p.acker, &evt) + p.eventCallback(evt) } return nil @@ -418,7 +404,7 @@ func (p *s3ObjectProcessor) readFile(r io.Reader) error { event := p.createEvent(string(message.Content), offset) event.Fields.DeepUpdate(message.Fields) offset += int64(message.Bytes) - p.publish(p.acker, &event) + p.eventCallback(event) } if errors.Is(err, io.EOF) { @@ -433,15 +419,6 @@ func (p *s3ObjectProcessor) readFile(r io.Reader) error { return nil } -// publish the generated event and perform necessary tracking -func (p *s3ObjectProcessor) publish(ack *awscommon.EventACKTracker, event *beat.Event) { - ack.Add() - event.Private = ack - p.eventCount += 1 - p.metrics.s3EventsCreatedTotal.Inc() - p.publisher.Publish(*event) -} - func (p *s3ObjectProcessor) createEvent(message string, offset int64) beat.Event { event := beat.Event{ Timestamp: time.Now().UTC(), diff --git a/x-pack/filebeat/input/awss3/s3_objects_test.go b/x-pack/filebeat/input/awss3/s3_objects_test.go index 635955ed8c42..d20d81ced6c8 100644 --- a/x-pack/filebeat/input/awss3/s3_objects_test.go +++ b/x-pack/filebeat/input/awss3/s3_objects_test.go @@ -22,7 +22,6 @@ import ( "github.com/stretchr/testify/require" "github.com/elastic/beats/v7/libbeat/beat" - awscommon "github.com/elastic/beats/v7/x-pack/libbeat/common/aws" conf "github.com/elastic/elastic-agent-libs/config" 
"github.com/elastic/elastic-agent-libs/logp" ) @@ -148,7 +147,6 @@ func TestS3ObjectProcessor(t *testing.T) { ctrl, ctx := gomock.WithContext(ctx, t) defer ctrl.Finish() mockS3API := NewMockS3API(ctrl) - mockPublisher := NewMockBeatClient(ctrl) s3Event := newS3Event("log.txt") @@ -156,9 +154,8 @@ func TestS3ObjectProcessor(t *testing.T) { GetObject(gomock.Any(), gomock.Eq("us-east-1"), gomock.Eq(s3Event.S3.Bucket.Name), gomock.Eq(s3Event.S3.Object.Key)). Return(nil, errFakeConnectivityFailure) - s3ObjProc := newS3ObjectProcessorFactory(logp.NewLogger(inputName), nil, mockS3API, nil, backupConfig{}) - ack := awscommon.NewEventACKTracker(ctx) - err := s3ObjProc.Create(ctx, logp.NewLogger(inputName), mockPublisher, ack, s3Event).ProcessS3Object() + s3ObjProc := newS3ObjectProcessorFactory(nil, mockS3API, nil, backupConfig{}) + err := s3ObjProc.Create(ctx, s3Event).ProcessS3Object(logp.NewLogger(inputName), func(_ beat.Event) {}) require.Error(t, err) assert.True(t, errors.Is(err, errS3DownloadFailed), "expected errS3DownloadFailed") }) @@ -170,7 +167,6 @@ func TestS3ObjectProcessor(t *testing.T) { ctrl, ctx := gomock.WithContext(ctx, t) defer ctrl.Finish() mockS3API := NewMockS3API(ctrl) - mockPublisher := NewMockBeatClient(ctrl) s3Event := newS3Event("log.txt") @@ -178,9 +174,8 @@ func TestS3ObjectProcessor(t *testing.T) { GetObject(gomock.Any(), gomock.Eq("us-east-1"), gomock.Eq(s3Event.S3.Bucket.Name), gomock.Eq(s3Event.S3.Object.Key)). Return(nil, nil) - s3ObjProc := newS3ObjectProcessorFactory(logp.NewLogger(inputName), nil, mockS3API, nil, backupConfig{}) - ack := awscommon.NewEventACKTracker(ctx) - err := s3ObjProc.Create(ctx, logp.NewLogger(inputName), mockPublisher, ack, s3Event).ProcessS3Object() + s3ObjProc := newS3ObjectProcessorFactory(nil, mockS3API, nil, backupConfig{}) + err := s3ObjProc.Create(ctx, s3Event).ProcessS3Object(logp.NewLogger(inputName), func(_ beat.Event) {}) require.Error(t, err) }) @@ -191,23 +186,20 @@ func TestS3ObjectProcessor(t *testing.T) { ctrl, ctx := gomock.WithContext(ctx, t) defer ctrl.Finish() mockS3API := NewMockS3API(ctrl) - mockPublisher := NewMockBeatClient(ctrl) s3Event, s3Resp := newS3Object(t, "testdata/log.txt", "") - var events []beat.Event gomock.InOrder( mockS3API.EXPECT(). GetObject(gomock.Any(), gomock.Eq("us-east-1"), gomock.Eq(s3Event.S3.Bucket.Name), gomock.Eq(s3Event.S3.Object.Key)). Return(s3Resp, nil), - mockPublisher.EXPECT(). - Publish(gomock.Any()). - Do(func(event beat.Event) { events = append(events, event) }). 
- Times(2), ) - s3ObjProc := newS3ObjectProcessorFactory(logp.NewLogger(inputName), nil, mockS3API, nil, backupConfig{}) - ack := awscommon.NewEventACKTracker(ctx) - err := s3ObjProc.Create(ctx, logp.NewLogger(inputName), mockPublisher, ack, s3Event).ProcessS3Object() + var events []beat.Event + s3ObjProc := newS3ObjectProcessorFactory(nil, mockS3API, nil, backupConfig{}) + err := s3ObjProc.Create(ctx, s3Event).ProcessS3Object(logp.NewLogger(inputName), func(event beat.Event) { + events = append(events, event) + }) + assert.Equal(t, 2, len(events)) require.NoError(t, err) }) @@ -218,7 +210,6 @@ func TestS3ObjectProcessor(t *testing.T) { ctrl, ctx := gomock.WithContext(ctx, t) defer ctrl.Finish() mockS3API := NewMockS3API(ctrl) - mockPublisher := NewMockBeatClient(ctrl) s3Event, _ := newS3Object(t, "testdata/log.txt", "") backupCfg := backupConfig{ @@ -231,9 +222,8 @@ func TestS3ObjectProcessor(t *testing.T) { Return(nil, nil), ) - s3ObjProc := newS3ObjectProcessorFactory(logp.NewLogger(inputName), nil, mockS3API, nil, backupCfg) - ack := awscommon.NewEventACKTracker(ctx) - err := s3ObjProc.Create(ctx, logp.NewLogger(inputName), mockPublisher, ack, s3Event).FinalizeS3Object() + s3ObjProc := newS3ObjectProcessorFactory(nil, mockS3API, nil, backupCfg) + err := s3ObjProc.Create(ctx, s3Event).FinalizeS3Object() require.NoError(t, err) }) @@ -244,7 +234,6 @@ func TestS3ObjectProcessor(t *testing.T) { ctrl, ctx := gomock.WithContext(ctx, t) defer ctrl.Finish() mockS3API := NewMockS3API(ctrl) - mockPublisher := NewMockBeatClient(ctrl) s3Event, _ := newS3Object(t, "testdata/log.txt", "") backupCfg := backupConfig{ @@ -261,9 +250,8 @@ func TestS3ObjectProcessor(t *testing.T) { Return(nil, nil), ) - s3ObjProc := newS3ObjectProcessorFactory(logp.NewLogger(inputName), nil, mockS3API, nil, backupCfg) - ack := awscommon.NewEventACKTracker(ctx) - err := s3ObjProc.Create(ctx, logp.NewLogger(inputName), mockPublisher, ack, s3Event).FinalizeS3Object() + s3ObjProc := newS3ObjectProcessorFactory(nil, mockS3API, nil, backupCfg) + err := s3ObjProc.Create(ctx, s3Event).FinalizeS3Object() require.NoError(t, err) }) @@ -274,7 +262,6 @@ func TestS3ObjectProcessor(t *testing.T) { ctrl, ctx := gomock.WithContext(ctx, t) defer ctrl.Finish() mockS3API := NewMockS3API(ctrl) - mockPublisher := NewMockBeatClient(ctrl) s3Event, _ := newS3Object(t, "testdata/log.txt", "") backupCfg := backupConfig{ @@ -288,9 +275,8 @@ func TestS3ObjectProcessor(t *testing.T) { Return(nil, nil), ) - s3ObjProc := newS3ObjectProcessorFactory(logp.NewLogger(inputName), nil, mockS3API, nil, backupCfg) - ack := awscommon.NewEventACKTracker(ctx) - err := s3ObjProc.Create(ctx, logp.NewLogger(inputName), mockPublisher, ack, s3Event).FinalizeS3Object() + s3ObjProc := newS3ObjectProcessorFactory(nil, mockS3API, nil, backupCfg) + err := s3ObjProc.Create(ctx, s3Event).FinalizeS3Object() require.NoError(t, err) }) @@ -320,7 +306,6 @@ func _testProcessS3Object(t testing.TB, file, contentType string, numEvents int, ctrl, ctx := gomock.WithContext(ctx, t) defer ctrl.Finish() mockS3API := NewMockS3API(ctrl) - mockPublisher := NewMockBeatClient(ctrl) s3Event, s3Resp := newS3Object(t, file, contentType) var events []beat.Event @@ -328,20 +313,16 @@ func _testProcessS3Object(t testing.TB, file, contentType string, numEvents int, mockS3API.EXPECT(). GetObject(gomock.Any(), gomock.Eq("us-east-1"), gomock.Eq(s3Event.S3.Bucket.Name), gomock.Eq(s3Event.S3.Object.Key)). Return(s3Resp, nil), - mockPublisher.EXPECT(). - Publish(gomock.Any()). 
- Do(func(event beat.Event) { events = append(events, event) }). - Times(numEvents), ) - s3ObjProc := newS3ObjectProcessorFactory(logp.NewLogger(inputName), nil, mockS3API, selectors, backupConfig{}) - ack := awscommon.NewEventACKTracker(ctx) - err := s3ObjProc.Create(ctx, logp.NewLogger(inputName), mockPublisher, ack, s3Event).ProcessS3Object() + s3ObjProc := newS3ObjectProcessorFactory(nil, mockS3API, selectors, backupConfig{}) + err := s3ObjProc.Create(ctx, s3Event).ProcessS3Object( + logp.NewLogger(inputName), + func(event beat.Event) { events = append(events, event) }) if !expectErr { require.NoError(t, err) assert.Equal(t, numEvents, len(events)) - assert.EqualValues(t, numEvents, ack.PendingACKs) } else { require.Error(t, err) } diff --git a/x-pack/filebeat/input/awss3/s3_test.go b/x-pack/filebeat/input/awss3/s3_test.go index 9c6099e775ae..b0b19d828318 100644 --- a/x-pack/filebeat/input/awss3/s3_test.go +++ b/x-pack/filebeat/input/awss3/s3_test.go @@ -36,7 +36,7 @@ func TestS3Poller(t *testing.T) { defer ctrl.Finish() mockAPI := NewMockS3API(ctrl) mockPager := NewMockS3Pager(ctrl) - mockPublisher := NewMockBeatClient(ctrl) + pipeline := newFakePipeline() gomock.InOrder( mockAPI.EXPECT(). @@ -126,7 +126,7 @@ func TestS3Poller(t *testing.T) { GetObject(gomock.Any(), gomock.Eq(""), gomock.Eq(bucket), gomock.Eq("2024-02-08T08:35:00+00:02.json.gz")). Return(nil, errFakeConnectivityFailure) - s3ObjProc := newS3ObjectProcessorFactory(logp.NewLogger(inputName), nil, mockAPI, nil, backupConfig{}) + s3ObjProc := newS3ObjectProcessorFactory(nil, mockAPI, nil, backupConfig{}) states, err := newStates(nil, store) require.NoError(t, err, "states creation must succeed") poller := &s3PollerInput{ @@ -139,7 +139,7 @@ func TestS3Poller(t *testing.T) { RegionName: "region", }, s3: mockAPI, - client: mockPublisher, + pipeline: pipeline, s3ObjectHandler: s3ObjProc, states: states, provider: "provider", @@ -162,7 +162,7 @@ func TestS3Poller(t *testing.T) { mockS3 := NewMockS3API(ctrl) mockErrorPager := NewMockS3Pager(ctrl) mockSuccessPager := NewMockS3Pager(ctrl) - mockPublisher := NewMockBeatClient(ctrl) + pipeline := newFakePipeline() gomock.InOrder( // Initial ListObjectPaginator gets an error. @@ -264,7 +264,7 @@ func TestS3Poller(t *testing.T) { GetObject(gomock.Any(), gomock.Eq(""), gomock.Eq(bucket), gomock.Eq("key5")). Return(nil, errFakeConnectivityFailure) - s3ObjProc := newS3ObjectProcessorFactory(logp.NewLogger(inputName), nil, mockS3, nil, backupConfig{}) + s3ObjProc := newS3ObjectProcessorFactory(nil, mockS3, nil, backupConfig{}) states, err := newStates(nil, store) require.NoError(t, err, "states creation must succeed") poller := &s3PollerInput{ @@ -277,7 +277,7 @@ func TestS3Poller(t *testing.T) { RegionName: "region", }, s3: mockS3, - client: mockPublisher, + pipeline: pipeline, s3ObjectHandler: s3ObjProc, states: states, provider: "provider", diff --git a/x-pack/filebeat/input/awss3/sqs_input.go b/x-pack/filebeat/input/awss3/sqs_input.go index a92319cbe192..a4308af45a80 100644 --- a/x-pack/filebeat/input/awss3/sqs_input.go +++ b/x-pack/filebeat/input/awss3/sqs_input.go @@ -8,7 +8,6 @@ import ( "context" "fmt" "sync" - "time" awssdk "github.com/aws/aws-sdk-go-v2/aws" "github.com/aws/aws-sdk-go-v2/service/s3" @@ -29,6 +28,10 @@ type sqsReaderInput struct { log *logp.Logger metrics *inputMetrics + // The Beats pipeline, used to create clients for event publication when + // creating the worker goroutines. 
+ pipeline beat.Pipeline + // The expected region based on the queue URL detectedRegion string @@ -46,7 +49,7 @@ func newSQSReaderInput(config config, awsConfig awssdk.Config) *sqsReaderInput { return &sqsReaderInput{ config: config, awsConfig: awsConfig, - workRequestChan: make(chan struct{}, config.MaxNumberOfMessages), + workRequestChan: make(chan struct{}, config.NumberOfWorkers), workResponseChan: make(chan types.Message), } } @@ -83,6 +86,7 @@ func (in *sqsReaderInput) setup( pipeline beat.Pipeline, ) error { in.log = inputContext.Logger.With("queue_url", in.config.QueueURL) + in.pipeline = pipeline in.detectedRegion = getRegionFromQueueURL(in.config.QueueURL, in.config.AWSConfig.Endpoint) if in.config.RegionName != "" { @@ -105,10 +109,10 @@ func (in *sqsReaderInput) setup( in.s3 = newAWSs3API(s3.NewFromConfig(in.awsConfig, in.config.s3ConfigModifier)) - in.metrics = newInputMetrics(inputContext.ID, nil, in.config.MaxNumberOfMessages) + in.metrics = newInputMetrics(inputContext.ID, nil, in.config.NumberOfWorkers) var err error - in.msgHandler, err = in.createEventProcessor(pipeline) + in.msgHandler, err = in.createEventProcessor() if err != nil { return fmt.Errorf("failed to initialize sqs reader: %w", err) } @@ -161,42 +165,87 @@ func (in *sqsReaderInput) readerLoop(ctx context.Context) { } } -func (in *sqsReaderInput) workerLoop(ctx context.Context) { +type sqsWorker struct { + input *sqsReaderInput + client beat.Client + ackHandler *awsACKHandler +} + +func (in *sqsReaderInput) newSQSWorker() (*sqsWorker, error) { + // Create a pipeline client scoped to this worker. + ackHandler := newAWSACKHandler() + client, err := in.pipeline.ConnectWith(beat.ClientConfig{ + EventListener: ackHandler.pipelineEventListener(), + Processing: beat.ProcessingConfig{ + // This input only produces events with basic types so normalization + // is not required. 
+ EventNormalization: boolPtr(false), + }, + }) + if err != nil { + return nil, fmt.Errorf("connecting to pipeline: %w", err) + } + return &sqsWorker{ + input: in, + client: client, + ackHandler: ackHandler, + }, nil +} + +func (w *sqsWorker) run(ctx context.Context) { + defer w.client.Close() + defer w.ackHandler.Close() + for ctx.Err() == nil { // Send a work request select { case <-ctx.Done(): // Shutting down return - case in.workRequestChan <- struct{}{}: + case w.input.workRequestChan <- struct{}{}: } // The request is sent, wait for a response select { case <-ctx.Done(): return - case msg := <-in.workResponseChan: - start := time.Now() - - id := in.metrics.beginSQSWorker() - if err := in.msgHandler.ProcessSQS(ctx, &msg); err != nil { - in.log.Warnw("Failed processing SQS message.", - "error", err, - "message_id", *msg.MessageId, - "elapsed_time_ns", time.Since(start)) - } - in.metrics.endSQSWorker(id) + case msg := <-w.input.workResponseChan: + w.processMessage(ctx, msg) } } } +func (w *sqsWorker) processMessage(ctx context.Context, msg types.Message) { + publishCount := 0 + id := w.input.metrics.beginSQSWorker() + result := w.input.msgHandler.ProcessSQS(ctx, &msg, func(e beat.Event) { + w.client.Publish(e) + publishCount++ + }) + + if publishCount == 0 { + // No events made it through (probably an error state), wrap up immediately + result.Done() + } else { + // Add this result's Done callback to the pending ACKs list + w.ackHandler.Add(publishCount, result.Done) + } + + w.input.metrics.endSQSWorker(id) +} + func (in *sqsReaderInput) startWorkers(ctx context.Context) { // Start the worker goroutines that will fetch messages via workRequestChan // and workResponseChan until the input shuts down. - for i := 0; i < in.config.MaxNumberOfMessages; i++ { + for i := 0; i < in.config.NumberOfWorkers; i++ { in.workerWg.Add(1) go func() { defer in.workerWg.Done() - in.workerLoop(ctx) + worker, err := in.newSQSWorker() + if err != nil { + in.log.Error(err) + return + } + worker.run(ctx) }() } } @@ -209,7 +258,7 @@ func (in *sqsReaderInput) logConfigSummary() { log.Warnf("configured region disagrees with queue_url region (%q != %q): using %q", in.awsConfig.Region, in.detectedRegion, in.awsConfig.Region) } log.Infof("AWS SQS visibility_timeout is set to %v.", in.config.VisibilityTimeout) - log.Infof("AWS SQS max_number_of_messages is set to %v.", in.config.MaxNumberOfMessages) + log.Infof("AWS SQS number_of_workers is set to %v.", in.config.NumberOfWorkers) if in.config.BackupConfig.GetBucketName() != "" { log.Warnf("You have the backup_to_bucket functionality activated with SQS.
Please make sure to set appropriate destination buckets " + @@ -217,15 +266,15 @@ func (in *sqsReaderInput) logConfigSummary() { } } -func (in *sqsReaderInput) createEventProcessor(pipeline beat.Pipeline) (sqsProcessor, error) { +func (in *sqsReaderInput) createEventProcessor() (sqsProcessor, error) { fileSelectors := in.config.getFileSelectors() - s3EventHandlerFactory := newS3ObjectProcessorFactory(in.log.Named("s3"), in.metrics, in.s3, fileSelectors, in.config.BackupConfig) + s3EventHandlerFactory := newS3ObjectProcessorFactory(in.metrics, in.s3, fileSelectors, in.config.BackupConfig) script, err := newScriptFromConfig(in.log.Named("sqs_script"), in.config.SQSScript) if err != nil { return nil, err } - return newSQSS3EventProcessor(in.log.Named("sqs_s3_event"), in.metrics, in.sqs, script, in.config.VisibilityTimeout, in.config.SQSMaxReceiveCount, pipeline, s3EventHandlerFactory), nil + return newSQSS3EventProcessor(in.log.Named("sqs_s3_event"), in.metrics, in.sqs, script, in.config.VisibilityTimeout, in.config.SQSMaxReceiveCount, s3EventHandlerFactory), nil } // Read all pending requests and return their count. If block is true, diff --git a/x-pack/filebeat/input/awss3/sqs_s3_event.go b/x-pack/filebeat/input/awss3/sqs_s3_event.go index a489f6a7f72e..884cf7adbbce 100644 --- a/x-pack/filebeat/input/awss3/sqs_s3_event.go +++ b/x-pack/filebeat/input/awss3/sqs_s3_event.go @@ -20,7 +20,6 @@ import ( "go.uber.org/multierr" "github.com/elastic/beats/v7/libbeat/beat" - awscommon "github.com/elastic/beats/v7/x-pack/libbeat/common/aws" "github.com/elastic/elastic-agent-libs/logp" ) @@ -117,11 +116,10 @@ type eventBridgeEvent struct { } type sqsS3EventProcessor struct { - s3ObjectHandler s3ObjectHandlerFactory + s3HandlerFactory s3ObjectHandlerFactory sqsVisibilityTimeout time.Duration maxReceiveCount int sqs sqsAPI - pipeline beat.Pipeline // Pipeline creates clients for publishing events. log *logp.Logger warnOnce sync.Once metrics *inputMetrics @@ -135,7 +133,6 @@ func newSQSS3EventProcessor( script *script, sqsVisibilityTimeout time.Duration, maxReceiveCount int, - pipeline beat.Pipeline, s3 s3ObjectHandlerFactory, ) *sqsS3EventProcessor { if metrics == nil { @@ -143,18 +140,32 @@ func newSQSS3EventProcessor( metrics = newInputMetrics("", nil, 0) } return &sqsS3EventProcessor{ - s3ObjectHandler: s3, + s3HandlerFactory: s3, sqsVisibilityTimeout: sqsVisibilityTimeout, maxReceiveCount: maxReceiveCount, sqs: sqs, - pipeline: pipeline, log: log, metrics: metrics, script: script, } } -func (p *sqsS3EventProcessor) ProcessSQS(ctx context.Context, msg *types.Message) error { +type sqsProcessingResult struct { + processor *sqsS3EventProcessor + msg *types.Message + receiveCount int // How many times this SQS object has been read + eventCount int // How many events were generated from this SQS object + keepaliveCancel context.CancelFunc + processingErr error + + // Finalizer callbacks for the returned S3 events, invoked via + // finalizeS3Objects after all events are acknowledged. + finalizers []finalizerFunc +} + +type finalizerFunc func() error + +func (p *sqsS3EventProcessor) ProcessSQS(ctx context.Context, msg *types.Message, eventCallback func(beat.Event)) sqsProcessingResult { log := p.log.With( "message_id", *msg.MessageId, "message_receipt_time", time.Now().UTC()) @@ -165,7 +176,10 @@ func (p *sqsS3EventProcessor) ProcessSQS(ctx context.Context, msg *types.Message // Start SQS keepalive worker. 
var keepaliveWg sync.WaitGroup keepaliveWg.Add(1) - go p.keepalive(keepaliveCtx, log, &keepaliveWg, msg) + go func() { + defer keepaliveWg.Done() + p.keepalive(keepaliveCtx, log, msg) + }() receiveCount := getSQSReceiveCount(msg.Attributes) if receiveCount == 1 { @@ -179,45 +193,69 @@ func (p *sqsS3EventProcessor) ProcessSQS(ctx context.Context, msg *types.Message } } - handles, processingErr := p.processS3Events(ctx, log, *msg.Body) + eventCount := 0 + finalizers, processingErr := p.processS3Events(ctx, log, *msg.Body, func(e beat.Event) { + eventCount++ + eventCallback(e) + }) + + return sqsProcessingResult{ + msg: msg, + processor: p, + receiveCount: receiveCount, + eventCount: eventCount, + keepaliveCancel: keepaliveCancel, + processingErr: processingErr, + finalizers: finalizers, + } +} + +// Call Done to indicate that all events from this SQS message have been +// acknowledged and it is safe to stop the keepalive routine and +// delete / finalize the message. +func (r sqsProcessingResult) Done() { + p := r.processor + processingErr := r.processingErr // Stop keepalive routine before changing visibility. - keepaliveCancel() - keepaliveWg.Wait() + r.keepaliveCancel() // No error. Delete SQS. if processingErr == nil { - if msgDelErr := p.sqs.DeleteMessage(context.Background(), msg); msgDelErr != nil { - return fmt.Errorf("failed deleting message from SQS queue (it may be reprocessed): %w", msgDelErr) + if msgDelErr := p.sqs.DeleteMessage(context.Background(), r.msg); msgDelErr != nil { + p.log.Errorf("failed deleting message from SQS queue (it may be reprocessed): %v", msgDelErr.Error()) + return + } + if p.metrics != nil { + // This nil check always passes in production, but it's nice when unit + // tests don't have to initialize irrelevant fields + p.metrics.sqsMessagesDeletedTotal.Inc() } - p.metrics.sqsMessagesDeletedTotal.Inc() // SQS message finished and deleted, finalize s3 objects - if finalizeErr := p.finalizeS3Objects(handles); finalizeErr != nil { - return fmt.Errorf("failed finalizing message from SQS queue (manual cleanup is required): %w", finalizeErr) + if finalizeErr := r.finalizeS3Objects(); finalizeErr != nil { + p.log.Errorf("failed finalizing message from SQS queue (manual cleanup is required): %v", finalizeErr.Error()) } - return nil + return } - if p.maxReceiveCount > 0 && !errors.Is(processingErr, &nonRetryableError{}) { + if p.maxReceiveCount > 0 && r.receiveCount >= p.maxReceiveCount { // Prevent poison pill messages from consuming all workers. Check how // many times this message has been received before making a disposition. - if receiveCount >= p.maxReceiveCount { - processingErr = nonRetryableErrorWrap(fmt.Errorf( - "sqs ApproximateReceiveCount <%v> exceeds threshold %v: %w", - receiveCount, p.maxReceiveCount, processingErr)) - } + processingErr = nonRetryableErrorWrap(fmt.Errorf( + "sqs ApproximateReceiveCount <%v> exceeds threshold %v: %w", + r.receiveCount, p.maxReceiveCount, processingErr)) } // An error that reprocessing cannot correct. Delete SQS. 
if errors.Is(processingErr, &nonRetryableError{}) { - if msgDelErr := p.sqs.DeleteMessage(context.Background(), msg); msgDelErr != nil { - return multierr.Combine( - fmt.Errorf("failed processing SQS message (attempted to delete message): %w", processingErr), - fmt.Errorf("failed deleting message from SQS queue (it may be reprocessed): %w", msgDelErr), - ) + if msgDelErr := p.sqs.DeleteMessage(context.Background(), r.msg); msgDelErr != nil { + p.log.Errorf("failed processing SQS message (attempted to delete message): %v", processingErr.Error()) + p.log.Errorf("failed deleting message from SQS queue (it may be reprocessed): %v", msgDelErr.Error()) + return } p.metrics.sqsMessagesDeletedTotal.Inc() - return fmt.Errorf("failed processing SQS message (message was deleted): %w", processingErr) + p.log.Errorf("failed processing SQS message (message was deleted): %v", processingErr.Error()) + return } // An error that may be resolved by letting the visibility timeout @@ -225,12 +263,10 @@ func (p *sqsS3EventProcessor) ProcessSQS(ctx context.Context, msg *types.Message // queue is enabled then the message will eventually be placed on the DLQ // after maximum receives is reached. p.metrics.sqsMessagesReturnedTotal.Inc() - return fmt.Errorf("failed processing SQS message (it will return to queue after visibility timeout): %w", processingErr) + p.log.Errorf("failed processing SQS message (it will return to queue after visibility timeout): %v", processingErr.Error()) } -func (p *sqsS3EventProcessor) keepalive(ctx context.Context, log *logp.Logger, wg *sync.WaitGroup, msg *types.Message) { - defer wg.Done() - +func (p *sqsS3EventProcessor) keepalive(ctx context.Context, log *logp.Logger, msg *types.Message) { t := time.NewTicker(p.sqsVisibilityTimeout / 2) defer t.Stop() @@ -355,7 +391,12 @@ func (*sqsS3EventProcessor) isObjectCreatedEvents(event s3EventV2) bool { return event.EventSource == "aws:s3" && strings.HasPrefix(event.EventName, "ObjectCreated:") } -func (p *sqsS3EventProcessor) processS3Events(ctx context.Context, log *logp.Logger, body string) ([]s3ObjectHandler, error) { +func (p *sqsS3EventProcessor) processS3Events( + ctx context.Context, + log *logp.Logger, + body string, + eventCallback func(beat.Event), +) ([]finalizerFunc, error) { s3Events, err := p.getS3Notifications(body) if err != nil { if errors.Is(err, context.Canceled) { @@ -371,57 +412,36 @@ func (p *sqsS3EventProcessor) processS3Events(ctx context.Context, log *logp.Log return nil, nil } - // Create a pipeline client scoped to this goroutine. - client, err := p.pipeline.ConnectWith(beat.ClientConfig{ - EventListener: awscommon.NewEventACKHandler(), - Processing: beat.ProcessingConfig{ - // This input only produces events with basic types so normalization - // is not required. - EventNormalization: boolPtr(false), - }, - }) - if err != nil { - return nil, err - } - defer client.Close() - - // Wait for all events to be ACKed before proceeding. - acker := awscommon.NewEventACKTracker(ctx) - defer acker.Wait() - var errs []error - var handles []s3ObjectHandler + var finalizers []finalizerFunc for i, event := range s3Events { - s3Processor := p.s3ObjectHandler.Create(ctx, log, client, acker, event) + s3Processor := p.s3HandlerFactory.Create(ctx, event) if s3Processor == nil { + // A nil result generally means that this object key doesn't match the + // user-configured filters. continue } // Process S3 object (download, parse, create events).
- if err := s3Processor.ProcessS3Object(); err != nil { + if err := s3Processor.ProcessS3Object(log, eventCallback); err != nil { errs = append(errs, fmt.Errorf( "failed processing S3 event for object key %q in bucket %q (object record %d of %d in SQS notification): %w", event.S3.Object.Key, event.S3.Bucket.Name, i+1, len(s3Events), err)) } else { - handles = append(handles, s3Processor) + finalizers = append(finalizers, s3Processor.FinalizeS3Object) } } - // Make sure all s3 events were processed successfully - if len(handles) == len(s3Events) { - return handles, multierr.Combine(errs...) - } - - return nil, multierr.Combine(errs...) + return finalizers, multierr.Combine(errs...) } -func (p *sqsS3EventProcessor) finalizeS3Objects(handles []s3ObjectHandler) error { +func (r sqsProcessingResult) finalizeS3Objects() error { var errs []error - for i, handle := range handles { - if err := handle.FinalizeS3Object(); err != nil { + for i, finalize := range r.finalizers { + if err := finalize(); err != nil { errs = append(errs, fmt.Errorf( "failed finalizing S3 event (object record %d of %d in SQS notification): %w", - i+1, len(handles), err)) + i+1, len(r.finalizers), err)) } } return multierr.Combine(errs...) diff --git a/x-pack/filebeat/input/awss3/sqs_s3_event_test.go b/x-pack/filebeat/input/awss3/sqs_s3_event_test.go index 92401fe45eee..c7962bb2f0f3 100644 --- a/x-pack/filebeat/input/awss3/sqs_s3_event_test.go +++ b/x-pack/filebeat/input/awss3/sqs_s3_event_test.go @@ -8,7 +8,6 @@ import ( "context" "errors" "fmt" - "sync" "testing" "time" @@ -22,7 +21,6 @@ import ( "github.com/stretchr/testify/require" "github.com/elastic/beats/v7/libbeat/beat" - awscommon "github.com/elastic/beats/v7/x-pack/libbeat/common/aws" "github.com/elastic/elastic-agent-libs/logp" "github.com/elastic/go-concert/timed" ) @@ -41,18 +39,16 @@ func TestSQSS3EventProcessor(t *testing.T) { defer ctrl.Finish() mockAPI := NewMockSQSAPI(ctrl) mockS3HandlerFactory := NewMockS3ObjectHandlerFactory(ctrl) - mockClient := NewMockBeatClient(ctrl) - mockBeatPipeline := NewMockBeatPipeline(ctrl) gomock.InOrder( - mockBeatPipeline.EXPECT().ConnectWith(gomock.Any()).Return(mockClient, nil), - mockS3HandlerFactory.EXPECT().Create(gomock.Any(), gomock.Any(), gomock.Any(), gomock.Any(), gomock.Any()).Return(nil), - mockClient.EXPECT().Close(), + mockS3HandlerFactory.EXPECT().Create(gomock.Any(), gomock.Any()).Return(nil), mockAPI.EXPECT().DeleteMessage(gomock.Any(), gomock.Eq(&msg)).Return(nil), ) - p := newSQSS3EventProcessor(logp.NewLogger(inputName), nil, mockAPI, nil, time.Minute, 5, mockBeatPipeline, mockS3HandlerFactory) - require.NoError(t, p.ProcessSQS(ctx, &msg)) + p := newSQSS3EventProcessor(logp.NewLogger(inputName), nil, mockAPI, nil, time.Minute, 5, mockS3HandlerFactory) + result := p.ProcessSQS(ctx, &msg, func(_ beat.Event) {}) + require.NoError(t, result.processingErr) + result.Done() }) t.Run("invalid SQS JSON body does not retry", func(t *testing.T) { @@ -63,7 +59,6 @@ func TestSQSS3EventProcessor(t *testing.T) { defer ctrl.Finish() mockAPI := NewMockSQSAPI(ctrl) mockS3HandlerFactory := NewMockS3ObjectHandlerFactory(ctrl) - mockBeatPipeline := NewMockBeatPipeline(ctrl) invalidBodyMsg, err := newSQSMessage(newS3Event("log.json")) require.NoError(t, err) @@ -72,14 +67,13 @@ func TestSQSS3EventProcessor(t *testing.T) { body = body[10:] invalidBodyMsg.Body = &body - gomock.InOrder( - mockAPI.EXPECT().DeleteMessage(gomock.Any(), gomock.Eq(&invalidBodyMsg)).Return(nil), - ) + mockAPI.EXPECT().DeleteMessage(gomock.Any(), 
gomock.Eq(&invalidBodyMsg)).Return(nil) - p := newSQSS3EventProcessor(logp.NewLogger(inputName), nil, mockAPI, nil, time.Minute, 5, mockBeatPipeline, mockS3HandlerFactory) - err = p.ProcessSQS(ctx, &invalidBodyMsg) - require.Error(t, err) - t.Log(err) + p := newSQSS3EventProcessor(logp.NewLogger(inputName), nil, mockAPI, nil, time.Minute, 5, mockS3HandlerFactory) + result := p.ProcessSQS(ctx, &invalidBodyMsg, func(_ beat.Event) {}) + require.Error(t, result.processingErr) + t.Log(result.processingErr) + result.Done() }) t.Run("zero S3 events in body", func(t *testing.T) { @@ -90,17 +84,16 @@ func TestSQSS3EventProcessor(t *testing.T) { defer ctrl.Finish() mockAPI := NewMockSQSAPI(ctrl) mockS3HandlerFactory := NewMockS3ObjectHandlerFactory(ctrl) - mockBeatPipeline := NewMockBeatPipeline(ctrl) emptyRecordsMsg, err := newSQSMessage([]s3EventV2{}...) require.NoError(t, err) - gomock.InOrder( - mockAPI.EXPECT().DeleteMessage(gomock.Any(), gomock.Eq(&emptyRecordsMsg)).Return(nil), - ) + mockAPI.EXPECT().DeleteMessage(gomock.Any(), gomock.Eq(&emptyRecordsMsg)).Return(nil) - p := newSQSS3EventProcessor(logp.NewLogger(inputName), nil, mockAPI, nil, time.Minute, 5, mockBeatPipeline, mockS3HandlerFactory) - require.NoError(t, p.ProcessSQS(ctx, &emptyRecordsMsg)) + p := newSQSS3EventProcessor(logp.NewLogger(inputName), nil, mockAPI, nil, time.Minute, 5, mockS3HandlerFactory) + result := p.ProcessSQS(ctx, &emptyRecordsMsg, func(_ beat.Event) {}) + require.NoError(t, result.processingErr) + result.Done() }) t.Run("visibility is extended after half expires", func(t *testing.T) { @@ -114,25 +107,23 @@ func TestSQSS3EventProcessor(t *testing.T) { mockAPI := NewMockSQSAPI(ctrl) mockS3HandlerFactory := NewMockS3ObjectHandlerFactory(ctrl) mockS3Handler := NewMockS3ObjectHandler(ctrl) - mockClient := NewMockBeatClient(ctrl) - mockBeatPipeline := NewMockBeatPipeline(ctrl) mockAPI.EXPECT().ChangeMessageVisibility(gomock.Any(), gomock.Eq(&msg), gomock.Eq(visibilityTimeout)).AnyTimes().Return(nil) gomock.InOrder( - mockBeatPipeline.EXPECT().ConnectWith(gomock.Any()).Return(mockClient, nil), - mockS3HandlerFactory.EXPECT().Create(gomock.Any(), gomock.Any(), gomock.Any(), gomock.Any(), gomock.Any()). - Do(func(ctx context.Context, _ *logp.Logger, _ beat.Client, _ *awscommon.EventACKTracker, _ s3EventV2) { + mockS3HandlerFactory.EXPECT().Create(gomock.Any(), gomock.Any()). 
+ Do(func(ctx context.Context, _ s3EventV2) { require.NoError(t, timed.Wait(ctx, 5*visibilityTimeout)) }).Return(mockS3Handler), - mockS3Handler.EXPECT().ProcessS3Object().Return(nil), - mockClient.EXPECT().Close(), + mockS3Handler.EXPECT().ProcessS3Object(gomock.Any(), gomock.Any()).Return(nil), mockAPI.EXPECT().DeleteMessage(gomock.Any(), gomock.Eq(&msg)).Return(nil), mockS3Handler.EXPECT().FinalizeS3Object().Return(nil), ) - p := newSQSS3EventProcessor(logp.NewLogger(inputName), nil, mockAPI, nil, visibilityTimeout, 5, mockBeatPipeline, mockS3HandlerFactory) - require.NoError(t, p.ProcessSQS(ctx, &msg)) + p := newSQSS3EventProcessor(logp.NewLogger(inputName), nil, mockAPI, nil, visibilityTimeout, 5, mockS3HandlerFactory) + result := p.ProcessSQS(ctx, &msg, func(_ beat.Event) {}) + require.NoError(t, result.processingErr) + result.Done() }) t.Run("message returns to queue on error", func(t *testing.T) { @@ -144,20 +135,17 @@ func TestSQSS3EventProcessor(t *testing.T) { mockAPI := NewMockSQSAPI(ctrl) mockS3HandlerFactory := NewMockS3ObjectHandlerFactory(ctrl) mockS3Handler := NewMockS3ObjectHandler(ctrl) - mockClient := NewMockBeatClient(ctrl) - mockBeatPipeline := NewMockBeatPipeline(ctrl) gomock.InOrder( - mockBeatPipeline.EXPECT().ConnectWith(gomock.Any()).Return(mockClient, nil), - mockS3HandlerFactory.EXPECT().Create(gomock.Any(), gomock.Any(), gomock.Any(), gomock.Any(), gomock.Any()).Return(mockS3Handler), - mockS3Handler.EXPECT().ProcessS3Object().Return(errors.New("fake connectivity problem")), - mockClient.EXPECT().Close(), + mockS3HandlerFactory.EXPECT().Create(gomock.Any(), gomock.Any()).Return(mockS3Handler), + mockS3Handler.EXPECT().ProcessS3Object(gomock.Any(), gomock.Any()).Return(errors.New("fake connectivity problem")), ) - p := newSQSS3EventProcessor(logp.NewLogger(inputName), nil, mockAPI, nil, time.Minute, 5, mockBeatPipeline, mockS3HandlerFactory) - err := p.ProcessSQS(ctx, &msg) - t.Log(err) - require.Error(t, err) + p := newSQSS3EventProcessor(logp.NewLogger(inputName), nil, mockAPI, nil, time.Minute, 5, mockS3HandlerFactory) + result := p.ProcessSQS(ctx, &msg, func(_ beat.Event) {}) + t.Log(result.processingErr) + require.Error(t, result.processingErr) + result.Done() }) t.Run("message is deleted after multiple receives", func(t *testing.T) { @@ -169,8 +157,6 @@ func TestSQSS3EventProcessor(t *testing.T) { mockAPI := NewMockSQSAPI(ctrl) mockS3HandlerFactory := NewMockS3ObjectHandlerFactory(ctrl) mockS3Handler := NewMockS3ObjectHandler(ctrl) - mockClient := NewMockBeatClient(ctrl) - mockBeatPipeline := NewMockBeatPipeline(ctrl) msg := msg msg.Attributes = map[string]string{ @@ -178,17 +164,16 @@ func TestSQSS3EventProcessor(t *testing.T) { } gomock.InOrder( - mockBeatPipeline.EXPECT().ConnectWith(gomock.Any()).Return(mockClient, nil), - mockS3HandlerFactory.EXPECT().Create(gomock.Any(), gomock.Any(), gomock.Any(), gomock.Any(), gomock.Any()).Return(mockS3Handler), - mockS3Handler.EXPECT().ProcessS3Object().Return(errors.New("fake connectivity problem")), - mockClient.EXPECT().Close(), + mockS3HandlerFactory.EXPECT().Create(gomock.Any(), gomock.Any()).Return(mockS3Handler), + mockS3Handler.EXPECT().ProcessS3Object(gomock.Any(), gomock.Any()).Return(errors.New("fake connectivity problem")), mockAPI.EXPECT().DeleteMessage(gomock.Any(), gomock.Eq(&msg)).Return(nil), ) - p := newSQSS3EventProcessor(logp.NewLogger(inputName), nil, mockAPI, nil, time.Minute, 5, mockBeatPipeline, mockS3HandlerFactory) - err := p.ProcessSQS(ctx, &msg) - t.Log(err) - require.Error(t, err) + 
p := newSQSS3EventProcessor(logp.NewLogger(inputName), nil, mockAPI, nil, time.Minute, 5, mockS3HandlerFactory) + result := p.ProcessSQS(ctx, &msg, func(_ beat.Event) {}) + t.Log(result.eventCount) + require.Error(t, result.processingErr) + result.Done() }) } @@ -227,16 +212,12 @@ func TestSqsProcessor_keepalive(t *testing.T) { defer ctrl.Finish() mockAPI := NewMockSQSAPI(ctrl) mockS3HandlerFactory := NewMockS3ObjectHandlerFactory(ctrl) - mockBeatPipeline := NewMockBeatPipeline(ctrl) mockAPI.EXPECT().ChangeMessageVisibility(gomock.Any(), gomock.Eq(&msg), gomock.Eq(visibilityTimeout)). Times(1).Return(tc.Err) - p := newSQSS3EventProcessor(logp.NewLogger(inputName), nil, mockAPI, nil, visibilityTimeout, 5, mockBeatPipeline, mockS3HandlerFactory) - var wg sync.WaitGroup - wg.Add(1) - p.keepalive(ctx, p.log, &wg, &msg) - wg.Wait() + p := newSQSS3EventProcessor(logp.NewLogger(inputName), nil, mockAPI, nil, visibilityTimeout, 5, mockS3HandlerFactory) + p.keepalive(ctx, p.log, &msg) }) } } @@ -245,7 +226,7 @@ func TestSqsProcessor_getS3Notifications(t *testing.T) { err := logp.TestingSetup() require.NoError(t, err) - p := newSQSS3EventProcessor(logp.NewLogger(inputName), nil, nil, nil, time.Minute, 5, nil, nil) + p := newSQSS3EventProcessor(logp.NewLogger(inputName), nil, nil, nil, time.Minute, 5, nil) t.Run("s3 key is url unescaped", func(t *testing.T) { msg, err := newSQSMessage(newS3Event("Happy+Face.jpg")) diff --git a/x-pack/filebeat/input/awss3/sqs_test.go b/x-pack/filebeat/input/awss3/sqs_test.go index fff17ebc1a6d..8bc25397eaeb 100644 --- a/x-pack/filebeat/input/awss3/sqs_test.go +++ b/x-pack/filebeat/input/awss3/sqs_test.go @@ -19,6 +19,7 @@ import ( "github.com/stretchr/testify/assert" "github.com/stretchr/testify/require" + "github.com/elastic/beats/v7/libbeat/beat" "github.com/elastic/elastic-agent-libs/logp" ) @@ -33,7 +34,7 @@ func TestSQSReceiver(t *testing.T) { err := logp.TestingSetup() require.NoError(t, err) - const maxMessages = 5 + const workerCount = 5 t.Run("ReceiveMessage success", func(t *testing.T) { ctx, cancel := context.WithTimeout(context.Background(), testTimeout) @@ -61,8 +62,6 @@ func TestSQSReceiver(t *testing.T) { ReceiveMessage(gomock.Any(), gomock.Any()). Times(1). DoAndReturn(func(_ context.Context, _ int) ([]types.Message, error) { - // Stop the test. - cancel() return nil, nil }) @@ -72,19 +71,43 @@ func TestSQSReceiver(t *testing.T) { return map[string]string{sqsApproximateNumberOfMessages: "10000"}, nil }).AnyTimes() + mockSQS.EXPECT(). + DeleteMessage(gomock.Any(), gomock.Any()).Times(1).Do( + func(_ context.Context, _ *types.Message) { + cancel() + }) + + logger := logp.NewLogger(inputName) + // Expect the one message returned to have been processed. mockMsgHandler.EXPECT(). - ProcessSQS(gomock.Any(), gomock.Eq(&msg)). + ProcessSQS(gomock.Any(), gomock.Eq(&msg), gomock.Any()). Times(1). - Return(nil) + DoAndReturn( + func(_ context.Context, _ *types.Message, _ func(e beat.Event)) sqsProcessingResult { + return sqsProcessingResult{ + keepaliveCancel: func() {}, + processor: &sqsS3EventProcessor{ + log: logger, + sqs: mockSQS, + }, + } + }) // Execute sqsReader and verify calls/state. 
- sqsReader := newSQSReaderInput(config{MaxNumberOfMessages: maxMessages}, aws.Config{}) - sqsReader.log = logp.NewLogger(inputName) + sqsReader := newSQSReaderInput(config{NumberOfWorkers: workerCount}, aws.Config{}) + sqsReader.log = logger sqsReader.sqs = mockSQS - sqsReader.msgHandler = mockMsgHandler sqsReader.metrics = newInputMetrics("", nil, 0) + sqsReader.pipeline = &fakePipeline{} + sqsReader.msgHandler = mockMsgHandler sqsReader.run(ctx) + + select { + case <-ctx.Done(): + case <-time.After(time.Second): + require.Fail(t, "Never observed SQS DeleteMessage call") + } }) t.Run("retry after ReceiveMessage error", func(t *testing.T) { @@ -120,11 +143,12 @@ func TestSQSReceiver(t *testing.T) { }).AnyTimes() // Execute SQSReader and verify calls/state. - sqsReader := newSQSReaderInput(config{MaxNumberOfMessages: maxMessages}, aws.Config{}) + sqsReader := newSQSReaderInput(config{NumberOfWorkers: workerCount}, aws.Config{}) sqsReader.log = logp.NewLogger(inputName) sqsReader.sqs = mockSQS sqsReader.msgHandler = mockMsgHandler sqsReader.metrics = newInputMetrics("", nil, 0) + sqsReader.pipeline = &fakePipeline{} sqsReader.run(ctx) }) } diff --git a/x-pack/filebeat/input/entityanalytics/provider/okta/internal/okta/okta.go b/x-pack/filebeat/input/entityanalytics/provider/okta/internal/okta/okta.go index ef574ef4d26a..3d8bdae11c97 100644 --- a/x-pack/filebeat/input/entityanalytics/provider/okta/internal/okta/okta.go +++ b/x-pack/filebeat/input/entityanalytics/provider/okta/internal/okta/okta.go @@ -44,7 +44,7 @@ type User struct { Profile map[string]any `json:"profile"` Credentials *Credentials `json:"credentials,omitempty"` Links HAL `json:"_links,omitempty"` // See https://developer.okta.com/docs/reference/api/users/#links-object for details. - Embedded HAL `json:"_embedded,omitempty"` + Embedded map[string]any `json:"_embedded,omitempty"` } // Credentials is a redacted Okta user's credential details. Only the credential provider is retained. @@ -72,6 +72,37 @@ type Group struct { Profile map[string]any `json:"profile"` } +// Factor is an Okta identity factor description. +// +// See https://developer.okta.com/docs/api/openapi/okta-management/management/tag/UserFactor/#tag/UserFactor/operation/listFactors. +type Factor struct { + ID string `json:"id"` + FactorType string `json:"factorType"` + Provider string `json:"provider"` + VendorName string `json:"vendorName"` + Status string `json:"status"` + Created time.Time `json:"created"` + LastUpdated time.Time `json:"lastUpdated"` + Profile map[string]any `json:"profile"` + Links HAL `json:"_links,omitempty"` + Embedded map[string]any `json:"_embedded,omitempty"` +} + +// Role is an Okta user role description. +// +// See https://developer.okta.com/docs/api/openapi/okta-management/management/tag/RoleAssignmentAUser/#tag/RoleAssignmentAUser/operation/listAssignedRolesForUser +// and https://developer.okta.com/docs/api/openapi/okta-management/management/tag/RoleAssignmentBGroup/#tag/RoleAssignmentBGroup/operation/listGroupAssignedRoles. +type Role struct { + ID string `json:"id"` + Label string `json:"label"` + Type string `json:"type"` + Status string `json:"status"` + Created time.Time `json:"created"` + LastUpdated time.Time `json:"lastUpdated"` + AssignmentType string `json:"assignmentType"` + Links HAL `json:"_links"` +} + // Device is an Okta device's details. 
// // See https://developer.okta.com/docs/api/openapi/okta-management/management/tag/Device/#tag/Device/operation/listDevices for details @@ -176,6 +207,48 @@ func GetUserDetails(ctx context.Context, cli *http.Client, host, key, user strin return getDetails[User](ctx, cli, u, key, user == "", omit, lim, window, log) } +// GetUserFactors returns Okta user factors using the users API endpoint. host is the +// Okta user domain and key is the API token to use for the query. user must not be empty. +// +// See GetUserDetails for details of the query and rate limit parameters. +// +// See https://developer.okta.com/docs/api/openapi/okta-management/management/tag/UserFactor/#tag/UserFactor/operation/listFactors. +func GetUserFactors(ctx context.Context, cli *http.Client, host, key, user string, lim *rate.Limiter, window time.Duration, log *logp.Logger) ([]Factor, http.Header, error) { + const endpoint = "/api/v1/users" + + if user == "" { + return nil, nil, errors.New("no user specified") + } + + u := &url.URL{ + Scheme: "https", + Host: host, + Path: path.Join(endpoint, user, "factors"), + } + return getDetails[Factor](ctx, cli, u, key, true, OmitNone, lim, window, log) +} + +// GetUserRoles returns Okta user roles using the users API endpoint. host is the +// Okta user domain and key is the API token to use for the query. user must not be empty. +// +// See GetUserDetails for details of the query and rate limit parameters. +// +// See https://developer.okta.com/docs/api/openapi/okta-management/management/tag/RoleAssignmentAUser/#tag/RoleAssignmentAUser/operation/listAssignedRolesForUser. +func GetUserRoles(ctx context.Context, cli *http.Client, host, key, user string, lim *rate.Limiter, window time.Duration, log *logp.Logger) ([]Role, http.Header, error) { + const endpoint = "/api/v1/users" + + if user == "" { + return nil, nil, errors.New("no user specified") + } + + u := &url.URL{ + Scheme: "https", + Host: host, + Path: path.Join(endpoint, user, "roles"), + } + return getDetails[Role](ctx, cli, u, key, true, OmitNone, lim, window, log) +} + // GetUserGroupDetails returns Okta group details using the users API endpoint. host is the // Okta user domain and key is the API token to use for the query. user must not be empty. // @@ -197,6 +270,27 @@ func GetUserGroupDetails(ctx context.Context, cli *http.Client, host, key, user return getDetails[Group](ctx, cli, u, key, true, OmitNone, lim, window, log) } +// GetGroupRoles returns Okta group roles using the groups API endpoint. host is the +// Okta user domain and key is the API token to use for the query. group must not be empty. +// +// See GetUserDetails for details of the query and rate limit parameters. +// +// See https://developer.okta.com/docs/api/openapi/okta-management/management/tag/RoleAssignmentBGroup/#tag/RoleAssignmentBGroup/operation/listGroupAssignedRoles. +func GetGroupRoles(ctx context.Context, cli *http.Client, host, key, group string, lim *rate.Limiter, window time.Duration, log *logp.Logger) ([]Role, http.Header, error) { + const endpoint = "/api/v1/groups" + + if group == "" { + return nil, nil, errors.New("no group specified") + } + + u := &url.URL{ + Scheme: "https", + Host: host, + Path: path.Join(endpoint, group, "roles"), + } + return getDetails[Role](ctx, cli, u, key, true, OmitNone, lim, window, log) +} + // GetDeviceDetails returns Okta device details using the list devices API endpoint. host is the // Okta user domain and key is the API token to use for the query.
If device is not empty, // details for the specific device are returned, otherwise a list of all devices is returned. @@ -250,7 +344,7 @@ func GetDeviceUsers(ctx context.Context, cli *http.Client, host, key, device str // entity is an Okta entity analytics entity. type entity interface { - User | Group | Device | devUser + User | Group | Role | Factor | Device | devUser } type devUser struct { diff --git a/x-pack/filebeat/input/entityanalytics/provider/okta/internal/okta/okta_test.go b/x-pack/filebeat/input/entityanalytics/provider/okta/internal/okta/okta_test.go index 2ce439252210..9b04d3996bf9 100644 --- a/x-pack/filebeat/input/entityanalytics/provider/okta/internal/okta/okta_test.go +++ b/x-pack/filebeat/input/entityanalytics/provider/okta/internal/okta/okta_test.go @@ -116,6 +116,52 @@ func Test(t *testing.T) { t.Logf("groups: %s", b) }) + t.Run("my_roles", func(t *testing.T) { + roles, _, err := GetUserRoles(context.Background(), http.DefaultClient, host, key, me.ID, limiter, window, logger) + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + if len(roles) == 0 { + t.Fatalf("unexpected len(roles): got:%d want>0", len(roles)) + } + + if omit&OmitCredentials != 0 && me.Credentials != nil { + t.Errorf("unexpected credentials with %s: %#v", omit, me.Credentials) + } + + if !*logResponses { + return + } + b, err := json.Marshal(roles) + if err != nil { + t.Errorf("failed to marshal roles for logging: %v", err) + } + t.Logf("roles: %s", b) + }) + + t.Run("my_factors", func(t *testing.T) { + factors, _, err := GetUserFactors(context.Background(), http.DefaultClient, host, key, me.ID, limiter, window, logger) + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + if len(factors) == 0 { + t.Fatalf("unexpected len(factors): got:%d want>0", len(factors)) + } + + if omit&OmitCredentials != 0 && me.Credentials != nil { + t.Errorf("unexpected credentials with %s: %#v", omit, me.Credentials) + } + + if !*logResponses { + return + } + b, err := json.Marshal(factors) + if err != nil { + t.Errorf("failed to marshal factors for logging: %v", err) + } + t.Logf("factors: %s", b) + }) + t.Run("user", func(t *testing.T) { login, _ := me.Profile["login"].(string) if login == "" { diff --git a/x-pack/filebeat/input/internal/private/private.go b/x-pack/filebeat/input/internal/private/private.go index e47b6521e477..c0b2e311fded 100644 --- a/x-pack/filebeat/input/internal/private/private.go +++ b/x-pack/filebeat/input/internal/private/private.go @@ -35,7 +35,11 @@ var privateKey = reflect.ValueOf("private") // `private:""`, the fields with the tag will be marked as private. Otherwise // the comma-separated list of names will be used. The list may refer to its
-func Redact[T any](val T, tag string, global []string) (redacted T, err error) { +func Redact[T any](val T, tag string, global []string, replace ...Replacer) (redacted T, err error) { + reps, err := compileReplacers(replace) + if err != nil { + return redacted, err + } defer func() { switch r := recover().(type) { case nil: @@ -54,13 +58,65 @@ func Redact[T any](val T, tag string, global []string) (redacted T, err error) { rv := reflect.ValueOf(val) switch rv.Kind() { case reflect.Map, reflect.Pointer, reflect.Struct: - return redact(rv, tag, slices.Clone(global), 0, make(map[any]int)).Interface().(T), nil + return redact(rv, reps, tag, slices.Clone(global), 0, make(map[any]int)).Interface().(T), nil default: return val, nil } } -func redact(v reflect.Value, tag string, global []string, depth int, seen map[any]int) reflect.Value { +// Replacer is a function that will return a redaction replacement +// for the provided type. It must be a func(T) T. +type Replacer any + +// NewStringReplacer returns a string Replacer that returns s. +func NewStringReplacer(s string) Replacer { + return func(string) string { + return s + } +} + +// NewBytesReplacer returns a []byte Replacer that returns the bytes +// representation of s. +func NewBytesReplacer(s string) Replacer { + return func([]byte) []byte { + return []byte(s) + } +} + +type replacers map[reflect.Type]func(reflect.Value) reflect.Value + +func compileReplacers(replace []Replacer) (replacers, error) { + reps := make(replacers) + for _, r := range replace { + rv := reflect.ValueOf(r) + rt := rv.Type() + if rt.Kind() != reflect.Func { + return nil, fmt.Errorf("replacer is not a function: %T", r) + } + if n := rt.NumIn(); n != 1 { + return nil, fmt.Errorf("incorrect number of arguments for replacer: %d != 1", n) + } + if n := rt.NumOut(); n != 1 { + return nil, fmt.Errorf("incorrect number of return values from replacer: %d != 1", n) + } + in, out := rt.In(0), rt.Out(0) + if in != out { + return nil, fmt.Errorf("replacer does not preserve type: fn(%s) %s", in, out) + } + if _, exists := reps[in]; exists { + return nil, fmt.Errorf("multiple replacers for %s", in) + } + reps[in] = func(v reflect.Value) reflect.Value { + return rv.Call([]reflect.Value{v})[0] + } + } + if len(reps) == 0 { + reps = nil + } + return reps, nil +} + +func redact(v reflect.Value, reps replacers, tag string, global []string, depth int, seen map[any]int) reflect.Value { switch v.Kind() { case reflect.Pointer: if v.IsNil() { @@ -74,19 +130,19 @@ func redact(v reflect.Value, tag string, global []string, depth int, seen map[an seen[ident] = depth defer delete(seen, ident) } - return redact(v.Elem(), tag, global, depth+1, seen).Addr() + return redact(v.Elem(), reps, tag, global, depth+1, seen).Addr() case reflect.Interface: if v.IsNil() { return v } - return redact(v.Elem(), tag, global, depth+1, seen) + return redact(v.Elem(), reps, tag, global, depth+1, seen) case reflect.Array: if v.Len() == 0 { return v } r := reflect.New(v.Type()).Elem() for i := 0; i < v.Len(); i++ { - r.Index(i).Set(redact(v.Index(i), tag, global, depth+1, seen)) + r.Index(i).Set(redact(v.Index(i), reps, tag, global, depth+1, seen)) } return r case reflect.Slice: @@ -109,7 +165,7 @@ func redact(v reflect.Value, tag string, global []string, depth int, seen map[an } r := reflect.MakeSlice(v.Type(), v.Len(), v.Cap()) for i := 0; i < v.Len(); i++ { - r.Index(i).Set(redact(v.Index(i), tag, global, depth+1, seen)) + r.Index(i).Set(redact(v.Index(i), reps, tag, global, depth+1, seen)) } return r case 
reflect.Map: @@ -145,9 +201,13 @@ func redact(v reflect.Value, tag string, global []string, depth int, seen map[an for it.Next() { name := it.Key().String() if slices.Contains(private, name) { + v := replaceNestedWithin(it.Value(), reps) + if v.IsValid() { + r.SetMapIndex(it.Key(), v) + } continue } - r.SetMapIndex(it.Key(), redact(it.Value(), tag, nextPath(name, global), depth+1, seen)) + r.SetMapIndex(it.Key(), redact(it.Value(), reps, tag, nextPath(name, global), depth+1, seen)) } return r case reflect.Struct: @@ -219,10 +279,14 @@ func redact(v reflect.Value, tag string, global []string, depth int, seen map[an continue } if slices.Contains(private, names[i]) { + v := replaceNestedWithin(f, reps) + if v.IsValid() { + r.Field(i).Set(v) + } continue } if r.Field(i).CanSet() { - r.Field(i).Set(redact(f, tag, nextPath(names[i], global), depth+1, seen)) + r.Field(i).Set(redact(f, reps, tag, nextPath(names[i], global), depth+1, seen)) } } return r @@ -230,6 +294,67 @@ return v } +// replaceNestedWithin replaces deeply nested values in pointer, interface and +// array/slice chains. If no replacement is made, an invalid reflect.Value is +// returned. Array and slice elements for which no replacement is made are +// set to the zero value for their type. +func replaceNestedWithin(v reflect.Value, reps replacers) reflect.Value { + if len(reps) == 0 || !v.IsValid() { + // No replacer, or an invalid value, so fall back to removal. + return reflect.Value{} + } + if rep, ok := reps[v.Type()]; ok { + return rep(v) + } + switch v.Kind() { + case reflect.Pointer: + r := replaceNestedWithin(v.Elem(), reps) + if !r.IsValid() { + return r + } + return r.Addr() + case reflect.Interface: + r := replaceNestedWithin(v.Elem(), reps) + if !r.IsValid() { + return r + } + i := reflect.New(v.Type()).Elem() + i.Set(r) + return i + case reflect.Array: + a := reflect.New(v.Type()).Elem() + wasSet := false + for i := 0; i < v.Len(); i++ { + r := replaceNestedWithin(v.Index(i), reps) + if r.IsValid() { + wasSet = true + a.Index(i).Set(r) + } + } + if !wasSet { + return reflect.Value{} + } + return a + case reflect.Slice: + s := reflect.MakeSlice(v.Type(), v.Len(), v.Cap()) + wasSet := false + for i := 0; i < v.Len(); i++ { + r := replaceNestedWithin(v.Index(i), reps) + if r.IsValid() { + wasSet = true + s.Index(i).Set(r) + } + } + if !wasSet { + return reflect.Value{} + } + return s + default: + // No replacer matched the type, fall back to removal.
+ return reflect.Value{} + } +} + func nextStep(global []string) (private []string) { if len(global) == 0 { return nil diff --git a/x-pack/filebeat/input/internal/private/private_test.go b/x-pack/filebeat/input/internal/private/private_test.go index 774e35f3d532..aa813ada5d1b 100644 --- a/x-pack/filebeat/input/internal/private/private_test.go +++ b/x-pack/filebeat/input/internal/private/private_test.go @@ -7,20 +7,23 @@ package private import ( "bytes" "encoding/json" + "errors" "net/url" "reflect" + "strings" "testing" "github.com/google/go-cmp/cmp" ) type redactTest struct { - name string - in any - tag string - global []string - want any - wantErr error + name string + in any + tag string + global []string + replacers []Replacer + want any + wantErr error } var redactTests = []redactTest{ @@ -36,6 +39,34 @@ var redactTests = []redactTest{ "not_secret": "2", }, }, + { + name: "map_string_replacer", + in: map[string]any{ + "private": "secret", + "secret": "this is a secret", + "not_secret": "this is not", + }, + replacers: []Replacer{NewStringReplacer("REDACTED")}, + want: map[string]any{ + "private": "secret", + "secret": "REDACTED", + "not_secret": "this is not", + }, + }, + { + name: "map_string_custom_replacer", + in: map[string]any{ + "private": "secret", + "secret": "this is a secret", + "not_secret": "this is not", + }, + replacers: []Replacer{func(s string) string { return strings.Repeat("*", len(s)) }}, + want: map[string]any{ + "private": "secret", + "secret": "****************", // Same length as original. + "not_secret": "this is not", + }, + }, { name: "map_string_inner", in: map[string]any{ @@ -80,6 +111,78 @@ var redactTests = []redactTest{ }, }}, }, + { + name: "map_string_inner_next_inner_global_slices", + in: map[string]any{ + "inner": map[string]any{ + "next_inner": map[string]any{ + "secret": []string{"1"}, + "not_secret": []string{"2"}, + }, + }}, + global: []string{"inner.next_inner.secret"}, + want: map[string]any{ + "inner": map[string]any{ + "next_inner": map[string]any{ + "not_secret": []string{"2"}, + }, + }}, + }, + { + name: "map_string_inner_next_inner_global_nested_slices", + in: map[string]any{ + "inner": map[string]any{ + "next_inner": map[string]any{ + "secret": [][]string{{"1"}}, + "not_secret": [][]string{{"2"}}, + }, + }}, + global: []string{"inner.next_inner.secret"}, + want: map[string]any{ + "inner": map[string]any{ + "next_inner": map[string]any{ + "not_secret": [][]string{{"2"}}, + }, + }}, + }, + { + name: "map_string_inner_next_inner_global_slices_replacer", + in: map[string]any{ + "inner": map[string]any{ + "next_inner": map[string]any{ + "secret": []string{"1"}, + "not_secret": []string{"2"}, + }, + }}, + replacers: []Replacer{NewStringReplacer("REDACTED")}, + global: []string{"inner.next_inner.secret"}, + want: map[string]any{ + "inner": map[string]any{ + "next_inner": map[string]any{ + "not_secret": []string{"2"}, + "secret": []string{"REDACTED"}, + }, + }}, + }, + { + name: "map_string_inner_next_inner_global_nested_slices_replacer", + in: map[string]any{ + "inner": map[string]any{ + "next_inner": map[string]any{ + "secret": [][]string{{"1"}}, + "not_secret": [][]string{{"2"}}, + }, + }}, + replacers: []Replacer{NewStringReplacer("REDACTED")}, + global: []string{"inner.next_inner.secret"}, + want: map[string]any{ + "inner": map[string]any{ + "next_inner": map[string]any{ + "secret": [][]string{{"REDACTED"}}, + "not_secret": [][]string{{"2"}}, + }, + }}, + }, { name: "map_string_inner_next_inner_params_global", in: map[string]any{ @@ -193,6 
+296,49 @@ var redactTests = []redactTest{ }, }}, }, + { + name: "map_string_inner_next_inner_params_global_internal_slice_precise_replacer", + in: map[string]any{ + "inner": map[string]any{ + "next_inner": []map[string]any{ + { + "headers": url.Values{ + "secret": []string{"1"}, + "not_secret": []string{"2"}, + }, + "not_secret": "2", + }, + { + "headers": url.Values{ + "secret": []string{"3"}, + "not_secret": []string{"4"}, + }, + "not_secret": "4", + }, + }, + }}, + global: []string{"inner.next_inner.headers.secret"}, + replacers: []Replacer{NewStringReplacer("REDACTED")}, + want: map[string]any{ + "inner": map[string]any{ + "next_inner": []map[string]any{ + { + "headers": url.Values{ + "not_secret": []string{"2"}, + "secret": []string{"REDACTED"}, + }, + "not_secret": "2", + }, + { + "headers": url.Values{ + "not_secret": []string{"4"}, + "secret": []string{"REDACTED"}, + }, + "not_secret": "4", + }, + }, + }}, + }, { name: "map_slice", in: map[string]any{ @@ -239,6 +385,50 @@ }, } }(), + func() redactTest { + type s struct { + Private string + Secret string + NotSecret string + } + return redactTest{ + name: "struct_string_replacer", + in: s{ + Private: "Secret", + Secret: "this is a secret", + NotSecret: "this is not", + }, + replacers: []Replacer{NewStringReplacer("REDACTED")}, + tag: "", + want: s{ + Private: "Secret", + Secret: "REDACTED", + NotSecret: "this is not", + }, + } + }(), + func() redactTest { + type s struct { + Private string + Secret string + NotSecret string + } + return redactTest{ + name: "struct_string_custom_replacer", + in: s{ + Private: "Secret", + Secret: "this is a secret", + NotSecret: "this is not", + }, + replacers: []Replacer{func(s string) string { return strings.Repeat("*", len(s)) }}, + tag: "", + want: s{ + Private: "Secret", + Secret: "****************", + NotSecret: "this is not", + }, + } + }(), func() redactTest { type s struct { Private []string @@ -399,6 +589,37 @@ wantErr: cycle{reflect.TypeOf(&s{})}, } }(), + { + name: "invalid_replacer_wrong_type", + in: struct{}{}, + replacers: []Replacer{func(s string) int { return len(s) }}, + want: struct{}{}, + wantErr: errors.New("replacer does not preserve type: fn(string) int"), + }, + { + name: "invalid_replacer_wrong_argnum", + in: struct{}{}, + replacers: []Replacer{func(a, b string) string { return a + b }}, + want: struct{}{}, + wantErr: errors.New("incorrect number of arguments for replacer: 2 != 1"), + }, + { + name: "invalid_replacer_wrong_retnum", + in: struct{}{}, + replacers: []Replacer{func(s string) (a, b string) { return s, s }}, + want: struct{}{}, + wantErr: errors.New("incorrect number of return values from replacer: 2 != 1"), + }, + { + name: "invalid_replacer_collision", + in: struct{}{}, + replacers: []Replacer{ + func(s string) string { return s }, + func(s string) string { return s }, + }, + want: struct{}{}, + wantErr: errors.New("multiple replacers for string"), + }, } func TestRedact(t *testing.T) { @@ -415,10 +636,13 @@ t.Fatalf("failed to get before state: %v", err) } } - got, err := Redact(test.in, test.tag, test.global) - if err != test.wantErr { + got, err := Redact(test.in, test.tag, test.global, test.replacers...)
+ if !sameError(err, test.wantErr) { t.Fatalf("unexpected error from Redact: %v", err) } + if err != nil { + return + } if !isCycle { after, err := json.Marshal(test.in) if err != nil { @@ -434,3 +658,14 @@ func TestRedact(t *testing.T) { }) } } + +func sameError(a, b error) bool { + switch { + case a == nil && b == nil: + return true + case a == nil, b == nil: + return false + default: + return a.Error() == b.Error() + } +} diff --git a/x-pack/filebeat/module/aws/_meta/config.yml b/x-pack/filebeat/module/aws/_meta/config.yml index e92cb36e7b53..da0377b6e462 100644 --- a/x-pack/filebeat/module/aws/_meta/config.yml +++ b/x-pack/filebeat/module/aws/_meta/config.yml @@ -14,7 +14,7 @@ # Bucket list interval on S3 bucket #var.bucket_list_interval: 300s - # Number of workers on S3 bucket + # Number of workers on S3 bucket or SQS queue #var.number_of_workers: 5 # Process CloudTrail logs @@ -63,9 +63,6 @@ # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint. #var.fips_enabled: false - # The maximum number of messages to return from SQS. Valid values: 1 to 10. - #var.max_number_of_messages: 5 - # URL to proxy AWS API calls #var.proxy_url: http://proxy:3128 @@ -87,7 +84,7 @@ # Bucket list interval on S3 bucket #var.bucket_list_interval: 300s - # Number of workers on S3 bucket + # Number of workers on S3 bucket or SQS queue #var.number_of_workers: 5 # Filename of AWS credential file @@ -124,9 +121,6 @@ # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint. #var.fips_enabled: false - # The maximum number of messages to return from SQS. Valid values: 1 to 10. - #var.max_number_of_messages: 5 - # URL to proxy AWS API calls #var.proxy_url: http://proxy:3128 @@ -148,7 +142,7 @@ # Bucket list interval on S3 bucket #var.bucket_list_interval: 300s - # Number of workers on S3 bucket + # Number of workers on S3 bucket or SQS queue #var.number_of_workers: 5 # Filename of AWS credential file @@ -185,9 +179,6 @@ # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint. #var.fips_enabled: false - # The maximum number of messages to return from SQS. Valid values: 1 to 10. - #var.max_number_of_messages: 5 - # URL to proxy AWS API calls #var.proxy_url: http://proxy:3128 @@ -209,7 +200,7 @@ # Bucket list interval on S3 bucket #var.bucket_list_interval: 300s - # Number of workers on S3 bucket + # Number of workers on S3 bucket or SQS queue #var.number_of_workers: 5 # Filename of AWS credential file @@ -246,9 +237,6 @@ # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint. #var.fips_enabled: false - # The maximum number of messages to return from SQS. Valid values: 1 to 10. - #var.max_number_of_messages: 5 - # URL to proxy AWS API calls #var.proxy_url: http://proxy:3128 @@ -270,7 +258,7 @@ # Bucket list interval on S3 bucket #var.bucket_list_interval: 300s - # Number of workers on S3 bucket + # Number of workers on S3 bucket or SQS queue #var.number_of_workers: 5 # Filename of AWS credential file @@ -307,9 +295,6 @@ # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint. #var.fips_enabled: false - # The maximum number of messages to return from SQS. Valid values: 1 to 10. 
- #var.max_number_of_messages: 5 - # URL to proxy AWS API calls #var.proxy_url: http://proxy:3128 @@ -331,7 +316,7 @@ # Bucket list interval on S3 bucket #var.bucket_list_interval: 300s - # Number of workers on S3 bucket + # Number of workers on S3 bucket or SQS queue #var.number_of_workers: 5 # Filename of AWS credential file @@ -368,9 +353,6 @@ # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint. #var.fips_enabled: false - # The maximum number of messages to return from SQS. Valid values: 1 to 10. - #var.max_number_of_messages: 5 - # URL to proxy AWS API calls #var.proxy_url: http://proxy:3128 diff --git a/x-pack/filebeat/module/aws/cloudtrail/config/aws-s3.yml b/x-pack/filebeat/module/aws/cloudtrail/config/aws-s3.yml index ada3a502fc20..0f395737a052 100644 --- a/x-pack/filebeat/module/aws/cloudtrail/config/aws-s3.yml +++ b/x-pack/filebeat/module/aws/cloudtrail/config/aws-s3.yml @@ -77,10 +77,6 @@ role_arn: {{ .role_arn }} fips_enabled: {{ .fips_enabled }} {{ end }} -{{ if .max_number_of_messages }} -max_number_of_messages: {{ .max_number_of_messages }} -{{ end }} - {{ if .proxy_url }} proxy_url: {{ .proxy_url }} {{ end }} diff --git a/x-pack/filebeat/module/aws/cloudtrail/manifest.yml b/x-pack/filebeat/module/aws/cloudtrail/manifest.yml index f19760eb6372..84e6d9060376 100644 --- a/x-pack/filebeat/module/aws/cloudtrail/manifest.yml +++ b/x-pack/filebeat/module/aws/cloudtrail/manifest.yml @@ -28,7 +28,6 @@ var: default: true - name: fips_enabled - name: proxy_url - - name: max_number_of_messages - name: ssl ingest_pipeline: ingest/pipeline.yml diff --git a/x-pack/filebeat/module/aws/s3access/config/aws-s3.yml b/x-pack/filebeat/module/aws/s3access/config/aws-s3.yml index 8ce1970290d2..4c0260809259 100644 --- a/x-pack/filebeat/module/aws/s3access/config/aws-s3.yml +++ b/x-pack/filebeat/module/aws/s3access/config/aws-s3.yml @@ -62,10 +62,6 @@ role_arn: {{ .role_arn }} fips_enabled: {{ .fips_enabled }} {{ end }} -{{ if .max_number_of_messages }} -max_number_of_messages: {{ .max_number_of_messages }} -{{ end }} - {{ if .proxy_url }} proxy_url: {{ .proxy_url }} {{ end }} diff --git a/x-pack/filebeat/module/aws/s3access/manifest.yml b/x-pack/filebeat/module/aws/s3access/manifest.yml index e52ba6737579..dc17d1169282 100644 --- a/x-pack/filebeat/module/aws/s3access/manifest.yml +++ b/x-pack/filebeat/module/aws/s3access/manifest.yml @@ -22,7 +22,6 @@ var: default: [forwarded] - name: fips_enabled - name: proxy_url - - name: max_number_of_messages - name: ssl ingest_pipeline: ingest/pipeline.yml diff --git a/x-pack/filebeat/module/aws/vpcflow/config/input.yml b/x-pack/filebeat/module/aws/vpcflow/config/input.yml index ecb1842be7a8..34feb9880b64 100644 --- a/x-pack/filebeat/module/aws/vpcflow/config/input.yml +++ b/x-pack/filebeat/module/aws/vpcflow/config/input.yml @@ -64,10 +64,6 @@ role_arn: {{ .role_arn }} fips_enabled: {{ .fips_enabled }} {{ end }} -{{ if .max_number_of_messages }} -max_number_of_messages: {{ .max_number_of_messages }} -{{ end }} - {{ if .proxy_url }} proxy_url: {{ .proxy_url }} {{ end }} diff --git a/x-pack/filebeat/module/aws/vpcflow/manifest.yml b/x-pack/filebeat/module/aws/vpcflow/manifest.yml index de772408a868..0787eb019b71 100644 --- a/x-pack/filebeat/module/aws/vpcflow/manifest.yml +++ b/x-pack/filebeat/module/aws/vpcflow/manifest.yml @@ -22,7 +22,6 @@ var: default: [forwarded, preserve_original_event] - name: fips_enabled - name: proxy_url - - name: max_number_of_messages - name: ssl - name: format 
     default:
diff --git a/x-pack/filebeat/modules.d/aws.yml.disabled b/x-pack/filebeat/modules.d/aws.yml.disabled
index c730b8aea074..44d5e768ddc9 100644
--- a/x-pack/filebeat/modules.d/aws.yml.disabled
+++ b/x-pack/filebeat/modules.d/aws.yml.disabled
@@ -17,7 +17,7 @@
     # Bucket list interval on S3 bucket
     #var.bucket_list_interval: 300s
 
-    # Number of workers on S3 bucket
+    # Number of workers on S3 bucket or SQS queue
     #var.number_of_workers: 5
 
     # Process CloudTrail logs
@@ -66,9 +66,6 @@
     # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint.
     #var.fips_enabled: false
 
-    # The maximum number of messages to return from SQS. Valid values: 1 to 10.
-    #var.max_number_of_messages: 5
-
     # URL to proxy AWS API calls
     #var.proxy_url: http://proxy:3128
 
@@ -90,7 +87,7 @@
     # Bucket list interval on S3 bucket
     #var.bucket_list_interval: 300s
 
-    # Number of workers on S3 bucket
+    # Number of workers on S3 bucket or SQS queue
     #var.number_of_workers: 5
 
     # Filename of AWS credential file
@@ -127,9 +124,6 @@
     # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint.
     #var.fips_enabled: false
 
-    # The maximum number of messages to return from SQS. Valid values: 1 to 10.
-    #var.max_number_of_messages: 5
-
     # URL to proxy AWS API calls
     #var.proxy_url: http://proxy:3128
 
@@ -151,7 +145,7 @@
     # Bucket list interval on S3 bucket
     #var.bucket_list_interval: 300s
 
-    # Number of workers on S3 bucket
+    # Number of workers on S3 bucket or SQS queue
     #var.number_of_workers: 5
 
     # Filename of AWS credential file
@@ -188,9 +182,6 @@
     # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint.
     #var.fips_enabled: false
 
-    # The maximum number of messages to return from SQS. Valid values: 1 to 10.
-    #var.max_number_of_messages: 5
-
     # URL to proxy AWS API calls
     #var.proxy_url: http://proxy:3128
 
@@ -212,7 +203,7 @@
     # Bucket list interval on S3 bucket
     #var.bucket_list_interval: 300s
 
-    # Number of workers on S3 bucket
+    # Number of workers on S3 bucket or SQS queue
     #var.number_of_workers: 5
 
     # Filename of AWS credential file
@@ -249,9 +240,6 @@
     # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint.
     #var.fips_enabled: false
 
-    # The maximum number of messages to return from SQS. Valid values: 1 to 10.
-    #var.max_number_of_messages: 5
-
     # URL to proxy AWS API calls
     #var.proxy_url: http://proxy:3128
 
@@ -273,7 +261,7 @@
     # Bucket list interval on S3 bucket
     #var.bucket_list_interval: 300s
 
-    # Number of workers on S3 bucket
+    # Number of workers on S3 bucket or SQS queue
     #var.number_of_workers: 5
 
     # Filename of AWS credential file
@@ -310,9 +298,6 @@
     # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint.
     #var.fips_enabled: false
 
-    # The maximum number of messages to return from SQS. Valid values: 1 to 10.
-    #var.max_number_of_messages: 5
-
     # URL to proxy AWS API calls
     #var.proxy_url: http://proxy:3128
 
@@ -334,7 +319,7 @@
     # Bucket list interval on S3 bucket
     #var.bucket_list_interval: 300s
 
-    # Number of workers on S3 bucket
+    # Number of workers on S3 bucket or SQS queue
     #var.number_of_workers: 5
 
     # Filename of AWS credential file
@@ -371,9 +356,6 @@
     # Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint.
     #var.fips_enabled: false
 
-    # The maximum number of messages to return from SQS. Valid values: 1 to 10.
-    #var.max_number_of_messages: 5
-
     # URL to proxy AWS API calls
     #var.proxy_url: http://proxy:3128
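
With `max_number_of_messages` removed from the module configuration, the aws filesets are tuned through `number_of_workers` alone. Below is a minimal sketch of an SQS-backed fileset in `modules.d/aws.yml` after this change; the queue URL and worker count are illustrative placeholders, not recommended values.

- module: aws
  cloudtrail:
    enabled: true
    # Illustrative placeholder queue URL; substitute your own.
    var.queue_url: https://sqs.us-east-1.amazonaws.com/123456/myqueue
    # Number of workers on S3 bucket or SQS queue.
    var.number_of_workers: 5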