Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

include trace.id and span.id in decision logs #5230

Closed
be-a-bee opened this issue Oct 11, 2022 · 23 comments · Fixed by #5818
Closed

include trace.id and span.id in decision logs #5230

be-a-bee opened this issue Oct 11, 2022 · 23 comments · Fixed by #5818
Assignees
Labels
auditing feature-request monitoring Issues related to decision log and status plugins runtime

Comments

@be-a-bee
Copy link

be-a-bee commented Oct 11, 2022

What is the underlying problem you're trying to solve?

Hi Team,
I have a requirement for customizing what goes into the decision logs.

In our microservices application, we are using NewRelic which correlates logs with opentelemetry trace data. To facilitate this, we need to include trace-id and span-id in the logs of our services. We are using OPA-Envoy as an authorization service for multiple service endpoints. We have deployed OPA as a separate service rather than as a sidecar.

For this OPA service we have been using decision_logs as logs as it also captures the result of an evaluation. However this decision logs does not have trace.id and span.id attributes in the logs as needed by NewRelic or any monitoring tool which needs to correlate logs and trace data.

It would be helpful if decision-logs can also include trace-id in it so that it is easier to correlate tracing data with logs.
Is there a way to include tracing info like traceId in the decision logs ?

including trace_id in the input would make it appear as
input.trace_id:“some traceid”.
This is not useful for our NewRelic tool.

Describe the ideal solution

We are looking for having it as a top level attribute alongside decision_id .
decision_id : “some decision id”
trace.id: “some trace id”
span.id: “some span id”

Describe a "Good Enough" solution

Additional Context

@srenatus
Copy link
Contributor

Is there a way to include tracing info like traceId in the decision logs ?

If you're using object results with the envoy plugin, you can inject anything into the decision log's result field.

For example, see https://www.openpolicyagent.org/docs/latest/envoy-primer/#output-document for the output document. You can add any field there, it won't be picked up by envoy, but it'll go into the decision logs, like

result["allowed"] := allow
result["headers"] := headers
result["response_headers_to_add"] := response_headers_to_add
result["request_headers_to_remove"] := request_headers_to_remove
result["body"] := body
result["http_status"] := status_code
# added, no meaning to envoy:
result["meta"] = {
  "span.id": "...", # probably taken from input headers
  "trace.id": "...", # same
}

This certainly isn't perfect, but it could be a decent workaround...?


Other things worth a note:

@be-a-bee
Copy link
Author

Thanks @srenatus for the prompt reply and also the tips. Also glad to know the opa envoy plugin enhancement in flight.

It turns out that this is not sufficient for our monitoring tool to do the correlation between logs and traces.

Is this something you would consider adding as a feature to OPA ?

@be-a-bee
Copy link
Author

@srenatus , even though req_id attribute is added on the decision_logs, it still doesn't solve our problem because we need trace.id and span.id attributes on the decision logs.

There should be a way to configure what attributes go into the decision logs. Is that possible ?

@srenatus
Copy link
Contributor

There should be a way to configure what attributes go into the decision logs. Is that possible ?

This is not possible at the moment. Let's use this issue to track it.

@stale
Copy link

stale bot commented Nov 19, 2022

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days.

@stale stale bot added the inactive label Nov 19, 2022
@ashutosh-narkar
Copy link
Member

Can you share more details about the fields a monitoring tool may need? Are trace-id and span-id something that monitoring tool in-general use? Also where is that value retrieved from?

@stale stale bot removed the inactive label Mar 8, 2023
@ashutosh-narkar ashutosh-narkar added auditing distribution Issues related to the bundle plugin runtime and removed distribution Issues related to the bundle plugin labels Mar 8, 2023
@ashutosh-narkar ashutosh-narkar added monitoring Issues related to decision log and status plugins runtime and removed runtime labels Mar 8, 2023
@be-a-bee
Copy link
Author

be-a-bee commented Mar 9, 2023

Open Telemetry prescribes : https://opentelemetry.io/docs/reference/specification/logs/data-model/ for what should go into a log.

@ashutosh-narkar

@ashutosh-narkar ashutosh-narkar self-assigned this Apr 5, 2023
ashutosh-narkar added a commit to ashutosh-narkar/opa that referenced this issue Apr 6, 2023
Currently OPA's decision logs do not include the trace
and span identifier associated with a given request
handled by the server. This information if available
can be helpful to correlate logs and trace data.

This change updates the decision log format to now
include the trace and span identifier if present.

Fixes: open-policy-agent#5230

Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
ashutosh-narkar added a commit that referenced this issue Apr 6, 2023
Currently OPA's decision logs do not include the trace
and span identifier associated with a given request
handled by the server. This information if available
can be helpful to correlate logs and trace data.

This change updates the decision log format to now
include the trace and span identifier if present.

Fixes: #5230

Signed-off-by: Ashutosh Narkar <anarkar4387@gmail.com>
@be-a-bee
Copy link
Author

be-a-bee commented Aug 11, 2023

@ashutosh-narkar , @srenatus , Can you please help prioritizing this.

To enable observability for OPA service in our kubernetes cluster, we need this feature. I'm sure others also would have this need. The observability tool we are using is New Relic .

We have some bugs in our service related to observability. We have deferred them so far. Now that we are going live, observability has become a top priority.

Please advise.

@be-a-bee
Copy link
Author

@ashutosh-narkar , to answer your questions :

Yes, trace-id and span-id are the attributes used by monitoring tools compliant with OpenTelemetry / OpenTracing.

Open Telemetry prescribes : https://opentelemetry.io/docs/reference/specification/logs/data-model/ for what should go into a log.

the trace-id and span-id are retrieved from the request headers.

@be-a-bee
Copy link
Author

be-a-bee commented Aug 11, 2023

To add some more details:

the following is a decision log from our newrelic instance. some of the irrelevant attributes have been removed.

the traceid and spanid retrieved from the headers should be logged in the logs with attributes trace.id and span.id respectively. if opa creates a new span, then that span's id should be logged in as span.id .

Decision log
{
"aws.invoked_function_arn": "arn:aws:lambda:us-east-1:255425211561:function:newrelic-s3-log-ingestion-us-east-1",
"aws.s3_bucket_name": "my-logs-255425211561-us-east-1",
"aws.s3_key": "logs/application/internal/preproduction/authz-engine/authz-engine-67bf7f9954-9rld8/authz-engine/2023-08-11/stderr/094741-vpnIMo80",
"clusterName": "preproduction",

"date": "2023-08-11T09:47:53.378877Z",
"decision_id": "a0829a24-dbdc-4ca5-b953-da3cffd4b7fe",
"environmentType": "Production",
"host": "ip-161-32-34-200.ec2.internal",
"input.attributes.request.http.headers.traceparent": "00-bb6601a5ca3c8955e60bd0f3a612b346-86269d457c911977-00",

// relevant headers from the request to be used for populating trace.id and span.id attributes
"input.attributes.request.http.headers.x-b3-parentspanid": "86269d457c911977",
"input.attributes.request.http.headers.x-b3-sampled": "0",
"input.attributes.request.http.headers.x-b3-spanid": "14640e9436507aa0",
"input.attributes.request.http.headers.x-b3-traceid": "bb6601a5ca3c8955e60bd0f3a612b346",

"input.attributes.request.http.headers.x-envoy-attempt-count": "1",
"input.attributes.request.http.headers.x-forwarded-client-cert": "By=spiffe://cluster.local/ns/rulebaseservice/sa/asx-rulebase-service;Hash=56683571d010e75627163319d2e5591f8263d9adf1f368349839e65eb68b67e6;Subject="";URI=spiffe://cluster.local/ns/model-calculation-service-core/sa/model-calculation-service-core",
"input.attributes.request.http.headers.x-forwarded-proto": "http",
"input.attributes.request.http.headers.x-request-id": "2e6e55b7-e490-9121-b966-8d769239bace",
"input.attributes.request.http.host": "idf-mybs-service.mybs.svc.cluster.local",
"input.attributes.request.http.id": "11721662766204391312",
"input.attributes.request.http.method": "GET",
"input.attributes.request.http.path": "/v1/e4772649-60ae-47b4-bfbf-fe7524fee331/repos/5d953767-2205-4db2-a346-f183478b5ce5/originTypes/workspace/origins/duplicatescleanup/types/Stage/rules",
"input.attributes.request.http.protocol": "HTTP/1.1",
"input.attributes.request.http.scheme": "http",
"input.attributes.request.time": "2023-08-11T09:47:53.374461Z",
"message": "Decision Log",
"metrics.timer_rego_query_eval_ns": 3105484,
"metrics.timer_server_handler_ns": 3235127,
"namespace_name": "authz-engine",
"newrelic.logPattern": "Decision Log",
"newrelic.source": "api.logs",
"path": "istio/authz/allow",
"plugin.type": "s3-lambda",
"plugin.version": "1.1.5",
"pod_name": "authz-engine-67bf7f9954-9rld8",
"region": "us-east-1",
"result": true,
"stream": "stderr",
"time": "2023-08-11T09:47:53.378687443Z",
"timestamp": 1691747273378,
"type": "openpolicyagent.org/decision_logs"

// The following attributes should be added to the decision log after the implementing the feature requested
"trace.id" : "bb6601a5ca3c8955e60bd0f3a612b346"
"span.id" : "14640e9436507aa0"

}

@ashutosh-narkar
Copy link
Member

@LionOnTheChase in OPA v0.52.0 we added a change to include trace and span identifier in decision logs. See here for details. It's possible OPA would not know about the New Relic specific headers.

@be-a-bee
Copy link
Author

be-a-bee commented Sep 1, 2023

@ashutosh-narkar , I have tested with latest OPA we are getting trace_id when OPA is queried via http endpoints.

But, for authorization checks via grpc this trace_id and span_id is not getting populated. We are using OPA Envoy plugin with istio service mesh.

Can you please take a look and tell what could be the cause ?

@ashutosh-narkar
Copy link
Member

There is support for OpenTelemetry spans in the OPA-Envoy plugin but they are not part of the decision log atm. I think you'll have to update this method to make then part of the decision log.

@be-a-bee
Copy link
Author

Thanks @ashutosh-narkar , I tried making the changes and tested it. it works.

But I need help with writing unit test for this change. The test I have added is not passing. can you please take a look and suggest what am I doing wrong here : https://github.com/open-policy-agent/opa-envoy-plugin/pull/455/files

@be-a-bee
Copy link
Author

Thanks @ashutosh-narkar for looking into this. All the checks for merge have passed.
Please let me know if there is anything I can help with.
open-policy-agent/opa-envoy-plugin#462

@be-a-bee
Copy link
Author

Hi @ashutosh-narkar ,

The PR is ready for merge if you don’t have any other comments. I created a new PR because the old PR had some merge conflicts.

New PR (to be merged) : open-policy-agent/opa-envoy-plugin#470
OLD PR (to see your past review comments and my replies) : open-policy-agent/opa-envoy-plugin#462

@be-a-bee
Copy link
Author

@ashutosh-narkar , which release of opa-envoy-plugin would have this enhancement ? any approximate ETA ?
open-policy-agent/opa-envoy-plugin#473

@srenatus
Copy link
Contributor

@be-a-bee
Copy link
Author

be-a-bee commented Nov 1, 2023

@srenatus , will this issue be mentioned in release notes ? I didn't find it in opa release notes. https://github.com/open-policy-agent/opa/releases/tag/v0.58.0

@srenatus
Copy link
Contributor

srenatus commented Nov 1, 2023

Oh that's an oversight. I should have been part of the release notes here. I'll make add note.

@ashutosh-narkar
Copy link
Member

Sorry about that. Thanks for updating the notes @srenatus.

@jithin-zac
Copy link

@ashutosh-narkar how should we transmit the parent trace and span IDs to ensure they are recorded in the relevant decision logs? Is there a particular header designated for this purpose, or should this information be included in the request body?

@ashutosh-narkar
Copy link
Member

@jithin-zac I think this should be part of the OPA input and follow W3C trace-context specification. @be-a-bee is that correct?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auditing feature-request monitoring Issues related to decision log and status plugins runtime
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

4 participants