Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Current span in Tracer lost after WebClient call - Span id differs in between Request vs Response #5472

Closed
pratapjai143 opened this issue Sep 11, 2024 · 12 comments

Comments

@pratapjai143
Copy link

Describe the bug

After sending request to downstream service using webclient the span id with the context getting lost while the response returned from the webclient.

I am logging the request response using a ExchangeFilterFunction and the span id in request vs response is different.

image

image

Environment

Local development environment

To Reproduce

https://github.com/pratapjai143/webclient-micrometer-demo

Please run this project and send request to http://localhost:7070/parent-service/parent-web and you can see in the log that the span id is different between request and response.

Expected behavior
Span id should be retained in the response of web client

Additional context
Related issues:-

(#3785)

@marcingrzejszczak
Copy link
Contributor

cc @chemicL @bclozel

@chemicL
Copy link

chemicL commented Sep 11, 2024

The trace-id is the same, so both spans have the same parent. And it makes sense. The request span is on behalf of the server which issues the remote request. The response span is associated with the child span generated on behalf of the remote request. In case you'd want to see the same span, perhaps there's an answer how to hook in to WebClient after when the child Observation is spawned - perhaps there are some interceptors at that level?

@jonatan-ivanov
Copy link
Member

In order to verify @chemicL's comment, please try to print out the parent spanId too, does that behavior makes sense to you?

@jonatan-ivanov jonatan-ivanov added waiting for feedback We need additional information before we can continue and removed waiting-for-triage labels Sep 11, 2024
@pratapjai143
Copy link
Author

Hi @chemicL @jonatan-ivanov Thanks for your response. For a single request it makes sense to track the request and response based on the trace id as it's returning in the response.

For an e.g. now I have a new endpoint http://localhost:7070/parent-service/parent-web-bulk which takes an array and iterate over the array and make a call to the downstream for each element within that array.

Request:-

image

End Point Code - ParentController :-

image

As for a single request to parent controller, we are sending multiple request to downstream Child Controller so how we can related the request response for each request that we send to child controller as the span id changes in the response.

Request & Response Log:- Span Id differs and same span id maintains for all the downstream request.

image

Would you please guide me how to resolve this issue as in nutshell we would like to relate the request and response that we send to downstream?

Thanks

@chemicL
Copy link

chemicL commented Sep 12, 2024

Consider logging in your ExchangeFilterFunction not during assembly but at the runtime of the reactive chain -> instead of e.g.

log.info(...)
return Mono.just(clientRequest);

do

return Mono.defer(() -> {
    log.info(...);
    return Mono.just(clientRequest);
});

I feel this might do the trick. It might also be something the framework could consider -> evaluating the user-provided filter functions at runtime instead of assembly time. @bclozel WDYT?

@pratapjai143
Copy link
Author

pratapjai143 commented Sep 12, 2024

Thanks heaps @chemicL now the correct trace and span id are printed in the log and I can related the request and response.

image

Now there is another issue. As I am sending multiple request to child service for one incoming request to the parent service so I need to trace the request in the in child service using the span id and trace id combination as the trace id is same for all the request received by the child service.

Parent Log:-

image

Child Log:-

image

Grand Child Log:-

image

In real time scenario the span id which I am logging in my parent service that same span id should propagate to the downstream so that if I want to trace any request in case of any exception then I should be able to trace.

As now there is no way I can trace the different request in my downstream system so need to fix this.

Any help would be highly appreciated.

Thanks.

@chemicL
Copy link

chemicL commented Sep 13, 2024

I am getting confused a bit. The hierarchical nature of spans is something that you can use to your advantage, e.g. by viewing them in the GUI of the distributed collector of your choice, e.g. Jaeger. The trace id is the one that allows identifying a tree-like behaviour when delegating a server call to multiple, potentially cascading, downstream services. In case you really need what you described, it feels this SO answer by @jonatan-ivanov should help: https://stackoverflow.com/questions/77044612/should-micrometer-generate-a-new-span-each-time-a-spring-boot-server-receives-a

Consider consulting the documentation on distributed tracing a bit more. The fact that the server starts its own span as a child of the incoming one can be used to see where time is spent - whether the client spends more time processing or the server.

@jonatan-ivanov
Copy link
Member

I'm confused too, I'm not sure I understand what you would like to achieve.
On top of what Dariusz already wrote above, to me this seems working as intended, every single operation has its own spanId and the operations that belong together share the same traceId, please also see: https://docs.micrometer.io/tracing/reference/glossary.html

I also encourage you to look at the data using the GUI of a distributed tracing backend like Zipkin, Jaeger, Grafana/Tempo.

@pratapjai143
Copy link
Author

Thanks @jonatan-ivanov @chemicL for your response. Our case is very unique.

We have three services e.g. Parent, Child and Grand Child.

For a single request to my Parent service from upstream, I am sending multiple requests concurrently using WebClient to Child and in turn Child service is making a call to Grand Child service.

For an e.g. my Parent service receives a single request and for that single request to my parent service, we are triggering multiple requests to the Child service, so if the span id in the child will be different what I am printing in my log of Parent service, then there is no way I can trace the request in my child service as the trace id will be same for all the requests sent to the Child service for a single request to parent service.

So we need a way how to trace a request using the span id (Parent -> Child -> Grand Child) as the trace id remains same across all the services for a given request to my parent service so using trace id we can't do much in terms of tracing a request.

Hopefully the problem statement is clear now and will get some help in resolving this issue.

Thanks.

@jonatan-ivanov
Copy link
Member

For a single request to my Parent service from upstream, I am sending multiple requests concurrently using WebClient to Child and in turn Child service is making a call to Grand Child service.

Let's see if I understand your use-case, let's say you have these three services: A, B, C respectively and the following workflow:

  • User calls A (traceId: 0, spanId: 1, parentId: null)
    • A calls B first time (traceId: 0, spanId: 2, parentId: 1)
    • A calls B second time (traceId: 0, spanId: 3, parentId: 1)
      • B calls C (traceId: 0, spanId: 4, parentId: 3)

In this scenario (this is not very unique), there is one trace (one traceId) and 4 spans (all with different ids). Also, the parent ids will reflect the parent child-relationships of the calls (see above). Your situation is a bit more complicated since there will be at least two spans for each http call (one for the client and one for the server) but the gist of it is the same.

so if the span id in the child will be different what I am printing in my log of Parent service, then there is no way I can trace the request in my child service as the trace id will be same for all the requests sent to the Child service for a single request to parent service.

You can ignore all the spanIds for this, the only thing you need to look at is the traceId to see which spans belong together. If you want the spanIds what you print out in the logs match, please see what @chemicL wrote above:

In case you'd want to see the same span, perhaps there's an answer how to hook in to WebClient after when the child Observation is spawned - perhaps there are some interceptors at that level?

Please notice that this is likely to be a WebClient feature and does not need a change in Micrometer, I would recommend opening am issue for Spring Framework.

You can also look at Brave's joint span feature that mimics the behavior you described (continues the client span on the server side): spring-projects/spring-boot#35165

Copy link

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.

Copy link

github-actions bot commented Oct 1, 2024

Closing due to lack of requested feedback. If you would like us to look at this issue, please provide the requested information and we will re-open.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Oct 1, 2024
@jonatan-ivanov jonatan-ivanov removed waiting for feedback We need additional information before we can continue feedback-reminder labels Oct 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants