Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use env-entry to add tags per webapp deployment #8138

Merged
merged 20 commits into from
Jan 10, 2025

Conversation

amarziali
Copy link
Collaborator

@amarziali amarziali commented Dec 31, 2024

What Does This Do

This PR concerns JEE web deployments only.

It adds the possibility to configure a set of tags to be added to each local root span per web deployment.
The tags are expressed as part of env-entry on the web.xml.

A tag can be added by declaring an env entry bindable on the namespace java:comp/env/datadog/tags/

An example:

<web-app>
...
  <env-entry>
    <env-entry-name>java:comp/env/datadog/tags/custom-tag</env-entry-name>
    <env-entry-type>java.lang.String</env-entry-type>
    <env-entry-value>some value</env-entry-value>
  </env-entry>
...
</web-app>

This mechanism can also be used to configure a separate service name for each webapp since the tag service has a special meaning and TagInterceptor use its value to set the service name

Example:

<web-app>
...
  <env-entry>
    <env-entry-name>java:comp/env/datadog/tags/service</env-entry-name>
    <env-entry-type>java.lang.String</env-entry-type>
    <env-entry-value>>my-service</env-entry-value>
  </env-entry>
...
</web-app>

That feature works today only with tomcat 9+ and wildfly 9+ .

Motivation

Additional Notes

Contributor Checklist

Jira ticket: [PROJ-IDENT]

@amarziali amarziali marked this pull request as ready for review December 31, 2024 14:30
@amarziali amarziali requested review from a team as code owners December 31, 2024 14:30
@pr-commenter
Copy link

pr-commenter bot commented Dec 31, 2024

Benchmarks

Startup

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
end_time 2025-01-10T10:49:50 2025-01-10T10:56:50
git_branch master andrea.marziali/javacompenv
git_commit_date 1736503404 1736505492
git_commit_sha 22458b3 8760af6
release_version 1.46.0-SNAPSHOT~22458b3367 1.46.0-SNAPSHOT~8760af662e
start_time 2025-01-10T10:49:36 2025-01-10T10:56:36
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1736506964 1736506964
ci_job_id 761595378 761595378
ci_pipeline_id 52647645 52647645
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
variant iast iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 13 metrics, 15 unstable metrics.

Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.46.0-SNAPSHOT~8760af662e, baseline=1.46.0-SNAPSHOT~22458b3367
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.352 ms) : 1333, 1371
.   : milestone, 1352,
appsec (1.75 ms) : 1726, 1773
.   : milestone, 1750,
appsec_no_iast (1.756 ms) : 1732, 1781
.   : milestone, 1756,
iast (1.493 ms) : 1471, 1516
.   : milestone, 1493,
profiling (1.55 ms) : 1524, 1576
.   : milestone, 1550,
tracing (1.495 ms) : 1471, 1519
.   : milestone, 1495,
section candidate
no_agent (1.353 ms) : 1332, 1373
.   : milestone, 1353,
appsec (1.753 ms) : 1729, 1776
.   : milestone, 1753,
appsec_no_iast (1.762 ms) : 1738, 1786
.   : milestone, 1762,
iast (1.513 ms) : 1491, 1535
.   : milestone, 1513,
profiling (1.517 ms) : 1491, 1542
.   : milestone, 1517,
tracing (1.483 ms) : 1458, 1508
.   : milestone, 1483,
Loading
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.352 ms [1.333 ms, 1.371 ms] -
appsec 1.75 ms [1.726 ms, 1.773 ms] 397.813 µs (29.4%)
appsec_no_iast 1.756 ms [1.732 ms, 1.781 ms] 404.212 µs (29.9%)
iast 1.493 ms [1.471 ms, 1.516 ms] 141.143 µs (10.4%)
profiling 1.55 ms [1.524 ms, 1.576 ms] 198.27 µs (14.7%)
tracing 1.495 ms [1.471 ms, 1.519 ms] 143.295 µs (10.6%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.353 ms [1.332 ms, 1.373 ms] -
appsec 1.753 ms [1.729 ms, 1.776 ms] 400.273 µs (29.6%)
appsec_no_iast 1.762 ms [1.738 ms, 1.786 ms] 409.52 µs (30.3%)
iast 1.513 ms [1.491 ms, 1.535 ms] 160.319 µs (11.9%)
profiling 1.517 ms [1.491 ms, 1.542 ms] 163.944 µs (12.1%)
tracing 1.483 ms [1.458 ms, 1.508 ms] 130.594 µs (9.7%)
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.46.0-SNAPSHOT~8760af662e, baseline=1.46.0-SNAPSHOT~22458b3367
    dateFormat X
    axisFormat %s
section baseline
no_agent (375.877 µs) : 356, 396
.   : milestone, 376,
iast (497.975 µs) : 476, 520
.   : milestone, 498,
iast_FULL (654.463 µs) : 633, 676
.   : milestone, 654,
iast_GLOBAL (523.144 µs) : 501, 545
.   : milestone, 523,
iast_HARDCODED_SECRET_DISABLED (509.807 µs) : 488, 531
.   : milestone, 510,
iast_INACTIVE (451.95 µs) : 431, 473
.   : milestone, 452,
iast_TELEMETRY_OFF (485.844 µs) : 464, 508
.   : milestone, 486,
tracing (452.364 µs) : 432, 473
.   : milestone, 452,
section candidate
no_agent (387.029 µs) : 367, 407
.   : milestone, 387,
iast (495.705 µs) : 474, 517
.   : milestone, 496,
iast_FULL (656.797 µs) : 635, 678
.   : milestone, 657,
iast_GLOBAL (528.483 µs) : 506, 551
.   : milestone, 528,
iast_HARDCODED_SECRET_DISABLED (496.953 µs) : 475, 519
.   : milestone, 497,
iast_INACTIVE (452.06 µs) : 431, 473
.   : milestone, 452,
iast_TELEMETRY_OFF (486.401 µs) : 465, 508
.   : milestone, 486,
tracing (450.263 µs) : 429, 471
.   : milestone, 450,
Loading
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 375.877 µs [355.83 µs, 395.924 µs] -
iast 497.975 µs [475.682 µs, 520.268 µs] 122.098 µs (32.5%)
iast_FULL 654.463 µs [632.676 µs, 676.25 µs] 278.587 µs (74.1%)
iast_GLOBAL 523.144 µs [501.322 µs, 544.965 µs] 147.267 µs (39.2%)
iast_HARDCODED_SECRET_DISABLED 509.807 µs [488.314 µs, 531.3 µs] 133.93 µs (35.6%)
iast_INACTIVE 451.95 µs [431.197 µs, 472.702 µs] 76.073 µs (20.2%)
iast_TELEMETRY_OFF 485.844 µs [464.049 µs, 507.638 µs] 109.967 µs (29.3%)
tracing 452.364 µs [431.762 µs, 472.966 µs] 76.488 µs (20.3%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 387.029 µs [366.691 µs, 407.367 µs] -
iast 495.705 µs [474.028 µs, 517.382 µs] 108.676 µs (28.1%)
iast_FULL 656.797 µs [635.213 µs, 678.382 µs] 269.768 µs (69.7%)
iast_GLOBAL 528.483 µs [506.037 µs, 550.93 µs] 141.454 µs (36.5%)
iast_HARDCODED_SECRET_DISABLED 496.953 µs [475.368 µs, 518.537 µs] 109.924 µs (28.4%)
iast_INACTIVE 452.06 µs [431.081 µs, 473.039 µs] 65.03 µs (16.8%)
iast_TELEMETRY_OFF 486.401 µs [464.929 µs, 507.872 µs] 99.371 µs (25.7%)
tracing 450.263 µs [429.363 µs, 471.163 µs] 63.234 µs (16.3%)

Dacapo

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master andrea.marziali/javacompenv
git_commit_date 1736503404 1736505492
git_commit_sha 22458b3 8760af6
release_version 1.46.0-SNAPSHOT~22458b3367 1.46.0-SNAPSHOT~8760af662e
See matching parameters
Baseline Candidate
application biojava biojava
ci_job_date 1736507520 1736507520
ci_job_id 761595381 761595381
ci_pipeline_id 52647645 52647645
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
variant appsec appsec

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 0 unstable metrics.

Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.46.0-SNAPSHOT~8760af662e, baseline=1.46.0-SNAPSHOT~22458b3367
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.467 ms) : 1456, 1479
.   : milestone, 1467,
appsec (2.35 ms) : 2307, 2392
.   : milestone, 2350,
iast (2.097 ms) : 2043, 2151
.   : milestone, 2097,
iast_GLOBAL (2.141 ms) : 2087, 2195
.   : milestone, 2141,
profiling (1.948 ms) : 1906, 1991
.   : milestone, 1948,
tracing (1.936 ms) : 1894, 1977
.   : milestone, 1936,
section candidate
no_agent (1.472 ms) : 1460, 1483
.   : milestone, 1472,
appsec (2.348 ms) : 2306, 2391
.   : milestone, 2348,
iast (2.11 ms) : 2056, 2164
.   : milestone, 2110,
iast_GLOBAL (2.139 ms) : 2085, 2194
.   : milestone, 2139,
profiling (1.973 ms) : 1930, 2017
.   : milestone, 1973,
tracing (1.944 ms) : 1902, 1986
.   : milestone, 1944,
Loading
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.467 ms [1.456 ms, 1.479 ms] -
appsec 2.35 ms [2.307 ms, 2.392 ms] 882.352 µs (60.1%)
iast 2.097 ms [2.043 ms, 2.151 ms] 629.677 µs (42.9%)
iast_GLOBAL 2.141 ms [2.087 ms, 2.195 ms] 673.595 µs (45.9%)
profiling 1.948 ms [1.906 ms, 1.991 ms] 480.921 µs (32.8%)
tracing 1.936 ms [1.894 ms, 1.977 ms] 468.237 µs (31.9%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.472 ms [1.46 ms, 1.483 ms] -
appsec 2.348 ms [2.306 ms, 2.391 ms] 876.811 µs (59.6%)
iast 2.11 ms [2.056 ms, 2.164 ms] 638.222 µs (43.4%)
iast_GLOBAL 2.139 ms [2.085 ms, 2.194 ms] 667.455 µs (45.4%)
profiling 1.973 ms [1.93 ms, 2.017 ms] 501.824 µs (34.1%)
tracing 1.944 ms [1.902 ms, 1.986 ms] 472.508 µs (32.1%)
Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.46.0-SNAPSHOT~8760af662e, baseline=1.46.0-SNAPSHOT~22458b3367
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.366 s) : 15366000, 15366000
.   : milestone, 15366000,
appsec (15.161 s) : 15161000, 15161000
.   : milestone, 15161000,
iast (18.675 s) : 18675000, 18675000
.   : milestone, 18675000,
iast_GLOBAL (18.201 s) : 18201000, 18201000
.   : milestone, 18201000,
profiling (14.906 s) : 14906000, 14906000
.   : milestone, 14906000,
tracing (14.977 s) : 14977000, 14977000
.   : milestone, 14977000,
section candidate
no_agent (15.457 s) : 15457000, 15457000
.   : milestone, 15457000,
appsec (14.877 s) : 14877000, 14877000
.   : milestone, 14877000,
iast (19.106 s) : 19106000, 19106000
.   : milestone, 19106000,
iast_GLOBAL (18.009 s) : 18009000, 18009000
.   : milestone, 18009000,
profiling (15.064 s) : 15064000, 15064000
.   : milestone, 15064000,
tracing (15.002 s) : 15002000, 15002000
.   : milestone, 15002000,
Loading
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.366 s [15.366 s, 15.366 s] -
appsec 15.161 s [15.161 s, 15.161 s] -205.0 ms (-1.3%)
iast 18.675 s [18.675 s, 18.675 s] 3.309 s (21.5%)
iast_GLOBAL 18.201 s [18.201 s, 18.201 s] 2.835 s (18.4%)
profiling 14.906 s [14.906 s, 14.906 s] -460.0 ms (-3.0%)
tracing 14.977 s [14.977 s, 14.977 s] -389.0 ms (-2.5%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.457 s [15.457 s, 15.457 s] -
appsec 14.877 s [14.877 s, 14.877 s] -580.0 ms (-3.8%)
iast 19.106 s [19.106 s, 19.106 s] 3.649 s (23.6%)
iast_GLOBAL 18.009 s [18.009 s, 18.009 s] 2.552 s (16.5%)
profiling 15.064 s [15.064 s, 15.064 s] -393.0 ms (-2.5%)
tracing 15.002 s [15.002 s, 15.002 s] -455.0 ms (-2.9%)

@amarziali amarziali force-pushed the andrea.marziali/javacompenv branch from d3e1f2d to 62f2214 Compare January 2, 2025 08:50
Copy link
Contributor

@PerfectSlayer PerfectSlayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few minor comments

@amarziali amarziali force-pushed the andrea.marziali/javacompenv branch from 764661d to 8fce1cb Compare January 3, 2025 09:51
Copy link
Contributor

@PerfectSlayer PerfectSlayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good from platform 👍 You might want to have some eyes from IDM too 😉

@amarziali amarziali requested a review from mcculls January 7, 2025 10:53

public static class ContextualInfo {
private final String serviceName;
private final Map<String, Object> tags = new HashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can tags be added concurrently by different threads, or added by one thread while queried by another?

If so then we'll either need a concurrent collection here, or add synchronization to methods accessing tags.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not happening since there is 1 deployment per 1 classloader so it cannot happen. For this reason I preferred to use a regular map

return;
}
if (CAN_SPLIT_SERVICE_NAME_BY_DEPLOYMENT
&& contextualInfo.serviceName != null
Copy link
Contributor

@mcculls mcculls Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small inconsistency: here we're accessing the field, while the line below we're using the getter

also we access it several times in the same method, so consider pulling it into a local variable

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I extracted a local var. It should be better now

@@ -1602,10 +1611,16 @@ private DDSpanContext buildSpanContext() {
if (serviceName == null) {
serviceName = traceConfig.getPreferredServiceName();
}
ClassloaderConfigurationOverrides.ContextualInfo contextualInfo =
ClassloaderConfigurationOverrides.maybeGetContextualInfo();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously we only called ClassloaderServiceNames when the span or its parent didn't have a service name.

We're now calling maybeGetContextualInfo unconditionally for every span which involves:

  • a call to Thread.currentThread().getContextClassLoader()
  • a call to WeakHashMap.get

Do we know how much overhead this could add to span creation? (a microbenchmark would be very helpful)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a benchmark. Even if it can be quite little overhead, as discussed, I put a boolean flag to act as a no-op when the map is empty saving the extra computation in the general non jee case

private static final Function<ClassLoader, ContextualInfo> EMPTY_CONTEXTUAL_INFO_ADDER =
ignored -> new ContextualInfo(null);

private final WeakHashMap<ClassLoader, ContextualInfo> weakCache = new WeakHashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WeakHashMap is not a concurrent collection - to be safe we should really wrap this in synchronizedMap, but that would have performance implications given we're now accessing the map for every span.

There is WeakConcurrentMap but we cannot use that directly in internal-api because it would leak that implementation onto the application class-path. You might therefore need to move this cache to inside dd-trace-core or somewhere similarly isolated.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mcculls thanks. I protected the write path with a lock. The read path is not needing to be thread safe. There could be parallel deployment needing to write the map. However the read are usually happening afterwards.
I also added a quick optimization using a volatile flag in order to skip any costly operation when the map is just empty

Copy link
Contributor

@mcculls mcculls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one concern - we're now accessing the weak cache for every span regardless whether this feature is enabled or not. The current implementation used for the weak cache is not thread-safe and should really be wrapped in synchronizedMap.

Given this I'd like to see some microbenchmarking done to check how much overhead this will add. (We also need to resolve whether synchronizedMap is OK or should we consider using WeakConcurrentMap indirectly - benchmarking will also help with that)

@amarziali amarziali force-pushed the andrea.marziali/javacompenv branch from 6c4e4d7 to 9423b4c Compare January 8, 2025 13:14
@amarziali amarziali requested a review from mcculls January 8, 2025 13:20
@amarziali amarziali force-pushed the andrea.marziali/javacompenv branch 2 times, most recently from 904bee2 to 762133c Compare January 8, 2025 16:55
Copy link
Contributor

@ygree ygree left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@amarziali amarziali force-pushed the andrea.marziali/javacompenv branch from 762133c to e8e8281 Compare January 10, 2025 09:09
@amarziali amarziali enabled auto-merge (squash) January 10, 2025 13:11
@amarziali amarziali added tag: experimental Experimental changes and removed tag: experimental Experimental changes labels Jan 10, 2025
@amarziali amarziali merged commit a3a8fe6 into master Jan 10, 2025
173 of 174 checks passed
@amarziali amarziali deleted the andrea.marziali/javacompenv branch January 10, 2025 14:32
@github-actions github-actions bot added this to the 1.46.0 milestone Jan 10, 2025
sezen-datadog pushed a commit that referenced this pull request Jan 13, 2025
* Use env-entry to add tags per webapp deployment

* fix gradle file

* Migrate to hasmethodadvice

* exclude classes from coverage

* codenarc

* add more repos

* jacoco

* Update internal-api/src/main/java/datadog/trace/api/ClassloaderConfigurationOverrides.java

Co-authored-by: Bruce Bujon <PerfectSlayer@users.noreply.github.com>

* review

* use our named

* more coverage

* Update internal-api/src/main/java/datadog/trace/api/ClassloaderConfigurationOverrides.java

Co-authored-by: Stuart McCulloch <stuart.mcculloch@datadoghq.com>

* Update internal-api/src/main/java/datadog/trace/api/ClassloaderConfigurationOverrides.java

Co-authored-by: Stuart McCulloch <stuart.mcculloch@datadoghq.com>

* review

* add jmh

* optimize

* widen muzzle excludes

* exclude lazy from branch coverage

* clean

* Do not set contextual service name if jee-split-by-deployment is not enabled

---------

Co-authored-by: Bruce Bujon <PerfectSlayer@users.noreply.github.com>
Co-authored-by: Stuart McCulloch <stuart.mcculloch@datadoghq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants