Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Individual thread sampling with JVMCI can return negative numbers #9

Open
ceresek opened this issue Jun 9, 2018 · 4 comments
Open
Labels

Comments

@ceresek
Copy link
Contributor

ceresek commented Jun 9, 2018

In some of the ScalaBench measurements with JVMCI merge threads turned OFF, the values in the instruction count column are occassionally negative. This could be a problem in the ScalaBench sampling helper class, but since it does very little with the data, it is more likely an agent issue.

@ceresek ceresek added the bug label Jun 9, 2018
@vhotspur
Copy link
Member

vhotspur commented Jun 9, 2018

Would you, please, provide concrete data here and the JVM configuration? Are the negative data arbitrary numbers or do they look like PAPI error codes?

I suspect this means that we are measuring one thread several times and the second event set could not be started...

@ceresek
Copy link
Contributor Author

ceresek commented Jun 9, 2018

I just took the Graal 1.0.0 release candidate on OpenJDK, current ScalaBench (652052961d26c63518995de19f7329374fc26fbe) and agent (12c6f8b), run it on my laptop and saw this:

java -agentlib:ubench-agent -jar scala-benchmark-suite-0.1.0-20120216.103539-3-d3s.jar --extra-events instructions@JVMCI_THREADS --output-file output.csv --iterations 66 xalan
...
cat output.csv
benchmark,total_ms,iteration_time_ns,process_cpu_time_ns,compilation_time_ms,compilation_total_ms,JVMCI_T6_instructions,JVMCI_T7_instructions,JVMCI_T5_instructions
xalan,3798,3359645165,25210000000,13079,13308,-1213561013,-1213561013,2924815023
xalan,6479,2588600169,17760000000,7612,21142,6659274468,3390970203,4626275666
xalan,8376,1777705439,11090000000,6300,27443,1044707730,1044707730,1044707730
...
xalan,34844,456987508,3570000000,175,94253,1161980260,619780977,328168209
xalan,35312,416696861,3250000000,667,94921,-410579606,868577954,0
xalan,35832,486861642,3580000000,194,95115,67287093,91611228,283080490
...
xalan,36727,410000414,3180000000,0,95118,0,0,0
xalan,37226,463863647,3620000000,119,95272,128900266,1347380896,-371145660
xalan,37829,564058371,4050000000,519,96229,2466654605,471780501,283313844

@ceresek
Copy link
Contributor Author

ceresek commented Jun 9, 2018

Perhaps another observation. It seems that in cases where there are more negative numbers present in the same output line, they are the same numbers:

xalan,15650,643145035,4980000000,1937,51952,2985997795,2985997795,2985997795
xalan,16398,682321166,5190000000,1888,53841,-1036167101,-1036167101,-1036167101
xalan,17112,630376776,4840000000,2207,56050,3740067173,3740067173,3740067173
xalan,17871,676468147,5200000000,1774,57825,581611822,-130868786,1200932886
xalan,18608,632173435,4870000000,3203,61028,3379272054,3379272054,3379272054
xalan,19321,646657455,4940000000,1650,62678,776865914,776865914,776865914
xalan,20047,632705181,4920000000,2654,65332,1782850501,846101445,1782850501
xalan,20774,652520105,5030000000,2462,67794,1099108838,1099108838,1099108838
xalan,21489,653321071,5060000000,1817,69612,-369491783,-369491783,-369491783
xalan,22133,575731713,4460000000,1650,71262,-3575862241,-3575862241,-3575862241
xalan,22735,523908903,4070000000,1392,72994,0,0,0
xalan,23305,499926428,3860000000,1423,74418,-1326725636,2519343880,85983699

@ceresek
Copy link
Contributor Author

ceresek commented Jun 12, 2018

After some investigation, the reason is the counter access code in PAPI. The perf_event component uses interpolation when querying the counter values, adding the actual value of the PMC register to the kernel reported offset. Problem is, the current value of the PMC register is read in the context of the thread doing PAPI_read (), but the kernel reported offset comes from a different context, and the computation therefore does not make sense.

There are some code paths through PAPI_read () that bypass the interpolation, for example when PAPI is compiled with --enable-perfevent-rdpmc=no, but that will slow down reading for all perf_event counters so not sure if we want it. On the other hand, setting the inherit flag disables the interpolation too, so perhaps we already do not have it on the main counters we use ?

Check _pe_read () in components/perf_event/perf_event.c.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants