feature request - Energy profiling #273
Comments
Add |
Hi @jrmadsen , thanks a lot for the quick reply. I'll give this a go when I get back to the office and let you know. For now I will mark this issue as closed. |
Hi @jrmadsen , I managed to get omnitrace installed on the HPC system with the correct permissions. I have followed your instructions, but I don't see anything in the trace when I open it in the Perfetto UI. I will include the command I ran and the config.
The app is a simple OpenMP threaded application. Ideally I want to estimate the energy usage using the RAPL hardware counter. Am I doing something wrong? Below is my config:
# auto-generated by omnitrace-avail (version 1.10.0) on 2023-04-28 @ 12:15

OMNITRACE_CONFIG_FILE =
OMNITRACE_USE_PERFETTO = true
OMNITRACE_USE_TIMEMORY = true
OMNITRACE_USE_SAMPLING = false
OMNITRACE_USE_PROCESS_SAMPLING = true
OMNITRACE_USE_KOKKOSP = false
OMNITRACE_USE_CAUSAL = false
OMNITRACE_USE_MPIP = true
OMNITRACE_USE_PID = true
OMNITRACE_USE_RCCLP = false
OMNITRACE_OUTPUT_PATH = omnitrace-%tag%-output
OMNITRACE_OUTPUT_PREFIX =
OMNITRACE_CAUSAL_BACKEND = auto
OMNITRACE_CAUSAL_BINARY_EXCLUDE =
OMNITRACE_CAUSAL_BINARY_SCOPE = %MAIN%
OMNITRACE_CAUSAL_DELAY = 0
OMNITRACE_CAUSAL_DURATION = 0
OMNITRACE_CAUSAL_FUNCTION_EXCLUDE =
OMNITRACE_CAUSAL_FUNCTION_SCOPE =
OMNITRACE_CAUSAL_MODE = function
OMNITRACE_CAUSAL_RANDOM_SEED = 0
OMNITRACE_CAUSAL_SOURCE_EXCLUDE =
OMNITRACE_CAUSAL_SOURCE_SCOPE =
OMNITRACE_CRITICAL_TRACE = false
OMNITRACE_PAPI_EVENTS = amd64_rapl::RAPL_ENERGY_PKG
OMNITRACE_PERFETTO_BACKEND = inprocess
OMNITRACE_PERFETTO_BUFFER_SIZE_KB = 1024000
OMNITRACE_PERFETTO_FILL_POLICY = discard
OMNITRACE_PROCESS_SAMPLING_DURATION = -1
OMNITRACE_PROCESS_SAMPLING_FREQ = 0
OMNITRACE_SAMPLING_CPUS = 1
OMNITRACE_SAMPLING_DELAY = 0.5
OMNITRACE_SAMPLING_DURATION = 0
OMNITRACE_SAMPLING_FREQ = 300
OMNITRACE_SAMPLING_OVERFLOW_EVENT = perf::PERF_COUNT_HW_CACHE_REFERENCES
OMNITRACE_TIME_OUTPUT = true
OMNITRACE_TIMEMORY_COMPONENTS = wall_clock
OMNITRACE_TRACE_DELAY = 0
OMNITRACE_TRACE_DURATION = 0
OMNITRACE_TRACE_PERIOD_CLOCK_ID = CLOCK_REALTIME
OMNITRACE_TRACE_PERIODS =
OMNITRACE_VERBOSE = 0
OMNITRACE_ENABLED = true
OMNITRACE_SUPPRESS_CONFIG = false
OMNITRACE_SUPPRESS_PARSING = false
|
Set the |
Thanks. Unfortunately I now get
|
Is it showing up in |
Yes (see output of …). FYI, I found this link, which suggests I need to specify the CPU number, e.g., …. I tried it and it runs without error, but I don't have time to check whether it's correct this evening. I will take a look tomorrow, but ideally I want the whole processor, not just one core.
|
Ah, yeah you may just have to specify all the CPUs if you have multiple CPUs, e.g. |
@TomMelt have you gotten a chance to verify that adding the |
Hi @jrmadsen . It looks like it's similar to how omnitrace handles other CPU variables, e.g., …. So I would need to use …. However, the result I get in omnitrace is either wrong or doing something weird. Would it be easier if we arranged a Teams/Zoom call at some point? It might be easier to troubleshoot/discuss. |
Ideally I don't need the trace of energy usage over time, just the final value, similar to the ArmForge perf-report. Are you able to get energy usage from a simple program? |
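For a single end-of-run number rather than a trace, one option that does not involve omnitrace is to read the Linux powercap RAPL counter before and after the workload and take the difference. The following is a minimal sketch under assumptions not confirmed in this thread: that the kernel exposes `/sys/class/powercap/intel-rapl:0/energy_uj` on the machine in question, that the file is readable by the user, and that the counter wraps at most once during the run. The function names are hypothetical.

```python
from pathlib import Path

# Assumed location of package 0's RAPL zone in the Linux powercap interface.
RAPL_ZONE = Path("/sys/class/powercap/intel-rapl:0")


def read_uj(zone: Path, name: str) -> int:
    """Read an integer microjoule value from a powercap sysfs file."""
    return int((zone / name).read_text().strip())


def energy_delta_uj(start: int, end: int, max_range: int) -> int:
    """Energy consumed between two readings, allowing for one counter wrap."""
    if end >= start:
        return end - start
    # The counter wrapped past max_range and restarted from zero.
    return (max_range - start) + end


def measure(workload, zone: Path = RAPL_ZONE) -> float:
    """Run workload() and return the package energy it used, in joules."""
    max_range = read_uj(zone, "max_energy_range_uj")
    start = read_uj(zone, "energy_uj")
    workload()
    end = read_uj(zone, "energy_uj")
    return energy_delta_uj(start, end, max_range) / 1e6
```

Note that the package counter covers everything running on that socket, so the result is only attributable to the application if the node is otherwise idle.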
Hmmm... It's hard to tell if it is per core or not. Three of those bars look similar in magnitude when their samples are taken at overlapping timestamps -- those per-thread samples are taken with respect to the CPU-clock of the thread so it makes sense why they don't line up exactly. I think for this particular use case, PAPI would ideally need to not initialize per-thread support and reading the counters should be done in the background "process sampling" thread instead of the per-thread interrupt sampler. Before we hop on a call, let me experiment a bit with doing the above. |
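The idea of reading a process-wide counter from one background thread, rather than from a per-thread interrupt sampler, can be sketched as follows. This is an illustration only and not omnitrace's implementation: the `BackgroundSampler` class and the `read_counter` callable are hypothetical, and the counter is assumed to be cumulative and not to wrap during the measured region.

```python
import threading
import time


class BackgroundSampler:
    """Poll a process-wide counter from one dedicated thread."""

    def __init__(self, read_counter, interval_s=0.1):
        self._read = read_counter      # callable returning the current counter value
        self._interval = interval_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self.samples = []              # list of (timestamp, value) pairs

    def _sample(self):
        self.samples.append((time.monotonic(), self._read()))

    def _run(self):
        # wait() returns True once stop is requested, which ends the loop.
        while not self._stop.wait(self._interval):
            self._sample()

    def __enter__(self):
        self._sample()                 # reading at the start of the region
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()
        self._sample()                 # reading at the end of the region
        return False

    def total(self):
        """Difference between the last and first readings."""
        return self.samples[-1][1] - self.samples[0][1]
```

Because every reading comes from the same thread and the same clock, the samples line up on one timeline, and `total()` gives the single end-of-run value without needing the intermediate samples.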
Hi @jrmadsen , did you have any luck? |
Sorry for the delay, I started a long vacation right around when you posted the last comment. I haven’t gotten a chance yet but I’ll look into it shortly. |
Hi,
ArmForge has a feature (perf-report) that can estimate power usage of a binary.
Is it possible to do something like this in omnitrace?
I tried using AMDuProf, but it is not supported on Linux (see section 10.3 Limitations, p. 179). I raised an issue on the Community discussion forum.
I think it has something to do with the RAPL drivers. I can see some references to them in the source, but I don't know how to use them.