Remora 1.8.4 hanging #60
The hang is probably due to the shell problem that was fixed. (See #59.)
Hi Jeff,
See Issues. Fixed problem.
Best,
Kent
________________________________
From: laytonjbgmail <notifications@github.com>
Sent: Tuesday, November 10, 2020 2:43 PM
To: TACC/remora <remora@noreply.github.com>
Cc: Subscribed <subscribed@noreply.github.com>
Subject: Re: [TACC/remora] Remora 1.8.4 hanging (#60)
Just a few things to add. This is an Ubuntu 18.04 system running bash 4.4.20 (just making sure it's not a bash version problem).
I went back and tried remora 1.8.2 and it too hangs (so did 1.8.3).
I'm also getting errors in install.sh (note: I'm building for MPI).
gcc -o mpstat -g -O2 -Wall -Wstrict-prototypes -pipe -O2 mpstat.o librdstats_light.a libsyscom.a -s
Installing mpstat ...
./install.sh: 78: ./install.sh: Syntax error: Bad fd number
./install.sh: 73: [: unexpected operator
./install.sh: 83: ./install.sh: Syntax error: Bad fd number
./install.sh: 78: [: unexpected operator
./install.sh: 86: ./install.sh: Syntax error: Bad fd number
./install.sh: 87: [: 1: unexpected operator
./install.sh: 100: [: 1: unexpected operator
WARNING : mpicc / mpif77 not found
WARNING : REMORA will be built without MPI support
./install.sh: 111: [: 0: unexpected operator
Copying all scripts to installation folder ...
'./src/aux/extra' -> '/home/laytonjb/bin/remora-1.8.2/bin/aux/extra'
'./src/aux/report' -> '/home/laytonjb/bin/remora-1.8.2/bin/aux/report'
'./src/aux/scheduler' -> '/home/laytonjb/bin/remora-1.8.2/bin/aux/scheduler'
'./src/aux/sql_functions' -> '/home/laytonjb/bin/remora-1.8.2/bin/aux/sql_functions'
'./src/config/fs_blacklist' -> '/home/laytonjb/bin/remora-1.8.2/bin/config/fs_blacklist'
'./src/config/modules' -> '/home/laytonjb/bin/remora-1.8.2/bin/config/modules'
'./src/modules/cpu' -> '/home/laytonjb/bin/remora-1.8.2/bin/modules/cpu'
'./src/modules/dvs' -> '/home/laytonjb/bin/remora-1.8.2/bin/modules/dvs'
'./src/modules/eth' -> '/home/laytonjb/bin/remora-1.8.2/bin/modules/eth'
'./src/modules/gpu' -> '/home/laytonjb/bin/remora-1.8.2/bin/modules/gpu'
'./src/modules/ib' -> '/home/laytonjb/bin/remora-1.8.2/bin/modules/ib'
'./src/modules/impi' -> '/home/laytonjb/bin/remora-1.8.2/bin/modules/impi'
'./src/modules/lnet' -> '/home/laytonjb/bin/remora-1.8.2/bin/modules/lnet'
'./src/modules/lustre' -> '/home/laytonjb/bin/remora-1.8.2/bin/modules/lustre'
'./src/modules/memory' -> '/home/laytonjb/bin/remora-1.8.2/bin/modules/memory'
'./src/modules/modules_utils' -> '/home/laytonjb/bin/remora-1.8.2/bin/modules/modules_utils'
'./src/modules/mv2' -> '/home/laytonjb/bin/remora-1.8.2/bin/modules/mv2'
'./src/modules/network' -> '/home/laytonjb/bin/remora-1.8.2/bin/modules/network'
'./src/modules/numa' -> '/home/laytonjb/bin/remora-1.8.2/bin/modules/numa'
'./src/modules/power' -> '/home/laytonjb/bin/remora-1.8.2/bin/modules/power'
'./src/modules/temperature' -> '/home/laytonjb/bin/remora-1.8.2/bin/modules/temperature'
'./src/remora' -> '/home/laytonjb/bin/remora-1.8.2/bin/remora'
'./src/remora_mem_safe' -> '/home/laytonjb/bin/remora-1.8.2/bin/remora_mem_safe'
'./src/remora_post' -> '/home/laytonjb/bin/remora-1.8.2/bin/remora_post'
'./src/remora_post_crash' -> '/home/laytonjb/bin/remora-1.8.2/bin/remora_post_crash'
'./src/scripts/remora_collect.sh' -> '/home/laytonjb/bin/remora-1.8.2/bin/scripts/remora_collect.sh'
'./src/scripts/remora_finalize.sh' -> '/home/laytonjb/bin/remora-1.8.2/bin/scripts/remora_finalize.sh'
'./src/scripts/remora_init.sh' -> '/home/laytonjb/bin/remora-1.8.2/bin/scripts/remora_init.sh'
'./src/scripts/remora_monitor.sh' -> '/home/laytonjb/bin/remora-1.8.2/bin/scripts/remora_monitor.sh'
'./src/scripts/remora_monitor_memory.sh' -> '/home/laytonjb/bin/remora-1.8.2/bin/scripts/remora_monitor_memory.sh'
'./src/scripts/remora_mpi_post.sh' -> '/home/laytonjb/bin/remora-1.8.2/bin/scripts/remora_mpi_post.sh'
'./src/scripts/remora_remote_post.sh' -> '/home/laytonjb/bin/remora-1.8.2/bin/scripts/remora_remote_post.sh'
'./src/scripts/remora_report.sh' -> '/home/laytonjb/bin/remora-1.8.2/bin/scripts/remora_report.sh'
'./src/scripts/remora_report_mic.sh' -> '/home/laytonjb/bin/remora-1.8.2/bin/scripts/remora_report_mic.sh'
./install.sh: 122: [: 1: unexpected operator
./install.sh: 122: [: 1: unexpected operator
./install.sh: 126: [: -1: unexpected operator
Installation of REMORA v1.8.2 completed.
For a fully functional installation make sure to:
export PATH=$PATH:/home/laytonjb/bin/remora-1.8.2/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/laytonjb/bin/remora-1.8.2/lib
export REMORA_BIN=/home/laytonjb/bin/remora-1.8.2/bin
Good Luck!
I'm guessing these are primarily bash issues?
I tried using a build of OpenMPI (4.0.3) to perhaps get around some issues but it gives the exact same error messages.
Jeff
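A note on the install.sh messages above: `Syntax error: Bad fd number` and `[: unexpected operator` are worded the way dash reports them, and on Ubuntu `/bin/sh` is dash by default (`ls -l /bin/sh` shows where it points). dash rejects the csh-style `>& file` redirection and the `==` operator inside `[ ]`, both of which bash accepts. The following is only a minimal sketch of the portable forms, assuming the script is being interpreted by `sh` rather than bash; the helper names `is_enabled` and `have_command` are hypothetical and are not taken from REMORA's install.sh.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not copied from REMORA's install.sh.

# POSIX string comparison uses a single "=" inside [ ].
# "==" is a bash extension; dash reports "[: unexpected operator" for it.
is_enabled() {
    [ "$1" = "1" ]
}

# POSIX way to discard both stdout and stderr.
# The csh-style ">& file" form is what dash rejects with "Bad fd number".
have_command() {
    command -v "$1" > /dev/null 2>&1
}

# Example use: detect an MPI compiler wrapper the portable way.
mpi_found=0
if have_command mpicc; then
    mpi_found=1
fi

if is_enabled "$mpi_found"; then
    echo "mpicc found"
else
    echo "mpicc not found"
fi
```

If the installer really does rely on bash-only syntax, running it explicitly as `bash ./install.sh` is another way to sidestep the problem without editing the script.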
Good morning,
I'm testing remora 1.8.4 with a simple serial application and it is hanging. I can run the code without remora and it runs correctly. However, if I run it with remora, it hangs (it normally runs for about 1 minute and with remora, I've let it run for an hour).
I looked at the output folder and saw a few text files (they appear to contain configuration information), but there is no data in the subfolders.
Any suggestions on how to debug this?
Thanks!
Jeff