
Finding memory errors in the C and C++ layer with gdb

zhang-alvin edited this page Jan 23, 2017 · 2 revisions

A test generates a segmentation violation

$ python test_meshtools.py
/home/cekees/proteus/proteus/LinearAlgebraTools.py:153: FutureWarning: comparison to `None` will result in an elementwise object comparison in the future.
  if array == None:
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[0]PETSC ERROR: likely location of problem given in stack below
[0]PETSC ERROR: ---------------------  Stack Frames ------------------------------------
[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
[0]PETSC ERROR:       INSTEAD the line number of the start of the function
[0]PETSC ERROR:       is given.
application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0

Try running python with gdb:

$ gdb python
GNU gdb (Ubuntu 7.11.90.20161005-0ubuntu1) 7.11.90.20161005-git
Copyright (C) ...
Reading symbols from python...(no debugging symbols found)...done.
(gdb) set arg test_meshtools.py
(gdb) run
Starting program: /home/cekees/proteus/linux2/bin/python test_meshtools.py
process 18445 is executing new program: /home/cekees/.hashdist/bld/python/es7yg26ocmly/bin/python2.7
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffe9bb0700 (LWP 18452)]
[New Thread 0x7fffe93af700 (LWP 18453)]
[New Thread 0x7fffe8bae700 (LWP 18454)]
[New Thread 0x7fffe83ad700 (LWP 18455)]
/home/cekees/proteus/proteus/LinearAlgebraTools.py:153: FutureWarning: comparison to `None` will result in an elementwise object comparison in the future.
  if array == None:

Thread 1 "python2.7" received signal SIGSEGV, Segmentation fault.
partitionElements (mesh=..., nElements_overlap=1) at proteus/flcbdfWrappersModule.cpp:4816
4816	      nodes[0] = edgeNodesArray_new[edge_old*2+0];
(gdb) list
4811	  map<NodeTuple<2>,int > nodesEdgeMap_global_new;
4812	  for (int ig = 0; ig < mesh.nEdges_global; ig++)
4813	    {
4814	      int nodes[2];
4815	      const int edge_old = edgeNumbering_global_new2old[ig];
4816	      nodes[0] = edgeNodesArray_new[edge_old*2+0];
4817	      nodes[1] = edgeNodesArray_new[edge_old*2+1];
4818	      NodeTuple<2> et(nodes);
4819	      edgeNodesArray_newNodesAndEdges[ig*2+0] = edgeNodesArray_new[edge_old*2+0];
4820	      edgeNodesArray_newNodesAndEdges[ig*2+1] = edgeNodesArray_new[edge_old*2+1];
(gdb) print edge_old
$1 = 1440817968
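
The printed value of edge_old is far too large to be a plausible edge number, which suggests that the entry read from edgeNumbering_global_new2old was never set or was overwritten. One way to catch this earlier is to validate the index before using it. The following is only a minimal sketch of that idea: indexInRange and lookupEdgeNodes are hypothetical helpers, not part of Proteus, and a std::vector stands in for the raw arrays used in flcbdfWrappersModule.cpp.

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// Hypothetical helper (not part of Proteus): true when idx is a valid
// position in an array holding n entries.
inline bool indexInRange(long idx, long n)
{
  return idx >= 0 && idx < n;
}

// Hypothetical debug version of the lookup at line 4816: validate the
// renumbered edge before using it to index the edge-node array.
// Reports the bad index and returns false instead of reading out of range.
inline bool lookupEdgeNodes(const std::vector<int>& edgeNodesArray,
                            int edge_old, int nodes[2])
{
  // Assumption: the array stores two node numbers per edge.
  assert(edgeNodesArray.size() % 2 == 0);
  const long nEdges = static_cast<long>(edgeNodesArray.size() / 2);
  if (!indexInRange(edge_old, nEdges))
    {
      std::fprintf(stderr, "edge_old=%d out of range [0,%ld)\n", edge_old, nEdges);
      return false;
    }
  nodes[0] = edgeNodesArray[edge_old * 2 + 0];
  nodes[1] = edgeNodesArray[edge_old * 2 + 1];
  return true;
}
```

With a check like this in place, the run would stop with a readable message at the first bad entry rather than crashing on the memory access.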

Debugging a case/simulation

If you need to debug a case that is initiated through the parun script, you can instead attach gdb at specific points while the C code is executing. To do this, first compile Proteus without any of the C compiler's optimization flags. In setup.py, add the following before the setup() call:

cv["CFLAGS"] = cv["CFLAGS"].replace("-O3","")
cv["CFLAGS"] = cv["CFLAGS"].replace("-O2","")

By default, recent versions of gcc emit debug information in the DWARF4 format. Your version of gdb may be unable to read it, in which case you will get an error message like:

No symbol "x" in current context

You can then change the debug information format to DWARF2 or DWARF3, for example by substituting the corresponding flag for -O2 instead of simply removing -O2:

cv["CFLAGS"] = cv["CFLAGS"].replace("-O2","-gdwarf-3")

Next, you need to force Proteus to wait at a specific point in the C code. Adding the following causes the run to wait until the user hits "Enter":

  printf("TIME TO ATTACH GDB\n");
  getchar();
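
A slightly extended variant of the pause above can also print the process ID, so that the ID is visible in the terminal where the run is waiting. This is only a sketch under some assumptions: attachMessage and waitForDebugger are hypothetical names, not Proteus functions, and getpid() comes from the POSIX header unistd.h, so it assumes a Linux or other POSIX system.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <unistd.h>  // getpid (POSIX)

// Hypothetical helper: build the prompt, including the gdb command
// line that attaches to the given process ID.
inline std::string attachMessage(long pid)
{
  assert(pid > 0);
  return "TIME TO ATTACH GDB: gdb -p " + std::to_string(pid);
}

// Hypothetical helper: print the prompt for this process, then block
// until the user hits "Enter".
inline void waitForDebugger()
{
  std::printf("%s\n", attachMessage(static_cast<long>(getpid())).c_str());
  std::fflush(stdout);  // make sure the prompt appears before blocking
  std::getchar();
}
```

Calling waitForDebugger() at the point of interest behaves like the two-line version, with the process ID included in the message.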

Recompile the code and run the case/simulation.
You now need the process ID associated with this simulation, which you can find with:

ps ax | grep parun

The above command should yield two results: one for the run itself and one for the grep command. The process ID is the first column of each line; use the one belonging to the run.

Now you can launch gdb in a separate terminal and attach to the process:

 (gdb) attach PROCESS_ID