OCaml Forge

[#894] MPI_Finalize must succeed finalization of all MPI objects

Date:
2011-01-25 17:48
Priority:
3
State:
Open
Submitted by:
Eray Ozkural (examachine)
Assigned to:
Nobody (None)
Hardware:
Macintosh
Resolution:
Accepted As Bug
Severity:
minor
Version:
None
Component:
None
Operating System:
MacOS X
Product:
None
 
URL:
Summary:
MPI_Finalize must succeed finalization of all MPI objects

Detailed description
I still keep seeing this error. The MPI calls in question were made in a sub-thread, which might be the culprit or at least related. At any rate, it appears that MPI_Finalize can run before the MPI communicators are finalized, which is a bug.

The backtrace below shows that the error is triggered by a GC sweep that somehow ran after MPI_Finalize.
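
To illustrate the ordering problem, here is a minimal, self-contained C simulation. It is a sketch, not the binding's actual stub code: mock_mpi_finalize, mock_comm_free, the two finalizer functions and simulate_late_sweep are all hypothetical stand-ins, and a plain boolean flag replaces the real library state. In real stubs the guard would presumably query MPI_Finalized(), which the MPI standard allows to be called after MPI_Finalize.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the MPI library state. Real code would ask MPI_Finalized(). */
static bool mpi_finalized = false;

/* Counts attempts to free a communicator after finalization.
   In the real library this case aborts with MPI_ERRORS_ARE_FATAL. */
static int free_after_finalize_errors = 0;

/* Hypothetical stand-in for MPI_Finalize. */
static void mock_mpi_finalize(void) {
    mpi_finalized = true;
}

/* Hypothetical stand-in for MPI_Comm_free. */
static void mock_comm_free(int *comm) {
    assert(comm != NULL);
    if (mpi_finalized) {
        free_after_finalize_errors++;   /* the error seen in this report */
        return;
    }
    *comm = -1;                         /* mark the handle as released */
}

/* Finalizer as the GC would call it: frees unconditionally. */
static void finalize_comm_unguarded(int *comm) {
    mock_comm_free(comm);
}

/* Finalizer that skips the free once MPI has shut down. */
static void finalize_comm_guarded(int *comm) {
    if (mpi_finalized) {
        return;                         /* nothing left to release safely */
    }
    mock_comm_free(comm);
}

/* Simulates a GC sweep that runs after finalization.
   Returns the number of free-after-finalize errors it caused. */
int simulate_late_sweep(bool guarded) {
    int comm = 1;                       /* a live communicator handle */

    mpi_finalized = false;
    free_after_finalize_errors = 0;

    mock_mpi_finalize();                /* program shuts MPI down first */
    if (guarded) {
        finalize_comm_guarded(&comm);   /* late sweep, guarded finalizer */
    } else {
        finalize_comm_unguarded(&comm); /* late sweep, as in the backtrace */
    }
    return free_after_finalize_errors;
}
```

Calling simulate_late_sweep(false) reproduces the free-after-finalize condition, while simulate_late_sweep(true) avoids it. The guarded variant simply skips the release once the library is down. It only avoids the abort and does not address whatever threading interaction lets the sweep run late in the first place.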

This is with Open MPI on OS X.

sirius:textcat malfunct$ mpirun -np 2 src/par-all-pairs-vert -dv data/radikal.dv -threshold 0.6 -algo 3
Reading DV file data/radikal.dv
Normalizing
Converting to Fixed Point
Partitioning dims with: weight: allpairs0 sum_i C(freq_i,2)
Partition dimensions according to all-pairs-0 load

* Running par-all-pairs-0-array-vert-opt
..............*** An error occurred in MPI_Comm_free
*** after MPI was finalized
*** MPI_ERRORS_ARE_FATAL (goodbye)
[sirius:3627] [0] func:0 libopen-pal.0.dylib 0x00000001002863a5 opal_backtrace_buffer + 53
[sirius:3627] [1] func:1 libmpi.0.dylib 0x0000000100162933 ompi_mpi_abort + 819
[sirius:3627] [2] func:2 libmpi.0.dylib 0x00000001001552c5 ompi_mpi_errors_return_comm_handler + 389
[sirius:3627] [3] func:3 libmpi.0.dylib 0x0000000100155758 ompi_mpi_errors_are_fatal_comm_handler + 184
[sirius:3627] [4] func:4 libmpi.0.dylib 0x0000000100178ee0 MPI_Comm_free + 64
[sirius:3627] [5] func:5 par-all-pairs-vert 0x000000010005ff53 sweep_slice + 141
[sirius:3627] [6] func:6 par-all-pairs-vert 0x000000010006038a caml_major_collection_slice + 768
[sirius:3627] [7] func:7 par-all-pairs-vert 0x0000000100060857 caml_minor_collection + 96
[sirius:3627] [8] func:8 par-all-pairs-vert 0x000000010006171a caml_alloc_small + 84
[sirius:3627] [9] func:9 par-all-pairs-vert 0x00000001000647f1 caml_ml_out_channels_list + 154
[sirius:3627] [10] func:10 par-all-pairs-vert 0x000000010006db1c caml_c_call + 32
[sirius:3627] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!

Followup

Message
Date: 2011-06-19 10:15
Sender: Eray Ozkural

Well, mixing multi-threading with MPI doesn't work yet. This may be difficult to fix, so I'm going to focus on completing the missing MPI-1 features for now.

Attached Files:

Changes:

Field       Old Value  Date              By
Resolution  None       2011-06-19 10:15  examachine
Severity    major      2011-06-19 10:15  examachine