Problems with the Anaconda version of mpiexec/mpirun

When running MPI programs with mpiexec/mpirun, make sure you are using the correct version.

We have discovered a problem with the Anaconda version of mpiexec/mpirun. Using it can cause the following problems:

  • The computer loses contact with all FibreChannel devices, causing all StorNext disks to become unavailable. A reboot of the computer is needed to fix it.
  • The .bash_history file in your home directory becomes corrupt and login hangs if you have bash as your default shell.
  • All other users on the computer will also have problems because the FibreChannel devices abruptly disappear.

The problem affects those of you who use Anaconda, because the path to the Anaconda version of mpiexec/mpirun precedes the paths to other versions of mpiexec/mpirun in your PATH.

How to avoid the problem:

When using mpiexec (or mpirun), check which version you are about to run. 'which mpiexec' will show you the path. If it shows the mpiexec from Anaconda, don't run it.
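
The check described above can be sketched as a few lines of shell. This is only an illustration: the helper name looks_like_anaconda is made up for this example, and matching on "conda" in the path is an assumption about how the Anaconda installation directory is named on your machine.

```shell
# Hypothetical helper: succeed if the given path looks like it belongs to an
# Anaconda/conda installation. Matching on "conda" is an assumption; adjust
# the pattern if your installation lives somewhere else.
looks_like_anaconda() {
    case "$1" in
        *conda*) return 0 ;;
        *) return 1 ;;
    esac
}

# Resolve mpiexec the way the shell would, without running it.
mpiexec_path=$(command -v mpiexec || true)

if [ -z "$mpiexec_path" ]; then
    echo "mpiexec not found in PATH"
elif looks_like_anaconda "$mpiexec_path"; then
    echo "WARNING: $mpiexec_path looks like the Anaconda version; do not run it"
else
    echo "mpiexec resolves to $mpiexec_path"
fi
```

The snippet only inspects the path and never starts mpiexec, so it is safe to run even when the Anaconda version comes first.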

To avoid the problem, explicitly use the path to the correct version, or use 'module load' to load the MPI version you need just before running the command.
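
As a rough sketch of both approaches: the wrapper name run_mpi, the default path in MPIEXEC, the module name openmpi and the program name my_program are all placeholders invented for this example, not known locations or names on our machines. Replace them with the ones that apply to you.

```shell
# Placeholder: set MPIEXEC to the real location of the MPI installation you
# want to use. The default below is an assumption, not a known location.
MPIEXEC=${MPIEXEC:-/usr/lib64/openmpi/bin/mpiexec}

# Hypothetical wrapper: always run the explicitly chosen mpiexec, never the
# one that happens to come first in PATH.
run_mpi() {
    if [ ! -x "$MPIEXEC" ]; then
        echo "run_mpi: $MPIEXEC is not executable; set MPIEXEC to the correct path" >&2
        return 1
    fi
    "$MPIEXEC" "$@"
}

# Alternative with environment modules (the module name is a placeholder;
# 'module avail' lists the real ones):
#   module load openmpi
#   which mpiexec
#   mpiexec -n 4 ./my_program
```

With the wrapper defined, 'run_mpi -n 4 ./my_program' would start four processes using the mpiexec named in MPIEXEC, regardless of what comes first in PATH.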

What do I do if I have run into this problem:

Send an email to it-support@astro.uio.no and let us know which computer you were running mpiexec on. We will then reboot the computer and delete your corrupt .bash_history file.

By Torben Leifsen
Published Sep. 21, 2022 1:21 PM - Last modified Sep. 21, 2022 1:21 PM