[Octopus-users] problem with parallel run

Xavier Andrade xavier at tddft.org
Wed Aug 22 14:49:56 WEST 2007


Hi Andrea,

Are you sure that you are not mixing 32-bit and 64-bit code? That could
produce an error like the one you are getting. It seems to me that your ifort
flags are for 32-bit, but you are linking against the 64-bit MKL.
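
One way to check for such a mix is to look at the ELF class byte of the
executables and shared libraries involved. The following is only a minimal
sketch: elf_bits is a hypothetical helper name (it is not part of Octopus or
MKL), it only understands plain ELF files (for a static .a archive you would
first have to extract a member), and the paths in the commented usage lines
are placeholders.

```shell
# elf_bits FILE: print 32 or 64 according to the EI_CLASS byte (offset 4)
# of the ELF header, or "unknown" if FILE does not start with the ELF magic.
# Hypothetical helper for illustration; uses only POSIX od and tr.
elf_bits() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo unknown
        return 0
    fi
    # The fifth byte is EI_CLASS: 1 means 32-bit, 2 means 64-bit.
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case "$class" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo unknown ;;
    esac
}

# Usage (placeholder paths, adjust to your installation):
#   elf_bits /path/to/your/octopus/binary
#   elf_bits /path/to/mpi/lib/shared/some_library.so
# MPICH-derived compiler wrappers usually also accept "-show", which prints
# the underlying compiler command without running it (an assumption here,
# please check your MVAPICH version):
#   mpif90 -show
```

If the binary and the libraries it loads do not all report the same value,
that would support the 32/64-bit explanation.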

Cheers,

Xavier

On Tue, 21 Aug 2007, adebnar at gwdg.de wrote:

> Hi,
> I have compiled a parallel version of octopus without any error messages,
> but I have only managed to run a calculation with it in serial mode. Both
> parallelization in states and parallelization in domains fail. My system
> is an Intel Xeon 5160 (Woodcrest). I configured it as follows:
>
> export FC=/usr/local/ofed/mpi/intel/mvapich-0.9.7-mlx2.2.0/bin/mpif90
> export CC=/usr/local/ofed/mpi/intel/mvapich-0.9.7-mlx2.2.0/bin/mpicc
> export CXX=/usr/local/ofed/mpi/intel/mvapich-0.9.7-mlx2.2.0/bin/mpicxx
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/users/adebnar/local/lib
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/ofed/mpi/intel/mvapich-0.9.7-mlx2.2.0/lib/shared
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/product/parallel/intel/compiler91/lib
> export FCFLAGS="-u -zero -fpp1 -nbs -pc80 -pad -align -unroll -O3 -ip -tpp7 -xW"
> ./configure --prefix=/usr/users/adebnar/local/ \
>   --with-gsl-prefix=/usr/users/adebnar/local/ \
>   --with-fft=fftw3 --with-fft-lib="/usr/users/adebnar/local/lib/libfftw3.a" \
>   --with-blas="/usr/product/parallel/intel/mkl81/lib/em64t/libmkl_em64t.a -lguide" \
>   --with-lapack="/usr/product/parallel/intel/mkl81/lib/em64t/libmkl_lapack.a -lguide" \
>   --enable-mpi=yes
>
> The error message from the parallelization in states is:
>
> 1 - MPI_BARRIER : Null communicator
> [1] [] Aborting Program!
> 3 - MPI_BARRIER : Null communicator
> [3] [] Aborting Program!
> 2 - MPI_BARRIER : Null communicator
> [2] [] Aborting Program!
> 0 - <NO ERROR MESSAGE> : Could not convert index 1073946423 into a pointer
> The index may be an incorrect argument.
> Possible sources of this problem are a missing "include 'mpif.h'",
> a misspelled MPI object (e.g., MPI_COM_WORLD instead of MPI_COMM_WORLD)
> or a misspelled user variable for an MPI object ( e.g.,
> com instead of comm).
> [0] [] Aborting Program!
>
>
> And the error message from the parallelization in domains is:
>
> * I      0.887231    0.310000  171664 | ..|..|..|mesh_init.mesh_init_stage_1
> * O      0.898953    0.000000  171664 | ..|..|..|mesh_init.mesh_init_stage_1
> * I      0.910688    0.310000  171664 | ..|..|..|mesh_init.mesh_init_stage_2
> * O     13.086420   12.160000  253872 | ..|..|..|mesh_init.mesh_init_stage_2
> * O     13.088792   12.180000  253872 | ..|..|grid.grid_init_stage_1
> * I     13.100779   12.470000  253872 | ..|..|multicomm.multicomm_init
> * I     13.112593   12.470000  253872 | ..|..|..|multicomm.sanity_check
> * O     13.124292    0.000000  253872 | ..|..|..|multicomm.sanity_check
> * O     13.138969    0.000000  253872 | ..|..|multicomm.multicomm_init
> * I     13.169127   12.470000  253872 | ..|..|grid.grid_init_stage_2
> * I     13.180866   12.470000  253872 | ..|..|..|mesh_init.mesh_init_stage_3
> * I     14.297748   13.590000  343884 | ..|..|..|..|mesh_init.mesh_partition
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> forrtl: error (78): process killed (SIGTERM)
>
> Any ideas how I could fix this problem?
> Thanks
>
> Andrea Debnarova
>
> Max Planck Institute for Biophysical Chemistry,
> Goettingen, Germany
>
> _______________________________________________
> Octopus-users mailing list
> Octopus-users at tddft.org
> http://www.tddft.org/mailman/listinfo/octopus-users
>

