[Octopus-users] problem with parallel run
Xavier Andrade
xavier at tddft.org
Wed Aug 22 14:49:56 WEST 2007
Hi Andrea,
Are you sure that you are not mixing 32-bit and 64-bit code? That could produce
an error like the one you are getting. It looks to me as though your ifort
flags are for 32-bit code, but you are linking against the 64-bit MKL.
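
As a rough way to check this, the sketch below reads the class byte of an ELF
header to tell whether an object file or shared library is 32-bit or 64-bit.
The function name elf_class is made up for this example, and the paths in the
usage comments are only placeholders taken from your message. A static archive
such as libmkl_em64t.a is not an ELF file itself, so you would first need to
extract one of its members (for instance with "ar x") and test that.

```shell
# elf_class FILE: print 64, 32, or unknown for the given file.
# Hypothetical helper for illustration; it only inspects the ELF header.
elf_class() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo unknown
        return 0
    fi
    # The fifth byte (offset 4) is the class: 1 = 32-bit, 2 = 64-bit.
    cls=$(od -An -tu1 -j4 -N1 "$1" 2>/dev/null | tr -d ' \n')
    case "$cls" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo unknown ;;
    esac
}

# Example usage (placeholder paths, adjust to your installation):
#   elf_class some_object_from_your_build.o
#   elf_class /usr/users/adebnar/local/lib/some_shared_library.so
```

If the objects produced by mpif90 and the members of the MKL archive report
different classes, that would confirm the mismatch. It may also help to see
which compiler the wrapper really calls; MPICH-derived wrappers usually accept
an option such as "-show" for this, but I have not checked that for your
mvapich version.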
Cheers,
Xavier
On Tue, 21 Aug 2007, adebnar at gwdg.de wrote:
> Hi
> I have compiled a parallel version of octopus without any error messages,
> but I can only run calculations with it in serial mode. Parallelization in
> states and parallelization in domains both fail. My system is an Intel
> Xeon 5160 (Woodcrest). I configured the build as follows:
>
> export FC=/usr/local/ofed/mpi/intel/mvapich-0.9.7-mlx2.2.0/bin/mpif90
> export CC=/usr/local/ofed/mpi/intel/mvapich-0.9.7-mlx2.2.0/bin/mpicc
> export CXX=/usr/local/ofed/mpi/intel/mvapich-0.9.7-mlx2.2.0/bin/mpicxx
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/users/adebnar/local/lib
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/ofed/mpi/intel/mvapich-0.9.7-mlx2.2.0/lib/shared
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/product/parallel/intel/compiler91/lib
> export FCFLAGS="-u -zero -fpp1 -nbs -pc80 -pad -align -unroll -O3 -ip -tpp7 -xW"
> ./configure --prefix=/usr/users/adebnar/local/ \
>   --with-gsl-prefix=/usr/users/adebnar/local/ \
>   --with-fft=fftw3 \
>   --with-fft-lib="/usr/users/adebnar/local/lib/libfftw3.a" \
>   --with-blas="/usr/product/parallel/intel/mkl81/lib/em64t/libmkl_em64t.a -lguide" \
>   --with-lapack="/usr/product/parallel/intel/mkl81/lib/em64t/libmkl_lapack.a -lguide" \
>   --enable-mpi=yes
>
> The error message from the parallelization in states is:
>
> 1 - MPI_BARRIER : Null communicator
> [1] [] Aborting Program!
> 3 - MPI_BARRIER : Null communicator
> [3] [] Aborting Program!
> 2 - MPI_BARRIER : Null communicator
> [2] [] Aborting Program!
> 0 - <NO ERROR MESSAGE> : Could not convert index 1073946423 into a pointer
> The index may be an incorrect argument.
> Possible sources of this problem are a missing "include 'mpif.h'",
> a misspelled MPI object (e.g., MPI_COM_WORLD instead of MPI_COMM_WORLD)
> or a misspelled user variable for an MPI object ( e.g.,
> com instead of comm).
> [0] [] Aborting Program!
>
>
> And the error message from the parallelization in domains is:
>
> * I 0.887231 0.310000 171664 | ..|..|..|mesh_init.mesh_init_stage_1
> * O 0.898953 0.000000 171664 | ..|..|..|mesh_init.mesh_init_stage_1
> * I 0.910688 0.310000 171664 | ..|..|..|mesh_init.mesh_init_stage_2
> * O 13.086420 12.160000 253872 | ..|..|..|mesh_init.mesh_init_stage_2
> * O 13.088792 12.180000 253872 | ..|..|grid.grid_init_stage_1
> * I 13.100779 12.470000 253872 | ..|..|multicomm.multicomm_init
> * I 13.112593 12.470000 253872 | ..|..|..|multicomm.sanity_check
> * O 13.124292 0.000000 253872 | ..|..|..|multicomm.sanity_check
> * O 13.138969 0.000000 253872 | ..|..|multicomm.multicomm_init
> * I 13.169127 12.470000 253872 | ..|..|grid.grid_init_stage_2
> * I 13.180866 12.470000 253872 | ..|..|..|mesh_init.mesh_init_stage_3
> * I 14.297748 13.590000 343884 | ..|..|..|..|mesh_init.mesh_partition
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> forrtl: error (78): process killed (SIGTERM)
>
> Any ideas how I could fix this problem?
> Thanks
>
> Andrea Debnarova
>
> Max Planck Institute for Biophysical Chemistry,
> Goettingen, Germany
>
> _______________________________________________
> Octopus-users mailing list
> Octopus-users at tddft.org
> http://www.tddft.org/mailman/listinfo/octopus-users
>