mpi4py bug on Sunspot
AuroraGPT
Simple reproducer:
Load my anl_24_q2_release conda environment.
Try python3 -c 'from mpi4py import MPI':
- fails
# [08:44:41 AM][foremans@x1922c2s3b0n0][~][anl_24_q2_release]
$ python3 -c 'from mpi4py import MPI'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: /home/foremans/miniconda3/envs/anl_24_q2_release/lib/python3.9/site-packages/mpi4py/MPI.cpython-39-x86_64-linux-gnu.so: undefined symbol: MPI_Message_c2f
[1]    14910 exit 1     python3 -c 'from mpi4py import MPI'
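To make the reproducer scriptable (for example, to run it before and after changing modules), the import can be attempted in a fresh interpreter so that a failure is captured instead of killing the calling process. This is a minimal sketch; the helper name `try_import` is hypothetical and not part of mpi4py.

```python
import subprocess
import sys


def try_import(statement, python=sys.executable):
    """Run `statement` in a fresh interpreter and return (ok, stderr)."""
    result = subprocess.run(
        [python, "-c", statement],
        capture_output=True,  # keep the traceback instead of printing it
        text=True,
    )
    return result.returncode == 0, result.stderr.strip()


if __name__ == "__main__":
    # Same one-liner as the manual reproducer above.
    ok, err = try_import("from mpi4py import MPI")
    print("works" if ok else f"fails:\n{err}")
```

Running it in a subprocess matters here because the set of loaded modules affects which shared libraries the new interpreter resolves at import time.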
Load correct modules:
# [08:44:58 AM][foremans@x1922c2s3b0n0][~][anl_24_q2_release]
$ module use /home/ftartagl/graphics-compute-runtime/modulefiles ; module load graphics-compute-runtime/agama-ci-devel-803.29 spack-pe-gcc/0.6.1-23.275.2 gcc/12.2.0 ; module use /soft/preview-modulefiles/24.086.0 ; module load oneapi/release/2024.04.15.001
UMD: agama-ci-devel-803.29 successfully loaded:
UMD: graphics-compute-runtime/agama-ci-devel-803.29

Due to MODULEPATH changes, the following have been reloaded:
  1) mpich-config/collective-tuning/1024

The following have been reloaded with a version change:
  1) intel_compute_runtime/release/agama-devel-736.25 => intel_compute_runtime/release/775.20
  2) mpich/icc-all-pmix-gpu/52.2 => mpich/icc-all-pmix-gpu/20231026
  3) oneapi/eng-compiler/2023.12.15.002 => oneapi/release/2024.04.15.001
Retry with new modules:
- works
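The ImportError reports an undefined symbol, MPI_Message_c2f, meaning the dynamic loader could not find that function in any library loaded alongside the mpi4py extension. A rough way to check whether a given shared library exports a symbol is sketched below with `ctypes`. The helper name `exports_symbol` and the library name `libmpi.so` are assumptions for illustration; the actual MPI library file name on Sunspot may differ.

```python
import ctypes


def exports_symbol(library, symbol):
    """Return True if `library` can be loaded and exports `symbol`.

    `library` is a path or soname passed to the dynamic loader; None means
    the symbols already visible to the running process (POSIX only).
    """
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        # The library could not be found or loaded at all.
        return False
    # Attribute lookup on a CDLL performs the symbol lookup and raises
    # AttributeError when the symbol is missing, so hasattr() is enough.
    return hasattr(handle, symbol)


if __name__ == "__main__":
    # "libmpi.so" is an assumed name, used here only as an example.
    print(exports_symbol("libmpi.so", "MPI_Message_c2f"))
```

If this prints False before loading the modules and True afterwards, that would be consistent with the import failing and then succeeding, though it has not been verified here.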
Citation
BibTeX citation:
@online{foreman2024,
  author = {Foreman, Sam},
  title = {`Mpi4py` Bug on {Sunspot}},
  date = {2024-05-25},
  url = {https://samforeman.me/posts/AuroraGPT/mpi4py-reproducer/},
  langid = {en}
}
For attribution, please cite this work as:
Foreman, Sam. 2024. “`Mpi4py` Bug on Sunspot.” May 25,
2024. https://samforeman.me/posts/AuroraGPT/mpi4py-reproducer/.