Centos 7에warewulf 클러스터를 설정하고 openmpi-x86_64-1.10.0-10.el7을 설치했으며 추가적으로 mpich도 설치했습니다. Openmpi로 mpirun을 실행하면 아래 오류가 발생합니다. mpich도 완벽하게 작동합니다. n0000을 클러스터 마스터로 변경하는 것도 가능하지만 노드에서는 실행되지 않습니다.
mpirun -n 1 -host n0000 echo $HOSTNAME
[n0000.cluster:01719] [[24772,0],1] tcp_peer_send_blocking: send() to socket 9 failed: Broken pipe (32)
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:
* not finding the required libraries and/or binaries on
one or more nodes. Please check your PATH and LD_LIBRARY_PATH
settings, or configure OMPI with --enable-orterun-prefix-by-default
* lack of authority to execute on one or more specified nodes.
Please verify your allocation and authorities.
* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
Please check with your sys admin to determine the correct location to use.
* compilation of the orted with dynamic libraries when static are required
(e.g., on Cray). Please check your configure cmd line and consider using
one of the contrib/platform definitions for your system type.
* an inability to create a connection back to mpirun due to a
lack of common network interfaces and/or no route found between
them. Please check network connectivity (including firewalls
and network routing requirements).
--------------------------------------------------------------------------
아래에서 클러스터 및 서버 IP 주소 출력을 찾을 수 있습니다. 나도 좀 살펴봤어https://www.open-mpi.org/community/lists/users/2015/09/27643.php유사한 문제가 설명되어 있지만 동일한 하위 네트워크에 인터페이스가 없는 것 같습니다.
Server:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:2e:ee:c2 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic enp0s3
valid_lft 85746sec preferred_lft 85746sec
inet6 fe80::a00:27ff:fe2e:eec2/64 scope link
valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:76:b9:e7 brd ff:ff:ff:ff:ff:ff
inet 10.1.1.1/24 brd 10.1.1.255 scope global enp0s8
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fe76:b9e7/64 scope link
valid_lft forever preferred_ft forever
Node:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/24 scope host lo
valid_lft forever preferred_lft forever
inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:57:46:ce brd ff:ff:ff:ff:ff:ff
inet 10.1.1.10/24 brd 10.1.1.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fe57:46ce/64 scope link
valid_lft forever preferred_lft forever
이견있는 사람?
답변1
OpenFOAM을 어디에 설치하셨나요?
/opt/ 또는 /home/username/OpenFOAM에 설치하셨나요?
첫 번째 작업을 수행한 경우 컴퓨팅 노드는 최상위 위치(/opt/)를 찾을 수 없습니다.