Hi all,
I have been trying to capture MPI traffic with ibdump for one-way latency measurements (at the physical layer).
However, every ibdump capture contains only IPoIB packets interleaved with subnet-monitoring packets, such as the perfquery requests issued during UFM/SM sweeps.
Specs:
ibdump, 2.0.0-5, built on Aug 21 2013, 13:09:58. GIT Version: 41a4dd2
I have also tried: ibdump, 2.0.0-8, built on Apr 28 2014, 17:06:51. GIT Version: f8e989d
I have also used --mem-mode and --max-burst to avoid dropped packets, but with no luck.
Ofed_info: MLNX_OFED_LINUX-2.0-3.0.0.3 (OFED-2.0-3.0.0)
bash-4.1$ ibv_devinfo
hca_id: mlx4_0
transport: InfiniBand (0)
fw_ver: 2.30.3200
node_guid: 0002:c903:0044:e220
sys_image_guid: 0002:c903:0044:e223
vendor_id: 0x02c9
vendor_part_id: 4099
hw_ver: 0x1
board_id: IBM1100110019
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 4096 (5)
sm_lid: 1
port_lid: 15
port_lmc: 0x00
link_layer: InfiniBand
When I parse the .pcap files with tshark -r sniffer.pcap | grep -i infiniband, I typically get this:
Frame # | Elapsed time (s) | Src -> Dest | Info
48 | 6.469792 | LID: 1 -> LID: 35 | InfiniBand 290 PERF (PortCountersExtended)
49 | 6.469888 | LID: 35 -> LID: 1 | InfiniBand 290 PERF (PortCounters)
50 | 6.469951 | LID: 35 -> LID: 1 | InfiniBand 290 PERF (PortCountersExtended)
239 | 36.492380 | LID: 1 -> LID: 35 | InfiniBand 290 PERF (PortCounters)
240 | 36.492388 | LID: 1 -> LID: 35 | InfiniBand 290 PERF (PortCountersExtended)
241 | 36.492477 | LID: 35 -> LID: 1 | InfiniBand 290 PERF (PortCounters)
242 | 36.492543 | LID: 35 -> LID: 1 | InfiniBand 290 PERF (PortCountersExtended)
410 | 66.514564 | LID: 1 -> LID: 35 | InfiniBand 290 PERF (PortCounters)
411 | 66.514574 | LID: 1 -> LID: 35 | InfiniBand 290 PERF (PortCountersExtended)
412 | 66.514660 | LID: 35 -> LID: 1 | InfiniBand 290 PERF (PortCounters)
413 | 66.514723 | LID: 35 -> LID: 1 | InfiniBand 290 PERF (PortCountersExtended)
718 | 96.551601 | LID: 1 -> LID: 35 | InfiniBand 290 PERF (PortCounters)
719 | 96.551606 | LID: 1 -> LID: 35 | InfiniBand 290 PERF (PortCountersExtended)
720 | 96.551701 | LID: 35 -> LID: 1 | InfiniBand 290 PERF (PortCounters)
721 | 96.551770 | LID: 35 -> LID: 1 | InfiniBand 290 PERF (PortCountersExtended)
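In case it helps anyone reproduce the analysis, here is a minimal Python sketch of how rows like the ones above could be post-processed. It assumes the pipe-separated layout shown in this post (not raw tshark output), and the names parse_rows, sweep_starts and SAMPLE are purely illustrative. It separates PERF management packets from everything else and estimates the interval between sweeps, which in the listing above comes out at roughly 30 seconds.

```python
from collections import namedtuple

# One parsed row of the pipe-separated listing (hypothetical layout from this post).
Row = namedtuple("Row", "frame time src dst info")

def parse_rows(text):
    """Parse 'frame | time | src -> dst | info' lines; skip headers and blanks."""
    rows = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # header line, blank line, or unrelated text
        src, dst = [s.strip() for s in fields[2].split("->")]
        rows.append(Row(int(fields[0]), float(fields[1]), src, dst, fields[3]))
    return rows

def sweep_starts(rows, gap=1.0):
    """Return the timestamp of the first packet in each burst.

    A new burst starts when more than `gap` seconds separate two
    consecutive rows (an assumed threshold, not an ibdump setting).
    """
    starts = []
    last = None
    for r in rows:
        if last is None or r.time - last > gap:
            starts.append(r.time)
        last = r.time
    return starts

# A few rows copied from the listing above.
SAMPLE = """\
48 | 6.469792 | LID: 1 -> LID: 35 | InfiniBand 290 PERF (PortCountersExtended)
49 | 6.469888 | LID: 35 -> LID: 1 | InfiniBand 290 PERF (PortCounters)
239 | 36.492380 | LID: 1 -> LID: 35 | InfiniBand 290 PERF (PortCounters)
240 | 36.492388 | LID: 1 -> LID: 35 | InfiniBand 290 PERF (PortCountersExtended)
410 | 66.514564 | LID: 1 -> LID: 35 | InfiniBand 290 PERF (PortCounters)
"""

rows = parse_rows(SAMPLE)
perf = [r for r in rows if "PERF" in r.info]          # management traffic
other = [r for r in rows if "PERF" not in r.info]     # candidate data traffic
starts = sweep_starts(perf)
intervals = [b - a for a, b in zip(starts, starts[1:])]

print("PERF packets:", len(perf), "other packets:", len(other))
print("sweep intervals (s):", [round(i, 3) for i in intervals])
```

The regular spacing only says how often the counters are being polled; it does not explain why the MPI packets are absent from the capture.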
cat /etc/modprobe.d/mlnx.conf:
blacklist mlx4_core
blacklist mlx4_en
blacklist mlx5_core
blacklist mlx5_ib
I would greatly appreciate any help understanding this behavior.
Thanks,
Fabrice