Hi,
Can you send me the output of:
ifconfig -a, uname -a, arp -a, ip address show
Thanks
Marc
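For anyone who wants to gather all four outputs in one go, here is a minimal sketch in Python. The helper name `collect_outputs` and the report layout are assumptions made for illustration; only the four commands come from the request above. A command that is missing on the system (for example `ifconfig` on newer distributions) is reported instead of aborting the run.

```python
import subprocess

# The four commands requested in the post above.
COMMANDS = ["ifconfig -a", "uname -a", "arp -a", "ip address show"]


def collect_outputs(commands):
    """Run each command and return one text report with a header per command."""
    sections = []
    for cmd in commands:
        try:
            result = subprocess.run(
                cmd.split(), capture_output=True, text=True, timeout=30
            )
            body = result.stdout + result.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Covers a missing binary (FileNotFoundError) or a hung command.
            body = f"(could not run: {exc})\n"
        sections.append(f"===== {cmd} =====\n{body}")
    return "\n".join(sections)


if __name__ == "__main__":
    print(collect_outputs(COMMANDS))
```

Redirecting the script's output to a file gives a single attachment to send back.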
Hi Samer,
thanks for your reply.
The card has the latest firmware:
[root@localhost:/opt/mellanox] ./bin/flint -d mt4115_pciconf0 -i fw-ConnectX4-rel-12_21_2010-MCX416A-BCA_Ax-FlexBoot-3.5.305.bin burn
Current FW version on flash: 12.21.2010
New FW version: 12.21.2010
Note: The new FW version is the same as the current FW version on flash.
Do you want to continue ? (y/n) [n] :
It must be something else.
Cheers
Axel
Hello,
I am attempting to run soft RoCE and interface with a X4 card in a different computer.
I am running CentOS 7.4, and I installed MLNX_OFED version 4.2-1.2.0.0. The install finished without error, and I ran the service restart command when prompted. I then tried to set up soft RoCE following the directions here: HowTo Configure Soft-RoCE.
When I run rxe_cfg status/start, the script complains that the rdma_rxe module is not loaded (with no other errors, even in verbose mode). When I run lsmod | grep rdma_rxe, I see that rdma_rxe is in fact loaded, and that it is using mlx_compat. This is a small variation from the above instructions: on my system rdma_rxe is using mlx_compat, not ib_core (even though ib_core is loaded and used by mlx_compat). I figured this is some wrapper used by Mellanox in newer versions of the OFED. I have also tried running modprobe rdma_rxe and see no error messages when loading rdma_rxe, and dmesg does not show any error messages from the kernel. I have also tried reloading the module and restarting the machine.
After 'starting' rxe_cfg, running rxe_cfg add <adapter_name> does nothing. It does not load any IB devices associated with the NIC, and I still see the 'rdma_rxe module is not loaded' message.
I looked around quite a bit and could not find anything that helped. I have also tried the same steps with version 4.2-1.0.0.0 of MLNX_OFED. This computer did have an X4 card in it when I first installed the OFED package. I took it out in case it was preventing soft RoCE from working on other NICs, restarted, re-installed OFED, and did the same troubleshooting without the Mellanox card installed.
Any help would be appreciated.
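To double-check what lsmod reports independently of rxe_cfg, a small sketch like the following can parse lsmod-style output and show which modules use which. The function name `parse_lsmod` and the sample text (including the sizes) are invented for illustration and are not taken from the poster's machine; the relationships in the sample simply mirror what the post describes.

```python
def parse_lsmod(text):
    """Map each module name to the list of modules in its 'Used by' column."""
    users = {}
    for line in text.splitlines()[1:]:  # skip the "Module Size Used by" header
        fields = line.split()
        if len(fields) < 3:
            continue
        name = fields[0]
        # Fourth column, when present, is a comma-separated list of users.
        used_by = fields[3].split(",") if len(fields) > 3 else []
        users[name] = [u for u in used_by if u]
    return users


# Hypothetical sample; module sizes are made up.
SAMPLE = """\
Module                  Size  Used by
rdma_rxe              100000  0
mlx_compat             30000  1 rdma_rxe
ib_core               200000  1 mlx_compat
"""

if __name__ == "__main__":
    users = parse_lsmod(SAMPLE)
    print("rdma_rxe loaded:", "rdma_rxe" in users)
    print("mlx_compat used by:", users.get("mlx_compat"))
```

On a live system, the text could come from `subprocess.run(["lsmod"], capture_output=True, text=True).stdout` instead of the sample string.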
Hi Samer,
it was the "mlxconfig -d <device> reset" which did the trick. So I guess there was a f... up in the configuration and the reset cleaned it.
Thank you very much!
Axel
"Inbox driver" is the term Mellanox uses for the drivers that ship with Linux distributions such as Ubuntu, CentOS, Debian, and RHEL. It is generic across all Linux distros; this is how they do business.
My problem is that even if I don't install anything, FreeBSD 11 lists ConnectX-4 VPI drivers. My question is: what is the use of installing the Mellanox drivers from GitHub if the FreeBSD inbox drivers are already there?
Furthermore, it appears that ISER-RDMA is supported only for the initiator and not the target; I concluded this only because I did not get any response on this issue. An initiator without a target is like coffee without a cookie.
Hi,
I've just bricked an SN2100 because of a power failure during the ONIE uninstall process. Now every time I boot it, it goes to the 'grub rescue' prompt. Is there any way to recover it? Can I follow this guide to re-install ONIE using a USB drive?
BR,
Donny Hariady
Hi Donny,
Yes, you can use the USB ONIE recovery procedure.
Hi Eddie,
Thanks for your quick answer! Right now I'm still waiting for a USB to mini-USB converter cable, but as I observed, I'll need a password to enter the BIOS settings. Do we need to change the boot order in order to boot from the USB drive? If so, how do I get the password, or whom should I ask for it? This is a brand new switch (about one month old) and nobody except me has ever touched it, so I'm sure the password is still the factory default.
Hi Donny,
Please open a case at support@mellanox.com and an engineer will be able to set up a WebEx session with you in order to access the BIOS.
Hi,
My configuration:
Server Dell PowerEdge R710, 2x6 core (2x12 threads), 96GB RAM.
Mellanox ConnectX-4 Lx EN-MCX4121A-XCAT
Firmware: FW 14.21.2010, PXE 3.5.0305
VMware 6.5: DellEMC-ESXi-6.5U1-7388607-A07 (Dell)
Based on this documentation: http://www.mellanox.com/related-docs/prod_software/Mellanox_MLNX-NATIVE-ESX-ConnectX-4-5_Driver_for_VMware_ESXi_6.5_Rele…
I configured the card:
/opt/mellanox/bin/mlxconfig -d mt4117_pciconf0 set SRIOV_EN=1 NUM_OF_VFS=18 (In documentation: Firmware VF configuration must be N+1)
esxcli system module parameters set -m nmlx5_core -p max_vfs=16
And I have 32 VFs (16 VFs per port). How can I create more VFs when ESXi shows the following for "esxcli system module parameters list -m nmlx5_core"?
max_vfs, Values : 0-16, 0 - disabled
Is ESXi limited to 16 VFs per port? How can I create more VFs?
Best Regards
Robert
Hello,
I'm trying to figure out how to use RoCEv2 (or v1, neither seems to work...) on Windows 7 using a ConnectX-3 Pro Ethernet Adapter and WinOF v5.35.
I have enabled RoCEv2 and set the RoCEv2 Port to 4791 using the registry keys in HKLM/SYSTEM/CurrentControlSet/services/mlx4_bus/Parameters/Roce.
The WinOF User manual states to use the Microsoft "Network Direct SPI" for RoCE programming. When I do so there is no Network Direct Provider available on the system.
What can I do? Is there an alternate way of programming with RoCE? Or is there a way to install the NetworkDirectProvider?
All the configuration tutorials I have found so far are for Server Platforms, but I cannot use one.
If it is not possible to use RoCE in Windows 7, would it be possible in Windows 10?
Any help is highly appreciated!
Best Regards,
Dominic
Hi,
I have been trying to set up a small computing cluster of 4 computers using just ConnectX-5 HCAs. We have one master computer with two cards (one dual-port and one single-port) and three slave computers, each with a single port. When I have just the master and one slave hooked up, it works fine, but when I start adding more slaves, the connection to the other systems drops.
Do I need to have some sort of connection manager set up? Any advice on how to set this up would be greatly appreciated (or at least a pointer to the documentation would be nice).
Thanks in advance,
Bryan
I'm trying to update the firmware on my switch. I issued the following commands and ran into a PSID mismatch. I'm unable to find any mention of GEM0F80110002 on the net and wasn't sure whether a forced flash could be done.
C:\>mst ib add
-I- Discovering the fabric - Running: ibnetdiscover.exe
-I- Added 3 in-band devices
C:\>mst status
MST devices:
------------
mt4099_pci_cr0
mt4099_pciconf0
Inband devices:
-------------------
CA_MT4099_LLWPHOSTP01_lid-0x0003
CA_MT4099_LLWPHOSTP02_lid-0x0004
SW_MT48438_0x2c90200433c38_lid-0x0002
C:\>flint -d /dev/mst/SW_MT48438_0x2c90200433c38_lid-0x0002 q
Image type: FS2
FW Version: 7.4.0000
Device ID: 48438
Description: Node Sys image
GUIDs: 0002c90200433c38 0002c90200433c3b
VSD: n/a
PSID: GEM0F80110002
C:\Tools>flint -d /dev/mst/SW_MT48438_0x2c90200433c38_lid-0x0002 -i ./fw-IS4-rel-7_4_3000-MIS5024Q_A1-A5.bin -qq b
Current FW version on flash: 7.4.0000
New FW version: 7.4.3000
-E- PSID mismatch. The PSID on flash (GEM0F80110002) differs from the PSID in the given image (MT_0F80110002).
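To make the mismatch explicit before attempting a burn, the query output above can be parsed and its PSID compared with the one the image carries. This is only a sketch: `parse_flint_query` and `psid_matches` are hypothetical helper names, the query text is copied from the output above, and the image PSID is the one quoted in the error message.

```python
def parse_flint_query(text):
    """Parse 'Key: value' lines from flint query output into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def psid_matches(flash_psid, image_psid):
    """True only when the PSID on flash equals the PSID in the image."""
    return flash_psid == image_psid


# Query output as posted above.
QUERY_OUTPUT = """\
Image type: FS2
FW Version: 7.4.0000
Device ID: 48438
Description: Node Sys image
GUIDs: 0002c90200433c38 0002c90200433c3b
VSD: n/a
PSID: GEM0F80110002
"""

# PSID of the image, as quoted in the error message above.
IMAGE_PSID = "MT_0F80110002"

if __name__ == "__main__":
    info = parse_flint_query(QUERY_OUTPUT)
    print("PSID on flash:", info["PSID"])
    print("PSIDs match:", psid_matches(info["PSID"], IMAGE_PSID))
```

The comparison is a plain string equality check, so the GEM and MT_ prefixes are enough to make the two PSIDs differ even though the remaining characters are the same.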
Hi All,
We are trying to connect an SX1012 to a 10GbE port on a Cisco 4506-E. We used a 40GbE to 10GbE (DAC) connection cable, but the link is up only when I change the speed of the Mellanox port to 1GbE. Is there anything missing in our configuration? Does the Cisco port need to be forced to 10GbE rather than left to auto-negotiate?
Thanks.
Regards,
Reggie
Hi Reggie,
Which type of cable are you using? What is the vendor and part number?
Is Cisco not locking the cable on its end?
Have you configured the port on the Mellanox side to 10G?
Hello
I'm trying to create an ISO image for OL 6.8 with kernel "Linux oraclenode2 4.1.12-37.4.1.el6uek.x86_64 #2 SMP",
but the ISO or tgz creation exits with the following error:
[root@oraclenode2 MLNX_OFED_LINUX-4.1-1.0.2.0-ol6.8-x86_64]# ./mlnx_add_kernel_support.sh --mlnx_ofed /root/suff/Driver/MLNX_OFED_LINUX-4.1-1.0.2.0-ol6.8-x86_64 --make-iso
Note: This program will create MLNX_OFED_LINUX ISO for ol6.8 under /tmp directory.
Do you want to continue?[y/N]:y
See log file /tmp/mlnx_ofed_iso.24732.log
Checking if all needed packages are installed...
Building MLNX_OFED_LINUX RPMS . Please wait...
ERROR: Failed executing "MLNX_OFED_SRC-4.1-1.0.2.0/install.pl --tmpdir /tmp --kernel-only --kernel 4.1.12-37.4.1.el6uek.x86_64 --kernel-sources /lib/modules/4.1.12-37.4.1.el6uek.x86_64/build/ --builddir /tmp/mlnx_iso.24732 --disable-kmp --build-only"
ERROR: See /tmp/mlnx_ofed_iso.24732.log
Any advice?
thanks
Matteo
I bet we are considering the same thing:
building a "no-hop" grid, but out of 100GbE links.
From posts I found by googling, one way to test the throughput and cabling is to connect the copper (CU) QSFP28 cable directly to a port in another PC and do a point-to-point transfer with dd or another tool.
This validates the cable and ports without any switch in the way. Why wouldn't that work natively, as you also suggest? There would be no switch latency.
The rates we need limit the number of links to target to 10-12, which is doable with 5-6 ConnectX4 cards in the right box.
Thoughts?