Commit graph

88 commits

Author SHA1 Message Date
John Baldwin 575a6840de Various fixes to the mlxen(4) driver:
- Remove an incorrect assertion that can trigger when downing an interface.
- Stop the interface during detach to avoid panics when unloading the
  driver.
- A few locking fixes to be more consistent with other FreeBSD drivers:
  - Protect if_drv_flags with the driver lock, not atomic ops
  - Hold the driver lock when adjusting multicast state.
  - Hold the driver lock while adjusting if_capenable.

PR:		kern/180791 [1,2]
Submitted by:	Shakar Klein @ Mellanox [1,2]
MFC after:	3 days
2013-07-29 18:44:52 +00:00
John Baldwin ba90c51af3 Avoid trashing IP fragments:
- Only enable UDP/TCP hardware checksums if CSUM_UDP or CSUM_TCP is set.
- Only enable IP hardware checksums if CSUM_IP is set.

PR:		kern/180430
Submitted by:	Meny Yossefi <menyy@mellanox.com>
MFC after:	1 week
2013-07-25 16:34:34 +00:00
John Baldwin c232b2aebb Rework the previous fix for the IB vs Ethernet sysctl handler to be more
generic and apply to all sysfs attributes:
- Use sysctl_handle_string() instead of reimplementing it.
- Remove trailing newline from the current value before passing it to
  userland and append a newline to the new string value before passing it
  to the attribute's store function.
- Don't leak the temporary buffer if the first error check triggers.
- Revert earlier change to mlx4 port mode handler.

PR:		kern/174213
Submitted by:	Garrett Cooper
Reviewed by:	Shakar Klein @ Mellanox
MFC after:	1 week
2013-07-18 14:06:01 +00:00
John Baldwin d356583dbc Remove check forbidding requests that would result in one port being set
to Ethernet and the subsequent port being set to IB.

Submitted by:	Shakar Klein @ Mellanox
Tested by:	Morgan Robertson <morganrobertson@gmail.com>
MFC after:	1 week
2013-07-17 13:41:54 +00:00
John Baldwin 335349690f Allow mlx4 devices to switch from Ethernet to Infiniband (and vice versa):
- Fix sysctl wrapper for sysfs attributes to properly handle new string
  values similar to sysctl_handle_string() (only copyin the user's
  supplied length and nul-terminate the string).
- Don't check for a trailing newline when evaluating the desired operating
  mode of a mlx4 device.

PR:		kern/179999
Submitted by:	Shahar Klein <shahark@mellanox.com>
MFC after:	1 week
2013-07-08 21:25:12 +00:00
Eitan Adler 7a2b450ff8 Fxi a bunch of typos.
PR:	misc/174625
Submitted by:	Jeremy Chadwick <jdc@koitsu.org>
2013-05-10 16:41:26 +00:00
Gleb Smirnoff c46db3529c Add const qualifier to the dst parameter of the ifnet if_output method. 2013-04-27 08:11:48 +00:00
John Baldwin e5fb562225 Check for SS_NBIO in the socket state field rather than socket buffer
flags.

Submitted by:	Vijay Singh
MFC after:	1 week
2013-04-03 20:31:10 +00:00
Attilio Rao 89f6b8632c Switch the vm_object mutex to be a rwlock. This will enable in the
future further optimizations where the vm_object lock will be held
in read mode most of the time the page cache resident pool of pages
are accessed for reading purposes.

The change is mostly mechanical but few notes are reported:
* The KPI changes as follow:
  - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
  - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
  - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
  - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED()
    (in order to avoid visibility of implementation details)
  - The read-mode operations are added:
    VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(),
    VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* The vm/vm_pager.h namespace pollution avoidance (forcing requiring
  sys/mutex.h in consumers directly to cater its inlining functions
  using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h
  consumers now must include also sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into
  the compat layer because the name clash between FreeBSD and solaris
  versions must be avoided.
  At this purpose zfs redefines the vm_object locking functions
  directly, isolating the FreeBSD components in specific compat stubs.

The KPI results heavilly broken by this commit.  Thirdy part ports must
be updated accordingly (I can think off-hand of VirtualBox, for example).

Sponsored by:	EMC / Isilon storage division
Reviewed by:	jeff
Reviewed by:	pjd (ZFS specific review)
Discussed with:	alc
Tested by:	pho
2013-03-09 02:32:23 +00:00
Xin LI 3b9ba32d88 Fix LINT build on amd64. 2013-02-09 04:13:45 +00:00
Randall Stewart ded5ea6a25 This fixes a out-of-order problem with several
of the newer drivers. The basic problem was
that the driver was pulling the mbuf off the
drbr ring and then when sending with xmit(), encounting
a full transmit ring. Thus the lower layer
xmit() function would return an error, and the
drivers would then append the data back on to the ring.
For TCP this is a horrible scenario sure to bring
on a fast-retransmit.

The fix is to use drbr_peek() to pull the data pointer
but not remove it from the ring. If it fails then
we either call the new drbr_putback or drbr_advance
method. Advance moves it forward (we do this sometimes
when the xmit() function frees the mbuf). When
we succeed we always call advance. The
putback will always copy the mbuf back to the top
of the ring. Note that the putback *cannot* be used
with a drbr_dequeue() only with drbr_peek(). We most
of the time, in putback, would not need to copy it
back since most likey the mbuf is still the same, but
sometimes xmit() functions will change the mbuf via
a pullup or other call. So the optimial case for
the single consumer is to always copy it back. If
we ever do a multiple_consumer (for lagg?) we
will  need a test and atomic in the put back possibly
a seperate putback_mc() in the ring buf.

Reviewed by:	jhb@freebsd.org, jlv@freebsd.org
2013-02-07 15:20:54 +00:00
Gleb Smirnoff eb1b1807af Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually
2012-12-05 08:04:20 +00:00
Dimitry Andric 3910bc633f Redo r242842, now actually fixing the warnings, as follows:
- In sys/ofed/drivers/infiniband/core/cma.c, an enum struct member is
  interpreted as an int, so cast it to an int.
- In sys/ofed/drivers/infiniband/core/ud_header.c, initialize the
  packet_length variable in ib_ud_header_init(), to prevent undefined
  behaviour.
- In sys/ofed/drivers/infiniband/ulp/sdp/sdp_rx.c, call rdma_notify()
  with the correct enum type and value.
- In sys/ofed/include/linux/pci.h, change the PCI_DEVICE and PCI_VDEVICE
  macros to use C99 struct initializers, so additional members can be
  overridden.

Reviewed by:	delphij, Garrett Cooper <yanegomi@gmail.com>
MFC after:	1 week
2012-11-12 22:01:29 +00:00
Eitan Adler db702c59cf remove duplicate semicolons where possible.
Approved by:	cperciva
MFC after:	1 week
2012-10-22 03:00:37 +00:00
John Baldwin 989343faad Use if_initbaudrate(). 2012-10-18 15:43:19 +00:00
Gleb Smirnoff 063efed28c The drbr(9) API appeared to be so unclear, that most drivers in
tree used it incorrectly, which lead to inaccurate overrated
if_obytes accounting. The drbr(9) used to update ifnet stats on
drbr_enqueue(), which is not accurate since enqueuing doesn't
imply successful processing by driver. Dequeuing neither mean
that. Most drivers also called drbr_stats_update() which did
accounting again, leading to doubled if_obytes statistics. And
in case of severe transmitting, when a packet could be several
times enqueued and dequeued it could have been accounted several
times.

o Thus, make drbr(9) API thinner. Now drbr(9) merely chooses between
  ALTQ queueing or buf_ring(9) queueing.
  - It doesn't touch the buf_ring stats any more.
  - It doesn't touch ifnet stats anymore.
  - drbr_stats_update() no longer exists.

o buf_ring(9) handles its stats itself:
  - It handles br_drops itself.
  - br_prod_bytes stats are dropped. Rationale: no one ever
    reads them but update of a common counter on every packet
    negatively affects performance due to excessive cache
    invalidation.
  - buf_ring_enqueue_bytes() reduced to buf_ring_enqueue(), since
    we no longer account bytes.

o Drivers handle their stats theirselves: if_obytes, if_omcasts.

o mlx4(4), igb(4), em(4), vxge(4), oce(4) and  ixv(4) no longer
  use drbr_stats_update(), and update ifnet stats theirselves.

o bxe(4) was the most correct driver, it didn't call
  drbr_stats_update(), thus it was the only driver accurate under
  moderate load. Now it also maintains stats itself.

o ixgbe(4) had already taken stats from hardware, so just
  - drop software stats updating.
  - take multicast packet count from hardware as well.

o mxge(4) just no longer needs NO_SLOW_STATS define.

o cxgb(4), cxgbe(4) need no change, since they obtain stats
  from hardware.

Reviewed by:	jfv, gnn
2012-09-28 18:28:27 +00:00
Alexander V. Chernikov f340037ea1 Remove unneeded ipfw headers introduced in r213447 from Infiniband code.
MFC after:     2 weeks
2012-09-04 10:56:30 +00:00
Navdeep Parhar 09fe63205c - Updated TOE support in the kernel.
- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs.
  These are available as t3_tom and t4_tom modules that augment cxgb(4)
  and cxgbe(4) respectively.  The cxgb/cxgbe drivers continue to work as
  usual with or without these extra features.

- iWARP driver for Terminator 3 ASIC (kernel verbs).  T4 iWARP in the
  works and will follow soon.

Build-tested with make universe.

30s overview
============
What interfaces support TCP offload?  Look for TOE4 and/or TOE6 in the
capabilities of an interface:
# ifconfig -m | grep TOE

Enable/disable TCP offload on an interface (just like any other ifnet
capability):
# ifconfig cxgbe0 toe
# ifconfig cxgbe0 -toe

Which connections are offloaded?  Look for toe4 and/or toe6 in the
output of netstat and sockstat:
# netstat -np tcp | grep toe
# sockstat -46c | grep toe

Reviewed by:	bz, gnn
Sponsored by:	Chelsio communications.
MFC after:	~3 months (after 9.1, and after ensuring MFC is feasible)
2012-06-19 07:34:13 +00:00
Alexander V. Chernikov bdf942c3f0 Revert r234834 per luigi@ request.
Cleaner solution (e.g. adding another header) should be done here.

Original log:
  Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h.
  Remove ipfw/ip_fw_private.h header from non-ipfw code.

Requested by:      luigi
Approved by:       kib(mentor)
2012-05-03 08:56:43 +00:00
Alexander V. Chernikov 7bd5e9b143 Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h.
Remove ipfw/ip_fw_private.h header from non-ipfw code.

Approved by:        ae(mentor)
MFC after:          2 weeks
2012-04-30 10:22:23 +00:00
Bjoern A. Zeeb 7a1421ee03 Do not announce IPv6 TSO support yet. The driver seems to make assumptions
based on IPv4 header parsing only.

MFC after:	1 week
2012-04-23 21:50:10 +00:00
John Baldwin ed5a2b61fd Add OFED and the associated options and drivers to x86 LINT builds:
- Mark 'sdp' as requiring 'inet'.
- Always include "opt_inet.h" and "opt_inet6.h" and modify the IB
  driver Makefiles to honor WITH/WITHOUT_INET/INET6/_SUPPORT options
  to determine what should be enabled during a module build.
- Fix the mlxen(4) driver and the core IB code to compile without
  if INET is disabled (including when both INET and INET6 are disabled).

Reviewed by:	bz
MFC after:	2 weeks
2012-04-12 14:01:06 +00:00
John Baldwin 6937a7ac6b Don't update if_obytes when transmitting packets. That is already done
in IFQ_HANDOFF() when the packet is passed to the start routine, so doing
it here resulted in double counting.

Reported by:	Alex Tutubalin  lexa lexa ru
MFC after:	1 week
2012-04-12 13:53:49 +00:00
John Baldwin efbebba22a Properly parse 40G media types from newer Mellanox adapters that are
40G capable.  For now, map all 40G links to 40GBase-CR4.

MFC after:	2 weeks
2012-04-10 14:01:09 +00:00
John Baldwin 8b39d094aa Fix build on i386. 2012-04-04 11:55:20 +00:00
John Baldwin 8317ddc843 Fix build of OFED bits with debugging options enabled. 2012-03-19 19:53:53 +00:00
John Baldwin 57e58a39d6 Fix build with INET6 disabled. 2012-03-16 17:56:53 +00:00
Ulrich Spörlein 1eb6fc2b27 Remove spurious 8bit chars, turning files into plain ASCII. 2012-01-15 13:23:54 +00:00
Ed Schouten 6472ac3d8a Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
2011-11-07 15:43:11 +00:00
Ed Schouten d745c852be Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.
This means that their use is restricted to a single C file.
2011-11-07 06:44:47 +00:00
Eitan Adler 36daf0495a - change "is is" to "is" or "it is"
- change "the the" to "the"

Approved by:	lstewart
Approved by:	sahil (mentor)
MFC after:	3 days
2011-10-16 14:30:28 +00:00
Xin LI 66d972cfd6 In ipoib_cm_handle_rx_wc(): Count incoming packets and
bytes toward incoming counters.

Reviewed by:	jeff
2011-05-26 22:29:43 +00:00
Bjoern A. Zeeb ad46f43a26 Even though this block is not compiled currently, properly assign
CSUM_TSO to if_hwassist rather than if_capabilities to avoid future
errors.

Reviewed by:	jeff
2011-04-12 01:19:23 +00:00
Jeff Roberson cafd78fc6e - Implement wake-on-lan support in mlxen. 2011-03-26 00:54:01 +00:00
Jeff Roberson a340f09abe - Correct the vlan filter programming. The device filter is built in
reverse order.
 - Name the cq taskqueues according to whether they handle rx or tx.
 - Default LRO to on.
2011-03-23 02:47:04 +00:00
Jeff Roberson 6146335abb - Don't use a separate set of rx queues for UDP, hash them into the same
set as TCP.
 - Eliminate the fully linear non-scatter/gather rx path, there is no
   harm in using arrays of clusters for both TCP and UDP.
 - Implement support for enabling/disabling per-vlan priority pause and
   queues via sysctl.
2011-03-22 04:50:47 +00:00
Konstantin Belousov f394ce6e5b Allow the ofed modules to be compiled on i386.
Reviewed by:	jeff
2011-03-21 21:16:40 +00:00
Jeff Roberson aa0a1e58f0 - Merge in OFED 1.5.3 from projects/ofed/head 2011-03-21 09:58:24 +00:00