Commit graph

470934 commits

Author SHA1 Message Date
Florian Westphal 55d8694fa8 net: tcp: assign tcp cong_ops when tcp sk is created
Split assignment and initialization from one into two functions.

This is required by followup patches that add Datacenter TCP
(DCTCP) congestion control algorithm - we need to be able to
determine if the connection is moderated by DCTCP before the
3WHS has finished.

As we walk the available congestion control list during the
assignment, we are always guaranteed to have Reno present as
it's fixed compiled-in. Therefore, since we're doing the
early assignment, we don't have a real use for the Reno alias
tcp_init_congestion_ops anymore and can thus remove it.

Actual usage of the congestion control operations happens only
after the 3WHS has finished. In some cases, however, get_info()
can be reached via diag if the module implements it, so we
need to zero out the private area for those modules.

Joint work with Daniel Borkmann and Glenn Judd.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Glenn Judd <glenn.judd@morganstanley.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-29 00:13:10 -04:00
John Fastabend 53dfd50181 net: sched: cls_rsvp, complete rcu conversion
This completes the cls_rsvp conversion to RCU safe
copy, update semantics.

As a result all cases of tcf_exts_change occur on
empty lists now.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-29 00:04:55 -04:00
Eric Dumazet 3d9a0d2f82 dql: dql_queued() should write first to reduce bus transactions
While doing high throughput test on a BQL enabled NIC,
I found a very high cost in ndo_start_xmit() when accessing BQL data.

It turned out the problem was caused by the compiler trying to be
smart, which resulted in an expensive MESI transaction:

  0.05 │  mov    0xc0(%rax),%edi    // LOAD dql->num_queued
  0.48 │  mov    %edx,0xc8(%rax)    // STORE dql->last_obj_cnt = count
 58.23 │  add    %edx,%edi
  0.58 │  cmp    %edi,0xc4(%rax)
  0.76 │  mov    %edi,0xc0(%rax)    // STORE dql->num_queued += count
  0.72 │  js     bd8

I got an incredible 10 % gain [1] by making sure the cpu does not attempt
to get the cache line in Shared mode, but directly issues a request for
ownership.

New code :
	mov    %edx,0xc8(%rax)  // STORE dql->last_obj_cnt = count
	add    %edx,0xc0(%rax)  // RMW   dql->num_queued += count
	mov    0xc4(%rax),%ecx  // LOAD dql->adj_limit
	mov    0xc0(%rax),%edx  // LOAD dql->num_queued
	cmp    %edx,%ecx

The TX completion was running on another cpu, with a high interrupt
rate.

Note that I am using barrier() as a soft hint, as mb() here could be
too costly.

[1] This was a netperf TCP_STREAM with TSO disabled, but GSO enabled.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-29 00:04:55 -04:00
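The write-first ordering described in the dql_queued() commit above can be sketched in user-space C. This is only an illustration under assumptions: struct dql_sketch and the *_sketch helpers are hypothetical stand-ins, not the kernel's definitions, and barrier() is written out here as a GCC/Clang compiler-only barrier.

```c
/* Compiler-only barrier: stops the compiler from reordering memory
 * accesses across this point, but emits no CPU fence instruction.
 * GCC/Clang syntax; spelled out here as an assumption. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical, trimmed-down stand-in for the kernel's struct dql. */
struct dql_sketch {
    unsigned int num_queued;    /* total objects queued so far */
    unsigned int adj_limit;     /* limit adjusted by completions */
    unsigned int last_obj_cnt;  /* count of the most recent enqueue */
};

/* Record that `count` objects were queued: store first, then the RMW. */
static inline void dql_queued_sketch(struct dql_sketch *dql, unsigned int count)
{
    dql->last_obj_cnt = count;
    /* Hint: keep the store above ahead of any load of this cache line,
     * so the line is requested for ownership rather than in Shared mode. */
    barrier();
    dql->num_queued += count;
}

/* Remaining budget; a negative value means the queue is over its limit. */
static inline int dql_avail_sketch(const struct dql_sketch *dql)
{
    return (int)(dql->adj_limit - dql->num_queued);
}
```

The barrier only constrains the compiler, which matches the commit's description of it as a soft hint rather than a full mb().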
WANG Cong 68f6a7c6c9 net_sched: fix another regression in cls_tcindex
Clearly the following change is not expected:

	-       if (!cp.perfect && !cp.h)
	-               cp.alloc_hash = cp.hash;
	+       if (!cp->perfect && cp->h)
	+               cp->alloc_hash = cp->hash;

Fixes: commit 331b72922c ("net: sched: RCU cls_tcindex")
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:34:35 -04:00
WANG Cong 02c5e84413 net_sched: fix errno in tcindex_set_parms()
When kmemdup() fails, we should return -ENOMEM.

Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:34:22 -04:00
David S. Miller c01035f174 Merge branch 'cxgb4-next'
Hariprasad Shenai says:

====================
cxgb4: Use new BAR2 GTS for T5, adds adaptive rx and few Device ID's

This patch series adds support to use new BAR2 GTS for T5 adapter.
Adds support for adaptive rx. Remove redundant variable from a macro of
cxgb4vf driver. Adds Device ID for new adapters.

The patch series is created against the 'net-next' tree
and includes patches for the cxgb4 and cxgb4vf drivers.

We have included all the maintainers of respective drivers. Kindly review the
change and let us know in case of any review comments.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:32:16 -04:00
Hariprasad Shenai e553ec3ff9 cxgb4: Add support for adaptive rx
Based on original work by Kumar Sanghvi <kumaras@chelsio.com>

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:32:11 -04:00
Hariprasad Shenai 91c04a9eb3 cxgb4/cxgb4vf: Add Device ID for two more adapter
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:32:11 -04:00
Hariprasad Shenai b961f9a488 cxgb4vf: Remove superfluous "idx" parameter of CH_DEVICE() macro.
Remove redundant idx parameter of CH_DEVICE() macro, it's always zero.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:32:11 -04:00
Hariprasad Shenai d63a6dcf06 cxgb4: Use BAR2 Going To Sleep (GTS) for T5 and later.
Use BAR2 GTS for T5. If we are on T4 use the old doorbell mechanism;
otherwise use the new BAR2 mechanism. Use BAR2 doorbells for refilling FL's.

Based on original work by Casey Leedom <leedom@chelsio.com>

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:32:10 -04:00
Rick Jones 825bae5d97 arp: Do not perturb drop profiles with ignored ARP packets
We do not wish to disturb dropwatch or perf drop profiles with an ARP
we will ignore.

Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:30:35 -04:00
WANG Cong 18d0264f63 net_sched: remove the first parameter from tcf_exts_destroy()
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <hadi@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:29:01 -04:00
Eric Dumazet 5804283d7c mlx4: exploit skb->xmit_more to conditionally send doorbell
skb->xmit_more tells us if another skb is coming next.

We need to send the doorbell when xmit_more is not set,
or when the txqueue is stopped (preventing the next skb from coming immediately).

Tested with a modified pktgen version, I got a 40% increase of
throughput.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:27:36 -04:00
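The doorbell rule from the mlx4 commit above can be sketched as follows. This is a minimal model, not the driver's code: struct tx_ring_sketch and tx_post_sketch() are hypothetical names, and a counter stands in for the MMIO doorbell write.

```c
#include <stdbool.h>

/* Hypothetical TX ring state; not the mlx4 driver's real structures. */
struct tx_ring_sketch {
    unsigned int pending;    /* descriptors posted but not yet announced */
    unsigned int doorbells;  /* number of doorbell writes issued */
};

/* Post one descriptor and ring the doorbell only when it is needed:
 * either no further skb is expected (xmit_more is false), or the queue
 * is stopped so the next skb cannot arrive immediately. */
static void tx_post_sketch(struct tx_ring_sketch *ring,
                           bool xmit_more, bool queue_stopped)
{
    ring->pending++;
    if (!xmit_more || queue_stopped) {
        ring->doorbells++;   /* stands in for the MMIO doorbell write */
        ring->pending = 0;
    }
}
```

Posting a burst of three packets with xmit_more set on the first two therefore costs one doorbell write instead of three.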
David S. Miller a8404ce5ae Merge branch 'r8152'
Hayes Wang says:

====================
r8152: support setting eee by ethtool

Modify some definitions about EEE, and add the support of setting
the EEE through ethtool.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:24:32 -04:00
hayeswang df35d283e5 r8152: support ethtool eee
Support get_eee() and set_eee() of ethtool_ops.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:24:27 -04:00
hayeswang d24f6134c7 r8152: add functions to set EEE
Add functions to enable EEE and set EEE advertisement.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:24:27 -04:00
hayeswang 4c4a6b1b85 r8152: change the EEE definition
Replace the EEE definitions with the ones which are declared
in "mdio.h".

Change some definitions to make them readable.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:24:27 -04:00
David S. Miller 18c565eb41 Merge branch 'defxx-next'
Maciej W. Rozycki says:

====================
defxx: DEFEA fixes and updates

 I have finally got my hands on an EISA variation of the board (DEC
FDDIcontroller/EISA aka DEFEA) and was able to do some testing.  Here are
initial updates to the driver that address problems I encountered so far.
More to come later on as I get back to the system that I have in a remote
location -- I need to double-check MMIO support and see what might have
been causing spurious interrupts I saw with the 8259A PIC the board's
interrupt line has been routed to.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:22:21 -04:00
Maciej W. Rozycki b98dfaf2b0 defxx: DEFEA's ESIC port I/O decoding cleanup
Use the slot-specific I/O range for decoding accesses to PDQ ASIC
registers (IOCS0) and the discrete Burst Holdoff register (IOCS1) as per
the "HD64981F EISA Slave Interface Controller (ESIC)" datasheet.  Use
disjoint decode ranges now that the assignment of chip selects is known.
Update the span of the port I/O resource requested accordingly.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:22:10 -04:00
Maciej W. Rozycki b1a6d3ecf8 defxx: DEFEA's Burst Holdoff register initialization fix
Use the mask rather than bit number macro to initialize the chip select
control bit for PDQ register space decoding in the Burst Holdoff register.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:22:09 -04:00
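The class of bug fixed by the Burst Holdoff commit above (using a bit-number macro where a mask is required) can be illustrated with a small sketch. The macro names and the bit position below are hypothetical, chosen only for illustration; they are not the defxx driver's definitions.

```c
#include <stdint.h>

/* Hypothetical names and bit position, for illustration only. */
#define CS_ENABLE_BIT   3u                      /* bit number */
#define CS_ENABLE_MASK  (1u << CS_ENABLE_BIT)   /* bit mask, 0x08 */

/* Buggy variant: ORs in the bit *number* (0x03), so bits 0 and 1 are
 * set and the intended bit 3 stays clear. */
static uint8_t set_with_bit_number(uint8_t reg)
{
    return (uint8_t)(reg | CS_ENABLE_BIT);
}

/* Correct variant: ORs in the mask, setting exactly bit 3. */
static uint8_t set_with_mask(uint8_t reg)
{
    return (uint8_t)(reg | CS_ENABLE_MASK);
}
```

Both variants compile without warnings, which is why this kind of mistake tends to surface only when the hardware misbehaves.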
Maciej W. Rozycki 8a189f1288 defxx: Correct DEFEA's ESIC port I/O accesses
Reverse the order of arguments to `outb', data to write comes first.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:22:09 -04:00
David S. Miller f5c7e1a47a Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2014-09-25

1) Remove useless hash_resize_mutex in xfrm_hash_resize().
   This mutex is used only there, but xfrm_hash_resize()
   can't be called concurrently at all. From Ying Xue.

2) Extend policy hashing to prefixed policies based on
   prefix length thresholds. From Christophe Gouault.

3) Make the policy hash table thresholds configurable
   via netlink. From Christophe Gouault.

4) Remove the maximum authentication length for AH.
   This was needed to limit stack usage. We switched
   already to allocate space, so no need to keep the
   limit. From Herbert Xu.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:19:15 -04:00
David S. Miller fe2c5fb1ef Merge branch 'dsa_eee'
Florian Fainelli says:

====================
net: dsa: EEE and other PM features

This patch set allows DSA switch drivers to enable/disable/query EEE on a
per-port level, as well as control precisely which switch ports are
enabled/disabled.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:14:15 -04:00
Florian Fainelli 450b05c15f net: dsa: bcm_sf2: add support for controlling EEE
When EEE is enabled, negotiate this feature with the PHY and make sure
that the capability checking, local EEE advertisement, link partner EEE
advertisement and auto-negotiation resolution returned by phy_init_eee()
is positive, and enable EEE at the switch level.

While querying the current EEE settings, verify the low-power indication
and indicate its status.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:14:09 -04:00
Florian Fainelli 7905288f09 net: dsa: allow switches driver to implement get/set EEE
Allow switch drivers to query and enable/disable EEE on a per-port
basis by implementing the ethtool_{get,set}_eee settings and delegating
these operations to the switch driver.

set_eee() will need to coordinate with the PHY driver to make sure that
EEE is enabled, the link-partner supports it and the auto-negotiation
result is satisfactory.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:14:09 -04:00
Florian Fainelli b6d045db59 net: dsa: bcm_sf2: add port_enable/disable callbacks
The SF2 switch driver is already architected around per-port
enable/disable callbacks, so we just need a slight update to our
existing bcm_sf2_port_setup() resp. bcm_sf2_port_disable() functions to
be suitable as callbacks for port_enable/port_disable.

We need to slightly rearrange the code that does the per-port VLAN
configuration/isolation since ports can now be brought up/down
separately, so we need to make sure that the IMP (CPU, management) port is
always included in that specific port setup.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:14:09 -04:00
Florian Fainelli 7de1557ce7 net: dsa: bcm_sf2: disable RGMII interface(s) when link is down
When the link is down, disable the RGMII interface to conserve as much
power as possible. We re-enable the RGMII interface whenever the link is
detected.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:14:09 -04:00
Florian Fainelli b2f2af21e3 net: dsa: allow enabling and disable switch ports
Whenever a per-port network device is used/unused, invoke the switch
driver port_enable/port_disable callbacks to allow saving as much power
as possible by disabling unused parts of the switch (RX/TX logic, memory
arrays, PHYs...). We supply a PHY device argument to make sure the
switch driver can act on the PHY device if needed (like putting/taking
the PHY out of deep low power mode).

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:14:08 -04:00
Florian Fainelli f7f1de51ed net: dsa: start and stop the PHY state machine
dsa_slave_open() should start the PHY library state machine for its PHY
interface, and dsa_slave_close() should stop the PHY library state
machine accordingly.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 17:14:08 -04:00
Peter Pan(潘卫平) 155c6e1ad4 tcp: use tcp_flags in tcp_data_queue()
This patch is a cleanup which follows the idea in commit e11ecddf51 (tcp: use
TCP_SKB_CB(skb)->tcp_flags in input path),
and it may reduce register pressure since skb->cb[] access is fast,
because skb is probably in a register.

v2: remove variable th
v3: reword the changelog

Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 16:37:57 -04:00
Eric Dumazet cd7d8498c9 tcp: change tcp_skb_pcount() location
Our goal is to access no more than one cache line per skb in
a write or receive queue when doing the various walks.

After recent TCP_SKB_CB() reorganizations, it is almost done.

Last part is tcp_skb_pcount() which currently uses
skb_shinfo(skb)->gso_segs, which is a terrible choice, because it needs
3 cache lines in current kernel (skb->head, skb->end, and
shinfo->gso_segs are all in 3 different cache lines, far from skb->cb)

This very simple patch reuses space currently taken by tcp_tw_isn
only in input path, as tcp_skb_pcount is only needed for skb stored in
write queue.

This considerably speeds up tcp_ack(), granted we avoid shinfo->tx_flags
to get SKBTX_ACK_TSTAMP, which seems possible.

This also speeds up all sack processing in general.

This speeds up tcp_sendmsg() because it no longer has to access/dirty
shinfo.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 16:36:48 -04:00
David S. Miller dc83d4d8f6 Merge branch 'tcp_skb_cb'
Eric Dumazet says:

====================
tcp: better TCP_SKB_CB layout

TCP had the assumption that IPCB and IP6CB are first members of skb->cb[]

This is fine, except that IPCB/IP6CB are used in TCP for a very short time
in input path.

What really matters for TCP stack is to get skb->next,
TCP_SKB_CB(skb)->seq, and TCP_SKB_CB(skb)->end_seq in the same cache line.

skb that are immediately consumed do not care because whole skb->cb[] is
hot in cpu cache, while skb that sit in socket write queue or receive queues
do not need TCP_SKB_CB(skb)->header at all.

This patch set implements the prereq for IPv4, IPv6, and TCP to make this
possible. This makes TCP more efficient.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 16:35:49 -04:00
Eric Dumazet 971f10eca1 tcp: better TCP_SKB_CB layout to reduce cache line misses
TCP maintains lists of skb in write queue, and in receive queues
(in order and out of order queues)

Scanning these lists both in input and output path usually requires
access to skb->next, TCP_SKB_CB(skb)->seq, and TCP_SKB_CB(skb)->end_seq

These fields are currently in two different cache lines, meaning we
waste a lot of memory bandwidth when these queues are big and flows
have either packet drops or packet reorders.

We can move TCP_SKB_CB(skb)->header at the end of TCP_SKB_CB, because
this header is not used in fast path. This allows TCP to search much faster
in the skb lists.

Even with regular flows, we save one cache line miss in fast path.

Thanks to Christoph Paasch for noticing we need to cleanup
skb->cb[] (IPCB/IP6CB) before entering IP stack in tx path,
and that I forgot IPCB use in tcp_v4_hnd_req() and tcp_v4_save_options().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 16:35:43 -04:00
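The effect of moving the header to the end of TCP_SKB_CB, as described in the commit above, can be sketched with offsetof(). Every structure below is a hypothetical stand-in with arbitrary field sizes, and a 64-byte cache line is assumed; the point is only that reordering members changes which cache line seq and end_seq land in relative to next.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SKETCH 64   /* assumed cache line size in bytes */

/* Stand-in for IPCB/IP6CB; the 24-byte size is arbitrary. */
struct hdr_sketch { uint8_t bytes[24]; };

/* Old ordering: header first, sequence numbers after it. */
struct cb_old_sketch { struct hdr_sketch header; uint32_t seq; uint32_t end_seq; };

/* New ordering: sequence numbers first, header at the end. */
struct cb_new_sketch { uint32_t seq; uint32_t end_seq; struct hdr_sketch header; };

/* Hypothetical skb prefixes: list pointers, 24 bytes of other state, then cb. */
struct skb_old_sketch { uint64_t next, prev; uint8_t other[24]; struct cb_old_sketch cb; };
struct skb_new_sketch { uint64_t next, prev; uint8_t other[24]; struct cb_new_sketch cb; };

/* Index of the cache line that a given byte offset falls into. */
static size_t line_of(size_t offset)
{
    return offset / CACHE_LINE_SKETCH;
}
```

With these made-up sizes, seq sits on the second cache line in the old layout and on the same line as next in the new one, which mirrors the one-cache-line-per-skb goal the commit describes.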
Eric Dumazet a224772db8 ipv6: add a struct inet6_skb_parm param to ipv6_opt_accepted()
ipv6_opt_accepted() assumes IP6CB(skb) holds the struct inet6_skb_parm
that it needs. Let's not assume this, as TCP stack might use a different
place.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 16:35:43 -04:00
Eric Dumazet 24a2d43d88 ipv4: rename ip_options_echo to __ip_options_echo()
ip_options_echo() assumes struct ip_options is provided in &IPCB(skb)->opt
Let's break this assumption, but provide a helper to not change all call points.

ip_send_unicast_reply() gets a new struct ip_options pointer.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28 16:35:42 -04:00
Eric Dumazet ff04a771ad net : optimize skb_release_data()
Cache skb_shinfo(skb) in a variable to avoid computing it multiple
times.

Reorganize the tests to remove one indentation level.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 16:53:49 -04:00
Alexei Starovoitov cec0831519 sparc: bpf_jit: add support for BPF_LD(X) | BPF_LEN instructions
BPF_LD | BPF_W | BPF_LEN instruction is occasionally used by tcpdump
and present in 11 tests in lib/test_bpf.c
Teach sparc JIT compiler to emit it.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 16:52:09 -04:00
Tobias Klauser 0a29b3dafb net: bcmgenet: Fix compile warning
bcmgenet_wol_resume() is only used in bcmgenet_resume(), which is only
defined when CONFIG_PM_SLEEP is enabled. This leads to the following
compile warning when building with !CONFIG_PM_SLEEP:

drivers/net/ethernet/broadcom/genet/bcmgenet.c:1967:12: warning: ‘bcmgenet_wol_resume’ defined but not used [-Wunused-function]

Since bcmgenet_resume() is the only user of bcmgenet_wol_resume(), fix
this by directly inlining the function there.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 16:49:01 -04:00
Wang Sheng-Hui 8280bf00fd net/openvswitch: remove dup comment in vport.h
Remove the duplicated comment
"/* The following definitions are for users of the vport subsytem: */"
in vport.h

Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 16:42:33 -04:00
David S. Miller b184006050 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2014-09-23

This patch series adds support for the FM10000 Ethernet switch host
interface.  The Intel FM10000 Ethernet Switch is a 48-port Ethernet switch
supporting both Ethernet ports and PCI Express host interfaces.  The fm10k
driver provides support for the host interface portion of the switch, both
PF and VF.

As the host interfaces are directly connected to the switch this results in
some significant differences versus a standard network driver.  For example
there is no PHY or MII on the device.  Since packets are delivered directly
from the switch to the host interface these are unnecessary.  Otherwise most
of the functionality is very similar to our other network drivers such as
ixgbe or igb.  For example we support all the standard network offloads,
jumbo frames, SR-IOV (64 VFS), PTP, and some VXLAN and NVGRE offloads.

v2: converted dev_consume_skb_any() to dev_kfree_skb_any()
    fix up PTP code based on feedback from the community
v3: converted the use of smp_mb__before_clear_bit() to smp_mb__before_atomic()
    added vmalloc header to patch 15
    added prefetch header to patch 16
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 16:23:12 -04:00
LEROY Christophe 58e3cac561 net: optimise inet_proto_csum_replace4()
csum_partial() is a generic function which is not optimised for small fixed
length calculations, and its use requires to store "from" and "to" values in
memory while we already have them available in registers. This also has impact,
especially on RISC processors. In the same spirit as the change done by
Eric Dumazet on csum_replace2(), this patch rewrites inet_proto_csum_replace4()
taking into account RFC1624.

I spotted during a NATted tcp transfer that csum_partial() is one of top 5
consuming functions (around 8%), and the second user of csum_partial() is
inet_proto_csum_replace4().

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 16:14:17 -04:00
LEROY Christophe 4565af0d40 net: optimise csum_replace4()
csum_partial() is a generic function which is not optimised for small fixed
length calculations, and its use requires to store "from" and "to" values in
memory while we already have them available in registers. This also has impact,
especially on RISC processors. In the same spirit as the change done by
Eric Dumazet on csum_replace2(), this patch rewrites csum_replace4()
taking into account RFC1624.

I spotted during a NATted tcp transfer that csum_partial() is one of top 5
consuming functions (around 8%), and the second user of csum_partial() is
inet_proto_csum_replace4().

I have proposed the same modification to inet_proto_csum_replace4() in another
patch.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 16:14:16 -04:00
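The RFC1624 approach referenced in the two checksum commits above can be sketched in portable C. This is an illustration of the incremental update HC' = ~(~HC + ~m + m') applied to a 32-bit field, not the kernel's implementation; the function names are hypothetical, and checksums are handled as plain host-order 16-bit values for simplicity.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a wide one's-complement sum down to 16 bits (end-around carry). */
static uint16_t fold16(uint64_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Reference: Internet checksum recomputed over all 16-bit words. */
static uint16_t csum_full_sketch(const uint16_t *words, size_t nwords)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < nwords; i++)
        sum += words[i];
    return (uint16_t)~fold16(sum);
}

/* RFC 1624, eqn. 3, for a 32-bit field: HC' = ~(~HC + ~m + m').
 * `from` and `to` each hold two adjacent 16-bit words of the packet. */
static uint16_t csum_replace4_sketch(uint16_t check, uint32_t from, uint32_t to)
{
    uint64_t sum = (uint16_t)~check;  /* ~HC */
    sum += (uint32_t)~from;           /* ~m  */
    sum += to;                        /* m'  */
    return (uint16_t)~fold16(sum);
}
```

Because both operands stay in registers and only three additions plus a fold are needed, this avoids calling a generic buffer-summing routine for an 8-byte input, which is the cost the commit message attributes to csum_partial().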
David S. Miller 3290d65553 Merge branch 'fec'
Fugang Duan says:

====================
net: fec: Code cleanup

This patch set does several things:
  - Fixing multiqueue issue.
  - Removing the unnecessary errata workaround.
  - Aligning the data buffer dma map/unmap size.
  - Freeing resource after probe failed.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 16:05:25 -04:00
Nimrod Andy e3c9614f3a net: fec: free resource after phy probe failed
Free memory and disable all related clocks when there is no phy
connection or the phy probe failed.

Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 16:05:21 -04:00
Nimrod Andy b64bf4b7dd net: fec: align rx data buffer size for dma map/unmap
Align allocated rx data buffer size for dma map/unmap, otherwise the
kernel prints a warning when DMA_API_DEBUG is enabled.

Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 16:05:21 -04:00
Nimrod Andy f88c7ede50 net: fec: remove the ERR006358 workaround for imx6sx enet
Remove the ERR006358 workaround for imx6sx enet since the hw issue
was fixed on the SOC.

Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 16:05:21 -04:00
Nimrod Andy befe821335 net: fec: Add Ftype to BD to distinguish three tx queues for AVB
The current driver does not initialize the Ftype field of the BD, which
prevents tx queues #1 and #2 from working properly.

Add Ftype field to BD to distinguish three queues for AVB:
0 -> Best Effort
1 -> ClassA
2 -> ClassB

Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 16:05:21 -04:00
Eric Dumazet f4a775d144 net: introduce __skb_header_release()
While profiling TCP stack, I noticed one useless atomic operation
in tcp_sendmsg(), caused by skb_header_release().

It turns out all current skb_header_release() users have a fresh skb,
that no other user can see, so we can avoid one atomic operation.

Introduce __skb_header_release() to clearly document this.

This gave me a 1.5 % improvement on TCP_RR workload.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 15:40:06 -04:00
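The idea behind the __skb_header_release() commit above can be sketched with C11 atomics. This is a user-space model under assumptions: struct shinfo_sketch, both function names and the DATAREF_SHIFT_SKETCH constant are hypothetical, and the shared counter is assumed to pack two reference counts into one integer.

```c
#include <stdatomic.h>

/* Assumed split point: the upper half of the counter tracks
 * payload-only references, the lower half tracks all references. */
#define DATAREF_SHIFT_SKETCH 16

/* Hypothetical stand-in for the shared info block of a buffer. */
struct shinfo_sketch {
    atomic_int dataref;
};

/* General case: other holders may exist, so an atomic
 * read-modify-write is required. */
static void header_release_shared(struct shinfo_sketch *s)
{
    atomic_fetch_add(&s->dataref, 1 << DATAREF_SHIFT_SKETCH);
}

/* Fresh buffer that no other user can see: the counter is known to
 * be 1, so the final value can be written with a plain relaxed store. */
static void header_release_fresh(struct shinfo_sketch *s)
{
    atomic_store_explicit(&s->dataref,
                          1 + (1 << DATAREF_SHIFT_SKETCH),
                          memory_order_relaxed);
}
```

Both paths leave the counter with the same value when it starts at 1; the second simply skips the locked read-modify-write, which is the saving the commit message reports.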
Fabio Estevam aebac74493 fec: Remove fec_enet_select_queue()
Sparse complains about fec_enet_select_queue() not being static.

Feedback from David Miller [1] was to remove this function instead of making it
static:

"Please just delete this function.

It's overriding code which does exactly the same thing.

Actually, more precisely, this code is duplicating code in a way that
bypasses many core facilities of the networking.  For example, this
override means that socket based flow steering, XPS, etc. are all
not happening on these devices.

Without ->ndo_select_queue(), the flow dissector does __netdev_pick_tx
which is exactly what you want to happen."

[1] http://www.spinics.net/lists/netdev/msg297653.html

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 15:39:59 -04:00
David S. Miller 57219dc7bf Merge tag 'master-2014-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
pull request: wireless-next 2014-09-22

Please pull this batch of updates intended for the 3.18 stream...

For the mac80211 bits, Johannes says:

"This time, I have some rate minstrel improvements, support for a very
small feature from CCX that Steinar reverse-engineered, dynamic ACK
timeout support, a number of changes for TDLS, early support for radio
resource measurement and many fixes. Also, I'm changing a number of
places to clear key memory when it's freed and Intel claims copyright
for code they developed."

For the bluetooth bits, Johan says:

"Here are some more patches intended for 3.18. Most of them are cleanups
or fixes for SMP. The only exception is a fix for BR/EDR L2CAP fixed
channels which should now work better together with the L2CAP
information request procedure."

For the iwlwifi bits, Emmanuel says:

"I fix here dvm which was broken by my last pull request. Arik
continues to work on TDLS and Luca solved a few issues in CT-Kill. Eyal
keeps digging into rate scaling code, more to come soon. Besides this,
nothing really special here."

Beyond that, there are the usual big batches of updates to ath9k, b43,
mwifiex, and wil6210 as well as a handful of other bits here and there.
Also, rtlwifi gets some btcoexist attention from Larry.

Please let me know if there are problems!
====================

Had to adjust the wil6210 code to comply with Joe Perches's recent
change in net-next to make the netdev_*() routines return void instead
of 'int'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 15:39:24 -04:00