Commit graph

519670 commits

Author SHA1 Message Date
John W. Linville
125907ae5e geneve: remove MODULE_ALIAS_RTNL_LINK from net/ipv4/geneve.c
This file is essentially a library for implementing the geneve
encapsulation protocol.  The file does not register any rtnl_link_ops,
so the MODULE_ALIAS_RTNL_LINK macro is inappropriate here.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:59:12 -04:00
Pablo Neira
f0b5e8a42f net: kill useless net_*_ingress_queue() definitions when NET_CLS_ACT is unset
This fixes 4577139b2d ("net: use jump label patching for ingress qdisc in
__netif_receive_skb_core").

The only client of this is sch_ingress and it depends on NET_CLS_ACT. So
there is no way these definition can be of any help.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:44:28 -04:00
David S. Miller
9f0a74d7b6 Merge branch 'packet_rollover'
Willem de Bruijn says:

====================
refine packet socket rollover:

1. mitigate a case of lock contention
2. avoid exporting resource exhaustion to other sockets,
   by migrating only to a victim socket that has ample room
3. avoid reordering of most flows on the socket,
   by migrating first the flow responsible for load imbalance
4. help processes detect load imbalance,
   by exporting rollover counters

Context: rollover implements flow migration in packet socket fanout
groups in case of extreme load imbalance. It is a specific
implementation of migration that minimizes reordering by selecting
the same victim socket when possible (and by selecting subsequent
victims in a round robin fashion, from which its name derives).

Changes:
  v2 -> v3:
    - statistics: replace unsigned long with __aligned_u64
  v1 -> v2:
    - huge flow detection: run lockless
    - huge flow detection: replace stored index with random
    - contention avoidance: test in packet_poll while lock held
    - contention avoidance: clear pressure sooner

          packet_poll and packet_recvmsg would clear only if the sock
          is empty to avoid taking the necessary lock. But,
          * packet_poll already holds this lock, so a lockless variant
            __packet_rcv_has_room is cheap.
          * packet_recvmsg is usually called only for non-ring sockets,
            which also runs lockless.

    - preparation: drop "single return" patch

          packet_rcv_has_room is now a locked wrapper around
          __packet_rcv_has_room, achieving the same (single footer).

The benchmark mentioned in the patches is at
https://github.com/wdebruij/kerneltools/blob/master/tests/bench_rollover.c
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:43:01 -04:00
Willem de Bruijn
a9b6391814 packet: rollover statistics
Rollover indicates exceptional conditions. Export a counter to inform
socket owners of this state.

If no socket with sufficient room is found, rollover fails. Also count
these events.

Finally, also count when flows are rolled over early thanks to huge
flow detection, to validate its correctness.

Tested:
  Read counters in bench_rollover on all other tests in the patchset

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:43:00 -04:00
Willem de Bruijn
3b3a5b0aab packet: rollover huge flows before small flows
Migrate flows from a socket to another socket in the fanout group not
only when the socket is full. Start migrating huge flows early, to
divert possible 4-tuple attacks without affecting normal traffic.

Introduce fanout_flow_is_huge(). This detects huge flows, which are
defined as taking up more than half the load. It does so cheaply, by
storing the rxhashes of the N most recent packets. If over half of
these are the same rxhash as the current packet, then drop it. This
only protects against 4-tuple attacks. N is chosen to fit all data in
a single cache line.

Tested:
  Ran bench_rollover for 10 sec with 1.5 Mpps of single flow input.

    lpbb5:/export/hda3/willemb# ./bench_rollover -l 1000 -r -s
    cpu         rx       rx.k     drop.k   rollover     r.huge   r.failed
      0         14         14          0          0          0          0
      1         20         20          0          0          0          0
      2         16         16          0          0          0          0
      3    6168824    6168824          0    4867721    4867721          0
      4    4867741    4867741          0          0          0          0
      5         12         12          0          0          0          0
      6         15         15          0          0          0          0
      7         17         17          0          0          0          0

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:43:00 -04:00
Willem de Bruijn
2ccdbaa6d5 packet: rollover lock contention avoidance
Rollover has to call packet_rcv_has_room on sockets in the fanout
group to find a socket to migrate to. This operation is expensive
especially if the packet sockets use rings, when a lock has to be
acquired.

Avoid pounding on the lock by all sockets by temporarily marking a
socket as "under memory pressure" when such pressure is detected.
While set, only the socket owner may call packet_rcv_has_room on the
socket. Once it detects normal conditions, it clears the flag. The
socket is not used as a victim by any other socket in the meantime.

Under reasonably balanced load, each socket writer frequently calls
packet_rcv_has_room and clears its own pressure field. As a backup
for when the socket is rarely written to, also clear the flag on
reading (packet_recvmsg, packet_poll) if this can be done cheaply
(i.e., without calling packet_rcv_has_room). This is only for
edge cases.

Tested:
  Ran bench_rollover: a process with 8 sockets in a single fanout
  group, each pinned to a single cpu that receives one nic recv
  interrupt. RPS and RFS are disabled. The benchmark uses packet
  rx_ring, which has to take a lock when determining whether a
  socket has room.

  Sent 3.5 Mpps of UDP traffic with sufficient entropy to spread
  uniformly across the packet sockets (and inserted an iptables
  rule to drop in PREROUTING to avoid protocol stack processing).

  Without this patch, all sockets try to migrate traffic to
  neighbors, causing lock contention when searching for a non-
  empty neighbor. The lock is the top 9 entries.

    perf record -a -g sleep 5

    -  17.82%   bench_rollover  [kernel.kallsyms]    [k] _raw_spin_lock
       - _raw_spin_lock
          - 99.00% spin_lock
    	 + 81.77% packet_rcv_has_room.isra.41
    	 + 18.23% tpacket_rcv
          + 0.84% packet_rcv_has_room.isra.41
    +   5.20%      ksoftirqd/6  [kernel.kallsyms]    [k] _raw_spin_lock
    +   5.15%      ksoftirqd/1  [kernel.kallsyms]    [k] _raw_spin_lock
    +   5.14%      ksoftirqd/2  [kernel.kallsyms]    [k] _raw_spin_lock
    +   5.12%      ksoftirqd/7  [kernel.kallsyms]    [k] _raw_spin_lock
    +   5.12%      ksoftirqd/5  [kernel.kallsyms]    [k] _raw_spin_lock
    +   5.10%      ksoftirqd/4  [kernel.kallsyms]    [k] _raw_spin_lock
    +   4.66%      ksoftirqd/0  [kernel.kallsyms]    [k] _raw_spin_lock
    +   4.45%      ksoftirqd/3  [kernel.kallsyms]    [k] _raw_spin_lock
    +   1.55%   bench_rollover  [kernel.kallsyms]    [k] packet_rcv_has_room.isra.41

  On net-next with this patch, this lock contention is no longer a
  top entry. Most time is spent in the actual read function. Next up
  are other locks:

    +  15.52%  bench_rollover  bench_rollover     [.] reader
    +   4.68%         swapper  [kernel.kallsyms]  [k] memcpy_erms
    +   2.77%         swapper  [kernel.kallsyms]  [k] packet_lookup_frame.isra.51
    +   2.56%     ksoftirqd/1  [kernel.kallsyms]  [k] memcpy_erms
    +   2.16%         swapper  [kernel.kallsyms]  [k] tpacket_rcv
    +   1.93%         swapper  [kernel.kallsyms]  [k] mlx4_en_process_rx_cq

  Looking closer at the remaining _raw_spin_lock, the cost of probing
  in rollover is now comparable to the cost of taking the lock later
  in tpacket_rcv.

    -   1.51%         swapper  [kernel.kallsyms]  [k] _raw_spin_lock
       - _raw_spin_lock
          + 33.41% packet_rcv_has_room
          + 28.15% tpacket_rcv
          + 19.54% enqueue_to_backlog
          + 6.45% __free_pages_ok
          + 2.78% packet_rcv_fanout
          + 2.13% fanout_demux_rollover
          + 2.01% netif_receive_skb_internal

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:43:00 -04:00
Willem de Bruijn
9954729bc3 packet: rollover only to socket with headroom
Only migrate flows to sockets that have sufficient headroom, where
sufficient is defined as having at least 25% empty space.

The kernel has three different buffer types: a regular socket, a ring
with frames (TPACKET_V[12]) or a ring with blocks (TPACKET_V3). The
latter two do not expose a read pointer to the kernel, so headroom is
not computed easily. All three needs a different implementation to
estimate free space.

Tested:
  Ran bench_rollover for 10 sec with 1.5 Mpps of single flow input.

  bench_rollover has as many sockets as there are NIC receive queues
  in the system. Each socket is owned by a process that is pinned to
  one of the receive cpus. RFS is disabled. RPS is enabled with an
  identity mapping (cpu x -> cpu x), to count drops with softnettop.

    lpbb5:/export/hda3/willemb# ./bench_rollover -r -l 1000 -s
    Press [Enter] to exit

    cpu         rx       rx.k     drop.k   rollover     r.huge   r.failed
      0         16         16          0          0          0          0
      1         21         21          0          0          0          0
      2    5227502    5227502          0          0          0          0
      3         18         18          0          0          0          0
      4    6083289    6083289          0    5227496          0          0
      5         22         22          0          0          0          0
      6         21         21          0          0          0          0
      7          9          9          0          0          0          0

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:42:59 -04:00
Willem de Bruijn
0648ab70af packet: rollover prepare: per-socket state
Replace rollover state per fanout group with state per socket. Future
patches will add fields to the new structure.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:42:59 -04:00
Willem de Bruijn
ad377cab49 packet: rollover prepare: move code out of callsites
packet_rcv_fanout calls fanout_demux_rollover twice. Move all rollover
logic into the callee to simplify these callsites, especially with
upcoming changes.

The main differences between the two callsites is that the FLAG
variant tests whether the socket previously selected by another
mode (RR, RND, HASH, ..) has room before migrating flows, whereas the
rollover mode has no original socket to test.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:42:59 -04:00
Eric Dumazet
7d771aaac7 ipv4: __ip_local_out_sk() is static
__ip_local_out_sk() is only used from net/ipv4/ip_output.c

net/ipv4/ip_output.c:94:5: warning: symbol '__ip_local_out_sk' was not
declared. Should it be static?

Fixes: 7026b1ddb6 ("netfilter: Pass socket pointer down through okfn().")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:21:33 -04:00
Eric Dumazet
216f8bb9f6 tcp/dccp: tw_timer_handler() is static
tw_timer_handler() is only used from net/ipv4/inet_timewait_sock.c

Fixes: 789f558cfb ("tcp/dccp: get rid of central timewait timer")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:21:33 -04:00
David S. Miller
dd58c6359b Merge branch 'cls_flower'
Jiri Pirko says:

====================
introduce programable flow dissector and cls_flower

Per Davem's request, I prepared this patchset which introduces programmable
flow dissector. For current users of flow_keys, there is a wrapper
skb_flow_dissect_flow_keys which maintains the previous behaviour.
For purposes of cls_flower, couple of new dissection keys were introduced.

Note that this dissector can be also eventually used by openvswitch code.

Also, as a next step, I plan to get rid of *skb_flow_get_ports(export)
and *__skb_get_poff as their functionality can be now implemented by
skb_flow_dissect as well.

v2->v3:
- remove TCA_FLOWER_POLICE attr suggested by Jamal

v1->v2:
- move __skb_tx_hash rather to dev.c as suggested by Alex
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:48 -04:00
Jiri Pirko
77b9900ef5 tc: introduce Flower classifier
This patch introduces a flow-based filter. So far, the very essential
packet fields are supported.

This patch is only the first step. There is a lot of potential performance
improvements possible to implement. Also a lot of features are missing
now. They will be addressed in follow-up patches.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:48 -04:00
Jiri Pirko
59346afe7a flow_dissector: change port array into src, dst tuple
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:47 -04:00
Jiri Pirko
67a900cc04 flow_dissector: introduce support for Ethernet addresses
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:47 -04:00
Jiri Pirko
b924933cbb flow_dissector: introduce support for ipv6 addressses
So far, only hashes made out of ipv6 addresses could be dissected. This
patch introduces support for dissection of full ipv6 addresses.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:47 -04:00
Jiri Pirko
c3f8eaeb6e flow_dissector: add missing header includes
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:47 -04:00
Jiri Pirko
06635a35d1 flow_dissect: use programable dissector in skb_flow_dissect and friends
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:47 -04:00
Jiri Pirko
fbff949e3b flow_dissector: introduce programable flow_dissector
Introduce dissector infrastructure which allows user to specify which
parts of skb he wants to dissect.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:47 -04:00
Jiri Pirko
0db89b8b32 flow_dissector: fix doc for skb_get_poff
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:46 -04:00
Jiri Pirko
638b2a699f net: move netdev_pick_tx and dependencies to net/core/dev.c
next to its user. No relation to flow_dissector so it makes no sense to
have it in flow_dissector.c

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:46 -04:00
Jiri Pirko
5605c76240 net: move __skb_tx_hash to dev.c
__skb_tx_hash function has no relation to flow_dissect so just move it
to dev.c

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:46 -04:00
Jiri Pirko
9c684b5083 net: move __skb_get_hash function declaration to flow_dissector.h
Since the definition of the function is in flow_dissector.c, it makes
sense to have the declaration in flow_dissector.h

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:46 -04:00
Jiri Pirko
d4fd327571 flow_dissector: fix doc for __skb_get_hash and remove couple of empty lines
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:46 -04:00
Jiri Pirko
10b89ee43e net: move *skb_get_poff declarations into correct header
Since these functions are defined in flow_dissector.c, move header
declarations from skbuff.h into flow_dissector.h

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:45 -04:00
Jiri Pirko
b0a31431b4 flow_dissector: remove unused function flow_get_hlen declaration
commit 56193d1bce ("net: Add function for parsing the header length out
of linear ethernet frames") added this function declaration but it is
defined nowhere.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:45 -04:00
Jiri Pirko
1bd758eb1c net: change name of flow_dissector header to match the .c file name
add couple of empty lines on the way.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:19:45 -04:00
David S. Miller
212da1fa60 Merge branch 'sfc-next'
Edward Cree says:

====================
sfc: Bowdlerise PTP MCDI errors

When the NIC doesn't support PTP, probe-time MCDI commands fail in
predictable ways.  Instead of logging cryptic MCDI errors, just log that
PTP isn't supported.

v2: Hopefully stop Thunderbird mangling the patches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:11:52 -04:00
Edward Cree
3de1b5137c sfc: suppress some MCDI error messages in PTP
Also, remove a needless netif_err() from efx_ptp_update_stats() - if the
 MCDI fails it'll print its own error message, we don't need another that
 adds no information.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:11:51 -04:00
Edward Cree
b133638909 sfc: nicer log message on PTP probe fail
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:11:51 -04:00
Florian Westphal
e578d9c025 net: sched: use counter to break reclassify loops
Seems all we want here is to avoid endless 'goto reclassify' loop.
tc_classify_compat even resets this counter when something other
than TC_ACT_RECLASSIFY is returned, so this skb-counter doesn't
break hypothetical loops induced by something other than perpetual
TC_ACT_RECLASSIFY return values.

skb_act_clone is now identical to skb_clone, so just use that.

Tested with following (bogus) filter:
tc filter add dev eth0 parent ffff: \
 protocol ip u32 match u32 0 0 police rate 10Kbit burst \
 64000 mtu 1500 action reclassify

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 15:08:14 -04:00
David S. Miller
b04096ff33 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Four minor merge conflicts:

1) qca_spi.c renamed the local variable used for the SPI device
   from spi_device to spi, meanwhile the spi_set_drvdata() call
   got moved further up in the probe function.

2) Two changes were both adding new members to codel params
   structure, and thus we had overlapping changes to the
   initializer function.

3) 'net' was making a fix to sk_release_kernel() which is
   completely removed in 'net-next'.

4) In net_namespace.c, the rtnl_net_fill() call for GET operations
   had the command value fixed, meanwhile 'net-next' adjusted the
   argument signature a bit.

This also matches example merge resolutions posted by Stephen
Rothwell over the past two days.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 14:31:43 -04:00
Justin Cormack
7f460d30c8 fix missing copy_from_user in macvtap
Fix missing copy_from_user in macvtap SIOCSIFHWADDR ioctl.

Signed-off-by: Justin Cormack <justin@netbsd.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 14:21:36 -04:00
Scott Feldman
42275bd8fc switchdev: don't use anonymous union on switchdev attr/obj structs
Older gcc versions (e.g.  gcc version 4.4.6) don't like anonymous unions
which was causing build issues on the newly added switchdev attr/obj
structs.  Fix this by using named union on structs.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 14:20:59 -04:00
David S. Miller
1f7bd29bc0 Merge branch 'switchdev-cleanups'
Scott Feldman says:

====================
switchdev: more (minor) cleanups

Fix some sparse warnings and include some documentation review comments that
didn't get picked up in the switchdev Spring Cleanup series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 12:26:28 -04:00
Scott Feldman
1f5dc44c88 switchdev: apply review comments on documentation
There were a few review comments on the switchdev.txt documentation that
didn't get included with the Spring Cleanup series, so include them now.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 12:26:28 -04:00
Scott Feldman
5eb764edee switchdev: align comment with other comments in block
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 12:26:27 -04:00
Scott Feldman
7a7ee5312d switchdev: sparse warning: pass ipv4 fib dst as network-byte order
And let driver convert it to host-byte order as needed.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 12:26:27 -04:00
Scott Feldman
22c1f67ea5 switchdev: sparse warning: make __switchdev_port_obj_add static
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13 12:26:27 -04:00
Linus Torvalds
110bc76729 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Handle max TX power properly wrt VIFs and the MAC in iwlwifi, from
    Avri Altman.

 2) Use the correct FW API for scan completions in iwlwifi, from Avraham
    Stern.

 3) FW monitor in iwlwifi accidently uses unmapped memory, fix from Liad
    Kaufman.

 4) rhashtable conversion of mac80211 station table was buggy, the
    virtual interface was not taken into account.  Fix from Johannes
    Berg.

 5) Fix deadlock in rtlwifi by not using a zero timeout for
    usb_control_msg(), from Larry Finger.

 6) Update reordering state before calculating loss detection, from
    Yuchung Cheng.

 7) Fix off by one in bluetooth firmward parsing, from Dan Carpenter.

 8) Fix extended frame handling in xiling_can driver, from Jeppe
    Ledet-Pedersen.

 9) Fix CODEL packet scheduler behavior in the presence of TSO packets,
    from Eric Dumazet.

10) Fix NAPI budget testing in fm10k driver, from Alexander Duyck.

11) macvlan needs to propagate promisc settings down the the lower
    device, from Vlad Yasevich.

12) igb driver can oops when changing number of rings, from Toshiaki
    Makita.

13) Source specific default routes not handled properly in ipv6, from
    Markus Stenberg.

14) Use after free in tc_ctl_tfilter(), from WANG Cong.

15) Use softirq spinlocking in netxen driver, from Tony Camuso.

16) Two ARM bpf JIT fixes from Nicolas Schichan.

17) Handle MSG_DONTWAIT properly in ring based AF_PACKET sends, from
    Mathias Kretschmer.

18) Fix x86 bpf JIT implementation of FROM_{BE16,LE16,LE32}, from Alexei
    Starovoitov.

19) ll_temac driver DMA maps TX packet header with incorrect length, fix
    from Michal Simek.

20) We removed pm_qos bits from netdevice.h, but some indirect
    references remained.  Kill them.  From David Ahern.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (90 commits)
  net: Remove remaining remnants of pm_qos from netdevice.h
  e1000e: Add pm_qos header
  net: phy: micrel: Fix regression in kszphy_probe
  net: ll_temac: Fix DMA map size bug
  x86: bpf_jit: fix FROM_BE16 and FROM_LE16/32 instructions
  netns: return RTM_NEWNSID instead of RTM_GETNSID on a get
  Update be2net maintainers' email addresses
  net_sched: gred: use correct backlog value in WRED mode
  pppoe: drop pppoe device in pppoe_unbind_sock_work
  net: qca_spi: Fix possible race during probe
  net: mdio-gpio: Allow for unspecified bus id
  af_packet / TX_RING not fully non-blocking (w/ MSG_DONTWAIT).
  bnx2x: limit fw delay in kdump to 5s after boot
  ARM: net: delegate filter to kernel interpreter when imm_offset() return value can't fit into 12bits.
  ARM: net fix emit_udiv() for BPF_ALU | BPF_DIV | BPF_K intruction.
  mpls: Change reserved label names to be consistent with netbsd
  usbnet: avoid integer overflow in start_xmit
  netxen_nic: use spin_[un]lock_bh around tx_clean_lock (2)
  net: xgene_enet: Set hardware dependency
  net: amd-xgbe: Add hardware dependency
  ...
2015-05-12 21:10:38 -07:00
David Ahern
01d460dd70 net: Remove remaining remnants of pm_qos from netdevice.h
Commit e2c6544829 removed pm_qos from struct net_device but left the
comment and header file. Remove those.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:22:03 -04:00
David Ahern
5684044f80 e1000e: Add pm_qos header
Commit e2c6544829 moved pm_qos_req to e1000_adapter. Add the header file
that defines the struct.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:22:03 -04:00
Ying Xue
9449c3cd90 net: make skb_dst_pop routine static
As xfrm_output_one() is the only caller of skb_dst_pop(), we should
make skb_dst_pop() localized.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:19:49 -04:00
Niklas Cassel
bced870152 net: phy: micrel: Fix regression in kszphy_probe
Don't do clock-mode-select if clk == NULL,
since when building without CONFIG_HAVE_CLK,
clk_get returns NULL and clk_get_rate returns 0.

Doing clock-mode-select in this cause causes kszphy_probe to
return -EINVAL and thus prevents the device from being probed.

The original code (before regression) would return 0
when building without CONFIG_HAVE_CLK.

Cc: stable <stable@vger.kernel.org> # 3.18+
Fixes: 1fadee0c36 ("net/phy: micrel: Add clock support for
KSZ8021/KSZ8031")
Reviewed-by: Fabio Estevam <fabio.estevam@freescale.com>
Reviewed-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Niklas Cassel <niklass@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:18:40 -04:00
Michal Simek
44d4f8d74e net: ll_temac: Fix DMA map size bug
DMA allocates skb->len instead of headlen
which is used for DMA.

Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:17:42 -04:00
Michael Holzheu
cffc642d93 test_bpf: add 173 new testcases for eBPF
add an exhaustive set of eBPF tests bringing total to:
test_bpf: Summary: 233 PASSED, 0 FAILED, [0/226 JIT'ed]

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:15:25 -04:00
Brenden Blanco
b88c06e36d samples/bpf: fix in-source build of samples with clang
in-source build of 'make samples/bpf/' was incorrectly
using default compiler instead of invoking clang/llvm.
out-of-source build was ok.

Fixes: a80857822b ("samples: bpf: trivial eBPF program in C")
Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:15:25 -04:00
Alexei Starovoitov
343f845b37 x86: bpf_jit: fix FROM_BE16 and FROM_LE16/32 instructions
FROM_BE16:
'ror %reg, 8' doesn't clear upper bits of the register,
so use additional 'movzwl' insn to zero extend 16 bits into 64

FROM_LE16:
should zero extend lower 16 bits into 64 bit

FROM_LE32:
should zero extend lower 32 bits into 64 bit

Fixes: 89aa075832 ("net: sock: allow eBPF programs to be attached to sockets")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:13:08 -04:00
Hariprasad Shenai
1ecc7b7a59 cxgb4/cxgb4vf: Cleanup macros, add comments and add new MACROS
Cleanup few MACROS left out in t4_hw.h to be consistent with the
existing ones. Also replace few hardcoded values with MACROS. Also
update comments for some code

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:11:40 -04:00
KY Srinivasan
82fa3c776e hv_netvsc: Use the xmit_more skb flag to optimize signaling the host
Based on the information given to this driver (via the xmit_more skb flag),
we can defer signaling the host if more packets are on the way. This will help
make the host more efficient since it can potentially process a larger batch of
packets. Implement this optimization.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:10:43 -04:00