Commit graph

783 commits

Author SHA1 Message Date
Tom Tucker 1711386c62 svcrdma: Move the QP and cm_id destruction to svc_rdma_free
Move the destruction of the QP and CM_ID to the free path so that the
QP cleanup code doesn't race with the dto_tasklet handling flushed WR.
The QP reference is not needed because we now have a reference for
every WR.

Also add a guard in the SQ and RQ completion handlers to ignore
calls generated by some providers when the QP is destroyed.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:56 -05:00
Tom Tucker 0905c0f0a2 svcrdma: Add reference for each SQ/RQ WR
Add a reference on the transport for every outstanding WR.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:55 -05:00
Tom Tucker 8da91ea8de svcrdma: Move destroy to kernel thread
Some providers may wait while destroying adapter resources.
Since it is possible that the last reference is put on the
dto_tasklet, the actual destroy must be scheduled as a work item.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:54 -05:00
Tom Tucker 47698e083e svcrdma: Shrink scope of spinlock on RQ CQ
The rq_cq_reap function is only called from the dto_tasklet. The
only resource shared with other threads is the sc_rq_dto_q. Move the
spin lock to protect only this list.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:53 -05:00
Tom Tucker 8740767376 svcrdma: Use standard Linux lists for context cache
Replace the one-off linked list implementation used to implement the
context cache with the standard Linux list_head lists. Add a context
counter to catch resource leaks. A WARN_ON will be added later to
ensure that we've freed all contexts.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:52 -05:00
Tom Tucker 02e7452de7 svcrdma: Simplify RDMA_READ deferral buffer management
An NFS_WRITE requires a set of RDMA_READ requests to fetch the write
data from the client. There are two principal pieces of data that
need to be tracked: the list of pages that comprise the completed RPC
and the SGE of dma mapped pages to refer to this list of pages. Previously
this whole bit was managed as a linked list of contexts with the
context containing the page list buried in this list. This patch
simplifies this processing by not keeping a linked list, but rather only
a pionter from the last submitted RDMA_READ's context to the context
that maps the set of pages that describe the RPC.  This significantly
simplifies this code path. SGE contexts are cleaned up inline in the DTO
path instead of at read completion time.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:51 -05:00
Tom Tucker 10a38c33f4 svcrdma: Remove unused READ_DONE context flags bit
The RDMACTXT_F_READ_DONE bit is not longer used. Remove it.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:50 -05:00
Tom Tucker d16d40093a svcrdma: Return error from rdma_read_xdr so caller knows to free context
The rdma_read_xdr function did not discriminate between no read-list and
an error posting the read-list. This results in a leak of a page if there
is an error posting the read-list.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:49 -05:00
Tom Tucker 58e8f62137 svcrdma: Fix error handling during listening endpoint creation
A listening endpoint isn't known to the generic transport switch until
the svc_create_xprt function returns without error. Calling
svc_xprt_put within the xpo_create function causes the module reference
count to be erroneously decremented.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:48 -05:00
Tom Tucker 5ac461a6f0 svcrdma: Free context on post_recv error in send_reply
If an error is encountered trying to post a recv buffer in send_reply,
free the passed in context. Return an error to the caller so it is
aware that the request was not posted.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:47 -05:00
Tom Tucker 05a0826a6e svcrdma: Free context on ib_post_recv error
If there is an error posting the recv WR to the RQ, free the
context associated with the WR. This would leak a context when
asynchronous errors occurred on the transport while conccurent threads
were processing their RPC.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:47 -05:00
Tom Tucker 120693d12c svcrdma: Add put of connection ESTABLISHED reference in rdma_cma_handler
The svcrdma transport takes a reference when it gets the ESTABLISHED
event from the provider. This reference is supposed to be removed when
the DISCONNECT event is received, however, the call to svc_xprt_put
was missing in the switch statement. This results in the memory
associated with the transport never being freed.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:46 -05:00
Tom Tucker 9d6347acd2 svcrdma: Fix return value in svc_rdma_send
Fix the return value on close to -ENOTCONN so caller knows to free context.
Also if a thread is waiting for free SQ space, check for close when waking
to avoid posting WR to a closing transport.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:45 -05:00
Tom Tucker dbcd00eba9 svcrdma: Fix race with dto_tasklet in svc_rdma_send
The svc_rdma_send function will attempt to reap SQ WR to make room for
a new request if it finds the SQ full. This function races with the
dto_tasklet that also reaps SQ WR. To avoid polling and arming the CQ
unnecessarily move the test_and_clear_bit of the RDMAXPRT_SQ_PENDING
flag and arming of the CQ to the sq_cq_reap function.

Refactor the rq_cq_reap function to match sq_cq_reap so that the
code is easier to follow.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:44 -05:00
Tom Tucker 0e7f011a19 svcrdma: Simplify receive buffer posting
The svcrdma transport provider currently allocates receive buffers
to the RQ through the xpo_release_rqst method. This approach is overly
complicated since it means that the rqstp rq_xprt_ctxt has to be
selectively set based on whether the RPC is going to be processed
immediately or deferred. Instead, just post the receive buffer when
we are certain that we are replying in the send_reply function.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:43 -05:00
Tom Tucker aa3314c8d6 svc: Remove unused header files from svc_xprt.c
This cosmetic patch removes unused header files that svc_xprt.c
inherited from svcsock.c

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:42 -05:00
Tom Tucker fc63a05086 svc: Remove extra check for XPT_DEAD bit in svc_xprt_enqueue
Remove a redundant check for the XPT_DEAD bit in the svc_xprt_enqueue
function. This same bit is checked below while holding the pool lock
and prints a debug message if found to be dead.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:41 -05:00
Huang Weiyi 625fc3a375 Remove duplicated include in net/sunrpc/svc.c
<linux/sched.h> we included twice.

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08 10:58:25 -07:00
Denis V. Lunev e7fe23363b sunrpc: assign PDE->data before gluing PDE into /proc tree
Simply replace proc_create and further data assigned with proc_create_data.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 02:44:36 -07:00
Linus Torvalds 77a50df2b1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  iwlwifi: Allow building iwl3945 without iwl4965.
  wireless: Fix compile error with wifi & leds
  tcp: Fix slab corruption with ipv6 and tcp6fuzz
  ipv4/ipv6 compat: Fix SSM applications on 64bit kernels.
  [IPSEC]: Use digest_null directly for auth
  sunrpc: fix missing kernel-doc
  can: Fix copy_from_user() results interpretation
  Revert "ipv6: Fix typo in net/ipv6/Kconfig"
  tipc: endianness annotations
  ipv6: result of csum_fold() is already 16bit, no need to cast
  [XFRM] AUDIT: Fix flowlabel text format ambibuity.
2008-04-28 09:44:11 -07:00
Randy Dunlap 0b80ae4201 sunrpc: fix missing kernel-doc
Fix missing sunrpc kernel-doc:

Warning(linux-2.6.25-git7//net/sunrpc/xprt.c:451): No description found for parameter 'action'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-27 14:26:52 -07:00
Linus Torvalds 563307b2fa Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (80 commits)
  SUNRPC: Invalidate the RPCSEC_GSS session if the server dropped the request
  make nfs_automount_list static
  NFS: remove duplicate flags assignment from nfs_validate_mount_data
  NFS - fix potential NULL pointer dereference v2
  SUNRPC: Don't change the RPCSEC_GSS context on a credential that is in use
  SUNRPC: Fix a race in gss_refresh_upcall()
  SUNRPC: Don't disconnect more than once if retransmitting NFSv4 requests
  SUNRPC: Remove the unused export of xprt_force_disconnect
  SUNRPC: remove XS_SENDMSG_RETRY
  SUNRPC: Protect creds against early garbage collection
  NFSv4: Attempt to use machine credentials in SETCLIENTID calls
  NFSv4: Reintroduce machine creds
  NFSv4: Don't use cred->cr_ops->cr_name in nfs4_proc_setclientid()
  nfs: fix printout of multiword bitfields
  nfs: return negative error value from nfs{,4}_stat_to_errno
  NLM/lockd: Ensure client locking calls use correct credentials
  NFS: Remove the buggy lock-if-signalled case from do_setlk()
  NLM/lockd: Fix a race when cancelling a blocking lock
  NLM/lockd: Ensure that nlmclnt_cancel() returns results of the CANCEL call
  NLM: Remove the signal masking in nlmclnt_proc/nlmclnt_cancel
  ...
2008-04-24 11:46:16 -07:00
Trond Myklebust 233607dbbc Merge branch 'devel' 2008-04-24 14:01:02 -04:00
Trond Myklebust b48633bd08 SUNRPC: Invalidate the RPCSEC_GSS session if the server dropped the request
RFC 2203 requires the server to drop the request if it believes the
RPCSEC_GSS context is out of sequence. The problem is that we have no way
on the client to know why the server dropped the request. In order to avoid
spinning forever trying to resend the request, the safe approach is
therefore to always invalidate the RPCSEC_GSS context on every major
timeout.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-24 13:53:46 -04:00
Chuck Lever 0dc220f081 SUNRPC: Use unsigned loop and array index in svc_init_buffer()
Clean up: Suppress a harmless compiler warning.

Index rq_pages[] with an unsigned type.  Make "pages" unsigned as well,
as it never represents a value less than zero.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:43 -04:00
Chuck Lever 50c8bb13ea SUNRPC: Use unsigned index when looping over arrays
Clean up: Suppress a harmless compiler warning in the RPC server related
to array indices.

ARRAY_SIZE() returns a size_t, so use unsigned type for a loop index when
looping over arrays.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:43 -04:00
Chuck Lever c0401ea008 SUNRPC: Update RPC server's TCP record marker decoder
Clean up: Update the RPC server's TCP record marker decoder to match the
constructs used by the RPC client's TCP socket transport.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:43 -04:00
Chuck Lever b7872fe86d SUNRPC: RPC server still uses 2.4 method for disabling TCP Nagle
Use the 2.6 method for disabling TCP Nagle in the kernel's RPC server.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:43 -04:00
Jeff Layton 8774282c4c SUNRPC: remove svc_create_thread()
Now that the nfs4 callback thread uses the kthread API, there are no
more users of svc_create_thread(). Remove it.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:42 -04:00
Kevin Coffman 4ab4b0bedd sunrpc: make token header values less confusing
g_make_token_header() and g_token_size() add two too many, and
therefore their callers pass in "(logical_value - 2)" rather
than "logical_value" as hard-coded values which causes confusion.

This dates back to the original g_make_token_header which took an
optional token type (token_id) value and added it to the token.
This was removed, but the routine always adds room for the token_id
rather than not.

Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:41 -04:00
Kevin Coffman 5743d65c2f gss_krb5: consistently use unsigned for seqnum
Consistently use unsigned (u32 vs. s32) for seqnum.

In get_mic function, send the local copy of seq_send,
rather than the context version.

Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:41 -04:00
Andrew Morton c3bb257d2d net/sunrpc/svc.c: suppress unintialized var warning
net/sunrpc/svc.c: In function '__svc_create_thread':
net/sunrpc/svc.c:587: warning: 'oldmask.bits[0u]' may be used uninitialized in this function

Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Tom Tucker <tom@opengridcomputing.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:40 -04:00
Kevin Coffman 30aef3166a Remove define for KRB5_CKSUM_LENGTH, which will become enctype-dependent
cleanup: When adding new encryption types, the checksum length
can be different for each enctype.  Face the fact that the
current code only supports DES which has a checksum length of 8.

Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:40 -04:00
Kevin Coffman 3d4a688678 Correct grammer/typos in dprintks
cleanup:  Fix grammer/typos to use "too" instead of "to"

Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:40 -04:00
Tom Tucker 830bb59b6e SVCRDMA: Add check for XPT_CLOSE in svc_rdma_send
SVCRDMA: Add check for XPT_CLOSE in svc_rdma_send

The svcrdma transport can crash if a send is waiting for an
empty SQ slot and the connection is closed due to an asynchronous error.
The crash is caused when svc_rdma_send attempts to send on a deleted
QP.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:40 -04:00
Harshula Jayasuriya dd35210e1e sunrpc: GSS integrity and decryption failures should return GARBAGE_ARGS
In function svcauth_gss_accept() (net/sunrpc/auth_gss/svcauth_gss.c) the
code that handles GSS integrity and decryption failures should be
returning GARBAGE_ARGS as specified in RFC 2203, sections 5.3.3.4.2 and
5.3.3.4.3.

Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: Harshula Jayasuriya <harshula@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:39 -04:00
Jeff Layton 7b54fe61ff SUNRPC: allow svc_recv to break out of 500ms sleep when alloc_page fails
svc_recv() calls alloc_page(), and if it fails it does a 500ms
uninterruptible sleep and then reattempts. There doesn't seem to be any
real reason for this to be uninterruptible, so change it to an
interruptible sleep. Also check for kthread_stop() and signalled() after
setting the task state to avoid races that might lead to sleeping after
kthread_stop() wakes up the task.

I've done some very basic smoke testing with this, but obviously it's
hard to test the actual changes since this all depends on an
alloc_page() call failing.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:38 -04:00
J. Bruce Fields 67eb6ff610 svcrpc: move unused field from cache_deferred_req
This field is set once and never used; probably some artifact of an
earlier implementation idea.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:37 -04:00
Aurélien Charbon f15364bd4c IPv6 support for NFS server export caches
This adds IPv6 support to the interfaces that are used to express nfsd
exports.  All addressed are stored internally as IPv6; backwards
compatibility is maintained using mapped addresses.

Thanks to Bruce Fields, Brian Haley, Neil Brown and Hideaki Joshifuji
for comments

Signed-off-by: Aurelien Charbon <aurelien.charbon@bull.net>
Cc: Neil Brown <neilb@suse.de>
Cc: Brian Haley <brian.haley@hp.com>
Cc:  YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:36 -04:00
Jeff Layton 7086721f9c SUNRPC: have svc_recv() check kthread_should_stop()
When using kthreads that call into svc_recv, we want to make sure that
they do not block there for a long time when we're trying to take down
the kthread.

This patch changes svc_recv() to check kthread_should_stop() at the same
places that it checks to see if it's signalled(). Also check just before
svc_recv() tries to schedule(). By making sure that we check it just
after setting the task state we can avoid having to use any locking or
signalling to ensure it doesn't block for a long time.

There's still a chance of a 500ms sleep if alloc_page() fails, but
that should be a rare occurrence and isn't a terribly long time in
the context of a kthread being taken down.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:36 -04:00
Jeff Layton 23d42ee278 SUNRPC: export svc_sock_update_bufs
Needed since the plan is to not have a svc_create_thread helper and to
have current users of that function just call kthread_run directly.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:35 -04:00
Trond Myklebust cd019f7517 SUNRPC: Don't change the RPCSEC_GSS context on a credential that is in use
When a server rejects our credential with an AUTH_REJECTEDCRED or similar,
we need to refresh the credential and then retry the request.
However, we do want to allow any requests that are in flight to finish
executing, so that we can at least attempt to process the replies that
depend on this instance of the credential.

The solution is to ensure that gss_refresh() looks up an entirely new
RPCSEC_GSS credential instead of attempting to create a context for the
existing invalid credential.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:55:19 -04:00
Trond Myklebust 7b6962b0a6 SUNRPC: Fix a race in gss_refresh_upcall()
If the downcall completes before we get the spin_lock then we currently
fail to refresh the credential.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:55:15 -04:00
Trond Myklebust 7c1d71cf56 SUNRPC: Don't disconnect more than once if retransmitting NFSv4 requests
NFSv4 requires us to ensure that we break the TCP connection before we're
allowed to retransmit a request. However in the case where we're
retransmitting several requests that have been sent on the same
connection, we need to ensure that we don't interfere with the attempt to
reconnect and/or break the connection again once it has been established.

We therefore introduce a 'connection' cookie that is bumped every time a
connection is broken. This allows requests to track if they need to force a
disconnection.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:55:12 -04:00
Trond Myklebust 636ac43318 SUNRPC: Remove the unused export of xprt_force_disconnect
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:55:08 -04:00
Trond Myklebust 06b4b681ab SUNRPC: remove XS_SENDMSG_RETRY
The condition for exiting from the loop in xs_tcp_send_request() should be
that we find we're not making progress (i.e. number of bytes sent is 0).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:55:05 -04:00
Trond Myklebust d2b8314163 SUNRPC: Protect creds against early garbage collection
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:55:02 -04:00
Trond Myklebust 7c67db3a8a NFSv4: Reintroduce machine creds
We need to try to ensure that we always use the same credentials whenever
we re-establish the clientid on the server. If not, the server won't
recognise that we're the same client, and so may not allow us to recover
state.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:54:56 -04:00
Trond Myklebust 78ea323be6 NFSv4: Don't use cred->cr_ops->cr_name in nfs4_proc_setclientid()
With the recent change to generic creds, we can no longer use
cred->cr_ops->cr_name to distinguish between RPCSEC_GSS principals and
AUTH_SYS/AUTH_NULL identities. Replace it with the rpc_authops->au_name
instead...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:54:53 -04:00
Trond Myklebust 1e799b673c SUNRPC: Fix read ordering problems with req->rq_private_buf.len
We want to ensure that req->rq_private_buf.len is updated before
req->rq_received, so that call_decode() doesn't use an old value for
req->rq_rcv_buf.len.

In 'call_decode()' itself, instead of using task->tk_status (which is set
using req->rq_received) must use the actual value of
req->rq_private_buf.len when deciding whether or not the received RPC reply
is too short.

Finally ensure that we set req->rq_rcv_buf.len to zero when retrying a
request. A typo meant that we were resetting req->rq_private_buf.len in
call_decode(), and then clobbering that value with the old rq_rcv_buf.len
again in xprt_transmit().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:53:20 -04:00