linux

mirror of https://github.com/torvalds/linux synced 2024-09-21 19:47:35 +00:00

Author	SHA1	Message	Date
Anton Altaparmakov	bfab36e816	NTFS: Fix a mount time deadlock. Big thanks go to Mathias Kolehmainen for reporting the bug, providing debug output and testing the patches I sent him to get it working. The fix was to stop calling ntfs_attr_set() at mount time as that causes balance_dirty_pages_ratelimited() to be called which on systems with little memory actually tries to go and balance the dirty pages which tries to take the s_umount semaphore but because we are still in fill_super() across which the VFS holds s_umount for writing this results in a deadlock. We now do the dirty work by hand by submitting individual buffers. This has the annoying "feature" that mounting can take a few seconds if the journal is large as we have to clear it all. One day someone should improve on this by deferring the journal clearing to a helper kernel thread so it can be done in the background but I don't have time for this at the moment and the current solution works fine so I am leaving it like this for now. Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-12 09:16:30 -07:00
Linus Torvalds	f26e51f67a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (51 commits) [DLM] block dlm_recv in recovery transition [DLM] don't overwrite castparam if it's NULL [GFS2] Get superblock a different way [GFS2] Don't try to remove buffers that don't exist [GFS2] Alternate gfs2_iget to avoid looking up inodes being freed [GFS2] Data corruption fix [GFS2] Clean up journaled data writing [GFS2] GFS2: chmod hung - fix race in thread creation [DLM] Make dlm_sendd cond_resched more [GFS2] Move inode deletion out of blocking_cb [GFS2] flocks from same process trip kernel BUG at fs/gfs2/glock.c:1118! [GFS2] Clean up gfs2_trans_add_revoke() [GFS2] Use slab operations for all gfs2_bufdata allocations [GFS2] Replace revoke structure with bufdata structure [GFS2] Fix ordering of dirty/journal for ordered buffer unstuffing [GFS2] Clean up ordered write code [GFS2] Move pin/unpin into lops.c, clean up locking [GFS2] Don't mark jdata dirty in gfs2_unstuffer_page() [GFS2] Introduce gfs2_remove_from_ail [GFS2] Correct lock ordering in unlink ...	2007-10-12 09:14:51 -07:00
Al Viro	782e3b3b38	Fix up more bio fallout Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-12 00:29:50 -07:00
Linus Torvalds	e86908614f	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (408 commits) [POWERPC] Add memchr() to the bootwrapper [POWERPC] Implement logging of unhandled signals [POWERPC] Add legacy serial support for OPB with flattened device tree [POWERPC] Use 1TB segments [POWERPC] XilinxFB: Allow fixed framebuffer base address [POWERPC] XilinxFB: Add support for custom screen resolution [POWERPC] XilinxFB: Use pdata to pass around framebuffer parameters [POWERPC] PCI: Add 64-bit physical address support to setup_indirect_pci [POWERPC] 4xx: Kilauea defconfig file [POWERPC] 4xx: Kilauea DTS [POWERPC] 4xx: Add AMCC Kilauea eval board support to platforms/40x [POWERPC] 4xx: Add AMCC 405EX support to cputable.c [POWERPC] Adjust TASK_SIZE on ppc32 systems to 3GB that are capable [POWERPC] Use PAGE_OFFSET to tell if an address is user/kernel in SW TLB handlers [POWERPC] 85xx: Enable FP emulation in MPC8560 ADS defconfig [POWERPC] 85xx: Killed <asm/mpc85xx.h> [POWERPC] 85xx: Add cpm nodes for 8541/8555 CDS [POWERPC] 85xx: Convert mpc8560ads to the new CPM binding. [POWERPC] mpc8272ads: Remove muram from the CPM reg property. [POWERPC] Make clockevents work on PPC601 processors ... Fixed up conflict in Documentation/powerpc/booting-without-of.txt manually.	2007-10-11 21:55:47 -07:00
Jeff Garzik	e30408b2a9	JFS: fix bio-related build breakage Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-11 21:15:14 -07:00
Linus Torvalds	038a5008b2	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (867 commits) [SKY2]: status polling loop (post merge) [NET]: Fix NAPI completion handling in some drivers. [TCP]: Limit processing lost_retrans loop to work-to-do cases [TCP]: Fix lost_retrans loop vs fastpath problems [TCP]: No need to re-count fackets_out/sacked_out at RTO [TCP]: Extract tcp_match_queue_to_sack from sacktag code [TCP]: Kill almost unused variable pcount from sacktag [TCP]: Fix mark_head_lost to ignore R-bit when trying to mark L [TCP]: Add bytes_acked (ABC) clearing to FRTO too [IPv6]: Update setsockopt(IPV6_MULTICAST_IF) to support RFC 3493, try2 [NETFILTER]: x_tables: add missing ip6t_modulename aliases [NETFILTER]: nf_conntrack_tcp: fix connection reopening [QETH]: fix qeth_main.c [NETLINK]: fib_frontend build fixes [IPv6]: Export userland ND options through netlink (RDNSS support) [9P]: build fix with !CONFIG_SYSCTL [NET]: Fix dev_put() and dev_hold() comments [NET]: make netlink user -> kernel interface synchronious [NET]: unify netlink kernel socket recognition [NET]: cleanup 3rd argument in netlink_sendskb ... Fix up conflicts manually in Documentation/feature-removal-schedule.txt and my new least favourite crap, the "mod_devicetable" support in the files include/linux/mod_devicetable.h and scripts/mod/file2alias.c. (The latter files seem to be explicitly _designed_ to get conflicts when different subsystems work with them - that have an absolutely horrid lack of subsystem separation!) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-11 19:40:14 -07:00
Denis V. Lunev	cd40b7d398	[NET]: make netlink user -> kernel interface synchronious This patch make processing netlink user -> kernel messages synchronious. This change was inspired by the talk with Alexey Kuznetsov about current netlink messages processing. He says that he was badly wrong when introduced asynchronious user -> kernel communication. The call netlink_unicast is the only path to send message to the kernel netlink socket. But, unfortunately, it is also used to send data to the user. Before this change the user message has been attached to the socket queue and sk->sk_data_ready was called. The process has been blocked until all pending messages were processed. The bad thing is that this processing may occur in the arbitrary process context. This patch changes nlk->data_ready callback to get 1 skb and force packet processing right in the netlink_unicast. Kernel -> user path in netlink_unicast remains untouched. EINTR processing for in netlink_run_queue was changed. It forces rtnl_lock drop, but the process remains in the cycle until the message will be fully processed. So, there is no need to use this kludges now. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 21:15:29 -07:00
Pavel Emelyanov	39699037a5	[FS] seq_file: Introduce the seq_open_private() This function allocates the zeroed chunk of memory and call seq_open(). The __seq_open_private() helper returns the allocated memory to make it possible for the caller to initialize it. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:33 -07:00
Pavel Emelyanov	4665079cbb	[NETNS]: Move some code into __init section when CONFIG_NET_NS=n With the net namespaces many code leaved the __init section, thus making the kernel occupy more memory than it did before. Since we have a config option that prohibits the namespace creation, the functions that initialize/finalize some netns stuff are simply not needed and can be freed after the boot. Currently, this is almost not noticeable, since few calls are no longer in __init, but when the namespaces will be merged it will be possible to free more code. I propose to use the __net_init, __net_exit and __net_initdata "attributes" for functions/variables that are not used if the CONFIG_NET_NS is not set to save more space in memory. The exiting functions cannot just reside in the __exit section, as noticed by David, since the init section will have references on it and the compilation will fail due to modpost checks. These references can exist, since the init namespace never dies and the exit callbacks are never called. So I introduce the __exit_refok attribute just like it is already done with the __init_refok. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:58 -07:00
Eric W. Biederman	077130c0cf	[NET]: Fix race when opening a proc file while a network namespace is exiting. The problem: proc_net files remember which network namespace they are against but do not hold a reference count (as that would pin the network namespace). So we currently have a small window where the reference count on a network namespace may be incremented when opening a /proc file when it has already gone to zero. To fix this introduce maybe_get_net and get_proc_net. maybe_get_net increments the network namespace reference count only if it is greater than zero, ensuring we don't increment a reference count after it has gone to zero. get_proc_net handles all of the magic to go from a proc inode to the network namespace instance and call maybe_get_net on it. PROC_NET, the old accessor, is removed so that we don't get confused and use the wrong helper function. Then I fix up the callers to use get_proc_net and handle the case where get_proc_net returns NULL. In that case I return -ENXIO because effectively the network namespace has already gone away so the files we are trying to access don't exist anymore. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:22 -07:00
Daniel Lezcano	36ac3135f5	[NETNS]: Fix export symbols. Add the appropriate EXPORT_SYMBOLS for proc_net_create, proc_net_fops_create and proc_net_remove to fix errors when compiling allmodconfig Signed-off-by: Mark Nelson <markn@au1.ibm.com> Acked-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:16 -07:00
David S. Miller	3c12afe75f	[NET]: Fix missed addition of fs/proc/proc_net.c My bad. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:14 -07:00
Eric W. Biederman	881d966b48	[NET]: Make the device list and device lookups per namespace. This patch makes most of the generic device layer network namespace safe. This patch makes dev_base_head a network namespace variable, and then it picks up a few associated variables. The functions: dev_getbyhwaddr dev_getfirsthwbytype dev_get_by_flags dev_get_by_name __dev_get_by_name dev_get_by_index __dev_get_by_index dev_ioctl dev_ethtool dev_load wireless_process_ioctl were modified to take a network namespace argument, and deal with it. vlan_ioctl_set and brioctl_set were modified so their hooks will receive a network namespace argument. So basically anything in the core of the network stack that was affected by the change of dev_base was modified to handle multiple network namespaces. The rest of the network stack was simply modified to explicitly use &init_net, the initial network namespace. This can be fixed when those components of the network stack are modified to handle multiple network namespaces. For now the ifindex generator is left global. Fundamentally ifindex numbers are per namespace, or else we will have corner case problems with migration when we get that far. At the same time there are assumptions in the network stack that the ifindex of a network device won't change. Making the ifindex number global seems a good compromise until the network stack can cope with ifindex changes when you change namespaces, and the like. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:10 -07:00
Eric W. Biederman	b4b510290b	[NET]: Support multiple network namespaces with netlink Each netlink socket will live in exactly one network namespace, this includes the controlling kernel sockets. This patch updates all of the existing netlink protocols to only support the initial network namespace. Requests by clients in other namespaces will get -ECONNREFUSED, as they would if the kernel did not have the support for that netlink protocol compiled in. As each netlink protocol is updated to be multiple network namespace safe it can register multiple kernel sockets to acquire a presence in the rest of the network namespaces. The implementation in af_netlink is a simple filter implementation at hash table insertion and hash table look up time. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:09 -07:00
Eric W. Biederman	457c4cbc5a	[NET]: Make /proc/net per network namespace This patch makes /proc/net per network namespace. It modifies the global variables proc_net and proc_net_stat to be per network namespace. The proc_net file helpers are modified to take a network namespace argument, and all of their callers are fixed to pass &init_net for that argument. This ensures that all of the /proc/net files are only visible and usable in the initial network namespace until the code behind them has been updated to handle multiple network namespaces. Making /proc/net per namespace is necessary as at least some files in /proc/net depend upon the set of network devices which is per network namespace, and even more files in /proc/net have contents that are relevant to a single network namespace. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:06 -07:00
Eric W. Biederman	32da477a5b	[NET]: Don't implement dev_ifname32 inline The current implementation of dev_ifname makes maintenance difficult because updates to the implementation of the ioctl have to be made in two places. So this patch updates dev_ifname32 to do a classic 32/64 structure conversion and call sys_ioctl like the rest of the compat calls do. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:03 -07:00
David Teigland	c36258b592	[DLM] block dlm_recv in recovery transition Introduce a per-lockspace rwsem that's held in read mode by dlm_recv threads while working in the dlm. This allows dlm_recv activity to be suspended when the lockspace transitions to, from and between recovery cycles. The specific bug prompting this change is one where an in-progress recovery cycle is aborted by a new recovery cycle. While dlm_recv was processing a recovery message, the recovery cycle was aborted and dlm_recoverd began cleaning up. dlm_recv decremented recover_locks_count on an rsb after dlm_recoverd had reset it to zero. This is fixed by suspending dlm_recv (taking write lock on the rwsem) before aborting the current recovery. The transitions to/from normal and recovery modes are simplified by using this new ability to block dlm_recv. The switch from normal to recovery mode means dlm_recv goes from processing locking messages, to saving them for later, and vice versa. Races are avoided by blocking dlm_recv when setting the flag that switches between modes. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:38 +01:00
Patrick Caulfield	b434eda6fd	[DLM] don't overwrite castparam if it's NULL If the castaddr passed to the userland API is NULL then don't overwrite the existing castparam. This allows a different thread to cancel a lock request and the CANCEL AST gets delivered to the original thread. bz#306391 (for RHEL4) refers. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:36 +01:00
Steven Whitehouse	5a60c532c9	[GFS2] Get superblock a different way The mapping may be NULL by the time the I/O has completed, so we now get the superblock by a different route (via the bd and glock) to avoid this problem. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Wendy Cheng <wcheng@redhat.com>	2007-10-10 08:56:34 +01:00
Steven Whitehouse	891ba6d4a5	[GFS2] Don't try to remove buffers that don't exist Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:31 +01:00
Benjamin Marzinski	7a9f53b3c1	[GFS2] Alternate gfs2_iget to avoid looking up inodes being freed There is a possible deadlock between two processes on the same node, where one process is deleting an inode, and another process is looking for allocated but unused inodes to delete in order to create more space. process A does an iput() on inode X, and its i_count drops to 0. This causes iput_final() to be called, which puts an inode into state I_FREEING at generic_delete_inode(). There is no point between when iput_final() is called, and when I_FREEING is set where GFS2 could acquire any glocks. Once I_FREEING is set, no other process on that node can successfully look up that inode until the delete finishes. process B locks the resource group for the same inode in get_local_rgrp(), which is called by gfs2_inplace_reserve_i() process A tries to lock the resource group for the inode in gfs2_dinode_dealloc(), but it's already locked by process B process B waits in find_inode for the inode to have the I_FREEING state cleared. Deadlock. This patch solves the problem by adding an alternative to gfs2_iget(), gfs2_iget_skip(), that simply skips any inodes that are in the I_FREEING state. The alternate test function is just like the original one, except that it fails if the inode is being freed, and sets a skipped flag. The alternate set function is just like the original, except that it fails if the skipped flag is set. Only try_rgrp_unlink() calls gfs2_iget_skip() instead of gfs2_iget(). Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:29 +01:00
Wendy Cheng	de986e859a	[GFS2] Data corruption fix * GFS2 has been using i_cache array to store its indirect meta blocks. Its flush routine doesn't correctly clean up all the entries. The problem would show while multiple nodes do simultaneous writes to the same file. Upon glock exclusive lock transfer, if the file is a sparse file with large file size where the indirect meta blocks span multiple array entries with "zero" entries in between. The flush routine prematurely stops the flushing that leaves old (stale) entries around. This leads to several nasty issues, including data corruption. * Fix gfs2_get_block_noalloc checking to correctly return EIO upon unmapped buffer. Signed-off-by: Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:26 +01:00
Steven Whitehouse	16615be18c	[GFS2] Clean up journaled data writing This patch cleans up the code for writing journaled data into the log. It also removes the need to allocate a small "tag" structure for each block written into the log. Instead we just keep count of the outstanding I/O so that we can be sure that its all been written at the correct time. Another result of this patch is that a number of ll_rw_block() calls have become submit_bh() calls, closing some races at the same time. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:24 +01:00
Bob Peterson	55c0c4ac0b	[GFS2] GFS2: chmod hung - fix race in thread creation The problem boiled down to a race between the gdlm_init_threads() function initializing thread1 and its setting of blist = 1. Essentially, "if (current == ls->thread1)" was checked by the thread before the thread creator set ls->thread1. Since thread1 is the only thread who is allowed to work on the blocking queue, and since neither thread thought it was thread1, no one was working on the queue. So everything just sat. This patch reuses the ls->async_lock spin_lock to fix the race, and it fixes the problem. I've done more than 2000 iterations of the loop that was recreating the failure and it seems to work. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> --	2007-10-10 08:56:22 +01:00
Patrick Caulfield	d66f8277f5	[DLM] Make dlm_sendd cond_resched more Under high recovery loads dlm_sendd can monopolise the CPU and cause soft lockups. This one extra and one moved cond_resched() make it yield a little more during such times keeping work moving. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:19 +01:00
Wendy Cheng	49e61f2ef6	[GFS2] Move inode deletion out of blocking_cb Move inode deletion code out of blocking_cb handle_callback route to avoid racy conditions that end up blocking lock_dlm1 thread. Fix bugzilla 286821. Signed-off-by: Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:17 +01:00
Abhijith Das	b4c20166dc	[GFS2] flocks from same process trip kernel BUG at fs/gfs2/glock.c:1118! This patch adds a new flag to the gfs2_holder structure GL_FLOCK. It is set on holders of glocks representing flocks. This flag is checked in add_to_queue() and a process is permitted to queue more than one holder onto a glock if it is set. This solves the issue of a process not being able to do multiple flocks on the same file. Through a single descriptor, a process can now promote and demote flocks. Through multiple descriptors a process can now queue multiple flocks on the same file. There's still the problem of a process deadlocking itself (because gfs2 blocking locks are not interruptible) by queueing incompatible deadlock. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:14 +01:00
Steven Whitehouse	1ad38c437f	[GFS2] Clean up gfs2_trans_add_revoke() The following alters gfs2_trans_add_revoke() to take a struct gfs2_bufdata as an argument. This eliminates the memory allocation which was previously required by making use of the already existing struct gfs2_bufdata. It makes some sanity checks to ensure that the gfs2_bufdata has been removed from all the lists before its recycled as a revoke structure. This saves one memory allocation and one free per revoke structure. Also as a result, and to simplify the locking, since there is no longer any blocking code in gfs2_trans_add_revoke() we must hold the log lock whenever this function is called. This reduces the amount of times we take and unlock the log lock. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:12 +01:00
Steven Whitehouse	0820ab517e	[GFS2] Use slab operations for all gfs2_bufdata allocations The old revoke structure was allocated using kalloc/kfree but there is a slab cache for gfs2_bufdata, so we should use that now that the structures have been converted. This is part two of the patch series to merge the revoke and gfs2_bufdata structures. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:10 +01:00
Steven Whitehouse	82e86087bb	[GFS2] Replace revoke structure with bufdata structure Both the revoke structure and the bufdata structure are quite similar. They are basically small tags which are put on lists. In addition to which the revoke structure is always allocated when there is a bufdata structure which is (or can be) freed. As such it should be possible to reduce the number of frees and allocations by using the same structure for both purposes. This patch is the first step along that path. It replaces existing uses of the revoke structure with the bufdata structure. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:07 +01:00
Bob Peterson	8475487bef	[GFS2] Fix ordering of dirty/journal for ordered buffer unstuffing Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:05 +01:00
Steven Whitehouse	d7b616e252	[GFS2] Clean up ordered write code The following patch removes the ordered write processing from databuf_lo_before_commit() and moves it to log.c. This has the effect of greatly simplifying databuf_lo_before_commit() as well as potentially making the ordered write code more efficient. As a side effect of this, it's now possible to remove ordered buffers from the ordered buffer list at any time, so we now make use of this in invalidatepage and releasepage to ensure timely release of these buffers. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:03 +01:00
Steven Whitehouse	9b9107a5a8	[GFS2] Move pin/unpin into lops.c, clean up locking gfs2_pin and gfs2_unpin are only used in lops.c, despite being defined in meta_io.c, so this patch moves them into lops.c and makes them static. At the same time, its possible to clean up the locking in the buf and databuf _lo_add() functions so that we only need to grab the spinlock once. Also we have to move lock_buffer() around the _lo_add() functions since we can't do that in gfs2_pin() any more since we hold the spinlock for the duration of that function. As a result, the code shrinks by 12 lines and we do far fewer operations when adding buffers to the log. It also makes the code somewhat easier to read & understand. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:00 +01:00
Steven Whitehouse	eaf965270f	[GFS2] Don't mark jdata dirty in gfs2_unstuffer_page() Journaled data is marked dirty by gfs2_unpin and should not be marked dirty here. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:58 +01:00
Steven Whitehouse	1e1a3d03e9	[GFS2] Introduce gfs2_remove_from_ail This collects together the operations required to remove a gfs2_bufdata from the ail lists. Its only called from two places to start with, but expect to see more of this function in future. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:55 +01:00
Steven Whitehouse	8497a46e17	[GFS2] Correct lock ordering in unlink This patch corrects the lock ordering in unlink to be the same as that in the rest of GFS2, i.e. parent -> child -> rgrp. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:53 +01:00
Wendy Cheng	e9bd2b3baf	[GFS2] fix inode meta data corruption Fix a nasty inode meta data corruption issue by keeping the buffer head in icache array. This buffer needs to stay in memory until journal flush occurs Otherwise, gfs2_meta_inode_buffer could do a disk read before the inode hits disk. It ends up with meta data corruptions. The buffer will be released as part of the existing journal flush logic. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:51 +01:00
Benjamin Marzinski	c4f68a130f	[GFS2] delay glock demote for a minimum hold time When a lot of IO, with some distributed mmap IO, is run on a GFS2 filesystem in a cluster, it will deadlock. The reason is that do_no_page() will repeatedly call gfs2_sharewrite_nopage(), because each node keeps giving up the glock too early, and is forced to call unmap_mapping_range(). This bumps the mapping->truncate_count sequence count, forcing do_no_page() to retry. This patch institutes a minimum glock hold time of a tenth of a second. This ensures that even in heavy contention cases, the node has enough time to get some useful work done before it gives up the glock. A second issue is that when gfs2_glock_dq() is called from within a page fault to demote a lock, and the associated page needs to be written out, it will try to acquire a lock on it, but it has already been locked at a higher level. This patch makes gfs2_glock_dq() use the work queue as well, to avoid this issue. This is the same patch as Steve Whitehouse originally proposed to fix this issue, except that gfs2_glock_dq() now grabs a reference to the glock before it queues up the work on it. Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:48 +01:00
Abhijith Das	d1e2777d4f	[GFS2] panic after can't parse mount arguments When you try to mount gfs2 with -o garbage, the mount fails and the gfs2 superblock is deallocated and becomes NULL. The vfs comes around later on and calls gfs2_kill_sb. At this point the hidden gfs2 superblock pointer (sb->s_fs_info) is NULL and dereferencing it through gfs2_meta_syncfs causes the panic. (the other function call to gfs2_delete_debugfs_file() succeeds because this function already checks for a NULL pointer) Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:46 +01:00
Bob Peterson	ec217e0ece	[GFS2] Patch to protect sd_log_num_jdata This is a patch to GFS2 to protect sd_log_num_jdata with the gfs2_log_lock. Without this patch, there is a timing window where you can hit the following assert from function gfs2_log_flush(): gfs2_assert_withdraw(sdp, sdp->sd_log_num_buf + sdp->sd_log_num_jdata == sdp->sd_log_commited_buf + sdp->sd_log_commited_databuf); I've tested it on my roth cluster and it fixes the problem. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:43 +01:00
Abhijith Das	a947e03356	[GFS2] Wendy's dump lockname in hex & fix glock dump With this patch, gfs2 glockdump through the debugfs filesystem will only dump glocks for the specified filesystem instead of all glocks. Also, to aid debugging, the glock number is dumped in hex instead of decimal. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Abhijith Das <adas@redhat.com>	2007-10-10 08:55:41 +01:00
Patrick Caulfield	61d96be0f4	[DLM] Fix lowcomms socket closing This patch fixes the slight mess made in lowcomms closing by previous patches and fixes all sorts of DLM hangs. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:39 +01:00
Wendy Cheng	a13b8c5f23	[GFS2] Reduce truncate IO traffic Current GFS2 setattr call unconditionally invokes do_shrink even the requested size and actual file size are equal. This has generated large amount of extra IOs found during NFS benchmark runs. This patch moves the relevant logic out of shrink code path. Since setattr is a system call, the time stamps update is still required. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:36 +01:00
Benjamin Marzinski	9a5ad13856	[GFS2] Add NULL entry to token table match_token() was returning garbage data instead of a fail value. This data happened to match a valid option id for an option that required an argument (in this case, lockproto=%s) For match_token() to correctly fail if the option doesn't match any of the tokens, the token table must end with a NULL entry. This patch adds the NULL entry. Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:34 +01:00
Steven Whitehouse	382e6e256b	[GFS2] Add a missing gfs2_trans_add_bh() This was missing from the dir_split_leaf() function although in most cases its not a problem due to other functions having already previously called gfs2_trans_add_bh. This makes certain that it is correct. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Wendy Cheng <wcheng@redhat.com>	2007-10-10 08:55:32 +01:00
Steven Whitehouse	bb3b0e3df5	[GFS2] Clean up invalidatepage/releasepage This patch fixes some bugs relating to journaled data files by cleaning up the gfs2_invalidatepage() and gfs2_releasepage() functions. We now never block during gfs2_releasepage(), instead we always either release or refuse to release depending on the status of the buffers. This fixes Red Hat bugzillas #248969 and #252392. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Bob Peterson <rpeterso@redhat.com>	2007-10-10 08:55:29 +01:00
Abhijith Das	2d9a4bbf6d	[GFS2] Fix quota do_list operation hang This is the filesystem part of the patches to fix this bz. There are additional userland patches (gfs2_quota, libgfs2) for the complete solution. This patch adds a new field qu_ll_next to the gfs2_quota structure. This field allows us to create linked lists of quotas in the ondisk quota inode. Instead of scanning through the entire sparse quota file for valid quotas, we can now simply walk through the user and group quota linked lists to perform the do_list operation. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:27 +01:00
Denis Cheng	34eaae398e	[GFS2] fixed a NULL pointer assignment BUG Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:24 +01:00
Abhijith Das	0fd5355470	[GFS2] Force unstuff of hidden quota inode This patch forcibly unstuffs (if stuffed) the hidden quota inode at the first available opportunity. In any practical scenario the quota inode won't be stuffed, so this is ok to do. Unstuffing the quota inode allows us to ignore the case of a stuffed quota inode in gfs2_adjust_quota(). Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:22 +01:00
Denis Cheng	5d35e31f43	[GFS2] better code for translating characters the original code could work, but I think this code could work better. Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:20 +01:00
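
Note on commit 077130c0cf: the message describes maybe_get_net as taking a reference on a network namespace only while its count is still above zero. The following is a minimal userspace sketch of that "increment unless zero" idea using C11 atomics. It is not the kernel's implementation; struct ns_demo and ns_demo_get_unless_zero are hypothetical names made up for this illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted object.
 * This is NOT the kernel's struct net. */
struct ns_demo {
    atomic_int count;
};

/* Take a reference only if the count is still above zero.
 * Returns true when a reference was taken, false when the
 * object had already dropped to zero and must not be revived. */
static bool ns_demo_get_unless_zero(struct ns_demo *ns)
{
    int old = atomic_load(&ns->count);

    while (old > 0) {
        /* If another thread changed the count in the meantime, the
         * compare-exchange fails, reloads the current value into
         * 'old', and the loop re-checks it against zero. */
        if (atomic_compare_exchange_weak(&ns->count, &old, old + 1))
            return true;
    }
    return false;
}
```

A caller in this sketch would treat a false return the way the commit message describes callers treating a NULL from get_proc_net: the object is already going away, so the lookup fails instead of touching it.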


6377 commits