linux

mirror of https://github.com/torvalds/linux synced 2024-10-27 05:39:25 +00:00

Author	SHA1	Message	Date
Ben Widawsky	3bc2913e2c	drm/i915: Fix set_caching locking On the EINVAL case we don't release struct_mutex. It should be safe to grab the lock after checking the parameters, which also resolves the issues. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-27 08:45:11 +02:00
Ben Widawsky	199adf40ae	drm/i915: s/cacheing/caching/ Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-26 09:24:36 +02:00
Daniel Vetter	398b7a1b88	Linux 3.6-rc7 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJQX7MuAAoJEHm+PkMAQRiG0h0IAJURkrMCAQUxA+Ik66ReH89s LQcVd0U9uL4UUOi7f5WR64Vf9Cfu6VVGX9ZKSvjpNskvlQaUQPMIt4pMe6g4X4dI u0bApEy4XZz3nGabUAghIU8jJ8cDmhCG6kPpSiS7pi7KHc0yIa4WFtJRrIpGaIWT xuK38YOiOHcSDRlLyWZzainMncQp/ixJdxnqVMTonkVLk0q0b84XzOr4/qlLE5lU i+TsK3PRKdQXgvZ4CebL+srPBwWX1dmgP3VkeBloQbSSenSeELICbFWavn2ml+sF GXi4dO93oNquL/Oy5SwI666T4uNcrRPaS+5X+xSZgBW/y2aQVJVJuNZg6ZP/uWk= =0v2l -----END PGP SIGNATURE----- Merge tag 'v3.6-rc7' into drm-intel-next-queued Manual backmerge of -rc7 to resolve a silent conflict leading to compile failure in drivers/gpu/drm/i915/intel_hdmi.c. This is due to the bugfix in -rc7: commit `b98b601672` Author: Wang Xingchao <xingchao.wang@intel.com> Date: Thu Sep 13 07:43:22 2012 +0800 drm/i915: HDMI - Clear Audio Enable bit for Hot Plug Since this code moved around a lot in -next git put that snippet at the wrong spot. I've tried to fix this by making the conflict explicit by merging a version for next with: commit `3cce574f01` Author: Wang Xingchao <xingchao.wang@intel.com> Date: Thu Sep 13 11:19:00 2012 +0800 drm/i915: HDMI - Clear Audio Enable bit for Hot Plug unconditionally But that failed to solve the entire problem. To avoid pushing out further -nightly branch to our QA where this is broken, do the backmerge and manually add the stuff git adds to -next from the patch in -fixes. Note that this doesn't show up in git's merge diff (and hence is also not handled by git rerere), which adds to the reasons why I'd like to fix this with a verbose backmerge. The git merge diff only shows a bunch of trivial conflicts of the "code changed in lines next to each another" kind. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-24 18:17:12 +02:00
Chris Wilson	2f745ad3d3	drm/i915: Convert the dmabuf object to use the new i915_gem_object_ops By providing a callback for when we need to bind the pages, and then release them again later, we can shorten the amount of time we hold the foreign pages mapped and pinned, and importantly the dmabuf objects then behave as any other normal object with respect to the shrinker and memory management. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:23:10 +02:00
Chris Wilson	9da3da660d	drm/i915: Replace the array of pages with a scatterlist Rather than have multiple data structures for describing our page layout in conjunction with the array of pages, we can migrate all users over to a scatterlist. One major advantage, other than unifying the page tracking structures, this offers is that we replace the vmalloc'ed array (which can be up to a megabyte in size) with a chain of individual pages which helps reduce memory pressure. The disadvantage is that we then do not have a simple array to iterate, or to access randomly. The common case for this is in the relocation processing, which will typically fit within a single scatterlist page and so be almost the same cost as the simple array. For iterating over the array, the extra function call could be optimised away, but in reality is an insignificant cost of either binding the pages, or performing the pwrite/pread. v2: Fix drm_clflush_sg() to not invoke wbinvd as well! And fix the trivial compile error from rebasing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:57 +02:00
Chris Wilson	f60d7f0c1d	drm/i915: Pin backing pages for pread By using the recently introduced pinning of pages, we can safely drop the mutex in the knowledge that the pages are not going to disappear beneath us, and so we can simplify the code for iterating over the pages. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:57 +02:00
Chris Wilson	755d22184f	drm/i915: Pin backing pages for pwrite By using the recently introduced pinning of pages, we can safely drop the mutex in the knowledge that the pages are not going to disappear jeneath us, and so we can simplify the code for iterating over the pages. Note: The old code had such complicated page refcounting since it used obj->pages as a micro-optimization if it's there, but that could (before this patch) disappear when we drop the dev->struct_mutex. Hence some manual page refcounting was required for the slow path, complicated by the fact that pages returned by shmem_read_mapping_page already have a pageref, which needs to be dropped again. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Added note to explain the question Ben raised in review.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:56 +02:00
Chris Wilson	a5570178c0	drm/i915: Pin backing pages whilst exporting through a dmabuf vmap We need to refcount our pages in order to prevent reaping them at inopportune times, such as when they currently vmapped or exported to another driver. However, we also wish to keep the lazy deallocation of our pages so we need to take a pin/unpinned approach rather than a simple refcount. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:56 +02:00
Chris Wilson	37e680a15f	drm/i915: Introduce drm_i915_gem_object_ops In order to specialise functions depending upon the type of object, we can attach vfuncs to each object via a new ->ops pointer. For instance, this will be used in future patches to only bind pages from a dma-buf for the duration that the object is used by the GPU - and so prevent them from pinning those pages for the entire of the object. v2: Bonus comments. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:55 +02:00
Chris Wilson	7e81a42e34	drm/i915: Reduce a pin-leak BUG into a WARN Pin-leaks persist and we get the perennial bug reports of machine lockups to the BUG_ON(pin_count==MAX). If we instead loudly report that the object cannot be pinned at that time it should prevent the driver from locking up, and hopefully restore a semblance of working whilst still leaving us a OOPS to debug. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-17 10:12:57 +02:00
Dave Airlie	65983bd605	Merge branch 'for-airlied' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: "New stuff for -next. Highlights: - prep patches for the modeset rework. Note that one of those patches touches the fb helper in the common drm code. - hasw hdmi audio support (Wang Xingchao) - improved instdone dumping for gen7 (Ben) - unbound tracking and a few follow-up patches from Chris - dma_buf->begin/end_cpu_access plus fix for drm/udl (Dave) - improve mmio error reporting for hsw - prep patch for WQ_NON_REENTRANT removal (Tejun Heo) " * 'for-airlied' of git://people.freedesktop.org/~danvet/drm-intel: (41 commits) drm/i915: Remove __GFP_NO_KSWAPD drm/i915: disable rc6 on ilk when vt-d is enabled drm/i915: Avoid unbinding due to an interrupted pin_and_fence during execbuffer drm/i915: Use new INSTDONE registers (Gen7+) drm/i915: Add new INSTDONE registers drm/i915: Extract reading INSTDONE drm/i915: Use a non-blocking wait for set-to-domain ioctl drm/i915: Juggle code order to ease flow of the next patch drm/i915: Use cpu relocations if the object is in the GTT but not mappable drm/i915: Extract general object init routine drm/i915: Protect private gem objects from truncate (such as imported dmabuf) drm/i915: Only pwrite through the GTT if there is space in the aperture i915: use alloc_ordered_workqueue() instead of explicit UNBOUND w/ max_active = 1 drm/i915: Find unclaimed MMIO writes. drm/i915: Add ERR_INT to gen7 error state drm/i915: Cantiga+ cannot handle a hsync front porch of 0 drm/i915: fix reassignment of variable "intel_dp->DP" drm/i915: Try harder to allocate an mmap_offset drm/i915: Show pin count in debugfs drm/i915: Show (count, size) of purgeable objects in i915_gem_objects ...	2012-09-03 12:05:01 +10:00
Sedat Dilek	d7c3b937bd	drm/i915: Remove __GFP_NO_KSWAPD When I pulled-in today's drm-intel-next into linux-next (next-20120824) I saw this build-breakage: drivers/gpu/drm/i915/i915_gem.c: In function 'i915_gem_object_get_pages_gtt': drivers/gpu/drm/i915/i915_gem.c:1778:40: error: '__GFP_NO_KSWAPD' undeclared (first use in this function) drivers/gpu/drm/i915/i915_gem.c:1778:40: note: each undeclared identifier is reported only once for each function it appears in This is caused by commit ba099ef165f8 ("mm: remove __GFP_NO_KSWAPD") and commit b6beae2c2014 ("mm: remove __GFP_NO_KSWAPD fixes") in linux-next (next-20120824). Fix this by removing __GFP_NO_KSWAPD from drm/i915 driver. Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-27 17:11:38 +02:00
Dave Airlie	93bb70e0c0	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next There was some merge conflicts in -next and they weren't so pretty, so backmerge now to avoid them. Conflicts: drivers/gpu/drm/i915/i915_gem.c drivers/gpu/drm/i915/intel_modes.c	2012-08-27 16:22:20 +10:00
Chris Wilson	3236f57a01	drm/i915: Use a non-blocking wait for set-to-domain ioctl The principal use for set-to-domain is for userspace to serialise operations with a particular buffer, for example to maintain coherency with a CPU map or to ratelimit its rendering by waiting on all previous operations before continuing. As such we tend to hold the struct_mutex for long periods during the synchronisation and so cause contention issues with other users of the graphics device, even for independent operations as memory management. An example is the contention between compiz and X which causes jitter in the display and a drop in peak throughput. The ultimate solution would be a set of fine grained locks and lockless operations, but an intermediate step is to first attempt the synchronisation for set-to-domain without holding the mutex. This introduces a number of race conditions, so we limit it use to the ioctl periphery where we have no dependent state and can safely complete with a locked synchronisation afterwards. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 11:12:56 +02:00
Chris Wilson	b361237bcc	drm/i915: Juggle code order to ease flow of the next patch Move the wait-for-rendering logic around in the file so that we can group it together with the subsequent variations. The general goal is to have the lower level routines clustered together and then the higher level logic building upon those low level routines that came before. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 11:12:53 +02:00
Chris Wilson	0327d6ba99	drm/i915: Extract general object init routine As we wish to create specialised object constructions in the near future that share the same basic GEM object struct, export the default initializer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 02:04:38 +02:00
Chris Wilson	4d6294bf77	drm/i915: Protect private gem objects from truncate (such as imported dmabuf) If the object has no backing shmemfs filp, then we obviously cannot perform a truncation operation upon it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 02:04:31 +02:00
Chris Wilson	86a1ee26bb	drm/i915: Only pwrite through the GTT if there is space in the aperture Avoid stalling and waiting for the GPU by checking to see if there is sufficient inactive space in the aperture for us to bind the buffer prior to writing through the GTT. If there is inadequate space we will have to stall waiting for the GPU, and incur overheads moving objects about. Instead, only incur the clflush overhead on the target object by writing through shmem. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 02:03:33 +02:00
Chris Wilson	d8cb508669	drm/i915: Try harder to allocate an mmap_offset Given the persistence of an offset for the lifetime of an object, itis easy to contemplate how the mmap space becomes badly fragmented to the point that further allocations fail with ENOSPC. Our only recourse at this point is to try to purge the objects to release some space and reattempt the allocation. References: https://bugs.freedesktop.org/show_bug.cgi?id=39552 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-21 14:34:36 +02:00
Chris Wilson	c4670ad080	drm/i915: Add some sanity checks to unbound tracking A pair of universally true checks that just need to be put in the right place depending on where in the patch sequence you go. Note that i915_gem_object_put_pages_gtt() already gains the BUG_ON(obj->gtt_space), but on reflection that needed to migrate to put_pages(). Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-21 14:34:20 +02:00
Chris Wilson	6c085a728c	drm/i915: Track unbound pages When dealing with a working set larger than the GATT, or even the mappable aperture when touching through the GTT, we end up with evicting objects only to rebind them at a new offset again later. Moving an object into and out of the GTT requires clflushing the pages, thus causing a double-clflush penalty for rebinding. To avoid having to clflush on rebinding, we can track the pages as they are evicted from the GTT and only relinquish those pages on memory pressure. As usual, if it were not for the handling of out-of-memory condition and having to manually shrink our own bo caches, it would be a net reduction of code. Alas. Note: The patch also contains a few changes to the last-hope evict_everything logic in i916_gem_execbuffer.c - we no longer try to only evict the purgeable stuff in a first try (since that's superflous and only helps in OOM corner-cases, not fragmented-gtt trashing situations). Also, the extraction of the get_pages retry loop from bind_to_gtt (and other callsites) to get_pages should imo have been a separate patch. v2: Ditch the newly added put_pages (for unbound objects only) in i915_gem_reset. A quick irc discussion hasn't revealed any important reason for this, so if we need this, I'd like to have a git blame'able explanation for it. v3: Undo the s/drm_malloc_ab/kmalloc/ in get_pages that Chris noticed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Split out code movements and rant a bit in the commit message with a few Notes. Done v2] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-21 14:34:11 +02:00
Daniel Vetter	225067eedf	drm/i915: move functions around Prep work to make Chris Wilson's unbound tracking patch a bit easier to read. Alas, I'd have preferred that moving the page allocation retry loop from bind to get_pages would have been a separate patch, too. But that looks like real work ;-) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-20 10:59:41 +02:00
Ben Widawsky	b6c7488df6	drm/i915/contexts: fix list corruption After reset we unconditionally reinitialize lists. If the context switch hasn't yet completed before the suspend, the default context object will end up on lists that are going to go away when we resume. The patch forces the context switch to be synchronous before suspend assuring that the active/inactive tracking is correct at the time of resume. References: https://bugs.freedesktop.org/show_bug.cgi?id=52429 Tested-by: Guang A Yang <guang.a.yang@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-17 09:21:34 +02:00
Chris Wilson	b2eadbc85b	drm/i915: Lazily apply the SNB+ seqno w/a Avoid the forcewake overhead when simply retiring requests, as often the last seen seqno is good enough to satisfy the retirment process and will be promptly re-run in any case. Only ensure that we force the coherent seqno read when we are explicitly waiting upon a completion event to be sure that none go missing, and also for when we are reporting seqno values in case of error or debugging. This greatly reduces the load for userspace using the busy-ioctl to track active buffers, for instance halving the CPU used by X in pushing the pixels from a software render (flash). The effect will be even more magnified with userptr and so providing a zero-copy upload path in that instance, or in similar instances where X is simply compositing DRI buffers. v2: Reverse the polarity of the tachyon stream. Daniel suggested that 'force' was too generic for the parameter name and that 'lazy_coherency' better encapsulated the semantics of it being an optimization and its purpose. Also notice that gen6_get_seqno() is only used by gen6/7 chipsets and so the test for IS_GEN6 \|\| IS_GEN7 is redundant in that function. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-10 11:11:32 +02:00
Chris Wilson	e6994aeedc	drm/i915: Export ability of changing cache levels to userspace By selecting the cache level (essentially whether or not the CPU snoops any updates to the bo, and on more recent machines whether it resides inside the CPU's last-level-cache) a userspace driver is able to then manage all of its memory within buffer objects, if it so desires. This enables the userspace driver to accelerate uploads and more importantly downloads from the GPU and to able to mix CPU and GPU rendering/activity efficiently. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Added code comment about where we plan to stuff platform specific cacheing control bits in the ioctl struct.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-26 12:56:25 +02:00
Chris Wilson	42d6ab4839	drm/i915: Segregate memory domains in the GTT using coloring Several functions of the GPU have the restriction that differing memory domains cannot be placed next to each other (as the GPU may prefetch beyond the end of one domain and hang as it crosses into the other domain). We use the facility of the drm_mm to mark ranges with a particular color that corresponds to the cache attributes of those pages in order to prevent allocating adjacent blocks of differing memory types. v2: Rebase ontop of drm_mm coloring v2. v3: Fix rebinding existing gtt_space and add a verification routine. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-26 12:56:25 +02:00
Chris Wilson	f047e395dd	drm/i915: Avoid concurrent access when marking the device as idle/busy As suggested by Daniel, rip out the independent timers for device and crtc busyness and integrate the manual powermanagement of the display engine into the GEM core and its request tracking. The benefits are that the code is a lot smaller, fewer moving parts and should fit more neatly into the overall activity tracking of the driver. v2: Complete overhaul and removal of the racy timers and workers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:56 +02:00
Chris Wilson	a7b9761d0a	drm/i915: Split i915_gem_flush_ring() into seperate invalidate/flush funcs By moving the function to intel_ringbuffer and currying the appropriate parameter, hopefully we make the callsites easier to read and understand. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:55 +02:00
Chris Wilson	26b9c4a57f	drm/i915: Remove the explicit flush of the GPU write domain Rely instead on the insertion of the implicit flush before the seqno breadcrumb. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:54 +02:00
Chris Wilson	86d5bc3782	drm/i915: Remove explicit flush from i915_gem_object_flush_fence() As the flush is either performed explictly immediately after the execbuffer dispatch, or before the serialisation of last_fenced_seqno we can forgo the explict i915_gem_flush_ring(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:53 +02:00
Chris Wilson	69c2fc8913	drm/i915: Remove the per-ring write list This is now handled by a global flag to ensure we emit a flush before the next serialisation point (if we failed to queue one previously). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:53 +02:00
Chris Wilson	65ce302741	drm/i915: Remove the defunct flushing list As we guarantee to emit a flush before emitting the breadcrumb or the next batchbuffer, there is no further need for the flushing list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:52 +02:00
Chris Wilson	0201f1ecf4	drm/i915: Replace the pending_gpu_write flag with an explicit seqno As we always flush the GPU cache prior to emitting the breadcrumb, we no longer have to worry about the deferred flush causing the pending_gpu_write to be delayed. So we can instead utilize the known last_write_seqno to hopefully minimise the wait times. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:52 +02:00
Chris Wilson	e5f1d962a8	drm/i915: Remove assertion over write domain after i915_gem_object_sync() As we move to lazily clearing the GPU write domain only when the buffer becomes inactive, this leaves a window of opportunity for i915_gem_object_pin_to_display_plane() to detect a seemingly inconsistent value. This function is special as it tries to pipeline the operation to avoid the stall and so may not retires the buffer and we may not get the opportunity to clear the write domain. However, we know all is good, so drop the assertion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:51 +02:00
Chris Wilson	3bb73aba1e	drm/i915: Allow late allocation of request for i915_add_request() Request preallocation was added to i915_add_request() in order to support the overlay. However, not all users care and can quite happily ignore the failure to allocate the request as they will simply repeat the request in the future. By pushing the allocation down into i915_add_request(), we can then remove some rather ugly error handling in the callers. v2: Nullify request->file_priv otherwise we chase a garbage pointer when retiring requests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:51 +02:00
Chris Wilson	e9808edd98	drm/i915: Return a mask of the active rings in the high word of busy_ioctl The intention is to help select which engine to use for copies with interoperating clients - such as a GL client making a request to the X server to perform a SwapBuffers, which may require copying from the active GL back buffer to the X front buffer. We choose to report a mask of the active rings to future proof the interface against any changes which may allow for the object to reside upon multiple rings. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: bikeshed away the write ring mask and add the explanation Chris sent in a follow-up mail why we decided to use masks.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:50 +02:00
Chris Wilson	eeef9b3874	drm/i915: Add -EIO to the list of known errors for __wait_seqno This prevents a WARN introduced with commit `de2b998552` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Wed Jul 4 22:52:50 2012 +0200 drm/i915: don't return a spurious -EIO from intel_ring_begin Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 10:39:57 +02:00
Chris Wilson	67b1b57182	drm/i915: Disable the BLT on pre-production SNB hardware It never quite worked despite the numerous workarounds, yet I still see people trying to use this hardware and filing bug reports. As we no longer even try to implement the workarounds, since `6a233c7887` (drm/i915/ringbuffer: kill snb blt workaround), simply disable the ring. v2: Add a message to inform the user about the limited capabilities of their pre-production hardware. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-20 12:21:37 +02:00
Chris Wilson	6b9d89b436	drm: Add colouring to the range allocator In order to support snoopable memory on non-LLC architectures (so that we can bind vgem objects into the i915 GATT for example), we have to avoid the prefetcher on the GPU from crossing memory domains and so prevent allocation of a snoopable PTE immediately following an uncached PTE. To do that, we need to extend the range allocator with support for tracking and segregating different node colours. This will be used by i915 to segregate memory domains within the GTT. v2: Now with more drm_mm helpers and less driver interference. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Airlie <airlied@redhat.com Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@gmail.com>	2012-07-16 05:59:37 +10:00
Daniel Vetter	a9340ccab5	drm/i915: properly SIGBUS on I/O errors ... instead of looping endless with no hope of ever serving that page-fault. We only need to break out of this loop when the gpu died, to run the reset work (and hopefully resurrect it). To clarify questions Chris raised on irc: This is about handling I/O errors not from our own code, but e.g. when the disk died when trying to swap in a gem bo. So this patch remidies the issue that the current handling only handles gpu-death-induced cases of -EIO. Admittedly, dying disks are much rarer than hanging gpus ...To clarify questions Chris raised on irc: This is about handling I/O errors not from our own code, but e.g. when the disk died when trying to swap in a gem bo. So this patch remidies the issue that the current handling only handles gpu-death-induced cases of -EIO. Admittedly, dying disks are much rarer than hanging gpus ... This seems to have been lost in: commit `d9bc7e9f32` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Feb 7 13:09:31 2011 +0000 drm/i915: Fix infinite loop regression from `21dd3734` Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-05 10:03:01 +02:00
Daniel Vetter	0a6759c6ba	drm/i915: don't hang userspace when the gpu reset is stuck With the gpu reset no longer using a trylock we've increased the chances of userspace getting stuck quite a bit. To make that (hopefully) rare case more paletable time out when waiting for the gpu reset code to complete and signal this little issue to the caller by returning -EIO. This should help userspace to somewhat gracefully fall back and hopefully allow the user to grab some logs and reboot the machine (instead of staring at a frozen X screen in agony). Suggested by Chris Wilson because I've been stubborn about allowing the gpu reset code no to fail, ever (by removing the trylock). Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-05 10:02:24 +02:00
Daniel Vetter	d6b2c790a4	drm/i915: non-interruptible sleeps can't handle -EAGAIN So don't return -EAGAIN, even in the case of a gpu hang. Remap it to -EIO instead. Note that this isn't really an issue with interruptability, but more that we have quite a few codepaths (mostly around kms stuff) that simply can't handle any errors and hence not even -EAGAIN. Instead of adding proper failure paths so that we could restart these ioctls we've opted for the cheap way out of sleeping non-interruptibly. Which works everywhere but when the gpu dies, which this patch fixes. So essentially interruptible == false means 'wait for the gpu or die trying'.' This patch is a bit ugly because intel_ring_begin is all non-interruptible and hence only returns -EIO. But as the comment in there says, auditing all the callsites would be a pain. To avoid duplicating code, reuse i915_gem_check_wedge in __wait_seqno and intel_wait_ring_buffer. Also use the opportunity to clarify the different cases in i915_gem_check_wedge a bit with comments. v2: Don't access dev_priv->mm.interruptible from check_wedge - we might not hold dev->struct_mutex, making this racy. Instead pass interruptible in as a parameter. I've noticed this because I've hit a BUG_ON(!mutex_is_locked) at the top of check_wedge. This has been added in commit `b4aca0106c` Author: Ben Widawsky <ben@bwidawsk.net> Date: Wed Apr 25 20:50:12 2012 -0700 drm/i915: extract some common olr+wedge code although that commit is missing any justification for this. I guess it's just copy&paste, because the same commit add the same BUG_ON check to check_olr, where it indeed makes sense. But in check_wedge everything we access is protected by other means, so this is superflous. And because it now gets in the way (we add a new caller in __wait_seqno, which can be called without dev->struct_mutext) let's just remove it. v3: Group all the i915_gem_check_wedge refactoring into this patch, so that this patch here is all about not returning -EAGAIN to callsites that can't handle syscall restarting. v4: Add clarification what interuptible == fales means in our code, requested by Ben Widawsky. v5: Fix EAGAIN mispell noticed by Chris Wilson. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-05 10:01:14 +02:00
Daniel Vetter	cc889e0f6c	drm/i915: disable flushing_list/gpu_write_list This is just the minimal patch to disable all this code so that we can do decent amounts of QA before we rip it all out. The complicating thing is that we need to flush the gpu caches after the batchbuffer is emitted. Which is past the point of no return where execbuffer can't fail any more (otherwise we risk submitting the same batch multiple times). Hence we need to add a flag to track whether any caches associated with that ring are dirty. And emit the flush in add_request if that's the case. Note that this has a quite a few behaviour changes: - Caches get flushed/invalidated unconditionally. - Invalidation now happens after potential inter-ring sync. I've bantered around a bit with Chris on irc whether this fixes anything, and it might or might not. The only thing clear is that with these changes it's much easier to reason about correctness. Also rip out a lone get_next_request_seqno in the execbuffer retire_commands function. I've dug around and I couldn't figure out why that is still there, with the outstanding lazy request stuff it shouldn't be necessary. v2: Chris Wilson complained that I also invalidate the read caches when flushing after a batchbuffer. Now optimized. v3: Added some comments to explain the new flushing behaviour. Cc: Eric Anholt <eric@anholt.net> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-20 13:54:28 +02:00
Ben Widawsky	f2ef6eb145	drm/i915: switch to default context on idle To keep things as sane as possible, switch to the default context before idling. This should help free context objects, as well as put things in a more well defined state before suspending. v2: remove seqno from context switch call (daniel) return error on failed context switch instead of WARN+continue (daniel) v3: move idling to i915_gpu idle (from i915_gem_idle) (Chris) Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2012-06-14 17:36:20 +02:00
Ben Widawsky	254f965c39	drm/i915: preliminary context support Very basic code for context setup/destruction in the driver. Adds the file i915_gem_context.c This file implements HW context support. On gen5+ a HW context consists of an opaque GPU object which is referenced at times of context saves and restores. With RC6 enabled, the context is also referenced as the GPU enters and exists from RC6 (GPU has it's own internal power context, except on gen5). Though something like a context does exist for the media ring, the code only supports contexts for the render ring. In software, there is a distinction between contexts created by the user, and the default HW context. The default HW context is used by GPU clients that do not request setup of their own hardware context. The default context's state is never restored to help prevent programming errors. This would happen if a client ran and piggy-backed off another clients GPU state. The default context only exists to give the GPU some offset to load as the current to invoke a save of the context we actually care about. In fact, the code could likely be constructed, albeit in a more complicated fashion, to never use the default context, though that limits the driver's ability to swap out, and/or destroy other contexts. All other contexts are created as a request by the GPU client. These contexts store GPU state, and thus allow GPU clients to not re-emit state (and potentially query certain state) at any time. The kernel driver makes certain that the appropriate commands are inserted. There are 4 entry points into the contexts, init, fini, open, close. The names are self-explanatory except that init can be called during reset, and also during pm thaw/resume. As we expect our context to be preserved across these events, we do not reinitialize in this case. As Adam Jackson pointed out, The cutoff of 1MB where a HW context is considered too big is arbitrary. The reason for this is even though context sizes are increasing with every generation, they have yet to eclipse even 32k. If we somehow read back way more than that, it probably means BIOS has done something strange, or we're running on a platform that wasn't designed for this. v2: rename load/unload to init/fini (daniel) remove ILK support for get_size() (indirectly daniel) add HAS_HW_CONTEXTS macro to clarify supported platforms (daniel) added comments (Ben) Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2012-06-14 17:36:16 +02:00
Daniel Vetter	8ecd1a6615	drm/i915: call intel_enable_gtt When drm/i915 is in control of the gtt, we need to call the enable function at all the relevant places ourselves. Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-12 22:21:07 +02:00
Daniel Vetter	dd2757f8b5	drm/i915: stop using dev->agp->base For that to work we need to export the base address of the gtt mmio window from intel-gtt. Also replace all other uses of dev->agp by values we already have at hand. Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-12 22:18:06 +02:00
Ben Widawsky	eac1f14fd1	drm/i915: Inifite timeout for wait ioctl Change the ns_timeout parameter of the wait ioctl to a signed value. Doing this allows the kernel to provide an infinite wait when a timeout of less than 0 is provided. This mimics select/poll. Initially the parameter was meant to match up with the GL spec 1:1, but after being made aware of how much 2^64 - 1 nanoseconds actually is, I do not think anyone will ever notice the loss of 1 bit. The infinite timeout on waiting is similar to the existing i915 userspace interface with the exception that struct_mutex is dropped while doing the wait in this ioctl. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-06 12:25:46 +02:00
Daniel Vetter	30dfebf34b	drm/i915: extract object active state flushing code Both busy_ioctl and the new wait_ioct need to do the same dance (or at least should). Some slight changes: - busy_ioctl now unconditionally checks for olr. Before emitting a require flush would have prevent the olr check and hence required a second call to the busy ioctl to really emit the request. - the timeout wait now also retires request. Not really required for abi-reasons, but makes a notch more sense imo. I've tested this by pimping the i-g-t test some more and also checking the polling behviour of the wait_rendering_timeout ioctl versus what busy_ioctl returns. v2: Too many people complained about unplug, new color is flush_active. v3: Kill the comment about the unplug moniker. v4: s/un-active/inactive/ Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-02 20:51:03 +02:00
Daniel Vetter	e269f90f3d	Merge remote-tracking branch 'airlied/drm-prime-vmap' into drm-intel-next-queued We need the latest dma-buf code from Dave Airlie so that we can pimp the backing storage handling code in drm/i915 with Chris Wilson's unbound tracking and stolen mem backed gem object code. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-01 10:52:54 +02:00

1 2 3 4 5 ...

630 commits