linux

mirror of https://github.com/torvalds/linux synced 2024-10-24 20:29:05 +00:00

Author	SHA1	Message	Date
Chris Wilson	cecc21fea9	drm/i915: Align the hangcheck wakeup to the nearest second round_jiffies() aligns the wakeup time to the nearest second in order to batch wakeups and reduce system load, which is useful for unimportant coarse timers like our hangcheck. v2: round_jiffies_relative() returns the relative jiffie value, whereas we need the absolute value for the timer. Suggested-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Arjan van de Ven <arjan@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-08 18:44:36 +02:00
Chris Wilson	2f745ad3d3	drm/i915: Convert the dmabuf object to use the new i915_gem_object_ops By providing a callback for when we need to bind the pages, and then release them again later, we can shorten the amount of time we hold the foreign pages mapped and pinned, and importantly the dmabuf objects then behave as any other normal object with respect to the shrinker and memory management. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:23:10 +02:00
Ben Widawsky	c8735b0c3e	drm/i915: #define gpu freq multipler Magic numbers are bad mmmkay. In this case in particular the value is especially weird because the docs say multiple things. We'll need this value for sysfs, so extracting it is useful for that as well. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:23:00 +02:00
Chris Wilson	9da3da660d	drm/i915: Replace the array of pages with a scatterlist Rather than have multiple data structures for describing our page layout in conjunction with the array of pages, we can migrate all users over to a scatterlist. One major advantage, other than unifying the page tracking structures, this offers is that we replace the vmalloc'ed array (which can be up to a megabyte in size) with a chain of individual pages which helps reduce memory pressure. The disadvantage is that we then do not have a simple array to iterate, or to access randomly. The common case for this is in the relocation processing, which will typically fit within a single scatterlist page and so be almost the same cost as the simple array. For iterating over the array, the extra function call could be optimised away, but in reality is an insignificant cost of either binding the pages, or performing the pwrite/pread. v2: Fix drm_clflush_sg() to not invoke wbinvd as well! And fix the trivial compile error from rebasing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:57 +02:00
Chris Wilson	a5570178c0	drm/i915: Pin backing pages whilst exporting through a dmabuf vmap We need to refcount our pages in order to prevent reaping them at inopportune times, such as when they are currently vmapped or exported to another driver. However, we also wish to keep the lazy deallocation of our pages so we need to take a pin/unpin approach rather than a simple refcount. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:56 +02:00
Chris Wilson	37e680a15f	drm/i915: Introduce drm_i915_gem_object_ops In order to specialise functions depending upon the type of object, we can attach vfuncs to each object via a new ->ops pointer. For instance, this will be used in future patches to only bind pages from a dma-buf for the duration that the object is used by the GPU - and so prevent them from pinning those pages for the entire lifetime of the object. v2: Bonus comments. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:55 +02:00
Daniel Vetter	a1ceb67751	Merge the modeset-rework, basic conversion into drm-intel-next As a quick reference I'll detail the motivation and design of the new code a bit here (mostly stitched together from patchbomb announcements and commits introducing the new concepts). The crtc helper code has the fundamental assumption that encoders and crtcs can be enabled/disabled in any order, as long as we take care of dependencies (which means that enabled encoders need an enabled crtc to feed them data, essentially). Our hw works differently. We already have tons of ugly cases where crtc code enables encoder hw (or encoder->mode_set enables stuff that should only be enabled in encoder->commit) to work around these issues. But on the disable side we can't pull off similar tricks - there we actually need to rework the modeset sequence that controls all this. And this is also the real motivation why I've finally undertaken this rewrite: eDP on my shiny new Ivybridge Ultrabook is broken, and it's broken due to the wrong disable sequence ... The new code introduces a few interfaces and concepts: - Add new encoder->enable/disable functions which are directly called from the crtc->enable/disable function. This ensures that the encoders can be enabled/disabled at a very specific point in the modeset sequence, controlled by our platform specific code (instead of the crtc helper code calling them at a time it deems convenient). - Rework the dpms code - our code has mostly 1:1 connector:encoder mappings and does support cloning on only a few encoders, so we can simplify things quite a bit. - Also only ever disable/enable the entire output pipeline. This ensures that we obey the right sequence of enabling/disabling things, trying to be clever here mostly just complicates the code and results in bugs. For cloneable encoders this requires a bit of special handling to ensure that outputs can still be disabled individually, but it simplifies the common case. 
- Add infrastructure to read out the current hw state. No amount of careful ordering will help us if we brick the hw on the initial modeset setup. Which could happen if we just randomly disable things, oblivious to the state set up by the bios. Hence we need to be able to read that out. As a benefit, we grow a few generic functions useful to cross-check our modeset code with actual hw state. With all this in place, we can copy&paste the crtc helper code into the drm/i915 driver and start to rework it: - As detailed above, the new code only disables/enables an entire output pipe. As a preparation for global mode-changes (e.g. reassigning shared resources) it keeps track of which pipes need to be touched by a set of bitmasks. - To ensure that we correctly disable the current display pipes, we need to know the currently active connector/encoder/crtc linking. The old crtc helper simply overwrote these links with the new setup, the new code stages the new links in ->new_* pointers. Those get committed to the real linking pointers once the old output configuration has been torn down, before the ->mode_set callbacks are called. - Finally the code adds tons of self-consistency checks by employing the new hw state readout functions to cross-check the actual hw state with what the data structures think it should be. These checks are done both after every modeset and after the hw state has been read out and sanitized at boot/resume time. All these checks greatly helped in tracking down regressions and bugs in the new code. With this new basis, a lot of cleanups and improvements to the code are now possible (besides the DP fixes that ultimately made me write this), but not yet done: - I think we should create struct intel_mode and use it as the adjusted mode everywhere to store little pieces like needs_tvclock, pipe dithering values or dp link parameters. That would still be a layering violation, but at least we wouldn't need to recompute these kinds of things in intel_display.c. 
Especially the port bpc computation needed for selecting the pipe bpc and dithering settings in intel_display.c is rather gross. - In a related rework we could implement ->mode_valid in terms of ->mode_fixup in a generic way - I've hunted down too many bugs where ->mode_valid did the right thing, but ->mode_fixup didn't. Or vice versa, resulting in funny bugs for user-supplied modes. - Ditch the idea to rework the hpd handling in the common crtc helper code and just move things to i915.ko. Which would rid us of the ->detect crtc helper dependencies. - LVDS wire pair and pll enabling is all done in the crtc->mode_set function currently. We should be able to move this to the crtc_enable callbacks (or in the case of the LVDS wire pair enabling, into some encoder callback). Last, but not least, this new code should also help in enabling a few neat features: The hw state readout code prepares (but there are still big pieces missing) for fastboot, i.e. avoiding the initial modeset at boot-up and just taking over the configuration left behind by the bios. We also should be able to extend the configuration checks in the beginning of the modeset sequence and make better decisions about shared resources (which is the entire point behind the atomic/global modeset ioctl). Tested-by: Jani Nikula <jani.nikula@intel.com> Tested-by: Ben Widawsky <ben@bwidawsk.net> Tested-by: Damien Lespiau <damien.lespiau@intel.com> Tested-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Vijay Purushothaman <vijay.a.purushothaman@intel.com> Acked-by: Vijay Purushothaman <vijay.a.purushothaman@intel.com> Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Acked-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Tested-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-06 22:52:43 +02:00
Daniel Vetter	2492935248	drm/i915: read out the modeset hw state at load and resume time ... instead of resetting a few things and hoping that this will work out. To properly disable the output pipelines at the initial modeset after resume or boot up we need to have an accurate picture of which outputs are enabled and connected to which crtcs. Otherwise we risk disabling things at the wrong time, which can lead to hangs (or at least royally confused panels), both requiring a walk to the reset button to fix. Hence read out the hw state with the freshly introduced get_hw_state functions and then sanitize it afterwards. For a full modeset readout (which would allow us to avoid the initial modeset at boot up) a few things are still missing: - Reading out the mode from the pipe, especially the dotclock computation is quite some fun. - Reading out the parameters for the stolen memory framebuffer and wrapping it up. - Reading out the pch pll connections - luckily the disable code simply bails out if the crtc doesn't have a pch pll attached (even for configurations that would need one). This patch here turned up tons of smelly stuff around resume: We restore tons of registers in a seemingly random way (well, not quite, but we're not too careful either), which leaves the hw in a rather ill-defined state: E.g. the port registers are sometimes unconditionally restored (lvds, crt), leaving us with an active encoder/connector but no active pipe connected to it. Luckily the hw state sanitizer detects this madness and fixes things up a bit. v2: When checking whether an encoder with active connectors has a crtc wired up to it, check for both the crtc _and_ its active state. v3: - Extract intel_sanitize_encoder. - Manually disable active encoders without an active pipe. v4: Correctly fix up the pipe<->plane mapping on machines where we switch pipes/planes. Noticed by Chris Wilson, who also provided the fixup. 
v5: Spelling fix in a comment, noticed by Paulo Zanoni Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-06 07:59:24 +02:00
Daniel Vetter	76e5a89c0a	drm/i915: add crtc->enable/disable vfuncs insted of dpms Because that's what we're essentially calling. This is the first step in untangling the crtc_helper induced dpms handling mess we have - at the crtc level we only have 2 states and the magic is just in selecting which one (and atm there isn't even much magic, but on recent platforms where not even the crt output has more than 2 states we could do better). Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-06 07:52:00 +02:00
Daniel Vetter	20e4d407fb	drm/ips: move drps/ips/ilk related variables into dev_priv->ips Like with the equivalent change for gen6+ rps state, this helps in clarifying the code (and in fixing a few places that have fallen through the cracks in the locking review). Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-03 10:09:27 +02:00
Ben Widawsky	050ee91f12	drm/i915: Use new INSTDONE registers (Gen7+) Using the extracted INSTDONE reading, and our new register definitions, update our hangcheck detection and error collection to use it. This primarily means changing == to memcmp, and changing = to memcpy. Hopefully this will give more info on error dump, and provide more accurate hangcheck detection (both are actually TBD). Also, remove the reading of instdone1 from the ring error collection function, and just crap everything in capture_error_state (that could be split into a separate patch if it wasn't so trivial). v2: Now assuming i915_get_extra_instdone does the memset we can clean up the code a bit (Jani) v3: use ARRAY_SIZE as requested earlier by Jani (didn't change sizeof) Updated commit msg Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 16:58:36 +02:00
Chris Wilson	0327d6ba99	drm/i915: Extract general object init routine As we wish to create specialised object constructions in the near future that share the same basic GEM object struct, export the default initializer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 02:04:38 +02:00
Chris Wilson	86a1ee26bb	drm/i915: Only pwrite through the GTT if there is space in the aperture Avoid stalling and waiting for the GPU by checking to see if there is sufficient inactive space in the aperture for us to bind the buffer prior to writing through the GTT. If there is inadequate space we will have to stall waiting for the GPU, and incur overheads moving objects about. Instead, only incur the clflush overhead on the target object by writing through shmem. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 02:03:33 +02:00
Ben Widawsky	71e172e8d1	drm/i915: Add ERR_INT to gen7 error state ERR_INT can generate interrupts. However since most of the conditions seem quite fatal the patch opts to simply report it in error state instead of adding more complexity to the interrupt handler for little gain (the bits are sticky anyway). Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Antti Koskipaa <antti.koskipaa@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-22 18:05:54 +02:00
Chris Wilson	6c085a728c	drm/i915: Track unbound pages When dealing with a working set larger than the GATT, or even the mappable aperture when touching through the GTT, we end up with evicting objects only to rebind them at a new offset again later. Moving an object into and out of the GTT requires clflushing the pages, thus causing a double-clflush penalty for rebinding. To avoid having to clflush on rebinding, we can track the pages as they are evicted from the GTT and only relinquish those pages on memory pressure. As usual, if it were not for the handling of out-of-memory condition and having to manually shrink our own bo caches, it would be a net reduction of code. Alas. Note: The patch also contains a few changes to the last-hope evict_everything logic in i915_gem_execbuffer.c - we no longer try to only evict the purgeable stuff in a first try (since that's superfluous and only helps in OOM corner-cases, not fragmented-gtt thrashing situations). Also, the extraction of the get_pages retry loop from bind_to_gtt (and other callsites) to get_pages should imo have been a separate patch. v2: Ditch the newly added put_pages (for unbound objects only) in i915_gem_reset. A quick irc discussion hasn't revealed any important reason for this, so if we need this, I'd like to have a git blame'able explanation for it. v3: Undo the s/drm_malloc_ab/kmalloc/ in get_pages that Chris noticed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Split out code movements and rant a bit in the commit message with a few Notes. Done v2] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-21 14:34:11 +02:00
Daniel Vetter	5d985ac81a	drm/i915: kill a few unused things in dev_priv ... and move a few others only used by i915_dma.c into the dri1 dungeon. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-17 10:10:03 +02:00
Daniel Vetter	35eb73234b	drm/i915: kill dev_priv->mchdev_lock It's only ever a pointer to the global mchdev_lock, and we don't use it at all. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-09 21:53:01 +02:00
Daniel Vetter	c6a828d326	drm/i915: move all rps state into dev_priv->rps This way it's easier to see what belongs together, and what is used by the ilk ips code. Also add some comments that explain the locking. Note that (cur|min|max)_delay need to be duplicated, because they're also used by the ips code. v2: Missed one place that the dev_priv->ips change caught ... Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-09 21:52:22 +02:00
Daniel Vetter	c96ea64ebb	drm/i915: dump the device info Handy for lazy people like me, or when people forget to add the output of lspci -nn. v2: Chris Wilson noticed that we have this duplicated already in the i915_capabilites debugfs file. But there \n as separator looks better, which would be a bit verbose in dmesg. Abuse the preprocessor to extract this all. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-09 18:29:21 +02:00
Daniel Vetter	acbe947550	drm/i915: rip out sanitize_pm again We believe we have squashed all issues around the gen6+ rps interrupt generation and why the gpu sometimes got stuck. With that cleared up, there's no user left for the sanitize_pm infrastructure, so let's just rip it out. Note that 'intel_reg_write 0xa014 0x13070000' is the w/a if we find ourselves stuck again. Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-26 13:37:13 +02:00
Chris Wilson	e6994aeedc	drm/i915: Export ability of changing cache levels to userspace By selecting the cache level (essentially whether or not the CPU snoops any updates to the bo, and on more recent machines whether it resides inside the CPU's last-level-cache) a userspace driver is able to then manage all of its memory within buffer objects, if it so desires. This enables the userspace driver to accelerate uploads and more importantly downloads from the GPU and to be able to mix CPU and GPU rendering/activity efficiently. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Added code comment about where we plan to stuff platform specific caching control bits in the ioctl struct.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-26 12:56:25 +02:00
Chris Wilson	42d6ab4839	drm/i915: Segregate memory domains in the GTT using coloring Several functions of the GPU have the restriction that differing memory domains cannot be placed next to each other (as the GPU may prefetch beyond the end of one domain and hang as it crosses into the other domain). We use the facility of the drm_mm to mark ranges with a particular color that corresponds to the cache attributes of those pages in order to prevent allocating adjacent blocks of differing memory types. v2: Rebase ontop of drm_mm coloring v2. v3: Fix rebinding existing gtt_space and add a verification routine. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-26 12:56:25 +02:00
Ben Widawsky	f27b92651d	drm/i915: Expand DPF support to Haswell Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:57 +02:00
Ben Widawsky	e1ef7cc299	drm/i915: Macro to determine DPF support Originally I had a macro specifically for DPF support, and Daniel, with good reason, asked me to change it to this. It's not the way I would have gone (and indeed I didn't), but for now there is no distinction as all platforms with L3 also have DPF. Note: The good reasons are that dpf is a l3$ feature (at least on current hw), hence I don't expect one to go without the other. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: added note] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:56 +02:00
Chris Wilson	f047e395dd	drm/i915: Avoid concurrent access when marking the device as idle/busy As suggested by Daniel, rip out the independent timers for device and crtc busyness and integrate the manual powermanagement of the display engine into the GEM core and its request tracking. The benefits are that the code is a lot smaller, fewer moving parts and should fit more neatly into the overall activity tracking of the driver. v2: Complete overhaul and removal of the racy timers and workers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:56 +02:00
Chris Wilson	a7b9761d0a	drm/i915: Split i915_gem_flush_ring() into seperate invalidate/flush funcs By moving the function to intel_ringbuffer and currying the appropriate parameter, hopefully we make the callsites easier to read and understand. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:55 +02:00
Chris Wilson	69c2fc8913	drm/i915: Remove the per-ring write list This is now handled by a global flag to ensure we emit a flush before the next serialisation point (if we failed to queue one previously). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:53 +02:00
Chris Wilson	65ce302741	drm/i915: Remove the defunct flushing list As we guarantee to emit a flush before emitting the breadcrumb or the next batchbuffer, there is no further need for the flushing list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:52 +02:00
Chris Wilson	0201f1ecf4	drm/i915: Replace the pending_gpu_write flag with an explicit seqno As we always flush the GPU cache prior to emitting the breadcrumb, we no longer have to worry about the deferred flush causing the pending_gpu_write to be delayed. So we can instead utilize the known last_write_seqno to hopefully minimise the wait times. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:52 +02:00
Chris Wilson	3bb73aba1e	drm/i915: Allow late allocation of request for i915_add_request() Request preallocation was added to i915_add_request() in order to support the overlay. However, not all users care and can quite happily ignore the failure to allocate the request as they will simply repeat the request in the future. By pushing the allocation down into i915_add_request(), we can then remove some rather ugly error handling in the callers. v2: Nullify request->file_priv otherwise we chase a garbage pointer when retiring requests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:51 +02:00
Ben Widawsky	c0c7babc48	drm/i915: add register read IOCTL The interface's immediate purpose is to do synchronous timestamp queries as required by GL_TIMESTAMP. The GPU has a register for reading the timestamp but because that would normally require root access through libpciaccess, the IOCTL can provide this service instead. Currently the implementation whitelists only the render ring timestamp register, because that is the only thing we need to expose at this time. v2: make size implicit based on the register offset Add a generation check Reviewed-by: Eric Anholt <eric@anholt.net> Cc: Jacek Lawrynowicz <jacek.lawrynowicz@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: fixup the ioctl number:] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:49 +02:00
Chris Wilson	12f55818ba	drm/i915: Add comments to explain the BSD tail write workaround Having had to dive into the bspec to understand what each stage of the workaround meant, and how the ring broadcasting IDLE corresponded with the GT powering down the ring (i.e. rc6) add comments to aid the next reader. And since the register "is used to control all aspects of PSMI and power saving functions" that makes it quite interesting to inspect with regards to RC6 hangs, so add it to the error-state. v2: Rediscover the piece of magic, set the RNCID to 0 before waiting for the ring to wake up. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-20 12:21:37 +02:00
Daniel Vetter	6c2b7c1208	drm/i915: introduce for_each_encoder_on_crtc We already have this pattern at quite a few places, and moving part of the modeset helper stuff into the driver will add more. v2: Don't clobber the crtc struct name with the macro parameter ... v3: Convert two more places noticed by Paulo Zanoni. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-05 15:06:33 +02:00
Daniel Vetter	d6b2c790a4	drm/i915: non-interruptible sleeps can't handle -EAGAIN So don't return -EAGAIN, even in the case of a gpu hang. Remap it to -EIO instead. Note that this isn't really an issue with interruptibility, but more that we have quite a few codepaths (mostly around kms stuff) that simply can't handle any errors and hence not even -EAGAIN. Instead of adding proper failure paths so that we could restart these ioctls we've opted for the cheap way out of sleeping non-interruptibly. Which works everywhere but when the gpu dies, which this patch fixes. So essentially interruptible == false means 'wait for the gpu or die trying'. This patch is a bit ugly because intel_ring_begin is all non-interruptible and hence only returns -EIO. But as the comment in there says, auditing all the callsites would be a pain. To avoid duplicating code, reuse i915_gem_check_wedge in __wait_seqno and intel_wait_ring_buffer. Also use the opportunity to clarify the different cases in i915_gem_check_wedge a bit with comments. v2: Don't access dev_priv->mm.interruptible from check_wedge - we might not hold dev->struct_mutex, making this racy. Instead pass interruptible in as a parameter. I've noticed this because I've hit a BUG_ON(!mutex_is_locked) at the top of check_wedge. This has been added in commit `b4aca0106c` Author: Ben Widawsky <ben@bwidawsk.net> Date: Wed Apr 25 20:50:12 2012 -0700 drm/i915: extract some common olr+wedge code although that commit is missing any justification for this. I guess it's just copy&paste, because the same commit adds the same BUG_ON check to check_olr, where it indeed makes sense. But in check_wedge everything we access is protected by other means, so this is superfluous. And because it now gets in the way (we add a new caller in __wait_seqno, which can be called without dev->struct_mutex) let's just remove it. 
v3: Group all the i915_gem_check_wedge refactoring into this patch, so that this patch here is all about not returning -EAGAIN to callsites that can't handle syscall restarting. v4: Add clarification what interruptible == false means in our code, requested by Ben Widawsky. v5: Fix EAGAIN misspelling noticed by Chris Wilson. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-05 10:01:14 +02:00
Paulo Zanoni	45e6e3a1cd	drm/i915: get rid of dev_priv->info->has_pch_split Previously we had has_pch_split to tell us whether we had a PCH or not and we also had dev_priv->pch_type to tell us which kind of PCH it was, but it could only be used if we were 100% sure we did have a PCH. Now that PCH_NONE was added to dev_priv->pch_type we don't need has_pch_split anymore: we can just check for pch_type != PCH_NONE. The HAS_PCH_{IBX,CPT,LPT} macros use dev_priv->pch_type, so they can only be called after intel_detect_pch. The HAS_PCH_SPLIT macro looks at dev_priv->info->has_pch_split, which is available earlier. Since the goal is to implement HAS_PCH_SPLIT using dev_priv->pch_type instead of dev_priv->info->has_pch_split, we need to make sure that intel_detect_pch is called before any calls to HAS_PCH_SPLIT are made. So we moved the intel_detect_pch call to an earlier stage. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-05 09:56:05 +02:00
Paulo Zanoni	f035083055	drm/i915: add PCH_NONE to enum intel_pch And rely on the fact that it's 0 to assume that machines without a PCH will have PCH_NONE as dev_priv->pch_type. Just today I finally realized that HAS_PCH_IBX is true for machines without a PCH. IMHO this is totally counter-intuitive and I don't think it's a good idea to assume that we're going to check for HAS_PCH_IBX only after we check for HAS_PCH_SPLIT. I believe that in the future we'll have more PCH types and checks like: if (HAS_PCH_IBX(dev) || HAS_PCH_CPT(dev)) will become more and more common. There's a good chance that we may break non-PCH machines by adding these checks in code that runs on all machines. I also believe that the HAS_PCH_SPLIT check will become less common as we add more and more different PCH types. We'll probably start replacing checks like: if (HAS_PCH_SPLIT(dev)) foo(); else bar(); with: if (HAS_PCH_NEW(dev)) baz(); else if (HAS_PCH_OLD(dev) || HAS_PCH_IBX(dev)) foo(); else bar(); and this may break gen 2/3/4. As far as we have investigated, this patch will affect the behavior of intel_hdmi_dpms and intel_dp_link_down on gen 4. In both functions the code inside the HAS_PCH_IBX check is for IBX-specific workarounds, so we should be safe. If we start bisecting gen 2/3/4 bugs to this commit we should consider replacing the HAS_PCH_IBX checks with something else. V2: Improve commit message, list possible side effects and solution. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-05 09:56:05 +02:00
Daniel Vetter	930ebb4624	drm/i915: fix up ilk rc6 disabling confusion While creating the new enable/disable_gt_powersave functions in commit `8090c6b9da` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Sun Jun 24 16:42:32 2012 +0200 drm/i915: wrap up gt powersave enabling functions I've botched up the handling of ironlake_disable_rc6. Fix this up by calling it at the right place. Note though that ironlake_disable_rc6 does a bit more than just disabling rc6 - it also tears down all the allocated context objects. Hence we need to move intel_teardown_rc6 out and directly call it from intel_modeset_cleanup. Also properly mark ironlake_enable_rc6 as static and kill the un-used declaration in i915_drv.h. Note: In review a question popped out why disable_rc6 also tears down the backing object and why we should move that out - it's simply for consistency with gen6+ rps code, which does it that way. Cc: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-05 09:56:04 +02:00
Chris Wilson	990bbdadab	drm/i915: Group the GT routines together in both code and vtable Tidy up the routines for interacting with the GT (in particular the forcewake dance) which are scattered throughout the code in a single structure. v2: use wait_for_atomic for polling. v3: really use wait_for_atomic for polling. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-03 22:08:46 +02:00
Daniel Vetter	7b0cfee1a2	Linux 3.5-rc4 Merge tag 'v3.5-rc4' into drm-intel-next-queued I want to merge the "no more fake agp on gen6+" patches into drm-intel-next (well, the last pieces). But a patch in 3.5-rc4 also adds a new use of dev->agp. Hence the backmerge to sort this out, for otherwise drm-intel-next merged into Linus' tree would conflict in the relevant code, things would compile but nicely OOPS at driver load :( Conflicts in this merge are just simple cases of "both branches changed/added lines at the same place". The only tricky part is to keep the order correct wrt the unwind code in case of errors in intel_ringbuffer.c (and the MI_DISPLAY_FLIP #defines in i915_reg.h together, obviously). Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_ringbuffer.c Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-25 19:10:36 +02:00
Jesse Barnes	9355360963	drm/i915: don't enable PPGTT on VLV yet Needs some more work and testing. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-20 22:49:46 +02:00
Daniel Vetter	df12c6d5ec	drm/i915: initialize the context idr unconditionally It doesn't hurt and it at least prevents us from OOPSing left and right at quite a few places. This also allows us to simplify the code a bit by folding the only line of context_open into the callsite. We obviously also need to run the cleanup code unconditionally, too. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-20 11:15:37 +02:00
Ben Widawsky	8e96d9c4d9	drm/i915: reset the GPU on context fini It's the only way we know how to make the GPU actually forget about the default context. Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2012-06-14 17:36:21 +02:00
Ben Widawsky	846248136d	drm/i915/context: create & destroy ioctls Add the interfaces to allow user space to create and destroy contexts. Contexts are destroyed automatically if the file descriptor for the dri device is closed. Following convention as usual here causes checkpatch warnings. v2: with is_initialized, no longer need to init at create drop the context switch on create (daniel) v3: Use interruptible lock (Chris) return -ENODEV in !GEM case (Chris) Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2012-06-14 17:36:20 +02:00
Ben Widawsky	b9a3906b60	drm/i915: add ccid to error state Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2012-06-14 17:36:19 +02:00
Ben Widawsky	e055684168	drm/i915: context switch implementation Implement the context switch code as well as the interfaces to do the context switch. This patch also doesn't match 1:1 with the RFC patches. The main difference is that from Daniel's responses the last context object is now stored instead of the last context. This allows us to free the context data structure and the context object independently. There is room for optimization: this code will pin the context object until the next context is active. The optimal way to do it is to actually pin the object, move it to the active list, do the context switch, and then unpin it. This allows the eviction code to actually evict the context object if needed. The context switch code is missing workarounds, they will be implemented in future patches. v2: actually do obj->dirty=1 in switch (daniel) Modified comment around above Remove flags to context switch (daniel) Move mi_set_context code to i915_gem_context.c (daniel) Remove seqno, use lazy request instead (daniel) v3: use i915_gem_request_next_seqno instead of outstanding_lazy_request (Daniel) remove id's from trace events (Daniel) Put the context BO in the instruction domain (Daniel) Don't unref the BO if context switch fails (Chris) Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2012-06-14 17:36:17 +02:00
Ben Widawsky	40521054fd	drm/i915: context basic create & destroy Invent an abstraction for a hw context which is passed around through the core functions. The main bit a hw context holds is the buffer object which backs the context. The rest of the members are just helper functions. Specifically the ring member, which could likely go away if we decide to never implement whatever other hw context support exists. Of note here is the introduction of the 64k alignment constraint for the BO. If contexts become heavily used, we should consider tweaking this down to 4k. Until the contexts are merged and tested a bit though, I think 64k is a nice start (based on docs). Since we don't yet switch contexts, there is really not much complexity here. Creation/destruction works pretty much as one would expect. An idr is used to generate the context id numbers which are unique per file descriptor. v2: add DRM_DEBUG_DRIVERS to distinguish ENOMEM failures (ben) convert a BUG_ON to WARN_ON, default destruction is still fatal (ben) Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2012-06-14 17:36:16 +02:00
Ben Widawsky	254f965c39	drm/i915: preliminary context support Very basic code for context setup/destruction in the driver. Adds the file i915_gem_context.c This file implements HW context support. On gen5+ a HW context consists of an opaque GPU object which is referenced at times of context saves and restores. With RC6 enabled, the context is also referenced as the GPU enters and exits RC6 (GPU has its own internal power context, except on gen5). Though something like a context does exist for the media ring, the code only supports contexts for the render ring. In software, there is a distinction between contexts created by the user, and the default HW context. The default HW context is used by GPU clients that do not request setup of their own hardware context. The default context's state is never restored to help prevent programming errors. This would happen if a client ran and piggy-backed off another client's GPU state. The default context only exists to give the GPU some offset to load as the current to invoke a save of the context we actually care about. In fact, the code could likely be constructed, albeit in a more complicated fashion, to never use the default context, though that limits the driver's ability to swap out, and/or destroy other contexts. All other contexts are created as a request by the GPU client. These contexts store GPU state, and thus allow GPU clients to not re-emit state (and potentially query certain state) at any time. The kernel driver makes certain that the appropriate commands are inserted. There are 4 entry points into the contexts, init, fini, open, close. The names are self-explanatory except that init can be called during reset, and also during pm thaw/resume. As we expect our context to be preserved across these events, we do not reinitialize in this case. As Adam Jackson pointed out, the cutoff of 1MB where a HW context is considered too big is arbitrary. The reason for this is even though context sizes are increasing with every generation, they have yet to eclipse even 32k. If we somehow read back way more than that, it probably means BIOS has done something strange, or we're running on a platform that wasn't designed for this. v2: rename load/unload to init/fini (daniel) remove ILK support for get_size() (indirectly daniel) add HAS_HW_CONTEXTS macro to clarify supported platforms (daniel) added comments (Ben) Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2012-06-14 17:36:16 +02:00
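The size cutoff discussed in the commit above amounts to a plausibility check on the value read back from the hardware. A minimal sketch follows, assuming hypothetical names (`HW_CONTEXT_SIZE_LIMIT`, `hw_context_size_is_plausible`) and assuming the limit itself is still accepted; the commit message only states that 1MB is the arbitrary cutoff and that real context sizes have stayed under 32k.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical name for the arbitrary 1MB cutoff from the commit message. */
#define HW_CONTEXT_SIZE_LIMIT ((size_t)1024 * 1024)

/* The commit notes that real context sizes have not exceeded 32k. */
static_assert(HW_CONTEXT_SIZE_LIMIT > 32 * 1024,
	      "cutoff should sit well above known context sizes");

/*
 * Returns true if a context size read back from the hardware looks usable.
 * Treating exactly 1MB as acceptable and 0 as invalid are assumptions made
 * for this sketch.
 */
static inline bool hw_context_size_is_plausible(size_t size)
{
	return size > 0 && size <= HW_CONTEXT_SIZE_LIMIT;
}
```

A caller would skip context setup when the check fails, on the theory that an implausible size points to odd BIOS programming or an unsupported platform.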
Daniel Vetter	dd2757f8b5	drm/i915: stop using dev->agp->base For that to work we need to export the base address of the gtt mmio window from intel-gtt. Also replace all other uses of dev->agp by values we already have at hand. Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-12 22:18:06 +02:00
Dave Airlie	6cf98d6ebb	Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes * 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel: drm/i915: pch_irq_handler -> {ibx, cpt}_irq_handler char/agp: add another Ironlake host bridge drm/i915: fix up ivb plane 3 pageflips drm/i915: hold forcewake around ring hw init drm/i915: Mark the ringbuffers as being in the GTT domain drm/i915/crt: Do not rely upon the HPD presence pin drm/i915: Reset last_retired_head when resetting ring	2012-06-08 09:42:35 +01:00
Daniel Vetter	b7884eb45e	drm/i915: hold forcewake around ring hw init Empirical evidence suggests that we need to: On at least one ivb machine when running the hangman i-g-t test, the rings don't initialize properly - the RING_START registers seem to be stuck at all zeros. Holding forcewake around this register init sequence makes chip reset reliable again. Note that this is not the first such issue: commit `f01db988ef` Author: Sean Paul <seanpaul@chromium.org> Date: Fri Mar 16 12:43:22 2012 -0400 drm/i915: Add wait_for in init_ring_common added delay loops to make RING_START and RING_CTL initialization reliable on the blt ring at boot-up. So I guess it won't hurt if we do this unconditionally for all force_wake needing gpus. To avoid copy&pasting of the HAS_FORCE_WAKE check I've added a new intel_info bit for that. v2: Fixup missing commas in static struct and properly handling the error case in init_ring_common, both noticed by Jani Nikula. Cc: stable@vger.kernel.org Reported-and-tested-by: Yang Guang <guang.a.yang@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50522 Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-04 20:25:29 +02:00

555 commits