Commit graph

148 commits

Author SHA1 Message Date
Daniel Vetter 31b14c9fc5 drm/i915: invalidate render cache on gen2
It looks like we also need to flush the render cache when we just
invalidate it. This fixes a regression in i-g-t/gem_tiled_blits on my
i855gm. I guess the render cache there is virtually indexed, so we
need to clean it when changing gtt mappings.

This regression has been introduce in

commit 46f0f8d120
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Apr 18 11:12:11 2012 +0100

    drm/i915: Don't set a MBZ bit in gen2/3 MI_FLUSH

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-20 09:28:06 +02:00
Chris Wilson 46f0f8d120 drm/i915: Don't set a MBZ bit in gen2/3 MI_FLUSH
On gen2 MI_EXE_FLUSH is actually an AGP flush bit and on gen3 marked as
reserved.  On both it is documented as being must-be-zero. So obey the
documentation, and separate the gen2 flush into its own little routine
and share with gen3.

This means that we can rename the existing render_ring_flush() to
reflect the generation from which it first applies and remove the code
for handling earlier generations from it.

v2: Applies to gen3 as well
v3: Make it compile and improve the commit message.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-18 12:39:57 +02:00
Chris Wilson 65f5687603 drm/i915: Replace open coded MI_BATCH_GTT
The (2<<6) virtual memory space selector harks back to gen3 and is
mandatory given our use of GTT space for batchbuffers. On gen4+, use of
the GTT became mandatory and bit6 marked reserved. However the code must
now explicitly set (1<<7), which conveniently is also (2<<6).

To clarify the meaning for future readers, replace the open coded (2<<6)
with MI_BATCH_GTT.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-18 11:11:14 +02:00
Ben Widawsky c43b563403 drm/i915: [sparse] trivial sparse fixes
This should contain all the changes which require no thought to make
sparse happy.

Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-18 10:34:49 +02:00
Daniel Vetter 767878908e Linux 3.4-rc3
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQEbBAABAgAGBQJPi3XOAAoJEHm+PkMAQRiGnsUH9RjHwH4YFVyuP/DKtKa6zs74
 wqkpT15yITQ5WWMog4JaJFFg5rJCUd8QZr7AS/HSn0ijDyZX5VU7Rcs9cMudDzNR
 H/5K/AscS4fjb0HwWVqoltTWHRb9QGSwVN3+E3VCDLt9P89YJ0o3QztkkuEX5dkZ
 jc7reVXTfRnCcILEa9jleOzrn+OLM3j/jAjQ2hGunl8EDLzD4b17HHPoli4jEZ/5
 5ibpSVsPD+AqzN+glbXvYjVItl12D0IQos/JdOwfuZriCVWLxysSSwHZTbPCyvBZ
 LHH4HR5T+XLSXbjJeNkUFHLzqU+d5gVRadIoWtJCxqxFjKbOs2YtzJ5Ai0nDiw==
 =kTkC
 -----END PGP SIGNATURE-----

Merge tag 'v3.4-rc3' into drm-intel-next-queued

Backmerge Linux 3.4-rc3 into drm-intel-next to resolve a few things
that conflict/depend upon patches in -rc3:
- Second part of the Sandybridge workaround series - it changes some
  of the same registers.
- Preparation for Chris Wilson's fencing cleanup - we need the fix
  from -rc3 merged before we can move around all that code.
- Resolve the gmbus conflict - gmbus has been disabled in 3.4 again,
  but should be enabled on all generations in 3.5.

Conflicts:
	drivers/gpu/drm/i915/intel_i2c.c

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-17 11:16:20 +02:00
Daniel Vetter f637fde434 drm/i915: inline enable/disable_irq into ring->get/put_irq
Now that these are properly refactored this additional indirection
doesn't really buy us anything but confusion. Hence inline them.

This duplicates the ironlake gt enable/disable code snippet, but we've
already separate ilk from gen6+ gt irq in i915_irq.c, so I think this
makes more sense.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13 12:53:44 +02:00
Daniel Vetter 28f0cbf71f drm/i915: don't set up gem ring functions on gen5 for !kms
We already disallow initialition of gem in this case in the
corresponding ioctl, so don't bother setting up the gem support ring
functions in the legacy dri render ring init.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13 12:53:28 +02:00
Daniel Vetter 8620a3a908 drm/i915: consolidate ring->add_request a bit
They're indentical, so just kill one. Also give the other a prefix to
distinguish it from the gen6+ functions - this add_request function is
not really generic code.

v2: Fixup commit message as noted by Ben Widawsky.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13 12:52:17 +02:00
Daniel Vetter fb3256da8d drm/i915: split up ring->dispatch_execbuffer functions
Now that we can, we should split them up in a way that makes some
sense and banishes the IS_ checks into init code.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13 12:52:02 +02:00
Daniel Vetter 0fd2c20148 drm/i915: don't enable the gen6 bsd ring tail write enable on gen7
HW engineers have fixed this issue for ivb. Again, a nice cleanup
possible thanks to the more flexible ring initialization.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13 12:51:49 +02:00
Daniel Vetter e48d86347c drm/i915: split out the gen5 ring irq get/put functions
Now that we have sensibly split up, we can nicely get rid of that ugly
is_gen5 check.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13 12:51:37 +02:00
Daniel Vetter e367031966 drm/i915: abstract away ring-specific irq_get/put
Inspired by Ben Widawsky's patch for gen6+. Now after restructuring
how we set up the ring vtables and parameters, we can do this right.

This kills the bsd specific get/put_irq functions, they're now the
same.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13 12:51:07 +02:00
Daniel Vetter 686cb5f9f5 drm/i915: consolidate ring->sync-to functions
The waiter is always the ring itself (otherwise we'd have a decent
snafu in a callsite), so we can unify this easily.

Also give it the usual gen6_ prefix, in case anyone is foolish enough to
implement hw semaphores for gen5.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13 12:50:41 +02:00
Daniel Vetter b4178f8aaf drm/i915: don't set up rings on gen6+ for non-kms
It's not supported, and with the patch to refuse loading on gen6+
without kms enabled, there's also no way we can hit this.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13 12:42:35 +02:00
Daniel Vetter 3535d9dd5a drm/i915: dynamically set up blt ring functions and parameters
Just for consistency.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13 12:42:30 +02:00
Daniel Vetter 58fa383587 drm/i915: dynamically set up bsd ring functions and params
The same treatment for the bsd ring. Again, this will be split up
further by the irq rework.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13 12:42:25 +02:00
Daniel Vetter 59465b5f78 drm/i915: dynamically set up the render ring functions and params
Our hw is simply not well-designed enough that it neatly fits into
boxes. Everywhere else we set up vtables and similar things
dynamically using switch statements - it's simply much more flexible.

This is prep work to rework the pre-gen6 ring irq stuff - it'll add a
few more differences. With the current const struct templates, that
would be a mess.

This leads to some unfortunate duplication with the old dri1 code, but
we can reap that again because gen6 isn't actually supported there.
But that's for a separate patch.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13 12:41:36 +02:00
Daniel Vetter dfc9ef2fb0 drm/i915: set ring->size in common ring setup code
Eventually we want to scale the ring size depending upon available
gtt space. For now just consolidate this instead of replicating it
over all ringbuffer templates.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13 12:41:22 +02:00
Daniel Vetter 6a848ccb80 drm/i915: rip out ring->irq_mask
We only ever enable/disable one interrupt (namely user_interrupts and
pipe_notify), so we don't need to track the interrupt masking state.

Also rename irq_enable to irq_enable_mask, now that it won't collide -
beforehand both a irq_mask and irq_enable_mask would have looked a bit
strange.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-13 12:40:57 +02:00
Ben Widawsky 1500f7ea06 drm/i915: hide (seqno-1) in ringbuffer code
Waiting for seqno-1 in our object synchronization code is an
implementation detail given how we've decided to do the waits within the
rest of our code.

Requested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-12 21:14:14 +02:00
Dave Airlie effbc4fd8e Merge branch 'drm-intel-next' of git://people.freedesktop.org/~danvet/drm-intel into drm-core-next
Daniel Vetter wrote
First pull request for 3.5-next, slightly large than usual because new
things kept coming in since the last pull for 3.4.
Highlights:
- first batch of hw enablement for vlv (Jesse et al) and hsw (Eugeni). pci
 ids are not yet added, and there's still quite a few patches to merge
 (mostly modesetting). To make QA easier I've decided to merge this stuff
 in pieces.
- loads of cleanups and prep patches spurred by the above. Especially vlv
 is a real frankenstein chip, but also hsw is stretching our driver's
 code design. Expect more to come in this area for 3.5.
- more gmbus fixes, cleanups and improvements by Daniel Kurtz. Again,
 there are more patches needed (and some already queued up), but I wanted
 to split this a bit for better testing.
- pwrite/pread rework and retuning. This series has been in the works for
 a few months already and a lot of i-g-t tests have been created for it.
 Now it's finally ready to be merged.  Note that one patch in this series
 touches include/pagemap.h, that patch is acked-by akpm.
- reduce mappable pressure and relocation throughput improvements from
 Chris.
- mmap offset exhaustion mitigation by Chris Wilson.
- a start at figuring out which codepaths in our messy dri1/ums+gem/kms
 driver we actually need to support by bailing out of unsupported case.
 The driver now refuses to load without kms on gen6+ and disallows a few
 ioctls that userspace never used in certain cases. More of this will
 definitely come.
- More decoupling of global gtt and ppgtt.
- Improved dual-link lvds detection by Takashi Iwai.
- Shut up the compiler + plus fix the fallout (Ben)
- Inverted panel brightness handling (mostly Acer manages to break things
 in this way).
- Small fixlets and adjustements and some minor things to help debugging.

Regression-wise QA reported quite a few issues on ivb, but all of them
turned out to be hw stability issues which are already fixed in
drm-intel-fixes (QA runs the nightly regression tests on -next alone,
without -fixes automatically merged in). There's still one issue open on
snb, it looks like occlusion query writes are not quite as cache coherent
as we've expected. With some of the pwrite adjustements we can now
reliably hit this. Kernel workaround for it is in the works."

* 'drm-intel-next' of git://people.freedesktop.org/~danvet/drm-intel: (101 commits)
  drm/i915: VCS is not the last ring
  drm/i915: Add a dual link lvds quirk for MacBook Pro 8,2
  drm/i915: make quirks more verbose
  drm/i915: dump the DMA fetch addr register on pre-gen6
  drm/i915/sdvo: Include YRPB as an additional TV output type
  drm/i915: disallow gem init ioctl on ilk
  drm/i915: refuse to load on gen6+ without kms
  drm/i915: extract gt interrupt handler
  drm/i915: use render gen to switch ring irq functions
  drm/i915: rip out old HWSTAM missed irq WA for vlv
  drm/i915: open code gen6+ ring irqs
  drm/i915: ring irq cleanups
  drm/i915: add SFUSE_STRAP registers for digital port detection
  drm/i915: add WM_LINETIME registers
  drm/i915: add WRPLL clocks
  drm/i915: add LCPLL control registers
  drm/i915: add SSC offsets for SBI access
  drm/i915: add port clock selection support for HSW
  drm/i915: add S PLL control
  drm/i915: add PIXCLK_GATE register
  ...

Conflicts:
	drivers/char/agp/intel-agp.h
	drivers/char/agp/intel-gtt.c
	drivers/gpu/drm/i915/i915_debugfs.c
2012-04-12 10:27:01 +01:00
Chris Wilson 27c1cbd06a drm/i915/ringbuffer: Exclude last 2 cachlines of ring on 845g
The 845g shares the errata with i830 whereby executing a command
within 2 cachelines of the end of the ringbuffer may cause a GPU hang.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-11 12:14:24 +02:00
Daniel Vetter 901781b997 drm/i915: use render gen to switch ring irq functions
Top-level interrupt bits are usually found in the display block. It
therefore makes sense to use HAS_PCH_SPLIT in i915_irq.c

But the irq stuff in intel_ring.c only concerns itself with render
core/gt-level interrupt sources. It therefore makes more sense to
switch based on gpu gen.

Kills a vlv special case.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-09 18:04:07 +02:00
Ben Widawsky 25c063004a drm/i915: open code gen6+ ring irqs
We can now open-code the get/put irq functions as they were just
abstracting single register definitions.

It would be nice to merge this in with the IRQ handling code... but that
is too much work for me at present. In addition I could probably
collapse this in to a lot of the Ironlake stuff, but I don't think it's
worth the potential regressions.

This patch itself should not effect functionality.

CC: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-09 18:04:06 +02:00
Ben Widawsky e2a1e2f024 drm/i915: ring irq cleanups
- gen6 put/get only need one argument
    rflags and gflags are always the same (see above explanation)
- remove a couple redundantly defined IRQs
- reordered some lines to make things go in descending order

Every ring has its own interrupts, enables, masks, and status bits that
are fed into the main interrupt enable/mask/status registers. At one
point in time it seemed like a good idea to make our functions support
the notion that each interrupt may have a different bit position in the
corresponding register (blitter parser error may be bit n in IMR, but
bit m in blitter IMR). It turned out though that the HW designers did us
a solid on Gen6+ and this unfortunate situation has been avoided. This
allows our interrupt code to be cleaned up a bit.

I jammed this into one commit because there should be no functional
change with this commit, and staging it into multiple commits was
unnecessarily artificial IMO.

CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet:
- fixed up merged conflict with vlv changes.
- added GEN6 to GT blitter bit, we only use it on gen6+.
- added a comment to both ring irq bits and GT irq bits that on gen6+
  these alias.
- added comment that GT_BSD_USER_INTERRUPT is ilk-only.
- I've got confused a bit that we still use GT_USER_INTERRUPT on ivb
  for the render ring - but this goes back to ilk where we have only
  gt interrupt bits and so we be equally confusing if changed.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-09 18:04:05 +02:00
Daniel Vetter 1c7eaac737 drm/i915: apply CS reg readback trick against missed IRQ on snb
Ben Widawsky reported missed IRQ issues and this patch here helps.

We have one other missed IRQ report still left on snb, reported by QA:

https://bugs.freedesktop.org/show_bug.cgi?id=46145

This is _not_ a regression due to the forcewake voodoo though, it
started showing up before that was applied and has been on-and-off for
the past few weeks. According to QA this patch does not help. But the
missed IRQ is always from the blt ring (despite running piglit, so
also render activity expected), so I'm hopefully that this is an issue
with the blt ring itself.

Tested-by: Ben Widawsky <ben@bwidawsk.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-04-01 12:30:24 +02:00
Jesse Barnes 7e231dbe0c drm/i915: ValleyView IRQ support
ValleyView has a new interrupt architecture; best to put it in a new set
of functions.  Also make sure the ring mask functions handle ValleyView.

FIXME: fix flipping; need to enable interrupts and call prepare/finish

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-03-29 00:11:22 +02:00
Sean Paul f01db988ef drm/i915: Add wait_for in init_ring_common
I have seen a number of "blt ring initialization failed" messages
where the ctl or start registers are not the correct value. Upon further
inspection, if the code just waited a little bit, it would read the
correct value. Adding the wait_for to these reads should eliminate the
issue.

Signed-off-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-03-18 19:10:06 +01:00
Dave Airlie 8229c885fe drm: Merge tag 'v3.3-rc7' into drm-core-next
Merge the fixes so far into core-next, needed to test
intel driver.

Conflicts:
	drivers/gpu/drm/i915/intel_ringbuffer.c
2012-03-15 10:24:32 +00:00
Chris Wilson 5d031e5b63 drm/i915: Remove use of the autoreported ringbuffer HEAD position
This is a revert of 6aa56062ea.

This was originally introduced to workaround reads of the ringbuffer
registers returning 0 on SandyBridge causing hangs due to ringbuffer
overflow. The root cause here was reads through the GT powerwell require
the forcewake dance, something we only learnt of later. Now it appears
that reading the reported head position from the HWS is returning
garbage, leading once again to hangs.

For example, on q35 the autoreported head reports:
  [  217.975608] head now 00010000, actual 00010000
  [  436.725613] head now 00200000, actual 00200000
  [  462.956033] head now 00210000, actual 00210010
  [  485.501409] head now 00400000, actual 00400020
  [  508.064280] head now 00410000, actual 00410000
  [  530.576078] head now 00600000, actual 00600020
  [  553.273489] head now 00610000, actual 00610018
which appears reasonably sane. In contrast, if we look at snb:
  [  141.970680] head now 00e10000, actual 00008238
  [  141.974062] head now 02734000, actual 000083c8
  [  141.974425] head now 00e10000, actual 00008488
  [  141.980374] head now 032b5000, actual 000088b8
  [  141.980885] head now 03271000, actual 00008950
  [  142.040628] head now 02101000, actual 00008b40
  [  142.180173] head now 02734000, actual 00009050
  [  142.181090] head now 00000000, actual 00000ae0
  [  142.183737] head now 02734000, actual 00009050

In addition, the automatic reporting of the head position is scheduled
to be defeatured in the future. It has no more utility, remove it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45492
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Tested-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-27 08:49:56 -08:00
Chris Wilson a71d8d9452 drm/i915: Record the tail at each request and use it to estimate the head
By recording the location of every request in the ringbuffer, we know
that in order to retire the request the GPU must have finished reading
it and so the GPU head is now beyond the tail of the request. We can
therefore provide a conservative estimate of where the GPU is reading
from in order to avoid having to read back the ring buffer registers
when polling for space upon starting a new write into the ringbuffer.

A secondary effect is that this allows us to convert
intel_ring_buffer_wait() to use i915_wait_request() and so consolidate
upon the single function to handle the complicated task of waiting upon
the GPU. A necessary precaution is that we need to make that wait
uninterruptible to match the existing conditions as all the callers of
intel_ring_begin() have not been audited to handle ERESTARTSYS
correctly.

By using a conservative estimate for the head, and always processing all
outstanding requests first, we prevent a race condition between using
the estimate and direct reads of I915_RING_HEAD which could result in
the value of the head going backwards, and the tail overflowing once
again. We are also careful to mark any request that we skip over in
order to free space in ring as consumed which provides a
self-consistency check.

Given sufficient abuse, such as a set of unthrottled GPU bound
cairo-traces, avoiding the use of I915_RING_HEAD gives a 10-20% boost on
Sandy Bridge (i5-2520m):
  firefox-paintball  18927ms -> 15646ms: 1.21x speedup
  firefox-fishtank   12563ms -> 11278ms: 1.11x speedup
which is a mild consolation for the performance those traces achieved from
exploiting the buggy autoreported head.

v2: Add a few more comments and make request->tail a conservative
estimate as suggested by Daniel Vetter.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: resolve conflicts with retirement defering and the lack of
the autoreport head removal (that will go in through -fixes).]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-02-15 14:26:03 +01:00
Daniel Vetter 99ffa1629d drm/i915: enable forcewake voodoo also for gen6
We still have reports of missed irqs even on Sandybridge with the
HWSTAM workaround in place. Testing by the bug reporter gets rid of
them with the forcewake voodoo and no HWSTAM writes.

Because I've slightly botched the rebasing I've left out the ACTHD
readback which is also required to get IVB working. Seems to still
work on the tester's machine, so I think we should go with the more
minmal approach on SNB. Especially since I've only found weak evidence
for holding forcewake while waiting for an interrupt to arrive, but
none for the ACTHD readback.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45332
Tested-by: Nicolas Kalkhof nkalkhof()at()web.de
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-02-13 10:57:07 +01:00
Daniel Vetter 53d227f282 drm/i915: fixup seqno allocation logic for lazy_request
Currently we reserve seqnos only when we emit the request to the ring
(by bumping dev_priv->next_seqno), but start using it much earlier for
ring->oustanding_lazy_request. When 2 threads compete for the gpu and
run on two different rings (e.g. ddx on blitter vs. compositor)
hilarity ensued, especially when we get constantly interrupted while
reserving buffers.

Breakage seems to have been introduced in

commit 6f392d5486
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Aug 7 11:01:22 2010 +0100

    drm/i915: Use a common seqno for all rings.

This patch fixes up the seqno reservation logic by moving it into
i915_gem_next_request_seqno. The ring->add_request functions now
superflously still return the new seqno through a pointer, that will
be refactored in the next patch.

Note that with this change we now unconditionally allocate a seqno,
even when ->add_request might fail because the rings are full and the
gpu died. But this does not open up a new can of worms because we can
already leave behind an outstanding_request_seqno if e.g. the caller
gets interrupted with a signal while stalling for the gpu in the
eviciton paths. And with the bugfix we only ever have one seqno
allocated per ring (and only that ring), so there are no ordering
issues with multiple outstanding seqnos on the same ring.

v2: Keep i915_gem_get_seqno (but move it to i915_gem.c) to make it
clear that we only have one seqno counter for all rings. Suggested by
Chris Wilson.

v3: As suggested by Chris Wilson use i915_gem_next_request_seqno
instead of ring->oustanding_lazy_request to make the follow-up
refactoring more clearly correct. Also improve the commit message
with issues discussed on irc.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181
Tested-by: Nicolas Kalkhof nkalkhof()at()web.de
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-02-13 10:55:57 +01:00
Daniel Vetter 9edd576d89 Merge remote-tracking branch 'airlied/drm-fixes' into drm-intel-next-queued
Back-merge from drm-fixes into drm-intel-next to sort out two things:

- interlaced support: -fixes contains a bugfix to correctly clear
  interlaced configuration bits in case the bios sets up an interlaced
  mode and we want to set up the progressive mode (current kernels
  don't support interlaced). The actual feature work to support
  interlaced depends upon (and conflicts with) this bugfix.

- forcewake voodoo to workaround missed IRQ issues: -fixes only enabled
  this for ivybridge, but some recent bug reports indicate that we
  need this on Sandybridge, too. But in a slightly different flavour
  and with other fixes and reworks on top. Additionally there are some
  forcewake cleanup patches heading to -next that would conflict with
  currrent -fixes.

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-02-10 17:14:49 +01:00
Daniel Vetter 6a233c7887 drm/i915/ringbuffer: kill snb blt workaround
This was just to facilitate product enablement with pre-production hw.
Allows us to kill quite a bit of cruft.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-01-29 17:50:40 +01:00
Daniel Vetter 96154f2fab drm/i915: switch ring->id to be a real id
... and add a helpr function for the places where we want a flag.

This way we can use ring->id to index into arrays.

v2: Resurrect the missing beautification-space Chris Wilson noted.
I'm moving this space around because I'll reuse ring_str in the next
patch.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-01-29 17:32:58 +01:00
Eric Anholt 8d79c3490a drm/i915: Remove the MI_FLUSH_ENABLE setting.
We have always been using the wrong bit -- it's bit 12.  However, the
bit also doesn't do anything -- hardware has always accepted the
MI_FLUSH command even when it was specced not to.

Given that there is only one MI_FLUSH emitted in all of the driver
stack on gen6+ (in i965_video.c of the 2d driver, and it should be
using other code to do its flush instead), just remove the MI_FLUSH
enable instead of trying to fix it.

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-01-25 09:32:21 +01:00
Keith Packard 8f0fc977f5 Revert "drm/i915: Work around gen7 BLT ring synchronization issues."
This reverts commit 42ff6572e5.

New forcewake voodoo makes this no longer necessary.

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
2012-01-20 10:20:44 -08:00
Daniel Vetter 4cd53c0c8b drm/i915: paper over missed irq issues with force wake voodoo
Two things seem to do the trick on my ivb machine here:
- prevent the gt from powering down while waiting for seqno
  notification interrupts by grabbing the force_wake in get_irq (and
  dropping it in put_irq again).
- ordering writes from the ring's CS by reading a CS register, ACTHD
  seems to work.

Only the blt&bsd ring on ivb seem to be massively affected by this,
but for paranoia do this dance also on the render ring and on snb
(i.e. all gpus with forcewake).

Tested with Eric's glCopyPixels loop which without this patch scores a
missed irq every few seconds.

This patch needs my forcewake rework to use a spinlock instead of
dev->struct_mutex.

After crawling through docs a lot I've found the following nugget:

Internal doc "SNB GT PM Programming Guide", Section 4.3.1:

"GT does not generate interrupts while in RC6 (by design)"

So it looks like rc6 and irq generation are indeed related.

v2: Improve the comment per Eugeni Dodonov's suggestion.

v3: Add the documentation snipped. Also restrict the w/a to ivb only
for -fixes, as suggested by Keith Packard.

Cc: stable@kernel.org
Cc: Eric Anholt <eric@anholt.net>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Eugeni Dodonov <eugeni.dodonov@intel.com>
Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
2012-01-19 12:28:57 -08:00
Daniel Vetter e6bfaf8542 drm/i915: don't bail out of intel_wait_ring_buffer too early
In the pre-gem days with non-existing hangcheck and gpu reset code,
this timeout of 3 seconds was pretty important to avoid stuck
processes.

But now we have the hangcheck code in gem that goes to great length
to ensure that the gpu is really dead before declaring it wedged.

So there's no need for this timeout anymore. Actually it's even harmful
because we can bail out too early (e.g. with xscreensaver slip)
when running giant batchbuffers. And our code isn't robust enough
to properly unroll any state-changes, we pretty much rely on the gpu
reset code cleaning up the mess (like cache tracking, fencing state,
active list/request tracking, ...).

With this change intel_begin_ring can only fail when the gpu is
wedged, and it will return -EAGAIN (like wait_request in case the
gpu reset is still outstanding).

v2: Chris Wilson noted that on resume timers aren't running and hence
we won't ever get kicked out of this loop by the hangcheck code. Use
an insanely large timeout instead for the HAS_GEM case to prevent
resume bugs from totally hanging the machine.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
2012-01-03 10:26:31 -08:00
Eric Anholt 42ff6572e5 drm/i915: Work around gen7 BLT ring synchronization issues.
Previous to this commit, testing easily reproduced a failure where the
seqno would apparently arrive after the IRQ associated with it, with test programs as simple as:

for (;;) {
    glCopyPixels(0, 0, 1, 1);
    glFinish();
}

Various workarounds we've seen for previous generations didn't work to
fix this issue, so until new information comes in, replace the IRQ
waits on the BLT ring with polling.

Signed-off-by: Eric Anholt <eric@anholt.net>
Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Keith Packard <keithp@keithp.com>
2012-01-03 09:31:15 -08:00
Ben Widawsky 84f9f938be drm/i915: Force sync command ordering (Gen6+)
The docs say this is required for Gen7, and since the bit was added for
Gen6, we are also setting it there pit pf paranoia. Particularly as
Chris points out, if PIPE_CONTROL counts as a 3d state packet.

This was found through doc inspection by Ken and applies to Gen6+;

Reported-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Keith Packard <keithp@keithp.com>
2012-01-03 09:09:44 -08:00
Jesse Barnes 8d31528703 drm/i915: Use PIPE_CONTROL for flushing on gen6+.
v2 by danvet: Use a new flag to flush the render target cache on gen6+
(hw reuses the old write flush bit), as suggested by Ben Widawsdy.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
[danvet: this seems to fix cairo-perf-trace hangs on my snb]
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-10-20 15:26:41 -07:00
Kenneth Graunke 9d971b3753 drm/i915: Rename PIPE_CONTROL bit defines to be less terse.
"STALL_AT_SCOREBOARD" is much clearer than "STALL_EN" now that there are
several different kinds of stalls.  Also, "INSTRUCTION_CACHE_INVALIDATE"
is a lot easier to understand at a glance than the terse "IS_FLUSH."

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
[danvet: use INVALIDATE for ro cache flags for more consistency]
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-10-20 15:26:40 -07:00
Kenneth Graunke fcbc34e4dc drm/i915: Remove implied length of 2 from GFX_OP_PIPE_CONTROL #define.
Not all PIPE_CONTROLs have a length of 2, so remove it from the #define
and make each invocation specify the desired length.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
[danvet: implement style suggestion from Ben Widawsdy]
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-10-20 15:26:40 -07:00
Ben Widawsky c8c99b0f0d drm/i915: Dumb down the semaphore logic
While I think the previous code is correct, it was hard to follow and
hard to debug. Since we already have a ring abstraction, might as well
use it to handle the semaphore updates and compares.

I don't expect this code to make semaphores better or worse, but you
never know...

v2:
Remove magic per Keith's suggestions.
Ran Daniel's gem_ring_sync_loop test on this.

v3:
Ignored one of Keith's suggestions.

v4:
Removed some bloat per Daniel's recommendation.

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Keith Packard <keithp@keithp.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-09-21 14:52:41 -07:00
Akshay Joshi 0206e353a0 Drivers: i915: Fix all space related issues.
Various issues involved with the space character were generating
warnings in the checkpatch.pl file. This patch removes most of those
warnings.

Signed-off-by: Akshay Joshi <me@akshayjoshi.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-09-19 18:01:47 -07:00
Jesse Barnes b095cd0a0c drm/i915: set GFX_MODE to pre-Ivybridge default value even on Ivybridge
Prior to Ivybridge, the GFX_MODE would default to 0x800, meaning that
MI_FLUSH would flush the TLBs in addition to the rest of the caches
indicated in the MI_FLUSH command.  However starting with Ivybridge, the
register defaults to 0x2800 out of reset, meaning that to invalidate the
TLB we need to use PIPE_CONTROL.  Since we're not doing that yet, go
back to the old default so things work.

v2: don't forget to actually *clear* the new bit

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-08-19 11:57:12 -07:00
Keith Packard df7976797f Merge branch 'drm-intel-fixes' into drm-intel-next 2011-07-22 13:40:42 -07:00
Keith Packard f3234706a7 drm/i915: Initialize RCS ring status page address in intel_render_ring_init_dri
Physically-addressed hardware status pages are initialized early in
the driver load process by i915_init_phys_hws. For UMS environments,
the ring structure is not initialized until the X server starts. At
that point, the entire ring structure is re-initialized with all new
values. Any values set in the ring structure (including
ring->status_page.page_addr) will be lost when the ring is
re-initialized.

This patch moves the initialization of the status_page.page_addr value
to intel_render_ring_init_dri.

Signed-off-by: Keith Packard <keithp@keithp.com>
Cc: stable@kernel.org
2011-07-22 13:36:52 -07:00