Commit graph

353 commits

Author SHA1 Message Date
Ben Widawsky 93d187993b drm/i915: Remove use of gtt_mappable_entries
Mappable_end, ie. size is almost always what you want as opposed to the
number of entries. Since we already have that information, we can scrap
the number of entries and only calculate it when needed.

If gtt_start is !0, this will have slightly different behavior. This
difference can only occur in DRI1, and exists when we try to kick out
the firmware fb. The new code seems like a bugfix to me.

The other case where we've changed the behavior is during init we check
the mappable region against our current known upper and lower limits
(64MB, and 512MB). This now matches the comment, and makes things more
convenient after removing gtt_mappable_entries.

Also worth noting is the setting of mappable_end is taken out of setup
because we do it earlier now in the DRI2 case and therefore need to add
that tiny hunk to support the DRI1 IOCTL.

v2: Move up mappable end to before legacy AGP init

v3: Add the dev_priv inclusion here from previous rebase error in patch
5

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: squash in fix for a printk format flag mismatch warning.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-20 13:09:20 +01:00
Ben Widawsky dabb7a91ae drm/i915: Remove use on gma_bus_addr on gen6+
We have enough info to not use the intel_gtt bridge stuff.

v2: Move setup of mappable_base above the legacy init stuff because we
still need that on older platforms. (Daniel)

v3: Remove the dev_priv hunk which was rebased in by accident

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-17 22:47:03 +01:00
Ben Widawsky 5d4545aef5 drm/i915: Create a gtt structure
The purpose of the gtt structure is to help isolate our gtt specific
properties from the rest of the code (in doing so it help us finish the
isolation from the AGP connection).

The following members are pulled out (and renamed):
gtt_start
gtt_total
gtt_mappable_end
gtt_mappable
gtt_base_addr
gsm

The gtt structure will serve as a nice place to put gen specific gtt
routines in upcoming patches. As far as what else I feel belongs in this
structure: it is meant to encapsulate the GTT's physical properties.
This is why I've not added fields which track various drm_mm properties,
or things like gtt_mtrr (which is itself a pretty transient field).

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
[Ben modified commit messages]
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-17 22:33:56 +01:00
Chris Wilson eef90ccb8a drm/i915: Use the reloc.handle as an index into the execbuffer array
Using copywinwin10 as an example that is dependent upon emitting a lot
of relocations (2 per operation), we see improvements of:

c2d/gm45: 618000.0/sec to 623000.0/sec.
i3-330m: 748000.0/sec to 789000.0/sec.

(measured relative to a baseline with neither optimisations applied).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-17 22:23:47 +01:00
Daniel Vetter ed5982e6ce drm/i915: Allow userspace to hint that the relocations were known
Userspace is able to hint to the kernel that its command stream and
auxiliary state buffers already hold the correct presumed addresses and
so the relocation process may be skipped if the kernel does not need to
move any buffers in preparation for the execbuffer. Thus for the common
case where the allotment of buffers is static between batches, we can
avoid the overhead of individually checking the relocation entries.

Note that this requires userspace to supply the domain tracking and
requests for workarounds itself that would otherwise be computed based
upon the relocation entries.

Using copywinwin10 as an example that is dependent upon emitting a lot
of relocations (2 per operation), we see improvements of:

c2d/gm45: 618000.0/sec to 632000.0/sec.
i3-330m: 748000.0/sec to 830000.0/sec.

(measured relative to a baseline with neither optimisations applied).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Imre Deak <imre.deak@intel.com>
[danvet: Fixup merge conflict in userspace header due to different
baseline trees.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-01-17 22:23:36 +01:00
Dave Airlie b5cc6c0387 Merge tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel into drm-next
Daniel writes:
- seqno wrap fixes and debug infrastructure from Mika Kuoppala and Chris
  Wilson
- some leftover kill-agp on gen6+ patches from Ben
- hotplug improvements from Damien
- clear fb when allocated from stolen, avoids dirt on the fbcon (Chris)
- Stolen mem support from Chris Wilson, one of the many steps to get to
  real fastboot support.
- Some DDI code cleanups from Paulo.
- Some refactorings around lvds and dp code.
- some random little bits&pieces

* tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel: (93 commits)
  drm/i915: Return the real error code from intel_set_mode()
  drm/i915: Make GSM void
  drm/i915: Move GSM mapping into dev_priv
  drm/i915: Move even more gtt code to i915_gem_gtt
  drm/i915: Make next_seqno debugs entry to use i915_gem_set_seqno
  drm/i915: Introduce i915_gem_set_seqno()
  drm/i915: Always clear semaphore mboxes on seqno wrap
  drm/i915: Initialize hardware semaphore state on ring init
  drm/i915: Introduce ring set_seqno
  drm/i915: Missed conversion to gtt_pte_t
  drm/i915: Bug on unsupported swizzled platforms
  drm/i915: BUG() if fences are used on unsupported platform
  drm/i915: fixup overlay stolen memory leak
  drm/i915: clean up PIPECONF bpc #defines
  drm/i915: add intel_dp_set_signal_levels
  drm/i915: remove leftover display.update_wm assignment
  drm/i915: check for the PCH when setting pch_transcoder
  drm/i915: Clear the stolen fb before enabling
  drm/i915: Access to snooped system memory through the GTT is incoherent
  drm/i915: Remove stale comment about intel_dp_detect()
  ...

Conflicts:
	drivers/gpu/drm/i915/intel_display.c
2013-01-17 20:34:08 +10:00
Daniel Vetter 4d7bb01162 drm/i915: fixup overlay stolen memory leak
We need to clean up the overlay first, before taking down the
stolen memory allocator.

This regression has been introducec in

commit 8040513870
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Nov 15 11:32:29 2012 +0000

    drm/i915: Allocate overlay registers from stolen memory

v2: Rework the patch a bit as suggested by Chris Wilson:
- move the overlay teardown up, into the modeset cleanup
- move the stolen mm takedown into i915_gem_cleanup_stolen

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-18 16:06:51 +01:00
Daniel Vetter b45305fce5 drm/i915: Implement workaround for broken CS tlb on i830/845
Now that Chris Wilson demonstrated that the key for stability on early
gen 2 is to simple _never_ exchange the physical backing storage of
batch buffers I've tried a stab at a kernel solution. Doesn't look too
nefarious imho, now that I don't try to be too clever for my own good
any more.

v2: After discussing the various techniques, we've decided to always blit
batches on the suspect devices, but allow userspace to opt out of the
kernel workaround assume full responsibility for providing coherent
batches. The principal reason is that avoiding the blit does improve
performance in a few key microbenchmarks and also in cairo-trace
replays.

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet:
- Drop the hunk which uses HAS_BROKEN_CS_TLB to implement the ring
  wrap w/a. Suggested by Chris Wilson.
- Also add the ACTHD check from Chris Wilson for the error state
  dumping, so that we still catch batches when userspace opts out of
  the w/a.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-17 17:27:02 +01:00
Daniel Vetter 09153000b8 drm/i915: rework locking for intel_dpio|sbi_read|write
Spinning for up to 200 us with interrupts locked out is not good. So
let's just spin (and even that seems to be excessive).

And we don't call these functions from interrupt context, so this is
not required. Besides that doing anything in interrupt contexts which
might take a few hundred us is a no-go. So just convert the entire
thing to a mutex. Also move the mutex-grabbing out of the read/write
functions (add a WARN_ON(!is_locked)) instead) since all callers are
nicely grouped together.

Finally the real motivation for this change: Dont grab the modeset
mutex in the dpio debugfs file, we don't need that consistency. And
correctness of the dpio interface is ensured with the dpio_lock.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-12 22:59:24 +01:00
Daniel Vetter 20afbda209 drm/i915: Fixup hpd irq register setup ordering
For GMCH platforms we set up the hpd irq registers in the irq
postinstall hook. But since we only enable the irq sources we actually
need in PORT_HOTPLUG_EN/STATUS, taking dev_priv->hotplug_supported_mask
into account, no hpd interrupt sources is enabled since

commit 52d7ecedac
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Sat Dec 1 21:03:22 2012 +0100

    drm/i915: reorder setup sequence to have irqs for output setup

Wrongly set-up interrupts also lead to broken hw-based load-detection
on at least GM45, resulting in ghost VGA/TV-out outputs.

To fix this, delay the hotplug register setup until after all outputs
are set up, by moving it into a new dev_priv->display.hpd_irq_callback.
We might also move the PCH_SPLIT platforms to such a setup eventually.

Another funny part is that we need to delay the fbdev initial config
probing until after the hpd regs are setup, for otherwise it'll detect
ghost outputs. But we can only enable the hpd interrupt handling
itself (and the output polling) _after_ that initial scan, due to
massive locking brain-damage in the fbdev setup code. Add a big
comment to explain this cute little dragon lair.

v2: Encapsulate all the fbdev handling by wrapping the move call into
intel_fbdev_initial_config in intel_fb.c. Requested by Chris Wilson.

v3: Applied bikeshed from Jesse Barnes.

v4: Imre Deak noticed that we also need to call intel_hpd_init after
the drm_irqinstall calls in the gpu reset and resume paths - otherwise
hotplug will be broken. Also improve the comment a bit about why
hpd_init needs to be called before we set up the initial fbdev config.

Bugzilla: Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54943
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v3)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-11 17:22:53 +01:00
Daniel Vetter 9ee32fea5f drm/i915: irq-drive the dp aux communication
At least on the platforms that have a dp aux irq and also have it
enabled - vlvhsw should have one, too. But I don't have a machine to
test this on. Judging from docs there's no dp aux interrupt for gm45.

Also, I only have an ivb cpu edp machine, so the dp aux A code for
snb/ilk is untested.

For dpcd probing when nothing is connected it slashes about 5ms of cpu
time (cpu time is now negligible), which agrees with 3 * 5 400 usec
timeouts.

A previous version of this patch increases the time required to go
through the dp_detect cycle (which includes reading the edid) from
around 33 ms to around 40 ms. Experiments indicated that this is
purely due to the irq latency - the hw doesn't allow us to queue up
dp aux transactions and hence irq latency directly affects throughput.
gmbus is much better, there we have a 8 byte buffer, and we get the
irq once another 4 bytes can be queued up.

But by using the pm_qos interface to request the lowest possible cpu
wake-up latency this slowdown completely disappeared.

Since all our output detection logic is single-threaded with the
mode_config mutex right now anyway, I've decide not ot play fancy and
to just reuse the gmbus wait queue. But this would definitely prep the
way to run dp detection on different ports in parallel

v2: Add a timeout for dp aux transfers when using interrupts - the hw
_does_  prevent this with the hw-based 400 usec timeout, but if the
irq somehow doesn't arrive we're screwed. Lesson learned while
developing this ;-)

v3: While at it also convert the busy-loop to wait_for_atomic, so that
we don't run the risk of an infinite loop any more.

v4: Ensure we have the smallest possible irq latency by using the
pm_qos interface.

v5: Add a comment to the code to explain why we frob pm_qos. Suggested
by Chris Wilson.

v6: Disable dp irq for vlv, that's easier than trying to get at docs
and hw.

v7: Squash in a fix for Haswell that Paulo Zanoni tracked down - the
dp aux registers aren't at a fixed offset any more, but can be on the
PCH while the DP port is on the cpu die.

Reviewed-by: Imre Deak <imre.deak@intel.com> (v6)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-06 13:18:00 +01:00
Daniel Vetter 52d7ecedac drm/i915: reorder setup sequence to have irqs for output setup
Otherwise the new&shiny irq-driven gmbus and dp aux code won't work that
well. Noticed since the dp aux code doesn't have an automatic fallback
with a timeout (since the hw provides for that already).

v2: Simple move drm_irq_install before intel_modeset_gem_init, as
suggested by Ben Widawsky.

v3: Now that interrupts are enabled before all connectors are fully
set up, we might fall over serving a HPD interrupt while things are
still being set up. Instead of jumping through massive hoops and
complicating the code with a separate hpd irq enable step, simply
block out the hotplug work item from doing anything until things are
in place.

v4: Actually, we can enable hotplug processing only after the fbdev is
fully set up, since we call down into the fbdev from the hotplug work
functions. So stick the hpd enabling right next to the poll helper
initialization.

v5: We need to enable irqs before intel_modeset_init, since that
function sets up the outputs.

v6: Fixup cleanup sequence, too.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-06 13:14:36 +01:00
Daniel Vetter 61bac78e03 drm/i915: setup the hangcheck timer early
... together with all the other irq related resources in
intel_irq_init. I've managed to oops in the notify_ring function on my
ilk, presumably because of the powerctx setup call to i915_gpu_idle.

Note that this is only a problem with the reorder irq setup sequence
for irq-driver gmbus/dp aux.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-06 13:14:35 +01:00
Ville Syrjälä 633cf8f505 drm/i915: Don't allow ring tail to reach the same cacheline as head
From BSpec:
"If the Ring Buffer Head Pointer and the Tail Pointer are on the same
cacheline, the Head Pointer must not be greater than the Tail
Pointer."

The easiest way to enforce this is to reduce the reported ring space.

References:
Gen2 BSpec "1. Programming Environment" / 1.4.4.6 "Ring Buffer Use"
Gen3 BSpec "vol1c Memory Interface Functions" / 2.3.4.5 "Ring Buffer Use"
Gen4+ BSpec "vol1c Memory Interface and Command Stream" / 5.3.4.5 "Ring Buffer Use"

v2: Include the exact BSpec references in the description

v3: s/64/I915_RING_FREE_SPACE, and add the BSpec information to the code

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-12-03 18:31:20 +01:00
Chris Wilson 42dcedd4f2 drm/i915: Use a slab for object allocation
The primary purpose of this was to debug some use-after-free memory
corruption that was causing an OOPS inside drm/i915. As it turned out
the corruption was being caused elsewhere and i915.ko as a major user of
many objects was being hit hardest.

Indeed as we do frequent the generic kmalloc caches, dedicating one to
ourselves (or at least naming one for us depending upon the core) aids
debugging our own slab usage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-30 23:44:05 +01:00
Chris Wilson 3e9605018a drm/i915: Rearrange code to only have a single method for waiting upon the ring
Replace the wait for the ring to be clear with the more common wait for
the ring to be idle. The principle advantage is one less exported
intel_ring_wait function, and the removal of a hardcoded value.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-29 11:43:53 +01:00
Mika Kuoppala 4f1ba0f83a drm/i915: fix possible NULL dereference of dev_priv
Dereference dev_priv only after we know it is valid.
Found with smatch.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-29 11:43:51 +01:00
Dave Airlie 9fabd4eede Merge branch 'for-airlied' of git://people.freedesktop.org/~danvet/drm-intel into drm-next
Daniel writes:
Highlights of this -next round:
- ivb fdi B/C fixes
- hsw sprite/plane offset fixes from Damien
- unified dp/hdmi encoder for hsw, finally external dp support on hsw
  (Paulo)
- kill-agp and some other prep work in the gtt code from Ben
- some fb handling fixes from Ville
- massive pile of patches to align hsw VGA with the spec and make it
  actually work (Paulo)
- pile of workarounds from Jesse, mostly for vlv, but also some other
  related platforms
- start of a dev_priv reorg, that thing grew out of bounds and chaotic
- small bits&pieces all over the place, down to better error handling for
  load-detect on gen2 (Chris, Jani, Mika, Zhenyu, ...)

On top of the previous pile (just copypasta):
- tons of hsw dp prep patches form Paulo
- round scheduled work items and timers to nearest second (Chris)
- some hw workarounds (Jesse&Damien)
- vlv dp support and related fixups (Vijay et al.)
- basic haswell dp support, not yet wired up for external ports (Paulo)
- edp support (Paulo)
- tons of refactorings to prepare for the above (Paulo)
- panel rework, unifiying code between lvds and edp panels (Jani)
- panel fitter scaling modes (Jani + Yuly Novikov)
- panel power improvements, should now work without the BIOS setting it up
- extracting some dp helpers from radeon/i915 and move them to
  drm_dp_helper.c
- randome pile of workarounds (Damien, Ben, ...)
- some cleanups for the register restore code for suspend/resume
- secure batchbuffer support, should enable tear-free blits on gen6+
  Chris)
- random smaller fixlets and cleanups.

* 'for-airlied' of git://people.freedesktop.org/~danvet/drm-intel: (231 commits)
  drm/i915: Restore physical HWS_PGA after resume
  drm/i915: Report amount of usable graphics memory in MiB
  drm/i915/i2c: Track users of GMBUS force-bit
  drm/i915: Allocate the proper size for contexts.
  drm/i915: Update load-detect failure paths for modeset-rework
  drm/i915: Clear unused fields of mode for framebuffer creation
  drm/i915: Always calculate 8xx WM values based on a 32-bpp framebuffer
  drm/i915: Fix sparse warnings in from AGP kill code
  drm/i915: Missed lock change with rps lock
  drm/i915: Move the remaining gtt code
  drm/i915: flush system agent TLBs on SNB
  drm/i915: Kill off now unused gen6+ AGP code
  drm/i915: Calculate correct stolen size for GEN7+
  drm/i915: Stop using AGP layer for GEN6+
  drm/i915: drop the double-OP_STOREDW usage in blt_ring_flush
  drm/i915: don't rewrite the GTT on resume v4
  drm/i915: protect RPS/RC6 related accesses (including PCU) with a new mutex
  drm/i915: put ring frequency and turbo setup into a work queue v5
  drm/i915: don't block resume on fb console resume v2
  drm/i915: extract l3_parity substruct from dev_priv
  ...
2012-11-20 09:22:35 +10:00
Chris Wilson 6b8294a4d3 drm/i915: Restore physical HWS_PGA after resume
By always setting up the HWS register for both physical and virtual
address variations during render ring we can reduce the number of
different special cases that get set up at varying different times
during module load.

Fixes regression from

commit c630119f43
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Wed Oct 17 11:32:57 2012 +0200

    drm/i915: don't save/restore HWS_PGA reg for kms

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-16 13:47:40 +01:00
Ben Widawsky e76e9aebcd drm/i915: Stop using AGP layer for GEN6+
As a quick hack we make the old intel_gtt structure mutable so we can
fool a bunch of the existing code which depends on elements in that data
structure. We can/should try to remove this in a subsequent patch.

This should preserve the old gtt init behavior which upon writing these
patches seems incorrect. The next patch will fix these things.

The one exception is VLV which doesn't have the preserved flush control
write behavior. Since we want to do that for all GEN6+ stuff, we'll
handle that in a later patch. Mainstream VLV support doesn't actually
exist yet anyway.

v2: Update the comment to remove the "voodoo"
Check that the last pte written matches what we readback

v3: actually kill cache_level_to_agp_type since most of the flags will
disappear in an upcoming patch

v4: v3 was actually not what we wanted (Daniel)
Make the ggtt bind assertions better and stricter (Chris)
Fix some uncaught errors at gtt init (Chris)
Some other random stuff that Chris wanted

v5: check for i==0 in gen6_ggtt_bind_object to shut up gcc (Ben)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by [v4]: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Make the cache_level -> agp_flags conversion for pre-gen6 a
tad more robust by mapping everything != CACHE_NONE to the cached agp
flag - we have a 1:1 uncached mapping, but different modes of
cacheable (at least on later generations). Suggested by Chris Wilson.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:42 +01:00
Jesse Barnes 1abd02e2dd drm/i915: don't rewrite the GTT on resume v4
The BIOS shouldn't be touching this memory across suspend/resume, so
just leave it alone.  This saves us ~6ms on resume on my T420 (retested
with write combined PTEs).

v2: change gtt restore default on pre-gen4 (Chris)
    move needs_gtt_restore flag into dev_priv
v3: make sure we restore GTT on resume from hibernate (Daniel)
    use opregion support as the cutoff for restore from resume (Chris)
v4: use a better check for opregion (Chris)

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Kill the needs_gtt_restore indirection and check directly for
OpRegion. Also explain in a comment what's going on.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:42 +01:00
Jesse Barnes 4fc688ce79 drm/i915: protect RPS/RC6 related accesses (including PCU) with a new mutex
This allows the power related code to run independently of the rest of
the pipeline, extending the resume and init time improvements into
userspace, which would otherwise have been blocked on the struct mutex
if we were doing PCU communication.

v2: Also convert the locking for the rps sysfs interface.

Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:41 +01:00
Jesse Barnes 073f34d9d4 drm/i915: don't block resume on fb console resume v2
The console lock can be contended, so rather than prevent other drivers
after us from being held up, queue the console suspend into the global
work queue that can happen anytime.  I've measured this to take around
200ms on my T420.  Combined with the ring freq/turbo change, we should
save almost 1/2 a second on resume.

v2: use console_trylock() to try to resume the console immediately (Chris)

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: move dev_priv->console_resume_work next to the fbdev
pointer.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:40 +01:00
Daniel Vetter 231f42a48f drm/i915: move dri1 dungeon out of dev_priv
Also, move dev_priv->counter there, it's only used in i915_dma.c

And also move the dri1 dungeon at the end of dev_priv where no one
cares about it.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:39 +01:00
Chris Wilson 1623392af9 drm/i915: Only kick out vesafb if we takeover the fbcon with KMS
Otherwise we may remove the only console for a nomodeset system.

We became more aggressive in our kicking with
commit e188719a28
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Tue Jun 12 11:28:17 2012 +0200

    drm/i915: kick any firmware framebuffers before claiming the gtt

Reported-and-tested-by: monnier@iro.umontreal.ca
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54615
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org # v3.6
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-10-26 17:46:35 +02:00
Daniel Vetter c2fb791692 Linux 3.7-rc2
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQEcBAABAgAGBQJQgvdwAAoJEHm+PkMAQRiG+3AH/i2XsqqN3VctL0nnbWfvds+Q
 aKulfIdJTjKiVAsawPUtRqReZ8ijiebrgA/53lZLlrFOoPPQ5+LHmnSyQF6gErOY
 NuAE1lijXDRM1pwBlhvOBbAj26wUobGjqONFJ9OkKr758Ue8ds/Q7UdxyEgmYgmg
 tvVMzfRcICzryUV3PcqL+3cNPpCUdT6wGGRJ9DCv/jvGiWKExWhOle5oltrmxk+D
 NsqRcws5pEubfHE4J8BvNWr8lE1kHfYVhrJETiLJUiN2XAJcbI4Jy7rU/3EGteNS
 0HMZdaPPjV874lohdM70X2225SbYrCVkAYB5hnZCTeC3tYyCawBBPMQoyAiOcmU=
 =+861
 -----END PGP SIGNATURE-----

Merge tag 'v3.7-rc2' into drm-intel-next-queued

Linux 3.7-rc2

Backmerge to solve two ugly conflicts:
- uapi. We've already added new ioctl definitions for -next. Do I need to say more?
- wc support gtt ptes. We've had to revert this for snb+ for 3.7 and
  also fix a few other things in the code. Now we know how to make it
  work on snb+, but to avoid losing the other fixes do the backmerge
  first before re-enabling wc gtt ptes on snb+.

And a few other minor things, among them git getting confused in
intel_dp.c and seemingly causing a conflict out of nothing ...

Conflicts:
	drivers/gpu/drm/i915/i915_reg.h
	drivers/gpu/drm/i915/intel_display.c
	drivers/gpu/drm/i915/intel_dp.c
	drivers/gpu/drm/i915/intel_modes.c
	include/drm/i915_drm.h

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-10-22 14:34:51 +02:00
Chris Wilson d7d4eeddb8 drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers
With the introduction of per-process GTT space, the hardware designers
thought it wise to also limit the ability to write to MMIO space to only
a "secure" batch buffer. The ability to rewrite registers is the only
way to program the hardware to perform certain operations like scanline
waits (required for tear-free windowed updates). So we either have a
choice of adding an interface to perform those synchronized updates
inside the kernel, or we permit certain processes the ability to write
to the "safe" registers from within its command stream. This patch
exposes the ability to submit a SECURE batch buffer to
DRM_ROOT_ONLY|DRM_MASTER processes.

v2: Haswell split up bit8 into a ppgtt bit (still bit8) and a security
bit (bit 13, accidentally not set). Also add a comment explaining why
secure batches need a global gtt binding.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
[danvet: added hsw fixup.]
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-10-17 21:06:59 +02:00
Linus Torvalds 612a9aab56 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm merge (part 1) from Dave Airlie:
 "So first of all my tree and uapi stuff has a conflict mess, its my
  fault as the nouveau stuff didn't hit -next as were trying to rebase
  regressions out of it before we merged.

  Highlights:
   - SH mobile modesetting driver and associated helpers
   - some DRM core documentation
   - i915 modesetting rework, haswell hdmi, haswell and vlv fixes, write
     combined pte writing, ilk rc6 support,
   - nouveau: major driver rework into a hw core driver, makes features
     like SLI a lot saner to implement,
   - psb: add eDP/DP support for Cedarview
   - radeon: 2 layer page tables, async VM pte updates, better PLL
     selection for > 2 screens, better ACPI interactions

  The rest is general grab bag of fixes.

  So why part 1? well I have the exynos pull req which came in a bit
  late but was waiting for me to do something they shouldn't have and it
  looks fairly safe, and David Howells has some more header cleanups
  he'd like me to pull, that seem like a good idea, but I'd like to get
  this merge out of the way so -next dosen't get blocked."

Tons of conflicts mostly due to silly include line changes, but mostly
mindless.  A few other small semantic conflicts too, noted from Dave's
pre-merged branch.

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (447 commits)
  drm/nv98/crypt: fix fuc build with latest envyas
  drm/nouveau/devinit: fixup various issues with subdev ctor/init ordering
  drm/nv41/vm: fix and enable use of "real" pciegart
  drm/nv44/vm: fix and enable use of "real" pciegart
  drm/nv04/dmaobj: fixup vm target handling in preparation for nv4x pcie
  drm/nouveau: store supported dma mask in vmmgr
  drm/nvc0/ibus: initial implementation of subdev
  drm/nouveau/therm: add support for fan-control modes
  drm/nouveau/hwmon: rename pwm0* to pmw1* to follow hwmon's rules
  drm/nouveau/therm: calculate the pwm divisor on nv50+
  drm/nouveau/fan: rewrite the fan tachometer driver to get more precision, faster
  drm/nouveau/therm: move thermal-related functions to the therm subdev
  drm/nouveau/bios: parse the pwm divisor from the perf table
  drm/nouveau/therm: use the EXTDEV table to detect i2c monitoring devices
  drm/nouveau/therm: rework thermal table parsing
  drm/nouveau/gpio: expose the PWM/TOGGLE parameter found in the gpio vbios table
  drm/nouveau: fix pm initialization order
  drm/nouveau/bios: check that fixed tvdac gpio data is valid before using it
  drm/nouveau: log channel debug/error messages from client object rather than drm client
  drm/nouveau: have drm debugging macros build on top of core macros
  ...
2012-10-03 23:29:23 -07:00
David Howells 760285e7e7 UAPI: (Scripted) Convert #include "..." to #include <path/...> in drivers/gpu/
Convert #include "..." to #include <path/...> in drivers/gpu/.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
2012-10-02 18:01:07 +01:00
David Howells 4126d5d61f UAPI: (Scripted) Remove redundant DRM UAPI header #inclusions from drivers/gpu/.
Remove redundant DRM UAPI header #inclusions from drivers/gpu/.

Remove redundant #inclusions of core DRM UAPI headers (drm.h, drm_mode.h and
drm_sarea.h).  They are now #included via drmP.h and drm_crtc.h via a preceding
patch.

Without this patch and the patch to make include the UAPI headers from the core
headers, after the UAPI split, the DRM C sources cannot find these UAPI headers
because the DRM code relies on specific -I flags to make #include "..."  work
on headers in include/drm/ - but that does not work after the UAPI split without
adding more -I flags.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
2012-10-02 18:01:05 +01:00
Ben Widawsky 199adf40ae drm/i915: s/cacheing/caching/
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-09-26 09:24:36 +02:00
Daniel Vetter 398b7a1b88 Linux 3.6-rc7
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQEcBAABAgAGBQJQX7MuAAoJEHm+PkMAQRiG0h0IAJURkrMCAQUxA+Ik66ReH89s
 LQcVd0U9uL4UUOi7f5WR64Vf9Cfu6VVGX9ZKSvjpNskvlQaUQPMIt4pMe6g4X4dI
 u0bApEy4XZz3nGabUAghIU8jJ8cDmhCG6kPpSiS7pi7KHc0yIa4WFtJRrIpGaIWT
 xuK38YOiOHcSDRlLyWZzainMncQp/ixJdxnqVMTonkVLk0q0b84XzOr4/qlLE5lU
 i+TsK3PRKdQXgvZ4CebL+srPBwWX1dmgP3VkeBloQbSSenSeELICbFWavn2ml+sF
 GXi4dO93oNquL/Oy5SwI666T4uNcrRPaS+5X+xSZgBW/y2aQVJVJuNZg6ZP/uWk=
 =0v2l
 -----END PGP SIGNATURE-----

Merge tag 'v3.6-rc7' into drm-intel-next-queued

Manual backmerge of -rc7 to resolve a silent conflict leading to
compile failure in drivers/gpu/drm/i915/intel_hdmi.c.

This is due to the bugfix in -rc7:

commit b98b601672
Author: Wang Xingchao <xingchao.wang@intel.com>
Date:   Thu Sep 13 07:43:22 2012 +0800

    drm/i915: HDMI - Clear Audio Enable bit for Hot Plug

Since this code moved around a lot in -next git put that snippet at
the wrong spot. I've tried to fix this by making the conflict explicit
by merging a version for next with:

commit 3cce574f01
Author: Wang Xingchao <xingchao.wang@intel.com>
Date:   Thu Sep 13 11:19:00 2012 +0800

    drm/i915: HDMI - Clear Audio Enable bit for Hot Plug unconditionally

But that failed to solve the entire problem. To avoid pushing out
further -nightly branch to our QA where this is broken, do the
backmerge and manually add the stuff git adds to -next from the patch
in -fixes.

Note that this doesn't show up in git's merge diff (and hence is also
not handled by git rerere), which adds to the reasons why I'd like to
fix this with a verbose backmerge. The git merge diff only shows a
bunch of trivial conflicts of the "code changed in lines next to each
another" kind.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-09-24 18:17:12 +02:00
Chris Wilson 934d6086ea drm/i915: Limit the ioremap of the PCI bar to the registers
In the future we may like to experiment with using a WC map of the GTT
portion. However, that will conflict with i915.ko mapping the entire bar
as UC in order to access the GPU registers. Instead we can shrink the
register ioremap to only map the register block.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by (IVB): Ben Widawsky <ben@bwidawsk.net>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Squashed-in follow-up fix for gen2/3 registers file size from
Chris Wilson.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-09-20 14:23:07 +02:00
Alexander Shishkin 99d0b1db6c drm/i915: initialize dpio_lock spin lock
This thing is killing lockdep.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
[Jani: move the init next to the other spin lock inits]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-09-08 19:04:04 +02:00
Tejun Heo 53621860c3 i915: use alloc_ordered_workqueue() instead of explicit UNBOUND w/ max_active = 1
This is an equivalent conversion and will ease scheduled removal of
WQ_NON_REENTRANT.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-08-24 01:13:53 +02:00
Dave Airlie ec6f1bb90c drm/i915: implement dma buf begin_cpu_access (v2)
In order for udl vmap to work properly, we need to push the object
into the CPU domain before we start copying the data to the USB device.

This along with the udl change avoids userspace explicit mapping to
be used.

v2: add a flag for userspace to query to know if Intel kernel driver can
deal with the vmap flushing properly. In theory udl would need a flag also,
but I intend to push the patches very close to each other and other drivers
should do the right thing from the start.

I've added a test to my intel-gpu-tools prime branch, however testing
this is a bit messy since the only way to get udl to vmap is to rendering
something. I've tested this with real code as well to make sure it works.

Signed-off-by: Dave Airlie <airlied@redhat.com>
[danvet: resolved conflict, which required reallocating the PARAM
number to 21.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-08-17 10:10:06 +02:00
Daniel Vetter 5d985ac81a drm/i915: kill a few unused things in dev_priv
... and move a few others only used by i915_dma.c into the dri1
dungeon.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-08-17 10:10:03 +02:00
Daniel Vetter c6a828d326 drm/i915: move all rps state into dev_priv->rps
This way it's easier so see what belongs together, and what is used
by the ilk ips code. Also add some comments that explain the locking.

Note that (cur|min|max)_delay need to be duplicated, because
they're also used by the ips code.

v2: Missed one place that the dev_priv->ips change caught ...

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-08-09 21:52:22 +02:00
Daniel Vetter c96ea64ebb drm/i915: dump the device info
Handy for lazy people like me, or when people forget to add the output
of lspci -nn.

v2: Chris Wilson noticed that we have this duplicated already in the
i915_capabilites debugfs file. But there \n as separator looks better,
which would be a bit verbose in dmesg. Abuse the preprocessor to
extract this all.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-08-09 18:29:21 +02:00
Chris Wilson 2fedbff948 drm/i915: Add I915_GEM_PARAM_HAS_SEMAPHORES
Userspace tries to estimate the cost of ring switching based on whether
the GPU and GEM supports semaphores. (If we have multiple rings and no
semaphores, userspace assumes that the cost of switching rings between
batches is exorbitant and will endeavour to keep the next batch on the
active ring - as a coarse approximation to tracking both destination and
source surfaces.) Currently userspace has to guess whether semaphores
exist based on the chipset generation and the module parameter,
i915.semaphores. This is a crude and inaccurate guess as the defaults
internally depend upon other chipset features being enabled or disabled,
nor does it extend well into the future. By exporting a HAS_SEMAPHORES
parameter, we can easily query the driver and obtain an accurate answer.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-08-08 14:29:12 +02:00
Chris Wilson e6994aeedc drm/i915: Export ability of changing cache levels to userspace
By selecting the cache level (essentially whether or not the CPU snoops
any updates to the bo, and on more recent machines whether it resides
inside the CPU's last-level-cache) a userspace driver is able to then
manage all of its memory within buffer objects, if it so desires. This
enables the userspace driver to accelerate uploads and more importantly
downloads from the GPU and to able to mix CPU and GPU rendering/activity
efficiently.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Added code comment about where we plan to stuff platform
specific cacheing control bits in the ioctl struct.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-26 12:56:25 +02:00
Ben Widawsky c0c7babc48 drm/i915: add register read IOCTL
The interface's immediate purpose is to do synchronous timestamp queries
as required by GL_TIMESTAMP. The GPU has a register for reading the
timestamp but because that would normally require root access through
libpciaccess, the IOCTL can provide this service instead.

Currently the implementation whitelists only the render ring timestamp
register, because that is the only thing we need to expose at this time.

v2: make size implicit based on the register offset
Add a generation check

Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: Jacek Lawrynowicz <jacek.lawrynowicz@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: fixup the ioctl numerb:]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-25 18:23:49 +02:00
Daniel Vetter e8aeaee7b0 drm/i915: unbreak lastclose for failed driver init
We now refuse to load on gen6+ if kms is not enabled:

commit 26394d9251
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Mon Mar 26 21:33:18 2012 +0200

    drm/i915: refuse to load on gen6+ without kms

Which results in the drm core calling our lastclose function to clean
up the mess, but that one is neatly broken for such failure cases
since kms has been introduced in

commit 79e539453b
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Fri Nov 7 14:24:08 2008 -0800

    DRM: i915: add mode setting support

Reported-and-tested-by: Paulo Zanoni <przanoni@gmail.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-25 10:40:00 +02:00
Paulo Zanoni 45e6e3a1cd drm/i915: get rid of dev_priv->info->has_pch_split
Previously we had has_pch_split to tell us whether we had a PCH or not
and we also had dev_priv->pch_type to tell us which kind of PCH it
was, but it could only be used if we were 100% sure we did have a PCH.
Now that PCH_NONE was added to dev_priv->pch_type we don't need
has_pch_split anymore: we can just check for pch_type != PCH_NONE.

The HAS_PCH_{IBX,CPT,LPT} macros use dev_priv->pch_type, so they can
only be called after intel_detect_pch. The HAS_PCH_SPLIT macro looks
at dev_priv->info->has_pch_split, which is available earlier.

Since the goal is to implement HAS_PCH_SPLIT using dev_priv->pch_type
instead of dev_priv->info->has_pch_split, we need to make sure that
intel_detect_pch is called before any calls to HAS_PCH_SPLIT are made.
So we moved the intel_detect_pch call to an earlier stage.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-05 09:56:05 +02:00
Chris Wilson 990bbdadab drm/i915: Group the GT routines together in both code and vtable
Tidy up the routines for interacting with the GT (in particular the
forcewake dance) which are scattered throughout the code in a single
structure.

v2: use wait_for_atomic for polling.

v3: *really* use wait_for_atomic for polling.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-03 22:08:46 +02:00
Daniel Vetter 87207ca20e drm/i915: don't use dev->agp
This single leftover use is due to a patch that went into 3.5 through
-fixes. With the fake agp stuff on demise, at least for gen6+ we can't
use this any more.

Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-06-25 21:08:44 +02:00
Daniel Vetter df12c6d5ec drm/i915: initialize the context idr unconditionally
It doesn't hurt and it at least prevents us from OOPSing left and
right at quite a few places. This also allows us to simplify the code
a bit by folding the only line of context_open into the callsite.

We obviuosly also need to run the cleanup code unconditionally, too.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-06-20 11:15:37 +02:00
Daniel Vetter 55a6662837 drm/i915: fix module unload after context merge
commit 8e96d9c4d9
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Mon Jun 4 14:42:56 2012 -0700

    drm/i915: reset the GPU on context fini

broke module unload because it reset the gpu before we've stopped
touching it. Later on in the unload sequence the ringbuffer code
complained that the gpu would idle properly (because intel_gpu_reset
only resets the hw and not our sw state).

v2: Reorder things so that we reset the gpu _before_ we release the
backing storage of the default context.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51183
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-06-20 10:06:29 +02:00
Ben Widawsky 846248136d drm/i915/context: create & destroy ioctls
Add the interfaces to allow user space to create and destroy contexts.
Contexts are destroyed automatically if the file descriptor for the dri
device is closed.

Following convention as usual here causes checkpatch warnings.

v2: with is_initialized, no longer need to init at create
drop the context switch on create (daniel)

v3: Use interruptible lock (Chris)
return -ENODEV in !GEM case (Chris)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
2012-06-14 17:36:20 +02:00
Ben Widawsky 254f965c39 drm/i915: preliminary context support
Very basic code for context setup/destruction in the driver.

Adds the file i915_gem_context.c This file implements HW context
support. On gen5+ a HW context consists of an opaque GPU object which is
referenced at times of context saves and restores.  With RC6 enabled,
the context is also referenced as the GPU enters and exists from RC6
(GPU has it's own internal power context, except on gen5).  Though
something like a context does exist for the media ring, the code only
supports contexts for the render ring.

In software, there is a distinction between contexts created by the
user, and the default HW context. The default HW context is used by GPU
clients that do not request setup of their own hardware context. The
default context's state is never restored to help prevent programming
errors. This would happen if a client ran and piggy-backed off another
clients GPU state.  The default context only exists to give the GPU some
offset to load as the current to invoke a save of the context we
actually care about. In fact, the code could likely be constructed,
albeit in a more complicated fashion, to never use the default context,
though that limits the driver's ability to swap out, and/or destroy
other contexts.

All other contexts are created as a request by the GPU client. These
contexts store GPU state, and thus allow GPU clients to not re-emit
state (and potentially query certain state) at any time. The kernel
driver makes certain that the appropriate commands are inserted.

There are 4 entry points into the contexts, init, fini, open, close.
The names are self-explanatory except that init can be called during
reset, and also during pm thaw/resume. As we expect our context to be
preserved across these events, we do not reinitialize in this case.

As Adam Jackson pointed out, The cutoff of 1MB where a HW context is
considered too big is arbitrary. The reason for this is even though
context sizes are increasing with every generation, they have yet to
eclipse even 32k. If we somehow read back way more than that, it
probably means BIOS has done something strange, or we're running on a
platform that wasn't designed for this.

v2: rename load/unload to init/fini (daniel)
remove ILK support for get_size() (indirectly daniel)
add HAS_HW_CONTEXTS macro to clarify supported platforms (daniel)
added comments (Ben)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
2012-06-14 17:36:16 +02:00