linux

mirror of https://github.com/torvalds/linux synced 2024-10-27 13:48:49 +00:00

Author	SHA1	Message	Date
Ben Widawsky	828c79087c	drm/i915: Disable GGTT PTEs on GEN6+ suspend Once the machine gets to a certain point in the suspend process, we expect the GPU to be idle. If it is not, we might corrupt memory. Empirically (with an early version of this patch) we have seen this is not the case. We cannot currently explain why the latent GPU writes occur. In the technical sense, this patch is a workaround in that we have an issue we can't explain, and the patch indirectly solves the issue. However, it's really better than a workaround because we understand why it works, and it really should be a safe thing to do in all cases. The noticeable effect other than the debug messages would be an increase in the suspend time. I have not measured how expensive it actually is. I think it would be good to spend further time to root cause why we're seeing these latent writes, but it shouldn't preclude preventing the fallout. NOTE: It should be safe (and makes some sense IMO) to also keep the VALID bit unset on resume when we clear_range(). I've opted not to do this as properly clearing those bits at some later point would be extra work. v2: Fix bugzilla link Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=65496 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=59321 Tested-by: Takashi Iwai <tiwai@suse.de> Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Tested-By: Todd Previte <tprevite@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-18 15:44:47 +02:00
Ben Widawsky	b35b380ed4	drm/i915: Make PTE valid encoding optional We need this to work around a corruption when the boot kernel image loads the hibernated kernel image from swap on Haswell systems - somehow not everything is properly shut off. This is just the prep work, the next patch will implement the actual workaround. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Add a commit message suitable for -fixes and add cc: stable] Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-18 15:40:21 +02:00
Chris Wilson	651d794fae	drm/i915: Use Write-Through cacheing for the display plane on Iris Haswell GT3e has the unique feature of supporting Write-Through cacheing of objects within the eLLC/LLC. The purpose of this is to enable the display plane to remain coherent whilst objects lie resident in the eLLC/LLC - so that we, in theory, get the best of both worlds, perfect display and fast access. However, we still need to be careful as the CPU does not see the WT when accessing the cache. In particular, this means that we need to flush the cache lines after writing to an object through the CPU, and on transitioning from a cached state to WT. v2: Actually do the clflush on transition to WT, nagging by Ville. v3: Flush the CPU cache after writes into WT objects. v4: Rebase onto LLC updates and report WT as "uncached" for get_cache_level_ioctl to remain symmetric with set_cache_level_ioctl. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-22 13:31:38 +02:00
Chris Wilson	2c22569bba	drm/i915: Update rules for writing through the LLC with the cpu As mentioned in the previous commit, reads and writes from both the CPU and GPU go through the LLC. This gives us coherency between the CPU and GPU irrespective of the attribute settings either device sets. We can use to avoid having to clflush even uncached memory. Except for the scanout. The scanout resides within another functional block that does not use the LLC but reads directly from main memory. So in order to maintain coherency with the scanout, writes to uncached memory must be flushed. In order to optimize writes elsewhere, we start tracking whether an framebuffer is attached to an object. v2: Use pin_display tracking rather than fb_count (to ensure we flush cursors as well etc) and only force the clflush along explicit writes to the scanout paths (i.e. pin_to_display_plane and pwrite into scanout). v3: Force the flush after hitting the slowpath in pwrite, as after dropping the lock the object's cache domain may be invalidated. (Ville) Based on a patch by Ville Syrjälä. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-10 11:20:49 +02:00
Chris Wilson	350ec881d9	drm/i915: Rename I915_CACHE_MLC_LLC to L3_LLC for Ivybridge MLC_LLC was never validated for Sandybridge and was superseded by a new level of cacheing for the GPU in Ivybridge. Update our names to be consistent with usage, and in the process stop setting the unwanted bit on Sandybridge. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> [danvet: s/BUG/WARN_ON(1) bikeshed.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-06 16:35:30 +02:00
Ben Widawsky	40d74980d3	drm/i915: Use ggtt_vm to save some typing Just some small cleanups, and a rename of vm->ggtt_vm requested by Daniel. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-05 19:04:10 +02:00
Ben Widawsky	a70a3148b0	drm/i915: Make proper functions for VMs Earlier in the conversion sequence we attempted to quickly wedge in the transitional interface as static inlines. Now that we're sure these interfaces are sane, for easier debug and to decrease code size (since many of these functions may be called quite a bit), make them real functions While at it, kill off the set_color interface. We'll always have the VMA, or easily get to it. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-05 19:04:08 +02:00
Ben Widawsky	87a6b688cc	drm/i915/hsw: Change default LLC age to 3 The default LLC age was changed: commit `0d8ff15e9a` Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Thu Jul 4 11:02:03 2013 -0700 drm/i915/hsw: Set correct Haswell PTE encodings. On the surface it would seem setting a default age wouldn't matter because all GEM BOs are aged similarly, so the order in which objects are evicted would not be subject to aging. The current working theory as to why this caused a regression though is that LLC is a bit special in that it is shared with the CPU. Presumably (not verified) the CPU fetches cachelines with age 3, and therefore recently cached GPU objects would be evicted before similar CPU object first when the LLC is full. It stands to reason therefore that this would negatively impact CPU bound benchmarks - but those seem to be low on the priority list. eLLC OTOH does not have this same property as LLC. It should be used entirely for the GPU, and so the age really shouldn't matter. Furthermore, we have no evidence to suggest one is better than another on eLLC. Since we've never properly supported eLLC before no, there should be no regression. If the GPU client really wants "younger" objects, they should use MOCS. v2: Drop the extra #define (Chad) v3: Actually git add v4: Pimped commit message Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67062 Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-05 19:04:06 +02:00
Chris Wilson	08c45263a6	drm/i915: Use the same pte_encoding for ppgtt as for gtt The PTE layouts are the same for both ppgtt and gtt, so we can simplify the setup for ppgtt by copying the encoding function pointer from gtt. This prevents bugs where we update one function pointer, but forget the other. For instance, commit `4d15c145a6` Author: Ben Widawsky <ben@bwidawsk.net> Date: Thu Jul 4 11:02:06 2013 -0700 drm/i915: Use eLLC/LLC by default when available only extends the gtt to use eLLC/LLC cacheing and forgets to also update the ppgtt function pointer. v2: Actually mention the bug being fixed (Kenneth) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-04 21:29:57 +02:00
Ben Widawsky	2f63315692	drm/i915: Create VMAs Formerly: "drm/i915: Create VMAs (part 1)" In a previous patch, the notion of a VM was introduced. A VMA describes an area of part of the VM address space. A VMA is similar to the concept in the linux mm. However, instead of representing regular memory, a VMA is backed by a GEM BO. There may be many VMAs for a given object, one for each VM the object is to be used in. This may occur through flink, dma-buf, or a number of other transient states. Currently the code depends on only 1 VMA per object, for the global GTT (and aliasing PPGTT). The following patches will address this and make the rest of the infrastructure more suited v2: s/i915_obj/i915_gem_obj (Chris) v3: Only move an object to the now global unbound list if there are no more VMAs for the object which are bound into a VM (ie. the list is empty). v4: killed obj->gtt_space some reworks due to rebase v5: Free vma on error path (Imre) v6: Another missed vma free in i915_gem_object_bind_to_gtt error path (Imre) Fixed vma freeing in stolen preallocation (Imre) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Imre Deak <imre.deak@intel.com> [danvet: Squash in fixup from Ben to not deref a non-existing vma in set_cache_level, reported by Chris.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-18 08:46:13 +02:00
Ben Widawsky	93bd8649db	drm/i915: Put the mm in the parent address space Every address space should support object allocation. It therefore makes sense to have the allocator be part of the "superclass" which GGTT and PPGTT will derive. Since our maximum address space size is only 2GB we're not yet able to avoid doing allocation/eviction; but we'd hope one day this becomes almost irrelevant. v2: Rebased Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-17 22:23:43 +02:00
Ben Widawsky	853ba5d223	drm/i915: Move gtt and ppgtt under address space umbrella The GTT and PPGTT can be thought of more generally as GPU address spaces. Many of their actions (insert entries), state (LRU lists), and many of their characteristics (size) can be shared. Do that. The change itself doesn't actually impact most of the VMA/VM rework coming up, it just fits in with the grand scheme of abstracting the GPU VM operations. GGTT will usually be a special case where we either know an object must be in the GGTT (display engine, workarounds, etc.). The scratch page is left as part of the VM (even though it's currently shared with the ppgtt code) because in the future when we have Full PPGTT, I intend to create a separate scratch page for each. v2: Drop usage of i915_gtt_vm (Daniel) Make cleanup also part of the parent class (Ben) Modified commit msg Rebased v3: Properly share scratch page (Imre) Finish commit message (Daniel, Imre) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-17 22:21:47 +02:00
Ben Widawsky	4d15c145a6	drm/i915: Use eLLC/LLC by default when available DRI clients really should be using MOCS to get fine grained streaming cache controls. With that note, I hope that this patch doesn't improve performance overwhelmingly, because if it does - it means there is a problem elsewhere. In any case, the kernel, and old userspace should get some benefit from this, so let's do it. eLLC is always a good default, and really not using it is the special case for MOCS. References: http://www.intel.com/newsroom/kits/restricted/ha$well!/pdfs/4th_Gen_Intel_Core_PressBriefing_5-29.pdf (page 57) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-16 08:08:39 +02:00
Ben Widawsky	0d8ff15e9a	drm/i915/hsw: Set correct Haswell PTE encodings. The cacheability controls have changed, and the bits have been rearranged in general. Note that age 0 is the oldest (most likely to get evicted) and age 3 is the youngest (most likely to stick around for a bit). We've picked 0 for no reason, but atm it shouldn't matter anyway (since we don't yet try to differentiate between different objects). v2: Remove comments for snb/ivb cache levels, that's a separate change. v3: Resolve conflicts due to patch series reordering. v4: Rebased on top of Kenneth Graunke's ->pte_encode refactoring. v5: Removed eLLC bits for separate patch. In the internal repository this was: Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> [danvet: Add comment about cache ages as requested by Ben provoked due to a question from Damien.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-16 07:57:42 +02:00
Ben Widawsky	c6cfb32567	drm/i915: Embed drm_mm_node in i915 gem obj Embedding the node in the obj is more natural in the transition to VMAs which will also have embedded nodes. This change also helps transition away from put_block to remove node. Though it's quite an uncommon occurrence, it's somewhat convenient to not fail at bind time because we cannot allocate the node. Though in practice there are other allocations (like the request structure) which would probably make this point not terribly useful. Quoting Daniel: Note that the only difference between put_block and remove_node is that the former fills up the preallocation cache. Which we don't need anyway and hence is just wasted space. v2: Clean up the stolen preallocation code. Rebased on the reserve_node patches renames ggtt_ stuff to gtt_ stuff WARN_ON if the object is already bound (which doesn't mean it's in the bound list, tricky) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-08 22:04:36 +02:00
Ben Widawsky	edd41a870f	drm/i915: Kill obj->gtt_offset With the getters in place from the previous patch this members serves no purpose other than saving one spare pointer chase, which will be killed in the next patch anyway. Moving to VMAs, this members adds unnecessary confusion since an object may exist at different offsets in different VMs. v2: Properly preserve the stolen offset. This code is a bit hacky but it all goes away when we embed the drm_mm_node and removes the need for the incorrect patch I submitted previously: "Use gtt_space->start for stolen reservation" Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-08 22:04:35 +02:00
Ben Widawsky	f343c5f647	drm/i915: Getter/setter for object attributes Soon we want to gut a lot of our existing assumptions how many address spaces an object can live in, and in doing so, embed the drm_mm_node in the object (and later the VMA). It's possible in the future we'll want to add more getter/setter methods, but for now this is enough to enable the VMAs. v2: Reworked commit message (Ben) Added comments to the main functions (Ben) sed -i "s/i915_gem_obj_set_color/i915_gem_obj_ggtt_set_color/" drivers/gpu/drm/i915/.[ch] sed -i "s/i915_gem_obj_bound/i915_gem_obj_ggtt_bound/" drivers/gpu/drm/i915/.[ch] sed -i "s/i915_gem_obj_size/i915_gem_obj_ggtt_size/" drivers/gpu/drm/i915/.[ch] sed -i "s/i915_gem_obj_offset/i915_gem_obj_ggtt_offset/" drivers/gpu/drm/i915/.[ch] (Daniel) v3: Rebased on new reserve_node patch Changed DRM_DEBUG_KMS to actually work (will need fixing later) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-08 22:04:34 +02:00
Ben Widawsky	338710e7af	drm: Change create block to reserve node With the previous patch we no longer actually create a node, we simply find the correct hole and occupy it. This very well could have been squashed with the last patch, but since I already had David's review, I figured it's easiest to keep it distinct. Also update the users in i915. Conveniently this is the only user of the interface. CC: David Airlie <airlied@linux.ie> CC: <dri-devel@lists.freedesktop.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Acked-by: David Airlie <airlied@linux.ie> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-08 22:04:33 +02:00
Ben Widawsky	b3a070cccb	drm: pre allocate node for create_block For an upcoming patch where we introduce the i915 VMA, it's ideal to have the drm_mm_node as part of the VMA struct (ie. it's pre-allocated). Part of the conversion to VMAs is to kill off obj->gtt_space. Doing this will break a bunch of code, but amongst them are 2 callers of drm_mm_create_block(), both related to stolen memory. It also allows us to embed the drm_mm_node into the object currently which provides a nice transition over to the new code. v2: Reordered to do before ripping out obj->gtt_offset. Some minor cleanups made available because of reordering. v3: s/continue/break on failed stolen node allocation (David) Set obj->gtt_space on failed node allocation (David) Only unref stolen (fix double free) on failed create_stolen (David) Free node, and NULL it in failed create_stolen (David) Add back accidentally removed newline (David) CC: <dri-devel@lists.freedesktop.org> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Acked-by: David Airlie <airlied@linux.ie> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-08 22:04:32 +02:00
Ben Widawsky	b2f21b4dfd	drm/i915: Use gtt shortform where possible Just for compactness. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-01 11:27:59 +02:00
Ben Widawsky	80a74f7f9c	drm/i915: Drop dev from pte_encode The original pte_encode function needed the dev argument so we could do platform specific handling via IS_GENX, etc. With the merging of a pte encoding function there should never been a need to quirk away gen specific details. The patch doesn't do much but makes the upcoming reworks in gtt/ppgtt/mm slightly (albeit, ever so) easier. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-01 11:27:59 +02:00
Ben Widawsky	6716724006	drm/i915: Combine scratch members into a struct There isn't any special reason to do this other than it makes it obvious that the two members are connected. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-01 11:27:58 +02:00
Ben Widawsky	84f1356058	drm/i915: Really share scratch page A previous patch had set up the ppgtt and ggtt to use the same scratch page, but still kept around both pointers. Kill it, it's not needed and gets in our way for upcoming cleanups. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-01 11:27:57 +02:00
Ben Widawsky	6670a5a5c7	drm/i915: make PDE\|PTE platform specific Nothing outside of i915_gem_gtt.c and more specifically, the relevant gen specific init function should need to know about number of PDEs, or PTEs per PD. Exposing this will only lead to circumventing using the upcoming VM abstraction. To accomplish this, move the defines into the .c file, rename the PDE define to be GEN6, and make the PTE count less of a magic number. The remaining code in the global gtt setup is a bit messy, but an upcoming patch will clean that one up. v2: Don't hardcode number of PDEs (Daniel + Jesse) Reworded commit message to reflect change. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-01 11:27:57 +02:00
Ben Widawsky	35c20a60c7	drm/i915: Rename the gtt_list to global_list Since it will be used for the global bound/unbound list with full PPGTT, this helps clarify things for upcoming code rework. Recommended-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-03 10:51:14 +02:00
Daniel Vetter	e1b73cba13	Linux 3.10-rc2 Merge tag 'v3.10-rc2' into drm-intel-next-queued Backmerge Linux 3.10-rc2 since the various (rather trivial) conflicts grew a bit out of hand. intel_dp.c has the only real functional conflict since the logic changed while dev_priv->edp.bpp was moved around. Also squash in a whitespace fixup from Ben Widawsky for i915_gem_gtt.c, git seems to do something pretty strange in there (which I don't fully understand tbh). Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_dp.c Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-21 09:52:16 +02:00
Ben Widawsky	c4ae25ecdf	Revert "drm/i915: Calculate correct stolen size for GEN7+" This reverts commit `03752f5b7b`. This revert requires a bit of explanation on how I understand things work. Internally the architects/designers decide how the stolen encoding works. We put it in a doc. BIOS writers take these docs and implement it. Driver writers read the doc too, and read the value left by the BIOS writers, and then we make magic. The failing here is that in the docs we had[1] contained two different definitions for this register for Gen7. (We have both a PCI register, and an MMIO, and each of these were different). At the time [2] of `03752f5`, we asked the architects what the correct value should be; but that doesn't match the reality (BIOS) unfortunately. So on all machines I can get my hands on, this revert is the right thing to do. I've also worked with the product group to confirm that they agree this revert is what we should do. People using HW made my "people" who both write their own BIOS, and have access to our docs (Apple?). Investigations are still ongoing about whether we need to add a list of machines needing special handling, but this patch should be the right thing for pretty much everyone. [1] The docs are still wrong on this one. Now instead of two registers with two definitions, we have one register with BOTH definitions, progress? [2] The open source PRMs have the "wrong" definitions in chapter Volume 1 part6, section 1.1.12. This digging was inspired by Paulo. Cc: Paulo Zanoni <przanoni@gmail.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> [danvet: Augment the patch saying that it's still a bit unclear whether there are any machines out there with "wrong" firmware and whether we need to add a list to handle them specially.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-07 18:59:09 +02:00
Ben Widawsky	3e30254205	drm/i915: Extract PDE writes It also makes some sense IMO to have these two functions separate irrespective of the number of callers. Only the single caller for now, but that will change as we add more PPGTTs. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> [danvet: Resolve conflict.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-06 11:49:27 +02:00
Ben Widawsky	0a73287060	drm/i915: BUG_ON bad PPGTT offset Because PPGTT PDEs within the GTT are calculated in cachelines (HW guys consistency ftw) we do a divide which will wreak havoc if this is wrong (and I know that from experience). If/when we move to multiple PPGTTs this will have to become a WARN, and return an error. For now however it should always be considered fatal, and only a developer could hit it. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> [danvet: s/BUG/WARN] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-06 11:40:47 +02:00
Zhang, Xiong Y	43b27290dd	drm/i915: correct the calculation of first_pd_entry_in_global_pt When ppgtt is enabled, dev_priv->gtt.total has excluded the gtt space occupied by ppgtt table in i915_gem_init_global_gtt() function. So the calculation of first_pd_entry_in_global_pt doesn't need to subtract I915_PPGTT_PD_ENTRIES again. Or else PPGTT directory table will be destroyed by global gtt allocation. This regression has been introduced in commit `a54c0c279f` Author: Ben Widawsky <ben@bwidawsk.net> Date: Thu Jan 24 14:45:00 2013 -0800 drm/i915: remove intel_gtt structure The breakage is pretty subtle since the old gtt_total_entries included the pde range, whereas the new one did not. Cc: stable@vger.kernel.org Signed-off-by: Xiong Zhang<xiong.y.zhang@intel.com> [danvet: Add regression citation and cc: stable. Thanks to Chris for correcting my wrong guess about which commit broke things.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-27 14:07:16 +02:00
Kenneth Graunke	9119708cd4	drm/i915: Split out Haswell code from gen6_pte_encode. Now that we have function pointers, it's cleaner to just create a new per-platform PTE encoding function. This should be identical in behavior to the previous code. v2: Drop accidental inline keyword on hsw_pte_encode. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Tested-by: Daniel Leung <daniel.leung@linux.intel.com> [v1] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-22 11:44:21 +02:00
Kenneth Graunke	93c34e70eb	drm/i915: Fix page table entries for Bay Trail. On Bay Trail, bit 1 means "writeable by the GPU." Failing to set that means basically anything using the GPU will cause hangs. v2: Drop accidental inline keyword on byt_pte_encode. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Tested-by: Daniel Leung <daniel.leung@linux.intel.com> [v1] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-22 11:44:11 +02:00
Kenneth Graunke	2d04befb94	drm/i915: Add PTE encoding function to the gtt/ppgtt vtables. Sandybridge/Ivybridge, Bay Trail, and Haswell all have slightly different page table entry formats. Rather than polluting one function with generation checks, simply use a function pointer and set up the correct PTE encoding function at startup. v2: Move the gen6_gtt_pte_t typedef to i915_drv.h so that the function pointers and implementations have identical signatures. Also remove inline keyword on gen6_pte_encode. Both suggested by Jani Nikula. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com> Tested-by: Daniel Leung <daniel.leung@linux.intel.com> [v1] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-22 11:20:15 +02:00
Ben Widawsky	6a99476180	drm/i915: Remove stale code Looks like some remnant from a rebase. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-18 09:43:20 +02:00
Ville Syrjälä	a6f429a5a2	drm/i915: Configure GAM_ECOCHK appropriatly for Gen7 IVB and HSW use different encodings for the PPGTT cacheability bits in the GAM_ECOCHK register. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-18 09:43:19 +02:00
Ville Syrjälä	a65c2fcd00	drm/i915: Set GAC_ECO_BITS register on Gen7+ According to BSpec GAC_ECO_BITS register exists on Gen7 platforms as well. Configure it accordingly. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-18 09:43:18 +02:00
Ville Syrjälä	3b9d7888df	drm/i915: Add ECOBITS_SNB_BIT GAC_ECO_BITS has a bit similar to GAM_ECOCHK's ECOCHK_SNB_BIT. Add the define, and enable it on SNB. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-18 09:43:18 +02:00
Ben Widawsky	b7c36d2546	drm/i915: Allow PPGTT enable to fail I'm really not happy that we have to support this, but this will be the simplest way to handle cases where PPGTT init can fail, which I promise will be coming in the future. v2: Resolve conflicts due to patch series reordering. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-18 09:43:16 +02:00
Ben Widawsky	5963cf049a	drm/i915: NULL aliasing_ppgtt on cleanup This will allow us to carry on if we've cleaned up the PPGTT. The usage for this is coming up - it simplifies handling a failed PPGTT init. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Spill the secrets about failing ppgtt init.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-18 09:43:15 +02:00
Ben Widawsky	6197349bde	drm/i915: Abstract PPGTT enabling Since we've already set up a nice vtable to abstract other PPGTT functions, also abstract the actual register programming to enable things. This function will probably need to change a bit as we implement real processes. v2: Resolve conflicts due to patch series reordering. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-18 09:43:15 +02:00
Ben Widawsky	3ed124b21e	drm/i915: Rework PPGTT init code This rework will help if future platforms choose to be a bit different. Should have no functional impact. v2: Don't move around the vtable setup (Daniel) v3: Squash in the disable-by-default patch. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-18 09:43:14 +02:00
Ben Widawsky	3eb1c005c6	drm/i915: Conditionally carve out GGTT PDE It only works that way on GEN6 and GEN7. Let's not assume GENn will be the same. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-18 09:43:14 +02:00
Ben Widawsky	1e7d12d467	drm/i915/ppgtt: Set scratch page "globally" The PPGTT scratch page is used for all gens, and doing it in the global part of our PPGTT setup makes the code a bit nicer. This was in a patch submitted earlier as part of the PPGTT cleanups. Grumpy maintainer must have missed it, and I didn't yell when appropriate. Apologies for everyone :-) v2: Update commit message Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-18 09:43:13 +02:00
Ben Widawsky	c81dbe0563	drm/i915: random checkpatch fixes There used to be other fixes in this patch but they've slowly disappeared as other parts have been fixed. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-18 09:43:13 +02:00
Ben Widawsky	e7c2b58b70	drm/i915: Call out GEN6 PTE specificity We can assume that the PTE layout, and size changes for future generations. To avoid confusion with the existing GEN6 PTE typedef, give it a GEN6_ prefix. v2: Fixup checkpatch warning and bikeshed commit message slightly. v3: Rebase on top of Imre's for_each_sg_pages rework. v4: Fixup conflicts in patch series reordering. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-18 09:43:12 +02:00
Ben Widawsky	a93e41618e	drm/i915: generalize pte vs. register BAR allocation All gen6+ parts so far have 1 BAR which holds both the register space and the GTT PTEs. Up until now, that was a 4MB BAR with half allocated to each. I have a strong hunch (wink, nod, wink) that future gens will also keep a similar 50-50 split though the sizes may change. To help this along change the code to obey the rule of half the total size instead of a hard-coded 2MB. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-18 09:43:11 +02:00
Imre Deak	2db76d7c3c	lib/scatterlist: sg_page_iter: support sg lists w/o backing pages The i915 driver uses sg lists for memory without backing 'struct page' pages, similarly to other IO memory regions, setting only the DMA address for these. It does this, so that it can program the HW MMU tables in a uniform way both for sg lists with and without backing pages. Without a valid page pointer we can't call nth_page to get the current page in __sg_page_iter_next, so add a helper that relevant users can call separately. Also add a helper to get the DMA address of the current page (idea from Daniel). Convert all places in i915, to use the new API. Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-03-27 17:13:44 +01:00
Daniel Vetter	a15326a57c	drm/i915: fixup pd vs pt confusion in gen6 ppgtt code The index variable points at a page table, not a page directory or a pde. Ben Widawsky fixed this up correctly in his ppgtt cleanup, but I've botched the job and copy&pasted the old confusion from the original gen6 ppgtt code in commit `def886c376` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Thu Jan 24 14:44:56 2013 -0800 drm/i915: vfuncs for ppgtt Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-03-23 12:18:03 +01:00
Daniel Vetter	6ddc4fc70a	style nit: Align function parameter continuation properly.	2013-03-23 12:18:02 +01:00
Imre Deak	6e995e231a	drm/i915: use for_each_sg_page for setting up the gtt ptes The existing gtt setup code is correct - and so doesn't need to be fixed to handle compact dma scatter lists similarly to the previous patches. Still, take the for_each_sg_page macro into use, to get somewhat simpler code. Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-03-23 12:17:31 +01:00
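
Several commits above (2d04befb94, 93c34e70eb, b35b380ed4) describe replacing generation checks inside one PTE encoder with a per-platform function pointer chosen at startup, a Bay Trail variant that sets bit 1 ("writeable by the GPU"), and an optional valid bit. The following is only a rough sketch of that pattern in C. Every identifier, the bit positions other than bit 1, and the 32-bit PTE width are assumptions made for illustration, not the i915 driver's actual definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical PTE type; the real driver's typedef may differ. */
typedef uint32_t pte_t;

/* Hypothetical cache levels, reduced to two for the sketch. */
enum cache_level { CACHE_NONE, CACHE_LLC };

#define PTE_VALID     (1u << 0) /* assumed position of the valid bit */
#define PTE_WRITEABLE (1u << 1) /* bit 1: "writeable by the GPU" (commit 93c34e70eb) */
#define PTE_CACHED    (1u << 2) /* illustrative position only */

/* Signature shared by all per-platform encoders. */
typedef pte_t (*pte_encode_fn)(uint32_t addr, enum cache_level level, bool valid);

/* Baseline encoder: page-aligned address, optional valid bit, cache bit. */
static pte_t generic_pte_encode(uint32_t addr, enum cache_level level, bool valid)
{
    pte_t pte = valid ? PTE_VALID : 0;

    pte |= addr & ~0xfffu; /* keep only the page frame bits */
    if (level != CACHE_NONE)
        pte |= PTE_CACHED;
    return pte;
}

/* Variant for a platform that additionally requires the write bit. */
static pte_t writeable_pte_encode(uint32_t addr, enum cache_level level, bool valid)
{
    return generic_pte_encode(addr, level, valid) | PTE_WRITEABLE;
}

/* Hypothetical address space holding the encoder chosen at init time. */
struct address_space {
    pte_encode_fn pte_encode;
};

enum platform { PLATFORM_GENERIC, PLATFORM_NEEDS_WRITE_BIT };

/* Select the encoder once, so callers never test the platform again. */
static void address_space_init(struct address_space *vm, enum platform p)
{
    vm->pte_encode = (p == PLATFORM_NEEDS_WRITE_BIT)
                         ? writeable_pte_encode
                         : generic_pte_encode;
}
```

A caller would run `address_space_init(&vm, platform)` once and then use `vm.pte_encode(addr, level, valid)` everywhere. Passing `valid = false` yields an entry without the valid bit, which is the kind of hook the suspend workaround in 828c79087c appears to rely on.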


127 commits