linux/tools/objtool
Peter Zijlstra 119251855f x86/retpoline: Simplify retpolines
Due to:

  c9c324dc22 ("objtool: Support stack layout changes in alternatives")

it is now possible to simplify the retpolines.

Currently our retpolines consist of 2 symbols:

 - __x86_indirect_thunk_\reg: the compiler target
 - __x86_retpoline_\reg:  the actual retpoline.

Both are consecutive in code and aligned such that for any one register
they both live in the same cacheline:

  0000000000000000 <__x86_indirect_thunk_rax>:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop

  0000000000000005 <__x86_retpoline_rax>:
   5:   e8 07 00 00 00          callq  11 <__x86_retpoline_rax+0xc>
   a:   f3 90                   pause
   c:   0f ae e8                lfence
   f:   eb f9                   jmp    a <__x86_retpoline_rax+0x5>
  11:   48 89 04 24             mov    %rax,(%rsp)
  15:   c3                      retq
  16:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)

The thunk is an alternative_2, where one option is a JMP to the
retpoline. This was done so that objtool didn't need to deal with
alternatives with stack ops. But that problem has been solved, so now
it is possible to fold the entire retpoline into the alternative to
simplify and consolidate unused bytes:

  0000000000000000 <__x86_indirect_thunk_rax>:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop
   5:   90                      nop
   6:   90                      nop
   7:   90                      nop
   8:   90                      nop
   9:   90                      nop
   a:   90                      nop
   b:   90                      nop
   c:   90                      nop
   d:   90                      nop
   e:   90                      nop
   f:   90                      nop
  10:   90                      nop
  11:   66 66 2e 0f 1f 84 00 00 00 00 00        data16 nopw %cs:0x0(%rax,%rax,1)
  1c:   0f 1f 40 00             nopl   0x0(%rax)

Notice that since the longest alternative sequence is now:

   0:   e8 07 00 00 00          callq  c <.altinstr_replacement+0xc>
   5:   f3 90                   pause
   7:   0f ae e8                lfence
   a:   eb f9                   jmp    5 <.altinstr_replacement+0x5>
   c:   48 89 04 24             mov    %rax,(%rsp)
  10:   c3                      retq

17 bytes, we have 15 bytes NOP at the end of our 32 byte slot. (IOW, if
we can shrink the retpoline by 1 byte we can pack it more densely).

 [ bp: Massage commit message. ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210326151259.506071949@infradead.org
2021-04-02 12:42:04 +02:00
..
arch/x86 x86/alternatives: Optimize optimize_nops() 2021-04-02 12:41:17 +02:00
Documentation objtool: Support stack layout changes in alternatives 2021-01-14 09:53:54 -06:00
include/objtool clang-lto for v5.12-rc1 (part2) 2021-02-23 15:13:45 -08:00
.gitignore objtool: Rework header include paths 2021-01-13 18:13:14 -06:00
Build objtool: Enable compilation of objtool for all architectures 2020-05-20 09:17:28 -05:00
builtin-check.c clang-lto for v5.12-rc1 (part2) 2021-02-23 15:13:45 -08:00
builtin-orc.c objtool: Refactor ORC section generation 2021-01-14 09:53:42 -06:00
check.c x86/retpoline: Simplify retpolines 2021-04-02 12:42:04 +02:00
elf.c objtool updates: 2021-02-23 09:56:13 -08:00
Makefile objtool: Refactor ORC section generation 2021-01-14 09:53:42 -06:00
objtool.c clang-lto for v5.12-rc1 (part2) 2021-02-23 15:13:45 -08:00
orc_dump.c x86/unwind/orc: Change REG_SP_INDIRECT 2021-02-10 20:53:51 +01:00
orc_gen.c objtool: Support stack layout changes in alternatives 2021-01-14 09:53:54 -06:00
special.c objtool: Rework header include paths 2021-01-13 18:13:14 -06:00
sync-check.sh Merge branch 'x86/cpu' into WIP.x86/core, to merge the NOP changes & resolve a semantic conflict 2021-04-02 12:36:30 +02:00
weak.c objtool: Refactor ORC section generation 2021-01-14 09:53:42 -06:00