Commit graph

54847 commits

Author SHA1 Message Date
Alex Bennée df0311e634 include/exec/exec-all: document common exit conditions
As a precursor to later patches attempt to come up with a more
concrete wording for what each of the common exit cases would be.

CC: Emilio G. Cota <cota@braap.org>
CC: Richard Henderson <rth@twiddle.net>
CC: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 20170713141928.25419-2-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-17 13:36:07 +01:00
Peter Maydell 8d92e26b45 target/arm: Make Cortex-M3 and M4 default to 8 PMSA regions
The Cortex-M3 and M4 CPUs always have 8 PMSA MPU regions (this isn't
a configurable option for the hardware).  Make the default value of
the pmsav7-dregion property be set per-cpu, so we don't need to have
every user of these CPUs set it manually.  (The existing default of
16 is correct for the other PMSAv7 core, the Cortex-R5.)

This fixes a bug where we were creating the M3 and M4 with
too many regions; most guest software would not notice or
care, though, since it would just not use the registers
associated with the unexpected extra regions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1499788408-10096-4-git-send-email-peter.maydell@linaro.org
2017-07-17 13:36:07 +01:00
Peter Maydell 5cc56cc687 qdev: support properties which don't set a default value
In some situations it's useful to have a qdev property which doesn't
automatically set its default value when qdev_property_add_static is
called (for instance when the default value is not constant).

Support this by adding a flag to the Property struct indicating
whether to set the default value.  This replaces the existing test
for whether the PropertyInfo set_default_value function pointer is
NULL, and we set the .set_default field to true for all those cases
of struct Property which use a PropertyInfo with a non-NULL
set_default_value, so behaviour remains the same as before.

This gives us the semantics of:
 * if .set_default is true, then .info->set_default_value must
   be not NULL, and .defval is used as the the default value of
   the property
 * otherwise, the property system does not set any default, and
   the field will retain whatever initial value it was given by
   the device's .instance_init method

We define two new macros DEFINE_PROP_SIGNED_NODEFAULT and
DEFINE_PROP_UNSIGNED_NODEFAULT, to cover the most plausible use cases
of wanting to set an integer property with no default value.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1499788408-10096-3-git-send-email-peter.maydell@linaro.org
2017-07-17 13:36:06 +01:00
Peter Maydell d9a7b125d6 qdev-properties.h: Explicitly set the default value for arraylen properties
In DEFINE_PROP_ARRAY, because we use a PropertyInfo (qdev_prop_arraylen)
which has a .set_default_value member we will set the field to a default
value. That default value will be zero, by the C rule that struct
initialization sets unmentioned members to zero if at least one member
is initialized. However it's clearer to state it explicitly.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1499788408-10096-2-git-send-email-peter.maydell@linaro.org
2017-07-17 13:36:06 +01:00
Jason Wang 189ae6bb5c virtio-net: fix offload ctrl endian
Spec said offloads should be le64, so use virtio_ldq_p() to guarantee
valid endian.

Fixes: 644c98587d ("virtio-net: dynamic network offloads configuration")
Cc: qemu-stable@nongnu.org
Cc: Dmitry Fleytman <dfleytma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-07-17 20:13:56 +08:00
Michal Privoznik 5f997fd17b virtion-net: Prefer is_power_of_2()
We have a function that checks if given number is power of two.
We should prefer it instead of expanding the check on our own.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-07-17 20:13:55 +08:00
Zhang Chen 2484ff0624 docs/colo-proxy.txt: Update colo-proxy usage of net driver with vnet_header
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-07-17 20:13:54 +08:00
Zhang Chen 4b39bdced5 net/filter-rewriter.c: Make filter-rewriter support vnet_hdr_len
We add the vnet_hdr_support option for filter-rewriter, default is disabled.
If you use virtio-net-pci or other driver needs vnet_hdr, please enable it.
You can use it for example:
-object filter-rewriter,id=rew0,netdev=hn0,queue=all,vnet_hdr_support

We get the vnet_hdr_len from NetClientState that make us
parse net packet correctly.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-07-17 20:13:53 +08:00
Zhang Chen d63b366a26 net/colo-compare.c: Add vnet packet's tcp/udp/icmp compare
COLO-Proxy just focus on packet payload, so we skip vnet header.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-07-17 20:13:52 +08:00
Zhang Chen 5cc444d367 net/colo.c: Add vnet packet parse feature in colo-proxy
Make colo-compare and filter-rewriter can parse vnet packet.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-07-17 20:13:51 +08:00
Zhang Chen aa3a7032f7 net/colo-compare.c: Make colo-compare support vnet_hdr_len
We add the vnet_hdr_support option for colo-compare, default is disabled.
If you use virtio-net-pci or other driver needs vnet_hdr, please enable it.
You can use it for example:
-object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0,vnet_hdr_support

COLO-compare can get vnet header length from filter,
Add vnet_hdr_len to struct packet and output packet with
the vnet_hdr_len.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-07-17 20:13:50 +08:00
Zhang Chen 3037e7a5b7 net/colo-compare.c: Introduce parameter for compare_chr_send()
This patch change the compare_chr_send() parameter from CharBackend to CompareState,
we can get more information like vnet_hdr(We use it to support packet with vnet_header).

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-07-17 20:13:49 +08:00
Zhang Chen ada1a33f9a net/colo.c: Make vnet_hdr_len as packet property
We can use this property flush and send packet with vnet_hdr_len.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-07-17 20:13:48 +08:00
Zhang Chen 00d5c2406b net/filter-mirror.c: Add new option to enable vnet support for filter-redirector
We add the vnet_hdr_support option for filter-redirector, default is disabled.
If you use virtio-net-pci net driver or other driver needs vnet_hdr, please enable it.
Because colo-compare or other modules needs the vnet_hdr_len to parse
packet, we add this new option send the len to others.
You can use it for example:
-object filter-redirector,id=r0,netdev=hn0,queue=tx,outdev=red0,vnet_hdr_support

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-07-17 20:13:47 +08:00
Zhang Chen e2521f0e03 net/filter-mirror.c: Make filter mirror support vnet support.
We add the vnet_hdr_support option for filter-mirror, default is disabled.
If you use virtio-net-pci or other driver needs vnet_hdr, please enable it.
You can use it for example:
-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0,vnet_hdr_support

If it has vnet_hdr_support flag, we will change the sending packet format from
struct {int size; const uint8_t buf[];} to {int size; int vnet_hdr_len; const uint8_t buf[];}.
make other module(like colo-compare) know how to parse net packet correctly.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-07-17 20:13:45 +08:00
Stefan Hajnoczi 304187c51c trace: update old trace events in docs
Commit c5f1ad429c ("block: Remove
bdrv_aio_readv/writev/flush()") removed
bdrv_aio_readv()/bdrv_aio_writev() so the example in the tracing
documentation is no longer valid.

Reported-by: Wang Dong <dongdwdw@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170714133111.27359-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-07-17 13:11:13 +01:00
Lluís Vilanova 5caa262fda trace: [trivial] Statically enable all guest events
The existing optimizations makes it feasible to have them available on all
builds.

Some quick'n'dirty numbers with 400.perlbench (SPECcpu2006) on the train input
(medium size - suns.pl) and the guest_mem_before event:

* vanilla, statically disabled
real    0m2,259s
user    0m2,252s
sys     0m0,004s

* vanilla, statically enabled (overhead: 2.18x)
real    0m4,921s
user    0m4,912s
sys     0m0,008s

* multi-tb, statically disabled (overhead: 0.99x) [within noise range]
real    0m2,228s
user    0m2,216s
sys     0m0,008s

* multi-tb, statically enabled (overhead: 0.99x) [within noise range]
real    0m2,229s
user    0m2,224s
sys     0m0,004s

Now enabling all events when booting an ARM system that immediately shuts down
(https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg04085.html):

* vanilla, statically disabled
real	0m32,153s
user	0m31,276s
sys	0m0,108s

* vanilla, statically enabled (overhead: 1.35x)
real	0m43,507s
user	0m42,680s
sys	0m0,168s

* multi-tb, statically disabled (overhead: 1.03x)
real	0m32,993s
user	0m32,516s
sys	0m0,104s

* multi-tb, statically enabled (overhead: 1.00x) [within noise range]
real	0m32,110s
user	0m31,176s
sys	0m0,156s

And finally enabling all events using Emilio's dbt-bench
(where orig == vanilla, new == multi-tb):

                                                        NBench score; higher is better

  180 +-+--------+----------+----------+---------+----------+----------+----------+----------+----------+---------+----------+--------+-+
      |                                                                                                                                 |
      |                                      *** $$$$%%                                                                    orig         |
  160 +-+....................................*.*.$..$.%............................................................orig-enabled       +-+
      |                                      * * $  $ %                                                                     new         |
  140 +-+....................................*.*.$..$.%............................................................new-disabled.......+-+
      |                                      * * $  $ %                                                                                 |
      |                                      * * $  $ %                                                                                 |
  120 +-+....................................*.*.$..$.%...............................................................................+-+
      |                                      * * $  $ %                                                                                 |
      |                                      * * $  $ %                                                                                 |
  100 +-+....................................*.*.$..$.%.....$$$%%%....................................................................+-+
      |                                      * * $  $ % *** $ $  % *** $$$%%                                                            |
   80 +-+....................................*.*.$..$.%.*.*.$.$..%.*.*.$.$.%..........................................................+-+
      |                                      * * $  $ % * * $ $  % * * $ $ %                                                            |
      |                                      * * $  $ % * * $ $  % * * $ $ %                                                            |
   60 +-+.........................***..$$$%%.*.*##..$.%.*.*.$.$..%.*.*.$.$.%..***.$$$%%...............................................+-+
      |                **** $$$%% * *  $ $ % * * #  $ % * *## $  % * * $ $ %  * * $ $ %                                                 |
      |                *  * $ $ % * *  $ $ % * * #  $ % * * # $  % * *## $ %  * * $ $ %                                                 |
   40 +-+..............*..*.$.$.%.*.*..$.$.%.*.*.#..$.%.*.*.#.$..%.*.*.#.$.%..*.*.$.$.%...............................................+-+
      |                *  * $ $ % * *  $ $ % * * #  $ % * * # $  % * * # $ %  * *## $ %                                  *** $$$%%%     |
   20 +-+....***.$$$%%.*..*##.$.%.*.*###.$.%.*.*.#..$.%.*.*.#.$..%.*.*.#.$.%..*.*.#.$.%..................................*.*.$.$..%...+-+
      |      * *## $ % *  * # $ % * *  # $ % * * #  $ % * * # $  % * * # $ %  * * # $ %                                  * *## $  %     |
      |      * * # $ % *  * # $ % * *  # $ % * * #  $ % * * # $  % * * # $ %  * * # $ %            ***###$$%% ***##$$$%% * * # $  %     |
    0 +-+----***##$$%%-****##$$%%-***###$$%%-***##$$$%%-***##$$%%%-***##$$%%--***##$$%%-****##$$%%-***###$$%%-***##$$$%%-***##$$%%%---+-+
     NUMERIC SORTSTRING SORT   BITFIEFP EMULATION ASSIGNMENT       IDEA    HUFFMAN    FOURIER NEURLU DECOMPOSITION      gmean
png: http://imgur.com/a/8XG5S

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-id: 149915849243.6295.4484103824675839071.stgit@frigg.lan
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-07-17 13:11:13 +01:00
Lluís Vilanova 1ff7b53196 trace: [tcg, trivial] Re-align generated code
Last patch removed a nesting level in generated code. Re-align all code
generated by backends to be 4-column aligned.

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-id: 149915824586.6295.17820926011082409033.stgit@frigg.lan
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-07-17 13:11:13 +01:00
Lluís Vilanova 864a2178d4 trace: [tcg] Do not generate TCG code to trace dynamically-disabled events
If an event is dynamically disabled, the TCG code that calls the
execution-time tracer is not generated.

Removes the overheads of execution-time tracers for dynamically disabled
events. As a bonus, also avoids checking the event state when the
execution-time tracer is called from TCG-generated code (since otherwise
TCG would simply not call it).

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-id: 149915799921.6295.13067154430923434035.stgit@frigg.lan
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-07-17 13:11:12 +01:00
Lluís Vilanova 61a67f71dd exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state
Every vCPU now uses a separate set of TBs for each set of dynamic
tracing event state values. Each set of TBs can be used by any number of
vCPUs to maximize TB reuse when vCPUs have the same tracing state.

This feature is later used by tracetool to optimize tracing of guest
code events.

The maximum number of TB sets is defined as 2^E, where E is the number
of events that have the 'vcpu' property (their state is stored in
CPUState->trace_dstate).

For this to work, a change on the dynamic tracing state of a vCPU will
force it to flush its virtual TB cache (which is only indexed by
address), and fall back to the physical TB cache (which now contains the
vCPU's dynamic tracing state as part of the hashing function).

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-id: 149915775266.6295.10060144081246467690.stgit@frigg.lan
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-07-17 13:11:05 +01:00
Lluís Vilanova d43811165d trace: [tcg] Delay changes to dynamic state when translating
This keeps consistency across all decisions taken during translation
when the dynamic state of a vCPU is changed in the middle of translating
some guest code.

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-id: 149915750615.6295.3713699402253529487.stgit@frigg.lan
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-07-17 13:10:54 +01:00
Lluís Vilanova d01c05c955 trace: Allocate cpu->trace_dstate in place
There's little point in dynamically allocating the bitmap if we
know at compile-time the max number of events we want to support.
Thus, make room in the struct for the bitmap, which will make things
easier later: this paves the way for upcoming changes, in which
we'll use a u32 to fully capture cpu->trace_dstate.

This change also increases performance by saving a dereference and
improving locality--note that this is important since upcoming work
makes reading this bitmap fairly common.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-id: 149915725977.6295.15069969323605305641.stgit@frigg.lan
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-07-17 13:10:45 +01:00
Zhang Chen dc3c5ac645 net/filter-mirror.c: Introduce parameter for filter_send()
This patch change the filter_send() parameter from CharBackend to MirrorState,
we can get more information like vnet_hdr(We use it to support packet with vnet_header).

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-07-17 20:02:11 +08:00
Zhang Chen 3cde5ea211 net/net.c: Add vnet_hdr support in SocketReadState
We add a flag to decide whether net_fill_rstate() need read
the vnet_hdr_len or not.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-07-17 20:02:11 +08:00
Zhang Chen d6b732e953 net: Add vnet_hdr_len arguments in NetClientState
Add vnet_hdr_len arguments in NetClientState
that make other module get real vnet_hdr_len easily.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-07-17 20:02:09 +08:00
Peter Maydell 77031ee1ce ppc patch queue 2017-07-17
This pull requests supersedes the one from 2017-07-14.  That one had a
 couple of subtle regressions: there was a build error for mingw32, and
 an instance_size which was theoretically wrong everywhere, but only
 actually bit on the Travis OSX build.
 
 There are two major batches in this set, rather than the usual
 collection of assorted fixes.
 
     * More DRC cleanup.  This gets the state management into a state
       which should fix many of the hotplug+migration problems we've
       had.  Plus it gets the migration stream format into something
       well defined and pretty minimal which we can reasonably support
       into the future.
 
     * Hashed Page Table resizing.  It's been a while since this was
       posted, but it's been through several previous rounds of review.
       The kernel parts (both guest and host) are merged in 4.11, so
       this is the only remaining piece left to allow resizing of the
       HPT in a running guest.
 
 There are also a handful of unrelated fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAllsWwQACgkQbDjKyiDZ
 s5LMnA//dpoqWrTPiEmx2DsXMkjLefn/2Yl1dkQDzhyb7v+tNGFYmxpbb7nPRfJE
 tfvcKu1Tz23NPOp6+1VC9eTyTO1YOXTgvQrNSbF1MmIg4PGN6s2DHrLviAqCS15M
 29x6+RdRaeLUSCsk8elsViiWb8h7cISDuN0SMA0WWjWP3bO/drz5nq5z5dRgdVFe
 Z5O0qwDNoN0NypJ68Cld+riP1uDAYMONPxA0QOWCLx8qowoJ3hYMuyNnqBQU5OJn
 PpAA3EfdxkN6rtaBjDt7xHkJfm9Xkm9SsT8qTcj/R2JjkENef8EbzrdjFE+pSVz0
 7c9C4evgYgmhUCUFvnZfgN+VBL1lS/p5UGnFPyNQ7KbSXDE71OAgWH/f/7kzsJPy
 MxbJWM6eUN9Ny0APxM8olLV1FM4GzEoCSLfDVhStrdJ6P5wBmjLSugqSOLB8aMtd
 8NwBY06nTpmo9xXGz9enLUWlpSeoReKU3TxvQvY+JcOWWpasDZOO4zD8B3bdLbA/
 I8jdkH5Vs0pyPLaWD+1FxlQvlF45CuwpwoiAz00V2XkkMu8jKCGsQ0iuqXorSqvs
 /7tQ1pHlUybAX+5W9raaJmphgc4gk33P3PlQCjhgYzxRu4yzRsEzS9hahoO/TAmq
 Y70CooZaaeGNOBEDcKLZEzJdBr52cqW4MM8t1xHWTg3VCHJGeYI=
 =O6NQ
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.10-20170717' into staging

ppc patch queue 2017-07-17

This pull requests supersedes the one from 2017-07-14.  That one had a
couple of subtle regressions: there was a build error for mingw32, and
an instance_size which was theoretically wrong everywhere, but only
actually bit on the Travis OSX build.

There are two major batches in this set, rather than the usual
collection of assorted fixes.

    * More DRC cleanup.  This gets the state management into a state
      which should fix many of the hotplug+migration problems we've
      had.  Plus it gets the migration stream format into something
      well defined and pretty minimal which we can reasonably support
      into the future.

    * Hashed Page Table resizing.  It's been a while since this was
      posted, but it's been through several previous rounds of review.
      The kernel parts (both guest and host) are merged in 4.11, so
      this is the only remaining piece left to allow resizing of the
      HPT in a running guest.

There are also a handful of unrelated fixes.

# gpg: Signature made Mon 17 Jul 2017 07:36:52 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.10-20170717: (21 commits)
  target/ppc: fix CPU hotplug when radix is enabled (TCG)
  spapr: fix memory leak in spapr_core_pre_plug()
  pseries: Allow HPT resizing with KVM
  pseries: Use smaller default hash page tables when guest can resize
  pseries: Enable HPT resizing for 2.10
  pseries: Implement HPT resizing
  pseries: Stubs for HPT resizing
  ppc/pnv: Remove unused XICSState reference
  spapr: fix potential memory leak in spapr_core_plug()
  spapr: Implement DR-indicator for physical DRCs only
  spapr: Remove sPAPRConfigureConnectorState sub-structure
  spapr: Consolidate DRC state variables
  spapr: Cleanups relating to DRC awaiting_release field
  spapr: Refactor spapr_drc_detach()
  spapr: Abort on delete failure in spapr_drc_release()
  spapr: Simplify unplug path
  spapr: Remove 'awaiting_allocation' DRC flag
  spapr: Treat devices added before inbound migration as coldplugged
  spapr: Minor cleanups to events handling
  spapr: migrate pending_events of spapr state
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-17 12:52:59 +01:00
Peter Maydell 6632f6ff96 -----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
 
 iQEtBAABCAAXBQJZbDM5EBxmYW16QHJlZGhhdC5jb20ACgkQyjViTGqRccaYjgf7
 BmLHDcueclRZPXIyY2Gbf1/VmRyYeq1IZIYVKLAepDeYqQ4+vVXTwSuoe6cu4r1G
 AtCNAoUBwqE+YTg/dgaRF5TNVkLD4LMU6qoBO9RIRNYdz9J6V72NrzeF5rCJvYvx
 ghpgbrVgpxjiREodbzxKaEaY5x1x5LqTClSjsz64MmR0ibm2HgMMn2khUr4Z+3wL
 jgdRGCpc4xFIBNfjmGY+8OzvHw0+QqWpvWp8gYxMRI/1/wBfQYnUvg9GLuf/fQPe
 ZYIYW7jURkjBaRZd1lYxd3MHDLzyjjnSpxas4P1D/QtlakhZCiLbWIAMdxfZWBdj
 ieS6w1vgZOuEb+dH6HLNmA==
 =x5vk
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/famz/tags/block-and-testing-pull-request' into staging

# gpg: Signature made Mon 17 Jul 2017 04:47:05 BST
# gpg:                using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021  AD56 CA35 624C 6A91 71C6

* remotes/famz/tags/block-and-testing-pull-request:
  travis: add no-TCG build
  docker.py: Improve subprocess exit code handling
  docker.py: Drop infile parameter
  docker: Don't enable networking as a side-effect of DEBUG=1
  ssh: support I/O from any AioContext
  sheepdog: add queue_lock
  qed: protect table cache with CoMutex
  qed: introduce bdrv_qed_init_state
  block: invoke .bdrv_drain callback in coroutine context and from AioContext
  qed: move tail of qed_aio_write_main to qed_aio_write_{cow, alloc}
  vvfat: make it thread-safe
  vpc: make it thread-safe
  vdi: make it thread-safe
  coroutine-lock: add qemu_co_rwlock_downgrade and qemu_co_rwlock_upgrade
  qcow2: call CoQueue APIs under CoMutex

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-17 11:46:36 +01:00
Gerd Hoffmann 10750ee0d6 virtio-gpu: skip update cursor in post_load if we don't have one
If the cursor resource id isn't set the guest didn't define a cursor.
Skip the cursor update in post_load in that that case.

Reported-by: wanghaibin <wanghaibin.wang@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: wanghaibin <wanghaibin.wang@huawei.com>
Message-id: 20170710070432.856-1-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-07-17 11:41:23 +02:00
Gerd Hoffmann 2a7f263068 ehci: add sanity check for maxframes
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20170703111549.10924-1-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-07-17 11:39:08 +02:00
Gerd Hoffmann feb47cf2fa keymaps: fr-ca: add missing keys
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170712072305.29233-1-kraxel@redhat.com
2017-07-17 11:36:41 +02:00
Dr. David Alan Gilbert 0a9667ecdb hmp: Update info vnc
The QMP query-vnc interfaces have gained a lot more information that
the HMP interfaces hasn't got yet. Update it.

Note the output format has changed, but this is HMP so that's OK.

In particular, this now includes client information for reverse
connections:

-vnc :0
(qemu) info vnc
default:
  Server: 0.0.0.0:5900 (ipv4)
    Auth: none (Sub: none)

  (Now connect a client)

(qemu) info vnc
default:
  Server: 0.0.0.0:5900 (ipv4)
    Auth: none (Sub: none)
  Client: 127.0.0.1:51828 (ipv4)
    x509_dname: none
    sasl_username: none

-vnc localhost:7000,reverse
(qemu) info vnc
default:
  Client: ::1:7000 (ipv6)
    x509_dname: none
    sasl_username: none
  Auth: none (Sub: none)

-vnc :1,password,id=pass -vnc localhost:7000,reverse
(qemu) info vnc
default:
  Client: ::1:7000 (ipv6)
    x509_dname: none
    sasl_username: none
  Auth: none (Sub: none)
rev:
  Server: 0.0.0.0:5901 (ipv4)
    Auth: vnc (Sub: none)
  Client: 127.0.0.1:53616 (ipv4)
    x509_dname: none
    sasl_username: none

This was originally RH bz 1461682

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 20170711154414.21111-1-dgilbert@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-07-17 11:36:09 +02:00
Alexander Graf d3b0db6dfe vnc: Set default kbd delay to 10ms
The current VNC default keyboard delay is 1ms. With that we're constantly
typing faster than the guest receives keyboard events from an XHCI attached
USB HID device.

The default keyboard delay time in the input layer however is 10ms. I don't know
how that number came to be, but empirical tests on some OpenQA driven ARM
systems show that 10ms really is a reasonable default number for the delay.

This patch moves the VNC delay also to 10ms. That way our default is much
safer (good!) and also consistent with the input layer default (also good!).

Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1499863425-103133-1-git-send-email-agraf@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-07-17 11:35:27 +02:00
Hervé Poussineau 639b49ef9a audio/adlib: remove limitation of one adlib card
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170621043401.19842-3-hpoussin@reactos.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-07-17 11:09:02 +02:00
Hervé Poussineau c57fbf50e7 audio/fmopl: modify timer callback to give opaque and channel parameters in two arguments
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170621043401.19842-2-hpoussin@reactos.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-07-17 11:09:02 +02:00
Peng Hao facd0e9773 audio: st_rate_flow exist a infinite loop
If a voice recording equipment is opened for a long time(several days)
in windows guest, rate->ipos will overflow and rate->opos will never
have a chance to change. It will result to a infinite loop.

Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Signed-off-by: Wang Yechao <wang.yechao255@zte.com.cn>
Message-id: 1500128061-20849-1-git-send-email-peng.hao2@zte.com.cn
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-07-17 11:08:59 +02:00
Peter Maydell acbaa0f4fd slirp updates
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEErr90SPq5RTpFUjkOsKUb9YyRecUFAllqCssACgkQsKUb9YyR
 ecVE6hAAlGPrzCgHJ1R6RzT+HHUPGgu99CiiVc6nyAtWBdBhVSt6rhlK90EPiYxL
 dEnM06M+hQ6o+K3SIHfb4MbKwck/L8QYSsp7L4SNF83uhdtJEXYtd7v1dbyznwVh
 WnCb8/gcNip7+dSd9w7LyZcMQt7RPH7M2YLZlq4u7qLpVoN1Nw4/+YKp6gwgG5M9
 ByALG6X2ZR9hI7elYQmxLhb6Vi6oy47SVy1K9pYXi3igiYMTsdced+iE50mg1ML8
 Oni70fDWW3SQVovmDLG0TB5XxwycYhZpf+4Fn8kc2QlhQraWlpFYNXU2J68vyR4w
 YXuKKMbp3aO3QBwcR9H0GtWRHARMRSo5sLDTNF34Oi3EDWqDi05OvfeWBhUKm/Sh
 RkRjBhqch5YTaYUhbRknTgvTxLLHRvffJenw0ATetduvtSJ6XNUDY/A2GDv6i5Be
 cLc9GsWQFKyAcDPUGRfeW586hIigB4DSewdL30r22djPQx3/88U9xjccwHMpFPHy
 wUNddOcUcCeTqEIPJl5j5uk4ehdkAYJljCq+3Ie9ruTPfhr5VS9phLbZLQfzvkYF
 ktBjQm2T9VX9qw4EgcubW43hgFfGOcg4jJJlBmXVRmdYb84V/cwH31eRnwJXhOiV
 iqwc6kaazeybWvHQIrmVfjD3fcTuFDq4+aORqgnRHbP+4+cjAn0=
 =o/qe
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/thibault/tags/samuel-thibault' into staging

slirp updates

# gpg: Signature made Sat 15 Jul 2017 13:30:03 BST
# gpg:                using RSA key 0xB0A51BF58C9179C5
# gpg: Good signature from "Samuel Thibault <samuel.thibault@aquilenet.fr>"
# gpg:                 aka "Samuel Thibault <sthibault@debian.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@gnu.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@inria.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@labri.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@ens-lyon.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@u-bordeaux.fr>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 900C B024 B679 31D4 0F82  304B D017 8C76 7D06 9EE6
#      Subkey fingerprint: AEBF 7448 FAB9 453A 4552  390E B0A5 1BF5 8C91 79C5

* remotes/thibault/tags/samuel-thibault:
  slirp: Handle error returns from sosendoob()
  slirp: Handle error returns from slirp_send() in sosendoob()
  slirp: fork_exec(): Don't close() a negative number in fork_exec()
  slirp: use DIV_ROUND_UP

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-17 10:02:22 +01:00
Gerd Hoffmann 18375b5c16 ipxe: update to commit 0600d3ae94
Rebase ipxe to latest git master.
Pick up four virtio-net fixes.

complete shortlog of ipxe changes
---------------------------------

Adamczyk, Konrad (1):
      [thunderx] Use ThunderxConfigProtocol to obtain board configuration

Bartosz Szczepanek (1):
      [thunderx] Fix hardware deinitialization

Christian Nilsson (1):
      [intel] Add INTEL_NO_PHY_RST for I219-LM (2)

David Decotigny (2):
      [build] Return const char * from uuid_ntoa()
      [af_packet] Add new AF_PACKET driver for Linux

Jason Wang (1):
      [virtio] Support VIRTIO_NET_F_IOMMU_PLATFORM

Jerone Young (1):
      [intel] Add support for I219-V in 7th Gen Intel NUC

Konrad Adamczyk (1):
      [thunderx] Don't disable NIC when exiting from iPXE

Ladi Prosek (3):
      [virtio] Cap queue size to MAX_QUEUE_NUM
      [virtio] Simplify virtqueue shutdown
      [virtio] Remove queue size limit in legacy virtio

Martin Habets (1):
      [sfc] Add driver for Solarflare SFC8XXX adapters

Michael Brown (159):
      [interface] Provide intf_reinit() to reinitialise nullified interfaces
      [iscsi] Avoid potential infinite loops during shutdown
      [efi] Add basic EFI SAN booting capability
      [undi] Allocate base memory before calling UNDI loader entry point
      [romprefix] Avoid using PMM-allocated memory in UNDI loader entry point
      [undi] Clean up driver and device name information
      [prefix] Remove impossible progress message
      [prefix] Include diagnostic information within progress messages
      [undi] Try matching UNDI ROMs in BIOS enumeration order
      [efi] Work around temporal anomaly encountered during ExitBootServices()
      [ipv4] Accept unicast packets for the local network broadcast address
      [build] Add %.vhd target for building VM bootable disk images
      [virtio] Use separate RX and TX empty header buffers
      [cloud] Add ability to retrieve Google Compute Engine metadata
      [virtio] Use host-specified MTU when available
      [netdevice] Allow MTU to be changed at runtime
      [cloud] Show CPU vendor and model in example cloud boot scripts
      [hyperv] Ignore unsolicited VMBus messages
      [pic8259] Fix definitions for "read IRR" and "read ISR" commands
      [efi] Fix building elf2efi.c when -fpic is enabled by default
      [interface] Avoid unnecessary reference counting in intf_unplug()
      [interface] Remove misleading comment
      [interface] Unplug interface before calling intf_close() in intf_shutdown()
      [netdevice] Limit MTU by hardware maximum frame length
      [cpuid] Provide cpuid_supported() to test for supported functions
      [time] Allow timer to be selected at runtime
      [hyperv] Provide timer based on the 10MHz time reference count MSR
      [int13] Avoid potential division by zero
      [int13] Test correct return status from INT 13 calls
      [settings] Add "unixtime" builtin setting to expose the current time
      [time] Report attempts to use timers before initialisation
      [interface] Provide the ability to shut down multiple interfaces
      [http] Cleanly shut down potentially looped interfaces
      [efi] Add missing SANBOOT_PROTO_HTTP to EFI default configuration
      [block] Remove spurious comments
      [block] Centralise SAN device abstraction
      [block] Centralise "san-drive" setting
      [int13] Refactor to use centralised SAN device abstraction
      [efi] Refactor to use centralised SAN device abstraction
      [block] Retry any SAN device operation
      [iscsi] Use intfs_shutdown() when shutting down multiple interfaces
      [scsi] Use intfs_shutdown() when shutting down multiple interfaces
      [block] Use intfs_shutdown() when shutting down multiple interfaces
      [scsi] Avoid duplicate calls to scsicmd_close()
      [build] Provide common ARRAY_SIZE() definition
      [efi] Update to current EDK2 headers
      [efi] Add EFI_ACPI_TABLE_PROTOCOL header and GUID definition
      [efi] Provide ACPI table description for SAN devices
      [efi] Skip cable detection at initialisation where possible
      [undi] Move PXE API caller back into UNDI driver
      [dhcp] Allow vendor class to be changed in DHCP requests
      [hermon] Avoid potential integer overflow when calculating memory mappings
      [arbel] Avoid potential integer overflow when calculating memory mappings
      [xfer] Ensure va_end() is called on failure path
      [nfs] Fix double free bug on error path
      [linda] Use correct length for memset()
      [qib7322] Use correct length for memset()
      [sis900] Remove extraneous memset() with incorrect length
      [802.11] Remove redundant NULL pointer check after dereference
      [crypto] Free correct pointer on the error path
      [librm] Fail gracefully if asked to ioremap() a zero length
      [usb] Use correct length for memcpy()
      [mucurses] Attempt to fix test for empty string
      [mucurses] Attempt to fix keypress processing logic
      [mucurses] Attempt to fix resource leaks
      [hyperv] Fix resource leaks on error path
      [slam] Fix resource leak on error path
      [slam] Avoid NULL pointer dereference in slam_pull_value()
      [eoib] Avoid passing a NULL I/O buffer to netdev_tx_complete_err()
      [http] Add missing check for memory allocation failure
      [mucurses] Attempt to fix use of uninitialised buffer with strcat()
      [xhci] Avoid accessing beyond end of endpoint context array
      [build] Avoid confusing sparse in single-argument DBG() macros
      [infiniband] Return status code from ib_create_cq() and ib_create_qp()
      [infiniband] Return status code from ib_create_mi()
      [block] Quell spurious Coverity size mismatch warning
      [ath] Add missing break statements
      [pixbuf] Avoid potential division by zero
      [usb] Use correct length for memcpy()
      [xen] Use standard calling pattern for asprintf()
      [tcp] Use correct length for memset()
      [video_subr] Use memmove() for overlapping memory copy
      [arbel] Assert that mapping length is non-zero
      [hermon] Assert that mapping length is non-zero
      [tlan] Guard against failure to identify chip
      [w89c840] Avoid potential array overrun
      [sis190] Avoid NULL pointer dereference
      [mucurses] Ensure SLK labels are always terminated
      [coverity] Add Coverity user model
      [malloc] Track maximum heap usage
      [travis] Add minimal .travis.yml file
      [travis] Build and run the unit test suite
      [travis] Integrate with Coverity Scan
      [rtl818x] Fix resource leak on error path
      [pcnet32] Eliminate redundant register read
      [iobuf] Increase minimum I/O buffer size to 128 bytes
      [vxge] Fix use of stale I/O buffer on error path
      [scsi] Avoid duplicate call to scsicmd_close() on TEST UNIT READY failure
      [block] Add dummy SAN device
      [block] Add basic multipath support
      [int13] Improve geometry guessing for unaligned partitions
      [int13con] Avoid overwriting random portions of SAN boot disks
      [time] Add sleep_fixed() function to sleep without checking for Ctrl-C
      [block] Allow SAN retry count to be reconfigured
      [block] Add a small delay between attempts to reopen SAN targets
      [block] Retry reopening indefinitely for multipath devices
      [block] Gracefully close SAN device if registration fails
      [linux] Use dummy SAN device
      [block] Ignore redundant xfer_window_changed() messages
      [block] Describe all SAN devices via ACPI tables
      [iscsi] Do not install iBFT when no iSCSI targets exist
      [http] Notify data transfer interface when underlying connection is ready
      [mucurses] Fix erroneous __nonnull attribute
      [build] Avoid implicit-fallthrough warnings on GCC 7
      [linux] Fix building with kernel 4.11 headers
      [scsi] Retry TEST UNIT READY command
      [libc] Add stdbool.h standard header
      [efi] Fix typo in efi_acpi_table_protocol_guid
      [efi] Add efi_sprintf() and efi_vsprintf()
      [block] Allow use of a non-default EFI SAN boot filename
      [intel] Show original CTRL and STATUS values in debugging output
      [intel] Do not enable ASDE on i350 backplane NIC
      [block] Provide sandev_read() and sandev_write() as global symbols
      [block] Provide abstraction to allow system to be quiesced
      [hyperv] Do not fail if guest OS ID MSR is already set
      [hyperv] Remove redundant return status code from mapping functions
      [hyperv] Cope with Windows Server 2016 enlightenments
      [efi] Standardise PCI debug messages
      [iscsi] Always send FirstBurstLength parameter
      [iscsi] Fix iBFT when no explicit initiator name setting exists
      [xen] Provide 18 4kB receive buffers to work around xen-netback bug
      [efi] Prevent EFI code from being linked in to non-EFI builds
      [tls] Keep cipherstream window open until TLS negotiation is complete
      [settings] Extend numerical setting tags to 64 bits
      [acpi] Make acpi_find_rsdt() a per-platform method
      [efi] Provide access to ACPI tables
      [acpi] Expose ACPI tables via settings mechanism
      [syslog] Handle backspace characters
      [hdprefix] Avoid attempts to read beyond the end of the disk
      [usb] Allow for USB network devices with no interrupt endpoint
      [build] Use -no-pie on newer versions of gcc
      [ecm] Display invalid MAC address strings in debug messages
      [cpuid] Allow input %ecx value to be specified
      [crypto] Expose RSA_CTX_SIZE constant
      [crypto] Expose asn1_grow()
      [crypto] Provide asn1_built() to construct a cursor from a builder
      [crypto] Expose pem_asn1() for use with non-image data
      [exanic] Add driver for Exablaze ExaNIC cards
      [usb] Use non-zero language ID to retrieve strings
      [mucurses] Avoid potential division by zero
      [tls] Support RFC5746 secure renegotiation
      [smscusb] Abstract out common SMSC USB device functionality
      [smsc95xx] Use common SMSC USB device functionality
      [smsc75xx] Use common SMSC USB device functionality
      [smscusb] Add ability to read MAC address from OTP
      [smscusb] Move non-inline register access functions to smscusb.c
      [smscusb] Allow for alternative PHY register layouts
      [smsc75xx] Expose functionality shared with LAN78xx devices
      [lan78xx] Add driver for Microchip LAN78xx USB Ethernet NICs

Mika Tiainen (1):
      [intel] Add INTEL_NO_PHY_RST for I219-V

Mike McCormack (1):
      [sky2] Use 32-bit read to read Y2_VAUX_AVAIL

Raed Salem (2):
      [golan] Update Connect-IB, ConnectX-4 and ConnectX-4 Lx (Infiniband) support
      [golan] Bug fixes and improved paging allocation method

Vishvananda Ishaya (1):
      [intel] Reset all virtual function settings

Vishvananda Ishaya Abrams (1):
      [iscsi] Don't close when receiving NOP-In

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-07-17 11:00:28 +02:00
Cédric Le Goater 346ebfc6fb target/ppc: fix CPU hotplug when radix is enabled (TCG)
But when a guest initializes radix mode, it issues a H_REGISTER_PROC_TBL
to update the LPCR of all CPUs. Hot-plugged CPUs inherit from the same
setting under KVM but not under TCG. So, Let's check for radix and update
the default LPCR to keep new CPUs in sync.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-17 15:07:05 +10:00
Greg Kurz df8658de43 spapr: fix memory leak in spapr_core_pre_plug()
In case of error, we must ensure the dynamically allocated base_core_type
is freed, like it is done everywhere else in this function.

This is a regression introduced in QEMU 2.9 by commit 8149e2992f.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-17 15:07:05 +10:00
David Gibson b55d295e3e pseries: Allow HPT resizing with KVM
So far, qemu implements the PAPR Hash Page Table (HPT) resizing extension
with TCG.  The same implementation will work with KVM PR, but we don't
currently allow that.  For KVM HV we can only implement resizing with the
assistance of the host kernel, which needs a new capability and ioctl()s.

This patch adds support for testing the new KVM capability and implementing
the resize in terms of KVM facilities when necessary.  If we're running on
a kernel which doesn't have the new capability flag at all, we fall back to
testing for PR vs. HV KVM using the same hack that we already use in a
number of places for older kernels.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-17 15:07:05 +10:00
David Gibson 2772cf6be9 pseries: Use smaller default hash page tables when guest can resize
We've now implemented a PAPR extension allowing PAPR guest to resize
their hash page table (HPT) during runtime.

This patch makes use of that facility to allocate smaller HPTs by default.
Specifically when a guest is aware of the HPT resize facility, qemu sizes
the HPT to the initial memory size, rather than the maximum memory size on
the assumption that the guest will resize its HPT if necessary for hot
plugged memory.

When the initial memory size is much smaller than the maximum memory size
(a common configuration with e.g. oVirt / RHEV) then this can save
significant memory on the HPT.

If the guest does *not* advertise HPT resize awareness when it makes the
ibm,client-architecture-support call, qemu resizes the HPT for maxmimum
memory size (unless it's been configured not to allow such guests at all).

For now we make that reallocation assuming the guest has not yet used the
HPT at all.  That's true in practice, but not, strictly, an architectural
or PAPR requirement.  If we need to in future we can fix this by having
the client-architecture-support call reboot the guest with the revised
HPT size (the client-architecture-support call is explicitly permitted to
trigger a reboot in this way).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
2017-07-17 15:07:05 +10:00
David Gibson 52b81ab5e9 pseries: Enable HPT resizing for 2.10
We've now implemented a PAPR extensions which allows PAPR guests (i.e.
"pseries" machine type) to resize their hash page table during runtime.

However, that extension is only enabled if explicitly chosen on the
command line.  This patch enables it by default for spapr-2.10, but leaves
it disabled (by default) for older machine types.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2017-07-17 15:07:05 +10:00
David Gibson 0b0b831016 pseries: Implement HPT resizing
This patch implements hypercalls allowing a PAPR guest to resize its own
hash page table.  This will eventually allow for more flexible memory
hotplug.

The implementation is partially asynchronous, handled in a special thread
running the hpt_prepare_thread() function.  The state of a pending resize
is stored in SPAPR_MACHINE->pending_hpt.

The H_RESIZE_HPT_PREPARE hypercall will kick off creation of a new HPT, or,
if one is already in progress, monitor it for completion.  If there is an
existing HPT resize in progress that doesn't match the size specified in
the call, it will cancel it, replacing it with a new one matching the
given size.

The H_RESIZE_HPT_COMMIT completes transition to a resized HPT, and can only
be called successfully once H_RESIZE_HPT_PREPARE has successfully
completed initialization of a new HPT.  The guest must ensure that there
are no concurrent accesses to the existing HPT while this is called (this
effectively means stop_machine() for Linux guests).

For now H_RESIZE_HPT_COMMIT goes through the whole old HPT, rehashing each
HPTE into the new HPT.  This can have quite high latency, but it seems to
be of the order of typical migration downtime latencies for HPTs of size
up to ~2GiB (which would be used in a 256GiB guest).

In future we probably want to move more of the rehashing to the "prepare"
phase, by having H_ENTER and other hcalls update both current and
pending HPTs.  That's a project for another day, but should be possible
without any changes to the guest interface.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-17 15:07:05 +10:00
David Gibson 30f4b05bd0 pseries: Stubs for HPT resizing
This introduces stub implementations of the H_RESIZE_HPT_PREPARE and
H_RESIZE_HPT_COMMIT hypercalls which we hope to add in a PAPR
extension to allow run time resizing of a guest's hash page table.  It
also adds a new machine property for controlling whether this new
facility is available.

For now we only allow resizing with TCG, allowing it with KVM will require
kernel changes as well.

Finally, it adds a new string to the hypertas property in the device
tree, advertising to the guest the availability of the HPT resizing
hypercalls.  This is a tentative suggested value, and would need to be
standardized by PAPR before being merged.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2017-07-17 15:07:05 +10:00
Alexey Kardashevskiy 2ee77040f5 ppc/pnv: Remove unused XICSState reference
e6f7e110ee "ppc/xics: remove the XICSState classes" got rid of
XICSState, this is just an leftover.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-17 15:07:05 +10:00
Greg Kurz e49c63d5b3 spapr: fix potential memory leak in spapr_core_plug()
Since commit 5c1da81215 ("spapr: Remove unnecessary differences between
hotplug and coldplug paths"), the CPU DT for the DRC is always allocated.
This causes a memory leak for pseries-2.6 and older machine types, that
don't support CPU hotplug and don't allocate DRCs for CPUs.

Reported-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-17 15:07:05 +10:00
David Gibson 67fea71bf3 spapr: Implement DR-indicator for physical DRCs only
According to PAPR, the DR-indicator should only be valid for physical DRCs,
not logical DRCs.  At the moment we implement it for all DRCs, so restrict
it to physical ones only.

We move the state to the physical DRC subclass, which means adding some
QOM boilerplate to handle the newly distinct type.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-07-17 15:07:05 +10:00
David Gibson 4445b1d27e spapr: Remove sPAPRConfigureConnectorState sub-structure
Most of the time, the state of a DRC object is contained in the single
'state' variable.  However, during the transition from UNISOLATE to
CONFIGURED state requires multiple calls to the ibm,configure-connector
RTAS call to retrieve the device tree for the attached device.  We need
some extra state to keep track of where we're up to in delivering the
device tree information to the guest.

Currently that extra state is in a sPAPRConfigureConnectorState
substructure which is only allocated when we're in the middle of the
configure connector process.  That sounds like a good idea, but the extra
state is only two integers - on many platforms that will take up the same
room as the (maybe NULL) ccs pointer even before malloc() overhead.  Plus
it's another object whose lifetime we need to manage.  In short, it's not
worth it.

So, fold the sPAPRConfigureConnectorState substructure directly into the
DRC object.

Previously the structure was allocated lazily when the configure-connector
call discovers it's not there.  Now, we need to initialize the subfields
pre-emptively, as soon as we enter UNISOLATE state.

Although it's not strictly necessary (the field values should only ever
be consulted when in UNISOLATE state), we try to keep them at -1 when in
other states, as a debugging aid.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-07-17 15:07:05 +10:00
David Gibson 9d4c0f4f0a spapr: Consolidate DRC state variables
Each DRC has three fields describing its state: isolation_state,
allocation_state and configured.  At first this seems like a reasonable
representation, since its based directly on the PAPR defined
isolation-state and allocation-state indicators.  However:
  * Only a few combinations of the two fields' values are permitted
  * allocation_state isn't used at all for physical DRCs
  * The indicators are write only so they don't really have a well
    defined current value independent of each other

This replaces these variables with a single state variable, whose names
and numbers are based on the diagram in LoPAPR section 13.4.  Along with
this we add code to check the current state on various operations and make
sure the requested transition is permitted.

Strictly speaking, this makes guest visible changes to behaviour (since we
probably allowed some transitions we shouldn't have before).  However, a
hypothetical guest broken by that wasn't PAPR compliant, and probably
wouldn't have worked under PowerVM.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-07-17 15:07:05 +10:00
David Gibson f1c52354e5 spapr: Cleanups relating to DRC awaiting_release field
'awaiting_release' indicates that the host has requested an unplug of the
device attached to the DRC, but the guest has not (yet) put the device
into a state where it is safe to complete removal.

1. Rename it to 'unplug_requested' which to me at least is clearer

2. Remove the ->release_pending() method used to check this from outside
spapr_drc.c.  The method only plausibly has one implementation, so use
a plain function (spapr_drc_unplug_requested()) instead.

3. Remove it from the migration stream.  Attempting to migrate mid-unplug
is broken not just for spapr - in general management has no good way to
determine if the device should be present on the destination or not.  So,
until that's fixed, there's no point adding extra things to the stream.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
2017-07-17 15:07:05 +10:00