The receive buffer autoscaling for TCP is based on linear growth, which
is acceptable in the congestion avoidance phase, but not during slow start.
The MTU is also not taken into account.
Use a method based on exponential growth instead, which also works in
slow start and is independent of the MTU.

This is joint work with rrs@.

Reviewed by:		rrs@, Richard Scheffenegger
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D18375
Michael Tuexen 2019-02-21 10:35:32 +00:00
parent bdffe3b5bf
commit 560c058683
Notes: svn2git 2020-12-20 02:59:44 +00:00
svn path=/head/; revision=344433


@ -212,11 +212,6 @@ SYSCTL_INT(_net_inet_tcp, OID_AUTO, recvbuf_auto, CTLFLAG_VNET | CTLFLAG_RW,
     &VNET_NAME(tcp_do_autorcvbuf), 0,
     "Enable automatic receive buffer sizing");
-VNET_DEFINE(int, tcp_autorcvbuf_inc) = 16*1024;
-SYSCTL_INT(_net_inet_tcp, OID_AUTO, recvbuf_inc, CTLFLAG_VNET | CTLFLAG_RW,
-    &VNET_NAME(tcp_autorcvbuf_inc), 0,
-    "Incrementor step size of automatic receive buffer");
 VNET_DEFINE(int, tcp_autorcvbuf_max) = 2*1024*1024;
 SYSCTL_INT(_net_inet_tcp, OID_AUTO, recvbuf_max, CTLFLAG_VNET | CTLFLAG_RW,
     &VNET_NAME(tcp_autorcvbuf_max), 0,
@ -1449,13 +1444,16 @@ tcp_input(struct mbuf **mp, int *offp, int proto)
  * The criteria to step up the receive buffer one notch are:
  * 1. Application has not set receive buffer size with
  * SO_RCVBUF. Setting SO_RCVBUF clears SB_AUTOSIZE.
- * 2. the number of bytes received during the time it takes
- * one timestamp to be reflected back to us (the RTT);
- * 3. received bytes per RTT is within seven eighth of the
- * current socket buffer size;
- * 4. receive buffer size has not hit maximal automatic size;
+ * 2. the number of bytes received during 1/2 of an sRTT
+ * is at least 3/8 of the current socket buffer size.
+ * 3. receive buffer size has not hit maximal automatic size;
  *
- * This algorithm does one step per RTT at most and only if
+ * If all of the criteria are met we increaset the socket buffer
+ * by a 1/2 (bounded by the max). This allows us to keep ahead
+ * of slow-start but also makes it so our peer never gets limited
+ * by our rwnd which we then open up causing a burst.
+ *
+ * This algorithm does two steps per RTT at most and only if
  * we receive a bulk stream w/o packet losses or reorderings.
  * Shrinking the buffer during idle times is not necessary as
  * it doesn't consume any memory when idle.
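To make the difference between the old and the new growth rule concrete, the following is a minimal sketch that counts how many resize steps each rule needs to reach the maximum. The 16 KiB increment and the 2 MiB maximum are the defaults visible in this diff; the 64 KiB starting size and the function names `steps_linear` and `steps_exponential` are assumptions made for the example and are not part of the commit.

```c
#include <stdint.h>

/* Defaults taken from the diff: removed recvbuf_inc and recvbuf_max. */
#define OLD_INC   (16u * 1024u)
#define AUTO_MAX  (2u * 1024u * 1024u)

static uint32_t
min_u32(uint32_t a, uint32_t b)
{
	return (a < b ? a : b);
}

/*
 * Old rule: add a fixed increment per step, bounded by max.
 * Hypothetical helper; returns the number of steps until size == max.
 */
unsigned
steps_linear(uint32_t start, uint32_t inc, uint32_t max)
{
	unsigned steps = 0;

	if (inc == 0)
		return (0);	/* no growth possible */
	for (uint32_t size = start; size < max;
	    size = min_u32(size + inc, max))
		steps++;
	return (steps);
}

/*
 * New rule: grow by one half of the current size per step, bounded
 * by max. Hypothetical helper; assumes max is far below UINT32_MAX.
 */
unsigned
steps_exponential(uint32_t start, uint32_t max)
{
	unsigned steps = 0;

	if (start < 2)
		return (0);	/* size / 2 would be 0, no growth */
	for (uint32_t size = start; size < max;
	    size = min_u32(size + size / 2, max))
		steps++;
	return (steps);
}
```

Under these assumptions, `steps_linear(64 * 1024, OLD_INC, AUTO_MAX)` returns 124 and `steps_exponential(64 * 1024, AUTO_MAX)` returns 9. Since the old rule allowed at most one step per RTT and the new one allows up to two, that is at least 124 RTTs versus roughly five in the best case.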
@ -1472,11 +1470,10 @@ tcp_autorcvbuf(struct mbuf *m, struct tcphdr *th, struct socket *so,
 	if (V_tcp_do_autorcvbuf && (so->so_rcv.sb_flags & SB_AUTOSIZE) &&
 	    tp->t_srtt != 0 && tp->rfbuf_ts != 0 &&
 	    TCP_TS_TO_TICKS(tcp_ts_getticks() - tp->rfbuf_ts) >
-	    (tp->t_srtt >> TCP_RTT_SHIFT)) {
-		if (tp->rfbuf_cnt > (so->so_rcv.sb_hiwat / 8 * 7) &&
+	    ((tp->t_srtt >> TCP_RTT_SHIFT)/2)) {
+		if (tp->rfbuf_cnt > ((so->so_rcv.sb_hiwat / 2)/ 4 * 3) &&
 		    so->so_rcv.sb_hiwat < V_tcp_autorcvbuf_max) {
-			newsize = min(so->so_rcv.sb_hiwat +
-			    V_tcp_autorcvbuf_inc, V_tcp_autorcvbuf_max);
+			newsize = min((so->so_rcv.sb_hiwat + (so->so_rcv.sb_hiwat/2)), V_tcp_autorcvbuf_max);
 		}
 		TCP_PROBE6(receive__autoresize, NULL, tp, m, tp, th, newsize);
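The new condition can be read in isolation as a small pure function. The sketch below is a standalone illustration, not kernel code: `autorcvbuf_newsize` and its parameters are hypothetical names, the `SB_AUTOSIZE` and sysctl checks are left out, and returning 0 for "no resize" is a convention of this example. It uses the same integer arithmetic as the added lines, so the threshold is strictly more than `(hiwat / 2) / 4 * 3` bytes.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the added condition.
 *   hiwat         current receive buffer size (so_rcv.sb_hiwat)
 *   rfbuf_cnt     bytes received in the current measurement window
 *   elapsed_ticks time since the window started
 *   srtt_ticks    smoothed RTT, already shifted down to ticks
 *   max           upper bound for automatic sizing
 * Returns the new buffer size, or 0 if the buffer should not grow.
 */
uint32_t
autorcvbuf_newsize(uint32_t hiwat, uint32_t rfbuf_cnt,
    uint32_t elapsed_ticks, uint32_t srtt_ticks, uint32_t max)
{
	uint32_t grown;

	/* Only evaluate once more than half an sRTT has elapsed. */
	if (srtt_ticks == 0 || elapsed_ticks <= srtt_ticks / 2)
		return (0);
	/* Need more than 3/8 of the buffer filled, and room to grow. */
	if (rfbuf_cnt <= (hiwat / 2) / 4 * 3 || hiwat >= max)
		return (0);
	/* Grow by one half, bounded by the maximum. */
	grown = hiwat + hiwat / 2;
	return (grown < max ? grown : max);
}
```

For a 65536-byte buffer the threshold works out to 24576 bytes, so receiving 24577 bytes within more than half an sRTT grows the buffer to 98304 bytes, while exactly 24576 bytes leaves it unchanged.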