dart-sdk/tests/lib/convert/line_splitter_performance_test.dart
Lasse R.H. Nielsen 164f8588d6 Optimize LineSplitter.
Avoid quadratic behavior when multiple chunks fail to
have a line break, and the carry-over string gets repeatedly extended.

The original chunked conversion code retained the trailing, non-line-terminated text of the previous chunk, then eagerly concatenated it with the next chunk in order to continue looking for lines. That's moderately effective when lines are shorter than chunks and neither is too large.
However, a very long line spread across many chunks would perform repeated string concatenation with quadratic time complexity.
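
To make the quadratic cost concrete, here is a rough counting sketch. It assumes a simplified cost model that is not taken from the SDK: each eager concatenation copies every character of both operands, while a buffer copies each character once on the way in and once on the way out. The function names are hypothetical.

```dart
/// Characters copied when the carry-over string is eagerly concatenated
/// with each of [chunks] chunks of [chunkLength] characters.
/// Assumed cost model: a concatenation copies all characters of the result.
int eagerCopyCount(int chunks, int chunkLength) {
  var carryLength = 0;
  var copied = 0;
  for (var i = 0; i < chunks; i++) {
    carryLength += chunkLength; // The new string holds carry-over plus chunk.
    copied += carryLength; // Building it copies every character again.
  }
  return copied;
}

/// Characters copied when each part is appended to a buffer and the
/// line is materialized once at the end (same assumed cost model).
int bufferedCopyCount(int chunks, int chunkLength) =>
    2 * chunks * chunkLength;
```

Under this model, 1000 chunks of 2 characters cost 1001000 copied characters with eager concatenation but only 4000 with a buffer. The first count grows with the square of the number of chunks, the second grows linearly.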

This change gives `LineSplitter` the option of using a `StringBuffer` to collect multiple carry-over line parts.
The buffer is needed whenever a chunk does not contain a line break, and needs to be combined with a previous chunk's carry-over. This avoids ever directly concatenating any more than two strings.
The `StringBuffer` is not allocated until it's first needed, so if lines are generally shorter than chunks, the buffer won't be used. Once allocated, the buffer is retained in case a buffer will be needed again, but cleared when its contents are used.
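
The carry-over strategy described above might be sketched roughly as follows. This is a simplified illustration, not the SDK implementation: it only recognizes `\n` as a line break, and `SketchLineSplitter`, `splitChunks`, and all member names are hypothetical.

```dart
/// Hypothetical, simplified line splitter that only handles '\n'.
class SketchLineSplitter {
  final void Function(String line) _onLine;

  /// Carry-over from exactly one previous chunk, if any.
  String? _carry;

  /// Allocated lazily, only once a second carry-over part arrives.
  StringBuffer? _buffer;

  SketchLineSplitter(this._onLine);

  void add(String chunk) {
    var start = 0;
    while (true) {
      var end = chunk.indexOf('\n', start);
      if (end < 0) break;
      // At most two strings are ever concatenated directly.
      _onLine(_takeCarry() + chunk.substring(start, end));
      start = end + 1;
    }
    if (start < chunk.length) _keep(chunk.substring(start));
  }

  void close() {
    var rest = _takeCarry();
    if (rest.isNotEmpty) _onLine(rest);
  }

  /// Stores trailing text without concatenating it to earlier parts.
  void _keep(String part) {
    var buffer = _buffer;
    var carry = _carry;
    if (buffer != null && buffer.isNotEmpty) {
      buffer.write(part); // Buffer already in use for this line.
    } else if (carry == null) {
      _carry = part; // Common case: a single carry-over, no buffer needed.
    } else {
      var fresh = _buffer ??= StringBuffer(); // First needed, or reused.
      fresh..write(carry)..write(part);
      _carry = null;
    }
  }

  /// Returns the pending carry-over text and resets the state.
  String _takeCarry() {
    var buffer = _buffer;
    if (buffer != null && buffer.isNotEmpty) {
      var result = buffer.toString();
      buffer.clear(); // Cleared but retained for later reuse.
      return result;
    }
    var carry = _carry ?? '';
    _carry = null;
    return carry;
  }
}

/// Convenience helper: splits [chunks] and collects the lines.
List<String> splitChunks(Iterable<String> chunks) {
  var lines = <String>[];
  var splitter = SketchLineSplitter(lines.add);
  chunks.forEach(splitter.add);
  splitter.close();
  return lines;
}
```

For example, `splitChunks(['ab', 'cd', 'ef\ngh', '\nij'])` yields the three lines `abcdef`, `gh`, and `ij`. The buffer is only allocated when `'cd'` arrives without a line break while `'ab'` is still pending.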

The code optimizes for the simple case of each chunk having a line break.

Fixes #51167

Bug: https://github.com/dart-lang/sdk/issues/51167
Change-Id: I600a011e02aa9f1ad6f88e45764df5b2e8eccfa3
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/280100
Reviewed-by: Leaf Petersen <leafp@google.com>
Reviewed-by: Stephen Adams <sra@google.com>
Commit-Queue: Lasse Nielsen <lrn@google.com>
Reviewed-by: Siva Annamalai <asiva@google.com>
Reviewed-by: Aske Simon Christensen <askesc@google.com>
Reviewed-by: Nate Bosch <nbosch@google.com>
Reviewed-by: Sigmund Cherem <sigmund@google.com>
2023-02-21 11:33:24 +00:00

// Copyright (c) 2023, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.
library line_splitter_test;

import 'dart:convert';

import "package:expect/expect.dart";

void main() {
  testEfficency();
}

/// Regression test for https://dartbug.com/51167
///
/// Had quadratic time behavior when concatenating chunks without linebreaks.
///
/// Should now only use linear time/space for buffering.
void testEfficency() {
  // After fix: finishes in < 1 second on desktop.
  // Before fix, with N = 100000, took 25 seconds.
  const N = 1000000;
  String result = ""; // Starts empty, set once.
  var sink = LineSplitter()
      .startChunkedConversion(ChunkedConversionSink.withCallback((lines) {
    // Gets called only once with exactly one line.
    Expect.equals("", result);
    Expect.equals(1, lines.length);
    var line = lines.first;
    Expect.notEquals("", line);
    result = line;
  }));
  for (var i = 0; i < N; i++) {
    sink.add("xy");
  }
  sink.close();
  Expect.equals("xy" * N, result);
}