Flutter devicelab

"Devicelab" (a.k.a. "cocoon") is a physical lab that tests Flutter on real Android and iOS devices.

This package contains the code for the test framework and the tests. More generally, the tests are referred to as "tasks" in the API, but since we primarily use it for testing, this document refers to them as "tests".

Build results are available at https://flutter-dashboard.appspot.com.

Reading the dashboard

The build page

The build page is accessible at https://flutter-dashboard.appspot.com/build.html. This page reports the health of build servers, called agents, and the statuses of build tasks.

Agents

A green agent is considered healthy and ready to receive new tasks to build. A red agent is broken and does not receive new tasks.

In the example below, the dashboard shows that the linux2 agent is broken and requires attention. All other agents are healthy.

Agent statuses

Tasks

The table below the agent statuses displays the statuses of build tasks. Task statuses are color-coded. The following statuses are available:

New task (light blue): the task is waiting for an agent to pick it up and start the build.

Task is running (spinning blue): an agent is currently building the task.

Task succeeded (green): an agent reported a successful completion of the task.

Task is flaky (yellow): the task was attempted multiple times, but only the latest attempt succeeded (we currently try at most twice).

Task failed (red): the task failed all of the attempts.

Task underperformed (orange): currently not used.

Task was skipped (transparent): the task is not scheduled for a build. This usually happens when a task is removed from the manifest.yaml file.

Task status unknown (purple): currently not used.

In addition to color-coding, a task may display a question mark. This means that the task was marked as flaky manually. The status of such a task is ignored when considering whether the build is broken. For example, if a flaky task fails, GitHub will not prevent PR submissions. However, if the latest status of a non-flaky task is red, all pending PRs will contain a warning about the broken build and recommend caution when submitting.

Legend:

Task status legend

The example below shows that commit e122d5d caused a widespread breakage, which was fixed by bdc6f10. It also shows that Cirrus and Chrome Infra (the left-most tasks) skipped building these commits. Hovering over a cell pops up a tooltip containing the name of the broken task. Clicking on the cell opens the log file in a new browser tab (currently visible only to core contributors).

Broken Test

Why is a task stuck on "new task" status?

The dashboard aggregates build results from multiple build environments, including Cirrus, Chrome Infra, and devicelab. While devicelab tests every commit that goes into the master branch, other environments may skip some commits. For example, Cirrus will only test the last commit of a PR that's merged into the master branch. Chrome Infra may skip commits when they come in too fast.

How the devicelab runs the tasks

The devicelab agents have a small script installed on them that continuously asks the CI server for tasks to run. When the server finds a suitable task for an agent, it reserves that task for the agent. If the task succeeds, the agent reports the success to the server and the dashboard shows that task in green. If the task fails, the agent reports the failure to the server; the server increments the task's attempt counter and puts the task back in the pool of available tasks. If a task does not succeed after a certain number of attempts (as of this writing, the limit is 2), the task is marked as failed and is displayed in red on the dashboard.
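To make the retry bookkeeping concrete, here is a minimal, self-contained Dart sketch of the logic described above. Every name in it is a hypothetical illustration; the real agent script and server code differ.

const int maxAttempts = 2; // the current retry limit mentioned above

class Task {
  Task(this.name);
  final String name;
  int attempts = 0;
}

/// Simulates the server-side decision after an agent reports an attempt.
/// Returns the status the dashboard would show for the task.
String recordAttempt(Task task, bool succeeded) {
  task.attempts += 1;
  if (succeeded) {
    // Success on the first try is green; success on a retry is flaky.
    return task.attempts == 1 ? 'green' : 'yellow (flaky)';
  }
  // On failure, the task returns to the pool until the limit is reached.
  return task.attempts < maxAttempts ? 'back in pool' : 'red (failed)';
}

void main() {
  final Task task = Task('example_task');
  print(recordAttempt(task, false)); // back in pool
  print(recordAttempt(task, true)); // yellow (flaky)
}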

Running tests locally

Make sure your tests pass locally before deploying them to the CI environment. Below is a handful of commands that run tests in a way similar to how the CI environment runs them. These commands are also useful when you need to reproduce a CI test failure locally.

Prerequisites

You must set the ANDROID_HOME or ANDROID_SDK_ROOT environment variable to run tests on Android. If you have a local build of the Flutter engine, then you have a copy of the Android SDK at .../engine/src/third_party/android_tools/sdk.

You can find where your Android SDK is using flutter doctor.
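For example (the path below is only illustrative; substitute the location flutter doctor reports on your machine):

export ANDROID_HOME="$HOME/Android/Sdk"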

Warnings

Running the devicelab will make changes to your environment; for instance, it will start and stop Gradle.

Running all tests

To run all tests defined in manifest.yaml, use option -a (--all):

../../bin/cache/dart-sdk/bin/dart bin/run.dart -a

Running specific tests

To run a test, use option -t (--task):

# from the .../flutter/dev/devicelab directory
../../bin/cache/dart-sdk/bin/dart bin/run.dart -t {NAME_OR_PATH_OF_TEST}

Where NAME_OR_PATH_OF_TEST can be one of the following (complete invocations are shown after this list):

  • the name of a task, which you can find in the manifest.yaml file in this directory. Example: complex_layout__start_up.
  • the path to a Dart file corresponding to a task, which resides in bin/tasks. Tip: most shells support path auto-completion using the Tab key. Example: bin/tasks/complex_layout__start_up.dart.
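For example, the following two invocations run the same test:

../../bin/cache/dart-sdk/bin/dart bin/run.dart -t complex_layout__start_up
../../bin/cache/dart-sdk/bin/dart bin/run.dart -t bin/tasks/complex_layout__start_up.dart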

To run multiple tests, repeat option -t (--task) multiple times:

../../bin/cache/dart-sdk/bin/dart bin/run.dart -t test1 -t test2 -t test3

To run tests from a specific stage, use option -s (--stage). Currently only three stages are defined: devicelab, devicelab_ios, and devicelab_win.

../../bin/cache/dart-sdk/bin/dart bin/run.dart -s {NAME_OF_STAGE}
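For example, to run every test in the Android stage:

../../bin/cache/dart-sdk/bin/dart bin/run.dart -s devicelab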

Reproducing broken builds locally

To reproduce the breakage locally, git checkout the corresponding Flutter revision and note the name of the test that failed. In the example above, the failing test is flutter_gallery__transition_perf. This name can be passed to the run.dart command. For example:

../../bin/cache/dart-sdk/bin/dart bin/run.dart -t flutter_gallery__transition_perf

Writing tests

A test is a simple Dart program that lives under bin/tasks and uses package:flutter_devicelab/framework/framework.dart to define and run a task.

Example:

import 'dart:async';

import 'package:flutter_devicelab/framework/framework.dart';

Future<void> main() async {
  await task(() async {
    // ... do something interesting ...

    // Aggregate results into a JSON-encodable Map structure.
    final Map<String, dynamic> testResults = <String, dynamic>{};

    // Report success.
    return TaskResult.success(testResults);

    // Or, to report a failure instead:
    // return TaskResult.failure('Something went wrong!');
  });
}

Only one task is permitted per program. However, that task can run any number of tests internally. A task has a name. It succeeds and fails independently of other tasks, and is reported to the dashboard independently of other tasks.

A task runs in its own standalone Dart VM and reports results via the Dart VM service protocol. This ensures that tasks do not interfere with each other and lets the CI system time out and clean up tasks that get stuck.

Adding tests to the CI environment

The manifest.yaml file describes a subset of tests we run in the CI. To add your test, edit manifest.yaml and add the following to the "tasks" dictionary (a filled-in example follows the list below):

  {NAME_OF_TEST}:
    description: {DESCRIPTION}
    stage: {STAGE}
    required_agent_capabilities: {CAPABILITIES}

Where:

  • {NAME_OF_TEST} is the name of your test, which must match the name of its file in bin/tasks without the .dart extension.
  • {DESCRIPTION} is the plain English description of your test that helps others understand what this test is testing.
  • {STAGE} is devicelab if you want to run on Android, or devicelab_ios if you want to run on iOS.
  • {CAPABILITIES} is an array that lists the capabilities required of the test agent (the computer that runs the test) to run your test. Available capabilities are: has-android-device, has-ios-device.
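For example, a hypothetical Android test whose task file is bin/tasks/my_test__start_up.dart might be registered as:

  my_test__start_up:
    description: Measures the startup time of my hypothetical test app.
    stage: devicelab
    required_agent_capabilities: ["has-android-device"]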