teleport/rfd/0000-rfds.md

6.5 KiB

authors state
Andrew Lytvynov (andrew@goteleport.com) draft

RFD 0 - RFDs

What

Request For Discussion (RFD) is a design document format for non-trivial technical Teleport changes. It's also a process by which these documents are proposed, discussed, approved and tracked.

Why

As the Teleport project grows, we need a way to discuss major changes (e.g. new features, major refactors, major distribution changes).

Prior to RFDs, Teleport engineers used several other discussion methods (Google Docs, brainstorm meetings, internal wiki pages, ad-hoc email/chat conversations).

RFDs formalize the process and provide several benefits:

  • discussions are retained in GitHub pull requests and commit history
  • discussions are in the open, for Teleport users to see and contribute
  • discussions are stored in one central place
  • approvals are recorded and enforced

The RFD idea is borrowed from https://oxide.computer/blog/rfd-1-requests-for-discussion/ which is in turn inspired by https://www.ietf.org/standards/rfcs/

Details

Each RFD is stored in a markdown file under https://github.com/gravitational/teleport/tree/master/rfd and has a unique number.

structure

Each RFD consists of:

  1. a header containing author name(s) and state
  2. title in the format RFD $NUMBER - $TITLE
  3. the Required Approvers which contains required set of approvers
  4. the What section - 1-3 sentence summary of what this RFD is about
  5. the Why section - a few paragraphs describing motivation for the RFD
  6. the Details section - detailed description of the proposal, including APIs, UX examples, migrations or any other relevant information

Use this RFD as an example.

process

Here's the process from and RFD idea in your head to a working implementation in the main Teleport branch.

  1. Pick the RFD number. RFD numbers are claimed by pushing a branch named rfd/$number-title.

    Use this one-liner to get the highest-numbered RFD branch that's been pushed:

    git fetch --quiet && git branch -r -v | grep origin/rfd/ | awk -F'[/,-]' '{ n=$3+0; print n }' | sort -n | tail -n 1
    
  2. make a branch off of master called rfd/$number-your-title

    For example, you're writing an RFD titled 'Teleport IRC Access' and end up with number 18 - your branch would be called rfd/0018-irc-access.

  3. write your RFD under /rfd/$number-your-title.md

    Our example RFD is in /rfd/0018-irc-access.md.

  4. submit a PR titled RFD $number: Your Title

    Our example RFD title: RFD 18: IRC Access

  5. iterate on the RFD based on reviewer feedback and get approvals

    Note: it's OK to use meetings or chat to discuss the RFD, but please write down the outcome in PR comments. A future reader will be grateful!

  6. merge the PR and start implementing

  7. once implemented, make another PR changing the state to implemented and updating any details that changed during implementation.

If an RFD is eventually deprecated (e.g. a feature is removed), make a PR changing the state to deprecated and optionally link to the replacement RFD (if applicable).

states

  1. draft - RFD is proposed or approved, but not yet implemented
  2. implemented - RFD is approved and implemented
  3. deprecated - RFD was approved and/or implemented at one point, but is now deprecated and should only be referenced for historic context; a superseding RFD, if one exists, may be linked in the header

The purpose of the state is to tell the reader whether they should care about this RFD at all. For example, deprecated RFDs can be skipped most of the time. implemented is relevant to Teleport users, but draft is mostly for Teleport engineers and early adopters.

Required Approvers

The purpose of the Required Approvers section is to be explicit on required and optional approvers for an RFD. For the subject matter experts that can provide high quality feedback to help refine and improve the RFD.

For example, suppose you are making a change with internal implementation changes, security relevant changes, with also product changes (new fields, flags, or behavior). You might create a Required Approvers section that looks something like the following.

# Required Approvers
* Engineering @zmb3 && (@codingllama || @nklaassen )
* Security: (@reedloden || @jentfoo)
* Product: (@xinding33 || @klizhentas )

Security

Describe the security considerations for your design doc. (Non-exhaustive list below.)

  • Explore possible attack vectors, explain how to prevent them
  • Explore DDoS and other outage-type attacks
  • If frontend, explore common web vulnerabilities
  • If introducing new attack surfaces (UI, CLI commands, API or gRPC endpoints), consider how they may be abused and how to prevent it
  • If introducing new auth{n,z}, explain their design and consequences
  • If using crypto, show that best practices were used to define it

UX

Describe the UX changes and impact of your design doc. (Non-exhaustive list below.)

  • Explore UI, CLI and API user experience by diving through common scenarios that users would go through
  • Show UI, CLI and API requests/responses that the user would observe
  • Make error messages actionable, explore common failure modes and how users can recover
  • Consider the UX of configuration changes and their impact on Teleport upgrades
  • Consider the UX scenarios for Cloud users

Proto Specification

Include any .proto changes or additions that are necessary for your design.

Backward Compatibility

Describe the impact that your design doc has on backwards compatibility and include any migration steps. (Non-exhaustive list below.)

  • Will the change impact older clients? (tsh, tctl)
  • What impact does the change have on remote clusters?
  • Are there any backend migrations required?
  • How will changes be rolled out across future versions?

Audit Events

Include any new events that are required to audit the behavior introduced in your design doc and the criteria required to emit them.

Observability

Describe how you will know the feature is working correctly and with acceptable performance. Consider whether you should add new Prometheus metrics, distributed tracing, or emit log messages with a particular format to detect errors.

Product Usage

Describe how we can determine whether the feature is being adopted. Consider new telemetry or usage events.

Test Plan

Include any changes or additions that will need to be made to the Test Plan to appropriately test the changes in your design doc and prevent any regressions from happening in the future.