teleport/rfd/0010-api.md
Zac Bergquist e174004612
Update RFD statuses (#24454)
We haven't been good about going back and marking RFDs as implemented,
and it's helpful when looking at old designs to know if they ever made
their way into the product.
2023-04-13 17:22:46 +00:00

28 KiB

authors state
Alexander Klizhentas (sasha@gravitational.com), Brian Joerger (bjoerger@gravitational.com) implemented

RFD 10 - API and Client Libraries

What

  • Unify Golang Client experience.
  • Automatically generate API documentation using pkg.go.dev.

Why

There are several problems with the Teleport Go client libraries:

  • The Go client example uses the teleport Go module as a dependency. This means that the example pulls in all of the teleport module's packages and dependencies, leading to extreme import bloat.
  • There is no separate client library with guarantees of compatibility with specific version of Teleport. See examples here.
  • Some client logic is residing in plugins, while other logic is in teleport/lib packages.
  • It's impossible to generate the library in any other language other than Go due to an unclear API surface.
  • Code in lib/auth uses some concepts unfamiliar to Go users. For example, it deals with tls certificates at a low level:
// connectClient establishes a gRPC connection to an auth server.
func connectClient() (*auth.Client, error) {
    tlsConfig, err := LoadTLSConfig("certs/api-admin.crt", "certs/api-admin.key", "certs/api-admin.cas")
    if err != nil {
        return nil, fmt.Errorf("Failed to setup TLS config: %v", err)
    }

    // must parse addr list panics and is not necessary
    authServerAddr := utils.MustParseAddrList("127.0.0.1:3025")
    clientConfig := auth.ClientConfig{Addrs: authServerAddr, TLS: tlsConfig}

    // TLS client is the only client
    return auth.NewTLSClient(clientConfig)
}

Details

A new api Go module will be made to expose existing and new API features. This will make it easy to import the API with a small import footprint. Users will be able to run go get github.com/gravitational/teleport/api to get the latest version of the API and get started.

Teleport's internal gRPC Auth client will be moved into the api/client package and embedded in the old client for backwards compatibility. Over time, the internal client will be deprecated and replaced with the new public client.

Other logic such as the web client and tls/ssh certificate logic will be moved to the api package as well.

The new client will gain several new quality of life improvements, including credential loaders, automatic server version checking, better documentation, and more language implementations.

Make independent api package

Move the API client and its dependencies to the new api package to keep Teleport's API layer unified.

api package file structure:

teleport/
└── api/
    ├── client
    |   ├── proto
    │   |   ├── authservice.proto
    │   |   └── authservice.pb.go
    │   ├── client.go
    |   └── webclient
    |       └── webclient.go
    ├── types
    │   ├── events
    │   |   ├──events.proto
    │   |   └──events.pb.go
    │   ├── wrappers
    │   |   ├──wrappers.proto
    │   |   └──wrappers.pb.go
    │   ├── types.proto
    |   └── types.pb.go
    ├── go.mod
    ├── go.sum
    ... utils, defaults, constants, and other limited api dependencies

Backwards compatibility

Use Go's type aliases to refactor code into the api package while maintaining backwards compatibility in the main teleport package. These aliases can be removed once the refactor is complete.

// Preserve full backwards compatibility
type services.Role = types.Role
type auth.Service = proto.Service

Separating the API from teleport packages

Since api will be a submodule and dependency of teleport, it will cause an import cycle if api has any dependencies on teleport. However, teleport packages are all dependent on each other in a convoluted dependency graph, so one of the challenges of moving the API to its own module will be to sort out these dependencies.

The majority of code that needs to be moved to the api module is in teleport/lib/auth/clt.go and teleport/lib/services. The former will be largely extracted into api/client and the latter into api/types.

Any protobuf generated code relevant to the API, including teleport/lib/events and teleport/lib/wrappers, should be moved into api.

Code in some other package, such as lib/constants, lib/defaults, lib/utils, teleport/lib/events and teleport/lib/wrappers, will need to be extracted into to api module as well.

Minimize dependencies

Some logic found in these package will be beyond the scope of the external API and should be refactored to minimize dependencies. This will also help to clarify the separation between api and business logic and make the client more language agnostic.

For example, in lib/services/role, role.CheckAndSetDefaults has some business logic that should remain on the server side. Therefore that logic should be extracted into a new function, ValidateRole, and kept in lib/services/role.go rather than api/types/role.go.

// api/types/role.go
(r *RoleV3) CheckAndSetDefaults() error {
  ... basic validation with business logic extracted
}
// lib/services/role.go
func ValidateRole(r types.Role) error {
  ... business logic
}

Some basic dependencies will be unavoidable:

  • github.com/gravitational/trace
  • google.golang.org/grpc
  • golang.org/x/net
  • golang.org/x/crypto
  • github.com/stretchr/testify

If the list of dependencies grows beyond 20 direct dependencies, this issue should be revisited for better tracking mechanisms.

Deprecate the lib/auth.Client

As stated before, all gRPC client related code will be moved out of teleport/lib/auth/clt.go and into api/client. Backwards compatibility will be maintained by embedding the gRPC api/client.Client in the internal teleport/lib/auth.Client.

Additionally, the remaining HTTP endpoints in the internal client will be converted and added to the gRPC client. To maintain backwards compatibility with old servers, the internal client will be updated to first try the new gRPC endpoint before falling back to the existing HTTP endpoint. These fallback methods will be moved to teleport/lib/auth/httpfallback.go and deprecated after one major release cycle.

This issue is being used to track remaining endpoints that need to be converted to gRPC.

Once all necessary endpoints have been added in gRPC and all HTTP fallback logic has been deprecated and removed, the gRPC client can completely replace the existing lib/auth.Client. At this point the underlying gRPC server will be the same, so the transition should be relatively straightforward.

api module

The api package will be made a sub module of teleport. It will exist in the teleport/api directory, and will be imported by the main teleport module through a go.mod replace directive using a relative path.

Users will be able to get the latest api version by using go get github.com/gravitational/teleport/api, or a specific version with go get github.com/gravitational/teleport/api@vX.Y.Z.

Versioning

Go module versioning must follow semver and can automatically be pushed with git tags. Since the api module is a sub module, its version tags will look like git tag api/vX.Y.Z (this is required for multi-module repositories).

If the module's version is above 2, it must follow the v2+ Go Module release guide by appending a major version suffix to the import path. The import path for each major version X will look like github.com/gravitational/teleport/api/vX.

Note that if these go module requirements aren't implemented, the api module will be versioned as v0.0.0-[random-string]. Getting the latest version with go get github.com/gravitational/teleport/api will get the most recent version in terms of time rather than semver/teleport version. For this reason it is important to release a proper public version as soon as possible.

Compatibility

The gRPC API and other code imported by github.com/gravitational/teleport must follow the Teleport version compatibility rules. However, the api module itself does not need to follow the same versioning rules. For example, if the API module has a new feature that is not used by the main teleport module, such as the Credential Loaders, it does not need to be compatible with previous versions of Teleport.

This gives us some flexibility to choose an api versioning strategy without worries of unrelated versioning promises.

Release option 1 - Release the API with each Teleport release

The API can be tagged for release as part of the existing Teleport release process. Each Teleport release will get both the vX.Y.Z and api/vX.Y.Z tags.

This can be done within the current release checklist with just a few additional steps:

  • Add git tag - api/vX.Y.Z
  • If it's a major version bump...
    • Update api/go.mod with the correct vX major version suffix.
    • Update teleport/go.mod to import the new version suffix and update any teleport/api imports with the updated suffix.

Pros:

  • Teleport Compatibility promise - The api module will follow Teleport version compatibility just like tsh and tctl. As noted above, this isn't strictly necessary for the api, but it may make it simpler to reason about version compatibility.
  • Release checklist changes are minimal
    • For minor/patch releases, there are no significant changes to the release process outside of the additional api/vX.Y.Z git tag.
    • For major versions, there is an additional step to update the the version suffix, but this is still quite minimal compared to managing the API version completely independently.

Cons:

  • Cluttered versions - Even when no API changes are made, there will be a new api tag and release. We could opt to skip this tag for such releases, though this will make the version bumps looks strange in some circumstances. e.g api/v6.1.1 -> api/v6.1.7.
  • Frequent major version bumps - The API will likely remain backwards compatible for a long time once it is marked as stable (released at v1+). However major version bumps in semver imply backwards incompatibility. It would be better to avoid major version bumps to simplify the upgrading process for users.
  • Major version suffix rule - The API would start at v2+, so it would have to follow the major version suffix rule.
    • vX must be appended to the api import path before each major version. This adds an additional release step that is cumbersome for frequent major releases, and can cause development issues such as merge conflicts for backports.
    • Users who want to update to a more recent client major version will have more manual steps; go get the new major version and then update import paths.

If we want to choose the quickest to implement API versioning strategy, this would be it. With a small additional cognitive load on the release manager, we can release the API alongside teleport within the same release process. Teleport version compatibility can easily be upheld and tracked due to the lined up versions, just like tsh and tctl, making it familiar to existing users and developers.

Release option 2 - Release the API separately from Teleport

The API can be released out of step with the main Teleport releases. When we want to release the API, a release manager would follow a slightly modified version of the release checklist specifically geared towards the api module.

Once a breaking change is added to the api module, the major version will be incremented, and the import path will be updated with the major version suffix. Ideally, we would keep it in v1 or v0 for as long as possible in order to avoid changing the import path on releases to include the major version suffix. API major version increases will be a special case unlike current Teleport releases.

Pros:

  • Following Go Module conventions - Go users will have an easier time if we base our Go Modules on semver rather than our unique versioning guidelines. Frequent major version bumps are irregular.
  • Unstable - The API can start at v0, which implies instability. When we are ready to guarantee the API's backwards compatibility and stability, it can be upgraded to v1.
    • If we choose to do this, I would suggest that we upgrade to v1 once the old client is completely deprecated and the new client is used internally.
  • Flexibility - The API will only be released when it is necessary to do so. This can be more or less frequent than Teleport, depending on what features are needed and when.
  • Backwards Compatibility - Since the API will follow semver, it will continue to be backwards compatible with each minor release, even if we release several major versions of Teleport between those minor releases.

Cons:

  • Compatibility - It might not be immediately obvious based on the client version what server/cli versions are compatible.
    • A compatibility matrix could be added to the docs and api/client to help reason about compatibility.
    • One solution to this problem is to version the api as v1.[teleport-major-version].[api-minor/patch-version]. With this strategy, any v1.M.X Api version will work with any vM.Y.Z Teleport version, while still getting the benefits of having a separately managed version.
    • Since the client will have automatic server version checks, users should be able to quickly catch incorrect version usage regardless.
  • Release management - The API will need to be released on its own schedule with its own versions and branches. This will burden release managers with basically a new product to worry about.
    • In the long run, this may be less of a burden than it might seem at first. With Release option 1, we may have 20 or so new versions of teleport/api released, while with Release option 2, most likely there will only be a couple of new versions.

This option provides the most flexibility in API releases. It also gives us the option to start with v0, with no stability/compatibility promises, in order to try out a strategy before committing fully to it. If treating the API as a separately versioned product is too burdensome, we can immediately transition to another release strategy.

Release option choice

The consensus is Release option 1.

It will be much easier to get started with its versioning since it can follow all the same release processes in place. Release option 2 brings many more complexities into the discussion, such as new compatibility rules for interfacing with the API and new branches and tags for the api.

Client experience improvements

This is a list of quality of life improvements that will be added to the new api client.

Go native user experience

The client user experience will be simplified with a new client constructor. The client constructor should be able to retrieve credentials during initialization and handle errors. It should also transparently handle dialing via a proxy if a proxy address or tunnel dialer is provided (along with valid ssh credentials).

import (
   "github.com/gravitational/teleport/api/client"
)

func main() {
  // TLS client is the only TLS client supported in teleport
  client, err := client.New(client.Config{
    Addrs: []string{"localhost:3025"},
    // ContextDialer is an optional context dialer
    ContextDialer: net.DialContext,
    // direct TLS credentials config
    Credentials: []client.Credentials{
      client.TLSCreds(tls.Config),
    }
  })
  ...
  defer client.Close()

  ping, err := client.Ping()
  ...
  fmt.Printf("Ping: %v\n", ping.Version)
}
$ go mod init
$ go run main.go

Credential providers

Support multiple credential providers:

  • client.LoadTLS(*tls.Config) loads creds from TLS config.
  • client.LoadKeyPair("crt_path", "key_path", "cas_path") loads creds from tls certificates.
  • client.LoadProfile("profile_path", "profile_name") loads creds from the users default or specified tsh profile.
  • client.LoadIdentityFile("identity_file_path") loads creds from an identity file.

Support tunnel address discovery from the web proxy, as well as tunnel proxy connectivity using ssh certificates. Since tsh profiles have ssh certificates and the web proxy address, LoadProfile should be able to load the client without any user input. The user can simply tsh login and then the client will retrieve their credentials. This mirrors the functionality that tctl offers.

// for testing, use client spec that loads tunnel dialer and certificates from profile
client, err := client.New(client.Config{
  Credentials: client.Credentials{
    client.LoadProfile("", ""),
  },
})
$ tsh login
# try client
$ go run main.go

Automatically reload credentials

Credential loaders should detect underlying file changes and automatically reload credentials and update the grpc transport.

Automatically check server version

The client will automatically verify that its version is compatible with the server's version by calling Ping during client initialization, similarly to how tsh does.

The client will fail to initialize if the the versions are not compatible. The user can override this by providing SkipVersionCheck: true in client.Config to avoid blocking users that are in the middle of upgrading.

Improve API documentation user experience

The API's documentation will be split into 3 parts:

pkg.go.dev (godoc)

The majority of the API's documentation will be hosted on pkg.go.dev. This documentation is automatically generated from the package code upon each package release. Having code-centric documentation will make it easier to maintain and update. It will also improve the experience of working with the code both internally and externally.

In order to improve this automatically generated documentation, code comments throughout the api package will be made thorough and user focused. There are several useful tricks to improve the formatting of the documentation, and even add runnable code examples.

In the beginning, only essential functionality for the client package will have exceptional documentation. For example, client.Credentials and client.New will have an extensive explanation of its usage with code examples.

Any code added to teleport/api should be held to a standard of excellence in regards to documentation. Specifically, all added code should have user focused comments and examples. Older code will be updated to this standard of excellence on an ongoing basis according to demand/priority.

API Reference

Some documentation will not fit cleanly within the bounds of the code. Such documentation will instead go on the main Teleport docs page under the API Reference section. This section also serves as an entry point to the generated documentation.

To begin, this documentation will have three sections:

  • Introduction: short summary, leads customers to relevant documentation.
  • Getting started: High level overview and guide to getting started.
  • Architecture: Covers in depth topics such as managing API authorization, client connections, and credentials.

More pages can be added here over time if needed, but the in-code docs should be prioritized.

Examples

The majority of code examples will be handled by in code examples. However, some examples require more in depth explanations or combine several client capabilities into a full working example.

For now, there is only one example in the teleport/examples package

Once another example is to be added, they will be moved to teleport/examples/api/example-name in order to keep all examples in one place.

spoiler: The workflows example in teleport-plugins/access/example will be the first example to be added.

Workflows

Deprecate teleport-plugins/access package

The teleport-plugins/access package was originally created in order to utilize the Teleport Auth API without being dependent on internal Teleport code. Now that Teleport has a external API, this is no longer a concern.

The access package is now basically a duplication of the api/client, with a few nonessential quality of life features over the teleport/api/client package:

  • Automatically append plugin name and other determinable parameters to relevant client methods. e.g. GetPluginData(reqID) calls client.GetPluginData({types.KindAccessRequest, reqID, plugin.name}).
  • Automatically filter and type cast access request events.
  • Rename, alias, or convert structs to be more user friendly for those who are only using access requests. e.g types.AccessRequestV3 becomes access.Request which has a flat structure as opposed to teleport.Resource with metadata and spec structs embedded.

These benefits are minimal at best, and actually detrimental at worst. Specifically, users who plan to use the access client for access request management along with the API client for other functionality will need to learn two APIs when they only need to learn one. Additionally, the listed benefits can be largely handled with improved documentation and examples.

Another downside of the separate access package is that there are two separate clients to maintain, version/release. This adds unnecessary strain and confusion for both maintainers and users, especially when it comes to compatibility concerns.

The external plugins held in teleport-plugins/access will be updated to use the new client. Once this is complete, the access client can be removed, meaning its final version will be it's previous release. e.g. If we release the updated plugins in v6.3, The final version of the access client will be v6.2. v6.2 would still be guaranteed to be compatible with teleport until v8.0, according to the Teleport compatibility guidelines.

Add workflow examples and documentation

In order to compensate for the removal of the access client and its quality of life features, the API client's documentation around workflows will be improved. User focused comments and examples will be added to all of the essential workflows functions, including the stream watcher.

Additionally, the access code example will be refactored to use the API client and added to teleport/examples/api.

Plugin package (Optional)

We could add a teleport/api/client/plugin package, which wraps plugin-specific methods in the API client. There are currently about 5 methods that would be included here, but in time this may become more useful.

type Plugin struct {
  // embed client, only replace plugin-related methods.
  client.Client
  name string
}

// Automatically use plugin name in relevant functions
func (p *Plugin) UpdatePluginData(ctx context.Context, params types.PluginDataUpdateParams) error {
  params.Plugin = p.name
	return p.Client.UpdatePluginData(ctx, params)
}

Provide simplified high level access hooks

Access hooks are too low level and require in depth knowledge of teleport and the API client.

High level access hooks will be added to a new api/client/access in the form of a simple handler similar to an http server with a router.

func myTeamRequest(ctx context.Context, req types.AccessRequest) (types.AccessRequestUpdate, error) {
  return DenyParams("No reason, just deny all."), nil
}

func main() {
router := access.Router()
  ctx := context.Background()

  router := access.Router{}
  // handle access requests from my team only
  router.HandleUpdateFunc(myTeamRequest, MatchUserTrait("team", "myteam"))
  srv := access.Server{
    Client: clt,
    Handler: router,
  }

  // Watch for incoming requests and automatically handle
  // any that match a HandleFunc.
  return srv.Run(ctx)
}
Route Matching

The router will use custom Match functions to match a request to a route.

type Route struct {
	Handler  HandlerFunc
	Matchers []Match
}

// Custom function that will attempt to match a request
type Match func(req types.AccessRequest) bool

func MatchUserTrait(k, v string) Match { ... }
Matching Algorithm

Unlike an http server router, the access hooks server cannot route via a radix tree. An access request may match multiple routes based on its values. For example, a request may have user.trait.team=myteam and user.label.group=security. If both of these values have their own handler, then the request will match both handlers, despite the handlers having no detectable overlap.

Since a simple tree-like solution is not an option, the router will instead loop over the routes available looking for a match with the request's associated matchers. The first one to match will be used to handle the request, so it is important for users to consider handler priority.

HandleFunc

Handler functions should be high level and easy to use for basic use cases. The most prevalent use case is a basic access request state update, so the base HandlerFunc will be designed to simply make a client.SetAccessRequestState request.

// Automatically send a client.SetAccessRequestState request using the update params returned.
// If updateParams.ID is empty, automatically add the request ID to the returned params.
// If updateParams is nil, the server will continue without making a request.
type HandlerFunc func(ctx context.Context, req types.AccessRequest) (updateParams types.AccessRequestUpdate, err error)

// Helper functions for putting together updateParams.
func ApproveParams(reason string, roles ...string) types.AccessRequestUpdate { ... }
func DenyParams(reason string) types.AccessRequestUpdate { ... }

func someHandler(ctx context.Context, req types.AccessRequest) (updateParams types.AccessRequestUpdate, err error)) {
  if req.GetUser() == "Jill" {
    return ApproveParams("I know Jill"), nil
  } else {
    return DenyParams("I don't know you..."), nil
  }
}

Note that if a user wants to make a client request other than SetAccessRequestState, such as SubmitAccessReview or UpdatePluginData, they can use a client in the handler via closure.

func main() {
  clt, err := client.New(...)
  ...
  someHandler := func(ctx context.Context, req types.AccessRequest) (updateParams types.AccessRequestUpdate, err error) {
    clt.SubmitAccessReview(...)
    // if updateParams are nil, the router won't try to run SetAccessRequestState.
    return nil, nil
  }
}

Add Python Client

Once the protobuf schema has been untangled from the teleport/lib code, it should be easy to create a bare bones python version of the Auth client. Python will be the first language of many new client library implementations.

ls github.com/gravitational/teleport
api/
python/

$ pip install teleport-client
import teleport

client = teleport.client(addr=...)
client.tokens()

Port router code

@access.handle(clt, path=access.trait("team")):
def handle(req):
   if req.user == "bob":
      raise access.Denied("access denied")

Update docs to include the python variant of the client as well.