Buffer fan out used simple prefix match
in a loop, what resulted in high CPU load
on many connected watchers.
This commit switches to RADIX trees for
prefix matching what reduces CPU load
substantially for 5K+ connected watchers.
This commit expands the usage of the caching layer
for auth server API:
* Introduces in-memory cache that is used to serve all
Auth server API requests. This is done to achieve scalability
on 10K+ node clusters, where each node fetches certificate authorities,
roles, users and join tokens. It is not possible to scale
DynamoDB backend or other backends on 10K reads per seconds
on a single shard or partition. The solution is to introduce
an in-memory cache of the backend state that is always used
for reads.
* In-memory cache has been expanded to support all resources
required by the auth server.
* Experimental `tctl top` command has been introduced to display
common single node metrics.
Replace SQLite Memory Backend with BTree
SQLite in memory backend was suffering from
high tail latencies under load (up to 8 seconds
in 99.9%-ile on load configurations).
This commit replaces the SQLite memory caching
backend with in-memory BTree backend that
brought down tail latencies to 2 seconds (99.9%-ile)
and brought overall performance improvement.
This commit introduces several key changes to
Teleport backend and API infrastructure
in order to achieve scalability improvements
on 10K+ node deployments.
Events and plain keyspace
--------------------------
New backend interface supports events,
pagination and range queries
and moves away from buckets to
plain keyspace, what better aligns
with DynamoDB and Etcd featuring similar
interfaces.
All backend implementations are
exposing Events API, allowing
multiple subscribers to consume the same
event stream and avoid polling database.
Replacing BoltDB, Dir with SQLite
-------------------------------
BoltDB backend does not support
having two processes access the database at the
same time. This prevented Teleport
using BoltDB backend to be live reloaded.
SQLite supports reads/writes by multiple
processes and makes Dir backend obsolete
as SQLite is more efficient on larger collections,
supports transactions and can detect data
corruption.
Teleport automatically migrates data from
Bolt and Dir backends into SQLite.
GRPC API and protobuf resources
-------------------------------
GRPC API has been introduced for
the auth server. The auth server now serves both GRPC
and JSON-HTTP API on the same TLS socket and uses
the same client certificate authentication.
All future API methods should use GRPC and HTTP-JSON
API is considered obsolete.
In addition to that some resources like
Server and CertificateAuthority are now
generated from protobuf service specifications in
a way that is fully backward compatible with
original JSON spec and schema, so the same resource
can be encoded and decoded from JSON, YAML
and protobuf.
All models should be refactored
into new proto specification over time.
Streaming presence service
--------------------------
In order to cut bandwidth, nodes
are sending full updates only when changes
to labels or spec have occured, otherwise
new light-weight GRPC keep alive updates are sent
over to the presence service, reducing
bandwidth usage on multi-node deployments.
In addition to that nodes are no longer polling
auth server for certificate authority rotation
updates, instead they subscribe to event updates
to detect updates as soon as they happen.
This is a new API, so the errors are inevitable,
that's why polling is still done, but
on a way slower rate.
* Changed import path from github.com/moby/moby to canonical path of
github.com/docker/docker.
* Updated dependency for github.com/docker/docker/pkg/term.
* Updated dependency for github.com/Azure/go-ansiterm.
This issue updates #1986.
This is intial, experimental implementation that will
be updated with tests and edge cases prior to production 2.7.0 release.
Teleport proxy adds support for Kubernetes API protocol.
Auth server uses Kubernetes API to receive certificates
issued by Kubernetes CA.
Proxy intercepts and forwards API requests to the Kubernetes
API server and captures live session traffic, making
recordings available in the audit log.
Tsh login now updates kubeconfig configuration to use
Teleport as a proxy server.
This commit implements #1860
During the the rotation procedure issuing TLS and SSH
certificate authorities are re-generated and all internal
components of the cluster re-register to get new
credentials.
The rotation procedure is based on a distributed
state machine algorithm - certificate authorities have
explicit rotation state and all parts of the cluster sync
local state machines by following transitions between phases.
Operator can launch CA rotation in auto or manual modes.
In manual mode operator moves cluster bewtween rotation states
and watches the states of the components to sync.
In auto mode state transitions are happening automatically
on a specified schedule.
The design documentation is embedded in the code:
lib/auth/rotate.go
Updates #1755
Design
------
This commit adds support for pluggable events and
sessions recordings and adds several plugins.
In case if external sessions recording storage
is used, nodes or proxies depending on configuration
store the session recordings locally and
then upload the recordings in the background.
Non-print session events are always sent to the
remote auth server as usual.
In case if remote events storage is used, auth
servers download recordings from it during playbacks.
DynamoDB event backend
----------------------
Transient DynamoDB backend is added for events
storage. Events are stored with default TTL of 1 year.
External lambda functions should be used
to forward events from DynamoDB.
Parameter audit_table_name in storage section
turns on dynamodb backend.
The table will be auto created.
S3 sessions backend
-------------------
If audit_sessions_uri is specified to s3://bucket-name
node or proxy depending on recording mode
will start uploading the recorded sessions
to the bucket.
If the bucket does not exist, teleport will
attempt to create a bucket with versioning and encryption
turned on by default.
Teleport will turn on bucket-side encryption for the tarballs
using aws:kms key.
File sessions backend
---------------------
If audit_sessions_uri is specified to file:///folder
teleport will start writing tarballs to this folder instead
of sending records to the file server.
This is helpful for plugin writers who can use fuse or NFS
mounted storage to handle the data.
Working dynamic configuration.
* Allow external audit log plugins
* Add support for auth API server plugins
* Add license file path configuration parameter (not used in open-source)
* Extend audit log with user login events
This is a fix for file leak in audit log server caused
by design issue:
Session file descriptors in audit log were opened on demand
when the session event or byte stream chunk was reported.
AuditLog server relied on SessionEnd event to close the
file descriptors associated with the session.
However, when SessionEnd event does not arrive (e.g.
there is a timeout or disconnect), the file descriptors
were not closed. This commit adds periodic clean up
of inactive sessions.
SessionEnd is now used as an optimization measure
to close the files, but is not used as the only
trigger to close files.
Now, inactive idle sessions, will close file descriptors
after periods of inactivity and will reopen the file
descriptors when the session activity resumes.
SessionLogger was not designed to open/close files
multiple times as it was reseting offsets
every time the session files were opened. This
change fixes this condition as well.
goterm had no license, I quickly replaced it with our own little table
formatter.
also rewrote some tsh commands, that were using home-made formatting, to
the new table, so the output is now much nicer.