This commit refactors the discovery protocol
to make it less dependent on the database and
scale better with large numbers of tunnels.
The reverse tunnel server now always sends
back the list of all proxies registered in the
cluster in the form of discovery requests.
Before this commit, the reverse tunnel server compared
existing TunnelConnections with the Proxies
and sent back the list of proxies that had not
been discovered yet.
This required nodes to register tunnel connections
in the database and servers to poll those connections.
On 10K-node clusters this does not scale. Instead,
the change assumes that the number of proxies is
small, so it is acceptable to send information about
all of them back to every connected agent.
Agent pools can decide on their own what to
do with this information - they can ignore
the request as long as they observe agents
connected to all of the requested proxies.
At the same time, to avoid generating too much traffic,
the reverse tunnel server only sends discovery requests
after the first agent heartbeat and whenever the
proxy list changes. To make this possible, the reverse
tunnel server sets up a watch on the proxies.
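For illustration, here is a minimal sketch of the push-based
discovery flow, assuming hypothetical names (discoveryRequest,
sendDiscovery) rather than Teleport's actual identifiers:

    package main

    import (
        "encoding/json"
        "fmt"
    )

    // discoveryRequest carries the full list of proxies registered
    // in the cluster; agents compare it against their connections.
    type discoveryRequest struct {
        Proxies []string `json:"proxies"` // proxy host IDs
    }

    // sendDiscovery marshals the current proxy set and pushes it to
    // one connected agent; the "connection" here is just a channel.
    func sendDiscovery(agent chan<- []byte, proxies []string) error {
        payload, err := json.Marshal(discoveryRequest{Proxies: proxies})
        if err != nil {
            return err
        }
        agent <- payload
        return nil
    }

    func main() {
        agent := make(chan []byte, 1)
        // Sent after the first heartbeat and on proxy list changes;
        // the agent pool decides whether it needs to act on it.
        _ = sendDiscovery(agent, []string{"proxy-1", "proxy-2"})
        fmt.Println(string(<-agent))
    }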
Whenever many IoT-style nodes connect
back to the web proxy server, they all
call the /find endpoint to discover the configuration.
This new endpoint is designed to be fast and to
avoid hitting the database.
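As a sketch of that idea (the handler, payload and port below are
made up, not Teleport's actual endpoint code), such an endpoint can
serve a snapshot that a background job keeps up to date:

    package main

    import (
        "encoding/json"
        "net/http"
        "sync/atomic"
    )

    // findConfig is an illustrative response payload.
    type findConfig struct {
        ClusterName string   `json:"cluster_name"`
        Proxies     []string `json:"proxies"`
    }

    // snapshot holds the latest configuration; a background
    // refresher swaps it atomically, so handlers never query the
    // backend.
    var snapshot atomic.Value

    func findHandler(w http.ResponseWriter, r *http.Request) {
        cfg := snapshot.Load().(*findConfig)
        w.Header().Set("Content-Type", "application/json")
        _ = json.NewEncoder(w).Encode(cfg)
    }

    func main() {
        snapshot.Store(&findConfig{
            ClusterName: "example.com",
            Proxies:     []string{"proxy-1"},
        })
        http.HandleFunc("/find", findHandler)
        _ = http.ListenAndServe(":3080", nil)
    }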
In addition to that, every proxy reverse tunnel
connection handler was fetching the auth servers;
this commit adds caching of the auth servers
on the proxy side.
This commit expands the usage of the caching layer
for auth server API:
* Introduces an in-memory cache that is used to serve all
Auth server API requests (see the sketch after this list).
This is done to achieve scalability
on 10K+ node clusters, where each node fetches certificate authorities,
roles, users and join tokens. It is not possible to scale
the DynamoDB backend, or other backends, to 10K reads per second
on a single shard or partition. The solution is to introduce
an in-memory cache of the backend state that is always used
for reads.
* The in-memory cache has been expanded to support all resources
required by the auth server.
* An experimental `tctl top` command has been introduced to display
common single-node metrics.
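The sketch below illustrates the read path under simplified
assumptions (string keys and values, a single event stream); the
actual cache is more involved:

    package main

    import (
        "fmt"
        "sync"
    )

    // event is a simplified backend event.
    type event struct {
        op, key, value string // op is "put" or "delete"
    }

    // cache mirrors backend state in memory; all reads are served
    // from the map, never from the backend itself.
    type cache struct {
        mu    sync.RWMutex
        items map[string]string
    }

    func (c *cache) apply(e event) {
        c.mu.Lock()
        defer c.mu.Unlock()
        switch e.op {
        case "put":
            c.items[e.key] = e.value
        case "delete":
            delete(c.items, e.key)
        }
    }

    func (c *cache) get(key string) (string, bool) {
        c.mu.RLock()
        defer c.mu.RUnlock()
        v, ok := c.items[key]
        return v, ok
    }

    func main() {
        c := &cache{items: map[string]string{}}
        events := make(chan event, 1)
        done := make(chan struct{})
        go func() { // consume the backend event stream
            for e := range events {
                c.apply(e)
            }
            close(done)
        }()
        events <- event{op: "put", key: "/roles/admin", value: "{...}"}
        close(events)
        <-done
        v, _ := c.get("/roles/admin")
        fmt.Println(v)
    }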
Replace SQLite Memory Backend with BTree
The SQLite in-memory backend was suffering from
high tail latencies under load (up to 8 seconds
at the 99.9th percentile in load test configurations).
This commit replaces the SQLite in-memory caching
backend with an in-memory B-tree backend, which
brought tail latencies down to 2 seconds (99.9th
percentile) and improved overall performance.
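As a rough illustration of the approach (using
github.com/google/btree; the real backend's types differ), a sorted
in-memory tree gives cheap writes and range scans with none of the
query-parsing or locking overhead of an embedded SQL engine:

    package main

    import (
        "fmt"

        "github.com/google/btree"
    )

    // item is a key/value pair ordered by key.
    type item struct {
        key, value string
    }

    func (i *item) Less(other btree.Item) bool {
        return i.key < other.(*item).key
    }

    func main() {
        // Degree controls node fan-out; 8 is an arbitrary choice.
        tree := btree.New(8)
        tree.ReplaceOrInsert(&item{key: "/nodes/a", value: "node-a"})
        tree.ReplaceOrInsert(&item{key: "/nodes/b", value: "node-b"})

        // Range scan over the "/nodes/" prefix; '0' is the byte
        // right after '/', so ["/nodes/", "/nodes0") covers it all.
        tree.AscendRange(&item{key: "/nodes/"}, &item{key: "/nodes0"},
            func(i btree.Item) bool {
                kv := i.(*item)
                fmt.Println(kv.key, "=>", kv.value)
                return true
            })
    }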
This commit introduces several key changes to
the Teleport backend and API infrastructure
in order to achieve scalability improvements
on 10K+ node deployments.
Events and plain keyspace
--------------------------
The new backend interface supports events,
pagination and range queries,
and moves away from buckets to a
plain keyspace, which aligns better
with DynamoDB and etcd, both of which feature
similar interfaces.
All backend implementations now
expose an Events API, allowing
multiple subscribers to consume the same
event stream and avoid polling the database.
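A rough sketch of what such an interface can look like
(illustrative, not Teleport's exact API):

    package backend

    import "context"

    // Item is a key/value pair in a flat keyspace - no buckets,
    // which maps naturally onto DynamoDB and etcd.
    type Item struct {
        Key   []byte
        Value []byte
    }

    // Event describes a change to a single item.
    type Event struct {
        Type string // e.g. "put" or "delete"
        Item Item
    }

    // Watcher delivers backend events to one subscriber; many
    // watchers can consume the same underlying stream.
    type Watcher interface {
        Events() <-chan Event
        Close() error
    }

    // Backend supports point reads, paginated range queries and
    // events.
    type Backend interface {
        Put(ctx context.Context, i Item) error
        Get(ctx context.Context, key []byte) (*Item, error)
        // GetRange returns up to limit items in [startKey, endKey);
        // callers paginate by resuming from the last returned key.
        GetRange(ctx context.Context, startKey, endKey []byte, limit int) ([]Item, error)
        Delete(ctx context.Context, key []byte) error
        NewWatcher(ctx context.Context) (Watcher, error)
    }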
Replacing BoltDB, Dir with SQLite
-------------------------------
The BoltDB backend does not support
two processes accessing the database at the
same time. This prevented Teleport deployments
using the BoltDB backend from being live-reloaded.
SQLite supports reads and writes by multiple
processes and makes the Dir backend obsolete,
as SQLite is more efficient on larger collections,
supports transactions and can detect data
corruption.
Teleport automatically migrates data from
the Bolt and Dir backends into SQLite.
GRPC API and protobuf resources
-------------------------------
A GRPC API has been introduced for
the auth server. The auth server now serves both the GRPC
and the JSON-HTTP API on the same TLS socket and uses
the same client certificate authentication.
All future API methods should use GRPC; the HTTP-JSON
API is considered obsolete.
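One common way to serve both protocols on a single TLS listener is
to branch on the request protocol; this sketch shows that pattern
(it is not necessarily how Teleport implements it):

    package main

    import (
        "net/http"
        "strings"

        "google.golang.org/grpc"
    )

    // mixedHandler routes gRPC traffic to the gRPC server and
    // everything else to the JSON-HTTP mux, over one TLS socket.
    func mixedHandler(grpcServer *grpc.Server, httpMux http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            if r.ProtoMajor == 2 &&
                strings.HasPrefix(r.Header.Get("Content-Type"), "application/grpc") {
                grpcServer.ServeHTTP(w, r)
                return
            }
            httpMux.ServeHTTP(w, r)
        })
    }

    func main() {
        grpcServer := grpc.NewServer()
        mux := http.NewServeMux()
        mux.HandleFunc("/v1/ping", func(w http.ResponseWriter, r *http.Request) {
            w.Write([]byte("pong"))
        })
        // ListenAndServeTLS negotiates HTTP/2 via ALPN, so gRPC and
        // JSON-HTTP share the socket and the client-cert handshake.
        srv := &http.Server{Addr: ":3025", Handler: mixedHandler(grpcServer, mux)}
        _ = srv.ListenAndServeTLS("server.crt", "server.key")
    }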
In addition to that, some resources like
Server and CertificateAuthority are now
generated from protobuf service specifications in
a way that is fully backward compatible with the
original JSON spec and schema, so the same resource
can be encoded and decoded from JSON, YAML
and protobuf.
All models should be refactored
into the new proto specification over time.
Streaming presence service
--------------------------
In order to cut bandwidth, nodes
send full updates only when changes
to labels or the spec have occurred; otherwise,
new lightweight GRPC keep-alive updates are sent
to the presence service, reducing
bandwidth usage on multi-node deployments.
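A simplified sketch of that decision (the types and the
fingerprinting scheme here are made up for illustration):

    package main

    import (
        "crypto/sha256"
        "encoding/json"
        "fmt"
    )

    // serverInfo stands in for the node's heartbeat payload.
    type serverInfo struct {
        Name   string            `json:"name"`
        Labels map[string]string `json:"labels"`
    }

    // digest fingerprints the parts of the heartbeat that matter;
    // when it is unchanged, a lightweight keep-alive is enough.
    func digest(s serverInfo) [32]byte {
        b, _ := json.Marshal(s) // map keys are sorted, so stable
        return sha256.Sum256(b)
    }

    func main() {
        prev := digest(serverInfo{Name: "node-1",
            Labels: map[string]string{"env": "prod"}})
        next := serverInfo{Name: "node-1",
            Labels: map[string]string{"env": "prod"}}

        if digest(next) == prev {
            fmt.Println("send keep-alive")
        } else {
            fmt.Println("send full update")
        }
    }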
In addition to that, nodes no longer poll the
auth server for certificate authority rotation
updates; instead, they subscribe to event updates
to detect rotations as soon as they happen.
This is a new API, so errors are inevitable,
which is why polling is still done, but
at a much slower rate.
This commit makes sure that the trusted cluster resource
name is the same as the name of the cluster it connects to.
If the user supplies a trusted cluster resource name
that differs from the cluster name, a warning
will be issued and the trusted cluster will be renamed.
The upgrade procedure renames existing trusted clusters
in place.
If the user supplies a trusted cluster without role
mappings, or with role mappings referring to
roles that do not exist, an
error will be returned.
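An illustrative validation routine for these rules (not Teleport's
actual code) might look like:

    package main

    import (
        "fmt"
        "log"
    )

    // roleMapping maps a remote role to a local one.
    type roleMapping struct{ Remote, Local string }

    type trustedCluster struct {
        Name        string // name of the resource
        ClusterName string // name reported by the remote cluster
        RoleMap     []roleMapping
    }

    func validate(tc *trustedCluster, localRoles map[string]bool) error {
        // Rename the resource to match the remote cluster name,
        // with a warning.
        if tc.Name != tc.ClusterName {
            log.Printf("warning: renaming trusted cluster %q to %q",
                tc.Name, tc.ClusterName)
            tc.Name = tc.ClusterName
        }
        // Missing or dangling role mappings are hard errors.
        if len(tc.RoleMap) == 0 {
            return fmt.Errorf("trusted cluster %q has no role mappings", tc.Name)
        }
        for _, m := range tc.RoleMap {
            if !localRoles[m.Local] {
                return fmt.Errorf("role mapping refers to non-existent role %q", m.Local)
            }
        }
        return nil
    }

    func main() {
        tc := &trustedCluster{
            Name:        "east",
            ClusterName: "east.example.com",
            RoleMap:     []roleMapping{{Remote: "admin", Local: "admin"}},
        }
        fmt.Println(validate(tc, map[string]bool{"admin": true}))
    }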
This commit adds a remote cluster resource that specifies
the connection and trust of a remote trusted cluster to the local
cluster. Deleting the remote cluster resource deletes the trust
established between the clusters on the local cluster side
and terminates all reverse tunnel connections.
Migrations make sure that remote cluster resources exist
after an upgrade of the auth server.
- Friendly error messages when parsing configuration and establishing
a connection
- Bugs related to "first start" vs subsequent starts (reverse tunnels
added to the YAML file won't be seen upon restart)
- Nicer logging
Reverse tunnels are now first-class citizens of Teleport.
There is no longer static configuration for reverse tunnel agents
in the config. Instead, admins can add and remove reverse tunnels
using the (hidden) tctl reversetunnel commands:
* tctl reversetunnel ls
lists reverse tunnels
* tctl reversetunnel upsert a.example.com 10.0.0.4:2023,10.0.0.5:2033 --ttl=10m
updates or inserts a reverse tunnel with a 10-minute TTL
* tctl reversetunnel del a.example.com
deletes a reverse tunnel
Teleport proxies watch for changes to the reverse tunnels on the backend and
spin up / spin down reverse tunnels according to these changes.
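In essence this is a reconcile loop; a bare-bones sketch (all names
made up):

    package main

    import "fmt"

    // reconcile compares the desired tunnel set from the backend
    // with the currently running agents, starting and stopping as
    // needed.
    func reconcile(desired, running map[string]bool) {
        for name := range desired {
            if !running[name] {
                fmt.Println("spin up agent for", name)
                running[name] = true
            }
        }
        for name := range running {
            if !desired[name] {
                fmt.Println("spin down agent for", name)
                delete(running, name)
            }
        }
    }

    func main() {
        desired := map[string]bool{"a.example.com": true}
        running := map[string]bool{"old.example.com": true}
        reconcile(desired, running)
    }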
This commit introduces heartbeats of AuthServers and Proxies and fixes several issues:
1. Server init problem
There was an issue in server init where certificates of multiple roles were overwriting each other.
Now Teleport stores each keypair and certificate in separate files, <hostid>.role.key and <hostid>.role.cert.
This also means that it is backwards incompatible with the previous on-disk format.
2. Proxy and Auth heartbeats
Auth servers and proxies now heartbeat into the cluster as well.
3. Bugfixes:
* The Proxy role was missing; it is now treated as a separate role with its own permissions
* AdvertiseIP is now a global setting that can be used by all roles
* The --advertise-ip flag was ignored and was never applied
* Teleport service initialization has been simplified; each role now gets its own client
* Minor cleanups