Commit graph

4 commits

Author SHA1 Message Date
Russell Jones 6d1c16f745 Added support for nodes dialing back to cluster.
Updated services.ReverseTunnel to support type (proxy or node). For
proxy types, which represent trusted cluster connections, when a
services.ReverseTunnel is created, it's created on the remote side with
name /reverseTunnels/example.com. For node types, services.ReverseTunnel
is created on the main side as /reverseTunnels/{nodeUUID}.clusterName.

Updated services.TunnelConn to support type (proxy or node). For proxy
types, which represent trusted cluster connections, tunnel connections
are created on the main side under
/tunnelConnections/remote.example.com/{proxyUUID}-remote.example.com.
For nodes, tunnel connections are created on the main side under
/tunnelConnections/example.com/{proxyUUID}-example.com. This allows
searching for tunnel connections by cluster then allows easily creating
a set of proxies that are missing matching services.TunnelConn.

The reverse tunnel server has been updated to handle heartbeats from
proxies as well as nodes. Proxy heartbeat behavior has not changed.
Heartbeats from nodes now add remote connections to the matching local
site. In addition, the reverse tunnel server now proxies connection to
the Auth Server for requests that are already authenticated (a second
authentication to the Auth Server is required).

For registration, nodes try and connect to the Auth Server to fetch host
credentials. Upon failure, nodes now try and fallback to fetching host
credentials from the web proxy.

To establish a connection to an Auth Server, nodes first try and connect
directly, and if the connection fails, fallback to obtaining a
connection to the Auth Server through the reverse tunnel. If a
connection is established directly, node startup behavior has not
changed. If a node establishes a connection through the reverse tunnel,
it creates an AgentPool that attempts to dial back to the cluster and
establish a reverse tunnel.

When nodes heartbeat, they also heartbeat if they are connected directly
to the cluster or through a reverse tunnel. For nodes that are connected
through a reverse tunnel, the proxy subsystem now directs the reverse
tunnel server to establish a connection through the reverse tunnel
instead of directly.

When sending discovery requests, the domain field has been replaced with
tunnelID. The tunnelID field is either the cluster name (same as before)
for proxies, or {nodeUUID}.example.com for nodes.
2019-04-26 15:41:45 -07:00
Sasha Klizhentas f40df845db Events and GRPC API
This commit introduces several key changes to
Teleport backend and API infrastructure
in order to achieve scalability improvements
on 10K+ node deployments.

Events and plain keyspace
--------------------------

New backend interface supports events,
pagination and range queries
and moves away from buckets to
plain keyspace, what better aligns
with DynamoDB and Etcd featuring similar
interfaces.

All backend implementations are
exposing Events API, allowing
multiple subscribers to consume the same
event stream and avoid polling database.

Replacing BoltDB, Dir with SQLite
-------------------------------

BoltDB backend does not support
having two processes access the database at the
same time. This prevented Teleport
using BoltDB backend to be live reloaded.

SQLite supports reads/writes by multiple
processes and makes Dir backend obsolete
as SQLite is more efficient on larger collections,
supports transactions and can detect data
corruption.

Teleport automatically migrates data from
Bolt and Dir backends into SQLite.

GRPC API and protobuf resources
-------------------------------

GRPC API has been introduced for
the auth server. The auth server now serves both GRPC
and JSON-HTTP API on the same TLS socket and uses
the same client certificate authentication.

All future API methods should use GRPC and HTTP-JSON
API is considered obsolete.

In addition to that some resources like
Server and CertificateAuthority are now
generated from protobuf service specifications in
a way that is fully backward compatible with
original JSON spec and schema, so the same resource
can be encoded and decoded from JSON, YAML
and protobuf.

All models should be refactored
into new proto specification over time.

Streaming presence service
--------------------------

In order to cut bandwidth, nodes
are sending full updates only when changes
to labels or spec have occured, otherwise
new light-weight GRPC keep alive updates are sent
over to the presence service, reducing
bandwidth usage on multi-node deployments.

In addition to that nodes are no longer polling
auth server for certificate authority rotation
updates, instead they subscribe to event updates
to detect updates as soon as they happen.

This is a new API, so the errors are inevitable,
that's why polling is still done, but
on a way slower rate.
2018-12-10 17:20:24 -08:00
Sasha Klizhentas bcc25f971f Upgrade etcd backend
New Etcd backend is using GRPC api v3,
dependencies were updated accordingly.
2018-09-10 15:58:05 -07:00
Sasha Klizhentas b72b734979 add grpc 2017-05-26 18:19:22 -07:00