Selectively listing package paths is error-prone. Use `go list` to get
the complete list instead. Filter out integration tests since they are
slower.
Also, enable the race detector by default. Local `make test` runs should
not skip it.
Added package cgroup to orchestrate cgroups. Only support for cgroup2
was added to utilize because cgroup2 cgroups have unique IDs that can be
used correlated with BPF events.
Added bpf package that contains three BPF programs: execsnoop,
opensnoop, and tcpconnect. The bpf package starts and stops these
programs as well correlating their output with Teleport sessions
and emitting them to the audit log.
Added support for Teleport to re-exec itself before launching a shell.
This allows Teleport to start a child process, capture it's PID, place
the PID in a cgroup, and then continue to process. Once the process is
continued it can be tracked by it's cgroup ID.
Reduced the total number of connections to a host so Teleport does not
quickly exhaust all file descriptors. Exhausting all file descriptors
happens very quickly when disk events are emitted to the audit log which
are emitted at a very high rate.
Added tarballs for exec sessions. Updated session.start and session.end
events with additional metadata. Updated the format of session tarballs
to include enhanced events.
Added file configuration for enhanced session recording. Added code to
startup enhanced session recording and pass package to SSH nodes.
If an attacker can force a username change at an IdP, upon second login,
the services.User object of the original user can be updated with new
roles and traits. If these new roles and traits differ, the original
user can have their privileges raised (or lowered).
To mitigate this, encode roles and traits within the certificate and use
these when fetching roles to make RBAC decisions. If roles and traits are
not encoded within an certificate (for example for old style SSH
certificates then fallback to using the services.User object and log a
warning.
Added "--fips" flag to "teleport start" command which can start
Enterprise in FedRAMP/FIPS 140-2 mode.
In FIPS mode, Teleport configures the TLS and SSH servers with FIPS
compliant cryptographic algorithms. In FIPS mode, if non-compliant
algorithms are chosen, Teleport will fail to start. In addition,
Teleport checks if the binary was compiled against an approved
cryptographic module (BoringCrypto) and fails to start if it was not.
If a client, like tsh, tries to use non-FIPS encryption, like NaCl,
those requests are also rejected.
This commit introduces several key changes to
Teleport backend and API infrastructure
in order to achieve scalability improvements
on 10K+ node deployments.
Events and plain keyspace
--------------------------
New backend interface supports events,
pagination and range queries
and moves away from buckets to
plain keyspace, what better aligns
with DynamoDB and Etcd featuring similar
interfaces.
All backend implementations are
exposing Events API, allowing
multiple subscribers to consume the same
event stream and avoid polling database.
Replacing BoltDB, Dir with SQLite
-------------------------------
BoltDB backend does not support
having two processes access the database at the
same time. This prevented Teleport
using BoltDB backend to be live reloaded.
SQLite supports reads/writes by multiple
processes and makes Dir backend obsolete
as SQLite is more efficient on larger collections,
supports transactions and can detect data
corruption.
Teleport automatically migrates data from
Bolt and Dir backends into SQLite.
GRPC API and protobuf resources
-------------------------------
GRPC API has been introduced for
the auth server. The auth server now serves both GRPC
and JSON-HTTP API on the same TLS socket and uses
the same client certificate authentication.
All future API methods should use GRPC and HTTP-JSON
API is considered obsolete.
In addition to that some resources like
Server and CertificateAuthority are now
generated from protobuf service specifications in
a way that is fully backward compatible with
original JSON spec and schema, so the same resource
can be encoded and decoded from JSON, YAML
and protobuf.
All models should be refactored
into new proto specification over time.
Streaming presence service
--------------------------
In order to cut bandwidth, nodes
are sending full updates only when changes
to labels or spec have occured, otherwise
new light-weight GRPC keep alive updates are sent
over to the presence service, reducing
bandwidth usage on multi-node deployments.
In addition to that nodes are no longer polling
auth server for certificate authority rotation
updates, instead they subscribe to event updates
to detect updates as soon as they happen.
This is a new API, so the errors are inevitable,
that's why polling is still done, but
on a way slower rate.