Commit graph

6 commits

Author SHA1 Message Date
Andrew Lytvynov 3004b65019 proxy: add proxy_ssh_sessions_total metric
This is similar to server_interactive_sessions_total, but tracks all
SSH sessions through a proxy.
2020-09-18 20:57:34 +00:00
Andrew Lytvynov 96375c7d3d tctl: fix tctl top colors on dark terminals
If we leave `TextStyle` empty on UI elements, it will use the default
foreground color defined by the terminal (light for dark terminals and
vice versa). Same goes for `BorderStyle`.

A few other tweaks to UI and source metrics:
- update table ratios to prevent hiding output rows on short (height)
  terminal windows
- update tab selector style to use bold/underline instead of colors to
  mark selected tab
- print `No data` in histogram tables when there are no values
- don't report the local cluster in `remote_clusters` metric
2020-08-19 22:17:17 +00:00
Andrew Lytvynov cd1344a4a5 Add prometheus metric mirroring /readyz state
This allows users to get the health of their nodes from prometheus
metrics pipeline instead of polling readyz separately.

Updates #3700
2020-05-14 18:08:10 +00:00
Russell Jones 77e8b63470 Enhanced Session Recording.
Added package cgroup to orchestrate cgroups. Only cgroup2 is supported,
because cgroup2 cgroups have unique IDs that can be correlated with BPF
events.

Added bpf package that contains three BPF programs: execsnoop,
opensnoop, and tcpconnect. The bpf package starts and stops these
programs as well as correlating their output with Teleport sessions
and emitting them to the audit log.

Added support for Teleport to re-exec itself before launching a shell.
This allows Teleport to start a child process, capture its PID, place
the PID in a cgroup, and then continue the process. Once the process is
continued it can be tracked by its cgroup ID.

Reduced the total number of connections to a host so Teleport does not
quickly exhaust all file descriptors. File descriptors are exhausted
very quickly when disk events, which occur at a very high rate, are
emitted to the audit log.

Added tarballs for exec sessions. Updated session.start and session.end
events with additional metadata. Updated the format of session tarballs
to include enhanced events.

Added file configuration for enhanced session recording. Added code to
start up enhanced session recording and pass package to SSH nodes.
2019-12-02 15:10:39 -08:00
Alexander Klizhentas 6b5935fb71 Use RADIX trees for prefix matching. (#2666)
Buffer fan-out used a simple prefix match
in a loop, which resulted in high CPU load
with many connected watchers.

This commit switches to RADIX trees for
prefix matching, which reduces CPU load
substantially for 5K+ connected watchers.
2019-04-22 15:28:04 -07:00
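
To illustrate why tree-based prefix matching helps the fan-out described in 6b5935fb71, here is a minimal sketch. It is not the code from that commit: it uses a plain byte-wise trie rather than a compressed radix tree, and the names `PrefixTree`, `Watch`, and `Match` are hypothetical. The point it shows is that matching an event key walks one path of at most `len(key)` nodes, instead of comparing the key against every watcher's prefix in a loop.

```go
package main

import (
	"fmt"
	"sort"
)

// node is one trie node. watchers holds the IDs registered at exactly
// the prefix that leads to this node.
type node struct {
	children map[byte]*node
	watchers []string
}

// PrefixTree maps watched prefixes to watcher IDs (hypothetical type,
// an uncompressed trie standing in for a radix tree).
type PrefixTree struct {
	root *node
}

// NewPrefixTree returns an empty tree.
func NewPrefixTree() *PrefixTree {
	return &PrefixTree{root: &node{children: map[byte]*node{}}}
}

// Watch registers watcherID for every key that starts with prefix.
func (t *PrefixTree) Watch(prefix, watcherID string) {
	n := t.root
	for i := 0; i < len(prefix); i++ {
		next, ok := n.children[prefix[i]]
		if !ok {
			next = &node{children: map[byte]*node{}}
			n.children[prefix[i]] = next
		}
		n = next
	}
	n.watchers = append(n.watchers, watcherID)
}

// Match returns the sorted IDs of all watchers whose prefix is a prefix
// of key. The cost depends on len(key), not on the number of watchers.
func (t *PrefixTree) Match(key string) []string {
	n := t.root
	out := append([]string{}, n.watchers...)
	for i := 0; i < len(key); i++ {
		next, ok := n.children[key[i]]
		if !ok {
			break
		}
		n = next
		out = append(out, n.watchers...)
	}
	sort.Strings(out)
	return out
}

func main() {
	t := NewPrefixTree()
	t.Watch("/nodes/", "w1")
	t.Watch("/nodes/default/", "w2")
	t.Watch("/roles/", "w3")
	fmt.Println(t.Match("/nodes/default/host-1")) // prints [w1 w2]
}
```

A real radix tree additionally merges chains of single-child nodes into one edge, which reduces memory use and pointer chasing but does not change the lookup idea shown here.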
Sasha Klizhentas 8356ae6a74 Use in-memory cache for the auth server API.
This commit expands the usage of the caching layer
for auth server API:

* Introduces in-memory cache that is used to serve all
Auth server API requests. This is done to achieve scalability
on 10K+ node clusters, where each node fetches certificate authorities,
roles, users and join tokens. It is not possible to scale
DynamoDB backend or other backends to 10K reads per second
on a single shard or partition. The solution is to introduce
an in-memory cache of the backend state that is always used
for reads.

* In-memory cache has been expanded to support all resources
required by the auth server.

* Experimental `tctl top` command has been introduced to display
common single node metrics.

Replace SQLite Memory Backend with BTree

The SQLite in-memory backend was suffering from
high tail latencies under load (up to 8 seconds
in 99.9%-ile on load configurations).

This commit replaces the SQLite memory caching
backend with an in-memory BTree backend, which
brought tail latencies down to 2 seconds (99.9%-ile)
and improved overall performance.
2019-04-12 14:23:09 -07:00