* Adding annotations to the serviceAccount definition to allow IRSA to be used on AWS EKS deployments
* Adding separate settings for the auth service when deploying highAvailability and passing through loadBalancerSourceRanges when service type is LoadBalancer
A new chart, teleport-cluster, helps users get started
with Teleport on Kubernetes. It uses a single-node deployment with
persistent volumes and supports ACME.
A new quickstart guide will use this chart.
* Use "5.0" as string instead of integer
Otherwise, it won't find the tag as it will look for tag 5, instead of 5.0
* update values for teleport-auto-trustedcluster and teleport-daemonset
Co-authored-by: Gus Luxton <gus@gravitational.com>
Co-authored-by: Andrew Lytvynov <andrew@goteleport.com>
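The "5.0" note above comes down to how an unquoted YAML scalar is typed. A minimal Go sketch of the effect, assuming the parsed value ends up as a float64 and is printed with default formatting (`renderTag` is a hypothetical helper for illustration, not Helm's actual code):

```go
package main

import "fmt"

// renderTag mimics printing a parsed YAML scalar with default formatting.
// Hypothetical helper for illustration only.
func renderTag(v interface{}) string {
	return fmt.Sprint(v)
}

func main() {
	// An unquoted 5.0 is typed as a float, so the trailing ".0" is lost.
	fmt.Println(renderTag(5.0))
	// A quoted "5.0" stays a string and keeps its exact spelling.
	fmt.Println(renderTag("5.0"))
}
```

Quoting the value in values.yaml keeps the image tag lookup pointed at 5.0 rather than 5.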
* benchmark package
* use default config if path is not specified
* progressiveBench as a config method
* implement a main.go approach to run progressive tests
* make teleport client, run specified benchmark
* function and method descriptions
* make teleport client
* testing
* change interface method signatures
* dry up bench.go code, move producer goroutines to own function
* output formatting
* remove yaml
* fix linter errors
* remove print
* PR suggested changes, moved export latency profile functionality to the benchmark package
* PR fixes
* method description
* update testing
* linter
* docs and example
* PR suggestion changes
* fix coord omission bug
* remove benchmark struct
* remove threads, using open system
* recover in run
* close channel, check if open with each execution
* update testing, pr suggestions
* add more instructions to readme
* update example.go
* pass back context
* use SyncBuffer
* export response and service histograms
* update readme, exporting profiles section
* return from execute()
* export singular latency profile
* export response profile
* Revert "export response profile"
This reverts commit 5a21cb034c.
* export response profile
* update branch
* format example.go
* remove threads
* update example.go
* update branch
* goimports
* add signal handler & update docs
* PR suggestions
* exit out of interactive session
* revert execute
* PR suggestion
* run command on non-interactive instead of nil
* Add helm chart for in-cluster kubernetes_service agent
This is a simplified version of the teleport chart, intended to only run
a "stateless" `kubernetes_service` instance within a kubernetes cluster.
This instance joins an externally-managed teleport cluster, given a
proxy address and a join token. The connection is always over a reverse
tunnel, per our recommended approach.
The chart is opinionated and only lets the user modify the bare minimum.
* Apply suggestions from code review
Co-authored-by: Gus Luxton <gus@gravitational.com>
* Move join token into a secret
Secret can be more tightly restricted via RBAC, and encrypted at rest
with KMSs.
Also, a few other small tweaks for UX.
Co-authored-by: Andrew Lytvynov <andrew@gravitational.com>
Co-authored-by: Gus Luxton <gus@gravitational.com>
Shellcheck is a linter for shell scripts. Since we have quite a few of
those for release packaging and examples, we'll benefit from an extra
set of (robot) eyes.
Note: I disabled https://github.com/koalaman/shellcheck/wiki/Sc2086 to
make this PR smaller. That specific check is for the most frequent
mistake in our scripts - not quoting env var expansions. I'll do a
separate PR cleaning those up.
`build.assets/pkg` is no longer used and was removed.
The prefix fetching logic has a bug: it treats everything starting with
`/teleport` as the legacy prefix data, even if it's `/teleport-foo/bar`.
This is an issue if user specifies `/teleport-foo` as their custom
prefix. Each restart will copy the data from `/teleport-foo/...` to
`/teleport-foo-foo/...`.
Set the legacy prefix const to `/teleport/` instead. This avoids
excessive copying during startup.
Prefixes can still be confused later on, with `Watch` and `GetRange`,
but this is harder to migrate with backwards-compatibility.
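The prefix confusion described above can be reproduced with a plain string prefix check. This is a minimal sketch, not the actual backend code; the constant and function names are assumptions made for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

const (
	// buggyLegacyPrefix lacks the trailing slash, so it also matches
	// custom prefixes such as /teleport-foo.
	buggyLegacyPrefix = "/teleport"
	// fixedLegacyPrefix only matches keys directly under /teleport/.
	fixedLegacyPrefix = "/teleport/"
)

// isLegacyKey reports whether key is treated as legacy-prefixed data.
func isLegacyKey(key, prefix string) bool {
	return strings.HasPrefix(key, prefix)
}

func main() {
	keys := []string{"/teleport/nodes/a", "/teleport-foo/nodes/a"}
	for _, key := range keys {
		fmt.Printf("%-24s buggy=%v fixed=%v\n", key,
			isLegacyKey(key, buggyLegacyPrefix),
			isLegacyKey(key, fixedLegacyPrefix))
	}
}
```

With the trailing slash, a key under `/teleport-foo` is no longer mistaken for legacy data, so it is not copied again on each restart.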
This script is similar to `examples/gke-auth/get-kubeconfig.sh` but
should work for any k8s setup.
It uses a service account bearer token for authentication instead of TLS
key/cert. These tokens shouldn't expire and are more appropriate for
automation. It also fetches the CA cert from the service account secret,
which is more reliable than assuming a `kube-dns` pod exists in the
cluster.
In addition, this script sets up the needed k8s RBAC objects for
impersonation, saving the user a few extra steps.
Prefix-handling code was using a hardcoded prefix (`/teleport`) instead
of the prefix specified in config. Use the correct config prefix and add
a test.
Our auth middleware already attaches a TLS identity as context value.
Plumb contexts through and extract the username when recording events.
If the received context doesn't have an identity attached, use "system"
as username.
Lots of noise here due to missing context.Context plumbing :(
We should eventually plumb contexts to all those RPC interfaces.
Updates #3816
* Base fork for 4.3 docs
* [docs] external email identities and Kube Users (#3628)
* Remove trailing whitespace from docs files
Some editors will do this automatically on save. This causes a lot of
diffs when editing the docs in such an editor.
Clean them up once now and we'll try to keep it tidy going forward.
* Add make rules for docs whitespace and milv
docs-test-whitespace: checks for trailing whitespace in all .md files
under docs/.
docs-fix-whitespace: removes trailing whitespace in all .md files under
docs/.
docs-test-links: runs milv in all docs/ subdirectories that have
milv.config.yaml.
docs-test: runs whitespace and links tests, used during `make docs`
* Document the new `--use-local-ssh-agent` flag for tsh
The flag is used to bypass the local SSH agent even when it's running.
Specifically, this helps with agents that don't support certs.
The flag was added in #3721
* Remove pam_script.so docs from SSH PAM page
With #3725 we now populate teleport-specific env vars in a way that's
accessible to `pam_exec.so`. There's no longer any reason to install
pam_script.so separately and duplicate our docs.
Updates #3692
* Using the correct --insecure-no-tls flag
* Run docs-fix-whitespace make rule in a busybox container
* Fixes #3414
Co-authored-by: Andrew Lytvynov <andrew@gravitational.com>
Co-authored-by: Gus Luxton <gus@gravitational.com>
Co-authored-by: Steven Martin <steven@gravitational.com>
Co-authored-by: Gus Luxton <webvictim@gmail.com>
* Teleport helm upgrade command update
The --name in the helm upgrade example was not a valid parameter. Also put in comments that ca.pem is not required. It is off by default.
* Modified comments based on feedback
* Add image types, AMI IDs, extend AuthASG timeout
Added options for m4.large and m5.large. Added AMI IDs for all regions. Extended the timeout on the Auth ASG from 20 minutes to 30 minutes.
* Update ent.yaml
Co-authored-by: Ben Arent <ben@gravitational.com>
Co-authored-by: Gus Luxton <gus@gravitational.com>
* Update all connector YAML configs
* Use <cluster-url> as standard
* Leverage markdown_include.include
* Include screenshots for Buttons based on Display.
The defaults file is a common location to define service specific
environment variables. Defining the variables is still up to the
admin, but like this at least the service file doesn't need to be
modified anymore.
* Expose diagnostic endpoint and add liveness/ready checks to pods to enable automatic restart if Teleport shuts down
* Force add OIDC connector to suppress error message when container restarts, also add missing echo to errors
* Force adding of trusted cluster on restart
Update mirror mode (for both the memory and SQLite backends) to no
longer emit events when an element expires. This allows caches to handle
update/delete logic themselves.
This fixes an issue where services.ProxyWatcher was not getting updates
to the list of proxies.
* Add reset for buffers to close watchers
and reset the buffer state.
* Add reconnect logic to DynamoDB
* Add tests for cache watchers, make sure
the errors of the cache internal watcher propagate to
external watchers.
* Create wildcard DNS record for the main cluster as well as single A record so we can use Kubernetes forwarding to remote clusters via proxy properly
* Automatically delete created Cloudflare DNS records via pre-delete hook when the chart is deleted to keep the zone tidy
* Don't explicitly print Cloudflare API credentials in debug mode (they're logged along with the curl commands anyway)
* Add a function to handle Cloudflare API calls rather than copy/pasting code
* Initial commit with split Helm chart for proxy/auth and node elements
* Many, many changes to add all required features
* Remove cert-manager and nginx-ingress
* Update TTL
* Add build-essential and python-dev to cloudflare-agent Docker build and set exit on error
* Add --force-upgrade flag to Tiller for potentially different Helm versions
* Enable Letsencrypt by default
* Overhaul naming to allow better multi-tenancy on k8s clusters
* Add NOTES.txt to provide cluster usage instructions
* Make the use of trusted clusters entirely optional
* Actually make the use of trusted clusters entirely optional this time
* Update .gitignore
* Update whitespace formatting in NOTES.txt
* Enable Letsencrypt by default
* Move secrets to git submodule
* Fix README typo and add secrets to .gitignore
* Update documentation
* Add some extra details to NOTES.txt
* Address PR comments plus update all references to Teleport 3.1.4 -> 3.1.7
* Make Cloudflare TTL optional (use Cloudflare's auto value when it's not provided)
* Explicitly add admin role to clusters with use of kubernetes_groups
* Fix use of claims_to_roles so it can be specified in values.yaml
* Improve Minikube/NodePort support
* Replace use of containerPort with service port for LoadBalancer objects
* Update secrets in submodule to use Kubernetes-enabled license
* Add admin role script to containers
* Ignore all secrets files
* Update k8s RBAC to fix proxy functionality, also create 'clusteradmin' and 'admin' roles in Teleport to split permissions
* Update default version to 3.1.8
* Add k8s cluster roles and bindings to allow use of CSR APIs and limited permission scope
* Restrict admin role from seeing/updating auth_connectors
* Fix whitespace and naming bug
* Change from using k8s CSR API to impersonation API
* Update from kubectl 1.12.4 -> 1.12.5 for security fix
* Updated build scripts to use Docker cache properly, also using version tags for all containers now to keep things tidier
* Use docker build --pull rather than manual pull, also remove unused TELEPORT_VERSION arguments
This commit switches Teleport proxy to use impersonation
API instead of the CSR API.
This allows Teleport to work on EKS clusters, GKE and all
other CNCF-compatible clusters.
This commit updates helm chart RBAC as well.
It introduces an extra configuration flag in the proxy_service
configuration section:
```yaml
proxy_service:
  # kubeconfig_file is used for scenarios
  # when Teleport Proxy is deployed outside
  # of the kubernetes cluster
  kubeconfig_file: /path/to/kube/config
```
It deprecates the similar flag in auth_service:
```yaml
auth_service:
  # DEPRECATED. THIS FLAG IS IGNORED
  kubeconfig_file: /path/to/kube/config
```
* Change cluster validation method from using CA cert stored in SSM to CA pin hash stored in SSM - also fixes issues with proxy/node being unable to join the cluster if the cluster name is reused. Split builds into local 'debug' versions and separate production/marketplace versions with different names
* Fixes for Terraform documentation and license
* Update Makefile and README
* Makefile formatting fixes
* Add build timestamps back into Jenkins
* Add BuildTimestamp into user tags
* Add region to modify-image-attribute command
* Add owner ID into list command
* Add single AMI build/setup
* Add ACM support to Terraform and Letsencrypt support to single AMI
* Finish Letsencrypt support for Single AMI, also add ACM to Single AMI and tidy up Terraform versioning
* Fix Letsencrypt cert acquisition, reduce startup timers from 5 minutes to 3 minutes, tweaks for ACM/non-ACM in Terraform
* Remove AWS-based license from Enterprise AMI to convert to BYOL
* Tidy up - move Cloudformation into a separate subdirectory and remove old Terraform code
* Updated TIG stack to latest versions and tested
* Tidy up CloudFormation builds and improve instructions
* Fix VPC variable name
This commit introduces several key changes to
Teleport backend and API infrastructure
in order to achieve scalability improvements
on 10K+ node deployments.
Events and plain keyspace
--------------------------
The new backend interface supports events,
pagination and range queries
and moves away from buckets to a
plain keyspace, which better aligns
with DynamoDB and Etcd, which feature similar
interfaces.
All backend implementations
expose the Events API, allowing
multiple subscribers to consume the same
event stream and avoid polling the database.
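To make the plain-keyspace idea concrete, here is a minimal in-memory sketch of range queries over sorted keys. It is an illustration under stated assumptions, not the actual backend interface: the `memBackend` type, its method signatures and `prefixEnd` are hypothetical, and `prefixEnd` assumes a non-empty prefix whose last byte is below 0xff.

```go
package main

import (
	"fmt"
	"sort"
)

// Item is a single key/value pair in the flat keyspace.
type Item struct {
	Key, Value string
}

// memBackend keeps items sorted by key so ranges are contiguous slices.
type memBackend struct {
	items []Item
}

// Put inserts or updates a key while preserving sort order.
func (b *memBackend) Put(key, value string) {
	i := sort.Search(len(b.items), func(i int) bool { return b.items[i].Key >= key })
	if i < len(b.items) && b.items[i].Key == key {
		b.items[i].Value = value
		return
	}
	b.items = append(b.items, Item{})
	copy(b.items[i+1:], b.items[i:])
	b.items[i] = Item{Key: key, Value: value}
}

// GetRange returns items with startKey <= Key < endKey, up to limit
// items (limit <= 0 means no limit). A caller can paginate by passing
// a key just past the last returned key as the next startKey.
func (b *memBackend) GetRange(startKey, endKey string, limit int) []Item {
	i := sort.Search(len(b.items), func(i int) bool { return b.items[i].Key >= startKey })
	var out []Item
	for ; i < len(b.items) && b.items[i].Key < endKey; i++ {
		if limit > 0 && len(out) >= limit {
			break
		}
		out = append(out, b.items[i])
	}
	return out
}

// prefixEnd returns the exclusive upper bound for all keys that start
// with prefix (assumes a non-empty prefix with last byte < 0xff).
func prefixEnd(prefix string) string {
	p := []byte(prefix)
	p[len(p)-1]++
	return string(p)
}

func main() {
	b := &memBackend{}
	b.Put("/roles/admin", "r1")
	b.Put("/nodes/b", "n2")
	b.Put("/nodes/a", "n1")
	for _, it := range b.GetRange("/nodes/", prefixEnd("/nodes/"), 0) {
		fmt.Println(it.Key, it.Value)
	}
}
```

A former "bucket" becomes a key prefix, and listing the bucket becomes a range query bounded by that prefix.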
Replacing BoltDB, Dir with SQLite
-------------------------------
The BoltDB backend does not support
having two processes access the database at the
same time. This prevented Teleport instances
using the BoltDB backend from being live reloaded.
SQLite supports reads/writes by multiple
processes and makes Dir backend obsolete
as SQLite is more efficient on larger collections,
supports transactions and can detect data
corruption.
Teleport automatically migrates data from
Bolt and Dir backends into SQLite.
GRPC API and protobuf resources
-------------------------------
GRPC API has been introduced for
the auth server. The auth server now serves both GRPC
and JSON-HTTP API on the same TLS socket and uses
the same client certificate authentication.
All future API methods should use GRPC; the HTTP-JSON
API is considered obsolete.
In addition to that some resources like
Server and CertificateAuthority are now
generated from protobuf service specifications in
a way that is fully backward compatible with
original JSON spec and schema, so the same resource
can be encoded and decoded from JSON, YAML
and protobuf.
All models should be refactored
into new proto specification over time.
Streaming presence service
--------------------------
In order to cut bandwidth, nodes
send full updates only when changes
to labels or spec have occurred; otherwise
new lightweight GRPC keep-alive updates are sent
to the presence service, reducing
bandwidth usage on multi-node deployments.
In addition to that nodes are no longer polling
auth server for certificate authority rotation
updates, instead they subscribe to event updates
to detect updates as soon as they happen.
This is a new API, so errors are inevitable;
that's why polling is still done, but
at a much slower rate.
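The choice between a full update and a keep-alive can be sketched as a comparison against the last announced state. This is a simplified illustration; `ServerSpec`, `announcer` and the returned strings are hypothetical and do not reflect the real presence service types:

```go
package main

import "fmt"

// ServerSpec is a hypothetical, trimmed-down view of what a node announces.
type ServerSpec struct {
	Addr   string
	Labels map[string]string
}

// sameSpec reports whether two specs carry the same address and labels.
func sameSpec(a, b ServerSpec) bool {
	if a.Addr != b.Addr || len(a.Labels) != len(b.Labels) {
		return false
	}
	for k, v := range a.Labels {
		if bv, ok := b.Labels[k]; !ok || bv != v {
			return false
		}
	}
	return true
}

// announcer remembers the last spec that was sent in full.
type announcer struct {
	last *ServerSpec
}

// next returns "full" when the spec changed since the last full update
// (or was never sent), and "keepalive" otherwise.
func (a *announcer) next(cur ServerSpec) string {
	if a.last != nil && sameSpec(*a.last, cur) {
		return "keepalive"
	}
	// Copy the labels so later mutations by the caller are detected.
	saved := ServerSpec{Addr: cur.Addr, Labels: make(map[string]string, len(cur.Labels))}
	for k, v := range cur.Labels {
		saved.Labels[k] = v
	}
	a.last = &saved
	return "full"
}

func main() {
	a := &announcer{}
	spec := ServerSpec{Addr: "10.0.0.1:3022", Labels: map[string]string{"env": "prod"}}
	fmt.Println(a.next(spec))
	fmt.Println(a.next(spec))
	spec.Labels["env"] = "staging"
	fmt.Println(a.next(spec))
}
```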
This commit moves proxy kubernetes configuration
to a separate nested block to provide more fine
grained settings:
```yaml
auth:
  kubernetes_ca_cert_path: /tmp/custom-ca
proxy:
  enabled: yes
  kubernetes:
    enabled: yes
    public_addr: [custom.example.com:port]
    api_addr: kubernetes.example.com:443
    listen_addr: localhost:3026
```
1. The Kubernetes config section is explicitly enabled
or disabled. It is disabled by default.
2. The public address in the kubernetes section
is propagated to the tsh profile.
The other part of the commit updates the Ping
endpoint to send proxy configuration back to
the client, including the kubernetes public address
and ssh listen address.
Clients update their profiles according to the configuration
received from the proxy.
This is a helm chart for Teleport that conforms to [helm chart best practices](https://docs.helm.sh/chart_best_practices/) and various conventions seen in the official charts repository, so that it becomes easy-to-use and flexible enough to support many deployment scenarios.
Features:
- Locally testable on minikube
- Chart values for flexible configuration, instead of sourcing the raw teleport.yaml contained in the chart
- Automatically rolling-update the pods on configuration change according to the helm best practices
- Service and deployment ports more finely configurable
- Customizable service and ingress for exposing the proxy to the private network or the internet
- Use service annotations for integration with e.g. [external-dns](https://github.com/kubernetes-incubator/external-dns)
- Use ingress for integration with e.g. [aws-alb-ingress-controller](https://github.com/kubernetes-sigs/aws-alb-ingress-controller)
- Configurable pod annotations. Useful for IAM integration with kube2iam/kiam for example.
- Customizable pod assignment for security and availability
This issue updates #1986.
This is an initial, experimental implementation that will
be updated with tests and edge cases prior to the production 2.7.0 release.
Teleport proxy adds support for Kubernetes API protocol.
Auth server uses Kubernetes API to receive certificates
issued by Kubernetes CA.
Proxy intercepts and forwards API requests to the Kubernetes
API server and captures live session traffic, making
recordings available in the audit log.
Tsh login now updates kubeconfig configuration to use
Teleport as a proxy server.
Fixes #1671
* Add notes about TOS agreements for AMI
* Use specific UID for Teleport instances
* Use encrypted EFS for session storage
* Default scale up auto scaling groups to amount of AZs
* Move dashboard to local file
* Fix dynamo locking bug
* Move PID writing fixing enterprise pid-file
* Add reload method for teleport units
Demo monitoring stack sets up example monitoring
infrastructure:
* All nodes, auth servers and proxies
run telegraf alongside them, polling prometheus
diagnostic endpoints.
* Telegraf sends the data to InfluxDB database
* Grafana sets up cluster health dashboard
watching key teleport metrics - numbers of goroutines,
number of active sessions, file descriptors and so on.
* Fix IAM instance profiles assignments for proxy and nodes
* Add support for auth server certificate verification done by
nodes and proxies joining the cluster.
* Fix out of order events returned by auth servers in HA mode.
In HA mode, auth servers could return events out of order
when the events were sent to multiple auth servers, which confused
the user interface, which expects sorted events.
This commit fixes the problem by sorting events returned
by function SearchEvents.
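The ordering fix above amounts to merging results and sorting them by timestamp before returning. A minimal sketch, assuming ascending time order and a simplified `Event` type (both are assumptions; the real SearchEvents signature and sort direction are not shown in this log):

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Event is a hypothetical, simplified audit event.
type Event struct {
	ID   string
	Time time.Time
}

// at builds a UTC timestamp from Unix seconds (helper for the example).
func at(sec int64) time.Time {
	return time.Unix(sec, 0).UTC()
}

// mergeSorted combines event batches from several auth servers and
// returns them ordered by time, keeping the relative order of events
// that share a timestamp.
func mergeSorted(batches ...[]Event) []Event {
	var all []Event
	for _, batch := range batches {
		all = append(all, batch...)
	}
	sort.SliceStable(all, func(i, j int) bool {
		return all[i].Time.Before(all[j].Time)
	})
	return all
}

func main() {
	authA := []Event{{ID: "session.start", Time: at(10)}, {ID: "session.end", Time: at(30)}}
	authB := []Event{{ID: "exec", Time: at(20)}}
	for _, e := range mergeSorted(authA, authB) {
		fmt.Println(e.Time.Unix(), e.ID)
	}
}
```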
This is an MVP for an HA deployment of Teleport on AWS
* Using terraform
* EFS for audit log storage
* Proxies and auth servers in auto scaling group
* NLB for frontends
* Letsencrypt
Some users noticed that 'display' field is not well-documented for the
connectors.
I also noticed that some defaults are not sensible (like "google" as the
provider)
- Switched to new way of building Enterprise
- Removed `tctl tunnels` command (preparation for new resources)
- Removed `tctl auth ls` command (preparation for new resources)
Instead of trying to achieve a full "offline" operation, this commit
honestly converts previous attempts to a "caching access point client"
behavior.
Closes #554
What works:
1. You have to start all 3: node, proxy and auth.
2. Login using 'tsh' (so it will create a cert)
3. Then you can shut 'auth' down.
4. Proxy and node will stay up and tsh will be able to login.
What doesn't work:
1. Auth updates are not visible to proxy/node (like new servers)
2. Not sure if "trusted clusters" will work.
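The "caching access point client" behavior above can be sketched as a wrapper that serves the last successful response when the auth server is unreachable. This is an illustration only; `NodeLister`, `fakeAuth` and `cachingClient` are hypothetical names, and the sketch is not safe for concurrent use:

```go
package main

import (
	"errors"
	"fmt"
)

// NodeLister is a hypothetical slice of the auth server API.
type NodeLister interface {
	ListNodes() ([]string, error)
}

// fakeAuth simulates an auth server that can be shut down.
type fakeAuth struct {
	nodes []string
	down  bool
}

func (f *fakeAuth) ListNodes() ([]string, error) {
	if f.down {
		return nil, errors.New("auth server unavailable")
	}
	return f.nodes, nil
}

// cachingClient remembers the last successful answer from upstream.
type cachingClient struct {
	upstream NodeLister
	cached   []string
	hasCache bool
}

// ListNodes asks upstream first and falls back to the cached copy on error.
func (c *cachingClient) ListNodes() ([]string, error) {
	nodes, err := c.upstream.ListNodes()
	if err == nil {
		c.cached = append([]string(nil), nodes...)
		c.hasCache = true
		return nodes, nil
	}
	if c.hasCache {
		return append([]string(nil), c.cached...), nil
	}
	return nil, err
}

func main() {
	auth := &fakeAuth{nodes: []string{"node-1"}}
	client := &cachingClient{upstream: auth}
	fmt.Println(client.ListNodes())

	// Auth goes down and gains a node nobody can see through the cache.
	auth.down = true
	auth.nodes = append(auth.nodes, "node-2")
	fmt.Println(client.ListNodes())
}
```

The second call still succeeds but returns only `node-1`, which mirrors the limitation listed above: updates made on the auth server are not visible while it is down.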