#343 Official @swamp/ssh extension supporting multiple SSH transport styles
Opened by bixu · 5/13/2026
Problem statement
Some swamp users currently rely on the community @keeb/ssh extension for programmatic SSH from models and workflows. It supports traditional key-based auth with vault-stored keys, which covers the most common case but leaves several modern SSH topologies unaddressed:
- Tailscale SSH: tailscale ssh user@host uses tailnet identity (Tailscale ACL + IdP) instead of a static SSH key. This is increasingly common in production environments that have adopted Zero Trust networking. There is no clean way to express "SSH to this host via Tailscale" through @keeb/ssh today.
- Bastion / jump-host topologies: ssh -J <bastion> <target> is the standard way to reach hosts that aren't directly routable. @keeb/ssh doesn't expose ProxyJump configuration.
- Custom ProxyCommand: e.g., AWS SSM Session Manager (ssh -o ProxyCommand="aws ssm start-session ...") lets you SSH to EC2 instances with no inbound SSH port open. No first-class support today.
Because there is no official extension, each team that needs one of these styles either forks @keeb/ssh, writes their own bespoke model that shells out to ssh, or works around it with command/shell. This fragments the ecosystem and produces inconsistent vault handling, error reporting, and audit semantics.
Proposed solution
Provide an official @swamp/ssh extension that exposes a uniform model interface to multiple SSH transport styles. Callers specify the target host and a via (transport) selector; the model handles the underlying invocation.
Sketch of the interface:
// Traditional SSH (parity with @keeb/ssh)
{ via: "key", host: "...", user: "...", vaultKey: "..." }
// Tailscale SSH
{ via: "tailscale", host: "...", user: "..." }
// Bastion / jump host
{ via: "bastion", host: "...", user: "...", bastion: "user@bastion-host", vaultKey: "..." }
// ProxyCommand (e.g. AWS SSM)
{ via: "proxy-command", host: "...", user: "...", proxyCommand: "aws ssm start-session --target i-..." }
Methods at minimum:
- run: run a single command, capture stdout/stderr/exit
- script: upload and execute a script
- copy: scp/rsync-style file transfer
Vault handling should match the existing swamp convention (read-secret, never write to disk; pipe into ssh-agent or use file-descriptor substitution where the underlying tool requires a path).
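To make the via selector concrete, here is a minimal TypeScript sketch of how the four shapes above could map to an argv. It is an illustration only, not the extension's API: the Target type, the buildArgv function, and the keyPath field are hypothetical. keyPath stands in for whatever path or file descriptor the vault layer would hand over in place of vaultKey.

```typescript
// Hypothetical types mirroring the interface sketch; not a published @swamp/ssh API.
// Assumption: keyPath is the path (or /dev/fd entry) supplied by the vault layer.
type Target =
  | { via: "key"; host: string; user: string; keyPath: string }
  | { via: "tailscale"; host: string; user: string }
  | { via: "bastion"; host: string; user: string; bastion: string; keyPath: string }
  | { via: "proxy-command"; host: string; user: string; proxyCommand: string };

// Build the argv that runs `command` on the target.
// Returning an array instead of a shell string avoids local quoting problems.
function buildArgv(t: Target, command: string): string[] {
  const dest = `${t.user}@${t.host}`;
  switch (t.via) {
    case "key":
      return ["ssh", "-i", t.keyPath, dest, "--", command];
    case "tailscale":
      return ["tailscale", "ssh", dest, "--", command];
    case "bastion":
      // -J is OpenSSH's ProxyJump shorthand.
      return ["ssh", "-i", t.keyPath, "-J", t.bastion, dest, "--", command];
    case "proxy-command":
      return ["ssh", "-o", `ProxyCommand=${t.proxyCommand}`, dest, "--", command];
  }
}

const argv = buildArgv({ via: "tailscale", host: "edge-1", user: "deploy" }, "uptime");
console.log(argv.join(" "));
// prints: tailscale ssh deploy@edge-1 -- uptime
```

Modelling the transports as a discriminated union means a caller cannot pass a bastion without a via of "bastion", and the compiler flags any transport the switch forgets to handle.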
Alternatives considered
- Continue with @keeb/ssh: community-maintained, no Tailscale or bastion support, no roadmap commitment.
- Each consumer wires their own SSH model: duplicated logic, inconsistent vault handling, harder to audit fleet-wide.
- Use command/shell everywhere: explicitly discouraged by swamp guidance ("command/shell is for ad-hoc one-off shell commands, NEVER for wrapping CLI tools or building integrations").
Context
Filed in the context of a project that needs a swamp posture model that audits our fleet's SSH configuration; today we'd reach for @keeb/ssh, but it can't carry a Tailscale-SSH connection — so we're either forking it or shelling out, neither of which is a good long-term answer.
Closed
adam commented 5/19/2026, 12:57:57 AM
Okay, it's trending toward something like this:
10 hosts: 6 web (4 prod, 2 staging), 2 db (prod, via bastion ProxyJump), 2 edge (prod, Tailscale).
Fleet definition — fleets/awesome.yaml
modelType: "@swamp/ssh-fleet"
modelName: awesome
globalArguments:
  name: awesome
  transport:
    kind: ssh
    user: deploy
    identityFile: ~/.ssh/awesome_ed25519
    knownHostsFile: ~/.ssh/awesome_known_hosts
    strictHostKeyChecking: accept-new
    connectTimeoutSec: 10
    controlMaster: { enabled: true, persistSec: 600 }
  defaultParallel: 8
  captureOutput: true
  hosts:
    - name: web-1
      address: web-1.prod.example.com
      tags: [web, prod]
      attrs: { region: us-east-1, role: api }
    - name: web-2
      address: web-2.prod.example.com
      tags: [web, prod]
      attrs: { region: us-east-1, role: api }
    - name: web-3
      address: web-3.prod.example.com
      tags: [web, prod]
      attrs: { region: us-east-1, role: api }
    - name: web-4
      address: web-4.prod.example.com
      tags: [web, prod]
      attrs: { region: us-east-1, role: api }
    - name: web-5
      address: web-5.staging.example.com
      tags: [web, staging]
      attrs: { region: us-east-1, role: api }
    - name: web-6
      address: web-6.staging.example.com
      tags: [web, staging]
      attrs: { region: us-east-1, role: api }
    - name: db-1
      address: 10.0.5.21
      tags: [db, prod]
      attrs: { region: us-east-1, role: postgres }
      transport:
        proxyJump: deploy@bastion.prod.example.com
    - name: db-2
      address: 10.0.5.22
      tags: [db, prod]
      attrs: { region: us-east-1, role: postgres }
      transport:
        proxyJump: deploy@bastion.prod.example.com
    - name: edge-1
      address: edge-1 # short tailnet name
      tags: [edge, prod]
      attrs: { region: eu-west-1, role: edge }
      transport: { kind: tailscale, user: deploy }
    - name: edge-2
      address: edge-2
      tags: [edge, prod]
      attrs: { region: eu-west-1, role: edge }
      transport: { kind: tailscale, user: deploy }
Apply + warm masters
swamp model apply -f fleets/awesome.yaml
swamp model run awesome open --json # opens CM for the 8 ssh hosts; no-op for edge-1/edge-2
After apply, ten host-* resources exist, each tagged {fleet: awesome}.
1. exec uptime on every prod host (mix of ssh + tailscale)
swamp model run awesome exec \
--arg hosts='"prod" in host.tags' \
--arg command='uptime' \
--json
CEL matches web-1..4, db-1..2, edge-1..2 (8 hosts). web-5 / web-6 are staging and are skipped.
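As a sanity check on that count, the selector can be mimicked with a plain predicate. The TypeScript sketch below is not CEL and not how swamp evaluates selectors. The Host shape, the host helper, and isProd are hypothetical stand-ins built from the fleet YAML above.

```typescript
// Hypothetical host shape, based on the fleet YAML earlier in this comment.
interface Host {
  name: string;
  tags: string[];
  transport: "ssh" | "tailscale";
  region: string;
}

// Small helper to keep the fleet literal short.
function host(name: string, tags: string[], transport: Host["transport"], region: string): Host {
  return { name, tags, transport, region };
}

const fleet: Host[] = [
  host("web-1", ["web", "prod"], "ssh", "us-east-1"),
  host("web-2", ["web", "prod"], "ssh", "us-east-1"),
  host("web-3", ["web", "prod"], "ssh", "us-east-1"),
  host("web-4", ["web", "prod"], "ssh", "us-east-1"),
  host("web-5", ["web", "staging"], "ssh", "us-east-1"),
  host("web-6", ["web", "staging"], "ssh", "us-east-1"),
  host("db-1", ["db", "prod"], "ssh", "us-east-1"),
  host("db-2", ["db", "prod"], "ssh", "us-east-1"),
  host("edge-1", ["edge", "prod"], "tailscale", "eu-west-1"),
  host("edge-2", ["edge", "prod"], "tailscale", "eu-west-1"),
];

// Stand-in for the CEL selector `"prod" in host.tags`.
function isProd(h: Host): boolean {
  return h.tags.indexOf("prod") !== -1;
}

const matched: string[] = fleet.filter(isProd).map((h) => h.name);
console.log(matched.length, matched.join(" "));
// prints: 8 web-1 web-2 web-3 web-4 db-1 db-2 edge-1 edge-2
```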
Resources written by this single call, one per matched host:
run-exec-web-1 run-exec-web-2 run-exec-web-3 run-exec-web-4
run-exec-db-1 run-exec-db-2
run-exec-edge-1 run-exec-edge-2
Edge-1's record:
{
"method": "exec",
"host": "edge-1",
"transport": "tailscale",
"startedAt": "2026-05-18T17:42:11.108Z",
"finishedAt": "2026-05-18T17:42:11.864Z",
"durationMs": 756,
"exitCode": 0,
"signal": null,
"stdout": " 17:42:11 up 12 days, 4:11, 0 users, load average: 0.08, 0.10, 0.09\n",
"stderr": "",
"args": { "command": "uptime" },
"argv": ["tailscale", "ssh", "deploy@edge-1", "--", "uptime"]
}
vs. the ssh equivalent for web-1:
{
"method": "exec",
"host": "web-1",
"transport": "ssh",
"exitCode": 0,
"stdout": " 17:42:11 up 6 days, 8:02, 0 users, load average: 0.42, 0.51, 0.58\n",
"argv": [
"ssh",
"-o", "ControlMaster=auto",
"-o", "ControlPath=/run/user/1000/swamp-ssh/awesome/8f3c…sock",
"-o", "ControlPersist=600",
"-i", "/home/adam/.ssh/awesome_ed25519",
"-p", "22",
"deploy@web-1.prod.example.com",
"--", "uptime"
]
}
2. copy nginx.conf to all us-east-1 web hosts
swamp model run awesome copy \
--arg hosts='"web" in host.tags && host.attrs.region == "us-east-1"' \
--arg src=./nginx.conf \
--arg dst=/etc/nginx/nginx.conf \
--arg direction=to \
--json
CEL matches all six web-* hosts (prod + staging are both us-east-1). Six per-host resources:
run-copy-web-1 run-copy-web-2 run-copy-web-3
run-copy-web-4 run-copy-web-5 run-copy-web-6
web-1's record (scp reuses the master socket; the argv shows it):
{
"method": "copy",
"host": "web-1",
"transport": "ssh",
"exitCode": 0,
"durationMs": 412,
"args": { "src": "./nginx.conf", "dst": "/etc/nginx/nginx.conf", "direction": "to" },
"argv": [
"scp",
"-o", "ControlPath=/run/user/1000/swamp-ssh/awesome/8f3c…sock",
"-i", "/home/adam/.ssh/awesome_ed25519",
"-P", "22",
"./nginx.conf",
"deploy@web-1.prod.example.com:/etc/nginx/nginx.conf"
]
}
Same thing from a workflow
In YAML the selector is just a string — no quoting acrobatics, and swamp's
evaluator leaves it alone because there's no ${{ … }} to trigger it:
# workflows/reload-prod-web.yaml
name: reload-prod-web
on: manual
jobs:
  reload:
    steps:
      - name: ship config
        model: awesome
        method: copy
        arguments:
          hosts: '"web" in host.tags && "prod" in host.tags'
          src: ./nginx.conf
          dst: /etc/nginx/nginx.conf
          direction: to
      - name: reload nginx
        model: awesome
        method: exec
        arguments:
          hosts: '"web" in host.tags && "prod" in host.tags'
          command: systemctl reload nginx
          sudo: true
A few selector variants worth knowing
host.transport == "tailscale" # only edge-1, edge-2
host.attrs.role == "postgres" # only db-1, db-2
"prod" in host.tags && host.transport == "ssh" # everything prod except the edges
host.name.startsWith("web-") && size(host.tags) > 1 # all web hosts (all have 2 tags)
host.attrs.region != "us-east-1" # only the two tailscale edges
adam commented 5/19/2026, 12:58:45 AM
You can override everything on a per-method-run basis, so it all works from the CLI, and you could even get by without static definitions of hosts if you wanted
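One way to picture that layering is a merge where later layers win: fleet default, then the host entry, then the per-run override. The TypeScript sketch below is an assumption about the semantics, not a description of the implementation. TransportSettings and resolveTransport are hypothetical names, the shallow merge is a guess, and the "ops" user is made up for illustration.

```typescript
// Hypothetical transport settings; field names follow the fleet YAML above.
interface TransportSettings {
  kind?: "ssh" | "tailscale";
  user?: string;
  identityFile?: string;
  proxyJump?: string;
  connectTimeoutSec?: number;
}

// Later layers win: fleet default < host entry < per-run override.
// Assumption: a shallow merge; the real extension may merge differently.
function resolveTransport(
  fleetLayer: TransportSettings,
  hostLayer: TransportSettings = {},
  runLayer: TransportSettings = {},
): TransportSettings {
  return { ...fleetLayer, ...hostLayer, ...runLayer };
}

const fleetDefault: TransportSettings = {
  kind: "ssh",
  user: "deploy",
  identityFile: "~/.ssh/awesome_ed25519",
  connectTimeoutSec: 10,
};

// db-1 only sets proxyJump and inherits everything else from the fleet.
const db1 = resolveTransport(fleetDefault, { proxyJump: "deploy@bastion.prod.example.com" });

// A one-off run that overrides the user and timeout from the CLI ("ops" is made up).
const oneOff = resolveTransport(fleetDefault, {}, { user: "ops", connectTimeoutSec: 30 });

console.log(db1.kind, db1.user, db1.proxyJump);
// prints: ssh deploy deploy@bastion.prod.example.com
console.log(oneOff.user, oneOff.connectTimeoutSec);
// prints: ops 30
```

This matches the selector example where host.transport == "ssh" still covers db-1 and db-2, since they inherit kind: ssh from the fleet default. A shallow merge would also let a tailscale host inherit ssh-only fields such as identityFile, so a real implementation would probably need to drop or ignore those.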
bixu commented 5/19/2026, 8:22:06 AM
The YAML pattern here (for the model/workflows) looks great. One thing to note for our use-case is that we'd need to support ssh-agent and password as well as keyfiles.
evrardjp commented 5/19/2026, 9:47:58 AM
I like this idea of making the transport more flexible.
Just curious, if we make the transport quite flexible, what's the point of tying this to SSH? Would it be bad to make it even more generic?
Many configuration management tools in the past built their own transport mechanisms (with different levels of success) that we could tap into...
If you are running things locally, your transport might simply be local execution. If you run something over RPC (or any non-ssh tunnel), you might want to use that as the transport mechanism. Those should ideally be implementable from a base, reusable transport mechanism, don't you think?
At the same time, a generic transport might become too open-ended, compared to an "ssh transport system with variants" ...
adam commented 5/19/2026, 8:17:11 PM
@bixu - should be no problem to have ssh-agent. password too, I suppose (although I have to admit, I don't know why you would do that - but who am I to judge?)
@evrardjp - I think it might be a little too wacky to move beyond ssh as the transport. But once it exists, it would be easy enough to port the pattern.
bixu commented 5/20/2026, 10:35:12 AM
"I don't know why you would do that"
We are digging out of the brownest of brownfield systems at $job 😅