SSH Emergency Access

(smallstep.com)

165 points | by based2 38 days ago

14 comments

  • timeattack 37 days ago

    It's a cool and interesting application of the technology, but it doesn't really seem practical.

    When you're unable to access a machine using your standard SSH keys, it usually means you're unlikely to be able to log in remotely by other means either.

    As an emergency login there are two common options:

    * in case of cloud: use remote VM console provided by the hosting provider.

    * in case of bare-metal: use IPMI to access machine console directly.

    • tashian 37 days ago

      Hey there — I'm the author of this post.

      There's a few scenarios where I imagined this approach being useful:

      * If you have any kind of remote dependency in your SSH auth flow (LDAP, or an online CA, or automated Ansible playbooks to push keys), any of those might fail and render the host otherwise inaccessible.

      * It's becoming more common to not ever SSH into machines. So, what if emergency SSH access is the only way to access a host? Some companies even go a few steps further: When a host is SSH'd into, it is considered "tainted by humans", is quarantined and eventually shut down.

      * Some hosts should never allow root access to anyone. For example, there's no reason for anyone to have root on a bastion host. So, what if the only way to get root on some hosts is with the emergency key?

      While you could use the cloud VM console for emergency access in these cases, having a hardware key provides even more security and would let you turn off cloud VM access.

      Of course if you broke your SSHD config, or have a network issue that prevents you from reaching the host, this won't magically fix any of that. IPMI is good for that though.

      • peterwwillis 37 days ago

        > While you could use the cloud VM console for emergency access in these cases, having a hardware key provides even more security and would let you turn off cloud VM access.

        I'm not sure it's more secure, but I suppose it depends on the provider. Your control of your account's admin key (or password) is the last bastion of security for most providers.

        > Of course if you broke your SSHD config, or have a network issue that prevents you from reaching the host, this won't magically fix any of that. IPMI is good for that though.

        This is why I just use the providers' emergency management (or IPMI). Easier to have one method of emergency access that always works regardless of the guest. The guest's root (or emergency) account can still have a pretty darned complex password.

        • bubblesorting 36 days ago

          > It's becoming more common to not ever SSH into machines

          This is a reality for me. At work we run a handful of distributed clusters. If anyone does the equivalent of sshing into a box and poking around (in our case, `kubectl exec`), the infrastructure team gets an alert, then follows up with whoever invoked the command. If they are doing debugging, we shift whatever resources they need into dev. If they are not debugging, they will probably get questioned by their boss. (Fortunately, most of the time this chat results in, "oh wow, I didn't know about the APM/Metrics/Graphs/Logs/etc. setup we had, I'll check that next time.")

          • CodeWriter23 37 days ago

            There’s also the access control feature of this approach. You can give someone temp access to a host.

            • nokya 37 days ago

              That's exactly the reason why we use certificate based ssh access at my employer's: for our suppliers. It looks like the author went far away to find alternate reasons to deploy this :/

            • ThePowerOfFuet 37 days ago

              > Valid: from 2020-06-24T16:53:03 to 2020-06-24T16:03:03

              Almost! I think you meant:

              Valid: from 2020-06-24T16:53:03 to 2020-06-24T17:03:03

            • isatty 37 days ago

              Yep, the most common way I've lost access to machines is by messing up the iptables/ipfw rules. Read a post here about avoiding that by having a timed reset with sleep.

              • lazyant 37 days ago

                For people asking: you can create a resetfw.sh script, for iptables:

                  #!/bin/bash
                
                  iptables -P INPUT ACCEPT  
                  iptables -P FORWARD ACCEPT  
                  iptables -P OUTPUT ACCEPT  
                  iptables -t nat -F  
                  iptables -t mangle -F  
                  iptables -F  
                  iptables -X
                
                chmod +x resetfw.sh

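                and add it, for example, to the /etc/cron.hourly directory (on Debian-style systems, drop the .sh suffix, since run-parts skips file names containing a dot).

                This way you can test your iptables rules and they'll get cleared every hour. Once you've checked they are OK you can delete this cronjob.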

                (NOTE: I'm typing from memory, haven't tested this)

                • benedikt 37 days ago

                  https://manpages.debian.org/stretch/iptables/iptables-apply....

                  Or use `at` to run `iptables-restore`. Simpler than setting up a cronjob (and if you're doing it manually, cron has a bunch of gotchas that at least bite me in the ass once in a blue moon).
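
                  A rough sketch of the `at` approach, for anyone who wants it spelled out. The function names and the backup path are made up for illustration, it needs root and a running atd, and I haven't run it against a production firewall:

                  ```shell
                  # Hypothetical helpers; names and paths are assumptions, not from any tool.

                  # Print the command that restores a previously saved ruleset.
                  rollback_command() {
                      printf "iptables-restore < '%s'\n" "$1"
                  }

                  # Save the current rules, then queue a restore N minutes from now.
                  # Requires root, iptables-save, and the at daemon.
                  schedule_rollback() {
                      backup=$1
                      minutes=$2
                      iptables-save > "$backup" || return 1
                      rollback_command "$backup" | at now + "$minutes" minutes
                  }
                  ```

                  Usage would be something like `schedule_rollback /root/iptables.known-good 10` before loading the new rules. If you can still log in afterwards, find the pending job with `atq` and cancel it with `atrm`.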

                  • lazyant 37 days ago

                    Yes. Although IIRC the iptables-* commands are distro-specific, as in not all distros have (or had) them. (It may have changed; I haven't looked "recently".)

                  • metiscus 37 days ago

                    You might add a daily task to remove that task just in case you forget. That way you avoid lockout but don't end up opening yourself up accidentally.

                    • taftster 37 days ago

                      Or possibly just turn iptables off, in the same cron.hourly.

                      • lazyant 37 days ago

                        Ah yes, that's simpler: systemctl stop iptables. Also need to do systemctl disable iptables just in case, otherwise if the server reboots the iptables service will restart.

                    • oars 37 days ago

                      This has happened to me as well. Where could I read about this method?

                      • raincom 37 days ago

                        Maybe this:

                           service network stop && sleep 10 && service network start
                        • wiredfool 37 days ago

                          The worst is: sudo ifdown eth0 && ifup eth0

                      • leoh 37 days ago

                        Link?

                      • johnklos 37 days ago

                        IPMI is painfully insecure, and therefore assumes the existence of a completely separate, protected network. Some people don't colocate more than a few machines (and therefore can't justify the extra infrastructure for an IPMI OOB network), don't want to pay extra for a colo provider to provide IPMI OOB, and/or don't trust their colo provider to have access to such a sensitive and insecure thing.

                        Having an emergency method to connect is an excellent idea.

                        • sidpatil 37 days ago

                          Not sure about other vendors, but I know Cisco offers dial-in capabilities for managing routers, switches, etc. The dial-in modem on the router is connected to a landline.

                          Has this approach ever been taken by server admins?

                          • tyingq 37 days ago

                            A standard for emergency IPMI or other console-type access would be welcome. Vendors have certainly done a bad job in this space. Break-glass type access isn't a new thing.

                            • kevincox 37 days ago

                              I think it depends. I've worked in places that had something like the following setup.

                              - Hardware in datacenters with operators who were not experts on the applications running.

                              - All remote access was done using short-term (~1 day) SSH keys. There was an authentication service to generate these.

                              It was pretty easy to imagine that the authentication service would go down. In this case a selection of people who worked on the infrastructure had longer-term keys on HSMs. (With very high logging and alerting for any use). It would actually make sense for these to be CA keys so that they could access different user accounts or similar.

                              TL;DR you are assuming a very basic SSH auth setup. As the regular setup gets more complicated having something like this as a backup makes sense.

                              • marcosdumay 37 days ago

                                > All remote access was done using a short term (~1 day) ssh keys. There was an authentication service to generate these.

                                This is weird. Really weird.

                                Did that service use a more secure authentication storage than a password protected key?

                                • jon-wood 37 days ago

                                  It’s really not - by limiting the life of keys, and having a service generating them, you can more effectively lock things down when someone leaves, rather than going round revoking keys from servers. Something we’re experimenting with at work is AWS Instance Connect, which uses your AWS credentials to push a key to a target instance with 1 minute validity - no more managing keys on instances, and revoking access is just a change to an IAM policy.

                                  • dsr_ 37 days ago

                                    As opposed to having a few bastion-hosts, and requiring people to log in there in order to then ssh on to their final destinations -- in that case, revoking their keys is as simple as wiping their accounts on the bastion hosts.

                                    • jon-wood 36 days ago

                                      Even with a few bastion hosts things get hard to track quickly as you end up with multiple clusters (dev/staging/UAT/production), and potentially multiple production clusters in different regions.

                                  • Spooky23 37 days ago

                                    It seems weird but has several advantages. Most places screw up defunct account cleanup and privilege management.

                                    A process like this allows you to ensure that people have the access they need and makes it easy to get them the privilege separation needed.

                                    • kevincox 37 days ago

                                      Yes, the system used multi-factor auth and could be locked for suspicious activities.

                                • ed25519FUUU 37 days ago

                                  Here's what I like to do on my server(s) in cron, which pulls my keys from github:

                                      @hourly <username> ssh-import-id gh:<github username>
                                  
                                  If I lose my keys to this host, I can simply update github.com with my new ones and go to lunch. I'll be able to login again shortly.

                                  And on all of my hosts:

                                      @reboot <username> ssh-import-id gh:<github username>
                                  
                                  This is REALLY helpful on devices like raspberry pi, where they may stay shutdown / offline for years. The minute they're powered up again they'll get my fresh keys and I can login to them without needing a console.

                                  http://manpages.ubuntu.com/manpages/bionic/man1/ssh-import-i...

                                  • cbb330 37 days ago

                                    Why not use certificates as your primary authentication for SSH? Facebook has a great blog post on implementing this at scale: https://engineering.fb.com/security/scalable-and-secure-acce...

                                    • tashian 37 days ago

                                      Yes! Shameless plug — we (smallstep.com) offer a service that makes this frictionless at scale and super easy to set up. You'll never want to go back to public keys.

                                      • theatrus2 37 days ago

                                        If you’re on AWS and have credentials for users there, you can also run bless

                                        https://github.com/Netflix/bless

                                      • masonhensley 37 days ago

                                        Neat. Something I feel often gets overlooked in most SaaS systems (think internal side), be it customer service, ops, etc. tooling, is break-the-glass escalation functionality. Most systems I’ve seen in the wild completely lack this, which results in over-provisioning of admin “god mode” accounts.

                                        NoodlesUK points out alerting which is a pretty important concept to incorporate.

                                        Largely a solved concept in Electronic Medical Records & as outlined in the post.

                                        • noodlesUK 38 days ago

                                          I think another thing we might want to learn about is how to sound the alarm when the break glass is used. Is there an easy way of doing that with SSH? Running a command to page the ops/security team when a server receives a login attempt with an emergency credential?

                                          • withinboredom 38 days ago

                                            You can physically put the (yubikey) device in a vault that will physically sound an alarm when opened. It could also have a battery-powered arduino inside the box (with SIM breakout) that texted the devops team when opened.

                                            • chrisweekly 38 days ago

                                              No idea why you were downvoted; it seems like a reasonable idea to me. (Also, IMHO downvoting a good-faith comment like yours is a lazy alternative to posting a substantive response.)

                                              • BillinghamJ 37 days ago

                                                Overcomplex technical solution to a simple problem.

                                                Besides which, if you really want to go full-on with technically clever solutions, keep in mind you could ensure no cellular service prior to opening. But then we're just getting into the realms of silly situations.

                                                • dcow 37 days ago

                                                  Would you attempt to use the key if you knew the CTO, SRE and OPS teams were paged as soon as the safe was accessed?

                                            • rsync 37 days ago

                                              "I think another thing we might want to learn about is how to sound the alarm when the break glass is used. Is there an easy way of doing that with SSH?"

                                              Yes - quite simple and old-fashioned, actually ...

                                              I have this line in the SSH users' .login file:

                                                /usr/local/sbin/sms 4153331111 4158882222 "USER LOGIN TO XXX - $DATE" >& /dev/null
                                              
                                              ... where the 'sms' command, above, is a shell script I wrote to call twilio messaging with the curl command. A very simple example of that would be:

                                                curl -X POST -d "Body=$msg" -d "From=$from" -d "To=$to" "https://api.twilio.com/2010-04-01/Accounts/$accountsid/Messages" -u "$accountsid:$authtoken"
                                              
                                              ... and this works like a charm.

                                              Alternatively, you could rick-roll your on-call sysadmin:

                                                /usr/local/bin/curl -XPOST https://api.twilio.com/2010-04-01/Accounts/$accountsid/Calls.json --data-urlencode "To=$number" --data-urlencode "From=$callerid" --data-urlencode "Url=http://demo.twilio.com/docs/voice.xml" -u $accountsid:$authtoken
                                              
                                              (the voice.xml demo is, in fact, Rick Astley)
                                            • tashian 37 days ago

                                              Hey there, I wrote this post. It's a great question.

                                              One benefit of using certificates for emergency access is that SSHD logging can be configured to show a lot more detail about the certificate that was used. With public keys, there isn't anything to show. But with certificates you have a key ID, serial number, principals, CA fingerprint, etc. So, that log is a good hook for sounding the alarm. A more advanced version of this would allow you to record a reason for using the emergency access key when the connection is made (or when sudo is used).
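
                                              As a minimal sketch of using that log as a hook: the two helpers below (hypothetical names) pick certificate logins out of sshd log lines. This assumes the "Accepted publickey ... ID <key id> (serial N) CA ..." line shape that OpenSSH writes for certificate logins, so check it against your own logs before relying on it:

                                              ```shell
                                              # Hypothetical helpers; the log line shape is an assumption to verify locally.

                                              # Succeeds when an sshd log line records a certificate-based login.
                                              is_cert_login() {
                                                  printf '%s\n' "$1" | grep -Eq 'Accepted publickey .* ID .+ \(serial [0-9]+\) CA '
                                              }

                                              # Print the certificate key ID (the -I value given when the cert was signed).
                                              cert_key_id() {
                                                  printf '%s\n' "$1" | sed -n 's/.* ID \(.*\) (serial [0-9][0-9]*) CA .*/\1/p'
                                              }
                                              ```

                                              You could feed these from something like `tail -F` on the auth log (the path varies by distro) and page someone whenever `is_cert_login` succeeds with the emergency key ID.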

                                              • jlgaddis 37 days ago

                                                > With public keys, there isn't anything to show.

                                                There's, at minimum, client IP address, username, and the key fingerprint -- which has always been good enough for me.

                                                There might be even more details available but I'm not sitting in front of a computer to check.

                                              • gnufx 37 days ago

                                                I don't know what it looks like for a certificate-based system, but syslog records the fingerprint of the public key used for login in a fairly vanilla Debian. If you worry about things like that and aren't looking at physical access (as suggested elsewhere), you presumably have remote syslog and audit which you can check.

                                                • lormayna 38 days ago

                                                  You can use pam-hooks module to execute scripts at login/logout.
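
                                                  A minimal sketch of what such a hook script might look like. I'm assuming the stock pam_exec module here, which may or may not be what "pam-hooks" refers to; the script path and function names are hypothetical:

                                                  ```shell
                                                  # Hypothetical script, e.g. /usr/local/sbin/login-alert, wired up with a line such as
                                                  #   session optional pam_exec.so /usr/local/sbin/login-alert
                                                  # in /etc/pam.d/sshd (path and file name are assumptions).

                                                  # Build one alert line from the environment variables pam_exec exports.
                                                  pam_alert_line() {
                                                      printf 'ssh session %s: user=%s rhost=%s' \
                                                          "${PAM_TYPE:-unknown}" "${PAM_USER:-unknown}" "${PAM_RHOST:-unknown}"
                                                  }

                                                  # Send the alert to syslog; swap logger for a pager call if you prefer.
                                                  send_pam_alert() {
                                                      logger -p auth.alert -t login-alert "$(pam_alert_line)"
                                                  }
                                                  ```

                                                  The real script would end with a call to `send_pam_alert`. With `session` hooks, PAM_TYPE is `open_session` at login and `close_session` at logout, so you can tell the two apart.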

                                                  • aidos 37 days ago

                                                    You can add a script at ~/.ssh/rc that’s run on each login. You’d need to be careful to make sure it couldn’t be changed if you were relying on it for notifications.

                                                    • gnufx 37 days ago

                                                      How do you record the key used from that (assuming that's what's required)?

                                                    • nix23 38 days ago

                                                      Maybe monitor the Emergency Machine itself? If it boots up, emergency credentials are probably used?

                                                      But really good point, and i love the analogy to 'break glass'

                                                      • andylynch 37 days ago

                                                        There are tools like Powerbroker which do this, and also privileged access management more generally - popular at banks and the like. Also SSH (the company)

                                                        • 8organicbits 37 days ago

                                                          Login shell for emergency accounts could be a script that "sounds the alarm" and then drops to a bash shell.

                                                          Edit: ooo @rsync just gave another good approach
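
                                                          The login-shell idea could look roughly like this. It's a sketch only: the function names and the `emergency-login` syslog tag are made up, and it logs via `logger` rather than paging anyone:

                                                          ```shell
                                                          # Hypothetical login-shell wrapper for an emergency account.

                                                          # Build the alert text from a user name and a client address.
                                                          alert_text() {
                                                              printf 'EMERGENCY LOGIN user=%s from=%s' "$1" "${2:-unknown}"
                                                          }

                                                          # Raise the alarm, then hand over to a normal bash login shell.
                                                          # sshd sets SSH_CONNECTION; its first field is the client address.
                                                          emergency_shell() {
                                                              logger -p auth.alert -t emergency-login \
                                                                  "$(alert_text "$(id -un)" "${SSH_CONNECTION%% *}")"
                                                              exec /bin/bash -l "$@"
                                                          }
                                                          ```

                                                          Saved as an executable script that ends with `emergency_shell "$@"`, this could be set as the account's shell. The alert fires before the user gets a prompt, so it can't be skipped from inside the session.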

                                                          • jlgaddis 37 days ago

                                                            Generating alerts from syslog messages is something that we've been doing for decades.

                                                          • modinfo 37 days ago

                                                            What a coincidence: 3 days ago I ordered two YubiKey 5s, today the package arrived, and today I read a post on how to use them in an interesting way for emergency access to my server via SSH.

                                                            I'd like to add that the way it's described really works.

                                                            But... now I don't know whether to set one YubiKey aside in case I need it for emergency SSH access. I've had a server since 2011 and have never had problems with SSH access; I use the same keys to this day and everything works.

                                                            I think using a YubiKey for emergency access this way is overkill.

                                                            It's just an interesting way to use yubikey.

                                                            • danmur 37 days ago

                                                              If you need the option to give someone temporary access, it seems like a good option. I don't think it would add anything to my personal stuff since there's no reason I can think of to give someone else access. At work definitely.

                                                              • dcow 37 days ago

                                                                Right, this is more about a cryptographic grant of temporary emergency access to someone who doesn't have a user account or admin keys already on a machine (and ideally nobody should have persistent admin access in a well-oiled production setting) in the event that existing access control mechanisms have failed. And backing the signing operations by a YubiKey lets you physically secure the key in ways that you wouldn't an entire laptop and provides all the benefits of tamper resistant, proximity aware, hardware. Probably not something most people will want or need to bother with for personal stuff, but very reasonable expectation as soon as you're working on a team or managing many hosts, etc.

                                                            • alexandrerond 37 days ago

                                                              I'm very confused. YubiKeys have smart card functionality and can be used by gpg-agent to SSH with a regular GPG key (you can add it to authorized_keys just like any other key), so you don't have to go through this whole mess of setup to create a CA and install it.

                                                              What am I missing?

                                                              • munchbunny 37 days ago

                                                                It's a chicken and egg problem: if you can't SSH into the machine, how do you add your key to the SSH config on the target machine?

                                                                You could use a very long lived key, but then as soon as you have multiple people who might need production SSH access, you've got access control and revocation issues. The SSH CA is a good minimal solution, because the CA can issue only short-lived SSH keys (few hours at a time) that you use once and throw away. Also, CA trust scales better because it moves user management burden to the certificate issuing process and removes the need to modify the SSH config every time you onboard a new user.

                                                                It's a pretty standard practice. Here's a post from Facebook about it from several years ago. This post is just about how to do it using YubiKeys. https://engineering.fb.com/security/scalable-and-secure-acce...
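
                                                                For reference, issuing a short-lived cert with plain `ssh-keygen` looks roughly like the sketch below. This assumes a CA key stored in a file rather than on a YubiKey as in the post, and the function names are hypothetical:

                                                                ```shell
                                                                # Hypothetical helpers around ssh-keygen; a file-based CA key is assumed.

                                                                # Sign an existing public key with the CA key.
                                                                # Arguments: CA private key, user public key, key ID, principal, minutes valid.
                                                                issue_emergency_cert() {
                                                                    ca_key=$1 user_pub=$2 key_id=$3 principal=$4 minutes=$5
                                                                    ssh-keygen -q -s "$ca_key" -I "$key_id" -n "$principal" -V "+${minutes}m" "$user_pub"
                                                                }

                                                                # Print the certificate details (key ID, serial, principals, validity window).
                                                                show_cert() {
                                                                    ssh-keygen -L -f "$1"
                                                                }
                                                                ```

                                                                Signing `id_ed25519.pub` writes `id_ed25519-cert.pub` next to it. On the server side, the host only needs the CA public key listed via `TrustedUserCAKeys` in sshd_config, which is why no per-user changes are needed later.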

                                                                • harikb 37 days ago

                                                                  This can also work as a solution where the “setup” (of trusting CA) is baked in to the image. Then there is no ssh related setup until the day you actually need to ssh to the host. And you get the guarantee that no ssh login can happen until you issue a temp-pair.

                                                                  This is actually quite useful for deploying clusters of machines that one doesn’t want normal ssh access until there is a real need. I think this was also mentioned in another comment

                                                                • tashian 37 days ago

                                                                  That sounds like a great option too, depending on your situation.

                                                                  One difference is that the CA is on the hardware key, but the cert (and its private key) is not.

                                                                  Imagine you're on a team of 50, and anyone on the team might need emergency access to a host at some point. You wouldn't want to buy 50 keys and 50 safes. Just designate a couple folks to manage emergency access. They can manually mint a cert for a colleague as needed, and send it over a secure channel. No security key needed to use the cert, and it self-destructs after a few minutes.

                                                                • aaronmdjones 37 days ago

                                                                  To add to the blog post; you don't need brew or step or any of that nonsense to inspect certificates.

                                                                      $ ssh-keygen -Lf the-cert.pub
                                                                  • dcow 37 days ago

                                                                    You can, but `ssh-keygen` is about as nice to use as `openssl` which practically means you spend a lot of time with your head in the manual. The `step` tools have a nicer UI:

                                                                        $ step ssh inspect the-cert.pub
                                                                    
                                                                    Also the post already mentions that using `step` instead of `ssh-keygen` is optional, so I'm not sure why you feel the need to repeat it...
                                                                    • aaronmdjones 37 days ago

                                                                      Right, you'd have to look up the switches in the manpage if you don't remember them, but that's already the case with the generation portion, which is why the post includes the switches for that. I'm just saying it could have included the inspect switches too.

                                                                      • dcow 37 days ago

                                                                        We actually plan to update the post to demonstrate doing it entirely with the `step` tool. We just want to do a pass on the UX to make sure it is as easy and foolproof as possible before bringing more attention to it.

                                                                  • munchbunny 37 days ago

                                                                    This is a pedantic detail, but if you're trying to implement this system, it does matter: "resident key" is not a required feature here. You're not using the hardware token for its WebAuthn capability, you're using it for its smart card capability.

                                                                    You just need PKCS11 token support for SSH, which the YubiKey's smart card capability can do. YubiKey 4 and YubiKey FIPS can both do it, and so can regular old smart cards even though that form factor is a lot less popular now.

                                                                    The workflow is the same: generate a key pair on the hardware token, have the CA sign it, install the signed cert onto the hardware token, and then SSH with it.

                                                                    • closeparen 37 days ago

                                                                      The procedure here is actually using WebAuthn, which is now explicitly implemented by OpenSSH.

                                                                    • markpeek 37 days ago

                                                                      Using certificates with SSH is the way to go for shared access servers. Here's an open source way (yes, I'm involved in the project) to manage authorization and access with asynchronous approvals:

                                                                      https://github.com/cloudtools/ssh-cert-authority

                                                                      • dcow 37 days ago

                                                                        Smallstep also offers an open source ssh-aware kms-backed certificate authority.

                                                                        https://github.com/smallstep/certificates

                                                                        One nice advantage is its support for different provisioning flows. The oauth flavor allows you to hook into an existing identity provider to authenticate certificate requests.

                                                                        Simply:

                                                                            $ step ssh login
                                                                        
                                                                        and boom you've got a short-lived ssh certificate in your ssh-agent using a private key that never touched the disk.
                                                                      • jlgaddis 37 days ago

                                                                          Valid: from 2020-06-24T16:53:03 to 2020-06-24T16:03:03
                                                                        
                                                                        Um...
                                                                        • ascotan 37 days ago

                                                                          I don't get it. When would you need a backup ssh key other than if a user lost access? Most VMs have console access for this purpose.

                                                                          • asdfasgasdgasdg 37 days ago

                                                                            I'm not sure of the exact scenario but I would just note that there are other types of computing environments than virtual machines. For example, there are physical machines, sometimes hosted in a colo where you have no employees on the ground.

                                                                            • gnufx 37 days ago

                                                                              Surely you'd have some sort of remote KVM in such cases (like IPMI, as mentioned in another comment). That's critical in the clusters I've run and, of course, the manufacturers' implementation of that critical functionality in IPMI is likely to be rubbish and you can't get it fixed...

                                                                            • kubanczyk 37 days ago

                                                                              I wonder if there are some khem-khem notable clouds that just don't provide an old-school tty login. /s

                                                                            • dmitrygr 37 days ago

                                                                              Website unreadable on mobile. Commands cut off and not scrollable.

                                                                              • ausjke 37 days ago

                                                                                what about port knocking?