Tips and Tricks for Linux-based Bastion Hosts

Most moderately complex VPCs will require a "bastion host". This host is the only host within the VPC that has a public IP address, and it exposes port 22 (SSH) and possibly other ports to the world. When performing maintenance on an EC2 instance within your VPC, you first log on to the bastion, and from there log on to the EC2 instance to be managed.

Here are a few tips and tricks to make your life easier:

1. Choose the right instance type

Most bastion hosts use the "t2.micro" instance type. The resources offered by this type (nominally 10% of one CPU, and 1 GB of memory) are sufficient for a bastion, and the t2.micro type is in the "free tier", so you can run a bastion host for free, 24/7, for the first year. Furthermore, a plain bastion host installed from the Amazon Linux AMI should be able to get by with just 8 GB of EBS storage - the minimum imposed by the Amazon Linux AMI. Your free tier includes 30 GB of EBS storage for a year, so that's sufficient too.

If you plan to use the bastion host as a VPN endpoint, if you otherwise plan to send a lot of network traffic through the bastion, or if for some reason you expect to use more than 10% of one CPU on average, then the t2.micro type might not be suitable.

If you only need your bastion host during business hours, consider setting up a Lambda function that automatically starts and stops your host at the start and end of the business day. This can greatly reduce running costs, although you will still pay for EBS storage.
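
Whatever mechanism does the scheduling, the underlying actions boil down to two API calls. As a rough illustration, their CLI equivalents look like this (the instance ID is a placeholder):

# CLI equivalents of what a scheduled start/stop function would call.
aws ec2 stop-instances --instance-ids i-0123456789abcdef0     # end of the business day
aws ec2 start-instances --instance-ids i-0123456789abcdef0    # start of the business day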

2. Treat your bastion as a throwaway resource

Your bastion host is exposed to the world. If anyone tries to break into your VPC, the bastion is likely to be the first victim. If you consider your bastion to be a throwaway resource and have all the images and procedures ready, then you can replace a compromised bastion with a fresh one within minutes. For an SSH-only bastion you can just use the Amazon Linux AMI, but if you extend your bastion with, for instance, OpenVPN, then create a snapshot+AMI once you're done and use that to deploy a new bastion when needed.

You can even do this on a schedule, and from CloudFormation, if you want to.

Having all the tools in place to deploy or replace a bastion host in seconds is also useful if the AZ containing your bastion becomes unavailable. Simply deploy a new bastion in a healthy AZ and you regain access to all your EC2 instances. The alternative would be to run a bastion 24/7 in each AZ, but that's relatively costly and a waste of resources.

3. Register the bastion public IP in DNS

This is not entirely trivial. You have to write a script that runs inside each EC2 instance. This script pulls its hostname, domain name and public IP address from the EC2 metadata, and then uses the AWS CLI to update the Route53 record. Here's a good article on how to do that:

Auto-register an EC2 host with Route53

Note that the article has you set up an IAM user with the proper policy. The IAM user credentials (access key and secret key) are then loaded into the EC2 instance. This is generally not recommended. The best practice is to create a role with the proper policy to update the Route53 record, and then associate this role with the EC2 instance. You can then run the CLI commands from within the EC2 instance without needing IAM users or credentials.
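
With the role in place, the core of such a self-registration script could look like the following minimal sketch (the hosted zone ID and record name are placeholders you'd replace with your own):

#!/bin/bash
# Minimal sketch; replace ZONE_ID and RECORD with your own Route53 zone and record name.
ZONE_ID="Z1234567890ABC"
RECORD="bastion.example.com"
PUBLIC_IP="$(curl -s http://169.254.169.254/latest/meta-data/public-ipv4)"

# UPSERT an A record pointing at the bastion's current public IP.
aws route53 change-resource-record-sets --hosted-zone-id "$ZONE_ID" --change-batch \
  "{\"Changes\":[{\"Action\":\"UPSERT\",\"ResourceRecordSet\":{\"Name\":\"$RECORD\",\"Type\":\"A\",\"TTL\":60,\"ResourceRecords\":[{\"Value\":\"$PUBLIC_IP\"}]}}]}"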

4. Consider obtaining an Elastic IP address for the bastion

Elastic IP addresses make your life a lot easier as they are yours, and will not change until you "release" them. You can statically add them to DNS and use either the Elastic IP or the associated DNS name in configuration files and the like.

One Elastic IP address is free as long as it is associated with a running EC2 instance. For additional EIPs you pay a small fee, and you also pay a small fee if your "free" EIP is not associated with a running EC2 instance: Amazon doesn't want you to hold on to unused static public IP addresses.
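
For reference, allocating an Elastic IP and associating it with your bastion can also be done from the CLI (the instance ID and allocation ID below are placeholders):

# Allocate a new Elastic IP for use in a VPC; note the AllocationId in the output.
aws ec2 allocate-address --domain vpc
# Associate it with the bastion instance.
aws ec2 associate-address --instance-id i-0123456789abcdef0 --allocation-id eipalloc-0123456789abcdef0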

5. Use a Security Group to limit SSH/RDP/OpenVPN access

It is good practice to create a Security Group (SG) that only allows access to the bastion host from specific IP addresses. Obviously that list of IP addresses should only include your own. To manage this effectively, create an SG named (for example) "Wouter is here" in each VPC, and then run the following script on your own laptop/workstation. This script determines your current public IP address, then looks in all regions and all VPCs for SGs named "Wouter is here", and adds a series of inbound rules to these SGs for your public IP address only.

The script works as follows:

  • The optional "-s <SG name>" allows you to specify a custom SG name. Don't forget to quote the name if it contains spaces. The default SG name is "Wouter is here" but you can easily change this in the script and I recommend you do so - unless you happen to be called Wouter too.
  • The optional "-d" parameter deletes all current inbound rules from the SG.
  • The script will always add an inbound rule for SSH (TCP port 22).
  • The optional "-r" parameter will also add an inbound rule for RDP (TCP port 3389).
  • The optional "-o" parameter will also add an inbound rule for OpenVPN (by default UDP port 1194, but read the comment in the script if OpenVPN doesn't work on this port).
  • The optional "-p" parameter will also add an inbound rule for ICMP, all types. This is required to get "ping" to work.

If your organization has more than one staff member who needs access, create an SG like this for each staff member and associate them all with the bastion host. Each staff member can then run this script without interfering with the other SGs.

Link to the script
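
In case that link is not available, here is a minimal sketch of what such a script boils down to (SSH only, no options; it assumes the AWS CLI is configured and uses checkip.amazonaws.com to determine your public IP):

#!/bin/bash
# Minimal sketch only - the full script supports the -s, -d, -r, -o and -p options described above.
SG_NAME="Wouter is here"
MY_IP="$(curl -s https://checkip.amazonaws.com)/32"

for REGION in $(aws ec2 describe-regions --query 'Regions[].RegionName' --output text); do
    for SG_ID in $(aws ec2 describe-security-groups --region "$REGION" \
            --filters "Name=group-name,Values=$SG_NAME" \
            --query 'SecurityGroups[].GroupId' --output text); do
        # Add an inbound SSH rule for my current public IP only.
        aws ec2 authorize-security-group-ingress --region "$REGION" \
            --group-id "$SG_ID" --protocol tcp --port 22 --cidr "$MY_IP"
    done
done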

6. Use the SSH ProxyCommand instead of ForwardAgent

Once your bastion host is up and running, and you have internal EC2 instances up and running too, the obvious thing to do would be to SSH to the bastion host, and from there SSH again to the internal hosts. This doesn't work out of the box, however, since the default EC2 AMIs do not allow password authentication: You have to use public-key authentication.

This means that in order to SSH from the bastion to the internal host, the bastion needs to have (access to) your private key. It is an exceedingly bad idea to copy your key to the bastion host (manually, using scp) or to give the bastion access to it (automatically, using ForwardAgent=yes or the ssh -A option), as you don't want to trust the bastion host with (access to) your private key - it is, after all, the system that is most exposed to the internet. (See also this.) Furthermore, the -A option doesn't work if you specify a private key manually using the -i option: It only works with keys that are loaded into your ssh-agent (using ssh-add).

A far better solution is to SSH into the bastion host and set up an SSH tunnel at the same time. This tunnel forwards all traffic to port 22 on the internal host, and is then used for a second SSH connection directly from your workstation to the internal host. All of this can be achieved with the ProxyCommand option in your SSH config file (~/.ssh/config). Here's how you would do this:

First, the stanza for the bastion host itself:

Host bastion
	Hostname my-bastion-host.example.com
	User ec2-user
	IdentityFile ~/.ssh/my-identity-file-for-the-bastion.pem

Whenever I run "ssh bastion", this tells the SSH client (and SCP and SFTP, by the way) everything it needs to know to create a successful connection.

Next, the stanza for an internal host:

Host internal
	Hostname my-internal-host.example.com
	User ec2-user
	IdentityFile ~/.ssh/my-identity-file-for-the-internal-host.pem
	ProxyCommand ssh -q -W %h:%p bastion

The ProxyCommand tells SSH to use the "ssh -q -W %h:%p bastion" command as its communication channel. In other words, it will not set up an SSH connection directly to my-internal-host.example.com:22, but will run "ssh -q -W my-internal-host.example.com:22 bastion" to establish that connection. This sets up an SSH connection to bastion using the previous stanza, and then uses that connection to forward stdin/stdout to the SSH server on the internal host. So effectively you are tunneling SSH over SSH, with the outer tunnel going to your bastion host and the inner tunnel to the internal host.

This configuration makes setting up a connection to an internal host completely transparent to SSH, SCP and SFTP. Plus, your private key no longer needs to be forwarded to or stored on the bastion host, which makes for a more secure solution as well.

Oh, and here's a neat trick. If you set "DNS Hostnames" to "yes" in your VPC properties, Amazon will automatically register each EC2 instance in the internal DNS server under a name such as "ip-10-0-128-186.eu-central-1.compute.internal". Your ssh config file allows for wildcards, so you can then configure this:

Host *.eu-central-1.compute.internal
	User ec2-user
	IdentityFile ~/.ssh/my-identity-file-for-the-internal-hosts.pem
	ProxyCommand ssh -q -W %h:%p bastion

Now all you need to do is type "ssh ip-10-0-128-186.eu-central-1.compute.internal" and you connect automagically to the right instance, without the need to add a per-instance configuration stanza to your .ssh/config file.

If you have set up the EC2 instances to register themselves with a public DNS domain automatically (see above), then you can even use this trick with that domain name. And if you have multiple (non-peered) VPCs in a region, the only way to get this to work properly is to set up your own DNS as above, making sure that each VPC has its own DNS subdomain.

7. Consider SSH tunnels for RDP

If you have Windows systems inside your VPC, your default course of action may be to set up a Windows-based bastion host and install the Microsoft RDP gateway on it. This then allows gateway access to all Windows systems inside your VPC. I have done this and it works, but it takes a 35-step process to get it working. In fact, it's so complex that Amazon has actually written a quickstart procedure for it. (This deploys an RDP gateway via a CloudFormation template.)

SSH and Linux to the rescue. With SSH and a Linux-based bastion host it's extremely easy to set up a TCP tunnel from your local workstation to the remote Windows system via the bastion host. You can either set up the tunnel on the fly as you run the SSH command, or set it up permanently in the configuration file. Here's the SSH command to run on your local workstation:

ssh -L 3389:10.0.1.128:3389 bastion

With this command, the SSH client opens local TCP port 3389. Any connection that arrives there is tunneled through the SSH connection and forwarded by the bastion's SSH server to 10.0.1.128:3389. So all you need to do is start an RDP client and have it connect to your local workstation on port 3389.

The same thing, but permanently configured in the ~/.ssh/config file, would look like this:

Host bastion
	Hostname my-bastion-host.example.com
	User ec2-user
	IdentityFile ~/.ssh/my-identity-file-for-the-bastion.pem
	LocalForward 3389 10.0.1.128:3389

You can set up similar tunnels from other SSH clients such as PuTTY, in case you have a Windows-based workstation.

If you need to set up more than one tunnel, simply repeat the -L option or LocalForward statement for each tunnel, making sure each one uses a different local port. For instance, "ssh -L 1128:10.0.1.128:3389 -L 1129:10.0.1.129:3389 bastion" forwards an RDP connection to local port 1128 to the RDP server on 10.0.1.128, while an RDP connection to local port 1129 is forwarded to the RDP server on 10.0.1.129.

Depending on the GatewayPorts setting in your SSH client configuration, a local tunnel may be reachable from other machines as well. To explicitly limit the tunnel to loopback only, use "ssh -L 127.0.0.1:3389:10.0.1.128:3389 bastion" or "LocalForward 127.0.0.1:3389 10.0.1.128:3389" respectively.

8. Consider setting up OpenVPN for client-VPC connections

OpenVPN will give you a clean VPN connection (subject to routing and security group limitations) between your workstation and the VPC. This means you don't need to set up tunnels to access RDP, for instance: You can simply RDP directly from your workstation to a Windows host, even if that Windows EC2 instance does not have a public IP address.

OpenVPN is not trivial to set up, unfortunately. I've got a presentation that covers OpenVPN in great depth, which is available upon request.

When configuring OpenVPN in an EC2/VPC context, there are a few additional considerations:

  • You may want to set up NAT (IP masquerading) on the network interface for all traffic destined for the VPC; a minimal sketch follows this list. It is possible to set up a fully routed configuration instead, but it's complex: In addition to setting up all the routing tables properly, you will also need to disable source/destination checking on the bastion network interface.
  • Do not perform the CA tasks on the bastion host. Instead, generate the CA keypair, CA certificate, client/server keypairs and client/server certificates on your own workstation or another secure system. Only transfer the required items (bastion private key, bastion certificate and CA certificate) to the bastion. This helps keep your OpenVPN connection secure even if the bastion is broken into.
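
As a minimal sketch of the masquerading approach mentioned above (it assumes the VPC CIDR is 10.0.0.0/16 and the bastion's network interface is eth0 - adjust both to your own setup):

# Enable IP forwarding and masquerade VPN traffic destined for the VPC.
sysctl -w net.ipv4.ip_forward=1
iptables -t nat -A POSTROUTING -o eth0 -d 10.0.0.0/16 -j MASQUERADE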

9. Consider setting up OpenS/WAN for VPC-VPC connections across regions, or VPC-DC connections

OpenVPN is nice if you need to connect individual workstations to a VPC and don't mind setting up per-workstation VPN connections. But if you need to peer VPCs from different regions, or if you want to connect your corporate on-premises network to your VPC, OpenVPN is not ideal.

Amazon offers various VPN connection types, but these are intended to terminate at your corporate VPN device. They do not really support software-based VPN endpoints. But more importantly, they do not support cross-region VPC peering: Setting up a VPN between AWS VPCs in different regions.

For those situations you may want to consider setting up OpenS/WAN (Open Secure Wide Area Network). This is an open-source software solution that lets you tie whole networks together using VPNs.

Please see this page on how to set up OpenS/WAN in an Amazon context.

10. Forward logs to CloudWatch Logs

CloudWatch offers the ability to ingest textual log data through a feature called "CloudWatch Logs". This can be used instead of a separate, EC2-based log server for off-server storage of logs. There are several advantages to this: It's more secure (a hacker cannot easily delete the log trail of his break-in attempt), it allows centralized analysis of log data, and CloudWatch can put a "metric filter" onto your textual logs, which gives you a metric (a number). Based on these metrics you can do other things, such as graphing them or putting alarms on them.

Here are the steps to implement remote logging to CloudWatch:

  1. If necessary, install the AWS CLI. Then perform basic configuration of the CLI. At the very least, specify the default region to use.

    The AWS CLI is installed on all Amazon Linux AMIs by default. If you are outside the AWS infrastructure, or if you are using your own AMI, then the CLI can be downloaded from https://aws.amazon.com/cli

    To configure the CLI, run "aws configure".

  2. Install the CloudWatch Logs agent package.
    yum -y install awslogs
    
  3. Create the Log Group and Log Stream in CloudWatch.

    An example Log Group name would be "LinuxLogs", and an example Log Stream name would be the EC2 instance ID of the machine you're logging from.

    You may want to set an automated expiry on the Log Group to avoid saving years' worth of data. Alternatively, you can set up a process to export the logs to S3 and from there to Glacier.
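
    This can be done from the console or with the CLI; for example (the log stream name here is a placeholder instance ID, and the 30-day retention is just an example):

    aws logs create-log-group --log-group-name LinuxLogs
    aws logs create-log-stream --log-group-name LinuxLogs --log-stream-name i-0123456789abcdef0
    aws logs put-retention-policy --log-group-name LinuxLogs --retention-in-days 30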

  4. Set up an IAM Role for your bastion host that allows writing to CloudWatch Logs. The easiest way to do this is to use the AWS-managed policy CloudWatchLogsFullAccess, but that policy also allows the bastion host to delete Log Streams and even entire Log Groups. There is no AWS-provided write-only CloudWatch Logs policy, unfortunately, so you'll have to write your own. It will look like this:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Stmt1493820608000",
                "Effect": "Allow",
                "Action": [
                    "logs:PutLogEvents"
                ],
                "Resource": [
                    "arn:aws:logs:eu-central-1:973674585612:log-group:LinuxLogs:log-stream:*"
                ]
            }
        ]
    }
    

    Create this policy, call it "LinuxLogToCloudwatchLogs" for example, and attach it to the EC2 role. Make sure this role is then attached to your EC2 instance.
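
    If you prefer the CLI over the console, the role, policy and instance association can be set up roughly like this (the role and instance profile names are examples; it assumes the policy JSON above is saved as policy.json and a standard EC2 trust policy as trust.json):

    aws iam create-role --role-name BastionLogRole --assume-role-policy-document file://trust.json
    aws iam put-role-policy --role-name BastionLogRole --policy-name LinuxLogToCloudwatchLogs --policy-document file://policy.json
    aws iam create-instance-profile --instance-profile-name BastionLogRole
    aws iam add-role-to-instance-profile --instance-profile-name BastionLogRole --role-name BastionLogRole
    aws ec2 associate-iam-instance-profile --instance-id i-0123456789abcdef0 --iam-instance-profile Name=BastionLogRole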

    If your Linux server lives outside the AWS infrastructure, then you cannot use IAM Roles to gain CloudWatch privileges. In that case you'll have to set up an IAM user with the proper permissions. Generate an access key/secret key combination for this IAM user, and store it in /etc/awslogs/awscli.conf.

  5. Test the CloudWatch Logs functionality:
    echo "Hello World" | aws logs push --region <Region> --log-group-name <LogGroup> --log-stream-name <LogStream>
    
    You should not get any errors, and you should see the message appear in CloudWatch Logs. (It may take a minute or two for the message to show up.)
  6. Set up the CloudWatch Logs agent. This agent continuously follows your log files and forwards all relevant entries to CloudWatch Logs. The main configuration file for this agent is /etc/awslogs/awslogs.conf.

    Scroll down to the bottom of the file. There should already be a stanza for /var/log/messages there. Note that this stanza sets its own Log Group name; you may want to change that, and you may want to set the Log Stream name as well (it defaults to the EC2 instance ID).
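
    For reference, a stanza along those lines (using the LinuxLogs group from step 3; the timestamp format shown matches the default syslog format) might look like this:

    [/var/log/messages]
    datetime_format = %b %d %H:%M:%S
    file = /var/log/messages
    log_group_name = LinuxLogs
    log_stream_name = {instance_id}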

    Also check the file /etc/awslogs/awscli.conf. This file should contain the proper region.

  7. Start the CloudWatch Logs agent.
    service awslogs start
    chkconfig awslogs on
    
  8. Generate a test message.
    logger -t test -p local7.info This is a test!
    
    Check that the message was added to /var/log/messages, then check that it was sent to CloudWatch Logs.

Your logs are now automatically added to CloudWatch. What happens next is up to you. You can just leave them sitting there, or you can use the power of CloudWatch to configure metric filters, trigger alarms and so forth.

As an example, I'm interested in the number of SSH logons that happened. The sshd daemon, by default, logs to the AUTHPRIV facility instead of AUTH, so its logging ends up in /var/log/secure rather than /var/log/messages. The easiest way to change this is by modifying the /etc/ssh/sshd_config file: Change the line "SyslogFacility AUTHPRIV" to "SyslogFacility AUTH", then restart the sshd daemon.
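
On Amazon Linux, that change and restart boil down to something like this (it assumes the sysvinit-style service command; use systemctl on systemd-based distributions):

# Switch sshd logging from AUTHPRIV to AUTH so it ends up in /var/log/messages.
sed -i 's/^SyslogFacility AUTHPRIV$/SyslogFacility AUTH/' /etc/ssh/sshd_config
service sshd restart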

The message that identifies a new connection, "sshd[<pid>]: Accepted publickey for <username> from <IP> port <port>", should now be visible in CloudWatch Logs. We can now put a metric filter on this.

In CloudWatch Logs, select your Log Group and click "Create Metric Filter". You now need to create your filter pattern; more information can be found via the link on that page. If you select the right Log Stream, the last few lines of that stream will be displayed with your filter applied to them, so you can test the filter. Note, however, that at present CloudWatch filter patterns do not support regular expressions, nor wildcards in the middle of an expression, so something like "sshd[*]: Accepted publickey for * from * port *" is not possible. My filter therefore looks like this:

"Accepted publickey for"

Log out and log in a few times to generate some log data. Now go to CloudWatch Metrics and you should see your custom metric. (It may take a few minutes for it to appear.) You can now use this metric to create graphs, set alarms and so forth.
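
By the way, the same metric filter can also be created from the CLI (the filter, metric and namespace names here are just examples):

aws logs put-metric-filter --log-group-name LinuxLogs --filter-name SSHLogons \
    --filter-pattern '"Accepted publickey for"' \
    --metric-transformations metricName=SSHLogonCount,metricNamespace=LogMetrics,metricValue=1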

11. Done? Finished testing? Create a snapshot!

As mentioned in tip 2, you should consider your bastion host to be a throwaway resource, to be redeployed every time there is a shred of doubt about its security. This is easy if you've created a snapshot of your EBS volume at this point.

If you intend to deploy your bastion in other VPCs and/or in other regions, you can also consider creating an AMI at this point.