Our Lab in the Cloud

What?

A while back, I posted on this thing that we’ve been doing, trying to help out with teaching teh cybers to high school students. We set up a virtual environment for the students, and I wanted (needed) to take some time to document it.

Why?

Well, because I’m way (WAY!) behind on documenting our setup, and someone needs to know how to rebuild it if I get hit by a bus. Plus someone might find my ramblings useful — or not. Or maybe someone sees this and is able to tell me where I can improve. Also because I’m of the mindset that Cyber/Infosec should be like crypto from a defensive standpoint in that we should be able to share our configs (not passwords/passphrases) with a large audience and they should stand up to scrutiny and/or attack. Yes, OPSEC is a thing, and yes it’s necessary sometimes, but if we can get to a point where I can say “We’re using nginx (we are), here’s how it’s configured, and here’s the code to our web app,” and it stands up to attack, that’s much better than relying on security through obscurity and hoping people don’t misbehave.

To put it simply, “Because reasons. . .”

Basic Setup

We run our lab out of Azure. Why Azure? Well, there are a couple of reasons. First, they were offering a free trial. Yeah, I know. That’s how they get you, but I wanted to check out their interface, see what they had to offer, and see how it differed from AWS. I’ve noticed a couple of major differences between the two services. Azure’s user interface is more user friendly than AWS’s. It looks cleaner and more polished. It’s easy-ish to add/delete/modify stuff, and mostly easy to find things. There is one major area where it lacks though — API Functionality. I’ll get into how I’m using the API in another post, but to put it simply, Azure is Microsoft’s answer to Cloud Computing. It looks like a clean polished product, but it’s got some pretty major “areas of opportunity,” LOL.

Speed is another major drawback to the Azure interface (as well as the API). AWS seems more “snappy,” for lack of a better term. I’m just referring to the web interface here because I haven’t had any chance to use AWS’s API yet. Azure seems to take an odd amount of time to perform simple tasks, like starting VMs or deleting resource groups.

Where Azure really stole my heart though is with their Non-profit credit. If you’ve got a 501(c)(3) and you meet their eligibility requirements, you can get up to $3500 in Azure credits, with another $1500 in Azure Active Directory Premium Services. It used to be $5000 straight up for Azure services, but that changed recently. No matter though, because it still beats AWS’s $2000 in credits. It could be argued that Azure’s prices are a bit higher, which evens things out in the end, but I think it’s still worth it to give Azure a shot if you haven’t. So, to sum up why we use Azure: because reasons, but mostly because $$.

That being said, I still would like to translate all the stuff we’ve done over to AWS just in case it might come in handy to someone.

Full Disclosure – I own a couple of shares of Microsoft. Nothing that will let me retire early if someone chooses MS over Amazon for their Cloud services, but I thought I should mention that.

The Student’s Gateway

The first thing our students see when using our lab is a login page for Guacamole. If you haven’t heard of Guacamole, and you’re looking for a way to provide Citrix-like services for your users, I highly recommend it. You can’t beat the price (FREE!) and they just reached version 1.0.0. It’s a Tomcat/Java back end, and the only thing the clients need is a browser capable of running HTML5. We use it on a small scale (30-ish students), so I’m not really sure how it would scale for an enterprise-wide deployment. Our first Guacamole server was on a CentOS box, and we used a script that (mostly) set up Guacamole for us. We actually just rebuilt the server using Docker containers. The Docker config was a little tricky to get working, mostly due to my inexperience with using (instead of breaking) Docker, but we eventually got it up and running. There are a couple of git repos that have some scripts that you might find useful, and I wish I had found them before we started, especially considering one of them had an Azure VM template. During setup and config, we just use a little D2s_v3 VM (still CentOS). During labs, we transition to a beefier server, an F16s_v2. And again, we keep costs down by only running it when we need to. We also use nginx as a proxy for Guacamole, and of course, Let’s Encrypt!

The big selling point with Guacamole (at least for me) is that the students only need an HTML5-capable browser. No figuring out VPNs, or getting permission from school IT staff to open weird outbound ports. It “Just Works”(TM) through a browser.

I’ve also noticed that at least one training provider uses guacamole to provide similar access to hacking and target VMs in the cloud 🙂

We “protect” the Guacamole server by limiting access to it via Network Security Groups, which are similar to AWS Security Groups. We allow SSH inbound from anywhere, but only using key-based logins allows us to sleep a little better. Maybe 2FA is in our future, but couple limiting source IPs with the fact that the Guacamole VM is only running during class time (and dev time), and it’s a pretty good start. The Guacamole server is also in its own VNet, which helps us keep the Network Security Group rules to a minimum, internally. Hey! Maybe a diagram would help. . .

So, internal to Azure (everything to the right of the Guacamole server), all of the dark blue clouds are Azure Virtual Networks (VNets), which are not routable to each other by default. If you wish to have the networks be able to talk to each other, you need to enable peering, which incurs additional costs, but in our case, it’s negligible, and totes worth it to cut down on the number of “firewall” rules we need to manage. Each group (e.g. Group x) is a Resource Group created for the given school or institution we’re helping out. Again, it also has its own VNet. I’ve got a “firewalled switch thingy” (technical term) separating the VNets in the diagram, but it’s not quite that simple (diagram-wise). The Guacamole VNet shares a peering connection with each of the School Groups, the Instructor Group, and the Logging Group. Each of the School Groups also shares a peering connection with the Logging Group, but not with the Instructor Group or each other. So the only thing that is Internet-facing is our Guacamole server, running in Docker (Yes, I know that containers should not be considered a security boundary). Management of all systems is done from our Guacamole server, which doubles as our salt server.
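Since the diagram only tells part of the story, here is a rough sketch of the peering layout as plain data. The VNet names are made up for illustration (the post only describes the roles), and nothing here talks to Azure. It leans on the fact that VNet peering is not transitive: two VNets can only talk if they are peered directly.

```python
# Hypothetical VNet names; the post only describes the roles.
SCHOOLS = ["school-a", "school-b"]


def build_peerings(schools):
    """Return the set of directly peered VNet pairs described above."""
    peerings = set()
    # The Guacamole VNet peers with every school, the instructor, and logging.
    for peer in list(schools) + ["instructor", "logging"]:
        peerings.add(frozenset(("guacamole", peer)))
    # Each school VNet also peers with logging, but nothing else.
    for school in schools:
        peerings.add(frozenset((school, "logging")))
    return peerings


def can_talk(peerings, a, b):
    """Peering is not transitive, so only a direct peering counts."""
    return frozenset((a, b)) in peerings


peerings = build_peerings(SCHOOLS)
print(can_talk(peerings, "guacamole", "school-a"))
print(can_talk(peerings, "school-a", "school-b"))
```

With two schools that works out to six peerings, and every new school adds two more, which is where the cost and config sprawl comes from.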

Student Groups (VNets)

Each Student Group has at least 2 VMs, a Kali Linux VM and a “target” VM, which in this case is running CentOS, Docker, and salt. The students are able to connect to their Kali instances via RDP (xrdp) or CLI (SSH) via Guacamole in their browsers. Each Student Group is also assigned its own subnet within the VNet. For example, Group x’s VNet network address is 10.100.0.0/16. Student-1’s subnet is 10.100.1.0/24, Student-2’s subnet is 10.100.2.0/24, and so on. The main reason for this is that we can create Network Security Groups at the subnet level, so we can relegate each student to their own subnet, and they can’t attack each other. The only thing we allow inbound to each subnet is 22 (SSH) and 3389 (RDP) from the Guacamole server. We allow TCP/UDP/ICMP within the subnet, and block everything else inbound. We allow all outbound. For the Guacamole VNet, as mentioned above, we allow 22 (SSH) inbound from anywhere, 4505 and 4506 (salt) inbound from the Group VNets, and 443 (HTTPS) inbound from the schools and our dev IPs. And for our Logging VNet, we allow 22 (SSH) inbound. I’ll get to logging in a bit.
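The subnet numbering above is easy to generate with Python’s standard ipaddress module. This is just a sketch of the arithmetic, assuming student N gets the Nth /24 inside the group’s /16. The function name and the "student-N" labels are made up for illustration, and it doesn’t create anything in Azure.

```python
import ipaddress


def student_subnets(vnet_cidr, count):
    """Map student-N to the Nth /24 inside the group's VNet (N starts at 1)."""
    vnet = ipaddress.ip_network(vnet_cidr)
    all_24s = list(vnet.subnets(new_prefix=24))  # 256 subnets for a /16
    return {f"student-{n}": all_24s[n] for n in range(1, count + 1)}


subnets = student_subnets("10.100.0.0/16", 3)
for name, net in subnets.items():
    print(name, net)
```

Index 0 (10.100.0.0/24) is deliberately left unused here so the third octet matches the student number.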

Wait, WHAT!?!? “We allow all outbound!?!?” Yes. Our students can get to anything (ANYTHING!) on the internet. No filtering of content, no blocking of questionable material, and no blocking of “hacking” sites, which is kind of an important feature to have if you’re trying to teach hacking. The down side is that the students could (if they wanted to) go to “Very Bad Sites(TM).” Which is why we enabled a Network Watcher, which is just a fancy way of saying “netflow.” Again, it incurs an additional cost because you’ve got to save your netflow data to a storage account, but we can use our logging setup to tie into that storage account and pull the JSON-formatted data into our “SIEM.”
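Once the flow data is out of the storage account, finding out where students are going is mostly string splitting. Here’s a minimal sketch, assuming the version 1 flow tuple layout as I understand it (timestamp, source IP, destination IP, source port, destination port, protocol, direction, decision); check it against your own data before trusting it. The sample tuples are made up, and the function names are mine, not part of any Azure SDK.

```python
from collections import Counter
from typing import NamedTuple


class Flow(NamedTuple):
    timestamp: int
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str   # "T" = TCP, "U" = UDP
    direction: str  # "I" = inbound, "O" = outbound
    decision: str   # "A" = allowed, "D" = denied


def parse_flow_tuple(raw):
    """Parse the first eight comma-separated fields of one flow tuple."""
    ts, src, dst, sport, dport, proto, direction, decision = raw.split(",")[:8]
    return Flow(int(ts), src, dst, int(sport), int(dport), proto, direction, decision)


def outbound_destinations(tuples):
    """Count allowed outbound flows per (destination IP, destination port)."""
    counts = Counter()
    for raw in tuples:
        flow = parse_flow_tuple(raw)
        if flow.direction == "O" and flow.decision == "A":
            counts[(flow.dst_ip, flow.dst_port)] += 1
    return counts


# Made-up sample tuples, not real lab traffic.
SAMPLE = [
    "1556000000,10.100.1.4,203.0.113.10,51000,443,T,O,A",
    "1556000005,10.100.1.4,203.0.113.10,51001,443,T,O,A",
    "1556000009,198.51.100.7,10.100.1.4,40000,23,T,I,D",
]
top = outbound_destinations(SAMPLE)
print(top.most_common(5))
```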

We manage both the student Kali VMs and the target VMs with salt. I’m a noob when it comes to using salt. I know it’s powerful, but our use case is pretty simple. We want to modify the target VMs for our specific lab scenario. That could be simple webapp testing, using DVWA, WebGoat, or Mutillidae (all Docker containers), or it could be something a little more complex, like walking through a pen testing scenario (Recon -> root). We plan on releasing some of our salt configs soon. It’s not Rocket Surgery, but if it can help a teacher (or anyone, really) set up a lab with a couple of button clicks or commands, then it’s worth it.

Logging

Yes. We do logging. The end.

But really, it’s pretty simple. We log every (EVERY!) command that the students run, via auditd. Will this catch everything the students could do? No, but I’m not about to install a keylogger on a student’s VM. We try to balance on that line between protecting ourselves and respecting the students. We constantly remind students of the bad things that happen to people who run afoul of the CFAA. We continually remind them that just like in meatspace, respecting people, their things, and their privacy in cyber space is mandatory and non-negotiable if they wish to continue working with us. We also make sure that we are on the same page, discipline-wise, as the teacher whose class we’ve invaded. We ask for cooperation from those teachers and ask that they be active participants in our sessions. If we see something, we say something (to the teacher). We’re not certified educators, and really have no place in disciplining students. We believe that should be left up to the teachers.

Oh, right, logging. So, we use auditd, Filebeat, MetricBeat, Network Watcher, and ship it all to an ELK stack (in Docker). For the auditd info, we monitor that with Filebeat and ship it to our logging server via an SSH tunnel. If we’re running MetricBeat on the VM as well (usually just the Guacamole server), we send that info via an SSH tunnel as well. The only problem I have with this setup is that making the process automated is somewhat insecure. If a student is motivated enough, they could find information that would allow them to log into the logging server as a low privileged user, so we’re relying on our old friend security through obscurity here (Boo!).
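To give a feel for what the auditd data looks like before it lands in ELK, here’s a small sketch that rebuilds a command line from a single EXECVE record. It assumes the audit rules capture execve calls, and that arguments show up either quoted or hex-encoded (which is what auditd does for arguments containing spaces or special characters). The sample record is made up, and this is not how Filebeat parses things internally.

```python
import re

# Matches a0="ls" (quoted) or a1=2D6C61 (hex-encoded) style arguments.
ARG_RE = re.compile(r'\ba(\d+)=("([^"]*)"|[0-9A-Fa-f]+)')


def execve_command(line):
    """Rebuild the command line from one auditd EXECVE record."""
    if "type=EXECVE" not in line:
        return None
    args = {}
    for index, raw, quoted in ARG_RE.findall(line):
        if raw.startswith('"'):
            args[int(index)] = quoted
        else:
            # Unquoted values are hex-encoded bytes.
            args[int(index)] = bytes.fromhex(raw).decode("utf-8", errors="replace")
    return " ".join(args[i] for i in sorted(args))


# Made-up sample record for illustration.
SAMPLE_RECORD = (
    'type=EXECVE msg=audit(1556000000.123:456): argc=2 '
    'a0="echo" a1=68656C6C6F20776F726C64'
)
print(execve_command(SAMPLE_RECORD))
```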

Yes, we could enable TLS for elasticsearch. Maybe there’s an easy way to automate this, but I haven’t seen one yet.

I wrote this post before the ELK folks released their SIEM tool, so I’ll need to go back and give that a try!

Logging Bonus

As a bonus, we can use our ELK stack to show the students basic SIEM stuff and get them on board with logging all the things. Something we try to beat into their brains is that if you want to catch the bad guys, there are 2 things that you need to do. 1) Log. If you don’t log the info, you won’t have the info, and 2) look at your logs. You can log as much as you want, but it’s pointless if you don’t look at what they’re telling you. Yes, there are data reduction things that need to be done, but do the important stuff first.

And since we’re logging all of their commands, we can go back and see which students are participating and which are goofing off. It’s cool to let them in on this secret a few weeks into the class, and just watch as understanding begins to show on their faces. You can literally see that sinking stomach feeling set in for some of them.

Why not HackTheBox, VulnHub, etc…

Why do we go through the hassle of setting all of this infrastructure up, configuring a bunch of separate Kali instances and target boxes? Why not just send the students to HackTheBox or VulnHub, or <insert online hacking site here>?

Context. And structure.

We prefer to give students structure and context for the activities that they’re doing. That’s not to say we don’t recommend online hacking sites. Learning on your own is a huge part of Infosec/Cyber. And I love and use sites like that. It’s easy (okay… Not always) for students to crack passwords given a list of hashes and a wordlist. It’s easy for students to learn to nmap servers or use a proxy. What’s not so easy, or what students usually won’t do on their own is learn why they’re doing some of the things they’re doing, or why some of the activities we do, work. We provide context for why (for example) servers should be hardened, or why we should use password managers, or why we should patch our systems.

And we provide structure so that they’re not going to the Very Bad Sites(TM) or goofing off with each other, or trying to hack us, or any number of other things that students do when they’re not provided with structure. We provide this with the help of the actual trained teacher whose class we’ve overtaken. I have absolutely no data to back up my opinion that context and structure can help keep cyber-curious students out of trouble, but it seems like a reasonable conclusion.

Wrapping Up

It’s about time, right!? In the end, we’re just a couple of cyber/infosec nerds trying to figure out how we can help the future cyber/infosec nerds be better cyber/infosec nerds. If you’ve got ideas on how we can improve, feel free to reach out to me at noob at noobhaxor dot com, or follow me on twitter @billy_macco. If you read this far, thanks for taking time out of your busy schedule. Hopefully you got something out of it.

I’ve got a script or two on github that might come in handy for anyone setting up Guacamole. It’s a huuuuuge PITA to add a bunch of users and groups, and configure them correctly, so I’ve got a script that’ll set up a group and a specified number of students/users in that group, add their CLI and GUI connections to Guacamole, give each student permissions only to their Kali system, and assign them passwords from diceware. We plan on adding more stuff later, but it’s slow going.
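For the curious, the account-generation part of that idea can be sketched in a few lines. To be clear, this is not the script from the repo: the naming scheme, the record layout, and the tiny wordlist are all placeholders I made up, and it doesn’t touch Guacamole at all. It only shows the shape of "one group, N students, a diceware-style passphrase each, and connections limited to their own Kali box."

```python
import secrets

# Tiny placeholder wordlist; a real diceware list has 7,776 words.
WORDS = ["apple", "bridge", "cactus", "dolphin", "ember", "falcon", "garnet", "harbor"]


def diceware_passphrase(words, count=5, sep="-"):
    """Pick words with a cryptographically secure random source."""
    return sep.join(secrets.choice(words) for _ in range(count))


def build_accounts(group, students, words):
    """Return hypothetical account records for one group of students."""
    accounts = []
    for n in range(1, students + 1):
        accounts.append({
            "group": group,
            "username": f"{group}-student-{n}",
            "password": diceware_passphrase(words),
            # Each student only gets the CLI and GUI connections to their own Kali VM.
            "connections": [f"{group}-kali-{n}-ssh", f"{group}-kali-{n}-rdp"],
        })
    return accounts


accounts = build_accounts("group-x", 3, WORDS)
for account in accounts:
    print(account["username"], account["connections"])
```

Using secrets instead of random matters here, since these are real passwords and not just test data.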

And I’ll be following up with what we’re doing with the Azure API. That’s been a huge learning experience for me — can’t wait to share what I’ve learned!

tl;dr

Azure over AWS, mostly for the cost benefit of working under a 501(c)(3), though the Python API and UI are not quite as “snappy” as AWS.
Guacamole provides a great alternative to having to set up costly VPN gateways, and is great for those places that only allow 80/443 outbound, but it is a bit lacking in the ease-of-use department when it comes to adding/modifying multiple users and connections.
The Guacamole interface can be laggy, depending on bandwidth requirements and interface type, e.g. CLI vs. GUI.
Azure Network Security Groups are super configurable, being able to be deployed at the subnet or NIC level, which makes locking down our student subnets easy.
Setting up multiple VNets in Azure is easy, but peering costs and connection configs could get way out of hand if you have too many.
Log all the things
ELK/Filebeat/MetricBeat are great at collecting data, just not so great at doing it securely out of the box.
Context and structure are both important when teaching students about topics that could land them in hot water. If we provide these now, students are more likely to use their powers for good.
Automating lab setups is hard, especially when you lack coding skills and time, like me 🙁

n00bhax0r