Infrastructure

Goal

Increase the reliability of services
Ensure compliance with international security standards
Reduce the chance of knowledge silos developing within teams

Summary

Demonstrate understanding of:

Making a service Scalable
Making a service Highly Available
Principle of Least Privilege
Zero-downtime deployments

Assessments

The infra badge assessments can be undertaken in any order, though we recommend taking Spider first.

Ant (Level 1)

Brief

Be able to set up a highly available, internet-facing web application.

Theory

What are IaaS, PaaS, and FaaS?
1. For each of these, discuss a scenario where the technology is appropriate.
2. For each of these, discuss a scenario where the technology is not appropriate.
What is the importance of having a highly available service?

Practical

You will be provided with a Linux container image, though you can use your own if you wish.

Run the image with your choice of IaaS.
Make the service highly available
Draw a diagram of your infrastructure
Demonstrate that your infrastructure self-heals after you remove an instance.
Demonstrate that you can roll out a new deployment without causing any downtime, using the downtime detection script provided in the Appendix.

Spider (Level 2)

Brief

Be able to make secure cloud applications and services.

Resources:

http://progressivecoder.com/understanding-aws-security-groups-and-best-practices-to-use-them/
https://www.jakoblell.com/blog/2013/08/13/quick-blind-tcp-connection-spoofing-with-syn-cookies/
- No need to read the entire article above; section V is the relevant part.

Theory

There is a cloud-hosted public web server that talks to a backend API in a private network. The private network's firewall has an allow-list rule that admits the IP address of the public server.

Explain what maintenance overhead IP allow-listing adds for future developers.
Explain the security problems with using only IP allow-listing for authentication.
What are the benefits of investing in monitoring and alerting in software environments?

Practical

You will need to show and demonstrate an infrastructure-as-code project (e.g. Terraform) that can set up and tear down the following:
- 2 file storage buckets (or Azure Blob Storage containers)
- A new user with a set of access credentials that only has permission to retrieve (not list or modify) files in one bucket
- Request monitoring on the bucket, so that the assessor can get the number of HTTP GET requests made to the bucket over time
- You will be asked to set up and tear down your infrastructure during the assessment
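One way to demonstrate the retrieve-only requirement is to probe the bucket with the restricted credentials and confirm that only retrieval succeeds. The sketch below assumes AWS, with the AWS CLI installed and configured for the new user. The function name `check_retrieve_only`, the probe key `probe.txt`, and the bucket and key arguments are hypothetical, and other providers would need their own equivalent commands. Note that if the write probe unexpectedly succeeds, it leaves an object behind that needs cleaning up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: checks that the active AWS credentials can retrieve
# an object but can neither list nor write to the bucket.
# Assumes the AWS CLI is configured with the restricted user's credentials.

check_retrieve_only() {
  local bucket="$1"
  local key="$2"
  local failures=0
  local tmp
  tmp="$(mktemp)"

  # Retrieval is the one operation that should be allowed.
  if ! aws s3api get-object --bucket "$bucket" --key "$key" "$tmp" >/dev/null 2>&1; then
    echo "FAIL: get-object was denied"
    failures=$((failures + 1))
  fi

  # Listing should be denied.
  if aws s3api list-objects-v2 --bucket "$bucket" >/dev/null 2>&1; then
    echo "FAIL: list-objects-v2 was allowed"
    failures=$((failures + 1))
  fi

  # Writing should be denied (probe.txt is an illustrative key name).
  if aws s3api put-object --bucket "$bucket" --key "probe.txt" --body "$tmp" >/dev/null 2>&1; then
    echo "FAIL: put-object was allowed"
    failures=$((failures + 1))
  fi

  rm -f "$tmp"
  if [ "$failures" -eq 0 ]; then
    echo "OK: credentials are retrieve-only"
  fi
  return "$failures"
}
```

A call might look like `check_retrieve_only my-bucket path/to/file.txt`; the return code is the number of checks that failed, so `0` means the credentials behaved as required.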

Bee (Level 3)

Brief

Theory

Practical

Limitations

Provider requirements

Certain providers, usually PaaS-only providers, do not offer the functionality we require for testing people on all of the above aspects.

As such, we recommend avoiding solutions that hide:

load balancing
autoscaling
networking rules (routing, firewalls)

Providers that can definitely be used are:

AWS
- ECS with Fargate is fine, though be prepared for increased theory work around deployment processes
- Lambda is not recommended, as it hides a lot of networking rules, autoscaling, and deployment processes
- Elastic Beanstalk is not allowed, as it does everything for you, but in a way that makes it hard to adapt after the fact.
Azure
GCP

Appendix

Downtime Script

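#!/usr/bin/env bash
# Polls the target twice a second and stops at the first failed request.
target="<your endpoint>"
while :; do
  # -f makes curl fail on HTTP 4xx/5xx responses; -m 1 sets a 1-second timeout.
  if curl -sf -m 1 -o /dev/null "$target"; then
    printf '.'
  else
    echo "target is down"
    break
  fi
  sleep 0.5
done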
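When rehearsing a deployment, it can help to know how many probes failed rather than stopping at the first one. The following is a minimal sketch of such a variant, not part of the provided script: the function name `probe_target`, the probe count, and the interval are assumptions chosen for illustration. It treats HTTP 4xx/5xx responses as failures via curl's `-f` flag.

```shell
#!/usr/bin/env bash
# Sketch: probe a target a fixed number of times and summarise the failures.
# probe_target and its arguments are illustrative, not part of the assessment.

probe_target() {
  local target="$1"    # URL to probe
  local probes="$2"    # total number of requests to send
  local interval="$3"  # seconds to wait between requests
  local failed=0
  local i=0

  while [ "$i" -lt "$probes" ]; do
    # A timeout, connection error, or HTTP error status counts as a failure.
    if ! curl -sf -m 1 -o /dev/null "$target"; then
      failed=$((failed + 1))
    fi
    i=$((i + 1))
    sleep "$interval"
  done

  echo "probes=$probes failed=$failed"
  # Succeed only if every probe succeeded.
  [ "$failed" -eq 0 ]
}
```

For example, `probe_target "https://example.invalid/health" 120 0.5` would probe for roughly a minute (longer if requests time out) and exit non-zero if any request failed; the URL here is a placeholder.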