Four happy servers graze on the range before their long march to economic results.


Odd title for a tech blog, no?  Well, a few years ago I heard a great analogy from my friend Josh McKenty.  He said we should treat servers not as pets but as cattle.  The basic thesis is that a sick puppy takes a family of three to care for it, while a sick cow in the herd gets shot in the head by the rancher and the herd moves on – three ranchers can handle a thousand-plus head of cattle.  Applied to servers and IT infrastructure, this model is a core premise of cloud architectures: we should never get so attached to an individual device that we stop receiving the rewards of economies of scale.

Let us continue with Josh’s analogy.  You have your herd of cattle, you are driving them from Texas to Sedalia, Missouri – the closest railhead at the time that ran to Chicago – and one of them gets sick.  But this time it is something virulent and unfortunately deadly, like BSE (mad cow disease), which is infectious and causes the cattle to lose the ability to stand, making them useless on the drive.  Your cow has a misfolded protein, a virus, malware, something in it that should not be there.  You hopefully catch it before it spreads, but if you do not, you can lose the entire herd.  Rapid quarantine may not be enough, and euthanasia may not be enough, if the animal is already infectious before identifiable symptoms appear.  Economic disaster.

The problem is that security killed the cow.

General-purpose pools of compute, my overly obvious metaphor now apparent, suffer the same issue.  There are very few controls, if any, to stop an infection on one node from spreading laterally to the rest of the pool.  So security comes in…  and this is where the cow dies.  We end up making trade-offs between scale, performance, reliability, simplicity, and security.  I have seen more than a few private cloud projects go from transformative successes to mediocre failures, primarily because the requirement to segment one application from another forced the creation of physically separate zones.  Often this was at the behest of regulators and auditors who may not have had a clear or current picture of the technologies available, or because the business environment faced significant cyber threats.

I would argue that we should not treat servers, VMs, and applications like cattle; we need to treat them like prisoners.  Pretty rough analogy, eh?  But let’s think about it objectively, without all the stigma associated with the rather intensive nature of many penal systems.  For the sake of argument, think about this in whatever form you think a prison should take – there are a few things that are common and necessary:

  • Control access from the outside world into your prison. Obvious things like gates and walls keep the general public from wandering into the prison.  Deeper inspection points, man traps, guards, and scanners verify that authorized guests don’t bring in anything ‘extra’.  Suppliers are verified too, so that food service, linens, etc. are vetted and nothing extra rides along with the inbound shipments.
  • Check everything going out.  Part of the goal of a prison is to keep the prisoner in, duh.  So we check everything going out just as rigidly as we inspect that which comes in.  Here, though, rather than looking for weapons, keys, or contraband, we are looking for prisoners and unauthorized communications.
  • Monitor and control comms.  Ideally you don’t want to operate a prison that lets a criminal boss keep running their empire while serving time, and you don’t want to enable any retribution against witnesses or rivals either.
  • Segment the prison population.  Men and women, juveniles and adults, mentally ill, physically infirm, and exceptionally dangerous are often isolated into different segments within the prison.  The most dangerous are often isolated in solitary where additional compensating controls are put in place to protect others from them.  High value prisoners who are awaiting trial where they will offer valuable testimony are also sometimes put into solitary, more for their protection as they are targets in general population.
  • Lastly, for the purposes of the analogy, the prison is responsible for maintaining the prisoners’ health.  From food and water to health care, the prisoners have a basic set of services provided.

If we roll this back to building a scalable IT infrastructure, we end up with a more integrated model than most companies have implemented. Here are a few examples of how to treat servers, whether physical or virtual, and how to apply warden-style policies to our environments:

  • Control all access into our infrastructure.  This means not just perimeter security, but also controlled points of inspection; individual cells for the systems, applications, and data; and verification of every person who has physical or communications access to our systems.
  • We must apply the same diligence to data exiting as we do to data entering.  It is easier to do this close to the origin point of the data – the more ‘hops’ away from its source the data is inspected, the more points of exposure you have to manage.
  • When designing a prison you must segment the control infrastructure from the regular communications networks, which are more easily compromised from inside and outside.  You protect these control systems more rigidly than the others because a compromise there could cost you the entire prison – the classic Hollywood scene where a few prisoners take over the guard room and release all the rest.  (My personal favorite is when Rocket orchestrates the escape in Guardians of the Galaxy.)
  • If you build a prison of zones, then you have to provide policy enforcement at the choke-points, and application/workload portability is compromised.  If you build a system where each zone is a ‘zone of one’, then you can put pretty much any workload on any system and have complete application/VM/workload portability.  The challenge with these types of systems is that you have to trust that your policies are actually in effect and that no one can easily bypass them.  Bypassing them is far too easy in agent-based models, or in traditional network-segment-based models that depend on a host assigning a tag correctly.
  • Policy enforcement has to validate at the application layer today – iptables was so 1998.   If you are protecting web workloads, plan for the next Heartbleed; if you are protecting Active Directory, expect to see a golden ticket; if you are protecting a control point system, be sure to validate where administrative sessions are coming from and use more than one authentication factor.
  • Verify clean-source software from vendors (signed ISOs) and ensure you have adequate protections on your own code-signing servers for internal development – nothing is more embarrassing than knowing someone owned your Git repo and you did the bad guys a favor and DevOps’d their malware all over the place for them.
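
To make the ‘zone of one’ idea a bit more concrete, here is a minimal sketch in Python.  Everything in it is hypothetical and made up for illustration (the `Flow` and `ZoneOfOnePolicy` names, the workload names, the protocols); it is not any particular product’s API.  It assumes the enforcement point can reliably identify the workload on each end of a connection, which is exactly the trust problem noted above.  The point it tries to show is that the policy is default-deny, directional, and keyed to the workload rather than to a host or network segment, so moving a workload to another machine does not change the answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One direction of traffic between two workloads (hypothetical model).

    The host or network segment is deliberately not part of the flow, so a
    policy decision cannot depend on where a workload happens to be running.
    """
    src: str           # identity of the sending workload
    dst: str           # identity of the receiving workload
    app_protocol: str  # application-layer protocol, e.g. "https"


class ZoneOfOnePolicy:
    """Default-deny allow list: every workload is its own zone."""

    def __init__(self) -> None:
        self._allowed: set[Flow] = set()

    def allow(self, src: str, dst: str, app_protocol: str) -> None:
        """Explicitly permit one direction of one application protocol."""
        self._allowed.add(Flow(src, dst, app_protocol))

    def permits(self, flow: Flow) -> bool:
        """Anything that was not explicitly allowed is denied."""
        return flow in self._allowed


# Illustrative workload names only.
policy = ZoneOfOnePolicy()
policy.allow("web-frontend", "orders-api", "https")

# The same check would run on the way out of the sender and on the way in to
# the receiver, so exiting traffic gets the same scrutiny as entering traffic.
for flow in (
    Flow("web-frontend", "orders-api", "https"),    # explicitly allowed
    Flow("orders-api", "web-frontend", "https"),    # reverse direction
    Flow("web-frontend", "billing-db", "postgres"), # lateral movement attempt
):
    verdict = "forward" if policy.permits(flow) else "drop"
    print(f"{flow.src} -> {flow.dst} ({flow.app_protocol}): {verdict}")
```

Only the first flow is forwarded; the reverse direction and the lateral hop are both dropped because nobody allowed them.  A real enforcement point would also have to inspect the application-layer content itself, which this sketch does not attempt.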

I suppose I could go on and on at this point.  But as a parting thought, skim this doc: it’s the Jail Design Guide from the National Institute of Corrections.  I am betting that if you do a find/replace of a few choice words and phrases, it starts looking like a solid IT design guide that includes workflows for adding new applications, segmenting them, physical plant requirements, etc. 🙂

Something to think about…