Chapter 4:  Planning for Resiliency and Robustness

J.J.C.H. Ryan

Student Learning Objectives: 

After completing this block, the student will be able to:

— describe the difference between resiliency and robustness

— describe different ways that resiliency might be enhanced

— explore ways in which resiliency can be measured or estimated

— describe different ways that robustness might be enhanced

— describe how robustness can be measured

— conceptualize attacks on resiliency and explain cascading effects of successful attacks

— conceptualize attacks on robustness and explain cascading effects of successful attacks

— describe the cost-benefit trade space associated with resiliency and robustness

— explain how operational secrecy can be used as part of resiliency and robustness

— conceptualize protections to systems that can shore up resiliency

 

Understanding the Difference between Resiliency and Robustness

A stone aqueduct built by the Romans to carry water over hundreds of miles exists to this day.  It is robust.  An aspen tree quivers in the winds, perhaps loses a few leaves, but continues to live after the storm has passed.  It is resilient.

Both of these attributes are important.  But they can be the subject of choices in design: the aspen tree is both resilient and robust while the aqueduct is only robust and not resilient.  Should assault or insult cause an aqueduct to break and fall to the ground, it would take a great deal of effort to rebuild and mend the structure (World Monuments Fund, 2016).  Were the aspen tree to be subjected to an axe, the individual tree would be felled quickly enough, but the organism would continue: the vast majority of the “tree” is a large underground root system (Featherman, 2014).  Soon a new shoot would emerge to replace the aspen that had been cut down.

The concepts of robustness and resiliency seem simple enough, so it is striking that they are so difficult to define and measure.  The New Webster’s Dictionary simply defines robust as “strong, healthy.”  It defines resilient as “springing back; buoyant.” (Bolander (ed.) & Stodden, 1986) These definitions are not useful for engineering purposes.  In this chapter, the concepts of resiliency and robustness will be explored through the lens of security, focusing on how C-UAS operations can exploit the various aspects of both attributes for compromise.  To start, baseline operational definitions are offered so that a common language is possible for the subsequent analyses.

Resiliency

In exploring the literature, the varying definitions of resiliency do not stray far from the definition quoted above.  Two ideas permeate the definitions: first, the ability to return to a previous state; and second, the amount of time needed to return to that state.  Systems that are able to return to the previous state in a short period of time are said to have high resilience while those that take a longer period of time are said to have low resilience. (Hollnagel, 2016) (National Academy of Engineering, 1996)
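The time component of this definition can be made concrete with a small calculation.  The following Python sketch is only an illustration under stated assumptions: the function name recovery_time, the idea of a "baseline" reading, and the tolerance band are hypothetical and are not taken from the cited sources.  It measures how long a monitored value stays outside its normal band after a disturbance; a shorter result suggests higher resilience.

```python
from typing import Optional, Sequence


def recovery_time(times: Sequence[float], samples: Sequence[float],
                  baseline: float, tolerance: float) -> Optional[float]:
    """Time from the first departure from the baseline band to the first return.

    Returns 0.0 if the samples never leave the band, and None if the
    system leaves the band and has not returned by the last sample.
    """
    disturbed_at = None
    for t, value in zip(times, samples):
        inside_band = abs(value - baseline) <= tolerance
        if disturbed_at is None:
            if not inside_band:
                disturbed_at = t          # first sample outside the band
        elif inside_band:
            return t - disturbed_at       # first sample back inside the band
    return 0.0 if disturbed_at is None else None


# Hypothetical readings: the value dips at t=2 and is back in band at t=4.
times = [0, 1, 2, 3, 4, 5]
samples = [10, 10, 4, 6, 9.8, 10]
print(recovery_time(times, samples, baseline=10, tolerance=0.5))  # 2
```

In this framing, the aspen would show a short recovery time after a gust of wind, while the aqueduct would show a very long one, or none at all within the observation window.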

There are several design features that enable or increase resiliency.  First, a system must have an ability to respond to anything that changes its state.  Next, the system needs to be able to monitor its state, being alert to internal or external changes that could affect it.  Third, the ability to “learn” is useful: keeping track of previous experiences, responses to those experiences, and the results of those responses can provide the ability to more quickly respond appropriately.  Finally, the ability to anticipate challenges or changes can accelerate the detection of issues and subsequent responses. (Hollnagel, 2016)

For the aspen tree, the ability to bend in the wind allows it to return to its previous state quickly, once the wind has calmed.  Evolution has provided the aspen with that ability, having “learned” over millennia that wind exists and how to respond appropriately.  These functions are internal to the aspen ‘system’ and are reproduced for each instantiation of aspen.  Thus, it is possible to characterize the aspen as having high resilience.

The aqueduct, on the other hand, is entirely dependent on external forces to return to its functioning state: people to identify a problem, care enough to respond, and commence the labor needed to repair the structure.  In the context of an aqueduct system that includes the architects, laborers, and taxpayers, it has many of the design features, such as learning and anticipating, but the time to respond and repair is very long.  The aqueduct has low resilience.

Robustness

The definitions in the literature regard robust design as a concept separate from robustness.  There is some suggestion that following robust design processes will result in robustness, where a robust system is defined as one that is insensitive to variations, both internal and external.  There is no time component noted in these definitions although time does seem to lurk in the background: a system that fails soon is not robust whereas a system that lasts a long time is robust.

Robust design is a process that focuses on quality in order to reduce the vulnerability of the system as a whole to problems that it may encounter.  There are three components of robust design: system, parameter, and tolerance, with a focus on increasing quality during manufacturing rather than trying to “inspect in quality” after manufacturing. (Wysk, Niebel, Cohen, & Simpson, 2000) (Maurer & Lau, 2000)

The aqueduct design and build process used by the Romans focused on product improvement at every step, including research and development of better materials to increase the effectiveness of the system.  Continual maintenance was performed regularly until the organizing structure of the Roman Empire collapsed.  The aqueducts continued to exist for a long time after the end of regular maintenance. (World Monuments Fund, 2016)  They were highly resistant to variations and, as a result, very robust.

The aspen is a wonder of nature: most of it is underground and hence able to withstand the insults and challenges associated with environmental and technical changes.  The oldest aspen stand is estimated to be more than 80,000 years old (Featherman, 2014).  It is highly resistant to variations, has great lasting power, and is, as a result, very robust.

Comparing Resiliency and Robustness

Table 4-1 summarizes the differences between resiliency and robustness discussed above:

Table 4-1: Summary of Resiliency and Robustness

|            | Attributes                              | Time Component |
|------------|-----------------------------------------|----------------|
| Resiliency | Ability to respond to undesired changes | ∂t ~ 0         |
|            | Ability to monitor current state        |                |
|            | Ability to learn from experiences       |                |
|            | Ability to anticipate challenges        |                |
|            | Quick to recover to desired state       |                |
| Robustness | Insensitive to component variation      | T >> 0         |
|            | Insensitive to parameter variation      |                |
|            | Tolerant of environmental variation     |                |
|            | Lasts a relatively long time            |                |

Source: Ryan, J.J.C.H Notes (2020)


Operational Aspects of Resiliency and Robustness

Resiliency and robustness are important considerations in system design and operations.  Integrating components that enhance these two attributes into a system can be costly, which means that design trade-offs may have to be made.  Moreover, sometimes neither resiliency nor robustness is a desirable attribute.  For example, single-use plastic kitchen waste bags are intended to be flimsy and easily degraded environmentally, although the nature of the material renders a level of robustness that is undesired (United Nations, 2018).  On the other hand, materials scientists have recently created a type of plastic that can self-destruct when exposed to sunlight:

Engineers at the Georgia Institute of Technology have developed a new type of plastic that can form flexible sheets and tough mechanical parts—then disappear in minutes to hours when hit by ultraviolet light or temperatures above 176 degrees Fahrenheit. … DARPA has already used the plastic to make light, strong gliders and parachutes. Last October the agency field-tested one of these vehicles: dropped from a high-altitude balloon at night, a glider successfully delivered a three-pound package to a spot 100 miles away. After four hours in the sun, it vanished, leaving behind nothing but an oily smudge on the ground. (Patel, 2019)

The example given in the story illustrates an obvious use for disappearing plastic: short-term mission execution with very little forensic residue.  Adversaries planning attacks on distant targets could use these types of materials to launch their attacks without leaving much behind for investigators to find.  C-UAS planners might use this type of design feature as a focus for attack.

Deciding how much resiliency and how much robustness is needed for a given system is a design choice and must be made in consideration of the overall mission goals.

 

Measuring Resiliency and Robustness

As noted in the discussion regarding the definitions of robustness and resiliency, measurement of such attributes is only possible in relation to the system mission goals.  If a system is designed for preplanned product obsolescence (Buck, 2017) (Patel, 2019), then it is right and appropriate to design it with a planned lifetime.  In fact, the robustness of that product is appropriately measured by its ability to last the planned lifetime.  If it does so reliably, then it can be considered robust.  If there is a non-trivial chance of it failing prior to planned end of life, then it can be considered not robust.  Similarly, resilience must be measured relative to the mission goals.  If the mission has a goal to linger over a territory for a period of time, then resiliency can be measured by the ability of the system to react to and recover from expected problems during that period of time.  These attributes must be carefully considered and designed into the system from the beginning.
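One way to express "lasting the planned lifetime, reliably" is as the fraction of observed units that survive at least as long as planned.  The Python sketch below is a minimal illustration, not a standard reliability method: the function names, the sample lifetimes, and the 95 percent threshold are all hypothetical assumptions chosen for the example.

```python
def survival_fraction(lifetimes_hours, planned_hours):
    """Fraction of observed units that lasted at least the planned lifetime."""
    if not lifetimes_hours:
        raise ValueError("need at least one observed lifetime")
    survivors = sum(1 for hours in lifetimes_hours if hours >= planned_hours)
    return survivors / len(lifetimes_hours)


def is_robust(lifetimes_hours, planned_hours, required_fraction=0.95):
    """Robust relative to the mission if enough units reach planned end of life.

    The 0.95 default is an assumed threshold, not a value from the text.
    """
    return survival_fraction(lifetimes_hours, planned_hours) >= required_fraction


# Hypothetical test data: ten airframes, lifetimes in flight hours.
observed = [120, 130, 95, 140, 150, 125, 110, 160, 135, 128]
print(survival_fraction(observed, planned_hours=100))  # 0.9
print(is_robust(observed, planned_hours=100))          # False
print(is_robust(observed, planned_hours=90))           # True
```

The same fleet is judged not robust for a 100-hour mission but robust for a 90-hour mission, which illustrates why the measurement only has meaning relative to the mission goal.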

 

How Processes can Boost Resiliency and Robustness

Resiliency and robustness do not need to be concerns borne solely by single components or even single systems.  Having redundant systems can boost both resiliency and robustness, if those redundant systems are integrated appropriately.  It does no good to have redundant systems or elements if such components are equally vulnerable to expected attacks or insults.  Redundant processes can additionally assist in delivering resiliency through the augmentation of learning and detection capabilities.  Having redundant processing channels that double check the precision and appropriateness of the primary processing channel is a very valuable method of monitoring the state of the system and ensuring that it is operating correctly.
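The point about equally vulnerable redundant elements can be illustrated with simple probability arithmetic.  The sketch below assumes a deliberately simplified model that is not drawn from the cited sources: each redundant unit fails independently with probability p_unit, and with probability p_common a single shared weakness defeats every unit at once.  All names and numbers are hypothetical.

```python
def system_failure_probability(p_unit, n_units, p_common=0.0):
    """Probability that all redundant units are lost.

    Assumed model: a common-mode event (probability p_common) defeats every
    unit together; otherwise units fail independently with probability p_unit.
    """
    if not (0.0 <= p_unit <= 1.0 and 0.0 <= p_common <= 1.0):
        raise ValueError("probabilities must lie in [0, 1]")
    if n_units < 1:
        raise ValueError("need at least one unit")
    return p_common + (1.0 - p_common) * p_unit ** n_units


# Three redundant units, each with a 10% chance of independent failure.
print(f"{system_failure_probability(0.1, 3):.4f}")                 # 0.0010
# Same design, but a shared vulnerability is exploitable 10% of the time.
print(f"{system_failure_probability(0.1, 3, p_common=0.1):.4f}")   # 0.1009
```

Under these assumptions the shared vulnerability dominates: adding more identical units barely changes the second figure, which is why redundancy only helps when the redundant elements do not share the same weakness.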

When Resiliency and Robustness are More Costly than Optimal

Engineering for increased resiliency and/or robustness costs resources: money, labor, energy, and space.  As such, the decisions must be carefully made.  In some cases, it is not possible to have precise data on the operational environment, in which case guesses must be made.  For example, the scientists and engineers developing the first-generation space systems had little empirical data to work with when trying to design the desired resiliency and robustness.  One thing they did know is that once the system was launched, it was going to be very difficult indeed to send a repair person after it.  Lacking data, the designers erred on the side of caution and, as a result, the early systems lasted much longer than expected (Gruss, 2014).

Those satellites were very expensive, but data to inform the decision space was for all practical purposes non-existent.  For most of the systems that are being designed for terrestrial purposes, ample data exists, and significantly more computing power exists to support modeling and simulation.  Costs can be extrapolated for both design improvements and marginal returns on investment, giving the product manager the ability to make rational decisions about expenditures for resiliency and robustness.  But these decisions cannot be made as cookie cutter decisions: just as robustness and resiliency are only measurable relative to mission goals, so are the costs associated with providing these attributes.
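A marginal-return comparison of this kind can be sketched in a few lines.  The example below is hypothetical throughout: the option names, the cost and benefit figures, and the stopping rule (stop at the first upgrade whose added benefit is less than its added cost, which assumes diminishing returns) are illustrative assumptions, not data from the chapter.

```python
def choose_investment_level(options):
    """Pick the last option whose marginal benefit covers its marginal cost.

    `options` is a list of (name, total_cost, expected_benefit) tuples sorted
    by increasing total cost, expressed in the same (assumed) units.
    """
    if not options:
        raise ValueError("need at least one option")
    chosen = options[0]
    for previous, current in zip(options, options[1:]):
        marginal_cost = current[1] - previous[1]
        marginal_benefit = current[2] - previous[2]
        if marginal_benefit < marginal_cost:
            break                      # further spending no longer pays off
        chosen = current
    return chosen


# Hypothetical design options for a UAS, in arbitrary cost/benefit units.
options = [
    ("baseline", 0, 0),
    ("redundant link", 10, 40),
    ("hardened airframe", 30, 55),
    ("triple redundancy", 70, 60),
]
print(choose_investment_level(options)[0])  # redundant link
```

Changing the benefit figures to reflect a different mission changes the answer, which is the sense in which such decisions cannot be made in a cookie cutter fashion.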

When Resiliency and Robustness are Attacked

Both the presence and absence of robustness and resiliency can be used as vectors for attack.  When robustness or resiliency is absent, the attacks are much more obvious.  It is when the systems have been designed with robustness or resiliency in mind that the attack challenge becomes interesting.

Candidate targets to be considered include (Ryan J. J., Information Warfare: A Conceptual Framework, 1997):

  • Autonomous Sensor Systems, which can be exploited to send false data back to the controlling system or used as conduits for other weapons such as viruses, logic torpedoes, and worms
  • The C2 Infrastructure, which includes Civilian and Strategic Leadership, the Decision Process, Societal Support Structures such as the police, and other governmental entities like the Bureau of Land Management and the Strategic Oil Reserves. Attacking these targets can sow discord in an opponent’s society, thereby fracturing the decision-making process or any consensus, deny an opponent the ability to marshal needed resources to rebuff an attack, or divert attention from other activities.
  • The Communications Infrastructure, including the physical part of a communications infrastructure which includes microwave antenna towers, switching stations, telephones, radios, computers, and modems. Non-physical portions include the data, electrical systems, and management support systems.
  • Logistics, including the computerized backbone that identifies supply requirements, positions materials, tracks deliveries, and schedules resources. Attacks on that backbone can severely impact the ability of the dependent forces to deploy or maintain a deployment.

There are many other targets, including the sensors and individual UAS systems, but it pays to think broadly about targets.

Types of Attacks

A system that is designed to be very robust is one that is expected to last for a long period of time, relative to its mission. The designers made the decision that it was necessary for the mission to engineer the components for enhanced robustness, which was a resource decision: simply stated, they decided it was worth the extra money, energy expenditures, labor, and time to make the system more robust.  The mission needs are for it to last, to persist.  Destroying or damaging such a system, then, is an obvious priority for an adversary. Discovering the relative robustness of each system is also an adversary priority, since it informs targeting decisions.

Similarly, a system that is designed to be resilient is one that has been imbued with the ability to recover quickly from challenges.  For such a system, a single attack is not likely to be (very) effective.  Instead, a series of attacks in intervals at a rate that overwhelms the recovery process may be appropriate.  For example, the distributed denial of service (DDOS) attack concept was developed when targets began designing interfaces that were resilient to normal denial of service (DOS) attacks (Cloudflare, 2020).
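The relationship between attack rate and recovery time can be explored with a toy simulation.  The sketch below rests on assumptions made purely for illustration: time advances in whole steps, an attack arrives every attack_interval steps, and each attack restarts a fixed recovery clock.  The function name and parameters are hypothetical and do not model any real system.

```python
def availability(attack_interval, recovery_time, horizon):
    """Fraction of time steps in which the system is operational.

    Assumed model: an attack lands at steps 0, attack_interval,
    2 * attack_interval, ...; each attack resets the recovery clock to
    recovery_time steps, during which the system is down.
    """
    if attack_interval <= 0 or recovery_time < 0 or horizon <= 0:
        raise ValueError("interval and horizon must be positive, recovery non-negative")
    remaining = 0      # steps of recovery still outstanding
    up_steps = 0
    for step in range(horizon):
        if step % attack_interval == 0:
            remaining = recovery_time
        if remaining == 0:
            up_steps += 1
        else:
            remaining -= 1
    return up_steps / horizon


# Slow attacks: the system recovers between hits and is mostly available.
print(availability(attack_interval=10, recovery_time=3, horizon=100))  # 0.7
# Attacks arriving as fast as the system can recover: it never comes back.
print(availability(attack_interval=3, recovery_time=3, horizon=100))   # 0.0
```

In this simplified model, once the attack interval is no longer than the recovery time, availability drops to zero, which is the effect an attacker seeks when pacing attacks against a resilient target.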

Revisiting the definitions of resiliency and robustness, the very attributes provide clues as to how to craft effective attacks (see Table 4-2):

 

Table 4-2: Attributes v Time

|            | Attributes                              | Time Component |
|------------|-----------------------------------------|----------------|
| Resiliency | Ability to respond to undesired changes | ∂t ~ 0         |
|            | Ability to monitor current state        |                |
|            | Ability to learn from experiences       |                |
|            | Ability to anticipate challenges        |                |
|            | Quick to recover to desired state       |                |
| Robustness | Insensitive to component variation      | T >> 0         |
|            | Insensitive to parameter variation      |                |
|            | Tolerant of environmental variation     |                |
|            | Lasts a relatively long time            |                |

Source: Ryan, J.J.C.H (2020)

 

Attacking resiliency should focus on slowing down or compromising entirely the ability to recognize and recover from state changes.  Attacking robustness may be best accomplished through sabotage in the manufacturing process.  Focusing on each of these attributes provides the C-UAS planner options for consideration.

In designing appropriate attacks, the C-UAS planner needs to consider system design and system operation.  Individual components of systems can prove to be the Achilles’ heels of larger systems.  Getting to this level of knowledge requires significant intelligence data support and analytical capability.

Cascading Effect Potential

One of the challenges associated with automated systems, such as UASs, is that there is a huge potential for them to be used in multiple system configurations, including swarms.  While the offensive potential of such swarms is large, it also provides a potential for cascading C-UAS effects.  For example, if a swarm has a single controlling entity, the jamming or destruction of that single entity makes the entire swarm vulnerable.  Analysis of the C-UAS potential should always consider the potential for creating effects that cascade from one system to another (Ryan, Woloschek, & Leven, Complexities in Conducting Information Warfare, 1996).
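Cascading effects of this kind can be reasoned about by treating the swarm as a dependency graph.  The sketch below is a simplified illustration: the unit names, the dictionary layout, and the rule that a unit is lost as soon as anything it depends on is lost are all assumptions made for the example.

```python
from collections import deque


def cascade(depends_on, initially_disabled):
    """Return every unit disabled once failures propagate through dependencies.

    `depends_on` maps each unit to the set of units it needs in order to
    operate.  Assumed rule: a unit is disabled if any unit it needs is disabled.
    """
    # Invert the mapping so that we can look up who relies on a failed unit.
    dependents = {}
    for unit, needs in depends_on.items():
        for needed in needs:
            dependents.setdefault(needed, set()).add(unit)

    disabled = set(initially_disabled)
    queue = deque(disabled)
    while queue:
        failed = queue.popleft()
        for unit in dependents.get(failed, ()):
            if unit not in disabled:
                disabled.add(unit)
                queue.append(unit)
    return disabled


# Hypothetical swarm: two UASs rely on one controller, a third relays via uas2.
swarm = {
    "controller": set(),
    "uas1": {"controller"},
    "uas2": {"controller"},
    "uas3": {"uas2"},
}
print(sorted(cascade(swarm, {"controller"})))  # ['controller', 'uas1', 'uas2', 'uas3']
print(sorted(cascade(swarm, {"uas1"})))        # ['uas1']
```

Defeating the single controlling entity takes down the whole hypothetical swarm, whereas defeating one leaf vehicle affects only that vehicle; mapping such dependencies is one way to identify where a C-UAS effect is likely to cascade.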

The Role of Secrecy

Because of the obvious implications of the preceding discussion, secrecy associated with all aspects of UAS operations can be a paramount consideration.  UAS operators should be mindful of adversaries attempting to discover information useful to the adversaries’ C-UAS activities.  C-UAS planners should be careful of adversaries trying to discover intent and capabilities of the C-UAS efforts.  The types of secrecy considerations span operations, capability, and resiliency/robustness attributes.

Operational Secrecy

Normal operations can provide hints to how resiliency and robustness are engineered into a system.  When conducting UAS operations, caution might be warranted to disguise or hide operational patterns or capabilities.  Obviously, the longer a system is in use, the harder this becomes and the potential for secrecy dwindles to simply secrecy regarding current operations.  But even this can be valuable.

From a C-UAS perspective, observing adversary training and operational patterns can provide a great deal of information regarding capabilities and intentions.  Even such apparently minor things as the types of personnel expertise being acquired or the amount of energy being used can provide clues.  Clues provide lines of inquiry for potential targeting and C-UAS mission planning.  Granted, a huge part of the C-UAS problem arises when the adversary fleet is inbound, but do not overlook the opportunity to subvert it before it is launched.

Capability Secrecy

Hiding or disguising capabilities is always a popular choice.  For C-UAS planners, care should be taken to test hypotheses thoroughly to ensure that the adversary has not managed to confound the intelligence gathering and analysis process regarding the UAS missions and capabilities.

Resiliency and Robustness Secrecy

Adversaries may go to some lengths to hide the actual nature of how robust or resilient their systems might be.  In some cases, the systems may be quite frail, contrary to the data revealed by the adversary.  In other cases, the systems may be much more capable and resilient than expected.  In either case, the potential for a target-weapon-effects match might be affected, to the detriment of both the nature of the conflict and the geo-political stability.  Getting it right is important and no information should be taken at face value.

Questions for Reflection

  1. You are planning a C-UAS operation against an adversary that has very robust UASs. Your intelligence support activity has verified this level of robustness.  Is your best option to try to sabotage the systems while they are in production, in the field awaiting launch, or while in flight?  What are the trade-offs associated with each choice?
  2. A spy has revealed that an adversary has been outfitting recreational UASs with secret surveillance capabilities. These UAS systems have been advertised during the recent holiday season at deep discounts and, as a result, the sales of the systems have skyrocketed.  Part of the secret surveillance system is an AI system that detects unauthorized activity and self-destructs to avoid any information being extracted.  You have been charged with coming up with a way to subvert these capabilities.  What are your alternatives?
  3. You are on guard duty and the alarm has just been raised that a swarm of very resilient UASs is inbound on an intelligence collection mission. What are your options?

 

References

AirForceTechnology.com. (2019, June 19). The 10 longest range unmanned aerial vehicles (UAVs). Retrieved January 7, 2020, from AirForceTechnology.com: https://www.airforce-technology.com/features/featurethe-top-10-longest-range-unmanned-aerial-vehicles-uavs/

Bolander (ed.), D. O., & Stodden, V. L. (1986). The New Webster’s Dictionary. New York, New York, USA: Lexicon Publications, Inc.

Buck, S. (2017, March 3). GM invented planned obsolescence during the Great Depression, and we’ve been buying it ever since. Retrieved January 30, 2020, from TimeLine: https://timeline.com/gm-invented-planned-obsolescence-cc19f207e842

Bursztein, E. (2018, May 1). Attacks against machine learning — an overview. Retrieved January 29, 2020, from Blog: AI: https://elie.net/blog/ai/attacks-against-machine-learning-an-overview/

Cloudflare. (2020, January 1). What is a DDOS Attack? Retrieved January 30, 2020, from Cloudflare Learning: https://www.cloudflare.com/learning/ddos/what-is-a-ddos-attack/

Coram, R. (2010). Boyd: The Fighter Pilot Who Changed the Art of War. New York: Hachette Book Group.

Featherman, H. (2014, March 21). Tree Profile: Aspen – So Much More Than a Tree. Retrieved January 30, 2020, from Blog: Trees: https://www.nationalforests.org/blog/tree-profile-aspen-so-much-more-than-a-tree

Fisher, J. (2020, January 27). The Best Drones for 2020. Retrieved January 29, 2020, from PC Magazine: https://www.pcmag.com/picks/the-best-drones

Forsling, C. (2018, July 30). I’m So Sick of the OODA Loop. Retrieved November 6, 2019, from Task and Purpose: https://taskandpurpose.com/case-against-ooda-loop

Gambrell, J. (2020, January 11). Crash may be grim echo of US downing of Iran flight in 1988. Minnesota Star Tribune, p. 1.

Goel, A. (2018, February 2). How Does Siri Work? The Science Behind Siri. Retrieved January 29, 2020, from Magoosh Data Science Blog: https://magoosh.com/data-science/siri-work-science-behind-siri/

Gray, R. (2017, March 1). Lies, propaganda and fake news: A challenge for our age. Retrieved January 29, 2020, from BBC Future: https://www.bbc.com/future/article/20170301-lies-propaganda-and-fake-news-a-grand-challenge-of-our-age

Green, M. (2013, January 1). Driver Reaction Time. Retrieved January 29, 2020, from Visual Expert: https://www.visualexpert.com/Resources/reactiontime.html

Gruss, M. (2014, February 24). Long-lasting Milsats Give U.S. Time to Consider Next Steps. Retrieved January 30, 2020, from Space News: Military Space Quarterly: https://spacenews.com/39608military-space-quarterly-long-lasting-milsats-give-us-time-to-consider/

Halloran, R. (1988, July 4). The Downing of Flight 655. New York Times, p. 1.

Hollnagel, E. (2016, January 1). Resilience Engineering. Retrieved January 30, 2020, from Erik Hollnagel Ideas: https://erikhollnagel.com/ideas/resilience-engineering.html

Huang, A. (2006, January 1). A Holistic Approach to AI. Retrieved January 29, 2020, from Ari Huang Research: https://www.ocf.berkeley.edu/~arihuang/academic/research/strongai3.html

James, R. (2019, October 30). Understanding Strong vs. Weak AI in a New Light. Retrieved January 4, 2020, from Becoming Human AI: https://becominghuman.ai/understanding-strong-vs-weak-ai-in-a-new-light-890e4b09da02

Kenton, W. (2019, February 12). Stock Market Crash of 1987. Retrieved January 29, 2020, from Investopedia: https://www.investopedia.com/terms/s/stock-market-crash-1987.asp

Loon LLC. (2020, January 1). Loon.com. Retrieved January 29, 2020, from Loon.com: https://loon.com

Maurer, K., & Lau, S. (2000, February 11). Robust Design. Retrieved January 30, 2020, from IE 361: https://vardeman.public.iastate.edu/IE361/s00mini/maurer.htm

McCausland, P. (2019, November 9). Self-driving Uber car that hit and killed woman did not recognize that pedestrians jaywalk. Retrieved January 29, 2020, from NBC News: https://www.nbcnews.com/tech/tech-news/self-driving-uber-car-hit-killed-woman-did-not-recognize-n1079281

Miller, E. K. (2017, April 11). Multitasking: Why Your Brain Can’t Do It and What You Should Do About It. Retrieved January 4, 2020, from Miller Files: https://radius.mit.edu/sites/default/files/images/Miller%20Multitasking%202017.pdf

Moisejevs, I. (2019, July 14). Poisoning attacks on Machine Learning. Retrieved January 29, 2020, from Towards Data Science: https://towardsdatascience.com/poisoning-attacks-on-machine-learning-1ff247c254db

Morgan, T. P. (2019, November 13). INTEL THROWS DOWN AI GAUNTLET WITH NEURAL NETWORK CHIPS. Retrieved January 29, 2020, from The Next Platform: https://www.nextplatform.com/2019/11/13/intel-throws-down-ai-gauntlet-with-neural-network-chips/

National Academy of Engineering. (1996). Engineering Resilience versus Ecological Resilience. In N. A. Engineering, Engineering Within Ecological Constraints. Washington, DC, USA: The National Academies Press.

Nichols, R. K., Ryan, J. J., & Ryan, D. J. (2000). Defending Your Digital Assets Against Hackers, Crackers, Spies, and Thieves. New York: McGraw Hill.

Nuance. (2020, January 1). Dragon Speech Recognition Solutions. Retrieved January 29, 2020, from Nuance Products: https://www.nuance.com/dragon.html

Patel, P. (2019, August 26). Disappearing Plastics Stay Strong in the Shadows and Melt Away in the Sun. Retrieved January 30, 2020, from Scientific American: Chemistry: https://www.scientificamerican.com/article/disappearing-plastics-stay-strong-in-the-shadows-and-melt-away-in-the-sun1/

Richards, C. (2012, March 21). Boyd’s OODA Loop: It’s Not What You Think. Retrieved July 27, 2019, from Fast Transients Files: https://fasttransients.files.wordpress.com/2012/03/boydsrealooda_loop.pdf

Ryan, J. J. (1997, January 1). Information Warfare: A Conceptual Framework. Retrieved January 30, 2020, from Proceedings of the 1996 Seminar on Intelligence, Command, and Control: http://www.pirp.harvard.edu/pubs_pdf/ryan/ryan-i97-1.pdf

Ryan, J. J. (1997, September 80). Lecture Notes, EMSE 218/6540/6537. (J. J. Ryan, Performer) George Washington University, Washington, DC, USA.

Ryan, J. J. (2001, November 12). Security Challenges in Network-Centric Warfare. (J. J. Ryan, Performer) George Washington University, Washington, DC, USA.

Ryan, J. J., Woloschek, J., & Leven, B. (1996, April 1). Complexities in Conducting Information Warfare. Defense Intelligence Journal, 5(1), 69-75.

Sampson, B. (2019, February 20). Stratospheric drone reaches new heights. Retrieved January 5, 2020, from Aerospace Testing International: https://www.aerospacetestinginternational.com/features/stratospheric-drone-reaches-new-heights-with-operation-beyond-visual-line-of-sight.html

Tarm, M. (2010, January 8). Mind-reading Systems Could Change Air Security. Retrieved March 1, 2011, from The Aurora Sentinel: http://www.aurorasentinel.com/news/national/article_c618daa2-06df-5391-8702-472af15e8b3e.html

Tozzi, C. (2019, October 16). Is Cloud AI a Fad? Retrieved January 29, 2020, from ITPro Today: https://www.itprotoday.com/cloud-computing/cloud-ai-fad-shortcomings-cloud-artificial-intelligence

United Nations. (2018, January 1). Plastic Pollution. Retrieved January 30, 2020, from UN Environment: https://www.unenvironment.org/interactive/beat-plastic-pollution/

Vincent, J. (2017, April 12). MAGIC AI: THESE ARE THE OPTICAL ILLUSIONS THAT TRICK, FOOL, AND FLUMMOX COMPUTERS. Retrieved January 29, 2020, from The Verge: https://www.theverge.com/2017/4/12/15271874/ai-adversarial-images-fooling-attacks-artificial-intelligence

Wakabayashi, D. (2018, March 19). Self-Driving Uber Car Kills Pedestrian in Arizona, Where Robots Roam. Retrieved January 29, 2020, from New York Times: https://www.nytimes.com/2018/03/19/technology/uber-driverless-fatality.html

Wikipedia. (2019, December 29). Loon LLC. Retrieved January 14, 2020, from Wikipedia: https://en.wikipedia.org/wiki/Loon_LLC

World Monuments Fund. (2016, January 1). The Quest to Save Segovia Aqueduct. Retrieved January 30, 2020, from World Monuments Fund Articles: https://www.wmf.org/sites/default/files/article/pdfs/the_quest_to_save_segovia_aqueduct.pdf

Wysk, R. A., Niebel, B. W., Cohen, P. H., & Simpson, T. W. (2000, January 1). Taguchi’s Robust Design Method. Retrieved January 30, 2020, from IE 466: Concurrent Engineering: https://www.mne.psu.edu/simpson/courses/ie466/ie466.robust.handout.PDF