Environments with industrial or automation control systems are built to ensure process availability and resilience. Availability is defined as "the quality of being able to be used or obtained" and resilience as "the capacity to recover quickly from difficulties; toughness." These definitions, however, do not necessarily account for the rampant connectivity now found within automation environments.
So, what happens to availability and resilience when process control technologies are connected to an Ethernet network and become digital, cyber assets? Organizations must immediately change their perception of threats and risk and ask: How do control environments achieve "cyber availability" and "cyber resilience"? The answer is as complex as the multitude of different types of industrial and automation control system environments.
For complex systems, the best road to availability and resilience is to break the system down into simpler parts and manage each as a part of the whole. For example, an electric utility may have separate electrical substations and natural gas compressor stations with a central control center that receives information from all remote locations. The approach to addressing these parts depends on your role in the organization and whether you will provide systemic change (strategic efforts) or tactical change (technical implementation). Tactical changes provide rapid risk reduction at a specific location while systemic changes address an organization’s cultural behavior.
For organizations rushing to address the rapid deployment of control environments that are remotely accessible from the internet, the efforts of strategic and tactical teams must be coordinated to ensure these systems are resilient to attack and remain available. A plan that focuses on defense-in-depth, process recovery plans, and network service reduction is one approach to consider.
Risk reduction requires knowledge of deployed assets and current operational procedures at each location. This information is necessary to implement a defense-in-depth methodology that prepares for process recovery while reducing risk from external threat actors. You cannot protect what you do not know; a list of hardware and software assets, categorized by criticality, will provide a team with the starting point for evaluating the current implementation.
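As a minimal sketch of what such a list might look like, the Python below models an inventory and orders it by criticality. The three-level scale, the field names, and every asset name and location are hypothetical placeholders chosen for illustration; a real inventory would likely carry more detail, such as firmware versions and owners.

```python
from dataclasses import dataclass
from enum import IntEnum


class Criticality(IntEnum):
    # Hypothetical three-level scale; a lower value means more critical.
    HIGH = 1
    MEDIUM = 2
    LOW = 3


@dataclass(frozen=True)
class Asset:
    name: str                 # hypothetical asset identifier
    kind: str                 # e.g. "PLC", "HMI", "server"
    location: str             # site where the asset is deployed
    criticality: Criticality


def by_criticality(assets):
    """Return assets sorted most critical first, then by name."""
    return sorted(assets, key=lambda a: (a.criticality, a.name))


# Illustrative entries only; none of these come from a real environment.
inventory = [
    Asset("hist-01", "server", "control-center", Criticality.MEDIUM),
    Asset("plc-07", "PLC", "substation-a", Criticality.HIGH),
    Asset("eng-ws-02", "workstation", "control-center", Criticality.LOW),
    Asset("hmi-03", "HMI", "substation-a", Criticality.HIGH),
]

for asset in by_criticality(inventory):
    print(f"{asset.criticality.name:<6} {asset.kind:<11} {asset.name} ({asset.location})")
```

Sorting the list this way gives a team an ordered starting point for review: the most critical devices at each location are evaluated first.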
Process recovery plans need to be regularly reviewed and updated to address a prioritized list of concerns for digital network and fieldbus communications, including remote access, critical server and application unavailability, compromised credentials, and protection of device configurations. Many processes can be run manually for specific time periods. However, safe operations today rely heavily on information from remote assets, require management from a central control center, or depend on administration by integrator-provided subject matter experts (SMEs).
Attacks on a process’ availability will generally be conducted against network services that accept data. The network services of Human Machine Interfaces (HMIs), Programmable Logic Controllers (PLCs), Remote Terminal Units (RTUs), Uninterruptible Power Supplies (UPSs), servers, and workstations are considered the process’ attack surface.
Reduction of this attack surface is necessary to improve availability. Asset management and process recovery plans provide details about the network services that are required for operations. Detailing necessary communications allows for the configuration and management of enforcement boundary technologies, implementing centralized logging for network devices and authentication servers, and improving system, device, and software configurations.
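One way to picture this step is as a comparison between the services a process is documented to need and the services actually found listening on each asset. The Python below is a rough sketch under that assumption. The asset names and both data sets are hypothetical; in practice the required list would come from asset management and recovery planning, and the observed list from a scan or device configuration review.

```python
# Services each asset needs for operations (hypothetical documentation).
required = {
    "plc-07": {("tcp", 502)},   # Modbus/TCP
    "hmi-03": {("tcp", 443)},   # HTTPS
}

# Services actually found listening on each asset (hypothetical findings).
observed = {
    "plc-07": {("tcp", 502), ("tcp", 80), ("tcp", 23)},
    "hmi-03": {("tcp", 443), ("tcp", 3389)},
}


def unneeded_services(required, observed):
    """Map each asset to its listening services that are not documented as required."""
    result = {}
    for asset, services in observed.items():
        # An asset missing from `required` is treated as needing no services.
        extra = services - required.get(asset, set())
        if extra:
            result[asset] = sorted(extra)
    return result


for asset, services in unneeded_services(required, observed).items():
    for protocol, port in services:
        print(f"{asset}: review {protocol}/{port}")
```

Each reported service is a candidate to disable on the device or to block at an enforcement boundary, after confirming with operations that it is truly not needed.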
To achieve rapid risk reduction, teams must concentrate on identifying and prioritizing the work that yields quick and noticeable improvements. As each effort is implemented, strategic and tactical teams will have more actionable intelligence to feed into the next effort. Agile development strategies can be followed to achieve quick wins while also distributing lessons learned to each team. This model will ensure that all team members are working toward the same focused goal of improving process availability and resilience. After a few passes, the team will become more adept at the process and move quickly to improve the effort for continuous risk identification and mitigation.
Don C. Weber is the Principal Consultant and Founder at Cutaway Security, LLC, an information security consulting company. Don's previous experiences include large-scale incident response efforts for organizations with international assets and interests, the certification and accreditation of classified federal and military systems, assessment and penetration testing of worldwide commercial assets, and, as a Navy contractor, the management of a team of distributed security professionals responsible for the security of mission-critical Navy assets.