Whether the cause is the increased technical complexity of the bulk power system and grid, more extreme weather events, or rising physical and cybersecurity threats, the direct threats to the power grid have never been greater. For security teams responsible for protecting the technology that supports power transmission systems, the one notable risk under their direct control is the vulnerabilities in their OT devices.
Notably, the North American Electric Reliability Corporation (NERC) 2023 State of Reliability Technical Assessment found that as the risks against the grid have increased, the planning put into place years ago to protect the grid did not account for increased connectivity and system complexity and is no longer fully aligned to defend against modern threats. However, experts say that new devices and software for managing the grid are shipping with better code and design quality, which cuts down on commodity vulnerabilities. Although the work is challenging, the vulnerability management efforts of grid participants are improving.
NERC president and CEO James B. Robb explained before the Senate Committee on Energy and Natural Resources in early June that efforts to mitigate reliability risks have largely been successful. “Many conventional risks that challenge the grid have now been reduced by significant margins and continue to trend in a positive direction overall,” Robb said.
The 2023 State of Reliability Technical Assessment documents a five-year trend of increased power transmission reliability, which Robb attributed to declining equipment failures, improved human performance, better situational awareness, and effective vegetation management programs. “There have been no cyber events impacting bulk electric system facilities, and there have been no outages associated with substations deemed critical,” Robb added. Still, the digital risks aren’t static.
As grid operators continue their transformation efforts, the increasingly distributed energy resources connected to distribution systems, the modern smart devices required to manage these systems, and the large data sets and the analysis they require all increase the addressable attack surface, the NERC State of Reliability report stated.
“The electricity industry, like most others, will continue to face these cyber and physical threats now and into the future. New systems and applications that manage distributed energy resources require security integration at every level of planning, design, implementation, and operation in order to maintain bulk power system reliability and position the industry on a strong security footing,” the report detailed.
While successful attacks remain plausible, particularly if vulnerabilities aren’t managed adequately, Chris Sistrunk, Mandiant technical leader, ICS/OT at Google Cloud, contends that most bulk power system utilities understand the evolving threat landscape. “[They] have worked hard to significantly reduce the impact of these risks by testing their cyber defenses with vulnerability assessments, penetration/red team assessments, purple team/validation assessments, and tabletop exercises,” he said.
Still, according to the NERC State of Reliability report, software and communications issues, including software defects and protocol weaknesses that may impact availability, ranked high as causes behind the loss of energy management system functionality.
Among the essential tools of the bulk power system are the energy management systems (EMS) used throughout it. The EMS is a group of systems that coordinate, control, and monitor parts of the electric grid, including providing situational awareness of the bulk electric system. According to the NERC State of Reliability report, there were 52 categorized events, many of them cyber incidents, associated with EMS in 2022. In total, 322 EMS-related event reports were submitted between 2018 and 2022; to date, none of the reported EMS-related events has caused a loss of generation, transmission lines, or customer load.
When EMS functionality was lost, software and communications issues proved to be two significant factors. According to the NERC State of Reliability report, losing the ability to monitor or control, or both, at least part of an entity’s system was the top factor in system failure from 2018 to 2022. Such events include the loss of situational awareness or of communication between control centers.
The loss of energy management systems due to malfunctions and misconfigurations is risky enough. Couple those missteps with creative nation-state threat actors, a primary danger to power distribution, and the risk of disruption increases considerably.
Fortunately, the industry has taken measures to manage these risks, especially regarding monitoring and communications.
In 2022, the Electricity Information Sharing and Analysis Center (E-ISAC) shared 230 tailored security analytic products, conducted 90 intelligence briefings, and shared 870 individual information posts on relevant cyber and physical threats to the industry. These tailored products covered a variety of threats, including cyber activity adjacent to the Russia-Ukraine war, the emergence of new destructive OT malware, significant increases in software vulnerabilities, ransomware compromises of utilities and vendors, and physical attacks on substations. There is also considerable voluntary and mandatory information sharing among grid defenders and governments in the U.S. and Canada regarding the monitoring of threats and attempts to hold bad actors accountable.
Regarding software risks, Sistrunk contends that the electric power industry is improving. This is primarily due to the NERC CIP standards that require larger utilities to conduct an annual vulnerability assessment, the greater availability of vulnerability management tools that target OT, and increased rigor from the security requirements of grid asset owners and security support from grid OEM vendors.
While NERC CIP applies to the bulk electric system, smaller players are improving, too. “Most electric utilities are small electric co-ops and public power utilities that are starting to mature their vulnerability management. The good thing is that small electric utilities often use the same OT software and SCADA equipment that large utilities use — which have undergone regular security testing. Think of it as a rising tide raises all boats,” said Sistrunk.
Still, “there is an asset visibility challenge for smaller utilities with tiny security budgets that have to choose between hiring an OT security employee or purchasing OT security software,” said Sistrunk.
However, the increased complexity of the systems that run the bulk power system poses long-term risks and requires even more diligence if the industry is to keep pace with those risks. The more code, the more risk. And while vendors are performing their own security code analysis, vulnerabilities remain.
“Large code bases are probably here to stay, so we have to learn to manage risk in large systems of systems,” Sistrunk added. He pointed to consequence-driven cyber-informed engineering (CCE). “CCE focuses on analyzing OT systems of systems through the lens of identifying high consequence events and plausible attack paths with the goal of reducing or eliminating the impact,” he said.
That’s sound advice. The power system has a good record of managing the potential impact of cybersecurity threats, but with those threats showing no signs of letting up, the industry must continue to work hard to keep it that way.
George V. Hulme is an award-winning journalist and internationally recognized information security and business technology writer. He has covered business, technology, and IT security topics for more than 20 years. His work has appeared in CSOOnline, ComputerWorld, InformationWeek, Security Boulevard, and dozens of other technology publications. He is also a founding editor at DevOps.com.