A few weeks ago, the U.S. federal bureaucrats from the NASA Office of the Inspector General’s audit division put together a thorough document detailing the security woes at NASA Jet Propulsion Laboratory (JPL). The details come together in a classic “what not to do” anti-case study that many security professionals are probably familiar with at this point. It’s a good read for security people interested in learning from the mistakes of others.
A Pasadena-based research and development center, JPL is a storied institution that has done decades’ worth of groundbreaking work for the U.S. space agency, supporting the missions and networks at the heart of space exploration. It has also suffered some troubling cybersecurity failures over the last decade, including breaches that gave attackers access to servers that support JPL missions and to many gigabytes of data about major mission systems.
With this report, NASA auditors endeavored to dig up the root causes behind these incidents. What they found was a series of systemic weaknesses in incident response and threat hunting practices that keep JPL from finding and responding swiftly to attacks on its full range of network assets.
“In spite of its efforts to protect these assets, critical vulnerabilities remain that place JPL at risk of cyber intrusions resulting in the theft of critical information,” the report explained. “We identified a series of weaknesses in JPL’s system of security controls that collectively diminish its ability to effectively prevent, detect, and mitigate cyberattacks targeting its IT systems and networks.”
The mistakes made by JPL are common ones in both the public and private sectors. As a result, this audit document offers a treasure trove of advice for incident response teams, and for broader IT teams on how to support more effective incident response. The following are five valuable incident response lessons that any security team can learn from JPL’s security failures.
Step One: Know Your Assets
It’s impossible to respond to attacks against assets you don’t know exist. Unfortunately, while JPL did have an IT systems database to track all of its connected assets, the use of that database has been spotty at best. In many cases system administrators were tracking systems in separate spreadsheets spread over the organization, sometimes using that information to update the central database and sometimes not.
“One system administrator told us he does not regularly enter new devices into the ITSDB as required because the database’s updating function sometimes does not work and he later forgets to enter the asset information,” the report detailed.
A valuable lesson here is that organizations not only need to make it policy to track assets, but they need to provide their admins with easy-to-use tools to make it happen. When they don’t, chaos reigns.
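One way to make tracking easier is to reconcile the scattered spreadsheets against the central database automatically, so gaps surface without relying on anyone’s memory. The following is a minimal sketch of that idea, not anything described in the report: the hostnames, the `reconcile` function, and the assumption that each inventory can be exported as a list of hostnames are all hypothetical.

```python
def reconcile(central_db, spreadsheets):
    """Compare a central inventory against locally kept spreadsheets.

    Returns two sorted lists: hosts that appear only in spreadsheets
    (never entered centrally) and hosts that appear only in the
    central database (no admin is tracking them locally).
    """
    # Normalize names so "DB-01" and "db-01 " count as the same asset.
    central = {name.strip().lower() for name in central_db}
    local = set()
    for sheet in spreadsheets:
        local.update(name.strip().lower() for name in sheet)
    return sorted(local - central), sorted(central - local)


# Hypothetical sample data standing in for real exports.
central_db = ["web-01", "db-01", "lab-07"]
spreadsheets = [["web-01", "rover-sim-02"], ["DB-01", "telemetry-03 "]]

missing_from_db, not_in_sheets = reconcile(central_db, spreadsheets)
print("Not in central database:", missing_from_db)
print("Not in any spreadsheet:", not_in_sheets)
```

Run on a schedule, a report like this turns a forgotten database entry into a visible to-do item instead of a blind spot.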
A Complete Incident Response Plan Sets the Course
While JPL does have an incident response plan, the auditors found it to be incomplete. The missing or incomplete elements they identified included performance measurements, a mission statement for incident responders, a roadmap for how the team can improve its maturity levels, and a description of how the incident response program fits into the overall IT organization.
Incident response organizations operating without a clear incident response plan in place tend to fall into a ‘cowboy’ method of incident response similar to what happened at JPL.
“(Incident responders) rely on their personal experience to respond to alerts, identify false positives, and assess abnormal network behavior,” the report explained. “They described a process of informal discussions and subjective analysis methods for carrying out their detection responsibilities.”
Threats Don’t Sleep
One of the big weaknesses at JPL is that the organization does not staff a 24/7 SOC, in spite of a high volume of threats targeting its assets. The auditors say that the JPL SOC team handles a number of incidents comparable to that of other organizations in NASA that do have that 24/7 capability.
They acknowledge that it is possible to run incident response and detection activities without 24/7 staffing, but that approach requires a more creative model for structuring duties, including using outsourced incident response services for certain situations.
However, JPL didn’t lean on this kind of third-party help.
“While incident response teams can structure themselves using a variety of models, adopting a more flexible staffing approach could reduce the burden on permanent staff during times of increased activity and investigation,” the auditors noted. “Given the high number of incidents and limited staff, leveraging talent and expertise from non-JPL SOC resources could fill possible detection capability gaps.”
Effective Log Management Needs Complete Data and Clear Review Procedures
The good news is that JPL does have established policies for managing the generation, storage and analysis of system log data that feeds into the organization’s SOC. The bad news is that it didn’t have a fail-safe process in place to make sure that system admins were pumping the complete set of data that security analysts needed to do their jobs.
“Several members of the JPL SOC acknowledged they had no method to determine whether Splunk ES captures all of the required log data,” the report explained.
What’s more, system administrators on the front line of log collection, management, and analysis did not have a clear idea of their responsibility for routine manual review of logs. This regular activity is the spear tip of incident response, and yet these team members “misunderstood their responsibilities regarding log management and review.” In fact, the majority of JPL administrators believed they didn’t have to do any routine analysis because their Splunk system did it all for them.
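The gap the SOC described, having no way to tell whether all required log data is arriving, can be narrowed with a simple coverage check: compare the list of systems that are supposed to send logs against when each one was last heard from. The sketch below is illustrative only and does not use any Splunk API. The `find_log_gaps` function, the host names, and the 24-hour silence threshold are assumptions, and in practice the `last_seen` mapping would come from whatever query the log platform supports.

```python
from datetime import datetime, timedelta


def find_log_gaps(required_hosts, last_seen, now, max_silence=timedelta(hours=24)):
    """Flag required hosts that never sent logs or have gone quiet.

    required_hosts: hosts that policy says must forward logs.
    last_seen: mapping of host -> datetime of its most recent log event.
    max_silence: assumed threshold after which a host counts as stale.
    """
    never_seen = sorted(h for h in required_hosts if h not in last_seen)
    stale = sorted(
        h for h in required_hosts
        if h in last_seen and now - last_seen[h] > max_silence
    )
    return never_seen, stale


# Hypothetical data: one healthy host, one silent for days, one never seen.
now = datetime(2019, 7, 1, 12, 0)
required = ["web-01", "db-01", "lab-07"]
last_seen = {
    "web-01": now - timedelta(hours=1),
    "db-01": now - timedelta(days=3),
}

never_seen, stale = find_log_gaps(required, last_seen, now)
print("Never reported:", never_seen)
print("Gone quiet:", stale)
```

A check like this does not replace manual log review, but it gives administrators and analysts a shared, concrete answer to the question of whether the data is complete.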
Structured Threat Hunting is Crucial
Finally, while JPL security analysts did engage in threat hunting activities, all of it was done on an informal, ad hoc basis. Additionally, the incident response plan makes no mention of threat hunting procedures or processes. The auditors found that this team was at the lowest levels of the Hunting Maturity Model.
“A documented process that defines the techniques analysts should follow, including generating hypotheses regarding malicious activity in an IT environment and updating automated detection capabilities based on discovered threats, reduces inefficiencies of random techniques and guides analysts in a more strategic and structured hunt,” the report explains.
This is sound advice not only for the JPL team, but also for many others worldwide.
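To make the idea of a documented hunt concrete, here is a minimal sketch of a hunt record that forces an analyst to state a hypothesis up front and to note which automated detections were updated afterward, the two elements the report’s recommendation calls out. The `HuntRecord` class, its field names, and the sample hunt are hypothetical and are not drawn from the report or from the Hunting Maturity Model itself.

```python
from dataclasses import dataclass, field


@dataclass
class HuntRecord:
    """One structured threat hunt, from hypothesis to follow-up."""
    hypothesis: str                 # what malicious activity is suspected
    data_sources: list              # logs or telemetry examined
    technique: str                  # how the analyst searched
    findings: list = field(default_factory=list)
    detections_updated: list = field(default_factory=list)

    def open_items(self):
        """List what still needs doing before the hunt can be closed."""
        items = []
        if not self.hypothesis.strip():
            items.append("missing hypothesis")
        if not self.data_sources:
            items.append("no data sources recorded")
        if self.findings and not self.detections_updated:
            items.append("findings not yet turned into automated detections")
        return items


# Hypothetical example: a hunt that found something but has no new alert yet.
hunt = HuntRecord(
    hypothesis="An external account is reusing stolen credentials after hours",
    data_sources=["VPN logs", "authentication logs"],
    technique="Stack-count logins by source country and hour",
    findings=["Two after-hours logins from an unexpected region"],
)
print(hunt.open_items())
```

Even a lightweight record like this replaces informal, memory-based hunting with something a team can review, repeat, and improve.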