A View into Government Cybersecurity
In late February, I attended the Rocky Mountain Cyberspace Symposium in Colorado Springs, a conference where industry and government converge on cybersecurity, with a strong focus on US national security and critical infrastructure. I wanted to better understand the industry, meet with insiders, and identify where tech-sector-style machine learning could be valuable. After many conversations, here is my perspective.
What is the Government Responsible For?
I’ll start with my inevitably oversimplified understanding of what the government is doing, as seen through this conference.
Broadly, they manage networks: networks that handle data, software, compute, decisions, personnel, food, ammo, healthcare and equipment. Some are networks of distributed software, compute and data that any techie would recognize (think an SDK plus GitHub plus Azure), at least after wincing at all the extra layers of security and redundancy needed to keep secrets secret and defend against attacks. In contrast, there are the logistical networks of trucks, trains, planes, ships and manned command centers needed to maintain the intra- and international flow of personnel, food, equipment and other supplies.
These networks are quite different, yet they are networks[1]. They are all systems communicating physical things across physical channels in mostly real time. As a simplification, I view the government’s cybersecurity burden as ensuring all critical networks are operated securely, swiftly, reliably and without disqualifying expenses.
What Does the Government Want for Cybersecurity?
My approach is to consider wants from ‘improving existing operations’ to ‘research and development.’ Along this spectrum, wants vary from specific to open ended.
For improving existing operations, some data science wants were quite specific. From military operators, I twice heard “I just want to know what’s happening at my end points”. That’s determining the identity, physical/digital state and physical/digital context of anything from phones to vehicles in the field. I heard of plans to develop digital twins of specific manufacturing lines to simulate process changes without actually changing the manufacturing process. I heard of using massive data feeds from equipment to predict maintenance issues and deploy repairmen more efficiently[2].
Other wants were less specific. They want to improve asset vulnerability evaluation and correction; how can they improve the identification, prioritization and protection of devices that are exposed and critical? I felt the urgency each time someone mentioned the recent Chinese infiltration of critical IT infrastructure, a highly invasive incursion well beyond expectations.
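To make "exposed and critical" concrete, here is a minimal sketch of one way to rank assets. The `Asset` fields, the 0 to 1 scales, the multiplicative score and the example devices are all illustrative assumptions, not something described at the conference.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Asset:
    name: str
    exposure: float     # hypothetical 0-1 estimate of how reachable the device is
    criticality: float  # hypothetical 0-1 estimate of mission impact if compromised


def risk_score(asset: Asset) -> float:
    """Toy score: high only when an asset is both exposed and critical."""
    return asset.exposure * asset.criticality


def prioritize(assets: List[Asset]) -> List[Asset]:
    """Return assets ordered from most to least urgent to protect."""
    return sorted(assets, key=risk_score, reverse=True)


# Made-up inventory, purely for illustration.
assets = [
    Asset("field-phone", exposure=0.9, criticality=0.3),
    Asset("command-server", exposure=0.4, criticality=0.9),
    Asset("lab-printer", exposure=0.7, criticality=0.1),
]
ranked = [a.name for a in prioritize(assets)]
print(ranked)
```

The product is only one possible choice. It captures the idea that a highly exposed but unimportant device should rank below a moderately exposed but critical one; a real system would presumably estimate both inputs from data rather than assign them by hand.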
They want to attract tech talent. They’re aware cutting-edge technology requires bright minds to pick sometimes-polarizing government work over the comfortable and highly compensated lifestyle of the technology or financial services industries. On this front, they feel behind, citing social media radicalizing the US while TikTok educates China. To counter, the Department of Defense plans to grow and enhance its recruiting operations.
They want improved organizational decisioning. They understand tall and wide hierarchies of humans make for slow, biased and partially blind judgements. Most believe technology will offer some solution, but I suspect when things get specific, they get less promising. I heard several references to sophisticated dashboarding as the future’s solution, but loading a human with dense visual information is far short of a performant and safe automated decisioning system. I expect they want the latter, but the technology isn’t quite there yet.
For research and development, the wants are explicitly open ended. From the Air Force Research Laboratory, I heard of research funding for “Cross-Domain Innovation and Science,” any technology that can facilitate secure exchange of information across the Air Force’s various heterogeneous digital and physical environments. When speaking with those from the Cyberspace Capabilities Center, I heard they send technical software requirements to the Air Force Life Cycle Management Center (AFLCMC), which might work with a large tech company to research and develop whatever technology is needed. Lastly, there are the well-known SBIR / STTR programs, which award funding for promising research proposals.
How Does Industry Serve the Government?
Clearly, it does so in many ways and a full answer is beyond me. Instead, I’ll make a few comments on the ‘how’.
The commercial side was surprisingly familiar. The issue of managing software, data and compute has been addressed commercially, so those same tools are sold to the government in a government-ready form. Amazon, Microsoft and Google were strongly represented, either directly with their own booth or indirectly, as an integrated component of another company. Labels of ‘cloud storage’, ‘cloud computing’, ‘software repositories’, ‘zero-trust’, ‘API’ and ‘PKI’ were common. My impression is that these are essentially the same products as those commercially available, but with some additional government-requested layers.
I noticed many software ‘unifiers,’ as I’ll call them. There is some technology (e.g. cloud storage and computing) offered by several companies, and the government needs it in all its forms. This creates an opportunity for another company to arise as the unified interface. This unifier does the work of integrating with every version of the technology, and can position itself as the most complete form of that technology. For security and operational reasons, this is apparently essential; to meet security requirements, personnel are limited in the software products they may use. A unifier can be considered a single product while offering access to many.
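As a rough illustration of the unifier idea, the sketch below puts one interface in front of two storage backends. Everything here is hypothetical: `ObjectStore`, `Unifier` and the two in-memory "providers" are stand-ins I made up, and a real adapter would call each vendor's actual SDK instead.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class ObjectStore(ABC):
    """The single interface a hypothetical unifier exposes to its users."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class InMemoryProviderA(ObjectStore):
    """Stand-in for one vendor; a real adapter would wrap that vendor's SDK."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


class InMemoryProviderB(ObjectStore):
    """Second stand-in with a different internal layout, mimicking a different API."""

    def __init__(self) -> None:
        self._rows: List[Tuple[str, bytes]] = []

    def put(self, key: str, data: bytes) -> None:
        # Drop any earlier row for this key, then append the new one.
        self._rows = [(k, v) for k, v in self._rows if k != key]
        self._rows.append((key, data))

    def get(self, key: str) -> bytes:
        for k, v in self._rows:
            if k == key:
                return v
        raise KeyError(key)


class Unifier:
    """One product from the user's point of view, many backends underneath."""

    def __init__(self, backends: Dict[str, ObjectStore]) -> None:
        self._backends = backends

    def put(self, provider: str, key: str, data: bytes) -> None:
        self._backends[provider].put(key, data)

    def get(self, provider: str, key: str) -> bytes:
        return self._backends[provider].get(key)


unifier = Unifier({"a": InMemoryProviderA(), "b": InMemoryProviderB()})
unifier.put("a", "report.txt", b"alpha")
unifier.put("b", "report.txt", b"bravo")
```

The caller only ever sees `put` and `get`, which is the appeal. It also means anything a backend can do beyond those two calls is hidden, and every adapter is one more layer to debug.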
Inevitably, some things will be sacrificed. A unified interface will obfuscate functionality, add debugging friction and insert new nodes of risk. Further, these integrations are across independent and imbalanced companies. If Amazon and Microsoft suddenly decide to make their APIs irreconcilably different, the leverage-less unifier will be in a pinch and so will its clients.
I notice this because the use of unifiers is less common[3] in the tech sector. Sturdy software requires short paths to core technologies. No one wants to build their AI startup by plugging into an API that plugs into an API that plugs into OpenAI’s API. They just plug into OpenAI’s API.
The alternative is standardization. The government should encourage major cloud providers to agree on and abide by fixed interfaces for accessing cloud technology. The government’s challenge of network management gets harder with every node. Standardization is one of our best defenses. This fact must already be well appreciated, considering the successful forms of standardization we already rely on. Those include adoption of Internet Protocol v6, standardization of global positioning (ultimately ‘GPS’) and the standardization of credit card transactions. I anticipate that as cloud operations mature, the pressure for standardization will grow, as it should.
How Will Government Machine Learning Technology Evolve for Cybersecurity?
To form a view, I needed an understanding of the models currently operating. So I asked things like ‘Does the system display a score?’, ‘Does it do automated decisioning?’, ‘Are image displays automatically annotated?’ and ‘How does this data get used?’. I heard of models scoring the cyber risk of assets to prioritize ones worth fortifying, Project Maven, models producing email-phishing probabilities, systems auto-labeling satellite images, Deloitte analysts modeling for logistics management and models searching for cybersecurity anomalies among digital network flows.
I noticed an emphasis on human-in-the-loop decisioning; models are tools and humans are the deciders. There are good reasons for this (e.g. if the fault of a decision falls on a machine, accountability and transparency become murky principles), but it nonetheless registered, to me, as limiting. Dumb machines may be improved by human supervision, but the smartest ones are not. In my mind are the examples of the top quantitative hedge funds, whose survival depends on accurate prediction and effective automated decisioning. There, humans are understood as unquantifiable, capricious, biased and drifting sources of noise. Jim Simons, arguably the greatest quantitative investor of all time, said[4]:
It’s just what the model says. That religious sticking-to-the-model is the only way to run such a [hedge fund] business, because you can’t simulate that guy who walked in and said ‘let’s sell Google.’
This is not to make a recommendation, but to speculate that as government decisioning systems improve, eventually human decisioning will shrink.
A frequent topic was generative AI, referring to the impressive language and image/video synthesis models powering the tech news cycle. As entertainment tools, image and video generation models were not serious topics in my conversations and felt inappropriate to mention for purposes of national security. On the other hand, language models were top of mind, but their specific application was a repeat question. Indeed, AI value is pondered at the highest level. The most interesting use cases I heard were for military training. The military has huge volumes of documents on procedure, and personnel need to understand them. Current pedagogy is training by someone procedurally experienced. It’s clear this process could be quickened with a language model trained on the document corpus; a computer can answer many more questions than an associate can.
But to look forward, I don’t believe the next decade of government ML investment will be a large push into generative modeling. For critical applications, generative models are too unreliable and there’s no mathematics to suggest that’ll change anytime soon (in fact, there’s a strong argument to the contrary).
Rather, I suspect investment will largely flow into ML infrastructure (‘MLOps’), the practices, processes, systems, and tools that manage ML models and their interactions within a software ecosystem. That’s a primary way the tech industry has evolved since 2013, when ‘MLOps’ wasn’t a thing and open-source model-serving tools were nonexistent[5]. Today, it’s recognized as an organization’s best first investment to bolster the path towards a performant ML system. Given this, the government will likely continue along the same trend, albeit more slowly, due to its extra security and reliability considerations.
Footnotes
1. You’ll have to forgive my vague and implicit definition of a network.
2. Interestingly, I heard of difficulties in beating the established heuristic approaches to maintenance; simple metrics and their bounds provide reasonable guidance already.
3. Only less common. Stripe and Okta are popular unifiers.
4. From the 37:30 mark of this interview.
5. At the time, I recall startups offering to manage ML pipelines entirely. You’d provide your data, choose from a dropdown of model types and then it would ‘do the rest.’ In practice, no self-respecting data scientist would tolerate this. It puts too much of the company’s core operations with a third party.