Paul Romer

Cockpit Insecurity

Thu, Mar 30, 2023

If you were a pilot, how would you feel about being told that the copilot who will be flying with you is fun to talk to but sometimes lies convincingly?

Microsoft is already rolling out products that use the AI systems it licensed from OpenAI. On Github.com, it has been offering something that it calls “Copilot” for a couple of years. I’ve tried it twice and each time decided I’m better off flying solo.

Microsoft has just announced a new “Security Copilot.” This could turn out badly.

GitHub Copilot

The version of Copilot that I tried is supposed to make suggestions about how to write Python. I tried it back in June 2021 when it was first being tested. It was more trouble than it was worth, so I stopped using it. I’ve been told that it has gotten better, so I tried it again this month. For me, it still generates negative value. I’ve turned it off again.

Based on my experience, I’ve concluded that the public discussion about these AI systems is based on a huge overestimate of what they can actually do. Copilot’s suggestions are often suboptimal, apparently because they are based on code that was widely used in the past but does not reflect current best practice. Some of its suggestions are superficially plausible but turn out to be syntactically invalid. All you need to do to verify that they don’t work is to try them in a Python interpreter. These failures are useful for demonstrating the weakness of the tool, but they are not the type of problem that should worry you. What should really worry you is a recommendation from your copilot that doesn’t trigger any alarms when you follow it but which crashes the plane.
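
The easy kind of failure, a suggestion that is not even valid Python, can be caught mechanically. Here is a minimal sketch of that check, assuming the suggestion arrives as a string. The helper name is_valid_python and both sample suggestions are made up for this illustration; they are not actual Copilot output.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python, False on a syntax error."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# Hypothetical suggestions written for this illustration (not real Copilot output).
broken_suggestion = "for i in range(10) print(i)"    # missing colon after range(10)
working_suggestion = "for i in range(10): print(i)"  # same loop, valid syntax

print(is_valid_python(broken_suggestion))
print(is_valid_python(working_suggestion))
```

A check like this only says that the code parses. It says nothing about whether the code does the right thing, which is the harder problem.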

It’s not hard to see what’s driving the public impression that these tools have capabilities that far exceed what a user actually experiences. If you want to overhype the powers of AI, the perfect way to do it is to encourage a lot of hand-wringing about the risk that it will take over and rule people as slaves. “I mean, after all, if we are on the verge of having these systems take over the world, then wow these systems must be really good at writing and analyzing code.”

You don’t have to rely on my word as a user of Copilot. As I pointed out in my last post, it can’t be true that the AI at OpenAI has even a human ability to understand code, let alone the mooted superhuman abilities. If these AI systems were as good as an experienced developer, OpenAI would have used them to write and debug the customer-facing web software that it put into production, software that suffered from a very serious flaw that customers discovered and reported to OpenAI. The library that OpenAI misused was a Python library, redis-py. Did the developers at OpenAI ask GitHub Copilot what would happen if they overloaded redis with database queries that get canceled? I’m sure they didn’t because they knew that the answers they’d get back couldn’t be trusted.

If the big concern should be code suggestions that run without error but crash the plane, what type of code is most likely to expose you to such risks? Code related to security.
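
To make that concrete, here is a hypothetical sketch of security code that runs without error but is still the wrong thing to ship. Both functions below are my own illustration, not something Copilot produced. They return tokens that look alike and pass the same tests, but the first one uses Python’s random module, which the Python documentation says should not be used for security purposes.

```python
import random
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def weak_token(length: int = 32) -> str:
    # Hypothetical "plausible" suggestion: runs fine, but `random` is a
    # predictable pseudo-random generator that is not meant for secrets.
    return "".join(random.choice(ALPHABET) for _ in range(length))


def strong_token(length: int = 32) -> str:
    # Same interface, but `secrets` draws from the operating system's
    # cryptographically strong randomness source.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(len(weak_token()), len(strong_token()))
```

Nothing in the interpreter, and nothing in a test of the output format, distinguishes the two. You have to already know which one is safe.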

Security Copilot

Now we have an announcement from Microsoft about a “Security Copilot” that, according to the Microsoft executives who introduced it, will

“help security professionals identify current attacks and anticipate coming threats, respond more quickly, suss out who the attackers are, and detect threats that typically are missed by connecting dots through its security data analysis”

I’m not a security professional, but to me, this is a terrible idea.

When would it be most problematic to have a copilot who lies to you? During an emergency.

Will this tool give security professionals an advantage relative to attackers? Obviously not. The attackers can run Security Copilot themselves to see how it will respond to their attacks. They can tweak the attack to make sure that the advice from Security Copilot does not get in its way. They can even leverage the mistakes that Security Copilot makes to aid the attack. Once they figure out how to feed information into the training set for Security Copilot, they can get it to recommend things that open companies up to new attacks.

Just wait. If you think that buggy software is a security problem, wait until adversaries start to leverage the vulnerability of machine learning to manipulation of training data.

As I said in my last post, Sam Altman, the master of concern-puppetry as hype-generator, is in a great position to take the money and run. Microsoft is the company that may ultimately suffer by trying to ride the wave of hype.

The US government is finally getting serious about assigning liability for software vulnerabilities to the companies that sell the software. If Security Copilot turns security problems into security disasters, or if it actually creates its own security vulnerabilities, Microsoft is likely to be the one that is on the hook.

Does Microsoft have any indemnification or shared-liability agreements with OpenAI? If you are on the board of directors at Microsoft, you might want to ask.
