ChatGPT hallucinations open developers to supply-chain malware attacks
Attackers can exploit ChatGPT’s penchant for returning false information to spread malicious code packages, researchers have found. This poses a significant risk to the software supply chain, as it can allow malicious code and trojans to slide into legitimate applications and into package registries and code repositories such as npm, PyPI, and GitHub.
By leveraging so-called “AI package hallucinations,” threat actors can create ChatGPT-recommended, yet malicious, code packages that a developer could inadvertently download when using the chatbot and build into software that is then used widely, researchers from Vulcan Cyber’s Voyager18 research team revealed in a blog post published today.
In artificial intelligence, a hallucination is a plausible-sounding response from the AI that is insufficient, biased, or flat-out untrue. Hallucinations arise because ChatGPT (and the other large language models, or LLMs, that underpin generative AI platforms) answers questions based on the sources, links, blogs, and statistics available in the vast expanse of the Internet, which are not always the most solid training data.
Due to this extensive training and exposure to vast amounts of textual data, LLMs like ChatGPT can generate “plausible but fictional information, extrapolating beyond their training and potentially producing responses that seem plausible but are not necessarily accurate,” lead researcher Bar Lanyado of Voyager18 wrote in the blog post. He also told Dark Reading, “it’s a phenomenon that’s been observed before and seems to be a result of the way large language models work.”
He explained in the post that in the developer world, AIs will also generate questionable fixes to CVEs and offer links to coding libraries that don’t exist, and the latter presents an opportunity for exploitation. In that attack scenario, an attacker asks ChatGPT for coding help with common tasks, and ChatGPT may offer a recommendation for an unpublished or nonexistent package. The attacker can then publish their own malicious version of the suggested package, the researchers said, and wait for ChatGPT to give legitimate developers the same recommendation.
How to Exploit an AI Hallucination
To prove their concept, the researchers created a scenario using ChatGPT 3.5 in which an attacker asked the platform a question about how to solve a coding problem, and ChatGPT responded with multiple packages, some of which did not exist, i.e., were not published in a legitimate package repository.
“When the attacker finds a recommendation for an unpublished package, they can publish their own malicious package in its place,” the researchers wrote. “The next time a user asks a similar question they may receive a recommendation from ChatGPT to use the now-existing malicious package.”
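To make that step concrete, the check at the heart of it, seeing whether a suggested name is actually published, can be run against PyPI’s public JSON endpoint, which returns HTTP 404 for names that have never been published. The sketch below is illustrative only: it assumes the suggestions target PyPI, and the package names in it are hypothetical placeholders, not names from the Vulcan research.

```python
import urllib.request
import urllib.error

# Hypothetical examples of names a chatbot might suggest; they are
# placeholders for illustration, not packages from the research.
suggested_packages = ["requests", "fast-json-schema-utils"]

def exists_on_pypi(name: str) -> bool:
    """Return True if `name` is published on PyPI.

    PyPI's public JSON endpoint answers HTTP 404 for package
    names that have never been published.
    """
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # other failures (rate limits, outages) are not an answer

for pkg in suggested_packages:
    state = "published" if exists_on_pypi(pkg) else "UNPUBLISHED: squattable"
    print(f"{pkg}: {state}")
```

The same check cuts both ways: an attacker can use it to find hallucinated names worth squatting, while a wary developer can use it to notice that a recommended package does not actually exist.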
If ChatGPT is fabricating code packages, attackers can use these hallucinations to spread malicious ones without resorting to familiar techniques like typosquatting or masquerading, instead creating a “real” package that a developer might use because ChatGPT recommends it, the researchers said. In this way, the malicious code can find its way into a legitimate application or a legitimate code repository, creating a major risk for the software supply chain.
“A developer who asks a generative AI like ChatGPT for help with their code could wind up installing a malicious library because the AI thought it was real and an attacker made it real,” Lanyado says. “A clever attacker might even make a working library, as kind of a trojan, which could wind up being used by multiple people before they realized it was malicious.”
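Because the attack works precisely by making the hallucinated name real, an existence check alone stops helping once the malicious package is live. One defensive heuristic that follows from the researchers’ scenario is to look at when a project first appeared: a package uploaded only days ago that happens to match an AI suggestion deserves scrutiny. The sketch below assumes a PyPI project page already exists (pair it with the existence check above); the 30-day threshold and the package name are illustrative assumptions, not a vetting standard.

```python
import json
import urllib.request
from datetime import datetime, timezone

def first_upload_time(name: str) -> datetime | None:
    """Return the earliest file-upload timestamp for a PyPI project,
    or None if the project has no uploaded files. Assumes the project
    exists; a never-published name raises an HTTP 404 error."""
    url = f"https://pypi.org/pypi/{name}/json"
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)
    uploads = [
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for files in data["releases"].values()
        for f in files
    ]
    return min(uploads, default=None)

# Hypothetical name and illustrative 30-day threshold: a project this
# young that matches a chatbot suggestion warrants a manual review.
created = first_upload_time("some-suggested-package")
if created is None or (datetime.now(timezone.utc) - created).days < 30:
    print("Brand-new or empty project: review it before `pip install`.")
```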
To read the complete article, visit Dark Reading.