Hackers and other criminals can easily commandeer computers operating open-source large language models outside the guardrails and constraints of the major artificial-intelligence platforms, creating security risks and vulnerabilities, researchers said on Thursday.
Hackers could target the computers running the LLMs and direct them to carry out spam operations, create phishing content or run disinformation campaigns while evading platform security protocols, the researchers said.
The research, carried out jointly by cybersecurity companies SentinelOne and Censys over the course of 293 days and shared exclusively with Reuters, offers a new window into the scale of potentially illicit use cases for thousands of open-source LLM deployments. These include hacking, hate speech and harassment, violent or gore content, personal data theft, scams or fraud, and in some cases child sexual abuse material, the researchers said.
Thousands of open-source LLM variants exist, but a significant portion of those found on internet-accessible hosts are variants of models such as Meta’s Llama and Google DeepMind’s Gemma, according to the researchers. Although some of the open-source models include guardrails, the researchers identified hundreds of instances where those guardrails had been explicitly removed.
AI industry conversations about security controls are "ignoring this kind of surplus capacity that is clearly being utilized for all kinds of different stuff, some of it legitimate, some obviously criminal," said Juan Andres Guerrero-Saade, executive director for intelligence and security research at SentinelOne. Guerrero-Saade likened the situation to an "iceberg" that is not being properly accounted for across the industry and open-source community.
The research analyzed publicly accessible deployments of open-source LLMs running through Ollama, a tool that allows people and organizations to run their own versions of various large language models.
The researchers were able to see system prompts, which are the instructions that dictate how the model behaves, in roughly a quarter of the LLMs they observed. Of those, they determined that 7.5% could potentially enable harmful activity.
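For context on what "publicly accessible" means in practice: an internet-exposed Ollama host serves a plain HTTP API, and anyone who can reach it can list the models it is serving and read their metadata, which may include a system prompt embedded in the model's configuration. The sketch below illustrates the idea against a hypothetical host address; the endpoint names follow Ollama's documented REST API, but the exact request and response fields can vary by version and are an assumption here, not a description of the researchers' method.

```python
# Minimal sketch, assuming a hypothetical internet-exposed Ollama host at HOST
# (203.0.113.10 is a documentation-only IP) and Ollama's default port 11434.
import requests

HOST = "http://203.0.113.10:11434"  # hypothetical host, for illustration only

# List the models the host is serving.
models = requests.get(f"{HOST}/api/tags", timeout=10).json().get("models", [])

for m in models:
    name = m.get("name")
    # Ask for the model's metadata; the returned Modelfile text may contain a
    # SYSTEM directive, i.e. the system prompt that dictates how the model behaves.
    # Note: newer Ollama versions expect the key "model" instead of "name".
    info = requests.post(f"{HOST}/api/show", json={"name": name}, timeout=10).json()
    modelfile = info.get("modelfile", "")
    print(name, "system prompt visible:", "SYSTEM" in modelfile)
```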
Roughly 30% of the hosts observed by the researchers were operating out of China, and about 20% out of the U.S.
Rachel Adams, the CEO and founder of the Global Center on AI Governance, said in an email that once open models are released, responsibility for what happens next becomes shared across the ecosystem, including the originating labs.
“Labs are not responsible for every downstream misuse (which are hard to anticipate), but they retain an important duty of care to anticipate foreseeable harms, document risks, and provide mitigation tooling and guidance, particularly given uneven global enforcement capacity,” Adams said.
A spokesperson for Meta declined to respond to questions about developers’ responsibilities for addressing concerns around downstream abuse of open-source models and how concerns might be reported, but pointed to the company's Llama Protection tools for Llama developers and its Meta Llama Responsible Use Guide.
Microsoft AI Red Team Lead Ram Shankar Siva Kumar said in an email that Microsoft believes open-source models "play an important role" in a variety of areas, but, "at the same time, we are clear-eyed that open models, like all transformative technologies, can be misused by adversaries if released without appropriate safeguards."
Microsoft performs pre-release evaluations, including processes to assess "risks for internet-exposed, self-hosted, and tool-calling scenarios, where misuse can be high," he said. The company also monitors for emerging threats and misuse patterns. "Ultimately, responsible open innovation requires shared commitment across creators, deployers, researchers, and security teams."
Ollama did not respond to a request for comment. Alphabet's Google and Anthropic did not respond to questions.
(Reporting by AJ Vicens in Detroit; Editing by Matthew Lewis)