Voozh

Entries Tagged "hacking"

Page 1 of 78

Anthropic’s Fable and the State of AI

On June 9th, Anthropic released its Fable generative AI model. Three days later, the US government classified it as a dangerous munition, and used its export-control authority to prohibit any foreign nationals from accessing it. Unable to differentiate between Americans and foreigners, the company shut off access for everyone.

The government’s actions won’t help. The problem isn’t any one particular model; it’s the general trend of increasing AI capabilities. And any real solution requires the sort of collective action that just isn’t possible right now.

Fable is the constrained version of Mythos, the AI model Anthropic announced in April. Anthropic only released it to a few selected organizations, because the company claimed it was so good at finding and exploiting vulnerabilities in computer code that releasing it more generally would be dangerous.

It was an obviously self-serving announcement, and because few were able to verify Anthropic’s claims they were met with some skepticism. Those with access used Mythos to find and patch many vulnerabilities in their own software. But one UK group found the latest, already public, OpenAI model to be just as powerful.

Fable is just another incremental improvement in the years-long climb of AI capabilities. But just as important as the AI model is the “harness.” This is typically not AI. It’s ordinary computer code that interfaces with the user. It stitches together AI models, decides how and for what purposes they can be used, and gives them useful tools such as web search and the ability to run their own computer code.

When Mythos first entered limited release, there was widespread debate whether its power came from the model or the harness. With Mythos demonstrating that it was possible, the open-source community scrambled to build harnesses that could steer other AI models towards similar capabilities. Harness improvements don’t need massive data or data centers.

They largely succeeded. For example, a Prague company was able to replicate Anthropic’s few verifiable cybersecurity capabilities with a much smaller and cheaper model—and a more sophisticated harness. Last week, a group showed that multiple cheaper models harnessed in concert matches Fable’s performance.

The broader community had only a few days with Fable, but that time we learned some about its capabilities. Its difference is less the new model’s raw analytical and problem solving capabilities, and more that the model doesn’t need that sophisticated harness.

Fable requires much less expertise and detailed prompting from the human user. You can give it a difficult goal and it will figure out novel and unexpected ways to satisfy it, finding loopholes in whatever constraints you or the system have imposed on it.

“Relentlessly proactive” is how AI researcher Simon Willison described it. Another descriptor might be “creative.” Experienced AI developers have had that combination of creativity and proactivity since last year, but Fable puts it within easy reach of everyone.

In the hands of someone with a legitimate problem that needs solving, that can be an incredibly useful capability. But in the hands of someone who wants to do harm, it can be equally dangerous. AIs don’t have a moral compass in the same way that people do. They are agents of the wants and desires of the people who prompt them.

That points to the real problem with relentlessly proactive AI. In language, wants and desires are always underspecified. If I ask you to get me some coffee, you would probably pour me a cup from the coffeepot, or buy one from a nearby coffee shop.

You couldn’t buy me a pound of raw beans, or a coffee plantation. You wouldn’t order a cup of coffee for delivery next month. You wouldn’t find a nearby person, rip a cup of coffee out of their hands, and bring it to me. I wouldn’t have to specify any of the million limitations to my request; you would just know.

Human stories are filled with warnings about underspecified desires. King Midas wished that everything he touch turn to gold, forgetting to add “but not my food, drink, and daughter.” And genies are notorious for granting your wish in a way you wish they hadn’t.

The deeper point is that it’s impossible to list all limitations and restrictions, and like a malicious genie, a creative AI will find the ones you forgot. Block a database you don’t want it to have access to, and it might figure out how to bypass your control. Ask it to book a flight, and it might hack the airline because the website says the flight is sold out. Ask it to save money on your cellphone plan, and it might cancel it altogether—or get someone else to pay for it. As far as we know now AI has not done any of this yet, but you get the idea.

Malicious intent is not required. To an AI model, constraints are just things to get around and not general truisms about the world. They are creative problem solvers and natural rule breakers. They “hack” in the sense that they find and exploit loopholes.

Human systems rely on so many norms that we scarcely recognize the existence of until they are broken. AIs naturally think outside the box, because they don’t have any real conception of what the box is or why it’s there in the first place.

There is no foolproof way to prevent people from using AI models to complete harmful tasks. There is no way to prevent the models from incidentally causing harm while completing benign tasks. AI models are no longer isolated from the real world. They browse the internet and answer emails.

They trade stocks and make purchases. They control physical systems. They are, in effect, robots that affect life and property. We have no technical mechanisms to verify the integrity of an AI system. This level of capability and creativity in the hands of us untrustworthy humans will have both great and terrible results.

The problem is not unique to Anthropic. Mythos/Fable might currently be the most capable rules hacker, but more sophisticated harnesses give other models similar capabilities. And we should assume that the other frontier models are no more than a few months behind, and that open-source models are less than a year behind. At best, any ban only serves to delay the problem for a short while.

That delay might be useful if we—as a society, as a planet—would use that time to come together and figure out what to do. This isn’t a US/China arms race problem; this a species-level problem that requires coordinated action at that scale. Unfortunately, we have no mechanism to do that. I first wrote about this problem five years ago, but it was all too futuristic.

Today, when its right in front of us, there is no world government that can impose constraints on the for-profit corporations currently controlling AI models and research. The US has no appetite to effectively and even-handedly regulate those corporations, even as they do catastrophic damage to the environment, democracy, and—in this case—society in general.

This all makes an AI public option all the more necessary, and urgent. Today’s AIs can be fast, smart and secure, but only two of the three are possible for any given system. These safety tradeoffs are tightly held secrets of companies racing to beat one another, and they tell us we have to trust them. Instead, the choices and their consequences need to be brought out into the sunlight.

We should be funding open-source harnesses that balance capability and safety—that achieve useful goals without so much power—and open-source AI models whose provenance and biases are public and well understood. We have opened the AI Pandora’s box. Now we have to make the best of it.

This essay originally appeared in The Guardian.

Posted on June 19, 2026 at 7:03 AM • View Comments

NSO Group Hacking WhatsApp Despite Court Order

WhatsApp has caught the NSO Group phishing its users, in violation of a court order.

Posted on June 10, 2026 at 7:08 AM • View Comments

Hacking Meta’s AI Chatbot

Hackers are convincing Meta’s AI support chatbot to let them take over other peoples’ accounts:

A video posted on X showed the step-by-step process to hack someone’s Instagram account. The hacker allegedly used a VPN to spoof the targets’ presumed location to avoid triggering Instagram’s automated account protections. Then, the hacker opened a chat with Meta AI Support Assistant and asked the bot to add a new email address to the target’s account. The chatbot can be seen sending a verification code to the email address provided by the hacker; the hacker then shares the verification code with the chatbot, which prompts the chatbot to show a button to “Reset Password.” The hacker enters a new password and takes over the victim’s account.

[…]

On Monday, Instagram spokesperson Andy Stone said in a reply to Wong’s post and others that the issue was now fixed. It’s unclear how many Instagram users had their accounts improperly accessed.

It’s not that easy. Probably this particular tactic is now blocked. But there are others, many others, and they cannot be blocked as a class. The real problem is that LLM chatbots are not trustworthy enough for this application.

Another news article.

Posted on June 4, 2026 at 7:04 AM • View Comments

How Dangerous Is Anthropic’s Mythos AI?

Last month, Anthropic made a remarkable announcement about its new model, Claude Mythos Preview: it was so good at finding security vulnerabilities in software that the company would not release it to the general public. Instead, it would only be available to a select group of companies to scan and fix their own software.

The announcement requires context—but it contained an essential truth.

While Anthropic’s model is really good at finding software vulnerabilities, so are other models. The UK’s AI Security Institute found that OpenAI’s GPT-5.5, already generally available, is comparable in capability. The company Aisle reproduced Anthropic’s published results with smaller, cheaper models.

At the same time, Anthropic’s refusal to publicly release its new model makes a virtue out of necessity. Mythos is very expensive to run, and the company doesn’t appear to have the resources for a general release. What better way to juice the company’s valuation than to hint at capabilities but not prove them, and then have others parrot their claims?

Nonetheless, the truth is scary. Modern generative AI systems—not just Anthropic’s, but OpenAI’s and other, open-source models—are getting really good at finding and exploiting vulnerabilities in software. And that has important ramifications for cybersecurity: on both the offense and the defense.

Attackers will use these capabilities to find, and automatically hack, vulnerabilities in systems of all kinds. They will be able to break into critical systems around the world, sometimes to plant ransomware and make money, sometimes to steal data for espionage purposes, and sometimes to control systems in times of hostility. This will make the world a much more dangerous, and more volatile, place.

But at the same time, defenders will use these same capabilities to find, and then patch, many of those same systems. For example, Mozilla used Mythos to find 271 vulnerabilities in Firefox. Those vulnerabilities have been fixed, and will never again be available to attackers. In the future, AIs automatically finding and fixing vulnerabilities in all software will be a normal part of the development process, which will result in much more secure software.

Of course, it’s not that simple. We should expect a deluge of both attackers using newly found vulnerabilities to break into systems, and at the same time much more frequent software updates for every app and device we use. But lots of systems aren’t patchable, and many systems that are don’t get patched, meaning that many vulnerabilities will stick around. And it does seem that finding and exploiting is easier than finding and fixing. All of this points to a more dangerous short-term future. Organizations will need to adapt their security to this new reality.

But it’s the long term that we need to focus on. Mythos isn’t unique, but it’s more capable than many models that have come before. And it’s less capable than models that will come after. AIs are much better at writing software than they were just six months ago. There’s every reason to believe that they will continue to get better, which means that they will get better at writing more secure software. The endgame gives AI-enhanced defenders advantages over AI-enhanced attackers.

Even more interesting are the broader implications. The same searching, pattern-matching and reasoning capabilities that make these models so good at analyzing software almost certainly apply to similar systems. The tax code isn’t computer code, but it’s a series of algorithms with inputs and outputs. It has vulnerabilities; we call them tax loopholes. It has exploits; we call them tax avoidance strategies. And it has black hat hackers: attorneys and accountants.

Just as these models are finding hundreds of vulnerabilities in complex software systems, we should expect them to be equally effective at finding many new and undiscovered tax loopholes. I am confident that the major investment banks are working on this right now, in secret. They’ve fed AI the tax code of the US, or the UK, or maybe every industrialized country, and tasked the system with looking for money-saving strategies. How many tax loopholes will those AIs find? Ten? One hundred? One thousand? The Double Dutch Irish Sandwich is a tax loophole that involves multiple different tax jurisdictions. Can AIs find loopholes even more complex? We have no idea.

Sure, the AIs will come up with a bunch of tricks that won’t work, but that’s where those attorneys and accountants come in—to verify, and then justify, the loopholes. And then to market them to their wealthy clients.

As goes the tax code, so goes any other complex system of rules and strategies. These models could be tasked with finding loopholes in environmental rules, or food and safety rules—anywhere there are complex regulatory systems and powerful people who want to evade those rules.

The results will be much worse than insecure computers. Tax loopholes result in less revenue collected by governments, and regulatory loopholes allow the powerful to skirt the rules, both of which have all sorts of social ramifications. And while software vendors can patch their systems in days, it generally takes years for a country to amend its tax code. And that process is political, with lobbyists pressuring legislators not to patch. Just look at the carried interest loophole, a US tax dodge that has been exploited for decades. Various administrations have tried to close the vulnerability, but legislators just can’t seem to resist lobbyists long enough to patch it.

AI technologies are poised to remake much of society. Just as the industrial revolution gave humans the ability to consume calories outside of their bodies at scale, the AI revolution will give humans the ability to perform cognitive tasks outside of their bodies at scale. Our systems aren’t designed for that; they’re designed for more human paces of cognition. We’re seeing it right now in the deluge of software vulnerabilities that these models are finding and exploiting. And we will soon see it in a deluge of vulnerabilities in all sorts of other systems of rules. Adapting to this new reality will be hard, but we don’t have any choice.

This essay originally appeared in The Guardian.

Posted on May 14, 2026 at 7:04 AM • View Comments

Rowhammer Attack Against NVIDIA Chips

A new rowhammer attack gives complete control of NVIDIA CPUs.

On Thursday, two research teams, working independently of each other, demonstrated attacks against two cards from Nvidia’s Ampere generation that take GPU rowhammering into new—and potentially much more consequential—territory: GDDR bitflips that give adversaries full control of CPU memory, resulting in full system compromise of the host machine. For the attack to work, IOMMU memory management must be disabled, as is the default in BIOS settings.

“Our work shows that Rowhammer, which is well-studied on CPUs, is a serious threat on GPUs as well,” said Andrew Kwong, co-author of one of the papers. “GDDRHammer: Greatly Disturbing DRAM RowsCross-Component Rowhammer Attacks from Modern GPUs.” “With our work, we… show how an attacker can induce bit flips on the GPU to gain arbitrary read/write access to all of the CPU’s memory, resulting in complete compromise of the machine.”

Update Friday, April 3: On Friday, researchers unveiled a third Rowhammer attack that also demonstrates Rowhammer attacks on the RTX A6000 that achieves privilege escalation to a root shell. Unlike the previous two, the researchers said, it works even when IOMMU is enabled.

The second paper is GeForge: Hammering GDDR Memory to Forge GPU Page Tables for Fun and Profit:

…does largely the same thing, except that instead of exploiting the last-level page table, as GDDRHammer does, it manipulates the last-level page directory. It was able to induce 1,171 bitflips against the RTX 3060 and 202 bitflips against the RTX 6000.

GeForge, too, uses novel hammering patterns and memory massaging to corrupt GPU page table mappings in GDDR6 memory to acquire read and write access to the GPU memory space. From there, it acquires the same privileges over host CPU memory. The GeForge proof-of-concept exploit against the RTX 3060 concludes by opening a root shell window that allows the attacker to issue commands that run unfettered privileges on the host machine. The researchers said that both GDDRHammer and GeForge could do the same thing against the RTC 6000.

Posted on May 6, 2026 at 6:36 AM • View Comments

Hacking Polymarket

Polymarket is a platform where people can bet on real-world events, political and otherwise. Leaving the ethical considerations of this aside (for one, it facilitates assassination), one of the issues with making this work is the verification of these real-world events. Polymarket gamblers have threatened a journalist because his story was being used to verify an event. And now, gamblers are taking hair dryers to weather sensors to rig weather bets.

There’s also insider trading: a lot of it.

Posted on May 4, 2026 at 5:46 AM • View Comments

Fast16 Malware

Researchers have reverse-engineered a piece of malware named Fast16. It’s almost certainly state-sponsored, probably US in origin, and was deployed against Iran years before Stuxnet:

“…the Fast16 malware was designed to carry out the most subtle form of sabotage ever seen in an in-the-wild malware tool: By automatically spreading across networks and then silently manipulating computation processes in certain software applications that perform high-precision mathematical calculations and simulate physical phenomena, Fast16 can alter the results of those programs to cause failures that range from faulty research results to catastrophic damage to real-world equipment.”

Another news article.

Lots of interesting details at the links.

Posted on April 30, 2026 at 6:22 AM • View Comments

How Hackers Are Thinking About AI

Interesting paper: “What hackers talk about when they talk about AI: Early-stage diffusion of a cybercrime innovation.”

Abstract: The rapid expansion of artificial intelligence (AI) is raising concerns about its potential to transform cybercrime. Beyond empowering novice offenders, AI stands to intensify the scale and sophistication of attacks by seasoned cybercriminals. This paper examines the evolving relationship between cybercriminals and AI using a unique dataset from a cyber threat intelligence platform. Analyzing more than 160 cybercrime forum conversations collected over seven months, our research reveals how cybercriminals understand AI and discuss how they can exploit its capabilities. Their exchanges reflect growing curiosity about AI’s criminal applications through legal tools and dedicated criminal tools, but also doubts and anxieties about AI’s effectiveness and its effects on their business models and operational security. The study documents attempts to misuse legitimate AI tools and develop bespoke models tailored for illicit purposes. Combining the diffusion of innovation framework with thematic analysis, the paper provides an in-depth view of emerging AI-enabled cybercrime and offers practical insights for law enforcement and policymakers.

Posted on April 14, 2026 at 6:49 AM • View Comments

Possible US Government iPhone Hacking Tool Leaked

Wired writes (alternate source):

Security researchers at Google on Tuesday released a report describing what they’re calling “Coruna,” a highly sophisticated iPhone hacking toolkit that includes five complete hacking techniques capable of bypassing all the defenses of an iPhone to silently install malware on a device when it visits a website containing the exploitation code. In total, Coruna takes advantage of 23 distinct vulnerabilities in iOS, a rare collection of hacking components that suggests it was created by a well-resourced, likely state-sponsored group of hackers.

[…]

Coruna’s code also appears to have been originally written by English-speaking coders, notes iVerify’s cofounder Rocky Cole. “It’s highly sophisticated, took millions of dollars to develop, and it bears the hallmarks of other modules that have been publicly attributed to the US government,” Cole tells WIRED. “This is the first example we’ve seen of very likely US government toolsbased on what the code is telling usspinning out of control and being used by both our adversaries and cybercriminal groups.”

TechCrunch reports that Coruna is definitely of US origin:

Two former employees of government contractor L3Harris told TechCrunch that Coruna was, at least in part, developed by the company’s hacking and surveillance tech division, Trenchant. The two former employees both had knowledge of the company’s iPhone hacking tools. Both spoke on condition of anonymity because they weren’t authorized to talk about their work for the company.

It’s always super interesting to see what malware looks like when it’s created through a professional software development process. And the TechCrunch article has some speculation as to how the US lost control of it. It seems that an employee of L3Harris’s surviellance tech division, Trenchant, sold it to the Russian government.

Posted on April 2, 2026 at 6:05 AM • View Comments

Is “Hackback” Official US Cybersecurity Strategy?

The 2026 US “Cyber Strategy for America” document is mostly the same thing we’ve seen out of the White House for over a decade, but with a more aggressive tone.

But one sentence stood out: “We will unleash the private sector by creating incentives to identify and disrupt adversary networks and scale our national capabilities.” This sounds like a call for hackback: giving private companies permission to conduct offensive cyber operations.

The Economist noticed (alternate link) this, too.

I think this is an incredibly dumb idea:

In warfare, the notion of counterattack is extremely powerful. Going after the enemy—its positions, its supply lines, its factories, its infrastructure—is an age-old military tactic. But in peacetime, we call it revenge, and consider it dangerous. Anyone accused of a crime deserves a fair trial. The accused has the right to defend himself, to face his accuser, to an attorney, and to be presumed innocent until proven guilty.

Both vigilante counterattacks, and preemptive attacks, fly in the face of these rights. They punish people before who haven’t been found guilty. It’s the same whether it’s an angry lynch mob stringing up a suspect, the MPAA disabling the computer of someone it believes made an illegal copy of a movie, or a corporate security officer launching a denial-of-service attack against someone he believes is targeting his company over the net.

In all of these cases, the attacker could be wrong. This has been true for lynch mobs, and on the internet it’s even harder to know who’s attacking you. Just because my computer looks like the source of an attack doesn’t mean that it is. And even if it is, it might be a zombie controlled by yet another computer; I might be a victim, too. The goal of a government’s legal system is justice; the goal of a vigilante is expediency.

We don’t issue letters of marque on the high seas anymore; we shouldn’t do it in cyberspace.

Posted on April 1, 2026 at 12:57 PM • View Comments

1 2 3 … 78 Next→

Sidebar photo of Bruce Schneier by Joe MacInnis.

URL: https://www.schneier.com/tag/hacking/

⇱ hacking Archives - Schneier on Security