
A Feasible Precaution Ignored: AI Targeting Algorithms and the Failure to Recognize Protected Emblems

Published on April 1, 2026


In Aug. 2021, a U.S. drone strike in Kabul, Afghanistan killed an aid worker and nine other civilians, including seven children, after the aid worker was observed loading water jugs into his car. In Nov. 2023, an Israeli missile strike near the Lebanese border killed a grandmother and her three granddaughters who had likewise been handling water jugs. In April 2024, when the Israel Defense Forces (IDF) struck a visibly marked World Central Kitchen convoy, international condemnation of Israel’s campaign in Gaza spiked. And most recently, the Feb. 28 Tomahawk strike on the Shajarah Tayyebeh Elementary School in Minab, as well as reports of widespread algorithmic targeting in the ongoing Iran campaign, has provoked deep concern from the U.S. Congress and the public about military operations that rely on AI models (although the extent to which AI contributed to the failures that led to the Minab strike is not yet clear).

With algorithmic targeting shrinking the role of human operators and decentralized strike capabilities magnifying coordination challenges, high-profile incidents of civilians killed on the battlefield produce strategic-level consequences for both countries and the corporations that underwrite military capabilities. Existing Testing, Evaluation, Validation and Verification (TEVV) procedures insufficiently address the growing role of algorithms in military targeting, a gap that risks undermining respect for humanitarian law. While the U.S. Defense Department’s (DoD) responsible AI implementation guidance emphasizes ethical principles, and Directive 3000.09 mandates legal review of the procurement or modification of autonomous weapons systems to ensure compliance with domestic and international law, DoD currently has no specific requirement that targeting algorithms recognize humanitarian actors, including aid workers, nor a clear standard by which to measure compliance. This falls patently short of the United States’ obligation to take constant care and ensure feasible precautions to protect civilians under Article 57 of Additional Protocol I (AP I) to the Geneva Conventions and corresponding customary international humanitarian law (IHL). Closing this gap is crucial to U.S. military and strategic effectiveness as well as compliance with the law.

Fighting the Last War, Algorithmically

Ensuring that algorithms’ outputs can be interpreted in light of the data they were designed and trained on is key to recognizing emerging patterns of civilian harm. The Project Maven initiative, which began in 2017 and is currently managed by the Chief Digital and Artificial Intelligence Office (CDAO), designed the AI-powered Maven Smart System to address the threat of vehicle-borne IEDs (i.e., car bombs) during the Global War on Terror (GWOT). Maven and systems like it that operated in Afghanistan could comb surveillance video for suspicious activity matching patterns associated with past car bomb attacks.

Following the Aug. 2021 Kabul drone strike that killed an aid worker and nine other civilians, including seven children, reporting indicated that the aid worker had been observed loading what the U.S. military believed were explosives into the car (the “explosives” were in fact water jugs). The DoD’s inspector general, which later conducted an investigation into the strike, cited confirmation bias—a common problem with AI—while refusing to disclose “methods, sources, tactics, techniques and procedures” or release the investigation report on account of classification.

Two years later and two thousand miles away, a similar mistake took place. Amid reports of AI-assisted targeting by the IDF in the early days of the Israel-Hamas war, a family fled southern Lebanon in Nov. 2023 during a spike in fighting between Hizballah and the IDF. (Israel has since inked a strategic partnership with Palantir, a major tech partner for the U.S. military and the developer of the Maven Smart System.) The family’s journey was cut brutally and abruptly short: after they stopped to load supplies into one of the cars and resumed driving, an Israeli missile struck the vehicle, killing a grandmother and her three granddaughters (their mother survived).

In both cases, a civilian car was struck with tragic results. And in both cases, the strikes occurred after occupants were observed loading what turned out to be jugs of water. It is impossible to know how those targets were selected or why the attacks were authorized. But there is a very real possibility that algorithmic reviews of drone feeds detected patterns that matched the car bomb threat template and nominated them for attack, despite the presence of children.

These two tragic anecdotes point to the essential role of qualitative targeting criteria reviewed by live operators. As the character of a conflict changes, quantitative metrics engender false confidence; qualitative criteria help operators understand why a system has nominated a target. Car bombs were a major concern during GWOT but have not featured meaningfully in recent conflicts. Yet if models contain buried weights from old data, misguided nominations may emerge without warning. Without qualitative criteria, a misaligned assumption that containers hold explosives, or that a marked convoy is hostile, may lead to more tragedies like these.

TEVV’s Humanitarian Blind Spot

Currently, AI targeting systems are trained primarily to detect threat signatures. Much of the available battlefield data, whether from GWOT or the Russia-Ukraine War, consists of combat engagements and lawful targets on the front line. Enshrined in the First, Second, and Fourth Geneva Conventions and all three Additional Protocols, the red cross, red crescent, and red crystal are specially designated emblems that signal the protections international law affords those bearing them. An intentional attack on a person or object displaying these emblems constitutes a war crime. Humanitarian logos (such as “blue” U.N. markings) further mark humanitarian actors as relief personnel legally protected under Articles 70 and 71 of AP I. Yet these emblems and logos, like the civilian environment as a whole, appear to be absent from training sets or treated as background noise by military models. TEVV procedures, in turn, fail to test for recognition of these markings, an omission that could contribute to unlawful attacks on relief missions.
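To make the point concrete, the audit below, a minimal sketch in Python, counts annotated instances of each protected emblem in a COCO-style object-detection dataset. The file layout, the category names, and the premise that such labels exist at all are illustrative assumptions rather than features of any real military dataset; the contention here is precisely that most training sets would return zeros, and a model cannot learn to recognize what it has never been shown.

```python
# Minimal sketch: audit a COCO-style annotation file for protected-emblem
# coverage. The category names and the dataset itself are hypothetical.

import json
from collections import Counter

PROTECTED = {"red_cross", "red_crescent", "red_crystal", "un_marking"}

def emblem_coverage(annotation_path: str) -> Counter:
    """Count annotated instances of each protected emblem in a dataset.
    Zero counts flag emblems a trained model could never learn to detect."""
    with open(annotation_path) as f:
        data = json.load(f)
    id_to_name = {c["id"]: c["name"] for c in data["categories"]}
    counts = Counter(
        id_to_name[a["category_id"]]
        for a in data["annotations"]
        if id_to_name[a["category_id"]] in PROTECTED
    )
    for emblem in PROTECTED - set(counts):
        counts[emblem] = 0  # explicitly surface missing classes
    return counts
```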

Compounding the problem, visual and digital similarity increases the likelihood that algorithms will flag humanitarian activities as potential targets because their patterns resemble military signatures. Humanitarian workers often move in organized convoys and armored vehicles, base distribution operations in secure compounds to prevent theft, maintain robust communications footprints, and stay in regular contact with armed actors to deconflict locations and movements. And they may do so bearing NGO logos distinct from the legally protected emblems. In addition to the protected emblems of the Geneva Conventions, then, AI models must make sense of a profusion of organizational humanitarian logos.

There is admittedly a risk that bad actors may misuse emblems (although this is prohibited by IHL) or poison training data to provoke public condemnation. But criminal enemy conduct does not authorize indiscriminate targeting or eliminate the obligation to protect civilians. And while protected emblem recognition alone will not eliminate civilian harm from algorithmic targeting nominations, ensuring algorithms recognize protected emblems is an achievable first step.

What We Can Do Now

Technical fixes will not perfect AI-enabled targeting. Context-appropriate human judgment over the use of force, required by DoD Directive 3000.09 and necessary to ensure compliance with IHL, will remain indispensable to mitigating the shortcomings of today’s algorithms. But human judgment can be meaningfully improved by straightforward technical fixes. DoD, the International Committee of the Red Cross (ICRC), and Congress can act now to implement these fixes and prevent future tragedies.

First, DoD’s Under Secretary of Defense for Research and Engineering and the CDAO should update TEVV policy to mandate emblem recognition. The Director of Operational Test and Evaluation should require specific pass/fail tests verifying that targeting algorithms do not recommend strikes when protected emblems are present. Since systems will only see what they are trained to see, these qualitative validation requirements for protected emblems are overdue. Similarly, Combatant Commanders should be required to certify before deployment that specific targeting systems have been validated against local humanitarian actors and civilian conditions, since data biases from distant theaters could undermine algorithms’ accuracy.
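In practice, such a gate could be as simple as a zero-tolerance check run over a system’s evaluation output. The sketch below assumes a hypothetical results log in JSON Lines format, one record per evaluated image; no such DoD format or interface exists, and a fielded test harness would be considerably more elaborate, but the pass/fail logic itself need be no more complicated than this.

```python
# Hypothetical TEVV gate in the spirit of the proposal above: any strike
# recommendation on an emblem-bearing object fails the system under test.
# The results-log fields are invented for illustration.

import json
import sys

def emblem_gate(results_path: str) -> int:
    violations = []
    with open(results_path) as f:
        for line in f:
            record = json.loads(line)
            if record["strike_recommended"] and record["protected_emblem_present"]:
                violations.append(record["image_id"])
    if violations:
        print(f"FAIL: strike recommended on {len(violations)} "
              f"emblem-bearing objects: {violations}")
        return 1
    print("PASS: no strike recommendations on protected emblems")
    return 0

if __name__ == "__main__":
    sys.exit(emblem_gate(sys.argv[1]))
```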

Second, the ICRC (possibly in coordination with the U.N. Office for the Coordination of Humanitarian Affairs) should establish a working group to set training data and evaluation standards for military AI models’ recognition of humanitarian actors and protected emblems. The ICRC’s exploration of a next-generation protective emblem to mitigate the risks of autonomous weapons is an important step in this regard, but existing proposals pose high technical hurdles to widespread adoption. In the immediate term, international TEVV standards for recognition of the existing protected emblems are crucial.

Lastly, to address broader issues with AI-enabled mass targeting, Congress should include a specific requirement in the must-pass annual defense appropriations bill that DoD disclose if, when, and which targeting algorithms are implicated in incidents of civilian harm, even when used in a “decision support” capacity. With DoD’s capability to mitigate civilian harm gutted, public disclosure of algorithm use and its consequences is essential to congressional oversight of the fielding of lethal autonomous capabilities and U.S. use of military force. This disclosure requirement could be paired with congressional mandates and earmarked resources to fully staff civilian protection functions and to build protective AI systems into targeting ones. Absent such steps, continued erosion of the unique U.S. advantage that transparent and principled use of force provides is likely under current DoD leadership.

All these steps to mitigate the human costs of military targeting algorithms are possible today. The question will be whether it takes another incident, and bad press, to prompt them.

FEATURED IMAGE: Military crew in a high-tech operations command post. (via Getty Images)

About the Author

Michael Loftus

Michael Loftus is a national security analyst and doctoral candidate at Johns Hopkins University SAIS. He previously served at the U.S. Department of State, where he led civilian harm mitigation efforts during the Israel-Hamas conflict, oversaw large-scale humanitarian assistance across the Middle East, directed the U.S. humanitarian response to the Ukraine refugee crisis in Poland, and coordinated interagency efforts during the Afghanistan evacuation and resettlement.
