Algorithms for Peace: AI's Role in Preventing Online Harm in Conflicts
- Roya Green 
- Jul 23
By Roya Green, (2024-25 College Ambassador from the University of Iowa)
Can AI Stop the Spread of Hate and Misinformation Online?
Social media platforms have become powerful, and sometimes dangerous, tools in modern conflicts, used to spread misinformation and amplify hate speech. As technology evolves, so do the ways governments, international organizations, and researchers fight back. Artificial Intelligence (AI) is now being used not just to monitor harmful content online but also to help prevent violence and protect human rights.
Social Media
Social media can serve as a communication platform as well as an effective listening tool. Through sharing stories, images, and videos, people around the world can follow human rights violations perpetrated by governments. Recently, social media has been an important tool for accessing information about events that are under-reported or censored in traditional media. Having access to this information allows mediators to effectively respond to conflict scenarios. However, the sheer volume of information circulating on social media makes it impossible for humans to monitor everything in real time. That’s where AI comes in. AI-driven machine learning algorithms can effectively review and flag relevant content faster than human moderators.
AI and Hate Speech Detection
The United Nations has acknowledged that social media has played an important role in fueling genocides and violence around the world. For example, social media has been linked to the 2017 Rohingya genocide in Myanmar and the 2020-2022 Tigray War in Ethiopia. Facebook in particular has faced criticism for allowing hate speech to spread unchecked. There needs to be a stronger push toward preventing harm on these platforms.
To train AI models to detect hate speech, computer scientists use supervised and unsupervised learning techniques. Supervised learning involves training an AI algorithm with labeled examples of hate speech and non-harmful content, helping it learn to distinguish between the two. Unsupervised learning, on the other hand, uses unlabeled data, leaving the algorithm to find patterns and groupings in the text on its own. Another approach to combating hate speech is counter speech, which challenges harmful narratives directly. AI systems that support counter speech draw on natural language generation (NLG), sentiment analysis, and contextual understanding, and they depend on diverse training data and learning from user feedback.
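The supervised approach described above can be illustrated with a minimal sketch. It uses a small Naive Bayes text classifier, chosen here only because it fits in a few lines; the article does not say which algorithms production systems use. The training sentences, the labels `harmful` and `benign`, and the function names are all invented for illustration, and a real system would need a far larger and carefully labeled corpus.

```python
import math
from collections import Counter

# Toy, invented training data (assumption): each post is paired with a human-assigned label.
TRAIN = [
    ("those people are vermin and should disappear", "harmful"),
    ("they are vermin get rid of them", "harmful"),
    ("those people should disappear from our town", "harmful"),
    ("the market opens early on saturday", "benign"),
    ("our town welcomes new neighbours", "benign"),
    ("people gathered early for the saturday match", "benign"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest Laplace-smoothed log-probability."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label, n_docs in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for word in text.split():
            # Add-one smoothing keeps unseen words from zeroing out the score.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAIN)
print(classify("they are vermin", model))
print(classify("the saturday market", model))
```

The labels are what make this supervised: the model only learns which words signal harm because people marked the examples first. That is also where bias can enter, since the model inherits whatever judgments the labelers made.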
The BiCapsHate Model
Although several AI models have been developed to detect hate speech, their effectiveness and accuracy have been questioned. To address these challenges, a group of researchers in the UK developed a new AI model called BiCapsHate. This model consists of deep neural network layers, each dedicated to capturing different properties of hate speech. BiCapsHate assigns numerical values to the language used in social media posts and evaluates the sequence to determine whether the context of the words is hateful. The model currently detects hate speech only in English, and it has demonstrated higher accuracy than other existing hate speech detection algorithms.
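The idea of assigning numerical values to words and then reading them as a sequence can be sketched as follows. This is not the BiCapsHate architecture, whose layers are not described in this article. It only shows the kind of preprocessing a sequence model typically needs: mapping words to integer ids, padding posts to a fixed length, and pairing each word with its neighbours on both sides. The function names, the padding length, and the sample posts are assumptions made for illustration.

```python
def build_vocab(posts):
    """Map each distinct word to an integer id; 0 is padding, 1 is 'unknown word'."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for post in posts:
        for word in post.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(post, vocab, max_len=6):
    """Turn a post into a fixed-length list of word ids, truncating or padding as needed."""
    ids = [vocab.get(word, 1) for word in post.lower().split()][:max_len]
    return ids + [0] * (max_len - len(ids))


def context_windows(ids, radius=1):
    """Pair each real (non-padding) token with its left and right neighbours."""
    windows = []
    for i, token in enumerate(ids):
        if token == 0:
            continue
        left = ids[max(0, i - radius):i]
        right = [t for t in ids[i + 1:i + 1 + radius] if t != 0]
        windows.append((left, token, right))
    return windows


vocab = build_vocab(["Go back home", "Welcome back home"])
encoded = encode("go back home now", vocab)
print(encoded)
print(context_windows(encoded))
```

The two sample posts share the words "back home" but differ in tone, which is why a model has to look at the surrounding words rather than score each word in isolation.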
AI for Conflict Prevention and Misinformation Control
The UN Department of Political and Peacebuilding Affairs (DPPA) has developed various approaches to social media analytics that support conflict prevention. For instance, the DPPA conducted a project on social media analysis within Arabic-speaking communities to enhance early-warning capabilities, track hate speech, and combat incitement to violence. Detecting hate speech in Arabic presents unique challenges due to the multiple dialects and cultural differences among Arabic-speaking countries. Additionally, the DPPA launched Sparrow, a social media scanning application that analyzes Twitter data and distinguishes between automated bot-generated content and authentic speech. Rather than predicting the future, these AI tools monitor current and past trends to assess risks.
Combating Disinformation with AI
The spread of misinformation can threaten peace and disproportionately affect vulnerable groups. To counter this, the United Nations Development Programme (UNDP) created iVerify, a fact-checking tool that identifies false information before it spreads. iVerify has been used in Zambia and Honduras before elections to prevent the dissemination of misleading information. The tool processes articles using an open-source machine-learning algorithm that detects hate speech. Human fact-checkers then review the reports and verify claims made in the analyzed stories. By enhancing information integrity and protecting vulnerable groups, iVerify demonstrates how AI can help prevent misinformation-driven violence.
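The two-stage workflow described above, machine flagging followed by human review, can be sketched in a few lines. This is a hypothetical illustration of the general pattern, not iVerify's code or API. The `Article` class, the `model_score` field, the 0.5 threshold, and the verdict labels are all assumptions.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.5  # assumed cut-off; a real deployment would tune this value
ALLOWED_VERDICTS = {"false", "misleading", "accurate"}  # assumed verdict labels


@dataclass
class Article:
    title: str
    model_score: float  # hypothetical classifier output between 0 and 1
    verdict: str = "unreviewed"


def triage(articles, threshold=REVIEW_THRESHOLD):
    """Return articles at or above the threshold, highest score first, for human review."""
    flagged = [a for a in articles if a.model_score >= threshold]
    return sorted(flagged, key=lambda a: a.model_score, reverse=True)


def record_verdict(article, verdict):
    """Store the human fact-checker's decision; the model never sets the verdict itself."""
    if verdict not in ALLOWED_VERDICTS:
        raise ValueError(f"unknown verdict: {verdict}")
    article.verdict = verdict
    return article


articles = [
    Article("story A", 0.91),
    Article("story B", 0.12),
    Article("story C", 0.67),
]
queue = triage(articles)
record_verdict(queue[0], "false")
print([(a.title, a.verdict) for a in queue])
```

In this sketch the model only decides what a person looks at first. The final judgment stays with the fact-checker, which reflects the human oversight the workflow relies on.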
The Future of AI in Human Rights
AI-driven tools are making progress in mitigating the spread of hate speech and misinformation online. However, they are not foolproof. Challenges such as bias in training data, limitations in contextual understanding, and language barriers still need to be addressed. While AI is a valuable tool, it must be combined with human oversight and policy reforms to create meaningful change. The fight against online harm is far from over, but with continued technological advancements and stronger regulations, AI may play a crucial role in protecting human rights in the digital age.







