Twitter’s offensive tweet warnings are proving effective

Anne Freer | June 14, 2022


Twitter recently took a closer look at how effective its offensive tweet warnings are.

First launched in 2020, the feature automatically detects potentially insensitive or offensive language in a draft tweet and prompts the user to confirm whether they still wish to post it.

Earlier this year, Twitter said some 30% of users shown these prompts ended up revising or deleting their drafts. The company has now followed up with a closer analysis.

Based on an analysis of over 200,000 prompts from late 2021, it found that the prompts positively influenced tweeting behaviour. What’s more, prompted users were less likely to post offensive content in the future.

According to the study, for every 100 instances where a prompt was displayed, 69 tweets were sent without revision, 22 were revised, and nine were not sent at all.

Perhaps the most important finding of the report is the long-term behavioural impact on users of the app. 

After a single exposure to a prompt, users were 4% less likely to write offensive replies and 20% less likely to write five or more prompt-eligible tweets.

At the same time, those users also received 6% fewer offensive replies.

The findings suggest the offensive reply warnings are a useful educational tool for changing user behaviour. And while the percentages may still be small, the impact of these warnings could grow over time.
