Automatic content moderation uses AI to analyze comments on posts in real time. When it detects a potentially inappropriate comment, the recipients configured in the Admin Console automatically receive an email notification. This allows channel administrators to respond quickly and take action if needed.
Prerequisites
For automatic content moderation to work, the "Report comments" function must be enabled in the Admin Console and at least one recipient email address must be configured. For more information, see Report content — Setup in the Admin Console.
Automatic content moderation is enabled by your Customer Success Manager. You cannot activate it yourself in the Admin Console.
How does it work?
As soon as a comment is published on a post, its content is automatically analyzed by AI. The AI checks the comment for the following categories:
- Hate speech — content that attacks people or groups based on characteristics such as origin, religion, or gender
- Violence — content that threatens, glorifies, or incites violence
- Sexual content — sexually explicit or inappropriate content
- Self-harm — content that discusses or encourages self-harm or suicide
When a potential violation is detected, an email is automatically sent to the configured recipients — the same ones who also receive manual comment reports.
What does the email contain?
The notification email has the subject line "Flagged Comment" and contains the following information:
- Detected by — "Automatic Content Moderation"
- Organisation — name of the organization
- Channel — channel in which the post was published
- Post title — title of the post
- Link to post — direct link to the affected post
- Comment author — name of the person who wrote the comment
- Comment — full content of the comment
- Violations — detected categories with severity level (e.g. HATE_SPEECH – Severity: 2/10)
The email is sent in English.
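If you forward these notifications to your own tooling, the Violations line can be read programmatically. The following Python sketch is illustrative only and not part of the product. It assumes every entry follows the format of the single example above (HATE_SPEECH – Severity: 2/10). The names Violation and parse_violations are hypothetical, and any category identifier other than HATE_SPEECH is an assumption.

```python
import re
from dataclasses import dataclass

# Pattern modelled on the one example in this article:
# "HATE_SPEECH – Severity: 2/10". The separator may be a hyphen,
# an en dash, or an em dash. The real email format may differ.
VIOLATION_PATTERN = re.compile(
    r"([A-Z_]+)\s*[-\u2013\u2014]\s*Severity:\s*(\d+)/10"
)


@dataclass(frozen=True)
class Violation:
    """One detected category with its severity (hypothetical helper type)."""
    category: str
    severity: int


def parse_violations(text: str) -> list[Violation]:
    """Return every 'CATEGORY - Severity: N/10' entry found in the text."""
    return [
        Violation(match.group(1), int(match.group(2)))
        for match in VIOLATION_PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    # "VIOLENCE" is an assumed identifier, used only for illustration.
    line = "HATE_SPEECH \u2013 Severity: 2/10, VIOLENCE \u2013 Severity: 7/10"
    for violation in parse_violations(line):
        print(violation.category, violation.severity)
```

Text that does not match the assumed format simply yields an empty list, so the sketch fails quietly rather than raising an error if the email layout changes.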
What does severity mean?
Severity indicates how serious the detected violation is, on a scale from 0 to 10. The higher the value, the more severe the content.
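Because a single comment can be flagged in several categories, one way to triage is by the highest severity reported. The Python sketch below is a minimal illustration under stated assumptions: the product does not define priority levels, and the function name review_priority and both threshold values are hypothetical choices that you would adapt to your own review process.

```python
def review_priority(severities, urgent_from=7, elevated_from=4):
    """Map the severities of one flagged comment to a review priority.

    The thresholds are illustrative assumptions, not product behavior.
    """
    if not severities:
        return "none"
    for value in severities:
        # The article defines the scale as 0 to 10.
        if not 0 <= value <= 10:
            raise ValueError(f"severity out of range 0-10: {value}")
    worst = max(severities)
    if worst >= urgent_from:
        return "urgent"
    if worst >= elevated_from:
        return "elevated"
    return "routine"


if __name__ == "__main__":
    print(review_priority([2]))
    print(review_priority([2, 8]))
```

Using the maximum rather than the average keeps one severe category from being diluted by several mild ones.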
What actions can be taken?
Automatic moderation detects and reports content — it does not automatically delete or hide comments. After receiving the email, channel administrators can review the comment and delete it if necessary.
Note: Only channel administrators can delete comments. Once deleted, the comment is replaced with "This comment has been removed".
Frequently asked questions
Are posts and chat messages also moderated automatically?
Currently, only comments on posts are analyzed automatically. Support for additional content types is planned.
Can false positives occur?
Yes. As with any AI-powered detection, content may occasionally be flagged incorrectly. The final decision always rests with the channel administrators who review the comment.
Which languages are supported?
The AI supports detection in multiple languages, including German and English.
Can I enable or disable automatic moderation myself?
No, the feature is enabled by your Customer Success Manager (CSM). Contact your CSM if you would like to use it.