Automatic content moderation for comments with AI

Automatic content moderation uses AI to analyze comments on posts in real time. When a potentially inappropriate comment is detected, the recipients configured in the Admin Console automatically receive an email notification. This allows channel administrators to respond quickly and take action if needed.

Prerequisites

For automatic content moderation to work, the "Report comments" function must be enabled in the Admin Console and at least one recipient email address must be configured. For more information, see Report content — Setup in the Admin Console.

Automatic content moderation is enabled by your Customer Success Manager; you cannot activate it yourself in the Admin Console.

How does it work?

As soon as a comment is published on a post, its content is automatically analyzed by AI. The AI checks the comment for the following categories:

Hate speech — content that attacks people or groups based on characteristics such as origin, religion, or gender
Violence — content that threatens, glorifies, or incites violence
Sexual content — sexually explicit or inappropriate content
Self-harm — content that discusses or encourages self-harm or suicide

When a potential violation is detected, an email is automatically sent to the configured recipients — the same ones who also receive manual comment reports.

What does the email contain?

The notification email has the subject line "Flagged Comment" and contains the following information:

Detected by — "Automatic Content Moderation"
Organisation — name of the organization
Channel — channel in which the post was published
Post title — title of the post
Link to post — direct link to the affected post
Comment author — name of the person who wrote the comment
Comment — full content of the comment
Violations — detected categories with severity level (e.g. HATE_SPEECH – Severity: 2/10)

The email is sent in English.
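If you forward these emails to a script or ticketing tool, the Violations field can be read programmatically. The following is a minimal sketch in Python, assuming every entry follows the single example shown above (category name, a dash, then "Severity: n/10"). The names Violation, VIOLATION_RE, and parse_violations are hypothetical and not part of the product, and the exact email format may differ from this assumption.

```python
import re
from typing import List, NamedTuple


class Violation(NamedTuple):
    """One detected category with its severity (hypothetical helper type)."""
    category: str
    severity: int
    scale: int


# Assumed pattern, modeled only on the example "HATE_SPEECH – Severity: 2/10".
# The separator may be a hyphen, an en dash, or an em dash.
VIOLATION_RE = re.compile(r"([A-Z_]+)\s*[-\u2013\u2014]\s*Severity:\s*(\d+)/(\d+)")


def parse_violations(text: str) -> List[Violation]:
    """Return every violation entry found in the given text."""
    return [
        Violation(m.group(1), int(m.group(2)), int(m.group(3)))
        for m in VIOLATION_RE.finditer(text)
    ]


if __name__ == "__main__":
    # Example line taken from the article; \u2013 is the en dash.
    print(parse_violations("HATE_SPEECH \u2013 Severity: 2/10"))
```

Text that does not match the assumed pattern yields an empty list, so a caller can fall back to manual review instead of failing.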

What does severity mean?

Severity indicates how serious the detected violation is, on a scale from 0 to 10. The higher the value, the more severe the content.
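Teams that receive many notifications may want to decide which ones to review first. The sketch below shows one possible way to sort a severity value into review priorities. The function name triage and both thresholds (3 and 7) are assumptions chosen for illustration; the product itself only defines the 0 to 10 scale and does not prescribe any thresholds.

```python
def triage(severity: int, review_from: int = 3, urgent_from: int = 7) -> str:
    """Map a severity value (0 to 10) to a review priority.

    The thresholds are hypothetical defaults; adjust them to your own policy.
    """
    if not 0 <= severity <= 10:
        raise ValueError("severity must be between 0 and 10")
    if severity >= urgent_from:
        return "urgent"
    if severity >= review_from:
        return "review"
    return "low"


if __name__ == "__main__":
    for value in (2, 5, 9):
        print(value, triage(value))
```

Even a comment in the lowest bucket was still flagged, so this kind of sorting only orders the review queue; it is not a reason to skip the review.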

What actions can be taken?

Automatic moderation detects and reports content — it does not automatically delete or hide comments. After receiving the email, channel administrators can review the comment and delete it if necessary.

Note: Only channel administrators can delete comments. Once deleted, the comment is replaced with "This comment has been removed".

Frequently asked questions

Are posts and chat messages also moderated automatically?
Currently, only comments on posts are analyzed automatically. Extending this to additional content types is planned for the future.

Can false positives occur?
Yes. As with any AI-powered detection, content may occasionally be flagged incorrectly. The final decision always rests with the channel administrators who review the flagged comment.

Which languages are supported?
The AI supports detection in multiple languages, including German and English.

Can I enable or disable automatic moderation myself?
No. The feature is enabled by your Customer Success Manager (CSM). Contact your CSM if you would like to use it.
