Output Filtering

Output filtering scans AI responses and blocks or redacts harmful, inappropriate, or sensitive content before it reaches the user.

Enable Output Filtering

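// Request a completion with output scanning enabled.
const response = await client.gateway.chat({
  model: "gpt-4",
  messages: [...], // placeholder: your conversation messages
  security: {
    scanOutput: true, // scan the model's response before returning it
    outputFilters: ["harmful_content", "hate_speech", "violence"], // categories to check
    redactOnMatch: true // redact matching content
  }
});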
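
If the same security settings are reused across many requests, it can help to build them in one place. The following is a minimal sketch, not part of the SDK: `buildSecurityOptions` and `KNOWN_FILTERS` are hypothetical names, and the set of valid identifiers is assumed to be only the three shown in the request above. The real list may be longer and may include custom filter names.

```javascript
// Hypothetical helper: not part of the SDK.
// Assumption: only the three identifiers shown in the example request are known.
const KNOWN_FILTERS = new Set(["harmful_content", "hate_speech", "violence"]);

function buildSecurityOptions(filters, { redactOnMatch = true } = {}) {
  // Reject misspelled identifiers before the request is sent.
  const unknown = filters.filter((name) => !KNOWN_FILTERS.has(name));
  if (unknown.length > 0) {
    throw new Error(`Unknown output filter(s): ${unknown.join(", ")}`);
  }
  return {
    scanOutput: filters.length > 0, // scanning is only useful with at least one filter
    outputFilters: [...new Set(filters)], // drop duplicates, keep order
    redactOnMatch,
  };
}

const security = buildSecurityOptions(["harmful_content", "hate_speech"]);
console.log(security);
```

The resulting object has the same shape as the `security` field in the request above, so it could be passed as `security` in a `client.gateway.chat` call.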

Filter Categories

Harmful Content

Block dangerous instructions or harmful advice.

Hate Speech

Filter discriminatory or hateful language.

Sensitive Topics

Control how the model responds to sensitive topics such as politics and religion.

Custom Filters

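// Define a keyword filter that redacts competitor names from responses.
await client.filters.create({
  name: "competitor_mentions",
  type: "keyword",
  patterns: ["CompetitorA", "CompetitorB"], // keywords to match
  action: "redact",
  replacement: "[REDACTED]" // text substituted for each match
});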
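
To make the effect of a keyword filter with the `redact` action concrete, here is a local sketch of what such a filter might do to a response. It is an illustration only, not the gateway's implementation: `redactKeywords` and `escapeRegExp` are hypothetical helpers, and case-insensitive substring matching is an assumption, since the matching rules are not specified above.

```javascript
// Hypothetical local illustration of a "keyword" filter with action "redact".
// Assumption: matching is case-insensitive and applies to substrings.

// Escape regex metacharacters so each keyword is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function redactKeywords(text, filter) {
  if (filter.patterns.length === 0) {
    return { text, matched: false };
  }
  // Build one alternation such as /CompetitorA|CompetitorB/gi.
  const regex = new RegExp(filter.patterns.map(escapeRegExp).join("|"), "gi");
  let matched = false;
  const redacted = text.replace(regex, () => {
    matched = true;
    return filter.replacement;
  });
  return { text: redacted, matched };
}

// Same fields as the filter definition above.
const competitorFilter = {
  name: "competitor_mentions",
  type: "keyword",
  patterns: ["CompetitorA", "CompetitorB"],
  action: "redact",
  replacement: "[REDACTED]",
};

const result = redactKeywords("We beat CompetitorA and competitorb.", competitorFilter);
console.log(result.text); // prints "We beat [REDACTED] and [REDACTED]."
```

Escaping the patterns keeps a keyword such as `C++` or `A.B` from being interpreted as a regular expression; whether the real keyword filter treats patterns literally is not stated in this page.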