- Home
- AI Security
- AI Model Security
- Tinfoil GPT-OSS Safeguard 120B
Tinfoil GPT-OSS Safeguard 120B
Safety reasoning model for content classification and trust & safety apps

Tinfoil GPT-OSS Safeguard 120B
Safety reasoning model for content classification and trust & safety apps
Go Beyond the Directory. Track the Entire Market.
Monitor competitor funding, hiring signals, product launches, and market movements across the whole industry.
Tinfoil GPT-OSS Safeguard 120B Description
Tinfoil GPT-OSS Safeguard 120B is a specialized safety reasoning model built on GPT-OSS with 117 billion parameters and 5.1 billion active parameters. The model classifies text content based on custom safety policies provided by users, enabling LLM input-output filtering, content labeling, and Trust & Safety workflows. The model supports bring-your-own-policy flexibility, allowing organizations to define their own safety policies for content classification. It provides full access to reasoning chains for debugging purposes and offers configurable reasoning effort levels (low, medium, high) to balance performance and accuracy needs. The model features a 128k token context window and is trained on harmony response format. It is part of the ROOST Model Community and released under Apache 2.0 license. The model is designed for multilingual content with performance across major languages. Primary use cases include content moderation, policy enforcement, LLM guardrails, and Trust & Safety labeling workflows. The model enables organizations to implement custom safety policies for filtering and classifying content in various applications.
Tinfoil GPT-OSS Safeguard 120B FAQ
Common questions about Tinfoil GPT-OSS Safeguard 120B including features, pricing, alternatives, and user reviews.
Tinfoil GPT-OSS Safeguard 120B is Safety reasoning model for content classification and trust & safety apps developed by Tinfoil. It is a AI Security solution designed to help security teams with AI Security, Content Filtering, Classification.
FEATURED
Fix-first AppSec powered by agentic remediation, covering SCA, SAST & secrets.
Cybercrime intelligence tools for searching compromised credentials from infostealers
Password manager with end-to-end encryption and identity protection features
Fractional CISO services for B2B companies to build security programs
POPULAR
Real-time OSINT monitoring for leaked credentials, data, and infrastructure
A threat intelligence aggregation service that consolidates and summarizes security updates from multiple sources to provide comprehensive cybersecurity situational awareness.
AI security assurance platform for red-teaming, guardrails & compliance
TRENDING CATEGORIES
Stay Updated with Mandos Brief
Get strategic cybersecurity insights in your inbox