Models

The local inference landscape moves quickly, and excellent new projects appear all the time. Our approach is intentionally narrower. Rather than chasing every new release, we focus deeply on integrating one model at a time within the constraints of the hardware we support. Today, Tiles is designed around everyday productivity tasks, with plans to expand into more domain-specific workflows over time. We provide a carefully tested combination of model, inference parameters, and agent harness, so users do not have to manage model selection, tuning, or scaffolding themselves.

We currently use OpenAI's gpt-oss-20b as the primary model for everyday tasks, paired with the Harmony renderer and support for the Open Responses API to improve inference quality. Its Mixture-of-Experts architecture activates roughly 3.6B of its 21B total parameters per token, offering a strong balance between quality and efficiency.

That narrow focus gives us room to validate more deeply over time. We aim to compare against official outputs, run long-context evaluations, and integrate models directly into the agent harness to see how they actually hold up in real workflows. The exact model may change as the landscape evolves, but the constraint remains the same: local inference should be practical and credible on everyday personal machines, starting at 16 GB of system memory for real productivity tasks.
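
The 16 GB floor can be sanity-checked with back-of-the-envelope arithmetic on the parameter counts above. The sketch below is illustrative only, not how Tiles measures memory: the function names are hypothetical, the active-parameter figure is OpenAI's published 3.6B (about 4B when rounded), and it assumes every weight is stored at one uniform precision. The 4.25 bits-per-weight case approximates a 4-bit block format with one shared 8-bit scale per 32 weights. It ignores the KV cache, activations, and whatever else is running on the machine, so real headroom is smaller than these numbers suggest.

```python
# Rough weight-memory estimate for a 21B-parameter Mixture-of-Experts model.
# All precisions below are assumptions for illustration, not measured values.

TOTAL_PARAMS = 21e9      # total parameters (from the text above)
ACTIVE_PARAMS = 3.6e9    # parameters active per token (OpenAI's published figure)
SYSTEM_MEMORY_GB = 16    # minimum system memory targeted in the text above


def weight_memory_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return total_params * bits_per_weight / 8 / 1e9


def active_fraction(active_params: float, total_params: float) -> float:
    """Share of the parameters that participate in each forward pass."""
    return active_params / total_params


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"active share per token: {share:.1%}")

    # 16-bit, 8-bit, and a 4-bit block format with shared scales (assumed).
    for bits in (16, 8, 4.25):
        gb = weight_memory_gb(TOTAL_PARAMS, bits)
        verdict = "fits" if gb < SYSTEM_MEMORY_GB else "does not fit"
        print(f"{bits:>5} bits/weight: {gb:5.1f} GB of weights, {verdict} in {SYSTEM_MEMORY_GB} GB")
```

Under these assumptions, only the roughly 4-bit case leaves room on a 16 GB machine. All 21B weights still have to be resident in memory even though only a fraction is used per token, so the sparse activation mainly reduces compute per token rather than the memory footprint.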