Sign in to manage models, APIs, and live inference workloads.

Central control for native Llama backends, MCP tooling, OpenAI-compatible APIs, speech services, and runtime telemetry.

Server surface: Virtual AI runtime

Ready

Native runtime: .NET + C++ Llama

Acceleration: CPU, Vulkan, CUDA, Metal

Protocol: OpenAI-compatible API

Tooling: Full MCP support

Whisper, TTS, Audio chat, Model control