blogs
getting an llm to reliably roll a dice
introducing castform: the model training platform for anyone building with ai
supporting qwen 3.5: a journey through the llm trainer dependency jungle
rl for red teaming: training models to attack and defend themselves
speeding up llm rl training by 7.5x on long-prompt, short-response tasks
pokegents: making multi-agent coding feel like a team
castform goes turbo (puffer)
rag to riches: synthetic data for training rag agents
rag not lag: rl for blazing fast agentic retrieval