blogs

getting an llm to reliably roll a dice

rl Jun 17, 2026

introducing castform: the model training platform for anyone building with ai

beta Jun 11, 2026

supporting qwen 3.5: a journey through the llm trainer dependency jungle

ml May 17, 2026

rl for red teaming: training models to attack and defend themselves

rl, safety May 13, 2026

speeding up llm rl training by 7.5x on long-prompt, short-response tasks

ml May 11, 2026

pokegents: making multi-agent coding feel like a team

agents, developer-tools May 8, 2026

castform goes turbo (puffer)

rag, turbopuffer Mar 19, 2026

rag to riches: synthetic data for training rag agents

rag, synthetic-data Mar 13, 2026

rag not lag: rl for blazing fast agentic retrieval

rag Mar 4, 2026

codebase-specific rl: fine-tuning llms for generating unit tests that boost coverage

code Jan 10, 2026