
Run Replicate models without burning compute
Pin the model version, poll every 2-5 seconds, use webhooks for long-running predictions, and pick the cheapest hardware tier the model fits on. This keeps your agent from burning Replicate credits on cold starts and unpinned models.
Works with claude-code · opencode · codex · gemini
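
A minimal sketch of the first three rules, assuming Replicate's HTTP predictions endpoint accepts a JSON body with `version`, `input`, `webhook`, and `webhook_events_filter`, and that a prediction's `status` ends in `succeeded`, `failed`, or `canceled`. The helper names `build_prediction_payload` and `poll_prediction`, the injected `fetch` callable, and the timeout default are hypothetical and only for illustration. No network call is made here; hardware selection is typically configured on the model or deployment, so it is not shown.

```python
import time
from typing import Any, Callable, Dict, Optional

# Statuses after which polling should stop (assumed from Replicate's prediction lifecycle).
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def build_prediction_payload(
    version: str,
    model_input: Dict[str, Any],
    webhook_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the JSON body for creating a prediction against a pinned version."""
    if not version:
        # Hypothetical guard: refuse to run without an explicit version id.
        raise ValueError("pin an explicit model version id")
    payload: Dict[str, Any] = {"version": version, "input": model_input}
    if webhook_url:
        # For long predictions, ask for a single callback on completion
        # instead of polling for the whole run.
        payload["webhook"] = webhook_url
        payload["webhook_events_filter"] = ["completed"]
    return payload


def poll_prediction(
    fetch: Callable[[], Dict[str, Any]],
    interval: float = 3.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch() every `interval` seconds until the prediction is terminal.

    `fetch` is any callable that returns the current prediction as a dict,
    for example a wrapper around a GET on the prediction's URL.
    """
    if not 2.0 <= interval <= 5.0:
        raise ValueError("keep the polling interval between 2 and 5 seconds")
    deadline = time.monotonic() + timeout
    while True:
        prediction = fetch()
        if prediction["status"] in TERMINAL_STATUSES:
            return prediction
        if time.monotonic() >= deadline:
            raise TimeoutError("prediction did not finish before the timeout")
        sleep(interval)
```

In use, `fetch` would wrap an authenticated request for the prediction created from the payload; passing `webhook_url` lets a long prediction report back on its own, so the agent can skip polling entirely for those runs.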