At work I regularly hit the 7.5M tokens-per-hour limit that one of our tools has, and have to switch model or tool, and I’m not even a remotely heavy user. I think people don’t realise how many tokens get burned on CoT and tool calls these days.
At the 7.5M-per-hour hard limit, it would take 84 days to hit the grandparent's $3k.
That said, local models really are still slow, or fast enough but not that great.
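For anyone who wants to redo the days-to-$3k arithmetic with their own numbers, here's a rough sketch. The comment doesn't state the per-token price or the hours per day behind the 84-day figure, so `hours_per_day` and `usd_per_million_tokens` below are placeholder assumptions, not the actual inputs, and the function name is made up for illustration.

```python
def days_to_spend(budget_usd, tokens_per_hour, hours_per_day, usd_per_million_tokens):
    """Days of usage at the hourly token cap before spend reaches budget_usd."""
    tokens_per_day = tokens_per_hour * hours_per_day
    usd_per_day = tokens_per_day / 1_000_000 * usd_per_million_tokens
    return budget_usd / usd_per_day


# Hypothetical inputs: 8 hours a day at the cap, $1 per million tokens blended.
# Swap in your real price and usage pattern.
print(days_to_spend(3000, 7_500_000, hours_per_day=8, usd_per_million_tokens=1.0))
```

The result is very sensitive to the blended price and to how many hours a day you actually sit at the cap, so treat any single number as a ballpark.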