Never run websockets on Google Cloud Run!
A while back we had our entire backend served from a monolithic Google Cloud Run service (and even at the time of this writing, that’s still mostly true). At the time, our websockets cluster was very small and being handled by the same Cloud Run service. It was a feature that hadn’t really launched yet, and for the moment it was fine.
But the question kept coming up — what would happen when we launched to 10,000 users? 20,000? 40,000? When would it break? It turns out, there’s a hard limit — and a steep price to pay as well.
We are a productivity / calendar / project planning app all rolled into one. As you can imagine, power users keep many calendar tabs open on a daily basis. Similarly, they have many project planning tabs. All of this is to say that a single user could easily have 10 websocket connections open at any given time (one per tab).
There are two specific restrictions to keep in mind from the documentation:
“Cloud Run has a max of 250 concurrent requests. With a maximum of 1,000 container instances with 250 concurrent requests, you can serve up to 250k clients. For example, a service serving 1,000 employees with concurrency of 250, you will need at least 4 instances.”
So the first limit is that you can only support 250k clients at a time. Assuming an average of 10 tabs per user, that meant we could only support up to 25,000 active users.
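The ceiling falls out of simple multiplication. Here's a sketch of that math, using the limits quoted above and our 10-tabs-per-user assumption:

```python
# Cloud Run's hard limits (at the time): 1,000 container instances,
# each handling at most 250 concurrent requests.
MAX_INSTANCES = 1_000
MAX_CONCURRENCY = 250

max_connections = MAX_INSTANCES * MAX_CONCURRENCY  # 250,000 websockets

# Our assumption: a power user holds ~10 tabs, one websocket per tab.
tabs_per_user = 10
max_users = max_connections // tabs_per_user

print(max_connections, max_users)  # 250000 25000
```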
The second question was how much would all of this cost?
Assuming a rather frugal 2 vCPU with just 1GB of memory — ignoring the Redis and VPC costs — just Cloud Run would cost almost $100,000 per month at 250k connections!
Feel free to play around with it on the pricing calculator.
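The shape of that estimate looks roughly like this. The per-second rates below are illustrative assumptions (in the ballpark of tier-1 pricing with CPU always allocated); the exact figures vary by region and change over time, so use the calculator for real numbers:

```python
# Back-of-envelope monthly Cloud Run cost at 250k websocket connections.
# Rates are ASSUMED for illustration; check the official pricing calculator.
VCPU_DOLLARS_PER_SEC = 0.000024   # $ per vCPU-second (assumed)
GIB_DOLLARS_PER_SEC = 0.0000025   # $ per GiB-second (assumed)

connections = 250_000
concurrency = 250                        # max concurrent requests per instance
instances = connections // concurrency   # 1,000 instances (also the max)

vcpus_per_instance = 2    # the "frugal" shape from above
gib_per_instance = 1
seconds_per_month = 60 * 60 * 24 * 30

cpu_cost = instances * vcpus_per_instance * seconds_per_month * VCPU_DOLLARS_PER_SEC
mem_cost = instances * gib_per_instance * seconds_per_month * GIB_DOLLARS_PER_SEC
total = cpu_cost + mem_cost

print(instances, round(total))  # comfortably into six figures per month
```

Websockets hold a request open for the lifetime of the connection, so every instance is billed as continuously serving; that's what makes this so much worse than a typical request/response workload.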
Given these two limitations, we felt quite strongly that the only viable use case for Cloud Run websockets was toy-project-level load, and at that level, quite frankly, any framework will do.
For those interested in absurd websocket scaling, Discord has two fantastic blog posts on this topic. But even without jumping ship to Erlang and optimizing the BEAM VM with Rust, there were folks way back in 2012 pushing a single NodeJS server to 1 million concurrent connections (albeit with long polling instead of true websockets). More recently, Socket.IO claims a single machine can easily run 55k concurrent connections with minimal tuning. Even taking that lower bound, running 5 dedicated machines does not cost $100k / month.
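Taking Socket.IO's 55k-per-machine figure at face value, the machine count for our 250k-connection target is a one-liner:

```python
import math

# How many dedicated machines cover 250k connections at Socket.IO's
# claimed ~55k concurrent connections per machine?
connections = 250_000
per_machine = 55_000

machines = math.ceil(connections / per_machine)
print(machines)  # 5
```

Five reasonably sized VMs, even generously provisioned, land at a tiny fraction of the Cloud Run bill above.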
We decided to bite the bullet and launch our first dedicated Kubernetes cluster for websockets. It turned out not to be our last, but that’s a blog post for another day. :)
PS — We’re Hiring!
As always, if you enjoyed this blog post and like making these sorts of decisions, please shoot me a DM on Twitter — they’re always open! We’re hiring for all engineering positions.