Basically, I have ~20 endpoints from different providers that all do exactly the same thing. They differ only in their rate limits, which are specified in both requests/second and requests/month. For example, one endpoint might accept 5 requests/second and 10,000,000 requests/month, while another might accept 10 requests/second and 5,000,000 requests/month.
The most common load balancer configurations I’m aware of use simple round-robin or similar strategies, where the intention is to spread load evenly. However, that can mean hitting the rate limit of one endpoint while another still has capacity, producing “rate limit” errors that could have been avoided if the load balancer had taken the saturated endpoint out of rotation once its limit was reached. Since the accepted load of each endpoint I’m using is clearly defined, I’m wondering if there is an existing production-ready way to minimize these “rate limit” errors.
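To make the behavior I’m after concrete, here is a minimal sketch of the per-second half of the problem, assuming a token bucket per endpoint: an endpoint is simply skipped once its bucket is empty, instead of being sent a request that will fail. All names and the `rps` values are illustrative, not from any particular load balancer.

```python
import time

class Endpoint:
    """Hypothetical endpoint wrapper with a token bucket for its per-second limit."""

    def __init__(self, name, rps):
        self.name = name
        self.rps = rps                  # allowed requests/second for this endpoint
        self.tokens = float(rps)        # start with a full bucket
        self.last_refill = time.monotonic()

    def try_acquire(self):
        """Take one token if available; refill based on elapsed time first."""
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the burst size (rps).
        self.tokens = min(self.rps, self.tokens + (now - self.last_refill) * self.rps)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

def pick_endpoint(endpoints):
    """Return the first endpoint that still has capacity, or None if all are throttled."""
    for ep in endpoints:
        if ep.try_acquire():
            return ep
    return None

endpoints = [Endpoint("a", rps=5), Endpoint("b", rps=10)]
```

A real solution would also need to track the monthly quota (a much slower-moving counter that could weight the rotation, e.g. preferring endpoints with the most monthly budget remaining), but the per-second skip logic above is the core of what I mean by “taking an endpoint out of circulation.”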