Consider a system with K parallel servers, each with its own waiting room. Upon arrival, a job is routed to the queue of one of the servers. Finding a routing policy that minimizes the total workload in the system is known to be difficult in general. Even when the optimal policy can be identified, it would require full queue-length information at each job arrival; for example, the join-the-shortest-queue policy (which is known to be optimal for identical servers with exponentially distributed service times) requires comparing the queue lengths of all the servers. In this paper, we consider a balanced routing policy that examines only a subset of c servers, with 1 ≤ c ≤ K: upon the arrival of a job, choose a subset of c servers with probability proportional to their service rates, and then route the job to the server with the shortest queue among the c chosen. Under this balanced policy, we derive the diffusion limits of the queue length processes and the workload processes. We note that the diffusion limits of these processes are the same regardless of the choice of c, as long as c ≥ 2. We further show that the proposed balanced routing policy, for any fixed c ≥ 2, is asymptotically optimal in the sense that it minimizes the workload over all time in the diffusion limit. In addition, the policy helps to distribute work evenly among all the servers.
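The routing step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how the subset of c servers is drawn, so the sketch assumes sequential sampling without replacement, with each draw weighted by service rate; breaking ties by lowest server index is also an assumption. The function names `sample_subset` and `route` are hypothetical and are not taken from the paper.

```python
import random


def sample_subset(rates, c, rng):
    """Draw c distinct server indices, one at a time.

    Assumption: each draw is weighted by the service rates of the
    servers not yet chosen (sampling without replacement).
    """
    candidates = list(range(len(rates)))
    chosen = []
    for _ in range(c):
        weights = [rates[i] for i in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)
    return chosen


def route(queue_lengths, rates, c, rng):
    """Return the index of the server that receives the arriving job."""
    if not 1 <= c <= len(rates):
        raise ValueError("c must satisfy 1 <= c <= K")
    subset = sample_subset(rates, c, rng)
    # Shortest queue among the sampled servers only.
    # Assumption: ties are broken by the lowest server index.
    return min(subset, key=lambda i: (queue_lengths[i], i))
```

With c = K every server is sampled, so the sketch reduces to join-the-shortest-queue; with c = 1 the job goes to a single server drawn in proportion to its service rate, without looking at any queue length.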
- Asymptotic optimality
- Balanced routing
- Diffusion limit
- Fluid limit
ASJC Scopus subject areas
- Computer Science Applications
- Management Science and Operations Research