Chapter 9. Variability and Its Impact on Process
Performance: Waiting Time Problems
For consumers, waiting time is one of the most visible and annoying forms of supply-demand mismatch.
Waiting occurs when the expected demand rate exceeds the expected supply rate, but, in the presence of variability, it also occurs when the implied utilization is below 100%.
- Predict waiting times
- Recommend ways of reducing waiting time (choose an appropriate capacity level, redesign the service system, reduce variability)
[Figure: Call center. Incoming calls are either blocked (busy signal) or placed on hold; calls on hold are either abandoned (tired of waiting) or answered by the sales reps processing calls. Financial consequences: lost throughput, holding cost, lost goodwill, revenue, cost of capacity, cost per customer.]
§ At peak, 80% of calls dialed received a busy signal.
§ Customers getting through had to wait on average 10 minutes
§ Extra telephone expense per day for waiting was $25,000.
An Example of a Simple Queuing System
Balking vs. Reneging
Patient               1   2   3   4   5   6   7   8   9  10  11  12
Arrival Time (min)    0   5  10  15  20  25  30  35  40  45  50  55
Service Time (min)    4   4   4   4   4   4   4   4   4   4   4   4
A Somewhat Odd Service Process
Utilization = Flow Rate / Capacity = (12 patients/hr) / (15 patients/hr) = 80%
Patient               1   2   3   4   5   6   7   8   9  10  11
Arrival Time (min)    0   7   9  12  18  22  25  30  36  45  51
Service Time (min)    5   6   7   6   5   2   4   3   4   2   2
[Figure: Timeline from 7:00 to 8:00 showing patients 1 to 12, and a histogram of service times (2 min. to 7 min.) by number of cases (0 to 3).]
A More Realistic Service Process
[Figure: Wait time and service time of each patient, and the inventory of patients (1 to 5), from 7:00 to 8:00.]
Patient               1   2   3   4   5   6   7   8   9  10  11  12
Arrival Time (min)    0   7   9  12  18  22  25  30  36  45  51  55
Service Time (min)    5   6   7   6   5   2   4   3   4   2   2   3
Variability Leads to Waiting Time
In a service system, capacity can never run ahead of demand!
Average wait time?
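The average wait can be worked out directly from the two patient tables. The sketch below is a minimal illustration that assumes a single server who treats patients first-come, first-served; the function name `fcfs_waits` and the variable names are illustrative and not part of the original slides.

```python
def fcfs_waits(arrivals, services):
    """Return each patient's wait (minutes) at a single first-come, first-served server."""
    waits = []
    server_free_at = 0  # time at which the server finishes the current patient
    for arrival, service in zip(arrivals, services):
        start = max(arrival, server_free_at)  # wait only if the server is still busy
        waits.append(start - arrival)
        server_free_at = start + service
    return waits


# Arrival and service times (minutes) copied from the two 12-patient tables.
odd_arrivals = list(range(0, 60, 5))
odd_services = [4] * 12
real_arrivals = [0, 7, 9, 12, 18, 22, 25, 30, 36, 45, 51, 55]
real_services = [5, 6, 7, 6, 5, 2, 4, 3, 4, 2, 2, 3]

odd_waits = fcfs_waits(odd_arrivals, odd_services)
real_waits = fcfs_waits(real_arrivals, real_services)

print("Average wait without variability:", sum(odd_waits) / len(odd_waits))
print("Average wait with variability:", sum(real_waits) / len(real_waits))
```

Under these assumptions nobody waits in the evenly spaced process, while the variable process produces an average wait of 4 minutes even though the server is idle part of the hour.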
What generates queues?
§ An arrival rate that predictably exceeds the service rate (i.e., capacity) of a process.
- e.g., toll booth congestion on the NJ Turnpike during the Thanksgiving Day rush.
§ Variation in the arrival and service rates in a process where the average service rate is more than adequate to process the average arrival rate.
- e.g., calls to a brokerage are unusually high during a particular hour relative to the same hour in other weeks (i.e., calls are high by
random chance).
§ An interarrival time is the amount of time between two arrivals to a process.
§ An arrival process is stationary over a period of time if the number of
arrivals in any subinterval depends only on the length of the interval and not on when the interval starts.
- For example, if the process is stationary over the course of a day, then the expected number of arrivals within any three hour interval is about the same no matter which three hour window is chosen (or six hour window, or one hour window, etc.).
- Processes tend to be nonstationary (or seasonal) over long time periods (e.g., over a day or several hours) but stationary over short periods of time (say one hour, or 15 minutes).
Defining interarrival times and a stationary process
Arrivals to An-ser over the day are nonstationary
[Figure: Number of customers per 15 minutes (0 to 160) by time of day (0:15 to 23:00).]
The number of arrivals in a 3-hour interval clearly depends on which 3-hour interval you choose, and the peaks and troughs are predictable (they occur at roughly the same time each day).
[Figure: Distribution function (0 to 1) of inter-arrival times (0:00:00 to 0:01:09): empirical distribution (individual points) versus exponential distribution.]
• Seasonality vs. variability
• Need to slice-up the data
• Within a “slice”, exponential distribution (CVa=1)
• See chapter 6 for various data analysis tools
Data in Practical Call Center Setting
What to Do With Seasonal Data
Measure the true demand data
Slice the data by the hour (30min, 15min)
Apply waiting model in each slice
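The slicing step can be sketched as follows. This is only an illustration under stated assumptions: the timestamps are hypothetical, `cv_by_slice` is an invented helper name, the population standard deviation is used, and interarrival times that cross a slice boundary are dropped. CVa is computed as the standard deviation of the interarrival times divided by their average.

```python
from collections import defaultdict
from statistics import mean, pstdev


def cv_by_slice(arrival_times, slice_length):
    """Group arrival timestamps (seconds) into slices; return {slice: (arrivals, CVa)}."""
    slices = defaultdict(list)
    for t in sorted(arrival_times):
        slices[int(t // slice_length)].append(t)

    result = {}
    for index, times in sorted(slices.items()):
        # Interarrival times inside this slice only.
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        # CVa = standard deviation / average; undefined with fewer than two gaps.
        cva = pstdev(gaps) / mean(gaps) if len(gaps) >= 2 else None
        result[index] = (len(times), cva)
    return result


# Hypothetical arrivals (seconds): a quiet 15-minute slice followed by a busier one.
arrivals = [100, 400, 700, 900, 960, 1080, 1140, 1260]
result = cv_by_slice(arrivals, slice_length=900)

for index, (count, cva) in result.items():
    print(f"slice {index}: {count} arrivals, CVa = {cva}")
```

Each slice then gets its own average interarrival time and CVa, which are the inputs a waiting model needs for that slice.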
How to describe (or model) interarrival times
§ We will use two parameters to describe interarrival times to a process:
- The average interarrival time.
- The standard deviation of the interarrival times.
§ What is the standard deviation?
- Roughly speaking, the standard deviation is a measure of how variable the interarrival times are.
- Two arrival processes can have the same average interarrival time (say 1 minute) but one can have more variation about that average, i.e., a higher standard deviation.
Relative and absolute variability
§ The standard deviation is an absolute measure of variability.
§ Two processes can have the same standard deviation but one can seem much more variable than the other:
- Below are random samples from two processes that have the same standard deviation. The left one seems more variable.
[Figure: Two random samples of inter-arrival times (min). Left: Average = 10, Stdev = 10 (axis 0 to 50). Right: Average = 100, Stdev = 10 (axis 0 to 120).]
Relative and absolute variability (continued)
§ The previous slide plotted the processes on two different axes.
§ Here, the two are plotted relative to their average and with the same axes.
§ Relative to their average, the one on the left is clearly more variable
(sometimes more than 400% above the average or less than 25% of the average).
[Figure: Inter-arrival time / Average (0.0 to 5.0) by observation. Left: Average = 10, Stdev = 10. Right: Average = 100, Stdev = 10.]
§ The coefficient of variation is a measure of the relative variability of a process – it is the standard deviation divided by the average.
§ The coefficient of variation of the arrival process:
Coefficient of variation
[Figure: Inter-arrival time / Average (0.0 to 5.0) for the two processes.]
Left (Average = 10, Stdev = 10): CVa = 10 / 10 = 1
Right (Average = 100, Stdev = 10): CVa = 10 / 100 = 0.1
§ The coefficient of variation of the service time process:
Service time variability
[Figure: Histogram of call durations (seconds); frequency axis from 0 to 800.]
§ Standard deviation of activity times = 150 seconds.
§ Average call time (i.e., activity time) = 120 seconds.
§ CVp = 150 / 120 = 1.25
Modeling Variability in Flow
Inflow
- Demand process is “random” (Randomness is the Rule, not the Exception!)
- Look at the inter-arrival times (IA1, IA2, IA3, IA4, ... over time)
- a: average inter-arrival time
- CVa = St-Dev(inter-arrival times) / Average(inter-arrival times)
- Often Poisson distributed: CVa = 1, constant hazard rate (no memory), exponential inter-arrivals
Buffer
Processing
- Sources of variability: inherent variation, machine breakdown, operator absence
- p: average processing time (same as “activity time” and “service time”)
- CVp = St-Dev(processing times) / Average(processing times)
- Can have many distributions: CVp depends strongly on standardization; often Beta or LogNormal
Outflow
- Variable routing (e.g., emergency room)
- Flow Rate = Minimum{Demand, Capacity} = Demand = 1/a
CV ⇒ measures variability in relative terms
[Figure: Average flow time T versus utilization (up to 100%), starting from the theoretical flow time, with curves for increasing variability.]
Waiting Time Formula (no need to memorize the complicated formula)
\[ \text{Time in queue} = \text{Activity time} \times \left(\frac{\text{utilization}}{1-\text{utilization}}\right) \times \left(\frac{CV_a^2 + CV_p^2}{2}\right) \]
Variability factor: (CVa^2 + CVp^2) / 2
[Figure: Single-server queue. Inflow → inventory waiting (Iq) → service → outflow at the flow rate. Timeline: entry to system, begin service, departure; time in queue Tq, service time p, flow time T = Tq + p.]
The Waiting Time Formula
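\[ \text{Time in queue} = \left(\frac{\text{Activity time}}{m}\right) \times \left(\frac{\text{utilization}^{\sqrt{2(m+1)}-1}}{1-\text{utilization}}\right) \times \left(\frac{CV_a^2 + CV_p^2}{2}\right) \]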
Waiting Time Formula for Multiple (m) Servers
[Figure: Multi-server queue. Inflow → inventory waiting (Iq) → inventory in service (Ip) → outflow at the flow rate. Timeline: entry to system, begin service, departure; time in queue Tq, service time p, flow time T = Tq + p.]
Waiting Time Formula for Multiple, Parallel Resources
• Target Wait Time (TWT)
• Service Level = Probability{Waiting Time ≤ TWT}
• Example: Deutsche Bundesbahn Call Center
- Year 2020: 30% of calls answered within 20 seconds
[Figure: Fraction of customers who have to wait x seconds or less (0 to 1) versus waiting time (0 to 200 seconds). The curve starts at the fraction of customers who get served without waiting at all; the rest of the curve shows waiting times for those customers who do not get served immediately. Annotation: 90% of calls had to wait 25 seconds or less.]
Service Levels in Waiting Systems
Independent Resources 2x(m=1)
Pooled Resources (m=2)
Managerial Responses to Variability: Pooling
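Independent resources, 2 x (m = 1): a = 4 min, p = 3 min for each resource.
Pooled resources (m = 2): a = 2 min, p = 3 min, so p/m = 3/2 min.
In both cases utilization = 0.75, CVa = 1, and CVp = 1.
Independent resources:
\[ T_q = \text{Activity time} \times \left(\frac{\text{utilization}}{1-\text{utilization}}\right) \times \left(\frac{CV_a^2 + CV_p^2}{2}\right) = 3 \times \left(\frac{0.75}{1-0.75}\right) \times \left(\frac{1^2 + 1^2}{2}\right) = 9 \text{ min} \]
Pooled resources:
\[ T_q = \left(\frac{\text{Activity time}}{m}\right) \times \left(\frac{\text{utilization}^{\sqrt{2(m+1)}-1}}{1-\text{utilization}}\right) \times \left(\frac{CV_a^2 + CV_p^2}{2}\right) = \left(\frac{3}{2}\right) \times \left(\frac{0.75^{\sqrt{2(2+1)}-1}}{1-0.75}\right) \times \left(\frac{1^2 + 1^2}{2}\right) = 3.95 \text{ min} \]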
Can serve the same number of customers using the same processing time, but in only half the waiting time!
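The two waiting times in this pooling comparison can be reproduced with a few lines of code. This is a sketch of the multi-server waiting time formula as given in these slides; the function name `time_in_queue` and its parameter names are illustrative, and CVa = CVp = 1 is assumed as in the example. For m = 1 the exponent equals 1, so the same function covers the single-server case.

```python
from math import sqrt


def time_in_queue(p, a, m, cv_a=1.0, cv_p=1.0):
    """Approximate average time in queue for m parallel servers (same unit as p and a)."""
    u = p / (a * m)  # utilization = flow rate / capacity
    if u >= 1:
        raise ValueError("the formula applies only when utilization is below 100%")
    capacity_factor = p / m
    utilization_factor = u ** (sqrt(2 * (m + 1)) - 1) / (1 - u)
    variability_factor = (cv_a ** 2 + cv_p ** 2) / 2
    return capacity_factor * utilization_factor * variability_factor


independent = time_in_queue(p=3, a=4, m=1)  # each of the two separate resources
pooled = time_in_queue(p=3, a=2, m=2)       # one queue feeding both resources

print(f"Independent: {independent:.2f} min, pooled: {pooled:.2f} min")
```

Both configurations run at 75% utilization, so the difference in waiting time comes entirely from sharing one queue.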
[Figure: Waiting time Tq (0 to 70) versus utilization u (60% to 95%) for m = 1, m = 2, m = 5, and m = 10, comparing independent resources, 2 x (m = 1), with pooled resources (m = 2).]
Implications:
+ balanced utilization
Managerial Responses to Variability: Pooling
Pooling prevents the case that one resource is idle while the other faces a backlog of work.
A multi-server queue
§ a = average interarrival time
§ Flow rate = 1 / a
§ p = average activity time
§ Capacity of each server = 1 / p
§ m = number of servers
§ Capacity = m x 1 / p
§ Utilization = Flow rate / Capacity
= (1 / a) / (m x 1 / p)
[Figure: Arriving customers → customers waiting for service → multiple servers → departing customers.]
§ Assumptions:
· All servers are equally skilled, i.e., they all take p time to process each customer.
· Each customer is served by only one server.
· Customers wait until their service is completed.
· There is sufficient capacity to
A multi-server queue – some data and analysis
§ Suppose:
- a = 35 seconds per customer
- Flow rate = 1 / 35 customers per second
- p = 120 seconds per customer
- Capacity of each server = 1 / 120 customers per second
- m = number of servers = 4
§ Then
- Capacity = m x 1 / p = 4 x 1 / 120 = 1 / 30 customers per second
- Utilization = Flow rate / Capacity
= p / (a x m)
= 120 / (35 x 4)
= 85.7%
The implications of utilization
§ An 85.7% utilization means:
- At any given moment, there is an 85.7% chance a server is busy serving a customer and a 1 - 0.857 = 14.3% chance the server is idle.
- At any given moment, on average 85.7% of the servers are busy serving customers and 1 - 0.857 = 14.3% are idle.
§ If utilization < 100% then:
· The system is stable which means that the size of the queue does not keep growing over time. (Capacity is sufficient to serve all demand.)
§ If utilization >= 100% then:
· The system is unstable, which means that the size of the queue will continue to grow over time.
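The utilization calculation and the stability check can be written as a short sketch. The function names `utilization` and `is_stable` are illustrative; the inputs a = 35, p = 120, and m = 4 come from the example, and m = 3 is included as the understaffed comparison case.

```python
def utilization(p, a, m):
    """Utilization = flow rate / capacity = (1 / a) / (m / p) = p / (a * m)."""
    return p / (a * m)


def is_stable(p, a, m):
    """A queue is stable only when utilization is strictly below 100%."""
    return utilization(p, a, m) < 1


for servers in (4, 3):
    u = utilization(p=120, a=35, m=servers)
    label = "stable" if is_stable(120, 35, servers) else "unstable"
    print(f"m = {servers}: utilization = {u:.1%} ({label})")
```

With four servers the queue is stable at about 85.7% utilization; removing one server pushes utilization above 100%, so the queue would keep growing.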
Stable and unstable queues
[Figure: Total number of customers in the system (0 to 200) versus time (0 to 30,000 seconds) for two cases. Stable: a = 35, p = 120, m = 4, utilization = 86%. Unstable: a = 35, p = 120, m = 3, utilization = 114%.]
Time spent in the two stages of the system
Tq = Time in queue p = Time in service
T = Tq + p = Flow time
\[ \text{Time in queue} = \left(\frac{\text{Activity time}}{m}\right) \times \left(\frac{\text{Utilization}^{\sqrt{2(m+1)}-1}}{1-\text{Utilization}}\right) \times \left(\frac{CV_a^2 + CV_p^2}{2}\right) \]
§ The above Time in queue equation works only for a stable system, i.e., a system with utilization less than 100%
Recall:
p = Activity time m = number of servers
What determines time in queue in a stable system?
The capacity factor:
Average processing time of the system = (p/m).
Demand does not influence this factor.
The utilization factor:
Average demand influences this factor because utilization is the ratio of demand to capacity.
The variability factor:
This is how variability influences time in queue: the more variability (holding average demand and capacity constant), the longer the time in queue.
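\[ \text{Time in queue} = \underbrace{\left(\frac{\text{Activity time}}{m}\right)}_{\text{capacity factor}} \times \underbrace{\left(\frac{\text{Utilization}^{\sqrt{2(m+1)}-1}}{1-\text{Utilization}}\right)}_{\text{utilization factor}} \times \underbrace{\left(\frac{CV_a^2 + CV_p^2}{2}\right)}_{\text{variability factor}} \]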
§ Suppose:
- a = 35 seconds, p = 120 seconds, m = 4
- Utilization = p / (a x m) = 85.7%
- CVa = 1, CVp = 1
§ So Flow time = 150 + 120 = 270 seconds.
- In other words, the average customer will spend 270 seconds in the system (waiting for service plus time in service)
Evaluating Time in queue
\[ \text{Time in queue} = \left(\frac{\text{Activity time}}{m}\right) \times \left(\frac{\text{Utilization}^{\sqrt{2(m+1)}-1}}{1-\text{Utilization}}\right) \times \left(\frac{CV_a^2 + CV_p^2}{2}\right) = \left(\frac{120}{4}\right) \times \left(\frac{0.857^{\sqrt{2(4+1)}-1}}{1-0.857}\right) \times \left(\frac{1^2 + 1^2}{2}\right) = 150 \]
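To see how each factor contributes to the 150-second result, the evaluation can be broken into its three parts. This is a sketch using the inputs of the example (a = 35, p = 120, m = 4, CVa = CVp = 1); the variable names are illustrative.

```python
from math import sqrt

# Inputs from the example: times in seconds, four servers, CVa = CVp = 1.
a, p, m, cv_a, cv_p = 35, 120, 4, 1, 1

u = p / (a * m)                                               # utilization
capacity_factor = p / m                                       # activity time / m
utilization_factor = u ** (sqrt(2 * (m + 1)) - 1) / (1 - u)   # grows sharply near 100%
variability_factor = (cv_a ** 2 + cv_p ** 2) / 2              # equals 1 when CVa = CVp = 1

t_q = capacity_factor * utilization_factor * variability_factor
flow_time = t_q + p                                           # T = Tq + p

print(f"capacity factor    = {capacity_factor:.1f}")
print(f"utilization factor = {utilization_factor:.2f}")
print(f"variability factor = {variability_factor:.1f}")
print(f"time in queue      = {t_q:.1f} s, flow time = {flow_time:.1f} s")
```

Because the code keeps the unrounded utilization, its result differs from the rounded 150 and 270 seconds by a fraction of a second.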
How many customers are in the system?
§ Use Little’s Law, I = R x T
§ R = Flow Rate = (1/a)
- The flow rate through the system equals the demand rate because we are demand constrained
(utilization is less than 100%)
§ Iq = (1/a) x Tq = Tq / a
§ Ip = (1/a) x p = p / a
- Note, time in service does not depend on the number of servers because when a customer is in service they are processed by only one server no matter how many servers are in the system.
Iq = Inventory in queue Ip = Inventory in service
I = Iq + Ip = Inventory in the system
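Little's Law turns the times into inventories. The sketch below applies it to the four-server example, taking Tq = 150 seconds as given; the function name `inventories` is illustrative.

```python
def inventories(a, p, t_q):
    """Apply Little's Law (I = R x T) with flow rate R = 1 / a."""
    flow_rate = 1 / a
    i_q = flow_rate * t_q  # average number of customers waiting
    i_p = flow_rate * p    # average number of customers in service
    return i_q, i_p, i_q + i_p


# a = 35 s and p = 120 s, with Tq = 150 s from the four-server example.
i_q, i_p, i_total = inventories(a=35, p=120, t_q=150)
print(f"Iq = {i_q:.1f}, Ip = {i_p:.1f}, I = {i_total:.1f}")
```

The number in service depends only on p and a, which is why it does not change with the number of servers.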
[Figure: Time in queue (0 to 200) versus utilization (80% to 100%).]
Utilization and system performance
\[ \text{Time in queue} = \left(\frac{\text{Activity time}}{m}\right) \times \left(\frac{\text{Utilization}^{\sqrt{2(m+1)}-1}}{1-\text{Utilization}}\right) \times \left(\frac{CV_a^2 + CV_p^2}{2}\right) \]
§ Time in queue increases dramatically as the utilization approaches 100%.
Plot parameters: Activity time = 1, m = 1, CVa = 1, CVp = 1
Which system is more effective?
Pooled system: one queue, four servers.
a = 35, m = 4
Partially pooled system: two queues, two servers with each queue.
For each of the 2 queues: a = 35 x 2 = 70, m = 2
Separate queue system: four queues, one server with each queue.
For each of the 4 queues: a = 35 x 4 = 140, m = 1
Across these three types of systems:
- Variability is the same: CVa = 1, CVp = 1
- Total demand is the same: 1/35 customers per second.
- Activity time is the same: p = 120
- Utilization is the same: p / (a x m) = 85.7%
- The probability a server is busy is the same = 0.857
Pooling can reduce Time in queue considerably
Pooled system:
a = 35, p = 120, m = 4, CVa = 1, CVp = 1
Tq = 150, Flow Time = 270
Partially pooled system, for each of the two queues:
a = 70, p = 120, m = 2, CVa = 1, CVp = 1
Tq = 336, Flow Time = 456
Separate queue system, for each of the four queues:
a = 140, p = 120, m = 1, CVa = 1, CVp = 1
Tq = 720, Flow Time = 840
Pooling reduces the number of customers waiting but not the number in service
                           Inventories for each queue    Inventories for the
                           within the system             total system
Pooled system              Iq = 4.3   Ip = 3.4           Iq = 4.3    Ip = 3.4
Partially pooled system    Iq = 4.8   Ip = 1.7           Iq = 9.6    Ip = 3.4   (2 times the inventory in each queue)
Separate queue system      Iq = 5.1   Ip = 0.86          Iq = 20.4   Ip = 3.4   (4 times the inventory in each queue)
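The three configurations can be compared side by side in code. This is a sketch under the stated inputs (total demand of one customer per 35 seconds, p = 120, CVa = CVp = 1); `time_in_queue` is an illustrative helper implementing the multi-server formula from these slides, and the list `systems` is a hypothetical way of encoding the three designs.

```python
from math import sqrt


def time_in_queue(p, a, m, cv_a=1.0, cv_p=1.0):
    """Multi-server waiting time approximation (valid only for utilization < 100%)."""
    u = p / (a * m)
    return (p / m) * (u ** (sqrt(2 * (m + 1)) - 1) / (1 - u)) * ((cv_a ** 2 + cv_p ** 2) / 2)


# (name, number of separate queues, servers per queue)
systems = [("Pooled", 1, 4), ("Partially pooled", 2, 2), ("Separate", 4, 1)]

results = {}
for name, queues, m in systems:
    a = 35 * queues                  # each queue receives 1 / queues of the arrivals
    t_q = time_in_queue(p=120, a=a, m=m)
    i_q_total = queues * t_q / a     # Little's Law per queue, summed over all queues
    i_p_total = queues * 120 / a     # customers in service across the whole system
    results[name] = (t_q, i_q_total, i_p_total)
    print(f"{name}: Tq = {t_q:.0f} s, total Iq = {i_q_total:.1f}, total Ip = {i_p_total:.1f}")
```

The totals here are computed without intermediate rounding, so the separate queue system comes out at roughly 20.6 customers waiting rather than the 20.4 obtained by multiplying the rounded per-queue value 5.1 by 4.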
Why does pooling work?
[Figure: Pooled system versus separate queue system, with one marker for a customer waiting and another for a customer in service.]
§ Both systems currently have 6 customers.
§ In the pooled system, all servers are busy and only 2 customers are waiting.
§ In the separate queue system, 2 servers are idle and 4 customers are waiting.
§ The separate queue system is inefficient because there can be customers waiting and idle servers at the same time.
Some quick service restaurants pool drive through ordering
§ The person saying “Can I take your order?” may be hundreds (or even thousands) of miles away:
- Pooling the order taking process can improve time-in-queue while requiring less labor.
- It has been shown that queue length at the drive through influences demand – people don’t stop if the queue is long.
- However, this system incurs additional communication and software costs.
Limitations to pooling
§ Pooling may require workers to have a broader set of skills, which may require more training and higher wages:
- Imagine a call center that took orders for McDonald’s and Wendy’s … now the order takers need to be experts in two menus.
- Suppose cardiac surgeons need to be skilled at kidney transplants as well.
§ Pooling may disrupt the customer – server relationship:
- Patients like to see the same physician.
§ Pooling may increase the time-in-queue for one customer class at the expense of another:
- Removing priority security screening for first-class passengers may decrease the average time-in-queue for all passengers but will likely increase it for first-class passengers.
Why the long queues for the women’s restroom?
Women wait for a public restroom at Yankee Stadium in 2014
Potty-parity Laws
§ Women’s Restroom Equity Bill:
- Passed NY City Council May 2005.
- Requires women’s restrooms to have twice the flushing capacity of men’s restrooms (at least a 2:1 ratio of
women’s toilets to men’s toilets & urinals)
- In the context of Time in queue equation, this law requires m to increase for
women, which decreases the capacity factor (p/m), and decreases utilization.
§ Other solutions:
- Unisex restrooms (pools capacity, at least the toilets) ⇒Not in Korea
- Flexible partitions that can change the size of each restroom.
\[ \text{Time in queue} = \left(\frac{p}{m}\right) \times \left(\frac{u^{\sqrt{2(m+1)}-1}}{1-u}\right) \times \left(\frac{CV_a^2 + CV_p^2}{2}\right) \]
\[ p = \text{Activity time}, \qquad u = \text{Utilization} = \frac{p}{a \times m} \]
Service times:
A: 9 minutes, B: 10 minutes, C: 4 minutes, D: 8 minutes
Sequence A, B, C, D: B waits 9 min., C waits 19 min., D waits 23 min.
Total wait time: 9+19+23=51 min
Sequence C, D, A, B: D waits 4 min., A waits 12 min., B waits 21 min.
Total wait time: 4+12+21=37 min
• Flow units are sequenced in the waiting area (triage step)
• Shortest Processing Time (SPT) Rule
- Minimizes average waiting time
- Problem of having “true” processing times (Can I ask a quick question?)
• First-Come, First-Served (FCFS)
- easy to implement
- perceived fairness
• Sequence based on importance
- emergency cases
Managerial Responses to Variability: Priority Rules in Waiting Time Systems
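The effect of sequencing on total waiting time can be checked with a short sketch. It assumes all four jobs (A to D, with the service times from the example) are already waiting at time zero and are served one at a time; the function name `total_wait` is illustrative.

```python
def total_wait(sequence, service_times):
    """Sum of waiting times when jobs are served one at a time in the given order."""
    elapsed = 0
    total = 0
    for job in sequence:
        total += elapsed               # this job waited for everything served before it
        elapsed += service_times[job]
    return total


service_times = {"A": 9, "B": 10, "C": 4, "D": 8}  # minutes

alphabetical_order = ["A", "B", "C", "D"]
spt_order = sorted(service_times, key=service_times.get)  # shortest processing time first

print(alphabetical_order, total_wait(alphabetical_order, service_times))
print(spt_order, total_wait(spt_order, service_times))
```

Serving the shortest jobs first means fewer customers sit behind a long job, which is why the SPT order gives the smaller total.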
Reducing Variability
Variability is the enemy of all operations.
♣ Ways to reduce arrival variability
Appointment systems
- Do not eliminate arrival variability
- If the doctor has the right to be late, why not the patient?
- What portion of the available capacity should be reserved in advance? (Ch. 18 Revenue Management)
- Do not reduce the variability of the true underlying demand
→ providing incentives for customers to avoid peak hours (Demand Management)
Reducing Variability
♣ Ways to reduce processing time variability
- Find a balance between operational efficiency (e.g., call duration) and the quality of service (perceived courtesy)
- Investment in training and technology (e.g., operators received real-time instructions (scripting))
• Variability is the norm, not the exception
• Variability leads to waiting times although utilization < 100%
• Use the Waiting Time Formula to
- get a qualitative feeling of the system
- analyze specific recommendations / scenarios
• Managerial response to variability:
- understand where it comes from and eliminate what you can
- accommodate the rest by holding excess capacity
• Difference between variability and seasonality
- seasonality is addressed by staffing to (expected) demand
Managing Waiting Systems: Points to Remember
Summary
§ Even when a process is demand constrained (utilization is less than 100%), waiting time for service can be substantial due to variability in the arrival and/or service process.
§ Waiting times tend to increase dramatically as the utilization of a process approaches 100%.
§ Pooling multiple queues can reduce the time-in-queue with the same amount of labor (or use less labor to achieve the same time-in-queue).
§ (Remark) The utilization factor is nonlinear.
- Utilization = 0.8 → 0.8/(1-0.8) = 4
- Utilization = 0.9 → 0.9/(1-0.9) = 9
- Utilization = 0.95 → 0.95/(1-0.95) = 19
- Utilization = 0.99 → 0.99/(1-0.99) = 99
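The nonlinearity of the utilization factor is easy to tabulate. This sketch evaluates the single-server form u / (1 - u) at the four utilization levels listed; the function name `utilization_factor` is illustrative.

```python
def utilization_factor(u):
    """Single-server utilization factor u / (1 - u); defined only for u < 1."""
    if not 0 <= u < 1:
        raise ValueError("utilization must be at least 0 and below 1")
    return u / (1 - u)


for u in (0.8, 0.9, 0.95, 0.99):
    print(f"utilization = {u:.2f} -> factor = {utilization_factor(u):.0f}")
```

Moving from 80% to 90% utilization more than doubles the factor, and each further step toward 100% multiplies it again.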