Chapter 9. Variability and Its Impact on Process Performance: Waiting Time Problems

(1)


(2)

For consumers, waiting time is one of the most visible and annoying forms of supply-demand mismatch.

Waiting arises when the expected demand rate > expected supply rate, but also when the implied utilization < 100%

⇒ In the presence of variability:

- Predict waiting times

- Recommend ways of reducing waiting time (choose an appropriate capacity level, redesign the service system, reduce variability)

(3)

An Example of a Simple Queuing System

Call center: Incoming calls → Calls on hold → Sales reps processing calls → Answered calls

- Blocked calls (busy signal)
- Abandoned calls (tired of waiting)
- Balking vs. reneging

Financial consequences:
- Lost throughput
- Holding cost
- Lost goodwill
- Revenue
- Cost of capacity
- Cost per customer

§ At peak, 80% of calls dialed received a busy signal.

§ Customers getting through had to wait on average 10 minutes.

§ Extra telephone expense per day for waiting was $25,000.

(4)

A Somewhat Odd Service Process

Patient              1   2   3   4   5   6   7   8   9  10  11  12
Arrival Time (min)   0   5  10  15  20  25  30  35  40  45  50  55
Service Time (min)   4   4   4   4   4   4   4   4   4   4   4   4

Utilization = Flow Rate / Capacity = (12 patients/hr) / (15 patients/hr) = 80%

(5)

A More Realistic Service Process

Patient              1   2   3   4   5   6   7   8   9  10  11  12
Arrival Time (min)   0   7   9  12  18  22  25  30  36  45  51  55
Service Time (min)   5   6   7   6   5   2   4   3   4   2   2   3

[Figure: timeline from 7:00 to 8:00 showing the arrivals of Patients 1 to 12]

[Figure: histogram of service times (2 min. to 7 min.) against number of cases (0 to 3)]

(6)

Variability Leads to Waiting Time

Patient              1   2   3   4   5   6   7   8   9  10  11  12
Arrival Time (min)   0   7   9  12  18  22  25  30  36  45  51  55
Service Time (min)   5   6   7   6   5   2   4   3   4   2   2   3

[Figure: timeline from 7:00 to 8:00 showing wait time and service time for each patient, and the inventory (patients at the service system, 1 to 5) over time]

Capacity can never run ahead of demand (service system)!

Average Wait Time?
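The average wait can be worked out by replaying the two patient tables through the process. The sketch below assumes a single server working first-come, first-served, which the slides imply but do not state; the function and variable names are illustrative only.

```python
def fcfs_wait_times(arrivals, service_times):
    """Return each patient's wait (in minutes) at one first-come, first-served server."""
    waits = []
    server_free_at = 0  # time at which the server finishes its current patient
    for arrival, service in zip(arrivals, service_times):
        start = max(arrival, server_free_at)  # wait only if the server is still busy
        waits.append(start - arrival)
        server_free_at = start + service
    return waits


# Data from the two patient tables (minutes after 7:00)
steady_arrivals = list(range(0, 60, 5))
steady_service = [4] * 12
variable_arrivals = [0, 7, 9, 12, 18, 22, 25, 30, 36, 45, 51, 55]
variable_service = [5, 6, 7, 6, 5, 2, 4, 3, 4, 2, 2, 3]

steady_waits = fcfs_wait_times(steady_arrivals, steady_service)
variable_waits = fcfs_wait_times(variable_arrivals, variable_service)

print(sum(steady_waits) / len(steady_waits))      # 0.0 minutes
print(sum(variable_waits) / len(variable_waits))  # 4.0 minutes
```

Under these assumptions nobody waits in the evenly spaced case, while the variable case produces an average wait of 4 minutes, even though the server has spare capacity in both.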

(7)

What generates queues?

§ An arrival rate that predictably exceeds the service rate (i.e., capacity) of a process.

- e.g., toll booth congestion on the NJ Turnpike during the Thanksgiving Day rush.

§ Variation in the arrival and service rates in a process where the average service rate is more than adequate to process the average arrival rate.

- e.g., calls to a brokerage are unusually high during a particular hour relative to the same hour in other weeks (i.e., calls are high by random chance).

(8)

§ An interarrival time is the amount of time between two arrivals to a process.

§ An arrival process is stationary over a period of time if the number of arrivals in any subinterval depends only on the length of the interval and not on when the interval starts.

- For example, if the process is stationary over the course of a day, then the expected number of arrivals within any three hour interval is about the same no matter which three hour window is chosen (or six hour window, or one hour window, etc.).

- Processes tend to be nonstationary (or seasonal) over long time periods (e.g., over a day or several hours) but stationary over short periods of time (say one hour, or 15 minutes).

Defining interarrival times and a stationary process

(9)

Arrivals to An-ser over the day are nonstationary

[Figure: number of customers per 15 minutes (0 to 160) over the course of a day]

- The number of arrivals in a 3-hour interval clearly depends on which 3-hour interval you choose …

- … and the peaks and troughs are predictable (they occur at roughly the same time each day).

(10)

Data in Practical Call Center Setting

[Figure: number of customers per 15 minutes (0 to 160) by time of day, 0:15 to 23:00]

[Figure: distribution function (0 to 1) of the inter-arrival time (0:00:00 to 0:01:09); empirical distribution (individual points) compared with the exponential distribution]

• Seasonality vs. variability

• Need to slice up the data

• Within a “slice”, exponential distribution (CV_a=1)

• See chapter 6 for various data analysis tools

(11)

What to Do With Seasonal Data

Measure the true demand data

Slice the data by the hour (30min, 15min)

Apply waiting model in each slice

(12)

How to describe (or model) interarrival times

§ We will use two parameters to describe interarrival times to a process:

- The average interarrival time.

- The standard deviation of the interarrival times.

§ What is the standard deviation?

- Roughly speaking, the standard deviation is a measure of how variable the interarrival times are.

- Two arrival processes can have the same average interarrival time (say 1 minute) but one can have more variation about that average, i.e., a higher standard deviation.

(13)

Relative and absolute variability

§ The standard deviation is an absolute measure of variability.

§ Two processes can have the same standard deviation but one can seem much more variable than the other:

- Below are random samples from two processes that have the same standard deviation. The left one seems more variable.

[Figure: random samples of inter-arrival times (min) from the two processes. Left panel: Average = 10, Stdev = 10, axis from 0 to 50. Right panel: axis from 0 to 120.]

(14)

Relative and absolute variability (continued)

§ The previous slide plotted the processes on two different axes.

§ Here, the two are plotted relative to their average and with the same axes.

§ Relative to their average, the one on the left is clearly more variable (sometimes more than 400% above the average or less than 25% of the average).

[Figure: inter-arrival time / average (0.0 to 5.0) by observation, both processes plotted on the same axes]

(15)

Coefficient of variation

§ The coefficient of variation is a measure of the relative variability of a process: it is the standard deviation divided by the average.

§ The coefficient of variation of the arrival process: CV_a = St-Dev(inter-arrival times) / Average(inter-arrival times)

- Left process: CV_a = 10 / 10 = 1

- Right process: CV_a = 10 / 100 = 0.1

(16)

Service time variability

§ The coefficient of variation of the service time process: CV_p = St-Dev(activity times) / Average(activity times)

[Figure: histogram of call durations (seconds) against frequency (0 to 800)]

§ Standard deviation of activity times = 150 seconds.

§ Average call time (i.e., activity time) = 120 seconds.

§ CV_p = 150 / 120 = 1.25
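As a small illustration of the coefficient of variation, the sketch below reproduces the three values quoted on the slides from their standard deviations and averages, and also computes a CV from raw observations. The sample list is made up for illustration, and using the population standard deviation (`statistics.pstdev`) rather than the sample one is an assumption.

```python
import statistics


def cv_from_summary(stdev, average):
    """Coefficient of variation from a known standard deviation and average."""
    return stdev / average


def cv_from_samples(samples):
    """Coefficient of variation estimated from raw observations."""
    return statistics.pstdev(samples) / statistics.mean(samples)


cv_a_left = cv_from_summary(10, 10)    # 1.0
cv_a_right = cv_from_summary(10, 100)  # 0.1
cv_p_calls = cv_from_summary(150, 120) # 1.25

# Hypothetical inter-arrival times in minutes (not data from the slides)
hypothetical_samples = [2, 4, 4, 4, 5, 5, 7, 9]
print(cv_a_left, cv_a_right, cv_p_calls, cv_from_samples(hypothetical_samples))
```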

(17)

Modeling Variability in Flow

Inflow → Buffer → Processing → Outflow

Inflow:
- Demand process is “random” (Randomness is the Rule, not the Exception!)
- Look at the inter-arrival times (IA1, IA2, IA3, IA4 along the time axis)
- a: average inter-arrival time
- CV_a = St-Dev(inter-arrival times) / Average(inter-arrival times)
- Often Poisson distributed: CV_a = 1, constant hazard rate (no memory), exponential inter-arrivals

Processing:
- Sources of variability: inherent variation, machine breakdown, operator absence
- p: average processing time (same as “activity time” and “service time”)
- CV_p = St-Dev(processing times) / Average(processing times)
- Can have many distributions: CV_p depends strongly on standardization; often Beta or LogNormal

Outflow:
- Variable routing (e.g., Emergency Room)
- Flow Rate = Minimum{Demand, Capacity} = Demand = 1/a

CV ⇒ measures variability in relative terms

(18)

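The Waiting Time Formula

Waiting Time Formula (no need to memorize the complicated formula)

\[
\text{Time in queue} = \text{Activity time} \times \left(\frac{\text{utilization}}{1-\text{utilization}}\right) \times \left(\frac{CV_a^2 + CV_p^2}{2}\right)
\]

The last term, (CV_a^2 + CV_p^2) / 2, is the variability factor.

[Figure: average flow time T against utilization (up to 100%); the curve starts at the theoretical flow time and rises more steeply with increasing variability]

[Figure: Inflow → Inventory waiting I_q → service → Outflow. Entry to system → (Time in queue T_q) → Begin Service → (Service Time p) → Departure. Flow rate.]

Flow Time T = T_q + p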

(19)

Waiting Time Formula for Multiple, Parallel Resources

Waiting Time Formula for Multiple (m) Servers

\[
\text{Time in queue} = \left(\frac{\text{Activity time}}{m}\right) \times \left(\frac{\text{utilization}^{\sqrt{2(m+1)}-1}}{1-\text{utilization}}\right) \times \left(\frac{CV_a^2 + CV_p^2}{2}\right)
\]

[Figure: Inflow → Inventory waiting I_q → Inventory in service I_p → Outflow. Entry to system → (Time in queue T_q) → Begin Service → (Service Time p) → Departure. Flow rate.]

Flow Time T = T_q + p
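The multi-server formula translates directly into a small function. This is a minimal sketch: the function name and argument names are illustrative, utilization is computed as p / (a x m), and the values in the usage line (a = 10, p = 8) are made up rather than taken from the slides.

```python
import math


def time_in_queue(a, p, m=1, cv_a=1.0, cv_p=1.0):
    """Approximate average time in queue for m parallel servers.

    a: average inter-arrival time, p: average activity time (same time unit).
    """
    utilization = p / (a * m)
    if utilization >= 1:
        raise ValueError("the formula only applies when utilization < 100%")
    capacity_factor = p / m
    utilization_factor = utilization ** (math.sqrt(2 * (m + 1)) - 1) / (1 - utilization)
    variability_factor = (cv_a ** 2 + cv_p ** 2) / 2
    return capacity_factor * utilization_factor * variability_factor


# Hypothetical single-server example: utilization = 8 / 10 = 80%
print(round(time_in_queue(a=10, p=8), 2))  # 32.0
```

With m = 1 the exponent sqrt(2(m+1)) - 1 equals 1, so the function reduces to the single-server formula.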

(20)

Service Levels in Waiting Systems

• Target Wait Time (TWT)

• Service Level = Probability{Waiting Time ≤ TWT}

• Example: Deutsche Bundesbahn Call Center

- Year 2020: 30% of calls answered within 20 seconds

[Figure: fraction of customers who have to wait x seconds or less (0 to 1) against waiting time (0 to 200 seconds). Annotations: fraction of customers who get served without waiting at all; waiting times for those customers who do not get served immediately; 90% of calls had to wait 25 seconds or less.]
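Given a log of observed waiting times, the service level is simply the fraction of customers whose wait is at or below the target wait time. The sketch below shows that calculation; the waiting times are invented for illustration and are not the call center data from the slide.

```python
def service_level(waiting_times, target_wait_time):
    """Fraction of customers whose waiting time is <= the target wait time."""
    within_target = sum(1 for wait in waiting_times if wait <= target_wait_time)
    return within_target / len(waiting_times)


# Hypothetical waiting times in seconds (0 means served immediately)
hypothetical_waits = [0, 0, 0, 5, 12, 18, 25, 40, 75, 130]

print(service_level(hypothetical_waits, target_wait_time=20))  # 0.6
```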

(21)

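Managerial Responses to Variability: Pooling

Independent Resources 2x(m=1): a = 4 min, p = 3 min for each resource

⇒ Pooled Resources (m=2): a = 2 min, p = 3 min, so p/m = 3/2 min

In both cases utilization = 0.75 and CV_a = CV_p = 1.

Independent resources (m = 1):

\[
T_q = \text{Activity time} \times \left(\frac{\text{utilization}}{1-\text{utilization}}\right) \times \left(\frac{CV_a^2 + CV_p^2}{2}\right) = 3 \times \left(\frac{0.75}{1-0.75}\right) \times \left(\frac{1^2 + 1^2}{2}\right) = 9 \text{ min}
\]

Pooled resources (m = 2):

\[
T_q = \left(\frac{\text{Activity time}}{m}\right) \times \left(\frac{\text{utilization}^{\sqrt{2(m+1)}-1}}{1-\text{utilization}}\right) \times \left(\frac{CV_a^2 + CV_p^2}{2}\right) = \left(\frac{3}{2}\right) \times \left(\frac{0.75^{\sqrt{2(2+1)}-1}}{1-0.75}\right) \times \left(\frac{1^2 + 1^2}{2}\right) = 3.95 \text{ min}
\]

Can serve the same number of customers using the same processing time, but in less than half the waiting time!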
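The pooling comparison can be checked numerically. The sketch below assumes CV_a = CV_p = 1 and treats the pooled system as one queue with a = 2 min, p = 3 min and m = 2; the helper name is illustrative.

```python
import math


def time_in_queue(a, p, m, cv_a=1.0, cv_p=1.0):
    """Waiting time formula for m parallel servers (valid for utilization < 100%)."""
    utilization = p / (a * m)
    utilization_factor = utilization ** (math.sqrt(2 * (m + 1)) - 1) / (1 - utilization)
    return (p / m) * utilization_factor * (cv_a ** 2 + cv_p ** 2) / 2


independent = time_in_queue(a=4, p=3, m=1)  # each of the two separate resources
pooled = time_in_queue(a=2, p=3, m=2)       # both resources behind one queue

print(round(independent, 2), round(pooled, 2))  # 9.0 3.95
```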

(22)

Managerial Responses to Variability: Pooling

[Figure: waiting time T_q (0 to 70) against utilization u (60% to 95%) for m = 1, m = 2, m = 5 and m = 10; independent resources 2x(m=1) compared with pooled resources (m=2)]

Implications:

+ balanced utilization

Pooling prevents the case that one resource is idle while the other faces a backlog of work.

(23)

A multi-server queue

§ a = average interarrival time

§ Flow rate = 1 / a

§ p = average activity time

§ Capacity of each server = 1 / p

§ m = number of servers

§ Capacity = m x 1 / p

§ Utilization = Flow rate / Capacity

= (1 / a) / (m x 1 / p)

[Figure: Arriving customers → Customers waiting for service → Multiple servers → Departing customers]

§ Assumptions:

· All servers are equally skilled, i.e., they all take p time to process each customer.

· Each customer is served by only one server.

· Customers wait until their service is completed.

· There is sufficient capacity to

(24)

A multi-server queue – some data and analysis

§ Suppose:

- a = 35 seconds per customer

- Flow rate = 1 / 35 customers per second

- p = 120 seconds per customer

- Capacity of each server = 1 / 120 customers per second

- m = number of servers = 4

§ Then

- Capacity = m x 1 / p = 4 x 1 / 120 = 1 / 30 customers per second

- Utilization = Flow rate / Capacity

= p / (a x m)

= 120 / (35 x 4)

= 85.7%

(25)

The implications of utilization

§ An 85.7% utilization means:

- At any given moment, there is an 85.7% chance a server is busy serving a customer and a 1 - 0.857 = 14.3% chance the server is idle.

- At any given moment, on average 85.7% of the servers are busy serving customers and 1 - 0.857 = 14.3% are idle.

§ If utilization < 100% then:

· The system is stable which means that the size of the queue does not keep growing over time. (Capacity is sufficient to serve all demand.)

§ If utilization >= 100% then:

· The system is unstable, which means that the size of the queue will continue to grow over time.

(26)

Stable and unstable queues

[Figure: total number of customers in the system (0 to 200) over time (0 to 30,000 seconds), one panel per system]

- Stable queue: a = 35, p = 120, m = 4, Utilization = 86%

- Unstable queue: a = 35, p = 120, m = 3, Utilization = 114%
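The difference between the two systems can be reproduced with a short simulation. This is a rough sketch, not the method behind the slide: it assumes exponential inter-arrival and service times (CV_a = CV_p = 1), first-come, first-served service, and a fixed random seed; the function name is hypothetical.

```python
import heapq
import random


def customers_in_system_at(horizon, a, p, m, seed=0):
    """Simulate an m-server FCFS queue and count the customers present at time `horizon`."""
    rng = random.Random(seed)
    server_free_at = [0.0] * m  # min-heap of the times at which each server becomes free
    finish_times = []
    t = 0.0
    while True:
        t += rng.expovariate(1 / a)  # next arrival; mean inter-arrival time is a
        if t > horizon:
            break
        start = max(t, heapq.heappop(server_free_at))  # earliest available server
        finish = start + rng.expovariate(1 / p)         # mean service time is p
        heapq.heappush(server_free_at, finish)
        finish_times.append(finish)
    # Customers who arrived before the horizon but have not yet departed
    return sum(1 for finish in finish_times if finish > horizon)


stable = customers_in_system_at(30_000, a=35, p=120, m=4)
unstable = customers_in_system_at(30_000, a=35, p=120, m=3)
print(stable, unstable)
```

Because both runs use the same seed, they see the same arrivals and service times, so any difference in the final count comes only from the number of servers.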

(27)

Time spent in the two stages of the system

T_q = Time in queue

p = Time in service

T = T_q + p = Flow time

\[
\text{Time in queue} = \left(\frac{\text{Activity time}}{m}\right) \times \left(\frac{\text{Utilization}^{\sqrt{2(m+1)}-1}}{1-\text{Utilization}}\right) \times \left(\frac{CV_a^2 + CV_p^2}{2}\right)
\]

§ The above Time in queue equation works only for a stable system, i.e., a system with utilization less than 100%.

Recall:

p = Activity time

m = number of servers

(28)

What determines time in queue in a stable system?

\[
\text{Time in queue} = \underbrace{\left(\frac{\text{Activity time}}{m}\right)}_{\text{capacity factor}} \times \underbrace{\left(\frac{\text{Utilization}^{\sqrt{2(m+1)}-1}}{1-\text{Utilization}}\right)}_{\text{utilization factor}} \times \underbrace{\left(\frac{CV_a^2 + CV_p^2}{2}\right)}_{\text{variability factor}}
\]

The capacity factor:

Average processing time of the system = (p/m).

Demand does not influence this factor.

The utilization factor:

Average demand influences this factor because utilization is the ratio of demand to capacity.

The variability factor:

This is how variability influences time in queue: the more variability (holding average demand and capacity constant), the longer the time in queue.

(29)

Evaluating Time in queue

§ Suppose:

- a = 35 seconds, p = 120 seconds, m = 4

- Utilization = p / (a x m) = 85.7%

- CV_a = 1, CV_p = 1

\[
\text{Time in queue} = \left(\frac{\text{Activity time}}{m}\right) \times \left(\frac{\text{Utilization}^{\sqrt{2(m+1)}-1}}{1-\text{Utilization}}\right) \times \left(\frac{CV_a^2 + CV_p^2}{2}\right) = \left(\frac{120}{4}\right) \times \left(\frac{0.857^{\sqrt{2(4+1)}-1}}{1-0.857}\right) \times \left(\frac{1^2 + 1^2}{2}\right) = 150
\]

§ So Flow time = 150 + 120 = 270 seconds.

- In other words, the average customer will spend 270 seconds in the system (waiting for service plus time in service)
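To make the evaluation easier to follow, the sketch below computes the capacity, utilization and variability factors separately for a = 35, p = 120, m = 4 and CV_a = CV_p = 1, and then multiplies them. Variable names are illustrative.

```python
import math

a, p, m = 35, 120, 4  # seconds per arrival, seconds per customer, number of servers
cv_a, cv_p = 1, 1

utilization = p / (a * m)
capacity_factor = p / m
utilization_factor = utilization ** (math.sqrt(2 * (m + 1)) - 1) / (1 - utilization)
variability_factor = (cv_a ** 2 + cv_p ** 2) / 2

time_in_queue = capacity_factor * utilization_factor * variability_factor
flow_time = time_in_queue + p

print(round(utilization, 3), round(time_in_queue), round(flow_time))  # 0.857 150 270
```

The unrounded result is about 150.5 seconds; the slides round it to 150.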

(30)

How many customers are in the system?

I_q = Inventory in queue

I_p = Inventory in service

I = I_q + I_p = Inventory in the system

§ Use Little’s Law, I = R x T

§ R = Flow Rate = (1/a)

- The flow rate through the system equals the demand rate because we are demand constrained (utilization is less than 100%)

§ I_q = (1/a) x T_q = T_q / a

§ I_p = (1/a) x p = p / a

- Note, time in service does not depend on the number of servers because when a customer is in service they are processed by only one server no matter how many servers are in the system.
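Applying Little's Law to the running example (a = 35 seconds, p = 120 seconds, T_q = 150 seconds) gives the average number of customers waiting and in service. A minimal sketch, with an illustrative function name:

```python
def inventories(a, p, time_in_queue):
    """Apply Little's Law (I = R x T) to the queue and to service separately."""
    flow_rate = 1 / a                 # demand constrained: flow rate equals demand rate
    i_q = flow_rate * time_in_queue   # average number of customers waiting
    i_p = flow_rate * p               # average number of customers in service
    return i_q, i_p, i_q + i_p


i_q, i_p, i_total = inventories(a=35, p=120, time_in_queue=150)
print(round(i_q, 1), round(i_p, 1), round(i_total, 1))  # 4.3 3.4 7.7
```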

(31)

Utilization and system performance

[Figure: time in queue (0 to 200) against utilization (80% to 100%) for Activity time = 1, m = 1, CV_a = 1, CV_p = 1]

\[
\text{Time in queue} = \left(\frac{\text{Activity time}}{m}\right) \times \left(\frac{\text{Utilization}^{\sqrt{2(m+1)}-1}}{1-\text{Utilization}}\right) \times \left(\frac{CV_a^2 + CV_p^2}{2}\right)
\]

§ Time in queue increases dramatically as the utilization approaches 100%

Activity time = 1, m = 1

CV_a = 1, CV_p = 1
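The steep rise near 100% utilization is easy to tabulate for the plotted case (Activity time = 1, m = 1, CV_a = CV_p = 1), where the formula collapses to utilization / (1 - utilization). A small sketch with an illustrative function name:

```python
def time_in_queue_single_server(utilization, activity_time=1.0, cv_a=1.0, cv_p=1.0):
    """Single-server waiting time formula, written in terms of utilization."""
    return activity_time * utilization / (1 - utilization) * (cv_a ** 2 + cv_p ** 2) / 2


for u in (0.80, 0.90, 0.95, 0.99):
    print(f"{u:.0%}: {time_in_queue_single_server(u):.1f}")
# 80%: 4.0
# 90%: 9.0
# 95%: 19.0
# 99%: 99.0
```

Moving from 80% to 90% utilization more than doubles the time in queue, and moving from 95% to 99% multiplies it by about five.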

(32)

Which system is more effective?

Pooled system: one queue, four servers.

- a = 35, m = 4

Partially pooled system: two queues, two servers with each queue.

- For each of the 2 queues: a = 35 x 2 = 70, m = 2

Separate queue system: four queues, one server with each.

- For each of the 4 queues: a = 35 x 4 = 140, m = 1

Across these three types of systems:

- Variability is the same: CV_a = 1, CV_p = 1

- Total demand is the same: 1/35 customers per second.

- Activity time is the same: p = 120

- Utilization is the same: p / (a x m) = 85.7%

- The probability a server is busy is the same = 0.857

(33)

Pooling can reduce Time in queue considerably

Pooled system:

- a = 35, p = 120, m = 4, CV_a = 1, CV_p = 1

- T_q = 150, Flow Time = 270

Partially pooled system, for each of the two queues:

- a = 70, p = 120, m = 2, CV_a = 1, CV_p = 1

- T_q = 336, Flow Time = 456

Separate queue system, for each of the four queues:

- a = 140, p = 120, m = 1, CV_a = 1, CV_p = 1

- T_q = 720, Flow Time = 840

(34)

Pooling reduces the number of customers waiting but not the number in service

Inventories for each queue within the system:

- Pooled system: I_q = 4.3, I_p = 3.4

- Partially pooled system: I_q = 4.8, I_p = 1.7

- Separate queue system: I_q = 5.1, I_p = 0.86

Inventories for the total system:

- Pooled system: I_q = 4.3, I_p = 3.4

- Partially pooled system: I_q = 9.6, I_p = 3.4 (2 times the inventory in each queue)

- Separate queue system: I_q = 20.6, I_p = 3.4 (4 times the inventory in each queue, before rounding)
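The three configurations can be compared in one pass by combining the waiting time formula with Little's Law. The sketch below assumes CV_a = CV_p = 1, p = 120 seconds and a total demand of one customer every 35 seconds split evenly across the queues; names are illustrative.

```python
import math


def time_in_queue(a, p, m, cv_a=1.0, cv_p=1.0):
    """Waiting time formula for m parallel servers (valid for utilization < 100%)."""
    utilization = p / (a * m)
    utilization_factor = utilization ** (math.sqrt(2 * (m + 1)) - 1) / (1 - utilization)
    return (p / m) * utilization_factor * (cv_a ** 2 + cv_p ** 2) / 2


p = 120
results = {}
# (name, number of queues, servers per queue)
for name, queues, m in [("pooled", 1, 4), ("partially pooled", 2, 2), ("separate", 4, 1)]:
    a = 35 * queues                 # each queue sees 1/queues of the arrivals
    t_q = time_in_queue(a, p, m)
    total_i_q = queues * t_q / a    # Little's Law per queue, summed over all queues
    total_i_p = queues * p / a
    results[name] = (t_q, t_q + p, total_i_q, total_i_p)
    print(name, round(t_q), round(t_q + p), round(total_i_q, 1), round(total_i_p, 1))
# pooled 150 270 4.3 3.4
# partially pooled 336 456 9.6 3.4
# separate 720 840 20.6 3.4
```

The number of customers in service is the same in all three systems; only the number waiting changes.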

(35)

Why does pooling work?

[Figure: a pooled system and a separate queue system side by side; legend: a customer waiting, a customer in service]

§ Both systems currently have 6 customers.

§ In the pooled system, all servers are busy and only 2 customers are waiting.

§ In the separate queue system, 2 servers are idle and 4 customers are waiting.

§ The separate queue system is inefficient because there can be customers waiting and idle servers at the same time.

(36)

Some quick service restaurants pool drive through ordering

§ The person saying “Can I take your order?” may be hundreds (or even thousands) of miles away:

- Pooling the order taking process can improve time-in-queue while requiring less labor.

- It has been shown that queue length at the drive through influences demand – people don’t stop if the queue is long.

- However, this system incurs additional communication and software costs.

(37)

Limitations to pooling

§ Pooling may require workers to have a broader set of skills, which may require more training and higher wages:

- Imagine a call center that took orders for McDonald’s and Wendy’s … now the order takers need to be experts in two menus.

- Suppose cardiac surgeons need to be skilled at kidney transplants as well.

§ Pooling may disrupt the customer-server relationship:

- Patients like to see the same physician.

§ Pooling may increase the time-in-queue for one customer class while decreasing it for another:

- Removing priority security screening for first-class passengers may decrease the average time-in-queue for all passengers but will likely increase it for first-class passengers.

(38)

Why the long queues for the women’s restroom?

Women wait for a public restroom at Yankee Stadium in 2014

(39)

Potty-parity Laws

§ Women’s Restroom Equity Bill:

- Passed NY City Council May 2005.

- Requires women’s restrooms to have twice the flushing capacity of men’s restrooms (at least a 2:1 ratio of women’s toilets to men’s toilets & urinals)

- In the context of the Time in queue equation, this law requires m to increase for women, which decreases the capacity factor (p/m) and decreases utilization.

§ Other solutions:

- Unisex restrooms (pools capacity, at least the toilets) ⇒Not in Korea

- Flexible partitions that can change the size of each restroom.

Utilization u = p / (a x m), where p = Activity time

\[
\text{Time in queue} = \left(\frac{p}{m}\right) \times \left(\frac{u^{\sqrt{2(m+1)}-1}}{1-u}\right) \times \left(\frac{CV_a^2 + CV_p^2}{2}\right)
\]

(40)

Managerial Responses to Variability: Priority Rules in Waiting Time Systems

Service times: A: 9 minutes, B: 10 minutes, C: 4 minutes, D: 8 minutes

- Sequence A, B, C, D: B waits 9 min., C waits 19 min., D waits 23 min. Total wait time: 9+19+23 = 51 min

- Sequence C, D, A, B: D waits 4 min., A waits 12 min., B waits 21 min. Total wait time: 4+12+21 = 37 min

• Flow units are sequenced in the waiting area (triage step)

• Shortest Processing Time (SPT) Rule

- Minimizes average waiting time

- Problem of having “true” processing times (Can I ask a quick question?)

• First-Come, First-Served (FCFS)

- easy to implement

- perceived fairness

• Sequence based on importance

- emergency cases
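The effect of sequencing can be verified with a few lines of code. The sketch below assumes all four flow units are already waiting at time 0, that a single server handles them one after another, and that the first-come, first-served order is A, B, C, D; the function name is illustrative.

```python
def total_wait(sequence, service_times):
    """Sum of the waiting times when jobs are served one after another in `sequence`."""
    elapsed = 0
    total = 0
    for job in sequence:
        total += elapsed               # this job waited for everything served before it
        elapsed += service_times[job]
    return total


service_times = {"A": 9, "B": 10, "C": 4, "D": 8}  # minutes

fcfs_order = ["A", "B", "C", "D"]
spt_order = sorted(service_times, key=service_times.get)  # shortest processing time first

print(spt_order)                             # ['C', 'D', 'A', 'B']
print(total_wait(fcfs_order, service_times)) # 51
print(total_wait(spt_order, service_times))  # 37
```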

(41)

Reducing Variability

Variability is the enemy of all operations.

♣ Ways to reduce arrival variability

Appointment systems:

- Do not eliminate arrival variability

- If the doctor has the right to be late, why not the patient?

- What portion of the available capacity should be reserved in advance? (Ch. 18 Revenue Management)

- Do not reduce the variability of the true underlying demand

→ providing incentives for customers to avoid peak hours (Demand Management)

(42)

Reducing Variability

♣ Ways to reduce processing time variability

- Find a balance between operational efficiency (e.g., call duration) and the quality of service (perceived courtesy)

- Investment in training and technology (e.g., operators receive real-time instructions (scripting))

(43)

(44)

• Variability is the norm, not the exception

• Variability leads to waiting times although utilization < 100%

• Use the Waiting Time Formula to

- get a qualitative feeling of the system

- analyze specific recommendations / scenarios

• Managerial response to variability:

- understand where it comes from and eliminate what you can

- accommodate the rest by holding excess capacity

• Difference between variability and seasonality

- seasonality is addressed by staffing to (expected) demand

Managing Waiting Systems: Points to Remember

(45)

Summary

§ Even when a process is demand constrained (utilization is less than 100%), waiting time for service can be substantial due to variability in the arrival and/or service process.

§ Waiting times tend to increase dramatically as the utilization of a process approaches 100%.

§ Pooling multiple queues can reduce the time-in-queue with the same amount of labor (or use less labor to achieve the same time-in-queue).

(46)

(47)

§ (Remark) The utilization factor is nonlinear.

§ Utilization = 0.8 → 0.8/(1-0.8) = 4

§ Utilization = 0.9 → 0.9/(1-0.9) = 9

§ Utilization = 0.95 → 0.95/(1-0.95) = 19

§ Utilization = 0.99 → 0.99/(1-0.99) = 99