Journal of Advanced Computing (2012) 1
Transcription
Journal of Advanced Computing (2012) 1
Columbia International Publishing Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 doi:10.7726/jhtm.2015.1001 Research Article Genetic Algorithm–Based Outpatient Appointment Scheduling Systems Song Chew1* and Clariecia Groves1 Received 31 December 2014; Published online 25 April 2015 © The author(s) 2015. Published with open access at www.uscip.us Abstract Outpatient appointment scheduling for a clinic is a problem of assigning appointment times to patients that seek non-emergency medical attention. This paper employs genetic algorithms to find optimal or near optimal numbers of advanced appointment and walk-in patients to assign to each time block for a day. To this end, an outpatient appointment schedule is represented by a vector in which a component indicates the number of patients to be assigned to the corresponding time block. The goal is to maximize profit for the clinic subject to such factors as the number of time blocks for a day, no-show rates, mean service times, unit revenues, patient wait time unit cost, staff idle time unit costs, and staff overtime unit costs. Our results reveal that all factors are influential in maximizing the profit except the staff overtime unit costs. Keywords: Healthcare; Outpatient Appointment; Scheduling; Genetic Algorithms; Search Heuristics; Simulation; Optimization; Operations Research 1. Introduction Healthcare currently consumes 17% of the U.S. Gross Domestic Product and is expected to reach 20% within the coming decade (National Health Care Expenditures Projections: 2009-2019). Thus, there is an urgent need to provide more efficient and effective patient care. The need to produce more of a higher quality product or service at a reduced cost can only be met through better utilization of resources (Hays and McLaughlin, 2008). One aspect of reducing cost and improving quality of service is through outpatient appointment scheduling. Clinical operations are driven by the patient schedule, which determines the arrival time of patients to the clinic. Patient scheduling affects all aspects of clinical operation and is of primary interest in any effort to improve clinic operations (Muthuraman and Lawley, 2008). ______________________________________________________________________________________________________________________________ *Corresponding e-mail: schew@siue.edu 1 Department of Mathematics and Statistics, College of Arts and Sciences, Southern Illinois University Edwardsville, Edwardsville, IL 62026, USA 1 Song Chew and Clariecia Groves / Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 Effective scheduling systems have the goal of matching demand with capacity so that resources are optimally utilized and patient waiting times are minimized (Cayirli and Veral, 2003). Generally, the process of outpatient appointment scheduling is as follows: A patient calls for an appointment. Next, the appointment scheduler checks for availability of the physician and/or staff and searches for open appointment slots, which is communicated to the patient. The patient then decides which appointment time they want and confirms the time with the scheduler before the call ends (Muthuraman and Lawley, 2008). A major problem in outpatient appointment scheduling is no-shows; that is, patients not showing up for scheduled appointments. No-show rate may be considered as the probability that a patient may not show up for a scheduled appointment. No-show rates can be as high as 40% in some clinics (Lee et al., 2005). No-show patients bring about great uncertainty to clinical operations and limit accessibility to other patients with reserved appointment slots that go unused. Many factors have been cited as reasons of patient no-show, including patient demographics, medical conditions, and environment (Deyo and Inui, 1980). Ho and Lau’s (1992) assessment of three environmental factors (no-show probability, variability of service times, and number of patients per day) reveals that, among the three, no-show probability is the major one that affects performance (Cayirli and Veral, 2003). Another major problem in outpatient appointment scheduling is walk-ins, patients seeking medical care but not having pre-scheduled appointments. In the U.S., many clinics are the patient’s general practitioner, and are responsible for the patient’s total care, whether elective or emergent. Therefore, walk-ins must be anticipated and planned for in the administration of clinical operations. Similar to no-shows, walk-in probabilities are observed to vary across specialties (Fetter and Thompson, 1966; Field, 1980; Shonick and Klein, 1977). This paper incorporates no-show and walk-in probabilities in the scheduling process. Specifically, we present genetic algorithm–based outpatient appointment scheduling systems that can be used to minimize patient waiting times, doctor and staff idle times, and doctor and staff overtime while maximizing profit. 2. Statement of Problem Outpatient appointment scheduling for a clinic is a problem of assigning appointment times to patients that seek non-emergency medical attention (Chew, 2011). The goal is to assign appointment times so that patient waiting times, doctor and staff idle time, and doctor and staff overtime are minimized while the quality of service and revenue for the clinic is maximized. Generally, patients that seek non-emergency medical attention can be broken down into two categories: (1) Advanced appointment patients, and (2) walk-in patients. Advanced appointment patients are patients that have pre-scheduled appointments; walk-in patients are patients that do not have pre-scheduled appointments. We denote the numbers of allowed advanced appointment (AA) patients and of allowed walk-in (WI) patients in a given day, respectively, by M and N. 2 Song Chew and Clariecia Groves / Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 An outpatient clinic typically runs on a block schedule for each day (Chew, 2011). A day of operations may last, say, 8 hours. Figure 1 illustrates a generic block schedule. For a given day, the schedule may be divided into b blocks that are of an equal time interval. We call this time interval an interappointment time, denoted by a i , i = 1, 2,…, b. Each block Bi (i = 1, 2,…, b) contains numbers of scheduled AA and of scheduled WI patients, denoted respectively by mi and ni (i = 1, 2,…, b). Therefore, we have that M b b i 1 i 1 mi and N ni . We assumed that patients arrive on time for their appointment. The arrival time of a patient, ti (i = 1, 2,…, b), is the scheduled appointment time for the patient. Each patient is provided service for a random length of time. This random length of time is what we refer to as service time, which we assume to be exponential. The end time of a day is denoted by tb 1 . B2 B1 Bb−1 Bb m1, n1 m2, n2 m3, n3 mb−1, nb−1 mb, nb t1 t2 t3 tb−1 tb a1 a2 ab1 tb+1 ab Fig. 1. An outpatient appointment schedule with b blocks. Ideally, doctors would see each scheduled patient in block Bi during the inter-appointment time a i , without overlapping into the next block Bi 1 . However, many times doctors and the staff experience idle time and/or overtime and patients experience waiting time. Idle time occurs when the service time of patients in a block ( Bi ) ends before the scheduled arrival of patients (at time ti+1) in the next block ( Bi 1 ). Overtime occurs when patients are still being serviced after the end (at time tb+1) of the last block ( Bb ). Patient waiting time occurs when the patient has to wait until the preceding patient finishes. Doctor and staff idle time, doctor and staff overtime, and patient waiting times are considered when creating an outpatient schedule. The best and most efficient way of dealing with these issues is to assign a unit cost to each of these factors, and then maximize the profit (Chew, 2011). We will now refer to doctor and staff idle time, doctor and staff overtime, and patient waiting time as cost factors. The calculation for profit will be discussed in the next section. In addition to considering cost factors, there are other factors that will need to be considered in order to analyze the profit. Amongst them are the following: Unit revenue (will be discussed in the next section), no-show rates, service times, and the number of blocks. These factors will be explored and analyzed in Sections 7 and 8. 3 Song Chew and Clariecia Groves / Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 Our goal is to determine the optimal numbers of AA and of WI patients to assign to each block Bi for a day so as to maximize the profit for the clinic. Furthermore, we want to determine which factors have the greatest effect on the profit. The remainder of the paper is organized as follows. Section 3 discusses revenues, costs, and profits for the clinic. Section 4 talks briefly about genetic algorithms, while Section 5 expounds the way we leverage genetic algorithms to solve our problems. Section 6 deals with confidence intervals for profits. Section 7 presents numerical examples, and Section 8 examines the effects of factors influencing outpatient appointment scheduling. Finally, we offer some concluding remarks in Section 9. 3. Revenue, Expected Total Cost, and Profit Since our main goal is to maximize the profit, we need to discuss how profit is calculated. The general formula for profit is Profit = Revenue – Cost. The total cost C is a random variable and is defined as follows: C = cwW + cdD + cvV, (1) where W, D, and V are random variables that represent total patient waiting time, total doctor and staff idle time and total doctor and staff overtime, respectively; and cw, cd and cv are the respective unit costs. The values for cw, cd and cv are predetermined and are set by the clinic. We will discuss how to calculate W, D, and V shortly. By taking the expectation of total cost in Equation (1), we derive the expected total cost E(C) = cwE(W) + cdE(D) + cvE(V), (2) where E(W), E(D) and E(V) are expected total patient waiting time, expected total doctor and staff idle time and expected total doctor and staff overtime, respectively. Let us discuss a few important details that will help us calculate E(W), E(D) and E(V). Let j denote a patient and assume that each patient arrives promptly at their appointment time, tj. Now let bj, sj, and ej, respectively, be the time at which service starts, the length of service time, and the time at which service ends for patient j. Let us assume that the first patient’s appointment time and service start time to be 0; that is, t1 = b1 = 0. Each patient j after the first (j > 1) will have a service start time that is the maximum of their appointment time and the service end time of the previous patient. Therefore, we have bj = max(tj, ej–1) and ej = bj + sj (Ho and Lau, 1992). We can find the waiting time of patient j, wj, by considering the patient’s service start time bj minus the patient’s appointment time tj; hence, wj = bj – tj, where w1 = 0 (Ho and Lau, 1992). The doctor and staff are considered idle while waiting for the next patient to arrive after servicing a patient. Thus, the doctor and staff idle time (denoted by dj) before servicing patient j is d j max( 0, t j ej1 ) where d1 = 0 (Ho and Lau, 1992). 4 Song Chew and Clariecia Groves / Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 Let T be the total number of patients scheduled for a day. In our case, T = M + N (the total number of AA and WI patients). Further let TA = MA + NA be the total number of patients actually showing up at the clinic for the day, where MA and NA are the numbers of AA and of WI patients actually showing up respectively. Then the total patient waiting time and the total doctor and staff idle time are TA TA j1 j1 W wj , and D d j , respectively. We now define the doctor and staff overtime; it is the amount of time the doctor and staff spend providing service after the end of a day; so, the overtime is V = max(0, eT – tb 1 ) (Ho and Lau, 1992). Note that eT is the end time of the last patient (patient T), and that tb 1 is the end time of a day. As mentioned earlier, service times are assumed to be random. So W, D, and V are also random. We use E(W), E(D), and E(V) to denote the expectation of the total patient waiting time, the total doctor and staff idle time, and the total doctor and staff overtime, respectively. We will use the averages of h observations (or replications) of W, D, and V to approximate the respective expectations. Thus, the expected total cost of (2) is estimated by the following average total cost C = cw W + cd D + cv V , h (3) h h Wk , D 1 Dk , and V 1 Vk with Wk, Dk, and Vk being the kth observation h k 1 h k 1 h k 1 where W 1 of W, D, and V, respectively. This implies that E (C ) C , E ( D) D , E (W ) W , and E (V ) V . There will be unit revenue associated with the two types of patients, AA and WI. These associated amounts will be denoted by r1 and r2 for AA and WI, respectively. Thus, we have that the profit P is a random variable that can be defined as follows: P r1MA r2 NA C , (4) where r1MA + r2NA is the total revenue and C is the total cost as defined by Equation (1). By taking the expectation of the profit P, we obtain E( P) r1MA r2 NA E(C ) . (5) We will estimate the expected profit by the sample mean profit, P , which will be discussed in Section 6. Since our goal is to maximize profit, we need to determine the optimal numbers of AA and of WI patients scheduled for each block Bi , denoted by mi* and ni* respectively, giving rise to the optimal total numbers of AA and of WI patients, M * b b mi* and N * ni* respectively, and hence the i 1 i 1 optimal total number of patients scheduled for a day, T M N . Then, let M A* and N A* be the actual numbers of AA and of WI patients, respectively, that show up in the clinic. Thus, the optimal (maximum) profit and its expectation are respectively as follows: 5 Song Chew and Clariecia Groves / Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 P r1M A* r2 N A* C , E(P ) r1 M A* r2 NA* E(C) . (6) 4. Genetic Algorithms We employ genetic algorithms to find optimal or near optimal solutions. Genetic algorithms are a search heuristic that mimics the process of evolution; a detailed introduction may be found in the paper by Holland (1975). Most briefly, the search begins with an initial population of parent chromosomes (often represented by vectors) randomly chosen or pre-determined. Then crossover and mutation occur to create offspring from the parent chromosomes. There are several ways to perform crossover. For example, during single-point crossover, one crossover point is randomly chosen so that offspring is formed by copying the first part of a parent up to the crossover point and last part of a second parent from the crossover point to the end. To illustrate, suppose that there are two sequences of numbers 1, 2, 3, 4 and 5, 6, 7, 8 that are the parent chromosomes and that point three (between the third and fourth components) is randomly chosen as the crossover point. Then the two new offspring may be 1, 2, 3, 8 and 5, 6, 7, 4. Another way of carrying out crossover is what is called two-point crossover. During two-point crossover, two crossover points are randomly selected and offspring is formed from copying the first part to the first crossover point and from the second crossover point to the end of a parent and from the first crossover point to the second crossover point in a second parent. As an example, beginning with 1, 2, 3, 4 and 5, 6, 7, 8 as our parent chromosomes; suppose that points one and two were randomly chosen. Then two offspring may be 1, 6, 3, 4 and 5, 2, 7, 8. Yet another way is uniform crossover. During uniform crossover, components are randomly copied from a first parent chromosome or a second parent. For example, beginning with 1, 2, 3, 4 and 5, 6, 7, 8, two offspring may be 1, 6, 3, 8 and 5, 2, 7, 4 such that the random crossover components are the second and fourth. On the other hand, mutation occurs when one part of an offspring chromosome is changed. Consider, for example, the two offspring derived in the discussion of two-point crossover, 1, 6, 3, 4 and 5, 2, 7, 8, one of them may mutate. Suppose that the third point of 1, 6, 3, 4 is mutated by adding 1 or subtracting 1, which is equally likely to occur. As a result, the mutated offspring may become 1, 6, 4, 4 if 1 is added, or 1, 6, 2, 4 if 1 is subtracted. After crossover and mutation, a new population of chromosomes may be selected from the original parent and new offspring chromosomes according to certain rules. Genetic algorithms will then repeat the above crossover and mutation operations with the new population of (parent) chromosomes until certain criteria are met. The stopping criterion for our genetic algorithms is that it ends when the specified number of generations has been reached. Genetic algorithms as a methodology have been widely employed to find optimal solutions to optimization problem in many fields. For example, Werner (2013) provides a comprehensive survey of genetic algorithms used for job shop scheduling problems. Genetic algorithms are also commonly leveraged to solve production and transportation scheduling problems; for instance, see the work by Delavar et al. (2010) for details. In the meantime, computer sciences and engineering applications see a heavy use of genetic algorithms as well; Sharma et al. (2013) conduct a survey on computer software testing using genetic algorithms, while Akachukwu et al. (2014) carry out a survey of applications of genetic algorithms in engineering fields. Although they have been widely 6 Song Chew and Clariecia Groves / Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 applied elsewhere, genetic algorithms have seen little application in the field of outpatient appointment scheduling. Our work employing genetic algorithms seeks to fill the gap. The next section will explain in further details how our genetic algorithms work. 5. Genetic Algorithm-Based Outpatient Appointment Scheduling Genetic algorithms are used to find optimal or near optimal solutions to varied types of problems. In our case, we want to find the optimal or near optimal numbers of AA and of WI patients to assign to each block in a day. Next, we explain the step-by-step procedure of the genetic algorithm used to solve our problems. Step-by-Step procedure of the genetic algorithm Step 1: The following parameters are predetermined: a. number of blocks b. interappointment time c. number of types of patients (AA, WI, etc.) d. no-show rate for each type of patient e. mean service time for each type of patient, which is assumed to be exponential f. unit revenue for each type of patient g. patient wait time unit cost for each type of patient h. doctor and staff idle time unit cost for each type of patient i. doctor and staff overtime unit cost for each type of patient j. mutation rate (u %) Step 2: An initial population of 10 (parent) schedules are randomly generated. For example, a schedule with 8 blocks and 2 types (AA and WI) of patients is generated as follows m1, n1; m2, n2; m3, n3; m4, n4; m5, n5; m6, n6; m7, n7; m8, n8 (7) where mi and ni are uniform random integers from 0 to 20. Step 3: 5 pairs of (parent) schedules are randomly chosen from the population. Then single-point crossover occurs so that 10 offspring schedules are generated. The crossover point is randomly chosen; so for the example in Step 2, the crossover point can be any point between any pair of consecutive components in (7). Step 4: Of the 10 offspring schedules, u % (we choose 70%) will be randomly chosen to be mutated. The mutation will be performed on a randomly chosen component in each chosen schedule so that 1 will be added to or subtracted from that point, which is equally likely to occur. After the mutation takes place on the u % of the offspring, we will then have a total of 20 schedules (10 parents and 10 offspring). Step 5: For each of the 20 schedules, an arrangement of AA patients followed by WI patients in each block is setup (the idea is that AA patients have higher priority over WI patients). Next, the respective no-show rate is used to determine whether each patient in each block shows up or not. If a patient does not show up, the patient will be removed from the schedule. 7 Song Chew and Clariecia Groves / Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 Step 6: For each of the 20 schedules, a random service time for each patient is generated and the profit is accordingly computed. This process is repeated multiple times; for example, we choose 3,000 times. Finally, the average of the multiple profits is computed and is designated as the average profit for that schedule. In addition, the half-width of the 95% confidence interval for the average profit is also computed for that schedule. This will help us determine the statistical significance of the differences between average profits. Step 7: The 20 schedules are ranked from the highest average profit to lowest average profit. Step 8: The best few schedules will be chosen out of the 20 schedules and other schedules will be randomly chosen from the remaining schedules to form a new population for the next generation. As an example, the best schedule may be chosen and 9 others may be chosen randomly from the remaining 19 schedules to form a new population of 10 schedules. Step 9: Repeat Steps 3 through 8 with the new population until a pre-specified number of generations (we choose 5,000) is reached. 6. Confidence Intervals for Profits We discussed how profit is calculated for each schedule in Section 3. Now we discuss how confidence intervals for profits are calculated. Before we begin, let us state the following Central Limit Theorem: Central Limit Theorem If X1,…, Xn is a random sample from a distribution with mean and variance 2 , then the n limiting distribution of Z n X i 1 i n is the standard normal; that is, Z n Z ~ N (0,1) as d n n . Zn can also be written in relation to the sample mean as follows: Z n X / n . Assume that the profit of a schedule is distributed with a mean and variance 2 . Then by the Central Limit Theorem, the 95% confidence interval (CI) from the mean profit is calculated as follows: P 1.96 h , (8) where P is the sample mean profit as described by Equation (12), is the standard deviation of the profit P (Equation 4) and h (sample size) is a sufficiently large number of replications used to obtain the sample mean profit. Note that we approximate using s, the sample standard deviation calculated by Equation (14). Therefore, Equation (8) becomes 8 Song Chew and Clariecia Groves / Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 s . h P 1.96 (9) A single observation of the profit of a schedule is as follows: Pk r1M A* r2 N A* Ck , (10) where M A* , N A* , r1 , and r2 are as defined in Section 3; Ck is a single observation of the total cost calculated below by Equation (11): Ck cwWk cd Dk cvVk , (11) where cw , cd , cv are as defined in Section 3, Wk , Dk and Vk are patient wait time, doctor and staff idle time, and doctor and staff overtime respectively; and k = 1, 2,…, h. The sample mean profit, P , is then calculated as follows: P1 h h Pk . (12) k 1 The sample standard deviation, s, is calculated as follows: h s2 (P P ) k 1 2 k , h 1 (13) so by taking the square root of both sides, we have that h s (P P ) k 1 2 k h 1 . (14) We use the sample standard deviation s instead of the actual standard deviation because we do not know the true standard deviation of the profits of a schedule. We use the sample variance s2 to 2 estimate the variance and use the sample mean profit P to estimate the mean . 7. Numerical Examples and Discussion In this section, we discuss several examples that illustrate how the genetic algorithm can be used. We explore and analyze the effects of multiple factors on the profit. Particularly, we look at the effects of using various no-show rates, service times, and the numbers of blocks. For all of the examples shown in this section, a day is 240 minutes (or 4 hours) in length. Also, staff will be understood to include doctor(s) and the medical staff for the remainder of this paper. For each 9 Song Chew and Clariecia Groves / Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 scenario, T1 = Patient Type 1 and T2 = Patient Type 2. In our case, T1 = AA patients and T2 = WI patients. All simulations are programmed using MATLAB. Example 1: No-show Rates Since research has shown that no-show rates are a major problem for outpatient appointment scheduling, one could analyze the impact that high no-show rates have on profit and how to schedule patients when certain types of patients have a high probability of not showing up. As mentioned in Section 2, walk-in patients do not have a pre-scheduled appointment. So for WI patients, the no-show rate represents the probability of not having a patient walking in for service. It does not represent the probability of missing an appointment. Description Low Medium High Number of Blocks 8 8 8 Interappointment Time (minutes) 30 30 30 T1 No-show Rate (%) 10 30 70 T2 No-show Rate (%) 20 40 80 T1 Mean Service Time (minutes) 10 10 10 T2 Mean Service Time (minutes) 10 10 10 T1 Unit Revenue 300 300 300 T2 Unit Revenue 300 300 300 Wait Time Unit Cost 10 10 10 Staff Idle Time Unit Cost 5 5 5 Staff Overtime Unit Cost 5 5 5 Number of T1 Patients Scheduled 15 16 37 Number of T2 Patients Scheduled 3 7 18 Total Number of Scheduled Patients 18 23 55 2,793.8518 2,475.8363 2,007.2927 38.0925 40.8016 49.0000 Profit Half-Width of 95% CI Fig. 2. Scenarios with Low, Medium, and High no-show rates. Consider that no-show rates range from 0% to 100%. Let us divide these probabilities into 3 groups: Low (between 0% and 30%), Medium (between 30% and 60%), and High (between 60% and 100%). We will assign all patient types the same mean service time, unit revenue, and unit 10 Song Chew and Clariecia Groves / Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 costs in order to analyze the single effect of no-show rates on the profit. Three scenarios with Low, Medium, and High no-show rates are as indicated in Figure 2 above. Note that for each scenario, the half-width of the 95% confidence interval (CI) is calculated for the computed profit. The optimal schedule for each scenario is as follows in Figure 3. Scenario Optimal Schedule Low 1, 2; 2, 0; 2, 0; 2, 0; 2, 0; 2, 0; 2, 0; 2, 1 Medium 3, 0; 1, 2; 1, 2; 2, 0; 2, 0; 2, 1; 3, 0; 2, 2 High 5, 3; 5, 0; 5, 0; 4, 3; 6, 0; 5, 1; 3, 3; 4, 8 Fig. 3. Optimal schedules for scenarios with Low, Medium, and High no-show rates. A graph of the relationship between the profit (vertical axis) and the no-show rates (horizontal axis) is shown below in Figure 4 for the 3 scenarios: Effect of No-show Rates on Profit 3000 2500 2,794 2,476 2,007 Profit 2000 Low 1500 Medium High 1000 500 0 Fig. 4. Effect of no-show rates on the profit. As can be seen in the graph in Figure 4, it appears that the profit will be lower when patients have high no-show rates. We note that our goal is to find the optimal total numbers of patients and the corresponding optimal distribution of these patients in each block so as to maximize the profit for the day. For example, to maximize the profit in the Low scenario, Figures 2, 3, and 4 reveal that the optimal total number of patients should be scheduled is 18 (15 T1 patients and 3 T2 patients), that the 11 Song Chew and Clariecia Groves / Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 corresponding optimal distribution of these patients is 1, 2; 2, 0; 2, 0; 2, 0; 2, 0; 2, 0; 2, 0; 2, 1, and that the optimal profit for the day is 2794. We would also like to note that if we schedule a total of 17 or 19 (simply a number other than 18 for that matter) patients, then it is possible to find the optimal distribution of the patients and the associated optimal profit for the pre-specified number. However, this associated profit would be less than 2794. Observe that the optimal total number of patients scheduled will be higher when no-show rates are high. To see this, in Figure 2, notice that the optimal total number of patients scheduled increases across scenarios (from Low to High) as the no-show rates increase. This happens because more patients need to be scheduled in order to account for the patients that do not show up. Scheduling more patients (or overbooking) allows the clinic to earn a higher profit compared to not scheduling more patients when no-show rates are high. We would like to point out that scheduling more patients to account for the possible loss of profit does not necessarily make up the entire loss. We conjecture that high no-show rates decrease average profit because the probability of having an optimal schedule for a day will be smaller. Example 2: Service Times Description Low Medium High Number of Blocks 8 8 8 Interappointment Time (minutes) 30 30 30 T1 No-show Rate (%) 10 10 10 T2 No-show Rate (%) 10 10 10 T1 Mean Service Time (minutes) 10 30 70 T2 Mean Service Time (minutes) 20 40 80 T1 Unit Revenue 200 200 200 T2 Unit Revenue 200 200 200 Wait Time Unit Cost 10 10 10 Staff Idle Time Unit Cost 10 10 10 Staff Overtime Unit Cost 5 5 5 Number of T1 Patients Scheduled 15 6 1 Number of T2 Patients Scheduled 1 0 1 Total Number of Scheduled Patients 16 6 2 Profit 748.4459 ‒615.4773 ‒1,106.3145 Half-Width of 95% CI 30.2537 31.2649 23.5858 Fig. 5. Scenarios with Low, Medium, and High service times. 12 Song Chew and Clariecia Groves / Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 Service times may have significant effects on scheduling and profit. We will illustrate through a few examples some of the possible effects of various mean service times. Consider that service times are exponentially distributed (chosen for convenience; any type of distribution may be used). Let us divide the range of all possible values of the mean service time into 3 groups: Low (between 0 and 30 minutes), Medium (between 30 and 60 minutes), and High (greater than 60 minutes). We will assign all patient types the same no-show rate, unit revenue, and unit costs in order to analyze the effect of service times on the profit. Three scenarios with Low, Medium, and High mean service times are as indicated in Figure 5. Observe that the optimal total number of patients scheduled will be smaller when the mean service times are high. For example, notice, in Figure 5, that the optimal total number of patients scheduled decreases across scenarios (from Low to High) as the mean service times increase. This occurs because longer service times allow less time to service more patients. The optimal schedule for each scenario is provided in Figure 6. Scenario Optimal Schedule Low 2, 0; 2, 0; 2, 0; 2, 0; 2, 0; 2, 0; 2, 0; 1, 1 Medium 1, 0; 1, 0; 0, 0; 1, 0; 1, 0; 0, 0; 1, 0; 1, 0 High 1, 0; 0, 0; 0, 0; 0, 1; 0, 0; 0, 0; 0, 0; 0, 0 Fig. 6. Optimal schedules for scenarios with Low, Medium, and High service times. A graph of the relationship between profit (vertical axis) and the mean service times (horizontal axis) is shown in Figure 7 for the 3 scenarios. Fig. 7. Effect of mean service times on profit. 13 Song Chew and Clariecia Groves / Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 Similar to the results in Figure 4 for various no-show rates, in Figure 7, it appears that the profit will be lower when patients have longer service times. Example 3: Number of Blocks If we want to determine the optimal number of blocks given a set of parameters (Step 1: b to j in Section 5), we may run our genetic algorithm using a different number of blocks for each run while keeping all the other parameters the same across each run. The optimal number of blocks would be the number of blocks associated with the highest profit. We have found the optimal number of blocks for three scenarios. For each scenario, we run the algorithm for each of the following numbers of blocks: 4, 6, 8, 12, 16, and 24. Assuming a 4-hour day, note that the corresponding interappointment times are 60, 40, 30, 20, 15, and 10 minutes, respectively. Scenarios 1–3 are as indicated in Figure 8 where T1 may be AA patients and T2 may be WI patients. Description Scenario 1 Scenario 2 Scenario 3 T1 No-show Rate (%) 10 20 10 T2 No-show Rate (%) 20 10 20 T1 Mean Service Time (minutes) 10 20 50 T2 Mean Service Time (minutes) 20 10 40 T1 Unit Revenue 150 300 500 T2 Unit Revenue 300 100 150 Wait Time Unit Cost 10 5 10 Staff Idle Time Unit Cost 5 20 10 Staff Overtime Unit Cost 5 10 5 Fig. 8. Parameters for 3 scenarios. A graph of the relationship between profit (vertical axis) and the number of blocks (horizontal axis) is shown in Figures 9, 10, and 11 below for the 3 scenarios. From the graph in Figure 9, the optimal number of blocks appears to be 24 for Scenario 1. Figure 10 depicts that the optimal number of blocks seems to be 16 for Scenario 2, while Figure 11 reveals that the optimal number of blocks should be 16 for Scenario 3. 14 Song Chew and Clariecia Groves / Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 Fig. 9. Effect of numbers of blocks on profit for Scenario 1. Fig. 10. Effect of numbers of blocks on profit for Scenario 2. 15 Song Chew and Clariecia Groves / Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 Fig. 11. Effect of numbers of blocks on profit for Scenario 3. Overall, the graph of the profit generated for 4, 6, 8, 12, 16, and 24 blocks appears to be unimodal. Thus, we conjecture that the profit function is unimodal in the number of blocks. Having this unimodal shape is very helpful in determining the optimal number of blocks for a given set of parameters. Note that the total number of patients to schedule for a day changes over different numbers of blocks. For example, Scenario 1 has the following results, as shown in Figure 12, for the total number of patients for 4, 6, 8, 12, 16, and 24 blocks. Thus, for Scenario 1, a clinic would need to schedule 15 patients if they employ 24 blocks compared to 8 patients for 4 blocks in order to maximize their profit. Number of Blocks 4 6 8 12 16 24 Total Number of Patients 8 10 10 11 12 15 Fig. 12. Total number of patients for a day for Scenario 1. Perhaps there are other factors that affect the profit. To this end, we will conduct regression analysis and use the results of model selection to analyze the effects of multiple factors (independent variables) on the profit (dependent variable). 8. Model Selection Model selection, also known as subset selection or variables selection procedures, have been developed to identify a small group of regression models that are “good” according to a specified 16 Song Chew and Clariecia Groves / Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 criterion (Kutner et al., 2004). The results of model selection will indicate which factors are most important or useful in predicting profit. Thus, by using model selection, we will determine which of the following factors (or independent variables) have the greatest effect on the profit: 1. 2. 3. 4. 5. 6. 7. number of blocks no-show rate mean service time unit revenue wait time unit cost staff idle time unit cost staff overtime unit cost While there have been many criteria used for comparing regression models, we will use the following procedures to select the best models: Adjusted R2, Mallow’s Cp (where p – 1 is the number of variables and p includes the intercept term), forward selection, and backward elimination. For the forward selection and backward elimination procedures, we will use Akaike’s information criterion (AICp). A brief description of these procedures is as follows: Adjusted R2: This procedure is similar to using R2, the coefficient of multiple determination, except that it takes the number of variables in the regression model into account through the degrees of freedom (Kutner et al., 2004). The best model will have the largest value. Here is the formula: n 1 SSE p Ra2, p 1 , where SSEp is the error sum of squares and SSTO is the total sum of n p SSTO squares. Mallow’s Cp: This procedure uses the total mean squared error of the fitted values for each subset regression model to rank the models (Kutner et al., 2004). The best model will have the smallest value. Here is the formula: C p SSE p MSE ( X 1 ,..., X p 1 ) (n 2 p) , where MSE is the error mean square. AICp: This procedure uses the natural log of the error of sum of squares to rank the models (Kutner et al., 2004). The best model will have the smallest value. Here is the formula: AIC p n ln SSE p n ln n 2 p . Forward Selection: This procedure begins with no independent variables. At each step, it adds the variable that (after being added) will give the lowest value of the AICp statistic. The procedure will stop at the step where adding a variable would increase the AICp statistic. Thus, the procedure will not add the variable that would increase the AICp statistic and the best model will be the resulting model after the procedure ends. Backward Elimination: This procedure begins with all independent variables included. At each step, it removes the variable that (after being removed) will give the lowest value of the AICp statistic. The procedure will stop at the step where removing a variable would increase the AIC p 17 Song Chew and Clariecia Groves / Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 statistic. Thus, the procedure will not remove the variable that would increase the AIC p statistic and the best model will be the resulting model after the procedure ends. Here are the results of model selection. Using Adjusted R2, the best model included all factors except staff overtime unit cost. The value of Adjusted R2 for the best model was 0.8664862. Using Mallow’s Cp, the best model included all factors except staff overtime unit cost. The value of Mallow’s Cp was 6.867845. Using forward selection (with AICp), the best model included all factors except staff overtime unit cost. The value of AICp was 1622.06. Using backward elimination (with AICp), the best model included all factors except staff overtime unit cost. The value of AICp was 1622.06. Since all procedures returned the same subset of independent variables, we conclude that the factors that have the greatest effect on the profit function are the following: number of blocks, noshow rate, mean service time, unit revenue, wait time unit cost, and staff idle time unit cost. To further validate our conclusion, we should check for multicollinearity. Multicollinearity is a statistical phenomenon in which two or more independent variables in a multiple regression model are highly correlated. When this occurs, some of the independent variables may be redundant in the model because of their correlation with other variables. To check for multicollinearity, we computed the correlation between pairs of independent variables and found that there was no correlation between all pairs of independent variables. Thus, this provides strong evidence that there is no issue of multicollinearity and we may conclude that all factors except staff overtime unit cost can be used to analyze the profit function. 9. Conclusion In this project, the objective was to find the optimal number of patients to schedule for each block and for each type of patient (typically advanced appointment and walk-in patients) along with the optimal profit given a set of parameters. We used genetic algorithms in order to find an optimal or near optimal solution to this scheduling problem. Through multiple experiments of running simulations along with using model selection, we were able to draw conclusions about the effects of number of blocks, no-show rates, mean service times, unit revenue, and unit costs on the profit. In addition, we were able to employ genetic algorithms to find the optimal or near optimal number of patients (of each patient type) to schedule for each block. References Akachukwu, C., Aibinu, A., Nwohu, M., & Salau, H. (2014). A decade survey of engineering applications of Akachukwu, C., Aibinu, A., Nwohu, M., & Salau, H. (2014). A decade survey of engineering applications of genetic algorithm in power system optimization. In Fifth international conference on intelligent systems, modelling and simulation (pp. 38-42). Cayirli, T., & Veral, E. (2003). Outpatient scheduling in health care: A review of literature. Production and Operations Management, 2(4), 519-549. http://dx.doi.org/10.1111/j.1937-5956.2003.tb00218.x Chew, S. (2011). Outpatient appointment scheduling with variable interappointment times. Journal of Modelling and Simulation in Engineering, 2011, 1-9. 18 Song Chew and Clariecia Groves / Journal of Healthcare Technology and Management (2015) Vol. 3 No. 1 pp. 1-19 http://dx.doi.org/10.1155/2011/909463 Delavar, M., Hajiaghaei-Keshteli, M., & Molla-Alizadeh-Zavardehi, S. (2010). Genetic algorithms for coordinated scheduling of production and air transportation. Expert Systems with Applications, 37, 82558266. http://dx.doi.org/10.1016/j.eswa.2010.05.060 Deyo, A., & Inui, S. (1980). Dropouts and broken appointments. Medical Care, 18(11), 1146-1157. http://dx.doi.org/10.1097/00005650-198011000-00006 Fetter, R., & Thompson, J. (1966). Patients' waiting time and doctors' idle time in the outpatient setting. Health Services Research, 1, 66-90. Field, J. (1980). Problems of urgent consultations within and appointment system. Journal of the Royal College General Practitioners, 30, 173-177. Hays, J., & McLaughlin, D. (2008). Healthcare operations management. In Foundation of the American College of Healthcare Executives. Ho, C., & Lau, H. (1992). Minimizing total cost in scheduling outpatient appointments. Management Science, 38(12), 1750-1764. http://dx.doi.org/10.1287/mnsc.38.12.1750 Holland, J. (1975). Adaptation in natural and artificial Systems. Univ. Michigan, Ann Arbor, MI. Kutner, M., Nachtsheim, C., & Neter, J. (2004). Applied Linear Regression Models. Fifth Edition, McGrawHill/Irwin. National Health Care Expenditures Projections: 2009-2019. Centers for Medicare and Office of the Actuary Medicaid Services, USA. Lee, J., Earnest, A., Chen, M., & Krishnan, B. (2005). Predictors of failed attendances in a multi-specialty outpatient centre using electronic databases. BMC Health Services Research, 5(51), doi:10.1186/14726963-5-51. http://dx.doi.org/10.1186/1472-6963-5-51 Muthuraman, K., & Lawley, M. (2008). A stochastic overbooking model for outpatient clinical scheduling with no-shows. IIE Transactions, 40, 820-837. http://dx.doi.org/10.1080/07408170802165823 Sharma, C., Sabharwal, S., & Sibal, R. (2013). A survey on software testing techniques using genetic algorithm. International Journal of Computer Science Issues, 10, 381-393. Shonick, W., & Klein, B. (1977). An approach to reducing the adverse effects of broken appointments in primary care systems: Development of a decision rule based on estimated conditional probabilities. Medical Care, 15(5), 419-429. http://dx.doi.org/10.1097/00005650-197705000-00008 Werner, F. (2013). A Survey of Genetic Algorithms for Shop Scheduling Problems. Heuristics: Theory and Applications, Nova Science Publishers, 161-222. 19