Introduction
Who is going to show for a flight? It's a question that the airline industry has explored since deregulation led to the practice of yield management. It has guided airlines and other industries to develop sophisticated models and forecasts to maximize revenues. And the industries have done relatively well based on historical data patterns used in the forecast models – as long as present and future patterns correspond relatively well to the historical patterns.
Some airlines in the recent past have tried to avoid the question altogether and make the question moot with the use of additional fare restrictions. Just as with the theatre, concerts, museum exhibits, and other ticketed events, these airlines have argued that if you purchase a ticket and don't use it, there are no refunds, exchanges, or reissues. Period.
But event tickets' characteristics differ from airline tickets in very relevant areas such as time commitment, transferability, and cost. With airline tickets, there are other issues, such as customer service, customer acceptance, and competitive actions to consider, including the importance of contracts of carriage and human behavior. The potential of additional revenue in selling an extra seat by knowing more accurately that it will be available is too tempting to resist. With the European Union's recent legislation for denied boarding compensation if a flight is oversold and a passenger is left behind, the risk increases also.
And no matter how good the forecasting models are today, there are ways to make them even better. A small improvement in the accuracy of the predictions can result in significant additional revenue, with much of it going to the bottom line. The answer to the forecast question is further influenced by certain external causal and behavioral factors that are difficult to model without the insight gained through access to more data at the detail level of customers and flights.
So, what's an airline to do? Passengers feel they have a right to choose whether or not they fly on a ticket they purchased. They have a right to 'no show,' that is, not to use the purchased or reserved product, which is the seat on the flight. But, particularly in today's economic environment, an airline must make money, and having a flight depart with empty seats that could have been sold is wasting money.
The major challenge is that an airline seat on a particular flight on a particular date and at a particular time is a perishable product; when a ticketed passenger doesn't show for the flight, the product spoils, its value upon the departure is zero, and the revenue opportunity is lost forever. History and experience have shown that not all customers will show for a flight for various reasons. It is at this point that overbooking (selling more seats than are physically available on a flight) comes to the rescue. Overbooking ensures that revenue is generated from each available seat with minimal risk to the airline.
Overbooking, or "the practice of selling seats above the physical capacity of an aircraft to compensate for events that will contribute to empty seats at departure", helps assure airline management that an airplane will depart with the maximum potential passenger load. Overbooking is a critical component of the practice of revenue management, which tries to attain the maximum potential flight revenue. This is done by taking into account in the forecast the probability of some customers not showing for a flight. Experience has also taught the airlines that there are many reasons that passengers do not show for their flight, causing that flight to depart with empty seats – even though that flight was sold out to above-capacity levels prior to departure.
Even though most airlines own the detailed data in their systems that would help them better understand the causal factors that contribute to no shows, that data is rarely exploited to its full advantage. Many airlines today are using Teradata's scalable, flexible analytical technology to address this question and add precision to their models and forecasts through access to integrated detail level data, consisting of Passenger Name Record (PNR) data, ticketing data, and flown data. They are finding surprising and serendipitous relationships between the customer behaviors documented in this set of data. They are exploring how specific customer actions drive certain customer behaviors, and are improving their show rate forecasts and overbooking models as a result of this analysis.
More about Overbooking
Overbooking needs to take into consideration the forecasted show rate of passengers. Generally, data that is currently being used to set the authorized levels on a flight includes the historical aggregated count of bookings and show rates at the detail level of specific flights in specific markets at specific times of day and days of the week.
For example, if four to six people miss the Tuesday evening departure from London to New York on 25 of the past 30 flight occurrences, then there is a high probability that, going forward, this trend will continue in the aggregate. But by exploring the details, an airline can understand the passenger composition and the probable passenger behavior on a flight such that it can better predict on which specific flights this situation will occur. The ability to predict the actual show rate for a particular flight, rather than in the aggregate, will enable an airline to confidently sell more seats while simultaneously reducing the risk of denied boardings. This level of precision leads to benefits that include not only reduced passenger compensation for denied boardings and increased customer satisfaction, but also improved revenue.
When only aggregated data and counts are used, the projected no-show estimate will be incorrect some of the time. Therefore, the value of taking an additional booking must be weighed against the possible monetary and customer satisfaction cost of the denied boardings that occur when the projection is incorrect and too many passengers show for a flight. Lacking accurate projections, the airline may be more conservative in its estimate than it needs to be. The airline generally accepts that there is a tradeoff between the revenue and the cost of overbooking a flight. This can be represented by the calculation:
Net Revenue = ƒ(prob. of additional revenue from oversale) - (prob. of denied boardings and its costs)
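To make the tradeoff concrete, the following is a minimal sketch in Python, not a description of any airline's actual model. It assumes that every booked passenger shows independently with the same probability, that revenue is earned only on boarded passengers, and that each denied boarding carries a fixed cost. The function names, fare, and cost figures are illustrative assumptions.

```python
from math import comb


def expected_net_revenue(capacity, authorized, show_prob, fare, denied_cost):
    """Expected net revenue when `authorized` bookings are accepted.

    Assumptions (illustrative only): each passenger shows independently
    with probability `show_prob`; revenue is `fare` per boarded passenger;
    every passenger beyond `capacity` costs `denied_cost`.
    """
    total = 0.0
    for shows in range(authorized + 1):
        # Binomial probability that exactly `shows` passengers turn up.
        prob = (comb(authorized, shows)
                * show_prob ** shows
                * (1 - show_prob) ** (authorized - shows))
        boarded = min(shows, capacity)
        denied = max(0, shows - capacity)
        total += prob * (fare * boarded - denied_cost * denied)
    return total


def best_authorization(capacity, show_prob, fare, denied_cost, max_extra=30):
    """Authorization level (capacity .. capacity + max_extra) with the
    highest expected net revenue under the assumptions above."""
    levels = range(capacity, capacity + max_extra + 1)
    return max(
        levels,
        key=lambda n: expected_net_revenue(capacity, n, show_prob, fare, denied_cost),
    )


if __name__ == "__main__":
    level = best_authorization(capacity=100, show_prob=0.9, fare=200.0, denied_cost=800.0)
    print("Suggested authorization level:", level)
```

Under these assumptions, a more accurate show probability for a specific flight changes the suggested authorization level directly, which is why precision in the show rate forecast matters.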
How big of a tradeoff is this? As mentioned earlier, many airlines are fairly successful in forecasting no-show rates using conventional data in the aggregate.
During a 2002 IATA conference in Orlando, Florida, Bill Brunger, vice president of distribution planning and revenue decision support at Continental Airlines, stated that the airline is able to achieve a 97% load factor when they "book full." Additionally, the probability of an oversale occurring runs about 0.16% with a less-than-0.009% probability of an involuntary denied boarding. Pretty good results. In fact, he went on to state that fewer than 1% of their 2,244 daily flights had "voluntary" denied boardings of eight or more passengers. However, even with these impressive statistics, Continental Airlines still felt the need to better explore their overbooking methodologies and investigate what information would improve their no show forecasts.
What they discovered is that with detail level PNR, ticket, and flown passenger data in their Teradata solution, the airline was able to add a level of precision to its projections that was not attainable through conventional methods. This led to a radical redesign of their overbooking approach. With the precision that was gained by accessing detail level data, they were able to build additional models that identified show rates of customer segments based on the characteristics found within the detailed PNR data.
These characteristics included whether a passenger was on the outbound or the return portion of the itinerary and whether they were a local or a flow passenger, as well as factors such as the time interval between booking and flight, the fare class, the fare rules, and the booking source. Through their access to historic, detailed information on the PNR, tickets, and flown data, Continental is heading down the path of deriving customer level show-rate scores and using this scoring data to more accurately predict the show rates on specific flights.
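As an illustration of segment-level show rates, the sketch below groups historical passenger records by a few of the characteristics named above and computes the share of each segment that showed. It is a simplified assumption of how such a calculation could look, not Continental's method; the record layout, the segment labels, and the sample data are hypothetical.

```python
from collections import defaultdict

# Hypothetical history: (direction, passenger_type, fare_type, showed)
HISTORY = [
    ("outbound", "local", "flexible", True),
    ("outbound", "local", "flexible", False),
    ("outbound", "local", "flexible", True),
    ("outbound", "local", "flexible", False),
    ("return", "flow", "restricted", True),
    ("return", "flow", "restricted", True),
    ("return", "flow", "restricted", True),
    ("return", "flow", "restricted", False),
]


def segment_show_rates(records):
    """Return {segment: show rate}, where a segment is the tuple
    (direction, passenger_type, fare_type)."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [shows, bookings]
    for direction, pax_type, fare_type, showed in records:
        segment = (direction, pax_type, fare_type)
        counts[segment][1] += 1
        if showed:
            counts[segment][0] += 1
    return {seg: shows / bookings for seg, (shows, bookings) in counts.items()}


if __name__ == "__main__":
    for segment, rate in segment_show_rates(HISTORY).items():
        print(segment, round(rate, 2))
```

In practice a segment with very few historical bookings would give an unreliable rate, so a real model would need some minimum sample size or smoothing.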
Having ready access to detailed PNR, tickets, and flown information was also very helpful to Continental Airlines during non-standard events, such as the aftermath of 11 September 2001. Using the detailed data available, they were able to instantly determine that the time-of-booking information was a good predictor of probable passenger behavior under these non-standard conditions. Consequently, they were able to develop new show-rate models and adjust the overbooking levels based on the combination of behaviors derived from the detailed data available within the PNR, tickets, and flown data sets. By the time the first flight departed, they could identify the flights that would have significant no show patterns and adjust the overbooking levels accordingly. Different adjustments were made on each flight for those bookings made prior to 9/11 (which showed a very high no-show propensity) and bookings made after 9/11 (which had a much lower no-show factor).
Customer Behavior and Show Rates
So, where should your airline begin if you don't have a five-year history of detailed passenger activity on your flights? What elements should you explore when looking at the characteristics of your passengers in determining show rates? The details that are found within the PNR, tickets, and flown data, or that can be derived from it, and that could be explored include:
Passenger direction
This may be one of the more logical elements to consider when determining the probability of a passenger showing. Is the passenger on an outbound flight or on the return portion of the journey? Logic tells us that passengers are more likely to show for a return flight, but this is probably not equally true in all markets. Moreover, direction should not be viewed in isolation, as there are other factors that may influence customer behavior beyond the direction of travel.
Fare rules
By their very nature, fare rules influence show rate patterns, since more flexible fares generally do not include a penalty for last minute changes, so passengers have less at risk if they miss a flight. Additional conditions should be reviewed to determine their effect on show rates, including: are the gate agents circumventing the more restrictive fare rules to provide improved customer service? If so, how is this behavior impacting the passenger counts that feed the revenue management models? And how should the underlying data be adjusted to represent the actual travel that should have occurred?
Type of travel
Does the booking represent a local passenger, a through passenger, or a connecting passenger? Does each passenger type by market have a distinguishing characteristic for showing for a flight that should be explored? Do local passengers show less consistently than through passengers?
Number of passengers in the booking
Most revenue management models are designed to forecast using booking counts in the aggregate and not the detailed booking characteristics. On a given day, you might book five fares in class X on flight Y. However, what difference would there be in the overbooking levels if the five fares were for a family or group in a single booking? What about five individual bookings? Or two separate bookings, one of two passengers and one of three passengers? How could the overbooking model be adjusted to take advantage of this data to better predict the show rate for a flight?
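One way to see why the number of passengers in a booking matters is sketched below. It assumes, purely for illustration, that all passengers in a single booking show or no-show together, and that each booking shows with the same probability. Under that assumption the expected number of shows is identical for the three cases above, but the chance that none of the five seats is used differs widely. The function names and the 0.9 show probability are hypothetical.

```python
def show_count_distribution(party_sizes, show_prob):
    """Return {number of passengers showing: probability}.

    Illustrative assumption: each booking (party) shows or no-shows as a
    unit, independently of other bookings, with probability `show_prob`.
    """
    dist = {0: 1.0}
    for size in party_sizes:
        updated = {}
        for shows, prob in dist.items():
            # The whole party shows ...
            updated[shows + size] = updated.get(shows + size, 0.0) + prob * show_prob
            # ... or the whole party does not.
            updated[shows] = updated.get(shows, 0.0) + prob * (1 - show_prob)
        dist = updated
    return dist


def expected_shows(dist):
    """Mean number of passengers showing for a distribution."""
    return sum(shows * prob for shows, prob in dist.items())


if __name__ == "__main__":
    cases = {
        "five individual bookings": [1, 1, 1, 1, 1],
        "bookings of two and three": [2, 3],
        "one booking of five": [5],
    }
    for label, parties in cases.items():
        dist = show_count_distribution(parties, 0.9)
        print(label, "| expected shows:", round(expected_shows(dist), 2),
              "| P(nobody shows):", round(dist[0], 5))
```

An aggregate count of five bookings hides this difference in spread, which is the information an overbooking model could use.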
Time of day
Certain external factors might contribute to a passenger not being present at the scheduled time of departure. These might include traffic issues, counter and security delays, or misconnecting inbound flights, for example. The passenger may be listed as a "no show" in the passenger counts that feed the revenue management models, but in actuality did "show" and took a later flight due to these external factors.
Other time elements
Time is a critical factor in the airline business. Because a seat on a flight can be sold nearly a year in advance, the element of time can influence a number of behaviors and outcomes. Therefore, some of the factors that should be investigated include the time between booking and ticketing or the time between ticketing and travel. The time of year and special events should also be considered when determining overbooking models and passenger show rates.
Reason for travel
The reason for travel has been shown to have a significant impact on the show rate. Being able to model and segment business travel, leisure travel, holiday travel, groups, and meetings/conventions travel on a flight can yield additional information about the show rate.
Reason for no show
The reason that a particular passenger did not show for a flight can affect the show rate probability, specifically whether the no show was voluntary or involuntary, as when it was caused by an airline disservice such as a delayed or cancelled flight. A voluntary no show should obviously count in determining probable future show rates; an involuntary one should not.
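The effect of separating voluntary from involuntary no shows can be sketched as follows. The reason codes and sample records are hypothetical, and the sketch makes one particular choice: involuntary no shows are dropped from the calculation altogether. Counting them as shows would be an equally plausible policy.

```python
# Hypothetical reason codes for no shows caused by the airline.
INVOLUNTARY_REASONS = {"misconnect", "flight_cancelled", "flight_delayed"}

# Hypothetical records: (showed, no_show_reason)
RECORDS = [(True, None)] * 8 + [(False, "voluntary"), (False, "misconnect")]


def show_rate(records, exclude_involuntary=False):
    """Share of bookings that showed.

    With `exclude_involuntary=True`, no shows whose reason is in
    INVOLUNTARY_REASONS are left out of both numerator and denominator.
    Returns None when no bookings remain.
    """
    shows = 0
    bookings = 0
    for showed, reason in records:
        if not showed and exclude_involuntary and reason in INVOLUNTARY_REASONS:
            continue  # airline-caused no show: do not count against the passenger
        bookings += 1
        if showed:
            shows += 1
    return shows / bookings if bookings else None


if __name__ == "__main__":
    print("Raw show rate:     ", show_rate(RECORDS))
    print("Adjusted show rate:", show_rate(RECORDS, exclude_involuntary=True))
```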
Geography
Do certain points of sale behave differently? Do travelers in Germany have different show rates than travelers in Italy? Is there a difference between travelers from the United States versus the travelers in Asia? If a person from Germany is traveling out of Chicago, should that influence the overbooking calculations?
Other miscellaneous elements
Special meals and other special service requests, elite tier status in the frequent flyer program, fraudulent bookings, the channel used to book the travel, historic travel patterns, and personal experience with the airline may all have an impact on the show rates of a passenger.
The more detailed data that is collected and made available for analysis, and the more data mining and propensity modeling that is performed, the higher the level of precision that can be achieved with overbooking, no show predictions, and denied boarding controls.
Looking at each of these characteristics in isolation can provide insight into customer behaviors on a flight, but the complexity increases when the airline tries to account for all potential factors available to predict show rates.
The models and analysis can get very complex very fast, particularly when you multiply the characteristics across the millions of passengers flown by airlines. The question quickly becomes one of using all the available data within an airline to better predict the show rate at the detail flight level rather than at the aggregate level. The challenges in building the show rate models are these: Which of these characteristics are the most relevant in predicting show rates, and under what circumstances? How many passengers with similar characteristics are traveling on the same flight? And what is the overall passenger composition on a particular flight?
It might be helpful to begin by combining a limited number of significant elements, such as a breakdown of the passenger distribution by certain simple and easily known characteristics. Information such as outbound/return, local/flow, and unrestricted/restricted fare can be used to build first-level models, which can then evolve based on which characteristics are the most meaningful predictors for that particular airline and the geographies that it serves.
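A first-level model of this kind might look like the sketch below: each booking on a flight is assigned the show rate of its segment, and the flight's expected number of shows is the sum over its bookings. The segment rates, the fallback rate for unseen segments, and the two sample flights are invented for illustration.

```python
# Hypothetical segment show rates keyed by
# (direction, passenger_type, fare_type).
SEGMENT_RATES = {
    ("outbound", "local", "unrestricted"): 0.80,
    ("return", "flow", "restricted"): 0.95,
}

# Assumed fallback for segments with no history.
DEFAULT_RATE = 0.88


def expected_flight_shows(bookings, segment_rates=SEGMENT_RATES, default_rate=DEFAULT_RATE):
    """Expected number of passengers showing, given the segment of each
    booked passenger on the flight."""
    return sum(segment_rates.get(segment, default_rate) for segment in bookings)


if __name__ == "__main__":
    outbound = ("outbound", "local", "unrestricted")
    inbound = ("return", "flow", "restricted")
    # Two flights with the same booking count but a different passenger mix.
    flight_a = [outbound, outbound, outbound, inbound]
    flight_b = [outbound, inbound, inbound, inbound]
    print("Flight A expected shows:", round(expected_flight_shows(flight_a), 2))
    print("Flight B expected shows:", round(expected_flight_shows(flight_b), 2))
```

Both flights carry four bookings, so an aggregate model would treat them alike; the composition-based estimate distinguishes them.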
Some possible examples of detail level PNR, tickets, and flown data captured and derived are presented in the table below.

Inputs into the Revenue Management Equation
Another benefit of access to detail level PNR, tickets, and flown data, alluded to above, is the ability to recreate the history of a flight to represent what would have occurred under normal operating conditions, thereby improving the accuracy of the inputs fed into the revenue management forecasting models.
Although at first glance the equation for no shows on a flight is simple enough (the number of passengers booked less the number of passengers who actually traveled), a more robust analysis will need to go beyond the aggregate booked and flown numbers. The critical analysis is to determine the details behind the latter half of the equation. An airline will need to decide how to handle events that affect this equation and whether each is a significant factor in the show rate propensity model. Therefore, an airline will need to define a customer no show more precisely than this initial calculation does.
For example, a passenger departs on an earlier flight (#1) on the airline rather than on the flight in the original itinerary (#2), and so in the aggregate is typically counted as a no show on the original flight (#2). Another passenger was booked on the earlier flight (#1) but did not show. The aggregate counts at the flight level therefore show that the earlier flight (#1) had zero no shows, but the later flight (#2) had one no show. This scenario is represented in Figure 2.

In actuality, it was flight #1 that had one no show. How should the passenger who traveled early be "counted" within the revenue management system? Did they no-show their flight, or did they actually show even though they didn't travel on their booked flight? If the aggregate counts remain as traveled, the recorded show rate on the earlier flight would be better than actual, whereas the recorded show rate on the later flight would be worse than actual. Because this data is fed into the RM forecasting models, the more accurate the data, the better the forecast results will be.
A second example of this data accuracy challenge involves connecting flights through a hub, where a delayed feeder flight causes a number of passengers to misconnect with their onward flights. If the reaccommodation entries are not made correctly in the system, the misconnecting passengers might be misrepresented as "no shows". By adjusting the histories of the passengers, the inputs to the revenue management forecast systems can accurately represent what would have occurred in the absence of an intervening factor. Thus, the output of the forecast models can be improved by improving the accuracy of the data inputs.
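The two-passenger scenario can be worked through in a few lines. The sketch below compares the aggregate count (booked minus flown, per flight) with a passenger-level count in which only a passenger who did not travel at all is a no show on the booked flight. Treating the passenger who traveled early as a "show" is one possible policy, assumed here for illustration; the record layout is hypothetical.

```python
from collections import Counter

# Hypothetical passenger-level records for the scenario described above.
PASSENGERS = [
    {"name": "A", "booked": 2, "flown": 1},     # booked on #2, traveled on #1
    {"name": "B", "booked": 1, "flown": None},  # booked on #1, did not travel
]


def aggregate_no_shows(passengers):
    """No shows per flight as booked count minus flown count."""
    booked = Counter(p["booked"] for p in passengers)
    flown = Counter(p["flown"] for p in passengers if p["flown"] is not None)
    return {flight: booked[flight] - flown[flight] for flight in booked}


def passenger_level_no_shows(passengers):
    """No shows per flight, counting only passengers who did not travel
    on any flight, attributed to the flight they were booked on."""
    flights = {p["booked"] for p in passengers}
    missed = Counter(p["booked"] for p in passengers if p["flown"] is None)
    return {flight: missed[flight] for flight in flights}


if __name__ == "__main__":
    print("Aggregate view:      ", aggregate_no_shows(PASSENGERS))
    print("Passenger-level view:", passenger_level_no_shows(PASSENGERS))
```

The aggregate view assigns the no show to flight #2, while the passenger-level view assigns it to flight #1, which is what actually happened.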
Airlines have traditionally been a low margin business. Oftentimes, an additional one or two seats purchased at the last minute for full fare may be the difference between making money on a flight or losing money on a flight. And on the majority of flights, the standard models using data in the aggregate work relatively well. So, when does having access to the detailed data become important? When there is a need to have much better accuracy in the show rate predictions on a small number of significant flights. On these flights, having the detailed data and the more robust models for show rate predictions will have a significant revenue impact on an airline. Averages give you average results, but detail data give you optimal results. An additional $100 per flight for 1,000 flights per day amounts to roughly $36.5 million over the course of a year.
For More Information
To see how Teradata can create unique competitive advantage for your business, contact us today to arrange a live demonstration. For more information about Teradata, contact your Teradata representative.