Sunday, August 9, 2015

Big-Data Analytics in the hotel industry


 The hotel industry is another industry where effective use of analytics can dramatically change how business is run. It is another data-rich industry that captures huge volumes of data of different types, including video, audio, and Web data. However, for most hoteliers data remains an underused and underappreciated asset. Hoteliers capture loyalty information, for example, but few go beyond loyalty tier in how they consistently view and take action with their guests. With analytical exploitation of their data, hoteliers can go beyond their traditional loyalty programs and deepen their knowledge of guests in order to develop a more granular understanding of segment behavior, needs, and expectations; identify profitable customer segments and their buying preferences; and identify opportunities to attract new guests. But all of that starts with having a clear customer-driven vision before embarking on integrating and standardizing guest data from multiple channels, systems and properties into a unified, accurate view of all interactions.
The next phase is using analytics to segment guests according to booking trends, behavior and other factors in order to reveal their likelihood to respond to promotions and emerging travel trends. It is vitally important for hoteliers to be able to understand guest preferences (locations, activities, and room types), purchase behavior (frequency, length of stay, time of year) and profit potential in order to increase the brand loyalty and wallet share of their most valuable guests.
To maximize profits, hotels need to increase the loyalty and wallet share of their most valuable guests by marketing to their preferences and encouraging repeat visits. Focusing on the wrong guests reduces profitability across the enterprise. For example, if a hotel targeted guests who would likely take advantage of spa services, golf and restaurants, rather than guests who only generate room nights, it could significantly increase revenues and profitability. Unfortunately, money often gets spent on blanket campaigns that don’t target individual guests or segments with offers they’re most likely to respond to. As a result, guests may feel that the hotel doesn’t care about them, or simply doesn’t offer services designed to meet their needs. It becomes easy for those guests to switch to a competing hotel.
 
 For analytics to truly be a game changer, hospitality organizations need to recognize the difference between reactive and proactive decision making. Using your data to create reports, drill-downs or alerts helps you keep a finger on the pulse of your business, but these things only show you what happened. They will not tell you why a problem is happening or what effect it will have in the future. Predictive analytics, like forecasting and optimization, can help you figure out why things are happening, show you what will happen next, or even lead you to the best alternative action considering all of your operating constraints. Hoteliers are using more and more predictive analytics to move from reactive to proactive decision making, which enables them to stay one step ahead of trends, set strategy and achieve goals. They gain advantage over the competition, increase value to shareholders, and continue to surprise and delight their guests. The following are areas where analytics can play an essential role:
 
Customer Segmentation
For the hotel industry, a more useful approach might be to identify the unique cluster groups and then conduct a separate value segmentation exercise for each cluster. For example, for a given hotel we might identify four basic clusters or distinct customer groups, such as tennis groups, ski groups, a pampered group (e.g. guests who use spa and valet-type services) and a nighthawk group (fine dining and theatre goers). Initial learning from this type of segmentation could be used in developing a data-driven marketing strategy.

The hotel could also examine its current business customer base and once again establish unique groups of business customers. For example, we know that there are groups of business customers that simply use the hotel for overnight stays, while others are there for longer-term events held at the hotel. It may be possible to further segment these groups based on industry sector. We would certainly expect that local events held by the oil and gas industry might be more prevalent in one city, while financial services events may be more prevalent somewhere else. Of course, all this supposition on what might define unique business segments would need to be determined quantitatively through clustering routines. By using data and mathematics rather than intuitive judgment to define key customer segments, we can develop unique programs that are appropriately geared to different groups of business travelers.

Customer experience has always been the overriding customer philosophy within the travel industry, long before the advent of data analytics. Yet, with data analytics, the travel industry can now use information to make better decisions regarding its customers. This enhanced decision-making capability enables hotels to be more proactive with their customers. Traditionally, success in the hotel industry has been determined by superior customer service actions that address the immediate requested needs of the customer. The competitive advantage in today’s hotel industry is driven more by the ability to anticipate and proactively meet the needs of customers; an ability that can only be exercised through data analytics.
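To make the clustering step concrete, here is a minimal sketch of a cluster-based guest segmentation in Python. The guests.csv file and the behavioral column names are invented for illustration; they are not from any real hotel data set.

```python
# Minimal guest segmentation sketch; file and column names are hypothetical.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

guests = pd.read_csv("guests.csv")
features = ["room_nights", "spa_spend", "golf_spend", "dining_spend", "theatre_bookings"]

# Standardize so that no single feature dominates the distance calculation.
X = StandardScaler().fit_transform(guests[features])

# Four clusters, mirroring the tennis / ski / pampered / nighthawk illustration above.
kmeans = KMeans(n_clusters=4, random_state=42, n_init=10)
guests["segment"] = kmeans.fit_predict(X)

# Profile each segment by its average behavior to give it a business-friendly name.
print(guests.groupby("segment")[features].mean().round(1))
```

The per-segment averages printed at the end are what an analyst would inspect before attaching labels such as “pampered” or “nighthawk” to the clusters.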
Customer Profiling
Customer profiling is accomplished through in-depth analysis of guest demographics and lifestyle characteristics. Attributes such as income level, family status, age, and sports and cultural interests, if known, can be appended to guest records to enrich the models. Customer profiling can be used to create an e-mail listserv for targeted marketing to current as well as prospective clientele. Prospect profiles can be especially useful in identifying those people most likely to respond to marketing and/or promotional offers. Profiling can also be important in determining which market segments are most productive and profitable.
Site Selection
Data mining can also be essential to determining sound criteria for restaurant site selection, given an index derived from an analysis of high-volume, successful units. Items such as demographics (customer profile), psychographics (buying patterns) and related customer descriptors are used to delineate highly probable factors for site modeling. As a result, evaluation data and analytical profiling enable companies to better identify candidate sites.
Forecasting
Customer transactional data (segmented by menu item and day part) can be useful in the development of a forecasting model that produces accurate, meaningful expectations. Regardless of whether a restaurant company relies on moving-average or time-series forecasting algorithms, data mining can improve the statistical reliability of forecast modeling. Estimating in advance how much of each menu item will need to be prepared, and when, is critical to efficient food production management. Data mining can provide a projection of product usage by day part given available sales data. In addition, knowing how much product was sold during any meal period can also be helpful in supporting an effective inventory replenishment system that minimizes the amount of capital tied up in stored products.
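As a small illustration of the moving-average variant, here is a sketch of forecasting menu-item demand by day part. The item_sales.csv file and its columns are assumptions made for the example.

```python
# Moving-average demand forecast sketch; file and column names are hypothetical.
import pandas as pd

sales = pd.read_csv("item_sales.csv", parse_dates=["date"])

# Daily quantity sold per menu item and day part (e.g. lunch vs. dinner).
daily = (sales.groupby(["menu_item", "day_part", "date"])["qty_sold"]
              .sum()
              .reset_index()
              .sort_values("date"))

# 28-day moving average as a naive forecast of the next day's demand.
daily["forecast"] = (daily.groupby(["menu_item", "day_part"])["qty_sold"]
                          .transform(lambda s: s.rolling(window=28, min_periods=7).mean()))

# The last row per item/day part is the figure used for tomorrow's prep plan.
print(daily.groupby(["menu_item", "day_part"]).tail(1)[["menu_item", "day_part", "forecast"]])
```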
Customer Relationship Management
An effective CRM program can be a direct outcome of data mining applications. The ability to enhance CRM through rapid access to more comprehensive management information should lead to satisfied clientele and improved sales performance. The ability to anticipate and affect consumer behavior (influencing menu item sales and other promotions) can provide the restaurant with a competitive advantage. Having a signature item, for example, can drive improved relations while providing a product that customers do not perceive as having an equivalent elsewhere.
Menu Engineering
An analysis of menu item sales and contribution margins can be helpful to continuous, successful restaurant operations. While menu engineering deals with menu content decisions, data mining can produce reports to indicate menu item selections, by customer segment, as a basis for operational refinement. For example, Applebee’s has been described as employing data mining expressly for the purpose of determining ingredient replenishment quantities based on a menu optimization quadrant analysis that summarizes menu item sales. Through such analysis the company then decides which menu items to promote.
Productivity Indexing
By correlating order entry time (POS time stamped) with settlement time, data mining is able to provide a reliable estimate of elapsed production and service times. This data provides insight into average service time relative to customer turnover as well as waiting line statistics. While productivity data is difficult to ascertain, this analysis provides factual data to assist management in fine tuning operations (heart of the house and dining room staff).

 Customer Associations and Sequencing 
 Data mining can uncover affinities between otherwise isolated events. For example, a guest purchasing the restaurant’s house specialty is also likely to purchase a small antipasto salad and a glass of Chardonnay. Paired relationships provide a basis for bundling menu items into a cohesive meal that simplifies ordering while ensuring customer satisfaction. Menu design can also be adjusted to feature such combinations as unique opportunities for customers. Data associations are often credited as a means of influencing customers to spend more than anticipated, or upselling.
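A minimal sketch of uncovering such paired relationships from POS data follows. The checks.csv file, with one row per item sold and a check_id column, is an assumption made for the example.

```python
# Menu-item affinity sketch; file and column names are hypothetical.
from itertools import combinations
from collections import Counter
import pandas as pd

checks = pd.read_csv("checks.csv")

pair_counts = Counter()
for _, items in checks.groupby("check_id")["menu_item"]:
    # Count each unordered pair of distinct items appearing on the same check.
    for pair in combinations(sorted(set(items)), 2):
        pair_counts[pair] += 1

n_checks = checks["check_id"].nunique()
# Support = share of all checks containing the pair; list the strongest affinities.
top_pairs = sorted(pair_counts.items(), key=lambda kv: kv[1], reverse=True)[:10]
for (a, b), count in top_pairs:
    print(f"{a} + {b}: support {count / n_checks:.1%}")
```

The highest-support pairs are the natural candidates for the bundled combinations described above.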
Forecasting
As mentioned previously, forecasting is one of the strengths of data mining and enables restaurants to better plan to exceed the needs of their clientele. Forecasting enables more efficient staffing, purchasing, preparation and menu planning.
Customer Value
Within the travel industry, customers have always considered their time at a hotel as an experience rather than just a visit. Activities such as fine dining, nightly entertainment, spas and corporate seminars/meetings nurture this notion of ‘customer experience’. This range of activities will have varying levels of appeal among a given clientele, and the role of data mining and analytics can be quite significant in helping us better understand these varying client needs. Our first task might be to conduct a basic customer value exercise in order to ultimately identify our best customers.

As with many analytical exercises, the concept of seasonality needs to be considered here. Seasonality is a very significant factor within the hotel industry, and most analysts would agree that it can have a significant impact on travel behavior. For example, one traveler may spend $1,000 annually as a casual traveler throughout the year and is considered an “average customer”. Another traveler spends $1,000 annually, but on a tennis package during a one-week period. Both customers spend the same amount but are in fact very different types of customers. This notion of seasonality is significant when conducting any analytics exercise, particularly if we consider that many hotels will offer tennis and golf packages in the summer and ski packages in the winter.

In addition to the issue of seasonality, various services may have more appeal to certain groups of customers. Fine dining and theatre may appeal to one group of customers, while spas and perhaps valet-type services appeal to another. With varying interests among travel clientele, a cluster-type segmentation exercise is a very useful way to identify different groups of customers. Experts in the travel industry would certainly agree that there are distinct or homogeneous customer segments, and using the data being captured on travel customers we can apply some science to identify truly distinct segments. How do we integrate the notion of customer value within the cluster segmentation approach? Typically we might conduct a value segmentation exercise on the entire customer base and then overlay the cluster segments to see how they align with customer value.
Personalized Marketing and Website Optimization
By tracking and processing your customers’ behavior and actions, you can provide them with personalized offers that are more effective and give a personal touch. Let’s say, for example, that you have a client who visits your hotel restaurant frequently due to business. When you are planning your next promotional campaign, make it targeted and personal. Send an email to this client saying, “We know you have enjoyed our great restaurant in the past, so when you visit next week, here’s a coupon for a free appetizer and drink.” There are various marketing automation tools out there that facilitate this process and allow you to deploy an effective and personalized cross-channel marketing strategy. Another area where you can use data to boost business is optimizing your website or landing pages through A/B testing. Are you implementing a marketing campaign, but the conversion rates of your landing pages are not as anticipated? An easy solution is to resort to A/B testing. A/B testing is the act of running a simultaneous experiment between two or more pages to see which performs or converts best.
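As a small illustration, here is a sketch of evaluating an A/B test with a two-proportion z-test. The visitor and conversion counts are made-up numbers, and the simple z-test is just one reasonable way to read such an experiment.

```python
# A/B test evaluation sketch with a two-proportion z-test; counts are invented.
from math import sqrt
from scipy.stats import norm

visitors_a, conversions_a = 5000, 240   # landing page variant A
visitors_b, conversions_b = 5000, 290   # landing page variant B

p_a, p_b = conversions_a / visitors_a, conversions_b / visitors_b
p_pool = (conversions_a + conversions_b) / (visitors_a + visitors_b)

# Standard error of the difference in conversion rates under the null hypothesis.
se = sqrt(p_pool * (1 - p_pool) * (1 / visitors_a + 1 / visitors_b))
z = (p_b - p_a) / se
p_value = 2 * norm.sf(abs(z))   # two-sided test

print(f"A: {p_a:.2%}  B: {p_b:.2%}  z = {z:.2f}  p-value = {p_value:.4f}")
```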


Energy Consumption 

 In the hotel industry, analytics can also be used for internal operations. Energy consumption accounts for 60 to 70% of the utility costs of a typical hotel. However, these costs can be controlled, without sacrificing guest comfort, by using energy more efficiently. Today, smart data can help managers build energy profiles for their hotels. Modern software solutions gather data from multiple sources, including weather data, electricity rates and a building’s energy consumption, to build a comprehensive ‘building energy profile’. Through a cloud-based, predictive analytics algorithm, the software can fine-tune whether power comes from the grid or an onsite battery module.
 Investment Management 
 Another way to use analytics in the hotel industry is for financial performance and investment. When managers want to make capital investments, such as refurbishing the lobby or the rooms or renovating their restaurant, they can consider implementing a “randomized testing” strategy. How does this work? Basically, the hotel chain would refurbish the lobby and rooms in only two or three “test” hotels and then monitor whether there has been a difference in bookings and customer satisfaction. The data obtained from the test hotels can then be compared to the data from the other hotels that were not refurbished. Thus, managers can make a data-driven decision and clearly see whether it is profitable to make the investment throughout the whole chain.

In conclusion, data analytics can be a powerful force in transforming the hotel industry, from taking evidence-based actions to developing customer-centered marketing and pricing strategies, increasing the ROI of capital investments and generally empowering hoteliers to make bigger and better decisions. There are also some great examples of hotel chains moving in the right direction with respect to the use of analytics. This can result in improved customer satisfaction and in personalized marketing campaigns and offers, so that the right guests book the right room at the right moment and at the right rate. In addition, it can boost employee productivity and lead to more efficient operations.

The advantages of using analytics and data mining in the hotel industry are enormous. Deep customer insights can lead to improved guest satisfaction and an unforgettable experience. Making these insights available to all levels and departments within the hotel is crucial. It allows the concierge to know which local tours to recommend that fit your preferences. It allows the restaurant departments to predict which menu items are likely to be ordered, based for example on the local weather. It allows the reservations department to predict the optimal rate for a room, and sales and marketing to create tailored messages across different (social) networks and send truly personalized email campaigns. Let’s dive a bit deeper into some of the possibilities:
 The right room at the right rate
 Yield management is nothing new in the hotel industry. Providing different rates to different customers has been done for ages, and with success. Big Data offers hotels the possibility to take revenue management a giant leap forward and start offering truly personalized prices and rooms to guests. The massive growth in booking websites, hotel review websites such as TripAdvisor and Yelp, and the ever-growing list of social media networks offers a lot of potential. Combined with hotels’ own CRM systems and/or loyalty programs, there is a lot of data that can be used to optimize revenue management. According to some industry studies, the hotel chain Marriott has been using Big Data analytics to predict the optimal price of its rooms to fill its hotels. They do this by using improved revenue management algorithms that can deal with data a lot faster, by combining different data sets and by making these insights available at all levels to improve decision-making. The American hotel chain Denihan goes a step further. They used analytics software to maximize profit and revenue across thousands of their rooms by combining their own data sets with data from, for example, review sites, blogs and social networks. They understand the likes and dislikes of their guests, optimize their offering and adjust the room rates accordingly.
 Mobile Big Data throughout the Hotel
More and more hotels have developed mobile apps that guests can use to book a hotel room. These apps, however, offer far more possibilities for guests if developed correctly. An app can serve as the key to your hotel room; it can be used to make reservations in restaurants and spas and, for example, to order room service. If hotels start using the vast possibilities of mobile applications, they can generate massive amounts of data that can be analyzed. So, from a guest perspective, mobile offers a lot of convenience. From an employee perspective, it can make life a lot easier for the staff while at the same time increasing customer satisfaction. Providing the housekeeping department with smart devices, for example, will allow them to know in real time that you prefer an extra pillow or an extra light. Kempinski and Hyatt in Dubai already use such applications in their hotels. Most of the staff within hotels do not have an office or a computer, so providing them with real-time guest information should be done on the go. Although this requires a different approach and a different way of presenting the insights, placing user-friendly analytics in the hands of guest-facing employees will definitely improve customer satisfaction.

More efficient hotel operations

 From a hotel operations point of view, big data also offers many different solutions. Big data can be used to reduce your energy bill, for example. By combining data from 50 different sources, including electricity rates, weather data and a building’s energy consumption, two InterContinental hotels in San Francisco managed to reduce their energy costs by 10-15%. They created detailed energy profiles for their buildings and, using a predictive algorithm, they decided whether to use an onsite battery module or receive power from the grid.

Hotels should also use analytics to run their IT operations more efficiently, which is especially relevant for chains that operate their own booking engine. A server that breaks down or a booking engine that is inaccessible could result in lost bookings and therefore lost revenue. IT operations analytics monitors a hotel’s complete IT environment, including the relationships between applications and hardware, and can predict when things are about to go wrong. Advanced IT operations analytics can even solve problems automatically before they occur. This could save a lot of money, because IT that is not working results in a bad customer experience. Of course, the examples given here are just a few of the many possibilities that analytics has to offer the hotel industry.

Data mining technology can be a useful tool for hotel corporations that want to understand and predict guest behavior. Based on information derived from data mining, hotels can make well-informed marketing decisions, including who should be contacted, to whom to offer incentives (or not), and what type of relationship to establish. Data mining is currently used by a number of industries, including hotels, restaurants, auto manufacturers, movie-rental chains, and coffee purveyors. Firms adopt data mining to understand the data captured by scanner terminals, customer-survey responses, reservation records, and property-management transactions. This information can be melded into a single data set that is mined for nuggets of information by data mining experts who are familiar with the hotel industry. However, data mining is no guarantee of marketing success. Hotels must first ensure that existing data are managed, and that requires investments in hardware and software systems, data mining programs, communications equipment, and skilled personnel. Affiliated properties must also understand that data mining can increase business and profits for the entire company and should not be viewed as a threat to any one location. Since data mining is in its initial stages in the hotel industry, early adopters may be able to secure a faster return on investment than property managers who lag in their decisions. Hotel corporations must also share data among properties and divisions to gain a richer and broader knowledge of the current customer base. Management must ensure that hotel employees use the data-management system to interact with customers, even though it is more time consuming than a transactional approach.

Big-data analytics for lenders and creditors


 Credit today is granted by various organizations such as banks, building societies, retailers, mail order companies, utilities and various others. Because of growing demand, stronger competition and advances in computer technology, over the last 30 years traditional methods of making credit decisions that rely mostly on human judgment have been replaced by methods that rely mostly on statistical models. Such statistical models today are not only used for deciding whether or not to accept an applicant (application scoring), but also to predict the likely default of customers that have already been accepted (behavioral scoring) and to predict the likely amount of debt that the lender can expect to recover (collection scoring).  The term credit scoring can be defined on several conceptual levels. Most fundamentally, credit scoring means applying a statistical model to assign a risk score to a credit application or to an existing credit account. On a higher level, credit scoring also means the process of developing such a statistical model from historical data. On yet a higher level, the term also refers to monitoring the accuracy of one or many such statistical models and monitoring the effect that score based decisions have on key business performance indicators.

Credit scoring is performed because it provides a number of important business benefits, all of them based on the ability to quickly and efficiently obtain fact-based and accurate predictions of the credit risk of individual applicants or customers. For example, in application scoring, credit scores are used for optimizing the approval rate for credit applications. Application scores enable the organization to choose an optimal cut-off score for acceptance, such that market share can be gained while retaining maximum profitability. The approval process and the marketing of credit products can be streamlined based on credit scores: high-risk applications can, for example, be given to more experienced staff, or pre-approved credit products can be offered to selected low-risk customers via various channels, including direct marketing and the Web.

Credit scores, both of prospects and existing customers, are essential in the customization of credit products. They are used for determining custom credit limits, down payments or deposits and interest rates. Behavioral credit scores of existing customers are used in the early detection of high risk accounts and enable the organization to perform targeted interventions, for example by pro-actively offering debt restructuring. Behavioral credit scores also form the basis for more accurate calculations of the total consumer credit risk exposure, which can result in a reduction of bad debt provision.

Other benefits of credit scoring include an improved targeting of audits at high-risk accounts, thereby optimizing the workload of the auditing staff. Resources spent on debt collection can be optimized by targeting collection activities at accounts with a high collection score. Collection scores are also used for determining the accurate value of a debt book before it is sold to a collection agency.  Finally, credit scores serve to assess the quality of portfolios intended for acquisition and to compare the quality of business from different channels, regions and suppliers.

Building credit models in-house




While under certain circumstances it is appropriate to buy ‘ready-made’ generic credit models from outside vendors or to have credit models developed by outside consultants for a specific purpose, maintaining a practice for building credit models in-house offers several advantages. Most directly, it enables the lending organization to profit from economies of scale when many models need to be built and to afford a greater number of segment specific models for a greater variety of purposes.

Building up a solid, re-usable and flexible data, knowledge and skill base of its own also makes it easier for the organization to stay consistent in the interpretation of model results and reports and to use a consistent modeling methodology across the whole range of customer related scores. This results in a reduced turnaround time for the integration of new models, thereby freeing resources to more swiftly respond to new business questions with new creative models and strategies.

Finally, in-house modeling competency is needed to verify the accuracy and analyze the strengths and weaknesses of acquired credit models, to reduce access of outsiders to strategic information and to retain competitive advantage by building up company specific best practices.

 


Larger credit scoring process


Modeling is the process of creating a scoring rule from a set of examples. In order for modeling to be effective, it has to be integrated into a larger process. Let’s look at application scoring. On the input side, before the modeling, the set of example applications has to be prepared. On the output side, after the modeling, the scoring rule has to be executed on a set of new applications, so that credit granting decisions can be made.

The collection of performance data sits at both the beginning and the end of the credit scoring process. Before a set of example applications can be prepared, performance data has to be collected so that applications can be tagged as ‘good’ or ‘bad’. After new applications have been scored and decided upon, the performance of the accepted applications again has to be tracked and reports created, so that the scoring rule can be validated and possibly replaced, the acceptance policy fine-tuned and the current risk exposure calculated.

 


Choosing the right model


With available analytical technologies it is possible to create a variety of model types, such as scorecards, decision trees or neural networks. When you evaluate which model type is best suited for achieving your goals, you may want to consider criteria such as the ease of applying the model, the ease of understanding it and the ease of justifying it. At the same time, for each particular model of whatever type, it is important to assess its predictive performance, i.e. the accuracy of the scores that the model assigns to the applications and the consequences of the accept/reject decisions that it suggests. A variety of business-relevant quality measures, such as concentration, strategy and profit curves, are used for this (see the Model Assessment section in the case study below). The best model will therefore be determined both by the purpose for which the model will be used and by the structure of the data set that it is validated on.

 -----------------------------------------------------------------------------------------------------------

Scorecards


The traditional form of a credit scoring model is a scorecard. This is a table that contains a number of questions that an applicant is asked (called characteristics) and, for each such question, a list of possible answers (called attributes). One such characteristic may, for example, be the age of the applicant, and the attributes for this characteristic are then a number of age ranges that an applicant can fall into. For each answer, the applicant receives a certain amount of points – more if the attribute indicates low risk, fewer if it indicates high risk. If the application’s total score exceeds a specified cut-off amount of points, it is recommended for acceptance. The scorecard model, apart from being a long-established method in the industry, still has several advantages when compared with more recent ‘data mining’ types of models, such as decision trees or neural networks. A scorecard is easy to apply: if needed, the scorecard can be evaluated on a sheet of paper in the presence of the applicant. It is easy to understand: the amount of points for one answer doesn’t depend on any of the other answers, and across the range of possible answers for one question the amount of points usually increases in a simple way (often monotonically or even linearly). It is therefore often also easy to justify to the applicant a decision that is made on the basis of a scorecard. It is possible to disclose groups of characteristics where the applicant has potential for improving the score, and to do so in broad enough terms not to risk manipulated future applications.
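As an illustration of how simple a scorecard is to apply, here is a minimal points-lookup sketch. The characteristics, attribute ranges, point values and cut-off are invented for the example, not taken from a real scorecard.

```python
# Scorecard application sketch; all points and the cut-off are illustrative.
SCORECARD = {
    "age": [(18, 25, 12), (25, 40, 25), (40, 60, 35), (60, 120, 30)],   # (low, high, points)
    "time_on_job_months": [(0, 12, 10), (12, 36, 22), (36, 600, 35)],
    "residential_status": {"owner": 40, "renter": 20, "other": 10},
}
CUT_OFF = 90

def score_application(app):
    """Sum the points of the attribute each answer falls into; accept above the cut-off."""
    total = 0
    for characteristic, attributes in SCORECARD.items():
        value = app[characteristic]
        if isinstance(attributes, dict):                     # nominal characteristic
            total += attributes[value]
        else:                                                # interval characteristic
            for low, high, points in attributes:
                if low <= value < high:
                    total += points
                    break
    return total, ("accept" if total >= CUT_OFF else "reject")

print(score_application({"age": 34, "time_on_job_months": 20, "residential_status": "owner"}))
# -> (87, 'reject') with these illustrative point values
```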

 

Scorecard development process



Development sample


The development sample (input data set) is a balanced sample consisting of 1,500 good and 1,500 bad accepted applicants. ‘Bad’ has been defined as having been 90 days past due once. Everyone not ‘bad’ is ‘good’, so there are no ‘indeterminates’. A separate data set contains the data on rejects. The modeling process, especially the validation charts, requires information about the actual good/bad proportion in the accept population. Sampling weights are used here to simulate that proportion: a weight of 30 is assigned to each good application and a weight of 1 to each bad one. Thereafter all nodes in the process flow diagram treat the sample as if it consisted of 45,000 good applications and 1,500 bad applications. Figure 3 shows the distribution of good/bad after the application of sampling weights; the bad rate is 3.23%. A Data Partition node then splits a 50% validation data set away from the development sample. Models will later be compared based on this validation data set.

 


Classing


Classing is the process of automatically and/or interactively binning and grouping interval, nominal or ordinal input variables in order to

  • manage the number of attributes per characteristic
  • improve the predictive power of the characteristic
  • select predictive characteristics
  • make the Weight of Evidence – and thereby the amount of points in the scorecard – vary smoothly or even linearly across the attributes
     
    The amount of points that an attribute is worth in a scorecard is determined by two factors:

  • the risk of the attribute relative to the other attributes of the same characteristic and
  • the relative contribution of the characteristic to the overall score
    The relative risk of the attribute is determined by its ‘Weight of Evidence’. The contribution of the characteristic is determined by its coefficient in a logistic regression (see the Regression section below).
     The Weight of Evidence of an attribute is defined as the logarithm of the ratio of the proportion of goods in the attribute over the proportion of bads in the attribute. Large negative values therefore correspond to high risk, and large positive values correspond to low risk. Since an attribute’s amount of points in the scorecard is proportional to its Weight of Evidence (see the Score Points Scaling section below), the classing process determines how many points an attribute is worth relative to the other attributes of the same characteristic.
     After classing has defined the attributes of a characteristic, the characteristic’s predictive power, i.e. its ability to separate high risks from low risks, can be assessed with the so-called Information Value measure. This aids the selection of characteristics for inclusion in the scorecard. The Information Value is the weighted sum of the Weights of Evidence of the characteristic’s attributes, where the sum is weighted by the difference between the proportion of goods and the proportion of bads in the respective attribute. The Information Value should be greater than 0.02 for a characteristic to be considered for inclusion in the scorecard. Information Values below 0.1 can be considered weak, between 0.1 and 0.3 medium, and between 0.3 and 0.5 strong. If the Information Value is greater than 0.5, the characteristic may be over-predicting, meaning that it is in some form trivially related to the good/bad information.
     There is no single criterion for when a grouping can be considered satisfactory. A linear, or at least monotone, increase or decrease of the Weights of Evidence is often what is desired in order for the scorecard to appear plausible. Some analysts would only include characteristics where a sensible re-grouping can achieve this. Others may consider a smooth variation sufficiently plausible and would include a non-monotone characteristic such as ‘income’, where risk is high for both high and low incomes but low for medium incomes, provided the Information Value is high enough.
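Building on the definitions above, here is a minimal sketch of computing Weight of Evidence and Information Value for one characteristic. It assumes a hypothetical pandas DataFrame with a binned characteristic column and a binary bad flag (1 = bad); the column names are illustrative, not from the case study.

```python
# Weight of Evidence / Information Value sketch; column names are hypothetical.
import numpy as np
import pandas as pd

def woe_iv(df, attribute_col, bad_col):
    grouped = df.groupby(attribute_col)[bad_col].agg(bads="sum", total="count")
    grouped["goods"] = grouped["total"] - grouped["bads"]

    dist_good = grouped["goods"] / grouped["goods"].sum()
    dist_bad = grouped["bads"] / grouped["bads"].sum()

    # WoE = ln(share of all goods in the attribute / share of all bads in the attribute).
    # Attributes with zero goods or zero bads need special handling in practice.
    grouped["woe"] = np.log(dist_good / dist_bad)
    # IV = sum over attributes of (share of goods - share of bads) * WoE.
    grouped["iv_contribution"] = (dist_good - dist_bad) * grouped["woe"]
    iv = grouped["iv_contribution"].sum()
    return grouped[["goods", "bads", "woe", "iv_contribution"]], iv

# Usage: table, iv = woe_iv(sample, "age_band", "bad")
```

The returned IV can then be compared against the 0.02 / 0.1 / 0.3 / 0.5 thresholds discussed above when deciding whether to keep the characteristic.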

 


Regression analysis


After the relative risk across attributes of the same characteristic has been quantified, a logistic regression analysis now determines how to weigh the characteristics against each other. The Regression node receives one input variable for each characteristic. This variable contains as its values the Weights of Evidence of the characteristic’s attributes (see table 1 for an example of Weight of Evidence coding). Note that Weight of Evidence coding is different from dummy variable coding, in that single attributes are not weighted against each other independently; instead, whole characteristics are, thereby preserving the relative risk structure of the attributes as determined in the classing stage.

A variety of selection methods (forward, backward, stepwise) can be used in the Regression node to eliminate redundant characteristics. In our case we use a simple regression. In the following step, the resulting regression coefficients are multiplied with the Weights of Evidence of the attributes to form the basis for the score points in the scorecard.

 

Score points scaling


For each attribute, its Weight of Evidence and the regression coefficient of its characteristic could now be multiplied to give the score points of the attribute. An applicant’s total score would then be proportional to the logarithm of the predicted bad/good odds of that applicant. However, score points are commonly scaled linearly to take more friendly (integer) values and to conform with industry or company standards. We scale the points such that a total score of 600 points corresponds to good/bad odds of 50 to 1, and an increase of the score by 20 points corresponds to a doubling of the good/bad odds. For the derivation of the scaling rule that transforms the score points of each attribute, see equations 3 and 4. The scaling rule is implemented in the Scorecard node (see Figure 1), where it can be easily parameterized. The resulting scorecard is output as an HTML table and is shown in table 2. Note how the score points of the various characteristics cover different ranges. The score points develop smoothly and, with the exception of the ‘Income’ variable, also monotonically across the attributes.
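A minimal sketch of this scaling rule, assuming the standard log-odds formulation (600 points at 50:1 good/bad odds, 20 points to double the odds). Only the mapping from predicted log-odds to the scaled total score is shown; distributing the points across attributes from their WoE and regression coefficients is left aside.

```python
# Score scaling sketch: 600 points at 50:1 good/bad odds, 20 points to double the odds.
import math

PDO = 20                 # points to double the odds
BASE_SCORE = 600
BASE_ODDS = 50           # good:bad odds at the base score

factor = PDO / math.log(2)
offset = BASE_SCORE - factor * math.log(BASE_ODDS)

def scaled_score(log_odds):
    """Map predicted ln(good/bad odds) to the scaled score."""
    return offset + factor * log_odds

# Sanity checks: 50:1 odds -> 600 points, 100:1 odds -> 620 points.
print(round(scaled_score(math.log(50))), round(scaled_score(math.log(100))))
```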

 

Reject Inference


The application scoring models we have built so far, even though we have done everything correctly, still suffer from a fundamental bias: they have been built on a population that is structurally different from the population to which they are supposed to be applied. All the example applications in the development sample are applications that were accepted by the old generic scorecard that has been in place during the last two years. This is so because performance can only be evaluated, and a good/bad variable defined, for those accepted applications. However, the through-the-door population that is supposed to be scored is composed of all applicants: those that would have been accepted and those that would have been rejected by the old scorecard. Note that this is only a problem for application scoring, not for behavioral scoring. As a partial remedy to this fundamental bias, it is common practice to go through a process of reject inference. The idea of this approach is to score the data that is retained on the rejected applications with the model that is built on the accepted applications. The rejects are then classified as inferred goods or inferred bads and are added to the accepts data set that contains the actual goods and bads. This augmented data set then serves as the input data set of a second modeling run. In the case of a scorecard model, this involves re-adjusting the classing and re-calculating the regression coefficients.
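A minimal sketch of this augmentation step, assuming hypothetical accepts and rejects DataFrames with the same feature columns and a binary bad flag on the accepts. The simple 0.5 probability cut-off for labelling inferred bads is illustrative; real reject-inference schemes are usually more refined.

```python
# Reject-inference augmentation sketch; data frames, columns and cut-off are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression

def augment_with_rejects(accepts, rejects, feature_cols):
    # 1. Fit a preliminary model on accepted applications with known outcomes.
    prelim = LogisticRegression(max_iter=1000)
    prelim.fit(accepts[feature_cols], accepts["bad"])

    # 2. Score the rejects and infer their good/bad labels.
    rejects = rejects.copy()
    p_bad = prelim.predict_proba(rejects[feature_cols])[:, 1]
    rejects["bad"] = (p_bad > 0.5).astype(int)    # inferred bads vs. inferred goods

    # 3. Combine actual and inferred outcomes for the second modeling run.
    return pd.concat([accepts, rejects], ignore_index=True)
```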

 


------------------------------------------------------------------------------------------------------


Decision Trees


A decision tree, on the other hand, may outperform a scorecard in terms of predictive accuracy because, unlike the scorecard, it detects and exploits interactions between characteristics. In a decision tree model, each answer that an applicant gives determines what question he is asked next. If the age of an applicant is, for example, greater than 50, the model may suggest granting credit without any further questions, because the average bad rate of that segment of applications is sufficiently low. If, at the other extreme, the age of the applicant is below 25, the model may suggest asking about time on the job next. Credit might then only be granted to those who have exceeded 24 months of employment, because only in that sub-segment of young applicants is the average bad rate sufficiently low.

A decision tree model thus consists of a set of if-then-else rules that are still quite straightforward to apply. The decision rules are also easy to understand, maybe even more so than a decision rule that is based on a total score made up of many components. However, a decision rule from a tree model, while easy to apply and easy to understand, may be hard to justify for applications that lie on the border between two segments. There will be cases where an applicant will say, for example: ‘If I had only been 2 months older I would have received credit without further questions, but now I am asked for additional securities. That is unfair.’ That applicant may also be tempted to make a false statement about his age in his next application.

Even if a decision tree is not used directly for scoring, this model type still adds value in a number of ways: the identification of clearly defined segments of applicants with a particularly high or low risk can give dramatic new insight into the risk structure of the population. Decision trees are also used in scorecard monitoring, where they identify segments of applications where the scorecard underperforms.

Finally, decision trees can often achieve predictive power similar to a scorecard with far fewer characteristics. Models that only require a few characteristics, sometimes called ‘short scores’, are becoming especially popular in the context of campaigning and marketing for credit products. However, there is a fundamental problem associated with short scores: they diminish the richness of information that the organization can collect on applicants and thereby erode the basis for future modeling.

 --------------------------------------------------------------------------------------------------------

Neural Nets


With the decision tree, we saw that there is such a thing as a decision rule that is too easy to understand and thereby invites fraud. Ironically, there is no danger of this happening with a neural network. Neural networks are extremely flexible models that combine characteristics in a variety of ways. Their predictive accuracy can therefore be far superior to scorecards, and they don’t suffer from the sharp ‘splits’ that decision trees do. However, it is virtually impossible to explain or understand the score that is produced for a particular application in any simple way. It can therefore be difficult to justify a decision made on the basis of a neural network model. In some countries it may even be a legal requirement to be able to explain a decision, and such a justification must then be produced with additional methods. A neural network of superior predictive power is therefore best suited for certain behavioral or collection scoring purposes, where the average accuracy of the prediction is more important than insight into the score for each particular case. Neural network models cannot be applied manually like scorecards or simple decision trees, but require software to score the application. Then, however, their use is just as simple as that of the other model types.

 

Model Assessment


After building both a scorecard and a decision tree model, we now want to compare the quality of the models on the validation data. One of the standard Enterprise Miner charts in the Assessment node is the concentration curve, shown in Figure 9. It shows how many of all the bads in the population are concentrated in the group of the 2% (4%, 6%, …) worst applicants as predicted by the model. Sorting applicants based on the scorecard scores will result, for example, in around 30% of all the bads being concentrated in the 10% of applicants that are considered the worst by the scorecard model. The decision tree is only able to concentrate about half as many bads in the same proportion of what it considers the worst applicants (the 10% decile is marked by the vertical black line in the chart). In summary, the scorecard is assessed to be superior, because its curve stays above that of the tree.
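A minimal sketch of how such a concentration curve can be computed on validation data, assuming a score where lower values mean higher risk; the column names are illustrative.

```python
# Concentration curve sketch: share of all bads captured in the worst-x% of applicants.
import numpy as np
import pandas as pd

def concentration_curve(scores, is_bad, steps=50):
    df = pd.DataFrame({"score": scores, "bad": is_bad}).sort_values("score")  # worst first
    total_bads = df["bad"].sum()

    rows = []
    for pct in np.linspace(0.02, 1.0, steps):
        worst = df.head(int(round(pct * len(df))))
        rows.append({"worst_share": pct,
                     "bads_captured": worst["bad"].sum() / total_bads})
    return pd.DataFrame(rows)

# Usage: compute the curve for each model on the same validation set; the model whose
# curve lies higher concentrates more bads among its worst-rated applicants.
```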

  
Defining decision rules for application approval and risk management

 

Application approval and risk management do not rely on scores alone, but scores do form the basis of a decision strategy that groups customers into homogeneous segments. These segments can then be treated with the same action. For example, in the case of approval decisions, customers are often classified using appropriate cut-off scores as approved, referred for examination or rejected. Other segmentation strategies can determine the limit amount that is assigned to a segment or the collection actions taken. An important type of segmentation is the division of customers into risk pools for the purpose of calculating certain risk components: probability of default (PD), loss given default (LGD) and exposure at default (EAD). These risk components are required by the risk-weighted assets (RWA) calculation mandated by the Basel II and III capital requirements regulations. Analysts apply the scorecard and the pooling definition to a historical data set. The long-run historical averages of the default rate, losses and exposures can then be calculated by pool and used as input into the RWA calculation. There are various ways to group customers into segments using a scorecard. Often segmentation involves setting thresholds. Sometimes analysts define these thresholds manually, and sometimes they use an algorithm to automatically find a decision rule that is optimal in a specific way. The way multiple thresholds are combined further characterizes a decision rule. Typical examples of decision rules include policy rules (exclusions), single score bins, multiple score bins and decision trees.
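A minimal sketch of a cut-off based decision rule of the kind described above, combining a policy exclusion with single score bins. The thresholds and the policy rule are invented for illustration.

```python
# Approval decision rule sketch; thresholds and policy rule are illustrative only.
from typing import Optional

def approval_decision(score: int, months_since_bankruptcy: Optional[int]) -> str:
    # Policy rule (exclusion) applied before any score-based segmentation.
    if months_since_bankruptcy is not None and months_since_bankruptcy < 24:
        return "reject"
    # Single-score bins: approve, refer for manual review, or reject.
    if score >= 620:
        return "approve"
    if score >= 580:
        return "refer"
    return "reject"

print(approval_decision(640, None))   # approve
print(approval_decision(600, None))   # refer
print(approval_decision(640, 12))     # reject (policy exclusion overrides the score)
```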


 

Deploying scores and decisions

 

Execution of decision rules can be done in batch for all customers so that the assignment of each customer to a group and an action is available in an operational data store for instant retrieval by front-office software. Or, alternatively, the front-office software can initiate execution of the decision rule to make a decision on an individual customer, possibly using new or updated information supplied by the customer at that time (online). The decision is then passed back immediately to the front-office software. In either case, the decision rule is not executed by the front-office software but through middle-layer software on a central server. For existing credit customers, the batch option will be most commonly used, since behavioral information derived from the customer transaction history and other stored customer characteristics is typically more predictive than information a customer might supply in the front office.




Is big-data analytics the ultimate solution for airlines?


If the airline industry could be described in two words, they would be "intensely competitive". The airline industry generates billions of dollars every year and still has a cumulative profit margin of less than 1%. The reason for this lies in the industry’s vast complexity. Airlines have a multitude of different business issues that need to be solved at once, such as a globally uneven playing field, revenue vulnerability, an extremely variable planning horizon, high cyclicality and seasonality, fierce competition, excessive government intervention and high fixed and low marginal costs. The low profit-to-turnover ratio of airlines has been further exacerbated by growing low-fare competition, increasing security costs, and frequent dynamic shifts in air travel consumer behavior. The historical business model of many network airlines now appears unable to support sustained profitability under any but the most favorable economic conditions. The industry is at a turning point. The market dictates an “adapt or die” policy, and the airlines that wish to survive will face the challenge of having to make significant changes to their current archaic business model. Doing this requires far greater use of analytical technologies that allow the flow of consistent, repeatable and reliable enterprise-wide intelligence needed to tackle all the challenges the industry is facing.


To ensure the best chance of full economic recovery, airlines should fully leverage their most prolific asset: data. Data, used in conjunction with innovative technologies that allow the creation of an enterprise-wide intelligence platform, will provide the capabilities for a comprehensive intelligent management and decision-making system throughout the enterprise. The ultimate benefits of implementing and using an enterprise-wide intelligence platform, together with airline business acumen and experience, include timely responses to current and future market demands, better planning and strategically aligned decision making, and a clear understanding and monitoring of all key performance drivers relevant to the airline industry. Achieving these benefits in a timely and intelligent manner will ultimately result in lower operating costs, better customer service, market-leading competitiveness and increased profit margin and shareholder value.

Business challenges in airline industry
The key to successful deployment of technological advances in the airline industry is being able to anticipate how the current business model will have to change to survive in tough market conditions.

Some of the challenges that can be successfully addressed by an enterprise intelligence platform are:

  • The need for accurate daily and weekly performance measurement reports (e.g. “flash”/estimated revenue, operating costs and net contribution reports for every aircraft’s actual flight per sector/route).
  • The need to better manage all aspects of risk.
  • The need for better impact analysis and more effective optimization of all resources, as well as the ability to produce accurate passenger-revenue forecasts.
  • The need for a holistic, 360-degree view of the airline industry’s customers, suppliers, service providers and distributors.
  • The need for expense verification models in order to better control all industry cost aspects.

Performance Measurements

Airlines operate in a globally competitive environment and therefore require prompt and accurate enterprise performance measurements. Furthermore, airlines are volume driven, and small variations (passengers flown, fuel spent/bought, load carried) can multiply into major effects, so appropriate and timely action is critical. Airlines also face substantial difficulties in producing reliable daily or weekly performance measurements: current airline “legacy” IT systems, such as revenue accounting, require several weeks after month end to generate revenue results for every flight per sector/route. An enterprise intelligence platform can automate the production of daily activity reports, such as the number of passengers flown per flight/sector, distance flown, etc., which can be used to provide estimated performance measurements such as daily or weekly revenues for specific routes or sectors.
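As a sketch of what such automated daily reporting could look like, assuming a hypothetical flights.csv feed with one row per flown leg; all column names are invented for the example.

```python
# Daily route-level performance report sketch; feed and column names are hypothetical.
import pandas as pd

flights = pd.read_csv("flights.csv", parse_dates=["flight_date"])
flights["pax_km"] = flights["passengers_flown"] * flights["distance_km"]

daily_by_route = (flights
                  .groupby(["flight_date", "route"])
                  .agg(legs_flown=("flight_no", "count"),
                       passengers=("passengers_flown", "sum"),
                       pax_km=("pax_km", "sum"),
                       est_revenue=("estimated_revenue", "sum"))
                  .reset_index())

# Estimated revenue per passenger-kilometre as a quick daily performance indicator.
daily_by_route["rev_per_pax_km"] = daily_by_route["est_revenue"] / daily_by_route["pax_km"]
print(daily_by_route.tail())
```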

Risk Management
The global airline industry has been subjected to major catastrophes over the past years. It is accordingly imperative for airlines to develop various risk management models and strategies to protect themselves from the negative impact of these types of events. Furthermore, due to the global playing field, airlines often earn their revenues and pay their costs in different baskets of currencies (e.g. USD, EUR, GBP). As a result, there is frequently a mismatch between the flow of revenue receipts and expenses in each basket of currency, creating currency risk exposure.

Control and Verification
Airline carriers require a number of control and verification models to be able to control costs arising from their various operational activities. To enable this, airlines have a pressing need for a complete and integrated repository of flight information data gathered from all their disparate business units. This enables the computation of various efficiency analytics, e.g. planned fuel usage compared with actual fuel usage per aircraft, or crew utilization (roster optimization). These issues could also be fully addressed by consolidating and analyzing relevant flight and aircraft data. In turn, this would help create a 360° view of each flight and aircraft, allowing business users to dramatically improve their control and verification systems.
Airlines also require the development of an effective and holistic forecasting model to regularly assess the impact of options and alternatives such as increasing available aircraft seats, adjusting fares, introducing new routes, etc. Forecasts should take account of actual statistical trends and results, e.g. actual passengers carried and actual average fares earned. Such forecasts should then be compared against budgets and prior-year performance.
 

Holistic customer view
Airlines would greatly benefit from knowing and understanding their business environment along some of the key business dimensions, such as performance, behavior, risk and profitability. Using customers as an example, the main objective would be to enrich the knowledge about individual customers, leading to new strategic customer segments. This intelligence would allow airlines to reap a host of benefits, such as successful, targeted customer promotions and cross-selling and up-selling campaigns for different flights and booking classes, leading to improved yield and revenue. For example, it would give airlines the knowledge to limit discounts on flight routes that are usually over-booked, allowing a larger number of passengers to compete for high-profit seats immediately prior to departure. Such multidimensional views of the business can help the airline better serve its customers through more effective, efficient and personalized service, receiving in return customer loyalty, support and market share, all leading to higher profitability.
 

 

 

 

 

RFM Segmentation


 RFM segmentation is well known in the retail industry. Its basic premise is that by knowing the recency, frequency and monetary value of purchases you are in a good position to start figuring out a specific customer in terms of value, purchasing behavior and loyalty. The same logic, however, can be applied to almost any phenomenon we are trying to predict: knowing how often something happens, how recently it happened and how large it was has the same kind of predictive power as it does in the retail context. Whenever I have used it for predictive modeling, RFM has come out as one of the top predictors. So, let me delve deeper into the basic principles of the RFM method.

 

RFM segments the customer base based on recency of purchase (R), frequency of purchase (F) and monetary value (M). The recency parameter is the most powerful of the three; in forecasting models, the most recent observations often carry the highest weighting and are the most predictive of the next value. The second most powerful is frequency, as long as the definition of frequency is limited to the last month or quarter rather than the entire lifespan of the customer relationship. The least powerful is monetary value. Since the total value over a period of time is directly correlated with frequency, it is advisable to use an average value.

 

There are several different ways to calculate RFM groups and scores and below is the classic approach:

 

First, create 5 segments based on recency by dividing the data file into 5 exact quintiles: the contacts with the most recent transactions (i.e. in the top 20% of the file) are given a recency value of 5, the next 20% are given a recency value of 4, and so on. Then each of those quintiles is segmented into 5 further quintiles based on the frequency value for each contact, where the contacts with the highest transaction frequency are given a frequency value of 5, the next 20% are given a frequency value of 4, and so on. Finally, each of these segments is then segmented into 5 further quintiles based on the monetary value of each contact, i.e. the total amount that all of that contact’s transactions add up to. Those contacts with the highest monetary values (i.e. in the top 20%) are given a monetary value of 5, the next 20% are given a monetary value of 4, and so on. At the end of this process, you will have 125 segments with an RFM group between 111 and 555, with roughly the same number of contacts within each segment; and each contact will have an RFM score of between 3 and 15.
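A minimal sketch of this classic nested-quintile calculation, assuming a hypothetical transactions.csv with columns contact_id, txn_date and amount. Ranking before pd.qcut keeps ties from breaking the quintile edges; very small R or R/F cells may need fewer bins in practice.

```python
# Classic nested RFM quintile sketch; file and column names are hypothetical.
import pandas as pd

txns = pd.read_csv("transactions.csv", parse_dates=["txn_date"])
snapshot = txns["txn_date"].max()

rfm = txns.groupby("contact_id").agg(
    recency_days=("txn_date", lambda d: (snapshot - d.max()).days),
    frequency=("txn_date", "count"),
    monetary=("amount", "sum"),
)

def quintile(series, ascending):
    # Rank first so ties don't break the quintile edges; 5 = best quintile.
    ranks = series.rank(method="first", ascending=ascending)
    return pd.qcut(ranks, 5, labels=[1, 2, 3, 4, 5]).astype(int)

# R across the whole file, F within each R quintile, M within each R/F cell (nested).
rfm["R"] = quintile(rfm["recency_days"], ascending=False)   # most recent -> 5
rfm["F"] = rfm.groupby("R", group_keys=False)["frequency"].apply(lambda s: quintile(s, True))
rfm["M"] = rfm.groupby(["R", "F"], group_keys=False)["monetary"].apply(lambda s: quintile(s, True))

rfm["rfm_group"] = rfm["R"].astype(str) + rfm["F"].astype(str) + rfm["M"].astype(str)  # 111-555
rfm["rfm_score"] = rfm[["R", "F", "M"]].sum(axis=1)                                    # 3-15
```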

 

An alternative approach is to still calculate RFM groups and scores using quintiles, but with the independent RFM quintile approach: not just the recency but also the frequency and monetary quintiles for each contact are calculated across the whole data file and do not depend on any of the other RFM factors or quintiles. Another approach is to use user-definable bands for each criterion (i.e. each RFM factor) in order to determine what recency, frequency and monetary value should be given to each contact. Even though RFM segmentation can be used on a stand-alone basis, I always tend to combine it with other demographic and affinity variables in order to have a more holistic view of each segment’s make-up.

  

I have developed my own approach that I often use, which is somewhat different from the classic approach. It goes as follows:

       1.) Create a variable Total Spend for each customer

       2.) Create a variable Total number of visits for each customer

       3.) Divide both variables into 3 equal-frequency bins – the 1st bin would be the lowest roughly 30% of all customers in regard to spending (and, as a separate variable, visits)

       4.) Evaluate which group each customer belonged to (for that period) in terms of his total spending and total visits, and label him accordingly. For example, the variable “FRM_Spend_label” would have values “L”, “M” and “H”: if his total spending over the 12 months falls within the second bin, give him the value “M” (medium) in “FRM_Spend_label”.

        5.) Do the same thing for visits, creating a new variable “FRM_visit_variable”.

        6.) Do a slightly different thing for recency – starting from the same endpoint as for spending and visits, go back only 3 months rather than 12. Then do the following: if the customer made a purchase in month 1 (the most recent month), give him the value “H”; if the most recent purchase was in month 2, give him “M”; and if the most recent purchase was in month 3, give him “L”.

Note – it might happen that most customers have some sort of purchase in all months, in which case it would be advisable to raise the threshold above 0. In other words, count a recent purchase only if the monthly total is above some specified amount greater than 0.
         7.) Combine all three FRM dimensions into a single variable whose values are combinations of “H”, “M” and “L”. A value of “HLH” would mean that the customer falls into the top group of customers in terms of number of visits to the stores, that he hasn’t been in the store (with a purchase larger than…) for a month, and that he falls into the top group of customers in terms of the total monetary value he brings to the company.

         8.) In the last step I deploy a “19 + 1” rule: I retain the top 19 combinations based on their frequencies and drop all other combinations into an “other” category, so that my FRM variable doesn’t have more than 20 distinct values. A short code sketch of the whole approach follows below.
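Here is a minimal sketch of how steps 1–8 could be implemented, assuming a hypothetical transactions DataFrame with columns customer_id, txn_date and amount. The tercile cut points, the mapping of months-back to H/M/L and the default label are illustrative choices, not prescriptions.

```python
# Sketch of the H/M/L FRM variant from steps 1-8; data and column names are hypothetical.
import pandas as pd

def hml(series):
    # Split into three roughly equal-frequency groups labelled L / M / H.
    ranks = series.rank(method="first")
    return pd.qcut(ranks, 3, labels=["L", "M", "H"]).astype(str)

def frm_labels(txns, snapshot):
    last_12m = txns[txns["txn_date"] > snapshot - pd.DateOffset(months=12)]
    cust = last_12m.groupby("customer_id").agg(total_spend=("amount", "sum"),
                                               total_visits=("txn_date", "count"))
    cust["FRM_Spend_label"] = hml(cust["total_spend"])      # steps 1, 3, 4
    cust["FRM_visit_label"] = hml(cust["total_visits"])     # steps 2, 3, 5

    # Step 6: recency over the last 3 months only, H / M / L by month of last purchase.
    last_purchase = txns.groupby("customer_id")["txn_date"].max()
    months_back = ((snapshot.year - last_purchase.dt.year) * 12
                   + (snapshot.month - last_purchase.dt.month))
    # Customers whose last purchase was more than 3 months ago default to "L" here.
    cust["FRM_recency_label"] = months_back.map({0: "H", 1: "M", 2: "L"}).fillna("L")

    # Step 7: visits + recency + spend, e.g. "HLH".
    cust["FRM"] = cust["FRM_visit_label"] + cust["FRM_recency_label"] + cust["FRM_Spend_label"]

    # Step 8: "19 + 1" rule - keep the 19 most frequent combinations, collapse the rest.
    top19 = cust["FRM"].value_counts().head(19).index
    cust["FRM"] = cust["FRM"].where(cust["FRM"].isin(top19), "other")
    return cust
```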

Hope this helps!