Analytics and Data Mining - Goran's Blog: 2011

Wednesday, November 2, 2011

Analytics and Data Mining in Banking

With increasing economic globalization and improvements in information technology, large amounts of financial data are being generated and stored. These data can be subjected to data mining techniques to discover hidden patterns and to predict future trends and the behavior of financial markets. This in turn can improve marketplace responsiveness and awareness, leading to reduced costs and increased revenue.

Analytics can contribute to solving business problems in banking and finance by finding patterns, causalities, and correlations in business information and market prices that are not immediately apparent to managers, because the volume of data is too large, or the data is generated too quickly, for experts to screen. Bank managers may go a step further and find the sequences, episodes and periodicity in the transaction behaviour of their customers, which may help them in better segmenting, targeting, acquiring, retaining and maintaining a profitable customer base.

Business intelligence and data mining techniques can also help them identify various classes of customers and come up with a class-based product and/or pricing approach that may improve revenue management as well. Analytics can help banks understand and drive decisions related to customer profitability, as well as enable banking institutions to segment customers according to a multitude of variables (demographics, geographies, account history, etc.) in order to create more meaningful and targeted marketing programs.

Furthermore, analytics can help banks improve retention rates by determining the causes of customer attrition and predicting future attrition. In addition, banks can apply analytics to historical data to find out which customers are good candidates for cross-selling and up-selling and, as a result, achieve an increase in revenue and wallet share. For most banks, analytics is also the most powerful weapon in the fight against fraud.

Customer Relationship Management

Customer segmentation and profiling is a data mining process that builds customer profiles of different groups from the company’s existing customer database. The information obtained from this process can be used for different purposes, such as understanding business performance, making new marketing initiatives, market segmentation, risk analysis and revising company customer policies. The advantage of data mining is that it can handle large amounts of data and learn inherent structures and patterns in data. It can generate rules and models that are useful in enabling decisions that can be applied to future cases.

In banking, analytics and data mining are frequently used to assign a score to a particular customer or prospect, indicating the likelihood that the individual will behave in a particular way. For example, a score could measure the propensity to respond to a particular insurance or credit card offer, or to switch to a competitor’s product. Data mining can be useful in all three phases of the customer relationship cycle: customer acquisition, increasing the value of the customer, and customer retention.
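As a minimal sketch of what such a score can look like, the Python snippet below turns a few customer attributes into a probability-like propensity score with a logistic function. The attribute names and the weights are hypothetical and chosen purely for illustration; in practice the weights would be estimated from historical response data.

```python
import math

# Hypothetical weights, for illustration only. A real scoring model would
# estimate these from historical data (e.g. past responses to an offer).
WEIGHTS = {
    "intercept": -2.0,
    "has_savings_account": 0.8,
    "years_as_customer": 0.1,
    "recent_complaints": -0.5,
}


def propensity_score(customer):
    """Return a score between 0 and 1 for one customer (a dict of attributes)."""
    z = WEIGHTS["intercept"]
    for name, weight in WEIGHTS.items():
        if name != "intercept":
            # Missing attributes are treated as 0 in this sketch.
            z += weight * customer.get(name, 0.0)
    # The logistic function maps any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    prospect = {"has_savings_account": 1, "years_as_customer": 10}
    print(round(propensity_score(prospect), 3))
```

Customers can then be ranked by score, so that an offer goes first to those most likely to respond.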

Banks use their credit risk models to classify respondents to such offers into good and bad credit risk classes. Given the huge cost and effort involved in such a marketing process, data mining techniques can significantly improve the customer conversion rate through more focused marketing.

Because of intense competition in the finance industry, intelligent business decisions in marketing are more important than ever for better customer targeting, acquisition, retention and relationship management. Customer care and marketing strategies need to be in place for the success and survival of the business, and data mining and predictive analytics make it possible to build such strategies. Financial institutions are finding it more difficult to locate new, previously unsolicited buyers, and as a result they are implementing aggressive marketing programs to acquire new customers from their competitors.

With the advent of data mining and business intelligence tools it has become possible for banks to strengthen their customer acquisition by direct marketing and establish multi-channel contacts, to improve customer development by cross selling and up selling of products, and to increase customer retention by behaviour management.

It is also possible to bundle various offers to meet the needs of valued customers. Analytics can also help banks customize their promotional offers. From past payment records, customer profiles and the data patterns that are available, banks can also identify problem customers who may default in the future. This helps banks adjust their relationship with these customers so that future losses are kept to a minimum.

Data mining techniques can be of immense help to banks and financial institutions in this arena: for better targeting and acquisition of new customers, for fraud detection in real time, for providing segment-based products, for analyzing customers’ purchase patterns over time to improve retention and relationships, and for detecting emerging trends so as to take a proactive stance in a highly competitive market, adding more value to existing products and services and launching new product and service bundles.

Risk Management

Managing and measuring risk is at the core of every financial institution. Today’s major challenge in the banking and insurance world is therefore the implementation of risk management systems in order to identify, measure, and control business exposure. Credit and market risk present the central challenge here, and the advent of advanced database and data mining technology has brought a major change in how they are measured and dealt with. (Other types of risk also exist in banking and finance, e.g. liquidity risk, operational risk, or concentration risk.) Today, integrated measurement of different kinds of risk (e.g. market and credit risk) is moving into focus. All of these are based on models representing single financial instruments or risk factors, their behaviour, and their interaction with the overall market, which makes this field a highly important topic of research.

Financial Market Risk

For single financial instruments, that is, stock indices, interest rates, or currencies, market risk measurement is based on models that depend on a set of underlying risk factors, such as interest rates, stock indices, or economic development. One is interested in the functional form between instrument price or risk and the underlying risk factors, as well as in the functional dependency among the risk factors themselves. Today, different market risk measurement approaches exist. All of them rely on models representing single instruments, their behaviour and their interaction with the overall market. Many of these can only be built by using various data mining techniques.

Portfolio Management

Risk measurement approaches on an aggregated portfolio level quantify the risk of a set of instruments or customers, including diversification effects. Forecasting models, on the other hand, give an indication of the expected return or price of a financial instrument. With data mining and optimization techniques, investors are able to allocate capital across trading activities to maximize profit or minimize risk. Data mining techniques also make it possible to provide extensive scenario analysis capabilities concerning expected asset prices or returns and the risk involved. With this functionality, what-if simulations of varying market conditions can be run to assess the impact on the value and/or risk associated with a portfolio. Profit and loss analyses allow users to assess an asset class, region, counterparty, or custom sub-portfolio, which can then be benchmarked against common international benchmarks.
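To illustrate what "including diversification effects" means, here is a minimal Python sketch that computes portfolio volatility from instrument weights, individual volatilities and a correlation matrix using the standard variance formula. The function name and the numbers are illustrative assumptions, not taken from any particular risk system.

```python
import math


def portfolio_volatility(weights, vols, corr):
    """Standard deviation of a portfolio's return.

    weights: share of capital in each instrument (should sum to 1)
    vols:    volatility (standard deviation) of each instrument
    corr:    correlation matrix as a list of lists
    """
    n = len(weights)
    variance = sum(
        weights[i] * weights[j] * vols[i] * vols[j] * corr[i][j]
        for i in range(n)
        for j in range(n)
    )
    return math.sqrt(variance)


if __name__ == "__main__":
    weights = [0.5, 0.5]
    vols = [0.2, 0.2]
    uncorrelated = [[1.0, 0.0], [0.0, 1.0]]
    fully_correlated = [[1.0, 1.0], [1.0, 1.0]]
    # The uncorrelated portfolio is less risky than either instrument alone.
    print(portfolio_volatility(weights, vols, uncorrelated))
    print(portfolio_volatility(weights, vols, fully_correlated))
```

When the two instruments move independently, the combined risk falls below the risk of each instrument on its own; when they move together, no diversification benefit remains.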

Trading

For the last few years a major topic of research has been the building of quantitative trading tools that use data mining methods, with past data as input, to predict short-term movements of important currencies, interest rates, or equities. The goal of this technique is to spot times when markets are cheap or expensive by identifying the factors that are important in determining market returns. The trading system examines the relationship between relevant information and the price of financial assets, and gives buy or sell recommendations when it suspects an under- or overvaluation.

Goran Dragosavac

Friday, September 30, 2011

What to do when the data doesn’t fit the analytical question?

A smart response to this question can be: well, either we get new data, or a new question!

Let’s imagine our task is to find similarity between members of the same group, for example home loan customers. Now, imagine a situation where we ONLY have data for the home loan customers.

We can certainly examine all their characteristics, but there is no guarantee that these will differ from the characteristics of purchasers of other banking products. What we need is a point of reference: additional data on customers who hold products other than home loans. So, in order to find out what is similar about them, we need to figure out what is different between them and everyone else, which is pretty much one and the same thing.

This is essentially a classification problem that we are trying to solve with a unary target variable (one where all purchasers have the same value for the product purchased). Since we don’t have, and are not able to get, additional data for customers that hold other types of products, we need to go for the second-best scenario. Instead of “reformulating” the data through artful and creative data preparation to better fit the analytical question, we have no option but to do exactly the opposite: reformulate the analytical question to fit the data at hand.

This means that our new question becomes: what are the groups of similar customers within the single class of loan customers, and how do they differ from other groups of loan customers? This is in contrast to the original question of what makes my “loan” customers similar. It is now a very different question, and by reformulating it we are also picking a new “tool” from our workbench: instead of using a classification algorithm we are resorting to a clustering method.

So the usual premise, where data and analytical methods are functions of the business question, doesn’t work in this situation, and the practical solution is to alter the initial objective.
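A minimal sketch of the clustering route, in plain Python: a tiny k-means that groups loan customers using only their own attributes, with no second class needed. The two attributes (loan amount in thousands, age) and the data are made up for illustration, and the first k points are used as starting centroids to keep the example deterministic; real work would scale the attributes first and would normally use an established library.

```python
def kmeans(points, k, iterations=20):
    """Very small k-means: returns (assignments, centroids)."""
    # Deterministic start: use the first k points as initial centroids.
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Step 1: assign each point to its nearest centroid.
        for i, p in enumerate(points):
            assignments[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Step 2: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


if __name__ == "__main__":
    # Hypothetical home loan customers: (loan amount in thousands, age).
    customers = [(100, 25), (900, 50), (110, 27), (950, 55), (120, 30), (880, 48)]
    groups, centers = kmeans(customers, k=2)
    print(groups)
    print(centers)
```

Each resulting group can then be profiled and compared with the other groups of loan customers, which is exactly the reformulated question.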

Goran Dragosavac

Wednesday, September 28, 2011

If you are new to Web Mining…

If you are selling products and services via the web channel, you may consider analyzing who is visiting your web site, how people who buy differ from those who don’t, and, for those who buy, what their clickstream sequence and navigational pattern is.

Each customer's action on a website generates data: not just high-level interactions such as buying something, but also something as simple as using a search engine or navigating through a site. All these interactions between digital service providers and the consumer can be recorded and stored in digital databases. These large data sets contain information helpful to business marketing strategies, both for retrospective analysis and for data-driven forecasting.

Companies today are in the unprecedented position of being able to collect vast amounts of customer information relatively easily. By using web mining, companies can analyze and predict the behavior of their customers. All web site visitors leave digital trails which web servers automatically store in log files. Web analysis tools analyze and process these web server log files to produce meaningful information. Essentially, a complete profile of site traffic is created which shows, for example, how many visitors there were to the site, what sites they came from, and which pages on the site are most popular. Web analysis tools provide companies with previously unknown statistics and useful insights into the behavior of their online customers. While the usage and popularity of such tools may continue to increase, many online retailers are now demanding more useful information about their customers from the vast amounts of data generated by their web sites.

Organizations have typically invested large amounts of money into developing their web sites and web strategy, and they would like to know what return they are receiving on their investment. Most sites use hits and page views as the measure of success of the web site, which clearly is not going to answer their questions. A website is commonly used for:

- Selling products/services

- Providing product/company information

- Providing customer support

Typical questions that an e-retailer needs to answer are:

- How to increase browser to buyer conversion rate?

- How to increase web retention rate? (Defined as the ratio of the number of browsers who return to the web site within a certain window of time to the total number of browsers.)

- How to reduce clicks-to-close value? (A smaller number indicates that customers are finding what they are looking for more easily. Personalization of web services is the right approach to reducing this value.)

- Does the web site design satisfy the needs of various customer segments?
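The first three questions map onto simple ratios. Below is a minimal Python sketch, assuming visitor-level data is already available. The field names are hypothetical; the retention definition follows the one given above, while the conversion rate (buyers divided by all browsers) and clicks-to-close (average number of clicks among visitors who bought) are common readings assumed here.

```python
def web_metrics(visitors):
    """Compute conversion rate, retention rate and clicks-to-close.

    Each visitor is a dict with the (hypothetical) keys:
      purchased              - True if the visitor bought something
      returned_within_window - True if the visitor came back within the window
      clicks                 - number of clicks in the visit
    """
    total = len(visitors)
    if total == 0:
        raise ValueError("no visitors to analyze")
    buyers = [v for v in visitors if v["purchased"]]
    returned = [v for v in visitors if v["returned_within_window"]]
    return {
        "conversion_rate": len(buyers) / total,
        "retention_rate": len(returned) / total,
        # Average clicks needed by those who completed a purchase.
        "clicks_to_close": (
            sum(v["clicks"] for v in buyers) / len(buyers) if buyers else None
        ),
    }


if __name__ == "__main__":
    sample = [
        {"purchased": True, "returned_within_window": True, "clicks": 5},
        {"purchased": False, "returned_within_window": True, "clicks": 12},
        {"purchased": False, "returned_within_window": False, "clicks": 3},
        {"purchased": True, "returned_within_window": False, "clicks": 7},
    ]
    print(web_metrics(sample))
```

Note that all three numbers are defined per visitor, which is why raw page hits cannot supply them.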

Using page hits will NOT provide an answer to any of these questions. Current traffic analysis tools are geared towards providing high-level predefined reports about domain names, IP addresses, browsers, cookies and other machine-to-machine activity. These server activity reports simply do not provide the type of bottom-line analysis that e-tailers, service providers, marketers and advertisers in the business world have come to demand. These software packages (i.e., web analysis tools) originated from the need to report on the activity of the web server and not on the activity of the user.

Web mining may be subdivided into:

- Web-content mining

- Web-structure mining

- Web-usage mining

- User profile data

Web-content mining is the mining of Internet pages, common in the next generation of XML/RDF-based search engines and Web spiders.

Web-structure mining is the application of data mining to reconstruct the structure of a Web site or sites.

Web-usage mining is the mining of log files and associated data from a particular Web site to discover knowledge of browser and buyer behavior on that site. User profile data, such as demographic information about the users of the web site, registration data and customer profile information, can provide valuable information about its customers and can be a platform for segmentation and profiling. Web-usage mining is what is widely understood to be web mining, and it is the main subject of this introduction.
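As a first taste of web-usage mining, here is a minimal Python sketch that reads web server log lines and rebuilds each visitor's click path. It assumes the logs are in the widely used Combined Log Format and, as a simplification, treats the IP address as the visitor identifier; real projects usually rely on cookies or session IDs instead. The sample log lines are made up.

```python
import re
from collections import Counter, defaultdict

# Pattern for the Combined Log Format (assumed here; adjust to your server).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def click_paths(lines):
    """Return (paths per visitor, page popularity) from raw log lines."""
    paths = defaultdict(list)
    page_counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        # Simplification: the IP address stands in for the visitor.
        paths[match.group("ip")].append(match.group("path"))
        page_counts[match.group("path")] += 1
    return dict(paths), page_counts


if __name__ == "__main__":
    sample = [
        '203.0.113.5 - - [28/Sep/2011:10:00:00 +0200] "GET /home HTTP/1.1" 200 512 "http://search.example/?q=shoes" "Mozilla/5.0"',
        '203.0.113.5 - - [28/Sep/2011:10:00:30 +0200] "GET /product/42 HTTP/1.1" 200 2048 "http://shop.example/home" "Mozilla/5.0"',
        '198.51.100.7 - - [28/Sep/2011:10:01:00 +0200] "GET /home HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    ]
    visitor_paths, popularity = click_paths(sample)
    print(visitor_paths)
    print(popularity.most_common(1))
```

The per-visitor paths are the raw material for comparing the navigation of buyers and non-buyers.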

Goran Dragosavac

Data Mining in Retail Industry

The retail industry collects large amounts of data on sales and customer shopping history. The quantity of data collected continues to expand rapidly, especially due to the increasing ease, availability and popularity of business conducted on the web, or e-commerce. The retail industry therefore provides a rich source for data mining. Retail data mining can help identify customer behavior, discover customer shopping patterns and trends, improve the quality of customer service, achieve better customer retention and satisfaction, enhance goods consumption ratios, design more effective goods transportation and distribution policies, and reduce the cost of business.

Some of the retail applications of data mining are in the following areas:

Customer Relationship Management

Customer Segmentation: Customer segmentation is a vital ingredient in a retail organization's marketing recipe. It can offer insights into how different segments respond to shifts in demographics, fashions and trends. For example it can help classify customers in the following segments:

· Customers who respond to new promotions

· Customers who respond to new product launches

· Customers who respond to discounts

· Customers who show propensity to purchase specific products

Campaign/ Promotion Effectiveness Analysis: Once a campaign is launched its effectiveness can be studied across different media and in terms of costs and benefits; this greatly helps in understanding what goes into a successful marketing campaign. Campaign/ promotion effectiveness analysis can answer questions like:

· Which media channels have been most successful in the past for various campaigns?

· Which geographic locations responded well to a particular campaign?

· What were the relative costs and benefits of this campaign?

· Which customer segments responded to the campaign?

Customer Lifetime Value (CLV): Not all customers are equally profitable. CLV attempts to calculate a projected relative measure of value by calculating Risk Adjusted Revenue (based on the probability of the customer acquiring categories/products that he does not currently have in his portfolio) as well as Risk Adjusted Loss (based on the probability of the customer dropping categories/products that he currently owns), adding them to some Net Present Value, and deducting the cost of servicing the customer.

Customer Potential: Customers who are not very profitable today may have the potential to be profitable in the future. Hence it is essential to identify customers with high potential before deciding on the best way to realize that potential through the right marketing stimuli.
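One possible reading of the CLV description above, as a minimal Python sketch. It assumes that Risk Adjusted Revenue is the sum of (probability of acquiring a product × value of that product) over products not yet owned, that Risk Adjusted Loss is the analogous sum over owned products, and that the loss is subtracted. The function name and all figures are illustrative.

```python
def customer_lifetime_value(npv, owned, not_owned, servicing_cost):
    """Projected relative value of one customer.

    npv:            net present value of the customer's current portfolio
    owned:          list of (probability of dropping, product value)
    not_owned:      list of (probability of acquiring, product value)
    servicing_cost: cost of servicing the customer
    """
    # Expected revenue from products the customer may still take up.
    risk_adjusted_revenue = sum(p * value for p, value in not_owned)
    # Expected loss from products the customer may drop (assumed subtracted).
    risk_adjusted_loss = sum(p * value for p, value in owned)
    return npv + risk_adjusted_revenue - risk_adjusted_loss - servicing_cost


if __name__ == "__main__":
    value = customer_lifetime_value(
        npv=1000.0,
        owned=[(0.05, 400.0)],
        not_owned=[(0.2, 500.0), (0.1, 300.0)],
        servicing_cost=150.0,
    )
    print(round(value, 2))
```

Because the result is a relative measure, it is most useful for ranking customers against each other rather than as an absolute monetary forecast.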

Customer Loyalty Analysis: It is more economical to retain an existing customer than to acquire a new one. To develop effective customer retention programs it is vital to analyze the reasons for customer attrition. Business Intelligence helps in understanding customer attrition with respect to various factors influencing a customer and at times one can drill down to individual transactions, which might have resulted in the change of loyalty.

Cross Selling: Retailers use the vast amount of customer information available to them to cross-sell other products at the time of purchase. This can be done through product portfolio analysis, followed by selling the products that are missing from typical portfolios. Market basket analysis can be another good method for effective cross-selling. Look-alike modeling is yet another strategy, where a model is built that produces a quantitative measure of a customer's affinity to a specific product.

Product Pricing: Pricing is one of the most crucial marketing decisions taken by retailers. Often an increase in the price of a product can result in lower sales and in customers adopting replacement products. Using data warehousing and data mining, retailers can develop sophisticated price models for different products, which can establish the price-sales relationship for a product and show how changes in its price affect the sales of other products.

Target Marketing/Response Modeling: Retailers can optimize the overall marketing and promotion effort by targeting campaigns to specific customers or groups of customers. Target marketing can be based on a very simple analysis of the buying habits of the customer or the customer group; but increasingly data mining tools are being used to define specific customer segments that are likely to respond to particular types of campaigns.

Supply Chain Management & Procurement

Supply chain management (SCM) promises unprecedented efficiencies in inventory control and procurement to the retailers. With cash registers equipped with bar-code scanners, retailers can now automatically manage the flow of products and transmit stock replenishment orders to the vendors. The data collected for this purpose can provide deep insights into the dynamics of the supply chain. However, most of the commercial SCM applications provide only transaction-based functionality for inventory management and procurement; they lack sophisticated analytical capabilities required to provide an integrated view of the supply chain.

Vendor Performance Analysis: Performance of each vendor can be analyzed on the basis of a number of factors like cost, delivery time, quality of products delivered, payment lead time, etc. In addition to this, the role of suppliers in specific product outages can be critically analyzed.

Inventory Control (Inventory levels, safety stock, lot size, and lead time analysis): Both current and historic reports on key inventory indicators like inventory levels, lot size, etc. can be generated from the data warehouse, thereby helping in both operational and strategic decisions relating to the inventory.

Product Movement and the Supply Chain: Some products move much faster off the shelf than others. On-time replenishment orders are very critical for these products. Analyzing the movement of specific products - using BI tools - can help in predicting when there will be need for re-order.

Demand Forecasting: Complex demand forecasting models can be created using a number of factors like sales figures, basic economic indicators, environmental conditions, etc. If correctly implemented, a data warehouse can significantly help in improving the retailer’s relations with suppliers and can complement the existing SCM application.

Storefront Operations

The information needs of the store manager are no longer restricted to the day to day operations. Today’s consumer is much more sophisticated and she demands a compelling shopping experience. For this the store manager needs to have an in-depth understanding of her tastes and purchasing behavior. Data warehousing and data mining can help the manager gain this insight. Following are some of the uses of BI in storefront operations:

Store Segmentation: This analysis takes the data that is common to different stores and finds out which stores are similar in terms of product or customer dimensions. In other words, which stores are similar based on the products that sell more quickly or more slowly in comparison to the rest of the stores. The next step is to build a profile of the customers that buy from a specific store.

Market Basket Analysis: It is used to study natural affinities between products. One of the classic examples of market basket analysis is the beer-diaper affinity, which states that men who buy diapers are also likely to buy beer. This is an example of 'two-product affinity'. But in real life, market basket analysis can get extremely complex, resulting in hitherto unknown affinities between a number of products. This analysis has various uses in the retail organization. One very common use is for in-store product placement. Another popular use is product bundling, i.e. grouping products to be sold in a single package deal. Other uses include designing the company's e-commerce web site and product catalogs.

Category Management: It gives the retailer an insight into the right number of SKUs to stock in a particular category. The objective is to achieve maximum profitability from a category; too few SKUs would mean that the customer is not provided with adequate choice, and too many would mean that the SKUs are cannibalizing each other. It goes without saying that effective category management is vital for a retailer's survival in this market.
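A two-product affinity is usually quantified with three standard measures: support, confidence and lift. The minimal Python sketch below computes them for one pair of products; the baskets are made-up illustrative data and the function name is an assumption.

```python
def affinity(transactions, a, b):
    """Support, confidence and lift for the rule "a is bought -> b is bought"."""
    n = len(transactions)
    count_a = sum(1 for t in transactions if a in t)
    count_b = sum(1 for t in transactions if b in t)
    count_ab = sum(1 for t in transactions if a in t and b in t)
    if n == 0 or count_a == 0 or count_b == 0:
        raise ValueError("both products must appear in at least one basket")
    support = count_ab / n            # share of baskets containing both
    confidence = count_ab / count_a   # share of a-baskets that also contain b
    lift = confidence / (count_b / n) # > 1 means bought together more than chance
    return support, confidence, lift


if __name__ == "__main__":
    baskets = [
        {"diapers", "beer"},
        {"diapers", "beer", "milk"},
        {"diapers", "milk"},
        {"beer"},
        {"milk", "bread"},
    ]
    print(affinity(baskets, "diapers", "beer"))
```

A lift above 1 suggests the two products appear together more often than they would if purchases were independent, which is the signal used for placement and bundling decisions.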

Out-Of-Stock Analysis: This analysis probes into the various reasons resulting in an out-of-stock situation. Typically a number of variables are involved and it can get very complicated. An integral part of the analysis is calculating the revenue lost due to product stock-outs.

Alternative Sales Channels

E Business Analysis: The Internet has emerged as a powerful alternative channel for established retailers. Increasing competition from retailers operating purely over the Internet - commonly known as 'e-tailers' - has forced the 'Bricks and Mortar' retailers to quickly adopt this channel. Their success would largely depend on how they use the Net to complement their existing channels. Web logs and Information forms filled over the web are very rich sources of data that can provide insightful information about customer's browsing behavior, purchasing patterns, likes and dislikes, etc. Two main types of analysis done on the web site data are:

· Web Log Analysis: This involves analyzing the basic traffic information over the e-commerce web site. This analysis is primarily required to optimize the operations over the Internet. It typically includes following analyses:

· Site Navigation: An analysis of the typical route followed by the user while navigating the web site. It also includes an analysis of the most popular pages in the web site. This can significantly help in site optimization by making it more user-friendly.

· Referrer Analysis: An analysis of the sites, which are very prolific in diverting traffic to the company’s web site.

· Error Analysis: An analysis of the errors encountered by the user while navigating the web site. This can help in solving the errors and making the browsing experience more pleasurable.

· Keyword Analysis: An analysis of the most popular keywords used by various users in Internet search engines to reach the retailer’s e-commerce web site.

· Product Recommendation: If someone buys product A, which other product is he likely to buy? Usually there are three different angles to exploit when setting up a recommendation engine: natural product affinities, customer affinities and preferences, and peer dynamics and the wisdom of the crowds.

Channel Profitability: Data mining can help analyze channel profitability, and whether it makes sense for the retailer to continue building up expertise in that channel. The decision of continuing with a channel would also include a number of subjective factors like outlook of key enabling technologies for that channel.

Product – Channel Affinity: Some product categories sell particularly well on certain channels. Data mining can help identify hidden product-channel affinities and help the retailer design better promotion and marketing campaigns.

Finance and Fixed Asset Management

The role of financial reporting has undergone a paradigm shift during the last decade. It is no longer restricted to just the financial statements required by law; increasingly it is being used to help in strategic decision making. Also, many organizations have embraced a free information architecture, whereby financial information is openly available for internal use. Many of the analytics described so far use financial data. Many companies, across industries, have integrated financial data in their enterprise-wide data warehouse or established a separate Financial Data Warehouse (FDW). Following are some of the uses of BI in finance:

Budgetary Analysis: Data warehousing facilitates analysis of budgeted versus actual expenditure for various cost heads, so that items like promotion overruns can be analyzed in more detail. It can also be used to allocate budgets for the coming financial period.

Fixed Asset Return Analysis: This is used to analyze financial viability of the fixed assets owned or leased by the company. It would typically involve measures like profitability per sq. foot of store space, total lease cost vs. profitability, etc.

Financial Ratio Analysis: Various financial ratios like debt-equity, liquidity ratios, etc. can be analyzed over a period of time. The ability to drill down and join inter-related reports and analyses – provided by all major OLAP tool vendors – can make ratio analysis much more intuitive.

Profitability Analysis: This includes profitability of individual stores, departments within the store, product categories, brands, and individual SKUs.

Wednesday, August 10, 2011

What constitutes a good data mining model?

There are different types of data mining models, so the definition of a good quality model will depend on the type of model.

A good explanatory model must be able to explain some facet of the business problem. The purpose of descriptive models is to extract patterns in the data that are non-trivial, previously unknown, potentially useful and actionable. Such a model should give you a deeper understanding of a specific business phenomenon, and if acted upon, these new insights can generate new business value.

Predictive models are different. The purpose of a predictive model is to generalize well on a set of new data. First, we have to be able to compare the results to what actually happened in the real world. Did the predicted behavior actually happen? How many times was the model right, or wrong? What is the improvement from the model in comparison to pre-modeling levels? Here, the basic assessment metrics used to choose the best model are accuracy rate, misclassification rate, lift, average squared error, etc.
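To make these metrics concrete, here is a minimal Python sketch that computes them for a binary model from actual outcomes and predicted probabilities. The 0.5 cut-off, the choice to measure lift in the top 20% of scored cases, and the sample data are all assumptions made for illustration.

```python
def assess(actual, predicted_prob, threshold=0.5, top_fraction=0.2):
    """Basic assessment metrics for a binary predictive model.

    actual:         list of 0/1 outcomes observed in the real world
    predicted_prob: list of model scores between 0 and 1
    """
    n = len(actual)
    predicted = [1 if p >= threshold else 0 for p in predicted_prob]
    accuracy = sum(1 for a, p in zip(actual, predicted) if a == p) / n
    # Average squared error between the outcome and the predicted probability.
    ase = sum((a - p) ** 2 for a, p in zip(actual, predicted_prob)) / n
    # Lift: response rate among the highest-scored cases vs. the overall rate.
    ranked = sorted(zip(predicted_prob, actual), reverse=True)
    top_n = max(1, int(n * top_fraction))
    top_rate = sum(a for _, a in ranked[:top_n]) / top_n
    base_rate = sum(actual) / n
    return {
        "accuracy": accuracy,
        "misclassification": 1 - accuracy,
        "average_squared_error": ase,
        "lift": top_rate / base_rate if base_rate else None,
    }


if __name__ == "__main__":
    actual = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
    scores = [0.9, 0.2, 0.8, 0.6, 0.1, 0.3, 0.4, 0.2, 0.1, 0.05]
    print(assess(actual, scores))
```

These figures should be computed on data that was not used to build the model, otherwise they will flatter it.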

The question that I have been asked many times by business audiences is how they can trust the model, since they are required not only to sponsor model implementation, but also to stake their reputations on technologies that they often don’t quite understand. My response is always to look at the assessment measures on test data. How the model performs on a test dataset is the closest we will ever get to assessing its performance on a new dataset, where the model is required to generate accurate predictions.

Model accuracy is only one aspect of quality; there are others, such as stability. At the same level of accuracy it is better to go for the simpler model with fewer variables, since such models tend to be more robust and stable.

Another angle on what constitutes a good model comes purely from a business perspective. Models that are near perfect from a statistical perspective are of no use if they cannot be implemented for whatever reason. On the other hand, we may have models that fall short of being statistically sound but which can still help us do things better than we could in the absence of such a model.

And lastly, the main question remains: how do the benefits generated by the model compare with the cost of its production and implementation? The benefits of a good model outweigh its cost.

Goran Dragosavac

Thursday, June 30, 2011

Data Mining applications across the industries

I am often asked what the most common applications of analytics are in a specific industry. Even though each industry has some applications of analytics and data mining that are specific to it, there are also cross-industry applications that are common to many industries. An example of an industry-specific analytical application is “policy-lapse prediction” in the insurance industry. Examples of cross-industry applications are customer segmentation and customer retention, since in any industry where there are customers there is also a need to segment them and retain them. The following is a mix of analytical applications by industry:

Banking (retail): Analytics can help banks understand and drive decisions related to customer profitability, as well as enable banking institutions to segment customers according to a multitude of variables (demographics, account history, etc.) in order to create more meaningful and targeted marketing programs. Furthermore, analytics can help banks improve retention rates by determining the causes of customer attrition and predicting future attrition. In addition, banks can apply analytics to historical data to find out which customers are good candidates for cross-selling and up-selling and, as a result, achieve an increase in revenue and wallet share. For most banks, analytics is also the most powerful weapon in the fight against fraud.

Banking (investment): In investment banking, analytics can be of tremendous value in supporting cross-asset trading and various other trading strategies. Analytical technologies are also invaluable for enterprise-wide market and credit risk management. Other applications of analytics are segmenting and predicting the behavior of homogeneous groups of customers, uncovering hidden correlations between different indicators, creating models to price futures, options, and stocks, and optimizing portfolio performance.

Insurance (short term): Analytical applications in short-term insurance include rate-making, by identifying risk factors that predict profits, claims and losses, and identifying potentially fraudulent claims. Common applications are segmenting and profiling customers and then analyzing rates and claims within a single segment for different products, as well as market basket analysis and sequence analysis, which answer the question of which insurance products are purchased together or in succession. Other common applications are in reinsurance, in estimating the outstanding claims provision (severity of the claim, exposure, frequency, time before settlement, etc.), and in separating claims between digital and mobile assessors.

Insurance (life): Common applications of analytics in life insurance include policy-lapse prediction, modeling brokers’ performance, reactivating dormant customers, estimating buying potential, and realizing untapped potential through more effective cross-selling. In addition, analytics is commonly used to model response in direct marketing of specific insurance products.

Telcos: Analytics in telecoms is used for churn management, network fault prediction, up-selling and cross-selling, capacity planning, personalized advertising and subscriber profiling.

Retail: Analytics in retail are being used for supply chain and demand planning, customer segmentation and profiling, for improving response in direct marketing, for better cross-selling and up-selling, for product management, and for better understanding which products are purchased together or in sequence.

Industrials: Analytics among Industrials are being used for warranty analysis, quality control, process optimization, waste management, supplier segmentation, product and customer profitability, causal analysis, service parts optimization, and for supply chain optimization and demand planning.

Resources: In the exploitation of natural resources, analytics is used to better understand the operational risks associated with situations like equipment failures, human error and security breaches. Analytics can also be used to analyze usage patterns, weather, econometric data, changing demographics, etc. in order to predict energy purchase and supply requirements accurately and confidently.

Oil and Gas (upstream): Analytics in Oil and Gas is used for exploration and production optimization, facility integrity and reliability (predicting shut-downs, outages and downtime in production), reservoir modeling and oil-field production forecasting, estimating the shape of an oil field, fluid flood optimization and permeability prediction. It is also used for optimizing the reliability of equipment. Other applications of analytics include managing oil field assets by identifying trends in asset performance and potential, estimating the potential of infill drilling locations, screening and prioritizing workover candidates, discovering the characteristics of high-potential producing assets, and identifying opportunities for acquisitions.

Oil and Gas (downstream): Common analytical applications are demand forecasting, prediction of outages (planned and unplanned) and grid overloads, as well as predictive asset maintenance and fault prediction. Other applications are workforce optimization and consumer analytics.

Healthcare: Analytics in healthcare is used for medical claims analysis (segmenting claims into normal claims, claims for case managers, and claims for investigative units), for outcome analysis, both clinical and financial (mortality, length of stay, etc.), for disease management, for analyzing medical errors, and for patient and supplier relationship management (increasing patient satisfaction levels, and segmenting suppliers and providers by cost, efficiency and quality of service).

Goods: Analytics among goods manufacturers are being used for quality control, process optimization, waste management, for inventory optimization and demand planning.

Public: Analytics in the public sector is used for improving service delivery and the performance of government agencies, improving safety, minimizing tax evasion, detecting fraud, waste and abuse, analyzing scientific and research information, managing human resources, optimizing resources, and analyzing intelligence information.

Goran Dragosavac

Thursday, June 2, 2011

Applications of Analytics and Data Mining in Telecommunications

The telecommunications industry was an early adopter of data mining technology, and therefore many data mining applications exist. Telcos generate a tremendous amount of data: call detail data, which describes the calls across the telecommunication networks; network data, which describes the state of the hardware and software components in the network; and customer data, which describes the telecommunication customers. Such rich data is a fertile environment for many data mining applications built to address some of the most pressing business problems in telecommunications.

In general, the telecommunications industry is interested in using data mining applications to answer strategic questions such as:

- which customer groups are highly profitable, and which are not?

- to which customers should we advertise what kind of special offers?

- which customers are most likely to churn?

- how do customer profiles change over time?

- how can fraud be detected and predicted (for example, stolen mobile phones or phone cards)?

- how does one retain customers and keep them loyal as competitors offer special offers and reduced rates?

- how does one predict whether customers will buy additional products and services like cellular services, call waiting or basic services?

- what characteristics differentiate our products from those of our competitors?

- when is a high-risk investment, such as new fiber optic lines, acceptable?

- what kind of call rates would increase profit without losing good customers?

Overview of the most common applications of data mining in telcos, in more detail:

////Sources: web, GDDM library ////

Network Fault Prediction

Prolonged or frequent network shut-downs mean two things: loss of revenue and loss of customers. Here, predictive modeling can be used to generate an alert just before a shut-down so that immediate preventative action can be taken. A model is built on historical instances of previous shut-downs and the state of the network prior to each shut-down. The model is then applied in future time periods, where it recognizes the conditions that precede network failures and generates alerts.
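As a rough illustration of the first step, preparing the training data, the sketch below labels each historical network snapshot according to whether a shut-down followed within a fixed horizon. The function name, the timestamps and the 15-minute horizon are assumptions made for this example only; a real model would be trained on these labels together with the recorded network state.

```python
from bisect import bisect_right

def label_pre_failure(snapshot_times, shutdown_times, horizon):
    """Return 1 for each snapshot followed by a shut-down within `horizon`, else 0."""
    shutdowns = sorted(shutdown_times)
    labels = []
    for t in snapshot_times:
        # Index of the first shut-down strictly after time t.
        i = bisect_right(shutdowns, t)
        upcoming = i < len(shutdowns) and shutdowns[i] - t <= horizon
        labels.append(1 if upcoming else 0)
    return labels

# Hypothetical data: snapshot timestamps in minutes, shut-downs at minutes 50 and 120.
snapshots = [0, 10, 20, 30, 40, 45, 60, 100, 115]
labels = label_pre_failure(snapshots, shutdown_times=[50, 120], horizon=15)
print(labels)  # [0, 0, 0, 0, 1, 1, 0, 0, 1]
```

Snapshots labeled 1 are the "state of the network prior to shut-down" that a classifier would learn to recognize.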

Capacity Planning

Capital expenses contribute significantly to the overall cost of running a network. Operators invest in network capacity to address scalability and future growth. Since this growth can be unpredictable, operators typically over-provision their networks—leading to significant amounts of unutilized capacity that cannot be immediately monetized. Data mining and correlation techniques applied successfully on network data help the operator identify heavily utilized parts of the network at different points in time. This helps the operator to make key decisions related to adding capacity at the right location at the appropriate time. This analytics-assisted capacity planning, combined effectively with dynamic traffic routing, helps operators to optimize network resources—leading to overall cost reductions.
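A minimal sketch of how heavily utilized parts of the network could be identified at different points in time is shown below. The link names, load figures, capacities and the 80% threshold are all hypothetical; they only illustrate the idea of averaging load per link and hour and flagging the combinations that run close to capacity.

```python
from collections import defaultdict

def busy_links(samples, capacity, threshold=0.8):
    """Return {link: [hours]} where mean utilization exceeds `threshold`."""
    totals = defaultdict(lambda: [0.0, 0])  # (link, hour) -> [sum of load, sample count]
    for link, hour, load in samples:
        entry = totals[(link, hour)]
        entry[0] += load
        entry[1] += 1
    hot = defaultdict(list)
    for (link, hour), (load_sum, count) in sorted(totals.items()):
        utilization = (load_sum / count) / capacity[link]
        if utilization > threshold:
            hot[link].append(hour)
    return dict(hot)

# Hypothetical measurements: (link, hour of day, load in Mbit/s).
samples = [
    ("A-B", 9, 800), ("A-B", 9, 900), ("A-B", 20, 950),
    ("B-C", 9, 200), ("B-C", 20, 450),
]
capacity = {"A-B": 1000, "B-C": 500}  # assumed link capacities in Mbit/s
hot = busy_links(samples, capacity)
print(hot)  # {'A-B': [9, 20], 'B-C': [20]}
```

In this toy data, link A-B is near capacity both in the morning and in the evening, while B-C is only busy in the evening, which is the kind of signal that would guide where and when to add capacity.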

Subscriber Data Analysis and Profiling

Operators have access to large amounts of data about a subscriber, based on their usage of the operators’ services. Analysis of calling patterns, billing data and support requests, when combined with subscriber’s personal information such as demographics, age, gender, home address and income, forms the basis for creating a profile of the subscriber. For mobile and wireless services, current location and changes to the location provide additional context for the subscriber’s profile. The subscriber profile becomes the basis for other innovative services.

Social Network Modeling and Analysis

By leveraging calling patterns and other data points from a subscriber’s profile, operators can build a social networking model for the subscriber that identifies connections and proximities between different subscribers. The social network model deduces these proximities through data analytical techniques and is periodically validated and reinforced through automated and manual actions.
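One simple way to picture such a model is as a weighted graph built from call detail records, where the weight of a connection is the number of calls between two subscribers. The sketch below assumes records reduced to (caller, callee) pairs with made-up subscriber names; real call detail data would also carry durations and timestamps that could feed into the weights.

```python
from collections import Counter

def build_call_graph(call_records):
    """Count calls between each unordered pair of subscribers."""
    ties = Counter()
    for caller, callee in call_records:
        if caller != callee:
            ties[tuple(sorted((caller, callee)))] += 1
    return ties

def strongest_contacts(ties, subscriber, top_n=3):
    """Return the subscriber's contacts ordered by number of calls."""
    contacts = Counter()
    for (a, b), weight in ties.items():
        if a == subscriber:
            contacts[b] += weight
        elif b == subscriber:
            contacts[a] += weight
    return contacts.most_common(top_n)

# Hypothetical call detail records: (caller, callee).
calls = [
    ("ann", "bob"), ("bob", "ann"), ("ann", "cat"),
    ("bob", "cat"), ("ann", "bob"), ("dan", "ann"),
]
ties = build_call_graph(calls)
print(strongest_contacts(ties, "ann", top_n=1))  # [('bob', 3)]
```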

Personalized Advertising

Given lower average revenue per user (ARPU) and a competitive environment, operators are exploring alternate sources of revenue. Advertisement-based revenue is one such popular source. Randomized advertisements, being intrusive and interruptive, can adversely affect the subscriber’s satisfaction with the operator. On the other hand, personalized advertising that caters to the likes and needs of the individual can enhance loyalty. These advertisements, when combined with context-specific information such as location, can significantly improve the “hit-rate.” Further, advertisers are amenable to paying premium rates for personalized advertising to the targeted audience, resulting in increased revenues for the operator.

Up-Selling and Innovative Tariffs

The 80-20 principle holds true for most operators: roughly 80% of the revenue comes from the top 20% of subscribers, the high net-worth ones. The analysis of service usage and billing can help the operator identify that top 20% of subscribers and focus on improving their loyalty by ensuring high subscriber satisfaction. Specifically, tariffs can be personalized to provide the best value for the subscribers’ money without reducing the operator’s ARPU, a win-win situation. Further, this analysis also provides an opportunity to up-sell additional services (preferably personalized) based on subscribers’ profiles.
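Whether the 80-20 principle actually holds for a given subscriber base is easy to check from billing data. The sketch below ranks subscribers by revenue and reports the share contributed by the top fraction; the subscriber identifiers and amounts are invented for illustration.

```python
def top_share(revenue_by_subscriber, fraction=0.2):
    """Return (top subscribers, their share of total revenue)."""
    if not revenue_by_subscriber:
        raise ValueError("no subscribers to analyze")
    ranked = sorted(revenue_by_subscriber.items(), key=lambda kv: kv[1], reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    top = ranked[:n_top]
    total = sum(revenue_by_subscriber.values())
    share = sum(amount for _, amount in top) / total
    return [name for name, _ in top], share

# Hypothetical monthly revenue per subscriber.
revenue = {
    "s1": 500, "s2": 300, "s3": 50, "s4": 40, "s5": 30,
    "s6": 25, "s7": 20, "s8": 15, "s9": 10, "s10": 10,
}
top, share = top_share(revenue)
print(top, share)  # ['s1', 's2'] 0.8
```

The returned list is the group on which loyalty and personalized-tariff efforts would be focused.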

Churn Management

Competition among operators (especially mobile operators) leads to increased subscriber churn because subscribers have multiple options to select from. This is further exacerbated by mobile number portability, which lowers the barrier to churn. To retain their subscriber base, it is important for operators to proactively identify subscribers who are likely to churn and incentivize them to stay. Many techniques, including social network modeling, can be used to identify the subscribers who are most likely to switch operators. The churn management solution is integrated with the CRM systems to ensure that appropriate actions, such as personalized tariffs and discounts, are offered to retain the customers.
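As one illustrative signal among the many techniques mentioned above, the sketch below scores each active subscriber by the fraction of their contacts who have already churned, on the assumption that churn tends to spread through a subscriber's social circle. The contact lists, the churned set and the 0.5 cut-off are hypothetical, and in practice such a score would be one input to a fuller churn model rather than a decision rule on its own.

```python
def churned_contact_ratio(contacts, churned):
    """Map each active subscriber to the fraction of their contacts who churned."""
    scores = {}
    for subscriber, friends in contacts.items():
        # Skip subscribers who already left or who have no known contacts.
        if subscriber in churned or not friends:
            continue
        scores[subscriber] = sum(f in churned for f in friends) / len(friends)
    return scores

# Hypothetical contact lists derived from calling patterns.
contacts = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob"},
    "dan": {"ann"},
    "eve": set(),
}
churned = {"cat", "dan"}
scores = churned_contact_ratio(contacts, churned)
at_risk = sorted(s for s, ratio in scores.items() if ratio >= 0.5)
print(at_risk)  # ['ann', 'bob']
```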
