tag:blogger.com,1999:blog-37700434544888548182024-03-24T00:10:20.743-07:00Analytics and Data Mining - Goran's BlogIf you are interested in analytics - you are in the right place!
Send me a comment, let me know what is happening in YOUR analytical world! For much more visit my portal: http://www.bigdatanalysis.com/
Hope to see you there!
GoranAnalytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.comBlogger37125tag:blogger.com,1999:blog-3770043454488854818.post-55083836283948878832016-04-05T13:45:00.001-07:002016-04-05T13:51:52.469-07:00I would like to welcome all of you to my new analytical portal: <h2>
<a href="http://www.bigdatanalysis.com/">http://www.bigdatanalysis.com/</a></h2>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjN_6ruMyRqT1X2XimxXswSuHconFXw6hAqA0gLnmk3VH1L9HwSfj9Ia9iNoPYK0MG5EarXvkO7L5EFXaREkD3HRnuzVyHC865uOOQs3xMFBiXqUvFE04x9XtZl_crN-9Ws8xDcu6kzbL0_/s1600/blog_pic.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="185" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjN_6ruMyRqT1X2XimxXswSuHconFXw6hAqA0gLnmk3VH1L9HwSfj9Ia9iNoPYK0MG5EarXvkO7L5EFXaREkD3HRnuzVyHC865uOOQs3xMFBiXqUvFE04x9XtZl_crN-9Ws8xDcu6kzbL0_/s320/blog_pic.jpg" width="320" /></a></div>
<div style="border-image: none;">
<br /></div>
<br />Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com54tag:blogger.com,1999:blog-3770043454488854818.post-5563116695155417242015-08-09T05:30:00.000-07:002015-08-09T05:30:29.222-07:00Big-Data Analytics in hotel industry
<br />
<div style="text-align: justify;">
<span style="font-family: "Calibri",sans-serif; mso-bidi-font-family: Calibri;"><span style="font-size: small;"> </span></span><span style="font-family: Times, "Times New Roman", serif;"><span style="color: black;">The hotel industry is another industry where effective use of
analytics can dramatically change how business is run. It is another data-rich
industry that captures huge volumes of data of different types, including
video, audio, and Web data. However, for most hoteliers data remains an
underused and underappreciated asset. Hoteliers capture loyalty information,
for example, but few go beyond the loyalty tier in how they view and
act on their guests. </span><span style="color: black;">With analytical exploitation of their data, hoteliers can
go beyond their traditional loyalty programs and deepen their knowledge of
guests in order to develop a more granular understanding of segment behavior,
needs, and expectations; identify profitable customer segments and their buying
preferences; and identify opportunities to attract new guests. But all that
starts with having a clear customer-driven vision, before embarking on <span style="mso-bidi-font-weight: bold;">integrating and standardizing guest data from
multiple channels, systems and properties </span>into a unified, accurate view
of all interactions. </span></span><span style="color: black;"><span style="font-family: Times, "Times New Roman", serif;"> </span></span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;"><span style="color: black;">The next phase is using analytics to segment guests according
to booking trends, behavior and other factors in order to reveal their
likelihood to respond to promotions and emerging travel trends. It is vitally
important for hoteliers to be able to understand guest preferences (locations,
activities, and room types), purchase behavior (frequency, length of stay, time
of year) and profit potential in order to increase the brand loyalty and wallet
share of their most valuable guests. Focusing on the wrong guests reduces
profitability across the enterprise. For example, if a hotel targeted guests
who would likely take advantage of spa services, golf and restaurants, rather
than guests who only generate room nights, they could significantly increase
revenues and profitability.</span> </span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">To maximize profits, hotels need
to increase the loyalty and wallet share of their most valuable guests by
marketing to their preferences and encouraging repeat visits. Unfortunately, money often gets spent on
blanket campaigns that don’t target individual guests or segments with offers
they’re most likely to respond to. As a result, guests may feel that the hotel
doesn’t care about them, or simply doesn’t offer services designed to meet
their needs. It becomes easy for those guests to switch to a competing hotel.</span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;"> </span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;"><span style="color: black;">For analytics to truly
be a game changer, hospitality organizations need to recognize the difference
between reactive and proactive decision making. Using your data to create
reports, drill-downs or alerts helps you to keep a finger on the pulse of your
business.</span> <span style="color: black;">But these things only show you what
happened.</span> <span style="color: black;">They will not tell you why the
problem is happening or what effect it will have in the future.</span> <span style="color: black;">Predictive analytics, like forecasting and optimization,
can help you figure out why things are happening, show you what will happen
next, or even lead you to the best alternative action considering all of your
operating constraints.</span> Hoteliers are starting to use more and more<span style="color: black;"> predictive analytics to move from reactive to proactive
decision making</span>, which would enable them to<span style="color: black;"> stay
one step ahead of trends, set strategy and achieve goals.</span> <span style="color: black;">They gain advantage over the competition, increase value to
shareholders, and continue to surprise and delight their guests. </span></span><span lang="EN" style="mso-ansi-language: EN;"><span style="font-family: Times, "Times New Roman", serif;">The following
are areas where analytics can play an essential role: </span></span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;"> </span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="line-height: normal; margin: 15pt 0in 7.5pt; mso-outline-level: 3; text-align: justify;">
<b style="mso-bidi-font-weight: normal;"><span lang="EN" style="mso-ansi-language: EN;"><span style="font-family: Times, "Times New Roman", serif;">Customer Segmentation</span></span></b></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;"><span style="mso-bidi-font-family: ArialMT;">For the hotel industry, a more useful
approach might be to identify the unique cluster groups and to </span><i><span style="mso-bidi-font-family: Arial-ItalicMT;">then </span></i><span style="mso-bidi-font-family: ArialMT;">conduct a separate value segmentation
exercise for each cluster. For example, for a given hotel we might identify four basic
clusters, or distinct customer groups: a tennis group, a ski group, a pampered
group (e.g. users of spa and valet-type services) and a nighthawk group (fine
diners and theatre-goers). The initial learning from this type of segmentation
could be used in developing a
marketing strategy that is data-driven. </span></span><span style="mso-bidi-font-family: ArialMT;"><span style="font-family: Times, "Times New Roman", serif;"> The hotel could examine its current business customer base and once
again establish unique groups of business customers. For example, we know that
there are groups of business customers that simply use the hotel for overnight
stays, while others are there for longer term events held at the hotel. It may
be possible to further segment these groups based on industry sector. We would
certainly expect that local events held by the oil and gas industry might be
more appropriate in one city, while financial services type events may be more
prevalent somewhere else. Of course, all this supposition on what might define
unique business segments would need to be determined quantitatively through
clustering routines. By using the data and mathematics rather than intuitive
judgment to define key customer segments, we can develop unique programs that
are appropriately geared to different groups of unique business travelers.
Customer experience has always been the overriding customer philosophy within
the travel industry, long before the advent of data analytics. Yet, with data
analytics, the travel industry can now use information to make better decisions
regarding its customers. This enhanced decision-making capability enables hotels
to be more proactive with their customers. Traditionally, success in the hotel
industry has always been determined by superior customer service actions that
address the immediate requested needs of the customer. The competitive
advantage in today’s hotel industry is driven more by the ability to anticipate
and proactively meet the needs of customers; an ability that can only be
exercised through data analytics.</span></span></div>
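<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">To make the idea of a clustering routine concrete, here is a minimal sketch in Python. It is only an illustration: the guest features (share of spend on spa, sports and dining) and the six sample guests are hypothetical, and plain k-means with a farthest-point start is just one of many clustering methods a hotel might choose.</span></div>

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Pick k starting centroids: the first point, then repeatedly the
    point that lies farthest from the centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points,
                  key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, iterations=20):
    """Plain k-means. Returns (centroids, labels)."""
    centroids = farthest_point_init(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each guest joins the nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical guests: share of total spend on [spa, sports, dining].
guests = [
    [0.8, 0.1, 0.1], [0.7, 0.1, 0.2],   # "pampered" style guests
    [0.1, 0.8, 0.1], [0.2, 0.7, 0.1],   # tennis / ski style guests
    [0.1, 0.1, 0.8], [0.1, 0.2, 0.7],   # "nighthawk" style guests
]
centroids, labels = kmeans(guests, k=3)
print(labels)
```

<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">Once every guest carries a cluster label, a separate value segmentation can be run inside each cluster, as described above.</span></div>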
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="line-height: normal; margin: 15pt 0in 7.5pt; mso-outline-level: 3; text-align: justify;">
<b style="mso-bidi-font-weight: normal;"><span lang="EN" style="mso-ansi-language: EN;"><span style="font-family: Times, "Times New Roman", serif;"> </span></span></b><b style="mso-bidi-font-weight: normal;"><span lang="EN" style="mso-ansi-language: EN;"><span style="font-family: Times, "Times New Roman", serif;">Customer Profiling </span></span></b><span style="mso-bidi-font-family: "Times New Roman";"><span style="font-family: Times, "Times New Roman", serif;">Customer
profiling is accomplished through in-depth analysis of guest demographics and
lifestyle characteristics. Attributes such as income levels, family status, age
and sports and cultural interests, if known, can be appended to model guests.
Customer profiling can be used to create an e-mail listserv for targeted
marketing of current as well as prospective clientele. Prospect profiles can be
especially useful in identifying those folks most likely to respond to
marketing and/or promotional offers. Profiling can also be important in
determining which market segments are most productive and profitable. </span></span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="line-height: normal; margin: 15pt 0in 7.5pt; mso-outline-level: 3; text-align: justify;">
<b style="mso-bidi-font-weight: normal;"><span lang="EN" style="mso-ansi-language: EN;"><span style="font-family: Times, "Times New Roman", serif;"> </span></span></b><b style="mso-bidi-font-weight: normal;"><span lang="EN" style="mso-ansi-language: EN;"><span style="font-family: Times, "Times New Roman", serif;">Site Selection </span></span></b><span style="mso-bidi-font-family: "Times New Roman";"><span style="font-family: Times, "Times New Roman", serif;">Data
mining can also be essential to determining sound criteria for restaurant site
selection given an index derived from an analysis of high-volume, successful
units. Demographics (customer profile), psychographics (buying
patterns), and related customer descriptors are used to delineate highly
probable factors for site modeling. As a result, evaluation data and analytical
profiling enable companies to better identify candidate sites. </span></span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="line-height: normal; margin: 0in 0in 8pt; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; text-align: justify;">
<b><span style="mso-bidi-font-family: "Times New Roman";"><span style="font-family: Times, "Times New Roman", serif;">Forecasting </span></span></b><span style="mso-bidi-font-family: "Times New Roman";"><span style="font-family: Times, "Times New Roman", serif;"> Customer
transactional data (segmented by menu item and day part) can be useful in the
development of a forecasting model that accurately produces meaningful
expectations. Regardless of whether a restaurant company relies on moving
average or time series forecasting algorithms, data mining can improve the
statistical reliability of forecast modeling. Estimating in advance how much
and when menu items will need to be prepared is critical to efficient food
production management. Data mining can provide a prognostication of product
usage by day part given available sales data. In addition, knowing how much
product was sold during any meal period can also be helpful in supporting an
effective inventory replenishment system that minimizes the amount of capital
tied up in stored products.</span></span></div>
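<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">As a rough illustration of the moving average approach mentioned above, the following Python sketch forecasts the next day's unit sales per menu item and day part. The sales figures, the item name and the three-day window are assumptions made for the example, not recommendations.</span></div>

```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or len(history) < window:
        raise ValueError("need at least `window` observations")
    return sum(history[-window:]) / window


# Hypothetical daily unit sales, keyed by (menu item, day part).
sales = {
    ("club sandwich", "lunch"): [40, 44, 48, 52],
    ("club sandwich", "dinner"): [10, 12, 11, 13],
}

# One forecast per (menu item, day part) pair.
forecasts = {key: moving_average_forecast(series, window=3)
             for key, series in sales.items()}

for (item, day_part), units in forecasts.items():
    print(f"{item} / {day_part}: prepare about {units:.0f} units")
```

<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">A moving average reacts slowly to trends and ignores seasonality, so in practice it is usually a baseline against which richer time series models are compared.</span></div>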
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="line-height: normal; margin: 0in 0in 8pt; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; text-align: justify;">
<b><span style="mso-bidi-font-family: "Times New Roman";"><span style="font-family: Times, "Times New Roman", serif;">Customer
Relationship Management </span></span></b><span style="mso-bidi-font-family: "Times New Roman";"><span style="font-family: Times, "Times New Roman", serif;"><span style="mso-spacerun: yes;"> </span>An effective CRM program can be a direct
outcome of data mining applications. The ability to enhance CRM given rapid
accessibility of more comprehensive management information should lead to
satisfied clientele and improved sales performance. The ability to anticipate
and affect consumer behavior (influence menu item sales and other promotions)
can provide the restaurant with a competitive advantage. Having a signature
item, for example, can be found to be a driver of improved relations while
providing a product that customers do not perceive as having an equivalent
elsewhere.</span></span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="line-height: normal; margin: 0in 0in 8pt; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; text-align: justify;">
<b><span style="mso-bidi-font-family: "Times New Roman";"><span style="font-family: Times, "Times New Roman", serif;">Menu
Engineering </span></span></b><span style="mso-bidi-font-family: "Times New Roman";"><span style="font-family: Times, "Times New Roman", serif;"><span style="mso-spacerun: yes;"> </span>An analysis of menu item sales and
contribution margins can be helpful to continuous, successful restaurant
operations. While menu engineering deals with menu content decisions, data
mining can produce reports to indicate menu item selections, by customer
segment, as a basis for operational refinement. For example, Applebee’s has
been described as employing data mining expressly for the purpose of
determining ingredient replenishment quantities based on a menu optimization
quadrant analysis that summarizes menu item sales. Through such analysis the
company then decides which menu items to promote.</span></span></div>
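<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">A quadrant analysis of this kind can be sketched in a few lines of Python. The version below follows one common formulation of menu engineering: an item counts as popular if its share of units sold reaches 70% of an equal share, and as profitable if its contribution margin reaches the sales-weighted average margin. The menu items, prices and costs are hypothetical, and this is not a description of any particular company's method.</span></div>

```python
def classify_menu(items, popularity_factor=0.7):
    """Assign each menu item to a quadrant based on popularity and
    contribution margin (price minus cost)."""
    total_units = sum(i["units"] for i in items)
    total_contribution = sum(i["units"] * (i["price"] - i["cost"]) for i in items)
    # Sales-weighted average contribution margin per unit sold.
    avg_margin = total_contribution / total_units
    # Popularity cutoff: a fraction of the "fair share" 1/n of units sold.
    popularity_cutoff = popularity_factor / len(items)

    quadrants = {}
    for i in items:
        popular = i["units"] / total_units >= popularity_cutoff
        profitable = (i["price"] - i["cost"]) >= avg_margin
        if popular and profitable:
            quadrants[i["name"]] = "star"
        elif popular:
            quadrants[i["name"]] = "plowhorse"
        elif profitable:
            quadrants[i["name"]] = "puzzle"
        else:
            quadrants[i["name"]] = "dog"
    return quadrants


# Hypothetical sales for one period.
menu = [
    {"name": "burger",  "units": 400, "price": 12.0, "cost": 5.0},
    {"name": "steak",   "units": 300, "price": 30.0, "cost": 14.0},
    {"name": "lobster", "units": 50,  "price": 45.0, "cost": 20.0},
    {"name": "soup",    "units": 50,  "price": 6.0,  "cost": 3.0},
]
quadrants = classify_menu(menu)
print(quadrants)
```

<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">Items that are popular and profitable are natural candidates for promotion, while unpopular but profitable items may need better placement on the menu.</span></div>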
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="line-height: normal; margin: 0in 0in 8pt; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; text-align: justify;">
<b><span style="mso-bidi-font-family: "Times New Roman";"><span style="font-family: Times, "Times New Roman", serif;">Productivity
Indexing </span></span></b><span style="mso-bidi-font-family: "Times New Roman";"><span style="font-family: Times, "Times New Roman", serif;">By
correlating order entry time (POS time stamped) with settlement time, data
mining is able to provide a reliable estimate of elapsed production and service
times. This data provides insight into average service time relative to customer
turnover as well as waiting line statistics. While productivity data is
difficult to ascertain, this analysis provides factual data to assist
management in fine tuning operations (heart of the house and dining room
staff). <br />
<br />
<b><span style="mso-spacerun: yes;"> </span>Customer Associations and Sequencing </b></span></span><span style="mso-bidi-font-family: "Times New Roman";"><span style="font-family: Times, "Times New Roman", serif;"> Data
mining can uncover affinities between isolated events. For example, a guest
purchasing the restaurant house specialty is likely to also purchase a small
antipasto salad and glass of Chardonnay. Paired relationships provide a basis
for bundling menu items into a cohesive meal that simplifies ordering while
ensuring customer satisfaction. Menu design can also be manipulated to feature
such combinations as unique opportunities for customers. Data associations are
often credited as a means of influencing customers to spend more than
anticipated, that is, of upselling. <br style="mso-special-character: line-break;" />
</span></span></div>
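<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">The strength of such a paired relationship is usually expressed with three standard measures: support, confidence and lift. The Python sketch below computes them for a single rule over a handful of hypothetical orders; the order data is invented purely for illustration.</span></div>

```python
def rule_metrics(orders, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent.

    `orders` is a list of sets of item names.
    """
    n = len(orders)
    count_a = sum(1 for o in orders if antecedent in o)
    count_b = sum(1 for o in orders if consequent in o)
    count_both = sum(1 for o in orders if antecedent in o and consequent in o)

    support = count_both / n            # how often the pair occurs at all
    confidence = count_both / count_a   # P(consequent | antecedent)
    lift = confidence / (count_b / n)   # > 1 means a positive association
    return support, confidence, lift


# Hypothetical guest checks.
orders = [
    {"house specialty", "antipasto salad", "chardonnay"},
    {"house specialty", "antipasto salad"},
    {"house specialty", "chardonnay"},
    {"burger", "cola"},
    {"burger", "antipasto salad"},
]
support, confidence, lift = rule_metrics(orders, "house specialty", "antipasto salad")
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">A lift above 1 suggests the two items are bought together more often than chance alone would predict, which is the kind of pair worth bundling.</span></div>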
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="line-height: normal; margin: 0in 0in 8pt; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; text-align: justify;">
<b style="mso-bidi-font-weight: normal;"><span style="mso-bidi-font-family: "Times New Roman";"><span style="font-family: Times, "Times New Roman", serif;">Forecasting </span></span></b><span style="mso-bidi-font-family: "Times New Roman";"><span style="font-family: Times, "Times New Roman", serif;">As
mentioned previously, forecasting is one of the strengths of data mining and
enables restaurants to better plan to exceed the needs of their clientele.
Forecasting enables more efficient staffing, purchasing, preparation and menu
planning.</span></span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="line-height: normal; margin: 15pt 0in 7.5pt; mso-outline-level: 3; text-align: justify;">
<b style="mso-bidi-font-weight: normal;"><span lang="EN" style="mso-ansi-language: EN;"><span style="font-family: Times, "Times New Roman", serif;">Customer Value </span></span></b><span style="font-family: Times, "Times New Roman", serif;">Within the travel industry,
customers have always considered their time at a hotel as an experience rather
than just a visit. Activities such as fine dining, nightly entertainment, spas, and
corporate seminars and meetings nurture this notion of ‘customer experience’.
This range of activities is going to have varying levels of appeal among a
given clientele. The role of data mining and analytics can be quite significant
in helping us to better understand these varying client needs. Our first task
might be to conduct a basic customer value exercise in order to ultimately
identify our best customers. As with many analytical exercises, the concept of
seasonality needs to be considered here. Seasonality is a very significant
factor within the hotel industry, and most analysts would agree that it can
have a significant impact on travel behaviour. For example, one traveler may
spend $1,000 annually on casual stays spread throughout the year and be
considered an “average customer”. Another traveler also spends $1,000 annually,
but on a single one-week tennis package.
Both customers spend the same amount but are in fact very different
types of customers. This notion of seasonality is significant when conducting
any analytics exercise particularly if we consider that many hotels will offer
tennis and golf packages in the summer and ski packages in the winter. In
addition to the issue of seasonality, there are various services that may have
more appeal to certain groups of customers. Fine dining and theatre may appeal
to one group of customers while spas and perhaps valet type services appeal to
another group. With varying interests amongst travel clientele, a “cluster
type” segmentation exercise would be a very useful way to identify different
groups of customers. Experts in the travel industry would certainly agree that
there are distinct, internally homogeneous customer segments. Using the data being captured
on travel customers, we can apply some science to identify truly distinct
customer segments. How do we integrate the notion of customer value within the
cluster segmentation approach? Typically we might conduct a value segmentation
exercise on the entire customer base and then overlay the cluster segments to
see how they align with customer value.</span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="line-height: normal; margin: 15pt 0in 7.5pt; mso-outline-level: 3; text-align: justify;">
<b style="mso-bidi-font-weight: normal;"><span lang="EN" style="mso-ansi-language: EN;"><span style="font-family: Times, "Times New Roman", serif;"> </span></span></b><b style="mso-bidi-font-weight: normal;"><span lang="EN" style="mso-ansi-language: EN;"><span style="font-family: Times, "Times New Roman", serif;">Personalized Marketing and Website Optimization </span></span></b><span lang="EN" style="mso-ansi-language: EN;"><span style="font-family: Times, "Times New Roman", serif;">By tracking and processing your customer’s
behavior and actions, you can provide them with personalized offers that are
more effective and give a personal touch. Let’s say, for example, that you have a
client who visits your hotel restaurant on a frequent basis due to business.
When you are planning your next promotional campaign, make it targeted and
personal. Send an email to this client saying “We know you have enjoyed our
great restaurant in the past, so when you visit next week, here’s a coupon for
a free appetizer and drink”. There are various marketing automation tools out
there that facilitate this process and allow you to deploy an effective and
personalized cross channel marketing strategy. </span></span><span lang="EN" style="mso-ansi-language: EN;"><span style="font-family: Times, "Times New Roman", serif;">Another area where you can use data in order to
boost business is to optimize your website or landing pages through A/B
testing. Are you implementing a marketing campaign, but the conversion rates of
your landing pages are not as anticipated? An easy solution is to resort to A/B
testing. A/B testing is the act of running a simultaneous experiment
between two or more pages to see which performs or converts the best. </span></span></div>
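<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">Seeing which page "converts the best" requires a check that the difference is not just noise. One standard way to do this is a two-proportion z-test, sketched below in Python using only the standard library. The visitor and booking counts are hypothetical, and other tests (or a Bayesian comparison) would be equally reasonable choices.</span></div>

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rate between
    page A and page B. Returns (z, p_value)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both pages convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical results: 2,000 visitors per landing page.
z, p_value = two_proportion_z_test(conv_a=100, n_a=2000, conv_b=140, n_b=2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("The difference is unlikely to be chance alone.")
```

<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">The 0.05 threshold is a convention rather than a rule; the sample size should be fixed before the test starts so that the result is not read too early.</span></div>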
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<h3 style="margin: 2pt 0in 0pt; text-align: justify;">
<span lang="EN" style="mso-ansi-language: EN; mso-bidi-font-family: Calibri;"></span><span style="font-family: Times, "Times New Roman", serif;"><br /></span></h3>
<h3 style="margin: 2pt 0in 0pt; text-align: justify;">
<strong><span lang="EN" style="color: windowtext; mso-ansi-language: EN; mso-bidi-font-family: Calibri;"><span style="font-family: Times, "Times New Roman", serif; font-size: small;">Energy Consumption </span></span></strong></h3>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span lang="EN" style="mso-ansi-language: EN;"><span style="font-family: Times, "Times New Roman", serif;"> </span></span><span lang="EN" style="mso-ansi-language: EN; mso-bidi-font-family: Calibri;"><span style="font-family: Times, "Times New Roman", serif;">In the
hotel industry, analytics can also be used for internal operations.
Energy consumption accounts for 60 to 70% of the utility costs of a typical
hotel. However, these costs can be controlled, without sacrificing guest comfort,
by using energy more efficiently. Today, smart data can help
managers to build energy profiles for their hotels. There are modern software
solutions that gather data from multiple sources, including weather data,
electricity rates and a building’s energy consumption to build a comprehensive
‘building energy profile’. Through a cloud-based, predictive analytics
algorithm, the software can fine-tune whether power comes from the grid or an
onsite battery module. </span></span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="text-align: justify;">
<span lang="EN" style="mso-ansi-language: EN; mso-bidi-font-family: Calibri;"><span style="font-family: Times, "Times New Roman", serif;"> </span></span><strong><span lang="EN" style="color: windowtext; mso-ansi-language: EN; mso-bidi-font-family: Calibri;"><span style="font-family: Times, "Times New Roman", serif; font-size: small;">Investment Management </span></span></strong></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span lang="EN" style="mso-ansi-language: EN;"><span style="font-family: Times, "Times New Roman", serif;"> </span></span><span lang="EN" style="mso-ansi-language: EN; mso-bidi-font-family: Calibri;"><span style="font-family: Times, "Times New Roman", serif;">Another
way to use analytics for the hotel industry is for financial performance and
investment. When managers want to make capital investments, such as
refurbishing the lobby or the rooms or renovating the restaurant, they can
consider implementing a “Randomized Testing” strategy. How does this work?
Basically the hotel chain would refurbish the lobby and rooms in only two or
three “test” hotels. Then, they would monitor whether there has been a difference in
bookings and customer satisfaction. The data obtained from the test hotels can
then be compared to the data of the other hotels that were not refurbished.
Thus, managers can make a data-driven decision and clearly see whether it is profitable
to make the investment throughout the whole chain. In conclusion, data
analytics can be a powerful force in transforming the hotel industry: it supports
evidence-based actions, customer-centered marketing and
pricing strategies, a higher ROI on capital investments and, more generally,
bigger and better decisions by hoteliers. </span></span><span lang="EN" style="mso-ansi-language: EN; mso-bidi-font-family: Calibri;"><span style="font-family: Times, "Times New Roman", serif;">There
are already some great examples of hotel chains moving in the right
direction in their use of analytics. This can result in improved customer
satisfaction and in personalized marketing campaigns and offers, so that the right
guests book the right room at the right moment and at the right rate. In
addition, it can boost employee productivity and make operations more efficient.
The advantages of using analytics and data mining in the hotel industry are
enormous. </span></span><span lang="EN" style="mso-ansi-language: EN; mso-bidi-font-family: Calibri;"><span style="font-family: Times, "Times New Roman", serif;">Deep
customer insights can lead to improved guest satisfaction and an unforgettable
experience. Making these insights available to all levels and departments
within the hotel is crucial. It allows the concierge to know which local tours
to recommend that fit your preferences. It allows the restaurant departments to
predict which menu items are likely to be ordered, based for example on the
local weather. It allows the reservations department to predict the optimal
rate for a room and sales and marketing to create tailored messages across
different (social) networks and send truly personalized email campaigns. Let’s
dive a bit deeper into some possibilities:</span></span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="text-align: justify;">
<span lang="EN" style="mso-ansi-language: EN; mso-bidi-font-family: Calibri;"><span style="font-family: Times, "Times New Roman", serif;"> </span></span><b style="mso-bidi-font-weight: normal;"><span lang="EN" style="color: windowtext; mso-ansi-language: EN; mso-bidi-font-family: Calibri;"><span style="font-family: Times, "Times New Roman", serif; font-size: small;">The right room at the right rate</span></span></b></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span lang="EN" style="mso-ansi-language: EN;"><span style="font-family: Times, "Times New Roman", serif;"> </span></span><span lang="EN" style="mso-ansi-language: EN; mso-bidi-font-family: Calibri;"><span style="font-family: Times, "Times New Roman", serif;">Yield
management is nothing new in the hotel industry. Providing different rates to
different customers has been done for ages and with success. Big Data offers
hotels the possibility to take revenue management a giant leap forward and
start offering truly personalized prices and rooms to guests. The massive
growth in booking websites, hotel review websites such as TripAdvisor and Yelp
and the ever growing list of social media networks offer a lot of potential.
Combined with hotels’ own CRM systems and/or loyalty programs there is a lot of
data that can be used to optimize revenue management. </span></span><span lang="EN" style="mso-ansi-language: EN; mso-bidi-font-family: Calibri;"><span style="font-family: Times, "Times New Roman", serif;">According
to some industry studies, the hotel chain Marriott has been using Big Data
Analytics to start predicting the optimal price of its rooms to fill its
hotels. They do this by using improved revenue management algorithms that can
deal with data a lot faster, by combining different data sets and making these
insights available to all levels to improve decision-making. The American hotel
chain </span><a href="http://www.denihan.com/" target="_blank"><span style="color: black; font-family: Times, "Times New Roman", serif;">Denihan</span></a><span style="font-family: Times, "Times New Roman", serif;"><span style="color: black;"> </span>goes even a
step further. They used Analytics software to maximize profit and revenue
across thousands of their rooms by combining their own data sets with data from,
for example, review sites, blogs and/or social network websites. They understand
the likes and dislikes of their guests, optimize their offering and adjust the
room rates accordingly.</span></span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="text-align: justify;">
<span lang="EN" style="mso-ansi-language: EN; mso-bidi-font-family: Calibri;"><span style="font-family: Times, "Times New Roman", serif;"> </span></span><b style="mso-bidi-font-weight: normal;"><span lang="EN" style="color: windowtext; mso-ansi-language: EN; mso-bidi-font-family: Calibri;"><span style="font-family: Times, "Times New Roman", serif; font-size: small;">Mobile Big Data throughout the Hotel</span></span></b></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span lang="EN" style="mso-ansi-language: EN; mso-bidi-font-family: Calibri;"><span style="font-family: Times, "Times New Roman", serif;">More
and more hotels have developed mobile Apps that guests can use to book a hotel
room. These apps, however, offer vastly more possibilities for guests if developed
correctly. An app could serve as </span><span style="font-family: Times, "Times New Roman", serif;">the key to your hotel room; it can be used to make
reservations in restaurants and spas and, for example, to order room service. If
hotels start using the vast possibilities of mobile applications they can
generate massive amounts of data that can be analyzed. So, from a guest
perspective, mobile offers a lot of convenience. From an employee perspective,
it can make life a lot easier for the staff while at the same time increasing
customer satisfaction. Providing the housekeeping department with smart devices
for example will allow them to know</span><span style="font-family: Times, "Times New Roman", serif;"> in real time, that you prefer an extra pillow or
an extra light. Kempinski and Hyatt in Dubai already use such applications for
their hotels. Most of the staff within hotels do not have an office or a
computer so providing them with real-time guest information should be done
on-the-go. Although this requires a different approach and a different way of
presenting the insights, placing user-friendly analytics in the hands of
guest-facing employees will definitely improve customer satisfaction.</span></span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="text-align: justify;">
<span lang="EN" style="mso-ansi-language: EN; mso-bidi-font-family: Calibri;"><span style="font-family: Times, "Times New Roman", serif;"><br /></span></span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<h3 style="margin: 2pt 0in 0pt; text-align: justify;">
<b style="mso-bidi-font-weight: normal;"><span lang="EN" style="color: windowtext; mso-ansi-language: EN; mso-bidi-font-family: Calibri;"><span style="font-family: Times, "Times New Roman", serif; font-size: small;">More efficient hotel operations</span></span></b></h3>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span lang="EN" style="mso-ansi-language: EN;"><span style="font-family: Times, "Times New Roman", serif;"> </span></span><span lang="EN" style="mso-ansi-language: EN; mso-bidi-font-family: Calibri;"><span style="font-family: Times, "Times New Roman", serif;">From a
hotel operations point of view, big data also offers many different solutions.
Big Data can be used to reduce your energy bill for example. By combining</span><span style="font-family: Times, "Times New Roman", serif;"> data from 50 different sources, including
electricity rates, weather data and a building’s energy consumption, two
InterContinental hotels in San Francisco managed to reduce their energy costs
by 10-15%. They created detailed energy profiles for their buildings and using
a predictive algorithm they decided whether to use an onsite battery module or
receive power from the grid. </span></span><span lang="EN" style="mso-ansi-language: EN; mso-bidi-font-family: Calibri;"><span style="font-family: Times, "Times New Roman", serif;">Hotels
should also make greater use of analytics to run their<span style="mso-spacerun: yes;"> IT</span></span> operations<span style="font-family: Times, "Times New Roman", serif;"> more efficiently, which is especially relevant for
chains that operate their own booking engine. A server that breaks down or a
booking engine that is inaccessible could result in lost bookings and therefore
lost revenue. IT operations analytics monitors a hotel’s complete IT
environment, including the different relations between applications and
hardware and can predict when things are about to go wrong. Advanced IT
operations analytics can even solve problems automatically before they occur.
This could save a lot of money because IT that’s not working will result in a
bad customer experience. Of course the examples given here are just a few of
the massive possibilities that analytics has to offer for the hotel industry. </span></span><span style="color: #333333; mso-bidi-font-family: Arial;"><span style="font-family: Times, "Times New Roman", serif;">Data mining technology can be a
useful tool for hotel corporations that want to understand and predict guest
behavior. Based on information derived from data mining, hotels can make
well-informed marketing decisions, including who should be contacted, to whom
to offer incentives (or not), and what type of relationship to establish. Data
mining is currently used by a number of industries, including hotels,
restaurants, auto manufacturers, movie-rental chains, and coffee purveyors.
Firms adopt data mining to understand the data captured by scanner terminals,
customer-survey responses, reservation records, and property-management
transactions. This information can be melded into a single data set that is
mined for nuggets of information by data mining experts who are familiar with
the hotel industry. </span></span><span style="color: #333333; mso-bidi-font-family: Arial;"><span style="font-family: Times, "Times New Roman", serif;">However, data mining is no guarantee of marketing success.
Hotels must first ensure that existing data are managed—and that requires
investments in hardware and software systems, data mining programs,
communications equipment, and skilled personnel. Affiliated properties must
also understand that data mining can increase business and profits for the
entire company and should not be viewed as a threat to one location. </span></span><span style="color: #333333; font-family: "Calibri",sans-serif; mso-bidi-font-family: Arial;"><span style="font-family: Times, "Times New Roman", serif; font-size: small;">Since data mining is in its initial stages in the hotel
industry, early adopters may be able to secure a faster return on investment
than will property managers who lag in their decisions. Hotel corporations must
also share data among properties and divisions to gain a richer and broader
knowledge of the current customer base. Management must ensure that hotel
employees use the data-management system to interact with customers even though
it is more time consuming than a
transactional approach.</span></span></div>
Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com170tag:blogger.com,1999:blog-3770043454488854818.post-86247807314919663022015-08-09T05:05:00.004-07:002015-08-09T05:09:53.587-07:00Big-data analytics for lenders and creditors<br />
<div style="text-align: justify;">
<span lang="EN" style="font-family: "Calibri",sans-serif; font-size: 16pt; mso-ansi-language: EN; mso-bidi-font-family: Calibri;"> </span><span style="font-family: Calibri;">Credit today is granted by
various organizations such as banks, building societies, retailers, mail order
companies, utilities and various others. Because of growing demand, stronger
competition and advances in computer technology, over the last 30 years
traditional methods of making credit decisions that rely mostly on human
judgment have been replaced by methods that rely mostly on statistical models.
Such statistical models today are not only used for deciding whether or not to
accept an applicant (application scoring), but also to predict the likely
default of customers that have already been accepted (behavioral scoring) and
to predict the likely amount of debt that the lender can expect to recover (collection
scoring). </span><span style="font-family: Calibri;">The term credit scoring can be
defined on several conceptual levels. Most fundamentally, credit scoring means
applying a statistical model to assign a risk score to a credit application or
to an existing credit account. On a higher level, credit scoring also means the
process of developing such a statistical model from historical data. On yet a
higher level, the term also refers to monitoring the accuracy of one or many
such statistical models and monitoring the effect that score based decisions
have on key business performance indicators.</span></div>
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">Credit scoring is performed,
because it provides a number of important business benefits, all of them based
on the ability to quickly and efficiently obtain fact based and accurate
predictions of credit risk of individual applicants or customers. So, for
example, in application scoring, credit scores are used for optimizing the
approval rate for credit applications. Application scores enable the
organization to choose an optimal cut-off score for acceptance, such that market
share can be gained while retaining maximum profitability. The approval process
and the marketing of credit products can be streamlined based on credit scores:
High risk applications can, for example, be given to more experienced staff or
pre-approved credit products can be offered to selected low-risk customers via
various channels, including direct marketing and the Web. </span></div>
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">Credit scores, both of prospects
and existing customers, are essential in the customization of credit products.
They are used for determining custom credit limits, down payments or deposits
and interest rates. Behavioral credit scores of existing customers are used in
the early detection of high risk accounts and enable the organization to
perform targeted interventions, for example by pro-actively offering debt
restructuring. Behavioral credit scores also form the basis for more accurate
calculations of the total consumer credit risk exposure, which can result in a
reduction of bad debt provision. </span></div>
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">Other benefits of credit scoring
include an improved targeting of audits at high-risk accounts, thereby
optimizing the workload of the auditing staff. Resources spent on debt
collection can be optimized by targeting collection activities at accounts with
a high collection score. Collection scores are also used for determining the
accurate value of a debt book before it is sold to a collection agency.<span style="mso-spacerun: yes;"> </span></span><span style="font-family: Calibri;">Finally, credit scores serve to
assess the quality of portfolios intended for acquisition and to compare the
quality of business from different channels, regions and suppliers. </span></div>
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;"><br /></span></div>
<h4>
Building credit models in-house</h4>
<span style="font-family: Calibri;"><br /></span>
<br />
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">While under certain circumstances
it is appropriate to buy ‘ready-made’ generic credit models from outside
vendors or to have credit models developed by outside consultants for a
specific purpose, maintaining a practice for building credit models in-house
offers several advantages. Most directly, it enables the lending organization
to profit from economies of scale when many models need to be built and to
afford a greater number of segment specific models for a greater variety of
purposes. </span></div>
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">Building up a solid, re-usable
and flexible data, knowledge and skill base of its own also makes it easier for
the organization to stay consistent in the interpretation of model results and
reports and to use a consistent modeling methodology across the whole range of
customer related scores. This results in a reduced turnaround time for the
integration of new models, thereby freeing resources to more swiftly respond to
new business questions with new creative models and strategies. </span></div>
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">Finally, in-house modeling
competency is needed to verify the accuracy and analyze the strengths and
weaknesses of acquired credit models, to reduce access of outsiders to
strategic information and to retain competitive advantage by building up
company specific best practices.</span></div>
<br />
<br />
<h4 style="margin: 0in 0in 8pt;">
Larger credit scoring process</h4>
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">Modeling is the process of
creating a scoring rule from a set of examples. In order for modeling to be
effective, it has to be integrated into a larger process. Let’s look at
application scoring. On the input side, before the modeling, the set of example
applications has to be prepared. On the output side, after the modeling, the
scoring rule has to be executed on a set of new applications, so that credit
granting decisions can be made.</span></div>
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">The collection of performance
data is at the beginning and at the end of the credit scoring process. Before a
set of example applications can be prepared, performance data has to be
collected so that applications can be tagged as ‘good’ or ‘bad’. After new
applications have been scored and decided upon, the performance of the accepts
again has to be tracked and reports created, so that the scoring rule can be
validated and possibly substituted, the acceptance policy be fine-tuned and the
current risk exposure be calculated.</span></div>
<br />
<br />
<h4 style="margin: 0in 0in 8pt;">
Choosing the right model</h4>
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">With available analytical
technologies it is possible to create a variety of model types, such as
scorecards, decision trees or neural networks.<span style="mso-spacerun: yes;">
</span>When you evaluate which model type is best suited for achieving your
goals, you may want to consider criteria such as the ease of applying the
model, the ease of understanding it and the ease of justifying it. At the same
time, for each particular model of whatever type, it is important to assess its
predictive performance, i.e. the accuracy of the scores that the model assigns
to the applications and the consequences of the accept/reject decisions that it
suggests. A variety of business relevant quality measures, such as concentration,
strategy and profit curves are used for this (see section Model Assessment in
the case study section below). The best model will therefore be determined both
by the purpose for which the model will be used and by the structure of the
data set that it is validated on. </span></div>
<br />
<div style="margin: 0in 0in 8pt;">
<span style="font-family: Calibri;"> -----------------------------------------------------------------------------------------------------------</span></div>
<br />
<h4 style="margin: 0in 0in 8pt; text-align: justify;">
Scorecards</h4>
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">The traditional form of a credit
scoring model is a scorecard. This is a table that contains a number of
questions that an applicant is asked (called characteristics) and for each such
question a list of possible answers (called attributes).<span style="mso-spacerun: yes;"> </span>One such characteristic may, for example, be
the age of the applicant, and the attributes for this characteristic are then
a number of age ranges that an applicant can fall into. For each answer, the
applicant receives a certain number of points – more if the attribute is one of
low risk, fewer if it is one of high risk. If the application’s total score exceeds a specified
cut-off number of points, it is recommended for acceptance. </span><span style="font-family: Calibri;"> </span><span style="font-family: Calibri;">The scorecard model, apart
from being a long established method in the industry, still has several
advantages when compared with more recent ‘data mining’ types of models, such
as decision trees or neural networks.<span style="mso-spacerun: yes;"> </span>A
scorecard is easy to apply: if needed the scorecard can be evaluated on a sheet
of paper in the presence of the applicant. It is easy to understand: the amount
of points for one answer doesn’t depend on any of the other answers and across
the range of possible answers for one question the amount of points usually
increases in a simple way (often monotonically or even linearly). It is
therefore often also easy to justify a decision that is made on the basis of a
scorecard to the applicant. It is possible to disclose groups of
characteristics where the applicant has a potential for improving the score and
to do so in broad enough terms not to risk manipulated future applications.</span></div>
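<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">The way a scorecard is applied can be sketched in a few lines of Python. This is only an illustration: the characteristics, attribute ranges, point values and the cut-off below are all invented, and the names <code>SCORECARD</code>, <code>score_application</code> and <code>recommend</code> are hypothetical.</span></div>

```python
# Hypothetical scorecard: characteristic -> list of (attribute test, points).
# All ranges, point values and the cut-off are invented for illustration.
SCORECARD = {
    "age": [
        (lambda v: v < 25, 10),
        (lambda v: 25 <= v < 40, 20),
        (lambda v: v >= 40, 35),
    ],
    "residential_status": [
        (lambda v: v == "owner", 30),
        (lambda v: v == "tenant", 15),
        (lambda v: v not in ("owner", "tenant"), 5),
    ],
}
CUTOFF = 50  # hypothetical cut-off number of points


def score_application(application, scorecard=SCORECARD):
    """Sum the points of the single attribute matched per characteristic."""
    total = 0
    for characteristic, attributes in scorecard.items():
        value = application[characteristic]
        for matches, points in attributes:
            if matches(value):
                total += points
                break  # the points for one answer do not depend on other answers
    return total


def recommend(application, cutoff=CUTOFF):
    """Recommend acceptance when the total score exceeds the cut-off."""
    return "accept" if score_application(application) > cutoff else "reject"


applicant = {"age": 45, "residential_status": "owner"}
print(score_application(applicant), recommend(applicant))
```

<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">Because each characteristic contributes its points independently, the same calculation could be carried out by hand on a sheet of paper.</span></div>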
<br />
<div style="margin: 0in 0in 8pt;">
<span style="font-family: Calibri;"> </span></div>
<h3 style="margin: 0in 0in 8pt;">
<span style="font-size: 12pt; line-height: 106%;">Scorecard
development process</span></h3>
<br />
<div style="margin: 0in 0in 8pt;">
<span style="font-family: Calibri;"><br /></span></div>
<h4 style="margin: 0in 0in 8pt;">
Development sample</h4>
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">The development sample (input data set) is a balanced sample
consisting of 1500 good and 1500 bad accepted applicants. ‘Bad’ has been
defined as having been 90 days past due once. Everyone not ‘bad’ is ‘good’, so
there are no ‘indeterminates’.<span style="mso-spacerun: yes;"> </span>A
separate data set contains the data on rejects. The modeling process,
especially the validation charts, requires information about the actual good/bad
proportion in the accept population. Sampling weights are used here for
simulating that proportion. A weight of 30 is assigned to a good application
and a weight of 1 to a bad one. Thereafter all nodes in the process flow
diagram treat the sample as if it consisted of 45 000 good applications and
1500 bad applications. Figure 3 shows the distribution of good/bad after the
application of sampling weights. The bad rate is 3.23%. A Data Partition node
then splits a 50 % validation data set away from the development sample. Models
will later be compared based on this validation data set.</span></div>
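<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">The effect of the sampling weights can be checked with a short calculation. The sketch below uses only the figures given above (1500 goods, 1500 bads, weights of 30 and 1); the function names are hypothetical.</span></div>

```python
GOOD_WEIGHT = 30  # sampling weight of a good application (from the text)
BAD_WEIGHT = 1    # sampling weight of a bad application (from the text)


def weighted_counts(n_good, n_bad, w_good=GOOD_WEIGHT, w_bad=BAD_WEIGHT):
    """Return the good and bad counts after applying sampling weights."""
    return n_good * w_good, n_bad * w_bad


def bad_rate(n_good, n_bad, w_good=GOOD_WEIGHT, w_bad=BAD_WEIGHT):
    """Share of bads in the weighted sample."""
    goods, bads = weighted_counts(n_good, n_bad, w_good, w_bad)
    return bads / (goods + bads)


goods, bads = weighted_counts(1500, 1500)
print(goods, bads, f"{bad_rate(1500, 1500):.2%}")  # 45000 1500 3.23%
```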
<br />
<br />
<h4 style="margin: 0in 0in 8pt;">
<span style="-ms-layout-grid-mode: line; color: black; mso-fareast-language: EN-US;">Classing</span></h4>
<br />
<div style="margin: 0in 0in 8pt;">
<span style="-ms-layout-grid-mode: line; color: black; mso-fareast-language: EN-US;"><span style="font-family: Calibri;">Classing is the process of automatically and/or
interactively binning and grouping interval, nominal or ordinal input variables
in order to</span></span></div>
<br />
<ul style="direction: ltr; list-style-type: disc;">
<li style="color: black; font-family: "Times New Roman",serif; font-size: 12pt; font-style: normal; font-weight: normal;"><div style="color: black; font-family: "Calibri",sans-serif; font-size: 11pt; font-style: normal; font-weight: normal; line-height: normal; margin-bottom: 0pt; margin-top: 6pt; mso-hyphenate: auto; mso-list: l0 level1 lfo2; tab-stops: list .25in;">
<span style="-ms-layout-grid-mode: line; color: black; mso-fareast-language: EN-US;">manage the number of attributes per characteristic</span></div>
</li>
<li style="color: black; font-family: "Times New Roman",serif; font-size: 12pt; font-style: normal; font-weight: normal;"><div style="color: black; font-family: "Calibri",sans-serif; font-size: 11pt; font-style: normal; font-weight: normal; line-height: normal; margin-bottom: 0pt; margin-top: 6pt; mso-hyphenate: auto; mso-list: l0 level1 lfo2; tab-stops: list .25in;">
<span style="-ms-layout-grid-mode: line; color: black; mso-fareast-language: EN-US;">improve the predictive power of the characteristic</span></div>
</li>
<li style="color: black; font-family: "Times New Roman",serif; font-size: 12pt; font-style: normal; font-weight: normal;"><div style="color: black; font-family: "Calibri",sans-serif; font-size: 11pt; font-style: normal; font-weight: normal; line-height: normal; margin-bottom: 0pt; margin-top: 6pt; mso-hyphenate: auto; mso-list: l0 level1 lfo2; tab-stops: list .25in;">
<span style="-ms-layout-grid-mode: line; color: black; mso-fareast-language: EN-US;">select predictive characteristics</span></div>
</li>
<li style="color: black; font-family: "Times New Roman",serif; font-size: 12pt; font-style: normal; font-weight: normal;"><div style="color: black; font-family: "Calibri",sans-serif; font-size: 11pt; font-style: normal; font-weight: normal; line-height: normal; margin-bottom: 0pt; margin-top: 6pt; mso-hyphenate: auto; mso-list: l0 level1 lfo2; tab-stops: list .25in;">
<span style="-ms-layout-grid-mode: line; color: black; mso-fareast-language: EN-US;">make the Weights Of Evidence<span style="mso-spacerun: yes;"> </span>- and thereby the amount of points in the
scorecard - vary smoothly or even linearly across the attributes</span></div>
<div style="color: black; font-family: "Calibri",sans-serif; font-size: 11pt; font-style: normal; font-weight: normal; margin-bottom: 8pt; margin-top: 0in;">
</div>
<div style="color: black; font-family: "Calibri",sans-serif; font-size: 11pt; font-style: normal; font-weight: normal; margin-bottom: 8pt; margin-top: 0in;">
The amount of points that an attribute is worth in a
scorecard is determined by two factors:</div>
</li>
</ul>
<br />
<ul style="direction: ltr; list-style-type: disc;">
<li style="color: black; font-family: "Times New Roman",serif; font-size: 12pt; font-style: normal; font-weight: normal;"><div style="color: black; font-family: "Calibri",sans-serif; font-size: 11pt; font-style: normal; font-weight: normal; line-height: normal; margin-bottom: 0pt; margin-top: 0in; mso-hyphenate: auto; mso-list: l1 level1 lfo1; tab-stops: list .25in;">
the risk of the attribute relative to the other
attributes of the same characteristic and</div>
</li>
<li style="color: black; font-family: "Times New Roman",serif; font-size: 12pt; font-style: normal; font-weight: normal;"><div style="color: black; font-family: "Calibri",sans-serif; font-size: 11pt; font-style: normal; font-weight: normal; line-height: normal; margin-bottom: 0pt; margin-top: 0in; mso-hyphenate: auto; mso-list: l1 level1 lfo1; tab-stops: list .25in;">
the relative contribution of the characteristic
to the overall score</div>
<div style="color: black; font-family: "Calibri",sans-serif; font-size: 11pt; font-style: normal; font-weight: normal; margin-bottom: 8pt; margin-top: 0in;">
The relative risk of the attribute is determined by its
‘Weight of Evidence’. The contribution of the characteristic is determined by
its co-efficient in a logistic regression (see section Regression below).</div>
<div style="color: black; font-family: "Calibri",sans-serif; font-size: 11pt; font-style: normal; font-weight: normal; margin-bottom: 8pt; margin-top: 0in; text-align: justify;">
The Weight of Evidence of an attribute is defined as the
logarithm of the ratio of the proportion of goods in the attribute over the
proportion of bads in the attribute. High negative values therefore correspond
to high risk, high positive values correspond to low risk. Since an attribute’s
amount of points in the scorecard is proportional to its Weight of Evidence
(see section Score Points Scaling below) the classing process determines how
many points an attribute is worth relative to the other attributes of the same
characteristic. </div>
<div style="color: black; font-family: "Calibri",sans-serif; font-size: 11pt; font-style: normal; font-weight: normal; margin-bottom: 8pt; margin-top: 0in; text-align: justify;">
After classing has defined the
attributes of a characteristic, the characteristic’s predictive power, i.e. its
ability to separate high risks from low risks, can be assessed with the so
called Information Value measure.<span style="mso-spacerun: yes;"> </span>This
will aid the selection of characteristics for inclusion in the scorecard. The
Information Value is the weighted sum of the Weights of Evidence of the
characteristic’s attributes. The sum is weighted by the difference between the
proportion of goods and the proportion of bads in the respective attribute. The
Information Value <span style="-ms-layout-grid-mode: line; color: black; mso-fareast-language: EN-US;">should be greater than 0.02 for a characteristic to be
considered for inclusion in the scorecard. Information Values lower than 0.1
can be considered weak, smaller than 0.3 medium and smaller than 0.5 strong. If
the Information Value is greater than 0.5, the characteristic may be
over-predicting, meaning that it is in some form trivially related to the
good/bad information. </span></div>
<div style="color: black; font-family: "Calibri",sans-serif; font-size: 11pt; font-style: normal; font-weight: normal; margin-bottom: 8pt; margin-top: 0in; tab-stops: 396.9pt; text-align: justify;">
<span style="-ms-layout-grid-mode: line; color: black; mso-fareast-language: EN-US;"><span style="mso-spacerun: yes;"> </span></span>There is no single criterion for when a
grouping can be considered satisfactory.<span style="mso-spacerun: yes;">
</span>A linear or at least monotone increase or decrease of the Weights of
Evidence is often what is desired in order for the scorecard to appear
plausible. Some analysts would always only include those characteristics where
a sensible re-grouping can achieve this.<span style="mso-spacerun: yes;">
</span>Others may consider a smooth variation sufficiently plausible and would
include a non-monotone characteristic such as ‘income’, where risk is high for
both high and low incomes, but low for medium incomes, provided the Information
Value is high enough.</div>
</li>
</ul>
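<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">The two measures described above can be sketched in Python as follows. The sketch assumes the usual reading of the definition, namely that the ‘proportion of goods in the attribute’ is the share of all goods that fall into that attribute (and likewise for bads), and that every attribute contains at least one good and one bad. The age bands and counts are invented, and the function names are hypothetical.</span></div>

```python
import math


def weights_of_evidence(counts):
    """counts maps attribute -> (n_good, n_bad); returns attribute -> WoE.

    WoE = ln(share of all goods in the attribute / share of all bads in it).
    Assumes every attribute has at least one good and one bad.
    """
    total_good = sum(g for g, _ in counts.values())
    total_bad = sum(b for _, b in counts.values())
    return {
        attr: math.log((g / total_good) / (b / total_bad))
        for attr, (g, b) in counts.items()
    }


def information_value(counts):
    """Sum of WoE weighted by (share of goods - share of bads) per attribute."""
    total_good = sum(g for g, _ in counts.values())
    total_bad = sum(b for _, b in counts.values())
    woe = weights_of_evidence(counts)
    return sum(
        (g / total_good - b / total_bad) * woe[attr]
        for attr, (g, b) in counts.items()
    )


# Invented counts for a hypothetical 'age' characteristic after classing.
age_counts = {"18-25": (200, 150), "26-40": (400, 200), "41+": (400, 150)}
print(weights_of_evidence(age_counts))
print(round(information_value(age_counts), 4))
```

<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">In this invented example the youngest band has a negative Weight of Evidence (higher risk), the oldest a positive one (lower risk), and the Information Value falls in the range the text calls weak.</span></div>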
<br />
<br />
<h4 style="margin: 0in 0in 8pt;">
Regression analysis</h4>
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">After the relative risk across
attributes of the same characteristic has been quantified, a logistic
regression analysis now determines how to weigh the characteristics against
each other.<span style="mso-spacerun: yes;"> </span>The Regression node
receives one input variable for each characteristic. This variable contains as
values the Weights of Evidence of the characteristic’s attributes (see table 1
for an example of Weight of Evidence coding). Note that Weight of Evidence
coding is different from dummy variable coding, in that single attributes are
not weighted against each other independently, but whole characteristics are,
thereby preserving the relative risk structure of the attributes as determined
in the classing stage.</span></div>
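<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">The difference between the two codings can be illustrated with a small Python sketch. The Weight of Evidence values below are invented and the names <code>WOE</code>, <code>woe_code</code> and <code>dummy_code</code> are hypothetical; the point is only that Weight of Evidence coding yields one numeric input per characteristic, whereas dummy coding yields one indicator per attribute.</span></div>

```python
# Hypothetical Weights of Evidence per attribute (illustrative values only).
WOE = {
    "age": {"18-25": -0.41, "26-40": 0.0, "41+": 0.29},
    "residential_status": {"tenant": -0.20, "owner": 0.35},
}


def woe_code(application, woe=WOE):
    """One numeric input per characteristic: the WoE of the attribute held."""
    return [woe[c][application[c]] for c in woe]


def dummy_code(application, woe=WOE):
    """One 0/1 indicator per attribute, shown for comparison."""
    return [1 if application[c] == a else 0 for c in woe for a in woe[c]]


applicant = {"age": "41+", "residential_status": "tenant"}
print(woe_code(applicant))    # two inputs, one per characteristic
print(dummy_code(applicant))  # five inputs, one per attribute
```

<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">With Weight of Evidence coding the regression estimates a single co-efficient per characteristic, so the relative spacing of the attributes stays as it was fixed during classing.</span></div>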
<br />
<div style="margin: 0in 0in 8pt;">
<span style="font-family: Calibri;">A variety of further selection methods (forward, backward,
stepwise) can be used in the Regression node to eliminate redundant
characteristics. In our case we use a simple regression. The regression
co-efficients it produces are multiplied in the following step with the Weights of Evidence of the attributes to
form the basis for the score points in the scorecard.</span></div>
<br />
<div style="margin: 0in 0in 8pt;">
<span style="font-family: Calibri;"> </span></div>
<h4 style="margin: 0in 0in 8pt;">
Score points scaling</h4>
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">For each attribute its Weight of
Evidence and the regression co-efficient of its characteristic could now be
multiplied to give the score points of the attribute. An applicant’s total
score would then be proportional to the logarithm of the predicted bad/good
odds of that applicant.<span style="mso-spacerun: yes;"> </span>However, score points
are commonly scaled linearly to take more friendly (integer) values and to
conform with industry or company standards. We scale the points such that a
total score of 600 points corresponds to good/bad odds of 50 to 1 and that an
increase of the score of 20 points corresponds to a doubling of the good/bad
odds. For the derivation of the scaling rule that transforms the score points
of each attribute see equations 3 and 4. The scaling rule is implemented in the
Scorecard node (see Figure 1), where it can be easily parameterized. The
resulting scorecard is output as a table in HTML and is shown in table 2.<span style="mso-spacerun: yes;"> </span>Note how the score points of the various
characteristics cover different ranges. The score points develop smoothly and,
with the exception of the ‘Income’ variable, also monotonically across the
attributes.</span></div>
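<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">The scaling targets stated above (600 points at good/bad odds of 50 to 1, and 20 points for every doubling of the odds) determine a linear transformation of the logarithm of the odds. The Python sketch below shows that transformation for the total score only. It is the common textbook form of such a scaling and is an assumption here, not a reproduction of the equations referred to in the text; how the total is distributed over the individual attributes is not shown, and all names are hypothetical.</span></div>

```python
import math

BASE_SCORE = 600       # score that corresponds to the base odds (from the text)
BASE_ODDS = 50         # good/bad odds of 50 to 1 (from the text)
POINTS_TO_DOUBLE = 20  # points added when the good/bad odds double (from the text)

# Linear scaling of log-odds: score = OFFSET + FACTOR * ln(odds).
FACTOR = POINTS_TO_DOUBLE / math.log(2)
OFFSET = BASE_SCORE - FACTOR * math.log(BASE_ODDS)


def score_from_odds(good_bad_odds):
    """Scaled total score for given good/bad odds."""
    return OFFSET + FACTOR * math.log(good_bad_odds)


for odds in (25, 50, 100, 200):
    print(odds, round(score_from_odds(odds)))
```

<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">Odds of 50 to 1 map to 600 points, and each doubling of the odds (100 to 1, 200 to 1) adds another 20 points.</span></div>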
<br />
<div style="margin: 0in 0in 8pt;">
<span style="font-family: "Arial",sans-serif; font-size: 12pt; line-height: 106%; mso-bidi-font-family: Calibri; mso-bidi-font-size: 11.0pt;"> </span></div>
<br />
<h4 style="margin: 0in 0in 8pt;">
<span style="mso-bidi-font-family: Arial;">Reject Inference</span></h4>
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">The application scoring models we
have built so far, even though we have done everything correctly, still suffer
from a fundamental bias. They have been built based on a population that is
structurally different from the population to which they are supposed to be
applied. All the example applications in the development sample are
applications that have been accepted by the old generic scorecard that has been
in place during the last two years. This is so because only for those accepted
applications is it possible to evaluate their performance and to define a
good/bad variable.<span style="mso-spacerun: yes;"> </span>However, the
through-the-door population that is supposed to be scored is composed of all
applicants, those that would have been accepted and those that would have been
rejected by the old scorecard. Note that this is only a problem for application
scoring, not for behavioral scoring. </span><span style="font-family: Calibri;">As a partial remedy to this
fundamental bias, it is common practice to go through a process of reject
inference. The idea of this approach is to score the data that is retained of
the rejected applications with the model that is built on the accepted
applications. Then rejects are classified as inferred goods or inferred bads and
are added to the accepts data set that contains the actual goods and bads. This
augmented data set then serves as the input data set of a second modeling run.
In the case of a scorecard model this involves the re-adjustment of the classing
and the re-calculation of the regression coefficients.</span></div>
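<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">The augmentation step can be sketched as follows. This is only an illustration under stated assumptions: rejects are classified with a single hard cutoff (the text does not say how the classification is done), a higher score means lower risk, and the scoring function, field name and cutoff are hypothetical stand-ins for the model built on the accepts.</span></div>

```python
def augment_with_inferred_rejects(accepts, rejects, score_fn, cutoff):
    """Build the augmented data set used for the second modeling run.

    accepts  : list of (features, label) pairs with observed "good"/"bad" labels
    rejects  : list of features for rejected applications (no observed label)
    score_fn : model built on the accepts; assumed to return higher = less risky
    cutoff   : hypothetical threshold separating inferred goods from inferred bads
    """
    augmented = [(features, label, "observed") for features, label in accepts]
    for features in rejects:
        inferred = "good" if score_fn(features) >= cutoff else "bad"
        augmented.append((features, inferred, "inferred"))
    return augmented

def toy_score(features):
    """Stand-in for the scorecard built on accepted applications (made up)."""
    return 500 + 2 * features["months_on_job"]

accepts = [({"months_on_job": 60}, "good"), ({"months_on_job": 30}, "bad")]
rejects = [{"months_on_job": 55}, {"months_on_job": 3}]
augmented = augment_with_inferred_rejects(accepts, rejects, toy_score, cutoff=600)
for row in augmented:
    print(row)
```

<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">The third field keeps track of which labels were observed and which were inferred, which is useful when the classing and the regression are re-run on the augmented data.</span></div>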
<br />
<h3 style="margin: 2pt 0in 0pt;">
<span style="color: #b01513; font-family: Century Gothic; font-size: small;"> </span></h3>
<div style="margin: 2pt 0in 0pt;">
<br /></div>
<div style="margin: 2pt 0in 0pt;">
------------------------------------------------------------------------------------------------------</div>
<div style="margin: 2pt 0in 0pt;">
<br /></div>
<br />
<h4 style="margin: 0in 0in 8pt;">
Decision Trees </h4>
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">On the other hand, a decision
tree may outperform a scorecard in terms of predictive accuracy, because unlike
the scorecard, it detects and exploits interactions between characteristics. In
a decision tree model, each answer that an applicant gives determines what
question he is asked next. If the age of an applicant is, for example, greater
than 50, the model may suggest granting credit without any further questions,
because the average bad rate of that segment of applications is sufficiently
low. If, at the other extreme, the age of the applicant is below 25, the model
may suggest asking about time on the job next. Credit would then perhaps be
granted only to those that have exceeded 24 months of employment, because only in
that sub-segment of youngsters is the average bad rate sufficiently low.<span style="mso-spacerun: yes;"> </span>A decision tree model thus consists of a set
of if … then … else rules that are still quite straightforward to apply. The decision
rules are also easy to understand, maybe even more so than a decision rule that
is based on a total score that is made up of many components. </span><span style="font-family: Calibri;">However, a decision rule from a
tree model, while easy to apply and easy to understand, may be hard to justify
for applications that lie on the border between two segments.<span style="mso-spacerun: yes;"> </span>There will be cases where an applicant will,
for example, say: ‘If I had only been 2 months older I would have received
credit without further questions, but now I am asked for additional securities.
That is unfair.’ That applicant may also be tempted to make a false statement
about his age in his next application. </span><span style="font-family: Calibri;">Even if a decision tree is not
used directly for scoring, this model type still adds value in a number of
ways: the identification of clearly defined segments of applicants with a
particularly high or low risk can give dramatic new insight into the risk
structure of the population. Decision trees are also used in scorecard
monitoring, where they identify segments of applications where the scorecard
underperforms.</span><span style="font-family: Calibri;"> </span></div>
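<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">The example rules above translate directly into if … then … else code. The sketch below is illustrative only: the thresholds of 50 years, 25 years and 24 months come from the example, while the outcome labels and the treatment of applicants aged 25 to 50, which the example does not cover, are assumptions.</span></div>

```python
def tree_decision(age, months_on_job=None):
    """Apply the example decision tree from the text to one applicant."""
    if age > 50:
        # Segment with a sufficiently low average bad rate: no further questions.
        return "grant"
    if age < 25:
        if months_on_job is None:
            # The answer to the first question determines the next question.
            return "ask time on job"
        # Only the sub-segment with more than 24 months of employment is granted.
        return "grant" if months_on_job > 24 else "decline"
    # Ages 25 to 50 are not described in the text; "refer" is a placeholder.
    return "refer"

print(tree_decision(51))                     # just above the age split
print(tree_decision(50))                     # just below it: a different treatment
print(tree_decision(23))
print(tree_decision(23, months_on_job=30))
print(tree_decision(23, months_on_job=12))
```

<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">The first two calls show the border effect discussed above: two applicants who differ only slightly in age fall on different sides of a split and are treated differently.</span></div>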
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">Finally, decision trees can often achieve predictive
power similar to that of a scorecard with far fewer characteristics. Models that
require only a few characteristics, sometimes called ‘short scores’, are becoming
especially popular in the context of campaigning and marketing for credit
products. However, there is a fundamental problem associated with short scores:
they diminish the richness of information that the organization can collect on
the applicants and thereby erode the basis for future modeling.</span></div>
<br />
<div style="margin: 0in 0in 8pt;">
<span style="font-family: Calibri;"> </span><span style="font-family: Calibri;">--------------------------------------------------------------------------------------------------------</span></div>
<br />
<h4 style="margin: 0in 0in 8pt;">
Neural Nets</h4>
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">With the decision tree, we could
see that there is such a thing as a decision rule that is too easy to understand
and thereby invites fraud. Ironically, there is no danger of this
happening with a neural network. Neural networks are extremely flexible models
that combine characteristics in a variety of ways. Their
predictive accuracy can therefore be far superior to scorecards and they don’t
suffer from sharp ‘splits’ as decision trees do. However, it is virtually
impossible to explain or understand the score that is produced for a particular
application in any simple way.<span style="mso-spacerun: yes;"> </span>It can
therefore be difficult to justify a decision that is made on the basis of a
neural network model. In some countries it may even be a legal requirement to
be able to explain a decision and such a justification then must be produced
with additional methods. A neural network of superior predictive power
therefore is best suited for certain behavioral or collection scoring purposes,
where the average accuracy of the prediction is more important than the insight
into the score for each particular case.<span style="mso-spacerun: yes;">
</span>Neural network models cannot be applied manually like scorecards or
simple decision trees, but require software to score the application. Then,
however, their use is just as simple as that of the other model types. </span></div>
<br />
<h4 style="margin: 0in 0in 8pt;">
<span style="font-family: "Arial",sans-serif; font-size: 12pt; line-height: 106%; mso-bidi-font-family: Calibri; mso-bidi-font-size: 11.0pt;"> </span></h4>
<h4 style="margin: 0in 0in 8pt;">
<span style="font-family: "Arial",sans-serif; font-size: 12pt; line-height: 106%; mso-bidi-font-family: Calibri; mso-bidi-font-size: 11.0pt;"></span><span style="font-family: "Arial",sans-serif; font-size: 12pt; line-height: 106%; mso-bidi-font-family: Calibri; mso-bidi-font-size: 11.0pt;">Model
Assessment </span></h4>
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">After building both a scorecard
and a decision tree model, we now want to compare the quality of the models on
the validation data. One of the standard Enterprise Miner charts in the
Assessment node is the concentration curve, shown in Figure 9. It shows
how many of all the bads in the population are concentrated in the group of 2%
(4%, 6%, …) worst applicants as predicted by the model. Sorting applicants
based on the scorecard scores will result, for example, in around 30% of all
the bads being concentrated in the 10% of applicants that are considered the worst
by the scorecard model. The decision tree is only able to concentrate about
half as many bads in the same number of what it calls the worst applicants (the
10% decile is marked by the vertical black line in Figure 9). In summary, the
scorecard is assessed to be superior, because its curve stays above that of the
tree.</span></div>
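<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">One point on a concentration curve can be computed as in the sketch below. It assumes that a lower score means a worse applicant and uses made-up validation data. It is not the Enterprise Miner implementation, only an illustration of what the chart measures.</span></div>

```python
def bads_captured(scores, is_bad, fraction):
    """Share of all bads found among the worst `fraction` of applicants.

    scores   : model scores, assumed lower = worse applicant
    is_bad   : 1 for an actual bad, 0 for an actual good (same order as scores)
    fraction : size of the worst group, e.g. 0.10 for the worst 10%
    """
    total_bads = sum(is_bad)
    if total_bads == 0:
        raise ValueError("the sample contains no bads")
    ranked = sorted(zip(scores, is_bad), key=lambda pair: pair[0])  # worst first
    n_worst = int(round(fraction * len(ranked)))
    return sum(bad for _, bad in ranked[:n_worst]) / total_bads

# Made-up validation sample of 10 applicants, 3 of them bad.
scores = [300, 350, 400, 450, 500, 550, 600, 650, 700, 750]
is_bad = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

curve = [(f, bads_captured(scores, is_bad, f)) for f in (0.1, 0.2, 0.5, 1.0)]
for fraction, share in curve:
    print(fraction, round(share, 2))
```

<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">Computing this curve for each model on the same validation data and comparing the values at each fraction is the comparison made in the chart: the model whose curve lies higher concentrates more bads among its worst-ranked applicants.</span></div>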
<br />
<div style="margin: 0in 0in 8pt;">
<span style="font-family: Calibri;"> </span><span style="font-family: Calibri;"> </span></div>
<div style="margin: 0in 0in 8pt;">
<b><span style="font-family: "ArialNarrow,Bold",sans-serif; font-size: 13pt;">Defining decision rules for application approval and risk
management</span></b></div>
<br />
<div style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;">
<b><span style="font-family: "ArialNarrow,Bold",sans-serif; font-size: 13pt;"> </span></b></div>
<br />
<div style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;">
<span style="mso-bidi-font-family: Arial;"><span style="font-family: Calibri;">Application approval and risk management do
not rely on scores alone, but scores do form the basis of a decision strategy
that groups customers into homogeneous segments. These segments can then be
treated with the same action. For example, in the case of approval decisions,
customers are often classified using appropriate cutoff scores as approved,
referred for examination or rejected. Other segmentation strategies can
determine the limit amount that is assigned to a segment or the collection
actions taken. An important type of segmentation is the division of customers
into risk pools for the purpose of</span></span><span style="font-family: "Arial",sans-serif; font-size: 9pt;"> </span><span style="mso-bidi-font-family: Arial;"><span style="font-family: Calibri;">calculating
certain risk components: probability of default (PD), loss given default (LGD)
and exposure at default (EAD). These risk components are required by the risk
weighted assets (RWA) calculation mandated by the Basel II and III capital
requirements regulation. Analysts apply the scorecard and the pooling
definition to a historical data set. The long-run historical averages of the
default rate, losses and exposures can then be calculated by pool and used as
input into the RWA calculation. There are various ways to group customers into
segments using a scorecard. Often segmentation involves the setting of
thresholds. Sometimes analysts define these thresholds manually, and sometimes
they use an algorithm to automatically find a decision rule that is optimal in
a specific way. The way multiple thresholds are combined further characterizes
a decision rule. Typical examples of decision rules include policy rules
(exclusions), single score bins, multiple score bins and decision trees.</span></span></div>
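<div style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;">
<span style="mso-bidi-font-family: Arial;"><span style="font-family: Calibri;">A threshold-based decision rule and a pooled historical default rate can be sketched as follows. The cutoff scores of 560 and 620, the reuse of the decision segments as risk pools, and the sample data are hypothetical and chosen purely for illustration.</span></span></div>

```python
def approval_decision(score, reject_below=560, approve_from=620):
    """Classify an applicant by two cutoff scores (hypothetical values)."""
    if score >= approve_from:
        return "approved"
    if score >= reject_below:
        return "referred"
    return "rejected"

def default_rate_by_pool(history, pool_fn):
    """Historical default rate per pool.

    history : list of (score, defaulted) pairs from a historical data set
    pool_fn : function mapping a score to a pool name
    """
    counts = {}
    for score, defaulted in history:
        pool = pool_fn(score)
        n, d = counts.get(pool, (0, 0))
        counts[pool] = (n + 1, d + int(defaulted))
    return {pool: d / n for pool, (n, d) in counts.items()}

# Made-up history; the decision segments double as risk pools here.
history = [(640, False), (630, False), (600, True), (590, False), (540, True), (530, True)]
rates = default_rate_by_pool(history, approval_decision)
print(rates)
```

<div style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;">
<span style="mso-bidi-font-family: Arial;"><span style="font-family: Calibri;">In practice the pool definition for PD, LGD and EAD would be separate from the approval segments, and the averages would be taken over a long historical period.</span></span></div>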
<div style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;">
<span style="mso-bidi-font-family: Arial;"><span style="font-family: Calibri;"><br /></span></span></div>
<br />
<div style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;">
<span style="font-family: "Arial",sans-serif; font-size: 9pt;"> </span></div>
<br />
<div style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;">
<b><span style="font-family: "ArialNarrow,Bold",sans-serif;">Deploying scores and decisions</span></b></div>
<br />
<div style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;">
<b><span style="font-family: "ArialNarrow,Bold",sans-serif; font-size: 15pt;"> </span></b></div>
<br />
<div style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;">
<span style="mso-bidi-font-family: Arial;"><span style="font-family: Calibri;">Execution of decision rules can be done in
batch for all customers so that the assignment of each customer to a group and
an action is available in an operational data store for instant retrieval by
front-office software. Or, alternatively, the front-office software can
initiate execution of the decision rule to make a decision on an individual
customer, possibly using new or updated information supplied by the customer at
that time (online). The decision is then passed back immediately to the
front-office software. In either case, the decision rule is not executed by the
front-office software but through middle-layer software on a central server.
For existing credit customers, the batch option will be most commonly used,
since behavioral information derived from the customer transaction history and
other stored customer characteristics is typically more predictive than
information a customer might supply in the front office. </span></span></div>
<br />
<br />
<br />
<br />
<h3>Is big-data analytics the ultimate solution for airlines?</h3>
<div>Published 2015-08-09</div>
<br />
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">If the airline industry could be
described in two words, it would be "intensely competitive". The airline
industry generates billions of dollars every year and still has a cumulative
profit margin of less than 1%.<span style="mso-spacerun: yes;"> </span>The
reason for this lies in the industry’s vast complexity. Airlines have a multitude
of different business issues that need to be solved at once, such as a globally
uneven playing field, revenue vulnerability, an extremely variable planning
horizon, high cyclicality and seasonality, fierce competition, excessive
government intervention, and high fixed and low marginal costs. <span style="mso-spacerun: yes;"> </span>The low profit-to-turnover ratio of airlines
has been further exacerbated by growing low-fare competition, increasing
security costs, and frequent dynamic shifts in air travel consumer behavior.
The historical business model of many network airlines now appears to be unable
to support sustained profitability under any but the most favorable economic
conditions. The industry is at a turning point.<span style="mso-spacerun: yes;">
</span>The market dictates an “adapt or die” policy, and the airlines that wish
to survive will face the challenge of having to make significant changes to
their current archaic business model. Doing this requires far more room for
analytical technologies that allow the flow of consistent, repeatable and
reliable enterprise-wide intelligence needed to tackle all the challenges the
industry is facing. <span style="mso-spacerun: yes;"> </span></span></div>
<span style="font-family: Times, "Times New Roman", serif;">
</span><br />
<span style="font-family: Times, "Times New Roman", serif;">
</span><br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">To ensure the best chance for
full economic recovery, airlines should fully leverage their most prolific
asset: data.<span style="mso-spacerun: yes;"> </span>Data used in conjunction
with innovative technologies that allow the creation of an Enterprise-Wide
Intelligence Platform will provide the capabilities for a comprehensive, intelligent
management and decision-making system throughout the enterprise. The ultimate
benefits of implementing and using an enterprise-wide intelligence platform, together
with airline business acumen and experience, would include timely responses to
current and future market demands, better planning and strategically aligned
decision making, and clear understanding and monitoring of all key performance
drivers relevant to the airline industry. Achieving these benefits in a timely
and intelligent manner will ultimately result in lower operating costs, better
customer service, market leading competitiveness and increased profit margin
and shareholder value.<span style="mso-spacerun: yes;"> </span></span></div>
<span style="font-family: Times, "Times New Roman", serif;">
</span><br />
<div style="margin: 0in 0in 8pt; tab-stops: list .5in;">
<span style="line-height: 106%; mso-bidi-font-size: 11.0pt;"><span style="font-family: Times, "Times New Roman", serif;">Business challenges in the airline industry</span></span></div>
<div style="margin: 0in 0in 8pt;">
<span style="font-family: Times, "Times New Roman", serif;">The key to successful deployment of technological advances in the
airline industry is the ability to anticipate how the current business model
will change to survive in tough market conditions.<span style="mso-spacerun: yes;"> </span> </span></div>
<span style="font-family: Times, "Times New Roman", serif;">
</span><br />
<div style="margin: 0in 0in 8pt;">
<span style="font-family: Times, "Times New Roman", serif;">Some of the challenges that can be successfully addressed by an
Enterprise Intelligence Platform are: </span></div>
<span style="font-family: Times, "Times New Roman", serif;">
</span><br />
<ul style="margin-top: 0in;" type="disc">
<li style="color: black; font-style: normal; line-height: normal; margin: 0in 0in 0pt; mso-hyphenate: auto; mso-list: l0 level1 lfo1; tab-stops: list .5in; text-align: justify;"><span style="font-family: Times, "Times New Roman", serif;">The need for accurate daily and weekly performance
measurement reports (e.g. “flash/estimated” revenue, operating costs and
net contribution reports for every aircraft’s actual flight per
sector/route).</span></li>
</ul>
<ul style="margin-top: 0in;" type="disc">
<li style="color: black; font-style: normal; line-height: normal; margin: 0in 0in 0pt; mso-hyphenate: auto; mso-list: l0 level1 lfo1; tab-stops: list .5in; text-align: justify;"><span style="font-family: Times, "Times New Roman", serif;">The need to better manage all aspects of risk.</span></li>
</ul>
<ul style="margin-top: 0in;" type="disc">
<li style="color: black; font-style: normal; line-height: normal; margin: 0in 0in 0pt; mso-hyphenate: auto; mso-list: l0 level1 lfo1; tab-stops: list .5in; text-align: justify;"><span style="font-family: Times, "Times New Roman", serif;">The need for better impact analysis and more
effective optimization of all resources, as well as the ability to produce
accurate passenger-revenue forecasts.</span></li>
</ul>
<ul style="margin-top: 0in;" type="disc">
<li style="color: black; font-style: normal; line-height: normal; margin: 0in 0in 0pt; mso-hyphenate: auto; mso-list: l0 level1 lfo1; tab-stops: list .5in; text-align: justify;"><span style="font-family: Times, "Times New Roman", serif;">The need for a holistic, 360-degree view of the
airline’s customers, suppliers, service providers and
distributors.</span></li>
<li style="color: black; font-style: normal; line-height: normal; margin: 0in 0in 0pt; mso-hyphenate: auto; mso-list: l0 level1 lfo1; tab-stops: list .5in; text-align: justify;"><span style="font-family: Times, "Times New Roman", serif;">The need for expense verification models in order to better
control all industry cost aspects.</span></li>
</ul>
<span style="font-family: Times, "Times New Roman", serif;">
</span><br />
<span style="font-family: Times, "Times New Roman", serif;">
</span><a href="https://www.blogger.com/null" name="OLE_LINK5"><span style="mso-bookmark: OLE_LINK6;"><span style="font-family: Times, "Times New Roman", serif;">Performance
Measurements</span></span></a><br />
<span style="font-family: Times, "Times New Roman", serif;">
</span><br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">Airlines usually operate in a
globally competitive environment and therefore require prompt and accurate
enterprise performance measurements. Furthermore, airlines are volume driven
and small variations (passengers flown, fuel spent/bought, load carried) can
multiply into major effects – therefore appropriate and timely action is
critical. They also have substantial difficulty producing reliable daily or weekly
performance measurements. Current airline “legacy” IT systems, such as
Revenue Accounting, require several weeks after a month end to generate revenue
results for every flight per sector/route.<span style="mso-spacerun: yes;">
</span>An Enterprise Intelligence Platform can automate production of daily
activity reports, such as number of passengers flown per flight/sector, distance
flown, etc., which can be used to provide estimated performance measurements such
as daily or weekly revenues for specific routes or sectors. </span></div>
<span style="font-family: Times, "Times New Roman", serif;">
</span><br />
<div style="margin: 8pt 0in 0pt;">
<span style="font-family: Times, "Times New Roman", serif;">Risk Management</span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">The
global airline industry has been subjected to major catastrophes over the past
years.<span style="mso-spacerun: yes;"> </span>It is accordingly imperative for
airlines to develop various risk management models and strategies to protect
themselves from the negative impact of these types of events. Furthermore, due to
the global playing field, airlines often earn their revenues and pay their costs in
different baskets of currencies (e.g. USD, Euro, GBP etc.). As a result there is
frequently a mismatch between the flow of revenue receipts and expenses in each
basket of currency, which creates a risk exposure that needs to be reported.</span></div>
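<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">The currency mismatch can be illustrated with a very small sketch that nets receipts against expenses per currency. The amounts are made up, and a real exposure report would also need timing, hedges and conversion rates, none of which are modeled here.</span></div>

```python
def net_exposure_by_currency(receipts, expenses):
    """Net position (receipts minus expenses) for every currency that appears.

    receipts, expenses : dicts mapping a currency code to an amount
    A negative result means more is paid than earned in that currency.
    """
    currencies = sorted(set(receipts) | set(expenses))
    return {ccy: receipts.get(ccy, 0.0) - expenses.get(ccy, 0.0) for ccy in currencies}

# Made-up amounts (in millions) for one reporting period.
receipts = {"USD": 120.0, "EUR": 80.0}
expenses = {"USD": 150.0, "GBP": 20.0}

exposure = net_exposure_by_currency(receipts, expenses)
for ccy, net in exposure.items():
    print(ccy, net)
```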
<span style="font-family: Times, "Times New Roman", serif;">
</span><br />
<div style="margin: 8pt 0in 0pt;">
<span style="font-family: Times, "Times New Roman", serif;">Control and Verification</span></div>
<div style="margin: 8pt 0in 0pt; text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">Airline
carriers require a number of control and verification models to be able to
control costs arising from their various operational activities. To enable this,
airlines have a pressing need for a complete and integrated repository of
flight information data gathered from all their disparate business units. This
will enable computation of various efficiency analytics, e.g. planned fuel
usage compared with actual fuel usage per aircraft, or crew utilization (roster
optimization). These issues could also be fully addressed by consolidating and
analyzing relevant flight and aircraft data. In turn this would help to create
a 360° view of each flight and aircraft, allowing the business users to
dramatically improve their control and verification systems.<span style="mso-spacerun: yes;"> </span></span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<div style="line-height: normal; margin: 10pt 0in; mso-hyphenate: auto; mso-pagination: widow-orphan;">
<a href="https://www.blogger.com/null" name="OLE_LINK8"></a><a href="https://www.blogger.com/null" name="OLE_LINK7"><span style="mso-bookmark: OLE_LINK8;"><span style="font-family: Times, "Times New Roman", serif;">Load forecasting</span></span></a></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
<span style="mso-bookmark: OLE_LINK7;"><span style="mso-bookmark: OLE_LINK8;">Airlines
require the development of an effective and holistic forecasting model to regularly
assess the impact of options and alternatives such as increasing aircraft seats
available, adjusting fares, introducing new routes etc. Forecasts should also
take account of actual statistical trends and results e.g. actual passengers
carried and actual average fares earned. Such </span></span>forecasts should then
be compared against budgets and prior year performance. </span></div>
<span style="font-family: Times, "Times New Roman", serif;">
</span><br />
<span style="font-family: Times, "Times New Roman", serif;">
</span><br />
<div style="margin: 0in 0in 8pt;">
<span style="font-family: Times, "Times New Roman", serif;">Holistic customer view</span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
Airlines would greatly benefit
from knowing and understanding their business environment along some of the key
business issues, such as performance, behavior, risk, profitability, etc. Using
customers as an example, <span style="line-height: 106%; mso-bidi-font-family: TimesNewRomanPSMT; mso-bidi-font-size: 10.0pt;">the
main objective would be to enrich the knowledge about individual customers,
leading to new strategic customer segments. </span>This intelligence would
allow airlines to reap a host of benefits, such as successful, targeted
customer promotions and cross-selling and up-selling campaigns for different
flights and booking classes, leading to improved yield and revenue. For example,
it would tell airlines when to limit discounts on flight routes
that are usually over-booked, allowing the large number of passengers to
compete for high-profit seats immediately prior to departure. Such
multidimensional views of the business can help the airline to better serve its
customers through more effective, efficient and personalized service, receiving
in return customer loyalty, support and market share, all leading to higher
profitability. </span></div>
<div style="text-align: justify;">
<span style="font-family: Times, "Times New Roman", serif;">
</span></div>
<h3>RFM Segmentation</h3>
<div>Published 2015-08-09</div>
<br />
<div style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;"> </span><span style="font-family: Calibri;">RFM segmentation is well known in the retail industry. Its
basic premise is that by knowing the recency, frequency and value of purchases
you are in a good position to start characterizing a specific customer in terms of
value, purchasing behavior and loyalty. However, the same logic can be
applied to any phenomenon that we are trying to predict. Knowing how
often something happens, how recently it happened and its magnitude has the same
type of predictive power as it has in the retail context. Whenever I have used it
for predictive modeling, RFM has always come out as one of the top predictors.
So, let me delve deeper into explaining the basic principles of the RFM method. </span></div>
<br />
<div style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;"> </span></div>
<br />
<div style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;">RFM segments the customer base by recency of purchase (R), frequency
of purchase (F) and monetary value (M). The <i style="mso-bidi-font-style: normal;">recency</i>
parameter is the most powerful of the three. In forecasting models, the latest observations of a time
series often have the highest weighting and are the most predictive of the next
forecast value. The second most powerful is <i style="mso-bidi-font-style: normal;">frequency</i>, as long as the definition of <i style="mso-bidi-font-style: normal;">frequency</i> is limited to the last month or quarter and does not cover the entire
life-span of the customer relationship. The least powerful is <i style="mso-bidi-font-style: normal;">monetary value</i>. Since the total value in a period of time is
directly correlated with <i style="mso-bidi-font-style: normal;">frequency</i>, it
is advisable to use an average value. </span></div>
<br />
<div style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;"> </span></div>
<br />
<div style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;">There are several different ways to calculate RFM groups and scores
and below is the classic approach:</span></div>
<br />
<div style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;"> </span></div>
<br />
<div style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;">First, create 5 segments based on recency by dividing the data file
into 5 exact quintiles, where the contacts with the most recent transactions
(i.e. in the top 20% of the file) are given a<span style="mso-spacerun: yes;">
</span>recency value of 5, the next 20% are given a recency value of 4, and
so on. Then, each of those quintiles is segmented into 5 further quintiles based
on the <i style="mso-bidi-font-style: normal;">frequency</i> value for each
contact, where the contacts with the highest transaction frequency are given a frequency value of
5, the next 20% are given a frequency value of 4, and so on.<span style="mso-spacerun: yes;"> </span>Finally, each of these segments is
segmented into 5 further quintiles based on the monetary value of each
contact, i.e. the total amount which all that contact’s transactions add up to.
Those contacts with the highest monetary values (i.e. in the top 20%) are
given a monetary value of 5, the next 20% are given a monetary value of 4,
and so on.<span style="mso-spacerun: yes;"> </span>At the end of this process,
you will have 125 segments with an RFM group between 111 and 555, with the same
number of contacts within each segment, and each contact will have an RFM score
(the sum of its three values) of between 3 and 15.</span></div>
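<div style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;">The nested-quintile procedure described above can be sketched in plain Python as follows. The field names (id, days_since_last, frequency, monetary) and the synthetic data are assumptions made for illustration, and ties are broken simply by sort order.</span></div>

```python
def assign_quintiles(rows, value_of):
    """Return (row, quintile) pairs; the best 20% by value_of get quintile 5."""
    ranked = sorted(rows, key=value_of)           # worst first, best last
    n = len(ranked)
    return [(row, 1 + (5 * i) // n) for i, row in enumerate(ranked)]

def classic_rfm(customers):
    """Nested quintiles: recency first, then frequency within each recency
    quintile, then monetary value within each recency/frequency cell.
    Returns {customer id: (RFM group, RFM score)}."""
    result = {}
    by_r = {}
    # Fewer days since the last transaction is better, hence the negation.
    for c, r in assign_quintiles(customers, lambda c: -c["days_since_last"]):
        by_r.setdefault(r, []).append(c)
    for r, r_rows in by_r.items():
        by_f = {}
        for c, f in assign_quintiles(r_rows, lambda c: c["frequency"]):
            by_f.setdefault(f, []).append(c)
        for f, f_rows in by_f.items():
            for c, m in assign_quintiles(f_rows, lambda c: c["monetary"]):
                result[c["id"]] = (100 * r + 10 * f + m, r + f + m)
    return result

# Synthetic file of 125 contacts with distinct values in every field.
customers = [
    {"id": i, "days_since_last": i, "frequency": (i * 7) % 125, "monetary": (i * 11) % 125}
    for i in range(125)
]
rfm = classic_rfm(customers)
print(rfm[0], rfm[124])
```

<div style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;">With 125 contacts each of the 125 segments receives exactly one contact, which makes the equal-size property easy to check. With real data the segment sizes are equal only up to rounding and ties.</span></div>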
<br />
<div style="margin: 0in 0in 6pt; text-align: justify;">
<span style="font-family: Calibri;"> </span></div>
<br />
<div style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;">An alternative is the Independent RFM Quintile approach. It still calculates RFM groups and scores using
quintiles, but not just the <i style="mso-bidi-font-style: normal;">recency</i> but also the <i style="mso-bidi-font-style: normal;">frequency</i> and <i style="mso-bidi-font-style: normal;">monetary values</i> for each contact are calculated across the whole
data file and are not dependent on any of the other RFM factors or any
other quintile. Another approach is to use user-definable bands for each
criterion (i.e. each RFM factor) in order to determine what <i style="mso-bidi-font-style: normal;">recency, frequency </i>and<i style="mso-bidi-font-style: normal;"> monetary value</i> should be given to
each contact. Even though RFM segmentation can be used on a “stand-alone” basis,
I always tend to combine it with other demographic and affinity variables
in order to have a more holistic view of the segment's make-up.</span></div>
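<div style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;">For comparison, the independent-quintile variant can be sketched as below: each factor is ranked across the whole file on its own. The input lists and their names are illustrative assumptions, and ties are again broken by sort order.</span></div>

```python
def quintile_of(values):
    """Quintile 1..5 for every position, ranked across the whole file;
    the highest 20% of values receive 5."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])   # lowest value first
    quintiles = [0] * n
    for rank, i in enumerate(order):
        quintiles[i] = 1 + (5 * rank) // n
    return quintiles

def independent_rfm(days_since_last, frequency, monetary):
    """Return a list of (RFM group, RFM score), one entry per contact."""
    r = quintile_of([-d for d in days_since_last])      # fewer days = more recent = better
    f = quintile_of(frequency)
    m = quintile_of(monetary)
    return [(100 * ri + 10 * fi + mi, ri + fi + mi) for ri, fi, mi in zip(r, f, m)]

# Five made-up contacts.
days_since_last = [1, 2, 3, 4, 5]
frequency = [5, 4, 3, 2, 1]
monetary = [50, 40, 30, 20, 10]

rfm = independent_rfm(days_since_last, frequency, monetary)
print(rfm)
```

<div style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;">Because the three rankings are independent, the 125 segments are no longer guaranteed to contain the same number of contacts, which is the main practical difference from the nested approach.</span></div>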
<br />
<div style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;"> </span><span style="font-family: Calibri;"> </span></div>
<br />
<div style="margin: 0in 0in 6pt; text-align: justify;">
<span style="font-family: Calibri;">I have developed my own approach,
which I often use. It is somewhat different from the classic approach and
goes as follows:</span></div>
<br />
<div style="margin: 0in 0in 6pt; text-align: justify;">
<span style="font-family: Calibri;"><span style="mso-spacerun: yes;"> </span>1.) Create variable Total Spend for for
each customer </span></div>
<br />
<div style="margin: 0in 0in 6pt; text-align: justify;">
<span style="font-family: Calibri;"><span style="mso-spacerun: yes;"> </span>2.) Create variable Total number of
visits for each customer</span></div>
<br />
<div style="margin: 0in 0in 6pt; text-align: justify;">
<span style="font-family: Calibri;"><span style="mso-spacerun: yes;"> </span>3.) Divide both variables into 3 equal-frequency bins, each holding about a third of the customers
– the 1st bin would be roughly the lowest third of all
customers in regard to spending (and, as a separate variable, visits) </span></div>
<br />
<div style="margin: 0in 0in 8pt;">
<span style="font-family: Calibri;"><span style="mso-spacerun: yes;"> </span>4.) Determine
which bin each customer falls into (for that period) by total spending and by
total visits, and label him accordingly. Example: the
variable “FRM_Spend_label”<span style="mso-spacerun: yes;"> </span>would have the
values “L”, “M” and “H”. If a customer's total spending for the 12 months
falls within the second bin, give him the value “M” (medium) in the
variable<span style="mso-spacerun: yes;"> </span>“FRM_Spend_label”.</span></div>
<br />
<div style="margin: 0in 0in 8pt;">
<span style="font-family: Calibri;"><span style="mso-spacerun: yes;"> </span>5.) Do the
same thing for visits, creating a new variable “FRM_visit_variable”. </span></div>
<br />
<div style="margin: 0in 0in 8pt;">
<span style="font-family: Calibri;"><span style="mso-spacerun: yes;"> </span>6.) Do a
slightly different thing for “Recency”: starting from the same end date as
was used for spending and visits, look back only 3 months rather than 12.
Then do the following: if the customer's most recent purchase was in month 1 (the most recent
month), give him the value “H”; if the most recent purchase was in month 2,
give him the value “M”; and if the most recent purchase was in month 3, give him
the value “L”.</span></div>
<br />
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">Note – it might happen that most customers have some
sort of purchase in every month, in which case it would be advisable to raise the
threshold above “0”. In other words, count a month as containing a recent purchase only if the monthly
total is above some specified amount greater than “0”. </span></div>
<div style="text-align: justify;">
</div>
<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;"><span style="mso-spacerun: yes;"> </span>7.) Combine
all three FRM dimensions into a single variable whose values are
combinations of “H”, “M” and “L”. A value of “HLH” would mean that the customer
falls in the top group of customers in terms of number of visits to the
stores, that his most recent visit (with purchase larger than…) was in month 3, so he
has not been in the store during the two most recent months, and that he falls in the top group of customers in
terms of the total monetary value he brings to the company.</span></div>
<br />
<div style="margin: 0in 0in 8pt;">
<span style="font-family: Calibri;"><span style="mso-spacerun: yes;"> </span>8.) In the last
step I apply the “19 + 1” rule: I retain the top 19 combinations by frequency
and put all the other combinations into an “other” category, so that my FRM
variable doesn’t have more than 20 distinct values.</span></div>
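<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">Steps 1 to 8 above can be sketched in Python roughly as follows. Treat it as a minimal sketch under stated assumptions rather than the exact implementation: the function names (tercile_labels, recency_label, frm_segments), the transaction layout (customer, months ago, amount) and the sample data are invented for illustration. The post does not say how to label a customer with no qualifying purchase in the last 3 months, so the sketch assumes “L” for that case as well.</span></div>

```python
from collections import Counter, defaultdict

def tercile_labels(totals):
    """Label customers L/M/H by equal-frequency thirds of a {customer: total} dict."""
    ranked = sorted(totals, key=totals.get)  # lowest total first
    n = len(ranked)
    # Note: customers with tied totals on a bin edge may land in different bins.
    return {cust: "LMH"[pos * 3 // n] for pos, cust in enumerate(ranked)}

def recency_label(monthly_totals, threshold=0.0):
    """monthly_totals[0] is the most recent month; only 3 months are looked at."""
    for label, amount in zip("HML", monthly_totals[:3]):
        if amount > threshold:
            return label
    return "L"  # assumption: no qualifying purchase in 3 months is also "L"

def frm_segments(transactions, keep=19, threshold=0.0):
    """transactions: iterable of (customer, months_ago, amount), months_ago 0..11."""
    spend = defaultdict(float)                      # step 1: total spend
    visits = defaultdict(int)                       # step 2: total visits
    monthly = defaultdict(lambda: [0.0, 0.0, 0.0])  # last 3 months for recency
    for cust, months_ago, amount in transactions:
        spend[cust] += amount
        visits[cust] += 1
        if months_ago < 3:
            monthly[cust][months_ago] += amount
    f_label = tercile_labels(visits)                # steps 3-5
    m_label = tercile_labels(spend)
    codes = {                                       # steps 6-7: F + R + M
        c: f_label[c] + recency_label(monthly[c], threshold) + m_label[c]
        for c in spend
    }
    # Step 8, the "19 + 1" rule: keep the most frequent codes, pool the rest.
    top = {code for code, _ in Counter(codes.values()).most_common(keep)}
    return {c: (code if code in top else "other") for c, code in codes.items()}

# Hypothetical 12-month transaction history
tx = [
    ("ann", 0, 120.0), ("ann", 1, 80.0), ("ann", 5, 60.0), ("ann", 9, 40.0),
    ("bob", 2, 20.0),
    ("cat", 1, 55.0), ("cat", 7, 45.0),
    ("dan", 0, 500.0), ("dan", 3, 10.0), ("dan", 4, 10.0),
    ("eve", 10, 15.0),
    ("fay", 2, 50.0), ("fay", 6, 40.0),
]
segments = frm_segments(tx)
for customer, code in segments.items():
    print(customer, code)
```

<div style="margin: 0in 0in 8pt; text-align: justify;">
<span style="font-family: Calibri;">With three labels on three dimensions there are at most 27 possible combinations, so on real data the “19 + 1” rule pools the rarest ones. In this tiny example nothing is pooled unless keep is lowered, and raising threshold implements the note about ignoring months with only trivial purchases.</span></div>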
<br />
<div style="margin: 0in 0in 8pt;">
<span style="font-family: Calibri;">Hope this helps!</span></div>
Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com20tag:blogger.com,1999:blog-3770043454488854818.post-45275778747171439142014-10-30T02:51:00.001-07:002014-10-30T02:54:01.004-07:00<br />
<h3 style="margin: 0in 0in 0pt;">
Importance of knowing the odds <o:p></o:p></h3>
<span style="color: #1f497d; font-family: "Calibri","sans-serif"; font-size: 11pt;"><o:p><span style="font-family: Times New Roman; font-size: small;"> </span></o:p></span><br />
<br />
<div style="-qt-block-indent: 0; margin: 0in 0in 0pt; text-align: justify;">
<a class="irc_mutl" data-ved="0CAcQjRw" href="http://www.google.co.za/url?sa=i&rct=j&q=&esrc=s&frm=1&source=images&cd=&cad=rja&uact=8&ved=0CAcQjRw&url=http%3A%2F%2Fnautil.us%2Fissue%2F4%2Fthe-unlikely%2Fthe-man-who-invented-modern-probability&ei=2ghSVMzDNoLuatmGgsAL&bvm=bv.78597519,d.d2s&psig=AFQjCNFmsmfgIPl1MiWzcyoAh5vouRYXIg&ust=1414748734818262" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"></a><a class="irc_mutl" data-ved="0CAcQjRw" href="http://www.google.co.za/url?sa=i&rct=j&q=&esrc=s&frm=1&source=images&cd=&cad=rja&uact=8&ved=0CAcQjRw&url=http%3A%2F%2Fnautil.us%2Fissue%2F4%2Fthe-unlikely%2Fthe-man-who-invented-modern-probability&ei=2ghSVMzDNoLuatmGgsAL&bvm=bv.78597519,d.d2s&psig=AFQjCNFmsmfgIPl1MiWzcyoAh5vouRYXIg&ust=1414748734818262" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img class="irc_mut" height="320" src="https://encrypted-tbn3.gstatic.com/images?q=tbn:ANd9GcRCrTGnVNDcGHbVwk3bp_iT1eiRoL6d9TU95yhtbzZovjn5_HPXbQ" style="margin-top: 0px;" width="247" /></a><br />
The
scientific study of probability is a relatively new development, dating from the 16th
century; mathematicians such as Cardano, Huygens and Pascal were among the first
to develop the mathematics of probability as the science of knowing “what are the odds”
of a specific event happening<span style="color: #1f497d;">. </span>Being able to
estimate the likelihood of an event that could hurt us or benefit us is a vital
mechanism for our development and survival as a species. When a father throws a baby
in the air, he does it because he can (being physically stronger than the mother)
and because it is fun for the child. But on a deeper level this is one of the
earliest lessons a parent gives in the concept of risk and reward,
a vital skill for the child to master in order to shift the odds in its
favor and to make the right decisions and choices as it grows and lives. Instead of
living by Nike's "just do it" slogan, this lesson is the first step
in learning what is likely to happen if I do it, and therefore, should I do it?<br />
<div style="text-align: justify;">
<o:p></o:p><br />
<div style="-qt-block-indent: 0; margin: 0in 0in 0pt; text-align: justify;">
</div>
<div style="-qt-block-indent: 0; margin: 0in 0in 0pt; text-align: justify;">
A measure of probability
is central to our well-being as individuals or corporations because
it answers the question of how likely an event is to occur, and yet most of us
estimate it rather instinctively, and not as part of a plan, method or strategy<span style="color: #1f497d;">.</span> Once we have figured out the likelihood of occurrence, we can
decide on appropriate next steps that are aligned with the odds of the event
happening. If the chances of a thunderstorm in a specific area are high, we may change
our plan to go camping there. If the chances are high that someone will not respond
to our marketing message, we may not send an expensive marketing offer to that
customer. <o:p> </o:p></div>
<div style="text-align: justify;">
</div>
<div style="text-align: justify;">
</div>
<div style="-qt-block-indent: 0; margin: 0in 0in 0pt; text-align: justify;">
So, whether
in everyday life or in business, having some means of quantifying
probabilities is a hugely important tool for navigating through life. The good news is
that we all have such means! In fact we are born with it: it is called the human
brain! We have powerful in-born data mining software that we (should)
constantly work on improving. So when we approach a specific situation that
may be significant for our physical or emotional well-being, or for our pockets,
we ask ourselves “has something similar occurred before and what was the
outcome?” This then tells us whether to proceed or to retreat.<o:p></o:p></div>
<div style="text-align: justify;">
<o:p> </o:p></div>
<div style="-qt-block-indent: 0; margin: 0in 0in 0pt; text-align: justify;">
We spend our
lives analyzing and comparing, and then we make decisions. So when we see the word
“analytics” on a billboard, it wrongly implies that analytical technologies are
used only by big companies and only for commercial reasons. Not at all! We all do
it, all the time! The human brain is a super powerful computer with
intuition, creativity, and some serious massively parallel processing power, with
over 250 000 neurons acting simultaneously in order to make a decision, produce
a thought, or assign the odds of something happening. However, it has one
limitation, which is sheer number processing. If we have only 10 variables,
each with 10 different values, there are 10 billion (10 to the power of 10) potential
combinations. That is when we rely on computer software and its algorithms
to do the speedy number crunching for us and tell us the probability of something
happening that we may benefit from, or of some situation that we want to
avoid.<o:p></o:p></div>
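<div style="-qt-block-indent: 0; margin: 0in 0in 0pt; text-align: justify;">
The count of combinations is simply the product of the number of values each variable can take. A quick sketch in Python makes the scale concrete; the function name pattern_count and the rate of one combination per second are made up purely for illustration.</div>

```python
from math import prod

def pattern_count(values_per_variable):
    """Number of distinct value combinations across all variables."""
    return prod(values_per_variable)

# 10 variables, each with 10 possible values
ten_by_ten = pattern_count([10] * 10)
print(f"{ten_by_ten:,}")  # 10,000,000,000

# Illustrative assumption: checking one combination per second, non-stop
years = ten_by_ten / (60 * 60 * 24 * 365)
print(f"about {years:.0f} years")  # about 317 years
```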
<div style="text-align: justify;">
<o:p> </o:p></div>
<div style="-qt-block-indent: 0; margin: 0in 0in 0pt; text-align: justify;">
So, we all need
to have some analytical capacity at our disposal, and if we move this to
the level of organizations (profit or non-profit), it becomes an utmost necessity to
have some computerized analytical software. If all underlying conditions
in a deterministic universe were known, there would be no probabilities and there
would be certainty of a specific outcome! Can we ever come close to<span style="color: #1f497d;"> fully </span> knowing the underlying conditions and
causes and their values, to absolute certainty, for anything but the simplest
events around us? Not a chance, this has zero probability! And that is why we
need so much probabilistic knowledge. That is not to say that everything is
predictable: in areas of science that stipulate indeterminacy or practical unpredictability, such as
quantum or chaos theories, probabilistic theories may not be
equally applicable. Things get slightly more complex or simpler (depending on how
one looks at it) if we relax our required level of “determinism” and
decide: well, I am not even interested in modeling cause and effect;
instead I want to know associations, loose connections, linkages, patterns and
correlations, or anything else that would incrementally and cumulatively
increase the odds of knowing what will happen in a period of time, based on how
a similar scenario played out before. <br />
<o:p></o:p> </div>
<div style="text-align: justify;">
</div>
<div style="margin: 0in 0in 0pt; text-align: justify;">
The fundamental question is “would you
play if you knew the odds?” That question is applicable in all spheres of life,
and asking it is like asking whether we need air to breathe. However, not everyone
recognizes the importance of being able to answer that type of question, its purpose
and the value it can generate. It is somewhat easier to see the value in a
colorful report, even if it presents trivial and non-actionable insight, than in a
simple number that says that the probability of our top customers switching to a
competitor is high. But those who do see the value will have brighter colours to
paint their reports with.</div>
<div style="margin: 0in 0in 0pt; text-align: justify;">
<o:p></o:p> </div>
<div style="margin: 0in 0in 0pt; text-align: justify;">
<o:p>Goran Dragosavac</o:p></div>
<div style="text-align: justify;">
</div>
</div>
</div>
Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com9tag:blogger.com,1999:blog-3770043454488854818.post-48438594467321954352014-10-14T05:20:00.001-07:002014-10-23T11:12:56.777-07:00<span style="color: black; font-family: "Times New Roman","serif"; font-size: 12pt;"><span style="font-size: large;"><span style="font-family: Times, "Times New Roman", serif;">How could government
agencies in South Africa benefit from greater use of analytics<o:p></o:p></span></span></span><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://www.wipro.com/images/assessing-the-bounties-and-boundaries-in-big-data-analytics.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" src="http://www.wipro.com/images/assessing-the-bounties-and-boundaries-in-big-data-analytics.jpg" height="179" width="320" /></a></div>
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="color: black; font-family: "Times New Roman","serif"; font-size: 12pt;">In some of the most
developed countries, there is pervasive use of analytics for a variety of
purposes. The latest US General Accounting Office report shows a
high level of usage, of various types, across different departments, with the departments of
defense and of homeland security being slightly ahead of the others. The primary
purposes of using analytical technologies in the government sector are
improving service or performance; detecting fraud, waste, and abuse; analyzing
scientific and research information; managing human resources; detecting
criminal activities or patterns; and analyzing intelligence and detecting
terrorist activities. This is motivated by growth in the volume and
availability of data collected by government agencies and by advances in the analytical
technologies that can be deployed on such information. Another contributing
factor is the decreasing cost of storage, which means that larger amounts of data
can be kept more cheaply than ever before.</span><span style="color: black; font-family: "Times New Roman","serif"; font-size: 12pt;"><o:p> </o:p></span></div>
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="color: black; font-family: "Times New Roman","serif"; font-size: 12pt;">So, the question is how
local adoption and consumption of analytics in the public sector can be increased
to levels close to those of the so-called “first world” countries. There is a
tremendous need in South Africa for analytically-enabled applications across
the board. Imagine the benefits of “early warning” systems (EWS) that can raise an alert before a crisis, allowing for fast response times. This is applicable in all
government departments: for example, an early warning detection system in Eskom’s
production units that could “ring the bell” just before an unplanned outage, or an early
warning detection system that would indicate a water pump failure like the one
that recently caused week-long water shortages in areas around Johannesburg. </span><br />
<br />
<span style="color: black; font-family: "Times New Roman","serif"; font-size: 12pt;">The word "crisis" is often used in the context of South
Africa’s public health delivery. Everyone knows that staff shortages are a major
contributor to the poor state of affairs in public health. What is not
clearly known is the magnitude of the differences, and the ranking order, between
different hospitals in different areas. Some government policies and programs are not only
ineffective in reducing problems but directly contribute to the underlying cause.
That means that real-time awareness and feedback are painfully lacking. This
is precisely where analytical technologies can massively assist, so that
governmental programs and policies represent reality much better. </span></div>
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="color: black; font-family: "Times New Roman","serif"; font-size: 12pt;"></span> </div>
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="color: black; font-family: "Times New Roman","serif"; font-size: 12pt;">There was a
case in one province where a school was built on one side of a river, while the
majority of learners reside in rural villages on the other side. For
most of the year the river is not difficult to cross, but for one month of the
year it is heavily flooded, and dozens of children die every year after
being swept away by the flood waters. This could have been prevented through
analytically enabled decision making. </span><br />
<span style="color: black; font-family: "Times New Roman","serif"; font-size: 12pt;"></span><br />
<span style="color: black; font-family: "Times New Roman","serif"; font-size: 12pt;">Think of the case of a children's hospital where analytics
found mismatches between the stated causes and the outcomes of injuries. If the stated cause of an injury is
a fall from the bed, and this specific type of injury with its symptoms is
highly unlikely to result from such a fall, could it be that the parents
are lying and the child has been abused? After further scrutiny of such cases,
that is exactly what was discovered. As a result, this specific
children’s hospital has implemented a policy that for any injury
where the stated cause doesn’t match the expected set of
symptoms, social workers should be alerted to have a closer look at the
family. </span><br />
<span style="color: black; font-family: "Times New Roman","serif"; font-size: 12pt;"></span><br />
<span style="color: black; font-family: "Times New Roman","serif"; font-size: 12pt;">Another example is that certain government hospitals are persistently
far above the average rate of infant mortality. The fact that some of
these hospitals are among the worst on an on-going basis suggests that there is some negative
pattern at work that is causing the mortality numbers to be worse than
elsewhere. Analytics can potentially extract such a negative pattern, and by
breaking this pattern through appropriate measures and actions one can reduce
the problem to average or below-average levels. Such a reduction
can be directly attributable to actionable analytics.<o:p></o:p></span></div>
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="color: black; font-family: "Times New Roman","serif"; font-size: 12pt;">And then there is the massive problem of fraud, waste and abuse, where
analytics can be used for detection and ultimately prevention. But what is
the main reason for the slow adoption of cutting-edge analytical technologies in the
public sector? Yes, there may be issues with data quality and access, issues
with a shortage of skills and a lack of analytical technologies, but the biggest
challenge is lack of motivation. Neither the penalty for doing nothing nor the reward
for doing something is strong enough to bring the needed change in management culture. That
is why improvement is hard to come by. There are some pockets of excellence in the
public sector that prove that analytical technologies can be used effectively
to vastly improve service delivery and performance, reduce fraud, better
represent reality for better decision making, and ultimately make the idealistic
concept of a “smart and just city” more achievable. In other words, there is a
strong case for greater usage of analytics to make communities built on
sustainable economic development and high quality of life, with less crime,
greater and quicker justice delivery, wise management of natural
resources and, last but not least, more effective transformation and
empowerment of previously disadvantaged sectors of society, far more of a
reality tomorrow than they are today.</span></div>
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="color: black; font-family: "Times New Roman","serif"; font-size: 12pt;"></span> </div>
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="color: black; font-family: "Times New Roman","serif"; font-size: 12pt;">Goran Dragosavac</span><o:p></o:p></div>
Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com2tag:blogger.com,1999:blog-3770043454488854818.post-22422223235894487742014-10-14T05:11:00.002-07:002014-10-14T05:33:22.589-07:00<br />
<h1 style="background: white; margin: 7.5pt 0in;">
<b><span style="color: #184e86; font-family: "Arial","sans-serif"; font-size: large;">Unexpected use of analytics</span></b></h1>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt;">
<span style="color: #444444; font-family: "Arial","sans-serif"; font-size: 14.5pt;"><o:p> </o:p></span></div>
<br />
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<a href="http://www.lineaedp.it/wp-content/blogs.dir/3/files/2014/08/big_analytics.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img alt="big analytics" border="0" class="attachment-articolo wp-post-image" src="http://www.lineaedp.it/wp-content/blogs.dir/3/files/2014/08/big_analytics.jpg" height="172" title="big analytics" width="200" /></a><span style="color: #444444; font-family: "Arial","sans-serif";">While most common applications
of analytics have been in database marketing and CRM type of applications, and
most people associate the use of analytics in these areas – it is hard to find
any other area of human endeavor where analytics have not been used to either
describe or predict, whether it is research, science and technology, sports, politics, entertainment, or any other area where there is a question, historical data
relevant to that question, and some analytical skills and technology.</span> </div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";"></span> </div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";">In the wake of India's 16th
national election it was clear that some parties had taken a page from Obama's
re-election campaign, which used technology, social media
and big data in a big way to connect with voters. Analytics helped in micro-segmenting the
electorate, focusing on swing states and on different gender and minority groupings, and
tailoring messages for segment-specific audiences. Not to mention the use of
analytics to rework advertising campaigns and, most importantly, to raise
funds.</span><br />
<span style="color: #444444; font-family: "Arial","sans-serif";"><o:p></o:p></span> </div>
<div style="text-align: justify;">
</div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";">Another area of pervasive use
of analytics is counter-terrorism, as well as defense and military
agencies. A big lesson of the September 11 attacks was the importance of being
able to integrate disparate pieces of data, and that data could be anything from
field reports to social media postings, from broadcast news accounts to
e-commerce transactions, from bank records to records in classified government
databases. Once the data is mapped and reduced, users can track high-value
individuals and organizations of interest, and establish connections between
people, organizations, events and places that would otherwise be difficult to
make.<o:p></o:p></span><br />
</div>
<div style="text-align: justify;">
</div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";">It is well known that the US
Marine Corps is experimenting with big data analytics technologies, such as
Hadoop and graph databases, all for the purpose of quicker intelligence
gathering and dissemination that can be used for real-time decision making by field
commanders. Also, according to the US Department of Veterans Affairs, the number of
suicides among veterans and active-duty military personnel is 22 a day, which
is by all accounts an under-reported epidemic, and so analytical practitioners were
approached to see how analytics can help spot patterns of suicide and prevent
it before it occurs. Of course, the first step is bringing disparate and relevant
data together, and then deriving a number of "stress load" factors and
modeling the behavioral dynamics that could turn on the switch and push a
person to suicide. And if it can be modeled, it can be prevented, and while
suicide prevention may not be an "exact science", it is hard not to
see how analytics can be hugely beneficial in these areas.</span><br />
<span style="color: #444444; font-family: "Arial","sans-serif";"><o:p></o:p></span> </div>
<div style="text-align: justify;">
</div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";">In the area of entertainment,
applications of analytics are limitless. Aziz Ansari, an American comedian,
started inviting fans to participate with him in the development of new
material. All they would have to do is subscribe to his channel and give their
feedback on his new material. Of course, the catch was that by subscribing they
gave him information about themselves, so Ansari could see which comic topics
work well or poorly with which segment of his fans, and he could use that
feedback to adjust his material to the segment he was performing to. Largely,
the media and entertainment industry of today is well aware that consumers have
multiple ways of telling them what they think about their content, especially
across social media. So they need, in essence, to capture these digital
voices, analyze them, and tweak and adjust their content accordingly. Which
segments hate a certain program, which segments like it, and therefore where
should you increase your marketing spend? Who streamed the latest trailers, men
or women? Who complained about moving it to a new time slot? And so on.</span> </div>
<div style="text-align: justify;">
</div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif"; mso-fareast-font-family: "Times New Roman";">In sport, the use of analytics is ever-increasing too. Examples range from
analyzing the injury pattern of a certain football player before the transfer
window, to work out the risk that, after millions of dollars have been spent on him,
he will be sidelined for months by yet another injury, to analyzing
patterns of play of a competitor team or even an individual player in order to select
the right players and devise the appropriate tactics that will negate the competitor's
strengths. It is the new science of winning – whether a coach uses the neural
networks in his brain or those of a computer algorithm.</span><br />
<span style="color: #444444; font-family: "Arial","sans-serif"; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span> </div>
<div style="text-align: justify;">
</div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif"; mso-fareast-font-family: "Times New Roman";">But then, beyond the fun side, analytics have been used for more
than just winning customers, profits, audiences or games. They have been
increasingly used to save lives and property, whether in predicting the most
likely path of a hurricane or the likely spread of a wildfire, floods or
even diseases. Analyzing patterns of injuries is a reality in many children's hospitals
for the prevention of falls, traffic accidents, assaults, burns etc. By
knowing where these injuries happen more often, how and when, and who is the
most likely victim, it becomes actionable intelligence that can be used in
injury prevention programs and ultimately reduce suffering and save lives.</span><br />
<span style="color: #444444; font-family: "Arial","sans-serif"; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span> </div>
<div style="text-align: justify;">
</div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif"; mso-fareast-font-family: "Times New Roman";">It is still early days and there is tremendous potential to do
more across the board. Big data technologies, coupled with the ability to
mine and analyze different types of data (text, pictures, videos and sensor
data), will contribute to a new generation of analytical applications that will
hopefully make the world even slightly more predictable and a safer place.<o:p></o:p></span></div>
<br />
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt;">
<span style="color: #444444; font-family: "Arial","sans-serif"; font-size: 14.5pt;"><span style="font-size: small;">Goran Dragosavac<o:p></o:p></span></span></div>
Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com15tag:blogger.com,1999:blog-3770043454488854818.post-90046603749202212832014-10-14T05:01:00.004-07:002014-10-14T06:28:40.478-07:00<br />
<h1 style="background: white; margin: 7.5pt 0in; text-align: justify;">
<span style="font-size: small;">
<b><span style="color: #184e86; font-family: "Arial","sans-serif";">Big data - proceed with caution</span></b></span></h1>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";"><o:p> </o:p></span></div>
<div style="text-align: justify;">
<a href="https://encrypted-tbn1.gstatic.com/images?q=tbn:ANd9GcTs0edxdT-YxzWjWuN7PDyf7yJcTHNVZQJhhQnZXLMJ7G3UA3a1" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" class="rg_i" data-src="https://encrypted-tbn1.gstatic.com/images?q=tbn:ANd9GcTs0edxdT-YxzWjWuN7PDyf7yJcTHNVZQJhhQnZXLMJ7G3UA3a1" data-sz="f" height="245" jsaction="load:str.tbn" name="eQdhYETnQCQdpM:" src="https://encrypted-tbn1.gstatic.com/images?q=tbn:ANd9GcTs0edxdT-YxzWjWuN7PDyf7yJcTHNVZQJhhQnZXLMJ7G3UA3a1" style="height: 170px; margin-top: -2px; width: 277px;" width="400" /></a><br />
<br />
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";">While the collection and analysis
of big data hold great promise, quantity doesn't always translate to quality.
The quantity of data is represented by the number of records and the number of
variables, and one can argue that good old statistical sampling techniques are
still relevant. If the variability is captured by a random sample, there will be
very little incremental benefit, if any, in doing the analysis on all rows. </span><br />
<span style="color: #444444; font-family: "Arial","sans-serif";"><o:p></o:p></span> </div>
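<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
The sampling argument can be illustrated with a small Python sketch. Everything in it is an assumption made for the example: the data is synthetic (lognormal "purchase amounts"), the table has only 200,000 rows, and the 5% simple random sample is an arbitrary choice. It shows only that the mean and spread of the sample land close to those of the full table, not that sampling is always sufficient.</div>

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical "big" table: 200,000 synthetic purchase amounts
population = [rng.lognormvariate(3.0, 0.5) for _ in range(200_000)]

# Simple random sample of 10,000 rows (5% of the table)
sample = rng.sample(population, 10_000)

def mean_and_std(values):
    """Mean and (population-style) standard deviation of a list of numbers."""
    mean = statistics.fmean(values)
    variance = statistics.fmean((v - mean) ** 2 for v in values)
    return mean, math.sqrt(variance)

pop_mean, pop_std = mean_and_std(population)
sample_mean, sample_std = mean_and_std(sample)

print(f"all rows: mean={pop_mean:.2f} std={pop_std:.2f}")
print(f"sample:   mean={sample_mean:.2f} std={sample_std:.2f}")
```

<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
Rare segments or extreme outliers are a different matter: a small random sample may miss them, which is one reason the choice between sampling and full data depends on the question being asked.</div>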
<div style="text-align: justify;">
</div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";">The second dimension of big
data is the number of variables. A data set of only 10 variables, with 10
distinct values for each variable, gives potentially 10 billion pattern
combinations; and with an increase in the number of variables, the potential
for extracting spurious and non-explicable patterns and correlations also
increases.</span><br />
<span style="color: #444444; font-family: "Arial","sans-serif";"><o:p></o:p></span> </div>
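<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
How spurious correlations creep in as variables are added can be shown with pure noise. The sketch below is illustrative only: the function names, the 50-row table and the checkpoints of 5, 50 and 500 variables are made up. It correlates a random target with more and more random variables that have no real relationship to it, and records the strongest correlation found by chance so far.</div>

```python
import math
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def max_chance_correlation(n_rows, checkpoints):
    """Strongest |r| between a random target and the first k pure-noise variables."""
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    best, results = 0.0, {}
    for k in range(1, max(checkpoints) + 1):
        noise = [rng.gauss(0, 1) for _ in range(n_rows)]  # unrelated to target
        best = max(best, abs(pearson(target, noise)))
        if k in checkpoints:
            results[k] = best
    return results

results = max_chance_correlation(n_rows=50, checkpoints=(5, 50, 500))
for k, r in results.items():
    print(f"{k:>3} noise variables: strongest chance correlation = {r:.2f}")
```

<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
None of these variables carries any signal, yet the best-looking correlation can only grow as more of them are screened, which is why patterns found in very wide data sets need validation on data that was not used to find them.</div>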
<div style="text-align: justify;">
</div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";">The key question remains – how
can analytics handle stream data that keeps increasing in volume? Each method,
technique or algorithm has an optimal point, after which there are diminishing
returns, plateau and then degradation, while computational requirements
continue to grow. Some suggest that algorithms need to be rewritten to move the
optimal point further down a path of data infinity.<o:p></o:p></span><br />
</div>
<div style="text-align: justify;">
</div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";">What makes it more
challenging is that big data is characterized more by variety and
velocity than by sheer volume. Data doesn't only come in a standard
structured format; it also arrives as a stream of free text, pictures,
sounds and whatever else may come into play. And it comes with a high degree of
variability, where formats within the stream can change as the data are
captured.<o:p></o:p></span><br />
</div>
<div style="text-align: justify;">
</div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";">All this requires that
analytical technologies be redesigned so that they can take
advantage of massively parallel processing architectures, exploit
heterogeneous data at high volume and velocity, and still produce
robust and accurate models. Some argue that traditional statistical methods
that raise more questions than they answer may not survive in this data flood
era, and that new machine-learning methods are needed to deal with "big
data noise" and see the "big picture around the corner".</span><span style="color: #444444; font-family: "Arial","sans-serif";"><o:p> </o:p></span><br />
</div>
<div style="text-align: justify;">
</div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";">There is often a naïve
assumption that analysis will run on big data exactly as it was collected.
In most cases, only relevant subsets of the data will be needed for analysis; these
will be integrated with other relevant data sources and most likely aggregated
to allow for knowledge induction and generalization.</span><span style="color: #444444; font-family: "Arial","sans-serif";"><o:p> </o:p></span></div>
<div style="text-align: justify;">
</div>
<div style="text-align: justify;">
</div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";">Another big challenge is
data management. Data is collected at different points in
time and from different locations – temporal and spatial variability
– and comes in different formats without adequate metadata describing who,
what, when, how and from where. This can pose serious issues in terms of
contextualizing and acting on intelligence extracted from such data.<o:p></o:p></span></div>
<div style="text-align: justify;">
</div>
<div style="text-align: justify;">
</div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";">And then there are issues
around system and component design. Since not all big data and not all business
requirements are the same, designers will need to carefully consider the
functionality, conceptual model, organization and interfaces that would meet
the needs of end-users. Answering the business question is more important than
processing all the data, so it is important to know how much data is enough for a given set of
business questions, since this will drive the design and
architecture of the processing system.</span><span style="color: #444444; font-family: "Arial","sans-serif";"><o:p> </o:p></span><br />
</div>
<div style="text-align: justify;">
</div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";">Then there is the challenge
with data ownership, privacy and security. Who owns Twitter or Facebook data
– service providers where data is stored or account holders? There are
serious attempts by researchers to develop algorithms that will automatically
randomize personal data among large data collections to mitigate privacy
concerns. International Data Corporation has suggested five levels of
increasing security: privacy, compliance-driven, custodial, confidential and
lockdown, and there is still work ahead to define these security levels with
respect to analytical exploitation before any legislative measures are in
place.</span><span style="color: #444444; font-family: "Arial","sans-serif";"><o:p> </o:p></span><br />
<span style="color: #444444; font-family: "Arial","sans-serif";"><o:p></o:p></span> </div>
<div style="text-align: justify;">
</div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";">One must not forget what the
end-goal is here. It is about business value and the advantage of being able to
make decisions founded on big data analytics that were beyond reach before. The
main challenge here is to prioritize big data analytical engagements so
that resources are spent on high-priority, high-value business questions.
Successful completion of such complex big data analytics projects will require
multiple experts from different domains and different locations to share data,
as well as analytical technologies, and be able to provide input and share the
exploration of results.</span><span style="color: #444444; font-family: "Arial","sans-serif";"><o:p> </o:p></span></div>
<div style="text-align: justify;">
</div>
<div style="text-align: justify;">
</div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";">Therefore, big data analytic
systems must support collaboration in an equally big way! And lastly, results
of analytics must be interpretable to the end-users and be relevant to
questions at hand, and some measures of relevance and interest are needed to
rank and reduce the sea of patterns so that only relevant, non-trivial and
potentially useful results are presented. In presenting and disseminating
the results of analytics, the method of visualization plays a special role in both
interpretation and collaboration.</span></div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";"><o:p> </o:p></span><br />
<span style="color: #444444; font-family: "Arial","sans-serif";"><o:p></o:p></span> </div>
<div class="MsoNormal" style="background: white; margin: 0in 0in 0pt; text-align: justify;">
<span style="color: #444444; font-family: "Arial","sans-serif";">Not all of these challenges may
be equally relevant in all situations, but it is helpful to be aware
of them. While there will be no U-turn on big data or big data analytics,
the issue is how to address some of these challenges while keeping an eye on
the ball, which is to ensure that big data technologies deliver on their promise
of providing better answers, quicker, to more complex questions.<o:p></o:p></span></div>
<br />
<span style="font-family: Arial, Helvetica, sans-serif;">Goran Dragosavac</span><br />
</div>
<br />
Main applications of analytics and data mining in healthcare (posted 2013-06-27)
<br />
<div style="margin-left: 0.25in;">
<b><span style="color: black; font-family: "Arial","sans-serif";">Disease Management<o:p></o:p></span></b></div>
<br />
<div style="margin-left: 0.25in; text-align: justify;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGZfRdds518hy6QsssZHJRrUSEGyK7wwtzHvM5iVOml7gfXrUiDdZDmuNEsI_JtPjl3t0qiWuOAB56Ern_4hvsesKdkC9P7AuSJahxqv1jSdNWEYTxMETCixg0p6xg2qgbTpGwIBoDXIKU/s386/healthcare.gif" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="259" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGZfRdds518hy6QsssZHJRrUSEGyK7wwtzHvM5iVOml7gfXrUiDdZDmuNEsI_JtPjl3t0qiWuOAB56Ern_4hvsesKdkC9P7AuSJahxqv1jSdNWEYTxMETCixg0p6xg2qgbTpGwIBoDXIKU/s320/healthcare.gif" width="320" /></a><span style="color: black; font-family: "Arial","sans-serif"; mso-bidi-font-weight: bold;">Disease management is concerned with both the predictive
and the descriptive aspects of a specific disease: what is the probability
of a specific disease outcome, and which factors, especially actionable ones,
are associated with that outcome? One has to separate effects from
causes for a specific disease, which can be done by separating the event period
from the period in which the input data are collected. Disease management can focus on a specific aspect
of the disease whose resolution benefits not only health providers
but, more importantly, the patient. The descriptive component of disease
management involves desirable as well as undesirable patterns; acting on
these patterns means either supporting them or breaking them, and then
measuring the effects of these actions in order to achieve specific disease
management goals.<o:p></o:p></span></div>
<br />
<div style="margin-left: 0.25in;">
<span style="color: black; font-family: "Arial","sans-serif"; mso-bidi-font-weight: bold;">Some examples of disease
management questions:</span></div>
<ul type="disc">
<ul type="circle">
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l4 level2 lfo5; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list 1.0in;"><span style="font-family: "Arial","sans-serif";">If surgical procedure
"X" is done, then 45% of the time infection "Y"
occurs within two weeks. Why? What are the reasons and contributing factors? <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l4 level2 lfo5; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list 1.0in;"><span style="font-family: "Arial","sans-serif";">What seasonal patterns, if any,
exist in emergency room nosocomial infections, and what are the contributing factors? <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l4 level2 lfo5; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list 1.0in;"><span style="font-family: "Arial","sans-serif";">Why do some congestive heart
failure (CHF) patients return to the heart clinic after bypass surgery
for care within 3 months, while others don't? <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l4 level2 lfo5; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list 1.0in;"><span style="font-family: "Arial","sans-serif";">Compare and contrast high length
of stay patient groups based upon bed location, nursing teams, and
treatment modalities. <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l4 level2 lfo5; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list 1.0in;"><span style="font-family: "Arial","sans-serif";">Compare and contrast treatment
results or glucose levels for type II diabetic patients for a given time
period, by physician, gender, age group, etc. <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l4 level2 lfo5; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list 1.0in;"><span style="font-family: "Arial","sans-serif";">What practice patterns for
managing primary mammogram candidates will yield the best outcomes in
terms of survival rates or complication rates at the least cost? <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l4 level2 lfo5; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list 1.0in;"><span style="font-family: "Arial","sans-serif";">What percentage of women in membership
between the ages of 40 and 60 have had a mammogram in the last 12 months? <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l4 level2 lfo5; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list 1.0in;"><span style="font-family: "Arial","sans-serif";">What is the comparative mean
value of hypertension levels within a certain group or population of
patients and does it fall within acceptable statistical levels? Do
variations in clinical practice patterns have a cause and effect
relationship? </span><b><span style="color: black; font-family: "Arial","sans-serif";"><o:p><span style="font-family: Times New Roman;"> </span></o:p></span></b></li>
</ul>
</ul>
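<div style="margin-left: 0.25in; text-align: justify;">
<span style="color: black; font-family: "Arial","sans-serif";">Several of the questions above reduce to a filter and a count. As a minimal sketch, the mammogram question could be answered along the following lines in Python. The <code>Member</code> record, its field names and the sample rows are all hypothetical and stand in for whatever a real membership system provides.</span></div>

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Member:
    member_id: str
    sex: str                         # "F" or "M" in this toy data
    age: int
    last_mammogram: Optional[date]   # None if never screened


def mammogram_rate(members: List[Member], as_of: date,
                   min_age: int = 40, max_age: int = 60,
                   window_days: int = 365) -> float:
    """Percentage of eligible women screened within the look-back window."""
    eligible = [m for m in members
                if m.sex == "F" and min_age <= m.age <= max_age]
    if not eligible:
        return 0.0
    cutoff = as_of - timedelta(days=window_days)
    screened = [m for m in eligible
                if m.last_mammogram is not None and m.last_mammogram >= cutoff]
    return 100.0 * len(screened) / len(eligible)


# Made-up records for illustration only.
members = [
    Member("A1", "F", 45, date(2013, 3, 1)),    # screened inside the window
    Member("A2", "F", 52, date(2011, 9, 15)),   # screened too long ago
    Member("A3", "F", 58, None),                # never screened
    Member("A4", "F", 41, date(2012, 12, 20)),  # screened inside the window
    Member("A5", "M", 50, None),                # not in the target group
    Member("A6", "F", 67, date(2013, 1, 5)),    # outside the age band
]

rate = mammogram_rate(members, as_of=date(2013, 6, 27))
print(f"{rate:.1f}% of eligible members screened in the last 12 months")  # 50.0%
```

<div style="margin-left: 0.25in; text-align: justify;">
<span style="color: black; font-family: "Arial","sans-serif";">Keeping the age band and the look-back window as parameters lets the same function be reused when the business question changes, for example to a 24-month window.</span></div>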
<br />
<b><span style="color: black; font-family: "Arial","sans-serif";">Outcomes
Analysis: Clinical and Financial</span></b><span style="color: black; font-family: "Arial","sans-serif";"><o:p></o:p></span><br />
<br />
<b><span style="color: black; font-family: "Arial","sans-serif";">Clinical
Outcomes</span></b><span style="color: black; font-family: "Arial","sans-serif";"><br />
A Clinical Outcome is the result of medical or surgical intervention or
nonintervention. It can refer to, but is not limited to, the following: </span><br />
<ul type="disc">
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l0 level1 lfo2; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">Mortality <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l0 level1 lfo2; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">Morbidity <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l0 level1 lfo2; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">Readmission rates <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l0 level1 lfo2; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">Changes in birth and death rates for a global
population, for example, residents of a state <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l0 level1 lfo2; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">The outcome of a given diagnostic procedure, lab
result or medical test <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l0 level1 lfo2; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">The results for a patient after care, for example,
how long it took to restore the patient's ability to walk, or to work, or
how long and to what degree did the patient have pain <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l0 level1 lfo2; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">Whether the patient recovered, and how long it took <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l0 level1 lfo2; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">The patient's own perception of their care and
progress. </span></li>
</ul>
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="color: black; font-family: "Arial","sans-serif";">It
is thought that through a historical record of outcome experiences, caregivers
will know better which treatment modalities result in consistently better
outcomes for patients. Effective Outcomes Management often relies on a
successful data warehousing strategy designed to track historical outcome
experiences in many areas such as epidemiological studies, lab results,
responses to treatments, mortality and morbidity rates, length of patient stay
and clinical effectiveness measures. <o:p></o:p></span></div>
<br />
<b><span style="color: black; font-family: "Arial","sans-serif";">Financial
Outcomes</span></b><br />
<b><span style="color: black; font-family: "Arial","sans-serif";"><u></u></span></b><span style="color: black; font-family: "Arial","sans-serif";"><br />
The definition of a financial outcome varies depending upon an organization's
goals and overall strategy. As an example, financial outcomes might cover
measures such as hospital length of stay, net margins, cost breakouts, number
of ER visits and office visits - just to name a few. <o:p></o:p></span><br />
<br />
<b><span style="color: black; font-family: "Arial","sans-serif";"><o:p><span style="font-family: Times New Roman;"> </span></o:p></span></b><b><span style="color: black; font-family: "Arial","sans-serif";">Fraud and
Abuse Detection</span></b><span style="color: black; font-family: "Arial","sans-serif";"><o:p></o:p></span><br />
<br />
<div style="text-align: justify;">
<span style="color: black; font-family: "Arial","sans-serif";">It would be nice
if we could develop some type of industry wrapper to data mining technology for
the health care market specifically. But for now, this may be an area of
opportunity for AEs because the industry has yet to spend many resources on
fraud detection and has not developed sophisticated tools and technologies,
not only for detecting fraud but also for predicting and catching it before claims
adjudication. <o:p></o:p></span></div>
<div style="text-align: justify;">
</div>
<div style="text-align: justify;">
<span style="color: black; font-family: "Arial","sans-serif";">Fraud and Abuse is
usually defined as "the intentional deception or misrepresentation that an
individual knows to be false or does not believe to be true and makes, knowing
that the deception could result in some unauthorized benefit to himself/herself
or some other person". The most frequent kind of fraud arises from a false
statement or misrepresentation made, or caused to be made, that is material to
entitlement or payment. <o:p></o:p></span></div>
<div style="text-align: justify;">
</div>
<div style="text-align: justify;">
<span style="color: black; font-family: "Arial","sans-serif";">Violators and
perpetrators of fraud may include physicians or other practitioners, a hospital
or other institutional provider, a clinical laboratory or other supplier, an
employee of any provider, a billing service, beneficiary, Medicare carrier
employee or any person in a position to file a claim for payment or benefits. <o:p></o:p></span></div>
<br />
<b><span style="color: black; font-family: "Arial","sans-serif";">Types of
abuses</span></b><span style="color: black; font-family: "Arial","sans-serif";"><o:p></o:p></span><br />
<br />
<ul type="disc">
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l2 level1 lfo1; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">Misrepresentation of medical necessity: For example,
a physician who recommends that eye cataract surgery be performed on a
healthy eye. <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l2 level1 lfo1; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">Billing errors: Encompasses everything from billing
the wrong date of service to up-coding. <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l2 level1 lfo1; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">Over-provision of services: Providing medically
unnecessary tests to generate a fee. <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l2 level1 lfo1; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">Misrepresentation of services provided. <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l2 level1 lfo1; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">Offering or acceptance of kickbacks, and/or a
routine waiver of co-payments. <o:p></o:p></span></li>
</ul>
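<div style="text-align: justify;">
<span style="color: black; font-family: "Arial","sans-serif";">Of the abuse types listed above, over-provision of services is one that lends itself to a simple data screen. The Python sketch below flags providers whose tests-per-patient rate is far above the peer median. It is only an illustration: the provider IDs, the counts and the threshold of twice the median are assumptions, and a flag is a prompt for review, not evidence of fraud.</span></div>

```python
from statistics import median
from typing import Dict, List, Tuple


def flag_high_test_rates(claims: Dict[str, Tuple[int, int]],
                         ratio_threshold: float = 2.0) -> List[str]:
    """Return providers whose tests-per-patient rate exceeds
    ratio_threshold times the median rate of all providers."""
    rates = {provider: tests / patients
             for provider, (tests, patients) in claims.items()
             if patients > 0}
    if not rates:
        return []
    peer_median = median(rates.values())
    return sorted(provider for provider, rate in rates.items()
                  if rate > ratio_threshold * peer_median)


# Hypothetical claims summary: provider -> (tests billed, patients seen).
claims = {
    "P01": (120, 100),
    "P02": (95, 90),
    "P03": (400, 110),  # roughly 3.6 tests per patient
    "P04": (130, 105),
    "P05": (88, 80),
}

print(flag_high_test_rates(claims))  # ['P03']
```

<div style="text-align: justify;">
<span style="color: black; font-family: "Arial","sans-serif";">The median is used rather than the mean so that a single extreme provider does not pull the benchmark up and hide itself. In practice the comparison would need to account for specialty and patient mix before any conclusion is drawn.</span></div>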
<br />
<div style="text-align: justify;">
<span style="color: black; font-family: "Arial","sans-serif";">Fraud schemes
range from those perpetrated by individuals acting alone to broad-based
activities by institutions or groups of individuals, sometimes employing
sophisticated telemarketing and other promotional techniques to lure consumers
into serving as the unwitting tools in the schemes. Seldom do perpetrators
target only one insurer or target the public or private sector exclusively.
Rather, most are found to be defrauding several private and public sector
victims simultaneously. <o:p></o:p></span></div>
<br />
<span style="color: black; font-family: "Arial","sans-serif";"><o:p><span style="font-family: Times New Roman;"> </span></o:p></span><br />
<br />
<b><span style="color: black; font-family: "Arial","sans-serif";">Medical Errors
</span></b><span style="color: black; font-family: "Arial","sans-serif";"><o:p></o:p></span><br />
<br />
<div style="text-align: justify;">
<b><span style="color: black; font-family: "Arial","sans-serif";">Definition</span></b><span style="color: black; font-family: "Arial","sans-serif";"><br />
The issue of reducing medical errors has been a heated political topic and will
continue to be controversial in the next several years. It is believed the key
to decreasing these errors will be to properly identify them, analyze the
causes, and then change the system and/or processes to prevent them from
happening in the future. A November 1999 study by the U.S. Institute of
Medicine (IOM) cited 90,000 avoidable deaths, 3 million medical errors and 2.2
million avoidable injuries each year attributable to medical errors. That's the
equivalent of having one jumbo jet crash per day with 200 people dying in each
crash. <o:p></o:p></span></div>
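<div style="text-align: justify;">
<span style="color: black; font-family: "Arial","sans-serif";">The jumbo jet comparison can be checked with simple arithmetic. The short Python sketch below uses only the figures quoted above (90,000 deaths a year and 200 people per crash) and shows that the comparison is, if anything, slightly conservative.</span></div>

```python
avoidable_deaths_per_year = 90_000   # figure quoted in the text
passengers_per_crash = 200           # figure quoted in the text

deaths_per_day = avoidable_deaths_per_year / 365
crash_deaths_per_year = passengers_per_crash * 365

print(round(deaths_per_day))   # 247 deaths per day
print(crash_deaths_per_year)   # 73000 deaths a year from one crash per day
```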
<div style="text-align: justify;">
</div>
<div style="text-align: justify;">
<span style="color: black; font-family: "Arial","sans-serif";">The IOM defines
medical error as "the failure to complete a planned action as intended or
the use of a wrong plan to achieve an aim". An adverse event is defined as an
injury caused by medical management rather than by the underlying disease or
condition of the patient. Some adverse events are not preventable and they
reflect the risk associated with treatment, such as a life-threatening allergic
reaction to a drug when the patient had no known allergies to it. However, the
patient who receives an antibiotic to which he or she is known to be allergic,
goes into anaphylactic shock, and dies, represents a preventable adverse event.
<o:p></o:p></span></div>
<div style="text-align: justify;">
</div>
<span style="color: black; font-family: "Arial","sans-serif";">Most people
believe that medical errors usually involve drugs, such as a patient getting
the wrong prescription or dosage, or mishandled surgeries, such as amputation
of the wrong limb. However, there are many other types of medical errors,
including: <o:p></o:p></span><br />
<br />
<ul type="disc">
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l1 level1 lfo3; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">Diagnostic error, such as misdiagnosis leading to an
incorrect choice of therapy, failure to use an indicated diagnostic test,
misinterpretation of test results, and failure to act on abnormal results.
<o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l1 level1 lfo3; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">Equipment failure <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l1 level1 lfo3; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">Infections, such as nosocomial and post-surgical
wound infections. <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l1 level1 lfo3; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">Blood transfusion-related injuries <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l1 level1 lfo3; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">Misinterpretation of medical orders <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l1 level1 lfo3; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">Incorrect medicines and/or prescriptions <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l1 level1 lfo3; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">Surgical errors <o:p></o:p></span></li>
<li class="MsoNormal" style="color: black; margin: 0in 0in 0pt; mso-list: l1 level1 lfo3; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; tab-stops: list .5in;"><span style="font-family: "Arial","sans-serif";">Lab report errors. <o:p></o:p></span></li>
</ul>
<div style="text-align: justify;">
</div>
<div style="text-align: justify;">
<span style="color: black; font-family: "Arial","sans-serif";">Most errors
result from problems created by today's complex health care system. But errors
also happen when doctors and their patients have problems communicating. For
example, a recent study supported by the Agency for Healthcare Research and
Quality (AHRQ) found that doctors often do not do enough to help their patients
make informed decisions. Uninvolved and uninformed patients are less likely to
accept the doctor's choice of treatment and less likely to do what they need to
do to make the treatment work. </span></div>
<span style="color: black; font-family: "Arial","sans-serif";"><o:p></o:p></span><br />
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; tab-stops: -27.0pt;">
<b style="mso-bidi-font-weight: normal;"><span style="font-family: "Arial","sans-serif";">Performance Management
in Healthcare<o:p></o:p></span></b></div>
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; tab-stops: -27.0pt;">
<b style="mso-bidi-font-weight: normal;"><i style="mso-bidi-font-style: normal;"><span style="font-family: "Arial","sans-serif";"><o:p><span style="font-family: Times New Roman;"> </span></o:p></span></i></b><span style="font-family: "Arial","sans-serif";">Healthcare
provider organizations use performance management methodologies to focus on
their key challenges:<o:p></o:p></span></div>
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; mso-list: l3 level1 lfo4; tab-stops: -27.0pt list .5in; text-indent: -0.25in;">
<!--[if !supportLists]--><span style="font-family: Symbol; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;"><span style="mso-list: Ignore;">·<span style="font-size-adjust: none; font-stretch: normal; font: 7pt/normal "Times New Roman";">
</span></span></span><!--[endif]--><span style="font-family: "Arial","sans-serif";">How
are our resources (employees, physicians, capital assets) helping us to
accomplish our strategic goals?</span></div>
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; mso-list: l3 level1 lfo4; tab-stops: -27.0pt list .5in; text-indent: -0.25in;">
<!--[if !supportLists]--><span style="font-family: Symbol; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;"><span style="mso-list: Ignore;">·<span style="font-size-adjust: none; font-stretch: normal; font: 7pt/normal "Times New Roman";">
</span></span></span><!--[endif]--><span style="font-family: "Arial","sans-serif";">How
are we going to excel at key business processes (access, throughput, value of service to
patients)?</span></div>
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; mso-list: l3 level1 lfo4; tab-stops: -27.0pt list .5in; text-indent: -0.25in;">
<!--[if !supportLists]--><span style="font-family: Symbol; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;"><span style="mso-list: Ignore;">·<span style="font-size-adjust: none; font-stretch: normal; font: 7pt/normal "Times New Roman";">
</span></span></span><!--[endif]--><span style="font-family: "Arial","sans-serif";">How
are we going to create loyalty (patient satisfaction, physician referrals,
market share) with our key stakeholders?</span></div>
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; mso-list: l3 level1 lfo4; tab-stops: -27.0pt list .5in; text-indent: -0.25in;">
<!--[if !supportLists]--><span style="font-family: Symbol; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;"><span style="mso-list: Ignore;">·<span style="font-size-adjust: none; font-stretch: normal; font: 7pt/normal "Times New Roman";">
</span></span></span><!--[endif]--><span style="font-family: "Arial","sans-serif";">How
are we going to sustain our ability (have enough financial resources) to
enhance the value of the organization?<o:p></o:p></span></div>
<br />
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; tab-stops: -27.0pt;">
<span style="font-family: "Arial","sans-serif";">Full-service
performance management programs address each of those four
perspectives.<span style="mso-spacerun: yes;"> </span><o:p></o:p></span></div>
<br />
<br />
Posted 2013-04-18<br />
<h1 id="heading-alone" itemprop="name headline ">
Psychologist says maths can predict chances of divorce</h1>
<ul class="share-links trackable-component" data-component="Article:top share tools" id="content-actions">
<li class="full-line facebook"><span itemprop="author" itemscope="" itemtype="http://schema.org/Person"><span itemprop="name"><a class="contributor" href="http://www.guardian.co.uk/profile/timradford" itemprop="url" rel="author">Tim Radford</a></span></span>, <a href="http://www.guardian.co.uk/theguardian" itemprop="publisher">The Guardian</a>, <time datetime="2004-02-13T11:39GMT" itemprop="datePublished" pubdate="">Friday 13 February 2004 11.39 GMT</time></li>
</ul>
<div id="content">
<div class="trackable-component" data-component="Article:in body link" style="text-align: left;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGGZ0SZbszXBtcjvUy6kuAMVeQEYULE4j9hMPO7CRGYVwjB9dwCxpqN6qEC4V5uxcLzU98miY-mCvOhTs0rOuNr-xZF53loDIMKdE0x021-D6dqaMrGxVJ1C2M0Lo0hZZjnC68xhIeIGL_/s1600/div.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="212" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGGZ0SZbszXBtcjvUy6kuAMVeQEYULE4j9hMPO7CRGYVwjB9dwCxpqN6qEC4V5uxcLzU98miY-mCvOhTs0rOuNr-xZF53loDIMKdE0x021-D6dqaMrGxVJ1C2M0Lo0hZZjnC68xhIeIGL_/s320/div.jpg" width="320" /></a> A psychologist claims that a newly devised mathematical model can predict with 94% accuracy which couples will divorce - entirely on the basis of the first few minutes of a discussion about some disputed issue. John Gottman, of the University of Washington, and two applied mathematicians analysed hundreds of videotaped conversations between couples in Professor Gottman's relationship research institute. They also analysed pulse rates and other physiological data to provide a "bitterness rating" for each conversation. </div>
<div class="trackable-component" data-component="Article:in body link">
</div>
<div class="trackable-component" data-component="Article:in body link" style="text-align: left;">
The researchers were looking for what they called the "masters and disasters" of marriage. What mattered was not the dispute itself, but a couple's attitudes during the argument. "When the masters of marriage are talking about something important, they may be arguing, but they are also laughing and teasing and there are signs of affection because they have made emotional connections," Prof Gottman said. "But a lot of people don't know how to connect or how to build a sense of humour, and this means that a lot of fighting that couples engage in is a failure to make emotional connections. </div>
<div class="trackable-component" data-component="Article:in body link" style="text-align: left;">
"We wouldn't have known this without the mathematical model." </div>
<div class="trackable-component" data-component="Article:in body link" style="text-align: justify;">
</div>
<div class="trackable-component" data-component="Article:in body link" style="text-align: left;">
The researchers will take part in a symposium on love and marriage at the American Association for the Advancement of Science in Seattle tomorrow. On St Valentine's Day, they will produce the magic ratio of positive to negative interactions that is the mark of marital success. This ratio is 5 to 1: couples who keep their tempers and consider each other 80% of the time while arguing stand a chance of celebrating their golden wedding. Those who fall below this ratio might as well dial the lawyers, or at least the marriage guidance counsellors. The team say their model charts a "Dow Jones industrial average for marital conversation". </div>
<div class="trackable-component" data-component="Article:in body link" style="text-align: left;">
</div>
<div class="trackable-component" data-component="Article:in body link" style="text-align: left;">
Prof Gottman has spent almost 30 years trying to discover what makes marriages work and fail. In 1999, he unveiled a systematic study of conversations between 124 couples who had been married less than nine months, and rated them for emotion, gesture and attitude. The "positive" codes were for affection, humour, joy, interest and validation. And then there were ratings for disgust, contempt, anger, fear, defensiveness, whining and sadness. At the end of three years, 17 couples had divorced.</div>
</div>
Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com24tag:blogger.com,1999:blog-3770043454488854818.post-76787564645980660012013-04-18T05:01:00.001-07:002013-04-18T05:14:37.476-07:00<span style="font-family: Arial; font-size: large;">Tread with caution when building pregnancy models</span><br />
<span style="font-family: Arial; font-size: large;"></span><br />
<span style="font-family: Arial;"></span><br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhTVhccQRzcrQXw1C2eNoZ1ScFS5qWbziA62ys9w2DE2jwaGCCEsVJfKLmk4qV5kIfTH183iVaep5gOs9FNV9z0s395Xs1CJR4qn3rAdwjPe8VJM_6ctehLCUijqginUnC87UvVlWuMJHJq/s1600/appreg.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhTVhccQRzcrQXw1C2eNoZ1ScFS5qWbziA62ys9w2DE2jwaGCCEsVJfKLmk4qV5kIfTH183iVaep5gOs9FNV9z0s395Xs1CJR4qn3rAdwjPe8VJM_6ctehLCUijqginUnC87UvVlWuMJHJq/s320/appreg.jpg" width="289" /></a><span style="font-family: "Times New Roman","serif"; font-size: 12pt; line-height: 115%; mso-fareast-font-family: "Times New Roman";"><span style="font-family: Arial, Helvetica, sans-serif;">Many
retailers know that anticipating our purchasing patterns, and where they lead,
could be very beneficial to them, since they could reach the customer more quickly and more
efficiently. And for many retailers the “holy grail” application in the family or
women’s segment is <em>pregnancy prediction</em>. </span></span><br />
<div class="MsoNormal" style="margin: 0in 0in 10pt;">
<span style="font-family: Arial;"></span> </div>
<div class="MsoNormal" style="margin: 0in 0in 10pt; text-align: justify;">
<span style="font-family: "Times New Roman","serif"; font-size: 12pt; line-height: 115%; mso-fareast-font-family: "Times New Roman";"><span style="font-family: Arial, Helvetica, sans-serif;">We all know
that the life of any individual or family is very different in terms of priorities,
habits and shopping behavior before and after a baby is born. At the very least, no one
would dispute that it ought to be different. So, being able to time such an
“earth-shattering” event, where the old world is gone and a new star is born, and
then to “help” that individual or family by peddling your own products ahead of
competitors, can win you a large share of their wallet, all in the name of
serving their needs better than the competition. What is wrong
with that? </span></span><span style="font-family: "Times New Roman","serif"; font-size: 12pt; line-height: 115%; mso-fareast-font-family: "Times New Roman";"><span style="font-family: Arial, Helvetica, sans-serif;">Well, a few
things can go wrong here, mostly in the “privacy” department, and some “smarties”
who got ahead of themselves eventually learned their lesson and had to
move a few steps back.</span></span></div>
<div class="MsoNormal" style="margin: 0in 0in 10pt; text-align: justify;">
<span style="font-family: "Times New Roman","serif"; font-size: 12pt; line-height: 115%; mso-fareast-font-family: "Times New Roman";"><span style="font-family: Arial, Helvetica, sans-serif;">Let’s
start by outlining conceptually how you could build a pregnancy prediction
model, before putting up a few warning signs of the “proceed with caution” or
“danger ahead” kind.</span></span></div>
<div class="MsoNormal" style="margin: 0in 0in 10pt; text-align: justify;">
<span style="mso-fareast-font-family: "Times New Roman";"><span style="font-family: Arial, Helvetica, sans-serif;">The
first thing you need to do is dangle a carrot in front of any female customers
who would be willing to share their pregnancy secret with you (first or second
trimester preferably) in exchange for some hefty promotional discounts. Once you have a
critical mass of newly pregnant customers, it is just a matter of capturing
their purchasing history, so that a robust and accurate predictive model can
learn to differentiate between them and the rest (the non-pregnant segment).
Once such a model is in place, it is a matter of implementing
it, monitoring it and measuring the value it generates.</span></span></div>
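<div class="MsoNormal" style="margin: 0in 0in 10pt; text-align: justify;">
<span style="font-family: Arial, Helvetica, sans-serif;">To make the outline above a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the product categories, the toy purchase histories and the choice of a small hand-written logistic regression are all made up, and a real project would use far more data and a proper modelling library. It only shows the shape of the idea: customers who volunteered their status supply the labels, purchase history supplies the inputs, and the model outputs a probability.</span></div>

```python
import math

# Hypothetical product categories; the story does not name specific products.
FEATURES = ["lotion", "supplements", "cotton_wool", "beer"]

# Toy purchase histories: 1 = bought the category recently, 0 = did not.
# Label 1 = customer volunteered that she is pregnant (for a discount),
# label 0 = a customer from the rest of the base. All rows are invented.
TRAINING = [
    ([1, 1, 1, 0], 1),
    ([1, 1, 0, 0], 1),
    ([0, 1, 1, 0], 1),
    ([1, 0, 1, 0], 1),
    ([0, 0, 0, 1], 0),
    ([0, 1, 0, 1], 0),
    ([1, 0, 0, 1], 0),
    ([0, 0, 1, 1], 0),
]


def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, basket):
    """Return the modelled probability that this basket belongs to the positive class."""
    z = bias + sum(w * x for w, x in zip(weights, basket))
    return sigmoid(z)


def train(rows, epochs=2000, learning_rate=0.1):
    """Fit logistic-regression weights with plain stochastic gradient descent."""
    weights = [0.0] * len(rows[0][0])
    bias = 0.0
    for _ in range(epochs):
        for basket, label in rows:
            error = predict(weights, bias, basket) - label
            weights = [w - learning_rate * error * x for w, x in zip(weights, basket)]
            bias -= learning_rate * error
    return weights, bias


weights, bias = train(TRAINING)
likely = predict(weights, bias, [1, 1, 1, 0])    # basket resembling the volunteers
unlikely = predict(weights, bias, [0, 0, 0, 1])  # basket resembling everyone else
print(f"likely={likely:.2f} unlikely={unlikely:.2f}")
```

<div class="MsoNormal" style="margin: 0in 0in 10pt; text-align: justify;">
<span style="font-family: Arial, Helvetica, sans-serif;">The output is a probability, not a fact about the customer, which is exactly why what you do with the score matters so much.</span></div>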
<div class="MsoNormal" style="margin: 0in 0in 10pt;">
<span style="mso-fareast-font-family: "Times New Roman";"><span style="font-family: Arial, Helvetica, sans-serif;">It all
sounds well and good, but here is the reality. </span></span></div>
<div class="MsoNormal" style="margin: 0in 0in 10pt; text-align: justify;">
<span style="mso-fareast-font-family: "Times New Roman";"><span style="font-family: Arial, Helvetica, sans-serif;">Once
upon a time there was a very clever man, in the very clever marketing department of
a forward-thinking retail company. That man created a very smart data-mining
model that could predict whether a female customer was pregnant. Soon afterwards, mailings went out to the
customers the model flagged as likely pregnant. As the story goes, some customers were very impressed
and amazed: “how did they know?” But there were some who were
not so impressed, and they asked a different question: “how did they dare to
know?” There were also some who felt wrongly “impregnated”, like the father
who stormed into the marketing department accusing them of encouraging his teenage daughter
to get pregnant so that they could sell her their new range of baby
products. A few months later, the same father ended up sending a letter of apology
after discovering that his daughter was indeed pregnant! Needless to say, he was
completely stunned that this retailer knew something he did not, even though his daughter lived with him. </span></span></div>
<div class="MsoNormal" style="margin: 0in 0in 10pt; text-align: justify;">
<span style="mso-fareast-font-family: "Times New Roman";"><span style="font-family: Arial, Helvetica, sans-serif;">The
biggest problem was that many customers felt spied on and believed their privacy
had been compromised, so they started cutting ties with this retailer and doing
everything they could to hide their purchasing behavior. This prompted the retailer
to adjust how it acted on the model. The only remedy was to blur
the fact that it had such probabilistic knowledge. The result was
promotions in which baby-product coupons were mixed in with other vouchers, so
it was no longer obvious that the marketers had such knowledge, which put
customers at ease. So, if you are
competing for the baby-product market, think carefully about how you navigate
through this. There could be some stormy waters just when you think it is smooth
sailing.</span></span></div>
<div class="MsoNormal" style="margin: 0in 0in 10pt; text-align: justify;">
<span style="mso-fareast-font-family: "Times New Roman";"><span style="font-family: Arial, Helvetica, sans-serif;">Goran Dragosavac<o:p></o:p></span></span></div>
<br />
<div class="MsoNormal" style="margin: 0in 0in 10pt;">
<span style="font-family: "Times New Roman","serif"; font-size: 12pt; line-height: 115%; mso-fareast-font-family: "Times New Roman";"><span style="mso-spacerun: yes;"> </span><o:p></o:p></span></div>
Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com1tag:blogger.com,1999:blog-3770043454488854818.post-71647635350937046842013-04-17T11:44:00.000-07:002013-04-17T12:00:40.319-07:00<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt;">
<span style="font-family: Calibri;">Text Mining on the F-word</span></div>
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt;">
<span style="font-family: Calibri;">I have a colleague who works as an analytics practitioner,
and recently she was involved in a banking project where they were analyzing free-text
data collected online. </span></div>
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4hZiRQ1TDe5Ceu5tc8QD-pVkGI3oDl8PDssoJVRvR26GqaPvYK8gunR2AZt3zpTxoSb5XFkuL7sz8oWD-tZWCNl1UNo2yb5GMxyX223ezmT9pgiiEIjMtscnQmApObXxDzM4JbqJpBOQ_/s1600/imagesCA8GOV9K.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="226" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4hZiRQ1TDe5Ceu5tc8QD-pVkGI3oDl8PDssoJVRvR26GqaPvYK8gunR2AZt3zpTxoSb5XFkuL7sz8oWD-tZWCNl1UNo2yb5GMxyX223ezmT9pgiiEIjMtscnQmApObXxDzM4JbqJpBOQ_/s320/imagesCA8GOV9K.jpg" width="320" /></a><span style="font-family: Calibri;">The idea was to hear<span style="mso-spacerun: yes;"> </span>who is talking out there about this company,<span style="mso-spacerun: yes;"> </span>what are they saying, how influential are the
voices, what is the sentiment, what is the critical mass, and so on. And there is no better word to start your
exploration of negative sentiment than the F-word, and then go on from there.</span></div>
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt;">
<span style="font-family: Calibri;">The next thing my colleague did was to use a technique
called <em>concept linking</em>, which takes a selected word, in this case the F-word, and
produces a graphical display of the linkages between that word and other
entities. The thicker the link, the stronger the connection between
the words. </span></div>
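<div class="MsoNormal" style="margin: 0in 0in 0pt;">
<span style="font-family: Calibri;">As a rough illustration of the idea behind concept linking, the sketch below counts how often other words appear in the same comment as a chosen seed word. This is only an assumption about how such a display could be fed: commercial text-mining tools use their own association measures, and the raw co-occurrence count here is a simple stand-in for "link thickness". The sample comments, the stopword list and the polite seed word "awful" are all invented.</span></div>

```python
import re
from collections import Counter

# Small, hypothetical stopword list; a real project would use a fuller one.
STOPWORDS = {"the", "a", "is", "my", "and", "to", "of", "this", "was",
             "for", "i", "on", "in", "at", "are"}


def concept_links(documents, seed, top_n=5):
    """Count how many documents mention each word together with the seed word.

    Returns (word, count) pairs, strongest link first; ties are broken
    alphabetically so the result is deterministic.
    """
    links = Counter()
    for text in documents:
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        if seed in tokens:
            links.update(tokens - {seed} - STOPWORDS)
    ranked = sorted(links.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:top_n]


# Invented free-text comments standing in for data collected online.
comments = [
    "awful fees on my account",
    "the awful queue at the branch",
    "awful fees again this month",
    "great service at the branch",
]

links = concept_links(comments, "awful")
for word, count in links:
    print(word, count)
```

<div class="MsoNormal" style="margin: 0in 0in 0pt;">
<span style="font-family: Calibri;">In this toy data the strongest link from the seed word is to "fees", which is the kind of signal that tells you where to look first.</span></div>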
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt;">
<span style="font-family: Calibri;">So, there she was, sitting with a senior bank manager who was
probably dressed in a grey suit and tie, using some neat technologies for linguistic exploration
to find the most F–ed up areas of the business. Isn't this just pure pragmatism?
Basically: let’s see what the customers are most angry about before we see if we can do something about it. </span></div>
<div class="MsoNormal" style="margin: 0in 0in 0pt;">
<o:p><span style="font-family: Calibri;"> </span></o:p></div>
<div class="MsoNormal" style="margin: 0in 0in 0pt;">
<span style="font-family: Calibri;">Next time someone throws an expletive in your face, don’t
get angry; try to learn from it! </span><br />
</div>
<div class="MsoNormal" style="margin: 0in 0in 0pt;">
<span style="font-family: Calibri;">Goran Dragosavac<o:p></o:p></span></div>
Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com1tag:blogger.com,1999:blog-3770043454488854818.post-78323800142268420152013-02-10T12:39:00.001-08:002013-02-10T12:47:32.865-08:00<br />
<h3 class="MsoNormal" style="margin: 0in 0in 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">The benefits of having a multidimensional view of the
customer</span> <o:p></o:p></h3>
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFCFJNqQ5ZtXZMD3TtVQ38J7PrQwxxtx7OVXAY0YwU0QHA3g5ag_3wnC4ui0_5uDNanJuimK4nAZigO1s11yG9c7LkU5mPUbelyP7giCJBjGnCstkDLKSIZU1r5MzE_ZITlpJ1wsUo1zOj/s1600/multi1.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="245" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFCFJNqQ5ZtXZMD3TtVQ38J7PrQwxxtx7OVXAY0YwU0QHA3g5ag_3wnC4ui0_5uDNanJuimK4nAZigO1s11yG9c7LkU5mPUbelyP7giCJBjGnCstkDLKSIZU1r5MzE_ZITlpJ1wsUo1zOj/s320/multi1.jpg" width="320" /></a><span style="font-family: Calibri;">When we talk about a multi-dimensional view of the customer
we mean a view of the customer purely from the business perspective:
profitability, risk, responsiveness, loyalty, behavior and
preferences. The ability to see a single customer along these dimensions would
give any organization a strong competitive advantage: it can
serve customers better and, in return, be rewarded with a larger
share of each customer’s wallet. </span><br />
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;">This can certainly be achieved with predictive
analytics, whose main outputs are probabilities: in this case, the
probability of being loyal, profitable, and so on. So if you know that a specific
customer is of low value to you, one whose cost of serving is far greater than the
value he brings you in terms of profitability, who cares about his
loyalty and preferences? You don’t want to waste a cent of your marketing budget on
him; in fact, you want to open the door as wide as you can and let him go. By
contrast, a customer in the highest value segment whose loyalty scores are
dwindling deserves a phone call from your top account managers to see how you can
improve your service to him. And if you know his preferences and buying habits,
you know what to offer him to make him rethink
his intention to leave you. This is what we mean by knowing the next
best action toward the customer, even if that action is: don’t do anything, let him
go! </span></div>
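<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;">The reasoning above can be sketched as a small decision rule. This is only an illustration under stated assumptions: the two scores, the cut-off values and the action names are hypothetical, and in practice they would come from your own models and business rules.</span></div>

```python
from dataclasses import dataclass


@dataclass
class CustomerScores:
    """Model outputs for one customer (both are probabilities between 0 and 1)."""
    value: float    # predicted probability of being profitable
    loyalty: float  # predicted probability of staying with you


def next_best_action(scores, low_value=0.3, high_value=0.8, low_loyalty=0.5):
    """Map a constellation of probabilities to one action.

    The thresholds are made-up defaults for illustration only.
    """
    if scores.value < low_value:
        # Cost of serving outweighs the value: spend nothing on retention.
        return "do nothing, let go"
    if scores.value >= high_value and scores.loyalty < low_loyalty:
        # High-value customer drifting away: worth a personal call.
        return "call from top account manager"
    return "business as usual"


for customer in (
    CustomerScores(value=0.1, loyalty=0.9),
    CustomerScores(value=0.9, loyalty=0.2),
    CustomerScores(value=0.9, loyalty=0.9),
):
    print(customer, "->", next_best_action(customer))
```

<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;">Adding preference scores would let the second branch also pick what to offer during that call.</span></div>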
<div class="MsoNormal" style="margin: 0in 0in 0pt;">
<span style="font-family: Calibri;"></span> </div>
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;">At this level, analytics is no longer used for decision support;
it is used for decision making. All you need is to look at the constellation
of these probabilities, and the next action toward a specific customer or customer
segment becomes crystal clear. You just need to act on what these numbers are
telling you and “count your blessings”. This is what I call the “holy grail” of
analytics, and until an organization can do all of the above and more, its usage of
analytics is nowhere close to optimal. For those companies that use
analytics at that level and for those purposes, the pay-offs are huge, but you
may never hear about them. </span></div>
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;"></span> </div>
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;">Goran Dragosavac<o:p></o:p></span></div>
</div>
Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com6tag:blogger.com,1999:blog-3770043454488854818.post-10788192846346729982013-01-31T02:11:00.003-08:002013-01-31T02:14:29.288-08:00<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<b><span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 14pt; mso-ansi-language: EN-GB;">Nine Laws of Data Mining</span></b><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">by Tom Khabaza</span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<i><span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">This content was created
during the first quarter of 2010 to publish the “Nine Laws of Data Mining”,
which explain the reasons underlying the data mining process. If you prefer
brevity, see my tweets: </span></i><a href="https://twitter.com/tomkhabaza"><i><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;"><span style="color: blue; font-family: Arial, Helvetica, sans-serif;">twitter.com/tomkhabaza</span></span></i></a><i><span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">. If you are a
member of </span></i><a href="http://www.linkedin.com/"><i><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;"><span style="color: blue; font-family: Arial, Helvetica, sans-serif;">LinkedIn</span></span></i></a><i><span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">, see the “9 Laws
of Data Mining” subgroup of the CRISP-DM group for a discussion forum. This
page contains laws 1-4, with further laws on </span></i><a href="http://khabaza.codimension.net/index_files/Page346.htm"><i><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;"><span style="color: blue; font-family: Arial, Helvetica, sans-serif;">additional pages</span></span></i></a><i><span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">. The 9 Laws are
also expressed as haikus </span></i><a href="http://khabaza.codimension.net/index_files/Page353.htm"><i><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;"><span style="color: blue; font-family: Arial, Helvetica, sans-serif;">here</span></span></i></a><i><span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">.</span></i><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;"><span style="font-family: Arial, Helvetica, sans-serif;">Data mining is the creation of
new knowledge in natural or artificial form, by using business knowledge to
discover and interpret patterns in data. In its current form, data mining as a
field of practice came into existence in the 1990s, aided by the emergence of
data mining algorithms packaged within workbenches so as to be suitable for
business analysts. Perhaps because of its origins in practice rather than in
theory, relatively little attention has been paid to understanding the nature
of the data mining process. The development of the CRISP-DM methodology in the
late 1990s was a substantial step towards a standardised description of the
process that had already been found successful and was (and is) followed by
most practising data miners. <o:p></o:p></span></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<o:p><span style="font-family: Arial, Helvetica, sans-serif;"> </span></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">Although CRISP-DM describes
how data mining is performed, it does not explain what data mining is or why
the process has the properties that it does. In this paper I propose nine
maxims or “laws” of data mining (most of which are well-known to
practitioners), together with explanations where known. This provides the start
of a theory to explain (and not merely describe) the data mining process. </span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">It is not my purpose to
criticise CRISP-DM; many of the concepts introduced by CRISP-DM are crucial to
the understanding of data mining outlined here, and I also depend on CRISP-DM’s
common terminology. This is merely the next step in the process that started
with CRISP-DM.</span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="line-height: 113%; margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; line-height: 113%; mso-ansi-language: EN-GB;">——————————————————————————————————</span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<b><span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">1st Law of Data Mining –
“Business Goals Law”: </span></b><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<i><span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">Business objectives are the
origin of every data mining solution</span></i><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">This defines the field of data
mining: data mining is concerned with solving business problems and achieving
business goals. Data mining is not primarily a technology; it is a process,
which has one or more business objectives at its heart. Without a business
objective (whether or not this is articulated), there is no data mining. </span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">Hence the maxim: “Data Mining is
a Business Process”. </span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="line-height: 113%; margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; line-height: 113%; mso-ansi-language: EN-GB;">——————————————————————————————————</span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<b><span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">2nd Law of Data Mining –
“Business Knowledge Law”: </span></b><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;"><br /><span style="font-family: Arial, Helvetica, sans-serif;">
<i>Business knowledge is central to every step of the data mining process</i></span></span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">This defines a crucial
characteristic of the data mining process. A naive reading of CRISP-DM would
see business knowledge used at the start of the process in defining goals, and
at the end of the process in guiding deployment of results. This would be to
miss a key property of the data mining process, that business knowledge has a
central role in every step.</span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">For convenience I use the
CRISP-DM phases to illustrate:</span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; text-indent: -21.75pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span lang="X-NONE" style="mso-ansi-language: X-NONE;">·</span><span lang="X-NONE"> </span><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;">Business understanding must be based on business knowledge, and so must
the mapping of business objectives to data mining goals. (This mapping is also
based on data knowledge and data mining knowledge).</span><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; text-indent: -21.75pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span lang="X-NONE" style="mso-ansi-language: X-NONE;">·</span><span lang="X-NONE"> </span><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;">Data understanding uses business knowledge to understand which data is
related to the business problem, and how it is related.</span><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; text-indent: -21.75pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span lang="X-NONE" style="mso-ansi-language: X-NONE;">·</span><span lang="X-NONE"> </span><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;">Data preparation means using business knowledge to shape the data so
that the required business questions can be asked and answered. (For further
detail see the 3rd Law – the Data Preparation law).</span><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; text-indent: -21.75pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span lang="X-NONE" style="mso-ansi-language: X-NONE;">·</span><span lang="X-NONE"> </span><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;">Modelling means using data mining algorithms to create predictive models
and interpreting both the models and their behaviour in business terms – that
is, understanding their business relevance.</span><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; text-indent: -21.75pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span lang="X-NONE" style="mso-ansi-language: X-NONE;">·</span><span lang="X-NONE"> </span><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;">Evaluation means understanding the business impact of using the models.</span><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; text-indent: -21.75pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span lang="X-NONE" style="mso-ansi-language: X-NONE;">·</span><span lang="X-NONE"> </span><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;">Deployment means putting the data mining results to work in a business
process.</span><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">In summary, without business
knowledge, not a single step of the data mining process can be effective; there
are no “purely technical” steps. Business knowledge guides the process towards
useful results, and enables the recognition of those results that are useful.
Data mining is an iterative process, with business knowledge at its core,
driving continual improvement of results.</span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">The reason behind this can be
explained in terms of the “chasm of representation” (an idea used by Alan
Montgomery in data mining presentations of the 1990s). Montgomery pointed out
that the business goals in data mining refer to the reality of the business,
whereas investigation takes place at the level of data which is only a
representation of that reality; there is a gap (or “chasm”) between what is
represented in the data and what takes place in the real world. In data mining,
business knowledge is used to bridge this gap; whatever is found in the data
has significance only when interpreted using business knowledge, and anything
missing from the data must be provided through business knowledge. Only
business knowledge can bridge the gap, which is why it is central to every step
of the data mining process.</span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="line-height: 113%; margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; line-height: 113%; mso-ansi-language: EN-GB;">——————————————————————————————————</span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<b><span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">3rd Law of Data Mining – “Data
Preparation Law”: </span></b><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<i><span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">Data preparation is more than
half of every data mining process</span></i><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">It is a well-known maxim of
data mining that most of the effort in a data mining project is spent in data
acquisition and preparation. Informal estimates vary from 50 to 80 percent.
Naive explanations might be summarised as “data is difficult”, and moves to
automate various parts of data acquisition, data cleaning, data transformation
and data preparation are often viewed as attempts to mitigate this “problem”.
While automation can be beneficial, there is a risk that proponents of this technology
will believe that it can remove the large proportion of effort which goes into
data preparation. This would be to misunderstand the reasons why data
preparation is required in data mining.</span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">The purpose of data
preparation is to put the data into a form in which the data mining question
can be asked, and to make it easier for the analytical techniques (such as data
mining algorithms) to answer it. Every change to the data of any sort
(including cleaning, large and small transformations, and augmentation) means a
change to the problem space which the analysis must explore. The reason that
data preparation is important, and forms such a large proportion of data mining
effort, is that the data miner is deliberately manipulating the problem space
to make it easier for their analytical techniques to find a solution.</span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">There are two aspects to this
“problem space shaping”. The first is putting the data into a form in which it
can be analysed at all – for example, most data mining algorithms require data
in a single table, with one record per example. The data miner knows this as a
general parameter of what the algorithm can do, and therefore puts the data
into a suitable format. The second aspect is making the data more informative
with respect to the business problem – for example, certain derived fields or
aggregates may be relevant to the data mining question; the data miner knows
this through business knowledge and data knowledge. By including these fields
in the data, the data miner manipulates the search space to make it possible or
easier for their preferred techniques to find a solution.</span><o:p></o:p></div>
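<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">Both aspects of problem space shaping can be illustrated with a minimal Python sketch. The transaction log, the field names and the function one_record_per_customer are hypothetical and chosen only for illustration. The sketch first collapses several rows per customer into a single table with one record per example, then adds aggregates and derived fields that a data miner might judge relevant to the business question.</span></div>

```python
from collections import defaultdict

# Hypothetical transaction log: several rows per customer (illustrative data only).
transactions = [
    {"customer": "A", "amount": 20.0, "month": 1},
    {"customer": "A", "amount": 35.0, "month": 3},
    {"customer": "B", "amount": 5.0, "month": 2},
    {"customer": "A", "amount": 10.0, "month": 6},
    {"customer": "B", "amount": 7.5, "month": 6},
]


def one_record_per_customer(rows):
    """Collapse a transaction log into one record per customer with derived fields."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["customer"]].append(row)

    table = []
    for customer, items in sorted(grouped.items()):
        amounts = [item["amount"] for item in items]
        table.append({
            "customer": customer,
            "n_purchases": len(items),                  # aggregate: purchase count
            "total_spend": sum(amounts),                # aggregate: total value
            "mean_spend": sum(amounts) / len(amounts),  # derived field
            "last_month": max(item["month"] for item in items),  # recency
        })
    return table


customer_table = one_record_per_customer(transactions)
for record in customer_table:
    print(record)
```

<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">Which aggregates to include (count, total, mean, recency) is exactly the kind of choice that depends on business knowledge; the code only shows the mechanics.</span></div>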
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">It is therefore essential that
data preparation is informed in detail by business knowledge, data knowledge
and data mining knowledge. These aspects of data preparation cannot be
automated in any simple way.</span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">This law also explains the
otherwise paradoxical observation that even after all the data acquisition,
cleaning and organisation that goes into creating a data warehouse, data
preparation is still crucial to, and more than half of, the data mining
process. Furthermore, even after a major data preparation stage, further data
preparation is often required during the iterative process of building useful
models, as shown in the CRISP-DM diagram.</span></div>
<div class="MsoNormal" style="line-height: 113%; margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; line-height: 113%; mso-ansi-language: EN-GB;">——————————————————————————————————</span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<b><span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">4th Law of Data Mining –
“NFL-DM”: </span></b><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<i><span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">The right model for a given
application can only be discovered by experiment</span></i><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;">or <i>“There is No Free Lunch
for the Data Miner”</i></span><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">It is an axiom of machine
learning that, if we knew enough about a problem space, we could choose or
design an algorithm to find optimal solutions in that problem space with
maximal efficiency. Arguments for the superiority of one algorithm over others
in data mining rest on the idea that data mining problem spaces have one
particular set of properties, or that these properties can be discovered by
analysis and built into the algorithm. However, these views arise from the
erroneous idea that, in data mining, the data miner formulates the problem and
the algorithm finds the solution. In fact, the data miner both formulates the
problem and finds the solution – the algorithm is merely a tool which the data
miner uses to assist with certain steps in this process.</span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">There are 5 factors which
contribute to the necessity for experiment in finding data mining solutions:</span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; text-indent: -21.75pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;">1.</span><span lang="EN-GB"> </span><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;">If the problem space were well-understood, the data mining process would
not be needed – data mining is the process of searching for as yet unknown
connections.</span><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; text-indent: -21.75pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;">2.</span><span lang="EN-GB"> </span><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;">For a given application, there is not only one problem space; different
models may be used to solve different parts of the problem, and the way in
which the problem is decomposed is itself often the result of data mining and
not known before the process begins.</span><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; text-indent: -21.75pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;">3.</span><span lang="EN-GB"> </span><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;">The data miner manipulates, or “shapes”, the problem space by data
preparation, so that the grounds for evaluating a model are constantly
shifting.</span><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; text-indent: -21.75pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;">4.</span><span lang="EN-GB"> </span><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;">There is no technical measure of value for a predictive model (see 8th
law).</span><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; text-indent: -21.75pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;">5.</span><span lang="EN-GB"> </span><span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;">The business objective itself undergoes revision and development during
the data mining process, so that the appropriate data mining goals may change
completely.</span><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">This last point, the ongoing
development of business objectives during data mining, is implied by CRISP-DM
but is often missed. It is widely known that CRISP-DM is not a “waterfall”
process in which each phase is completed before the next begins. In fact, any
CRISP-DM phase can continue throughout the project, and this is as true for
Business Understanding as it is for any other phase. The business objective is
not simply given at the start; it evolves throughout the process. This may be
why some data miners are willing to start projects without a clear business
objective – they know that business objectives are also a result of the
process, and not a static given.</span><o:p></o:p></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">Wolpert’s “No Free Lunch”
(NFL) theorem, as applied to machine learning, states that no one bias (as
embodied in an algorithm) will be better than any other when averaged across
all possible problems (datasets). This is because, if we consider all possible
problems, their solutions are evenly distributed, so that an algorithm (or
bias) which is advantageous for one subset will be disadvantageous for another.
This is strikingly similar to what all data miners know, that no one algorithm
is the right choice for every problem. Yet the problems or datasets tackled by
data mining are anything but random, and most unlikely to be evenly distributed
across the space of all possible problems – they represent a very biased
sample, so why should the conclusions of NFL apply? The answer relates to the
factors given above: because problem spaces are initially unknown, because
multiple problem spaces may relate to each data mining goal, because problem
spaces may be manipulated by data preparation, because models cannot be
evaluated by technical means, and because the business problem itself may
evolve. For all these reasons, data mining problem spaces are developed by the
data mining process, and subject to constant change during the process, so that
the conditions under which the algorithms operate mimic a random selection of
datasets and Wolpert’s NFL theorem therefore applies. There is no free lunch for
the data miner.</span><o:p></o:p></div>
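<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">The averaging argument behind NFL can be checked on a toy case. The following Python sketch is an illustration under strong assumptions, not Wolpert's proof: it assumes an input space of only four points with binary labels, two points seen in training and two held out, and three hypothetical learners invented for this example. It enumerates every possible labelling of the four points and averages each learner's accuracy on the unseen points.</span></div>

```python
from itertools import product

TRAIN_X = [0, 1]   # inputs whose labels the learner sees
TEST_X = [2, 3]    # unseen inputs used for evaluation


def majority_learner(train_labels):
    """Predict the most common training label (ties go to 1)."""
    guess = 1 if sum(train_labels) * 2 >= len(train_labels) else 0
    return lambda x: guess


def contrarian_learner(train_labels):
    """Predict the opposite of whatever the majority learner predicts."""
    base = majority_learner(train_labels)
    return lambda x: 1 - base(x)


def always_zero_learner(train_labels):
    """Ignore the training data and always predict 0."""
    return lambda x: 0


def mean_off_training_accuracy(learner):
    """Average accuracy on TEST_X over every possible labelling of the inputs."""
    targets = list(product([0, 1], repeat=len(TRAIN_X) + len(TEST_X)))
    total = 0.0
    for target in targets:  # target[x] is the true label of input x
        model = learner([target[x] for x in TRAIN_X])
        hits = sum(model(x) == target[x] for x in TEST_X)
        total += hits / len(TEST_X)
    return total / len(targets)


results = {
    fn.__name__: mean_off_training_accuracy(fn)
    for fn in (majority_learner, contrarian_learner, always_zero_learner)
}
print(results)
```

<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">All three learners average exactly 0.5 on the unseen points, because over all labellings the held-out labels are independent of the training labels. A learner only gains an advantage when the set of problems is restricted, which is the situation the paragraph above questions for real data mining projects.</span></div>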
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="font-size: 12pt; mso-ansi-language: EN-GB;"><span style="font-family: Arial, Helvetica, sans-serif;">This describes the data mining
process in general. However, there may well be cases where the ground is
already “well-trodden” – the business goals are stable, the data and its
pre-processing are stable, an acceptable algorithm or algorithms and their
role(s) in the solution have been discovered and settled upon. In these
situations, some of the properties of the generic data mining process are
lessened. Such stability is temporary, because both the relation of the data to
the business (see 2nd law) and our understanding of the problem (see 9th law)
will change. However, as long as this stability lasts, the data miner’s lunch may
be free, or at least relatively inexpensive.<o:p></o:p></span></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><b><span lang="EN-GB" style="color: black; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">5th Law of Data Mining –
“Watkins’ Law”: </span></b><i><span lang="EN-GB" style="color: black; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">There are always patterns</span></i><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">This law was first
stated by David Watkins. We might expect that a proportion of data mining
projects would fail because the patterns needed to solve the business problem
are not present in the data, but this does not accord with the experience of
practising data miners. </span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">Previous explanations
have suggested that this is because:</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">There is always
something interesting to be found in a business-relevant dataset, so that even
if the expected patterns were not found, something else useful would be found
(this does accord with data miners’ experience), and</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">A data mining project
would not be undertaken unless business experts expected that patterns would be
present, and it should not be surprising that the experts are usually right. </span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">However, Watkins
formulated this in a simpler and more direct way: “There are always patterns”,
and this accords more accurately with the experience of data miners than either
of the previous explanations. Watkins later amended this to mean that in data
mining projects about customer relationships, there are always patterns
connecting customers’ previous behaviour with their future behaviour, and that
these patterns can be used profitably (“Watkins’ CRM Law”). However, data
miners’ experience is that this is not limited to CRM problems – there are always
patterns in any data mining problem (“Watkins’ General Law”).</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">The explanation of
Watkins’ General Law is as follows:</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; text-indent: -21.75pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span lang="X-NONE" style="color: black; font-size: 10pt; mso-ansi-language: X-NONE; mso-bidi-font-family: "Times New Roman"; mso-fareast-font-family: "Times New Roman";">·</span><span lang="X-NONE" style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"> </span><span lang="EN-GB" style="color: black; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">The
business objective of a data mining project defines the domain of interest, and
this is reflected in the data mining goal.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; text-indent: -21.75pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span lang="X-NONE" style="color: black; font-size: 10pt; mso-ansi-language: X-NONE; mso-bidi-font-family: "Times New Roman"; mso-fareast-font-family: "Times New Roman";">·</span><span lang="X-NONE" style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"> </span><span lang="EN-GB" style="color: black; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">Data
relevant to the business objective and consequent data mining goal is generated
by processes within the domain.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; text-indent: -21.75pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span lang="X-NONE" style="color: black; font-size: 10pt; mso-ansi-language: X-NONE; mso-bidi-font-family: "Times New Roman"; mso-fareast-font-family: "Times New Roman";">·</span><span lang="X-NONE" style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"> </span><span lang="EN-GB" style="color: black; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">These
processes are governed by rules, and the data that is generated by the
processes reflects those rules.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; text-indent: -21.75pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span lang="X-NONE" style="color: black; font-size: 10pt; mso-ansi-language: X-NONE; mso-bidi-font-family: "Times New Roman"; mso-fareast-font-family: "Times New Roman";">·</span><span lang="X-NONE" style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"> </span><span lang="EN-GB" style="color: black; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">In
these terms, the purpose of the data mining process is to reveal the domain
rules by combining pattern-discovery technology (data mining algorithms) with
the business knowledge required to interpret the results of the algorithms in
terms of the domain.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 0.5in; text-indent: -21.75pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span lang="X-NONE" style="color: black; font-size: 10pt; mso-ansi-language: X-NONE; mso-bidi-font-family: "Times New Roman"; mso-fareast-font-family: "Times New Roman";">·</span><span lang="X-NONE" style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"> </span><span lang="EN-GB" style="color: black; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">Data
mining requires relevant data, that is, data generated by the domain processes
in question, which inevitably holds patterns from the rules which govern these
processes.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">To summarise this
argument: there are always patterns because they are an inevitable by-product
of the processes which produce the data. To find the patterns, start from the
process or what you know of it – the business knowledge.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
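<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">The argument can be made concrete with a small simulation. The Python sketch below assumes a hypothetical domain rule (customers who have bought before are more likely to buy again) and invented probabilities; none of these figures come from the article. Data is generated by a process that follows the rule, and the pattern then shows up in the data as a difference in purchase rates between the two groups.</span></div>

```python
import random

random.seed(0)  # fixed seed so repeated runs give the same data

# Hypothetical domain rule (an assumption for illustration only):
# customers who bought before are more likely to buy again.
P_BOUGHT_BEFORE = 0.4
P_BUY_IF_REPEAT = 0.6
P_BUY_IF_NEW = 0.2


def simulate_customer():
    """Generate one record from a process governed by the rule above."""
    bought_before = P_BOUGHT_BEFORE > random.random()
    p_buy = P_BUY_IF_REPEAT if bought_before else P_BUY_IF_NEW
    buys_again = p_buy > random.random()
    return bought_before, buys_again


def purchase_rate(rows, bought_before):
    """Share of customers in one group who go on to buy."""
    outcomes = [buys for before, buys in rows if before == bought_before]
    return sum(outcomes) / len(outcomes)


data = [simulate_customer() for _ in range(10_000)]
rate_repeat = purchase_rate(data, True)
rate_new = purchase_rate(data, False)
print(f"repeat customers: {rate_repeat:.2f}  new customers: {rate_new:.2f}")
```

<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB;">In a real project the rule is not known in advance; the point of the sketch is only that a rule-governed process leaves a measurable trace in the data it produces.</span></div>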
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">Discovery of these
patterns also forms an iterative process with business knowledge; the patterns
contribute to business knowledge, and business knowledge is the key component
required to interpret the patterns. In this iterative process, data mining
algorithms simply link business knowledge to patterns which cannot be observed
with the naked eye.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">If this explanation is
correct, then Watkins’ law is entirely general. There will always be patterns
for every data mining problem in every domain unless there is no relevant data;
this is guaranteed by the definition of relevance.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="line-height: 113%; margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; line-height: 113%; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">——————————————————————————————————</span><span style="color: black; font-size: 10pt; line-height: 113%; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><b><span lang="EN-GB" style="color: black; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">6th Law of Data Mining –
“Insight Law”: <br />
</span></b><i><span lang="EN-GB" style="color: black; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">Data
mining amplifies perception in the business domain</span></i><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">How does data mining
produce insight? This law approaches the heart of data mining – why it must be
a business process and not a technical one. Business problems are solved by
people, not by algorithms. The data miner and the business expert “see” the
solution to a problem, that is, the patterns in the domain that allow the
business objective to be achieved. Thus data mining is, or assists as part of,
a perceptual process. Data mining algorithms reveal patterns that are not
normally visible to human perception. The data mining process integrates these
algorithms with the normal human perceptual process, which is active in nature.
Within the data mining process, the human problem solver interprets the results
of data mining algorithms and integrates them into their business
understanding, and thence into a business process.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">This is similar to the
concept of an “intelligence amplifier”. Early in the field of Artificial
Intelligence, it was suggested that the first practical outcomes from AI would
be not intelligent machines, but rather tools which acted as “intelligence
amplifiers”, assisting human users by boosting their mental capacities and
therefore their effective intelligence. Data mining provides a kind of
intelligence amplifier, helping business experts to solve business problems in
a way which they could not achieve unaided. </span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">In summary: Data mining
algorithms provide a capability to detect patterns beyond normal human
capabilities. The data mining process allows data miners and business experts
to integrate this capability into their own problem solving and into business
processes.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="line-height: 113%; margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; line-height: 113%; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">——————————————————————————————————</span><span style="color: black; font-size: 10pt; line-height: 113%; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<b><span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">7th Law of Data Mining –
“Prediction Law”: </span></b><span lang="EN-GB" style="color: black; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";"><br /><span style="font-family: Arial, Helvetica, sans-serif;">
<i>Prediction increases information locally by generalisation</i></span></span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">The term “prediction”
has become the accepted description of what data mining models do – we talk
about “predictive models” and “predictive analytics”. This is because some of
the most popular data mining models are often used to “predict the most likely
outcome” (as well as indicating how likely the outcome may be). This is the
typical use of classification and regression models in data mining solutions. </span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">However, other kinds of
data mining models, such as clustering and association models, are also
characterised as “predictive”; this is a much looser sense of the term. A
clustering model might be described as “predicting” the group into which an
individual falls, and an association model might be described as “predicting”
one or more attributes on the basis of those that are known.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">Similarly, we might
analyse the use of the term “predict” in different domains: a classification
model might be said to predict customer behaviour – more properly we might say
that it predicts which customers should be targeted in a certain way, even
though not all the targeted individuals will behave in the “predicted” manner.
A fraud detection model might be said to predict whether individual
transactions should be treated as high-risk, even though not all those so
treated are in fact cases of fraud.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">These broad uses of the
term “prediction” have led to the term “predictive analytics” as an umbrella
term for data mining and the application of its results in business solutions.
But we should remain aware that this is not the ordinary everyday meaning of
“prediction” – we cannot expect to predict the behaviour of a specific
individual, or the outcome of a specific fraud investigation.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">What, then, is
“prediction” in this sense? What do classification, regression, clustering and
association algorithms and their resultant models have in common? The answer
lies in “scoring”, that is the application of a predictive model to a new
example. The model produces a prediction, or score, which is a new piece of
information about the example. The available information about the example in
question has been increased, locally, on the basis of the patterns found by the
algorithm and embodied in the model, that is on the basis of generalisation or
induction. It is important to remember that this new information is not “data”,
in the sense of a “given”; it is information only in the statistical sense.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
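<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt;">The idea of scoring can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the field names, the weights of the toy logistic model and the <code>churn_score</code> label are invented, not taken from any real model. The point is only that scoring attaches a new, derived piece of information to an example, and that this derived value is worth keeping apart from the given data.</span></div>

```python
from dataclasses import dataclass
from math import exp


@dataclass
class Example:
    """A new example described only by its given data (hypothetical fields)."""
    monthly_spend: float
    support_calls: int


def score(example: Example) -> float:
    """Apply a fixed, illustrative logistic model to one example.

    The weights below are invented for this sketch; in practice they would
    be learned from historical data by a classification algorithm.
    """
    z = -1.0 - 0.02 * example.monthly_spend + 0.8 * example.support_calls
    return 1.0 / (1.0 + exp(-z))


def enrich(example: Example) -> dict:
    """Return the given data plus the derived score, kept clearly separate."""
    return {
        "given": {
            "monthly_spend": example.monthly_spend,
            "support_calls": example.support_calls,
        },
        # Information added locally by generalisation; it is not observed data.
        "derived": {"churn_score": score(example)},
    }
```

<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt;">Calling <code>enrich(Example(50.0, 3))</code> returns the two given values unchanged together with one new value between 0 and 1. Keeping the "given" and "derived" parts separate is one simple way to remember that the score is a statistical estimate rather than a fact.</span></div>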
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="line-height: 113%; margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; line-height: 113%; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">——————————————————————————————————</span><span style="color: black; font-size: 10pt; line-height: 113%; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<b><span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">8th Law of Data Mining –
“Value Law”: </span></b><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<i><span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">The value of data mining
results is not determined by the accuracy or stability <br />
of predictive models</span></i><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">Accuracy and stability
are useful measures of how well a predictive model makes its predictions.
Accuracy means how often the predictions are correct (where they are truly predictions)
and stability means how much (or rather how little) the predictions would
change if the data used to create the model were a different sample from the
same population. Given the central role of the concept of prediction in data
mining, the accuracy and stability of a predictive model might be expected to
determine its value, but this is not the case. </span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
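<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt;">To make the two measures concrete, here is a rough Python sketch. It assumes a deliberately trivial "model" (a threshold at the sample mean) and measures stability as the average agreement between models built on different random samples of the same population. This is only one of several possible ways to quantify stability, and all function names are hypothetical.</span></div>

```python
import random


def accuracy(predictions, actuals):
    """Fraction of predictions that match the actual outcomes."""
    matches = sum(p == a for p, a in zip(predictions, actuals))
    return matches / len(actuals)


def fit_threshold(sample):
    """Toy 'model': predict 1 when x exceeds the mean of x in the sample."""
    cut = sum(x for x, _ in sample) / len(sample)
    return lambda x: 1 if x > cut else 0


def stability(population, probes, n_models=20, sample_size=50, seed=0):
    """Average pairwise agreement between models built on different samples
    drawn from the same population (1.0 means identical predictions)."""
    rng = random.Random(seed)
    models = [fit_threshold(rng.sample(population, sample_size))
              for _ in range(n_models)]
    # Score the same probe values with every model, then compare the outputs.
    outputs = [[m(x) for x in probes] for m in models]
    agreements = [accuracy(outputs[i], outputs[j])
                  for i in range(n_models)
                  for j in range(i + 1, n_models)]
    return sum(agreements) / len(agreements)
```

<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt;">Here <code>population</code> is a list of (x, outcome) pairs and <code>probes</code> is a list of x values to score. Both functions return a number between 0 and 1, and neither says anything about whether the model is useful to the business.</span></div>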
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">The value of a
predictive model arises in two ways:</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">The model’s predictions
drive improved (more effective) action, and</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">The model delivers
insight (new knowledge) which leads to improved strategy.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">In the case of insight,
accuracy is connected only loosely to the value of any new knowledge delivered.
Some predictive capability may be necessary to convince us that the discovered
patterns are real. However, a model which is incomprehensibly complex or
totally opaque may be highly accurate in its predictions, yet deliver no useful
insight, whereas a simpler and less accurate model may be much more useful for
delivering insight.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">The disconnect between
accuracy and value in the case of improved action is less obvious, but still
present, and can be highlighted by the question “Is the model predicting the
right thing, and for the right reasons?” In other words, the value of a model
derives as much from its fit to the business problem as it does from its
predictive accuracy. For example, a customer attrition model might make highly
accurate predictions, yet make its predictions too late for the business to act
on them effectively. Alternatively, an accurate customer attrition model might
drive effective action to retain customers, but only for the least profitable
subset of customers. A high degree of accuracy does not enhance the value of
these models when they have a poor fit to the business problem.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
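<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt;">The attrition example can be turned into a small worked calculation. The Python sketch below uses six invented customers and two hypothetical models; the profit figures, the 30% save rate and the contact cost are all assumptions chosen only to show how a more accurate model can be worth less than a less accurate one that fits the business problem better.</span></div>

```python
# Invented customers: annual profit and whether they are about to leave.
CUSTOMERS = [
    {"id": 1, "annual_profit": 20.0, "will_churn": True},
    {"id": 2, "annual_profit": 20.0, "will_churn": True},
    {"id": 3, "annual_profit": 20.0, "will_churn": True},
    {"id": 4, "annual_profit": 1000.0, "will_churn": True},
    {"id": 5, "annual_profit": 1000.0, "will_churn": False},
    {"id": 6, "annual_profit": 20.0, "will_churn": False},
]

# Hypothetical model outputs: the ids each model flags as likely to churn.
FLAGGED_BY_MODEL_A = {1, 2, 3}   # finds the low-profit churners only
FLAGGED_BY_MODEL_B = {4, 5}      # less accurate, but finds the valuable churner


def accuracy(customers, flagged):
    """Share of customers whose churn status the model got right."""
    correct = sum((c["id"] in flagged) == c["will_churn"] for c in customers)
    return correct / len(customers)


def campaign_value(customers, flagged, save_rate=0.3, contact_cost=5.0):
    """Expected net value of contacting every flagged customer.

    Assumes a retention offer saves a true churner with probability
    save_rate and that each contact costs contact_cost.
    """
    value = 0.0
    for c in customers:
        if c["id"] in flagged:
            value -= contact_cost
            if c["will_churn"]:
                value += save_rate * c["annual_profit"]
    return value
```

<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt;">With these invented numbers, model A is right about five of the six customers but drives a campaign worth very little, while model B is right about only two of the six yet drives a campaign worth far more, because it targets the customer who matters most.</span></div>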
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">The same is true of
model stability; although an interesting measure for predictive models,
stability cannot be substituted for the ability of a model to provide business
insight, or for its fit to the business problem. Neither can any other
technical measure.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">In summary, the value of
a predictive model is not determined by any technical measure. Data miners
should not focus on predictive accuracy, model stability, or any other
technical metric for predictive models at the expense of business insight and
business fit.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="line-height: 113%; margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; line-height: 113%; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">——————————————————————————————————</span><span style="color: black; font-size: 10pt; line-height: 113%; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><b><span lang="EN-GB" style="color: black; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">9th Law of Data Mining –
“Law of Change”: </span></b><i><span lang="EN-GB" style="color: black; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">All patterns are subject to change</span></i><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">The patterns discovered
by data mining do not last forever. This is well-known in many applications of
data mining, but the universality of this property and the reasons for it are
less widely appreciated.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">In marketing and CRM
applications of data mining, it is well-understood that patterns of customer
behaviour are subject to change over time. Fashions change, markets and
competition change, and the economy changes as a whole; for all these reasons,
predictive models become out-of-date and should be refreshed regularly or when
they cease to predict accurately.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
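<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt;">One simple way to notice that a model has ceased to predict accurately is to compare its recent accuracy with the accuracy measured when it was built. The Python sketch below is only an illustration; the function name, the tolerance of five percentage points and the minimum of 30 cases are arbitrary assumptions, and a real monitoring process would be tuned to the application.</span></div>

```python
def needs_refresh(recent_outcomes, baseline_accuracy,
                  tolerance=0.05, min_cases=30):
    """Decide whether a deployed model should be rebuilt.

    recent_outcomes: list of booleans, True where a recent prediction
        turned out to be correct.
    baseline_accuracy: accuracy measured when the model was built.
    tolerance: how far accuracy may fall before a refresh is triggered
        (an arbitrary illustrative default).
    min_cases: do not judge on fewer recent cases than this.
    """
    if len(recent_outcomes) < min_cases:
        return False  # not enough evidence yet
    recent_accuracy = sum(recent_outcomes) / len(recent_outcomes)
    return recent_accuracy < baseline_accuracy - tolerance
```

<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt;">A check like this covers only the "when they cease to predict accurately" half of the advice; refreshing models on a regular schedule is a separate safeguard.</span></div>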
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">The same is true in risk
and fraud-related applications of data mining. Patterns of fraud change with a
changing environment and because criminals change their behaviour in order to
stay ahead of crime prevention efforts. Fraud detection applications must
therefore be designed to detect new, unknown types of fraud, just as they must deal
with old and familiar ones.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">Some kinds of data
mining might be thought to find patterns which will not change over time – for
example in scientific applications of data mining, do we not discover
unchanging universal laws? Perhaps surprisingly, the answer is that even these
patterns should be expected to change. </span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">The reason is that
patterns are not simply regularities which exist in the world and are reflected
in the data – these regularities may indeed be static in some domains. Rather,
the patterns discovered by data mining are part of a perceptual process, an
active process in which data mining mediates between the world as described by
the data and the understanding of the observer or business expert. Because our
understanding continually develops and grows, so we should expect the patterns
also to change. Tomorrow’s data may look superficially similar, but it will
have been collected by different means, for (perhaps subtly) different
purposes, and have different semantics; the analysis process, because it is
driven by business knowledge, will change as that knowledge changes. For all
these reasons, the patterns will be different.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">To express this briefly,
all patterns are subject to change because they reflect not only a changing
world but also our changing understanding.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span style="font-family: Arial, Helvetica, sans-serif;"><o:p> </o:p><b><span lang="EN-GB" style="color: black; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">Postscript</span></b><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">The 9 Laws of Data
Mining are simple truths about data mining. Most of the 9 laws are already
well-known to data miners, although some are expressed in an unfamiliar way
(for example, the 5th, 6th and 7th laws). Most of the new ideas associated with
the 9 laws are in the explanations, which express an attempt to understand the
reasons behind the well-known form of the data mining process.</span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">Why should we care why
the data mining process takes the form that it does? In addition to the simple
appeal of knowledge and understanding, there is a practical reason to pursue
these questions. </span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">The data mining process
came into being in the form that exists today because of technological
developments – the widespread availability of machine learning algorithms, and
the development of workbenches which integrated these algorithms with other
techniques and made them accessible to users with a business-oriented outlook.
Should we expect technological change to change the data mining process? Eventually
it must, but if we understand the reasons for the form of the process, then we
can distinguish between technology which might change it and technology which
cannot. </span><span style="color: black; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<span lang="EN-GB" style="color: black; font-family: Arial, Helvetica, sans-serif; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">Several technological
developments have been hailed as revolutions in predictive analytics, for
example the advent of automated data preparation and model re-building, and the
integration of business rules with predictive models in deployment frameworks.
The 9 laws of data mining suggest, and their explanations demonstrate, that
these developments will not change the nature of the process. The 9 laws, and
further development of these ideas, should be used to judge any future claims
of revolutionising the data mining process, in addition to their educational
value for data miners.</span><span style="color: black; font-family: "Times New Roman","serif"; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<i><span lang="EN-GB" style="color: black; font-family: "Times New Roman","serif"; font-size: 12pt; mso-ansi-language: EN-GB; mso-fareast-font-family: "Times New Roman";">I would like to thank
Chris Thornton and David Watkins, who supplied the insights which inspired this
work, and also to thank all those who have contributed to the LinkedIn “9 Laws
of Data Mining” discussion group, which has provided invaluable food for
thought.</span></i><span style="color: black; font-family: "Times New Roman","serif"; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt 14.25pt;">
<o:p><span style="font-family: Calibri;"> </span></o:p></div>
Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com28tag:blogger.com,1999:blog-3770043454488854818.post-30910649354366192042013-01-31T01:43:00.000-08:002013-01-31T01:43:42.780-08:00
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt;">
<span style="color: black;"><b><span style="font-family: "Tahoma","sans-serif"; font-size: 10pt; mso-fareast-font-family: "Times New Roman";">Visual Analytics – insights at the
speed of sight</span></b></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0pt;">
<span style="color: black;"><b><span style="font-family: "Tahoma","sans-serif"; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"></span></b><span style="font-family: "Tahoma","sans-serif"; font-size: 10pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></span> </div>
<div style="direction: ltr; language: en-US; line-height: 90%; margin-bottom: 4.9pt; margin-left: 0in; margin-top: 10.08pt; mso-line-break-override: none; punctuation-wrap: hanging; text-align: justify; text-indent: 0in; unicode-bidi: embed; vertical-align: baseline;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEikzEpQjpBJbOPzTWdd3a84I1hfxGx6zafvAYrGZedX4oXoq5NcfjMX4Nd7bwFAgtKzXaR49PgnnctSvnp7fYpp27RT6OU_Rt9_dqRIPncvc_wLV4RinFvuD6JN0aUXu2fM77zcKIzcF7eN/s1600/visual-analytics-mobile-reporting.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEikzEpQjpBJbOPzTWdd3a84I1hfxGx6zafvAYrGZedX4oXoq5NcfjMX4Nd7bwFAgtKzXaR49PgnnctSvnp7fYpp27RT6OU_Rt9_dqRIPncvc_wLV4RinFvuD6JN0aUXu2fM77zcKIzcF7eN/s320/visual-analytics-mobile-reporting.jpg" width="320" /></a><span style="color: black; font-family: Times New Roman; font-size: small;">
</span><span style="font-size: small;"><span style="color: black;"><span style="font-family: Calibri;">While many organizations across industries have
generated tremendous value by using analytical technologies to solve their
business challenges, many others have simply failed to generate value from analytics. There are many reasons for this failure, and most of
them can be put under a single banner: “business was not on-board”. Business
users were not on-board either because they did not trust analytics or because
they had a misconception of what they could get out of it. <o:p></o:p></span></span></span></div>
<div style="direction: ltr; language: en-US; line-height: 90%; margin-bottom: 4.9pt; margin-left: 0in; margin-top: 10.08pt; mso-line-break-override: none; punctuation-wrap: hanging; text-align: justify; text-indent: 0in; unicode-bidi: embed; vertical-align: baseline;">
<span style="color: black; font-family: Times New Roman; font-size: small;">
</span><span style="font-size: small;"><span style="color: black;"><span style="font-family: Calibri;"> With new
analytical technologies aimed purely at business users, there is hope that
analytical maturity will accelerate and the gap between business and technical users will
narrow, which should lead to deeper analytical exploration.<o:p></o:p></span></span></span></div>
<div style="direction: ltr; language: en-US; line-height: 90%; margin-bottom: 4.9pt; margin-left: 0in; margin-top: 10.08pt; mso-line-break-override: none; punctuation-wrap: hanging; text-align: justify; text-indent: 0in; unicode-bidi: embed; vertical-align: baseline;">
<span style="color: black; font-family: Times New Roman; font-size: small;">
</span><span style="mso-ascii-font-family: Calibri; mso-ascii-theme-font: minor-latin; mso-bidi-font-family: "Times New Roman"; mso-bidi-theme-font: minor-bidi; mso-hansi-font-family: Calibri; mso-hansi-theme-font: minor-latin;"><span style="font-size: small;"><span style="color: black;"><span style="font-family: Calibri;">One of
these new technologies is <a href="http://www.sas.com/technologies/bi/visual-analytics.html">Visual Analytics</a> by SAS Institute. Many companies
have accumulated vast amounts of data and simply do not have the means to run
analytical processing on the full data set at one time. That leads to an overburdened
IT department and to frustration among decision makers who are not getting answers
to their business questions quickly enough. SAS Visual Analytics combines
an easy-to-use interface with robust in-memory technology to enable users to
explore all of the data and to extract new insights
through on-the-fly reporting capabilities before delivering results via Web
reports and mobile devices. </span></span></span></span></div>
<div style="direction: ltr; language: en-US; line-height: 90%; margin-bottom: 4.9pt; margin-left: 0in; margin-top: 10.08pt; mso-line-break-override: none; punctuation-wrap: hanging; text-align: justify; text-indent: 0in; unicode-bidi: embed; vertical-align: baseline;">
<span style="color: black; font-family: Times New Roman; font-size: small;">
</span><span style="mso-ascii-font-family: Calibri; mso-ascii-theme-font: minor-latin; mso-bidi-font-family: "Times New Roman"; mso-bidi-theme-font: minor-bidi; mso-hansi-font-family: Calibri; mso-hansi-theme-font: minor-latin;"><span style="font-size: small;"><span style="color: black;"><span style="font-family: Calibri;">What I like
about SAS Visual Analytics is how easy it is to explore data and to
generate dynamically linked reports. The new “auto-charting” capability automatically
chooses the best report for the type of data items selected. Everything
is drag-and-drop, so if users know what they want, execution is lightning fast
even on billions of records, thanks to the ability to take
advantage of massively parallel environments. And once you have the desired output,
with a few clicks you can surface these insights or reports on your iPad or
Android tablet. In my view, this will really bring analytics to new
audiences, and hopefully it will open the door to more and more business-driven
analytical initiatives.<o:p></o:p></span></span></span></span></div>
<div style="direction: ltr; language: en-US; line-height: 90%; margin-bottom: 4.9pt; margin-left: 0in; margin-top: 10.08pt; mso-line-break-override: none; punctuation-wrap: hanging; text-align: left; text-indent: 0in; unicode-bidi: embed; vertical-align: baseline;">
<span style="color: black; font-family: Times New Roman; font-size: small;">
</span><span style="mso-ascii-font-family: Calibri; mso-ascii-theme-font: minor-latin; mso-bidi-font-family: "Times New Roman"; mso-bidi-theme-font: minor-bidi; mso-hansi-font-family: Calibri; mso-hansi-theme-font: minor-latin;"><span style="font-size: small;"><span style="color: black;"><span style="font-family: Calibri;">Goran
Dragosavac<o:p></o:p></span></span></span></span></div>
<div style="direction: ltr; language: en-US; line-height: 90%; margin-bottom: 4.9pt; margin-left: 0in; margin-top: 10.08pt; mso-line-break-override: none; punctuation-wrap: hanging; text-align: left; text-indent: 0in; unicode-bidi: embed; vertical-align: baseline;">
<span style="color: black; font-family: Times New Roman; font-size: small;">
</span></div>
Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com27tag:blogger.com,1999:blog-3770043454488854818.post-78418305766712377292012-09-26T10:13:00.004-07:002012-09-26T10:27:29.325-07:00<span lang="EN-GB"><strong><span style="font-size: large;"><span style="font-family: Arial, Helvetica, sans-serif;">Business Analytics Solution for Airline Industry<o:p></o:p></span></span></strong></span><br />
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<a href="http://listverse.files.wordpress.com/2007/10/a380-01.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="206" src="http://listverse.files.wordpress.com/2007/10/a380-01.jpg" width="320" /></a><span lang="EN-GB" style="font-size: 12pt;"><o:p> </o:p></span></div>
<div style="text-align: justify;">
<span lang="EN-GB"><span style="font-family: Arial, Helvetica, sans-serif;">If the airline industry could be described
in two words, it would be "intensely competitive". The airline
industry generates billions of dollars every year and still has a cumulative
profit margin of less than 1%.<span style="mso-spacerun: yes;"> </span>The
reason for this lies in this industry’s vast complexity. Airlines have a multitude
of different business issues that need to be solved at once, such as a globally
uneven playing field, revenue vulnerability, an extremely variable planning
horizon, high cyclicality and seasonality, fierce competition, excessive
government intervention, and high fixed and low marginal costs. To ensure the
best chance for full economic recovery, airlines should fully leverage their most
prolific asset - data.<span style="mso-spacerun: yes;"> </span>Data, used in
conjunction with innovative technologies that allow the creation of a
Business Analytics Solution, will provide the capabilities for a comprehensive
intelligent management and decision-making system throughout the enterprise.<span style="mso-spacerun: yes;"> </span>The ultimate benefits of implementing and
using an enterprise-wide intelligence platform, together with airline business
acumen and experience, would include timely responses to current and future
market demands, better planning and strategically aligned decision making, and clear
understanding and monitoring of all key performance drivers relevant to the
airline industry.<o:p></o:p></span></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt;">
<span lang="EN-GB"><span style="font-family: Arial, Helvetica, sans-serif;">Achieving these benefits in a timely and
intelligent manner will ultimately result in lower operating costs, better
customer service, market-leading competitiveness, and increased profit margins
and shareholder value.<span style="mso-spacerun: yes;"> </span></span></span></div>
<div class="style5" style="margin: 1em 0in; text-align: justify;">
<span style="font-family: Arial, Helvetica, sans-serif;">Airlines throughout the world are
currently facing an unprecedented financial crisis. Factors contributing to
this crisis are low customer satisfaction, overtraded markets, insufficient and
under-utilized aircraft capacity, poor labor relations, excessive
government intervention, high labor costs, ever-increasing oil prices resulting
in spiraling fuel costs, and generally high operational costs. The low profit-to-turnover
ratio of airlines has been further exacerbated by growing low-fare
competition, increasing security costs, and frequent shifts in air
travel consumer behavior. The historical business model of many network
airlines now appears to be unable to support sustained profitability under any
but the most favorable economic conditions. The industry is at a turning point.<span style="mso-spacerun: yes;"> </span>The market dictates an “adapt or die” policy,
and the airlines that wish to survive will face the challenge of having to
make significant changes to their current archaic business model. Doing this
requires far greater use of innovative technologies that allow
airlines to build an end-to-end Business Analytics Solution. The core
capabilities of these technologies will ensure the flow of consistent,
repeatable and reliable enterprise-wide intelligence needed to tackle all the
challenges the industry is facing. <o:p></o:p></span></div>
<br />
<div style="margin: 12pt 0in 12pt 8.5pt; mso-list: l0 level1 lfo1;">
<span style="font-family: Arial, Helvetica, sans-serif;"><strong>Purpose of Business Analytics Solution for Airlines</strong></span> </div>
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="mso-ansi-language: EN-US;"><span style="font-family: Arial, Helvetica, sans-serif;">The purpose of a Business Analytics Solution for airlines is to bridge what is called the information-to-intelligence
gap:<span style="mso-spacerun: yes;"> </span>the disparity between what an airline
has, which is prolific amounts of data from disparate source systems, and
what an airline wants, which is to achieve strategy alignment for a
competitive edge, whether through compliance, increased profitability,
decreased risk, or better management of performance, planning, etc.</span><span style="mso-bidi-font-weight: bold;"><o:p></o:p></span></span></div>
<span style="mso-ansi-language: EN-US; mso-bidi-font-weight: bold;"><span style="mso-spacerun: yes;">
</span></span><b style="mso-bidi-font-weight: normal;"><span style="mso-ansi-language: EN-US;"><o:p> </o:p></span></b><br />
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; tab-stops: list .5in;">
<span style="mso-ansi-language: EN-US;"><strong><span style="mso-spacerun: yes;"> </span><span style="font-family: Arial, Helvetica, sans-serif;"><span style="mso-spacerun: yes;"> </span>Addressing
the Business issues</span></strong> </span><br />
<br />
<span style="mso-ansi-language: EN-US;"><span style="font-family: Arial, Helvetica, sans-serif;">Some of the challenges
that can be successfully addressed by a Business Analytics Solution are: <o:p></o:p></span></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<ul style="margin-top: 0in;" type="disc"><span style="font-family: Arial, Helvetica, sans-serif;">
</span>
<li class="MsoNormal" style="margin: 0in 0in 0pt; mso-list: l1 level1 lfo4; tab-stops: list .5in;"><span style="mso-ansi-language: EN-US;"><span style="font-family: Arial, Helvetica, sans-serif;">The need for accurate daily and weekly
performance measurement reports (e.g. “flash/estimated” revenue, operating
costs and net contribution reports for every aircraft’s actual flight per
sector/route).<o:p></o:p></span></span></li>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span>
<li class="MsoNormal" style="margin: 0in 0in 0pt; mso-list: l1 level1 lfo4; tab-stops: list .5in;"><span style="mso-ansi-language: EN-US;"><span style="font-family: Arial, Helvetica, sans-serif;">The need to better manage all aspects of
risk. <o:p></o:p></span></span></li>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span>
<li class="MsoNormal" style="margin: 0in 0in 0pt; mso-list: l1 level1 lfo4; tab-stops: list .5in;"><span style="mso-ansi-language: EN-US;"><span style="font-family: Arial, Helvetica, sans-serif;">The need for better impact analysis and
more effective optimization of all resources, as well as the ability to
produce accurate passenger-revenue forecasts.<o:p></o:p></span></span></li>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span>
<li class="MsoNormal" style="margin: 0in 0in 0pt; mso-list: l1 level1 lfo4; tab-stops: list .5in;"><span style="mso-ansi-language: EN-US;"><span style="font-family: Arial, Helvetica, sans-serif;">The need for a holistic, 360-degree view
of the airline industry’s customers, suppliers, service providers and
distributors.<o:p></o:p></span></span></li>
<span style="font-family: Arial, Helvetica, sans-serif;">
</span>
<li class="MsoNormal" style="margin: 0in 0in 0pt; mso-list: l1 level1 lfo4; tab-stops: list .5in;"><span style="mso-ansi-language: EN-US;"><span style="font-family: Arial, Helvetica, sans-serif;">The need for expense verification models
in order to better control all industry cost aspects.<o:p></o:p></span></span></li>
</ul>
<span style="mso-ansi-language: EN-US;"><o:p> </o:p></span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><span style="mso-ansi-language: EN-US;"><strong>Issues related to Performance Management</strong></span><span style="mso-bookmark: OLE_LINK5;"><span style="mso-bookmark: OLE_LINK6;"><span style="mso-ansi-language: EN-US;"><o:p> </o:p></span></span></span></span><br />
<span style="mso-bookmark: OLE_LINK6;"></span><span style="mso-bookmark: OLE_LINK5;"></span>
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="mso-ansi-language: EN-US;"><span style="font-family: Arial, Helvetica, sans-serif;">Airlines usually
operate in a globally competitive environment and therefore require prompt and
accurate enterprise performance measurements. Furthermore, airlines are volume
driven and small variations (passengers flown, fuel spent/bought, load carried)
can multiply into major effects – therefore appropriate and timely action is
critical. Airlines have substantial difficulty producing reliable daily or weekly
performance measurements. Current airline “legacy” IT systems, such as
Revenue Accounting, require several weeks after a month end to generate revenue
results for every flight per sector/route. <span style="mso-spacerun: yes;"> </span>A Business Analytics Solution for Airlines can automate the
production of daily activity reports, such as number of passengers flown per
flight/sector, distance flown, etc., which can be used to provide estimated
performance measurements such as daily or weekly revenues for specific routes
or sectors. <o:p></o:p></span></span></div>
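<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Arial, Helvetica, sans-serif;">As a rough illustration of such a “flash” estimate, the Python sketch below multiplies passengers flown per route, as they might appear in a daily activity report, by an assumed average fare. The route codes, flight numbers, fares and the function name <code>flash_revenue</code> are all hypothetical; a real estimate would draw on the airline's own activity and fare data.</span></div>

```python
from collections import defaultdict

# Hypothetical daily activity report rows: (route, flight number, passengers flown).
daily_activity = [
    ("AAA-BBB", "XX101", 140),
    ("AAA-BBB", "XX103", 125),
    ("AAA-CCC", "XX201", 98),
]

# Assumed average fare per route, e.g. from the last closed accounting month.
average_fare = {"AAA-BBB": 95.0, "AAA-CCC": 70.0}


def flash_revenue(activity, fares):
    """Estimate revenue per route as passengers flown times average fare."""
    totals = defaultdict(float)
    for route, _flight, passengers in activity:
        totals[route] += passengers * fares[route]
    return dict(totals)


estimates = flash_revenue(daily_activity, average_fare)
for route, revenue in sorted(estimates.items()):
    print(f"{route}: estimated revenue {revenue:,.2f}")
```

<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Arial, Helvetica, sans-serif;">The point of the sketch is timing rather than precision: the estimate is available as soon as the activity data is, and can be reconciled later against the revenue accounting results.</span></div>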
<span style="mso-ansi-language: EN-US;"><o:p></o:p></span><br />
<div style="margin: 10pt 0in; mso-list: none; text-indent: 0in;">
<span style="font-family: Arial, Helvetica, sans-serif;"><strong>Issues related to Risk Management</strong></span></div>
<div style="text-align: justify;">
<span style="font-weight: normal; mso-ansi-language: EN-US; mso-bidi-font-weight: bold;"><span style="font-family: Arial, Helvetica, sans-serif;">The
global airline industry has been subjected to major catastrophes over the past
years.<span style="mso-spacerun: yes;"> </span>It is accordingly imperative for
airlines to develop various risk management models and strategies to protect
themselves from the negative impact of these types of events. Furthermore, due to
the global playing field, airlines often earn their revenues and pay their costs in
different baskets of currencies (e.g. USD, EUR, GBP, etc.). As a result there is
frequently a mismatch between the flow of revenue receipts and expenses in each
basket of currency, creating risk exposure. Using the Business Analytics Solution for Airlines
infrastructure, relevant data can be gathered, consolidated and cleaned, risk
can be modeled, and risk exposure can be measured and presented on an “as and
when” basis, as requested by the business user.<span style="mso-spacerun: yes;">
</span><o:p></o:p></span></span></div>
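<div style="text-align: justify;">
<span style="font-family: Arial, Helvetica, sans-serif;">The currency mismatch described above can be sketched in a few lines of Python: sum receipts and expenses per currency and report the net position. The amounts and the helper names <code>net_exposure</code> and <code>describe</code> are invented for illustration only, and a real risk model would of course go well beyond a simple net total.</span></div>

```python
from collections import defaultdict

# Hypothetical cash flows per currency: receipts are positive, expenses negative.
cash_flows = [
    ("USD", 1_200_000), ("USD", -1_500_000),
    ("EUR", 800_000), ("EUR", -300_000),
    ("GBP", 250_000), ("GBP", -250_000),
]


def net_exposure(flows):
    """Sum receipts and expenses per currency; a non-zero total is a mismatch."""
    totals = defaultdict(int)
    for currency, amount in flows:
        totals[currency] += amount
    return dict(totals)


def describe(amount):
    """Label the net position in one currency."""
    if amount > 0:
        return "net receipts (long)"
    if amount == 0:
        return "matched"
    return "net expenses (short)"


exposure = net_exposure(cash_flows)
for currency, amount in sorted(exposure.items()):
    print(f"{currency}: {amount:+,} {describe(amount)}")
```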
<br />
<div style="margin: 10pt 0in; mso-list: none; text-indent: 0in;">
<span style="font-family: Arial, Helvetica, sans-serif;"><strong>Issues related to Control and
Verification<o:p></o:p></strong></span></div>
<div style="text-align: justify;">
<span style="font-weight: normal; mso-ansi-language: EN-US; mso-bidi-font-weight: bold;"><span style="font-family: Arial, Helvetica, sans-serif;">Airline
carriers require a number of control and verification models to be able to
control costs arising from their various operational activities. To enable this,
airlines have a pressing need for a complete and integrated repository of
flight information data gathered from all their disparate business units. This
will enable computation of various efficiency analytics, e.g. planned fuel
usage compared with actual fuel usage per aircraft, or crew utilization (roster
optimization). These issues could also be fully addressed by the Business Analytics Solution for Airlines, which will access, consolidate and analyze
relevant flight and aircraft data. In turn this would help to create a 360°
view of each flight and aircraft, allowing the business users to dramatically
improve their control and verification systems.<span style="mso-spacerun: yes;">
</span></span></span></div>
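<div style="text-align: justify;">
<span style="font-family: Arial, Helvetica, sans-serif;">One of the efficiency analytics mentioned above, planned versus actual fuel usage per aircraft, might look like this as a minimal Python sketch. The flight records, field names and the 5% tolerance are assumptions made up for the example, not figures from any airline.</span></div>

```python
# Hypothetical flight records; the field names are assumptions for illustration.
flights = [
    {"aircraft": "A1", "planned_kg": 5000, "actual_kg": 5300},
    {"aircraft": "A1", "planned_kg": 5200, "actual_kg": 5150},
    {"aircraft": "B2", "planned_kg": 8000, "actual_kg": 8800},
]

THRESHOLD_PCT = 5.0  # assumed tolerance before a variance is flagged


def fuel_variance_pct(records):
    """Return actual-versus-planned fuel variance per aircraft, in percent."""
    planned, actual = {}, {}
    for rec in records:
        key = rec["aircraft"]
        planned[key] = planned.get(key, 0) + rec["planned_kg"]
        actual[key] = actual.get(key, 0) + rec["actual_kg"]
    return {key: (actual[key] - planned[key]) / planned[key] * 100 for key in planned}


variance = fuel_variance_pct(flights)
# Aircraft whose overall fuel burn exceeded plan by more than the tolerance.
flagged = sorted(key for key, pct in variance.items() if pct > THRESHOLD_PCT)

for key, pct in sorted(variance.items()):
    print(f"{key}: {pct:+.2f}% against plan")
print("Flagged for review:", flagged)
```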
<div style="text-align: justify;">
<span style="font-family: Arial, Helvetica, sans-serif;"> </span></div>
<div style="margin: 10pt 0in; mso-list: none; text-align: justify; text-indent: 0in;">
<span style="font-family: Arial, Helvetica, sans-serif;">
<o:p></o:p><strong> Issues
related to forecasting<o:p></o:p></strong></span></div>
<div style="text-align: justify;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span style="mso-bookmark: OLE_LINK7;"><span style="mso-bookmark: OLE_LINK8;"><span style="font-weight: normal; mso-ansi-language: EN-US; mso-bidi-font-weight: bold;">Airlines
require the development of an effective and holistic forecasting model to
regularly assess the impact of options and alternatives such as increasing
aircraft seats available, adjusting fares, introducing new routes etc. Forecasts
should also take account of actual statistical trends and results e.g. actual
passengers carried and actual average fares earned. Such </span></span></span><span style="font-weight: normal; mso-ansi-language: EN-US; mso-bidi-font-weight: bold;">forecasts
should then be compared against budgets and prior-year performance. The Business Analytics Solution for Airlines has a powerful, market-leading forecasting engine
capable of generating a large number of forecasts automatically and making them
available to the people who would use them for sound decision making.<span style="mso-spacerun: yes;"> </span><o:p></o:p></span></span></div>
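<div style="text-align: justify;">
<span style="font-family: Arial, Helvetica, sans-serif;">The comparison of forecasts against budgets and prior-year performance can be sketched as below. This is only the comparison step, not a forecasting engine: the forecast figures are taken as given, and the routes, numbers and the helper <code>pct_diff</code> are hypothetical.</span></div>

```python
def pct_diff(value, baseline):
    """Percentage difference of value relative to baseline."""
    return (value - baseline) / baseline * 100


# Hypothetical passenger-revenue figures per route (same currency and period).
route_figures = {
    "AAA-BBB": {"forecast": 10_500, "budget": 10_000, "prior_year": 9_600},
    "AAA-CCC": {"forecast": 4_500, "budget": 5_000, "prior_year": 4_500},
}

comparison = {
    route: {
        "vs_budget_pct": pct_diff(fig["forecast"], fig["budget"]),
        "vs_prior_year_pct": pct_diff(fig["forecast"], fig["prior_year"]),
    }
    for route, fig in route_figures.items()
}

for route, result in comparison.items():
    print(
        f"{route}: {result['vs_budget_pct']:+.1f}% vs budget, "
        f"{result['vs_prior_year_pct']:+.1f}% vs prior year"
    )
```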
<span style="mso-ansi-language: EN-US;"><span style="font-family: Arial, Helvetica, sans-serif; mso-spacerun: yes;"> </span></span><br />
<span style="font-family: Arial, Helvetica, sans-serif;">
<strong>Issues related to a lack of a holistic view of
core business components.</strong></span> <br />
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Arial, Helvetica, sans-serif;"><span style="mso-ansi-language: EN-US;">Airlines would greatly
benefit from knowing and understanding their business environment along some of
the key business issues, such as performance, behavior, risk, profitability,
etc. Using customers as an example - </span><span style="mso-ansi-language: EN-US; mso-bidi-font-family: TimesNewRomanPSMT; mso-bidi-font-size: 10.0pt;">the main objective would be to enrich the knowledge
about individual customers leading to new strategic customer segments. </span><span style="mso-ansi-language: EN-US;">This intelligence would allow airlines to reap
a host of benefits, such as successful, targeted customer promotions and
cross-selling and up-selling campaigns for different flights and booking
classes, leading to improved yield and revenue. For example, it would let airlines
know when to limit discounts on flight routes that are usually overbooked,
allowing the large number of passengers to compete for high-profit seats immediately
prior to departure. Such multidimensional views of the business can help the
airline to better serve its customers through more effective, efficient and
personalized service, receiving in return customer loyalty, support and market
share, all leading to higher profitability. <o:p></o:p></span></span></div>
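<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Arial, Helvetica, sans-serif;">The discount example can be made concrete with a small Python sketch that flags routes where past departures were usually overbooked. The booking history and the 50% cut-off for “usually” are assumptions chosen for illustration; a real yield management system would use far richer demand data.</span></div>

```python
# Hypothetical booking history per route: (bookings, seats) for past departures.
booking_history = {
    "AAA-BBB": [(190, 180), (185, 180), (178, 180)],
    "AAA-CCC": [(120, 180), (140, 180), (150, 180)],
}

OVERBOOKED_SHARE = 0.5  # assumed cut-off for "usually overbooked"


def overbooked_share(departures):
    """Share of past departures where bookings exceeded available seats."""
    over = sum(1 for bookings, seats in departures if bookings > seats)
    return over / len(departures)


# Routes on which discounts are candidates for being limited.
limit_discounts = sorted(
    route
    for route, departures in booking_history.items()
    if overbooked_share(departures) >= OVERBOOKED_SHARE
)
print(limit_discounts)
```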
<span style="mso-ansi-language: EN-US;"><span style="font-family: Arial, Helvetica, sans-serif;"><span style="mso-spacerun: yes;"></span><o:p></o:p></span></span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><strong> </strong><span lang="EN-GB"><em><strong>Conclusion</strong> <o:p></o:p></em></span></span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span lang="EN-GB"><span style="font-family: Arial, Helvetica, sans-serif;">The Business Analytics Solution for Airlines is designed on Usable,
Interoperable, Scalable, and Manageable technology, and encompasses all aspects
of turning information into strategically aligned, powerful and accurate
intelligence and empowering the business user into intelligent action by
ensuring the delivery of the right intelligence to the right business user in
the right format in a timely manner.<span style="mso-spacerun: yes;"> </span>The solution is built on the core technological components of Data Integration, Data
Management, Data Analysis and Information Deployment, all of these components
being fed by centrally shared, enterprise-wide metadata.<span style="mso-spacerun: yes;"> </span>Built into these core technology components
are airline-specific data models, statistical and analytical models,
pre-written reports and all the training and methodologies necessary for a
successful and sustainable implementation at an airline.<span style="mso-spacerun: yes;"> </span>Together, these items give the solution the capability and capacity to address the host of
burning issues prevalent in the airline industry.</span></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span lang="EN-GB"></span><span style="font-family: Arial, Helvetica, sans-serif;"> </span></div>
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span lang="EN-GB"><strong><span style="font-family: Arial, Helvetica, sans-serif;">Goran Dragosavac<o:p></o:p></span></strong></span></div>
Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com4tag:blogger.com,1999:blog-3770043454488854818.post-27012629535858287142012-09-26T09:19:00.000-07:002012-09-26T09:28:03.179-07:00<h1 class="entry_title">
Police using ‘predictive analytics’ to prevent crimes before they happen</h1>
<!-- Author / Post Publish Date --><br />
<div id="author_container">
By Agence France-Presse</div>
<div class="clear" style="height: 10px;">
</div>
<div style="text-align: justify;">
<a href="http://www.rawstory.com/rs/wp-content/uploads/2012/07/Handcuffs-and-CDs-via-Shutterstock.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img alt="Handcuffs and CDs via Shutterstock" border="0" class="attachment-in_article wp-post-image" height="178" src="http://www.rawstory.com/rs/wp-content/uploads/2012/07/Handcuffs-and-CDs-via-Shutterstock.jpg" title="Handcuffs and CDs via Shutterstock" width="320" /></a>Crime fighters have long used brains and brawn, but now a new kind of technology known as “predictive policing” promises to make them more efficient. A growing number of law enforcement agencies, in the US and elsewhere, have been adopting software tools with predictive analytics, based on algorithms that aim to predict crimes before they happen. The concept sounds like something out of science fiction and the thriller “Minority Report” based on a Philip K. Dick story.</div>
<div style="text-align: justify;">
</div>
<div style="text-align: justify;">
Without some of the sci-fi gimmickry, police departments from Santa Cruz, California, to Memphis, Tennessee, and law enforcement agencies from Poland to Britain have adopted these new techniques.</div>
<div style="text-align: justify;">
The premise is simple: criminals follow patterns, and with software — the same kind that retailers like Wal-Mart and Amazon use to determine consumer purchasing trends — police can determine where the next crime will occur and sometimes prevent it.</div>
<div style="text-align: justify;">
</div>
<div style="text-align: justify;">
Colleen McCue, a behavioral scientist at GeoEye, a firm that works with US Homeland Security and local law enforcement on predictive analytics, said studying criminal behavior was not that different from examining other types of behavior like shopping. “People are creatures of habit,” she said. “When you go shopping you go to a place where they have the things you’re looking for… the criminal wants to go where he will be successful also.” She said the technology could help in cities where tight budgets were forcing patrol reductions.</div>
<br />
<div style="text-align: justify;">
<div id="in_article_slot_2" style="text-align: justify;">
“When police departments are laying off more sworn personnel, they can do more with less,” she said.</div>
<div style="text-align: justify;">
The key to success in predictive policing is getting as much data as possible to determine patterns. This can be especially useful in property crimes like auto theft and burglary, where patterns can be detected. “You can build a model that factors in attributes like the time of year, whether it is hot and humid or cold and snowy, if it is a payday when people are carrying a lot of cash,” says Mark Cleverly, who heads the analytical unit for predictive crime analytics. “It’s not saying a crime will occur at a particular time and place, no one can do that. But it can say you can expect a wave of vehicle thefts based on everything we know.” </div>
<div style="text-align: justify;">
</div>
<div style="text-align: justify;">
In Memphis, officials said serious crimes fell 30 percent and violent crimes declined 15 percent since implementing predictive analytics in a program with IBM and the University of Memphis in 2006.</div>
<div style="text-align: justify;">
The program known as CRUSH — Criminal Reduction Utilizing Statistical History — targeted certain “hot spots” to allow police to deploy more efficiently. John Williams, crime analysis manager for the city’s police, said the system has had a dramatic impact, allowing Memphis to get off the list of worst US cities for crime. “If the data is indicating a hot spot, we are able to immediately deploy resources there. And in a lot of instances we are able to make quality arrests because we’re in the right area at the right time,” he told AFP.</div>
<div style="text-align: justify;">
</div>
<div style="text-align: justify;">
Although beat officers can use their instincts for similar results, Williams said the software could be far more precise, such as predicting burglaries in a small geographic area between 10 pm and 2 am.</div>
<div style="text-align: justify;">
In one case, the software was able to help police break up a group that was committing armed robberies on the city’s Hispanic population. “There were 84 robberies, but we had no idea it was so organized,” Williams said. By crunching the numbers, police were able to pinpoint the zone and time of likely holdups: “We caught a group of robbers in progress, we had leads on additional robberies,” he said.</div>
<div style="text-align: justify;">
</div>
<div style="text-align: justify;">
Williams said police officials from as far away as Hong Kong, Rio de Janeiro and Estonia have come to review the experience in Memphis. In Los Angeles, another program developed by scientists at the University of California-Los Angeles and Santa Clara University was tested in a single precinct, and resulted in a 12 percent drop in crime while the rest of the city saw a 0.2 percent increase. That test and others led to the creation of a company called PredPol. And Los Angeles will expand its use of the program under contract with PredPol, said CEO Caleb Baskin.</div>
<div style="text-align: justify;">
</div>
<div style="text-align: justify;">
Baskin said the system is based on a model from mathematician George Mohler which “is very effective in predicting the time and location for crimes that have not yet taken place.” PredPol had begun working with other cities in California and “we’ve had inquiries from a lot of places in the US and international locations,” Baskin said. “The science that underlies the tool will work anywhere. The question is does the agency maintain a database that we can plug into.” While use of such analytics generally wins plaudits for helping “smarter” policing, it does raise concerns about Big Brother-like snooping.</div>
<div style="text-align: justify;">
</div>
<div style="text-align: justify;">
Andrew Guthrie Ferguson, a law professor at the University of the District of Columbia, said the use of technology could be positive but that it could lower the threshold for constitutional protections on “unreasonable” searches. “To stop you and frisk you and search you, a police officer needs reasonable suspicion, so my question is how will this affect reasonable suspicion?” he said. If the search is based on a computer algorithm, Ferguson said, and the case comes to court, “How do you cross-examine a computer?” IBM’s Cleverly said the technology can in many cases improve privacy.</div>
<br />
<div style="text-align: justify;">
“You can pinpoint the record of who has access to information, you have a solid history of what’s going on, so if someone is using the system for ill you have an audit trail,” he said. As for “The Minority Report” and its predictive software, Cleverly said, “It was a great film and great short story, but it’s science fiction and will remain science fiction. That’s not what this is about.”</div>
</div>
Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com0tag:blogger.com,1999:blog-3770043454488854818.post-44781838968677523362012-09-26T08:32:00.000-07:002012-09-26T08:32:12.615-07:00
<br />
<h3 class="MsoNormal" style="margin: 0in 0in 0pt;">
<span style="font-family: Arial, Helvetica, sans-serif;">What to do with False Positives?<o:p></o:p></span></h3>
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbCwljaDvXj152OtwpeyZMwnlOM1MMJujKEyXbi6KXwIlIE-dpTbrVSLNyG8kOl0k2H-5a1hkDadlPWKVP10uMNSHvCB_nKKaViX6t_nAgOYYXTN_sMdEdlExlNEl7m7d3_Szhu0zODcmT/s1600/false+positives.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="212" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbCwljaDvXj152OtwpeyZMwnlOM1MMJujKEyXbi6KXwIlIE-dpTbrVSLNyG8kOl0k2H-5a1hkDadlPWKVP10uMNSHvCB_nKKaViX6t_nAgOYYXTN_sMdEdlExlNEl7m7d3_Szhu0zODcmT/s320/false+positives.jpg" width="320" /></a><o:p><span style="font-family: Calibri;"> </span></o:p><span style="font-family: Calibri;">I often hear complaints from business folk that their models need
improvement because there are too many false positives. Just for clarity: in the
case of fraudulent transactions, false positives are those transactions
that were flagged as fraudulent when in reality they weren’t.<span style="mso-spacerun: yes;"> </span>Sure, one always needs to minimize the
occurrence of false positives as much as possible, but it is not always the model’s
fault. Sometimes what looks<span style="mso-spacerun: yes;"> </span>like clear-cut
fraud just isn’t. It is a fuzzy area where the difference between the
patterns of your event and non-event is completely blurred. Kind of – it could
go either way!<o:p></o:p></span></div>
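<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;">To make the definition concrete, here is a small Python sketch that counts false positives on a made-up set of fraud labels and derives precision and the false positive rate from them. The labels and the function name <code>confusion_counts</code> are invented for illustration.</span></div>

```python
# Made-up labels for eight transactions: 1 = fraud, 0 = legitimate.
actual = [1, 0, 0, 1, 0, 0, 1, 0]
predicted = [1, 1, 0, 1, 0, 1, 0, 0]


def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            tp += 1  # flagged as fraud, and it was fraud
        elif p == 1 and a == 0:
            fp += 1  # flagged as fraud, but it was legitimate
        elif p == 0 and a == 0:
            tn += 1  # not flagged, and legitimate
        else:
            fn += 1  # not flagged, but it was fraud
    return tp, fp, tn, fn


tp, fp, tn, fn = confusion_counts(actual, predicted)
precision = tp / (tp + fp)            # share of flagged transactions that were fraud
false_positive_rate = fp / (fp + tn)  # share of legitimate transactions that were flagged

print(f"false positives: {fp}")
print(f"precision: {precision:.2f}, false positive rate: {false_positive_rate:.2f}")
```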
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;">Some implementations of analytics have been
built on false positives. These are the people who look like buyers of a
particular brand, and yet they are not. The logical assumption is that
if a marketing stimulus is sent to these people, they are more likely to
become buyers of that brand than randomly selected folk, because of their
high degree of look-alikeness. I have completed several successful projects geared
solely toward acting on these so-called modeling mistakes. <o:p></o:p></span></div>
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<o:p><span style="font-family: Calibri;"> </span></o:p><span style="font-family: Calibri;">Another example is building a model capable of predicting
who will become dormant customers within a period of time.<span style="mso-spacerun: yes;"> </span>After building the model, we use it to score an
existing base comprising known (historical) dormant<span style="mso-spacerun: yes;"> </span>customers as well as those who are not.
Then we focus on the false positives and compare them to the ones that are correctly
predicted.<span style="mso-spacerun: yes;"> </span>Often the difference
between the two groups in terms of their usage patterns is so small that we may as
well call them all dormant customers. Even though the false positives are
technically not dormant yet,<span style="mso-spacerun: yes;"> </span>for all
intents and purposes they really are. So we go back to the business definition
of what constitutes a dormant customer and look at the whole phenomenon from a
fresh angle, thanks to comparative
studies between accurate predictions and false positives.<o:p></o:p></span></div>
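<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;">A comparison of this kind can be sketched in Python as follows: split the scored base into correctly predicted dormant customers, false positives and correctly predicted active customers, then compare average usage. The customer records and the <code>monthly_txns</code> field are invented for the example; in practice one would compare many usage attributes, not one.</span></div>

```python
from statistics import mean

# Invented scored base: "dormant" is the historical outcome, "predicted" the model's call.
customers = [
    {"id": 1, "dormant": True, "predicted": True, "monthly_txns": 1},
    {"id": 2, "dormant": True, "predicted": True, "monthly_txns": 2},
    {"id": 3, "dormant": False, "predicted": True, "monthly_txns": 2},
    {"id": 4, "dormant": False, "predicted": True, "monthly_txns": 3},
    {"id": 5, "dormant": False, "predicted": False, "monthly_txns": 20},
    {"id": 6, "dormant": False, "predicted": False, "monthly_txns": 24},
]


def usage(rows, dormant, predicted):
    """Monthly transaction counts for one cell of the confusion matrix."""
    return [
        row["monthly_txns"]
        for row in rows
        if row["dormant"] == dormant and row["predicted"] == predicted
    ]


true_pos_usage = mean(usage(customers, dormant=True, predicted=True))
false_pos_usage = mean(usage(customers, dormant=False, predicted=True))
true_neg_usage = mean(usage(customers, dormant=False, predicted=False))

print(f"correctly predicted dormant: {true_pos_usage} txns/month")
print(f"false positives:             {false_pos_usage} txns/month")
print(f"correctly predicted active:  {true_neg_usage} txns/month")
```

<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;">In this toy data the false positives sit right next to the truly dormant group and far from the active one, which is the situation described above.</span></div>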
<div style="text-align: justify;">
</div>
<div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;">
<span style="font-family: Calibri;">So what I am trying to say in this article is that what
appears to be a modeling “mistake” can be turned into value from more than
one angle. There is always a reason why models make mistakes – and
tiredness is never one of them.<span style="mso-spacerun: yes;"> </span><o:p></o:p></span></div>
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt;">
<o:p><span style="font-family: Calibri;"> </span></o:p><span style="font-family: Calibri;">Goran Dragosavac<o:p></o:p></span></div>
Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com2tag:blogger.com,1999:blog-3770043454488854818.post-61018864181419245962011-11-02T01:52:00.000-07:002011-11-02T01:57:47.357-07:00Analytics and Data Mining in Banking<div style="text-align: justify;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjPym9ZZ-6W0XrEQtAMjXqWGfS-Tp5w2IR5obvcXQi1hriFc482z3PUEP_aBf1HWYPLDe9LeGJLbwAH_4Xe0qv_TJ0_0tPac1GKTqSR1kW45uaJb__E3IAGtAUJAHcM5Qp97CDNIJAoPWR/s1600/banking_001.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"></a></div><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjPym9ZZ-6W0XrEQtAMjXqWGfS-Tp5w2IR5obvcXQi1hriFc482z3PUEP_aBf1HWYPLDe9LeGJLbwAH_4Xe0qv_TJ0_0tPac1GKTqSR1kW45uaJb__E3IAGtAUJAHcM5Qp97CDNIJAoPWR/s1600/banking_001.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="213" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjPym9ZZ-6W0XrEQtAMjXqWGfS-Tp5w2IR5obvcXQi1hriFc482z3PUEP_aBf1HWYPLDe9LeGJLbwAH_4Xe0qv_TJ0_0tPac1GKTqSR1kW45uaJb__E3IAGtAUJAHcM5Qp97CDNIJAoPWR/s320/banking_001.jpg" width="320" /></a><span style="color: windowtext; font-size: 14pt; mso-fareast-language: EN-ZA;"><span style="font-family: Arial;">With the increasing economic globalization and improvements in information technology, large amounts of financial data are being generated and stored. These can be subjected to data mining techniques to discover hidden patterns and obtain predictions for trends in the future and the behavior of the financial markets. </span></span><span style="color: windowtext; font-size: 14pt; mso-fareast-language: EN-ZA;"><span style="font-family: Arial;"> This in turn would result in an improved market place responsiveness and awareness leading to reduced costs</span></span><br />
<div style="text-align: justify;"><span style="color: windowtext; font-size: 14pt; mso-fareast-language: EN-ZA;"><span style="font-family: Arial;"> and increased revenue. <o:p></o:p></span></span></div><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="color: windowtext; font-size: 14pt; mso-fareast-language: EN-ZA;"><span style="font-family: Arial;">Analytics can contribute to solving business problems in banking and finance by finding patterns, causalities, and correlations in business information and market prices that are not immediately apparent to managers because the volume of data is too large, or the data is generated too quickly, for experts to screen. Bank managers may go a step further and find the sequences, episodes and periodicity in the transaction behaviour of their customers, which may help them to better segment, target, acquire, retain and maintain a profitable customer base. <o:p></o:p></span></span></div><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="font-family: Arial;"><span style="color: windowtext; font-size: 14pt; mso-fareast-language: EN-ZA;">Business intelligence and data mining techniques can also help them identify various classes of customers and come up with a class-based product and/or pricing approach that may improve revenue management as well. </span></span><span style="font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Arial;">Analytics can help banks understand and drive decisions related to customer profitability, as well as enable banking institutions to segment customers according to a multitude of variables – demographics, geographies, account history, etc. – in order to create more meaningful and targeted marketing programs. <span style="mso-spacerun: yes;"> </span><o:p></o:p></span></span></div><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Arial;">Furthermore, analytics can help banks improve retention rates by determining the causes of customer attrition and predicting it. <span style="mso-spacerun: yes;"> </span>In addition, banks can apply analytics to historical data to find out which customers are good candidates for cross-selling and up-selling, and as a result achieve an increase in revenue and wallet share. For most banks, analytics is the most powerful weapon in the fight against fraud.<o:p></o:p></span></span></div><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-family: Arial;"><b style="mso-bidi-font-weight: normal;"><span style="color: #231f20; font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">Customer Relationship Management</span></b><span style="color: #231f20; font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div><div style="text-align: justify;"></div><div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="font-family: Arial;"><i style="mso-bidi-font-style: normal;"><span style="color: #231f20; font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">Customer segmentation and profiling</span></i><span style="color: #231f20; font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"> is a data mining process that builds customer profiles of different groups from the company’s existing customer database. The information obtained from this process can be used for different purposes, such as understanding business performance, making new marketing initiatives, market segmentation, risk analysis and revising company customer policies. The advantage of data mining is that it can handle large amounts of data and learn inherent structures and patterns in data. It can generate rules and models that are useful in enabling decisions that can be applied to future cases. <o:p></o:p></span></span></div><br />
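<div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="font-family: Arial;">As a minimal sketch of the profiling step, the Python below builds a simple profile per customer group from a customer table. It assumes a segment label has already been assigned to each customer (for example by a clustering step not shown here); the segments, fields and the function name <code>build_profiles</code> are hypothetical.</span></div>

```python
from collections import defaultdict
from statistics import mean

# Hypothetical customer table; the segment labels are assumed to exist already.
customers = [
    {"segment": "young professionals", "balance": 1200.0, "products": 1},
    {"segment": "young professionals", "balance": 1800.0, "products": 2},
    {"segment": "retirees", "balance": 9000.0, "products": 3},
    {"segment": "retirees", "balance": 11000.0, "products": 2},
    {"segment": "retirees", "balance": 10000.0, "products": 4},
]


def build_profiles(rows):
    """Summarize each segment: size, average balance, average products held."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["segment"]].append(row)
    return {
        segment: {
            "customers": len(members),
            "avg_balance": mean(m["balance"] for m in members),
            "avg_products": mean(m["products"] for m in members),
        }
        for segment, members in groups.items()
    }


profiles = build_profiles(customers)
for segment, profile in sorted(profiles.items()):
    print(segment, profile)
```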
<div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="color: #231f20; font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p><span style="font-family: Arial;"> </span></o:p></span><span style="color: #231f20; font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Arial;">In banking, analytics and data mining are frequently used to assign a score to a particular customer or prospect, indicating the likelihood that the individual will behave in a particular way. For example, a score could measure the propensity to respond to a particular insurance or credit card offer or to switch to a competitor’s product. Data mining can be useful in all three phases of the customer relationship cycle: customer acquisition, increasing the value of the customer, and customer retention. </span></span></div><div style="text-align: justify;"></div><div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="color: #231f20; font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Arial;">Banks use their credit risk models to classify these respondents into good and bad credit risk classes. Given the huge cost and effort involved in such marketing, data mining techniques can significantly improve the customer conversion rate through more focused marketing. </span></span><br />
<br />
</div><div style="text-align: justify;"></div><div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="color: #231f20; font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Arial;">Because of intense competition in the finance industry, intelligent marketing decisions are more important than ever for customer targeting, acquisition, retention and relationship management. Customer care and marketing strategies must be in place for the business to succeed and survive, and data mining and predictive analytics make it possible to build them. Financial institutions are finding it harder to locate new, previously unsolicited buyers, and as a result they are running aggressive marketing programs to acquire new customers from their competitors. </span></span></div><div style="text-align: justify;"></div><div style="text-align: justify;"> </div><div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="color: #231f20; font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Arial;">With the advent of data mining and business intelligence tools, it has become possible for banks to strengthen customer acquisition through direct marketing and multi-channel contacts, to improve customer development through cross-selling and up-selling of products, and to increase customer retention through behaviour management. </span></span></div><div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="color: #231f20; font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Arial;">It is also possible to bundle various offers to meet the need of the valued customers. Analytics can also help the banks in customizing the various promotional offers. It is also possible for the banks to find out the problem customers who can be defaulters in the future, from their past payment records and the profile and the data patterns that are available. This can also help the banks in adjusting the </span></span><span style="color: #231f20; font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Arial;">relationship with these customers so that the loss in future is kept to its minimum.<o:p></o:p></span></span></div><br />
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="color: #231f20; font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Arial;">Data Mining techniques can be of immense help to the banks and financial institutions in this arena for better targeting and acquiring new customers, fraud detection in real time, providing segment based products for better targeting the customers, analysis of the customers’ purchase patterns over time for better retention and relationship, detection of emerging trends to take proactive stance in a highly competitive market adding a lot more value to existing products and services and launching of new product and service bundles.<o:p></o:p></span></span></div><br />
<br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><b style="mso-bidi-font-weight: normal;"><span style="font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Arial;">Risk Management<o:p></o:p></span></span></b></div><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Arial;">Managing and measuring risk is at the core of every financial institution. A major challenge in banking and insurance today is therefore to implement risk management systems that identify, measure and control business exposure. Credit and market risk present the central challenge, and the way they are measured and handled has changed considerably with the advent of advanced database and data mining technology. (Banking and finance also face other types of risk, e.g. liquidity risk, operational risk and concentration risk.) Today, integrated measurement of different kinds of risk (e.g. market and credit risk) is moving into focus. All of these approaches rely on models that represent single financial instruments or risk factors, their behaviour, and their interaction with the overall market, which makes this field an important topic of research.<o:p></o:p></span></span></div><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-family: Arial;"><b style="mso-bidi-font-weight: normal;"><span style="font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">Financial Market Risk</span></b><span style="font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Arial;">For single financial instruments, that is, stock indices, interest rates, or </span></span><span style="font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Arial;">currencies, market risk measurement is based on models that depend on a set of underlying risk factors, such as interest rates, stock indices, or economic development. One is interested in the functional form that links instrument price or risk to the underlying risk factors, as well as in the functional dependencies among the risk factors themselves. Several market risk measurement approaches exist today. All of them rely on models that represent single instruments, their behaviour and their interaction with the overall market. Many of these models can only be built by using various data mining techniques.<o:p></o:p></span></span></div><br />
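<div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Arial;">As a rough illustration of the "functional form" idea above, the sketch below fits a straight line between an instrument's returns and a single risk factor using ordinary least squares. The linear form, the function name fit_factor_sensitivity and the sample numbers are assumptions made purely for illustration; real market risk models are considerably richer.</span></span></div><br />

```python
def fit_factor_sensitivity(factor_changes, instrument_returns):
    """Fit instrument_return ~ alpha + beta * factor_change by least squares.

    Hypothetical helper: a one-factor linear model is an assumption,
    chosen only because it is the simplest possible functional form.
    """
    n = len(factor_changes)
    if n != len(instrument_returns) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")

    mean_x = sum(factor_changes) / n
    mean_y = sum(instrument_returns) / n

    # Variance of the risk factor and its covariance with the instrument.
    var_x = sum((x - mean_x) ** 2 for x in factor_changes)
    if var_x == 0:
        raise ValueError("the risk factor does not vary")
    cov_xy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(factor_changes, instrument_returns)
    )

    beta = cov_xy / var_x            # sensitivity to the risk factor
    alpha = mean_y - beta * mean_x   # return not explained by the factor
    return alpha, beta


# Made-up data: changes in an interest rate and returns of an instrument.
factor = [-0.02, -0.01, 0.0, 0.01, 0.02]
returns = [-0.03, -0.015, 0.0, 0.015, 0.03]
alpha, beta = fit_factor_sensitivity(factor, returns)
print(f"alpha={alpha:.4f}, beta={beta:.4f}")
```

<div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Arial;">Here beta plays the role of the instrument's sensitivity to the risk factor. With several factors or a non-linear relationship, the same idea would call for a multivariate or non-linear model.</span></span></div><br />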
<div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-family: Arial;"><b style="mso-bidi-font-weight: normal;"><span style="color: windowtext; font-size: 14pt; mso-fareast-language: EN-ZA;">Portfolio Management</span></b><span style="color: windowtext; font-size: 14pt; mso-fareast-language: EN-ZA;"><o:p></o:p></span></span></div><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="font-family: Arial;"><span style="color: windowtext; font-size: 14pt; mso-fareast-language: EN-ZA;">Risk measurement approaches at an aggregated portfolio level quantify the risk of a set of instruments or customers, including diversification effects. Forecasting models, on the other hand, give an indication of the expected return or price of a financial instrument. With data mining and optimization techniques, investors are able to allocate capital across trading activities to maximize profit or minimize risk. Data mining techniques also make it possible to provide extensive scenario analysis concerning expected asset prices or returns and the risk involved. With this functionality, what-if simulations of varying market conditions can be run to assess the impact on the value and/or risk of a portfolio. Profit and loss analyses allow users to assess an asset class, region, counterparty, or custom sub-portfolio, each of which can be compared against common international benchmarks.</span><span style="font-size: 14pt;"><o:p></o:p></span></span></div><div style="text-align: justify;"></div><div style="text-align: justify;"> </div><div class="MsoNormal" style="margin: 0in 0in 0pt; text-align: justify;"><b style="mso-bidi-font-weight: normal;"><span style="color: windowtext; font-size: 14pt; mso-fareast-language: EN-ZA;"><span style="font-family: Arial;">Trading<o:p></o:p></span></span></b></div><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="color: windowtext; font-size: 14pt; mso-fareast-language: EN-ZA;"><span style="font-family: Arial;">For the last few years, a major topic of research has been the building of quantitative trading tools that use data mining methods on past data to predict short-term movements of important currencies, interest rates, or equities. The goal is to spot times when markets are cheap or expensive by identifying the factors that are important in determining market returns. The trading system examines the relationship between relevant information and the price of financial assets, and gives buy or sell recommendations when it suspects an under- or overvaluation. </span></span></div><br />
<div class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="color: windowtext; font-size: 14pt; mso-fareast-language: EN-ZA;"><span style="font-family: Arial;">Goran Dragosavac<o:p></o:p></span></span></div><div style="text-align: justify;"></div>Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com10tag:blogger.com,1999:blog-3770043454488854818.post-63668223001381235212011-09-30T02:28:00.000-07:002011-09-30T02:28:58.511-07:00What to do when the data doesn’t fit the analytical question?<div class="MsoNormal" style="margin: 0in 0in 10pt;"><span style="font-family: Calibri;">Smart response to this question can be – well, either we get the new data, or new question! </span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt; text-align: justify;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDN92Ml2NBCciXkt_7apUGG5Sc8s8bzd2OJVnX1ZLrwkOt4cTyMxZIbb_d2aWbTIbOOCIDDvDW_MMGVV7Gk5d533PMexmpWcQ4k_jr4OZLlg5_SGGwFSS6tuTzEJQ1WAbz5Di8_lW5v8Zz/s1600/images.jpg" imageanchor="1" style="clear: right; cssfloat: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" kca="true" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDN92Ml2NBCciXkt_7apUGG5Sc8s8bzd2OJVnX1ZLrwkOt4cTyMxZIbb_d2aWbTIbOOCIDDvDW_MMGVV7Gk5d533PMexmpWcQ4k_jr4OZLlg5_SGGwFSS6tuTzEJQ1WAbz5Di8_lW5v8Zz/s1600/images.jpg" /></a><span style="font-family: Calibri;">Let’s imagine our task is to find similarity between members of the same group, for example – home loan customers. 
Now, imagine the situation where we ONLY have data for the home loan customers.</span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt; text-align: justify;"><span style="font-family: Calibri;">We can certainly examine all their characteristics, but there is no guarantee that these will differ from the characteristics of purchasers of other banking products. What we need is a point of reference: additional data on customers who hold products other than home loans. To find out what home loan customers have in common, we need to figure out what distinguishes them from everyone else, which is pretty much one and the same thing.</span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt; text-align: justify;"><span style="font-family: Calibri;">This is essentially a classification problem that we would be trying to solve with a unary target variable (one where all purchasers have the same value for the product purchased). Since we don’t have, and are not able to get, additional data for customers who have other types of products, we need to go for the second-best scenario. 
So, instead of “reformulating” the data through artful and creative data preparation to better fit the analytical question, we have no option but to do exactly the opposite: reformulate the analytical question to fit the data at hand.</span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt; text-align: justify;"><span style="font-family: Calibri;">This means that our new question becomes: what are the groups of similar customers within the single class of loan customers, and how do they differ from other groups of loan customers? That replaces the original question of what makes my “loan” customers similar. This is now a very different question, and by reformulating it we are also picking a new “tool” from our workbench: instead of using a classification algorithm, we turn to a clustering method. </span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt;"><span style="font-family: Calibri;">So the usual premise, where data and analytical methods are functions of the business question, doesn’t work in this situation, and the practical solution is to alter the initial objective.</span></div><br />
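<div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt; text-align: justify;"><span style="font-family: Calibri;">To make the switch from classification to clustering concrete, here is a minimal k-means sketch in plain Python. Everything specific in it is an assumption for illustration only: the two features (imagine income and loan size, already scaled to the 0-1 range), the six made-up customers, the choice of k = 2 and the naive "first k points" starting centroids. In practice you would use a proper library and choose k with care.</span></div>

```python
import math


def kmeans(points, k, iterations=20):
    """Very small k-means: returns (labels, centroids).

    Centroids start at the first k points, which is a naive but
    deterministic choice made only to keep the sketch short.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each customer goes to the nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical home loan customers: (scaled income, scaled loan size).
customers = [
    (0.20, 0.10), (0.25, 0.15), (0.90, 0.80),
    (0.85, 0.90), (0.22, 0.12), (0.88, 0.85),
]
labels, centroids = kmeans(customers, k=2)
for customer, label in zip(customers, labels):
    print(customer, "-> group", label)
```

<div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt; text-align: justify;"><span style="font-family: Calibri;">Note that no target variable is needed: the groups come from the similarity between the loan customers themselves, which is exactly why this approach works when only one class is available.</span></div>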
<span style="font-family: Arial, Helvetica, sans-serif; font-size: x-small;">Goran Dragosavac</span>Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com1tag:blogger.com,1999:blog-3770043454488854818.post-69607675870858329222011-09-28T02:31:00.000-07:002011-09-29T03:15:50.442-07:00If you are new to Web Mining…<div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><b><span style="color: #1f1a17; font-family: "Bookplate,Bold", "sans-serif"; font-size: 25.5pt; mso-bidi-font-family: "Bookplate,Bold";"></span></b></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="color: #1f1a17; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">If you are selling products and services via the web channel, you may want to analyze who is visiting your web site, how people who buy differ from those who don’t, and, for those who buy, what their clickstream sequence and navigational pattern look like. </span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="color: #1f1a17; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">Each customer's action on a website generates data: not just high-level interactions such as buying something, but also something as simple as using a search engine or navigating through a site. All these interactions between digital service providers and the consumer can be recorded and stored in digital databases. 
These large data sets contain information helpful to business marketing strategies, both - for retrospective analysis, as well as for data-driven forecasting.</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiZH_HLx-wr2aLpytAhuEQdEg9ARFeTqjx55I7PCowG4ncKbtjjJDOnIdhYwoo5VdPycdlz_Z6zss8LVAQihtKWMFNiANUWYkqUWNfYerPSGbU6R-kMHzEyUzhNSjSMJt1pJ48bUlygqHHX/s1600/webmining.jpg" imageanchor="1" style="clear: right; cssfloat: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="320" kca="true" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiZH_HLx-wr2aLpytAhuEQdEg9ARFeTqjx55I7PCowG4ncKbtjjJDOnIdhYwoo5VdPycdlz_Z6zss8LVAQihtKWMFNiANUWYkqUWNfYerPSGbU6R-kMHzEyUzhNSjSMJt1pJ48bUlygqHHX/s320/webmining.jpg" width="320" /></a><span style="color: #1f1a17; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">Companies today are in the unprecedented position of being able to collect vast amounts of customer information relatively easily. By using web mining, companies can analyze and predict the behavior of their customers. All web site visitors leave digital trails which web servers automatically store in log files. Web analysis tools analyze, and process these web server logs files to produce meaningful information. Essentially, a complete profile of site traffic is created which shows for example, how many visitors there were to the site, what sites they came from, and which pages on the site are most popular. Web analysis tools provide companies with previously unknown statistics, and useful insights into the behavior of their online customers. 
While the usage and popularity of such tools may continue to increase, many online retailers are now demanding more useful information about their customers, from the vast amounts of data generated by their web sites.</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="color: #1f1a17; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">Organizations have typically invested large amounts of money into developing their web sites and web strategy and they would like to know what return they are receiving on their investment. Most sites use hits and page views as measure of success of the web site, which clearly is not going to answer their questions. A website is commonly used for: </span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="color: #1f1a17; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">-Selling products/services</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="color: #1f1a17; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">-Providing product/company information</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="color: #1f1a17; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">-Providing customer support</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="color: #1f1a17; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">Typical questions that an e-retailer needs to answer are:</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><b><span style="color: #1f1a17; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">- How to increase the browser-to-buyer conversion rate?</span></span></b></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-family: Calibri;"><b><span style="color: #1f1a17; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">- How to increase the web retention rate? </span></b><span style="color: #1f1a17; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">(Defined as the ratio of the number of browsers who return to the web site within a certain window of time to the total number of browsers.)</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-family: Calibri;"><b><span style="color: #1f1a17; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">- How to reduce the clicks-to-close value? </span></b><span style="color: #1f1a17; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">(A smaller number indicates that customers are finding what they are looking for more easily. Personalization of web services is a sound way to reduce this value.)</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><b><span style="color: #1f1a17; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">- Does the web site design satisfy the needs of various customer segments?</span></span></b></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
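<div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="color: #1f1a17; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">The first three measures above can be sketched in a few lines of Python. The session records, their field names (visitor, start, clicks, purchased) and the 30-day window are all hypothetical. The exact definitions are also assumptions: conversion rate is taken as buyers divided by browsers, retention as the share of browsers who come back within the window after their first visit, and clicks-to-close as the average number of clicks in sessions that end in a purchase.</span></span></div>

```python
from datetime import datetime, timedelta


def conversion_rate(sessions):
    """Share of distinct visitors (browsers) who bought at least once."""
    browsers = {s["visitor"] for s in sessions}
    buyers = {s["visitor"] for s in sessions if s["purchased"]}
    return len(buyers) / len(browsers) if browsers else 0.0


def retention_rate(sessions, window=timedelta(days=30)):
    """Share of browsers who return within `window` of their first visit."""
    visits = {}
    for s in sessions:
        visits.setdefault(s["visitor"], []).append(s["start"])
    returned = 0
    for times in visits.values():
        times.sort()
        if any(t - times[0] <= window for t in times[1:]):
            returned += 1
    return returned / len(visits) if visits else 0.0


def clicks_to_close(sessions):
    """Average clicks in sessions that ended with a purchase."""
    buying = [s["clicks"] for s in sessions if s["purchased"]]
    return sum(buying) / len(buying) if buying else None


# Made-up sessions, as they might look after sessionizing a web log.
sessions = [
    {"visitor": "a", "start": datetime(2011, 9, 1), "clicks": 12, "purchased": False},
    {"visitor": "a", "start": datetime(2011, 9, 10), "clicks": 6, "purchased": True},
    {"visitor": "b", "start": datetime(2011, 9, 2), "clicks": 4, "purchased": False},
    {"visitor": "c", "start": datetime(2011, 9, 3), "clicks": 8, "purchased": True},
    {"visitor": "c", "start": datetime(2011, 11, 20), "clicks": 3, "purchased": False},
    {"visitor": "d", "start": datetime(2011, 9, 5), "clicks": 5, "purchased": False},
]

print("conversion rate:", conversion_rate(sessions))
print("retention rate:", retention_rate(sessions))
print("clicks-to-close:", clicks_to_close(sessions))
```

<div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="color: #1f1a17; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">Notice that all three measures are computed per visitor or per session, not per page hit, so they need data organized around the user rather than around the server.</span></span></div>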
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="color: #1f1a17; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">Using page hits will NOT provide answer for any of these goals. Current traffic analysis tools are geared at providing high-level predefined reports about domain names, IP addresses, browsers, cookies and other machine-to-machine activity. These server activity reports simply do not provide the type of bottom-line analysis that e-tailers, service providers, marketers and advertisers in the business world have come to demand. These software packages (i.e., web analysis tools) originated from the need to report on the activity of the web server and not on the </span></span><span style="color: #1f1a17; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">activity of the user.</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="color: black; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">Web mining may be subdivided into:</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="color: black; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">- Web-content mining</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="color: black; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">- Web-structure mining</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="color: black; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">- Web-usage mining.</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="color: black; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;">- User profile data</span></span></div><br />
<div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="font-family: Calibri;"><b><span style="color: black; font-size: 12pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">Web-content mining </span></b><span style="color: black; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">is the mining of Internet pages, common in the next generation of XML/RKF-based search engines/Web spiders.</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="font-family: Calibri;"><b><span style="color: black; font-size: 12pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">Web-structure mining </span></b><span style="color: black; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">is the application of data mining to reconstruct the structure of a Web site or sites.</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="font-family: Calibri;"><b><span style="color: black; font-size: 12pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">Web-usage mining </span></b><span style="color: black; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">is mining of log files and associated data from a particular Web site to discover knowledge of browser and buyer behavior on that site. <span style="mso-bidi-font-weight: bold;">User profile data,<b> </b></span>such as demographic information about the users of the web-site, registration data and customer profile information can provide valuable information of its customers, and can be platform for segmentation and profiling. 
Web-usage mining is what is widely understood to be web mining, and it is the main subject of this introduction.</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none; text-align: justify;"><span style="font-family: Calibri;"><span style="color: black; font-size: 10pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">Goran Dragosavac</span></span><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"></span></div>Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com15tag:blogger.com,1999:blog-3770043454488854818.post-80179907191420357462011-09-28T01:31:00.000-07:002011-09-28T01:31:21.730-07:00Data Mining in Retail Industry<div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipF60DvmHFVpEoQkEN0yTeMPfY7nZgKCUeY_qSeyBrfLo8_jxSjtv0kO7W_-DEceravApkRf-Vnj_13EzpPB1nnRkocv5-AStpnLwSXsFyN1mSRdqzF10hciLGtrMb1OjhB53PFqKUoJtr/s1600/retail.jpg" imageanchor="1" style="clear: right; cssfloat: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="213" kca="true" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipF60DvmHFVpEoQkEN0yTeMPfY7nZgKCUeY_qSeyBrfLo8_jxSjtv0kO7W_-DEceravApkRf-Vnj_13EzpPB1nnRkocv5-AStpnLwSXsFyN1mSRdqzF10hciLGtrMb1OjhB53PFqKUoJtr/s320/retail.jpg" width="320" /></a><span style="font-family: "Times New Roman", "serif"; font-size: 10pt;">Retail industry collects large amount of data on sales and customer shopping history. The quantity of data collected continues to expand rapidly, especially due to the increasing ease, availability and popularity of the business conducted on web, or e-commerce. Retail industry provides a rich source for data mining. 
Retail data mining can help identify customer behavior, discover customer shopping patterns and trends, improve the quality of customer service, achieve better customer retention and satisfaction, enhance goods consumption ratios, design more effective goods transportation and distribution policies, and reduce the cost of business. </span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-family: "Times New Roman", "serif"; font-size: 10pt;">Some of the retail applications of data mining are in following areas:</span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt; mso-layout-grid-align: none;"><b><span style="font-size: 16pt; line-height: 115%;"><span style="font-family: Calibri;">Customer Relationship Management</span></span></b></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; line-height: normal; margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c721; font-size: 6pt; mso-bidi-font-family: MSTT31c721;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt;">Customer Segmentation: </span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt;">Customer segmentation is a vital ingredient in a retail organization's marketing recipe. It can offer insights into how different segments respond to shifts in demographics, fashions and trends. 
For example it can help classify customers in the following segments:</span></div><div class="MsoListParagraphCxSpFirst" style="line-height: normal; margin: 0in 0in 0pt 0.5in; mso-layout-grid-align: none; mso-list: l0 level1 lfo1; text-indent: -0.25in;"><span style="font-family: Symbol; font-size: 9pt; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;"><span style="mso-list: Ignore;">·<span style="font: 7pt "Times New Roman";"> </span></span></span><span style="font-family: MSTT31c721; font-size: 6pt; mso-bidi-font-family: MSTT31c721;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><span style="font-family: "Arial", "sans-serif"; font-size: 9pt;">Customers who respond to new promotions</span></div><div class="MsoListParagraphCxSpMiddle" style="line-height: normal; margin: 0in 0in 0pt 0.5in; mso-layout-grid-align: none; mso-list: l0 level1 lfo1; text-indent: -0.25in;"><span style="font-family: Symbol; font-size: 9pt; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;"><span style="mso-list: Ignore;">·<span style="font: 7pt "Times New Roman";"> </span></span></span><span style="font-family: MSTT31c721; font-size: 6pt; mso-bidi-font-family: MSTT31c721;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><span style="font-family: "Arial", "sans-serif"; font-size: 9pt;">Customers who respond to new product launches</span></div><div class="MsoListParagraphCxSpMiddle" style="line-height: normal; margin: 0in 0in 0pt 0.5in; mso-layout-grid-align: none; mso-list: l0 level1 lfo1; text-indent: -0.25in;"><span style="font-family: Symbol; font-size: 9pt; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;"><span style="mso-list: Ignore;">·<span style="font: 7pt "Times New Roman";"> </span></span></span><span style="font-family: MSTT31c721; font-size: 6pt; mso-bidi-font-family: MSTT31c721;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> 
</span></span></span><span style="font-family: "Arial", "sans-serif"; font-size: 9pt;">Customers who respond to discounts</span></div><div class="MsoListParagraphCxSpLast" style="line-height: normal; margin: 0in 0in 10pt 0.5in; mso-layout-grid-align: none; mso-list: l0 level1 lfo1; text-indent: -0.25in;"><span style="font-family: Symbol; font-size: 9pt; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;"><span style="mso-list: Ignore;">·<span style="font: 7pt "Times New Roman";"> </span></span></span><span style="font-family: MSTT31c721; font-size: 6pt; mso-bidi-font-family: MSTT31c721;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><span style="font-family: "Arial", "sans-serif"; font-size: 9pt;">Customers who show propensity to purchase specific products</span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 10pt; mso-layout-grid-align: none;"><br />
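<span style="font-family: "Arial", "sans-serif"; font-size: 9pt;">The segments listed above can be approximated with a very simple rule before any clustering tool is brought in. The sketch below is only an illustration: the contact history, the stimulus names and the 50% response threshold are all assumptions made for the example, not part of any particular retail system.</span>

```python
from collections import defaultdict

# Hypothetical contact history: (customer_id, stimulus_type, responded)
HISTORY = [
    ("c1", "promotion", True),
    ("c1", "promotion", True),
    ("c1", "discount", False),
    ("c2", "discount", True),
    ("c2", "discount", True),
    ("c2", "product_launch", False),
    ("c3", "product_launch", True),
    ("c3", "promotion", False),
]


def response_rates(history):
    """Return {customer: {stimulus: share of contacts that got a response}}."""
    sent = defaultdict(lambda: defaultdict(int))
    hits = defaultdict(lambda: defaultdict(int))
    for customer, stimulus, responded in history:
        sent[customer][stimulus] += 1
        if responded:
            hits[customer][stimulus] += 1
    return {
        customer: {s: hits[customer][s] / n for s, n in by_stimulus.items()}
        for customer, by_stimulus in sent.items()
    }


def assign_segments(history, threshold=0.5):
    """Tag each customer with every stimulus answered at least `threshold` of the time."""
    rates = response_rates(history)
    return {
        customer: sorted(s for s, rate in by_stimulus.items() if rate >= threshold)
        for customer, by_stimulus in rates.items()
    }


segments = assign_segments(HISTORY)
print(segments)
```

<span style="font-family: "Arial", "sans-serif"; font-size: 9pt;">A customer can land in several segments at once, which is usually what a marketer wants; a clustering algorithm would be the natural next step once more behavioral variables are available.</span>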
</div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c6e7; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c6e7;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Campaign/ Promotion Effectiveness Analysis: </span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Once a campaign is launched its effectiveness can be studied across different media and in terms of costs and benefits; this greatly helps in understanding what goes into a successful marketing campaign. Campaign/ promotion effectiveness analysis can answer questions like:</span></div><div class="MsoListParagraphCxSpFirst" style="margin: 0in 0in 0pt 0.5in; mso-layout-grid-align: none; mso-list: l2 level1 lfo2; text-indent: -0.25in;"><span style="font-family: Symbol; font-size: 9pt; line-height: 115%; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;"><span style="mso-list: Ignore;">·<span style="font: 7pt "Times New Roman";"> </span></span></span><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Which media channels have been most successful in the past for various campaigns?</span></div><div class="MsoListParagraphCxSpMiddle" style="margin: 0in 0in 0pt 0.5in; mso-layout-grid-align: none; mso-list: l2 level1 lfo2; text-indent: -0.25in;"><span style="font-family: Symbol; font-size: 9pt; line-height: 115%; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;"><span style="mso-list: Ignore;">·<span style="font: 7pt "Times New Roman";"> </span></span></span><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Which geographic locations responded well to a particular campaign?</span></div><div class="MsoListParagraphCxSpMiddle" style="margin: 0in 0in 0pt 0.5in; 
mso-layout-grid-align: none; mso-list: l2 level1 lfo2; text-indent: -0.25in;"><span style="font-family: Symbol; font-size: 9pt; line-height: 115%; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;"><span style="mso-list: Ignore;">·<span style="font: 7pt "Times New Roman";"> </span></span></span><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">What were the relative costs and benefits of this campaign?</span></div><div class="MsoListParagraphCxSpLast" style="margin: 0in 0in 10pt 0.5in; mso-layout-grid-align: none; mso-list: l2 level1 lfo2; text-indent: -0.25in;"><span style="font-family: Symbol; font-size: 9pt; line-height: 115%; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;"><span style="mso-list: Ignore;">·<span style="font: 7pt "Times New Roman";"> </span></span></span><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Which customer segments responded to the campaign?</span></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c6e7; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c6e7;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Customer Lifetime Value (CLV): </span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Not all customers are equally profitable. CLV attempts to calculate a projected, relative measure of customer value. It estimates Risk Adjusted Revenue (based on the probability that the customer acquires categories/products that are not currently in his portfolio) and Risk Adjusted Loss (based on the probability that the customer drops categories/products that he currently owns), combines them with a Net Present Value, and deducts the cost of servicing the customer. 
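As a rough illustration of the CLV arithmetic described above, the sketch below makes two assumptions that the text does not spell out: each risk-adjusted term is taken as probability times value, summed over products, and the Risk Adjusted Loss is subtracted. The function name, its parameters and the sample figures are hypothetical.

```python
def customer_lifetime_value(npv, cross_sell, attrition, cost_to_serve):
    """Relative CLV score under the assumptions stated in the lead-in.

    npv           -- net present value of the current relationship
    cross_sell    -- (probability of acquiring, value) for products not yet held
    attrition     -- (probability of dropping, value) for products currently held
    cost_to_serve -- cost of servicing the customer over the same horizon
    """
    risk_adjusted_revenue = sum(p * value for p, value in cross_sell)
    risk_adjusted_loss = sum(p * value for p, value in attrition)
    # Assumption: expected gains are added, expected losses and servicing costs deducted.
    return npv + risk_adjusted_revenue - risk_adjusted_loss - cost_to_serve


clv = customer_lifetime_value(
    npv=1000.0,
    cross_sell=[(0.2, 500.0), (0.1, 300.0)],
    attrition=[(0.05, 400.0)],
    cost_to_serve=150.0,
)
print(round(clv, 2))
```

Because the result is a relative measure, it is most useful for ranking customers against each other rather than as an absolute monetary forecast.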
</span></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><b style="mso-bidi-font-weight: normal;"><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Customer Potential:</span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;"> Customers who are not very profitable today may still have the potential to become profitable in the future. It is therefore essential to identify customers with high potential before deciding which marketing stimuli are the best way to realize that potential.</span></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c721; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c721;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Customer Loyalty Analysis: </span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">It is more economical to retain an existing customer than to acquire a new one. To develop effective customer retention programs it is vital to analyze the reasons for customer attrition. 
Business Intelligence helps in understanding customer attrition with respect to the various factors influencing a customer, and at times one can drill down to the individual transactions that might have resulted in the change of loyalty.</span></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c721; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c721;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Cross Selling: </span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Retailers use the vast amount of customer information available to them to cross-sell other products at the time of purchase. This can be done through product portfolio analysis, followed by selling the products that are missing from typical portfolios. Market basket analysis is another good method for effective cross selling. Look-alike modeling is yet another strategy, in which a model produces a quantitative measure of the customer's affinity for a specific product. </span></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c721; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c721;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Product Pricing: </span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Pricing is one of the most crucial marketing decisions taken by retailers. Often an increase in the price of a product can result in lower sales and customer adoption of replacement products. 
Using data warehousing and data mining, retailers can develop sophisticated price models for different products, which can establish price - sales relationships for the product and how changes in prices affect the sales of other products.</span></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c721; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c721;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Target Marketing/Response Modeling: </span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Retailers can optimize the overall marketing and promotion effort by targeting campaigns to specific customers or groups of customers. Target marketing can be based on a very simple analysis of the buying habits of the customer or the customer group; but increasingly data mining tools are being used to define specific customer segments that are likely to respond to particular types of campaigns.</span></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><b><span style="font-size: 16pt; line-height: 115%;"><span style="font-family: Calibri;">Supply Chain Management & Procurement</span></span></b></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Supply chain management (SCM) promises unprecedented efficiencies in inventory control and procurement to the retailers. With cash registers equipped with bar-code scanners, retailers can now automatically manage the flow of products and transmit stock replenishment orders to the vendors. The data collected for this purpose can provide deep insights into the dynamics of the supply chain. 
However, most of the commercial SCM applications provide only transaction-based functionality for inventory management and procurement; they lack sophisticated analytical capabilities required to provide an integrated view of the supply chain. </span></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c721; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c721;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Vendor Performance Analysis: </span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Performance of each vendor can be analyzed on the basis of a number of factors like cost, delivery time, quality of products delivered, payment lead time, etc. In addition to this, the role of suppliers in specific product outages can be critically analyzed.</span></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c6e7; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c6e7;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Inventory Control </span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">(Inventory levels, safety stock, lot size, and lead time analysis): Both current and historic reports on key inventory indicators like inventory levels, lot size, etc. 
can be generated from the data warehouse, thereby helping in both operational and strategic decisions relating to the inventory.</span></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c6e7; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c6e7;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Product Movement and the Supply Chain</span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">: Some products move much faster off the shelf than others. On-time replenishment orders are very critical for these products. Analyzing the movement of specific products - using BI tools - can help in predicting when there will be need for re-order.</span></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c6e7; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c6e7;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Demand Forecasting</span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">: Complex demand forecasting models can be created using a number of factors like sales figures, basic economic indicators, environmental conditions, etc. 
If correctly implemented, a data warehouse can significantly help in improving the retailer’s relations with suppliers and can complement the existing SCM application.</span></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><b><span style="font-size: 16pt; line-height: 115%;"><span style="font-family: Calibri;">Storefront Operations</span></span></b></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">The information needs of the store manager are no longer restricted to the day to day operations. Today’s consumer is much more sophisticated and she demands a compelling shopping experience. For this the store manager needs to have an in-depth understanding of her tastes and purchasing behavior. Data warehousing and data mining can help the manager gain this insight. Following are some of the uses of BI in storefront operations:</span></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><b style="mso-bidi-font-weight: normal;"><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Store Segmentation: </span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">This analysis takes the data that is common for different stores, and finds out which stores are similar in terms of product or customer dimensions. In other words – what stores are similar based on products that are sold quickly or more slowly in comparison to rest of the stores. 
The next step is to build a profile of the customers who buy from a specific store.<b style="mso-bidi-font-weight: normal;"></b></span></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c6e7; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c6e7;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Market Basket Analysis: </span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">It is used to study natural affinities between products. One of the classic examples of market basket analysis is the beer-diaper affinity, which states that men who buy diapers are also likely to buy beer. This is an example of 'two-product affinity'. But in real life, market basket analysis can get extremely complex, resulting in hitherto unknown affinities between a number of products. This analysis has various uses in the retail organization. One very common use is for in-store product placement. Another popular use is product bundling, i.e., grouping products to be sold in a single package deal. Other uses include designing the company's e-commerce web site and product catalogs.</span></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c6e7; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c6e7;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Category Management: </span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">It gives the retailer an insight into the right number of SKUs to stock in a particular category. 
The objective is to achieve maximum profitability from a category; too few SKUs would mean that the customer is not provided with adequate choice, and too many would mean that the SKUs are cannibalizing each other. It goes without saying that effective category management is vital for a retailer's survival in this market.</span></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c721; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c721;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Out-Of-Stock Analysis</span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">: This analysis probes into the various reasons that lead to an out-of-stock situation. Typically a number of variables are involved and it can get very complicated. An integral part of the analysis is calculating the revenue lost due to product stock-outs.</span></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><b><span style="font-size: 16pt; line-height: 115%;"><span style="font-family: Calibri;">Alternative Sales Channels</span></span></b></div><div class="MsoNormal" style="margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c721; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c721;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">E Business Analysis: </span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">The Internet has emerged as a powerful alternative channel for established retailers. 
Increasing competition from retailers operating purely over the Internet - commonly known as 'e-tailers' - has forced the 'Bricks and Mortar' retailers to quickly adopt this channel. Their success would largely depend on how they use the Net to complement their existing channels. Web logs and Information forms filled over the web are very rich sources of data that can provide insightful information about customer's browsing behavior, purchasing patterns, likes and dislikes, etc. Two main types of analysis done on the web site data are:</span></div><div class="MsoListParagraphCxSpFirst" style="margin: 0in 0in 0pt 0.5in; mso-layout-grid-align: none; mso-list: l1 level1 lfo3; text-indent: -0.25in;"><span style="font-family: Symbol; font-size: 9pt; line-height: 115%; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;"><span style="mso-list: Ignore;">·<span style="font: 7pt "Times New Roman";"> </span></span></span><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Web Log Analysis: This involves analyzing the basic traffic information over the e-commerce web site. This analysis is primarily required to optimize the operations over the Internet. It typically includes following analyses:</span></div><div class="MsoListParagraphCxSpMiddle" style="margin: 0in 0in 0pt 0.5in; mso-layout-grid-align: none; mso-list: l1 level1 lfo3; text-indent: -0.25in;"><span style="font-family: Symbol; font-size: 9pt; line-height: 115%; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;"><span style="mso-list: Ignore;">·<span style="font: 7pt "Times New Roman";"> </span></span></span><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Site Navigation: An analysis of the typical route followed by the user while navigating the web site. It also includes an analysis of the most popular pages in the web site. 
This can significantly help in site optimization by making it more user-friendly.</span></div><div class="MsoListParagraphCxSpMiddle" style="margin: 0in 0in 0pt 0.5in; mso-layout-grid-align: none; mso-list: l1 level1 lfo3; text-indent: -0.25in;"><span style="font-family: Symbol; font-size: 9pt; line-height: 115%; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;"><span style="mso-list: Ignore;">·<span style="font: 7pt "Times New Roman";"> </span></span></span><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Referrer Analysis: An analysis of the sites that divert the most traffic to the company’s web site.</span></div><div class="MsoListParagraphCxSpMiddle" style="margin: 0in 0in 0pt 0.5in; mso-layout-grid-align: none; mso-list: l1 level1 lfo3; text-indent: -0.25in;"><span style="font-family: Symbol; font-size: 9pt; line-height: 115%; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;"><span style="mso-list: Ignore;">·<span style="font: 7pt "Times New Roman";"> </span></span></span><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Error Analysis: An analysis of the errors encountered by the user while navigating the web site. This can help in resolving the errors and making the browsing experience more pleasurable. 
</span></div><div class="MsoListParagraphCxSpMiddle" style="margin: 0in 0in 0pt 0.5in; mso-layout-grid-align: none; mso-list: l1 level1 lfo3; text-indent: -0.25in;"><span style="font-family: Symbol; font-size: 9pt; line-height: 115%; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;"><span style="mso-list: Ignore;">·<span style="font: 7pt "Times New Roman";"> </span></span></span><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Keyword Analysis: An analysis of the most popular keywords used by various users in Internet search engines to reach the retailer’s e-commerce web site.</span></div><div class="MsoListParagraphCxSpLast" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt 0.5in; mso-layout-grid-align: none; mso-list: l1 level1 lfo3; text-indent: -0.25in;"><span style="font-family: Symbol; font-size: 9pt; line-height: 115%; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;"><span style="mso-list: Ignore;">·<span style="font: 7pt "Times New Roman";"> </span></span></span><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Product Recommendation: If someone buys product A, which other product is he likely to buy? There are usually three angles to exploit when setting up a recommendation engine: natural product affinities, customer affinities and preferences, and peer dynamics (the wisdom of the crowds). </span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt; mso-layout-grid-align: none;"><br />
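<span style="font-family: "Arial", "sans-serif"; font-size: 9pt;">The "natural product affinities" angle, which is also the core of market basket analysis, can be sketched with the standard support, confidence and lift measures for product pairs. The baskets below are invented for illustration, and ranking recommendations by confidence is just one simple choice, not a description of any specific recommendation engine.</span>

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; each basket is the set of products bought together.
BASKETS = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "wipes"},
    {"beer", "chips"},
    {"milk", "bread"},
]


def pair_affinities(baskets):
    """Return {(lhs, rhs): (support, confidence, lift)} for every co-purchased pair."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = {}
    for (a, b), both in pair_counts.items():
        for lhs, rhs in ((a, b), (b, a)):
            support = both / n                          # share of baskets with both items
            confidence = both / item_counts[lhs]        # P(rhs in basket | lhs in basket)
            lift = confidence / (item_counts[rhs] / n)  # confidence vs. baseline P(rhs)
            rules[(lhs, rhs)] = (support, confidence, lift)
    return rules


def recommend(product, baskets, top=3):
    """Products most often bought with `product`, ranked by confidence, then name."""
    rules = pair_affinities(baskets)
    candidates = [
        (rhs, metrics[1]) for (lhs, rhs), metrics in rules.items() if lhs == product
    ]
    ranked = sorted(candidates, key=lambda item: (-item[1], item[0]))
    return [rhs for rhs, _ in ranked[:top]]


print(recommend("diapers", BASKETS, top=2))
```

<span style="font-family: "Arial", "sans-serif"; font-size: 9pt;">A lift above 1 suggests that two products are bought together more often than their individual popularity would predict; on real data, a minimum support threshold is normally applied first so that rare coincidences are not mistaken for affinities.</span>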
</div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c721; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c721;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Channel Profitability</span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">: Data mining can help analyze channel profitability, and whether it makes sense for the retailer to continue building up expertise in that channel. The decision of continuing with a channel would also include a number of subjective factors like outlook of key enabling technologies for that channel. </span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c721; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c721;"><span style="font-family: Calibri;"> </span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Product – Channel Affinity</span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">: Some product categories sell particularly well on certain channels. 
Data mining can help identify hidden product-channel affinities and help the retailer design better promotion and marketing campaigns.</span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt; mso-layout-grid-align: none;"><b><span style="font-size: 16pt; line-height: 115%;"></span></b><b><span style="font-size: 16pt; line-height: 115%;"><span style="font-family: Calibri;">Finance and Fixed Asset Management</span></span></b></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">The role of financial reporting has undergone a paradigm shift during the last decade. It is no longer restricted to just the financial statements required by law; increasingly it is being used to help in strategic decision making. Also, many organizations have embraced a free information architecture, whereby financial information is openly available for internal use. Many of the analytics described so far use financial data. Many companies, across industries, have integrated financial data into their enterprise-wide data warehouse or established a separate Financial Data Warehouse (FDW). 
Following are some of the uses of BI in finance:</span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c721; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c721;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Budgetary Analysis: </span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Data warehousing facilitates analysis of budgeted versus actual expenditure for various cost heads, such as promotion, so that overruns can be analyzed in more detail. It can also be used to allocate budgets for the coming financial period.</span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt; mso-layout-grid-align: none;"><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Fixed Asset Return Analysis: </span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">This is used to analyze the financial viability of the fixed assets owned or leased by the company. It would typically involve measures like profitability per sq. foot of store space, total lease cost vs. 
profitability, etc.</span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c6e7; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c6e7;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Financial Ratio Analysis: </span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Various financial ratios like debt-equity, liquidity ratios, etc. can be analyzed over a period of time. The ability to drill down and join inter-related reports and analyses – provided by all major OLAP tool vendors – can make ratio analysis much more intuitive.</span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt; mso-layout-grid-align: none;"><span style="font-family: MSTT31c6e7; font-size: 6pt; line-height: 115%; mso-bidi-font-family: MSTT31c6e7;"><span style="mso-spacerun: yes;"><span style="font-family: Calibri;"> </span></span></span><b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">Profitability Analysis: </span></b><span style="font-family: "Arial", "sans-serif"; font-size: 9pt; line-height: 115%;">This includes profitability of individual stores, departments within the store, product categories, brands, and individual SKUs. </span></div><div style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none;"><br />
</div>Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com103tag:blogger.com,1999:blog-3770043454488854818.post-56731903615114703222011-08-10T06:40:00.000-07:002011-08-10T06:40:10.185-07:00What constitutes a good data mining model?<div class="MsoNormal" style="margin: 0in 0in 10pt;"><span style="font-family: Calibri;">There are different types of data mining models, so the definition of a good-quality model depends on the type of model. </span></div><div class="separator" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgu0uKTsivpIwbWy0_a_DxSjbmLP1jxCST_ywtLonzn7zHBVcKPUL13cwoRVMQq335tVgBRxQA54rpF4qfR3h0rbaa23YOrbB_1qyxwWNA60TfE3DoEifaqqwyGyWa-sMLqIWlrYT8hbmb9/s1600/imagesCA066IEF.jpg" imageanchor="1" style="clear: right; cssfloat: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="272" naa="true" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgu0uKTsivpIwbWy0_a_DxSjbmLP1jxCST_ywtLonzn7zHBVcKPUL13cwoRVMQq335tVgBRxQA54rpF4qfR3h0rbaa23YOrbB_1qyxwWNA60TfE3DoEifaqqwyGyWa-sMLqIWlrYT8hbmb9/s320/imagesCA066IEF.jpg" width="320" /></a></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt;"><span style="font-family: Calibri;">A good explanatory model must be able to explain some facet of the business problem. The purpose of descriptive models is to extract patterns in the data that are non-trivial, previously unknown, potentially useful and actionable. 
Such a model should deepen your understanding of specific business phenomena and, if acted upon, these new insights can generate new business value.</span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt;"><span style="font-family: Calibri;">Predictive models are different. The purpose of predictive models is to generalize well on new data. First, we have to be able to compare the results to what actually happened in the real world. Did the predicted behavior actually happen? How many times was the model right, and how many times was it wrong? What is the improvement of the model in comparison to pre-modeling levels? <span style="mso-spacerun: yes;"> </span>Here, the basic assessment metrics used to choose the best model are accuracy rates, misclassification rates, lift, average squared error, etc. <span style="mso-spacerun: yes;"> </span></span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt;"><span style="font-family: Calibri;">The question that I have been asked many times by business audiences is how they can trust the model, since they are required not only to sponsor model implementation, but also to stake their reputations on technologies that they often don’t quite understand.<span style="mso-spacerun: yes;"> </span>My response is always to look at the assessment measures on test data. 
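</span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt;"><span style="font-family: Calibri;">To make these assessment measures concrete, here is a minimal Python sketch that computes accuracy, misclassification rate, average squared error and lift for a binary model on a test dataset. The function name assess_on_test_data, the 0.5 cut-off, the top 20% used for lift and the ten example records are all illustrative assumptions, not taken from any particular tool or real project.</span></div>

```python
def assess_on_test_data(actual, scores, threshold=0.5, top_fraction=0.2):
    """Basic assessment measures for a binary predictive model on test data.

    actual       -- list of 0/1 outcomes that really happened
    scores       -- list of predicted probabilities of outcome 1 (same order)
    threshold    -- hypothetical cut-off used to turn a score into a 0/1 prediction
    top_fraction -- hypothetical share of top-scored records used to compute lift
    """
    if len(actual) != len(scores) or not actual:
        raise ValueError("actual and scores must be non-empty and of equal length")

    n = len(actual)
    predicted = [1 if s >= threshold else 0 for s in scores]

    # How many times was the model right, and how many times wrong?
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    accuracy = correct / n
    misclassification_rate = (n - correct) / n

    # Average squared error between the predicted probability and the 0/1 outcome.
    average_squared_error = sum((a - s) ** 2 for a, s in zip(actual, scores)) / n

    # Lift: response rate among the top-scored records divided by the
    # overall response rate (the "pre-modeling" level).
    base_rate = sum(actual) / n
    k = max(1, int(n * top_fraction))
    ranked = sorted(zip(scores, actual), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(a for _, a in ranked[:k]) / k
    lift = top_rate / base_rate if base_rate > 0 else float("nan")

    return {
        "accuracy": accuracy,
        "misclassification_rate": misclassification_rate,
        "average_squared_error": average_squared_error,
        "lift": lift,
    }


# Made-up test records, purely for illustration.
example_actual = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
example_scores = [0.9, 0.2, 0.8, 0.6, 0.1, 0.3, 0.4, 0.2, 0.1, 0.05]

for name, value in assess_on_test_data(example_actual, example_scores).items():
    print(f"{name}: {value:.4f}")
```

<div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt;"><span style="font-family: Calibri;">A lift of 3, for example, would mean that the top-scored records contain three times as many actual positives as a random selection of the same size, which is the kind of improvement over pre-modeling levels that a business sponsor can relate to.</span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt;"><span style="font-family: Calibri;">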
How the model performs on a test dataset is the closest we will ever get to assessing its performance on a new dataset, where the model is required to generate accurate predictions.</span></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; margin: 0in 0in 10pt;"><span style="font-family: Calibri;">Model accuracy is only one aspect of quality; there are others, such as stability. At the same level of accuracy, it is generally better to go for the simpler model with fewer variables, since such models tend to be more robust and stable.</span></div><div class="MsoNormal" style="margin: 0in 0in 10pt;"><span style="font-family: Calibri;">Another angle on what constitutes a good model comes purely from a business perspective. Models that are near perfect from a statistical perspective are of no use if they cannot be implemented, for whatever reason. On the other hand, we may have models that fall short of being statistically sound but that can still help us do things better than we could in the absence of such a model.</span></div><div class="MsoNormal" style="margin: 0in 0in 10pt;"><span style="font-family: Calibri;">And lastly, the main question remains: how do the benefits generated by the model compare with its cost of production and implementation? 
The benefits of a good model always outweigh its cost.</span></div><div class="MsoNormal" style="margin: 0in 0in 10pt;"><span style="font-family: Calibri;">Goran Dragosavac </span></div>Analytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com2tag:blogger.com,1999:blog-3770043454488854818.post-20629022159175165782011-06-30T01:46:00.011-07:002011-06-30T02:30:32.485-07:00Data Mining applications across the industries<div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 14pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-family: Calibri;"></span></span><br />
<div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><div style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIQpcKC-XIvGKO6P4WLBzgeQwhqy1mvI3yCx1T8A49gSfpKYDefOPK8OEe208-gOKzhHXKPkA3hQpItNZtb_Y10qdnFC61CqG_XxwNqmlc4pef8GJpFmErVH5M0drRX1xi69NrUwUSh-cV/s1600/images.jpg" imageanchor="1" style="clear: right; cssfloat: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" i$="true" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIQpcKC-XIvGKO6P4WLBzgeQwhqy1mvI3yCx1T8A49gSfpKYDefOPK8OEe208-gOKzhHXKPkA3hQpItNZtb_Y10qdnFC61CqG_XxwNqmlc4pef8GJpFmErVH5M0drRX1xi69NrUwUSh-cV/s1600/images.jpg" /></a><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-size: small;">I am often asked what the most common applications of analytics are in a specific industry. Although each industry has some applications of analytics and data mining that are specific to it, there are also cross-industry applications that are common to many industries. An example of an industry-specific analytical application is “policy-lapse prediction” in the insurance industry. </span></span><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="font-size: small;">Examples of cross-industry applications are customer segmentation and customer retention, since in any industry where there are customers there is also a need to segment them and retain them. 
The following is a mix of analytical applications that can be implemented in each industry: <span style="mso-spacerun: yes;"> </span><span style="mso-spacerun: yes;"> </span><span style="mso-spacerun: yes;"> </span></span></span></div></div><div class="MsoNormal" style="border-bottom: medium none; border-left: medium none; border-right: medium none; border-top: medium none; line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: small;"><u><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><strong>Banking (retail):</strong></span></u><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"> Analytics can help banks understand and drive decisions related to customer profitability, and enable them to segment customers <span style="color: #231f20;">according</span> to a multitude of variables (demographics, account history, etc.) in order to create more meaningful and targeted marketing programs. Furthermore, analytics can help banks improve retention rates by determining the causes of customer attrition and predicting future attrition. In addition, banks can apply analytics to historical data to find out which customers are good candidates for cross-selling and up-selling, and as a result achieve an increase in revenue and wallet share. For most banks, analytics are the most powerful weapon in the fight against fraud.</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: small;"><u><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><strong>Banking (investment):</strong></span></u><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"> In investment banking, analytics can be of tremendous value in supporting cross-asset trading and various other trading strategies. Analytical technologies are also invaluable for enterprise-wide, market and credit risk management. Other applications of analytics are s<span style="color: #231f20;">egmenting and predicting the behavior of homogeneous groups of customers, uncovering hidden correlations between different indicators, creating models to price futures, options, and stocks, and optimizing portfolio performance.</span></span></span><br />
<br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="color: #231f20; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"></span><u><span style="color: #231f20; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"></span></u></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: small;"><u><span style="color: #231f20; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><strong>Insurance (short term):</strong></span></u><span style="color: #231f20; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"> In short-term insurance, analytics are applied in rate-making, by identifying risk factors that predict profits, claims and losses, and in identifying potentially fraudulent claims. Other common applications are segmenting and profiling customers and then doing a rate and claim analysis of a single segment for different products, as well as market basket analysis and sequencing, which answer the question of which insurance products are purchased together or in succession. Analytics are also applied in reinsurance, in estimating outstanding claims provision (severity of the claim, exposure, frequency, time before settlement, etc.), and in separating claims between digital and mobile assessors. </span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: small;"><u><span style="color: #231f20; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><strong>Insurance (life):</strong></span></u><span style="color: #231f20; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><span style="mso-spacerun: yes;"> </span>Common applications of analytics in life insurance include policy lapse prediction, modeling brokers’ performance, reactivating dormant customers, estimating buying potential, and realizing untapped potential through more effective cross-selling. In addition, analytics are commonly used to model response in direct marketing of specific insurance products.</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: small;"><u><span style="color: #231f20; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><strong>Telcos</strong>:</span></u><span style="color: #231f20; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"> Analytics in telecoms are used for churn management, network fault prediction, up-selling and cross-selling, capacity planning, personalized advertising and subscriber profiling.</span></span><br />
<br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="color: #231f20; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"></span><span style="color: #231f20; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: small;"><u><span style="color: #231f20; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><strong>Retail:</strong></span></u><span style="color: #231f20; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"> </span><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">Analytics in retail are being used for supply chain and demand planning, customer segmentation and profiling, for improving response in direct marketing, for better cross-selling and up-selling, for product management, and for better understanding which products are purchased together or in sequence. </span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: small;"><u><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><strong>Industrials:</strong></span></u><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"> Analytics among Industrials are being used for warranty analysis, quality control, process optimization, waste management, supplier segmentation, product and customer profitability, causal analysis, service parts optimization, and for supply chain optimization and demand planning.</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: small;"><u><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><strong>Resources:</strong></span></u><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"> In the exploitation of natural resources, analytics are used to better understand the operational risks associated with situations like equipment failures, human error and security breaches. Analytics can also be used to analyze usage patterns, weather, econometric data, changing demographics, etc. in order to predict energy purchase/supply requirements accurately and confidently.<u></u></span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: small;"><u><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><strong>Oil and Gas (upstream):</strong></span></u><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"> Analytics in Oil and Gas are used for exploration and production optimization, facility integrity and reliability (predicting shut-downs, outages and downtime in production), reservoir modeling and oil-field production forecasting, estimating the shape of an oil field, fluid flood optimization and permeability prediction. They are also used for optimizing the reliability of equipment. Other applications of analytics include managing oil field assets by identifying trends in asset performance and potential, estimating the potential of infill drilling locations, screening and prioritizing workover candidates, discovering the characteristics of high-potential producing assets, and identifying opportunities for acquisitions.</span><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman";"></span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: small;"><u><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><strong>Oil and Gas (downstream):</strong></span></u><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"> Common analytical applications are in demand forecasting, prediction of outages (planned, unplanned), grid overloads as well as predictive asset maintenance and fault prediction. Other applications are workforce optimization and consumer analytics.</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: small;"><u><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><strong>Healthcare:</strong></span></u><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"> Analytics in healthcare are being used for medical claims analysis (segmentation of claims into normal claims, claims for case managers and claims for investigative units), for outcome analysis, both clinical and financial (mortality, length of stay, etc.), for disease management, for medical error analysis, as well as for patient and supplier relationship management (increasing patient satisfaction levels, and segmenting suppliers and providers by cost, efficiency and quality of service).</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: small;"><u><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><strong>Goods:</strong></span></u><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"> Analytics among goods manufacturers are being used for quality control, process optimization, waste management, for inventory optimization and demand planning.</span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: small;"><u><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><strong>Public:</strong></span></u><span style="mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"> Analytics in the public sector are used for improving service delivery and the performance of government agencies, improving safety, minimizing tax evasion, detecting fraud, waste and abuse, analyzing scientific and research information, managing human resources, optimizing resources, and analyzing intelligence information. </span></span></div><div class="MsoNormal" style="line-height: normal; margin: 0in 0in 0pt; mso-layout-grid-align: none;"><br />
</div></div>Goran DragosavacAnalytics and Data Mininghttp://www.blogger.com/profile/06683904073452049165noreply@blogger.com7