RFM segmentation is well known in the retail industry. Its
basic premise is that by knowing the recency, frequency and value of a customer's
purchases, you are in a good position to start characterizing that customer in
terms of value, purchasing behavior and loyalty. However, the same logic can be
applied to any phenomenon that we are trying to predict: knowing how
often something happens, how recently it happened and its magnitude has the same
type of predictive power as it has in the retail context. Whenever I have used it
for predictive modeling, RFM has always come out as one of the top predictors.
So, let me explain the basic principles of the RFM method.

RFM segments the customer base based on recency of purchase (R), frequency
of purchase (F) and monetary value (M).
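As a rough illustration, the three raw measures could be derived from a transaction list like this. It is only a sketch under assumptions: the `(customer, date, amount)` tuple layout, the `rfm_summary` name and the sample data are all hypothetical, and recency is expressed as days since the last purchase relative to a chosen `as_of` date.

```python
from collections import defaultdict
from datetime import date


def rfm_summary(transactions, as_of):
    """Return {customer: (recency_days, frequency, total_value, average_value)}."""
    by_customer = defaultdict(list)
    for customer, day, amount in transactions:
        by_customer[customer].append((day, amount))

    summary = {}
    for customer, rows in by_customer.items():
        last_purchase = max(day for day, _ in rows)
        recency_days = (as_of - last_purchase).days   # smaller means more recent
        frequency = len(rows)                          # number of purchases
        total = sum(amount for _, amount in rows)      # total monetary value
        summary[customer] = (recency_days, frequency, total, total / frequency)
    return summary


# Hypothetical sample data: (customer id, purchase date, amount)
transactions = [
    ("A", date(2024, 1, 5), 20.0),
    ("A", date(2024, 3, 1), 40.0),
    ("B", date(2024, 2, 10), 100.0),
]
summary = rfm_summary(transactions, as_of=date(2024, 3, 31))
print(summary["A"])  # (30, 2, 60.0, 30.0)
```

The average value per purchase is returned next to the total, so either can be used as the monetary measure.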

The *recency* parameter is the most powerful of the three. In forecasting models, the latest observations in a time series often carry the highest weight and are the most predictive of the next forecast value. The second most powerful is *frequency*, as long as its definition is limited to the last month or quarter and does not span the entire life of the customer relationship. The least powerful is *monetary value*. Since the total value over a period of time is directly correlated with *frequency*, it is advisable to use an average value instead.

There are several different ways to calculate RFM groups and scores;
below is the classic approach:

First, create 5 segments based on recency by dividing the data file into 5 exact quintiles: the contacts with the most recent transactions (i.e. in the top 20% of the file) are given a recency value of 5, the next 20% are given a recency value of 4, and so on. Then each of those quintiles is segmented into 5 further quintiles based on the *frequency* value of each contact: the contacts with the highest transaction frequency are given a frequency value of 5, the next 20% a frequency value of 4, and so on. Finally, each of these segments is segmented into 5 further quintiles based on the monetary value of each contact, i.e. the total amount of all that contact’s transactions. The contacts with the highest monetary values (i.e. in the top 20%) are given a monetary value of 5, the next 20% a monetary value of 4, and so on. At the end of this process you will have 125 segments, with RFM groups between 111 and 555 and the same number of contacts in each segment, and each contact will have an RFM score between 3 and 15.
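The nested quintile procedure could be sketched in plain Python along these lines. This is a minimal illustration, not a reference implementation: the function names, the `{id: (recency_days, frequency, monetary)}` input layout and the randomly generated sample file are all assumptions. Ties are broken by sort order, and recency is measured in days since the last purchase, so fewer days ranks higher. The sample file has 625 contacts so that every one of the 125 segments receives exactly 5 of them.

```python
import random


def quintile_scores(ids, key):
    """Rank ids by key (best first) and give scores 5..1 to five equal slices."""
    ranked = sorted(ids, key=key, reverse=True)
    n = len(ranked)
    return {cid: 5 - (i * 5) // n for i, cid in enumerate(ranked)}


def nested_rfm(customers):
    """customers: {id: (recency_days, frequency, monetary)} -> {id: 'RFM' group}."""
    # Fewer days since the last purchase is better, hence the negation.
    r = quintile_scores(customers, key=lambda c: -customers[c][0])
    groups = {}
    for r_val in range(1, 6):
        r_cell = [c for c in customers if r[c] == r_val]
        # Frequency quintiles are computed inside each recency quintile.
        f = quintile_scores(r_cell, key=lambda c: customers[c][1])
        for f_val in range(1, 6):
            f_cell = [c for c in r_cell if f[c] == f_val]
            # Monetary quintiles are computed inside each R x F cell.
            m = quintile_scores(f_cell, key=lambda c: customers[c][2])
            for c in f_cell:
                groups[c] = f"{r_val}{f_val}{m[c]}"
    return groups


# Hypothetical data file: 625 contacts with random R, F and M values.
rng = random.Random(0)
customers = {
    i: (rng.randint(0, 365), rng.randint(1, 50), round(rng.uniform(5, 500), 2))
    for i in range(625)
}
groups = nested_rfm(customers)
# The RFM score is the sum of the three digits of the RFM group.
scores = {c: sum(int(d) for d in g) for c, g in groups.items()}
print(len(set(groups.values())), min(scores.values()), max(scores.values()))  # 125 3 15
```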
An alternative is to still calculate RFM groups/scores using quintiles, but with the Independent RFM Quintile approach: not just *recency* but also the *frequency* and *monetary value* of each contact are ranked across the whole data file and do not depend on any of the other RFM factors or quintiles. Another approach is to use user-definable bands for each criterion (i.e. each RFM factor) to determine which *recency*, *frequency* and *monetary value* should be given to each contact. Even though RFM segmentation can be used on a stand-alone basis, I tend to combine it with other demographic and affinity variables in order to get a more holistic view of the segment's make-up.
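For comparison, here is a small sketch of the independent-quintile variant together with a user-defined band lookup. The function names, the sample values and the cut points passed to `band_score` are hypothetical, chosen only to keep the example easy to check by hand. Each factor is ranked across the whole file, so segment sizes are no longer guaranteed to be equal.

```python
from bisect import bisect_right


def quintile_scores(values, higher_is_better=True):
    """Score each key of `values` 5..1 by its rank across the whole file."""
    ranked = sorted(values, key=values.get, reverse=higher_is_better)
    n = len(ranked)
    return {cid: 5 - (i * 5) // n for i, cid in enumerate(ranked)}


def independent_rfm(customers):
    """customers: {id: (recency_days, frequency, monetary)} -> {id: 'RFM' group}."""
    # Each factor is scored on its own, ignoring the other two.
    r = quintile_scores({c: v[0] for c, v in customers.items()},
                        higher_is_better=False)  # fewer days is better
    f = quintile_scores({c: v[1] for c, v in customers.items()})
    m = quintile_scores({c: v[2] for c, v in customers.items()})
    return {c: f"{r[c]}{f[c]}{m[c]}" for c in customers}


def band_score(value, cut_points):
    """User-defined bands: 1 plus the number of cut points the value has reached."""
    return 1 + bisect_right(cut_points, value)


# Hypothetical file of 10 contacts with distinct values per factor.
customers = {
    i: (i * 10, (i * 3) % 10 + 1, ((i * 7) % 10) * 10 + 10)
    for i in range(10)
}
groups = independent_rfm(customers)
print(groups[0], groups[3], groups[7])  # 511 451 215

# Hypothetical monetary bands: <50, 50-99, 100-249, 250-499, 500+
print(band_score(30, [50, 100, 250, 500]), band_score(600, [50, 100, 250, 500]))  # 1 5
```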
I have developed my own approach,
which I often use and which differs somewhat from the classic one. It
goes as follows:

1.) Create a variable Total Spend for each customer.

2.) Create a variable Total Number of Visits for each customer.

3.) Divide each variable into 3 equal-frequency bins, so that the 1st bin holds roughly the lowest 30% of all customers in terms of spending (and likewise for visits, as a separate variable).

4.) Evaluate each customer in terms of which bin he belonged to (for that period) by total spending and by total visits, and label him accordingly. Example: the variable “FRM_Spend_label” would have the values “L”, “M” and “H”. If a customer’s total spending for the 12 months falls within the thresholds of the second bin, give him the value “M” (medium) in “FRM_Spend_label”.

5.) Do the same for visits, creating a new variable “FRM_visit_variable”.

6.) Treat “Recency” slightly differently: starting from the same end date used for spending and visits, look back only 3 months rather than 12. Then do the following: if the customer’s most recent purchase was in month 1 (the most recent month), give him the value “H”; if it was in month 2, give him “M”; and if it was in month 3, give him “L”.

Note: it may happen that most customers have some sort of purchase in every month, in which case it would be advisable to raise the threshold above “0”. In other words, count a month as containing a purchase only if its monthly total is above some specified amount greater than “0”.

7.) Combine all three FRM dimensions into a single variable whose values are combinations of “H”, “M” and “L”. A value of “HLH” would mean that the customer falls in the top group by number of visits to the stores, that his most recent visit (with a purchase larger than…) was in the oldest of the three months, and that he falls in the top group by the total monetary value he brings to the company.

8.) In the last step I apply the “19 + 1” rule: I retain the top 19 combinations by frequency and put all the other combinations into an “other” category, so that my FRM variable doesn’t have more than 20 distinct values.
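The steps above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the field names `visits_12m`, `spend_12m` and `last_3_months`, the function names and the sample customers are hypothetical; the bins are exact equal-frequency thirds; and a customer with no qualifying purchase anywhere in the 3-month window is labelled “L”, a case the steps above do not specify.

```python
from collections import Counter


def tercile_labels(values):
    """Label each key L/M/H by equal-frequency thirds of its value."""
    ranked = sorted(values, key=values.get)  # lowest value first
    n = len(ranked)
    return {cid: "LMH"[(i * 3) // n] for i, cid in enumerate(ranked)}


def recency_label(last_3_months, threshold=0.0):
    """last_3_months[0] is the most recent month's total spend."""
    for label, spend in zip("HML", last_3_months):
        if spend > threshold:
            return label
    return "L"  # assumption: no qualifying purchase in the window counts as "L"


def frm_variable(customers, threshold=0.0, keep=19):
    """Build the combined visits/recency/spend label, then apply the 19 + 1 rule."""
    visits = tercile_labels({c: v["visits_12m"] for c, v in customers.items()})
    spend = tercile_labels({c: v["spend_12m"] for c, v in customers.items()})
    combo = {
        c: visits[c] + recency_label(v["last_3_months"], threshold) + spend[c]
        for c, v in customers.items()
    }
    # Keep the `keep` most frequent combinations, pool the rest into "other".
    top = {label for label, _ in Counter(combo.values()).most_common(keep)}
    return {c: (label if label in top else "other") for c, label in combo.items()}


# Hypothetical customers: 12-month totals plus spend in each of the last 3 months.
customers = {
    "a": {"visits_12m": 1, "spend_12m": 50, "last_3_months": [0, 0, 10]},
    "b": {"visits_12m": 2, "spend_12m": 400, "last_3_months": [5, 0, 0]},
    "c": {"visits_12m": 5, "spend_12m": 30, "last_3_months": [0, 20, 0]},
    "d": {"visits_12m": 6, "spend_12m": 200, "last_3_months": [0, 0, 0]},
    "e": {"visits_12m": 10, "spend_12m": 900, "last_3_months": [0, 0, 300]},
    "f": {"visits_12m": 12, "spend_12m": 150, "last_3_months": [40, 10, 0]},
}
frm = frm_variable(customers)
print(frm["e"])  # HLH
```

Passing a `threshold` above 0 implements the note under step 6: with `threshold=15`, customer `b` no longer counts as having purchased in the most recent month.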

Hope this helps!
