
Context: Blue Tech Ltd. is a company that sells a wide range of technologically advanced products around the world. With hundreds of thousands of transactions registered per year, its database is growing rapidly, which makes analysis more complex and leaves certain tools unable to cope with the sheer volume of data.

Objective: Understand consumer behavior better by applying RFM (Recency - Frequency - Monetary) analysis.

Blue Tech Ltd.

Data Source

Data Quality Verification

RFM Analysis

Performing EDA and Cleansing the data

RFM analysis pinpoints high-value customers, fading buyers, and untapped opportunities by analyzing Recency, Frequency, and Monetary value (spending). This allows businesses to craft targeted retention strategies, personalize offers, and drive repeat purchases with precision.

Since RFM analysis can be quite complex, the next steps cover only the main points needed to reach our goal of better understanding customer behavior.

Aggregating Customers

To better understand our dataset and the data types and fields we are dealing with, we load the dataset and perform EDA on it through functions such as:

  • pd.read_csv() -- To load the data into our IDE**.

  • df.info() and df.describe() -- To get a basic understanding of the number of rows and columns, and a summary of the dataset's basic metrics.

  • df.dropna() and pd.to_datetime() -- To exclude certain values (such as empty customer IDs) and to make sure dates have the right data type.

  • Additionally, we use the .astype() method to further refine the data types of our dataset.

  • Finally, we apply df.head() to view the first rows of our cleaned dataset and make sure everything meets our requirements.

** IDE (Integrated Development Environment); for this scenario, Spyder was selected.
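The steps listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (invoice_no, customer_id, invoice_date, quantity, price) and the four sample rows are assumptions made for the example, and an in-memory CSV stands in for the real file.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real transactions file;
# the column names are assumptions, not taken from the Blue Tech data.
raw_csv = io.StringIO(
    "invoice_no,customer_id,invoice_date,quantity,price\n"
    "1001,17850,2023-12-01 08:26:00,6,2.55\n"
    "1002,17850,2023-12-05 09:01:00,2,3.39\n"
    "1003,,2023-12-05 10:15:00,1,7.65\n"
    "1004,13047,2023-12-07 11:45:00,4,4.25\n"
)

# Load the data (with a real file this would be pd.read_csv("path/to/file.csv")).
df = pd.read_csv(raw_csv)

# Basic EDA: row/column counts, dtypes, and summary statistics.
df.info()
print(df.describe())

# Drop rows without a customer id, since they cannot be attributed to anyone.
df = df.dropna(subset=["customer_id"])

# Make sure dates are real datetimes and ids have sensible types.
df["invoice_date"] = pd.to_datetime(df["invoice_date"])
df = df.astype({"customer_id": "int64", "invoice_no": "string"})

# Inspect the first rows of the cleaned dataset.
print(df.head())
```

Dropping rows only where customer_id is missing (rather than calling df.dropna() with no arguments) keeps transactions that are incomplete in other, less critical columns.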

We rename the columns that will serve as our main fields for aggregating and segmenting the data: Recency (the time elapsed between a customer's latest purchase and the current date), Frequency (the number of unique purchases made), and Monetary (total price, computed as quantity * price).
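As a rough sketch of this aggregation, one row per customer can be built with a groupby. The column names and sample rows are again assumptions, and a fixed snapshot date (the day after the last transaction) is used in place of the current date so that the example is reproducible.

```python
import pandas as pd

# Hypothetical cleaned transactions; column names are assumptions.
tx = pd.DataFrame({
    "customer_id": [17850, 17850, 13047, 13047, 13047, 12583],
    "invoice_no": ["1001", "1002", "1003", "1003", "1004", "1005"],
    "invoice_date": pd.to_datetime([
        "2023-12-01", "2023-12-05", "2023-11-20",
        "2023-11-20", "2023-12-07", "2023-09-15",
    ]),
    "quantity": [6, 2, 4, 1, 3, 10],
    "price": [2.55, 3.39, 4.25, 7.65, 1.85, 0.85],
})

# Monetary building block: total price per transaction line.
tx["total_price"] = tx["quantity"] * tx["price"]

# Assumption: measure recency against the day after the last recorded
# transaction instead of today's date, so results do not change over time.
snapshot = tx["invoice_date"].max() + pd.Timedelta(days=1)

# One row per customer with the three RFM fields.
rfm = (
    tx.groupby("customer_id")
    .agg(
        Recency=("invoice_date", lambda d: (snapshot - d.max()).days),
        Frequency=("invoice_no", "nunique"),
        Monetary=("total_price", "sum"),
    )
    .reset_index()
)

print(rfm)
```

Counting unique invoice numbers, rather than rows, keeps a multi-item purchase from being counted as several purchases.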

Assess and Assign numeric values to our customer categories

The next step is to create a new column for each field (R, F, and M) and assign every customer a score from 1 to 5 in each category, based on five quantiles (quintiles). Although this sounds complex, the idea is simple: the code ranks each customer against the whole field and assigns a score from 1 to 5 (5 being best) according to their performance relative to everyone else.

For instance, a customer who bought a product in the past few days will have a higher Recency score than one whose last purchase was months ago. The same logic applies to Frequency and Monetary: more purchases and higher spending earn higher scores.
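One possible way to compute such quintile scores is pd.qcut, sketched below. The helper quintile_score and the ten sample customers are hypothetical. Values are ranked before binning, which is a choice made for this example so that tied values (common in Frequency) do not produce duplicate bin edges.

```python
import pandas as pd

# Hypothetical per-customer RFM table (values invented for illustration).
rfm = pd.DataFrame({
    "customer_id": range(1, 11),
    "Recency": [2, 5, 9, 14, 30, 45, 60, 90, 120, 200],
    "Frequency": [12, 10, 9, 7, 5, 4, 3, 2, 1, 1],
    "Monetary": [950.0, 720.0, 640.0, 500.0, 310.0,
                 250.0, 180.0, 90.0, 60.0, 25.0],
})


def quintile_score(values, higher_is_better=True):
    """Score a column from 1 to 5 by quintile (hypothetical helper)."""
    # Ranking first makes every value unique, so qcut never fails on ties.
    ranks = values.rank(method="first")
    labels = [1, 2, 3, 4, 5] if higher_is_better else [5, 4, 3, 2, 1]
    return pd.qcut(ranks, q=5, labels=labels).astype(int)


# Recency is reversed: fewer days since the last purchase is better.
rfm["R_score"] = quintile_score(rfm["Recency"], higher_is_better=False)
rfm["F_score"] = quintile_score(rfm["Frequency"])
rfm["M_score"] = quintile_score(rfm["Monetary"])

print(rfm)
```

A side effect of ranking with method="first" is that customers with identical values can land in neighboring quintiles when a tie straddles a bin boundary, which is worth keeping in mind when reading scores near the edges.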

We then set thresholds to determine which customers fall into each segment. These thresholds are arbitrary, chosen based on industry knowledge, and can be adjusted at a later stage if needed.
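A minimal sketch of this thresholding step is shown below. The cut-off values (13, 10, and 7 on a summed score) and the rule of simply adding the three scores are assumptions made for illustration; the thresholds actually used in the analysis are not stated here. The segment names follow the ones used in the results.

```python
import pandas as pd

# Hypothetical scored customers (scores invented for illustration).
scores = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5],
    "R_score": [5, 4, 3, 2, 1],
    "F_score": [5, 4, 3, 2, 1],
    "M_score": [5, 3, 2, 2, 1],
})

# Assumption: combine the three scores by simple addition (range 3 to 15).
scores["rfm_total"] = scores[["R_score", "F_score", "M_score"]].sum(axis=1)


def assign_segment(total):
    """Map a summed RFM score to a segment using illustrative cut-offs."""
    if total >= 13:
        return "MVP"
    if total >= 10:
        return "Loyal"
    if total >= 7:
        return "Potential Loyalist"
    return "Requires Attention"


scores["segment"] = scores["rfm_total"].apply(assign_segment)

print(scores[["customer_id", "rfm_total", "segment"]])
```

Keeping the cut-offs inside a single function makes later adjustments a one-place change.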

Plotting our Results

Visualizing our results is a very effective way of checking whether our analysis makes sense within the given context, or whether something is amiss and we need to tune the model so that it correctly reflects what is happening in the underlying data.

Ultimately, visualization turns data into stories, making trends and insights easy to spot at a glance, which helps us make smarter decisions by cutting through complexity.
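As an illustration, a simple bar chart of customers per segment could be drawn with matplotlib along these lines. The ten segment labels below are invented sample data, not the Blue Tech results, and the output file name is arbitrary.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical segment labels, one per customer (not the real results).
segments = pd.Series(
    ["MVP"] * 3
    + ["Loyal"] * 2
    + ["Potential Loyalist"] * 3
    + ["Requires Attention"] * 2,
    name="segment",
)

# Customers per segment and each segment's share of the base.
counts = segments.value_counts()
shares = (counts / counts.sum() * 100).round(1)

fig, ax = plt.subplots(figsize=(7, 4))
bars = ax.bar(counts.index, counts.values, color="steelblue")

# Annotate each bar with its percentage of the customer base.
for bar, share in zip(bars, shares):
    ax.text(
        bar.get_x() + bar.get_width() / 2,
        bar.get_height(),
        f"{share}%",
        ha="center",
        va="bottom",
    )

ax.set_xlabel("Segment")
ax.set_ylabel("Number of customers")
ax.set_title("Customers per RFM segment")
fig.tight_layout()

# Save to disk; in an interactive IDE session, plt.show() works as well.
fig.savefig("rfm_segments.png")
```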

  • 2,296 Blue Tech customers (52.9% of the total base) are already MVP or Loyal customers, which solidifies the company's position in the market and shows how strongly customers identify with the brand.

  • Around 28% (1,230) of Blue Tech customers are on the Potential Loyalists list.

  • Only 18.7% of customers belong to the group that requires attention before they become candidates for churn.

💡Insights & Questions ⁉️

Next steps

What did we find?

  • Does Blue Tech have enough incentives, such as VIP perks, personalized rewards, and community recognition, to ensure that these clients stay in those groups? Can we afford to do more with the current budget?

  • Can we afford to lay out a laser-focused strategy to address potential loyalists and, especially, the "requires attention" group with:

    • Exclusive offers.

    • Engagement triggers.

    • Personalized attention.

    • Feedback loops.

    • Gamified re-engagement.

    • Limited comeback offers, etc.

    In order to boost sales among these groups?