Cross-selling in eCommerce Matters: A Technical Guide for Upselling Online

In this article, I will describe a relatively simple method for launching a cross-sell (products offered to a customer in addition to those in the cart) feature in e-commerce, such as groceries or food delivery services, which we successfully implemented in the "Mnogo Lososya" mobile app. This is a basic collaborative filtering recommender system that combines user-based and item-based approaches and can be used in a variety of e-commerce projects, particularly those with a large number of SKUs, to provide a wide range of recommendations.

Mnogo Lososya, founded in 2018, is a chain of 50+ ghost and 250+ takeaway kitchens as well as an umbrella brand for multiple dish concepts. Our unique selling point is the 30-minute delivery of freshly cooked meals. We have been rapidly expanding and have recently passed 100k app MAU with over 100M RUB in gross monthly revenue.

The majority of our orders are made online, with one-third coming from our own mobile app and the other two-thirds coming from delivery services. The app is an important component of our product because it is one of the first points of contact, which, along with our service and the food itself, contributes to a better customer experience.

Architecture

This solution was built entirely on Yandex Cloud services, but it could equally be built on AWS, which offers all the necessary equivalents. For the convenience of AWS users, I refer to the services by their AWS names, which should be clear to most Yandex Cloud users as well.

Managed MongoDB
Lambda function
API Gateway
Cloud DNS
Mobile backend (compute cloud)

The simplified architecture looks like this:

Users place orders through the mobile app. In the ERP system, orders are created and processed.
The orders are then copied to the data warehouse during an ETL process once per day at night. Each order includes information about the ordered products as well as the customer identifier.
SQL procedures calculate user preferences and product similarity. A more detailed description of the computation is provided below. The computation yields two collections in MongoDB with the following structure:

userPref collection

- phone: we use the phone number as the user identifier
- relatedDishes
  - id: ID of a related dish
  - rank
- lastUpdateDate: date and time of the last recalculation
  
  Example of a document:

productSimilarity collection

- dishId
- relatedDishes
  - id: ID of a related dish
  - rank
- lastUpdateDate: date and time of the last recalculation

Example of a document
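To make the two structures concrete, here is a minimal sketch of what one document from each collection might look like, written as Python dictionaries. The field names follow the listings above; every value (the phone number, the dish IDs, the timestamps) is made up for illustration and does not come from the production database.

```python
from datetime import datetime, timezone

# Hypothetical userPref document: one per user, looked up by phone.
user_pref_doc = {
    "phone": "+70000000000",  # made-up identifier
    "relatedDishes": [
        {"id": "dish-D", "rank": 1},  # rank 1 is the most preferred dish
        {"id": "dish-H", "rank": 2},
    ],
    "lastUpdateDate": datetime(2023, 1, 15, 3, 0, tzinfo=timezone.utc),
}

# Hypothetical productSimilarity document: one per dish, looked up by dishId.
product_similarity_doc = {
    "dishId": "dish-A",
    "relatedDishes": [
        {"id": "dish-C", "rank": 1},  # rank 1 is the most similar dish
        {"id": "dish-D", "rank": 2},
        {"id": "dish-E", "rank": 3},
    ],
    "lastUpdateDate": datetime(2023, 1, 15, 3, 0, tzinfo=timezone.utc),
}
```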

When a user updates the cart, the app sends a request to the Lambda function. The request includes the user identifier (phone) and a list of product identifiers that are currently in the cart.
For each dish in the cart, the Lambda function fetches the related dishes from the productSimilarity collection; it also fetches the user's related dishes from the userPref collection by phone.
The Lambda function then merges these lists into one and returns it to the mobile app as a list of recommendations. The app renders them as product cards in the cart.
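The lookup step of the function can be sketched roughly as follows. This is not the production code: the two dictionaries stand in for the MongoDB collections so that the example runs on its own, and the request field names (phone, cartDishIds) as well as the function name are assumptions made for illustration.

```python
from typing import Any

# In-memory stand-ins for the two MongoDB collections (hypothetical data).
PRODUCT_SIMILARITY = {
    "A": [{"id": "C", "rank": 1}, {"id": "D", "rank": 2}],
    "B": [{"id": "F", "rank": 1}],
}
USER_PREF = {
    "+70000000000": [{"id": "D", "rank": 1}],
}


def collect_related_dishes(request: dict[str, Any]) -> list[list[dict[str, Any]]]:
    """Return one list of related dishes per cart item, plus one for the user."""
    lists = []
    for dish_id in request["cartDishIds"]:
        # A dish without a similarity document contributes an empty list.
        lists.append(PRODUCT_SIMILARITY.get(dish_id, []))
    # A user without a preference document also contributes an empty list.
    lists.append(USER_PREF.get(request["phone"], []))
    return lists


request = {"phone": "+70000000000", "cartDishIds": ["A", "B"]}
related = collect_related_dishes(request)
```

With a real database, each dictionary lookup would become a query against the corresponding collection by dishId or phone.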

Algorithm

User Preferences

We implemented user preferences based on weighted sales history, with recent sales taking priority. Consider the following arbitrary user's sales history:

Product	Sales	When	Time coefficient (1/months)	Weighted sales
A	1	this month	1	1
B	1	this month	1	1
C	4	1 month ago	0.5	2
A	4	1 month ago	0.5	2
B	3	4 months ago	0.25	0.75

The user purchased both product B and product C four times. However, because the majority of product B sales took place four months ago, we prioritize more recent product C sales. The products are sorted by total weighted sales, which are the sum of weighted sales for each product.

Product	Total weighted sales	Rank
A	3	1
C	2	2
B	1.75	3

The above example implies that the user prefers product A over product C and product C over product B.
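The arithmetic behind this ranking can be reproduced with a few lines of Python. In production the calculation runs as SQL procedures, so this is only an illustrative sketch; the function name is hypothetical, and the time coefficients are copied from the table rather than recomputed.

```python
from collections import defaultdict

# (product, sales, time coefficient) rows copied from the sales history table.
sales_history = [
    ("A", 1, 1.0),
    ("B", 1, 1.0),
    ("C", 4, 0.5),
    ("A", 4, 0.5),
    ("B", 3, 0.25),
]


def rank_user_preferences(rows):
    """Sum weighted sales per product and rank products, highest total first."""
    totals = defaultdict(float)
    for product, sales, coef in rows:
        totals[product] += sales * coef
    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(product, total, rank) for rank, (product, total) in enumerate(ordered, start=1)]


ranking = rank_user_preferences(sales_history)
# ranking == [("A", 3.0, 1), ("C", 2.0, 2), ("B", 1.75, 3)]
```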

Product Similarity

The number of orders in which pairs of products were present is used to calculate product similarity. The result is calculated separately for each month, with the most recent months taking priority. As a result, we rank similar products for each product and store them in MongoDB, where product ID is an index for the collection.
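A rough sketch of this co-occurrence calculation is shown below. The orders are hypothetical, and the recency weight 1 / (months_ago + 1) is an assumption chosen for illustration; the article does not state the exact weighting used for product similarity.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical orders: (months_ago, set of product IDs in the order).
orders = [
    (0, {"A", "C"}),
    (0, {"A", "C", "D"}),
    (1, {"A", "D"}),
    (1, {"A", "E"}),
]


def rank_similar_products(orders):
    """Rank, for each product, the products most often ordered together with it."""
    scores = defaultdict(lambda: defaultdict(float))
    for months_ago, products in orders:
        weight = 1 / (months_ago + 1)  # assumed weight: recent months count more
        # Every ordered pair of distinct products in the same order co-occurs once.
        for product, other in permutations(sorted(products), 2):
            scores[product][other] += weight
    result = {}
    for product, related in scores.items():
        # Highest score first; ties are broken by product ID for determinism.
        ordered = sorted(related.items(), key=lambda item: (-item[1], item[0]))
        result[product] = [
            {"id": other, "rank": rank}
            for rank, (other, _) in enumerate(ordered, start=1)
        ]
    return result


similarity = rank_similar_products(orders)
# For product A: C scores 2.0, D scores 1.5, E scores 0.5
```

Each entry of the result maps directly onto the relatedDishes array of one productSimilarity document.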

Recommendation

The final recommendation list combines user preferences and similar products and sorts them in ascending order of rank. In practice, we simply concatenate all related product lists and sort the result; for products that appear in more than one list, we use the average of their ranks. Here is an example:

Suppose a user now has a cart with 2 products – A and B;
Lambda selects a document from the productSimilarity collection for product A. Related dishes are C (rank 1), D (rank 2), and E (rank 3);
Lambda selects a document from the productSimilarity collection for product B. Related dishes are F (rank 1), G (rank 2), and H (rank 3);
Lambda selects a document from the userPref collection for the given phone. Related dishes are D (rank 1) and H (rank 2);
All related dishes are combined, and for duplicate dishes, the average rank is calculated. The resulting list is C (rank 1), D (rank 1.5), E (rank 3), F (rank 1), G (rank 2), and H (rank 2.5).
After sorting, the list is C, F, D, G, H, and E, and it is returned to the app.
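The merge step from this example can be sketched in Python as follows. The function name is hypothetical and the code is a simplified illustration, not the production Lambda; it reproduces the numbers from the walkthrough above.

```python
from collections import defaultdict


def merge_recommendations(related_lists):
    """Combine ranked lists, averaging the rank of dishes that appear more than once."""
    ranks = defaultdict(list)
    for related in related_lists:
        for dish in related:
            ranks[dish["id"]].append(dish["rank"])
    averaged = {dish_id: sum(r) / len(r) for dish_id, r in ranks.items()}
    # sorted() is stable, so dishes with equal average rank keep
    # the order in which they were first encountered.
    return sorted(averaged.items(), key=lambda item: item[1])


similar_to_a = [{"id": "C", "rank": 1}, {"id": "D", "rank": 2}, {"id": "E", "rank": 3}]
similar_to_b = [{"id": "F", "rank": 1}, {"id": "G", "rank": 2}, {"id": "H", "rank": 3}]
user_pref = [{"id": "D", "rank": 1}, {"id": "H", "rank": 2}]

recommendations = merge_recommendations([similar_to_a, similar_to_b, user_pref])
# Dish order: C, F, D, G, H, E
```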

Outcome

Metrics or How to Measure the Outcome

We chose the following metrics to measure the cross-sell efficiency:

The difference between the average order value (AOV) of orders that included cross-sell dishes and the AOV of orders that did not. The order value is the total sum of all products in the order, that is, how much the customer pays for it. Therefore, this metric indicates whether customers pay more for orders that include cross-sold dishes. This is the key metric because an increase in AOV is exactly what we expect from cross-selling.
Percentage of goods added from the cross-sell section in total sold goods. This is a secondary metric that is heavily influenced by the nature of the goods sold as well as the cross-sell strategy. Consider an electronics e-commerce store that cross-sells low-cost accessories, like cases and charging cables, alongside more expensive items in the cart, like smartphones and laptops. Many accessories can be cross-sold with one main item in this case, and the metric can exceed 50%. Although our case does not include such a wide variety of add-ons, this metric shows how cross-selling affects the final cart structure.
Percentage of orders containing cross-sell dishes. This is another secondary metric that displays the "popularity" of cross-sell, or how frequently customers purchase cross-sell recommended products.

Actual Results

The dataset below contains anonymized order data collected from December 2022 to January 2023 in one of the cities where Mnogo Lososya operates.

https://github.com/alexchrn/cross-sell/blob/main/orders.csv

The dataset is compiled from a variety of sources, including AppMetrica (add-to-cart events) and the ERP system (order and payment statuses, discount and payment sums).

Dataset structure:

order_id. A unique identifier of an order.
number_of_cross_sell_dishes. The number of dishes added from the cross-sell section.
status. The last known order status.
payment_status. The last known payment status.
points_number. The number of bonus points redeemed in an order. 1 point equals 1 ruble.
discount_sum. The discount in rubles, not including bonus points.
payment_summ. How much the customer paid for an order.
created_at. Date and time the order was created.
appmetrica_device_id. A unique device identifier.
app_version_name. The app version.
total_number_of_dishes. The total number of dishes in an order.
has_cross_sell_dish. Whether an order contains dishes added from the cross-sell section. This field is derived from number_of_cross_sell_dishes.
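The three metrics defined earlier can be derived from these columns roughly as follows. This is a sketch under stated assumptions: the rows below are made up rather than taken from orders.csv, the function name is hypothetical, payment_summ is assumed to be the order value, and no filtering by status or payment_status is applied because the possible values of those columns are not listed here.

```python
from statistics import mean

# Hypothetical rows that use the column names from the dataset structure.
orders = [
    {"order_id": 1, "number_of_cross_sell_dishes": 1, "total_number_of_dishes": 4, "payment_summ": 2400.0},
    {"order_id": 2, "number_of_cross_sell_dishes": 0, "total_number_of_dishes": 3, "payment_summ": 1500.0},
    {"order_id": 3, "number_of_cross_sell_dishes": 0, "total_number_of_dishes": 2, "payment_summ": 1300.0},
    {"order_id": 4, "number_of_cross_sell_dishes": 2, "total_number_of_dishes": 6, "payment_summ": 2600.0},
    {"order_id": 5, "number_of_cross_sell_dishes": 0, "total_number_of_dishes": 5, "payment_summ": 1700.0},
]


def cross_sell_metrics(rows):
    """Compute AOV with and without cross-sell, plus the two share metrics."""
    with_cs = [r for r in rows if r["number_of_cross_sell_dishes"] > 0]
    without_cs = [r for r in rows if r["number_of_cross_sell_dishes"] == 0]
    cross_sell_dishes = sum(r["number_of_cross_sell_dishes"] for r in rows)
    all_dishes = sum(r["total_number_of_dishes"] for r in rows)
    return {
        "aov_with": mean(r["payment_summ"] for r in with_cs),
        "aov_without": mean(r["payment_summ"] for r in without_cs),
        "dish_share_pct": 100 * cross_sell_dishes / all_dishes,
        "order_share_pct": 100 * len(with_cs) / len(rows),
    }


metrics = cross_sell_metrics(orders)
```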

Here are the metric values (derived from the above dataset using Python).

The percentage of dishes added from the cross-sell section in total bought dishes – 3.97%

The percentage of orders containing cross-sell dishes – 10.46%

AOV of orders which included cross-sell dishes compared to AOV of orders which did not:

As can be seen, orders with cross-sell dishes have a higher AOV with a difference of 565 RUB. The average number of dishes in such orders is also higher, which is reasonable considering that the sole goal of cross-selling is to incentivize a customer to add more dishes to their cart.

Is the difference of 565 RUB significant? We can use a t-test to check whether this difference could be due to chance. The Python SciPy library provides a function for this, scipy.stats.ttest_ind. It tests the null hypothesis that two independent samples have identical average (expected) values (1).

Thus, the p-value, the probability of observing a difference at least this large if the null hypothesis were true, is extremely low, and we reject the null hypothesis even at the 1% significance level (99% confidence). In other words, the difference in mean order value is very unlikely to be coincidental: orders with cross-sell dishes generate more revenue.
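For readers who want to repeat the test, a minimal sketch with scipy.stats.ttest_ind might look like this. The two samples are made-up order values, not the real dataset, and the call uses the function's default settings, which may differ from the ones used for the result reported here.

```python
from scipy import stats

# Hypothetical order values in RUB; in practice these would be the payment
# sums of orders with and without cross-sell dishes from the dataset.
with_cross_sell = [1900.0, 2100.0, 2300.0, 2000.0, 2200.0, 2400.0]
without_cross_sell = [1500.0, 1600.0, 1400.0, 1700.0, 1550.0, 1450.0]

# Two-sided test of the null hypothesis that both samples have the same mean.
result = stats.ttest_ind(with_cross_sell, without_cross_sell)

alpha = 0.01  # 1% significance level
reject_null = result.pvalue < alpha
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.5f}, reject H0: {reject_null}")
```

If the two groups have clearly different variances, passing equal_var=False switches the function to Welch's t-test, which does not assume equal population variances.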

Conclusion

Cross-selling can be an effective tool for increasing average order value even with simple collaborative filtering techniques. It can also be implemented relatively easily from a technical standpoint, thanks to AWS and other cloud providers' serverless services, as shown in this article.

Related materials:

1. https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html