Machine Learning Smart System:
Facial Skincare Product Recommendation System
Introduction
Unless consumers are well-versed in cosmetic ingredients, they have limited ability to predict whether a skincare product will work for them. User research has shown that a good portion of online shoppers rely on customer reviews to evaluate products; however, reviews may be conflicting, inauthentic, or lacking in helpful details. Additionally, as the FDA does not require premarket approval for most cosmetic products, users could benefit from a method of checking products for carcinogenic or otherwise problematic ingredients. Users may also struggle to weigh options across cost, brands, and ingredients. Overall, the most accurate test of whether a product will work is trying it, so consumers must accept the risk that a product will not work for them after they buy it. With the online beauty and personal care products industry forecasted to reach a market size of $100.7 billion in 2025, there is a pressing need to develop new methods of online shopping for beauty buyers and to increase their shopping satisfaction.
Currently, some systems recommend products to users (for example, on a shopping website), but they do so based only on a user’s browsing and/or purchasing history, or on the user's self-assessment of their skin type. In reality, skin type is more complicated than simple categories such as “oily”, “dry”, or “mature”. In this project, several machine learning models give users suggestions based on the ingredients of products that previously worked well for them, or on users who share specific skin attributes with them. The system’s value is that it helps consumers make purchasing decisions by narrowing the range of product options to those with a higher probability of working well for the user’s specific skin type.
Our Project Process
Before starting our system design, we conducted brief primary and secondary research to identify the opportunity gap between what users need and what currently exists. We then explored usable input datasets and built the system models. Finally, we evaluated our system with potential users to see whether it meets their needs and how well it fills the gap we identified.
Positioning: Who are the users and what do they need?
We identified two personas based on our research of users.
User with a beginner knowledge of skincare
These users may not know much about how skincare products work or about the effects of different ingredients. Our system introduces these users to skincare products and illustrates how a product could address their skin concerns.
User with advanced knowledge of skincare
These users have a good understanding of products and of the specific ingredients that could address their skincare concerns. Our system serves as a tool to validate their knowledge. While there are countless sources online for understanding each ingredient, our system gives a focused perspective on key knowledge about each product and its ingredients.
Our Smart System Solution: My CosmoLab
A website service that recommends suitable skincare options to users.
Dataset Selection
As the foundation of our recommendation system, we need datasets that cover most of the skincare products on the market, along with the ingredients of those products. We also need data on what each ingredient does (its attributes), as well as data from the FDA on the status of each ingredient (whether it is harmful or not). Finally, we collect data from users about their skincare concerns and the products that worked well or poorly for them.
Recommendation Model Methods
To build a strong recommendation system, we decided to use two models: 1) a neighborhood recommendation system that recommends products from similar users, and 2) matrix factorization that recommends products based on ingredients. Matrix factorization is especially suited to recommending new products, because these products have likely not been tried by many users yet. With the two models working together, we can cross-check the recommendations we output to our users.
Neighborhood Recommender System
The neighborhood recommendation system operates on the principle of word-of-mouth: a user relies on the opinions of like-minded people when evaluating a product. In our system, we aim to recommend products used by others with similar skincare concerns. We picked this model for the following reasons:
- Justifiability: Unlike other recommendation systems, where it may not be clear how the system arrived at a recommendation, the neighborhood model has a straightforward justification for its output: the recommendations come from similar users.
- Relationship based: The model learns from the relationships between items (a user’s specific concerns and product features), whereas some other recommendation models focus on the values of the items.
- Stability: The model can take on constant additions of features from both users and products without much re-training. Once a similarity is established, new users can immediately receive recommendations.
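The neighborhood idea described above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: it assumes each user is represented by a vector of skincare concerns, that similarity is measured with cosine similarity, and that products liked by the nearest neighbors are ranked by summed similarity. The user names, product names, concern ordering, and the `recommend_from_neighbors` helper are all hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_from_neighbors(target, concerns, liked, k=2):
    """Rank products liked by the k users most similar to `target`."""
    sims = {
        user: cosine_similarity(concerns[target], vector)
        for user, vector in concerns.items()
        if user != target
    }
    neighbors = sorted(sims, key=lambda u: -sims[u])[:k]
    already_liked = set(liked.get(target, ()))
    scores = {}
    for user in neighbors:
        for product in liked.get(user, ()):
            if product not in already_liked:
                # Each neighbor "votes" with a weight equal to its similarity.
                scores[product] = scores.get(product, 0.0) + sims[user]
    return sorted(scores, key=lambda p: (-scores[p], p))


# Hypothetical data. Concern vectors are ordered as [oily, dry, acne, redness].
concerns = {
    "new_user": [1, 0, 1, 0],
    "user_1": [1, 0, 1, 0],
    "user_2": [0, 1, 0, 1],
    "user_3": [1, 0, 0, 0],
}
liked = {
    "user_1": ["Cleanser X", "Serum Y"],
    "user_2": ["Cream Z"],
    "user_3": ["Serum Y"],
}

recommendations = recommend_from_neighbors("new_user", concerns, liked, k=2)
print(recommendations)
```

Because each recommendation traces back to specific neighbors and their similarity scores, the output can be explained to the user directly, which matches the justifiability point above.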
Matrix Factorization
The skincare industry is constantly launching new products, so our system should also be able to recommend new products that similar users have not yet tried, avoiding the "cold start" problem. For this scenario, we adopted the matrix factorization method, a collaborative filtering algorithm that can handle sparse data (for example, sparsity caused by cold starts or biased recommendations).
The matrix factorization method computes its values from the matrix’s two axes, which represent two sets of data: the user input data, and the product list combined with the ingredient dictionary.
For the user input data axis, the system asks users to enter either products (brand + product name) that have previously worked well for them, or their skincare concerns (e.g. oily, dry, mix, redness, acne, wrinkles). Next, the system searches the dataset of products and ingredients to 1) obtain the ingredient data and corresponding attributes, which become the features of the matrix, and 2) calculate the value of each feature as the proportion of that attribute among all ingredients. Hence, for the simplified case below, User A would get feature values of F1 = 0.2 and F2 = 0.8, and User B, who entered skincare concerns that map directly to the ingredient attributes (matrix features), would get feature values of F1 = 1 and F2 = 0.
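The proportion calculation for the user axis can be sketched as below. This is an assumption-laden illustration: the `user_feature_values` helper is hypothetical, each ingredient is assumed to carry exactly one attribute, and the attribute list for User A is invented so that it reproduces the stated values (one F1 ingredient out of five).

```python
from collections import Counter


def user_feature_values(attributes, features):
    """Share of each feature among the attributes collected for a user."""
    counts = Counter(attributes)
    total = sum(counts[f] for f in features)
    if total == 0:
        return {f: 0.0 for f in features}
    return {f: counts[f] / total for f in features}


features = ["F1", "F2"]

# User A entered products; these are the (hypothetical) attributes of their ingredients.
user_a = user_feature_values(["F1", "F2", "F2", "F2", "F2"], features)

# User B entered a single concern that maps directly to feature F1.
user_b = user_feature_values(["F1"], features)

print(user_a)
print(user_b)
```

Treating a directly entered concern as a one-item attribute list lets both input paths share the same calculation.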
For the product and ingredient data axis, the system matches the data from the product list dataset (from sephora.com) with the data from the ingredient dictionary (from paulaschoice.com) to get the ingredient attributes of each facial skincare product. With each type of ingredient attribute as a feature, the system calculates the feature values of each product by counting how many times each feature occurs in the product. Hence, for the simplified case below, Product 1 would get feature values of F1 = 3 and F2 = 1, and Product 2 would get feature values of F1 = 1 and F2 = 4.
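A minimal sketch of the product axis follows, assuming the ingredient dictionary can be reduced to a mapping from ingredient name to a single attribute. The ingredient names, the dictionary contents, and the `product_feature_counts` helper are placeholders for illustration and are not taken from the actual Sephora or Paula's Choice data.

```python
from collections import Counter


def product_feature_counts(ingredients, ingredient_dictionary, features):
    """Count how many of a product's ingredients carry each feature."""
    counts = Counter(
        ingredient_dictionary[name]
        for name in ingredients
        if name in ingredient_dictionary  # skip ingredients missing from the dictionary
    )
    return {f: counts[f] for f in features}


features = ["F1", "F2"]

# Hypothetical ingredient dictionary: ingredient name -> attribute (feature).
ingredient_dictionary = {
    "ingredient_a": "F1",
    "ingredient_b": "F1",
    "ingredient_c": "F1",
    "ingredient_d": "F2",
    "ingredient_e": "F2",
    "ingredient_f": "F2",
    "ingredient_g": "F2",
}

# Hypothetical product list: product name -> ingredient names.
product_list = {
    "Product 1": ["ingredient_a", "ingredient_b", "ingredient_c", "ingredient_d"],
    "Product 2": ["ingredient_a", "ingredient_d", "ingredient_e", "ingredient_f", "ingredient_g"],
}

product_features = {
    name: product_feature_counts(ingredients, ingredient_dictionary, features)
    for name, ingredients in product_list.items()
}
print(product_features)
```

Ingredients that do not appear in the dictionary are skipped here; a real pipeline would likely need to log or normalize such names instead.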
To explain the matrix simply, we work with only 4 users, 5 products, and 2 features (ingredient attributes). After filling in the two axes with feature values, the system calculates each cell of the matrix by multiplying the matching feature values and summing the results. For example, the cell for User A and Product 1 is: 0.2 (F1 of User A) × 3 (F1 of Product 1) + 0.8 (F2 of User A) × 1 (F2 of Product 1) = 1.4. Lastly, the system recommends to each user the product with the highest value. For the case above, User A would get Product 3; User B would get both Product 1 and Product 3, because these two products have similar ingredient attributes; User C would get Product 4; and User D would get Product 3.
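The cell calculation above is a dot product between a user's feature values and a product's feature values. The sketch below reproduces it using only the four sets of values stated in the text (User A, User B, Product 1, Product 2). The remaining users and products from the simplified case are not listed in the text, so they are left out, and the "best" product here is therefore only the best of these two. The `score_grid` and `top_products` helpers are hypothetical names.

```python
def score_grid(users, products, features):
    """Dot product of user and product feature values for every user/product pair."""
    return {
        user: {
            product: sum(u_vals[f] * p_vals[f] for f in features)
            for product, p_vals in products.items()
        }
        for user, u_vals in users.items()
    }


def top_products(row, tol=1e-9):
    """All products tied for the highest score in one user's row."""
    best = max(row.values())
    return sorted(p for p, s in row.items() if best - s <= tol)


features = ["F1", "F2"]
users = {
    "User A": {"F1": 0.2, "F2": 0.8},
    "User B": {"F1": 1.0, "F2": 0.0},
}
products = {
    "Product 1": {"F1": 3, "F2": 1},
    "Product 2": {"F1": 1, "F2": 4},
}

grid = score_grid(users, products, features)
for user, row in grid.items():
    rounded = {p: round(s, 2) for p, s in row.items()}
    print(user, rounded, "->", top_products(row))
```

Returning every product tied for the top score mirrors the case where a user receives two recommendations because the products have similar ingredient attributes.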
The Workflow of My CosmoLab
Limitations
- Ingredient weights: The products and ingredients may have varying levels of similarity. It is challenging to weigh the contributions of a product or ingredient differently without the expertise of a chemist and a dermatologist.
- Ingredient mix: The system analyses each ingredient individually and does not take into account that certain products work because of a particular combination of ingredients.
- User input skin type & skin concern: Users need a certain level of familiarity with their own skin (knowing what works well for them, or what their skin concerns are) for the system to work well.
Next Steps
- Assigning weight factors to ingredients and products in our model based on the ingredients’ attributes and chemical effects: This can be done using the expert opinions of dermatologists and chemists. Assigning weights to ingredients ensures that the system recommends not just “a” product but “the best” product for the user.
- Expert opinion on user inputs: In the long run, our platform can look to provide expert assistance and consultation to users. At the first level, users interact with a chatbot. The chatbot then redirects them to a human expert who provides consultation. This will tackle the issue of inaccurate inputs, which will in turn allow the system to provide more accurate recommendations.
Our team
- Anna Gipsov
- Ariel Yu
- Eileen Wang
- Ria Anil Jethmalani
My Role
- User research
- Smart System Design
- Recommendation system model