Retailers often compete on price to gain market share in high-performance product categories. Brands, too, must ensure that their in-demand assortment is competitively priced across retailers. Commerce and digital shelf analytics solutions offer competitive pricing insights at granular levels, down to the individual SKU. Central to this intelligence gathering is a vital process: product matching.
Product matching, also called product mapping, involves associating identical or similar products across diverse online platforms or marketplaces. The matching process uses Artificial Intelligence (AI) to automatically connect the various representations of identical or similar products. AI models create groups or clusters of products that are exactly the same or “similar” (based on objectively defined similarity criteria) to serve different use cases for retailers and consumer brands.
Accurate product matching offers several key benefits for brands and retailers:
But product matching stands out as one of the most demanding technical processes for commerce intelligence tools. Here’s why:
Product information comes in various (multimodal) formats – text, images, and sometimes video. Each format presents its own set of challenges, from inconsistent naming conventions to varying image quality.
Data quality and quantity vary considerably across product categories, geographical regions, and websites, which adds another layer of complexity to the product matching process.
Industry-specific nuances introduce unique challenges to product matching. Exact matching may make sense in certain verticals, such as matching part numbers in industrial equipment or identifying substitute products in pharmaceuticals. But for other industries, exactly matched products may not offer accurate comparisons.
The diverse downstream business applications give rise to various flavors of product matching tailored to meet specific needs and objectives.
In essence, while product matching is a critical component in eCommerce, its intricacies demand sophisticated solutions that address the above challenges.
To solve these challenges, at DataWeave, we’ve developed an advanced product matching system using Siamese Networks, a type of machine learning model particularly suited for comparison tasks.
Our methodology involves the use of ensemble deep learning architectures, in which multiple AI models are trained and used together to ensure highly accurate matches. These models tackle NLP (natural language processing) and Computer Vision challenges specific to eCommerce. This technology helps us efficiently narrow down millions of product candidates to just 5-15 highly relevant matches.
The key to our approach is creating what we call “embeddings” – think of these as unique digital fingerprints for each product. These embeddings are designed to capture the essence of a product in a way that makes similar products easy to identify, even when they look slightly different or have different names.
Our system learns to create these embeddings by looking at millions of product pairs. It learns to make the embeddings for similar products very close to each other while keeping the embeddings for different products far apart. This process, known as metric learning, allows our system to recognize product similarities without needing to put every product into a rigid category.
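To make the metric learning idea concrete, here is a minimal sketch of a contrastive loss for a single product pair. It assumes Euclidean distance and a margin of 1.0; the function names, the margin, and the three-dimensional embeddings are illustrative assumptions, not DataWeave's actual implementation.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(emb_a, emb_b, is_match, margin=1.0):
    """Contrastive loss for one product pair (margin is an assumed value).

    Matching pairs are penalized by their squared distance, which pulls
    them together. Non-matching pairs are penalized only when they sit
    closer than `margin`, which pushes them apart.
    """
    d = euclidean_distance(emb_a, emb_b)
    if is_match:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Hypothetical 3-dimensional embeddings, for illustration only.
shoe_site_a = [0.9, 0.1, 0.0]
shoe_site_b = [0.9, 0.1, 0.0]   # the same shoe listed on another site
near_miss = [0.8, 0.1, 0.0]     # a different product that embeds too close
blender = [0.0, 0.2, 0.9]       # an unrelated product

loss_same = contrastive_loss(shoe_site_a, shoe_site_b, is_match=True)
loss_far = contrastive_loss(shoe_site_a, blender, is_match=False)
loss_close_mismatch = contrastive_loss(shoe_site_a, near_miss, is_match=False)
```

A matching pair with identical embeddings and a non-matching pair already farther apart than the margin both contribute no loss. Only the non-matching pair that sits too close is penalized, and that penalty is the signal that moves embeddings during training.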
This approach is particularly powerful for eCommerce, where we often need to match products across different websites that might use different names or images for the same item. By focusing on the key features that make each product unique, our system can accurately match products even in challenging situations.
Imagine having a pair of identical twins who are experts at spotting similarities and differences. That’s essentially what a Siamese network is – a pair of identical AI systems working together to compare things.
How it works:
The architecture of a Siamese network typically consists of three main components: the shared network, the similarity metric, and the contrastive loss function.
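The first two components can be sketched in a few lines. The example below assumes a toy encoder (one linear layer with a tanh activation) and cosine similarity as the metric; the class name, dimensions, and feature vectors are hypothetical, and the training loss is left out to keep the sketch short. The point it illustrates is that both inputs pass through the same weights.

```python
import math
import random

class SiameseSketch:
    """Two 'towers' that are really one encoder applied to both inputs."""

    def __init__(self, input_dim, embed_dim, seed=0):
        rng = random.Random(seed)
        # A single weight matrix: both products are encoded with these weights.
        self.weights = [[rng.uniform(-1, 1) for _ in range(input_dim)]
                        for _ in range(embed_dim)]

    def embed(self, features):
        """Shared network: linear layer, tanh, then L2 normalization."""
        raw = [math.tanh(sum(w * x for w, x in zip(row, features)))
               for row in self.weights]
        norm = math.sqrt(sum(v * v for v in raw)) or 1.0
        return [v / norm for v in raw]

    def similarity(self, features_a, features_b):
        """Similarity metric: cosine similarity of the two embeddings."""
        emb_a = self.embed(features_a)
        emb_b = self.embed(features_b)
        return sum(a * b for a, b in zip(emb_a, emb_b))

# Hypothetical feature vectors for three listings.
model = SiameseSketch(input_dim=4, embed_dim=3)
listing_a = [1.0, 0.0, 0.5, 0.2]
listing_b = [1.0, 0.0, 0.5, 0.2]   # same features as listing_a
listing_c = [0.0, 1.0, 0.1, 0.9]

score_same = model.similarity(listing_a, listing_b)
score_other = model.similarity(listing_a, listing_c)
```

Because the weights are shared, identical inputs always map to identical embeddings, and the similarity score does not depend on which listing is passed first.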
At DataWeave, we use Siamese Networks to match products across different retailer websites. Here’s how it works:
Matches are then assigned a high or a low similarity score and segregated into “Exact Matches” or “Similar Matches.” For example, consider a white shoe compared against two candidates. It has a low similarity score with a pink shoe, so that pair is categorized as a “Similar Match.” Meanwhile, a second candidate shoe with a high similarity score is categorized as an “Exact Match.”
Similarly, for a young girl’s dress, the matched SKU has a high similarity score, so this pair is categorized as an “Exact Match.”
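The labeling step can be sketched as a simple threshold on the similarity score. The cutoff of 0.9 and the function name are assumptions made for illustration; the article does not state the actual thresholds.

```python
def categorize_match(similarity_score, exact_threshold=0.9):
    """Label a candidate pair from its similarity score.

    `exact_threshold` is a made-up cutoff for illustration only.
    """
    if not 0.0 <= similarity_score <= 1.0:
        raise ValueError("similarity_score is expected to lie in [0, 1]")
    if similarity_score >= exact_threshold:
        return "Exact Match"
    return "Similar Match"

# Hypothetical scores for the two shoe comparisons described above.
label_high = categorize_match(0.97)
label_low = categorize_match(0.62)
```

In practice a cutoff like this would likely be tuned per category, since the right boundary between exact and similar can differ between, say, footwear and electronics.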
Siamese Networks play a pivotal role in DataWeave’s Product Matching Engine. Amid the millions of images and product descriptions online, our Siamese Networks act as an equalizing force, efficiently narrowing down millions of candidates to a curated selection of 10-15 potential matches.
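The narrowing step amounts to a nearest-neighbor search over embeddings. Below is a minimal brute-force sketch that assumes cosine similarity and a small in-memory catalog; the function names, product IDs, and vectors are hypothetical. At the scale of millions of products, a full scan like this would typically be replaced by an approximate nearest-neighbor index.

```python
import heapq
import math

def cosine_similarity(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def shortlist(query_embedding, catalog, k=15):
    """Return the k catalog entries most similar to the query, best first.

    `catalog` maps a product ID to its embedding vector.
    """
    scored = ((cosine_similarity(query_embedding, emb), pid)
              for pid, emb in catalog.items())
    return [(pid, score) for score, pid in heapq.nlargest(k, scored)]

# Hypothetical catalog with 3-dimensional embeddings.
catalog = {
    "sku-white-sneaker": [0.9, 0.1, 0.0],
    "sku-pink-sneaker": [0.7, 0.3, 0.1],
    "sku-blender": [0.0, 0.2, 0.9],
}
query = [0.9, 0.1, 0.05]
top_matches = shortlist(query, catalog, k=2)
top_ids = [pid for pid, _ in top_matches]
```

Using `heapq.nlargest` keeps only the best `k` candidates in memory while scanning, so the shortlist size stays fixed no matter how large the catalog grows.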
These networks also find application in several other contexts at DataWeave. We use them to train our system on text-only data from product titles and on joint multimodal content from product descriptions.
In summary, accurate and efficient product matching is no longer a luxury – it’s a necessity. DataWeave’s advanced product matching solution provides brands and retailers with the tools they need to navigate this complex landscape, turning the challenge of product matching into a competitive advantage.
By leveraging cutting-edge technology and simplifying it for practical use, we empower businesses to make informed decisions, optimize their operations, and stay ahead in the ever-evolving eCommerce market. To learn more, reach out to us today!