Every product on a retail website is categorized to show where it belongs in the overall catalog. Generally, these categorizations follow a hierarchy that places the product under a Category, Subcategory and Product Type (e.g. Clothing, Shoes & Jewelry > Men > Clothing > Shirts). We call this hierarchical product categorization a Product Taxonomy. Categorizing products logically, in a way a shopper would find intuitive, makes navigation easier when he or she is browsing an e-commerce website.
In addition, a well-organized category structure makes a product easier to find through search on e-commerce websites. Search engines work by looking up query terms in an index that points to the products containing those terms. Matches in different fields carry different weight in relevance ranking.
For instance, a term that matches a word in the title indicates greater relevance than one that matches the description. Additionally, terms that are exclusive to certain products signal greater selectivity and hence contribute more to ranking. In light of this, the choice of words in the fields indicating a product’s category affects the relevance of search results for a user query. Well-chosen category terms improve discoverability, and as relevant results show up, the user experience improves in turn. A good product taxonomy contributes to increased sales by helping shoppers find relevant products while browsing or searching.
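The ranking idea described above can be sketched in a few lines of Python. This is a toy illustration, not how any particular search engine is implemented: the field weights, the two sample products, and the function names `build_index` and `search` are all assumptions made for the example.

```python
import math
import re
from collections import defaultdict

# Assumed field weights: a title match counts more than a description match.
FIELD_WEIGHTS = {"title": 3.0, "category": 2.0, "description": 1.0}

def tokenize(text):
    """Lowercase the text and keep only alphanumeric runs as terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(products):
    """Map each term to {product_id: set of fields that contain the term}."""
    index = defaultdict(lambda: defaultdict(set))
    for pid, fields in products.items():
        for field, text in fields.items():
            for term in tokenize(text):
                index[term][pid].add(field)
    return index

def search(query, index, n_products):
    """Rank product ids by summed (field weight x term rarity) over query terms."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue
        # Terms found in fewer products are more selective, so they weigh more.
        rarity = math.log(1 + n_products / len(postings))
        for pid, fields in postings.items():
            scores[pid] += max(FIELD_WEIGHTS[f] for f in fields) * rarity
    return sorted(scores, key=lambda pid: (-scores[pid], pid))

# Hypothetical sample catalog.
products = {
    "p1": {"title": "Scented candle",
           "category": "Fragrance > Candles & Home Scents > Candles",
           "description": "Lavender aroma"},
    "p2": {"title": "Brass holder",
           "category": "Home > Decor",
           "description": "Holds one candle"},
}
index = build_index(products)
ranking = search("candle", index, len(products))
```

Here the query term "candle" appears in the title of `p1` but only in the description of `p2`, so `p1` is ranked first. The words used in the category field feed the same index, which is why their choice influences what a shopper finds.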
Retail websites organize products into a taxonomy that they deem intuitive for their users and that fits the organization of their business units. Taxonomies can therefore vary significantly from one retail website to another. Since we deal with millions of products across hundreds of websites on a daily basis, we often have to work with several different taxonomies for the same product coming from different websites.
We need to align these to a common standard taxonomy for our analyses. Existing taxonomies such as the Global Product Classification (GPC) and the Google Product Taxonomy offer a standard way of representing a product. However, none of them is both complete and generic. Hence, we at DataWeave have come up with our own Standard Taxonomies for each category in e-commerce, which are generic enough to represent products on websites across different geographies.
Having a standard taxonomy for each retail product is important for our Data Orchestration pipeline. A Standard Taxonomy helps in enriching the DataWeave Retail Knowledge Graph at scale.
The information about products on most retail websites is unstructured and broken. We process this unstructured data, derive structured information from it and store it in a connected format in our Knowledge Graph. The Knowledge Graph is used in downstream applications such as Attribute Tagging and Content Analysis. It follows a standard hierarchy of four levels (L1 > L2 > L3 > L4) for all retail products.
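To make the four-level hierarchy concrete, here is a minimal sketch of how such a path could be represented and validated. The actual Knowledge Graph schema is not described in this post, so the `StandardPath` type and the `parse_path` helper are hypothetical names chosen for illustration.

```python
from typing import NamedTuple

class StandardPath(NamedTuple):
    """A taxonomy path with exactly four levels (hypothetical representation)."""
    l1: str
    l2: str
    l3: str
    l4: str

def parse_path(path: str, sep: str = ">") -> StandardPath:
    """Split an 'L1 > L2 > L3 > L4' string and reject any other depth."""
    levels = [part.strip() for part in path.split(sep)]
    if len(levels) != 4 or not all(levels):
        raise ValueError(f"expected 4 non-empty levels, got {levels}")
    return StandardPath(*levels)

# The example path from the introduction happens to have four levels.
shirts = parse_path("Clothing, Shoes & Jewelry > Men > Clothing > Shirts")
```

A fixed depth means every product can be addressed the same way, for example by reading `shirts.l4`, regardless of how deep the originating website's own taxonomy is.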
Mapping retail taxonomies is not only a requirement for the Knowledge Graph; it also has some direct business applications:
Health & Household > Health Care > Alternative Medicine > Aromatherapy > Candles
Fragrance > Candles & Home Scents > Candles
It is also used in Catalog Suggestion as a Service, where for any product we suggest the appropriate taxonomy it should follow on the website for a better browsing experience.
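The two candle paths listed above show what mapping has to achieve: different retailer paths for the same kind of product should resolve to one standard path. The sketch below uses a hand-written lookup table purely to show the input and output of such a mapping. The standard path `STANDARD_CANDLES` is invented for the example and is not DataWeave's actual standard taxonomy, and the lookup table is not the mapping method itself.

```python
def normalize(path: str) -> tuple:
    """Split an 'A > B > C' path into a tuple of trimmed, lowercased levels."""
    return tuple(level.strip().lower() for level in path.split(">"))

# Hypothetical four-level standard path, invented for this illustration.
STANDARD_CANDLES = ("Home", "Home Fragrance", "Candles", "Scented Candles")

# Hand-written lookup from retailer paths to the standard path (illustration only).
RETAILER_TO_STANDARD = {
    normalize("Health & Household > Health Care > Alternative Medicine"
              " > Aromatherapy > Candles"): STANDARD_CANDLES,
    normalize("Fragrance > Candles & Home Scents > Candles"): STANDARD_CANDLES,
}

def to_standard(retailer_path: str):
    """Return the standard path for a retailer path, or None if it is unmapped."""
    return RETAILER_TO_STANDARD.get(normalize(retailer_path))

first = to_standard(
    "Health & Household > Health Care > Alternative Medicine > Aromatherapy > Candles"
)
second = to_standard("Fragrance > Candles & Home Scents > Candles")
```

Both retailer paths resolve to the same standard path, which is what makes products from different websites comparable. A manual table like this does not scale to millions of products across hundreds of websites, which is why the mapping is a problem worth solving in its own right.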
Stay tuned for Part 2 to learn how we are solving the problem of mapping various retail taxonomies.