Merchandising teams often struggle to get quick, accurate visibility into what competitors are doing across retailers. Answering questions like what’s selling, at what specs, at what price, and how it stacks up against their own catalog, all in real time, can be a challenge.
The data exists. It’s sitting on retailer websites, in product listings, in ratings and specifications and descriptions. But pulling it together, normalizing it, making it actually comparable across multiple sources – all in real time? That’s where the process falls apart.
For a leading North American retailer with thousands of SKUs across dozens of categories, this was the gap they set out to close.
The retailer’s core requirement was straightforward: the ability to enter a search query and get populated and normalized spec data. In practice, that meant:
Beyond the core ask, the retailer also needed the solution to work across both hard goods and soft goods, include their own listings alongside competitors for direct comparison, offer sortable and filterable results (by rating, best-selling, and other criteria), and scale to support multiple concurrent users running simultaneous queries.
Simple to describe. Hard to build. Because retail product data isn’t standardized. Every retailer structures its pages differently, uses different taxonomies, and displays specifications inconsistently, sometimes even within a single site. A field called “tank capacity” on one site might be labelled “reservoir size” on another. What one retailer lists as a feature bullet, another buries in a secondary spec table. Pricing formats, rating scales, review counts: none of it aligns neatly across sources.
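To make the labelling problem concrete, here is a minimal sketch of one way to map differently named spec fields onto a shared key. It is an illustration only, not DataWeave's implementation: the alias table, the `normalize_specs` function, and the sample values are all hypothetical.

```python
import re

# Hypothetical alias table: canonical field name -> labels seen on different sites.
SPEC_ALIASES = {
    "tank_capacity": ["tank capacity", "reservoir size"],
    "weight": ["weight", "item weight", "product weight"],
}


def _clean(label: str) -> str:
    """Lowercase a label and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", label.lower()).strip()


# Reverse lookup built once: cleaned alias -> canonical field name.
_LOOKUP = {
    _clean(alias): canonical
    for canonical, aliases in SPEC_ALIASES.items()
    for alias in aliases
}


def normalize_specs(raw_specs: dict) -> dict:
    """Rename known labels to canonical keys; unknown labels keep their cleaned name."""
    normalized = {}
    for label, value in raw_specs.items():
        cleaned = _clean(label)
        normalized[_LOOKUP.get(cleaned, cleaned)] = value
    return normalized


# Two retailers describing the same attribute with different labels (made-up values).
site_a = normalize_specs({"Tank Capacity": "1.5 L"})
site_b = normalize_specs({"Reservoir size:": "1.5 L"})
print(site_a == site_b)
```

A static lookup like this only covers labels someone has already seen, which is part of why the problem gets harder as retailers and categories are added.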
Layer in anti-bot mechanisms, dynamic content rendering, rate limits, and constant site structure changes, and you start to see why maintaining a reliable data pipeline for this kind of use case wasn’t feasible for the retailer’s internal teams. They needed a partner who’d already solved these problems at scale.
The retailer partnered with DataWeave to deploy a self-serve competitive intelligence platform, the first of its kind in DataWeave’s product lineup. This platform places the controls directly in users’ hands. Category managers and merchandising teams log in, run their queries, and get structured outputs on their own terms.
The platform is built around two complementary services, each designed for a different workflow.
This is the foundation. Our Data Collection API solution gives category managers direct, on-demand access to competitive product data across multiple retailers and geographies, delivered through a dashboard they control entirely on their own.
The platform supports three search modes:

All three modes can be run as a one-time crawl for a quick snapshot or set up as scheduled jobs that deliver fresh data on a recurring basis (daily, weekly, or whatever cadence the team needs). With a scheduled job, users get an ongoing feed that surfaces what matters most: latest pricing across retailers, new products being introduced to the market, shifts in availability, and changes in ratings or review volume. Essentially, a structured view of everything that’s happening on the digital shelf across their tracked categories and SKUs, updated on their schedule.
What comes back is a data feed pulled from live retailer pages as they appear to consumers, reflecting real market conditions rather than stale or cached snapshots.
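As a rough illustration of what a recurring feed makes possible, the sketch below compares two consecutive snapshots and lists new products, price changes, and availability shifts. The record layout (`price`, `in_stock`), the product IDs, and the `diff_snapshots` function are assumptions made for this example, not the platform's actual output format.

```python
def diff_snapshots(previous: dict, current: dict) -> dict:
    """Compare two crawl snapshots keyed by product ID (hypothetical record layout)."""
    changes = {"new_products": [], "price_changes": [], "availability_changes": []}
    for product_id, record in current.items():
        old = previous.get(product_id)
        if old is None:
            # Product was not present in the earlier snapshot.
            changes["new_products"].append(product_id)
            continue
        if record["price"] != old["price"]:
            changes["price_changes"].append((product_id, old["price"], record["price"]))
        if record["in_stock"] != old["in_stock"]:
            changes["availability_changes"].append(
                (product_id, old["in_stock"], record["in_stock"])
            )
    return changes


# Made-up snapshots from two consecutive scheduled runs.
last_week = {
    "sku-1": {"price": 99.99, "in_stock": True},
    "sku-2": {"price": 49.00, "in_stock": True},
}
this_week = {
    "sku-1": {"price": 89.99, "in_stock": True},
    "sku-2": {"price": 49.00, "in_stock": False},
    "sku-3": {"price": 120.00, "in_stock": True},
}

changes = diff_snapshots(last_week, this_week)
print(changes)
```

The same comparison could be extended to ratings or review counts by adding further fields to each record.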

A few things worth calling out about how the Data Collection API works:
For teams that have been relying on manual searches, periodic vendor reports, or brittle internal scrapers, the Data Collection API replaces all of that with a single, reliable, always-current data source they control themselves.
Built to complement the Data Collection API, Merch Intelligence is designed for merchandising teams who need more than just data. They need structured competitive intelligence that helps them decide which products to onboard, where spec gaps exist, and how their catalog compares to the market.

Merch Intelligence is powered by DataWeave’s proprietary AI engine, and this is where the distinction matters. The market is full of tools that wrap a generic LLM around scraped data and call it intelligence. What comes back is surface-level, because the model doing the analysis has no domain expertise. DataWeave’s AI is different. It’s built on years of experience delivering competitive insights across retail at scale, trained on the patterns, edge cases, and category-specific nuances that only come from processing millions of product records across hundreds of retailers. The result is analysis that’s expert-level, not generic.
Here are the key features that Merch Intelligence brings to the table:
How the process works:
Merch Intelligence runs a step-by-step process that takes a simple keyword search query and returns a ranked, structured competitive shortlist.

The end-to-end experience for a user is straightforward.
Stakeholders who need a competitive data feed log in to the Data Collection API, set up a crawl job (one-time or recurring), select their retailers and geographies, and get a structured output delivered automatically. They can run a quick price check on specific UPCs, scan an entire category, or monitor key SKUs across multiple store locations, and the platform handles scheduling, alerting, and recovery without any manual oversight.
A merchandising team member who needs competitive analysis enters a product keyword, selects the retailers they want to benchmark against, and lets Merch Intelligence do the rest. The system crawls, classifies, extracts, scores, and returns a structured output grouped by retailer. The file includes product name, brand, current price, rating, review count, a direct URL, subtype classification, and a structured specification block, all standardized so columns align across every retailer in the export.
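The idea of columns that align across every retailer can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual export format: the column names, the `build_export` function, and the sample products are hypothetical, and the spec keys are assumed to be normalized already.

```python
import csv
import io

# Hypothetical base columns, mirroring the fields listed in the text above.
BASE_COLUMNS = ["retailer", "product_name", "brand", "price", "rating",
                "review_count", "url", "subtype"]


def build_export(products: list) -> str:
    """Write products to CSV with one shared header, grouped by retailer."""
    # Union of every spec key seen, so each retailer's rows share the same columns.
    spec_columns = sorted({key for p in products for key in p.get("specs", {})})
    header = BASE_COLUMNS + spec_columns
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=header, restval="", lineterminator="\n")
    writer.writeheader()
    for product in sorted(products, key=lambda p: p["retailer"]):
        row = {col: product.get(col, "") for col in BASE_COLUMNS}
        row.update(product.get("specs", {}))  # missing specs are left blank via restval
        writer.writerow(row)
    return buffer.getvalue()


# Made-up products; each retailer exposes a different spec.
sample_products = [
    {"retailer": "Retailer B", "product_name": "Steam Mop X", "brand": "Acme",
     "price": 89.99, "rating": 4.4, "review_count": 210,
     "url": "https://example.com/b/1", "subtype": "steam mop",
     "specs": {"tank_capacity": "0.5 L"}},
    {"retailer": "Retailer A", "product_name": "Steam Mop Y", "brand": "Bolt",
     "price": 79.99, "rating": 4.1, "review_count": 95,
     "url": "https://example.com/a/7", "subtype": "steam mop",
     "specs": {"wattage": "1200 W"}},
]

export_text = build_export(sample_products)
print(export_text)
```

Because the header is the union of all spec keys, a spec that one retailer does not publish shows up as an empty cell rather than shifting the columns.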
What previously took hours of manual searching, copy-paste work, and offline normalization now takes a single query and a few minutes.
For this retailer, the competitive benchmarking exercise that used to take a day or more to assemble now takes minutes.
The Data Collection API gave the team on-demand access to ready-to-use competitive data across retailers, structured and normalized without any manual cleanup. Category managers could pull a pricing snapshot before a planning meeting, track category shifts on a weekly schedule, or monitor SKU-level changes across store locations, all self-serve, all in minutes.
Merch Intelligence took it a step further, delivering AI-powered actionable insights that told the team not just what competitors were doing, but what it meant for their own catalog. They could see at a glance where competitor products were outperforming on specs, where gaps existed in the market, and where price points suggested room to move. The confidence scoring and Merch Ready flagging meant they could trust the analysis without running a secondary validation step.
Together, the two layers changed how the team approached benchmarking altogether. Planning conversations became more grounded. Product briefs got more specific. Decisions moved faster.
The challenge this retailer faced isn’t unique. Across retail and brand organizations, category managers and merchandising teams are stuck between two options that don’t work well enough: manual research that’s slow and incomplete, or generic AI tools that scrape data and run it through an LLM with no domain expertise.
DataWeave’s self-serve platform offers a different path. A real-time Data Collection API that delivers structured competitive data on demand or on schedule. And an AI-powered Merch Intelligence solution, built on years of retail domain expertise, that turns that data into analysis teams can actually act on. Both accessible directly by the people who need them, without intermediaries.
Once that foundation is in place, competitive intelligence stops being a research project and starts being a routine capability. That’s the shift DataWeave enables. Reach out to us to learn more.