labs @ DataWeave

The Web is a digital outlet to data being produced by humans, machines, and processes.

So What?
Organizations, researchers, and developers are using more and more data for decision making, analytics, and application development. Businesses have started augmenting their internal data with data available on the Web to better understand consumers, monitor brands, and track competition. Governments across the globe have treasure troves of data that they are opening up now for public access.
The problem?
But this data exists in different formats, is spread across many sources, and changes over time. A huge amount of noisy public data is generated every moment.

What if:
There was a way of aggregating this data?
We could access this data and make sense out of it?
We could build data driven applications using public data?

What we are

DataWeave makes data on the Web easily accessible in both human and machine readable formats. DataWeave provides actionable data by aggregating, cleaning, organizing, visualizing, and managing millions of data points from the Web. Businesses, developers, and end users access data through APIs, dashboards, visualizations, reports, and alerts.

We build Data Products that help people and businesses make day to day operational as well as long term strategic decisions. DataWeave aggregates noisy public data from the web and transforms it into actionable insights.

Our vision is to democratize access to data and analytics. We believe in making data actionable to our customers.

At DataWeave Labs we tackle some of the hardest data problems that there are.

Data Acquisition and Organization

Deep Web Crawling of millions of data points every day, across thousands of websites spanning geographies and languages, is no easy task. On top of that, the rise of mobile apps has made App Crawling an important element of data acquisition. Extracting and normalizing the data is as challenging as crawling pages at scale.

At DataWeave, apart from scaling our crawlers to acquire data from websites and apps, we also work on areas that combine human intelligence and algorithms to tackle the problem of extraction and normalization.

Try out our data APIs. Most of the data that we curate is available through Data APIs. If you would like to build cool new applications on top of these APIs, just let us know.

Semantics

Once the data gets acquired and organized, it is very important that we understand the underlying semantics in the data.

For example: in our Retail Intelligence product, accurate product matching is essential for competitive pricing analytics. Product matching sounds simple, but what if you have thousands of products at hundreds of sites, with each item possibly named and described in a different way? While the problem is simple to state, it is notoriously difficult to address in the face of highly noisy product specifications with short descriptions. We use a variety of clustering techniques to solve this problem at scale with high accuracy.
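To make the matching problem concrete, here is a minimal sketch of one simple clustering approach: titles are split into tokens, compared by Jaccard overlap, and grouped with single-link clustering. This is an illustration only, not DataWeave's actual method; the function names, the threshold of 0.5, and the sample titles are all assumptions.

```python
import re
from itertools import combinations


def tokens(title: str) -> frozenset:
    """Lowercase a product title and split it into alphanumeric tokens."""
    return frozenset(re.findall(r"[a-z0-9]+", title.lower()))


def jaccard(a: frozenset, b: frozenset) -> float:
    """Fraction of distinct tokens that the two titles share."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_titles(titles, threshold=0.5):
    """Group titles whose token overlap reaches the threshold (single-link)."""
    parent = list(range(len(titles)))

    def find(i):
        # Union-find lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    token_sets = [tokens(t) for t in titles]
    # Pairwise comparison is quadratic; fine for a sketch, not for scale
    for i, j in combinations(range(len(titles)), 2):
        if jaccard(token_sets[i], token_sets[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i, title in enumerate(titles):
        groups.setdefault(find(i), []).append(title)
    return list(groups.values())


# Hypothetical listings of the same products from different sites
sample_titles = [
    "Apple iPhone 6 16GB Gold",
    "iPhone 6 Gold 16GB Apple",
    "Canon EOS 1200D DSLR Camera",
    "Canon EOS 1200D Camera DSLR Kit",
]
clusters = cluster_titles(sample_titles)
```

Token overlap ignores word order, which helps with reordered titles, but it cannot tell apart variants that differ by a single token (for example storage size), which is part of why the real problem is hard.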

We use Machine Learning and IR (information retrieval) algorithms to build semantic models around this data. Combined with the knowledge bases that we have been building over the last couple of years, this helps us make better sense of the data.

Computer Vision

Not all of the data we analyze is text. We also analyze images and intend to add videos as well. The challenges that we encounter here are quite unique.

For example: using deep learning techniques for image processing, we extract features like dominant colors, shapes, and patterns from objects of interest. This data is used for grouping similar products together, providing product recommendations, and fashion analytics.
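As a rough illustration of what a "dominant colors" feature looks like, the sketch below uses a plain color histogram rather than deep learning: each pixel is snapped to a coarse color bucket and the most frequent buckets are returned. The function name, the bucket count, and the sample pixels are assumptions made for this example.

```python
from collections import Counter


def dominant_colors(pixels, top_n=3, levels=4):
    """Return the top_n most frequent coarse colors among (r, g, b) pixels.

    levels is the number of buckets per channel and is assumed to divide 256.
    """
    step = 256 // levels

    def quantize(pixel):
        # Snap each channel to the centre of its bucket
        return tuple((c // step) * step + step // 2 for c in pixel)

    counts = Counter(quantize(p) for p in pixels)
    return [color for color, _ in counts.most_common(top_n)]


# Hypothetical pixel data: mostly red, some blue, a little green
sample_pixels = [(250, 10, 10)] * 6 + [(10, 10, 240)] * 3 + [(20, 200, 30)]
top_two = dominant_colors(sample_pixels, top_n=2)
```

A histogram like this says nothing about which pixels belong to the product and which to the background, which is one reason learned models are used in practice.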

Data Visualization

All the aggregation, analysis, and insights we provide are of little use if our customers are not able to consume them easily and put them into action. We believe that proper representation and visualization make these insights easy to consume.

Take a look at our data visualization library.

Real-time OLAP Query Models

As we aggregate millions of data points on a daily basis, the data that needs to be queried and analyzed keeps growing in volume and complexity. How does one plan for running complex queries across different time periods on this data?

For example: if a customer wanted to understand the kind of promotions run by their competitors over the last year that were active for more than 15 days, it becomes challenging to keep all the data queryable and return results in near real time.
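The promotion query above can be sketched as a simple in-memory filter to show the logic involved. This is only an illustration of the query semantics, not of how it is served at scale; the `Promotion` record, its field names, and the choice to measure duration over the whole promotion (rather than only the part inside the one-year window) are assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Promotion:
    competitor: str
    description: str
    start: date
    end: date  # last day the promotion was active (inclusive)


def long_running_promotions(promos, today, min_days=15, lookback_days=365):
    """Promotions active for more than min_days that overlap the lookback window."""
    window_start = today - timedelta(days=lookback_days)
    result = []
    for p in promos:
        duration = (p.end - p.start).days + 1  # both endpoints count
        overlaps_window = p.end >= window_start and p.start <= today
        if duration > min_days and overlaps_window:
            result.append(p)
    return result


# Hypothetical promotion history
history = [
    Promotion("StoreA", "20% off cameras", date(2015, 1, 1), date(2015, 1, 20)),
    Promotion("StoreB", "Weekend sale", date(2015, 3, 1), date(2015, 3, 10)),
    Promotion("StoreC", "Clearance", date(2013, 1, 1), date(2013, 3, 1)),
]
matches = long_running_promotions(history, today=date(2015, 6, 1))
```

A linear scan like this is easy to reason about, but it touches every record on every query; keeping such queries fast over daily-growing data is the hard part described above.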

Our APIs
ECommerce Price Intelligence
This data API lets users access the latest product prices across different e-commerce portals in India, updated daily. Users enter the product they want prices for and get back the results.
Coupon codes
Check all coupon code information aggregated from a list of diverse sources on the Web. One can use this API to build interesting applications / offerings using coupons.
Telecom Data
This data API provides details of all recharge options available with various telecom operators across all circles. The data is aggregated from the websites of all telecom operators.
Book Price Search By ISBN
Look up book prices using the International Standard Book Number (ISBN). The API takes the ISBN as an argument and returns prices from major e-commerce portals in India.
Commodity Prices
This data API allows access to arrivals and prices of different agricultural commodities as reported by the Agricultural Produce Market Committees (APMCs) of different states in India.
Universal Product Code (UPC)
Look up product information using the Universal Product Code (UPC). Around 10,000 products are currently mapped. The categories for which data exists include Mobiles, Cameras, and Grocery items.
Earthquakes
This data API provides data on the latest earthquakes around the world. The API takes the year and month as arguments and returns the timestamp, place, latitude, longitude, and magnitude of each earthquake.
DTH Data
This data API provides details of all DTH recharge options available with various operators across all circles in India. The data is aggregated from the websites of all respective operators on a daily basis.
World Weather Data
This data API provides weather observations, weather forecasts, and climatological information for selected cities, supplied by National Meteorological & Hydrological Services (NMHSs) worldwide. Temperature is in degrees Celsius.
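For the book price lookup by ISBN listed above, a client may want to check the ISBN locally before sending a request. The sketch below verifies the standard ISBN-10 and ISBN-13 check digits. It does not call any DataWeave endpoint, and the function name is an assumption made for this example.

```python
def is_valid_isbn(raw: str) -> bool:
    """Verify the check digit of an ISBN-10 or ISBN-13, ignoring hyphens and spaces."""
    isbn = raw.replace("-", "").replace(" ", "").upper()
    if not isbn.isascii():
        return False
    if len(isbn) == 13 and isbn.isdigit():
        # ISBN-13: weights alternate 1, 3 and the total must be divisible by 10
        total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(isbn))
        return total % 10 == 0
    if len(isbn) == 10 and isbn[:9].isdigit() and (isbn[9].isdigit() or isbn[9] == "X"):
        # ISBN-10: weights run from 10 down to 1; a final "X" stands for 10
        digits = [int(d) for d in isbn[:9]] + [10 if isbn[9] == "X" else int(isbn[9])]
        total = sum(value * (10 - i) for i, value in enumerate(digits))
        return total % 11 == 0
    return False
```

A check like this catches most typing errors before a request is made, but it cannot confirm that the ISBN belongs to a book the API knows about.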