Anton Xavier

By: Anton Xavier on September 21st, 2018


The search challenge is breaking data standards

E-commerce  |  search

Static data standards:

Over the last decade, there has been a lot of talk about a single source of truth and creating data standards to match. However, during this period we have seen data fragmentation at an unprecedented scale. For any single product, there are hundreds if not thousands of data endpoints all being powered by different data sources with different needs and different levels of accuracy and currency. To say that brands have lost control of their product data would be a dramatic understatement.

This post makes the argument that to power the future omnichannel experience, neither a single source of truth nor the static standards created will suffice. Instead, a 360-degree product view powered by a dynamic taxonomy is key. This is what Google did to websites, and this is now needed for product information.

The case for taxonomy-powered search

I make the case for taxonomy-powered search in my other post, which offers a more detailed dive into why a taxonomy is key to more effective search. For a high-level overview, we can use LEGO as an analogy. Search without a taxonomy is the equivalent of building LEGO without the colors and without the instructions. It becomes very difficult to find exactly what you want and to build towards something cohesive and sophisticated.

To give one simple example: without a taxonomy underlying the data, a search for gluten-free products would probably be limited to matching the words ‘gluten-free’ in the product title. If a product was in fact gluten-free but did not have those words in the title, it would not appear in the results.
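The gluten-free example can be sketched in a few lines of Python. This is a minimal illustration, not a description of any real search engine: the `Product` class, the catalog entries, and both search functions are hypothetical and exist only to show the difference between matching on the title and matching on a structured attribute.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    title: str
    # Structured attributes that a taxonomy would assign (hypothetical values).
    attributes: set[str] = field(default_factory=set)


# Made-up catalog for illustration only.
CATALOG = [
    Product("Gluten-Free Oat Crackers", {"gluten-free"}),
    Product("Brown Rice Cakes", {"gluten-free", "vegan"}),
    Product("Whole Wheat Crackers"),
]


def title_search(query: str, catalog: list[Product]) -> list[str]:
    """Return titles that literally contain the query text."""
    q = query.lower()
    return [p.title for p in catalog if q in p.title.lower()]


def attribute_search(attribute: str, catalog: list[Product]) -> list[str]:
    """Return titles of products tagged with the given attribute."""
    return [p.title for p in catalog if attribute in p.attributes]


title_hits = title_search("gluten-free", CATALOG)
attribute_hits = attribute_search("gluten-free", CATALOG)
```

Here the title match finds only the crackers that spell out the claim, while the attribute match also surfaces the rice cakes, which are gluten-free but do not say so in the title.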

This is one of thousands of examples of why a taxonomy is absolutely critical to organizing product data and making it findable.

Modern search will drive dynamic taxonomies.

At this stage, most search in an e-commerce and omnichannel environment is based on a text match against the product title field. This requires the brand to populate the product title with all the attributes that they think may be relevant to finding a particular product. Not surprisingly, this has led to challenges around keyword stuffing and keyword stacking of product titles. There is a limit to how much information you can squeeze into 80 to 200 characters.

As search evolves, we are beginning to see more powerful search mechanisms which are providing shoppers with the ability to filter or search for products based on different types of attributes. As the user experience evolves further, the next generation of search experiences will be optimized to leverage long tail attributes so that shoppers can find exactly what they want.

This need to service the ever-growing long tail will drive the need to understand new data elements and attributes as they arrive in the market. The market is evolving and innovating very quickly. New products are further driving the fragmentation of consumer preferences, and established incumbent products are finding new ways to describe and to differentiate.

All of this leads to a constantly changing product data landscape. The challenge this presents is that as new data elements enter the marketplace, they need to be recognized and organized into an underlying taxonomy so that they become visible and useful for search.

For instance, if a new claim such as ‘supporting regenerative farming’ enters the market, this data element needs to be captured and organized into the taxonomy so that if someone were to search for ‘regenerative soil products’, the intent of that search could be mapped to the new claim, and the new product could be displayed.
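As a rough sketch of that flow, the Python below shows a taxonomy that can accept a new claim at runtime and then map a shopper's query to it. Everything here is an assumption made for illustration: the `DynamicTaxonomy` class, its method names, the related phrases, and the two products are hypothetical, and simple substring matching stands in for whatever intent recognition a production system would use.

```python
class DynamicTaxonomy:
    """Maps search phrases to claims; new claims can be added at any time."""

    def __init__(self) -> None:
        self._phrase_to_claim: dict[str, str] = {}

    def register_claim(self, claim: str, phrases: list[str]) -> None:
        # Index the claim itself plus any related phrases shoppers might type.
        for phrase in [claim, *phrases]:
            self._phrase_to_claim[phrase.lower()] = claim

    def claims_for_query(self, query: str) -> set[str]:
        # Naive intent mapping: a claim matches if one of its phrases
        # appears anywhere in the query text.
        q = query.lower()
        return {
            claim
            for phrase, claim in self._phrase_to_claim.items()
            if phrase in q
        }


# Hypothetical products and the claims captured from their labels.
PRODUCTS = {
    "Pasture Butter": {"supporting regenerative farming"},
    "Classic Butter": set(),
}


def search(query: str, taxonomy: DynamicTaxonomy) -> list[str]:
    wanted = taxonomy.claims_for_query(query)
    return sorted(name for name, claims in PRODUCTS.items() if wanted & claims)


taxonomy = DynamicTaxonomy()

# Before the claim is known to the taxonomy, the query finds nothing.
before = search("regenerative soil products", taxonomy)

# The new claim enters the market and is organized into the taxonomy.
taxonomy.register_claim(
    "supporting regenerative farming",
    ["regenerative soil", "regenerative agriculture"],
)
after = search("regenerative soil products", taxonomy)
```

The product data itself does not change between the two searches. Only the taxonomy does, which is what makes the product visible to the second query.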

The taxonomy that powers search needs to be dynamic to keep up with the reality of ever-changing product data.

Standards in a fragmented data world

Ever since the industry has been exchanging data, there has been a need to establish data standards to ensure effective alignment. The fascinating story about the invention of the bar code is essentially one of data alignment and standardization.

As the industry evolved to become more interoperable, the need to exchange and collaborate around standardized information increased. To meet this challenge, the industry rallied around creating data standards for supply chain related data - that being the data required to source, transport, store, and sell a product.

It could reasonably be argued that the supply chain flexibility powered by this data standardization was one of the greatest achievements of the post-industrial food system.

However, the challenge with these traditional static standards is that they have struggled to keep up with the ever-changing marketplace. As a result, the traditional purpose of the standards, to act as an aligning force between brands, retailers, and the data, is being compromised, because retailers now have to source data from outside these standards.

The inability of standards to evolve is somewhat responsible for the fragmentation of product data in the marketplace.

The static standards challenge

The purpose of creating standards was to ensure that there was a common data structure that the industry could align around for the exchange of information. By definition, the standards need to be somewhat static so that there is alignment between partners. This challenge has not changed; there is still a need to exchange a certain amount of product data between partners in a standardized format.

But this is in conflict with the new challenge of needing a data structure that is flexible and dynamic to capture the new product data that comes into the market. If data standards were changed on a daily, weekly, or even monthly basis it would run the risk of creating chaos across trading partners.

This is the dynamic data standards conundrum.

Flexible taxonomy versus static standards

The key to solving this challenge is to not create dynamic data standards, but to create a dynamic taxonomy behind the static data standards.

The data standards will act as a format for the exchange and storage of product data, as the Global Data Dictionary (GDD) does at present. Underlying that will be a dynamic taxonomy that can be leveraged to filter, search, understand, and interpret the raw data stored in this standardized format.

The dynamic taxonomy will evolve in real time as new products change the data landscape. This will ensure the integrity of the user experience, while maintaining the ability for trading partners to exchange product data effectively.
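One way to picture this separation is a fixed record format paired with a taxonomy layer that can be updated independently. The sketch below is illustrative only: the `ProductRecord` fields are hypothetical and are not taken from the GDD or any other real standard, and the phrase-based `Taxonomy` rules are a stand-in for a real interpretation engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductRecord:
    """Static exchange format. Field names are hypothetical, not from any real standard."""
    gtin: str
    name: str
    marketing_text: str


class Taxonomy:
    """Dynamic layer that interprets the raw text stored in the static record."""

    def __init__(self) -> None:
        # attribute -> phrases that indicate the attribute applies
        self.rules: dict[str, list[str]] = {}

    def add_rule(self, attribute: str, phrases: list[str]) -> None:
        self.rules.setdefault(attribute, []).extend(p.lower() for p in phrases)

    def interpret(self, record: ProductRecord) -> set[str]:
        text = record.marketing_text.lower()
        return {
            attribute
            for attribute, phrases in self.rules.items()
            if any(phrase in text for phrase in phrases)
        }


# Placeholder record; the GTIN and text are made up.
record = ProductRecord(
    gtin="00000000000000",
    name="Oat Granola",
    marketing_text="Certified gluten-free. Grown on farms supporting regenerative farming.",
)

taxonomy = Taxonomy()
taxonomy.add_rule("gluten-free", ["gluten-free", "gluten free"])
first_pass = taxonomy.interpret(record)

# The taxonomy evolves; the record and its format are untouched.
taxonomy.add_rule("regenerative-farming", ["regenerative farming"])
second_pass = taxonomy.interpret(record)
```

Because the record is frozen and never rewritten, trading partners keep exchanging the same format, while the second pass picks up an attribute that the first pass could not yet recognize.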

Label Insight is the dynamic taxonomy for the industry.

Over the last decade, we at Label Insight have worked with the leaders across government, retail, and CPG brands to create and maintain the dynamic taxonomy for the industry.

The Label Insight product data taxonomy is updated in real time. Every year, over half a million products pass through our taxonomy engines, and new data elements are uncovered, organized, and structured into the taxonomies. On average, we see over 11,246 new incoming claims from the market per week, and our recognition engines automatically work through and identify over 3,372 claims per day. This is just in the marketing claims data area.

We use this real-time taxonomy to work with retailers, brands, and their data partners to ensure that the latest, most powerful product data taxonomy is being leveraged to power search and discovery of products across the industry.





About Anton Xavier

Anton Xavier is a Co-Founder and founding CEO of Label Insight, and works closely with the marketing and the senior management teams.