By: Sandra Hendren on November 16th, 2017

How Data Science Gives Label Insight a Competitive Edge

Company Viewpoint

Data science is fast becoming a core enabler of business innovation and competition. Given its prominence, you would expect the successful use of data science within companies to be on the rise, but a report from MIT says no. Surprisingly, the percentage of companies claiming competitive advantage from their data science initiatives dropped from a high of almost 70% in 2012 to about 50% in 2016. What’s going on?

On the business side, organizations are still figuring out what’s possible with data science and machine learning. The full impact on business models and operations is still emerging. Without that fundamental understanding, it is hard to change the culture of the organization to adapt to a new way of thinking. Yet that is exactly what is required to make the leaps of innovation that are essential to staying competitive in today’s business landscape.

Successful implementations of data science embed efficiencies and quality improvement into organizational processes. Label Insight is focused on bringing data science-driven improvements to the food industry and helping brands and retailers offer greater transparency to consumers. Here’s how we do it.

At Label Insight, text is scanned electronically from food product labels into our data lake, a store of free-format text exactly as it appears on the label itself. In the data lake, text is divided into major categories of information -- nutrients, ingredients, warnings, claims, logos and romance copy. For this discussion, let’s look at only one of these: ingredients.


Label Insight’s competitive advantage is the ability to parse, decipher and translate food labels from an unstructured mass of text into everyday terms that are meaningful to consumers. This is no small feat. In fact, it is especially difficult for ingredients since the names of some ingredients -- food additives in particular -- are often long, complex scientific names. At the same time, consumers rarely care about the names themselves; they simply want to know whether any of them are artificial preservatives, contain gluten, or are GMO-based, for example.
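To make the parsing step a little more concrete, here is a minimal sketch of splitting a raw ingredient statement into individual ingredient names. It is an illustration only, not Label Insight’s actual parser: the function name `split_ingredients`, the sample label text and the rule of treating parenthesized sub-ingredients as part of their parent ingredient are all assumptions made for this example.

```python
from typing import List


def split_ingredients(statement: str) -> List[str]:
    """Split a raw ingredient statement on top-level commas.

    Commas inside parentheses or brackets are kept, so a compound
    ingredient and its sub-ingredients stay together as one entry.
    (Hypothetical helper, written for illustration.)
    """
    text = statement.strip()
    # Drop a leading "Ingredients:" label if present (case-insensitive).
    if text.lower().startswith("ingredients:"):
        text = text[len("ingredients:"):]

    parts, current, depth = [], [], 0
    for ch in text:
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth = max(depth - 1, 0)
        if ch == "," and depth == 0:
            # A comma outside any brackets ends the current ingredient.
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))

    # Collapse whitespace, strip a trailing period, lowercase, skip empties.
    cleaned = (" ".join(p.split()).rstrip(". ").lower() for p in parts)
    return [p for p in cleaned if p]


# Made-up label text, used only to show the behavior.
example = "Ingredients: Enriched Flour (Wheat Flour, Niacin), Sugar, Salt."
print(split_ingredients(example))
```

Real labels are messier than this (nested brackets, "contains 2% or less of" clauses, OCR errors), which is part of why the problem is hard.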

Originally, this mapping of many complex words to a small number of meaningful consumer categories was done manually by dieticians and analysts at Label Insight. An arduous process, to be sure. Now we rely on these subject matter experts for quality-control checks and to define new ingredients as they emerge. But the bulk of that work has moved from manual classification to a machine learning approach.

Machine learning is ideally suited for exactly this type of text classification problem. Specifically, we are using a machine learning technique that is particularly adept at discovering very complex relationships in data. It maps a large number of input values to a smaller number of more general categories. In our case, we map more than 250,000 different ingredient names down to a more manageable and palatable list of 10,000 with attributes such as non-GMO, vegan, no artificial additives, etc.
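For readers who want a feel for this kind of classifier, here is a small, self-contained sketch. The specific technique is not named above, so this example swaps in a deliberately simple stand-in: a naive Bayes model over character trigrams. The class name, the three categories and the nine training examples are all invented for illustration; a production system would train on far more data and would likely use a more powerful model.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(name, n=3):
    """Return overlapping character n-grams of a space-padded, lowercased name."""
    padded = f" {name.lower().strip()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NaiveBayesIngredientClassifier:
    """Toy multinomial naive Bayes over character trigrams (illustrative only)."""

    def __init__(self):
        self.label_counts = Counter()             # training examples per category
        self.ngram_counts = defaultdict(Counter)  # trigram counts per category
        self.vocabulary = set()                   # every trigram seen in training

    def fit(self, names, labels):
        for name, label in zip(names, labels):
            self.label_counts[label] += 1
            for gram in char_ngrams(name):
                self.ngram_counts[label][gram] += 1
                self.vocabulary.add(gram)
        return self

    def predict(self, name):
        total = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus log likelihoods with add-one (Laplace) smoothing.
            score = math.log(count / total)
            denom = sum(self.ngram_counts[label].values()) + vocab_size
            for gram in char_ngrams(name):
                score += math.log((self.ngram_counts[label][gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training data: ingredient name -> consumer-facing category.
training = [
    ("sodium benzoate", "preservative"),
    ("potassium sorbate", "preservative"),
    ("calcium propionate", "preservative"),
    ("high fructose corn syrup", "sweetener"),
    ("cane sugar", "sweetener"),
    ("brown rice syrup", "sweetener"),
    ("whole wheat flour", "grain"),
    ("enriched wheat flour", "grain"),
    ("rolled oats", "grain"),
]
names, labels = zip(*training)
model = NaiveBayesIngredientClassifier().fit(names, labels)

# Names the model has not seen verbatim during training.
for unseen in ["potassium benzoate", "wheat flour", "corn syrup"]:
    print(unseen, "->", model.predict(unseen))
```

Character trigrams are used here because chemical and additive names tend to share fragments ("benzoate", "sorbate", "propionate"), so a model can generalize to names it has not seen exactly.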

These classifications not only make it easy for consumers to understand what is in the products they use and consume, they also enable brands and retailers to gain deeper insights into their product set and inventory. Furthermore, Label Insight allows manufacturers to submit their product packaging once, and continue to respond to new diets, trends or regulatory needs in real time with current and accurate data.

No other company can provide the deep data insights that we can. That’s why brands and retailers continue to rely on our unmatched data science capabilities to power transparency initiatives, support wellness programs and unlock competitive purchase trends and opportunities.

Want to learn more about how you can take advantage of our data science capabilities and drive greater success? Request a consultation with one of our data experts.

About Sandra Hendren

Label Insight Board Advisor Sandra Hendren is a Data Scientist and Analytics Strategist based in the Boston area who focuses on big data and predictive modeling.