Amidst the dizzying proliferation of content outlets, daily newsletters, and social media feeds, a well-known publisher of business news and financial data identified a top-of-mind problem for its customers: how could they distill the daily onslaught of news into actionable, personalized insight?
They decided to reimagine the news as a knowledge graph and extract the most important, and personalized, data for their customers around the world.
The company offers consumer media products as well as enterprise offerings designed to help users find, monitor, interpret and share essential information. Their customers were so inundated with information they couldn’t identify what news affected their interests. Unable to decipher what was signal and what was just noise, they couldn’t take the appropriate action. One thing was clear — more information was not the solution. Customers wanted the key information at their fingertips without having to sift through a pile of newsletters.
“How to share relevant knowledge with our customers without making them read ‘the news’?”
- Key Question
A wealth of data
The company was sitting on a trove of proprietary data, stored in varying structures. Between their flagship publications, news archive aggregators and specialized news datasets, they had access to millions of facts derived from fifty years of digitized news media. The aggregator alone offers content from over 33,000 global news and information sources in over 200 countries and 28 languages. Every significant news story is there, and each story has associated metadata that provides context to the reporting.
Structured data has a defined length and format and is stored in tables; unstructured data is not organized in a pre-defined manner and it’s typically text-heavy.
However, all of these data sources existed in complete silos. Theoretically, an employee could find every news story about any given company, but it would require searching at least six data sources. Making these stories actionable involved reading thousands of words, layering knowledge within that domain, and then using that context to identify the nugget of information needed for a “data-driven” decision.
Personalization at scale
Automating the reading of the news required a technical solution that could replicate this process of building layered context around a concept. Critically, it had to incorporate data from both structured and unstructured sources. Further, they needed a solution that was scalable, updated quickly, and operated well at high volumes; after all, the news changes minute to minute. The company ultimately turned to a knowledge graph, allowing them to create a complete view of the news, comprehensible by humans and readable by machines.
Using Natural Language Processing (NLP), the new knowledge graph extracts entities (people, companies, events, dates) and their relationships (employed by, invested in, associated with) from the unstructured news articles. Stardog links these entities to the related data in the knowledge graph, placing these entities in the context of other news stories and historical relationships. The graph shows all the relationships to any given entity and also uncovers indirect links between entities. This allows the company’s customers to see exactly what news events impact their interests, even if an investment isn’t explicitly named within the article.
NLP extracts entities and their relationships from the text. This data is incorporated into the knowledge graph and linked to related terms.
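To make the idea of indirect links concrete, here is a minimal sketch in Python. It is not the company's pipeline or Stardog's API: the triples are hand-written stand-ins for what an NLP step might extract, and the entity names and the `build_graph` and `find_path` helpers are hypothetical. It stores extracted facts as subject, relationship, object triples and uses a breadth-first search to find a chain connecting two entities that never appear together in a single fact.

```python
from collections import defaultdict, deque

# Hypothetical triples, standing in for facts an NLP step might extract.
TRIPLES = [
    ("Jane Doe", "employed_by", "Acme Corp"),
    ("Acme Corp", "invested_in", "Widget Labs"),
    ("Widget Labs", "associated_with", "Port Strike"),
]


def build_graph(triples):
    """Index triples as an adjacency map; links can be followed in both directions."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
        graph[obj].append((relation, subject))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search: shortest chain of entities from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for _relation, neighbor in graph[path[-1]]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_graph(TRIPLES)
path = find_path(graph, "Jane Doe", "Port Strike")
print(" -> ".join(path))
```

In this toy data, no single fact mentions both "Jane Doe" and "Port Strike", yet the search connects them through two intermediate companies. A production knowledge graph would presumably answer the same kind of question with graph queries rather than an in-memory search.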
The knowledge graph also allows a level of personalization unrivaled by other news aggregators. In order to provide “News Signals as a Service,” the company must determine what is relevant to each customer. They link to each customer’s CRM (Customer Relationship Management) system, then extract customer records and match them to the entities in the knowledge graph. With a watchlist in place, the relevant facts are delivered via an API directly to the customer’s data warehouse as news events occur. Further, these signals can be fed into automated systems, such as sales lead scoring, to make their customers’ processes more intelligent.
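The watchlist step can be sketched roughly as follows. This is an illustration under stated assumptions, not the company's matching logic: the name normalization rule, the sample records, and the `normalize`, `build_watchlist`, and `relevant_signals` helpers are all hypothetical. It matches CRM account names to graph entities by a crude normalized name, then keeps only the news events that mention a watched entity.

```python
import re

# Assumed list of corporate suffixes to ignore when comparing names.
SUFFIXES = {"inc", "corp", "corporation", "ltd", "llc"}


def normalize(name):
    """Lowercase, drop punctuation and corporate suffixes (a crude, assumed rule)."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    return " ".join(t for t in cleaned.split() if t not in SUFFIXES)


def build_watchlist(crm_records, graph_entities):
    """Return the graph entities whose normalized name matches a CRM record."""
    index = {normalize(entity): entity for entity in graph_entities}
    return {index[normalize(r)] for r in crm_records if normalize(r) in index}


def relevant_signals(events, watchlist):
    """Keep only events that mention at least one watched entity."""
    return [e for e in events if watchlist & set(e["entities"])]


# Hypothetical sample data.
crm_records = ["ACME Corp.", "Widget Labs, Inc.", "Unknown Co"]
graph_entities = ["Acme Corp", "Widget Labs", "Globex"]
events = [
    {"headline": "Acme Corp names new CEO", "entities": ["Acme Corp"]},
    {"headline": "Globex opens plant", "entities": ["Globex"]},
]

watchlist = build_watchlist(crm_records, graph_entities)
signals = relevant_signals(events, watchlist)
for signal in signals:
    print(signal["headline"])
```

Exact matching on a normalized name is the simplest possible choice and will miss aliases, abbreviations, and subsidiaries; a real system would likely rely on stable identifiers or fuzzier entity resolution. The filtered signals are the kind of record that could then be pushed to a customer's data warehouse or a lead-scoring system.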
Ultimately, the ability to capture the real-world context of data allowed the company to build a more human-centered product — focusing on what is uniquely relevant to a customer and delivering personalization at scale. By using their proprietary data in innovative ways, the company maintained their leadership as the ultimate source for business news and data.