Amidst the dizzying increase of content outlets, daily newsletters, and social media feeds, Dow Jones found a top-of-mind problem for their customers: how could they distill the daily onslaught of the news into actionable, personalized insight?
At this year’s Connected Data London, Clancy Childs, the General Manager of Dow Jones Knowledge Enablement, took the stage with Stardog co-founder Mike Grove to discuss how Dow Jones is reimagining the news as a knowledge graph and extracting the most important, and personalized, data for their customers around the world.
Dow Jones offers consumer products like The Wall Street Journal and Barron’s, as well as enterprise offerings like Factiva, a provider of global business content, designed to help users find, monitor, interpret and share essential information. Clancy, who leads the Knowledge Enablement effort, outlined how customers were so inundated with information they couldn’t identify what news affected their interests. Unable to decipher what was signal and what was just noise, they couldn’t take the appropriate action. One thing was clear — more information was not the solution. Dow Jones’ customers wanted the key information at their fingertips without having to sift through a pile of newsletters.
“How can we share relevant knowledge with our customers without making them read ‘the news’?”
- Clancy Childs, Dow Jones
A wealth of data
Dow Jones was sitting on a trove of proprietary data, stored in varying structures. Between their flagship publications (The Wall Street Journal), news archive aggregators (Factiva) and specialized news datasets (DNA), Dow Jones has access to millions of facts derived from fifty years of digitized news media. Factiva alone offers content from over 33,000 global news and information sources, spanning over 200 countries and 28 languages. In addition to having every significant news story, Dow Jones has each story’s associated metadata, which provides context to the reporting.
Structured data has a defined length and format and is stored in tables; unstructured data is not organized in a pre-defined manner and it’s typically text-heavy.
However, all of these data sources existed in complete silos. Theoretically, a Dow Jones employee could find every news story about any given company, but it would require searching at least six data sources. Making these stories actionable involved reading thousands of words, layering knowledge within that domain, and then using that context to identify the nugget of information needed for a “data driven” decision.
Personalization at scale
Automating the reading of the news required a technical solution that could replicate this process of building layered context around a concept. Critically, it had to incorporate data from both structured and unstructured sources. Further, Dow Jones needed a solution that scaled, updated quickly, and operated well at high volumes; after all, the news changes minute to minute. Dow Jones ultimately turned to a knowledge graph, allowing them to create a complete view of the news, comprehensible by humans and readable by machines.
Using Natural Language Processing (NLP), Dow Jones’ knowledge graph extracts entities (people, companies, events, dates) and their relationships (employed by, invested in, associated with) from the unstructured news articles. Stardog links these entities to the related data in the knowledge graph, placing these entities in the context of other news stories and historic relationships. The graph shows all the relationships to any given entity and also uncovers indirect links between entities. This allows Dow Jones’ customers to see exactly what news events impact their interests, even if an investment isn’t explicitly named within the article.
NLP extracts entities and their relationships from the text. This data is incorporated into the knowledge graph and linked to related terms.
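To make the idea of indirect links concrete, here is a minimal sketch in plain Python. It is not Stardog's API or Dow Jones' implementation. The triples, entity names, and function names are all hypothetical, and a breadth-first search stands in for whatever path queries the real graph uses.

```python
from collections import deque

# Hypothetical (subject, relationship, object) triples, shaped like the
# output an NLP extraction step might produce. All names are invented.
TRIPLES = [
    ("Jane Doe", "employed_by", "Acme Corp"),
    ("Acme Corp", "invested_in", "Widget Ltd"),
    ("Widget Ltd", "associated_with", "Port Strike"),
]


def build_graph(triples):
    """Index triples as an undirected adjacency map: node -> [(relationship, neighbor)]."""
    graph = {}
    for subject, relation, obj in triples:
        graph.setdefault(subject, []).append((relation, obj))
        graph.setdefault(obj, []).append((relation, subject))
    return graph


def find_link(graph, start, goal):
    """Breadth-first search: return the shortest chain of entities from start to goal, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for _relation, neighbor in graph[path[-1]]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_graph(TRIPLES)
# An article about the (invented) port strike never names Jane Doe,
# yet the graph still connects the two through intermediate entities.
print(find_link(graph, "Jane Doe", "Port Strike"))
```

In this toy graph the search returns the chain Jane Doe, Acme Corp, Widget Ltd, Port Strike, which is the kind of connection a reader would otherwise have to piece together from several separate articles.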
The knowledge graph also allows a level of personalization unrivaled by other news aggregators. To provide “News Signals as a Service,” Dow Jones must determine what is relevant to each customer. Dow Jones connects to the customer’s CRM (Customer Relationship Management) system, extracts customer records, and matches them to the entities in the knowledge graph. With a watchlist in place, relevant facts are delivered via an API directly to the customer’s data warehouse as news events occur. Further, these signals can be fed into automated systems, such as sales lead scoring, to make customers’ processes more intelligent.
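The watchlist flow described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the actual Dow Jones pipeline: the record fields (`account_name`, `entity_id`, `fact`), the entity IDs, and the name-normalization rule are all hypothetical, and real entity matching would be considerably more robust than comparing cleaned-up names.

```python
# Hypothetical company suffixes to ignore when comparing names.
SUFFIXES = {"inc", "corp", "ltd", "llc"}


def normalize(name):
    """Lowercase, drop punctuation and common suffixes so name variants compare equal."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(word for word in cleaned.split() if word not in SUFFIXES)


def build_watchlist(crm_records, graph_entities):
    """Match CRM account names to graph entity IDs; return the set of matched IDs."""
    index = {normalize(label): entity_id for entity_id, label in graph_entities.items()}
    watchlist = set()
    for record in crm_records:
        key = normalize(record["account_name"])
        if key in index:
            watchlist.add(index[key])
    return watchlist


def signals_for(watchlist, events):
    """Keep only the news events whose entity is on the customer's watchlist."""
    return [event for event in events if event["entity_id"] in watchlist]


# Invented sample data: graph entities, a customer's CRM export, incoming events.
graph_entities = {"ent:1": "Acme Corp.", "ent:2": "Widget Ltd"}
crm_records = [{"account_name": "ACME, Inc."}, {"account_name": "Unknown Co"}]
events = [
    {"entity_id": "ent:1", "fact": "Acme names new CFO"},
    {"entity_id": "ent:2", "fact": "Widget opens new plant"},
]

watchlist = build_watchlist(crm_records, graph_entities)
signals = signals_for(watchlist, events)
print(signals)
```

Here only the Acme event survives the filter, because Acme is the only graph entity that matches a CRM account. The resulting list is the kind of payload that could then be pushed to a data warehouse or a lead-scoring system.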
Ultimately, the ability to capture the real-world context of data allowed Dow Jones to build a more human-centered product — focusing on what is uniquely relevant to a customer and delivering personalization at scale. By using their proprietary data in innovative ways, Dow Jones maintains their leadership as the ultimate source for business news and data.