A “data foundation” vision
Several years ago, Boehringer Ingelheim, the maker of some of the leading treatments for illnesses like type 2 diabetes, stroke, and COPD, began linking its biomarker data, such as targets, genes, and diseases, to improve how bioinformaticians work across R&D data. The company tried several different tech stacks before realizing it needed a more systemic approach: a “technical foundation” that would link data from different parts of the company and make it available to everyone in the organization.
Boehringer had been making progress in some areas with value lists and master vocabularies, but in the more complex area of computational biology it needed a more mature solution that could show how terms relate to one another. Another important consideration was external data. Boehringer sources 30% of its active ingredients from external collaborations and has limited control over the quality of the data that comes with them. It needed a flexible solution that could relate its internal experimental results to external and publicly available studies.
Their existing data lake was not up to this task. They needed a technology that could connect data regardless of its source or type: a data layer that would make data available to everyone at Boehringer and let them explore it “Wikipedia-style”. This led them to knowledge graphs.