A data infrastructure for graph-based computing that combines the natural-language expressiveness of the Semantic Web with the mathematical rigor of graph theory to discover meaningful associations across multiple sources, toward computer-assisted serendipitous insight discovery. The process automatically integrates massive datasets accessed using Semantic Web standards and technologies and normalizes the data in graphs. The process generates a plurality of conditional probability distributions based on type-triple metadata and triple statistics to model saliency, and automatically constructs and evaluates a plurality of sub-graphs based on the plurality of conditional probabilities for contextual saliency. The process then renders a plurality of paths (i.e., sequences of associations) that model meaningful pairwise relations between objects of the normalized, integrated data. The plurality of conditional probabilities reveals and ranks previously unknown associations between entities of user interest in the knowledge graph.
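By way of illustration only, the following minimal Python sketch shows one way conditional probabilities could be estimated from type-triple counts and used to score paths between two entities. It is not the claimed implementation. The toy triples, the entity types, the function names type_triple_probabilities and ranked_paths, the choice of P(predicate, object type | subject type) as the conditional distribution, the product-of-probabilities path score, and the descending sort order are all assumptions introduced for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (subject, predicate, object) triples and entity types.
TRIPLES = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "inhibits", "COX1"),
    ("ibuprofen", "inhibits", "COX1"),
    ("ibuprofen", "treats", "fever"),
    ("COX1", "associated_with", "inflammation"),
]
TYPES = {
    "aspirin": "Drug", "ibuprofen": "Drug", "COX1": "Protein",
    "headache": "Condition", "fever": "Condition", "inflammation": "Condition",
}


def type_triple_probabilities(triples, types):
    """Estimate P(predicate, object type | subject type) from triple counts."""
    # Count each type-triple (subject type, predicate, object type).
    joint = Counter((types[s], p, types[o]) for s, p, o in triples)
    # Total number of triples leaving each subject type.
    per_subject_type = Counter()
    for (subject_type, _, _), n in joint.items():
        per_subject_type[subject_type] += n
    return {key: n / per_subject_type[key[0]] for key, n in joint.items()}


def ranked_paths(triples, types, source, target, max_hops=3):
    """Enumerate simple directed paths and score each one.

    Assumption: a path's score is the product of the conditional
    probabilities of its type-triples, and paths are sorted by
    descending score.
    """
    probs = type_triple_probabilities(triples, types)
    out_edges = defaultdict(list)
    for s, p, o in triples:
        out_edges[s].append((p, o))

    results = []

    def walk(node, path, score, visited):
        if node == target and path:
            results.append((score, path))
            return
        if len(path) == max_hops:
            return
        for p, o in out_edges[node]:
            if o not in visited:
                edge_prob = probs[(types[node], p, types[o])]
                walk(o, path + [(node, p, o)], score * edge_prob, visited | {o})

    walk(source, [], 1.0, {source})
    return sorted(results, key=lambda item: item[0], reverse=True)


if __name__ == "__main__":
    for score, path in ranked_paths(TRIPLES, TYPES, "aspirin", "inflammation"):
        print(score, path)
```

In this sketch each path is a sequence of associations between a source entity and a target entity. A ranking that favors rare rather than frequent type-triples could be obtained by reversing the sort order or by scoring with negative log probabilities; the source does not specify which is used.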
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT
 This invention was made with United States government support under Contract No. DE-AC05-00OR22725 awarded by the United States Department of Energy. The United States government has certain rights in the invention.