Skip to Content
Find More Like This
Return to Search

Systems and processes for identifying features and determining feature associations in groups of documents

United States Patent

January 12, 2016
View the Complete Patent at the US Patent & Trademark Office
Pacific Northwest National Laboratory - Visit the Technology Commercialization Program Website
Systems and computer-implemented processes for identification of features and determination of feature associations in a group of documents can involve providing a plurality of keywords identified among the terms of at least some of the documents. A value measure can be calculated for each keyword. High-value keywords are defined as those keywords having value measures that exceed a threshold. For each high-value keyword, term-document associations (TDA) are accessed. The TDA characterize measures of association between each term and at least some documents in the group. A processor quantifies similarities between unique pairs of high-value keywords based on the TDA for each respective high-value keyword and generates a similarity matrix that indicates one or more sets that each comprise highly associated high-value keywords.
Rose; Stuart J. (Richland, WA), Cowley; Wendy E. (Richland, WA), Crow; Vernon L. (Richland, WA)
Battelle Memorial Institute (Richland, WA)
13/ 769,629
February 18, 2013
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT This invention was made with Government support under Contract DE-AC0576RLO1830 awarded by the U.S. Department of Energy. The Government has certain rights in the invention.