Methods and apparatuses for information analysis on shared and distributed computing systems

United States Patent

February 22, 2011
Pacific Northwest National Laboratory
Apparatuses and computer-implemented methods for analyzing, on shared and distributed computing systems, information comprising one or more documents are disclosed according to some aspects. In one embodiment, information analysis can comprise distributing one or more distinct sets of documents among each of a plurality of processes, wherein each process performs operations on a distinct set of documents substantially in parallel with other processes. Operations by each process can further comprise computing term statistics for terms contained in each distinct set of documents, thereby generating a local set of term statistics for each distinct set of documents. Still further, operations by each process can comprise contributing the local sets of term statistics to a global set of term statistics, and participating in generating a major term set from an assigned portion of a global vocabulary.
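The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the processes are simulated serially in a single interpreter, and every name here (`partition`, `local_term_stats`, `merge_stats`, `major_terms_for_slice`, `analyze`) is hypothetical. The abstract does not say how a term qualifies as "major", so the sketch assumes a simple document-frequency threshold (`min_df`), and it assumes whitespace tokenization and round-robin distribution of documents.

```python
from collections import Counter


def partition(docs, n_procs):
    """Round-robin split of the documents into n_procs distinct sets (assumed scheme)."""
    return [docs[i::n_procs] for i in range(n_procs)]


def local_term_stats(doc_set):
    """Per-process step: term and document frequencies for one distinct set."""
    tf, df = Counter(), Counter()
    for doc in doc_set:
        tokens = doc.lower().split()  # assumed tokenizer: lowercase + whitespace
        tf.update(tokens)             # total occurrences of each term
        df.update(set(tokens))        # number of documents containing each term
    return tf, df


def merge_stats(local_stats):
    """Combine every local set of term statistics into one global set."""
    g_tf, g_df = Counter(), Counter()
    for tf, df in local_stats:
        g_tf.update(tf)
        g_df.update(df)
    return g_tf, g_df


def major_terms_for_slice(vocab_slice, g_df, min_df):
    """Per-process step: filter an assigned portion of the global vocabulary.

    The min_df threshold is an assumption; the abstract gives no criterion.
    """
    return [term for term in vocab_slice if g_df[term] >= min_df]


def analyze(docs, n_procs=2, min_df=2):
    doc_sets = partition(docs, n_procs)
    # In a real distributed system each call below would run in its own process.
    local = [local_term_stats(s) for s in doc_sets]
    g_tf, g_df = merge_stats(local)
    vocab = sorted(g_df)
    # Assign each simulated process a portion of the global vocabulary.
    slices = [vocab[i::n_procs] for i in range(n_procs)]
    major = []
    for vocab_slice in slices:
        major.extend(major_terms_for_slice(vocab_slice, g_df, min_df))
    return g_tf, g_df, sorted(major)
```

Calling `analyze(["apple banana", "banana cherry", "cherry apple banana"], n_procs=2, min_df=3)` yields a major term set containing only `banana`, since it is the only term that appears in all three documents. In an actual shared or distributed setting, the merge step would likely be a collective reduction across processes rather than a loop over a list.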
Bohn; Shawn J. (Richland, WA), Krishnan; Manoj Kumar (Richland, WA), Cowley; Wendy E. (Richland, WA), Nieplocha; Jarek (Richland, WA)
Battelle Memorial Institute (Richland, WA)
11/540,240
September 29, 2006
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
This invention was made with Government support under Contract DE-AC05-76RL01830 awarded by the U.S. Department of Energy. The Government has certain rights in the invention.