
System for information discovery

United States Patent Application

20030097375
A1
Pacific Northwest National Laboratory
A sequence of word filters is used to eliminate terms in the database that do not discriminate document content, resulting in a filtered word set and a topic word set whose members are highly predictive of content. These two word sets are then formed into a two-dimensional matrix whose entries are calculated as the conditional probability that a document will contain the word in a row given that it contains the word in a column. The matrix representation allows the resultant vectors to be used to interpret document contents.
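The matrix construction described in the abstract can be illustrated with a minimal Python sketch. It is not the application's implementation: the function name `conditional_probability_matrix`, the toy documents, and the choice of which word set supplies the rows and which the columns are assumptions made for illustration, since the abstract does not specify them. Each entry is estimated as the fraction of documents containing the column word that also contain the row word.

```python
from typing import Dict, List, Set


def conditional_probability_matrix(
    documents: List[Set[str]],
    row_words: List[str],
    column_words: List[str],
) -> Dict[str, Dict[str, float]]:
    """Estimate P(row word in document | column word in document).

    Hypothetical helper: documents are modeled as sets of words, and the
    probability is a simple document-count ratio.
    """
    matrix: Dict[str, Dict[str, float]] = {}
    for r in row_words:
        matrix[r] = {}
        for c in column_words:
            # Documents that contain the conditioning (column) word.
            docs_with_c = [d for d in documents if c in d]
            if not docs_with_c:
                # Assumption: treat an unseen column word as probability 0.
                matrix[r][c] = 0.0
                continue
            # Of those, count the documents that also contain the row word.
            both = sum(1 for d in docs_with_c if r in d)
            matrix[r][c] = both / len(docs_with_c)
    return matrix


# Toy corpus (invented for illustration only).
docs = [
    {"reactor", "fuel", "energy"},
    {"reactor", "energy"},
    {"fuel", "market"},
    {"energy", "market"},
]

# Assumption: rows come from one word set, columns from the other.
m = conditional_probability_matrix(
    docs,
    row_words=["reactor", "fuel", "market"],
    column_words=["energy", "fuel"],
)
```

In this sketch each row of `m` is a vector of conditional probabilities for one word, which is the kind of vector the abstract says can be used to interpret document contents.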
Pennock, Kelly A. (Richland, WA), Miller, Nancy E. (Kennewick, WA)
10/298,361
November 16, 2002
[0001] This invention was made with Government support under Contract DE-AC06-76RLO 1830 awarded by the U.S. Department of Energy. The Government has certain rights in the invention.