Detecting Temporal Precursor Words and Phrases Using a Learning Algorithm and Wavelet Analysis
Most research on mammography focuses on image data, not textual reports. However, the reports associated with patient visits offer a valuable set of observations. To take advantage of these sequential writings, a robust ORNL learning algorithm assembles, searches, and analyzes cue phrases in radiology reports to determine if they define normal or abnormal traits in mammograms over time.
Description
Specifically, this system learns phrase patterns (skip bigrams) from textual documents and separates the documents into two distinct classes. The algorithm then performs longitudinal scans of mammogram records from patient visits, using the phrase patterns and a new wavelet analysis technique to detect precursors to breast abnormalities. Using this method, researchers found phrases common to both the normal and abnormal reports and detected phrase patterns that uniquely identify each of the two classes of documents.
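To make the idea of skip bigrams concrete, the following is a minimal sketch, not the ORNL learning algorithm itself. It assumes a skip bigram is an ordered pair of words from one report with at most `max_skip` words between them, and it separates the two classes with a plain set difference. The function names, the `max_skip` parameter, and the sample report texts are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def skip_bigrams(text, max_skip=2):
    """Return ordered word pairs with at most `max_skip` words between them."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pairs = []
    for i, first in enumerate(tokens):
        # The partner word may sit up to max_skip + 1 positions to the right.
        last = min(i + 2 + max_skip, len(tokens))
        for j in range(i + 1, last):
            pairs.append((first, tokens[j]))
    return pairs


def distinctive_patterns(normal_reports, abnormal_reports, max_skip=2):
    """Split skip bigrams into those seen in one class only and those shared.

    This set-difference rule is a simplification used for illustration;
    the source does not describe how the actual algorithm separates classes.
    """
    normal = Counter(
        p for r in normal_reports for p in set(skip_bigrams(r, max_skip))
    )
    abnormal = Counter(
        p for r in abnormal_reports for p in set(skip_bigrams(r, max_skip))
    )
    return {
        "normal_only": sorted(set(normal) - set(abnormal)),
        "abnormal_only": sorted(set(abnormal) - set(normal)),
        "shared": sorted(set(normal) & set(abnormal)),
    }


if __name__ == "__main__":
    # Hypothetical report fragments, not real patient data.
    result = distinctive_patterns(
        ["no suspicious mass"], ["suspicious mass noted"], max_skip=1
    )
    for label, patterns in result.items():
        print(label, patterns)
```

In this toy input the pair ("suspicious", "mass") appears in both classes, while pairs involving "no" or "noted" appear in only one, which mirrors the distinction the paragraph draws between shared phrases and class-specific phrase patterns.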
In a follow-up system, the researchers combined the textual analysis algorithm with a discrete wavelet transform, a mathematical transform that decomposes a sequence into components at different scales, to perform a temporal analysis of precursor words in medical records. A critical feature of this method is that it identifies not only the frequencies in a sequence but also the point in time at which they occur.
Benefits
- Earlier detection of cancers and abnormalities over time
- Identification of patterns that define abnormal and normal physician reports
- A single learning algorithm that can be used in an intelligent software agent
- Early detection of breast cancer and other breast abnormalities
- Cyber security and text mining applications
Robert M. Patton, Thomas E. Potok, and Barbara G. Beckerman, Detecting Temporal Precursor Words in Text Documents Using Wavelet Analysis, U.S. Patent Application 61/310,351, filed March 4, 2010.
Robert M. Patton and Thomas E. Potok, Method for Learning Phrase Patterns from Textual Documents, U.S. Patent Application 61/331,941, filed May 6, 2010.
Robert M. Patton
Computational Sciences and Engineering
Oak Ridge National Laboratory
Title and Abstract
Method and system for determining precursors of health abnormalities from processing medical records
Medical reports are converted to document vectors in computing apparatus and sampled by applying a maximum variation sampling function including a fitness function to the document vectors to reduce a number of medical records being processed and to increase the diversity of the medical records being processed. Linguistic phrases are extracted from the medical records and converted to s-grams. A Haar wavelet function is applied to the s-grams over the preselected time interval; and the coefficient results of the Haar wavelet function are examined for patterns representing the likelihood of health abnormalities. This confirms certain s-grams as precursors of the health abnormality and a parameter can be calculated in relation to the occurrence of such a health abnormality.
Oak Ridge National Laboratory, 06/25/2013
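The Haar wavelet step named in the abstract above can be illustrated with a small sketch. It is not the patented method: it assumes each s-gram has already been reduced to a count per patient visit, that the number of visits is a power of two, and that the largest detail coefficient marks the time window of interest. The function names and the sample counts are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform on an even-length list."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal):
    """Full Haar decomposition; returns (final approximation, details per level).

    details[0] is the finest scale (adjacent visits), later entries are coarser.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = [float(x) for x in signal]
    details = []
    while len(approx) > 1:
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx[0], details


def largest_change(signal):
    """Return (level, first_visit, last_visit) for the largest detail coefficient."""
    _, details = haar_dwt(signal)
    _, level, k = max(
        (abs(c), level, k)
        for level, coeffs in enumerate(details)
        for k, c in enumerate(coeffs)
    )
    width = 2 ** (level + 1)  # number of visits covered by one coefficient
    return level, k * width, (k + 1) * width - 1


if __name__ == "__main__":
    # Hypothetical counts of one s-gram over eight consecutive visits.
    counts = [0, 0, 0, 0, 0, 0, 3, 3]
    level, first, last = largest_change(counts)
    print(f"largest change at level {level}, visits {first} to {last}")
```

Because each coefficient covers a known block of visits, the transform reports both the scale of a change and where in the visit sequence it occurs, which is the property the description highlights.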