A computational method and system for comparing and analyzing objects of information within a database or collection. All objects are compared pairwise, so that the relative similarity of each object to every other object in the collection is known. A generalized alignment-free method for comparing whole-genome (coding and non-coding) DNA sequences is described and is used to investigate the relationships among placental mammalian genomes. Differences in word feature frequency profiles (FFPs) are used to derive distances and infer evolutionary relationships.
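The pairwise, alignment-free comparison summarized above can be illustrated with a minimal sketch. It assumes that a "word" is an overlapping substring of fixed length k and that the distance between two profiles is the Jensen-Shannon divergence; the text above does not specify either choice. The function names (ffp, js_divergence, pairwise_distances) are hypothetical and chosen for illustration only.

```python
from collections import Counter
from itertools import combinations
import math


def ffp(sequence, k):
    """Return the normalized frequency profile of overlapping length-k words."""
    seq = sequence.upper()
    if len(seq) < k:
        raise ValueError("sequence is shorter than the word length k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two sparse profiles.

    Assumption: this is one plausible distance between profiles, not
    necessarily the measure used by the described method.
    """
    words = set(p) | set(q)
    # Midpoint distribution over the union of observed words.
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in words}

    def kl(a):
        # Kullback-Leibler divergence of a from the midpoint m.
        return sum(pa * math.log2(pa / m[w]) for w, pa in a.items() if pa > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def pairwise_distances(sequences, k):
    """Compare every object with every other object in the collection."""
    profiles = {name: ffp(seq, k) for name, seq in sequences.items()}
    return {
        (a, b): js_divergence(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }
```

Under these assumptions, calling pairwise_distances on a dictionary that maps names to sequences returns one distance per unordered pair of names. A tree-building procedure could then take those distances as input to infer relationships.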
STATEMENT OF GOVERNMENTAL SUPPORT
 This invention was made with government support under Contract No. DE-AC02-05CH11231 awarded by the U.S. Department of Energy and under National Institutes of Health Grant No. 3P50GM062412-0552. The government has certain rights in the invention.