Online abstract groups, in which members aren't explicitly connected, can be automatically identified by computer-implemented methods. The methods involve harvesting records from social media and extracting content-based and structure-based features from each record. Each record includes a social-media posting and is associated with one or more entities. Each feature is stored on a data storage device and includes a computer-readable representation of an attribute of one or more records. The methods further involve grouping records into record groups according to the features of each record. Further still the methods involve calculating an n-dimensional surface representing each record group and defining an outlier as a record having feature-based distances measured from every n-dimensional surface that exceed a threshold value. Each of the n-dimensional surfaces is described by a footprint that characterizes the respective record group as an online abstract group.
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
This invention was made with Government support under Contract DE-AC0576RLO1830 awarded by the U.S. Department of Energy. The Government has certain rights in the invention.