Computing FOAF Co-reference Relations with Rules and Machine Learning

Date

2010-11-01

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Abstract

The Friend of a Friend (FOAF) vocabulary is widely used on the Web to describe 'agents' (people, groups, and organizations) and their properties. Since FOAF does not require unique IDs for agents, it is not clear when two FOAF instances should be linked as co-referent, i.e., as denoting the same entity in the world. One approach is to use logical constraints, such as the presence of inverse functional properties (IFPs), as evidence that two individuals are the same. Another applies heuristics based on the string similarity of values of FOAF properties such as name and school as evidence for or against co-reference. Performance is limited, however, by many factors: non-semantic string matching, noise, changes in the world, and the lack of more sophisticated graph analytics. We describe a prototype system that takes a set of FOAF agents and identifies subsets that are believed to be co-referent. The system combines logical constraints (e.g., IFPs), strong heuristics (e.g., FOAF agents described in the same file are not co-referent), and an SVM-generated classifier. We present initial results using data collected from Swoogle and other sources and describe plans for additional analysis.
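The rule-based part of such a pipeline might look roughly like the sketch below. It is an illustration only, not the system described in the abstract: the dictionary layout for agents, the function name coreferent_clusters, and the chosen list of IFPs are all assumptions made here, and the SVM classifier is omitted entirely. Agents that share a value for an inverse functional property are merged with a union-find structure, and a merge is refused whenever it would place two agents from the same source file in one cluster.

```python
from collections import defaultdict

# Assumed subset of FOAF inverse functional properties (illustrative only).
IFPS = ("mbox", "mbox_sha1sum", "homepage")


def coreferent_clusters(agents):
    """Group agents that share an IFP value, never merging same-file agents.

    Each agent is a dict with hypothetical keys "id" and "source" (the file
    it was described in), plus optional FOAF property values.
    """
    parent = {a["id"]: a["id"] for a in agents}
    # Source files covered by each cluster, keyed by the cluster's root id.
    files = {a["id"]: {a["source"]} for a in agents}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx == ry:
            return
        if files[rx] & files[ry]:
            return  # heuristic: agents from the same file are assumed distinct
        parent[ry] = rx
        files[rx] |= files.pop(ry)

    # Index agents by (property, value) for every IFP they carry.
    by_value = defaultdict(list)
    for a in agents:
        for prop in IFPS:
            value = a.get(prop)
            if value:
                by_value[(prop, value)].append(a["id"])

    # Agents sharing an IFP value are candidates for the same cluster.
    for ids in by_value.values():
        for other in ids[1:]:
            union(ids[0], other)

    clusters = defaultdict(set)
    for a in agents:
        clusters[find(a["id"])].add(a["id"])
    return sorted(sorted(c) for c in clusters.values())
```

In this sketch the same-file heuristic is treated as a hard constraint that overrides IFP evidence, and the result can depend on the order in which agents are processed. A fuller system would presumably pass the remaining uncertain pairs to a trained classifier rather than leave them unmerged.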