- Logan Hochstetler
Automatic Paraphrase Discovery based on Context and Keywords between NE Pairs
Automatic paraphrase discovery is an important but challenging task. We propose an unsupervised method to discover paraphrases from a large untagged corpus, without requiring any seed phrase or other cue. We focus on phrases which connect two Named Entities (NEs), and proceed in two stages. The first stage identifies a keyword in each phrase and joins phrases with the same keyword into sets. The second stage links sets which involve the same pairs of individual NEs. A total of 13,976 phrases were grouped. The accuracy of the sets in representing paraphrase ranged from 73% to 99%, depending on the NE categories and set sizes; the accuracy of the links for two evaluated domains was 73% and 86%.
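The two stages described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the abstract does not say how the keyword is chosen, so `pick_keyword` uses a stand-in heuristic (the most frequent non-stopword across all phrases). The toy instances, the stopword list, and the `min_shared` threshold are likewise assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Toy (NE, connecting phrase, NE) instances, invented for illustration.
INSTANCES = [
    ("IBM", "agreed to buy", "Lotus"),
    ("IBM", "completed its acquisition of", "Lotus"),
    ("Oracle", "will buy", "Sun"),
    ("Oracle", "announced the acquisition of", "Sun"),
    ("Google", "plans to buy", "YouTube"),
]

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"to", "its", "of", "the", "will", "a"}


def tokenize(phrase):
    return phrase.lower().split()


def pick_keyword(phrase, freq):
    """Assumed heuristic: the non-stopword that occurs in the most phrases.

    Ties are broken alphabetically so the result is deterministic.
    """
    candidates = [w for w in tokenize(phrase) if w not in STOPWORDS]
    if not candidates:
        return None
    return max(candidates, key=lambda w: (freq[w], w))


def build_sets(instances):
    """Stage 1: join phrases that share a keyword into sets."""
    # Count, for each word, how many phrase instances contain it.
    freq = Counter(w for _, phrase, _ in instances for w in set(tokenize(phrase)))
    sets = defaultdict(lambda: {"phrases": set(), "pairs": set()})
    for ne1, phrase, ne2 in instances:
        keyword = pick_keyword(phrase, freq)
        if keyword is None:
            continue
        sets[keyword]["phrases"].add(phrase)
        sets[keyword]["pairs"].add((ne1, ne2))
    return dict(sets)


def link_sets(sets, min_shared=1):
    """Stage 2: link sets that involve the same individual NE pairs."""
    links = []
    for a, b in combinations(sorted(sets), 2):
        shared = sets[a]["pairs"] & sets[b]["pairs"]
        if len(shared) >= min_shared:
            links.append((a, b, len(shared)))
    return links


if __name__ == "__main__":
    keyword_sets = build_sets(INSTANCES)
    for keyword, group in sorted(keyword_sets.items()):
        print(keyword, sorted(group["phrases"]))
    print(link_sets(keyword_sets))
```

On this toy data, the phrases containing "buy" form one set and those containing "acquisition" form another, and the two sets are linked because both connect the (IBM, Lotus) and (Oracle, Sun) pairs. Raising `min_shared` makes linking stricter, which trades coverage for precision.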