Semantically Relatable Sets: Building Blocks for Representing Semantics
By Rajat Kumar Mohanty, Anupama Dutta and Pushpak Bhattacharyya
Motivated by the fact that, ultimately, automatic language analysis is constituent detection and attachment resolution, we present our work on the problem of generating and linking semantically relatable sets (SRS) as a via media to automatic sentence analysis leading to semantics extraction. These sets are of the form <entity1, entity2> or <entity1 function-word entity2> or <function-word entity>, where the entities can be single words or more complex sentence parts (such as an embedded clause). The challenge lies in finding the components of these sets, which involves solving prepositional phrase (PP) and clause attachment problems, and empty pronominal (PRO) determination. Use is made of
(i) the parse tree of the sentence,
(ii) the subcategorization frames of lexical items,
(iii) the lexical properties of the words and
(iv) lexical resources such as WordNet and the Oxford Advanced Learner's Dictionary (OALD).
The components within the sets and the sets themselves are linked using the semantic relations of an interlingua for machine translation called the Universal Networking Language (UNL). The work forms part of a UNL-based MT system, where the source language is analysed into semantic graphs and the target language is generated from these graphs. The system has been tested on the Penn Treebank, and the results indicate the effectiveness of our approach.
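The three SRS forms named in the abstract can be pictured as a small data structure. What follows is only an illustrative sketch, not the authors' implementation: the class name `SRS`, its field names, the `form()` helper, and the example words are all assumptions introduced here, and the sketch does not attempt the attachment resolution or the UNL linking that the paper addresses.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

# Hypothetical type: an entity is a single word or a longer sentence part
# (for example, the words of an embedded clause).
Entity = Union[str, Tuple[str, ...]]


@dataclass(frozen=True)
class SRS:
    """One semantically relatable set (illustrative model only)."""
    entity2: Entity                      # the entity every form contains
    entity1: Optional[Entity] = None     # absent in <function-word entity>
    function_word: Optional[str] = None  # absent in <entity1, entity2>

    def __post_init__(self) -> None:
        # A lone entity matches none of the three forms in the abstract.
        if self.entity1 is None and self.function_word is None:
            raise ValueError("need entity1, a function word, or both")

    def form(self) -> str:
        """Name which of the three forms this set instantiates."""
        if self.entity1 is None:
            return "<function-word entity>"
        if self.function_word is None:
            return "<entity1, entity2>"
        return "<entity1 function-word entity2>"


# Made-up example sets, not taken from the paper.
examples = [
    SRS(entity1="saw", entity2="girl"),
    SRS(entity1="saw", function_word="with", entity2="telescope"),
    SRS(function_word="the", entity2="boy"),
    SRS(entity1="said", function_word="that",
        entity2=("she", "would", "come")),  # embedded clause as an entity
]

for srs in examples:
    print(srs.form(), srs)
```

Keeping `entity2` mandatory and the other two fields optional is one way to make the three forms fall out of a single record type; a tagged union with three separate classes would work equally well.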
Keywords: Semantically Relatable Sets, Syntactic and Semantic Constituents, Interlingua Based MT, Parse Trees, Lexical Properties, Argument Structure, Penn Treebank.
Source: Semantics Archive
Posted by Tony Marmo at 00:27 BST
Updated: Sunday, 18 September 2005 08:49 BST