Linguistic Analysis and Collocation Extraction
Author(s)
Dr Luka Nerima, Laboratory of Language Analysis and Language Technology (LATL), Linguistics Department, Faculty of Arts, University of Geneva (Unige).
Ms Violeta Seretan, Laboratory of Language Analysis and Language Technology (LATL), Linguistics Department, Faculty of Arts, University of Geneva (Unige).
Research Project
Linguistic Analysis and Collocation Extraction
Keywords
Information Technology - Web
Abstract
This paper presents a method for extracting multi-word collocations (MWCs) from text corpora that builds on the prior extraction of syntactically bound collocation bigrams. We describe an iterative word-linking procedure that relies on a syntactic criterion and builds up arbitrarily long expressions representing multi-word collocation candidates. We propose several measures for ranking candidates by collocational strength, and we present the results of a trigram extraction experiment. The methodology is particularly well suited to identifying collocations whose terms are arbitrarily distant from one another because of syntactic processes (passivization, relativization, dislocation, topicalization).
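The composition step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Bigram` record layout, the `compose_trigrams` function, and the sample data are all hypothetical, the syntactic criterion is reduced to "two bigrams of the same sentence share exactly one token", and raw frequency stands in for the ranking measures, which the abstract does not specify.

```python
from collections import Counter
from itertools import combinations
from typing import NamedTuple


class Bigram(NamedTuple):
    """One syntactically bound bigram instance (hypothetical layout)."""
    sentence: int  # sentence identifier
    head: tuple    # (token position, lemma) of the head word
    dep: tuple     # (token position, lemma) of the dependent word


def compose_trigrams(bigrams):
    """Link pairs of bigrams from the same sentence that share one token."""
    by_sentence = {}
    for b in bigrams:
        by_sentence.setdefault(b.sentence, []).append(b)

    counts = Counter()
    for items in by_sentence.values():
        for a, b in combinations(items, 2):
            shared = {a.head, a.dep} & {b.head, b.dep}
            tokens = {a.head, a.dep, b.head, b.dep}
            if len(shared) == 1 and len(tokens) == 3:
                # Simplification: an alphabetically sorted lemma tuple is
                # used as the key so that word order does not matter.
                key = tuple(sorted(lemma for _, lemma in tokens))
                counts[key] += 1
    return counts


# Invented bigrams for "play a major role" (sentence 1) and
# "the major role that it played" (sentence 2).
sample = [
    Bigram(1, (0, "play"), (3, "role")),
    Bigram(1, (3, "role"), (2, "major")),
    Bigram(2, (5, "play"), (2, "role")),
    Bigram(2, (2, "role"), (1, "major")),
    Bigram(2, (7, "take"), (8, "step")),  # shares no token, never linked
]
trigram_counts = compose_trigrams(sample)
```

Because linking works on token identity within a sentence rather than on adjacency, the relativized instance in sentence 2 is grouped with the canonical one in sentence 1. Repeating the same linking step on the resulting trigrams would yield longer candidates.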
File(s)
Collocation Translation Based on Sentence Alignment and Parsing
English | 127 KB
Multi-Word Collocation Extraction by Syntactic Composition of Collocation Bigrams
English | 1230 KB
Multilingual Collocation Extraction: Issues and Solutions
English | 132 KB
Induction of Syntactic Collocation Patterns from Generic Syntactic Relations
English | 14 KB
Accurate Collocation Extraction Using a Multilingual Parser
English | 126 KB
Le problème des collocations en TAL
French | 388 KB
Syntactic-based Collocation Extraction from Parallel Corpora and from the Web (PowerPoint Presentation)
English
A Tool for Multi-Word Collocation Extraction and Visualization in Multilingual Corpora
English
Using the Web as a Corpus for the Syntactic-Based Collocation Identification
English
Extraction of Multi-Word Collocations Using Syntactic Bigram Composition
English | 159 KB
Creating a Multilingual Collocation Dictionary from Large Text Corpora
English | 60 KB