Réseau universitaire international de Genève
Geneva International Academic Network


Linguistic Analysis and Collocation Extraction

Author(s)

Research Project

Linguistic Analysis and Collocation Extraction

Keywords

Information Technology - Web

Abstract

This paper presents a method for extracting multi-word collocations (MWCs) from text corpora, building on the prior extraction of syntactically bound collocation bigrams. We describe an iterative word-linking procedure that relies on a syntactic criterion and builds up arbitrarily long expressions, which serve as multi-word collocation candidates. We propose several measures for ranking candidates by collocational strength and present the results of a trigram extraction experiment. The methodology is particularly well suited to identifying collocations whose terms are arbitrarily distant from one another as a result of syntactic processes (passivization, relativization, dislocation, topicalization).
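
The composition step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Bigram` record, the relation labels, the rule of linking two bigrams that share exactly one token within a sentence, and the use of raw frequency as a ranking score are all assumptions made for illustration. The abstract states only that a syntactic criterion is used and that several ranking measures are proposed, without naming them.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple

Token = Tuple[str, int]  # (lemma, position in sentence); hypothetical representation


class Bigram(NamedTuple):
    """A syntactically bound word pair, as a parser-based extractor might emit it."""
    sent_id: int   # hypothetical sentence identifier
    head: Token
    dep: Token
    relation: str  # illustrative label, e.g. "verb-object"


def compose_trigrams(bigrams: Iterable[Bigram]) -> Counter:
    """Link pairs of bigrams from the same sentence that share exactly one token.

    Candidates are keyed by their alphabetically sorted lemmas, so the same
    expression is counted once whatever the surface order or distance of its
    words (for example, active versus passive sentences).
    """
    by_sentence = {}
    for bigram in bigrams:
        by_sentence.setdefault(bigram.sent_id, []).append(bigram)

    counts = Counter()
    for sent_bigrams in by_sentence.values():
        for i, first in enumerate(sent_bigrams):
            for second in sent_bigrams[i + 1:]:
                tokens = {first.head, first.dep, second.head, second.dep}
                if len(tokens) == 3:  # the two bigrams overlap in one token
                    counts[tuple(sorted(lemma for lemma, _ in tokens))] += 1
    return counts


def rank_by_frequency(counts: Counter) -> List[Tuple[Tuple[str, ...], int]]:
    """Rank candidates by raw frequency, a stand-in for an association measure."""
    return counts.most_common()


# Sentence 0: "The committee took a difficult decision."
# Sentence 1: "The difficult decision was taken."
sample = [
    Bigram(0, ("committee", 1), ("take", 2), "subject-verb"),
    Bigram(0, ("take", 2), ("decision", 5), "verb-object"),
    Bigram(0, ("difficult", 4), ("decision", 5), "adjective-noun"),
    Bigram(1, ("take", 4), ("decision", 2), "verb-object"),
    Bigram(1, ("difficult", 1), ("decision", 2), "adjective-noun"),
]

ranked = rank_by_frequency(compose_trigrams(sample))
```

Because the bigrams come from syntactic relations rather than from adjacent words, the passive sentence contributes the same candidate as the active one, which is the property the abstract highlights. Repeating the linking step on the resulting trigrams would yield longer candidates.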

File(s)

Collocation Translation Based on Sentence Alignment and Parsing
English | 127 KB
Multi-Word Collocation Extraction by Syntactic Composition of Collocation Bigrams
English | 1230 KB
Multilingual Collocation Extraction: Issues and Solutions
English | 132 KB
Induction of Syntactic Collocation Patterns from Generic Syntactic Relations
English | 14 KB
Accurate Collocation Extraction Using a Multilingual Parser
English | 126 KB
Le problème des collocations en TAL
French | 388 KB
Syntactic-based Collocation Extraction from Parallel Corpora and from the Web (PowerPoint Presentation)
English
A Tool for Multi-Word Collocation Extraction and Visualization in Multilingual Corpora
English
Using the Web as a Corpus for the Syntactic-Based Collocation Identification
English
Extraction of Multi-Word Collocations Using Syntactic Bigram Composition
English | 159 KB
Creating a Multilingual Collocation Dictionary from Large Text Corpora
English | 60 KB