"POLIBITS"

Research journal
on Computer science and computer engineering with applications

Issue 40 (July-December 2009)

Editorial (pp. 3-4) Alexander Gelbukh

SPECIAL SECTION:

INFORMATION RETRIEVAL AND NATURAL LANGUAGE PROCESSING

1.

David Tomás, José L. Vicedo, Empar Bisbal, and Lidia Moreno (Spain)

TrainQA: a Training Corpus for Corpus-Based Question Answering Systems (pp. 5-11)

This paper describes the development of an English corpus of factoid TREC-like question-answer pairs. The resulting corpus consists of more than 70,000 samples, each containing the following information: a question, its question type, an exact answer to the question, the different context levels (sentence, paragraph and document) where the answer occurs inside a document, and a label indicating whether the answer is correct (a positive sample) or not (a negative sample). For instance, TrainQA can be used to train a binary classifier that decides whether a given answer to a question is correct (positive) or not (negative). To our knowledge, this is the first corpus aimed at training every stage of a trainable Question Answering system: question classification, information retrieval, answer extraction and answer validation.
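The sample layout listed in the abstract can be sketched as a simple record type. This is only an illustration: the class name, field names, and the `to_validation_pair` helper are hypothetical and are not taken from the TrainQA distribution, and the way the fields are joined into one string is an arbitrary choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainQASample:
    """One question-answer sample with the fields named in the abstract.

    All field names are assumptions made for this sketch.
    """
    question: str
    question_type: str
    answer: str
    sentence: str    # sentence-level context containing the answer
    paragraph: str   # paragraph-level context
    document: str    # document-level context
    is_correct: bool  # True for a positive sample, False for a negative one


def to_validation_pair(sample: TrainQASample) -> tuple[str, int]:
    """Turn a sample into (input text, 0/1 label) for a binary answer validator."""
    text = " | ".join([sample.question, sample.answer, sample.sentence])
    return text, int(sample.is_correct)
```

A binary classifier for answer validation would then be trained on the pairs produced by `to_validation_pair`, while the question and question type fields could serve a question classifier in the same way.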

2.

This paper reports an experiment to evaluate a Cross Language Information Retrieval (CLIR) system that uses a multilingual ontology to improve query translation in the travel domain. In a user-centered experiment, the ontology-based approach significantly outperformed the Machine Readable Dictionary translation baseline, with Mean Average Precision as the metric.
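For readers unfamiliar with the metric, Mean Average Precision can be computed as in the following minimal sketch. It uses the standard definition (average precision per query, averaged over queries); the function names and the input layout are assumptions of this example, and nothing here reproduces the paper's evaluation setup.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked result list.

    ranked:   document ids in the order the system returned them
    relevant: set of document ids judged relevant for the query
    """
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant hit
    return precision_sum / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Mean of the per-query average precisions.

    runs: list of (ranked, relevant) pairs, one per query
    """
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```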

3.

Masaki Murata, Masao Utiyama, Toshiyuki Kanamaru, and Hitoshi Isahara (Japan)

English-to-Japanese Cross-Language Question-Answering System using Weighted Adding with Multiple Answers (pp. 17-22)

We describe a method of using multiple documents with decreasing weights as evidence to improve the performance of a question-answering system. We also describe how it was used in cross-language question answering (CLQA) tasks. Sometimes, the answer to a question may be found in multiple documents. In such cases, using multiple documents for prediction generates better answers than using a single document. Therefore, our method uses information from multiple documents by adding the scores of candidate answers extracted from the various documents. Because simply adding scores degrades the performance of question-answering systems, we add scores with decreasing weights to reduce the negative effect of simply adding. We used this method in the CLQA part of NTCIR-5. It was incorporated into a commercially available translation system that carries out cross-language question-answering tasks. Our method obtained relatively good CLQA results.
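The idea of adding candidate scores with decreasing weights can be sketched as follows. The abstract does not state the weighting scheme, so the geometric decay used here (the i-th best score of a candidate is multiplied by `decay ** i`) is an assumption of this example, as are the function name and the input layout.

```python
from collections import defaultdict


def combine_scores(candidates, decay=0.5):
    """Combine per-document scores of candidate answers.

    candidates: list of (answer, score) pairs, one pair per document
                in which the answer was extracted
    decay:      assumed geometric factor; 1.0 reduces to simple adding,
                0.0 keeps only the best single score
    """
    scores_by_answer = defaultdict(list)
    for answer, score in candidates:
        scores_by_answer[answer].append(score)

    combined = {}
    for answer, scores in scores_by_answer.items():
        scores.sort(reverse=True)  # strongest evidence gets the largest weight
        combined[answer] = sum(s * decay ** i for i, s in enumerate(scores))
    return combined
```

With `decay` between 0 and 1, an answer supported by several documents is rewarded, but each additional document contributes less than the previous one, which limits the effect of many weak occurrences.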

4.

Henry Anaya-Sánchez, Aurora Pons-Porrata, and Rafael Berlanga-Llavori (Cuba, Spain)

Using Sense Clustering for the Disambiguation of Words (pp. 23-28)

Clustering methods have been extensively used in the solution of many Information Processing tasks in order to capture unknown object categories. This paper presents an approach to Word Sense Disambiguation based on clustering. The underlying idea is that the clustering of word senses provides a useful way to discover semantically related senses. We evaluate our proposal regarding both fine- and coarse-grained disambiguation. Experimental results over Senseval-3 all-words, SemCor 2.0 and SemEval-2007 corpora are presented. Promising values of precision and recall are obtained.

5.

Tomoya Iwakura and Seishi Okamoto (Japan)

Improving Named Entity Extraction Accuracy using Unlabeled Data and Several Extractors (pp. 29-38)

This paper proposes feature augmentation methods using unlabeled data and several Named Entity (NE) extractors. We collect NE-related information for each word (which we call NE-related labels) from unlabeled data by using NE extractors. The NE-related labels we collect include candidate NE class labels of each word and NE class labels of co-occurring words. To collect the NE-related labels accurately from unlabeled data, we consider methods that use the outputs of several NE extractors. We use the NE-related labels as additional features for creating new NE extractors. We apply our NE extraction methods using the NE-related labels to the IREX Japanese NE extraction task. The experimental results show better accuracy than the previous results obtained with NE extractors using handcrafted resources.

6.

Farag Ahmed, Ernesto William De Luca, and Andreas Nürnberger (Germany)

Revised N-Gram based Automatic Spelling Correction Tool to Improve Retrieval Effectiveness (pp. 39-48)

We present a language-independent spell checker that is based on an enhancement of the n-gram model. The spell checker proposes correction suggestions by selecting the most promising candidates from a ranked list of correction candidates derived from n-gram statistics and lexical resources. Besides motivating and describing the developed techniques, we briefly discuss the use of the proposed approach in an application for keyword- and semantic-based search support. In addition, the proposed tool was compared with state-of-the-art spelling correction approaches. The evaluation showed that it outperforms the other methods.
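Ranking correction candidates by n-gram statistics can be illustrated with a generic character n-gram similarity. The sketch below is not the authors' revised n-gram method: it assumes padded character bigrams and the Dice coefficient as the similarity measure, and the function names and the `lexicon` parameter are hypothetical.

```python
def char_ngrams(word, n=2):
    """Set of character n-grams of a word, padded with '#' at both ends."""
    pad = "#" * (n - 1)
    padded = pad + word + pad
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def dice(a, b):
    """Dice coefficient between two n-gram sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def suggest(word, lexicon, n=2, top=3):
    """Return the `top` lexicon entries most similar to `word`."""
    target = char_ngrams(word.lower(), n)
    ranked = sorted(
        lexicon,
        key=lambda entry: dice(target, char_ngrams(entry.lower(), n)),
        reverse=True,
    )
    return ranked[:top]
```

Because the comparison only relies on character sequences and a word list, a scheme of this kind does not depend on language-specific rules, which is in the spirit of the language independence claimed in the abstract.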

7.

Valérie Bellynck, Christian Boitet, and John Kenwright (France)

Bilingual Lexical Data Contributed by Language Teachers via a Web Service: Quality vs. Quantity (pp. 49-55)

IToldU is a lightweight web service which, in its first year of use for teaching technical English in French engineering schools, enabled the contribution of just over 17,000 English terms in about twenty technical domains. These terms are associated with their French translations (95% of which are correct) and examples of use (about 85% correct). In the second year, the emphasis has been on quality rather than on quantity: about 6,000 high-quality entries have been contributed by the same number of students and classes. Some desirable extensions are in progress, e.g. to add English when this language is not included in the original language pair, and to synchronize with off-line contributions prepared on a PDA or a hand-held calculator.

REGULAR PAPERS

8.

Juan Carlos Herrera Lozada, Patricia Pérez Romero, and Magdalena Marciano Melchor (México)

Tecnología RFID Aplicada al Control de Accesos (pp. 57-62)

En el presente trabajo se expone una introducción a la tecnología RFID (Identificación por Radio Frecuencia) que prometedoramente comienza a notarse como una alternativa viable para la captura de datos y el control de recursos varios en todos los sectores. En este mismo documento se incluye un análisis de las perspectivas propias y se culmina mostrando una aplicación práctica relacionada con el control de acceso.

RFID Technology Applied to Access Control

In this paper we discuss the prospects of RFID (Radio Frequency Identification) technology, which is emerging as a promising and viable alternative for data capture and control of resources in many industrial sectors. After this discussion, we present a practical application of the technology to access control.

9.

Tran Khanh DANG and Thi Thanh Huyen PHAN (Vietnam)

An Extended Payment Model for M-Commerce with Fair Non-Repudiation Protocols (pp. 63-70)

Non-repudiation in e-commerce has recently gained a lot of interest, but its successor, non-repudiation in m-commerce, is still in its early stages. In this paper, we extend existing mobile payment models to introduce an extended mobile payment service (EMPS) model, which is based on assumptions about the cooperation between mobile network operators and financial institutions to deal with payment amounts ranging from micro to macro payments. The novel model focuses on improving non-repudiation. Fair non-repudiation protocols are developed not only for the payment phase but also for other phases of a typical m-commerce transaction, including price negotiation and content delivery. A joint signature method is used in the protocols to overcome the limited capabilities of mobile handheld devices and to reduce total dependence on trust in the payment service. In the proposed non-repudiation protocols, EMPS plays the role of a semi-trusted third party and is indispensable for achieving the fairness property. Non-repudiation analyses of these protocols are also conducted, and some guidelines for ensuring non-repudiation in m-commerce are given.

10.

Alejandro Iturri Hinojosa, Cirilo Leon Vega, and Gabriela Leija Hernández (México)

Análisis Numérico de Pérdidas de Inserción de Conmutadores Diseñados con Diodos p-i-n (pp. 71-80)

Se presenta un análisis numérico de la pérdida de inserción de conmutadores de microondas diseñados con diodos p-i-n. Se analizan las características de resistencia serie, Rs, y la capacitancia de unión, Cj, propias del modelo de circuito equivalente de los diodos p-i-n. Así mismo, se presenta a detalle la teoría de funcionamiento de los diodos p-i-n y de los conmutadores de microondas.

Numeric Analysis of the Insertion Loss in Switches Designed using the p-i-n Diodes

We present a numerical analysis of the insertion loss in microwave switches designed using p-i-n diodes. We analyze the series resistance Rs and the junction capacitance Cj, which are part of the equivalent circuit model of p-i-n diodes. We also present in detail the theory of operation of p-i-n diodes and microwave switches.

11.

Maria Aurora Molina Vilchis, Ramón Silva Ortigoza, Yasania Joselín Escalona Bautista, and Héctor Oscar Ramos García (México)

Restricción del Uso de Teléfonos Celulares en Ambientes Controlados (pp. 81-86)

Es común que se provoquen interrupciones o interferencias por el uso indiscriminado de teléfonos celulares en eventos académicos, culturales o sociales, de ahí que surja la necesidad de evitar o disminuir la recepción o transmisión de llamadas. Otras restricciones pudieran estar relacionadas con el uso de las cámaras fotográficas que incorporan estos dispositivos, la transmisión de mensajes o grabaciones de videos sin autorización. En este artículo se presenta una aplicación basada en Bluetooth para el control del uso de estos dispositivos en ambientes con restricciones.

Restriction of the Usage of Mobile Phones in Controlled Environments

Interruptions or interference often occur due to the indiscriminate use of mobile phones during academic, cultural or social events. Thus, there is a need to prevent or reduce the reception or transmission of phone calls. Other restrictions may relate to the unauthorized use of the cameras integrated in these devices, the transmission of messages, or video recording without permission. In this paper, we present a Bluetooth-based application for restricting the use of mobile phones in controlled environments.

12.

Michael Brückner and Orasa Tetiwat (Thailand)

Evaluation of E-Learning Readiness: A Study of Informational Behavior of University Students (pp. 87-92)

In this study we investigated the behavior of university students from different universities and faculties in Thailand with regard to searching for, evaluating, using and sharing information. Our goal was to prepare the introduction of personal information management into the e-learning curriculum. We compare our results with data reported by others. Method: To gather the data, we used a questionnaire in Thai, translated from the English original, which was sent to various universities in Thailand. Follow-up interviews with an adapted set of questions were carried out to generate qualitative data and a deeper insight into the knowledge and practices of the students. Analysis: Both quantitative and qualitative analyses were carried out on the data from 1,317 university students. Quantitative analysis employed the statistical package SPSS. Results: We obtained a picture of the present informational behavior of Thai students. The results showed some differences between Thai and foreign students, for example in the use of Internet search engines. The insights gained from this study will be applied in developing the part of the e-learning curriculum that deals with the students' personal information management, and may be applicable to the informational behavior of students in other countries such as Mexico and Brazil.