I. Bolshakov, A. Gelbukh. Lexical functions in Spanish. Proc. CIC-98, Simposium Internacional de Computación, November 11 - 13, 1998, Mexico D.F., pp. 383 - 395.

 

Lexical Functions in Spanish

 

 

Dr. I.A. Bolshakov,

Dr. A.F. Gelbukh

 

Centro de Investigación en Computación, Instituto Politécnico Nacional,
Av. Juan de Dios Batiz, A.P. 75-476, C.P. 07738, México D.F.,
+52 (5) 729-6000, ext. 56544, 56602, fax 586-2936,
{gelbukh, igor}@pollux.cic.ipn.mx, gelbukh(?)micron.msk.ru

 

 


Abstract*

Lexical functions are a formalism for describing and using the combinatorial properties of individual lexemes[1]. They provide a powerful mechanism for syntactic transformations, lexical disambiguation in semantic analysis, and lexical choice in text generation. This mechanism is useful in automatic text analysis, machine translation, and text generation.

Introduced within the framework of the Meaning ⇔ Text theory for the Russian language, this extremely useful mechanism is very little known outside Russia. There are few publications on it in English, and there have been almost no attempts to apply it to Spanish. In this article, along with a sufficiently detailed introduction to the theory of lexical functions for the Western reader, we give numerous examples of their values for Spanish, as well as some examples of applying the corresponding rules to Spanish sentences, thus showing the applicability of this formalism to Spanish.

Key words: natural language processing, Spanish, syntax, semantics, dictionaries.

1.        Introduction

When reading a text, the most obvious task is to understand the meaning of each word, and when writing a text, to choose a word that expresses the desired meaning. Though this seems easy, it is not. Far more often than people tend to think, the meaning of a given word in the text, or the choice of the right word for a given meaning, depends on some other word in the same sentence to which the given word is syntactically related.

Really, what does the Spanish word profundo mean? Probably something like ‘having large distance between the surface and bottom’. What is then the meaning of the combination silencio profundo? If we say that profundo has a second meaning, ‘being of high degree’, how then will we choose between these two meanings in the combination, let us say, pozo profundo? Can we use this word, in its second meaning, with the word grito: *grito profundo? If not, which of the words with the same meaning of ‘high degree’ should we use with grito: alto? intenso? potente? ruidoso? And with llanto[2]? How can a foreigner or a computer answer such questions?

In this article, we discuss a formalism that makes it possible not only to describe such dependencies between lexemes in an elegant and consistent way, but also to manipulate these dependencies and use them for syntactic transformations, disambiguation, and machine translation. This formalism was first introduced by the Russian scientists Žolkovskij and Mel’čuk[3] and developed by Mel’čuk within the framework of his Meaning ⇔ Text theory [7, 13]. Since nearly all the literature on this theory, as well as most of the examples, is in Russian, we feel it necessary to give a sufficiently detailed introduction to the formalism of lexical functions itself. To our knowledge, so far there have been no serious attempts to apply this formalism to Spanish. In this article we present numerous examples of its application to Spanish words and sentences.

2.        Restrictions on lexical co-occurrences

Obviously, not all words can occur together in a text: there are restrictions on their co-occurrences. These restrictions can be of the following two quite different types.

The first type can be illustrated by the impossible co-occurrences observed in the sentence *Las ideas verdes incoloras duermen vigorosamente[4]. The co-occurrences of lexemes idea – dormir, dormir – vigorosamente, idea – verde, idea – incoloro, verde – incoloro are very strange and dubious just because of their meanings and our knowledge of the world, operating with these meanings. If such a sentence is literally translated into English, French, or any other language it keeps the same absurd “meaning.” Such restrictions in fact should not be covered by linguistic descriptions per se, since they do not depend on language.

The second type of restrictions on lexical co-occurrences can be illustrated by the following three groups of synonymous phrases given in different languages:

1.    Eng.   strong tea
      Ger.   starker (‘powerful’) Tee
      Sp.    té cargado (‘loaded’)
      Fr.    thé fort (‘forceful’)
      Rus.   krepkiy (‘firm’) chay
      Pol.   herbata mocna (‘firm’)

2.    Eng.   ask a question
      Sp.    hacer (‘make’) una pregunta
      Fr.    poser (‘put’) une question
      Rus.   zadat’ (‘give’) vopros

3.    Eng.   let out a cry
      Sp.    dar (‘give’) un grito
      Fr.    pousser (‘push’) un cri
      Rus.   ispustit’ (‘let out’) krik

Here, the restriction on lexical co-occurrences is the impossibility of certain word combinations in one language while the same combination is absolutely normal in another. For example, though in Spanish combinations like *té duro, *dar una pregunta, *dejar salir un grito are impossible, we cannot account for this by the absurdity of their meaning, since in another language (say, Russian) they sound perfectly natural. Similarly, the perfectly good Spanish phrase té cargado sounds absurd in the literal English translation *loaded tea.

In example 1, there is nothing in the meaning of well-prepared tea that can explain why one should qualify it as strong in English, powerful in German, loaded in Spanish, forceful in French, and firm in Russian and Polish. Similarly, in examples 2 and 3, the appropriate verb to be used together with a given noun cannot be selected on the basis of the meaning of the latter: there is nothing in the meaning of question to explain why in English you ask it, in Spanish you make it, in French you put it, and in Russian you give it, but never ask, make, or put it.

As we see, this type of restriction is a property of the corresponding languages and thus is to be studied within linguistics. In fact, this type of restriction on lexical co-occurrences is one of the main objects of lexicographic description. To reflect this type of restriction, the Meaning ⇔ Text Model uses so-called lexical functions.

3.        Lexical functions formalism

Lexical functions (LFs) are a formalism for describing the combinatorial abilities of individual lexemes. This description is relevant at the deep levels of language, i.e., the deep syntactic and semantic ones.

A lexical function F associates with its argument lexeme L another lexeme, or a set of quasi-synonymous lexemes,  that expresses some standard abstract meaning and can play a specific role when used in the text in a syntactic relation with L.

For instance, the lexical function Magn specifies for a noun N an adjective, or a word combination of adjectival type, which expresses the meaning of great intensity or magnitude of the main quality of N. Informally, we can say that Magn (L) answers the question: How does one say “very” about L? Thus, the phrases of example 1 from the previous section may be rewritten as follows:

Eng.   Magn (tea) = strong
Ger.   Magn (Tee) = starker
Sp.    Magn (té) = cargado
Fr.    Magn (thé) = fort
Rus.   Magn (chay) = krepkiy
Pol.   Magn (herbata) = mocna

(How does one say “very” about té in Spanish? One says that it is cargado.)

Here are several additional Spanish examples: Magn (conocimientos) = sólidos, Magn (sabio) = eminente, Magn (odio) = a muerte. The last example shows that the value can be an adjectival word combination. More examples of this lexical function for nouns can be found in the Appendix.
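A minimal sketch of how such values could be stored and queried, assuming a plain lookup table keyed by language and lexeme. The language codes, the table name MAGN, and the function name magn are illustrative choices, not part of the formalism; the data are only the examples given above.

```python
# Toy table of Magn values, copied from the examples in the text.
# The two-letter language codes are an assumption made for this sketch.
MAGN = {
    ("en", "tea"): ["strong"],
    ("de", "Tee"): ["starker"],
    ("es", "té"): ["cargado"],
    ("fr", "thé"): ["fort"],
    ("ru", "chay"): ["krepkiy"],
    ("pl", "herbata"): ["mocna"],
    ("es", "conocimientos"): ["sólidos"],
    ("es", "sabio"): ["eminente"],
    ("es", "odio"): ["a muerte"],
}


def magn(lang, lexeme):
    """Return the recorded Magn values for a lexeme, or [] if none is recorded."""
    return MAGN.get((lang, lexeme), [])


print(magn("es", "té"))
print(magn("es", "odio"))
```

The values are lists because a lexical function may return a set of quasi-synonymous lexemes rather than a single one.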

Some lexical functions, and the rules that use them, rely on the notion of the so-called actants of a word. This notion roughly corresponds to a valency: an actant is a complement of the word that describes a necessary participant of the situation referred to by the word. For example, the situations referred to by the words vender or venta involve the following roles: the seller, the buyer, the goods, and the price. Consequently, they allow for the corresponding complements in the sentence: El director no ha aprobado la venta de computadoras en mil dólares por el instituto a la compañía estadounidense. Thus, these words have 4 actants, or valencies, each.

The actants are referred to by numbers. The subject, or active agent (the one who acts, e.g., the seller) is the 1st actant, the object (that which is acted upon, e.g., the goods) is the 2nd actant, and the other actants are numbered according to their importance in the situation; their order is just a convention.

For a noun L denoting an action, the lexical function[5] Oper1 specifies a verb V which takes the name of the main actant, or agent, of the action as its grammatical subject, and the lexeme L itself as its direct object: 1st actant of V = 1st actant of L; 2nd actant of V = L. Informally, we can say that Oper1 (L) answers the question: What does one do with L when performing the action L? The phrases of example 2 from the previous section may be rewritten as follows:

Eng.   Oper1 (question) = ask

Sp.    Oper1(pregunta) = hacer

Fr.     Oper1(question) = poser

Rus.   Oper1 (vopros) = zadat’

(What does one do with a pregunta when asking? One hace it.)

A lexical function is not defined for every lexeme. First, the arguments of a specific function often must belong to some set of lexemes characterized by a common component of their meanings, by their part of speech, or by some other property; the function is simply not defined outside this set. In most cases the domain of a function is the same in different languages. Second, even if a word does belong to the domain of the function, a specific language may by accident lack the appropriate word, while in another language such a word exists. The reader will easily find examples of such “absent” values for various functions in the following sections.
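The two ways in which a value can be missing may be sketched as follows. This is only an illustration under stated assumptions: the part-of-speech tags, the simplification of the domain of Oper1 to "nouns", and the names POS, OPER1 and oper1 are all hypothetical; the Oper1 values are those quoted in this article.

```python
# Toy part-of-speech tags; an assumption of this sketch.
POS = {
    "pregunta": "noun",
    "grito": "noun",
    "atención": "noun",
    "silencio": "noun",
    "profundo": "adj",
}

# Oper1 values quoted in the article; "silencio" is deliberately left out
# to show a lexeme that is inside the domain but has no value recorded here.
OPER1 = {
    "pregunta": ("hacer",),
    "grito": ("dar",),
    "atención": ("prestar", "dedicar"),
}


def oper1(lexeme):
    """Return None outside the domain, () if no value is recorded, else the values."""
    if POS.get(lexeme) != "noun":
        return None  # the function is not defined for this lexeme at all
    return OPER1.get(lexeme, ())  # defined, but possibly without a recorded value


print(oper1("profundo"))
print(oper1("silencio"))
print(oper1("pregunta"))
```

Keeping "undefined" and "no value" apart lets a program distinguish a question that makes no sense from a genuine gap in the dictionary or in the language.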

Sp. argum.  | Oper1               | Glued verb | Eng. equival. | Oper1                                 | Glued verb
atención    | prestar, dedicar    |            | attention     | pay, focus, devote                    |
ayuda       | prestar, venir en   | ayudar     | aid           | render, give, offer, provide, come to | aid
auxilio     | prestar             | auxiliar   | help          | give, offer, provide                  | help
clases      | dar                 |            | lessons       | teach                                 | teach
concierto   | dar                 |            | concert       | give                                  |
confianza   | tener               | creer      | confidence    | have                                  | trust
cooperación | prestar             | cooperar   | cooperation   | give                                  | cooperate
dolor       | aguantar            |            | pain          | feel, have                            | suffer
grito       | dar                 | gritar     | cry           | let out                               | cry out
frutas      | dar, apostar        | frutar     | fruits        | yield, bear                           | fruit
necesidad   | tener               | necesitar  | necessity     | be under                              | need
pregunta    | hacer               | preguntar  | question      | ask                                   | ask
victoria    | conseguir, alcanzar | vencer     | victory       | win, gain, achieve                    | conquer

Table 1. Function Oper1.


About 60 so-called standard elementary LFs[6], such as Magn or Oper1, were introduced and elaborated in detail in the Meaning ⇔ Text theory, especially for Russian. Many non-elementary functions can be built by combining the elementary ones, i.e., as functions of other functions. The entire set of functions, both elementary and complex, allows for an exhaustive and highly systematic description of almost all language-dependent restrictions on lexical co-occurrences in natural languages.

4.        Examples of lexical functions

In this section, several elementary LFs will be introduced and illustrated with numerous examples.

4.1     Lexical function Magn for  verbs, adjectives, and adverbs

In the previous section, the function Magn was already defined, but only for nouns. However, it is applicable to many verbs, adjectives, and adverbs as well. Its argument should then have some main feature that can be qualified in grades, and correspondingly, Magn expresses the idea of the great degree of this feature, i.e., the meaning of ‘very’ or ‘intensely’.

The Appendix gives a few examples of Magn for some Spanish verbs, adjectives, and adverbs. For all of them the values are adverbs.

Notice that every natural language has some “standard”, or most usual, adverb of this kind for adjectives and adverbs: muy in Spanish, very in English, très in French, ochen’ in Russian. On the other hand, a non-native speaker should use these values very cautiously, since many adjectives and verbs require quite different words for this meaning. That is why the correct usage of Magn, as well as of other LFs, is so important for deep language competence.

4.2     More about lexical function Oper1

Table 1 shows some examples of the function Oper1 for Spanish nouns, in parallel with their English equivalents.

It is easy to see that the values of Oper1 do not have any autonomous meaning for any of their arguments in either language. Instead, they are merely tools for incorporating the noun argument into the syntactic structure. Their role in the text is similar to that of auxiliary words or, let us say, suffixes. For example, the word dar when used with the word grito has the same “meaning” as the suffix ‑ar: dar un grito = gritar. Namely, it has no meaning at all but instead performs the function of converting a noun into a verb.

Argument    | Caus                   | Liqu                 | Eng. equiv. | Caus             | Liqu
atención    | atraer                 | distraer             | attention   | attract          | distract, divert
descanso    | conceder               | privar               | rest        | give             | deprive
derechos    | discernir, conceder    | privar               | rights      | grant            | deprive, concede
dolor       | causar                 | tranquilizar         | pain        | give             | calm
palabra2    | conceder, conferir, dar | retirar             | floor2      | give             | deny
parlamento  | convocar               | dar receso, disolver | parliament  | convene, convoke | disband, dissolve
posibilidad | dar                    | privar               | opportunity | raise, afford    | exclude, rule out
visa        | dar, conceder          | privar               | visa        | grant, issue     | deny, cancel

Table 2. Functions Caus and Liqu.


Evidence that the values of Oper1 carry no meaning of their own is that most textual combinations of Oper1 (S) and S can be replaced with a single verb (see columns 3 and 6 of Table 1) which is equal in its meaning to the noun[7] S: prestar ayuda = ayudar, hacer una pregunta = preguntar, etc. Such transformations are called paraphrasing. They play a significant role in the theory and practice of linguistics.

4.3     Lexical functions Caus and Liqu

The meaning of the LF Caus is ‘to cause’, ‘to bring the situation into existence’. The grammatical subject of the verb V = Caus (L) should be different from L, while the complement should be L: 1st actant of V ≠ L; 2nd actant of V = L. E.g., to cause the situation in which somebody pays attention, it is necessary to attract that person’s attention. Informally, Caus (L) answers the question: What does one do with L to cause it?

The meaning of the LF Liqu is ‘to eliminate’, ‘to cause the situation to cease to exist’. Informally, Liqu ≈ Caus not; Liqu (L) answers the question: What does one do with L to eliminate it? As for Caus, the grammatical subject of V = Liqu (L) should be different from L, while the complement should be L: 1st actant of V ≠ L; 2nd actant of V = L. For example, to eliminate the situation in which somebody pays attention, it is necessary to distract this attention.

Table 2 illustrates both lexical functions, for Spanish and English in parallel.

4.4     Lexical functions for semantic derivates

The LF A0 is defined on nouns, verbs, and adverbs and gives the meaning equal to that of its argument, but expressed by an adjective: A0 (ciudad) = urbano, A0 (hermano) = fraternal, A0 (mover) = movedizo, A0 (bien) = bueno. These are examples of semantic derivation in Spanish. One can see that, in contrast to morphological word derivation, semantic derivation preserves the meaning, though the derivate may have quite a different stem.

The reader may wonder what the numerical indices of the function names such as A0 or Oper1 stand for. The indices 1, 2, …, refer to the different actants of the argument lexeme. For example, Oper1 (L) is a verb that describes the action of the first (not second, etc.) actant of L. By convention, the number 0 refers to the lexeme L itself rather than to any of its actants. As you might have guessed, there exist also such functions as Oper2, or A1, A2, etc.

The LFs A1, A2, …, express the typical adjective qualifiers for the first, second, etc., actants of L:

 

x          | A1 (x)       | A2 (x)
sorprender | sorprendente | sorprendido
simpatizar | simpatizante | simpático

 

Similarly, S0, S1, …, express the substantive semantic derivates, i.e., nouns. The function S0 gives a noun with the same meaning as its argument, for example[8] S0 (real1) = realidad, S0 (defender) = defensa. The functions S1, S2, S3, …, give the standard names of the first, second, third, etc., actants of the argument. Table 3 presents some examples for verbs with different numbers of actants.

x       | S0 (x)              | S1 (x)       | S2 (x)    | S3 (x)          | S4 (x)
llover  | lluvia              |              |           |                 |
vivir   | vida                |              |           |                 |
nadar   | natación            | nadador(a)   |           |                 |
poseer  | posesión            | poseedor(a)  | propiedad |                 |
amar    | amor                | amante       | amado(a)  |                 |
expedir | expedición          | expedidor(a) | mensaje   | destinatario(a) |
comprar | adquisición, compra | comprador(a) | mercancía | vendedor(a)     | precio

Table 3. Noun derivates.


There are several substantive LFs with the meaning of standard circumstances: Sloc denotes a standard place, Smod denotes a standard manner, etc.: Sloc (vivir) = vivienda / habitación / morada; Smod (vivir) = modo, Smod (escribir) = estilo, Smod (hablar) = pronunciación / dicción.

In the same way, verbal semantic derivates are expressed by the function V0: V0 (teléfono) = telefonear, V0 (diploma) = diplomar.

Adverbial semantic derivates are expressed by the function Adv0: Adv0 (intenso) = intensamente, Adv0 (cierto) = ciertamente / de cierto.

There exist LFs rather similar to the derivates. The function Pred of a noun or an adjective gives the standard predicative verb for its argument: Pred (maestro) = enseñar, Pred (unido) = unirse / aunarse / juntarse. The function Copul of a noun gives the standard copulative verb for this noun: Copul (maestro) = ser: José es maestro; Copul (ejemplo) = servir (de): Esta palabra sirve de ejemplo de la regla; Copul (camarado) = resultar: José resultó un buen camarado. These functions are used in the transformation rules discussed below; see, e.g., rule 5. Informally, both functions answer the question “how to be an L,” but Copul requires the word L as a complement, while Pred gives the complete answer by itself.

4.5     Lexical functions for semantic conversives

As we have seen, there are functions related to different actants of the argument; the number of actant is indicated as an index of the function name. Similarly, there are functions related to two different actants of the argument; their names are given two indices.

A conversive of a lexeme L is a lexeme L′ which denotes the same situation from another “point of view,” i.e., with some permutation of the actants. The application of the LF Convij (L) means that the actant i of L becomes the actant j of L′ and the actant j of L becomes the actant i of L′, i.e., the actants are swapped. In general, permutations are possible for any number of actants, the number of possible options grows with the number of available actants, and the function name may accordingly have three or more indices, e.g., Conv123.

The simplest examples of conversives are words with two actants: Conv12 (padres) = hijos, Conv12 (hijos) = padres. The same situation can be expressed by the phrases Juan y María son los padres de estos niños or Estos niños son los hijos de Juan y María. The situation is the same, but it is presented from the “point of view” either of Juan and María or of the children. In some cases the conversives are antonymous: Juan es más fuerte que José vs. José es menos fuerte que Juan.

 

x       | S0 (x) | S1 (x)    | S2 (x)    | S3 (x)    | S4 (x)
comprar | compra | comprador | mercancía | vendedor  | precio
vender  | venta  | vendedor  | mercancía | comprador | precio

Table 4. Conversives.


More complicated examples are the verbs comprar and vender, with four actants each: Conv31 (comprar) = vender, Conv31 (vender) = comprar; see Table 4. Note that the standard names of the first and third actants of vender are those of comprar, permuted. Again, the same situation is presented from the “point of view” of the seller or of the buyer.
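The swap of actants described here can be sketched in a few lines. The representation of a lexeme's actants as a dictionary from actant number to standard name, and the names ACTANTS and conv, are assumptions made for illustration; the data come from Table 4.

```python
# Standard actant names (S1..S4) from Table 4, indexed by actant number.
ACTANTS = {
    "comprar": {1: "comprador", 2: "mercancía", 3: "vendedor", 4: "precio"},
    "vender": {1: "vendedor", 2: "mercancía", 3: "comprador", 4: "precio"},
}


def conv(actants, i, j):
    """Swap actants i and j, as Conv_ij does; the other actants keep their numbers."""
    swapped = dict(actants)
    swapped[i], swapped[j] = actants[j], actants[i]
    return swapped


# Conv31 applied to the actants of comprar yields the actants of vender.
print(conv(ACTANTS["comprar"], 3, 1) == ACTANTS["vender"])
```

Applying the same swap twice restores the original numbering, which mirrors the fact that Conv31 (vender) = comprar.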

4.6     Several other lexical functions

There are many other lexical functions of interest, of which we will mention here the following:

·      Synonyms are expressed through the LF Syn. There are various degrees, or types, of synonymy, so there are several options for this LF: an absolute synonym Syn, a narrower synonym Syn⊂, a broader synonym Syn⊃, and an intersecting synonym Syn∩. Here are some examples: Syn (cómputo) = computación, Syn⊂ (respetar) = considerar, Syn⊃ (asesinato) = masacre, Syn∩ (evitar) = salvarse / esquivar / eludir.

·      Contrastive terms, or antonyms, are expressed through the LF Contr: Contr (cima) = fondo, Contr (bueno) = malo, Contr (más) = menos.

·      Generic terms, or hypernyms, are expressed through the LF Gener: Gener (ira) = sentido, Gener (televisión) = medios masivos de comunicación.

·      The standard terms for collectivity are expressed through the LF Mult: Mult (nave) = flota, Mult (borrego) = rebaño, Mult (ganso) = bandada.

·      The standard terms for singleness are expressed through the LF Sing: Sing (lluvia) = gota, Sing (flota) = nave, Sing (banda) = bandido.

·      The ability to be the i-th actant in a situation is expressed through the LF Ablei: Able (mudar) = mudable, Able (combar) = flexible, Able1 (comer) = hambriento, Able2 (comer) = comestible, Able1 (excusar) = indulgente, Able2 (excusar) = excusable.

5.        Paraphrasing rules

Lexical functions provide a powerful mechanism for synonymous transformations (paraphrasing) applicable at the deep syntactic or semantic level. The corresponding bi-directional formulae (equivalencies, ⇔) contain small subtrees on both sides; in some cases a subtree on one side or both is reduced to a single node.

The transformations work on subtrees of the whole syntactic tree or semantic network for a sentence. If the whole tree contains a subtree matching the one on either side of the formula, then it can be substituted with the tree on the other side of the rule. For any rule and any direction of its use, the meaning of the whole tree remains unchanged.

Here are several examples. They concern a word that is traditionally denoted by C0, and its syntactic surroundings. The labels on the arrows correspond to the different types of syntactic relations (1, …, 5 stand for actants; in particular, 2 stands for the direct object; attr stands for the attributive relation).

1.    The synonyms are equal in their meaning:

C0  ⇔  Syn (C0).

Example: fácil ⇔ sencillo.

2.    A combination of Oper1 (N) with a dependent noun N, where N is equal in its meaning to some verb C0, i.e., N = S0 (C0), is reducible to this verb C0:

Oper1 (S0 (C0)) -2→ S0 (C0)  ⇔  C0.

Example: prestar ayuda ⇔ ayudar.

3.    A combination of Gener (C0) with a dependent noun C0 is reducible to C0:

Gener (C0) → C0  ⇔  C0.

Example: el sentido de irritación ⇔ la irritación.

4.    A combination of Gener attributed by A0 is reducible to their common argument:

Gener (C0) -attr→ A0 (C0)  ⇔  C0.

Example: la materia explosiva ⇔ el explosivo.

5.    A copulative word combination is reducible to the LF Pred:

Copul (C0) → C0  ⇔  Pred (C0).

Example: ser maestro ⇔ enseñar.

6.    A verb can be replaced by its conversive:

C0  ⇔  Convij (C0), with the dependents on the relations i and j exchanged; i, j = 1, …, 5; i ≠ j.

Example: Juan compró el libro a Pedro por cien pesos ⇔ Pedro vendió [= Conv13 (comprar)] el libro a Juan por cien pesos.

The total number of such rules known in the Meaning ⇔ Text theory[9] reaches several dozen. They can be implemented through about thirty standard operations, mainly connected with tree transformations: renaming a node in a tree, eliminating or inserting a node, moving an arc to another node, etc. We cannot discuss the mechanism of applying these rules in detail here; see [6, 7, 13].
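As an illustration of how one such rule might operate on a tree, here is a minimal sketch of rule 2 applied from left to right. Everything about the representation is an assumption of this sketch: the Node class, the use of a dictionary from relation labels to dependents, the label "2" for the direct object, and the toy lexicon, which only repeats a few rows of Table 1. It is not the mechanism of the Meaning ⇔ Text systems themselves, and it simply discards any dependents of the noun.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    lexeme: str
    children: dict = field(default_factory=dict)  # relation label -> Node


# Toy lexicon from Table 1: noun -> (its Oper1 values, its glued verb).
LEXICON = {
    "ayuda": ({"prestar", "venir en"}, "ayudar"),
    "pregunta": ({"hacer"}, "preguntar"),
    "grito": ({"dar"}, "gritar"),
}


def reduce_oper1(node):
    """Apply rule 2 left to right at this node; return it unchanged if it does not match."""
    obj = node.children.get("2")
    if obj is None or obj.lexeme not in LEXICON:
        return node
    oper_values, verb = LEXICON[obj.lexeme]
    if node.lexeme not in oper_values:
        return node
    # Replace the two-node subtree by the single verb and keep the verb's
    # other dependents (for example, its first actant).
    rest = {rel: child for rel, child in node.children.items() if rel != "2"}
    return Node(verb, rest)


tree = Node("prestar", {"1": Node("Juan"), "2": Node("ayuda")})
result = reduce_oper1(tree)
print(result.lexeme, sorted(result.children))
```

The rule fires only when the governing verb is recorded as a value of Oper1 for its direct object, so dar ayuda is left untouched in this toy lexicon while prestar ayuda is reduced to ayudar.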

6.        Applications of the lexical functions and paraphrastic rules

Lexical functions can be used in various ways in linguistic transformations in text analysis, synthesis, and translation.

·      Through the use of paraphrases for a single node, or for nodes constituting a syntactic subtree, one can significantly decrease the variety of lexemes occurring in a discourse. Thus, the semantic representation becomes more homogeneous and the problem of “understanding” the discourse is made easier. See rules 1 and 6 of the previous section as examples.

·      Through the use of paraphrases diminishing the number of nodes in the deep syntactic structure, one can decrease the total number of nodes in the structure. Since the rules of logical inference are usually concise, it is easier to match more compact structures with them. See the rules 2 to 5 of the previous section as examples.

·      In text synthesis, through the use of lexical functions, the correct words for standard meanings can be chosen for a given language. For example, to represent the meaning of ‘high intensity’ with the concept ‘tea’, the words strong, starker (‘powerful’), cargado (‘loaded’), fort (‘forceful’), krepkiy (‘firm’), or mocna (‘firm’) can be chosen for English, German, Spanish, French, Russian, and Polish, respectively. We are not aware of any elegant method of choosing the word cargado for the meaning of ‘high intensity’ in the context of té, other than the use of the lexical function Magn.

·      Through the use of lexical functions on the deep syntactic level in translation from one natural language to another, one can easily obtain quite idiomatic translations. This is illustrated by the following two examples of English-to-Spanish translations:

1.    strong tea ⇒ Magn (tea) ←attr- tea ⇒ té -attr→ Magn (té) ⇒ té cargado.

2.    asks questions ⇒ Oper1 (question)[sing, 3pers] -2→ question[pl] ⇒ Oper1 (pregunta)[sing, 3pers] -2→ pregunta[pl] ⇒ hace preguntas.

Notice that LFs Magn and Oper1 remain invariant during interlingual transfer in the previous examples, like elements of some universal language. The labeled syntactic structures involved are also applicable to both languages. As to the lexical expressions corresponding to these LFs in various languages, they can be very specific. Thus, though it is not possible to directly translate the function values, it is possible to share the functions across languages. Even if some LFs are not meaningful by themselves, they can serve as elements of an intermediate language of a very deep, though not semantic, level.
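The transfer scheme of the two examples above can be sketched as follows: the source-language value is recognized as an LF of its keyword, the keyword is translated directly, and the LF value is looked up afresh in the target language. The table names TRANSLATE and LF_VALUES, the function names analyse and transfer, and the language codes are hypothetical; the data are only the pairs quoted in this article. Morphology and word order are left out.

```python
# Toy bilingual data taken from the examples in the text.
TRANSLATE = {("en", "es"): {"tea": "té", "question": "pregunta"}}

LF_VALUES = {
    ("en", "Magn", "tea"): "strong",
    ("es", "Magn", "té"): "cargado",
    ("en", "Oper1", "question"): "ask",
    ("es", "Oper1", "pregunta"): "hacer",
}


def analyse(lang, value, keyword):
    """Find the LF name such that LF(keyword) = value in the given language."""
    for (entry_lang, name, arg), entry_value in LF_VALUES.items():
        if entry_lang == lang and arg == keyword and entry_value == value:
            return name
    return None


def transfer(src, dst, value, keyword):
    """Translate the keyword directly and recompute the LF value in the target language."""
    name = analyse(src, value, keyword)
    if name is None:
        return None
    target_keyword = TRANSLATE[(src, dst)][keyword]
    return name, target_keyword, LF_VALUES[(dst, name, target_keyword)]


print(transfer("en", "es", "strong", "tea"))
print(transfer("en", "es", "ask", "question"))
```

Note that strong is never translated as a word: only the function name Magn crosses the language boundary, which is the invariance discussed above.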

Some of the applications mentioned above are useful not only in computer programs, but also in manual text composition or translation, or in language learning [3, 5].

Another very important application of the notion of LFs is lexicographic work: they make it possible to replace the vague traditional explanations of words in dictionaries with a clear classification, thus constituting (together with the techniques of semantic decomposition) a solid scientific basis for lexicographic work [8, 10, 11]. For example, instead of different and often unclear explanations for words like vender, comprar, venta, compra, vendedor, precio, valer, mercado, abarrote, etc., it is enough to explain once the situation referred to by all these words, and then give clear formulae for each of the words in terms of lexical functions.

7.        Conclusions and future work

The notion and formalism of lexical functions prove to be very useful in the following areas of science and practice: automatic language processing and understanding, text generation, machine translation, lexicography, and language learning.

In particular, in language understanding, applying the rules that rely on lexical functions greatly simplifies the analysis by standardizing the structures and reducing the number of independent word meanings in the text. In any task involving text generation, lexical functions are a necessary part of the system dictionary.

Lexical functions are highly language-dependent, i.e., they cannot be “calculated” or predicted[10] using other types of knowledge, such as word meanings or translations. The values of lexical functions are an independent piece of knowledge for each language, and must be collected in special dictionaries. There is some experience in compiling such dictionaries for Russian, English, French, and German, with different numbers of words and for different purposes [1, 3, 4, 9, 12]. However, for Spanish such a dictionary is yet to be compiled.

We see the following main directions of work with Spanish lexical functions:

·      Compilation of at least a small dictionary. Such a dictionary can be monolingual or bilingual, say, English-Spanish.

This will require in the first place manual lexicographic work. Some drafts of the dictionary can be obtained by translation (manual or semi-automatic) of an existing one from another language, and/or by statistical analysis (similar to, say, lexical attraction models [2, 14]) of co-occurrences of words in text corpora, since many frequent word combinations likely have the structure LF (L).

·      Compilation of the set of transformation rules relying on lexical functions. Though many rules are universal, there are many language-specific constructions.

This is a work for a computational linguist. The first draft of a set of such rules might be obtained by revision and selection of the existing rule corpora for Russian, French, or English.

·      Implementation of the discussed mechanisms in computational linguistic systems, such as parsers or generators.

These might be laboratory systems based mostly on the formalism of lexical functions, but the discussed dictionaries and rules are best used when implemented as modules in larger language processing systems.
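The statistical route mentioned under the first direction above could start from something as simple as counting syntactically related word pairs. The sketch below assumes that verb and direct object pairs have already been extracted by a parser; the input list is made up for illustration, and the function name candidate_values and the frequency threshold are hypothetical. It is far simpler than the lexical attraction models cited in [2, 14] and only proposes candidates for a lexicographer to check.

```python
from collections import Counter, defaultdict


def candidate_values(pairs, min_count=2):
    """Group (verb, noun) pairs by noun and keep verbs seen at least min_count times."""
    counts = Counter(pairs)
    by_noun = defaultdict(list)
    for (verb, noun), n in counts.most_common():
        if n >= min_count:
            by_noun[noun].append((verb, n))
    return dict(by_noun)


# Made-up input: verb and direct object pairs as a parser might extract them.
pairs = [
    ("hacer", "pregunta"), ("hacer", "pregunta"),
    ("dar", "grito"), ("dar", "grito"), ("dar", "grito"),
    ("oír", "grito"),
]
print(candidate_values(pairs))
```

Frequent pairs such as dar + grito are only candidates for Oper1; raw frequency cannot tell which lexical function, if any, a combination realizes, so manual review remains necessary.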

Appendix. Examples of LF Magn

Here are some Spanish examples of the values of the lexical function Magn, that expresses the meaning of ‘very’, ‘high intensity or magnitude’, for the words of various parts of speech.

Nouns:

Adelantos: serios, importantes; admiración: servil; antigüedad: remota; bombilla: brillante; cantidad: numerosa; capacidades: grandes; café: cargado, fuerte; color: intenso; duelo: grande; ejército: fuerte, poderoso, potente; enemigo: mortal; esencia: concentrada, fuerte; éxito: grande, arrollador, clamoroso; fugador: fuerte, bueno; hambre: grande, violenta, canija; ignorancia: crasa; luto: riguroso; luz: viva, brillante; motor: potente; notoriedad: resonante; precios: altos; presión: alta; rasgo: distintivo; razonamiento: complicado; resistencia: decidida, resuelta; respeto: todo; reverencia: profunda; ruina: completa; salario: alto; silencio: profundo, sepulcral, total; sueño: profundo; sucesos: importantes; té: cargado, fuerte; vejez: avanzada; velocidad: grande; verdad: pura; voluntad: firme.

Adjectives and adverbs:

Agradecido: muy, sumamente; astuto: muy, como un diablo; bien: muy; borracho: como casaco, perdido, como una cuba; conmovido: profundamente; convencional: puramente; desnudo: completamente, del todo; difícil: muy, sumamente; emocionado: muy; enfadado: muy; enfermo: gravemente; enojado: muy; fuerte: muy; joven: muy; herido: gravemente; nuevo: completamente; preocupado: muy; rico: muy.

Verbs:

Alabar: mucho; asustarse: terriblemente; cerrar: herméticamente, bien; comer: bien, mucho; dormir: profundamente, a pierna suelta, como un tronco; latir: fuertemente; reñir: ásperamente; soplar: reciamente; vigilar: rigurosamente, con rigurosidad, estrictamente, con severidad, austeramente, con austeridad.

References

1.    Benson, M., E. Benson, and R. Ilson. The BBI combinatory dictionary of English. John Benjamins Publishing Company, Amsterdam-Philadelphia, 1996.

2.    Bolshakov, I.A., P.J. Cassidy, A.F. Gelbukh. CrossLexica: a dictionary of collocations and thesaurus of the general Russian lexicon (in Russian, abstract in English). Proceedings of International Workshop Dialogue’95: Computational Linguistics and its Applications, Khazan, 1995.

3.    CALLex, Computer-Aided Learning of Lexical Functions. See http://www.gs.uni-heidelberg.de/callex.html.

4.    Leed, Richard L., ed., Lidia Iordanskaja, Slava Paperno, et al. A Russian-English Collocational Dictionary of the Human Body, 1996, 418 p. ISBN: 0-89357-265-9, see also http://russian.dmll.cornell.edu/russian.web/BODY/MAC_STAN.

5.    Leed, Richard L., and A.D. Nakhimovsky. Lexical Functions and Language Learning. Slavic and East European Journal, vol. 23, No. 1, pp. 104–113, 1979.

6.    Wanner, Leo, ed. Lexical Functions in Lexicography and Natural Language Processing. Studies in Language Companion Series, No. 31, University of Waterloo, 1996.

7.    Mel’čuk, I.A. An experience of the theory of Meaning ⇔ Text models (in Russian). Nauka, Moscow, 1974.

8.    Mel’čuk, I.A. Lexical Functions in Lexicographic Description. In: Proceedings of VIII Annual Meeting of the Berkeley Linguistic Society, Berkeley, UCB, 1982, p. 427-444.

9.    Mel’čuk, I.A., N. Arbatchewsky, L. Iordanskaja, A. Lessard et al. Dictionnaire explicatif et combinatoire du français contemporain. Les Presses de l'Université de Montréal, Montreal, 1984.

10.   Mel’čuk, I.A., A. Polguere. A Formal Lexicon in the Meaning – Text Theory (or How to Do Lexica with Words). Computational Linguistics, v. 13, N. 3-4, 1987, p. 261-275.

11.   Mel’čuk, I.A., and A. Zholkovsky. The explanatory combinatory dictionary. In M.W. Evens, ed. Relational Models of the Lexicon: Representing Knowledge in Semantic Networks, pages 41 – 74. Cambridge University Press, 1988.

12.   Mel’čuk, I.A., and A.K. Žolkovsky. Explanatory Combinatorial Dictionary of Modern Russian. Wiener Slawistischer Almanach Sonderband 14, Vienna, 1984.

13.Steel, James, ed. Meaning – Text Theory. Linguistics, lexicography, and implications. University of Ottawa press, 1990.

14.Yuret, Deniz. Discovery of linguistic relations using lexical attraction. Ph.D. thesis, MIT, 1998. See http://xxx.lanl.gov/abs/cmp-lg/9805009.



* The work done under partial support of CONACYT, Project 26424-A, and DEPI, Project 980772.

[1] The formalism of lexical functions has nothing to do with Lexical Functional Grammar (LFG). The name Lexical Functional Grammar refers to a functional grammar which is lexical, while the name lexical function refers to a function defined on lexemes.

[2] Grito fuerte / grande, llanto ruidoso.

[3] Other spellings: Zholkovsky; Mel’cuk, Mel’chuk, Melchuk (the latter spelling in English corresponds best to the pronunciation of the name).

[4] Example by N. Chomsky.

[5] The meaning of the index 1 of the name Oper1 will be explained later.

[6] LF stands for lexical function.

[7] In linguistics, the part of speech is considered a purely grammatical category that cannot change the core meaning of the word.

[8] The index 1 of the word real denotes the first of the homonyms of this word: real1 refers to reality, real2 refers to a king, etc.

[9] They were well elaborated mostly for the Russian language.

[10] At the current state of linguistics. Some linguists, e.g., A. Wierzbicka, suggest methods that in theory might (at least in many cases) predict some lexical functions.