User talk:Aitor izaola ibero

TEXT GENERATION

We can define text generation as the production of coherent language texts that satisfy the intention of communicating something. This kind of communication must have the following characteristics:

accurate: containing accurate information and avoiding contaminated or erroneous information.
coherent: using language in a comprehensible way.
valid: causing the user to make the desired inferences.
informative: presenting the user with new and interesting knowledge to satisfy their curiosity. The information should progress through levels of knowledge, beginning with easy, basic concepts and then increasing in depth.
understandable: the creator of an information source has to build it from information the user can understand, avoiding overly technical terms, because the user is often unfamiliar with the field.
relevant: including information that is relevant to the current discussion goal, not redundant information.

There is a group of people interested in this field, known as the Association for Computational Linguistics. It is a scientific and professional society for people working on problems involving natural language and computation. What is computational linguistics? Computational linguistics is the scientific study of language from a computational point of view. In some cases, work in computational linguistics is motivated from a scientific perspective, in that one is trying to provide a computational explanation for a particular linguistic or psycholinguistic phenomenon. In other cases the motivation may be more technological, in that one wants to provide a working component of a speech or text generation system.

There are some typical problems and limitations that text generators still have to solve. It is safe to say that at the present time one can fairly easily build a single-purpose generator for any specific application, or with some difficulty adapt an existing sentence generator to the application, with acceptable results. However, one cannot yet build a general-purpose sentence generator. Several significant problems remain without sufficiently general solutions: sentence planning, discourse structure, domain modeling, generation choice criteria, and lexical selection. I will now briefly explain each problem and suggest a possible solution where I can. These are just suggestions.

Sentence Planning: A number of tasks remain before well-structured multisentence text can be generated. These tasks, required for planning the structure and content of each sentence, include: pronoun specification, theme signaling, focus signaling, content aggregation to remove unnecessary redundancies, the ordering of prepositional phrases, adjectives, etc. In my opinion, this kind of problem could be solved by taking care with vocabulary and by revising the written structures and sentences repeatedly. One typical and important pitfall is revising the text without noticing the error.
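As an illustration of one of these tasks, content aggregation, here is a minimal Python sketch. It is only my own assumption of how a toy aggregator might look, not the method of any real generator: the function name aggregate and the (subject, verb, object) fact format are hypothetical. It merges facts that share a subject and verb into one sentence, so the same words are not repeated.

```python
def aggregate(facts):
    """Merge facts that share a subject and verb into a single sentence.

    `facts` is a list of (subject, verb, object) tuples. This format is a
    hypothetical simplification chosen for illustration only.
    """
    grouped = {}
    for subject, verb, obj in facts:
        # dict keeps insertion order, so sentences follow the input order
        grouped.setdefault((subject, verb), []).append(obj)

    sentences = []
    for (subject, verb), objects in grouped.items():
        if len(objects) == 1:
            joined = objects[0]
        else:
            # coordinate the objects: "a, b and c"
            joined = ", ".join(objects[:-1]) + " and " + objects[-1]
        sentences.append(f"{subject} {verb} {joined}.")
    return sentences


facts = [
    ("The system", "plans", "sentences"),
    ("The system", "plans", "paragraphs"),
    ("The user", "reads", "the text"),
]
result = aggregate(facts)
print(result)
```

Without aggregation, the three facts would give three sentences, two of them repeating "The system plans". A real sentence planner would also have to handle pronouns, theme and focus, which this sketch ignores.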

Discourse Structure: So far no text planner exists that can reliably plan texts of several paragraphs in general.

Domain Modeling: A traditional problem with generators is that the inputs are frequently hand-crafted, or are built by some other system that uses representation elements from a fairly small hand-crafted domain model, making the generator's inputs already highly oriented toward the final language desired. In my opinion, a well-built speech recognition system could be the future solution to this kind of problem.

Generation Choice Criteria: Probably the problem least addressed in generator systems today is the one that will take the longest to solve. This is the problem of guiding the generation process through its choices when multiple options exist to handle any given input. It is unfortunately the case that language, with its almost infinite flexibility, demands far more from the input to a generator than can be represented today. As long as generators remain fairly small in their expressive potential, this problem does not arise. In my opinion, solving it will be a matter of time and hard work.
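To make the idea of choice criteria more concrete, here is a small Python sketch of my own. It assumes that the generator has already produced several candidate sentences for the same input, and that the caller supplies two simple criteria (a desired register and a word limit). The function choose_realization, the candidate format and both criteria are hypothetical, chosen only for illustration.

```python
def choose_realization(candidates, max_words, formal):
    """Pick one candidate sentence using two illustrative criteria.

    `candidates` is a list of (text, is_formal) pairs. Both the pair format
    and the criteria are assumptions made for this sketch.
    """
    def score(candidate):
        text, is_formal = candidate
        # 0 if the register matches what was asked for, 1 otherwise
        style_mismatch = 0 if is_formal == formal else 1
        # number of words beyond the allowed length
        length_penalty = max(0, len(text.split()) - max_words)
        # tuples compare left to right: register first, then length
        return (style_mismatch, length_penalty, len(text))

    return min(candidates, key=score)[0]


candidates = [
    ("The meeting has been postponed until Friday.", True),
    ("They pushed the meeting to Friday.", False),
    ("Meeting moved to Friday.", False),
]
formal_choice = choose_realization(candidates, max_words=8, formal=True)
short_choice = choose_realization(candidates, max_words=4, formal=False)
print(formal_choice)
print(short_choice)
```

The sketch only shows that the choice depends on information that must come from somewhere outside the sentence itself. The hard part, as explained above, is that real inputs rarely carry enough of this information.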

Lexical Selection: Lexical selection is one of the most difficult problems in generation. At its simplest, this question involves selecting the most appropriate single word for a given unit of input. However, as soon as the semantic model approaches a realistic size, and as soon as the lexicon is large enough to permit alternative locutions, the problem becomes very hard to resolve. At this time no general methods exist to perform lexical selection. Most current generator systems simply finesse the problem by linking a single lexical item to each representation unit. What is needed is the development of theories about, and implementations of, lexical selection algorithms for reference to objects, events, states, etc., tested with large lexica. I think that solving this kind of problem would require creating a very large and accurate dictionary that contains as many contextual entries or possibilities as possible. http://www.dynamicmultimedia.com.au/siggen/
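The difference between linking a single word to each representation unit and using a dictionary with contextual entries can be sketched in Python. This is only a toy example under my own assumptions: the concept names, the context keys and the function select_word are all hypothetical, and a real lexicon would be far larger.

```python
# One word per concept: the shortcut most generators take (toy data).
SIMPLE_LEXICON = {
    "MOVE_FAST": "run",
    "CONSUME_FOOD": "eat",
}

# Several words per concept, each with the context it requires (toy data).
# Entries are checked in order; the empty condition acts as a default.
CONTEXT_LEXICON = {
    "CONSUME_FOOD": [
        ({"register": "formal"}, "dine"),
        ({"agent": "animal"}, "feed"),
        ({}, "eat"),
    ],
}


def select_word(concept, context):
    """Return the first word whose conditions all hold in `context`.

    Falls back to the one-word-per-concept lexicon, and returns None
    when the concept is unknown.
    """
    for conditions, word in CONTEXT_LEXICON.get(concept, []):
        if all(context.get(key) == value for key, value in conditions.items()):
            return word
    return SIMPLE_LEXICON.get(concept)


print(select_word("CONSUME_FOOD", {"register": "formal"}))
print(select_word("CONSUME_FOOD", {"agent": "animal"}))
print(select_word("MOVE_FAST", {}))
```

Even in this small example the order of the entries decides which word wins when several conditions match, which hints at why the problem becomes very hard with a realistic lexicon.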

Other pages give a definition similar to the one above. One of them was recommended by our teacher, Joseba Abaitua: http://www.hltcentral.org/usr_docs/project-source/en/broch/harness.html#nlg

The SIGGEN group also holds many conferences on advances and innovations in this field. Here are some links to these conferences: http://www.ags.uni-sb.de/~horacek/EACL-EWNLG03.html This one was held in April 2003. Another one was held in 2002 in New York: http://inlg02.cs.columbia.edu/