Translingual Information Retrieval: A Comparative Evaluation

IJCAI-97 - PROCEEDINGS OF THE FIFTEENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, V..., pp.708-714, (1997)


Abstract

Translingual information retrieval (TIR) consists of providing a query in one language and searching document collections in one or more different languages. This paper introduces new TIR methods and reports on comparative TIR experiments with these new methods and with previously reported ones in a realistic setting. Methods fall into...

Introduction
  • Translingual information retrieval (TIR) has started to receive considerable attention in recent years with the increased accessibility of ever-more-diverse on-line international text collections, centrally including the World Wide Web.
  • Retrieval accuracy: since documents contain far more information than queries, random translation errors should cause less degradation for the IR task in documents than in queries.
  • For both this reason and the above, document translation is in principle preferable.
  • Some collections are proprietary: individual documents may be read or downloaded, but the entire collection may not be copied or translated.
  • Even if these problems were surmountable, translating the collection may require inordinately long computation and massive storage, not to mention re-indexing the translated collection.
Highlights
  • Translingual information retrieval (TIR) has started to receive considerable attention in recent years with the increased accessibility of ever-more-diverse on-line international text collections, centrally including the World Wide Web.
  • We extended three monolingual retrieval methods to translingual retrieval: pseudo-relevance feedback (PRF) [Buckley et al., 1995], the generalized vector space model (GVSM) [Wong et al., 1985], and latent semantic indexing (LSI) [Deerwester et al., 1990].
  • Similar to query transformation, a document can also be transformed by computing $\vec{d}\,' = A^{T}\vec{d}$, where $\vec{d}$ is the document vector in the conventional VSM. The retrieval criterion in the Generalized Vector Space Model for monolingual retrieval is defined to be $\mathrm{sim}(\vec{q}, \vec{d}) = \cos(A^{T}\vec{q},\, A^{T}\vec{d})$. Here we propose a novel extension of the monolingual Generalized Vector Space Model for translingual retrieval.
  • Pseudo-relevance feedback performed well in absolute terms, indicating that if the user were willing to provide true relevance judgements, full relevance feedback could become the top-performing method for Translingual information retrieval
  • Latent Semantic Indexing did not perform according to expectations from the literature
  • It is worth noting that Generalized Vector Space Model is simple to compute and easy to scale up, somewhat better than Latent Semantic Indexing, and its performance is not crucially dependent on the exact value of a tuned parameter
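
The GVSM criterion quoted above, sim(q, d) = cos(A^T q, A^T d), can be illustrated with a small sketch. It assumes A is a term-by-document matrix built from a training corpus (rows are terms, columns are training documents); the toy vocabulary and all function names are hypothetical. The translingual variant shown is only one plausible form of the extension mentioned above, assuming a bilingual training corpus whose paired documents give a source-language matrix A and a target-language matrix B; the text here does not spell out the formula.

```python
import math

def transpose_times(matrix, vec):
    """Compute A^T v, where matrix rows are terms and columns are training documents."""
    n_docs = len(matrix[0])
    return [sum(row[j] * w for row, w in zip(matrix, vec)) for j in range(n_docs)]

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def gvsm_sim(A, q, d):
    """Monolingual GVSM criterion: sim(q, d) = cos(A^T q, A^T d)."""
    return cosine(transpose_times(A, q), transpose_times(A, d))

def translingual_gvsm_sim(A, B, q, d):
    """Assumed translingual form: the query is mapped through the source-language
    matrix A and the document through the target-language matrix B, whose columns
    are the paired halves of the same training documents."""
    return cosine(transpose_times(A, q), transpose_times(B, d))

# Toy training corpus: terms (car, auto, tree) x 2 training documents.
A = [[1, 0],
     [1, 0],
     [0, 1]]
query = [1, 0, 0]        # contains only "car"
doc_auto = [0, 1, 0]     # contains only "auto"
doc_tree = [0, 0, 1]     # contains only "tree"

plain = cosine(query, doc_auto)           # no shared term in the conventional VSM
related = gvsm_sim(A, query, doc_auto)    # "car" and "auto" co-occur in training doc 0
unrelated = gvsm_sim(A, query, doc_tree)
```

In this toy case the conventional cosine between the query and `doc_auto` is zero because they share no term, while the GVSM score is high because both terms occur in the same training document. The sketch has no tuned parameter, which is consistent with the remark above that GVSM performance does not hinge on one.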
Methods
  • ... to the corpus and exploits context, and is much superior.
  • This result indicates that the most popular TIR method reported in the literature (MRD-based query translation) may be the simplest, but its performance leaves much to be desired.
  • LSI did not perform according to expectations from the literature.
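
As a rough illustration of why dictionary-based query translation is simple but context-blind, the sketch below replaces each query term with every translation listed in a machine-readable dictionary (MRD). The toy dictionary, the function name, and the choice to keep untranslatable terms unchanged are all assumptions made for illustration; they are not taken from the paper.

```python
def translate_query(query_terms, dictionary):
    """Replace each source-language term with all of its listed translations."""
    translated = []
    for term in query_terms:
        # Assumption: terms missing from the dictionary (e.g. proper names)
        # are passed through unchanged.
        translated.extend(dictionary.get(term, [term]))
    return translated

# Hypothetical Spanish-to-English entries; "banco" is ambiguous.
toy_dictionary = {
    "banco": ["bank", "bench"],
    "rio": ["river"],
}

target_query = translate_query(["banco", "rio", "IJCAI"], toy_dictionary)
```

Every sense of an ambiguous term ends up in the translated query, since a plain dictionary lookup has no access to the surrounding context or to the vocabulary of the target collection.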
Results
  • The authors evaluated each method, monolingually and translingually, using human relevance judgements.
  • The corresponding 11-point average precision values are given in the table in Figure 4.
  • The authors include corresponding translingual results reported by other researchers.
  • Because the methods have been run on different corpora with different queries, direct comparisons on absolute 11-point precision-recall figures are not meaningful.
  • The authors present the results in the standard recall-precision graphs for monolingual and translingual IR in Figures 5 and 6, respectively.
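
The 11-point average precision used above can be sketched as follows. This follows the common textbook definition (interpolated precision averaged over recall levels 0.0, 0.1, ..., 1.0); the exact evaluation script used by the authors is not given here, so the function name and the interpolation rule are assumptions.

```python
def eleven_point_average_precision(ranked_ids, relevant_ids):
    """Average interpolated precision over recall levels 0.0, 0.1, ..., 1.0."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0

    # Record (recall, precision) each time a relevant document is retrieved.
    points = []
    hits = 0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))

    total = 0.0
    for i in range(11):
        level = i / 10
        # Interpolated precision: best precision at any recall >= this level,
        # or 0.0 if that recall level is never reached.
        reachable = [p for r, p in points if r >= level - 1e-12]
        total += max(reachable, default=0.0)
    return total / 11

# Two relevant documents, retrieved at ranks 1 and 3.
score = eleven_point_average_precision(["d1", "d2", "d3", "d4"], {"d1", "d3"})
```

In this example the interpolated precision is 1.0 for recall levels up to 0.5 and 2/3 for the remaining five levels, so the score is 28/33, roughly 0.85. Scores computed this way are comparable across methods only when the corpus and queries are the same, which is the caveat stated above.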
Conclusion
  • MRD-based query translation, though popular in the literature, should be re-examined as the TIR method of choice given the results in this paper.
  • It appears that Translingual LSI is not as good in a realistic setting with actual queries and 11-point average precision evaluations as in the preliminary.
  • More work is clearly called for in further evaluating the GVSM method and corpus-based term-translation in other realistic contexts, and investigating whether other forms of tunable MT-based translingual IR could be made to perform reasonably well, especially in situations where translating the collection does not pose serious problems.
Reference
  • [Brown, 1996] R.D. Brown. Example-Based Machine Translation in the Pangloss System. In Proceedings of the Sixteenth International Conference on Computational Linguistics, pages 169-174, 1996.
  • [Brown, 1997] R.D. Brown. Automated Dictionary Extraction for "Knowledge-Free" Example-Based Translation. In Proceedings of the Seventh International Conference on Theoretical and Methodological Issues in Machine Translation, 1997.
  • [Buckley et al., 1995] C. Buckley, G. Salton, J. Allan, and A. Singhal. Automatic Query Expansion Using SMART: TREC 3. In Overview of the Third Text REtrieval Conference (TREC-3), pages 69-80, 1995.
  • [Carbonell, 1985] J. G. Carbonell. New Approaches to Machine Translation. In Proceedings of the Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages, Hamilton, NY, 1985.
  • [Davis and Dunning, 1996] M. Davis and T. Dunning. A TREC Evaluation of Query Translation Methods for Multi-lingual Text Retrieval. In The 4th Text Retrieval Conference (TREC-4), 1996.
  • [Deerwester et al., 1990] S. Deerwester, S.T. Dumais, G.W. Furnas, T.K. Landauer, and R. Harshman. Indexing by Latent Semantic Analysis. J Amer Soc Inf Sci, 41(6), pages 391-407, 1990.
  • [Dumais et al., 1996] S. Dumais, T. Landauer, and M. Littman. Automatic Cross-Linguistic Information Retrieval using Latent Semantic Indexing. In Proceedings of SIGIR-96, Zurich, August 1996.
  • [Graff and Finch, 1994] David Graff and Rebecca Finch. Multilingual Text Resources at the Linguistic Data Consortium. In Proceedings of the 1994 ARPA Human Language Technology Workshop. Morgan Kaufmann, 1994.
  • [Hull and Grefenstette, 1996] D.A. Hull and G. Grefenstette. Querying Across Languages: a Dictionary-based Approach to Multilingual Information Retrieval. In 19th Ann Int ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'96), pages 49-57, 1996.
  • [Nagao, 1984] M. Nagao. A Framework of a Mechanical Translation between Japanese and English by Analogy Principle. In A. Elithorn and R. Banerji, editors, Artificial and Human Intelligence. NATO Publications, 1984.
  • [Nirenburg et al., 1991] S. Nirenburg, J. G. Carbonell, M. Tomita, and K. Goodman. Knowledge-Based Machine Translation. Morgan Kaufmann Inc, San Mateo, CA, 1991.
  • [Rissland and Daniels, 1995] Edwina L. Rissland and Jody J. Daniels. Using CBR to Drive IR. In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (IJCAI-95), pages 400-407, 1995.
  • [Salton and McGill, 1983] G. Salton and M. J. McGill. Introduction to Modern Information Retrieval. McGraw-Hill Computer Science Series. McGraw-Hill, New York, 1983.
  • [Salton, 1970] G. Salton. Automatic Processing of Foreign Language Documents. Journal of American Society for Information Sciences, 21:187-194, 1970.
  • [Salton, 1989] G. Salton. Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Addison-Wesley, Reading, Massachusetts, 1989.
  • [Sheridan and Ballerini, 1996] P. Sheridan and J.P. Ballerini. Experiments in Multilingual Information Retrieval using the SPIDER System. In 19th Ann Int ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'96), pages 58-65, 1996.
  • [Wong et al., 1985] S.K.M. Wong, W. Ziarko, and P.C.N. Wong. Generalized Vector Space Model In Information Retrieval. In ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'85), pages 18-25, 1985.
  • [Yang, 1995] Y. Yang. Noise Reduction in a Statistical Approach to Text Categorization. In Proceedings of the 18th Ann Int ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'95), pages 256-263, 1995.