Learning deep structured semantic models for web search using clickthrough data

CIKM (2013)

Abstract

Latent semantic models, such as LSA, intend to map a query to its relevant documents at the semantic level where keyword-based matching often fails. In this study we strive to develop a series of new latent semantic models with a deep structure that project queries and documents into a common low-dimensional space where the relevance of a document given a query is readily computed as the distance between them. The proposed deep structured semantic models are discriminatively trained by maximizing the conditional likelihood of the clicked documents given a query using the clickthrough data. To make our models applicable to large-scale Web search applications, we also use a technique called word hashing, which is shown to effectively scale up the training of our models with large vocabularies. The new models are evaluated on a Web document ranking task using a real-world data set. Results show that our best model significantly outperforms other latent semantic models, which were considered state-of-the-art in the performance prior to the work presented in this paper.
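
The core computation the abstract describes is easy to sketch: once a query and a document have been projected into the same low-dimensional semantic space, relevance reduces to a vector-similarity score, and the DSSM uses cosine similarity for this. A minimal NumPy sketch follows; the 3-dimensional vectors are invented toy values, not actual model outputs.

```python
import numpy as np

def cosine_relevance(query_vec, doc_vec):
    """Relevance of a document to a query, computed as the cosine
    similarity of their low-dimensional semantic vectors."""
    denom = np.linalg.norm(query_vec) * np.linalg.norm(doc_vec)
    return float(query_vec @ doc_vec / denom)

# Toy 3-dimensional semantic vectors (illustrative values only).
q = np.array([0.2, 0.9, 0.1])
d_good = np.array([0.25, 0.85, 0.05])  # points the same way as q
d_bad = np.array([0.9, 0.1, 0.4])      # points elsewhere

print(cosine_relevance(q, d_good))  # close to 1.0 (~0.996)
print(cosine_relevance(q, d_bad))   # noticeably lower (~0.34)
```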

Introduction
  • Modern search engines retrieve Web documents mainly by matching keywords in documents with those in search queries.
  • Lexical matching can be inaccurate because a concept is often expressed with different vocabularies and language styles in documents than in queries (see the sketch after this list)
  • Latent semantic models such as latent semantic analysis (LSA) are able to map a query to its relevant documents at the semantic level where lexical matching often fails (e.g., [6][15][2][8][21]).
  • However, the performance of these models on Web search tasks has not been as good as originally expected
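
As a concrete (invented) illustration of the vocabulary-mismatch problem above: a purely lexical matcher assigns a relevant document a score of zero whenever the query and the document share no terms.

```python
# Query and document express the same intent with disjoint vocabularies,
# so term overlap -- the signal a keyword matcher relies on -- is empty.
query = set("cheap flights to new york".split())
document = set("affordable airfare for nyc travellers".split())

overlap = query & document
print(overlap)       # set(): no shared terms
print(len(overlap))  # 0 -- a pure lexical matcher scores this document 0
```
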
Highlights
  • Modern search engines retrieve Web documents mainly by matching keywords in documents with those in search queries
  • The third includes a set of state-of-the-art latent semantic models which are learned either on documents only in an unsupervised manner (LSA, probabilistic LSA, DAE as in Rows 4 to 6) or on clickthrough data in a supervised way (BLTM-PR, Discriminative Projection Models, as in Rows 7 and 8)
  • We present and evaluate a series of new latent semantic models, notably those with deep architectures, which we call the Deep Structured Semantic Models (DSSM)
  • The main contribution lies in our significant extension of the previous latent semantic models in three key aspects
  • Inspired by the deep learning framework recently shown to be highly successful in speech recognition [5][13][14][16][18], we extend the linear semantic models to their nonlinear counterparts using multiple hidden-representation layers
  • We use a letter n-gram based word hashing technique that proves instrumental in scaling up the training of the deep models so that very large vocabularies can be used in realistic Web search (a sketch follows this list)
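
A minimal sketch of that letter n-gram word hashing technique: each word is padded with boundary marks and decomposed into letter trigrams, so an unbounded word vocabulary collapses to a small, fixed trigram vocabulary (per Table 1, a 500K-word vocabulary reduces to roughly 30K letter trigrams with very few collisions). The helper names below are our own; only the padding-and-trigram scheme comes from the paper.

```python
from collections import Counter

def word_hash(word, n=3):
    """Letter n-gram word hashing: pad the word with boundary marks '#'
    and count its letter n-grams, e.g. 'good' -> '#good#' ->
    {'#go', 'goo', 'ood', 'od#'}."""
    padded = "#" + word + "#"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def hash_text(text, n=3):
    """Bag of letter n-grams for a lowercased, whitespace-tokenized text
    (matching the preprocessing described in the Methods section)."""
    bag = Counter()
    for word in text.lower().split():
        bag.update(word_hash(word, n))
    return bag

print(word_hash("good"))
# Counter({'#go': 1, 'goo': 1, 'ood': 1, 'od#': 1})
```

In the DSSM this fixed-size trigram vector is the input layer of the deep network, so the network size no longer grows with the word vocabulary.
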
Methods
  • The authors evaluated the DSSM, proposed in Section 3, on the Web document ranking task using a real-world data set.
  • The authors first describe the data set on which the models are evaluated.
  • The labels are human-generated on a 5-level relevance scale, 0 to 4, where level 4 means the document is the most relevant to the query and level 0 means it is not relevant.
  • All the queries and documents are preprocessed such that the text is white-space tokenized and lowercased, numbers are retained, and no stemming/inflection is performed
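
Ranking quality over these 0-to-4 graded labels is reported with NDCG (the paper truncates at ranks 1, 3, and 10; cf. Jarvelin and Kekalainen [17]). Below is a minimal sketch of the metric using the common 2^rel - 1 gain with a log2 rank discount; the label list in the example is invented.

```python
import numpy as np

def dcg(relevances, k):
    """Discounted cumulative gain at rank k for graded relevance labels."""
    rels = np.asarray(relevances, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rels.size + 2))  # log2(rank + 1)
    return float(((2.0 ** rels - 1.0) / discounts).sum())

def ndcg(relevances, k):
    """NDCG@k: DCG of the given ranking divided by the DCG of the ideal
    (label-sorted) ranking; 0 if no relevant document exists."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0

# Labels (0-4) of documents in the order a model ranked them.
print(round(ndcg([4, 2, 0, 3, 1], k=3), 4))  # ~0.8076
```
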
Results
  • The main results of the experiments are summarized in Table 2, where the authors compared the best version of the DSSM (Row 12) with three sets of baseline models.
  • The third includes a set of state-of-the-art latent semantic models which are learned either on documents only in an unsupervised manner (LSA, PLSA, DAE as in Rows 4 to 6) or on clickthrough data in a supervised way (BLTM-PR, DPM, as in Rows 7 and 8).
  • In order to make the results comparable, the authors reimplemented these models following the descriptions in [10]; e.g., the LSA and DPM models are trained using a 40K-word vocabulary due to model complexity constraints, and the other models are trained using a 500K-word vocabulary.
Conclusion
  • The authors present and evaluate a series of new latent semantic models, notably those with deep architectures which the authors call the DSSM.
  • The main contribution lies in the significant extension of the previous latent semantic models (e.g., LSA) in three key aspects.
  • The authors show that the new techniques pertaining to each of the above three aspects lead to significant performance improvement on the document ranking task.
  • A combination of all three sets of new techniques has led to a new state-of-the-art semantic model that beats all the previously developed competing models by a significant margin.
Tables
  • Table 1: Word hashing token size and collision numbers as a function of the vocabulary size and the type of letter n-grams
  • Table 2: Comparative results with the previous state-of-the-art approaches and various settings of DSSM
  • Table 3
  • Table 4
  • Table 5: Examples where our deep semantic model performs better than TF-IDF
  • Table 6: Examples where our deep semantic model performs worse than TF-IDF
  • Table 7: Examples of the clustered words on five different output nodes of the trained DNN. The clustering criterion is high activation levels at the output nodes of the DNN
Related work
  • Our work is based on two recent extensions to the latent semantic models for IR. The first is the exploration of the clickthrough data for learning latent semantic models in a supervised fashion [10]. The second is the introduction of deep learning methods for semantic modeling [22].

    2.1 Latent Semantic Models and the Use of Clickthrough Data

    The use of latent semantic models for query-document matching is a long-standing research topic in the IR community. Popular models can be grouped into two categories, linear projection models and generative topic models, which we will review in turn.

    The most well-known linear projection model for IR is LSA [6]. By using the singular value decomposition (SVD) of a document-term matrix, a document (or a query) can be mapped to a low-dimensional concept vector.
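
A small numeric sketch of that SVD-based mapping (toy counts and plain NumPy; production LSA systems use large sparse TF-IDF matrices and truncated SVD solvers):

```python
import numpy as np

# Toy document-term matrix: rows are documents, columns are terms.
X = np.array([[2.0, 1.0, 0.0, 0.0],
              [1.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 2.0]])

k = 2  # number of latent concepts to keep
U, s, Vt = np.linalg.svd(X, full_matrices=False)
doc_concepts = U[:, :k] * s[:k]     # documents mapped into concept space

# "Fold in" a query by projecting its term vector onto the same basis.
q = np.array([1.0, 1.0, 0.0, 0.0])  # toy query term vector
q_concepts = q @ Vt[:k].T           # query in the same concept space

# Rank documents by cosine similarity in the concept space.
sims = (doc_concepts @ q_concepts) / (
    np.linalg.norm(doc_concepts, axis=1) * np.linalg.norm(q_concepts))
print(sims)  # docs 0 and 1 score high; doc 2 scores ~0
```
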
References
  • [1] Bengio, Y. 2009. “Learning deep architectures for AI.” Foundations and Trends in Machine Learning, vol. 2.
  • [2] Blei, D. M., Ng, A. Y., and Jordan, M. I. 2003. “Latent Dirichlet allocation.” JMLR, vol. 3.
  • [3] Burges, C., Shaked, T., Renshaw, E., Lazier, A., Deeds, M., Hamilton, N., and Hullender, G. 2005. “Learning to rank using gradient descent.” In ICML.
  • [4] Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., and Kuksa, P. 2011. “Natural language processing (almost) from scratch.” JMLR, vol. 12.
  • [5] Dahl, G., Yu, D., Deng, L., and Acero, A. 2012. “Context-dependent pre-trained deep neural networks for large vocabulary speech recognition.” IEEE Transactions on Audio, Speech, and Language Processing.
  • [6] Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T., and Harshman, R. 1990. “Indexing by latent semantic analysis.” Journal of the American Society for Information Science, 41(6): 391-407.
  • [7] Deng, L., He, X., and Gao, J. 2013. “Deep stacking networks for information retrieval.” In ICASSP.
  • [8] Dumais, S. T., Letsche, T. A., Littman, M. L., and Landauer, T. K. 1997. “Automatic cross-linguistic information retrieval using latent semantic indexing.” In AAAI-97 Spring Symposium Series: Cross-Language Text and Speech Retrieval.
  • [9] Gao, J., He, X., and Nie, J-Y. 2010. “Clickthrough-based translation models for web search: from word models to phrase models.” In CIKM.
  • [10] Gao, J., Toutanova, K., and Yih, W-T. 2011. “Clickthrough-based latent semantic models for web search.” In SIGIR.
  • [11] Gao, J., Yuan, W., Li, X., Deng, K., and Nie, J-Y. 2009. “Smoothing clickthrough data for web search ranking.” In SIGIR.
  • [12] He, X., Deng, L., and Chou, W. 2008. “Discriminative learning in sequential pattern recognition.” IEEE Signal Processing Magazine, Sept.
  • [13] Heck, L., Konig, Y., Sonmez, M. K., and Weintraub, M. 2000. “Robustness to telephone handset distortion in speaker recognition by discriminative feature design.” Speech Communication.
  • [14] Hinton, G., Deng, L., Yu, D., Dahl, G., Mohamed, A., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T., and Kingsbury, B. 2012. “Deep neural networks for acoustic modeling in speech recognition.” IEEE Signal Processing Magazine.
  • [15] Hofmann, T. 1999. “Probabilistic latent semantic indexing.” In SIGIR.
  • [16] Hutchinson, B., Deng, L., and Yu, D. 2013. “Tensor deep stacking networks.” IEEE T-PAMI, vol. 35.
  • [17] Jarvelin, K., and Kekalainen, J. 2000. “IR evaluation methods for retrieving highly relevant documents.” In SIGIR.
  • [18] Konig, Y., Heck, L., Weintraub, M., and Sonmez, M. K. 1998. “Nonlinear discriminant feature extraction for robust text-independent speaker recognition.” In RLA2C.
  • [19] Mesnil, G., He, X., Deng, L., and Bengio, Y. 2013. “Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding.” In Interspeech.
  • [20] Montavon, G., Orr, G., and Müller, K. 2012. Neural Networks: Tricks of the Trade (second edition). Springer.
  • [21] Platt, J., Toutanova, K., and Yih, W. 2010. “Translingual document representations from discriminative projections.” In EMNLP.
  • [22] Salakhutdinov, R., and Hinton, G. 2007. “Semantic hashing.” In Proc. SIGIR Workshop on Information Retrieval and Applications of Graphical Models.
  • [23] Socher, R., Huval, B., Manning, C., and Ng, A. 2012. “Semantic compositionality through recursive matrix-vector spaces.” In EMNLP.
  • [24] Svore, K., and Burges, C. 2009. “A machine learning approach for improved BM25 retrieval.” In CIKM.
  • [25] Tur, G., Deng, L., Hakkani-Tur, D., and He, X. 2012. “Towards deeper understanding: deep convex networks for semantic utterance classification.” In ICASSP.
  • [26] Yih, W., Toutanova, K., Platt, J., and Meek, C. 2011. “Learning discriminative projections for text similarity measures.” In CoNLL.