1. Are there any existing systems that can evaluate an information retrieval model? For example, in the final project we are supposed to construct a model. How can we claim that it is good or bad?
2. How do we evaluate a multilingual information retrieval model?
Friday, February 22, 2013
Relevance feedback
Relevance feedback is an information retrieval system feature that helps improve the search query using judgments supplied explicitly by the user (explicit feedback), observations of how the user interacts with the system (implicit feedback), or automatic local analysis (blind feedback, also called pseudo feedback). Relevance feedback helps either by adjusting the weights of the original query terms or by adding new terms that are closer to what the user is searching for.
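To make the re-weighting and term-adding idea concrete, here is a minimal sketch using the Rocchio formula, which is one common way to do explicit feedback (the post above does not name a specific method, so this choice is my assumption). The function name `rocchio`, the toy documents, and the coefficients `alpha`, `beta`, and `gamma` are illustrative only.

```python
from collections import Counter


def rocchio(query, relevant_docs, nonrelevant_docs,
            alpha=1.0, beta=0.75, gamma=0.15):
    """Return a re-weighted query as a dict of term -> weight.

    query and each document are dicts of term -> weight.
    alpha, beta, gamma are assumed, commonly quoted defaults.
    """
    new_query = Counter()
    # Keep the original query terms, scaled by alpha.
    for term, weight in query.items():
        new_query[term] += alpha * weight
    # Move toward the centroid of relevant documents and
    # away from the centroid of non-relevant documents.
    for docs, coeff in ((relevant_docs, beta), (nonrelevant_docs, -gamma)):
        if not docs:
            continue
        for doc in docs:
            for term, weight in doc.items():
                new_query[term] += coeff * weight / len(docs)
    # Drop terms whose weight ended up zero or negative.
    return {t: w for t, w in new_query.items() if w > 0}


# Toy example: the user marked two documents relevant and one not relevant.
query = {"jaguar": 1.0}
relevant = [{"jaguar": 1.0, "cat": 1.0}, {"jaguar": 1.0, "wildlife": 1.0}]
nonrelevant = [{"jaguar": 1.0, "car": 1.0}]
expanded = rocchio(query, relevant, nonrelevant)
```

In this toy run the original term `jaguar` gets a higher weight, `cat` and `wildlife` are added as new terms, and `car` is dropped because its weight becomes negative.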
Friday, February 15, 2013
Muddiest Point after week 5
In class, Dr. He talked about two choices for language models: unigram versus higher-order models, and multinomial versus multiple-Bernoulli. He also talked about three models for ranking: query likelihood, document likelihood, and divergence of the query and document models. I am wondering what factors need to be taken into consideration when choosing among the language models or the ranking models?
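As a note to myself, here is a minimal sketch of query-likelihood ranking with a unigram multinomial model. The smoothing step (linear interpolation with the collection model, weight `lam`) is my own assumption and not something stated above; the function name and the two toy documents are illustrative only.

```python
import math
from collections import Counter


def query_log_likelihood(query_terms, doc_terms,
                         collection_counts, collection_len, lam=0.5):
    """Log P(query | document) under a smoothed unigram model."""
    doc_counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        p_coll = collection_counts[term] / collection_len
        # Assumed smoothing: mix the document and collection estimates.
        p = (1 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # The term never occurs in the collection at all.
            return float("-inf")
        score += math.log(p)
    return score


docs = {
    "d1": "language model for retrieval".split(),
    "d2": "boolean model for search".split(),
}
collection_counts = Counter(t for terms in docs.values() for t in terms)
collection_len = sum(collection_counts.values())

query = "language model".split()
ranking = sorted(
    docs,
    key=lambda d: query_log_likelihood(
        query, docs[d], collection_counts, collection_len),
    reverse=True,
)
```

Here `d1` ranks above `d2` because it contains both query terms, while `d2` only gets credit for "language" through the collection model.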
One criticism of Information Retrieval Evaluation
I think IR evaluation needs to come from the user side. This is generally difficult because different users may understand the returned document set differently and thus interpret differently which documents are relevant. Because the subjectiveness associated with deciding relevance is difficult to dismiss, IR evaluation lacks a solid formal framework as a foundation. I guess that is why user-oriented measures are used. I am wondering under what circumstances we would use the different measures, like the Harmonic Mean or the E Measure?
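To compare the two measures side by side, here is a small sketch that computes precision, recall, the harmonic mean F, and the E measure for one query. The function names and the toy document IDs are made up for illustration, and the E measure is written in the form I believe is standard, E = 1 - (1 + b^2) / (b^2 / recall + 1 / precision).

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def harmonic_mean(precision, recall):
    """F: high only when both precision and recall are high."""
    if precision == 0 or recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def e_measure(precision, recall, b=1.0):
    """E measure; lower is better.

    In this form b = 0 gives 1 - precision, and a very large b
    approaches 1 - recall, so b sets the relative weight of recall.
    """
    if precision == 0 or recall == 0:
        return 1.0
    return 1 - (1 + b ** 2) / (b ** 2 / recall + 1 / precision)


# Toy judgments for a single query (illustrative only).
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5"]

p, r = precision_recall(retrieved, relevant)
f = harmonic_mean(p, r)
e = e_measure(p, r, b=1.0)
```

With b = 1 the E measure is just 1 - F, so the two only differ when b is used to state that the user cares more about one of precision or recall.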
Friday, February 1, 2013
Modeling
I listed the following comparisons as my notes on the classic models.
Boolean Model:
- Advantages:
  - Allows logical operators (AND, OR, NOT) in queries
  - Returns every document that matches the query
- Disadvantages:
  - Has no particular order of output
  - Treats all retrieved documents equally, from the most to the least relevant
  - Often requires examining a large output
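A tiny sketch of Boolean retrieval helps me see both sides of the list above. The three toy documents and the helper `postings` are made up for illustration; the inverted index is just a dict from term to a set of document IDs.

```python
docs = {
    1: "information retrieval evaluation",
    2: "boolean retrieval model",
    3: "vector space model",
}

# Build an inverted index: term -> set of document IDs containing it.
index = {}
for doc_id, text in docs.items():
    for term in text.split():
        index.setdefault(term, set()).add(doc_id)


def postings(term):
    """Return the set of documents containing the term (empty if none)."""
    return index.get(term, set())


# Boolean operators map directly onto set operations.
and_result = postings("retrieval") & postings("model")    # retrieval AND model
or_result = postings("retrieval") | postings("model")     # retrieval OR model
not_result = postings("model") - postings("boolean")      # model AND NOT boolean
```

The results are plain sets, so there is no ranking at all: the OR query returns all three documents with nothing to say which one is the best match.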
Vector Model:
- Advantages:
  - Returns ranked results
  - Weights terms by importance
  - Allows partial matches
- Disadvantages:
  - Assumes terms are independent
  - Weighting is intuitive, but not very formal
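For comparison with the Boolean model, here is a minimal sketch of vector-model ranking using TF-IDF weights and cosine similarity. The exact weighting (raw term frequency times log(N / df)) is one common variant that I am assuming here, and the toy documents and function names are illustrative only.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (document vectors, idf table) for tokenized documents."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(t for terms in docs.values() for t in set(terms))
    idf = {t: math.log(n / df[t]) for t in df}
    vectors = {
        d: {t: tf * idf[t] for t, tf in Counter(terms).items()}
        for d, terms in docs.items()
    }
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


docs = {
    "d1": "boolean retrieval model".split(),
    "d2": "vector space model ranking".split(),
    "d3": "vector ranking ranking evaluation".split(),
}
vectors, idf = tfidf_vectors(docs)

# The query is weighted with the same idf table; unknown terms get 0.
query_vec = {t: idf.get(t, 0.0) for t in "vector ranking".split()}
scores = {d: cosine(query_vec, vec) for d, vec in vectors.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Each document gets a score between 0 and 1 instead of a yes or no, which is what makes ranked output and partial matches possible. The score only gives an ordering, though; it is not a calibrated statement of how relevant a document is.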
My question about the vector model: it is hard to make sense of how a document that is 10% relevant to the query conveys less useful information for resolving the underlying information problem than one that is 40% relevant. Is there a clear difference between them in terms of fulfilling the query need?