Information Relevance

The concept of relevance is fundamental to assessing the performance of systems that enable people to find information. As with information quality, there is often little consideration of the implications of the term, just a desire to deliver ‘highly relevant information’. The word ‘relevant’ dates back to at least the early 16th century, when it was used in Scotland to denote something that was legally pertinent. Relevance is always relevance ‘to’ something: it has to have a context, and it is also a personal construct. It is probably one of the few terms in the information world that does not have to be defined; we all have a sense of what is relevant.

In the 1950s it was more usual to refer to non-relevant ‘false drops’ than to relevant items, and it was not until 1955 that relevance started to be discussed in the context of information retrieval. Initially the two components of search output were referred to as ‘recall’ and ‘relevance’, but the term ‘precision’ gradually replaced ‘relevance’, and now both recall and precision are defined in terms of relevance. There is a very good review by Tefko Saracevic of how the concept of relevance emerged and its relationship to information retrieval, which also considers some of the philosophical aspects of relevance. Saracevic has also written an excellent monograph, “The Notion of Relevance in Information Science: Everybody knows what relevance is. But, what is it really?”, published in September 2016 by Morgan & Claypool.
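As a minimal sketch of how both measures reduce to relevance judgements, the following Python fragment computes precision and recall for a single query; all document identifiers and judgements here are invented for illustration.

```python
# Minimal sketch: precision and recall both reduce to relevance judgements.
# All document identifiers and judgements are invented for illustration.

retrieved = {"doc1", "doc2", "doc3", "doc4"}         # results the search returned
relevant = {"doc2", "doc4", "doc5", "doc6", "doc7"}  # items judged relevant in the collection

relevant_retrieved = retrieved & relevant            # the overlap drives both measures

precision = len(relevant_retrieved) / len(retrieved)  # proportion of results that are relevant
recall = len(relevant_retrieved) / len(relevant)      # proportion of relevant items retrieved

print(f"Precision: {precision:.2f}")  # Precision: 0.50
print(f"Recall: {recall:.2f}")        # Recall: 0.40
```

The point to notice is that neither measure can be computed without first deciding which documents are relevant, which is exactly the judgement this article argues is personal and unstable.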

In assessing whether an information system is delivering relevant information there are some important issues to consider. The first of these is that relevance is a personal perspective, and so stating that certain information is relevant to all HR managers is pushing the boundaries of the concept. Personalisation and customisation can manage the volume of information presented to a user, but not its relevance. It is certainly possible to state that HR managers need to be made aware of a specific policy, but that is not the same as stating that it is relevant to them as a group.

The second issue is that relevance can vary with time. A user can locate what seem to be highly relevant results from a search results page, and that process forms the basis for many metrics of search performance; a sketch of one such metric follows below. However, at that stage it is unlikely that the user will have read the documents in full. That will happen once the search has been completed. The user may then find not only that some of the documents are not relevant (they may be older than they first appeared, or only applicable in a certain country) but also that the review highlights terms or concepts indicating that the initial search may have used some or all of the wrong search terms, and the search needs to be conducted a second time.
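One widely used metric of this kind is precision at k, which scores a ranked results page from the user’s initial judgements. The judgements and cutoffs below are assumptions invented for the sketch.

```python
# Hypothetical illustration: precision at k, scored from a user's initial
# relevance judgements of a ranked results page (True = judged relevant).
judgements = [True, True, False, True, False, False, True, False, False, False]

def precision_at_k(judged: list[bool], k: int) -> float:
    """Fraction of the top-k results judged relevant at search time."""
    return sum(judged[:k]) / k

print(precision_at_k(judgements, 5))   # 0.6 - three of the first five judged relevant
print(precision_at_k(judgements, 10))  # 0.4
```

The limitation described above is that these judgements are captured before the documents have been read in full, so a metric of this kind can overstate eventual satisfaction.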

It could well be that a week or so later the user, having now started to prepare the report for which the search was to have provided the core information, discovers that even the relevant documents are not as relevant as they seemed the week before. This is usually because other documents, information and above all knowledge are now available which suggest that only a few of the original set of search results are indeed relevant. Then a draft is circulated and colleagues come up with additional information and knowledge that further reduce the number of relevant documents. And so the process continues. The point of this scenario is that care needs to be taken, when assessing user satisfaction, to provide an opportunity to take a wider look at search (and for that matter browse) satisfaction. If users are asked whether the search system delivered relevant documents, should they answer on the basis of the initial search (“Excellent!”) or on the basis of the situation three weeks later (“I only used two”)?

For some additional perspectives on relevance see:

What does relevant mean? by Paul Nelson, Search Technologies

Relevant Search by Doug Turnbull and John Berryman

Measuring search relevance by Hugh Williams

Web search relevance ranking by Hugo Zaragoza and Marc Najork

The Probabilistic Relevance Framework: BM25 and Beyond by Stephen Robertson and Hugo Zaragoza

See also Information Architecture, Information Behaviour, Information Needs, Information Seeking

Martin White

October 2016
