Tracking developments in enterprise search
Throughout my career I have been a collector of information and many a colleague has been surprised by just how many of the drawers in my filing cabinets were filled with photocopies of articles and research papers. Since setting up Intranet Focus in 1999 I have managed to create a substantial digital library as well as a rather large collection of books. My initial interest was in intranets, and there were (and are still) relatively few case studies and even fewer research papers.
As I started to focus on enterprise search in 2008 I realized there would be considerable value in being more methodical and wide-ranging in collecting and reading research papers given the substantial amount of information retrieval research being undertaken. This research ensures that I am in a position to make a useful contribution to the work of the Information School at the University of Sheffield where I hold a position as a Visiting Professor. I am also able to come up with insights and novel solutions to client challenges.
In this post I have set out some of the resources I use to track the outcomes of research and practice in case they are of interest to you. My scanning focuses on applied information retrieval and enterprise search rather than on fundamental IR research. In fact there are very few papers specifically on enterprise search, but many could be applied if you have a good understanding of the elements that go into the development and use of enterprise search applications. I usually spend an hour or so each week working through the resources below, adding around 15-20 papers to my library.
For me there is a core set of journals.
Only Computational Linguistics and Information Research are open access. Fortunately, I have access to the digital resources of the University Library at Sheffield.
ACM Digital Library
As a member of the Association for Computing Machinery I have access to the excellent Digital Library service, which has a very good search application. I have set up alerts for a number of ACM journals, and the ACM conference proceedings are an essential source of current research. ACM publications are behind a membership paywall, but I find that the wealth of research published by the ACM, especially the conference proceedings, is well worth the annual subscription.
Google Scholar indexes not only journals but also conference proceedings, reports and, above all, theses. I have set up a couple of alerts on Google Scholar which seem to work well for me. Probably the major benefit is that Google Scholar identifies open access versions of papers published in subscription journals.
The research scope of Microsoft Research is very wide indeed. I just look at the research reports on human-computer interaction and occasionally at research on artificial intelligence.
arXiv publishes pre-prints of research papers. Many will end up as published papers but others will remain at pre-print status without peer review, so a degree of discretion is needed when considering the outcomes of the research. I monitor papers in the Computers and Society, Information Retrieval, Human-Computer Interaction, Computation and Language and Digital Libraries collections, all of which are sections of the Computer Science category.
Blogs and Books
There are lists of blogs and books on the Intranet Focus web site.
Serendipity also plays an important role. One rule that I have is that when I spot a research paper in a journal I am not familiar with I always work back through the last couple of years to see what I might have missed. Rarely will it be added to my core journal list but I have built up a list of around 30 other titles that I might spend an afternoon every few months browsing through.