Information Theory Ramblings: Information Retrieval Introduction

Everyone benefits from information retrieval. Most people know nothing about it, yet they encounter it every day. After all, who doesn't use a search engine these days? Search engines are simply algorithms: equations and logical commands built toward the end goal of retrieving relevant information on demand. The field started before there was an internet. Libraries needed to search for relevant books in their inventory. You might also know of the Rolodex... that, too, is a result of information retrieval research. Before I start commenting on contemporary research and articles, I will take some time to explain some of the basics of information retrieval (IR).

1 Introduction
Information retrieval (IR) sounds like a broad topic, and to an extent it is. Picking up a bag of coffee at Starbucks to read the label is technically information retrieval. From another perspective, typing “Starbucks” into the Google search engine is also information retrieval. IR is more commonly associated with computer searches. Though computers play a large role in information retrieval, both in development and in application, the study of the topic predates computers themselves.

The phrase “information retrieval” was coined by Calvin Mooers in the 1950s, and computers weren’t used on a notable level until the 1960s. According to S. Robertson, writing in Charting a New Course: Natural Language Processing and Information Retrieval (2005), IR in its early days was studied with respect to systems like card catalogues, indexes, and punch card mechanisms (p. 13). Incredible advances in technology followed, and with them came the World Wide Web (WWW). The WWW made massive amounts of information available across the globe (Abderrahim, Aboutajdine, & Ennouary, 2010). Since then, ever more information has become available in more accessible ways and at faster speeds. The concern is how to make that information accessible in an efficient and accurate manner. Thus the major concerns studied within the field of information retrieval are those of how to represent the information to be retrieved (Robertson, 2005). To see why this is true, the process of information retrieval must be examined.

This topic will be organized as follows: section two defines and introduces information retrieval, section three presents the probability model of information retrieval, section four defines the fundamental concepts of frequency, weight, and score, section five presents the vector space model for information retrieval, and section six concludes the introduction and points towards further research.

Sources:
Robertson, S. (2005). Charting a New Course: Natural Language Processing and Information Retrieval (J. I. Tait, Ed.). Dordrecht: Springer.

Abderrahim, E. Q., Aboutajdine, D., & Ennouary, Y. (2010, November 2). Formal Concept Analysis for Information Retrieval. International Journal of Computer Science and Information Security, 7.

Thank you for reading!

Dustin Smith

Sunday, February 13, 2011
