In this paper we present a content-based image retrieval (CBIR) system for a database of pulmonary
nodule images and compare the effectiveness of various texture features and similarity measures in
retrieving similar images from a medical database. We are particularly interested in how well texture feature
analysis performs on lung nodules obtained from the Lung Image Database Consortium (LIDC), which has
provided a set of lung CT images along with information about the nodules shown in these images. In our paper
we compare three different types of texture features: (1) co-occurrence matrices, (2) Gabor filters, and
(3) Markov random fields. Each method extracts a “feature vector” (a series of numbers) from an
image that represents the image’s signature. This vector is then compared with the vectors of other images
using various similarity measures.
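
As a rough illustration of the comparison step, the sketch below ranks a small set of feature vectors by their distance to a query vector. The Euclidean and Manhattan distances are used only as examples of similarity measures; which measures the system actually uses is not stated here, and the function names and toy vectors are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    """Sum of absolute component-wise differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_distance(query, database, distance=euclidean_distance):
    """Return database indices ordered from most to least similar to the query.

    A smaller distance is treated as a higher similarity.
    """
    return sorted(range(len(database)), key=lambda i: distance(query, database[i]))


# Toy feature vectors (illustrative values, not real texture features).
query_vector = [0.0, 0.0]
image_vectors = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]
ranking = rank_by_distance(query_vector, image_vectors)
```

Swapping the `distance` argument changes the similarity measure without touching the ranking logic, which is one simple way to compare several measures on the same feature vectors.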
We base our evaluation on the idea that the first results
returned by the system for a particular nodule should be other instances of that same nodule, perhaps on
a different CT slice or marked and rated by a different radiologist. Thus, ground truth is determined by
objective, a priori knowledge about the nodules. Precision is then defined as the number of retrieved
instances of the query nodule divided by the number of retrieved images, and recall is defined as the
number of retrieved instances of the query nodule divided by the total number of instances of the query nodule.
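
The two definitions above can be sketched in a few lines. The nodule identifiers and counts below are invented for illustration, and the sketch assumes that the query image itself is excluded from both the retrieved list and the instance count, a detail the definitions leave open.

```python
def precision_recall(retrieved_nodule_ids, query_nodule_id, total_instances):
    """Compute precision and recall for a single query.

    retrieved_nodule_ids: nodule ID of each retrieved image, in rank order.
    query_nodule_id: nodule ID of the query image.
    total_instances: number of images of the query nodule in the database
        (assumed here to exclude the query image itself).
    """
    hits = sum(1 for nodule_id in retrieved_nodule_ids if nodule_id == query_nodule_id)
    precision = hits / len(retrieved_nodule_ids) if retrieved_nodule_ids else 0.0
    recall = hits / total_instances if total_instances else 0.0
    return precision, recall


# Hypothetical query: five images retrieved, three of them show the query nodule,
# and the database holds four other instances of that nodule.
retrieved = ["n7", "n7", "n3", "n7", "n1"]
precision, recall = precision_recall(retrieved, "n7", total_instances=4)
```

In this made-up case three of the five retrieved images are correct and three of the four available instances are found.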
We have determined that Gabor-based image features generally perform better than global co-occurrence
measures for the images in the LIDC database, with a maximum average precision of 68%.