SINCERITY: A Search Engine for Image Retrieval
This article presents the third and last phase of a research project wherein the search engine SINCERITY was tested with a sample of images and image searchers. The performance of the two main searching features of the image search engine (the keyword box and taxonomic structure) is compared. In addition, this study aims to understand the effect of users’ language on their language preferences when searching for images. Both quantitative and qualitative data were gathered using search retrieval tasks and a questionnaire.
Cet article présente la troisième et dernière phase d’un projet de recherche dans lequel le moteur de recherche SINCERITY a été évalué avec un échantillon d’images et de chercheurs d’images. La performance des deux principales fonctions de recherche du moteur de recherche d’images (recherche par mot-clé et avec une structure taxinomique) est comparée. En outre, cette étude vise à comprendre l’impact du langage du chercheur d’images sur leurs préférences linguistiques lors de la recherche d’images. Des données quantitatives et qualitatives ont été recueillies à l’aide des tâches de repérage et d’un questionnaire écrit.
digital images, search engine, image indexing, image retrieval, taxonomy, usability, cross-language information retrieval (CLIR), multilingual information, image collections
images numériques, moteur de recherche, indexation d’images, repérage d’images, taxinomie, utilisabilité, repérage d’information multilingue (RIML), information multilingue, collections d’images
“One day I will find the right words, and they will be simple.”
“Un jour, je vais trouver les mots justes, et ils seront simples.”
Jack Kerouac, The Dharma Bums
Image retrieval has been explored since the 1970s. This type of searching is typically performed through search engines that also provide access to textual documents. These conventional systems include a database, a user interface, a search component, and an output unit. With traditional image retrieval systems, keywords are used as descriptors to index an image and as query terms. Retrieval is possible if a match is established between indexing and query terms. If this is not achieved, the image searcher will inevitably have to modify the query until the system displays satisfactory results.
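The matching principle described above can be illustrated with a minimal sketch. It is not the implementation of any particular system; the toy index, the image identifiers, and the function name `retrieve` are assumptions made for illustration only.

```python
def retrieve(index, query):
    """Return the ids of images whose indexing terms share at least one term with the query."""
    query_terms = {term.lower() for term in query.split()}
    return sorted(
        image_id
        for image_id, indexing_terms in index.items()
        if query_terms & {term.lower() for term in indexing_terms}
    )


# Hypothetical toy index: image id -> indexing terms assigned by an indexer.
INDEX = {
    "img01": ["flowers", "garden"],
    "img02": ["skiing", "mountain"],
    "img03": ["sunset", "clouds"],
}

print(retrieve(INDEX, "mountain skiing"))  # a query term matches an indexing term
print(retrieve(INDEX, "tulips"))           # no match: the searcher must reformulate
```

In this sketch a query that shares no term with the index returns an empty list, which corresponds to the situation where the searcher has to modify the query.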
Compared with textual retrieval, non-textual retrieval presents very different problems. The image-indexing process is considered a crucial stage leading to successful or unsuccessful retrieval. Context-based image retrieval uses text to describe the content of an image. In addition, by representing image content with terms, image searchers can search for images through text queries they consider to be easy and intuitive. However, the description associated with an image is not always all-encompassing and accurate. Indeed, the content of an image is much richer and cannot actually be represented by only a few keywords. In addition to this limitation, inadequacies and ambiguities can occur during image indexing due to numerous factors. The search process can then be lengthy and uncertain, since an image can be interpreted and described in many ways. There is no guarantee that image searchers will use the same terms when writing their queries.
This article presents the second evaluation phase of a search engine (SINCERITY: Search INterfaCE for the Retrieval of Images indexed with a TaxonomY) for image retrieval in a bilingual (English and French) context, that is, when the query language differs from the indexing language (Neugebauer and Ménard 2015). This evaluation process was part of the third and last phase of the research project. Like a traditional search engine, SINCERITY includes a search box where image searchers can formulate a query in their own words. However, SINCERITY also offers another way to initiate queries. In a previous phase of the study (Ménard, Khashman, and Dorey 2013), participants were asked to report problems they encountered when searching for images. Following the data analysis, it became clear that there was a consensus on the difficulty in formulating a query that describes the image one is looking for. Therefore, a taxonomy (TIIARA: Taxonomy for Image Indexing And RetrievAl) was integrated into the search engine. This search alternative allows image searchers to select a predetermined subject category to initiate their queries and ease the search process. The main reasons for this inclusion were to promote consistency and to increase the probability that words chosen by the indexer could be matched with those of the image searcher (Jörgensen 2003). Even if very few search engines provide a taxonomic structure to initiate queries, we considered that browsing a subject classification system could be a useful functionality for image searchers who have difficulties formulating a query in their own words.
During the review of search engines and tools that give access to images (Ménard and Smithglass 2014), it was also observed that most search engines offer monolingual retrieval; that is, the query-term language has to match the indexing-term language. However, this limits the number of results, not to mention, in some cases, the accuracy of retrieved images. Indeed, people who speak only one language might never be able to find images indexed in other languages. Searching with one language may not always satisfy the image searcher because relevant results might not even exist in the searcher’s language. As highlighted in one of the previous phases of this study, most people would like to access images in other languages, and they are very interested in browsing as opposed to searching (Ménard and Khashman 2014). Nevertheless, the majority of image searchers we surveyed and interviewed indicated that they conducted their searches almost exclusively in their native language. This means that most image searchers do not have the reflex to translate their queries into another language to obtain more accurate, or simply additional, results. Several reasons could explain this observation. Image searchers are basically unaware that modifying the language of their query or including several words from different languages when searching can affect the results (in terms of recall and precision). Another explanation could be that image searchers do not know that online translation engines exist or do not trust these tools. To bridge this gap and give image searchers a chance to easily access a variety of visual resources independent of the image’s indexing language, SINCERITY allows users not only to change the language of the interface but also to use a machine translation device to translate queries, if desired, from French to English and from English to French.
The results of the first phase of testing revealed that even though image indexing was sometimes challenging, the majority of participants did not encounter difficulties retrieving images with SINCERITY (Ménard and Girouard 2015). The comments and suggestions received were integrated into a new version of SINCERITY to improve the performance and aesthetics of the search engine. This article presents the third and last phase of the project, wherein SINCERITY was tested with a sample of images and image searchers. The performance of the two main searching features of the image search engine (the keyword box and taxonomic structure) is compared. In addition, this study aims to understand the effect of users’ language on their language preferences when searching for images. Both quantitative and qualitative data were gathered using search retrieval tasks and a questionnaire. The article is structured as follows: the following section surveys previous studies in image management and access; the third section presents the objective of this research; the fourth section describes the methodology used in the study; the fifth section reports the main findings, which are then discussed in the sixth section; and the last section concludes the article and suggests future directions.
Over the years, image retrieval has received attention from many researchers (e.g., Panofsky 1955; Krause 1988; Markey 1988; Armitage and Enser 1997; Jörgensen 1998, 2003; Markkula and Sormunen 2000; Goodrum and Spink 2001; Choi and Rasmussen 2002, 2003; Bar-Ilan 2004; Machill, Beiler, and Zenker 2008; Spink and Jansen 2004; Thelwall 2004; Bar-Ilan, Mat-Hassan, and Levene 2006; Jansen and Spink 2006; Matusiak 2006; Enser et al. 2007; Enser 2008; Greisdorf and O’Connor 2008; Ménard 2008; Rorissa 2008; E.-K. Chung and Yoon 2009; Ginsberg et al. 2009; Stvilia and Jörgensen 2009; Benson 2011). Methods and systems have been described in much detail in the literature. However, as stated by Joan E. Beaudoin (2016), “That image retrieval continues to be a challenge for even expert users is a clear indication that additional research is needed. It is hoped that this study has shown that the practical application of CBIR techniques could alleviate some of the problems that image users continue to face.” Content-based image retrieval (CBIR) is one of the most popular and growing research areas of digital image processing. However, as stated by Tomasz Neugebauer and Elaine Ménard (2015), image searchers mostly prefer to search with various metadata-based keywords related to the content of the image they are looking for, the events taking place, or the people who appear in the picture. Image searchers hardly ever express a desire to initiate their queries with a drawing or a similar image. Some surveys were conducted on CBIR (Y.-C. Chung, Wang, and Chen 2004; Graham 2004; Kherfi, Ziou, and Bernardi 2004; Shah, Javed, and Shafique 2007; Datta et al. 2008; Enser 2008; Jain et al. 2009; Klare, Li, and Jain 2011; Beaudoin 2016). Although there are many sophisticated algorithms in CBIR systems that rely on colour, shape, and texture, the images resulting from this type of search are not always accurate.
Image searchers are not yet familiar with the use of these low-level characteristics and continue to prefer searching for images with words, as the low-level image features cannot always depict what they have in mind. From time to time, a CBIR system becomes freely available for other researchers to customize on the Web, probably as the result of an extensive research project. While the mechanisms for providing feedback on the interface and relevance are sometimes given, these systems are generally difficult to use. They are rarely maintained and after some time are no longer supported. Therefore, CBIR systems are still not very widespread, and image searchers know little about their existence and use.
As a consequence of this obvious lack of interest in many CBIR systems, the development of SINCERITY was directed instead toward the integration of two main search functionalities: (1) a search box to enter keywords, as is encountered in most search engines; and (2) a browsing structure that contains predetermined subjects. Most of the available image search tools, such as Google Images and Yahoo! Image, are based on textual annotation of images. In these tools, images are automatically annotated with keywords extracted from surrounding text and then retrieved using text-based search methods. As a result, image searchers find what they are searching for only when their query terms match these annotations. With these systems, image searchers are still responsible for initiating their queries with their own keywords. Thus, SINCERITY includes a taxonomic structure as a possible alternative to keyword searching. According to Richard P. Smiraglia (2014, 54), “Taxonomies supply defining characteristics and identify the sources of the definite science from which the characteristics were observed. Taxonomies, like ontologies, arrange concepts in hierarchical orders.” The use of a taxonomy as a search functionality to initiate queries offers several potential advantages to the user. It uses consistent categories that are intuitive and complete. Categories are also arranged in a hierarchical and predictable manner. The taxonomic structure shows previews of where to go next and shows how to return to previous categories; categories suggest logical alternatives. Browsing predetermined subject categories is useful as a means of narrowing the results, and this search option is considered to be of primary importance for image searches. The design of a usable interface is key to creating a good user experience. Indeed, taxonomic categories greatly facilitate efficient retrieval in database searching.
It helps users avoid dead ends and empty result sets as searches are narrowed. In brief, users prefer searching with organized and predictable hierarchies. Nevertheless, as with all controlled vocabularies, searching with a taxonomic structure also presents disadvantages. Categories must be known in advance, and important trends may not be shown, which are inherent shortcomings. Also, the use of a taxonomic structure supposes that it needs to be maintained regularly to keep it up-to-date and to compete with keyword searching.
In a previous study (Ménard and Khashman 2014), questionnaires and interviews revealed that participants have a clear preference for what they consider to be a “friendly” interface that is both easy to use and easy to learn. Simply put, “the design of its user interface (e.g. menus, toolbars, buttons, icons, frames) should facilitate the visitor in his exploration” (Pallas and Economides 2008, 51). Design, layout, labelling, and system performance were all considered when developing SINCERITY.
Stan Ruecker, Ali Shiri, and Carlos Fiorentino (2012) developed two search interfaces that draw on the semantic richness of bilingual thesauri and provide for searching, browsing, and displaying results: the Searchling interface offers search, browsable navigation and the full term-record data for a selected term, while the T-Saurus interface is more visual, with term search results represented by the size, number, proximity, and opacity of buckets. The results of their study demonstrated that users’ preferences and skills may affect how they evaluate visualization user interfaces and environments.
In addition to offering a bilingual (English and French) search interface, SINCERITY also allows users to conduct their searches in both languages, which means that their queries can be automatically translated into French or English. Research in the field of cross-language information retrieval (CLIR) can be traced back to the 1970s with the work of Gerard Salton (1970, 1973). A significant number of papers were published summarizing efforts in various sub-fields of multilingual information retrieval. While most research focuses on the retrieval effectiveness of cross-language systems through information retrieval test collection approaches (Braschler and Schäuble 2000), few researchers focus on the user interface requirements with respect to the multilingual retrieval task (Oard, He, and Wang 2008; Ogden and Davis 2000). In addition, some research projects on query translation for different language pairs have been conducted (Klavans and Schäuble 1998; Gey, Kando, and Peters 2005; Gey et al. 2006). Jennifer Marlow et al. (2008) examined the effect of language skill on the use of Google Translate during a multilingual search. Their study showed that for unfamiliar languages participants made substantial use of machine translation, while for familiar languages participants tended to write their own translations and focused on web pages in the original language. Recently, several researchers have leaned toward the study of links that could be established between digital libraries and CLIR mechanisms as the number of multilingual digital libraries is rapidly growing. As stated by Anne R. Diekema (2012, 175), “enabling users to search across languages requires translation resources to cross the language barrier.” In the same way, Krystyna K. Matusiak et al. (2015) investigated different approaches for multilingual indexing and retrieval in digital collections and presented a model for creating bilingual parallel records that combines translation with controlled vocabulary mapping.
In a CLIR system, queries and/or documents need to be translated. Most approaches translate queries into the document language and then perform monolingual retrieval. Among the numerous approaches studied in CLIR systems, query translation is probably the most common (Wang, Lu, and Chien 2004; Wang et al. 2006; Braschler and Ferro 2007). Query translation is a widely used technique owing to its low computational cost for translation compared to the effort of translating a large set of documents. For visual resources such as images that are indexed with only a few words, query translation seems the best method. However, translation ambiguity often emerges, as short queries often do not provide enough context to be properly translated. In addition, an important body of literature exists on the linguistic resources integrated into CLIR systems. Many different linguistic resources have been proposed and extensively discussed in the literature, with their advantages and limitations explained in detail, as all CLIR systems rely heavily on the use of language resources: bilingual or multilingual dictionaries, machine translation, and parallel or comparable corpora (Pirkola et al. 2001; Hedlund et al. 2004; Chen and Gey 2004; Zhang and Vines 2004; Xu and Weischedel 2005). Recent research has focused on a machine translation–based approach, which is said to be useful for translating short queries. However, automatic translation is not always available for all pairs of source and target languages. Consequently, the absence of translation resources between two languages is still a real obstacle in CLIR systems using machine translation devices. The main advantage of this approach is that affordable machine translation technologies are now easily accessible and can be integrated into a search engine prototype without difficulty. Similar to most CLIR systems, SINCERITY is based on the integration of an automatic translation device for query translation: Microsoft Translate.
This Web service was chosen for several reasons, the main one being that it is free for up to 2 million characters per month, whereas the Google Translate Web service requires a monthly payment for any usage (Neugebauer and Ménard 2015).
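The query translation approach described above (translate the query into the indexing language, then perform monolingual retrieval) can be sketched as follows. This is an illustration under stated assumptions, not SINCERITY's code: a small hypothetical French-to-English lexicon stands in for the machine translation service, and the index and function names are invented for the example.

```python
# Hypothetical bilingual lexicon standing in for a machine translation service.
FR_TO_EN = {"fleurs": "flowers", "nuages": "clouds", "ski": "skiing"}

# Hypothetical index whose indexing terms are in English only.
EN_INDEX = {
    "img01": ["flowers", "garden"],
    "img02": ["skiing", "mountain"],
    "img03": ["sunset", "clouds"],
}


def translate_query(query, lexicon):
    """Translate each query word; words missing from the lexicon are kept unchanged."""
    return " ".join(lexicon.get(word, word) for word in query.lower().split())


def cross_language_search(index, query, lexicon):
    """Translate the query into the indexing language, then match terms monolingually."""
    terms = set(translate_query(query, lexicon).split())
    return sorted(image_id for image_id, indexed in index.items() if terms & set(indexed))


print(cross_language_search(EN_INDEX, "fleurs", FR_TO_EN))
```

Only the short query is translated, never the indexed collection, which is the reason the approach is considered inexpensive. The word-by-word lookup also shows why short queries are prone to translation ambiguity: there is no context to choose between alternative translations.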
Objective and research questions
Following the previous phase of evaluation of SINCERITY, several enhancements were included in a new version of the search engine. Among the changes that were completed, the search interface was revised to make it more aesthetic. The necessity to improve the general look of the search engine had been mentioned by many participants (Ménard and Girouard 2015). A growing body of research (Kurosu and Kashimura 1994; Tractinsky 1997; Tractinsky, Katz, and Ikar 2000; Conklin et al. 2006) supports the idea that perceived usability and perceived aesthetics are not independent of one another. The other aspect of the search interface that was modified is the way the results are displayed. Consequently, and mainly to reduce the frustration encountered in the previous testing phase, the display of the results was enhanced (e.g., 10 items per page in two rows only).
After the initial testing of SINCERITY (Ménard and Girouard 2015), comments from the participants led to many conclusions. First, many respondents stated that images were not found in the category where they had expected to find them when performing their searches. Previously, all images included in the search engine database were indexed with one category or subcategory extracted from TIIARA, the taxonomy used in this phase of the study. However, the preliminary testing clearly highlighted that the indexing was problematic. It was then decided to re-index all images with TIIARA2 (an updated and augmented version of TIIARA). Allowing indexers to assign up to three subcategories to each image instead of only one yielded positive effects with regard to image retrieval, as the images could be found in more places when browsing the taxonomic structure. All images, re-indexed with TIIARA2 and the new indexing policy, were integrated into the search engine. In addition, the new version of TIIARA replaced the previous one within the search engine and as one of the search options (with the query box) to perform image searching. It was also decided that application notes (scope notes) would be added in the taxonomic structure to provide a better understanding of the categories/subcategories. In SINCERITY, this information served the purpose of providing explanation and clarification on a specific category/subcategory. These notes also define the conditions of the categories’ use. They were displayed when the image searcher moved or “hovered” the mouse over the category/subcategory label. Finally, this second round of testing was conducted to evaluate how participants would respond to the possibility of searching in a bilingual environment, meaning that participants were free to use the interface in French or in English.
They also had the possibility of searching in both languages and having the terms of their query translated into French or English, if needed, to obtain different results.
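The effect of the revised indexing policy (up to three subcategories per image instead of one) can be illustrated with a short sketch. The subcategory labels and image identifiers below are hypothetical and are not taken from TIIARA2; the function name is likewise an assumption.

```python
from collections import defaultdict

MAX_SUBCATEGORIES = 3  # limit stated in the revised indexing policy


def build_browse_index(assignments):
    """Map each subcategory to the images filed under it, enforcing the per-image limit."""
    browse = defaultdict(list)
    for image_id, subcategories in assignments.items():
        if not 1 <= len(subcategories) <= MAX_SUBCATEGORIES:
            raise ValueError(f"{image_id}: expected 1 to {MAX_SUBCATEGORIES} subcategories")
        for subcategory in subcategories:
            browse[subcategory].append(image_id)
    return dict(browse)


# Hypothetical assignments: an image showing clouds at sunset is filed in two places.
ASSIGNMENTS = {
    "img01": ["Nature/Flowers"],
    "img03": ["Nature/Clouds", "Nature/Sunsets"],
}

BROWSE = build_browse_index(ASSIGNMENTS)
print(BROWSE["Nature/Clouds"], BROWSE["Nature/Sunsets"])
```

Because `img03` appears under both subcategories, a searcher who browses to either one reaches it, which is the sense in which images "could be found in more places."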
The objective of the second phase of testing was to measure the effectiveness and efficiency of SINCERITY with a random sample of images and a selection of respondents who were asked to complete typical retrieval tasks using SINCERITY. The performance testing was also expected to identify usability problems. With this phase of the study, we proposed to answer the following three research questions:
1. How does SINCERITY support image retrieval in terms of effectiveness?
2. How does SINCERITY support image retrieval in terms of efficiency?
3. How do image searchers react to the use of the taxonomy included in the SINCERITY search engine?
A sample of 60 respondents (30 English-speaking and 30 French-speaking) was used for the testing. All participants were recruited with ads and posts on e-mail lists that explained the tasks required and the estimated time needed to perform them. Word of mouth was also used for recruitment. For ethical considerations, our participants were aged 18 years and older. In addition, to ensure the homogeneity of the group of participants (Fortin 1996), two other selection criteria were defined: participants needed to have French or English as their mother tongue, and, given the nature of the tasks to be performed during the experiment, the participants should have had no professional experience in a field involving image indexing and retrieval.
For the data collection, the 60 respondents were divided into six groups: three groups of 10 French speakers and three groups of 10 English speakers. Two groups (one French-speaking and one English-speaking) were asked to retrieve each of the 30 images randomly selected from the IDOL (Image DOnated Liberally) database (Ménard 2012), in the same order of presentation, using exclusively the SINCERITY search box (“Keyword” groups). Two additional groups were asked to search for the same 30 images, in the same order, using only the TIIARA taxonomy (“Taxonomy” groups). The last two groups (hereafter named “Choice” groups) were given the possibility to search for the images using the functionality of their choice (either the search box or the taxonomy), meaning that they could initiate their query with either functionality and modify their choice of search feature throughout each retrieval task as they wished. For each image retrieval task, the following variables were recorded:
• The final result for each of the 30 images (retrieved or not retrieved)
• The time spent on each attempt (logged)
• Depending on the group, the keywords used to retrieve the image and/or the taxonomic path used by the participant for each attempt (logged)
• For the “Choice” groups, the functionality (search box or taxonomic structure) that allowed them to retrieve the image (logged)
Once the retrieval simulation was completed, participants answered a questionnaire to give their general opinion on SINCERITY and to report any difficulties encountered during the retrieval process. The questionnaire aimed to evaluate the overall satisfaction from an end-user’s perspective. The questionnaire was administered to participants using the online survey tool SurveyMonkey. It comprised 10 closed questions with responses indicated on 5-point Likert scales to gather participants’ general impressions of the search engine. The questionnaire also contained three open-ended questions that asked users to provide feedback about SINCERITY.
The retrieval experiment and questionnaire were pre-tested by four respondents (two English speakers and two French speakers). A monetary compensation of $10 was allocated to each respondent deemed suitable for the experiment. The data collection occurred in a relatively short period, from January 19 to March 5, 2015, to prevent the effect of data contamination (i.e., participants sharing information on retrieval tasks and giving each other clues about how to search for a specific image). The completion of each test (retrieval tasks and questionnaire) took 50 minutes on average.
Descriptive statistics were used on the collected data, as well as on the content of the 10 closed questions. The data collected with the three open-ended questions were analysed and coded to extract direct responses made by respondents. Themes arising from the participants’ feedback were used in the constant comparative method of data analysis adopted in our analysis (Glaser and Strauss 1967). The comments received proved useful for the further refining of SINCERITY. The results of our analysis of the quantitative and qualitative data are presented in the next section.
Characteristics of participants
This study involved two linguistic groups: 30 native English speakers and 30 native French speakers. Among the 60 participants, 36 were female, 23 were male, and 1 preferred not to answer. The sample included 13 French-speaking men, 16 French-speaking women, and 1 French-speaking person who preferred not to answer, and 10 English-speaking men and 20 English-speaking women. The majority of participants (39 respondents) were under 26 years of age, 9 respondents were aged 26–35, 7 respondents were aged 36–45, 3 respondents were aged 46–55, and 2 respondents were over 55 years of age. Our sample showed diversity in their education level, with 22 participants having earned at least a bachelor’s degree; 14 respondents, a CEGEP/community college diploma or degree; 19 respondents, a high school diploma; and 5 respondents, a master’s degree. The majority of the 60 were students (41 respondents). Others were employed for wages (12 respondents), self-employed (5 respondents), or out of work/looking for work (2 respondents).
Effectiveness refers to the participants’ ability to retrieve (or not retrieve) the images during testing. As such, the data shown here indicate the average number of participants who successfully retrieved all 30 images. It is worth noting that, across the six subgroups of participants, all the images were retrieved by at least one person.
Table 1 breaks down the results by language and type of search. With an average of 25.97 images retrieved out of 30, we can determine that fewer French-speaking participants could retrieve all 30 images, while English-language participants retrieved an average of 27.90 out of 30. Looking at the breakdown in terms of type of search, we note an increase in the number of participants who could retrieve all 30 images when they had the choice of using either keywords or TIIARA categories: 18.47 participants out of 20. This compares with 17.93 participants out of 20 for those who were required to use only keywords and 17.47 for those using the taxonomy only. Among French speakers, we note an increase in the number of participants who could retrieve all 30 images, while the number of English speakers decreased slightly.
In addition to effectiveness, discussed above, two forms of efficiency are discussed: temporal efficiency and human efficiency. Temporal efficiency refers to the time it takes, in seconds, for a participant to correctly retrieve an image. Human efficiency refers to the number of attempts made by a participant to retrieve an image using keywords and/or categories from the taxonomic structure.
Temporal efficiency is calculated from the moment the participant is shown an image until they successfully find the keyword or category that allows them to retrieve the image. A two-second thinking period is added for every image. The time participants spent browsing the various results pages for the successful keyword or category is not taken into account. The data are summarized in table 2.
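The calculation described above can be expressed as a minimal sketch. The timestamps, function names, and sample values are assumptions for illustration; only the two-second thinking period and the start and end points of the interval come from the text.

```python
from statistics import mean

THINKING_PERIOD_S = 2.0  # fixed allowance added for every image, as stated in the text


def temporal_efficiency(shown_at, succeeded_at):
    """Seconds from image display to submission of the successful keyword or category.

    Time spent browsing result pages after that submission is not part of the
    interval, so it is excluded by construction.
    """
    if succeeded_at < shown_at:
        raise ValueError("success cannot precede display of the image")
    return (succeeded_at - shown_at) + THINKING_PERIOD_S


def mean_temporal_efficiency(tasks):
    """Average temporal efficiency over (shown_at, succeeded_at) pairs."""
    return mean(temporal_efficiency(shown, succeeded) for shown, succeeded in tasks)


# Hypothetical log values, in seconds since the start of the session.
print(temporal_efficiency(10.0, 25.5))
print(mean_temporal_efficiency([(0.0, 8.0), (0.0, 12.0)]))
```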
As expected, table 2 shows a marked increase in time for participants who searched the images using the taxonomy as their sole method of retrieval. Browsing a taxonomy requires some learning to understand the links between the categories and subcategories and takes longer than coming up with keywords. This element is discussed below in the section “Difficulties Encountered.” Interestingly, participants took longer to retrieve the same set of images when they were given the choice of retrieval method, compared with the group that used only keywords, even though most participants still chose keywords more often than browsing of the taxonomy. French speakers also took longer to retrieve the images than their English-speaking counterparts did.
Human efficiency is calculated from the log file depending on the search functionality used. For the Keyword groups, human efficiency refers to the number of keyword attempts sent to the search engine. Because autocorrect functions were not part of the search engine, keywords entered with mistakes or typos were counted as separate keyword attempts. For example, “skiong” and “skiing” were counted as two separate keywords. For the Taxonomy groups, human efficiency refers to the number of categories selected by participants. Participants could freely browse, develop, and collapse the taxonomic structure; however, an attempt was counted once they clicked on a category or subcategory to see what was inside. For the Choice groups, human efficiency refers to a combination of the previous two; the data report the keyword and category attempts separately. In all cases, if a participant reused a keyword or category for the same image, it was counted again. The average number of attempts made to retrieve each image is presented in table 3.
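The counting rules above can be sketched as follows. The log format (tuples of participant, image, attempt kind, and value) and the identifiers are hypothetical; the rules themselves (misspellings count as separate attempts, repeated keywords or categories count again, keyword and category attempts are reported separately) follow the description in the text.

```python
from collections import Counter


def count_attempts(log):
    """Count attempts per (participant, image, kind).

    Every logged entry counts once: no spelling correction and no deduplication,
    so typos and reused keywords or categories each add an attempt.
    """
    attempts = Counter()
    for participant, image, kind, _value in log:
        attempts[(participant, image, kind)] += 1
    return attempts


# Hypothetical log entries for one participant in a "Choice" group.
LOG = [
    ("E01", "img02", "keyword", "skiong"),   # typo counted as its own attempt
    ("E01", "img02", "keyword", "skiing"),
    ("E01", "img02", "keyword", "skiing"),   # reuse counted again
    ("E01", "img03", "category", "Nature"),
    ("E01", "img03", "category", "Nature/Sunsets"),
    ("E01", "img03", "keyword", "clouds"),
]

ATTEMPTS = count_attempts(LOG)
print(ATTEMPTS[("E01", "img02", "keyword")])
print(ATTEMPTS[("E01", "img03", "category")])
```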
Both the Keyword and Choice groups used close to the same number of attempts. Participants who retrieved images solely by using the taxonomy required more attempts to retrieve the images. This is not surprising, as one needs to browse a few categories to get to “Flowers,” for example, in the taxonomic structure, whereas one need only use the keyword “flowers” to obtain the same result. However, when given the choice of retrieval method, French speakers not only used the taxonomy more often but also used keyword searching less often than English speakers. A similar trend is noted with the other groups: French speakers used fewer keywords to find the images (Keyword group) but used more categories (Taxonomy group).
Image searchers’ reactions to SINCERITY
Once they had completed the retrieval tasks, respondents were asked to express their general opinions about SINCERITY. The first section of the online questionnaire contained 10 statements on the personal perception of SINCERITY that respondents needed to grade on a Likert scale. Table 4 (French-speaking participants) and table 5 (English-speaking participants) present the results for the three conditions (K = Keyword search box; T = Taxonomy; C = Choice of keyword search box or taxonomy left to participants) as a score from 1 to 5, where 1 is “strongly disagree” and 5 is “strongly agree.” In the columns Keyword (K), Taxonomy (T), and Choice (C) the cells with asterisks indicate which of the three groups gave the highest ratings for each of the statements. In the “Average” column, the cells with asterisks indicate the English-speaking and French-speaking groups that had a higher score.
Among the French-speaking participants, there is a clear interest in SINCERITY. Those participants who could choose to use either retrieval method perceived the search engine more positively than did the other two groups. Only those who used solely the keyword function gave a higher score to the statement “The search engine is easy to use.”
For the English-speaking participants, we note a marked preference for keyword-searching functionality. In terms of their evaluation, those who used only the keyword search function perceived the search engine more positively. Participants who had the choice of retrieval method nonetheless gave more positive feedback than the other two groups for the statements “In general, I am satisfied with the results obtained with the search engine” and “The search engine allowed me to retrieve the images easily.” Those who used only the taxonomy were the most positive about the statement “The search engine made me want to explore the image database.”
Comparing the two language groups, we again observe a marked increase in positive opinions about SINCERITY for the French speakers. Their scores were higher for every statement but one: English speakers thought “it was easy for [them] to learn how to use the search engine.” Overall, participants had encouraging opinions about SINCERITY, regardless of which retrieval method they were assigned to. Some challenges were, however, noted and are discussed in the following sections.
In general, the participants did not encounter many problems in using the search interface. A few difficulties were nonetheless noticed and expressed by some participants. The biggest inconvenience reported by respondents came from the group that was using the taxonomy as the sole search functionality. As in the previous testing phases of TIIARA and SINCERITY (Ménard and Girouard 2015), it was noted that some images were not allocated to the proper category, or at least were not described as many participants would have expected. As a consequence, the retrieval process was sometimes frustrating. For example, some participants felt that images were placed in a category with very specific descriptors (e.g., wall) instead of their more global definitions or descriptions (e.g., monuments/historical sites). Certain images with particular elements (e.g., clouds) were not put in the right category. For example, images of clouds were categorized as sunsets, but they could have been put into two categories (clouds and sunsets). So, once again, this type of difficulty reveals an indexing problem more than a problem related to the search interface itself. During the second phase of indexing, the indexers received instructions to categorize an image in more than one subcategory if they felt it was necessary. Nevertheless, it seems that not all problems were resolved, even when the level of exhaustivity was increased. However, some participants reported that, with two minutes per image, they gained better knowledge of the categories and consequently it became easier (and faster) to use TIIARA as a search functionality: “It was necessary to be familiar with the categories to know how to narrow down the search quicker. I found that as I performed the exercise, I became quicker at finding the images because I had discovered new tabs from my previous searches” (E24).
It should be mentioned that this discomfort with indexing was shared by respondents from both linguistic groups: “Non, je n’ai pas eu de difficulté à comprendre comment utiliser le moteur de recherche. Les seules fois où j’ai eu du mal étaient quand des images n’étaient pas dans les catégories où je les aurais mises. Par exemple, la photo du Mur de Chine n’était pas sous la catégorie Lieux historiques” (F13; No, I had no difficulty understanding how to use the search engine. The only times I had trouble were when images were not in the categories where I would have placed them. For example, the picture of the Wall of China was not in the category Historical Sites [our translation]).
Concerning SINCERITY itself, very few difficulties were reported. It was noted that the method for scrolling through the categories and subcategories may not be ideal: “I did not like the method to scroll, as the arrow moved every time a new page was open, so I would have to scroll back after skipping some pages. The categories on daily life were clear. I found it easy to navigate the celebrations section as well” (E19). Another participant reported that the display of results was sometimes problematic: “The only problem was clicking the next button when there was more than one page. Every time I clicked it, it moved over a bit, so I had to keep looking up to see where to place my mouse, and that made my search time slower, because I couldn’t continuously scroll through” (E2).
The overall feedback received from the participants has been rather encouraging. It suggests that SINCERITY is already a stable search engine, albeit one that still presents some flaws. Most of the defects reported by participants point to the need for a better indexing process and for further work on the interface design.
Of the suggestions received from respondents, the majority were related once again to the indexing quality, which is directly related to the difficulties described in the preceding subsection: “Have images belong to many more categories, i.e. have the car with snow on it belongs [sic] to not only ‘cars’ but to ‘weather’ or ‘snow/ice,’ as the car can be identified by more than one thing. I could also see a car with snow on it as a ‘negative event.’ Another example is the market, where it could have also been under the ‘shopping’ category” (E19). Another respondent summarized how SINCERITY could be improved: “Comme mentionné plus haut, l’idée qu’une personne reliée à une image et celle qui est représentée dans le moteur de recherche diffèrent parfois. Il faudrait je crois prendre plus d’une perspective en compte quant à la signification et au sens qui peuvent être attachés à une image, même en allant dans le plus abstrait, exemple: machine à café aurait pu se retrouver aussi dans nourriture-boisson dans le sens où on associe le café à la machine” (E15; As mentioned above, the idea that a person is linked to an image and the one shown by the search engine is sometimes different. It would, I think, take more than one perspective into account regarding the significance and meaning that can be attached to an image, even by going to the more abstract level, e.g. coffee machine could also end up in food/beverage category since we can also associate it to the coffee machine [our translation]). This recurrent problem was partially fixed when all images of the IDOL image database were re-indexed with TIIARA2 and the indexers were given the opportunity to assign more than one subcategory to an image. Although it solved many problems, the present results revealed that some images could once again be re-indexed with a better version of TIIARA. A third version of TIIARA is now being considered. 
It could include more subcategories as well as equivalence relationships such as those we find in a thesaurus. Indeed, many participants from the two linguistic groups suggested that adding synonyms could facilitate the image retrieval process.
Other suggestions received mostly concern the display of images that resulted when browsing categories—“I would suggest more images on each page” (E15)—or the quality of images: “Better quality of images that are not cropped poorly so people can find the images quicker” (E16). This latter suggestion is interesting but has little to do with the search engine itself, since all images included in IDOL were received from personal donations. Even if we took great care to emphasize diversity in the image selection when building the image database, we were aware that some images were of poor quality, like many of the pictures we find on our digital cameras, as is the case with existing systems online.
A French respondent expressed an opinion on the look of SINCERITY: “L’apparence visuelle de l’interface est simple et pratique. Rendre l’interface plus dynamique visuellement serait un gain pour le choix de cette base versus une autre similaire” (F4; The visual appearance of the interface is simple and convenient. Making the interface more visually dynamic would be an advantage for the choice of this database versus another one [our translation]). This underlined the choices that were made at the time of the conception to keep the tool simple and practical. In the future, we will try to improve the search interface to make it “visually dynamic.” Finally, another suggestion to improve the general use of SINCERITY was expressed: “Que les boutons « précédent » et « suivant » soient fixes afin que l’on n’ait pas à bouger le curseur lorsque l’on veut naviguer rapidement entre les pages. Ou avoir la possibilité d’avoir plus d’images par page afin de scroller à travers celles-ci. Une interface un peu plus jolie, attrayante:-))” (F18; That the “back” and “next” buttons are anchored, so that you do not have to move the cursor when you want to quickly navigate between pages. Or have the possibility to see more images per page to scroll down. An interface a little bit prettier, more attractive:-)) [our translation]). This will also be considered for the next version of SINCERITY.
On average, both linguistic groups assigned the same overall grade (7.4) to the search engine, with values ranging from 5 to a perfect 10 for one English-speaking respondent. This is considered highly encouraging. Some comments received with the grade are also very interesting. Most respondents thought that SINCERITY was already a good tool that allowed them to search for and retrieve images easily: “Overall, this search engine is fairly easy to use and I am able to look for the desired images within a short period of time” (E1). Another participant focused on the rapidity of the retrieval process: “I think I would give this search engine a 9 out of 10 because overall the images were easy to find. I think within 5 minutes any of the given images could have easily been retrieved using this search engine” (E27). However, some participants were less enthusiastic about the overall use of SINCERITY, and this was, not surprisingly, because they insisted that Google Images seems superior: “I think it was ok. It did find the images quickly when there was a keyword match, but there were several keywords it did not match, which made it a bit annoying when searching and increased the search time. Obviously, it was inferior to Google image search but decent for a student project” (E28); “Je préfère Google qui est plus rapide et déjà sur le marché. Il y a beaucoup de compétition dans ce marché et ce moteur n’ajoute rien de nouveau” (F5; I prefer Google, which is faster and already available on the market. There is a lot of competition in the field and this search engine adds nothing new [our translation]).
The overall grade received by SINCERITY and the comments regarding user satisfaction are positive and will lead us to further improvements of the search interface. As reported in many comments about difficulties encountered and suggestions to improve the search engine, the image-indexing process was on many occasions inconclusive, even when using an updated version of TIIARA and looser guidelines for the indexers. This main drawback will be addressed in the future.
Evaluating the performance of any retrieval system is essential to understanding where it succeeds and where it falls short. Testing of SINCERITY was based on two measures: effectiveness and efficiency (both temporal and human). These objective measures were complemented with a perception evaluation in the form of a questionnaire administered following user testing, which assessed the usability of the new search engine as well as its aesthetics. The goal of the research was to answer three research questions: (1) How does SINCERITY support image retrieval in terms of effectiveness? (2) How does SINCERITY support image retrieval in terms of efficiency? (3) How do image searchers react to the use of a taxonomy included in the SINCERITY search engine?
In terms of effectiveness, SINCERITY offers interesting search functionalities for image searchers, particularly in a bilingual context. For French-speaking image searchers, having the choice of retrieval method (keyword or taxonomy) increased the number of images retrieved correctly during testing. While the testing phase did not openly encourage participants to use the automatic translation functions, some participants, both English speakers and French speakers, chose to use them to increase their chances of finding an image. Allowing participants to choose the retrieval method increased their chances of success. It was interesting to see how participants started a search with one method, then changed methods based on the results they were getting, going back and forth between keyword searching and taxonomy browsing. Many participants expressed frustration that, at the moment, the search engine does not allow them to narrow down a search by doing a keyword search among the results from the taxonomy, or to browse the categories associated with the results from a keyword search. At present, the two functions are independent of one another.
In terms of efficiency, SINCERITY again fares well as a search engine. The Choice groups took only slightly longer to retrieve the images and made only slightly more attempts than the Keyword groups. In both language groups, the Taxonomy groups took the longest and made the most attempts at retrieving the images. However, a closer look at some specific images shows promise. A few of the more difficult-to-find images showed a marked reduction in either time or number of attempts when participants were given the choice of retrieval method. It is worth mentioning that some images were difficult to find solely by entering keywords or browsing the taxonomy. When participants could select the retrieval method of their choice, the chance of retrieving the image was greatly increased. It is thought that simple images, images where there is a clear object shown as the focal point, will continue to be easily retrieved with keywords. However, for more complex images (i.e., images with more than one central focus point), allowing searchers to choose their keywords or browse a taxonomic structure would increase search efficiency by reducing both the time and the number of attempts required to find the image.
Looking at both effectiveness and efficiency is not enough. The general perception of the search engine also provides a more complete picture of image searchers’ preferences. For example, even though French-speaking searchers took longer on average than their English-speaking counterparts or needed more attempts to retrieve the image (except for the French Taxonomy group, which was more efficient), their perception of the search engine was even more positive than that of English-speaking searchers. As was demonstrated in the analysis of the comments, many issues related to aesthetics and search functions still need to be resolved. However, when it comes to simplicity of usage, the ease of learning how to use the search engine, and the performance of SINCERITY, the study participants gave positive scores.
The linguistic features of the search engine were also well rated by most participants. One element that emerged from the last question of the survey is the participants’ satisfaction with a tool that allows searches to be performed in more than one language: “It is useful to have a multilingual search, but I am unsure whether existing engines do not satisfy this need” (E26). This statement is interesting and highlights how little Web users know about multilingual searches. This was already revealed in a previous phase of the research (Ménard and Khashman 2014). Even if Internet searchers considered themselves experts, it was noticeable that many of them thought that a search in Google Images, for example, would retrieve images from all provenances, including images indexed in many languages, while this is still not the case. Languages such as Chinese, Hindi, and Spanish have grown considerably on the Web, and Internet users definitely prefer to conduct searches in their own language. This is one of the reasons behind our decision to translate the bilingual taxonomy TIIARA into other languages (Spanish, Portuguese, Hindi, Russian, Mandarin Chinese, Arabic, Italian, and German). Even though some difficulties were encountered in the translation process (e.g., words that exist in the original French or English version that did not have any equivalent in another language), the taxonomy can now claim to have a real multilingual controlled vocabulary status (Ménard et al. 2016).
It is worth mentioning that the most significant factor in determining the quality of a translation is clearly the expertise of the translators. They must understand the subject matter they are translating at least as well as the target audience for the documents does. In addition to knowledge of their subject matter, translators must have all the basic skills required of all professional translators, that is, the ability to write clear, comprehensible prose; excellent language and translation skills; and native-level command of the language into which they are translating. In the case of TIIARA, the ideal scenario would have been to also have translators who had good knowledge of controlled vocabularies. Unfortunately, this was not the case, since volunteers who are not expert taxonomists translated TIIARA. Consequently, we remain aware that this lack of expertise could have affected the choice of terms included in the controlled vocabulary. We are also conscious that the taxonomic structure could have been jeopardized. In the near future we will test the quality of the multilingual vocabularies and make sure that the structure is still coherent and consistent. For example, in many cases, translators were not able to identify the perfect term in the target language. As a consequence, the only solution they could find was to borrow a term from one of the source languages (English or French). Even if this solution may seem inappropriate, one should remember many borrowed terms have been integrated into languages over time.
Conclusions and future work
The main objective of the evaluation process presented here was to ask a representative sample of image searchers to complete typical image retrieval tasks using SINCERITY to measure its performance and usability. The interface testing was expected to identify the usability issues of SINCERITY that may not be revealed by less formal testing (Sproull 2002). The experiment also aimed to assess the structure of the interface.
While the specific focus of this assessment was to investigate the usability of the SINCERITY search interface, it was also possible to examine user preferences with regard to a multilingual search tool. Although users could decide to have their queries translated, very few participants took advantage of this feature offered by SINCERITY. This can be explained by the fact that users were not actively encouraged to make use of the functionality. Since participants had to find the image they were shown, it may not have been obvious to them that translating their queries could have helped them. This hypothesis could be validated in future studies (e.g., through the evaluation of a fully functional multilingual retrieval system that not only involves simulated work tasks requiring users to enter their own queries but also provides search options in many different languages).
For this first version of SINCERITY, we have limited the search interface language options to only two languages (English and French). Another project will be to include the multilingual TIIARA in SINCERITY. Even if very few search engines provide a taxonomic structure to initiate queries, browsing a subject classification system could be a useful functionality, especially for image searchers who have difficulty formulating a query in their own words. This redevelopment poses significant challenges in terms of cost and expertise. It also means that some crucial elements will have to be considered. For example, multiple scripts associated with languages from different families could be problematic. Nevertheless, adding more languages in TIIARA and SINCERITY will constitute an added value for image searchers, who are often limited by their lack of knowledge of multiple languages. During search and retrieval in the majority of search engines that provide “multilingual searches,” users have to change their query words to different languages to find multilingual information resources. By including a valid multilingual taxonomy as a search functionality, this “do-it-yourself” translation process would be eliminated.
In recent years, multilingual information retrieval has gained more and more popularity. More people on the Internet are non-English-speaking and Web documents, including images, must be found in languages other than English. Multilingual search engines are definitely required for users to be able to retrieve information from multilingual databases. This, however, will need to be tested with the eventual inclusion of more languages. The growing diversity of languages on the Web calls for reliable tools that give access to multilingual documents, including images. Consequently, converting SINCERITY from bilingual to multilingual is a crucial step that needs to be taken to ensure that language barriers are finally lifted.