Young Children’s Search Strategies and Construction of Search Queries
Glenda Revelle, Allison Druin, Michele Platner, Stacy Weng
Benjamin B. Bederson, Juan Pablo Hourcade, Lisa Sherman
Human-Computer Interaction Lab
University of Maryland
College Park, MD
+1 301 405 0154
This paper describes a quantitative study focused on two questions: (1) Can children understand and use a hierarchical domain structure to find particular instances of animals? (2) Can children construct search queries to conduct complex searches if they are supported by technology? These two questions have been explored in the context of developing a digital library interface for children (ages 5-10 years old) that visualizes the querying process and its results. In this paper, the motivation for our research, the study methods, and the results are discussed.
Children, information retrieval, digital libraries, empirical evaluation, education applications.
Research has shown that the querying process can be difficult for users when the interface is restricting in syntax or abstract in nature [9,12,16,19]. Graphical interfaces for digital libraries have been shown to help adults search efficiently and effectively [1,7,14,17].
The research concerning children and information search strategies leads us to believe that graphical interfaces can also support children as technology users [13,26,27]. However, given the importance of the World Wide Web and the proliferation of search engines for it, children typically must negotiate query tools that are language-based and use abstract logical notations for Boolean searches. While the use of text is not an issue for older children and adults, young children (4-7 years of age) have difficulty with typing, spelling, and syntax comprehension.
In addition, constructing Boolean-type search queries requires an understanding of the logic of conjunction (intersection, typically represented as AND in a standard Boolean search query) and disjunction (union, generally represented as OR in traditional Boolean search terms). It has long been understood that even adults have difficulty with these logical concepts, particularly with disjunction. It has also been well documented that children have difficulty with these concepts, and that the differential difficulty of disjunction over conjunction is consistent for children from 5 to 12 years of age. However, under certain circumstances even children as young as three years have been shown to utilize disjunctive concepts to perform significantly better than chance. Although these results were all established quite some time ago, there has been little or no research on the use of computer interfaces to construct search queries based on these logical concepts. Interestingly enough, it has been shown that typical interfaces to the Web promote less strategic thinking concerning searches, and more active browsing. We believe this may be due to the inappropriate searching interfaces available for young children today.
Therefore, we began a study in the fall of 1999 to better understand young children’s searching strategies and abilities to construct Boolean-type search queries. At that time, we hypothesized that if we provided enough visual and conceptual support for young children, it might be possible for them to effectively use these complex search concepts. The empirical study reported here examined the following questions: (1) Can children understand and use a hierarchical domain structure to find particular instances of animals? (2) Can children construct search queries if they are provided with visual and conceptual support? Our research questions were addressed by observing and documenting children’s searches for animals in a hierarchical information structure, comparing the use of a paper model and an interactive computer prototype we now call QueryKids. In the paper that follows, our research methods, results, and conclusions are described.
The participants in this study were second and third grade children from Yorktown Elementary School, a public school in Prince George’s County, in the Washington DC metropolitan area. Approximately 52% of the children were Caucasian, 36% were African American, and 12% were Asian or Hispanic. The school serves a lower-middle to middle-class population.
The children were divided into two groups. The first group, a total of fifty-six participants, used a paper prototype (as described in the next sections). This group was made up of thirty second graders (14 females with a mean age of 8 yrs, 1 mo, and 16 males with a mean age of 8 yrs, 0 mos) and twenty-six third graders (14 females with a mean age of 9 yrs, 1 mo, and 12 males with a mean age of 8 yrs, 10 mos). The second group, a total of fifty participants, used the computer prototype. This group was made up of twenty-two second graders (12 females with a mean age of 8 yrs, 0 mos, and 10 males with a mean age of 8 yrs, 1 mo) and twenty-eight third graders (14 females with a mean age of 8 yrs, 10 mos, and 14 males with a mean age of 9 yrs, 0 mos).
Both the paper prototype and the computer prototype were organized
to represent four hierarchies (
1). At the top level were the names of
four parallel “branches”: Animals, Where They Live, How They Move and What They
Eat. All 62
animals in the data set could be found under each of these four branches; i.e.,
the four branches served as alternative ways of accessing the same
information. Under the Animals branch heading were the
following subcategories: Amphibians, Birds, Fish, Insects,
Invertebrate Sea Creatures, Mammals, Reptiles.
The Mammals subcategory was then further subdivided into Cats & Dogs, Rodents, Hooved, Primates, and Marsupials. The second branch, Where They Live, was divided into three subcategories: Land, Water, and Both Land and Water. Likewise, the How They Move branch was subdivided into Fly, Swim, and Walk, Crawl, Hop etc., and What They Eat had the subcategories Eats Animals, Eats Plants, and Eats Both Plants and Animals.
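As an illustration only, the four-branch structure described above can be written down as a nested mapping. The category labels below come from the text; the Python names (`HIERARCHY`, `leaf_categories`) and the choice of a nested dictionary are assumptions made for this sketch and do not describe how either prototype stored its data.

```python
# Illustrative sketch: category labels are taken from the text; the
# nested-dict representation is an assumption, not the prototypes' format.
HIERARCHY = {
    "Animals": {
        "Amphibians": {},
        "Birds": {},
        "Fish": {},
        "Insects": {},
        "Invertebrate Sea Creatures": {},
        "Mammals": {
            "Cats & Dogs": {},
            "Rodents": {},
            "Hooved": {},
            "Primates": {},
            "Marsupials": {},
        },
        "Reptiles": {},
    },
    "Where They Live": {"Land": {}, "Water": {}, "Both Land and Water": {}},
    "How They Move": {"Fly": {}, "Swim": {}, "Walk, Crawl, Hop etc.": {}},
    "What They Eat": {
        "Eats Animals": {},
        "Eats Plants": {},
        "Eats Both Plants and Animals": {},
    },
}


def leaf_categories(tree, path=()):
    """Yield the path to every lowest-level category (where the animals sit)."""
    for name, children in tree.items():
        here = path + (name,)
        if children:
            yield from leaf_categories(children, here)
        else:
            yield here


if __name__ == "__main__":
    for leaf in leaf_categories(HIERARCHY):
        print(" > ".join(leaf))
```

Walking the structure this way makes the depth difference visible: only the Mammals subcategory has a second level of subcategories, so a search through it needs one more step than a search through any other subcategory.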
The paper prototype consisted of a set of hierarchically nested envelopes
(see Figure x). The four 15”x12” envelopes at
the top of the four branches of the hierarchy were labeled Animals, Where They Live, How They Move and What They Eat, and
decorated with representative pictures.
Inside each of these envelopes were smaller envelopes, labeled with the subcategories under each broad category.
For the Mammals subcategory there was one more subset of yet smaller envelopes, representing the second level of subcategories. Inside the smallest envelope for each branch of the hierarchy were 5x7 white cards, each of which displayed a color picture of one animal with its common name printed below the picture.
In addition, there were two cartoon-style illustrations of children on 4 x 6 cards. These illustrations represented Dana and Kyle, who were introduced to the participants as the “search kids”, and were used in searches for groups of animals. Whenever children were constructing a search query to find a group of animals (as described in the Procedures section below), they were asked to place the envelopes representing those groups on top of the Dana and Kyle cards.
The computer prototype (currently called “QueryKids”) runs on a Sony laptop computer with a USB mouse under Windows 98, and was built on top of the KidPad and Jazz software architectures. A Microsoft Access database was used to hold metadata about the 62 animals in the data set.
The prototype consisted of three areas: two
browsing areas and a search area.
Only the search area was used in this study. The search area displayed four icons
representing the four main branches in the hierarchy: Animals, Where They Live,
How They Move and What They Eat (Figure
1). Each icon was composed of a text label and
a representative picture.
To move down through each branch, the user clicks on the “shadow” under one of the
four main icons. To specify search
parameters, the user clicks on the icon or icons representing those
parameters. So, for example, to conduct
a search for “birds that live on land and water”, one might first click on the
shadow beneath the Animals icon to
reveal the subcategories, then click on the Birds
icon to make it a search parameter.
Next, one would click on the shadow below the Where They Live icon, revealing its subcategories, and click on Land and Water to add it as a second search parameter (see figure). When search parameters are selected, their icons move to the two children in the upper left corner of the screen. The metaphor presented to users is that
these two children (called Kyle and Dana) are “query
kids”, and that you are “giving” them icons of things that you want them to
find. When items are given to Kyle and Dana, the software runs a query that
automatically performs a union among items selected from subcategories within
the same branch, and an intersection among items selected from subcategories
across different branches. The
subcategories within any one branch have been defined such that items
are not duplicated across subcategories (i.e. an intersection
would yield an empty set). Thus, the
user does not need to distinguish between intersection and union in specifying
a query, but due to the way the categories have been defined,
the “intuitive” result will be delivered most of the time.
Each time an icon is
selected as a search parameter, the
results of the search are immediately displayed in miniature in the outlined
area to the right of the search kids.
This serves as an “in-progress” display of the search results.
For a more complete description of the QueryKids computer prototype and its design and development, see [5].
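The query rule described above (union among subcategories chosen within one branch, intersection across branches) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the QueryKids implementation: the function name `run_query`, the record layout, and the four sample animals with their category values are all hypothetical and are not drawn from the study's 62-animal data set.

```python
from collections import defaultdict

# Hypothetical records: these animals and their category values are invented
# for illustration and are not taken from the study's data set.
ANIMALS = [
    {"name": "duck", "Animals": "Birds", "Where They Live": "Both Land and Water"},
    {"name": "blue jay", "Animals": "Birds", "Where They Live": "Land"},
    {"name": "frog", "Animals": "Amphibians", "Where They Live": "Both Land and Water"},
    {"name": "snake", "Animals": "Reptiles", "Where They Live": "Land"},
]


def run_query(selections, animals=ANIMALS):
    """Return the names of animals matching the icons given to the search kids.

    selections: iterable of (branch, subcategory) pairs.
    Subcategories picked from the same branch are OR-ed (union);
    different branches are AND-ed (intersection).
    """
    wanted = defaultdict(set)
    for branch, subcategory in selections:
        wanted[branch].add(subcategory)
    return [
        animal["name"]
        for animal in animals
        if all(animal[branch] in subs for branch, subs in wanted.items())
    ]


if __name__ == "__main__":
    # Two branches: intersection ("birds that live on land and water").
    print(run_query([("Animals", "Birds"),
                     ("Where They Live", "Both Land and Water")]))
    # One branch, two subcategories: union.
    print(run_query([("Animals", "Reptiles"), ("Animals", "Amphibians")]))
```

Because each animal carries exactly one subcategory per branch in this sketch, intersecting two subcategories of the same branch would always be empty, which is why treating same-branch selections as a union gives the result a user would expect.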
The children participated in same-sex and same-grade pairs for both paper and computer prototype research.
The participants in the paper prototype group sat on the floor with the four large envelopes arranged on the floor in front of them. The researcher described the task as being like a “treasure hunt”, and explained that inside each envelope there were smaller envelopes and inside those were index cards with pictures of animals that the children would be trying to find.
In the computer prototype group, participants sat at a desk, in front of a
Sony laptop with the QueryKids application running. All of the prototype functionality was demonstrated, and children
were allowed a free-play period of a few minutes to experiment with clicking on
icons to see what happened before the experimental procedure began.
For both groups, it was also explained that there were two parts to the research. In the first part, the goal was to find a particular animal, for example, a blue jay. Each child was asked to find four specific animals. The four animals were requested in four different orders, with each animal appearing in each serial position once. The use of these four orders was counterbalanced across prototype condition, grade level and gender groups.
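A set of four orders in which each animal appears once in each serial position is a 4 x 4 Latin square. The text does not say how the study's orders were built; one simple construction, shown here only as an assumed example, is cyclic rotation. The function name and the placeholder labels are hypothetical.

```python
def cyclic_orders(items):
    """Return len(items) orders in which each item occupies each position once."""
    n = len(items)
    return [[items[(start + pos) % n] for pos in range(n)] for start in range(n)]


if __name__ == "__main__":
    # Placeholder labels: the study's four target animals are not all named in the text.
    for order in cyclic_orders(["animal A", "animal B", "animal C", "animal D"]):
        print(order)
```

Each of the resulting orders can then be assigned in rotation within every combination of prototype condition, grade level, and gender so that no order is confounded with a group.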
In the second part, the task was to find groups of animals. To help them find groups of animals, children were introduced to the search kids, Kyle and Dana. The participants were told that Kyle and Dana would find groups of animals when given an envelope/icon representing that group. Each participant was asked to construct one single-factor search query (e.g., all insects), one union search query (for example, all reptiles and ...), and one intersection search query.
After the experimental procedure, researchers interviewed the children about their reactions to the task. Children were asked if they thought finding the animals was easy or hard, fun or not, and whether there was anything they would change to make it better or easier.
Two major aspects of children’s search behavior were examined in this study: 1) children’s search efficiency when searching for a specific animal within the hierarchical information structure and 2) their ability to construct a search query.
To develop a measure of search efficiency, children’s responses were recorded when they were asked to find each of the four specific animals in the first section of the study. For the paper prototype group, observers recorded each envelope that the child opened in order. For the computer prototype group, the software logged the sequential history of all mouse clicks. Children’s responses were then coded to indicate how many unnecessary envelopes were opened or icons were clicked. In other words, search efficiency was the number of search steps taken above the minimum number necessary to find the requested animal, given the branch of the hierarchy chosen by the child. Thus, the higher the search efficiency score, the less efficient the search.
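The scoring rule just described amounts to a simple subtraction, sketched below. The function name, the list-based log format, and the sample log are assumptions for illustration; the study's actual coding sheets and software logs are not reproduced in the text.

```python
def search_efficiency(opened, required_path):
    """Count search steps beyond the minimum for the branch the child chose.

    opened: envelopes opened (or icons clicked), in order, as recorded.
    required_path: the categories that must be opened to reach the target
    animal from the chosen branch.
    A score of 0 is a perfectly efficient search; higher is less efficient.
    """
    missing = [step for step in required_path if step not in opened]
    if missing:
        raise ValueError(f"search never reached the target; missing {missing}")
    return len(opened) - len(required_path)


if __name__ == "__main__":
    # Hypothetical log: the child opens one unnecessary envelope (Fish).
    log = ["Animals", "Fish", "Birds"]
    path = ["Animals", "Birds"]
    print(search_efficiency(log, path))  # 1
```

Note that the minimum is defined relative to the branch the child chose, so a child who starts from a different branch than another child is not penalized for that choice.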
Search efficiency scores were submitted to a 2 (grade) x 2
(gender) x 2 (condition) x 4 (item number) analysis of variance, in which item
number served as a repeated measure.
Results of this analysis indicated a significant difference between
conditions, F(1,96) = 14.75, p < .0001, a significant condition by gender
interaction, F(1,96) = 4.75, p < .05, and a significant difference between
items, F(3,288) = 2.92, p < .05.
Means for the groups involved in these effects are displayed in Table
Examination of these means shows that computer searches were significantly more efficient than paper searches. Tukey post hoc tests on the condition by gender interaction indicated that the females’ searches were significantly more efficient in the computer condition than in the paper condition, while there was no significant difference for the males. In addition, comparison of the means in the item effect indicates that children’s searches became more efficient with each subsequent item, indicating a practice effect. An additional analysis indicated that there were no significant differences in search efficiency for one particular animal vs. another.
To quantify children’s search
query abilities, their responses in the second portion of the study were
examined. Their attempts to formulate
search queries to find groups of animals were scored as shown in Table
2. Search query scores range from 0 to 1, with
1 being the highest possible score.
Search query scores were analyzed using a 2 (grade) x 2 (gender) x 2 (condition) x 3 (query type) analysis of variance, in which query type (single-factor vs. union vs. intersection) served as a repeated measure. Results of this analysis indicated a significant difference between conditions, F(1,94) = 14.96, p < .0001, a significant difference between query types, F(2,188) = 3.12, p < .05, a significant interaction between condition and query type, F(2,188) = 7.15, p < .05, and a significant interaction between gender and query type, F(2,188) = 7.15, p < .001.
Means for the groups
involved in these effects are displayed in Table
Examination of these means shows that overall, search queries were more accurate in the computer condition than in the paper condition. Tukey post hoc tests on the query type effect indicated that union queries were significantly more successful than intersection queries, while neither differed significantly from the success rate for single-factor searches. However, this main effect is qualified by two interactions. Post hoc tests on the condition by query type interaction showed that both single-factor queries and intersection queries were significantly more accurate in the computer condition than in the paper condition, but for union queries there was no difference between conditions. In addition, post hoc comparisons on the gender by query type interaction demonstrated that for females union queries were significantly more successful than single-factor searches or intersection searches, whereas for males there were no significant differences between the three query types.
In general, children were quite efficient in their searches
for specific animals. The overall
search efficiency mean for the entire sample was 0.48. This means that, on average, children looked
in less than one extra envelope, or clicked on less than one extra icon
per search beyond the bare minimum needed to find the animal that they were
looking for. So, for the most part,
children successfully employed a strategy of trying to find each target animal
in as few steps as possible, in an extremely focused and goal-directed
manner. In addition, children’s
ability to use this “fewest-steps” strategy effectively improved over time
within the course of the four trials in this section of the research.
The one apparent exception to the use of the fewest-steps strategy occurred in the searches of the females using the paper prototype. Their searches were significantly less efficient (i.e., used significantly more extra steps) than the searches of the girls using the computer prototype or the searches of the boys in either prototype condition. It should be noted, however, that the absolute differences in number of extra steps are small: even for the females using the paper prototype the average search efficiency was only 0.89, still less than one extra envelope opened or icon clicked per search.
Qualitative observations of the children as they engaged in the search tasks led researchers to suspect that a number of the females who used the paper prototype were intentionally browsing, rather than engaging in goal-directed, fewest-steps-type searches. They seemed to enjoy looking through all the pictures of animals as a goal in itself, sometimes continuing to look at animal pictures even after the target animal had been found. It is not clear why there was so much less of this intentional browsing behavior with the computer prototype, but perhaps it was due to the fact that children were working exclusively within the search area of the QueryKids prototype. This area is clearly structured to support purposeful, goal-directed searches, whereas other sections of the prototype support browsing.
The second portion of the study focused on children’s abilities to construct search queries. Once again, overall, children were strikingly adept at this task. Across the entire sample and all of the search query types, the average accuracy of constructing a search query was 0.72 of a total 1.00. Moreover, the children who used the QueryKids computer prototype achieved an 85% accuracy rate with their search queries, which was significantly higher than the accuracy of those using the paper prototype.
What accounts for this surprisingly high level of performance, especially in light of the research previously cited which has established that children have difficulty with the underlying logical concepts involved in constructing union and intersection searches?
We believe that these positive results stem from several different kinds of support that were built into the software as “scaffolding” devices. Scaffolding is a well-established educational technique that often enables children to complete tasks that otherwise would be beyond their capabilities [25,28], and has been shown to be an effective learning tool when used by teachers. Recently, scaffolding has begun to be incorporated as a learning support in educational software [8,11,20], and there is evidence to suggest that educational software with extensive scaffolding is more educationally effective than software without such support.
There were several kinds of scaffolding support built into the QueryKids prototype. First, the search interface was visually concrete and involved direct physical manipulation of the search elements, both of which were designed to support children in constructing search queries that they would have been unable to accomplish with a typical text-based search tool.
Second, the display of “in-progress” search results on the same screen while the search query is being formulated makes it extremely easy for children to see whether their queries have been formulated correctly or not, and to adjust and modify their queries when needed. This immediate, dynamic feedback is one of the major points of difference between the paper prototype and the computer prototype, and probably plays a large role in the significantly better performance of those children using the computer version.
Finally, and perhaps most importantly, because of the way the information was organized and the search software was written, children did not need to distinguish between an intersection search query and a request for a union search. This lightens the cognitive complexity of the task immensely, allowing children to first focus solely on identifying the proper parameters to conduct the search they have in mind.
We believe that the kind of scaffolding described here could serve as a first step toward helping children learn to understand and use Boolean search concepts. Scaffolding is typically designed to be “eased out” as the child becomes more and more capable of completing the task with fewer supports. In future work, we plan to research systematic ways of reducing this support to gradually guide children into constructing queries with the full power of Boolean logic under their control.
This work was supported by the National Science Foundation’s DLI-2, Discovery Channel, Patuxent Wildlife Refuge, and the Baltimore Learning Community project. We thank Prince George’s County Public Schools for their cooperation and support. The children and teachers from Yorktown Elementary School were critical to this research. They included 106 children in the 2nd and 3rd grades along with their teachers. Also, thanks to Delfin Barral for allowing us to use his illustrations to represent Dana and Kyle in the paper prototype. On-going inspiration and intellectual discussion have come by way of Ben Shneiderman, Catherine Plaisant, Anne Rose, Joseph JaJa, and the KidStory research project supported by the i3, ESE.
1. Ahlberg, C., Williamson, C., & Shneiderman, B. (1992). Dynamic queries for information exploration: An implementation and evaluation. Proceedings of ACM CHI'92, ACM Press, pp. 619-626.
2. Bederson, B. B., Meyer, J., & Good, L. (2000). Jazz: An Extensible Zoomable User Interface Graphics Toolkit in Java. In Proceedings of User Interface and Software Technology (UIST 2000) ACM Press, (in press).
3. Benford, S., Bederson, B. B., Åkesson, K., Bayon, V., Druin, A., Hansson, P., Hourcade, J. P., Ingram, R., Neale, H., O'Malley, C., Simsarian, K., Stanton, D., Sundblad, Y., & Taxén, G. (2000). Designing Storytelling Technologies to Encourage Collaboration Between Young Children. In Proceedings of Human Factors in Computing Systems (CHI 2000) ACM Press, pp. 556-563.
4. Bruner, J. S., Goodnow, J. J., and Austin, G. A. A study of thinking. New York: Wiley, 1956.
5. Druin, A., Bederson, B., Hourcade, J. P., Sherman, L., Revelle, G., Platner, M., Weng, S. (Submitted) Designing a digital library for young children: An intergenerational partnership. CHI 2001, ACM Press.
6. Druin, A., Stewart, J., Proft, D., Bederson, B. B., & Hollan, J. D. (1997). KidPad: A Design Collaboration Between Children, Technologists, and Educators. In Proceedings of Human Factors in Computing Systems (CHI 97) ACM Press, pp. 463-470.
7. Greene, S., Marchionini, G., Plaisant, C., & Shneiderman, B. (1997). Previews and overviews in digital libraries: Designing surrogates to support visual information seeking. Technical Report CS-TR-3838, UMIACS-TR-97-73, University of Maryland.
8. Guzdial, M. (1995). Software-realized scaffolding to facilitate programming for science learning. Interactive Learning Environments, 4(1), 1-44.
9. Hertzum, M. & Frokjaer, E. (1996). Browsing and querying in online documentation. ACM Transactions on Computer-Human Interaction (TOCHI), 3(2), pp. 139-161.
10. Hourcade, J. P. & Bederson, B. B. (1999). Architecture and Implementation of a Java Package for Multiple Input Devices (MID). Tech Report #CS-TR-4018, Computer Science Department, University of Maryland, College Park, MD, USA.
11. Jackson, S. L., Krajcik, J., & Soloway, E. (1998). The design of guided learner-adaptable scaffolding in interactive learning environments. Proceedings of ACM CHI’98, ACM Press, pp. 187-194.
12. Jones, S. (1998). Graphical query specification and dynamic result previews for a digital library. In Proceedings of UIST 98. ACM Press, pp. 143-150.
13. Large, A., Beheshti, J., & Breuleux, A. (1998). Information seeking in a multimedia environment by primary school students. Library and Information Science Research 20, pp. 343-376.
14. Michard, A. (1982). Graphical presentation of boolean expressions in a database query language: Design notes and an ergonomic evaluation. Behaviour and Information Technology, 1(3), pp. 279-288.
15. Moore, P. & St. George, A. (Spring 1991). Children, as information seekers: The cognitive demands of books and library systems. School Library Media Quarterly, 19, pp. 161-168.
16. Nickerson, R. S. (1981). Why interactive computer systems are sometimes not used by people who might benefit from them. International Journal of Man-Machine Studies, 15, pp. 469-483.
17. Plaisant, C., Marchionini, G., Bruns, T., Komlodi, A., & Campbell, L. (1997). Bringing treasures to the surface: Iterative designs for the Library of Congress National Digital Library Program. Proceedings of ACM CHI'97, ACM Press, pp. 518-525.
18. Rawson, L. M., Tamayo, F. M. V., Vehle, M. T. & Willemsen, E. W. (1973). Disjunctive concept utilization in preschool children. The Journal of Genetic Psychology, 122, pp. 211-216.
19. Ray, H. N. (1985). A study on the effect of different data models on casual users’ performance in writing database queries. International Journal of Man-Machine Studies, 23, pp. 249-262.
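20. Revelle, G. L., Medoff, L., & Strommen, E. F. (2001). Interactive technologies research at Children’s Television Workshop. In S. M. Fisch & R. T. Truglio (Eds.), “G” is for growing: Thirty years of research on children and Sesame Street. Mahwah, NJ: Lawrence Erlbaum Associates.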
21. Rogoff, B. (1990) Apprenticeship in thinking: Cognitive development in social context. New York: Oxford University Press.
22. Shute, R., & Miksad, J. (1997). Computer assisted instruction and cognitive development in preschoolers. Child Study Journal, 27(3), 237-253.
23. Snow, C. E. & Rabinovitch, M. S. (1969). Conjunctive and disjunctive thinking in children. Journal of Experimental Child Psychology, 7, pp. 1-9.
24. Solomon, P. (June 1993). Children’s information retrieval behavior: A case analysis of an OPAC. Journal of American Society for Information Science, 44, pp. 245-264.
25. Vygotsky, L. S. (1978) Mind in society: The development of higher psychological processes. Cambridge, MA: Harvard University Press.
26. Walter, V. A., Borgman, C. L., & Hirsh, S. G. (Winter 1996). The Science Library Catalog: A springboard for information literacy. School Library Media Quarterly, 24, pp. 105-112.
27. Watson, J. S. (1998). If you can’t have it, you can’t find it: A close look at students’ perceptions of using technology. Journal of the American Society for Information Science, 49, 1024-1036.
28. Wood, D., Bruner, J., & Ross, G. (1976) The role of tutoring in problem solving. Journal of Child Psychology & Psychiatry & Allied Disciplines, 17(2), 89-100.