ABSTRACT

Background: This study captured students' repertoire of science ideas and determined the varied paths students take to integrate their disconnected ideas as they studied a web-based Genetic Inheritance unit. Method: We analyzed 6th graders' responses to embedded items and activities to establish progress in knowledge integration in two different learning conditions: revisiting and critiquing. Learning paths were established by measuring students' idea dissimilarities using Levenshtein edit distance, clustering using silhouette coefficient and K-means, and determining the most representative path via generalized median method. Results: Four learning paths emerged from the revisit condition (isolated links, partial links, valid links, integrated links) and three learning paths emerged from the critique condition (isolated links, partial links, and integrated links). Conclusion: We found that by providing opportunities for students to revisit or critique ideas, the curriculum supported them to follow multiple paths in building their repertoire of ideas and integrating initial and new information.

Key words: knowledge integration, Levenshtein edit distance, K-means clustering, generalized median string, silhouette coefficient, genetics

INTRODUCTION

Students bring to science instruction an abundance of everyday, intuitive science ideas.[1-5] As they learn, students benefit from support to put these ideas together into coherent accounts of scientific phenomena. This investigation captured students' repertoire of science ideas and determined the varied paths they take as they integrate their disconnected ideas in response to a web-based Genetic Inheritance unit designed to support this process. We studied two distinct learning conditions: critique and revisit. We document 6th graders' progress in knowledge integration using logged responses to embedded items and activities in the Genetic Inheritance unit.

Instructional activities that respect and build on students' ideas lead to more durable understanding than activities that dismiss students' ideas.[6-8] The Genetic Inheritance unit follows the knowledge integration design framework identified in prior research.[9] As they grapple with scientific challenges, students explore links between both normative and non-normative ideas. In one study, considering various views helped students to distinguish and sort out conflicting ideas.[10] In another study, considering redundant ideas rather than diverse ideas encouraged students to make self-explanations and integrate their ideas.[11] Capturing learners' ideas in response to the knowledge integration pedagogy as they engage in inquiry learning tells us how instructional activities help students connect ideas into a coherent understanding.[12]

Recognizing students' ideas and learning paths has value for formative guidance in science classrooms, and benefits from a pedagogical framework.[13] Prior work has identified the impact of unexpected paths on students' performance,[14,15] studied how students deal with conflicting scientific phenomena,[16,17] and determined distinct paths for preselected groups of students.[14] We used clustering analysis to identify emergent clusters of students who follow distinct paths and qualitatively described those paths.

In this paper, we advance the field by tracking in detail how students build on their repertoire of ideas to link and sort out new and existing knowledge. Making visible students' repertoire of ideas and their learning paths during the unit clarifies how students integrate ideas. These findings underscore the value of supporting multiple paths to affirm each student's ideas.

Knowledge integration pedagogy

Traditional instruction often focuses on introducing new ideas without helping students integrate them with existing knowledge.[12] Understanding learners' knowledge and knowledge integration is critical for educators to provide effective instruction that builds on students' previous knowledge, connects new knowledge to existing knowledge, and appreciates the relevance of Science, Technology, Engineering, and Mathematics (STEM) concepts to their everyday lives. We explore how students integrate ideas as they study a Web-based Inquiry Science Environment (WISE) unit, designed based on the knowledge integration pedagogy. We identify ways to improve knowledge integration instruction based on an understanding of how students' repertoire of ideas and different learning paths contribute to their understanding.

Knowledge integration is a pedagogical framework that captures how students develop and refine their repertoire of ideas, including how they incorporate new ideas. Ideas are distinct perspectives or personal interpretations that students gather from observations, intuitions, anecdotes, films, experts, or scientific evidence in contexts such as home, school, or cultural communities.[12] Throughout a unit of instruction, students may maintain their prior ideas, mix their prior ideas with new normative ideas, or link ideas to form cohesive integrated knowledge. Students' knowledge integration involves making comparisons, distinguishing between intuitive and newly added scientific ideas, finding links between concepts, and clarifying uncertainties with evidence.[12] Students start with a unique repertoire of ideas. Supporting them to build on their own ideas facilitates the integration of their initial ideas with new ideas encountered in instruction. Typical instruction often fails to help students integrate new and existing ideas, instead focusing solely on introducing new ideas. Consequently, students may isolate new ideas within the context of the classroom, failing to apply them broadly. Students must act as cognitive economists, selecting when and where to focus their attention and resolve conflicts between ideas.[18] This is a common issue in STEM classrooms when students fail to understand the relevance or importance of connecting their ideas. Instruction that emphasizes learning through knowledge integration empowers students to develop scientific conceptual understanding by building on their own everyday intuitive science ideas.

This detailed study of students' multiple learning paths as they engage in knowledge integration instruction offers educators insight into the ways that instruction supports each student to progress. By analyzing how students' ideas develop as they follow sequences of instructional activities, we illustrate the multiple paths students take through the same activities as well as the distinct impacts of alternative instructional conditions. As students' ideas emerge through their engagement with a science unit, their learning path may differ from the teacher's expectations.

We illustrate that learning paths build on students' initial, unique repertoire of ideas. Rather than seeking a single learning path, we document that students follow distinct pathways both within the same set of activities and across different instructional conditions.

Revisiting versus critiquing

We examine the effectiveness of critique activities and revisiting activities in shaping students' repertoire of ideas.[19,3] Guiding students to revisit an interactive model is correlated with learning gains.[20] In addition, critique activities have enhanced the revision of explanations.[21] The use of a critique-based strategy may encourage students to recognize their own non-normative or missing ideas.[19] We found both critique and revisit guidance improved students' explanations in the post-test.[3] In this study, we examine students' scientific understanding, knowledge integration, and learning paths under two randomly assigned conditions: critique condition and revisit condition. We anticipate that students may have different learning paths based on the instructional activities in the revisiting and critiquing conditions.

Research questions

1. What progress in integrating ideas do students make while studying the web-based Genetic Inheritance unit in the critique and revisit conditions?

2. What distinct learning paths do students take as they study the web-based Genetic Inheritance unit in the critique and revisit conditions?

DATA SOURCES AND METHODS

Participants and materials

A total of 243 sixth graders from two schools, taught by three middle school science teachers, studied the WISE Genetic Inheritance unit. Students were randomly assigned to either the revisit condition or the critique condition (critiquing fictitious claims about the science).

Design of Genetic Inheritance unit

The WISE unit (https://wise.berkeley.edu/) was developed following the knowledge integration pedagogy.[9] Students started by exploring the concepts of alleles and genes by interacting with images of chromosomes and DNA. The unit elicited student ideas by asking students to make a prediction. Students discovered new scientific ideas by manipulating an interactive model for the Punnett square (Figure 1). They distinguished between their own ideas and the ideas they discovered in either critique or revisit activities. Students reflected on their ideas as they revised their responses to the Punnett square item and the siblings' inherited traits item. Each condition had unique questions.

Figure 1. Interactive model used in Web-based Inquiry Science Environment Genetic Inheritance unit.

Revisit condition

Students revisited the Punnett model and DNA structure to respond to questions about Punnett squares (Supplemental table 1):

1. Punnett square: When the Punnett square is filled out, what do the four boxes represent? Explain how you would use a Punnett square to figure out the probability of getting a certain genotype.

2. DNA from parents: How much DNA do you get from each of your parents? Select all answers that are true. (Multiple choice: [A] More from one parent, depending on which parent you look more like. [B] One of each pair of chromosomes from each parent. [C] You get DNA for some traits from your mother and DNA for other traits from your father. [D] You get an equal amount from both parents, half from each.)

3. Use Punnett square to distinguish siblings: How could you use a Punnett square to explain why siblings look different from each other?

Critique condition

Students critiqued responses from a fictitious student about the Punnett square and siblings model. We used fictitious students because students are often reluctant to criticize classmates.[19] They indicated whether the response was correct, incorrect, or vague. They explained how to make it more accurate. The claims they evaluated were (Supplemental table 2):

1. a. Punnett square: "Two of their children will have free earlobes." b. Punnett square: "Their fourth child will have attached earlobes."

2. DNA from parents: "Siblings get different amounts of DNA from each parent, so they don't look exactly the same."

3. Genes: "Some kids get genes from one parent, and some get their genes from the other parent."

Pre/post and midpoint assessments

We analyzed student performance on the Punnett square item and siblings item at three time points using a knowledge integration rubric to assess overall progress in knowledge integration (Supplemental tables 1 and 2). The pretest items were identical to the post-test items. The pretest and post-test items were:

1. Siblings item: If you have siblings, you might look very similar to each other, but not exactly the same (unless you're an identical twin). Why do you think siblings look similar to each other, but not exactly the same?

2. Punnett square item: Explain how you would use a Punnett square to figure out the probability of getting a certain genotype.

Logs of student work to track ideas

We analyzed the logs of student responses to multiple choice items and explanation items across the activities in each condition to track the students' repertoire of ideas. Students' ideas were coded based on the task, domain knowledge, and knowledge integration.

Coding

Students' open responses to the questions were coded with a knowledge integration rubric (Supplemental table 3). Multiple choice and short responses were coded with the rubric in Supplemental table 4.

Students' responses were coded based on a combination of categories. Each code (e.g., Explore_Gene & Allele_Partial) included the following categories: knowledge integration task (explore, predict, discover, distinguish, and reflect), domain knowledge (gene and allele, Punnett, Punnett quantity, Punnett order, DNA amount, gene, genotype, probability, and siblings' traits), short answer/multiple choice score (correct, incorrect), or knowledge integration score (off task = 1; non-normative/irrelevant terminology use = 2; partial = 3; 1 valid link = 4; 2 or more valid links = 5).
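As an illustration only, a composite code such as Explore_Gene & Allele_Partial could be assembled from its categories as sketched below. The function name `make_code` and the label strings other than "Partial" are hypothetical; the source gives only the score meanings and one example code.

```python
# Knowledge integration scores from the rubric described above.
# Only the "Partial" label appears in the source example; the other
# label strings are hypothetical placeholders.
KI_LABELS = {
    1: "OffTask",
    2: "NonNormative",
    3: "Partial",
    4: "ValidLink",
    5: "MultipleValidLinks",
}


def make_code(task, domain, ki_score):
    """Join task, domain knowledge, and knowledge integration label into one code."""
    return f"{task}_{domain}_{KI_LABELS[ki_score]}"


# One student's string of ideas is then an ordered list of such codes.
example_code = make_code("Explore", "Gene & Allele", 3)
```

Treating each code as a single token lets a student's responses across the unit be compared as a sequence.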

The rubric and the coding scheme were discussed between two raters, one of whom had experience using the knowledge integration rubric in several studies. Revisions were made to the coding scheme. The raters then coded 100 student responses from the explanation items. Inter-rater reliability between the coders was near perfect (κ = 0.81).
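The source reports κ without naming the statistic. Assuming it is Cohen's kappa for two raters, the computation can be sketched as follows. The ratings in the example are hypothetical and are not the study's data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of responses.")
    n = len(rater_a)
    # Observed proportion of responses on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical knowledge integration scores (1-5) from two raters.
scores_rater_1 = [3, 3, 4, 2, 5, 3, 4, 2]
scores_rater_2 = [3, 3, 4, 2, 4, 3, 4, 3]
kappa = cohens_kappa(scores_rater_1, scores_rater_2)
```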

Analysis of code sequences

To identify learning paths, we developed an analytical procedure (Figure 2) that consisted of:

1. Strings of ideas, each string representing students' set of ideas in response to the unit's items.

2. Levenshtein edit distance of students' strings of ideas.

3. K-means clustering based on Levenshtein edit distance.

4. Generalized median string.

Figure 2. String analysis model to identify learning paths.

Levenshtein edit distances

We coded students' strings (sequences) of ideas for the questions in each condition. We compared the strings using the Levenshtein edit distance metric. The edit distance is the minimum number of insertions, deletions, or substitutions needed to transform one string of ideas into another; a smaller distance indicates more similar strings. This method is commonly used in linguistics[22] and bioinformatics[23] to identify similar strings. For example, in bioinformatics, DNA sequencing involves identifying similar strings of elements.
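A minimal sketch of the edit distance follows, assuming each coded response is treated as one token in a sequence. The code labels in the example are hypothetical.

```python
def levenshtein(seq_a, seq_b):
    """Minimum number of insertions, deletions, or substitutions
    needed to turn seq_a into seq_b (works on strings or lists of codes)."""
    # previous[j] holds the distance between the processed prefix of seq_a
    # and the first j items of seq_b.
    previous = list(range(len(seq_b) + 1))
    for i, item_a in enumerate(seq_a, start=1):
        current = [i]
        for j, item_b in enumerate(seq_b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # delete item_a
                current[j - 1] + 1,      # insert item_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


# Two hypothetical students whose paths differ at two steps.
path_a = ["Explore_Partial", "Predict_Incorrect", "Discover_Partial", "Reflect_Valid"]
path_b = ["Explore_Partial", "Predict_Correct", "Discover_Partial", "Reflect_Partial"]
distance = levenshtein(path_a, path_b)
```

Computing this distance for every pair of students yields a symmetric distance matrix that later steps can use.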

K-means clustering

We used K-means clustering based on Levenshtein edit distance to group students' strings into clusters with high intra-cluster similarity. K-means clustering starts from randomly chosen initial centroids to define clusters. The centroids are then updated iteratively until they stabilize. To determine the number of clusters, we used the silhouette coefficient.[24]
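The source does not state how edit distances were passed to K-means. The sketch below assumes one possible approach: each student is represented by their row of pairwise edit distances, and standard Lloyd-style K-means is run on those rows. The function names, the `init` parameter, and the distance matrix are all hypothetical.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, init=None, max_iter=100, seed=0):
    """Lloyd's algorithm. Returns (labels, centroids)."""
    if init is None:
        # Random initial centroids drawn from the data points.
        rng = random.Random(seed)
        centroids = [list(p) for p in rng.sample(points, k)]
    else:
        centroids = [list(c) for c in init]
    if len(centroids) != k:
        raise ValueError("init must contain exactly k centroids.")
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids have settled
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical pairwise edit distances among six students.
distances = [
    [0, 1, 1, 6, 7, 6],
    [1, 0, 2, 7, 6, 7],
    [1, 2, 0, 6, 6, 7],
    [6, 7, 6, 0, 1, 2],
    [7, 6, 6, 1, 0, 1],
    [6, 7, 7, 2, 1, 0],
]
labels, centroids = kmeans(distances, k=2)
```

Because the result depends on the random starting centroids, running several seeds and comparing the solutions is a common precaution.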

The number of clusters

To determine the optimal number of clusters, we used the silhouette coefficient, an unsupervised machine learning technique. For each point, it first calculates the average distance to every other point in the same cluster (the intra-cluster distance). It then calculates the average distance from that point to every point in the nearest neighboring cluster. For clusters to be distinct, the average distance between points within a cluster must be less than the average distance between those points and the points in other clusters. The silhouette coefficient ranges from –1 to +1. A high positive value indicates that the data point is well matched to its own cluster and poorly matched to its neighboring cluster.
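The calculation described above can be sketched from a precomputed distance matrix using the standard definition s = (b - a) / max(a, b), where a is the mean intra-cluster distance and b is the mean distance to the nearest other cluster. The distance matrix and labels below are hypothetical, and scoring singleton clusters as 0 is a common convention, not something stated in the source.

```python
def silhouette_scores(distances, labels):
    """Per-point silhouette coefficients from a pairwise distance matrix."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if len(clusters) < 2:
        raise ValueError("At least two clusters are required.")
    scores = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for a cluster with one point
            continue
        # a: mean distance to the other points in the same cluster.
        a = sum(distances[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the points of any other cluster.
        b = min(
            sum(distances[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return scores


# Hypothetical pairwise edit distances among six students.
distances = [
    [0, 1, 1, 6, 7, 6],
    [1, 0, 2, 7, 6, 7],
    [1, 2, 0, 6, 6, 7],
    [6, 7, 6, 0, 1, 2],
    [7, 6, 6, 1, 0, 1],
    [6, 7, 7, 2, 1, 0],
]
good = silhouette_scores(distances, [0, 0, 0, 1, 1, 1])
poor = silhouette_scores(distances, [0, 1, 0, 1, 0, 1])
```

In this toy example the first labeling separates the two tight groups and scores higher on average than the second, which mixes them.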

Generalized median string

To identify the path represented by a cluster, we used the generalized median string approach.[25] The generalized median string is the string that minimizes the sum of distances to every string in a set of strings.[26] Gonzalez-Rubio and Casacuberta have used the generalized median string as a method for determining the consensus translation of a target language (e.g., English) by comparing the translation of the same text in different languages (e.g., Spanish-English; French-English; German-English).[27] The median sequence is measured using the edit distance between the sequences. We used a generalized median string to identify the median path followed by students in each cluster.
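Finding the true generalized median string is computationally hard in general, since the median need not be one of the observed strings. The sketch below uses a simpler stand-in, the set median, which restricts candidates to strings that actually occur in the cluster. It is an approximation for illustration, not the procedure used in the study, and the example paths are hypothetical.

```python
def levenshtein(seq_a, seq_b):
    """Edit distance between two sequences (insert, delete, substitute)."""
    previous = list(range(len(seq_b) + 1))
    for i, item_a in enumerate(seq_a, start=1):
        current = [i]
        for j, item_b in enumerate(seq_b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def set_median(strings):
    """Observed string with the smallest summed edit distance to all others."""
    return min(
        strings,
        key=lambda candidate: sum(levenshtein(candidate, other) for other in strings),
    )


# Hypothetical cluster of four paths, one letter per coded step.
cluster_paths = [list("ABCD"), list("ABCE"), list("ABDD"), list("XBCD")]
representative = set_median(cluster_paths)
```

Here the first path differs from each of the others at one step, so it has the smallest total distance and is chosen as the representative.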

RESULTS

The unit was implemented successfully. All teachers were able to run the software and monitor student progress.

Progress in knowledge integration of Genetic Inheritance

To assess overall progress in knowledge integration, we analyzed two embedded items that were administered at three time points: a pretest (near the beginning of the unit), a midpoint assessment (after students completed the conditions), and a post-test (at the end of the unit). Table 1 shows the descriptive statistics for the Punnett square and siblings' traits items in the pretest and post-test. Students in both conditions made progress in knowledge integration on the Punnett square item from pretest to post-test. The two-way analysis of variance (ANOVA) shows a significant mean difference between the beginning and end of the unit (F [1, 482] = 9.89, P = 0.001). The effect for condition (revisit and critique) was not significant (F [1, 452] = 1.54, P = 0.21).

Table 1: Descriptive statistics for Punnett square items and siblings' traits items
Condition   n     Pretest Punnett square   Post-test Punnett square   Pretest siblings' traits   Post-test siblings' traits
Revisit     115   2.81 ± 1.04              3.06 ± 0.84                2.66 ± 0.72                2.76 ± 0.73
Critique    128   2.67 ± 1.15              2.98 ± 0.85                2.70 ± 0.64                2.66 ± 0.70
Data are expressed as mean ± standard deviation.

Overall, students in both conditions did not make progress on the siblings item from pretest to post-test. The two-way ANOVA revealed no significant differences (overall, F [1, 482] = 0.20, P = 0.65; by condition, F [1, 452] = 0.22, P = 0.64). However, there was a significant mean difference between the pretest siblings item and the midpoint siblings item, administered right after the conditions (F [1, 482] = 5.06, P = 0.02). Thus, students made progress right after being prompted to revisit the use of the Punnett square model for explaining the difference in siblings' traits. This suggests an advantage of combining understanding of the Punnett square with the requirements of the siblings question. For this question, the mean difference between the revisit and critique conditions was not significant (F [1, 482] = 3.00, P = 0.08). Condition had no significant effect on any assessment. For both the Punnett square and siblings items, students gained insights either at the midpoint or at the end of the unit. For the siblings item, the midpoint progress was not sustained at the post-test. The critique activities were quite difficult for students: only 39% of all critique responses were accurate (Table 2), suggesting that the condition did not achieve the goal of promoting the ability to critique.

Table 2: Accuracy of critique responses in critique condition (n = 128)
Item              Accurate responses (count)   Accurate responses (percentage)
Critique item 1   29                           23%
Critique item 2   66                           51%
Critique item 3   28                           22%
Critique item 4   78                           61%
Average           -                            39%

Cluster analysis

K-means cluster analysis was used to identify the clusters based on their edit distances. Generalized median string analysis was used to identify the representative string for each cluster.

Revisit condition

Figure 3 shows the silhouette scores against the number of clusters (k). Among the values tested, k = 4 has the highest average silhouette score, while k = 9 has the lowest (Figure 3). Figure 4 shows that k = 4 also has the fewest negative silhouette values, while k = 9 has the most. Thus, we selected k = 4 as the optimal number of clusters.
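The selection rule described here can be sketched as follows, assuming per-student silhouette values have already been computed for each candidate k. The function names and the numbers are hypothetical and are not the study's data.

```python
def summarize_silhouettes(scores_by_k):
    """Map each k to (mean silhouette, number of negative silhouette values)."""
    return {
        k: (sum(scores) / len(scores), sum(1 for s in scores if s < 0))
        for k, scores in scores_by_k.items()
    }


def choose_k(scores_by_k):
    """Pick the k with the highest mean silhouette score."""
    summary = summarize_silhouettes(scores_by_k)
    return max(summary, key=lambda k: summary[k][0])


# Hypothetical per-student silhouette values for three candidate k.
example_scores = {
    3: [0.40, 0.35, -0.10, 0.30],
    4: [0.55, 0.50, 0.45, 0.60],
    5: [0.20, -0.05, 0.25, -0.15],
}
best_k = choose_k(example_scores)
```

The count of negative values is reported alongside the mean because negative silhouettes flag students who sit closer to a neighboring cluster than to their own.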

Figure 3. Silhouette scores for clusters in the revisit condition.


Figure 4. Graphical Silhouette Coefficient clustering for the revisit condition. The "Cluster label" represents the cluster to which a group of data points belongs. The data points are represented by the vertical bars in the Silhouette plots, and each bar's color corresponds to the cluster to which a set of data points belongs. The position of the bar represents the Silhouette coefficient value for a group of data points, with higher values indicating that the data points are well-matched to their own cluster and poorly matched to neighboring clusters.

After testing different numbers of clusters, four clusters showed more intra-cluster similarity and inter-cluster dissimilarity (Figure 5). Figure 6 shows the median strings (the most representative strings of ideas) for the revisit condition clusters. Each of the four identified learning paths culminated in either a normative partial link (knowledge integration score of 3) or an irrelevant terminology use (knowledge integration score of 2).

Figure 5. K-means clusters for the revisit condition.

Figure 6. Revisit: K-means clusters and generalized median strings. The set of boxes/steps under each cluster shows the most representative string of ideas for that cluster. Each box represents the codes for students' responses to the unit questions. The number in each box matches the number of the questions in Supplemental table 1.

There are four clusters in Figure 6, each of which shows students' sequences of ideas when exploring, eliciting, discovering, distinguishing, and reflecting. Each cluster contains nine boxes, each of which represents a step in students' learning path. What makes each cluster distinct is students' progress in making discoveries, distinguishing ideas, and reflecting on their ideas.

We categorized Cluster 1 as an isolated links group because they provided isolated normative ideas without elaborating on those ideas. From the domain knowledge perspective, we categorized the students in Cluster 1 as probability-centric because they made discoveries about the probability in the Punnett square (step 6 and step 8 in Figure 6). This cluster's discovery about the probability of offspring traits was evident when using the Punnett square (step 6 in Figure 6) and making statements such as: "You have 4 squares, each one is 1/4 chance. If the letters are the exact same, like Ee and eE, it is a total of a 50% chance". In one of the subsequent steps (step 8 in Figure 6), these students were able to use the knowledge they gained about the probability in the Punnett square to state how they can use a Punnett square to explain why siblings look different. For example, student A stated: "The probability is different for siblings each time". However, in later steps, they seemed confused or unsure about the difference between "genes" and "alleles". Students mostly used expressions such as "the combination of genes" instead of "the combination of alleles" to explain the inheritance of traits. For example, in step 9 (Figure 6), student A stated: "The genes that your mom and your dad (give) are not the same and the Punnett square might be different". Teachers should build on students' partial understanding of probability in the Punnett square and help them link this understanding to the causal indicators of Genetic Inheritance.

We categorized Cluster 2 as a partial links group because they made more partial links in their sequences of ideas. They made a partial link about probability while discovering the Punnett square (step 3 in Figure 6), and in a subsequent step, they made a partial link about genotypes (step 5 in Figure 6). For example, in step 5, student B stated: "They (Punnett square boxes) represent the possible genotypes". Furthermore, they provided a partial link about probability in step 8 where they were asked to state how they can use a Punnett square to explain why siblings look different. From the domain knowledge perspective, we categorized Cluster 2 as Punnett-centric because they were able to improve their knowledge about the Punnett square. Their sequences of ideas in step 3 (partial probability), step 4 (correct Punnett exercise), and step 5 (partial genotype) enabled them to improve their partial link in the Punnett square (step 3) to a valid link in their reflection on the Punnett square (step 6). For example, student B stated: "You put the alleles of the parent and multiply them". They also reflected on their Punnett square knowledge with a normative valid link statement: "If a certain genotype shows up more than once the probability is more". Later in step 8, when they were asked to state how they could use a Punnett square to explain why siblings look different, they provided a partial link about probability without referring to genotypes. For example, student B stated: "The Punnett square shows the different types of probabilities of different traits for your sibling. So, it can explain why siblings look different from each other". What they struggled with in their learning path was explaining the causal indicators of genetic variation. For example, when they were asked "How much DNA do you get from each parent? Select all the answers that are true" (step 7), student B responded: "(you get) More (DNA) from one parent, depending on which parent you look more like". This idea reappeared in the final step (step 9) where students were asked "Why do you think siblings look similar to each other, but not exactly the same?" For example, in step 9, student B stated: "some siblings get more DNA for the mother and some get more DNA from the father". Teachers should build on students' ideas about genotype and probability, and help them link those ideas to the causal indicator of genetic material. If students express ideas about where genetic material is coming from, teachers can help them develop an understanding of how variation of traits is possible between siblings.

We categorized Cluster 3 as a valid links group because they made a valid link in the discovery of the Punnett square (step 3) and provided a valid link when reflecting on their discovery of the Punnett square (step 6). For example, in step 3, student C stated: "You put the alleles of the parent and multiply them" and in step 6 stated: "Well, you would use the Punnett square by putting both of your parent's genotypes on each side and then match up the letters like a multiplication table". From the domain knowledge perspective, we categorized students in Cluster 3 as DNA-structure-centric because they were able to understand that "you get an equal amount from both parents, half from each; and you get one of each pair of chromosomes from each parent" and were able to use this knowledge in the final step: "(siblings look similar to each other, but not exactly the same) Because they gather different traits from different sides of their family in a random order and a random condition". This statement shows that students understand randomness, yet they can improve on their statement by explaining how alleles affect inherited traits. This struggle is evident in step 5, where they provided an irrelevant use of terminology, stating: "(when the Punnett square is filled out, the four boxes represent) The genes of the offspring". Teachers should build on students' understanding of the Punnett square and DNA structure to help them provide explanations that include statements about the probability of different combinations of genotypes/alleles as the causal indicators for the variation of traits in siblings.

We categorized Cluster 4 as an integrated links group because they were able to integrate ideas in one of the final steps (step 8) where they were asked to state how they would use the Punnett square to explain why siblings look different. They consistently made discoveries, distinguished ideas, and gradually made progress in their knowledge integration. From the domain knowledge perspective, we categorized Cluster 4 as allele-centric. They made discoveries about the probability in the Punnett square (step 3 and step 6), distinguished ideas about genotype (step 4), and distinguished ideas about the DNA structure. These sequences of ideas from step 3 to step 7 enabled them to provide a valid link in step 8. In step 8, student D stated: "I could use a Punnett square to explain why siblings look different from each other by showing that the probability that some offspring get some kind of genotype could be different from what the other sibling gets". They understood the Punnett square and the terminology for Genetic Inheritance, which enabled them to develop an understanding of allele interaction. In step 9, they stated a partial causal link: "Siblings have some of the same alleles, but some different". If these students continue to do new activities, they are likely to consolidate their explanation in step 9 because they already provided a valid link in step 8.

Critique condition

Figure 7 shows the silhouette scores against the number of clusters (k). Among the values tested, k = 3 has the highest average silhouette score, while k = 9 has the lowest. Figure 8 shows that k = 3 also has the fewest negative silhouette values, while k = 9 has the most. Thus, we selected k = 3 as the optimal number of clusters.

Figure 7. Silhouette scores for clusters in critique condition.


Figure 8. Graphical Silhouette Coefficient clustering for the critique condition. The "Cluster label" represents the cluster to which a group of data points belongs. The data points are represented by the vertical bars in the Silhouette plots, and each bar's color corresponds to the cluster to which a set of data points belongs. The position of the bar represents the silhouette coefficient value for a group of data points, with higher values indicating that the data points are well-matched to their own cluster and poorly matched to neighboring clusters.

After testing different numbers of clusters, three clusters showed more intra-cluster similarity (Figure 9). Figure 10 shows the median strings for the critique condition clusters. Each of the three identified learning paths culminated in either a normative partial link (knowledge integration score of 3) or an incorrect terminology use (knowledge integration score of 2).

Figure 9. K-means clusters for the critique condition.

Figure 10. Critique: K-means clusters and generalized median strings. The set of boxes/steps under each cluster shows the most representative string of ideas for that cluster. Each box represents the codes for students' responses to the unit questions. The number in each box matches the number of the questions in Supplemental table 2.

The clusters in Figure 10 show students' sequences of ideas when exploring, eliciting, discovering, distinguishing, and reflecting. What makes each cluster distinct is students' progress in making discoveries, distinguishing ideas, and reflecting on their ideas.

We categorized Cluster 1 as an isolated links group because they provided isolated normative ideas in steps 1, 4, and 7. From the domain knowledge perspective, we categorized the students in Cluster 1 as probability-centric because they improved their understanding of the Punnett square from irrelevant terminology use in step 3 (Figure 10) to the use of partial links about probability in step 7 (Figure 10). In step 3, they struggled to discover and explain the probability of getting a certain genotype in the Punnett square. As they moved to step 4, they successfully discovered how to work with the Punnett square. Although they struggled to critique hypothetical ideas about the Punnett square in steps 5 and 6, they finally showed an improvement in their reflection on the Punnett square discovery in step 7. For example, a student reflected on the Punnett square discoveries by providing a partial link about probability: "You look what traits you inherit and your probability of inheriting that trait". We suggest teachers build on students' partial understanding of genes (step 1) and their understanding of probability in Punnett squares (step 7). For example, teachers may ask: "Why are genes represented with two letters in the Punnett square?" and "How do siblings get different copies of chromosomes from their parents?" Next, teachers can help students connect the inheritance of different traits to random inheritance of DNA and random combination of genotypes. Furthermore, they can help students connect back to the Punnett square model to determine that alleles are responsible for the traits and use this to explain siblings' differences.

We categorized Cluster 2 as a partial links group. They made partial links when discovering the Punnett square (step 3) and critiquing ideas about genes (step 9). From the domain knowledge perspective, we categorized Cluster 2 as Punnett-centric because they developed procedural knowledge of how to create the Punnett square. They improved their partial understanding of the Punnett square in step 3 to a valid link in step 7. For example, student E stated "(in Punnett square) The probability is that you will get attached earlobes" in step 3 and stated "(in Punnett square) You can check if your children (or something else) will get a certain genotype because it will predict it" in step 7. Although students improved in their procedural knowledge about the Punnett square from step 3 to 7, they struggled to critique a presented hypothetical idea about the quantity of offspring in Punnett square boxes (step 5). For example, student E stated: "...when the E or e from the mother and...the father combine, three children will have free earlobes, and one will have attached earlobes." These students did not realize that the Punnett square does not show the number of offspring but rather shows random combinations of genotypes. As they proceeded to the next step, they succeeded in critiquing the hypothetical idea about the order of offspring in the Punnett square (step 6). For example, in response to the hypothetical idea that "Their fourth child will have attached earlobes", student E stated: "It does not have to be the fourth child it could be in any order". This cluster's fragmented ideas about the randomness of genotype combinations in the Punnett square (step 5) were carried forward to step 8. In step 8, this cluster of students showed fragmented ideas connected to the DNA structure and the random assortment of chromosomes.
Since this cluster's full discovery of the Punnett square was not accompanied by an accurate critique of hypothetical ideas about DNA structure, they struggled to integrate ideas in their reflection on siblings' inherited traits (step 10). Teachers should build on students' knowledge of the Punnett square to help them connect the procedure of finding all possible combinations of parental alleles to understanding randomness in those combinations and inheritance of chromosomes from each parent.

We categorized Cluster 3 as an integrated links group. They consistently made discoveries and distinguished ideas. From the domain knowledge perspective, we categorized Cluster 3 as allele-centric because they were able to integrate the idea of different combinations of alleles with the idea of siblings' inherited traits. For example, in step 9, a student stated: "Different combinations of alleles in the Punnett square affect siblings' inherited traits". This cluster of students developed partial and valid links across different areas of the unit. They also showed Punnett square mastery and maximum success in critiquing. Similar to the other clusters, Cluster 3 shows that continuous partial links in distinguishing ideas and full discovery of the Punnett square (steps 3–8 in Figure 10) led to more successful critiques and reflections.

DISCUSSION

We explored students' scientific understanding and knowledge integration in two instructional conditions: critique and revisit. Our statistical analysis shows that when students were asked to reflect on "Why siblings look similar to each other, but not exactly the same?", the majority of them did not integrate the new ideas that they gained from the Punnett square discovery activity. This shows that students did not apply ideas about the functions of the Punnett square to the context of siblings' inherited traits in their reflections. The challenges students encountered when reflecting on their ideas are consistent with research findings on effective revising, which is a challenging process for middle school students.[28,29] Students are more likely to use evidence to confirm rather than revise their ideas.[29] Novice writers tend to revise their responses at a surface level, giving writing mechanics greater attention than content.[28] Effective revision requires identifying the deficiencies in the initial writing and then resolving those issues in the revision.[30,31]

Our statistical analysis shows that being exposed to revisiting questions that combine different contexts may help students to focus their knowledge integration. When students in the revisit condition were asked to revisit their response to the Punnett square as they explained why siblings look similar to each other (step 8 in Figure 6), they showed a detectable change in their knowledge integration. This suggests that repeated exposure to a model interconnected with different contexts may promote knowledge integration. Improvements in revisions were found when students were prompted with a strategy for revising.[32–34]

In this paper, we used cluster analysis to illustrate the multiple paths students take through the same activities as well as the distinct impacts of alternative instructional conditions. The identified learning paths were not robust across the two conditions; they depended somewhat on the nature of each condition. Students following some of the paths were less successful than others. We observed isolated progress in some paths and more integrated progress in other paths. The distinct learning paths of each cluster of students inform us on how to build upon students' ideas to guide them on those paths.

Distinguishing ideas

In inquiry learning, a coherent understanding is achieved when students engage in the process of distinguishing ideas.[12] Some students may go down the learning path without normatively distinguishing ideas (Cluster 1 in Figure 10) or with normatively isolated distinguishing ideas (Cluster 2 in Figure 10). For example, Cluster 2 of the critique condition discontinuously distinguished an idea about the order of boxes in the Punnett square (step 6) and the genes inherited from parents (step 9) throughout their learning path. Although this cluster made a full discovery of the Punnett square, they may need more interactive models and prompts to distinguish the non-normative connection of ideas in their repertoire from the scientifically valid connection of ideas. When students such as those in Cluster 1 in Figure 10 do not engage in the process of distinguishing ideas because they lack sufficient normative ideas to build on, we may need to introduce pivotal questions that are relevant to students' non-normative ideas. To help this cluster, teachers need to use students' single isolated ideas and present those ideas with pivotal cases that address their non-normative ideas. It is argued that introducing pivotal cases may promote recognizing, rethinking, and restructuring of the repertoire of ideas.[18]

When students make continuous discoveries or fragmented discoveries, the presence or absence of some distinguishing ideas impacts their knowledge integration in their final reflection. From the different learning paths, we learned that those who made full discovery about the Punnett square without distinguishing ideas about DNA structure, Punnett order, or Punnett quantity struggled with integrating ideas when making their final reflection about the inheritance of siblings' traits. For example, in both Cluster 2 of the revisit (Figure 6) and Cluster 2 of the critique condition (Figure 10), students continuously stated concepts correctly when they described the procedure of developing the model of a Punnett square. However, their struggle to distinguish ideas about the DNA structure shows that, despite making full discoveries about the Punnett square model in the preceding steps, they struggled to reflect on the inheritance of siblings' traits in the final step of their learning path. On the other hand, the sequences of ideas in Cluster 3 and Cluster 4 of the revisit (Figure 6) and Cluster 3 of the critique condition (Figure 10) show that if students make a continuous discovery of the Punnett square model and proceed to successfully distinguish ideas about the DNA structure, they can provide a partial link in their final reflection on the inheritance of siblings' traits.

These findings show that students struggle to determine the most constructive ideas that could be used as links to form a coherent integration. They struggle to distinguish among their own diverse ideas and determine the ones they can build on. Prior work found that the distinguishing step requires the most support for students because students need to determine which ideas from their repertoire are most constructive for developing an integrated understanding.[35] The distinct repertoire of ideas in each learning path informs us of the need to use different instructions for supporting the process of distinguishing and reconciling conflicting ideas.

Sustaining links

The cluster of students who have been able to integrate valid knowledge for some concepts independent of the other concepts may need prompts to sustain those links across different concepts. For example, although the main concepts of alleles, genotypes, and probability were articulated in relation to the Punnett square, they were not used meaningfully in relation to the siblings' inherited traits. Depending on the distinct learning path of each cluster of students, we need to provide prompts that help connect the corresponding elements across contexts.

Sustaining invalid ideas

After developing partial or valid links, some students abandoned those links when they had to distinguish between different concepts (genotypes, DNA structure, alleles, genes, and probability and randomness in Punnett square) or reflect on those concepts in a more generalized context. This shows that students may hold onto their incorrect conceptions despite adding partial or new valid links to their repertoire of ideas. For example, Cluster 2 of the critique condition provided two valid links in their learning path, yet they were not able to integrate those valid links of gene and allele (step 1 in Figure 10) and Punnett square (step 7 in Figure 10) to provide an accurate concept in their following explanation in steps 8 and 10.

Students who have been able to integrate valid knowledge for each concept independent of the other concepts may need prompts to find the link between different concepts and to transfer their knowledge across different contexts. For example, although the main concepts of alleles, genotypes, and probability were used in relation to the Punnett square, they were not used correctly in relation to the siblings' inherited traits. We may need prompts that guide students to find the corresponding elements across contexts. Puntambekar et al. suggest that students need guidance to know which features to use when integrating ideas with their designed models.[36] Johnson et al. found that abstract prompts lead to deeper understanding if accompanied by prompts that focus students' attention on the corresponding elements across multiple text or diagram representations.[37] Davis and Linn suggest the use of self-monitoring prompts to promote more knowledge integration.[38] They studied the use of activity prompts in comparison to the use of self-monitoring prompts. The activity prompts were designed to elicit direct scientific knowledge while the self-monitoring prompts were designed to promote planning and reflection. They found that when students regularly articulate their ideas in response to the self-monitoring prompts, they learn to engage in autonomous reflection and knowledge integration.

Building on a partial link

The articulation of a partial link for distinguishing among concepts may not be enough to scientifically improve students' repertoire of ideas. The partial understanding of the concepts (step 1 in Figures 6 and 10) was followed by either an incorrect concept use or a partial idea about why siblings look similar but not exactly the same (step 2 in Figures 6 and 10). For example, after stating a partial link about the difference between genes and alleles, a student predicted why siblings look similar but not exactly the same by applying an irrelevant terminology: "They got different genes from their parents". Another student did not mention the main concepts: "Because you come from the same parents but you take different things from both parents".

Students who formed scattered partial links in their learning path but did not proceed to make at least one partial link in distinguishing ideas were found to be struggling the most in their learning path. For example, Cluster 1 in both the revisit and critique conditions shows that students were able to provide partial links in their learning path (steps 1, 4, and 6 in Figure 6 and steps 1, 4, and 7 in Figure 10) but were not able to follow with at least one partial distinguishing idea. Although these clusters possessed some normative partial links in their exploration and discovery processes, those partial links were scattered throughout the learning path. Consequently, students were not able to create clear links between their ideas. Thus, our findings reveal that when students' repertoire of ideas is expanded with some disconnected partial links without any distinguishing ideas, those partial links may not be used in the knowledge integration process. Scholars have proposed that individuals may develop an understanding of an idea when they encounter it, but they may isolate or fail to recall it if they do not link that idea to related situations.[39,40]

Effective prompts

We provided exploring, eliciting, discovery, distinguishing (revisiting vs. critiquing), and reflecting prompts. By comparing students' sequences of ideas in each cluster, we learned that some students may need to make more discoveries (Cluster 1 in the revisit and critique conditions) and some may need to distinguish more ideas (e.g., Cluster 2 in the revisit and critique conditions). Some students were pushed ahead to distinguish ideas while they had not made full discoveries yet and some were pushed ahead to reflect on ideas while they had not distinguished some ideas. Thus, the clusters help us understand where students are in the learning process and what prompts are needed for future studies.

Our findings indicate that distinguishing activities will elicit students' gaps in their repertoire of ideas. Critiquing activities and revisiting activities are both found to be effective for improving students' scientific understanding,[3,21] especially for students with low prior knowledge.[3] Educators and researchers need to use the elicited information to provide more distinguishing activities, new scientific ideas, or more reflective activities for students so that they can integrate more knowledge in their final reflection.[41,42] The learning paths we identified and the methods we used in this study will advance our knowledge on how to design prompts that help students build on their ideas.

Every distinct learning path will require tailored prompts to promote discovering scientific models, distinguishing ideas, integrating knowledge, and using strategies to sustain links. Studies show that students are more likely to add new ideas or replace ideas instead of linking new ideas to their repertoire of ideas.[43] This approach may lead to fragmented ideas as we observed in Cluster 1 and Cluster 2 of the revisit and critique conditions. Determining distinct learning paths may allow us to design future curricula that provide guidance to build on students' ideas in each learning path and help students to construct their learning path by evaluating their competing ideas and refining their repertoire of ideas.

Implications for Teachers

The implications of this study extend to teaching and designing inquiry-based curricula. In particular, the findings highlight the importance of promoting effective revising and knowledge integration in science education.

The results suggest that students may struggle with integrating new ideas into their existing understanding, particularly when revising their responses at a surface-level or when lacking normative ideas to build upon. To address these challenges, teachers may need to provide prompts and pivotal questions to guide students toward effective revising and knowledge integration. The study also emphasizes the importance of distinguishing and sustaining links between concepts and contexts in science learning. Teachers may need to help students distinguish among their diverse ideas and determine the most constructive ones for developing an integrated understanding. Additionally, prompts may be needed to sustain links between concepts and contexts, particularly when students have developed partial or valid links but struggle to connect them across different contexts.

Overall, the study highlights the importance of supporting students in distinguishing among their own diverse ideas and determining the ones they can build on. Teachers may need to use different instructions for supporting the process of distinguishing and reconciling conflicting ideas, depending on the distinct repertoire of ideas in each student's learning path.

CONCLUSION AND SCHOLARLY SIGNIFICANCE

This study focuses on analyzing student learning paths and progress in knowledge integration, a key aspect of STEM education. We used actual data, including knowledge integration scores, to examine the ways in which students integrate new ideas with existing knowledge. To compare students' sequences of ideas, we used Levenshtein edit distance. We investigated the patterns of students' learning paths by clustering students based on their edit distances. To determine the number of clusters, we used the silhouette coefficient measuring technique. We clustered students' learning paths using K-means clustering and determined the most representative learning path for each cluster using Kohonen's generalized median string formula.[25] Other researchers have investigated sequential properties by examining individual sequences and calculating the probability of transitioning from one event to another[44] or by using sequential pattern mining to find the most frequent sequences in a sample.[45] In this study, the combination of methods we used for the sequence analysis allowed us to compare sequences based on distances and obtain an aggregate measure of central tendency for a set of sequences. Since the Levenshtein distance between two strings is the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one string into the other, it is the same as other approaches that make a series of substitutions, deletions, or insertions. The unique approach here is that it measures differences between strings of knowledge integration scores, establishing the minimum number of insertions, deletions, or substitutions necessary to make one string the same as another. Thus, the clusters are sets of strings that are similar to each other in the sense that they require the fewest insertions, deletions, or substitutions to be identical to each other.
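As an illustration of the distance measure described above, the following minimal Python sketch computes the Levenshtein edit distance between two sequences of knowledge integration scores and builds a pairwise distance matrix. It is not the code used in this study; the function names `levenshtein` and `distance_matrix` and the example score sequences are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, or substitutions turning a into b."""
    # prev[j] is the distance between the items of a seen so far (minus one)
    # and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning the first i items of a into an empty sequence
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute x with y, or match
        prev = curr
    return prev[-1]


def distance_matrix(sequences):
    """Pairwise edit distances between all sequences (symmetric, zero diagonal)."""
    n = len(sequences)
    return [[levenshtein(sequences[i], sequences[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical score sequences, one score per unit step (not study data).
student_a = [1, 2, 1, 3]
student_b = [1, 2, 3, 3]
pair_distance = levenshtein(student_a, student_b)
matrix = distance_matrix([student_a, student_b, [1, 2, 3]])
```

Because the function compares items rather than characters, it works on lists of scores as well as on strings of codes.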
Applying these methods enabled us to characterize how students' repertoire of ideas changed and to identify distinct learning paths as students integrate their ideas. In this study, the major changes in students' learning paths started to appear from the discovery step. Thus, in future research on simulation-based discovery learning, where students need to make more discoveries, this method of analysis will be useful to track multiple paths of actions and knowledge integration.
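To make the role of the silhouette coefficient concrete, the sketch below computes a mean silhouette value from a precomputed distance matrix and a candidate cluster assignment. It is a simplified illustration under our own assumptions, not the analysis code from this study: the function name `mean_silhouette`, the toy distance matrix, and the two candidate labelings are hypothetical, and the clustering step itself is not shown.

```python
def mean_silhouette(dist, labels):
    """Mean silhouette coefficient for a distance matrix and cluster labels."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(dist[i][j] for j in members) / len(members)
                for other, members in clusters.items() if other != lab)
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / len(scores)


# Toy edit distances: items 0 and 1 are close, items 2 and 3 are close.
toy_dist = [[0, 1, 4, 4],
            [1, 0, 4, 4],
            [4, 4, 0, 1],
            [4, 4, 1, 0]]
good = mean_silhouette(toy_dist, [0, 0, 1, 1])  # matches the structure
poor = mean_silhouette(toy_dist, [0, 1, 0, 1])  # splits the close pairs
```

A higher mean value indicates a labeling in which sequences sit closer to their own cluster than to the nearest other cluster, which is the basis for comparing candidate numbers of clusters.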

The present study compared two instructional conditions, critique and revisit, and examined their impact on students' scientific understanding and knowledge integration. The findings indicate that the majority of students had difficulty integrating new ideas when reflecting on the question of why siblings look similar but not exactly the same. However, the revisit condition, which involved revisiting questions that combine different contexts, was more effective in facilitating knowledge integration compared to the critique condition. These results suggest that exposing students to revisiting questions that combine different contexts can enhance their ability to integrate new scientific ideas into their existing knowledge structures. This study provides insights into effective instructional practices for promoting students' knowledge integration in science education. Further research is needed to explore the long-term effects of the revisit condition on students' scientific understanding and how these effects may vary across different age groups and academic settings.

To answer our second question, "What distinct learning paths do students take as they study the web-based Genetic Inheritance unit in the critique and revisit conditions?", we examined the set of learning paths that emerged for each of the instructional conditions. Our approach expanded on prior work studying the trajectory of knowledge integration[12] and identified distinct learning paths using clustering analysis. By characterizing the varied student trajectories, we were able to develop a qualitative description of each path, which can help teachers understand the unique ways that students go about making sense of the ideas they encounter during instruction.
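The idea of a representative path for a cluster can be sketched as follows. Note that this example uses the set median, that is, the member of the cluster with the smallest summed edit distance to all members. This is a simpler stand-in for the generalized median string, which is not restricted to sequences that actually occur in the cluster. The function names and the example cluster are hypothetical and do not reproduce study data.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, or substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def set_median(sequences):
    """Member of `sequences` minimizing the summed edit distance to all members.

    This is the set median, an approximation of the generalized median
    string (assumption: used here only for illustration).
    """
    return min(sequences,
               key=lambda s: sum(levenshtein(s, t) for t in sequences))


# Hypothetical cluster of four-step score strings (not study data).
cluster = ["1213", "1223", "1233", "1223"]
representative = set_median(cluster)
```

In this toy cluster the string "1223" has the smallest total distance to the other members, so it is returned as the representative path.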

Our analysis of students' knowledge integration trajectories revealed that students progressed at different rates and grappled with different ideas, with some gradually adding ideas to reach a coherent understanding while others reconfigured their understanding using a subset of new ideas. Additionally, some students faced difficulties in differentiating between their various ideas and identifying the ones they could use to build upon and also found it challenging to maintain connections between concepts. The learning paths show when students' ideas come together and when their links remain fragmented. Students who integrate some ideas but not others may benefit from prompts to sustain their links across different concepts and to add the ideas they have neglected. These findings highlight the importance of providing support to students in the process of integrating new and prior ideas and using prompts to help them connect corresponding elements across contexts. In light of these findings, we suggest that curriculum designers can benefit from identifying distinct learning paths and utilizing them to design activities that build upon the areas where students' ideas come together. In promoting the knowledge integration framework within STEM education, this approach can support students in integrating new and prior ideas into coherent explanations of scientific phenomena. For instance, in this study, we identified four learning paths of isolated links, partial links, valid links, and integrated links, which allowed us to emphasize the need to provide support to students in the process of distinguishing and reconciling conflicting ideas, and prompts that help them connect corresponding elements across contexts to sustain their knowledge integration.

Limitations

Our study focuses on 6th-grade students' learning paths toward the understanding of Genetic Inheritance and might not generalize to other grade levels or science topics. We combined data from three teachers' classrooms at two different schools. Our representative learning paths of student clusters may change with the change of individual questions, the use of different instructional frameworks, or the change of teachers, students, or schools.

DECLARATIONS

Author contributions

Obaid T: Conceptualization, Investigation, Data curation, Methodology, Software, Visualization, Validation, Formal analysis, Writing—Original draft. Aghajani H: Software development for machine learning algorithms, Review of methodology. Linn MC: Resources, Supervision, Project administration, Funding acquisition, Writing—Review and Editing.

Source of funding

This work is supported by the National Science Foundation under Grant Project (No.1014863): Supporting Teachers in Responsive Instruction for Developing Expertise in Science (STRIDES).

Conflicts of interest

The authors declare that there is no conflict of interest.

REFERENCES

  1. Clark D, Linn MC. Designing for knowledge integration: the impact of instructional time. J Learn Sci. 2003;12(4):451–493.    DOI: 10.1207/S15327809JLS1204_1
  2. diSessa AA. Knowledge in pieces. In: Forman G, Pufall P, eds. Constructivism in the computer age. Lawrence Erlbaum Associates; 1988:49–70.
  3. Donnelly DF, Vitale JM, Linn MC. Automated guidance for thermodynamics Essays: Critiquing versus revisiting. J Sci Educ Technol. 2015;24(6):861–874.    DOI: 10.1007/s10956-015-9569-1
  4. Lewis E, Stern J, Linn M. The effect of computer simulations on introductory thermodynamics understanding. Educ Technol. 1993;33(1):45–58.
  5. Tiberghien A. Modeling as a basis for analyzing teaching-learning situations. Learn Instr. 1994;4(1):71–87.    DOI: 10.1016/0959-4752(94)90019-1
  6. Furtak EM, Kiemer K, Circi RK, et al. Teachers’ formative assessment abilities and their relationship to student learning: findings from a four-year intervention study. Instr Sci. 2016;44(3):267–291.    DOI: 10.1007/s11251-016-9371-3
  7. Krajcik JS, Mun K. Promises and challenges of using learning technologies to promote student learning of science. In: Lederman NG, Abell SK, eds. The handbook of research on science education. Routledge; 2014:337–360.    DOI: 10.4324/9780203097267
  8. Ruiz-Primo MA, Furtak EM. Exploring teachers' informal formative assessment practices and students' understanding in the context of scientific inquiry. J Res Sci Teach. 2007;44(1):57–84.    DOI: 10.1002/tea.20163
  9. Linn MC, Donnelly-Hermosillo D, Gerard L. Synergies Between Learning Technologies and Learning Sciences. In: Abell SK, Lederman NG, eds. Handbook of Research on Science Education. 1st ed. Routledge; 2023:447–498.    DOI: 10.4324/9780367855758-19
  10. Vitale J, Applebaum L, Linn M. Individual Versus Shared Design Goals in a Graph Construction Activity. In: Smith BK, Borge M, Mercier E, Lim KY, eds. Making a Difference: Prioritizing Equity and Access in CSCL, 12th International Conference on Computer Supported Collaborative Learning (CSCL) 2017, Volume 1. International Society of the Learning Sciences; 2017:1–8.
  11. Matuk C, Linn MC. Why and how do middle school students exchange ideas during science inquiry? Int J Comp-Support Collab Learn. 2018;13(3):263–299.    DOI: 10.1007/s11412-018-9282-1
  12. Linn MC, Eylon B-S. Science learning and instruction: Taking advantage of technology to promote knowledge integration. Routledge; 2011.    DOI: 10.4324/9780203806524
  13. Black P, Wiliam D. Classroom assessment and pedagogy. Assess Educ. 2018;25(6):551–575.
  14. Real EM, Pinheiro Pimentel E, de Oliveira LV, Cristina Braga J, Stiubiener I. Educational Process Mining for Verifying Student Learning Paths in an Introductory Programming Course. 2020 IEEE Frontiers in Education Conference (FIE). 2020:1–9.    DOI: 10.1109/FIE44824.2020.9274125
  15. Tytler R, Peterson S. Young Children Learning about Evaporation: A Longitudinal Perspective. Can J Sci Math Technol Educ. 2004;4(1):111–126.    DOI: 10.1080/14926150409556600
  16. Psillos D, Kariotoglou P. Teaching fluids: intended knowledge and students' actual conceptual evolution. Int J Sci Educ. 1999;21(1):17–38.    DOI: 10.1080/095006999290813
  17. Duit R, Roth W-M, Komorek M, Wilbers J. Conceptual change cum discourse analysis to understand cognition in a unit on chaotic systems: towards an integrative perspective on learning in science. Int J Sci Educ. 1998;20(9):1059–1073.    DOI: 10.1080/0950069980200904
  18. Linn MC, Hsi S. Computers, teachers, peers: Science learning partners. 1st ed. Routledge; 2000.    DOI: 10.4324/9781410605917
  19. Harrison EJ, Gerard LF, Linn MC. Encouraging Revision of Scientific Ideas with Critique in an Online Genetics Unit. In: Kay J, Luckin R, eds. Proceedings of the 13th International Conference of the Learning Sciences (ICLS). International Society of the Learning Sciences; 2018;1:518–521.
  20. Svihla V, Linn MC. Distributing Practice: Challenges and Opportunities for Inquiry Learning. In: Proceedings of the 10th International Conference of the Learning Sciences (ICLS2012). International Conference of the Learning Sciences; 2012;1:371–378.
  21. Sato E, Linn MC. Designing Critique to Improve Conceptual Understanding. In: Polman JL, Kyza EA, O’Neill DK, et al., eds. Learning and Becoming in Practice: The International Conference of the Learning Sciences (ICLS). International Society of the Learning Sciences; 2014;1:471–478.    DOI: 10.22318/icls2014.471
  22. Beijering K, Gooskens C, Heeringa W. Predicting intelligibility and perceived linguistic distance by means of the Levenshtein algorithm. Linguist Netherlands. 2008;25(1):13–24.    DOI: 10.1075/avt.25.05bei
  23. Liew AWC, Yan H, Yang M. Pattern recognition techniques for the emerging field of bioinformatics: A review. Pattern Recognit. 2005;38:2055–2073.
  24. Rousseeuw PJ. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J comput appl math. 1987;20:53–65.
  25. Nicolas F, Rivals E. Complexities of centre and median string. Lect Notes Comput Sci. 2003;2676:315–327.
  26. Kohonen T. Median strings. Pattern recognit lett. 1985;3(5):309–313.    DOI: 10.1016/0167-8655(85)90061-3
  27. González-Rubio J, Casacuberta F. On the Use of Median String for Multi-source Translation. In: Proceedings of the 2010 20th International Conference on Pattern Recognition. IEEE Computer Society; 2010:4328–4331.    DOI: 10.1109/ICPR.2010.1052
  28. Crawford L, Lloyd S, Knoth K. Analysis of student revisions on a state writing test. Assess Eff Interv. 2008;33(2):108–119.    DOI: 10.1177/1534508407311403
  29. Kuhn D, Black J, Keselman A, Kaplan D. The development of cognitive skills to support inquiry learning. Cogn Instr. 2000;18(4):495–523.    DOI: 10.1207/S1532690XCI1804_3
  30. Beal CR. Repairing the message: Children’s monitoring and revision skills. Child Dev. 1987;58:401–408.    DOI: 10.2307/1130517
  31. Scardamalia M, Bereiter C. The development of evaluative, diagnostic and remedial capabilities in children’s composing. In: Marlew M, ed. Psychology of written language: A developmental and educational perspective. John Wiley; 1983:67–96.
  32. Daiute C, Kruidenier J. Special issue: The psycholinguistics of writing. Appl Psycholinguist. 1985;6:307–318.
  33. La Paz S, Graham S. Strategy instruction in planning: Effects on the writing performance and behavior of students with learning difficulties. Except Child. 1997;63:167–181.    DOI: 10.1177/001440299706300202
  34. Wong BYL, Butler DL, Ficzere SA, Kuperis S. Teaching low achievers and students with learning disabilities to plan, write, and revise opinion essays. J Learn Disabil. 1996;29:197–212.    DOI: 10.1177/002221949602900209
  35. Wiley KJ, Bradford A, Linn MC. Supporting collaborative curriculum customizations using the knowledge integration framework. In: Lund K, Niccolai GP, Lavoue E, Hmelo-Silver C, Gweon G, Baker M, eds. A Wide Lens: Combining Embodied, Enactive, Extended, and Embedded Learning in Collaborative Settings, 13th International Conference on Computer Supported Collaborative Learning. International Society of the Learning Sciences; 2019;1:480–487.
  36. Puntambekar S, Kolodner JL. Toward implementing distributed scaffolding: Helping students learn science from design. J Res Sci Teach. 2005;42(2):185–217.    DOI: 10.1002/tea.20048
  37. Johnson AM, Butcher KR, Ozogul G, Reisslein M. Learning from abstract and contextualized representations: The effect of verbal guidance. Comput Hum Behav. 2013;29(6):2239–2247.    DOI: 10.1016/j.chb.2013.05.002
  38. Davis EA, Linn M. Scaffolding students’ knowledge integration: prompts for reflection in KIE. Int J Sci Educ. 2000;22(8):819–837.    DOI: 10.1080/095006900412293
  39. Bransford JD, Brown AL, Cocking RR, eds. How people learn: Brain, mind, experience, and school. The National Academies Press; 1999.
  40. Vygotsky LS. Mind in society: The development of higher psychological processes. Harvard University Press; 1978.
  41. Davis EA. Prompting Middle School Science Students for Productive Reflection: Generic and Directed Prompts. J Learn Sci. 2003;12(1):91–142.    DOI: 10.1207/S15327809JLS1201_4
  42. Williams M, DeBarger A, Montgomery B, Zhou X, Tate E. Exploring middle school students’ conceptions of the relationship between genetic inheritance and cell division. Sci Educ. 2012;96(1):78–103.    DOI: 10.1002/sce.20465
  43. Berland LK, Reiser BJ. Classroom communities’ adaptations of the practice of scientific argumentation. Sci Educ. 2011;95(2):191–216.    DOI: 10.1002/sce.20420
  44. Bakeman R, Quera V. Sequential analysis and observational methods for the behavioral sciences. Cambridge University Press; 2011.    DOI: 10.1017/CBO9781139017343
  45. Chen B, Resendes M, Chai CS, Hong HY. Two tales of time: uncovering the significance of sequential patterns among contribution types in knowledge-building discourse. Interact Learn Environ. 2017;25(2):162–175.    DOI: 10.1080/10494820.2016.1276081