Learning from split-attention materials: Effects of teaching physical and mental learning strategies

Abstract
Learners learn more from spatially separated text and pictures after they have been instructed to physically integrate these sources than without such an instruction. We investigated whether instructing learners to mentally integrate textual and pictorial information would yield similar results. Eighty-seven participants studied a picture with accompanying text about an electrical circuit. Text and picture were presented as spatially separated sources or in an integrated format. In the separated format, participants were instructed to use (1) a mental learning strategy, (2) a physical learning strategy, or (3) no learning strategy. Participants in the separated condition using a mental learning strategy and in the integrated condition obtained higher recall and comprehension (but not causal inference) performance than participants in the separated conditions with a physical learning strategy or no learning strategy. This indicates that instructing learners to mentally integrate spatially separated text and pictures when studying split-attention material can be an effective learning strategy.


Introduction
In contemporary education, students are increasingly required to learn from textual and graphical representations. For example, studying the functioning of the heart from a computer screen or textbook typically involves the processing of a textual explanation with accompanying pictures. Research has shown that learning from mutually referring text and pictures is more effective if the text is presented near the corresponding part of the picture than when text and pictures are presented at spatially separated locations. This finding is commonly referred to as the split-attention effect (Ayres & Sweller, 2014; Liu, Lin, Tsai, & Paas, 2012; Pouw, Rop, De Koning, & Paas, 2019), or the spatial contiguity effect (Ginns, 2006; Johnson & Mayer, 2012; Mayer, 1989; Mayer, Steinhoff, Bower, & Mars, 1995). According to Cognitive Load Theory (CLT; Paas, Renkl, & Sweller, 2003), a spatially integrated format is superior to a spatially separated format because learners have to engage less in unnecessary visual search and reorienting processes to integrate text and pictures in working memory (i.e., extraneous cognitive load), and consequently have more working memory capacity available for relevant learning processes, such as schema construction (Sweller, van Merriënboer, & Paas, 2019). However, in practice learners may still frequently encounter instructional materials in which text and pictures are presented in a spatially separated format. Therefore, researchers have recently started to investigate whether it is possible to teach learners a strategy that they can use to actively integrate the text and pictures themselves (e.g., Sithole, Chandler, Abeysekera, & Paas, 2017). This latter approach has become known as "self-management of cognitive load" because it is the learner instead of the instructional designer who adapts the learning material (Roodenrys, Agostinho, Roodenrys, & Chandler, 2012).
The present study contributes to this self-management of cognitive load research by investigating whether teaching a mental learning strategy to integrate spatially separated text and pictures supports learning and how this relates to learning with a physical learning strategy.

Teaching a learning strategy for self-management of cognitive load
There is a small yet increasing number of studies that have investigated the learning effectiveness of teaching learners a strategy to physically integrate spatially separated text and pictures (e.g., Sithole et al., 2017). The key feature characterizing these studies is that learners are taught to manually move a text segment as close as possible to the part of the picture that it corresponds to. Concretely, learners picked up and moved cut-out text segments to the picture (paper-based materials) or selected text segments with the mouse and dragged them to the picture (computer-based materials). Overall, research indicates that teaching learners such a physical learning strategy supports learning. Sithole et al. (2017), for example, showed that university students using such a physical learning strategy obtained higher recall and transfer performance than students studying a split-attention format or an integrated format. Similarly, secondary school students using a physical learning strategy outperformed the split-attention format group on transfer questions (Tindall-Ford, Agostinho, Bokosmaty, Paas, & Chandler, 2015). Although comparable benefits have not been obtained among primary school students, using a physical learning strategy to integrate text and pictures did not hinder learning in this population (Gordon, Tindall-Ford, Agostinho, & Paas, 2016). The benefits of the physical learning strategy are typically attributed to the fact that this strategy supports engagement with the learning task and reduces unnecessary search and reorientation processes during learning.
https://doi.org/10.1016/j.cedpsych.2020.101873
An important characteristic of the physical approach adopted in prior self-managed text-picture integration research is that the instructional materials should allow for moving the text to the picture. However, it is clear that this is impossible in many situations, such as when having to learn from text and picture presented in school textbooks or non-interactive digital learning environments. For these situations it is not yet clear how learners can be taught to deal with split-attention learning materials. One possible way is to have learners mentally integrate the learning materials. The main aim of this study therefore is to extend prior self-management research by investigating whether teaching learners to mentally integrate text and picture can also support learning.
Teaching a mental learning strategy to integrate text and pictures appears a promising approach for supporting learning from spatially separated text and pictures. According to relevant theoretical frameworks, including the cognitive theory of multimedia learning (Mayer, 2014) and the integrated text and picture comprehension model (Schnotz & Bannert, 2003), the active mental integration of textual and pictorial sources into a coherent mental representation is essential for effective learning. Related to this, the Design, Functions, and Tasks (DeFT) framework (Ainsworth, 2006) asserts that engaging in active integration of multiple external representations (e.g., text and pictures) enhances understanding because learners can combine the complementary information from the different information sources when constructing a mental representation. In line with these frameworks, theoretical models in discourse comprehension such as the construction-integration model (Kintsch, 1988, 2013) assert that good comprehenders build a situation model from a text: a coherent mental representation of the state of affairs described in a text that is highly visuo-spatial and imagery-based rather than forming a textual representation of the words in the text itself. Actively engaging in meaning-making activities conducive to building a mental representation, such as drawing inferences, helps to develop an accurate and deeper understanding of the presented information. Particularly the use of strategies that contribute to the construction of a coherent visuo-spatial mental representation (i.e., situation model-focused strategies) supports comprehension (De Koning, Bos, Wassenburg, & van der Schoot, 2017; De Koning & van der Schoot, 2013; McNamara, Ozuru, Best, & O'Reilly, 2007), especially when the material is spatial in nature, as is the case for scientific or technical concepts (Leopold & Leutner, 2012).
Teaching a mental learning strategy to integrate spatially separated information sources to learners is considered such a model-focused strategy as learners receive support in how to use their mental resources, or more specifically their imaginative processing, to actively integrate textual and pictorial representations to understand the presented information. Without such support in relating the different external representations, learners may experience high extraneous cognitive load (Liu, Lin, & Paas, 2013) and construct fragmented and/or incomplete mental representations (Ainsworth, Bibby, & Wood, 2002) due to unsystematic attempts to integrate the presented information.
By now, it is relatively well-established that mentally imagining information supports cognitive processing (e.g., Dunlosky, Rawson, Marsh, Nathan, & Willingham, 2013; Fiorella & Mayer, 2016). In the context of CLT, for example, the 'imagination effect' refers to the finding that imagining a procedure, a series of solution steps, or (to be) performed activities yields higher learning outcomes than just studying the same information (Leahy & Sweller, 2004, 2008). According to CLT, imagining information in working memory assists in the construction and automation of schemas because learners are actively engaging in cognitive processing of the presented material. Learners who rely on imagination can focus working memory resources on rehearsing specific information, connecting information to existing knowledge or identifying and resolving knowledge gaps, whereas learners who simply read and study the instructional material may prematurely decide that they fully understand the material, resulting in less well developed schemas. This means that learners who actively integrate the information in their imagination are more likely to benefit from the complementary information in multiple external representations (Ainsworth, 2006). The process of imagining is assumed to require considerable working memory resources and is most likely to be effective if learners possess some background knowledge of the studied topic (Cooper, Tindall-Ford, Chandler, & Sweller, 2001; Leahy & Sweller, 2004). This means that if learners do not have appropriate schemas of the to-be studied content, encouraging imagination should be combined with strategies aimed at schema construction (Fiorella & Mayer, 2016).
One way to accomplish this is to provide specific instructions about how to imagine the content of instructional materials and why it is relevant, which likely helps learners to develop an accurate representation of the content that they can mentally rehearse through imagination. Leopold and Mayer (2015), for example, showed that instructing participants how to imagine the spatial arrangements of text elements in a scientific text resulted in deeper processing of the text content than when no such specific imagination instructions were given. Similarly, the results reported by Leutner, Leopold, and Sumfleth (2009) demonstrated that adding specific instructions stating that the created mental images should be simple and clear increased text comprehension considerably. Additionally, several studies by Glenberg and colleagues (e.g., Glenberg, Goldberg, & Zhu, 2011) have shown that teaching readers an imagination strategy that (1) required them to imagine that they physically moved toy figures as dictated by the story that was read and (2) explained the benefits of doing this improved text comprehension.
In the context of learning from text and pictures, research has focused on explicit, yet unspecific (i.e., no guidance in how and why to imagine integrating text and pictures), encouragement of active mental integration of textual and pictorial representations. Bodemer, Ploetzner, Feuerlein, and Spada (2004), for example, investigated whether asking learners to actively integrate text and pictures by means of joint physical (i.e., drag and drop) and mental interaction would improve learning from spatially separated text and pictures. Results showed that this form of active integration improved learning over studying a spatially separated text-picture format and an integrated text-picture format. In a related study, Bodemer and Faust (2006) examined prompting of active mental integration separately from physical integration in learning from spatially separated text-picture tasks only. Learners prompted to mentally integrate text and pictures obtained marginally higher learning outcomes than those engaging in physical integration and learners who were not prompted to actively integrate text and pictures (Experiment 1). Together, these prior studies suggest that encouraging active mental integration can help learners integrate text and pictures and improve learning. However, the evidence is indirect (Bodemer et al., 2004), the individual effects of encouraging mental integration were not compared with learning from an integrated text-picture format (Bodemer & Faust, 2006), and the integration instructions were rather unspecific, which might be a reason for the relatively small benefits observed for active text-picture integration.
In the present study, the taught strategy to mentally support text-picture integration involved an instruction aimed at creating an integrated format that was expected to encourage the construction of a mental representation of the content, while specific instructions to imagine moving the text to the part of the picture it refers to were intended to support learners in imagining the corresponding textual and pictorial elements together as one part. Given that asking learners to engage in a mental learning strategy encourages active, constructive processing of the content, this may not only result in higher learning outcomes compared to a split-attention format, but possibly also compared to an integrated format. Studying an integrated format, although providing more optimal opportunities for schema construction than a split-attention format, still may not necessarily induce active cognitive processing in learners. Learners may remain rather passive, only superficially attempt to relate the different external representations, or not be well able to adequately integrate the different representations into a coherent mental representation (cf. Seufert, 2003). Hence, learners (passively) studying an integrated format may construct less accurate or sophisticated schemas than learners who are encouraged to actively engage in schema construction and directly transform the textual and pictorial information in the instructional material into a coherent mental image through imagination. A similar argument could be made when comparing the mental learning strategy to a physical learning strategy: the physical learning strategy explicitly encourages learners to create an integrated format, but this involves a behavioral activity that does not necessarily coincide with appropriate cognitive activities (i.e., active and adequate integration), while such cognitive processes are explicitly encouraged with active imagination (Leahy & Sweller, 2004).
Besides exploring the effectiveness of a mental learning strategy, we extend prior research on the self-management effect in the following way. In the majority of self-management studies studying the physical learning strategy the "pick up-and-move" strategy was combined with other strategies that were aimed at supporting the integration of text and picture such as highlighting text, circling relevant pictorial elements, and drawing lines or arrows between the two information sources. Given that these additional activities have been shown to improve learning of textual and pictorial information sources in and of itself (cf. van Gog, 2014), it is unclear from these self-management studies to what extent the benefits of the physical learning strategy can exactly be traced back to the physical act of moving text to the picture. Two studies, however, have made a first attempt to investigate the 'pure' effect of physically moving text to a picture. Tindall-Ford et al. (2015) asked secondary school students to study from text and picture materials presented on a computer where participants in the physical learning strategy group were taught to use a drag-and-drop functionality to move text to the picture. Results indicated that the physical learning strategy group obtained higher transfer performance than a group who studied the material in a split-attention format. These findings thus suggest that physically moving text to a picture without engaging in additional supportive strategies (e.g., drawing arrows) supports learning. However, a study by Agostinho, Tindall-Ford, and Roodenrys (2013), who used the same approach among university students, did not show benefits, nor impediments, of physically moving text to the picture. 
It is unclear what may have caused these mixed findings, but it is possible that the relatively small number of participants in each of these two studies (n = 16 per condition in Tindall-Ford et al., 2015; n = 11-12 per condition in Agostinho et al., 2013) or the content of the instructional material, which might have been more familiar to participants in the Tindall-Ford et al. (2015) study, may have contributed to this. In the present study, we extended this prior self-management research on the pure effect of teaching a physical learning strategy by applying it to a domain which has thus far been unexplored within this context (i.e., engineering) using a different, larger sample to provide corroborating evidence for the learning effectiveness of the physical learning strategy to support integration of text and picture.

The present study
The main aim of the present study was to investigate whether teaching a mental (vs. physical) learning strategy to integrate spatially separated text-picture materials supports learning. This extends prior research on self-managed text-picture integration, which so far has solely focused on physically integrating text and picture by means of the hands. Besides this, we also aimed to gain more insight into the effectiveness of 'pure' physical integration, that is, the sole effect of moving text to the corresponding element in the picture without engaging in additional supporting activities such as drawing arrows. Another contribution of this study is that we extend research on the self-management effect to the domain of engineering (i.e., operation of an on/off light switching circuit). By testing the self-management effect in other domains than finance (Sithole et al., 2017), mathematics (Tindall-Ford et al., 2015), natural sciences (Gordon et al., 2016), and educational science (Agostinho et al., 2013; Roodenrys et al., 2012), we further tested the generalizability of self-managed text-picture integration across domains.
In a between-subjects experiment, participants studied text-picture materials in one of four formats: (1) split-attention format, (2) integrated format, (3) split-attention format with physical integration instructions to self-manage cognitive load, or (4) split-attention format with mental integration instructions to self-manage cognitive load. Based on prior research on the split-attention effect (Ayres & Sweller, 2014), and more specifically the study by Kalyuga, Chandler, and Sweller (1998) on which our materials were based, we expected a split-attention effect on learning performance (retention, comprehension, causal inference) such that participants studying an integrated format would outperform participants in the split-attention format group (Hypothesis 1). Regarding the mental learning strategy to self-manage cognitive load, based on CLT and prior research indicating the benefits associated with the active, constructive nature of imagination (e.g., Leahy & Sweller, 2008), we expected the mental learning strategy group to obtain higher learning performance than the split-attention group (Hypothesis 2). Moreover, using a similar rationale we anticipated that the mental learning strategy group would obtain higher learning performance than the physical learning strategy group and the integrated format group (Hypothesis 3). For the physical learning strategy, based on CLT and the majority of prior research showing learning benefits of physically moving text to the corresponding location in the picture (e.g., Tindall-Ford et al., 2015), we expected the self-management effect such that the physical learning strategy group would outperform the split-attention format group (Hypothesis 4).

Participants and design
Participants were 87 psychology students (69 females) from Erasmus University Rotterdam with a mean age of 21.41 years (SD = 2.92). They were randomly assigned to one of four between-subjects conditions: (1) split-attention condition, where text and picture were spatially separated (n = 22), (2) physical learning strategy condition: split-attention format with instructions to physically integrate the text in the picture (n = 21), (3) mental learning strategy condition: split-attention format with instructions to mentally integrate the text in the picture (n = 22), and (4) integrated condition, where text was already integrated in the picture (n = 22). Participants provided their consent before the start of the study and received course credits as an appreciation of their participation.

Materials
The learning and practice tasks were presented on the computer and all tests were administered as paper-and-pencil tests.

Prior knowledge test
Prior knowledge about electrical circuits was assessed by asking participants to rate their knowledge of electrical circuits on a 5-point self-rating scale (1 = very little knowledge; 5 = very much knowledge). Furthermore, participants were required to complete a 6-item checklist containing yes/no-statements about electrical circuits (e.g., I know what a starter is, I know what a circuit breaker is, I know what this symbol [symbol of a coil] means). For each 'yes' answer one point was given, for each 'no' answer zero points were given, which resulted in a total checklist score (minimum score: 0; maximum score: 6). The four conditions did not differ with regard to students' perceived prior knowledge (see Table 1).

Learning materials
The learning task, which was based on the materials of Kalyuga et al. (1998), showed a picture and accompanying text that explained the operation of an on/off-light-switching circuit. Given that we aimed to study the effects of physical and mental integration activities in absence of other relational cues (e.g., arrows), we removed the arrows from the Kalyuga et al. (1998) materials. For the integrated condition, we therefore placed some text segments closer to the part of the picture they referred to so that the material could be understood without arrows. The learning task was created (and later presented) with SMART Notebook software (as in Agostinho et al., 2013). For the physical learning strategy condition, a split-attention version (Fig. 1) was created with drag-and-drop functionality through which text segments placed under the picture could be moved (with the mouse) to any location in the picture. Only one text segment could be moved at a time, and it was possible to move earlier placed text to another location as often as participants wished. No feedback was given as to whether participants had moved the text to the correct location in the picture, for the sake of comparability with the split-attention and mental learning strategy conditions, where this was also not done.¹ A version with similar presentation of text and picture but with the text being unmovable was created for the split-attention condition and mental learning strategy condition. For the integrated condition, a version was created in which the text (which could not be moved) was integrated in the picture as close as possible to the part it referred to (Fig. 2). In all conditions, participants were instructed to study the text and diagram to the best of their abilities.
In both learning strategy conditions, participants were additionally instructed to physically drag-and-drop the text to the corresponding part in the picture (physical condition) or to imagine doing so (mental condition). To check whether participants in the mental integration condition followed instructions, they answered one question at the end of the experiment asking them whether they imagined dragging-and-dropping the text to the corresponding location in the picture. Except for two participants, all participants indicated that they had imagined moving the text to the corresponding location of the picture during the learning phase. As the response pattern of the two participants who indicated that they did not consistently imagine moving the text did not differ from the other participants and the same analysis without these participants yielded the same results, these scores were retained in the analyses. Learning time in all conditions was four minutes.
A practice task containing similar interactive/non-interactive, spatially separated/integrated versions was used to prepare participants for the upcoming learning task. This practice task was based on the practice task used by Agostinho et al. (2013) and consisted of a picture of a cat and the word 'tail'. Each participant practiced with the version that matched his or her condition for the actual learning task. For example, participants in the physical learning strategy condition were asked to move the word 'tail' to the part of the picture it referred to while participants in the mental learning strategy condition were asked to imagine moving the word to that location.

Learning outcome measures
Learning outcomes were measured using a recall test, comprehension test, and causal inference test, which were based on Kalyuga et al. (1998). Consistent with Kalyuga et al. (1998), the picture depicting the on/off-light-switching circuit without its textual explanations was available on paper to participants during the comprehension and causal inference test.
The recall test assessed how well participants remembered the components of the on/off-light-switching circuit and their spatial organization within the system. Participants were asked to draw from memory the system they just studied. For each correctly drawn component in the right location and/or connection to another component, one point was awarded. As there were 28 components and connections, the minimum score was 0 and the maximum score was 28 points. Additionally, five labeling questions were used requiring participants to recall the name of a symbol from the system that was shown in a picture. A correctly answered question yielded one point, an incorrect answer was awarded zero points (minimum score: 0, maximum score: 5). For each participant, an overall recall score was calculated by adding the scores for the drawing task and the labeling task, which could vary from 0 to 33 points. Internal consistency of the recall test, reported as Cronbach's alpha, was 0.71.
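The scoring rule described above is simple addition, and can be sketched as follows. This is only an illustration of the stated point ranges; the function name and the example values are hypothetical and do not come from the study's data.

```python
def recall_score(drawing_points: int, labeling_points: int) -> int:
    """Overall recall score: drawing task (0-28) plus labeling task (0-5)."""
    if not 0 <= drawing_points <= 28:
        raise ValueError("drawing score must be between 0 and 28")
    if not 0 <= labeling_points <= 5:
        raise ValueError("labeling score must be between 0 and 5")
    return drawing_points + labeling_points


# Hypothetical participant: 19 correctly drawn components/connections
# and 4 correctly labeled symbols (illustrative values only).
example_total = recall_score(19, 4)
```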
The comprehension test contained 11 open-ended questions addressing the operation of the on/off-light-switching circuit and its functions. The questions typically asked for an explanation of isolated elements in the on/off-light-switching circuit and required no causal interpretations. Examples of questions are: "Which switches are pressed when the light is operating?" and "How can the operation of the light be ceased?". Each question had one correct answer which yielded one point. No points were awarded for incorrect answers. For each participant, the total number of points (minimum score: 0, maximum score: 11) represented the comprehension score. Internal consistency of the comprehension test, reported as Cronbach's alpha, was 0.54, which might be due to the relatively low number of questions and participants, and the many different aspects of the on/off-light-switching circuit that the comprehension questions covered (Schmitt, 1996; Yang & Green, 2011).
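For readers less familiar with the reliability statistic reported here, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k - 1) * (1 - sum of item variances / variance of total scores). The item-score matrix is hypothetical; the study's item-level data are not reported, and sample variances are used here as an assumption.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of participants, each a list of item scores."""
    k = len(scores[0])  # number of items
    # Sample variance of each item across participants.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the participants' total scores.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 0/1 item scores for five participants on four questions.
hypothetical_scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(hypothetical_scores)
```

Because alpha depends on both the number of items and how strongly they covary, short tests whose questions tap different aspects of a topic tend to yield lower values.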
Table 1. Mean scores and SDs (in brackets) on the prior knowledge self-rating and checklist for each condition.
The causal inference test contained six open-ended questions which required participants to make causal inferences and reason about the on/off-light-switching circuit. These questions typically required participants to combine information from different elements in the on/off-light-switching circuit. Examples of causal inference questions are: "After the start button is released, the bell and light stop working. What is the cause of this problem?", and "After the stop button is released, the bell and the light start working again. What is the cause of this problem?". For each correctly answered question, one point was awarded (no points were given for incorrect answers), which resulted in a causal inference score between zero and six for each participant. Internal consistency of the causal inference test, reported as Cronbach's alpha, was 0.21. Such a low Cronbach's alpha can be expected given the limited number of causal inference questions, the relatively small sample size, and the heterogeneous nature of the questions that measured understanding of the topic on multiple dimensions (Schmitt, 1996; Yang & Green, 2011). A coder who was blind to experimental condition scored all answers on the recall, comprehension, and causal inference test. A second coder scored a randomly selected subset (30%) of these answers. Agreement (Cohen's kappa) between the coders was acceptable for the recall (κ = 0.574) and comprehension test (κ = 0.602), and good for the causal inference test (κ = 0.871), and therefore the scores of the first coder were used in the analyses. Both coders used the same coding scheme (Kalyuga et al., 1998) which was explained and discussed with them before they engaged in coding the tests.
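As an illustration of the inter-rater agreement statistic reported above, Cohen's kappa corrects the observed proportion of agreement for the agreement expected by chance: κ = (p_o - p_e) / (1 - p_e). The sketch below uses hypothetical correct/incorrect codes from two coders; it is not the study's coding data, and it assumes the coders do not both assign a single category to every answer (which would make p_e equal to 1).

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' categorical codes of the same answers."""
    n = len(coder_a)
    # Observed proportion of answers on which the coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement from each coder's marginal category frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    categories = set(coder_a) | set(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n**2
    return (observed - expected) / (1 - expected)


# Hypothetical codes (1 = correct, 0 = incorrect) for four answers.
kappa = cohens_kappa([1, 1, 0, 0], [1, 1, 0, 1])
```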

Cognitive load
The perceived cognitive load during learning and testing was assessed with the self-rating instrument developed by Paas (1992). Participants had to indicate on a 9-point rating scale (1 = very, very little; 9 = very, very much) the cognitive load they invested during the just-completed task. This self-rating instrument has been shown to provide a reliable and valid assessment of the cognitive load someone experiences during a task (Paas, Tuovinen, Tabbers, & van Gerven, 2003).

Procedure
Participants were tested individually in a single session in the university lab. After being welcomed they were seated at a desk in front of a computer screen. Participants first completed the prior knowledge test. Subsequently, the experimenter provided instructions stating that participants had to study the functioning of an on/off-light-switching circuit from a picture and accompanying text to the best of their abilities as they would be tested on this later on. Participants in the physical learning strategy condition were additionally instructed to use the mouse to drag-and-drop text segments to the corresponding location in the picture, while participants in the mental learning strategy condition were instructed to imagine doing so. Then, in all conditions participants engaged in the practice task to get used to the way they were supposed to learn during the actual learning task. The experimenter monitored this practice task and provided additional explanation if needed. Next, participants completed the learning task in the version commensurate with the condition to which they were assigned. Directly after the learning task had been completed, participants indicated the cognitive load experienced during learning. They then completed the recall test, comprehension test, and causal inference test. There was no time limit to answer these questions and therefore time to complete the tests was recorded by the experimenter. After each of these tests, participants indicated the cognitive load they experienced in completing that test. Finally, participants were thanked and debriefed. The entire experiment lasted about 45 min.

Learning outcomes
The mean scores and standard deviations for the scores on the recall test, comprehension test, and causal inference test (separately for each condition) are displayed in Table 2. A Multivariate Analysis of Variance (MANOVA) with Condition as between-subjects factor was conducted on these scores. Time to complete the tests did not differ significantly between conditions [Wilks' Λ = 0.817; F(3, 87) = 1.903, p = 0.053, η_p² = 0.065] and was therefore not added in this analysis (nor in the analysis of the cognitive load scores presented at "3.2. Cognitive load"). Reported pairwise comparisons were based on Tukey post-hoc tests. The MANOVA results regarding the learning outcomes indicated that there was a significant medium to large difference (η_p² ≈ 0.01 is considered a small effect, ≈ 0.06 a medium effect, and ≈ 0.14 a large effect; Stevens, 2009) [...] Cohen, 1988) on the recall test: participants in the integrated condition had a higher recall score than participants in the split-attention condition (p = 0.003, d = 1.11). Similarly, on the comprehension test a large effect (d = 0.81) was observed showing that the integrated condition outperformed the split-attention condition, although this difference only approached statistical significance (p = 0.056). These effect sizes are equal to, or bigger than, the median effect size for the split-attention effect reported in two meta-analyses, as Ginns (2006) reported a median effect size of d = 0.72, and more recently Schroeder and Cenkci (2018) reported a median effect size of g = 0.63. Regarding the effects of self-management, the mental learning strategy condition obtained significantly higher scores on the recall test (p = 0.027, d = 0.77; medium to large effect) and the comprehension test (p = 0.011, d = 0.94; large effect) than the split-attention condition.
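The pairwise effect sizes above are reported as Cohen's d. As a hedged illustration of how such a value is commonly obtained, the sketch below computes d from two group means and standard deviations using the pooled standard deviation. Whether the pooled-SD variant was used in this study is an assumption, and the means and SDs in the example are hypothetical (Table 2 values are not reproduced here); only the group sizes of 22 match the reported design.

```python
from math import sqrt


def cohens_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    pooled_sd = sqrt(
        ((n_1 - 1) * sd_1**2 + (n_2 - 1) * sd_2**2) / (n_1 + n_2 - 2)
    )
    return (mean_1 - mean_2) / pooled_sd


# Hypothetical means and SDs for two conditions of 22 participants each.
d_example = cohens_d(10.0, 2.0, 22, 8.0, 2.0, 22)
```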
As this study was the first to study a mental learning strategy for self-management of cognitive load, we cannot relate these effect sizes to previous literature, although Leopold and Mayer (2015) reported comparable effect sizes for the imagination effect (Experiment 1: d = 1.30; Experiment 2: d = 0.86). Moreover, the mental learning strategy condition had a significantly higher comprehension score than the physical learning strategy condition (p = 0.004, d = 1.02; large effect). Also, the integrated condition obtained a significantly higher score on the comprehension test than the physical learning strategy condition (p = 0.024, d = 0.90; large effect). There were no other significant results regarding recall and comprehension performance (ps between 0.081 and 0.930).

Cognitive load
Table 3 shows the means and standard deviations for the perceived cognitive load ratings gathered after the learning task, recall, comprehension, and causal inference tests. A MANOVA on these ratings with Condition as between-subjects factor showed that the conditions did not significantly differ in the cognitive load they experienced, F(14, 87) = 0.66, p = 0.789; Wilks' Λ = 0.908, ηp² = 0.032.
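As a rough check on the two multivariate effect sizes reported in this section, partial eta squared can be recovered from Wilks' Λ with the commonly used conversion ηp² = 1 − Λ^(1/s), where s is the smaller of the number of dependent variables and the effect degrees of freedom. The sketch below is illustrative only and is not the authors' analysis script: the function names are hypothetical, and s = 3 is an assumption inferred from the design (four conditions, hence three effect degrees of freedom, combined with three test-time scores or four load ratings). The labelling function treats the benchmarks attributed to Stevens (2009) as lower bounds, which is also an assumption.

```python
def partial_eta_squared(wilks_lambda: float, n_dvs: int, df_effect: int) -> float:
    """Convert Wilks' lambda to multivariate partial eta squared.

    Uses eta_p^2 = 1 - lambda ** (1 / s), with s = min(n_dvs, df_effect).
    """
    if not 0.0 < wilks_lambda <= 1.0:
        raise ValueError("Wilks' lambda must lie in the interval (0, 1].")
    if n_dvs < 1 or df_effect < 1:
        raise ValueError("n_dvs and df_effect must be positive integers.")
    s = min(n_dvs, df_effect)
    return 1.0 - wilks_lambda ** (1.0 / s)


def label_effect(eta_p2: float) -> str:
    """Label an effect with the ~0.01 / ~0.06 / ~0.14 benchmarks.

    Treating the benchmarks as lower bounds is an assumption of this sketch.
    """
    if eta_p2 >= 0.14:
        return "large"
    if eta_p2 >= 0.06:
        return "medium"
    if eta_p2 >= 0.01:
        return "small"
    return "negligible"


if __name__ == "__main__":
    # Assumed design values: 3 effect df; 3 time scores or 4 load ratings.
    for name, lam, n_dvs in [("test time", 0.817, 3), ("cognitive load", 0.908, 4)]:
        eta = partial_eta_squared(lam, n_dvs=n_dvs, df_effect=3)
        print(f"{name}: eta_p^2 = {eta:.3f} ({label_effect(eta)})")
```

Under these assumptions the conversion agrees with the reported values of 0.065 and 0.032 to three decimals, which suggests the reported ηp² values were derived from Wilks' Λ in this way.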

Discussion
The present study aimed to extend research on the self-management effect (see e.g., Roodenrys et al., 2012) when learning from spatially separated, mutually referring text and pictures. The self-management effect is a recent development in CLT research focusing on equipping learners with a strategy to manage the cognitive load induced by sub-optimally designed instructional materials, such as text and pictures presented in a split-attention format, and thereby to improve learning. As expected (Hypothesis 1), we replicated the split-attention effect (Chandler & Sweller, 1991;Ginns, 2006;Liu et al., 2012), indicating that presenting learners with spatially integrated textual and pictorial information yielded higher learning outcomes than presenting the same information in a non-integrated, spatially separated format, particularly for recall of information and to a lesser extent for comprehension (cf. Roodenrys et al., 2012;Sithole et al., 2017). No effects were found on causal inference, which diverges from findings reported by Kalyuga et al. (1998, 1999), who used similar materials and obtained comparable mean scores. However, their study was done with a different sample (company trainees instead of university students), which most likely contributes to the diverging findings. When interpreting the results regarding the self-management effect, it is thus important to keep in mind that findings are restricted to recall and comprehension performance. In addition, caution should be taken when generalizing the findings to the broader population, because of the modest sample size in each condition and the rather low reliability of the comprehension and causal inference tests. It is possible that the low reliability of the tests attenuated the reported performance differences. 
Given that we obtained meaningful effect sizes for these performance differences, more reliable measures might have resulted in larger effects or could have uncovered more subtle differences between conditions.
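The attenuation argument can be made concrete with the classical correction for unreliability in an outcome measure: measurement error inflates the observed standard deviation, so an observed standardized mean difference is shrunk by the square root of the test's reliability, giving d_true ≈ d_observed / sqrt(r_yy). The following sketch is illustrative only. The function name is hypothetical, and the reliability values in the loop are placeholders, not the reliability coefficients of the tests used in this study. The observed d = 0.81 is the comprehension difference between the integrated and split-attention conditions reported in the results.

```python
import math


def disattenuate_d(d_observed: float, reliability: float) -> float:
    """Correct a standardized mean difference for outcome unreliability.

    Applies d_true = d_observed / sqrt(reliability), where reliability is
    the reliability coefficient of the outcome measure (0 < r <= 1).
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must lie in the interval (0, 1].")
    return d_observed / math.sqrt(reliability)


if __name__ == "__main__":
    d_comprehension = 0.81  # observed effect reported in the results
    # Placeholder reliabilities, chosen only to show the direction of the correction.
    for r in (1.0, 0.8, 0.6):
        print(f"reliability = {r:.1f} -> corrected d = {disattenuate_d(d_comprehension, r):.2f}")
```

The correction is monotonic: the lower the reliability, the larger the corrected effect. This matches the suggestion that more reliable measures might have yielded larger effects, although a corrected value is an estimate and not a substitute for better measurement.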

A mental learning strategy supports learning
The major contribution of our study is a comparison between students' self-management of split-attention materials through mental and physical learning strategies. Whereas previous studies have shown that teaching learners to self-manage split-attention materials through physically integrating text and pictures has a large positive impact on their learning (e.g., Sithole et al., 2017), this strategy can only be used in learning environments in which learners can physically interact with the to-be-learned textual and pictorial information. The findings of the present study show that teaching a mental learning strategy can have a comparable positive effect on learning in situations where physical integration is not possible. In line with our expectations (Hypothesis 2), learners instructed to self-manage split-attention materials through mental integration obtained higher recall and comprehension scores than learners studying the same materials without any instruction. We refer to this finding as the 'mental self-managed integration effect'. Moreover, as predicted in our third hypothesis, using a mental learning strategy was more effective for comprehension than using a physical learning strategy. The latter finding is a noteworthy contribution to the self-management effect literature, as it suggests that what matters is not only whether learners are taught a strategy to self-manage cognitive load, but also what kind of learning strategy they engage in and/or which cognitive activities that strategy elicits.
It should be noted that these effects did not show up on the causal inference questions. For these questions, participants had to move beyond remembering the name and location (i.e., recall test) and the function (i.e., comprehension) of a single element in the depicted system (on-off-light-switching circuit). That is, they had to combine knowledge about the different elements to reason about and troubleshoot the system, which is a more demanding activity than keeping in mind specific information about a single element. It is possible that the mental learning strategy facilitated recall and comprehension of the configuration and understanding of the main elements of the system, but left insufficient working memory resources to build a coherent mental representation of all, or most, interconnections and relations between the elements. More specifically, learners could build a representation of the basic elements of the system, but there was no, or insufficient, generative processing that is necessary to obtain higher performance on the causal inference test. Future research is needed to further investigate whether this is a viable explanation.
Our interpretation of the superior performance of the mental learning strategy is that it encourages learners to actively integrate text-picture information in mind (cf. Leahy & Sweller, 2004), which helps learners to progressively build an imagined end-product (i.e., integrated format). In terms of CLT, the mental learning strategy encourages learners to dedicate their working memory resources to cognitive processes that directly contribute to schema construction (germane cognitive load; Sweller et al., 2019). Under this view, learners actively attempted to construct a coherent mental representation of the content because the mental integration instruction elicited stronger cognitive engagement with the task. We are aware that this active and meaningful processing may not have been due solely to the strategy that was taught (i.e., imagine moving text to the picture) but may also be the result of additional engagement in other meaning-making strategies that might have been primed by the mental integration instruction. It is possible that the mental integration instruction has created better opportunities for this by reducing unnecessary cognitive load. Even though matching corresponding textual and pictorial information is still necessary, this is probably less of a problem because the guidance provided by the taught learning strategy reduced unnecessary visual search processes (i.e., extraneous cognitive load), which likely saves working memory capacity for constructing and refining a coherent mental representation. That is, the invested effort more likely contributes to generative or germane cognitive processing (Mayer, 2014;Schnotz, 2014). In the present study, this is reflected in the learning outcomes, which are highest for the mental integration condition and the integrated condition.

A physical learning strategy does not benefit learning
An unexpected finding was that, in contrast to our prediction (Hypothesis 4), self-managing split-attention materials with the physical learning strategy did not lead to higher recall, comprehension, and causal inference performance than learning from split-attention materials without instruction. These findings diverge from most of the prior studies showing a strong learning advantage of using the physical learning strategy (e.g., Sithole et al., 2017), but in these studies the physical learning strategy was combined with additional instructional strategies (e.g., highlighting). The results of studies investigating the 'pure' effects of the physical learning strategy were thus far inconclusive, with one study showing learning benefits of the physical strategy (Tindall-Ford et al., 2015) while another study did not (Agostinho et al., 2013). The present study also suggests that solely relying on the physical learning strategy is insufficient to raise learning with a split-attention format to a level at which learners outperform those who study a split-attention format without being taught a learning strategy. The present study shows that this is the case for university students (cf. Agostinho et al., 2013), and extends prior work by showing that this finding is not restricted to a specific domain: while Agostinho et al. (2013) used a task about educational technology, we obtained comparable findings with a task about engineering. The present study thus provides more evidence for the hypothesis that using attention-directing cues (such as highlighting, cf. De Koning, Tabbers, Rikers, & Paas, 2009) is crucial for a physical learning strategy to be effective, which offers an explanation for earlier discrepancies in the literature. Such cues might be effective as learners are required both to identify the task-relevant information in the picture to appropriately place the text segments and to make physical movements. 
There is a substantial amount of research indicating that for novices in a domain each of these two tasks increases working memory demands and lowers task performance (Gegenfurtner, Lehtinen, & Säljö, 2011;Skulmowski, Pradel, Kühnert, Brunnett, & Rey, 2016). Having to perform both tasks simultaneously likely places a heavy burden on learners' working memory and hinders learning performance.
The combined results of this and the previous studies can provide some information on the conditions under which the physical learning strategy is effective for learning, but more research is needed, for example by directly comparing a 'pure' physical learning strategy with one that is 'enriched' with additional instructional strategies and/or varying the type and number of additional instructional strategies. Another potential direction for future research is to investigate the extent to which effects of physical manipulation can be found on delayed tests and transfer to a new set of split-attention materials (cf. Roodenrys et al., 2012). According to both theoretical and empirical work on generative learning strategies (Fiorella & Mayer, 2016), higher learning outcomes are typically obtained on delayed rather than immediate tests and on measures of meaningful learning that require higher-order thinking, such as transfer tests. Additionally, it could also be useful to look further into potential age effects, given that prior research with secondary school students did find effects of a physical learning strategy to self-manage cognitive load on learning (Tindall-Ford et al., 2015), while other studies involving adults, including the present study, did not consistently do so (e.g., Agostinho et al., 2013).
Another interesting finding from our study is that learners studying the integrated format had higher comprehension scores than learners using the physical learning strategy. In other words, creating an integrated format yourself by dragging and dropping text to the picture was less effective than studying an integrated format that was presented to the learner. This is in line with the results of Roodenrys et al. (2012), who suggested that lower performance when using the physical learning strategy is possibly due to the extra demands the strategy places on learners. That is, instructing learners to physically integrate text in the picture and learn the material means that they have to split their efforts continuously between physical actions to manipulate the text-picture materials and the cognitive activities needed to actually learn the content. Such an interpretation fits an emerging body of research showing that being instructed to move your hand during learning can have detrimental effects on the learning process and performance (see e.g., Castro-Alonso, Ayres, Wong, & Paas, 2018;De Koning & Tabbers, 2013). Yet, while it is important to be aware of such potential adverse effects of the physical learning strategy, it should be noted that most of the prior studies investigating the physical learning strategy showed performance comparable to that obtained when presenting an integrated format to learners (e.g., Gordon et al., 2016;Tindall-Ford et al., 2015), and one study even found that the physical learning strategy outperformed the integrated format (Sithole et al., 2017). When and why the physical learning strategy does not consistently yield comparable performance to the provision of an integrated format could be a direction for future research.

Differences between the mental and physical learning strategies
In the physical learning strategy condition, learners were only taught to move text segments to the corresponding location in the picture and, as in the mental integration condition, received an explanation of why integrating the two information sources was useful. In other words, as in Tindall-Ford et al. (2015) and Agostinho et al. (2013), we purposefully did not include additional instructional strategies known to support learning such as highlighting (e.g., van Gog, 2014) to be able to investigate the 'pure' effect of the physical learning strategy. This means that learners were required to be behaviorally active, as they had to physically move the text to the picture, but were not explicitly stimulated to strongly engage with the task at the cognitive level, that is, to actively construct an accurate mental representation of the content. They could therefore have performed the task by simply moving text segments to the picture without engaging, or while only insufficiently engaging, in deeper cognitive processing. That is, they might have devoted too few working memory resources to schema construction or generative activities (i.e., germane cognitive load). Additionally, learners likely experienced extraneous cognitive load arising from coordinating the physical movements with the mouse to integrate the two information sources and keeping track of the resulting outcome of the movement on the screen. Together, this may explain the lower learning outcomes in the physical integration condition. Such an interpretation fits previous research showing that learning benefits of the physical learning strategy are particularly observed in studies where the physical strategy was combined with additional instructional strategies that encourage learners to cognitively engage with the learning task (e.g., Gordon et al., 2016;Roodenrys et al., 2012). 
It should be noted that further research on the comparison between the mental and physical learning strategies is needed to reach more definite conclusions regarding the relative effectiveness of both strategies. Of critical importance would be to examine the actual cognitive processes as well as the mental and physical strategies that learners engage in during learning (e.g., to investigate with think-aloud whether learners only used the instructed strategy or also engaged in other generative strategies). For this, it could be worthwhile to consider existing approaches, such as the three-pronged approach (Magliano & Graesser, 1991), that allow for systematic and thorough investigation to guide these efforts.
Another factor that could be considered in this regard is the level of interactivity of the task. In our study, learning with the physical learning strategy implied that learners could physically interact with the text-picture information by moving text segments, while learners in the mental learning strategy condition had no interaction possibilities. It would be interesting to investigate to what extent learners' preferences and/or learning outcomes favor the mental learning strategy over the physical learning strategy when learners are confronted with a split-attention task in which it is also possible to physically interact with the learning materials.
Another future direction could be to investigate the extent to which textual and/or pictorial features such as text length influence the effects of learning strategies for self-management of cognitive load (cf. Schüler, Arndt, & Scheiter, 2019). Prior text-picture learning research has for example shown that presenting visual pictorial information together with short narrated text segments benefits learning over a visual presentation of text and picture (i.e., modality effect), but with longer text segments this modality effect disappeared (e.g., Leahy & Sweller, 2011). It could be argued that longer texts may also be less effective when using a learning strategy for self-managing cognitive load when studying split-attention materials, particularly when engaging in the mental integration of text and picture. This is because learners have to keep more textual information active in working memory, which may create too high extraneous cognitive load leaving little processing capacity for constructing an accurate mental representation of the content. It should be noted that this is under the assumption that the larger amount of information has to be processed within a restricted time span.
Alternatively, it would also be useful to investigate how learning is affected by mental and physical learning strategies if the content of the learning task is kept equal while learners are given (system-paced instructions) or take (learner-paced instructions) more time to complete the learning task. It is conceivable that if learners have more time for completing a learning task, benefits of the mental learning strategy over the physical learning strategy decrease or disappear. A longer learning time requires learners to keep the textual and pictorial information active in working memory for a longer period, which likely increases extraneous working memory load (Barrouillet & Camos, 2012;Puma, Matton, Paubel, & Tricot, 2018). Consistent with this, a study by Bodemer and Faust (2006) suggests that prompting mental and physical integration of spatially separated text and pictures during a 20-minute learning session was not more effective for learning than studying the same materials without such prompting. It is therefore important to realize that the findings of the current study were found with a relatively short learning task, short-term recall of information, and a fixed time for studying. To what extent our findings hold for longer learning tasks needs to be addressed in future work.

Educational implications
Based on our findings, it is possible to tentatively draw a number of educational implications. These implications may provide helpful direction for instructors and instructional designers to support learning from mutually referring, spatially separated textual and pictorial sources, as well as make learners aware of possibilities to self-manage their learning from such instructional materials. The implications are applicable to instructional materials using a combination of text and pictures, particularly in computer-assisted learning. In computer-assisted learning, text and pictures are frequently presented in a spatially separated format and learners often have no or limited possibilities to interact with the instructional materials or a teacher. A note of caution is that the implications may vary depending on task characteristics (e.g., task length, task complexity) and learner characteristics (e.g., level of prior knowledge, learning goal), so the following implications should be considered with this in mind. Firstly, the chance that learning outcomes improve increases if an instruction is added to actively mentally integrate spatially separated text and picture. Secondly, learning from spatially separated text and pictures with a mental integration instruction can yield learning outcomes comparable to those of learning the same materials in an integrated format, so both approaches could be used to improve learning when confronted with split-attention examples. Thirdly, when aiming for active integration of text and pictures, engaging in mental integration seems more useful than relying on physical integration. Together, these implications likely increase effective learning from text-picture instructions and save valuable time and effort associated with redesigning the instructions.

Conclusion
The present study advances research on identifying effective ways to empower learners to manage their cognitive load to effectively learn from instructional materials containing spatially separated text and pictures. Our findings suggest that learning from spatially separated text and pictures can be improved by teaching learners a strategy to mentally integrate text and picture themselves. The learning outcomes were comparable to presenting learners with a spatially integrated format, indicating that the mental learning strategy provides a useful alternative for learning from split-attention materials when a spatially integrated format is not available. The results regarding the effectiveness of the physical learning strategy (in relation to the mental learning strategy and integrated format) were less promising in this study, and combined with prior work on this strategy, we suggest that this strategy is most likely to improve learning if it is combined with other instructional strategies. More research with larger numbers of participants is warranted to further develop this new area of self-management research and to reliably identify which types of learning strategies for self-management of cognitive load are most conducive to learning, and under which conditions.