A PROPOSAL FOR A STRUCTURAL EQUATION MODEL TO EXPLAIN ACADEMIC PERFORMANCE IN E-LEARNING

Purpose: To present the advances of a research project whose main goal is the construction and empirical validation of a structural equation model to explain students' academic performance in the distance degree programs in Administration of the Universidad de la Defensa Nacional (Argentina). Methodology/Approach: Indicators were selected and adapted for each latent variable identified in the theoretical model. For selection, in some cases we resorted to variables measurable in objective units and in other cases to subjective measurement scales validated by previous research. For adaptation, we considered the characteristics of the population under study and the particular distance education model applied at the University. Findings: A measurement model formulation was obtained which, linked to the causal model proposed as a result of integrating the bibliographic background, allowed us to reach a complete structural equation model specification with six latent variables, five endogenous and one exogenous. Research Limitation/implication: We consider that the selected observed variables are the ones that best combine to identify the hypothesized constructs. Originality/Value of paper: Learning in virtual environments is the main endogenous latent variable of the model, explained by previous knowledge, motivation, digital skills, self-regulation and interaction processes. KEYWORDS: e-learning; knowledge; motivation; digital skills; higher education.
All authors actively participated in the discussion of the article results.


INTRODUCTION
Academic performance, also called academic achievement, is an indicator of educational success or failure. It is generally determined from qualitative and quantitative variables that show whether students, teachers and educational institutions have been effective in their teaching and learning processes. In higher education it is one of the most important quality indicators and a topic of great institutional, economic and social interest (García Tinisaray, 2016).
Several authors highlight the multi-causal nature of academic performance in universities and describe it as a synthetic result of educational processes, especially the learning process, in which the effects of numerous personal, social and institutional variables and their interrelationships converge (Garbanzo, 2007; Gómez Sánchez et al., 2011; Rojas, 2013). Improving students' academic performance requires identifying its causal factors in order to establish the influence and importance of each one (Tejedor, 2003).
In the particular case of contemporary distance education, e-learning is conceived as an active and complex process in which students build their knowledge on the basis of previous knowledge and by interacting with other people in virtual environments. This implies the application of self-sufficiency strategies, the social construction of meanings and an important affective-motivational component, responsible for maintaining and controlling the continuous execution of the tasks and activities required for study (Peñalosa Castro, 2010).
According to the literature, many factors influence students' academic performance in current distance education. In addition to the classic socio-demographic predictors, the following stand out: level of previous knowledge, motivation for study, capacity for self-regulation of learning, digital skills and interaction in virtual environments. According to various learning theories, these variables have no direct effect on academic performance but act indirectly through the learning process. However, so far no investigations have attempted to integrate all these theories and capture the complex, multivariate nature of the phenomenon. One of the greatest difficulties in this task is the probable high degree of dependence among the variables. In addition, many determinants of learning are latent variables, that is, they cannot be observed directly. An alternative in these cases is structural equation modeling (Peñalosa Castro and Castañeda Figueras, 2012).
Structural equation modeling (SEM) is considered an extension of multivariate statistical techniques such as multiple regression and factor analysis (Kahn, 2006). Like simultaneous-equation econometric methods, SEM allows examining a set of dependency relationships in which some variables act as predictor and dependent variable at the same time, but it has some particular characteristics that differentiate it from other techniques. According to Cupani (2012), one of the main differences is the capacity to estimate and evaluate relationships between latent (unobservable) variables. These variables are hypothesized theoretical constructs that can be measured by one or more manifest variables or observable indicators. education where this technique is used. In Argentina there are no applications of this type in the field of distance education; the first steps were taken in Moneta Pizarro et al. (2017) and Moneta Pizarro (2019).
In this paper we present advances of a research project whose general objective is the construction and empirical validation of a structural equation model with predictive capacity to explain students' learning and performance in the distance degree programs in Administration of the Universidad de la Defensa Nacional (UNDEF) in Argentina. Specifically, we present partial results corresponding to the complete specification of the theoretical structural model and its components: a) the structural part, which describes the interrelations between the latent constructs, and b) the measurement model, which represents the relationships between the latent variables and their manifest indicators. The fundamental objective of the measurement model is to confirm the validity of the selected indicators for measuring the constructs. The structural relations model is the part that we really want to estimate. It contains the effects and relationships between the constructs, which are normally latent variables. It is similar to a regression model, but it can include concatenated effects and loops between variables. In addition, it includes prediction errors, which are different from the measurement errors that affect the manifest variables (Ruiz, Pardo and San Martín, 2010).
In the next section we detail the methodology used for research. Then, we present the results and finally the conclusions, followed by the bibliographic references.

MATERIAL AND METHODS
This work is based on an explanatory investigation whose procedure is composed of the following phases:
I. Literature review and general model specification.
II. Construction and content validation of the measuring instruments.
III. Data collection and processing.
IV. Estimation, contrast and evaluation of the structural model properties.
In the first phase we carried out a documentary investigation to identify the background in the available literature and to propose a set of explanatory variables of learning and academic performance for the initial theoretical model. This model is an innovative contribution because it integrates different learning theories into a single causal model and offers a valid formulation for empirical testing. Preliminary results were shared in Moneta .
In the second phase we selected and adapted indicators for each latent variable identified in the previous stage. For selection, in some cases we resorted to variables measurable in objective units and in other cases to subjective measurement scales validated by previous research. For adaptation, we considered the characteristics of the population under study and the particular distance education model applied at UNDEF. For the scales measuring perceptions, attitudes and behaviors, the item bank of each one was subjected to a content validation process by a jury of expert teachers in distance education. As a result of this phase, a measurement model formulation was obtained which, integrated with the causal model proposed in the previous stage, allowed us to reach a complete structural equation model specification. The achievements up to this point are those we present in this work.
In the third phase, currently in execution, we are preparing the final questionnaires, which will be administered online with the help of Google Forms to the population of graduates of the distance undergraduate Administration programs who remain active in the institution as upper-cycle students. The data collected will then be processed with Google Sheets and exported to Stata 14 for statistical treatment in the next stage.
In the fourth and last phase we will proceed to the estimation, contrast and final evaluation by SEM. For the statistical analysis we will use Stata 14, following the guidelines of Acock (2013) and StataCorp (2015).

Structural model
A structural model with six latent variables, five endogenous and one exogenous, was formulated as a result of integrating the bibliographic background. A preview of this model was shared in Moneta Pizarro et al. (2018, June). As can be seen in Figure 1, learning is the main endogenous variable. It depends directly on self-regulation, prior knowledge and interaction, and indirectly on digital competences and motivation, whose effects operate through self-regulation and interaction as mediating variables.
The endogenous variables are self-regulation, digital skills, motivation, interaction and learning. The previous knowledge required for learning new knowledge is the exogenous variable. We assume that digital competencies are influenced by motivation and have direct effects on self-regulation and interaction. Motivation, in addition to being a cause of digital skills, self-regulation and interaction, has a reciprocal causal relationship with interaction.
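The relationships just described can be written down as a small directed graph, which makes it easy to check which variable is exogenous and which effects on learning are direct or indirect. The following is only an illustrative sketch: the identifiers (`EDGES`, `exogenous`, `simple_paths`, `effects_on`) and the variable names are assumptions introduced here, and the edge list is our reading of the text, not an artifact of the original study.

```python
# Directed edges of the structural model as described in the text (Figure 1).
# Variable names are illustrative identifiers, not taken from the paper.
EDGES = {
    "previous_knowledge": ["learning"],
    "motivation": ["digital_skills", "self_regulation", "interaction"],
    "digital_skills": ["self_regulation", "interaction"],
    "self_regulation": ["learning"],
    "interaction": ["learning", "motivation"],  # reciprocal link with motivation
    "learning": [],
}


def exogenous(edges):
    """Return the variables that receive no arrow from any other variable."""
    targets = {t for outs in edges.values() for t in outs}
    return sorted(v for v in edges if v not in targets)


def simple_paths(edges, source, target, _path=None):
    """Enumerate all paths from source to target that visit no variable twice."""
    path = (_path or []) + [source]
    if source == target:
        return [path]
    found = []
    for nxt in edges[source]:
        if nxt not in path:  # skip already visited variables (handles the loop)
            found.extend(simple_paths(edges, nxt, target, path))
    return found


def effects_on(edges, source, target="learning"):
    """Split the paths from source to target into direct and indirect ones."""
    paths = simple_paths(edges, source, target)
    direct = [p for p in paths if len(p) == 2]
    indirect = [p for p in paths if len(p) > 2]
    return direct, indirect


if __name__ == "__main__":
    print("Exogenous:", exogenous(EDGES))
    for var in ("self_regulation", "digital_skills", "motivation"):
        direct, indirect = effects_on(EDGES, var)
        print(var, "| direct:", direct, "| indirect:", indirect)
```

Because motivation and interaction point at each other, the path search ignores variables already visited; otherwise the reciprocal link would produce infinitely many paths.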

Figure 1. Structural model
The direct effect of self-regulation on learning is based on the skills and abilities for autonomous learning that distance education has demanded from students since its origins. Self-regulation refers to the student's ability to make decisions that allow them to control their learning process, aimed at achieving goals in a context with specific conditions (Del Mastro Vecchione, 2005). A self-regulated student has the power to direct, control, regulate and evaluate their way of learning intentionally, voluntarily and consciously. Such a student uses learning strategies that lead to the achievement of the proposed objectives (Moneta Pizarro et al., 2018, June).
In the case of prior knowledge, the statement of Ausubel, Novak and Hanesian (1983, p. 83) is well known: "the most important factor that influences learning is what the student already knows. Find out this and teach it accordingly". Recent research highlights previous knowledge among the factors that have an effect on learning (Barahona, 2014; McArdle, Paskus and Boker, 2013; Shin and Raudenbush, 2011). The constructivist theory of meaningful learning postulates that learning depends on a process of knowledge construction by individuals, or of reconstruction from the social point of view, which implies attributing and giving meaning to the contents. This process does not happen without prior knowledge, since the construction of new meanings is carried out on a previously constructed base (Miras, 1999).
Interaction is included in the model taking into account the contributions of Barberá (2001) and Berridi Ramírez, Martínez Guerrero and García Cabrero (2015), who define it as a set of interconnected reactions between the members participating in the educational environment. It covers both the student's interactivity with information, teaching materials and the environment, and personal interaction, that is, the exchange and negotiation of meanings through dialogical sequences with other students and teachers. Some works distinguish interactivity (relationship with resources and materials) from interaction (interpersonal relationships). In this investigation we use the term interaction, in general, for both types of relationships because the final objective is the same: promoting a "cognitive interaction" (Ruiz Velasco, 2003, p. 17) that triggers learning. Bernard et al. (2009) affirm that the greater the interaction with resources, other students and teachers, the greater the students' learning. This idea is based on Vygotsky's theory of the zone of proximal development and on social constructivism, which argue that learning happens mostly through the different interactions that arise between teachers and students and among students themselves.
In virtual education this interaction requires new digital communication skills and strategies from both teachers and students. According to the European Commission, digital skills are a "set of knowledge, skills, attitudes, strategies and awareness that are required when using ICT and digital media to perform tasks, solve problems, communicate, manage information, collaborate, create and share content, build knowledge in an effective, efficient, adequate, critical, creative, autonomous, flexible, ethical, reflective way for work, leisure, participation, learning, socialization, consumption and empowerment" (Ferrari, 2012, p. 3). Monereo (2005) identifies four major areas: competencies to seek information and learn to learn, skills to learn to communicate, skills to learn to collaborate and skills to learn to participate in public life. The competencies for seeking information and learning to learn are closely related to the capacity for self-regulation and support the effect of digital skills on self-regulation proposed in the model. The remaining competencies clearly favor interaction processes and collaborative learning. Lion (2012) says that digital skills facilitate the development of collaborative work, dialogue and problem solving, and promote lifelong learning. These ideas theoretically support the direct effects of digital competencies on interaction and self-regulation and, indirectly through these variables, on learning.
Regarding motivation, learning conceived as an active and complex process involves the application of self-reliance strategies and, for this, an important affective-motivational component responsible for starting, maintaining and controlling the performance of the tasks required for study. Del Mastro Vecchione (2005) says that learning involves effort and persistence. Holmberg (1985) affirms that intellectual pleasure and motivation to study favor the achievement of learning goals. Monereo and Pozo (2003), cited by Villardón and Yániz (2011), consider that motivation favors students' autonomy in learning. These authors also argue that externally driven motivation must be internalized, so that the main driver of learning and activity is the person himself or herself. The student whose learning goals are oriented to personal progress and task mastery develops greater awareness, control and regulation of the different factors involved in the process (Del Mastro Vecchione, 2005). This justifies the close relationship between motivation and self-regulation proposed in the model.
As can be seen in Figure 1, motivation is also related to other variables of the model in addition to self-regulation. One of these relationships is with interaction, and here a very special, bidirectional relationship is postulated. According to Woolfolk (1996), interactions motivate students and provide feedback on this motivation, favoring learning. This relationship between motivation and interaction is also grounded in Barberá (2001), where the types of interaction include those that favor adequate affective conditions. Motivation can be present at all times in the learning process. It is not a momentary or initial activity; on the contrary, it is a dynamic process in constant transformation (Moneta Pizarro et al., 2018, June).
Finally, the relationship between motivation and digital competences makes sense under the internal approach proposed by the Organization for Economic Cooperation and Development (OECD) in the DeSeCo Project (the acronym of Definition and Selection of Competencies). There, competencies are defined from a double perspective. On the one hand, seen from the outside, a competency is an ability to meet demands (social or individual) or to carry out activities. On the other hand, seen from within, it is a combination of skills (practical and cognitive), values, knowledge, motivations, attitudes and emotions, which make a specific action possible (OECD, 2002).

Number of subjects appealed during the technical degree (CAR).
b) Self-regulation
1. Collaboration strategies (EC).
As can be deduced from the preceding list, the indicators selected for learning correspond to objective variables measured on numerical scales, with the exception of self-perceived learning, which we will attempt to measure with a Likert scale of 1 to 10 points so that it corresponds to the numerical grading scale to which students are accustomed.
For the measurement of the latent variable self-regulation we propose to use the scale of the Autonomous Work Strategies Questionnaire (CETA) of López Aguado (2010), adapted by Moneta Pizarro and Juárez (2018, August). This instrument consists of 23 items grouped into four dimensions: collaboration strategies (8 items), conceptualization and synthesis strategies (8 items), expansion strategies (4 items) and planning strategies (3 items). The answers will be scored on a Likert-type scale ranging from 1 (strongly disagree) to 5 (strongly agree). The Cronbach's alpha values obtained in the study by Moneta Pizarro and Juárez (2018, August) were 0.89, 0.83, 0.79 and 0.67 for the four dimensions, respectively.
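For readers less familiar with the reliability coefficient reported for these subscales, the following minimal sketch computes Cronbach's alpha from a respondents-by-items matrix using the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the toy data are hypothetical and are not taken from the study; the actual analysis is planned in Stata 14.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances; requires k >= 2 items, at least two respondents,
    and a non-zero variance of the total scores.
    """
    k = len(rows[0])
    items = list(zip(*rows))                      # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


if __name__ == "__main__":
    # Hypothetical answers of five respondents to a three-item Likert subscale.
    toy_answers = [
        [4, 5, 4],
        [2, 3, 2],
        [5, 5, 4],
        [3, 3, 3],
        [1, 2, 2],
    ]
    print(round(cronbach_alpha(toy_answers), 3))
```

Values closer to 1 indicate that the items of a subscale vary together, which is what the coefficients quoted above summarize for each dimension.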
For the measurement of the variable referring to digital skills, we use the scale provided by the questionnaire for the study of attitudes, knowledge and use of ICT (ACUTIC) in higher education by Mirete Ruiz, García-Sánchez and Hernández Pina (2015). This scale is composed of 31 items distributed among three dimensions: attitudes towards the use of ICT (7 items), knowledge about ICT (12 items) and use of ICT (12 items). Each of these dimensions is accompanied by a five-point Likert-type scale adjusted to the characteristics of the dimension, where 1 is the lowest level of agreement with the item and 5 the highest. In their reliability analysis of these subscales, Mirete Ruiz, García-Sánchez and Hernández Pina (2015) obtained Cronbach's alpha coefficients of 0.87, 0.85 and 0.76.
Previous knowledge, meanwhile, is to be measured through three indicators. The first, the orientation of the secondary school, is an objective variable of nominal categorical type. It is assumed that students who come from orientations in Economics and Management have a higher level of previous knowledge, or at least one more appropriate for programs in Administration Sciences, than those from other orientations. The second, the final grade average from secondary school, is also an objective variable, but of a quantitative type. The third is a subjective variable given by the general self-perceived performance in the Induction Course. This is a course of leveling studies and induction to distance education, which students must take as a first step to enter the undergraduate programs. The course has no numerical grades, only pass or fail, and all the students in the population under study passed it, since they have already graduated from the technical degree. There is therefore no numerical measure of the knowledge acquired in this previous course. That is why we propose self-perceived performance, measured with a Likert scale of 1 to 10 points, as an approximation.
To measure the latent variable interaction, we will use the Scale of Interaction in Virtual Learning Contexts by Berridi Ramírez, Martínez Guerrero and García Cabrero (2015), adapted by Moneta Pizarro et al. (2017). This scale consists of 30 items organized in three parts, one for each type of interaction defined: tutor-student interaction (12 items), student-environment-study materials interaction (10 items) and student-student interaction (8 items). The Likert statements include five response options on a scale ranging from 1 (almost never) to 5 (almost always). The Cronbach's alpha values obtained in the study by Moneta Pizarro et al. (2017) were 0.94, 0.95 and 0.93 for the three dimensions.
Finally, for the measurement of the motivation variable we will resort to the scale given by the Motivated Strategies for Learning Questionnaire (MSLQ) created by Pintrich et al. (1991) and adapted by Burgos Castillo and Sánchez Abarca (2012). The questionnaire has a total of 81 items divided into two modular scales (a motivation scale and a learning strategies scale), which can be administered independently. According to the requirements of this research, only the modular motivation scale will be used. It consists of 31 items divided into three components (value, expectancy and affect), from which six factors or subscales are derived. The value component comprises the subscales of intrinsic goal orientation (4 items), extrinsic goal orientation (4 items) and task value (6 items). The expectancy component comprises the subscale of control beliefs about learning (4 items) and the self-efficacy subscale (8 items). The affect component includes anxiety (5 items). The answers to each statement are given on a 5-point Likert scale, with 5 being the highest level of agreement and 1 the lowest. Pintrich et al. (1991) obtained Cronbach's alpha coefficients for the subscales that ranged between 0.62 and 0.93. Burgos Castillo and Sánchez Abarca (2012) reported a Cronbach's alpha of 0.84 for the global scale.
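As a quick consistency check on the instruments described above, the item counts reported for each subscale can be summed and compared with the stated totals. The sketch below only restates figures quoted in the text; the dictionary `SCALES` and the function `mismatches` are hypothetical names introduced for illustration.

```python
# Reported total of items and reported items per subscale, as quoted in the text.
SCALES = {
    "CETA (self-regulation)": (23, [8, 8, 4, 3]),
    "ACUTIC (digital skills)": (31, [7, 12, 12]),
    "Interaction scale": (30, [12, 10, 8]),
    "MSLQ motivation scale": (31, [4, 4, 6, 4, 8, 5]),
}


def mismatches(scales):
    """Return the scales whose subscale item counts do not add up to the total."""
    return {
        name: (total, sum(parts))
        for name, (total, parts) in scales.items()
        if total != sum(parts)
    }


if __name__ == "__main__":
    # An empty dictionary means every reported total matches its subscales.
    print(mismatches(SCALES))
```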

Full Model Specification
In Figure 2, motivation, digital skills, interaction, previous knowledge, self-regulation and learning are latent variables, represented by ellipses. Measured variables (indicators) are represented by rectangles. As this figure shows, all indicators are endogenous because they depend on (are predicted by) their respective latent variables. Of the six latent variables, only previous knowledge is exogenous (it is not predicted by any other variable); all the other latent variables depend on some other variable. One of the fundamental assumptions of SEM analysis is that the dependent variables have some variation not explained by the latent variable, which is attributable to measurement error. Therefore, the error variance must be modeled. The error variance of each indicator is specified by an error term, and the error associated with dependent latent variables is represented by the letter . A direct effect is the relationship between a latent variable and its measure (indicator) or between two latent variables, similar to what is observed in multiple regression analysis. This relationship is indicated by a unidirectional arrow (for example, between motivation and digital competencies) that implies directionality between the variables, although it should not be interpreted as causality. An indirect effect is the relationship between an independent latent variable and a dependent latent variable when its effect is mediated by one or more latent variables. In our model, for example, digital skills have an indirect effect on learning, mediated by self-regulation and interaction.
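Written as equations, the specification described in the text could take the following form. This is a sketch under stated assumptions: the LISREL-style symbols (eta for endogenous and xi for exogenous latent variables, beta and gamma for structural coefficients, zeta for structural errors, lambda for loadings, epsilon and delta for measurement errors) are conventional notation chosen here, and the subscript abbreviations (MOT, DS, SR, INT, LRN, PK) are hypothetical labels for motivation, digital skills, self-regulation, interaction, learning and previous knowledge; none of them is taken from the original figures.

```latex
% Structural part: one equation per endogenous latent variable
\begin{align}
\eta_{DS}  &= \beta_{DS,MOT}\,\eta_{MOT} + \zeta_{DS} \\
\eta_{SR}  &= \beta_{SR,MOT}\,\eta_{MOT} + \beta_{SR,DS}\,\eta_{DS} + \zeta_{SR} \\
\eta_{INT} &= \beta_{INT,MOT}\,\eta_{MOT} + \beta_{INT,DS}\,\eta_{DS} + \zeta_{INT} \\
\eta_{MOT} &= \beta_{MOT,INT}\,\eta_{INT} + \zeta_{MOT} \\
\eta_{LRN} &= \beta_{LRN,SR}\,\eta_{SR} + \beta_{LRN,INT}\,\eta_{INT}
              + \gamma_{LRN,PK}\,\xi_{PK} + \zeta_{LRN}
\end{align}

% Measurement part: each indicator loads on a single latent variable
\begin{align}
y_{j} &= \lambda_{j}\,\eta_{k(j)} + \varepsilon_{j}
  && \text{(indicators of the endogenous constructs)} \\
x_{j} &= \lambda_{j}\,\xi_{PK} + \delta_{j}
  && \text{(indicators of previous knowledge)}
\end{align}
```

The third and fourth equations together express the reciprocal relationship between motivation and interaction, and the absence of motivation and digital skills from the learning equation reflects that their effects on learning are only indirect.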
Regarding the identification of the model, that is, the verification that for each parameter there is at least one algebraic expression that states it in terms of the sample variances and covariances, our model has 21 observed variables and therefore 231 known elements in the covariance matrix ((21 x [21 + 1]) / 2 = 231), and we have specified 22 parameters to be estimated. Subtracting the 22 parameters to be estimated from the 231 known elements shows that this model has 209 degrees of freedom. In summary, the more degrees of freedom, the more parsimonious the model. Thus, when a parsimonious model fits the data well, the researcher can show which associations between observed and latent variables are the most important.
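The counting rule used above can be reproduced in a few lines. This is a minimal sketch with hypothetical function names (`known_moments`, `degrees_of_freedom`); it uses only the figures stated in the text. Note that a non-negative number of degrees of freedom is generally regarded as a necessary condition for identification rather than a sufficient one.

```python
def known_moments(n_observed):
    """Distinct variances and covariances among n_observed variables: p(p + 1) / 2."""
    return n_observed * (n_observed + 1) // 2


def degrees_of_freedom(n_observed, n_free_parameters):
    """Known elements of the covariance matrix minus the parameters to be estimated."""
    return known_moments(n_observed) - n_free_parameters


if __name__ == "__main__":
    p, q = 21, 22  # observed variables and free parameters stated in the text
    print(known_moments(p))          # 231
    print(degrees_of_freedom(p, q))  # 209
```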

CONCLUSION
At this stage of our work we focus on the need for a theoretical justification of the model, which is fundamental for the specification of dependency relationships, for modifications of the proposed relationships and for other aspects linked to the estimation of a model. From the literature review, and taking the learning process in virtual environments (e-learning) as the main endogenous latent variable, it was possible to build a model with five constructs that help explain it. In this model, we maintain that learning depends directly on self-regulation, prior knowledge and interaction, and indirectly on digital competencies and motivation, which act through self-regulation and interaction as mediating variables.
The model we propose has between three and five indicators for each latent variable. To corroborate the suitability of the selected indicators in the measurement of the constructs, we use scales validated in other investigations, as detailed above. We consider that the observed variables selected are the ones that best combine to identify the hypothesized constructs.
The next stages of the work consist of data collection through an online survey of the population described above; this survey will include the selected indicators. The data collected will be processed in Google Sheets and exported to Stata 14 for statistical treatment. We will then proceed to the estimation, contrast and final evaluation by structural equation modeling (SEM), which will allow us to corroborate or reject our proposed model of the variables that influence the academic performance of the students who completed the technical degree and are studying the upper cycle of the distance degree programs of the Facultad de Ciencias de la Administración of the Universidad de la Defensa Nacional.