# Which is the best example of the domain-specific nature of self-efficacy

### Participants

Participants were members of the Twins Early Development Study (TEDS), a longitudinal study of twins born in England and Wales between 1994 and 1996. The families in TEDS are representative of the British population in their socio-economic distribution, ethnicity and parental occupation. Informed consent was obtained from the twins prior to each collection wave. See Haworth et al.35 for additional information on the TEDS sample. The TEDS study received ethical approval from the King’s College London Ethics Committee. The present study focuses on data collected in a subsample of TEDS twins over two waves: age 16 and age 18–21.

At **age 16**, TEDS twins contributed data on mathematics ability and achievement (*N* = 3410 pairs, 6820 twins; MZ = 2612; DZ = 4508; 56% females) and mathematics self-efficacy and interest (*N* = 2505 pairs, 5010 twins; MZ = 1954; DZ = 3270; 61.2% females). At **age 18–21**, the twins contributed data on mathematics anxiety (MA) and general anxiety (*N* = 1509 pairs, 3018 twins; MZ = 1172; DZ = 1846; 63.9% females). All individuals with major medical, genetic or neurodevelopmental disorders were excluded from the dataset.

### Measures

#### Mathematics anxiety

A modified version of the Abbreviated Math Anxiety Scale (AMAS)36 was administered to assess MA. The AMAS asks participants to rate how anxious they would feel when facing several mathematics-related situations. The measure includes nine items that are rated on a 5-point scale, ranging from ‘not nervous at all’ to ‘very nervous’. Two items were adapted from the original version to make them age appropriate for the current sample34: ‘Listening to a maths lecture’ and ‘Reading a maths book’. The AMAS showed excellent internal consistency (*α* = 0.94) and test–retest reliability (*r* = 0.85)36.

#### Mathematics attitudes: self-efficacy and interest

Two scales, adapted from the OECD Programme for International Student Assessment, measured mathematics self-efficacy and interest. The **mathematics self-efficacy** scale asked participants: ‘How confident do you feel about having to do the following mathematics tasks?’ The scale included eight items that participants rated on a 4-point scale, from 0 = not at all confident to 3 = very confident. Examples of items are ‘Understanding graphs presented in newspapers’ and ‘Solving an equation like 3x + 5 = 17’. The scale showed good internal consistency (*α* = 0.90). The **mathematics interest** scale included three items that participants rated on a 4-point scale, from 1 = strongly disagree to 4 = strongly agree. The items were: ‘I look forward to my mathematics lessons’; ‘I do mathematics because I enjoy it’; and ‘I am interested in the things I learn in mathematics’. The scale showed good internal consistency (*α* = 0.93).

#### Mathematics performance

General Certificate of Secondary Education (GCSE) grades provided a measure of mathematics exam performance. The GCSE exams are taken nationwide at the end of compulsory education, usually when students are 16 years old. As mathematics is one of the core subjects in the UK educational curriculum, the mathematics GCSE exam is compulsory for all students. Mathematics GCSE scores were collected by questionnaires sent to the twins or their parents by post, via email, or through a phone interview. The GCSE grades, which are given in letters from A* (similar to A+) to G, were re-coded on a scale from 11, corresponding to the highest grade (A*), to 4, corresponding to the lowest pass grade (G). No information about ungraded or unclassified results was available. However, these constitute a small proportion of all results in the UK (e.g. 1.5% of all exams in 2017; https://www.jcq.org.uk/examination-results/gcses/2017/gcse-full-course-results-summer-2017) and are therefore unlikely to bias the current study. For 7,367 twins, self- and parent-reported GCSE results were verified using data obtained from the National Pupil Database (NPD; www.gov.uk/government/uploads/system/uploads/attachment_data/file/251184/SFR40_2013_FINALv2.pdf), yielding correlations of 0.98 for English, 0.99 for mathematics, and >0.95 for all sciences between self- and parent-reported grades and exam results obtained from NPD37.
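The re-coding of letter grades described above can be sketched as a simple lookup. This is a minimal illustration, not the authors' code: the end points (A* = 11, G = 4) come from the text, while the one-point spacing of the grades in between and the helper name `recode_gcse` are assumptions.

```python
# Hypothetical helper: map GCSE letter grades onto the 11-to-4 numeric scale.
# A* = 11 and G = 4 are taken from the text; the evenly spaced values for the
# grades in between are an assumption.
GCSE_LETTERS = ["A*", "A", "B", "C", "D", "E", "F", "G"]
GCSE_POINTS = {letter: 11 - i for i, letter in enumerate(GCSE_LETTERS)}


def recode_gcse(grade):
    """Return the numeric code for a letter grade, or None for anything else.

    Ungraded or unclassified results (e.g. "U") are not on the scale,
    mirroring the fact that no such results were available in the study.
    """
    return GCSE_POINTS.get(grade.strip().upper())


print([recode_gcse(g) for g in ["A*", "c", "G", "U"]])
```

Returning `None` for unrecognised grades keeps missing or unclassified results out of the numeric scale rather than silently assigning them a value.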

An online test battery assessed mathematics performance with three tests: understanding numbers, problem verification and approximate number sense.

The understanding numbers test38 was developed to specifically assess the ability to understand and solve problems which included numbers and was based on the NFER-Nelson Mathematics 5–14 Series, closely linked to the curriculum requirements in the UK. The items included in the measure were taken from the National Foundation for Education Research (NFER) booklets 8 to 14. The test asked participants to solve 18 mathematics problems arranged in ascending level of difficulty. Questions were presented in multiple formats, ranging from equations to problems. Participants were asked either to type a numerical response into a box or to select one or multiple correct responses out of a set of possible options. An example of one of the difficult items is ‘Denise has thought of two numbers. The numbers added together make 23. The smaller number subtracted from twice the larger number makes 22. What are Denise’s numbers?’ with numbers 8 and 15 being correct. Each correct answer was allocated 1 point, resulting in a maximum score of 18. The test showed good reliability in the present sample (α = 0.90).

The problem verification test (PVT)39 presented participants with a series of mathematics equations, each appearing for 10 s on a computer screen. Participants responded to each equation (correct, incorrect, don’t know) by pressing the corresponding keys on the computer keyboard. If they timed out, they were automatically redirected to the following equation. The PVT included 48 items. Examples of items are ‘32 − 16 = 14’ and ‘2/6 = 3/9’. Each correct response was allocated a score of 1 and other responses and non-responses a score of 0, for a maximum score of 48. The test showed good reliability in the current sample (α = 0.85).
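The scoring rule for the PVT can be illustrated with a short sketch. The response labels (`"correct"`, `"incorrect"`, `"dont_know"`, and `None` for a time-out) and the function name `score_pvt` are hypothetical; only the rule itself (1 point for a correct judgement, 0 otherwise) comes from the text.

```python
def score_pvt(responses, answer_key):
    """Total PVT score: 1 point per response that matches the key, else 0.

    responses  -- "correct", "incorrect", "dont_know", or None (timed out)
    answer_key -- "correct" or "incorrect" for each equation shown
    """
    return sum(1 for given, truth in zip(responses, answer_key) if given == truth)


# The two example items: 32 - 16 = 14 is false, 2/6 = 3/9 is true.
answer_key = ["incorrect", "correct"]
print(score_pvt(["incorrect", "dont_know"], answer_key))  # prints 1
```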

The approximate number sense test28 included 150 trials displaying arrays of yellow and blue dots, varying in size. Each trial lasted 400 ms and included a different number of blue and yellow dots presented on the screen. Participants were required to judge whether there were more yellow or blue dots on the screen for each trial (see Tosto et al.40, for additional information on this task). Each correct answer was allocated the score of 1 and the final score was calculated as the number of correct trials. The final accuracy score correlated strongly (*r* = −0.931, *p* < 0.0001) with the alternative score calculated using the Weber fraction41 for which a smaller score indicates better performance.

#### General anxiety

The Generalised Anxiety Disorder Scale (GAD-7)42 assessed general anxiety. The scale includes 7 items asking participants to rate on a scale from 1 = not at all to 4 = nearly every day ‘How often in the past month have you been bothered by the following problems?’ Examples of items are ‘Not being able to control worrying’ and ‘Feeling afraid as if something awful might happen’. As well as measuring generalised anxiety disorder, the GAD-7 has been validated and is considered a reliable measure of anxiety in the general population. The GAD-7 is characterised by good internal consistency (α = 0.89) and test–retest reliability (*r* = 0.64)42.

### Analyses

#### Phenotypic analyses

Descriptive statistics and ANOVAs were conducted on data from one randomly selected twin out of each pair in order to control for sample dependency (i.e. the fact that the children in the study were twins). Measures were residualised for age and sex and standardised prior to analyses.
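The two preprocessing steps above (selecting one twin per pair, then residualising and standardising) might look roughly as follows. This is a sketch under assumptions: it uses ordinary least squares with age and sex as linear covariates, and all function names are hypothetical; the source does not specify the software or the exact regression used.

```python
import random
import statistics


def pick_one_twin(pairs, seed=0):
    """Keep one randomly chosen twin from each (twin1, twin2) pair."""
    rng = random.Random(seed)
    return [rng.choice(pair) for pair in pairs]


def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def residualise_and_standardise(scores, age, sex):
    """Regress scores on age and sex (assumed linear OLS), then z-score residuals."""
    design = [[1.0, a, s] for a, s in zip(age, sex)]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(3)]
           for i in range(3)]
    xty = [sum(row[i] * y for row, y in zip(design, scores)) for i in range(3)]
    beta = solve(xtx, xty)
    resid = [y - sum(b * x for b, x in zip(beta, row))
             for row, y in zip(design, scores)]
    mean, sd = statistics.fmean(resid), statistics.stdev(resid)
    return [(r - mean) / sd for r in resid]
```

After this step the scores have mean 0 and standard deviation 1 and are uncorrelated with age and sex, so neither covariate can inflate twin similarity.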

#### Genetic analyses—the twin method

The twin method allows for the decomposition of individual differences in a trait into genetic and environmental sources of variance by capitalising on the genetic relatedness between monozygotic twins (MZ), who share 100% of their genetic makeup, and dizygotic twins (DZ), who share on average 50% of the genes that differ between individuals. The method is further grounded in the assumption that both types of twins who are raised in the same family share their rearing environments to approximately the same extent43. By comparing how similar MZ and DZ twins are for a given trait (intraclass correlations), it is possible to estimate the relative contribution of genes and environments to variation in that trait. Heritability, the proportion of variance in a trait that can be attributed to genetic variance (A), is intuitively calculated as double the difference between the MZ and DZ twin intraclass correlations44. The ACE model further partitions the variance into shared environment (C), which describes the extent to which twins raised in the same family resemble each other beyond their shared genetic variance, and non-shared environment (E), which describes environmental variance that does not contribute to similarities between twin pairs.
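The intuitive calculation described above can be written out as a few lines of arithmetic. The heritability formula follows the text; the expressions for C and E are the standard companions under the same assumptions (often called Falconer's formulas) and are added here for illustration. The correlations used are hypothetical, and formal analyses estimate these components by model fitting rather than by this shortcut.

```python
def falconer_estimates(r_mz, r_dz):
    """Rough ACE variance components from MZ and DZ intraclass correlations."""
    a = 2 * (r_mz - r_dz)  # additive genetic variance (heritability)
    c = r_mz - a           # shared environment, equal to 2 * r_dz - r_mz
    e = 1 - r_mz           # non-shared environment (includes measurement error)
    return {"A": a, "C": c, "E": e}


# Hypothetical correlations, not values from the study.
print(falconer_estimates(r_mz=0.8, r_dz=0.5))
```

With these illustrative inputs the three components sum to 1, as they must for a standardised trait.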

An alternative to the ACE model is the ADE model, which partitions the variance into additive genetic (A), non-additive (or dominant) genetic (D) and non-shared environmental (E) effects. This model is fitted when intraclass correlations for DZ twins are below 50% of the MZ intraclass correlation, indicating non-additive genetic influences45. While additive genetic factors (A) are the sum of the effects of all alleles at all loci contributing to the variation in a trait or to the co-variation between traits, non-additive genetic effects (D) describe interactions between alleles at the same locus (dominance) and at different loci (epistasis). The classic twin design, comparing MZ and DZ twins, does not allow us to estimate all four sources of influence (A, D, C and E) within one univariate model, as it only includes two coefficients of relatedness46. Therefore, with the classic twin design it is possible to partition the variance into three sources of influence: A, E, and either C or D.

ACE models were fitted for mathematics GCSE, understanding numbers, and the problem verification test. For these measures, intraclass correlations for DZ pairs were more than half of those for MZ pairs, suggesting that environmental factors contributed to the similarity between twins beyond their genetic similarity.

ADE models were fitted for MA, general anxiety, mathematics interest, mathematics self-efficacy, and number sense. For these measures, the DZ intraclass correlation was less than half that of MZ, indicating non-additive genetic effects.
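The rule used to choose between the two models can be stated as a one-line comparison. The function name and the example correlations below are hypothetical, and the treatment of the boundary case (DZ exactly half of MZ) is an assumption, since the text does not address it.

```python
def choose_twin_model(r_mz, r_dz):
    """Pick "ADE" when the DZ correlation is below half the MZ correlation.

    Otherwise pick "ACE". A DZ correlation of exactly half the MZ value is
    assigned to "ACE" here, which is an assumption.
    """
    return "ADE" if r_dz < 0.5 * r_mz else "ACE"


# Hypothetical intraclass correlations, not values from the study.
print(choose_twin_model(r_mz=0.60, r_dz=0.40))  # prints ACE
print(choose_twin_model(r_mz=0.50, r_dz=0.10))  # prints ADE
```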

The twin method can be extended to the exploration of the covariance between two or more traits (multivariate genetic analysis). Multivariate genetic analysis allows for the decomposition of the covariance between multiple traits into genetic and environmental sources of variance, by modelling the cross-twin cross-trait covariances. Cross-twin cross-trait covariances describe the association between two variables across co-twins: twin 1's score on variable 1 is correlated with twin 2's score on variable 2, separately for MZ and DZ twins.
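A cross-twin cross-trait correlation can be computed as in the sketch below, which pairs twin 1's score on the first trait with twin 2's score on the second. The data layout and function names are assumptions made for illustration; in practice the statistic would be computed once for the MZ pairs and once for the DZ pairs.

```python
from statistics import fmean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cross_twin_cross_trait(pairs):
    """Correlate twin 1's trait-1 score with twin 2's trait-2 score.

    pairs -- list of ((twin1_trait1, twin1_trait2), (twin2_trait1, twin2_trait2))
    """
    twin1_trait1 = [pair[0][0] for pair in pairs]
    twin2_trait2 = [pair[1][1] for pair in pairs]
    return pearson(twin1_trait1, twin2_trait2)
```

A larger cross-twin cross-trait correlation in MZ than in DZ pairs suggests that genetic factors contribute to the association between the two traits.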

One way of partitioning the genetic and environmental covariation between two or more traits is to conduct a multivariate Cholesky decomposition. The Cholesky decomposition allows us to examine the overlapping and independent genetic (A), shared (C) (or non-additive D), and non-shared (E) environmental effects on the variance in two or more traits47. A Cholesky decomposition can be interpreted similarly to a hierarchical regression analysis, as the independent contribution of a predictor variable to the dependent variable is estimated after accounting for the variance it shares with other predictors previously entered in the model. The current study applies Cholesky decompositions to the investigation of the genetic and environmental overlap between MA, mathematics motivation and performance.
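The hierarchical logic of the Cholesky decomposition can be illustrated numerically. The sketch below factors a hypothetical 2 × 2 genetic covariance matrix into a lower-triangular matrix of paths; the matrix values are invented for illustration. In a real twin analysis the paths are estimated by fitting a model to the MZ and DZ covariances, not by factoring a known matrix.

```python
import math


def cholesky(cov):
    """Lower-triangular L such that L times its transpose reproduces cov.

    cov must be symmetric and positive definite.
    """
    n = len(cov)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(cov[i][i] - s)
            else:
                lower[i][j] = (cov[i][j] - s) / lower[j][j]
    return lower


# Hypothetical genetic covariance matrix for two traits (not study values).
genetic_cov = [[0.40, 0.20],
               [0.20, 0.50]]
paths = cholesky(genetic_cov)

# Genetic variance in trait 2 shared with trait 1, and the part unique to trait 2.
shared = paths[1][0] ** 2
unique = paths[1][1] ** 2
print(round(shared, 3), round(unique, 3))
```

As in a hierarchical regression, the second trait's genetic variance splits into a part shared with the first trait entered and a residual part that is independent of it, so the order in which traits are entered affects the interpretation.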