Replications of previous scientific work are at the core of the Open Scholarship movement. However, as replication efforts become more widespread, it can be challenging for scholars and educators to keep up to date with which effects in their field replicate and which do not. FORRT’s Replications and Reversals project aims to collate replications and, specifically, so-called reversal effects in social science. Reversals are, in the context of a replication, effects whose original direction is flipped. The extent of such reversals and non-replicated effects is already apparent in the social science literature, with even replicated effects being only about half the size of the originally reported effect (Ioannidis, 2005; Open Science Collaboration, 2015). Although such failures to replicate are far less costly to society than, for example, medical ones (Prasad & Cifu, 2011), they broadly hinder science’s goal of accumulating knowledge and contribute to the waste of scarce resources. This resource aims to be a “living”, freely available, crowd-sourced, and community-driven collection of effects that have either not been replicated or have even been reversed through empirical research across the social sciences. Scholars from varied backgrounds and areas of social science are invited to contribute prevalent effects in their respective fields.
The purpose of collating these reversal effects in social science is to encourage educators to incorporate replications of these effects into their students’ projects (e.g., third-year projects, theses, course work). Doing so gives students the opportunity to experience the research process directly, lets educators assess students’ ability to perform and report scientific research, and helps evaluate the robustness of the original study, thereby also helping students become good consumers of research. The crowdsourced and community-curated resource below aims to satisfy three of FORRT’s Goals:
This is a dynamic project that is organized in five stages. Currently, we are in stage 3:
The project is currently closed for submissions while we convert our resource into a database and merge it with other replication databases. In the meantime, please contribute new entries to the FORRT Replication Database (FReD) and contact Lukas Röseler at lukas.roeseler@uni-muenster.de with any questions.
Elderly priming. Hearing about old age makes people walk slower.
Statistics
- Status: reversed
- Original paper: ‘Automaticity of social behaviour’, Bargh 1996; 2 experiments with Study 2a: n = 30, Study 2b: n = 30. [citations = 5938 (GS, October 2021)].
- Critiques:
Doyen 2012 [n=120, citations=757(GS, October 2021)].
Lakens 2017 [meta analysis: citations = 21(GS, October 2021)].
Pashler et al. 2011 [n=66, citations=21(GS, October 2021)].
- Original effect size: not reported.
- Replication effect size: Doyen: walking speed: η2 = .01/d = 0.10 [calculated, using this conversion]. Lakens: r = .29/d = .61. Pashler: not reported.
Hostility priming (unscrambled sentences). Exposing participants to more hostility-related stimuli caused them subsequently to interpret ambiguous behaviours as more hostile.
Statistics
- Status: not replicated
- Original paper: ‘The role of category accessibility in the interpretation of information about persons: Some determinants and implications’, Srull and Wyer, Jr. 1979; 2 experiments with Study 1: n = 96; Study 2: n = 96. [citations = 2409 (GS, November 2021)].
- Critique:
McCarthy et al. 2018 [n = 7,373 for Study 1, citations = 40(GS, November 2021)].
McCarthy et al. 2021 (see Figure) [n = 1,402 for close replication; n = 1,641 for conceptual replication, citations = 2(GS, November 2021)].
- Original effect size: 2.99 (1.58%).
- Replication effect size: All effect sizes are located in McCarthy et al. 2018: Acar: d = 0.16. Aczel: d = 0.12. Birt: d = -0.11. Evans: d = -.22. Ferreira-Santos: d = 0.01. Gonzalez-Iraizoz: d = -.21. Holzmeister: d = .11. Klein Selle and Rozmann: d = -0.51. Koppel: d = -.14. Laine: d = -.27. Loschelder: XX = -.07. McCarthy: d = -.10. Meijer: d = .03. Ozdogru: d = .22. Pennington: d = -.52. Roets: d = -.01. Suchotzki: d = .10. Sutan: d = .49. Vanpaemel: d = .17. Verschuere: d = -.14. Wick: d = .07. Wiggins: d = .01. Average replication effect size: d = -0.08. McCarthy et al. 2021: d = 0.06.
Intelligence priming (contemplation) (professor priming). Participants primed with a category associated with intelligence (e.g. “professor”) performed 13% better on a trivia test than participants primed with a category associated with a lack of intelligence (“soccer hooligans”).
Statistics
- Status: not replicated
- Original paper: ‘The relation between perception and behavior, or how to win a game of trivial pursuit’, Dijksterhuis and van Knippenberg 1998; 4 experiments with Study 1: n = 60; Study 2: n = 58; Study 3: n = 95; Study 4: n = 43. [citations = 1124 (GS November 2021)].
- Critiques:
O’Donnell et al. 2018 [n = 4,493 who met the inclusion criteria; n = 6,454 in supplementary materials, citations = 71(GS November 2021)].
- Original effect size: PD = 13.20%.
- Replication effect size: All effect sizes are located in O’Donnell et al. 2018: Aczel: PD = -1.35%. Aveyard: PD = -3.99%. Baskin: PD = 4.08%. Bialobrzeska: PD = -.12%. Boot: PD = -4.99%. Braithwaite: PD = 4.01%. Chartier: PD = 3.23%. DiDonato: PD = 3.14%. Finnigan: PD = 2.89%. Karpinski: PD = 1.38%. Keller: PD = .17%. Klein: PD = .88%. Koppel: PD = -.20%. McLatchie: PD = -2.16%. Newell: PD = 1.66%. O’Donnell: PD = 1.58%. Phillipp: PD = 43%. Ropovik: PD = -.48%. Saunders: PD = -1.87%. Schulte-Mecklenbeck: PD = 4.24%. Shanks: PD = .11%. Steele: PD = -.58%. Steffens: PD = -.84%. Susa: PD = -.63%. Tamayo: PD = 1.41%. Meta-analytic estimate: PD = 0.02%.
Moral priming (cleanliness). Participants exposed to physical cleanliness were shown to reduce the severity of their moral judgments. Direct, well-powered replications did not find evidence for the phenomenon.
Statistics
- Status: not replicated
- Original paper: ‘With a Clean Conscience: Cleanliness Reduces the Severity of Moral Judgments’, Schnall, Benton, and Harvey, 2008; 2 experiments with Study 1: n = 40, Study 2: n = 44. [citations = 645 (GS November 2021)].
- Critiques:
Johnson et al. 2014, [Study 1: n = 208, Study 2: n = 126. citations=128(GS November 2021)].
- Original effect size: Study 1: d = -0.60, 95% CI [-1.23, 0.04]; Study 2: d = -0.85, 95% CI [-1.47, -0.22]
- Replication effect size: Study 1: d = -0.01, 95% CI [-0.28, 0.26]; Study 2: d = 0.01, 95% CI [-0.34, 0.36]
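When an entry lists several replication estimates with 95% confidence intervals, a rough combined estimate can be obtained by inverse-variance weighting. The sketch below is only an illustration under stated assumptions: it treats each interval as symmetric and normal-based (so the standard error is the interval width divided by 2 × 1.96) and uses a fixed-effect model, which is not necessarily the model used in the papers listed here. The function names are hypothetical.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Approximate a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * z)

def pool_fixed_effect(estimates):
    """Inverse-variance (fixed-effect) pooled estimate with its 95% CI.

    `estimates` is a list of (d, ci_lower, ci_upper) tuples.
    """
    weights = [1 / se_from_ci(lo, hi) ** 2 for _, lo, hi in estimates]
    total = sum(weights)
    pooled = sum(w * d for w, (d, _, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1 / total)
    return pooled, pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se

# Replication effect sizes of Johnson et al. 2014 as listed above.
johnson_2014 = [(-0.01, -0.28, 0.26), (0.01, -0.34, 0.36)]
pooled, lo, hi = pool_fixed_effect(johnson_2014)
print(f"pooled d = {pooled:.2f} [{lo:.2f}, {hi:.2f}]")
```

With the two replication estimates above, the pooled value is close to zero and its interval spans zero, which is consistent with the "not replicated" status.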
Moral priming (contemplation). Participants exposed to a moral-reminder prime would demonstrate reduced cheating.
Statistics
- Status: not replicated
- Original paper: ‘The Dishonesty of Honest People: A Theory of Self-Concept Maintenance’, Mazar et al. 2008; 6 experiments with Study 1: n = 229; Study 2: n = 207; Study 3: n = 450; Study 4: n = 44; Study 5: n = 108; Study 6: n = 326. [citations = 3072 (GS November 2021)].
- Critiques:
Verschuere et al. 2018 [n = 5786 replication of Experiment 1, citations = 65(GS November 2021)].
- Original effect size: not reported; commandments-cheat versus books-cheat: d = -1.45 [-2.61, -0.29] [obtained from Verschuere et al.’s 2018 meta-analysis, Figure 2], commandments-cheat versus commandments-control: d = -0.35 [-1.26, 0.57] [obtained from Verschuere et al.’s 2018 meta-analysis, Figure 3].
- Replication effect size: All effect sizes are located in Verschuere et al. 2018: Commandments-cheat versus books-cheat: Aczel: d = -0.26 [-1.22, 0.69]. Birt: d = 0.41 [-0.58, 1.39]. Evans: d = 0.85 [-0.13, 1.83]. Ferreira-Santos: d = -0.19 [-1.14, 0.77]. Gonzalez-Iraizoz: d = 0.26 [-0.77, 1.28]. Holzmeister: d = 1.11 [-0.30, 2.52]. Klein Selle and Rozmann: d = -0.27 [-1.11, 0.58]. Koppel: d = 0.39 [-0.40, 1.17]. Laine: d = -0.37 [-1.18, 0.44]. Loschelder: d = -0.11 [-0.86, 0.65]. McCarthy: d = 0.57 [-0.87, 2.02]. Meijer: d = -0.15 [-0.75, 0.44]. Ozdogru: d = 1.19 [0.01, 2.37]. Suchotzki: d = 0.00 [-0.93, 0.93]. Sutan: d = 0.02 [-0.79, 0.83]. Vanpaemel: d = 0.17 [-0.55, 0.88]. Verschuere: d = 0.18 [-0.55, 0.91]. Wick: d = -0.09 [-1.06, 0.87]. Wiggins: d = 0.19 [-0.51, 0.90]. Meta-analytic estimate: d = 0.11 [-0.09, 0.31]. Commandments-cheat versus commandments-control: Aczel: d = 0.05 [-0.77, 0.88]. Birt: d = 0.83 [-0.10, 1.75]. Evans: d = 0.60 [-0.39, 1.59]. Ferreira-Santos: d = -0.33 [-1.41, 0.74]. Gonzalez-Iraizoz: d = 1.11 [0.14, 2.08]. Holzmeister: d = 1.30 [-0.17, 2.78]. Klein Selle and Rozmann: d = -0.15 [-0.79, 1.09]. Koppel: d = 0.51 [-0.20, 1.22]. Laine: d = 0.10 [-0.63, 0.83]. Loschelder: d = -0.24 [-1.38, 0.90]. McCarthy: d = 1.10 [-0.20, 2.41]. Meijer: d = -0.31 [-0.89, 0.25]. Ozdogru: d = 1.15 [-0.10, 2.41]. Suchotzki: d = -0.05 [-0.86, 0.75]. Sutan: d = 0.41 [-0.41, 1.23]. Vanpaemel: d = 0.36 [-0.37, 1.09]. Verschuere: d = 0.13 [-0.61, 0.87]. Wick: d = -0.14 [-0.94, 0.67]. Wiggins: d = -0.08 [-1.02, 0.87]. Meta-analytic estimate: d = 0.24 [0.03, 0.44].
Distance priming. Participants primed with distance compared to closeness produced greater enjoyment of media depicting embarrassment (Study 1), less emotional distress from violent media (Study 2), lower estimates of the number of calories in unhealthy food (Study 3), and weaker reports of emotional attachments to family members and hometowns (Study 4).
Flag priming. Participants primed by a flag are more likely to hold conservative positions than those in the control condition.
Statistics
- Status: mixed
- Original paper: ‘A Single Exposure to the American Flag Shifts Support Toward Republicanism up to 8 Months Later’, Carter et al. 2011; experimental design, 2 studies with Experiment 1: n = 191 completed three sessions and 71 completed the fourth session; Experiment 2: n = 70. [citations = 186 (GS, October 2021)].
- Critique:
Klein et al. 2014 [n=6,082, citations = 957 (GS, October 2021)].
- Original effect size: d = 0.50.
- Replication effect size: All effect sizes are located in ManyLabs: Adams and Nelson: d = .02. Bernstein: d = 0.07. Bocian and Frankowska: d = .19 (Study 1). Bocian and Frankowska: d = -.22 (Study 2). Brandt et al.: d = .21. Brumbaugh and Storbeck: d = -.22 (Study 1). Brumbaugh and Storbeck: d = .02 (Study 2). Cemalcilar: d = .14. Cheong: d = -.11. Davis and Hicks: d = -.27 (Study 1). Davis and Hicks: d = -.03 (Study 2). Devos: d = -.11. Furrow and Thompson: d = .09. Hovermale and Joy-Gaba: d = -.07. Hunt and Krueger: d = .27. Huntsinger and Mallett: d = .06. John and Skorinko: d = .08. Kappes: d = .04. Klein et al.: d = -.11. Kurtz: d = .04. Levitan: d = -.01. Morris: d = .09. Nier: d = -.45. Packard: d = .04. Pilati: d = 0.00. Rutchick: d = -.07. Schmidt and Nosek (PI): d = .03. Schmidt and Nosek (MTURK): d = .09. Schmidt and Nosek (UVA): d = -.15. Smith: d = .27. Swol: d = -.03. Vaughn: d = -.17. Vianello and Galliani: d = .49. Vranka: d = -.03. Wichman: d = .11. Woodzicka: d = -.09. Average replication effect size: d = 0.03.
Fluency priming. Objects that are fluent (e.g., conceptually fluent, visually fluent) are perceived more concretely than objects that are disfluent (disfluent objects are perceived more abstractly).
Money priming. Images or phrases related to money cause increased faith in capitalism, and the belief that victims deserve their fate.
Statistics
- Status: not replicated
- Original paper: ‘Mere exposure to money increases endorsement of free-market systems and social inequality’, Caruso 2013; experimental design, n between 30 and 168. [citations ≈ 161 (GS, November 2021)].
- Critiques:
Rohrer 2015 [n=136, citations = 82 (GS, November 2021)]. Meta-analysis:
Lodder 2019 [k=246, citations = 64 (GS, November 2021)].
- Original effect size: system justification d = 0.8, just world d = 0.44, dominance d = 0.51, fair market ideology: not reported, d = 0.70 [obtained from Rohrer’s 2015 Experiment 4 results section].
- Replication effect size: Rohrer et al. (Experiment 1): d = 0.07 [0.41, 0.27] for system justification, d = 0.06 [-0.14, 0.25] for belief in a just world, d = -0.06 for social dominance, social dominance: d = 0.06 [0.37, 0.26], fair market ideology, d = 0.14 [-0.23, 0.50]. For 47 preregistered experiments in Lodder: g = 0.01 [-0.03, 0.05] for system justification, g = 0.11 [-0.08, 0.3] for belief in a just world, g = 0.07 [-0.02, 0.15] for fair market ideology.
Commitment priming (recall). Participants exposed to a high-commitment prime would exhibit greater forgiveness.
Mortality Salience (Death Priming/Terror Management Theory). Reminders of death lead to subconscious changes in attitudes and behaviour, for example in the form of increased in-group bias and behaviour that serves to defend an individual’s cultural worldview.
Statistics
- Status: not replicated
- Original paper: ‘Role of Consciousness and Accessibility of Death-Related Thoughts in Mortality Salience Effects’, Greenberg et al. 1994; Experiment 1, n = 58. [citations = 1294 (GS, June 2022)]. A second original paper was ‘I am not an animal: Mortality salience, disgust, and the denial of human creatureliness’, Goldberg et al. 2001; two experiments, n1 = 77, n2 = 44. [citations = 501 (GS, March 2023)].
- Critiques:
Meta-analysis: Burke et al. 2010 [k=277, citations = 1,497 (GS, March 2023)].
Klein et al. 2018 performed a replication of Greenberg et al. (1994) Experiment 1 with and without original author involvement [n = 2281 for Experiment 1, citations = 99 (GS, June 2022)].
Sætrevik & Sjåstad 2019 [n = 101 for Experiment 1, n = 784 for Experiment 2, citations = 7 (GS, March 2023)].
A replication of Goldberg et al. (2001) by Rodríguez-Ferreiro et al. 2019 [n = 128, citations = 16 (GS, March 2023)].
- Original effect size: Meta-analytic effect of d = .82 for mortality salience (reported in Burke et al. 2010).
- Replication effect size: Klein et al.: Regardless of which exclusion criteria were used, the predicted effect was not observed, and the confidence interval was quite narrow: Exclusion Set 1: Hedges’ g = 0.03 [-0.06, 0.12]; Exclusion Set 2: Hedges’ g = 0.06 [-0.06, 0.17]; Exclusion Set 3: Hedges’ g = 0.04 [-0.07, 0.16]; for this reason, they were unable to further assess whether original author involvement influenced the replication results. Sætrevik & Sjåstad: d = -0.08 to 0.35 for outcome effects related to theoretical predictions. Rodríguez-Ferreiro et al.: d = 0.09 [-0.26, 0.44], which was significantly different from the effect size of the original study, d = 1.13 [0.17, 2.07], z = 2.03, p = 0.043.
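The comparison between the Rodríguez-Ferreiro et al. replication and the original effect size can be approximated from the reported confidence intervals alone. The following is a minimal sketch, assuming normal-based, roughly symmetric 95% intervals and independent samples; it is not necessarily the procedure the authors used, and the function names are hypothetical.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Approximate a standard error from a 95% confidence interval."""
    return (upper - lower) / (2 * z)

def z_test_difference(d1, ci1, d2, ci2):
    """z statistic and two-sided p-value for the difference d1 - d2."""
    se_diff = math.sqrt(se_from_ci(*ci1) ** 2 + se_from_ci(*ci2) ** 2)
    z = (d1 - d2) / se_diff
    # Two-sided p-value of a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Original study (d = 1.13 [0.17, 2.07]) versus replication (d = 0.09 [-0.26, 0.44]).
z, p = z_test_difference(1.13, (0.17, 2.07), 0.09, (-0.26, 0.44))
print(f"z = {z:.2f}, p = {p:.3f}")
```

Under these assumptions the result is roughly z ≈ 2.0 with p a little above .04, close to the reported z = 2.03, p = 0.043; small differences are expected because the intervals are rounded.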
Spatial priming for emotional closeness. Plotting points closer together led to participants reporting they were closer to their own family members than those who plotted points farther apart.
Statistics
- Status: not replicated
- Original paper: ‘Keeping One’s Distance: The effect of spatial distance cues on affect and emotion’, Lawrence and Bargh 2008; 4 experiments with Study 1: n = 73; Study 2: n = 42; Study 3: n = 59; Study 4: n = 84. [citations = 583 (GS, January 2022)].
- Critiques:
Pashler et al. 2012 [n = 92, citations = 188 (GS, January 2022)].
Open Science Collaboration 2015 [n=125, citations = 6148(GS, January 2022)].
- Original effect size: Study 1: η2 = .09/d = 0.10; Study 2: η2 = .18/d = 0.22; Study 3: η2 = .10/d = 0.11; Study 4: η2 = .11/d = 0.12 [all converted from partial eta squared to Cohen’s d using this conversion].
- Replication effect size: Pashler et al.: η2 = 0.01/d = 0.01 [converted from partial eta squared to Cohen’s d using this conversion]. Joy-Gaba et al.’s effect sizes are located in Open Science Collaboration 2015 for Study 4: η2 = .00/d = .00.
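The conversion tool referred to in the entries above is not reproduced here. As an illustration only, one widely used textbook conversion for a two-group comparison goes through Cohen's f: f = sqrt(η² / (1 − η²)) and d = 2f. This is an assumption about how such a conversion can be done, not a description of the linked tool, and its results will not necessarily match the d values listed above. The function name is hypothetical.

```python
import math

def eta_squared_to_d(eta_squared):
    """Convert (partial) eta squared to Cohen's d for a two-group design.

    Uses f = sqrt(eta^2 / (1 - eta^2)) and d = 2 * f.
    """
    if not 0 <= eta_squared < 1:
        raise ValueError("eta squared must lie in [0, 1)")
    f = math.sqrt(eta_squared / (1 - eta_squared))
    return 2 * f

for eta_sq in (0.09, 0.18, 0.10, 0.11, 0.01):
    print(f"eta^2 = {eta_sq:.2f} -> d = {eta_squared_to_d(eta_sq):.2f}")
```

Because conversions differ in their assumptions (number of groups, equal group sizes, partial versus classical eta squared), it is worth checking which formula a given tool applies before comparing converted values across studies.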
Implicit God prime increases self-reported risky behaviour. Implicitly priming God using the scrambled-sentence paradigm increases self-reported risk taking.
Statistics
- Status: not replicated
- Original paper: ‘Anticipating divine protection? Reminders of god can increase nonmoral risk taking’, Kupor et al. 2015; experimental design, Experiment 1a: n = 61 and Experiment 1b: n = 202. [citations = 76 (GS, November 2022)].
- Critiques:
Gervais et al. 2020 [Experiment 1a: n=556, Experiment 1b: n=548, citations=9 (GS, November 2022)].
- Original effect size: Experiment 1a: d=0.574 [0.05, 1.09]; Experiment 1b: d=0.323 [0.04,0.60].
- Replication effect size: Gervais et al.: Experiment 1a: d=0.14 [-0.07, 0.34]; Experiment 1b: d=-0.11 [-0.31, 0.09].
Implicit God prime increases actual risky behaviour. Implicitly priming God using the scrambled-sentence paradigm increases willingness to engage in risky behaviour for financial reward.
Statistics
- Status: not replicated
- Original paper: ‘Anticipating divine protection? Reminders of god can increase nonmoral risk taking’, Kupor et al. 2015; Experiment 3: n = 101. [citations = 76 (GS, November 2022)].
- Critiques:
Gruneau Brulin et al. 2018 [Experiment 1b: n = 160, Experiment 2b: n = 264, citations = 19 (GS, November 2022)].
- Original effect size: Experiment 3: b=0.61.
- Replication effect size: Gruneau Brulin et al: Experiment 1b: d=-0.11 [-0.31, 0.09]; Experiment 2b: b=0.14 [-0.07, 0.34].
Heat priming. Exposure to words related to hot temperatures increases aggressive thoughts and hostile perceptions. This effect suggests that people mentally associate heat-related constructs with aggression-related constructs.
Statistics
- Status: not replicated
- Original paper: ‘Hot under the collar in a lukewarm environment: Words associated with hot temperature increase aggressive thoughts and hostile perceptions’, DeWall & Bushman 2009; 2 experiments in which participants were first exposed to words related to heat, cold, or neutral concepts and then either completed a word stem completion task (Study 1; n = 127) or rated a person’s hostility based on an ambiguous description of that person (Study 2; n = 72). [citations = 76 (GS, June 2022)].
- Critiques:
McCarthy 2014 [n=182, citations=14 (GS, June 2022)]; including meta-analyses [n=499].
- Original effect size: Study 1: d = 0.47 (hot vs. cold words), d = 0.46 (hot vs. neutral words); Study 2: d = 0.67 (hot vs. cold words), d = 0.63 (hot vs. neutral words).
- Replication effect size: McCarthy: Study 2A: d = -0.12 (hot vs. cold words), d = -0.02 (hot vs. neutral words); Study 2B: d = -0.06 (hot vs. cold words), d = 0.00 (hot vs. neutral words) (both experiments replicate procedure from Study 2); Meta-analysis: d = 0.18.
Honesty priming (goal-priming, social priming). Exposure to honesty-related words increases honest reporting of embarrassing behaviours.
Statistics
- Status: not replicated
- Original paper: ‘Using implicit goal priming to improve the quality of self-report data’, Rasinski et al. 2005; between-subjects, n = 64. [citations = 111 (GS, October 2022)].
- Critiques:
Pashler et al. 2013 [Direct replication, Experiment 1 n=149 and Experiment 2 n=152, and conceptual replication, Experiment 3 n=151 and Experiment 4 n=153, citations = 66 (GS, October 2022)].
Dalal and Hakel 2016 [Experiment 1 n = 590, conceptual replication, citations = 41 (GS, March 2023)].
- Original effect size: d = 1.21 (estimated from test-statistics in paper).
- Replication effect size: Pashler et al.: Experiment 1: d = 0.18 (non-significant; not replicated); Experiment 2: d = -0.14 (non-significant; opposite direction); Experiment 3: Measure 1: d = -0.14 (non-significant; opposite direction; estimated from test statistics in paper), Measure 2: d = -0.13 (non-significant; opposite direction; estimated from test statistics in paper); Experiment 4: Measure 1: d = 0.04 (non-significant; not replicated; estimated from descriptive statistics), Measure 2: d = -0.14 (non-significant; opposite direction; estimated from descriptive statistics). Dalal and Hakel: d = -0.07 (non-significant; opposite direction; estimated from descriptive statistics in Table 2 (to get N of groups) and Table 3 (to get the means and standard deviations)).
Achievement priming (goal priming, high-performance goal priming). Exposing individuals to words that are success oriented (e.g., win, strive) will increase their performance on a task compared to those exposed to neutral words (e.g., carpet, shampoo).
Statistics
- Status: mixed.
- Original paper: ‘The Automated Will: Nonconscious Activation and Pursuit of Behavioral Goals’, Bargh et al. 2001; between-subjects, Experiment 1: n = 78, Experiment 2: n = 60, Experiment 3: n = 288, Experiment 4: n = 76, Experiment 5: n = 65. [citations = 2,987 (GS, October 2022)].
- Critiques:
Shantz and Latham 2009 [Pilot Study: n = 52, Field Experiment: n = 81, citations = 221 (GS, October 2022)].
Harris et al. 2013 [Experiment 1: n = 98, Experiment 2: n = 66, citations = 199 (GS, October 2022)].
Weingarten et al. 2016 [meta-analysis, n = NA, k = 133 studies, citations = 333 (GS, October 2022)].
- Original effect size (estimated from test-statistics reported): Experiment 1: d = 0.72 (priming of high-performance words led to more words being found); Experiment 2: d = 0.53 (priming of cooperation words led to more cooperation between players); Experiment 3: d = 0.52 (adding delay between word exposure and task increased performance in the high-performance words group); Experiment 4: d = 0.76 (when given the stop signal, those in the high-performance word group continued to work on the task) [Note: The statistics for this experiment suggest that they had more than 76 participants. Specifically, they fit a 2 x 2 ANOVA and have residual degrees of freedom of 75. If they had 76 participants, their residual degrees of freedom would be 72. For the purposes of estimating their effect sizes, I have used the corrected residual degrees of freedom value.]; Experiment 5: d = 0.68 (when interrupted, the high-performance word group was more likely to return to their task than the neutral group).
- Replication effect size: Shantz and Latham: Participants were either shown a picture of a woman winning a race or not, to prime achievement; Pilot Study: d = 0.84 (replicated); Field Experiment: d = 0.43 (replicated). Harris et al.: Experiment 1 (direct replication of Experiment 1 in Bargh et al., 2001): d = -0.24 [-0.64, 0.15] (not replicated); Experiment 2 (direct replication of Experiment 3 in Bargh et al., 2001): d = -0.03 [-0.52, 0.45] (not replicated). Weingarten et al.: The meta-analysis looked at all priming experiments that examined behaviour (i.e., not just achievement priming). It found that there is a small effect of behavioural priming (d = 0.35 [0.29, 0.41]). Factors that affected the priming effects were: Publication status: Published (n = 255 studies): d = 0.39 [0.33, 0.44], Unpublished (n = 88 studies): d = 0.10 [0.01, 0.20]; Liminality: Supraliminal (n = 255 studies): d = 0.30 [0.24, 0.36] (this is the method used in Bargh et al., 2001), Subliminal (n = 88 studies): d = 0.40 [0.30, 0.51]; Use of neutral control: No neutral control (n = 38 studies): d = 0.44 [0.27, 0.60], With neutral control (n = 307 studies): d = 0.31 [0.25, 0.37].
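The degrees-of-freedom note for Experiment 4 in this entry rests on simple arithmetic: in a fully between-subjects factorial ANOVA with all interactions, the residual degrees of freedom equal the number of participants minus the number of cells. A short sketch of that check follows, with hypothetical function names and assuming no missing data or covariates.

```python
from math import prod

def residual_df(n_participants, *factor_levels):
    """Residual df of a full between-subjects factorial ANOVA: N minus number of cells."""
    return n_participants - prod(factor_levels)

def implied_n(resid_df, *factor_levels):
    """Sample size implied by a reported residual df for the same design."""
    return resid_df + prod(factor_levels)

print(residual_df(76, 2, 2))  # residual df expected for 76 participants in a 2 x 2 design
print(implied_n(75, 2, 2))    # sample size implied by residual df of 75
```

For a 2 x 2 design, 76 participants give 72 residual degrees of freedom, whereas a reported value of 75 implies 79 participants.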
Weapons priming effect (weapons effect). Stimuli or cues associated with aggression, such as weapons, can elicit aggressive responses.
Statistics
- Status: mixed (the effect is smaller than originally believed)
- Original paper: ‘Weapons as aggression-eliciting stimuli’, Berkowitz and LePage 1967; between-subjects design, n = 100 (male university students). [citations = 1161 (GS, October 2022)].
- Critiques:
Turner and Simons 1974 [n = 60, citations = 11 (GS, October 2022)].
Frodi 1975 [n = 100, citations = 50 (GS, October 2022)].
Carlson et al. 1990 [meta-analysis; n = 628 (fail-safe), k = 56 studies, citations = 339 (GS, October 2022)].
Benjamin et al., 2018 [meta-analysis; n = 7,668 participants, k = 78 studies, citations = 12 (GS, October 2022)].
Ariel et al. 2019 [RCT of taser presence and the police force; n = 678 officers, citations = 42 (GS, October 2022)].
- Original effect size: all reported effect sizes are found in Carlson et al.: d = 0.76 to 1.06.
- Replication effect size: Turner and Simons: d = -1.17 to 0.64 (reported in Carlson et al., 1990); the greater the evaluation apprehension, the less likely aggressive behaviour was observed (mixed). Frodi: d = 0.91 (reported in Carlson et al., 1990) (replicated). Carlson et al.: d = 0.38 (replicated). Benjamin et al.: d = 0.29 [0.21, 0.36] (replicated); the effect is moderated by several variables: smaller when behaviour is the outcome (d = 0.25 [0.07, 0.43]), reduced for “field” experiments (d = 0.22 [-0.07, 0.51]), and larger when photos are used (d = 0.35 [0.26, 0.44]) rather than actual weapons (d = 0.12 [-0.08, 0.31]). Ariel et al.: The presence of a taser on the officer led to increased use of force, IRR = 1.48 [1.27, 1.72] (replicated), and increased injury to officers, IRR = 2.11 [1.53, 2.91] (replicated).
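Unlike the standardized mean differences elsewhere in this list, incidence rate ratios (IRR) are ratio measures, and their confidence intervals are usually built on the log scale. The sketch below, which assumes a normal-based 95% interval on log(IRR) and uses a hypothetical function name, shows how to check that a reported interval is consistent with its point estimate and how to recover an approximate z statistic.

```python
import math

def log_scale_summary(irr, lower, upper, z=1.96):
    """Return (geometric midpoint of the CI, SE of log IRR, z statistic)."""
    se_log = (math.log(upper) - math.log(lower)) / (2 * z)
    midpoint = math.sqrt(lower * upper)  # centre of the interval on the log scale
    z_stat = math.log(irr) / se_log      # test of IRR = 1, i.e. log(IRR) = 0
    return midpoint, se_log, z_stat

# Values reported above for Ariel et al. 2019.
force = log_scale_summary(1.48, 1.27, 1.72)
injury = log_scale_summary(2.11, 1.53, 2.91)
print(f"use of force: midpoint = {force[0]:.2f}, z = {force[2]:.1f}")
print(f"officer injury: midpoint = {injury[0]:.2f}, z = {injury[2]:.1f}")
```

In both cases the geometric midpoint of the interval matches the reported IRR to two decimals, and an interval that excludes 1 corresponds to a z statistic above 1.96.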
Goal priming effect (goal contagion, goal inspiration, behavioural inspiration). Observing others’ behaviour (e.g., you observe someone jogging in the park) may lead the observer to infer the underlying goal (“This person wants to keep fit.”) and to adopt the same goal (“Maybe I should do some sports too.”).
Statistics
- Status: not replicated
- Original paper: ‘Goal Contagion: Perceiving Is for Pursuing’, Aarts, Gollwitzer and Hassin 2004; Study 1, 2 (need for money: high vs low) x 2 (goal: money vs control) between-subjects ANOVA with dependent variable of earning money goal; note: the main effect of the manipulation is of interest, n = 83. [citations = 824 (GS, December 2022)].
- Critiques:
Brohmer et al. 2021 [meta-analysis, total n = 4751, citations = 2 (GS, December 2022)].
Corcoran et al. 2020 [n=300, citations=10(GS, December 2022)].
- Original effect size: main effect of goal vs control: g = 0.38 [-0.05, 0.81] (based on F statistics: F(1,79) = 3.14, p < .08; original authors also report Goal x Need interaction effect, F(1, 79) = 5.32, p < .03).
- Replication effect size: Corcoran et al.: g = -0.20 [-0.42, 0.02] (reported in Brohmer et al.’s meta-analysis); Brohmer et al.: g = 0.30 [0.21, 0.40], but the bias-corrected meta-analytic summary effect (selection model approach) is g = 0.15 [-0.02, 0.32].
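Several entries, including the original effect size in this one, are estimated from reported test statistics. For a main effect with one numerator degree of freedom, one common approximation takes t = sqrt(F), converts it to Cohen's d as 2t / sqrt(N), and applies the small-sample correction J = 1 − 3 / (4·df − 1) to obtain Hedges' g. The sketch below assumes two groups of roughly equal size; it is not necessarily the exact procedure used for this entry, and the function name is hypothetical.

```python
import math

def hedges_g_from_f(f_value, n_total, df_error):
    """Approximate Hedges' g from a 1-df F test, assuming two equal-sized groups."""
    t = math.sqrt(f_value)              # with 1 numerator df, F = t^2
    d = 2 * t / math.sqrt(n_total)      # Cohen's d for equal group sizes
    j = 1 - 3 / (4 * df_error - 1)      # small-sample bias correction
    return d * j

# Main effect of goal vs control in Aarts et al. 2004, Study 1: F(1, 79) = 3.14, n = 83.
g = hedges_g_from_f(3.14, n_total=83, df_error=79)
print(f"g = {g:.2f}")
```

Under these assumptions the estimate comes out near 0.39, close to the g = 0.38 listed above; unequal cell sizes or a different correction would account for small discrepancies.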
Verbal framing (temporal tense). Participants who read what a person was doing (relative to those who read what person did) showed enhanced accessibility of intention-related concepts and attributed more intentionality to the person.
Statistics
- Status: mixed
- Original paper: ‘Learning about what others were doing: Verb aspect and attributions of mundane and criminal intent for past actions’, Hart and Albarracin (2011); 3 experiments with Study 1: n = 5458; Study 2: n = 37; Study 3: n = 48. [citations = 37 (GS, January 2022)].
- Critiques:
Eerland et al. (2016) [meta-analysis of Study 3 (total n = 685 for perfective-aspect condition; n = 681 for imperfective-aspect condition), citations = 70 (GS, January 2022)].
- Original effect size: Study 1: d = 1.00 for intentionality in imperfective-aspect condition; Study 2: d = 1.23 for imagery in imperfective-aspect condition; Study 3: d= 1.20 for intentionality, d = 0.92 for imagery and 0.55 for intention attribution in imperfective-aspect condition.
- Replication effect size: All effect sizes are located in Eerland et al. 2016: Intentionality: Arnal (lab): d = -0.35; Berger (lab): d = -0.98; Birt and Aucoin (lab): d = -0.38; Eerland et al. (lab): d = 0.16; Eerland et al. (online): d = -0.33; Ferretti (lab): d = -0.01; Knepp (lab): d = -0.95; Kurby and Kibbe (lab): d = -0.14; Melcher (lab): d = 0.65; Michael (lab): d = -0.41; Poirier et al. (lab): d = 0.32; Prenoveau and Carlucci (lab): d = -0.38. Meta-analytic estimate for laboratory replications only: d = -0.24. Imagery: Arnal (lab): d = -0.01; Berger (lab): d = -0.45; Birt and Aucoin (lab): d = -0.40; Eerland et al. (lab): d = -0.01; Eerland et al. (online): d = -0.13; Ferretti (lab): d = 0.33; Knepp (lab): d = 0.00; Kurby and Kibbe (lab): d = 0.02; Melcher (lab): d = -0.16; Michael (lab): d = -0.08; Poirier et al. (lab): d = -0.19; Prenoveau and Carlucci (lab): d = -0.02. Meta-analytic estimate for laboratory replications only: d = -0.08. Intention attribution: Arnal (lab): d = -0.15; Berger (lab): d = -0.15; Birt and Aucoin (lab): d = 0.08; Eerland et al. (lab): d = -0.01; Eerland et al. (online): d = 0.02; Ferretti (lab): d = -0.19; Knepp (lab): d = -0.29; Kurby and Kibbe (lab): d = 0.00; Melcher (lab): d = 0.12; Michael (lab): d = 0.13; Poirier et al. (lab): d = 0.06; Prenoveau and Carlucci (lab): d = 0.03. Meta-analytic estimate for laboratory replications: d = 0.00.
Reference framing. Risk preferences change depending on whether a choice is presented in terms of gains or losses, even when the prospects of the options are held constant.
Prosocial spending. Spending money on other people leads to greater happiness than spending money on oneself.
Statistics
- Status: replicated
- Original paper: ‘Spending Money on Others Promotes Happiness’, Dunn et al., 2008; cross-sectional survey, n = 632. [citations = 2008 (GS, March 2022)].
- Critiques:
Aknin et al., 2020 [3 experiments, Experiment 1: n = 712, Experiment 2: n = 1,950, Experiment 3: n = 5,199, citations = 51 (GS, March 2022)].
- Original effect size: b = 0.11.
- Replication effect size: Experiment 1: positive affect: d = .36, positive emotion: d = .32; Experiment 2: positive affect: d = .03, positive emotion: d = .02; Experiment 3: positive affect: d = .06, positive emotion: d = .06, positive emotion after spending one’s own money: d = .17.
Gustatory disgust on moral judgement. Gustatory disgust triggers a heightened sense of moral wrongness.
Statistics
- Status: not replicated
- Original paper: ‘A Bad Taste in the Mouth: Gustatory Disgust Influences Moral Judgment’, Eskine et al. 2011; experiment, n = 57. [citations = 564 (GS, January 2022)].
- Critiques:
Ghelfi et al. 2020 [meta-analysis, total n = 1137, citations = 18 (GS, January 2022)].
Johnson et al. 2016 [Study 1: n = 478, Study 2: n = 934, citations=52 (GS January 2022)].
- Original effect size: Cohen’s d = 1.12 (comparison to control group); Cohen’s d = 1.28 (comparison to sweet taste).
- Replication effect size: Johnson et al.: Cohen’s d = 0.04 (Study 1 - comparison to control group), Cohen’s d = 0.05 (Study 2 - comparison to control group). All effect sizes are located in Ghelfi et al. 2020: Comparison to sweet group: Christopherson: Hedges’ g = 0.53. Christopherson: Hedges’ g = 0.04. Fischer: Hedges’ g = 0.25. Guberman: Hedges’ g = -0.30. de Haan: Hedges’ g = -0.13. Legate: Hedges’ g = 0.99. Legate: Hedges’ g = -0.02. Lenne: Hedges’ g = -0.19. Urry: Hedges’ g = -0.13. Wagemans: Hedges’ g = 0.03. Weber: Hedges’ g = -0.27. Meta-analytic estimate: Hedges’ g = -0.05. Comparison to control group: Christopherson: Hedges’ g = 0.68. Christopherson: Hedges’ g = -0.19. Fischer: Hedges’ g = -0.01. Guberman: Hedges’ g = -0.12. de Haan: Hedges’ g = -0.24. Legate: Hedges’ g = 0.79. Legate: Hedges’ g = 0.37. Lenne: Hedges’ g = -0.13. Urry: Hedges’ g = 0.08. Wagemans: Hedges’ g = -0.11. Weber: Hedges’ g = -0.04. Meta-analytic estimate: Hedges’ g = 0.10.
Macbeth effect. Moral aspersions induce literal physical hygiene.
Statistics
- Status: mixed
- Original paper: ‘Washing away your sins: threatened morality and physical cleansing’, Zhong and Liljenquist 2006; 4 experiments with Study 1: n = 60, Study 2: n = 27, Study 3: n = 32, Study 4: n = 45. [citations = 1407 (GS, January 2022)].
- Critiques:
Siev et al. 2018 [meta-analysis: n=1,746, citations = 17(GS, January 2022)].
- Original effect size: Study 1: g = 0.38; Study 2: g = 0.75; Study 3: g = 0.38; Study 4: g = 0.33.
- Replication effect size: Siev et al.: g = 0.17 [0.04, 0.31]. All effect sizes are located in Siev et al. 2018: Earp et al.: Study 1: g = 0.02 [-0.30, 0.34], Study 2: g = 0.05 [-0.27, 0.37], Study 3: g = 0.13 [-0.11, 0.37]. Fayard et al.: Study 1: g = 0.11 [-0.20, 0.43]. Gamez et al.: Study 1: g = 0.02 [-0.54, 0.56], Study 2: g = -0.01 [-0.64, 0.63], Study 3: g = 0.55 [-0.26, 1.37]. Lee and Schwarz: Study 2: g = 0.22 [-0.20, 0.64]. Schaefer: Study 2: g = 0.71 [0.18, 1.23]. Siev et al. (unpublished): Study 1: g = -0.06 [-0.27, 0.15], Study 2: g = -0.18 [-0.56, 0.20]. Zhong (unpublished): Study 2: g = 0.28.
Signing at the beginning rather than end makes ethics salient. Signing a statement of honest intent before providing information rather than after can reduce dishonesty.
Statistics
- Status: not replicated/retracted
- Original paper: ‘Signing at the Beginning Makes Ethics Salient and Decreases Dishonest Self-reports in Comparison to Signing at the End’, Shu et al., 2012; lab and field experiments, Study 1: n = 101; Study 2: n = 60; Study 3: n = 13,488. [citations=465 (GS, February 2022)].
- Critiques: Paper retracted following evidence of fraud reported by Uri, Joe and Leif, 2021 [n=NA, citations=2 (GS, March 2023)].
- Original effect size: χ2 (2, n = 101) = 12.58, p = 0.002; χ2 (1, n = 60) = 4.27, p < 0.04; F(1, 13,485) = 128.63, p < 0.001.
- Replication effect size: NA.
Social class on prosocial behaviour. Individuals from a high social class are more likely to exhibit prosocial behavior than those from a low social class, but there is a U-shaped curve between social class and prosocial behavior that sometimes appears. The final study in the critique section below reported two pre-registered replications of Piff et al., 2010 with different results. There are more studies than those described here, but these should provide a good sense of the current state of the science.
Statistics
- Status: mixed
- Original papers: ‘Volunteering in public health: An analysis of volunteers’ characteristics and activities’, Ramirez-Valles, 2006; random-digit dialling in Illinois, US, n = 609. [citations = 9 (GS, June 2022)].
- Critiques:
Gittell & Tebaldi 2006 [n=NA, citations = 161 (GS, June 2022)].
James III & Sharpe 2007 [n = 16,442 households, citations=171 (GS, June 2022)].
Piff et al. 2010 [4 experiments with Experiment 1: n = 115; Experiment 2: n = 81; Experiment 3: n = 155; Experiment 4: n = 91, citations=1572 (GS, June 2022)].
Guinote et al. 2015 [Experiment 1: n = 44; Experiment 4: n = 48 children, citations=185 (GS, June 2022)].
Chen et al. 2013 [n = 469 kindergarten children, citations=110 (GS, June 2022)].
Korndörfer et al. 2015 [8 studies, n1 = 9,260 German households, n2 = 32,090 US households, n3 = 3,975 (objective) & 3,857 (subjective) US persons, n4 = 33,072 German persons, n5 = 3,983 (objective) & n = 3,964 (subjective) US persons, n6 = 32,257 persons in 28 countries, n7 = 3,902 (objective) & n = 3,886 (subjective) US persons, n8 = 1,421 German persons, citations=238 (GS, March 2023)].
Stamos et al., 2020 [Experiment 1: n = 300, Experiment 2: n = 200, citations=31 (GS, March 2023)].
- Original effect size: Ramirez-Valles: household income on past-12-month volunteering in public health OR = 1.22; education NS OR = 1.02.
- Replication effect size: Gittell and Tebaldi: correlation between income and volunteer rate (-.13), regression coefficients for personal income (769.1) and education (29.35) on average charitable contribution per tax filer. Piff et al.: Experiment 1 - subjective SES on dictator game resource allocation: β = -.23; Experiment 2 - self-reported family income: β = -.27 and manipulated social class: β = -.23 on attitudes toward charitable giving; Experiment 3 - combined education and income on trust game with arbitrary points: r = -.18; Experiment 4 - combined past and current income on ambiguous task helping: β = -.43. Guinote et al.: Experiment 1 - manipulated department rank on picking up pens for experimenter: d = 1.16; Experiment 4 - random winner on sticker donation T1: calculated d= 0.657, losing status: ηp2 = 0.34, gaining status: ηp2 = 0.38, NS differences at T2. Chen et al.: family income on sticker allocation in dictator game: Spearman’s ρ = -.10; parents education/migrant status: NS. 
Korndörfer et al.: Experiment 1 - household objective social class for each household on self-reported donation behavior for the previous year: OR = 2.07, NS quadratic term, on relative amount of donation, both standardized score: b = .158 and its quadratic term, b = .073; Experiment 2 - household objective social class on self-reported donation behavior for the previous year: OR = 1.99, NS quadratic term, on relative amount of donation, standardized score: b = .078, NS quadratic term; Experiment 3 - Model 1, objective social class for each person on self-reported donation behavior for the previous year: OR = 2.54, NS quadratic term and frequency: b = .392, quadratic term: b = -.064; Model 2, four-category subjective social class for each person on self-reported donation behavior for the previous year: OR = 1.61, quadratic term: OR = 0.90 and frequency: b = .230, quadratic term: b = -.039; Experiment 4 - objective social class for each person on self-reported volunteering: OR = 2.03, quadratic term: OR = 0.91 and frequency: b = .336, quadratic term: b = -0.48; Experiment 5 - same models as Experiment 3 but with a volunteering outcome: Model 1: OR = 1.64, NS quadratic term and frequency: b = .248, NS quadratic term; Model 2: OR = 1.29, NS quadratic term and frequency: b = .135, NS quadratic term; Experiment 6 - Model 1, objective social class for each person on past 12 month volunteering: OR = 1.18, quadratic term: OR = 0.97 and frequency: b = 0.94, quadratic term: b = -.012; Model 2, six-category subjective social class on volunteering: OR = 1.15, NS quadratic term and frequency: b = 0.76, NS quadratic term; Experiment 7 - same models as Experiments 3 and 5 but with a single everyday helping outcome: Model 1, b = .397, NS quadratic term; Model 2: NS and NS quadratic term; Experiment 8 - objective social class for each person on behavior in a trust game, player 1: b = .468, player 2: b = .421.
Stamos et al.: d = .36 (manipulated subjective SES), opposite direction: r = -.02 (family income).
Stanford Prison Experiment. A simulated prison environment was used to examine the psychological effects of coercive situations. Utilizing role-playing, labeling and social expectations, it showed that one third of participants in the role of prison guards displayed aggressive and dehumanizing behaviour.
Statistics
- Status: NA
- Original paper: ‘Interpersonal dynamics in a simulated prison’, Haney, Banks, Zimbardo 1973; experimental and observational study, n=24. [citations = 2115 (including highly referenced publications) (GS, January 2022)].
- Critiques:
Le Texier 2019 [commentary, n=NA, citations= 38 (GS, January, 2022)].
Banuazizi & Movahedi 1975 [methodological analysis, n=NA, citations= 118 (GS, January 2022)].
Festinger 1980 [book, n=NA, citations= 132 (GS, January 2022)].
Haslam, Reicher, & Van Bavel 2019 [methodological analysis, n=NA, citations = 37 (GS, January 2022)].
Griggs & Whitehead 2014 [textbook analysis, n=NA, citations = 37 (GS, January 2022)].
Griggs 2014 [textbook analysis, n=NA, citations = 48 (GS, January 2022)].
Blum 2018 [media coverage, n=NA, citations = 31 (GS, January 2022)].
Le Texier 2020 [preprint, citations= 0 (GS, January 2022)].
Izydorczak & Wicher 2020 [preprint, citations= 0 (GS, January, 2022)].
Reicher & Haslam 2011 [experimental case study but not exact replication of SPE; n = 15, citations ~435 (GS, January 2022)].
Lovibond, Adams, & Adams 1979 [original research but not exact replication of SPE; n = 60, citations= 55 (GS, January 2022)].
- Original effect size: Key claims were insinuation plus a battery of difference in means tests at up to 20% significance(!). n = 24, data analysis on 21.
- Replication effect size: N/A. First, the study has been criticised for its lack of adherence to experimental methodology. Although the study has been widely described as an ‘experiment’, it lacks many defining features: 1) it does not define the precise set of manipulated variables, 2) it manipulates multiple variables at a time without proper control over the effects of each one, 3) it does not define the dependent variable and how it will be measured, 4) it does not state any clear hypotheses. It is noteworthy that in the original paper, the authors present their work as a “demonstration”, not an experiment. The second group of serious issues is the degree to which the researchers’ ad-hoc interventions influenced the behaviour of the participants. One of the leading researchers, Philip G. Zimbardo, took part in the experimental procedure as the prison’s “Superintendent”. Another close collaborator of the research team, David Jaffe, who initially conceived the idea of the mock-prison study, played the role of the “Warden”. Considering that these people knew the goal of the study and were, as later admitted, interested in a particular outcome (a call for reform of the prison system), their ad-hoc interventions, such as encouraging some of the guards to be more strict and ‘tough’, cast reasonable doubt on the role of experimenter expectations in the final results of the study. The third group of issues is sampling. Namely, the study was conducted on a small (n=24, n per condition = 12) and largely unrepresentative sample (all males, all college students of similar age, all residents of the United States). Also, despite the screening of the voluntarily applying candidates, it is still possible that strong ‘demand characteristics’ and ‘self-selection bias’ affected the composition of the sample. All the participants responded to a newspaper ad asking for help in a “psychological study of prison life”.
The last issue with the Stanford Prison Experiment is the interpretation of the results. Even if the discovered effect is trustworthy (and the above-mentioned issues put this into question), there is no clear theoretical interpretation of what this finding actually proves. Some critics argue that the violent behaviour of the guards may be rooted in their following of strong leadership, rather than in their immersion in an assigned social role.
Milgram experiment. A study examining the influence of authority on immoral behaviour. Participants were assigned the role of ‘teachers’ and were instructed by the experimenter to administer electric shocks of 15-450 V whenever the ‘learner’ made a mistake. There were several variants of the study. In the most basic one, 100% of participants agreed to administer a 300 V shock and 65% agreed to apply the maximum shock of 450 V.
Statistics
- Status: mixed
- Original paper: ‘Behavioral Study of obedience’, Milgram 1963; experimental study, n=40 (the full range of conditions was n=740). [citations = 8502 (GS, March 2023)].
- Critiques:
Burger 2011 [n=62 transcripts from the earlier experiment, citations= 108 (GS, March 2023)].
Perry 2012 [book, n=NA, citations= 261 (GS, March 2023)].
Brannigan 2013 [n=NA, citations= 14(GS, January 2022)].
Griggs 2016 [n=NA, citations= 28(GS, March 2023)].
Caspar 2020 [n=NA, citations= 25(GS, March 2023)].
Doliński et al. 2017 [n=80, citations= 122(GS, March 2023)].
Blass 1999 [n=NA, citations= 595(GS, March 2023)].
- Original effect size: 65% of subjects reportedly administered the maximum, dangerous voltage.
- Replication effect size: Various sources (Burger, Perry, Brannigan, Griggs, Caspar): The experiment included many researcher degrees of freedom, going off-script, implausible agreement between very different treatments, and “only half of the people who undertook the experiment fully believed it was real and of those, 66% disobeyed the experimenter.” Doliński et al.: comparable effects to Milgram. Burger: similar levels of compliance to Milgram, but the level didn’t scale with the strength of the experimenter prods. Blass: average compliance of 63%, but the included studies suffer from the usual publication bias and tiny samples. (Selection was by a student of Milgram.) The most you can say is that there’s weak evidence for compliance, rather than obedience. (“Milgram’s interpretation of his findings has been largely rejected.”)
Robbers Cave Study. Utilized arbitrary groupings to demonstrate that tribalism between groups arises spontaneously, and depending on the context, it can result in group competition (e.g., in case of scarce resources) or group cooperation (e.g., in case of superordinate goals and common obstacles).
Statistics
- Status: NA
- Original paper: ‘Superordinate Goals in the Reduction of Intergroup Conflict’, Sherif 1958; field experiment, n=22. [citations= 1,010 (GS, February 2022)]. In addition to the original paper, some related books from the author(s) are also highly cited including: ‘Groups in harmony and tension’, Sherif & Sherif 1958 [citations=2,280 (GS, February 2022)] and ‘Intergroup Conflict and Co-operation’, Sherif et al. 1961 [citations= 253 (GS, February 2022)]. Overall, the effect accounts for more than 4000 total citations including the SciAm piece.
- Critiques:
Billig 1976 in passing [book, n=NA, citations= 808 (GS, February 2022), see media mention by Haslam 2018].
Perry 2018 in passing [book, citations= 25 (GS, February 2022), see also media summary by Shariatmadari 2018 and Haslam 2018].
Tavris 2014 [n=NA, citations= 11(GS, March 2023)] also claims that the underlying “realistic conflict theory” is otherwise confirmed. No definitive conclusion can be reached.
- Original effect size: N/A. Not reported in conventional format. (Rationale: “results obtained through observational methods were cross-checked with results obtained through sociometric technique, stereotype ratings of in-groups and outgroups, and through data obtained by techniques adapted from the laboratory. Unfortunately, these procedures cannot be elaborated here.”)
- Replication effect size: N/A. Various sources (Billig, Perry, Tavris): No good evidence that tribalism arises spontaneously following arbitrary groupings and scarcity, within weeks, and leads to inter-group violence. The “spontaneous” conflict among children at Robbers Cave was orchestrated by experimenters; tiny sample (maybe 70?); an exploratory study taken as inferential; no control group; there were really three experimental groups - that is, the experimenters had full power to set expectations and endorse deviance; results from their two other studies, with negative results, were not reported. Set aside the ethics: the total absence of consent - the boys and parents had no idea they were in an experiment - or the plan to set the forest on fire and leave the boys to it.
Digital technology use and adolescent wellbeing. Adolescents who spend more time on new media (including social media and electronic devices such as smartphones) are more likely to report mental health issues.
Statistics
- Status: N/A
- Original paper: ‘Increases in depressive symptoms, suicide-related outcomes, and suicide rates among U.S. adolescents after 2010 and links to increased new media screen-time’, Twenge et al. 2018; cross-sectional survey, n=506,820. [citations= 910 (GS, February 2022)].
- Critiques:
Orben & Przybylski 2019 [n=355,358, citations=621 (GS, February, 2022)].
- Original effect size: d = .27 for the rise in the depressive symptoms among females (2010 through 2015) due to screen media use.
- Replication effect size: Orben & Przybylski: A large-scale analysis of the association between adolescent well-being and digital technology use demonstrates that screen time accounts for only 0.4% of the variation in adolescent well-being. Hence, increased screen time is not strongly associated with decreased well-being in adolescents. The median association of technology use with adolescent well-being was β=−0.035, SE=0.004.
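To put the "0.4% of the variation" figure on a more familiar scale, it can be converted into the size of a correlation. This is only a rough sketch: it assumes the share of variance explained can be read as r² from a simple two-variable association, which simplifies the analyses reported in the critique, and the variable names are illustrative.

```python
import math

# 0.4% of the variation in well-being, written as a proportion.
variance_explained = 0.004

# Assumption: read variance explained as r^2, so |r| is its square root.
implied_abs_r = math.sqrt(variance_explained)

print(round(implied_abs_r, 3))  # 0.063
```

Under that assumption the implied correlation is about 0.06 in absolute size, the same order of magnitude as the median β reported above.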
Anthropomorphism for inanimate objects. Individuals who are lonely are more likely than people who are not lonely to attribute humanlike traits (e.g., free will) to nonhuman agents (e.g., an alarm clock), to fulfill unmet needs for belongingness.
Statistics
- Status: not replicated
- Original paper: ‘Creating Social Connection Through Inferential Reproduction: Loneliness and Perceived Agency in Gadgets, Gods, and Greyhounds’, Epley et al. 2008; experimental design, n=20 for experiment 1, n=99 for experiment 2, n=57 for experiment 3. [citations=722 (GS, March 2022)].
- Critiques:
Sandstrom & Dunn, Open Science Collaboration 2015 [total n=81, citations= 6314 (GS, March 2022)].
Bartz et al. 2016 [total n=178, citations= 83 (GS, March 2023)].
- Original effect size: r=0.53.
- Replication effect size: Sandstrom & Dunn, Open Science Collaboration: the beliefs of participants in the disconnection condition were no different from the beliefs of participants in the fear and control conditions combined, t(78) = .18, p = .86. Bartz et al.: r=0.17.
Hurricane names. Female-named hurricanes are more deadly than male-named ones. Original effect size was a 176% increase in deaths, driven entirely by four outliers; reanalysis using a greatly expanded historical dataset found a nonsignificant decrease in deaths from female named storms.
Statistics
- Status: reversed
- Original paper: ‘Female hurricanes are deadlier than male hurricanes’, Jung et al. 2014; observational study, n=92 hurricanes discarding two important outliers. [citations = 113 (GS, March 2022)].
- Critiques:
Christensen 2014 [same data, citations = 114(GS, March 2022)].
Smith 2016 [same data, citations = 8(GS, March 2022)].
- Original effect size: d=0.65; 176% increase in deaths from flipping names from relatively masculine to relatively feminine.
- Replication effect size: Smith: 264% decrease in deaths (Atlantic); 103% decrease (Pacific).
Implicit bias testing for racism. Implicit bias scores poorly predict actual bias, r = 0.15. The operationalisations used to measure that predictive power are often unrelated to actual discrimination (e.g. ambiguous brain activations). Test-retest reliability of 0.44 for race, which is usually classed as “unacceptable”. This isn’t news; the original study also found very low test-criterion correlations.
Statistics
- Status: mixed
- Original paper: ‘Measuring individual differences in implicit cognition: The implicit association test’, Greenwald et al. 1998; experimental study, n=28 for Experiment 3. [citations= 16,144 (GS, March 2023)].
- Critiques:
Oswald et al. 2013 [meta-analysis of 308 experiments, citations= 900(GS, Dec 2021)].
Carlsson and Agerström, 2015 [n=NA, citations= 84(GS, Dec 2021)].
Schimmack 2021 [review paper, n=NA, citations= 101(GS, Dec 2021)].
Schimmack 2019 [review paper, n=NA, citations= 113(GS, Jan 2022)].
Forscher et al. 2019[meta-analysis n=87,418, citations= 459(GS, Jan 2022)].
Machery 2021 [review paper, n=NA, citations= 3 (GS, Jan 2022)].
- Original effect size: attitude d=0.58; r=0.12.
- Replication effect size: Oswald: stereotype IAT r=0.03 [-0.08, 0.14], attitude IAT r=0.16 [0.11, 0.21].
Pygmalion effect (Rosenthal Effect, self-fulfilling prophecy). Expectations about performance (e.g., academic achievement) impact performance. Specifically, teachers’ expectations about their students’ abilities affect those students’ academic achievement; teacher beliefs impact their behaviour which in turn impacts student beliefs and behaviour.
Statistics
- Status: not replicated
- Original paper: ‘Pygmalion in the classroom’, Rosenthal and Jacobson 1968; between-subjects experiment, N = 320. [citations = 13625 (GS, January 2023)]. ‘Teachers’ expectancies: Determinants of pupils’ IQ gains’, Rosenthal and Jacobson 1966, n around 320. [citations=881, but the popularisation has 13,792 (GS, March 2023)].
- Critiques:
Raudenbush 1984 [findings from 18 experiments, citations= 598 (GS, March 2023)].
Thorndike 1986 [review, n=NA, citations= 496(GS, March 2023)].
Spitz 1999 [review, n=NA, citations= 147(GS, March 2023)].
Jussim and Harber 2005 [review, n=NA, citations= 1,760(GS, March 2023)].
- Original effect size: Average +3.8 IQ, d=0.25.
- Replication effect size: Raudenbush: d=0.11 for students new to the teacher, tailing to d=0 otherwise. Snow: median effect d=0.035.
Stereotype threat on Asian women’s mathematical performance, i.e. the interaction between race, gender and stereotyping. This study found that Asian-American women performed better on a math test when their ethnic identity was activated, but worse when their gender identity was activated, compared with a control group who had neither identity activated.
Statistics
- Status: mixed
- Original paper: ‘Domain-specific Effects of Stereotypes on Performance’, Shih et al. 1999; two between-subjects experiments, n1=46, n2=19. [citations = 2,073 (GS, March 2023)].
- Critiques:
Gibson et al. 2014 [n=127, citations= 81(GS, March 2023)].
Moon and Roeder 2014 [n=139, citations= 50(GS, March 2023)].
- Original effect size: Asian-identity-salient > control > female-identity-salient, r=.27; Asian-identity-salient > female-identity-salient, r=.35.
- Replication effect size: Gibson et al.: No group differences, η2=.01; Asian-primed vs. female-primed, p=.18, d=.27; Including only those who were aware of the stereotypes, group accuracy p=.02, η2=.04, and the means followed the predicted pattern, Asian (M=.63), Control (M=.55), and Female (M=.51); Likewise, female-primed participants performed worse than Asian-primed participants, p=.02, d=.53. Moon & Roeder: Group accuracy, p=.44, η2=.004; female-primed and Asian-primed conditions, p=.43, d=.17; Analysing just those who were aware of the stereotype, p=.28, η2=.012; female-primed participants vs. Asian-primed participants, p=.28, d=.27.
Stereotype threat on girls’ mathematical performance. A situational phenomenon whereby priming a negative gender stereotype (e.g., “women are bad at math”) has a detrimental impact on mathematical performance.
Statistics
- Status: mixed
- Original paper: ‘Stereotype Threat and Women’s Math Performance’, Spencer et al. 1999; Experiment 2, n=30 women. [citations=5076 (GS, June 2022)].
- Critiques:
Stoet & Geary 2012 [meta-analysis, k = 23, citations= 286 (GS, March 2023)].
Flore & Wicherts 2015 [meta-analysis, n=47 measurements, citations= 357(GS, March 2023)].
Flore et al. 2018 [Registered Report, n=2064 Dutch high school students, citations= 89 (GS, March 2023)].
Agnoli et al. 2021 [conceptual replication with n = 164 ninth grade and n = 164 eleventh grade Italian high school students, citations= 6 (GS, March 2023)]. Other reported null results in the literature but not explicit replications, e.g. Ganley 2013 [n=931 across three studies, citations= 195 (GS, March 2023)].
- Original effect size: not reported; Experiment 2: Fig. 2 does not report specific values but appears to be control-group-women (M = 17, SD = 20) compared to experiment-group-women (M = 5, SD = 15), which translates to approximately d= −0.7 (calculated).
- Replication effect size: Stoet and Geary: d = −0.61 for adjusted and d = −0.17 [−0.27, −0.07] for unadjusted scores. Together, only the group of studies with adjusted scores confirmed a statistically significant effect of stereotype threat. Flore and Wicherts: g = −0.22 [−0.21, 0.06] and significantly different from zero, but g = −0.07 [−0.21, 0.06] and not statistically significant after accounting for publication bias. Flore et al.: d = −0.05 [−0.18, 0.07]. Agnoli et al.: Both estimated stereotype threat effects were nonsignificant (see also Table S22; https://osf.io/3u2jd), Z = 1.53, p = .25 for ninth grade female participants and Z = .70, p = .97 for eleventh grade female participants.
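The "approximately d = −0.7 (calculated)" figure for the original Experiment 2 can be reproduced from the means and SDs read off the figure. This is a minimal sketch, assuming equal group sizes so that the pooled SD is the root mean square of the two SDs; the function name `cohens_d_equal_n` is illustrative and is not taken from any of the cited papers.

```python
import math


def cohens_d_equal_n(mean_treat, sd_treat, mean_ctrl, sd_ctrl):
    """Cohen's d from summary statistics, assuming equal group sizes.

    With equal n, the pooled SD is the square root of the mean of the
    two variances.
    """
    pooled_sd = math.sqrt((sd_treat ** 2 + sd_ctrl ** 2) / 2)
    return (mean_treat - mean_ctrl) / pooled_sd


# Values as read from Fig. 2 in the entry above (approximate by nature):
# experiment-group women M = 5, SD = 15; control-group women M = 17, SD = 20.
d = cohens_d_equal_n(mean_treat=5, sd_treat=15, mean_ctrl=17, sd_ctrl=20)
print(round(d, 2))  # -0.68
```

The result rounds to roughly −0.7. With unequal group sizes the pooled SD would weight each variance by its degrees of freedom, so the exact value could shift slightly.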
Increase in narcissism (leadership, vanity, entitlement) in young people over the last thirty years. It’s an ancient hypothesis. The basic counterargument is that they’re misidentifying an age effect as a cohort effect (the narcissism construct apparently decreases by about a standard deviation between adolescence and retirement): “every generation is Generation Me”.
Statistics
- Status: not replicated
- Original paper: ‘The Evidence for Generation Me and Against Generation We’, Twenge 2013; review of various studies, including national surveys [citations=251 (GS, March 2022)].
- Critiques:
Donnellan and Trzesniewski [k = 5, n=477,380, citations = 432 (GS, March 2022)].
Arnett 2013 [unsystematic review, citations=171(GS, March 2022)].
Roberts 2017 [reanalysis of original data and analysis of new sample n = 476, citations=195(GS, March 2022)].
Wetzel 2017 [1990s: n = 1,166; 2000s: n = 33,647; 2010s: n = 25,412, citations=101 (GS, March 2022)]. (~660 total citations.)
Meta-analysis: Hamamura et al. 2020 [total n = 24,990, citations = 5 (GS, March 2022)].
- Original effect size: d=0.37 increase in NPI scores (1980-2010), n=49,000.
- Replication effect size: Roberts doesn’t give a d, but it’s near 0, something like d=0.03 ((15.65 - 15.44) / 6.59). Wetzel: d = -0.27 (1990 - 2010). Hamamura: d(leadership) = -0.26, d(vanity) = -0.39, d(entitlement) = -0.23.
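The back-of-the-envelope d for Roberts quoted above is simply a mean difference divided by a standard deviation. A minimal sketch of that arithmetic follows; the variable names are illustrative, and the entry does not say which group each mean belongs to.

```python
# Numbers taken from the parenthetical in the entry above.
mean_a = 15.65  # first mean in the entry
mean_b = 15.44  # second mean in the entry
sd = 6.59       # standard deviation used as the denominator

# Standardised mean difference: the raw difference expressed in SD units.
d = (mean_a - mean_b) / sd

print(round(d, 2))  # 0.03
```

A difference of about 0.03 SD is far below the d = 0.37 increase claimed in the original paper.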
Minimal group effect (Minimal group paradigm). An intergroup bias that manifests as ingroup favouritism (i.e., a tendency to prefer ingroup members) when participants are assigned to previously unfamiliar, experimentally created and largely meaningless social identities. In essence, the paradigm investigates the impact of social categorization on intergroup relations in the absence of realistic conflicts of interests, showing that mere social categorization is sufficient to produce ingroup favouritism.
Statistics
- Status: replicated
- Original paper: ‘Arousal of ingroup-outgroup bias by a chance win or loss’, Rabbie and Horwitz 1969; experimental study, n=112. [citations= 679 (GS, January 2023)].
- Critiques:
Balliet et al. 2014 [meta-analysis, k=212, citations= 930(GS, March 2023)].
Billig and Tajfel 1973 [experimental design, n=75, citations=2232 (GS, January 2023)].
Falk et al. 2014 [Japanese: n1 = 324 Japanese and Americans: n2 = 594, Americans, citations= 58(GS, March 2023)].
Fischer and Derham 2016 [meta-analysis, n = 21,266, citations=70 (GS, March 2023)].
Lazić et al. 2021 [meta-analysis, k = 69, N = 5268, citations=5 (GS, March 2023)].
Kerr et al., 2018 [n=412, citations=21 (GS, January 2023)].
Mullen et al. 1992 [meta-analysis, k = 137, citations= 1,867 (GS, March 2023)].
Tajfel 1970 [n=64, citations= 4094 (GS, January 2023)].
Tajfel et al. 1971 [n1=64, n2=48, citations=8126 (GS, January 2023)].
- Original effect size: N/A
- Replication effect size: Balliet et al.: d = 0.19 (for situations with no mutual interdependence between group members) and d = 0.42 (for situations with strong mutual interdependence between group members). Fischer and Derham: d = 0.369 [0.33, 0.41]. Mullen et al.: r = 0.264. The ingroup bias effect was obtained from a meta-analysis on 74 hypothesis tests derived from artificial groups. Lazić, Purić & Krstić: d = 0.22 [0.07, 0.38]. Kerr et al.: comparing US and Australian samples highlights the importance of context-dependent factors (like differences in methodological approach) and cultural variation of MGE; significant main effects of categorization (Group vs. No-group) on allocation measures, ηp2 = 0.031 to 0.081; the ingroup favouritism effect was present in both Context conditions, but was stronger in the public (ηp2 = 0.072) than in the private context (ηp2 = 0.020). Falk et al.: culture was a significant predictor of resource allocation such that Americans chose more in-group favouring strategies than did Japanese, b = 1.43, z = 9.52, p < .00; American participants were also more likely to show an in-group bias in group identification (in-group vs. out-group comparison, d = .94), perceived group intelligence (d = .44), and perceived group personality traits (b = .15, z = 17.51) than Japanese participants (d = .50, d = -.003, b = .04, z = 2.75, respectively).
Solomon Asch’s conformity study. The degree to which a person’s own opinions are influenced by those of a group.
Statistics
- Status: replicated
- Original paper: ‘Studies of independence and conformity: I. A minority of one against a unanimous majority’, Asch, 1956; experimental design, n = 123. [citations = 6558 (GS, October 2021)].
- Critiques:
Friend et al. 1990 [n= 99 accounts in social psychology textbooks, citations = 156 (GS, November 2021)].
Griggs 2015 [n= 20 introductory psychology textbooks and 10 introductory social psychology, citations = 12 (GS, November 2021)].
Bond and Smith, 1996 [meta-analysis, k=137, citations = 2,228 (GS, March 2023)]. Criticism focuses on the fact that textbooks exaggerate and misquote evidence of conformity and omit or diminish evidence of independence.
- Original effect size: 36.8% of the responses were incorrect (influenced by the majority). The effect has been interpreted by the author as evidence for the prevalence of independence (“The preponderance of judgments was independent, evidence that under the present conditions the force of the perceived data far exceeded that of the majority.”, Asch, 1956, p.24).
- Replication effect size: Bond and Smith: d = .92 [.89, .96], average rate of incorrect answers: 25%. Friend et al./Griggs: The majority of academic textbooks present the study as evidence for overwhelming conformity, failing to report the evidence of independent tendencies among participants. A common practice seen in many academic textbooks and popular writings is to report the value of “75%” or “76%” as the general indicator of conformity. In reality, this is the fraction of respondents who yielded to the majority in at least one of the twelve trials. The reverse framing of this value (rarely mentioned in the literature) would be 24%, the fraction of completely independent respondents, or 95%, the fraction of respondents who remained independent in at least one of the twelve trials.
Dynamic norms. Information about increasing minority norms increases interest/engagement in minority behaviour.
Statistics
- Status: mixed
- Original paper: ‘Dynamic Norms Promote Sustainable Behavior, Even if It Is Counternormative’, Sparkman and Walton, 2017; three online and two field experiments, n = 122, 306. [citations = 126, altmetric = 367 (GS, December 2021)].
- Critiques:
Aldoh et al., 2021 [n = 846, citations=1 (GS, December 2021)].
- Original effect size: d = 0.31 to d = 0.41.
- Replication effect size: Aldoh et al.: d = −0.02.
Social comparison. No robust evidence for an interaction effect between body dissatisfaction and social comparison on fat talk.
Bystander effect. The claim that the feeling of responsibility diffuses with an increasing number of other observers. Research about the bystander effect was sparked by the 1964 murder of Catherine “Kitty” Genovese. See this New York Times article for details. Here’s a more detailed resource.
Statistics
- Status: mixed
- Original paper(s): ‘Bystander Interventions in Emergencies: Diffusion of Responsibility’, Darley and Latané 1968, experiment, n = 59. [citations = 4413 (GS, August 2022)].
- Critiques:
Fischer et al. 2011 [meta-analysis, n = 7700, citations = 963 (GS, August 2022)].
- Original effect size: not reported.
- Replication effect size: Fischer et al.: Hedges’ g = -0.35; “Although the present meta-analysis shows that the presence of bystanders reduces helping responses, the picture is not as bleak as conventionally assumed. (…) bystander inhibition is less pronounced especially in dangerous emergencies.”
Colour red on attractiveness. Viewing the colour red enhances men’s attraction to women. This effect may reflect red acting as a lingua franca that carries amorous meaning in the human mating game.
Statistics
- Status: mixed
- Original paper: ‘Romantic red: Red enhances men’s attraction to women’, Elliot and Niesta 2008; experiment, N = 42. [citations=66 (GS, February 2022)].
- Critiques:
Peperkoorn et al. 2016 [n=830, citations=48 (GS, February 2022)].
Pazda et al. 2021 [experiment 1: n = 116; experiment 2: n = 230; experiment 3: n = 230, citations= 3(GS, January 2023)].
- Original effect size: Experiment 1: d = 1.11; Experiment 2: η2p = .08; Experiment 3: attractiveness: η2p = .11, sexual desire: η2p = .19, desired sexual behaviour: η2p = .13; Experiment 4: attractiveness: d = 0.73, sexual desire: d = 1.55, desired sexual behaviour: d = 1.11; Experiment 5: attractiveness: d = 0.86, sexual desire: d = 1.00, desired sexual behaviour: d = 1.11; asking someone on a date: d = 0.95; spending on a date: d = 1.35.
- Replication effect size: Peperkoorn et al.: Study 1: η2p= .03 (in support of white more attractive than red); Study 2: F = .07; Study 3: d = −.12. Pazda et al.: Experiment 1: sexually receptive: d = .42, attractive: d = .30, sexually appealing d = .47; Experiment 2: sexually receptive: d = .25, attractive: d = .16, perceptions of sexually appealing d = .23; Experiment 3: sexually receptive: d = .75, attractive: d = .54, sexually appealing d = .63.
Big brother effect. Being watched makes someone more likely to cooperate.
Imagined Contact - Bias. Imagining social contact (instead of having actual contact) with someone from an outgroup (based on e.g., ethnicity, sexuality, religion, age) can reduce intergroup bias.
Statistics
- Status: mixed
- Original paper:
‘Imagining intergroup contact can improve intergroup attitudes’, Turner et al. 2007; three experiments, Study 1: N = 28, Study 2: N = 24, Study 3: N = 27. [citations = 633 (GS, October 2022)].
- Critiques:
Firat and Ataca 2020 [N = 335, citations = 9 (GS, October 2022)].
Hoffarth and Hodson 2016 [Study 1: N = 261, Study 2: N = 320, citations = 36 (GS, October 2022)].
Miles and Crisp 2014 [meta-analysis, k = 71, N = 5,770, citations = 450 (GS, October 2022)].
- Original effect size: Study 1: d = .42. Study 2: ηp² = 0.20. Study 3: d= 0.86 (as calculated for this entry, using
Lakens’ tool).
- Replication effect size: Firat and Ataca: ηp2 = .01. Hoffarth and Hodson: Study 1 (concerning gay people): many outcomes, largest β = .10; Study 2 (concerning Muslims): many outcomes, largest β = .095. Miles and Crisp: overall d = .35 [0.26, 0.44].
Imagined Contact - Intentions. The claim that imagining social contact (instead of having actual contact) with someone from an outgroup (based on e.g., ethnicity, sexuality, religion, age) can increase contact intentions.
Statistics
- Status: mixed
- Original paper:
‘Elaboration enhances the imagined contact effect’, Husnu and Crisp 2010; two experiments, Study 1: n = 33, Study 2: n = 60. [citations = 278 (GS, October 2022)].
- Critiques:
Klein et al. 2014 Many Labs study [n = 6344, citations = 1082 (GS, June 2022)];
Crisp et al. 2014 [citations = 16 (GS, October 2022)] reply to Klein et al. stating that the effect size was significant and comparable to that obtained in the
Miles and Crisp 2014 [citations = 450 (GS, October 2022)] meta-analysis for the relevant outgroup, suggesting that the Many Labs project may provide stronger evidence than originally thought.
- Original effect size: Study 1: d= 0.86, Study 2: d= 1.13.
- Replication effect size: Klein et al.: d= 0.13 [0.00, 0.19] (NB: original study focused on ‘British Muslims’ - this on Muslims across cultures). Miles and Crisp: d= 0.35 and estimate for religious groups, d= 0.22. Crisp et al.: the observed effect size of 0.13 in the Many Labs study is substantially different from the original Husnu and Crisp study, and from our overall estimate of 0.35, but not from the most appropriate comparison: The meta-analytic estimate for religious outgroups (0.22).
Stereotype susceptibility effects. Awareness of stereotypes about a person’s in-group can affect a person’s behaviour and performance when they complete a stereotype-relevant task.
Positive mood-boost helping effect. People are more likely to do good when feeling good.
Statistics
- Status: mixed
- Original paper:
Isen and Levin 1972; experiment, Experiment 1: n = 52 male undergraduates, Experiment 2: n = 41 adults. [citations=1,881 (GS, October 2022)].
- Critiques:
Batson et al. 1979 [n = 40, citations=132 (GS, June 2022)].
Blevins and Murphy 1974 [n = 51, citations=50 (GS, October 2022)].
Carlson et al. 1988; meta-analysis [k = 61 from 34 papers (N not reported), citations = 862(GS, March 2023)].
Weyant and Clark 1977 [Study 1 n = 64, Study 2 n = 106, citations=39 (GS, October 2022)]. Failed replications:
Job 1987 [n=100 letters placed under the windshield wipers of cars, citations=38(GS, March 2023)].
- Original effect size, calculated: Study 1: OR = 2.25, Study 2: OR = 168 [no typo, both
calculated].
- Replication effect size: Batson et al.: OR = 4.3 [calculated]. Carlson et al.: d= .54 [reported]. Weyant & Clark: Study 1: OR = 4.2 (calculated, between dime and no-dime, excl. 2 other conditions), Study 2: OR = 0.7 [calculated]. Blevins & Murphy: OR = 0.9 [calculated]. Job: negative mood increases helping behaviour
so that control vs neutral might be insufficient.
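The "calculated" odds ratios above can be reproduced from a 2×2 table of helpers and non-helpers. The sketch below is illustrative only: the cell counts are an assumption (this entry does not list them), chosen because they add up to the n = 41 given for Experiment 2 and reproduce the OR of 168. The function name `odds_ratio` is hypothetical.

```python
def odds_ratio(helped_treat, not_helped_treat, helped_ctrl, not_helped_ctrl):
    """Odds of helping in the treatment group divided by the odds in the control group."""
    denominator = not_helped_treat * helped_ctrl
    if denominator == 0:
        # An empty cell makes the ratio undefined without a continuity correction.
        raise ValueError("zero cell: odds ratio is undefined")
    return (helped_treat * not_helped_ctrl) / denominator


# Assumed counts (not stated in this entry): 14 of 16 helped in the treatment
# condition, 1 of 25 helped in the control condition.
or_experiment_2 = odds_ratio(14, 2, 1, 24)
print(or_experiment_2)
```

With a single helper in the control condition the estimate is extremely unstable, which is one reason an OR of this size should be read with caution.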
Superiority-of-unconscious decision-making effect (deliberation without attention effect). While conscious reflection produces better choices on simple tasks, complex choices “should be left to unconscious thought”.
Statistics
- Status: mixed
- Original paper: ‘
On Making the Right Choice: The Deliberation-Without-Attention Effect’, Dijksterhuis et al. 2005; two studies (Study 1: n = 80, Study 2: n = 59) that show better choices (and two surveys that show greater satisfaction, which are not the focus here). [citations = 1807 (GS, October 2022)].
- Critiques: Meta-analysis:
Acker 2008 [n=888 across 17 studies, citations=233(GS, November 2022)]. Meta-analysis:
Nieuwenstein et al. 2015 [n=4518 across 67 studies, citations=103(GS, November 2022)].
- Original effect size: All reported in Acker: Study 1: ηp2 = 0.06 / g = 0.434; Study 2: ηp2 = 0.11 / g = 0.242, both for the interaction between choice complexity and deliberation. Main effects and descriptives not reported.
- Replication effect size: All reported in Acker: Acker: g = 0.471. Ham et al.: g = 0.883 to g = 1.055. Lerouge: g = -0.064 to g = 1.116. Newell et al.: g = -0.504 to g = 0.722. Payne et al.: g = -0.483 to g = 0.722. Phillips et al.: g = -0.251. The mean effect size was g = .251. All reported in Nieuwenstein et al.: Abadie et al.: g = -0.62 to g = 0.22. Aczel et al.: g = -0.35. Ashby et al.: g = -0.21 to g = 1.00. Bos et al.: g = -0.10 to g = 1.48. Calvillo and Penaloza: g = -0.29 to g = -0.09. Dijksterhuis: g = 0.24 to g = 0.42. Dijksterhuis et al.: g = 0.70 to g = 0.86. González et al.: g = 0.00. Hasford: g = 0.43. Hess et al.: g = -0.14. Huizenga et al.: g = -0.50 to g = -0.33. Lassiter et al.: g = 0.27 to g = 0.51. Lerouge: g = 0.38 to g = 0.47. McMahon et al.: g = 0.62 to g = 0.67. Messner et al.: g = 0.63. Newell et al.: g = -0.50 to g = 0.17. Newell and Rakow: g = -0.37 to g = 0.31. Nieuwenstein and Van Rijn: g = -0.74 to g = 0.87. Nieuwenstein et al.: g = -0.01. Nordgren et al.: g = 0.27 to g = 0.36. Payne et al.: g = -0.10. Queen and Hess: g = -0.21. Rey et al.: g = 0.27. Smith et al.: g = 0.25 to g = 0.32. Strick et al.: g = 0.58 to g = 1.21. Thorsteinson and Withrow: g = 0.18 to g = 0.34. Usher et al.: g = 0.78 to g = 1.04. Waroquier et al.: g = -0.09 to g = 0.35. Pooled effect size of g = 0.15 [0.03, 0.26].
Behavioural-consequences-of-automatic-evaluation (affective compatibility effect). Automatic classification of stimuli as either good or bad has direct behavioural consequences. Automatic evaluation results directly in behavioural predispositions toward the stimulus, such that positive evaluations produce immediate approach tendencies, and negative evaluations produce immediate avoidance tendencies.
Statistics
- Status: mixed
- Original paper: ‘
Consequences of Automatic Evaluation: Immediate Behavioral Predispositions to Approach or Avoid the Stimulus’, Chen and Bargh 1999; two mixed design experiments, Study 1: n = 42, Study 2: n = 50. [citations = 1943 (GS, October 2022)].
- Critiques:
Rotteveel et al. 2015 [Study 1: n=100, Study 2: n=50, citations = 35(GS, October 2022)]. Meta-analysis:
Phaf et al. 2014 [N=1538 across 29 studies, citations=271(GS, October 2022)].
- Original effect size: Study 1 (conscious evaluation) – congruence factor main effect ηp2 = 0.168 / d = 0.44 [ηp2 calculated from reported F statistic and converted using this
conversion]; Study 2 (automatic evaluation) – congruence factor main effect ηp2 = 0.078 / d = 0.29 [ηp2 calculated from reported F statistic and converted using this
conversion].
- Replication effect size: Rotteveel et al.: Study 1 – Evaluative judgement × Lever movement interaction effect ηp2 = 0.030 [reported, non-significant] / d = 0.17 [converted using this
conversion], Study 2 – Affective valence × Lever movement interaction effect ηp2 = 0.057 [reported, marginally significant] / d = 0.24 [converted using this
conversion]. Phaf et al.: Positive emotions – The average effect size differed significantly from zero for explicit instructions to evaluate (g = 0.287; p = 0.0001; 95% CI = 0.204, 0.369) and for explicit-converted instructions (g= 0.287; p= 0.0001; 95% CI = 0.146, 0.429), but not for implicit instructions (g= 0.028; p= 0.572) [all reported]. Negative emotions – Effect sizes differed significantly from zero for explicit-converted instructions (g= 0.389; p= 0.001; 95% CI = 0.155, 0.624) and for explicit instructions (g= 0.249; p = 0.0001; 95% CI = 0.159, 0.339), but not for implicit instructions (g= 0.103; p= 0.0959) [all reported]. Both emotions – The average effect size differed significantly from zero for explicit-converted instructions (g= 0.433; p = 0.0001; 95% CI = 0.295, 0.571) and explicit instructions (g= 0.403; p = 0.0001; 95% CI = 0.286, 0.521), but not for implicit instructions (g= 0.076; p= 0.148) [all reported].
Self-control relies on glucose effect. Acts of self-control decrease blood glucose levels; low levels of blood glucose predict poor performance on self-control tasks; initial acts of self-control impair performance on subsequent self-control tasks, but consuming a glucose drink eliminates these impairments.
Statistics
- Status: mixed
- Original paper:
‘Self-control relies on glucose as a limited energy source: Willpower is more than a metaphor’, Gailliot et al. 2007; 9 experiments with: Study 1 (self-control decreases blood glucose): n= 103; Study 2 (self-control decreases blood glucose): n= 37; Study 3 (low levels of blood glucose predict poor performance on self-control tasks): n= 15; Study 4 (low levels of blood glucose predict poor performance on self-control tasks): n= 10; Study 5 (low levels of blood glucose predict poor performance on self-control tasks): n= 19; Study 6 (low levels of blood glucose predict poor performance on self-control tasks): n= 15; Study 7 (glucose consumption): n= 61; Study 8 (glucose consumption): n= 72; Study 9 (glucose consumption): n= 17. [citations=1956(GS, June, 2022)].
- Critiques: Meta-analysis:
Hagger et al. 2010 [citations= 2638 (GS, June, 2022)].
Lange and Egger 2014 [n = 70, citations = 114 (GS, June 2022)]. Lange and Egger also point out statistical mistakes in the meta-analysis of Hagger et al.
- Original effect size: Study 1 (self-control decreases blood glucose): ηp2 = 0.057 [calculated from the reported F(1, 100) = 6.08 using this
conversion]; Study 2: discussing a sensitive topic with a member of a different race used up a significant amount of glucose among people with low scores on the Internal Motivation to Respond Without Prejudice scale (IMS), b = -3.28; Study 3 (low levels of blood glucose predict poor performance on self-control tasks): r= -0.62, Study 4 (low levels of blood glucose predict poor performance on self-control tasks): r= 0.56, Study 5 (low levels of blood glucose predict poor performance on self-control tasks): r= 0.45, Study 6 (low levels of blood glucose predict poor performance on self-control tasks): r= 0.43; Study 7 (glucose consumption): ηp2 = 0.081 [calculated], Study 8 (glucose consumption): ηp2 = 0.073 [calculated], Study 9 (glucose consumption): d= 1.518 [calculated].
- Replication effect size: Hagger et al.: for glucose consumption: d = 0.75 (includes the original study); for decrease of blood glucose levels: d= -0.87 (includes the original study). Lange & Egger: for glucose consumption: ηp2 = 0.02.
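Several entries derive ηp2 from a reported F statistic. A minimal sketch of the standard relation, ηp2 = (F × df_effect) / (F × df_effect + df_error), is shown below using the F(1, 100) = 6.08 quoted for Study 1 above. The function name is hypothetical, and the linked conversion tool itself is not reproduced here.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    if df_effect <= 0 or df_error <= 0:
        raise ValueError("degrees of freedom must be positive")
    explained = f_value * df_effect
    return explained / (explained + df_error)


# Study 1 of the original paper: F(1, 100) = 6.08, as quoted in the entry above.
eta_p2 = partial_eta_squared(6.08, 1, 100)
print(round(eta_p2, 3))  # 0.057
```

The result agrees with the ηp2 = 0.057 listed for Study 1.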
Physical warmth promotes interpersonal warmth. Exposure to physical warmth will lead to more positive judgments of strangers and an increase in prosocial behaviour (e.g., gift-giving).
Statistics
- Status: not replicated.
- Original paper:
‘Experiencing physical warmth promotes interpersonal warmth’, Williams and Bargh 2008; between-subjects experiments, n1=41, n2=53 [citations = 1,894 (GS, October 2022)].
- Critiques:
Chabris et al. 2018 [Experiment 1 (attempted to replicate Experiment 1 of Williams and Bargh 2008): n = 128, Experiment 2 (attempted to replicate Experiment 2 of Williams and Bargh 2008): n = 177, citations = 53 (GS, October 2022)].
Lynott et al. 2014 [Sample 1: n = 306 (Ohio, USA), Sample 2: n = 250 (Michigan State University, USA), Sample 3: n = 305 (University of Manchester, UK), citations = 140 (GS, October 2022)] (Note: All samples attempted to replicate Experiment 2 of Williams and Bargh 2008).
- Original effect size: Experiment 1 (estimated from test-statistic): d = 0.65 (people tended to give more positive ratings after holding a warm drink), Experiment 2 (converted from
Lynott et al. 2014’s OR reported for this study): d = 0.65 (people were more likely to give a gift to a friend than themselves after holding a warming pad).
- Replication effect size: Chabris et al.: Experiment 1: d = -0.06 (not replicated, converted from r statistic reported), Experiment 2: d = 0.04 (not replicated, converted from r statistic reported). Lynott et al.: Sample 1: d = -0.27 (opposite direction, converted from OR reported in paper), Sample 2: d = -0.05 (not replicated, converted from OR reported in paper), Sample 3: d = -0.14 (not replicated, converted from OR reported in paper).
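The d values above are described as converted from an OR or from an r statistic. The entry does not say which formulas were used, so the sketch below is an assumption: it uses two common textbook conversions, d = ln(OR) × √3 / π (the logit method) and d = 2r / √(1 − r²). The function names and the input values are hypothetical and chosen only for illustration.

```python
import math


def d_from_odds_ratio(odds_ratio):
    """Approximate Cohen's d from an odds ratio (logit method)."""
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return math.log(odds_ratio) * math.sqrt(3) / math.pi


def d_from_r(r):
    """Approximate Cohen's d from a correlation coefficient."""
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return 2 * r / math.sqrt(1 - r ** 2)


# Hypothetical inputs, not taken from the papers cited above.
print(d_from_odds_ratio(1.0))  # an OR of 1 means no effect, so d is 0.0
print(d_from_r(0.5))
```

Both conversions preserve the sign of the effect: an OR below 1 or a negative r yields a negative d, which is how an "opposite direction" result shows up after conversion.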
Power impairs perspective-taking effect. Individuals made to feel high in power were more likely to inaccurately assume that others view the social world from the same perspective as they do.
Statistics
- Status: not replicated
- Original paper: ‘
Power and Perspectives Not Taken’, Galinsky et al. 2006; 3 between-subjects experiments, each with two conditions, Experiment 1: n = 57, Experiment 2a: n = 42, Experiment 2b: n = 51, Experiment 3: n = 70. [citations = 1550 (GS, June 2022)].
- Critiques: Experiment 2a:
Ebersole et al. 2016 [n = 2,969, citations = 438 (GS, June 2022)].
- Original effect size: d = .77 [0.12,1.41] obtained from Ebersole et al. (2016).
- Replication effect size: Ebersole et al.: d = .03 [−0.04, 0.10].
Status-legitimacy effect. Members of low-status, disadvantaged, and marginalised groups are more likely to perceive their social systems as legitimate than their high-status and advantaged counterparts under certain circumstances. People who are most disadvantaged by the status quo, due to the greatest psychological need to reduce ideological dissonance, are most likely to support, defend, and justify existing social systems, authorities, and outcomes.
Statistics
- Status: mixed
- Original paper: ‘
Social inequality and the reduction of ideological dissonance on behalf of the system: evidence of enhanced system justification among the disadvantaged’, Jost et al. 2003; five cross-sectional / correlational studies, Study 1: n = 1345, Study 2: n = 2485, Study 3: n = 1396, Study 4: n = 2223, Study 5: n = 788. [citations = 927 (GS, October 2022)].
- Critiques:
Brandt 2013 [n=151,794, citations=271(GS, October 2022)].
Caricati 2017 [n=38,967, citations=50(GS, October 2022)].
Henry and Saul 2006 [n=356, citations=156(GS, October 2022)].
- Original effect size: Study 1 – effect of income, b = -0.22, race (European Americans vs. African Americans), b = -0.73, and education, b = -0.30, on willingness to limit the press; effect of income, b = -0.31, race (European Americans vs. African Americans), b = -1.01, and education, b = -0.38, on the attitudes of the rights of citizens; Study 2 – effect of income, b = 0.06, and education, b = -0.08, on trust in government officials among Latinos; Study 3 – effect of income on belief that large income differences are necessary to get people to work hard, b = 0.04, and as an incentive for individual effort, b = 0.02; Study 4 – main effects of region (North vs. South), ηp2 = 0.128 / d = 0.38, and income, ηp2 = 0.09 / d = 0.31, on meritocratic beliefs among African Americans [ηp2 calculated from the reported F statistic and converted using this
conversion], Study 5 – effect of socio-economic status, b = -0.34, and race (White versus Black), b = -0.25, on legitimation of income inequality.
- Replication effect size: Henry and Saul: group status effects on the support for of the dissent, ηp2 = 0.019 / d = 0.14, government approval, ηp2 = 0.024 / d = 0.16, and alienation from government, ηp2 = 0.024 / d = 0.16 [ηp2 calculated from the reported F statistic and converted using this
conversion] (replicated). Caricati: effects of the top-bottom self-placement, b = 0.117, social class, b = 0.075, and personal income, b = 0.022, on perceived fairness of income distribution [all significant, reversed].
Brandt: effects of income on trust in government and confidence in societal institutions in various multilevel regression models b= -0.014 to b= 0.005 [all non-significant, not replicated]; effects of education on trust in government and confidence in societal institutions in various multilevel regression models b= -0.044 [significant, replicated] to b= 0.021 [significant, reversed]; effects of social class on trust in government and confidence in societal institutions in various multilevel regression models b= 0.055 [significant, reversed] to b= 0.110 [significant, reversed]; effects of race on trust in government and confidence in societal institutions in various multilevel regression models b= -0.019 [non-significant, not replicated] to b= 0.017 [significant, reversed]; Overall, only one effect out of the 14 was supportive, six effects were significant and positive (reversed) and the remaining seven effects were not significantly different from zero.
Red impairs cognitive performance. The colour red impairs performance on achievement tasks, as red is associated with the danger of failure and evokes avoidance motivation.
Reduced prosociality of high SES effect. Higher socioeconomic status predicts decreased prosocial behaviour. Affluence may be linked with reduced empathy and poverty may be linked with increased empathy.
Statistics
- Status: mixed
- Original paper: ‘
Having less, giving more: the influence of social class on prosocial behavior’, Piff et al. 2010; correlational and experimental design: self-report and behavioural measure of altruism, total N = 394. [citations=1633(GS, October 2022)].
- Critiques:
Andreoni et al. 2021 field experiment [n=360, citations=27(GS, October 2022)].
Stamos et al. 2020, preregistered replications [Study 1 n=300, Study 2 n=200, citations=25(GS, October 2022)].
- Original effect size: mean r= −0.215.
- Replication effect size: Andreoni et al.: mean r=.37 (reversed). Stamos et al.: r =0.01 (non-significant).
Moral licensing effect (self-licensing, moral self-licensing, licensing effect) is the effect that acting in a moral way makes people more likely to excuse and perform subsequent immoral, unethical, or otherwise problematic behaviours.
Statistics
- Status: not replicated
- Original paper:
‘Sinning Saints and Saintly Sinners’, Sachdeva et al. 2009; three experiments using a priming-task where participants write a story about themselves using neutral/negative/positive traits, US student sample, Study 1 & 3: n = 46. [citations=919 (GS, June 2022)].
- Critiques:
Blanken et al. 2014 (direct replication of 2 of the original studies, 3 replication studies with 2 different populations) [Study 1: n = 105, Study 2: n = 150, Study 3: n = 940, citations = 81(GS, June 2022)].
Blanken et al. 2015 [meta-analysis, total n = 7,397, citations = 470(GS, June 2022)].
Simbrunner and Schlegelmilch 2015 [meta-analysis, k = 106 (n data points not reported), citations = 37(GS, June 2022)].
Kuper and Bott 2019 [re-analysis of the meta-analyses above, adjustment for publication bias, k = 76, citations = 27 (GS, June 2022)].
Urban et al. 2019 [failed conceptual replication of
Mazar and Zhong 2010, moral licensing in the domain of environmental behaviour, 3 studies, total n = 1274, citations = 62(GS, March 2023)].
Rotella and Barclay 2020 [failed pre-registered conceptual replication of the effect, n = 562, citations = 21(GS, March 2022)].
- Original effect size: Study 1: d = 0.62 [-0.11, 1.35]; Study 3: d = 0.59 [-0.12, 1.30] (effect sizes taken from replication paper by Blanken et al.).
- Replication effect size: Blanken et al.: replication Study 1 (Dutch student sample): d = -0.03 [-0.51, 0.45]; replication Study 2 (Dutch student sample): d = -0.31 [-0.70, 0.08]; replication Study 1 & 3 (US MTurk sample): d = 0.05 [-0.15, 0.25]. Blanken et al.: meta-analysis, mean effect of d = 0.31 [0.23, 0.38]. Kuper and Bott: adjusted effect sizes: d= -0.05 (PET-PEESE) and d= 0.18 (3-PSM). Simbrunner and Schlegelmilch: mean effect of d = 0.319 [0.229, 0.408].
Colour on approach/avoidance. Red (versus blue) colour induces primarily an avoidance (versus approach) motivation and enhances performance on a detail-oriented task, whereas blue enhances performance on a creative task.
Statistics
- Status: not replicated
- Original paper: ‘
Blue or Red? Exploring the Effect of Color on Cognitive Task Performances’, Mehta and Zhu 2009; six studies, studies 1-5 between-subject experiments, study 6 correlational, Study 1: n = 69, Study 2: n = 208, Study 3: n = 118, Study 4: n = 42, Study 5: n = 161, Study 6: n = 68. [citations=1003 (GS, November 2022)].
- Critiques:
Steele et al. 2010 direct replication of Mehta and Zhu Study 1 [n=172, citations=2(GS, November 2022)].
Steele et al. 2013 direct replication of Mehta and Zhu Study 1 [n=263, citations=2(GS, November 2022)].
Steele 2014 direct replication of Mehta and Zhu Study 1 [n=263, citations=45(GS, November 2022)].
- Original effect size: Study 1 – blue versus red condition comparison for approach-related anagrams, d = 0.81, and for avoidance-related anagrams d = 0.96; Study 2 – blue versus red condition on detail-oriented task, d = 0.64, and on creative task, d = 0.6; Study 3 – blue versus red condition on detail-oriented task, d = 1.05, and on creative task, d = 0.69; Study 4 – blue versus red condition on the practicality of the designed toy, d = 0.64, and on the originality/novelty of the designed toy, d = 0.67; Study 5 – blue versus red condition on detail-oriented processing style, d = 0.42, and creative thinking, d = 0.56.
- Replication effect size: Steele et al.: colour by word-type interaction ηp2 = 0.038 / d = 0.20 [ηp2 calculated from the reported F statistics and converted using this
conversion] (not replicated). Steele et al.: colour by word-type interaction ηp2 = 0.014 / d = 0.12 [ηp2 calculated from the reported F statistics and converted using this
conversion] (not replicated). Steele: The colour X word type interaction ηp2 = 0.007 [reported] / d = 0.083 [converted using this
conversion] (not replicated).
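The conversion tool linked in these entries is not reproduced here, so the formula below is an inference rather than the documented method. The ηp2 / d pairs listed above are numerically consistent with √(ηp2 / (1 − ηp2)), which is the usual expression for Cohen's f. Under the common two-group relation d = 2f the values would be twice as large, so the listed d values are best read with that caveat in mind. The function name is hypothetical.

```python
import math


def root_eta_ratio(eta_p2):
    """sqrt(eta_p2 / (1 - eta_p2)); this matches the 'd' values paired with eta_p2 above.

    Assumption: inferred from the numbers in the entries, not from the linked tool.
    """
    if not 0 <= eta_p2 < 1:
        raise ValueError("partial eta squared must lie in [0, 1)")
    return math.sqrt(eta_p2 / (1 - eta_p2))


# Partial eta squared values paired with the d values listed in the entry above.
for eta_p2, listed_d in [(0.038, 0.20), (0.014, 0.12), (0.007, 0.083)]:
    print(eta_p2, round(root_eta_ratio(eta_p2), 3), listed_d)
```

Small discrepancies in the third decimal are expected, because the listed ηp2 values are themselves rounded.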
Playboy Effect. Men exposed to erotic images of the opposite sex will report lower ratings of love for their partner and lower ratings of their partner’s sexual attractiveness compared to men exposed to abstract art. This effect was not found in women in either the original study or the replication attempts.
Statistics
- Status: not replicated
- Original paper:
‘Influence of Popular Erotica on Judgments of Strangers and Mates’, Kenrick et al. 1989; between-subjects design, Experiment 2: n = 30. [citations = 399 (GS, October 2022)].
- Critiques:
Balzarini et al. 2017 [Experiment 1: n = 124, Experiment 2: n = 170, Experiment 3: n = 121, meta-analysis n = 445, citations = 37 (GS, October 2022)].
- Original effect size: Reduced sexual attraction to partner (d= 1.05), reduced love for partner (d= 0.77). Effect sizes estimated from test-statistics reported in paper.
- Replication effect size: Balzarini et al.: Sexual attraction to partner: Experiment 1: d= 0.07 [-0.29, 0.42] (not replicated); Experiment 2: d= -0.10 [-0.40, 0.20] (opposite direction, but non-significant); Experiment 3: d= -0.15 [-0.51, 0.21] (opposite direction, but non-significant); Meta-analysis: d= 0.02 [-0.21, 0.24] (not replicated); Love for partner: Experiment 1: d= -0.19 [-0.55, 0.16] (opposite direction, but non-significant); Experiment 2: d= -0.10 [-0.40, 0.20] (opposite direction, but non-significant); Experiment 3: d= 0.16 [-0.20, 0.52] (not replicated); Meta-analysis: d= 0.02 [-0.22, 0.26] (not replicated). (Note: For the effects reported, only male effects are considered given these were the only significant effects found. As such, the number of subjects reported for the studies and the effect sizes account for only the male participants.)
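The outcome labels attached to each estimate above follow a simple pattern based on the sign of d and whether the confidence interval contains zero. The sketch below encodes that pattern as an assumption inferred from this entry; it is not a rule stated by the resource, and the function name and the "replicated" / "reversed" branches are illustrative.

```python
def label_replication(d, ci_low, ci_high, original_sign=1):
    """Label a replication estimate relative to the direction of the original effect."""
    excludes_zero = ci_low > 0 or ci_high < 0
    opposite = d * original_sign < 0
    if excludes_zero:
        return "reversed" if opposite else "replicated"
    if opposite:
        return "opposite direction, but non-significant"
    return "not replicated"


# Sexual attraction to partner, values as listed in the entry above.
print(label_replication(0.07, -0.29, 0.42))   # Experiment 1
print(label_replication(-0.10, -0.40, 0.20))  # Experiment 2
print(label_replication(0.02, -0.21, 0.24))   # Meta-analysis
```

Applied to the values listed for this entry, the function returns the same labels as the text.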
Self-protective subjective temporal distance effect. Participants reported that negative events in their own lives felt farther away than positive events in their own lives, and this effect was stronger for participants higher in self-esteem.
Statistics
- Status: not replicated
- Original paper: ‘
It feels like yesterday: Self-esteem, valence of personal past experiences, and judgments of subjective distance’, Ross and Wilson 2002; Three studies: Study 1: N = 557 was a correlational study; Study 2: N = 357 was an experiment with two main predictors: recalled grade condition (best vs. worst; between-subjects) and self-esteem (measured); Study 3: N = 107 was an experiment with three main predictors: agent (self vs. acquaintance; between-subjects), valence of recalled experience (positive vs. negative; between-subjects), and self-esteem (measured). [citations = 462 (GS, June 2022)]. Study 2 was the one Many Labs 3 replicated.
- Critiques:
Ebersole et al. 2016 [n = 3433, citations = 438 (GS, June 2022)].
- Original effect size: ηp2 = .0185 (based on transforming from beta of -.136).
- Replication effect size: Ebersole et al.: ηp2 = .0001.
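The original effect size above is described as transformed from a beta of -.136. The entry does not give the formula; squaring the standardized coefficient reproduces the listed value, so the sketch below assumes that transformation. Treating β² as variance explained is only an approximation, and the function name is hypothetical.

```python
def eta_squared_from_beta(beta):
    """Approximate variance explained as the square of a standardized coefficient.

    Assumption: this is inferred from the numbers in the entry, not a stated method.
    """
    return beta ** 2


print(round(eta_squared_from_beta(-0.136), 4))  # 0.0185
```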
Trait loneliness hot shower effect. People self-regulate their feelings of social warmth (connectedness to others) through applications of physical warmth of shower or bath, without explicit awareness of this substitution. Loneliness as a form of “social coldness” can be relieved by applying physical warmth.
Statistics
- Status: not replicated
- Original paper: ‘
The substitutability of physical and social warmth in daily life’, Bargh and Shalev 2012; 4 experiments, n=403 across 4 experiments. [citations=414(GS, October 2022)].
- Critiques:
Donnellan et al. 2014 replicated Study 1 [n=3073 across 9 studies, citations=104 (GS, October 2022)]. See also reply to Donnellan et al. 2014 by
Shalev and Bargh 2015 [n=555 across three samples, citations=6 (GS, October 2022)].
Wortman et al. 2014 replicated study 2 [n=260, citations=19(GS, October 2022)].
- Original effect size: r = .57 (Study 1a; n=51) and r = .37 (Study 1b; n =41)
- Replication effect size: Donnellan et al.: r = -.01 to .10 (but statistically indistinguishable from zero). Shalev and Bargh: loneliness-warmth index correlation for showering r = .143 and for baths r = .093 (replicated). Wortman et al.: warm vs. cold condition d = 0.02 [reported, non-significant].
American flag priming boosts Republican support. Subtle exposure to the American flag causes people to report more conservative, Republican beliefs and attitudes.
Statistics
- Status: not replicated
- Original paper: ‘
A Single Exposure to the American Flag Shifts Support Toward Republicanism up to 8 Months Later’, Carter et al. 2011; two experiments; Experiment 1: N = 235 in Session 1 (exposure to prime), 197 in Session 2 (appx. two weeks later, right before the 2008 presidential election), 191 in Session 3 (the week after this election), 75 in Session 4 (eight months after this election); Experiment 2: N = 70. [citations = 197 (GS, June 2022)].
- Critiques:
Klein et al. 2014 [n = 6344, citations = 1082 (GS, June 2022)].
- Original effect size: d= .50.
- Replication effect size: Klein et al.: median d = .02.
Superstition boosts performance effect. The irrational belief that certain objects (e.g., lucky charms) or beliefs (e.g., religion) will benefit performance in a task.
Statistics
- Status: not replicated
- Original paper: ‘
Keep Your Fingers Crossed!: How Superstition Improves Performance’, Damisch et al. 2010; between-subjects, Experiment 1: n = 28, Experiment 2: n = 51, Experiment 3: n = 41, Experiment 4: n = 31. [citations = 386 (GS, February 2023)].
- Critiques:
Aruguete et al. 2012 [Experiment 1: n = 141, Experiment 2: n = 139, citations = 12 (GS, October 2022)].
Calin-Jageman and Caldwell 2014 [Experiment 1: n = 124, Experiment 2: n = 111, Meta-analysis: n = ~719 participants, k = 11 studies, citations = 42 (GS, October 2022)].
Dickhäuser et al. 2020 [Experiment 1: n = 101, Experiment 2: n = 175, citations = 0 (GS, October 2022)].
Lee et al. 2011 [n = 40, citations = 78 (GS, October 2022)].
- Original effect size: Experiment 1: d = 0.83; Experiment 2: d = 0.72 to d = 0.98; Experiment 3: d = 0.66; Experiment 4: d = 0.77.
- Replication effect size: Aruguete et al.: Experiment 1: d = -0.07 (estimated from test-statistic regarding logical reasoning test) (not replicated); Experiment 2: ηp2 = 0.01 (estimated from test-statistic regarding logical reasoning test before exploratory analysis) (not replicated). Calin-Jageman and Caldwell: Experiment 1: d = 0.05 (not replicated); Experiment 2: d = 0.05 (not replicated); Meta-analysis: d = 0.40 [0.14, 0.65] (notably, this is heavily biased by the effect size estimates of Damisch et al., 2010). Dickhäuser et al.: ES = NA. Unable to access article, but the abstract suggests both studies failed to replicate the Damisch et al. (2010) effect (not replicated). Lee et al. (Supplementary Materials): d = 0.74 (replicated).
Unethicality darkens perception of light (El Greco fallacy). Recalling abstract concepts such as evil (as exemplified by unethical deeds) and goodness (as exemplified by ethical deeds) can influence the sensory experience of the brightness of light. Recalling unethical behaviour led participants to see the room as darker and to desire more light-emitting products (e.g., a flashlight) compared to recalling ethical behaviour.
Statistics
- Status: not replicated.
- Original paper: ‘
Is It Light or Dark? Recalling Moral Behavior Changes Perception of Brightness’, Banerjee et al. 2012; two between-subjects experiments, Experiment 1: n = 40, Experiment 2: n = 74. [Citations= 194 (GS, October 2022)].
- Critiques:
Brandt et al. 2014 [online Study 1: n=475, online Study 2: n=482, lab Study 1: n=100, lab Study 2: n=121; meta-analysis: k=11, N not reported, citations=31(GS, October 2022)].
Firestone & Scholl 2013 [Experiment 4 n=89, Experiment 5 n=91, citations=266(GS, October 2022)].
- Original effect size: perceived brightness – d= 0.65 [reported], estimated watts – d= 0.64 [reported], lamp preference – d= 1.23 [reported], candle preference – d= 0.79 [reported], flashlight preference – d= 1.33 [reported].
- Replication effect size: All effect sizes reported in Brandt et al.: Brandt et al.: d= 0.12 [-0.46, 0.10] (non-significant, online study 1). Brandt et al.: d= -0.11 [-0.50, 0.28] (non-significant, lab study 1). estimated watts –Brandt et al.: d= 0.05 [-0.15, 0.25] (online study 2). Brandt et al.: d= 0.03 [-0.36, 0.42] (non-significant, lab study 2). lamp preference – Brandt et al.: d= -0.03 [-0.23, 0.17] (non-significant, online study 2). Brandt et al.: d= -0.11 [-0.35, 0.33] (non-significant, lab study 2). candle preference –Brandt et al.: d= 0.03 [-0.31, 0.37] (non-significant, online study 1). Brandt et al.: d= 0.01 [-0.33, 0.35] (non-significant, lab study 2). flashlight preference –Brandt et al.: d= -0.10 [-0.30, 0.10] (non-significant, online study 2). Brandt et al.: d= -0.09 [-0.25, 0.43] (non-significant, lab study 2). Meta-analytic estimate: effects on brightness judgements mean d = 0.14 [0.002, 0.28], desirability of light-emitting products mean effect size of d = 0.13 [-0.04, 0.29]. perceived brightness – Firestone and Scholl: d= 0.38 [-0.06, 0.82] (non-significant, study 4). Firestone and Scholl: d= 0.46 [0.02, 0.90] (study 5).
Fertility on voting (Ovulation effect). The ovulatory (or high-fertility) phase of the menstrual cycle affects voting preferences and has different effects on women who are single than on women who are in committed relationships. Single women were more likely to vote for Barack Obama (liberal/Democrat candidate) if they were ovulating than if they were not, while the opposite was true for women in committed relationships: ovulation made them more likely to vote for Mitt Romney (conservative/Republican candidate).
Statistics
- Status: mixed
- Original paper: ‘
The fluctuating female vote: politics, religion, and the ovulatory cycle’, Durante et al., 2013; between-subjects design, two studies, Study 1: n = 275 women, Study 2: n = 502 women. [Citations = 117 (GS, October 2022)].
- Critiques:
Harris et al. 2014 [n = 1,206, citations=15 (GS, October 2022)].
- Original effect size: single women d = 0.32 [reported], women in relationships d = 0.37 [reported].
- Replication effect size: Harris et al.: hypothetical voting preferences – single women d = 0.01 [reported; non-significant], women in relationships d = 0.37 [reported]; actual voting behaviour - single women d = 0.40 [reported], women in relationships d = 0.02 [reported; non-significant].
Modulation of 1/f noise on the weapon identification task. Making an effort to modulate the use of racial information decreases the emission of 1/f noise.
Time is money effect. Putting a price on time can influence enjoyment of leisure activities as individuals get more impatient if they are compensated for engaging in these activities.
Statistics
- Status: not replicated
- Original paper: ‘
Time, money, and happiness: How does putting a price on time affect our ability to smell the roses?’, DeVoe et al. 2012; 3 experimental studies, Study 1: N = 53; Study 2: N = 401; Study 3: N = 205. [citations = 119 (GS, June 2022)].
- Critiques:
Connors et al. 2016 [replication attempt 1: N = 266; replication attempt 2: N = 254; citations = 29(GS, June, 2022)].
- Original effect size: Study 1: ηp2=.119; Study 2: ηp2 =.019; Study 3: ηp2 =.031.
- Replication effect size: Connors et al.: Replication of Study 3, attempt 1: ηp2 = .026; attempt 2: ηp2 = .010.
Embodiment of secrets (secrets-as-burdens). Secrets are experienced as physical burdens, influencing how people perceive and act in the world. People who recalled, were preoccupied with, or suppressed an important secret estimated hills to be steeper and perceived distances to be farther.
Statistics
- Status: mixed.
- Original paper: ‘
The Physical Burdens of Secrecy’, Slepian et al. 2012; studies 1, 2 and 4 experimental mixed model design, study 3 correlational, study 1 n = 40, study 2 n = 36, study 3 n = 40, study 4 n = 30. [citations=113 (GS, November 2022)].
- Critiques:
LeBel and Wilbur 2014, direct Slepian et al. 2012 Study 1 replication [Study 1 n=240, Study 2 n = 90, citations=24(GS, November 2022)].
Pecher et al. 2015, direct Slepian et al. 2012 Study 1 and Study 2 replication [Study 1 n=100, Study 2 n = 100, Study 3 n = 118, citations=11(GS, November 2022)].
Slepian et al. 2014 [Study 1 n=83, Study 2 n = 174, citations=51(GS, November 2022)].
Slepian et al. 2015, [Study 1 n = 100, Study 2 n = 100, Study 3 n = 100, Study 4 n = 352, citations=42(GS, November 2022)].
- Original effect size: Study 1 – Big/meaningful vs. small/trivial secret hill steepness comparisons d = 0.78 (calculated from M and SD data in the paper, also reported in
LeBel and Wilbur 2014); Study 2 – Big/meaningful vs. small/trivial distance perception comparisons d = 0.67 (calculated from M and SD data in the paper, also reported in
Pecher et al. 2015); Study 3 – effects of the frequencies of thought of infidelity on estimated effort required by physical task R2 = .21 / d = 1.03 [converted using this
conversion]; Study 4 – more burdensome vs. less burdensome secret concealment effects on willingness to help others with physical task r = .44 / d= 0.98 [converted using this
conversion].
- Replication effect size: LeBel and Wilbur: Study 1 - Big/meaningful vs. small/trivial secret hill steepness comparisons d = 0.176 [-.08, .43] [reported] (not replicated); Study 2 - Big/meaningful vs. small/trivial secret hill steepness comparisons d = -0.319 [-.73, .10] [reported] (not replicated). Pecher et al.: Study 1 - Big/meaningful vs. small/trivial secret hill steepness comparisons d = 0.08 [-0.31, 0.47] [reported] (not replicated); Study 2 - Big/meaningful vs. small/trivial secret hill steepness comparisons d = 0.21 [-0.18, 0.60] [reported] (not replicated); Study 3 - Big/meaningful vs. small/trivial secret perceived distance comparisons d = 0.21 [-0.15, 0.57] [reported] (not replicated). Slepian et al. 2014: Study 1 - Big/meaningful secret recollection condition effects on hill slant estimation in comparison to revealing a secret, r = .29 [reported] / d= 0.61, and control condition r = .34 [reported] / d= 0.72 [ds converted using this
conversion] (replicated); Study 2 - Big/meaningful secret recollection condition effects on distance estimation in comparison to revealing a secret, r = .24 [reported] / d= 0.49, and control condition r = .30 [reported] / d= 0.62 [ds converted using this
conversion] (replicated). Slepian et al. 2015: Study 1 - Big/meaningful vs. small/trivial secret hill steepness comparisons d = 0.31 (calculated from M and SD data in the paper, non-significant) (not replicated); Study 2 - Big/meaningful vs. small/trivial secret hill steepness comparisons r = .28 [reported] / d= 0.58 [converted using this
conversion] (replicated); Study 3 – Recalling preoccupying vs. non-preoccupying secret effects on hill slant judgements r = .23 [reported] / d= 0.47 [converted using this
conversion] (replicated); Study 4 - Recalling preoccupying vs. non-preoccupying secret effects on hill slant judgements r = .11 [reported] / d= 0.22 [converted using this
conversion] (replicated).
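Several of the d values in this entry are converted from r or R2 with a linked converter that is not reproduced here. As a rough illustration only, the sketch below assumes the common equal-group-size formula d = 2r / sqrt(1 - r^2); that formula is an assumption on our part, not something the entry states, and the function names are hypothetical.

```python
import math


def r_to_d(r: float) -> float:
    """Convert a correlation r to Cohen's d (assumes two equal-sized groups)."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 2.0 * r / math.sqrt(1.0 - r * r)


def r_squared_to_d(r_squared: float) -> float:
    """Convert R^2 to d via its positive square root (the sign of r is lost)."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    return r_to_d(math.sqrt(r_squared))


# Inputs taken from the secrets-as-burdens entry above.
print(round(r_to_d(0.44), 2))          # original Study 4, r = .44
print(round(r_squared_to_d(0.21), 2))  # original Study 3, R2 = .21
print(round(r_to_d(0.29), 2))          # Slepian et al. 2014, Study 1, r = .29
```

Under this assumption the three calls give 0.98, 1.03 and 0.61, in line with the converted values listed in the entry; other values may differ in the second decimal depending on how the converter rounds.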
Warmer-hearts-warmer-room effect. Priming “warm” communal traits (vs. other traits) led participants to report that the room in which they were taking the study was warmer.
Statistics
- Status: not replicated
- Original paper: ‘
Warmer hearts, warmer rooms: How positive communal traits increase estimates of ambient temperature’, Szymkow et al. 2013; Experiment 1: N = 80, two between-subjects conditions; Experiment 2: N = 80, two between-subjects conditions; Experiment 3: N = 160, four between-subjects conditions. [citations = 66 (GS, June 2022)].
- Critiques:
Ebersole et al. 2016 [n = 3,119, citations = 438 (GS, June 2022)].
- Original effect size: d = .86 [.40, 1.33] obtained from Ebersole et al. 2016.
- Replication effect size: Ebersole et al.: d = .06 [−.06, .08].
Treating-prejudice-with-imagery effect. Imagining a positive encounter with a member of a stigmatised group promotes positive perceptions when it is preceded by an imagined negative encounter.
Statistics
- Status: not replicated.
- Original paper: ‘
“Treating” Prejudice: An Exposure-Therapy Approach to Reducing Negative Reactions Toward Stigmatized Groups’, Birtel and Crisp 2012; three between-subjects experiments, Experiment 1: n = 29, Experiment 2a: n = 32, Experiment 2b: n = 30. [Citations = 100 (GS, October 2022)].
- Critiques:
McDonald et al. 2014 [Study 1: n = 240, Study 2: n = 175, citations = 24 (GS, October 2022)].
- Original effect size: All effect sizes reported in McDonald et al.: anxiety adult with schizophrenia d= 0.76, anxiety homosexual men d = 1.08, contact homosexual men d = -0.88.
- Replication effect size: McDonald et al.: anxiety adult with schizophrenia d = 0.10 (non-significant), anxiety homosexual men d = -0.19 (non-significant), contact homosexual men d = 0.01 (non-significant).
Grammar influences perceived intentionality. Describing a person’s behaviours in terms of what the person was doing (rather than what the person did) enhances intentionality attributions in the context of both mundane and criminal behaviours. Participants judged actions described in the imperfective as being more intentional and they imagined these actions in more detail.
Statistics
- Status: mixed.
- Original paper: ‘
Learning About What Others Were Doing: Verb Aspect and Attributions of Mundane and Criminal Intent for Past Actions’, Hart and Albarracín 2011; three between-subject experiments, Experiment 1 n = 54, Experiment 2 n = 37, Experiment 3 n = 48. [citations=40(GS, October 2022)].
- Critiques:
Eerland et al. 2016, multilab direct replication of Study 3 [N=685 across 12 studies, citations=82(GS, October 2022)].
Sherrill et al. 2015 [N=699 across 4 Experiments, citations=14(GS, October 2022)].
- Original effect size: Experiment 1 – accessibility to intention-relevant concepts d= 1.00 [reported]; Experiment 2 – attribution of intentionality d= 1.00 [reported], detailed segmentation of behaviour descriptions d= 1.23 [reported]; Experiment 3 – criminal intentionality d= 0.76 [reported], intention attributions d= 0.66 [reported], imagery d= 0.73 [reported].
- Replication effect size: Eerland et al.: intentionality d= -0.98 to d= 0.65 [reported], Meta-analytic effect for laboratory replications d= -0.24 [-0.50, 0.02] [non-significant, reported]; imagery d= -0.45 to d= 0.33, Meta-analytic effect for laboratory replications d= -0.08 [-0.23, 0.07] [non-significant, reported]; intention attribution d= -0.29 to d= 0.19, Meta-analytic effect for laboratory replications d= 0.00 [-0.07, 0.08] [non-significant, reported]. Sherrill et al.: Experiment 2 – murder intentionality judgement in imperfective vs. perfective condition ηp2 = 0.036 [reported] / d = 0.19 [converted using this
conversion] (replicated); Experiment 3 – murder intentionality judgement in imperfective vs. perfective condition ηp2 = 0.040 [reported] / d = 0.20 [converted using this
conversion] (replicated); Experiment 4 – imperfective murder vs. perfective murder condition d= 0.15 [non-significant, reported].
Attachment-warmth embodiment effect (anxious attachment warm food effect). Attachment anxiety positively predicts sensitivity to temperature cues. Individuals with high (but not low) attachment anxiety report higher desires for warm foods (but not neutral foods) when attachment is activated.
Statistics
- Status: not replicated
- Original paper: ‘
Warm Thoughts: Attachment Anxiety and Sensitivity to Temperature Cues’, Vess 2012; between-subject experiment, Study 1: n = 56. [citations = 45 (GS, October 2022)].
- Critiques:
LeBel and Campbell 2013 [Sample 1: n = 219, Sample 2: n=233, citations = 46 (GS, October 2022)].
- Original effect size: f2 = .0734 (Table 1), d = .60 (CurateScience), but calculated: d = .54.
- Replication effect size: LeBel and Campbell: Sample 1: f2 = .000228, d = .03; Sample 2: f2 = .000563, d=.05.
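The entry does not say how the d values were calculated from f2. One reading that is consistent with the listed numbers is d = 2f with f = sqrt(f2), which holds for a comparison of two equal-sized groups. The sketch below works through that assumption; the formula choice and the function name are ours, not the entry's.

```python
import math


def f_squared_to_d(f_squared: float) -> float:
    """Convert Cohen's f^2 to Cohen's d, assuming d = 2 * f (two equal groups)."""
    if f_squared < 0.0:
        raise ValueError("f^2 cannot be negative")
    return 2.0 * math.sqrt(f_squared)


# f2 values taken from the entry above.
for label, f2 in [
    ("Vess 2012 (original)", 0.0734),
    ("LeBel and Campbell, Sample 1", 0.000228),
    ("LeBel and Campbell, Sample 2", 0.000563),
]:
    print(f"{label}: d = {f_squared_to_d(f2):.2f}")
```

With these inputs the sketch yields d = 0.54, 0.03 and 0.05, matching the calculated values in the entry, which suggests (but does not confirm) that this conversion was the one used.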
Social and personal power. Social power (power over other people) and personal power (freedom from other people) have opposite associations with independence and interdependence; they have opposite effects on stereotyping (social power decreases and personal power increases stereotyping), but parallel effects on behavioural approach (both types of power increase it).
Statistics
- Status: mixed
- Original paper: ‘
Differentiating Social and Personal Power: Opposite Effects on Stereotyping, but Parallel Effects on Behavioral Approach Tendencies’, Lammers et al. 2009; Study 1 between-subject experiments, Study 2 field/correlational study, n1 = 113, n2 = 3,082. [citations=233(GS, December 2022)].
- Critiques:
Mayiwar and Lai 2019 direct replication of the Lammers et al. Study 1 [n=295, citations=5(GS, December 2022)].
- Original effect size: Study 1 – effect of the power manipulations on participants’ stereotyping ηp2 = 0.23 [reported] / d = 0.54 [converted using this
conversion]; effect of the power manipulations on participants’ behavioural approach tendencies ηp2 = 0.13 [reported] / d = 0.38 [converted using this
conversion]; Study 2 – significant effects of personal, b = 0.05 [0.01, 0.09], and social power, b = -0.04 [0.08, 0.01] on stereotyping; significant effects of personal, b = 0.22 [0.19, 0.26], and social power, b = 0.18 [0.13, 0.24] on behavioural approach tendencies.
- Replication effect size: Mayiwar and Lai: effect of the power manipulations on participants’ stereotyping ηp2 = 0.056 [reported] / d = 0.24 [converted using this
conversion] (replicated); effect of the power manipulations on participants’ behavioural approach tendencies ηp2 = 0.017 [reported, not significant] / d = 0.13 [converted using this
conversion] (not replicated).
Classical anchoring effect (anchoring and adjustment). Assimilation of numeric estimates toward previously considered numeric values.
Statistics
- Status: replicated
- Original paper:
‘Judgment under Uncertainty: Heuristics and Biases: Biases in judgments reveal some heuristics of thinking under uncertainty’, Tversky & Kahneman, 1974, p. 1128; between-subjects manipulation of anchor (high vs. low), sample size not reported. [citations=46154(GS, August 2022)].
- Critiques: Narrative review:
Furnham and Boo, 2011 [citations=1030(GS, August 2022)]. Meta-analysis (economics):
Li et al. 2021 [n not reported, citations=15 (GS, August 2022)]. Meta-analysis (negotiations):
Orr and Guthrie, 2006 [n=1,259, citations=138 (GS, August 2022)]. Dynamic meta-analysis (all fields of anchoring):
Röseler et al., 2022 [n=18,601, citations=0 (GS, March, 2023)].
Röseler and Schütz, 2022 [n=17,708, citations=0 (GS, August, 2022)]. Meta-analysis (law context):
Townson, 2019 [n not available (no access to document), citations=4 (GS, August, 2022)].
- Original effect size: Not reported.
- Replication effect size: Röseler et al.: [as of August 2022]: g = 0.683, p < .001, 95% CI [0.584, 0.782], 95% PI [-0.24, 1.606], σ² = 0.218, Ntotal = 18601, k = 418.
Incidental environmental anchoring effect (incidental anchoring [Critcher & Gilovich, 2008], basic anchoring [Wilson, Houston, Etling, & Brekke, 1996]). “Anchor values that are incidentally present in the environment can affect a person’s numerical estimates (…) these effects were not qualified by participants’ expertise in the relevant domain (study 1) or by their ability to subsequently recall the anchor value (study 3).” (Critcher & Gilovich, 2008).
Statistics
- Status: not replicated
- Original paper: ‘
Incidental environmental anchors’, Critcher and Gilovich, 2008; 3 studies with between-subjects manipulation of incidental anchors, n = 265 (Study 1) + 207 (Study 2) + 194 (Study 3) = 666. [citations=261(GS, August 2022)].
- Critiques:
Shanks et al., 2020 [n(Restaurant item)= 69 (Study 1) + 125 (Study 2) + 422 (Study 3) = 616, citations=11(GS, August, 2022)].
- Original effect size: d = 0.49 [0.25, 0.72] according to
Shanks et al., 2020, p. 9.
- Replication effect size: Shanks et al.: d = -0.02 [-0.08, 0.05].
Subliminal anchoring effect (subliminal anchoring). Numeric estimates are biassed toward previously, subliminally presented numbers that could not be perceived by respondents.
Facial redness increased perceived anger. When people rate faces and these are red, rated anger is positively associated with the faces’ redness.
Statistics
- Status: mixed
- Original paper: ‘
Facial redness, expression, and masculinity influence perceptions of anger and health’, Young et al., 2018; full within-subjects design, n = 40 (Study 1) + 44 (Study 2). [citations=23(GS, October 2022)].
- Critiques: Effect could not be replicated with natural shades of red:
Wolf et al., 2021 [n=609, citations=1(GS, October 2022)]. Effect persisted only in a within-subjects design:
Wolf et al., 2022 [n= 40 (Study 1) + 329 (Study 2), citations=0(GS, October 2022)].
- Original effect size: Cohen’s f = .35.
- Replication effect size: Wolf et al.: ηp2 = 0.04 [0.01, 0.06].
Romeo and Juliet effect. Greater love and commitment towards a romantic partner when others (e.g., parents, friends) are observed to interfere with, or disapprove of, the relationship.
Statistics
- Status: reversed
- Original paper:
‘Parental interference and romantic love: The Romeo and Juliet effect’, Driscoll et al. 1972; within-subjects, n = 140 (couples). [citations = 490 (GS, October 2022)].
- Critiques:
Parks et al. 1983 [n = 193, citations = 260 (GS, February, 2023)].
Sinclair et al. 2014 [Experiment: n = 396 (direct replication), Meta-analysis: n = NA, k = 22 studies, citations = 56 (GS, October 2022)].
- Original effect size: Romantic love and parental interference: r = .34; Commitment and parental interference: r = .30.
- Replication effect size: Parks et al.: All effects correlated with romantic love: Own family approval: r = .47 (opposite direction); Partner’s family approval: r = .42 (opposite direction); Own friends’ approval: r = .51 (opposite direction); Partner’s friends’ approval: r = .49 (opposite direction); Own network approval: r = .63 (opposite direction). Sinclair et al.: Experiment: Romantic love - Parental interference: r = -.05 (not replicated); Friend interference: r = -.07 (not replicated); Commitment - Parental interference: r = -.09 (not replicated); Friend interference: r = -.06 (not replicated). Meta-analysis: Romantic love and network approval (k = 11 studies): g = 0.49 [0.26, 0.72] (opposite direction); Commitment and network approval (k = 16 studies): g = 0.62 [0.50, 0.74] (opposite direction).
Stereotype activation effect. Judgments of targets that follow gender-congruent primes are made faster than judgments of targets that follow gender-incongruent primes.
Statistics
- Status: not replicated
- Original paper: ‘
Automatic Stereotyping’, Banaji and Hardin 1996; method - the semantic priming procedure, sample size = 68. [citations=1060 (GS, November 2022)].
- Critiques:
Müller and Rothermund, 2014 [n=294, citations=32(GS, November 2022)].
- Original effect size: Prime Gender x Target Gender: F(2, 144) = 15.28, p<.001
- Replication effect size: Müller and Rothermund: Prime Gender × Target Gender interaction: F(1, 293) = 39.68, p = 1.09 × 10^−9.
Sex difference in distress to infidelity. Men, compared to women, are more distressed by sexual than emotional infidelity, and this sex difference continued into older age.
Statistics
- Status: replicated
- Original paper: ‘
Jealousy and the nature of beliefs about infidelity: Tests of competing hypotheses about sex differences in the United States, Korea, and Japan’, Buss et al. 1999; study design = experimental (Questionnaire), sample 1 size = 405, sample 2 size = 626. [citations=552 (GS, November 2022)].
- Critiques:
Shackelford and Voracek, 2004 [n=234, citations=172 (GS, November 2022)].
- Original effect size: t(494) = 6.09 for Sample 1 and t(624) = 6.82 for Sample 2, both p’s < .001.
- Replication effect size: Shackelford and Voracek: d = 1.29, t(232) = 9.89, p < .001.
Content effect for cheater detection. There is a performance improvement on the Wason selection task if it involves cheater detection. College students were better able to complete the selection task for unfamiliar scenarios if it involved detecting a cheater instead of a descriptive scenario.
Dissenting deviant social rejection effect. Groups reject opinion deviates from future interaction.
Statistics
- Status: mixed
- Original paper: ‘
Deviation, rejection, and communication’, Schachter 1951; experiment, sample size = 198. [citations=2209 (GS, November 2022)].
- Critiques:
Wesselmann 2014 [n=80, citations=37 (GS, November 2022)].
- Original effect size: d = 1.84 (source: meta-analysis by
Tata et al. 1996)
- Replication effect size: Wesselmann: replicated: Communication Pattern - effect for change over time in overall communication to the confederates F(5, 80) = 1.23, p = 0.30, ηp2 = 0.07; effect for the groups’ differential communication between the confederates F(2, 32) = 20.83, p < 0.01, ηp2 = 0.57; interaction between communication to the different confederates and the point of the conversation F(10, 160) = 0.99, p = 0.45, ηp2 = 0.06; not replicated: Committee Nomination Measure χ2(4) = 0.79, p = 0.94; replicated: Sociometric Test χ2(2) = 14.74, p < .01.
Sex differences in implicit maths attitudes. College students, especially women, demonstrated negativity toward maths and science relative to arts and language on implicit measures.
Statistics
- Status: replicated
- Original paper:
‘Math = male, me = female, therefore math ≠ me’, Nosek et al. 2002; study design = experiment, n = 170. [citations=1428 (GS, November 2022)].
- Critiques:
Klein et al. 2014 [n=5842, citations=1129 (GS, November 2022)].
- Original effect size: d=1.01[.54, 1.48] (reported in Klein et al. 2014).
- Replication effect size: Klein et al.: d=0.56[0.45, 0.68]
Low versus high category scale effect on behaviour self-report. Response scales serve informative functions. The response categories suggest a range of “usual” or “expected” behaviours, and this information affects respondents’ behavioural reports as well as related judgments.
Information source on attitudes effect. The source of information has a major impact on how that information is perceived and evaluated.
Statistics
- Status: replicated
- Original paper: ‘
Prestige, Suggestion, and Attitudes’, Lorge and Curtiss 1936; experiment, sample size = 99. [citations=242(GS, November 2022)].
- Critiques:
Klein et al. 2014 [n=6325, citations=1129 (GS, November 2022)].
- Original effect size: NA.
- Replication effect size: Klein et al.: d = 0.31[0.19, 0.42].
Door-in-the-face effect. The door-in-the-face effect occurs when making a larger initial request and then afterwards scaling back and asking a more moderate request increases compliance (with the moderate request) compared to either starting with the moderate request or starting with a small request.
Foot-in-the-door effect. The foot-in-the-door effect occurs when getting people to comply with a very small initial request increases the likelihood that they will agree to a larger request (compared to starting with the larger request).
Statistics
- Status: mixed
- Original paper: ‘
Compliance without pressure: The foot-in-the-door technique’, Freedman and Fraser 1966; between-subjects manipulation of whether or not there is a very small initial request, n=156. [citations=2,667(GS, January 2023)].
- Critique:
Gamian-Wilk and Dolinski 2019 [Between-subjects manipulation of whether or not there is a very small initial request, n=60 in each of 4 replication studies; 240 total, citations=3(GS, January 2023)].
- Original effect size: OR = 3.912; d = 2.16.
- Replication effect size: Out of 4 replication attempts, only 1 succeeded with p < .05, although most were directional and had small sample sizes. Aggregating across all 4 replications (which probably makes the most sense given the small sample sizes): OR = 8.76, d = 4.83; in the successful replication alone: OR = 33.14 (due to only 1 person complying with the large request in the control condition).
Ingroup-outgroup norm of reciprocity effect. “When confronted with a decision about allowing or denying the same behaviour to an ingroup and outgroup, people may feel an obligation to reciprocity, or consistency in their evaluation of the behaviours.”
Statistics
- Status: replicated
- Original paper: ‘The Current Status of American Public Opinion’, Hyman and Sheatsley, 1950; experiment, n = NA. [citations=161(GS, November 2022)]. An online version of the original paper could not be found.
- Critiques:
Klein et al. 2014 [n=6276, citations=1129 (GS, November 2022)].
- Original effect size: d = 0.16 [0.06, 0.27].
- Replication effect size: Klein et al.: d = 0.27 [0.18, 0.36].
Social dominance-status (verticality effects). Vertical dimension of human relations (such as dominance and submission) and nonverbal behaviour are intimately and fruitfully linked; nonverbal behaviour, such as gazing, smiling, touching, and various body positions can signal high and low verticality.
Statistics
- Status: mixed
- Original paper: ‘
Body Politics: Power, Sex, and Nonverbal Communication’, Henley 1977; book/theoretical and anecdotal evidence, n=NA. [citations=2284(GS, May 2023)].
- Critiques:
Hall et al. 2005 [meta-analysis, k=211, citations=1103(GS, May 2023)].
- Original effect size: NA.
- Replication effect size: Hall et al.: beliefs (perceptions) about the relation of verticality to nonverbal behavior (average r, weighted by sample size) – smiling r=-.25 [-.29, -.21], gazing r=.10 [.06, .14], raised brows r=-.36 [-.41, -.31], nodding r=.12 [.00, .18], self touch r=-.09 [-.24, -.06], other touch r=.21 [.17, .29], hand/arm gestures r=.37 [.25, .49], postural relaxation r=-.09 [-.04, .24], body/leg shifting r=.10 [-.29, -.21], interpersonal distance r=-.34 [-.43, -.25], facing orientation r=.10 [-.01, .21], vocal variability r=.24 [.16, .32], loudness r=.47 [.39, .55], interruptions r=.61 [.52, .70], pausing/latency to speak r=-.78 [-.94, -.62], rate of speech r=.09 [.03, .15], pitch r=-.10 [-.19, -.01], vocal relaxation r=.33 [.18, .48]; actual relations between verticality and nonverbal behavior (average r, weighted by sample size) – smiling r=-.03 [-.09, .03], gazing r=-.01 [-.09, .07], raised brows r=-.06 [-.25, .18], nodding r=.03 [-.05, .17], self touch r=-.04 [-.10, .10], other touch r=-.02 [-.10, .16], hand/arm gestures r=.05 [-.06, .10], openness r=.13 [.03, .23], postural relaxation r=.02 [-.08, .12], interpersonal distance r=-.17 [-.24, -.20], loudness r=.24 [.16, .32], interruptions r=.04 [-.02, .10], overlaps r=.06 [-.06, .81], pausing/latency to speak r=-.06 [-.24, .12], back-channel responses r=.03 [-.07, .13], speech errors r=.02 [-.10, .14], rate of speech r=-.06 [-.15, .03].
Personal cognitive dissonance - free-choice paradigm. Cognitive dissonance theory (Festinger, 1957) holds that an inconsistency between two cognitions (e.g., an attitude and a past behaviour) creates an unpleasant psychological state (i.e., personal dissonance) that the individual is motivated to reduce (e.g., by changing one of the elements to fit the other). Personal cognitive dissonance has been studied through different paradigms, three main ones being the free-choice, induced-compliance and induced-hypocrisy paradigms. In the free-choice paradigm, the mere act of choosing between equally desirable options can arouse dissonance, because choosing option A implies rejecting option B (in other words, choosing option A means accepting its disadvantages as well as its advantages, and giving up the advantages of option B). To reduce dissonance, subjects increase the perceived gap between the options (i.e., spreading of alternatives) by overestimating the chosen option and/or underestimating the rejected option.
Statistics
- Status: NA
- Original paper: ‘
Postdecision changes in the desirability of alternatives’, Brehm 1956; experimental design, n = 225. [citations = 1987 (GS, February 2023)].
- Critiques:
Enisman et al. 2021 [meta-analysis; n= 43 studies, citations = 11 (GS, February 2023)].
Izuma and Murayama 2013 [meta-analysis, k= 3 studies, citations = 109(GS, February 2023)].
- Original effect size: NA.
- Replication effect size: Enisman et al.: Effect of free-choice paradigm on spreading of alternatives: d= 0.40 [0.32, 0.49].
Personal cognitive dissonance - induced-compliance paradigm. In this paradigm, subjects are led to perform, in a context of free choice, an act that is inconsistent with their own norms or with social norms (e.g., agreeing to perform a counter-attitudinal act). Dissonance can be resolved through multiple modes of reduction (e.g., social support, trivialization, etc.), but attitude change remains the most studied mode of reduction.
Statistics
- Status: replicated
- Original paper: ‘
Dissonance arousal: Physiological evidence’, Croyle and Cooper 1983; between-subjects design, n1 = 30, n2=30. [citations= 447 (GS, February 2023)].
- Critiques:
Kenworthy et al. 2011 [meta-analysis; n= 31 studies, citations = 71 (GS, February 2023)].
Kim et al. 2014 [meta-analysis; k= 230 effects, citations = 11 (GS, February 2023)].
Vaidis et al. 2018 [multi-lab replication; in preparation].
- Original effect size: Effect of induced-compliance on attitude change: d= 2.40 [1.40, 3.37].
- Replication effect size: Kenworthy et al.: Effect of induced-compliance on dissonance effects: d= 0.81 [0.70, 0.91]. Kim et al.: Effect of induced-compliance on attitude change: r= .22 [.21, .24].
Personal cognitive dissonance - Induced-hypocrisy paradigm. In this paradigm, dissonance is aroused by making individuals aware of the discrepancy between a socially desirable behaviour (e.g., not wasting water; stage 1: normative commitment phase) and their own past transgressive behaviours (e.g., remembering one’s past water waste; stage 2: transgression salience phase). Most of the dissonance reduction work is done through behavioural means, and leads subjects to express behavioural intentions, and/or perform behaviours in the direction of the socially desirable behaviours expressed in step 1 (i.e., allowing for the reduction of the inconsistency between the norm, step 1, and the recall of transgressions, step 2).
Vicarious cognitive dissonance - induced-compliance paradigm. Vicarious cognitive dissonance, derived from cognitive dissonance theory (Festinger, 1957; see “personal cognitive dissonance”), refers to the possibility that an individual experiences dissonance vicariously when they witness an inconsistent act (e.g., counter-attitudinal or counter-normative behaviour) performed by an in-group member with whom they strongly identify. As with personal cognitive dissonance, the inconsistency between two cognitions (e.g., between attitude and observed behaviour) creates an unpleasant psychological state (i.e., vicarious dissonance) that the individual is motivated to reduce (e.g., by changing one of the elements to fit the other). Vicarious cognitive dissonance has been studied through different paradigms, the two main ones being the induced-compliance and induced-hypocrisy paradigms. In the induced-compliance paradigm, subjects observe a member of their in-group performing, in a context of free choice, an act that is inconsistent with their own norms or with social norms (e.g., agreeing to perform a counter-attitudinal act). Dissonance can be resolved through multiple modes of reduction (e.g., social support, trivialization, etc.), but attitude change remains the most studied mode of reduction.
Statistics
- Status: mixed
- Original paper: ‘Vicarious dissonance: Attitude change from the inconsistency of others’, Norton et al. 2003; experimental design, exp 1: n = 50, exp 2: n = 43, exp 3: n = 108. [citations= 344 (GS, February 2023)].
- Critiques: Jaubert 2022 [paper not found; n = 102, citations = NA].
Jaubert et al. 2020 [meta-analysis in submission; k= 13 studies, citations = 0 (GS, 2023)].
- Original effect size: Effect of vicarious dissonance on attitude change: d= 0.70 [0.21, 1.26].
- Replication effect size: Jaubert: Effect of vicarious dissonance on attitude change: η2p= 0.07. Jaubert et al.: Effect of vicarious dissonance toward the induced-compliance paradigm: d= 0.35 [0.15, 0.54]. Global effect of vicarious dissonance: d= 0.41 [0.27, 0.54], lower estimated effects when correcting for publication bias (d= 0.22 [0.008, 0.43]).
Vicarious cognitive dissonance - Induced-hypocrisy paradigm. In this paradigm, subjects are made to observe a member of their group becoming aware of the discrepancy between a socially desirable behaviour (e.g., not wasting water; stage 1: normative commitment phase) and their own past transgressive behaviours (e.g., remembering one’s past water waste; stage 2: transgression salience phase). Most of the dissonance reduction work is done through behavioural means, and leads subjects to express behavioural intentions, and/or perform behaviours in the direction of the socially desirable behaviours expressed in step 1 (i.e., allowing for the reduction of the inconsistency between the norm, step 1, and the recall of transgressions, step 2).
Statistics
- Status: mixed
- Original paper: ‘Vicarious hypocrisy: Bolstering attitudes and taking action after exposure to a hypocritical ingroup member’, Focella et al., 2016; experimental design, exp 1: n = 161, exp 2: n = 68, exp 3: n = 64, exp 4: n = 68. [citations= 34 (GS, February 2023)].
- Critiques:
Gaffney et al. 2012 [n = 78, citations=17(GS, November 2022)]. Jaubert 2022 [n = 133, citations = NA].
Jaubert et al. 2020 [meta-analysis in submission; k = 13 studies, citations = 0 (GS, March 2023)].
Monin et al. 2004 [study 1 n=57, study 2 n = 25, citations=97(GS, November 2022)].
- Original effect size: Effect of induced-hypocrisy on behavioural intention: d= 0.70 [0.12, 1.26].
- Replication effect size: Gaffney et al.: group membership X response to the hypocrisy interaction effect on attitudes ηp2 = 0.142 / d = 0.40 [calculated from the F statistics and converted using this conversion] (replicated). Jaubert: Effect of vicarious dissonance on attitude change: η2p= 0.03. Jaubert et al.: Effect of vicarious dissonance toward the induced-compliance paradigm: d= 0.46 [0.27, 0.64]. Monin et al.: study 1 disagree versus agree condition attitude change comparison d = 0.30 [calculated from the t-test values using this conversion]; study 2 no consequence versus consequence condition attitude change comparison d= 0.46 [calculated from the t-test values using this conversion] (replicated).
Imposter phenomenon. People who perform outstandingly both academically and professionally believe that in fact, they are not really bright and that they have fooled anyone who thinks otherwise. This phenomenon might be especially persistent in women. Key conclusion: Therapeutic interventions might help to overcome imposter syndrome.
Statistics
- Status: not replicated
- Original paper: ‘
The imposter phenomenon in high achieving women: Dynamics and therapeutic intervention’, Clance and Imes 1978; Therapeutic interventions (but not described in detail), n=178. [Citations = 2709 (GS, January 2023)].
- Critiques:
Bravata et al. 2020 [n = 62 studies - systematic review, citations = 272 (GS, January 2023)].
- Original effect size: NA; No effect sizes mentioned in original study since no statistical analyses were performed.
- Replication effect size: Bravata et al.: NA, but imposter phenomenon both present in men and women, particularly high among ethnic minority groups (original study mentioned white middle class women).
Ability EI as a factor of intelligence. Ability EI is a collection of cognitive abilities relating to the recognition, understanding and management of emotions. There have been many controversies in attempting to contextualise Ability EI within models of intelligence/cognitive ability. MacCann et al. (2014) empirically tested multiple models of how various cognitive abilities interact, including hierarchical and bi-factor models, and the data demonstrated closest fit to a hierarchical structure where Ability EI was contextualised as a second-stratum factor. A recent replication repeated this modelling process and drew the same conclusion.
Matilda effect (Matthew Matilda effect). Male scientists and masculine topics are frequently perceived as demonstrating higher scientific quality.
Statistics
- Status: mixed
- Original paper: The name of the effect shows up first in ‘
The Matthew Matilda Effect in Science’, Rossiter, 1993; theoretical paper, n=NA. [citations=114(GS, February 2023)].
- Critiques: Unpublished replication in
Feeley and Lee 2015 [n=1177 articles across 3 journals, citations=5 (GS, February 2023)]. Related article also in communication research is
Feeley and Yang 2022 [n=3324 articles across 10 journals, citations=2(GS, February 2023)]. Knobloch-Westerwick and Glynn 2013; correlational design, n=1020 articles across 2 journals [citations=114 (GS, February 2023)].
Rajko et al. 2023 [n=5,500 communication scholars from 11 countries, citations=0 (GS, February 2023)].
- Original effect size: NA.
- Replication effect size: Knobloch-Westerwick and Glynn: Publications with female lead authors were cited 12.77 times on average (SD = 20.57), whereas publications with male lead authors were cited 17.73 times on average (SD = 35.34), _η_² = 0.006; Male-typed publications receiving significantly more citations (M = 21.04, SD = 38.63 vs. M = 14.44, SD = 28.08), _η_² = 0.006; Publications with at least one male author received significantly more citations with M = 17.11 (SD = 33.38), compared with M = 11.93 citations (SD = 19.84) for publications from female authors, _η_² = 0.006. Feeley and Lee: Female lead authors were cited, on average, 19.34 times (SD = 30.22) compared to male lead authors (M = 18.05, SD = 25.98) (non-significant); Male-typed topics (M = 22.43, SD = 36.55) received more citations than female-typed topics (M= 17.87, SD = 25.80), _η_² = 0.003. Feeley and Yang: 2 out of the 8 journals examined exhibited Matilda effects. Rajko et al.: After controlling for country, the total number of papers, and the total number of views, female scholars have significantly lower levels of citations than male peers (β = −.05; p < .001).
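
The raw citation means in this entry can be put on a standardised scale with Cohen's d and a pooled standard deviation. The sketch below uses the Knobloch-Westerwick and Glynn lead-author figures quoted above. The entry does not report group sizes, so the function assumes equal group sizes; the result is a rough illustration, not a value reported by the authors. The function name is hypothetical.

```python
import math


def cohens_d_equal_n(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d with a pooled SD that ASSUMES equal group sizes."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Male vs. female lead authors (means and SDs quoted in the entry above).
d_lead_author = cohens_d_equal_n(17.73, 35.34, 12.77, 20.57)
print(f"d = {d_lead_author:.2f}")
```

Under the equal-size assumption the difference comes out at roughly d = 0.17, a small standardised effect, which is in keeping with the small η² values listed in the entry.
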
Being slightly behind increases the chance of winning. The original study found that being slightly behind at halftime significantly increases the chance of winning in professional basketball.
Statistics
- Status: mixed
- Original paper: ‘Can Losing Lead to Winning?’, Berger and Pope 2011; natural experiment, N = 11968. [citations = 271 (GS, February 2023)].
- Critiques:
Teeselink et al. 2022 [natural experiment, N = 17535, citations = 9 (GS, February 2023)].
- Original effect size: Teams behind by one point at halftime win between 5.8 and 8 percentage points more often than expected in the NBA across four models (between 2.1 and 2.5 percentage points in the NCAA). The result is statistically significant in all specifications in the NBA and in 3 out of 4 specifications in the NCAA.
- Replication effect size: Teeselink et al.: With a larger sample, teams behind by one point at halftime win 5 percentage points more often than expected in the NBA (0.8 percentage points in the NCAA). The result is statistically significant for the large NBA sample (95% CI: 0.007, 0.094), but not the large NCAA sample (95% CI: −0.025, 0.041). Additionally, the replication also tests the hypothesis in other leagues and sports. Out of 12 leagues, the effect is only significant in the NBA.
Ethnoracial diversity and trust. Ethnoracial diversity negatively affects trust and social capital.
Statistics
- Status: mixed
- Original paper: ‘
E Pluribus Unum: Diversity and Community in the Twenty-first Century The 2006 Johan Skytte Prize Lecture’, Putnam 2007; observational study, N = 23260. [citations = 6860 (GS, February 2023)].
- Critiques:
Abascal et al. 2015 [observational study, N = 29733, citations = 331 (GS, February 2023)].
- Original effect size: Trust in neighbours increases by 0.18 (on a 4-point scale) when switching from a maximally heterogeneous to a maximally homogeneous community in the USA, t = 5.1 (see Table 3).
- Replication effect size: On the full US sample, the authors find a similar result: trust in neighbours decreases by 0.12 in heterogeneous compared to homogeneous communities. However, when using random subsamples of the US population, they find a significant effect in only 4 out of 30 models (average t = -0.76).
Greed moderates the relationship between SES and unethical behaviour. The original study found that people of higher socio-economic status are more likely to engage in unethical behaviour, but that this relationship is moderated by greed. When study participants were primed to think positively about greed, those of lower SES became more likely to engage in unethical behaviour than those of higher SES.
Statistics
- Status: not replicated
- Original paper: ‘
Higher social class predicts increased unethical behavior’, Piff et al. 2012; experiment, N = 90. [citations = 1273 (GS, February 2023)].
- Critiques:
Balakrishnan et al. 2017; [experiment/meta analysis, n1= 264, n2=257, n3=306, n4=114, citations = 18 (GS, February 2023)].
- Original effect size: Interaction effect between greed and SES: b=−0.24 [−0.44 , −0.04].
- Replication effect size: Interaction effect between greed and SES in replication 1: unstandardized b=0.11 [−0.02 , 0.24], in replication 2: b=−0.06 [−0.16 , 0.04], in replication 3: b=0.01 [−0.10 , 0.12], in replication 4: b=0.10 [−0.10 , 0.29]. I.e., the interaction effect was not replicated in any of the four studies.
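
The "not replicated" reading of this entry rests on whether each interval excludes zero. A minimal sketch of that check, using the estimates quoted above, could look like the following. It assumes the bracketed ranges are 95% confidence intervals, which the entry does not state explicitly, and the helper name is hypothetical.

```python
def excludes_zero(lower, upper):
    """True when an interval lies entirely on one side of zero."""
    return lower > 0 or upper < 0


# (label, point estimate, interval lower, interval upper), copied from the entry.
# Assumption: the bracketed ranges are 95% confidence intervals.
estimates = [
    ("original", -0.24, -0.44, -0.04),
    ("replication 1", 0.11, -0.02, 0.24),
    ("replication 2", -0.06, -0.16, 0.04),
    ("replication 3", 0.01, -0.10, 0.12),
    ("replication 4", 0.10, -0.10, 0.29),
]

for label, b, lower, upper in estimates:
    verdict = "excludes zero" if excludes_zero(lower, upper) else "includes zero"
    print(f"{label}: b = {b:+.2f} [{lower:+.2f}, {upper:+.2f}] -> {verdict}")
```

Only the original interval excludes zero; all four replication intervals include it, which matches the summary in the entry.
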
Women’s education increases domestic violence. Women with more education report higher levels of psychological violence at home.
Statistics
- Status: not replicated
- Original paper: ‘
For Better or for Worse?: Education and the Prevalence of Domestic Violence in Turkey’, Erten and Keskin 2018; natural experiment (RDD/IV), N = 1462. [citations = 153 (GS, February 2023)].
- Critiques:
Akyol et al. 2020 [natural experiment (RDD/IV), N = 1093, citations = 4 (GS, February 2023)].
- Original effect size: With a regression discontinuity design, the authors determine that 1 more year of education leads to a 0.12 standard deviation increase in reported domestic psychological violence (SE: 0.057) for Turkish women living in rural areas, which is significant at the 5% level.
- Replication effect size: Akyol et al.: With the same design and sample, but a different definition of rural areas, the authors find only a 0.099 standard deviation increase in reported domestic violence (SE: 0.061), which is not significant at the 5% or 10% level.
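
The significance statements in this entry can be roughly checked from the reported estimates and standard errors. The sketch below assumes a simple normal approximation (z = estimate / SE, two-sided test). The papers' own inference may differ, so treat the p-values as approximate; the function name is hypothetical.

```python
import math


def two_sided_p(estimate, standard_error):
    """Two-sided p-value under a normal approximation (an assumption)."""
    z = estimate / standard_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


p_original = two_sided_p(0.12, 0.057)      # Erten and Keskin estimate and SE
p_replication = two_sided_p(0.099, 0.061)  # Akyol et al. estimate and SE

print(f"original:    p = {p_original:.3f}")
print(f"replication: p = {p_replication:.3f}")
```

Under this approximation the original estimate falls below the 5% threshold and the replication estimate falls above the 10% threshold, in line with the description above.
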
Easterlin paradox (national income associated with happiness). When comparing across countries, higher levels of income are associated with higher levels of subjective well-being, yet this association does not show up across time.
Statistics
- Status: mixed
- Original paper: ‘Does Economic Growth Improve the Human Lot? Some Empirical Evidence’, Easterlin 1974; observational study, n=25 time series observations for the United States from 1946-1970. [citations=8686(GS, February 2023)].
- Critiques:
Easterlin 2005 [focused on descriptive statistics, citations=470(GS, February 2023)].
Hagerty and Veenhoven 2003 [n=336 (21 countries including the United States from 1973-1996), citations=765(GS, February 2023)]. A more comprehensive analysis using a variety of analyses and datasets can be found in
Sacks et al. 2012 [n=79 countries spanning 1980 to 2004, citations=525(GS, February 2023)]. A rejoinder appears in
Veenhoven and Hagerty 2006 [focused on trends in the United States, Western Europe, and 8 developing nations, citations=470(GS, February 2023)].
- Original effect size: No effect size explicitly reported, but Tables 8 to 10 of Easterlin 1974 contain time series patterns.
- Replication effect size: Easterlin: No effect size reported, but refers to the lack of association between subjective well-being and income in Figure 1 for the United States. Hagerty and Veenhoven: regression coefficient b=1.26 [Z-statistic=2.67]. Sacks et al.: β=0.505, SE=0.109 for the World Values Survey, β=0.278, SE=0.164 for the Eurobarometer. Veenhoven and Hagerty: No effect size reported, but Tables 1 to 4 report trends.
Humour style clusters. A number of works have attempted to determine whether individuals can be categorised into different types of humour user. The first was by Galloway (2010) and suggested four types of humour user through use of cluster analysis: (1) above average on all of the styles, or (2) below average on all of the styles, or (3) above average on the positive styles (Affiliative and Self-enhancing), and below average on the negative styles (Aggressive and Self-defeating), or (4) above average on the negative styles and below average on the positive styles.
Statistics
- Status: mixed
- Original paper: ‘
Individual differences in personal humor styles: Identification of prominent patterns and their associates’, Galloway, 2010; cross-sectional study, n=318. [Citations = 149 (GS, January 2023)].
- Critiques:
Chang et al., 2015[n=1252, citations = 47 (GS, March 2023)].
Evans & Steptoe-Warren 2018[n=202, citations = 49 (GS, March 2023)].
Evans et al. 2020[n=863, citations = 3 (GS, March 2023)].
Fox et al. 2016[n=1108, citations = 37 (GS, March 2023)].
Leist and Muller 2013[n=305, citations = 119(GS, March 2023)].
Sirigatti et al. 2016[n=244, citations = 35 (GS, March 2023)].
- Original effect size: NA.
- Replication effect size: Chang et al.: NA, but the four-cluster solution described was replicated. Evans and Steptoe-Warren: three managerial humour clusters. Evans et al.: inconsistencies in the humour style profiles across countries tested and the extant literature, possibly indicative of cultural differences in the behavioural expression of trait humour. Fox et al.: NA, the presence of distinctive humour types in childhood. Leist and Muller: evidence for three humour types (endorsers, humour deniers, and self-enhancers). Sirigatti et al.: three humour styles identified.
Social referencing effect. Crosby et al. (2008) found that hearing an offensive remark caused subjects to look longer at a potentially offended person, but only if that person could hear the remark. On the basis of this result, they argued that people use social referencing to assess the offensiveness.
Statistics
- Status: mixed
- Original paper: ‘
Where do we look during potentially offensive behavior?’, Crosby et al. 2008; experimental design, n=25. [Citations = 95 (GS, Jan 2023)].
- Critiques:
Jonas and Skorinko 2015 [n=58, Citations = 0 (GS, Jan 2023)].
Rabagliati et al. 2020 [n = 283, Citations = 2 (GS, Jan 2023)].
- Original effect size: F(3,69)=5.15, p<.005.
- Replication effect size: Jonas and Skorinko: F(1.86, 101.7) = 0.07, p=0.917. Rabagliati et al.: χ2(3) = 22.11, p < .001, pseudo-R2 = .85.
Other-race effect (cross-race effect, own-race bias). Humans are better at distinguishing between faces of two individuals of their own race than two faces of another race.
Statistics
- Status: replicated
- Original paper:
‘Recognition for faces of own and other race’, Malpass and Kravitz 1969; forced-choice recognition task, n=40. [Citations=1036 (GS, Feb 2023)].
- Critiques:
Lee and Penrod 2022 [various recognition tasks n=24937, Citations=1 (GS, Feb 2023)].
- Original effect size: η2p=0.291 to 0.350 (insufficient information for CI).
- Replication effect size: Lee and Penrod: Hedges’ g=0.54.
Unethical amnesia. Memories of unethical behaviour are less clear and vivid than memories of good deeds.
Statistics
- Status: not replicated
- Original paper: ‘
Memories of unethical actions become obfuscated over time’, Kouchaki and Gino 2016; nine studies, between-subjects (Study 1a, 1b, 3, 4, 5, 6, 7a, 7b) and correlational design (Study 2), total n = 2,109. [citations=177(GS, May 2023)].
- Critiques:
Stanley et al. 2018 replication of Kouchaki and Gino 2016 study 5 [n1=228, n2=232, n3=228, citations=22(GS, May 2023)].
- Original effect size: Study 1a - ηp2= 0.06; Study 1b – in the “self” conditions, participants had less clear memory of their unethical actions, ηp2 = 0.06; Study 2 –cheaters reported lower clarity of memory, ηp2 = 0.04; Study 3 –participants in the self-unethical condition had a less clear recall of thoughts and feelings than did participants in the self-ethical condition, ηp2 = 0.05; Study 4 - participants who read that they had cheated indicated they had a less clear memory than those who did not cheat, ηp2 = 0.20; Study 5 – objective memory score was lower for those who read in the story that they had cheated than for those who read that they had behaved honestly, d = 0.43; Study 6 - participants in the likely-cheating condition recalled the die-throwing task less precisely than those in the no-cheating condition, d = 0.57; Study 7 - participants in the likely-cheating condition recalled the die-throwing task less precisely than those in the no-cheating condition, d = 0.38 (Study 7a), d = 0.33 (Study 7b).
- Replication effect size: Stanley et al.: no significant differences in objective memory score between participants who read the vignette depicting the ethical behaviour versus the unethical cheating behaviour, Study 1 - d = .26 (n.s.), Study 2 - d = .04 (n.s.), Study 3 - d = .17 (n.s.).
Feeling dirty after networking. People feel uncomfortable networking because networking triggers a state of “moral impurity,” which translates into feelings of “dirtiness” and a heightened desire for “cleansing”.
Public exposure influences shame and guilt differently. Public exposure (implicit and explicit) of transgression increases experienced shame more than guilt.
Statistics
- Status: not replicated
- Original paper: ‘
The role of public exposure in moral and nonmoral shame and guilt’, Smith et al. 2002; 4 studies, between-subject design (Study 1 and 4), within-subject design (Study 2), content analysis (Study 3), Study 1: n=168, Study 2: n=56, Study 3: n=510 passages, Study 4: n=60. [citations=690(GS, June 2023)].
- Critiques:
Zhang et al. 2022 [n=1727, citations=0(GS, June 2023)].
- Original effect size: Study 1: shame f = .39 and guilt f = .27.
- Replication effect size: Zhang: Study 1 - shame ηp2=.14 [.11, .17] and guilt ηp2=.13 [.10, .16].
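
The original entry reports Cohen's f while the replication reports partial eta squared, so the two are not directly comparable as written. One way to bring them onto the same scale is the standard relation ηp² = f² / (1 + f²). The sketch below applies it to the original Study 1 values; whether the original f and the replication ηp² refer to exactly the same contrast is an assumption here, so the comparison is only indicative. The function name is hypothetical.

```python
def partial_eta_squared_from_cohens_f(cohens_f):
    """Convert Cohen's f to (partial) eta squared: f^2 / (1 + f^2)."""
    return cohens_f ** 2 / (1 + cohens_f ** 2)


# Original Study 1 values quoted in the entry above.
shame_eta = partial_eta_squared_from_cohens_f(0.39)
guilt_eta = partial_eta_squared_from_cohens_f(0.27)

print(f"shame: eta_p^2 = {shame_eta:.3f}")
print(f"guilt: eta_p^2 = {guilt_eta:.3f}")
```

On this scale the original shame and guilt effects differ noticeably (about .13 versus .07), whereas the replication values listed above are nearly identical (.14 and .13).
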
Verbal overshadowing effect. In a series of six experiments, verbalising the appearance of previously seen visual stimuli impaired subsequent recognition performance.
Statistics
- Status: replicated
- Original paper: ‘Verbal overshadowing of visual memories: Some things are better left unsaid’, Schooler and Engstler-Schooler 1990; experiment, n = 117 (study 4), n = 88 (study 1), n = 104 (study 2). [citations=1218 (GS, November 2022)].
- Critiques: Experiment 1 and 4:
Alogna 2014 [n=1105 (experiment 1), n = 663 (experiment 2), citations=192 (GS, November 2022)]. Meta-analysis.
- Original effect size: Experiment 1: -22%, Experiment 2: -25%.
- Replication effect size: Alogna: Experiment 1: −4.01% [−7.15%, −0.87%]. Experiment 2: −16.31% [−20.47%, −12.14%].
Age of acquisition effects - influence on free recall (pure block). Early-acquired items are recalled more accurately than late-acquired items when early-acquired items are presented in a separate block and late-acquired items are presented in a separate block.
Age of acquisition effects - influence on free recall (mixed block). Early-acquired items are recalled more accurately than late-acquired items when early-acquired items are mixed with late-acquired items in a block.
Age of acquisition effects - influence on recognition (mixed block). Early-acquired items are recalled more accurately than late-acquired items.
Statistics
- Status: reversed
- Original paper:
‘Word imagery but not age of acquisition affects episodic memory’, Coltheart and Winograd 1986; experiment, Experiment 2: n = 102. [citations=44(GS, November 2022)].
- Critiques:
Dewhurst et al. 1998 [Experiment 1: n=30, Experiment 2: n = 30; citations=117(GS, November 2022)].
Macmillan et al. 2022 [n = 44, citations = 9 (GS, November 2022)].
- Original effect size: ηp² = .03 [ηp2 calculated from reported F statistic and converted using this conversion].
- Replication effect size: Dewhurst et al.: Experiment 1: Hits: ηp² = 0.42 [ηp2 calculated from reported F statistic and converted using this conversion], False alarms: F < 1, d’: ηp² = 0.31 [ηp2 calculated from reported F statistic and converted using this conversion]; Experiment 2: Hits: d = 0.74 [d calculated from reported t statistic and converted using this conversion], False alarms: d = 0.57 [d calculated from reported t statistic and converted using this conversion], d’: d = 0.09 [d calculated from reported t statistic and converted using this conversion]. Macmillan et al.: Hits: d = 0.023, False alarms: d = 0.56, d’: d = 0.44, C = 0.35, da = 0.65, slope = 0.25.
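
Several entries in this section report effect sizes "calculated from reported F statistic" or "from reported t statistic". The exact converter linked in those entries is not reproduced here; the sketch below shows the standard textbook formulas such conversions commonly rely on. The function names and the input values are hypothetical and are not taken from any of the cited papers.

```python
import math


def eta_p2_from_f(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def d_from_t_independent(t_value, df):
    """Cohen's d from an independent-samples t (assumes equal group sizes)."""
    return 2 * t_value / math.sqrt(df)


def d_from_t_paired(t_value, n):
    """Standardised mean difference (d_z) from a paired-samples t."""
    return t_value / math.sqrt(n)


# Hypothetical inputs, for illustration only.
print(eta_p2_from_f(10.0, 1, 30))
print(d_from_t_independent(2.0, 16))
print(d_from_t_paired(3.0, 9))
```

Which t-to-d formula applies depends on whether the comparison is between or within participants, so values converted this way should be read alongside the design of each study.
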
Age of acquisition influences the pre-conceptual stages of lexical retrieval (progressive demasking). Early-acquired items are identified more accurately than late-acquired items, using a progressive demasking task. A progressive demasking task is a type of perceptual identification task where participants are presented with a series of words that are gradually revealed over time and their ability to identify words at each stage of the task is measured. Words learned at an earlier age are thought to be easier to demask than those learned later in life, perhaps because the individual has gained more experience and exposure to the word, which can make it easier to recognize.
Statistics
- Status: not replicated
- Original paper:
‘Word age-of-acquisition and visual recognition threshold’, Gilhooly and Logie 1981a; experiments, Experiment 1: n = 36, Experiment 2: n = 18. [citations=32(GS, December 2022)].
- Critiques:
Gilhooly and Logie 1981b [n = 16, citations = 101(GS, December 2022)].
Ghyselinck et al. 2004 [n = 21, citations = 192(GS, December 2022)].
Chen et al. 2009 [n = 30, citations = 28(GS, December 2022)].
Ploetz and Yates 2016 [n = 64, citations = 1(GS, December 2022)].
- Original effect size: Experiment 1: Beta = 0.05; Experiment 2: Beta = 0.03.
- Replication effect size: Gilhooly and Logie: Beta = 0.09; Ghyselinck et al.: ηp² = 0.58 [ηp2 calculated from reported F statistic and converted using this conversion]. Chen et al.: ηp² = 0.27 [ηp2 calculated from reported F statistic and converted using this conversion]. Ploetz and Yates: ηp² = .124.
Age of acquisition influence on the pre-conceptual stages of lexical retrieval (object decision). The age at which one acquires the concept of an object does not contribute to the speed and accuracy of deciding whether a pictured object is a real object or a non-real object with chimeric features.
Statistics
- Status: not replicated
- Original paper:
‘Age of acquisition, not word frequency, affects object naming, not object recognition’, Morrison et al. 1992; experiment, n = 20. [citations=495(GS, December 2022)].
- Critiques:
Catling and Johnston 2009 [Experiment 2: n = 20; citations = 54 (GS, December 2022)].
Holmes and Ellis 2006 [Experiment 2: n = 20, Experiment 3: n = 20, Experiment 7: n = 46, citations = 87 (GS, December 2022)].
Moore et al. 2004 [Experiment 1: n = 39, Experiment 2: n = 38, citations = 79 (GS, December 2022)].
Vitkovitch and Tyrrell 1995 [n = 16, citations = 211 (GS, December 2022)].
- Original effect size: Beta = .044.
- Replication effect size: Catling and Johnston: ηp2 = 0.18 [ηp2 calculated from reported F statistic and converted using this conversion]. Holmes and Ellis: Experiment 2: d = 1.18 [d calculated from t statistic and converted using this conversion], Experiment 3: d = 1.44 [d calculated from t statistic and converted using this conversion], Experiment 7: ηp2 = 0.38 [ηp2 calculated from reported F statistic and converted using this conversion]. Moore et al.: ηp2 = 0.27 [ηp2 calculated from reported F statistic and converted using this conversion]. Vitkovitch and Tyrrell: Beta = .426.
Age of acquisition influence on the pre-conceptual stages of lexical retrieval (anagram solution). Age of acquisition is thought to affect lexical retrieval through its impact on anagram (word jumbles) solutions, such that words acquired at an earlier age tend to be solved more quickly and accurately in anagram tasks than those learned later in life. This may be because words learned earlier in life are more deeply encoded and may therefore be more easily accessed.
Age of acquisition influence on the pre-conceptual stages of lexical retrieval (visual duration threshold). Early-acquired items are identified more accurately than late-acquired items, using visual duration threshold task.
Age of acquisition influence on the pre-conceptual stages of lexical retrieval (category verification). The age at which one acquires an object does not contribute to the speed and accuracy of category verification during a semantic categorisation task (where participants decide whether an object belongs to one category or another, e.g. tools vs. furniture).
Statistics
- Status: reversed
- Original paper: ‘Age-of-acquisition effects in picture naming: Are they structural and/or semantic in nature?’, Chalard and Bonin 2006; experiment, n = 27. [citations=36(GS, December 2022)].
- Critiques:
Catling and Elsherif 2020 [Experiment 1a: n = 48, Experiment 2a: n = 48, citations = 12(GS, December 2022)].
Catling and Johnston 2006 [Experiment 1: n = 15, citations = 17(GS, December 2022)].
Catling and Johnston 2009 [Experiment 1: n = 24, citations = 54 (GS, December 2022)].
Holmes and Ellis 2006 [Experiment 4: n = 20, Experiment 7: n = 30, citations = 87 (GS, December 2022)].
Räling et al. 2015 [n = 36, citations = 24(GS, December 2022)].
Stadthagen-Gonzalez et al. 2009 [n = 100, citations = 51 (GS, December 2022)].
- Original effect size: NA.
- Replication effect size: Catling and Elsherif: Experiment 1a: d = 0.23, Experiment 2a: d = 0.25. Catling and Johnston: ηp2 = 0.46. Catling and Johnston: ηp2 = 0.19 [ηp2 calculated from reported F statistic and converted using this conversion]. Holmes and Ellis: Experiment 4: d = 1.62 [d calculated from reported t statistic in category verification and converted using this conversion]; Experiment 8: t < 1. Räling et al.: ηp2 = 0.45 [ηp2 calculated from reported F statistic and converted using this conversion]. Stadthagen-Gonzalez et al.: Beta = 2.43.
Age of acquisition influence on the pre-conceptual stages of lexical retrieval (category falsification). The age at which one acquires the name of an object does not contribute to the speed and accuracy of category falsification (i.e. deciding that a different word and the picture of the acquired concept do not match; e.g. the picture of the acquired concept of a rabbit, paired with the non-matching word “mouse”).
Statistics
- Status: mixed
- Original paper: ‘
Age of acquisition and typicality effects in three object processing tasks’, Holmes and Ellis 2006; experiment, n = 20. [citations=87(GS, December 2022)].
- Critiques:
Catling and Elsherif 2020 [Experiment 1a: n = 48, Experiment 2a: n = 48, citations = 12(GS, December 2022)].
Catling and Johnston 2006 [Experiment 1: n = 15, citations = 17(GS, December 2022)].
Stadthagen-Gonzalez et al. 2009 [n = 100, citations = 51 (GS, December 2022)].
- Original effect size: t < 1.
- Replication effect size: Catling and Elsherif: Experiment 1a: d = 0.14, Experiment 2a: d = 0.16. Catling and Johnston: ηp2 = 0.297. Stadthagen-Gonzalez et al.: Beta = 0.43.
Age of acquisition influence on face recognition. Early-acquired faces are recognised more quickly and accurately than late-acquired faces.
Age of acquisition influence on face familiarity decision. Early-acquired faces are recognised as familiar faces more quickly than late-acquired faces when the task is to discriminate between familiar and unfamiliar faces.
Age of acquisition influence on face gender decision. The age at which a celebrity face is acquired does not affect the speed to recognise a celebrity’s face, using a gender decision task (is this face male or female?).
Age of acquisition influence on semantic decision. Early-acquired semantic concepts are categorised more quickly and accurately than later acquired concepts.
Statistics
- Status: replicated
- Original paper:
‘Age-of-acquisition effects in semantic processing tasks’, Brysbaert et al. 2000; experimental design, Experiment 2: n = 36. [citations = 307(GS, December 2022)].
- Critiques:
Bai et al. 2013 [Experiment 3: n = 32, citations = 6(GS, December 2022)].
Chen et al. 2007 [Experiment 2: n = 28, citations = 43(GS, December 2022)].
De Deyne and Storms 2007 [young adult: n = 21, older adult: n = 21, citations = 35 (GS, December 2022)].
Ghyselinck et al. 2004 [n = 20, citations = 192 (GS, December 2022)].
Izura and Hernandez-Munoz 2017 [first categorisation task n = 30, second categorisation task: n = 26, citations = 1 (GS, December 2022)].
- Original effect size: Experiment 2: ηp2 = 0.75 [ηp2 calculated from reported F statistic and converted using this conversion].
- Replication effect size: Bai et al.: ηp2 = 0.10 [ηp2 calculated from reported F statistic and converted using this conversion]. Chen et al.: ηp2 = 0.57 [ηp2 calculated from reported F statistic and converted using this conversion]. De Deyne and Storms: young adult: beta = 11.68, older adult: beta = 4.09. Ghyselinck et al.: ηp2 = 0.47 [ηp2 calculated from reported F statistic and converted using this conversion]. Izura and Hernandez-Munoz: first categorisation task: Beta = .256, second categorisation task: Beta = -0.017.
Age of acquisition influence on the conceptual stages of lexical retrieval in opaque languages (spoken picture naming in opaque language). Early-acquired objects are named more quickly and accurately than late-acquired objects in opaque languages or deep orthography (i.e. spelling-sound correspondence is not direct where one is able to pronounce the word correctly based on the spelling; e.g. English, French).
Statistics
- Status: replicated
- Original paper: ‘
Age-of-acquisition norms for 220 picturable nouns’, Carroll and White 1973; experiment, n = 62. [citations=339(GS, January 2023)].
- Critiques:
Alario et al. 2004 [n = 46, citations = 372 (GS, January 2023)].
Bonin et al. 2001 [n = 30, citations=166(GS, December 2022)].
Bonin et al. 2003 [n = 30, citations = 381(GS, January 2023)].
Catling and Elsherif 2020 [Experiment 1b: n = 48, citations = 12(GS, December 2022)].
Catling and Johnston 2009 [Experiment 4: n = 24, citations = 54 (GS, December 2022)].
Johnston et al. 2010 [n = 25, citations = 35(GS, January 2023)].
Karimi and Diaz 2020 [n = 212, citations = 9(GS, January 2023)].
Perret et al. 2014 [n = 21, citations = 42(GS, December 2022)].
Schwitter et al. 2004 [n = 31, citations = 52(GS, January 2023)].
Snodgrass and Yuditsky 1996 [n = 84, citations = 403(GS, January 2023)].
- Original effect size: ratings: r = -.771, objective: r = .773.
- Replication effect size: Alario et al.: beta = 69.4. Bonin et al.: beta = .194. Bonin et al.: ηp2 = 0.81 [ηp2 calculated from reported F statistic and converted using this conversion]. Catling and Elsherif: Experiment 2b: d = 1.15 [d calculated from reported t statistic and converted using this conversion]. Catling and Johnston: Experiment 4: d = 0.45. Johnston et al.: beta = .341. Karimi and Diaz: beta = .072. Perret et al.: d = 0.82 [d calculated from reported t statistic and converted using this conversion]; Schwitter et al.: beta = .222. Snodgrass and Yuditsky: beta = .30.
Age of acquisition influence on the conceptual stages of lexical retrieval in logographic languages (spoken picture naming in logographic languages). Early-acquired names of objects are produced more quickly and accurately than late-acquired names in logographic languages such as Japanese and Chinese.
Statistics
- Status: replicated
- Original paper: ‘
Predictors of timed picture naming in Chinese’, Weekes et al. 2007; experiments, Experiment 1: n = 30, Experiment 2: n = 100. [citations=78(GS, December 2022)].
- Critique:
Liu et al. 2011 [n = 30, citations = 84(GS, December 2022)].
- Original effect size: Experiment 1: beta = .19, Experiment 2: beta = .24.
- Replication effect size: Liu et al.: objective AoA: r = .591, rated AoA: r = .475.
Age of acquisition influence on the conceptual stages of lexical retrieval in transparent languages (spoken picture naming in transparent language). Early-acquired objects are named more quickly and accurately than late-acquired objects in transparent languages or shallow orthography (i.e. spelling-sound correspondence is direct where one is able to pronounce the word correctly based on the spelling; e.g. Spanish, Turkish, Italian).
Statistics
- Status: replicated
- Original paper:
‘Naming times for the Snodgrass and Vanderwart pictures in Spanish’, Cuetos et al. 1999; experiment, n = 64. [citations=301(GS, January 2023)].
- Critiques:
Cuetos and Alija 2003 [n = 54, citations = 86(GS, January 2023)].
Severens et al. 2005 [n = 40, citations = 192(GS, January 2023)].
Shao et al. 2014 [n = 117, citations = 44(GS, January 2023)].
Wolna et al. 2022 [n = 98, citations = 0 (GS, January 2023)].
- Original effect size: beta = 39.16.
- Replication effect size: Cuetos and Alija: beta = 0.542. Severens et al.: beta = 0.24. Shao et al.: beta = 0.25. Wolna et al.: pictures of objects: d = -0.49, pictures of actions: d = -0.29.
Age of acquisition influence on the conceptual stages of lexical retrieval (written picture naming). Early-acquired object names are written more quickly and accurately than late-acquired names.
Age of acquisition influence on the conceptual stages of lexical retrieval (typing). Early-acquired object names are typed more quickly than late-acquired object names. Typing allows a more precise measure of response execution, whereas written picture naming is a measure of lexical retrieval.
Statistics
- Status: replicated
- Original paper:
‘Naming times for the Snodgrass and Vanderwart pictures’, Snodgrass and Yuditsky 1996; Experiment 2, n = 96. [citations=403(GS, December 2022)].
- Critiques:
Scaltritti et al. 2016 [n = 86, citations = 26(GS, December 2022)].
- Original effect size: RT: beta = 0.19; accuracy: beta = -0.31.
- Replication effect size: Scaltritti et al.: onset latency: d = 0.66 [d calculated from reported t statistic and converted using this conversion]; interkeystroke interval: not reported.
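Several entries report a d value "calculated from reported t statistic" through a linked conversion that is not reproduced in this text. The sketch below shows two standard textbook conversions that such a tool may implement. The function names are hypothetical, and which formula (between-subjects or within-subjects) applies to a given study is an assumption that has to be checked against that study's design.

```python
import math


def d_from_t_independent(t: float, n1: int, n2: int) -> float:
    """Cohen's d from an independent-samples t statistic.

    Assumes two separate groups of sizes n1 and n2.
    """
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


def d_from_t_paired(t: float, n: int) -> float:
    """Standardised mean difference (d_z) from a paired or one-sample t.

    Assumes n participants (or items) each measured in both conditions.
    """
    return t / math.sqrt(n)


if __name__ == "__main__":
    # Illustrative values only; these are not taken from any study listed here.
    print(round(d_from_t_independent(2.0, 50, 50), 3))
    print(round(d_from_t_paired(3.0, 36), 3))
```

The two formulas can give quite different values for the same t, so d values from between-subjects and within-subjects designs are not directly comparable.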
Age of acquisition influence on the post-lexical stages of lexical retrieval (delayed spoken picture naming). Early-acquired words should not differ from late-acquired words in accuracy or response speed of spoken naming in a delayed picture naming task, which requires participants to name a picture a few seconds (e.g. 2-4 sec) after seeing it. This task enables researchers to assess whether any naming delay arises at the articulatory level, as opposed to the conceptual level or the lexical retrieval stage.
Age of acquisition influence on the post-lexical stages of lexical/sublexical retrieval (delayed spoken word naming). Early-acquired words should not differ from late-acquired words, when using delayed word naming. This enables researchers to assess if the lexical/sublexical effects result at an articulatory level.
Age of acquisition influence on lexical retrieval (written word naming). Early-acquired words are written and spelled more quickly and accurately than late-acquired words. In contrast to written picture naming, written word naming involves access to the lexical and sublexical pathways that are not accessed in typing or written picture naming.
Age of acquisition influence on lexical retrieval in opaque languages (immediate spoken word naming in opaque language). Early-acquired words are named more quickly and accurately than late-acquired words in opaque languages, i.e. those with a deep orthography (spelling-sound correspondence is not direct, so a word cannot always be pronounced correctly from its spelling; e.g. English, French).
Statistics
- Status: replicated
- Original paper:
‘First in, first out: Word learning age and spoken word frequency as predictors of word familiarity and word naming latency’, Brown and Watson 1987; experiment, n = 28. [citations=468(GS, January 2023)].
- Critiques:
Catling and Elsherif 2020 [n = 48, citations = 12(GS, January 2023)].
Cortese et al. 2018 [n = 25, citations = 24(GS, January 2023)].
Dewhurst and Barry 2006 [Experiment 1: n = 30, Experiment 2: n = 30, citations = 13(GS, January 2023)].
Elsherif et al. 2020 [n = 48, citations = 10(GS, January 2023)].
Izura and Playfoot 2012 [n = 120, citations = 35(GS, January 2023)].
Morrison and Ellis 2000 [n = 27, citations = 293(GS, January 2023)].
- Original effect size: r = .30.
- Replication effect size: Catling and Elsherif: Experiment 3b: beta = −0.01. Cortese et al.: beta = .132. Dewhurst and Barry: Experiment 1: ηp2 = 0.61 [ηp2 calculated from reported F statistic and converted using this conversion], Experiment 2: d = 1.22 [d calculated from reported t statistic and converted using this conversion]. Elsherif et al.: beta = 0.141. Izura and Playfoot: r = .249. Morrison and Ellis: r = .244.
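Many entries give a partial eta squared (ηp2) "calculated from reported F statistic". The linked conversion is not reproduced here, so the following is only a sketch of the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The function name is hypothetical, and it is an assumption that the linked tool uses this identity.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


if __name__ == "__main__":
    # Illustrative values only; not taken from any study listed here.
    print(partial_eta_squared(10.0, 1, 30))
```

Because the error degrees of freedom sit in the denominator, the same F value yields a larger ηp2 in a small study than in a large one. This is one reason ηp2 values from small samples should be compared with caution.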
Age of acquisition influence on lexical retrieval (immediate spoken word naming in transparent language). Early-acquired words are named more quickly and accurately than late-acquired words in transparent languages, i.e. those with a shallow orthography (spelling-sound correspondence is direct, so a word can be pronounced correctly from its spelling; e.g. Italian, Spanish).
Statistics
- Status: mixed
- Original paper:
‘Word Frequency Affects Naming Latency in Dutch when Age of Acquisition is Controlled’, Brysbaert 1996; experiment, n = 22. [citations=73(GS, January 2023)].
- Critiques:
Brysbaert et al. 2000 [n = 20, citations = 227 (GS, January 2023)].
Cuetos and Barbon 2006 [n = 53, citations = 96(GS, January 2023)].
De Luca et al. 2008 [n = 51, citations = 55(GS, January 2023)].
Ghyselinck et al. 2004 [n = 21, citations = 192(GS, January 2023)].
Raman 2006 [n = 28, citations = 69(GS, January 2023)].
Wilson et al. 2012 [Experiment 1: n = 40, Experiment 2: n = 32, citations = 25(GS, January 2023)].
Wilson et al. 2013 [Experiment 1: n = 27, Experiment 4: n = 33, citations = 37(GS, January 2023)].
- Original effect size: beta = -0.58.
- Replication effect size: Brysbaert et al.: ηp2 = 0.30 [ηp2 calculated from reported F statistic and converted using this conversion]. Cuetos and Barbon: objective AoA: r = .316, subjective AoA: r = .384. De Luca et al.: not reported. Ghyselinck et al.: ηp2 = 0.24 [ηp2 calculated from reported F statistic and converted using this conversion]. Raman: d = 0.48 [d calculated from reported t statistic and converted using this conversion]. Wilson et al. 2012: Experiment 1: ηp2 = 0.07 [ηp2 calculated from reported F statistic and converted using this conversion], Experiment 2: ηp2 = 0.33 [ηp2 calculated from reported F statistic and converted using this conversion]. Wilson et al. 2013: Experiment 1: ηp2 = 0.21 [ηp2 calculated from reported F statistic and converted using this conversion], Experiment 4: ηp2 = 0.48 [ηp2 calculated from reported F statistic and converted using this conversion].
Age of acquisition influence on lexical retrieval in logographic languages (spoken character naming in logographic languages). Early-acquired characters are named more quickly and accurately than late-acquired characters in logographic languages such as Japanese and Chinese.
Statistics
- Status: replicated
- Original paper:
‘Two age of acquisition effects in the reading of Japanese Kanji’, Yamazaki et al. 1997; experiment, n = 26. [citations=107(GS, December 2022)].
- Critiques:
Chen et al. 2007 [n = 26, citations = 43(GS, December 2022)].
Havelka and Tomita 2006 [n = 40, citations = 45(GS, December 2022)].
Liu et al. 2007 [n = 480, citations = 161(GS, December 2022)].
Liu et al. 2008 [n = 39, citations = 29(GS, December 2022)].
- Original effect size: Spoken: beta = 0.236, Written: beta = 0.343.
- Replication effect size: Chen et al.: Experiment 1: ηp2 = 0.41 [calculated using this conversion from F to ηp2]. Havelka and Tomita: ηp2 = 0.51 [calculated using this conversion from F to ηp2]. Liu et al. 2007: beta = .670. Liu et al. 2008: ηp2 = 0.53 [calculated using this conversion from F to ηp2].
Age of acquisition influence on speeded phonological retrieval (transparent language). Early-acquired words are responded to more quickly and accurately than late-acquired words, using a speeded naming paradigm, where participants must name the items as quickly as possible within a short timeframe (e.g. 400 milliseconds). This effect is argued to reduce the influence of semantics on phonological activation, which is argued to accumulate over the word naming process.
Age of acquisition influence on lexical retrieval (auditory lexical decision task). Early-acquired words are heard and responded to more quickly and accurately than late-acquired words, using auditory lexical decision tasks where participants have to judge whether they heard a real word or not.
Statistics
- Status: replicated
- Original paper:
‘Lexical Search Speed in Children and Adults’, Cirrin 1984; experimental study, kindergarten children: n = 12; first grade children: n = 11; third grade children: n = 11; adults: n = 11. [citations = 50(GS, December 2022)].
- Critiques:
Baumgaertner and Tompkins 1998 [n = 35, citations = 15(GS, December 2022)].
Turner et al. 1998 [n = 20, citations = 161(GS, December 2022)].
- Original effect size: kindergarten children: r = .485, first grade children: r = .172, third grade children: r = .369, adults: r = .379
- Replication effect size: Baumgaertner and Tompkins: r = 0.66. Turner et al.: d = 1.30 [d calculated from reported t statistic and converted using this conversion].
Age of acquisition influence on lexical retrieval (visual lexical decision in opaque languages). Early-acquired words are responded to more quickly and accurately than late-acquired words in opaque languages, i.e. those with a deep orthography (spelling-sound correspondence is not direct, so a word cannot always be pronounced correctly from its spelling; e.g. English, French), in a visual lexical decision task. Participants have to decide whether what they saw was a word or not.
Statistics
- Status: replicated
- Original paper:
‘Word-nonword classification time’, Whaley 1978; experiment, n = 32. [citations= 579(GS, December 2022)].
- Critiques:
Boulenger et al. 2007 [n = 20, citations = 24(GS, December 2022)].
Cortese et al. 2018 [n = 25, citations = 24(GS, December 2022)].
Gerhand and Barry 1999 [Experiment 1: n = 30, Experiment 2: n = 30, Experiment 3: n = 30, Experiment 4: n = 30, Experiment 5: n = 30, citations = 185(GS, December 2022)].
Morrison and Ellis 1995 [n = 16, citations = 599(GS, December 2022)].
Morrison and Ellis 2000 [n = 24, citations = 293(GS, December 2022)].
Schwanenflugel et al. 1989 [experiment 2: n = 44, citations = 536(GS, December 2022)].
Sereno and O’Donnell 2009 [n = 97, citations = 12(GS, December 2022)].
Turner et al. 1998 [n = 25, citations = 161(GS, December 2022)].
- Original effect size: r = 0.63.
- Replication effect size: Boulenger et al.: slope for nouns = 24.17, slope for verbs = 15.39. Cortese et al.: beta = .340. Gerhand and Barry: Experiment 1: ηp2 = 0.40 [ηp2 calculated from reported F statistic and converted using this conversion], Experiments 2-5 (collapsed together): ηp2 = 0.33 [ηp2 calculated from reported F statistic and converted using this conversion]. Morrison and Ellis 1995: d = 3.00 [d calculated from reported t statistic and converted using this conversion]. Morrison and Ellis 2000: beta = 0.67. Schwanenflugel et al.: r = .15. Sereno and O’Donnell: ηp2 = 0.53 [ηp2 calculated from reported F statistic and converted using this conversion]. Turner et al.: d = 0.58 [d calculated from reported t statistic and converted using this conversion].
Age of acquisition influence on lexical retrieval (visual lexical decision in transparent language). Early-acquired words are responded to more quickly and accurately than late-acquired words in transparent languages, i.e. those with a shallow orthography (spelling-sound correspondence is direct, so a word can be pronounced correctly from its spelling; e.g. Spanish, Turkish, Italian).
Statistics
- Status: replicated
- Original paper:
‘The effects of age-of-acquisition and frequency-of-occurrence in visual word recognition: Further evidence from the Dutch language’, Brysbaert et al. 2000; experiment, n = 20. [citations = 227 (GS, January 2023)].
- Critiques:
Colombo and Burani 2002 [Experiment 1: n = 20, citations = 82(GS, January 2023)].
de Deyne and Storms 2007 [Young adult: n = 22, Older adults: n = 20, citations = 35(GS, January 2023)].
Fiebach et al. 2003 [n = 12, citations = 117(GS, January 2023)].
Gonzalez-Nosti et al. 2014 [n = 58, citations = 58(GS, January 2023)].
Izura and Hernandez-Munoz 2017 [n = 80, citations = 1(GS, January 2023)].
Menenti and Burani 2007 [Italian speakers n = 54, Dutch speakers: n = 50, citations = 51(GS, January 2023)].
- Original effect size: ηp2 = 0.61 [ηp2 calculated from reported F statistic and converted using this conversion].
- Replication effect size: Colombo and Burani: Experiment 1: r = .502. de Deyne and Storms: young adults: r = .62, older adults: r = .74. Fiebach et al.: ηp2 = 0.80 [ηp2 calculated from reported F statistic and converted using this conversion]. Gonzalez-Nosti et al.: r = .602. Izura and Hernandez-Munoz: beta = .486. Menenti and Burani: Dutch: beta = .10, Italian: beta = .07.
Age of acquisition influence on lexical retrieval (visual lexical decision in logographic languages). Early-acquired logograms are responded to more quickly and accurately than late-acquired logograms in logographic languages such as Chinese and Japanese, using a visual lexical decision task.
Age of acquisition influence on silent reading (eye-tracking). Early-acquired words show shorter fixations, gaze and total reading times than late-acquired words in sentences and paragraphs, using eye-tracking.
Statistics
- Status: replicated
- Original paper:
‘Investigating the effects of a set of intercorrelated variables on eye fixation durations in reading’, Juhasz and Rayner 2003; experiment, n = 40. [citations=311(GS, December 2022)].
- Critiques:
Dirix and Duyck 2017 [n = 14, citations = 17(GS, December 2022)].
Juhasz 2018 [n = 45, citations = 24(GS, December 2022)].
Juhasz and Rayner 2006 [Experiment 1: n = 32, Experiment 2: n = 40, citations = 185 (GS, December 2022)].
Juhasz and Sheridan 2020 [n = 47, citations = 2(GS, December 2022)].
- Original effect size: First fixation: beta = 4.01, Single fixation: beta = 6.55, Gaze duration: beta = 6.62, Total duration: beta = 7.00.
- Replication effect size: Dirix and Duyck: Single fixation: d = 0.95 [d calculated from reported t statistic and converted using this conversion], Gaze duration: d = 0.81 [d calculated from reported t statistic and converted using this conversion], total reading time: d = 0.71 [d calculated from reported t statistic and converted using this conversion]. Juhasz: First fixation: d = 0.27 [d calculated from reported t statistic and converted using this conversion], single fixation: d = 0.29 [d calculated from reported t statistic and converted using this conversion], gaze duration: d = 0.34 [d calculated from reported t statistic and converted using this conversion], total fixation: d = 0.37 [d calculated from reported t statistic and converted using this conversion]. Juhasz and Rayner: Experiment 1: First fixation: ηp2 = 0.29 [ηp2 calculated from reported F statistic and converted using this conversion], Single fixation: ηp2 = 0.23 [ηp2 calculated from reported F statistic and converted using this conversion], Gaze duration: ηp2 = 0.43 [ηp2 calculated from reported F statistic and converted using this conversion], Total duration: ηp2 = 0.34 [ηp2 calculated from reported F statistic and converted using this conversion], Experiment 2: First fixation: d = 0.39 [d calculated from reported t statistic and converted using this conversion], Single fixation: d = 0.42 [d calculated from reported t statistic and converted using this conversion], Gaze duration: d = 0.35 [d calculated from reported t statistic and converted using this conversion], Total duration: d = 0.31 [d calculated from reported t statistic and converted using this conversion]; Juhasz and Sheridan: First fixation: d = 0.22, single fixation: d = 0.20, gaze duration: d = 0.20, total fixation: d = 0.27; skipping percentage: d = .17.
Age of acquisition influence on name retrieval. The earlier an individual learns a celebrity name and face, the more quickly and accurately the participant will name the celebrity.
Statistics
- Status: replicated
- Original paper:
‘The Effect of Age of Acquisition on Speed and Accuracy of Naming Famous Faces’, Moore and Valentine 1998; experiments, Experiment 1: n = 30, Experiment 2: n = 24, Experiment 3: n = 24. [citations = 83 (GS, December 2022)].
- Critiques:
Smith-Spark et al. 2012 [Experiment: n = 72, citations = 3 (GS, December 2022)].
- Original effect size: Experiment 1: RT: ηp2 = 0.37 [ηp2 calculated from reported F statistic and converted using this conversion], accuracy: ηp2 = 0.40 [ηp2 calculated from reported F statistic and converted using this conversion]; Experiment 2: RT: ηp2 = 0.63 [ηp2 calculated from reported F statistic and converted using this conversion], accuracy: ηp2 = 0.27 [ηp2 calculated from reported F statistic and converted using this conversion]; Experiment 3: RT: ηp2 = 0.38 [ηp2 calculated from reported F statistic and converted using this conversion], accuracy: ηp2 = 0.24 [ηp2 calculated from reported F statistic and converted using this conversion].
- Replication effect size: Smith-Spark et al.: accuracy: ηp2 = 0.13 [ηp2 calculated from reported F statistic and converted using this conversion]; RT: ηp2 = 0.43 [ηp2 calculated from reported F statistic and converted using this conversion].
Age of acquisition on lexical-semantic processes (translation task). Compared to late-acquired words, early-acquired words in a native or other language are translated more quickly to the other language or the native language, respectively.
Statistics
- Status: replicated
- Original paper:
‘Characteristics of words determining how easily they will be translated into a second language’, Murray 1986; experiment, n = 16 [citations= 12(GS, December 2022)].
- Critiques:
Izura and Ellis 2004 [Experiment 1: n = 20, Experiment 3: n = 20, citations = 102(GS, December 2022)].
Bowers and Kennison 2011 [n = 36, citations = 19(GS, December 2022)].
- Original effect size: L1 AoA and translate to L1: r = .27, L1 AoA and translate to L2: r = .19.
- Replication effect size: Izura and Ellis: Experiment 1: ηp2 = 0.03 for L1 AoA [ηp2 calculated from reported F statistic and converted using this conversion], ηp2 = 0.73 for L2 AoA [ηp2 calculated from reported F statistic and converted using this conversion]; Experiment 3: ηp2 = 0.33 for L1 AoA [ηp2 calculated from reported F statistic and converted using this conversion], ηp2 = 0.47 for L2 AoA [ηp2 calculated from reported F statistic and converted using this conversion]. Bowers and Kennison: ηp2 = 0.90.
Age of acquisition influence on lexical-semantic processes (picture-word interference). The pictures of objects whose concept is acquired earlier show smaller semantic interference with simultaneously appearing semantically related words, compared to when the task is done using pictures of objects whose concept is acquired later.
Age of acquisition influence on lexical change. In contrast to the meaning of late-acquired words, the meaning of early-acquired words is less likely to change over time in the conceptual representation of the speaker and community.
Age of acquisition influence on learning (conceptual learning). The earlier a concept is learned, the more strongly it is consolidated and the more likely it is to be recalled.
Statistics
- Status: replicated
- Original paper:
‘Order of acquisition in learning perceptual categories: A laboratory analogue of the age-of-acquisition effect?’, Stewart and Ellis 2008; experiment, n = 27. [citations=35(GS, December 2022)].
- Critiques:
Catling et al. 2013 [n = 16, citations = 19(GS, December 2022)].
Izura et al. 2011 [Experiment 1: n = 25, Experiment 2: n = 26, Experiment 3: n = 24, citations = 78(GS, December 2022)].
- Original effect size: ηp2 = 0.17 [ηp2 calculated from reported F statistic and converted using this conversion].
- Replication effect size: Catling et al.: naming: ηp2 = 0.23 [ηp2 calculated from reported F statistic and converted using this conversion], visual duration threshold: ηp2 = 0.27 [ηp2 calculated from reported F statistic and converted using this conversion]. Izura et al.: Experiment 1: ηp2 = 0.49, Experiment 2: ηp2 = 0.20, Experiment 3: delayed picture naming: d = 0.36 [d calculated from reported t statistic and converted using this conversion] / ηp2 = 0.12 [ηp2 calculated from Cohen’s d and converted using this conversion], immediate picture naming: F < 1, corrected latencies: ηp2 = 0.15, lexical decision: ηp2 = 0.32, semantic categorisation: ηp2 = 0.20.
Age of acquisition influence on learning (procedural). The order in which the new actions of a procedure are learned influences the speed and accuracy of recalling their correct position.
Statistics
- Status: replicated
- Original paper:
‘Acquisition and long-term retention of a simple serial perceptual-motor skill’, Neumann and Amons 1956; experiment, n = 20. [citations=56(GS, December 2022)].
- Critiques:
Magil 1976 [n = 105, citations = 13 (GS, December 2022)].
- Original effect size: ηp2 = 0.23 [ηp2 calculated from reported F statistic and converted using this conversion].
- Replication effect size: Magil: ηp2 = 0.06 for position 1 and 2 in block number 3 [ηp2 calculated from reported F statistic and converted using this conversion], ηp2 = 0.06 for position 1 and 3 in block number 3 [ηp2 calculated from reported F statistic and converted using this conversion], F < 1 for position 2 and 3 in block number 3, ηp2 = 0.05 for position 1 and 2 in block number 4 [ηp2 calculated from reported F statistic and converted using this conversion], ηp2 = 0.11 for position 1 and 3 in block number 3 [ηp2 calculated from reported F statistic and converted using this conversion], ηp2 = 0.01 for position 2 and 3 in block number 4.
Ego depletion. Self-control is a limited resource that can be depleted by efforts to inhibit a thought, emotion or behaviour.
Statistics
- Status: not replicated
- Original paper:
‘Ego Depletion: Is the Active Self a Limited Resource?’, Baumeister 1998, n=67 [citations = 7141 (GS, September 2022)].
- Critiques:
Xu et al. 2014, 4 conceptual replications with high-power to detect medium-large effects [citations = 7136 (GS, September 2022)].
Hagger 2016, 23 independent conceptual replications [citations = 1027 (GS, September 2022)].
Vohs et al. 2021, multisite project, n = 3,531 [citations = 63 (GS, September 2022)].
- Original effect size: not reported (calculated d = -1.96 between control and worst condition).
- Replication effect size: Xu et al. 2014: hand grip persistence, community adults d = −0.30, young adults d = −0.002, combined difference d = −0.20; Stroop interference, community adults d = −0.15, young adults d = .21, combined difference d = −0.06. Hagger 2016: d = 0.04 [−0.07, 0.14] (NB: not testing the construct the same way). Vohs et al. 2021: d = 0.06.
Dunning-Kruger effect. A cognitive bias whereby people with limited knowledge or competence in a given intellectual or social domain greatly overestimate their own knowledge or competence in that domain relative to objective criteria or to the performance of their peers or of people in general.
Statistics
- Status: replicated
- Original paper:
‘Unskilled and unaware of it: how difficulties in recognizing one’s own incompetence lead to inflated self-assessments’, Dunning & Kruger 1999. This contains claims (1), (2), and (5) but no hint of (3) or (4) [n=334 undergrads, citations = 8376 (GS, September, 2022)].
- Critiques:
Gignac 2020 [n = 929, citations = 53 (GS, September 2022)].
Nuhfer 2016 and Nuhfer 2017 [n = 1154, citations = 34 (GS, September 2022)].
Luu 2015.
Greenberg 2018 [n = 534].
Yarkoni 2010.
Jansen 2021 [2 studies, n = 2000 each study, citations = 26 (GS, October 2022)].
Muller 2020 [n = 56, citations = 20 (GS, October 2022)].
- Original effect size: not reported. Study 1 on humor (n = 15): difference between the actual and estimated performance of “incompetent” (bottom quartile) participants d = 2.58 [calculated], while for “competent” (top quartile) participants d = -0.55 [calculated]. Study 2 on logical reasoning (n = 45): difference between the actual and estimated performance of “incompetent” (bottom quartile) participants d = 5.44 (perceived logical reasoning ability) [calculated], d = 3.48 (test performance) [calculated], while for “competent” (top quartile) participants d = -1.12 [calculated], d = -0.79 (perceived test performance) [calculated]. Study 3 on grammar (n = 84): difference between the actual and estimated performance of “incompetent” (perceived bottom quartile) participants d = 3.42 (perceived ability) [calculated], d = 3.94 (perceived test performance) [calculated], while for “competent” (top quartile) participants d = -1.18 (perceived ability) [calculated], d = -1.27 (perceived test performance) [calculated].
- Replication effect size: Gignac 2020 (for IQ): when using statistical analysis as in Dunning & Kruger 1999 η2 = 0.20, but running two less-confounded tests, r = −0.05/d = -0.1 [calculated] between P and errors, and r = 0.02/d = 0.04 [calculated] for a quadratic relationship between self-described performance and actual performance.
Jansen 2021 (for grammar and logical reasoning): not reported (Bayesian models support the existence of the effect in the data and replicate claim 1).
Muller 2020 (for recognition memory): the difference between the actual and estimated performance of “incompetent” (bottom quartile) participants d= 4.73 [calculated], while for “competent” (top quartile) participants d= -0.88 [calculated].
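The paired r/d values listed for Gignac 2020 are numerically consistent with the common conversion d = 2r / √(1 − r²). Whether this is the formula behind the "[calculated]" values is an assumption, and the function name below is hypothetical. A minimal sketch:

```python
import math


def d_from_r(r: float) -> float:
    """Convert a correlation coefficient r to Cohen's d.

    Uses d = 2r / sqrt(1 - r^2), which treats the two underlying
    groups as equal in size.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 2.0 * r / math.sqrt(1.0 - r * r)


if __name__ == "__main__":
    # The two correlations reported for Gignac 2020 in this entry.
    for r in (-0.05, 0.02):
        print(r, round(d_from_r(r), 2))
```

For correlations this close to zero the conversion is almost exactly d ≈ 2r, which is why the listed d values are simply double the r values.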
Depressive realism effect. Increased predictive accuracy or decreased cognitive bias among the clinically depressed.
Statistics
- Status: reversed
- Original paper:
‘Judgment of contingency in depressed and nondepressed students: sadder but wiser?’, Alloy and Abramson (1979): 4 experiments with Study 1: n1 = 48, n2 = 48, Study 2: n1 = 32, n2 = 32; Study 3: n1 = 32, n2 = 32; Study 4: n1 = 32, n2 = 32 [citations = 2855 (GS, June 2022)].
- Critiques:
Moore and Fresco 2012 [meta analysis, n = 7305, citations = 311 (GS, June 2022)]
- Original effect size: not reported. [d= -0.32 calculated for bias about ‘contingency’, how much the outcome actually depends on what you do]
- Replication effect size: Moore and Fresco 2012: d = -0.07.
Hungry judge effect. Judges are reported to grant massively fewer acquittals just before lunch. Critics note that case order is not independent of acquittal probability (“unrepresented prisoners usually go last and are less likely to be granted parole”), that favourable cases may take predictably longer and so are pushed until after recess, and that the effect size is implausible on priors. The original explanation involved ego depletion.
Statistics
- Status: NA
- Original paper:
‘Extraneous factors in judicial decisions’, 2011 [n= 8 judges, 1122 judicial rulings, citations = 1626 (GS, October, 2022)].
- Critiques:
Weinshall-Margel 2011 [n= 227 decisions, citations= 79 (GS, October, 2022)],
Glöckner 2016,
Lakens 2017.
- Original effect size: d= 1.96, “the probability of a favorable ruling steadily declines from ≈0.65 to [0.05] and jumps back up to ≈0.65 after a break for a meal”, n=8 judges with n=1122 cases.
- Replication effect size: NA.
Multiple intelligences. This theory suggests that there are multiple types of intelligence that can be distinguished from one another, rather than a single general intelligence that underlies all cognitive abilities. Some of the proposed types of intelligence by Gardner are linguistic intelligence, logical-mathematical intelligence, musical intelligence, bodily-kinesthetic intelligence, interpersonal intelligence, intrapersonal intelligence, and naturalistic intelligence. More broadly, this theory can be taken to suggest people have different cognitive strengths and weaknesses.
Statistics
- Status: NA
- Original paper:
‘Frames of Mind: The Theory of Multiple Intelligences’, Gardner 1983; book/theoretical work, n=NA. [citations = 45591 (GS, March 2022)].
- Critiques:
Shearer and Karanian 2017 [n = 172 neuroscience reports, citations = 92 (GS, February 2023)].
Sternberg 1994 [n = NA, citations = 103 (GS, February 2023)].
Tirri and Nokelainen 2008 [n = 410, citations = 92 (GS, February 2023)].
Visser et al. 2006 [n = 200, citations = 379 (GS, February 2023)].
Waterhouse 2006 [n = NA, citations = 103 (GS, February 2023)].
- Original effect size: No empirical data collected [Allix, 2000; Lubinski & Benbow, 1995; Sternberg, 1994; Waterhouse, 2006; Gardner acknowledged lack of empirical data, 2004 (p. 214)].
- Replication effect size: Shearer and Karanian: NA; descriptive statistics, neural patterns consistent with Gardner’s hypothesis. Tirri and Nokelainen: NA; Confirmatory Factor Analysis; supports existence of logical-mathematical and spatial intelligences. Visser et al.: NA; Factor Analysis, modest support for Gardner. Strong loadings on g factor.
Brain training on intelligence - far transfer from daily computer training games to fluid intelligence. Transfer of knowledge and skills from daily computer training games to fluid intelligence in general, in particular from the Dual n-Back game.
Statistics
- Status: mixed
- Original paper:
‘Improving fluid intelligence with training on working memory’, Jaeggi 2008; experimental design, n=70. [citations= 2840 (GS, October 2022)].
- Critiques:
Melby-Lervåg 2013 [meta-analysis of 23 studies, citations= 2156 (GS, October 2022)].
Gwern 2012 [meta-analysis of 45 studies, citations= NA(GS, April 2023)].
Reddick 2013 [n= 73, citations= 824 (GS, October 2022)].
Lampit 2014 [meta-analysis of 52 studies, n= 4885, citations= 809 (GS, October 2022)].
Berger 2020 [n= 572, citations= 22 (GS, October 2022)].
Simons 2016 [comprehensive review of literature, n=NA, citations= 1015 (GS, October 2022)].
- Original effect size: d= 0.4 over control, 1-2 days after training.
- Replication effect size: Melby-Lervåg: d = 0.19 [0.03, 0.37] nonverbal; d = 0.13 [-0.09, 0.34] verbal. Gwern: d = 0.1397 [-0.0292, 0.3085], among studies using active controls. Reddick: found “no positive transfer to any of the cognitive ability tests”, all ηp2 < 0.054. Lampit: g = 0.24 [0.09, 0.38] nonverbal memory; g = 0.08 [0.01, 0.15] verbal memory; g = 0.22 [0.09, 0.35] working memory; g = 0.31 [0.11, 0.50] processing speed; g = 0.30 [0.07, 0.54] visuospatial skills. Berger (RCT in 6-7 year olds): d = 0.2 to 0.4, but many of the apparent far-transfer effects come only 6-12 months later, i.e. well past the end of most prior studies.
Brain training on intelligence - music lessons improve intelligence. An original experimental study found an increase in IQ for children who received a year of music lessons, compared to children who were randomly assigned to drama lessons or no lessons.
Statistics
- Status: not replicated
- Original paper:
‘Music lessons enhance IQ’, Schellenberg 2004; randomised control trial, n=144. [citations = 1424 (GS, December 2021)].
- Critiques:
Mehr et al. 2013 [Study 1 n=29, Study 2 n=55, citations=52 (GS, December 2021)].
D’Souza & Wiseheart 2018 [n=75, citations=20 (GS, December 2021)].
- Original effect size: d= 1.948.
- Replication effect size: Mehr et al.: Wilks’ λ = .851/η2p= 0.077 [calculated]. D’Souza & Wiseheart: for task switching: Bayes Factor (BF) inclusion= 1.964 (weak evidence); for processing speed BF inclusion= 0.757 (box completion task), 0.243 (symbol copy task), 0.213 (symbol coding task) (weak evidence); for working memory: BF inclusion= 0.216 (digit span forward task), 0.138 (the digit span backward task), 0.004 (self-ordered pointing task) (weak evidence); for inference control: BF inclusion= 0.137 (flanker task), 0.007 (Stroop task) (weak evidence); for nonverbal intelligence: BF inclusion= 0.778 (Peabody Picture Vocabulary Test) (weak evidence).
Bilingual advantages in executive control - inhibition. Speaking two languages improves general cognitive control processes (executive control).
Statistics
- Status: mixed
- Original paper:
‘How does bilingualism improve executive control? A comparison of active and reactive inhibition mechanisms’, Colzato et al. 2008; 3 experiments, Study 1: n1 = 16 monolinguals and n2 = 16 bilinguals; Study 2: n1 = 12 bilinguals and n2 = 18 monolinguals; Study 3: n1 = 18 monolinguals and n2 = 18 bilinguals. [citations = 421 (GS, October 2021)].
- Critiques:
De Bruin et al. 2015 [meta-analysis, n = 128, citations = 547 (GS, May 2022)].
Gunnerud et al. 2020 [meta-analysis, n = 143 independent group comparisons comprising 583 EF effect sizes, citations = 102 (GS, December 2021)].
Kappes 2015 [Experiment 3: 38 bilingual, 40 monolingual, citations = 0 (unpublished)].
Paap et al. 2013 [n = 286, citations = 1007 (GS, December 2021)].
Sanchez-Azanza et al. 2017 [systematic review, n = 189, citations = 38 (GS, May 2022)].
Bialystok et al. 2004 [n = 40 in study 1, n = 94 in study 2, n = 20 in study 3, citations = 2350 (GS, January 2023)].
- Original effect size: r = .22 ± .48.
- Replication effect size: De Bruin et al.: ηp2 = .073 (challenge vs. support), ηp2 = .089 (all 4 result outcomes). Gunnerud et al.: The bilingual advantage in overall EF was significant, albeit marginal (g = 0.06), and there were indications of publication bias. Kappes: r = .06 ± .36. Paap et al.: Inhibitory control (Simon task) ηp2=.69, Mixing cost η2=.52, Switching cost η2=.67. Sanchez-Azanza et al.: ηp2 = .363 (paper category), ηp2 = .281 (year), ηp2 = .155 (paper category and year interaction).
Bilingual advantages in executive control - Non-verbal task switching. The idea that bilingual language switching on a daily basis makes bilinguals better at general non-verbal task switching, compared to monolinguals who do not perform this extensive daily language switching.
Statistics
- Status: mixed
- Original paper: ‘
Bilingual language switching in naming: Asymmetrical costs of language selection’, Meuter and Allport 1999 (conceptual original article); within-group design, sample size = 16. [citations = 1557(GS, January 2023)].
- Critiques:
de Bruin et al. 2015 [n1 = 28, n2 = 24, n3 = 24, citations = 110(GS, January 2023)].
Paap and Greenberg 2013 [study 1: n1 = 30, n2 = 44; Study 2: n1 = 31; n2 = 49; study 3: n1 = 48; n2 = 51, citations = 1135(GS, January 2023)].
Prior and Macwhinney 2009 [n1 = 32, n2 = 47, citations = 782(GS, January 2023)].
Stasenko et al. 2017 [n1 = 80, n2 = 80, citations = 55(GS, January 2023)].
Timmermeister et al. 2020 [n1 = 27, n2 = 27, citations = 8(GS, January 2023)].
- Original effect size: NA.
- Replication effect size: de Bruin et al.: ηp2 (language group X trial type) = .74; ηp2 (raw switching costs) = .09; ηp2 (proportional switching) = ns; ηp2 (language group X trial type) = ns; (mixed). Paap and Greenberg: ηp2 (study 1)=.001; ηp2 (study 2)= .014; ηp2 (study 3)= .000; ηp2 (all bilingual vs. monolingual)= .004; (not replicated). Stasenko et al.: ηp2 (CTI)=.892, ηp2 (trial type) = .488; ηp2 (half) = .339; ηp2 (CTI X language group)=.037; ηp2 (CTI X half) = .259; ηp2 (CTI X trial type) = .079; ηp2 (CTI X trial type X half) = .025; ηp2 (trial type X half X group) = .044; d (language group in trials half 1) = .34; d (language group in trials half 2) = ns; ηp2 (CTI X group, on only switch trials) = .55; ηp2 (CTI X group, on only switch trials) = ns; ηp2 (CTI X half, on error rates for bilinguals only) = .059; (mixed). Timmermeister et al.: ηp2 (accuracy and switching costs)= 0.10; ηp2 (MANCOVA with the previous factors, and SES and knowledge of Dutch as covariates) = 0.03; ηp2 (RTs and mixing costs) = 0.13; ηp2 (MANCOVA with the previous factors, and SES and knowledge of Dutch as covariates) = 0.06; (not replicated).
Bilingual advantages - theory of mind. Bilingual children are more likely to score higher in Theory of Mind tasks than monolingual counterparts, using an unexpected transfer task.
Statistics
- Status: mixed
- Original paper: ‘
The effects of bilingualism on theory of mind development’, Goetz 2003; experiment, English monolinguals: n = 32, Mandarin monolinguals: n = 32, English-Mandarin bilinguals: n = 40. [citations = 486 (GS, January 2023)].
- Critiques:
Dahlgren et al. 2017 [Monolinguals: n = 14, bilinguals: n = 14, citations = 23 (GS, January 2023)].
Diaz and Farrar 2018 [Monolinguals: n = 33, Bilinguals: n = 32, citations = 23(GS, January 2023)].
Farhadian et al. 2010 [Monolinguals: n = 65, bilinguals: n = 98, citations = 61 (GS, January 2023)].
Gordon 2016 [Monolinguals: n = 26, bilinguals n = 26, citations = 24(GS, January, 2023)].
- Original effect size: ηp2 = 0.06 [ηp2 calculated from reported F statistic and converted using this
conversion].
- Replication effect size: Dahlgren et al.: not reported. Diaz and Farrar: ηp2 = .063. Farhadian et al.: d = 0.40 [d calculated from mean differences and standard deviation and converted using this
conversion]. Gordon: d = 0.123.
Bilingual advantages - perspective taking in referential communication. Bilingual children tend to score higher than their monolingual counterparts on perspective taking, as measured with the director task.
Statistics
- Status: replicated
- Original paper: ‘
The exposure advantage: Early exposure to a multilingual environment promotes effective communication’, Fan et al. 2015; experiment, English monolingual children: n = 24, monolingual children exposed to other languages: n = 24, bilingual children: n = 24. [citations = 260 (GS, January 2023)].
- Critiques:
Navarro and Conway 2021 [Monolingual adults: n = 26, bilinguals n = 28, citations=10(GS, January, 2023)].
- Original effect size: bilingual vs. monolingual: d = 0.83, bilingual vs. monolingual exposed to other languages: d = 0.02.
- Replication effect size: Navarro and Conway: director task experimental condition: d = -0.51, director task control condition: d = 0.29. Non-director task: ηp2 = .01.
Exposure to another language in social communication - perspective taking in referential communication. Children who are exposed to a second language tend to score higher on perspective taking, as measured with the director task, than children who are not exposed to a second language.
Statistics
- Status: replicated
- Original paper: ‘
The exposure advantage: Early exposure to a multilingual environment promotes effective communication’, Fan et al. 2015; experiment, English monolingual children: n = 24, monolingual children exposed to other languages: n = 24, bilingual children: n = 24. [citations = 260 (GS, January 2023)].
- Critiques:
Agostini et al. 2022 [preprint, high exposure for monolingual children: n =32, lower exposure for monolingual children: n = 29, no exposure monolingual children: n = 38, citations=0 (GS, January, 2023)].
- Original effect size: monolinguals exposed to other languages vs. monolingual: d = 0.74.
- Replication effect size: Agostini et al.: T1: not reported, T2: not reported.
Bilingual disadvantages in creativity - fluency. Monolinguals are more likely to rapidly produce a large number of ideas or solutions to a problem than bilinguals, using the Torrance Test.
Statistics
- Status: mixed
- Original paper: ‘
An Intercultural Study of Non-Verbal Ideational Fluency’, Gowan and Torrance 1965; experiment, monolingual children: n = 853, bilingual children: n = 555. [citations=35(GS, January 2023)].
- Critiques:
Kharkhurin 2008 [bilingual adults: n =103, monolingual adults: n = 47, citations=163(GS, January 2023)].
Kharkhurin 2017 [bilingual adults: n =58, monolingual adults: n = 28, citations=27(GS, January 2023)].
Torrance et al. 1970 [monolingual children: n = 527, bilingual children: n = 536, citations=241(GS, January 2023)].
- Original effect size: not reported.
- Replication effect size: Kharkhurin: ηp2 = 0.07 [ηp2 calculated from reported F statistic and converted using this
conversion]. Kharkhurin: not reported. Torrance et al.: d = 0.27 [d calculated from reported t statistic and converted using this
conversion].
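The conversion from a t statistic to Cohen's d mentioned above can be sketched as follows. This is a minimal illustration that assumes an independent-samples t-test; the entry does not state which formula the linked converter applies, and the t value used here is hypothetical (it is not taken from Torrance et al. and was chosen only so that the result lands near the listed d = 0.27). The group sizes are the ones listed for Torrance et al. 1970.

```python
import math


def cohens_d_from_t(t_value: float, n1: int, n2: int) -> float:
    """Cohen's d from an independent-samples t statistic and the two group sizes."""
    return t_value * math.sqrt(1.0 / n1 + 1.0 / n2)


def t_from_cohens_d(d_value: float, n1: int, n2: int) -> float:
    """Inverse of cohens_d_from_t: the t statistic implied by d and the group sizes."""
    return d_value / math.sqrt(1.0 / n1 + 1.0 / n2)


# Group sizes as listed in the entry (monolingual and bilingual children).
n_monolingual, n_bilingual = 527, 536

# Hypothetical t value for illustration only; the source does not report it here.
t_hypothetical = 4.4

d = cohens_d_from_t(t_hypothetical, n_monolingual, n_bilingual)
print(f"d = {d:.2f}")
```

With groups this large, even a small standardised difference corresponds to a sizeable t statistic, which is why the effect size is more informative than the test statistic alone.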
Monolingual advantages in creativity - Flexibility. Monolinguals are more likely to consider a variety of approaches to a problem simultaneously than bilinguals, using the Torrance Test.
Statistics
- Status: reversed
- Original paper: ‘
Creative functioning of monolingual and bilingual children in Singapore’, Torrance et al. 1970; experiment study design, monolingual children: n = 527, bilingual children: n = 536. [citations=241(GS, January 2023)].
- Critiques:
Kharkhurin 2008 [bilingual adults: n =103, monolingual adults: n = 47, citations=163(GS, January 2023)].
Kharkhurin 2017 [bilingual adults: n =58, monolingual adults: n = 28, citations=27(GS, January 2023)].
- Original effect size: Torrance et al.: d = 0.20 [d calculated from reported t statistic and converted using this
conversion].
- Replication effect size: Kharkhurin ηp2 = 0.04 [ηp2 calculated from reported F statistic and converted using this
conversion]. Kharkhurin: ηp2 = 0.07.
Null Bilingual advantages in creativity - Originality. There should be no difference between bilinguals and monolinguals in the tendency to produce ideas different from those of most other people, using the Torrance Test.
Statistics
- Status: replicated
- Original paper: ‘
Creative functioning of monolingual and bilingual children in Singapore’, Torrance et al. 1970; experiment study design, monolingual children: n = 527, bilingual children: n = 536. [citations=241(GS, January 2023)].
- Critiques:
Kharkhurin 2008 [bilingual adults: n =103, monolingual adults: n = 47, citations=163(GS, January 2023)].
Kharkhurin 2017 [bilingual adults: n =58, monolingual adults: n = 28, citations=27(GS, January 2023)].
- Original effect size: Torrance et al.: d = 0.03 [d calculated from reported t statistic and converted using this
conversion].
- Replication effect size: Kharkhurin: not reported. Kharkhurin: not reported.
Null bilingual advantages in creativity - Elaboration. There should be no difference between bilinguals and monolinguals in the tendency to think through the details of an idea, using the Torrance Test.
Statistics
- Status: mixed
- Original paper: ‘
Creative functioning of monolingual and bilingual children in Singapore’, Torrance et al. 1970; experiment study design, monolingual children: n = 527, bilingual children: n = 536. [citations=241(GS, January 2023)].
- Critiques:
Kharkhurin 2008 [bilingual adults: n =103, monolingual adults: n = 47, citations=163(GS, January 2023)].
Kharkhurin 2017 [bilingual adults: n =58, monolingual adults: n = 28, citations=27(GS, January 2023)].
- Original effect size: Torrance et al.: d = 0.06 [d calculated from reported t statistic and converted using this
conversion].
- Replication effect size: Kharkhurin: ηp2 = 0.01[ηp2 calculated from reported F statistic and converted using this
conversion]. Kharkhurin: not reported.
Mozart effect. Listening to Mozart’s sonata for two pianos in D major (KV 448) enhances performance on spatial tasks in standardised tests.
Statistics
- Status: not replicated
- Original paper: ‘
Music and spatial task performance’, Rauscher et al. 1993; experimental design, n=36. [citations= 2110 (GS, November 2021)].
- Critiques:
Pietschnig et al. 2010 [meta-analysis: k=39, citations= 235 (GS, November 2021)].
Steele et al. 1999a [n=86, citations=555 (GS, November 2021)].
Steele et al. 1999b [n=206, citations=126 (GS, November 2021)].
- Original effect size: d= 1.5 [0.65, 2.35].
- Replication effect size: All reported in Pietschnig et al.: Adlmann: d = 0.57 [0.25, 0.89]. Carstens: Study 1: d = -0.22 [-0.89, 0.45]; Study 2: d = 0.47 [-0.23, 1.17]. Cooper: d = 0.42 [-0.23, 1.08]. Flohr: Study 1: d = 0.14 [-0.35, 0.63]; Study 2: d = 0.16 [-0.26, 0.58]. Gileta: Study 1: d = 0.13 [-0.26, 0.51]; Study 2: d = -0.05 [-0.43, 0.34]. Ivanov: d = 0.77 [0.20, 1.34]. Jones: d = 0.92 [0.27, 1.56]. Jones: d = 0.54 [0.11, 0.97]. Kenealy: d = -0.22 [-1.08, 0.64]. Knell: d = 0.45 [0.13, 0.77]. Lints: d = -0.37 [-0.75, 0.02]. McClure: d = 0.46 [-0.02, 0.95]. Nantals: Study 1: d = 0.77 [-0.07, 1.61]; Study 2: d = 0.06 [-0.72, 0.84]. Rauscher and Hayes: d = 0.52 [0.18, 0.86]. Rauscher and Ribar: Study 1: d = 1.81 [1.24, 2.37]; Study 2: d = 0.93 [0.46, 1.39]. Rideout: d = 1.54 [-0.67, 3.75]. Rideout: d = 1.01 [0.19, 1.82]. Rideout: d = 1.01 [-0.21, 2.23]. Rideout: d = 0.28 [-1.04, 1.60]. Siegel: d = 0.26 [-0.39, 0.91]. Spitzer: d = 0.01 [-0.32, 0.33]. Steele et al.: d = 0.85 [0.41, 1.30]. Steele, Dalla Bella, et al.: Study 1: d = 0.49 [-0.01, 1.00]; Study 2: d = -0.41 [-1.15, 0.33]. Steele, Dalla Bella, et al.: d = 0.85 [0.41, 1.30]. Steele, Brown and Stoecker: d = 0.20 [-0.08, 0.48]. Sweeny: Study 1: d = -0.43 [-0.93, 0.07]; Study 2: d = -0.06 [-0.56, 0.42]; Study 3: d = 0.14 [-0.37, 0.65]. Twomey: d = 0.63 [-0.01, 1.27]. Wells: d = -0.18 [-0.83, 0.47]. Wilson: d = 0.85 [-0.44, 2.13]. Pietschnig et al.: meta-analytic estimate: d = 0.37 [0.23, 0.52].
Education enhances intelligence. Education has a consistent positive effect on intelligence. A meta-analysis suggests that one additional year of education corresponds to a gain of approximately 1 to 5 IQ points (contingent on study design, inclusion of moderators, and publication-bias correction).
Automatic imitation. The observation of the topographical features of an action facilitates the execution of a similar action in the observer. Humans are prone to automatically imitate others. Automatic imitation differs from spatial compatibility effects and provides an important tool for the investigation of the mirror neuron system, motor mimicry, and complex forms of imitation.
Statistics
- Status: mixed
- Original papers:
‘Evidence for visuomotor priming effect’, Craighero et al. 1996; visuomotor priming, n = 17 [citation=219 (GS, June 2022)].
- Critiques:
Akzel 2012 [n=114, citations=13 (GS, June 2022)].
Akzel 2015 [n=102, citations=7 (GS, June 2022)].
Brass et al. 2000 [n1 = 8, n2 = 8, n3 = 8, citations = 885 (GS, June 2022)].
Cracco et al. 2018 [meta-analysis, n=226 experiments, citations=134 (GS, June 2022)].
- Original effect size: N/A.
- Replication effect size: Akzel: n.s. Brass et al.: Experiment 1 ηp2 = 0.93, Experiment 2 ηp2 = 0.94, Experiment 3 - ηp2 = 0.39 (n.s.) [all ηp2 calculated from reported F statistic and converted using this
conversion]. Cracco et al.: gz = 0.95 [0.88, 1.02].
Congruency sequence effect (conflict adaptation or Gratton effect). A cognitive phenomenon in which the processing of a stimulus is affected by the stimuli that preceded it; for example, congruency effects are smaller following incongruent trials than following congruent trials.
Statistics
- Status: mixed
- Original paper: ‘
Semantic priming and retrieval from lexical memory: Roles of inhibitionless spreading activation and limited-capacity attention’, Neely 1977; speeded word–nonword classification task, n = 120. [citation = 3963 (PSYCNET.APA, January 2023)].
- Critiques:
Aczel et al. 2021 [Kan et al. 2013 replication, n=103, 70 and 38 participants for Experiments 1, 2 and 3, citations=4(GS, Feb 2022)].
Gratton et al. 1992 [n1=6, n2=5, n3a=6, n3b= 8, citation = 2004 (GS, April 2023)].
Gyurkovics et al. 2020 [n=489 over four tasks, citations=3(GS, April 2023)].
Kan et al. 2013 [n = 41 in Experiment 1; n = 28 in Experiment 2; n = 15 in Experiment 3, citation=81(GS, February 2022)]
- Original effect size: Greatest facilitation in Non-shift-Expected-Related word X target condition and greatest inhibition effects in Shift-Unexpected-Unrelated and Nonshift-Unexpected-Unrelated conditions; η2 = 0.689 (calculated from the reported F(4, 84) = 46.85, using this conversion).
- Replication effect size: Aczel et al.: The congruency sequence effect for the RT analysis was inconclusive in all three experiments, ηp2 = 0.00 to 0.02 (calculated from the reported F statistic), and for the accuracy in two out of three experiments, ηp2 = 0.00 to 0.04 (calculated from the reported F statistic). Gratton et al.: compatible vs. incompatible trials, Reaction time ηp2 = 0.88 to 0.94 (calculated from the reported F statistic), Error rate ηp2 = 0.59 to 0.98 (calculated from the reported F statistic). Gyurkovics et al.: ηp2 = .40 to .96. Kan et al.: congruent vs. incongruent trials, Stroop accuracy ηp2 = 0.14 to 0.57 (calculated from the reported F statistic), Stroop reaction time ηp2 = 0.19 to 0.46 (calculated from the reported F statistic).
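As a worked illustration of the F-to-effect-size conversion used for the original effect size in this entry, the standard formula is shown below with the reported F(4, 84) = 46.85. This assumes the linked converter applies this formula, which the entry does not confirm. Strictly, the formula yields partial eta squared, which equals η2 only in a one-way design.

```latex
\eta_p^2 \;=\; \frac{F \cdot df_{\text{effect}}}{F \cdot df_{\text{effect}} + df_{\text{error}}}
        \;=\; \frac{46.85 \times 4}{46.85 \times 4 + 84}
        \;=\; \frac{187.4}{271.4}
        \;\approx\; 0.69
```

The result agrees with the listed value of 0.689 to two decimal places.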
Action-sentence Compatibility Effect (ACE). Participants’ movements are faster when the direction of the described action (e.g., Mark dealt the cards to you) matches the response direction (e.g., toward).
Statistics
- Status: not replicated
- Original paper: ‘
Grounding language in action’, Glenberg and Kaschak 2002; experimental design, Experiment 1: n= 44, Experiment 2A: n= 70, Experiment 2B: n= 72. [citations= 2870 (GS, October, 2022)].
- Critiques:
Morey et al. 2022 [pre-registered multi-lab replication, 18 labs, n= 1278, citations= 30 (GS, October 2022)].
- Original effect size: Experiment 1: ηp2= 0.186 [calculated]. Experiment 2A: ηp2 = 0.051 [calculated].
- Replication effect size: Morey et al.: for native English speakers d= 0.0036; for non-native English speakers d= -0.019.
The attentional spatial-numerical association of response codes (Att-SNARC) effect. The finding that participants were quicker to detect left-side targets preceded by small numbers and right-side targets preceded by large numbers. This finding triggered many assumptions about number representations being grounded in bodily experience.
Statistics
- Status: mixed
- Original paper: ‘
The mental representation of parity and number magnitude’, Dehaene et al. 1993; 9 experiments of timed odd-even judgements investigated how parity and number magnitude were accessed from Arabic and verbal numerals, Experiment 1: n=20, Experiment 2: n=20, Experiment 3: n=12, Experiment 4: n=20, Experiment 5: n=10, Experiment 6: n=8, Experiment 7: n=20, Experiment 8: n=24, Experiment 9: n=24. [citations= 3233 (GS, January 2023)].
- Critiques:
Fischer et al. 2003 [n=15, citations= 857 (GS, January 2023)].
Colling et al. 2020 [n=1105 at 17 labs, citations= 34, (GS, January 2023)].
Wood et al. 2008 [n=46 studies (meta analysis), citations= 545, (GS, January 2023)].
- Original effect size: NA.
- Replication effect size: Fischer et al.: not reported. All reported in Colling et al.: The estimate for a 250 ms interstimulus-interval (ISI) condition [90% CI]: Fischer et al.: −5.00 ms [−12.48, 2.48]. Ansari: 1.22 ms [−1.74, 4.19]. Bryce: −0.25 ms [−3.20, 2.71]. Chen: −2.59 ms [−5.25, 0.06]. Cipora: 2.65 ms [−0.15, 5.44]. Colling (Szucs):−1.93 ms [−4.39, 0.54]. Corballis: −0.25 ms [−3.03, 2.53]. Hancock: 0.55 ms [−2.50, 3.61]. Holmes: −0.67 ms [−3.34, 2.00]. Lindemann: 0.13 ms [−3.33, 3.59]. Lukavský: −0.06 ms [−2.52, 2.40]. Mammarella: −1.66 ms [−3.95, 0.63]. Mieth: 1.01 ms [−1.30, 3.31]. Moeller: −0.34 ms [−3.32, 2.64]. Ocampo: −0.44 ms [−3.05, 2.18]. Ortiz-Tudela: 0.51 ms [−2.27, 3.28]. Toomarian: 0.37 ms [−2.35, 3.08]. Treccani: 0.38 ms [−2.70, 3.46]. Model 1 (No Moderators): −0.05 ms [−0.82, 0.71]. Model 2 (Consistent Right-Starter): 0.29 ms [−0.89, 1.47]. Model 2 (Consistent Left-Starter): 0.12 ms [−1.24, 1.48]. Model 3 (Left-to-Right): 0.10 ms [−0.87, 1.06]. Model 3 (Not Left-to-Right): −1.65 ms [−3.58, 0.28]. Model 4 (Left-Handed): −1.83 ms [−3.88, 0.22]. Model 4 (Right-Handed): −0.03 ms [−0.72, 0.66]; the estimate for a 500 ms interstimulus-interval (ISI) condition: Fischer et al.:18.00 ms [7.51, 28.49]. Ansari: 0.72 ms [−1.89, 3.32]. Bryce: −0.13 ms [−2.78, 2.52]. Chen: 2.79 ms [0.45, 5.12]. Cipora: 0.27 ms [−1.79, 2.33]. Colling (Szucs):−0.48 ms [−3.45, 2.49]. Corballis: 0.09 ms [−2.33, 2.52]. Hancock: 2.21 ms [−0.29, 4.71]. Holmes: 0.99 ms [−1.95, 3.94]. Lindemann: −1.56 ms [−5.31, 2.19]. Lukavský: −1.10 ms [−3.61, 1.40]. Mammarella: 1.54 ms [−0.08, 3.16]. Mieth: 4.19 ms [2.23, 6.14]. Moeller: 0.57 ms [−2.88, 4.01]. Ocampo: 3.88 ms [1.54, 6.23]. Ortiz-Tudela: −3.43 ms [−6.30, −0.55]. Toomarian: 3.16 ms [0.53, 5.80]. Treccani: −0.42 ms [−2.61, 1.77]. Model 1 (No Moderators): 1.06 ms [0.34, 1.78]. Model 2 (Consistent Right-Starter): 1.24 ms [0.15, 2.32]. Model 2 (Consistent Left-Starter): 0.18 ms [−1.03, 1.39]. 
Model 3 (Left-to-Right): 0.91 ms [−0.02, 1.83]. Model 3 (Not Left-to-Right): 2.21 ms [−0.27, 4.69]. Model 4 (Left-Handed): 1.69 ms [−0.28, 3.65]. Model 4 (Right-Handed): 0.95 ms [0.07, 1.84]; the estimate for a 750 ms interstimulus-interval (ISI) condition: Fischer et al.:23.00 ms [8.30, 37.70]. Ansari: −4.07 ms [−6.76, −1.37]. Bryce: −0.69 ms [−3.19, 1.82]. Chen: 0.08 ms [−2.56, 2.72]. Cipora: −1.58 ms [−3.68, 0.53]. Colling (Szucs):0.70 ms [−1.53, 2.94]. Corballis: 0.30 ms [−2.51, 3.11]. Hancock: −1.44 ms [−4.02, 1.14]. Holmes: 0.35 ms [−2.48, 3.19]. Lindemann: 2.45 ms [−0.43, 5.33]. Lukavský: 1.48 ms [−1.29, 4.24]. Mammarella: −0.60 ms [−2.47, 1.26]. Mieth: 0.61 ms [−1.17, 2.39]. Moeller: 0.66 ms [−1.57, 2.88]. Ocampo: 5.75 ms [3.44, 8.06]. Ortiz-Tudela: −1.73 ms [−4.93, 1.48]. Toomarian: 0.35 ms [−2.61, 3.31]. Treccani: −2.18 ms [−4.36, 0.01]. Model 1 (No Moderators): 0.19 ms [−0.53, 0.90]. Model 2 (Consistent Right-Starter): 0.13 ms [−0.97, 1.23]. Model 2 (Consistent Left-Starter): −0.03 ms [−1.23, 1.18]. Model 3 (Left-to-Right): 0.24 ms [−0.68, 1.17]. Model 3 (Not Left-to-Right): −2.25 ms [−4.31, −0.20]. Model 4 (Left-Handed): −1.92 ms [−4.03, 0.19]. Model 4 (Right-Handed): 0.24 ms [−0.84, 1.31]; the estimate for a 1,000 ms interstimulus-interval (ISI) condition: Fischer et al.:11.00 ms [1.47, 20.53]. Ansari: 1.22 ms [−1.03, 3.48]. Bryce: 0.53 ms [−1.90, 2.96]. Chen: −1.71 ms [−3.90, 0.49]. Cipora: −1.09 ms [−3.31, 1.12]. Colling (Szucs):2.48 ms [0.28, 4.68]. Corballis: 0.67 ms [−1.55, 2.89]. Hancock: −0.18 ms [−2.78, 2.42]. Holmes: 0.36 ms [−1.97, 2.69]. Lindemann: 2.06 ms [−0.83, 4.95]. Lukavský: −3.86 ms [−7.10, −0.63]. Mammarella: 1.42 ms [−0.34, 3.18]. Mieth: −0.57 ms [−2.66, 1.51]. Moeller: 0.97 ms [−2.31, 4.25]. Ocampo: −1.34 ms [−3.84, 1.15]. Ortiz-Tudela: −0.39 ms [−2.99, 2.21]. Toomarian: 2.44 ms [0.11, 4.76]. Treccani: −1.39 ms [−3.53, 0.74]. Model 1 (No Moderators): −1.27 ms [−3.29, 0.75]. Model 2 (Consistent Right-Starter): 0.12 ms [−1.12, 1.35]. 
Model 2 (Consistent Left-Starter): 0.42 ms [−0.71, 1.55]. Model 3 (Left-to-Right): 0.50 ms [−0.54, 1.54]. Model 3 (Not Left-to-Right): 0.29 ms [−0.62, 1.19]. Model 4 (Left-Handed): 0.18 ms [−0.51, 0.88]. Model 4 (Right-Handed): −2.51 ms [−4.59, −0.43]. Wood et al.: Pooled size of the SNARC effects - Parity d=-0.99; Magnitude classification (fixed standard) d=-1.04; Magnitude comparison (variable standard) d=-0.59; Tasks without semantic manipulation d=-0.60; bimanual response d=-0.79; eye saccades latency d=-1.20; eye saccade amplitudes d=-0.07; manual bisection d=-1.08; pointing RT d=-1.02; pointing MT d=-0.94; unimanual finger response d=-1.69; naming d=0.09; foot response d=-1.59; grip aperture d=-3.29. All reported in Wood et al.: Shaki and Petrusic: intermixed adj. R2=.45; negative blocked adj. R2=.94; positive blocked adj. R2=.94. Shaki et al.: adj. R2=.92. Bachot et al.: control children adj. R2=.42; VSD children adj. R2=.24. Gevers et al.: adj. R2=.82. Castronovo & Seron: blind participants adj. R2=.92; sighted participants adj. R2=.93. Nuerk et al.: adj. R2=.96. Fischer and Rottmann: whole interval adj. R2=.69; negative interval adj. R2=0.01. Bull et al.: deaf participants adj. R2=.94; hearing participants adj. R2=.60. Ito and Hatta: adj. R2=.16. Bächthold et al.: ruler task adj. R2=.96; clock-face task adj. R2=.97.
Scarcity effect - Attention. Having too few resources leads individuals to misallocate attention, with consequences such as overborrowing. Study 1 examined whether scarcity causes greater cognitive fatigue, measured by poorer performance on a cognitive ability task.
Statistics
- Status: mixed
- Original paper: ‘
Some consequences of having too little’, Shah et al. 2012; 5 experiments with Study 1: n=60; Study 2: n=68; Study 3: n=143; Study 4: n=118; Study 5: n=137. [citations=1403 (GS, April 2022)].
- Critiques:
Camerer et al. 2018 [n=619, citations=855(GS, November 2021)].
O’Donnell et al. 2021 [n=668, citations=0(GS, November 2021)].
Shah et al. 2019 [n=997, citations=19(GS, November 2021)].
- Original effect size: r = .267.
- Replication effect size: Camerer et al.: r = -.015; O’Donnell et al.: r= -.039; Shah et al.: η2 = .004.
Scarcity effect - Meaning in life. Threats to people’s sense that they can afford the things they need in the present and foreseeable future undermine perceptions of meaning in life.
Scarcity effect - Discounting. A negative income shock was associated with increased discounting rates for gains and losses.
Scarcity effect - Physical pain. Higher economic insecurity is associated with greater physical pain.
Scarcity effect - Self expansion. Lower self-concept clarity (conceptualised as a finite resource) is associated with lower self-expansion.
Scarcity effect - Wellbeing. Imagining having less time available in one’s current city is positively associated with well-being.
Scarcity effect - Decision making. Lacking time or money can lead to making worse decisions.
Scarcity effect - Opportunity costs. Poor people are more likely to consider opportunity costs spontaneously.
Scarcity effect - Conscious thoughts. Thoughts triggered by financial concerns intrude into consciousness more often for poorer individuals than for wealthier individuals.
Scarcity effect - Absoluteness of losses. Poorer individuals view losses in more absolute, rather than relative, terms than do wealthier individuals.
Statistics
- Status: not replicated
- Original paper: ‘
Scarcity frames value’, Shah et al. 2015; experimental design, study 6, n=73. [citation=315(GS, November 2021)].
- Critiques:
O’Donnell et al. 2021 [n=209, citations=0(GS, November 2021)].
- Original effect size: r= .264.
- Replication effect size: O’Donnell et al.: r= .090.
Bottomless soup bowl. Visual cues related to portion size increase intake volume of soup.
Simon effect. Faster responses are observed when the stimulus and response are on the same side than when the stimulus and response are on opposite sides.
Statistics
- Status: mixed
- Original paper: ‘
Choice reaction time as a function of angular stimulus-response correspondence and age’, Simon and Wolf 1963; experimental design, n1 = 20, n2 = 20. [citation=289(GS, June 2022)].
- Critiques:
Ehrenstein 1994 [n1=12, n2=14, citations=27(GS, June 2022)].
Marble and Proctor 2000 [n1=48, n2=20, n3=32, n4=80, citations=89(GS, June 2022)].
Proctor et al. 2000 [n1=64, n2=64, citations=74(GS, June 2022)].
Theeuwes et al. 2014 [n1=30, n2=30, n3=30, n4=30, citations=30(GS, June 2022)].
- Original effect size: not reported but could be calculated.
- Replication effect size: Ehrenstein: not reported but could be calculated. Marble and Proctor: not reported but could be calculated. Proctor et al.: not reported but could be calculated. Theeuwes et al.: ηp2 (the compatible S-R instructions condition vs. the incompatible S-R instructions condition)=.12; ηp2 (the compatible S-R instructions condition vs. the incompatible practised S-R instructions condition)=.07; ηp2 (the incompatible S-R instructions condition vs. the compatible S-R instructions condition)=.21; ηp2 (the incompatible practised S-R instructions condition vs. the compatible S-R instructions condition)=.11.
ERPs in lie detection. In studies using Guilty Knowledge Tests, the P300 ERP component in particular has been linked to the conscious recognition of crime-related targets as meaningful and salient stimuli, based on crime-related episodic memories.
Statistics
- Status: mixed
- Original paper: ‘
Late Vertex Positivity in Event-Related Potentials as a Guilty Knowledge Indicator: A New Method of Lie Detection’, Rosenfeld et al. 1987; experimental design, n1=10, n2=6. [citation=126(GS, May 2022)].
- Critiques:
Abootalebi et al. 2006 [n=62, citations=159(GS, May 2022)].
Bergström et al. 2013 [n1=24, n2=24; citations=61(GS, May 2022)].
Mertens & Allen 2008 [n=79, citations=187(GS, May 2022)].
Rosenfeld et al. 2004 [n-ex1=33; n-ex2.1=12, n-ex2.2=10, citations=419(GS, May 2022)].
Wang et al. 2016 [n=28, citations=61(GS, May 2022)].
- Original effect size: N/A.
- Replication effect size: Abootalebi et al.: not reported but could be calculated. Bergström et al.: d=2.89 (effort in uncooperative recall suppression); d=2.28 (success in uncooperative recall suppression); partial η2 = 0.20 (experiment 1 - voluntary modulations of P300); partial η2 = 0.31 (experiment 2 - voluntary modulations of P300); d = 0.48 and d = 0.31 (experiment 1 - cooperative phase); d = 0.03 (experiment 1 - uncooperative phase); d = 0.14 (experiment 1 - innocent phase); d = 0.77 (experiment 1: targets vs. probes - innocent phase); d = 0.71 (experiment 1: targets vs. probes - uncooperative phase); d = 1.03 and d = 0.48 (experiment 2 - cooperative phase); d = 0.48 and d = 0.99 (experiment 2 - uncooperative phase); d = 1.81 (experiment 2 - innocent phase); d = 0.50 (experiment 1: cooperative vs. uncooperative); d = 0.52 (experiment 2: cooperative vs. uncooperative); d = 0.07 (experiment 1: uncooperative vs. innocent); d = 0.57 (experiment 2: uncooperative vs. innocent); d < 0.17 (targets vs. irrelevants for experiment 1 and 2). Mertens and Allen: not reported but could be calculated. Rosenfeld et al.: not reported but could be calculated. Wang et al.: not reported but could be calculated.
Evaluative conditioning. Implicit and explicit attitudes are sensitive to different kinds of information. Explicit attitudes are formed and changed in response to the valence of consciously accessible, verbally presented behavioural information, whereas implicit attitudes are formed and changed in response to the valence of subliminally presented primes.
Statistics
- Status: mixed
- Original paper: ‘
Of Two Minds: Forming and Changing Valence-Inconsistent Implicit and Explicit Attitudes’, Rydell et al. 2006; mixed design experiment with n=50. [citation=403(GS, November 2022)].
- Critiques:
Heycke et al. 2018 [n1=51, n2=57, citations=32(GS, November 2022)].
- Original effect size: Explicit attitudes: two-way interaction between condition and time η2 = 0.71 [reported] / d= 1.54 [converted using this
conversion]; Implicit attitudes: two-way interaction between condition and time η2 = 0.13 [reported] d= 0.38 [converted using this
conversion].
- Replication effect size: Heycke et al.: Explicit attitudes: time of measurement X valence condition – Experiment 1: η2 = 0.757 [reported] d= 1.75 [converted using this
conversion] (replicated); Experiment 2: η2 = 0.828 [reported] d= 2.17 [converted using this
conversion] (replicated); Implicit attitudes: 2-way interaction of time of measurement and condition Experiment 1: η2 = 0.075 [reported] d= 0.28 [converted using this
conversion] (reversed); Experiment 2: η2 = 0.102 [reported] d= 0.33 [converted using this
conversion] (reversed).
Bilingual deficit in lexical retrieval. Compared to monolinguals, bilinguals have often been found to be slower or less accurate in accessing the meaning of a certain word or the word for a certain representation under certain conditions.
Statistics
- Status: mixed
- Original paper:
‘Memory in a monolingual mode: When are bilinguals at a disadvantage?’, Ransdell and Fischler, 1987; between-group multi-experiment study, with monolingual and bilingual young adults, n1 = 28, n2 = 28. [citations=216(GS, May 2022)].
- Critiques:
Bialystok et al. 2007 [study 1: n1=24, n2 = 24; study 2: n1 = 50, n2 = 16, citations=338(GS, May 2022)].
Gollan et al. 2002 [n1=30, n2=30, citations=584(GS, May 2022)].
Gollan et al. 2005 [study 1: n1=31, n2=31; study 2: n1=36, n2=36, citations=665(GS, May 2022)].
Rosselli et al. 2000 [n1=45, n2=18, n3=19, citations=341(GS, May 2022)].
Rosselli et al. 2002 [n1=45, n2=18, n3=19, citations=151(GS, May 2022)].
- Original effect size: not reported but could be calculated.
- Replication effect size: Bialystok et al.: not reported but could be calculated. Rosselli et al.: not reported but could be calculated. Rosselli et al.: not reported but could be calculated. Gollan et al.: not reported but could be calculated. Gollan et al.: not reported but could be calculated.
Nostalgia as a positive emotional experience. A predominantly positive, albeit bittersweet emotion that arises from personally relevant and longful memories of one’s past. Nostalgia was once considered a disease or mental illness, but it has been shown to counteract loneliness, boredom and anxiety.
Statistics
- Status: replicated
- Original paper: ‘
Nostalgia: A Psychological Perspective’, Batcho 1995; Cross-sectional survey to assess nostalgia for 20 aspects of experience, n=648. [citations=399(GS, February 2023)].
- Critiques:
Wildschut et al. 2006 [Total N=504 over seven studies, citations=1460(GS, February 2023)].
- Original effect size: Factor analysis suggested that nostalgia is composed of five factors reflecting different spheres and levels of experience. ES not reported; however, the regression coefficient for nostalgia on judgement of the past was positive (0.22, p < .0001), suggesting that nostalgia increases as the past is perceived more favourably.
- Replication effect size: Wildschut et al.: Nostalgic autobiographical narratives were richer in expressions of positive than negative affect, ηp2 = 0.783 [calculated from the reported F statistic, F(1, 41) = 147.62, using this conversion]; participants expressed significantly more positive than negative affect when describing how writing the nostalgic narrative made them feel, ηp2 = 0.535 [calculated from the reported F statistic, F(1, 171) = 196.56, using this conversion], and reported more positive than negative affect on the PANAS measure, ηp2 = 0.633 [calculated from the reported F statistic, F(1, 171) = 294.61, using this conversion]; relative to participants in the control condition, those in the nostalgia condition scored higher on measures of social bonding, ηp2 = 0.205 [calculated from the reported F statistic, F(1, 50) = 12.88, using this conversion], positive self-regard, ηp2 = 0.238 [calculated from the reported F statistic, F(1, 50) = 15.63, using this conversion], and positive affect, ηp2 = 0.139 [calculated from the reported F statistic, F(1, 50) = 8.05, using this conversion].
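The ηp2 values in this entry were derived from reported F statistics. The linked conversion is not reproduced here; the following is a minimal sketch, assuming the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The function name is illustrative and not part of the original resource.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic.

    Assumes the standard identity
    eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    effect_term = f_value * df_effect
    return effect_term / (effect_term + df_error)


# F statistics listed above for Wildschut et al. (2006): (F, df_effect, df_error)
reported = [
    (147.62, 1, 41),
    (196.56, 1, 171),
    (294.61, 1, 171),
    (12.88, 1, 50),
    (15.63, 1, 50),
    (8.05, 1, 50),
]

for f_value, df_effect, df_error in reported:
    eta = partial_eta_squared(f_value, df_effect, df_error)
    print(f"F({df_effect}, {df_error}) = {f_value}: eta_p^2 = {eta:.3f}")
```

Under this assumption the six values round to the figures listed in the entry (0.783, 0.535, 0.633, 0.205, 0.238, 0.139).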
Spacing effect. Long-term memory is enhanced when learning events are spaced apart in time rather than massed in immediate succession.
Statistics
- Status: replicated
- Original paper: ‘Memory: A contribution to experimental psychology’, Ebbinghaus 1964; series of single-case studies, n=1. [citations=6103 (GS, September 2022)].
- Critiques:
Cepeda et al. 2006, meta-analysis [n= 184 articles, citations=1894 (GS, September 2022)].
Janiszewski et al. 2003, meta-analysis [n= 97 verbal learning studies, citations= 373 (GS, September 2022)].
- Original effect size: N/A.
- Replication effect size: Cepeda et al.: Cohen’s d for the difference in accuracy between massed and spaced learning trials in verbal recall tasks = 0.567 (calculated). Janiszewski et al.: ηp2 = 0.093 (calculated from the reported F(1, 478) = 49.23, p < .01 using this conversion) for a linear relationship between the number of lags between learning events and the accuracy of recall; ηp2 = 0.051 for the log relationship (calculated from the reported F(1, 478) = 25.69, p < .01 using this conversion).
False memories - eyewitness testimony. A phenomenon of recalling a real event that differs from what actually happened or an event that never occurred.
Statistics
- Status: not replicated.
- Original paper: ‘Reconstruction of Automobile Destruction: An Example of the Interaction Between Language and Memory’, Loftus and Palmer 1974; experimental design, Experiment 1 n = 45, Experiment 2 n = 150. [citations = 3,049 (GS, October 2022)].
- Critiques:
Goldschmied et al. 2016 [Experiment 1 n = 115, Experiment 2 n = 112, citations = 9 (GS, October 2022)].
Raghunath et al. 2021 [n = 155, citations = 0 (GS, October 2022)].
Salovich et al. 2020, unpublished replication [n = 145, citations = NA].
Winter and Marmolejo 2001 [n = 60, citations = 0 (GS, October 2022)].
- Original effect size: Effect sizes not reported in paper, but estimated using the test statistics. d= 0.40 and φ = .23.
- Replication effect size: Raghunath et al.: NA, but in their linear mixed-effects model, they report no effect of question phrasing being present (not replicated). Salovich et al.: d = 0.07 and d = 0.22 (calculated using descriptive statistics reported) (not replicated). Winter and Marmolejo: ηp2 = -.02 (not replicated). Goldschmied et al.: Experiment 1: η2 = .00 to .02 (not replicated); Experiment 2: η2 = .07 (not replicated - went in opposite direction from original study).
Context-dependent memories. The improved recall or recognition of information when cues in the environment are the same during both encoding and retrieval.
Statistics
- Status: mixed (replicated, but smaller effect-size).
- Original paper: ‘Context-dependent memory in two natural environments: On land and underwater’, Godden and Baddeley 1975; experimental design, Experiment 1 n = 18, Experiment 2 n = 16. [citations = 2,447 (GS, October 2022)].
- Critiques:
Godden and Baddeley 1980 [n = 16, citations = 449 (GS, October 2022)].
Isarida et al. 2012 [Experiment 1 n = 80, citations = 24 (GS, October 2022)].
Martin and Aggleton 1993 [n = 40, citations = 42 (GS, October 2022)].
Murre 2021 [n = 16, citations = 3 (GS, October 2022)].
Smith and Vila 2001 [meta-analysis; k = 93 studies, citations = 1,046 (GS, October 2022)].
- Original effect size: Not reported in the paper, but can be estimated from the test-statistics. The original effect size was dz = 1.35. This is calculated using the test-statistics provided: F(1, 12) = 22.0, p < .001.
- Replication effect size: Godden and Baddeley: NA. This study found no significant difference in recognition performance across contexts (not replicated). Isarida et al.: ηp2 = .05 (replicated). Martin and Aggleton: d = 0.69 (estimated from test-statistics) (replicated). Murre: d = 0.37 (estimated from test-statistics). Result non-significant (p > .050). However, effect size is similar to meta-analyses (mixed). Smith and Vila: d = 0.28 [0.23, 0.33] overall; Recall: d = 0.29 [0.21, 0.37], Recognition: d = 0.27 [0.18, 0.36] (replicated).
Motor priming. Motor priming refers to the phenomenon where a previous motor action influences the subsequent execution of a motor task. Scientific findings have shown that motor priming can have a moderate to large effect on task performance. The effect size can depend on the specific task used, the population studied, and the experimental design.
Statistics
- Status: mixed
- Original paper: ‘A priming method for investigating the selection of motor responses’, Rosenbaum and Kornblum 1982; experimental design, n=6. [citations= 227 (GS, March 2023)].
- Critiques:
da Silva et al. 2020 [n=814 (36 articles, meta-analysis), citations = 10 (PubMed, January 2023)].
Kiesel et al. 2007 [Theoretical paper, n=NA, citations=138 (GS, March 2023)].
Stoykov and Madhavan 2015 [Review, n=NA, citations=148 (GS, March 2023)].
- Original effect size: not reported.
- Replication effect size: da Silva: Mean Difference = 8.64 [10.85, 16.43], Z = 2.17, p = .003, d=0.30 (estimated from the Z value using d=Z/sqrt(n) equation). Kiesel et al.: not reported. Stoykov and Madhavan: not reported.
Flanker task. The Flanker task is a measure of inhibition of prepotent responses. Responses to target stimuli flanked by irrelevant stimuli of the opposite response set (incongruent) are significantly more impaired than responses to targets flanked by irrelevant stimuli of the same response set (congruent).
Statistics
- Status: replicated
- Original paper: ‘Effects of noise letters upon the identification of a target letter in a non-search task’, Eriksen and Eriksen 1974; within-subject design, n=6. [citations= 8085 (GS, August 2022)].
- Critiques:
Miller 1991 [Experiment 1: n=36, Experiment 2: n=42, Experiment 3: n=24, Experiment 4: n=32, Experiment 5: n=32, Experiment 6: n=32, citations= 370 (GS, August 2022)].
- Original effect size: Spacing condition: ES = 2.96, Noise condition: ES= 2.09.
- Replication effect size: Miller: For the noise condition only (i.e., response compatible/incompatible), reaction times: ES: Experiment 1 = 0.89, Experiment 2 = 0.23, Experiment 3 = 0.74, Experiment 4 = 0.58, Experiment 5 = 1.28, Experiment 6 = 0.47; percent accurate: ES: Experiment 1 = 0.39, Experiment 2 = 0.23, Experiment 4 = 0.72, Experiment 5 = 0.83, Experiment 6 = 0.35; for the spacing condition: Experiment 1, wide separation, ES = 0.40.
Mere Exposure Effect. Participants who are repeatedly exposed to the same stimuli rate them more positively than stimuli that have not been presented before.
Statistics
- Status: replicated
- Original paper: ‘Attitudinal effects of mere exposure’ (Journal of Personality and Social Psychology), Zajonc, 1968; correlational and experimental evidence, n=NA. [citations=9458(GS, February 2022)].
- Critiques:
Bornstein 1989 [Meta-analysis, total N = 33047, citation=2944(GS, February 2022)].
- Original effect size: Experiment 1, Nonsense words, ηp2 = 0.078 [ηp2 calculated from reported F(5,355) = 5.64, p < .001 using this conversion]; Experiment 2, Chinese characters, ηp2 = 0.066 [ηp2 calculated from reported F(5, 335) = 4.72, p < .001 using this conversion]; Experiment 3, Photographs, ηp2 = 0.129 [ηp2 calculated from reported F(5, 355) = 9.96, p < .001 using this conversion].
- Replication effect size: Bornstein: combined effect size r = .260.
Cocktail Party Effect. Participants hear their own name being presented in the irrelevant message during a dichotic listening task.
Statistics
- Status: replicated
- Original paper: ‘Attention in dichotic listening: Affective cues and the influence of instructions’, Moray 1959; experimental design, n1=1, n2=12, n3=28. [citations=1972 (GS, February 2022)].
- Critiques:
Conway et al. 2001 [n=40, citation=1195 (GS, February 2022)].
Röer and Cowan 2021 [n=80, citation=3 (GS, February 2022)].
Wood and Cowan 1995 [Replication, n=34, citation=467 (GS, February 2022)].
- Original effect size: Detection rate = 33%.
- Replication effect size: Conway et al.: Detection rate = 43%. Röer and Cowan: Detection rate = 29%. Wood and Cowan: Detection rate = 35%.
Mental simulation - mismatch advantage: object colour. Readers verify pictures more quickly when they mismatch rather than match the object colour implied by the preceding sentence.
Statistics
- Status: reversed
- Original paper: ‘Representing object colour in language comprehension’, Connell 2007; experimental design, n = 44. [citations=155(GS, November 2022)].
- Critiques:
de Koning et al. 2017 [n = 139, citations = 14 (GS, November, 2022)].
Mannaert et al. 2017 [Experiment 1: n= 205, citations= 33(GS, November 2022)].
Zwaan and Pecher 2012 [Experiment 3a: n = 152, Experiment 3b: n = 152, citations=192(GS, November 2022)].
- Original effect size: d = 0.26 [calculated using this conversion].
- Replication effect size: de Koning et al.: d = .48. Mannaert et al.: Experiment 1: d = 0.26. Zwaan and Pecher: Experiment 3a: d = 0.32 [calculated using this conversion]; Experiment 3b: d = 0.18 [calculated using this conversion].
Mental simulation - match advantage object orientation. Readers verify pictures more quickly when they match rather than mismatch the object orientation from the preceding sentence.
Statistics
- Status: replicated
- Original paper: ‘The effect of implied orientation derived from verbal context on picture recognition’, Stanfield and Zwaan 2001; experimental design, n=40. [citations=897(GS, November 2022)].
- Critiques:
de Koning et al. 2017 [n = 160, citation = 14(GS, November, 2022)].
Rommers et al. 2013 [Experiment 1: n = 52, Experiment 2: n = 44, Experiment 3: n = 88, citation = 48(GS, November 2022)].
Zwaan and Pecher 2012 [Experiment 1a: n=176; Experiment 1b: n=176, citations=192 (GS, November 2022)].
- Original effect size: d = .13.
- Replication effect size: de Koning et al.: d = .07. Rommers et al.: Experiment 1: d = .14, Experiment 2: d = .12, Experiment 3: d = .14 [calculated using this conversion]. Zwaan and Pecher: Experiment 1a: d = .10; Experiment 1b: d = .09.
Mental simulation - match advantage object distance. Readers verify small pictures more quickly when they are far from the protagonist, in contrast to big pictures, while big pictures are verified more quickly when closer to the protagonist, as opposed to smaller pictures.
Mental simulation - match advantage object number. Verification response was faster for concept-object match when there was numerical congruence (compared with incongruence) between the number word and quantity.
Statistics
- Status: mixed
- Original paper: ‘The conceptual representation of number’, Patson et al. 2014; experimental design, n = 63. [citations = 29(GS, November 2022)].
- Critiques:
Beg et al. 2021 [Experiment 1: n = 63, Experiment 2: n = 68, Experiment 3: n =42, citation = 5(GS, November 2022)].
Patson et al. 2016 [Experiment 1: n = 63, Experiment 2: n = 63, citation = 11(GS, November 2022)].
Patson 2021 [Experiment 1: n = 62, Experiment 2: n = 83, citation = 1(GS, November 2022)].
Šetić and Domijan 2017 [Experiment 1: n = 48, Experiment 2: n = 33, citation = 10(GS, November 2022)].
- Original effect size: ηp2 = 0.11.
- Replication effect size: Beg et al.: Experiment 1: ηp2 = 0.11, Experiment 2: ηp2 not reported, Experiment 3: ηp2 = 0.05. Patson et al.: Experiment 1: ηp2 = 0.03, Experiment 2: ηp2 = 0.06. Patson: Experiment 1: ηp2 = 0.08, Experiment 2: ηp2 = 0.12. Šetić and Domijan: Experiment 1: ηp2 = 0.13, Experiment 2: ηp2 = 0.21.
Mental simulation - match advantage object shape. Readers verify pictures more quickly when they match rather than mismatch the object shape from the preceding sentence.
Statistics
- Status: replicated
- Original paper: ‘Language Comprehenders Mentally Represent the Shapes of Objects’, Zwaan et al. 2002; experiment, study 1: n = 51, study 2: n = 57. [citations=1031(GS, November 2022)].
- Critiques:
de Koning et al. 2017 [n = 160, citation = 14(GS, November, 2022)].
Ostarek et al. 2019 [n1=115, n2=114, n3=112, n4=115, citations = 21 (GS, November 2022)].
Rommers et al. 2013 [study 1: n = 52, study 2: n = 44; study 3: n = 88, citation = 48(GS, November 2022)].
Zwaan and Pecher 2012 [experiment 2a n= 176, experiment 2b n=176, citations=192(GS, November 2022)].
- Original effect size: Study 1: d = 0.58, Study 2: d = 0.39 [calculated using this conversion].
- Replication effect size: de Koning et al.: d = 0.27. Ostarek et al.: experiment 1: d = 0.22; experiment 2: d = 0.20; experiment 3: d = 0.13; experiment 4: d = 0.19 [calculated using this conversion from t to Cohen’s d]. Rommers et al.: study 1: ηp2 = .016/d = 0.12 [calculated, using this conversion]; study 2: ηp2 = .46/d = 0.91 [calculated, using this conversion]; study 3: ηp2 = .11/d = 0.35 [calculated, using this conversion]. Zwaan and Pecher: experiment 2a: d = 0.25 [calculated using this conversion from t to Cohen’s d], experiment 2b: d = 0.30 [calculated using this conversion from t to Cohen’s d].
Mental simulation - match advantage object size. Readers verify small imagined pictures more quickly when they are small real pictures, in contrast to big real pictures, while big imagined pictures are verified more quickly when they are big real pictures, as opposed to small real pictures.
Mental simulation - bigger is better effect. Items that are big in real size are processed more quickly than items that are small in real size.
Statistics
- Status: replicated
- Original paper: ‘Size matters: Bigger is faster’, Sereno et al. 2009; experiment, n = 28. [citations=47(GS, November 2022)].
- Critiques:
Kang et al. 2011 [n=80, citations=23(GS, November 2022)].
Wei and Cook 2016 [n =42, citations = 7 (GS, November, 2022)].
Yao et al. 2013 [n = 60, citations =24(GS, November 2022)].
Yao et al. 2022 [Experiment 2: n = 56, citations =0(GS, November 2022)].
- Original effect size: d = 0.52.
- Replication effect size: Kang et al.: d = 0.14. Wei and Cook: d = 0.37 [calculated using this conversion from partial eta to Cohen’s d]. Yao et al. 2013: d = 0.59 [calculated using this conversion from partial eta to Cohen’s d]. Yao et al. 2022: Experiment 2: d = 0.43 [calculated using this conversion from t to Cohen’s d].
Transposed word effect. Responses to transposed word sequences (e.g. “you that read wrong”) are more error-prone and judged as ungrammatical compared with a control sequence (e.g. “you that read worry”).
Statistics
- Status: replicated
- Original paper: ‘You that read wrong again! A transposed-word effect in grammaticality judgments’, Mirault et al. 2018; two experiments, laboratory: n = 57, online: n = 94. [citations=47(GS, November 2022)].
- Critiques:
Huang and Staub 2022 [Experiment 1: n = 49, Experiment 2: n = 51, citations=0(GS, November 2022)].
Liu et al. 2020 [Experiment 1: n = 63, Experiment 2: n = 69, Experiment 3: n = 63, citations=5(GS, November 2022)].
Liu et al. 2021 [Experiment 1: n = 60, Experiment 2: n = 32, citations=4(GS, November 2022)].
Liu et al. 2022 [n = 112, citations=2(GS, November 2022)].
Mirault et al. 2020 [n = 112, citations=13(GS, November 2022)].
Mirault et al. 2022 [Experiment 1: n = 60, Experiment 2: n = 32, citations=4(GS, November 2022)].
Pegado and Grainger 2019a [n = 28, citations=11(GS, November 2022)].
Pegado and Grainger 2019b [Experiment 1: n = 28, Experiment 2: n = 28, citations=13(GS, November 2022)].
Pegado and Grainger 2021 [n = 28, citations=6(GS, November 2022)].
Pegado et al. 2021 [n = 31, citations=2(GS, November 2022)].
Snell and Grainger 2019 [n = 24, citations=21(GS, November 2022)].
Wen et al. 2021a [n = 40, citations=3(GS, November 2022)].
Wen et al. 2021b [Experiment 2: n = 26, citations=10(GS, November 2022)].
Wen et al. 2022 [n = 124, citations=0(GS, November 2022)].
- Original effect size: laboratory: d = 1.86, online: d = 1.58.
- Replication effect size: Huang and Staub: Experiment 1: d = 1.27 [calculated using this conversion from t to Cohen’s d], Experiment 2: d = 0.97 [calculated using this conversion from t to Cohen’s d]. Liu et al. 2020: Experiment 1: d = 1.37 [calculated using this conversion from t to Cohen’s d], Experiment 2: d = 1.37 [calculated using this conversion from t to Cohen’s d], Experiment 3: d = 1.26 [calculated using this conversion from t to Cohen’s d]. Liu et al. 2021: Experiment 1: d = 2.40 [calculated using this conversion from t to Cohen’s d], Experiment 2: d = 1.69 [calculated using this conversion from t to Cohen’s d]. Liu et al. 2022: serial visual presentation: d = 1.02 [calculated using this conversion from t to Cohen’s d], parallel visual presentation: d = 2.00 [calculated using this conversion from t to Cohen’s d]. Mirault et al. 2020: d = 0.52 [calculated using this conversion from t to Cohen’s d]. Mirault et al. 2022: Experiment 1: d = 0.40 [calculated using this conversion from t to Cohen’s d]; Experiment 2: d = 0.88. Pegado and Grainger 2019a: d = 0.73 [calculated using this conversion from t to Cohen’s d]. Pegado and Grainger 2019b: Experiment 1: d = 2.97 [calculated using this conversion from t to Cohen’s d], Experiment 2: d = 0.78 [calculated using this conversion from t to Cohen’s d]. Pegado and Grainger 2021: d = 1.40. Pegado et al. 2021: d = 2.08 [calculated using this conversion from t to Cohen’s d]. Snell and Grainger: d = 1.58 [calculated using this conversion from t to Cohen’s d]. Wen et al. 2021a: d = 0.64 [calculated using this conversion from t to Cohen’s d]. Wen et al. 2021b: Experiment 2: d = 1.62 [calculated using this conversion from t to Cohen’s d]. Wen et al. 2022: d = 0.32 [calculated using this conversion from t to Cohen’s d].
Personality > intelligence predicting life outcomes. Personality is generally more predictive than IQ of a variety of important life outcomes, such as educational attainment and wage.
Statistics
- Status: not replicated
- Original paper: ‘What grades and achievement tests measure’, Borghans et al. 2016; correlational study, n=23,023 over four large-scale survey datasets. [citations=265(GS, January 2023)].
- Critiques:
Zisman and Ganzach 2022 [n=26,600 over six large-scale datasets, citations=5(GS, January 2023)].
- Original effect size: Personality more predictive of education, R2 = 0.143, grades, R2 = 0.028 to R2 = 0.093, and wage, R2 = 0.021 to R2 = 0.053, than intelligence (education – R2 = 0.108; grades – R2 = 0.009 to R2 = 0.216; wage – R2 = 0.024 to R2 = 0.18).
- Replication effect size: Zisman & Ganzach: Intelligence more predictive of educational attainment, R2 = 0.120 to R2 = 0.328 (average R2 = 0.232), grades, R2 = 0.175 to R2 = 0.268 (average R2 = 0.229), and pay, R2 = 0.031 to R2 = 0.148 (average R2 = 0.080), than personality (educational attainment – R2 = 0.029 to R2 = 0.079, average R2 = 0.053; grades – R2 = 0.011 to R2 = 0.041, average R2 = 0.024; pay – R2 = 0.021 to R2 = 0.079, average R2 = 0.040) (not replicated).
Error salience (epistemic contextualism effects). Judgments about “knowledge” are sensitive to the salience of error possibilities. The proposed explanation is that making error possibilities salient shifts the evidential standard required to truthfully say that someone “knows” something.
Statistics
- Status: mixed.
- Original paper: ‘Knowledge Ascriptions and the Psychological Consequences of thinking about Error’, Nagel 2010; theoretical paper, n=NA. [citations=133(GS, May 2023)].
- Critiques:
Feltz & Zarpentine 2010 [n1=152, citations=128(GS, May 2023)].
Hansen & Chemla 2013 [n1=40, citations=66(GS, May 2023)].
Alexander et al. 2014 [n1=40, n2=187, n3=93 (not relevant here), n4=126, citations=34(GS, May 2023)].
Buckwalter 2017 [review paper, n=NA, citations=20(GS, May 2023)].
Buckwalter 2021 [n1=99, n2=201, n3=203, citations=5(GS, May 2023)].
- Original effect size: NA.
- Replication effect size: Feltz & Zarpentine: Experiment 1 - low versus high practical consequences and error salience d=0.29 (n.s., calculated from the reported t(71)=1.213, p=0.23 using this conversion). Hansen & Chemla: Positive polarity sentences η2 = 0.39 to η2 = 0.59; Negative polarity sentences η2 = 0.11 to η2 = 0.50 (calculated from the reported F statistics in Figure 5 using this conversion). Alexander et al.: Study 1 – d = 1.36; Study 2 – η2 = 0.12 (calculated from the reported F (4, 209) = 7.28, p < .000 using this conversion); Study 4 – d = 1.26. Buckwalter 2017: ES=NA, mixed evidence reported. Buckwalter 2021: Experiment 1 – truth statements d=0.75, belief statements d=0.85, evidence statements d=1.42, actionability statements d=-1.05, knows statements d=1.47; Experiment 2 – truth statements d=0.62, belief statements d=0.06, evidence statements d=0.58, actionability statements d=-0.25, knows statements d=0.43; Experiment 3 – truth statements d=-0.09, belief statements d=-0.39, evidence statements d=0.38, actionability statements d=-0.71, knows statements d=0.41.
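The d = 0.29 listed for Feltz & Zarpentine was converted from a reported t statistic. The linked conversion is not reproduced here; as a minimal sketch, the common approximation d = 2t / √df for two independent groups of roughly equal size gives the same figure. Both the choice of formula and the function name are assumptions made for illustration.

```python
import math


def cohens_d_from_t(t_value: float, df: int) -> float:
    """Approximate Cohen's d from an independent-samples t statistic.

    Assumes two groups of roughly equal size, so that d = 2t / sqrt(df).
    """
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return 2.0 * t_value / math.sqrt(df)


# t(71) = 1.213, as listed above for Feltz & Zarpentine, Experiment 1
d = cohens_d_from_t(1.213, 71)
print(f"d = {d:.2f}")
```

Other conversions exist (for example, for within-subject or one-sample designs), so a different formula may apply to other entries in this list.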
Gettier intuition effect. Participants attributed knowledge in Gettier-type cases (where an individual is justified in believing something to be true but their belief was only correct due to luck) at rates similar to cases of justified true belief.
Statistics
- Status: mixed.
- Original paper: ‘The folk conception of knowledge’, Starmans and Friedman 2012; between-subject experiments, n1a=144, n1b=133, n1c=46, n2=51, n3=43. [citations=183(GS, March 2023)].
- Critiques:
Turri et al. 2015 [n1=135, n2=141, n3=576, n4= 813, citations = 97 (GS, March 2023)].
Hall et al. 2018 (pre-print) [n=4724, Citations=4 (GS, March 2023)].
- Original effect size: Experiment 1a: knowledge attribution exceeded chance in both the Gettier and Control conditions; in the False Belief condition knowledge was attributed less than in the Gettier condition and at rates less than would be expected by chance (ηp2 = 0.47, calculated from the reported F(2,141) = 63.65, p < .001 using this conversion); Experiment 1b: participants attributed knowledge equally in the Control and in the Gettier condition, but less in the False Belief condition than in the Gettier condition (ηp2 = 0.34, calculated from the reported F(2,91) = 23.75, p < .001 using this conversion); Experiment 1c: laypeople consider Gettier cases to be instances of knowledge (d = 1.75, calculated from the reported t(45) = 5.93, p < .001 using this conversion); Experiment 2: participants attributed knowledge in the High justification condition, but not in the Low justification condition (ηp2 = 0.19, calculated from the reported F(1,49) = 11.75, p = .001 using this conversion); Experiment 3: participants readily attributed knowledge when the Gettiered individual formed a belief based on authentic evidence as compared to apparent evidence (ηp2 = 0.29, calculated from the reported F(1,42) = 17.51, p < .001 using this conversion).
- Replication effect size: Turri et al.: knowledge attributions are surprisingly insensitive to lucky events that threaten, but ultimately fail to change, the explanation for why a belief is true, Experiment 1: Cramér’s V = .509, Experiment 2: Cramér’s V = .534, Experiment 3: Cramér’s V = .406, Experiment 4: Cramér’s V = .546 (all replicated). Hall et al.: participants were more likely to attribute knowledge in standard cases of justified true belief than in Gettier cases, pseudo-R2 = 0.12 - 0.15.
Left-cradling bias (Child cradling lateralization). Humans preferentially hold their child on the left body side. This is hypothesised to be modulated by handedness as the dominant hand is preferably free for mundane tasks.
Statistics
- Status: replicated
- Original paper: ‘Handedness as a major determinant of functional cradling bias’, van der Meer and Husby 2007; laboratory study in which left- and right-handers were asked to cradle a baby doll, side of holding was recorded in the studies, n=765. [citations = 67(GS, June 2022)].
- Critiques:
Packheiser et al. 2019 [meta-analysis, n=6799, citations = 27(GS, June 2022)].
- Original effect size: d = 1.06.
- Replication effect size: Packheiser et al.: d = 0.34.
Handedness differences - schizophrenia. Non-right-handedness is more prevalent in individuals with schizophrenia compared to the healthy population.
Handedness differences - depression. Being left-handed is associated with a higher likelihood of being depressed.
Statistics
- Status: mixed
- Original paper: ‘Cerebral laterality and depression: Differences in perceptual asymmetry among diagnostic subtypes’, Bruder et al. 1989; analysis of different patterns of brain lateralization between depressed individuals and controls, n = 70. [citations=202 (GS, January 2023)].
- Critiques:
Denny 2009 [n= 27,482, citations = 49 (Tandfonline, June 2022)].
Elias et al. 2001 [n=541, citations = 37 (ScienceDirect, June 2022)].
Packheiser et al. 2021 [meta-analysis, k=87, n = 35501, citations = 1 (ScienceDirect, June 2022)].
- Original effect size: d= 0.57.
- Replication effect size: Elias et al.: No main effect but a significant interaction with sex; left-handed men show higher depression scores (no effect size). Denny: being left-handed is associated with a higher level of depressive symptoms, no significant interaction with sex (no effect size). Packheiser et al.: No link between handedness and depression (OR = 1.04 [0.95 - 1.15]).
Handedness differences - stuttering. The rate of stuttering was much higher in left-handers than in right-handers.
Statistics
- Status: not replicated
- Original paper: ‘Left-handedness: Association with immune disease, migraine, and developmental learning disorder’, Geschwind and Behan 1982; survey, n= 253. [citations=1637(GS, October 2022)].
- Critiques:
Mohammadi and Papadatou-Pastou 2020 [n = 83 children who stutter, 90 children who do not stutter, citations = 4 (GS, November 2022)].
- Original effect size: not reported.
- Replication effect size: Mohammadi and Papadatou-Pastou: Cramér’s V = 0.125 [calculated using this conversion].
Handedness differences - dyslexia. The rate of learning disabilities was much higher in left-handers than in right-handers.
Handedness differences - intelligence. Left-handedness is associated with lower scores in fluid intelligence.
Statistics
- Status: replicated
- Original paper: ‘Handedness and Intelligence’, Hicks and Beveridge, 1978; correlational survey, n = 67. [citations = 29 (Science Direct, June 2022)].
- Critiques:
Ntolka and Papadatou-Pastou 2017 [systematic review of 36 studies, n = 65,519, citations = 20 (Science Direct, June 2022)].
Papadatou-Pastou & Tomprou 2015 [meta-analysis, n = 16,076, citations = 39 (Science Direct, June 2022)].
Somers et al. 2015 [meta-analysis, k=30, n = 359,890, citations = 63 (Science Direct, June 2022)].
- Original effect size: N/A.
- Replication effect size: Ntolka & Papadatou-Pastou: for a subset of n = 19,744, statistically significant but marginal differences in IQ were found between the right-handed and the left-handed (d = -.07) and between the right-handed and the not-right-handed (d = -.06), each time in favour of the right-handed. Papadatou-Pastou & Tomprou: d = -.09 (for a subset of n = 195). No effect size could be calculated for the rest of the studies in this meta-analysis. Overall, there were higher levels of non-right-handedness among the intellectually impaired, but the level was not different between typically developed individuals and gifted individuals. Somers et al.: No significant differences in overall verbal ability: Hedges’ g = −0.03; spatial ability was significantly higher for right-handed individuals: Hedges’ g = −0.14.
Handedness differences - cognitive ability. Difference in spatial ability between left and right handers. Left handers have a supposed deficit in spatial ability.
Statistics
- Status: mixed
- Original paper: ‘Possible Basis for the Evolution of Lateral Specialization of the Human Brain’, Levy 1969; comparison between spatial IQ (WAIS) in left and right handers in graduate students, n=25. [citations=857 (GS, January 2023)].
- Critiques:
Briggs et al. 1976 [n = 34, citations = 114 (GS, January 2023)].
Inglis and Lawson 1984 [n=1880, citations=37 (GS, January 2023)].
Somers et al. 2015 [meta-analysis, n = 218,351, citations=97 (GS, April 2023)].
- Original effect size: d = 1.42.
- Replication effect size: Briggs et al.: no difference between left and right handers in spatial ability. Inglis and Lawson: no difference between left and right handers in spatial ability. Somers et al.: g = 0.14 (effect was significant, but did not survive sensitivity analyses).
Handedness differences - sexual orientation. Intrauterine testosterone levels may determine both handedness and sexuality, with homosexuals having an increased rate of left-handedness.
Statistics
- Status: mixed
- Original paper: ‘Cerebral Lateralization Biological Mechanisms, Associations, and Pathology: II. A Hypothesis and a Program for Research’, Geschwind and Galaburda 1985; theory paper, so no sample size is available. [citations = 780 (GS, October 2022)].
- Critiques:
Becker et al. 1992 [n = 1,612, citations = 43 (GS, October 2022)].
Lalumière et al. 2001 [n = 23,410 (meta-analysis), citations = 301 (GS, October 2022)].
Lindesay 1987 [n = 194, citations = 101 (GS, October 2022)].
Lippa and Blanchard 2007 [n = 159,779, citations = 159 (GS, October 2022)].
Marchant-Haycox et al. 1991 [n = 774, citations = 53 (GS, October 2022)].
Rosenstein and Bigler 1987 [n = 89, citations = 31 (GS, October 2022)].
Satz et al. 1991 [n = 993, citations = 62 (GS, October 2022)].
Tran et al. 2019 [n = 3,870, citations = 7 (GS, October 2022)].
- Original effect size: NA (based on anecdotal correspondence between Geschwind and Galaburda and the homosexual community).
- Replication effect size: Lindesay: Significantly more homosexual men were left-handed than heterosexual men (χ2(1) = 6.2, p = .013) (replicated). Rosenstein and Bigler: r = .06 (not replicated). Marchant-Haycox et al.: No ES available, but non-significant relationship found between handedness and homosexuality (χ2(1) = 2.6, p = .107) (not replicated). Satz et al.: No ES available, but non-significant effect found between handedness and sexuality (not replicated). Becker et al.: φ = .08 to .11 (replicated). Lalumière et al.: OR = 1.39 (replicated). Lippa and Blanchard: φ(Males) = .02, φ(Females) = .05. Tran et al.: OR(Men) = 0.98 (p > .050), OR(Women) = 1.96 (p < .010). Homosexual women found to be more likely to be “mixed handed” (ambidextrous) (not replicated).
Handedness differences - twins. Handedness differences between twins and singletons. Twins have been suggested to show increased rates of left handedness compared to singletons.
Statistics
- Status: mixed
- Original paper: ‘Handedness in Twins: a Meta-analysis’, Sicotte et al. 1999; meta-analysis on rates of atypical handedness, n = 85,371. [citations = 137 (GS, January 2023)].
- Critiques:
Zheng et al. 2020 [n=631, citations=8 (GS, January 2023)].
Medland et al. 2003 [n = 9176, citations=50 (GS, January 2023)].
De Kovel et al. 2019 [UK Biobank study with n ~500,000, citations=107 (GS, April 2023)].
Pfeifer et al. 2022 [meta-analysis, n = 189,422, citations=10 (GS, April 2023)].
- Original effect size: OR = 1.43 [1.23 - 1.66].
- Replication effect size: Zheng et al.: No difference between singletons and twins. Medland et al.: No difference between singleton and twins. De Kovel et al.: OR = 1.20. Pfeifer et al.: OR = 1.40 [1.26 - 1.57] (replicated).
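Odds ratios with confidence intervals, such as those listed for Sicotte et al. and Pfeifer et al., can be read for direction and precision with a few lines of arithmetic. The sketch below assumes the bracketed ranges are 95% confidence intervals (the entry does not state the level) and that they are symmetric on the log scale; the function names are illustrative.

```python
import math


def ci_excludes_one(lower: float, upper: float) -> bool:
    """True when an odds-ratio interval does not contain 1 (no difference)."""
    return lower > 1.0 or upper < 1.0


def z_from_or_ci(odds_ratio: float, lower: float, upper: float,
                 z_crit: float = 1.96) -> float:
    """Approximate z statistic recovered from an odds ratio and its interval.

    Assumes a 95% interval (z_crit = 1.96) that is symmetric on the log
    scale, so SE = (ln(upper) - ln(lower)) / (2 * z_crit).
    """
    if min(odds_ratio, lower, upper) <= 0:
        raise ValueError("odds ratio and interval bounds must be positive")
    se_log_or = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    return math.log(odds_ratio) / se_log_or


# Values listed above: (label, OR, lower bound, upper bound)
entries = [
    ("Sicotte et al. (original)", 1.43, 1.23, 1.66),
    ("Pfeifer et al. (replication)", 1.40, 1.26, 1.57),
]

for label, odds_ratio, lower, upper in entries:
    z = z_from_or_ci(odds_ratio, lower, upper)
    print(f"{label}: excludes 1 = {ci_excludes_one(lower, upper)}, z = {z:.2f}")
```

Under these assumptions both intervals exclude 1, which is consistent with the entry marking the Pfeifer et al. result as a replication of the original odds ratio.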
Handedness differences - sex. Handedness differences between men and women. Men have been suggested to show increased rates of left-handedness compared to women.
Statistics
- Status: mixed
- Original paper: ‘
Measuring handedness with questionnaires’, Bryden 1977; questionnaire study to assess handedness using factor analysis, n=1106. [citations=963 (GS, January 2023)].
- Critiques:
Cornell and McManus 1992 [n = 266, citations = 11 (GS, January 2023)].
Green and Young 2001 [n=284, citations = 93 (GS, January 2023)].
Holtzen 1994 [n = 260, citations = 45 (GS, January 2023)].
Papadatou-Pastou et al. 2008 [meta-analysis, k = 144 studies, totaling N = 1,787,629 participants, citations = 323(GS, January 2023)].
- Original effect size: OR = 1.38.
- Replication effect size: Green and Young: similar rates of handedness between men and women. Holtzen: similar rates of handedness between men and women. Cornell and McManus: similar rates of handedness between men and women. Papadatou-Pastou et al.: OR = 1.23 [1.19 - 1.27] (replicated).
Overlooking of subtractive change. People systematically default to searching for additive transformations, and consequently overlook subtractive transformations. A tendency to generate and/or select additive ideas over subtractive ones.
Statistics
- Status: replicated
- Original paper:‘
People systematically overlook subtractive changes’, Adams et al. 2021; between-subjects design, N = 2261 across 8 studies. [citations = 75 (GS, October 2022)].
- Critiques:
Fillon et al. 2022 [n=477, citations = 0 (GS, October 2022)].
- Original effect size: χ² between 9.71 and 13.63.
- Replication effect size: Fillon et al.: χ² between 0.11 and 13.8; 5 out of the 6 effects are statistically significant.
Heterogeneity reduces perceived quantity. Sets of multiple colourful or different objects (e.g., stars, squares, triangles) appear to contain fewer items than sets of the same size that consist of only one type of object (e.g., only red triangles).
Statistics
- Status: not replicated
- Original paper: ‘
The presence of variety reduces perceived quantity’, Redden and Hoch, 2009; within-subjects design, Study 1: n = 80, Study 2: n = 57, Study 3: n = 105, Study 4: n = 64. [citations=90(GS, October 2022)].
- Critiques:
Röseler et al., in press [Study 1: n = 104, Study 2: n = 199, Study 3: n = 144, Study 4: n = 82, Study 5: n = 45, Study 6: n = 84, citations = 2 (GS, October 2022)].
- Original effect size: d = 0.394 to d = 2.377.
- Replication effect size: Röseler et al.: d = -0.302 to d = 0.108.
Eye movements and false memories. Lateral eye movements increase false memory rates.
Gaze-liking effect. People are more likely to rate objects as more likeable when they have seen a person repeatedly gaze toward, as opposed to away from the object.
Statistics
- Status: not replicated
- Original paper: ‘
Gaze cuing and affective judgments of objects: I like what you look at’, Bayliss et al. 2006; experiment, Study 1: n = 24, Study 2: n = 24. [citations = 317 (GS, November 2022)].
- Critiques:
King et al. 2011 [n=24, citations=40 (GS, November 2022)].
Tipples and Pecchinenda 2019 [n=98, citations=18 (GS, November 2022)].
Ulloa et al. 2015 [Study 1: n = 36; Study 2: n = 35, citations = 24(GS, November 2022)].
- Original effect size: d=0.94.
- Replication effect size: King et al.: d = 1.32 for trustworthy faces, d = 0.51 for untrustworthy faces, d = 1.12 for congruent faces, d = 0.29 for incongruent faces. Tipples and Pecchinenda: d = 0.02. Ulloa et al.: Experiment 1: garage tools: d = 0.07, kitchen tools: d = 0.35, letters: d = 0.52, symbols: d = 0.47; Experiment 2: garage tools: d = 0.14, kitchen tools: d = 0.25, letters: d = 0.14, symbols: d = 0.02. [All King et al. and Ulloa et al. values calculated using this conversion from t to Cohen’s d.]
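Several of the d values above were converted from reported t statistics. The entry does not say which formula the linked converter applies, so the following is only a minimal sketch of two common textbook conversions, one for independent groups and one for paired (within-subject) designs. The function names and the example inputs are illustrative assumptions, not values taken from the cited studies.

```python
import math


def d_from_independent_t(t: float, n1: int, n2: int) -> float:
    """Cohen's d from an independent-samples t: d = t * sqrt(1/n1 + 1/n2)."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


def dz_from_paired_t(t: float, n: int) -> float:
    """Cohen's d_z from a paired-samples t: d_z = t / sqrt(n)."""
    return t / math.sqrt(n)


# Hypothetical inputs, chosen only to show the arithmetic.
print(round(d_from_independent_t(2.0, 50, 50), 3))
print(round(dz_from_paired_t(3.0, 36), 3))
```

Because the two formulas can give quite different values for the same t, it helps to note which design a converted d assumes when comparing entries.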
Phonological working memory impairment in dyslexic adults. Dyslexic individuals show lower scores on phonological working memory, as measured with a nonword repetition task.
Phonological monitoring impairment in dyslexic adults. Dyslexic adults show lower scores on phonological monitoring than neurotypical adults.
Phonological awareness impairment in dyslexic adults. Dyslexic adults show lower scores on phonological awareness than neurotypical adults.
Phonemic fluency impairment in dyslexic adults. Dyslexic adults show lower scores on phonemic fluency tasks than neurotypical adults. Phonemic fluency tasks are a type of verbal fluency task, where people are asked to generate as many words as possible according to a specific criterion relating to phonemes, for instance words starting with the letter ‘M’.
Statistics
- Status: replicated
- Original paper: ‘
Organizational deficits in dyslexia: Possible frontal lobe dysfunction’, Levin 1990; experiment, children with dyslexia: n = 20, dyslexic children: n = 20. [citations = 97 (GS, November 2022)].
- Critiques:
Frith et al. 1994 [NT: n = 19, LD: n = 19, citations = 80(GS, November 2022)].
Hatcher et al. 2002 [NT: n = 50, AWD: n = 23, citations = 426(GS, November 2022)].
Marzocchi et al. 2008 [neurotypical children: n = 30, dyslexic children: n = 22, ADHD children: n = 35, citations = 202(GS, November 2022)].
Menghini et al. 2010 [neurotypical children and adolescents: n = 65, dyslexic children and adolescent: n = 60, citations = 330(GS, November 2022)].
Moura et al. 2014 [neurotypical children: n = 50, dyslexic children: n = 50, citations = 104(GS, November 2022)].
Plaza et al. 2002 [neurotypical age-matched children: n = 26, neurotypical reading-age matched children: n = 26, dyslexic children: n = 26, citations = 81 (GS, November 2022)].
Reiter et al. 2005 [neurotypical children: n = 42, dyslexic children: n = 42, citations = 485 (GS, November 2022)].
Shareef et al. 2019 [AWD: n = 16, NT: n = 26, citations = 6 (GS, November 2022)].
Smith-Spark et al. 2017 [AWD: n = 28, NT: n = 28, citations = 36 (GS, November 2022)].
Snowling et al. 1997 (https://www.tandfonline.com/doi/full/10.1080/02699931.2018.1468732) [NT: n = 19, AWD: n = 14, citations = 262 (GS, November 2022)].
Varvava et al. 2014 [AWD: n = 60, NT: n = 65, citations = 177 (GS, November 2022)].
Wilson and Lesaux 2001 [NT: n = 31, AWD: n = 28, citations = 265 (GS, November 2022)].
- Original effect size: d = -0.32 [calculated using this conversion from t to Cohen’s d].
- Replication effect size: Frith et al.: r = 0.50 [calculated using the conversion from Mann-Whitney U test to r]. Hatcher et al.: Effect size = 0.82. Marzocchi et al.: ηp2 = .20 / d = 0.25 [calculated using this conversion]. Menghini et al.: Effect size (%) = 7.3. Moura et al.: ηp2 = .134 / d = 0.15 [calculated using this conversion]. Plaza et al.: age-matched neurotypical children vs. dyslexic children: ηp2 = 0.33 [calculated using this conversion] / d = 0.48 [calculated using this conversion]; reading-age-matched neurotypical children vs. dyslexic children: ηp2 = 0.08 [calculated using this conversion] / d = 0.09 [calculated using this conversion]. Reiter et al.: NA = 0.489. Shareef et al.: d = 1.02. Smith-Spark et al.: β = .388. Snowling et al.: Effect size (% of variance explained) = 9.56. Varvava et al.: φ = 0.26 [calculated using the conversion from Chi square to Phi coefficient]. Wilson and Lesaux: d = 0.57.
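The φ and r values in this entry are described as conversions from a chi-square statistic and from a Mann-Whitney U test. The sketch below shows the usual formulas, assuming a 2 × 2 table (1 degree of freedom) for φ and a U test that has already been expressed as a standardised Z score for r. The function names and inputs are hypothetical, and the linked converters may differ in detail.

```python
import math


def phi_from_chi2(chi2: float, n: int) -> float:
    """Phi coefficient for a 2x2 table: phi = sqrt(chi2 / N)."""
    return math.sqrt(chi2 / n)


def r_from_z(z: float, n: int) -> float:
    """Effect size r from a standardised test statistic: r = |Z| / sqrt(N)."""
    return abs(z) / math.sqrt(n)


# Hypothetical inputs, chosen only to show the arithmetic.
print(round(phi_from_chi2(4.0, 100), 3))
print(round(r_from_z(-2.0, 16), 3))
```

Both quantities depend on the total sample size N, so the same test statistic implies a smaller effect in a larger sample.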
Semantic fluency impairment in dyslexic adults. Dyslexic adults show lower scores on semantic fluency than neurotypical adults. Semantic fluency tasks are a type of verbal fluency task, where people are asked to generate as many words as possible according to a specific criterion, for instance items that are part of the same category, such as foods.
Statistics
- Status: mixed
- Original paper: ‘
Organizational deficits in dyslexia: Possible frontal lobe dysfunction’, Levin 1990; experiment, children with dyslexia: n = 20, dyslexic children: n = 20. [citations = 97 (GS, November 2022)].
- Critiques:
Frith et al. 1994 [NT: n = 19, LD: n = 19, citations = 80(GS, November 2022)].
Hall and McGregor 2017 [NT: n = 132, LD: n = 53, citations = 25(GS, November 2022)].
Hatcher et al. 2002 [NT: n = 50, AWD: n = 23, citations = 426(GS, November 2022)].
Kinsbourne et al. 1991 [AWD: n = 23, NT: n = 21; Recovered AWD: n = 11, citations = 144 (GS, November 2022)].
Marzocchi et al. 2008 [neurotypical children: n = 30, dyslexic children: n = 22, ADHD children: n = 35, citations = 202(GS, November 2022)].
Menghini et al. 2010 [neurotypical children and adolescents: n = 65, dyslexic children and adolescent: n = 60, citations = 330(GS, November 2022)].
Moura et al. 2014 [neurotypical children: n = 50, dyslexic children: n = 50, citations = 104(GS, November 2022)].
Plaza et al. 2002 [neurotypical age-matched children: n = 26, neurotypical reading-age matched children: n = 26, dyslexic children: n = 26, citations = 81 (GS, November 2022)].
Reid et al. 2007 [neurotypical students: n = 15, AWD: n = 15, citations = 134 (GS, November 2022)].
Reiter et al. 2005 [neurotypical children: n = 42, dyslexic children: n = 42, citations = 485 (GS, November 2022)].
Shareef et al. 2019 [AWD: n = 16, NT: n = 26, citations = 6 (GS, November 2022)].
Smith-Spark et al. 2017 [AWD: n = 28, NT: n = 28, citations = 36 (GS, November 2022)].
Snowling et al. 1997 [NT: n = 19, AWD: n = 14, citations = 262 (GS, November 2022)].
Varvava et al. 2014 [AWD: n = 60, NT: n = 65, citations = 177 (GS, November 2022)].
- Original effect size: d = 0.37 [calculated using this conversion from t to Cohen’s d].
- Replication effect size: Frith et al.: not reported. Hall and McGregor: ηp2 = .04 / d = 0.04 [calculated using this conversion to Cohen’s d]. Hatcher et al.: Effect size = 0.46. Kinsbourne et al.: recovered dyslexics vs. controls: d = 0.49; severe dyslexics vs. controls: not reported. Marzocchi et al.: ηp2 = .01 / d = 0.01 [calculated using this conversion to Cohen’s d]. Menghini et al.: NA (%) = 17.2. Moura et al.: ηp2 = .115 / d = 0.129 [calculated using this conversion to Cohen’s d]. Plaza et al.: age-matched neurotypical children vs. dyslexic children: ηp2 = 0.26 [calculated using this conversion] / d = 0.34 [calculated using this conversion to Cohen’s d]; reading-age-matched neurotypical children vs. dyslexic children: not reported. Reid et al.: d = -0.2. Reiter et al.: NA = 0.814. Shareef et al.: d = 0.80. Smith-Spark et al.: β = -.216. Snowling et al.: Effect size (% of variance explained) = 17.2. Varvava et al.: φ = 0.40 [calculated using the conversion from Chi square to Phi coefficient].
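Several values in this entry pair a partial eta squared with a converted d. The formula behind the linked converter is not stated here, so the sketch below only illustrates one commonly used route, via Cohen's f, under the assumption of two equally sized groups (where d = 2f). The function name and the example input are illustrative, not taken from the cited studies.

```python
import math
from typing import Tuple


def eta2_to_f_and_d(eta2: float) -> Tuple[float, float]:
    """Convert (partial) eta squared to Cohen's f and, assuming two equal groups, to d.

    f = sqrt(eta2 / (1 - eta2)); d = 2 * f holds only for two equally sized groups.
    """
    if not 0.0 <= eta2 < 1.0:
        raise ValueError("eta squared must lie in [0, 1)")
    f = math.sqrt(eta2 / (1.0 - eta2))
    return f, 2.0 * f


# Hypothetical input, chosen only to show the arithmetic.
f, d = eta2_to_f_and_d(0.2)
print(round(f, 3), round(d, 3))
```

With more than two groups, or with a within-subject design, d = 2f no longer applies, so converted values should be read with that caveat in mind.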
Lexical precision lexical competition. The direction and magnitude of inhibitory priming in word targets with dense neighbourhoods is moderated by spelling.
Statistics
- Status: mixed
- Original paper: ‘
Lexical Precision in Skilled Readers: Individual Differences in Masked Neighbor Priming’, Andrews and Hersch 2010; experiment, Experiment 1: n = 97, Experiment 2: n = 123. [citations = 207 (GS, November 2022)].
- Critiques:
Elsherif et al. 2022a [n=84, citations=8 (GS, November 2022)].
Elsherif et al. 2022b [n=84, citations=28 (GS, November 2022)].
- Original effect size: Experiment 1: ηp2 = 0.06 / d = 0.062 [calculated using this conversion], Experiment 2: ηp2 = 0.04 / d = 0.043 [calculated using this conversion].
- Replication effect size: Elsherif et al.: d = 0.28 [calculated using this conversion from t to Cohen’s d]; Elsherif et al.: d = 0.34 [calculated using this conversion from t to Cohen’s d].
Placebo Effect. Refers to the phenomenon in which a treatment or intervention that has no specific therapeutic effect (such as a sugar pill or saline injection) can still produce a therapeutic response in some individuals. The concept of the placebo effect can be traced back to the 18th century, when physicians and researchers began to notice that patients often reported improvements in their symptoms after receiving treatments that did not have any known physiological effects.
Statistics
- Status: replicated
- Original paper: ‘
The powerful placebo’, Beecher 1955; analysis of 15 studies on patients receiving either a placebo or an active treatment for various conditions such as pain, nausea, and anxiety. [citations= 2885 (GS, January 2023)].
- Critiques:
Hróbjartsson 2001 [a systematic review of 130 clinical trials, n = 4730, citations=2011 (GS, January 2023)].
Tang 2022 [meta-analysis of studies on pain, discomfort, sleep difficulty, and anxiety, k= 15, n= 1506, citations = 3 (GS, January, 2023)].
Yeung 2018 [meta-analysis on placebo on insomnia symptoms, k= 15, n= 566, citations= 56 (GS, January, 2023)].
- Original effect size: NA.
- Replication effect size: Hróbjartsson: Compared with no treatment, placebo had no significant effect on binary outcomes (pooled relative risk of an unwanted outcome with placebo: 0.95 [0.88 to 1.02]). For the trials with continuous outcomes, placebo had a beneficial effect (pooled standardised mean difference in the value for an unwanted outcome between the placebo and untreated groups: MD = -0.28 [-0.38, -0.19]). In trials involving the treatment of pain, placebo had a beneficial effect (MD = -0.27 [-0.40, -0.15]). Tang: g = .298 (replicated). Yeung: placebo treatment led to improved perceived sleep onset latency (g = 0.272), total sleep time (g = 0.322), and global sleep quality (g = 0.581).
Placebo empathy analgesia. Downregulating first-hand pain perception via placebo analgesia (administration of an inert treatment such as a sugar pill) also dampens empathy for another person in pain.
Statistics
- Status: not replicated.
- Original paper:
‘Placebo analgesia and its opioidergic regulation suggest that empathy for pain is grounded in self pain’, Rütgen et al. 2015; between-subjects behavioural and fMRI experiment, n = 102 [citations=189(GS, May 2023)].
- Critiques:
Hartmann et al. 2021 [n=45, citations=3(GS, May 2023)].
Hartmann et al. 2022 [n=90, citations=3(GS, May 2023)].
- Original effect size: d = 0.44 (unpleasantness ratings).
- Replication effect size: Hartmann et al.: ηp2 < 0.001 (unpleasantness ratings, intensity x group); Hartmann et al.: ηp2 = 0.002 (pre-effort unpleasantness ratings, intensity x group), ηp2 = 0.001 (post-effort unpleasantness ratings, intensity x group).
Nocebo effect. This phenomenon is said to occur when negative expectations of an individual about an experience (e.g. a medical treatment) cause the experience to have a more negative effect than it would have otherwise.
Statistics
- Status: replicated
- Original paper: ‘
The nocebo reaction’, Kennedy 1961; editorial, n=NA. [citations=325(GS, January 2023)].
- Critiques:
Petersen et al. 2014 [meta-analysis, n=334, citations=206(GS, January 2023)].
Horváth et al. 2021 [meta-analysis, n=1999, citations=3(GS, January 2023)].
- Original effect size: NA.
- Replication effect size: Petersen et al.: lowest d = 0.65 [0.24, 1.05], highest d = 1.07 [0.65, 1.48]. Horváth et al.: nocebo effects on motor performance, mean effect size = 0.60 [mean ES calculation method was not reported].
Stroop Effect. A phenomenon in which it takes longer to name the ink colour of a word when the word itself is a colour name that is different from the ink colour (e.g. the word “red” printed in blue ink). The Stroop effect is considered a classic demonstration of the interference between different types of information processing.
Statistics
- Status: replicated
- Original paper: ‘
Studies of interference in serial verbal reactions’, Stroop 1935; participants were shown a list of colour words (e.g. “red”, “blue”, “green”) printed in different ink colours and asked to name the ink colour as quickly as possible, n = 70. [citations = 24125 (PSYCNET, January 2023)].
- Critiques:
Damen 2021 [n=66, citations= 1 (GS, April 2023)].
Epp et al. 2012 [meta-analysis, k=47, citations= 235 (GS, April 2023)].
Homack and Riccio 2004 [meta-analysis, k=33, citations= 520 (GS, April 2023)].
MacLeod 1991 [n = NA, citations = 7389 (PsycNet, January 2023)].
MacKenna and Sharma 2004 [n=176, citations= 376 (PUBMED, January 2023)].
- Original effect size: NA.
- Replication effect size: Damen: ηp2 = 0.541 [0.369, 0.652]. Epp et al.: Emotional Stroop task in depression (replicated): on negative stimuli, g=.98, and on positive stimuli, g=.87. Homack and Riccio: individuals with ADHD fairly consistently exhibit poorer performance as compared to normal controls on the Stroop (mean weighted effect size of 0.50 or greater). MacKenna and Sharma: doubt on the fast and non-conscious nature of emotional Stroop.
Disfluency effect. Disfluency, the subjective experience of difficulty associated with cognitive operations, leads to deeper cognitive processing. If information is processed with difficulty or disfluently (e.g. when written in hard-to-read fonts), this experience serves as a cue that the task is difficult or that one’s intuitive (System 1) response is likely to be wrong, thereby activating more elaborate (System 2) processing, resulting in more positive cognitive outcomes.
Statistics
- Status: not replicated
- Original paper: ‘
Overcoming intuition: Metacognitive difficulty activates analytic reasoning’, Alter et al. 2007; four between-subject experiments, Study 1 n=40, Study 2 n=42, Study 3 n=150, Study 4 n=41. [citations=1196(GS, January 2023)].
- Critiques:
Kühl and Eitel 2016 [n=1,079 across 13 studies, citations=64 (GS, January 2023)].
Meyer et al. 2015 [n=7,177 across 13 studies, citations=114(GS, January 2023)].
Thompson et al. 2013 [n=579 across three studies (2c, 3a and 3b), citations=261 (GS, January 2023)].
- Original effect size: Study 1 – participants answered more items on the Cognitive Reflection Test (CRT) correctly in the disfluent font condition than in the fluent font condition, η2 = 0.056 / _d_ = 0.71 [reported in Meyer et al.].
- Replication effect size: Kühl and Eitel: no disfluency effect on cognitive and metacognitive processes and outcomes in any of the thirteen studies reviewed; effect size estimates not reported (not replicated). Meyer et al.: the effect of disfluent font on cognitive reflection test scores in 13 studies ranged from d = -0.25 to d = 0.12 (reported, all non-significant) [not replicated]. Pooled effect of the 17 studies (including Thompson et al. and original Alter et al. study): d = -0.01 (non-significant). Thompson et al.: the effects of disfluent font on cognitive reflection test scores in three studies ranged from d = -0.19 to d = 0.25 (d’s reported in Meyer et al., all non-significant) [not replicated].
Retrieval-induced forgetting (RIF). Forgetting of some items is in part a consequence of remembering other items.
Statistics
- Status: mixed
- Original paper: ‘
Remembering can cause forgetting: Retrieval dynamics in long-term memory’, Anderson et al. 1994; tested retrieval-induced forgetting, three experiments, n = 148. [citations=2065 (GS, January 2023)].
- Critiques:
Jonker et al. 2013 [n=30 across two experiments, citations=175 (GS, December 2022)].
Rowland et al. 2014 [n=72 (experiment 1); n=140 (experiment 2); n=70 (experiment 3), citations=18 (GS, January 2023)].
- Original effect size: NA.
- Replication effect size: Jonker et al.: reported ηp2. Experiment 1: 0.25; Experiment 2a: 0.29; Experiment 2b: 0.19; Experiment 3: standard condition: 0.43, study reinstatement condition: 0.31. Rowland et al.: reported Cohen’s d. Experiment 1: 0.31; Experiment 2: 0.38.
Mood-dependent retrieval (mood-dependent memory, state dependent memory, encoding specificity). Memory is enhanced when an individual’s mood (i.e., emotional state) at retrieval matches their mood at encoding.
Statistics
- Status: mixed
- Original paper: ‘
Emotional mood as a context for learning and recall’, Bower et al. 1978; three between-subjects experiments, Exp. 1: n = 10, Exp. 2: n = 16, Exp. 3: n = 24 [citations = 741 (GS, January 2023)].
- Critiques:
Bower and Mayer 1985 [failed replication of Exp 3, n = 48, citations = 308 (GS, January 2023)].
Eich 1995 [review of 48 studies with mixed evidence, citations = 476 (GS, January 2023)].
- Original effect size: NA.
- Replication effect size: Bower and Mayer: not reported. Eich: not reported (theoretical review).
Perky effect. Mental imagery interferes with perception. If persons were asked to describe their images of common objects while dim facsimiles of the objects were presented before them, they reported only an “imagery,” not a “perceptual,” experience; imagery and stimuli are indistinguishable.
Statistics
- Status: replicated.
- Original paper: ‘
An experimental study of imagination’, Perky 1910; experimental design, Experiment 1 n=3 children, Experiment 2 n=24, Experiment 3 n=5. [citations=933(GS, March 2023)].
- Critiques:
Craver-Lemley and Reeves 1987 [n=125, citations=109(GS, March 2023)].
Okada and Matsuoka 1992 [n=14, citations=26(GS, March 2023)].
Reeves et al. 2020 [n=111, citations=4(GS, March 2023)].
Segal and Fusella 1970 [n1=8, n2=6, citations=579(GS, March 2023)].
Segal and Gordon 1969 [n1=24, n2=24, citations=52(GS, March 2023)].
- Original effect size: ES not reported but the data in all three experiments showed that respondents mistook the perceptual for the imaginative consciousness; they did not report a perception, but the image described resembled the unreported stimulus.
- Replication effect size: Craver-Lemley and Reeves: Mean accuracy for reporting the offset of vertical line targets declined from 80% to 65% when subjects were requested to imagine vertical lines near fixation (replicated). Okada and Matsuoka: the Perky effect described in the auditory modality. The auditory imagery of a pure tone affected the detection only when the frequency of the imaged tone was the same as that of the detected tone (ηp2 = 0.346, calculated from the reported F(4,52) = 6.90, p < .01 using this conversion) (replicated). Reeves et al.: Visual imagery interferes with acuity when performance is good but facilitates it when performance is poor. The mean Perky effect for the 47 subjects who scored over 80% in the No Imagery condition was 21%; average correlation between Perky effects and baseline accuracy level across 111 subjects r = 0.63 (replicated). Segal and Fusella: Mental imagery was found to block detection of both visual and auditory signals; Experiment 1 - sensitivity (d’) was lower during visual (1.70) and auditory imaging (2.13) than in either the preceding (1.93) or following discrimination tasks (1.72) (all _p_s < .001) (replicated); Experiment 2 - sensitivity (d’) was lower during visual (1.48) and auditory imaging (1.68) than in either the preceding (2.64) or following discrimination tasks (2.84) (all _p_s < .001) (replicated). Segal and Gordon: Experiment 1: significant differences in perceptual sensitivity (d’) between the Perky condition (0.74) and the informed task (2.03) (replicated); Experiment 2: greater sensitivity in the discrimination task (d’ = 2.39) compared to the imaging procedures, experimenter-projection (d’ = 1.54) and self-projection (d’ = 1.19) (replicated).
Positive emotions broaden scope of attention. People experiencing positive emotions exhibit broader scopes of attention than do people experiencing no particular emotion.
Statistics
- Status: mixed
- Original paper: ‘
Positive emotions broaden the scope of attention and thought‐action repertoires’, Fredrickson and Branigan, 2005; between-subjects design, n=104. [citations=5037 (GS, March 2023)].
- Critiques:
Bruyneel et al. 2013 [Exp 1: n=35, Exp 2: n=38, Exp 3: n=25, citations=83 (GS, March 2023)].
Huntsinger 2013 [review, citations=137 (GS, March 2023)].
Huntsinger et al. 2010 [Exp 1: n=62, Exp 2: n=72, citations=160 (GS, March 2023)].
- Original effect size: d = 0.375 (calculated using this calculator).
- Replication effect size: Bruyneel et al.: Across three experiments, positive affect consistently failed to exert any impact on selective attention, Exp 1: ηp2 = 0.04, Exp 2: ηp2 = 0.001, Exp 3: ηp2 = 0.01 (null effects). Huntsinger: Rather than having fixed effects on the scope of attention, the impact of positive and negative affect is surprisingly flexible. Huntsinger et al.: Positive affect empowers whatever focus is momentarily dominant, Exp 1: d= 0.58, Exp 2: d= 0.71.
Emotional information facilitates response inhibition. Response inhibition refers to suppression of prepotent responses which are inappropriate to current task demands. In the lab setting, this is investigated with a stop signal task. The effect showed that both fearful and happy faces as stop signals facilitated response inhibition relative to neutral ones.
Statistics
- Status: not replicated
- Original paper: ‘
Interactions between cognition and emotion during response inhibition’, Pessoa et al. 2012; within-subjects design, n=36. [citations=245 (GS, March 2023)].
- Critiques:
Pandey and Gupta 2022 [n=54, citations=3 (GS, March 2023)].
Williams et al. 2020 [Study 1: n=40, Study 2: n=40, Study 3: n=42 (only younger adults sample), citations=12 (GS, March 2023)].
- Original effect size: η2 = 0.17, d = 0.44 (fearful vs neutral), d = 0.33 (happy vs neutral) (calculated using this calculator).
- Replication effect size: Pandey and Gupta: Angry faces as stop signal impaired response inhibition compared to happy faces, d = 0.35. Williams et al.: Fearful faces impaired response inhibition compared to happy faces, Study 1: d = 0.03 (fearful vs neutral), d = 0.04 (happy vs neutral), d = 0.08 (fearful vs happy), Study 2: d = 0.11 (fearful vs neutral), d = 0.04 (happy vs neutral), d = 0.15 (fearful vs happy), Study 3: d = 0.56 (fearful vs neutral), d = 0.04 (happy vs neutral), d = 0.58 (fearful vs happy).
Inhibition induced devaluation. Inhibition-induced devaluation refers to the reduced evaluation of stimuli to which responses were previously inhibited. This effect results in participants bidding less for shapes that were paired with stop signals and giving lower trustworthiness ratings to faces previously paired with stop signals. This effect has several implications for behaviour modification techniques.
Statistics
- Status: replicated
- Original paper: ‘
When approach motivation and behavioral inhibition collide: Behavior regulation through stimulus devaluation’, Veling et al. 2008; within-subjects design, Exp 1: n=33, Exp 2: n=47, Study 3: n=40. [citations=189 (GS, March 2023)].
- Critiques:
Chen et al. 2016 [Exp 1: n=45, Exp 2: n=48, Study 3: n=40, citations=122 (GS, March 2023)].
Wessel et al. 2014 [Exp 1: n=36, Exp 2: n=27, citations=64 (GS, March 2023)].
- Original effect size: Exp 1: nogo vs go: d= -0.49, nogo vs new: d= 0.48, Exp 2: nogo vs go: d= -0.33, nogo vs new: d= -0.53.
- Replication effect size: Wessel et al.: Exp 1: η2= 0.25, Exp 2: η2= 0.24. Chen et al.: Exp 1: nogo vs go: d= -0.39 [-0.71, -0.08], nogo vs untrained: d= -0.57 [-0.71, -0.08], Exp 2: nogo vs go: d= -0.91 [-1.31, -0.55], nogo vs untrained: d = -0.60 [-0.97, -0.27].
Inhibition induced forgetting. Inhibition-induced forgetting refers to impaired memory for the stimuli to which responses were inhibited.
Statistics
- Status: mixed
- Original paper: ‘
Inhibition-induced forgetting: when more control leads to less memory’, Chiu and Egner 2015; within-subjects design, Exp 1: n=54, Exp 2: n=54, Exp 3: n=53. [citations=77 (GS, March 2023)].
- Critiques:
Le and Cho 2020 [Exp 1: n=40, Exp 2: n=48, Exp 3: n=40, citations=1 (GS, March 2023)].
- Original effect size: Exp 1: d= 0.28, Exp 2: d= 0.45, Exp 3: d= 0.3.
- Replication effect size: Le and Cho: Showed inhibition-induced forgetting when stimuli were task-relevant, Exp 1: d = 0.06, Exp 2: d = 0.32, Exp 3: d = 0.31 (calculated using this calculator).
Body-object interaction (BOI) effect in lexical-semantic processing. Words that have higher ratings on the BOI measure receive faster responses (RTs) in lexical-semantic tasks (e.g., lexical decision, semantic decision). The BOI quantifies the ease with which the human body can physically interact with a word’s referent. The BOI effect is thought to show that sensorimotor information contributes to word meaning, providing support for embodied theories of semantic representation.
Statistics
- Status: replicated
- Original paper: ‘
Evidence for the activation of sensorimotor information during visual word recognition: The body–object interaction effect’, Siakaluk et al. 2008; experimental within-subjects design (high-BOI vs low-BOI), Study 1: n=30, Study 2: n = 30. [citations = 172 (GS, April 2023)].
- Critiques:
Siakaluk et al. 2010 [Study 1: n = 35, Study 2: n = 35, Study 3 task: n = 35, citations = 104 (GS, April 2023)].
Tousignant and Pexman 2012 [Study 1 Entity: n = 41, Study 2: n = 39, Study 3: n = 39, Study 4: n = 40, citations = 47 (GS, April 2023)].
Wellsby et al. 2010 [Study 1: n = 25, Study 2: n = 25, Study 3: n = 25, citations = 35 (GS, April 2023)].
- Original effect size: Study 1: η2 = .33. Study 2: η2 = .30.
- Replication effect size: Siakaluk et al.: Study 1a: η2 = .38, Study 1b: η2 = .38, Study 2: η2 = .57. Tousignant and Pexman: Study 1: η2 = 0.69, Study 2: η2 = 0.25, Study 3: η2 = 0.16, Study 4: d = 0.05 (not reported, calculated from the M and SD reported in Table 3 but does not take into account within-subject design). Wellsby et al.: Study 1: η2 = .32, Study 2: η2 = .32, Study 3: η2 = .33.
False memory implantation (false memory fabrication). People can come to remember events that never happened after it is suggested to them that the events occurred. After discussing their memories with a researcher, participants reported a false memory.
Statistics
- Status: replicated
- Original paper: ‘
The formation of false memories’, Loftus & Pickrell 1995; one experimental condition, n = 24. [citations=1813(GS, June 2023)].
- Critiques:
Murphy et al. 2023 [n= 123, citations=1(GS, June 2023)].
- Original effect size: 25% of participants ‘remembered’ the false memory.
- Replication effect size: Murphy et al.: 35% of participants ‘remembered’ the false memory.
Serial dependence. Serial dependence describes a visual bias in which a reported item (e.g., orientation) is systematically attracted towards the previously reported item.
Statistics
- Status: replicated
- Original paper: ‘
Serial dependence in visual perception’, Fischer & Whitney 2014; experiment, n=12. [citations=654 (GS, June 2023)].
- Critiques:
Fritsche et al. 2017 [n=25, citations=326 (GS, June 2023)].
Czoschke et al. 2019 [n1=15, n2=19, citations=44 (GS, June 2023)].
Fischer et al. 2020 [n1=20, n2=49, citations=58 (GS, June 2023)].
- Original effect size: a (height parameter of fitted Derivative-of-Gaussian curve) = 8.19° [NA], a = 6.76° [6.25, 7.28] (calculated), a = 8.75° [8.22, 9.28] (calculated).
- Replication effect size: Fritsche et al.: a = 1.15° [NA] – a = 1.17° [NA]. Czoschke et al.: a = 6.11° [5.28 – 6.94] (calculated), a = 4.11° [3.04 – 5.18] (calculated). Fischer et al.: effect of previous target on the current target a = 2.99° [2.80 – 3.18] (calculated), dest = 1.351, R2 = 0.140, a = 2.00° [1.94 – 2.06] (calculated), dest = 1.123, R2 = 0.118.
Modality-switching cost (modality switch effect). When verifying object properties, processing is slowed when the modality being processed is different from the modality processed in the preceding trial. The presence of the switching cost suggests that people represent semantic information in a modality-specific, rather than amodal or abstract, manner.
Statistics
- Status: replicated
- Original paper:
‘Verifying different-modality properties for concepts produces switching costs’, Pecher et al. 2003; repeated measures design, Experiment 1: n = 64, Experiment 2: n = 88. [citations=485 (GS, June 2023)].
- Critiques:
Vermeulen et al. 2007 [n=81, citations=104(GS, June 2023)].
Lynott & Connell (2009) [n=24, citations=279(GS, June 2023)].
Ambrosi et al. 2011 [n=40, citations=8(GS, June 2023)].
- Original effect size: Experiment 1a - d = 0.18 (29ms), Experiment 1b - d= 0.15 (20ms), Experiment 2 - d= 0.27 (41ms).
- Replication effect size: Vermeulen et al.: d= 0.5, consistent with the original, with Switch trials being slower than Non-Switch trials. Lynott & Connell: d = 0.36. Ambrosi et al.: d= 0.2 (Adults), dm= 0.24 (Children).
Tactile Disadvantage (conceptual tactile disadvantage). Participants find it more difficult and are slower to process words strongly related to the tactile modality (e.g., sticky), compared to processing words from other modalities (visual, auditory etc.). The presence of this conceptual tactile disadvantage mirrors a similar disadvantage observed in perceptual processing, where tactile stimuli are slower and more difficult to process than visual or auditory stimuli.
Attentional blink. The attentional blink describes the phenomenon that, in a rapid serial visual presentation of items, humans show a reduced ability to detect the second of two targets among distractors if the second target appears approximately 200ms – 500ms after the first target. This effect is interpreted as displaying one of the limitations of human visual processing.
Default effect. In a choice scenario between two alternatives, when an alternative is presented as a default option, people stick with it rather than change it. For example, ‘Opt Out’ default organ donation policies increase organ donations.
Statistics
- Status: replicated
- Original paper: ‘
Do defaults save lives?’, Johnson and Goldstein 2003; 3 between-subjects experiment, N = 161. [citations=2649 (GS, June 2022)].
- Critiques:
DellaVigna & Linos 2022 [Meta analysis of 241 nudges based on 23.5 million participants, citations = 185 (GS, July 2022)].
Chandrashekar et al. 2022 [N = 1920, citations = 1(GS, April 2023)].
- Original effect size: OR = 5.15 to OR = 5.93.
- Replication effect size: DellaVigna & Linos: The published literature on nudge effects tends to report false positives and inflated effect sizes. Chandrashekar et al.: OR = 1.38 to OR = 1.67.
Decoy effect (alternatives: asymmetric dominance; attraction effect). The decoy effect is a cognitive bias in which an individual’s preference between two options is influenced by the presence of a third, asymmetrically dominated option (i.e., a decoy similar but inferior to one of the original options). Individuals are more likely to choose the option that is similar to the decoy when the decoy is present than when it is absent. The decoy effect has been replicated in different studies and contexts, though the magnitude of the effect can vary, particularly depending on the specific features of the options being considered and the context in which the decisions are being made.
Statistics
- Status: replicated
- Original paper: ‘
Adding asymmetrically dominated alternatives: violations of regularity and the similarity hypothesis’, Huber et al. 1982; both within and between subjects design, n = 153 (n = 93 for within). [citations = 2542 (GS, March 2023)].
- Critiques:
Heath and Chatterjee 1995 [meta-analysis, k = 92 studies, replication n = 1,261, citations= 329 (GS, January 2023)].
Hu and Yu 2014 [n=16, citations= 37 (GS, March 2023)].
Slaughter et al. 1999 [n=108, citations=111 (GS, March 2023)].
- Original effect size: NA.
- Replication effect sizes: Heath and Chatterjee: not reported for overall decoy effect. Hu and Yu: not reported. Slaughter et al.: not reported.
Nudges. Choice architecture interventions that promote beneficial decisions.
Statistics
- Status: mixed
- Original paper: ‘
Nudge: Improving Decisions about Health, Wealth, and Happiness’, Thaler & Sunstein, 2008, Book [citations=23376 (Google Scholar, October 2022)].
- Critiques:
Mertens et al. (2021) [citations=55 (Google Scholar, October 2022)] conducted a meta-analysis on nudges and found a medium effect size across all types of nudges. They conducted several publication bias tests, of which the most severe indicated a very small but significant effect size.
Maier et al. (2022) [citations=15 (Google Scholar, October 2022)] re-analysed the data reviewed by Mertens et al. (2021) and found no nudging effect after adjusting for publication bias.
- Original effect size: No effect sizes were provided in the original book.
- Replication effect size: Mertens et al. 2021: d= 0.37 to d= 0.46. Maier et al., 2022: d= 0.00 to d= 0.14.
Risky Choice Framing Effect (term used by Levin et al., 1998; alternative term: framing effect in risky decision-making). Under a loss frame, people are risk-seeking, whereas under a gain frame, people are risk-averse. In framing studies, logically equivalent choice situations are described differently and the resulting preferences are studied (Kühberger, 1998). In risky choice problems, the way a choice is presented influences the decision (e.g., saving 10 people out of 100 vs losing 90 people out of 100).
Statistics
- Status: replicated
- Original paper: ‘
The framing of decisions and the psychology of choice’, Tversky & Kahneman, 1981; experimental design, P1: 152; P2: 155; P3: 150; P4: 86; P5: 77; P6: 85; P7: 81; P8: 183; P9: 200; P10.1: 93; P10.2: 88; Total = 1350* (*unclear if those samples are different samples) [citations = 24617 (GS, October 2022)].
- Critiques: Meta-analysis:
Kühberger, 1998. [Total studies reviewed=136, citations=1554 (GS, June 2022)] The author finds that certain characteristics of framing studies are crucial to getting a consistent framing effect, but that the closer a methodology is to the original methodology, the better chance to replicate the original effect. Large scale replication in Klein et al., 2014 [Total replication studies = 36, citations=1082 (GS, June 2022)]
- Original effect size: Kahneman and Tversky (1982): d = 1.13, 95% CI [0.89, 1.37] (based on Klein et al., 2014 calculation). Meta-analytical effect size (many close and conceptual replications): Steiger & Kühberger (2018): d = 0.52 to 0.56.
- Replication effect size: Kühberger, 1998: d = .308; revised in Steiger & Kühberger, 2018 to d = .522 with only 81 of the 136 studies; Klein et al., 2014: d = .60 (95% CI 0.53-0.67); Steiger & Kühberger, 2018: d = .56.
Risk and Goal Message Framing. a) For illness detection behaviors, loss framing (presenting information of negative consequences with undesirable behaviors / without desirable behaviors) would be more effective than gain framing (presenting information of benefits through engaging in desirable behaviors) in encouraging healthy attitudes, intentions, and behaviors (perhaps because illness detection behaviors are riskier, Rothman & Salovey, 1997), whereas b) for health-affirming behaviors, gain framing would be more effective than loss framing in motivating healthy attitudes, intentions, behaviors (perhaps because health-affirming behaviors are less risky, Rothman & Salovey, 1997).
Statistics
- Status: Mixed, depending on operationalizations, DVs, and method (meta-analysis vs empirical study). The conceptual replication failed to provide support for the interaction, but this may be due to limited power.
- Original paper:
Rothman et al. (1999), between-subject design, sample size: 120 (Study 2) [citations=548(GS, October 2022)].
- Critiques:
van Riet et al. (2016) criticized the reasoning of applying Kahneman and Tversky’s (1981) Prospect Theory (which was more suitable and applicable for risky choice framing) to goal message framing. Van Riet et al. (2016) also reviewed direct empirical and meta-analytical evidence, and it appears that the evidence for the risk-framing hypothesis in message framing is not conclusive.
- Original effect size: Rothman et al. (1999): partial eta squared = 0.03, 90% CI [0.00, 0.10], to partial eta squared = 0.06, 90% CI [0.01, 0.14].
- Replication effect size: Cox et al. (2006): author: partial eta squared =0.03, 90% CI [0.00, 0.12], non-significant, but may be due to limited power.
Status quo effect (status quo bias). A cognitive bias that leads people to prefer the current state of affairs, even when change may be beneficial.
Statistics
- Status: replicated
- Original paper: ‘
Status quo bias in decision making’, Samuelson and Zeckhauser 1988; series of decision-making experiments, n = 486. [citations=2707(SPRINGER LINK, January 2023)].
- Critiques: Review:
Bostrom and Ord 2006 [n=NA, citations = 354 (GS, January 2023)].
Godefroid et al. 2022 [n = NA, citations=4(GS, February 2023)].
Johnson and Goldstein 2003 [n = 161, citations = 2824 (GS, February 2023)].
Xiao et al. 2021 [Experiment 1: n = 311, Experiment 2: n = 316, citations = 4 (GS, January 2023)].
- Original effect size: Cohen’s h from .16 to .79 (recalculated in Xiao 2021).
- Replication effect size: Bostrom and Ord: no ES (replicated). Godefroid et al.: NA. Johnson and Goldstein: no ES (but replicated as default effect). Xiao et al.: Cohen’s h from .45 to .62.
Temporal action-inaction effect. The proposed phenomenon that people associate or experience stronger regret with action compared to inaction in the short-term, but stronger regret with inaction compared to action in the long-term.
Statistics
- Status: mixed
- Original paper: ‘
The temporal pattern to the experience of regret’, Gilovich and Medvec 1994; hypothetical scenario experiments and real-life experience studies, Study 1: n =60, Study 2: n=77, Study 3: n= 80, Study 4: n=34, Study 5: n=32. [citations=564(GS, June 2022)].
- Critiques:
Bonnefon and Zhang 2008 [n=957, citations = 23 (GS, April 2023)].
Feldman et al. 1999 [n1=157, n2=622, citations = 97 (GS, April 2023)].
Towers et al. 2016 [n=500, citations = 31 (GS, April 2023)].
Yeung and Feldman 2022 [n=988, citations = 0 (GS, April 2023)].
Zeelenberg et al. 1998 [n1=165, n2=75, n3=100, n4=150, citations = 455(GS, April 2023)].
- Original effect sizes: Study 1: V = 0.50, Study 3: V = 0.28 to V = 0.53, Study 4: V = 0.24 to V = 0.53, Study 5: V = 0.06 to V = 0.56 (reported in Yeung and Feldman 2022).
- Replication effect sizes: Bonnefon and Zhang: The intensity of recent regrets is predicted by the consequences of the behaviour, and especially so for actions. The intensity of distant regrets is predicted by the consequences of the behaviour and by its justification, the effect of justification being stronger for actions than for inactions; failed to find support for temporal pattern. Feldman et al.: Participants reported more inaction than action regrets, and, contrary to prior research findings, regrets produced by actions and inactions were equally intense; failed to find support for temporal pattern. Towers et al.: Although regrets of inaction were more frequent than regrets of action, regrets relating to actions were slightly more intense; failed to find support for temporal pattern. Yeung and Feldman: Study 1: V = 0.25, Study 3: V = 0.15 to V = 0.23, Study 4: V = 0.10 to V = 0.24, Study 5: V = 0.04 to V = 0.05. Zeelenberg et al.: found support for temporal pattern of regret with real-life experience studies; when prior outcomes were positive or absent, people attributed more regret to action than to inaction; however, following negative prior outcomes, more regret was attributed to inaction, a finding that the authors label the inaction effect.
Money market versus goods/social market. The money market relationship refers to an exchange in which effort level is determined based on the level of compensation. By contrast, the social market relationship is an exchange in which effort level is most influenced by altruistic motivations rather than the compensation level. Heyman and Ariely (2004) proposed and showed that when the money market is primed with monetary compensation, people display more effort the more compensation they receive. By contrast, when the social market is primed with non-monetary compensation (i.e., goods), effort level does not depend on compensation level.
Statistics
- Status: mixed
- Original paper:
‘Effort for Payment: A Tale of Two Markets’, Heyman and Ariely 2004; between-subjects design, n = 614 (Experiment 1). [citations=1261(GS, June 2022)].
- Critiques:
Imada et al. 2022 [n=2203 (study 1) and 999 (study 2), citations=1(GS, June 2022)].
- Original effect size: Money market: d = -0.59 [-0.89, -0.29]; Social market: d = -0.25 [-0.55, 0.05].
- Replication effect size: Imada et al.: replicated the original finding that people expect others to be more willing to help when they are offered a medium amount of cash compared to a small amount of cash; contrary to Heyman and Ariely 2004, they found that people expect others to be more willing to help when they are offered a medium amount of goods compared to a small amount of goods; the effect size (medium vs. small) for money was much larger than that for goods, and their replications overall supported the original claim that the sensitivity to compensation level differs depending on money vs. social market relationships; Study 1: Money market: d = -1.25 [-1.43, -1.06]; Social market: d = -0.43 [-0.60, -0.26]; Study 2: Money market: d = -1.30 [-1.39, -1.22]; Social market: d = -0.87 [-0.94, -0.79].
Risk and benefit perceptions (affect heuristic). Increasing the risks of a hazard leads people to judge its benefits as lower and, conversely, increasing its benefits leads people to judge its risks as lower.
Statistics
- Status: replicated
- Original paper: ‘
The affect heuristic in judgments of risks and benefits’, Finucane et al. 2000; A mixed 4 (between - affective information: high risk/low risk/high benefit/low benefit) x 3 (within - technologies: nuclear power/natural gas/food preservatives), n=213 participants. [citations = 3694 (GS, June 2022)].
- Critiques:
Efendić et al. 2021 [n=1552, citations = 2(GS, November 2022)].
- Original effect size: r = -0.74 [-0.92,-0.30].
- Replication effect size: Efendić et al.: Study 1: r = -0.87 [-0.96, -0.59]; Study 2: r = -0.84 [-0.95, -0.50].
Temporal value asymmetry (TVA). The phenomenon that contemplating future events elicits stronger emotions than contemplating past events has been coined “temporal value asymmetry” (TVA). TVA was robust in between-persons comparisons and absent in within-persons comparisons.
Statistics
- Status: not replicated
- Original paper: ‘
A Wrinkle in Time: Asymmetric Valuation of Past and Future Events’, Caruso et al. 2008; 2x2 between-subject design, Study 1 n=121, Study 4 n=182. [citations=252 (GS, July 2022)].
- Critiques:
Caruso 2010 [Study 1: n=116, citations=112 (GS, July 2022)].
Kvam et al. 2022 [n=70, citations = 2(GS, April 2023)].
- Original effect sizes: Between-subjects analysis - Study 1: Monetary Valuation: d = 0.41 [0.04, 0.76]; Difficulty: d = 0.08 [-0.27, 0.44]; Qualification: d = 0.19 [-0.17, 0.55]. Study 4 (DV = Monetary Value): Relevance: ηp2 = 0.03 [0.00, 0.10]; Temporal Location: ηp2 = 0.05 [0.01, 0.13]; Relevance × Temporal Location: ηp2 = 0.02 [0.00, 0.08]; Study 4 (DV = Stress): Relevance: N/A, Temporal Location: N/A; Relevance × Temporal Location: ηp2 = 0.02 [0.01, 0.12]. Study 1: Fairness: d = 0.43 [0.06, 0.80]; Negative Emotions: d = 0.37 [0.003, 0.74]; Brand’s Intentions: d = 0.33 [-0.03, 0.70].
- Replication effect sizes: Caruso et al. (within-subject analysis): Study 1: Monetary Valuation: d = 0.03 [-0.24, 0.30]; Difficulty: d = 0.01 [-0.27, 0.26]; Qualification: d = 0.18 [-0.09, 0.45]; Study 4 (DV = Monetary Value): Relevance: ηp2 = 0.00 [0.00, 0.02]; Temporal Location: ηp2 = 0.00 [0.00, 0.02]; Relevance × Temporal Location: ηp2 = 0.00 [0.00, 0.01]; Study 4 (DV = Stress): Relevance: ηp2 = 0.01 [0.00, 0.04]; Temporal Location: ηp2 = <0.001 [0.00, 0.01]; Relevance × Temporal Location: ηp2 = 0.001 [0.00, 0.02]. Caruso (within-subject analysis): Study 1: Fairness: d = 0.13 [-0.06, 0.32]; Negative Emotions: d = 0.01 [-0.19, 0.20]; Brand’s Intentions: d = -0.09 [-0.28, 0.10]. Kvam et al.: Our work provides a direct counterpoint to both the empirical phenomenon and this theoretical explanation for the temporal value asymmetry. First, we show systematic reversals of the temporal value asymmetry where participants sometimes preferred past outcomes. There were multiple instances where participants indicated that they preferred payoffs that could have occurred in the past to ones that could occur in the future in perfectly matched pairs – where the same dollar amount could be received in either the past or future at the same temporal distance (X days ago vs X days from now). Second, these reversals occurred as both the magnitude of and distance to the past / future payoffs were manipulated. Participants favoured past events when payoffs were small and temporal distance was large ($11, 2 years ago / from now), and favoured future events when payoffs were large or temporal distance was small ($10k, 7 days ago / from now). This may explain apparent replication failures related to the temporal value asymmetry (El Halabi et al., 2021) – not because the phenomenon is not real, but because different stimuli can cause it to reverse, and thus on average, fail to appear. Third and finally, a model comparison showed that framing the temporal value asymmetry in terms of hyperbolic (or even the more general hyperboloid) discounting is insufficient to account for the patterns of behaviour we observed.
Exceptionality effect (emotional amplification, normality bias, exceptional-routine effect). The affective response to an event is enhanced if its causes are abnormal. The exceptionality effect is the phenomenon that people associate stronger negative affect with a negative outcome when it is a result of an exception (abnormal behaviour) compared to when it is a result of routine (normal behaviour). Exceptionality enhances emotional responses to an event (regret, self-blame) as well as cognitive responses (victim compensation, offender punishment).
Statistics
- Status: replicated
- Original paper: ‘
Norm Theory: Comparing Reality to Its Alternatives’, Kahneman and Miller, 1986; within subject design (exceptional vs. normal), n=92 participants. [citations=4427(GS, July 2022)].
- Critiques:
Fillon et al. 2020 [meta-analysis, k= 48, citations = 10 (GS, July 2022)].
Kutscher and Feldman 2019 [exact replication (within-subject), n1= 342, n2 = 342, citations = 19 (GS, April 2023)].
- Original effect size: Hedges’ g = 1.09 to g = 2.78.
- Replication effect size: Kutscher and Feldman: d= 1.58 to d= 3.12. Fillon et al.: g= 0.41 to g= 0.79; the effect of exceptionality on counterfactuals was not significant and close to zero (k = 5, g = 0.39 [0.08, 0.70]). They also found that the effect for between-participants design is half the size of studies with a within-subject design.
Temporal differences in trait self-ascriptions. Much like how we are more likely to ascribe dispositional traits, as opposed to situational variables, when explaining the behaviour of other people compared to ourselves, the same asymmetry can also be observed when making trait assessments about our temporally distant selves (e.g. past or future). People are more likely to ascribe dispositional traits, compared to situational explanations, when making judgements about their past or future self.
Statistics
- Status: mixed (social distance replicated, but weaker | self-enhancement replicated | self-enhancement over temporal distance reversed | core hypothesis about temporal distance not replicated)
- Original paper: ‘
Temporal differences in trait self-ascription : When the self is seen as an other’, Pronin and Ross, 2006; study 1: 2 (between - temporal distance: past vs. present) x 2 (between - social distance: self vs friend); study 2: between, temporal distance - present vs future self; study 3: 3 (between - temporal distance: past vs present vs future self) x 2 (within - valence of trait attributions : positive vs negative), study 1: n=167 (students = 123, staff = 44); study 2: n=40; study 3: n=75. [citations=345(GS, June 2022)].
- Critiques:
Adelina and Feldman 2021 [n=911, citations = 1(GS, November 2022)].
- Original effect size: People attribute more dispositional traits to others compared to themselves (social distance), f = 0.35 [0.03, 0.16]; People attribute more positive, compared to negative, traits when making self-assessments (self-enhancement), f = 0.77 [0.29, 1.25], and this ratio does not increase with temporal distance, f = 0.16 [0.00, 0.36]; People ascribe more dispositional traits when making assessments about their temporally distant self compared to their present self (temporal distance), f = 0.54 [0.27, 0.77].
- Replication effect size: Adelina and Feldman: Social distance: f = 0.10 [0.03, 0.16]; Self-enhancement: f = 0.88 [0.50, 1.26], ratio increases with temporal distance, f = 0.33 [0.22, 0.42]; Temporal distance: f = 0.02 [0.00, 0.06].
Bias Blind Spot. The phenomenon that people perceive stronger biases in others than in themselves. Pronin et al. (2002) found support for self-other asymmetries in perceived biases but failed to find support for self-other asymmetries in perceived personal shortcomings. Chandrashekar et al. (2021) found support for self-other asymmetries for both biases and personal shortcomings.
Statistics
- Status: replicated
- Original paper: ‘
Perceptions of bias in self versus others’, Pronin et al. 2002; within-subject design, Study 1: n = 24 , Study 2: n = 30 . [citations=1406 (GS, October 2022)].
- Critiques:
Chandrashekar et al. 2021 [N = 969, citations=10 (GS, April 2022)].
- Original effect size: d = -0.86 for biases, d = 0.28 for personal shortcomings.
- Replication effect size: Chandrashekar et al.: d = -1.00 for biases, d = -0.34 for personal shortcomings.
Hindsight Bias. Hindsight bias refers to the tendency to perceive an event outcome as more probable after being informed of that outcome.
Statistics
- Status: replicated
- Original paper: ‘
On the psychology of experimental surprise’, Slovic and Fischhoff 1977; between-subjects design, n=184. [citations = 591 (GS, October 2022)].
- Critiques:
Chen et al. 2021 [n = 608, citations = 1 (GS, October 2022)].
- Original effect size: d=0.36 to d=0.61.
- Replication effect size: Chen et al.: d=0.05 to d=0.32.
Disjunction Effect. The sure-thing principle (STP) posits that if decision-makers are willing to make the same decision regardless of whether an external event happens or not, then decision-makers should also be willing to make the same decision when the outcome of the event is uncertain. People regularly violate the STP: uncertainty about an outcome influences decisions.
Statistics
- Status: mixed
- Original paper: ‘
The Disjunction Effect in Choice under Uncertainty’, Tversky and Shafir 1992; within and between subject design, n1=199, n2=98, n3=213, n4=171. [citations=860(GS, March 2023)].
- Critiques:
Kühberger et al. 2001 [n1=177, n2=184, n3=35, n4=97, citations=58(GS, March 2023)].
Lambdin and Burdsal 2007 [N=55, citations=51(GS, March 2023)].
Ziano et al. 2021 [N=890, citations=3(GS, March 2023)].
- Original effect size: “Paying-to-know” paradigm – participants were willing to pay a small fee to postpone a decision about a vacation package promotion when the outcome of an exam was uncertain, despite preferences to purchase the package regardless of exam outcome, Cramer’s V = 0.22 [0.14, 0.32] (reported in Ziano et al. 2021); “Choice under risk” problem – facing uncertainty about the outcome of an initial bet led to less willingness to again accept the exact same bet, compared to when having learned the outcome of the first bet, Cramer’s V = 0.26 [0.14, 0.39] (reported in Ziano et al. 2021).
- Replication effect size: “Choice under risk” problem: Kühberger et al.: ES not reported but failed to replicate the “choice under risk” problem in four experiments. Lambdin & Burdsal: ES not reported but failed to replicate. Ziano et al.: Cramer’s V = 0.11 [-0.07, 0.20] (not replicated). “Paying-to-know” problem: Ziano et al.: Cramer’s V = 0.30 [0.24, 0.37] (replicated).
Money Illusion. The inability of individuals to account for inflation or deflation when making decisions. Under inflation, money loses value over time, yet people often fail to consider the impact of inflation on the real value of money.
Statistics
- Status: replicated
- Original paper:
‘Money Illusion’, Shafir et al. 1997; experiment, Problem 1: n = 358; Problem 2: n = 431 ; Problem 3: n =362; Problem 4: n = 139. [citations=1190(GS, November 2022)].
- Critiques:
Ziano et al. 2021 [n=604, citations=8(GS, November 2022)].
- Original effect size: Problem 1: Cramer’s V = 0.26 [0.17, 0.37]; Problem 2: XX = 48% [42%, 52%]; Problem 3: buy: 38% [33%, 43%] and sell: 43% [38%, 48%]; and Problem 4: V = 0.25 [0.13, 0.42] (obtained from Ziano et al. 2021).
- Replication effect size: Ziano et al.: Problem 1: V = 0.28 [0.21, 0.36]; Problem 2: 70% [66%, 73%]; Problem 3: buy: 47% [43%, 51%] and sell: 43% [39%, 47%]; and Problem 4: V = 0.17 [0.10, 0.25].
Choosing versus rejecting (Framing effects, compatibility principle). People are inconsistent in their preferences when faced with choosing versus rejecting decision-making scenarios.
Statistics
- Status: not replicated
- Original paper: ‘
Choosing versus rejecting: Why some options are both better and worse than others’, Shafir 1993; 8 experiments with between-subjects design (i.e., two between-subjects conditions), across 8 studies sample size ranged from 170 to 398. [citations=779 (GS, July 2022)].
- Critiques:
Chandrashekar et al. 2021 [n = 1026, citations = 1 (GS, July 2022)]. Proposed and tested alternative theoretical predictions to that of Shafir 1993.
Ganzach 1995 [n = 41 & 96, citations = 94 (GS, July 2022)].
Wedell 1997 [n1=225, n2= 125, citations = 79(GS, July 2022)].
- Original effect size: d= 0.22 to 0.51.
- Replication effect size: Chandrashekar et al.: d = 0.01. Ganzach: Experiment 1 - ηp² = 0.065 (calculated from the reported F(1,39) = 6.6, p < .01 using this conversion), Experiment 2 - ηp² = 0.105 (calculated from the reported F(1,94) = 11, p < .001 using this conversion).
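The conversion from a reported F statistic to partial eta squared can be sketched as follows. This is a minimal illustration that assumes the conversion referred to above is the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error); the function name `partial_eta_squared` is hypothetical and not taken from the entry. It is applied here only to the Experiment 2 statistic, F(1, 94) = 11.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic (standard identity, assumed here)."""
    explained = f_value * df_effect
    return explained / (explained + df_error)


# Ganzach 1995, Experiment 2: F(1, 94) = 11, as reported in the entry above.
eta_exp2 = partial_eta_squared(11.0, 1, 94)
print(round(eta_exp2, 3))  # 0.105
```

The same function can be used for any F test, provided both degrees of freedom are reported.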
Conjunction bias (conjunction fallacy). The fallacy consists of judging the conjunction of two events as more likely than any of the two specific events, violating one of the most fundamental tenets of probability theory.
Direct versus indirect harm. Individuals believe that causing indirect harm is more moral than direct harm, regardless of outcomes, intentions, or self-presentational concern.
Statistics
- Status: mixed
- Original paper: ‘
The Preference for Indirect Harm’, Royzman and Baron 2002; within-subjects, n = 54. [citations = 223 (GS, January 2023)]. Three experiments were conducted but Experiment 2 is the key experiment.
- Critiques:
Ames and Fiske (2013) [between-subjects; Experiment 1: n = 80 (Intentional harm = 39, Unintentional harm = 41), Experiment 2: n = 93 (Intentional harm = 46, Unintentional harm = 47), Experiment 3: n = 79 (Intentional harm = 41, Unintentional harm = 38), citations = 182 (GS, January 2023)].
Ziano et al. 2021 [Experiment 1: n = 46, Experiment 2: n = 314, Meta-Analysis: n = 414, k = 3 experiments (original and the two conducted by the experimenters), citations = 5 (GS, January 2023)] replicated Experiment 2 of Royzman and Baron 2002.
- Original effect size: d = 0.70 [0.40, 1.00] (estimated from test-statistic: t(53) = 5.12, p < .001).
- Replication effect size: Ames and Fiske: Experiment 1: d = 0.74 [0.29, 1.19] (estimated using descriptive statistics) (replicated); Experiment 2: d = 0.22 [-0.18, 0.63] (estimated using descriptive statistics) (not replicated); Experiment 3: d = 0.48 [0.03, 0.93] (estimated using descriptive statistics) (replicated). Ziano et al.: Experiment 1: Scenario 1: d = 0.55 [0.24, 0.86] (replicated); Scenario 2: d = 0.41 [0.11, 0.71] (replicated); Experiment 2: Scenario 1: d = 0.24 [0.13, 0.35] (replicated, but smaller); Scenario 2: d = 0.36 [0.24, 0.47] (replicated); Meta-Analysis: Scenario 1: d = 0.47 [0.18, 0.75] (replicated); Scenario 2: d = 0.46 [0.26, 0.65] (replicated).
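For readers who want to see how an effect size can be estimated from a test statistic in a within-subjects design, the sketch below assumes the common conversion d = t / √n for a paired t-test with n participants. Whether the entry's authors used exactly this formula is an assumption, and the function name `d_from_paired_t` is hypothetical; the inputs are the t(53) = 5.12 and n = 54 given for the original study. The confidence interval is not reproduced.

```python
import math


def d_from_paired_t(t_value: float, n: int) -> float:
    """Standardized mean difference for a paired design: d = t / sqrt(n)."""
    return t_value / math.sqrt(n)


# Original study: t(53) = 5.12 with n = 54 participants (df = n - 1).
d_original = d_from_paired_t(5.12, 54)
print(round(d_original, 2))  # 0.7
```

The result agrees with the d = 0.70 listed for the original effect after rounding to two decimals.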
Distinction bias. When evaluating how happy options would make them, people who evaluated the options simultaneously predicted greater happiness for the good options and lower happiness for the bad options, whereas people who evaluated the options separately (i.e., only evaluated one option) showed little difference between the options.
Statistics
- Status: mixed
- Original paper: ‘
Distinction bias: Misprediction and mischoice due to joint evaluation’, Hsee and Zhang 2004; study 1: 5 conditions, pairwise comparisons between groups, total sample size n = 249; Study 2: 9 conditions, pairwise comparisons between groups, total sample size n = 360. [citations = 380 (GS, January 2023)].
- Critiques:
Anvari et al. 2021 [n = 824, citations = 6 (GS, April 2023)].
- Original effect size: Study 1: d = 1.17 and 3.26; Study 2: d = 0.60, 0.75, 0.91, and 1.20.
- Replication effect size: Anvari et al.: Study 1: d = 2.60 and 4.13; Study 2: d= 0.45, 0.02, 1.55, and 0.02.
Inaction inertia effect. The tendency to forgo an offer that is still desirable but less appealing than a previously missed offer. For example, if you missed the opportunity to attend a skiing trip that was offered at £40 rather than the usual £100, you would reject the offer of going on the same ski trip when offered for £80 rather than the usual £100.
Statistics
- Status: mixed (but mostly replicable).
- Original paper: ‘
Inaction inertia: Foregoing future benefits as a result of an initial failure to act’, Tykocinski et al. 1995; between-subject design, Experiment 1: n = 108, Experiment 2: n = 120, Experiment 3: n = 135, Experiment 4: n = 76, Experiment 5: n = 61, Experiment 6: n = 165. [citations = 238 (GS, February 2023)].
- Critiques:
Chen et al. 2021 [Experiment 1: n = 43 (between-subjects), Experiment 2: n = 309 (between-subjects), Experiment 3: n = 1,203 (mixed-design), Between-subjects: n = 603, Within-subjects: n = 600, Meta-Analysis: n = 1,555, k = 4 (mini analysis of own studies), citations = 10 (GS, February 2023)].
Zeelenberg et al. 2006 [Experiment 1: n = 80 (between-subjects), Experiment 3: n = 80 (between-subjects), Experiment 4: n = 120 (between-subjects), Experiment 5: n = 159 (between-subjects), citations = 96 (GS, February 2023)].
- Original effect size: Experiment 1 (Ski resort): ωp2 = .08 [.01, .19] (estimated from test-statistic: F(2, 105) = 5.92, p = .004); Experiment 2: Car: ωp2 = .05 [.00, .14] (estimated from test-statistic: F(2, 117) = 4.12, p = .019), Frequent flyer: ωp2 = .07 [.00, .16] (estimated from test-statistic: F(2, 117) = 5.46, p = .005), Fitness centre: ωp2 = .05 [.00, .13] (estimated from test-statistic: F(2, 117) = 3.92, p = .022); Experiment 3 (Ski resort): d = 0.59 [0.24, 0.94] (estimated from test-statistic: F(1, 131) = 11.28, p < .001); Experiment 4 (Fitness centre): d = 0.44 [-0.02, 0.90] (reported as significant in the paper, but estimated from test-statistic: F(1, 75) = 3.68, p = .059); Experiment 5 (Betting): Betting amount: d = 0.73 [0.19, 1.27] (estimated from test-statistic: F(1, 56) = 7.54, p = .008), Likelihood of placing bet: d = 0.61 [0.07, 1.14] (estimated from test-statistic: F(1, 56) = 5.20, p = .026); Experiment 6 (Frequent flyer): d = 0.39 [0.08, 0.71] (estimated from test-statistic: F(1, 159) = 6.18, p = .014)
- Replication effect size: Chen et al.: Experiment 1: Ski resort: η2 = .02 [.00, .12] (not replicated), Car: η2 = .38 [.13, .54] (replicated), Frequent flyer: η2 = .10 [.00, .26] (not replicated), Fitness centre: η2 = .20 [.01, .38] (replicated); Experiment 2: Ski resort: η2 = .10 [.04, .16] (replicated), Car: η2 = .10 [.04, .16] (replicated), Frequent flyer: η2 = .02 [.00, .05] (not replicated), Fitness centre: η2 = .14 [.07, .21]; Experiment 3: Between-subjects: Ski resort: η2 = .06 [.02, .09] (replicated), Car: η2 = .05 [.02, .08] (replicated), Frequent flyer: η2 = .01 [.00, .03] (not replicated), Fitness centre: η2 = .14 [.09, .19] (replicated); Within-subjects (supplementary materials): Ski resort: η2 = .10 [.06, .14] (replicated), Car: η2 = .06 [.03, .10] (replicated), Frequent flyer: η2 = .00 [.00, .01] (not replicated), Fitness centre: η2 = .16 [.11, .21] (replicated); Meta-Analysis: Large-difference versus small-difference – d = 0.49 [0.32, 0.67] (inaction inertia as described by Tykocinski et al., 1995). Zeelenberg et al.: Experiment 1: d = 1.18 [0.69, 1.66] (estimated from test-statistic: F(1, 76) = 26.50, p < .001) (replicated); Experiment 3: d = 1.57 [1.05, 2.08] (estimated from test-statistic: F(1, 76) = 46.59, p < .001) (replicated); Experiment 4: d = 0.63 [0.25, 1.00] (estimated from test-statistic: F(1, 116) = 11.44, p = .001) (replicated); Experiment 5: d = 1.12 [0.78, 1.46] (estimated from test-statistic: F(1, 151) = 47.48, p < .001) (replicated).
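Several effect sizes in this entry are described as estimated from reported F statistics. As a minimal sketch of how such point estimates can be reproduced, the Python below assumes two common conversions that the source does not name explicitly: d = 2·sqrt(F / df_error) for a two-group F(1, df_error) with roughly equal group sizes, and ωp2 = df_effect·(F − 1) / (df_effect·(F − 1) + N) for a between-subjects design. The function names are illustrative, and confidence intervals are not computed here.

```python
import math


def d_from_f(f_value: float, df_error: int) -> float:
    """Cohen's d from a two-group F(1, df_error).

    Assumption: two groups of (roughly) equal size, so d = 2 * sqrt(F / df_error).
    """
    return 2.0 * math.sqrt(f_value / df_error)


def partial_omega_squared(f_value: float, df_effect: int, n_total: int) -> float:
    """Partial omega squared from a between-subjects F test.

    Assumption: omega_p^2 = df_effect * (F - 1) / (df_effect * (F - 1) + N).
    """
    numerator = df_effect * (f_value - 1.0)
    return numerator / (numerator + n_total)


# Tykocinski et al. (1995), Experiment 3: F(1, 131) = 11.28
print(round(d_from_f(11.28, 131), 2))                 # 0.59

# Tykocinski et al. (1995), Experiment 1: F(2, 105) = 5.92, n = 108
print(round(partial_omega_squared(5.92, 2, 108), 2))  # 0.08

# Zeelenberg et al. (2006), Experiment 1: F(1, 76) = 26.50
print(round(d_from_f(26.50, 76), 2))                  # 1.18
```

Under these assumptions the point estimates match the values listed above; a mismatch for some other study would suggest that a different conversion or unequal group sizes were involved.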
Mere ownership effect. The mere ownership effect refers to an individual’s tendency to evaluate an object more favourably merely because he or she owns it.
Omission Bias (alternative terms: action-inaction effect). The tendency to view harmful actions as worse than inactions, despite the result being the same.
Statistics
- Status: mixed (but mostly replicated).
- Original paper:
‘Omission and commission in judgment and choice’, Spranca et al. 1991; within-subjects design, Experiment 1: n = 38, Experiment 4: n = 48 [citations = 1,178 (GS, February 2023)].
- Critiques:
Jamison et al. 2020 [n = 313; citations = 19 (GS, February 2023)].
Yeung et al. 2022 [meta-analysis, n = 1,999 participants, k = 21 studies, citations = 8 (GS, February 2023)].
- Original effect size: (effect sizes generated using a formula in
Rosenthal & DiMatteo (2001, p. 71) using the conversion from r to Cohen’s d) Experiment 1: Scenario 1: d = 0.63 (estimated from frequencies reported in report: 37 vs. 20, χ2(1, N = 57) = 5.07, p = .024), Scenario 2: d = 0.80 (estimated from frequencies reported in report: 39 vs. 18, χ2(1, N = 57) = 7.74, p = .005) (*For both scenarios the effect size was estimated from the reported frequencies, generating a chi-square value, converting this to Pearson’s r and then to Cohen’s d for comparison with other experiments); Experiment 4: d = 0.94 [0.33, 1.55] (estimated from reported ANOVA: F(1,46) = 10.2, p = .003).
- Replication effect size: Jamison et al.: Scenario 1: d = 0.45 (replicated), Scenario 2: d = 0.47 (replicated). Yeung et al.: The overall effect size was g = 0.45 [0.14, 0.77]; Measure Used: Morality (k = 14): g = 0.45 [0.14, 0.77] (replicated); Blame (k = 7): g = 0.32 [0.01, 0.64] (replicated); Decision (k = 4): g = 0.30 [-0.62, 1.21] (not replicated).
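The original effect sizes for Experiment 1 are described as going from frequencies to a chi-square value, then to Pearson’s r, then to Cohen’s d. The following Python is a hedged sketch of that chain. It assumes a goodness-of-fit chi-square against an even split, r = sqrt(χ2 / N), and d = 2r / sqrt(1 − r²); the exact formula on p. 71 of Rosenthal & DiMatteo (2001) is not reproduced in the source, so these steps and the function names are assumptions.

```python
import math


def chi_square_even_split(count_a: int, count_b: int) -> float:
    """Goodness-of-fit chi-square for two counts against a 50/50 split."""
    expected = (count_a + count_b) / 2.0
    return ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected


def r_from_chi_square(chi_square: float, n_total: int) -> float:
    """Effect size r from a 1-df chi-square (assumed: r = sqrt(chi2 / N))."""
    return math.sqrt(chi_square / n_total)


def d_from_r(r: float) -> float:
    """Cohen's d from r (assumed: d = 2r / sqrt(1 - r^2))."""
    return 2.0 * r / math.sqrt(1.0 - r ** 2)


# Frequencies as listed for Spranca et al. (1991), Experiment 1
for label, a, b in [("Scenario 1", 37, 20), ("Scenario 2", 39, 18)]:
    chi2 = chi_square_even_split(a, b)
    d = d_from_r(r_from_chi_square(chi2, a + b))
    print(label, round(chi2, 2), round(d, 3))
```

With these assumptions the chi-square values agree with those listed (5.07 and 7.74), and the resulting d values land within about 0.01 of the listed 0.63 and 0.80; the small gap may come from rounding at intermediate steps.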
Identifiable victim effect. Refers to the phenomenon that people are more likely to offer greater help to specific, identifiable victims than to anonymous victims.
Statistics
- Status: not replicated
- Original paper:
‘Sympathy and callousness: The impact of deliberative thought on donations to identifiable and statistical victims’, Small et al. 2007; field experiments, N = 280 split into Study 1: n=121 and Study 3: n=159. [citations=1126 (GS, June 2022)].
- Critiques:
Lee and Feeley 2016 [meta-analysis, k=41, citations=131 (GS, April 2023)].
- Original effect size: ηp2 = 0.06 (Study 1) to ηp2 = 0.07 (Study 3).
- Replication effect size: Lee and Feeley: ηp2 = 0.00 (Study 1), ηp2 = 0.01 (Study 3); much weaker and less robust than previously thought, with lots of mixed findings, failed replications, null findings and numerous boundary conditions; overall significant yet modest effect, r = .05; after adjusting for publication bias with robust Bayesian meta-analysis there is evidence against an effect in this meta-analysis.
Psychophysical numbing. People prefer to save lives if they are a higher proportion of the total (e.g. do people prefer to save 4,500 lives out of 11,000 or 4,500 lives out of 250,000?).
Statistics
- Status: mixed
- Original paper: ‘
Insensitivity to the value of human life: A study of psychophysical numbing’, Fetherstonhaugh et al. 1997; 3 studies with within-subjects design, 2 of which are split into Part A and Part B, with Study 1: n = 54 and n = 196; Study 2: n = 162; Study 3: n = 165. [citations = 468 (GS, December 2021)].
- Critique:
Ziano et al. 2021 [n=4799, citations = 0 (GS, December 2021)].
- Original effect size: Study 1: ηp2 = 0.14.
- Replication effect size: Study 1a: ηp2 = 0.06, Study 1b: ηp2 = 0.21; Study 1c: ηp2 = 0.13.
Signing at the top and dishonesty. Signing a veracity statement at the top, instead of at the end, of a form/document encourages honest reporting.
Statistics
- Status: not replicated
- Original paper: ‘
Signing at the beginning makes ethics salient and decreases dishonest self-reports in comparison to signing at the end’, Shu et al. 2012 (Retracted 09/2021); 3 experiments with Study 1: n = 101, Study 2: n = 60, Study 3: n = 13,488. [citations=482 (GS, June 2022)].
- Critiques:
Kristal et al. 2020 [five conceptual replications, n = 4,559, and one highly powered, preregistered, direct replication, n = 1,235, citations=62 (GS, April 2023)].
Data Colada post about evidence of fraud in the data for study 3 of original paper.
- Original effect size: d = -1.05 (Study 1); d = -.53 (Study 2); d = -.20 (Study 3).
- Replication effect size: Kristal et al.: no effect of signing first on honest reporting; d = .11 (Study 1); d = -.01 (Study 2); d = .05 (Study 3); d = -.05 (Study 4); d = .01 (Study 5); d = -.04 (Study 6).
Loss aversion. The subjective value of losses exceeds the subjective values of gains. This phenomenon can denote a stronger preference for avoiding losses rather than acquiring gains. Loss aversion is still mostly replicable but with weaker effects for some people and in some situations (see
Mrkva et al., 2020).
Statistics
- Status: mixed
- Original paper: ‘
Prospect theory: An analysis of decision under risk’, Kahneman and Tversky 1979; multiple experiments with n between 66 and 95. [citations=72861 (GS, June 2022)].
- Critiques:
Brown et al. 2021 [n=607 estimates from 150 articles, citations=10 (GS, December 2021)]. Meta-analyses:
Nieuwenstein et al. 2020 [total n = 399, citations=109(GS, April 2023)].
Walasek et al. 2018 [k=19 studies, citations=11 (GS, December 2021)].
- Original effect size: λ = 2.25 (reported in Walasek et al. 2018).
- Replication effect size: Brown et al.: λ = 1.955 [1.824, 2.104]. Walasek et al.: λ = 1.31 [1.10, 1.53]. All reported in Nieuwenstein et al.: Abadie et al.: g = -0.62 to g = 0.22. Acker: g = -0.47. Aczel et al.: g = -0.35. Ashby et al.: g = -0.21 to g = 1.48. Bos et al.: g = -0.10. Bos et al.: g = 1.48. Calvillo & Penaloza: g = -0.28 to g = -0.09. Dijksterhuis: g = 0.24 to g = 0.46. Dijksterhuis et al.: g = 0.70 to g = 0.86. González Vallejo et al.: g = 0.00. Hasford: g = 0.43. Hess et al.: g = -0.14. Huizenga et al.: g = -0.26 to g = -0.50. Lassiter et al.: g = 0.27 to g = 0.51. Lerouge: g = 0.38 to g = 0.47. McMahon et al.: g = 0.62 to g = 0.67. Messner et al.: g = 0.63. Newell et al.: g = -0.50 to g = 0.17. Newell and Rakow: g = -0.37 to g = 0.31. Nieuwenstein and Van Rijn: g = -0.74 to g = 0.87. Nieuwenstein et al.: g = -0.01. Nordgren et al.: g = 0.27 to g = 0.36. Payne et al.: g = -0.10. Queen & Hess: g = -0.21. Rey et al.: g = 0.27. Smith et al.: g = 0.25 to g = 0.32. Strick et al.: g = 0.58 to g = 1.21. Thorsteinson & Withrow: g = 0.18 to g = 0.34. Usher et al.: g = 0.78 to g = 1.04. Waroquier et al.: g = -0.56 to g = 0.35.
Effort heuristic. People judge products that took longer time to complete as higher in quality and monetary value.
Statistics
- Status: mixed
- Original paper: ‘
The effort heuristic’, Kruger et al. 2004; Study 1: Between-subject design, n = 144, Study 2: Mixed design, n = 66. [Citations = 404 (GS, October 2022)].
- Critiques:
Ziano et al. 2022 [total N = 1405, citations=0 (GS, April 2023)].
- Original effect sizes: Study 1: d = 0.34 [0.00, 0.68] (liking/quality), d = 0.33 [-0.02, 0.67] (monetary value); Study 2: ηp2 = 0.09 [0.01, 0.21] (liking/quality), ηp2 = 0.15 [0.03, 0.28] (calculated by Ziano et al. 2022).
- Replication effect sizes: Ziano et al.: Study 1 MTurk: d = -0.05 [-0.21, 0.11] (liking/quality), d = 0.02 [-0.14, 0.18] (monetary value); Study 1 Prolific: d = 0.23 [0.08, 0.38] (liking/quality), d = 0.08 [-0.07, 0.22] (monetary value); Study 2: ηp2 = 0.02 [0.00, 0.04] (liking/quality), ηp2 = 0.04 [0.02, 0.07] (monetary value).
Unconscious thought advantage (“deliberation-without-attention”). The idea that for complex choices (with more features to take into account), not deliberating leads to better decisions (as defined by the research team, i.e., normatively).
Statistics
- Status: not replicated
- Original paper: ‘
On making the right choice’, Dijksterhuis, 2006; two experiments and two quasi-experiments, n = 80, 59, 93, 115. [citations = 605, WoK (October 2021)].
- Critiques:
Nieuwenstein and van Rijn 2012 [n = 48, 24, 32, 24, citations = 12 (WoK, October 2021)].
Nieuwenstein et al. 2015 [meta-analysis, k=61 studies, n = 40-399, replication study, n = 423, citations = 49 (WoK, October 2021)]. See also
González-Vallejo et al. 2008 for a theoretical critique [n=NA, citations = 51 (WoK, October 2021)].
- Original effect size: Experiment 1: g = .86; Experiment 2: g = .70 (as per Nieuwenstein et al. 2015).
- Replication effect size: Nieuwenstein & van Rijn: g = 0.10, g = -0.55, g = 0.87, g = -0.74. Nieuwenstein et al.: g = -0.01; after trim-and-fill, meta-analysis pooled Hedges’ g = 0.018 [−0.10, 0.14].
Self-interest is overestimated. People overestimate how much personal benefits affect other people’s policy preferences and behaviours.
Statistics
- Status: replicated
- Original paper: ‘
The Disparity Between the Actual and Assumed Power of Self-Interest’, Miller and Ratner 1998; ns around 50 for 2- and 4-cell experiments across multiple studies (very underpowered). [citations = 552 (GS, April 2023)].
- Critiques: Studies 1 and 4 were run and successfully replicated in
Brick et al. 2021 [two samples, UK and US, n = 800 each, citations = 4(GS, April 2023)].
- Original effect size: Effect sizes cannot be calculated as no variance was provided, but the effects looked large.
- Replication effect size: Brick et al.: Overestimation of the importance of payment for blood donation in Study 1, d = 0.59 [0.51, 0.66], 0.57 [0.49, 0.64]; and of smoking status for smoking policy preferences in Study 4, d = 0.75 [0.59, 0.90], 0.84 [0.73, 0.96].
Marshmallow experiment (self-imposed delay of gratification). A child’s success in delaying the gratification of eating marshmallows or a similar treat is related to better outcomes in later life. Outcomes that have been studied include coping, social, and academic competence, substance use, borderline personality features, BMI, executive functioning, and neural activation patterns.
Statistics
- Status: mixed
- Original paper: ‘
Cognitive and attentional mechanisms in delay of gratification’, Mischel et al. 1972; experimental design, n=92. [citation=1816 (GS, June 2022)].
- Critiques: Follow-up study:
Shoda et al. 1990 [longitudinal design, n=185, citation=1989 (GS, June 2022)].
Watts et al. 2018 [n=918, citations=319(GS, June 2022)].
Doebel et al.’s 2020 commentary on Watts et al. (2018) [n=NA, citations=22(GS, June 2022)].
- Original effect size: academic achievement r=.42 to r=.57.
- Replication effect size: Shoda et al.: follow-up study: r=.02. Watts et al. 2018: academic achievement r=.28.
Differential reinforcement of alternate behaviour (DRA). DRA procedures reduce a certain behaviour by reinforcing an appropriate alternative behaviour that serves the same function.
Statistics
- Status: replicated
- Original paper: ‘
The alteration of behavior in a special classroom situation’, Zimmerman and Zimmerman 1962; case studies, N = 2. [citations= 372 (GS, March 2023)].
- Critiques:
Allen and Harris 1965 [N = 2, citations= 307 (GS, March 2023)]. Fisher et al. 1993 [N = 4, citations= 375 (GS, March 2023)]. Hagopian et al. 1998 [N = 21, citations= 475 (GS, March 2023)].
Hall et al. 1968 [N = 6, citations= 993 (GS, March 2023)].
- Original effect size: NA.
- Replication effect size: Allen and Harris = NA. Hall et al.= NA. Fisher et al.= NA. Hagopian et al.= NA.
Differential reinforcement of incompatible behaviour (DRI). DRI reinforces a physically incompatible behaviour to replace the unwanted behaviour.
Differential reinforcement of low rates of behaviour (DRL). DRL is a technique in which a positive reinforcer is delivered at the end of a specific interval if a target behaviour has occurred at a criterion rate.
Statistics
- Status: replicated
- Original paper: ‘
Decreasing classroom misbehavior through the use of DRL schedules of reinforcement’, Deitz and Repp 1973; case studies and observational design, N(experiment 1) = 1, N(experiment 2) = 10, N(experiment 3) = 15. [citations= 205 (GS, March 2023)].
- Critiques:
Deitz and Repp 1974 [N(experiment 1) = 1, N(experiment 2) = 1, N(experiment 3) = 1, citations= 54 (GS, March 2023)].
Deitz 1977 [N = 3, citations= 71 (GS, March 2023)].
Deitz et al. 1978 [N = 14, citations= 46 (GS, March 2023)].
- Original effect size: NA.
- Replication effect size: Deitz & Repp: NA. Deitz: NA. Deitz et al.: NA.
Extinction bursts. Extinction is an intervention procedure that reduces a behaviour, such as tantrums, by removing its reinforcement (e.g., ignoring a crying child); an extinction burst is a temporary increase in the frequency or intensity of that behaviour after extinction begins.
Statistics
- Status: mixed
- Original paper: ‘
The elimination of tantrum behavior by extinction procedures’, Williams 1959; single-case experimental design, specifically a multiple-baseline design across behaviours (case report), n=1. [citation=519(GS, March 2023)].
- Critiques:
Arin et al. 1966 [n=16, citation=778 (GS, March 2023)].
Katz and Lattal 2020 [n1=9, n2=20, n3=20, citation=13 (GS, March 2023)].
Lerman and Iwata 1996 [meta-analysis of 113 sets of extinction data, citation=266 (GS, March 2023)].
Lerman et al. 1999 [case report: n=1, citation=49 (GS, March 2023)].
- Original effect size: NA, case report.
- Replication effect size: Arin et al.: Pigeons exhibited aggression towards nearby pigeons or models after being conditioned to peck a response key. This aggression was caused by the transition from food reinforcement to extinction. Various factors influenced the duration and frequency of attack. Katz and Lattal: Response increases relative to baseline during the first 20 min of a 324.75-min extinction session (Experiment 1) or during the first 30-min extinction session (Experiments 2 and 3) were rare and unsystematic. The results reinforce earlier meta-analyses concluding that extinction bursts may be a less ubiquitous early effect of extinction than has been suggested. Lerman and Iwata: Reported an initial increase in the frequency of the target response in 24% of the cases when extinction was implemented. Lerman et al.: Pattern of behaviour is consistent with what has been observed in studies of extinction bursts, where an initial increase in the targeted behaviour is often observed following the introduction of an extinction procedure.
Above-Average Effect (Better-Than-Average Effect). People have the tendency to perceive themselves as superior in comparison to the average peer.
Statistics
- Status: replicated
- Original paper: ‘
Global self-evaluation as determined by the desirability and controllability of trait adjectives’, Alicke 1985; within-subject design, n=164. [citations = 1589(GS, January 2023)].
- Critiques: Meta-analysis:
Zell et al. 2020 [n=124 published articles, 291 independent samples, and more than 950,000 participants, citations = 84 (GS, February 2022)]. Replication and extension by
Ziano et al. 2021 [n = 1,573, citations = 14 (GS, January 2023)].
Korbmacher et al. 2022 [n=756, citations = 0 (GS, February 2022)].
- Original effect size: For the trait desirability effect, ηp2 = .78 [.73, .81]; for the effect of desirability being stronger for more controllable traits, ηp2 = .21 [.12, .28].
- Replication effect size: Zell et al.: dz = 0.78 [0.71, 0.84]. Ziano et al.: For the trait desirability effect, sr2 = .54 [.43, .65]; for the effect of desirability being stronger for more controllable traits, sr2 = .07 [.02, .12]. Korbmacher et al.: Own ability & comparative ability r= .99, Domain difficulty and comparative ability r= -.85; Easy domains: from d = 0.54 to d = 1.18, Difficult domains: from d =0.11 (non-sig) to d = -0.65.
Below-Average Effect. The tendency of a person to underestimate their intellectual or social abilities when comparing to other people.
Statistics
- Status: mixed
- Original paper:
‘Lake Wobegon be gone! The “below-average effect” and the egocentric nature of comparative ability judgments’, Kruger 1999; in studies 1 and 2 participants compared themselves with peers on ability domains, in study 3 cognitive load was added as a condition, study 1: N=37, study 2: N=104, study 3: N=49. [citations=1258 (GS, May 2023)].
- Critiques:
Eriksson and Funke 2015 [study 1: n=800, study 2: n=193, citations=25 (GS, May 2023)].
Windschitl et al. 2002 [study 1: N=40, study 2: N=40, study 3: N=40, study 4: N=87, study 5: N=82, study 6: N=90, study 7: N=206, citations= 36 (GS, May 2023)].
Korbmacher et al. 2022 [n=756, citations = 0 (GS, February 2022)].
- Original effect size: Study 1: participants thought they were above average (i.e., above the 50th percentile) in the easy ability domains, but below average in the difficult domains (all p’s < .01); Easy domain percentile estimates - Using mouse = 58.8, Driving = 65.4, Riding bicycle = 64, Saving money = 61.5; Difficult domain - Telling jokes = 46.4 (n.s.), Playing chess = 27.8, Juggling = 26.5, Computer programming = 24.8; Study 2: beta coefficient = .90; Study 3: participants thought that they were above average in terms of the easy abilities (M percentile = 78.4), and below average in terms of the difficult abilities (M percentile = 23.1).
- Replication effect size: Eriksson and Funke: Study 1: ES not reported, but comparison of above-ingroup measures with zero levels show that Democrats exhibited a statistically significant below-average effect on warmth and a null effect on competence; Republicans, on the other hand, exhibited a significant above-average effect on warmth and a null effect on competence; Study 2: ES not reported, but Democrats exhibited a statistically significant below-average effect on warmth (but no significant effect on competence); Republicans, on the other hand, exhibited significant below-average effect on competence (but no significant effect on warmth). Windschitl et al.: not reported. Korbmacher et al.: Own ability & comparative ability r = .99, Domain difficulty and comparative ability r= -.85; Easy domains: from d = 0.54 to d = 1.18, Difficult domains: from d =0.11 (non-sig) to d = -0.65.
Overconfidence (“unskilled and unaware of it” effect, overplacement, overprecision, calibration of subjective probabilities, realism of confidence). The overestimation of one’s actual ability, performance, level of control, or chance of success in any given situation.
Statistics
- Status: mixed
- Original paper: ‘
Do Those Who Know More Also Know More about How Much They Know?’, Lichtenstein & Fischhoff 1977; five experiments with various designs, Experiment 1a: n= 92, Experiment 1b: n= 63, Experiment 2: n= 57, Experiment 3: n= 120, Experiment 4: n= 50, Experiment 5: n= 93. [citations=1548 (GS, March 2023)].
- Critiques:
Gigerenzer et al. 1991 [n=2081, citations=2 (GS, March 2023)].
Dawes & Mulford 1996 [n=145, citations=383 (GS, March 2023)].
Klayman et al. 1999 [Experiment 1: n=32, Experiment 2: n= 54, Experiment 3: n=32, citations=942 (GS, March 2023)].
Olsson 2014 [n=NA(review study), citations=74(GS, March 2023)].
- Original effect size: proportions of over-/underconfidence: Experiment 1a – +.15; Experiment 1b – +.18; Experiment 2 – training (+.07), no training (+.14); Experiment 3 – best subjects (+.05), middle subjects (+.07), worst subjects (+.15), best subjects-easy items (-.05), best subjects-hard items (+.14), middle subjects-easy items (-.05), middle subjects-hard items (+.19), worst subjects-easy items (+.03), worst subjects-hard items (+.25); Experiment 4 – best subjects-easy items (-.06), best subjects-hard items (+.05), worst subjects-easy items (-.03), worst subjects-hard items (+.17); Experiment 5 – Easy test (-.02), Hard test (+.12).
- Replication effect size: Gigerenzer et al.: Experiment 1 – percentage correct was 52.9, mean confidence was 66.7, and overconfidence was 13.8; Experiment 2 – In the selected set, mean confidence was 71.6%, percentage correct was 56.2, and overconfidence was 15.4; in the representative set, overconfidence largely disappeared (2.8%), mean confidence was 78.1% and percentage correct was 75.3%. Dawes and Mulford: ES not reported; regression effects account for the evidence cited in support of overconfidence. Klayman et al.: Experiment 1 – overconfidence between -.073 and +.130 on forty questions, mean overconfidence +.046; Experiment 2 – July temperatures +.150, State poverty levels -.014, Mountain heights -.115, State populations +.023, Shampoo prices -.025, Presidential sequence +.008; Experiment 3 – With confidence-range questions, overconfidence is large, on the order of 45%; differences between domains and between individuals are strong as well. Olsson: ES not reported; methodological and statistical artefacts can explain many of the observed instances of apparent overconfidence.
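The Gigerenzer et al. figures above follow the usual calibration arithmetic, in which the over-/underconfidence score is mean confidence minus the percentage of correct answers (positive values indicate overconfidence, negative values underconfidence). A small Python sketch, using only the numbers listed above and an illustrative function name, makes the calculation explicit:

```python
def overconfidence_score(mean_confidence_pct: float, percent_correct: float) -> float:
    """Over-/underconfidence in percentage points: confidence minus accuracy."""
    return mean_confidence_pct - percent_correct


# (label, mean confidence %, percentage correct) as listed for Gigerenzer et al. 1991
conditions = [
    ("Experiment 1", 66.7, 52.9),
    ("Experiment 2, selected set", 71.6, 56.2),
    ("Experiment 2, representative set", 78.1, 75.3),
]

for label, confidence, correct in conditions:
    print(label, round(overconfidence_score(confidence, correct), 1))
# Experiment 1 13.8
# Experiment 2, selected set 15.4
# Experiment 2, representative set 2.8
```

The proportions listed for Lichtenstein & Fischhoff and Klayman et al. are on a 0 to 1 scale, so a score of +.15 there corresponds to 15 percentage points here.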
Better-than-average effect. People tend to rate themselves as better than average on desirable traits and skills.
Statistics
- Status: replicated
- Original paper: ‘
Are we all less risky and more skillful than our fellow drivers?’, Svenson 1981; participants in a US (n = 81) and a Swedish (n = 80) sample rated their driving safety or driving skill compared to others. [citations = 2699 (GS, June 2022)].
- Critiques:
Koppel et al. 2021 [n = 1,203, citations = 0 (GS, June 2022)]. Meta-analysis:
Zell et al. 2020 [n = 965,307, citations = 100 (GS, June 2022)].
- Original effect size: Hedges’s g = 0.41 to Hedges’s g = 1.25 (calculated from statistics reported in original paper).
- Replication effect size: Koppel et al.: Hedges’s g = 1.18 to Hedges’s g = 1.70. Zell et al.: robust across studies (dz = 0.78 [0.71, 0.84]), with little evidence of publication bias.
Accuracy of information (truth discernment). Asking people to think about the accuracy of a single headline improves “truth discernment” in their intentions to share news headlines about COVID-19.
Statistics
- Status: mixed
- Original paper: ‘
Fighting COVID-19 Misinformation on Social Media: Experimental Evidence for a Scalable Accuracy-Nudge Intervention’, Pennycook et al. 2021; 2 survey studies with Study 1: n = 853, Study 2: n = 856 [citations=887(GS, March 2022)].
- Critiques:
Roozenbeek et al. 2021 [n=1583, citations=22(GS, March 2022)].
- Original effect size: Study 1: d = 0.657 [0.477, 0.836] on accuracy judgement; d = 0.121 [0.030, 0.212] on sharing intention; Study 2: control condition: d = 0.050 [−0.033, 0.133]; treatment condition: d = 0.142 [0.049, 0.235].
- Replication effect size: Roozenbeek et al.: Study 1: F = 1.53; Study 2: treatment: d = −0.14 [−0.17, −0.12], control: d = −0.10 [−0.13, −0.078].
Resultant moral luck. The phenomenon of moral judgments being influenced by factors beyond the agent’s control that affect the outcome of their actions. Kneer and Machery (2019) claim that there is no evidence for resultant moral luck and that the puzzle of moral luck is not a genuine problem.
Statistics
- Status: NA
- Original paper: ‘
No luck for moral luck’, Kneer and Machery 2019; two experiments conducted, between-subjects (1a) and within-subjects (1b), 1a: n=196 and 1b: n=95. [citations=76 (GS, January 2023)].
- Critiques:
Laves 2020 [theoretical/review paper, n=NA, citations=12(GS, January 2023)].
- Original effect size: Between-subjects Design (1a): Wrongness: d=0.44 [0.16, 0.72], Blame: d=0.39 [0.17, 0.58], Permissibility: d=0.26 [-0.02, 0.55], Punishment: d=0.79 [0.50, 1.08]; Within-subjects Design (1b): Wrongness: d=0.16 [0.004, 0.27], Blame: d=0.24 [0.09, 0.38], Permissibility: d=0.06 [-0.02, 0.14], Punishment: d=0.47 [0.30, 0.64]; Within paper replications.
- Replication effect size: Laves: NA; Laves argues that Kneer and Machery’s experiments do not dissolve the puzzle of moral luck, but rather show that people have inconsistent intuitions about moral luck depending on the context and framing of the scenarios; he also questions the validity and reliability of the measures used by Kneer and Machery, and suggests that their results are influenced by confounding factors such as moral emotions, causal responsibility, and moral principles. No replication studies conducted as of January 2023.
Incidental disgust (Amplification hypothesis). Irrelevant feelings of disgust can amplify the severity of moral condemnation.
Statistics
- Status: mixed
- Original paper: ‘
Hypnotic disgust makes moral judgments more severe’, Wheatley and Haidt 2005; two between-subject experiments, Study 1: n=64, Study 2: n=94. [citations=1437 (GS, February 2023)].
- Critiques:
Chapman et al. 2009 [n=87 across three experiments, citations=919(GS, February 2023)].
Eskine et al. 2011 [n=57 (with 3 dropped for guessing the hypothesis), citations=600(GS, February 2023)].
Ghelfi et al. 2020 [n=1137 (across 11 studies), citations=32(GS, February 2023)]. Meta-analysis
Landy and Goodwin 2015 [n=5,102 across 51 studies, citations=391(GS, February 2023)].
- Original effect size: Study 1 – participants rated vignettes as being more morally wrong when the hypnotic disgust word was present than when the word was absent, r = .34 [d=0.53, reported in
Landy and Goodwin 2015]; Study 2 - r =.36 [d=0.25, reported in
Landy and Goodwin 2015].
- Replication effect size: Eskine et al.: Regression slope coefficient of physical disgust measure on moral judgement = 0.525, t(52) = 4.445, p < .001. Ghelfi et al.: Linear mixed effects regression slope coefficient of standardised disgust ratings on standardised moral-wrongness judgments = 0.07, p = .014. Landy and Goodwin: effects of disgust induction in different sensory modalities on moral judgements - d=-0.38 to d=1.44, weighted mean of the effect sizes across 51 studies d=0.11. Chapman et al.: Increasing disgust with increasing offer unfairness [ηp2 = 0.324, calculated from the reported F(1,135) = 64.8, p < 0.001, using this
conversion].
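The ηp2 value for Chapman et al. is described as calculated from the reported F statistic. The linked conversion is not reproduced in the source, so the Python below is only a sketch that assumes the standard identity ηp2 = (F · df_effect) / (F · df_effect + df_error); the function name is illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom.

    Assumption: eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    explained = f_value * df_effect
    return explained / (explained + df_error)


# Chapman et al. (2009): F(1, 135) = 64.8
print(round(partial_eta_squared(64.8, 1, 135), 3))  # 0.324
```

Under that assumption the result agrees with the value listed above.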
Single-exposure musical conditioning. An important study, which employed classical conditioning theory, proposed that a person’s preference for a product can be influenced by the type of music they hear while being exposed to it. A follow-up experiment differentiated between scenarios to see whether classical conditioning or information processing might be a better explanation for product preference.
Statistics
- Status: mixed
- Original paper: ‘
The Effects of Music in Advertising on Choice Behavior: A Classical Conditioning Approach’, Gorn 1982; Experiment 1: n = 195, Experiment 2: n = 122. [citations = 1592 (GS, January 2023)].
- Critiques:
Vermeulen and Beukeboom 2015 [Experiment 1: n = 182, Experiment 2: n = 224, Experiment 3: n = 127, citations = 42 (GS, January 2023)]. Reported here without considering participants that were excluded due to deviant musical taste.
- Original Effect Size: Considering participants that were excluded due to deviant musical taste, the log OR was 2.67 from Gorn’s original analysis.
- Replication Effect Size: Vermeulen and Beukeboom: The results reported for both experiments are based on the exclusion of participants due to their deviant musical taste. For experiment 1, a chi-square test showed a significant effect of music on choice (χ2(N = 132; 1) = 4.93, p = .026, 𝜙 = .19, log OR = .79 [.09, 1.48]). In line with the effect for the full sample, the effect is reliably smaller than the 𝜙 = .49, log OR = 2.67 from Gorn’s original paper. Concerning experiment 2, a significant effect of music on choice was found (χ2(1) = 4.57, p = .033, 𝜙 = .17, log OR = .70 [.06, 1.35]). However, the obtained ES was also reliably smaller than the ES stated by Gorn (log OR = 2.67).
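The 𝜙 coefficients above relate to the chi-square statistics through 𝜙 = sqrt(χ2 / N) for a 2 × 2 table. As a brief, hedged illustration in Python (the function name is illustrative, and only Experiment 1 is checked because the analysed N for Experiment 2 is not stated in this entry):

```python
import math


def phi_from_chi_square(chi_square: float, n_total: int) -> float:
    """Phi coefficient for a 2 x 2 table: sqrt(chi2 / N)."""
    return math.sqrt(chi_square / n_total)


# Vermeulen and Beukeboom (2015), Experiment 1: chi2(1, N = 132) = 4.93
print(round(phi_from_chi_square(4.93, 132), 2))  # 0.19
```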
Rational Expectations. The extent to which participants in an experiment choose the action with the highest expected payoff based on their private signal and the choices and outcomes of previous participants.
Statistics
- Status: mixed
- Original paper: ‘
Do We Follow Others When We Should? A Simple Test of Rational Expectations’, Weizsäcker 2010; a laboratory experiment with 24 sessions and 12 participants per session, n=288. [citations=153 (GS, January 2023)].
- Critiques:
Ziegelmeyer et al. 2013 [n = 30,683 decisions made by 2,948 participants in 13 information cascade experiments, citations=18 (GS, March 2023)].
- Original effect size: When the expected payoff from contradicting one’s signal is higher than 1/2 (where it is empirically optimal for participants to contradict their own private information), respondents choose this action (among two alternatives) in less than half of the cases; the frequency of the optimal choice is 0.44. When the expected payoff is lower than 1/2, participants follow the signal nine out of ten times; they make the optimal choice with a frequency of 0.91.
- Replication effect size: Ziegelmeyer et al.: Where the empirical payoff for contradicting one’s own signal is greater than 1/2, the relative frequency of optimal choice is 0.60 (partly replicated, participants are moderately successful in learning from others); where the empirical payoff for contradicting one’s own signal is lower than 1/2, the optimal choice occurs with a relative frequency of 0.92 (replicated).
Decreased sense of free will reduces personal responsibility. Vohs and Schooler (2008) asked participants to read an article either debunking free will or a control passage, and found that those reading the former cheated more on an experimental task. It was suggested that the decreased sense of free will as a result of reading the text reduced perceptions of personal responsibility.
Statistics
- Status: not replicated.
- Original paper: ‘
The value of believing in free will: Encouraging a belief in determinism increases cheating’, Vohs and Schooler 2008; experimental design, n1=30, n2=122. [Citations=1044(GS, January 2023)].
- Critiques:
Buttrick et al. 2020 [n = 621, citations= 11 (GS, January 2023)]. The Open Science Collaboration
Embley et al. 2015 [n = 58, citations=5(GS, January 2023)].
- Original effect size: d=.88.
- Replication effect size: Buttrick et al.: no differences in cheating behaviour using a more rigorous measurement approach, d = 0.076 [−0.082, 0.22]. Embley et al.: no differences in cheating behaviour between the two experimental conditions, d = 0.20 [−0.33, 0.74], p = .44.
Unrealistic optimism (Optimism bias). The tendency to overestimate the likelihood of experiencing positive outcomes and underestimating negative ones.
Statistics
- Status: mixed (largely replicated, but contextual factors influence size).
- Original paper: ‘
Unrealistic optimism about future life events’, Weinstein 1980; within-subjects design, two experiments conducted but Experiment 1 is most relevant, n = 258. [citations = 7,521 (GS, January 2023)].
- Critiques:
Klein and Helweg-Larsen 2002 [meta-analysis, n = 5,142 participants, k = 27 studies, citations = 546 (GS, January 2023)].
Maksim et al. 2022 [preprint, Experiment 1: n = 105, Experiment 2: n = 71, citations = 0 (GS, January 2023)].
Shepperd et al. 1996 [Experiment 1: n = 83 (mixed design; 31 sophomores, 22 juniors and 29 seniors), Experiment 2: n = 144 (mixed design, but only looking at within-subject comparisons), citations = 491 (GS, January 2023)].
- Original effect size: Overestimating positive outcomes: d = 0.43 [0.30, 0.55] (estimated from test-statistic: t(255) = 6.8, p < .001); Underestimating negative outcomes: d = 0.87 [0.73, 1.01] (estimated from test-statistic: t(255) = 13.9, p < .001).
- Replication effect size: Klein and Helweg-Larsen: Overall effect size: d = 0.64 [0.60, 0.68] (replicated); reported moderators: larger in student samples (d = 0.94 [0.87, 1.02]) than in non-students (d = 0.50 [0.45, 0.55]), and larger in the US (d = 1.23 [1.16, 1.13]) than elsewhere (d = 0.36 [0.31, 0.41]). Maksim et al.: Experiment 1: d = 0.34 (replicated), Experiment 2: d = 0.37 (replicated). Shepperd et al.: Experiment 1: Students asked to predict their likely starting salary post-graduation; Beginning of semester: Sophomores: d = 1.26 [0.78, 1.73] (estimated from test-statistic: t(30) = 6.91, p < .001) (replicated), Juniors: d = 1.20 [0.63, 1.75] (estimated from test-statistic: t(21) = 5.50, p < .001) (replicated), Seniors: d = 0.53 [0.13, 0.93] (estimated from test-statistic: t(28) = 2.83, p = .004) (replicated); Two weeks prior to graduation: Sophomores: d = 1.02 [0.57, 1.45] (estimated from test-statistic: t(30) = 5.57, p < .001) (replicated), Juniors: d = 0.86 [0.35, 1.35] (estimated from test-statistic: t(21) = 3.95, p < .001) (replicated), Seniors: d = 0.28 [-0.10, 0.65] (estimated from test-statistic: t(28) = 1.48, p = .075) (not replicated); Experiment 2: Students asked to predict their exam score; Students overestimated their test score 1 month prior to the examination: d = 0.63 (estimated from test-statistic: t(88) = 5.91, p < .001) (replicated), Students underestimated their test score 3 seconds before scores were released: d = 0.16 [-0.01, 0.32] (estimated from test-statistic: t(145) = 1.87, p = .016, one-tailed) (opposite direction).
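The d values above that are marked "estimated from test-statistic" can be approximated from the reported t and its degrees of freedom. The sketch below assumes the within-subject conversion d = t / sqrt(df); the entry does not state which formula was actually used, so this is an illustration that happens to agree with the rounded values, not a description of the contributors' method. The function name is hypothetical.

```python
import math


def d_from_within_t(t: float, df: int) -> float:
    """Approximate Cohen's d for a within-subject contrast.

    Assumption (not stated in the entry): d = t / sqrt(df).
    """
    return t / math.sqrt(df)


# (label, t, df) as reported in the entry above
reported = [
    ("Weinstein 1980, positive outcomes", 6.8, 255),
    ("Weinstein 1980, negative outcomes", 13.9, 255),
    ("Shepperd et al. 1996, sophomores (start of semester)", 6.91, 30),
    ("Shepperd et al. 1996, juniors (start of semester)", 5.50, 21),
    ("Shepperd et al. 1996, seniors (start of semester)", 2.83, 28),
]

for label, t, df in reported:
    print(f"{label}: d = {d_from_within_t(t, df):.2f}")
```

Under this assumption the rounded results match the d values listed in the entry (0.43, 0.87, 1.26, 1.20, 0.53).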
Miles per gallon illusion (MPG illusion, kilometres per litre illusion). People misperceive how much fuel and money will be saved by improving fuel efficiency, because they assume fuel use decreases linearly with MPG, whereas in reality, increasing by a few MPG saves much more gas at low levels of MPG (e.g., 12 to 14 MPG) than at high levels (e.g., 30 to 32 MPG).
Statistics
- Status: replicated
- Original paper: ‘The MPG Illusion’, Larrick and Soll 2008; experimental design, n=171 (for the experimental manipulation of GPM frame vs. MPG frame and choice between Program A and Program B). [citations=465 (GS, January 2023)].
- Critique:
Murata 2016 [n=66, citations = 0(GS, January 2023)].
- Original effect size: OR = 5.27.
- Replication effect size: Murata: OR = 2.09 (for the experimental manipulation scenario–choice between Program A and Program B); the other studies replicated even better than this one, but are more difficult to convert to effect sizes.
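The arithmetic behind the illusion is easy to check: fuel consumed is distance divided by MPG, so equal MPG gains translate into very different fuel savings. The sketch below uses the two MPG pairs mentioned in the description; the 10,000-mile distance is an arbitrary assumption chosen for illustration, and the function names are hypothetical.

```python
def gallons_used(miles: float, mpg: float) -> float:
    """Fuel consumed over a distance: gallons = miles / MPG."""
    return miles / mpg


def gallons_saved(miles: float, mpg_old: float, mpg_new: float) -> float:
    """Fuel saved by moving from mpg_old to mpg_new over the same distance."""
    return gallons_used(miles, mpg_old) - gallons_used(miles, mpg_new)


DISTANCE = 10_000  # miles; hypothetical distance, not from the source

low = gallons_saved(DISTANCE, 12, 14)   # +2 MPG at a low efficiency level
high = gallons_saved(DISTANCE, 30, 32)  # +2 MPG at a high efficiency level

print(f"12 -> 14 MPG saves {low:.1f} gallons per {DISTANCE} miles")
print(f"30 -> 32 MPG saves {high:.1f} gallons per {DISTANCE} miles")
```

Over the same distance, the 12 to 14 MPG improvement saves roughly 119 gallons, whereas the 30 to 32 MPG improvement saves roughly 21 gallons.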
Certainty effect. The tendency to overweight the importance of an increase from 99% to 100% probability that some prospect/event will occur.
Statistics
- Status: replicated
- Original paper: ‘Prospect Theory: An Analysis of Decision Under Risk’, Kahneman and Tversky 1979; between-subject manipulation of choice problems, n=66. [citations = 75,571 (GS, January 2023)].
- Critique:
Ruggeri et al. 2020 [n=4,098, citations = 135 (GS, January 2023)].
- Original effect size: Log OR = -1.36 (for the 1st certainty effect demonstration; item 1 vs. 2 contrast).
- Replication effect size: Ruggeri et al.: Log OR = -1.72 [-1.58, -1.97] (pooled across many samples from different countries). The other certainty effect contrasts also replicated successfully.
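Log odds ratios such as those above can be read more easily after exponentiating them into odds ratios. The following is a minimal sketch of that conversion only; it makes no assumption about which response option was coded as the reference category in the original or the replication, and the function name is hypothetical.

```python
import math


def odds_ratio_from_log_or(log_or: float) -> float:
    """Convert a log odds ratio into an odds ratio: OR = exp(log OR)."""
    return math.exp(log_or)


original = odds_ratio_from_log_or(-1.36)     # Kahneman and Tversky 1979
replication = odds_ratio_from_log_or(-1.72)  # Ruggeri et al. 2020

print(f"Original:    log OR = -1.36 -> OR = {original:.2f}")
print(f"Replication: log OR = -1.72 -> OR = {replication:.2f}")
```

A log OR of 0 corresponds to an OR of 1 (no difference in odds); values further from 0 indicate a stronger shift in choices between the two problems.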
Overweighting small probabilities. People tend to overweight/overreact to changes in probability from 0 to very small probabilities. (In other words, whereas classical economic theory would suggest that a change from a 0% to a 1% chance should have the same impact as a change from a 20% to a 21% chance, people respond much more strongly to the former change.)
Statistics
- Status: replicated
- Original paper: ‘Prospect Theory: An Analysis of Decision Under Risk’, Kahneman and Tversky 1979; between-subject manipulation of choice problems, n=66. [citations = 75,571 (GS, January 2023)].
- Critique:
Ruggeri et al. 2020 [n=4,098, citations = 135 (GS, January 2023)].
- Original effect size: Log OR= -1.23.
- Replication effect size: Ruggeri et al.: Log OR= -2.78 [-2.54, -2.62] (3 out of the 4 demonstrations of this phenomenon were successfully replicated in Ruggeri et al., while 1 was not).
Positive affect increases patience. Watching a positive affect-inducing video will increase patience in an intertemporal choice task.
Slow to anger, fast to forgive. Adding noise/uncertainty to the reason for a person’s action leads people to be more lenient (slow to anger, quick to forgive). A secondary finding is that under these circumstances with noise/uncertainty, cooperative strategies also lead to higher payoffs (in situations with cooperative equilibria).
Statistics
- Status: replicated
- Original paper: ‘Slow to anger, fast to forgive’, Fudenberg et al. 2012; experimental play of the repeated prisoner’s dilemma, n=384. [citations=372 (GS, January 2023)].
- Critique:
Camerer et al. 2016 [n=128, citations=1,206(GS, April 2023)].
- Original effect size: b = -.627.
- Replication effect size: Camerer et al.: b = -.605.
Isolation effect (Von Restorff effect). The isolation effect occurs when people focus on differences between options rather than similarities. We not only remember the differences between two stimuli, but we also tend to give them greater weighting. For example, we notice the single yellow apple that stands out in a batch of red apples.
Magnitude effect (magnitude perception). People are sensitive to relative as well as absolute magnitude. Most people find the difference between $100 and $200 more meaningful than the difference between $1,100 and $1,200; the marginal value of the outcome generally scales with magnitude.
Reflection effect. People tend to be risk averse when choosing between gains, but risk seeking when choosing between losses. The preference between negative prospects is the mirror image of the preference between positive prospects – the reflection of prospects around 0 reverses the preference order.
Statistics
- Status: mixed
- Original paper: ‘Prospect Theory: An analysis of decisions under risk’, Kahneman and Tversky 1979; between-subject manipulation of choice problems, Problem 3 n = 95, Problem 4 n = 95, Problem 5 n = 72, Problem 6 n = 72, Problem 7 n = 66, Problem 8 n = 66, Problem 9 n = 95, Problem 10 n = 141. [citations=76,664 (GS, March 2023)].
- Critiques:
Ruggeri et al. 2020 [multinational replication study n=4,098, citations=143(GS, March 2023)].
- Original effect size: Different choice problem comparisons, log OR = -3.772 to log OR = 0.949 (reported in Ruggeri et al. 2020, data available at https://osf.io/esxc4/).
- Replication effect size: Ruggeri et al.: log OR = -3.332 to log OR = 0.026; four out of five comparisons are replicated.
Unusual disease problem (Asian disease problem). When survival is communicated (positive framing), people tend to choose an option with a certain outcome (risk averse decision). In contrast, when mortality is communicated (negative framing), people tend to choose an option with an uncertain outcome (risk-seeking decision).
Statistics
- Status: replicated
- Original paper: ‘The Framing of Decisions and the Psychology of Choice’, Tversky & Kahneman 1981; correlational, two “Asian disease problem” situations N1=152, N2=155. [citations=25,330 (GS, February 2023)].
- Critiques:
Diederich et al. 2018 [N = 43, citations=31 (GS, February 2023)].
Meta-analysis: Kühberger 1998 [n≈30,000 respondents over 136 empirical papers, citations=1607 (GS, February 2023)].
Otterbring et al. 2021 [Study 1 N = 200, Study 2 N=800, citations=27(GS, February 2023)].
Peterson and Tollefson 2023 [N = 1,209, citations=0(GS, February 2023)].
- Original effect size: d = 1.16 [reported in Kühberger 1998].
- Replication effect size: Diederich et al.: Gain versus Loss effect on Risky option preference significant in two regression models, β=-0.206 and β=-0.303, respectively (replicated). Kühberger: mean effect size for the 80 studies using the Asian disease problem, d=0.57 [0.53, 0.61] (replicated). Otterbring et al.: Study 1 - statistically significant effect of framing on participants’ choice of program (b = 1.52, Z = 4.80, p < 0.001), such that a larger proportion of participants chose the risky program under conditions of negative (78.0%) compared to positive framing (44.0%); Study 2 - statistically significant effect of framing on participants’ choice of program (b = 1.82, Z = 11.13, p < 0.001), such that a larger proportion of participants chose the risky program under conditions of negative (68.8%) compared to positive framing (26.6%) (replicated). Peterson and Tollefson: d = 0.26 [calculated from the reported chi-square and sample size, χ2 = 17.41, N = 1021] (replicated).
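The Peterson and Tollefson value above is described as calculated from the reported chi-square and sample size. One common route, assumed here because the entry does not name the formula, is to convert a 1-df chi-square into a correlation (phi) and then into Cohen's d. The function name is hypothetical.

```python
import math


def d_from_chi2(chi2: float, n: int) -> float:
    """Approximate Cohen's d from a 1-df chi-square statistic.

    Assumed conversion (not stated in the entry):
        r = sqrt(chi2 / n)           # phi coefficient for a 2x2 table
        d = 2 * r / sqrt(1 - r**2)   # standard r-to-d conversion
    """
    r = math.sqrt(chi2 / n)
    return 2 * r / math.sqrt(1 - r ** 2)


d = d_from_chi2(chi2=17.41, n=1021)
print(f"d = {d:.2f}")
```

With χ2 = 17.41 and N = 1021 this gives d ≈ 0.26, in line with the value reported in the entry.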
Last place aversion. A phenomenon where individuals are averse to being in last place and choose gambles with the potential to move them out of last place that they reject when randomly placed in other parts of the distribution.
Statistics
- Status: replicated
- Original paper: ‘“Last-Place Aversion”: Evidence and Redistributive Implications’, Kuziemko et al. 2014; laboratory experiment, N=84. [citations=358 (GS, March 2023)].
- Critiques:
Bull 2020 [Study 1: n=1144, Study 2-4: n=1203, citations= 30 (GS, March 2023)]
- Original effect size: Paper does not provide enough information to convert to effect size. Lottery experiment (Appendix Table 2): Relevant coefficient: “Last or fifth place”, Coefficient value: 0.448, P-value: < 0.01.
- Replication effect size: Bull: Study 1: Observational analysis of customers queuing at a grocery store, where the author recorded the queue positions, wait times, and switching and abandonment behaviours of 1,144 customers. The results showed that being last in line doubled the probability of switching queues and quadrupled the chances of leaving the line altogether. The last-place indicator has a coefficient of 1.255 with a p-value < 0.05. This suggests that, holding all else constant, customers were 3.5 times more likely to switch queues when they were in last place compared to having a single person waiting in line behind them. Studies 2-4: All were online experiments in which participants waited in a virtual queue for a chance to win a gift card. They show that being in last place 1) reduced wait satisfaction and increased abandonment rates, and 2) increased the perceived value of the service, and that 3) queue transparency moderated these effects.
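The "3.5 times" figure in Study 1 follows from exponentiating the reported coefficient. The sketch below assumes the coefficient of 1.255 comes from a logit-type model, so that exp(b) is an odds ratio; the entry does not state the model, and strictly speaking the result then describes odds rather than probabilities. The function name is hypothetical.

```python
import math


def odds_ratio_from_logit_coef(b: float) -> float:
    """Convert a logit coefficient into an odds ratio: OR = exp(b).

    Assumption: b is a log-odds coefficient (e.g. from logistic regression).
    """
    return math.exp(b)


or_last_place = odds_ratio_from_logit_coef(1.255)
print(f"Odds ratio for the last-place indicator: {or_last_place:.2f}")
```

Under this assumption, exp(1.255) ≈ 3.51, which is consistent with the "3.5 times" reported above.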
Ikea effect. When compared to objectively similar goods not produced by themselves, consumers place a higher value on goods they have assembled. Consumers show a higher willingness-to-pay when they assemble products themselves.
Statistics
- Status: replicated.
- Original paper: ‘The “IKEA Effect”: When Labor Leads to Love’, Norton et al. 2012; four between-subject experiments, N1a=52, N1b=106, N2=118, N3=39. [citations=1,358 (GS, February 2023)].
- Critiques:
Mochon et al. 2012 [four experiments N1=79, N2=135, N3a=75, N3b=41, citations=262(GS, February 2023)].
Sarstedt et al. 2016 conceptual replication [N=103, citations=26(GS, February 2023)].
- Original effect size: Experiment 1a: builders bid significantly more for their boxes (M=$0.78, SD=0.63) than non-builders (M=$0.48, SD=0.40), d = 0.59 (calculated from reported t statistic, t(50)=2.12, p<.05); Experiment 1b: builders’ valuation of their origami (M=$0.23, SD=0.25) was nearly five times higher than what non-builders were willing to pay for these creations (M=$0.05, SD=0.07), ηp2 = 0.096 / d = 0.32 (calculated from reported F statistic, F(2, 100)=5.34, p<.01 and converted to d using this conversion); Experiment 2: bids overall were higher in the build condition than in the unbuild and prebuilt conditions, ηp2 = 0.126 / d = 0.38 (calculated from F(2, 106)=7.68, p<.01 and converted to d using this conversion); Experiment 3: builders bid significantly more for their boxes (M=$1.46, SD=1.46) than incomplete builders (M=$0.59, SD=0.70), d = 0.75 (calculated from reported t statistic, t(37)=2.35, p<.05).
- Replication effect size: Mochon et al.: Study 1 – builders were willing to pay significantly more for their cars (M=$1.20, SD=1.35) than non-builders (M=$0.57, SD=.76), d= 0.56 (calculated from reported t statistic, t(73)=2.44, p<.05) [replicated]; Study 2 – builders (M = $0.72, SD = .45) were willing to pay significantly more than non-builders (M=$0.46, SD=.50) in no-affirmation condition, d =0.54 (calculated from reported t statistic, t(52)=1.99, p=.05) [replicated]. Sarstedt et al.: Participants in the experimental group (assembly group) offered significantly more money for the loom bands than the control participants, mean difference = 1.36, p < 0.01, d =1.68 (calculated from the M, SD and n data given in Table 5 in the Supplementary material) [replicated].
Endowment effect. People are more likely to retain an object they own than acquire that same object when they do not own it. This implies that the value that an individual assigns to objects appears to increase substantially as soon as that individual is given the object.
Statistics
- Status: replicated
- Original paper: ‘Experimental Tests of the Endowment Effect and the Coase Theorem’, Kahneman et al. 1990; experimental design, Experiment 1: n=42, Experiment 2: n=38, Experiment 3: n=26, Experiment 4: n=74. [citations=6392 (GS, March 2023)].
- Critiques:
Carmon and Ariely 2000 [study 1: n=91, study 2: n=472, study 3: n=75, study 4: n=250, citations=776 (GS, March 2023)].
Shogren et al. 1994 [n=142, citations=776 (GS, March 2023)].
- Original effect size: 5 (selling price divided by buying price).
- Replication effect size: Carmon and Ariely: study 1: d = 0.03 (calculated by converting Pearson’s r to Cohen’s d through this calculator); study 2: NA; study 3: NA; study 4: NA. Shogren et al.: 1.05 over trial 5 (selling price divided by buying price); d = -0.069 (calculated from M and SD reported in Table 2).
One mind per hemisphere (split-brain syndrome). Surgical severing of the corpus callosum leads to the split-brain phenomenon, which is characterised by 1) a response × visual field interaction, 2) strong hemispheric specialisation, 3) confabulations after left-hand actions, 4) split attention, and 5) the inability to compare stimuli across the midline. Together, these reported effects have been interpreted as evidence for split consciousness. However, later work indicates that the surgical procedure does not result in the development of two independent minds or consciousnesses within one brain. Instead, the findings suggest that the two hemispheres continue to work together, even in the absence of the corpus callosum.
Statistics
- Status: mixed.
- Original paper: ‘Some functional effects of sectioning the cerebral commissures in man’, Gazzaniga et al. 1962; case study, n=1. [citations=617 (GS, October 2022)].
- Critiques:
de Haan et al. 2020 [review paper, n=NA, citations=44(GS, April 2023)].
Pinto et al. 2017 [review paper, n=NA, citations=43(GS, April 2023)].
Pinto et al., 2017 [n=2, citations=39(GS, October 2022)].
- Original effect size: NA (verbal descriptions, no quantitative data).
- Replication effect size: de Haan et al.: NA, body of evidence is insufficient to answer this question, different theories of consciousness have different predictions on the unity of mind in split-brain patients, and await the results of further investigation into this intriguing phenomenon. Pinto et al.: argue that the data could instead be indicative of a single undivided consciousness experiencing two parallel and unintegrated perceptual streams. Pinto et al.: replicated (no ES; replicate the standard finding that stimuli cannot be compared across visual half-fields, indicating that each hemisphere processes information independently of the other).
Hydrocephaly. The effect of massive volume loss improving cognition. Hydrocephalus, also known as “water on the brain,” is a condition in which there is an abnormal accumulation of cerebrospinal fluid (CSF) in the ventricular system of the brain. This can cause the ventricles to become enlarged, putting pressure on the brain and causing a wide range of symptoms, depending on the severity of the condition and the age of the individual. There are two types of Hydrocephalus: congenital and acquired. Congenital Hydrocephalus is present at birth and is caused by a genetic or developmental abnormality. Acquired Hydrocephalus develops later in life and can be caused by a variety of factors, such as a brain tumour, infection, or injury.
Statistics
- Status: NA
- Original paper: No paper; instead a documentary and a profile of the claimant, John Lorber. Also ‘Wittgenstein’s Certainty is Uncertain: Brain Scans of Cured Hydrocephalics Challenge Cherished Assumptions’, Forsdyke 2015; review paper, n=NA. [citations = 20 (Springerlink, January 2023)].
- Critiques:
de Oliveira et al. 2012 [review, fraudulent/retracted, n = NA, citations = 42 (GS, April 2023)].
Feuillet et al. 2007 [n=1, citations = 192 (GS, April 2023)].
Hawks 2007 [blog, n=NA, citations = 0 (GS, April 2023)].
Gwern 2019 [blog, n=NA, citations = 0 (GS, April 2023)].
Neuroskeptic 2015 [journal article, n=NA, citations = 0 (GS, April 2023)].
- Original effect size: NA.
- Replication effect size: NA. Crucially, for meta-analyses on improvements after Hydrocephaly treatment see Zhang (2020) or Tabatabaee et al. (2019). Hawks: The reported cases do not apparently involve significant gray matter tissue loss. A “thin” cortex does not necessarily imply functionally small cortical volume, even with substantial white tissue loss. Neuroskeptic: While the enormous “holes” in these brains seem dramatic, the bulk of the grey matter of the cerebral cortex, around the outside of the brain, appears to be intact and in the correct place; no detailed post-mortem studies of their brain tissue have been published. Gwern: the cases turn out to be suspiciously unverifiable (Lorber), likely fraudulent (Oliveira), or actually low intelligence (Feuillet). It is unclear if high-functioning cases of hydrocephalus even have less brain mass, as opposed to lower proxy measures like brain volume. Feuillet et al.: man who got to 44 years old before anyone realised his severe hydrocephaly, through marriage and employment. IQ 75 (i.e. d=-1.7).
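The "d = -1.7" given for the Feuillet et al. case is the IQ score expressed in standard deviation units. The sketch below assumes the conventional IQ norming of mean 100 and SD 15, which the entry does not state explicitly; the function name is hypothetical.

```python
def iq_to_standardised_difference(iq: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Express an IQ score as a standardised distance from the population mean.

    Assumption: conventional norms of mean = 100 and SD = 15.
    """
    return (iq - mean) / sd


d = iq_to_standardised_difference(75)
print(f"IQ 75 corresponds to d = {d:.1f}")
```

Under these norms an IQ of 75 sits about 1.7 standard deviations below the population mean.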
Readiness potentials. Readiness potentials (RPs) are neural signals that are observed in the brain prior to voluntary movements. They are typically measured using electroencephalography (EEG) and are thought to reflect the neural activity associated with preparing for a movement. They occur several hundred milliseconds before the movement, suggesting that the brain prepares for the movement before the person is consciously aware of the decision to move. RPs have been observed in various regions of the brain, including the primary motor cortex, supplementary motor area, and premotor cortex; see Schurger et al. (2021) for a glossary. The discovery of RPs has been used to argue against the concept of free will, as it suggests that the neural activity associated with a voluntary movement starts before the person is consciously aware of the decision to move.
Statistics
- Status: replicated
- Original paper: ‘Hirnpotentialänderungen bei Willkürbewegungen und passiven Bewegungen des Menschen: Bereitschaftspotential und reafferente Potentiale’, Kornhuber and Deecke 1965; 12 healthy subjects in 94 experiments. [citations = 1410 (SPRINGERLINK, January 2023)].
- Critiques:
Alexander et al. 2016 [n=17, citations = 83 (GS, April 2023)].
Fried et al. 2011 [n=12, citations = 589 (GS, April 2023)].
Libet et al. (1964/1983) [6 different experimental sessions with each of 5 subjects, citations = 3814 (PUBMED, January 2023)].
McGilchrist 2012 [n=NA, citations = 55 (GS, April 2023)].
Travers et al. 2020 [n=19, citations = 25 (GS, April 2023)].
- Original effect size: NA.
- Replication effect size: Fried et al.: replicated (no ES). Travers et al.: replicated (no ES). McGilchrist/ Alexander et al.: The neural activity observed may not necessarily be associated with the preparation for a voluntary movement, but rather with a cognitive process such as attention or decision making. Some studies have suggested that RP may reflect the neural activity associated with attentional processes rather than motor preparation, and that the relationship between RP and voluntary movement is not as clear-cut as initially thought.
Left-brain vs. Right-brain Hypothesis. Individuals may be left-brain dominant or right-brain dominant based on personality and cognitive style. Specifically, the hypothesis proposes that the two hemispheres of the brain have different functions and abilities, with the left hemisphere being associated with logical, analytical, and verbal skills, and the right hemisphere being associated with creative, intuitive, and spatial skills. This idea is widespread in popular culture, but it is not supported by scientific evidence.
Statistics
- Status: not replicated
- Original paper: No original paper based on brain data; the idea seems to have evolved from early studies by Broca and Wernicke on language localization in the brain, and became a mainstream popular idea that is not backed by evidence (Source). Original papers on cognitive styles use questionnaires and non-brain-based measures to determine “hemispheric dominance”. An early paper without brain data: ‘Hemispheric dominance in recall and recognition’, Zenhausen and Gebhardt 1979; within-subjects design, n = 20. [citations = 34 (GS, June 2022)].
- Critiques:
Nielsen et al. 2013 [n=1,011, citations= 446(GS, April 2023)].
- Original effect size: N/A [Zenhausen & Gebhardt, not provided].
- Replication effect size: Nielsen et al.: N/A, data are not consistent with a whole-brain phenotype of greater “left-brained” or greater “right-brained” network strength across individuals [no specific result in Nielsen et al. 2013].
Oxytocin on trust. Intranasal administration of oxytocin increases trust in strangers in a laboratory setting.
Statistics
- Status: not replicated
- Original paper: ‘Oxytocin increases trust in humans’, Kosfeld et al. 2005; experiment, n = 128. [citations = 4800 (GS, April 2022)].
- Critiques:
Declerck et al. 2020 [n = 677, citations =57 (GS, April 2022)].
Lane et al. 2015 [n1 = 95, n2= 61, citations =63 (GS, April 2022)].
- Original effect size: Not reported but could be calculated: “In fact, our data show that oxytocin increases investors’ trust considerably. Out of the 29 subjects, 13 (45%) in the oxytocin group showed the maximal trust level, whereas only 6 of the 29 subjects (21%) in the placebo group showed maximal trust (Fig. 2a). In contrast, only 21% of the subjects in the oxytocin group had a trust level below 8 monetary units (MU), but 45% of the subjects in the control group showed such low levels of trust.”
- Replication effect size: Declerck et al.: No support for the hypothesis that OT increases trust in the minimal social contact condition β= −0.136 [−0.952, –0.682]. Lane et al.: Study 1 - no significant effect, F(1,93) = .229, p = .663; Study 2 - no significant effect, F(1,59) = .295, p = .589.
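The original effect size is described above as not reported but calculable from the quoted counts (13 of 29 investors at the maximal trust level under oxytocin versus 6 of 29 under placebo). One possible calculation is an odds ratio with a Wald-type 95% confidence interval; this choice of metric is an assumption for illustration, not the effect size used by the original authors or by the replications. The function name is hypothetical.

```python
import math


def odds_ratio_with_ci(events_a: int, n_a: int, events_b: int, n_b: int, z: float = 1.96):
    """Odds ratio of group A vs. group B with a Wald confidence interval.

    Returns (odds_ratio, ci_lower, ci_upper).
    """
    a, b = events_a, n_a - events_a  # group A: events, non-events
    c, d = events_b, n_b - events_b  # group B: events, non-events
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of the log odds ratio
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


# Counts quoted from Kosfeld et al. 2005 in the entry above
or_, lower, upper = odds_ratio_with_ci(events_a=13, n_a=29, events_b=6, n_b=29)
print(f"OR = {or_:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

The same function could be applied to replication data when cell counts are available, which would put original and replication on a common scale.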
Structural brain-behaviour correlations - the association between behavioural activation and white matter integrity. Individual differences in the sensitivity to signals of reward as indexed by BAS-Total and in the tendency to seek out potentially rewarding experiences as measured by BAS-Fun are positively correlated with diffusion measures of several white matter pathways.
Statistics
- Status: not replicated
- Original paper: ‘White matter integrity and behavioral activation in healthy subjects’, Xu et al. 2012; correlational design, n = 51. [citations = 29 (GS, May 2022)].
- Critiques:
Corrigendum of Boekel et al. 2015 [n=36, citations = 196 (GS, May 2022)].
Keuken et al. 2017 [n = 34-35, citations = 1 (GS, May 2022)].
- Original effect size: BAS-Total correlation with parallel diffusivity in the left corona radiata (CR)/superior longitudinal fasciculus (SLF): r = .51; BAS-Fun correlation with: fractional anisotropy in the left CR/SLF: r = .52, parallel diffusivity in the left CR/SLF: r = .58, mean diffusivity in the left SLF/inferior fronto-occipital fasciculus (IFOF): r = .51.
- Replication effect size: Keuken et al.: BAS-Total correlation with parallel diffusivity in the left CR/SLF: r = -.15; BAS-Fun correlation with: fractional anisotropy in the left CR/SLF: r = -.15, parallel diffusivity in the left CR/SLF: r = -.04, mean diffusivity in the left SLF/inferior fronto-occipital fasciculus (IFOF): r = .05.
Structural brain-behaviour correlations - the association between social network size and grey matter volume. Individual differences in the number of Facebook friends (FBN) are positively correlated with grey matter volume in several brain areas: left middle temporal gyrus (MTG), right superior temporal sulcus (STS), right entorhinal cortex (EC), left and right amygdala.
Statistics
- Status: mixed
- Original paper: ‘Online social network size is reflected in human brain structure’, Kanai et al. 2012; correlational design, n = 125. [citations = 411 (GS, May 2022)].
- Critiques:
Boekel et al. 2015 [n = 34-35, citations = 196 (GS, May 2022)].
Kanai et al. 2012 [n = 40, citations= 411 (GS, May 2022)].
- Original effect size: left MTG: r =.35; right STS: r = .35; right EC: r = .35, left amygdala: r = .30; right amygdala: r = .32.
- Replication effect size: Kanai et al.: left MTG: r =.38; right STS: r = .44; right EC: r = .48; left amygdala: r = .33; right amygdala: r = .48. Boekel et al.: left MTG: r = .18; right STS: r = .11; right EC: r = .06; left amygdala: r = -.14; right amygdala: r = .02.
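When comparing correlations from samples of very different sizes, it can help to look at the uncertainty around each estimate. The sketch below computes an approximate 95% confidence interval for a Pearson correlation using the Fisher z-transformation, applied to the original left MTG result (r = .35, n = 125). This interval is an illustration added here, not a value reported by Kanai et al. or Boekel et al., and the function name is hypothetical.

```python
import math


def correlation_ci(r: float, n: int, z_crit: float = 1.96):
    """Approximate confidence interval for a Pearson correlation.

    Uses the Fisher z-transformation: z = atanh(r), SE = 1 / sqrt(n - 3).
    """
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


lower, upper = correlation_ci(r=0.35, n=125)
print(f"r = .35, n = 125: 95% CI [{lower:.2f}, {upper:.2f}]")
```

The same function can be applied to the replication correlations; with the smaller replication samples the resulting intervals are wider.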
Structural brain-behaviour correlations - the association between distractibility and grey matter volume. Variability in self-reported distractibility is positively correlated with grey matter volume in the left superior parietal lobule (SPL) and negatively correlated with grey matter volume in medial pre-frontal cortex (mPFC).
Structural brain-behaviour correlations - the association between attention and cortical thickness. Individual differences in executive control are negatively correlated with cortical thickness in left anterior cingulate cortex (ACC), left superior temporal gyrus (STG), and right middle temporal gyrus (MTG), whereas variation in alerting scores is negatively correlated with cortical thickness in the left superior parietal lobule (SPL).
Structural brain-behaviour correlations - the association between control over speed/accuracy of perceptual decisions and white matter tracts strength. Individual differences in control over speed and accuracy of perceptual decisions are positively correlated with the strength of white matter tracts between the right presupplementary motor area (pre-SMA) and the right striatum.
Structural brain-behaviour associations - the association between executive function and grey matter volume. Grey matter volume in the rostral dorsal premotor cortex is associated with individual differences in executive function as measured by the trail making test.
Fear conditioning - Amygdala. Animal research suggests that fear conditioning activates the amygdala (LeDoux, 1993), which has been replicated in some (but not all) human fMRI fear conditioning studies.
Statistics
- Status: mixed
- Original paper: ‘Human Amygdala Activation during Conditioned Fear Acquisition and Extinction: a Mixed-Trial fMRI Study’, LaBar et al. 1998; differential fear conditioning paradigm, N=18. [citations=1826 (GS, March 2023)]. Note that amygdala activation habituated over time, as would be expected from research in animals; this methodological consideration has been neglected in many replication attempts.
- Critiques:
Fullana et al. 2016 [meta-analysis, total n=677 from 27 studies, citations=503(GS, March 2023)].
Mechias et al. 2010 [meta-analysis, total n=360, citations=430 (GS, March 2023)].
Öhman et al. 2009 [n=NA, citations=20 (GS, March 2023)].
Phelps et al. 2004 [replication, n=18, citations=2144 (GS, March 2023)].
Sehlmeyer et al. 2009 [systematic review, n=NA, citations=612 (GS, March 2023)].
The following studies demonstrated that amygdala activation can be detected in fear conditioning experiments when amygdala habituation over time is explicitly modelled/considered:
Armony and Dolan 2001 [n=8, citations=60 (GS, March 2023)].
Büchel et al. 1998 [n=9, citations=1264 (GS, March 2023)].
Büchel et al. 1999 [n=11, citations=561 (GS, March 2023)].
Sperl et al. 2019 [n=21, citations=33 (GS, March 2023)].
Yin et al. 2018 [n=18, citations=21(GS, March 2023)].
- Original effect size: NA.
- Replication effect size: Armony and Dolan: the presence of the aversive visual context was associated with enhanced activity in parietal cortex, which may reflect an increase in attention to the presence of environmental threat stimuli. Büchel et al.: Differential evoked responses, related to conditioning, were found in the anterior cingulate and the anterior insula, regions with known involvement in emotional processing. Büchel et al.: Differential responses (CS+ vs CS−), related to conditioning, were observed in anterior cingulate and anterior insula, regions previously implicated in delay fear conditioning; differential responses were also observed in the amygdala and hippocampus that were best characterized with a time × stimulus interaction, indicating rapid adaptation of CS+-specific responses in medial temporal lobe. Fullana et al.: no robust and consistent involvement of the amygdala in fear acquisition across studies; see effect size maps (Fig. 1, 2, 3, and 4; maps are difficult to convert into numbers). Mechias et al.: consistent activation in rostral dmPFC but not in the other candidate areas; discussing methodological constraints. Öhman et al.: excellent overview of early replications and methodological considerations to capture amygdala activity in humans. Phelps et al.: amygdala activation was correlated across subjects with the conditioned response in both acquisition and early extinction. Sehlmeyer et al.: A network consisting of fear-related brain areas, such as amygdala, insula, and anterior cingulate cortex, is activated independently of design parameters. However, some neuroimaging studies do not report these findings in the presence of methodological heterogeneities. Furthermore, other brain areas are differentially activated, depending on specific design parameters. These include stronger hippocampal activation in trace conditioning and tactile stimulation. Furthermore, tactile unconditioned stimuli enhance activation of pain related, motor, and somatosensory areas. Sperl et al.: Fear and extinction recall as indicated by theta explained 60% of the variance for the analogous effect in the right amygdala.
Fear conditioning - vmPFC. Animal research suggests that fear extinction activates the vmPFC (Morgan et al., 1993); based on these findings from animal research, some (but not all) human fMRI fear conditioning/extinction studies found that the vmPFC becomes activated during fear extinction recall.
Statistics
- Status: mixed
- Original paper: ‘Extinction Learning in Humans: Role of the Amygdala and vmPFC’, Phelps et al. 2004; differential fear conditioning and extinction paradigm, N=18. [citations=2144 (GS, March 2023)].
- Critiques:
Diekhof et al. 2011 [meta-analysis, total n=154, citations=323(Elsevier, March 2023)].
Fullana et al. 2018 [meta-analysis, total n > 1,300 participants, citations=210 (GS, March 2023)] (see also methodological comments by Morriss et al., 2018 and Fullana et al., 2019).
- Original effect size: NA.
- Replication effect size: Diekhof et al.: evidence that fear extinction activates vmPFC subregions in humans. Fullana et al.: there is support that fear extinction recall is associated with vmPFC activation, but vmPFC extinction effects seem to be more nuanced than previously assumed and vmPFC effects seem to depend on paradigm characteristics; see effect size maps (Fig. 1, 2, 3, and 4; maps are difficult to convert into numbers).
Fear conditioning - Theta oscillations. Animal research suggests that fear conditioning evokes prefrontal theta activity, which can be measured with EEG in humans.
Statistics
- Status: replicated
- Original paper: ‘Prefrontal Oscillations during Recall of Conditioned and Extinguished Fear in Humans’, Mueller et al. 2014; two-day differential fear conditioning study, n=42. [citations=87(GS, January 2023)].
- Critiques:
Bierwirth et al. 2021 [n=60, citations=11 (GS, January 2023)].
Chen et al. 2021 [n=13, citations=16(GS, January 2023)].
Mueller and Pizzagalli 2016 [n=16, citations=26(GS, January 2023)].
Sperl et al. 2021 [n=21, citations=31(GS, January 2023)].
Starita et al. 2023 [n=20, citations=0(Wiley, January 2023)].
- Original effect size: NA.
- Replication effect size: Bierwirth et al.: results replicated and extended by the influence of sex hormones, ηp2(estradiol status for E2 level by group) = .076; d (one-sided; MC women vs. OC women) = .698; d (one-sided; MC women vs. men) = .756; d (one-sided; OC women vs. men) = .077; ηp2(estradiol status for P4 level by group) = .099; d (MC women vs. OC women) = .718; d (MC women vs. men) = .867; d (OC women vs. men) = .219; ηp2(estradiol status for testosterone level by group) = .681; d (men vs. MC women) = 3.303; d (men vs. OC women) = 3.303; d (MC women vs. OC women) = .598; ηp2(skin conductance responses, day 1: contingency) = .426; ηp2(skin conductance responses, day 1: estradiol status) = .139; ηp2(skin conductance responses, day 1: estradiol status X contingency) = .115; d (diffSCR, men vs. OC women) = .797; d (diffSCR, men vs. MC women) = .665; ηp2(extinction learning, day 1: contingency) = .277; ηp2(extinction learning, day 1: contingency X estradiol status) = .250; d (diffSCR during learning, men vs. OC women) = .937; d (diffSCR during learning, men vs. MC women) = 1.175; d (one-sided; diffSCR during extinction learning vs. fear acquisition) = .353; ηp2(skin conductance responses, day 2: contingency) = .491; ηp2(skin conductance responses, day 2: contingency X extinction status) < .001 (ns); d (diffSCR, extinction learning vs. extinction recall) = .269; d (diffSCR, fear acquisition vs. extinction recall) = .069; ηp2(diffSCR, day 2: estradiol status) = .137; d (one-sided; diffSCR, MC women vs. OC women) = .615; d (one-sided; diffSCR, MC women vs. men) = .879; ηp2(estradiol status, FRI vs. ERI) = .103; d (one-sided; FRI, MC women vs. OC women) = .639; d (one-sided; ERI, MC women vs. OC women) = .547; d (one-sided; FRI, MC women vs. men) = .928; d (one-sided; ERI, MC women vs. men) = .796; ηp2(theta oscillations, electrode) = .119; ηp2(theta oscillations, electrode X contingency) = .730; ηp2(dACC source, contingency effect) = .090; ηp2(dACC source, contingency X extinction status) = .023; ηp2(dACC source, contingency X estradiol status) = .100; d (one-sided; differential theta power in dACC, MC women vs. OC women) = .609; d (one-sided; differential theta power in dACC, MC women vs. men) = .741; ηp2(frontal theta power during extinction learning, contingency factor) = .061. Chen et al.: NA; results replicated by intracranial EEG. Mueller and Pizzagalli: ES=NA; replicated during a fear recall test one year after conditioning. Sperl et al.: ES=NA; results replicated and extended by simultaneous EEG-fMRI. Starita et al.: replicated and extended by reversal learning, main effect of CS type in midcingulate cortex ηp2=.37 [0.08, 0.56].
Fear conditioning - Late Positive Potential. Fear conditioning leads to elevated amplitudes during the time period of the Late Positive Potential (LPP), i.e., a positive-going event-related brain potential (ERP) component that can be recorded using electroencephalography (EEG).
Statistics
- Status: replicated.
- Original paper: ‘Spatio-temporal dynamics of brain mechanisms in aversive classical conditioning: high-density event-related potential and brain electrical tomography analyses’, Pizzagalli et al., 2003; differential fear conditioning paradigm, N=50. [citations=133(GS, March 2023)].
- Critiques:
Bacigalupo & Luck 2018 [n=70, citations=37(GS, March 2023)].
Ferreira de Sá et al. 2019 [n=24, citations=10(Oxford University Press, March 2023)].
Panitz et al. 2015 [n=22, citations=31(GS, March 2023)].
Pastor et al. 2015 [n=48, citations=19(GS, March 2023)].
Pavlov & Kotchoubey 2019 [n=23, citations=12(GS, March 2023)].
Seligowski et al. 2018 [n=83, citations=12(GS, March 2023)].
Stolz et al. 2019 [n=29, citations=34(GS, March 2023)].
Sperl et al. 2021 [n=24, citations=14(GS, March 2023)].
- Original effect size: NA.
- Replication effect size: Bacigalupo and Luck: NA. Ferreira de Sá et al.: d=1.490 and d=1.149. Panitz et al.: d=0.66. Pastor et al.: NA. Pavlov & Kotchoubey: η2=0.41. Seligowski et al.: d=0.336. Stolz et al.: ηp2=0.23. Sperl et al.: NA.
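The replication results above mix Cohen's d with η2 and ηp2, which are not on the same scale. The following is a rough sketch of one common textbook conversion, assuming a between-subject comparison of two equally sized groups. That assumption does not hold for every study cited here (several use within-subject contrasts), so the output is an orientation aid rather than a value reported by any of the papers. The function names are hypothetical.

```python
import math


def eta_squared_to_cohens_f(eta_sq: float) -> float:
    """Convert a (partial) eta squared to Cohen's f: f = sqrt(eta / (1 - eta))."""
    if not 0 <= eta_sq < 1:
        raise ValueError("eta squared must lie in [0, 1)")
    return math.sqrt(eta_sq / (1 - eta_sq))


def cohens_f_to_d_two_groups(f: float) -> float:
    """Assumption: two equally sized independent groups, where d = 2 * f."""
    return 2 * f


# Illustration with the eta-squared values listed in the entry above.
for label, eta_sq in [("Pavlov & Kotchoubey", 0.41), ("Stolz et al.", 0.23)]:
    f = eta_squared_to_cohens_f(eta_sq)
    print(f"{label}: f = {f:.2f}, approximate d = {cohens_f_to_d_two_groups(f):.2f}")
```

Because partial eta squared removes variance explained by other factors, values converted this way tend to look larger than a d computed directly from group means and should be compared with caution.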
Fear conditioning - Bradycardia / heart rate modulation. Fear conditioning leads to heart rate slowing (fear-conditioned bradycardia).
Statistics
- Status: replicated
- Original paper: ‘Conditioned heart rate response in human beings during experimental anxiety’, Notterman et al. 1952; fear conditioning paradigm with heart rate recording, n=20. [citations=122(GS, April 2023)]; see also Notterman et al., 1952b.
- Critiques:
Castegnetti et al. 2016 [n=99, citations=31(Wiley, January 2023)].
Deane & Zeaman 1958 [n=10, citations=51(GS, January 2023)].
Gruss et al. 2016 [n=63, citations=86(Elsevier, January 2023)].
Mueller et al. 2019 [n=86, citations=28(GS, January 2023)].
Panitz et al. 2015 [n=22, citations=27(GS, January 2023)].
Panitz et al. 2018 [n=87, citations=24(GS, January 2023)].
Schipper et al. 2019 [n=104, citations=10(PNAS, January 2023)].
Sperl et al. 2021 [n=24, citations=13(GS, January 2023)].
Thigpen et al. 2017 [n=17, citations=29(GS, January 2023)].
Yin et al. 2018 [n=18, citations=19(GS, January 2023)].
- Original effect size: ES=NA.
- Replication effect size: Castegnetti et al.: NA (replicated). Deane and Zeaman: NA (replicated). Gruss et al.: NA (replicated, heart rate modulation depending on COMT genotype). Panitz et al.: ηp2=.281 (replicated, heart rate modulation depending on COMT genotype). Mueller et al.: study 1: ηp2=.08, study 2: ηp2=.07 (replicated, use of an aversive imagery unconditioned stimulus; paper includes two datasets with independent samples, effect can be replicated). Schipper et al.: NA (replicated, heart rate modulation depending on 5-HTTLPR genotype). Sperl et al.: d=1.17 (replicated). Thigpen et al.: all d’s>.8 (replicated). Yin et al.: NA (replicated).
Bouba/kiki effect (sound symbolism). When presented with sounds (e.g., the words “bouba” and “kiki”) and visual objects (e.g., a curvy shape and a spiky shape), humans make non-arbitrary mappings between sounds and objects (e.g., the curvy shape is consistently called “bouba”).
Statistics
- Status: replicated
- Original paper: ‘Synaesthesia—A window into perception, thought and language’, Ramachandran and Hubbard 2001; two-alternative forced choice task (claim made about ‘95% of participants,’ but actual study not described), n=NA. [citations=2218 (GS, February 2023)].
- Critiques:
Ćwiek et al. 2022 [n=976 (replication), citations=30 (GS, February 2023)].
Fort et al. 2018 [n=425 (meta-analysis), citations=48 (GS, February 2023)].
- Original effect size: NA.
- Replication effect size: Ćwiek et al.: Hedges’ g=0.106 [0.029, 0.154] (calculated). Fort et al.: Hedges’ g=0.163 [0.088, 0.238].
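For readers less familiar with the metric reported in this entry, Hedges' g is Cohen's d multiplied by a small-sample bias correction. The sketch below assumes two independent groups with known means and standard deviations. It is not how the cited replication or meta-analysis derived their values (that is not described here), and the function name and input numbers are purely illustrative.

```python
import math


def hedges_g(mean1: float, mean2: float, sd1: float, sd2: float,
             n1: int, n2: int) -> float:
    """Standardized mean difference with a small-sample correction."""
    df = n1 + n2 - 2
    # Pooled standard deviation across both groups.
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    cohens_d = (mean1 - mean2) / pooled_sd
    # Widely used approximation of the correction factor J.
    correction = 1 - 3 / (4 * df - 1)
    return cohens_d * correction


# Hypothetical data: two groups of 20 with equal spread.
print(round(hedges_g(10.0, 8.0, 2.0, 2.0, 20, 20), 3))
```

The correction shrinks d slightly and matters most when groups are small; with several hundred participants g and d are nearly identical.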
Human freezing behaviour (postural sway). A physiological response to a perceived threat. Overall, freezing-like behaviour has been replicated in multiple studies across different species and contexts, and is considered a robust and reliable measure of fear and anxiety.
Glucose amplification of cortisol stress reactivity. After fasting, the administration of glucose prior to psychosocial stress or a nicotine challenge led to an increased cortisol stress response (in comparison to water administration). Blood glucose levels were positively associated with the cortisol stress response triggered by the Trier Social Stress Test (TSST).
Statistics
- Status: mixed
- Original paper: ‘Effects of Fasting and Glucose Load on Free Cortisol Responses to Stress and Nicotine’, Kirschbaum et al. 1997; laboratory experiment, administration of glucose (100 g) prior to the Trier Social Stress Test (TSST) or smoking two cigarettes, TSST: N = 25, smoking: N = 12. [citations=178(GS, April 2023)].
- Critiques:
Bentele et al. 2021 [n=122, citations=10(GS, April 2023)].
Gonzalez-Bono et al. 2001 [n=37, citations = 158 (GS, April 2023)].
Meier et al. 2022 [n=152, citations=5(GS, April 2023)].
Rüttgens and Wolf 2022 [n=72, citations=1(GS, April 2023)].
Von Dawans et al. 2021 [n=151, citations=14(GS, April 2023)].
Zänkert et al. 2020 [n=103, citations=27(GS, April 2023)].
- Original effect size: NA.
- Replication effect size: Bentele et al.: NA, replicated the enhancing effect of glucose administration on the cortisol stress response to the TSST in females. Gonzalez-Bono et al.: NA, but replicated both the increased cortisol stress response after glucose consumption and the significant association between blood glucose levels and the cortisol stress response. Meier et al.: NA, replicated the enhancing effect of glucose administration on the cortisol stress response to the TSST in females; no correlation between blood glucose levels and the cortisol stress response. Rüttgens and Wolf: NA, could neither replicate the enhancing effect of glucose on the cortisol response to a socially-evaluated cold pressor test (SE-CPT), nor did they find a correlation between blood glucose levels and the cortisol stress response. Von Dawans et al.: ηp2=0.042, replicated the increased cortisol stress response after glucose administration in response to the TSST and the cold pressor test (CPT), yet they note that the effect seems to be descriptively stronger for the psychosocial stressor (TSST). Zänkert et al.: η2=0.077–0.082, replicated the increased cortisol stress response after grape juice and glucose administration.
Resting-state functional connectivity patterns can accurately classify individuals diagnosed with depression. Multivariate pattern analyses can identify patterns of resting-state functional connectivity that successfully differentiate individuals with depression from healthy controls. This finding demonstrates the potential utility of resting-state functional connectivity as a biomarker of depression.
Statistics
- Status: mixed
- Original paper: ‘Disease state prediction from resting state functional connectivity’, Craddock et al. 2009; quasi-experimental design, ncontrols = 20, nclinical = 20. [citations=464(GS, April 2023)].
- Critiques:
Bhaumik et al. 2017 [ncontrols = 29, nclinical = 38, citations=58 (GS, April 2023)].
Cao et al. 2014 [ncontrols = 37, nclinical = 39, citations=51 (GS, April 2023)].
Guo et al. 2014 [ncontrols = 27, nclinical = 36, citations=73 (GS, April 2023)].
Lord et al. 2012 [ncontrols = 22, nclinical = 21, citations=177 (GS, April 2023)].
Ma et al. 2013 [ncontrols = 29, nclinical = 24, citations=89 (GS, April 2023)].
Qin et al. 2015 [ncontrols = 29, nclinical = 24, citations=38 (GS, April 2023)].
Ramasubbu et al. 2016 [ncontrols = 19, nclinical = 45, citations=51 (GS, April 2023)].
Sundermann et al. 2017 [ncontrols = 180/60 (whole sample/severe symptoms only), nclinical = 180/60, citations=17 (GS, April 2023)].
Yu et al. 2013 [ncontrols = 38, nclinical = 19, citations=69 (GS, April 2023)].
Zeng et al. 2012 [ncontrols = 29, nclinical = 24, citations=753 (GS, April 2023)].
Zeng et al. 2014 [ncontrols = 29, nclinical = 24, citations=169 (GS, April 2023)].
- Original effect size: 62.5–95% classification accuracy (cross-validation; CV); 16.7–83.3% (hold-out validation).
- Replication effect size: Bhaumik et al.: 76.1% (CV); 77.8% (hold-out validation). Lord et al.: 99.3% (CV). Zeng et al./Zeng et al./Ma et al./Qin et al.: 69.8–96.2% (CV). Yu et al.: 80.9% (CV). Guo et al.: 90.5% (CV). Cao et al.: 84.2% (CV). Ramasubbu et al.: 49–66% (CV); mixed, only significant in group with most severe symptoms. Sundermann et al.: no significant results in main analysis on whole sample (ES not reported); only significant in group with most severe symptoms, 40.8–65.0% (CV), 54.2–61.7% (hold-out validation).
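Most samples in this entry are small, so a single accuracy figure carries considerable uncertainty. As a rough illustration, the sketch below computes a 95% Wilson score interval for a classification accuracy. This technique is introduced here for orientation only and is not taken from the cited studies. It treats each participant as an independent trial, which is only approximately true for cross-validated predictions, and the counts (25 and 38 correct out of 40) are assumptions derived from the 62.5% and 95% figures and the original sample of 40.

```python
import math


def wilson_interval(n_correct: int, n_total: int, z: float = 1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if n_total <= 0 or not 0 <= n_correct <= n_total:
        raise ValueError("need 0 <= n_correct <= n_total and n_total > 0")
    p = n_correct / n_total
    denom = 1 + z ** 2 / n_total
    center = (p + z ** 2 / (2 * n_total)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / n_total + z ** 2 / (4 * n_total ** 2)
    ) / denom
    return center - half_width, center + half_width


# Assumed counts: 25/40 = 62.5% and 38/40 = 95% accuracy.
for n_correct in (25, 38):
    low, high = wilson_interval(n_correct, 40)
    print(f"{n_correct}/40 correct: {low:.1%} to {high:.1%}")
```

With samples of this size, an interval can span chance level or overlap substantially with the accuracies reported by other studies, which is one reason the status of this effect is hard to settle from individual small studies.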