The SAIL Challenge

Developing Judgment Through Structured AI Use

The Problem

Students can use AI. They struggle to evaluate it.

Each time a student accepts AI output without critical evaluation, they accumulate what researchers call cognitive debt — the gap between apparent competence (what AI produces) and actual competence (what they understand). The debt compounds. Eventually, when AI makes a mistake — and it will — they lack the capacity to catch it.

Research Evidence

An EEG study at MIT's Media Lab, released as a 2025 preprint, found that participants who used an LLM for essay writing showed the weakest brain connectivity and the lowest sense of essay ownership of the three groups studied (LLM, search engine, and no tools). The authors interpret this as evidence of cognitive debt accumulating over repeated sessions.

Kosmyna, N., et al. (2025). "Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task." MIT Media Lab.

The solution is not to restrict AI use. It is to structure AI use so that judgment develops alongside it.

The SAIL Challenge

The SAIL Challenge is a structured learning experience that develops critical thinking through deliberate comparison between human judgment and AI output. Students work through three phases on a single problem:

1. Foundation: Analyze without AI
2. Integration: Collaborate with AI
3. Leadership: Own the decision

The power is in the comparison. By establishing their own thinking first (Phase 1), then engaging AI (Phase 2), students can see clearly where AI adds value, where it falls short, and where their judgment must prevail. Phase 3 requires them to own the final decision — AI cannot answer "where did you override AI?"

Why This Works

The SAIL Challenge design is grounded in decades of learning research, from Flavell (1979) to Sinha and Kapur (2021):

Comparison Creates Learning

When learners compare contrasting cases, they notice critical distinctions they would otherwise miss. This creates what Schwartz and Bransford (1998) call "a time for telling": a readiness to learn. A meta-analysis of 57 experiments confirmed that case comparison activities produce significantly greater learning outcomes than other forms of study (d = 0.50). Effect sizes were largest when explanatory principles emerged after comparison (d = 1.18), which is exactly the sequence the SAIL Challenge uses.

Alfieri, L., Nokes-Malach, T. J., & Schunn, C. D. (2013). "Learning through case comparisons: A meta-analytic review." Educational Psychologist.
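
For readers unfamiliar with effect sizes such as the d = 0.50 reported above, the following is a minimal sketch of how Cohen's d is commonly computed from two groups' scores. The function name `cohens_d` and the score lists are hypothetical, invented purely for illustration. They are not data from the meta-analysis, and the cited authors may have used a different estimator or correction.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)  # sample SDs (n - 1 denominator)
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd


# Hypothetical post-test scores, invented for illustration only.
comparison_group = [2, 4, 6]   # learners who compared contrasting cases
other_study_group = [1, 3, 5]  # learners who studied another way

d = cohens_d(comparison_group, other_study_group)
print(f"d = {d:.2f}")  # prints "d = 0.50"
```

Read this way, an effect of 0.50 means the comparison group's average sits half a pooled standard deviation above the other group's average.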

Productive Failure

Attempting to solve problems before instruction — even when failing — improves conceptual understanding and transfer. A 2021 meta-analysis of 53 studies found that problem-solving before instruction significantly outperformed the reverse sequence (g = 0.36). When implemented with high fidelity, effects were even stronger (g = 0.37–0.58). Phase 1 of the Challenge operationalizes this principle.

Sinha, T., & Kapur, M. (2021). "When problem solving followed by instruction works: Evidence for productive failure." Review of Educational Research.

Desirable Difficulties

Learning conditions that appear to slow immediate performance often enhance long-term retention and transfer. Requiring students to think before AI assists creates productive struggle that strengthens learning.

Bjork, R. A., & Bjork, E. L. (2011). "Making things hard on yourself, but in a good way." In Psychology and the real world.

Metacognition

Awareness and regulation of one's own thinking processes can be developed through structured reflection. Phase 3's requirement to articulate "where did you change your mind?" builds metacognitive capacity.

Flavell, J. H. (1979). "Metacognition and cognitive monitoring." American Psychologist.

Transfer of Learning

Knowledge taught in a single context is less likely to transfer than knowledge practiced across multiple contexts. The Challenge structure remains constant while cases vary by discipline — enabling both low-road (automatic) and high-road (mindful) transfer.

Perkins, D. N., & Salomon, G. (1988). "Teaching for transfer." Educational Leadership.

Critical Thinking as Judgment

Critical thinking is "purposeful, self-regulatory judgment" that integrates both cognitive skills (analysis, evaluation) and dispositional qualities (open-mindedness, truth-seeking). The Challenge develops both dimensions.

Facione, P. A. (1990). The Delphi Report. American Philosophical Association expert consensus.

Connection to SAIL

Across its phases, the Challenge exercises the four SAIL competencies:

S (Social Intelligence): Phase 3 requires communicating your reasoning clearly in the judgment memo
A (AI Literacy): Phase 2 requires identifying AI's strengths, limitations, and errors in your specific context
I (Innovation/Inquiry): Phase 2 requires questioning AI output, not accepting it as an oracle
L (Leadership): Phase 3 requires taking responsibility for the final decision and owning the outcome

Critical thinking is not a fifth pillar. It is the connective tissue that makes the other four function. The Greek root kritikos means "able to discern." The SAIL Challenge develops discernment through structured practice.

References

Alfieri, L., Nokes-Malach, T. J., & Schunn, C. D. (2013). Learning through case comparisons: A meta-analytic review. Educational Psychologist, 48(2), 87-113.

Bjork, R. A., & Bjork, E. L. (2011). Making things hard on yourself, but in a good way: Creating desirable difficulties to enhance learning. In M. A. Gernsbacher et al. (Eds.), Psychology and the real world (pp. 56-64). Worth Publishers.

Bransford, J. D., & Schwartz, D. L. (1999). Rethinking transfer: A simple proposal with multiple implications. Review of Research in Education, 24, 61-100.

Facione, P. A. (1990). Critical thinking: A statement of expert consensus for purposes of educational assessment and instruction (The Delphi Report). California Academic Press.

Flavell, J. H. (1979). Metacognition and cognitive monitoring: A new area of cognitive-developmental inquiry. American Psychologist, 34(10), 906-911.

Kosmyna, N., et al. (2025). Your brain on ChatGPT: Accumulation of cognitive debt when using an AI assistant for essay writing task. arXiv preprint arXiv:2506.08872. MIT Media Lab.

Perkins, D. N., & Salomon, G. (1988). Teaching for transfer. Educational Leadership, 46(1), 22-32.

Schwartz, D. L., & Bransford, J. D. (1998). A time for telling. Cognition and Instruction, 16(4), 475-522.

Sinha, T., & Kapur, M. (2021). When problem solving followed by instruction works: Evidence for productive failure. Review of Educational Research, 91(5), 761-798.