Teachers Are Using the Wrong Tool to Fight AI
23 AI prompts for educators, researchers, and learners — plus a free tool to audit your assessments for "integrity debt". AI detectors don't work. These do.
There’s one space where the adaptation to AI is happening slowly, and where the stakes of getting it wrong are enormous. Education.
The students in classrooms right now are the next generation entering the workforce. The people who will treat you, represent you, build the companies you work for, make the decisions that affect your life.
That workforce is more unstable than it’s been in decades. Roles are being restructured, entire job categories are shrinking, and AI is already doing work that used to require a degree.
Yet how these students are taught today decides whether they graduate ready for what’s out there.
Everyone in education is trying to figure out what to do about this. Universities, schools, training programs.
Students are using AI. Teachers know it. The response, almost everywhere, has been to try to detect it, flag it, and punish it.
The conversation stays stuck on "how do we catch students" instead of moving to the question that matters more: how should teaching adapt now that these AI tools exist?
I have friends who are professors. Most of them are underpaid, both for what they do and for how much it matters. They face pressure to publish, full teaching loads, and administrative work that never stops. And then AI shows up, and the curricula they spent years designing, the assessment briefs, the grading rubrics, all of it can now be completed by a tool in minutes. And the result is good enough to pass.
That’s the reality. Whether we like it or not. And it’s not a problem that goes away by catching more students using AI to cheat on their assignments or by banning it.
In this article:
23 AI prompts for educators (inside the AI Blew My Mind MCP)
Expertise, reimagined with AI — featuring Dr Sam Illingworth
The real question for education isn’t whether AI belongs in the classroom. It’s how to redesign learning so that students still develop real skills, real thinking, real judgment, in a world where AI is a permanent part of the toolkit.
23 AI prompts for educators (inside the AI Blew My Mind MCP)
That's a question I think about a lot. Education has been part of my work for years, from designing learning programs to building tools for students.
And it’s why I built an entire teaching, learning, and thinking prompt library inside the AI Blew My Mind MCP: 23 prompts for educators, researchers, and learners. Dr Sam Illingworth also contributed some of his own prompts to the collection.
Now, let me hand it over to Sam. I invited him to write about what he’s seeing from inside academia and how he's approaching it.
Expertise, reimagined with AI — featuring Dr Sam Illingworth
This is Expertise, reimagined with AI. Professionals with decades of experience from the AI Blew My Mind community who chose to reimagine their work instead of protecting it. You might be one of them.
This monthly series brings those stories forward.
Dr Sam Illingworth is a Full Professor of Creative Pedagogies, a poet, and the founder of Slow AI, a Substack publication about critical AI literacy with over 13,000 subscribers. His whole approach is about knowing when AI helps and when to leave it alone. He co-authored a book on GenAI in higher education published by Bloomsbury, and he’s building a 12-month curriculum to help professionals develop real AI judgment, not just AI skills.
Now, over to Sam.
Stop Catching Students. Start Fixing the Assignments.
Why AI detectors are failing teachers
I am a Full Professor at a UK university. Almost every colleague I know is terrified of AI.
Not terrified of the technology itself. Terrified that students are using it to cheat, and that they cannot tell. The solution most institutions have landed on is AI detection software. Turnitin’s AI detector is now built into most university submission portals. Lecturers run student essays through GPTZero. Some departments have started requiring handwritten exams again.
Here is the problem: AI detectors do not work.
They produce false positives at alarming rates. One study found a 61.3% false positive rate for non-native English speakers. That means the student who learned English as a second language and worked hardest on their essay is the one most likely to be accused of cheating.
And they will only get worse. Every model update makes AI-generated text harder to detect. The detectors are always one step behind, and the gap is widening.
I spent months watching this arms race. Then I asked a different question.
The question nobody is asking
If a machine can pass your assessment, whose failure is that?
It is not the student’s. It is the curriculum’s.
A standard essay brief that asks a student to “critically evaluate the role of X in Y with reference to key literature” can be completed by any frontier AI model in under two minutes. The output will score a comfortable 2:1. It will cite real sources. It will follow the marking rubric. It will be undetectable.
The essay was always testing whether a student could produce a polished written output on demand. We just did not notice until a machine could do it too.
I call this integrity debt. It is the gap between what an assessment claims to measure and what it actually requires. Most university assessments have been accumulating integrity debt for decades. AI did not create the problem. It exposed it.
What “integrity debt” means (a framework, not a detector)
Instead of building another detector, I built an audit tool. It asks ten questions about an assessment brief, and each one identifies a specific vulnerability to AI automation.
Here are the ten categories:
Final product weighting. Is all the credit on the final submission, or is the process assessed too?
Iterative documentation. Must students show the messy middle: drafts, rejected ideas, annotations?
Contextual specificity. Is the task tied to something local, specific, and inaccessible to AI?
Reflective criticality. Does it require genuine personal synthesis, not generic “I learned that…” reflection?
Temporal friction. Could a student plausibly complete this in one sitting with AI?
Multimodal evidence. Are there non-text outputs: audio, physical models, hand-drawn work?
Explicit AI interrogation. Must students critically analyse AI outputs as part of the task?
Real-time defence. Is there a live component: a viva, a presentation, unscripted Q&A?
Social and collaborative labour. Does it require verified group work with peer accountability?
Data recency. Does it require engagement with events or data from the past fortnight?
Each category is scored from 1 (resilient) to 5 (vulnerable), so totals run from 10 to 50. A total of 10 means your assessment is designed for human learning. A total of 50 means AI can do the whole thing.
Most assessments I have tested land between 35 and 45. That is not a student problem. That is a design problem.
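The arithmetic behind that total is simple enough to sketch in a few lines of Python. This is an illustration of the scoring rule described above, not the audit tool's actual code: the function name `total_integrity_debt` and the constant names are assumptions made for the example.

```python
from typing import Mapping

# The ten audit categories, as listed in the article.
CATEGORIES = [
    "Final product weighting",
    "Iterative documentation",
    "Contextual specificity",
    "Reflective criticality",
    "Temporal friction",
    "Multimodal evidence",
    "Explicit AI interrogation",
    "Real-time defence",
    "Social and collaborative labour",
    "Data recency",
]

MIN_SCORE, MAX_SCORE = 1, 5  # 1 = resilient, 5 = vulnerable


def total_integrity_debt(scores: Mapping[str, int]) -> int:
    """Sum the ten category scores after checking each is present and in range.

    Hypothetical helper for illustration; not taken from the real tool.
    """
    missing = [c for c in CATEGORIES if c not in scores]
    if missing:
        raise ValueError(f"Missing categories: {missing}")
    for category in CATEGORIES:
        value = scores[category]
        if not MIN_SCORE <= value <= MAX_SCORE:
            raise ValueError(f"{category}: score {value} is outside 1-5")
    return sum(scores[c] for c in CATEGORIES)


# A brief that scores 4 in every category lands inside the 35-45 band.
example = {category: 4 for category in CATEGORIES}
print(total_integrity_debt(example))  # prints 40
```

Rejecting missing or out-of-range scores up front keeps the total meaningful: a total of 10 or 50 is only interpretable if all ten categories were actually scored.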
Building the tool: from vibe coding to Claude Code
I wanted a tool that any educator could use without technical knowledge. Upload your assessment brief, get a report telling you where the vulnerabilities are and what to do about them.
I started building it with vibe coding in Google’s Gemini. If you have not tried vibe coding, it is exactly what it sounds like: you describe what you want in plain language and the AI writes the code. I had a working prototype within a few hours. It looked good. It ran. I felt productive.
Then I hit the walls.
The prototype worked for simple cases but broke on edge cases: scanned PDFs, login-protected university pages, documents that were too short to analyse properly. Every time I asked Gemini to fix something, it would introduce a new problem somewhere else. I was copy-pasting code I did not understand, and when something went wrong I had no idea where to look.
So I moved to Claude Code. This is a command-line tool. No visual interface. You work in a terminal. It is less comfortable than vibe coding in a browser, and that discomfort turned out to be the point.
Working in the terminal forced me to understand the file structure. I had to read error messages. I had to make decisions about architecture rather than accepting whatever the AI suggested. Claude Code could write the code, but I had to direct it. The relationship shifted from “build this for me” to “build this with me.”
The irony is not lost on me. I was building a tool about the dangers of outsourcing thinking to AI, and the process of building it taught me exactly how that outsourcing happens. Vibe coding felt productive. It was fast. But I was not learning anything. The moment I moved to an environment with more friction, I started understanding what I was building.
The finished tool uses Streamlit (a Python web framework), Google’s Gemini for the analysis, and generates a detailed PDF report for each audit. It handles file uploads, text paste, and public URLs. It has rate limiting, prompt injection protection, and detects when someone tries to upload a scanned PDF or a login-protected university page. None of those features existed in the vibe-coded prototype. All of them came from understanding the problems well enough to solve them properly.
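To give a flavour of what those guardrails can look like, here is a minimal sketch in plain Python. It is not the tool's real code. It assumes the text has already been extracted from the upload, and every name and threshold in it (`check_brief_text`, `SlidingWindowRateLimiter`, `MIN_WORDS`, `LOGIN_HINTS`) is hypothetical.

```python
import time
from collections import deque
from typing import List, Optional

MIN_WORDS = 50  # assumed threshold, not the tool's real value
LOGIN_HINTS = ("sign in", "log in", "single sign-on", "password")  # assumed markers


def check_brief_text(text: str) -> List[str]:
    """Return human-readable problems found in the extracted brief text."""
    problems = []
    words = text.split()
    if not words:
        # A PDF that yields no text at all is often a scanned image.
        problems.append("No text extracted; the file may be a scanned PDF.")
        return problems
    if len(words) < MIN_WORDS:
        problems.append(f"Only {len(words)} words; too short to analyse.")
    lowered = text.lower()
    # Short pages that mention logging in are probably a login wall.
    if len(words) < 200 and any(hint in lowered for hint in LOGIN_HINTS):
        problems.append("Text looks like a login page, not an assessment brief.")
    return problems


class SlidingWindowRateLimiter:
    """Allow at most `limit` calls within any `window` seconds."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self.calls = deque()  # timestamps of recent accepted calls

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Drop timestamps that have fallen out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) >= self.limit:
            return False
        self.calls.append(now)
        return True
```

In a sketch like this, the checks run before any text is sent to the model, so a scanned PDF or a login wall produces a clear message instead of a meaningless audit.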
What the audit reveals about your assessments
You upload an assessment brief. The tool scores it across all ten categories and returns:
A total integrity score with a vulnerability rating
A breakdown of each category: what scored well, what did not, and why
A downloadable PDF report with specific, actionable improvements for each vulnerable category
The PDF is designed for department meetings. You can take it to your programme leader and say: here is where our assessments are exposed, and here is what we change.
You can try it yourself here. There are two example briefs included: a traditional essay (which scores badly) and a portfolio with viva (which scores well). Upload both and compare the reports.
The danger of not using AI in education
If I had refused to engage with AI tools, I would never have built this. I would still be in a department meeting arguing about which detector to buy. The tool uses AI to identify where AI breaks your assessment. It turns the technology back on itself.
But more than that: the process of building it, moving from vibe coding to something more deliberate, taught me how AI actually works at a practical level. I understand prompt engineering not because I read a tutorial but because I had to debug a broken JSON response at midnight. I understand token limits because my PDF reports kept getting truncated. I understand hallucination because the AI confidently generated assessment categories that did not exist in my framework.
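Two of those lessons, broken JSON and invented categories, come down to never trusting a model's reply until it has been checked. A minimal sketch of that check follows. It assumes the model is asked to return a JSON object mapping each category name to an integer score. That response shape, and the function name `parse_audit_response`, are assumptions for illustration, not the tool's real interface.

```python
import json

# The ten categories in the framework; anything else is treated as invented.
ALLOWED_CATEGORIES = {
    "Final product weighting",
    "Iterative documentation",
    "Contextual specificity",
    "Reflective criticality",
    "Temporal friction",
    "Multimodal evidence",
    "Explicit AI interrogation",
    "Real-time defence",
    "Social and collaborative labour",
    "Data recency",
}


def parse_audit_response(raw: str) -> dict:
    """Parse a model reply and reject anything that does not fit the framework.

    Hypothetical helper; the assumed reply shape is {"<category>": <1-5>, ...}.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Covers truncated replies, e.g. when an output limit cuts the JSON short.
        raise ValueError(f"Model did not return valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object mapping category to score")
    unknown = sorted(set(data) - ALLOWED_CATEGORIES)
    if unknown:
        raise ValueError(f"Unknown categories (possible hallucination): {unknown}")
    missing = sorted(ALLOWED_CATEGORIES - set(data))
    if missing:
        raise ValueError(f"Missing categories: {missing}")
    for category, score in data.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(score, bool) or not isinstance(score, int):
            raise ValueError(f"{category}: score must be an integer")
        if not 1 <= score <= 5:
            raise ValueError(f"{category}: score {score} is outside 1-5")
    return data
```

A caller could catch the `ValueError` and ask the model again, or surface the message to the user. Either way, a reply with an eleventh category or a half-finished object never reaches the report.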
The professors who refuse to touch AI are not protecting their integrity. They are protecting their ignorance. And their students are paying for it.
The answer is not to use AI uncritically. That is how you get vibe-coded prototypes that break on the first edge case. The answer is to use AI deliberately, with enough friction to understand what it is doing, and enough critical distance to spot where it fails.
That is what the Integrity Debt Audit does for assessments. And that is what building it did for me.
Where to start if you’re an educator
If you work in education and want to start:
Upload one of your current assessments to the Integrity Debt Audit. See what it says.
Look at the categories that scored highest (most vulnerable). Pick one and redesign it.
Share the report with your team. The PDF is designed for this.
Stop buying detectors. Spend the money on paying students to workshop the results with you and help create assessments that are authentic, engaging, and resilient.
The fear is real. The detectors are not the answer. The curriculum is.
Your turn
Dr Sam Illingworth’s Integrity Debt Audit is free to use. If you work in education, upload one of your assessments and see what it says. The link is here.
If you want to go deeper, Sam writes about all of this on Slow AI, his Substack on critical AI literacy.
And the 23 teaching prompts I mentioned at the beginning are all live inside the AI Blew My Mind (AIBMM) MCP. Just set up the MCP (it takes two minutes) and start using them, along with 40+ more prompts for writing, marketing, business, operations, and more.
If you’re an educator, or you work with people in education, we want to hear from you.
What’s the hardest part of adapting to AI in your work right now? What would help?
If this resonated, share it with someone who needs to read it.
This post is free. If you found it useful and want access to more of what I’m building - prompts, automations, step-by-step guides - paid subscribers get all of it.