The teacher didn’t reply, and docked Ostovitz’s grade.
Ostovitz’s mother, Stephanie Rizk, says her daughter is a high-achieving student who cares about doing well at school, and she was alarmed when the teacher jumped to conclusions about Ostovitz’s work so early in the school year.
“Get to know their level of skill, and then maybe your AI detector is useful,” Rizk says.
Rizk told NPR she met with the teacher in mid-November and the teacher said they never saw her daughter’s message.
The school district, Prince George’s County Public Schools, made clear in a statement that Ostovitz’s teacher used an AI detection tool on their own and that the district doesn’t pay for this software.
“During staff training, we advise educators not to rely on such tools, as multiple sources have documented their potential inaccuracies and inconsistencies,” the statement said.
PGCPS declined to make Ostovitz’s teacher available for an interview. Rizk told NPR that after their meeting, the teacher no longer believed Ostovitz used AI.
But what happened to Ostovitz isn’t surprising.
More than 40% of surveyed sixth- to 12th-grade teachers used AI detection tools during the last school year, according to a nationally representative poll by the Center for Democracy and Technology, a nonprofit that advocates for civil rights and civil liberties in the digital age.
That’s despite numerous research studies showing that AI detection tools are far from reliable.
“It’s now fairly well established in the academic integrity field that these tools are not fit for purpose,” says Mike Perkins, a leading researcher on academic integrity and AI at British University Vietnam.
Perkins found that some of the most popular AI detectors (including Turnitin, GPTZero and Copyleaks) flagged some things as AI that weren’t, and vice versa. Their accuracy rates dropped even further when AI text was manipulated to appear more human.
“We saw some really concerning problems with some of the most prolific AI text detection tools,” he says.
Despite these problems, NPR found that school districts from Utah to Ohio to Alabama are spending thousands of dollars on these tools.
Why one of the nation’s largest districts uses AI detection software
Near Miami, Broward County Public Schools is spending more than $550,000 on a three-year contract with Turnitin. The long-standing ed-tech company has historically provided schools with plagiarism detection software; in 2023, it launched an AI detection feature. When educators put student work through this tool, it generates a percentage, which reflects the amount of text the software determines was likely generated by AI. One caveat: According to the company, scores of 20% or lower are less reliable.
“The Turnitin tool is something that helps us facilitate conversation and feedback, not grading,” says Sherri Wilson, director of innovative learning for the Broward school district, which enrolls more than 230,000 students and is one of the largest school districts in the country.
Wilson says the district is “completely aware” of the research showing AI detection tools, including Turnitin, aren’t 100% accurate or reliable.
Turnitin also acknowledges this: On the company’s website, it says, “our AI writing detection may not always be accurate … so it should not be used as the sole basis for adverse actions against a student.”
Turnitin wrote in a statement to NPR that it’s more important to avoid falsely accusing students of cheating than to catch all AI writing.
Wilson says the Turnitin tool is still useful because it saves teachers time by quickly scanning student work for suspected AI use.
Another reason that Broward teachers have access to the tool, Wilson says, is that the district participates in academic programs, such as International Baccalaureate, or IB, in which student work must be authenticated by teachers before it’s sent out for external assessment.
Both of the programs Broward offers, IB and International Education at Cambridge, told NPR that schools are not required to use AI detection software as part of the authentication process. However, Broward told NPR in a statement, “we have chosen to provide our teachers with [Turnitin] as one of the tools to meet the requirements.”
But Wilson says teachers, not the AI detection tool, are the ultimate authority on whether a student’s work is their own.
“They’re using these tools as feedback to then have those teachable moments with students,” she says.
Why one teacher uses AI detection tools
Language and literature teacher John Grady says, for him, AI detection tools provide “a jumping off point” to start a conversation with a student who may have used AI.

“It’s certainly not foolproof,” he says. “But it gives you something to hang your hat on.”
Grady teaches at Shaker Heights High School, part of the Shaker Heights City School District outside Cleveland. The district serves roughly 4,400 students, and is paying GPTZero, another AI detection software company, about $5,600 this year for annual licenses for 27 of the district’s teachers. The tool calculates a percentage chance that a student’s work is AI-generated.
Grady says he puts all student essays through GPTZero; if the tool shows more than a 50% chance AI was used for the assignment, Grady digs deeper. That includes using revision history tools to see how much time a student spent on an assignment, and how many edits they made during the writing process. If it appears that a student made only a few edits and spent hardly any time writing, he’ll check in with that student.
“And I’ll say, ‘Hey, this flagged. Can you talk to me about why?’ I’d say the bulk of the time, like 75%, if it was AI, they’d be like, ‘Yeah, I did.’ And I’m like, ‘OK, well now you’ve got to rewrite it with less credit,’” Grady says.
Edward Tian, co-founder and CEO of GPTZero, says this is how educators should be using his company’s tool.
“We definitely don’t believe this is a punishment tool,” Tian says. “This should be a tool in the toolkit and not the final smoking gun.”
He says it’s important to know that a GPTZero probability score below 50% means it’s more likely the text was human-written than AI-generated. He says scores over 50% warrant closer examination, like what Grady describes.
Tian doesn’t dispute the research that shows GPTZero isn’t always reliable. But he notes that there are educators, like Grady, who still find it useful for the information it provides.
He says that tools like his offer a “signal on what’s happening in your classroom” but that teachers should always follow up with students if that signal shows something concerning.
The AI detection skeptics
Shaker Heights junior Zi Shi, whose first language is Mandarin, says his writing style can sometimes look like AI “because of the repetition of words I use. I feel like it’s because of how limited my vocabulary is.”
Shi, who isn’t a student of Grady’s, says he’s still working on his writing skills and he’s concerned that AI detection software may be biased against non-native English speakers like himself.
Some educators share this concern, though the research so far is limited and contradictory.
Shi says an assignment he completed for his English class earlier this fall was flagged by GPTZero as possibly AI-generated. He says his teacher suggested that his use of an online tool called Grammarly may have triggered the detection software. Grammarly uses AI to correct grammar and, if prompted, generate text. (The teacher confirmed Shi’s account with NPR.)
Shi says he only used Grammarly to clean up his writing and that he wrote the assignment himself. “It was definitely disappointing to see the comment of it being flagged as AI,” Shi says.
Shi thinks AI detectors should be considered a “smoke alarm, where it’s a sign, or warning. But, you know, sometimes it could be like a false alarm.”
He questions whether the school district should be spending thousands of dollars on AI detection software. He says that money could be better spent on professional development for teachers.
Carrie Cofer, a high school English teacher in the Cleveland Metropolitan School District, just a few miles from Shaker Heights, shares that view.
Last year, as an experiment, she uploaded a chapter of her Ph.D. dissertation into GPTZero. “And it came up with like 89% or 91% AI-written, and I’m like, ‘Oh, no, I don’t think that’s right, because it was all mine,’” Cofer says.

Cofer is helping her district shape its AI policy and guidelines; she says Cleveland schools don’t currently pay for AI detection software and she’d advocate against it.
“I don’t think it’s an efficacious use of their money,” Cofer says. “The kids are going to get around it somehow.”
Some workarounds that students may turn to include using AI detection software themselves, to workshop assignments so they don’t get flagged, and using “AI humanizer” programs, which claim to make AI-generated writing appear more human.
Ultimately, she says, teachers will need to adapt to AI by changing how they teach and assess student learning.
Back in Maryland, high school junior Ailsa Ostovitz is also adapting. She now runs all her homework assignments through multiple AI detection tools before she turns them in.
The writing is her own, she says, but she’ll rewrite sentences the software identifies as possibly AI-generated, an extra step that adds about half an hour to each assignment.
“I think I’ve definitely become more vigilant about presenting my work as mine and not AI,” she explains.
She doesn’t want to take any chances.
This reporting was supported by a grant from the Tarbell Center for AI Journalism.
