For this workflow, the detector just needs to be accurate enough to flag a manuscript or a set of reviewer comments for human review. If the text was AI-assisted and the authors disclosed that, fine. If not, it is fair to ask what else the authors or reviewers might be dishonest about.
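A minimal sketch of what that triage step might look like, assuming a hypothetical detector score and a disclosure flag from the submission metadata (all names here are illustrative, not any real detector's API):

```python
def triage(detector_score: float, ai_disclosed: bool, threshold: float = 0.8) -> str:
    """Route a submission based on a hypothetical AI-detector score.

    The detector need not be perfect; it only has to be accurate
    enough to decide which submissions a human should look at.
    """
    if detector_score < threshold:
        return "proceed"        # not flagged: normal handling
    if ai_disclosed:
        return "proceed"        # flagged but disclosed: fine
    return "human_review"       # flagged and undisclosed: escalate
```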
The detection AI agrees reasonably well with human judgement: "The study also found that submissions in 2025 with abstracts flagged by Pangram were twice as likely to be rejected by journal editors before peer review as were those not flagged by the tool. The desk-rejection rate was higher for manuscripts flagged for AI-generated text in the methods section."
A human typist or graduate student who types, incorporates edits, and otherwise revises a manuscript improves both themselves and the manuscript: with each iteration of one paper, or across a series of papers, they discern which meanings were and were not intended, even when using Microsoft Word's grammar and spelling checkers, Grammarly, or similar tools. There are AIs that can learn from a user's revisions; using those would be more helpful than asking an AI to generate text and then revising it.
The summary also notes that AI-flagging rates were higher among papers from countries where English is not the native language. Under the older process, a manuscript by authors whose first language is not English would be sent to an English-language editor during drafting, and that editor would ask helpful questions about meaning, ambiguities, consistency of style, logical flow, and so on. AI tools are starting to do that.
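One way to get that question-asking behavior, rather than wholesale rewriting, is to instruct the model explicitly. A hedged sketch of such a prompt, with the actual API call left out since it varies by provider (the prompt wording and function name are my assumptions):

```python
EDITOR_PROMPT = """You are an English-language editor for a scientific manuscript.
Do not rewrite the text. Instead, list questions for the authors about:
- passages whose intended meaning is ambiguous,
- inconsistencies of style or terminology,
- gaps or jumps in the logical flow.
Quote the passage each question refers to."""

def build_editor_request(manuscript_text: str) -> list[dict]:
    """Assemble a chat-style request for any LLM API that accepts
    role/content message lists; the provider-specific client call
    is deliberately omitted."""
    return [
        {"role": "system", "content": EDITOR_PROMPT},
        {"role": "user", "content": manuscript_text},
    ]
```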
For reviews, a human reviewer, native speaker or not, will react to unusual spelling, grammar, errors of meaning, and methods or claims that are not plausible within the discipline, whereas many AIs will smooth over all of that and infer meaning statistically. AI reviews may also draw connections among concepts that exist across literatures but not in practice, and may hallucinate suggested citations about *God cremating the Earth in seven days*, and the like.
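One practical guard against hallucinated citations is to check each suggested reference against a bibliographic database before trusting it. A minimal sketch using the public Crossref REST API (the crude title-matching logic here is an illustrative assumption, not a vetted method):

```python
import requests

def citation_exists(title: str) -> bool:
    """Best-effort check that a suggested citation's title matches a real work.

    Queries the public Crossref REST API; a weak or absent match suggests
    the citation may be hallucinated and needs manual checking.
    """
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": title, "rows": 1},
        timeout=10,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    if not items:
        return False
    top_title = (items[0].get("title") or [""])[0]
    # Crude containment check; real matching would normalize and fuzzy-match.
    return title.lower() in top_title.lower() or top_title.lower() in title.lower()
```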