Ghostbuster is a detector designed to identify AI-generated text, regardless of which model produced it. It works in several stages: documents are passed through a series of weaker language models, a structured search is run over possible combinations of those models' features, and the selected features are used to train a classifier that distinguishes AI-generated from human-written text.
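The feature-search stage can be illustrated with a minimal sketch. The probability values, model names, and the specific operations below are illustrative assumptions, not Ghostbuster's actual feature set; in the real system the per-token probabilities would come from language model APIs.

```python
import statistics

# Hypothetical per-token probabilities for one document, as scored by two
# illustrative "weaker" language models (placeholder values).
p_unigram = [0.12, 0.05, 0.30, 0.08, 0.22]
p_trigram = [0.20, 0.10, 0.25, 0.15, 0.18]

# Candidate vector operations that combine two probability vectors,
# and scalar reductions that collapse a vector into one number.
vector_ops = {
    "add": lambda a, b: [x + y for x, y in zip(a, b)],
    "sub": lambda a, b: [x - y for x, y in zip(a, b)],
    "div": lambda a, b: [x / y for x, y in zip(a, b)],
}
reductions = {
    "mean": statistics.mean,
    "max": max,
    "min": min,
}

# Structured search: enumerate every (vector op, reduction) pair and
# emit one scalar feature per combination for this document.
features = {}
for op_name, op in vector_ops.items():
    combined = op(p_unigram, p_trigram)
    for red_name, red in reductions.items():
        features[f"{red_name}({op_name}(unigram, trigram))"] = red(combined)
```

In this sketch, each document yields a fixed-length vector of scalar features; a downstream classifier (e.g. logistic regression) would then be trained on such vectors to separate human-written from AI-generated text, keeping only the feature combinations that prove useful.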
Ghostbuster’s training data consists primarily of news articles, student essays, and creative writing in American and British English. Its effectiveness may therefore vary on texts in other styles or languages, or on text written by non-native English speakers. It may also struggle with shorter texts or with domains significantly different from its training data.
Crucially, Ghostbuster is not meant to power systems that automatically penalize students or writers without human verification. It is not infallible and may produce incorrect predictions, particularly when text has been edited or paraphrased by humans; its purpose is to assist human oversight, not replace it.
In terms of privacy, Ghostbuster relies on the OpenAI API for its operations. While its developers pledge not to publicly distribute the data, they cannot guarantee the privacy of inputs provided to the tool.
Lastly, Ghostbuster is not recommended for texts shorter than 100 words, and it remains less reliable on texts in the 100–250 word range, as its accuracy diminishes on short inputs.