Tortured phrases: common behavior of language models

Large language models (LLMs) are increasingly used to prepare academic content, including research papers and blog posts, all over the world. There are arguments in favor of this, but it remains an open question how far these tools are genuinely useful and how far they degrade the quality of research papers. The debate leaves many of us unsure about what counts as fair use of LLMs. Tortured phrases found in academic papers are evidence that machine-generated text has been used in research papers and articles.

What are Tortured phrases?

Guillaume Cabanac, Cyril Labbé, and Alexander Magazinov (2021) introduced the concept of "tortured phrases" in their paper "Tortured phrases: A dubious writing style emerging in science" (arxiv.org), defining them as

unexpected, weird phrases in lieu of established ones, such as ‘counterfeit consciousness’ instead of ‘artificial intelligence’. [1]

Words in a language often have multiple meanings, and a word's meaning also changes depending on the word or words it is paired with. Depending on the context, some pairings are appropriate and some are not. People who know the language grasp this easily, but automated tools cannot always tell the difference and may choose replacement words that do not carry the intended meaning.

Consider the following terms 👇, which you have probably heard or read many times and are familiar with:

“artificial intelligence,”

“big data,” and

“random value.”

👉 But what if they are taken to mean

“counterfeit consciousness,”

“colossal information,”

and “irregular esteem”?

A Few Tortured Phrases

General Phrase	Tortured Phrase
Artificial Intelligence	Counterfeit Consciousness
Big Data	Colossal Information
Random value	Irregular Esteem
Deep neural network	Profound neural organization
Signal to noise	Flag to commotion
Remaining Energy	Leftover vitality
Cloud Computing	Haze figuring
Linear prediction	Straight expectation
Naive Bayes	Gullible Bayes
Random forest	Irregular Woodland
Smart home	Savvy home

These weird phrases have been found in several journals. Many of them (about 500 papers) were concentrated in special issues of the journal Microprocessors and Microsystems between 2018 and 2021. [2]

By January 2022, Cabanac, Labbé, and Magazinov had found nearly 3,200 papers containing tortured phrases, even in reputable, peer-reviewed journals. [3]

Further investigation indicated that such phrases are the outcome of automated translation or paraphrasing. [4]
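To illustrate how automated paraphrasing can produce such phrases, here is a minimal sketch of a word-by-word synonym swapper. It is not the algorithm of any real paraphrasing tool; the `SYNONYMS` table and the `naive_paraphrase` function are hypothetical and built only from examples in this post. The point it shows is that replacing each word on its own, without treating the term as a unit, turns an established term into a tortured phrase.

```python
# Hypothetical word-level synonym table (an assumption for illustration).
# A real tool would use a far larger thesaurus, but the failure mode is similar.
SYNONYMS = {
    "artificial": "counterfeit",
    "intelligence": "consciousness",
    "big": "colossal",
    "data": "information",
    "random": "irregular",
    "value": "esteem",
    "forest": "woodland",
}


def naive_paraphrase(text: str) -> str:
    """Swap each word for a synonym, ignoring the phrase it belongs to."""
    words = text.lower().split()
    # Words without an entry in the table are left unchanged.
    return " ".join(SYNONYMS.get(word, word) for word in words)


for term in ["artificial intelligence", "big data", "random value", "random forest"]:
    print(f"{term} -> {naive_paraphrase(term)}")
```

Each individual swap is a plausible synonym in some context, yet the combined result is no longer the established technical term.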

Problematic Paper Screener

The Problematic Paper Screener is a software tool that tracks papers containing tortured phrases. It was developed by a team of computer scientists led by Cabanac, Labbé, and Magazinov. [5]
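The basic idea of screening text for known tortured phrases can be sketched in a few lines of Python. This is only an illustration and not the actual implementation of the Problematic Paper Screener: the `FINGERPRINTS` dictionary and the `find_tortured_phrases` function are hypothetical names, and the entries are a subset of the table in this post.

```python
import re

# Assumed fingerprint list: tortured phrase -> expected established term.
# Entries are a subset of the table in this post, not the Screener's own list.
FINGERPRINTS = {
    "counterfeit consciousness": "artificial intelligence",
    "colossal information": "big data",
    "irregular esteem": "random value",
    "profound neural organization": "deep neural network",
    "flag to commotion": "signal to noise",
    "leftover vitality": "remaining energy",
    "haze figuring": "cloud computing",
    "straight expectation": "linear prediction",
    "gullible bayes": "naive Bayes",
    "irregular woodland": "random forest",
}


def find_tortured_phrases(text: str) -> list[tuple[str, str]]:
    """Return (tortured phrase, expected term) pairs found in the text."""
    # Lowercase and collapse whitespace so line breaks do not hide a match.
    normalized = re.sub(r"\s+", " ", text.lower())
    hits = []
    for phrase, expected in FINGERPRINTS.items():
        # Word boundaries avoid matching inside longer words.
        if re.search(r"\b" + re.escape(phrase) + r"\b", normalized):
            hits.append((phrase, expected))
    return hits


sample = (
    "We use counterfeit consciousness and colossal\n"
    "information to improve the flag to commotion ratio."
)
for phrase, expected in find_tortured_phrases(sample):
    print(f"found '{phrase}' (expected '{expected}')")
```

A match is only a flag for human review. A phrase from the list can occasionally appear for legitimate reasons, so a flagged paper still needs to be read by a person.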

According to Yateendra Joshi, depending entirely on LLMs to write academic papers is unethical and erodes public confidence in academic publishing, and the authors of such publications may be pressured to retract them. Researchers should instead make an effort to write more effectively or enlist the aid of reliable editing and translation services. [6]

We can take advantage of LLMs in our writing, but with care and with the understanding that computer models have limitations, while human creativity and analytical power are not bound in the same way. Use them wisely, and if language is a barrier to writing, seek the help of language experts and tools. This will not only make your research writing sound but also, over time, build trust in research publications prepared with the help of LLMs.

References:

1. Cabanac, G., Labbé, C. & Magazinov, A. (2021). Tortured phrases: A dubious writing style emerging in science. Preprint at arXiv. https://arxiv.org/abs/2107.06751

2. Else, H. (2021). ‘Tortured phrases’ give away fabricated research papers. Nature 596, 328-329. https://doi.org/10.1038/d41586-021-02134-0

3. Cabanac, G., Labbé, C. & Magazinov, A. (January 13, 2022). “Bosom peril” is not “breast cancer”: How weird computer-generated phrases help researchers find scientific publishing fraud. Bulletin of the Atomic Scientists (thebulletin.org).

4. Joshi, Y. (April 21, 2022). Tortured phrases: What they are, how they are detected, and how to avoid them. Editage Insights (editage.com).

5. Cabanac, G., Labbé, C. & Magazinov, A. (2022). The ‘Problematic Paper Screener’ automatically selects suspect publications for post-publication (re)assessment. Presented at WCRI 2022: 7th World Conference on Research Integrity. arXiv preprint. https://doi.org/10.48550/arXiv.2210.04895

6. Joshi, Y. (April 21, 2022). Tortured phrases: What they are, how they are detected, and how to avoid them. Editage Insights (editage.com).