Assessing the Reliability of ChatGPT and Gemini in Identifying Relevant Orthodontic Literature

Research output: Contribution to journal › Article › peer-review

Abstract

Objectives: Artificial intelligence (AI)-based tools may remedy problems with conventional methods of identifying references. However, how effectively these models help orthodontic experts find relevant literature is unknown. This study assessed the validity of the references that ChatGPT and Google Gemini supply for orthodontic literature searches.

Materials and Methods: ChatGPT (versions 3.5 and 4) and Gemini were used to search for topics in orthodontics and specific subdomains. The existence and accuracy of the cited references were verified against several reputable sources, including PubMed, Google Scholar, and Web of Science.

Statistical Analysis: Descriptive statistics were used to present the data as counts and percentages, focusing on three aspects: completeness, accuracy, and fabrication. Reliability was analyzed using Cronbach’s α, and the results were presented visually as a correlation heat map.

Results: Of all references, only 15.76% were correct, whereas 71.92% were fabricated and 12.32% were inaccurate. Gemini had the highest proportion of correct references (36.36%), followed by GPT 3.5 (15.76%) and GPT 4 (0.95%); the difference was statistically significant (p < 0.01). The reliability score of 0.418 indicates low-to-moderate consistency in the accuracy of the references.

Conclusion: Although Gemini performed better than the GPT models, significant limitations remain in reference generation across all three models. These findings support balanced and cautious use of AI tools in orthodontic academic research, with human validation of references and training of dental professionals and researchers in the efficient use of AI tools.

Original language: English
Journal: European Journal of General Dentistry
State: Accepted/In press - 2025

Keywords

  • artificial intelligence
  • chatbot
  • literature
  • orthodontics
  • references
