Western Libraries

Knowledge Synthesis: Systematic & Scoping Reviews

Artificial Intelligence

A variety of AI tools can be used during the systematic review or evidence synthesis process. These may be used to assist with developing a search strategy; locating relevant articles or resources; or during the data screening, data extraction or synthesis stage. They can also be used to draft plain language summaries.

The overall consensus is that AI tools can be very useful at different stages of a systematic review or other evidence synthesis, but that it is important to fully understand any biases and weaknesses they may bring to the process. In many cases, new AI tools that previous research has not rigorously assessed should be used in conjunction with existing validated methods. It is also essential to consider ethical, copyright and intellectual property issues, for example if the process involves uploading data or the full text of articles to an AI tool.

Examples of Key Terms:


Purpose and Strategies

Purpose:

  • Using AI as a mediating step between stages of the systematic review process
  • Creating more efficient operations and reducing the time spent on the most time-consuming stages
  • Using AI as an aid to make faster decisions
  • Increasing transparency and clarity in review questions

Strategies:

  • Determine the strengths and weaknesses of the different stages of the systematic review process
  • Identify the areas that take the most time
  • Assess the risks of automation
  • Talk to your research and library team about where automated processes would benefit the review


AI in Systematic Review Process

Human Review Primary (between the first and second steps):

  • AI can synthesize information to form a protocol
  • Check that elements of DEI are included in the protocol and that all components are present

Human Review Secondary (between the second and third steps):

  • Autogenerated search strings
  • Automated literature selection; conduct a quality check after results are returned

Human Review Tertiary (between the third and fourth steps):

  • Automated selection of studies; review the selection criteria and process
  • Automated data extraction; review the type of data and what is included and excluded
  • Automated synthesis of data; review for any biases and exclusions
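
One way to make the human review of automated study selection concrete is to have a reviewer screen a sample of records independently and compare those decisions with the tool's. The Python sketch below is illustrative only and is not taken from any of the tools or studies listed on this page. The function name `compare_screening`, the record IDs and the include/exclude decisions are all hypothetical. It reports recall (the share of records the human included that the tool also included), precision (the share of the tool's inclusions that the human agreed with), and the records the tool missed.

```python
def compare_screening(ai_decisions, human_decisions):
    """Compare AI and human screening decisions on the same sample.

    Both arguments are dicts mapping a record ID to True (include)
    or False (exclude). Only records present in both are compared.
    The human decision is treated as the reference standard.
    """
    shared = ai_decisions.keys() & human_decisions.keys()

    # Count agreement and disagreement on inclusions.
    true_pos = sum(1 for r in shared if ai_decisions[r] and human_decisions[r])
    false_pos = sum(1 for r in shared if ai_decisions[r] and not human_decisions[r])
    false_neg = sum(1 for r in shared if not ai_decisions[r] and human_decisions[r])

    # Return None rather than dividing by zero when a rate is undefined.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else None
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else None

    # Records the human included but the tool excluded need a second look.
    missed = sorted(r for r in shared if human_decisions[r] and not ai_decisions[r])

    return {"recall": recall, "precision": precision, "missed": missed}


if __name__ == "__main__":
    # Hypothetical sample of five records screened by both a tool and a person.
    human = {"rec1": True, "rec2": True, "rec3": False, "rec4": False, "rec5": True}
    ai = {"rec1": True, "rec2": False, "rec3": True, "rec4": False, "rec5": True}
    print(compare_screening(ai, human))
```

In a review, missed records usually matter more than extra ones, so low recall on the sample is a signal to re-screen by hand or adjust how the tool is used. The acceptable threshold is a decision for the review team, not something this sketch can set.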


Of course, all of this will change. The use of AI for evidence synthesis is a rapidly developing field, but for clinical use, syntheses will still need to meet underlying standards of transparency and rigour, and these are so far absent. Keep this in mind when reading the latest tech hype.


  • Alshami, A.; Elsayed, M.; Ali, E.; Eltoukhy, A.E.E.; Zayed, T. Harnessing the Power of ChatGPT for Automating Systematic Review Process: Methodology, Case Study, Limitations, and Future Directions. Systems 2023, 11, 351. https://doi.org/10.3390/systems11070351
    Explores the use of ChatGPT in (1) Preparation of Boolean research terms and article collection, (2) Abstract screening and articles categorization, (3) Full-text filtering and information extraction, and (4) Content analysis to identify trends, challenges, gaps, and proposed solutions.
  • Blaizot, A, Veettil, SK, Saidoung, P, et al. Using artificial intelligence methods for systematic review in health sciences: A systematic review. Res Syn Meth. 2022; 13(3): 353-362. doi:10.1002/jrsm.1553
    This review delineated automated tools and platforms that employ artificial intelligence (AI) approaches and evaluated the reported benefits and challenges in using such methods. The authors report the usage of Rayyan, Robot Reviewer, EPPI-reviewer, K-means, SWIFT-review, SWIFT-Active Screener, Abstrackr, Wordstat, Qualitative Data Analysis (QDA) Miner, and NLP, and assess the quality of the reviews that used these.
  • Janka H, Metzendorf M-I. High precision but variable recall – comparing the performance of five deduplication tools. JEAHIL [Internet]. 17 Mar. 2024 [cited 28 Mar. 2024];20(1):12-7. Available from: http://ojs.eahil.eu/ojs/index.php/JEAHIL/article/view/607
  • Kebede, MM, Le Cornet, C, Fortner, RT. In-depth evaluation of machine learning methods for semi-automating article screening in a systematic review of mechanistic literature. Res Syn Meth. 2023; 14(2): 156-172. doi:10.1002/jrsm.1589
    "We aimed to evaluate the performance of supervised machine learning algorithms in predicting articles relevant for full-text review in a systematic review." "Implementing machine learning approaches in title/abstract screening should be investigated further toward refining these tools and automating their implementation" 
  • Khalil H, Ameen D, Zarnegar A. Tools to support the automation of systematic reviews: a scoping review. J Clin Epidemiol 2022; 144: 22-42. https://www.jclinepi.com/article/S0895-4356(21)00402-9/fulltext
    "The current scoping review identified that LitSuggest, Rayyan, Abstractr, BIBOT, R software, RobotAnalyst, DistillerSR, ExaCT and NetMetaXL have potential to be used for the automation of systematic reviews. However, they are not without limitations. The review also identified other studies that employed algorithms that have not yet been developed into user friendly tools. Some of these algorithms showed high validity and reliability but their use is conditional on user knowledge of computer science and algorithms."

  • Khraisha Q, Put S, Kappenberg J, Warraitch A, Hadfield K. Can large language models replace humans in systematic reviews? Evaluating GPT-4's efficacy in screening and extracting data from peer-reviewed and grey literature in multiple languages. Res Syn Meth. 2024; 1-11. doi:10.1002/jrsm.1715
    "Although our findings indicate that, currently, substantial caution should be exercised if LLMs are being used to conduct systematic reviews, they also offer preliminary evidence that, for certain review tasks delivered under specific conditions, LLMs can rival human performance."

  • Mahuli, S., Rai, A., Mahuli, A. et al. Application ChatGPT in conducting systematic reviews and meta-analyses. Br Dent J 235, 90–92 (2023). https://doi.org/10.1038/s41415-023-6132-y
    Explores using ChatGPT for conducting Risk of Bias analysis and data extraction from a randomised controlled trial.

  • Ovelman, C., Kugley, S., Gartlehner, G., & Viswanathan, M. (2024). The use of a large language model to create plain language summaries of evidence reviews in healthcare: A feasibility study. Cochrane Evidence Synthesis and Methods, 2(2), e12041. https://onlinelibrary.wiley.com/doi/abs/10.1002/cesm.12041 

  • Qureshi, R., Shaughnessy, D., Gill, K.A.R. et al. Are ChatGPT and large language models “the answer” to bringing us closer to systematic review automation? Syst Rev 12, 72 (2023). https://doi.org/10.1186/s13643-023-02243-z
    "Our experience from exploring the responses of ChatGPT suggest that while ChatGPT and LLMs show some promise for aiding in SR-related tasks, the technology is in its infancy and needs much development for such applications. Furthermore, we advise that great caution should be taken by non-content experts in using these tools due to much of the output appearing, at a high level, to be valid, while much is erroneous and in need of active vetting."

  • van Dijk SHB, Brusse-Keizer MGJ, Bucsán CC, et al. Artificial intelligence in systematic reviews: promising when appropriately used. BMJ Open 2023;13:e072254. doi: 10.1136/bmjopen-2023-072254
    Suggests how to conduct a transparent and reliable systematic review using the AI tool ‘ASReview’ in the title and abstract screening.