Radiologists are increasingly using AI in their practices, leveraging it for specific interpretive and non-interpretive use cases. When AI alerts a radiologist about an emergent finding, there is an immediate benefit for the patient and the patient’s care team. But what happens when AI discovers pathology that was not suspected or that nobody was quite looking for?
Just as human interpreters looking for pneumonia might instead, or in addition, find a lung nodule, the ability to apply multiple AI algorithms to a single imaging examination means AI could identify unexpected but important incidental findings that require management and follow-up.
This scenario raises several questions. If AI flagged an unexpected incidental finding in a clinical setting, a radiologist would be expected to review it and make recommendations for further workup if necessary. However, it is not that straightforward. How will the clinical team monitor and recognize when the AI in question has too many false positives, inducing alert fatigue for the human “in the loop” or causing more harm than benefit through unnecessary testing?
Running multiple AI algorithms one case at a time may make sense prospectively in clinical settings. But how will clinicians or researchers deal with multiple important incidental findings identified by AI on large retrospective cohorts when dozens of lung nodules or other incidentalomas are found in studies from years ago, as could occur when developing and validating new AI algorithms? Are they to assume these findings were all managed during the course of routine care? Or is there a higher burden of management — should all these patients be contacted, informed of the potential findings, and given a recommendation for follow-up?
Should AI results deemed erroneous by a radiologist be stored as part of the medical record, sent to referring or primary care providers, and released on patient portals “just in case”? Should they be treated in a different manner than the results of computer-aided detection software long in use in breast imaging, with its many false positive marks that are dismissed by breast imagers and not included in reports?
Clinical leaders and AI experts in our radiology and imaging sciences department at Emory University came together to discuss these questions and others. After some time, a consensus began to emerge on some of the key issues. Most agreed that, in general, the responsibility to report incidental findings when AI is involved in prospective clinical care is no different than what would ordinarily be expected of a radiologist acting without AI. When AI is being used on a case-by-case basis, any incidental finding should always be reviewed by a physician and communicated as needed.
However, a different consensus emerged for when AI is applied retrospectively to large volumes of cases for quality improvement, administrative or research purposes. It may be impracticable to review every case, adjudicate it against the clinical record and communicate every incidental finding. Considerations may include the accuracy of the AI, what clinical diagnosis the AI is trained to look for, the volume of cases, available resources to review positive findings, the requirements of any relevant institutional review board protocols, how old the exams are, and whether the patients or research subjects are still easily reachable. This scenario is also more likely to involve AI that is not FDA-cleared or is still under development, further complicating the calculus.
Establishing a group focused on discussing these issues helped our team carefully consider these scenarios before applying AI, whether prospectively one case at a time or with large retrospective cohorts. Each practice or department may want to consider establishing written guidelines or policies to document where it stands. While some points of consensus were achieved, on other issues our group did not reach a clear agreement, such as whether erroneous or discounted AI recommendations should be stored in PACS or electronic medical records and shared in the patient portal. It was clear, however, that the exercise of discussing these issues helped the team establish common ground and a forum for hashing out these important decisions.
Establishing local guidelines on how to deal with incidental findings detected by AI is an opportunity for radiology to show leadership as healthcare more broadly moves to adopt predictive models in all aspects of clinical care. While some radiologists have been dealing with AI in their practices for years, these same concerns apply to AI used to predict sepsis, readmission, clinical deterioration and a host of other clinical problems.
The questions we’re facing about AI in radiology today are the same ones all our other clinical colleagues will be grappling with soon. Developing consensus and guidelines within the imaging community is critically important, since at some point these will become medicolegal questions rather than questions of institutional policy.