AI-Powered Natural Language Processing Tool Enhances TNM Staging Efficiency

April 11, 2025

By Morgan Bayer

Fact checked by Tony Berberabe, MPH

Published in Targeted Therapies in Oncology, March II, 2025, Volume 14, Issue 4, Page 32

AN ONGOING CHALLENGE at many cancer centers is accurately identifying the TNM staging of patients for data reporting purposes. The difficulty stems largely from the time-consuming task of manually reviewing extensive pathology reports. To address this, Nicholas Tatonetti, PhD, developed an artificial intelligence (AI) and natural language processing (NLP) tool called Big Bird-TEN (BB-TEN). The tool analyzes pathology reports to determine the TNM stage of cancer quickly and accurately, streamlining the reporting process.¹

“The TNM staging [for a patient] is trapped in notes and free text that require a lot of clicking through of electronic health records [EHR] to get to that information. The ability to extract this information easily and then deliver it in a structured form through EHR or reports is potentially a very powerful ability for clinical practice,” Tatonetti said in an interview with Targeted Therapies in Oncology.

Tatonetti is vice chair of operations in the Department of Computational Biomedicine and associate director of computational oncology at Cedars-Sinai Medical Center in Los Angeles, California.

The Inspiration

For Tatonetti, this issue first presented itself when he started at Columbia University in New York, New York, and again at Cedars-Sinai after he moved there. It can occur at cancer centers around the world, he explained. “It is very important that the staging is accurate for patients,” Tatonetti said. “As a person handling the data, we get a lot of questions such as ‘How many patients do we have at this particular stage?’”

BB-TEN is a bidirectional encoder representations from transformers (BERT) model, which is a machine learning program that learns and understands the meaning of language in text form. “The inspiration behind this [initiative] was to explore how we can leverage advances in NLP and AI models to search clinical notes at scale for clinically and research-relevant patient information,” he said.

Methods and Results

To train the BB-TEN model, researchers used 9523 pathology reports from The Cancer Genome Atlas (TCGA), a publicly available data set. The reports were labeled for the 3 components of TNM staging: tumor size (6887 reports), regional lymph node involvement (5678 reports), and metastasis (4608 reports). Originally, 3 models were developed and compared. The highest-performing model achieved an area under the receiver operating characteristic curve (AU-ROC) ranging from 0.82 to 0.96.

On independent TCGA test sets, the models achieved an AU-ROC from 0.75 to 0.95. Researchers then evaluated the models on an independent set of pathology reports from Columbia University Irving Medical Center (CUIMC). This set included 7792 reports of tumor size, 6140 of lymph node status, and 2245 of metastasis. With this data set, BB-TEN achieved an AU-ROC ranging from 0.815 to 0.942. Researchers also noted that removing protected health information had a minimal effect on BB-TEN’s performance, resulting in only a small improvement of 0.0001 in the AU-ROC result.
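For readers less familiar with the metric, the AU-ROC figures above can be read as the probability that a model scores a randomly chosen positive report higher than a randomly chosen negative one. The following is a minimal sketch of that calculation for a single binary label, such as metastasis present vs absent. It is not the researchers' evaluation code: the function name `au_roc` and the toy labels and scores are hypothetical, and the article does not say how scores were combined across the multiple tumor and lymph node categories.

```python
from typing import Sequence


def au_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Binary AU-ROC via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (not from the study):
# 1 = report labeled as metastasis present, 0 = metastasis absent.
toy_labels = [1, 0, 1, 0, 1, 0]
toy_scores = [0.91, 0.20, 0.65, 0.70, 0.88, 0.15]
toy_au_roc = au_roc(toy_labels, toy_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why results in the 0.8 to 0.95 range are considered strong.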

Surprising Benefits

Tatonetti was surprised that training on the TCGA data set produced a very robust model. He attributed this to the diversity of the TCGA data set, which comprises pathology reports from many different institutions with varying styles of reporting. This diversity led to a smooth transition, with minimal variance in performance, when the model was tested on CUIMC’s pathology reports for validation and then on Cedars-Sinai Medical Center pathology reports, Tatonetti explained.

“Typically, you often see very dramatic decreases in performance once a model goes out of the healthcare system in which it was trained. However, we did not see that performance deterioration, and this means that anyone could use this model right out of the box,” Tatonetti said.

Another surprise with the BB-TEN model was that it could infer staging even when the stage was not explicitly mentioned in the pathology report. For instance, if the report described the tumor size or the infiltration of different lymph nodes, the model could deduce the patient’s stage from these descriptions, Tatonetti explained.

Challenges

During development, the model had a limited context window, meaning it could only process a small amount of information at once, typically 1 or 2 paragraphs. However, pathology reports are often much longer, which posed a challenge in fitting the entire report into the model’s context window, Tatonetti explained. To address this, the team focused on extracting the most important information, particularly from the beginning of the report, and minimizing less relevant boilerplate content typically found at the end. This proved to be effective.
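The workaround described above, keeping the start of the report and dropping trailing material, can be illustrated with a minimal sketch. It assumes whitespace-separated words as a stand-in for the model's actual tokenizer, and the function name `truncate_report`, the token limit, and the sample text are all hypothetical; the article does not describe the team's preprocessing in this level of detail.

```python
def truncate_report(report: str, max_tokens: int) -> str:
    """Keep only the first max_tokens whitespace-separated tokens.

    Whitespace splitting is an illustrative assumption; a real pipeline
    would count tokens with the model's own tokenizer.
    """
    if max_tokens < 0:
        raise ValueError("max_tokens must be non-negative")
    tokens = report.split()
    return " ".join(tokens[:max_tokens])


# Hypothetical report text: findings first, boilerplate last.
sample_report = (
    "Invasive carcinoma measuring 2.3 cm with 2 of 14 lymph nodes involved. "
    "This report was electronically signed and is confidential."
)
# Hypothetical limit chosen so only the findings sentence survives.
trimmed = truncate_report(sample_report, max_tokens=12)
```

The design choice reflects the observation in the text: the most relevant findings tend to appear early in a report, so cutting from the end loses the least information.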

As mentioned, BB stands for Big Bird, reflecting that the researchers adopted a larger context window than in the earlier models they evaluated. “Thus, we have already moved to a window that can handle much larger context, and that problem is moot now,” he said.

Next Steps

The model requires further validation; however, Tatonetti explained that it is very encouraging to see no deterioration of performance between the 2 data sets evaluated at CUIMC and Cedars-Sinai Medical Center.

Unlike similar models, which are trained on data from 1 specific center, the BB-TEN model spans all possible cancer types and is trained using the diverse TCGA database. In addition, the researchers chose to make the BB-TEN model open source to reduce barriers to access. “It will be some time before these types of models get integrated into clinical systems and EHR, but we are working on ways to integrate them into the research workflow here at Cedars-Sinai Medical Center,” Tatonetti said.