SCORE: Story Coherence and Retrieval Enhancement for AI Narratives

Abstract

Large Language Models (LLMs) can generate creative and engaging narratives from user-specified input, but keeping these AI-generated stories coherent and emotionally rich from start to finish remains a challenge. In this work, we propose SCORE, a framework for Story Coherence and Retrieval Enhancement that detects and resolves narrative inconsistencies. SCORE tracks the status of key items and generates a summary of each episode, then applies Retrieval-Augmented Generation (RAG), using TF-IDF and cosine similarity to identify related episodes and improve the overall story structure. In tests on multiple LLM-generated stories, SCORE significantly improves the consistency and stability of narrative coherence compared to baseline GPT models, providing a more robust method for evaluating and refining AI-generated narratives.

Method

SCORE framework overview

The SCORE framework for improving AI-generated story coherence. (a) Extracts the status of key items in each episode. (b) Analyzes each episode in detail and summarizes it. (c) Uses RAG to answer user queries and resolve narrative inconsistencies.
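
To make the retrieval idea in step (c) concrete, the following is a minimal sketch of ranking episode summaries against a query with TF-IDF weights and cosine similarity. It is an illustration under stated assumptions, not the authors' implementation: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `retrieve`), the smoothed IDF formula, the whitespace-style tokenizer, and the example summaries in `SUMMARIES` are all hypothetical and were chosen only to keep the example self-contained.

```python
import math
import re
from collections import Counter

# Hypothetical episode summaries, invented for illustration only.
SUMMARIES = [
    "Mira finds a silver key in the old lighthouse.",
    "The crew repairs the ship and sails north through a storm.",
    "Mira loses the silver key while crossing the river.",
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumption): terms found in every document keep a nonzero weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, summaries, top_k=2):
    """Return the top_k (episode_index, score) pairs most similar to the query."""
    vectors, idf = tfidf_vectors(summaries)
    q_counts = Counter(tokenize(query))
    total = sum(q_counts.values()) or 1
    # Query terms never seen in any summary are ignored.
    q_vec = {t: (c / total) * idf[t] for t, c in q_counts.items() if t in idf}
    scored = [(cosine(q_vec, vec), i) for i, vec in enumerate(vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(i, score) for score, i in scored[:top_k]]


hits = retrieve("Where is the silver key?", SUMMARIES, top_k=2)
for index, score in hits:
    print(f"episode {index}: similarity {score:.3f}")
```

In a pipeline of this kind, the retrieved summaries would then be passed to the LLM as context for answering the query or checking a new episode for contradictions. How SCORE combines the retrieved episodes with the tracked item statuses is not specified on this page.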

BibTeX

@misc{yi2025scorestorycoherenceretrieval,
      title={SCORE: Story Coherence and Retrieval Enhancement for AI Narratives},
      author={Qiang Yi and Yangfan He and Jianhui Wang and Xinyuan Song and Shiyao Qian and Miao Zhang and Li Sun and Tianyu Shi},
      year={2025},
      eprint={2503.23512},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2503.23512},
}