Publication Type
PhD Dissertation
Version
publishedVersion
Publication Date
6-2025
Abstract
In the context of the current infodemic, the rapid spread of misinformation poses a severe threat to social stability and public health. The rise of deep learning technologies offers the potential to accelerate the development of automated misinformation detection and verification. However, given the vast amount of data on the internet, current technology and computational resources often fall short of scrutinizing each piece of information in a timely and accurate manner, making the improvement of processing efficiency an urgent issue. Furthermore, effective fact-checking requires sophisticated reasoning capabilities. Given the characteristics of large language models (LLMs), only models of considerable size exhibit emergent advanced reasoning and understanding abilities, which places a higher demand on computational resources. Moreover, even if these LLMs theoretically support advanced fact-checking, efficiently adapting them to specific verification tasks remains a technical challenge. This thesis aims to improve automated misinformation monitoring and evidence-based fact verification systems by enhancing their efficiency and accuracy.
Accordingly, we address two primary challenges. In the first part, we tackle more effective infodemic surveillance by studying rumor detection and virality prediction, two critical aspects of infodemic surveillance, within a unified framework. The second part focuses on leveraging LLMs for scalable fact verification. We propose HiSS prompting, which decomposes complex claims into subclaims and verifies them step by step with evidence retrieved via web search. To further improve retrieval quality, we develop Fine-grained Feedback with Reinforcement Retrieval (FFRR), which collects document- and question-level feedback from LLMs to refine evidence selection through policy gradient optimization. We also introduce Chain of Preference Optimization (CPO), a fine-tuning method that aligns LLM reasoning with preferred multi-step thought chains, improving accuracy by up to 4.3% over base models while significantly reducing inference cost. In addition, we propose hybrid LightTransfer models that deliver up to 2.17× higher inference throughput with negligible performance loss. In summary, we systematically study automated misinformation monitoring and verification, and we demonstrate the effectiveness of the proposed lightweight methods on real-world datasets, indicating the potential of our work for real-world application.
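To make the hierarchical step-by-step idea behind HiSS prompting concrete, the sketch below outlines the pipeline the abstract describes: decompose a claim into subclaims, retrieve web evidence for each, verify them one by one, then aggregate the verdicts. It is a minimal illustration only; the `llm` and `search` functions and the prompt wording are hypothetical placeholders, not the thesis's actual prompts or implementation.

```python
# Minimal sketch of HiSS-style step-by-step claim verification.
# The llm() and search() functions are hypothetical stand-ins for an LLM API
# and a web-search API; plug in real clients to run end to end.

from dataclasses import dataclass


@dataclass
class SubClaimVerdict:
    subclaim: str
    evidence: str
    verdict: str  # e.g. "supported", "refuted", "not enough info"


def llm(prompt: str) -> str:
    """Placeholder for a large language model call (hypothetical)."""
    raise NotImplementedError("plug in an LLM client here")


def search(query: str, k: int = 3) -> list[str]:
    """Placeholder for web-search evidence retrieval (hypothetical)."""
    raise NotImplementedError("plug in a search API here")


def verify_claim(claim: str) -> tuple[str, list[SubClaimVerdict]]:
    # Step 1: decompose the complex claim into atomic subclaims.
    subclaims = [
        line.strip()
        for line in llm(
            f"Split the claim into atomic subclaims, one per line:\n{claim}"
        ).splitlines()
        if line.strip()
    ]

    verdicts: list[SubClaimVerdict] = []
    for sc in subclaims:
        # Step 2: retrieve evidence for the subclaim via web search.
        evidence = "\n".join(search(sc))
        # Step 3: verify the subclaim against the retrieved evidence.
        verdict = llm(
            f"Evidence:\n{evidence}\n\nBased only on the evidence, is this "
            f"subclaim supported, refuted, or is there not enough info?\n{sc}"
        )
        verdicts.append(SubClaimVerdict(sc, evidence, verdict))

    # Step 4: aggregate the per-subclaim verdicts into an overall label.
    summary = "\n".join(f"- {v.subclaim}: {v.verdict}" for v in verdicts)
    overall = llm(
        f"Given these subclaim verdicts, label the original claim "
        f"(e.g. true / half-true / false):\n{summary}\nClaim: {claim}"
    )
    return overall, verdicts
```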
Degree Awarded
PhD in Computer Science
Discipline
Artificial Intelligence and Robotics
Supervisor(s)
GAO, Wei
First Page
1
Last Page
192
Publisher
Singapore Management University
City or Country
Singapore
Citation
ZHANG, Xuan.
Cutting through the infodemic efficiently: News claims surveillance and LLM-based lightweight fact verification. (2025). 1-192.
Available at: https://ink.library.smu.edu.sg/etd_coll/778
Copyright Owner and License
Author
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.