Publication Type
Journal Article
Version
publishedVersion
Publication Date
9-2025
Abstract
Large language models (LLMs) have transformed sentiment analysis, yet balancing accuracy, efficiency, and explainability remains a critical challenge. This study presents the first comprehensive evaluation of DeepSeek-R1—an open-source reasoning model—against OpenAI’s GPT-4o and GPT-4o-mini. We test the full 671B model and its distilled variants, systematically documenting few-shot learning curves. Our experiments show DeepSeek-R1 achieves a 91.39% F1 score on 5-class sentiment and 99.31% accuracy on binary tasks with just 5 shots, an eightfold improvement in few-shot efficiency over GPT-4o. Architecture-specific distillation effects emerge, where a 32B Qwen2.5-based model outperforms the 70B Llama-based variant by 6.69 percentage points. While its reasoning process reduces throughput, DeepSeek-R1 offers superior explainability via transparent, step-by-step traces, establishing it as a powerful, interpretable open-source alternative.
Keywords
Sentiment Analysis, Large Language Models, Explainability, DeepSeek-R1, GPT-4o, Few-Shot Learning, Emotion Recognition
Discipline
Artificial Intelligence and Robotics
Research Areas
Intelligent Systems and Optimization
Areas of Excellence
Digital transformation
Publication
IEEE Intelligent Systems
First Page
1
Last Page
10
ISSN
1541-1672
Identifier
10.1109/MIS.2025.3614967
Publisher
Institute of Electrical and Electronics Engineers
Citation
HUANG, Donghao and WANG, Zhaoxia.
Explainable sentiment analysis with DeepSeek-R1: Performance, efficiency, and few-shot learning. (2025). IEEE Intelligent Systems. 1-10.
Available at: https://ink.library.smu.edu.sg/sis_research/10465
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1109/MIS.2025.3614967