
Foundations of DeepSeek: How it works

Research output: Chapter in book / Conference proceeding; Chapter in an edited book (as author); Academic research; peer-review

Abstract

This chapter provides a comprehensive overview of DeepSeek-R1, an open-source large language model notable for its efficiency, transparency, and adaptability. It traces DeepSeek’s origins and design philosophy, emphasising resource optimisation and open collaboration, which enable high performance at significantly lower cost than other leading AI models. Technical innovations such as the Mixture-of-Experts (MoE) architecture are explored, highlighting how they support both specialised and general-purpose tasks. Benchmark analyses demonstrate that DeepSeek-R1 excels in coding, mathematical reasoning, and multilingual contexts, frequently matching or surpassing established models like GPT-4o and Claude 3.5. The chapter also discusses DeepSeek’s focus on transparent reasoning, empathetic responses, and cross-cultural capabilities, ensuring outputs are both interpretable and user-centric. Ultimately, the chapter concludes that DeepSeek’s rapid advancement, combined with its accessible and cost-effective approach, sets a new standard for large language models and marks a significant step forward in the democratisation of advanced AI technology.

Original language: English
Title of host publication: DeepSeek and Mental Health Support Among Chinese Youth
Subtitle of host publication: Use Cases, Risks, and Broader Implications
Publisher: CRC Press
Chapter: 4
Pages: 31-43
Number of pages: 13
ISBN (Electronic): 9781040569450
ISBN (Print): 9781041092445
Publication status: Published - 16 Feb 2026

ASJC Scopus subject areas

  • General Computer Science
  • General Medicine
  • General Arts and Humanities
  • General Social Sciences
