DeepSeek announces open-source 3FS, a file system that accelerates AI



DeepSeek has released the Fire-Flyer File System (3FS), an open-source parallel file system designed to improve the efficiency of AI training and inference.

DeepSeek brings disruption to AI-optimized parallel file systems, releases powerful new open-source Fire-Flyer File System | Tom's Hardware

https://www.tomshardware.com/pc-components/storage/deepseek-releases-powerful-new-parallel-file-system-fire-flyer-fire-system-made-open-source

DeepSeek AI Releases Fire-Flyer File System (3FS): A High-Performance Distributed File System Designed to Address the Challenges of AI Training and Inference Workload - MarkTechPost
https://www.marktechpost.com/2025/02/28/deepseek-ai-releases-fire-flyer-file-system-3fs-a-high-performance-distributed-file-system-designed-to-address-the-challenges-of-ai-training-and-inference-workload/

DeepSeek is hosting an event called 'OpenSourceWeek' starting February 24, 2025, during which it is open-sourcing various AI technologies. So far, it has announced 'FlashMLA', an MLA decoding kernel developed for NVIDIA's Hopper architecture-based GPUs, and 'DeepEP', a communication library that speeds up training and inference of Mixture of Experts (MoE) models.

DeepSeek, developer of DeepSeek-R1, open-sources its proprietary technologies one after another, enabling faster AI training and inference - GIGAZINE



On the fifth day, February 28, 2025, DeepSeek announced 3FS, a parallel file system designed around SSDs and RDMA networks. 3FS is a Linux-based file system built on Filesystem in Userspace (FUSE), and by deploying 3FS on its own servers, DeepSeek has achieved a total read throughput of 7.3TB per second.



In the high-performance computing (HPC) clusters that support AI, GPUs training an LLM constantly access the training data in random order, and each sample is essentially read only once.

Read caching can even be detrimental to AI development: a cache encourages reading the same data in the same order on every pass, which can introduce unwanted correlations into what the LLM learns.

Since read caching is largely useless for this workload, 3FS almost entirely forgoes it in favor of random read speed, which is said to set 3FS apart from other file systems.
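The access pattern described above can be sketched in a few lines. This is an illustrative Python sketch of an epoch-shuffled data loader, not code from 3FS: every sample is read exactly once per epoch, in a fresh random order each time, so a cache tuned for repeated or sequential reads gets almost no hits.

```python
import random

def epoch_order(num_samples: int, epoch: int, seed: int = 0) -> list[int]:
    """Return a fresh random permutation of sample indices for one epoch.

    Hypothetical sketch of the training access pattern: each sample is
    visited once per epoch, in a different random order every epoch.
    """
    rng = random.Random(seed + epoch)  # new order per epoch, reproducible
    order = list(range(num_samples))
    rng.shuffle(order)
    return order

# Two epochs visit every sample exactly once, but in different orders,
# so no block is re-read soon enough for an LRU-style read cache to help.
e0 = epoch_order(1_000, epoch=0)
e1 = epoch_order(1_000, epoch=1)
assert sorted(e0) == sorted(e1) == list(range(1_000))
assert e0 != e1
```

Under this pattern, raw random-read throughput from the SSDs matters far more than any caching layer, which is the trade-off 3FS is described as making.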



According to a paper on 3FS published in August 2024 by the team that operates 'Fire-Flyer 2,' one of DeepSeek's server clusters, DeepSeek runs 180 storage nodes, each equipped with 16 16TB SSDs and two 200Gbps network interface cards (NICs).

DeepSeek claims to have achieved 6.6 TiB/s read performance using 3FS on this server cluster, and in a GraySort benchmark run on a cluster of 25 storage nodes and 50 compute nodes, it sorted 110.5 TiB of data distributed across 8,192 partitions in just over 30 minutes, an average throughput of 3.66 TiB/min.
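The figures quoted above are internally consistent, which a quick back-of-the-envelope check confirms (this is plain arithmetic on the reported numbers, not anything from the 3FS code):

```python
# GraySort: 110.5 TiB sorted at an average of 3.66 TiB/min
# should take "just over 30 minutes".
sort_minutes = 110.5 / 3.66
assert 30 < sort_minutes < 31  # about 30.2 minutes

# Raw flash capacity of the cluster described in the paper:
# 180 storage nodes x 16 SSDs per node x 16 TB per SSD.
capacity_tb = 180 * 16 * 16
print(capacity_tb)  # 46080 TB, i.e. roughly 46 PB of raw flash
```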

The 3FS repository can be accessed from the following link:

GitHub - deepseek-ai/3FS: A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
https://github.com/deepseek-ai/3FS

in Software, Posted by log1l_ks