JuiceFS is a high-performance shared file system designed for cloud-native use and released under the Apache License 2.0. It provides full POSIX compatibility, so almost any kind of object storage can be used as a massive local disk and can be mounted and accessed simultaneously on hosts across different platforms and regions.
JuiceFS implements a distributed file system with an architecture that separates "data" storage from "metadata" storage. When JuiceFS stores data, the data itself is persisted in object storage (e.g. Amazon S3), and the corresponding metadata can be persisted in various databases such as Redis, MySQL, TiKV, or SQLite, depending on the scenario and requirements.
JuiceFS provides rich APIs for various forms of data management, analysis, archiving, and backup. It can interface with big data, machine learning, artificial intelligence, and other application platforms without code changes, and it provides massive, elastic, high-performance storage at low cost. With JuiceFS, you do not need to worry about availability, disaster recovery, monitoring, or expansion, so operation and maintenance work is greatly simplified, which helps companies focus on business development and R&D efficiency.
- POSIX Compatible: JuiceFS can be used like a local file system and interfaces seamlessly with existing applications.
- HDFS Compatible: JuiceFS is fully compatible with the HDFS API, which can enhance metadata performance.
- S3 Compatible: JuiceFS provides an S3 gateway to implement an S3-compatible access interface.
- Cloud-Native: It is easy to use JuiceFS in Kubernetes via the CSI Driver.
- Distributed: Each file system can be mounted on thousands of servers at the same time with high-performance concurrent reads and writes and shared data.
- Strong Consistency: Any committed changes in files will be visible on all servers immediately.
- Outstanding Performance: The latency can be down to a few milliseconds, and the throughput can be nearly unlimited depending on object storage scale (see performance test results).
- Data Security: JuiceFS supports encryption in transit and encryption at rest (view details).
- File Lock: JuiceFS supports BSD locks (flock) and POSIX locks (fcntl).
- Data Compression: JuiceFS supports LZ4 and Zstandard compression algorithms to save storage space.
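As an illustration of the File Lock feature above, here is a minimal sketch of taking both lock types from Python's standard `fcntl` module. It assumes a Unix-like host; the helper names `append_with_flock` and `append_with_posix_lock` are hypothetical, and the demo writes to a temporary directory that merely stands in for a directory inside a JuiceFS mount point. Nothing here is specific to JuiceFS: the point is that ordinary POSIX locking calls are what an application would use.

```python
import fcntl
import os
import tempfile


def append_with_flock(path, line):
    """Append one line while holding an exclusive BSD lock (flock)."""
    with open(path, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until the lock is granted
        try:
            f.write(line + "\n")
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


def append_with_posix_lock(path, line):
    """Append one line while holding an exclusive POSIX lock (fcntl)."""
    with open(path, "a") as f:
        # With the default length of 0, the lock covers the whole file.
        fcntl.lockf(f, fcntl.LOCK_EX)
        try:
            f.write(line + "\n")
            f.flush()
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)


if __name__ == "__main__":
    # Assumption: replace demo_dir with a directory on a mounted file system.
    demo_dir = tempfile.mkdtemp()
    log_path = os.path.join(demo_dir, "shared.log")
    append_with_flock(log_path, "written under flock")
    append_with_posix_lock(log_path, "written under fcntl lock")
    with open(log_path) as f:
        print(f.read(), end="")
```

Both helpers release the lock in a `finally` block so that a failed write does not leave other clients blocked.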
JuiceFS is designed for massive data storage and can be used as an alternative to many distributed file systems and network file systems, especially for the following scenarios.
- Big Data Analytics: compatible with HDFS without requiring extra APIs; seamlessly integrated with mainstream computing engines (Spark, Presto, Hive, etc.); unlimited storage space; nearly zero operation and maintenance costs; a well-developed caching mechanism; better performance than object storage.
- Machine Learning: compatible with POSIX, supporting all machine learning and deep learning frameworks; shareable file storage, which can improve the efficiency of team management and data use.
- Persistent Volumes in Container Clusters: supports Kubernetes CSI; persistent storage independent of container lifetime; strong consistency to ensure that stored data is correct; takes over data storage requirements to keep services stateless.
- Shared Workspace: a JuiceFS file system can be mounted on any host; no restrictions on concurrent client reads and writes; POSIX compatible with existing data flows and scripts.
- Data Backup: Back up all kinds of data in scalable storage space without limitation; combined with the shared mount feature, data from multiple hosts can be aggregated into one place and then backed up together.
JuiceFS is open source software, and the source code can be found on GitHub. When JuiceFS stores data, the data is split into chunks according to certain rules and stored in the user-specified object storage or other storage media, and the corresponding metadata is stored in the user-specified database.
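The split between chunked data and separately stored metadata can be pictured with a toy model. The sketch below is only an illustration under stated assumptions: `ToyFS`, its key format, and the tiny chunk size are all hypothetical, two in-memory dictionaries stand in for the object storage and the metadata database, and none of it reflects JuiceFS's actual chunking rules or on-disk layout.

```python
class ToyFS:
    """Toy model: file bytes go to an 'object store', file layout to a 'metadata store'."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size  # hypothetical value, chosen small for the demo
        self.objects = {}             # stand-in for object storage: key -> bytes
        self.metadata = {}            # stand-in for the metadata database: path -> record

    def write(self, path, data):
        """Split data into fixed-size chunks and record their keys in the metadata."""
        keys = []
        for index, offset in enumerate(range(0, len(data), self.chunk_size)):
            key = f"{path}/{index}"   # hypothetical object key format
            self.objects[key] = data[offset:offset + self.chunk_size]
            keys.append(key)
        self.metadata[path] = {"size": len(data), "chunks": keys}

    def read(self, path):
        """Look up the chunk keys in the metadata, then fetch and join the chunks."""
        record = self.metadata[path]
        return b"".join(self.objects[key] for key in record["chunks"])


if __name__ == "__main__":
    fs = ToyFS(chunk_size=4)
    fs.write("/demo.txt", b"hello world")
    print(fs.metadata["/demo.txt"]["chunks"])
    print(fs.read("/demo.txt"))
```

In this model a read never scans the object store: it consults the metadata first and then fetches exactly the chunks it lists, which is the reason the two stores can be scaled and chosen independently.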