msaFilesystem: A practical approach to file system management
Feb 18, 2025
—

Managing different file systems in modern applications can be messy—but it doesn’t have to be. In this article, written by our developer Eduard, you’ll discover msaFilesystem—an agnostic abstract filesystem API that simplifies working with S3, Azure Datalake, local storage, YouTube, and more. Eduard explains the challenges, presents the solution, and shares real-world examples using the powerful FileWorker class.
Let’s start with a definition. msaFilesystem is an Agnostic Abstract Filesystem API that lets you work with S3, GCS, Azure Datalake, your local file system, YouTube, and other storage back ends through one interface.
In this article I describe the problem, present the solution I recommend, and share practical examples of using msaFilesystem with different file systems.
1. Problem description
In modern applications, there is often a need to work with different types of file systems. Depending on the project’s requirements, this could include local file systems, cloud storage (such as S3, Google Cloud Storage or Azure Datalake), network file systems (such as SMB or WebDAV), and specialized data sources (such as YouTube or Dropbox). In these conditions, developers have to use a different library and interface for each system, which leads to more complex code and longer development time.
2. The reason for choosing the suggested approach
We decided to use msaFilesystem to create a single abstract API for working with different file systems. This approach allows you to use a single library to access all the necessary file systems, which greatly simplifies code development and maintenance. We chose to optimize for FastAPI and Pydantic, as these are modern and popular tools for developing web applications in Python.
3. The key tasks this approach solves
Using msaFilesystem solves several key problems:
- Unification of interfaces
Provides a single interface for working with different types of file systems.
- Simplified integration
Reduces the amount of code required to integrate different file systems within an application.
- Scalability
Allows you to easily switch between local and cloud storage without changing code.
- Development optimization
Integration with FastAPI and Pydantic speeds up the development process and improves code readability.
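To make the idea of a unified interface more concrete, here is a minimal sketch written with the Python standard library only. It is not the msaFilesystem API: the names `Storage`, `LocalStorage`, `MemoryStorage` and `save_report` are hypothetical and exist only for this illustration. The point is that application code written against one interface keeps working when the backend is swapped.

```python
from pathlib import Path
from typing import Protocol


class Storage(Protocol):
    """Hypothetical minimal interface shared by every backend."""

    def write_text(self, path: str, data: str) -> None: ...
    def read_text(self, path: str) -> str: ...


class LocalStorage:
    """Stores files under a root directory on the local disk."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def write_text(self, path: str, data: str) -> None:
        target = self.root / path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(data, encoding="utf-8")

    def read_text(self, path: str) -> str:
        return (self.root / path).read_text(encoding="utf-8")


class MemoryStorage:
    """Keeps files in a dict; a stand-in for any other backend."""

    def __init__(self) -> None:
        self._files = {}

    def write_text(self, path: str, data: str) -> None:
        self._files[path] = data

    def read_text(self, path: str) -> str:
        return self._files[path]


def save_report(storage: Storage, name: str, body: str) -> str:
    """Application code depends only on the interface, not on a backend."""
    storage.write_text(f"reports/{name}", body)
    return storage.read_text(f"reports/{name}")
```

Calling `save_report(LocalStorage("/some/dir"), ...)` and `save_report(MemoryStorage(), ...)` runs the same application code against two different backends.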
4. Examples of msaFilesystem usage
Below are examples of using msaFilesystem with various file systems.
- The local file system
- The Amazon S3 file system
```python
s3_fs = open_fs('s3://mybucket')
with s3_fs.open('example.txt', 'w') as file:
    file.write('Hello, S3!')
```
- The temporary file system
```python
temp_fs = open_fs('temp://')
with temp_fs.open('example.txt', 'w') as file:
    file.write('Temporary data')
```
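The examples above select a backend from the URL scheme (`s3://`, `temp://`). How msaFilesystem does this internally is not covered here, so the following is only a sketch of one common way to build such a dispatcher, using the standard library. The names `register`, `open_backend` and the `mem` scheme are hypothetical, and the "backends" are plain dicts rather than real file systems.

```python
import tempfile
from typing import Callable, Dict
from urllib.parse import urlsplit

# Registry that maps a URL scheme to a function building a backend.
# The returned "backend" is just a descriptive dict in this sketch.
_OPENERS: Dict[str, Callable[[str], dict]] = {}


def register(scheme: str) -> Callable:
    """Decorator that registers an opener for one URL scheme."""
    def decorator(func):
        _OPENERS[scheme] = func
        return func
    return decorator


@register("temp")
def _open_temp(location: str) -> dict:
    # A fresh temporary directory stands in for a temporary file system.
    return {"kind": "temp", "root": tempfile.mkdtemp()}


@register("mem")
def _open_memory(location: str) -> dict:
    return {"kind": "memory", "files": {}}


@register("s3")
def _open_s3(location: str) -> dict:
    # A real opener would create a client here; the sketch only records the bucket.
    return {"kind": "s3", "bucket": location}


def open_backend(url: str) -> dict:
    """Pick an opener from the URL scheme, e.g. 's3://mybucket'."""
    parts = urlsplit(url)
    if parts.scheme not in _OPENERS:
        raise ValueError(f"no opener registered for scheme {parts.scheme!r}")
    return _OPENERS[parts.scheme](parts.netloc + parts.path)
```

With a registry like this, supporting a new storage type means registering one more opener; the calling code keeps passing a URL string.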
5. The core capabilities
The core capabilities of msaFilesystem include the following:
1. Abstract file system
A single interface for working with various file systems such as S3, GCS, Azure Datalake and local file system.
2. Support for various protocols and storage back ends
- FTP, memory, ZIP, SMB, WebDAV, Azure Datalake, S3, Google Cloud Storage, Google Drive, Dropbox, OneDrive and YouTube.
3. Optimization for FastAPI/Pydantic
Simplified integration with modern web frameworks.
6. The additional capabilities
Beyond the core capabilities, the system offers a number of additional features.
- In-memory file system management
Temporary and cacheable file systems for temporary data storage and testing.
- Archive support
Reading and writing archives in formats such as ZIP and TAR.
- Mounting of virtual file systems
The ability to mount subdirectories into other file systems.
- Network file system support
SMB, WebDAV and other protocols for secure access to files on the network.
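As an illustration of the mounting feature, the sketch below routes each path to whichever backend is mounted at its longest matching prefix. It uses only the standard library and in-memory backends; `MemoryBackend` and `MountTable` are hypothetical names for this example and are not taken from msaFilesystem.

```python
from typing import Dict, Tuple


class MemoryBackend:
    """Tiny in-memory backend: a dict from path to bytes."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = data

    def read(self, path: str) -> bytes:
        return self.files[path]


class MountTable:
    """Routes each path to the backend mounted at its longest matching prefix."""

    def __init__(self) -> None:
        self._mounts: Dict[str, MemoryBackend] = {}

    def mount(self, prefix: str, backend: MemoryBackend) -> None:
        self._mounts[prefix.strip("/")] = backend

    def _resolve(self, path: str) -> Tuple[MemoryBackend, str]:
        clean = path.strip("/")
        # Try longer prefixes first so nested mounts win over their parents.
        for prefix in sorted(self._mounts, key=len, reverse=True):
            if clean == prefix or clean.startswith(prefix + "/"):
                return self._mounts[prefix], clean[len(prefix):].lstrip("/")
        raise FileNotFoundError(f"no file system mounted for {path!r}")

    def write(self, path: str, data: bytes) -> None:
        backend, inner = self._resolve(path)
        backend.write(inner, data)

    def read(self, path: str) -> bytes:
        backend, inner = self._resolve(path)
        return backend.read(inner)
```

In this sketch, mounting one backend at `/cache` and another at `/cache/archive` sends `/cache/archive/b.txt` to the second backend and every other `/cache/...` path to the first.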
The `FileWorker` class was created based on `msaFilesystem`.
`FileWorker` is a class that provides an interface for working with various files and file systems using msaFilesystem. Below are examples of how to use FileWorker to perform various tasks.
Since it was used by our platform for processing documents, it was important for us to separate documents by subdomain (organization). For this purpose we used folders.
7. The initialization of FileWorker
To start working with FileWorker, create an instance of it, specifying the subdomain, the document and client identifiers, and a path to the file system. The example below passes only the subdomain and the document identifier.
```python
from msaFileWorker.file_worker import FileWorker
file_worker = FileWorker(subdomain="example_subdomain", document_id="example_document_id")
```
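Separating documents by subdomain with folders can be pictured as building a path from the two identifiers. The layout `<subdomain>/<document_id>/<filename>` and the helper `document_path` below are assumptions made for illustration; the actual folder structure used by FileWorker is not shown in this article.

```python
from pathlib import PurePosixPath


def document_path(subdomain: str, document_id: str, filename: str = "") -> str:
    """Build '<subdomain>/<document_id>/<filename>' (hypothetical layout)."""
    for part in (subdomain, document_id):
        # Reject empty parts and anything that could escape the folder.
        if not part or "/" in part or part in {".", ".."}:
            raise ValueError(f"invalid path component: {part!r}")
    path = PurePosixPath(subdomain) / document_id
    if filename:
        # Keep only the file name so a caller cannot inject directories.
        path = path / PurePosixPath(filename).name
    return str(path)
```

Validating the components in one place keeps one organization's documents from being written into another organization's folder by a malformed identifier.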
8. The developed methods
Below is the list of methods that were developed:
- save_bytes_file
- save_many_bytes_as_files
- save_text_as_file
- save_many_texts_as_files
- get_file_as_zip
- get_folder_as_zip
- delete_file
- delete_folder
- get_file_as_bytes
- get_file_content
9. The auxiliary methods
The following methods are auxiliary:
- _get_zip_one_file
- _create_folders_if_not_exist
- _zip_with_subfolders
- _split_path
All methods are simple, and their names make clear what they do.
The `_zip_with_subfolders` method is an interesting one to explore. Its implementation uses recursion to collect all subfolders. Its signature and docstring are:
```python
async def _zip_with_subfolders(self, zip_buffer: BytesIO, folder: str, path: str) -> BytesIO:
    """Create a zip with all files (including all subfolders).

    Parameters:
        zip_buffer: BytesIO object
        folder: name of folder
        path: path to folder
    Returns:
        BytesIO
    """
    ...
```
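To show what recursive collection of subfolders into a ZIP archive can look like, here is a standalone sketch. It is not the FileWorker implementation: it is synchronous, works on a local directory with `os.scandir` and `zipfile` instead of msaFilesystem, and the function names and parameters are my own assumptions.

```python
import os
import zipfile
from io import BytesIO


def zip_with_subfolders(zip_file: zipfile.ZipFile, folder: str, arc_prefix: str = "") -> None:
    """Recursively add every file under `folder` to an open ZipFile."""
    for entry in sorted(os.scandir(folder), key=lambda e: e.name):
        arc_name = f"{arc_prefix}{entry.name}"
        if entry.is_dir():
            # Recurse into the subfolder, extending the archive path.
            zip_with_subfolders(zip_file, entry.path, arc_name + "/")
        else:
            zip_file.write(entry.path, arcname=arc_name)


def folder_to_zip_bytes(folder: str) -> BytesIO:
    """Return an in-memory ZIP archive of `folder`, subfolders included."""
    buffer = BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zip_file:
        zip_with_subfolders(zip_file, folder)
    buffer.seek(0)  # rewind so callers can read from the start
    return buffer
```

Note that this sketch only stores files, so empty folders do not appear in the archive. The returned `BytesIO` can be sent directly as a download response without touching the disk.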
Disclaimer
msaFilesystem was created exclusively for the internal needs of our company and is intended to simplify work with file systems in various environments. The main goal of developing this package was to create a convenient and effective tool for integrating various types of file systems, allowing developers to focus on solving business problems rather than on the routine work of service calls and setup.