VectorVFS: your filesystem as a vector database
Wednesday, May 7, 2025, 01:40 AM, from OS News
VectorVFS is a lightweight Python package that transforms your Linux filesystem into a vector database by leveraging the native VFS (Virtual File System) extended attributes. Rather than maintaining a separate index or external database, VectorVFS stores vector embeddings directly alongside each file—turning your existing directory structure into an efficient and semantically searchable embedding store.
VectorVFS supports Meta’s Perception Encoders (PE) [arxiv] which includes image/video encoders for vision language understanding, it outperforms InternVL3, Qwen2.5VL and SigLIP2 for zero-shot image tasks. We support both CPU and GPU but if you have a large collection of images it might take a while in the first time to embed all items if you are not using a GPU.
↫ Christian S. Perone
It won’t surprise many of you that this goes a bit above my paygrade, but according to my limited understanding, VectorVFS stores information about each file in the extended attributes (xattrs) attached to that file’s inode. The information is converted into vectors before it is stored, and this is the part that breaks my brain a bit, because vectors in this context are far too complex for me to understand. I vaguely understand the end result here (files become searchable through vector magic, with extended attributes in inodes taking the place of a dedicated database or separate index files), but the process that gets there is much harder to follow. It still seems like a very interesting approach, though, and I’d love for people smarter than me to take VectorVFS apart and explain it in easier terms for those of us who don’t fully grasp it.
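For readers who want a concrete picture of the idea, here is a minimal Python sketch of the general technique: pack a vector of floats into bytes, attach it to a file as an extended attribute, and later rank files by cosine similarity to a query vector. It is not VectorVFS's actual code, API, or storage format. The attribute name `user.demo.embedding`, the float32 layout, and every function name below are assumptions made up for illustration, and the embedding model that would produce the vectors is left out entirely. `os.setxattr` and `os.getxattr` are Linux-only, and the filesystem must support `user.*` attributes.

```python
import math
import os
import struct

# Hypothetical attribute name, chosen for this sketch only. On Linux, the
# "user." namespace is the one unprivileged processes can write to.
XATTR_KEY = "user.demo.embedding"


def encode_vector(vec):
    """Pack a list of floats into little-endian float32 bytes."""
    return struct.pack(f"<{len(vec)}f", *vec)


def decode_vector(blob):
    """Unpack little-endian float32 bytes back into a list of floats."""
    count = len(blob) // 4
    return list(struct.unpack(f"<{count}f", blob[: count * 4]))


def store_embedding(path, vec):
    """Attach the encoded vector to the file as an extended attribute."""
    os.setxattr(path, XATTR_KEY, encode_vector(vec))


def load_embedding(path):
    """Read the vector back, or return None if the file has none."""
    try:
        return decode_vector(os.getxattr(path, XATTR_KEY))
    except OSError:
        # Raised when the attribute is missing or xattrs are unsupported.
        return None


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(root, query_vec, top_k=5):
    """Walk a directory tree and rank files by similarity to query_vec."""
    scored = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            vec = load_embedding(path)
            if vec is not None and len(vec) == len(query_vec):
                scored.append((cosine_similarity(query_vec, vec), path))
    scored.sort(reverse=True)
    return scored[:top_k]
```

In this sketch the "database" is just the directory tree: `store_embedding("photo.jpg", vec)` writes the vector next to the file's other metadata, and `search(".", query_vec)` does a brute-force scan rather than using any index. Filesystems also limit how large an attribute value can be, so very high-dimensional embeddings may not fit everywhere.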
https://www.osnews.com/story/142293/vectorvfs-your-filesystem-as-a-vector-database/