Summary
Vearch is a scalable distributed system for efficient similarity search of deep learning vectors.
Overall Architecture
Data Model: space, documents, vectors, scalars
Components: Master, Router, PartitionServer.
Master: Responsible for schema management, cluster-level metadata, and resource coordination.
Router: Provides a RESTful API (create, delete, search, and update) and handles request routing and result merging.
PartitionServer (PS): Hosts document partitions with Raft-based replication. Gamma is the core vector search engine; it stores, indexes, and retrieves vectors and scalars.
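To make the Router's result-merging role concrete, here is a minimal Python sketch. It is not Vearch's actual implementation: the function name `merge_partition_results`, the `(score, document id)` hit format, and the convention that a higher score means a closer match are all assumptions made for illustration. The idea is only that each partition returns its own top hits and the Router combines them into one global top-k list.

```python
import heapq
from typing import List, Tuple

# Hypothetical hit format: (score, document id); a higher score means more similar.
Hit = Tuple[float, str]


def merge_partition_results(partition_hits: List[List[Hit]], k: int) -> List[Hit]:
    """Merge per-partition result lists into a single global top-k list."""
    # Flatten the hits from every partition into one stream.
    all_hits = (hit for hits in partition_hits for hit in hits)
    # Keep the k hits with the highest scores, sorted from best to worst.
    return heapq.nlargest(k, all_hits, key=lambda hit: hit[0])


if __name__ == "__main__":
    ps1 = [(0.92, "doc-7"), (0.40, "doc-2")]    # hits from one partition
    ps2 = [(0.88, "doc-11"), (0.75, "doc-5")]   # hits from another partition
    print(merge_partition_results([ps1, ps2], k=3))
    # prints [(0.92, 'doc-7'), (0.88, 'doc-11'), (0.75, 'doc-5')]
```

For a distance-based metric such as L2, where smaller values are better, the same sketch would use `heapq.nsmallest` instead.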
General Introduction
- One document one vector.
- One document multiple vectors.
- One document has multiple data sources and vectors.
- Filtering on numerical fields.
- Batch operations for insertion and search.
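The scenarios above can be pictured with a small in-memory model. The sketch below is purely illustrative and does not reflect Vearch's schema or API: the `Document` class, its field names, and the `filter_by_range` helper are hypothetical. It shows a document carrying several named vectors alongside scalar fields, and a numeric range filter applied to a batch of documents.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    """Hypothetical document: scalar fields plus one or more named vectors."""
    doc_id: str
    scalars: Dict[str, float] = field(default_factory=dict)
    vectors: Dict[str, List[float]] = field(default_factory=dict)


def filter_by_range(docs: List[Document], name: str, low: float, high: float) -> List[Document]:
    """Keep documents whose numeric field `name` lies in [low, high]."""
    return [d for d in docs if name in d.scalars and low <= d.scalars[name] <= high]


if __name__ == "__main__":
    batch = [
        # One document with two vectors from different data sources.
        Document("a", {"price": 10.0}, {"image": [0.1, 0.9], "text": [0.3, 0.7]}),
        # One document with a single vector.
        Document("b", {"price": 25.0}, {"image": [0.5, 0.5]}),
    ]
    print([d.doc_id for d in filter_by_range(batch, "price", 5.0, 20.0)])
    # prints ['a']
```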
System Features
- The Gamma engine, implemented in C++, provides fast vector retrieval.
- Supports inner product and L2 (Euclidean) distance as vector distance metrics.
- Supports both in-memory and on-disk storage, which allows very large datasets.
- Multi-replica data storage based on the Raft protocol.
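As a reminder of what the two distance metrics listed above compute, here is a minimal pure-Python sketch. The function names are illustrative and are not part of Vearch or Gamma; a production engine would use optimized native code. For inner product, a larger value indicates a closer match; for L2, a smaller value does.

```python
import math
from typing import Sequence


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; larger means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance; smaller means more similar."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    print(inner_product([1, 2, 3], [4, 5, 6]))  # prints 32
    print(l2_distance([0, 0], [3, 4]))          # prints 5.0
```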