Use JinaAI models for embeddings (#14252)
* add generic onnx model class and use jina ai clip models for all embeddings
* fix merge conflict
* add generic onnx model class and use jina ai clip models for all embeddings
* fix merge conflict
* preferred providers
* fix paths
* disable download progress bar
* remove logging of path
* drop and recreate tables on reindex
* use cache paths
* fix model name
* use trust remote code per transformers docs
* ensure tokenizer and feature extractor are correctly loaded
* revert
* manually download and cache feature extractor config
* remove unneeded
* remove old clip and minilm code
* docs update
@@ -5,7 +5,7 @@ title: Using Semantic Search
Semantic Search in Frigate allows you to find tracked objects within your review items using either the image itself, a user-defined text description, or an automatically generated one. This feature works by creating _embeddings_ — numerical vector representations — for both the images and text descriptions of your tracked objects. By comparing these embeddings, Frigate assesses their similarities to deliver relevant search results.
- Frigate has support for two models to create embeddings, both of which run locally: [OpenAI CLIP](https://openai.com/research/clip) and [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2). Embeddings are then saved to Frigate's database.
+ Frigate has support for [Jina AI's CLIP model](https://huggingface.co/jinaai/jina-clip-v1) to create embeddings, which runs locally. Embeddings are then saved to Frigate's database.
Semantic Search is accessed via the _Explore_ view in the Frigate UI.
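
To make the idea of "comparing embeddings" concrete, here is a minimal illustrative sketch, not Frigate's actual code: it assumes NumPy and uses made-up four-dimensional vectors, whereas real CLIP embeddings have hundreds of dimensions.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors (1.0 = identical direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical, low-dimensional embeddings for illustration only.
thumbnail_embedding = np.array([0.12, -0.53, 0.88, 0.05])
query_embedding = np.array([0.10, -0.49, 0.91, 0.02])

print(cosine_similarity(thumbnail_embedding, query_embedding))  # close to 1.0 -> very similar
```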
@@ -27,13 +27,11 @@ If you are enabling the Search feature for the first time, be advised that Friga
:::
- ### OpenAI CLIP
+ ### Jina AI CLIP
- This model is able to embed both images and text into the same vector space, which allows `image -> image` and `text -> image` similarity searches. Frigate uses this model on tracked objects to encode the thumbnail image and store it in the database. When searching for tracked objects via text in the search box, Frigate will perform a `text -> image` similarity search against this embedding. When clicking "Find Similar" in the tracked object detail pane, Frigate will perform an `image -> image` similarity search to retrieve the closest matching thumbnails.
+ The vision model is able to embed both images and text into the same vector space, which allows `image -> image` and `text -> image` similarity searches. Frigate uses this model on tracked objects to encode the thumbnail image and store it in the database. When searching for tracked objects via text in the search box, Frigate will perform a `text -> image` similarity search against this embedding. When clicking "Find Similar" in the tracked object detail pane, Frigate will perform an `image -> image` similarity search to retrieve the closest matching thumbnails.
- ### all-MiniLM-L6-v2
- This is a sentence embedding model that has been fine-tuned on over 1 billion sentence pairs. This model is used to embed tracked object descriptions and perform searches against them. Descriptions can be created, viewed, and modified on the Search page when clicking on the gray tracked object chip at the top left of each review item. See [the Generative AI docs](/configuration/genai.md) for more information on how to automatically generate tracked object descriptions.
+ The text model is used to embed tracked object descriptions and perform searches against them. Descriptions can be created, viewed, and modified on the Search page when clicking on the gray tracked object chip at the top left of each review item. See [the Generative AI docs](/configuration/genai.md) for more information on how to automatically generate tracked object descriptions.
## Usage
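
The `text -> image` and `image -> image` searches described in the diff above amount to ranking stored thumbnail embeddings by similarity to a query embedding. Below is a rough sketch of that ranking step under the same assumptions as the earlier snippet (NumPy, made-up vectors); the names `top_matches` and `stored_embeddings` are hypothetical and not part of Frigate's API.

```python
import numpy as np

def top_matches(query, stored, k=3):
    """Rank stored embeddings by cosine similarity to the query and return the top k."""
    scores = {}
    for object_id, emb in stored.items():
        scores[object_id] = float(
            np.dot(query, emb) / (np.linalg.norm(query) * np.linalg.norm(emb))
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]

# Hypothetical data: in Frigate, embeddings would come from the CLIP model
# and be loaded from the database.
rng = np.random.default_rng(0)
stored_embeddings = {f"event_{i}": rng.normal(size=8) for i in range(10)}
query_embedding = rng.normal(size=8)

for object_id, score in top_matches(query_embedding, stored_embeddings):
    print(object_id, round(score, 3))
```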