
Commit

Merge pull request #2699 from weaviate/v1-27-palm-to-google_new
v1 27 `xxx-palm` to `xxx-google` second try
dirkkul authored Oct 15, 2024
2 parents 75e9dc8 + e591d18 commit 4e8d6f0
Showing 9 changed files with 95 additions and 36 deletions.
4 changes: 2 additions & 2 deletions developers/contributor-guide/weaviate-core/setup.md
@@ -31,7 +31,7 @@ To run the server locally with the OpenAI module.

The default configuration is `local-development` which will run the server locally with the `text2vec-contextionary` and `backup-filesystem` modules.

You can also create your own configuration. For instance, you can clone an entry (`local-all-openai-cohere-palm` is a good start) and add the required [environment variables](../../weaviate/config-refs/env-vars.md).
You can also create your own configuration. For instance, you can clone an entry (`local-all-openai-cohere-google` is a good start) and add the required [environment variables](../../weaviate/config-refs/env-vars.md).


## Running with Docker
@@ -85,4 +85,4 @@ import ContributorGuideMoreResources from '/_includes/more-resources-contributor

import DocsFeedback from '/_includes/docs-feedback.mdx';

<DocsFeedback/>
<DocsFeedback/>
2 changes: 1 addition & 1 deletion developers/weaviate/config-refs/schema/vector-index.md
@@ -237,7 +237,7 @@ services:
AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
DEFAULT_VECTORIZER_MODULE: 'text2vec-openai'
ENABLE_MODULES: 'text2vec-cohere,text2vec-huggingface,text2vec-openai,text2vec-palm,generative-cohere,generative-openai,generative-palm'
ENABLE_MODULES: 'text2vec-cohere,text2vec-huggingface,text2vec-openai,text2vec-google,generative-cohere,generative-openai,generative-google'
CLUSTER_HOSTNAME: 'node1'
AUTOSCHEMA_ENABLED: 'false'
ASYNC_INDEXING: 'true'
3 changes: 3 additions & 0 deletions developers/weaviate/configuration/tenant-offloading.md
@@ -4,6 +4,9 @@ sidebar_position: 5
image: og/docs/configuration.jpg
---

:::info Added in `v1.26`
:::

Tenants can be offloaded to cold storage to reduce memory and disk usage, and onloaded back when needed.

This page explains how to configure tenant offloading in Weaviate. For information on how to offload and onload tenants, see [How-to: manage tenant states](../manage-data/tenant-states.mdx).
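
The exact setup is described in the linked how-to and the configuration references, but as a rough Docker Compose sketch — the `offload-s3` module name and the `OFFLOAD_S3_*` variable names below are assumptions used to illustrate the shape of the configuration — it might look like:

```yaml
# Sketch only: enable S3-based tenant offloading (names assumed, verify against the docs).
services:
  weaviate:
    environment:
      # Add the offloading module to the module list.
      ENABLE_MODULES: 'text2vec-openai,offload-s3'
      # Bucket that receives offloaded tenant data (assumed variable names).
      OFFLOAD_S3_BUCKET: 'weaviate-offload'
      OFFLOAD_S3_BUCKET_AUTO_CREATE: 'true'
      # Standard AWS credentials for access to the bucket.
      AWS_ACCESS_KEY_ID: 'your-access-key-id'
      AWS_SECRET_ACCESS_KEY: 'your-secret-access-key'
      AWS_REGION: 'us-east-1'
```
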
@@ -32,7 +32,11 @@ At [import time](#data-import), Weaviate generates multimodal object embeddings

### Weaviate configuration

Your Weaviate instance must be configured with the Google AI vectorizer integration (`multi2vec-palm`) module.
Your Weaviate instance must be configured with the Google AI vectorizer integration (`multi2vec-google`) module.

:::info Module name change
`multi2vec-google` was called `multi2vec-palm` in Weaviate versions prior to `v1.27`.
:::
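
For self-hosted Docker deployments, the module is enabled through the `ENABLE_MODULES` environment variable, as in the other docker-compose examples in these docs — a minimal sketch:

```yaml
# Sketch: enable the Google AI multimodal vectorizer module.
services:
  weaviate:
    environment:
      # Use `multi2vec-palm` here instead if running Weaviate older than v1.27.
      ENABLE_MODULES: 'multi2vec-google'
```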

<details>
<summary>For Weaviate Cloud (WCD) users</summary>
6 changes: 5 additions & 1 deletion developers/weaviate/model-providers/google/embeddings.md
@@ -32,7 +32,11 @@ At the time of writing (November 2023), AI Studio is not available in all region

### Weaviate configuration

Your Weaviate instance must be configured with the Google AI vectorizer integration (`text2vec-palm`) module.
Your Weaviate instance must be configured with the Google AI vectorizer integration (`text2vec-google`) module.

:::info Module name change
`text2vec-google` was called `text2vec-palm` in Weaviate versions prior to `v1.27`.
:::
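
For self-hosted Docker deployments, a minimal sketch — only the module name comes from this page; `DEFAULT_VECTORIZER_MODULE` is optional and shown for illustration:

```yaml
# Sketch: enable the Google AI text vectorizer and (optionally) make it the default.
services:
  weaviate:
    environment:
      ENABLE_MODULES: 'text2vec-google'
      # Optional: apply this vectorizer to new collections by default.
      DEFAULT_VECTORIZER_MODULE: 'text2vec-google'
```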

<details>
<summary>For Weaviate Cloud (WCD) users</summary>
6 changes: 5 additions & 1 deletion developers/weaviate/model-providers/google/generative.md
@@ -32,7 +32,11 @@ At the time of writing (November 2023), AI Studio is not available in all region

### Weaviate configuration

Your Weaviate instance must be configured with the Google AI generative AI integration (`generative-palm`) module.
Your Weaviate instance must be configured with the Google AI generative AI integration (`generative-google`) module.

:::info Module name change
`generative-google` was called `generative-palm` in Weaviate versions prior to `v1.27`.
:::
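
For self-hosted Docker deployments, the generative module is typically enabled alongside a vectorizer — a minimal sketch following the docker-compose examples used elsewhere in these docs:

```yaml
# Sketch: enable a vectorizer plus the Google generative AI module for RAG.
services:
  weaviate:
    environment:
      # Use the `-palm` module names instead on Weaviate versions before v1.27.
      ENABLE_MODULES: 'text2vec-google,generative-google'
```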

<details>
<summary>For Weaviate Cloud (WCD) users</summary>
98 changes: 71 additions & 27 deletions developers/weaviate/modules/index.md
@@ -11,14 +11,56 @@ This section describes Weaviate's individual modules, including their capabiliti
They have moved to our [model provider integrations](../model-providers/index.md) section, for a more focussed, user-centric look at these integrations.
:::

- The Vectorizer (also called Retrievers sometimes) modules such as `text2vec-*` or `img2vec-*` convert data objects and query inputs to vectors.
- The (Re)Ranker modules such as `rerank-*` apply a(n) (additional) ranking process to the search results.
- The Reader & Generator modules process data after retrieving the data from Weaviate, such as to answer questions or summarize text.
- The other modules include everything else, such as a spellcheck module.

## General

Modules can be "vectorizers" (defines how the numbers in the vectors are chosen from the data) or other modules providing additional functions like question answering, custom classification, etc. Modules have the following characteristics:
Weaviate's modules are built into the codebase, and [enabled through environment variables](../configuration/modules.md) to provide additional functionalities.
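
For example, a Docker Compose deployment might enable a set of modules like this (a sketch; pick the modules your use case needs):

```yaml
# Sketch: enable vectorizer, generative, and backup modules via ENABLE_MODULES.
services:
  weaviate:
    environment:
      ENABLE_MODULES: 'text2vec-openai,generative-openai,backup-filesystem'
```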

### Module types

Weaviate modules can be divided into the following categories:

- [Vectorizers](#vectorizer-reranker-and-generative-ai-integrations): Convert data into vector embeddings for import and vector search.
- [Rerankers](#vectorizer-reranker-and-generative-ai-integrations): Improve search results by reordering initial search results.
- [Generative AI](#vectorizer-reranker-and-generative-ai-integrations): Integrate generative AI models for retrieval augmented generation (RAG).
- [Backup](#backup-modules): Facilitate backup and restore operations in Weaviate.
- [Offloading](#offloading-modules): Facilitate offloading of tenant data to external storage.
- [Others](#other-modules): Modules that provide additional functionalities.

#### Vectorizer, reranker, and generative AI integrations

For these modules, see the [model provider integrations](../model-providers/index.md) documentation. These pages are organized by the model provider (e.g. Hugging Face, OpenAI) and then the model type (e.g. vectorizer, reranker, generative AI).

For example:

- [The OpenAI embedding integration page](../model-providers/openai/embeddings.md) shows how to use OpenAI's embedding models in Weaviate.

<img
src={require('../model-providers/_includes/integration_openai_embedding.png').default}
alt="Embedding integration illustration"
style={{ maxWidth: "50%", display: "block", marginLeft: "auto", marginRight: "auto"}}
/>
<br/>

- [The Cohere reranker integration page](../model-providers/cohere/reranker.md) shows how to use Cohere's reranker models in Weaviate.

<img
src={require('../model-providers/_includes/integration_cohere_reranker.png').default}
alt="Reranker integration illustration"
style={{ maxWidth: "50%", display: "block", marginLeft: "auto", marginRight: "auto"}}
/>
<br/>

- [The Anthropic generative AI integration page](../model-providers/anthropic/generative.md) shows how to use Anthropic's generative AI models in Weaviate.

<img
src={require('../model-providers/_includes/integration_anthropic_rag.png').default}
alt="Generative integration illustration"
style={{ maxWidth: "50%", display: "block", marginLeft: "auto", marginRight: "auto"}}
/>
<br/>

### Module characteristics

- Naming convention:
  - Vectorizer (Retriever module): `<media>2vec-<name>-<optional>`, for example `text2vec-contextionary`, `img2vec-neural` or `text2vec-transformers`.
  - Other modules: `<functionality>-<name>-<optional>`, for example `qna-transformers`.
@@ -28,32 +70,15 @@ Modules can be "vectorizers" (defines how the numbers in the vectors are chosen
- General module information (which modules are attached, version, etc.) is accessible through Weaviate's [`v1/meta` endpoint](../config-refs/meta.md).
- Modules can add `additional` properties in the RESTful API and [`_additional` properties in the GraphQL API](../api/graphql/additional-properties.md).
- A module can add [filters](../api/graphql/filters.md) in GraphQL queries.
- Which vectorizer and other modules are applied to which data classes is configured in the [schema](../manage-data/collections.mdx#specify-a-vectorizer).

## Default vectorizer module

Unless you specify a default vectorization module in Weaviate's configuration, you'll need to specify which vectorization module is used per class you add to the data schema (or you need to enter a vector for each data point you add manually). Set the default with the environment variable `DEFAULT_VECTORIZER_MODULE` to `text2vec-contextionary` in the Docker Compose file:

``` yaml
services:
  weaviate:
    environment:
      DEFAULT_VECTORIZER_MODULE: text2vec-contextionary
```
## Multiple vectors (named vectors)
import MultiVectorSupport from '/_includes/multi-vector-support.mdx';
<MultiVectorSupport />
- Which vectorizer and other modules are applied to which data collection is configured in the [schema](../manage-data/collections.mdx#specify-a-vectorizer).

## Backup Modules

Backup and restore operations in Weaviate are facilitated by the use of backup provider modules.

These are interchangeable storage backends which exist either internally or externally. The following sections will explain the difference between these two types of backup provider modules, and their intended usages.
These are interchangeable storage backends which exist either internally or externally.

## External provider
### External provider

External backup providers coordinate the storage and retrieval of backed-up Weaviate data with external storage services.

@@ -68,12 +93,31 @@ The supported external backup providers are:

Thanks to the extensibility of the module system, new providers can be readily added. If you are interested in an external provider other than the ones listed above, feel free to reach out via our [forum](https://forum.weaviate.io/), or open an issue on [GitHub](https://github.com/weaviate/weaviate).
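
As an illustration, enabling an S3 backup backend in Docker Compose might look roughly like the sketch below — the `backup-s3` module name and the `BACKUP_S3_BUCKET` variable are assumptions here; see the [backups configuration](/developers/weaviate/configuration/backups.md) page for the exact names:

```yaml
# Sketch only: enable an external S3 backup provider (names assumed, verify against the docs).
services:
  weaviate:
    environment:
      ENABLE_MODULES: 'backup-s3'
      # Bucket that stores the backups (assumed variable name).
      BACKUP_S3_BUCKET: 'weaviate-backups'
```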

## Internal provider
### Internal provider

Internal providers coordinate the storage and retrieval of backed-up Weaviate data within a Weaviate instance. This type of provider is intended for developmental or experimental use, and is not recommended for production. Internal providers are not compatible with multi-node backups, which require the use of an external provider.

As of Weaviate `v1.16`, the only supported internal backup provider is the [filesystem](/developers/weaviate/configuration/backups.md#filesystem) provider.
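
A corresponding sketch for the filesystem provider — the `BACKUP_FILESYSTEM_PATH` variable is an assumption; the [backups configuration](/developers/weaviate/configuration/backups.md#filesystem) page has the authoritative details:

```yaml
# Sketch: enable the filesystem backup provider (single-node, non-production use).
services:
  weaviate:
    environment:
      ENABLE_MODULES: 'backup-filesystem'
      # Local path where backups are written (assumed variable name).
      BACKUP_FILESYSTEM_PATH: '/var/lib/weaviate/backups'
```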

## Offloading Modules

:::info Added in `v1.26`
:::

Offloading modules facilitate the offloading of tenant data to external storage. This is useful for managing resources and costs.

See [how to configure: offloading](../configuration/tenant-offloading.md) for more information on how to configure and use offloading modules.

## Other modules

In addition to the above, there are other modules such as:

- [qna-transformers](./qna-transformers.md): Question-answering (answer extraction) capability using transformers models.
- [qna-openai](./qna-openai.md): Question-answering (answer extraction) capability using OpenAI models.
- [ner-transformers](./ner-transformers.md): Named entity recognition capability using transformers models.
- [text-spellcheck](./ner-transformers.md): Spell checking capability for GraphQL queries.
- [sum-transformers](./sum-transformers.md): Summarize text using transformer models.

## Related pages

- [Configuration: Modules](../configuration/modules.md)
4 changes: 2 additions & 2 deletions developers/weaviate/starter-guides/generative.md
@@ -333,7 +333,7 @@ To use generative search, the appropriate `generative-xxx` module must be:
- Enabled in Weaviate, and
- Specified in the collection definition.

Each module is tied to a specific group of LLMs, such as `generative-cohere` for Cohere models, `generative-openai` for OpenAI models and `generative-palm` for PaLM and Gemini models.
Each module is tied to a specific group of LLMs, such as `generative-cohere` for Cohere models, `generative-openai` for OpenAI models and `generative-google` for Google models.

If you are using WCD, you will not need to do anything to enable modules.

@@ -390,7 +390,7 @@ For configurable deployments, you can specify enabled modules. For example, in a
services:
  weaviate:
    environment:
      ENABLE_MODULES: 'text2vec-cohere,text2vec-huggingface,text2vec-openai,text2vec-palm,generative-cohere,generative-openai,generative-palm'
      ENABLE_MODULES: 'text2vec-cohere,text2vec-huggingface,text2vec-openai,text2vec-google,generative-cohere,generative-openai,generative-google'
```
Check the specific documentation for your deployment method ([Docker](../installation/docker-compose.md), [Kubernetes](../installation/kubernetes.md), [Embedded Weaviate](../installation/embedded.md)) for more information on how to configure it.
2 changes: 1 addition & 1 deletion developers/weaviate/starter-guides/which-weaviate.md
@@ -44,7 +44,7 @@ If you are evaluating Weaviate, we recommend using one of these instance types t
- [Weaviate Cloud (WCD)](/developers/wcs) sandbox
- [Embedded Weaviate](/developers/weaviate/installation/embedded)

Use an inference-API based text vectorizer with your instance, for example, `text2vec-cohere`, `text2vec-huggingface`, `text2vec-openai`, or `text2vec-palm`.
Use an inference-API based text vectorizer with your instance, for example, `text2vec-cohere`, `text2vec-huggingface`, `text2vec-openai`, or `text2vec-google`.

The [Quickstart guide](/developers/weaviate/quickstart) uses a WCD sandbox and an API based vectorizer to run the examples.


