This script is designed to initialize and return a PGVector instance that interacts with a PostgreSQL database to store and retrieve vectorized document embeddings.
- logging: Used for logging script actions and errors.
- psycopg2: For interacting with PostgreSQL databases.
- BedrockEmbeddings: LangChain community embeddings instance for handling document embeddings. This project uses the Amazon Titan Text Embeddings V2 model to generate embeddings.
- PGVector: PostgreSQL-based vector store for storing and retrieving vectorized documents.
- get_vectorstore: Initializes a PGVector instance connected to the PostgreSQL database and returns it together with the connection string, or None if initialization fails. It handles connection setup and error logging.
- Database Connection: A connection string is built based on the provided database credentials.
- Vector Store Initialization: A PGVector instance is initialized using the connection string and collection name.
- Logging: The script logs successful initialization and any errors that occur.
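The connection-string step above can be sketched in isolation. This is a minimal illustration, not the script's own code: build_connection_string is a hypothetical helper, and percent-encoding the credentials with quote_plus is an added precaution (assumed here, not shown in the original) so that passwords containing characters such as "@" or "/" still yield a valid URL.

```python
from urllib.parse import quote_plus


def build_connection_string(
    user: str, password: str, host: str, port: int, dbname: str
) -> str:
    """Build a SQLAlchemy-style PostgreSQL URL (hypothetical helper)."""
    # Percent-encode the credentials so characters such as "@" or "/"
    # in a password cannot be mistaken for URL delimiters.
    safe_user = quote_plus(user)
    safe_password = quote_plus(password)
    return (
        f"postgresql+psycopg://{safe_user}:{safe_password}"
        f"@{host}:{port}/{dbname}"
    )


# Example values are placeholders, not real credentials.
print(build_connection_string("app", "p@ss/word", "localhost", 5432, "vectors"))
```

The resulting string has the same shape as the one the script passes to PGVector; only the escaping of the user name and password differs.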
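import logging
from typing import Optional, Tuple

# Import paths are assumed from the dependencies listed above.
from langchain_community.embeddings import BedrockEmbeddings
from langchain_postgres import PGVector

logger = logging.getLogger(__name__)


def get_vectorstore(
    collection_name: str,
    embeddings: BedrockEmbeddings,
    dbname: str,
    user: str,
    password: str,
    host: str,
    port: int,
) -> Optional[Tuple[PGVector, str]]:
    """Return (vectorstore, connection_string), or None if setup fails."""
    try:
        # SQLAlchemy-style URL that selects the psycopg driver.
        connection_string = (
            f"postgresql+psycopg://{user}:{password}@{host}:{port}/{dbname}"
        )
        logger.info("Initializing the VectorStore")
        vectorstore = PGVector(
            embeddings=embeddings,
            collection_name=collection_name,
            connection=connection_string,
            use_jsonb=True,
        )
        logger.info("VectorStore initialized")
        # The return value is a tuple, so the annotation above is a Tuple.
        return vectorstore, connection_string
    except Exception as e:
        logger.error(f"Error initializing vector store: {e}")
        return None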
get_vectorstore initializes and returns a PGVector instance that connects to a PostgreSQL database and prepares a vector store for storing embedded document data.
- Database Connection Setup: Creates a connection string from the provided database credentials and logs progress and errors.
- Vector Store Initialization: Constructs a PGVector instance using the connection string, collection name, and embeddings. If successful, returns the initialized PGVector instance along with the connection string.
- Error Handling: Captures and logs any errors that occur during vector store initialization.
- Inputs:
  - collection_name: The name of the collection in the vector store.
  - embeddings: The BedrockEmbeddings instance for creating embeddings.
  - dbname: Database name.
  - user: Database user.
  - password: Database password.
  - host: Host for the PostgreSQL database.
  - port: Port for the PostgreSQL database.
- Outputs:
  - Returns the initialized PGVector instance and the connection string if successful.
  - Returns None if an error occurred during setup.
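Because the function returns either a pair or None, a caller has to check the result before unpacking it. The following is a minimal sketch of one way to do that; require_vectorstore is a hypothetical helper that is not part of the script, and the vector store is typed as Any so the snippet runs without a database or LangChain installed.

```python
from typing import Any, Optional, Tuple


def require_vectorstore(result: Optional[Tuple[Any, str]]) -> Tuple[Any, str]:
    """Unpack a get_vectorstore result, raising if setup failed.

    Hypothetical helper for illustration only.
    """
    if result is None:
        # get_vectorstore has already logged the underlying error.
        raise RuntimeError("Vector store initialization failed; see the logs")
    vectorstore, connection_string = result
    return vectorstore, connection_string


# Intended use, assuming get_vectorstore from the script is in scope:
#   result = get_vectorstore(collection_name, embeddings, dbname,
#                            user, password, host, port)
#   vectorstore, connection_string = require_vectorstore(result)
```

Raising at the call site is one design choice among several; a caller could equally return early or retry, depending on how the surrounding application treats a missing vector store.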