feat: Use RAG (Retrieval-Augmented Generation) with Mistral-7b-instruct language model #20

Closed
wants to merge 19 commits into from

cindyli
Contributor

cindyli commented Apr 18, 2024

Description

This pull request uses RAG (Retrieval-Augmented Generation) with a language model (Mistral-7b-instruct) to query the user's background information so that the language model can provide a better response.

cindyli changed the title from "feat: Use RAG (Retrieval-Augmented Generation) with Mistral-7b-instruct" to "feat: Use RAG (Retrieval-Augmented Generation) with Mistral-7b-instruct language model" on Apr 18, 2024
@klown
Contributor

klown commented Apr 22, 2024

@cindyli, I replicated the error at line 53 of the rag.py script. But there is an earlier error involving the pydantic package and its BaseSettings class, and it looks like this is the actual error:

pydantic.errors.PydanticImportError: BaseSettings has been moved to the pydantic-settings package. See https://docs.pydantic.dev/2.7/migration/#basesettings-has-moved-to-pydantic-settings for more details.

Searching for that error led to a GitHub issue. There are compatibility issues among the version of Python used, the version of pydantic, and the version of fastapi. One solution, referenced in a pull request, is to pin pydantic and fastapi to older versions.
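
To check which versions are actually in play before pinning anything, a small script along these lines might help. It is only a sketch: the helper names `installed_version` and `base_settings_moved` are made up for illustration, and the package list is taken from the pip output in the trace below. It relies on the fact stated in the error message, that BaseSettings left the main pydantic package as of pydantic 2.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def base_settings_moved(pydantic_version: str) -> bool:
    """Return True if this pydantic version no longer ships BaseSettings.

    BaseSettings moved to the separate pydantic-settings package in
    pydantic 2.0, so `from pydantic import BaseSettings` fails from there on.
    """
    major = int(pydantic_version.split(".")[0])
    return major >= 2


if __name__ == "__main__":
    # Packages named in the discussion and in the pip output (illustrative list).
    for name in ("pydantic", "fastapi", "chromadb"):
        print(f"{name}: {installed_version(name) or 'not installed'}")

    pydantic_version = installed_version("pydantic")
    if pydantic_version and base_settings_moved(pydantic_version):
        print("pydantic >= 2: code importing BaseSettings from pydantic will fail")
```

Running this inside the activated environment shows whether the interpreter sees pydantic 2.x, which is the condition under which the PydanticImportError above is raised.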

Here's a full trace of what happened during my interactive session. The first part uses the job_rag.sh and rag.py scripts as is, and replicates the error you reported. The second part applies the pinned-version fix, using earlier versions of pydantic and fastapi. But it fails earlier, at line 14 of rag.py, because of another version issue.

First part:

[clown@cdr2598 scratch]$ module load cuda cudnn
[clown@cdr2598 scratch]$ moduel unload cuda
[mii] moduel not found! Similar commands: "mode", "modes", "modutil"
[clown@cdr2598 scratch]$ module unload cuda

Inactive Modules:
  1) cudnn     2) flexiblas     3) gcccore/.12.3     4) hwloc/2.9.1     5) libfabric/1.18.0     6) openmpi     7) pmix/4.2.4     8) ucc/1.2.0     9) ucx/1.14.1

[clown@cdr2598 scratch]$ module load StdEnv/2023

Activating Modules:
  1) flexiblas/3.3.1     2) gcccore/.12.3     3) hwloc/2.9.1     4) libfabric/1.18.0     5) openmpi/4.1.5     6) pmix/4.2.4     7) ucc/1.2.0     8) ucx/1.14.1

[clown@cdr2598 scratch]$ module load python/3.11.
[clown@cdr2598 scratch]$ module load python/3.11.5
[clown@cdr2598 scratch]$ python --version
Python 3.11.5
[clown@cdr2598 scratch]$ virtualenv --no-download $SLURM_TMPDIR/env
created virtual environment CPython3.11.5.final.0-64 in 26985ms
  creator CPython3Posix(dest=/localscratch/clown.29386918.0/env, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/clown/.local/share/virtualenv)
    added seed packages: pip==23.3.2, setuptools==69.2.0, wheel==0.43.0
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
[clown@cdr2598 scratch]$ source $SLURM_TMPDIR/env/bin/activate
(env) [clown@cdr2598 scratch]$ pip install --upgrade pip
Looking in links: /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/gentoo2023/x86-64-v3, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/gentoo2023/generic, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/generic
Requirement already satisfied: pip in /localscratch/clown.29386918.0/env/lib/python3.11/site-packages (23.3.2)
Processing /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/generic/pip-24.0+computecanada-py3-none-any.whl
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 23.3.2
    Uninstalling pip-23.3.2:
      Successfully uninstalled pip-23.3.2
Successfully installed pip-24.0+computecanada
(env) [clown@cdr2598 scratch]$ module load StdEnv/2023 rust/1.70.0 arrow/15.0.1 gcc/12.3 cudacore/.12.2.2

Activating Modules:
  1) cudnn/8.9.5.29

(env) [clown@cdr2598 scratch]$ pip install --no-index torch transformers tensorflow sentence_transformers accelerate==0.25.0 peft==0.5.0 bitsandbytes==0.42.0 datasets==2.17.0 trl
Defaulting to user installation because normal site-packages is not writeable
Looking in links: /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/gentoo2023/x86-64-v3, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/gentoo2023/generic, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/generic
Requirement already satisfied: torch in /home/clown/.local/lib/python3.11/site-packages (2.2.1+computecanada)
Requirement already satisfied: transformers in /home/clown/.local/lib/python3.11/site-packages (4.39.3+computecanada)
Requirement already satisfied: tensorflow in /home/clown/.local/lib/python3.11/site-packages (2.15.1+computecanada)
Requirement already satisfied: sentence_transformers in /home/clown/.local/lib/python3.11/site-packages (2.5.0+computecanada)
Requirement already satisfied: accelerate==0.25.0 in /home/clown/.local/lib/python3.11/site-packages (0.25.0+computecanada)
Requirement already satisfied: peft==0.5.0 in /home/clown/.local/lib/python3.11/site-packages (0.5.0+computecanada)
Requirement already satisfied: bitsandbytes==0.42.0 in /home/clown/.local/lib/python3.11/site-packages (0.42.0+computecanada)
Requirement already satisfied: datasets==2.17.0 in /home/clown/.local/lib/python3.11/site-packages (2.17.0+computecanada)
Requirement already satisfied: trl in /home/clown/.local/lib/python3.11/site-packages (0.7.11+computecanada)
Requirement already satisfied: numpy>=1.17 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from accelerate==0.25.0) (1.25.2+computecanada)
Requirement already satisfied: packaging>=20.0 in /home/clown/.local/lib/python3.11/site-packages (from accelerate==0.25.0) (23.2+computecanada)
Requirement already satisfied: psutil in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/ipykernel/2023b/lib/python3.11/site-packages (from accelerate==0.25.0) (5.9.5+computecanada)
Requirement already satisfied: pyyaml in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from accelerate==0.25.0) (6.0.1+computecanada)
Requirement already satisfied: huggingface-hub in /home/clown/.local/lib/python3.11/site-packages (from accelerate==0.25.0) (0.22.2+computecanada)
Requirement already satisfied: safetensors>=0.3.1 in /home/clown/.local/lib/python3.11/site-packages (from accelerate==0.25.0) (0.4.1+computecanada)
Requirement already satisfied: tqdm in /home/clown/.local/lib/python3.11/site-packages (from peft==0.5.0) (4.66.2+computecanada)
Requirement already satisfied: scipy in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from bitsandbytes==0.42.0) (1.11.2+computecanada)
Requirement already satisfied: filelock in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/python/3.11.5/lib/python3.11/site-packages (from datasets==2.17.0) (3.12.2)
Requirement already satisfied: pyarrow>=12.0.0 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/CUDA/gcccore/cuda12.2/arrow/15.0.1/lib/python3.11/site-packages (from datasets==2.17.0) (15.0.1)
Requirement already satisfied: pyarrow-hotfix in /home/clown/.local/lib/python3.11/site-packages (from datasets==2.17.0) (0.6+computecanada)
Requirement already satisfied: dill<0.3.9,>=0.3.0 in /home/clown/.local/lib/python3.11/site-packages (from datasets==2.17.0) (0.3.8+computecanada)
Requirement already satisfied: pandas in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from datasets==2.17.0) (2.1.0+computecanada)
Requirement already satisfied: requests>=2.19.0 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from datasets==2.17.0) (2.31.0+computecanada)
Requirement already satisfied: xxhash in /home/clown/.local/lib/python3.11/site-packages (from datasets==2.17.0) (3.2.0+computecanada)
Requirement already satisfied: multiprocess in /home/clown/.local/lib/python3.11/site-packages (from datasets==2.17.0) (0.70.16+computecanada)
Requirement already satisfied: fsspec[http]<=2023.10.0,>=2023.1.0 in /home/clown/.local/lib/python3.11/site-packages (from datasets==2.17.0) (2023.10.0+computecanada)
Requirement already satisfied: aiohttp in /home/clown/.local/lib/python3.11/site-packages (from datasets==2.17.0) (3.9.1+computecanada)
Requirement already satisfied: typing-extensions>=4.8.0 in /home/clown/.local/lib/python3.11/site-packages (from torch) (4.11.0+computecanada)
Requirement already satisfied: sympy in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from torch) (1.12+computecanada)
Requirement already satisfied: networkx in /home/clown/.local/lib/python3.11/site-packages (from torch) (3.3+computecanada)
Requirement already satisfied: jinja2 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from torch) (3.1.2+computecanada)
Requirement already satisfied: regex!=2019.12.17 in /home/clown/.local/lib/python3.11/site-packages (from transformers) (2023.8.8+computecanada)
Requirement already satisfied: tokenizers<0.19,>=0.14 in /home/clown/.local/lib/python3.11/site-packages (from transformers) (0.15.0+computecanada)
Requirement already satisfied: absl-py>=1.0.0 in /home/clown/.local/lib/python3.11/site-packages (from tensorflow) (2.1.0+computecanada)
Requirement already satisfied: astunparse>=1.6.0 in /home/clown/.local/lib/python3.11/site-packages (from tensorflow) (1.6.3+computecanada)
Requirement already satisfied: flatbuffers>=23.5.26 in /home/clown/.local/lib/python3.11/site-packages (from tensorflow) (20190709135844+computecanada)
Requirement already satisfied: gast!=0.5.0,!=0.5.1,!=0.5.2,>=0.2.1 in /home/clown/.local/lib/python3.11/site-packages (from tensorflow) (0.5.4+computecanada)
Requirement already satisfied: google-pasta>=0.1.1 in /home/clown/.local/lib/python3.11/site-packages (from tensorflow) (0.2.0+computecanada)
Requirement already satisfied: h5py>=2.9.0 in /home/clown/.local/lib/python3.11/site-packages (from tensorflow) (3.10.0+computecanada)
Requirement already satisfied: libclang>=13.0.0 in /home/clown/.local/lib/python3.11/site-packages (from tensorflow) (14.0.1+computecanada)
Requirement already satisfied: ml-dtypes~=0.3.1 in /home/clown/.local/lib/python3.11/site-packages (from tensorflow) (0.3.2)
Requirement already satisfied: opt-einsum>=2.3.2 in /home/clown/.local/lib/python3.11/site-packages (from tensorflow) (3.3.0+computecanada)
Requirement already satisfied: protobuf!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.20.3 in /home/clown/.local/lib/python3.11/site-packages (from tensorflow) (4.23.4+computecanada)
Requirement already satisfied: setuptools in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/python/3.11.5/lib/python3.11/site-packages (from tensorflow) (68.1.2)
Requirement already satisfied: six>=1.12.0 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/python/3.11.5/lib/python3.11/site-packages (from tensorflow) (1.16.0)
Requirement already satisfied: termcolor>=1.1.0 in /home/clown/.local/lib/python3.11/site-packages (from tensorflow) (2.4.0+computecanada)
Requirement already satisfied: wrapt<1.15,>=1.11.0 in /home/clown/.local/lib/python3.11/site-packages (from tensorflow) (1.14.1+computecanada)
Requirement already satisfied: tensorflow-io-gcs-filesystem>=0.23.1 in /home/clown/.local/lib/python3.11/site-packages (from tensorflow) (0.32.0+computecanada)
Requirement already satisfied: grpcio<2.0,>=1.24.3 in /home/clown/.local/lib/python3.11/site-packages (from tensorflow) (1.62.1+computecanada)
Requirement already satisfied: tensorboard<2.16,>=2.15 in /home/clown/.local/lib/python3.11/site-packages (from tensorflow) (2.15.1+computecanada)
Requirement already satisfied: tensorflow-estimator<2.16,>=2.15.0 in /home/clown/.local/lib/python3.11/site-packages (from tensorflow) (2.15.0+computecanada)
Requirement already satisfied: keras<2.16,>=2.15.0 in /home/clown/.local/lib/python3.11/site-packages (from tensorflow) (2.15.0+computecanada)
Requirement already satisfied: scikit-learn in /home/clown/.local/lib/python3.11/site-packages (from sentence_transformers) (1.3.1+computecanada)
Requirement already satisfied: Pillow in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from sentence_transformers) (10.0.0+computecanada)
Requirement already satisfied: tyro>=0.5.11 in /home/clown/.local/lib/python3.11/site-packages (from trl) (0.7.3+computecanada)
Requirement already satisfied: wheel<1.0,>=0.23.0 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/python/3.11.5/lib/python3.11/site-packages (from astunparse>=1.6.0->tensorflow) (0.41.2)
Requirement already satisfied: attrs>=17.3.0 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from aiohttp->datasets==2.17.0) (23.1.0+computecanada)
Requirement already satisfied: multidict<7.0,>=4.5 in /home/clown/.local/lib/python3.11/site-packages (from aiohttp->datasets==2.17.0) (6.0.5+computecanada)
Requirement already satisfied: yarl<2.0,>=1.0 in /home/clown/.local/lib/python3.11/site-packages (from aiohttp->datasets==2.17.0) (1.9.3+computecanada)
Requirement already satisfied: frozenlist>=1.1.1 in /home/clown/.local/lib/python3.11/site-packages (from aiohttp->datasets==2.17.0) (1.4.1+computecanada)
Requirement already satisfied: aiosignal>=1.1.2 in /home/clown/.local/lib/python3.11/site-packages (from aiohttp->datasets==2.17.0) (1.3.1+computecanada)
Requirement already satisfied: charset-normalizer<4,>=2 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from requests>=2.19.0->datasets==2.17.0) (3.2.0+computecanada)
Requirement already satisfied: idna<4,>=2.5 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from requests>=2.19.0->datasets==2.17.0) (3.4+computecanada)
Requirement already satisfied: urllib3<3,>=1.21.1 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from requests>=2.19.0->datasets==2.17.0) (2.0.4+computecanada)
Requirement already satisfied: certifi>=2017.4.17 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from requests>=2.19.0->datasets==2.17.0) (2023.7.22+computecanada)
Requirement already satisfied: google-auth<3,>=1.6.3 in /home/clown/.local/lib/python3.11/site-packages (from tensorboard<2.16,>=2.15->tensorflow) (2.25.2+computecanada)
Requirement already satisfied: google-auth-oauthlib<2,>=0.5 in /home/clown/.local/lib/python3.11/site-packages (from tensorboard<2.16,>=2.15->tensorflow) (1.2.0+computecanada)
Requirement already satisfied: markdown>=2.6.8 in /home/clown/.local/lib/python3.11/site-packages (from tensorboard<2.16,>=2.15->tensorflow) (3.5.2+computecanada)
Requirement already satisfied: tensorboard-data-server<0.8.0,>=0.7.0 in /home/clown/.local/lib/python3.11/site-packages (from tensorboard<2.16,>=2.15->tensorflow) (0.7.0+computecanada)
Requirement already satisfied: werkzeug>=1.0.1 in /home/clown/.local/lib/python3.11/site-packages (from tensorboard<2.16,>=2.15->tensorflow) (3.0.1+computecanada)
Requirement already satisfied: docstring-parser>=0.14.1 in /home/clown/.local/lib/python3.11/site-packages (from tyro>=0.5.11->trl) (0.15+computecanada)
Requirement already satisfied: rich>=11.1.0 in /home/clown/.local/lib/python3.11/site-packages (from tyro>=0.5.11->trl) (13.7.1+computecanada)
Requirement already satisfied: shtab>=1.5.6 in /home/clown/.local/lib/python3.11/site-packages (from tyro>=0.5.11->trl) (1.7.1+computecanada)
Requirement already satisfied: MarkupSafe>=2.0 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from jinja2->torch) (2.1.3+computecanada)
Requirement already satisfied: python-dateutil>=2.8.2 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/ipykernel/2023b/lib/python3.11/site-packages (from pandas->datasets==2.17.0) (2.8.2+computecanada)
Requirement already satisfied: pytz>=2020.1 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from pandas->datasets==2.17.0) (2023.3+computecanada)
Requirement already satisfied: tzdata>=2022.1 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from pandas->datasets==2.17.0) (2023.3+computecanada)
Requirement already satisfied: joblib>=1.1.1 in /home/clown/.local/lib/python3.11/site-packages (from scikit-learn->sentence_transformers) (1.4.0+computecanada)
Requirement already satisfied: threadpoolctl>=2.0.0 in /home/clown/.local/lib/python3.11/site-packages (from scikit-learn->sentence_transformers) (3.4.0+computecanada)
Requirement already satisfied: mpmath>=0.19 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from sympy->torch) (1.3.0+computecanada)
Requirement already satisfied: cachetools<6.0,>=2.0.0 in /home/clown/.local/lib/python3.11/site-packages (from google-auth<3,>=1.6.3->tensorboard<2.16,>=2.15->tensorflow) (5.3.2+computecanada)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /home/clown/.local/lib/python3.11/site-packages (from google-auth<3,>=1.6.3->tensorboard<2.16,>=2.15->tensorflow) (0.3.0+computecanada)
Requirement already satisfied: rsa<5,>=3.1.4 in /home/clown/.local/lib/python3.11/site-packages (from google-auth<3,>=1.6.3->tensorboard<2.16,>=2.15->tensorflow) (4.9+computecanada)
Requirement already satisfied: requests-oauthlib>=0.7.0 in /home/clown/.local/lib/python3.11/site-packages (from google-auth-oauthlib<2,>=0.5->tensorboard<2.16,>=2.15->tensorflow) (1.3.1+computecanada)
Requirement already satisfied: markdown-it-py>=2.2.0 in /home/clown/.local/lib/python3.11/site-packages (from rich>=11.1.0->tyro>=0.5.11->trl) (3.0.0+computecanada)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/ipykernel/2023b/lib/python3.11/site-packages (from rich>=11.1.0->tyro>=0.5.11->trl) (2.16.1+computecanada)
Requirement already satisfied: mdurl~=0.1 in /home/clown/.local/lib/python3.11/site-packages (from markdown-it-py>=2.2.0->rich>=11.1.0->tyro>=0.5.11->trl) (0.1.2+computecanada)
Requirement already satisfied: pyasn1<0.6.0,>=0.4.6 in /home/clown/.local/lib/python3.11/site-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard<2.16,>=2.15->tensorflow) (0.5.1+computecanada)
Requirement already satisfied: oauthlib>=3.0.0 in /home/clown/.local/lib/python3.11/site-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<2,>=0.5->tensorboard<2.16,>=2.15->tensorflow) (3.2.2+computecanada)
(env) [clown@cdr2598 scratch]$ pip install langchain langchain-community unstructured chromadb
Defaulting to user installation because normal site-packages is not writeable
Looking in links: /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/gentoo2023/x86-64-v3, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/gentoo2023/generic, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/generic
Requirement already satisfied: langchain in /home/clown/.local/lib/python3.11/site-packages (0.1.16)
Requirement already satisfied: langchain-community in /home/clown/.local/lib/python3.11/site-packages (0.0.33)
Requirement already satisfied: unstructured in /home/clown/.local/lib/python3.11/site-packages (0.13.2)
Requirement already satisfied: chromadb in /home/clown/.local/lib/python3.11/site-packages (0.3.23)
Requirement already satisfied: PyYAML>=5.3 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from langchain) (6.0.1+computecanada)
Requirement already satisfied: SQLAlchemy<3,>=1.4 in /home/clown/.local/lib/python3.11/site-packages (from langchain) (2.0.29)
Requirement already satisfied: aiohttp<4.0.0,>=3.8.3 in /home/clown/.local/lib/python3.11/site-packages (from langchain) (3.9.1+computecanada)
Requirement already satisfied: dataclasses-json<0.7,>=0.5.7 in /home/clown/.local/lib/python3.11/site-packages (from langchain) (0.6.4)
Requirement already satisfied: jsonpatch<2.0,>=1.33 in /home/clown/.local/lib/python3.11/site-packages (from langchain) (1.33+computecanada)
Requirement already satisfied: langchain-core<0.2.0,>=0.1.42 in /home/clown/.local/lib/python3.11/site-packages (from langchain) (0.1.44)
Requirement already satisfied: langchain-text-splitters<0.1,>=0.0.1 in /home/clown/.local/lib/python3.11/site-packages (from langchain) (0.0.1+computecanada)
Requirement already satisfied: langsmith<0.2.0,>=0.1.17 in /home/clown/.local/lib/python3.11/site-packages (from langchain) (0.1.49)
Requirement already satisfied: numpy<2,>=1 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from langchain) (1.25.2+computecanada)
Requirement already satisfied: pydantic<3,>=1 in /home/clown/.local/lib/python3.11/site-packages (from langchain) (2.7.0)
Requirement already satisfied: requests<3,>=2 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from langchain) (2.31.0+computecanada)
Requirement already satisfied: tenacity<9.0.0,>=8.1.0 in /home/clown/.local/lib/python3.11/site-packages (from langchain) (8.2.3+computecanada)
Requirement already satisfied: chardet in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from unstructured) (5.2.0+computecanada)
Requirement already satisfied: filetype in /home/clown/.local/lib/python3.11/site-packages (from unstructured) (1.2.0+computecanada)
Requirement already satisfied: python-magic in /home/clown/.local/lib/python3.11/site-packages (from unstructured) (0.4.27)
Requirement already satisfied: lxml in /home/clown/.local/lib/python3.11/site-packages (from unstructured) (5.1.0+computecanada)
Requirement already satisfied: nltk in /home/clown/.local/lib/python3.11/site-packages (from unstructured) (3.8.1+computecanada)
Requirement already satisfied: tabulate in /home/clown/.local/lib/python3.11/site-packages (from unstructured) (0.9.0+computecanada)
Requirement already satisfied: beautifulsoup4 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from unstructured) (4.12.2+computecanada)
Requirement already satisfied: emoji in /home/clown/.local/lib/python3.11/site-packages (from unstructured) (2.11.0)
Requirement already satisfied: python-iso639 in /home/clown/.local/lib/python3.11/site-packages (from unstructured) (2024.2.7)
Requirement already satisfied: langdetect in /home/clown/.local/lib/python3.11/site-packages (from unstructured) (1.0.9)
Requirement already satisfied: rapidfuzz in /home/clown/.local/lib/python3.11/site-packages (from unstructured) (3.3.0+computecanada)
Requirement already satisfied: backoff in /home/clown/.local/lib/python3.11/site-packages (from unstructured) (2.2.1+computecanada)
Requirement already satisfied: typing-extensions in /home/clown/.local/lib/python3.11/site-packages (from unstructured) (4.11.0+computecanada)
Requirement already satisfied: unstructured-client<=0.18.0 in /home/clown/.local/lib/python3.11/site-packages (from unstructured) (0.18.0)
Requirement already satisfied: wrapt in /home/clown/.local/lib/python3.11/site-packages (from unstructured) (1.14.1+computecanada)
Requirement already satisfied: pandas>=1.3 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from chromadb) (2.1.0+computecanada)
Requirement already satisfied: hnswlib>=0.7 in /home/clown/.local/lib/python3.11/site-packages (from chromadb) (0.8.0)
Requirement already satisfied: clickhouse-connect>=0.5.7 in /home/clown/.local/lib/python3.11/site-packages (from chromadb) (0.7.8)
Requirement already satisfied: sentence-transformers>=2.2.2 in /home/clown/.local/lib/python3.11/site-packages (from chromadb) (2.5.0+computecanada)
Requirement already satisfied: duckdb>=0.7.1 in /home/clown/.local/lib/python3.11/site-packages (from chromadb) (0.10.1+computecanada)
Requirement already satisfied: fastapi>=0.85.1 in /home/clown/.local/lib/python3.11/site-packages (from chromadb) (0.110.2)
Requirement already satisfied: uvicorn[standard]>=0.18.3 in /home/clown/.local/lib/python3.11/site-packages (from chromadb) (0.29.0)
Requirement already satisfied: posthog>=2.4.0 in /home/clown/.local/lib/python3.11/site-packages (from chromadb) (3.5.0)
Requirement already satisfied: attrs>=17.3.0 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (23.1.0+computecanada)
Requirement already satisfied: multidict<7.0,>=4.5 in /home/clown/.local/lib/python3.11/site-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (6.0.5+computecanada)
Requirement already satisfied: yarl<2.0,>=1.0 in /home/clown/.local/lib/python3.11/site-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (1.9.3+computecanada)
Requirement already satisfied: frozenlist>=1.1.1 in /home/clown/.local/lib/python3.11/site-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (1.4.1+computecanada)
Requirement already satisfied: aiosignal>=1.1.2 in /home/clown/.local/lib/python3.11/site-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (1.3.1+computecanada)
Requirement already satisfied: certifi in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from clickhouse-connect>=0.5.7->chromadb) (2023.7.22+computecanada)
Requirement already satisfied: urllib3>=1.26 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from clickhouse-connect>=0.5.7->chromadb) (2.0.4+computecanada)
Requirement already satisfied: pytz in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from clickhouse-connect>=0.5.7->chromadb) (2023.3+computecanada)
Requirement already satisfied: zstandard in /home/clown/.local/lib/python3.11/site-packages (from clickhouse-connect>=0.5.7->chromadb) (0.20.0+computecanada)
Requirement already satisfied: lz4 in /home/clown/.local/lib/python3.11/site-packages (from clickhouse-connect>=0.5.7->chromadb) (4.3.2+computecanada)
Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in /home/clown/.local/lib/python3.11/site-packages (from dataclasses-json<0.7,>=0.5.7->langchain) (3.21.1+computecanada)
Requirement already satisfied: typing-inspect<1,>=0.4.0 in /home/clown/.local/lib/python3.11/site-packages (from dataclasses-json<0.7,>=0.5.7->langchain) (0.9.0+computecanada)
Requirement already satisfied: starlette<0.38.0,>=0.37.2 in /home/clown/.local/lib/python3.11/site-packages (from fastapi>=0.85.1->chromadb) (0.37.2+computecanada)
Requirement already satisfied: jsonpointer>=1.9 in /home/clown/.local/lib/python3.11/site-packages (from jsonpatch<2.0,>=1.33->langchain) (2.4+computecanada)
Requirement already satisfied: packaging<24.0,>=23.2 in /home/clown/.local/lib/python3.11/site-packages (from langchain-core<0.2.0,>=0.1.42->langchain) (23.2+computecanada)
Requirement already satisfied: orjson<4.0.0,>=3.9.14 in /home/clown/.local/lib/python3.11/site-packages (from langsmith<0.2.0,>=0.1.17->langchain) (3.9.15+computecanada)
Requirement already satisfied: python-dateutil>=2.8.2 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/ipykernel/2023b/lib/python3.11/site-packages (from pandas>=1.3->chromadb) (2.8.2+computecanada)
Requirement already satisfied: tzdata>=2022.1 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from pandas>=1.3->chromadb) (2023.3+computecanada)
Requirement already satisfied: six>=1.5 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/python/3.11.5/lib/python3.11/site-packages (from posthog>=2.4.0->chromadb) (1.16.0)
Requirement already satisfied: monotonic>=1.5 in /home/clown/.local/lib/python3.11/site-packages (from posthog>=2.4.0->chromadb) (1.6)
Requirement already satisfied: annotated-types>=0.4.0 in /home/clown/.local/lib/python3.11/site-packages (from pydantic<3,>=1->langchain) (0.6.0+computecanada)
Requirement already satisfied: pydantic-core==2.18.1 in /home/clown/.local/lib/python3.11/site-packages (from pydantic<3,>=1->langchain) (2.18.1+computecanada)
Requirement already satisfied: charset-normalizer<4,>=2 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from requests<3,>=2->langchain) (3.2.0+computecanada)
Requirement already satisfied: idna<4,>=2.5 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from requests<3,>=2->langchain) (3.4+computecanada)
Requirement already satisfied: transformers<5.0.0,>=4.32.0 in /home/clown/.local/lib/python3.11/site-packages (from sentence-transformers>=2.2.2->chromadb) (4.39.3+computecanada)
Requirement already satisfied: tqdm in /home/clown/.local/lib/python3.11/site-packages (from sentence-transformers>=2.2.2->chromadb) (4.66.2+computecanada)
Requirement already satisfied: torch>=1.11.0 in /home/clown/.local/lib/python3.11/site-packages (from sentence-transformers>=2.2.2->chromadb) (2.2.1+computecanada)
Requirement already satisfied: scikit-learn in /home/clown/.local/lib/python3.11/site-packages (from sentence-transformers>=2.2.2->chromadb) (1.3.1+computecanada)
Requirement already satisfied: scipy in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from sentence-transformers>=2.2.2->chromadb) (1.11.2+computecanada)
Requirement already satisfied: huggingface-hub>=0.15.1 in /home/clown/.local/lib/python3.11/site-packages (from sentence-transformers>=2.2.2->chromadb) (0.22.2+computecanada)
Requirement already satisfied: Pillow in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from sentence-transformers>=2.2.2->chromadb) (10.0.0+computecanada)
Requirement already satisfied: greenlet!=0.4.17 in /home/clown/.local/lib/python3.11/site-packages (from SQLAlchemy<3,>=1.4->langchain) (2.0.2+computecanada)
Requirement already satisfied: dataclasses-json-speakeasy>=0.5.11 in /home/clown/.local/lib/python3.11/site-packages (from unstructured-client<=0.18.0->unstructured) (0.5.11)
Requirement already satisfied: jsonpath-python>=1.0.6 in /home/clown/.local/lib/python3.11/site-packages (from unstructured-client<=0.18.0->unstructured) (1.0.6)
Requirement already satisfied: mypy-extensions>=1.0.0 in /home/clown/.local/lib/python3.11/site-packages (from unstructured-client<=0.18.0->unstructured) (1.0.0+computecanada)
Requirement already satisfied: click>=7.0 in /home/clown/.local/lib/python3.11/site-packages (from uvicorn[standard]>=0.18.3->chromadb) (8.1.7+computecanada)
Requirement already satisfied: h11>=0.8 in /home/clown/.local/lib/python3.11/site-packages (from uvicorn[standard]>=0.18.3->chromadb) (0.14.0+computecanada)
Requirement already satisfied: httptools>=0.5.0 in /home/clown/.local/lib/python3.11/site-packages (from uvicorn[standard]>=0.18.3->chromadb) (0.5.0+computecanada)
Requirement already satisfied: python-dotenv>=0.13 in /home/clown/.local/lib/python3.11/site-packages (from uvicorn[standard]>=0.18.3->chromadb) (1.0.1+computecanada)
Requirement already satisfied: uvloop!=0.15.0,!=0.15.1,>=0.14.0 in /home/clown/.local/lib/python3.11/site-packages (from uvicorn[standard]>=0.18.3->chromadb) (0.17.0+computecanada)
Requirement already satisfied: watchfiles>=0.13 in /home/clown/.local/lib/python3.11/site-packages (from uvicorn[standard]>=0.18.3->chromadb) (0.18.0+computecanada)
Requirement already satisfied: websockets>=10.4 in /home/clown/.local/lib/python3.11/site-packages (from uvicorn[standard]>=0.18.3->chromadb) (12.0+computecanada)
Requirement already satisfied: soupsieve>1.2 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from beautifulsoup4->unstructured) (2.4.1+computecanada)
Requirement already satisfied: joblib in /home/clown/.local/lib/python3.11/site-packages (from nltk->unstructured) (1.4.0+computecanada)
Requirement already satisfied: regex>=2021.8.3 in /home/clown/.local/lib/python3.11/site-packages (from nltk->unstructured) (2023.8.8+computecanada)
Requirement already satisfied: filelock in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/python/3.11.5/lib/python3.11/site-packages (from huggingface-hub>=0.15.1->sentence-transformers>=2.2.2->chromadb) (3.12.2)
Requirement already satisfied: fsspec>=2023.5.0 in /home/clown/.local/lib/python3.11/site-packages (from huggingface-hub>=0.15.1->sentence-transformers>=2.2.2->chromadb) (2023.10.0+computecanada)
Requirement already satisfied: anyio<5,>=3.4.0 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from starlette<0.38.0,>=0.37.2->fastapi>=0.85.1->chromadb) (3.7.1+computecanada)
Requirement already satisfied: sympy in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from torch>=1.11.0->sentence-transformers>=2.2.2->chromadb) (1.12+computecanada)
Requirement already satisfied: networkx in /home/clown/.local/lib/python3.11/site-packages (from torch>=1.11.0->sentence-transformers>=2.2.2->chromadb) (3.3+computecanada)
Requirement already satisfied: jinja2 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from torch>=1.11.0->sentence-transformers>=2.2.2->chromadb) (3.1.2+computecanada)
Requirement already satisfied: tokenizers<0.19,>=0.14 in /home/clown/.local/lib/python3.11/site-packages (from transformers<5.0.0,>=4.32.0->sentence-transformers>=2.2.2->chromadb) (0.15.0+computecanada)
Requirement already satisfied: safetensors>=0.4.1 in /home/clown/.local/lib/python3.11/site-packages (from transformers<5.0.0,>=4.32.0->sentence-transformers>=2.2.2->chromadb) (0.4.1+computecanada)
Requirement already satisfied: threadpoolctl>=2.0.0 in /home/clown/.local/lib/python3.11/site-packages (from scikit-learn->sentence-transformers>=2.2.2->chromadb) (3.4.0+computecanada)
Requirement already satisfied: sniffio>=1.1 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from anyio<5,>=3.4.0->starlette<0.38.0,>=0.37.2->fastapi>=0.85.1->chromadb) (1.3.0+computecanada)
Requirement already satisfied: MarkupSafe>=2.0 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from jinja2->torch>=1.11.0->sentence-transformers>=2.2.2->chromadb) (2.1.3+computecanada)
Requirement already satisfied: mpmath>=0.19 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from sympy->torch>=1.11.0->sentence-transformers>=2.2.2->chromadb) (1.3.0+computecanada)
(env) [clown@cdr2598 scratch]$ echo "=== Use RAG with Mistral-7b-instruct-1.0 from job $SLURM_JOB_ID on nodes $SLURM_JOB_NODELIST."
=== Use RAG with Mistral-7b-instruct-1.0 from job 29386918 on nodes cdr2598.
(env) [clown@cdr2598 scratch]$ python
Python 3.11.5 (main, Sep 18 2023, 12:23:42) [GCC 12.3.1 20230526] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> from transformers import (
...   AutoTokenizer,
...   AutoModelForCausalLM,
...   BitsAndBytesConfig,
...   pipeline
... )
2024-04-19 11:38:47.427139: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-04-19 11:38:51.033206: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-04-19 11:38:51.034254: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-04-19 11:38:51.316939: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-04-19 11:38:51.826612: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
>>> from langchain_community.document_loaders import UnstructuredFileLoader
>>> from langchain_text_splitters import RecursiveCharacterTextSplitter
>>> from langchain.embeddings.huggingface import HuggingFaceEmbeddings
>>> from langchain.prompts import PromptTemplate
>>> from langchain.schema.runnable import RunnablePassthrough
>>> from langchain.llms.huggingface_pipeline import HuggingFacePipeline
>>> from langchain.chains import LLMChain
>>> from langchain_community.vectorstores import Chroma
>>> user_files = ["/home/clown/mistral/RAG/data/user_doc.txt"]
>>> model_dir = "/home/clown/projects/ctb-whkchun/s2_bliss_LLMs/Mistral-7B-Instruct-v0.2"
>>> sentence_transformer_dir = "/home/clown/projects/ctb-whkchun/s2_bliss_LLMs/all-MiniLM-L6-v2"
>>> output_dir = "home/projects/ctb-whkchun/s2_bliss/results-rag-mistral-joseph-itest"
>>> loader = UnstructuredFileLoader(user_files, mode="elements")
>>> raw_documents = loader.load()
[nltk_data] Downloading package punkt to /home/clown/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /home/clown/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.
>>> print(f"Loaded documents (first 10 rows):\n{raw_documents[:10]}")
Loaded documents (first 10 rows):
*** SENSITIVE, REMOVED ***
>>> embedding_func = HuggingFaceEmbeddings(model_name=sentence_transformer_dir)
>>> vectordb = Chroma.from_documents(splitted_documents, embedding_func)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'splitted_documents' is not defined
>>> splitted_documents = text_splitter.split_documents(raw_documents)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'text_splitter' is not defined
>>> splitted_documents = text_splitter.split_documents(raw_documents)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'text_splitter' is not defined
>>> text_splitter = RecursiveCharacterTextSplitter(chunk_size=200)
>>> 
>>> splitted_documents = text_splitter.split_documents(raw_documents)
>>> print(f"Splitted documents (first 10 rows):\n{splitted_documents[:10]}")
Splitted documents (first 10 rows):
*** SENSITIVE, REMOVED ***
>>> vectordb = Chroma.from_documents(splitted_documents, embedding_func)
Traceback (most recent call last):
  File "/home/clown/.local/lib/python3.11/site-packages/langchain_community/vectorstores/chroma.py", line 81, in __init__
    import chromadb
  File "/home/clown/.local/lib/python3.11/site-packages/chromadb/__init__.py", line 2, in <module>
    import chromadb.config
  File "/home/clown/.local/lib/python3.11/site-packages/chromadb/config.py", line 1, in <module>
    from pydantic import BaseSettings
  File "/home/clown/.local/lib/python3.11/site-packages/pydantic/__init__.py", line 380, in __getattr__
    return _getattr_migration(attr_name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/clown/.local/lib/python3.11/site-packages/pydantic/_migration.py", line 296, in wrapper
    raise PydanticImportError(
pydantic.errors.PydanticImportError: `BaseSettings` has been moved to the `pydantic-settings` package. See https://docs.pydantic.dev/2.7/migration/#basesettings-has-moved-to-pydantic-settings for more details.

For further information visit https://errors.pydantic.dev/2.7/u/import-error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/clown/.local/lib/python3.11/site-packages/langchain_community/vectorstores/chroma.py", line 778, in from_documents
    return cls.from_texts(
           ^^^^^^^^^^^^^^^
  File "/home/clown/.local/lib/python3.11/site-packages/langchain_community/vectorstores/chroma.py", line 714, in from_texts
    chroma_collection = cls(
                        ^^^^
  File "/home/clown/.local/lib/python3.11/site-packages/langchain_community/vectorstores/chroma.py", line 84, in __init__
    raise ImportError(
ImportError: Could not import chromadb python package. Please install it with `pip install chromadb`.
>>> Chroma
<class 'langchain_community.vectorstores.chroma.Chroma'>
>>> import chromadb
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/clown/.local/lib/python3.11/site-packages/chromadb/__init__.py", line 2, in <module>
    import chromadb.config
  File "/home/clown/.local/lib/python3.11/site-packages/chromadb/config.py", line 1, in <module>
    from pydantic import BaseSettings
  File "/home/clown/.local/lib/python3.11/site-packages/pydantic/__init__.py", line 380, in __getattr__
    return _getattr_migration(attr_name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/clown/.local/lib/python3.11/site-packages/pydantic/_migration.py", line 296, in wrapper
    raise PydanticImportError(
pydantic.errors.PydanticImportError: `BaseSettings` has been moved to the `pydantic-settings` package. See https://docs.pydantic.dev/2.7/migration/#basesettings-has-moved-to-pydantic-settings for more details.

For further information visit https://errors.pydantic.dev/2.7/u/import-error
>>> 
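
The `ImportError: Could not import chromadb python package` raised from `Chroma.from_documents` hides the real failure, which only shows up as the first exception in the chain (the `PydanticImportError`). The following is a minimal sketch, not part of `rag.py`, for surfacing that first exception when an import fails. The helper names `root_cause` and `try_import` are hypothetical, and the sketch uses only the standard library.

```python
import importlib


def root_cause(exc):
    """Follow __cause__/__context__ links back to the first exception raised."""
    while True:
        earlier = exc.__cause__ or exc.__context__
        if earlier is None:
            return exc
        exc = earlier


def try_import(module_name):
    """Import `module_name`; return None on success, else the root-cause exception."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # deliberately broad: this is a diagnostic helper
        return root_cause(exc)
    return None


if __name__ == "__main__":
    problem = try_import("chromadb")
    if problem is None:
        print("chromadb imported cleanly")
    else:
        print(f"{type(problem).__name__}: {problem}")
```

Running this in the environment above should report the pydantic error directly rather than the generic "please install chromadb" message, assuming the same package versions are installed.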

Second part, using the hot-pinned solution (pydantic and fastapi pinned to earlier versions):

(env) [clown@cdr2598 scratch]$ pip show chromadb
Name: chromadb
Version: 0.3.23
Summary: Chroma.
Home-page: 
Author: 
Author-email: Jeff Huber <jeff@trychroma.com>, Anton Troynikov <anton@trychroma.com>
License: 
Location: /home/clown/.local/lib/python3.11/site-packages
Requires: clickhouse-connect, duckdb, fastapi, hnswlib, numpy, pandas, posthog, pydantic, requests, sentence-transformers, typing-extensions, uvicorn
Required-by: 
(env) [clown@cdr2598 scratch]$ python --version
Python 3.11.5
(env) [clown@cdr2598 scratch]$ pip show pydantic
Name: pydantic
Version: 2.7.0
Summary: Data validation using Python type hints
Home-page: 
Author: 
Author-email: Samuel Colvin <s@muelcolvin.com>, Eric Jolibois <em.jolibois@gmail.com>, Hasan Ramezani <hasan.r67@gmail.com>, Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com>, Terrence Dorsey <terry@pydantic.dev>, David Montague <david@pydantic.dev>, Serge Matveenko <lig@countzero.co>, Marcelo Trylesinski <marcelotryle@gmail.com>, Sydney Runkle <sydneymarierunkle@gmail.com>, David Hewitt <mail@davidhewitt.io>, Alex Hall <alex.mojaki@gmail.com>
License: 
Location: /home/clown/.local/lib/python3.11/site-packages
Requires: annotated-types, pydantic-core, typing-extensions
Required-by: chromadb, fastapi, langchain, langchain-core, langsmith
(env) [clown@cdr2598 scratch]$ pip show pydantic-settings
WARNING: Package(s) not found: pydantic-settings
(env) [clown@cdr2598 scratch]$ module spider pydantic-settings
Lmod has detected the following error:  Unable to find: "pydantic-settings".



(env) [clown@cdr2598 scratch]$ module spider pydantic

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  pydantic:
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     Versions:
        pydantic/1.7.3 (E)
        pydantic/1.9.1 (E)

Names marked by a trailing (E) are extensions provided by another module.


-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  For detailed information about a specific "pydantic" package (including how to load the modules) use the module's full name.
  Note that names that have a trailing (E) are extensions provided by other modules.
  For example:

     $ module spider pydantic/1.9.1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

 

(env) [clown@cdr2598 scratch]$ pip search pydantic-settings
ERROR: XMLRPC request failed [code: -32500]
RuntimeError: PyPI no longer supports 'pip search' (or XML-RPC search). Please use https://pypi.org/search (via a browser) instead. See https://warehouse.pypa.io/api-reference/xml-rpc.html#deprecated-methods for more information.
(env) [clown@cdr2598 scratch]$ pip install pydantic-settings
Defaulting to user installation because normal site-packages is not writeable
Looking in links: /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/gentoo2023/x86-64-v3, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/gentoo2023/generic, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/generic
Collecting pydantic-settings
  Obtaining dependency information for pydantic-settings from https://files.pythonhosted.org/packages/99/ee/24ec87e3a91426497c5a2b9880662d19cfd640342d477334ebc60fc2c276/pydantic_settings-2.2.1-py3-none-any.whl.metadata
  Downloading pydantic_settings-2.2.1-py3-none-any.whl.metadata (3.1 kB)
Requirement already satisfied: pydantic>=2.3.0 in /home/clown/.local/lib/python3.11/site-packages (from pydantic-settings) (2.7.0)
Requirement already satisfied: python-dotenv>=0.21.0 in /home/clown/.local/lib/python3.11/site-packages (from pydantic-settings) (1.0.1+computecanada)
Requirement already satisfied: annotated-types>=0.4.0 in /home/clown/.local/lib/python3.11/site-packages (from pydantic>=2.3.0->pydantic-settings) (0.6.0+computecanada)
Requirement already satisfied: pydantic-core==2.18.1 in /home/clown/.local/lib/python3.11/site-packages (from pydantic>=2.3.0->pydantic-settings) (2.18.1+computecanada)
Requirement already satisfied: typing-extensions>=4.6.1 in /home/clown/.local/lib/python3.11/site-packages (from pydantic>=2.3.0->pydantic-settings) (4.11.0+computecanada)
Downloading pydantic_settings-2.2.1-py3-none-any.whl (13 kB)
Installing collected packages: pydantic-settings
Successfully installed pydantic-settings-2.2.1
(env) [clown@cdr2598 scratch]$ python
Python 3.11.5 (main, Sep 18 2023, 12:23:42) [GCC 12.3.1 20230526] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import chromadb
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/clown/.local/lib/python3.11/site-packages/chromadb/__init__.py", line 2, in <module>
    import chromadb.config
  File "/home/clown/.local/lib/python3.11/site-packages/chromadb/config.py", line 1, in <module>
    from pydantic import BaseSettings
  File "/home/clown/.local/lib/python3.11/site-packages/pydantic/__init__.py", line 380, in __getattr__
    return _getattr_migration(attr_name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/clown/.local/lib/python3.11/site-packages/pydantic/_migration.py", line 296, in wrapper
    raise PydanticImportError(
pydantic.errors.PydanticImportError: `BaseSettings` has been moved to the `pydantic-settings` package. See https://docs.pydantic.dev/2.7/migration/#basesettings-has-moved-to-pydantic-settings for more details.

For further information visit https://errors.pydantic.dev/2.7/u/import-error
>>> import pydantic-settings
  File "<stdin>", line 1
    import pydantic-settings
                   ^
SyntaxError: invalid syntax
>>> import pydantic_settings
>>> import chromadb
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/clown/.local/lib/python3.11/site-packages/chromadb/__init__.py", line 2, in <module>
    import chromadb.config
  File "/home/clown/.local/lib/python3.11/site-packages/chromadb/config.py", line 1, in <module>
    from pydantic import BaseSettings
  File "/home/clown/.local/lib/python3.11/site-packages/pydantic/__init__.py", line 380, in __getattr__
    return _getattr_migration(attr_name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/clown/.local/lib/python3.11/site-packages/pydantic/_migration.py", line 296, in wrapper
    raise PydanticImportError(
pydantic.errors.PydanticImportError: `BaseSettings` has been moved to the `pydantic-settings` package. See https://docs.pydantic.dev/2.7/migration/#basesettings-has-moved-to-pydantic-settings for more details.

For further information visit https://errors.pydantic.dev/2.7/u/import-error
>>> pydantic_settings.BaseSettings
<class 'pydantic_settings.main.BaseSettings'>
>>> from pydantic_setings import BaseSettings
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'pydantic_setings'
>>> from pydantic_settings import BaseSettings
>>> import chromadb
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/clown/.local/lib/python3.11/site-packages/chromadb/__init__.py", line 2, in <module>
    import chromadb.config
  File "/home/clown/.local/lib/python3.11/site-packages/chromadb/config.py", line 1, in <module>
    from pydantic import BaseSettings
  File "/home/clown/.local/lib/python3.11/site-packages/pydantic/__init__.py", line 380, in __getattr__
    return _getattr_migration(attr_name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/clown/.local/lib/python3.11/site-packages/pydantic/_migration.py", line 296, in wrapper
    raise PydanticImportError(
pydantic.errors.PydanticImportError: `BaseSettings` has been moved to the `pydantic-settings` package. See https://docs.pydantic.dev/2.7/migration/#basesettings-has-moved-to-pydantic-settings for more details.

For further information visit https://errors.pydantic.dev/2.7/u/import-error
>>> pip show fastapi
  File "<stdin>", line 1
    pip show fastapi
        ^^^^
SyntaxError: invalid syntax
>>> 
(env) [clown@cdr2598 scratch]$ pip show fastapi
Name: fastapi
Version: 0.110.2
Summary: FastAPI framework, high performance, easy to learn, fast to code, ready for production
Home-page: 
Author: 
Author-email: Sebastián Ramírez <tiangolo@gmail.com>
License: 
Location: /home/clown/.local/lib/python3.11/site-packages
Requires: pydantic, starlette, typing-extensions
Required-by: chromadb
(env) [clown@cdr2598 scratch]$ pip uninstall pydantic-settings
Found existing installation: pydantic-settings 2.2.1
Uninstalling pydantic-settings-2.2.1:
  Would remove:
    /home/clown/.local/lib/python3.11/site-packages/pydantic_settings-2.2.1.dist-info/*
    /home/clown/.local/lib/python3.11/site-packages/pydantic_settings/*
Proceed (Y/n)? y
  Successfully uninstalled pydantic-settings-2.2.1
(env) [clown@cdr2598 scratch]$ pip uninstall pydantic
Found existing installation: pydantic 2.7.0
Uninstalling pydantic-2.7.0:
  Would remove:
    /home/clown/.local/lib/python3.11/site-packages/pydantic-2.7.0.dist-info/*
    /home/clown/.local/lib/python3.11/site-packages/pydantic/*
Proceed (Y/n)? y
  Successfully uninstalled pydantic-2.7.0
(env) [clown@cdr2598 scratch]$ pip uninstall fastapi
Found existing installation: fastapi 0.110.2
Uninstalling fastapi-0.110.2:
  Would remove:
    /home/clown/.local/lib/python3.11/site-packages/fastapi-0.110.2.dist-info/*
    /home/clown/.local/lib/python3.11/site-packages/fastapi/*
Proceed (Y/n)? y
  Successfully uninstalled fastapi-0.110.2
(env) [clown@cdr2598 scratch]$ pip install pydantic==1.9
Defaulting to user installation because normal site-packages is not writeable
Looking in links: /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/gentoo2023/x86-64-v3, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/gentoo2023/generic, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/generic
Collecting pydantic==1.9
  Obtaining dependency information for pydantic==1.9 from https://files.pythonhosted.org/packages/d4/4e/00724eebf52854e65dabe2c190b4842afbda0e09817f415683a3130a123c/pydantic-1.9.0-py3-none-any.whl.metadata
  Using cached pydantic-1.9.0-py3-none-any.whl.metadata (121 kB)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /home/clown/.local/lib/python3.11/site-packages (from pydantic==1.9) (4.11.0+computecanada)
Downloading pydantic-1.9.0-py3-none-any.whl (140 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 140.3/140.3 kB 317.1 kB/s eta 0:00:00
Installing collected packages: pydantic
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
chromadb 0.3.23 requires fastapi>=0.85.1, which is not installed.
Successfully installed pydantic-1.9.0
(env) [clown@cdr2598 scratch]$ pip install fastapi==0.85.1
Defaulting to user installation because normal site-packages is not writeable
Looking in links: /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/gentoo2023/x86-64-v3, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/gentoo2023/generic, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/generic
Collecting fastapi==0.85.1
  Obtaining dependency information for fastapi==0.85.1 from https://files.pythonhosted.org/packages/bf/54/6eb1882b5cb29e6647df92ee74d0a93dab149234ec914563cab955fa667f/fastapi-0.85.1-py3-none-any.whl.metadata
  Using cached fastapi-0.85.1-py3-none-any.whl.metadata (24 kB)
Requirement already satisfied: pydantic!=1.7,!=1.7.1,!=1.7.2,!=1.7.3,!=1.8,!=1.8.1,<2.0.0,>=1.6.2 in /home/clown/.local/lib/python3.11/site-packages (from fastapi==0.85.1) (1.9.0)
Collecting starlette==0.20.4 (from fastapi==0.85.1)
  Obtaining dependency information for starlette==0.20.4 from https://files.pythonhosted.org/packages/51/37/8ac52116984d6a0d8502ec2c7e4a4a78f862b76410cdb1a4bcb384c91cb3/starlette-0.20.4-py3-none-any.whl.metadata
  Downloading starlette-0.20.4-py3-none-any.whl.metadata (5.5 kB)
Requirement already satisfied: anyio<5,>=3.4.0 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from starlette==0.20.4->fastapi==0.85.1) (3.7.1+computecanada)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /home/clown/.local/lib/python3.11/site-packages (from pydantic!=1.7,!=1.7.1,!=1.7.2,!=1.7.3,!=1.8,!=1.8.1,<2.0.0,>=1.6.2->fastapi==0.85.1) (4.11.0+computecanada)
Requirement already satisfied: idna>=2.8 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from anyio<5,>=3.4.0->starlette==0.20.4->fastapi==0.85.1) (3.4+computecanada)
Requirement already satisfied: sniffio>=1.1 in /cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/scipy-stack/2023b/lib/python3.11/site-packages (from anyio<5,>=3.4.0->starlette==0.20.4->fastapi==0.85.1) (1.3.0+computecanada)
Downloading fastapi-0.85.1-py3-none-any.whl (55 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 55.4/55.4 kB 9.7 kB/s eta 0:00:00
Downloading starlette-0.20.4-py3-none-any.whl (63 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.6/63.6 kB 11.4 kB/s eta 0:00:00
Installing collected packages: starlette, fastapi
  Attempting uninstall: starlette
    Found existing installation: starlette 0.37.2+computecanada
    Uninstalling starlette-0.37.2+computecanada:
      Successfully uninstalled starlette-0.37.2+computecanada
Successfully installed fastapi-0.85.1 starlette-0.20.4
(env) [clown@cdr2598 scratch]$ python
Python 3.11.5 (main, Sep 18 2023, 12:23:42) [GCC 12.3.1 20230526] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import chromadb
>>> 
(env) [clown@cdr2598 scratch]$ python ~/mistral/RAG/rag.py 
2024-04-19 13:22:23.079869: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-04-19 13:22:26.332817: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-04-19 13:22:26.333672: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-04-19 13:22:26.543526: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-04-19 13:22:26.936112: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
  File "/home/clown/mistral/RAG/rag.py", line 14, in <module>
    from langchain.prompts import PromptTemplate
  File "/home/clown/.local/lib/python3.11/site-packages/langchain/prompts/__init__.py", line 30, in <module>
    from langchain_core.example_selectors import (
  File "/home/clown/.local/lib/python3.11/site-packages/langchain_core/example_selectors/__init__.py", line 6, in <module>
    from langchain_core.example_selectors.length_based import (
  File "/home/clown/.local/lib/python3.11/site-packages/langchain_core/example_selectors/length_based.py", line 6, in <module>
    from langchain_core.prompts.prompt import PromptTemplate
  File "/home/clown/.local/lib/python3.11/site-packages/langchain_core/prompts/__init__.py", line 27, in <module>
    from langchain_core.prompts.base import (
  File "/home/clown/.local/lib/python3.11/site-packages/langchain_core/prompts/base.py", line 22, in <module>
    from langchain_core.output_parsers.base import BaseOutputParser
  File "/home/clown/.local/lib/python3.11/site-packages/langchain_core/output_parsers/__init__.py", line 33, in <module>
    from langchain_core.output_parsers.xml import XMLOutputParser
  File "/home/clown/.local/lib/python3.11/site-packages/langchain_core/output_parsers/xml.py", line 127, in <module>
    class XMLOutputParser(BaseTransformOutputParser):
  File "/home/clown/.local/lib/python3.11/site-packages/pydantic/main.py", line 204, in __new__
    fields[ann_name] = ModelField.infer(
                       ^^^^^^^^^^^^^^^^^
  File "/home/clown/.local/lib/python3.11/site-packages/pydantic/fields.py", line 488, in infer
    return cls(
           ^^^^
  File "/home/clown/.local/lib/python3.11/site-packages/pydantic/fields.py", line 419, in __init__
    self.prepare()
  File "/home/clown/.local/lib/python3.11/site-packages/pydantic/fields.py", line 539, in prepare
    self.populate_validators()
  File "/home/clown/.local/lib/python3.11/site-packages/pydantic/fields.py", line 801, in populate_validators
    *(get_validators() if get_validators else list(find_validators(self.type_, self.model_config))),
                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/clown/.local/lib/python3.11/site-packages/pydantic/validators.py", line 723, in find_validators
    raise RuntimeError(f'no validator found for {type_}, see `arbitrary_types_allowed` in Config')
RuntimeError: no validator found for <class 're.Pattern'>, see `arbitrary_types_allowed` in Config
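
Both failures in this trace depend on which pydantic major version is installed: chromadb 0.3.23 imports `BaseSettings` from `pydantic`, which fails under 2.7.0, while the installed langchain-core fails at `XMLOutputParser` under 1.9.0. As a rough aid for checking the environment before running `rag.py`, here is a sketch that prints the installed versions and flags the first case. The function names are hypothetical and it does not attempt to resolve the langchain-core side of the conflict.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major(version):
    """Leading integer of a version string, e.g. '2.7.0+computecanada' -> 2."""
    return int(version.split(".")[0])


def pydantic_exports_base_settings(pydantic_version):
    """True when `from pydantic import BaseSettings` is expected to work.

    pydantic 2.x moved BaseSettings into the separate pydantic-settings
    package, which is what the PydanticImportError in the trace reports.
    """
    return major(pydantic_version) < 2


if __name__ == "__main__":
    for name in ("chromadb", "pydantic", "fastapi", "langchain-core"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
    found = installed_version("pydantic")
    if found is not None and not pydantic_exports_base_settings(found):
        print("pydantic >= 2 is installed; chromadb 0.3.23 will fail to import")
```

The check only looks at the major version, which is enough to tell the two situations in this session apart; it says nothing about which pydantic 1.x release, if any, satisfies every package involved.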

@cindyli cindyli closed this May 7, 2024
@cindyli cindyli deleted the feat/RAG branch May 7, 2024 20:16