- spin up an Ubuntu EC2 instance
- give your server an Elastic IP
- create an API Gateway API
- add a `/chat` route to your API, and attach to it an HTTP URI integration with HTTP method `POST` and endpoint URL `http://<ec2 elastic ip>/chat`
- put nginx on your server: `sudo apt-get install nginx`
- put this in `/etc/nginx/sites-available/default`:

  ```nginx
  server {
      listen 80 default_server;
      listen [::]:80 default_server;
      server_name _;

      location / {
          proxy_pass http://localhost:8080;
          proxy_set_header Host $host;
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      }
  }
  ```
- start nginx: `sudo systemctl start nginx`
- install the Python dependencies: `pip install aiohttp openai langchain`
- enable the Google Natural Language API and the Maps API, and edit `gcp_config.py` to point to your Google Cloud project
- download the embeddings by running `./downloadEmbeddings.sh`
- run `python3` and download the NLTK data:

  ```python
  import nltk
  nltk.download('wordnet')
  nltk.download('stopwords')
  ```
- set your OpenAI API key: `export OPENAI_API_KEY=myKeyHere`
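The OpenAI client library reads `OPENAI_API_KEY` from the environment, so the export above is all the server should need. As a sketch (not part of the original code), a small startup check lets the server fail fast if the key is missing instead of erroring on the first API call:

```python
import os

def require_openai_key() -> str:
    # The openai library picks up OPENAI_API_KEY from the environment;
    # check for it at startup rather than on the first request.
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; run: export OPENAI_API_KEY=<your key>")
    return key
```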
- put `server.py` on your server and run it
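`server.py` itself isn't reproduced here; as a rough sketch of its shape (assumptions: an aiohttp app listening on port 8080 to match the nginx `proxy_pass` above, with the OpenAI call stubbed out):

```python
from aiohttp import web

async def chat(request: web.Request) -> web.Response:
    payload = await request.json()                  # e.g. {"message": "..."}
    reply = f"echo: {payload.get('message', '')}"   # stub; the real handler would call OpenAI
    return web.json_response({"reply": reply})

def make_app() -> web.Application:
    app = web.Application()
    app.router.add_post("/chat", chat)              # the route nginx forwards to
    return app

if __name__ == "__main__":
    # port 8080 matches proxy_pass http://localhost:8080 in the nginx config
    web.run_app(make_app(), port=8080)
```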
Data sources used for `english_word_freq_list`:
- Google: https://github.com/garyongguanjie/entrie/blob/main/unigram_freq.csv or https://www.kaggle.com/datasets/rtatman/english-word-frequency (there are other mirrors as well)
- Norvig: https://github.com/arstgit/high-frequency-vocabulary (use `30k.txt`; rename it to `.csv` and add a single line at the top that just says "word" as the CSV header)
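For reference, a sketch of loading the two lists with the stdlib `csv` module (the file names and paths are assumptions; the Google list has `word,count` columns, and the converted Norvig list has the single `word` header column described above):

```python
import csv

def load_google_freqs(path: str = "unigram_freq.csv") -> dict[str, int]:
    # Google list: "word,count" header, one word per row
    with open(path, newline="") as f:
        return {row["word"]: int(row["count"]) for row in csv.DictReader(f)}

def load_norvig_words(path: str = "30k.csv") -> list[str]:
    # Norvig list after conversion: a single "word" header column
    with open(path, newline="") as f:
        return [row["word"] for row in csv.DictReader(f)]
```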
If you want to parse your bookmarks, use the `parse_bookmarks.py` file (see the comment at the top of that file for usage).