Commit

Added RAKE Keyword extraction and query search
iam-yashpradhan committed Nov 17, 2024
1 parent 6acf64e commit 0685205
Showing 1 changed file with 40 additions and 5 deletions.
45 changes: 40 additions & 5 deletions MinuteMate/back/main.py
@@ -1,5 +1,22 @@
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from weaviate.classes.query import Rerank, MetadataQuery

Check failure on line 3 in MinuteMate/back/main.py (GitHub Actions / ruff):
MinuteMate/back/main.py:3:36: F401 `weaviate.classes.query.Rerank` imported but unused
MinuteMate/back/main.py:3:44: F401 `weaviate.classes.query.MetadataQuery` imported but unused

import os
import weaviate
from weaviate.classes.init import Auth
from weaviate.classes.init import AdditionalConfig, Timeout

Check failure on line 7 in MinuteMate/back/main.py (GitHub Actions / ruff):
MinuteMate/back/main.py:7:35: F401 `weaviate.classes.init.AdditionalConfig` imported but unused
MinuteMate/back/main.py:7:53: F401 `weaviate.classes.init.Timeout` imported but unused

from rake_nltk import Rake
import nltk
nltk.download('stopwords')
nltk.download('punkt')
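
The two `nltk.download` calls above run every time `main.py` is imported, so each server start may touch the network. A minimal sketch of a check-before-download alternative follows. It assumes the usual `nltk.data.find` lookup paths `corpora/stopwords` and `tokenizers/punkt`; the helper name `ensure_nltk_resources` and its injectable `find` and `download` parameters are hypothetical and not part of this commit.

```python
from typing import Callable, Dict, List, Optional

# Resource name -> lookup path passed to nltk.data.find (assumed locations).
NLTK_RESOURCES: Dict[str, str] = {
    "stopwords": "corpora/stopwords",
    "punkt": "tokenizers/punkt",
}


def ensure_nltk_resources(
    resources: Dict[str, str] = NLTK_RESOURCES,
    find: Optional[Callable[[str], object]] = None,
    download: Optional[Callable[[str], object]] = None,
) -> List[str]:
    """Download only the resources that are not installed yet.

    Returns the names that had to be downloaded.
    """
    if find is None or download is None:
        import nltk  # imported lazily so the helper can be exercised without NLTK

        find = find or nltk.data.find
        download = download or (lambda name: nltk.download(name, quiet=True))

    fetched: List[str] = []
    for name, path in resources.items():
        try:
            find(path)  # raises LookupError when the resource is missing
        except LookupError:
            download(name)
            fetched.append(name)
    return fetched
```

Calling `ensure_nltk_resources()` once at startup would replace the two unconditional downloads; passing fake `find` and `download` callables keeps the helper testable offline.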


weaviate_url = os.environ["WEAVIATE_URL"]
weaviate_api_key = os.environ["WEAVIATE_API_KEY"]
client = weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,
    auth_credentials=Auth.api_key(weaviate_api_key),
)
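
The connection above is opened at import time and fails with a bare `KeyError` if either variable is unset. A hedged sketch of one way to validate the settings first is shown here; `load_weaviate_settings` and `connect` are hypothetical names, and only the `connect_to_weaviate_cloud` call and `Auth.api_key` come from the commit.

```python
import os
from typing import Mapping, Tuple


def load_weaviate_settings(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (url, api_key), failing with a readable message if either is unset."""
    missing = [k for k in ("WEAVIATE_URL", "WEAVIATE_API_KEY") if not env.get(k)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["WEAVIATE_URL"], env["WEAVIATE_API_KEY"]


def connect():
    """Open a Weaviate Cloud client with the same call the commit uses."""
    import weaviate  # lazy import: only needed when a connection is made
    from weaviate.classes.init import Auth

    url, api_key = load_weaviate_settings()
    return weaviate.connect_to_weaviate_cloud(
        cluster_url=url,
        auth_credentials=Auth.api_key(api_key),
    )
```

Whoever calls `connect()` would also be responsible for calling `client.close()` on shutdown, which the module-level client in this commit never does.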

# Initialize the FastAPI app
app = FastAPI(
@@ -19,11 +36,10 @@ class PromptResponse(BaseModel):

 # Your Python processing logic
 def process_prompt(prompt: str) -> str:
-    response = ''
-
-    #Call whatever code we need to here
-
-    return response
+    rake = Rake()
+    rake.extract_keywords_from_text(prompt)
+    return rake.get_ranked_phrases()[:3]
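
Note that `process_prompt` is still annotated `-> str`, while `get_ranked_phrases()[:3]` is a list of phrases. The type of `PromptResponse.result` is not visible in this diff, so the following is only a sketch under the assumption that it is a string. The names `extract_keywords` and `_rake_ranker` and the `ranker` parameter are hypothetical; the RAKE calls are the ones used in the commit.

```python
from typing import Callable, List, Optional


def _rake_ranker(text: str) -> List[str]:
    """Rank phrases with rake_nltk, exactly as the commit does."""
    from rake_nltk import Rake  # lazy import so the helpers load without it

    rake = Rake()
    rake.extract_keywords_from_text(text)
    return rake.get_ranked_phrases()


def extract_keywords(
    prompt: str,
    top_n: int = 3,
    ranker: Optional[Callable[[str], List[str]]] = None,
) -> List[str]:
    """Return up to top_n ranked phrases for the prompt."""
    ranker = ranker or _rake_ranker
    return ranker(prompt)[:top_n]


def process_prompt(
    prompt: str,
    ranker: Optional[Callable[[str], List[str]]] = None,
) -> str:
    """Keep the declared str return type by joining the keywords."""
    return ", ".join(extract_keywords(prompt, ranker=ranker))
```

Keeping the list-returning helper separate lets a search step consume the keywords directly, while the endpoint still receives a string.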


# API endpoint
@app.post("/process-prompt", response_model=PromptResponse)
@@ -36,3 +52,22 @@ async def process_prompt_endpoint(request: PromptRequest):
        return PromptResponse(result=result)
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))



# Query the collection using the top 3 keywords extracted by RAKE.
# Add this block wherever you configure your API.
collection = client.collections.get("MeetingDocument")
response = collection.query.bm25(
    query=",".join(keywords),
    limit=5,
    # rerank=Rerank(
    #     prop="content",
    #     query="meeting"
    # ),
    # return_metadata=MetadataQuery(score=True)
)

for o in response.objects:
    print(o.properties)
    # print(o.metadata.rerank_score)

Check failure on line 62 in MinuteMate/back/main.py (GitHub Actions / ruff):
MinuteMate/back/main.py:62:20: F821 Undefined name `keywords`
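
The F821 failure arises because the query runs at module level, where `keywords` is never defined. One possible fix, sketched here under assumptions, is to wrap the query in a function that receives the keywords as an argument. The function name `search_meeting_documents` and its parameters are hypothetical; the `collections.get` and `query.bm25` calls mirror the ones in the commit.

```python
from typing import Any, Dict, List, Sequence


def search_meeting_documents(
    client: Any,
    keywords: Sequence[str],
    limit: int = 5,
    collection_name: str = "MeetingDocument",
) -> List[Dict[str, Any]]:
    """Run a BM25 query built from the keywords and return each hit's properties."""
    if not keywords:
        return []  # nothing to search for, so skip the round trip
    collection = client.collections.get(collection_name)
    response = collection.query.bm25(query=",".join(keywords), limit=limit)
    return [obj.properties for obj in response.objects]
```

Inside the endpoint, the list produced by `rake.get_ranked_phrases()[:3]` could be passed as `keywords`, so nothing is queried at import time and the undefined name goes away.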
