Commit 4c61832 (parent: aecd2dc)

add crew ai job application, and start llama index

Showing 10 changed files with 241 additions and 91 deletions.
@@ -0,0 +1,5 @@
llama-index
python-dotenv
llama-index-llms-openai
llama-index-embeddings-openai
llama-index-llms-mistralai
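These dependencies back the router example below. A hypothetical pre-flight check, assuming the packages were installed with `pip install -r requirements.txt` and that `../.env` holds the OpenAI key as in the script, can confirm the environment is ready before running it:

```python
# Hypothetical pre-flight check (not part of the commit): confirm llama-index
# imports cleanly and that an OpenAI API key is visible, since both the LLM
# and the embedding model in the example use OpenAI.
import os

from dotenv import load_dotenv
import llama_index.core  # noqa: F401 -- fails fast if llama-index is missing

load_dotenv("../.env")
assert os.getenv("OPENAI_API_KEY"), "OPENAI_API_KEY must be set in ../.env"
```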
@@ -0,0 +1,67 @@
"""
Load a PDF document, split it with a sentence splitter, then route queries
between a summary index and a vector index.
"""
import sys
from dotenv import load_dotenv
# import nest_asyncio
from llama_index.core import SimpleDirectoryReader, Settings, SummaryIndex, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector
from llama_index.llms.openai import OpenAI
from llama_index.core.tools import QueryEngineTool
from llama_index.embeddings.openai import OpenAIEmbedding

load_dotenv("../.env")


def load_docs(fn: str):
    """Load the document, split it into nodes, and build the two indexes."""
    docs = SimpleDirectoryReader(input_files=[fn]).load_data()

    splitter = SentenceSplitter(chunk_size=1024)
    nodes = splitter.get_nodes_from_documents(docs)
    Settings.llm = OpenAI(model="gpt-3.5-turbo")
    Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
    print("--- Build two indexes: one for the summary, one for the vectors (for semantic queries based on similarity)")
    summary_index = SummaryIndex(nodes)
    vector_index = VectorStoreIndex(nodes)
    return summary_index, vector_index


def get_router_engine(summary_index, vector_index):
    # Build one query engine per index.
    summary_query_engine = summary_index.as_query_engine(response_mode="tree_summarize", use_async=True)
    vector_query_engine = vector_index.as_query_engine()

    summary_tool = QueryEngineTool.from_defaults(
        query_engine=summary_query_engine,
        description=(
            "Useful for summarization questions related to MetaGPT"
        ),
    )

    vector_tool = QueryEngineTool.from_defaults(
        query_engine=vector_query_engine,
        description=(
            "Useful for retrieving specific context from the MetaGPT paper."
        ),
    )

    # The LLM selector picks the most relevant tool for each incoming query.
    query_engine = RouterQueryEngine(
        selector=LLMSingleSelector.from_defaults(),
        query_engine_tools=[summary_tool, vector_tool],
        verbose=True,
    )
    return query_engine


if __name__ == "__main__":
    filename = sys.argv[1]
    print(f"--> Build indexes from content of {filename}")
    summary_index, vector_index = load_docs(filename)
    engine = get_router_engine(summary_index, vector_index)
    response = engine.query("What is the summary of the document?")
    print(str(response))
    print(len(response.source_nodes))
    response = engine.query(
        "How do agents share information with other agents?"
    )
    print(str(response))
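As a usage sketch, the two helpers can also be reused from another script. This assumes the file above is saved as `router_engine.py` (the name referenced in the README below) and that `metagpt.pdf` is a placeholder for a real PDF path:

```python
# Hypothetical usage sketch: build the indexes once, then route questions.
from router_engine import load_docs, get_router_engine

summary_index, vector_index = load_docs("metagpt.pdf")  # placeholder path
engine = get_router_engine(summary_index, vector_index)

# The LLM selector should send this to the summary tool...
print(engine.query("Give a high-level summary of the paper."))
# ...and this one to the vector tool for specific context retrieval.
print(engine.query("Which roles does MetaGPT assign to its agents?"))
```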
@@ -0,0 +1,11 @@
# LlamaIndex library

[LlamaIndex](https://docs.llamaindex.ai/en/stable/) is a framework for building context-augmented LLM applications, such as Q&A, chatbots, document understanding and extraction, and agentic apps.

It includes data connectors, indexes, query engines, agents, and a set of tools for observability and evaluation.

LlamaParse is an interesting document parsing solution, and [LlamaHub](https://llamahub.ai/) offers a catalog of community data loaders and tools.

## Getting Started

* Query-routing example using RAG: [router_engine.py](https://github.com/jbcodeforce/ML-studies/tree/master/LlamaIndex/router_engine.py).
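To complement the routing example, here is a minimal sketch of the basic load → index → query flow the README describes, assuming `llama-index` is installed and `OPENAI_API_KEY` is set (the `data/` folder is a placeholder):

```python
# Minimal LlamaIndex sketch: load documents from a folder, build a vector
# index, and ask one question.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

docs = SimpleDirectoryReader("data").load_data()          # "data/" is a placeholder folder
index = VectorStoreIndex.from_documents(docs)             # embeds and indexes the nodes
query_engine = index.as_query_engine()
print(query_engine.query("What is this document about?"))
```

When no LLM or embedding model is configured, LlamaIndex falls back to OpenAI models, which is why the API key must be present.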
@@ -0,0 +1,22 @@

Interview Questions:

1. Can you share more about your experience with AI applications, particularly in deep learning and generative AI?
2. How have you utilized statistical programming languages, specifically Python, in your previous roles? Can you give specific examples of projects or initiatives where you applied these skills?
3. Tell us about a time when you delivered a technical presentation that influenced stakeholders' decisions. How did you approach it?
4. You mentioned that you have experience owning outcomes and making significant decisions. Can you give some specific examples where your decision-making impacted the business positively?
5. Could you elaborate on your experience with ML frameworks like TensorFlow, PyTorch, and Spark ML? How have you used them in your projects?
6. Can you share your experience with distributed training and optimizing performance versus costs? What strategies did you employ to achieve this?
7. How have you incorporated CI/CD solutions in the context of MLOps and LLMOps in your previous roles?
8. You mentioned that you had experience in systems design and the ability to design and explain data pipelines and ML pipelines. Could you walk us through your process?

Talking Points:

1. Jerome's extensive experience in AI applications, notably in deep learning and generative AI.
2. His proficiency in using statistical programming languages like Python in various AI applications.
3. Jerome's experience in delivering technical presentations and leading business value sessions.
4. His experience in owning outcomes and making significant decisions that have led to substantial business impacts.
5. Jerome's expertise with ML frameworks like TensorFlow, PyTorch, JAX, and Spark ML.
6. His experience with distributed training and optimizing performance versus costs.
7. Jerome's familiarity with CI/CD solutions in the context of MLOps and LLMOps.
8. His experience in systems design, with the ability to design and explain data pipelines and ML pipelines.
@@ -1,21 +1,19 @@
 Interview Questions:

-1. Can you share more about your experience with AI applications, particularly in deep learning and generative AI?
-2. How have you utilized statistical programming languages, specifically Python, in your previous roles? Can you give specific examples of projects or initiatives where you applied these skills?
-3. Tell us about a time when you delivered a technical presentation that influenced stakeholders' decisions. How did you approach it?
-4. You mentioned that you have experience owning outcomes and making significant decisions. Can you give some specific examples where your decision-making impacted the business positively?
-5. Could you elaborate on your experience with ML frameworks like TensorFlow, PyTorch, and Spark ML? How have you used them in your projects?
-6. Can you share your experience with distributed training and optimizing performance versus costs? What strategies did you employ to achieve this?
-7. How have you incorporated CI/CD solutions in the context of MLOps and LLMOps in your previous roles?
-8. You mentioned that you had experience in systems design and the ability to design and explain data pipelines and ML pipelines. Could you walk us through your process?
+1. Can you share some examples of real-world business problems you have solved using data-driven optimization and machine learning modeling?
+2. What kind of ML models have you built from scratch for new applications? Can you explain the process you followed and the challenges you faced?
+3. As an expert in NLP, how have you used this skill in your previous roles? Can you provide specific examples?
+4. Can you explain your experience with SQL and handling large-scale data? Can you give an example of a project where you had to manage a large volume of data?
+5. Your resume mentions that you are comfortable with Python, Java, and Node.js. Can you share how you have used these languages in your work?
+6. Can you talk about your experience in the healthcare industry? If you haven't worked in this industry before, how do you plan to apply your skills and knowledge to it?
+7. You have a Master's Degree in Computer Science from Nice University, France. How has your educational background helped you in your career?
+8. How have you measured and optimized the direct business impact of your work in your previous roles?

 Talking Points:

-1. Jerome's extensive experience in AI applications, notably in deep learning and generative AI.
-2. His proficiency in using statistical programming languages like Python in various AI applications.
-3. Jerome's experience in delivering technical presentations and leading business value sessions.
-4. His experience in owning outcomes and making significant decisions that have led to substantial business impacts.
-5. Jerome's expertise with ML frameworks like TensorFlow, PyTorch, JAX, and Spark ML.
-6. His experience with distributed training and optimizing performance versus costs.
-7. Jerome's familiarity with CI/CD solutions in the context of MLOps and LLMOps.
-8. His experience in systems design, with the ability to design and explain data pipelines and ML pipelines.
+1. Discuss the candidate's experience as a Principal Consultant at Athena Decision Systems and AWS Principal Solution Architect at Amazon.
+2. Discuss the candidate's proficiency in Python, Java, and Node.js and how it applies to the position.
+3. Discuss the candidate's experience with Generative AI and how it could be applied in the role of NLP Scientist/Engineer at Machinify.
+4. Discuss the candidate's desire to learn about the healthcare industry and how this interest could be beneficial to the role.
+5. Discuss the candidate's experience in handling large-scale data using SQL, Databricks, and Iceberg lakehouse for data governance.
+6. Discuss the candidate's experience building ML models from scratch for new applications.
+7. Discuss the candidate's certifications and how they demonstrate his commitment to continued learning and expertise in his field.
+8. Discuss the awards the candidate has received and what they mean for his capabilities in the role.