automatic-learning-amplifier is a Python project built on MLX that locally generates synthetic question-answer pairs from a given text corpus, finetunes a language model on the generated data, and deploys the finetuned model for inference. It is particularly useful for building domain-specific chatbots or question-answering systems, and is well suited to on-premise enterprise deployments.
- Generates specific, synthetic question-answer pairs from a large text corpus using LLaMA-3-8B with an MLX or GGUF backend
- Verifies the correctness of the generated question-answer pairs (optional)
- Analyzes the specificity of the generated questions and answers (optional)
- Supports local, OpenAI, OpenRouter, and Claude models for inference
- Automatically finetunes the LLaMA-3-8B model using the generated data with (Q)LoRA
- Converts the finetuned model to MLX or GGUF format for efficient inference
- Quantizes the model to reduce its memory footprint without significant loss in perplexity (see the conversion sketch after this list)
- Deploys the finetuned model for inference
- Compares the performance of the finetuned model with the original model
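The repository's own conversion and quantization code isn't reproduced here, but as a rough illustration of the MLX side of this pipeline, the separate `mlx-lm` package exposes `convert`, `load`, and `generate` helpers that can produce and exercise a quantized model. The model identifier and output path below are placeholders, not the paths this project uses internally:

```python
from mlx_lm import convert, generate, load

# Convert a Hugging Face checkpoint to MLX format and quantize it
# (4-bit by default) to shrink its memory footprint.
# Paths are illustrative only.
convert(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    mlx_path="models/llama3-8b-mlx-q4",
    quantize=True,
)

# Load the quantized model and run a quick sanity-check generation.
model, tokenizer = load("models/llama3-8b-mlx-q4")
print(generate(model, tokenizer, prompt="What does this project do?", max_tokens=64))
```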
- Clone the repository:

  ```bash
  git clone https://github.com/s_smits/automatic-learning-amplifier.git
  ```

- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```

Alternatively, install with Poetry:

- Clone the repository:

  ```bash
  git clone https://github.com/s_smits/automatic-learning-amplifier.git
  ```

- Install Poetry if you haven't already:

  ```bash
  pip install poetry
  ```

- Install the project dependencies:

  ```bash
  poetry install
  ```
- Set up the necessary environment variables if using Claude or OpenRouter:
  - `CLAUDE_API_KEY`: your Anthropic Claude API key
  - `OPENROUTER_API_KEY`: your OpenRouter API key
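A minimal sketch of how these keys might be checked at startup (illustrative only; the project may handle missing keys differently):

```python
import os

# Warn early if an API key for a remote backend is missing, rather
# than failing partway through a run.
for key in ("CLAUDE_API_KEY", "OPENROUTER_API_KEY"):
    if not os.getenv(key):
        print(f"Warning: {key} is not set; the corresponding backend will be unavailable.")
```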
- Prepare your text corpus in the `data/documents` folder.
- Run the app via Streamlit with the desired arguments:

  ```bash
  streamlit run src/app.py [--local] [--openrouter] [--claude] [--word_limit WORD_LIMIT] [--question_amount QUESTION_AMOUNT] [--focus {processes,knowledge,formulas}] [--images] [--optimize] [--add_summary {math,science,history,geography,english,art,music,education,computer science,drama}] [--verify] [--test_size TEST_SIZE] [--lora] [--qlora] [--gguf] [--compare]
  ```
- `--local`, `--openrouter`, or `--claude`: Choose the inference method (default: `--local`)
- `--word_limit`: Set the word limit for each chunk (default: 1000)
- `--question_amount`: Set the number of question-answer pairs to generate (default: 5)
- `--focus`: Choose the focus for generating questions (options: `processes`, `knowledge`, `formulas`)
- `--images`: Generate captions for images in the documents (default: False)
- `--optimize`: Activate optimization mode (default: False)
- `--add_summary`: Add a short summary from every 5 document files to each prompt, giving the model more context to generate better questions (options: `general`, `math`, `science`, `history`, `geography`, `english`, `art`, `music`, `education`, `computer science`, `drama`) (default: None)
- `--verify`: Verify the generated questions and answers (default: False)
- `--test_size`: Set the test size for splitting the data into training and validation sets (default: 0.1)
- `--lora`: Finetune the model with LoRA (default: False)
- `--qlora`: Finetune the model with QLoRA (default: True)
- `--gguf`: Convert and infer the model with GGUF (default: False)
- `--compare_initial`: Compare the performance of the finetuned model with the initial local model (default: True)
- `--compare_anthropic`: Compare the performance of the finetuned model with Haiku (note: the generated questions will be sent to the API) (default: True)
- `--compare`: Compare the performance of the finetuned model with the non-finetuned model (default: True)
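For example, a fully local run that generates ten question-answer pairs, verifies them, and finetunes with QLoRA could look like `streamlit run src/app.py --local --qlora --question_amount 10 --verify` (an illustrative combination of the flags above).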
The script will generate synthetic question-answer pairs, finetune the language model, and deploy the finetuned model for inference and comparison with the original model.
- `data/`: Contains the text corpus and generated data
  - `documents/`: Place your documents here (supported file types: PDF, TXT, DOCX, PPTX, HTML)
  - `data_prepared/`: Prepared data for finetuning
  - `data_ft/`: Finetuning data (train and eval splits)
  - `qa_json/`: Generated question-answer pairs in JSON format
- `logs/`: Contains log files
- `models/`: Contains the finetuned model and adapter files
- `src/`: Contains the source code files
MLX is an array framework for machine learning research on Apple silicon, brought to you by Apple machine learning research.
Some key features of MLX include:
- Familiar APIs: MLX has a Python API that closely follows NumPy. MLX also has fully featured C++, C, and Swift APIs, which closely mirror the Python API. MLX has higher-level packages like `mlx.nn` and `mlx.optimizers` with APIs that closely follow PyTorch to simplify building more complex models.
- Composable function transformations: MLX supports composable function transformations for automatic differentiation, automatic vectorization, and computation graph optimization.
- Lazy computation: Computations in MLX are lazy. Arrays are only materialized when needed.
- Dynamic graph construction: Computation graphs in MLX are constructed dynamically. Changing the shapes of function arguments does not trigger slow compilations, and debugging is simple and intuitive.
- Multi-device: Operations can run on any of the supported devices (currently the CPU and the GPU).
- Unified memory: A notable difference between MLX and other frameworks is the unified memory model. Arrays in MLX live in shared memory. Operations on MLX arrays can be performed on any of the supported device types without transferring data.
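A minimal sketch (assuming a recent `mlx` release on Apple silicon) showing the NumPy-like API, a composable `mx.grad` transformation, and lazy evaluation:

```python
import mlx.core as mx

# NumPy-like API: arrays and operations closely mirror numpy.
a = mx.array([1.0, 2.0, 3.0])

def loss(x):
    # Sum of squares; reductions are available as array methods.
    return (x ** 2).sum()

# Composable function transformation: mx.grad(loss) is itself a
# function that computes d(loss)/dx.
grad_fn = mx.grad(loss)
g = grad_fn(a)

# Lazy computation: `g` is only a graph node until it is needed.
mx.eval(g)       # materializes the result
print(g)         # array([2, 4, 6], dtype=float32)
```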
- Enterprise Knowledge Management: Large corporations can use this tool to create custom, domain-specific chatbots from their internal documentation. This allows employees to quickly access company-specific information, improving productivity and reducing time spent searching for answers.
- Specialized Customer Support: Companies with complex products or services can use this to create an AI assistant that understands their specific offerings in detail. This can provide 24/7 customer support, handling a large volume of inquiries without human intervention.
- Rapid Onboarding and Training: Organizations can use this tool to create interactive learning systems from their training materials. New employees or partners can then engage with an AI tutor that understands the company's processes, products, and policies in depth, accelerating the onboarding process.
I would like to extend my thanks to:
- The Apple Machine Learning Research team for the amazing MLX library.
Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.
This project is licensed under the Apache License 2.0.
- MLX for the `mlx` package
- Llama.cpp for better inference
- OpenRouter and Anthropic Claude for alternative inference methods
This project has been primarily tested with the LLaMA-3-8B model. Using other models may result in poorly parsed JSON outputs. Please exercise caution and verify the generated data when using different models.
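If you do experiment with other models, a quick sanity check on the generated files can catch parsing failures early. A minimal sketch, assuming a hypothetical schema of a JSON list of question/answer objects in `data/qa_json/` (adjust the field names to match the actual files):

```python
import json

def validate_qa_file(path):
    """Check that a generated QA file parses and every pair has both fields."""
    with open(path, encoding="utf-8") as f:
        pairs = json.load(f)  # raises json.JSONDecodeError on malformed output
    bad = [
        p for p in pairs
        if not isinstance(p, dict)
        or not isinstance(p.get("question"), str) or not p["question"].strip()
        or not isinstance(p.get("answer"), str) or not p["answer"].strip()
    ]
    print(f"{path}: {len(pairs)} pairs, {len(bad)} malformed")
    return bad
```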