automatic-learning-amplifier

automatic-learning-amplifier is a Python project based on MLX that generates synthetic question-answer pairs from a given text corpus locally, finetunes a language model on the generated data, and deploys the finetuned model for inference. It is useful for building domain-specific chatbots or question-answering systems, and because the pipeline can run entirely on local hardware, it suits on-premise enterprise deployments.

Features

- Generates specific, synthetic question-answer pairs from a large text corpus using LLaMA-3-8B with an MLX or GGUF back-end
- Verifies the correctness of the generated question-answer pairs (optional)
- Analyzes the specificity of the generated questions and answers (optional)
- Supports local, OpenAI, OpenRouter, and Claude models for inference
- Automatically finetunes the LLaMA-3-8B model on the generated data with (Q)LoRA
- Converts the finetuned model to MLX or GGUF format for efficient inference
- Quantizes the model to reduce its memory footprint without a significant increase in perplexity
- Deploys the finetuned model for inference
- Compares the performance of the finetuned model with the original model

Installation

Using pip

Clone the repository:

git clone https://github.com/s-smits/automatic-learning-amplifier.git

Install the required dependencies:
```
pip install -r requirements.txt
```

Using Poetry

Clone the repository:

git clone https://github.com/s-smits/automatic-learning-amplifier.git

Install Poetry if you haven't already:
```
pip install poetry
```
Install the project dependencies:
```
poetry install
```
Set up the necessary environment variables if using Claude or OpenRouter:
- 'CLAUDE_API_KEY': Your Anthropic Claude API key
- 'OPENROUTER_API_KEY': Your OpenRouter API key
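
A quick way to confirm the right key is set before starting a run is sketched below. The two variable names come from the list above; the `missing_api_key` helper and the provider labels are hypothetical and not part of this project's code.

```python
import os

# Variable names are taken from the README; the provider labels mirror the CLI flags.
REQUIRED_KEYS = {
    "claude": "CLAUDE_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def missing_api_key(provider, env=None):
    """Return the name of the unset variable for `provider`, or None if nothing is missing."""
    env = os.environ if env is None else env
    name = REQUIRED_KEYS.get(provider)  # providers without an entry (e.g. local) need no key
    if name is None or env.get(name):
        return None
    return name


if __name__ == "__main__":
    for provider in ("claude", "openrouter"):
        missing = missing_api_key(provider)
        if missing:
            print(f"{provider}: please set {missing}")
```

Passing a plain dictionary as `env` makes the check easy to try without touching the real environment.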

Usage

Place your text corpus in the data/documents folder.
Run the Streamlit app (src/app.py) with the desired arguments:
```
streamlit run src/app.py [--local] [--openrouter] [--claude] [--word_limit WORD_LIMIT] [--question_amount QUESTION_AMOUNT] [--focus {processes,knowledge,formulas}] [--images] [--optimize] [--add_summary {math,science,history,geography,english,art,music,education,computer science,drama}] [--verify] [--test_size TEST_SIZE] [--lora] [--qlora] [--gguf] [--compare]
```
- --local, --openrouter, or --claude: Choose the inference method (default: --local)
- --word_limit: Set the word limit for each chunk (default: 1000)
- --question_amount: Set the number of question-answer pairs to generate (default: 5)
- --focus: Choose the focus for generating questions (options: processes, knowledge, formulas)
- --images: Generate captions for images in the documents (default: False)
- --optimize: Activate optimization mode (default: False)
- --add_summary: Add a short summary, generated from every 5 document files, to each prompt so the model has more context for generating questions (options: general, math, science, history, geography, english, art, music, education, computer science, drama) (default: None)
- --verify: Verify the generated questions and answers (default: False)
- --test_size: Set the test size for splitting the data into training and validation sets (default: 0.1)
- --lora: Finetune the model with LoRA (default: False)
- --qlora: Finetune the model with QLoRA (default: True)
- --gguf: Convert and infer the model with GGUF (default: False)
- --compare_initial: Compare the performance of the finetuned model with the initial local model (default: True)
- --compare_anthropic: Compare the performance of the finetuned model with Haiku (note: the generated questions will be sent to the API) (default: True)
- --compare: Compare the performance of the finetuned model with the non-finetuned model (default: True)
The script will generate synthetic question-answer pairs, finetune the language model, and deploy the finetuned model for inference and comparison with the original model.
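
To make the effect of --word_limit and --test_size concrete, here is a minimal sketch of word-based chunking and a train/validation split. It is an illustration under assumptions, not the project's implementation: the function names `chunk_by_words` and `split_train_eval` are hypothetical, and the real pipeline may split on sentence or document boundaries instead.

```python
import random


def chunk_by_words(text, word_limit=1000):
    """Split text into consecutive chunks of at most `word_limit` words."""
    if word_limit < 1:
        raise ValueError("word_limit must be positive")
    words = text.split()
    return [" ".join(words[i:i + word_limit]) for i in range(0, len(words), word_limit)]


def split_train_eval(pairs, test_size=0.1, seed=0):
    """Shuffle the pairs and hold out roughly a `test_size` fraction for evaluation."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    # Keep at least one evaluation example whenever there is any data at all.
    n_eval = max(1, round(len(shuffled) * test_size)) if shuffled else 0
    return shuffled[n_eval:], shuffled[:n_eval]


if __name__ == "__main__":
    chunks = chunk_by_words("one two three four five", word_limit=2)
    print(chunks)
    train, evaluation = split_train_eval(range(20), test_size=0.1)
    print(len(train), len(evaluation))
```

With the defaults listed above, a 10,000-word document would yield ten chunks, and at 5 questions per chunk about 50 question-answer pairs, of which roughly 5 would be held out for validation.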

Folder Structure

- data/: Contains the text corpus and generated data
  - documents/: Place your documents here (supported file types: PDF, TXT, DOCX, PPTX, HTML)
  - data_prepared/: Prepared data for finetuning
  - data_ft/: Finetuning data (train and eval splits)
  - qa_json/: Generated question-answer pairs in JSON format
- logs/: Contains log files
- models/: Contains the finetuned model and adapter file
- src/: Contains the source code files
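
Before a run, it can help to check which files in the documents folder will actually be picked up. The sketch below lists files with the supported extensions named above; the `find_documents` helper is hypothetical, and the assumption that subfolders are searched recursively is not confirmed by the project.

```python
from pathlib import Path

# Extensions taken from the supported file types listed above.
SUPPORTED_SUFFIXES = {".pdf", ".txt", ".docx", ".pptx", ".html"}


def find_documents(root="data/documents"):
    """Return supported files under `root`, sorted by path; empty list if `root` is missing."""
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    return sorted(
        p for p in root_path.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_SUFFIXES
    )


if __name__ == "__main__":
    for path in find_documents():
        print(path)
```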

Why MLX?

MLX is an array framework for machine learning research on Apple silicon, brought to you by Apple machine learning research.

Some key features of MLX include:

- Familiar APIs: MLX has a Python API that closely follows NumPy. MLX also has fully featured C++, C, and Swift APIs, which closely mirror the Python API. MLX has higher-level packages like mlx.nn and mlx.optimizers with APIs that closely follow PyTorch to simplify building more complex models.
- Composable function transformations: MLX supports composable function transformations for automatic differentiation, automatic vectorization, and computation graph optimization.
- Lazy computation: Computations in MLX are lazy. Arrays are only materialized when needed.
- Dynamic graph construction: Computation graphs in MLX are constructed dynamically. Changing the shapes of function arguments does not trigger slow compilations, and debugging is simple and intuitive.
- Multi-device: Operations can run on any of the supported devices (currently the CPU and the GPU).
- Unified memory: A notable difference between MLX and other frameworks is the unified memory model. Arrays in MLX live in shared memory. Operations on MLX arrays can be performed on any of the supported device types without transferring data.

Use Cases

- Enterprise Knowledge Management: Large corporations can use this tool to create custom, domain-specific chatbots from their internal documentation. This allows employees to quickly access company-specific information, improving productivity and reducing time spent searching for answers.
- Specialized Customer Support: Companies with complex products or services can use this to create an AI assistant that understands their specific offerings in detail. This can provide 24/7 customer support, handling a large volume of inquiries without human intervention.
- Rapid Onboarding and Training: Organizations can use this tool to create interactive learning systems from their training materials. New employees or partners can then engage with an AI tutor that understands the company's processes, products, and policies in depth, accelerating the onboarding process.

Acknowledgement

Many thanks to:

The Apple Machine Learning Research team for the amazing MLX library.

Contributing

Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.

License

This project is licensed under the Apache License 2.0.

Acknowledgements

MLX for the mlx package
Llama.cpp for better inference
OpenRouter and Anthropic Claude for alternative inference methods

Disclaimer

This project has been primarily tested with the LLaMA-3-8B model. Using other models may result in poorly parsed JSON outputs. Please exercise caution and verify the generated data when using different models.
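
One way to guard against poorly parsed output when trying a different model is to validate each response before it reaches the finetuning data. The sketch below is a minimal example under assumptions: the `parse_qa_pairs` helper is hypothetical, and the `question` and `answer` field names are assumed, since the project's actual JSON schema is not documented here.

```python
import json


def parse_qa_pairs(raw):
    """Parse model output into a list of question-answer dicts; return [] if malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    pairs = []
    for item in data:
        if not isinstance(item, dict):
            continue
        # Field names are assumptions; adjust them to the schema the model is prompted for.
        question, answer = item.get("question"), item.get("answer")
        if (isinstance(question, str) and isinstance(answer, str)
                and question.strip() and answer.strip()):
            pairs.append({"question": question.strip(), "answer": answer.strip()})
    return pairs


if __name__ == "__main__":
    sample = '[{"question": "What is MLX?", "answer": "An array framework."}, {"question": ""}]'
    print(parse_qa_pairs(sample))
```

Entries that fail validation are dropped rather than repaired, so a high drop rate is a useful signal that the chosen model does not follow the expected output format.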
