
Benchmarks v2 Merging dev to main #184

Merged
merged 19 commits into main from dev on Apr 29, 2024

Commits on Apr 15, 2024

  1. AutoGPTQ Mistral, Memory profiling support and empirical quality checks (#163)
    
    * Added info about mistral support
    
    * AutoGPTQ now uses base class, with mistral support and memory profiling
    
    * minor changes on change of cli args in bench.sh
    
    * changed requirements with latest update of autogptq
    
    * support for mistral instruct and llama2 chat and latest autogptq installation from source
    
    * Added another common utility to build chat templates for model
    
    * fix bugs for multiple duplicated logging
    
    * PyTorchBenchmark supports mistral, memory profile and uses Base class
    
    * changes in instruction and using latest model for benchmarking, removed Logs
    
    * fixing dependencies with proper versions
    
    * Addition of mistral and llama and also table for precision-wise quality comparison
    
    * Added new docs and template for mistral and starting out new benchmark performance logs in templates
    
    * improvements on better logging strategies to log the quality checks output in a readme
    
    * integrated the utility for logging improvements
    
    * using better logging strategies in bench pytorch
    
    * questions.json has the ground truth answer set from fp32 response
    
    * AutoGPTQ readme improvements and added quality checks examples for llama and mistral
    
    * Using latest logging utilities
    
    * removed creation of Logs folder and unnecessary arguments
    
    * Added fsspec
    
    * Added llama2 and mistral performance logs
    
    * pinned version of huggingface_hub
    
    * Latest info under 'some points to note' section
    
    * Update bench_autogptq/bench.sh
    
    Co-authored-by: Nicola Sosio <sosio.nicola94@tiscali.it>
    
    ---------
    
    Co-authored-by: Anindyadeep Sannigrahi <anindyadeepsannigrahi@Anindyadeeps-MacBook-Pro.local>
    Co-authored-by: Nicola Sosio <sosio.nicola94@tiscali.it>
    3 people authored Apr 15, 2024
    Commit 0a956be

Commits on Apr 17, 2024

  1. Deepspeed Mistral, Memory profiling support and empirical quality checks (#168)
    
    * Added another common utility to build chat templates for model
    
    * fix bugs for multiple duplicated logging
    
    * PyTorchBenchmark supports mistral, memory profile and uses Base class
    
    * changes in instruction and using latest model for benchmarking, removed Logs
    
    * fixing dependencies with proper versions
    
    * Addition of mistral and llama and also table for precision-wise quality comparison
    
    * Added new docs and template for mistral and starting out new benchmark performance logs in templates
    
    * improvements on better logging strategies to log the quality checks output in a readme
    
    * integrated the utility for logging improvements
    
    * using better logging strategies in bench pytorch
    
    * questions.json has the ground truth answer set from fp32 response
    
    * DeepSpeed now using base class, mistral support and memory profiling
    
    * removed unused imports
    
    * removed Logs and latest improvements w.r.t base class
    
    * README now has quality comparison for deepspeed
    
    * using latest version of deepspeed
    
    * added latest performance logs for llama2 and mistral
    
    * added docs for llama and mistral with latest scores
    
    * updated readme with correct model info
    
    ---------
    
    Co-authored-by: Anindyadeep Sannigrahi <anindyadeepsannigrahi@Anindyadeeps-MacBook-Pro.local>
    Anindyadeep and Anindyadeep Sannigrahi authored Apr 17, 2024
    Commit c1c7337
  2. Ctransformers Mistral and Memory Profiling support (#165)

    * Ctransformers support mistral and uses Base class along with memory profiling
    
    * uses latest bench.py arguments, remove making log folders and improvements
    
    * supporting mistral and llama chat models and installation improvements
    
    * added additional requirements which are not supported by ctransformers by default
    
    * Added another common utility to build chat templates for model
    
    * fix bugs for multiple duplicated logging
    
    * PyTorchBenchmark supports mistral, memory profile and uses Base class
    
    * changes in instruction and using latest model for benchmarking, removed Logs
    
    * fixing dependencies with proper versions
    
    * Addition of mistral and llama and also table for precision-wise quality comparison
    
    * Added new docs and template for mistral and starting out new benchmark performance logs in templates
    
    * improvements on better logging strategies to log the quality checks output in a readme
    
    * integrated the utility for logging improvements
    
    * using better logging strategies in bench pytorch
    
    * questions.json has the ground truth answer set from fp32 response
    
    * CTransformers using latest logging utilities
    
    * removed unnecessary arguments and creation of Logs folder
    
    * Add precision-wise quality comparison on AutoGPTQ readme
    
    * Added performance scores for llama2 and mistral
    
    * Latest info under 'some points to note' section
    
    * added ctransformers performance logs for mistral and llama
    
    * Update bench_ctransformers/bench.sh
    
    Co-authored-by: Nicola Sosio <sosio.nicola94@tiscali.it>
    
    ---------
    
    Co-authored-by: Anindyadeep Sannigrahi <anindyadeepsannigrahi@Anindyadeeps-MacBook-Pro.local>
    Co-authored-by: Nicola Sosio <sosio.nicola94@tiscali.it>
    3 people authored Apr 17, 2024
    Commit ebd217b

Commits on Apr 18, 2024

  1. CTranslate2 Benchmark with Mistral Support (#170)

    * Added support for BaseClass and mistral with memory profiling
    
    * removed docker support with latest ctranslate release
    
    * Added latest ctranslate2 version
    
    * Removed runs with docker and added mistral model support
    
    * removed docker support and added mistral support
    
    * Added performance logs for mistral and llama
    
    * engine specific readme with qualitative comparison
    Anindyadeep authored Apr 18, 2024
    Commit bc8929d
  2. Llamacpp mistral (#171)

    * fix bug: handle temperature when None
    
    * Added llamacpp engine readme with quality comparison
    
    * Using Base class with mistral support and memory profiling
    
    * shell script cli improvements
    
    * Added newer requirements with versions pinned
    
    * small improvements
    
    * removed MODEL_NAME while running setup
    
    * Added performance logs for llama and mistral
    Anindyadeep authored Apr 18, 2024
    Commit 7d828d9

Commits on Apr 19, 2024

  1. Commit 0322126
  2. Commit 61f84a6
  3. Commit c7620a2
  4. Merge pull request #173 from premAI-io/fix-numbers

    Fix Transformers benchmark numbers
    Anindyadeep authored Apr 19, 2024
    Commit eb796e4

Commits on Apr 22, 2024

  1. ExLlamaV2 Mistral, Memory support, qualitative comparison and improvements (#175)
    
    * Added performance logs for mistral and llama for exllamav2 along with qualitative comparisons
    
    * ExLlamaV2 using base class along with support for mistral and memory profiling
    
    * removed old cli args and small improvements
    
    * deleted convert.py script
    
    * pinned latest version and added transformers
    
    * addition of mistral model along with usage of latest exllamav2 repo
    
    * Update bench_exllamav2/bench.sh
    
    Co-authored-by: Nicola Sosio <sosio.nicola94@tiscali.it>
    
    ---------
    
    Co-authored-by: Nicola Sosio <sosio.nicola94@tiscali.it>
    Anindyadeep and nsosio authored Apr 22, 2024
    Commit 793dbc9
  2. vLLM Mistral, Memory support, qualitative comparison and improvements (#172)
    
    * Adding base class with mistral support and memory profiling
    
    * small improvements on removing unnecessary cli args
    
    * download support for mistral
    
    * adding on_exit function on get_answers
    
    * Added precision wise qualitative checks for vLLM README
    
    * Added performance logs on docs for mistral and llama
    
    * Update bench_vllm/bench.sh
    
    Co-authored-by: Nicola Sosio <sosio.nicola94@tiscali.it>
    
    ---------
    
    Co-authored-by: Nicola Sosio <sosio.nicola94@tiscali.it>
    Anindyadeep and nsosio authored Apr 22, 2024
    Commit 2033b74

Commits on Apr 24, 2024

  1. Nvidia TensorRT LLM Mistral, Memory support, qualitative comparison and improvements (#178)
    
    * Added readme with mistral support and qualitative comparision
    
    * TRT LLM using base class with mistral and memory profiling support
    
    * removed old cli args, and some improvements
    
    * Added support for mistral with latest trt llm
    
    * Added support for root dir for handling runs inside and outside docker
    
    * Added performance logs for both mistral and llama
    
    * Added float32 on docs and performance logs
    
    * Added support for float32 precision
    
    * Added support for float32
    
    * revised to int4 for mistral
    Anindyadeep authored Apr 24, 2024
    Commit 78a80c4
  2. Optimum Nvidia Mistral, Memory support, qualitative comparison and improvements (#177)
    
    * Added performance logs for mistral and llama for exllamav2 along with qualitative comparisons
    
    * ExLlamaV2 using base class along with support for mistral and memory profiling
    
    * removed old cli args and small improvements
    
    * deleted convert.py script
    
    * pinned latest version and added transformers
    
    * addition of mistral model along with usage of latest exllamav2 repo
    
    * Using base benchmark class with memory profiling support and mistral model support
    
    * Addition of new constructor argument root_dir to handle paths inside or outside docker
    
    * created a converter script to convert to tensorrt engine file
    
    * Addition of latest usage of optimum nvidia and also added qualitative comparison
    
    * cli improvements and remove older cli args
    
    * added latest conversion script logic to convert hf weights to engine and mistral support
    
    * Added latest performance logs for both mistral and llama
    
    * removed the conflict with exllamav2
    
    * removed changes from exllamav2
    Anindyadeep authored Apr 24, 2024
    Commit 454b4c0
  3. ONNX Runtime with mistral support and memory profiling (#182)

    * Added comparative quality analysis for mistral and llama and also added nuances related to onnx
    
    * Using base class with memory profiling and mistral support
    
    * removed old cli arguments and some improvements
    
    * removed requirements, since onnx runs using custom docker container
    
    * Added new setup sh file with mistral and llama onnx conversion through docker
    
    * Added performance logs of onnx for llama and mistral
    Anindyadeep authored Apr 24, 2024
    Commit 658fd19
  4. Lightning AI Mistral and memory integration (#174)

    * Added qualitative comparison for litgpt
    
    * Using base class with mistral support and memory support
    
    * small cli improvements, removed old arguments
    
    * removed convert logic with latest litgpt
    
    * Added latest inference logic code
    
    * pinned version for dependencies
    
    * Added latest method of installation and model conversions with litgpt
    
    * added performance benchmarks info in litgpt
    
    * updated the memory usage and token per seconds
    
    * chore: minor improvements and added latest info about int4
    Anindyadeep authored Apr 24, 2024
    Commit a92a3bc

Commits on Apr 29, 2024

  1. Commit b5b90c4
  2. Commit 2c2cddc
  3. Commit 178f317
  4. Commit a5cc756