antmikinka/swift-transformers-test

Creating CoreML Apple Neural Engine Models

  • Currently, all of these models were converted with openelm-coreml.py
  • Review the ops, layers, and precision of a model
  • Review apple/ml-recurrent-drafter
    • modeling_llama.py
    • this model also appears to be an ANE-optimized Llama that implements the ANE principles
      • lines 161 and 162 handle key_states and value_states
      • Class LlamaAttention
        • key_states = torch.repeat_interleave(key_states, dim=1, repeats=self.n_kv_groups)
        • value_states = torch.repeat_interleave(value_states, dim=1, repeats=self.n_kv_groups)
  • Review chunk_mlprogram.py (adapted from apple/ml-stable-diffusion)
    • Optimize for chunking text LLMs
    • still needs a PSNR check
      • the random_gen_input_feature_type function is not working; the converted model does not seem to expose a value type for its inputs, so the function cannot tell how to generate those input features (this seems to be the issue)
    • the program itself does work
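
The PSNR check mentioned above compares the outputs of the original model with the outputs of the chunked model. A minimal sketch of such a comparison is below, assuming both outputs have already been flattened into plain Python lists of floats. The function name compute_psnr, the choice of the reference's largest magnitude as the peak, and the sample values are illustrative assumptions, not necessarily what chunk_mlprogram.py does.

```python
import math


def compute_psnr(reference, candidate, eps=1e-10):
    """Return the peak signal-to-noise ratio in dB between two sequences.

    Hypothetical helper for illustration; `reference` would be the output
    of the unchunked model and `candidate` the output of the chunked one.
    """
    if len(reference) != len(candidate):
        raise ValueError("inputs must have the same length")
    if not reference:
        raise ValueError("inputs must be non-empty")

    # Mean squared error between the two outputs.
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)

    # Assumption: the peak is the largest magnitude found in the reference.
    peak = max(abs(r) for r in reference)

    # eps keeps the ratio finite when the two outputs are identical.
    return 20.0 * math.log10((peak + eps) / (math.sqrt(mse) + eps))


# Made-up sample values standing in for real model outputs.
original_out = [0.10, -0.52, 0.33, 0.98]
chunked_out = [0.10, -0.51, 0.33, 0.97]
print(f"PSNR: {compute_psnr(original_out, chunked_out):.1f} dB")
```

A higher value means the chunked model tracks the original more closely. The pass threshold is a project decision and is not specified here.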

Ways to View Layers, OPs, & Precision

CoreMLInspect Models - Layers, OPs, & Precision

layer-iteration.py Model - Layers, OPs, & Precision

Performance Tests

OpenELM-270M-Instruct

Chunked

OpenELM-1B-Instruct

Palettized
