At its core, a GPT model that can take a text file from anywhere on the internet or from local files and imitate the linguistic style of the text
Synthesizer self-attention is an alternative to causal dot-product self-attention, with potential benefits from removing the query-key dot product.
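For reference, below is a minimal sketch of the dense Synthesizer variant (Tay et al., 2020), which predicts attention logits from each token's own representation instead of from query-key dot products. The module, parameter names, and dimensions are illustrative and not taken from any of the listed repositories.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseSynthesizerAttention(nn.Module):
    """Dense Synthesizer: attention weights are synthesized from each token's
    own representation, with no query-key dot product (illustrative sketch)."""
    def __init__(self, d_model, max_len):
        super().__init__()
        self.w1 = nn.Linear(d_model, d_model)   # per-token hidden projection
        self.w2 = nn.Linear(d_model, max_len)   # one logit per attended position
        self.value = nn.Linear(d_model, d_model)

    def forward(self, x):
        # x: (batch, seq_len, d_model), with seq_len <= max_len
        seq_len = x.size(1)
        # Synthesize attention logits directly from x (no Q @ K^T).
        logits = self.w2(F.relu(self.w1(x)))[:, :, :seq_len]
        # Causal mask so each position only attends to itself and earlier positions.
        mask = torch.tril(torch.ones(seq_len, seq_len, device=x.device)).bool()
        logits = logits.masked_fill(~mask, float('-inf'))
        attn = logits.softmax(dim=-1)
        return attn @ self.value(x)
```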
Annotated vanilla implementation in PyTorch of the Transformer model introduced in 'Attention Is All You Need'.
Attention Is All You Need with PyTorch
Deployed locally
Implementation of the multi-head attention mechanism using NumPy and PyTorch
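As background for entries like the one above, here is a minimal NumPy sketch of multi-head self-attention (no masking or dropout); the weight matrices, shapes, and function name are illustrative assumptions, not code from that repository.

```python
import numpy as np

def multihead_attention(x, Wq, Wk, Wv, Wo, n_heads):
    """Minimal multi-head self-attention sketch in NumPy.
    x: (seq_len, d_model); Wq/Wk/Wv/Wo: (d_model, d_model)."""
    seq_len, d_model = x.shape
    d_head = d_model // n_heads

    def split_heads(t):
        # (seq_len, d_model) -> (n_heads, seq_len, d_head)
        return t.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split_heads(x @ Wq), split_heads(x @ Wk), split_heads(x @ Wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)          # (n_heads, seq, seq)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)                    # row-wise softmax
    heads = weights @ v                                          # (n_heads, seq, d_head)
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)  # merge heads
    return concat @ Wo

# Example usage with random weights
rng = np.random.default_rng(0)
d_model, seq_len, n_heads = 64, 10, 8
x = rng.standard_normal((seq_len, d_model))
Wq, Wk, Wv, Wo = (rng.standard_normal((d_model, d_model)) for _ in range(4))
out = multihead_attention(x, Wq, Wk, Wv, Wo, n_heads)   # shape (10, 64)
```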
Machine translation models (with and without attention) for translating sentences from Tamil to Hindi. Transformer models are also used for the same task and their performance is compared.
PyTorch implementation of the Transformer architecture from the paper Attention is All You Need. Includes implementation of attention mechanism.
Official implementation of the paper "FedLSF: Federated Local Graph Learning via Specformers"
A repository of attention-mechanism implementations in PyTorch.
An implementation of the multi-head attention model from a well-known conversational AI paper. The model is trained on both the Cornell Movie-Dialogs corpus and the WikiQA dataset provided by Microsoft.
3D Printing Extrusion Detection using Multi-Head Attention Model
Implementing a GPT (Generative Pre-trained Transformer) model from scratch on Shakespeare's work.
Simple GPT with multi-head attention over char-level tokens, inspired by Andrej Karpathy's video lectures: https://github.com/karpathy/ng-video-lecture
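For context, the char-level tokenization such a GPT operates on can be sketched as below; the corpus string and variable names are stand-ins (in the video lectures the Tiny Shakespeare text is read from disk).

```python
# Minimal char-level tokenizer in the style of Karpathy's nanoGPT tutorials.
corpus = "To be, or not to be, that is the question."
chars = sorted(set(corpus))
stoi = {ch: i for i, ch in enumerate(chars)}   # char -> integer token id
itos = {i: ch for ch, i in stoi.items()}       # token id -> char

encode = lambda s: [stoi[c] for c in s]
decode = lambda ids: ''.join(itos[i] for i in ids)

ids = encode("not to be")
print(ids, decode(ids))   # decode round-trips back to the original string
```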
Transformer model based on the research paper "Attention Is All You Need"
A decoder-only Transformer model for text generation.
A from-scratch implementation of the Transformer as presented in the paper "Attention Is All You Need".
This package is a TensorFlow 2/Keras implementation of Graph Attention Network embeddings and also provides a trainable layer for multi-head graph attention.
Testing the reproducibility of the paper MixSeq: under the assumption that macroscopic time series follow a mixture distribution, the authors hypothesise that lower variance in the constituent latent mixture components could improve estimation of the macroscopic time series.
This repository contains the code for a multi-scale attention-based module that was built and tested on a dataset of concrete crack images. It was later tested on other datasets as well and provided better accuracy than the standard approach.