An implementation of the Dual-Stage Attention-Based Recurrent Neural Network (DA-RNN) for time series prediction, based on this blog post by Chandler Zuo. Since the code is largely copied from his post, the copyright probably belongs to him.
There is a different implementation by Zhenye-Na, but as far as I can tell, it only handles univariate time series.