It's been a year since the previous version, but I finally looked into the actual transfer problems that caused glitches on hardware and fixed them for good! It now runs faster and finishes in 3:27.80.
Also see this issue for more details: gbdev/pandocs#299