Video Language Co-Attention with Multimodal Fast-Learning Feature Fusion for VideoQA

Published in Workshop on Representation Learning for NLP @ ACL, 2022