Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

Paper: https://arxiv.org/pdf/2005.07310.pdf

Related Codes:
https://github.com/airsplay/lxmert
https://github.com/ChenRocks/UNITER