NeurIPS 2022 Workshop, Dec 2022 (Vision-and-Language Learning)
Improving cross-modal attention via object detection