Multimodal Semantics Integration Using Ontologies Enhanced By Ontology Extraction And Cross Modality Disambiguation