Multimodal classification AND Arxiv Doc
Common descendants