Multimodal Models AND Computer Vision
Common descendants