Computer vision

A field of AI focused on interpreting and generating information from images and video (e.g., object detection, segmentation, captioning). Many modern systems use multimodal models that combine vision and language capabilities.

See: CNN; Multimodal; Vision-language model