
URL: https://deepinfra.com/models/zero-shot-image-classification

Models | Machine Learning Inference | DeepInfra




Browse DeepInfra models:

All categories and models you can try out and use directly on DeepInfra:

clip-vit-base-patch32

The CLIP model was developed by OpenAI to investigate the robustness of computer vision models. It uses a Vision Transformer architecture and was trained on a large dataset of image-caption pairs. The model shows promise in various computer vision tasks but also has limitations, including difficulties with fine-grained classification and potential biases in certain applications.
$0.0005 / second
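
As a rough illustration of how a CLIP-style model performs zero-shot classification, the sketch below scores one image embedding against a text embedding for each candidate label using cosine similarity followed by a softmax. The embeddings are made-up toy vectors, the function names are hypothetical, and the scale factor of 100 is an assumption about a typical CLIP logit scale; none of this is DeepInfra's API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def zero_shot_scores(image_emb, label_embs, scale=100.0):
    """Return a probability per label: softmax over scaled cosine similarities.

    `scale` is an assumed logit scale, not a value taken from this page.
    """
    logits = {label: scale * cosine(image_emb, emb)
              for label, emb in label_embs.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(v - top) for label, v in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# Toy embeddings (hypothetical); a real model produces these from pixels and text.
image_emb = [0.9, 0.1, 0.0]
label_embs = {
    "a photo of a cat": [1.0, 0.0, 0.0],
    "a photo of a dog": [0.0, 1.0, 0.0],
}
scores = zero_shot_scores(image_emb, label_embs)
best = max(scores, key=scores.get)
```

Because the labels are supplied as free text at inference time, the set of classes can be changed without retraining the model.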
clip-vit-large-patch14-336

A zero-shot image classification model released by OpenAI. According to its model card, clip-vit-large-patch14-336 was trained from scratch, but the card does not specify the training dataset, evaluation results, intended uses, limitations, optimizer, or precision. The listed framework versions are Transformers 4.21.3, TensorFlow 2.8.2, and Tokenizers 0.12.1.
$0.0005 / second
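
Both models above are listed at $0.0005 per second. A minimal sketch of a cost estimate follows, assuming billing is linear in inference time with no minimum charge or rounding (the listing does not state this); `estimate_cost_usd` is a hypothetical helper, not part of any DeepInfra client.

```python
PRICE_PER_SECOND_USD = 0.0005  # price shown in the listing above

def estimate_cost_usd(inference_seconds):
    """Estimate cost assuming simple linear per-second billing (an assumption)."""
    if inference_seconds < 0:
        raise ValueError("inference_seconds must be non-negative")
    return inference_seconds * PRICE_PER_SECOND_USD

# Example: one hour of cumulative inference time.
hourly_cost = estimate_cost_usd(3600)
```

Under that assumption, an hour of cumulative inference time comes to about $1.80.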
πŸ‘ Built With Love in Palo Alto

© 2026 DeepInfra. All rights reserved.