RSI-CB256-35

RSI-CB256-35 is a vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for multi-class remote sensing image classification. It uses the SiglipForImageClassification architecture and assigns each overhead image to one of 35 land-use and land-cover categories.

Classification Report:
                        precision    recall  f1-score   support

            parking lot    0.9978    0.9872    0.9925       467
                 avenue    0.9927    1.0000    0.9963       544
                highway    0.9283    0.9865    0.9565       223
                 bridge    0.9283    0.9659    0.9467       469
                 marina    0.9946    1.0000    0.9973       366
             crossroads    0.9909    0.9801    0.9855       553
         airport runway    0.9956    0.9926    0.9941       678
               pipeline    0.9900    1.0000    0.9950       198
                   town    0.9970    1.0000    0.9985       335
               airplane    0.9915    0.9915    0.9915       351
                 forest    0.9972    0.9945    0.9958      1082
               mangrove    1.0000    1.0000    1.0000      1049
   artificial grassland    0.9821    0.9717    0.9769       283
river protection forest    1.0000    1.0000    1.0000       524
              shrubwood    1.0000    1.0000    1.0000      1331
                sapling    0.9955    1.0000    0.9977       879
          sparse forest    1.0000    1.0000    1.0000      1110
              lakeshore    1.0000    1.0000    1.0000       438
                  river    0.9680    0.9555    0.9617       539
                 stream    1.0000    0.9971    0.9985       688
              coastline    0.9913    0.9978    0.9946       459
                  hirst    0.9890    1.0000    0.9945       628
                    dam    0.9868    0.9259    0.9554       324
                    sea    0.9971    0.9864    0.9917      1028
          snow mountain    1.0000    1.0000    1.0000      1153
              sandbeach    0.9944    0.9907    0.9925       536
               mountain    0.9926    0.9938    0.9932       812
                 desert    0.9757    0.9927    0.9841      1092
               dry farm    1.0000    0.9992    0.9996      1309
         green farmland    0.9984    0.9969    0.9977       644
              bare land    0.9870    0.9630    0.9748       864
          city building    0.9785    0.9892    0.9838      1014
              residents    0.9926    0.9877    0.9901       810
              container    0.9970    0.9955    0.9962       660
           storage room    0.9985    1.0000    0.9992      1307

               accuracy                        0.9919     24747
              macro avg    0.9894    0.9897    0.9895     24747
           weighted avg    0.9920    0.9919    0.9919     24747
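
As a quick consistency check on the report above, the f1-score column is the harmonic mean of the precision and recall columns. The sketch below recomputes it for two of the weaker rows; the helper name `f1_from_pr` is illustrative and not part of the model card's own code, and because the inputs are already rounded to four decimals, the last digit may differ slightly on other rows.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the "dam" and "highway" rows of the report
rows = {"dam": (0.9868, 0.9259), "highway": (0.9283, 0.9865)}

recomputed = {name: f1_from_pr(p, r) for name, (p, r) in rows.items()}
for name, value in recomputed.items():
    print(f"{name}: {value:.4f}")
```

Both values agree with the reported f1-scores (0.9554 for dam, 0.9565 for highway) to four decimals. For dam the low value comes from recall, for highway from precision.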

Label Space: 35 Remote Sensing Classes

This model supports the classification of satellite or aerial images into the following classes:

Class 0: "parking lot" 
Class 1: "avenue" 
Class 2: "highway" 
Class 3: "bridge" 
Class 4: "marina" 
Class 5: "crossroads" 
Class 6: "airport runway" 
Class 7: "pipeline" 
Class 8: "town" 
Class 9: "airplane" 
Class 10: "forest" 
Class 11: "mangrove" 
Class 12: "artificial grassland" 
Class 13: "river protection forest" 
Class 14: "shrubwood" 
Class 15: "sapling" 
Class 16: "sparse forest" 
Class 17: "lakeshore" 
Class 18: "river" 
Class 19: "stream" 
Class 20: "coastline" 
Class 21: "hirst" 
Class 22: "dam" 
Class 23: "sea" 
Class 24: "snow mountain" 
Class 25: "sandbeach" 
Class 26: "mountain" 
Class 27: "desert" 
Class 28: "dry farm" 
Class 29: "green farmland" 
Class 30: "bare land" 
Class 31: "city building" 
Class 32: "residents" 
Class 33: "container" 
Class 34: "storage room"
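
When working with predictions outside the demo, it can help to have both directions of the mapping. The following is a minimal sketch that builds them from the class order listed above; the names `CLASS_NAMES` and `label2id` are assumptions introduced here for illustration, and the keys are integers rather than the string keys used in the inference code.

```python
# Class names in index order, copied from the list above
CLASS_NAMES = [
    "parking lot", "avenue", "highway", "bridge", "marina",
    "crossroads", "airport runway", "pipeline", "town", "airplane",
    "forest", "mangrove", "artificial grassland", "river protection forest",
    "shrubwood", "sapling", "sparse forest", "lakeshore", "river", "stream",
    "coastline", "hirst", "dam", "sea", "snow mountain", "sandbeach",
    "mountain", "desert", "dry farm", "green farmland", "bare land",
    "city building", "residents", "container", "storage room",
]

# Index -> name, and the inverse name -> index lookup
id2label = dict(enumerate(CLASS_NAMES))
label2id = {name: idx for idx, name in id2label.items()}

print(len(id2label), id2label[22], label2id["storage room"])
```

If the loaded checkpoint's configuration already carries these labels, `model.config.id2label` may be a simpler source. That is an assumption about the checkpoint and has not been verified here.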

Install Dependencies

pip install -q transformers torch pillow gradio

Inference Code

import gradio as gr
from transformers import AutoImageProcessor, SiglipForImageClassification
from PIL import Image
import torch

# Load model and processor
model_name = "prithivMLmods/RSI-CB256-35"
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

# ID to label mapping
id2label = {
 "0": "parking lot",
 "1": "avenue",
 "2": "highway",
 "3": "bridge",
 "4": "marina",
 "5": "crossroads",
 "6": "airport runway",
 "7": "pipeline",
 "8": "town",
 "9": "airplane",
 "10": "forest",
 "11": "mangrove",
 "12": "artificial grassland",
 "13": "river protection forest",
 "14": "shrubwood",
 "15": "sapling",
 "16": "sparse forest",
 "17": "lakeshore",
 "18": "river",
 "19": "stream",
 "20": "coastline",
 "21": "hirst",
 "22": "dam",
 "23": "sea",
 "24": "snow mountain",
 "25": "sandbeach",
 "26": "mountain",
 "27": "desert",
 "28": "dry farm",
 "29": "green farmland",
 "30": "bare land",
 "31": "city building",
 "32": "residents",
 "33": "container",
 "34": "storage room"
}

def classify_rsi_image(image):
    """Return a {label: probability} dict for one image given as a NumPy array."""
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    # Inference only, so gradients are not tracked
    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    prediction = {
        id2label[str(i)]: round(probs[i], 3) for i in range(len(probs))
    }

    return prediction

# Gradio Interface
iface = gr.Interface(
    fn=classify_rsi_image,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(num_top_classes=5, label="Top-5 Predicted Categories"),
    title="RSI-CB256-35",
    description="Remote sensing image classification using SigLIP2. Upload an aerial or satellite image to classify its land-use category."
)

if __name__ == "__main__":
    iface.launch()
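
In the demo, the Gradio label component is configured to show the five highest-scoring classes. When calling `classify_rsi_image` from a script instead, the same selection can be done by hand. The sketch below is one way to do it; the helper `top_k` and the example scores are made up for illustration and are not model output.

```python
def top_k(prediction: dict, k: int = 5) -> list:
    """Return the k (label, probability) pairs with the highest probability."""
    return sorted(prediction.items(), key=lambda item: item[1], reverse=True)[:k]


# Hypothetical scores in the same {label: probability} shape that
# classify_rsi_image returns; these numbers are invented for the example.
example_prediction = {
    "river": 0.612,
    "stream": 0.201,
    "dam": 0.094,
    "bridge": 0.051,
    "lakeshore": 0.022,
    "sea": 0.012,
    "coastline": 0.008,
}

for label, prob in top_k(example_prediction, k=3):
    print(f"{label}: {prob:.3f}")
```

Sorting the full dictionary is sufficient for 35 classes; for much larger label spaces, `heapq.nlargest` would avoid the full sort.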

Intended Use

Land-Use Mapping and Planning
Environmental Monitoring
Infrastructure Identification
Remote Sensing Analytics
Agricultural and Forest Area Classification

Model size: 92.9M params (Safetensors)
Tensor type: F32

Base model: google/siglip2-base-patch16-224

URL: https://huggingface.co/prithivMLmods/RSI-CB256-35
