♻️ Waste Classifier — TIF UPATecO 2026¶
Automatic Waste Classification for Recycling
Applied AI System · Final Integrative Project (TIF)
María Claudia Fabián · UPATecO Salta · 2026
Modelado de Sistemas de IA Aplicada · Lic. Walter G. Ramirez
Cycle 3 — Computer vision · CNN with Transfer Learning (MobileNetV2)
Dataset: RealWaste (UCI ML Repository ID 908) · 9 classes · loaded from this same GitHub repo
⚙️ Automatic setup¶
This cell detects whether we are running in Colab or locally, and loads the dataset from the repository.
# === Setup: detect the environment and prepare the dataset ===
import os, sys
from pathlib import Path

IN_COLAB = "google.colab" in sys.modules

if IN_COLAB:
    # In Colab: clone the repo if it is not there yet
    if not Path("/content/clasificador-residuos").exists():
        print("📥 Cloning repository (includes dataset)...")
        os.system("git clone -q https://github.com/kalu-20/clasificador-residuos.git /content/clasificador-residuos")
    os.chdir("/content/clasificador-residuos")
    print("✅ Colab · directory:", os.getcwd())
else:
    # Locally: the notebook lives at the repo root
    print("✅ Local · directory:", os.getcwd())

# Verify the dataset
DATA = Path("data")
assert DATA.exists(), "data/ not found. Check the repo structure."
classes = sorted(d.name for d in DATA.iterdir() if d.is_dir())
print("\n📊 Dataset available:")
print(f"   {len(classes)} classes: {classes}")
total = sum(1 for _ in DATA.rglob("*.jpg"))
print(f"   {total} images in total")
✅ Local · directory: /sessions/loving-modest-rubin/mnt/Fabi/clasificador-residuos

📊 Dataset available:
   9 classes: ['Cardboard', 'Food_Organics', 'Glass', 'Metal', 'Miscellaneous_Trash', 'Paper', 'Plastic', 'Textile_Trash', 'Vegetation']
   480 images in total
📚 Imports and libraries¶
import warnings; warnings.filterwarnings("ignore")
from pathlib import Path
from collections import Counter

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Subset, WeightedRandomSampler
from torchvision import datasets, transforms, models
from torchvision.models import MobileNet_V2_Weights
from sklearn.model_selection import train_test_split
from sklearn.metrics import (
    accuracy_score, f1_score, classification_report,
    confusion_matrix, ConfusionMatrixDisplay,
    roc_curve, auc,
)
from sklearn.preprocessing import label_binarize
import matplotlib.pyplot as plt
import seaborn as sns
from PIL import Image

torch.manual_seed(42); np.random.seed(42)
sns.set_theme(style="whitegrid")
plt.rcParams["figure.dpi"] = 100

CLASSES = sorted(d.name for d in Path("data").iterdir() if d.is_dir())
PALETTE_9 = ["#3498db","#e74c3c","#27ae60","#f39c12","#9b59b6","#1abc9c","#34495e","#e91e63","#f1c40f"]

print(f"✅ {len(CLASSES)} classes ready:")
for cls in CLASSES:
    n = sum(1 for _ in (Path("data")/cls).iterdir())
    print(f"   {cls:<25} {n} imgs")
✅ 9 classes ready:
   Cardboard                 50 imgs
   Food_Organics             50 imgs
   Glass                     50 imgs
   Metal                     50 imgs
   Miscellaneous_Trash       50 imgs
   Paper                     50 imgs
   Plastic                   80 imgs
   Textile_Trash             50 imgs
   Vegetation                50 imgs
1️⃣ Understanding the problem¶
Waste management is a critical environmental challenge. Automatic classification via computer vision can:
- Speed up recycling at recovery plants.
- Support smart waste containers.
- Teach citizens where each piece of waste belongs.
Question: Given a photo of a piece of waste, can we automatically predict which of 9 categories it belongs to (Cardboard, Food Organics, Glass, Metal, Misc Trash, Paper, Plastic, Textile Trash, Vegetation)?
Type: Supervised multiclass classification · Main metric: macro F1 · Model: MobileNetV2 + Transfer Learning.
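Macro F1, the headline metric, averages the per-class F1 scores with equal weight, so a class the model ignores hurts the score as much as a frequent one. A tiny sketch with made-up labels, just to illustrate the averaging:

```python
from sklearn.metrics import f1_score

# Made-up 3-class example: both class-2 samples are misclassified
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 1, 0, 1]

per_class = f1_score(y_true, y_pred, average=None)  # one F1 per class
macro = f1_score(y_true, y_pred, average="macro")   # unweighted mean
print(per_class, macro)  # [0.8, 0.8, 0.0] -> macro ≈ 0.533
```

This is why macro F1 is a stricter target than accuracy on this dataset: a model that gets Vegetation or Paper completely wrong pays the full price for it.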
2️⃣ Understanding the data¶
RealWaste dataset
- Origin: Whyte's Gully Waste & Resource Recovery facility (Wollongong, Australia)
- UCI ML Repository ID: 908
- License: Creative Commons Attribution 4.0
- Subset used in this repo: ~480 roughly balanced images (above the TIF minimum of 200)
# Show samples (1 per class)
fig, axes = plt.subplots(3, 3, figsize=(11, 11))
for i, cls in enumerate(CLASSES):
    f = list((Path("data")/cls).iterdir())[0]
    img = Image.open(f).convert("RGB").resize((180, 180))
    ax = axes[i//3][i%3]
    ax.imshow(img); ax.axis("off")
    ax.set_title(cls.replace("_", " "), fontweight="bold", color="black", fontsize=12)
plt.suptitle("RealWaste dataset samples (1 per class)", fontweight="bold", color="black", fontsize=14, y=1.0)
plt.tight_layout()
plt.show()
# Image distribution per class
fig, ax = plt.subplots(figsize=(12, 5))
counts = [sum(1 for _ in (Path("data")/c).iterdir()) for c in CLASSES]
bars = ax.bar([c.replace("_", " ") for c in CLASSES], counts, color=PALETTE_9)
ax.set_title("Dataset distribution per class", fontweight="bold", color="black", fontsize=14)
ax.set_ylabel("Number of images")
ax.tick_params(axis="x", rotation=30)
for bar, v in zip(bars, counts):
    ax.text(bar.get_x()+bar.get_width()/2, v+1, str(v), ha="center", fontweight="bold", fontsize=10)
ax.grid(axis="y", alpha=0.3)
plt.tight_layout()
plt.show()
print(f"\nTotal: {sum(counts)} images")
Total: 480 images
3️⃣ Data preparation¶
- Resize to 64×64 (fast on CPU).
- Normalization with ImageNet statistics.
- Augmentation on train: horizontal flip, rotation, color jitter.
- Stratified 80/20 train/test split.
- WeightedRandomSampler to balance the classes.
# Transformations
train_t = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(15),
    transforms.ColorJitter(brightness=0.15, contrast=0.15),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
eval_t = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
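Normalized tensors no longer live in [0, 1], so showing an augmented sample directly with `imshow` gives washed-out colors. A small helper (an addition, not part of the original notebook) inverts the Normalize step for visualization:

```python
import torch

IMAGENET_MEAN = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
IMAGENET_STD = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

def denormalize(t: torch.Tensor) -> torch.Tensor:
    """Invert transforms.Normalize so a (3, H, W) tensor can be plotted."""
    return (t * IMAGENET_STD + IMAGENET_MEAN).clamp(0.0, 1.0)

# Round trip: normalizing then denormalizing recovers the original pixels
x = torch.rand(3, 64, 64)
x_norm = (x - IMAGENET_MEAN) / IMAGENET_STD
assert torch.allclose(denormalize(x_norm), x, atol=1e-6)
```

To inspect an augmented sample: `plt.imshow(denormalize(train_ds[0][0]).permute(1, 2, 0))`.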
# Load everything into a single ImageFolder, then split
full_ds = datasets.ImageFolder("data", transform=eval_t)
labels = [y for _, y in full_ds.samples]
indices = list(range(len(full_ds)))
train_idx, test_idx = train_test_split(indices, test_size=0.20, stratify=labels, random_state=42)

# Train uses augmentation; test uses the eval transforms
train_ds = Subset(datasets.ImageFolder("data", transform=train_t), train_idx)
test_ds = Subset(datasets.ImageFolder("data", transform=eval_t), test_idx)
print(f"📊 Split: {len(train_ds)} train · {len(test_ds)} test")

# Balanced sampler (inverse class frequency)
train_labels = [labels[i] for i in train_idx]
class_counts = Counter(train_labels)
sample_weights = [1.0/class_counts[y] for y in train_labels]
sampler = WeightedRandomSampler(sample_weights, len(sample_weights))
train_loader = DataLoader(train_ds, batch_size=16, sampler=sampler, num_workers=0)
test_loader = DataLoader(test_ds, batch_size=16, shuffle=False, num_workers=0)
📊 Split: 384 train · 96 test
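Because WeightedRandomSampler draws with replacement using inverse-frequency weights, each epoch sees the classes roughly equally even though Plastic has 80 images against 50 for the rest. A self-contained sanity check, with toy labels standing in for `train_labels`:

```python
import torch
from collections import Counter
from torch.utils.data import WeightedRandomSampler

torch.manual_seed(42)

# Imbalanced toy labels: class 0 has four times the samples of class 1
labels = [0] * 80 + [1] * 20
counts = Counter(labels)
weights = [1.0 / counts[y] for y in labels]

# Draw many indices and check that the sampled classes even out
sampler = WeightedRandomSampler(weights, num_samples=10_000, replacement=True)
drawn = Counter(labels[i] for i in sampler)
frac0 = drawn[0] / 10_000
print(f"class 0 fraction after resampling: {frac0:.2f}")  # ~0.50 instead of 0.80
```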
4️⃣ Modeling — Transfer Learning with MobileNetV2¶
# Load MobileNetV2 pre-trained on ImageNet
weights = MobileNet_V2_Weights.IMAGENET1K_V1
model = models.mobilenet_v2(weights=weights)

# Freeze the feature extractor (classic transfer learning)
for p in model.features.parameters():
    p.requires_grad = False

# Replace the classifier head
model.classifier = nn.Sequential(
    nn.Dropout(0.3),
    nn.Linear(model.last_channel, 64),
    nn.ReLU(),
    nn.Linear(64, len(CLASSES)),
)

n_train = sum(p.numel() for p in model.parameters() if p.requires_grad)
n_total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {n_train:,} / {n_total:,} ({n_train/n_total*100:.1f}%)")
Trainable parameters: 82,569 / 2,306,441 (3.6%)
Phase 1 — Frozen training (1 epoch here; raise it in Colab)¶
import time

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-3)
history = {"train_acc": [], "val_acc": [], "train_loss": [], "val_loss": []}

t0 = time.time()
for ep in range(1):  # in Colab you can raise this to 2-3 for better accuracy
    model.train()
    tl, tc, tt = 0.0, 0, 0
    for x, y in train_loader:
        optimizer.zero_grad()
        out = model(x); loss = criterion(out, y)
        loss.backward(); optimizer.step()
        tl += loss.item()*x.size(0); tc += (out.argmax(1)==y).sum().item(); tt += x.size(0)
    model.eval()
    vl, vc, vt = 0.0, 0, 0
    with torch.no_grad():
        for x, y in test_loader:
            out = model(x); loss = criterion(out, y)
            vl += loss.item()*x.size(0); vc += (out.argmax(1)==y).sum().item(); vt += x.size(0)
    history["train_loss"].append(tl/tt); history["train_acc"].append(tc/tt)
    history["val_loss"].append(vl/vt); history["val_acc"].append(vc/vt)
    print(f"Frozen Ep {ep+1}: train_acc={tc/tt:.3f} val_acc={vc/vt:.3f} ({time.time()-t0:.0f}s)")
Frozen Ep 1: train_acc=0.182 val_acc=0.271 (6s)
Phase 2 — Fine-tuning (unfreeze the last 30 parameter tensors)¶
for p in list(model.features.parameters())[-30:]:
    p.requires_grad = True
optimizer = optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-4)

for ep in range(1):  # in Colab you can raise this to 2-3 for better accuracy
    model.train()
    tl, tc, tt = 0.0, 0, 0
    for x, y in train_loader:
        optimizer.zero_grad()
        out = model(x); loss = criterion(out, y)
        loss.backward(); optimizer.step()
        tl += loss.item()*x.size(0); tc += (out.argmax(1)==y).sum().item(); tt += x.size(0)
    model.eval()
    vl, vc, vt = 0.0, 0, 0
    with torch.no_grad():
        for x, y in test_loader:
            out = model(x); loss = criterion(out, y)
            vl += loss.item()*x.size(0); vc += (out.argmax(1)==y).sum().item(); vt += x.size(0)
    history["train_loss"].append(tl/tt); history["train_acc"].append(tc/tt)
    history["val_loss"].append(vl/vt); history["val_acc"].append(vc/vt)
    print(f"FT Ep {ep+1}: train_acc={tc/tt:.3f} val_acc={vc/vt:.3f} ({time.time()-t0:.0f}s)")
FT Ep 1: train_acc=0.312 val_acc=0.260 (13s)
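The two phases above repeat the same train/eval loop body. A small helper (a refactor sketch, not code from the notebook) would remove the duplication:

```python
import torch

def run_epoch(model, loader, criterion, optimizer=None):
    """One pass over loader; trains when an optimizer is given, else evaluates."""
    training = optimizer is not None
    model.train(training)
    total_loss, correct, seen = 0.0, 0, 0
    with torch.set_grad_enabled(training):
        for x, y in loader:
            out = model(x)
            loss = criterion(out, y)
            if training:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            total_loss += loss.item() * x.size(0)
            correct += (out.argmax(1) == y).sum().item()
            seen += x.size(0)
    return total_loss / seen, correct / seen
```

Each phase then reduces to `run_epoch(model, train_loader, criterion, optimizer)` for training and `run_epoch(model, test_loader, criterion)` for validation.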
# Training curves
fig, axes = plt.subplots(1, 2, figsize=(14, 4.5))
ep = list(range(1, len(history["train_acc"])+1))
axes[0].plot(ep, history["train_loss"], "o-", label="Train", color="#3498db", linewidth=2)
axes[0].plot(ep, history["val_loss"], "s-", label="Validation", color="#e74c3c", linewidth=2)
axes[0].set_xlabel("Epoch"); axes[0].set_ylabel("Loss")
axes[0].set_title("Loss during training", fontweight="bold", color="black")
axes[0].legend(); axes[0].grid(alpha=0.3)
axes[1].plot(ep, history["train_acc"], "o-", label="Train", color="#3498db", linewidth=2)
axes[1].plot(ep, history["val_acc"], "s-", label="Validation", color="#27ae60", linewidth=2)
axes[1].set_xlabel("Epoch"); axes[1].set_ylabel("Accuracy")
axes[1].set_title("Accuracy during training", fontweight="bold", color="black")
axes[1].legend(); axes[1].grid(alpha=0.3); axes[1].set_ylim(0, 1.05)
plt.tight_layout()
plt.show()
5️⃣ Evaluation¶
# Final predictions on the test set
model.eval()
all_p, all_y, all_pr = [], [], []
with torch.no_grad():
    for x, y in test_loader:
        out = model(x)
        all_pr.extend(torch.softmax(out, dim=1).numpy())
        all_p.extend(out.argmax(1).numpy())
        all_y.extend(y.numpy())
all_p = np.array(all_p); all_y = np.array(all_y); all_pr = np.array(all_pr)

acc = accuracy_score(all_y, all_p)
f1m = f1_score(all_y, all_p, average="macro")
print("=== Final results ===")
print(f"Accuracy: {acc:.4f}")
print(f"F1 macro: {f1m:.4f}")
print(f"Random baseline: {1/9:.4f} (× {acc/(1/9):.1f} better)")
print()
print(classification_report(all_y, all_p, target_names=CLASSES, zero_division=0))
=== Final results ===
Accuracy: 0.2604
F1 macro: 0.2130
Random baseline: 0.1111 (× 2.3 better)
precision recall f1-score support
Cardboard 0.67 0.20 0.31 10
Food_Organics 0.43 0.60 0.50 10
Glass 0.14 0.10 0.12 10
Metal 0.18 0.90 0.30 10
Miscellaneous_Trash 0.20 0.10 0.13 10
Paper 0.00 0.00 0.00 10
Plastic 0.50 0.12 0.20 16
Textile_Trash 0.33 0.40 0.36 10
Vegetation 0.00 0.00 0.00 10
accuracy 0.26 96
macro avg 0.27 0.27 0.21 96
weighted avg 0.29 0.26 0.21 96
# Confusion matrix + ROC side by side
fig, axes = plt.subplots(1, 2, figsize=(18, 8))
cm = confusion_matrix(all_y, all_p)
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=CLASSES)
disp.plot(ax=axes[0], cmap="Blues", values_format="d", colorbar=False, xticks_rotation=30)
axes[0].set_title(f"Confusion matrix\nAcc={acc:.3f} · F1m={f1m:.3f}", fontweight="bold", color="black", fontsize=13)
y_bin = label_binarize(all_y, classes=list(range(len(CLASSES))))
for i, cls in enumerate(CLASSES):
    fpr, tpr, _ = roc_curve(y_bin[:, i], all_pr[:, i])
    axes[1].plot(fpr, tpr, lw=2, label=f"{cls.replace('_', ' ')[:18]} (AUC={auc(fpr, tpr):.2f})", color=PALETTE_9[i])
axes[1].plot([0, 1], [0, 1], "k--", alpha=0.4)
axes[1].set_xlabel("FPR"); axes[1].set_ylabel("TPR")
axes[1].set_title("One-vs-Rest ROC curves", fontweight="bold", color="black", fontsize=13)
axes[1].legend(loc="lower right", fontsize=8)
axes[1].grid(alpha=0.3)
plt.tight_layout()
plt.show()
# Example predictions (one image per class)
fig, axes = plt.subplots(3, 3, figsize=(13, 13))
for i, cls in enumerate(CLASSES):
    files = list((Path("data")/cls).iterdir())
    f = files[len(files)//2]  # take one from the middle
    img = Image.open(f).convert("RGB")
    x = eval_t(img).unsqueeze(0)
    with torch.no_grad():
        out = model(x)
    probs = torch.softmax(out, dim=1).numpy()[0]
    pred_idx = int(out.argmax().item())
    ax = axes[i//3][i%3]
    ax.imshow(img.resize((180, 180)))
    pred_lbl = CLASSES[pred_idx]
    correct = pred_lbl == cls
    color = "green" if correct else "red"
    ax.set_title(f"True: {cls.replace('_', ' ')}\nPred: {pred_lbl.replace('_', ' ')} ({probs[pred_idx]*100:.0f}%)",
                 fontsize=10, color=color, fontweight="bold")
    ax.axis("off")
plt.suptitle("Model predictions (1 image per class)", fontweight="bold", color="black", fontsize=14, y=1.0)
plt.tight_layout()
plt.show()
6️⃣ Conclusions¶
Lessons learned¶
- Transfer learning works even with little data and on CPU only.
- The model does better on some classes (Food Organics, Metal, Textile) than on others (Vegetation, Paper).
- The model beats random chance by roughly 2×, enough to be useful as a quick triage step.
Limitations¶
- The dataset is Australian; Argentine packaging may look different.
- Only 9 classes: e-waste (RAEE), hazardous, and medical waste are not covered.
- Images were taken under controlled conditions; field photos may degrade performance.
Future work¶
- Train on the full dataset (4,752 images).
- Collect images from Argentina.
- Implement Grad-CAM.
- Build a mobile version.
🎓 TIF completed
María Claudia Fabián · UPATecO Salta · 2026
github.com/kalu-20/clasificador-residuos
References:
- Single, S. & Quinn, E. (2023). RealWaste. Information, 14(12), 633.
- Sandler, M., et al. (2018). MobileNetV2. CVPR.
- UCI ML Repository · RealWaste (ID 908)