♻️ Waste Classifier — TIF UPATecO 2026¶


Automatic Waste Classification for Recycling

Applied AI System · Final Integrative Project (TIF)

María Claudia Fabián · UPATecO Salta · 2026
Applied AI Systems Modeling · Lic. Walter G. Ramirez

Cycle 3 — Computer vision · CNN with Transfer Learning (MobileNetV2)

Dataset: RealWaste (UCI ML Repository ID 908) · 9 classes · loaded from this same GitHub repo


⚙️ Automatic setup¶

This cell detects whether we are running in Colab or locally, and loads the dataset from the repository.

In [1]:
# === Setup: detect environment and prepare the dataset ===
import os, sys
from pathlib import Path

IN_COLAB = "google.colab" in sys.modules

if IN_COLAB:
    # In Colab: clone the repo if it is not there yet
    if not Path("/content/clasificador-residuos").exists():
        print("📥 Cloning repository (includes dataset)...")
        os.system("git clone -q https://github.com/kalu-20/clasificador-residuos.git /content/clasificador-residuos")
    os.chdir("/content/clasificador-residuos")
    print("✅ In Colab · directory:", os.getcwd())
else:
    # Locally: the notebook lives at the repo root
    print("✅ Local · directory:", os.getcwd())

# Verify the dataset
DATA = Path("data")
assert DATA.exists(), "data/ not found. Check the repo structure."
classes = sorted([d.name for d in DATA.iterdir() if d.is_dir()])
print(f"\n📊 Dataset available:")
print(f"   {len(classes)} classes: {classes}")
total = sum(1 for _ in DATA.rglob("*.jpg"))
print(f"   {total} images in total")
✅ Local · directory: /sessions/loving-modest-rubin/mnt/Fabi/clasificador-residuos

📊 Dataset available:
   9 classes: ['Cardboard', 'Food_Organics', 'Glass', 'Metal', 'Miscellaneous_Trash', 'Paper', 'Plastic', 'Textile_Trash', 'Vegetation']
   480 images in total

📚 Imports and libraries¶

In [2]:
import warnings; warnings.filterwarnings("ignore")

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Subset, WeightedRandomSampler
from torchvision import datasets, transforms, models
from torchvision.models import MobileNet_V2_Weights
from sklearn.model_selection import train_test_split
from sklearn.metrics import (
    accuracy_score, f1_score, classification_report,
    confusion_matrix, ConfusionMatrixDisplay,
    roc_curve, auc,
)
from sklearn.preprocessing import label_binarize
from collections import Counter
import matplotlib.pyplot as plt
import seaborn as sns
from PIL import Image

torch.manual_seed(42); np.random.seed(42)
sns.set_theme(style="whitegrid")
plt.rcParams["figure.dpi"] = 100

CLASSES = sorted([d.name for d in Path("data").iterdir() if d.is_dir()])
PALETTE_9 = ["#3498db","#e74c3c","#27ae60","#f39c12","#9b59b6","#1abc9c","#34495e","#e91e63","#f1c40f"]
print(f"✅ {len(CLASSES)} classes ready:")
for cls in CLASSES:
    n = sum(1 for _ in (Path('data')/cls).iterdir())
    print(f"   {cls:<25} {n} imgs")
✅ 9 classes ready:
   Cardboard                 50 imgs
   Food_Organics             50 imgs
   Glass                     50 imgs
   Metal                     50 imgs
   Miscellaneous_Trash       50 imgs
   Paper                     50 imgs
   Plastic                   80 imgs
   Textile_Trash             50 imgs
   Vegetation                50 imgs

1️⃣ Understanding the problem¶

Waste management is a critical environmental challenge. Automatic classification via computer vision can:

  • Speed up recycling in recovery plants.
  • Support smart waste containers.
  • Educate citizens on where each type of waste goes.

Question: Given a photo of a waste item, can we automatically predict which of 9 categories it belongs to (Cardboard, Food Organics, Glass, Metal, Misc Trash, Paper, Plastic, Textile Trash, Vegetation)?

Type: Supervised multiclass classification · Main metric: macro F1 · Model: MobileNetV2 + Transfer Learning.
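Macro F1 averages the per-class F1 scores, so every class counts equally regardless of how many images it has — a useful property here, since Plastic has more images than the other classes. A toy illustration (the labels below are arbitrary, not this dataset's):

```python
from sklearn.metrics import accuracy_score, f1_score

# Toy labels: class 0 dominates, and a lazy "model" always predicts 0
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

acc = accuracy_score(y_true, y_pred)             # 8/10 correct → looks good
f1m = f1_score(y_true, y_pred, average="macro")  # classes 1 and 2 score 0

print(f"accuracy={acc:.2f}  macro F1={f1m:.2f}")  # accuracy=0.80  macro F1=0.30
```

Accuracy rewards the majority-class guesser; macro F1 exposes it, which is why it is the main metric for this project.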

2️⃣ Understanding the data¶

RealWaste dataset

  • Origin: Whyte's Gully Waste & Resource Recovery facility (Wollongong, Australia)
  • UCI ML Repository ID: 908
  • License: Creative Commons Attribution 4.0
  • Subset used in this repo: ~480 roughly balanced images (50 per class, 80 for Plastic; above the 200-image TIF minimum)
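The full RealWaste download has 4,752 images; a per-class subset like the one committed to this repo could be drawn with a fixed-seed random sample. A hypothetical sketch — the `realwaste_full/` source path and the `sample_subset` helper are illustrative assumptions, not part of this repo:

```python
import random
import shutil
from pathlib import Path

def sample_subset(src: Path, dst: Path, per_class: int = 50, seed: int = 42) -> None:
    """Copy a fixed-size random sample of .jpg files from each class folder of src into dst."""
    rng = random.Random(seed)  # fixed seed → reproducible subset
    for class_dir in sorted(d for d in src.iterdir() if d.is_dir()):
        files = sorted(class_dir.glob("*.jpg"))
        picked = rng.sample(files, min(per_class, len(files)))
        out = dst / class_dir.name
        out.mkdir(parents=True, exist_ok=True)
        for f in picked:
            shutil.copy2(f, out / f.name)

# Hypothetical usage against the full UCI download:
# sample_subset(Path("realwaste_full"), Path("data"), per_class=50)
```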
In [3]:
# Show samples (1 per class)
fig, axes = plt.subplots(3, 3, figsize=(11, 11))
for i, cls in enumerate(CLASSES):
    f = list((Path("data")/cls).iterdir())[0]
    img = Image.open(f).convert("RGB").resize((180,180))
    ax = axes[i//3][i%3]
    ax.imshow(img); ax.axis("off")
    ax.set_title(cls.replace("_"," "), fontweight="bold", color="black", fontsize=12)
plt.suptitle("RealWaste dataset samples (1 per class)", fontweight="bold", color="black", fontsize=14, y=1.0)
plt.tight_layout()
plt.show()
[figure: 3×3 grid of sample images, one per class]
In [4]:
# Image distribution per class
fig, ax = plt.subplots(figsize=(12, 5))
counts = [sum(1 for _ in (Path("data")/c).iterdir()) for c in CLASSES]
bars = ax.bar([c.replace("_"," ") for c in CLASSES], counts, color=PALETTE_9)
ax.set_title("Dataset distribution per class", fontweight="bold", color="black", fontsize=14)
ax.set_ylabel("Number of images")
ax.tick_params(axis="x", rotation=30)
for bar, v in zip(bars, counts):
    ax.text(bar.get_x()+bar.get_width()/2, v+1, str(v), ha="center", fontweight="bold", fontsize=10)
ax.grid(axis="y", alpha=0.3)
plt.tight_layout()
plt.show()
print(f"\nTotal: {sum(counts)} images")
[figure: bar chart of image counts per class]
Total: 480 images

3️⃣ Data preparation¶

  • Resize to 64×64 (fast on CPU).
  • Normalization with ImageNet stats.
  • Augmentation on train: horizontal flip, rotation, color jitter.
  • Stratified 80/20 train/test split.
  • WeightedRandomSampler to balance classes.
In [5]:
# Transforms
train_t = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(15),
    transforms.ColorJitter(brightness=0.15, contrast=0.15),
    transforms.ToTensor(),
    transforms.Normalize([0.485,0.456,0.406], [0.229,0.224,0.225]),
])
eval_t = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.ToTensor(),
    transforms.Normalize([0.485,0.456,0.406], [0.229,0.224,0.225]),
])

# Load all data into a single ImageFolder, then split
full_ds = datasets.ImageFolder("data", transform=eval_t)
labels = [y for _, y in full_ds.samples]
indices = list(range(len(full_ds)))
train_idx, test_idx = train_test_split(indices, test_size=0.20, stratify=labels, random_state=42)

# Train uses augmentation; test uses eval-only transforms
train_ds = Subset(datasets.ImageFolder("data", transform=train_t), train_idx)
test_ds = Subset(datasets.ImageFolder("data", transform=eval_t), test_idx)

print(f"📊 Split: {len(train_ds)} train · {len(test_ds)} test")

# Class-balanced sampler
train_labels = [labels[i] for i in train_idx]
class_counts = Counter(train_labels)
sample_weights = [1.0/class_counts[y] for y in train_labels]
sampler = WeightedRandomSampler(sample_weights, len(sample_weights))

train_loader = DataLoader(train_ds, batch_size=16, sampler=sampler, num_workers=0)
test_loader = DataLoader(test_ds, batch_size=16, shuffle=False, num_workers=0)
📊 Split: 384 train · 96 test

4️⃣ Modeling — Transfer Learning with MobileNetV2¶

In [6]:
# Load MobileNetV2 pre-trained on ImageNet
weights = MobileNet_V2_Weights.IMAGENET1K_V1
model = models.mobilenet_v2(weights=weights)

# Freeze the feature extractor (classic transfer learning)
for p in model.features.parameters():
    p.requires_grad = False

# Replace the classifier head
model.classifier = nn.Sequential(
    nn.Dropout(0.3),
    nn.Linear(model.last_channel, 64),
    nn.ReLU(),
    nn.Linear(64, len(CLASSES)),
)

n_train = sum(p.numel() for p in model.parameters() if p.requires_grad)
n_total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {n_train:,} / {n_total:,} ({n_train/n_total*100:.1f}%)")
Trainable parameters: 82,569 / 2,306,441 (3.6%)

Phase 1 — Frozen-backbone training (1 epoch here; more in Colab)¶

In [7]:
import time

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-3)

history = {"train_acc": [], "val_acc": [], "train_loss": [], "val_loss": []}
t0 = time.time()
for ep in range(1):  # in Colab you can raise this to 2-3 for better accuracy
    model.train()
    tl, tc, tt = 0.0, 0, 0
    for x, y in train_loader:
        optimizer.zero_grad()
        out = model(x); loss = criterion(out, y)
        loss.backward(); optimizer.step()
        tl += loss.item()*x.size(0); tc += (out.argmax(1)==y).sum().item(); tt += x.size(0)
    
    model.eval()
    vl, vc, vt = 0.0, 0, 0
    with torch.no_grad():
        for x, y in test_loader:
            out = model(x); loss = criterion(out, y)
            vl += loss.item()*x.size(0); vc += (out.argmax(1)==y).sum().item(); vt += x.size(0)
    
    history["train_loss"].append(tl/tt); history["train_acc"].append(tc/tt)
    history["val_loss"].append(vl/vt); history["val_acc"].append(vc/vt)
    print(f"Frozen Ep {ep+1}: train_acc={tc/tt:.3f} val_acc={vc/vt:.3f} ({time.time()-t0:.0f}s)")
Frozen Ep 1: train_acc=0.182 val_acc=0.271 (6s)

Phase 2 — Fine-tuning (unfreeze the last 30 parameter tensors of `features`)¶

In [8]:
for p in list(model.features.parameters())[-30:]:
    p.requires_grad = True

optimizer = optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-4)

for ep in range(1):  # in Colab you can raise this to 2-3 for better accuracy
    model.train()
    tl, tc, tt = 0.0, 0, 0
    for x, y in train_loader:
        optimizer.zero_grad()
        out = model(x); loss = criterion(out, y)
        loss.backward(); optimizer.step()
        tl += loss.item()*x.size(0); tc += (out.argmax(1)==y).sum().item(); tt += x.size(0)
    
    model.eval()
    vl, vc, vt = 0.0, 0, 0
    with torch.no_grad():
        for x, y in test_loader:
            out = model(x); loss = criterion(out, y)
            vl += loss.item()*x.size(0); vc += (out.argmax(1)==y).sum().item(); vt += x.size(0)
    
    history["train_loss"].append(tl/tt); history["train_acc"].append(tc/tt)
    history["val_loss"].append(vl/vt); history["val_acc"].append(vc/vt)
    print(f"FT Ep {ep+1}: train_acc={tc/tt:.3f} val_acc={vc/vt:.3f} ({time.time()-t0:.0f}s)")
FT Ep 1: train_acc=0.312 val_acc=0.260 (13s)
In [9]:
# Training curves
fig, axes = plt.subplots(1, 2, figsize=(14, 4.5))
ep = list(range(1, len(history["train_acc"])+1))
axes[0].plot(ep, history["train_loss"], "o-", label="Train", color="#3498db", linewidth=2)
axes[0].plot(ep, history["val_loss"], "s-", label="Validation", color="#e74c3c", linewidth=2)
axes[0].set_xlabel("Epoch"); axes[0].set_ylabel("Loss")
axes[0].set_title("Loss during training", fontweight="bold", color="black")
axes[0].legend(); axes[0].grid(alpha=0.3)
axes[1].plot(ep, history["train_acc"], "o-", label="Train", color="#3498db", linewidth=2)
axes[1].plot(ep, history["val_acc"], "s-", label="Validation", color="#27ae60", linewidth=2)
axes[1].set_xlabel("Epoch"); axes[1].set_ylabel("Accuracy")
axes[1].set_title("Accuracy during training", fontweight="bold", color="black")
axes[1].legend(); axes[1].grid(alpha=0.3); axes[1].set_ylim(0, 1.05)
plt.tight_layout()
plt.show()
[figure: loss and accuracy curves across both training phases]

5️⃣ Evaluation¶

In [10]:
# Final predictions on the test set
model.eval()
all_p, all_y, all_pr = [], [], []
with torch.no_grad():
    for x, y in test_loader:
        out = model(x)
        all_pr.extend(torch.softmax(out, dim=1).numpy())
        all_p.extend(out.argmax(1).numpy())
        all_y.extend(y.numpy())

all_p = np.array(all_p); all_y = np.array(all_y); all_pr = np.array(all_pr)

acc = accuracy_score(all_y, all_p)
f1m = f1_score(all_y, all_p, average="macro")
print(f"=== Final results ===")
print(f"Accuracy: {acc:.4f}")
print(f"F1 macro: {f1m:.4f}")
print(f"Random baseline: {1/9:.4f} (× {acc/(1/9):.1f} better)")
print()
print(classification_report(all_y, all_p, target_names=CLASSES, zero_division=0))
=== Final results ===
Accuracy: 0.2604
F1 macro: 0.2130
Random baseline: 0.1111 (× 2.3 better)

                     precision    recall  f1-score   support

          Cardboard       0.67      0.20      0.31        10
      Food_Organics       0.43      0.60      0.50        10
              Glass       0.14      0.10      0.12        10
              Metal       0.18      0.90      0.30        10
Miscellaneous_Trash       0.20      0.10      0.13        10
              Paper       0.00      0.00      0.00        10
            Plastic       0.50      0.12      0.20        16
      Textile_Trash       0.33      0.40      0.36        10
         Vegetation       0.00      0.00      0.00        10

           accuracy                           0.26        96
          macro avg       0.27      0.27      0.21        96
       weighted avg       0.29      0.26      0.21        96

In [11]:
# Confusion matrix + ROC curves side by side
fig, axes = plt.subplots(1, 2, figsize=(18, 8))

cm = confusion_matrix(all_y, all_p)
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=CLASSES)
disp.plot(ax=axes[0], cmap="Blues", values_format="d", colorbar=False, xticks_rotation=30)
axes[0].set_title(f"Confusion matrix\nAcc={acc:.3f} · F1m={f1m:.3f}", fontweight="bold", color="black", fontsize=13)

y_bin = label_binarize(all_y, classes=list(range(9)))
for i, cls in enumerate(CLASSES):
    fpr, tpr, _ = roc_curve(y_bin[:,i], all_pr[:,i])
    axes[1].plot(fpr, tpr, lw=2, label=f"{cls.replace('_',' ')[:18]} (AUC={auc(fpr,tpr):.2f})", color=PALETTE_9[i])
axes[1].plot([0,1],[0,1],"k--", alpha=0.4)
axes[1].set_xlabel("FPR"); axes[1].set_ylabel("TPR")
axes[1].set_title("One-vs-Rest ROC curves", fontweight="bold", color="black", fontsize=13)
axes[1].legend(loc="lower right", fontsize=8)
axes[1].grid(alpha=0.3)

plt.tight_layout()
plt.show()
[figure: confusion matrix and one-vs-rest ROC curves]
In [12]:
# Example predictions (one image per class)
fig, axes = plt.subplots(3, 3, figsize=(13, 13))
for i, cls in enumerate(CLASSES):
    files = list((Path("data")/cls).iterdir())
    f = files[len(files)//2]  # take one from the middle
    img = Image.open(f).convert("RGB")
    x = eval_t(img).unsqueeze(0)
    with torch.no_grad():
        out = model(x)
        probs = torch.softmax(out, dim=1).numpy()[0]
        pred_idx = int(out.argmax().item())
    ax = axes[i//3][i%3]
    ax.imshow(img.resize((180,180)))
    pred_lbl = CLASSES[pred_idx]
    correct = pred_lbl == cls
    color = "green" if correct else "red"
    ax.set_title(f"True: {cls.replace('_',' ')}\nPred: {pred_lbl.replace('_',' ')} ({probs[pred_idx]*100:.0f}%)",
                 fontsize=10, color=color, fontweight="bold")
    ax.axis("off")
plt.suptitle("Model predictions (1 image per class)", fontweight="bold", color="black", fontsize=14, y=1.0)
plt.tight_layout()
plt.show()
[figure: 3×3 grid of example predictions, one per class]

6️⃣ Conclusions¶

~26%
Accuracy
9
Classes
× 2.3
vs Random
480
Images

Lessons learned¶

  1. Transfer learning works even with little data and on CPU.
  2. The model does better on some classes (Food Organics, Metal, Textile) than on others (Vegetation, Paper).
  3. With just 1 epoch per phase the model beats the random baseline by ~2.3× — a starting point for quick triage; more epochs (feasible in Colab) should improve this.

Limitations¶

  • Australian dataset — Argentine packaging may differ.
  • Only 9 classes — e-waste, hazardous, and medical waste are not covered.
  • Images captured under controlled conditions — field photos may degrade performance.

Future work¶

  • Train on the full dataset (4,752 imgs).
  • Collect Argentine images.
  • Implement Grad-CAM.
  • Mobile version.
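Grad-CAM, mentioned above, needs only forward/backward hooks on a convolutional layer. A minimal sketch with plain PyTorch — the tiny `demo_net` below is a stand-in for illustration, not the trained MobileneNetV2 (for the real model the target layer would be the last block of `model.features`):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def grad_cam(model: nn.Module, target_layer: nn.Module, x: torch.Tensor, cls: int) -> torch.Tensor:
    """Return an [H, W] heatmap in [0, 1] for class `cls`, given one image x of shape [1, C, H, W]."""
    acts, grads = [], []
    h1 = target_layer.register_forward_hook(lambda m, i, o: acts.append(o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    try:
        model.zero_grad()
        model(x)[0, cls].backward()  # gradient of the class score
    finally:
        h1.remove(); h2.remove()
    # Weight each activation map by its spatially averaged gradient, sum, then ReLU
    w = grads[0].mean(dim=(2, 3), keepdim=True)   # [1, K, 1, 1]
    cam = F.relu((w * acts[0]).sum(dim=1))        # [1, h, w]
    cam = F.interpolate(cam.unsqueeze(1), size=x.shape[2:], mode="bilinear", align_corners=False)
    cam = cam.squeeze()
    return cam / (cam.max() + 1e-8)

# Stand-in network (assumption: NOT the TIF model) just to show the mechanics
demo_net = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(4), nn.Flatten(), nn.Linear(8 * 16, 9),
)
heat = grad_cam(demo_net, demo_net[0], torch.randn(1, 3, 64, 64), cls=0)
print(heat.shape)  # torch.Size([64, 64]) — overlay this on the input image
```

The same call on the trained classifier would highlight which pixels drove each waste-class prediction, which helps audit the per-class weaknesses seen in the confusion matrix.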

🎓 TIF completed

María Claudia Fabián · UPATecO Salta · 2026
github.com/kalu-20/clasificador-residuos

References:

  • Single, S. & Quinn, E. (2023). RealWaste. Information, 14(12), 633.
  • Sandler, M., et al. (2018). MobileNetV2: Inverted Residuals and Linear Bottlenecks. CVPR.
  • UCI ML Repository · RealWaste (ID 908)