Deep Learning

Articles about Deep Learning fundamentals and advanced methods, from supervised to self-supervised learning, including Vision Transformers, DINO, CLIP, SAM2, and more; all implemented in PyTorch.

U2SEG: Unsupervised Universal Image Segmentation

Deep Learning, Image Segmentation

U2SEG is the first unsupervised approach to combine instance, semantic, and panoptic segmentation. It uses the MaskCut algorithm from CutLER to create instance segmentation masks and…

Image Segmentation, Deep Learning

U-Net was published in 2015 for spotting microscopic cells in biomedical scans, and it has become very popular since then. It had a massive impact; before…

Deep Learning, Image Segmentation

There are many supervised object detection and instance segmentation models (YOLO, the R-CNN family, DETR, …). The pipeline is the same for each one; first,…

Deep Learning

When we talk about depth in computer vision, we think about stereo cameras, time-of-flight sensors, and LiDAR. These methods don’t work with a single RGB…

Deep Learning

“a tiny vision language model that kicks ass and runs anywhere”: that is exactly how the creators of Moondream defined it. They are not wrong,…

Deep Learning, Image Segmentation

SAM3 has just been announced, and everybody is talking about it. But what is all this hype about? As you probably already know, all the…

Deep Learning, Object Detection

→ An article explaining Grounding DINO and how to detect objects with text prompts. I have written many articles about closed-set object detection, and most…

Object Detection, Deep Learning

→ A step-by-step guide for training DETR (Detection Transformer) object detection models in PyTorch with any dataset. When it comes to object detection, there are popular models…

Deep Learning

→ An article explaining DINOv3 and demonstrating how to create similarity maps using the cosine similarity formula. Just look around. You probably see a door, window, bookcase,…

Image Classification, Deep Learning

→ An article explaining CLIP and demonstrating image classification using CLIP models. I normally like to write an introduction paragraph about the article, but not…

Deep Learning

U2SEG: Unsupervised Universal Image Segmentation

Pipeline for Training U-Net Semantic Segmentation Models with PyTorch

CutLER: Unsupervised Object Detection and Instance Segmentation

Depth Anything V2: Generating Depth Maps from RGB Images

Moondream: Tiny Vision Language Model

SAM3: Segment Objects with a Text Prompt on Videos and Images

Grounding DINO: Detecting Objects with Text Prompts

Pipeline for Training DETR Object Detection Models

Introduction to DINOv3: Generating Similarity Maps with Vision Transformers

Image Classification with CLIP: Image-Text Similarity and Zero-Shot Labels
