Any backbone for linear segmentation #777
Pull request overview
This PR introduces a new “linear semantic segmentation” task-model implementation that is intended to be extendable across multiple backbone families by adding a small model-name → backbone/config registry, and it refactors internal model-wrapper typing to distinguish multi-scale ViT vs CNN capabilities.
Changes:
- Added a new linear semantic segmentation task model (LinearSemanticSegmentation) with tiling/untiling inference utilities and a dedicated training loop.
- Introduced a registry-based configuration layer for mapping *-linear model names to a backbone + default backbone args.
- Refined model wrapper / package typing to express “multi-scale feature” capability for ViTs vs CNNs, and updated DINOv2/DINOv3 wrappers accordingly.
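
The registry idea in the second bullet can be illustrated with a minimal sketch. It is not the code from this PR: `BackboneEntry`, `ModelRegistry`, the model names, and the backbone args below are all hypothetical and only show how a `*-linear` model name might resolve to a backbone plus default backbone args.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class BackboneEntry:
    """Hypothetical registry entry: a backbone name plus its default args."""

    backbone: str
    backbone_args: Dict[str, Any] = field(default_factory=dict)


class ModelRegistry:
    """Hypothetical registry from task-model names to backbone entries."""

    def __init__(self) -> None:
        self._entries: Dict[str, BackboneEntry] = {}

    def register(self, model_name: str, entry: BackboneEntry) -> None:
        # Refuse silent overwrites so two configs cannot claim the same name.
        if model_name in self._entries:
            raise ValueError(f"Model '{model_name}' is already registered.")
        self._entries[model_name] = entry

    def get(self, model_name: str) -> BackboneEntry:
        try:
            return self._entries[model_name]
        except KeyError:
            known = ", ".join(sorted(self._entries))
            raise ValueError(
                f"Unknown model '{model_name}'. Known models: {known}"
            ) from None


# Illustrative entries only; these names do not come from the PR.
REGISTRY = ModelRegistry()
REGISTRY.register(
    "example-vit-linear",
    BackboneEntry(backbone="example/vit", backbone_args={"drop_path_rate": 0.0}),
)
REGISTRY.register(
    "example-convnext-linear",
    BackboneEntry(backbone="example/convnext"),
)
```

With a layout like this, supporting another backbone family would mean adding one `register` call, and the task model itself would stay unchanged.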
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 21 comments.
| File | Description |
|---|---|
| src/lightly_train/_task_models/linear_semantic_segmentation/transforms.py | Adds default train/val transform arg presets for linear semantic segmentation. |
| src/lightly_train/_task_models/linear_semantic_segmentation/train_model.py | Adds the training/validation loop for linear semantic segmentation. |
| src/lightly_train/_task_models/linear_semantic_segmentation/task_model.py | Adds the inference-time task model (tiling, postprocess) and backbone wiring. |
| src/lightly_train/_task_models/linear_semantic_segmentation/config.py | Adds registry entries mapping *-linear model names to backbone definitions. |
| src/lightly_train/_models/package.py | Introduces MultiScaleFeaturePackage and a model-name parsing protocol. |
| src/lightly_train/_models/model_wrapper.py | Splits multi-scale feature protocols into ViT/CNN variants and adds stride/patch-size capability. |
| src/lightly_train/_models/dinov3/dinov3_vit.py | Updates DINOv3 ViT wrapper to provide patch size and new protocol typing. |
| src/lightly_train/_models/dinov3/dinov3_convnext.py | Updates DINOv3 ConvNeXt wrapper typing and adds multi-scale stride reporting. |
| src/lightly_train/_models/dinov3/dinov3_src/models/convnext.py | Adds a type annotation for downsample_layers. |
| src/lightly_train/_models/dinov3/dinov3_package.py | Marks DINOv3 as a multi-scale-feature package and adds protocol typing. |
| src/lightly_train/_models/dinov2_vit/dinov2_vit.py | Updates DINOv2 ViT wrapper to provide patch size and new protocol typing. |
| src/lightly_train/_models/dinov2_vit/dinov2_vit_package.py | Marks DINOv2 as a multi-scale-feature package and adds protocol typing. |
| src/lightly_train/_configs/model_registry.py | Adds a generic registry utility used for model/config lookup. |
| src/lightly_train/_configs/config.py | Adds ConfigsNamespace to enforce “namespace-only” config containers. |
| src/lightly_train/_commands/train_task_helpers.py | Switches the training task helper to import the new linear semantic segmentation train model. |
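
The ViT/CNN split described for model_wrapper.py could look roughly like the sketch below. It assumes two structural protocols, one exposing a patch size and one exposing per-stage strides. The protocol names, method names, and toy classes are hypothetical and are not the identifiers used in the PR.

```python
from typing import List, Protocol, runtime_checkable


@runtime_checkable
class MultiScaleViT(Protocol):
    """Hypothetical capability: a ViT reports a single patch size."""

    def get_patch_size(self) -> int: ...


@runtime_checkable
class MultiScaleCNN(Protocol):
    """Hypothetical capability: a CNN reports one stride per stage."""

    def get_feature_strides(self) -> List[int]: ...


class ToyViT:
    def get_patch_size(self) -> int:
        return 16


class ToyConvNet:
    def get_feature_strides(self) -> List[int]:
        return [4, 8, 16, 32]


def output_strides(model: object, num_scales: int) -> List[int]:
    """Return the stride of each feature map relative to the input image."""
    if num_scales < 1:
        raise ValueError("num_scales must be at least 1.")
    if isinstance(model, MultiScaleViT):
        # A plain ViT keeps the same token grid in every block, so features
        # taken from different depths all share the patch-size stride.
        return [model.get_patch_size()] * num_scales
    if isinstance(model, MultiScaleCNN):
        # A hierarchical CNN downsamples per stage; use the deepest stages.
        return model.get_feature_strides()[-num_scales:]
    raise TypeError(f"{type(model).__name__} has no multi-scale capability.")
```

A segmentation head could use such strides to decide how far to upsample logits back to the input resolution without knowing which backbone family it is attached to.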
class MultiScaleFeaturePackage(Package):
Note: Eventually all packages should implement this.
DinoVisionTransformer,
from lightly_train._models.model_wrapper import ModelWrapper
from lightly_train._models.package import MultiScaleFeaturePackage
from lightly_train._task_models.dinov2_linear_semantic_segmentation.config import (
I'll do a lot of renaming from dinov2_linear_semantic_... to just semantic_ in a follow-up.
/review
What has changed and why?
This is the beginning of a refactoring effort.
Follow-ups will rename things away from DINOv2-specific names to generic ones. After that I will move on to object detection, but I don't want this to interfere with #772 etc. (ECViT implementations).
How has it been tested?
Did you update CHANGELOG.md?
Did you update the documentation?