feat: add ECViT LTDETR backbone wrapper#772
Conversation
|
@codex review |
|
Codex Review: Didn't find any major issues. Swish! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
/review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4ddad28aae
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| def forward(self, *, H: int, W: int) -> tuple[Tensor, Tensor]: | ||
| periods = cast(Tensor, self.periods) | ||
| device = periods.device | ||
| dtype = self.dtype if self.dtype is not None else torch.get_default_dtype() |
There was a problem hiding this comment.
Preserve RoPE dtype under mixed precision
When this backbone is run with AMP or after model.half(), the RoPE buffer is moved to fp16 but this line still creates coordinates, sin, and cos in torch.get_default_dtype() (usually fp32). apply_rope then converts only q/k to fp32 while v remains fp16, so scaled_dot_product_attention(q, k, v, ...) receives mixed dtypes and can fail in the mixed-precision LTDETR training/inference path. Derive the generated RoPE dtype from the buffer/input dtype instead of the global default.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
Adds an EdgeCrafter ECViT backbone + ViTAdapter port as a standalone LTDETR-compatible backbone wrapper in lightly_train, along with licensing/NOTICE updates and unit tests. This sets up ECViT feature pyramid outputs (P3, P4, P5) for later integration into DINOv3LTDETRObjectDetection.
Changes:
- Introduces
ECViTWrapperimplementing the LTDETR backbone interface and checkpoint loading/unwrapping. - Adds pytest coverage for presets, output shapes, layer-fusion behavior, and strict checkpoint loading.
- Updates third-party licensing attribution (NOTICE + Apache 2.0 license header tooling + license text).
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
src/lightly_train/_task_models/dinov3_ltdetr_object_detection/ecvit_wrapper.py |
New ECViT backbone wrapper + supporting modules + checkpoint loading utilities. |
tests/_task_models/dinov3_ltdetr_object_detection/test_ecvit_wrapper.py |
Unit tests for wrapper instantiation, forward output contract, fusion behavior, and weight loading. |
NOTICE |
Adds EdgeCrafter attribution and modification notes. |
Makefile |
Excludes the new EdgeCrafter-derived file from the default header step and applies an Apache 2.0 header template. |
licences/EDGECRAFTER_LICENSE |
Adds the Apache 2.0 license text for EdgeCrafter. |
dev_tools/edgecrafter_licenseheader.tmpl |
Adds the Apache 2.0 license header template for EdgeCrafter-derived files. |
| torch.Size([2, expected_channels, 2, 2]), | ||
| torch.Size([2, expected_channels, 1, 1]), | ||
| ] | ||
|
|
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
liopeer
left a comment
There was a problem hiding this comment.
Looks clean and working. I would move it to the _models subpackage and properly implement the ModelWrapper in follow-up PRs.
|
|
||
| # EdgeCrafter calls this module ViTAdapter. Keep the alias for familiarity while the | ||
| # Lightly-facing name describes the LTDETR backbone wrapper contract. | ||
| ViTAdapter = ECViTWrapper |
There was a problem hiding this comment.
Unless this is actually required somewhere I would discard it.
| @@ -0,0 +1,596 @@ | |||
| # | |||
| # Licensed under the Apache License, Version 2.0 (the "License"); | |||
There was a problem hiding this comment.
Are you planning to move it to a package inside _models?
There was a problem hiding this comment.
Yeah, I would write a package in a follow-up PR . I will move it to _models since is more suitable.
| state_dict = _unwrap_state_dict(state) | ||
| self.backbone.load_state_dict(state_dict, strict=True) | ||
|
|
||
| def forward(self, x: Tensor) -> tuple[Tensor, Tensor, Tensor]: |
There was a problem hiding this comment.
Note: If you're planning to move it to a package (which i would), you should implement the interfaces in the src/lightly_train/_models/model_wrapper.py module.
What has changed and why?
Ports EdgeCrafter's ECViT backbone + ViTAdapter into Lightly as a standalone
LTDETR-compatible backbone wrapper. This is the backbone port only; wiring into
DINOv3LTDETRObjectDetectionand end-to-end integration will be done in afollow-up PR.
Highlights:
ECViTWrapperatsrc/lightly_train/_task_models/dinov3_ltdetr_object_detection/ecvit_wrapper.py.Contract:
forward(x) -> tuple[Tensor, Tensor, Tensor], matching the currentLTDETR backbone interface.
/fuse -> reshape to spatial map -> interpolate to 3 levels -> project
channels -> return
(P3, P4, P5).ecvitt,ecvittplus,ecvits,ecvitsplus.Does not use
DINOv3STAsor any DINOv3 weight key conversion.How has it been tested?
Added
tests/_task_models/dinov3_ltdetr_object_detection/test_ecvit_wrapper.py:(P3, P4, P5)with theexpected channels and spatial sizes.
to a spatial map.
weights_pathloads backbone state dicts strictly, includingstate_dict/model/backbonecheckpoint containers.nameandnum_levels != 3raise the expected errors.EdgeCrafter
.pthfiles whenECVIT_CHECKPOINT_DIRis set.Validated locally:
PYTHONPATH=src python -m ruff check \ src/lightly_train/_task_models/dinov3_ltdetr_object_detection/ecvit_wrapper.py \ tests/_task_models/dinov3_ltdetr_object_detection/test_ecvit_wrapper.py python -m compileall \ src/lightly_train/_task_models/dinov3_ltdetr_object_detection/ecvit_wrapper.py PYTHONPATH=src pytest \ tests/_task_models/dinov3_ltdetr_object_detection/test_ecvit_wrapper.py -q # 11 passed, 4 skippedDid you update CHANGELOG.md?
The wrapper is not yet wired into a task model and is not exposed to users.
Did you update the documentation?
No user-facing surface changes in this PR. Documentation will be updated when
the wrapper is wired into
DINOv3LTDETRObjectDetectionin the follow-up PR.