Hi @ZhengPeng7,
We have trained a car background removal (segmentation) model using a dataset of 4K images with a resolution of 4032x3024. The model performs well on the 4K testing dataset and also works reasonably well on 1K images (1365x1024) for most cases. However, we have observed that for certain images, the 4K version produces a clean and accurate background removal, while the same image resized to 1K yields results that are noticeably less precise.
To illustrate this, I have attached four files:
• small_car1.png: 1K image (1365x1024)
• small_car1_mask.png: Mask for the 1K image
• large_car1.png: 4K image (4032x3024)
• large_car1_mask.png: Mask for the 4K image
As you can see, the differences are small in area but matter in practice, especially in the finer details of the car.
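To put a number on the gap, the two masks could be compared roughly as in the sketch below. This is only an illustration, not the evaluation we actually ran: the helper names `resize_nearest`, `mask_iou`, and `cross_resolution_iou` are hypothetical, the 0.5 threshold is an arbitrary choice, and it assumes both masks are already loaded as 2-D NumPy arrays with values in [0, 1].

```python
import numpy as np


def resize_nearest(mask, out_h, out_w):
    """Nearest-neighbour resize of a 2-D array via integer index lookup."""
    in_h, in_w = mask.shape
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return mask[rows[:, None], cols[None, :]]


def mask_iou(mask_a, mask_b, threshold=0.5):
    """Intersection over union of two same-shaped masks after thresholding."""
    a = mask_a >= threshold
    b = mask_b >= threshold
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(np.logical_and(a, b).sum() / union)


def cross_resolution_iou(small_mask, large_mask, threshold=0.5):
    """Upscale the low-resolution mask to the high-resolution grid, then compare."""
    small_up = resize_nearest(small_mask, *large_mask.shape)
    return mask_iou(small_up, large_mask, threshold)
```

Calling `cross_resolution_iou(small_mask, large_mask)` on the attached pair would give one agreement score per image. Since IoU is dominated by the car body, a boundary-focused metric may show the fine-detail differences more clearly.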
Questions:
1. Could you share insights on why the model might produce less accurate results when the input image is downscaled to 1K, even though augmentation with resizing was part of training?
2. From an approach perspective, should we take our current model (trained on 4K images) and fine-tune it using the same dataset downscaled to 1K? Would this improve accuracy for 1K inputs?
3. Alternatively, is there a better method to ensure consistent performance across both 4K and 1K resolutions?
Your advice on this would be greatly appreciated.