vault backup: 2023-12-19 20:31:21

2023-12-19 20:31:21 +08:00
parent f1bf108986
commit 36034a9d51
1 changed files with 3 additions and 2 deletions
--- a/Paper/Open-Vocabulary
+++ b/Paper/Open-Vocabulary
@@ -13,8 +13,9 @@
 		4. Use MaskFormer (a mask proposal generator trained on COCO) as a region proposal generator.
 		5. Select the region proposals with the highest overlap with the ground-truth masks.
 		6. Assign the object label to this region.
-		7. This model reach mIoU of 66.5%(despite)
-
+		7. This model reaches an mIoU of 66.5%, despite the imperfect region proposals.
+	- Conclusion
+	  Pre-trained CLIP does not perform well on masked images. We hypothesize that this is because CLIP was trained on natural images, which are not cropped or corrupted by segmentation masks.

 ## Vocabularies
 1. ground-truth masks: refer to the manually annotated masks or pixel-level labels that are used to define the correct segmentation of objects in an image. Each pixel in the ground-truth mask is assigned a specific class label corresponding to the object or region it belongs to.