Authors
Param Rajpura, Alakh Aggarwal, Manik Goyal, Sanchit Gupta, Jonti Talukdar, Hristo Bojinov, Ravi Hegde
Publication date
2017/12/16
Book
National Conference on Computer Vision, Pattern Recognition, Image Processing, and Graphics
Pages
517-528
Publisher
Springer Singapore
Description
We show that fine-tuning pretrained CNNs entirely on synthetic images is an effective strategy for transfer learning. We apply this strategy to detecting packaged food products clustered in refrigerator scenes. A CNN pretrained on the COCO dataset and fine-tuned with our 4000 synthetic images achieves a mean average precision (mAP at 0.5 IoU) of 52.59 on a test set of real images (150 distinct products as objects of interest and 25 distractor objects), compared with 24.15 without such fine-tuning. The synthetic images were rendered from freely available 3D models with variations in parameters such as color, texture, and viewpoint, without a strong emphasis on photorealism. We analyze factors such as training dataset size, cue variances, 3D model dictionary size, and network architecture for their influence on transfer learning performance. Additionally, training strategies like fine-tuning …
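The reported metric counts a detection as correct only when its box overlaps a ground-truth box with an intersection over union (IoU) of at least 0.5. A minimal sketch of that overlap test follows. It assumes boxes are given as (x1, y1, x2, y2) corner coordinates, and the function names `iou` and `is_match` are illustrative rather than taken from the paper; the full mAP computation (ranking detections by score and averaging precision per class) is not shown.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2).

    The corner-coordinate box format is an assumption made for this sketch.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped to zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """True when a predicted box counts as a hit at the given IoU threshold."""
    return iou(pred_box, gt_box) >= threshold
```

For example, `is_match((0, 0, 2, 2), (1, 0, 3, 2))` compares two boxes that share half of each box's area; their IoU is 1/3, so the prediction would not count as a hit at the 0.5 threshold.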
Total citations
Citations per year, 2018 to 2024 (bar chart)
Scholar articles
P Rajpura, A Aggarwal, M Goyal, S Gupta, J Talukdar… - National Conference on Computer Vision, Pattern …, 2017