Datasets for object recognition


  • 2017 e-Lab Video Data Set
  • ImageNet
    • 2012-2016
    • I. Object localization for 1000 categories.
    • II. Object detection for 200 fully labeled categories.
    • III. Object detection from video for 30 fully labeled categories.
    • IV. Scene classification for 365 scene categories (joint with the MIT Places team) on the Places2 database.
    • V. Scene parsing for 150 stuff and discrete object categories (joint with the MIT Places team).
  • Microsoft COCO: Common Objects in Context
    • Object segmentation
    • Recognition in Context
    • Multiple objects per image
    • More than 300,000 images
    • More than 2 Million instances
    • 80 object categories
    • 5 captions per image
    • Keypoints on 100,000 people
  • SUN Database: Scene Categorization Benchmark
  • PASCAL VOC: The Pattern Analysis, Statistical Modeling and Computational Learning (PASCAL) Visual Object Classes (VOC)
    • 2005-2012
    • Classification: For each of the twenty classes, predicting presence/absence of an example of that class in the test image.
    • Detection: Predicting the bounding box and label of each object from the twenty target classes in the test image.
    • Segmentation: Generating pixel-wise segmentations giving the class of the object visible at each pixel, or “background” otherwise.
    • Action Classification: Predicting the action(s) being performed by a person in a still image.
    • Person Layout: Predicting the bounding box and label of each part of a person (head, hands, feet).
  • MIT’s Places2
    • Places365-Standard has ~1.8 million images from 365 scene categories
    • Places365-Challenge adds 6.2 million extra images on top of Places365-Standard
  • CUB-200: Caltech-UCSD Birds 200
  • Oxford Flower Dataset
  • Food-101
  • Caltech-101/Caltech-256
  • Caltech 10,000 Web Faces
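
The PASCAL VOC task descriptions above differ mainly in what a prediction looks like: a yes/no per class, a labeled box per object, or a label per pixel. The following is a minimal Python sketch of those three output shapes, not the official VOC devkit format. The `Detection` class, both helper functions, and the `(xmin, ymin, xmax, ymax)` box convention are assumptions made for illustration, and filling boxes to get a label map is only a crude stand-in for real segmentation.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# The twenty PASCAL VOC object classes.
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]


@dataclass
class Detection:
    """Hypothetical container for one detection-task prediction."""
    label: str
    box: Tuple[int, int, int, int]  # assumed (xmin, ymin, xmax, ymax), max exclusive

    def __post_init__(self) -> None:
        if self.label not in VOC_CLASSES:
            raise ValueError(f"unknown VOC class: {self.label}")


def classification_from_detections(dets: List[Detection]) -> Dict[str, bool]:
    """Classification-style output: presence/absence for each of the twenty classes."""
    present = {d.label for d in dets}
    return {name: name in present for name in VOC_CLASSES}


def label_map_from_detections(
    dets: List[Detection], width: int, height: int
) -> List[List[str]]:
    """Segmentation-style output: one label per pixel, "background" otherwise.

    Illustration only: each box is filled with its label, and later
    detections overwrite earlier ones where boxes overlap.
    """
    grid = [["background"] * width for _ in range(height)]
    for d in dets:
        xmin, ymin, xmax, ymax = d.box
        for y in range(max(ymin, 0), min(ymax, height)):
            for x in range(max(xmin, 0), min(xmax, width)):
                grid[y][x] = d.label
    return grid
```

For example, `classification_from_detections([Detection("dog", (1, 1, 3, 3))])` returns a 20-entry dictionary in which only `"dog"` is `True`.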

