§  SIFT [1]

§  PCA-SIFT [2]

§  Affine-SIFT [3]

§  SURF [4]

§  Affine Covariant Features [5]

§  MSER [6]

§  Geometric Blur [7]

§  Local Self-Similarity Descriptor [8]

§  Global and Efficient Self-Similarity [9]

§  Histogram of Oriented Gradients [10]

§  GIST [11]

§  Shape Context [12]

§  Color Descriptor [13]

§  Pyramids of Histograms of Oriented Gradients

§  Space-Time Interest Points (STIP) [14]

§  Boundary Preserving Dense Local Regions [15]

§  Weighted Histogram

§  Histogram-based Interest Points Detectors

§  An OpenCV C++ implementation of Local Self Similarity Descriptors

§  Fast Sparse Representation with Prototypes

§  Corner Detection

§  AGAST Corner Detector: faster than FAST and even FAST-ER

§  Real-time Facial Feature Detection using Conditional Regression Forests

§  Global and Efficient Self-Similarity for Object Classification and Detection

§  WαSH: Weighted α-Shapes for Local Feature Detection

§  HOG

§  Online Selection of Discriminative Tracking Features



§  Normalized Cut [1]

§  Greg Mori’s Superpixel code [2]

§  Efficient Graph-based Image Segmentation [3]

§  Mean-Shift Image Segmentation [4]

§  OWT-UCM Hierarchical Segmentation [5]

§  TurboPixels [6]

§  Quick-Shift [7]

§  SLIC Superpixels [8]

§  Segmentation by Minimum Code Length [9]

§  Biased Normalized Cut [10]

§  Segmentation Tree [11-12]

§  Entropy Rate Superpixel Segmentation [13]

§  Fast Approximate Energy Minimization via Graph Cuts

§  Efficient Planar Graph Cuts with Applications in Computer Vision

§  Isoperimetric Graph Partitioning for Image Segmentation

§  Random Walks for Image Segmentation

§  Blossom V: A new implementation of a minimum cost perfect matching algorithm

§  An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Computer Vision

§  Geodesic Star Convexity for Interactive Image Segmentation

§  Contour Detection and Image Segmentation Resources

§  Biased Normalized Cuts

§  Max-flow/min-cut

§  Chan-Vese Segmentation using Level Set

§  A Toolbox of Level Set Methods

§  Re-initialization Free Level Set Evolution via Reaction Diffusion

§  Improved C-V active contour model

§  A Variational Multiphase Level Set Approach to Simultaneous Segmentation and Bias Correction

§  Level Set Method Research by Chunming Li

§  ClassCut for Unsupervised Class Segmentation

§  SEEDS: Superpixels Extracted via Energy-Driven Sampling



§  A simple object detector with boosting

§  INRIA Object Detection and Localization Toolkit [1]

§  Discriminatively Trained Deformable Part Models [2]

§  Cascade Object Detection with Deformable Part Models [3]

§  Poselet [4]

§  Implicit Shape Model [5]

§  Viola and Jones’s Face Detection [6]

§  Bayesian Modelling of Dynamic Scenes for Object Detection

§  Hand detection using multiple proposals

§  Color Constancy, Intrinsic Images, and Shape Estimation

§  Discriminatively trained deformable part models

§  Gradient Response Maps for Real-Time Detection of Texture-Less Objects: LineMOD

§  Image Processing On Line

§  Robust Optical Flow Estimation

§  Where's Waldo: Matching People in Images of Crowds

§  Scalable Multi-class Object Detection

§  Class-Specific Hough Forests for Object Detection

§  Deformed Lattice Detection In Real-World Images

§  Discriminatively trained deformable part models



§  Itti, Koch, and Niebur’s saliency detection [1]

§  Frequency-tuned salient region detection [2]

§  Saliency detection using maximum symmetric surround [3]

§  Attention via Information Maximization [4]

§  Context-aware saliency detection [5]

§  Graph-based visual saliency [6]

§  Saliency detection: A spectral residual approach. [7]

§  Segmenting salient objects from images and videos. [8]

§  Saliency Using Natural statistics. [9]

§  Discriminant Saliency for Visual Recognition from Cluttered Scenes. [10]

§  Learning to Predict Where Humans Look [11]

§  Global Contrast based Salient Region Detection [12]

§  Bayesian Saliency via Low and Mid Level Cues

§  Top-Down Visual Saliency via Joint CRF and Dictionary Learning

§  Saliency Detection: A Spectral Residual Approach


5. Image Classification, Clustering

§  Pyramid Match [1]

§  Spatial Pyramid Matching [2]

§  Locality-constrained Linear Coding [3]

§  Sparse Coding [4]

§  Texture Classification [5]

§  Multiple Kernels for Image Classification [6]

§  Feature Combination [7]

§  SuperParsing

§  Large Scale Correlation Clustering Optimization

§  Detecting and Sketching the Common

§  Self-Tuning Spectral Clustering

§  User Assisted Separation of Reflections from a Single Image Using a Sparsity Prior

§  Filters for Texture Classification

§  Multiple Kernel Learning for Image Classification

§  SLIC Superpixels



§  A Closed Form Solution to Natural Image Matting

§  Spectral Matting

§  Learning-based Matting



§  A Forest of Sensors - Tracking Adaptive Background Mixture Models

§  Object Tracking via Partial Least Squares Analysis

§  Robust Object Tracking with Online Multiple Instance Learning

§  Online Visual Tracking with Histograms and Articulating Blocks

§  Incremental Learning for Robust Visual Tracking

§  Real-time Compressive Tracking

§  Robust Object Tracking via Sparsity-based Collaborative Model

§  Visual Tracking via Adaptive Structural Local Sparse Appearance Model

§  Online Discriminative Object Tracking with Local Sparse Representation

§  Superpixel Tracking

§  Learning Hierarchical Image Representation with Sparsity, Saliency and Locality

§  Online Multiple Support Instance Tracking

§  Visual Tracking with Online Multiple Instance Learning

§  Object detection and recognition

§  Compressive Sensing Resources

§  Robust Real-Time Visual Tracking using Pixel-Wise Posteriors

§  Tracking-Learning-Detection

§  the HandVu: vision-based hand gesture interface

§  Learning Probabilistic Non-Linear Latent Variable Models for Tracking Complex Activities



§  Kinect toolbox

§  OpenNI

§  zouxy09 CSDN Blog

§  FingerTracker (finger tracking)



§  3D Reconstruction of a Moving Object

§  Shape From Shading Using Linear Approximation

§  Combining Shape from Shading and Stereo Depth Maps

§  Shape from Shading: A Survey

§  A Spatio-Temporal Descriptor based on 3D Gradients (HOG3D)

§  Multi-camera Scene Reconstruction via Graph Cuts

§  A Fast Marching Formulation of Perspective Shape from Shading under Frontal Illumination

§  Reconstruction: 3D Shape, Illumination, Shading, Reflectance, Texture

§  Monocular Tracking of 3D Human Motion with a Coordinated Mixture of Factor Analyzers

§  Learning 3-D Scene Structure from a Single Still Image



§  Matlab class for computing Approximate Nearest Neighbor (ANN) [providing interface to]

§  Random Sampling

§  Probabilistic Latent Semantic Analysis (pLSA)

§  FASTANN and FASTCLUSTER for approximate k-means (AKM)

§  Fast Intersection / Additive Kernel SVMs

§  SVM

§  Ensemble learning

§  Deep Learning

§  Deep Learning Methods for Vision

§  Neural Network for Recognition of Handwritten Digits

§  Training a deep autoencoder or a classifier on MNIST digits

§  THE MNIST DATABASE of handwritten digits

§  Ersatz: deep neural networks in the cloud

§  Deep Learning

§  sparseLM: Sparse Levenberg-Marquardt nonlinear least squares in C/C++

§  Weka 3: Data Mining Software in Java

§  Invited talk "A Tutorial on Deep Learning" by Dr. Kai Yu (余凯)

§  CNN - Convolutional neural network class

§  Yann LeCun's Publications

§  LeNet-5, convolutional neural networks

§  Training a deep autoencoder or a classifier on MNIST digits

§  Deep Learning expert Geoffrey E. Hinton's HomePage

§  Multiple Instance Logistic Discriminant-based Metric Learning (MildML) and Logistic Discriminant-based Metric Learning (LDML)

§  Sparse coding simulation software

§  Visual Recognition and Machine Learning Summer School


11. Object, Action Recognition:

§  Action Recognition by Dense Trajectories

§  Action Recognition Using a Distributed Representation of Pose and Appearance

§  Recognition Using Regions

§  2D Articulated Human Pose Estimation

§  Fast Human Pose Estimation Using Appearance and Motion via Multi-Dimensional Boosting Regression

§  Estimating Human Pose from Occluded Images

§  Quasi-dense wide baseline matching

§  ChaLearn Gesture Challenge: Principal motion: PCA-based reconstruction of motion histograms

§  Real Time Head Pose Estimation with Random Regression Forests

§  2D Action Recognition Serves 3D Human Pose Estimation

§  A Hough Transform-Based Voting Framework for Action Recognition

§  Motion Interchange Patterns for Action Recognition in Unconstrained Videos

§  2D articulated human pose estimation software

§  Learning and detecting shape models

§  Progressive Search Space Reduction for Human Pose Estimation

§  Learning Non-Rigid 3D Shape from 2D Motion



§  Distance Transforms of Sampled Functions

§  The Computer Vision Homepage

§  Efficient appearance distances between windows

§  Image Exploration algorithm

§  Motion Magnification

§  Bilateral Filtering for Gray and Color Images

§  A Fast Approximation of the Bilateral Filter using a Signal Processing Approach



§  EGT: a Toolbox for Multiple View Geometry and Visual Servoing

§  a development kit of matlab mex functions for OpenCV library

§  Fast Artificial Neural Network Library



§  finger-detection-and-gesture-recognition

§  Hand and Finger Detection using JavaCV

§  Hand and fingers detection



§  Nonparametric Scene Parsing via Label Transfer



§  High accuracy optical flow using a theory for warping

§  Dense Trajectories Video Description

§  SIFT Flow: Dense Correspondence across Scenes and its Applications

§  KLT: An Implementation of the Kanade-Lucas-Tomasi Feature Tracker

§  Tracking Cars Using Optical Flow

§  Secrets of optical flow estimation and their principles

§  implementation of the Black and Anandan dense optical flow method

§  Optical Flow Computation

§  Beyond Pixels: Exploring New Representations and Applications for Motion Analysis

§  A Database and Evaluation Methodology for Optical Flow

§  optical flow relative

§  Robust Optical Flow Estimation

§  optical flow



§  Semi-Supervised Distance Metric Learning for Collaborative Image Retrieval


18. Markov Random Fields:

§  Markov Random Fields for Super-Resolution

§  A Comparative Study of Energy Minimization Methods for Markov Random Fields with Smoothness-Based Priors



§  Moving Object Extraction, Using Models or Analysis of Regions

§  Background Subtraction: Experiments and Improvements for ViBe

§  A Self-Organizing Approach to Background Subtraction for Visual Surveillance Applications

§  changedetection.net: A new change detection benchmark dataset

§  ViBe - a powerful technique for background detection and subtraction in video sequences

§  Background Subtraction Program

§  Motion Detection Algorithms

§  Stuttgart Artificial Background Subtraction Dataset

§  Object Detection, Motion Estimation, and Tracking


Feature Detection and Description

General Libraries: 

§   – Implementation of various feature descriptors (including SIFT, HOG, and LBP) and covariant feature detectors (including DoG, Hessian, Harris Laplace, Hessian Laplace, Multiscale Hessian, Multiscale Harris). Easy-to-use Matlab interface. See  – Slides providing a demonstration of VLFeat and also links to other software. Check also 

§   – Various implementations of modern feature detectors and descriptors (SIFT, SURF, FAST, BRIEF, ORB, FREAK, etc.)


Fast Keypoint Detectors for Real-time Applications: 

§   – High-speed corner detector implementation for a wide variety of platforms

§   – Even faster than the FAST corner detector. A multi-scale version of this method is used for the BRISK descriptor (ECCV 2010).


Binary Descriptors for Real-Time Applications: 

§   – C++ code for a fast and accurate interest point descriptor (not invariant to rotations and scale) (ECCV 2010)

§   – OpenCV implementation of the Oriented-Brief (ORB) descriptor (invariant to rotations, but not scale)

§   – Efficient Binary descriptor invariant to rotations and scale. It includes a Matlab mex interface. (ICCV 2011)

§   – Faster than BRISK (invariant to rotations and scale) (CVPR 2012)
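
Binary descriptors such as the ones listed above are usually compared with the Hamming distance (the number of differing bits), which is what makes them attractive for real-time matching. The following is a minimal pure-Python sketch of that idea, not the code of any listed package; the function names `hamming_distance` and `match_descriptors` are hypothetical, and descriptors are assumed to be equal-length `bytes` objects.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the differing bits between two equal-length binary descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    # XOR leaves a 1 wherever the two descriptors disagree.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def match_descriptors(queries, candidates):
    """Brute-force matching: for each query return (best_index, distance)."""
    matches = []
    for q in queries:
        best = min(range(len(candidates)),
                   key=lambda i: hamming_distance(q, candidates[i]))
        matches.append((best, hamming_distance(q, candidates[best])))
    return matches
```

For example, `match_descriptors([b"\x0f"], [b"\xf0", b"\x0e"])` pairs the query with the second candidate, which differs in a single bit. Real libraries perform the same comparison with hardware popcount instructions rather than Python loops.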


SIFT and SURF Implementations: 

§  SIFT: , ,  by David Lowe, , 

§  SURF: , , 


Other Local Feature Detectors and Descriptors: 

§   – Oxford code for various affine covariant feature detectors and descriptors.

§   – Source code for the Local Intensity Order Pattern (LIOP) descriptor (ICCV 2011).

§   – Source code for matching of local symmetry features under large variations in lighting, age, and rendering style (CVPR 2012).


Global Image Descriptors: 

§   – Matlab code for the GIST descriptor

§   – Global visual descriptor for scene categorization and object detection (PAMI 2011)


Feature Coding and Pooling 

§   – Source code for various state-of-the-art feature encoding methods – including Standard hard encoding, Kernel codebook encoding, Locality-constrained linear encoding, and Fisher kernel encoding.

§   – Source code for feature pooling based on spatial pyramid matching (widely used for image classification)
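
To make "standard hard encoding" concrete: each local descriptor is assigned to its nearest codeword, and the image is represented by the histogram of those assignments. The sketch below is an illustration under assumptions, not the listed package's implementation; `hard_encode` is a hypothetical name, descriptors and codewords are plain tuples of numbers, and the histogram is L1-normalised.

```python
def hard_encode(descriptors, codebook):
    """Return an L1-normalised histogram of nearest-codeword assignments."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Squared Euclidean distance from this descriptor to every codeword.
        dists = [sum((x - c) ** 2 for x, c in zip(d, word)) for word in codebook]
        # Hard assignment: only the single closest codeword gets a vote.
        counts[dists.index(min(dists))] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]
```

Spatial pyramid pooling would apply the same encoding separately to the descriptors falling in each cell of a coarse-to-fine image grid and concatenate the resulting histograms.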


Convolutional Nets and Deep Learning 

§   – C++ Library for Energy-Based Learning. It includes several demos and step-by-step instructions to train classifiers based on convolutional neural networks.

§   – Provides a matlab-like environment for state-of-the-art machine learning algorithms, including a fast implementation of convolutional neural networks.

§   – Various links for deep learning software.


Part-Based Models 

§   – Library provided by the authors of the original paper (state-of-the-art in PASCAL VOC detection task)

§   – Branch-and-Bound implementation for a deformable part-based detector.

§   – Efficient implementation of a method that achieves the exact same performance as deformable part-based detectors but with significant acceleration (ECCV 2012).

§   – Fast approach for deformable object detection (CVPR 2011).

§   – C++ and Matlab versions for object detection based on poselets.

§   – Implementation of a unified approach for face detection, pose estimation, and landmark localization (CVPR 2012).


Attributes and Semantic Features 

§   – Modified implementation of RankSVM to train Relative Attributes (ICCV 2011).

§   – Implementation of object bank semantic features (NIPS 2010). See also 

§   – Software for extracting high-level image descriptors (ECCV 2010, NIPS 2011, CVPR 2012).


Large-Scale Learning 

§   – Source code for fast additive kernel SVM classifiers (PAMI 2013).

§   – Library for large-scale linear SVM classification.

§   – Implementation for Pegasos SVM and Homogeneous Kernel map.


Fast Indexing and Image Retrieval 

§   – Library for performing fast approximate nearest neighbor search.

§   – Source code for Kernelized Locality-Sensitive Hashing (ICCV 2009).

§   – Code for generation of small binary codes using Iterative Quantization and other baselines such as Locality-Sensitive-Hashing (CVPR 2011).

§   – Efficient code for state-of-the-art large-scale image retrieval (CVPR 2011).


Object Detection 

§  See  and  above.

§   – Very fast and accurate pedestrian detector (CVPR 2012).

§   – Excellent resource for pedestrian detection, with various links for state-of-the-art implementations.

§   – Enhanced implementation of Viola&Jones real-time object detector, with trained models for face detection.

§   – Source code for branch-and-bound optimization for efficient object localization (CVPR 2008).


3D Recognition 

§   – Library for 3D image and point cloud processing.


Action Recognition 

§   – Source code for action recognition based on the ActionBank representation (CVPR 2012).

§   – Software for computing space-time interest point descriptors

§   – Look for Stacked ISA for Videos (CVPR 2011)

§   – C++ code for activity recognition using the velocity histories of tracked keypoints (ICCV 2009)





§   – 30,475 images of 50 animal classes with 6 pre-extracted feature representations for each image.

§   – Attribute annotations for images collected from Yahoo and Pascal VOC 2008.

§   – 15,000 faces annotated with 10 attributes and fiducial points.

§   – 58,797 face images of 200 people with 73 attribute classifier outputs.

§   – 13,233 face images of 5,749 people with 73 attribute classifier outputs.

§   – 8,000 people with annotated attributes. Check also this  for another dataset of human attributes.

§   – Large-scale scene attribute database with a taxonomy of 102 attributes.

§   – Variety of attribute labels for the ImageNet dataset.

§   – Data for OSR and a subset of PubFig datasets. Check also this  for the WhittleSearch data.

§   – Images of shopping categories associated with textual descriptions.


Fine-grained Visual Categorization 

§   – Hundreds of bird categories with annotated parts and attributes.

§   – 20,000 images of 120 breeds of dogs from around the world.

§   – 37 category pet dataset with roughly 200 images for each class. Pixel level trimap segmentation is included.

§   – 832 images of 10 species of butterflies.

§   – Hundreds of flower categories.


Face Detection 

§   – UMass face detection dataset and benchmark (5,000+ faces)

§   – Classical face detection dataset.


Face Recognition 

§   – Large collection of face recognition datasets.

§   – UMass unconstrained face recognition dataset (13,000+ face images).

§   – Includes face recognition grand challenge (FRGC), vendor tests (FRVT) and others.

§   – Contains more than 750,000 images of 337 people, with 15 different views and 19 lighting conditions.

§   – Classical face recognition dataset.

§   – Easy to use if you want to play with simple face datasets including Yale, ORL, PIE, and Extended Yale B.

§   – Low-resolution face dataset captured from surveillance cameras.


Handwritten Digits 

§   – Large dataset containing a training set of 60,000 examples, and a test set of 10,000 examples.


Pedestrian Detection

§   – 10 hours of video taken from a vehicle, 350K bounding boxes for about 2.3K unique pedestrians.

§   – Currently one of the most popular pedestrian detection datasets.

§   – Urban dataset captured from a stereo rig mounted on a stroller.

§   – Dataset with image pairs recorded in a crowded urban setting with an onboard camera.

§   – One of 20 categories in PASCAL VOC detection challenges.

§   – Small dataset captured from surveillance cameras.


Generic Object Recognition 

§   – Currently the largest visual recognition dataset in terms of number of categories and images.

§   – 80 million 32x32 low resolution images.

§   – One of the most influential visual recognition datasets.

§   /  – Popular image datasets containing 101 and 256 object categories, respectively.

§   – Online annotation tool for building computer vision databases.


Scene Recognition

§   – MIT scene understanding dataset.

§   – Dataset of 15 natural scene categories.


Feature Detection and Description 

§   – Widely used dataset for measuring performance of feature detection and description. Check  for an evaluation framework.


Action Recognition

§   – CVPR 2012 tutorial covering various datasets for action recognition.


RGBD Recognition 

§   – Dataset containing 300 common household objects


