A comparison of Fisher vectors and Gaussian supervectors for document versus non-document image classification

David C. Smith, Keri A. Kornelson

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

7 Scopus citations


This research addresses the document vs. non-document image classification problem. Selecting the images that contain text from an OCR processing stream that also includes images of scenes, people, faces, etc., eliminates unnecessary computation and frees up valuable computer resources for other tasks. This is particularly true for high-volume OCR systems. Fisher vectors represent images as gradients of a global generative Gaussian mixture model (GMM) of low-level image descriptors, and exhibit state-of-the-art performance for object categorization. Gaussian supervectors represent images by soft clustering low-level image descriptors according to posterior GMM mixture probabilities, optionally using MAP adaptation, and have demonstrated state-of-the-art performance for scene categorization. We compare results obtained by applying linear SVMs to Fisher vector and Gaussian supervector representations to categorize images as having only text, no text, or a mixture of text and non-text. We also report the performance of GMM-based soft versions of vectors of locally aggregated descriptors (VLAD) and bag of visual words (BOV).
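The four encodings named in the abstract all start from the same quantity: the posterior probability of each GMM component given each local descriptor. The following is a minimal NumPy sketch of how they can be derived from those posteriors, not the authors' implementation. It assumes a diagonal-covariance GMM that has already been fitted; the function names `posteriors` and `encode`, the `relevance` parameter, and the choice to keep only the mean-gradient part of the Fisher vector are illustrative assumptions, as is the textbook form of MAP adaptation used for the supervector.

```python
import numpy as np


def posteriors(X, w, mu, var):
    """Posterior probability of each GMM component for each descriptor.

    X: (N, D) descriptors; w: (K,) mixture weights;
    mu: (K, D) means; var: (K, D) diagonal variances.
    """
    diff = X[:, None, :] - mu[None, :, :]                      # (N, K, D)
    # Log-density of every descriptor under every diagonal Gaussian.
    log_pdf = -0.5 * (np.sum(diff ** 2 / var, axis=2)
                      + np.sum(np.log(2.0 * np.pi * var), axis=1))
    log_post = np.log(w) + log_pdf                             # (N, K)
    log_post -= log_post.max(axis=1, keepdims=True)            # stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)


def encode(X, w, mu, var, relevance=16.0):
    """Return soft BOV, soft VLAD, Fisher (mean part) and supervector.

    `relevance` is an assumed MAP relevance factor, not a value
    taken from the paper.
    """
    n = X.shape[0]
    g = posteriors(X, w, mu, var)            # (N, K) soft assignments
    n_k = g.sum(axis=0)                      # (K,) soft counts
    first = g.T @ X                          # (K, D) weighted sums

    # Soft BOV: average posterior mass per component.
    soft_bov = n_k / n
    # Soft VLAD: posterior-weighted residuals to each mean.
    soft_vlad = first - n_k[:, None] * mu
    # Fisher vector, gradient with respect to the means only.
    fisher_mu = soft_vlad / (np.sqrt(var) * np.sqrt(w)[:, None] * n)
    # Gaussian supervector: MAP-adapted means, stacked
    # (any further normalization is omitted in this sketch).
    alpha = (n_k / (n_k + relevance))[:, None]
    ml_mean = first / np.maximum(n_k, 1e-12)[:, None]
    supervector = alpha * ml_mean + (1.0 - alpha) * mu

    return {
        "soft_bov": soft_bov,
        "soft_vlad": soft_vlad.ravel(),
        "fisher_mu": fisher_mu.ravel(),
        "supervector": supervector.ravel(),
    }


if __name__ == "__main__":
    # Synthetic stand-ins for local descriptors and a fitted GMM.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 8))
    w = np.full(4, 0.25)
    mu = rng.normal(size=(4, 8))
    var = np.ones((4, 8))
    for name, vec in encode(X, w, mu, var).items():
        print(name, vec.shape)
```

Each image yields one fixed-length vector per encoding, which is what makes a linear classifier such as the linear SVM mentioned above applicable regardless of how many descriptors the image produced.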

Original language: English
Title of host publication: Applications of Digital Image Processing XXXVI
State: Published - 2013
Externally published: Yes
Event: Applications of Digital Image Processing XXXVI - San Diego, CA, United States
Duration: Aug 26 2013 → Aug 29 2013

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
ISSN (Print): 0277-786X
ISSN (Electronic): 1996-756X


Conference: Applications of Digital Image Processing XXXVI
Country/Territory: United States
City: San Diego, CA


  • Fisher vector
  • GMM
  • Gaussian supervector
  • SURF descriptor
  • SVM
  • dimension reduction
  • document image classification
  • random projection
  • soft BOV
  • soft VLAD

