The tail end of storm clouds over Santa Barbara, at the end of a rare storm in paradise - David Boardman 2003
 
TECHNOLOGIES
. . . Image search and categorization based on visual clues
 
       

Our Technology

 

"A Major Breakthrough in Image Retrieval."

Technical Panel
International Conference on
Multimedia, Switzerland, 2002


"One of the best image search engines that I have ever seen. VIMA's use of a multi-modal approach surpasses those state-of-the-art techniques, and yields substantially superior accuracy. It's really a great stride toward matching human perception."


Dr. Chung-Sheng Li
Senior Manager
T.J. Watson Research Center
IBM Corporation

Our Technology

VIMA Technologies is the inventor of the patented PBIR - Perception-Based Image Retrieval. PBIR combines machine learning (i.e., data mining) with concepts from cognitive psychology to analyze visual images - including video and other multimedia - the way a person would. As a result, our systems can find what a user is looking for - whether to retrieve, filter, analyze or organize - faster and more accurately than any other visual image technology on the market.

Developed by leaders in the database and information technology fields, VIMA's PBIR represents a substantial core technology lead in the market for multimedia filtering, retrieval and organization solutions. Several patents, issued or pending, cover different aspects of PBIR, and our executives and advisors are well known for their cutting-edge research in the field.

How Does it Work?
PBIR is composed of several key components:

  • Multi-Resolution Feature Extractor
  • Query-Concept Learner
  • Perceptual Distance Function Engine
  • High Dimensional Indexer
  • Multi-Modal Weighting

The Multi-Resolution Feature Extractor extracts and organizes image features based on two perceptual principles. First, it structures image features into a multi-level hierarchy (coarse, medium, fine). This characterization not only allows the system to flexibly select features that are appropriate for a particular visual task, but also speeds up query-concept learning exponentially. Second, the extractor enables high-level query concepts to be cleanly mapped to low-level features, which in turn allows query-concept learning to be performed accurately.
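
As a rough illustration of the coarse/medium/fine hierarchy, the sketch below groups features by resolution level for a tiny grayscale grid. It is not VIMA's extractor: the block-averaging scheme, the function names (`downsample`, `multi_resolution_features`) and the downsampling factors are all assumptions chosen for brevity.

```python
def downsample(grid, factor):
    """Average non-overlapping factor x factor blocks of a 2-D grid of numbers."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows, factor):
        row = []
        for c in range(0, cols, factor):
            block = [grid[i][j]
                     for i in range(r, min(r + factor, rows))
                     for j in range(c, min(c + factor, cols))]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def multi_resolution_features(grid):
    """Return feature lists keyed by level (hypothetical factors 4, 2 and 1)."""
    return {
        "coarse": [v for row in downsample(grid, 4) for v in row],
        "medium": [v for row in downsample(grid, 2) for v in row],
        "fine": [v for row in grid for v in row],
    }


# A 4 x 4 "image": one coarse value, four medium values, sixteen fine values.
image = [[0, 0, 8, 8],
         [0, 0, 8, 8],
         [4, 4, 12, 12],
         [4, 4, 12, 12]]
features = multi_resolution_features(image)
```

A learner could start from the handful of coarse features and move to finer levels only when the coarse ones cannot separate the images the user wants from the rest.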

The Query-Concept Learner learns a subjective user concept quickly, in a small number of iterations. This technological breakthrough is made possible by three revolutionary active learning algorithms, which select the most informative samples on which to seek user feedback. Even without user profiling or prior knowledge about a query, the learning algorithms can capture the query concept in three to four rounds of relevance feedback from the user, in a matter of seconds. For example, if a user first selects an image of a yellow daisy, VIMA's PBIR will learn through further user selections whether the subjective concept was yellow flowers, flowers of any color, yellow objects, radially symmetrical objects, and so on.
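
The general idea of asking the user about the most informative samples can be sketched with a generic technique, uncertainty sampling over a nearest-centroid model. This is a stand-in, not one of VIMA's three algorithms; the names `most_informative` and `feedback_loop`, the `oracle` callback (standing in for the user) and the default round count are assumptions.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))


def margin(x, pos_center, neg_center):
    """Positive when x is closer to the relevant centroid than the irrelevant one."""
    return math.dist(x, neg_center) - math.dist(x, pos_center)


def most_informative(unlabeled, positives, negatives, k=2):
    """Pick the k samples the current model is least sure about."""
    pos_c, neg_c = centroid(positives), centroid(negatives)
    return sorted(unlabeled, key=lambda x: abs(margin(x, pos_c, neg_c)))[:k]


def feedback_loop(pool, oracle, seed_pos, seed_neg, rounds=3, k=2):
    """Run a few feedback rounds and return a relevant/irrelevant classifier."""
    positives, negatives = [seed_pos], [seed_neg]
    unlabeled = [x for x in pool if x not in (seed_pos, seed_neg)]
    for _ in range(rounds):
        if not unlabeled:
            break
        for x in most_informative(unlabeled, positives, negatives, k):
            # The oracle plays the user marking a sample relevant or not.
            (positives if oracle(x) else negatives).append(x)
            unlabeled.remove(x)
    pos_c, neg_c = centroid(positives), centroid(negatives)
    return lambda x: margin(x, pos_c, neg_c) > 0


# Toy 1-D "feature space": the hidden concept is "value of at least 0.5".
pool = [(0.0,), (0.1,), (0.4,), (0.6,), (0.9,), (1.0,)]
classify = feedback_loop(pool, lambda x: x[0] >= 0.5, (1.0,), (0.0,))
```

In the toy run the first round asks about 0.4 and 0.6, the two samples nearest the boundary, which is where a label tells the model the most.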

The Perceptual Distance Function measures visual-data similarity based on how human perception works. VIMA invented the Dynamic Partial Function (DPF), which selects a subset of perceptual features to measure similarity between two objects (e.g., images and video clips). The selected feature subset is determined by the pair of objects being compared. In other words, DPF formulates the distance function differently for different pairs of objects. VIMA's efficient implementation of DPF gives it a significant edge in search accuracy over other technology.
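
One way to read "the feature subset is determined by the pair" is that, for each pair, only the features on which the two objects differ least contribute to the distance. The sketch below follows that reading. The parameter names `m` (number of features kept) and `r` (Minkowski exponent) and the exact formula are assumptions for illustration, not a statement of VIMA's implementation.

```python
def dpf_distance(x, y, m, r=2.0):
    """Distance over the m smallest per-feature differences of this pair.

    With m == len(x) this reduces to the ordinary Minkowski distance.
    """
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    if not 1 <= m <= len(x):
        raise ValueError("m must be between 1 and the number of features")
    diffs = sorted(abs(a - b) for a, b in zip(x, y))
    # Only the m most similar features are kept; which ones they are
    # depends on the pair being compared.
    return sum(d ** r for d in diffs[:m]) ** (1.0 / r)


a = (0.0, 0.0, 0.0, 0.0)
b = (3.0, 4.0, 100.0, 0.0)
partial = dpf_distance(a, b, m=3)   # ignores the one wildly different feature
full = dpf_distance(a, b, m=4)      # plain Euclidean distance
```

Here a single outlying feature dominates the full distance but is dropped from the partial one, so two images that agree on most features are still judged similar.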

The High Dimensional Indexer treats indexing as a classification problem and uses statistical approaches to conduct searches efficiently. The storage requirements for these indexes are quite small compared to the images themselves (typically around 600 bytes per image), and the structure is scalable and supports fast searching across databases of millions of images.
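
Treating indexing as classification can be pictured as assigning every feature vector to a cluster and searching only the cluster that the query falls into. The sketch below is a generic illustration of that idea, not VIMA's indexer: the cluster centers are supplied by hand, and the names `nearest`, `build_index` and `search` are assumptions.

```python
import math


def nearest(centers, x):
    """Classify x: return the index of the closest cluster center."""
    return min(range(len(centers)), key=lambda i: math.dist(centers[i], x))


def build_index(vectors, centers):
    """Group image ids into one bucket per cluster."""
    buckets = {i: [] for i in range(len(centers))}
    for image_id, vec in vectors.items():
        buckets[nearest(centers, vec)].append(image_id)
    return buckets


def search(query, vectors, centers, buckets, k=3):
    """Rank only the images in the query's bucket (an approximate search)."""
    candidates = buckets[nearest(centers, query)]
    return sorted(candidates, key=lambda i: math.dist(vectors[i], query))[:k]


vectors = {"a": (0.0, 0.0), "b": (0.1, 0.2), "c": (5.0, 5.0), "d": (5.2, 4.9)}
centers = [(0.0, 0.0), (5.0, 5.0)]   # hypothetical, normally learned from data
buckets = build_index(vectors, centers)
hits = search((4.9, 5.1), vectors, centers, buckets)
```

The trade-off in a scheme like this is that a true neighbor sitting just across a cluster boundary can be missed. On storage, at about 600 bytes per image an index over one million images comes to roughly 600 MB.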

Multi-Modal Weighting allows VIMA's PBIR system to incorporate complex image features, textual hints, and "learning" algorithms all at once - and weight each in relation to the others - to increase accuracy and better discern the subjective concept of the search target.
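
Weighting each source in relation to the others can be illustrated with a simple weighted average of per-modality scores. The modality names, the weights and the function `fuse_scores` below are hypothetical; the source does not say how PBIR actually combines its signals.

```python
def fuse_scores(scores, weights):
    """Weighted average of per-modality relevance scores in [0, 1]."""
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * s for name, s in scores.items()) / total


# Hypothetical scores for one candidate image, with visual evidence
# weighted twice as heavily as the other two sources.
scores = {"visual": 0.8, "text": 0.4, "learned": 0.6}
weights = {"visual": 2.0, "text": 1.0, "learned": 1.0}
combined = fuse_scores(scores, weights)
```

Dividing by the total weight keeps the combined score on the same scale as its inputs, so changing one weight shifts the balance without inflating the result.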

Each of these pieces is individually patented or patent-pending, and all work together to provide users with the quickest, most efficient, most accurate results possible. This ICIP (IEEE International Conference on Image Processing) invited paper provides an in-depth overview of VIMA's revolutionary technology. (The paper provides references to more than fifteen papers, authored by VIMA founders, published in the most prestigious conferences and journals.)

What Makes it Better than Other Systems?
Currently, almost all visual image products rely on either text-based analysis or content-based analysis to assess images. Both methods, however, have serious limitations. Text-based systems use keywords to identify images, which involves a subjective process of naming and labeling (and also limits usage to one language). Content analysis usually uses only four or so parameters (such as color, shape, texture, and object), which severely reduces its accuracy in matching images. For pornographic-image detection specifically, content matching relies heavily on skin tone, which makes it nearly impossible to differentiate between art, pornography and medical photos, for example.

VIMA's PBIR uses perception-based analysis - our patented process - to break an image down into more than 150 parameters, much as the human brain does. Users select the images - either provided by the system or by the users themselves - that best suit their preferences, and the technology uses their positive AND negative feedback to learn what they are looking for. By analyzing the characteristics of both selected and unselected images, PBIR infers the user's intention. This ability to be taught user preferences is exclusive to VIMA's products, as is the technology's capacity to distinguish between nude pornography and nude artwork.

What's more, PBIR-based modules are easily integrated into existing databases, web crawlers and text-based search engines across almost all platforms, using standard interfaces. The technology has also been proven to scale to very large image sets in high-throughput configurations.

Where Can it Be Used?
The applications for VIMA's technology are extremely broad. Any industry, company or application that uses visual images of any sort - scans, x-rays, illustrations, photos, etc. - can benefit from integration with PBIR. Fast and accurate visual image retrieval and management promise to bring a new kind of interaction and control to customer-facing businesses.

Some of the current applications of PBIR include:

  • Image search engines
  • Near-replica copy detectors
  • Objectionable image filters (e.g., pornography)
  • Automatic image annotation and classification processing
  • Video editing and storage programs
  • Security systems
  • Facial recognition software