Our Technology
"A Major Breakthrough in Image Retrieval."
Technical Panel
International Conference on Multimedia, Switzerland, 2002
"One of the best image search engines that I have ever seen.
VIMA's use of a multi-modal approach surpasses those state-of-the-art
techniques, and yields substantially superior accuracy. It's
really a great stride toward matching human perception."
Dr. Chung-Sheng Li
Senior Manager
T.J. Watson Research Center
IBM Corporation
Our Technology
VIMA Technologies is the inventor of the patented Perception-Based
Image Retrieval (PBIR) technology. PBIR combines machine learning
(including data mining) with concepts from cognitive psychology to
analyze visual images - including video and other multimedia - the
way a person would. As a result, our systems can find what a user is
looking for - whether to retrieve, filter, analyze or organize - faster
and more accurately than any other visual image technology on
the market.
Developed by leaders in the database and information technology
fields, VIMA's PBIR represents a substantial core technology lead
in the market for multimedia filtering, retrieval and organization
solutions. Several patents, issued or pending, cover different aspects
of PBIR, and our executives and advisors are well known for their
cutting-edge research in the field.
How Does it Work?
PBIR is composed of several key components:
- Multi-Resolution Feature Extractor
- Query-Concept Learner
- Perceptual Distance Function Engine
- High Dimensional Indexer
- Multi-Modal Weighting
The Multi-Resolution Feature Extractor extracts and organizes
image features based on two perceptual principles. First, it structures
image features into a multi-level hierarchy (coarse, medium,
fine). This characterization not only allows the system to flexibly
select features that are appropriate for a particular visual task,
but also speeds up query-concept learning exponentially. Second,
the extractor enables high-level query concepts to be cleanly
mapped to low-level features, which in turn allows query-concept
learning to be performed accurately.
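
To make the coarse/medium/fine hierarchy concrete, here is a minimal sketch. It assumes a grayscale image stored as a list of pixel rows and uses mean intensity per grid cell as a stand-in feature. The function names and the 1x1, 2x2 and 4x4 grid sizes are hypothetical choices for illustration, not VIMA's actual extractor.

```python
def grid_means(image, cells):
    """Mean intensity of each cell in a cells x cells grid laid over the image.

    Assumes the image is at least `cells` pixels wide and high, so that
    no grid cell is empty.
    """
    height, width = len(image), len(image[0])
    means = []
    for gy in range(cells):
        for gx in range(cells):
            y0, y1 = gy * height // cells, (gy + 1) * height // cells
            x0, x1 = gx * width // cells, (gx + 1) * width // cells
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            means.append(sum(block) / len(block))
    return means


def multi_resolution_features(image):
    """Group features into three levels, from coarse to fine (illustrative only)."""
    return {
        "coarse": grid_means(image, 1),   # 1 value: whole-image summary
        "medium": grid_means(image, 2),   # 4 values: one per quadrant
        "fine": grid_means(image, 4),     # 16 values: finer spatial detail
    }


# A synthetic 8x8 gradient image, used only to exercise the functions.
image = [[(x + y) * 16 for x in range(8)] for y in range(8)]
features = multi_resolution_features(image)
for level, values in features.items():
    print(level, len(values))
```

A learner could start from the handful of coarse values and bring in the medium and fine levels only when a task needs them, which is the kind of flexibility the paragraph above describes.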
The Query-Concept Learner learns a subjective user concept
quickly, with a small number of iterations. This technological
breakthrough is made possible by three revolutionary active learning
algorithms, which select the most informative samples on which to seek
user feedback. Even without user profiling or prior knowledge about
a query, the learning algorithms can capture the query concept
in three to four rounds of relevance feedback from the user, in
a matter of seconds. For example, if a user first selects
an image of a yellow daisy, VIMA's PBIR will learn through further
user selections whether the subjective concept was yellow flowers, flowers
of any color, yellow objects, radially symmetrical objects, and
so on.
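
The daisy example can be sketched as a tiny active-learning loop. This is not one of VIMA's three algorithms, which are not described here. It is a simple hypothesis-elimination scheme, with made-up image tags and concept names, that shows why asking about the most informative sample narrows the concept in few rounds.

```python
# Hypothetical candidate concepts, each a predicate over a tagged image.
CONCEPTS = {
    "yellow flowers": lambda im: im["yellow"] and im["flower"],
    "flowers of any color": lambda im: im["flower"],
    "yellow objects": lambda im: im["yellow"],
    "radially symmetrical objects": lambda im: im["radial"],
}

# Hypothetical image pool with hand-assigned tags (assumed for illustration).
POOL = [
    {"name": "red rose", "yellow": False, "flower": True, "radial": False},
    {"name": "banana", "yellow": True, "flower": False, "radial": False},
    {"name": "bicycle wheel", "yellow": False, "flower": False, "radial": True},
    {"name": "yellow tulip", "yellow": True, "flower": True, "radial": False},
    {"name": "white daisy", "yellow": False, "flower": True, "radial": True},
]


def most_informative(pool, candidates):
    """Pick the image whose label splits the surviving concepts most evenly."""
    def imbalance(im):
        yes = sum(1 for name in candidates if CONCEPTS[name](im))
        return abs(2 * yes - len(candidates))
    return min(pool, key=imbalance)


def learn_concept(pool, oracle, max_rounds=4):
    """Ask the user (oracle) about informative images until one concept remains."""
    candidates = set(CONCEPTS)
    pool = list(pool)
    rounds = 0
    while len(candidates) > 1 and pool and rounds < max_rounds:
        sample = most_informative(pool, candidates)
        pool.remove(sample)
        relevant = oracle(sample)  # the user's yes/no relevance feedback
        # Keep only the concepts that agree with the user's answer.
        candidates = {n for n in candidates if CONCEPTS[n](sample) == relevant}
        rounds += 1
    return candidates, rounds


# Simulate a user whose unstated concept is "yellow flowers".
user_wants = CONCEPTS["yellow flowers"]
found, rounds = learn_concept(POOL, user_wants)
print(found, "after", rounds, "rounds")
```

Each question is chosen so that either answer rules out about half of the remaining concepts, so the number of feedback rounds grows slowly with the number of candidate concepts.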
The Perceptual Distance Function Engine measures visual-data similarity
based on how human perception works. VIMA invented the Dynamic
Partial Function (DPF), which selects a subset of perceptual
features to measure similarity between two objects (e.g., images
and video clips). The selected feature subset is determined by
the pair of objects to be compared. In other words, DPF formulates
distance functions differently for different pairs of objects.
VIMA's efficient implementation of DPF gives it a significant
edge over other technologies in search accuracy.
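
One way to read this description is sketched below: compute the per-feature differences for a given pair, keep only the m smallest, and aggregate them. Keeping the m smallest differences, the parameters m and r, and the name dpf_distance are assumptions made for illustration. The production implementation is not shown here.

```python
def dpf_distance(x, y, m, r=2.0):
    """Distance over only the m most similar features of this particular pair.

    Because the m smallest differences are picked per pair, different pairs
    of objects are effectively compared on different feature subsets.
    """
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    if not 1 <= m <= len(x):
        raise ValueError("m must be between 1 and the number of features")
    diffs = sorted(abs(a - b) for a, b in zip(x, y))
    return sum(d ** r for d in diffs[:m]) ** (1.0 / r)


a = [0.0, 0.0, 0.0, 0.0]
b = [3.0, 4.0, 100.0, 0.0]

# With m equal to the full feature count this is the ordinary Euclidean
# distance, dominated by the single outlying feature.
print(dpf_distance(a, b, m=4))

# With m=3 the outlying feature is ignored for this pair.
print(dpf_distance(a, b, m=3))
```

The point of the partial sum is that two images which agree on most features are not pushed apart by one or two features that differ wildly.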
The High Dimensional Indexer treats indexing as a classification
problem and uses statistical approaches to conduct searches efficiently. The
storage requirements for these indexes are quite small compared
to the images themselves (typically in the range of 600 bytes/image),
and the structure is scalable and supports fast searching across
databases of millions of images.
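
As a rough illustration of treating indexing as a classification problem, the sketch below assigns each feature vector to its nearest centroid and searches only the bucket the query falls into. The BucketIndex class, the fixed centroids and the two-dimensional vectors are all hypothetical. The actual statistical indexer is not specified in this overview.

```python
import math


def nearest(point, centroids):
    """Classify a point by the index of its closest centroid."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


class BucketIndex:
    """Approximate index: classify first, then scan one bucket only."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.buckets = [[] for _ in centroids]

    def add(self, image_id, vector):
        self.buckets[nearest(vector, self.centroids)].append((image_id, vector))

    def search(self, query, k=1):
        # Only the query's own bucket is scanned, so neighbors that fall in
        # other buckets can be missed; that is the price of the speed-up.
        bucket = self.buckets[nearest(query, self.centroids)]
        ranked = sorted(bucket, key=lambda item: math.dist(query, item[1]))
        return [image_id for image_id, _ in ranked[:k]]


index = BucketIndex(centroids=[(0.0, 0.0), (10.0, 10.0)])
index.add("a", (1.0, 1.0))
index.add("b", (0.0, 2.0))
index.add("c", (9.0, 9.0))
index.add("d", (11.0, 10.0))
print(index.search((10.0, 9.0), k=2))
```

On the storage side, the quoted figure of about 600 bytes per image works out to roughly 600 MB of index data for one million images (600 x 1,000,000 bytes).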
Multi-Modal Weighting allows VIMA's PBIR system to incorporate
complex image features, textual hints, and "learning" algorithms
all at once - and weight each in relation to the others - to increase
accuracy and better discern the subjective concept of the search
target.
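
Weighting several modalities against each other can be illustrated with a normalized weighted sum. The modality names, scores and weights below are invented for the example, and a weighted average is only one plausible combination rule, not necessarily the one PBIR uses.

```python
def combine_scores(scores, weights):
    """Weighted average of per-modality relevance scores (each in [0, 1])."""
    total = sum(weights[m] for m in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[m] * scores[m] for m in scores) / total


# Hypothetical scores for one candidate image from three evidence sources.
scores = {"visual": 0.8, "text": 0.5, "learned": 0.9}

# Hypothetical weights: visual evidence counts twice as much as the others.
weights = {"visual": 2.0, "text": 1.0, "learned": 1.0}

print(combine_scores(scores, weights))
```

Raising the weight of one modality pulls the combined score toward that modality's opinion, which is how the relative importance of image features, text hints and learned preferences could be tuned.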
Each of these pieces is individually patented or patent-pending,
and all work together to provide users with the quickest, most
efficient, most accurate results possible. This ICIP (IEEE International
Conference on Image Processing) invited
paper provides an in-depth overview of VIMA's revolutionary
technology. (The paper provides references to more than fifteen
papers, authored by VIMA founders, published in prestigious
conferences and journals.)
What Makes it Better than Other Systems?
Currently almost all visual image products rely on either
text-based analysis or content-based analysis to assess images.
Both methods, however, have serious limitations. Text-based systems
use keywords to identify images, which involves a subjective process
of naming and labeling (and also limits usage to one language).
Content analysis usually uses only four or so parameters (such
as color, shape, texture, and object), which severely reduces its
accuracy in matching images. Specifically for porn-image detection,
content matching relies on skin tone to a great degree, which makes
it nearly impossible to differentiate between art, pornography
and medical photos, for example.
VIMA's PBIR uses perception-based analysis - our patented process
- to break an image down into more than 150 parameters, much as
the human brain does. Users select the images - either provided
by the system or by the users themselves - that best suit their
preferences and the technology uses their positive AND negative feedback
to actually learn what it is they are looking for. By analyzing
the characteristics of both selected and unselected images, PBIR
infers the user's intention. This ability to be taught user preferences
is exclusive to VIMA's products, as is the technology's capacity
to distinguish between nude pornography and nude artwork.
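
Learning from both selected and unselected images can be illustrated with the classic Rocchio relevance-feedback update, swapped in here because it is simple and well known. It is not VIMA's method. The vectors and the alpha, beta and gamma values are assumptions chosen for the example.

```python
def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def rocchio_update(query, selected, unselected, alpha=1.0, beta=0.75, gamma=0.25):
    """Move the query toward selected images and away from unselected ones."""
    zero = [0.0] * len(query)
    pos = mean_vector(selected) if selected else zero
    neg = mean_vector(unselected) if unselected else zero
    return [alpha * q + beta * p - gamma * n for q, p, n in zip(query, pos, neg)]


# Hypothetical 2-feature vectors: the initial query, two images the user
# selected (positive feedback) and one the user passed over (negative).
query = [1.0, 0.0]
selected = [[1.0, 1.0], [1.0, 3.0]]
unselected = [[0.0, 4.0]]

print(rocchio_update(query, selected, unselected))
```

The negative term is what lets unselected images carry information: features that are strong in rejected images are pushed down in the refined query.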
What's more, PBIR-based modules are easily integrated into existing
databases, web crawlers and text-based search engines across almost
all platforms, using standard interfaces. And the technology
has been proven to be scalable to very large image sets in high
throughput configurations.
Where Can it Be Used?
The applications for VIMA's technology are extremely broad.
Any industry, company or application that uses visual images of
any sort - scans, x-rays, illustrations, photos, etc. - can benefit
from integration with PBIR. Fast and accurate visual image
retrieval and management promises to bring a new kind of interaction
and control to customer-facing businesses.
Some of the current applications of PBIR include:
- Image search engines
- Near-replica copy detectors
- Objectionable image filters (e.g. pornography)
- Automatic image annotation and classification processing
- Video editing and storage programs
- Security systems
- Facial recognition software