Searching Audio and Video by Sound
"An ability to locate the sound you want in a database of such sounds, is clearly of
far-reaching economic value. The Comparisonics breakthroughs in this area are fun to
play with and, more seriously, of very great potential commercial value."
Dr. Harry M. Markowitz, Nobel laureate in Economics, and Computer Science
pioneer
The Problem
One of the most important operations performed by computers is searching. Efficient
methods for the search and retrieval of numeric data and text documents have been the focus
of computer scientists for more than 50 years. In recent years, computers have rapidly
evolved from numeric and text processing to include multimedia, specifically audio, video,
and images. However, few methods exist for searching multimedia.
Audio and video collections have been searchable only through text descriptions. That is,
each audio or video clip is described in words by a human cataloger, who types the
description into the computer. These descriptions are then searched by keyword to locate
clips of interest.
"Woefully insufficient!" says The Hollywood Reporter in its white
paper on multimedia retrieval. Creating text descriptions for audio and video is a burden
that costs time and money, and the resulting descriptions are of limited value. Sounds are
difficult to describe in words (is it a bang, a crash, or a thud?), and as a result audio
collections have been largely inaccessible.
The Solution
It makes sense to search documents based on the words they contain, and to search audio
based on the sounds it contains. Comparisonics Corporation has invented the
technology that makes this possible. Called sound matching, it lets a computer
automatically compare sounds and determine whether, and to what extent, they are
similar. Given any sound, similar sounds can be located automatically
in audio and video collections. In database terminology, this is a form of
"query by example" and "content-based retrieval."
To search audio or video, the digital audio content is first characterized by an automated
indexing process. Then any sound can be used to find matches in the indexed collection.
The given sound, called the prototype, is compared automatically with each sound in
the collection. The degree of similarity between two sounds is indicated by a
similarity score that ranges from 0 (least similar) to 100 (most similar).
This score permits sounds to be ranked by similarity to the prototype.
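The index-then-search workflow described above can be sketched in a few lines of Python. This is only an illustration of query by example with a 0 to 100 score: the Comparisonics characterization and scoring methods are not described here, so the sketch substitutes a simple stand-in (an amplitude envelope compared by cosine similarity). The names `characterize`, `similarity_score`, `search`, and `tone` are hypothetical.

```python
import math

def characterize(samples, slices=8):
    """Hypothetical stand-in for indexing: mean absolute amplitude per time slice."""
    step = max(1, len(samples) // slices)
    return [sum(abs(x) for x in samples[i * step:(i + 1) * step]) / step
            for i in range(slices)]

def similarity_score(a, b):
    """Cosine similarity of two feature vectors, scaled to 0 (least) .. 100 (most)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 0  # assumption: a silent clip is treated as least similar
    return round(100 * min(1.0, dot / norm))

def search(prototype, index):
    """Compare the prototype with every indexed sound; return (score, name), best first."""
    proto = characterize(prototype)
    scored = [(similarity_score(proto, feats), name) for name, feats in index.items()]
    return sorted(scored, reverse=True)

def tone(freq, decay=0.0, n=800, rate=8000):
    """Synthetic test sound: a sine wave with an optional exponential decay."""
    return [math.exp(-decay * i / rate) * math.sin(2 * math.pi * freq * i / rate)
            for i in range(n)]

# Characterize the collection once, then search it with any prototype sound.
collection = {"steady tone": tone(440), "decaying tone": tone(440, decay=40.0)}
index = {name: characterize(samples) for name, samples in collection.items()}

ranked = search(tone(450, decay=40.0), index)
for score, name in ranked:
    print(score, name)
```

Because the collection is characterized once up front, each search only compares short feature vectors, which is what makes ranking a large collection against a prototype fast.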
Recordings can be characterized in about 1% of the time it takes to play them, plus the
time required to read the audio data from storage media. Once characterized, thousands of
hours of audio and video can be searched in seconds! Because characterization runs far
faster than playback, it can keep pace with "real time" audio, so even live material can
be searched, such as ongoing radio and television programs and multimedia streamed over
the Internet.
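As a rough worked example of the 1% figure above, the following sketch estimates characterization time for a collection. The function name and the `read_hours` parameter are hypothetical, and the time to read audio from storage depends on the media, so it is left as an input rather than estimated.

```python
def characterization_hours(playback_hours, percent_of_playback=1, read_hours=0.0):
    """Estimate indexing time: about 1% of playback time plus time to read the data."""
    return playback_hours * percent_of_playback / 100 + read_hours

# A 1,000-hour archive needs roughly 10 hours of processing, before read time.
estimate = characterization_hours(1000)
print(f"{estimate:.1f} hours plus storage read time")
```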
The Comparisonics® sound-matching technology works for all possible sounds,
including sounds from people, animals, machinery, and musical instruments, as well as
noise, electronic tones, and environmental ambiences. It works for any number of audio
channels, up to 32 bits of resolution per channel, and for all sample rates of at least
8,000 samples per second.
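The stated input limits can be expressed as a simple check. This is a hypothetical helper, not part of any Comparisonics interface; it assumes at least one channel and at least one bit of resolution, neither of which is stated above.

```python
def is_supported(channels, bits_per_channel, sample_rate):
    """Check an audio format against the limits stated in the text."""
    return (
        channels >= 1                    # any number of channels (assumed at least 1)
        and 1 <= bits_per_channel <= 32  # up to 32 bits of resolution per channel
        and sample_rate >= 8000          # at least 8,000 samples per second
    )

cd_audio_ok = is_supported(channels=2, bits_per_channel=16, sample_rate=44100)
low_rate_ok = is_supported(channels=1, bits_per_channel=8, sample_rate=4000)
```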