Voice Authentication Screen Saver Design Overview
Introduction
Neural networks (NN) represent the connectionist approach to the
main goal of AI: emulating the human cognitive process. Modeling the brain starts
at the lowest cognitive level: the neurons and the connections between them. The
fundamental hypothesis of this approach is that the architecture of the brain
influences the thought processes. Its typical realizations are
pattern matching systems, such as speech or image processing
systems.
Even though speech recognition and image processing are genuine functions of the
human brain and seem the most promising candidates for the
connectionist approach, NN are applicable to a wide spectrum of
problems not easily solved by human beings. These problems include business
applications. NN have even been used in a field as creative as music, for
automatic accompaniment generation. In general, NN can be
applied to any problem that requires analyzing or producing large amounts of data
and where learning can be performed.
Voice authorization belongs to the kind of problems described above. Voiceprint
analysis has long been used in the legal system, where the method
is considered quite comprehensive. The idea is to give two
voiceprints to an expert, who compares them and decides whether the phrase was
pronounced by the same person. A voiceprint is a spectrogram printed
in an easily readable form. Usually, frequencies with more weight are represented
by brighter colors or denser printing.
Here the NN can play the expert's role
and decide whether to grant access based on the voiceprint provided.
The NN can be fed directly with the data coming from a voiceprint system and
taught to react to a particular phrase and to the features of the speaker's voice.
The system does not need to know the language of the phrase the subject is
using. Assuming the right learning strategy is used, the system should be able to
adapt to the most distinguishing features of the person's voice and language.
Consequently, we can expect better results from an NN-based system than from a
statistical analyzer, even though the current state of the field does not yet prove it.
- Parameters of the human vocal system differ significantly from individual to individual;
- These parameters are reflected in the sound produced;
- Theoretically, the sound produced can be used to determine a person's identity as reliably as fingerprints.
Two spectrograms of the same phrase
The spectrum is calculated for sections of the signal in small time slots and placed in the columns of the spectrogram. The more energy a frequency has, the darker the dot that represents it in the spectrogram.
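The column-by-column construction described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `spectrogram`, `frame_size` and `hop` are hypothetical, the Hann window is an assumption (the text does not say whether a window is applied), and a naive discrete Fourier transform is used for clarity rather than speed.

```python
import cmath
import math


def spectrogram(signal, frame_size=64, hop=32):
    """Return a list of columns; each column holds the energy per frequency bin."""
    columns = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size]
        # Hann window (an assumption): softens the edges of each time slot.
        windowed = [
            s * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        column = []
        for k in range(frame_size // 2 + 1):  # non-negative frequencies only
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            column.append(abs(acc) ** 2)  # energy of frequency bin k
        columns.append(column)
    return columns


# Usage: a pure tone that completes 8 cycles per 64-sample time slot.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = spectrogram(tone)
# Index of the strongest frequency bin in every column.
peak_bins = [max(range(len(col)), key=col.__getitem__) for col in spec]
```

Each inner list is one column of the spectrogram; mapping larger energy values to darker dots gives the printed voiceprint.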
Multi-layer Perceptron
Three-layer neural network for XOR
y(x) = 1/(1+exp(-x))
Neuron’s output: $O_j = y\left(\sum_k W_{kj} O_k\right)$
$O_n$ – output of neuron n
$W_{kj}$ – weight of the link from neuron k to neuron j
k – neurons having a link to neuron j
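As a rough illustration of the definitions above, the following sketch evaluates a three-layer network for XOR. The weights are hand-picked for the example, not learned, and the constant input of 1.0 that serves as a bias is an assumption that the text does not mention. The names `neuron_output`, `layer_output` and `xor_net` are hypothetical.

```python
import math


def y(x):
    """Sigmoid from the text: y(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights):
    """Output of neuron j: y applied to the sum of W_kj * O_k over all linked neurons k."""
    return y(sum(w * o for w, o in zip(weights, inputs)))


def layer_output(inputs, weight_rows):
    """Outputs of a whole layer; each row holds the incoming weights of one neuron."""
    return [neuron_output(inputs, row) for row in weight_rows]


# Hand-picked weights (assumed, for illustration); the last entry of each row is the bias weight.
HIDDEN = [[20.0, 20.0, -10.0],   # behaves roughly like OR
          [-20.0, -20.0, 30.0]]  # behaves roughly like NAND
OUTPUT = [[20.0, 20.0, -30.0]]   # behaves roughly like AND of the two hidden neurons


def xor_net(x1, x2):
    hidden = layer_output([x1, x2, 1.0], HIDDEN)
    return layer_output(hidden + [1.0], OUTPUT)[0]


results = {(a, b): xor_net(a, b) for a in (0, 1) for b in (0, 1)}
```

With these weights the output is close to 1 when exactly one input is 1 and close to 0 otherwise.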
Three-layer neural network
Error signal calculation (d):
Adjusting weights:
d – learning rate
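A minimal sketch of one learning step follows, assuming the standard backpropagation delta rule for sigmoid neurons. The error signal is called `delta` and the learning rate `rate` here. These names, the bias inputs, the number of hidden neurons and the random initial weights are all assumptions made for the example.

```python
import math
import random


def y(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_net(n_hidden=3, seed=1):
    """2 inputs -> n_hidden neurons -> 1 output; the last weight of each row is a bias."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n_hidden)]
    output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    return {"hidden": hidden, "output": output}


def forward(net, x):
    inputs = list(x) + [1.0]  # constant 1.0 acts as a bias neuron
    hidden = [y(sum(w * o for w, o in zip(row, inputs))) for row in net["hidden"]]
    hidden_b = hidden + [1.0]
    out = y(sum(w * o for w, o in zip(net["output"], hidden_b)))
    return inputs, hidden_b, out


def train_step(net, x, target, rate=0.5):
    """One weight adjustment for one sample; returns the squared error before the update."""
    inputs, hidden_b, out = forward(net, x)
    # Error signal of the output neuron: (target - O) * O * (1 - O).
    delta_out = (target - out) * out * (1.0 - out)
    # Error signal of each hidden neuron, propagated back through its outgoing weight.
    delta_hidden = [
        h * (1.0 - h) * delta_out * net["output"][j]
        for j, h in enumerate(hidden_b[:-1])
    ]
    # Adjusting weights: W_kj += rate * delta_j * O_k.
    for k, o_k in enumerate(hidden_b):
        net["output"][k] += rate * delta_out * o_k
    for j, row in enumerate(net["hidden"]):
        for k, o_k in enumerate(inputs):
            row[k] += rate * delta_hidden[j] * o_k
    return 0.5 * (target - out) ** 2


# Usage: repeated passes over the four XOR samples.
XOR = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
net = make_net()
for _ in range(5000):
    for x, t in XOR:
        train_step(net, x, t)
outputs = [forward(net, x)[2] for x, _ in XOR]
```

Whether a particular run settles on the XOR mapping depends on the initial weights, the learning rate and the number of passes, so the sketch makes no claim about the final outputs.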
Voice Verification Program Structure
- The Recorder picks up a passphrase from continuous input. The phrase is identified as a relatively long period of high-energy sound bounded by silence or low-energy input.
- The Data Preprocessor prepares data for the Neural Network module, saves samples for learning, and generates fake samples for rejection. Data preparation includes scaling, removing data unrelated to voice identification from the sample, and building a voiceprint.
- The Neural Network module analyzes the input from the Data Preprocessor, returns the result, and can perform learning if requested.
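The Recorder's rule of a long run of high-energy sound bounded by low-energy input might be sketched as follows. The function names, the frame size, the energy threshold and the minimum run length are hypothetical values chosen for illustration, not parameters of the actual program.

```python
import math


def frame_energies(samples, frame_size):
    """Mean squared amplitude of each consecutive, non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_size]) / frame_size
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]


def find_phrase(samples, frame_size=100, threshold=0.01, min_frames=5):
    """Return (start, end) sample indices of the first long high-energy run, or None.

    A run counts only once a low-energy frame follows it, so a phrase
    still in progress at the end of the buffer is not reported.
    """
    run_start = None
    for i, energy in enumerate(frame_energies(samples, frame_size)):
        if energy >= threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_frames:
                return run_start * frame_size, i * frame_size
            run_start = None  # run too short (for example a click): discard it
    return None


# Usage: a short click is ignored, the longer tone is picked up.
silence = [0.0] * 300
click = [0.5] * 100
tone = [0.5 * math.sin(2 * math.pi * n / 20) for n in range(800)]
signal = silence + click + silence + tone + silence
segment = find_phrase(signal)  # (700, 1500)
```

The minimum run length is what separates a spoken phrase from short noises such as clicks.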
After the recording stage:
After the data preprocessing stage:
When the output of the Neural Network module is above a particular threshold, the phrase is accepted; otherwise, it is rejected. During the teaching process, the Controller receives corrections from the user and initiates the Neural Network learning process. The adjusted weights are saved to the hard drive. The Controller also keeps several correct samples in memory until the learning dialog box is closed.