Voice Authentication Screen Saver Design Overview

Introduction

Neural networks (NN) represent the connectionist approach to the main goal of AI: emulating the human cognitive process. Modeling the brain starts at the lowest cognitive level, with the neurons and the connections between them. The fundamental hypothesis of this approach is that the architecture of the brain influences the thought processes. Typical realizations are pattern-matching systems, such as speech or image processing systems.
Even though speech recognition and image processing are genuine functions of the human brain and seem the most promising candidates for the connectionist approach, NN are applicable to a wide spectrum of problems not easily solved by human beings, including business applications. NN have even been used in a field as creative as music, for automatic accompaniment generation. In general, NN can be applied to any problem that requires analyzing or producing large amounts of data and where learning can be performed.
Voice authorization belongs to this kind of problem. Voiceprint analysis has long been used in the legal system, where the method is considered quite comprehensive. The idea is to give two voiceprints to an expert, who compares them and decides whether the phrase was pronounced by the same person. A voiceprint is a spectrogram printed in an easily readable form. Usually, frequencies with more weight are represented by brighter colors or denser printing. Here an NN can play the expert's role and decide whether to grant access based on the voiceprint provided. The NN can be fed directly with the data coming from a voiceprint system and taught to react to a particular phrase and to features of the speaker's voice. The system would not need to know the language of the phrase the subject is using. Assuming the right learning strategy is used, the system should be able to adjust itself to use the most distinguishing features of the person's voice and language. Consequently, we can expect better results from an NN-based system than from a statistical analyzer, even though the current state of the field does not yet prove it.

Sound Production

Functional components of the human vocal system

- Parameters of the human vocal system differ significantly from individual to individual;

- These parameters are reflected in the sound produced;

- Theoretically, the sound produced can be used to determine a person's identity as reliably as fingerprints.

Voice Spectrograms

Two spectrograms of the same phrase

The spectrum is calculated for sections of the signal in small time slots and placed in the columns of the spectrogram. The higher the energy of a frequency, the darker the dot that represents it in the spectrogram.
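The column-by-column construction can be sketched as follows. This is a minimal illustration in plain Python using a naive discrete Fourier transform. The frame length, the hop size, and the function name `spectrogram` are assumptions made for the example, not details of the original program, which would more likely use an FFT with a window function.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return a list of columns; each column holds the energy per frequency bin.

    frame_len and hop are illustrative values, not taken from the original design.
    """
    columns = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        column = []
        # For real input only bins 0..frame_len/2 carry independent information.
        for k in range(frame_len // 2 + 1):
            acc = 0j
            for n, x in enumerate(frame):
                acc += x * cmath.exp(-2j * math.pi * k * n / frame_len)
            column.append(abs(acc) ** 2)  # energy of frequency bin k
        columns.append(column)
    return columns


# Hypothetical test signal: a pure tone that falls exactly on bin 8.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = spectrogram(tone)
loudest_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

To draw the spectrogram, each energy value would be mapped to a dot whose darkness grows with the energy, one column per time slot.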

Multi-layer Perceptron

Three-layer neural network for XOR

Activation function:

y(x) = 1/(1+exp(-x))

Neuron’s output:

O_n – output of the neuron n

W_kj – weight of the link from neuron k to j

k – neurons having link to neuron j
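A forward pass consistent with these definitions can be sketched as below, assuming the usual weighted-sum form O_j = y(Σ_k W_kj · O_k) plus a bias term. The bias, the function names, and the hand-picked weights are all assumptions for illustration; they are not values from the original network.

```python
import math


def y(x):
    """Logistic activation: y(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, bias):
    """O_j = y(sum_k W_kj * O_k + bias); the bias term is an assumption."""
    return y(sum(w * o for w, o in zip(weights, inputs)) + bias)


def xor_net(a, b):
    # Hand-picked illustrative weights, not taken from the original program.
    h_or = neuron_output([a, b], [20.0, 20.0], -10.0)   # fires if a OR b
    h_and = neuron_output([a, b], [20.0, 20.0], -30.0)  # fires if a AND b
    # Output fires when h_or is on and h_and is off.
    return neuron_output([h_or, h_and], [20.0, -20.0], -10.0)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, round(xor_net(a, b), 3))
```

The two hidden neurons are what make XOR possible: a single logistic neuron can only separate its inputs with one straight line, and XOR is not linearly separable.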

Teaching Multi-layer Perceptron

Three layer neural network

Error signal calculation (d):

Adjusting weights:

d – learning rate
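One teaching step can be sketched as follows, assuming the textbook backpropagation rule for logistic units: the output error signal is (target − O) · O · (1 − O), a hidden error signal is O_j · (1 − O_j) times the error signals it feeds weighted by the connecting links, and each weight changes by rate × error signal × input on that link. The dictionary layout, the names `forward` and `train_step`, and the starting weights are hypothetical.

```python
import math


def y(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(net, inputs):
    """Return (hidden outputs, network output) for one input vector."""
    hidden = [y(sum(w * o for w, o in zip(ws, inputs)) + b)
              for ws, b in zip(net["w_hidden"], net["b_hidden"])]
    out = y(sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"])
    return hidden, out


def train_step(net, inputs, target, rate=0.5):
    """One backpropagation update; returns the squared error before the update."""
    hidden, out = forward(net, inputs)
    # Error signal of the output neuron.
    d_out = (target - out) * out * (1.0 - out)
    # Error signals of the hidden neurons, propagated back through the old weights.
    d_hidden = [h * (1.0 - h) * d_out * w
                for h, w in zip(hidden, net["w_out"])]
    # Adjust weights: change = rate * error signal * input on the link.
    net["w_out"] = [w + rate * d_out * h for w, h in zip(net["w_out"], hidden)]
    net["b_out"] += rate * d_out
    for j, d in enumerate(d_hidden):
        net["w_hidden"][j] = [w + rate * d * x
                              for w, x in zip(net["w_hidden"][j], inputs)]
        net["b_hidden"][j] += rate * d
    return 0.5 * (target - out) ** 2


# Small fixed starting weights (hypothetical values chosen for the example).
net = {
    "w_hidden": [[0.1, -0.2], [0.3, 0.2]],
    "b_hidden": [0.0, 0.0],
    "w_out": [0.2, -0.1],
    "b_out": 0.0,
}
errors = [train_step(net, [1.0, 0.0], 1.0) for _ in range(50)]
```

Repeating the step on the same sample drives the output toward the target, so the recorded errors shrink. Teaching a full pattern set would cycle through all samples instead of one.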

Voice Verification Program Structure

- The Recorder picks up a pass phrase from continuous input. The phrase is identified by a relatively long period of high-energy sound bounded by silence or low-energy input.

- The Data preprocessor prepares data for the Neural Network module, saves samples for learning, and generates fake samples for rejection. Data preparation includes scaling, removing data unrelated to voice identification from the sample, and building a voiceprint.

- The Neural Network module analyzes the input from the Data preprocessor, returns the result, and can perform learning if requested.
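The Recorder's phrase detection and the scaling step of the Data preprocessor can be sketched as below. This is a minimal sketch under stated assumptions: the frame length, the energy threshold, the minimum run length, and all function names are hypothetical, and the original program may segment and scale its input differently.

```python
def frame_energies(samples, frame_len=100):
    """Mean squared amplitude of consecutive non-overlapping frames."""
    return [sum(s * s for s in samples[i:i + frame_len]) / frame_len
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def find_phrase(samples, frame_len=100, threshold=0.01, min_frames=5):
    """Return (start, end) sample indices of the first long high-energy run, or None.

    A run counts only if it spans at least min_frames frames and is followed
    by a low-energy frame. By construction it is preceded by low energy or by
    the start of the input.
    """
    energies = frame_energies(samples, frame_len)
    run_start = None
    for i, e in enumerate(energies):
        if e >= threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_frames:
                return run_start * frame_len, i * frame_len
            run_start = None  # run too short (a click) or plain silence
    return None


def scale_to_unit_peak(samples):
    """Scale so that the largest absolute amplitude becomes 1.0."""
    peak = max(abs(s) for s in samples)
    return [s / peak for s in samples] if peak > 0 else list(samples)


# Synthetic input: silence, a short click, silence, a long burst, silence.
signal = ([0.0] * 300 + [0.5] * 100 + [0.0] * 300
          + [0.5, -0.5] * 400 + [0.0] * 300)
bounds = find_phrase(signal)
phrase = scale_to_unit_peak(signal[bounds[0]:bounds[1]])
```

The minimum run length is what separates a spoken phrase from short noises such as clicks, which exceed the energy threshold only briefly.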

After the recording stage:

After the data preprocessing stage:

Neural Network Module

When the output of the Neural Network module is above a particular threshold, the phrase is accepted; otherwise, it is rejected. During the teaching process, the Controller receives corrections from the user and initiates the Neural Network learning process. The adjusted weights are saved to the hard drive. The Controller also keeps several correct samples in memory until the learning dialog box is closed.
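The accept/reject rule and the saving of adjusted weights might look like the sketch below. The threshold value, the JSON file format, the file name, and the function names are all assumptions; the text only states that a threshold is used and that the weights are stored on disk.

```python
import json
import os
import tempfile

ACCEPT_THRESHOLD = 0.8  # hypothetical value; no threshold is stated in the text


def is_accepted(nn_output, threshold=ACCEPT_THRESHOLD):
    """Accept the phrase only when the network output is above the threshold."""
    return nn_output > threshold


def save_weights(weights, path):
    """Store the adjusted weights on disk (JSON is an assumed format)."""
    with open(path, "w") as f:
        json.dump(weights, f)


def load_weights(path):
    with open(path) as f:
        return json.load(f)


# Hypothetical file name and weight values, used only for the round trip.
path = os.path.join(tempfile.gettempdir(), "voice_nn_weights.json")
save_weights({"w_out": [0.2, -0.1], "b_out": 0.0}, path)
restored = load_weights(path)
```

Raising the threshold makes false acceptance less likely at the cost of rejecting the genuine speaker more often, so in practice it would be tuned on recorded samples.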
