survival8: BITS WILP Software Architectures Assignment (Solutions) 2017-H1

Link to StarUML File:
SA Assignment (UML)

SOFTWARE ARCHITECTURE ASSIGNMENT

Problem: Design a speech recognition software system case study.
The input is speech recorded as a waveform, either as single words or as whole sentences. It is restricted to the syntax and vocabulary needed for a specific application, such as a database query.
The output is a machine representation of English phrases (tuples are often used). The transformations involve linguistic, acoustic-phonetic and statistical expertise.
It is assumed that there is no consistent algorithm that combines all the necessary procedures for recognizing speech. The problem is made worse by the ambiguities of spoken language, noisy data, and the individual peculiarities of speakers, such as vocabulary, pronunciation and syntax.
Which architectural style and design patterns will be most suitable for designing such a system?

Solution: The Speech Recognition Software system involves a number of procedures to be carried out on the speech input given by the user. These procedures are recording input from the user as an audio waveform, transforming the waveform into the corresponding phonemes/words, and then forming SQL queries from the spoken words.
No single consistent algorithm combines all the procedures needed to recognize speech; this property makes the ‘Blackboard’ pattern suitable for implementing the system.
We develop the system as a collection of independent programs that work cooperatively, each solving a particular part of the overall task. Our system will have the following parts:
Blackboard: Class SpeechReconitionBlackboard
Knowledge Sources: Classes Microphone, PhonemeTokenizer and QueryBuilder
Controller: Class SpeechRecognitionController
The blackboard ‘SpeechReconitionBlackboard’ is the central data store; elements of the solution space and control data are stored here.
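As a minimal sketch of such a central data store, the blackboard could be modelled as a plain data class. The field names (waveform, tokens, query, log) and the state() helper are assumptions made for illustration; the assignment itself only names the class.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SpeechReconitionBlackboard:
    """Central data store shared by all knowledge sources (field names are assumed)."""

    waveform: Optional[bytes] = None                 # raw audio written by Microphone
    tokens: List[str] = field(default_factory=list)  # words written by PhonemeTokenizer
    query: Optional[str] = None                      # SQL text written by QueryBuilder
    log: List[str] = field(default_factory=list)     # control data, e.g. who wrote last

    def state(self) -> str:
        """Summarize progress so a controller can decide which source to run next."""
        if self.query is not None:
            return "QUERY_READY"
        if self.tokens:
            return "TOKENS_READY"
        if self.waveform is not None:
            return "AUDIO_READY"
        return "EMPTY"
```

Keeping the solution data and the control data in one object means the knowledge sources never need to reference each other; they only read from and write to the blackboard.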
The controller SpeechRecognitionController monitors the state of the blackboard and calls the respective methods of the Knowledge Source classes depending on that state. For example, the controller calls ‘startRecording()’ on the Microphone class, ‘tokenize()’ on the PhonemeTokenizer class, and ‘buildQuery()’ on the QueryBuilder class.
Microphone takes care of the acoustic-phonetic transformation, PhonemeTokenizer takes care of the linguistic transformation, and QueryBuilder takes care of the statistical expertise required to form the SQL queries.
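One way to keep the knowledge sources independent of each other is to give each a condition part and an action part, as in the sketch below. The KnowledgeSource base class, the can_contribute()/contribute() methods and the dict-based blackboard are hypothetical and not part of the original design; the "waveform" is a text transcript standing in for real audio, and the query rule is a toy placeholder.

```python
from abc import ABC, abstractmethod


class KnowledgeSource(ABC):
    """Hypothetical shared interface: a condition part and an action part."""

    @abstractmethod
    def can_contribute(self, blackboard: dict) -> bool:
        """Return True when the blackboard holds the input this source needs."""

    @abstractmethod
    def contribute(self, blackboard: dict) -> None:
        """Write this source's partial result to the blackboard."""


class PhonemeTokenizer(KnowledgeSource):
    def can_contribute(self, blackboard):
        return "waveform" in blackboard and "tokens" not in blackboard

    def contribute(self, blackboard):
        # Placeholder for real linguistic processing: the "waveform" here is
        # already a text transcript, so tokenizing is a simple split.
        blackboard["tokens"] = blackboard["waveform"].lower().split()


class QueryBuilder(KnowledgeSource):
    def can_contribute(self, blackboard):
        return "tokens" in blackboard and "query" not in blackboard

    def contribute(self, blackboard):
        # Toy rule standing in for statistical query construction:
        # treat the last spoken word as a table name.
        blackboard["query"] = "SELECT * FROM " + blackboard["tokens"][-1]


def runnable_sources(sources, blackboard):
    """List the sources whose condition part currently holds."""
    return [s for s in sources if s.can_contribute(blackboard)]
```

With this split, a controller only has to ask which sources are runnable for the current blackboard state; it does not need to know what each source does internally.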
Operation starts when the user calls ‘startOperation()’ on the ‘SpeechRecognitionController’ class. The Speech Recognition System then runs in a loop, taking speech input via the ‘Microphone’ class, which provides the methods ‘startRecording()’ and ‘stopRecording()’ for this purpose.
After the recording is over, ‘SpeechRecognitionController’ calls the tokenize() method on the PhonemeTokenizer class.
When word tokens from PhonemeTokenizer become available on the SpeechReconitionBlackboard, the controller detects this and calls the buildQuery() method on the QueryBuilder class.
The QueryBuilder updates the SpeechReconitionBlackboard with the generated SQL query.
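The sequence of calls described above can be sketched as a small control loop. This is an illustration under stated assumptions, not the assignment's implementation: method names are written in Python's snake_case (start_operation for startOperation(), and so on), the blackboard is a plain dict, the Microphone stub returns canned text instead of recording audio, and the query rule is a toy placeholder.

```python
class Microphone:
    """Stub: stores canned text instead of recording a real waveform."""

    def __init__(self, canned_input):
        self.canned_input = canned_input

    def start_recording(self, blackboard):
        blackboard["waveform"] = self.canned_input

    def stop_recording(self, blackboard):
        blackboard["recording_done"] = True


class PhonemeTokenizer:
    def tokenize(self, blackboard):
        # Placeholder: the "waveform" is already text, so split it into words.
        blackboard["tokens"] = blackboard["waveform"].lower().split()


class QueryBuilder:
    def build_query(self, blackboard):
        # Toy rule: treat the last spoken word as a table name.
        blackboard["query"] = "SELECT * FROM " + blackboard["tokens"][-1]


class SpeechRecognitionController:
    def __init__(self, blackboard, microphone, tokenizer, query_builder):
        self.blackboard = blackboard
        self.microphone = microphone
        self.tokenizer = tokenizer
        self.query_builder = query_builder

    def start_operation(self):
        """Inspect the blackboard and run the matching source until a query exists."""
        bb = self.blackboard
        while "query" not in bb:
            if "waveform" not in bb:
                self.microphone.start_recording(bb)
                self.microphone.stop_recording(bb)
            elif "tokens" not in bb:
                self.tokenizer.tokenize(bb)
            else:
                self.query_builder.build_query(bb)
        return bb["query"]
```

For example, constructing the controller with an empty dict and Microphone("show all employees") and calling start_operation() leaves the tokens and the query "SELECT * FROM employees" on the blackboard. Each pass through the loop makes exactly one decision based on blackboard state, which is the point of the pattern: the order of processing is not hard-wired into the knowledge sources.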
Certain keywords could be stored in system memory so that the system catches them immediately; these stored words mark the start and end of a query. For example, ‘START’ and ‘END’ tell the system where a query begins and where it terminates.
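A minimal sketch of the keyword idea, assuming the tokenizer delivers a flat list of words: the hypothetical helper below returns only the words spoken between the ‘START’ and ‘END’ markers. The function name, the case-insensitive matching and the None return value for an incomplete query are illustrative choices, not part of the original design.

```python
def extract_query_tokens(tokens, start="START", end="END"):
    """Return the words between the first start marker and the next end marker.

    Returns None when either marker is missing, i.e. the query is incomplete.
    """
    upper = [t.upper() for t in tokens]  # match markers case-insensitively
    if start not in upper:
        return None
    begin = upper.index(start) + 1
    try:
        finish = upper.index(end, begin)  # search only after the start marker
    except ValueError:
        return None
    return tokens[begin:finish]
```

Words spoken before ‘START’ or after ‘END’ are dropped, so stray speech around the query never reaches the QueryBuilder.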

Use Case Diagram: