Virtual Music Teacher

Computer aided music learning system


Data - Input/Output Definition




Our system must take data from the played music and output a diagnostic. What kind of data will it handle? We have to know exactly what information we need in order to match the performance against the score. What will the output look like? "Diagnostic" is quite a fuzzy word.

 

I - Input Data

The input flow is played music, which comes in two main forms: audio or MIDI. We describe both data representations and discuss the pros and cons of each.


I.1 - Audio data

Audio data has one great advantage: because it is a very rich "format", it contains all the information we need to evaluate the music. It is also very general, i.e. any kind of music source can be used as an input signal through audio acquisition: piano, flute, guitar, but also voice or percussion.

The number of useful parameters can be very high, and using them can considerably improve results: harmonic spectrum, spectral envelope, energy, zero-crossing rate...
Since most of these parameters are instrument-dependent, we can significantly increase system accuracy by building a different model for each kind of instrument.
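
As an illustration of two of the simpler parameters listed above, here is a minimal sketch that computes the short-time energy and the zero-crossing rate of one frame of samples. The function names, the 8000 Hz sample rate and the synthetic sine tones are assumptions made for the example; a real system would work on frames taken from the acquired signal.

```python
import math

def short_time_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)

def sine_frame(freq_hz, sample_rate=8000, n_samples=800):
    """Synthetic test tone standing in for an acquired audio frame."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

low = sine_frame(220.0)    # A3
high = sine_frame(880.0)   # A5

# The higher tone crosses zero more often; both tones have similar energy.
print(zero_crossing_rate(low), zero_crossing_rate(high))
print(short_time_energy(low), short_time_energy(high))
```

Neither value identifies a note on its own, but both are cheap to compute per frame, which matters for the real-time constraint discussed below.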

On the other hand, audio data requires space and time: real-time calculation on audio data demands optimal use of the computer's resources.
Another problem is a possible lack of precision: for example, the start time of a note cannot be detected without some error rate. This is mainly due to signal perturbations (noise, low signal level...), especially on notes with a weak attack.
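
The onset problem can be made concrete with a deliberately naive detector, sketched here under the assumption that a per-frame energy value is already available. The energy values and both thresholds are invented for the example; this is not the detection method of any particular system.

```python
def detect_onsets(energies, threshold):
    """Return the frame indices where energy rises from below the
    threshold to at or above it."""
    onsets = []
    previous_above = False
    for i, e in enumerate(energies):
        above = e >= threshold
        if above and not previous_above:
            onsets.append(i)
        previous_above = above
    return onsets

# Hypothetical per-frame energies: a sharp attack at frame 2,
# then a weak, slow attack that really begins around frame 8.
energies = [0.00, 0.01, 0.60, 0.50, 0.30, 0.02, 0.01,
            0.01, 0.05, 0.09, 0.14, 0.20, 0.18]

strict = detect_onsets(energies, threshold=0.25)
lenient = detect_onsets(energies, threshold=0.10)

print(strict)    # [2]      the weakly attacked note is missed
print(lenient)   # [2, 10]  it is found, but two frames late
```

Lowering the threshold further would catch the weak note earlier, but on a noisy signal it would also start reporting notes that were never played.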



I.2 - MIDI data

Unlike audio, MIDI data provides few, but very precise, parameters: pitch, duration, velocity, time... Of course, in music played in real time, the duration of a note is only known a posteriori. But the NOTE-ON / NOTE-OFF mechanism lets us detect the start and the end of a note very simply.
Another great advantage appears with polyphonic music: with MIDI we have all the information, whereas polyphonic audio data is more difficult to use.
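
A minimal sketch of how NOTE-ON / NOTE-OFF pairs give the start, end and duration of each note, including overlapping (polyphonic) notes. The event tuples `(time_ms, kind, pitch, velocity)` are a simplified representation assumed for this example, not a real MIDI library format.

```python
def pair_notes(events):
    """Turn (time_ms, kind, pitch, velocity) events into completed notes.

    kind is "on" or "off". A note's duration is only known once its
    NOTE-OFF arrives, so notes still sounding at the end are left out.
    """
    sounding = {}   # pitch -> (start_ms, velocity)
    notes = []
    for time_ms, kind, pitch, velocity in events:
        if kind == "on" and velocity > 0:
            sounding[pitch] = (time_ms, velocity)
        elif pitch in sounding:
            # NOTE-OFF, or NOTE-ON with velocity 0 (treated as NOTE-OFF in MIDI)
            start_ms, start_velocity = sounding.pop(pitch)
            notes.append({
                "pitch": pitch,
                "start_ms": start_ms,
                "duration_ms": time_ms - start_ms,
                "velocity": start_velocity,
            })
    return notes

# Two notes struck together (C4 = 60, E4 = 64), then G4 = 67 still held.
events = [
    (0, "on", 60, 80),
    (0, "on", 64, 70),
    (480, "off", 64, 0),
    (500, "on", 67, 90),
    (960, "off", 60, 0),
]
notes = pair_notes(events)
for note in notes:
    print(note)
```

The G4 is absent from the result because its NOTE-OFF has not arrived yet, which is exactly the "a posteriori" limitation mentioned above.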

The greatest problem with MIDI is the limited number of instruments that can produce it. Apart from keyboard instruments (MIDI piano, organ, MIDI accordion...), the MIDI produced is not very clean. There are two main ways to produce MIDI from non-keyboard instruments:

Pitch trackers:
The audio signal is acquired and analysed, and MIDI parameters are extracted and output. This works well on simple signals, but often produces errors on the harmonics of an attack, or with special musical features such as vibrato, tremolo or flutter-tonguing on wind instruments.

Instrument "midification":
Sensors are placed on the instrument to capture the mechanical features of the note without having to analyse the sound: for example MIDI pickups on guitar strings, or mechanical sensors on flute keys... The problem is always a lack of precision. For the guitar, the beginning of a note is easy to determine, but its end depends on a threshold. For the flute, attacks are detected by a microphone near the embouchure, and pitch by the key positions; the problem is that we cannot detect octave changes, since the key positions are the same for notes an octave apart, for example A4 (440 Hz) and A5 (880 Hz).
Furthermore, such instruments are rare, expensive, and very often heavy and hard to play because of all the "added mechanisms".
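
The octave problem can be stated in MIDI terms with the standard equal-temperament relation between frequency and note number. The sketch below assumes A4 = 440 Hz as the reference tuning and the convention in which A4 is MIDI note 69; the function names are chosen for the example.

```python
import math

A4_MIDI = 69      # MIDI note number of A4 (convention with middle C = C4 = 60)
A4_HZ = 440.0     # reference tuning assumed here

def midi_to_hz(note):
    """Equal-temperament frequency of a MIDI note number."""
    return A4_HZ * 2.0 ** ((note - A4_MIDI) / 12)

def hz_to_midi(freq_hz):
    """Nearest MIDI note number for a measured frequency."""
    return round(A4_MIDI + 12 * math.log2(freq_hz / A4_HZ))

print(hz_to_midi(440.0))   # 69, i.e. A4
print(hz_to_midi(880.0))   # 81, i.e. A5, twelve semitones higher
print(hz_to_midi(446.0))   # 69, a slightly sharp A4 still rounds to A4
```

A pitch tracker that locks onto the second harmonic of an A4 would therefore report note 81 instead of 69, and a key-position sensor on the flute cannot tell the two apart at all.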

 

I.3 - Our Choice

The choice criteria can be summarized as follows:

Criterion                                                 Audio            MIDI
Space requirement and processing time                     high             low
Complete data                                             very good        average
Precise data                                              average          very good
Noise and interference tolerance                          low              high
Coding the processing functions                           quite difficult  quite easy
Applicable to every kind of instrument                    yes              no
Allows the system to be adapted to the source instrument  yes              no
Applicable to polyphonic music                            with difficulty  yes

 

The choice is difficult because the two data types are complementary. Moreover, the document "Background - Existing Systems" describes two kinds of Score Followers: one with audio input, the other with MIDI input.

We cannot choose without knowing what kind of music will be played. If we have, for example, a piece by Chopin, i.e. a polyphonic piano piece, we will obviously choose MIDI as input, whereas for a flute study, i.e. a simple monophonic piece, audio will be more effective.

The last choice criterion is the time available to build the system. Since the input data type does not affect the other functions too much, we will focus on a single type as the first implementation step, and then consider the second input type as an extension of the program.
MIDI data lets us test the system without having to store a large amount of data. Moreover, we can easily produce a MIDI score and a test MIDI performance containing artificial errors, which is more difficult with audio.
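
To show how cheaply such test material can be produced, here is a minimal sketch that derives a "performance" from a score and injects labelled artificial errors. The note tuples `(onset_ms, pitch, duration_ms)`, the three error kinds, the 120 ms delay and the function name are all assumptions made for the example.

```python
import random

def make_performance(score, error_rate=0.2, seed=0):
    """Copy a score, injecting errors at the given rate.

    Returns the performance and a list of (score_index, error_kind),
    so a matcher's output can later be checked against known errors.
    """
    rng = random.Random(seed)   # seeded, so test data is reproducible
    performance, injected = [], []
    for index, (onset_ms, pitch, duration_ms) in enumerate(score):
        if rng.random() >= error_rate:
            performance.append((onset_ms, pitch, duration_ms))
            continue
        kind = rng.choice(["wrong_pitch", "missing_note", "late_onset"])
        injected.append((index, kind))
        if kind == "wrong_pitch":
            shift = rng.choice([-2, -1, 1, 2])
            performance.append((onset_ms, pitch + shift, duration_ms))
        elif kind == "late_onset":
            performance.append((onset_ms + 120, pitch, duration_ms))
        # "missing_note": nothing is appended
    return performance, injected

# A C major scale, one note every 500 ms.
score = [(i * 500, p, 450)
         for i, p in enumerate([60, 62, 64, 65, 67, 69, 71, 72])]

performance, injected = make_performance(score, error_rate=0.3, seed=1)
print(injected)
```

Because the injected errors are recorded, the same function gives both the test input and the expected diagnostic.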

Our first goal will be to use MIDI data as input for the system.

 


II - Output Data

The main goal of the project is to indicate where errors were made and what kind of errors they were. We have to find an output that lets us reach this goal.
There are two ways to indicate errors: immediately when they are made, or at the end of the performance, in a report of all the errors played.


II.1 - Real-time signalling

As soon as an error has been made, we may want feedback: "you have made an error!". This feedback could be a sound, a visual signal on the screen, or anything else that tells the player that what he has just played was wrong.
This cannot be done for every kind of error: dynamic errors, for example, need a large "window" to be detected, and a real-time indication such as "you have played too loud" would not be relevant.
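
The difference between the two kinds of error can be sketched as follows: a pitch error can be decided on a single note, while a dynamic error is only decided on the average over a window of recent notes. The class name, the window size of 4 notes and the tolerance of 15 velocity units are arbitrary assumptions for the example.

```python
from collections import deque

def pitch_error(played_pitch, expected_pitch):
    """A pitch error can be flagged as soon as the note is played."""
    return played_pitch != expected_pitch

class DynamicsMonitor:
    """Flags dynamics only from the mean deviation over recent notes."""

    def __init__(self, window_size=4, tolerance=15):
        self.window = deque(maxlen=window_size)
        self.tolerance = tolerance

    def feed(self, played_velocity, expected_velocity):
        """Return "too loud", "too soft", or None."""
        self.window.append(played_velocity - expected_velocity)
        if len(self.window) < self.window.maxlen:
            return None   # not enough context yet
        mean_diff = sum(self.window) / len(self.window)
        if mean_diff > self.tolerance:
            return "too loud"
        if mean_diff < -self.tolerance:
            return "too soft"
        return None

monitor = DynamicsMonitor(window_size=4, tolerance=15)
# One accented note among soft ones, then a sustained loud passage.
played = [64, 110, 64, 64, 100, 100, 100]
results = [monitor.feed(v, expected_velocity=64) for v in played]
print(results)
```

The single accented note never triggers a message on its own; the monitor only reports "too loud" once the loud passage dominates the window, by which time the indication arrives several notes after the fact.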

For didactic purposes, it would be a good idea to indicate errors immediately: the player knows what he has just played, and does not have to remember "what did I do in meas. 37?". The related problem is that it is sometimes difficult for a beginner to stop and then resume playing from a given point in the score. Moreover, real-time signalling could seriously disturb the player, and make him commit many more errors because his attention has been distracted.


II.2 - General report

When the performance is over, all detected errors can be indicated. The output could be a text file, text on the screen, an interactive dialogue... and it can contain all the information we need about the errors played: when and where each error was made, its kind, the comparison between what was played and what should have been played, etc.
To define the most useful report, we must wait for the results of the survey conducted with music teachers.
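
Pending those results, a minimal sketch of what a plain-text report could look like is given here. The fields simply mirror the items listed above (when, where, kind, expected versus played); the class name, field names and sample errors are placeholders invented for the example, not a settled format.

```python
from dataclasses import dataclass

@dataclass
class PerformanceError:
    measure: int     # where in the score
    time_ms: int     # when in the performance
    kind: str        # e.g. "wrong pitch"
    expected: str    # what should have been played
    played: str      # what was actually played

def format_report(errors):
    """Render the detected errors as a plain-text report, in time order."""
    if not errors:
        return "No errors detected."
    lines = [f"{len(errors)} error(s) detected:"]
    for e in sorted(errors, key=lambda err: err.time_ms):
        lines.append(
            f"  meas. {e.measure} at {e.time_ms} ms: {e.kind} "
            f"(expected {e.expected}, played {e.played})"
        )
    return "\n".join(lines)

# Invented sample errors, for illustration only.
errors = [
    PerformanceError(37, 61200, "wrong pitch", "F#4", "F4"),
    PerformanceError(12, 18450, "late onset", "beat 3", "beat 3 + 120 ms"),
]
report = format_report(errors)
print(report)
```

The same list of errors could equally feed a screen display or an interactive dialogue; only the formatting function would change.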