Thanks for your advice, junior. I am not retired but just happen to be interested in Machine Learning methods and this problem seems to be difficult enough since only few people have created anything that would even closely perform at skilled human operator level. I did investigate some speech recognition algorithms such as HMM and SOM. I have spent also some time collecting data and training software to recognize real world noisy and messy signals. In fact the current shipping version of FLDIGI package has one of these algorithms (SOM) built in.
I don't have a PhD in related field but I have studied signal processing and even wrote some software for MRI image reconstruction and processing earlier in my career. The papers I have read on speech recognition over the last 20 years have certainly improved the state of the art but the methods are more incremental improvements than some ground breaking new discoveries.
BTW - How is that Siri working for you in a noisy car with windows open at highway speed? Humans can still understand each others in this kind of conditions.
Great idea and in fact I have been using this strategy to create a number of different synthetic test cases. I have synthetic audio files with various Signal-to-Noise levels, with different speeds and so on. The variable timing (rhythm) is more difficult to simulate as there is no clear distribution (like Gaussian) to use as a model. Only if you aggregate over many users and normalize by speed you can start to observe some sort of Gaussian distribution in dits and dahs. I wrote about this problem when I was investigating what kind of classifier would work well for individual ham operators.
While this seems like a great strategy the real world signals are more complex than I am able to generate with my Octave based Morse generator tool. Also, real world signals tend to have a random mixture of all kinds issues that the decoder needs to be able to handle simultaneously. For example, in typical CW contest many stations give their call signs at one speed, say 23 WPM. When responding to other stations they might give signal report (5NN) at 40 WPM. Then you add some interference from other stations and a lot of noise to this. Perhaps the best simulator I have seen so far is written by Alex, VE3NEA and it produces very realistic sounding audio.
I did some testing using classifiers in WEKA package but was quite disappointed on the results. My next attempt was to leverage PNN (Probabilistic Neural Network) and got somewhat better results. In the test runs with noisy audio files with Morse code I got up to 90% accuracy in classifying dits and dahs. I have not used FANN package a lot though I installed it on my development machine 1-2 years ago. What are your thought about FANN exactly? How would you go about using the package?
@jfalcom -- I do realize the differences between live traffic and recordings. The example links I provided above demonstrated a live feed from ARRL W1AW code bulletin on 12/24 at 3.58105 MHz that I decoded using experimental version of FLDIGI v3.21.75 connected via SignaLink USB to Elecraft KX3 radio.
However, there is a difference between debugging software and listening live feeds. I posted this question to figure out ways how to get a test set of boundary conditions captured by other hams so that I could re-run those errors in a controlled environment to replicate observed software bugs & decoding errors. Trying to debug a live feed is very hard and unfortunately beyond my skill level.
My goal is obviously to make the software to work well with a real source and be capable to self-adjust automatically to different band conditions, operators and traffic styles. Your proposal on listening straight key nights is actually a real good suggestion -- those events are the opportunity to see the real human variety of hand keyed Morse code. Thanks for your suggestion.
I did listen parts of the conversation between WOOH, NMN, LJKR and other boats in vicinity. Scary indeed.
BTW - FLDIGI had hard time decoding this correctly, partly because the signal quality was so poor.
Thanks for sharing.
Thanks @SnowZero. I have looked at HMMs and in fact I wrote a simplistic decoder version using RubyHMM just to learn more how HMM really works. You would be surprised on the mathematical rigor of the original thesis. Many of the ideas are very relevant today, just much easier to implement with current generation of computers.
The current decoder actually uses Markov Model - the software calculates conditional probabilities based on 2nd order Markov symbol transition matrix. The framework itself allows to add additional components. The de-noising is done by a set of Kalman filters that are used in the first pass before all possible paths are labeled and control is passed to trellis calculation and eventual letter translation.
I am not yet at the stage for overall speed scaling. The algorithm itself needs to work well before I want to pursue scaling this up.
I have two SDR receivers myself and using them actively. The problem is not in the volume of data but having a set of data with a lot of variability to find out limits where the decoder stops working correctly. I integrated the decoder to FLDIGI with the hope that I get other hams to try this out and report back when they observe conditions where decoder stops working.
I have also created many synthetic Morse files with different speed and Signal-to-noise ratio in order to plot the performance of the decoder under controlled conditions. Testing all variations manually is pretty labor intensive work even though I have written some automated scripts to run these test sequences and plot the results.
Great suggestion - thank you! Looks like the site requires registration but it has been created exactly for this kind of audio related research. It has even APIs to access the data. I will investigate this a bit further.
I have already many samples of CW contest traffic recorded from my Flex3000. Because most of it is computer generated the decoding challenge is mostly related to signal-to-noise ratio and interference, not so much on personal rhythm variances when people are using straight key.
The idea presented was to collect many different kinds of CW samples. I am looking more for variation than uniformity. Having an adaptive decoder algorithm that adjusts itself automatically to all kinds of CW is a challenge.
I am using CW skimmer fairly actively - in fact I have been corresponding with Alex, VE3NEA who wrote the CW Skimmer. He gave me the idea of pursuing Bayesian framework as I have been progressing in developing a well working CW decoder. The main difference here is that I am focusing on improving FLDIGI which is open source software while CW Skimmer is a commercial software package. I do agree with you that CW skimmer does a great job decoding multiple streams simultaneously. Once the algorithm works decoding multiple streams is not that difficult.
1. Figure out a cool project
2. Find a sponsor
3. Take one step to skydive from 128,000 ft
Failure is more frequently from want of energy than want of capital.