
+ - Morse Learning Machine Challenge ->

Submitted by Anonymous Coward
An anonymous reader writes "The goal of this competition is to build a machine that learns how to decode audio files containing Morse code. The organizer hopes to attract people who are interested in solving new, difficult challenges using their predictive data modeling, computer science and machine learning expertise. During the competition, the participants build a learning system capable of decoding Morse code. To that end, they get development data consisting of WAV audio files containing short sequences of randomized Morse code at different speeds and signal-to-noise levels. The data labels are provided for a training set so the participants can self-evaluate their systems. To evaluate their progress and compare themselves with others, they can submit their prediction results online to get immediate feedback. A real-time Kaggle leaderboard shows participants their current standing based on their validation set predictions.

The Kaggle challenge also provides a sample Python Morse decoder to make it easier to get started. While this software is a purely experimental version, it has some features of the FLDIGI Morse decoder, implemented in Python instead of C++.

This challenge represents a unique combination of cutting-edge artificial intelligence and machine learning with the oldest digital communications mode, Morse code."

Link to Original Source

Comment: Re:my two cents (Score 1) 79

by mni12 (#45791713) Attached to: Ask Slashdot: How To Build a Morse Code Audio Library For Machine Learning?

Thanks for your advice, junior. I am not retired but just happen to be interested in Machine Learning methods, and this problem seems to be difficult enough since only a few people have created anything that would even come close to the level of a skilled human operator. I did investigate some speech recognition algorithms such as HMM and SOM. I have also spent some time collecting data and training software to recognize real world noisy and messy signals. In fact the current shipping version of the FLDIGI package has one of these algorithms (SOM) built in.

I don't have a PhD in a related field but I have studied signal processing and even wrote some software for MRI image reconstruction and processing earlier in my career. The papers I have read on speech recognition over the last 20 years have certainly improved the state of the art, but the methods are more incremental improvements than groundbreaking new discoveries.

BTW - How is that Siri working for you in a noisy car with windows open at highway speed? Humans can still understand each other in these kinds of conditions.

   

Comment: Re:PRNG? (Score 1) 79

by mni12 (#45791563) Attached to: Ask Slashdot: How To Build a Morse Code Audio Library For Machine Learning?

Great idea, and in fact I have been using this strategy to create a number of different synthetic test cases. I have synthetic audio files with various signal-to-noise levels, with different speeds and so on. The variable timing (rhythm) is more difficult to simulate as there is no clear distribution (like Gaussian) to use as a model. Only if you aggregate over many users and normalize by speed can you start to observe some sort of Gaussian distribution in dits and dahs. I wrote about this problem when I was investigating what kind of classifier would work well for individual ham operators.

While this seems like a great strategy, real world signals are more complex than I am able to generate with my Octave-based Morse generator tool. Also, real world signals tend to have a random mixture of all kinds of issues that the decoder needs to be able to handle simultaneously. For example, in a typical CW contest many stations give their call signs at one speed, say 23 WPM. When responding to other stations they might give a signal report (5NN) at 40 WPM. Then you add some interference from other stations and a lot of noise to this. Perhaps the best simulator I have seen so far is written by Alex, VE3NEA, and it produces very realistic sounding audio.
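As a rough illustration of the synthetic test case idea in this comment, here is a minimal Python sketch that renders Morse text at a chosen speed and signal-to-noise ratio. It is not the Octave tool mentioned above. The function names, the partial Morse table, and the Gaussian timing jitter are all assumptions made for the example; the comment itself notes that a Gaussian is only a loose model of real operator rhythm. Dit length uses the common PARIS convention (1.2 / WPM seconds).

```python
import math
import random

# Partial Morse table, enough for the demo (extend as needed).
MORSE = {"C": "-.-.", "Q": "--.-", "D": "-..", "E": ".", "5": ".....", "N": "-."}

def dit_seconds(wpm):
    """Dit length from the PARIS convention: one word = 50 dit units."""
    return 1.2 / wpm

def keying_elements(text, wpm, jitter=0.0, rng=random):
    """Return a list of (key_down, seconds) elements for `text`.

    jitter is the relative standard deviation of each element length
    (an assumed Gaussian model, only a stand-in for human rhythm).
    """
    unit = dit_seconds(wpm)

    def dur(units):
        # Clamp so heavy jitter never yields a zero or negative duration.
        return max(0.2 * unit, rng.gauss(units * unit, jitter * units * unit))

    elements = []
    for word_idx, word in enumerate(text.upper().split()):
        if word_idx:
            elements.append((False, dur(7)))          # gap between words
        for ch_idx, ch in enumerate(word):
            if ch_idx:
                elements.append((False, dur(3)))      # gap between letters
            for sym_idx, sym in enumerate(MORSE[ch]):
                if sym_idx:
                    elements.append((False, dur(1)))  # gap inside a letter
                elements.append((True, dur(1 if sym == "." else 3)))
    return elements

def synthesize(elements, snr_db, rate=8000, tone_hz=600.0, rng=random):
    """Render elements to float samples: keyed sine plus white Gaussian noise.

    SNR is defined here against the key-down tone power (0.5 for a unit sine).
    """
    noise_sigma = math.sqrt(0.5 / (10 ** (snr_db / 10.0)))
    samples = []
    n = 0
    for key_down, seconds in elements:
        for _ in range(int(round(seconds * rate))):
            tone = math.sin(2 * math.pi * tone_hz * n / rate) if key_down else 0.0
            samples.append(tone + rng.gauss(0.0, noise_sigma))
            n += 1
    return samples
```

The contest example from the comment could be approximated by concatenating `keying_elements("CQ", 23)` with `keying_elements("5NN", 40)` before calling `synthesize`, which gives a speed change inside a single file. Interference from other stations is not modeled here.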

Comment: Re:FANN Neural Net (Score 2) 79

by mni12 (#45791403) Attached to: Ask Slashdot: How To Build a Morse Code Audio Library For Machine Learning?

I did some testing using classifiers in the WEKA package but was quite disappointed with the results. My next attempt was to leverage PNN (Probabilistic Neural Network) and I got somewhat better results. In the test runs with noisy audio files with Morse code I got up to 90% accuracy in classifying dits and dahs. I have not used the FANN package a lot, though I installed it on my development machine 1-2 years ago. What are your thoughts about FANN exactly? How would you go about using the package?
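For readers unfamiliar with PNNs, the following is a minimal sketch of the general technique (a Parzen-window classifier with Gaussian kernels), applied to a single feature: mark duration divided by an estimated dit length. The training values, the kernel width `sigma`, and the function name are hypothetical and chosen only for illustration; this is not the code or the feature set used for the 90% result mentioned above.

```python
import math

def pnn_classify(x, training, sigma=0.3):
    """Classify scalar feature x with a Parzen-window PNN.

    training maps a class label to a list of example feature values.
    Each class score is the mean Gaussian kernel response over its
    examples; the class with the highest score wins.
    """
    scores = {}
    for label, examples in training.items():
        total = sum(math.exp(-((x - e) ** 2) / (2 * sigma ** 2)) for e in examples)
        scores[label] = total / len(examples)
    return max(scores, key=scores.get), scores

# Hypothetical training data: mark durations in units of the dit length.
TRAINING = {
    "dit": [0.8, 0.9, 1.0, 1.1, 1.3],
    "dah": [2.4, 2.8, 3.0, 3.2, 3.6],
}

label, scores = pnn_classify(1.05, TRAINING)
print(label, scores)
```

A PNN has no iterative training step: the stored examples are the model, and `sigma` is the main parameter to tune, typically against held-out noisy samples.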

Comment: Re:You're doing it all wrong.... (Score 1) 79

by mni12 (#45789285) Attached to: Ask Slashdot: How To Build a Morse Code Audio Library For Machine Learning?

@jfalcom -- I do realize the differences between live traffic and recordings. The example links I provided above demonstrated a live feed from the ARRL W1AW code bulletin on 12/24 at 3.58105 MHz that I decoded using an experimental version of FLDIGI v3.21.75 connected via SignaLink USB to an Elecraft KX3 radio.

However, there is a difference between debugging software and listening to live feeds. I posted this question to figure out ways to get a test set of boundary conditions captured by other hams so that I could re-run those errors in a controlled environment to replicate observed software bugs & decoding errors. Trying to debug a live feed is very hard and unfortunately beyond my skill level.

My goal is obviously to make the software work well with a real source and be capable of self-adjusting automatically to different band conditions, operators and traffic styles. Your proposal of listening to straight key nights is actually a really good suggestion -- those events are the opportunity to see the real human variety of hand keyed Morse code. Thanks for your suggestion.

Comment: Re:Try HMMs (Score 2) 79

by mni12 (#45788633) Attached to: Ask Slashdot: How To Build a Morse Code Audio Library For Machine Learning?

Thanks @SnowZero. I have looked at HMMs and in fact I wrote a simplistic decoder version using RubyHMM just to learn more about how HMMs really work. You would be surprised by the mathematical rigor of the original thesis. Many of the ideas are very relevant today, just much easier to implement with the current generation of computers.

The current decoder actually uses a Markov model: the software calculates conditional probabilities based on a 2nd-order Markov symbol transition matrix. The framework itself allows additional components to be added. The de-noising is done by a set of Kalman filters that are used in the first pass before all possible paths are labeled and control is passed to trellis calculation and eventual letter translation.
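To make the idea of a 2nd-order Markov symbol transition matrix concrete, here is a small sketch that estimates P(next symbol | previous two symbols) from example sequences. It is an illustration under assumptions, not the decoder's actual code: the function name, the add-one smoothing, and the use of characters as symbols are all choices made for the example.

```python
from collections import defaultdict

def second_order_transitions(corpus, smoothing=1.0):
    """Estimate P(next | prev2, prev1) from a list of symbol strings.

    Uses additive (Laplace) smoothing over the observed alphabet so that
    unseen continuations keep a small non-zero probability.
    """
    alphabet = sorted({s for seq in corpus for s in seq})
    counts = defaultdict(lambda: defaultdict(float))
    for seq in corpus:
        # Slide a window of three symbols: (a, b) is the context, c follows.
        for a, b, c in zip(seq, seq[1:], seq[2:]):
            counts[(a, b)][c] += 1.0
    probs = {}
    for context, nxt in counts.items():
        total = sum(nxt.values()) + smoothing * len(alphabet)
        probs[context] = {s: (nxt.get(s, 0.0) + smoothing) / total for s in alphabet}
    return probs

probs = second_order_transitions(["CQ CQ CQ DE"])
print(probs[("C", "Q")])
```

In a trellis decoder, probabilities like these would typically be combined with the per-symbol likelihoods from the signal stage when scoring candidate paths.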

I am not yet at the stage for overall speed scaling. The algorithm itself needs to work well before I want to pursue scaling this up.

   

Comment: Re:Skimmer (Score 1) 79

by mni12 (#45788489) Attached to: Ask Slashdot: How To Build a Morse Code Audio Library For Machine Learning?

I have two SDR receivers myself and use them actively. The problem is not in the volume of data but in having a set of data with a lot of variability to find out the limits where the decoder stops working correctly. I integrated the decoder into FLDIGI with the hope that I get other hams to try this out and report back when they observe conditions where the decoder stops working.

I have also created many synthetic Morse files with different speeds and signal-to-noise ratios in order to plot the performance of the decoder under controlled conditions. Testing all variations manually is pretty labor-intensive work even though I have written some automated scripts to run these test sequences and plot the results.
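One common way to score a decoder across such a grid of speeds and signal-to-noise ratios is the character error rate (CER): the edit distance between the reference text and the decoded text, divided by the reference length. The sketch below is an assumption about how such scoring could be done, not a description of the scripts mentioned above; the function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # delete r
                           cur[j - 1] + 1,           # insert h
                           prev[j - 1] + (r != h)))  # substitute or match
        prev = cur
    return prev[-1]

def character_error_rate(ref, hyp):
    """Edit distance normalized by reference length (0.0 means perfect)."""
    return edit_distance(ref, hyp) / max(1, len(ref))

print(character_error_rate("CQ DE W1AW", "CQ DE W1AV"))
```

Computing one CER value per (speed, SNR) test file gives a table that can be plotted directly to show where decoding starts to break down.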

Comment: Re:It's like you're not even trying. (Score 1) 79

by mni12 (#45788313) Attached to: Ask Slashdot: How To Build a Morse Code Audio Library For Machine Learning?

I already have many samples of CW contest traffic recorded from my Flex3000. Because most of it is computer generated, the decoding challenge is mostly related to signal-to-noise ratio and interference, not so much to the personal rhythm variations that appear when people use a straight key.

The idea presented was to collect many different kinds of CW samples. I am looking more for variation than uniformity. Having an adaptive decoder algorithm that adjusts itself automatically to all kinds of CW is a challenge.

Comment: Re:Skimmer (Score 4, Informative) 79

by mni12 (#45788265) Attached to: Ask Slashdot: How To Build a Morse Code Audio Library For Machine Learning?

I am using CW Skimmer fairly actively. In fact I have been corresponding with Alex, VE3NEA, who wrote CW Skimmer. He gave me the idea of pursuing a Bayesian framework as I have been progressing in developing a well working CW decoder. The main difference here is that I am focusing on improving FLDIGI, which is open source software, while CW Skimmer is a commercial software package. I do agree with you that CW Skimmer does a great job decoding multiple streams simultaneously. Once the algorithm works, decoding multiple streams is not that difficult.

+ - Ask Slashdot: How to build Morse code audio library for machine learning? ->

Submitted by mni12
mni12 (451821) writes "I have been working on a Bayesian Morse decoder for a while. My goal is to have a CW decoder that adapts well to different ham radio operators' rhythm, sudden speed changes, signal fluctuations, interference and noise, and has the ability to decode Morse code accurately. While this problem is not as complex as speaker independent speech recognition, there is still a lot of human variation where machine learning algorithms such as Bayesian probabilistic methods can help.

I posted the first alpha release yesterday and, despite all the bugs, the first brave ham reported success.
I would like to collect thousands of audio samples (WAV files) of real world CW traffic captured by hams via some sort of online system that would allow hams not only to upload captured files but also provide relevant details such as their callsign, date & time, frequency, radio / antenna used, software version, comments etc. I would then use these audio files to build a test library for automated tests to improve the Bayesian decoder performance.

Since my focus is on improving the decoder and not on building a digital audio archive service, I would like to get suggestions for any open source (free) software packages, online services or any other ideas on how to effectively collect a large number of audio files without putting much burden on alpha / beta testers to submit their audio captures. Many available services require registration and don't support metadata or aggregation of submissions.

Thanks in advance for your suggestions."

Link to Original Source

+ - Machine Learning Algorithms to Crack Morse Code->

Submitted by
mni12
mni12 writes "Morse code has been used since the early 1840s and is still a very popular mode of communication, especially among ham radio operators. While it takes some effort for humans to learn Morse code, it is a very efficient way of communicating short messages over radio waves, especially under noise, interference, propagation fading or other adverse conditions. Experienced human operators can easily outperform any publicly available Morse decoding software.
I have done some experiments with machine learning algorithms, especially with Self Organizing Maps (SOM) applied to real-time decoding of Morse code in real world signals filled with noise & interference. Early test results look promising but I would like to turn to the Slashdot community for some advice and ideas.

What kind of machine learning algorithms would be applicable for a real-time Morse decoder when signals contain a lot of noise, interference from other stations, fading, irregular timing and other problematic features?"

Link to Original Source

+ - Forensics Expert says Al-Qaeda Images Altered

Submitted by WerewolfOfVulcan
WerewolfOfVulcan (320426) writes "Wired reports that researcher Neal Krawetz revealed some veeeeeery interesting things about the Al-Qaeda images that our government loves to show off.

From the article: "Krawetz was also able to determine that the writing on the banner behind al-Zawahiri's head was added to the image afterward. In the second picture above showing the results of the error level analysis, the light clusters on the image indicate areas of the image that were added or changed. The subtitles and logos in the upper right and lower left corners (IntelCenter is an organization that monitors terrorist activity and As-Sahab is the video production branch of al Qaeda) were all added at the same time, while the banner writing was added at a different time, likely around the same time that al-Zawahiri was added, Krawetz says." Why would Al-Qaeda add an IntelCenter logo to their video? Why would IntelCenter add an Al-Qaeda logo? Methinks we have bigger fish to fry than Gonzo and his fired attorneys... }:-) The article contains links to Krawetz's presentation and the source code he used to analyze the photos."
