For a very long time, I’ve been wanting to try to guess the clash location based on gyro/accelerometer data. I’ve collected and plotted clash data before, but this time I got serious and applied some machine learning to the problem.
The first thing I did was create a better clash recorder. It lights up a small section of the blade, waits for a clash, and then asks if I want to save it. The saved data contains the accelerometer and gyro data, along with the computed “down” vector, sampled 1600 times per second. Each recording has 256 rows: 128 from before the clash and 128 from after it. The saved data also contains the location that was highlighted.
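To make the recording layout concrete, here is a small sketch of the timing it implies. The sample rate and row counts come from the description above; the function and constant names are made up for illustration and are not taken from the recorder itself.

```python
SAMPLE_RATE_HZ = 1600   # from the description above
ROWS_PER_CLASH = 256    # 128 rows before the clash + 128 rows after
PRE_CLASH_ROWS = 128

def row_time_ms(row):
    """Time of a row relative to the clash, in milliseconds (hypothetical helper)."""
    return (row - PRE_CLASH_ROWS) * 1000.0 / SAMPLE_RATE_HZ

# Total length of one recording in milliseconds.
window_ms = ROWS_PER_CLASH * 1000.0 / SAMPLE_RATE_HZ

print(window_ms, row_time_ms(0), row_time_ms(ROWS_PER_CLASH - 1))
```

So each saved clash covers a window of 160 ms, with rows spaced 0.625 ms apart.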
Next, I recorded 150 clashes, each at a random location on the blade. Then I copied the saved data to my computer and fired up TensorFlow.
Unfortunately, TensorFlow doesn’t seem to be able to find a correlation between the data and the location. The best it seems able to do looks something like this:
In this scatterplot, the X axis is the highlighted point on the blade, and the Y axis is the predicted point. In an ideal world, these points should form a 45-degree line. As you can see there is very little correlation between predicted data and real data.
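One way to put a number on "very little correlation" is the Pearson correlation coefficient between the highlighted and predicted locations. This is only a sketch: the lists below are made-up values for illustration, not the recorded clashes, and `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(actual, predicted):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

# Made-up locations purely for illustration.
actual = [10, 20, 30, 40, 50]
ideal = [10, 20, 30, 40, 50]       # points on the 45-degree line
unrelated = [31, 29, 30, 29, 31]   # predictions that ignore the input

print(pearson_r(actual, ideal), pearson_r(actual, unrelated))
```

Points on the 45-degree line give a coefficient of 1, while predictions that hover around the middle of the blade regardless of input give a value near 0.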
Unless I’m using TensorFlow wrong in some way, I think that means that it’s unlikely that we can find a correlation between the accelerometer/gyro/down data and the clash location.
A+ on the effort though, thanks!
Next time I need to investigate if I can do it with a microphone.
(I would need to sample the microphone at 200 kHz or more though.)
Could the scattering just be the limitations of the device? I mean, if you look at the points from far enough back, they are in that general area. Drop the sample rate, see what the points look like then, and run the analysis. As in more “focal group”, less “crowd sample”.
I have no idea what you are suggesting.
Reducing the sample rate just means we’ll have less accurate data as input, leading to more garbage in → garbage out.
I’m not sure how to put it into terms that apply to Proffie so bear with me as I try to articulate. Remember I’m used to vehicles and planes not sound boards. In my mind it covers how the samples are being taken. Maybe instead what I should ask is how the samples are taken and what overall factors are in each sample. Like does the sample set include things that can be slightly changed or even switched to include alternate sources. Hopefully that makes sense. Refine what’s being looked at to better the test answers bringing them more into line.
The point of using machine learning is that I can throw all the data at it, and it should be able to pick up the signal, if there is one. So that’s what I did. If that worked, I was going to try my best to find out what that signal was and write my own algorithm for detecting it. However, the signal found by the machine learning framework is super weak, making it almost useless, and that’s before I start reducing the input data, reducing the size of the model, or using cross-validation. (All of which will make the error larger.)
- If the machine learning framework is powerful enough
Ok that I get. I was going to ask “Were you swinging the saber during these clash tests that the machine looked at?” Like maybe latency accounts for the scattered spots where swing speed may be a factor.
Wait a second…
I found a bug in my program.
In fact, it’s suspiciously good, I think maybe I have over-fitted the model.
Time to break out the cross-validation…
Now that’s what I would expect to see. Awesome!
Turns out that the “good” result above is misleading though.
I was being lazy with the machine learning and used the same data to train and evaluate the TensorFlow model. The model didn’t learn how to predict clash locations; it simply learned to recognize the answer for each of the inputs I provided. It’s like learning the answers to a math test instead of learning how to calculate the answers: works great if you get the same test again, but not so great if it’s a different math test.
When I separate the training data from the evaluation data, the results aren’t nearly as good. In fact, they are worse than my first picture above. It’s not entirely bad news though. I might be able to reduce the complexity of the model to a point where it can’t remember all the answers anymore, and that might encourage it to learn the real thing instead. There might be other techniques I can use as well, or maybe, I just need to feed it more data. So while the outlook is a bit grim at the moment, there are still some avenues to explore.
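The gap between "evaluated on the training data" and "evaluated on held-out data" can be shown with a toy example. This is a sketch under loud assumptions: the data is random (so there is no signal at all), and a nearest-neighbor lookup stands in for a model that memorizes its inputs. It is not the TensorFlow model from this thread.

```python
import random

def nearest_neighbor_predict(train_x, train_y, x):
    """A pure memorizer: return the label of the closest training input."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    return train_y[best]

def mean_abs_error(train_x, train_y, xs, ys):
    """Average absolute error of the memorizer on the points (xs, ys)."""
    errors = [abs(nearest_neighbor_predict(train_x, train_y, x) - y)
              for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)

rng = random.Random(0)
# Synthetic stand-in: 150 "clashes" with 8 random features each and a
# random location, so there is nothing real to learn.
inputs = [[rng.random() for _ in range(8)] for _ in range(150)]
locations = [rng.random() for _ in range(150)]

split = 120  # first 120 for training, last 30 held out
train_x, train_y = inputs[:split], locations[:split]
eval_x, eval_y = inputs[split:], locations[split:]

train_error = mean_abs_error(train_x, train_y, train_x, train_y)
eval_error = mean_abs_error(train_x, train_y, eval_x, eval_y)
print("error on training data:", train_error)
print("error on held-out data:", eval_error)
```

The memorizer scores perfectly on the data it has already seen, even though the labels are pure noise, and only the held-out error reveals that nothing was learned.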
I’ve listened to engineers deal with the same thing on dynamic controls. Apply that first bit and keep running the tests, tweaking the math as you go. Take a break, go out and swing a saber, have some tea, have a brilliant thought, go back to it.
Lots of good drama there. AI work in the real world, Nagata’s “Edges” up next on the reading list. It’s a whole vibe. Sorry, please go on.
Looks like piezo-electric disks can be used to pick up vibrations.
Maybe I can use that to detect clashes. If sampled fast enough, I might be able to pick up the reflected wave and learn how far up the blade the clash happened.
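One possible reading of the reflected-wave idea, as a rough sketch: a sensor at the hilt hears the clash directly, then hears a second copy that travelled to the tip and back. The delay between the two would then encode the clash position. Everything here is an assumption: the simple one-dimensional model, the wave speed, and the blade length are placeholders for illustration, not measured values. The 200 kHz figure is the sample rate mentioned earlier in the thread.

```python
def clash_distance_m(delay_s, blade_length_m, wave_speed_m_s):
    """Distance from the sensor to the clash, given the delay between the
    direct wave and the reflection from the tip.

    Direct path: d / v. Reflected path: (2 * L - d) / v.
    Delay = 2 * (L - d) / v, so d = L - v * delay / 2.
    """
    return blade_length_m - wave_speed_m_s * delay_s / 2.0

def resolution_m(sample_rate_hz, wave_speed_m_s):
    """Change in clash position that shifts the delay by one sample."""
    return wave_speed_m_s / (2.0 * sample_rate_hz)

WAVE_SPEED = 2000.0   # m/s, placeholder; the real value must be measured
BLADE_LENGTH = 0.9    # m, placeholder

print(resolution_m(200_000, WAVE_SPEED))
print(clash_distance_m(0.0005, BLADE_LENGTH, WAVE_SPEED))
```

Under these placeholder numbers, one sample at 200 kHz corresponds to a few millimeters of blade, which shows why a fast sample rate matters for this approach.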
Wow…just reading this now. Thank you for pouring that much effort into a concept like this. Remarkable your vision is!