Hello there,
I am currently building a neopixel saber for my brother with a Proffieboard V2.2. So far, hardware installation and customizing ProffieOS 6.7 were really straightforward. However, today I started doing some stability tests on the saber chassis outside of the hilt and noticed that the sound suddenly goes mute during fast swings. To reproduce the issue, I extend the blade and do several fast consecutive swing motions; then the sound goes mute. The blade, or at least the neopixel LEDs on the internal chassis connector, continues to function, and the saber can still be controlled with both the pow and aux buttons. If I retract the blade using the pow button and then, after it has fully retracted, push the pow button again, the saber seems to work within normal parameters. At first I suspected a hardware issue, but then I noticed that the issue does not occur with one of the three sound fonts I installed on the saber. All three sound fonts have worked so far on another Proffieboard V2.2 based saber. The output in the serial monitor revealed nothing suspicious, but I have a feeling that some property of the sound font might be causing this issue: maybe an expected but missing file, or some specific property of the sound font wav or config files. I already tried to get more information by enabling the diagnostic and developer commands in the config and using the monitor command, but have not found a usable clue to the source of the issue. Do you have any hints on how to debug this?
Several hours of debugging later, I have strong indications that the issue is related to the low frequency cutoff filter. I had enabled this filter with two lines in my config to get rid of some crackling noises during overlapping smooth swing sounds, which come from a clearly overdriven speaker (although I only run it at a volume of 250 for testing purposes at the moment). If the filter is active, I am also able to reproduce the issue with the factory config in default_proffieboard_config.h and the default sound fonts. It does not, however, occur as quickly as with my custom sound fonts. If I do not activate the filter, I am not able to reproduce the issue. Maybe there is a bug in the filter implementation, or the CPU load of the filter has some unforeseen side effects. Unfortunately, deactivating the filter does not seem to be an option, since the crackling noises that occur without it sound like they could well damage the speaker.
Could you specify exactly what font causes it?
You mention default config and default sounds.
Is it possibly smoothswing pair01 in TeensySF?
Could you try this default package instead and see what results?
Additional testing suggests that it occurs with apparently any font that includes saber swing sounds. I was able to reproduce it at least with TeensySF, SmthJedi, SmthFuzz and RgueCmdr, using default_proffieboard_config.h extended only by the two lines for filter activation mentioned above. The issue occurs quite randomly in normal operation and is hard to reproduce. If you want to trigger it, I suggest you take a lightsaber chassis, hold it in the center between thumb and middle finger, and wiggle it back and forth at high speed. This way, I can reproduce the issue within seconds. In my last tests with default_proffieboard_config.h, I discovered a possibly related issue: the saber I am building has three blades (two for accents and stuff), while default_proffieboard_config.h has only a neopixel and a three CREE blade configured. When I try to trigger the issue as described above, my second accent LED sometimes fires up, switches to full white, and remains stuck there until I retract the blade. I am starting to suspect that the Proffieboard starts triggering all kinds of strange stuff when under heavy load. Maybe some buffer overflows in the code, or multithreading madness.
Interesting and unusual. I think some other people use the filters without hitting such issues.
Here is a list of possibilities:
1. The volume has been turned to zero somehow.
2. A bug in the filters is causing it to produce little or no output.
3. The amplifier is cutting out because of a short or something (and resetting when disabled).
4. Magic, cosmic radiation, gremlins.
A few serial monitor commands that can help:
a) get_volume: If it got muted somehow, this would show it.
b) dacbuffer (requires ENABLE_DEVELOPER_COMMANDS): This will show a small set of the samples that are being sent to the amplifier. Run this after having triggered the bug, but before retracting. If it’s a filter issue, this is expected to show all zeroes.
c) beep: If this works, but other sounds do not, then the problem is in the wav-playing code, not in the actual filter.
d) amp on / amp off: If this helps, then the problem is probably (3) above.
Hello! Thank you for your fast and expedient answer.
a) get_volume outputs 250, which is my configured VOLUME value
b) the command dacbuffer does not seem to exist. So I scanned the ProffieOS code and found the command buffered. It outputs:
Unit 0 Buffered: 512
Unit 1 Buffered: 0
Unit 2 Buffered: 512
Unit 3 Buffered: 512
Unit 4 Buffered: 0
Unit 5 Buffered: 0
Unit 6 Buffered: 0
c) beep (lol, nice close encounters reference) does not work, but works again once the blade has been retracted
d) amp on / amp off output nothing in the serial monitor and do not change the muted sound
Looks like my previous observation that it might be connected to the filter was not too far off.
This would certainly point to the filters (or something in that vicinity) doing something weird.
I can add some code to inspect the internal filter values.
It’s easier for me to add it to the github master. Do you know how to get and test code from there? (If so, would you mind just trying it to see whether it has the same problem or not?)
Getting the code from github is no problem. I just had a closer look at filters.h and dac.h. It could be filter_.clear(); in the begin function of LS_DAC getting mysteriously called a second time. However, the filter related code in that section did not look suspicious at first sight.
filter_.clear() shouldn’t make everything silent though, unless you call it a LOT.
It just clears out the state that persists between samples in the filter.
My guess is that something ends up as a NaN for some reason.
Adding a command that prints out all internal variables in the filter should tell us if that’s what’s going on. It’s possible that filter_.clear() would actually make it work again if that is the case.
It’s weird though, because the filter code is almost exclusively additions and multiplications, which usually don’t create NaNs from ordinary finite inputs. Normally only things like dividing zero by zero and taking square roots of negative numbers create NaNs.
In the meantime I found another command that might help us figure out what is going on: monitor samples This command will print out some statistics from the dynamic mixer twice per second, which might be interesting.
Well, I have a theory.
It seems like at very high internal volumes, the volume calculated by the internal mixer could overflow. This would cause the square root to produce NaNs, which would propagate into the filter and cause problems until clear is called.
I’m adding a filterdata command which will print out the data values inside the filter. It won’t exactly explain where the problem comes from, but if we see NaN values in there, I will try making some modifications to the dynamic mixer to fix it.
Actually, the fix might be as simple as changing int32_t to uint32_t on this line:
With an uint32_t, even if it did overflow, no NaNs would be produced.
In all instances I’ve seen of this, I’m pretty sure it’s always occurred with peak volumes. I wish I could remember where, or know where to search a previous instance or two to confirm that.
Looks like my theory is incorrect.
The numbers shouldn’t be able to add up enough to cause an overflow.
It’s still possible that a NaN is causing these issues somehow, but if so, I think it must be coming from somewhere else…
Good news: I tried your data type fix and so far I have not been able to reproduce the issue. If it still persists, it must take significantly longer to trigger. Of course this is no proof of it being a complete fix, so it would be nice to trace the issue to its origin. I will do some debugging this afternoon, but your filterdata debug command could come in handy in the future anyway. Regardless of that, I will report back if the issue shows up again. Thank you for the effort you put into this.
As for the unexpected LED triggering described earlier, I think this was only due to the second blade not being configured as neopixel at the time of testing. The corresponding data pin could have been floating, so the neopixel data line got fed with some uncontrolled garbage.
I just inserted some debug code into int read(float* data, int elements) which gets called instead of the integer version when FILTER_CUTOFF_FREQUENCY is defined. Just before sqrtf is called, I tracked the current and previous value of vol_ and the sum v. The following shows two occurrences of the issue in the serial monitor
As can be clearly seen in all cases, the subterm ((vol_ + abs(v)) * 255) in line 122 overflows 2^31 because of the multiplication by 255, consequently feeding values smaller than zero into the square root, which then causes an inescapable negative number frenzy. So, declaring vol_ as a uint32_t could indeed be a valid fix for this issue. However, I do not know whether it would be desirable to prevent such high values before this happens.
So my theory is right, even though I managed to convince myself that it can’t happen. Guess I’ll have to read the code again and see if I can figure out where I went wrong.
One possible way to fix this would be to replace this:
vol_ = ((vol_ + abs(v)) * 255) >> 8;
with:
vol_ += abs(v);
vol_ -= (vol_ + 255) >> 8;
Should be exactly the same thing, but without the multiplication by 255, and only costs one extra operation.
Hmm, I do not fully understand what the line does in terms of the filter (did not have time to read the paper) and I might interpret the rounding effects of right shift division wrong, but I have the impression the 255 should not be there:
vol_ += abs(v);
vol_ -= vol_ >> 8;
I guess there is no easy way to detect if the filter is working correctly, aside from the overflow bug.
Ah, I did not take the changed position of the rounding down shift division in the term into account. Thank you for enlightening me. I just tested the modified code and it seems to work.