Something like this could very well be going on.
It’s especially likely right now, because the chips aren’t easy to come by, which may mean that some of the chips are fakes or second-sort or something.
Is this something that could be easily tested?
No. If the error is different each time, then it’s almost impossible to test. If the error is the same each time, but randomly triggered, then it is possible to test, but it is still quite difficult, depending on how often it happens.
If we knew what the problem was exactly, then it would be much easier to test for.
Apparently, SanDisk changed stuff in 2020: the versions they distribute, and faster speeds.
Important Notice: SanDisk micro SD Card changes, Q4-2020 - OEMPCWorld.
Also,
SanDisk QuickFlow Technology
SanDisk developed a new technology it calls QuickFlow which allows it to leverage the original implementation of the SD Association’s UHS-I specification and incorporate enhancements that enable it to reach faster speeds.
Firstly, SanDisk designed custom firmware with detection schemes and adjustments to clock, bus, and card output timing. SanDisk says it samples data on both rising and falling edges of the clock and increases the clock frequency above 208 MHz to enable transfer speeds up to twice as fast as standard UHS-I cards. It combines this approach with faster memory core interfaces and a NAND memory with proprietary higher performance trims.
The result is a card that can perform faster under certain circumstances but is still backward compatible with current UHS-I host devices.
This approach is why the cards are only rated as V30 (another term explained in the memory card guide) despite the promise of speeds well beyond 30 MB/s: the cards will only be able to reach those maximum promised speeds when inserted into a compatible host device.
So obviously nothing a Proffieboard can take advantage of, but I wonder if some of their backward compatibility is goofed. If they use new firmware with detection schemes and adjustments to clock, bus, and card output timing… sounds like that could break a few things, eh?
Leave it to engineers to fix something until it’s broke.
QuickFlow is largely a marketing trick since it doesn’t work with anything but SanDisk SD card readers. Also, it’s now a couple of generations out of date. The new SD card spec is built on PCIe lanes, just like Thunderbolt, NVMe, USB4 and CFexpress.
Anyways, it is entirely possible that they goofed something up in the process, but the SPI access that Proffieboards use is so far removed from UHS-I that it’s not particularly likely IMHO.
Just to add to what’s been said above, there’s bad news but then what I think is very good news…
I’ve just installed a brand new board in a Thrawn KR kit, and it is the worst board I’ve yet had for exhibiting this problem. It only boots successfully in perhaps one in seven attempts using the main release 6.7.
But…
Last night I downloaded the pre-OS-7 version from GitHub, and although I haven’t done exhaustive testing and I don’t want to tempt fate, so far it’s worked flawlessly.
It’s probably a bit late to use this board to try to track the exact cause of the problem (although Fredrik, I’m happy to unpick it and send it over if you think it would be useful). If 7.x fixes it then I guess that’s enough, but I will be forever curious as to what actually caused this phenomenon and why it was so variable between different boards.
I’m just happy that people are seeing the same results that I have: The problem has gone away with OS7.
I have one board which was exhibiting this problem, however, the SD card kept getting corrupted while I was trying to figure out what the problem was, so I started by re-writing the save code. Once that was properly done, I re-tested, and the problem seemed to have disappeared, which was a bit of a surprise to me. (I had assumed that the sd card corruption was a separate problem.)
So either the sd card corruption IS the problem, or the problem got fixed by accident while I was fixing the save file problem… Hopefully it’s the first one, or the problem could come back just as quickly as it went away.
Out of curiosity, was there some part of the old save code you suspect may have been the culprit? Like what needed to be fixed in it?
I’m encountering a variation of the issue on OS7 now with an affected KR board.
Instead of getting “SD card not found” I’m getting a pause on boot and two “font directory not found”. Probably one for the font directory and the second for the common directory.
Strangely enough it seems to consistently do this on battery. When powered by USB I haven’t encountered it yet after a few boots.
That is a different problem, please start a new thread for it if you need help.
The old save code would create new files. If the board was powered off in the middle of creating these new files, the directory would get corrupted, which could cause all sorts of problems.
In the new code, files are created once, then re-written as needed. Any corruption that occurs should only affect that file, nothing else.
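To illustrate the difference, here is a minimal sketch of the "create once, then rewrite in place" idea. This is not the actual ProffieOS save code: it uses plain C stdio instead of the board's SD library, and `SaveRecord`, `EnsureSaveFile`, `Save`, `Load` and the checksum are all made-up names for illustration. The assumption is that overwriting a fixed-size record in an existing file doesn't need a new directory entry, so an interrupted write should at worst damage that one record, which the checksum then catches on load.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical fixed-size record; the size never changes between saves.
struct SaveRecord {
  uint32_t preset;
  uint32_t volume;
  uint32_t checksum;  // lets Load() reject a half-written record
};

// Simple illustrative checksum, not a real CRC.
uint32_t ComputeChecksum(uint32_t preset, uint32_t volume) {
  return (preset * 2654435761u) ^ volume ^ 0xA5A5A5A5u;
}

bool WriteRecord(std::FILE* f, uint32_t preset, uint32_t volume) {
  SaveRecord r = {preset, volume, ComputeChecksum(preset, volume)};
  return std::fwrite(&r, sizeof(r), 1, f) == 1;
}

// Create the file only if it does not exist yet (the "created once" part).
bool EnsureSaveFile(const char* path) {
  assert(path != nullptr);
  std::FILE* f = std::fopen(path, "rb");
  if (f) {
    std::fclose(f);
    return true;
  }
  f = std::fopen(path, "wb");
  if (!f) return false;
  bool ok = WriteRecord(f, 0, 0);
  return (std::fclose(f) == 0) && ok;
}

// Overwrite the existing record in place: "r+b" opens without truncating
// and without creating a new file.
bool Save(const char* path, uint32_t preset, uint32_t volume) {
  if (!EnsureSaveFile(path)) return false;
  std::FILE* f = std::fopen(path, "r+b");
  if (!f) return false;
  bool ok = WriteRecord(f, preset, volume);
  return (std::fclose(f) == 0) && ok;
}

// Returns false if the file is missing, short, or fails the checksum.
bool Load(const char* path, SaveRecord* out) {
  std::FILE* f = std::fopen(path, "rb");
  if (!f) return false;
  bool ok = std::fread(out, sizeof(*out), 1, f) == 1;
  std::fclose(f);
  return ok && out->checksum == ComputeChecksum(out->preset, out->volume);
}
```

If `Load` returns false, the caller would just fall back to defaults, so a bad save file only costs you the saved state and nothing else on the card.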
Ahh, smart! Aside from cases of when the battery is disconnected, is there a way to prevent the board from powering off while a write is in progress? I’m reminded of years ago when I was doing GPGPU programming in CUDA. In order to prevent data in a memory location from being changed while one core is in the middle of a calculation, we would have to put a lock on that element that would block other cores from changing it till it was released. Maybe something like that?
No.
Power is generally not controlled by the board.
It’s a kill switch or kill key that is doing it.
It would be possible to have the board control its own power, but …
A) it takes more components
B) it could be a nuisance if the board crashes
C) the STM32 BOOTLOADER wouldn’t know how to operate the power circuit
So I don’t think that’s a good idea.
So would the corruption be happening when people remove the battery or flip the kill switch?
Yes.
And the more things we save, the more likely it is to happen.
If I wanted to plop a capacitor into the works somewhere on an existing board, in order to provide a few seconds backup power, could that be done? How long would a backup even need to last?
It could be done, but it’s not entirely simple.
The software on the proffieboard has to have a way to know that the power has been cut so that it doesn’t start a new save operation. The board could also turn off sound and stuff when the power is cut to make the energy in the cap last longer. Depending on how the detection works, the cap would need to last somewhere in the 10-100 millisecond range.
Given that people could also just yank the SD card out at any time, or the proffieboard could crash, or a million other things could happen, it’s probably better/simpler to just make sure that the save code handles interruption at any time…
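For anyone curious what the capacitor approach would involve, here is a rough sketch of the two pieces described above: a latch that stops new saves once the supply voltage drops, and a back-of-the-envelope capacitor size. Nothing here is ProffieOS code. `PowerLossGuard`, `TrySave` and `RequiredCapacitanceFarads` are hypothetical names, the threshold is an arbitrary example, and the sizing uses the constant-current approximation C = I·t/ΔV, which ignores regulator behavior.

```cpp
#include <cassert>
#include <functional>

// Hypothetical guard: feed it periodic supply-voltage readings and ask it
// before starting a save.
class PowerLossGuard {
 public:
  explicit PowerLossGuard(float threshold_volts)
      : threshold_(threshold_volts) {}

  // Latches once any sample falls below the threshold; it stays latched
  // even if the voltage recovers, since the cap may just be draining.
  void OnSample(float volts) {
    if (volts < threshold_) power_lost_ = true;
  }

  bool power_lost() const { return power_lost_; }

  // Runs `save` only if power still looks good. Returns true if it ran.
  bool TrySave(const std::function<void()>& save) {
    if (power_lost_) return false;
    save();
    return true;
  }

 private:
  float threshold_;
  bool power_lost_ = false;
};

// Constant-current estimate: capacitance needed to supply `current_amps`
// for `hold_seconds` while the voltage sags by at most `droop_volts`.
double RequiredCapacitanceFarads(double current_amps, double hold_seconds,
                                 double droop_volts) {
  assert(droop_volts > 0.0);
  return current_amps * hold_seconds / droop_volts;
}
```

As an example with made-up numbers, holding 100 mA for 100 ms with 1 V of allowed sag works out to 0.1 × 0.1 / 1 = 0.01 F, i.e. 10,000 µF, which is part of why this isn't entirely simple to fit into a hilt. Note that a guard like this only stops new saves from starting; a save already in progress still has to survive interruption.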
Given that ProffieOS 7.x seems to have sorted the SD thing, can I just double-check:
I know 7.x handles save files differently to before, but am I right in saying that is an under-the-hood thing only?
So therefore am I right in saying that things like #define SAVE_STATE will still work the same as they did before from a user/config point of view?
Thanks as always.