I finally nailed converting 16-bit audio samples to 1-bit BTc sampling last night! I'm super excited, as my NES emulator has very good sound now.
I thought I would share it with the programmers here. The implementation is available in my version of the SDK, and it should be pretty easy to lift for other uses:
https://github.com/tswilliamson/PrizmSDK/tree/master/utils/snd/src
Using this solution, we can play back pretty decent quality audio on the Prizm! I think the various apps people have made to play sound and movies have mostly been tech demos, but they could still benefit if anyone is interested.
The basic concept: at a very high bitrate, the voltage on the serial port increases by a small amount with each 1 bit and decreases by a small amount with each 0 bit. Knowing this, you can choose bits predictively, based on whether a 1 or a 0 brings you closer to the desired waveform.
A VERY important point, which I will come back to later, is that the size of the increase or decrease depends heavily on what the current voltage already is. The higher your voltage, the less it increases with a 1, and the lower your voltage, the less it decreases with a 0. This is due to basic electrical properties of the line, much like a capacitor charging and discharging through a resistor.
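The voltage-dependent step can be sketched with a very simple model. This is only an illustration, not the code from snd_main.cpp: the normalized 0.0 to 1.0 range, the constant `kStep`, and the function names are all assumptions made up for the example.

```cpp
// Hypothetical model of the line voltage, normalized so 0.0 is the low rail
// and 1.0 is the high rail. kStep is an assumed constant, not a measured one.
constexpr double kStep = 0.02;

// Predicted voltage after sending a 1: move a fixed fraction of the
// remaining distance toward 1.0, so the step shrinks as voltage rises.
constexpr double voltageAfterOne(double voltage) {
    return voltage + (1.0 - voltage) * kStep;
}

// Predicted voltage after sending a 0: move a fixed fraction of the
// remaining distance toward 0.0, so the step shrinks as voltage falls.
constexpr double voltageAfterZero(double voltage) {
    return voltage - voltage * kStep;
}
```

With this model, a 1 sent at 0.9 raises the voltage far less than a 1 sent at 0.1, which is the asymmetry described above.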
To simulate this accurately, I send 460,800 bits per second to the serial port. This may seem like overkill, but at 60 FPS that's only about 1,920 bits per quarter frame (460,800 / 60 / 4), which is the update rate of the NES. With a fast enough loop, this only takes up a couple percent of my frame time.
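As a quick sanity check of that bit budget, here is the arithmetic written out. It assumes exactly 60 frames per second and counts raw line bits only, ignoring any start/stop framing the serial hardware may add; the constant names are made up for the example.

```cpp
// Assumed figures: the bit rate from the post, an exact 60 FPS, and four
// quarter frames per frame. Serial framing overhead is ignored.
constexpr int kBitsPerSecond = 460800;
constexpr int kFramesPerSecond = 60;
constexpr int kQuarterFramesPerFrame = 4;

constexpr int kBitsPerFrame = kBitsPerSecond / kFramesPerSecond;
constexpr int kBitsPerQuarterFrame = kBitsPerFrame / kQuarterFramesPerFrame;
constexpr int kBytesPerQuarterFrame = kBitsPerQuarterFrame / 8;

static_assert(kBitsPerFrame == 7680, "bits per frame");
static_assert(kBitsPerQuarterFrame == 1920, "bits per quarter frame");
static_assert(kBytesPerQuarterFrame == 240, "bytes per quarter frame");
```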
When I first implemented Roman Black's BTc algorithm, it unfortunately had a lot of distortion and buzzing. That led me to hack the crap out of it for the Prizoop release, resulting in tinny but passable audio. I implemented a table algorithm to approximate the bits needed to "move around" the waveform based on the needed offsets and called it a day.
I've been annoyed for the past 6 weeks trying to crack this problem for NESizm, and I came to two realizations: I can't use tables, because I need to simulate every single bit for quality, and I had to improve Roman Black's algorithm somehow to remove the distortion.
The key function in my final, much improved version is the BTC() function in snd_main.cpp. It takes a single waveform target amplitude and attempts to move the current voltage towards it one bit at a time. With the right tweaking, I got the main predictive element of Black's algorithm working pretty darn well, but it still had a lot of distortion and buzzing. You can see the pseudocode for this here under "Predictive BTc algorithm".
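The predictive step can be sketched roughly like this. It is a minimal illustration of the "pick whichever bit lands closer to the target" idea, not the actual BTC() function: the normalized voltage range, `kStep`, and the name `encodeByte` are all assumptions.

```cpp
#include <cmath>
#include <cstdint>

// Assumed fraction of the remaining distance covered by one bit.
constexpr double kStep = 0.02;

// Hypothetical encoder: produces 8 bits (most significant bit first) that
// steer `voltage` toward `target`, updating `voltage` as it goes.
std::uint8_t encodeByte(double target, double& voltage) {
    std::uint8_t out = 0;
    for (int bit = 0; bit < 8; ++bit) {
        // Predict where a 1 or a 0 would leave the voltage.
        const double up = voltage + (1.0 - voltage) * kStep;
        const double down = voltage - voltage * kStep;
        // Keep whichever prediction ends up closer to the target.
        const int useBit =
            (std::fabs(up - target) < std::fabs(down - target)) ? 1 : 0;
        out = static_cast<std::uint8_t>((out << 1) | useBit);
        voltage = useBit ? up : down;
    }
    return out;
}
```

Starting well below the target, every bit comes out as 1 until the voltage catches up; starting well above it, every bit comes out as 0.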
The reason for the buzzing/distortion is this: whenever you hold a level voltage on either side of the exact midpoint, you end up with an imbalanced number of 1's and 0's. For example, because going up is slightly harder than going down, you may end up with 12 1's for every 11 0's. This makes for a slight hitch every 23 bits, and a low-level buzz at 460,800/23, or approximately 20 kHz.
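The buzz frequency follows directly from how long the repeating bit pattern is. A tiny helper makes the arithmetic explicit; the function name is made up for the example.

```cpp
// Frequency at which a bit pattern of the given length repeats when bits
// are sent at bitsPerSecond. Hypothetical helper, for illustration only.
constexpr double patternFrequencyHz(int bitsPerSecond, int patternLengthBits) {
    return static_cast<double>(bitsPerSecond) / patternLengthBits;
}
```

For the 12-ones-to-11-zeros example, `patternFrequencyHz(460800, 23)` comes out just above 20,034 Hz, right at the upper edge of human hearing.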
The reason it happens so often is the prevalence of square waves AND how often there is no audio at all, which makes the converter try to hold the voltage low. In fact, square waves are the #1 sound generator on pretty much all early game consoles.
So, the key ended up being adding this code to my BTc conversion loop:
Code:
// an early out when voltage is reached that reduces distortion:
if ((bit & 1) == 0 && upVoltage > target && downVoltage < target) {
    // just maintain position with 1/0 flipping
    int useBit = (upVoltage - target < target - downVoltage) ? 1 : 0;
    btc |= useBit;
    bit++;
    while (bit < 8) {
        btc <<= 1;
        useBit = useBit ^ 1;
        btc |= useBit;
        bit++;
    }
    break;
}
What this code does is determine whether my current voltage has essentially "reached" the target voltage I am trying to simulate, meaning that the predicted voltages for going up and going down land on either side of my target. When this happens, and I have an even number of bits left to convert, I just flush the remaining bits with alternating 1's and 0's, regardless of where I am in the waveform.
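For anyone who wants to try the idea outside the conversion loop, here is a standalone sketch of the same check. It is a simplification: it only handles the case where the flush starts at the first bit of a byte, and the function name `tryFlushAlternating` and its signature are made up for the example.

```cpp
#include <cstdint>

// Hypothetical helper: if `target` lies strictly between the predicted
// down and up voltages, fill `btc` with alternating bits and return true.
// Otherwise leave `btc` untouched and return false.
bool tryFlushAlternating(int upVoltage, int downVoltage, int target,
                         std::uint8_t& btc) {
    if (!(upVoltage > target && downVoltage < target)) {
        return false;  // target not bracketed, keep predicting bit by bit
    }
    // Start with whichever bit lands closer to the target, then alternate.
    int useBit = (upVoltage - target < target - downVoltage) ? 1 : 0;
    std::uint8_t out = 0;
    for (int bit = 0; bit < 8; ++bit) {
        out = static_cast<std::uint8_t>((out << 1) | useBit);
        useBit ^= 1;
    }
    btc = out;
    return true;
}
```

Depending on which side is closer, the byte comes out as either 0xAA (10101010) or 0x55 (01010101).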
This forces the 1-bit audio into a 230 kHz signal, very far into the inaudible range, while the actual voltage slowly drifts towards the midpoint. This is perfectly fine, though, as the human ear only cares about *changes* in pressure. Once the voltage goes back to changing, everything quickly evens out and the simulation becomes accurate again. A nice side benefit is that these flushes happen very often, and flushing is much faster than calculating every bit!
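The drift towards the midpoint can be seen in a toy simulation. This uses an assumed model where each bit covers a fixed fraction (`kStep`) of the remaining distance to the rail, on a normalized 0.0 to 1.0 scale; none of these names or constants come from the real implementation.

```cpp
// Assumed fraction of the remaining distance covered by one bit.
constexpr double kStep = 0.02;

// An alternating 1/0 pattern repeats every 2 bits at 460,800 bits/s.
constexpr int kAlternatingFrequencyHz = 460800 / 2;

// Hypothetical simulation: apply `pairs` repetitions of a 1 bit followed
// by a 0 bit and return the resulting normalized voltage.
double driftWithAlternatingBits(double voltage, int pairs) {
    for (int i = 0; i < pairs; ++i) {
        voltage += (1.0 - voltage) * kStep;  // a 1 bit nudges toward 1.0
        voltage -= voltage * kStep;          // a 0 bit nudges toward 0.0
    }
    return voltage;
}
```

In this model, a few hundred alternating pairs pull the voltage to roughly the middle of the range no matter where it started, and the pattern itself sits at 230,400 Hz.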
I hope someone found this useful or interesting, as I did.