
Sonic Share is my entry for HackGT 9. The theme of this hackathon is retro.
The inspiration for this project comes from old dial-up modem internet. In the olden days, one would plug their corded phone line into a modem box, and data was transmitted as audio. It was a genius solution because it provided internet access over phone lines, which were already readily available back then, unlike dedicated internet cables. The phone modem is essentially a speaker and a microphone that "talk on the phone". Guess what also has a speaker and a microphone? Practically every modern device has a decent speaker and microphone. So why not bring this retro idea of data transmission back?
Obviously, it would be way too annoying if we brought it back as is. You would hear BRBRBRBZZZZZLLLLLLLLAAAAAAA all the time. But what if we keep the concept and reimagine the implementation? There is a narrow band of frequencies between 17 kHz and 20 kHz that most modern devices can produce but that most people cannot hear. That is essentially what I did: I created a data transmission protocol (and an implementation) that transmits data as audio in the 17 kHz to 20 kHz range.
The obvious approach is to transmit a single-pitch tone that switches on and off at a fixed rate, but tests I ran before the hackathon showed that to be completely unreliable. Besides, the signal-to-noise margin is very thin, and I have a wide band of frequencies available, so why not use it?
When I started working on it, I settled on four frequencies (17 kHz, 18 kHz, 19 kHz, and 20 kHz) to carry my data, with each frequency representing a single bit. Whether a specific frequency is on or off is determined by the bit it carries. The first signal has all frequencies on, and the subsequent signals are the data itself. I would use a 16-sample FFT to detect the rising edge of the first signal and a 128-sample FFT to extract the frequencies in subsequent signals. This would give me a theoretical bitrate of 44100/128 * 4 = 1378 bits per second.
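To make the four-tone idea concrete, here is a minimal sketch of one 128-sample frame being synthesized and read back. It is not the project's code: the names `synthesize_frame`, `tone_level`, and `detect_bits` and the 0.5 threshold are assumptions for illustration, and instead of a full FFT it correlates the frame against each of the four known frequencies (a single-frequency DFT), which keeps the example dependency-free.

```python
import cmath
import math

SAMPLE_RATE = 44100          # samples per second, as in the write-up
FRAME_SIZE = 128             # samples per data frame
TONES_HZ = (17000, 18000, 19000, 20000)  # one tone per bit


def synthesize_frame(bits):
    """Sum one unit-amplitude sine for every bit that is set (hypothetical helper)."""
    return [
        sum(math.sin(2 * math.pi * f * n / SAMPLE_RATE)
            for f, bit in zip(TONES_HZ, bits) if bit)
        for n in range(FRAME_SIZE)
    ]


def tone_level(frame, freq_hz):
    """Estimate the amplitude of one frequency via a single-frequency DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * n / SAMPLE_RATE)
              for n, s in enumerate(frame))
    return 2 * abs(acc) / len(frame)


def detect_bits(frame, threshold=0.5):
    """A tone counts as 'on' if its estimated amplitude exceeds the threshold."""
    return tuple(int(tone_level(frame, f) > threshold) for f in TONES_HZ)


frame = synthesize_frame((1, 0, 1, 1))
print(detect_bits(frame))
```

This only exercises a clean, synthetic signal. It does not model a real speaker, microphone, or room, which is where the trouble described next shows up.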
I implemented this idea, but soon realized a glaring problem: frequency interference. The transmitted frequencies are high enough to interfere with the 44100 Hz sample rate of the raw audio representation, and with each other. Imagine a 101 Hz signal that you sample 200 times per second. What you get looks like a 100 Hz signal whose volume is modulated by a 1 Hz envelope. With tones only 1 kHz apart, the interference ends up all over the place.
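The 101 Hz example can be checked numerically. The sketch below (variable names are my own) samples a 101 Hz sine 200 times per second and confirms that the samples are exactly a sign-alternating 100 Hz pattern multiplied by a 1 Hz sine envelope, since sin(πn + 0.01πn) = (-1)^n · sin(0.01πn).

```python
import math

SAMPLE_RATE = 200   # samples per second
SIGNAL_HZ = 101     # tone just above half the sample rate
ENVELOPE_HZ = 1     # slow modulation that appears after sampling

# One second of the sampled 101 Hz sine.
samples = [math.sin(2 * math.pi * SIGNAL_HZ * n / SAMPLE_RATE)
           for n in range(SAMPLE_RATE)]

# The same second, built as a 100 Hz alternation (+1, -1, +1, ...)
# scaled by a 1 Hz sine envelope.
modulated = [(-1) ** n * math.sin(2 * math.pi * ENVELOPE_HZ * n / SAMPLE_RATE)
             for n in range(SAMPLE_RATE)]

max_difference = max(abs(a - b) for a, b in zip(samples, modulated))
print(f"largest difference between the two: {max_difference:.2e}")
```

The two sequences agree up to floating-point rounding, so the slow "wobble" is a property of the sampling itself, not of the speaker or microphone.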
The solution was to send only a single frequency at a time, eliminating the interference between tones. That reduced the bitrate to 44100/128 * 2 = 689 bits per second. The frequency mapping is shown below.
17 kHz = 00
18 kHz = 01
19 kHz = 11
20 kHz = 10
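As a sketch of how this table could be used, the snippet below turns bytes into a sequence of tone frequencies and back. The function names and the choice to send the most significant bit pair first are assumptions, not something the protocol description specifies. The last lines check that neighboring frequencies differ in exactly one bit (a Gray-code ordering).

```python
# Two bits per symbol, using the table above.
FREQ_FOR_BITS = {0b00: 17000, 0b01: 18000, 0b11: 19000, 0b10: 20000}
BITS_FOR_FREQ = {freq: bits for bits, freq in FREQ_FOR_BITS.items()}


def bytes_to_freqs(data: bytes) -> list[int]:
    """Split each byte into four 2-bit symbols (MSB pair first, an assumption)."""
    freqs = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            freqs.append(FREQ_FOR_BITS[(byte >> shift) & 0b11])
    return freqs


def freqs_to_bytes(freqs: list[int]) -> bytes:
    """Inverse of bytes_to_freqs: pack every four symbols into one byte."""
    out = bytearray()
    for i in range(0, len(freqs), 4):
        byte = 0
        for freq in freqs[i:i + 4]:
            byte = (byte << 2) | BITS_FOR_FREQ[freq]
        out.append(byte)
    return bytes(out)


# Neighboring frequencies differ in exactly one bit.
ordered = sorted(BITS_FOR_FREQ)
bit_differences = [bin(BITS_FOR_FREQ[a] ^ BITS_FOR_FREQ[b]).count("1")
                   for a, b in zip(ordered, ordered[1:])]
print(bytes_to_freqs(b"\x1b"), bit_differences)
```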
I also decided to add a Hamming code to the transmission, specifically the (16,11) SECDED Hamming code: 15+1=16 total bits, 11 data bits, single error correction, double error detection. The mapping above was chosen because the codes of two adjacent frequencies differ by only a single bit, so mistaking a frequency for its neighbor causes only a single-bit error, which the Hamming code can correct. As a result, 16 bits are needed to transmit one byte (8 bits). This further decreased the maximum bitrate to 44100/128 * 2 / 2 = 344 bits per second. The bit layout is as follows.
DDDDDDDD111EEEEE
where
D = data
E = error correction parity
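A minimal sketch of a SECDED encoder and decoder that produces this 16-bit layout is shown below. Several details are assumptions on my part and may differ from the actual implementation: that the three `1` bits are fixed padding filling the unused data bits, the order of the parity bits within `EEEEE` (four Hamming parity bits followed by one overall parity bit), and which codeword positions the data bits are assigned to.

```python
# Positions 1..15 of a Hamming(15,11) codeword; powers of two hold parity.
DATA_POSITIONS = [p for p in range(1, 16) if p & (p - 1)]  # 3,5,6,7,9,...,15


def _syndrome(data_bits, parity_bits):
    """XOR together the positions of all set bits; 0 means 'consistent'."""
    syndrome = 0
    for pos, bit in zip(DATA_POSITIONS, data_bits):
        if bit:
            syndrome ^= pos
    for k, bit in enumerate(parity_bits):  # parity bit k sits at position 2**k
        if bit:
            syndrome ^= 1 << k
    return syndrome


def encode_byte(byte: int) -> int:
    """Return the 16-bit word DDDDDDDD111EEEEE for one byte (assumed layout)."""
    data11 = (byte << 3) | 0b111
    data_bits = [(data11 >> (10 - i)) & 1 for i in range(11)]
    needed = _syndrome(data_bits, [0, 0, 0, 0])
    parity_bits = [(needed >> k) & 1 for k in range(4)]
    overall = (sum(data_bits) + sum(parity_bits)) & 1  # makes total parity even
    word = data11
    for bit in parity_bits + [overall]:
        word = (word << 1) | bit
    return word


def decode_word(word: int):
    """Return the byte, fixing one flipped bit; None if two bits were flipped."""
    bits = [(word >> (15 - i)) & 1 for i in range(16)]
    data_bits, parity_bits = bits[:11], bits[11:15]
    syndrome = _syndrome(data_bits, parity_bits)
    overall_ok = sum(bits) % 2 == 0
    if syndrome and overall_ok:
        return None  # double error: detectable but not correctable
    if syndrome in DATA_POSITIONS:
        data_bits[DATA_POSITIONS.index(syndrome)] ^= 1  # repair the data bit
    data11 = int("".join(map(str, data_bits)), 2)
    return data11 >> 3


word = encode_byte(ord("A"))
print(decode_word(word ^ (1 << 9)) == ord("A"))  # one flipped bit is repaired
```

If the single flipped bit lands on a parity bit, the data bits are already intact, so the decoder returns them unchanged.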
However, problems kept coming. This method was good enough for my computer, which has a high-end microphone and speaker, but mobile devices like my phone had a hard time decoding the signal. So I increased the size of the data extraction FFT from 128 to 256 samples, which halved the bitrate again to ```44100/128 * 2 / 4 = 172``` bits per second. I then added transitions from one signal point to the next, making the high-frequency switching noise significantly less noticeable. This further decreased the bitrate from 172 to 150 bits per second. The size of the transmission is also sent up front as a two-byte integer. This, along with the Hamming code decoder, allows the receiver to decide whether the signal is complete or not. This is the final iteration of the communication protocol.
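The bitrate figures quoted at each step all follow from one formula: frames per second, times bits per frame, times the fraction of bits that carry payload. The helper below is a hypothetical illustration of that arithmetic. It does not model the transition time between signal points, so it stops at 172 rather than the final 150.

```python
SAMPLE_RATE = 44100  # samples per second


def bitrate(fft_size: int, bits_per_frame: int, code_rate: float = 1.0) -> float:
    """Payload bits per second for one frame of `fft_size` samples."""
    return SAMPLE_RATE / fft_size * bits_per_frame * code_rate


steps = {
    "four tones, one bit each":       bitrate(128, 4),
    "one tone, two bits per frame":   bitrate(128, 2),
    "plus Hamming code (8 of 16)":    bitrate(128, 2, code_rate=8 / 16),
    "plus 256-sample FFT":            bitrate(256, 2, code_rate=8 / 16),
}

for name, value in steps.items():
    print(f"{name:32s} {int(value)} bits per second")
```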
The UI is fairly simple (I ran out of time).
I built two pages: one to send a transmission, with the option of looping it, and one to receive a transmission, with the option of copying it to the clipboard and opening it in a browser.
There are many challenges: