
At first I thought the file sharing itself was over sound, but the sound is just to negotiate details of the WebRTC session which is then actually used to transmit the data. Neat and handy.

I was thinking they had made something similar to the Fldigi suite, software kind of common in the ham radio universe. You can use it to encode arbitrary binary data into a lot of different modulations which exist in the audible range. Send files between computers without any IP networking at all.

https://en.wikipedia.org/wiki/Fldigi#The_Fldigi_Suite



There do seem to be a few projects related to this on GitHub if you browse the tag. Enough that there are firewall apps to counter non-typical network telemetry. Allegedly, Google, Amazon, etc. are using this. Interesting one way or another.

See here: https://github.com/fhstp/SoniControl

Seems massively not talked about given the vector.


I wonder if it’s related to the ad-tech that was embedding “sonic markers” in TV/YouTube ads, to be picked up by ad-tech SDKs embedded in apps.

If someone had granted microphone access, the SDK could wait until it hears a signal embedded in an ad, and then pass back whatever data it had accumulated.


Shazamified cookies?


Watch out, every time someone speaks a startup idea, one pops into being.

In a few years, we'll all be hearing about hot new "Shazam for Adtech" startups.


The moment I saw something along the lines of "use Shazam on this ad spot to find out more" I was pretty certain where they were heading.


Sounds like something Alexa might listen for.


I'm pretty sure Alexa already does this so it doesn't activate from Alexa commercials.

(Although that does still happen to me occasionally.)


What throughput can you get with audible sound?


I estimate up to somewhere in the range 100kbit/s to 500kbit/s.

It would take a sophisticated modulation scheme, like a modem. The sound would be similar to white noise. Software that's getting 20bit/s or whatever is using old-school tone-based modulation, but it is quite robust in the presence of other sounds.

That assumes:

- Frequency response ~20kHz.

- Audio ADC/DAC sample rates configurable well in excess of the Nyquist rate for the frequency response range of the speakers and mics.

- Good signal to noise (~90dB), which equates to ~15 bits at max volume range.

- Not playing at max volume, but a reasonable level ~18dB down, so ~12 bits.

- A very quiet environment, or one where the background sound is very predictable.

- Stereo laptop speakers and stereo mics, to make a 2x2 MIMO spatially modulated channel.

- Good channel separation (~80dB).

- Great linearity, which might be optimistic.
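
The dB-to-bits figures in the list above follow from the rule of thumb that each bit of resolution is worth about 6.02 dB of dynamic range. A minimal sketch of that arithmetic (the function name `db_to_bits` is made up for illustration, and this ignores quantization-noise fine print):

```python
import math

def db_to_bits(snr_db: float) -> float:
    """Equivalent resolution in bits for a dynamic range given in dB.

    Each extra bit doubles the amplitude range, and a factor of 2 in
    amplitude is 20*log10(2), roughly 6.02 dB.
    """
    return snr_db / (20 * math.log10(2))

full_scale_bits = db_to_bits(90)       # the "~15 bits" case
backed_off_bits = db_to_bits(90 - 18)  # playing 18 dB below max volume

print(f"{full_scale_bits:.1f} bits at max volume")
print(f"{backed_off_bits:.1f} bits at 18 dB down")
```

So backing off 18 dB costs about 3 bits, which is where the drop from ~15 to ~12 comes from.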


Because of this I just figured out why white noise is truly random and can’t communicate any data. I wasn’t thinking of it intuitively that way before, even though I knew conceptually about randomness, entropy, and cosmic background radiation.

Thanks!

Interesting...

If white noise “exists” and is constantly being received does it have an energetic value? Hmm.


A lot of the attributes you describe are the attributes that humans have for hearing and perception as a range, no?

Interesting if there is a particular reason our bodies converged to this state because of a link to sound and resiliency to information communication. I need to think on that one.


Some of the attributes are from MacBook Pro specs :-)

It's not a coincidence that human range is similar, as the Mac was designed for humans.


Thinking of dialup networking, I guess somewhere around 56 kilobits per second is certainly feasible? hehe :)


Probably more. The phone network had a bandwidth of roughly 300-3400 Hz, I think. The audible spectrum is larger.


True, but roundtripping speaker -> microphone is lossier than a cable carrying electrical signals.


I was thinking the same. Minimodem[1] turns devices into FSK communication devices. Analogous to transferring information via dial-up.

[1] https://github.com/kamalmostafa/minimodem
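
For anyone curious what FSK looks like in code, here is a toy sketch in plain Python: each bit becomes a short burst of one of two tones, and the receiver picks whichever tone is stronger in each symbol window (using the Goertzel algorithm). This is not minimodem's actual implementation. The sample rate, baud rate, tone frequencies, and function names are all assumptions picked for illustration, and there is no framing, synchronization, or error correction.

```python
import math

SAMPLE_RATE = 48_000               # samples per second (assumed)
BAUD = 100                         # symbols per second (assumed)
F_MARK, F_SPACE = 1200.0, 2200.0   # tones for bit 1 and bit 0 (assumed)
SAMPLES_PER_SYMBOL = SAMPLE_RATE // BAUD

def modulate(data: bytes) -> list[float]:
    """Turn bytes into audio samples, one tone burst per bit (LSB first)."""
    samples = []
    phase = 0.0
    for byte in data:
        for bit_index in range(8):
            bit = (byte >> bit_index) & 1
            freq = F_MARK if bit else F_SPACE
            for _ in range(SAMPLES_PER_SYMBOL):
                samples.append(math.sin(phase))
                phase += 2.0 * math.pi * freq / SAMPLE_RATE  # continuous phase
    return samples

def tone_power(block: list[float], freq: float) -> float:
    """Goertzel algorithm: power near `freq` within one block of samples."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / SAMPLE_RATE)
    s_prev, s_prev2 = 0.0, 0.0
    for x in block:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2

def demodulate(samples: list[float]) -> bytes:
    """Recover bytes by comparing mark and space power in each symbol window."""
    bits = []
    for start in range(0, len(samples) - SAMPLES_PER_SYMBOL + 1, SAMPLES_PER_SYMBOL):
        block = samples[start:start + SAMPLES_PER_SYMBOL]
        is_mark = tone_power(block, F_MARK) > tone_power(block, F_SPACE)
        bits.append(1 if is_mark else 0)
    out = bytearray()
    for i in range(0, len(bits) - 7, 8):
        out.append(sum(bit << k for k, bit in enumerate(bits[i:i + 8])))
    return bytes(out)

audio = modulate(b"hello")
print(demodulate(audio))
```

With these made-up parameters the raw rate is 100 bit/s. A real speaker-to-microphone path would also need symbol timing recovery and some tolerance for echoes and noise, which this loopback skips entirely.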


I was originally thinking the post was more like what you shared.

In theory minimodem could be compiled to WebAssembly, right?


For the FSK modulation scheme that I use in wave-share, I manage to achieve 8-16 bytes/s at reasonable in-room distances with regular surrounding noise. It also depends on the speaker and microphone quality, but overall, if you want a reliable connection between air-gapped devices, the speed is nowhere near what modems can achieve. Or at least this is my experience.
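
To put 8-16 bytes/s in perspective, a back-of-the-envelope sketch. The payload sizes are arbitrary examples, not measurements from wave-share, and the helper name is made up:

```python
def transfer_seconds(payload_bytes: int, bytes_per_second: float) -> float:
    """Time to move a payload at a constant rate, ignoring any protocol overhead."""
    return payload_bytes / bytes_per_second

# Hypothetical payloads: a short token, a 1 KiB blob, a 1 MB file.
for size in (64, 1024, 1_000_000):
    fastest = transfer_seconds(size, 16)
    slowest = transfer_seconds(size, 8)
    print(f"{size:>9} bytes: {fastest:,.0f} to {slowest:,.0f} s")
```

A kilobyte takes on the order of a minute or two, which is fine for exchanging a few session details but not for the files themselves.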


I need to study more. This brings up interesting questions for me on the ultrasound spectrum and others related to information loss. I hear bass more clearly over distances that are noisy but I hear high pitches over distances without noise. Speakers have filters on them to prevent transmission at particular frequencies. I’m unsure what determines quality on mics.

Cool stuff to understand. Thanks. Now I must learn more.

Also, weird: https://en.m.wikipedia.org/wiki/Parametric_array

How do I reason about the range of frequencies that air itself has a threshold of vibration? There’s a particular range within that type of matter itself I’d imagine. Sound is only a human description of a waveform perturbation within the air... I’d imagine if the air vibrates too much it could explode or some other effects.


Wouldn’t it be based on a few factors? Rate, volume, etc? Not too different than radio at some level? I could be totally wrong. Guessing here.

If it’s in the audible range, would a faster song have more data capacity? Techno vs hip hop? Ha.


Radio waves are electromagnetic while sound waves are a physical wave. I assume that having to actually move the air molecules would require a lot more power and time than electrons in an antenna as well as having to deal with air density and environmental sound interference. I would be interested in knowing the limit of bandwidth here though.

It also brings up an interesting thought I never had before. With early home computers, and I assume before that as well, you would place a phone handset onto a coupler to receive data. It is odd to think now that there is a little air in there that is transferring sound between the handset speaker and the coupler microphone. I assume that without that little bit of air it wouldn't function.


How are electromagnetic waves not physical?

Also, if air was such an issue, then the dynamics of a song being played live would be a series of inconsistent patterns, leading to errors in rate based on environmental factors. The rate would be whatever air density allows, but whether you received that rate is a different matter.

Song recorded vs song played vs song heard when recorded vs song heard when played over speakers.

The waveform of both should be highly similar in a way that wouldn’t be considered error besides amplitude or echoes from acoustics.

So it’s a distributed message sensing problem?

In my perspective I just saw the waveform of a techno song == the rate of data being communicated.

120 bpm is 120 bpm. Loss over distance is its own factor based on ability to ack.


Physical waves need a medium for transfer and are affected by the density and properties of the material(s) involved.


What isn’t a medium? A vacuum?

Physical waves depend on a fluid?

EM seems to be affected by a medium/fluid.

Wonder where the distinctions are more pronounced. Maybe in the degree of the effects?


All you need to know is the channel bandwidth (say 20-50 kHz) and the noise floor power relative to the receive power to get the maximum data transfer rate (Shannon-Hartley theorem): Capacity = Bandwidth * log2(1 + receive_power / noise_floor_power). There is also a lot of EM noise going on, just like with sound. I would expect sound to be many orders of magnitude less power efficient per bit than EM, but the principles are basically the same.
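
Plugging numbers into that formula is straightforward. A minimal sketch, where the 20 kHz bandwidth and the SNR values are assumptions picked for illustration rather than measurements of any real speaker and microphone pair:

```python
import math

def shannon_capacity(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon-Hartley upper bound on bit rate, in bit/s.

    snr_db is receive power over noise floor power, expressed in dB.
    """
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)

BANDWIDTH_HZ = 20_000  # assumed usable audio bandwidth

for snr_db in (20, 40, 60):  # assumed signal-to-noise ratios
    capacity = shannon_capacity(BANDWIDTH_HZ, snr_db)
    print(f"{snr_db} dB SNR: about {capacity / 1000:.0f} kbit/s")
```

These are theoretical ceilings for a single channel, and they land in the same ballpark as the 100 to 500 kbit/s estimated earlier in the thread. Real modulation schemes get some fraction of this.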


Thanks for adding color to the thoughts there. Appreciate it. Now I must go figure out intuitively why log is baked into the universe as a mathematical property...


Not a lot. I'm not sure of where these things usually top out but at least for the software linked above the normal codecs are often around 20 baud. They're intended to be used with high-frequency radio transmissions, more so for keyboard to keyboard typing interfaces rather than arbitrary digital data but they can easily be used for any arbitrary data.

I've used it on a few occasions to send out a text email or some other kind of small text document. Sending out something like a several megabyte image would be very slow.


I would expect most technologies to be capped at 56k; otherwise using a phone line for transmissions would circumvent FCC rules on transmission power.


Transmission power hasn't been the cap for a long time.

The analogue phone line signal is digitised at the exchange, and the digital channel is explicitly 64kbit/s.

Modems can only do "56k" by co-operating with the digital system; one of the modems actually has a digital ISDN connection.

Without that ISDN connection, analogue modems reach up to 33.6k. Theoretically they can do more (but never more than 64k), but in practice that was the last standard produced prior to the 56k, semi-analogue-semi-digital standard.
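
The 64 kbit/s ceiling is just the sampling arithmetic of the digitised voice channel: 8,000 samples per second at 8 bits each. A small sketch of that calculation. The 7-bit line is only an arithmetic observation that 56k matches 7 usable bits per sample, not a full account of how the 56k standards work:

```python
SAMPLES_PER_SECOND = 8_000  # telephony PCM sampling rate
BITS_PER_SAMPLE = 8         # one PCM code word per sample

def channel_bit_rate(samples_per_second: int, bits_per_sample: int) -> int:
    """Raw bit rate of a sampled channel in bit/s."""
    return samples_per_second * bits_per_sample

print(channel_bit_rate(SAMPLES_PER_SECOND, BITS_PER_SAMPLE))  # 64000
print(channel_bit_rate(SAMPLES_PER_SECOND, 7))                # 56000
```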



