Let's Design and Build a (mostly) Digital Theremin!

Posted: 7/9/2020 8:55:16 PM

From: Northern NJ, USA

Joined: 2/17/2012

"The real vowels sound a bit breathy, though. Could make the picture somewhat less clear for Praat, I guess?"  - tinkeringdude

Full disclosure: I'm just flailing around in Praat.  For this one note sample it's using very sparse data (~soft sawtooth harmonics, though supposedly sufficiently characterized at this point by the research community?) to determine the lower formants.  And the upper formants are largely obscured / attenuated by the "ooo" closing of the mouth.  That said, I've unfortunately never been able to directly use the formant pitch numbers coming out of Praat, which shakes my faith in both it and me.

"Also, after an octave or especially more above your voice's "resting pitch", if it was roughly there (though you started a bit lower and then shifted up 2 half steps or so), I'd expect to the vowels becoming less clear.  Try to actually vocalize it at that pitch."

Yes, lower notes are definitely more revealing, vowel-wise. 

One thing I'm starting to do with every preset that I have access to real samples for is to try to audibly match the patch with the sample at various pitches - an octave seems to work pretty well.  I didn't do that here, but it would likely help.

"It's a known effect among singers, although I'm not sure whether it's because of actual changes made in vocal setup to reach the high notes that cause it, or just that a smaller number of (humanly audible) harmonics is "scanning" the voice's response curve, as it were, making things less discernable.  At some point the first formant might also be skipped by the fundamental frequency for some vowels?"

I think the effect you are describing is what enables some simple waveshaping analog Theremins to give a passable higher register female vocal sound.  The stimulus is above the lowest formant or two, leaving little for the ear to run afoul of.

Posted: 7/11/2020 7:17:00 PM

From: Northern NJ, USA

Joined: 2/17/2012

"Now I Understand"  - oldtemecula

I watched another video of it (https://www.youtube.com/watch?v=KE6A1gYRnxk) and it wasn't making the 'b' sound with its lips, which made me suspicious.  Then I watched another video (https://www.youtube.com/watch?v=eXolYkLsVdg) as well as the origin page (https://theprayer.diemutstrebe.com/) both of which lead me to believe it's just lip syncing to audio generated elsewhere, not in the rubber vocal tract. 


[EDIT] Aaaand - yet another hit-and-run post by the troll.  Please keep your bad behavior out of this thread.

Posted: 7/11/2020 8:21:43 PM

From: Northern NJ, USA

Joined: 2/17/2012

All Formants Articulated

While working on the bass clarinet patch, which relies on tracking formants to accentuate the lower odd harmonics and suppress the even, it seemed a bit constricting to only have 4 of them articulated (able to modulate their cutoff frequencies with volume and/or pitch axis operating points).  Up to now there have been 4 articulated & 4 fixed, with each fixed formant sharing a common resonance (~Q) control with one articulated, forming a pair.

This morning I looked into sharing articulation between the paired formants as well.  Thread 5 handles the first 4 formants, and it was only using 294 out of a maximum of 468 cycles.  Thread 6 handles the second 4, and it was using 352 cycles.  Articulation adds ~50 cycles to each fixed formant, which puts thread 6 at 452 cycles, really close to the max.  So I made the edits for thread 6 first, uploaded them, and checked the register that flags missed IRQ returns, which showed no errors.  So I went ahead and made the edits for thread 5, which are safer as it has more real-time overhead.
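The cycle budgeting above can be sketched as a couple of compile-time helpers.  This is only an illustration of the arithmetic: the helper names are made up, and it assumes each thread carries two of the formerly fixed formants (which is what the jump from 352 to 452 cycles at ~50 cycles each implies).

```cpp
// Hypothetical helper names; the cycle counts are the ones quoted in the post.
constexpr int kMaxCyclesPerThread = 468;  // per-thread real-time budget
constexpr int kArticulationCost   = 50;   // approximate extra cycles per formant

// Projected cycle count after articulating `fixed_formants` more formants.
// Assumption: the cost is the same ~50 cycles for every formant.
constexpr int projected_cycles(int current_cycles, int fixed_formants) {
    return current_cycles + fixed_formants * kArticulationCost;
}

// Remaining headroom; a negative value would mean missed IRQ returns.
constexpr int headroom(int cycles) {
    return kMaxCyclesPerThread - cycles;
}
```

With two formants per thread, thread 6 goes from 352 to 452 cycles (16 to spare) and thread 5 from 294 to 394 (74 to spare), which is why thread 6 was the one worth testing first.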

Here is a sample of 8 pitch axis articulated formants doing all harmonics, followed by the formants set to odd harmonics, both stimulated with lower frequency noise bursts: [MP3].  This was just a test to make sure everything is working OK - the all harmonic case sounds faintly like a human vocal.

These changes shouldn't affect any patches other than some of my bells / glasses which use slightly detuned formants to give a beating effect, which I may have to edit to get back.  These changes should also enable the full "eerie" patch, which relied on 4 formants going higher and 4 lower, interleaved, should I be able to find that again.

On an unrelated note, yesterday I did a quick recording of all the FM patches to send to someone: [MP3] - no outboard effects or reverb were added.

[EDIT] While I was in there I also extended the octave switch range from [-4:3] to [-5:4].  The extra low end is useful for things like ringing bells.

Posted: 7/16/2020 10:08:43 PM

From: Northern NJ, USA

Joined: 2/17/2012

Eerie2: The Eerieing

Wanting my old "eerie" patch back, I compiled an old Hive sim, ran my old software through it to assemble it, and uploaded it along with the old patch to the prototype to see what was going on with the parameters.  Wrote it all down & sampled various sections in isolation.  Then uploaded the current software back in and played around with things for several hours.  See what you think: [MP3].  I think it's more satanic sounding than the original, so yay! 

Here's the patch in the librarian:

The sound source is very low frequency oscillators slightly detuned and mixed with noise.  Both have articulated filters, the oscillator a 2nd order BP which tunes down with increasing pitch, the noise a 4th order BP which tunes up.  This goes through 8 fairly resonant pitch articulated formants, interleaved so every other one goes in a different direction with pitch hand movement.  It's also processed by a fairly resonant inharmonic resonator, which gives it a bit of a vocal sound and some ambience via pseudo stereo.  A sharp knee and a bunch of velocity make the volume side peg easily, and some decay here makes it sound more vocal too. Minimum damping / vloc lets us chop the decay off by moving the volume hand significantly outside the normal field.

I can't get the "scraping" sound any more by turning the velocity, as I installed a LPF on this a while back to "fix" exactly that, but some emergent features inevitably hit the big recycling bin in the sky during development.  Which is too bad, as they often provide some added character in certain very limited patch scenarios, but you can't stop progress.

Posted: 7/19/2020 7:38:23 PM

From: Northern NJ, USA

Joined: 2/17/2012

To Sleep(1), Perchance To Actually Do So For ~1ms (In Win32)

I finally tracked down the serial port and UI slowness in my Win32 C++ code.  Turns out the Win32 Sleep() function, which takes a millisecond parameter as input, doesn't actually go down to ~1ms delay unless you invoke timeBeginPeriod(1) before using it.  Without that call you get something like 16ms when asking for 1ms, and I believe this is the thing that's been plaguing me literally for years on the Win console.

Here are the results of some testing, the first without timeBeginPeriod(1), the second with:

sleep for 1ms = 15623us
sleep for 2ms = 22281us
sleep for 4ms = 15622us
sleep for 8ms = 15617us
sleep for 16ms = 31256us

sleep for 1ms = 1696us
sleep for 2ms = 2413us
sleep for 4ms = 4899us
sleep for 8ms = 8978us
sleep for 16ms = 16267us

I know one shouldn't be using the OS thread tick for precision timing, and so 1ms constitutes something of an abuse of the system, but come on, even Sleep(16) is way off without this mystery function.  And thread sleeping seems to work just fine on Linux without any additional fancy footwork.
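For anyone wanting to repeat the measurement, here is a minimal sketch of the kind of test involved.  It is not the actual test code: `TimerResolution` and `measure_sleep_us` are hypothetical names, and the non-Windows branch simply falls back to std::this_thread::sleep_for.  On Windows, timeBeginPeriod() and timeEndPeriod() need the winmm library at link time.

```cpp
#include <chrono>
#include <thread>
#ifdef _WIN32
#include <windows.h>
#include <mmsystem.h>  // timeBeginPeriod / timeEndPeriod (link with winmm)
#endif

// Requests a finer timer tick for as long as the object lives (Windows only).
// On other platforms this is a no-op.
class TimerResolution {
public:
    explicit TimerResolution(unsigned ms) : ms_(ms) {
#ifdef _WIN32
        timeBeginPeriod(ms_);
#endif
    }
    ~TimerResolution() {
#ifdef _WIN32
        timeEndPeriod(ms_);  // every timeBeginPeriod needs a matching call
#endif
    }
    TimerResolution(const TimerResolution&) = delete;
    TimerResolution& operator=(const TimerResolution&) = delete;
private:
    unsigned ms_;
};

// Sleeps for `ms` milliseconds and returns how long it really took, in us.
long long measure_sleep_us(unsigned ms) {
    const auto t0 = std::chrono::steady_clock::now();
#ifdef _WIN32
    Sleep(ms);
#else
    std::this_thread::sleep_for(std::chrono::milliseconds(ms));
#endif
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
}
```

Calling measure_sleep_us() for 1, 2, 4, 8 and 16 with and without a `TimerResolution guard(1);` in scope should show the difference on a Windows machine.  A single reading per value is noisy, so averaging a few dozen runs would give a fairer picture.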

And holy mother of god, it took me like a day to find this info on the web.  Here's the straightest, clearest poop I could find on this subject (warning: old school web page ahead):


Posted: 7/20/2020 1:42:38 AM

From: Germany

Joined: 8/30/2014

I am wondering now though, what scenario in serial port handling (within application code), or UI code, needs that time granularity? I don't recall ever having to mess with that.

This software only sends parameters to the device, which then does its thing independently?
Or does it actually do some real-time controlling for functionality not (yet) in the device itself?

On Linux, time measurement may be of finer granularity without special tricks, and you might get on-average switching times between system and user code much smaller than on Windows - but not as a maximum. Maxima may easily be tens of milliseconds also, so you can't rely on a reaction time that you "mostly" see, if staying below a certain time is important. 
Outliers can be made a lot more rare with per-thread scheduler settings, but not zero with the mainline kernel.

Posted: 7/20/2020 3:01:07 AM

From: Northern NJ, USA

Joined: 2/17/2012

"I am wondering now though, what scenario in serial port handling (within application code), or UI code, needs that time granularity? I don't recall ever having to mess with that."  - tinkeringdude

I'm polling both the PC keyboard and the USB serial port.  I looked into various things like curses, but they were either ancient, wanted to control too many things, weren't cross platform, or were too complicated to pick up for a casual project.  I don't really like it, but here I am, doing what everyone says you shouldn't, but with no good alternatives that I'm aware of.  The thing is, programming is so much more than knowing the language.  A lot of it is being intimately familiar with libraries, GUI building tools, IDEs, compiler and linker switches, etc., and a working awareness of the various ecosystems and options can require immense knowledge of the zeitgeist.  I can barely handle a simple compile in Geany so I don't dare get too fancy.

"This software only sends parameters to the device, which then does its thing independently?"

Yes.  But it can also upload SW, and can up/down load bulk presets, and even with the port set pretty fast it's kinda slow at those things.

"Or does it actually do some real-time controlling for functionality not (yet) in the device itself?"

No.  Well, polling the PC keyboard is kinda real-time, and polling the serial port definitely is.  Though I'm using data buffers as much as I can.

"On Linux, time measurement may be of finer granularity without special tricks, and you might get on-average switching times between system and user code much smaller than on Windows - but not as a maximum. Maxima may easily be tens of milliseconds also, so you can't rely on a reaction time that you "mostly" see, if staying below a certain time is important.  Outliers can be made a lot more rare with per-thread scheduler settings, but not zero with the mainline kernel."

I just ran the same code on Linux Mint (and of course without using timeBeginPeriod(1) as that's a Win32 function):

sleep for 1ms = 1206us
sleep for 2ms = 2113us
sleep for 4ms = 4201us
sleep for 8ms = 8123us
sleep for 16ms = 16095us

The data seems somewhat better than my Win10 machine, though I suppose one would have to do statistical analysis to know for sure.  If the delay maximum is occasionally much larger it isn't a problem for my code, but if I'm asking for 2ms and it's always giving me >10x then that's a very noticeable slowness problem, and I honestly for the longest time thought it was some serial port setting like the inter-character latency timer they love to crank way up as a default.  Maybe it shouldn't have, all things considered, but this issue really blindsided me.  If Sleep() can't do 1ms ballpark reliably, they should set the granularity to whatever the minimum reliable grain is.  Or they should invoke the special function for me if I'm asking too much of the basic function.  This is one of those "why did it take you all day to write one line of code?" situations that take all the fun out of coding.  C++ is like a minefield.

I have to say that I really get why Tera Term, every other terminal emulator, and particularly audio applications, are as strange as they are.  You have to take the bull by the horns at some level and do sorta real-time stuff, with the language and the OS fighting you every step of the way.

Posted: 7/20/2020 2:24:54 PM

From: Northern NJ, USA

Joined: 2/17/2012

War Of The Worlds (1953)

Playing around just now I found this voice, which sounds a lot like the alien spaceship weapon in that film: [MP3].  It's just noise with some decay, fed through four volume axis modulated formants.  Significant knee & velocity allow me to retrigger it fairly quickly and easily.

Posted: 7/20/2020 3:29:16 PM

From: Germany

Joined: 8/30/2014

IIRC there should be a way of calling a blocking read function which will return when there is keyboard input, leaving your CPU idle as long as nothing happens.
Also similarly, it should be possible to request to read a certain expected number of bytes from a serial port via an OS API call, with an adequate timeout value, which will be buffered, and the system returns control to your thread when it's either available or timed out, with a return code indicating which one it is.
Something along those lines. All without ncurses or stuff like that.

I would not expect this type of application to require polling anything, let alone at such a high rate.
I've received data from devices via FTDI UART->USB adapter @ 3Mbit/s using regular com port API, imagine polling for that, vs. a kernel driver doing this using interrupts or/and hardware dedicated to the problem, filling a buffer for you that you just get notified to fetch. (while OS internal buffers may be larger still, so there won't be loss when you fetch some data and the other end is still sending. You should be able to set the buffers to sizes more suited to your application)

Not sure why terminal emulators would be strange, well Tera Term is strange to me but that's a different story

Music software like DAWs are a bit of another story, yes. They have been fighting against the OS w.r.t. stable timing since early Windows versions, but things apparently have improved, well, at least it is evidently doable reliably.
I'd say the required time granularity there is still not as fine as you aim for, in what looks to me like a far less demanding scenario.
After all, 1ms or 1000Hz is already orders of magnitude above the lowest audible frequency. Nobody needs to play notes of some synth VST that accurately timed. An incoming MIDI signal, depending on what's going on, may be less finely grained. At 3125 bytes/sec, 1-byte commands might get 0.32ms, but a regular note-ON event is already 3 bytes ~1ms, then the player starts simultaneously using pitchbend or modwheel and floods the channel with a lot of extra 3-byte commands.

EDIT: And then that's one "channel". By specs, IIRC up to 16 "channels", identified only by the value in the 1st command byte, go over the same physical line, making everything a lot more limited. I don't know how many "talking" instruments were actually chained via MIDI-thru, or dedicated MIDI-merger boxes in practice, never had a studio full of $$$$ gear.

(although there are complaints about MIDI being too slow for the kind of controllers possible today, and they are, or by now have completed, defining a new standard. But let's say, how things were 10 years ago, where already no sane person has been using N-track tape reel recorders for music production anymore for a long time).
All the fine-grained modulation to synthesize sounds happens at audio rates and per-buffer and perhaps interpolated control parameters across a buffer, not really real-time, but a kind of "chunked real-time" that gets synchronized at buffer seams, where a buffer might be something like 64 or 128 samples @ 48 kHz.
But the only timing is done by the hardware that finally processes the audio buffers. The buffer-preparation can be as time-fluctuating as it wants, as long as there is never overrun. There is an OS callback per buffer, and never ever any sort of timed waiting, anywhere in the entire program. (I may have used API timer callbacks for triggering GUI updates, completely decoupled from the rest)
I have not implemented a full DAW, only a prototype software synth engine that can be controlled via MIDI data. I haven't stumbled on timing problems so far. But I guess when a lot more stuff is coming together that has to be "made to fit under one hat" as we say here, it could get more tricky than I have experienced in my comparably little programs.
Fetching lots of stuff from hard drives in time used to be an issue. I guess not anymore. Not to mention fast drives, but these days you can just throw a few GB of RAM at an orchestra sample library. (and, as things are, a lot of programs are not a bit reluctant in that regard even if not necessary. *death stare at Mozilla*)

Posted: 7/20/2020 4:40:46 PM

From: Northern NJ, USA

Joined: 2/17/2012

Hi tinkeringdude,

Well, one problem is the Linux keyboard buffer, which is only released by the return/enter key.  The commands in my Hive simulator and D-Lev librarian are triggered by whitespace following a valid command, much like the AutoCAD command line, so I can't use the usual mechanisms provided.  One must then periodically check the buffer to see what's in there, which boils down to manual polling.  In Win32 a couple of low level commands (kbhit(), getch()) let you handle all this quite easily without explicitly polling.  These don't exist in Linux, and the choices are you can have the OS/C++ do stuff for you that doesn't fit this particular operational model (cooked), or you can do it all yourself (raw).  It's a my-way-or-the-highway kind of thing in Linux, and I suppose that's because it ran on terminals so much at the start.
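For reference, the "raw" route on Linux usually goes through termios.  Below is a minimal sketch of that approach, not the code from the librarian: `RawTerminal` and `get_key` are hypothetical names, and it only turns off line buffering and echo rather than going fully raw.

```cpp
#include <termios.h>
#include <unistd.h>

// Switches a terminal to non-canonical mode for the object's lifetime, so
// read() sees each key as it is typed instead of waiting for Enter.
class RawTerminal {
public:
    explicit RawTerminal(int fd) : fd_(fd) {
        if (tcgetattr(fd_, &saved_) != 0) return;  // fd is not a terminal
        termios raw = saved_;
        raw.c_lflag &= ~(ICANON | ECHO);  // no line buffering, no echo
        raw.c_cc[VMIN] = 0;               // read() may return zero bytes...
        raw.c_cc[VTIME] = 0;              // ...without waiting
        active_ = (tcsetattr(fd_, TCSANOW, &raw) == 0);
    }
    ~RawTerminal() {
        if (active_) tcsetattr(fd_, TCSANOW, &saved_);  // restore cooked mode
    }
    RawTerminal(const RawTerminal&) = delete;
    RawTerminal& operator=(const RawTerminal&) = delete;

    bool ok() const { return active_; }

    // Returns the next byte from the descriptor, or -1 if none could be read.
    int get_key() const {
        unsigned char c;
        return (read(fd_, &c, 1) == 1) ? c : -1;
    }

private:
    int fd_;
    termios saved_{};
    bool active_ = false;
};
```

With VMIN and VTIME both at zero, get_key() behaves roughly like a kbhit()/getch() pair rolled into one, which still means polling.  Setting VMIN to 1 instead makes read() block until a key arrives, at the cost of needing something like poll() or a second thread to watch the serial port at the same time.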

For the serial port, I'm running it at 230,400 baud with a 256 char buffer.  That baud rate /10 (10 bits w/ start & stop) gives 23.04kHz.  That /256 gives 90Hz.  I'm testing out increasing the buffer size and lowering the polling rate, but any delay in polling can slow down transfers as there is handshaking going on with the Hive command line (it waits for a ">" symbol back to start the next transaction).
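The buffer arithmetic can be written out as a few small helpers.  This is just a sketch with made-up function names, assuming the usual 10 bits on the wire per byte (1 start + 8 data + 1 stop).

```cpp
// Hypothetical helper names; a sketch of the arithmetic only.

// Bytes per second on an async serial link, assuming 10 bits per byte.
constexpr double bytes_per_second(double baud) {
    return baud / 10.0;
}

// How many times per second a buffer of `buffer_bytes` fills at full line rate.
constexpr double buffer_fills_per_second(double baud, double buffer_bytes) {
    return bytes_per_second(baud) / buffer_bytes;
}

// Longest gap between polls, in ms, that still drains the buffer in time
// when the sender streams continuously.
constexpr double max_poll_interval_ms(double baud, double buffer_bytes) {
    return 1000.0 / buffer_fills_per_second(baud, buffer_bytes);
}
```

Feeding in the baud rate and buffer size quoted above gives the fill rate, and from that the slowest polling interval that still keeps up with a continuous stream.  It says nothing about the handshaking delay, which depends on how quickly each ">" prompt is noticed rather than on buffer size.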

I know what you're saying about letting the OS + API handle the serial port via interrupts and such, I just couldn't find a solution that would adapt to both MS & Linux and present the same functional interface to my code.  I looked and looked, but maybe I didn't look hard enough.  At some point the looking has to stop and the code writing has to start, a bird in the hand, etc.

People who write things like guitar effects for the PC must have it really hard, as they have to deal with ADC + DAC + I/O buffers + DSP latency < ~10ms.
