Let's Design and Build a (mostly) Digital Theremin!

Posted: 11/21/2016 6:01:42 PM

From: Northern NJ, USA

Joined: 2/17/2012

Found a nasty flaw in the way I was parsing labels.  Explicitly assigned labels work fine, but those that are implicitly assigned can be broken because addresses can require one, two, or three opcode slots when assigned to a literal, which can move all of the code forward zero, one, or two opcode slots per literal depending on the address value.  But the label address value can be dependent on the location of the label, so it's a chicken / egg thing.  For instance, say we want to place a label value on stack 0, but the label is implicitly assigned somewhere higher in address space.  We can't do the assignment unless we know the address, and we can't know the address unless we do the assignment!

One can work this sort of problem out on paper, so a solution exists.  But formally coding it is another thing, particularly with a fixed number of passes over the code.  One could probably gather all the info and do some sorting and thresholding, but it may be easiest to do repeated passes over the code, updating the label table, and exiting when the table stops changing.  This will be the first approach I try.  If labels weren't so useful I'd just drop it altogether, but I'm finding coding with them really speeds things up and removes a lot of pain and bookkeeping.
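The repeated-pass idea above can be sketched in a few lines of Python. This is only an illustration, not the actual assembler code: the program representation, the function names, and especially the size thresholds in `slots_for` are made up for the example (the real slot counts depend on the Hive opcode encoding).

```python
def slots_for(value):
    """Hypothetical size rule: opcode slots needed to load `value` as a literal."""
    if value < 32:
        return 1
    if value < 4096:
        return 2
    return 3


def resolve_labels(program, max_passes=100):
    """Re-scan `program` until the implicit label table stops changing.

    `program` is a list of (kind, arg) tuples:
      ("label", name) - implicit label, takes no space, marks the current address
      ("lit", name)   - literal load of a label's address (1 to 3 slots)
      ("op", n)       - n slots of ordinary code
    Returns (table, number_of_passes).
    """
    table = {}
    for passes in range(1, max_passes + 1):
        new_table = {}
        addr = 0
        for kind, arg in program:
            if kind == "label":
                new_table[arg] = addr
            elif kind == "lit":
                # Size the literal from the previous pass's table; an unknown
                # label is optimistically assumed to fit in the smallest size.
                addr += slots_for(table.get(arg, 0))
            else:
                addr += arg
        if new_table == table:
            return table, passes
        table = new_table
    raise RuntimeError("label table did not settle")


program = [("lit", "far"), ("op", 31), ("label", "far"), ("op", 1)]
print(resolve_labels(program))
```

Because every literal starts at its smallest size and can only grow from pass to pass, the addresses only ever move upward, so the loop settles; `max_passes` is just a safety net.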


Am also polishing the command line interface; when it's done it should be easily extensible to do other things.  Engineers get criticized for spending inordinate amounts of time automating and optimizing things (XKCD), time one may not make up by making those things run more smoothly.  But this fails to take into account all of the necessary experimentation and self-training going on - which enables future projects to be developed that much faster.  If we could live forever, an engineer might be your go-to person for getting lots of stuff done quickly because they will have worked long and hard on lots of problems, with ready solutions to many thorny issues.  As for premature optimization, well, it's often difficult to tell where you are in a project, so it can be difficult to tell if your efforts are indeed premature.  Experts will tell you to make a quick prototype, then toss it and make another, etc. but often the project isn't a good fit for that advice.  E.g. my Hive simulator code is about as rats-nesty as I can tolerate / follow, and I often need to add features to it, which isn't all that easy, but I'm not about to scrap it and start over.


Once the above is behind me I will attempt an experiment to characterize the noise at the pitch antenna.  In the DPLL, the operating point numbers are fed to a 4th order low-pass digital filter at double the edge rate (~5MHz).  This filter lops off noise starting at ~1kHz and pretty much kills everything above half of the 48kHz audio sampling rate.  So I should be able to send the output of this filter more or less directly to the SPDIF TX logic and record it as audio in Audition.  From there I will be able to see what is going on both spectrally and in terms of overall amplitude.  This will allow me to assess countermeasures to reduce the noise, whether implemented in SW or in FPGA hardware (comb filter, variable LPF, etc.).
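The post doesn't say how the 4th order filter is built, so the following is only one plausible shape: four identical one-pole sections in cascade, each using a power-of-two coefficient (a shift in hardware). The function name and the `shift` parameter are assumptions for illustration. For small coefficients each section's corner sits near fs / (2*pi * 2**shift), so at ~5MHz a shift of 10 gives roughly 0.8kHz per section.

```python
def make_lowpass4(shift):
    """Four cascaded one-pole low-pass sections (assumed structure).

    Each section computes y += (x - y) / 2**shift and feeds the next one.
    Returns a function that filters one sample per call.
    """
    k = 1.0 / (1 << shift)
    state = [0.0] * 4

    def step(x):
        for i in range(4):
            state[i] += (x - state[i]) * k
            x = state[i]  # output of this section is input to the next
        return x

    return step


lp = make_lowpass4(4)
first = lp(1.0)
for _ in range(1000):
    last = lp(1.0)
print(first, last)  # step response: starts tiny, settles at the input value
```

Floats are used here for readability; an FPGA version would use fixed-point accumulators with enough extra bits to hold the shifted-out fraction.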

The numbers from the DPLL are likely good enough to use as-is, but it never hurts to have more precision than you need, particularly when a fair amount of math will be performed on them.

Posted: 11/23/2016 6:10:40 PM

From: Northern NJ, USA

Joined: 2/17/2012

Was literally burning the midnight oil coding last night, slogging through the nested IFs, trying to catch and handle all of the label scenarios I could imagine - it's remarkably nuanced.  Got it working and hit the hay, then got up this morning to work out the final bugs and apply some shoeshine.  In the process I was able to finally reconcile the minus sign '-' as a valid HAL symbol in the parser, dealing with negative numbers and various fancy assembly code constructs systematically in the pre-processor.  The final pass is now more of a post-processor, where some absolute jumps get turned into relative jumps.

So HAL parsing goes like this (all generated code files are tokenized):

1. Split the source HAL file into tokenized code (hal_code.tmp) and end of line comment (hal_eol.tmp) files.

2. Read the code file and pre-process it to hal_pre.tmp.  Here we deal with negative numbers, as well as opcode convenience constructs not directly supported by the processor.  This makes downstream parsing easier too.

3. Read hal_pre.tmp, put explicit label values in the table and delete the assignment statement, and write the results to hal_lbl0.tmp.

4. Read hal_lbl0.tmp, replace all explicit labels with table values, and write the results to hal_lbl1.tmp.

5. Read hal_lbl1.tmp and place the remaining implicitly assigned label values (addresses) in the table.  Parse the file repeatedly, referring to the table for all label values, until the table values stop changing.  Delete beginning of line labels, replace all remaining labels with table values, and write the results to hal_lbl2.tmp.

6. Read hal_lbl2.tmp, convert certain jump constructs from absolute to relative addressing, and write the results to hal_post.tmp.

7. Parse hal_post.tmp and hal_eol.tmp files to the memory array and delete all temporary files.

It might seem that steps 3 and 4 could be combined, but doing so would make it harder to detect table collisions and null assignments.  As with all other parsing I've implemented, original line numbers are retained throughout, and parsing halts on the first error encountered (reporting the line number).  This makes it a lot easier to figure out what went wrong where when it inevitably does.  The stand-alone compiler has a command line switch option that retains the temp files for debugging purposes.
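Step 1 above (splitting code from end of line comments and tokenizing the code) can be sketched as follows. This is a guess at the behavior based on the example in this post, not the real C++ code: the function name and the regular expression are assumptions, and it treats every "//" as the start of a comment.

```python
import re

# A token is a run of letters/digits/underscores, or any single other
# non-space character (so ">>=" becomes three tokens: ">", ">", "=").
TOKEN_RE = re.compile(r"[A-Za-z0-9_]+|\S")


def split_line(line):
    """Return (tokenized code, end of line comment) for one HAL source line."""
    code, sep, comment = line.partition("//")
    tokens = TOKEN_RE.findall(code)
    return " ".join(tokens), (sep + comment).strip()


code, eol = split_line("(P2<0) ? pc := lbl[0x15]// idx odd=wr_hi")
print(code)
print(eol)
```

Keeping one output line per input line in both files is what lets the original line numbers survive through the later passes.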

Here's a small example of HAL processing.  This is the HAL source code for writing a byte to memory (memory is addressed in 16 or 32 bit "chunks" hence the need for this subroutine and a similar one for reading bytes from memory):
lbl[0x14] s2 >>= 1 // * SUB: WR BYTE * (0:data, 1:addr, 2:idx, 7:rtn | -)
P1 += P2 // base addr + idx/2
s3 :u= mem[s1] // read mem
P0 <<= 24 // in_byte to MSB position
P2 >>r= 1 // test idx odd
(P2<0) ? pc := lbl[0x15] // idx odd=wr_hi
P3 >>= 8 // wr_lo: hi_byte to MSB position
pc := lbl[0x16] // do common
lbl[0x15] P3 <<= 24 // wr_hi: lo_byte to MSB position
P0 >>= 24 // in_byte to LSB position
lbl[0x16] P3 |= P0 // common: combine
P3 <<r= 8 // rotate to final position
mem[P1] := P3 // write mem
pc := P7 // * sub end * return
Here is the code after being tokenized:
lbl [ 0x14 ] s2 > > = 1 
P1 + = P2 
s3 : u = mem [ s1 ] 
P0 < < = 24 
P2 > > r = 1 
( P2 < 0 ) ? pc : = lbl [ 0x15 ] 
P3 > > = 8 
pc : = lbl [ 0x16 ] 
lbl [ 0x15 ] P3 < < = 24 
P0 > > = 24 
lbl [ 0x16 ] P3 | = P0 
P3 < < r = 8 
mem [ P1 ] : = P3 
pc : = P7 
And here is the code after being post-processed:
s2 < < = -1 
P1 + = P2 
s3 : u = mem [ s1 + 0 ] 
P0 < < = 24 
P2 < < r = -1 
( P2 < 0 ) ? pc + = 2 
P3 < < = -8 
pc + = 2 
P3 < < = 24 
P0 < < = -24 
P3 | = P0 
P3 < < r = 8 
mem [ P1 + 0 ] : = P3 
pc : = P7 

All of the labels here are implicitly assigned, so the beginning of line labels are removed, and the remaining ones are turned into relative jump distances.  The convenience expression "mem[s1]" is converted to the standard expression with zero offset "mem[s1+0]".  All immediate right shifts and rotates are converted into left shifts and rotates, with the immediate shift value negated.  Other expressions not shown here like ++ increments and -- decrements get turned into += 1 and += -1 respectively.
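The pre-processing rewrites described above can be sketched on a tokenized line like this. It is an illustration under assumptions, not the real pre-processor: the function name is invented, the "P1++" spelling of an increment is assumed, and only plain non-negative immediates are handled.

```python
import re

NUMBER_RE = re.compile(r"0x[0-9A-Fa-f]+|[0-9]+")


def preprocess(tokens):
    """Rewrite convenience constructs in one line of tokens; returns a new list."""
    out = list(tokens)
    if out[-2:] == ["+", "+"]:
        # x++ becomes x += 1
        out[-2:] = ["+", "=", "1"]
    elif out[-2:] == ["-", "-"]:
        # x-- becomes x += -1
        out[-2:] = ["+", "=", "-1"]
    # Immediate right shift or rotate becomes a left one with negated amount.
    for j in range(len(out) - 1):
        if (out[j] == ">" and out[j + 1] == ">" and out[-2] == "="
                and NUMBER_RE.fullmatch(out[-1])):
            out[j:j + 2] = ["<", "<"]
            out[-1] = "-" + out[-1]
            break
    # mem[reg] becomes mem[reg+0]
    i = 0
    while i + 3 < len(out):
        if out[i] == "mem" and out[i + 1] == "[" and out[i + 3] == "]":
            out[i + 3:i + 3] = ["+", "0"]
        i += 1
    return out


for line in ["s2 > > = 1", "P2 > > r = 1", "mem [ P1 ] : = P3"]:
    print(" ".join(preprocess(line.split())))
```

The three printed lines correspond to the first, fifth, and thirteenth lines of the post-processed listing above.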

This work is pretty tedious in C++; other languages likely implement parsing in a more natural way.  Now, back to the CLI coding in HAL.

Posted: 12/1/2016 6:22:16 PM

From: Northern NJ, USA

Joined: 2/17/2012

Still pondering the command line code structure.  Found this today via hacker news: https://interpreterbook.com/ which is kind of interesting as he probably addresses many of the issues I've encountered while implementing the HAL assembly language, as well as the Hive command line interpreter.  The command line is actually more difficult because it has to be done in limited hardware / software, which prevents me from using things like multiple intermediate buffers.

At the moment I'm leaning towards buffering the command line like everyone else seems to do, and that is via a circular buffer, with read/tail and write/head modulo pointers.  My previous command line code buffers everything directly as tokens, but that prevents the use of destructive backspace editing (mainly because "undoing" input values when building base 10 tokens is problematic) - the lack of which I've found pretty frustrating.  The RX UART is only double buffered, so one either routes the RX interrupt to the software process, or oversamples the RX register somehow.  I think I'll go with using the 48kHz audio sample interrupt to do this as it is ~5x the ~10kHz byte rate (the audio int is quite handy).  The interrupt process will look at the incoming byte, and if it is not BKSP (backspace) stick it in the current slot and increment the write pointer.  If it is BKSP it will decrement the write pointer if it is not equal to the read pointer.  I could include ESC (escape) here and have it set the write pointer equal to the read pointer, but that seems a little scary.
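The buffering scheme above, with backspace handled on the write side, might look like this in Python (the real thing would be HAL code running in the interrupt). The class and method names, the buffer size, and the drop-when-full policy are all assumptions made for the sketch.

```python
BKSP = 0x08  # ASCII backspace


class LineRing:
    """Circular byte buffer with write/head and read/tail modulo pointers."""

    def __init__(self, size=64):
        self.buf = bytearray(size)
        self.size = size
        self.head = 0  # write pointer
        self.tail = 0  # read pointer

    def count(self):
        """Unread bytes: head minus tail, modulo the buffer size."""
        return (self.head - self.tail) % self.size

    def put(self, byte):
        """Write side: would be called once per received byte."""
        if byte == BKSP:
            # Back up only over bytes the reader has not consumed yet.
            if self.head != self.tail:
                self.head = (self.head - 1) % self.size
            return
        nxt = (self.head + 1) % self.size
        if nxt == self.tail:
            return  # buffer full: drop the byte (assumed policy)
        self.buf[self.head] = byte
        self.head = nxt

    def get(self):
        """Read side: return the next byte, or None if nothing is waiting."""
        if self.head == self.tail:
            return None
        byte = self.buf[self.tail]
        self.tail = (self.tail + 1) % self.size
        return byte


ring = LineRing(8)
for b in b"ab\x08c":
    ring.put(b)
print(ring.count(), bytes([ring.get(), ring.get()]))
```

Note that a backspace can only erase what the read side hasn't consumed yet, which is one reason for the reader to hold off until it sees a terminating character.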

The read process then doesn't have to deal with BKSP at all (which would be a royal pain), but it still must figure out when to process tokens.  I keep forgetting that the command line is interactive, and that the serial stream feeding the command line, whether it's an actual human banging on the keyboard or automated via TeraTerm, will wait and look for a new line as confirmation that the processor has consumed & processed the current command before issuing another.  So the read side can leisurely look for a change in the character count (head PTR - tail PTR, modulo) with a command character at the end (white space, function key, etc.).

I want to make it possible to respond to things like single function keys being pressed without hitting the enter key (or adding some other white space to the end).  I also want to make all of the sub processes as generic as possible so I don't have to reinvent the wheel every time I make a fundamental change to the way the command line functions.  It's taken me a fair amount of thought to get here (where everyone else ends up, LOL) so I'm not all that keen on revisiting it yet again.  
