Working on the command line interface (CLI) again. I really miss it, and really need it to peek & poke memory, run hardware exercise scripts, and the like. The one I wrote and was using was somewhat awkward - in particular the backspace key didn't work the way you'd expect, so there was no way to fix single keystroke errors. I've done this kind of investigation before, but yesterday I got really serious and recorded all the codes that are produced in the text terminal via C++ getch() when the PC keyboard keys are pressed, stuck them in a spreadsheet, and sorted them.
There are lots of these kinds of documents on-line, but I've found most of them to be rather untrustworthy, or not filtered through a Windows-centric development environment.
Also included in the spreadsheet is a worksheet table with four-column ASCII, with vertical least significant hex nibble values and horizontal most significant hex nibble values. This nicely shows the repeating nature of the ASCII character encoding. Hex is the best way to view, list, and understand ASCII encoding. I'm not sure why coders often use decimal here, as it obscures the underlying symmetry.
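For anyone who wants to regenerate that kind of chart, here is a rough C++ sketch. It is not the worksheet itself: the function name asciiChart and the exact layout (one row per low nibble, one column per high nibble 0 through 7, non-printing codes shown as '.') are my assumptions for illustration.

```cpp
#include <cstdio>
#include <string>

// Build a hex-organized ASCII chart as a string.
// Rows are the least significant nibble (0x0..0xF),
// columns are the most significant nibble (0x0..0x7).
// Control codes and DEL are shown as '.' (an illustrative choice).
std::string asciiChart() {
    std::string out = "    ";
    char buf[8];
    for (int hi = 0; hi < 8; ++hi) {
        std::snprintf(buf, sizeof buf, " %Xx", hi);   // column header, e.g. " 4x"
        out += buf;
    }
    out += '\n';
    for (int lo = 0; lo < 16; ++lo) {
        std::snprintf(buf, sizeof buf, "x%X |", lo);  // row header, e.g. "x1 |"
        out += buf;
        for (int hi = 0; hi < 8; ++hi) {
            int code = (hi << 4) | lo;
            bool printable = code >= 0x20 && code <= 0x7E;
            out += "  ";
            out += printable ? static_cast<char>(code) : '.';
        }
        out += '\n';
    }
    return out;
}
```

Printing the result with std::fputs(asciiChart().c_str(), stdout) puts '1', 'A', 'Q', 'a', and 'q' on the same row (low nibble 1), which is the symmetry that a decimal listing hides.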
But an ASCII table doesn't tell the whole story of what's going on in terms of keyboard output, hence the need to document actual getch() data. It's a matter of input vs. output: there are way more key combinations (with SHIFT, CTRL, and ALT) than there are ASCII characters to display. The excess input combinations are handled via escape characters. The original IBM PC keyboard used the value zero (0) to indicate escape, with the following character interpreted differently than normal, and a reversion to normal interpretation after that character. The 101-key IBM PC keyboard added a second escape character, 0xE0, to handle the new page navigation pads. So these have to be accommodated somehow in a full CLI implementation.
Almost all keyboard-generated codes are in the range 0 through 127, which is an unambiguously (whether signed or unsigned) non-negative byte. And the escape characters don't show up as escaped data, which makes things easier - if you encounter them they definitely are escape characters, so you don't have to look from the beginning of time to know what the current state is.
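The escape handling described above can be sketched in a few lines of C++. This is a minimal sketch, not the code from my CLI: the names KeyEvent and readKey are made up for illustration, and readByte stands in for whatever supplies raw bytes (for example a thin wrapper around getch() from <conio.h>), so the logic can be exercised without a keyboard.

```cpp
#include <functional>

// One decoded keystroke (hypothetical type, for illustration only).
struct KeyEvent {
    bool extended;  // true if the code followed a 0x00 or 0xE0 escape
    int  prefix;    // 0x00 or 0xE0 when extended, otherwise -1
    int  code;      // the plain code, or the code that followed the escape
};

// Read one keystroke from a raw byte source.
// Because 0x00 and 0xE0 never appear as the escaped data itself,
// seeing either one always means "escape", and no history is needed.
KeyEvent readKey(const std::function<int()>& readByte) {
    int first = readByte() & 0xFF;      // mask in case the source returns a signed char
    if (first == 0x00 || first == 0xE0) {
        int second = readByte() & 0xFF; // the escaped code
        return {true, first, second};
    }
    return {false, -1, first};
}
```

On Windows the call might look like readKey([] { return _getch(); }), assuming <conio.h> is available. The CLI can then treat extended events as editing or navigation keys and everything else as ordinary characters.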
Because my hardware doesn't interface to it, I haven't done any investigation into what codes are actually emitted by the keyboard hardware. I do know the keyboard serial interface is bi-directional, and there are ways to determine when multiple keys (beyond the usual SHIFT, etc.) are being depressed / lifted.
ASCII, keyboard codes, the way they are interpreted by the OS and programming language libraries, and the whole English-centric thing, are a big steaming pile of legacy, hence the need for Rosetta stones here. (If I were in charge of straightening out this mess, at minimum I'd make the ASCII codes for the characters 0-9 and A-F correspond to their hex values. As it is, the ASCII code for the zero character is 0x30, and the code for the letter 'A' is 0x41 - crazy stuff. It could be a whole lot more crazy, but it could be a whole lot less crazy too.)
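Because of those offsets, any peek & poke command that accepts a hex address has to translate each digit by hand. Something along these lines would do it; the function name hexDigitValue and the -1 error return are my own choices for illustration.

```cpp
// Convert one ASCII hex digit to its numeric value (0..15).
// Returns -1 if the character is not a hex digit.
int hexDigitValue(char c) {
    if (c >= '0' && c <= '9') return c - 0x30;       // '0' is 0x30
    if (c >= 'A' && c <= 'F') return c - 0x41 + 10;  // 'A' is 0x41
    if (c >= 'a' && c <= 'f') return c - 0x61 + 10;  // 'a' is 0x61
    return -1;
}
```

If the codes lined up with the hex values, this whole function would collapse to a single mask.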
[EDIT] Holy Moses! The scan codes coming out of the keyboard hardware serial port are one serious mess! Check this out: http://retired.beyondlogic.org/keyboard/keybrd.htm