May 9, 2024 #125 Programming Theory
Hybrid BASIC/ASM Programs
Here is a blog post about some general programming theory and practice on the C64 that I personally find useful, and hopefully you’ll find it useful too.
I recently updated a tool that I use to help me with C64 OS development from being written in BASIC to being written as a hybrid of BASIC and 6502 Assembly. I’m sure there are many books and many magazine articles that have been published on this subject, but there can never be too much information on programming tips and ideas for the Commodore 64.
I should set the stage by explaining my development environment. I decided from the beginning that I wanted to develop C64 OS using a C64 (or C128), and not by cross development on a Mac or PC. I know that isn’t everyone’s cup of tea, but since I would be spending years working on the project, I wanted to spend those years becoming intimately comfortable again with the C64’s keyboard, and the storage devices, and all the various commands in BASIC and in the DOSes of the drives. I think that time spent has paid off, as I now have a pretty comprehensive understanding of how things work.
A tricky thing that I found when doing native development is a lack of a standardized set of tools. How do you convert numbers from int to hex and back? How do you run a checksum on a file? How do you split an existing file into two pieces at an arbitrary place? How do you add a header to a file? How do you convert an ASCII file to a PETSCII file? And on and on and on. There are tons of small tasks that have to be performed in order to do the dirty work of creating software.
I have been collecting many small and useful tools, and these are then stored for standard access in a directory called “c64tools” found in the root of the C64 OS system directory. What tools I couldn’t find, I wrote for myself. BASIC is usually the easiest way to sit down and plunk out the first revision of some tool.
BASIC is super convenient because you just print out a couple of lines with the name of the program, and then use an input to get the name of a file from the user. Once you have the name of a file, you can open that file with one line and immediately start reading data from that file, and it’s all very easy. There is just one problem. As soon as you try to do something more than the very trivial, it is also very slow.
Take converting an ASCII file to PETSCII as an example. Here’s the basic idea: take a filename for input and a filename for output; open both files, the first for read, the second for write as a SEQ-type file. Next, we’ll read a byte from the input file, and check to see if its numeric value falls within a block that should be mapped to a different block.
In order to do this, we should refer to a PETSCII table, such as the one that I’ve provided on c64os.com (https://www.c64os.com/post/c64petsciicodes), and compare it to an ASCII table, such as the one found at https://www.asciitable.com
ASCII is technically a 7-bit code, so it only has 4 blocks of 32 characters each. PETSCII is an 8-bit code, so it has 8 blocks of 32 characters each, although two of those 8 blocks are undefined. The PETSCII chart linked above numbers the blocks from 1 to 8. We can see that in both PETSCII and ASCII control codes are found in block 1, therefore these require no conversion. In both, numbers and symbols are found in block 2, so they don’t need conversion either. However, in ASCII, uppercase characters are in block 3, but these are in block 7 of PETSCII. And lastly, in ASCII, lowercase characters are in block 4, but these are in block 3 of PETSCII.
Therefore, if any character’s byte value falls from 64 to 95 (block 3), we will add 128 to it, to move it to 192 to 223 (block 7). And, if any character’s byte value falls from 96 to 127 (block 4), we will subtract 32 to move it to 64 to 95 (block 3). Then we’ll write the newly mapped byte to the output file. After reading each byte, we’ll read the status byte into a variable. And after writing each byte, if that read-status variable is zero then there are more bytes to fetch, and so we’ll repeat. When the read-status finally comes back as something not-zero, then after writing out the final byte, we’ll close both files and the task is complete.
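The two block shifts described above are easy to check on a modern machine before committing them to BASIC. The following is a minimal sketch in Python, not C64 code; the function names are made up for illustration, and it assumes 7-bit input, as the text does.

```python
def ascii_to_petscii(byte: int) -> int:
    """Remap one ASCII byte value to PETSCII using the two block shifts."""
    if byte < 64:
        return byte          # blocks 1 and 2: control codes, numbers, symbols
    if byte < 96:
        return byte + 128    # ASCII block 3 (64-95) -> PETSCII block 7 (192-223)
    return byte - 32         # ASCII block 4 (96-127) -> PETSCII block 3 (64-95)


def convert(data: bytes) -> bytes:
    """Apply the per-byte mapping to a whole buffer."""
    return bytes(ascii_to_petscii(b) for b in data)


if __name__ == "__main__":
    # "H" (72) moves up by 128, "i" (105) moves down by 32; space and "1" are untouched.
    print(list(convert(b"Hi 1")))
```

The order of the tests matters: because values below 64 are handled first, the second test only ever sees 64 to 95, which is the same trick the BASIC version relies on.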
Now let’s write that in BASIC.
It’s so short and simple, right? And you can just type it up straight from the READY prompt, save it, and boom you’ve got yourself an ASCII to PETSCII conversion tool. Let’s go through it line by line so we’re clear about what it’s doing.
The first three lines print out the name of the program, a copyright if you want it, and a blank line.
Lines 40 and 50 ask for two filenames, which are saved as sf$ and df$.
Line 60 is what every C64 programmer should do, but which many don’t. The value at 186 is the device number of the last accessed device. Simply by reading this into dv and now using dv instead of hardcoding the number 8 we have just added support for multiple storage devices or storage devices on some dev number other than 8. Hooray!
Lines 70 and 80 open the two files. Line 70 opens logical file 2, using data channel 2, on device number “dv”, and explicitly asks for a SEQ-type file, for read. If you copy files from a PC via an SD card, say, and your “.txt” ASCII files appear as PRG-type files, it would be fine to just remove the trailing + “,s,r” from line 70. The file opens for read by default, and not specifying the file type allows it to open either SEQ or PRG (or USR). Technically, though, an ASCII text file ought to be a SEQ-type file.
Regardless of the file type of the source file, line 80 opens on logical file 3, using data channel 3, on “dv” device number, and it opens the file with ,s,w to make it create a SEQ-type file for output.
Between the dashed REM lines is the meat and potatoes of the ASCII/PETSCII conversion.
Line 90 gets one byte from #2 (the input file), and it also reads the status from that operation into a variable “s”. Why do we do this? Because the write operation is going to have an effect on the status, but we later need to refer back to what the status was after the read.
In order to compare numbers and perform mathematical operations, the single-character string a$ needs to be converted into its numeric value. Line 100 does this with the ASC() function, and saves the result to “a”.
Now we can compare numbers. Line 110: if a<64, the value is in block 1 or 2 and there is nothing to convert, so go to line 140 for output.
Line 120 checks if the value is less than 96. But we’ve already eliminated values less than 64, so this really means any value from 64 to 95. If so, a = a + 128, which maps the value to block 7, and then we go to line 140 for output.
Lastly, if we reach line 130, we know the value is 96 or above, and we assert that, since ASCII is a 7-bit code, it won’t have values greater than 127. This is very easy to handle: just subtract 32, a = a - 32, and then fall through to line 140 to output the remapped character.
Line 140 prints to logical file 3, a single-byte string which is converted back from the numeric value “a” using the chr$() function. Pay attention to the semi-colon at the end of the print# command in line 140. Without the semi-colon there, print# would output a carriage return too. We definitely don’t want that.
That’s one byte successfully converted and output. At line 150, we check the “s” which is the read-status we saved. If it’s zero, jump back to line 90 and repeat. When we’ve finally read the last byte, “s” will no longer be zero. The code at line 150 will not loop but fall through to 160, which closes the two files and the program ends.
This program will work! It’s very logical, the implementation in BASIC of the steps that have to be taken is sound. There is just one problem. It’s bloody slow. If you have to convert a file that’s 50 blocks (50 / 4 = 12.5 KB) you will be waiting many minutes for it to finish. That’s a bit long to convert just a few pages of text.
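The blocks-to-kilobytes arithmetic can be spelled out in a few lines. This is a small Python sketch with made-up names. It assumes the usual 256-byte disk block, of which 254 bytes hold file data on 1541-format disks because the first two bytes of each block link to the next one.

```python
BLOCK_BYTES = 256        # raw size of one disk block
DATA_BYTES = 254         # bytes of file data per block on 1541-format disks


def blocks_to_kb(blocks: int) -> float:
    """Rule-of-thumb size: four blocks to the kilobyte."""
    return blocks * BLOCK_BYTES / 1024


def blocks_to_data_bytes(blocks: int) -> int:
    """Upper bound on actual file bytes stored in that many blocks."""
    return blocks * DATA_BYTES


if __name__ == "__main__":
    print(blocks_to_kb(50))          # the 50-block example from the text
    print(blocks_to_data_bytes(50))  # number of get#/print# round trips, at most
```

The second figure is the one that matters for speed, since the BASIC loop runs once per byte.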
Implement it in assembly
Everyone knows that assembly is a kajillion times faster than BASIC. So the answer is easy, just implement it in assembly. The problem here is that assembly language, on its own, provides you with almost nothing to work with. Thank god for the KERNAL ROM (and some BASIC ROM routines) or you’d have literally nothing. But, from a UI perspective, there isn’t much. This is, of course, one of the main reasons for the existence of an operating system such as C64 OS. The clipboard, and memory management, and Toolkit UI widgets, and mouse pointer and menus… the assembly language program just draws together the various elements that the operating system provides.
However, there are times when you want to write something very simple that can be run just from the READY prompt, a small or simple tool, like an ASCII to PETSCII text file converter. The problem is, somehow you have to output strings to the screen, and then you have to prompt the user and get input for the filenames, and then you have to open the files before you get to the hard work.
Anyone who has ever optimized code will know that in a given program, the computer may spend 1% of its time in 99% of the code, and then spend 99% of its time in just 1% of the code. As the optimizer, what you want to do is figure out what’s the 1% where all the time is being spent? (Modern tools in modern languages identify these things for you, but we’re coding like it’s 1983, so we’re doing the analysis by hand.) Fortunately for us, it’s easy to see that lines 90 to 150 are where all the time is spent, which is why I split them out with the dashed REM lines.
BASIC and assembly in one program
Printing lines, getting user input strings, opening files based upon those strings, that stuff is a pain in assembly. But it’s also the part that takes almost no time to execute. So let’s do in BASIC that short part that’s annoying to implement in assembly, and then do in assembly just the part that takes too long to run in BASIC.
But how do we get BASIC and assembly into a single program?
Many programs written fully in assembly will include a small BASIC header. When the program is loaded, and listed, it will have just a single line, something like:
64 sys2061
The line number doesn’t matter, but making it line #64 is a fun wink-wink. The only thing the line does is SYS2061, which is an address somewhere just past the end of the BASIC segment.
How does one get this BASIC prelude or header into the assembly program? It turns out, you can just embed the byte codes that represent the entire BASIC program right from your assembly program. Different people have different tricks, but in C64 OS, there is a file //os/s/:basic.s that you can include in your assembly program. It looks like this:
It includes the *= $0801 for you, so the whole shebang will get loaded right into memory where a BASIC program normally starts. And lo and behold, the first 12 bytes ARE a BASIC program. Memory address 2061 is $080d in hexadecimal, which is exactly 12 bytes more than $0801. Therefore, the SYS tells BASIC to begin executing assembled 6502 code starting on the first byte following the BASIC program.
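To see why the header comes to exactly 12 bytes, the line can be built by hand. The following Python sketch is an illustration based on how BASIC V2 stores a program line (a 2-byte link to the next line, a 2-byte line number, the tokenized text, a zero terminator, then a zero link to end the program). It is not the contents of basic.s, and the function name is hypothetical.

```python
import struct

LOAD_ADDR = 0x0801   # where a BASIC program normally starts
SYS_TOKEN = 0x9E     # BASIC V2 token for the SYS keyword


def basic_sys_stub(line_number: int, target: int) -> bytes:
    """Build the in-memory bytes of a one-line program: <line_number> SYS<target>."""
    digits = str(target).encode("ascii")   # digits are the same in ASCII and PETSCII
    body = struct.pack("<H", line_number) + bytes([SYS_TOKEN]) + digits + b"\x00"
    next_line = LOAD_ADDR + 2 + len(body)  # the link points at the end-of-program marker
    return struct.pack("<H", next_line) + body + b"\x00\x00"


if __name__ == "__main__":
    stub = basic_sys_stub(64, 2061)
    print(stub.hex(" "))
    print(len(stub), hex(LOAD_ADDR + len(stub)))
```

On disk, a PRG file would carry two more bytes in front (the load address, $01 $08), which are not part of the 12 bytes counted here.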
This is great, but only if you want the entire program to be in assembly save for this one SYS line. What we want is to have a fairly complex chunk of BASIC, followed by some assembly. But we definitely don’t want to write the BASIC program by manually encoding all the byte tokens like in the short example above. How do we get the best of both worlds, and make them work together?
What we want to do is write the BASIC part the way we write any BASIC program and save it to a file like normal. Then we want to write an assembly language program that implements just the bit we care about, and assemble it to a file. And then, as a final step, we want to merge the two files into a single file that can be distributed, loaded, and run like any other program would be.
So let’s see how we can do that.
Implement the short part in BASIC
Let’s start with the BASIC part. Here is our new stripped-down BASIC program:
Pretty straightforward, right? We’ve just chopped out the long-running part and replaced it with SYS2000. But, will the assembly part REALLY be at memory address 2000? We don’t know this yet. So how do we find out?
Enter this program, and then save it to disk with a filename ending in .bsc (or .bas, or whatever your convention is). The extension tells you that this is not the final program but is just the BASIC component of the program. Let’s say we call it “asc2pet.bsc”.
Once saved, you need a program that will open the BASIC program file, read it in, and count the byte length of the file. Well, how do you do THAT??! You need another tool to do that. You could write that other tool in BASIC… but it would be slow. Boy, wouldn’t it be nice to have a hybrid tool that is both BASIC and assembled 6502 to analyze the size of a BASIC or any other program? Yes, yes, it would. C64 OS includes just such a tool called fileinfo found at //os/c64tools/:fileinfo
Numbers in the image do not accurately reflect the examples shown in this blog post.
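Once the byte length of the BASIC file is known, working out the address is simple arithmetic. Here is a hedged sketch in Python with a hypothetical function name. It assumes the file is a PRG whose first two bytes are the load address, that it loads at $0801, and that the assembled code is appended directly after the last byte of the BASIC program.

```python
BASIC_START = 0x0801   # 2049, where the BASIC program is loaded
LOAD_ADDR_BYTES = 2    # a PRG file starts with its 2-byte load address


def asm_start_address(prg_file_size: int) -> int:
    """Address of the first byte following a BASIC program of the given file size."""
    return BASIC_START + (prg_file_size - LOAD_ADDR_BYTES)


if __name__ == "__main__":
    # Sanity check against the one-line SYS header: 12 bytes in memory, 14 on disk.
    print(asm_start_address(14))
```

Because BASIC stores the SYS address as digit characters, and every address from 2049 to 9999 has four digits, replacing the placeholder 2000 with the real address does not change the length of the program.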
Of course, fileinfo was originally implemented just in BASIC,
Original article by www.c64os.com