💾 Archived View for mirrors.apple2.org.za › archive › ground.icaen.uiowa.edu › MiscInfo › Programmin… captured on 2024-07-09 at 04:18:01.
⬅️ Previous capture (2023-01-29)
-=-=-=-=-=-=-
In article <2gj3o9F3d6lnU1@uni-berlin.de>, Kent Dickey <kadickey@alumni.princeton.edu> wrote: >In article <nospam-4C3750.23050311052004@news-server.woh.rr.com>, John B. Matthews <nospam@nospam.com> wrote: >>In article <nospam-AAC5F1.22351411052004@news-server-fe-02.columbus.rr.com> , "John B. Matthews" <nospam@nospam.com> wrote: >>> In article <370265cd.0405111045.7cd359b7@posting.google.com>, aiiadict@hotmail.com (Rich J.) wrote: >>> > >>> > 16bit/16 bit division with 16 bit result (or 8?), with remainder >>> > 16 bit to 16 bit compare (greater than, less than, equal) >>> > >>> > all with 6502 instructions. >>> > >>> > Rich >>> >> >>Egad! I'd have sworn Sweet 16 does multiply and divide. Instead, >>see lines 522 & 543 of the system monitor listing >> >>http://members.buckeye-express.com/marksm/6502/MON.TXT >> > >The Apple II monitor listing above is a good reference, but those routines >are kinda oddball in that they've been over-optimized for space, not speed or >simplicity. Just a warning. > [ snip] >I'm going to give three multiply routines of increasing complexity. >(I realize you didn't ask for this). I'll do divide in a few days. OK, now for some divide routines. Again, some zero-page locations are used: NUM equ $00 ; Numerator DEN equ $02 ; Denominator REM equ $04 ; Remainder QUOT equ $06 ; Quotient = result (for first routine only) The idea is to do long division in binary. Here's an example: 1 0010 ___________ 101 ) 101 1101 -101 ---- 000 000 1 - 0000 ---- 1 11 - 000 --- 11 110 -101 --- 001 11 -000 --- 11 Result = 10010, remainder = 11. And yup, 93/5 = 18 remainder 3. To do division, you have to "line-up" the denominator with the numerator and then start producing the quotient digits. A simple approach would be to shift the denominator to the left until it is >= the numerator, then do the compare/subtract/shift steps. Then undo the shift at the end to fix up the remainder. But this is inefficient and unnecessary. Instead, the numerator can be shifted into the REM register one bit a time, and compared with the normal denominator. If REM >= DEN, then shift a 1 into the QUOT, otherwise shift in a zero. Repeat for all 16 bits. Here's that code (again, unsigned only, or positive values only): DIVIDE LDY #$10 LDA #$00 STA REM+1 STA REM STA QUOT+1 STA QUOT DIVLOOP ASL NUM ROL NUM+1 ROL REM ROL REM+1 SEC LDA REM SBC DEN PHA LDA REM+1 SBC DEN+1 BCC DIVSHIFT STA REM+1 PLA STA REM PHA DIVSHIFT PLA ; toss ROL QUOT ROL QUOT+1 DIVLOOPEND DEY BNE DIVLOOP RTS The routine does the subtraction of REM-DEN and saves the result on the stack and in ACC until it sees if it is < 0 or not. If it is >= 0, it updates REM with the new values. In any case, the carry gets shifted into QUOT indicating whether to set this bit in QUOT or not. There are many inefficiencies in the above, I wrote it as a more straight- forward form. First, the QUOT register is unnecessary. Note how the NUM is shifting by one each time just as QUOT is. So we can have them share the same memory. And the PHA/PLA can be replaced with TAX/TXA. DIVIDE LDY #$10 LDA #$00 STA REM+1 STA REM ASL NUM ROL NUM+1 DIVLOOP ROL REM ROL REM+1 SEC LDA REM SBC DEN TAX LDA REM+1 SBC DEN+1 BCC DIVSHIFT STA REM+1 STX REM DIVSHIFT ROL NUM ROL NUM+1 DIVLOOPEND DEY BNE DIVLOOP RTS There are some other optimizations to be made. The A2 monitor routine is very similar to this except it eliminates the redundant ROL NUM's before DIVLOOP and at DIVSHIFT. Kent Dickey In article <370265cd.0405161945.2ea7db95@posting.google.com>, Rich J. <aiiadict@hotmail.com> wrote: >the method subtracts the 8 bit divisor from the 16 bit numerator until >it is less than the divisor. the ammount left in the numerator when >it is less than the divisor is the remainder. > >any advice appreciated (ps, I know I should read references. they >aren't here. Internet searches come up will all sorts of results, but >none are what I want specifically) > >the numerator is between 0 and 400 >the divisor is 20 (always) >the quotient will always be 1byte (400/20 = 20) >the remainder will of course always be one byte >the numerator need not be left intact. > >the code below (written from memory of my battle with the IIe earlier) >seems to give strange results sometimes. but I am understanding more >about this now that I have typed it all out and thought about it. > >yuck. maybe not. when a borrow condition is found the code doesn't >work. please save my hair. I am about to pull it out. > > ldy #0 ;initialize quotient > start lda numLO > cmp #divisor > bmi testHI ; if low byte is less than divisor, >see if we can \ > ;borrow from hibyte > sec > sbc divisor ;subtract divisor from low byte > sta numLO > lda numHI >subHI iny ;increment quotient > sbc #00 ;subtract 1 from HI byte if there >was a borrow > sta numHI > bne start ;hibyte is still > 0, so keep on >subtracting > lda numLO ;hibyte is 0, see if Lobyte can still >have divisor > cmp divisor ;subtracted from it > bmi done ;lobyte is less than divisor, end >subtraction > jmp start >testHI lda numHI > BEQ done > sec > jmp subHI > >done sty quotient > lda numLO > sta remainder > rts > >numHI NOP >numLO NOP >divisor equ #20 ;I want to divide a 16 bit # by 20 >remainder nop >quotient nop > > >Rich >aiiadict AT hotmail DOT com >http://rich12345.tripod.com As you said, the above code does not work. First, a minor nit: I think you need to always code "cmp #divisor", with the "#" everytime--otherwise you'll probably get cmp $14 which will compare to zero-page location $14. Even if your assembler has special syntax for this, I recommend staying compatible with most other assemblers. There are two main flaws in the above algorithm: BMI is not the function you want to use to see if the CMP indicates the subtract was successful, and if numLO < 0x14 but numHI != 0, the subtraction does not work correctly. I think you may have some confusion about how SBC/CMP work, so I'll give my explanation. In this case, you're doing a SBC #$14 (decimal 20). It's useful to note that 6502 subtraction is always this: SBC val is the same thing as ADC (val ^ 0xff) In other words, a subtract just inverts all the bits in its operand and does a normal add-with-carry. This means SBC #$14 is identical to ADC #$EB. The other fact to note is that ((val ^ 0xff) + 1) = -val in binary signed math (which is what the 6502 does). This is why SBC requires carry to be set beforehand--just inverting the value does not make it -value, it needs an extra 1 carried in. Once you see this, a lot of confusion around subtraction should go away. I followed your same basic flow but fixed the problems here: ldy #0 start lda numLO sec sbc #20 ; skip the CMP, just do the subtract bcs DoSub ; carry set = subtract is OK to do ldx numHI ; else, check if numHI is 0 beq done ; if it is, we're done DoSub sta numLO ; If we get here, subtract #$14 from num lda numHI sbc #0 sta numHI iny ; increment quotient in Y bne start ; Can use bne instead of JMP since Y < 255 done sty quotient lda numLO sta remainder rts The basic idea is to first see if numLO >= 20. If it is, go to DoSub to do a subtraction. Otherwise, see if numHI != 0. If it is, go to DoSub; otherwise, we're done. An optimization just does the SBC instead of a CMP since they set the carry bit the same but SBC also gives the result we need. Another trick is to check numHI with a LDX to preserve the carry which DoSub will need. As with almost any piece of code, lots of optimizations are still possible (for instance, the SEC can be moved outside of the loop, and numLO could be kept in the X register). However, if you really only need to divide a maximum of 400 by a fixed divider of 20, then more tricks can be used. First, 20 can be factored 5*4. A divide by 4 is just a simple shift, so that can be done outside the loop. This changes the problem to a new range of a max of 100 divided by 5. Here's an algorithm to then do an 8-bit divide by 5 in the inner loop and handle the divide by 4 at the beginning and end. The end divide-by-4 fixups just affects the remainder-- the main divide will give the correct quotient without the fixup at the end: Divide_4_5 lsr NUM+1 ror NUM ror NUM+1 ; stick bit at top of NUM+1, rotate out next bit ror NUM ror NUM+1 ; stick bit at top of NUM+1 LDY #$8 LDA #$00 ASL NUM DIVLOOP ROL CMP #$5 BCC DIVSHIFT SBC #$5 DIVSHIFT ROL NUM DIVLOOPEND DEY BNE DIVLOOP asl num+1 rol asl num+1 rol sta REM RTS This converts NUM+1 and NUM into a single byte: NUM, and then the DIVLOOP does an 8-bit divide. An optimization is to now save the REM location in the accumulator for the inner loop which is faster and saves space. The routine can handle NUM at the start being as high as 1023 ($3FF). Overall, this routine is faster in the worst-case than the divide-by- repeated-subtraction method earlier, but if you know you're numerator values will often be small, then that routine could be faster for your purposes. You only need a single shift to get the range down from 400 max to 200, and that would fit in a byte, so a "ror NUM; ror NUM+1" at the top and a "asl num+1; rol" at the bottom can be removed to save space if the CMP and SBC are changed to use #$0a instead. Kent Dickey