PHA around JSR

Post by AJW » Sun Jun 28, 2020 11:02 pm

Code: Select all

PHA:LDY&85:BEQno8:JSR clyr2:PLA
results in an "at Line ____" error, but when PHA and PLA are removed it seemingly works. Can anyone explain why? And if this isn't possible, how can I preserve the accumulator across the subroutine call?

Post by IanS » Sun Jun 28, 2020 11:09 pm

AJW wrote:
Sun Jun 28, 2020 11:02 pm
results in "at Line ____" error
At compile time or run-time?

If the branch is taken you'll still need to pop the value back off the stack. Is the branch being taken?
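
A minimal sketch of that point (not from the original post): if the PHA stays at the front, both paths have to reach a PLA. It assumes the no8 label can sit directly after the JSR, which the original fragment doesn't show.

```asm
PHA            \ save A on the stack
LDY &85
BEQ no8        \ branch taken - A is still on the stack
JSR clyr2      \ may change A
.no8
PLA            \ runs on both paths so every PHA is matched by one PLA
```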

(I'm a 6502 idiot, so may have missed the point entirely)

Post by Naomasa298 » Sun Jun 28, 2020 11:14 pm

AJW wrote:
Sun Jun 28, 2020 11:02 pm

Code: Select all

PHA:LDY&85:BEQno8:JSR clyr2:PLA
results in an "at Line ____" error, but when PHA and PLA are removed it seemingly works. Can anyone explain why? And if this isn't possible, how can I preserve the accumulator across the subroutine call?
I think IanS has got it. If the branch is taken and you're not pulling the value off the stack, then you have a problem.

Save the accumulator to memory. Stick it in a zero page location. It's faster than using the stack.
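
As a rough sketch of the zero page approach (an illustration, not the poster's code): &70 is assumed to be a free location, which it normally is in the &70-&8F block BBC BASIC leaves for the user, and no8 is again assumed to follow the call.

```asm
STA &70        \ save A in an assumed free zero page byte
LDY &85
BEQ no8        \ nothing was pushed so the stack is balanced on this path
JSR clyr2      \ may change A
LDA &70        \ restore A
.no8
```

Because nothing goes on the stack, it no longer matters which path is taken.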

Post by sweh » Sun Jun 28, 2020 11:20 pm

AJW wrote:
Sun Jun 28, 2020 11:02 pm

Code: Select all

PHA:LDY&85:BEQno8:JSR clyr2:PLA
results in an "at Line ____" error, but when PHA and PLA are removed it seemingly works. Can anyone explain why? And if this isn't possible, how can I preserve the accumulator across the subroutine call?
You probably want to put the PHA just before the JSR, so the push only happens on the path that also reaches the PLA:

Code: Select all

LDY&85:BEQno8:PHA:JSR clyr2:PLA
Rgds
Stephen

Post by AJW » Mon Jun 29, 2020 12:39 pm

Yes, that works, thanks. This part isn't speed critical, but I agree it's a good idea to use zero page as a temporary store, or even better an absolute address as a scratch area, as they are 1 cycle faster?

Post by Naomasa298 » Mon Jun 29, 2020 12:46 pm

AJW wrote:
Mon Jun 29, 2020 12:39 pm
Yes, that works, thanks. This part isn't speed critical, but I agree it's a good idea to use zero page as a temporary store, or even better an absolute address as a scratch area, as they are 1 cycle faster?
LDA and STA to zero page take 3 cycles each. With an absolute address they take 4 cycles each.

Post by AJW » Mon Jun 29, 2020 12:55 pm

Correct, I was confusing it with immediate.

Post by sweh » Mon Jun 29, 2020 1:03 pm

Naomasa298 wrote:Save the accumulator to memory. Stick it in a zero page location. It's faster than using the stack.
PHA is one byte and takes 3 cycles; PLA is one byte and takes 7 cycles. So 2 bytes and 7 cycles.
STA &70 is 2 bytes and takes 3 cycles; LDA &70 is 2 bytes and takes 3 cycles. So 4 bytes and 6 cycles, and one (temp) storage byte.

Given the lack of memory on the Beeb, sometimes optimising for space is better than for speed. Other times, speed is better than space :-)

There are other tradeoffs possible as well; under certain circumstances, if you don't care about the X or Y registers and the subroutine doesn't modify them, then you can TAX/TAY (1 byte, 2 cycles) and then TXA/TYA (1 byte, 2 cycles), so it only needs 2 bytes and 4 cycles :-)
Rgds
Stephen
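
A small sketch of the register-transfer variant described above (illustrative only): it assumes clyr2 leaves X alone, which the thread doesn't confirm. X is used rather than Y because Y has just been loaded from &85.

```asm
LDY &85
BEQ no8
TAX            \ 1 byte, 2 cycles - park A in X
JSR clyr2      \ assumed here not to change X
TXA            \ 1 byte, 2 cycles - A is back
.no8
```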

Post by Naomasa298 » Mon Jun 29, 2020 1:30 pm

sweh wrote:
Mon Jun 29, 2020 1:03 pm
Naomasa298 wrote:Save the accumulator to memory. Stick it in a zero page location. It's faster than using the stack.
PHA is one byte and takes 3 cycles; PLA is one byte and takes 7 cycles. So 2 bytes and 7 cycles.
STA &70 is 2 bytes and takes 3 cycles; LDA &70 is 2 bytes and takes 3 cycles. So 4 bytes and 6 cycles, and one (temp) storage byte.

Given the lack of memory on the Beeb, sometimes optimising for space is better than for speed. Other times, speed is better than space :-)

There are other tradeoffs possible as well; under certain circumstances, if you don't care about the X or Y registers and the subroutine doesn't modify them, then you can TAX/TAY (1 byte, 2 cycles) and then TXA/TYA (1 byte, 2 cycles), so it only needs 2 bytes and 4 cycles :-)
True. Though if you're wanting to save the X or Y registers, it's faster to store in ZP rather than TXA/PHA.
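
To put numbers on that, here is a side-by-side sketch using the standard NMOS 6502 timings. &71 is an assumed free zero page byte and clyr2 stands in for any subroutine that changes X.

```asm
\ Via the stack - 4 bytes, 11 cycles, and A is overwritten
TXA            \ 2 cycles
PHA            \ 3 cycles
JSR clyr2
PLA            \ 4 cycles
TAX            \ 2 cycles

\ Via zero page - 4 bytes, 6 cycles, A untouched by the save and restore
STX &71        \ 3 cycles
JSR clyr2
LDX &71        \ 3 cycles
```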

Post by Coeus » Mon Jun 29, 2020 3:56 pm

So on this space vs. time trade-off, how about this from another post of mine:

Code: Select all

       83E1: 48          PHA         ; Save all registers.
       83E2: 8A          TXA         
       83E3: 48          PHA         
       83E4: 98          TYA         
       83E5: 48          PHA         
       83E6: A9 84       LDA #84     ; add a new return address that points to the tail of this
       83E8: 48          PHA         ; subroutine which will pull the registers back off the stack.
       83E9: A9 03       LDA #03     
       83EB: 48          PHA         
       83EC: A0 05       LDY #05     
       83EE: BA          TSX         
       83EF: BD 07 01    LDA 0107,X  ; grab the return address of our caller and the registers
       83F2: 48          PHA         ; pushed at the start of this routine and push them back
       83F3: 88          DEY         ; on the top of the stack.
       83F4: D0 F8       BNE 83EE    
       83F6: A0 0A       LDY #0A     ; now move the top ten bytes of the stack down two bytes 
       83F8: BD 09 01    LDA 0109,X  ; removing the copy of the address of our immediate caller
       83FB: 9D 0B 01    STA 010B,X  ; buried the most deeply but leaving the values of the
       83FE: CA          DEX         ; registers pushed at the start of this routine.
       83FF: 88          DEY         
       8400: D0 F6       BNE 83F8    
       8402: 68          PLA         ; as we didn't adjust the stack pointer when we moved the stack
       8403: 68          PLA         ; down by two bytes, discard two bytes from the top.

       8404: 68          PLA         ; This bit is executed twice, once as part of the original call
       8405: A8          TAY         ; and then again when the calling subroutine returns to here.
       8406: 68          PLA         
       8407: AA          TAX         
       8408: 68          PLA         
       8409: 60          RTS         
The code comes from the DFS ROM (Acorn, Opus, Solidisk and Watford all share this). A call to this subroutine at the head of another subroutine takes three bytes, whereas if that subroutine did PHA:TXA:PHA:TYA:PHA at the start and PLA:TAY:PLA:TAX:PLA at the end, it would take ten bytes. Also, the subroutine above leaves A unchanged. This extra subroutine is going to be much slower, though.
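
As a sketch of how a caller might use it (mysub is a hypothetical routine, and &83E1 is only valid for the ROM image disassembled above):

```asm
.mysub
JSR &83E1      \ the routine listed above - stacks A, X, Y and the extra return address
LDX #0         \ body of mysub, free to change A, X and Y
LDY #0
RTS            \ returns to &8404 first, which restores Y, X and A, then to the real caller
```

The three-byte JSR at the top is the only cost in mysub itself. The restore happens because the final RTS lands in the tail of the shared routine before control reaches the original caller.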

Post by hoglet » Mon Jun 29, 2020 4:21 pm

sweh wrote:
Mon Jun 29, 2020 1:03 pm
PHA is one byte and takes 3 cycles; PLA is one byte and takes 7 cycles. So 2 bytes and 7 cycles.
Just in case anyone reading this is confused, there is a typo in the above. It should say:

PHA is one byte and takes 3 cycles; PLA is one byte and takes 4 cycles. So 2 bytes and 7 cycles.
