This thread is to clarify the terminology used to refer to instruction execution and how long instructions take. Without clear terminology, posts are harder to follow and everything has to be described more verbosely every time. With clear terminology, we can state something precisely and concisely and be done with it.
I examined the MCS6500 manual from 1976 to see how things are officially described. Here is my summary:
- All instructions take at minimum 2 clock cycles to execute. Each of these cycles accesses memory in some way related to the instruction being executed (sometimes dummy reads/writes).
- Actual instruction execution might extend through one or more clock cycles that are overlapped with the beginning of the next instruction. They are not normally considered because they don't add to execution time and don't affect programming.
- Therefore, we can validly refer to the third and fourth clocks of a two-cycle instruction, since the "two-cycle" part refers to the fact that the execution time of the code using this instruction is increased by two cycles.
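To make the bookkeeping in this summary concrete, here is a minimal Python sketch. The `Instr` record and both helper functions are hypothetical names made up for illustration; the 4 counted plus 2 overlapped cycles for ADC absolute are taken from the manual passage quoted in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    billed: int      # cycles this instruction adds to execution time
    overlapped: int  # further cycles hidden under the next instruction


# Figures from the quoted manual passage: 4 counted cycles, 2 overlapped.
ADC_ABS = Instr("ADC absolute", billed=4, overlapped=2)


def execution_time(program):
    """Total run time counts only the non-overlapped cycles."""
    return sum(instr.billed for instr in program)


def overlaps_next_cycle(instr, n):
    """Return the cycle of the NEXT instruction that coincides with
    cycle n of instr, or None if cycle n is not overlapped."""
    total = instr.billed + instr.overlapped
    if not 1 <= n <= total:
        raise ValueError(f"{instr.name} has no cycle {n}")
    return n - instr.billed if n > instr.billed else None
```

For example, `execution_time([ADC_ABS, ADC_ABS])` is 8 rather than 12, and `overlaps_next_cycle(ADC_ABS, 5)` is 1: the fifth cycle of the first ADC is the opcode fetch of the second.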
This still leaves the question of what the "last cycle" of an instruction means. When referring to individual cycles and what actions occur during them, we're going to a hardware level, since they are irrelevant from a pure programming perspective. This would suggest that the last cycle might be one of those overlapped with the next instruction. On the other hand, the actions performed during these last overlapped cycles of an instruction aren't well-documented, and their existence can really only be proven by cracking open a 6502 chip, since they leave no external evidence, even at a hardware level. Therefore, I propose that "last cycle" mean what most people here would think it to mean: the last cycle of an instruction that isn't overlapped with the next instruction. This means that the last cycle of a 4-cycle instruction is the 4th cycle, so in the case of STA $1234, it would be the cycle that writes to address $1234. If the additional cycles need to be referred to, use the "Nth cycle" terminology, or something more verbose like "cycle during the opcode fetch of the next instruction".
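As a rough illustration of the proposed definition, the following Python sketch lists the bus activity of STA absolute cycle by cycle and picks out the "last cycle". The function names and the starting address $8000 used in the example are assumptions for illustration only; the per-cycle accesses are the usual opcode fetch, two address-byte fetches, and the write.

```python
def sta_absolute_cycles(pc, addr):
    """Bus activity of STA absolute, one tuple per non-overlapped cycle:
    (cycle number, access type, address on the bus, description)."""
    return [
        (1, "read", pc, "opcode fetch"),
        (2, "read", (pc + 1) & 0xFFFF, "fetch low byte of address"),
        (3, "read", (pc + 2) & 0xFFFF, "fetch high byte of address"),
        (4, "write", addr, "write A to the effective address"),
    ]


def last_cycle(cycles):
    """'Last cycle' as proposed here: the final cycle that is not
    overlapped with the next instruction."""
    return cycles[-1]


# Hypothetical placement: STA $1234 assembled at $8000.
cycle, access, address, what = last_cycle(sta_absolute_cycles(0x8000, 0x1234))
print(f"last cycle = {cycle}: {access} ${address:04X} ({what})")
```

Under this definition the result is cycle 4, the write to $1234, regardless of whatever internal work might still be going on during the next opcode fetch.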
Here are some relevant paragraphs from the manual that support the above terminology. Emphasis and bracketed comments [] are mine:
MCS6500 Manual wrote:
The overlap of fetching the next memory location while interpreting the current data from memory minimizes the operation time of a normal 2- or 3-byte instruction and is referred to as pipelining. It is this feature that allows a 2-byte instruction to only take 2 clock times and a 3-byte instruction to be interpreted in 3 clock cycles.
In the MCS650X microprocessors, a clock cycle is defined as 1 complete operation of each of the 2 phase clocks.
...
Because that [fourth cycle] completes the memory operations for this instruction [ADC absolute], during the fifth cycle [of a four-cycle instruction] the microprocessor starts to fetch the next instruction from memory while it is completing the add operation from the first instruction. During the sixth cycle, the microprocessor is interpreting the new instruction fetched during cycle 5 while transferring the result of the add operation to the accumulator. This means that even though it really takes 6 cycles for the microprocessor to do the ADC instruction, the programmer only need concern himself with the first 4 cycles as the next 2 are overlapped as shown.
[wow, so some instructions even overlap two clocks with the next!]
...
All instructions take at least 2 cycles; one to fetch the OP CODE and 1 to interpret the OP CODE and, with few exceptions, the number of cycles that an instruction takes is equal to the number of times that memory must be addressed.
Implied addressing is a single-byte instruction.
The byte contains the OP CODE which stipulates an operation internal to the microprocessor. Instructions utilizing this type of addressing include operations which clear and set bits in the P (Processor Status) register, incrementing and decrementing internal registers and transferring contents of one internal register to another internal register. Operations of this form take 2 clock cycles to execute. The first cycle is the OP CODE fetch and during this fetch, the program counter increments.
In the second cycle, the incremented P-counter is now the address of the next byte of the instruction. However, since the OP CODE totally defines the operation, the second memory fetch is worthless and any P-counter increment in the second cycle is suppressed. During the second cycle, the OP CODE is decoded with recognition of its single byte operation.
In the third cycle [again, of a two-cycle instruction], the microprocessor repeats the same address to fetch the next OP CODE.
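The implied-addressing description quoted above can be sketched as a short address trace in Python. This is only a model of what the manual passage says appears on the address bus; the function name and the example address $C000 are hypothetical.

```python
def implied_bus_addresses(pc):
    """Addresses read during a two-cycle implied instruction, plus the
    'third cycle', which is the opcode fetch of the next instruction."""
    trace = []
    # Cycle 1: opcode fetch; the program counter increments.
    trace.append((1, pc, "opcode fetch"))
    pc = (pc + 1) & 0xFFFF
    # Cycle 2: the next byte is read but discarded; no PC increment.
    trace.append((2, pc, "discarded read, PC increment suppressed"))
    # Cycle 3: the same address is read again as the next opcode.
    trace.append((3, pc, "opcode fetch of the next instruction"))
    return trace


# Hypothetical example: a one-byte instruction located at $C000.
for cycle, address, what in implied_bus_addresses(0xC000):
    print(f"cycle {cycle}: read ${address:04X}  {what}")
```

The trace shows the same address on the bus in cycles 2 and 3, which is why the "third cycle" of a two-cycle instruction is externally indistinguishable from the first cycle of the next one.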