Bregalad shared a nice multiply here:
Wiki: 8-bit Multiply
CC65 has some multiply/divide routines (see umul/udiv/imul/idiv):
CC65 GitHub runtime module
Keldon recently shared a multiply using a lookup table:
Forum: Relatively fast multi, ...
In general, you can make it faster with lookup tables, but they take up space.
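To give a feel for the speed/space tradeoff, here's a rough sketch of one well-known table approach, the quarter-square method, written in C for readability. This is not necessarily what the linked thread does; the names `quarter_sq`, `init_quarter_squares` and `mul8x8_table` are just made up for illustration. It relies on the identity a*b = floor((a+b)^2 / 4) - floor((a-b)^2 / 4).

```c
#include <stdint.h>

/* quarter_sq[n] = floor(n*n / 4) for n = 0..510 (largest a + b for 8-bit inputs).
   511 entries of 16 bits each: about 1 KB of table. */
static uint16_t quarter_sq[511];

/* Fill the table once at startup (on real hardware it would likely live in ROM). */
void init_quarter_squares(void)
{
    for (uint32_t n = 0; n < 511; n++)
        quarter_sq[n] = (uint16_t)((n * n) / 4);
}

/* 8 x 8 = 16 bit unsigned multiply using two table lookups and a subtract. */
uint16_t mul8x8_table(uint8_t a, uint8_t b)
{
    uint16_t sum  = (uint16_t)(a + b);
    uint16_t diff = (a > b) ? (uint16_t)(a - b) : (uint16_t)(b - a);
    return (uint16_t)(quarter_sq[sum] - quarter_sq[diff]);
}
```

The multiply itself becomes an add, a subtract and two lookups, at the cost of roughly 1 KB for the table.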
There are unfortunately a lot of different needs for multiplication (e.g. 8-bit x 8-bit = 16-bit? 16 x 16 = 32? signed/unsigned? remainder or fixed point?), so it might be tricky to find a "generic" one that does exactly what you want efficiently.
If you want to try writing your own, the simple method isn't really that tricky. If you know how to do long multiplication / long division by hand, it's basically just this technique in binary (i.e. for multiply it's a left shift x2 for each row instead of x10, but you still just add up the rows in the end).
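As a minimal sketch of that by-hand technique (in C rather than 6502 assembly, just to show the shape of the loop), something like the following should work for unsigned 8-bit inputs. The function names `mul8x8` and `div8` are made up for this example, and the divide assumes a nonzero divisor.

```c
#include <assert.h>
#include <stdint.h>

/* Long multiplication in binary: one "row" per bit of b.
   Each row is a shifted left once more (x2 instead of x10). */
uint16_t mul8x8(uint8_t a, uint8_t b)
{
    uint16_t result = 0;
    uint16_t row = a;
    while (b != 0) {
        if (b & 1)          /* this bit of b is set: add the current row */
            result += row;
        row <<= 1;          /* next row is worth twice as much */
        b >>= 1;
    }
    return result;
}

/* Long division in binary: bring down one bit of the dividend at a time,
   and subtract the divisor whenever it fits. */
uint8_t div8(uint8_t dividend, uint8_t divisor, uint8_t *remainder)
{
    assert(divisor != 0);   /* assumption: caller never divides by zero */
    uint8_t quotient = 0;
    uint16_t rem = 0;
    for (int i = 7; i >= 0; i--) {
        rem = (uint16_t)((rem << 1) | ((dividend >> i) & 1));
        quotient <<= 1;
        if (rem >= divisor) {
            rem -= divisor;
            quotient |= 1;
        }
    }
    *remainder = (uint8_t)rem;
    return quotient;
}
```

Both loops run at most 8 times for 8-bit inputs, which is why this approach is a reasonable default when you don't have room for tables.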
Finally, there's an even simpler version where you can implement multiply as just a repeated add, or divide as counting a repeated subtract. This is slow and inefficient in the general case, but if your numbers are small or you're in a situation where the result is cumulatively maintained, they can be pretty effective anyway.
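For comparison, here's a rough sketch of the repeated add / repeated subtract versions, again in C with made-up names (`mul_repeat`, `div_repeat`), and again assuming unsigned 8-bit inputs and a nonzero divisor.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply as repeated addition: add a to the total, b times. */
uint16_t mul_repeat(uint8_t a, uint8_t b)
{
    uint16_t total = 0;
    for (uint8_t i = 0; i < b; i++)
        total += a;
    return total;
}

/* Divide as counting repeated subtraction: how many times does
   divisor fit into dividend? Whatever is left over is the remainder. */
uint8_t div_repeat(uint8_t dividend, uint8_t divisor, uint8_t *remainder)
{
    assert(divisor != 0);   /* assumption: caller never divides by zero */
    uint8_t count = 0;
    while (dividend >= divisor) {
        dividend -= divisor;
        count++;
    }
    *remainder = dividend;
    return count;
}
```

The loop count here depends on the value of the operand (up to 255 iterations) rather than on its bit width, which is why it only pays off for small numbers. The "cumulatively maintained" case is the same idea spread over time: e.g. instead of computing row * width each time, keep a running offset and add width whenever the row advances.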