Hello. I can do this, but I need some clarification:
Do you want one bit per clock tick? At 250 kHz that is 4 microseconds (0.004 ms) per bit, not 0.5 milliseconds.
And 0.5 ms per bit is 4 ms per byte, not 3.
Drawing a timing diagram is not easy, but it leads to fewer misunderstandings.
P.S. A hardware timer interrupt may not reduce the overhead. At 250 kbit/s there are only 20 MHz / 250 kHz = 80 free microcontroller cycles per bit, which is not much, and saving and restoring 32 registers will take most of them. So it can take 50-100% of the processor load.