Date: 19961029
From: Frank
To: PIC mailing list
Subject: Re: [PICS] Square Root routine for 16C64? FFT for 16C64!!!
We are developing a medical application using FFT to find small
oscillations in bloodpressure. We are using a PIC16C64.
The FFT routine is almost finished and will analyse a 64 10bit input
array. External memory is needed of course.
The output data is complex and we would like to calculate the frequency
spectrum by squaring the samples and then square rooting the sum.
- does anyone know how to perform SQUARE ROOTING or has come across some
code to do this?
- is anyone interested in a FFT (64 samples/10bit) routine for 16C64?
Thanks...
Date: 19961030
From: Unknown
To: PIC mailing list
Subject: Re: [PICS] Square Root routine for 16C64? FFT for 16C64!!!
Frank,
One of the most common methods is the Newton-Raphson method found in the
Microchip embedded Control Handbook.
Yes, I would be interested in seeing code on a FFT.
Date: 19961030
From: Jaap van Ganswijk
To: PIC mailing list
Subject: Re: [PICS] Square Root routine for 16C64? FFT for 16C64!!!
At 12:48 96-10-29 +0000, you wrote:
>- does anyone know how to perform SQUARE ROOTING or has come across some
>code to do this?
I used the following routine by hand, when
calculators didn't have square root buttons yet:
float
square(x)
float x;
{
float a;
float b;
a=1;
do {
b=x/a;
a=(a+b)/2
} until (fabs(a-b)N return n;
u=N;
l2=0;
while (u>>=1) l2++; //l2=highest bit set (log base 2)
l2>>=1; //l2=1/2 highest bit set
u=1L<>j++)
;
hiBitPos=j-2;
j=1<>=1;
C2=(int)(((long)x*ERR_C1)/j)-ERR_C2;
C2=ERR_C1-(int)(((long)C2*C2)/ERR_C1);
return L0+C1-C2;
}
void main(void)
{
us i;
long x;
//for (i=32758;32758<=i;++i)
//for (i=32758;i<65535;++i)
for (i=1;i<17;++i) {
x=log2K(i);
printf("%5u %5lX.%-4.4lX\n",i,x>>14,(x&0x00003FFFL)<<2);
}
}
Date: 19970818
From: Jim Webster
To: Motorola 6811 mailing list
Subject: Integer Square root
My latest project requires many RMS calculations to be performed. I'm
looking for a FAST integer square root algorithm that will be
restricted to handling 16 bit unsigned integers - returning 8 bit
unsigned int.
I'm presently using Newton-Raphson iteration on R - X^2 = 0. This
technique requires integer divisions and worst case has proven to be 7
divisions - yuk!
I'm considering using a 255 element look-up table and a binary search
routine. Any thoughts on this idea? Any other suggestions would be
appreciated.
Jim Webster, Electrical Engineer
Date: 19970822
From: Kevin Moody
To: Motorola 68HC11 mailing list
Subject: RE: Integer Square root
Hi Jim,
Try my routines - loosely based on Newton- Raphson but faster!!!!
******************************************
* UNSIGNED Square root routine *
* sqrt X -> A *
******************************************
sq16
ldaa #0
cpx #0
beq sqrt164
ldaa #1
cpx #4
blo sqrt164
pshb ;Save B
pshx ;Save NUMB
xgdx ;get in D
lsra
lsra ;Get in lower 6 bits
tab
ldx #sqrguess
abx
ldaa 0,x ;Get GUESS
psha ;Save PREV Guess
psha ;Save Guess!
ldab #8
pshb ;Cycle Counter
tsx
sqrt161
ldab 1,x
stab 2,x ;Guess becomes previous
aba
rora ;(QUOTIENT+GUESS)/2
staa 1,x ;NEW GUESS
ldx 3,x ;Get number
tab
ldaa #0
xgdx
IDIV ;D/X -> X, remainder in D
xgdx
cmpa #0
beq sqrt162
ldab #$FF ;QUOTIENT >255, LIMIT to 255
sqrt162
tba
tsx
cmpa 1,x
beq sqrt163 ;Same as Guess - DONE!!
cmpa 2,x
beq sqrt163 ;Same as Previous - DONE!!
dec 0,x ;cycle counter
bne sqrt161 ;Try again!
sqrt163
adda 1,x
rora
adca #0 ;Avergage with guess
pulb ;Dump cycle counter
pulx ;Guess and prev guess
pulx ;Restore NUMB
pulb ;Restore B
sqrt164
clc
rts
sqrguess
fcb $0F,$1F,$2E,$38,$40,$48,$4F,$55,$5B,$60,$66,$6C,$70,$76,$78,$7C
fcb $82,$88,$8C,$8E,$92,$96,$9A,$9C,$9F,$A2,$A7,$A9,$AC,$AF,$B3,$B5
fcb $B7,$BB,$BD,$BF,$C3,$C5,$C7,$CA,$CC,$CF,$D1,$D4,$D6,$D9,$DB,$DD
fcb $DF,$E2,$E4,$E6,$E8,$EB,$ED,$EF,$F1,$F3,$F5,$F7,$F9,$FB,$FD,$FF
Date: 19970824
From: Bob Smith
To: Motorola 68HC11 mailing list
Subject: RE: Integer Square root
Do division by integer multiply and shift (divide by power of two).
Some care is needed to control dynamic range of the argument and
possible overflow, but its much easier on the processor.
Some years ago I saw a direct method integer square root extraction
algorithim but have lost the reference. It was based on the old
base ten square root calculation method we had to learn in grade
school (or is anybody out there old enough to remember when American
Public schools taught useful things).
Has anyone preserved this technique??
Good Luck, Bob Smith
Date: 19970825
From: Jaap van Ganswijk
To: Motorola 68HC11 mailing list
Subject: RE: Integer Square root
Here some more stuff about determining square roots.
We're having a heat wave over here in the Netherlands so
thinking is getting difficult...
The method I used before calculators had square root buttons was:
sqr(x)
float x;
float precision;
{
float a;
float b=1;
do {
a=x/b;
b=(a+b)/2
} until (abs(b-a)xx) a=c;
else b=c;
} until (abs(b-a)>3
In both methods you can remove the abs function
when you can restrict your input range or when it
is already restricted.
Ah and how about using: sqr(x)=2**(log2(x)/2)?
Perhaps you can even handle more of your
calculations in the logarithmic domain.
Greetings,
Jaap
Date: 19970824
From: Forrest J. Cavalier III
To: Motorola 68HC11 mailing list
Subject: RE: Integer Square root
> Probably this is just Newton Raphson or whatever.
> To use it with integers multiply everything with an
> appropriate factor, for example a, b and the precision
> with 16 and x with 16*16=256.
You can get a "head start" on any square root by
successive approximation algorithm by either table-lookup,
or by bit shifts. Having a good guess can greatly reduce
the number of iterations of compute intensive x/b operations.
Since the table lookup for the starting guess already went by,
here is the bit shift method for getting the starting point.
int sqrt(int x)
{
int a = 1; /* Thats a number not a lowercase 'L' */
int b = x;
while(b) {
b >>= 2;
a <<= 1;
}
b = a;
a >>= 1;
/* At this point a is the closest power of 2 less than
the resultant square root, and b is just above.
Run whatever technique you like from here.
*/
b = (a+b+1)/2;
do {
a = x/b;
b = (a+b)/2;
} while((a>b) ? ((a-b) > 1) : ((b-a) > 1));
if (a != b) {
a = x/b;
b = (a+b)/2;
}
/* result in B */
}
Just as an example, by precalculating the closest power of two divisor
for the start....
sqrt(20000) is cut from 10 x/b operations
to 3 to get the closest integer <= sqrt(20000)
Why is that a good way to guess? Think logarithmically...
log( sqrt(x) ) = 1/2 log(x)
The number of shifts in the preparation loop will be
1/2 log_base_2( x )
Forrest Cavalier
Date: 19970825
From: Tom Carlson
Organization: US Digital
To: Motorola 68HC11 mailing list
Subject: Re: Integer Square root
I don't know what your RMSifying, but if its an analog signal, you could
also use an analog muliplier, and do this in hardware, before you read
the value with your A/D.
Just a thought
--
Tom Carlson, Design Engineer
Date: 19970825
From: Mike
To: Motorola 68HC11 mailing list
Subject: Re: Integer Square root
At 07:58 AM 8/25/97 -0700, you wrote:
>I don't know what your RMSifying, but if its an analog signal, you could
>also use an analog muliplier, and do this in hardware, before you read
>the value with your A/D.
Thats a good point Jim, I've considered doing RMS in software for a
current project (feasability study stage) but, after working out the
basics for 14bit resolution meant that the three channels (3 phase) I
wanted would have slowed everything else down too much - so I ended
up using a mux, AD736 (cheap RMS to DC) and A to D converter. That way
the software can do the important comms and reliability nontime-
intensive stuff...
Rgds
Mike
Date: 19970825
From: Dennis Tomlinson
To: Motorola 68HC11 mailing list
Subject: Re: Integer Square root
Hi Jim, Bob, & others;
> Do division by integer multiply and shift (divide by power of two).
> Some care is needed to control dynamic range of the argument and
> possible overflow, but its much easier on the processor.
>
> Some years ago I saw a direct method integer square root extraction
> algorithim but have lost the reference. It was based on the old
> base ten square root calculation method we had to learn in grade
> school (or is anybody out there old enough to remember when American
> Public schools taught useful things).
What a can of worms this could be!
> Has anyone preserved this technique??
OK, I have two schemes which in and of themselves are not solutions,
but which may be explooited. I am not the creator of either scheme -
just the saver of such stuff:
method 1 (sum of odds scheme):
This method is based on a discovery made by Carl Gauss when
he was less than a teenager. The following formula explains
the "trick":
n
--
\
/ (2n-1) = n**2
--
n=1
which says for any number which has an integer square root,
that root is the index of the summation of successive odd numbers which
yields the number. Examples:
1 = 1
1+3 = 4
1+3+5=9
1+3+5+7=16
1+3+5+7+9 = 25
etc.
I've ommited the proof. If anyone wants to see it, let me know.
method 2 (base ten trick):
I believe this is the method referred to by Bob Smith. The source
I got this from said it could be found in the World Bood Encyclopedia
(1959 edition - when American public schools were the envy of the
planet). The method consists of partitioning the number into groups
of two digits around the decimal point. An example is affords the easiest
explanation:
Suppose you wanted to calculate the square root of 1,248.4
------
\/1248.4
After partitioning would look like:
---------
\/12'48.40'
Step 1: Find the largest square which is <= 12. This would
be 9 (3*3). Write 3 as the first number in the
square root, subtract 9 from 12, and bring down the next
two digit partition as the remainder:
3
---------
\/12'48.40'
9
-----
3 48
Step 2: Multiply the partial square root, 3, by 2, and place the
product, 6, to the left of 348.
Determine the next digit of the square root by dividing
348 by 10 x 6, or 60: 348 / 60 = 5.
Write down the 5 as the next digit of the square root, and
place 5 to the right of the 6 to form the divisor, 65. Now
multiply 65 x 5 = 325, and subtract this product from 348.
Bring down the next partition to form the next remainder,
2,340:
3 5
---------
\/12'48.40'
9
-----
65|3 48
|3 25
--------
23 40
Step 3: Multiply the partial square root, 35 by 2, and place the
product, 70 to the left of 2,340.
Determine the third digit of the square root by dividing 2340
by 10 x 70 = 700: 2,340 / 700 = 3.
Write 3 down as the third digit of the square root, and place 3
to the right of 70 to form the divisor 703.
Multiply 703 x 3 = 2109, and subtract this product from 2,340.
Add two zeros to the right of the decimal point, and bring
them down to form the next remainder 2 31 00.
3 5. 3
-------------
\/12'48'40'00
9
-----
65|3 48
|3 25
--------
703 |23 40
|21 09
---------
2 31 00
This process can be continued until the desired accuracy is
achieved. I don't have the proof for this one. Also, I must
credit the source: Francis Richmond. I pulled this from:
ftp://ftp.armory.com/
Good Luck,
Dennis Tomlinson
Date: 19970825
From: Donald E. Haselwood
To: Motorola 68HC11 mailing list
Subject: RE: Integer Square root
I rummaged through my old pile of articles--I thought I had a fast
square-root, possibly an approximation article, circa 1980 (it seems an era
when computation tricks were in vogue in order to squeeze more performance
out of the 8080, 6800, and 6502 machines). No luck, but I did turn up a fast
approximation for:
m = sqrt( a^2 + b^2 )
The reference is Byte magazine, February, 1982, p 248. A Fast Approximation
for Fast Fourier, Mark H. Polczynski. The approximation is m = L + K*S,
where L is the larger of a and b, and K is a factor. A reasonable
approximation is obtained with K = .375, a value which allows simple shifts,
and gives an error in the +-5% range.
It contains no divides and is not interative.
If memory is available, the fastest scheme I came up with is a simple "test
& branch," checking if the number is in the upper half (i.e. greater than
128 * 128), then split that (e.g. if greater than 128 * 128, the see if
greater than 192 * 192; or, if less, test against 64 * 64), ect. with the
final instructions being ldab #q; rts, where "q" is the square root. The
process takes 8 test & branches which takes roughtly 64 cycles, though some
long jumps will be required as the code will be in the 1K range. It
appeared that a table approach is somewhat slower, however it saves memory.
Regards,
Donald E. Haselwood
Date: 19970825
From: Jaap van Ganswijk
To: Motorola 68HC11 mailing list
Subject: RE: Integer Square root
At 00:11 1997-08-25 +0200, Jaap van Ganswijk wrote:
>Here some more stuff about determining square roots.
>
>We're having a heat wave over here in the Netherlands so
>thinking is getting difficult...
Thinking was really getting difficult in this heat...
After reading Forrest J. Cavalier III's reaction I remembered
that I had already got some stuff about it in the ChipDir at:
http://www.twinight.org/chipdir/txt/sqroot.txt
It's basically my calculator method and Forrest's logarithmic
method I think, but I'll check it later and try to combine all
material into some sort of FAQ. Or is anybody else volunteering?
Greetings,
Jaap
Date: 19970826
From: Jaap van Ganswijk
To: Motorola 68HC11 mailing list
Subject: Re: Integer Square root
At 07:58 1997-08-25 -0700, Tom Carlson wrote:
>I don't know what your RMSifying, but if its an analog signal, you could
>also use an analog muliplier, and do this in hardware, before you read
>the value with your A/D.
Also consider using a DSP perhaps...
Greetings,
Jaap
Date: 19970825
From: Peter W. Richards
Organization: IDEO Product Development
To: Motorola 68HC11 mailing list
Subject: Re: Integer Square root
Method 1 is probably not too fast...on 'average' (depending on the
distribution of inputs of course) it'll take ~100 iterations....probably
no faster than 4 or 5 slower Newton/Raphson iterations...
BTW after some experimentation on my calculator it appears that N-R
takes about 3 iterations (sometimes 2!) to converge with an initial
guess of sqrt(2) for 'squares' between 1 and 4. Using a lookup table or
some shifting algorithm, any 'square' can be quickly prescaled into a
smaller range to greatly reduce the number of iterations (& slow
divisions) necessary.
Of course I could just write the code *for* you...how's $100/hr sound?
Method 2, while a fascinating bit of math, is probably a real bad idea
if implemented literally in base 10! Surely converting to base 10 and
back won't speed the algorithm up any!
It would be an insightful exercise, though, to convert the algorithm
below into its base-2 (or base-8 or base-16!) equivalent and compare it
with other algorithms! I still don't have high hopes, since this
algorithm still involves division.
Date: 19970826
From: Graeme Dunbar
Organization: The Robert Gordon University
To: Motorola 68HC11 mailing list
Subject: Re: Integer Square root
If I may join the recent flurry of activity on this subject after a
couple of days delay (which all goes to prove it takes roughly the
same amount of time to hunt out old yellowed books and magazines
where ever you live :-)
When this query came up I knew I had come across a method of
calculating square roots in integer arithmetic somewhere. I tracked
it down to an introductory book on Forth - C H Ting "The first course:
introduction to Forth and FPC". (Reading other contributions it is
good to know at last where it originally came from.)
It works by trying squares of integers 1,2,3... in turn against a
number until the square is bigger than the number. However squaring
the numbers is not necessary because:
(n + 1)^2 = n^2 + 2n + 1
so the next square in the sequence (n+1) can be calculated from n with
one shift and two additions (increments).
>From the Forth book this is:
: SQRT ( n1 -- sqrt(n1) )
0 \ Initialise the root
SWAP 0 \ Set n1 as the limit
DO
1+ DUP \ Try next root
2* 1+ \ Add 2n+1 to running total
+LOOP \ Loop if less than n1 else done
;
Or if you prefer a C-like pseudo code:
number = you_choose;
n = 0;
nsquared = 0;
do
{
n++;
nsquared = nsquared + 2 * n + 1;
}
while ( nsquared < number );
So (what I hope is) a working 68HC11 for the EVBU program is:
******************************************************************
Integer Square Root - 16 bit number, 8 bit result
PROG EQU $0100
DATA EQU $0180
ORG PROG
START LDD NUMBER Read the number from somewhere (5~)
STD LIMIT and keep it safe as for testing (5~)
CLRA Square root held in A, (2~)
LDX #0 (n+1)^2 held in X, (3~)
LOOP INCA Next guess for the root (2~*n)
TAB Perform arithmetic in B (2~*n)
ASLB first calculate 2n (2~*n)
INCB then 2n+1 and (2~*n)
ABX add to X = n^2+2n+1 = (n+1)^2 (3~*n)
CPX LIMIT (6~*n)
BLO LOOP Loop again if result < limit (3~*n)
RTS
* Timing = 5 + 5 + 2 + 3 + ( 2 + 2 + 2 + 2 + 3 + 6 + 3 )*n cycles
* = 15 + 20*n cycles (n = 255 max) = 5115 cycles = 2.56ms
ORG DATA
NUMBER FDB 42 Some test data
LIMIT RMB 2 Safe place for limit
END
**********************************************************************
This was written and run late last night so a pre-emptive apology for
any blunders. Do I hear the sound of a gauntlet hitting the deck?
Any advance on 5115 cycles (give or take a few for using direct
instead of extended addressing)?
Regards,
Graeme
Date: 19970826
From: Brian Trial
To: Motorola 68HC11 mailing list
Subject: RE: Integer Square root
> ******************************************************************
> Integer Square Root - 16 bit number, 8 bit result
>
> PROG EQU $0100
> DATA EQU $0180
>
> ORG PROG
>
> START LDD NUMBER Read the number from somewhere (5~)
> STD LIMIT and keep it safe as for testing (5~)
> CLRA Square root held in A, (2~)
> LDX #0 (n+1)^2 held in X, (3~)
> LOOP INCA Next guess for the root (2~*n)
> TAB Perform arithmetic in B (2~*n)
> ASLB first calculate 2n (2~*n)
> INCB then 2n+1 and (2~*n)
> ABX add to X = n^2+2n+1 = (n+1)^2 (3~*n)
> CPX LIMIT (6~*n)
> BLO LOOP Loop again if result < limit (3~*n)
>
> RTS
>
> * Timing = 5 + 5 + 2 + 3 + ( 2 + 2 + 2 + 2 + 3 + 6 + 3 )*n cycles
> * = 15 + 20*n cycles (n = 255 max) = 5115 cycles = 2.56ms
>
>
> ORG DATA
> NUMBER FDB 42 Some test data
> LIMIT RMB 2 Safe place for limit
>
> END
> **********************************************************************
> This was written and run late last night so a pre-emptive apology for
> any blunders. Do I hear the sound of a gauntlet hitting the deck?
> Any advance on 5115 cycles (give or take a few for using direct
> instead of extended addressing)?
>
> Regards,
> Graeme
>
Sorry for another post on this topic, but I did hear a gauntlet.....
If I have this right, this takes 23 + 42 * n (n = 8 max) = 359 = under
0.2 ms
Date: 19970829
From: Earl Popard
To: Motorola 68HC11 mailing list
Subject: Re: Integer Square root
Jim
I implemented a look-up table solution to this in 140 cycle times (~70 us).
Unfortunately the routine chews up about 600 bytes of memory. If you are
interested please email me directly at earl@flametech.com and I will send
you the code.
Earl Popard
Date: 19970902
From: Jim Webster
To: Motorola 6811 mailing list
Subject: RE:Integer Square Root
>Please post a copy to me of your routine.
>
>You might outline your methodology to the list, with code snippets if it
>is too long, but I would imagine the WHOLE code is not very large, as fast
>as it is, and using only 600 bytes of memory!
>
>Thanks in advance,
>R. Oleksuk
OK R, here it is. A complete description of Newton-Raphson iteration can be
found in any numerical analysis text but briefly:
Normalize the function, in our case we use f(x) = x^2 - R where R is the
argument of which we need a square root and x is the root desired. We are now
trying to find a particular zero of f.
You will typically have some idea of where to search for a solution to the
equation - in the case of integer square root of a 16 bit number we know the
solution lies between 0 and 255 inclusive. In more complicated functions you
might need to pre-process the function to find a starting point for N-R
iteration.
Pick a starting point, Xo, by some means - I found Xo = R>>5 to be the best
starting point for my application. Find the equation of the tangent line to
f(Xo). Your next guess will be the X at which the tangent line crosses the Y
axis (it might help to draw a picture of this step). Repeat until |Xn - Xn-1|
<= accuracy desired.
This method converges on the square root of a 16 bit unsigned integer pretty
quickly - typically in 3 iterations. Note that, mathematically, you can obtain
any accuracy desired using N-R on "well behaved" functions. Of course, in
practice you're limited by that nasty little critter known as the machine
epsilon.
Note that the code to implement Newton-Raphson on f(x) = sqrt(x) listed below is
optimized for this particular function. The more general algorithm requires
more code.
/*********************************************************************
************** I N T E G E R S Q U A R E R O O T ***************
**********************************************************************
* *
* Module name: Isqrt()
*
* Description: This routine uses Newton-Raphson iteration to return
* the square root of a 16 bit integer. The choice of initial value
* was made from empirical data. Divisors of 2^n were tested so that
* an integer division could be replaced by multiple right shifts.
* 128 turned out to be the optimal value, yielding an average of 3.5
* iterations of the loop.
* A note on the exit condition (xn - xs > 1). I tried exiting
* when xn == xs but on certain parameters, this condition is never
* met. The exit condition used, produces an off-by-one error in the
* returned value 0.4% of the time.
* Average execution time is 476 cycles.
* Worst case execution time is 818 cycles (7 iterations).
*
* Module dependencies: none.
* Global variables affected: none.
* Registers used/affected: none.
* Parameters: 1 unsigned integer x.
* Returns: 1 byte: sqrt(x).
*
* History: First version 0.01.0 8/19/97.
* Written by Jim Webster
*********************************************************************/
byte Isqrt(unsigned int x)
{
unsigned int xn,xs;
if (x<=1)
return(x);
xn=x/128; //First guess: Xo = x>>5
if (xn==0) xn=1; //Prevent divide by 0
do
{
xs = xn; //Save present Xn
xn = (xn+x/xn)/2; //Compiler outputs right shifts for /2^n
}
while (xn - xs > 1);
return((byte) xn);
}
B.T.W, thanks to everyone who replied to my post.
Date: 19970902
From: Peter W. Richards
Organization: IDEO Product Development
To: Motorola 68HC11 mailing list
Subject: Re: Integer Square Root
...
> Pick a starting point, Xo, by some means - I found Xo = R>>5 to be the best
> starting point for my application. Find the equation of the tangent line to
...
Wise choice of a starting point WILL SAVE YOU MANY ITERATIONS! If a
simple algorithm
saves you even one iteration it will probably be a win. With the
following choice of Xo, it appears (haven't verified this 100%!) that
the worst-case convergence takes about 3, maybe 4 iterations. These
tests could easily be implemented with a few shifts & bit masks...
Xo R
sqrt(2) 1<=R<4
2*sqrt(2) 4<=R<16
4*sqrt(2) 16<=R<64
8*sqrt(2) 64<=R<256
16*sqrt(2) 256<=R<1024
etc etc etc
This kind of normalization will really only help for large R where there
are enough
significant digits to matter. For R<64, it'll probably only take about
2 iterations anyway and not much can help...
Date: 23 Nov 1997
From: Mark Schultz
To: 68HC11 Mailing List
Subject: SQRT16 and DIV32
Several weeks (months?) ago the topic of how to compute a integer square
root was brought up and a number of approaches offered. While this
response is not exactly timely, I offer a fairly efficient method for
computing the square root of a 16-bit number, based on a clever
successive-approximation technique that was posted (with one or two bugs)
by another member of the list. I apologize for not being able to properly
credit the original author; I do not recall his name.
Before posting the routine, I'll use this as a lead-in to a question of my
own: Is there any clever way known to take advantage of the 16x16 IDIV and
FDIV instructions to create a 32x16 integer divide? I'm presently using a
bit-oriented routine that requires around 3800 cycles (worst case) to
complete. While this is OK, I'd like to do better if I can. I know, for
example, that one can use the built-in 8x8 MUL along with addition to
create a multi-byte MUL. In the case of a 16x16 MUL, four 8-bit MULs and
as many 16-bit ADDs are required. I'm looking for a similar 'trick' to use
with a 32/16 divide.
The square root routine shown below requires less than 400 cycles to
complete and is fully re-entrant. I can also provide other routines from
my math library (8x16 MUL, 16x16 MUL, 32/16 DIV, 32-bit modified Newton's
method SQRT) if interested.
One final comment, also apropos given the recent discussion on code style.
For what it's worth, the example below reflects my philosophy on style and
commenting, at least for assembly-language code.
;*****************************************************************************
;Subroutine: SQRT16
;Function: Calculate square root of 16-bit number
;
;Parameters: D -> 16-bit value to compute square root of
; A <- SQRT(D)
;
;Notes: The approach used to compute the square root is very similar
; to that used to perform analog to digital conversions using
; successive approximation. The main computation loop
; iterates once for each bit in the result (starting with the
; MSB), determining if the bit should be set or clear in the
; final result.
;
; Starting at bit 7 and working downward, the square of
; (successive-approximation register + 2^bit#) is compared
; against the value passed. If (Value < SAR^2) then the
; current bit in the SAR is set. Operation continues with
; continually lower-significance bits until 8 bits have
; been processed.
;
; The value returned will ALWAYS be less than or equal to the
; actual square root; this routine does not round the result.
; Thus, the error can be expressed as:
;
; Error = ActualSQRT(n) - ComputedSQRT(n)
; 0.0 <= Error < 1.0
;
;Stack usage: 8 bytes (including JSR)
; Routine is fully re-entrant
;
;Destroyed: Nothing
;*****************************************************************************
SQRT16 PROC
; Set up environment:
; 0 - SAR bitshifter; current bit under test
; 1 - Accumulated successive-approximated result (valid on exit)
; 2,3 - Value to find square root of
; 4,5 - Saved X (X is used for stack indexing)
PSHX ;Save X
PSHB ;Save value passed
PSHA ;
CLRA ;Initialize result / current guess
PSHA ;
LDAA #$80 ;Initialze SAR bitshifter (start w/MSB)
PSHA ;
TSX ;Point to variables on stack
; Find square root of value passed using
; successive approximation technique.
SQRT16a
LDAA 1,X ;Get current accumulated result
ADDA 0,X ;Add bit-under-test
TAB ;
MUL ;Square (result + new bit)
CPD 2,X ;Compare guess with value passed
BHI SQRT16b ;Current bit not set if guess > value
LDAA 0,X ;Current bit is set in final result
ADDA 1,X ;
STAA 1,X ;
SQRT16b
LSR 0,X ;Try next bit down
BNE SQRT16a ;Continue until all 8 bits checked
; Clean up & exit
INS ;Deallocate SAR bitshifter
PULA ;A <- SQRT(D)
INS ;Deallocate value passed (D)
PULB ;Restore B
PULX ;Restore X
RTS ;Exit
SQRT16 ENDPROC
-----------------------------------------------------------------------
Mark Schultz