normalizing vector without sqaure root?

What it says on the tin: a place to discuss proposed new features.
Post Reply
User avatar
JaJDoo
Gnome
Posts: 343
Joined: Wed Feb 04, 2009 9:15 pm
x 5

normalizing vector without sqaure root?

Post by JaJDoo »

as i understand it, normalizing requires square root, which limit its usefulness in per-frame uses.

normalizing can also be done with trig function, which can use trig tables for speed.

ie (i doubt you need this example, but still) :
x: 13, y:7
angle : ATan ( y / x ) = ~28
new y : sin(28) = ~0.46
new x : cos(28) = ~0.88
or calculate vector length ( x/cos(a) or y /sin(a) )

understandingly its significantly faster when using trig tables, and can provide a per-frame tool when accuracy is not critical..
some post from somewhere:
"So you basically want to make a car without a steering wheel because you don't know how to drive. I'd say learn how to use pointers"
SigfriedMcWild
Halfling
Posts: 62
Joined: Wed Mar 03, 2004 3:12 am
Location: just a figment of your imagination
Contact:

Re: normalizing vector without sqaure root?

Post by SigfriedMcWild »

Using lookup tables on modern hardware is a bad idea. Memory is slow, very very slow.

By all means try it out with a profiler, but I expect calculating the square root on the cpu is going to be faster than a lookup table
User avatar
xavier
OGRE Retired Moderator
OGRE Retired Moderator
Posts: 9481
Joined: Fri Feb 18, 2005 2:03 am
Location: Dublin, CA, US
x 22

Re: normalizing vector without sqaure root?

Post by xavier »

Pretty much every modern compiler is going to use the VPU (SSE/AVX on CPU) for all math operations, and "rsq" (reciprocal square root) is a common operation supported directly in SSE.

When accuracy is not critical, I'd take the few clocks of latency for a RSQRTSS instruction over the untold number of clocks to access a random memory location. In general, most re-normalizations in 3D graphics are not long-lived anyway. As always, the solution that is right for you is the one that works best in your particular situation -- have you profiled both approaches to see which one performs better for you?
Do you need help? What have you tried?

Image

Angels can fly because they take themselves lightly.
User avatar
xavier
OGRE Retired Moderator
OGRE Retired Moderator
Posts: 9481
Joined: Fri Feb 18, 2005 2:03 am
Location: Dublin, CA, US
x 22

Re: normalizing vector without sqaure root?

Post by xavier »

FWIW:

- The accuracy of RSQRTSS according to the IA instruction set reference matches what I've seen documented elsewhere (in the GCC bugs, for example), namely you get 12 bits of accuracy (the relative error is < 1.5 * 2 ^ -12))

As you can see, for the following code

Code: Select all

int _tmain(int argc, _TCHAR* argv[])
{
	float f[3];

	srand(12345);
	f[0] = (float)rand() / (float)RAND_MAX * 3;
	f[1] = (float)rand() / (float)RAND_MAX * 3;
	f[2] = (float)rand() / (float)RAND_MAX * 3;

	float r[3];
	r[0] = 1 / sqrt(f[0]);
	r[1] = 1 / sqrt(f[1]);
	r[2] = 1 / sqrt(f[2]);

	printf("r: (%f,%f,%f)\n", r[0], r[1], r[2]);

	return 0;
}
the Intel compiler produces the following abridged assembly

Code: Select all

;;; 
;;; 	float r[3];
;;; 	r[0] = 1 / sqrt(f[0]);

        movss     xmm0, DWORD PTR [128+esp]                     ;18.2
$LN13:

;;; 	r[1] = 1 / sqrt(f[1]);

        movss     xmm3, DWORD PTR [132+esp]                     ;19.2
$LN15:
        cvtsi2ss  xmm6, eax                                     ;15.2
        divss     xmm6, DWORD PTR [_2il0floatpacket.20]         ;15.2
        mulss     xmm6, DWORD PTR [_2il0floatpacket.21]         ;15.2
$LN17:
        rsqrtss   xmm1, xmm0                                    ;18.2
        mulss     xmm0, xmm1                                    ;18.2
        mulss     xmm0, xmm1                                    ;18.2
        subss     xmm0, DWORD PTR [_2il0floatpacket.21]         ;18.2
$LN19:

;;; 	r[2] = 1 / sqrt(f[2]);
;;; 
;;; 	printf("r: (%f,%f,%f)\n", r[0], r[1], r[2]);

        mov       DWORD PTR [esp], OFFSET FLAT: ??_C@_0P@A@r?3?5?$CI?$CFf?0?$CFf?0?$CFf?$CJ?6?$AA@ ;22.2
$LN21:
        mulss     xmm1, xmm0                                    ;18.2
        mulss     xmm1, DWORD PTR [_2il0floatpacket.22]         ;18.2
$LN23:
        cvtss2sd  xmm2, xmm1                                    ;22.2
        movsd     QWORD PTR [4+esp], xmm2                       ;22.2
$LN25:
        rsqrtss   xmm4, xmm3                                    ;19.2
        mulss     xmm3, xmm4                                    ;19.2
        mulss     xmm3, xmm4                                    ;19.2
        subss     xmm3, DWORD PTR [_2il0floatpacket.21]         ;19.2
$LN27:
        rsqrtss   xmm7, xmm6                                    ;20.2
        mulss     xmm6, xmm7                                    ;20.2
So in short, I wouldn't worry about it. You already said this is for cases where accuracy is less important, and for cases where it's very important (HPC for example) there's always Newton-Raphson (slower but more accurate).
Do you need help? What have you tried?

Image

Angels can fly because they take themselves lightly.
User avatar
JaJDoo
Gnome
Posts: 343
Joined: Wed Feb 04, 2009 9:15 pm
x 5

Re: normalizing vector without sqaure root?

Post by JaJDoo »

well, i did a small experiment with gravity that required a square root each frame.
when i had one object fps was fine.
adding even one more made a significant slowdown in fps when second object was visible on screen.
i dropped the gravity idea for a few reasons; hard to plot, somewhat unpredictable at times, and the orbit itself doesn't 'stay' (even if locked into endless orbit, the ellipse will rotate around the focal point..) - reality makes it hard :P

i decided to go on a "fake" gravity on elliptical course.
so square root isn't the devil they say it is ?
some post from somewhere:
"So you basically want to make a car without a steering wheel because you don't know how to drive. I'd say learn how to use pointers"
User avatar
mkultra333
Gold Sponsor
Gold Sponsor
Posts: 1894
Joined: Sun Mar 08, 2009 5:25 am
x 114

Re: normalizing vector without sqaure root?

Post by mkultra333 »

If you are normalizing you can multiply by the inverse square root, so then there's the fast inverse square root function made famous by Quake 3 Arena.

http://en.wikipedia.org/wiki/Fast_inverse_square_root

Code: Select all

float InvSqrt(float x)
{
	union {
		float f;
		int i;
	} tmp;
	tmp.f = x;
	tmp.i = 0x5f3759df - (tmp.i >> 1);
	float y = tmp.f;
	return y * (1.5f - 0.5f * x * y * y);
}
It's a sped up version of the Newton-Raphson method mentioned above that makes use of some weird float-to-integer demonology. I mention it for historical interest, since this page, http://assemblyrequired.crashworks.org/ ... uare-root/, shows that this is still significantly slower and less accurate than RSQRTSS... though it still does impressively well compared to other methods, if you can live with the approximation.
"In theory there is no difference between practice and theory. In practice, there is." - Psychology Textbook.
User avatar
JaJDoo
Gnome
Posts: 343
Joined: Wed Feb 04, 2009 9:15 pm
x 5

Re: normalizing vector without sqaure root?

Post by JaJDoo »

just as a suffix, there is a very weird thing going on -

the slowdown i talked about? well its only occurs when i have exactly 3 objects on screen. i a few just to see if there is farther slowdown, but the reverse happened! 4,5,6,7,8.... it just stabilizes. cutting down back to 3, the slow down returns!
odd...
some post from somewhere:
"So you basically want to make a car without a steering wheel because you don't know how to drive. I'd say learn how to use pointers"
SigfriedMcWild
Halfling
Posts: 62
Joined: Wed Mar 03, 2004 3:12 am
Location: just a figment of your imagination
Contact:

Re: normalizing vector without sqaure root?

Post by SigfriedMcWild »

I'd guess it isn't the square root then. Use a profiler to figure out where your application is wasting time. Just guessing won't get you anywhere.
User avatar
Jabberwocky
OGRE Moderator
OGRE Moderator
Posts: 2819
Joined: Mon Mar 05, 2007 11:17 pm
Location: Canada
x 218
Contact:

Re: normalizing vector without sqaure root?

Post by Jabberwocky »

JaJDoo wrote:the slowdown i talked about? well its only occurs when i have exactly 3 objects on screen.
The number 3 was removed from the OgreCore as an optimization - the team found that having less than 10 digits can increase performance dramatically. Unfortunately, any reference to the number 3 will cause serious instability and performance problems in the core ogre library, so try to avoid it if at all possible.
Image
User avatar
xavier
OGRE Retired Moderator
OGRE Retired Moderator
Posts: 9481
Joined: Fri Feb 18, 2005 2:03 am
Location: Dublin, CA, US
x 22

Re: normalizing vector without sqaure root?

Post by xavier »

Jabberwocky wrote: The number 3
5, sir.
Do you need help? What have you tried?

Image

Angels can fly because they take themselves lightly.
User avatar
Jabberwocky
OGRE Moderator
OGRE Moderator
Posts: 2819
Joined: Mon Mar 05, 2007 11:17 pm
Location: Canada
x 218
Contact:

Re: normalizing vector without sqaure root?

Post by Jabberwocky »

I know this sounds crazy, but maybe we could get rid of all numbers except for 0 and 1.
... nah, that'd never work.
Image
User avatar
jacmoe
OGRE Retired Moderator
OGRE Retired Moderator
Posts: 20570
Joined: Thu Jan 22, 2004 10:13 am
Location: Denmark
x 179
Contact:

Re: normalizing vector without sqaure root?

Post by jacmoe »

That really made my day! =)
/* Less noise. More signal. */
Ogitor Scenebuilder - powered by Ogre, presented by Qt, fueled by Passion.
OgreAddons - the Ogre code suppository.
Post Reply