NEON optimization in bada - Bada Software Development

http://developer.bada.com/article/NEON-optimization-in-bada?rlWlfcp=duq&isReturn=Y
"NEON can execute arithmetic, MAC (multiply-accumulate), logic, comparison and many other operations. NEON is a powerful thing, which can speed up your bada application by anywhere from 30% up to 20 times."
Maybe useful info.
Best Regards

Definitely - it's vector computing. Very nice for crunching lots of data in parallel, such as texture matrices, simple logic or math operations, and other work that can be broken down into small blocks for execution: one instruction -> multiple data tuples.
This makes bada dev much more interesting for game developers.
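
To make the "one instruction -> multiple data" idea concrete, here is a minimal C++ sketch. It is not taken from the linked article: the function name `add4` is made up for illustration, and the scalar loop is only a fallback for builds where NEON is not available. `vld1q_f32`, `vaddq_f32` and `vst1q_f32` are standard NEON intrinsics from `arm_neon.h`.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Adds four floats from a and b into out (hypothetical helper name).
void add4(const float* a, const float* b, float* out) {
    assert(a != nullptr && b != nullptr && out != nullptr);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    // Load four floats into each 128-bit register, then add all lanes at once.
    const float32x4_t va = vld1q_f32(a);
    const float32x4_t vb = vld1q_f32(b);
    vst1q_f32(out, vaddq_f32(va, vb));
#else
    // Scalar fallback: four separate additions.
    for (std::size_t i = 0; i < 4; ++i) {
        out[i] = a[i] + b[i];
    }
#endif
}
```

On a NEON build the four additions collapse into a single vector add; how much that helps in a real application depends on how much of the workload has this shape.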

Related

Why can the iPhone 3G, with a slower processor, run games better than the HTC Hero?

As far as I know, the iPhone 3G runs on a 400 MHz processor (based on CNET), while the HTC Hero runs on a 528 MHz processor. Yet iPhone games look much better and run smoothly, while games on Android devices, such as Raging Thunder 2 and Super KO Boxing, lag badly. Can someone explain why?
Thanks in advance...

Most probably it's the dedicated/better graphics chip in the iPhone compared to the one in the HTC Hero. Plus, I think the iPhone uses a programming language (C?) that is a bit faster than Android's Java.

Yes, maybe it's the better graphics chip...
But I think it has to do with the ability to program for one piece of hardware. No surprises!
iPhone OS runs on one phone!
Android is on so many different phones with different features, hardware, limits and capabilities.
Say you are a programmer looking to develop a new game.
On the iPhone, you know exactly what to expect and how to make your game perform the best it can.
Now imagine developing the same game for Android. You have to keep in mind all the different phones: screen sizes, screen technologies, graphics chips, CPUs, memory sizes, keyboard or no keyboard, trackball, optical ball, D-pad, etc. This list can drive you crazy! What do you do? At each turn you have to decide what you can program for and what you have to leave unsupported.

Ahhhhh, I get it... It makes sense. Thanks for the answer.

There are three reasons:
1) The iPhone's CPU has a built-in Floating Point Unit (FPU), whereas the Hero's CPU doesn't. This means that when doing mathematics involving real numbers with a decimal point (e.g. numbers like 1.23, 3.14159, rather than integer numbers like 1, 73 and 492363), the iPhone is considerably faster, probably by an order of magnitude. 3D games make a lot of use of that kind of mathematics.
2) iPhone programs are compiled to run directly on the iPhone's CPU, whereas Android programs are compiled to run on a Java Virtual Machine, which in turn runs on the Hero's CPU. This extra level of indirection means that the programs run maybe 5 - 10 times slower than they could if they ran directly on the CPU.
3) The iPhone has a more powerful GPU (Graphics Processing Unit) - this means that it is capable of drawing more things to the screen in one frame than the Hero is.
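
As an aside on point 1: a common workaround on CPUs without an FPU is fixed-point arithmetic, which represents fractional numbers with plain integers. The poster does not describe this technique; the sketch below is only an illustration, and the 16.16 format and all names in it (`fixed`, `to_fixed`, `fixed_mul`, `fixed_div`) are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// 16.16 fixed point: upper 16 bits hold the integer part,
// lower 16 bits hold the fraction (illustrative format choice).
using fixed = std::int32_t;
constexpr int kFracBits = 16;
constexpr std::int64_t kOne = std::int64_t{1} << kFracBits;

constexpr fixed to_fixed(double v) { return static_cast<fixed>(v * kOne); }
constexpr double to_double(fixed v) { return static_cast<double>(v) / kOne; }

// Multiplies two fixed-point values using integer instructions only.
// The product is widened to 64 bits so it cannot overflow before rescaling.
fixed fixed_mul(fixed a, fixed b) {
    return static_cast<fixed>(static_cast<std::int64_t>(a) * b / kOne);
}

// Divides two fixed-point values; the dividend is pre-scaled to keep the fraction.
fixed fixed_div(fixed a, fixed b) {
    assert(b != 0);
    return static_cast<fixed>(static_cast<std::int64_t>(a) * kOne / b);
}
```

The trade-off is limited range and precision: a 16.16 value only covers roughly -32768 to 32767 with a resolution of 1/65536.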

Android phones don't have much internal storage, which limits games.

Dan330 said: "Android is on so many different phones with different features, hardware, limits and capabilities."
Java was supposed to be platform independent in the beginning... oh well... the wonders of theory vs reality.

Most laggy games are laggy because of bad programming.
This can be observed when two games/apps have similar graphics and one is laggy while the other is not. I've experienced this quite a lot. You can make decent games with Java, especially in 3D, since it just calls "native" OpenGL ES functions and doesn't have to do the rendering itself. If you need an extra boost, you can build native libraries and ship them with your app. Of course you lose a bit of platform independence, but it's not a big deal: porting the app to a device with a different processor is a mere cross-compilation of that library away.

PlanetTimmy said: "2) iPhone programs are compiled to run directly on the iPhone's CPU, whereas Android programs are compiled to run on a Java Virtual Machine, which in turn runs on the Hero's CPU. This extra level of indirection means that the programs run maybe 5 - 10 times slower than they could if they ran directly on the CPU."
I don't think that's the problem here. You can write performance-critical code with the NDK to get the performance you need. There are a lot of videos of Motorola Droid/Milestone games, and they work great.
http://www.youtube.com/watch?v=mn-XaaQXIxw
http://www.youtube.com/watch?v=GUlsfP38lSM
http://www.youtube.com/results?search_query=quake+3+motorola&aq=f
The Motorola Milestone has a powerful GPU (PowerVR) and beats the latest Snapdragon-based devices.
Qualcomm has always delivered poor performance in their SoC solutions.
Plus, the Qualcomm MSM7200A lacks an FPU... what a shame. Screw you, Crapcomm and HTC (for using cheap hardware such as the SoC, display, etc.). I keep wondering why HTC doesn't launch a true super smartphone with a real GPU, a high-quality touchscreen, etc. And what's strange is that even though they use cheap hardware, their devices are more expensive than those from other manufacturers... hahaha

Java VS C++

Hello everybody,
I'm working on an augmented reality application for Android, so I need efficient image processing. For the moment, all my code is written in Java, but computation time is much too high.
Do you think that using C++ code with the NDK could significantly improve performance? To give you an example, my first computation builds an integral image matrix: every pixel of a picture is converted to greyscale and added to two values of the matrix.
Thanks for your help!
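
For readers unfamiliar with the computation described above, here is a minimal C++ sketch of an integral image. It is not the poster's code: the function name, the packed 0xAARRGGBB pixel layout and the integer greyscale weights (77/150/29 out of 256) are all assumptions chosen for the example. Each output cell is the grey value added to two running values: the sum of the current row so far and the cell directly above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds an integral image from packed 0xAARRGGBB pixels (assumed layout).
// sum[y * width + x] holds the total grey value of the rectangle (0,0)..(x,y).
std::vector<std::uint32_t> integral_image(const std::vector<std::uint32_t>& argb,
                                          std::size_t width, std::size_t height) {
    assert(argb.size() == width * height);
    std::vector<std::uint32_t> sum(width * height);
    for (std::size_t y = 0; y < height; ++y) {
        std::uint32_t row = 0;  // running sum of the current row
        for (std::size_t x = 0; x < width; ++x) {
            const std::uint32_t p = argb[y * width + x];
            const std::uint32_t r = (p >> 16) & 0xFFu;
            const std::uint32_t g = (p >> 8) & 0xFFu;
            const std::uint32_t b = p & 0xFFu;
            // Integer approximation of 0.299 R + 0.587 G + 0.114 B.
            const std::uint32_t grey = (r * 77u + g * 150u + b * 29u) >> 8;
            row += grey;
            const std::uint32_t above = (y > 0) ? sum[(y - 1) * width + x] : 0u;
            sum[y * width + x] = row + above;
        }
    }
    return sum;
}
```

With 32-bit sums this stays in range for images up to about 16 million pixels; a loop like this, working on a flat buffer with integer math only, is the kind of code that tends to port well to the NDK.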

I'm not a C++ developer, but yes, I think you could gain a huge performance boost. Even if Java were JITed, and even if it were as fast as native code, C++ has e.g. pointers, which could significantly increase performance in image processing.

I'm not an Android dev (yet), but I am experienced in C++ and Java. For raw data computation, I would imagine you'd get a pretty good speed boost over Java. But it couldn't be a straight port of the code; you would need to take advantage of what C++ offers over Java, because per-pixel calculations are going to take a while no matter which way you go about it.

Yes, better!
Just to confirm, after some benchmarks, C++ clearly outperforms Java. A complex Java operation that takes 58 ms on average on the HTC Desire is done in 14 ms in C++ on the same platform!
Thanks for your reply.
Caraphcole
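
For anyone repeating this kind of comparison on the native side, a minimal timing sketch follows. The poster's benchmark code is not shown, so the helper name `average_ms` and the approach of averaging over several runs are assumptions for illustration. For reference, 58 ms against 14 ms works out to roughly a 4.1x speed-up.

```cpp
#include <cassert>
#include <chrono>

// Returns the average wall-clock time of one call to work(), in milliseconds.
// Averaging over several runs smooths out scheduling noise.
template <typename F>
double average_ms(F&& work, int runs) {
    assert(runs > 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < runs; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count() / runs;
}
```

It would be called with the routine under test wrapped in a lambda, for example `average_ms([&] { process_frame(); }, 100)`, where `process_frame` stands in for whatever is being measured.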

As I said, I think you could get a much, much bigger boost in some situations, especially when working with binary data like images, sounds, etc.

OpenCL... you seen this?

Hi guys, just alerting you to the presence of this: http://www.engadget.com/2013/01/04/opencl-mod-for-the-kindle-fire-hd/
Seems pretty good, eh? Do you think Amazon will end up working with this? It'd be pretty awesome.

good

PowerVR GPU OpenCL mod for the Kindle Fire HD Series
jamajenx,
This is a great find! Thank you, we will have to take a peek at the code to see how we can improve overall PowerVR GPU graphics performance.
We have tested Chainfire 3D Pro: https://play.google.com/store/apps/details?id=eu.chainfire.cf3d.pro&hl=en
It gave a nice improvement in overall OpenGL performance on the Kindle; give it a try. In case the difference between OpenCL and OpenGL is confusing to the reader, here is more information.
OpenGL (Open Graphics Library) is one of the most popular toolsets available for graphical processing, and many computer games and CAD tools rely on it for 3D rendering. Originally developed by Silicon Graphics in the early 1990s, OpenGL has been ported to Windows, Linux, Mac OS, and many embedded devices. On desktop computers, a modern OpenGL application consists of two parts: a host application that runs on the CPU and special-purpose routines called shaders that execute on the graphics processing unit, or GPU. In general, the CPU handles complex routines such as physics and geometry, and the GPU performs simple tasks like assigning positions to vertices and colors to pixels.
In contrast, OpenCL (Open Computing Language) is only a few years old and isn't nearly as well-known as OpenGL. However, it allows developers to access GPUs (and many other devices) for purposes other than graphics. Because of this general-purpose GPU (GPGPU) processing, OpenCL is frequently employed to crunch numbers at high speed, and common OpenCL applications include data sorting, statistical computation, and frequency analysis. An OpenCL application consists of a host application that runs on the CPU and general-purpose routines called kernels that can execute on any OpenCL-compliant device, including a GPU.
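
To illustrate the host/kernel split described above, here is a minimal sketch. It does not come from the Kindle Fire HD mod: the kernel name `saxpy` and the C++ names are made up for the example, and the host-side OpenCL setup (platform, context, queue, buffers) is deliberately left out. The string holds OpenCL C source that a host program would pass to the OpenCL runtime for compilation; the C++ function is a CPU reference showing what each work-item computes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// OpenCL C source for a kernel. Each work-item handles one element,
// identified by get_global_id(0).
const char* const kSaxpyKernel = R"CLC(
__kernel void saxpy(const float alpha,
                    __global const float* x,
                    __global float* y) {
    size_t i = get_global_id(0);
    y[i] = alpha * x[i] + y[i];
}
)CLC";

// CPU reference: the loop body is what one work-item would do on the device.
void saxpy_reference(float alpha, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = alpha * x[i] + y[i];
    }
}
```

On a device, the loop disappears: the runtime launches one work-item per element and the hardware runs many of them at the same time.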

Just noticed, it says it's incompatible with ICS. Any risks from doing this on the Fire HD?

Oh god, once more: Chainfire3D is compatible with the Kindle Fire HD and some ICS and JB devices!

Chainfire3D OpenGL Upgrade
jamajenx,
We have successfully tested Chainfire3D on the Kindle. Yes, we noticed that warning; I can't explain it, except to say the upgrade went well and Chainfire3D did increase OpenGL gaming performance.

Does linaro make a difference?

I notice some ROMs and kernels use Linaro. I have tried them and others, and I don't notice a difference in speed or battery life. What is the advantage?

The kernel sources compile faster. LOL
Hard to imagine that would be important to an end user.
There are probably some corner cases where code that is specifically crafted to take advantage of compiler features will execute more efficiently, but that's not the case when comparing compilation of identical sources by two different compilers.

It does on older phones. When I built ROMs for the Galaxy Exhibit (1 GHz, single core, 512 MB RAM), Linaro literally doubled the speed, but on the N7 Google has it pretty much fully optimised.

bftb0 said: "The kernel sources compile faster. LOL"
For many codebases, moving to a newer version of gcc actually slows down the compilation process: http://gcc.gnu.org/ml/gcc/2012-02/msg00134.html
But switching to clang (where possible) sometimes helps.
Most compiler developers are focused heavily on producing optimal (and correct) output; compile time is a secondary consideration. It's relatively easy to write a compiler that runs fast but generates slow/bloated code. Good optimization requires a great deal of computation (and often RAM too).
bftb0 said: "There are probably some corner cases where code that is specifically crafted to take advantage of compiler features will execute more efficiently, but that's not the case when comparing compilation of identical sources by two different compilers."
Each new generation of gcc adds more techniques for optimizing existing code. You can see the effects when a standard benchmark is built by different compilers and run on the same system: http://www.phoronix.com/scan.php?page=article&item=gcc_42_47snapshot&num=3
As you can see, the changes are fairly subtle.
With respect to rebuilding Android using another compiler: you're more likely to notice a difference if your workload is heavily CPU-bound and if your current ROM was built by a much older compiler.

Yup. That was precisely my point - subtle to the point that they are only observable via careful benchmarking - but (despite claims to the contrary by enthusiastic folks on the internet) probably not discernible by users in a blind trial comparison without the aid of a stopwatch. Our raw perception of "how long something takes" simply is not accurate at the few-percentage-points level... and that's what the OP stated "I don't notice a difference".
Put another way, if a short one-second task becomes a 950 ms task I won't be able to notice the difference, or if a 60 second task becomes a 57-second task, I won't be able to notice that either (without a stopwatch). Both are 5% improvements.
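
The arithmetic behind those two figures, as a tiny C++ sketch (the function name `improvement_percent` is made up for illustration):

```cpp
#include <cassert>

// Percentage by which a task's duration shrank, relative to the original duration.
// Both arguments must use the same unit (e.g. milliseconds).
double improvement_percent(double before, double after) {
    assert(before > 0.0);
    return 100.0 * (before - after) / before;
}
```

Calling it with 1000 ms and 950 ms, or with 60 s and 57 s, gives 5 in both cases: the relative gain is the same even though the absolute savings differ by a factor of 60.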
Which is not to say that folks can't be interested in knowing they have a kernel or tweak that is 2% "better" than everybody else's - but they shouldn't over-sell the perceptibility of the actual gains involved.
I would like to see benchmark measurements of IRX120's claim; I have a hard time believing Samsung left a 100% performance gain "on the table" for a phone which was just released one month ago...
cheers

To take a 50% performance hit due to the compiler, they would have to screw up something big, e.g. using a softfp toolchain on hardware that supports hard float[1]. Or accidentally building everything with -O0.
Even then, only the part of the workload using floating point would suffer, and that's nowhere near 100% for most operations. Maybe certain benchmarks.
So, as you said, most users probably wouldn't notice. These devices aren't exactly used for Bitcoin mining or computing Mersenne primes.
Also, ever since Froyo, Dalvik has implemented JIT to optimize hotspots. JIT code is typically generated by the VM, not by the native C compiler. This means that a large percentage of the cycles consumed by an application could be spent on instructions emitted by Dalvik directly, and not from anything originating in gcc.
And of course, applications that perform heavy computation often ship with their own native (binary) libraries. So switching to the Linaro toolchain is unlikely to have much of an impact on games or non-WebView browsers.
[1] http://www.memetic.org/raspbian-benchmarking-armel-vs-armhf/
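
One way to check which of these situations a given native build is in is to look at the compiler's predefined macros. The sketch below is an illustration under assumptions, not a definitive test: the helper name `build_flavor` is made up, and it relies on GCC and Clang defining `__ARM_PCS_VFP` for the hard-float calling convention, `__SOFTFP__` for pure software floating point, and `__OPTIMIZE__` whenever optimization is enabled.

```cpp
#include <cassert>
#include <string>

// Describes how this translation unit was compiled (hypothetical helper).
std::string build_flavor() {
    std::string s;
#if defined(__ARM_PCS_VFP)
    s += "hard-float ABI";
#elif defined(__SOFTFP__)
    s += "software floating point";
#elif defined(__arm__)
    s += "softfp or other ARM float ABI";
#else
    s += "not a 32-bit ARM build";
#endif
#if defined(__OPTIMIZE__)
    s += ", optimized";
#else
    s += ", unoptimized (-O0)";
#endif
    assert(!s.empty());
    return s;
}
```

Logging this string once at startup from a native library is a cheap way to confirm that the toolchain flags you intended are the ones actually in effect.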

Question (silly question): Can I use an older Android device's processor to add more processing power to my computer's CPU?

I know the question shows a little ignorance, since I don't know much about Windows kernels or how the OS works in general. But is it possible for an Android phone with, for example, a Snapdragon processor (ARM architecture) to be used as extra CPU processing power for a computer? I'm just proposing it theoretically.
Also, by the way, could someone explain to me what CPU cores are and whether they have anything to do with the question? Thank you.

No. It will not work. Cores of a CPU are like brains in humans: more cores = more processing power. Android uses the Linux kernel and Windows... the Windows kernel. Two different beasts. It would be like cats and dogs agreeing on the best place to go poo... it won't happen.

A CPU, or Central Processing Unit, is the part of the computer that does the actual work - performing operations. Modern CPUs have multiple cores, where each core is able to work on a different part of the operation. In a mobile context, multiple cores are also used to provide a balance between performance and power saving; depending on the CPU, there are generally 2 or more "little" cores that prioritize efficiency over performance; 2 or more "mid" cores that provide more processing power when the "little" cores aren't up to the task; and 1 or 2 "big" cores that provide the best performance but use the most power. When someone talks about "throttling" in a kernel, they're talking about the runtime mechanism that decides what cores a CPU will use under given load conditions.
There are multiple different CPU architectures, and as far as I know, it's not possible to parallel them - you can't use an ARM64 CPU in parallel with an Intel x64, even though they're both 64 bit. The reason for this is different architectures use different basic instructions and scheduling, so the amount of code that would need to go into a kernel to make different types work together would slow the system down and make the whole endeavor pointless, unless you're working with a really large scale operation.
If you look at multi-CPU systems, you'll see that everything from Xeon servers to supercomputers all use the same types of CPU to simplify interconnects, as well as the ability to use one kernel.
It's worth mentioning that there are some projects that do make use of different platforms - for example, SETI@home uses a network of Internet-connected computers to create a sort of supercomputer. Botnets do the same sort of thing. The difference here is that these systems aren't paralleled, and they work at the application level, so they can only use a certain amount of the client system's resources.

Whoa, OK! Cool, thanks for your explanation and time. I understood most of the reply, so thanks for answering my question!
Have a good day

No problem. Here is a Wiki article that may provide a more concise explanation. Turns out I was wrong about instruction sets, at least concerning AMD APUs.
The bottom line is...Yes, it's absolutely possible to use multiple different systems to provide more processing power than just one. But, unless those systems are specifically designed to work in parallel with other systems, it would be a bit more complicated to get everything to work together, and the end result wouldn't necessarily be faster. If you're enterprising enough, you could set up an application on your computer as well as your phone that uses your phone's CPU to perform operations, but it wouldn't be easy.
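
The application-level approach usually starts by cutting a job into independent work units that can be handed to different machines and combined afterwards. A minimal C++ sketch of that splitting step follows; the names `split_work` and `sum_range` are made up for illustration, and the networking needed to actually ship a unit to a phone is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Splits the index range [0, total) into `parts` contiguous chunks of
// near-equal size. Each pair is a half-open range [begin, end).
std::vector<std::pair<std::size_t, std::size_t>> split_work(std::size_t total,
                                                            std::size_t parts) {
    assert(parts > 0);
    std::vector<std::pair<std::size_t, std::size_t>> chunks;
    const std::size_t base = total / parts;
    const std::size_t extra = total % parts;  // first `extra` chunks get one more item
    std::size_t begin = 0;
    for (std::size_t i = 0; i < parts; ++i) {
        const std::size_t len = base + (i < extra ? 1 : 0);
        chunks.emplace_back(begin, begin + len);
        begin += len;
    }
    return chunks;
}

// Example work a single device could do on its chunk: sum the integers in [begin, end).
unsigned long long sum_range(std::size_t begin, std::size_t end) {
    unsigned long long total = 0;
    for (std::size_t i = begin; i < end; ++i) {
        total += i;
    }
    return total;
}
```

Because each chunk depends only on its own range, the partial results can be computed anywhere and added together at the end; the hard part in practice is the communication cost, which can easily outweigh the extra processing power.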

Oh!
Ok, thanks for the references

The Northbridge chipset has limited bandwidth and is optimized to work with specific CPUs. Integrating at this level would be ineffective at best, even if you could get it to work, because of the Northbridge bandwidth limitations.
A dual-processor board is what you want. Originally used mostly for servers, they are also used in high-end workstations. Most games are designed to run on 4 cores, so it may not yield much. Some 3D rendering software is designed to take advantage of dual-processor motherboards, but again these are designed to work with a specific processor family, like the Xeon series, i.e. matched processors.

Database Users

welcome