Optimisation using assembler
This was undertaken as a second-year university assignment. The aim was to take a small C++ program, which draws Christmas trees of varying sizes to a console window, and make it run faster. A testing harness was created to measure the time taken to draw each individual tree (the average over 1000 iterations) at fixed tree heights, so that the original and optimised versions could be compared. Speed gains were made by: rewriting the program in assembler, using registers to store variables, pre-calculating values, restructuring the program (removing function calls and assert statements), unrolling for-loops, adding static branch prediction hints, and storing the tree in an array to reduce the number of printf() calls to one per tree rather than one per character of the tree.
Speed comparison
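The table below shows the average time to draw one Christmas tree, measured over 1000 iterations. The speed-up grows with tree height, from roughly 2.9x at height 4 to roughly 25x at height 40.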
| Tree height | Original (microseconds) | Optimised (microseconds) |
|---|---|---|
| 4 | 841.716 | 292.857 |
| 5 | 1479.73 | 217.381 |
| 6 | 2295.15 | 263.454 |
| 8 | 4346.15 | 378.075 |
| 10 | 6974.1 | 499.316 |
| 15 | 17035.4 | 923.597 |
| 20 | 31394.4 | 1484.9 |
| 25 | 49757.2 | 2192.47 |
| 30 | 73077.3 | 3100.64 |
| 40 | 130705 | 5147.81 |
Download
Please download the project here. The download includes the original C++ code, the optimised code and the compiled executable.