The main aspects that should be considered for optimizing a cryptographic system are:
Input and output buffers were kept separate so that the core could be processed without interruption. External tasks must not be allowed to enter the critical path.
In many cases, a program has a high-cost critical path that needs to be optimized. It makes sense to optimize critical paths more aggressively than less critical ones.
Many programs need to perform highly complex sets of arithmetic functions. Such functions can often be simplified by replacing them with alternatives such as look-up tables and bit manipulation.
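As an illustration of both alternatives, the sketch below doubles a byte in the finite field GF(2^8) using the AES reduction polynomial. The choice of this particular operation and the names `gf256_double`, `gf256_double_table`, and `gf256_init_table` are assumptions made for the example; the text above does not name a specific function.

```c
#include <stdint.h>

/* Multiply by 2 in GF(2^8), reduced by the AES polynomial 0x11b.
   Only a shift, a mask, and an XOR are needed: no division or modulo. */
static uint8_t gf256_double(uint8_t x)
{
    /* mask is 0xFF when the top bit of x is set, 0x00 otherwise */
    uint8_t mask = (uint8_t)(0u - (unsigned)(x >> 7));
    return (uint8_t)((uint8_t)(x << 1) ^ (mask & 0x1b));
}

/* Look-up-table variant: all 256 results are computed once up front. */
static uint8_t gf256_double_table[256];

static void gf256_init_table(void)
{
    for (unsigned i = 0; i < 256; i++) {
        gf256_double_table[i] = gf256_double((uint8_t)i);
    }
}
```

After `gf256_init_table()` has run, `gf256_double_table[x]` returns the same value as `gf256_double(x)` with a single memory read. Which variant is cheaper depends on the memory and cache behaviour of the target device.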
Reusability and Functionality
Program modules should be written flexibly enough that they can be reused elsewhere in the application.
Hardware typically offers a high level of parallelism compared to software. The design of an embedded device should take this hardware parallelism into account.
A single instruction is broken into several instructions capable of being executed in parallel. Different register sets should be used for the individual instructions, which results in instruction-level parallelism that makes the code efficient for multi-processors.
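A minimal sketch of this idea in C: one accumulation is split across four independent variables, which a compiler would normally keep in separate registers, so that no step has to wait for the result of the previous one. The function name `xor_fold4` and the choice of XOR as the operation are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* XOR all words of an array together using four independent accumulators.
   The four updates in the loop body do not depend on each other. */
static uint32_t xor_fold4(const uint32_t *w, size_t n)
{
    uint32_t a0 = 0, a1 = 0, a2 = 0, a3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        a0 ^= w[i];
        a1 ^= w[i + 1];
        a2 ^= w[i + 2];
        a3 ^= w[i + 3];
    }
    /* Handle the remaining 0 to 3 words. */
    for (; i < n; i++) {
        a0 ^= w[i];
    }
    return a0 ^ a1 ^ a2 ^ a3;
}
```

Whether the four updates actually execute in parallel depends on the target processor and the compiler; the source only guarantees that there is no data dependency between them.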
Some tasks in a program need to be executed a finite number of times. Such tasks are called recursive tasks. They carry an overhead: on each pass, a check is needed to decide whether the instruction sequence should jump out of the loop.
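One common way to reduce this per-pass check is loop unrolling. The text above does not name a specific technique, so treating it as unrolling is an assumption, as are the constant `BLOCK_WORDS` and both function names in the sketch below.

```c
#include <stdint.h>

#define BLOCK_WORDS 4 /* hypothetical: a 128-bit block as four 32-bit words */

/* Rolled form: the loop condition is tested after every word. */
static void xor_block_rolled(uint32_t *state, const uint32_t *key)
{
    for (int i = 0; i < BLOCK_WORDS; i++) {
        state[i] ^= key[i];
    }
}

/* Fully unrolled form: no loop counter and no exit check. */
static void xor_block_unrolled(uint32_t *state, const uint32_t *key)
{
    state[0] ^= key[0];
    state[1] ^= key[1];
    state[2] ^= key[2];
    state[3] ^= key[3];
}
```

Both functions produce the same result. The unrolled form trades code size for fewer branch instructions, which may or may not pay off on a given device.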
Two pipelining tasks were considered in order to reduce code execution time.
Some tasks use conditional statements such as if-then-else, which consume a lot of cycles. A better approach is to eliminate conditional statements wherever possible.
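As a sketch of how a conditional can be removed, the function below picks one of two values with a bit mask instead of an if-then-else. The name `select_u32` and its convention that `cond` is 0 or 1 are assumptions made for the example.

```c
#include <stdint.h>

/* Returns a when cond is 1 and b when cond is 0, without a branch
   in the source code. Only the lowest bit of cond is used. */
static uint32_t select_u32(uint32_t cond, uint32_t a, uint32_t b)
{
    /* mask is all ones when cond is 1, all zeros when cond is 0 */
    uint32_t mask = 0u - (cond & 1u);
    return (a & mask) | (b & ~mask);
}
```

For example, `select_u32(1, 5, 9)` yields 5 and `select_u32(0, 5, 9)` yields 9. A compiler is free to turn this back into a branch, or to turn an ordinary `if` into branch-free code, so the generated instructions should be checked on the actual target.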
Interrupt Service Management
Cryptography-related modules should be given the highest priority. If some other critical task must be performed, the interrupt routine should be programmed to check whether any cryptographic module is running at that time. If so, all cryptographic data should be deleted before the interrupt routine proceeds, and the cryptographic module should be executed again once the routine completes. Under no circumstances should cryptographic data be saved to the stack in order to service an interrupt routine.
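The check-and-delete step described above might look roughly like the following sketch. All names here (`crypto_active`, `crypto_restart_needed`, `crypto_state`, `wipe`, `on_interrupt_entry`) are hypothetical, and how the routine is attached to a real interrupt vector depends on the device and toolchain.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags: set by the cryptographic module while it runs,
   and by the interrupt path when a rerun is required. */
static volatile int crypto_active = 0;
static volatile int crypto_restart_needed = 0;

/* Hypothetical working buffer holding key-dependent state. */
static uint8_t crypto_state[32];

/* Zero a buffer through a volatile pointer, a common idiom intended to
   keep the compiler from optimizing the stores away. */
static void wipe(void *buf, size_t len)
{
    volatile uint8_t *p = (volatile uint8_t *)buf;
    while (len--) {
        *p++ = 0;
    }
}

/* Intended to be called at the start of the interrupt routine. */
static void on_interrupt_entry(void)
{
    if (crypto_active) {
        wipe(crypto_state, sizeof crypto_state);
        crypto_active = 0;
        crypto_restart_needed = 1; /* module must run again afterwards */
    }
}
```

After the interrupt routine returns, the main code would test `crypto_restart_needed` and start the cryptographic module again from its inputs. This sketch only covers data held in memory; clearing processor registers would need device-specific code.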
Time-sliced multitasking of a cryptographic module with other applications also exposes it to attacks. Time slicing could allow an attacker to read the contents of the registers and obtain crucial information, which could lead to knowledge of the key.
I/O Queues Management
In order to run the cryptographic modules efficiently, the input and output modules should be structurally separated from them. When the embedded device has multi-processor capability, separate processing should be dedicated to I/O data management.
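One way to keep I/O handling apart from the cryptographic core is a small queue between the two, with one side only writing and the other only reading. The sketch below assumes a single producer and a single consumer; the type `io_queue`, the capacity `QUEUE_SIZE`, and the function names are hypothetical, and the memory barriers a real multi-processor device would need are left out.

```c
#include <stdint.h>

#define QUEUE_SIZE 8u /* hypothetical capacity; must be a power of two */

typedef struct {
    uint8_t data[QUEUE_SIZE];
    volatile unsigned head; /* advanced only by the producer (I/O side) */
    volatile unsigned tail; /* advanced only by the consumer (crypto core) */
} io_queue;

/* Returns 1 on success, 0 if the queue is full. */
static int io_queue_put(io_queue *q, uint8_t byte)
{
    if (q->head - q->tail == QUEUE_SIZE) {
        return 0;
    }
    q->data[q->head % QUEUE_SIZE] = byte;
    q->head++;
    return 1;
}

/* Returns 1 and stores a byte in *out on success, 0 if the queue is empty. */
static int io_queue_get(io_queue *q, uint8_t *out)
{
    if (q->head == q->tail) {
        return 0;
    }
    *out = q->data[q->tail % QUEUE_SIZE];
    q->tail++;
    return 1;
}
```

Because each index is written by only one side, the cryptographic core can drain the queue at its own pace while the I/O side fills it.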
There are many optimization metrics relevant to embedded systems, such as: