Although division and square root are not frequent operations, most processors implement them in hardware to not compromise the overall performance. Two classes of algorithms implement division or square root: digit-recurrence and multiplicative (e.g., Newton-Raphson) algorithms. Previous work shows that division and square root units based on the digit-recurrence algorithm offer the best tradeoff delay-area-power. Moreover, the two operations can be combined in a single unit. Here, we present a radix-16 combined division and square root unit obtained by overlapping two radix-4 stages. The proposed unit is compared to similar solutions based on the digit-recurrence algorithm and it is compared to a unit based on the multiplicative Newton-Raphson algorithm.
- Floating point
- Square root