Abstract
The increasing demand for computation at the edge, combined with tight power budgets, pushes designers to migrate double- and single-precision calculations to formats of reduced precision and dynamic range in applications that can tolerate some inaccuracy. In this context, we introduce a variable format for reduced-precision floating-point with storage limited to 16 bits. The format is suitable for signal processing, machine learning, and other embedded-system applications. We present hardware implementations of multiplication and division units, designed for vector processing, that can sustain a throughput of one result per clock cycle. We also show examples of applications that can benefit from the proposed format.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of 2020 IEEE 27th Symposium on Computer Arithmetic |
| Publisher | IEEE |
| Publication date | Jun 2020 |
| Pages | 96-102 |
| Article number | 9154490 |
| ISBN (Electronic) | 9781728171203 |
| Publication status | Published - Jun 2020 |
| Event | 27th IEEE Symposium on Computer Arithmetic, Portland, United States. Duration: 7 Jun 2020 → 10 Jun 2020. Conference number: 27. https://ieeexplore.ieee.org/xpl/conhome/9146973/proceeding |
Conference
| Conference | 27th IEEE Symposium on Computer Arithmetic |
|---|---|
| Number | 27 |
| Country/Territory | United States |
| City | Portland |
| Period | 7 Jun 2020 → 10 Jun 2020 |
| Internet address | https://ieeexplore.ieee.org/xpl/conhome/9146973/proceeding |
Keywords
- Customizable bias
- Floating-point
- Variable precision