TY - JOUR
T1 - Real-time Volumetric Synthetic Aperture Software Beamforming of Row-Column Probe Data
AU - Stuart, Matthias Bo
AU - Jensen, Patrick Møller
AU - Olsen, Julian Thomas Reckeweg
AU - Kristensen, Alexander Borch
AU - Schou, Mikkel
AU - Dammann, Bernd
AU - Sørensen, Hans Henrik Brandenborg
AU - Jensen, Jørgen Arendt
PY - 2021
Y1 - 2021
N2 - Two delay-and-sum beamformers for 3-D synthetic aperture imaging with row-column addressed arrays are presented. Both beamformers are software implementations for graphics processing unit (GPU) execution with dynamic apodizations and third-order polynomial subsample interpolation. The first beamformer was written in the MATLAB programming language and the second was written in C/C++ with the compute unified device architecture (CUDA) extensions by NVIDIA. Performance was measured as volume rate and sample throughput on three different GPUs: a 1050 Ti, a 1080 Ti, and a TITAN V. The beamformers were evaluated across 112 combinations of output geometry, depth range, transducer array size, number of virtual sources, floating-point precision, and Nyquist-rate or in-phase/quadrature beamforming using analytic signals. Real-time imaging, defined as more than 30 volumes per second, was attained by the CUDA beamformer on the three GPUs for 13, 27, and 43 setups, respectively. The MATLAB beamformer did not attain real-time imaging for any setup. The median single-precision sample throughput of the CUDA beamformer was 4.9, 20.8, and 33.5 gigasamples per second on the three GPUs, respectively. The CUDA beamformer's throughput was an order of magnitude higher than that of the MATLAB beamformer.
AB - Two delay-and-sum beamformers for 3-D synthetic aperture imaging with row-column addressed arrays are presented. Both beamformers are software implementations for graphics processing unit (GPU) execution with dynamic apodizations and third-order polynomial subsample interpolation. The first beamformer was written in the MATLAB programming language and the second was written in C/C++ with the compute unified device architecture (CUDA) extensions by NVIDIA. Performance was measured as volume rate and sample throughput on three different GPUs: a 1050 Ti, a 1080 Ti, and a TITAN V. The beamformers were evaluated across 112 combinations of output geometry, depth range, transducer array size, number of virtual sources, floating-point precision, and Nyquist-rate or in-phase/quadrature beamforming using analytic signals. Real-time imaging, defined as more than 30 volumes per second, was attained by the CUDA beamformer on the three GPUs for 13, 27, and 43 setups, respectively. The MATLAB beamformer did not attain real-time imaging for any setup. The median single-precision sample throughput of the CUDA beamformer was 4.9, 20.8, and 33.5 gigasamples per second on the three GPUs, respectively. The CUDA beamformer's throughput was an order of magnitude higher than that of the MATLAB beamformer.
U2 - 10.1109/TUFFC.2021.3071810
DO - 10.1109/TUFFC.2021.3071810
M3 - Journal article
C2 - 33830920
SN - 0885-3010
VL - 68
SP - 2608
EP - 2618
JO - IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control
JF - IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control
IS - 8
ER -