Transceiver52M: Add ARM NEON support

Similar to the existing Intel SSE cases, add support for NEON vector
floating point SIMD processing. Use ARM assembly directly, as the
NEON intrinsics do not generate preferable code output.

Currently supported are NEON vectorized convolution and conversions
between floating point and integer samples.
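
For reference, the two operations can be sketched in scalar C as
below. This is an illustration only, not code from this patch: the
function names, signatures, the 16-bit integer type, and the
truncating cast are all assumptions, and the NEON routines are
written in assembly rather than C.

```c
#include <assert.h>

/*
 * Hypothetical scalar reference (not from this patch): scale each
 * float sample and convert it to a 16-bit integer. The cast truncates
 * toward zero; the caller must keep in[i] * scale within short range.
 */
static void convert_float_short_ref(short *out, const float *in,
				    float scale, int len)
{
	assert(len >= 0);

	for (int i = 0; i < len; i++)
		out[i] = (short)(in[i] * scale);
}

/*
 * Hypothetical scalar reference (not from this patch): real-valued
 * sliding dot product, y[i] = sum over k of h[k] * x[i + k].
 * The caller must supply at least len + h_len - 1 samples in x.
 * A true convolution is obtained by passing time-reversed taps in h.
 */
static void convolve_real_ref(float *y, const float *x, const float *h,
			      int len, int h_len)
{
	assert(len >= 0 && h_len > 0);

	for (int i = 0; i < len; i++) {
		float sum = 0.0f;

		for (int k = 0; k < h_len; k++)
			sum += h[k] * x[i + k];

		y[i] = sum;
	}
}
```

Both loops apply the same arithmetic to every sample, which is the
property that lets a SIMD unit process several samples per
instruction.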

Signed-off-by: Thomas Tsou <tom@tsou.cc>
12 files changed