core/conv: add x86 SSE support for Viterbi decoder

Fast convolutional decoding is provided through SSE operations
implemented with x86 intrinsics. SSE3, found on virtually all modern
x86 processors, is the minimum requirement. SSE4.1 and AVX2 are used
if available.

Also, the original code was extended with runtime SIMD detection,
so only the extensions supported by the target CPU are used. This
makes the library more portable, which is very important for the
distribution of binary packages. Runtime SIMD detection is currently
implemented through the __builtin_cpu_supports call.
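
For illustration only, the runtime detection described above could be
sketched roughly as follows. The names simd_level, detect_simd and
simd_name are hypothetical and are not taken from this patch; only
__builtin_cpu_init and __builtin_cpu_supports are real GCC/Clang
builtins. The sketch assumes the most capable extension is preferred
and falls back to "none" on non-x86 targets or other compilers.

```c
#include <stdio.h>

/* Hypothetical SIMD levels, named for this sketch only. */
enum simd_level { SIMD_NONE, SIMD_SSE3, SIMD_SSE41, SIMD_AVX2 };

/* Probe the running CPU, most capable extension first. */
enum simd_level detect_simd(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
	/* Must run before __builtin_cpu_supports when called early. */
	__builtin_cpu_init();
	if (__builtin_cpu_supports("avx2"))
		return SIMD_AVX2;
	if (__builtin_cpu_supports("sse4.1"))
		return SIMD_SSE41;
	if (__builtin_cpu_supports("sse3"))
		return SIMD_SSE3;
#endif
	/* Assumption: callers fall back to the generic decoder here. */
	return SIMD_NONE;
}

/* Human-readable name for logging or debugging. */
const char *simd_name(enum simd_level lvl)
{
	switch (lvl) {
	case SIMD_AVX2:  return "AVX2";
	case SIMD_SSE41: return "SSE4.1";
	case SIMD_SSE3:  return "SSE3";
	default:         return "none";
	}
}
```

A caller would typically run detect_simd() once at initialization and
select the decoder implementation from the result, for example after
printing simd_name(detect_simd()) to a debug log.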

Change-Id: I1da6d71ed0564f1d684f3a836e998d09de5f0351
diff --git a/src/Makefile.am b/src/Makefile.am
index 5724055..a0aa5a0 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -23,6 +23,12 @@
 			 macaddr.c stat_item.c stats.c stats_statsd.c prim.c \
 			 viterbi.c viterbi_gen.c sercomm.c
 
+if HAVE_SSE3
+libosmocore_la_SOURCES += viterbi_sse.c
+# Per-object flags hack
+viterbi_sse.lo : CFLAGS += $(SIMD_FLAGS)
+endif
+
 BUILT_SOURCES = crc8gen.c crc16gen.c crc32gen.c crc64gen.c
 
 if ENABLE_PLUGIN