construct: add adapter Utf8Adapter to safely interpret utf8 text

Uninitialized Files, File records, or fields in a File record or File
usually contain a string of 0xff bytes. This becomes a problem when the
content is normally encoded/decoded as utf8 by the construct parser:
the parser throws an exception when it tries to decode the 0xff string
as utf8. This is an especially serious problem in pySim-trace, where an
exception stops the parser.

Let's fix this by adding a Utf8Adapter that interprets a string
consisting only of 0xff bytes as an empty string.
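
The behaviour can be reproduced outside of construct with a minimal
sketch. The helper decode_utf8_field below is a hypothetical stand-in
for the decode path of the adapter and is not part of pySim; it only
assumes that the field arrives as a bytes() object:

```python
import codecs


def decode_utf8_field(raw: bytes) -> str:
    """Hypothetical stand-in for the adapter's decode step (not part of pySim)."""
    # A field consisting only of 0xff bytes is treated as uninitialized.
    if raw == b'\xff' * len(raw):
        return ""
    return codecs.decode(raw, "utf-8")


uninitialized = b'\xff' * 8

# Decoding the raw bytes directly fails, because 0xff is never valid in utf8.
try:
    codecs.decode(uninitialized, "utf-8")
    plain_decode_failed = False
except UnicodeDecodeError:
    plain_decode_failed = True

assert plain_decode_failed
assert decode_utf8_field(uninitialized) == ""
assert decode_utf8_field("Grüße".encode("utf-8")) == "Grüße"
```

Only the all-0xff case is special-cased; any other content is passed
to the utf8 decoder unchanged.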

Related: OS#6094
Change-Id: Id114096ccb8b7ff8fcc91e1ef3002526afa09cb7
diff --git a/pySim/construct.py b/pySim/construct.py
index ab44a63..af96b49 100644
--- a/pySim/construct.py
+++ b/pySim/construct.py
@@ -6,6 +6,7 @@
 from construct.lib import integertypes
 from pySim.utils import b2h, h2b, swap_nibbles
 import gsm0338
+import codecs
 
 """Utility code related to the integration of the 'construct' declarative parser."""
 
@@ -34,6 +35,18 @@
     def _encode(self, obj, context, path):
         return h2b(obj)
 
+class Utf8Adapter(Adapter):
+    """convert a bytes() type that contains utf8 encoded text to human readable text."""
+
+    def _decode(self, obj, context, path):
+        # In case the string contains only 0xff bytes we interpret it as an empty string
+        if obj == b'\xff' * len(obj):
+            return ""
+        return codecs.decode(obj, "utf-8")
+
+    def _encode(self, obj, context, path):
+        return codecs.encode(obj, "utf-8")
+
 
 class BcdAdapter(Adapter):
     """convert a bytes() type to a string of BCD nibbles."""