protocol UnicodeCodec
Inheritance |
_UnicodeEncoding
View Protocol Hierarchy →
|
---|---|
Import | import Swift |
Initializers
Static Methods
Encodes a Unicode scalar as a series of code units by calling the given closure on each code unit.
For example, the musical fermata symbol ("𝄐") is a single Unicode scalar
value (\u{1D110}
) but requires four code units for its UTF-8
representation. The following code uses the UTF8
codec to encode a
fermata in UTF-8:
var bytes: [UTF8.CodeUnit] = []
UTF8.encode("𝄐", into: { bytes.append($0) })
print(bytes)
// Prints "[240, 157, 132, 144]"
Parameters: input: The Unicode scalar value to encode. processCodeUnit: A closure that processes one code unit argument at a time.
Declaration
static func encode(_ input: Unicode.Scalar, into processCodeUnit: (Self.CodeUnit) -> Void)
Instance Methods
Starts or continues decoding a code unit sequence into Unicode scalar values.
To decode a code unit sequence completely, call this method repeatedly
until it returns UnicodeDecodingResult.emptyInput
. Checking that the
iterator was exhausted is not sufficient, because the decoder can store
buffered data from the input iterator.
Because of buffering, it is impossible to find the corresponding position
in the iterator for a given returned Unicode.Scalar
or an error.
The following example decodes the UTF-8 encoded bytes of a string into an
array of Unicode.Scalar
instances:
let str = "✨Unicode✨"
print(Array(str.utf8))
// Prints "[226, 156, 168, 85, 110, 105, 99, 111, 100, 101, 226, 156, 168]"
var bytesIterator = str.utf8.makeIterator()
var scalars: [Unicode.Scalar] = []
var utf8Decoder = UTF8()
Decode: while true {
switch utf8Decoder.decode(&bytesIterator) {
case .scalarValue(let v): scalars.append(v)
case .emptyInput: break Decode
case .error:
print("Decoding error")
break Decode
}
}
print(scalars)
// Prints "["\u{2728}", "U", "n", "i", "c", "o", "d", "e", "\u{2728}"]"
input
: An iterator of code units to be decoded. input
must be
the same iterator instance in repeated calls to this method. Do not
advance the iterator or any copies of the iterator outside this
method.
Returns: A UnicodeDecodingResult
instance, representing the next
Unicode scalar, an indication of an error, or an indication that the
UTF sequence has been fully decoded.
Declaration
mutating func decode<I>(_ input: inout I) -> UnicodeDecodingResult where I : IteratorProtocol, Self.CodeUnit == I.Element
A Unicode encoding form that translates between Unicode scalar values and form-specific code units.
The
UnicodeCodec
protocol declares methods that decode code unit sequences into Unicode scalar values and encode Unicode scalar values into code unit sequences. The standard library implements codecs for the UTF-8, UTF-16, and UTF-32 encoding schemes as theUTF8
,UTF16
, andUTF32
types, respectively. Use theUnicode.Scalar
type to work with decoded Unicode scalar values.