Class: NSCharacterSet

Inherits:
NSObject

Overview

An NSCharacterSet object represents a set of Unicode-compliant characters. NSString and NSScanner objects use NSCharacterSet objects to group characters together for searching operations, so that they can find any of a particular set of characters during a search. The cluster’s two public classes, NSCharacterSet and NSMutableCharacterSet, declare the programmatic interface for static and dynamic character sets, respectively.

Direct Known Subclasses

NSMutableCharacterSet

Class Method Summary

Instance Method Summary

Methods inherited from NSObject

#!, #!=, #!~, #<=>, #==, #===, #=~, #Rational, #__callee__, #__method__, #__send__, #__type__, `, alloc, allocWithZone:, #autoContentAccessingProxy, autoload, autoload?, autorelease_pool, #awakeAfterUsingCoder:, binding, block_given?, caller, cancelPreviousPerformRequestsWithTarget:, cancelPreviousPerformRequestsWithTarget:selector:object:, catch, class, classFallbacksForKeyedArchiver, #classForCoder, #classForKeyedArchiver, classForKeyedUnarchiver, #clone, conformsToProtocol:, #copy, copyWithZone:, #dealloc, #define_singleton_method, description, display, #doesNotRecognizeSelector:, #dup, #enum_for, #eql?, #equal?, #extend, fail, #finalize, format, #forwardInvocation:, #forwardingTargetForSelector:, framework, #freeze, #frozen?, getpass, gets, global_variables, #init, initialize, #initialize_clone, #initialize_copy, #initialize_dup, #inspect, instanceMethodForSelector:, instanceMethodSignatureForSelector:, #instance_eval, #instance_exec, #instance_of?, #instance_variable_defined?, #instance_variable_get, #instance_variable_set, #instance_variables, instancesRespondToSelector:, isSubclassOfClass:, #is_a?, iterator?, #kind_of?, lambda, load, load_bridge_support_file, load_plist, local_variables, loop, #method, #methodForSelector:, #methodSignatureForSelector:, #methods, #mutableCopy, mutableCopyWithZone:, new, #nil?, open, p, #performSelector:onThread:withObject:waitUntilDone:, #performSelector:onThread:withObject:waitUntilDone:modes:, #performSelector:withObject:afterDelay:, #performSelector:withObject:afterDelay:inModes:, #performSelectorInBackground:withObject:, #performSelectorOnMainThread:withObject:waitUntilDone:, #performSelectorOnMainThread:withObject:waitUntilDone:modes:, print, printf, #private_methods, proc, #protected_methods, #public_method, #public_methods, #public_send, putc, puts, raise, rand, readline, readlines, #replacementObjectForCoder:, #replacementObjectForKeyedArchiver:, require, resolveClassMethod:, resolveInstanceMethod:, #respond_to?, #respond_to_missing?, select, #send, setVersion:, #singleton_methods, sprintf, srand, superclass, #taint, #tainted?, #tap, test, throw, #to_plist, #to_s, trace_var, trap, #trust, #untaint, untrace_var, #untrust, #untrusted?, version

Constructor Details

This class inherits a constructor from NSObject

Dynamic Method Handling

This class handles dynamic methods through the method_missing method in the class NSObject

Class Method Details

+ (Object) alphanumericCharacterSet

Returns a character set containing the characters in the categories Letters, Marks, and Numbers. Informally, this set is the set of all characters used as basic units of alphabets, syllabaries, ideographs, and digits.

Returns:

  • (Object)

    A character set containing the characters in the categories Letters, Marks, and Numbers.

+ (Object) capitalizedLetterCharacterSet

Returns a character set containing the characters in the category of Titlecase Letters.

Returns:

  • (Object)

    A character set containing the characters in the category of Titlecase Letters.

+ (Object) characterSetWithBitmapRepresentation(data)

Returns a character set containing characters determined by a given bitmap representation. This method is useful for creating a character set object with data from a file or other external data source. A raw bitmap representation of a character set is a byte array of 2^16 bits (that is, 8192 bytes). The value of the bit at position n represents the presence in the character set of the character with decimal Unicode value n. To add a character with decimal Unicode value n to a raw bitmap representation, set bit n & 7 of the byte at index n >> 3; to remove that character, clear the same bit.

Parameters:

  • data (NSData)

    A bitmap representation of a character set.

Returns:

  • (Object)

    A character set containing characters determined by data.
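The bit arithmetic described above can be sketched in plain Ruby. This is a minimal illustration, not part of the NSCharacterSet API: the names BITMAP_BYTES, empty_bitmap, add_character, and remove_character are hypothetical helpers, and the bitmap is modeled as an ordinary Array of byte values.

```ruby
# Hypothetical helpers for illustration only; none of these names exist in Foundation.
BITMAP_BYTES = 8192 # 2**16 bits, one bit per 16-bit Unicode value

# Returns a raw bitmap with every bit cleared, as an Array of byte values.
def empty_bitmap
  Array.new(BITMAP_BYTES, 0)
end

# Adds the character with Unicode value n: set bit (n & 7) of byte (n >> 3).
def add_character(bitmap, n)
  bitmap[n >> 3] |= (1 << (n & 7))
  bitmap
end

# Removes the character with Unicode value n: clear the same bit.
def remove_character(bitmap, n)
  bitmap[n >> 3] &= (~(1 << (n & 7)) & 0xFF)
  bitmap
end

bitmap = empty_bitmap
add_character(bitmap, 'a'.ord)
add_character(bitmap, 'b'.ord)
remove_character(bitmap, 'b'.ord)

# Pack into an 8192-byte binary String, e.g. for writing to a file.
packed = bitmap.pack('C*')
```

The packed bytes would still have to be wrapped in an NSData object before being passed to characterSetWithBitmapRepresentation; that step depends on the runtime and is not shown here.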

+ (Object) characterSetWithCharactersInString(aString)

Returns a character set containing the characters in a given string.

Parameters:

  • aString (String)

    A string containing characters for the new character set.

Returns:

  • (Object)

    A character set containing the characters in aString. Returns an empty character set if aString is empty.

+ (Object) characterSetWithContentsOfFile(path)

Returns a character set read from the bitmap representation stored in the file at a given path. To read a bitmap representation from any file, use the NSData method dataWithContentsOfFile:options:error: and pass the result to characterSetWithBitmapRepresentation:. This method doesn’t use filenames to check for the uniqueness of the character sets it creates. To prevent duplication of character sets in memory, cache them and make them available through an API that checks whether the requested set has already been loaded.

Parameters:

  • path (String)

    A path to a file containing a bitmap representation of a character set. The path name must end with the extension .bitmap.

Returns:

  • (Object)

    A character set read from the bitmap representation stored in the file at path.
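The caching advice above could be followed with a small wrapper like the one below. It is a sketch under stated assumptions: CharacterSetCache is a hypothetical class, and the loader block stands in for whatever actually reads the file (in an application it would presumably call characterSetWithContentsOfFile).

```ruby
# Hypothetical cache, for illustration only. The loader block is supplied by
# the caller and is invoked at most once per distinct absolute path.
class CharacterSetCache
  def initialize(&loader)
    @loader = loader
    @sets = {}
  end

  # Returns the cached set for path, loading it on first request.
  def fetch(path)
    key = File.expand_path(path) # normalize so equivalent paths share one entry
    @sets.fetch(key) { @sets[key] = @loader.call(key) }
  end
end

# Stand-in loader that only records how often it is called.
load_count = 0
cache = CharacterSetCache.new do |absolute_path|
  load_count += 1
  "character set loaded from #{absolute_path}"
end

first  = cache.fetch('vowels.bitmap')
second = cache.fetch('./vowels.bitmap')
```

Both requests resolve to the same absolute path, so the loader runs once and the same object is returned each time.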

+ (Object) characterSetWithRange(aRange)

Returns a character set containing characters with Unicode values in a given range. For example, a range with location 97 (the Unicode value of “a”) and length 26 produces a character set containing the lowercase English alphabetic characters.

Parameters:

  • aRange (NSRange)

    A range of Unicode values. aRange.location is the value of the first character to return; aRange.location + aRange.length - 1 is the value of the last.

Returns:

  • (Object)

    A character set containing characters whose Unicode values are given by aRange. If aRange.length is 0, returns an empty character set.
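The location/length arithmetic for aRange can be checked with a plain-Ruby model. This is only a sketch: code_points_in_range is a hypothetical helper that mimics how the range maps to Unicode values, and it does not create an NSCharacterSet.

```ruby
# Hypothetical helper: lists the Unicode values covered by a range that starts
# at `location` and spans `length` values. An empty range yields no values.
def code_points_in_range(location, length)
  return [] if length <= 0
  (location..(location + length - 1)).to_a
end

# Lowercase English letters: start at 'a' (97) and cover 26 values.
lowercase_values  = code_points_in_range('a'.ord, 26)
lowercase_english = lowercase_values.pack('U*') # UTF-8 String built from the values
```

The last value is location + length - 1, which is 122 (“z”) for this range.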

+ (Object) controlCharacterSet

Returns a character set containing the characters in the categories of Control or Format Characters. These characters are specifically the Unicode values U+0000 to U+001F and U+007F to U+009F.

Returns:

  • (Object)

    A character set containing the characters in the categories of Control or Format Characters.

+ (Object) decimalDigitCharacterSet

Returns a character set containing the characters in the category of Decimal Numbers. Informally, this set is the set of all characters used to represent the decimal values 0 through 9. These characters include, for example, the decimal digits of the Indic scripts and Arabic.

Returns:

  • (Object)

    A character set containing the characters in the category of Decimal Numbers.

+ (Object) decomposableCharacterSet

Returns a character set containing all individual Unicode characters that can also be represented as composed character sequences. These characters include compatibility characters as well as pre-composed characters. Note: This character set doesn’t currently include the Hangul characters defined in version 2.0 of the Unicode standard.

Returns:

  • (Object)

    A character set containing all individual Unicode characters that can also be represented as composed character sequences (such as for letters with accents), by the definition of “standard decomposition” in version 3.2 of the Unicode character encoding standard.

+ (Object) illegalCharacterSet

Returns a character set containing values in the category of Non-Characters or that have not yet been defined in version 3.2 of the Unicode standard.

Returns:

  • (Object)

    A character set containing values in the category of Non-Characters or that have not yet been defined in version 3.2 of the Unicode standard.

+ (Object) letterCharacterSet

Returns a character set containing the characters in the categories Letters and Marks. Informally, this set is the set of all characters used as letters of alphabets and ideographs.

Returns:

  • (Object)

    A character set containing the characters in the categories Letters and Marks.

+ (Object) lowercaseLetterCharacterSet

Returns a character set containing the characters in the category of Lowercase Letters. Informally, this set is the set of all characters used as lowercase letters in alphabets that make case distinctions.

Returns:

  • (Object)

    A character set containing the characters in the category of Lowercase Letters.

+ (Object) newlineCharacterSet

Returns a character set containing the newline characters.

Returns:

  • (Object)

    A character set containing the newline characters (U+000A–U+000D, U+0085).

+ (Object) nonBaseCharacterSet

Returns a character set containing the characters in the category of Marks. This set is also defined as all legal Unicode characters with a non-spacing priority greater than 0. Informally, this set is the set of all characters used as modifiers of base characters.

Returns:

  • (Object)

    A character set containing the characters in the category of Marks.

+ (Object) punctuationCharacterSet

Returns a character set containing the characters in the category of Punctuation. Informally, this set is the set of all non-whitespace characters used to separate linguistic units in scripts, such as periods, dashes, parentheses, and so on.

Returns:

  • (Object)

    A character set containing the characters in the category of Punctuation.

+ (Object) symbolCharacterSet

Returns a character set containing the characters in the category of Symbols. These characters include, for example, the dollar sign ($) and the plus (+) sign.

Returns:

  • (Object)

    A character set containing the characters in the category of Symbols.

+ (Object) uppercaseLetterCharacterSet

Returns a character set containing the characters in the categories of Uppercase Letters and Titlecase Letters. Informally, this set is the set of all characters used as uppercase letters in alphabets that make case distinctions.

Returns:

  • (Object)

    A character set containing the characters in the categories of Uppercase Letters and Titlecase Letters.

+ (Object) whitespaceAndNewlineCharacterSet

Returns a character set containing only the whitespace characters space (U+0020) and tab (U+0009) and the newline and nextline characters (U+000A–U+000D, U+0085).

Returns:

  • (Object)

    A character set containing only the whitespace characters space (U+0020) and tab (U+0009) and the newline and nextline characters (U+000A–U+000D, U+0085).

+ (Object) whitespaceCharacterSet

Returns a character set containing only the in-line whitespace characters space (U+0020) and tab (U+0009). This set doesn’t contain the newline or carriage return characters.

Returns:

  • (Object)

    A character set containing only the in-line whitespace characters space (U+0020) and tab (U+0009).

Instance Method Details

- (NSData) bitmapRepresentation

Returns an NSData object encoding the receiver in binary format. This format is suitable for saving to a file or otherwise transmitting or archiving. A raw bitmap representation of a character set is a byte array of 2^16 bits (that is, 8192 bytes). The value of the bit at position n represents the presence in the character set of the character with decimal Unicode value n. To test for the presence of a character with decimal Unicode value n in a raw bitmap representation, check whether bit n & 7 of the byte at index n >> 3 is set.

Returns:

  • (NSData)

    An NSData object encoding the receiver in binary format.
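The membership test on a raw bitmap can be sketched in plain Ruby as follows. The helper bitmap_member? is hypothetical, and the bitmap here is a binary String built by hand rather than bytes obtained from bitmapRepresentation.

```ruby
# Hypothetical helper: true if the bit for Unicode value n is set in a raw
# bitmap held as a binary String (byte n >> 3, bit n & 7).
def bitmap_member?(bytes, n)
  index = n >> 3
  return false if index >= bytes.bytesize # value lies outside this bitmap
  (bytes.getbyte(index) & (1 << (n & 7))) != 0
end

# Build an 8192-byte bitmap by hand with only the bit for 'a' set.
raw = "\x00".b * 8192
raw.setbyte('a'.ord >> 3, 1 << ('a'.ord & 7))

has_a = bitmap_member?(raw, 'a'.ord)
has_b = bitmap_member?(raw, 'b'.ord)
```

Here has_a is true and has_b is false, because only the bit for value 97 was set.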

- (Boolean) characterIsMember(aCharacter)

Returns a Boolean value that indicates whether a given character is in the receiver.

Parameters:

  • aCharacter (Integer)

    The character to test for membership of the receiver.

Returns:

  • (Boolean)

    YES if aCharacter is in the receiving character set, otherwise NO.

- (Boolean) hasMemberInPlane(thePlane)

Returns a Boolean value that indicates whether the receiver has at least one member in a given character plane. This method makes it easier to find the plane containing the members of the current character set. The Basic Multilingual Plane is plane 0.

Parameters:

  • thePlane (Integer)

    A character plane.

Returns:

  • (Boolean)

    YES if the receiver has at least one member in thePlane, otherwise NO.
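As a rough illustration of what a plane number means, the plane of a Unicode value is the value shifted right by 16 bits, so values up to U+FFFF fall in plane 0. The helpers plane_of and planes_used below are hypothetical and work on plain integers, not on a character set object.

```ruby
# Hypothetical helpers, for illustration only.

# Each plane holds 2**16 code points, so the plane is the value >> 16.
def plane_of(code_point)
  code_point >> 16
end

# Sorted list of the distinct planes that a collection of code points touches.
def planes_used(code_points)
  code_points.map { |cp| plane_of(cp) }.uniq.sort
end

latin_a = 0x0041  # Basic Multilingual Plane (plane 0)
emoji   = 0x1F600 # Supplementary Multilingual Plane (plane 1)
cjk_ext = 0x20000 # Supplementary Ideographic Plane (plane 2)

planes = planes_used([latin_a, emoji, cjk_ext])
```

Values outside plane 0 do not fit in 16 bits, which is why longCharacterIsMember accepts 32-bit characters.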

- (NSCharacterSet) invertedSet

Returns a character set containing only characters that don’t exist in the receiver. Inverting an immutable character set is much more efficient than inverting a mutable character set.

Returns:

  • (NSCharacterSet)

    A character set containing only characters that don’t exist in the receiver.

- (Boolean) isSupersetOfSet(theOtherSet)

Returns a Boolean value that indicates whether the receiver is a superset of another given character set.

Parameters:

  • theOtherSet (NSCharacterSet)

    A character set.

Returns:

  • (Boolean)

    YES if the receiver is a superset of theOtherSet, otherwise NO.

- (Boolean) longCharacterIsMember(theLongChar)

Returns a Boolean value that indicates whether a given long character is a member of the receiver. This method supports the specification of 32-bit characters.

Parameters:

  • theLongChar (UTF32Char)

    A UTF32 character.

Returns:

  • (Boolean)

    YES if theLongChar is in the receiver, otherwise NO.