Class: NSLinguisticTagger

Inherits:
NSObject

Overview

The NSLinguisticTagger class is used to automatically segment natural-language text and tag it with information, such as parts of speech. It can also tag languages, scripts, stem forms of words, etc. An instance of this class is assigned a string to tag, and clients can then obtain tags and ranges for tokens in that string appropriate to a given tag scheme.

Methods inherited from NSObject

#!, #!=, #!~, #<=>, #==, #===, #=~, #Rational, #__callee__, #__method__, #__send__, #__type__, `, alloc, allocWithZone:, #autoContentAccessingProxy, autoload, autoload?, autorelease_pool, #awakeAfterUsingCoder:, binding, block_given?, caller, cancelPreviousPerformRequestsWithTarget:, cancelPreviousPerformRequestsWithTarget:selector:object:, catch, class, classFallbacksForKeyedArchiver, #classForCoder, #classForKeyedArchiver, classForKeyedUnarchiver, #clone, conformsToProtocol:, #copy, copyWithZone:, #dealloc, #define_singleton_method, description, display, #doesNotRecognizeSelector:, #dup, #enum_for, #eql?, #equal?, #extend, fail, #finalize, format, #forwardInvocation:, #forwardingTargetForSelector:, framework, #freeze, #frozen?, getpass, gets, global_variables, #init, initialize, #initialize_clone, #initialize_copy, #initialize_dup, #inspect, instanceMethodForSelector:, instanceMethodSignatureForSelector:, #instance_eval, #instance_exec, #instance_of?, #instance_variable_defined?, #instance_variable_get, #instance_variable_set, #instance_variables, instancesRespondToSelector:, isSubclassOfClass:, #is_a?, iterator?, #kind_of?, lambda, load, load_bridge_support_file, load_plist, local_variables, loop, #method, #methodForSelector:, #methodSignatureForSelector:, #methods, #mutableCopy, mutableCopyWithZone:, new, #nil?, open, p, #performSelector:onThread:withObject:waitUntilDone:, #performSelector:onThread:withObject:waitUntilDone:modes:, #performSelector:withObject:afterDelay:, #performSelector:withObject:afterDelay:inModes:, #performSelectorInBackground:withObject:, #performSelectorOnMainThread:withObject:waitUntilDone:, #performSelectorOnMainThread:withObject:waitUntilDone:modes:, print, printf, #private_methods, proc, #protected_methods, #public_method, #public_methods, #public_send, putc, puts, raise, rand, readline, readlines, #replacementObjectForCoder:, #replacementObjectForKeyedArchiver:, require, resolveClassMethod:, resolveInstanceMethod:, #respond_to?, #respond_to_missing?, select, #send, setVersion:, #singleton_methods, sprintf, srand, superclass, #taint, #tainted?, #tap, test, throw, #to_plist, #to_s, trace_var, trap, #trust, #untaint, untrace_var, #untrust, #untrusted?, version

Constructor Details

This class inherits a constructor from NSObject

Dynamic Method Handling

This class handles dynamic methods through the method_missing method in the class NSObject

Class Method Details

+ (Array) availableTagSchemesForLanguage(language)

Returns the tag schemes supported by the linguistic tagger for a particular language. Clients that need to know which tag schemes an NSLinguisticTagger instance supports for a particular language can query them with this method. The language should be specified using a standard abbreviation, as with NSOrthography.

Parameters:

  • language (String)

    The language, specified using a standard abbreviation as with NSOrthography.

Returns:

  • (Array)

    An array of “Linguistic Tag Schemes.”

Instance Method Details

- (Object) enumerateTagsInRange(range, scheme:tagScheme, options:opts, usingBlock:block)

Enumerates the tokens in the specified range of the string, passing each located tag to the Block. The tagger segments the string into sentences and tokens as needed and returns those ranges along with a tag for any scheme in its array of tag schemes. This is the fundamental tagging method of NSLinguisticTagger. Additional convenience methods return a sentence range, information about a single token, or information about all tokens intersecting a given range at once.

For example, if the tag scheme is NSLinguisticTagSchemeLexicalClass, the tags specify the part of speech (for word tokens) or the type of whitespace or punctuation (for whitespace or punctuation tokens). If the tag scheme is NSLinguisticTagSchemeLemma, the tags specify the stem form of the word (if known) for each word token.

Note that this method returns the ranges of all tokens that intersect the given range.
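The "tokens that intersect the given range" rule can be illustrated in plain Ruby. This is only a sketch: the regular-expression tokenizer, the Token struct, and the helper names are hypothetical stand-ins and not the tagger's own segmentation. It assumes that a token counts as intersecting when any part of it overlaps the requested range, and that the token's full range is then reported.

```ruby
# Hypothetical stand-in for a token: start offset, length, and text.
Token = Struct.new(:location, :length, :text)

# Naive tokenizer (an assumption for illustration): runs of word
# characters, whitespace, or punctuation each become one token.
def tokenize(string)
  tokens = []
  string.scan(/\w+|\s+|[^\w\s]+/) do |text|
    tokens << Token.new(Regexp.last_match.begin(0), text.length, text)
  end
  tokens
end

# Selects every token that overlaps the range (location, length).
def tokens_intersecting(tokens, location, length)
  range_end = location + length
  tokens.select do |t|
    t.location < range_end && t.location + t.length > location
  end
end

tokens = tokenize("Hello, world")
hits = tokens_intersecting(tokens, 3, 5)
hits.each { |t| puts "#{t.text.inspect} at (#{t.location}, #{t.length})" }
```

Under these assumptions the query range (3, 5) starts inside "Hello" and ends inside "world", yet both tokens are selected with their complete ranges.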

Parameters:

  • range

    The range to analyze

  • tagScheme

    The tag scheme.

  • opts

    The linguistic tagger options to use. See “NSLinguisticTaggerOptions” for the constants. These constants can be combined using the C bitwise OR operator.

  • block

    The Block to apply to ranges of the string. The Block takes four arguments: tag, tokenRange, sentenceRange, and stop.

  • tag

    The located linguistic tag.

  • tokenRange

    The range of the linguistic tag.

  • sentenceRange

    The range of the sentence in which the tag occurs.

  • stop

    A reference to a Boolean value. The block can set the value to YES to stop further processing. The stop argument is an out-only argument; set it to YES only from within the Block.

Returns:

- (Object) initWithTagSchemes(tagSchemes, options:opts)

Creates a linguistic tagger instance using the specified tag schemes and options.

Parameters:

  • tagSchemes (Array)

    An array of tag schemes. See “Linguistic Tag Schemes” for the possible values.

  • opts (Integer)

    The linguistic tagger options to use. See “NSLinguisticTaggerOptions” for the constants. These constants can be combined using the C bitwise OR operator.

Returns:

  • (Object)

    An initialized linguistic tagger.
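Combining options with bitwise OR can be sketched in plain Ruby as follows. The TaggerOptions module and the option_set? helper are hypothetical; the flag values are assumed to mirror the NSLinguisticTaggerOptions constants (1 << 0 through 1 << 4). Under MacRuby the framework constants would presumably be used directly instead.

```ruby
# Stand-in flag values, assumed to match NSLinguisticTaggerOptions.
module TaggerOptions
  OMIT_WORDS       = 1 << 0
  OMIT_PUNCTUATION = 1 << 1
  OMIT_WHITESPACE  = 1 << 2
  OMIT_OTHER       = 1 << 3
  JOIN_NAMES       = 1 << 4
end

# True when every bit of `flag` is set in `opts`.
def option_set?(opts, flag)
  (opts & flag) == flag
end

# Combine two options into a single Integer with bitwise OR.
opts = TaggerOptions::OMIT_PUNCTUATION | TaggerOptions::OMIT_WHITESPACE

puts option_set?(opts, TaggerOptions::OMIT_WHITESPACE)
puts option_set?(opts, TaggerOptions::OMIT_WORDS)
```

The combined value is what would be passed as the opts argument; passing 0 requests no options.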

- (NSOrthography) orthographyAtIndex(charIndex, effectiveRange:effectiveRange)

Returns the orthography at the index and also returns the effective range.

Parameters:

  • charIndex (Integer)

    The character index to begin examination.

  • effectiveRange (NSRangePointer)

    An NSRangePointer that, upon completion, contains the range of the orthography containing charIndex.

Returns:

- (Array) possibleTagsAtIndex(charIndex, scheme:tagScheme, tokenRange:tokenRange, sentenceRange:sentenceRange, scores:scores)

Returns an array of possible tags for the given scheme at the specified character index, supplying matching scores.

Parameters:

  • charIndex (Integer)

    The initial character index.

  • tagScheme (String)

    The tag scheme. See “Linguistic Tag Schemes” for the possible values.

  • tokenRange (NSRangePointer)

    The token range.

  • sentenceRange (NSRangePointer)

    The range of the sentence.

  • scores (Pointer)

    Returns by-reference an array of numeric scores (wrapped as NSValue objects) indicating the likelihood that the range matches the tag scheme.

Returns:

  • (Array)

    An array of possible tags for the tagScheme at the specified location, starting with the most likely tag. For some tag schemes only a single tag is returned; for others a list of possibilities is provided.

- (NSRange) sentenceRangeForRange(charRange)

Returns the range of the sentence containing the specified range. This method can be used to obtain the enclosing sentence range given a token range.

Parameters:

  • charRange (NSRange)

    The range.

Returns:

  • (NSRange)

    Returns the range of a sentence that contains charRange.

- (Object) setOrthography(orthography, range:charRange)

Sets the orthography for the specified range. If the orthography of the linguistic tagger is not set, it will determine it automatically from the contents of the text. Clients should call this method only if they already know the language of the text by some other means.

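Parameters:

  • orthography (NSOrthography)

    The orthography to set.

  • charRange (NSRange)

    The range of the string to which the orthography applies.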

Returns:

- (Object) setString(string)

Sets the string to be analyzed by the linguistic tagger.

Parameters:

  • string (String)

    The string.

Returns:

- (String) string

Returns the string being analyzed by the linguistic tagger.

Returns:

- (Object) stringEditedInRange(newCharRange, changeInLength:delta)

Notifies the linguistic tagger that the string (if mutable) has changed as specified by the parameters.

Parameters:

  • newCharRange (NSRange)

    The range in the final string that was edited.

  • delta (Integer)

    The change in length.
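The two parameters can be derived from a replacement edit, as in this plain-Ruby sketch. The helper name is hypothetical, and ranges are modeled as [location, length] pairs rather than NSRange values. Offsets are counted in Ruby characters, which is assumed to match the string's native indexing for plain ASCII text.

```ruby
# Replaces `length` characters at `location` with `replacement` and
# returns the edited string together with the two values described above.
def replace_and_describe_edit(string, location, length, replacement)
  edited = string.dup
  edited[location, length] = replacement
  new_range = [location, replacement.length] # edited range in the final string
  delta = replacement.length - length        # change in length
  [edited, new_range, delta]
end

edited, new_range, delta =
  replace_and_describe_edit("The cat sat.", 4, 3, "zebra")

puts edited
puts "range: #{new_range.inspect}, delta: #{delta}"
```

A pure insertion has an original length of 0, so delta equals the inserted length; a pure deletion yields a zero-length range and a negative delta.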

Returns:

- (String) tagAtIndex(charIndex, scheme:tagScheme, tokenRange:tokenRange, sentenceRange:sentenceRange)

Returns a tag for a single scheme at the specified index. When the returned array contains entries that do not have a corresponding tagScheme, that entry is an instance of NSNull.

Parameters:

  • charIndex (Integer)

    The initial character index.

  • tagScheme (String)

    The tag scheme. See “Linguistic Tag Schemes” for the possible values.

  • tokenRange (NSRangePointer)

    A pointer to the token range. If NULL, no token range is returned.

  • sentenceRange (NSRangePointer)

    A pointer to the range of the sentence. If NULL, no sentence range is returned.

Returns:

  • (String)

    Returns the tag for the requested tagScheme. There are cases in which there may not be a tag for a given scheme and token, in which case the return value of the method would be nil.

- (Array) tagSchemes

Returns the tag schemes configured for this linguistic tagger.

Returns:

  • (Array)

    An array of tag schemes. See “Linguistic Tag Schemes” for the possible values.

- (Array) tagsInRange(range, scheme:tagScheme, options:opts, tokenRanges:tokenRanges)

Returns an array of linguistic tags and token ranges.

Parameters:

  • range (NSRange)

    The range from which to return tags.

  • tagScheme (String)

    The tag scheme. See “Linguistic Tag Schemes” for the possible values.

  • opts (NSLinguisticTaggerOptions)

    The linguistic tagger options to use. See “NSLinguisticTaggerOptions” for the constants. These constants can be combined using the C bitwise OR operator.

  • tokenRanges (Pointer)

    Returns by-reference an array of token range objects wrapped in NSValue objects.

Returns:

  • (Array)

    An array of the tags corresponding to the entries in the tokenRanges array.