Class: Regexp
Overview
Constant Summary
- IGNORECASE
- EXTENDED
- MULTILINE
Class Method Summary
- + alloc
-
+ compile
:nodoc:.
-
+ escape
Escapes any characters that would have special meaning in a regular expression.
-
+ last_match
The first form returns the MatchData object generated by the last successful pattern match.
-
+ quote
Escapes any characters that would have special meaning in a regular expression.
-
+ try_convert
Tries to convert obj into a Regexp, using its to_regexp method.
-
+ union
Returns a Regexp object that is the union of the given patterns, i.e., will match any of its parts.
Instance Method Summary
-
- ==
Equality---Two regexps are equal if their patterns are identical, they have the same character set code, and their casefold? values are the same.
-
- ===
Case Equality---Synonym for Regexp#=~ used in case statements.
-
- =~
Match---Matches rxp against str.
-
- casefold?
Returns the value of the case-insensitive flag.
- - encoding
-
- eql?
Equality---Two regexps are equal if their patterns are identical, they have the same character set code, and their casefold? values are the same.
-
- fixed_encoding?
Returns false if rxp is applicable to a string with any ASCII compatible encoding.
-
- hash
Produce a hash based on the text and options of this regular expression.
-
- initialize
constructor
Constructs a new regular expression from pattern, which can be either a String or a Regexp (in which case that regexp's options are propagated, and new options may not be specified; this is a change as of Ruby 1.8). If options is a Fixnum, it should be one or more of the constants Regexp::EXTENDED, Regexp::IGNORECASE, and Regexp::MULTILINE, or-ed together. Otherwise, if options is not nil, the regexp will be case insensitive. When the lang parameter is `n' or `N', the regexp is given no encoding.
r1 = Regexp.new('^a-z+:\\s+\w+') #=> /^a-z+:\s+\w+/
r2 = Regexp.new('cat', true) #=> /cat/i
r3 = Regexp.new('dog', Regexp::EXTENDED) #=> /dog/x
r4 = Regexp.new(r2) #=> /cat/i
-
- initialize_copy
:nodoc:.
-
- inspect
Produce a nicely formatted string-version of rxp.
-
- match
Returns a MatchData object describing the match, or nil if there was no match.
-
- named_captures
Returns a hash representing information about named captures of rxp.
-
- names
Returns a list of names of captures as an array of strings.
-
- options
Returns the set of bits corresponding to the options used when creating this Regexp (see Regexp::new for details). Note that additional bits may be set in the returned options: these are used internally by the regular expression code. These extra bits are ignored if the options are passed to Regexp::new.
Regexp::IGNORECASE #=> 1
Regexp::EXTENDED #=> 2
Regexp::MULTILINE #=> 4
/cat/.options #=> 0
/cat/ix.options #=> 3
Regexp.new('cat', true).options #=> 1
/\xa1\xa2/e.options #=> 16
r = /cat/ix
Regexp.new(r.source, r.options) #=> /cat/ix
-
- source
Returns the original string of the pattern.
-
- to_s
Returns a string containing the regular expression and its options (using the (?opts:source) notation). This string can be fed back in to Regexp::new to create a regular expression with the same semantics as the original. (However, Regexp#== may not return true when comparing the two, as the source of the regular expression itself may differ, as the example shows.) Regexp#inspect produces a generally more readable version of rxp.
r1 = /ab+c/ix #=> /ab+c/ix
s1 = r1.to_s #=> "(?ix-m:ab+c)"
r2 = Regexp.new(s1) #=> /(?ix-m:ab+c)/
r1 == r2 #=> false
r1.source #=> "ab+c"
r2.source #=> "(?ix-m:ab+c)"
-
- ~
Match---Matches rxp against the contents of $_.
Methods inherited from NSObject
#!, #!=, #!~, #, #Rational, #__callee__, #__method__, #__send__, #__type__, `, allocWithZone:, #autoContentAccessingProxy, autoload, autoload?, autorelease_pool, #awakeAfterUsingCoder:, binding, block_given?, caller, cancelPreviousPerformRequestsWithTarget:, cancelPreviousPerformRequestsWithTarget:selector:object:, catch, class, classFallbacksForKeyedArchiver, #classForCoder, #classForKeyedArchiver, classForKeyedUnarchiver, #clone, conformsToProtocol:, #copy, copyWithZone:, #dealloc, #define_singleton_method, description, display, #doesNotRecognizeSelector:, #dup, #enum_for, #equal?, #extend, fail, #finalize, format, #forwardInvocation:, #forwardingTargetForSelector:, framework, #freeze, #frozen?, getpass, gets, global_variables, #init, initialize, #initialize_clone, #initialize_dup, instanceMethodForSelector:, instanceMethodSignatureForSelector:, #instance_eval, #instance_exec, #instance_of?, #instance_variable_defined?, #instance_variable_get, #instance_variable_set, #instance_variables, instancesRespondToSelector:, isSubclassOfClass:, #is_a?, iterator?, #kind_of?, lambda, load, load_bridge_support_file, load_plist, local_variables, loop, #method, #methodForSelector:, #methodSignatureForSelector:, #methods, #mutableCopy, mutableCopyWithZone:, new, #nil?, open, p, #performSelector:onThread:withObject:waitUntilDone:, #performSelector:onThread:withObject:waitUntilDone:modes:, #performSelector:withObject:afterDelay:, #performSelector:withObject:afterDelay:inModes:, #performSelectorInBackground:withObject:, #performSelectorOnMainThread:withObject:waitUntilDone:, #performSelectorOnMainThread:withObject:waitUntilDone:modes:, print, printf, #private_methods, proc, #protected_methods, #public_method, #public_methods, #public_send, putc, puts, raise, rand, readline, readlines, #replacementObjectForCoder:, #replacementObjectForKeyedArchiver:, require, resolveClassMethod:, resolveInstanceMethod:, #respond_to?, #respond_to_missing?, select, #send, setVersion:, 
#singleton_methods, sprintf, srand, superclass, #taint, #tainted?, #tap, test, throw, #to_plist, trace_var, trap, #trust, #untaint, untrace_var, #untrust, #untrusted?, version
Constructor Details
- (Regexp) new(string, [options [, lang]]) - (Regexp) new(regexp) - (Regexp) compile(string, [options [, lang]]) - (Regexp) compile(regexp)
Constructs a new regular expression from pattern, which can be either a String or a Regexp (in which case that regexp's options are propagated, and new options may not be specified; this is a change as of Ruby 1.8). If options is a Fixnum, it should be one or more of the constants Regexp::EXTENDED, Regexp::IGNORECASE, and Regexp::MULTILINE, or-ed together. Otherwise, if options is not nil, the regexp will be case insensitive. When the lang parameter is `n' or `N', the regexp is given no encoding.
r1 = Regexp.new('^a-z+:\\s+\w+') #=> /^a-z+:\s+\w+/
r2 = Regexp.new('cat', true) #=> /cat/i
r3 = Regexp.new('dog', Regexp::EXTENDED) #=> /dog/x
r4 = Regexp.new(r2) #=> /cat/i
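As a minimal sketch of what "or-ed together" means in practice, the following combines several option constants into one options value. The helper name build_regexp is hypothetical and not part of Regexp.

```ruby
# Hypothetical helper (not part of Regexp): build a regexp from a source
# string and any number of option constants.
def build_regexp(source, *flags)
  # OR the flags together; with no flags this yields 0, meaning no options.
  options = flags.inject(0) { |bits, flag| bits | flag }
  Regexp.new(source, options)
end

re = build_regexp('^cat$', Regexp::IGNORECASE, Regexp::MULTILINE)
re.options           # IGNORECASE | MULTILINE, i.e. 1 | 4
re =~ "dog\nCAT\n"   # matches "CAT" at the start of the second line
```

Passing the combined integer keeps each option independent, whereas passing true only switches on case insensitivity.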
Dynamic Method Handling
This class handles dynamic methods through the method_missing method in the class NSObject.
Class Method Details
+ (Object) alloc
+ (Object) compile
:nodoc:
+ (MatchData) last_match + (String) last_match(n)
The first form returns the MatchData object generated by the last successful pattern match. Equivalent to reading the global variable $~. The second form returns the nth field in this MatchData object. n can be a string or symbol to reference a named capture.
Note that last_match is local to the thread and to the scope of the method that performed the pattern match.
/c(.)t/ =~ 'cat' #=> 0
Regexp.last_match #=> #<MatchData "cat" 1:"a">
Regexp.last_match(0) #=> "cat"
Regexp.last_match(1) #=> "a"
Regexp.last_match(2) #=> nil
/(?<lhs>\w+)\s*=\s*(?<rhs>\w+)/ =~ "var = val"
Regexp.last_match #=> #<MatchData "var = val" lhs:"var" rhs:"val">
Regexp.last_match(:lhs) #=> "var"
Regexp.last_match(:rhs) #=> "val"
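The method-scope rule can be illustrated with a short sketch: a match performed inside a called method does not overwrite the caller's last match. Both method names below are hypothetical and exist only for this illustration.

```ruby
# Hypothetical helper: performs its own match and reads its own last match.
def first_vowel_after_b(str)
  /b([aeiou])/ =~ str
  Regexp.last_match(1)
end

# Hypothetical demo method: returns [inner, outer] to show that the two
# matches are tracked separately.
def scoped_last_match_demo
  /f(.)/ =~ "foo"
  inner = first_vowel_after_b("bar")  # matches inside the helper's scope
  outer = Regexp.last_match(1)        # still refers to the /f(.)/ match
  [inner, outer]
end

scoped_last_match_demo  #=> ["a", "o"]
```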
+ (Regexp?) try_convert(obj)
Tries to convert obj into a Regexp, using its to_regexp method. Returns the converted regexp, or nil if obj cannot be converted for any reason.
Regexp.try_convert(/re/) #=> /re/
Regexp.try_convert("re") #=> nil
o = Object.new
Regexp.try_convert(o) #=> nil
def o.to_regexp() /foo/ end
Regexp.try_convert(o) #=> /foo/
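One possible use of try_convert is an argument-coercion helper that accepts either a pattern or a literal value. The sketch below assumes that anything which is not convertible should be matched literally; the name to_pattern is hypothetical.

```ruby
# Hypothetical helper: return obj as a Regexp if it is one (or responds to
# to_regexp); otherwise build a regexp that matches obj's text literally.
def to_pattern(obj)
  Regexp.try_convert(obj) || Regexp.new(Regexp.escape(obj.to_s))
end

to_pattern(/a.c/)   # returned unchanged
to_pattern("a.c")   # the dot is escaped, so only the text "a.c" matches
```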
+ (Regexp) union(pat1, pat2, ...) + (Regexp) union(pats_ary)
Returns a Regexp object that is the union of the given patterns, i.e., will match any of its parts. The patterns can be Regexp objects, in which case their options will be preserved, or Strings. If no patterns are given, returns /(?!)/. The behavior is unspecified if any given pattern contains a capture.
Regexp.union #=> /(?!)/
Regexp.union("penzance") #=> /penzance/
Regexp.union("a+b*c") #=> /a\+b\*c/
Regexp.union("skiing", "sledding") #=> /skiing|sledding/
Regexp.union(["skiing", "sledding"]) #=> /skiing|sledding/
Regexp.union(/dogs/, /cats/i) #=> /(?-mix:dogs)|(?i-mx:cats)/
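Because String arguments are escaped while Regexp arguments keep their own options, union is a convenient way to match a list of literal tokens alongside real patterns. A brief sketch, with hypothetical method names and sample inputs:

```ruby
# Hypothetical helper: match any of the given words as literal text.
def literal_matcher(words)
  Regexp.union(words)
end

# Hypothetical helper: mix a case-insensitive pattern with a literal string.
def pet_matcher
  Regexp.union(/cat/i, "dog")
end

re = literal_matcher(["1+1", "a.b"])
re =~ "x = 1+1"      # matches: "+" is treated literally
re =~ "x = 111"      # nil: "1+1" is not read as "one or more 1s"

pet_matcher =~ "CAT" # matches: the /i option of /cat/i is preserved
pet_matcher =~ "DOG" # nil: the string "dog" is matched case-sensitively
```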
Instance Method Details
- (Boolean) ==(other_rxp) - (Boolean) eql?(other_rxp)
Equality---Two regexps are equal if their patterns are identical, they have the same character set code, and their casefold? values are the same.
/abc/ == /abc/x #=> false
/abc/ == /abc/i #=> false
/abc/ == /abc/n #=> false
/abc/u == /abc/n #=> false
- (Boolean) ===(str)
Case Equality---Synonym for Regexp#=~ used in case statements.
a = "HELLO"
case a
when /^[a-z]*$/; print "Lower case\n"
when /^[A-Z]*$/; print "Upper case\n"
else; print "Mixed case\n"
end
produces:
Upper case
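The same case equality is what Enumerable#grep relies on, so a regexp can be used both as a when clause and as a filter. A small sketch with a hypothetical classify method and made-up categories:

```ruby
# Hypothetical classifier: each when clause calls Regexp#=== on the token.
def classify(token)
  case token
  when /\A\d+\z/         then :integer
  when /\A[a-z_]\w*\z/i  then :identifier
  else                        :other
  end
end

classify("42")      #=> :integer
classify("foo_1")   #=> :identifier
classify("+")       #=> :other

# grep selects the elements for which pattern === element is true.
["42", "foo", "7"].grep(/\A\d+\z/)   #=> ["42", "7"]
```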
- (Integer?) =~(str)
Match---Matches rxp against str.
/at/ =~ "input data" #=> 7
/ax/ =~ "input data" #=> nil
If =~ is used with a regexp literal with named captures, the captured strings (or nil) are assigned to local variables named after the captures.
/(?<lhs>\w+)\s*=\s*(?<rhs>\w+)/ =~ " x = y "
p lhs #=> "x"
p rhs #=> "y"
If the match fails, nil is assigned to the variables.
/(?<lhs>\w+)\s*=\s*(?<rhs>\w+)/ =~ " x = "
p lhs #=> nil
p rhs #=> nil
This assignment is implemented in the Ruby parser. The parser detects 'regexp-literal =~ expression' for the assignment. The regexp must be a literal without interpolation and must be placed on the left-hand side.
The assignment does not occur if the regexp is not a literal.
re = /(?<lhs>\w+)\s*=\s*(?<rhs>\w+)/
re =~ " x = y "
p lhs # undefined local variable
p rhs # undefined local variable
A regexp interpolation, #{}, also disables the assignment.
rhs_pat = /(?<rhs>\w+)/
/(?<lhs>\w+)\s*=\s*#{rhs_pat}/ =~ "x = y"
p lhs # undefined local variable
The assignment does not occur if the regexp is placed at the right hand side.
" x = y " =~ /(?<lhs>\w+)\s*=\s*(?<rhs>\w+)/
p lhs, rhs # undefined local variable
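When the regexp is stored in a variable or built with interpolation, the named captures are still available through the MatchData object. The following is a sketch of that alternative; parse_assignment is a hypothetical name.

```ruby
# Hypothetical helper: read named captures from MatchData instead of relying
# on the parser-level local variable assignment.
def parse_assignment(str, re = /(?<lhs>\w+)\s*=\s*(?<rhs>\w+)/)
  m = re.match(str)          # re is not a literal at this call site
  m && [m[:lhs], m[:rhs]]    # nil when there is no match
end

parse_assignment(" x = y ")  #=> ["x", "y"]
parse_assignment(" x = ")    #=> nil
```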
- (Boolean) casefold?
Returns the value of the case-insensitive flag.
/a/.casefold? #=> false
/a/i.casefold? #=> true
/(?i:a)/.casefold? #=> false
- (Object) encoding
- (Boolean) fixed_encoding?
Returns false if rxp is applicable to a string with any ASCII compatible encoding. Returns true otherwise.
r = /a/
r.fixed_encoding? #=> false
r =~ "\u{6666} a" #=> 2
r =~ "\xa1\xa2 a".force_encoding("euc-jp") #=> 2
r =~ "abc".force_encoding("euc-jp") #=> 0
r = /a/u
r.fixed_encoding? #=> true
r.encoding #=> #<Encoding:UTF-8>
r =~ "\u{6666} a" #=> 2
r =~ "\xa1\xa2".force_encoding("euc-jp") #=> ArgumentError
r =~ "abc".force_encoding("euc-jp") #=> 0
r = /\u{6666}/
r.fixed_encoding? #=> true
r.encoding #=> #<Encoding:UTF-8>
r =~ "\u{6666} a" #=> 0
r =~ "\xa1\xa2".force_encoding("euc-jp") #=> ArgumentError
r =~ "abc".force_encoding("euc-jp") #=> nil
- (Fixnum) hash
Produce a hash based on the text and options of this regular expression.
- (Object) initialize_copy
:nodoc:
- (String) inspect
Produces a nicely formatted string version of rxp. Perhaps surprisingly, #inspect actually produces a more natural version of the string than #to_s.
/ab+c/ix.inspect #=> "/ab+c/ix"
- (MatchData?) match(str) - (MatchData?) match(str, pos)
Returns a MatchData object describing the match, or nil if there was no match. This is equivalent to retrieving the value of the special variable $~ following a normal match. If the second parameter is present, it specifies the position in the string to begin the search.
/(.)(.)(.)/.match("abc")[2] #=> "b"
/(.)(.)/.match("abc", 1)[2] #=> "c"
If a block is given, the block is invoked with the MatchData if the match succeeds, so that you can write
pat.match(str) {|m| ...}
instead of
if m = pat.match(str)
...
end
In this case, the return value is the value returned by the block.
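A short sketch of the block form, assuming a date-like input format chosen only for illustration. When there is no match the block is not called and nil is returned.

```ruby
# Hypothetical helper: pull the year out of a YYYY-MM-DD date, or return nil.
def extract_year(str)
  /(\d{4})-\d{2}-\d{2}/.match(str) { |m| m[1].to_i }
end

extract_year("released 2011-03-14")  #=> 2011
extract_year("no date here")         #=> nil
```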
- (Hash) named_captures
Returns a hash representing information about named captures of rxp.
Each key of the hash is the name of a named capture. Each value is an array listing the indexes of the corresponding capture groups.
/(?<foo>.)(?<bar>.)/.named_captures
#=> {"foo"=>[1], "bar"=>[2]}
/(?<foo>.)(?<foo>.)/.named_captures
#=> {"foo"=>[1, 2]}
If there are no named captures, an empty hash is returned.
/(.)(.)/.named_captures
#=> {}
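The index lists can be combined with a MatchData object to collect every string captured under a given name, including names that are used more than once. A minimal sketch; captures_by_name is a hypothetical helper.

```ruby
# Hypothetical helper: map each capture name to the strings captured by all
# groups that share that name.
def captures_by_name(re, str)
  m = re.match(str)
  return {} unless m
  result = {}
  re.named_captures.each do |name, indexes|
    # Look up each group index in the MatchData.
    result[name] = indexes.map { |i| m[i] }
  end
  result
end

captures_by_name(/(?<foo>.)(?<foo>.)(?<bar>.)/, "abc")
#=> {"foo"=>["a", "b"], "bar"=>["c"]}
```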
- (Array) names
Returns a list of names of captures as an array of strings.
/(?<foo>.)(?<bar>.)(?<baz>.)/.names
#=> ["foo", "bar", "baz"]
/(?<foo>.)(?<foo>.)/.names
#=> ["foo"]
/(.)(.)/.names
#=> []
- (Fixnum) options
Returns the set of bits corresponding to the options used when creating this Regexp (see Regexp::new for details). Note that additional bits may be set in the returned options: these are used internally by the regular expression code. These extra bits are ignored if the options are passed to Regexp::new.
Regexp::IGNORECASE #=> 1
Regexp::EXTENDED #=> 2
Regexp::MULTILINE #=> 4
/cat/.options #=> 0
/cat/ix.options #=> 3
Regexp.new('cat', true).options #=> 1
/\xa1\xa2/e.options #=> 16
r = /cat/ix
Regexp.new(r.source, r.options) #=> /cat/ix
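Since the returned value may carry internal bits, individual options are best tested with a bitwise AND rather than by comparing the whole integer. The sketch below uses a hypothetical option_names helper to list the three documented options that are set.

```ruby
# Hypothetical helper: list which documented options are set on a regexp,
# ignoring any extra internal bits.
def option_names(re)
  flags = {
    :ignorecase => Regexp::IGNORECASE,
    :extended   => Regexp::EXTENDED,
    :multiline  => Regexp::MULTILINE
  }
  flags.select { |_name, bit| (re.options & bit) != 0 }.keys
end

option_names(/cat/ix)  #=> [:ignorecase, :extended]
option_names(/cat/)    #=> []
```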
- (String) source
Returns the original string of the pattern.
/ab+c/ix.source #=> "ab+c"
Note that escape sequences are retained as is.
/\x20\+/.source #=> "\\x20\\+"
- (String) to_s
Returns a string containing the regular expression and its options (using the (?opts:source) notation). This string can be fed back in to Regexp::new to create a regular expression with the same semantics as the original. (However, Regexp#== may not return true when comparing the two, as the source of the regular expression itself may differ, as the example shows.) Regexp#inspect produces a generally more readable version of rxp.
r1 = /ab+c/ix #=> /ab+c/ix
s1 = r1.to_s #=> "(?ix-m:ab+c)"
r2 = Regexp.new(s1) #=> /(?ix-m:ab+c)/
r1 == r2 #=> false
r1.source #=> "ab+c"
r2.source #=> "(?ix-m:ab+c)"
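The (?opts:source) notation is also what makes it possible to embed one regexp inside another through interpolation, because interpolation calls to_s on the embedded regexp. A brief sketch; key_value_pattern is a hypothetical name.

```ruby
# Hypothetical helper: build a larger pattern out of a smaller,
# case-insensitive one. Interpolation inserts word.to_s.
def key_value_pattern
  word = /[a-z]+/i
  /#{word}=#{word}/
end

pair = key_value_pattern
pair.source         #=> "(?i-mx:[a-z]+)=(?i-mx:[a-z]+)"
pair =~ "KEY=value" #=> 0
pair.casefold?      #=> false
```

The outer regexp itself has no options set. The case insensitivity applies only inside each embedded group, which is why casefold? returns false.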
- (Integer?) ~(rxp)
Match---Matches rxp against the contents of $_. Equivalent to rxp =~ $_.
$_ = "input data"
~ /at/ #=> 7