RegExp.escape Proposal

A proposal for adding a RegExp.escape method to the ECMAScript standard: http://benjamingr.github.io/RexExp.escape/

Status

This proposal is a stage 0 (strawman) proposal and is awaiting specification, implementation and input.

Motivation

See this issue. We often want to build a regular expression out of a string without treating special characters in that string as regular expression tokens. For example, if we want to replace all occurrences of the string Hello., which we got from the user, we might be tempted to write ourLongText.replace(new RegExp(text, "g"), replacement), but this would match the . against any character rather than a literal dot.
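
To make the pitfall concrete, here is a minimal sketch. The sample strings and the hand-escaped pattern are made up for illustration and do not come from the proposal.

```javascript
// Hypothetical input: the user wants to replace the literal string "Hello."
const text = "Hello.";
const ourLongText = "Hello. HelloX Hello!";

// Unescaped: the "." means "any character", so all three greetings match.
const naive = ourLongText.replace(new RegExp(text, "g"), "Bye");

// Escaped by hand: only the literal "Hello." matches.
const literal = ourLongText.replace(new RegExp("Hello\\.", "g"), "Bye");

console.log(naive);   // Bye Bye Bye
console.log(literal); // Bye HelloX Hello!
```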

This is a fairly common use of regular expressions, and standardizing it would be useful.

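In other languages:

  • Perl: quotemeta(str)
  • PHP: preg_quote(str)
  • Python: re.escape(str)
  • Ruby: Regexp.escape(str)
  • Java: Pattern.quote(str)
  • C#, VB.NET: Regex.Escape(str)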

Note that the languages differ in what they do (Perl does something different from C#), but they all have the same goal.

Proposed Solution

We propose the addition of a RegExp.escape function, so that strings can be escaped for use inside regular expressions:

var str = prompt("Please enter a string");
str = RegExp.escape(str);
// Special regular expression tokens in str are now matched literally;
// replacement stands for whatever string should be substituted.
alert(ourLongText.replace(new RegExp(str, "g"), replacement));
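
An escaped string can also be combined with other pattern pieces. The sketch below assumes a stand-in helper, escapeLiteral, built on a common userland replace idiom; it is not the proposed RegExp.escape, and the helper name and sample terms are hypothetical.

```javascript
// Stand-in for the proposed RegExp.escape (hypothetical helper using a
// hard-coded character class, which is exactly what the proposal wants to avoid).
function escapeLiteral(s) {
  return String(s).replace(/[\\^$.*+?()[\]{}|]/g, "\\$&");
}

// Hypothetical user-supplied terms, each containing special tokens.
const terms = ["C++", "a.b", "(x)"];

// Escape each term, then join the results into a single alternation.
const pattern = new RegExp(terms.map(escapeLiteral).join("|"), "g");

const found = "C++ and a.b and (x), but not aXb".match(pattern);
console.log(found); // matches "C++", "a.b" and "(x)", but not "aXb"
```

Without the escaping step, the term C++ alone would make the RegExp constructor throw, because ++ is not a valid quantifier sequence.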

There is prior work at https://gist.github.com/kangax/9698100 which includes valuable ideas we've used. Unlike that proposal, this one uses the spec's SyntaxCharacter list of characters, so updates stay in sync with the specification instead of specifying the escaped characters manually.

Cross-Cutting Concerns

The list of escaped characters should be kept in sync with what the regular expression grammar considers to be syntax characters that need escaping. For this reason, instead of hard-coding the list of escaped characters, we escape the characters that the engine recognizes as a SyntaxCharacter. For example, if regex comments are ever added to the specification (presumably under a flag), this ensures they are properly escaped.

FAQ

Semantics

RegExp.escape(S)

When the escape function is called with an argument S the following steps are taken:

  1. Let str be ToString(S).
  2. ReturnIfAbrupt(str).
  3. Let cpList be a List containing, in order, the code points of str as defined in 6.1.4, starting at the first element of str.
  4. Let cuList be a new List.
  5. For each code point c in cpList in List order, do:
    a. If c is matched by SyntaxCharacter, then:
      i. Append code unit 0x005C (REVERSE SOLIDUS) to cuList.
    b. Append the elements of the UTF16Encoding (10.1.1) of c to cuList.
  6. Let L be a String whose elements are, in order, the elements of cuList.
  7. Return L.
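
The steps above can be approximated in user code. This is only a sketch under stated assumptions: the function name escapeRegExpSketch is hypothetical, and the SyntaxCharacter set is hard-coded to the 14 characters of the ES2015 grammar because user code cannot ask the engine which characters it treats as SyntaxCharacter. The escape prefix is a backslash, matching the usage examples.

```javascript
// Assumed SyntaxCharacter set (ES2015): ^ $ \ . * + ? ( ) [ ] { } |
// A real implementation would take this from the engine's grammar instead.
const SYNTAX_CHARACTERS = new Set("^$\\.*+?()[]{}|");

// Hypothetical helper mirroring the algorithm; not the native method.
function escapeRegExpSketch(S) {
  const str = `${S}`;                 // steps 1-2: ToString, throws for Symbols
  let result = "";                    // plays the role of cuList
  for (const c of str) {              // for...of walks the string by code point
    if (SYNTAX_CHARACTERS.has(c)) {
      result += "\\";                 // prefix syntax characters with a backslash
    }
    result += c;                      // append the code point's UTF-16 code units
  }
  return result;
}

console.log(escapeRegExpSketch("(*.*)")); // \(\*\.\*\)
```

Iterating by code point rather than by code unit keeps surrogate pairs, such as emoji, intact in the output.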

Usage Examples

RegExp.escape("The Quick Brown Fox"); // "The Quick Brown Fox"
RegExp.escape("Buy it. use it. break it. fix it."); // "Buy it\. use it\. break it\. fix it\."
RegExp.escape("(*.*)"); // "\(\*\.\*\)"
RegExp.escape("。^・ェ・^。"); // "。\^・ェ・\^。"
RegExp.escape("😊 *_* +_+ ... 👍"); // "😊 \*_\* \+_\+ \.\.\. 👍"
