2006-12-31

String format function for JavaScript

Note: This article is a republication of two posts from my CSDN blog: A high performance string format function for JavaScript and Just another high performance string format function for JavaScript.


Last month I wrote a logging tool for JavaScript, and to keep it from degrading performance I needed a fast string format function.

I found that although many JavaScript toolkits and frameworks, such as Atlas, provide a string format function, they are not very fast. Most use String.replace(regex, func) to replace placeholders (such as '{n}' or '$n'), but this tends to be slow, because func is called for every match and function calls in JavaScript are very expensive.
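
For comparison, here is a minimal sketch of the callback-based approach described above. It is not taken from Atlas or any other toolkit; the name formatWithCallback and the '{n}' placeholder style are only assumptions for illustration.

```javascript
// Hypothetical callback-based formatter, shown only to illustrate the
// approach criticized above; not the code of any real toolkit.
function formatWithCallback(pattern) {
    var args = arguments;
    // The callback below runs once for every '{n}' placeholder in the pattern.
    return pattern.replace(/\{(\d+)\}/g, function (match, index) {
        // '{0}' refers to the first value, which is args[1].
        return String(args[Number(index) + 1]);
    });
}

var greeting = formatWithCallback('Hello {0}, you are {1}!', 'world', 'welcome');
// greeting = "Hello world, you are welcome!"
```

Every placeholder costs one JavaScript function call here, which is exactly the overhead the rest of this article tries to avoid.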

Native functions, by contrast, are very fast, so to improve performance we should use them as much as we can. A wonderful example is StringBuilder, which uses Array.join to concatenate strings.
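
A rough sketch of such a StringBuilder, assuming nothing more than an array of parts and a single native join at the end (the method names are my own choice, not those of a particular library):

```javascript
// Minimal StringBuilder sketch: collect the pieces in an array and let
// the native Array.prototype.join do the concatenation in one call.
function StringBuilder() {
    this.parts = [];
}
StringBuilder.prototype.append = function (s) {
    this.parts.push(s);
    return this; // allow chaining
};
StringBuilder.prototype.toString = function () {
    return this.parts.join('');
};

var sb = new StringBuilder();
sb.append('Hello ').append('world').append('!');
var text = sb.toString();
// text = "Hello world!"
```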

So I created my own String.format(), and it is very fast.

Usage:


var name = 'world';
var result = 'Hello $1!'.format(name);
// result = "Hello world!"

var letters = String.format(
 '$1$2$3$4$5$6$7$8$9$10$11$12$13$14$15' +
 '$16$17$18$19$20$21$22$23$24$25$26',
 'a', 'b', 'c', 'd', 'e', 'f', 'g',
 'h', 'i', 'j', 'k', 'l', 'm', 'n',
 'o', 'p', 'q', 'r', 's', 't',
 'u', 'v', 'w', 'x', 'y', 'z');
// letters = "abcdefghijklmnopqrstuvwxyz"

The latter is almost as fast as the former. As far as I know, no other implementation matches this performance.

Note:

  • It depends on String.replace(regex, string), so you can use at most 99 placeholders ($1 to $99). If the script engine is too old, it may support only nine ($1 to $9) or not work at all (e.g. JScript before 5.5?).
  • A literal $ must be escaped as $$ (two $).
  • $` and $' will be removed, and $& will be replaced with the whole joined argument string, separator characters included :)
  • '$1 1'.format('a') results in 'a 1'. If you want to drop the space, you can't write '$11'.format(...), because that refers to the 11th parameter; write '$011'.format(...) instead.
  • There is one magic character that you can't use at all; currently I chose 0x1f (the unit separator in ASCII and Unicode).
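
The notes above can be checked with a small standalone copy of the technique. This is only a sketch: SEP, ARGS_PATTERN and the free function format are hypothetical names used so that String itself is left unpatched.

```javascript
// Standalone copy of the technique (hypothetical names, String is not patched).
var SEP = String.fromCharCode(0x1f);
var ARGS_PATTERN = new RegExp('^[^' + SEP + ']*' +
    new Array(100).join('(?:.([^' + SEP + ']*))?'));

function format(s) {
    return Array.prototype.join.call(arguments, SEP).replace(ARGS_PATTERN, s);
}

var price = format('$$$1', 5);          // "$$" is a literal "$", then $1
var spaced = format('$1 1', 'a');       // "a 1"
var tight = format('$011', 'a');        // "$01" is the 1st parameter, then "1"
var eleventh = format('$11', 'a');      // 11th parameter, which is missing
var around = format("[$`$']", 'a');     // both are empty, so they vanish
var whole = format('$&', 'a', 'b');     // the raw joined string
// price = "$5", tight = "a1", eleventh = "", around = "[]"
// whole = "$&" + SEP + "a" + SEP + "b"
```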

Source code:


// Copyright (c) HE Shi-Jun , 2006
// Below codes can be used under GPL (v2 or later) or LGPL (v2.1 or later) license

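if (!String._FORMAT_SEPARATOR) {
    // Magic character used to join the arguments; it must not occur in them.
    String._FORMAT_SEPARATOR = String.fromCharCode(0x1f);
    // One leading segment followed by 99 optional "separator + argument" groups,
    // so the arguments are captured as $1 to $99.
    String._FORMAT_ARGS_PATTERN = new RegExp('^[^' + String._FORMAT_SEPARATOR + ']*'
        + new Array(100).join('(?:.([^' + String._FORMAT_SEPARATOR + ']*))?'));
}
if (!String.format) {
    // String.format(pattern, arg1, arg2, ...): the pattern itself is the leading segment.
    String.format = function (s) {
        return Array.prototype.join.call(arguments, String._FORMAT_SEPARATOR)
            .replace(String._FORMAT_ARGS_PATTERN, s);
    };
}
if (!''.format) {
    // 'pattern'.format(arg1, arg2, ...): the leading segment is empty.
    String.prototype.format = function () {
        return (String._FORMAT_SEPARATOR +
            Array.prototype.join.call(arguments, String._FORMAT_SEPARATOR))
            .replace(String._FORMAT_ARGS_PATTERN, this);
    };
}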
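
To see why this works, here is a scaled-down trace with three capture groups instead of 99. The names pattern3, joined and match are only illustrative and are not part of the code above.

```javascript
var SEP = String.fromCharCode(0x1f);
// Same construction as the real pattern, but with 3 groups: new Array(4).join(x)
// yields three copies of x.
var pattern3 = new RegExp('^[^' + SEP + ']*' +
    new Array(4).join('(?:.([^' + SEP + ']*))?'));

// What String.format('Hello $1 and $2!', 'Alice', 'Bob') builds internally:
var joined = ['Hello $1 and $2!', 'Alice', 'Bob'].join(SEP);

// The regex swallows the whole joined string and captures each argument.
var match = pattern3.exec(joined);
// match[0] is the whole joined string, match[1] = "Alice", match[2] = "Bob";
// the third group does not participate in the match.

// Replacing that single match with the pattern lets the native replace
// substitute $1 and $2 without calling any JavaScript function.
var result = joined.replace(pattern3, 'Hello $1 and $2!');
// result = "Hello Alice and Bob!"
```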

Below is just another format function:


// Copyright (c) HE Shi-Jun , 2006
// Below codes can be used under GPL (v2 or later) or LGPL (v2.1 or later) license

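format2.cache = {};
function format2(pattern) {
    // Own-property check, so patterns such as 'toString' do not hit Object.prototype.
    if (!Object.prototype.hasOwnProperty.call(format2.cache, pattern)) {
        // Compile the pattern: 'Hello $1!' becomes: return "Hello " + arguments[1] + "!";
        format2.cache[pattern] = new Function('return "' + pattern.replace(/"/g, '\\"').replace(/\$([0-9]+)/g, '" + arguments[$1] + "').replace(/\$\$/g, '$') + '";');
    }
    // Pass the arguments through, so that arguments[1] is the first value.
    return format2.cache[pattern].apply(null, arguments);
}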

Compared to the previous method, this one is even faster under heavy use (especially on Firefox and Opera), because it compiles the pattern into a function and caches it. But it wastes memory, since every distinct pattern stays in the cache. So the best practice is to combine the two methods.
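
One possible way to combine the two is sketched below; it is only an illustration under my own assumptions, not a tuned implementation. Up to MAX_CACHED patterns are compiled and cached, and any further pattern falls back to the regex method, so the memory used by the cache stays bounded. All names (MAX_CACHED, compile, viaRegex, format) are hypothetical. The sketch also assumes that patterns use only $1 to $99 and $$, and that every referenced argument is supplied.

```javascript
var SEP = String.fromCharCode(0x1f);
var ARGS_PATTERN = new RegExp('^[^' + SEP + ']*' +
    new Array(100).join('(?:.([^' + SEP + ']*))?'));

var MAX_CACHED = 2;              // assumption: tiny limit, just for the demo
var cache = Object.create(null); // pattern -> compiled function
var cacheSize = 0;

// Regex method: no per-pattern state, args[0] is the pattern.
function viaRegex(args) {
    return Array.prototype.join.call(args, SEP).replace(ARGS_PATTERN, args[0]);
}

// Compile method: turn the pattern into a function that concatenates strings.
// The callback runs only at compile time, not on every format call.
function compile(pattern) {
    var source = JSON.stringify(pattern).replace(/\$(\$|[0-9]{1,2})/g,
        function (m, d) {
            return d === '$' ? '$' : '" + arguments[' + Number(d) + '] + "';
        });
    return new Function('return ' + source + ';');
}

function format(pattern) {
    var fn = cache[pattern];
    if (!fn) {
        if (cacheSize >= MAX_CACHED) {
            return viaRegex(arguments); // cache is full: use the regex method
        }
        fn = cache[pattern] = compile(pattern);
        cacheSize++;
    }
    return fn.apply(null, arguments);
}

var a = format('Hello $1!', 'world');  // compiled and cached
var b = format('$1+$2', 1, 2);         // compiled and cached
var c = format('cost: $$$1', 5);       // cache is full, regex method
// a = "Hello world!", b = "1+2", c = "cost: $5"
```

A real combination would probably want a smarter policy, for example compiling only the patterns that are used often.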

And here is the version without a cache. It is not very useful, because it loses the advantage of caching and is very slow on Opera:


// Copyright (c) HE Shi-Jun , 2006
// Below codes can be used under GPL (v2 or later) or LGPL (v2.1 or later) license

function format3(pattern) {
    // Build the same expression as format2, but evaluate it on every call.
    return eval('"' + pattern.replace(/"/g, '\\"').replace(/\$([0-9]+)/g, '" + arguments[$1] + "').replace(/\$\$/g, '$') + '"');
}
