JavaScript-Minifier-XS-0.09000755000764000764 011464171071 15473 5ustar00grahamgraham000000000000Changes000444000764000764 376511464171071 17057 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09Revision history for Perl extension JavaScript::Minifier::XS. 0.09 Tue Nov 2 22:12 PDT 2010 - Bump required Perl version to 5.8.8; oldest release w/Newxz() available. 0.08 Wed Jul 21 21:23 PDT 2010 - use Newxz/Safefree for memory management, instead of malloc/free. Thanks to Kenichi Ishigaki for his patch to CSS-Minifier-XS that prompted this. 0.07 Fri Apr 23 23:44 PDT 2010 - switch to Git 0.06 Thu Aug 6, 22:08 PDT 2009 - fix invalid "L" POD sequences 0.05 Wed Jul 16, 23:35 PDT 2008 - don't segfault w/older Perls if we minify right down to nothing. (similar behaviour as described for CSS::Minifier::XS in RT #36557) 0.04 Wed May 28, 21:58 PDT 2008 - rebuild packages; wrong version number in META.yml 0.03 Wed May 28, 14:46 PDT 2008 - fix minification when a regexp follows a comment that ends with something that looks like code; was treating it as division instead of as a literal - properly clear end of internally allocated buffers - added some debugging output, which could be enabled at compile-time 0.02 Tue May 6 00:16 PDT 2008 - rebuild packages; EU::MM borked my META.yml 0.01 Mon May 5 15:11 PDT 2008 - fix minification of "division of an array subscript". Thanks to Ingy and Dan at Socialtext for the JS! - first NON-devel release 0.01_05 Sat Oct 20 22:48 PDT 2007 - don't use "strcasestr()"; not available on Solaris 0.01_04 Wed Oct 17 15:56 PDT 2007 - fix t/02-minify.t, so it doesn't try to "use_ok()" before issuing a test plan 0.01_03 Tue Oct 16 19:47 PDT 2007 - don't use "strndup()"; not available on all systems - we require Perl 5.006; update Build.PL and XS.pm to denote this 0.01_02 Tue Oct 16 12:22 PDT 2007 - relocate the XS file so that its picked up properly by EU::MM when running "perl Makefile.PL" to do a build. 
0.01_01 Mon Oct 15 22:11 PDT 2007 - initial public version Makefile.PL000444000764000764 67311464171071 17511 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09# Note: this file was auto-generated by Module::Build::Compat version 0.35 require 5.8.8; use ExtUtils::MakeMaker; WriteMakefile ( 'PL_FILES' => {}, 'INSTALLDIRS' => 'site', 'NAME' => 'JavaScript::Minifier::XS', 'EXE_FILES' => [], 'VERSION_FROM' => 'lib/JavaScript/Minifier/XS.pm', 'PREREQ_PM' => { 'Test::More' => 0 } ) ; Build.PL000444000764000764 105011464171071 17041 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09use strict; use warnings; use Module::Build; Module::Build->new( 'module_name' => 'JavaScript::Minifier::XS', 'license' => 'perl', 'dist_author' => 'Graham TerMarsch (cpan@howlingfrog.com)', 'create_makefile_pl'=> 'traditional', 'xs_files' => { 'XS.xs' => 'lib/JavaScript/Minifier/XS.xs', }, 'requires' => { 'perl' => '5.8.8', }, 'build_requires' => { 'Test::More' => 0, }, )->create_build_script(); README000444000764000764 62011464171071 16407 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09JavaScript::Minifier::XS minifies JavaScript documents by removing un-necessary whitespace Copyright (C) 2007, Graham TerMarsch. All Rights Reserved. This is free software; you can redistribute it and/or modify it under the same terms as Perl itself. 
To install: perl Build.PL ./Build ./Build test ./Build install or: perl Makefile.PL make make test make install MANIFEST000444000764000764 156111464171071 16705 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09Build.PL Makefile.PL Changes MANIFEST MANIFEST.SKIP META.yml README XS.xs lib/JavaScript/Minifier/XS.pm t/01-loads.t t/02-minify.t t/03-minifies-to-nothing.t t/99-benchmark.t t/pod.t t/pod-coverage.t t/js/comments.js t/js/comments.min t/js/comments-before-a-regex.js t/js/comments-before-a-regex.min t/js/comments-ie-conditional.js t/js/comments-ie-conditional.min t/js/division.js t/js/division.min t/js/division-of-array-subscripts.js t/js/division-of-array-subscripts.min t/js/leading-whitespace.js t/js/leading-whitespace.min t/js/literals-double-quotes.js t/js/literals-double-quotes.min t/js/literals-regexp.js t/js/literals-regexp.min t/js/literals-single-quotes.js t/js/literals-single-quotes.min t/js/postfix-sigil.js t/js/postfix-sigil.min t/js/prefix-sigil.js t/js/prefix-sigil.min t/js/simple.js t/js/simple.min t/js/trailing-whitespace.js t/js/trailing-whitespace.min MANIFEST.SKIP000444000764000764 4311464171071 17404 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09Build _build .git/ \..*\.swp blib/ META.yml000444000764000764 102111464171071 17014 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09--- name: JavaScript-Minifier-XS version: 0.09 author: - Graham TerMarsch (cpan@howlingfrog.com) abstract: XS based JavaScript minifier license: perl resources: license: http://dev.perl.org/licenses/ build_requires: Test::More: 0 requires: perl: v5.8.8 configure_requires: Module::Build: 0.35 provides: JavaScript::Minifier::XS: file: lib/JavaScript/Minifier/XS.pm version: 0.09 generated_by: Module::Build version 0.35 meta-spec: url: http://module-build.sourceforge.net/META-spec-v1.4.html version: 1.4 XS.xs000444000764000764 6117111464171071 16505 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09#include #include #include 
#include #include #include #include /* uncomment to enable debugging output */ //#define DEBUG 1 /* **************************************************************************** * CHARACTER CLASS METHODS * **************************************************************************** */ int charIsSpace(char ch) { if (ch == ' ') return 1; if (ch == '\t') return 1; return 0; } int charIsEndspace(char ch) { if (ch == '\n') return 1; if (ch == '\r') return 1; if (ch == '\f') return 1; return 0; } int charIsWhitespace(char ch) { return charIsSpace(ch) || charIsEndspace(ch); } int charIsIdentifier(char ch) { if ((ch >= 'a') && (ch <= 'z')) return 1; if ((ch >= 'A') && (ch <= 'Z')) return 1; if ((ch >= '0') && (ch <= '9')) return 1; if (ch == '_') return 1; if (ch == '$') return 1; if (ch == '\\') return 1; if (ch > 126) return 1; return 0; } int charIsInfix(char ch) { /* EOL characters before+after these characters can be removed */ if (ch == ',') return 1; if (ch == ';') return 1; if (ch == ':') return 1; if (ch == '=') return 1; if (ch == '&') return 1; if (ch == '%') return 1; if (ch == '*') return 1; if (ch == '<') return 1; if (ch == '>') return 1; if (ch == '?') return 1; if (ch == '|') return 1; if (ch == '\n') return 1; return 0; } int charIsPrefix(char ch) { /* EOL characters after these characters can be removed */ if (ch == '{') return 1; if (ch == '(') return 1; if (ch == '[') return 1; if (ch == '!') return 1; return charIsInfix(ch); } int charIsPostfix(char ch) { /* EOL characters before these characters can be removed */ if (ch == '}') return 1; if (ch == ')') return 1; if (ch == ']') return 1; return charIsInfix(ch); } /* **************************************************************************** * TYPE DEFINITIONS * **************************************************************************** */ typedef enum { NODE_EMPTY, NODE_WHITESPACE, NODE_BLOCKCOMMENT, NODE_LINECOMMENT, NODE_IDENTIFIER, NODE_LITERAL, NODE_SIGIL } NodeType; static char* strNodeTypes[] = 
{ "empty", "whitespace", "block comment", "line comment", "identifier", "literal", "sigil" }; struct _Node; typedef struct _Node Node; struct _Node { /* linked list pointers */ Node* prev; Node* next; /* node internals */ char* contents; size_t length; NodeType type; size_t id; }; typedef struct { /* linked list pointers */ Node* head; Node* tail; /* doc internals */ const char* buffer; size_t length; size_t offset; } JsDoc; /* **************************************************************************** * NODE CHECKING MACROS/FUNCTIONS * **************************************************************************** */ /* checks to see if the node is the given string, case INSENSITIVELY */ int nodeEquals(Node* node, const char* string) { return (strcasecmp(node->contents, string) == 0); } /* checks to see if the node contains the given string, case INSENSITIVELY */ int nodeContains(Node* node, const char* string) { const char* haystack = node->contents; size_t len = strlen(string); char ul_start[2] = { tolower(*string), toupper(*string) }; /* if node is shorter we know we're not going to have a match */ if (len > node->length) return 0; /* find the needle in the haystack */ while (haystack && *haystack) { /* find first char of needle */ haystack = strpbrk( haystack, ul_start ); if (haystack == NULL) return 0; /* check if the rest matches */ if (strncasecmp(haystack, string, len) == 0) return 1; /* nope, move onto next character in the haystack */ haystack ++; } /* no match */ return 0; } /* checks to see if the node begins with the given string, case INSENSITIVELY */ int nodeBeginsWith(Node* node, const char* string) { size_t len = strlen(string); if (len > node->length) return 0; return (strncasecmp(node->contents, string, len) == 0); } /* checks to see if the node ends with the given string, case INSENSITIVELY */ int nodeEndsWith(Node* node, const char* string) { size_t len = strlen(string); size_t off = node->length - len; if (len > node->length) return 0; return 
        (strncasecmp(node->contents+off, string, len) == 0);
}

/* macros to help see what kind of node we've got */
#define nodeIsWHITESPACE(node) ((node->type == NODE_WHITESPACE))
#define nodeIsBLOCKCOMMENT(node) ((node->type == NODE_BLOCKCOMMENT))
#define nodeIsLINECOMMENT(node) ((node->type == NODE_LINECOMMENT))
#define nodeIsIDENTIFIER(node) ((node->type == NODE_IDENTIFIER))
#define nodeIsLITERAL(node) ((node->type == NODE_LITERAL))
#define nodeIsSIGIL(node) ((node->type == NODE_SIGIL))

/* a node is empty if it has no type, no length, or no contents; note the
 * comparison (==) against NULL, not an assignment */
#define nodeIsEMPTY(node) ((node->type == NODE_EMPTY) || (node->length==0) || (node->contents==NULL))

#define nodeIsCOMMENT(node) (nodeIsBLOCKCOMMENT(node) || nodeIsLINECOMMENT(node))
#define nodeIsIECONDITIONALBLOCKCOMMENT(node) (nodeIsBLOCKCOMMENT(node) && nodeBeginsWith(node,"/*@") && nodeEndsWith(node,"@*/"))
#define nodeIsIECONDITIONALLINECOMMENT(node) (nodeIsLINECOMMENT(node) && nodeBeginsWith(node,"//@"))
#define nodeIsIECONDITIONALCOMMENT(node) (nodeIsIECONDITIONALBLOCKCOMMENT(node) || nodeIsIECONDITIONALLINECOMMENT(node))

#define nodeIsPREFIXSIGIL(node) (nodeIsSIGIL(node) && charIsPrefix(node->contents[0]))
#define nodeIsPOSTFIXSIGIL(node) (nodeIsSIGIL(node) && charIsPostfix(node->contents[0]))

#define nodeIsENDSPACE(node) (nodeIsWHITESPACE(node) && charIsEndspace(node->contents[0]))
#define nodeIsCHAR(node,ch) ((node->contents[0]==ch) && (node->length==1))

/* ****************************************************************************
 * NODE MANIPULATION FUNCTIONS
 * ****************************************************************************
 */
/* allocates a new node */
static size_t js_id_counter=0; // XXX
Node* JsAllocNode() {
    Node* node;
    Newxz(node, 1, Node);
    node->prev = NULL;
    node->next = NULL;
    node->contents = NULL;
    node->length = 0;
    node->type = NODE_EMPTY;
    node->id = js_id_counter++; // XXX
    return node;
}

/* frees the memory used by a node */
void JsFreeNode(Node* node) {
    if (node->contents)
        Safefree(node->contents);
    Safefree(node);
}

void JsFreeNodeList(Node* head) {
    while (head) {
        Node* tmp = head->next;
        JsFreeNode(head);
        head = tmp;
    }
}

/* clears the contents of a node */
void JsClearNodeContents(Node* node) {
    if (node->contents)
        Safefree(node->contents);
    node->contents = NULL;
    node->length = 0;
}

/* sets the contents of a node */
void JsSetNodeContents(Node* node, const char* string, size_t len) {
    size_t bufSize = len + 1;
    /* clear node, set new length */
    JsClearNodeContents(node);
    node->length = len;
    /* allocate string, fill with NULs, and copy */
    Newxz(node->contents, bufSize, char);
    strncpy( node->contents, string, len );
}

/* removes the node from the list and discards it entirely */
void JsDiscardNode(Node* node) {
    if (node->prev)
        node->prev->next = node->next;
    if (node->next)
        node->next->prev = node->prev;
    JsFreeNode(node);
}

/* appends the node to the given element */
void JsAppendNode(Node* element, Node* node) {
    if (element->next)
        element->next->prev = node;
    node->next = element->next;
    node->prev = element;
    element->next = node;
}

/* collapses a node to a single whitespace character. If the node contains any
 * endspace characters, that is what we're collapsed to.
 */
void JsCollapseNodeToWhitespace(Node* node) {
    if (node->contents) {
        char ws = node->contents[0];
        size_t idx;
        /* prefer the first endspace character, if there is one */
        for (idx=0; idx<node->length; idx++) {
            if (charIsEndspace(node->contents[idx])) {
                ws = node->contents[idx];
                break;
            }
        }
        JsSetNodeContents(node, &ws, 1);
    }
}

/* collapses a node to a single endspace character. If the node doesn't
 * contain any endspace characters, the node is collapsed to an empty string.
 */
void JsCollapseNodeToEndspace(Node* node) {
    if (node->contents) {
        char ws = 0;
        size_t idx;
        /* find the first endspace character, if there is one */
        for (idx=0; idx<node->length; idx++) {
            if (charIsEndspace(node->contents[idx])) {
                ws = node->contents[idx];
                break;
            }
        }
        JsClearNodeContents(node);
        if (ws)
            JsSetNodeContents(node, &ws, 1);
    }
}

/* ****************************************************************************
 * TOKENIZING FUNCTIONS
 * ****************************************************************************
 */
/* extracts a quoted literal string */
void _JsExtractLiteral(JsDoc* doc, Node* node) {
    const char* buf = doc->buffer;
    size_t offset = doc->offset;
    char delimiter = buf[offset];
    /* skip start of literal */
    offset ++;
    /* search for end of literal */
    while (offset < doc->length) {
        if (buf[offset] == '\\') {
            /* escaped character; skip */
            offset ++;
        }
        else if (buf[offset] == delimiter) {
            const char* start = buf + doc->offset;
            size_t length = offset - doc->offset + 1;
            JsSetNodeContents(node, start, length);
            node->type = NODE_LITERAL;
            return;
        }
        /* move onto next character */
        offset ++;
    }
    croak( "unterminated quoted string literal" );
}

/* extracts a block comment */
void _JsExtractBlockComment(JsDoc* doc, Node* node) {
    const char* buf = doc->buffer;
    size_t offset = doc->offset;
    /* skip start of comment */
    offset ++; /* skip "/" */
    offset ++; /* skip "*" */
    /* search for end of comment block */
    while (offset < doc->length) {
        if (buf[offset] == '*') {
            if (buf[offset+1] == '/') {
                const char* start = buf + doc->offset;
                size_t length = offset - doc->offset + 2;
                JsSetNodeContents(node, start, length);
                node->type = NODE_BLOCKCOMMENT;
                return;
            }
        }
        /* move onto next character */
        offset ++;
    }
    croak( "unterminated block comment" );
}

/* extracts a line comment */
void _JsExtractLineComment(JsDoc* doc, Node* node) {
    const char* buf = doc->buffer;
    size_t offset = doc->offset;
    /* skip start of comment */
    offset ++; /* skip "/" */
    offset ++; /* skip "/" */
    /* search for end of line */
    while ((offset < doc->length) &&
!charIsEndspace(buf[offset])) offset ++; /* found it ! */ { const char* start = buf + doc->offset; size_t length = offset - doc->offset; JsSetNodeContents(node, start, length); node->type = NODE_LINECOMMENT; } } /* extracts a run of whitespace characters */ void _JsExtractWhitespace(JsDoc* doc, Node* node) { const char* buf = doc->buffer; size_t offset = doc->offset; while ((offset < doc->length) && charIsWhitespace(buf[offset])) offset ++; JsSetNodeContents(node, doc->buffer+doc->offset, offset-doc->offset); node->type = NODE_WHITESPACE; } /* extracts an identifier */ void _JsExtractIdentifier(JsDoc* doc, Node* node) { const char* buf = doc->buffer; size_t offset = doc->offset; while ((offset < doc->length) && charIsIdentifier(buf[offset])) offset ++; JsSetNodeContents(node, doc->buffer+doc->offset, offset-doc->offset); node->type = NODE_IDENTIFIER; } /* extracts a -single- symbol/sigil */ void _JsExtractSigil(JsDoc* doc, Node* node) { JsSetNodeContents(node, doc->buffer+doc->offset, 1); node->type = NODE_SIGIL; } /* tokenizes the given string and returns the list of nodes */ Node* JsTokenizeString(const char* string) { JsDoc doc; /* initialize our JS document object */ doc.head = NULL; doc.tail = NULL; doc.buffer = string; doc.length = strlen(string); doc.offset = 0; /* parse the JS */ while ((doc.offset < doc.length) && (doc.buffer[doc.offset])) { /* allocate a new node */ Node* node = JsAllocNode(); if (!doc.head) doc.head = node; if (!doc.tail) doc.tail = node; /* parse the next node out of the JS */ if (doc.buffer[doc.offset] == '/') { if (doc.buffer[doc.offset+1] == '*') _JsExtractBlockComment(&doc, node); else if (doc.buffer[doc.offset+1] == '/') _JsExtractLineComment(&doc, node); else { /* could be "division" or "regexp", but need to know more about * our context... 
*/ /* find last non-whitespace, non-comment node */ Node* last = doc.tail; char ch = 0; while (nodeIsWHITESPACE(last) || nodeIsCOMMENT(last)) last = last->prev; ch = last->contents[last->length-1]; /* see if we're "division" or "regexp" */ if (ch && ((ch == ')') || (ch == '.') || (ch == ']') || (charIsIdentifier(ch)))) _JsExtractSigil(&doc, node); /* division */ else _JsExtractLiteral(&doc, node); /* regexp */ } } else if ((doc.buffer[doc.offset] == '"') || (doc.buffer[doc.offset] == '\'')) _JsExtractLiteral(&doc, node); else if (charIsWhitespace(doc.buffer[doc.offset])) _JsExtractWhitespace(&doc, node); else if (charIsIdentifier(doc.buffer[doc.offset])) _JsExtractIdentifier(&doc, node); else _JsExtractSigil(&doc, node); /* move ahead to the end of the parsed node */ doc.offset += node->length; /* add the node to our list of nodes */ if (node != doc.tail) JsAppendNode(doc.tail, node); doc.tail = node; /* some debugging info */ #ifdef DEBUG { int idx; printf("----------------------------------------------------------------\n"); printf("%s: %s\n", strNodeTypes[node->type], node->contents); printf("next: '"); for (idx=0; idx<=10; idx++) { if ((doc.offset+idx) >= doc.length) break; if (!doc.buffer[doc.offset+idx]) break; printf("%c", doc.buffer[doc.offset+idx]); } printf("'\n"); } #endif } /* return the node list */ return doc.head; } /* **************************************************************************** * MINIFICATION FUNCTIONS * **************************************************************************** */ /* collapses all of the nodes to their shortest possible representation */ void JsCollapseNodes(Node* curr) { while (curr) { Node* next = curr->next; switch (curr->type) { case NODE_WHITESPACE: /* all WS gets collapsed */ JsCollapseNodeToWhitespace(curr); break; case NODE_BLOCKCOMMENT: /* block comments get collapsed to WS if that's a side-affect * of their placement in the JS document. 
*/ if (!nodeIsIECONDITIONALBLOCKCOMMENT(curr)) { int convert_to_ws = 0; /* find surrounding non-WS nodes */ Node* nonws_prev = curr->prev; Node* nonws_next = curr->next; while (nonws_prev && nodeIsWHITESPACE(nonws_prev)) nonws_prev = nonws_prev->prev; while (nonws_next && nodeIsWHITESPACE(nonws_next)) nonws_next = nonws_next->next; /* check what we're between... */ if (nonws_prev && nonws_next) { /* between identifiers? convert to WS */ if (nodeIsIDENTIFIER(nonws_prev) && nodeIsIDENTIFIER(nonws_next)) convert_to_ws = 1; /* between possible pre/post increment? convert to WS */ if (nodeIsCHAR(nonws_prev,'-') && nodeIsCHAR(nonws_next,'-')) convert_to_ws = 1; if (nodeIsCHAR(nonws_prev,'+') && nodeIsCHAR(nonws_next,'+')) convert_to_ws = 1; } /* convert to WS */ if (convert_to_ws) { JsSetNodeContents(curr," ",1); curr->type = NODE_WHITESPACE; } } break; default: break; } curr = next; } } /* checks to see whether we can prune the given node from the list. * * THIS is the function that controls the bulk of the minification process. 
*/ enum { PRUNE_NO, PRUNE_PREVIOUS, PRUNE_CURRENT, PRUNE_NEXT }; int JsCanPrune(Node* node) { Node* prev = node->prev; Node* next = node->next; switch (node->type) { case NODE_EMPTY: /* prune empty nodes */ return PRUNE_CURRENT; case NODE_WHITESPACE: /* multiple whitespace gets pruned to preserve endspace */ if (prev && nodeIsENDSPACE(prev)) return PRUNE_CURRENT; if (prev && nodeIsWHITESPACE(prev)) return PRUNE_PREVIOUS; /* leading whitespace gets pruned */ if (!prev) return PRUNE_CURRENT; /* trailing whitespace gets pruned */ if (!next) return PRUNE_CURRENT; /* keep all other whitespace */ return PRUNE_NO; case NODE_BLOCKCOMMENT: /* keep comments that contain the word "copyright" */ if (nodeContains(node, "copyright")) return PRUNE_NO; /* keep comments that are for IE Conditional Compilation */ if (nodeIsIECONDITIONALBLOCKCOMMENT(node)) return PRUNE_NO; /* block comments get pruned */ return PRUNE_CURRENT; case NODE_LINECOMMENT: /* keep comments that contain the word "copyright" */ if (nodeContains(node, "copyright")) return PRUNE_NO; /* keep comments that are for IE Conditional Compilation */ if (nodeIsIECONDITIONALLINECOMMENT(node)) return PRUNE_NO; /* line comments get pruned */ return PRUNE_CURRENT; case NODE_IDENTIFIER: /* remove whitespace (but NOT endspace) after identifiers, provided * that next thing is -NOT- another identifier */ if (nodeIsWHITESPACE(next) && !nodeIsENDSPACE(next) && next->next && !nodeIsIDENTIFIER(next->next)) return PRUNE_NEXT; /* keep all identifiers */ return PRUNE_NO; case NODE_LITERAL: /* keep all literals */ return PRUNE_NO; case NODE_SIGIL: /* remove whitespace after "prefix" sigils */ if (nodeIsPREFIXSIGIL(node) && next && nodeIsWHITESPACE(next)) return PRUNE_NEXT; /* remove whitespace before "postfix" sigils */ if (nodeIsPOSTFIXSIGIL(node) && prev && nodeIsWHITESPACE(prev)) return PRUNE_PREVIOUS; /* remove whitespace (but NOT endspace) after closing brackets */ if (next && nodeIsWHITESPACE(next) && !nodeIsENDSPACE(next) && 
                (nodeIsCHAR(node,')') || nodeIsCHAR(node,'}') || nodeIsCHAR(node,']')))
                return PRUNE_NEXT;
            /* remove whitespace surrounding "/", EXCEPT where it'd cause "//" */
            if (nodeIsCHAR(node,'/') && prev && nodeIsWHITESPACE(prev) && prev->prev && !nodeEndsWith(prev->prev,"/"))
                return PRUNE_PREVIOUS;
            if (nodeIsCHAR(node,'/') && next && nodeIsWHITESPACE(next) && next->next && !nodeBeginsWith(next->next,"/"))
                return PRUNE_NEXT;
            /* remove whitespace (but NOT endspace) surrounding "-", EXCEPT where it'd cause "--" */
            if (nodeIsCHAR(node,'-') && prev && nodeIsWHITESPACE(prev) && !nodeIsENDSPACE(prev) && prev->prev && !nodeIsCHAR(prev->prev,'-'))
                return PRUNE_PREVIOUS;
            if (nodeIsCHAR(node,'-') && next && nodeIsWHITESPACE(next) && !nodeIsENDSPACE(next) && next->next && !nodeIsCHAR(next->next,'-'))
                return PRUNE_NEXT;
            /* remove whitespace (but NOT endspace) surrounding "+", EXCEPT where it'd cause "++" */
            if (nodeIsCHAR(node,'+') && prev && nodeIsWHITESPACE(prev) && !nodeIsENDSPACE(prev) && prev->prev && !nodeIsCHAR(prev->prev,'+'))
                return PRUNE_PREVIOUS;
            if (nodeIsCHAR(node,'+') && next && nodeIsWHITESPACE(next) && !nodeIsENDSPACE(next) && next->next && !nodeIsCHAR(next->next,'+'))
                return PRUNE_NEXT;
            /* keep all other sigils */
            return PRUNE_NO;
    }
    /* keep anything else */
    return PRUNE_NO;
}

/* prune nodes from the list */
Node* JsPruneNodes(Node *head) {
    Node* curr = head;
    while (curr) {
        /* see if/how we can prune this node */
        int prune = JsCanPrune(curr);
        /* prune. each block is responsible for moving onto the next node */
        Node* prev = curr->prev;
        Node* next = curr->next;
        switch (prune) {
            case PRUNE_PREVIOUS:
                /* discard previous node */
                JsDiscardNode(prev);
                /* reset "head" if that's what got pruned; the current node
                 * is now the first one in the list */
                if (prev == head)
                    head = curr;
                break;
            case PRUNE_CURRENT:
                /* discard current node */
                JsDiscardNode(curr);
                /* reset "head" if that's what got pruned */
                if (curr == head)
                    head = prev ? prev : next;
                /* backup and try again if possible */
                curr = prev ?
prev : next; break; case PRUNE_NEXT: /* discard next node */ JsDiscardNode(next); /* stay on current node, and try again */ break; default: /* move ahead to next node */ curr = next; break; } } /* return the (possibly new) head node back to the caller */ return head; } /* **************************************************************************** * Minifies the given JavaScript, returning a newly allocated string back to * the caller (YOU'RE responsible for freeing its memory). * **************************************************************************** */ char* JsMinify(const char* string) { char* results; /* PASS 1: tokenize JS into a list of nodes */ Node* head = JsTokenizeString(string); if (!head) return NULL; /* PASS 2: collapse nodes */ JsCollapseNodes(head); /* PASS 3: prune nodes */ head = JsPruneNodes(head); if (!head) return NULL; /* PASS 4: re-assemble JS into single string */ { Node* curr; char* ptr; /* allocate the result buffer to the same size as the original JS; in a * worst case scenario that's how much memory we'll need for it. 
         */
        Newxz(results, (strlen(string)+1), char);
        ptr = results;
        /* copy node contents into result buffer */
        curr = head;
        while (curr) {
            memcpy(ptr, curr->contents, curr->length);
            ptr += curr->length;
            curr = curr->next;
        }
        *ptr = 0;
    }
    /* free memory used by node list */
    JsFreeNodeList(head);
    /* return resulting minified JS back to caller */
    return results;
}

MODULE = JavaScript::Minifier::XS PACKAGE = JavaScript::Minifier::XS

PROTOTYPES: disable

SV*
minify(string)
    SV* string
    INIT:
        char* buffer = NULL;
        RETVAL = &PL_sv_undef;
    CODE:
        /* minify the JavaScript */
        buffer = JsMinify( SvPVX(string) );
        /* hand back the minified JS (if we had any) */
        if (buffer != NULL) {
            RETVAL = newSVpv(buffer, 0);
            Safefree( buffer );
        }
    OUTPUT:
        RETVAL
t000755000764000764 011464171071 15657 5ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.0902-minify.t000444000764000764 203011464171071 17706 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t
use strict;
use warnings;
use IO::File;
use Test::More;
use JavaScript::Minifier::XS qw(minify);

###############################################################################
# figure out how many JS files we're going to run through for testing
my @files = <t/js/*.js>;
plan tests => scalar @files;

###############################################################################
# test each of the JS files in turn
foreach my $file (@files) {
    (my $min_file = $file) =~ s/\.js$/\.min/;
    my $str = slurp( $file );
    my $min = slurp( $min_file );
    my $res = minify( $str );
    is( $res, $min, $file );
}

###############################################################################
# HELPER METHOD: slurp in contents of file to scalar.
############################################################################### sub slurp { my $filename = shift; my $fin = IO::File->new( $filename, '<' ) || die "can't open '$filename'; $!"; my $str = join('', <$fin>); $fin->close(); chomp( $str ); return $str; } pod.t000444000764000764 20111464171071 16734 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/tuse Test::More; eval "use Test::Pod 1.00"; plan skip_all => "Test::Pod 1.00 required for testing POD" if $@; all_pod_files_ok(); 03-minifies-to-nothing.t000444000764000764 113511464171071 22310 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/tuse strict; use warnings; use Test::More tests => 3; use JavaScript::Minifier::XS qw(minify); my $results; # ############################################################################### # Minifying down to "nothing" shouldn't segfault. # # RT #36557 described this for CSS::Minifier::XS, but we exhibit the same bug # here too. $results = minify( "/* */" ); ok( !defined $results, "minified block comment to nothing" ); $results = minify( "// foo" ); ok( !defined $results, "minified line comment to nothing" ); $results = minify( q{} ); ok( !defined $results, "minified empty string to nothing" ); pod-coverage.t000444000764000764 23011464171071 20527 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/tuse Test::More; eval "use Test::Pod::Coverage 1.00"; plan skip_all => "Test::Pod::Coverage 1.00 required for testing POD" if $@; all_pod_coverage_ok(); 01-loads.t000444000764000764 13111464171071 17474 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/tuse strict; use Test::More tests=>1; BEGIN { use_ok( 'JavaScript::Minifier::XS' ); } 99-benchmark.t000444000764000764 371711464171071 20402 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/tuse strict; use warnings; use Test::More; use IO::File; use Benchmark qw(countit); use JavaScript::Minifier::XS; 
###############################################################################
# check if JavaScript::Minifier is available, so we can do comparison testing
eval { require JavaScript::Minifier };
if ($@) {
    plan skip_all => 'JavaScript::Minifier not available for benchmark comparison';
}
plan tests => 1;

###############################################################################
# get the list of JS files we're going to run through testing
my @files = <t/js/*.js>;

###############################################################################
# time test the PurePerl version against the XS version.
compare_benchmark: {
    my $count;
    my $time = 10;

    # build a longer JavaScript document to process; 64KBytes should be
    # suitable
    my $str = '';
    while (1) {
        foreach my $file (@files) {
            $str .= slurp( $file );
        }
        last if (length($str) > (64*1024));
    }

    # benchmark the original "pure perl" version
    $count = countit( $time, sub { JavaScript::Minifier::minify(input=>$str) } );
    my $rate_pp = ($count->iters() / $time) * length($str);

    # benchmark the "XS" version
    $count = countit( $time, sub { JavaScript::Minifier::XS::minify($str) } );
    my $rate_xs = ($count->iters() / $time) * length($str);

    ok( 1, "benchmarking" );
    diag( "" );
    diag( "Benchmark results:" );
    diag( "\tperl\t=> $rate_pp bytes/sec" );
    diag( "\txs\t=> $rate_xs bytes/sec" );
}

###############################################################################
# HELPER METHOD: slurp in contents of file to scalar.
############################################################################### sub slurp { my $filename = shift; my $fin = IO::File->new( $filename, '<' ) || die "can't open '$filename'; $!"; my $str = join('', <$fin>); $fin->close(); chomp( $str ); return $str; } js000755000764000764 011464171071 16273 5ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/tliterals-double-quotes.min000444000764000764 10711464171071 23520 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/jsvar double_quoted=" double quoted strings /* with block comments */ "; literals-single-quotes.min000444000764000764 10311464171071 23523 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/jsvar single_quoted=' single quoted strings // with line comments '; comments-before-a-regex.min000444000764000764 3311464171071 23504 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/jsvar foo=[/^'/,/^"/,/foo/]; literals-regexp.min000444000764000764 3611464171071 22203 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/jsvar regexes=/ regexes stay /; literals-double-quotes.js000444000764000764 17511464171071 23356 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/js/* quoted literals get preserved, in several forms */ var double_quoted=" double quoted strings /* with block comments */ "; trailing-whitespace.js000444000764000764 7111464171071 22667 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/jsvar trailing="trailing whitespace gets removed"; literals-regexp.js000444000764000764 12411464171071 22052 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/js/* quoted literals get preserved, in several forms */ var regexes=/ regexes stay /; postfix-sigil.js000444000764000764 15411464171071 21547 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/js/* whitespace before "postfix" sigils should get removed */ function foo( ) { alert("foo!") ; } comments.min000444000764000764 40111464171071 20735 
0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/js/* comments containing the word "copyright" are left in, though */ // including line comments, with mixed case cOpYrIgHt var foo=3;var bar=4;var replaced_with_ws=foo+ +bar;var also_replaced=foo- -bar;var removed_outright=foo+-bar;var also_removed=foo-+bar; division.min000444000764000764 1611464171071 20716 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/jsvar foo=10/2; simple.js000444000764000764 2711464171071 20216 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/js/* foo */ var x = 2; division.js000444000764000764 10511464171071 20566 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/js/* use of "/" for division should get compacted */ var foo = 10 / 2; prefix-sigil.js000444000764000764 14311464171071 21346 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/js/* whitespace after "prefix" sigils should get removed */ function foo( ){ alert("foo!"); } simple.min000444000764000764 1111464171071 20356 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/jsvar x=2; comments-before-a-regex.js000444000764000764 103411464171071 23377 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/js/* comments placed directly before a regex should be skipped, instead of being * used to determine whether the leading '/' of the regexp is actually for * division or not. * * when its not working correctly, the regexes are parsed as division and that * causes the quote matching to get bungled up. */ var foo = [ // trick the engine into thinking we end in an array[] /^'/, // this *should* be parsed as a comment, not a literal /^"/, // isn't this the line with the closing apostrophe in it? 
/foo/ ]; postfix-sigil.min000444000764000764 4011464171071 21670 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/jsfunction foo() {alert("foo!");} prefix-sigil.min000444000764000764 3711464171071 21477 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/jsfunction foo(){alert("foo!");} leading-whitespace.min000444000764000764 5711464171071 22634 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/jsvar leading="leading whitespace gets removed"; literals-single-quotes.js000444000764000764 17111464171071 23361 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/js/* quoted literals get preserved, in several forms */ var single_quoted=' single quoted strings // with line comments '; comments-ie-conditional.js000444000764000764 52311464171071 23467 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/js/* comments get removed */ /*@ except those that are "IE Conditional Compilation" comments @*/ /*@ we'll remove those that start with the flag but don't end with it. */ /* as well as those that end with it but didn't start with it @*/ // line comments also get removed //@ except those that are "IE Conditional Compilation" line comments division-of-array-subscripts.js000444000764000764 41211464171071 24504 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/js/* * Division of an array subscript should NOT be treated as opening a regexp, * but should be treated as division. 
*/ function foo() { var bar = someArray[2]/2; } function bar() { foo(); // this / is not a regexp close, its just part of a line comment } leading-whitespace.js000444000764000764 6711464171071 22466 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/js var leading="leading whitespace gets removed"; trailing-whitespace.min000444000764000764 6111464171071 23035 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/jsvar trailing="trailing whitespace gets removed"; comments-ie-conditional.min000444000764000764 21111464171071 23630 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/js/*@ except those that are "IE Conditional Compilation" comments @*/ //@ except those that are "IE Conditional Compilation" line comments comments.js000444000764000764 116511464171071 20616 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/js/* block comments get removed */ // as do line comments /* comments containing the word "copyright" are left in, though */ // including line comments, with mixed case cOpYrIgHt /* block comments placed inline get removed too. If they function as providing * whitespace between things that shouldn't be shoved together, though, they're * replaced with some whitespace. 
*/ var foo /* remove */ = /* me too */ 3; var bar = /* and me */ 4; var replaced_with_ws = foo + /* ws */ +bar; var also_replaced = foo - /* ws */ -bar; var removed_outright = foo + /* me gone */ -bar; var also_removed = foo - /* me gone */ +bar; division-of-array-subscripts.min000444000764000764 7711464171071 24642 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/t/jsfunction foo(){var bar=someArray[2]/2;} function bar(){foo();} lib000755000764000764 011464171071 16162 5ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09JavaScript000755000764000764 011464171071 20230 5ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/libMinifier000755000764000764 011464171071 21772 5ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/lib/JavaScriptXS.pm000444000764000764 652611464171071 23030 0ustar00grahamgraham000000000000JavaScript-Minifier-XS-0.09/lib/JavaScript/Minifier
package JavaScript::Minifier::XS;

use 5.8.8;
use strict;
use warnings;

require Exporter;
require DynaLoader;

our @ISA = qw(Exporter DynaLoader);
our @EXPORT_OK = qw(minify);

our $VERSION = '0.09';

bootstrap JavaScript::Minifier::XS $VERSION;

1;

=head1 NAME

JavaScript::Minifier::XS - XS based JavaScript minifier

=head1 SYNOPSIS

  use JavaScript::Minifier::XS qw(minify);
  $minified = minify($js);

=head1 DESCRIPTION

C<JavaScript::Minifier::XS> is a JavaScript "minifier"; it's designed to remove
unnecessary whitespace and comments from JavaScript files, while also B<not>
breaking the JavaScript.

C<JavaScript::Minifier::XS> is similar in function to C<JavaScript::Minifier>,
but is substantially faster as it's written in XS and not just pure Perl.

=head1 METHODS

=over

=item minify($js)

Minifies the given C<$js>, returning the minified JavaScript back to the
caller.

=back

=head1 HOW IT WORKS

C<JavaScript::Minifier::XS> minifies the JavaScript by removing unnecessary
whitespace from JavaScript documents.  Comments (both block and line) are also
removed, I<except> when (a) they contain the word "copyright" in them, or (b)
they're needed to implement "IE Conditional Compilation".
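
The comment rules above can be sketched in plain Perl. This is only an illustration, not the module's XS implementation: the helper name C<keeps_comment> is hypothetical, and the patterns are inferred from the F<t/js/comments.js> and F<t/js/comments-ie-conditional.js> fixtures shipped with this distribution.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of JavaScript::Minifier::XS): returns true
# when a comment token would be preserved under the rules described above.
sub keeps_comment {
    my ($comment) = @_;

    # (a) any comment mentioning "copyright", in any letter case
    return 1 if $comment =~ /copyright/i;

    # (b) IE conditional compilation: a block comment must both open with
    #     "/*@" and close with "@*/"; a line comment must open with "//@"
    return 1 if $comment =~ m{\A/\*\@.*\@\*/\z}s;
    return 1 if $comment =~ m{\A//\@};

    return 0;
}

# Sample comments modelled on the test fixtures.
for my $comment (
    '/* block comments get removed */',
    '// including line comments, with mixed case cOpYrIgHt',
    '/*@ kept: IE conditional compilation @*/',
    '/*@ removed: starts with the flag but does not end with it */',
    '//@ kept: IE conditional compilation line comment',
) {
    printf "%-6s %s\n", keeps_comment($comment) ? 'keep' : 'drop', $comment;
}
```

A block comment that carries only one of the two IE markers is treated as an ordinary comment in this sketch, which matches what the fixtures expect.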

Internally, the minification process is done by taking multiple passes through
the JavaScript document:

=head2 Pass 1: Tokenize

First, we go through and parse the JavaScript document into a series of tokens
internally.  The tokenizing process B<does not> check to make sure you've got
syntactically valid JavaScript; it just breaks up the text into a stream of
tokens suitable for processing by the subsequent stages.

=head2 Pass 2: Collapse

We then march through the token list and collapse certain tokens down to their
smallest possible representation.  I<If> they're still included in the final
results, we only want to include them at their shortest.

=over

=item Whitespace

Runs of multiple whitespace characters are reduced down to a single whitespace
character.  If the whitespace contains any "end of line" (EOL) characters, then
the end result is the I<first> EOL character encountered.  Otherwise, the
result is the first whitespace character in the run.

=back

=head2 Pass 3: Pruning

We then go back through the token list and prune unnecessary tokens.

=over

=item Whitespace

Wherever possible, whitespace is removed: before and after comment blocks, and
before and after various symbols/sigils.

=item Comments

Comments that are either (a) IE conditional compilation comments, or that (b)
contain the word "copyright" in them are preserved.  B<All> other comments
(line and block) are removed.

=item Everything else

We keep everything else: identifiers, quoted literal strings, symbols/sigils,
etc.

=back

=head2 Pass 4: Re-assembly

Lastly, we go back through the token list and re-assemble it all back into a
single JavaScript string, which is then returned back to the caller.

=head1 AUTHOR

Graham TerMarsch (cpan@howlingfrog.com)

=head1 REPORTING BUGS

Please report bugs via RT, the CPAN request tracker, and be sure to include
the JavaScript that you're having trouble minifying.

=head1 COPYRIGHT

Copyright (C) 2007-2008, Graham TerMarsch.  All Rights Reserved.

This is free software; you can redistribute it and/or modify it under the same
license as Perl itself.

=head1 SEE ALSO

C<JavaScript::Minifier>.

=cut