NAME
    HTML::GenToc - Generate a Table of Contents for HTML documents.

VERSION
    version 3.20

SYNOPSIS
        use HTML::GenToc;

        # create a new object
        my $toc = new HTML::GenToc();

        my $toc = new HTML::GenToc(title=>"Table of Contents",
                                   toc_entry=>{
                                       H1=>1,
                                       H2=>2
                                   },
                                   toc_end=>{
                                       H1=>'/H1',
                                       H2=>'/H2'
                                   }
        );

        # generate a ToC from a file
        $toc->generate_toc(input=>$html_file,
                           footer=>$footer_file,
                           header=>$header_file
        );

DESCRIPTION
    HTML::GenToc generates anchors and a table of contents for HTML documents. Depending on the arguments, it will insert the information it generates, or output to a string, a separate file or STDOUT.

    While it defaults to taking H1 and H2 elements as the significant elements to put into the table of contents, any tag can be defined as a significant element. Also, it doesn't matter if the input HTML code is complete, pure HTML; one can input pseudo-html or page-fragments, which makes it suitable for using on templates and HTML meta-languages such as WML.

    Also included in the distribution is hypertoc, a script which uses the module so that one can process files on the command-line in a user-friendly manner.

DETAILS
    The ToC generated is a multi-level list containing links to the significant elements. HTML::GenToc inserts the links into the ToC to significant elements at a level specified by the user.

    Example:

    If H1s are specified as level 1, then they appear in the first level list of the ToC. If H2s are specified as level 2, then they appear in a second level list in the ToC.

    Information on the significant elements and what level they should occur are passed in to the methods used by this object, or one can use the defaults.

    There are two phases to the ToC generation.
    The first phase is to put suitable anchors into the HTML documents, and the second phase is to generate the ToC from HTML documents which have anchors in them for the ToC to link to.

    For more information on controlling the contents of the created ToC, see "Formatting the ToC".

    HTML::GenToc also supports the ability to incorporate the ToC into the HTML document itself via the inline option. See "Inlining the ToC" for more information.

    In order for HTML::GenToc to support linking to significant elements, HTML::GenToc inserts anchors into the significant elements. One can use HTML::GenToc as a filter, outputting the result to another file, or one can overwrite the original file, with the original backed up with a suffix (default: "org") appended to the filename. One can also output the result to a string.

METHODS
    Default arguments can be set when the object is created, and overridden by setting arguments when the generate_toc method is called. Arguments are given as a hash of arguments.

  Method -- new
        $toc = new HTML::GenToc();

        $toc = new HTML::GenToc(toc_entry=>\%my_toc_entry,
            toc_end=>\%my_toc_end,
            bak=>'bak',
            ...
            );

    Creates a new HTML::GenToc object.

    These arguments will be used as defaults in invocations of other methods.

    See generate_toc for possible arguments.

  generate_toc
        $toc->generate_toc(outfile=>"index2.html");

        my $result_str = $toc->generate_toc(to_string=>1);

    Generates a table of contents for the significant elements in the HTML documents, optionally generating anchors for them first.

  Options
    bak
            bak => *string*

        If the input file/files is/are being overwritten (overwrite is on), copy the original file to "*filename*.*string*". If the value is empty, no backup file will be created. (default:org)

    debug
            debug => 1

        Enable verbose debugging output. Used for debugging this module; in other words, don't bother. (default:off)

    entrysep
            entrysep => *string*

        Separator string for non-<li> item entries (default: ", ")

    filenames
            filenames => \@filenames

        The filenames to use when creating table-of-contents links. This overrides the filenames given in the input option, and is expected to have exactly the same number of elements. This can also be used when passing in string-content to the input option, to give a (fake) filename to use for the links relating to that content.

    footer
            footer => *file_or_string*

        Either the filename of the file containing footer text for ToC; or a string containing the footer text.

    header
            header => *file_or_string*

        Either the filename of the file containing header text for ToC; or a string containing the header text.

    ignore_only_one
            ignore_only_one => 1

        If there would be only one item in the ToC, don't make a ToC.

    ignore_sole_first
            ignore_sole_first => 1

        If the first item in the ToC is of the highest level, AND it is the only one of that level, ignore it. This is useful in web-pages where there is only one H1 header but one doesn't know beforehand whether there will be only one.

    inline
            inline => 1

        Put ToC in document at a given point. See "Inlining the ToC" for more information.

    input
            input => \@filenames

            input => $content

        This is expected to be either a reference to an array of filenames, or a string containing content to process.

        The three main uses would be:

        (a) you have more than one file to process, so pass in multiple filenames

        (b) you have one file to process, so pass in its filename as the only array item

        (c) you have HTML content to process, so pass in just the content as a string

        (default:undefined)

    notoc_match
            notoc_match => *string*

        If there are certain individual tags you don't wish to include in the table of contents, even though they match the "significant elements", then if this pattern matches contents inside the tag (not the body), then that tag will not be included, either in generating anchors or in generating the ToC. (default: "class="notoc"")

    ol
            ol => 1

        Use an ordered list for level 1 ToC entries.
    ol_num_levels
            ol_num_levels => 2

        The number of levels deep the OL listing will go if ol is true. If set to zero, will use an ordered list for all levels. (default:1)

    overwrite
            overwrite => 1

        Overwrite the input file with the output. (default:off)

    outfile
            outfile => *file*

        File to write the output to. This is where the modified HTML output goes to. Note that it doesn't make sense to use this option if you are processing more than one file. If you give '-' as the filename, then output will go to STDOUT. (default: STDOUT)

    quiet
            quiet => 1

        Suppress informative messages. (default: off)

    textonly
            textonly => 1

        Use only text content in significant elements.

    title
            title => *string*

        Title for ToC page (if not using header or inline or toc_only) (default: "Table of Contents")

    toc_after
            toc_after => \%toc_after_data

            %toc_after_data = { *tag1* => *suffix1*,
                *tag2* => *suffix2*
                };

            toc_after => { H2=>'' }

        For defining layout of significant elements in the ToC.

        This expects a reference to a hash of tag=>suffix pairs.

        The *tag* is the HTML tag which marks the start of the element. The *suffix* is what is required to be appended to the Table of Contents entry generated for that tag.

        (default: undefined)

    toc_before
            toc_before => \%toc_before_data

            %toc_before_data = { *tag1* => *prefix1*,
                *tag2* => *prefix2*
                };

            toc_before=>{ H2=>'' }

        For defining the layout of significant elements in the ToC. The *tag* is the HTML tag which marks the start of the element. The *prefix* is what is required to be prepended to the Table of Contents entry generated for that tag.

        (default: undefined)

    toc_end
            toc_end => \%toc_end_data

            %toc_end_data = { *tag1* => *endtag1*,
                *tag2* => *endtag2*
                };

            toc_end => { H1 => '/H1', H2 => '/H2' }

        For defining significant elements. The *tag* is the HTML tag which marks the start of the element. The *endtag* is the HTML tag which marks the end of the element.
        When matching in the input file, case is ignored (but make sure that all your *tag* options referring to the same tag are exactly the same!).

    toc_entry
            toc_entry => \%toc_entry_data

            %toc_entry_data = { *tag1* => *level1*,
                *tag2* => *level2*
                };

            toc_entry => { H1 => 1, H2 => 2 }

        For defining significant elements. The *tag* is the HTML tag which marks the start of the element. The *level* is what level the tag is considered to be. The value of *level* must be numeric, and non-zero. If the value is negative, consecutive entries represented by the significant element will be separated by the value set by the entrysep option.

    toclabel
            toclabel => *string*

        HTML text that labels the ToC. Always used. (default: "<h1>Table of Contents</h1>")

    toc_tag
            toc_tag => *string*

        If a ToC is to be included inline, this is the pattern which is used to match the tag where the ToC should be put. This can be a start-tag, an end-tag or a comment, but the < should be left out; that is, if you want the ToC to be placed after the BODY tag, then give "BODY". If you want a special comment tag to mark where the ToC should go, then include the comment marks, for example: "!--toc--" (default:BODY)

    toc_tag_replace
            toc_tag_replace => 1

        In conjunction with toc_tag, this is a flag to say whether the given tag should be replaced, or if the ToC should be put after the tag. This can be useful if your toc_tag is a comment and you don't need it after you have the ToC in place. (default:false)

    toc_only
            toc_only => 1

        Output only the Table of Contents, that is, the Table of Contents plus the toclabel. If there is a header or a footer, these will also be output.

        If toc_only is false, then a suitable HTML page header will be output if there is no header and inline is not true, and an HTML page footer will be output if there is no footer and inline is not true.

        (default:false)

    to_string
            to_string => 1

        Return the modified HTML output as a string. This *does* override other methods of output (unlike version 3.00). If *to_string* is false, the method will return 1 rather than a string.

    use_id
            use_id => 1

        Use id="*name*" for anchors rather than <a name="*name*"> anchors. However, if an anchor already exists for a Significant Element, this won't make an id for that particular element.

    useorg
            useorg => 1

        Use pre-existing backup files as the input source; that is, files of the form *infile*.*bak* (see input and bak).

INTERNAL METHODS
    These methods are documented for developer purposes and aren't intended to be used externally.

  make_anchor_name
        $toc->make_anchor_name(content=>$content,
            anchors=>\%anchors);

    Makes the anchor-name for one anchor. Bases the anchor on the content of the significant element. Ensures that anchors are unique.
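
    The idea of a content-based, unique anchor name can be illustrated with a short core-Perl sketch. This is not HTML::GenToc's own code: the sub name sketch_anchor_name, the underscore-separated naming scheme, the "toc" fallback and the numeric suffix are all assumptions made for illustration. Only the argument names (content, anchors) follow the call shown above.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the module's make_anchor_name: derive a name
# from the element's content and make it unique within %$anchors.
sub sketch_anchor_name {
    my (%args) = @_;
    my $content = $args{content};
    my $anchors = $args{anchors};    # hash ref of names already in use

    # Drop any markup, then reduce the remaining text to a simple slug.
    (my $name = $content) =~ s/<[^>]*>//g;
    $name =~ s/[^A-Za-z0-9]+/_/g;
    $name =~ s/^_+|_+$//g;
    $name = 'toc' if $name eq '';    # assumed fallback for empty content

    # Append a counter until the name has not been seen before.
    my $unique = $name;
    my $n      = 1;
    $unique = $name . '_' . $n++ while exists $anchors->{$unique};
    $anchors->{$unique} = 1;
    return $unique;
}

my %anchors;
print sketch_anchor_name(content => 'The <em>FOO</em> command',
                         anchors => \%anchors), "\n";   # The_FOO_command
print sketch_anchor_name(content => 'The FOO command',
                         anchors => \%anchors), "\n";   # The_FOO_command_1
```

    Passing the same hash reference on every call is what keeps the names unique across a whole document in this sketch.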
  make_anchors
        my $new_html = $toc->make_anchors(input=>$html,
            notoc_match=>$notoc_match,
            use_id=>$use_id,
            toc_entry=>\%toc_entries,
            toc_end=>\%toc_ends,
            );

    Makes the anchors for the given input string. Returns a string.

  make_toc_list
        my @toc_list = $toc->make_toc_list(input=>$html,
            labels=>\%labels,
            notoc_match=>$notoc_match,
            toc_entry=>\%toc_entry,
            toc_end=>\%toc_end,
            filename=>$filename);

    Makes a list of lists which represents the structure and content of (a portion of) the ToC from one file. Also updates a list of labels for the ToC entries.

  build_lol
    Build a list of lists of paths, given a list of hashes with info about paths.

  output_toc
        $self->output_toc(toc=>$toc_str,
            input=>\@input,
            filenames=>\@filenames);

    Put the output (whether to file, STDOUT or string). The "output" in this case could be the ToC, the modified (anchors added) HTML, or both.

  put_toc_inline
        my $newhtml = $toc->put_toc_inline(toc_str=>$toc_str,
            filename=>$filename, in_string=>$in_string);

    Puts the given toc_str into the given input string; returns a string.

  cp
        cp($src, $dst);

    Copies file $src to $dst. Used for making backups of files.

FILE FORMATS
  Formatting the ToC
    The toc_entry and other related options give you control over how the ToC entries may look, but there are other options to affect the final appearance of the ToC file created.

    With the header option, the contents of the given file (or string) will be prepended before the generated ToC. This allows you to have introductory text, or any other text, before the ToC.

    Note:
        If you use the header option, make sure the file specified contains the opening HTML tag, the HEAD element (containing the TITLE element), and the opening BODY tag. However, these tags/elements should not be in the header file if the inline option is used. See "Inlining the ToC" for information on what the header file should contain for inlining the ToC.
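
    As a rough illustration of the note above, the following core-Perl sketch builds a header string and a matching footer string for a standalone (non-inline) ToC page. The wrap_toc sub is a hypothetical stand-in for the prepend/append step and is not part of HTML::GenToc; the page title and the sample list markup are likewise made up.

```perl
use strict;
use warnings;

# Header text for a standalone ToC page: opening HTML tag, HEAD element
# (with TITLE) and opening BODY tag, as the note requires.
my $header = <<'END_HEADER';
<html>
<head><title>Contents</title></head>
<body>
END_HEADER

# Footer text that closes what the header opened.
my $footer = <<'END_FOOTER';
</body>
</html>
END_FOOTER

# Hypothetical stand-in for the module's internal prepend/append step.
sub wrap_toc {
    my ($head, $toc_html, $foot) = @_;
    return $head . $toc_html . $foot;
}

# Made-up ToC markup, only to show where the ToC lands in the page.
my $page = wrap_toc($header,
    qq{<ul><li><a href="#intro">Intro</a></li></ul>\n}, $footer);
print $page;
```

    With the module itself, strings like these would be given as the header and footer options, which accept either a filename or the text itself.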
    With the toclabel option, the contents of the given string will be prepended before the generated ToC (but after any text taken from a header file).

    With the footer option, the contents of the file will be appended after the generated ToC.

    Note:
        If you use the footer, make sure it includes the closing BODY and HTML tags (unless, of course, you are using the inline option).

    If the header option is not specified, the appropriate starting HTML markup will be added, unless the toc_only option is specified. If the footer option is not specified, the appropriate closing HTML markup will be added, unless the toc_only option is specified.

    If you do not want or need to deal with header and footer files, you can specify the title of the ToC file with the title option, and a heading, or label, to put before the list of ToC entries with the toclabel option. Both options have default values.

    If you do not want HTML page tags to be supplied, and just want the ToC itself, then specify the toc_only option. If there are no header or footer files, then this will simply output the contents of toclabel and the ToC itself.

  Inlining the ToC
    The ability to incorporate the ToC directly into an HTML document is supported via the inline option.

    Inlining will be done on the first file in the list of files processed, and will only be done if that file contains an opening tag matching the toc_tag value.

    If overwrite is true, then the first file in the list will be overwritten, with the generated ToC inserted at the appropriate spot. Otherwise a modified version of the first file is output to either STDOUT or to the output file defined by the outfile option.

    The options toc_tag and toc_tag_replace are used to determine where and how the ToC is inserted into the output.

    Example 1

        $toc->generate_toc(inline=>1,
                           toc_tag => 'BODY',
                           toc_tag_replace => 0,
                           ...
                           );

    This will put the generated ToC after the BODY tag of the first file.
    If the header option is specified, then the contents of the specified file are inserted after the BODY tag. If the toclabel option is not empty, then the text specified by the toclabel option is inserted. Then the ToC is inserted, and finally, if the footer option is specified, it inserts the footer. Then the rest of the input file follows as it was before.

    Example 2

        $toc->generate_toc(inline=>1,
                           toc_tag => '!--toc--',
                           toc_tag_replace => 1,
                           ...
                           );

    This will put the generated ToC after the first comment of the form <!--toc-->, and that comment will be replaced by the ToC (in the order header, toclabel, ToC, footer) followed by the rest of the input file.

    Note:
        The header file should not contain the beginning HTML tag and HEAD element since the HTML file being processed should already contain these tags/elements.

NOTES
    *   HTML::GenToc is smart enough to detect anchors inside significant elements. If the anchor defines the NAME attribute, HTML::GenToc uses the value. Else, it adds its own NAME attribute to the anchor. If use_id is true, then it likewise checks for and uses IDs.

    *   The TITLE element is treated specially if specified in the toc_entry option. It is illegal to insert anchors (A) into TITLE elements. Therefore, HTML::GenToc will actually link to the filename itself instead of the TITLE element of the document.

    *   HTML::GenToc will ignore a significant element if it does not contain any non-whitespace characters. A warning message is generated if such a condition exists.

    *   If you have a sequence of significant elements that change in a slightly disordered fashion, such as H1 -> H3 -> H2 or even H2 -> H1, HTML::GenToc deals with this to create a list which is still good HTML. However, if you are using an ordered list to that depth, then you will get strange numbering, as an extra list element will have been inserted to nest the elements at the correct level.

        For example (H2 -> H1 with ol_num_levels=1):

            1.
                * My H2 Header
            2. My H1 Header

        For example (H1 -> H3 -> H2 with ol_num_levels=0 and H3 also being significant):

            1. My H1 Header
                1.
                    1. My H3 Header
                2. My H2 Header
            2. My Second H1 Header

        In cases such as this it may be better not to use the ol option.

CAVEATS
    *   Version 3.10 (and above) generates more verbose (SEO-friendly) anchors than prior versions. Thus anchors generated with earlier versions will not match version 3.10 anchors.

    *   Version 3.00 (and above) of HTML::GenToc is not compatible with Version 2.x of HTML::GenToc. It is now designed to do everything in one pass, and has dropped certain options: the infile option is no longer used (it has been replaced with the input option); the toc_file option no longer exists; use the outfile option instead; the tocmap option is no longer supported. Also the old array-parsing of arguments is no longer supported. There is no longer a generate_anchors method; everything is done with generate_toc.

        It now generates lower-case tags rather than upper-case ones.

    *   HTML::GenToc is not very efficient (memory and speed), and can be slow for large documents.

    *   Now that generation of anchors and of the ToC are done in one pass, even more memory is used than was the case before. This is more notable when processing multiple files, since all files are read into memory before processing them.

    *   Invalid markup will be generated if a significant element is contained inside of an anchor. For example:

          <a name="foo"><h1>The FOO command</h1></a>

        will be converted to (if H1 is a significant element),

          <a name="foo"><h1><a name="...">The FOO command</a></h1></a>

        which is illegal since anchors cannot be nested.

        It is better style to put anchor statements within the element to be anchored. For example, the following is preferred:

          <h1><a name="foo">The FOO command</a></h1>

        HTML::GenToc will detect the "foo" name and use it.

    *   name attributes without quotes are not recognized.

BUGS
    Tell me about them.

REQUIRES
    The installation of this module requires "Module::Build". The module depends on "HTML::SimpleParse", "HTML::Entities" and "HTML::LinkList" and uses "Data::Dumper" for debugging purposes. The hypertoc script depends on "Getopt::Long", "Getopt::ArgvFile" and "Pod::Usage". Testing of this distribution depends on "Test::More".

INSTALLATION
    To install this module, run the following commands:

        perl Build.PL
        ./Build
        ./Build test
        ./Build install

    Or, if you're on a platform (like DOS or Windows) that doesn't like the "./" notation, you can do this:

        perl Build.PL
        perl Build
        perl Build test
        perl Build install

    In order to install somewhere other than the default, such as in a directory under your home directory, like "/home/fred/perl" go

        perl Build.PL --install_base /home/fred/perl

    as the first step instead.

    This will install the files underneath /home/fred/perl.

    You will then need to make sure that you alter the PERL5LIB variable to find the modules, and the PATH variable to find the script.

    Therefore you will need to change: your path, to include /home/fred/perl/script (where the script will be)

        PATH=/home/fred/perl/script:${PATH}

    the PERL5LIB variable to add /home/fred/perl/lib

        PERL5LIB=/home/fred/perl/lib:${PERL5LIB}

SEE ALSO
    perl(1) htmltoc(1) hypertoc(1)

AUTHOR
    Kathryn Andersen (RUBYKAT) http://www.katspace.org/tools/hypertoc/

    Based on htmltoc by Earl Hood ehood AT medusa.acs.uci.edu

    Contributions by Dan Dascalescu

COPYRIGHT
    Copyright (C) 1994-1997 Earl Hood, ehood AT medusa.acs.uci.edu
    Copyright (C) 2002-2008 Kathryn Andersen

    This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
    This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

    You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

This software is Copyright (c) 2011 by Kathryn Andersen.

This is free software, licensed under:

  The GNU General Public License, Version 2, June 1991

The General Public License (GPL)
Version 2, June 1991

Copyright (C) 1989, 1991 Free Software Foundation, Inc. 675 Mass Ave, Cambridge, MA 02139, USA. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

Preamble

The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too.

When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. 
The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you". Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. 
b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. 
You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. 
If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. 
If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 9. 
The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. 10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 12. 
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS Changes000644001750001750 213611545551123 13457 0ustar00katkat000000000000HTML-GenToc-3.20Revision History for HTML-GenToc ================================ v3.20 2011-04-02 ---------------- * 2011-04-02 18:36:31 +1100 tweak * 2011-03-29 22:02:49 +1100 Added "ignore_only_one" option. * 2011-03-29 17:27:49 +1100 Added "ignore_sole_first" option. * 2011-03-29 16:26:45 +1100 Need git to ignore more files. * 2011-03-29 16:22:47 +1100 Changed SEO anchors. Tweaks to make tests work. * 2011-03-29 15:38:27 +1100 changed over to use Dist::Zilla * 2009-11-27 05:14:22 +0000 tidy up a few things v3.10 2009-11-27 ---------------- * 2008-11-27 07:36:18 +0000 generate README file * 2008-11-27 07:36:13 +0000 bump version to 3.10 * 2008-11-27 07:36:08 +0000 update release notes * 2008-11-27 07:30:19 +0000 Updated documentation. * 2008-11-26 09:26:51 +0000 1. fixed bug with outputting to string 2. improved anchor making (thanks to Dan Dascalescu) * 2007-12-16 09:05:15 +0000 revamped depot v3.00 2009-11-27 ---------------- * 2007-12-16 09:05:15 +0000 revamped depot ==================================== End of changes in the last 1000 days ==================================== META.yml000644001750001750 133611545551123 13436 0ustar00katkat000000000000HTML-GenToc-3.20--- abstract: 'Generate a Table of Contents for HTML documents.' 
author: - 'Kathryn Andersen' build_requires: File::Find: 0 File::Temp: 0 Module::Build: 0.3601 Test::More: 0 configure_requires: Module::Build: 0.3601 dynamic_config: 0 generated_by: 'Dist::Zilla version 4.200004, CPAN::Meta::Converter version 2.101670' license: gpl meta-spec: url: http://module-build.sourceforge.net/META-spec-v1.4.html version: 1.4 name: HTML-GenToc requires: Data::Dumper: 0 File::Basename: 0 Getopt::Long: 2.34 HTML::Entities: 0 HTML::LinkList: 0 HTML::SimpleParse: 0 Pod::Usage: 0 resources: homepage: http://github.com/rubykat/HTML-GenToc/tree repository: git://github.com/rubykat/HTML-GenToc.git version: 3.20 MANIFEST000644001750001750 176611545551123 13325 0ustar00katkat000000000000HTML-GenToc-3.20Build.PL Changes LICENSE MANIFEST MANIFEST.SKIP META.yml OldChanges README README.mkdn lib/HTML/GenToc.pm scripts/hypertoc t/00-compile.t t/010_files.t t/020_strings.t t/030_anchors.t t/070_script.t t/compare.pl t/release-distmeta.t t/release-has-version.t t/release-kwalitee.t t/release-pod-coverage.t t/release-pod-syntax.t t/release-portability.t tfiles/good_test1_anch.wml tfiles/good_test1_toc.html tfiles/good_test1a_toc.html tfiles/good_test2_anch.html tfiles/good_test2_toc.html tfiles/good_test2a_anch.html tfiles/good_test2a_toc.html tfiles/good_test3_anch.wml tfiles/good_test3_toc.html tfiles/good_test4_anch.html tfiles/good_test4_toc.html tfiles/good_test4a_anch.html tfiles/good_test4a_toc.html tfiles/good_test4b_toc.html tfiles/good_test5_toc.html tfiles/good_test5b_toc.html tfiles/good_test6_toc.html tfiles/good_test6a_toc.html tfiles/good_test7a.html tfiles/test1.wml tfiles/test1b.args tfiles/test2.html tfiles/test3.wml tfiles/test4.html tfiles/test5.php tfiles/test6.html tfiles/test7.html Build.PL000644001750001750 164611545551123 13465 0ustar00katkat000000000000HTML-GenToc-3.20 use strict; use warnings; use Module::Build 0.3601; my %module_build_args = ( 'build_requires' => { 'File::Find' => '0', 'File::Temp' => '0', 'Module::Build' => 
'0.3601', 'Test::More' => '0' }, 'configure_requires' => { 'Module::Build' => '0.3601' }, 'dist_abstract' => 'Generate a Table of Contents for HTML documents.', 'dist_author' => [ 'Kathryn Andersen' ], 'dist_name' => 'HTML-GenToc', 'dist_version' => '3.20', 'license' => 'gpl', 'module_name' => 'HTML::GenToc', 'recommends' => {}, 'recursive_test_files' => 1, 'requires' => { 'Data::Dumper' => '0', 'File::Basename' => '0', 'Getopt::Long' => '2.34', 'HTML::Entities' => '0', 'HTML::LinkList' => '0', 'HTML::SimpleParse' => '0', 'Pod::Usage' => '0' }, 'script_files' => [ 'scripts/hypertoc' ] ); my $build = Module::Build->new(%module_build_args); $build->create_build_script; OldChanges000644001750001750 2047511545551123 14144 0ustar00katkat000000000000HTML-GenToc-3.20Revision history for HTML::GenToc ================================= 3.10 Thu 27 November 2008 ------------------------- * (2008-11-27) Updated documentation. * (2008-11-26) Makefile.PL is auto-generated, so should not be under revision-control. * (2008-11-26) 1. fixed bug with outputting to string 2. improved anchor making (thanks to Dan Dascalescu) * (2007-12-17) Make svk ignore generated files. * (2007-12-16) revamped depot 3.00 Sun 27 May 2007 -------------------- * (27 May 2007) refactor Massive rewrite; now everything is done in one pass with one generate_toc method, and it uses HTML::LinkList to generate the actual Table-of-Contents list. 2.31 Wed 06 September 2006 -------------------------- * (22 Apr 2006) tweak docs Removed duplicate header, is all. * (25 Oct 2004) documentation tweak * (24 Oct 2004) argfile option Using Getopt::ArgvFile 1.09, now use --argfile as an option to get Options Files, instead of having to use the @ prefix. 2.30 Fri 22 October 2004 ------------------------ * (22 Oct 2004) documentation and README Now auto-generate the README from the module PoD; which entailed rewriting and improving it. 
* (22 Oct 2004) updated TODO * (22 Oct 2004) optional tests Added optional tests using Test::Distribution, Test::Pod and Test::Pod::Coverage which only run if you have those modules installed. * (22 Oct 2004) improving Pod Some things which additional optional tests complained about, such as Pod about every function, and the usage of =back, fixed. (Needed to commit this before adding the actual tests which tested this, because otherwise the tests failed) * (22 Oct 2004) use_id option Add option to use IDs instead of anchors in generate_anchors and also recognise IDs in generate_toc. * (10 Oct 2004) change auto-build stuff to Module::DevAid Now that I've written a proper module for it, use it. 2.22 Wed 06 October 2004 ------------------------ Wed Oct 6 07:45:31 EST 2004 perlkat AT katspace dot com * arrgh! more overlooks (blush) I forgot to change the README file! 2.21 Wed 06 October 2004 ------------------------ Wed Oct 6 07:38:31 EST 2004 perlkat AT katspace dot com * correcting documentation Just a few things I overlooked earlier. 2.20 Wed 06 October 2004 ------------------------ Sat Oct 2 20:53:31 EST 2004 perlkat AT katspace dot com * enable OL on all levels Added --ol_num_levels option (and tests for same) and improved documentation. Thu Sep 30 21:07:56 EST 2004 perlkat AT katspace dot com * improved behaviour Fixed problems with doing both --gen_anchors and --gen_toc in one pass; it now no longer stomps on the backup file, as it passes the data from the gen_anchors pass to the gen_toc pass. Fixed odd behaviour with STDOUT always being sent to even if a file of '' was given (now '' works to disable output to STDOUT). Also changed the 'option' method in HTML::GenToc to 'setting' instead. And improved documentation. Wed Sep 29 09:13:59 EST 2004 perlkat AT katspace dot com * added 'option' method to HTML::GenToc Now options are isolated a bit more, and can be queried with the 'option' method; set them with ->args, get them with ->option. 
Wed Sep 29 08:40:21 EST 2004 perlkat AT katspace dot com * documentation and deprecation Improved documentation, including more examples. Also went through and added notes and warnings about deprecation of --tocmap option and the old way of calling the HTML::GenToc methods. Also removed the HISTORY section of hypertoc, because it's better for all changes to be documented in one spot, namely, here. Mon Sep 27 07:15:47 EST 2004 perlkat AT katspace dot com * enable test of script Mon Sep 27 07:06:02 EST 2004 perlkat AT katspace dot com * clearing out remnant of configPL Sat Sep 25 21:38:32 EST 2004 perlkat AT katspace dot com * fix OL oddness and invisible list items in TOC Kevin Brannen pointed out some oddness with ToC which had OL lists instead of UL lists, and didn't like the "invisible" list items in TOC lists; and the two problems turned out to be related. Rewrote the TOC stuff to keep more information, nest list items better, and only use "invisible" list items when absolutely necessary. Sat Sep 25 14:09:53 EST 2004 perlkat AT katspace dot com * testing and fixing tocmap 2.16 Fri 24 September 2004 -------------------------- Fri Sep 24 15:43:33 EST 2004 perlkat AT katspace dot com * oops changes fix Fri Sep 24 15:42:16 EST 2004 perlkat AT katspace dot com * tweaking auto-release stuff I forgot that the TODO file isn't under revision control (just the .todo file...) 2.15 Fri 24 September 2004 -------------------------- Fri Sep 24 15:39:07 EST 2004 perlkat AT katspace dot com * update release notes Fri Sep 24 15:36:57 EST 2004 perlkat AT katspace dot com * things for the automated release process Fri Sep 24 15:03:45 EST 2004 perlkat AT katspace dot com * added .todo file (the devtodo program) With intent to automatically generate a TODO file from it. 
Fri Sep 24 15:00:39 EST 2004 perlkat AT katspace dot com * correcting error in darcs test stuff Fri Sep 24 14:58:55 EST 2004 perlkat AT katspace dot com * change over to Module::Build 2.10 Tue 12th August 2003 ------------------------- - added --to_string and --in_string options to HTML::GenToc generate_anchors and generate_toc to enable using strings rather than files, so that one can use the module in perl scripts which are doing additional processing. 2.02 Sat 15th February 2003 --------------------------- - removed heavily spammed email address from documentation. 2.01 Sun 8th December 2002 -------------------------- - Bug fix in hypertoc, to fix the way Getopt::ArgvFile is called. 2.00 Sun 8th December 2002 -------------------------- - no longer using the AppConfig module, but the old style of calling the methods should still work. Some of the options which were synonyms have been removed. - the hypertoc script is now part of this distribution. It now uses Getopt::Long and Getopt::ArgvFile instead of AppConfig. This gives it the full power of Getopt::Long, while config files are taken care of by Getopt::ArgvFile. This means a slightly different format for config files. 1.4 Wed 20th November 2002 --------------------------- - CPAN testers complained about a lack of explicitly stating all the dependencies of AppConfig, which either means that AppConfig has changed desperately, or their testing methods have changed, since I didn't think it was possible to get the AppConfig module without getting all its dependent modules, but, oh well. 1.3 Sun 17th November 2002 --------------------------- - fixed minor bug where the filename was always included in the table of contents even when it was an inline TOC and the filename in question was the containing file. (Only a minor bug because the link still worked, but it messed up things when the file in question was a .shtml file which had query arguments to it; presumably would mess up things like .php files as well.) 
1.2 Sat 26th October 2002 -------------------------- - fixed bug which would produce rubbish in the TOC if there happened to be an element which had an *attribute* which had content which matched a TOC entry; this would make it start collecting content for that, and never find an end-tag for it. 1.1 Wed 28th August 2002 ------------------------- - fixed bug with requirements which prevented working with perl 5.5 1.0 Fri 24th May 2002 ---------------------- - cleaned up the tests (now uses Test::Simple and compares test files nicely) - rearranged the documentation - added --help and --manpage options 0.3 Fri 1st Mar 2002 --------------------- - added --notoc_match option to suppress ToC for individual tags 0.2 Sat 23rd Feb 2002 ---------------------- - added README file - updated documentation - made the generated ToC more XHTML compliant - changed tests slightly 0.1 Mon 28th Jan 2002 ---------------------- - conversion of htmltoc to a module - use HTML::SimpleParse to parse the HTML - split the ToC generation into two phases; generate_anchors and generate_toc - expanded the --inline option to place the ToC after the first instance of any tag, or to replace a given tag - no longer use prefix + $$ to make anchor names unique; instead derive them from the content of the significant element. - various other slight improvements README.mkdn000644001750001750 5155711545551123 14027 0ustar00katkat000000000000HTML-GenToc-3.20# NAME HTML::GenToc - Generate a Table of Contents for HTML documents. # VERSION version 3.20 # SYNOPSIS use HTML::GenToc; # create a new object my $toc = new HTML::GenToc(); my $toc = new HTML::GenToc(title=>"Table of Contents", toc_entry=>{ H1=>1, H2=>2 }, toc_end=>{ H1=>'/H1', H2=>'/H2' } ); # generate a ToC from a file $toc->generate_toc(input=>$html_file, footer=>$footer_file, header=>$header_file ); # DESCRIPTION HTML::GenToc generates anchors and a table of contents for HTML documents. 
Depending on the arguments, it will insert the information it generates, or output to a string, a separate file or STDOUT.

While it defaults to taking H1 and H2 elements as the significant elements to put into the table of contents, any tag can be defined as a significant element. Also, it doesn't matter if the input HTML code is complete, pure HTML; one can input pseudo-html or page-fragments, which makes it suitable for using on templates and HTML meta-languages such as WML.

Also included in the distribution is hypertoc, a script which uses the module so that one can process files on the command-line in a user-friendly manner.

# DETAILS

The ToC generated is a multi-level list containing links to the significant elements. HTML::GenToc inserts the links into the ToC to significant elements at a level specified by the user.

__Example:__

If H1s are specified as level 1, then they appear in the first level list of the ToC. If H2s are specified as level 2, then they appear in a second level list in the ToC.

Information on the significant elements and at what level they should occur is passed in to the methods used by this object, or one can use the defaults.

There are two phases to the ToC generation. The first phase is to put suitable anchors into the HTML documents, and the second phase is to generate the ToC from HTML documents which have anchors in them for the ToC to link to.

For more information on controlling the contents of the created ToC, see "Formatting the ToC".

HTML::GenToc also supports the ability to incorporate the ToC into the HTML document itself via the __inline__ option. See "Inlining the ToC" for more information.

In order for HTML::GenToc to support linking to significant elements, HTML::GenToc inserts anchors into the significant elements. One can use HTML::GenToc as a filter, outputting the result to another file, or one can overwrite the original file, with the original backed up with a suffix (default: "org") appended to the filename.
One can also output the result to a string.

# METHODS

Default arguments can be set when the object is created, and overridden by setting arguments when the generate_toc method is called. Arguments are given as a hash of arguments.

## Method -- new

    $toc = new HTML::GenToc();

    $toc = new HTML::GenToc(toc_entry=>\%my_toc_entry,
        toc_end=>\%my_toc_end,
        bak=>'bak',
        ...
        );

Creates a new HTML::GenToc object.

These arguments will be used as defaults in invocations of other methods.

See generate_toc for possible arguments.

## generate_toc

    $toc->generate_toc(outfile=>"index2.html");

    my $result_str = $toc->generate_toc(to_string=>1);

Generates a table of contents for the significant elements in the HTML documents, optionally generating anchors for them first.

__Options__

- bak

    bak => _string_

    If the input file/files is/are being overwritten (__overwrite__ is on), copy the original file to "_filename_._string_". If the value is empty, __no__ backup file will be created. (default:org)

- debug

    debug => 1

    Enable verbose debugging output. Used for debugging this module; in other words, don't bother. (default:off)

- entrysep

    entrysep => _string_

    Separator string for non-`<li>` item entries (default: ", ")

- filenames

    filenames => \@filenames

    The filenames to use when creating table-of-contents links. This overrides the filenames given in the __input__ option, and is expected to have exactly the same number of elements. This can also be used when passing in string-content to the __input__ option, to give a (fake) filename to use for the links relating to that content.

- footer

    footer => _file_or_string_

    Either the filename of the file containing footer text for ToC; or a string containing the footer text.

- header

    header => _file_or_string_

    Either the filename of the file containing header text for ToC; or a string containing the header text.

- ignore_only_one

    ignore_only_one => 1

    If there would be only one item in the ToC, don't make a ToC.

- ignore_sole_first

    ignore_sole_first => 1

    If the first item in the ToC is of the highest level, AND it is the only one of that level, ignore it. This is useful in web-pages where there is only one H1 header but one doesn't know beforehand whether there will be only one.

- inline

    inline => 1

    Put ToC in document at a given point. See "Inlining the ToC" for more information.

- input

    input => \@filenames

    input => $content

    This is expected to be either a reference to an array of filenames, or a string containing content to process.

    The three main uses would be:

    - (a) you have more than one file to process, so pass in multiple filenames
    - (b) you have one file to process, so pass in its filename as the only array item
    - (c) you have HTML content to process, so pass in just the content as a string

    (default:undefined)

- notoc_match

    notoc_match => _string_

    If there are certain individual tags you don't wish to include in the table of contents, even though they match the "significant elements", give a pattern here. If this pattern matches contents inside the tag (not the body), then that tag will not be included, either in generating anchors or in generating the ToC.
(default: `class="notoc"`) - ol ol => 1 Use an ordered list for level 1 ToC entries. - ol_num_levels ol_num_levels => 2 The number of levels deep the OL listing will go if __ol__ is true. If set to zero, will use an ordered list for all levels. (default:1) - overwrite overwrite => 1 Overwrite the input file with the output. (default:off) - outfile outfile => _file_ File to write the output to. This is where the modified HTML output goes to. Note that it doesn't make sense to use this option if you are processing more than one file. If you give '-' as the filename, then output will go to STDOUT. (default: STDOUT) - quiet quiet => 1 Suppress informative messages. (default: off) - textonly textonly => 1 Use only text content in significant elements. - title title => _string_ Title for ToC page (if not using __header__ or __inline__ or __toc_only__) (default: "Table of Contents") - toc_after toc_after => \%toc_after_data %toc_after_data = { _tag1_ => _suffix1_, _tag2_ => _suffix2_ }; toc_after => { H2=>'' } For defining layout of significant elements in the ToC. This expects a reference to a hash of tag=>suffix pairs. The _tag_ is the HTML tag which marks the start of the element. The _suffix_ is what is required to be appended to the Table of Contents entry generated for that tag. (default: undefined) - toc_before toc_before => \%toc_before_data %toc_before_data = { _tag1_ => _prefix1_, _tag2_ => _prefix2_ }; toc_before=>{ H2=>'' } For defining the layout of significant elements in the ToC. The _tag_ is the HTML tag which marks the start of the element. The _prefix_ is what is required to be prepended to the Table of Contents entry generated for that tag. (default: undefined) - toc_end toc_end => \%toc_end_data %toc_end_data = { _tag1_ => _endtag1_, _tag2_ => _endtag2_ }; toc_end => { H1 => '/H1', H2 => '/H2' } For defining significant elements. The _tag_ is the HTML tag which marks the start of the element. 
The _endtag_ is the HTML tag which marks the end of the element. When matching in the input file, case is ignored (but make sure that all your _tag_ options referring to the same tag are exactly the same!).

- toc_entry

    toc_entry => \%toc_entry_data

    %toc_entry_data = { _tag1_ => _level1_, _tag2_ => _level2_ };

    toc_entry => { H1 => 1, H2 => 2 }

    For defining significant elements. The _tag_ is the HTML tag which marks the start of the element. The _level_ is what level the tag is considered to be. The value of _level_ must be numeric, and non-zero. If the value is negative, consecutive entries represented by the significant element will be separated by the value set by the __entrysep__ option.

- toclabel

    toclabel => _string_

    HTML text that labels the ToC. Always used. (default: "

    <h1>Table of Contents</h1>

    ") - toc_tag toc_tag => _string_ If a ToC is to be included inline, this is the pattern which is used to match the tag where the ToC should be put. This can be a start-tag, an end-tag or a comment, but the < should be left out; that is, if you want the ToC to be placed after the BODY tag, then give "BODY". If you want a special comment tag to make where the ToC should go, then include the comment marks, for example: "!--toc--" (default:BODY) - toc_tag_replace toc_tag_replace => 1 In conjunction with __toc_tag__, this is a flag to say whether the given tag should be replaced, or if the ToC should be put after the tag. This can be useful if your toc_tag is a comment and you don't need it after you have the ToC in place. (default:false) - toc_only toc_only => 1 Output only the Table of Contents, that is, the Table of Contents plus the toclabel. If there is a __header__ or a __footer__, these will also be output. If __toc_only__ is false then if there is no __header__, and __inline__ is not true, then a suitable HTML page header will be output, and if there is no __footer__ and __inline__ is not true, then a HTML page footer will be output. (default:false) - to_string to_string => 1 Return the modified HTML output as a string. This _does_ override other methods of output (unlike version 3.00). If _to_string_ is false, the method will return 1 rather than a string. - use_id use_id => 1 Use id="_name_" for anchors rather than anchors. However if an anchor already exists for a Significant Element, this won't make an id for that particular element. - useorg useorg => 1 Use pre-existing backup files as the input source; that is, files of the form _infile_._bak_ (see __input__ and __bak__). # INTERNAL METHODS These methods are documented for developer purposes and aren't intended to be used externally. ## make_anchor_name $toc->make_anchor_name(content=>$content, anchors=>\%anchors); Makes the anchor-name for one anchor. 
Bases the anchor on the content of the significant element. Ensures that anchors are unique.

## make_anchors

    my $new_html = $toc->make_anchors(input=>$html,
        notoc_match=>$notoc_match,
        use_id=>$use_id,
        toc_entry=>\%toc_entries,
        toc_end=>\%toc_ends,
        );

Makes the anchors for the given input string. Returns a string.

## make_toc_list

    my @toc_list = $toc->make_toc_list(input=>$html,
        labels=>\%labels,
        notoc_match=>$notoc_match,
        toc_entry=>\%toc_entry,
        toc_end=>\%toc_end,
        filename=>$filename);

Makes a list of lists which represents the structure and content of (a portion of) the ToC from one file. Also updates a list of labels for the ToC entries.

## build_lol

Build a list of lists of paths, given a list of hashes with info about paths.

## output_toc

    $self->output_toc(toc=>$toc_str,
        input=>\@input,
        filenames=>\@filenames);

Puts the output (whether to file, STDOUT or string). The "output" in this case could be the ToC, the modified (anchors added) HTML, or both.

## put_toc_inline

    my $newhtml = $toc->put_toc_inline(toc_str=>$toc_str,
        filename=>$filename, in_string=>$in_string);

Puts the given toc_str into the given input string; returns a string.

## cp

    cp($src, $dst);

Copies file $src to $dst. Used for making backups of files.

# FILE FORMATS

## Formatting the ToC

The __toc_entry__ and other related options give you control over how the ToC entries may look, but there are other options to affect the final appearance of the ToC file created.

With the __header__ option, the contents of the given file (or string) will be prepended before the generated ToC. This allows you to have introductory text, or any other text, before the ToC.

- Note:

    If you use the __header__ option, make sure the file specified contains the opening HTML tag, the HEAD element (containing the TITLE element), and the opening BODY tag. However, these tags/elements should not be in the header file if the __inline__ option is used. See "Inlining the ToC" for information on what the header file should contain for inlining the ToC.
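A minimal sketch of a stand-alone ToC page built with __header__ and __footer__ strings might look like the following. The sample HTML and the header/footer text are invented for illustration, and the sketch assumes that a value containing markup and a newline is treated as text rather than as a filename; it uses only options documented above (__input__, __header__, __footer__, __to_string__).

```perl
use strict;
use warnings;
use HTML::GenToc;

# Hypothetical page content, passed in as a string (use (c) of the input option).
my $html = <<'HTML';
<h1>Introduction</h1>
<p>Some text.</p>
<h2>Background</h2>
<p>More text.</p>
HTML

my $toc = HTML::GenToc->new();

# The header supplies the opening HTML, HEAD and BODY tags;
# the footer supplies the matching closing tags.
my $out = $toc->generate_toc(
    input     => $html,
    header    => "<html>\n<head><title>Contents</title></head>\n<body>\n",
    footer    => "</body>\n</html>\n",
    to_string => 1,    # return the result instead of printing it
);

print $out;
```

Passing the header and footer as strings avoids temporary files when the surrounding markup is produced by the calling script.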
With the __toclabel__ option, the contents of the given string will be prepended before the generated ToC (but after any text taken from a __header__ file). With the __footer__ option, the contents of the file will be appended after the generated ToC. - Note: If you use the __footer__, make sure it includes the closing BODY and HTML tags (unless, of course, you are using the __inline__ option). If the __header__ option is not specified, the appropriate starting HTML markup will be added, unless the __toc_only__ option is specified. If the __footer__ option is not specified, the appropriate closing HTML markup will be added, unless the __toc_only__ option is specified. If you do not want/need to deal with header, and footer, files, then you are allowed to specify the title, __title__ option, of the ToC file; and it allows you to specify a heading, or label, to put before ToC entries' list, the __toclabel__ option. Both options have default values. If you do not want HTML page tags to be supplied, and just want the ToC itself, then specify the __toc_only__ option. If there are no __header__ or __footer__ files, then this will simply output the contents of __toclabel__ and the ToC itself. ## Inlining the ToC The ability to incorporate the ToC directly into an HTML document is supported via the __inline__ option. Inlining will be done on the first file in the list of files processed, and will only be done if that file contains an opening tag matching the __toc_tag__ value. If __overwrite__ is true, then the first file in the list will be overwritten, with the generated ToC inserted at the appropriate spot. Otherwise a modified version of the first file is output to either STDOUT or to the output file defined by the __outfile__ option. The options __toc_tag__ and __toc_tag_replace__ are used to determine where and how the ToC is inserted into the output. __Example 1__ $toc->generate_toc(inline=>1, toc_tag => 'BODY', toc_tag_replace => 0, ... 
);

This will put the generated ToC after the BODY tag of the first file. If the __header__ option is specified, then the contents of the specified file are inserted after the BODY tag. If the __toclabel__ option is not empty, then the text specified by the __toclabel__ option is inserted. Then the ToC is inserted, and finally, if the __footer__ option is specified, it inserts the footer. Then the rest of the input file follows as it was before.

__Example 2__

    $toc->generate_toc(inline=>1,
                       toc_tag => '!--toc--',
                       toc_tag_replace => 1,
                       ...
                       );

This will put the generated ToC at the first comment of the form `<!--toc-->`; that comment will be replaced by the ToC (in the order __header__ __toclabel__ ToC __footer__), followed by the rest of the input file.

- Note:

    The header file should not contain the beginning HTML tag and HEAD element since the HTML file being processed should already contain these tags/elements.

# NOTES

- * HTML::GenToc is smart enough to detect anchors inside significant elements. If the anchor defines the NAME attribute, HTML::GenToc uses the value. Otherwise, it adds its own NAME attribute to the anchor. If __use_id__ is true, then it likewise checks for and uses IDs.
- * The TITLE element is treated specially if specified in the __toc_entry__ option. It is illegal to insert anchors (A) into TITLE elements. Therefore, HTML::GenToc will actually link to the filename itself instead of the TITLE element of the document.
- * HTML::GenToc will ignore a significant element if it does not contain any non-whitespace characters. A warning message is generated if such a condition exists.
- * If you have a sequence of significant elements that change in a slightly disordered fashion, such as H1 -> H3 -> H2 or even H2 -> H1, though HTML::GenToc deals with this to create a list which is still good HTML, if you are using an ordered list to that depth, then you will get strange numbering, as an extra list element will have been inserted to nest the elements at the correct level. For example (H2 -> H1 with ol_num_levels=1): 1. * My H2 Header 2. My H1 Header For example (H1 -> H3 -> H2 with ol_num_levels=0 and H3 also being significant): 1. My H1 Header 1. 1. My H3 Header 2. My H2 Header 2. My Second H1 Header In cases such as this it may be better not to use the __ol__ option. # CAVEATS - * Version 3.10 (and above) generates more verbose (SEO-friendly) anchors than prior versions. Thus anchors generated with earlier versions will not match version 3.10 anchors. - * Version 3.00 (and above) of HTML::GenToc is not compatible with Version 2.x of HTML::GenToc. It is now designed to do everything in one pass, and has dropped certain options: the __infile__ option is no longer used (it has been replaced with the __input__ option); the __toc_file__ option no longer exists; use the __outfile__ option instead; the __tocmap__ option is no longer supported. Also the old array-parsing of arguments is no longer supported. There is no longer a __generate_anchors__ method; everything is done with __generate_toc__. It now generates lower-case tags rather than upper-case ones. - * HTML::GenToc is not very efficient (memory and speed), and can be slow for large documents. - * Now that generation of anchors and of the ToC are done in one pass, even more memory is used than was the case before. This is more notable when processing multiple files, since all files are read into memory before processing them. - * Invalid markup will be generated if a significant element is contained inside of an anchor. For example:

    <a name="foo"><h1>The FOO command</h1></a>

    will be converted to (if H1 is a significant element),

    The FOO command

    which is illegal since anchors cannot be nested. It is better style to put anchor statements within the element to be anchored. For example, the following is preferred:

    <h1><a name="foo">The FOO command</a></h1>

    HTML::GenToc will detect the "foo" name and use it.

- * name attributes without quotes are not recognized.

# BUGS

Tell me about them.

# REQUIRES

The installation of this module requires `Module::Build`. The module depends on `HTML::SimpleParse`, `HTML::Entities` and `HTML::LinkList` and uses `Data::Dumper` for debugging purposes. The hypertoc script depends on `Getopt::Long`, `Getopt::ArgvFile` and `Pod::Usage`. Testing of this distribution depends on `Test::More`.

# INSTALLATION

To install this module, run the following commands:

    perl Build.PL
    ./Build
    ./Build test
    ./Build install

Or, if you're on a platform (like DOS or Windows) that doesn't like the "./" notation, you can do this:

    perl Build.PL
    perl Build
    perl Build test
    perl Build install

In order to install somewhere other than the default, such as in a directory under your home directory, like "/home/fred/perl", run

    perl Build.PL --install_base /home/fred/perl

as the first step instead.

This will install the files underneath /home/fred/perl.

You will then need to make sure that you alter the PERL5LIB variable to find the modules, and the PATH variable to find the script.

Therefore you will need to change:

your path, to include /home/fred/perl/script (where the script will be)

    PATH=/home/fred/perl/script:${PATH}

the PERL5LIB variable to add /home/fred/perl/lib

    PERL5LIB=/home/fred/perl/lib:${PERL5LIB}

# SEE ALSO

perl(1) htmltoc(1) hypertoc(1)

# AUTHOR

Kathryn Andersen (RUBYKAT) http://www.katspace.org/tools/hypertoc/

Based on htmltoc by Earl Hood ehood AT medusa.acs.uci.edu

Contributions by Dan Dascalescu

# COPYRIGHT

Copyright (C) 1994-1997 Earl Hood, ehood AT medusa.acs.uci.edu

Copyright (C) 2002-2008 Kathryn Andersen

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.t000755001750001750 011545551123 12266 5ustar00katkat000000000000HTML-GenToc-3.20compare.pl000644001750001750 223011545551123 14405 0ustar00katkat000000000000HTML-GenToc-3.20/t# compare two files sub compare { my $file1 = shift; my $file2 = shift; return 0 unless (-f $file1); return 0 unless (-f $file2); my $fh1 = undef; my $fh2 = undef; open($fh1, $file1) || return 0; open($fh2, $file2) || return 0; my $res = 1; my $count = 0; while (<$fh1>) { $count++; my $comp1 = $_; # remove newline/carriage return (in case these aren't both Unix) $comp1 =~ s/\n//; $comp1 =~ s/\r//; my $comp2 = <$fh2>; # check if $fh2 has less lines than $fh1 if (!defined $comp2) { print "error - line $count does not exist in $file2\n $file1 : $comp1\n"; close($fh1); close($fh2); return 0; } # remove newline/carriage return $comp2 =~ s/\n//; $comp2 =~ s/\r//; if ($comp1 ne $comp2) { print "error - line $count not equal\n $file1 : $comp1\n $file2 : $comp2\n"; close($fh1); close($fh2); return 0; } } close($fh1); # check if $fh2 has more lines than $fh1 if (defined($comp2 = <$fh2>)) { $comp2 =~ s/\n//; $comp2 =~ s/\r//; print "error - extra line in $file2 : '$comp2'\n"; $res = 0; } close($fh2); return $res; } 1; MANIFEST.SKIP000644001750001750 43411545551123 14041 0ustar00katkat000000000000HTML-GenToc-3.20# version control files \bRCS\b \bCVS\b ,v$ ^\.git ^\.build ^dist.ini$ # archives .*\.tar\.gz$ # the TODO file ^\.todo$ # MakeMaker files ^Makefile$ ^blib/ ^MakeMaker-\d pm_to_blib # Module::Build files ^Build$ ^_build/ # temp, old, backup files ~$ \.old$ \.bak$ \.# \.swp$ ^#.*#$ 
010_files.t000644001750001750 2343111545551123 14317 0ustar00katkat000000000000HTML-GenToc-3.20/tuse Test::More tests => 36; use HTML::GenToc; require 't/compare.pl'; # Insert your test code below #=================================================== $toc = new HTML::GenToc(debug=>0,quiet=>1); # # file test1 # $result = $toc->generate_toc( make_anchors=>1, make_toc=>0, input=>['tfiles/test1.wml'], outfile=>'test1_anch.wml', ); ok($result, 'generated anchors from test1.wml'); # compare the files $result = compare('test1_anch.wml', 'tfiles/good_test1_anch.wml'); ok($result, 'test1_anch.wml matches tfiles/good_test1_anch.wml exactly'); $result = $toc->generate_toc( make_anchors=>0, make_toc=>1, input=>['test1_anch.wml'], outfile=>'test1_toc.html', ); ok($result, 'generated toc from test1_anch.wml'); # compare the files $result = compare('test1_toc.html', 'tfiles/good_test1_toc.html'); ok($result, 'test1_toc.html matches tfiles/good_test1_toc.html exactly'); # clean up test1 if ($result) { unlink('test1_anch.wml'); unlink('test1_toc.html'); } # # file test2 # if (-f 'test2_anch.html') { unlink('test2_anch.html'); } if (-f 'test2_anch.html.org') { unlink('test2_anch.html.org'); } $result = $toc->generate_toc( make_anchors=>1, make_toc=>0, input=>['tfiles/test2.html'], outfile=>'test2_anch.html', ); ok($result, 'generated anchors from test2.html'); # compare the files $result = compare('test2_anch.html', 'tfiles/good_test2_anch.html'); ok($result, 'test2_anch.html matches tfiles/good_test2_anch.html exactly'); $result = $toc->generate_toc( make_anchors=>0, make_toc=>1, outfile=>'', input=>['test2_anch.html'], inline=>1, overwrite=>1, ); ok($result, 'generated toc inline test2_anch.html'); # compare the files $result = compare('test2_anch.html', 'tfiles/good_test2_toc.html'); ok($result, 'test2_anch.html matches tfiles/good_test2_toc.html exactly'); # clean up if ($result) { unlink('test2_anch.html'); unlink('test2_anch.html.org'); } # # file test3 # $result = 
$toc->generate_toc( make_anchors=>1, make_toc=>0, bak=>'', inline=>0, overwrite=>0, input=>['tfiles/test3.wml'], outfile=>'test3_anch.wml', toc_entry=>{ H1=>1, H2=>2, H3=>3, }, toc_end=>{ H1=>'/H1', H2=>'/H2', H3=>'/H3', }, ); ok($result, 'generated anchors (H1,H2,H3) from test3.wml'); # compare the files $result = compare('test3_anch.wml', 'tfiles/good_test3_anch.wml'); ok($result, 'test3_anch.wml matches tfiles/good_test3_anch.wml exactly'); $result = $toc->generate_toc( make_anchors=>0, make_toc=>1, input=>['test3_anch.wml'], outfile=>'test3_toc.html', toc_entry=>{ H1=>1, H2=>2, H3=>3, }, toc_end=>{ H1=>'/H1', H2=>'/H2', H3=>'/H3', }, ); ok($result, 'generated toc from test3_anch.wml'); # compare the files $result = compare('test3_toc.html', 'tfiles/good_test3_toc.html'); ok($result, 'test3_toc.html matches tfiles/good_test3_toc.html exactly'); # clean up if ($result) { unlink('test3_anch.wml'); unlink('test3_toc.html'); } # # file test4 # $result = $toc->generate_toc( make_anchors=>1, make_toc=>0, input=>['tfiles/test4.html'], bak=>'', inline=>0, overwrite=>0, outfile=>'test4_anch.html', toc_entry=>{ 'H2'=>1, 'H3'=>2, }, toc_end=>{ 'H2'=>'/H2', 'H3'=>'/H3', } ); ok($result, 'generated anchors (H1,H3) from test4.html'); # compare the files $result = compare('test4_anch.html', 'tfiles/good_test4_anch.html'); ok($result, 'test4_anch.html matches tfiles/good_test4_anch.html exactly'); $result = $toc->generate_toc( make_anchors=>0, make_toc=>1, input=>['test4_anch.html'], outfile=>'test4_toc.html', toc_entry=>{ 'H2'=>1, 'H3'=>2, }, toc_end=>{ 'H2'=>'/H2', 'H3'=>'/H3', } ); ok($result, 'generated toc from test4_anch.html'); # compare the files $result = compare('test4_toc.html', 'tfiles/good_test4_toc.html'); ok($result, 'test4_toc.html matches tfiles/good_test4_toc.html exactly'); # clean up if ($result) { unlink('test4_anch.html'); unlink('test4_toc.html'); } # # file test4 using entrysep # $result = $toc->generate_toc( make_anchors=>1, make_toc=>0, 
input=>['tfiles/test4.html'], bak=>'', inline=>0, overwrite=>0, outfile=>'test4a_anch.html', toc_entry=>{ 'H2'=>1, 'H3'=>-2, }, toc_end=>{ 'H2'=>'/H2', 'H3'=>'/H3', } ); ok($result, 'generated anchors (entrysep) from test4.html'); # compare the files $result = compare('test4a_anch.html', 'tfiles/good_test4a_anch.html'); ok($result, 'test4a_anch.html matches tfiles/good_test4a_anch.html exactly'); $result = $toc->generate_toc( make_anchors=>0, make_toc=>1, input=>['test4a_anch.html'], outfile=>'test4a_toc.html', toc_entry=>{ 'H2'=>1, 'H3'=>-2, }, toc_end=>{ 'H2'=>'/H2', 'H3'=>'/H3', } ); ok($result, 'generated toc from test4a_anch.html'); # compare the files $result = compare('test4a_toc.html', 'tfiles/good_test4a_toc.html'); ok($result, 'test4a_toc.html matches tfiles/good_test4a_toc.html exactly'); # clean up if ($result) { unlink('test4a_anch.html'); unlink('test4a_toc.html'); } # # file test4 using ol # $result = $toc->generate_toc( make_anchors=>1, make_toc=>0, input=>['tfiles/test4.html'], bak=>'', inline=>0, overwrite=>0, outfile=>'test4b_anch.html', toc_entry=>{ 'H2'=>1, 'H3'=>2, }, toc_end=>{ 'H2'=>'/H2', 'H3'=>'/H3', } ); # (don't check the above because it's exactly the same as test4) $result = $toc->generate_toc( make_anchors=>0, make_toc=>1, input=>['test4b_anch.html'], outfile=>'test4b_toc.html', ol=>1, toc_entry=>{ 'H2'=>1, 'H3'=>2, }, toc_end=>{ 'H2'=>'/H2', 'H3'=>'/H3', } ); ok($result, 'generated toc (ol) from test4b_anch.html'); # compare the files $result = compare('test4b_toc.html', 'tfiles/good_test4b_toc.html'); ok($result, 'test4b_toc.html matches tfiles/good_test4b_toc.html exactly'); # clean up if ($result) { unlink('test4b_anch.html'); unlink('test4b_toc.html'); } # # file test5 (this file already has anchors) # (testing H3 -> H2 sequence) # $result = $toc->generate_toc( make_anchors=>0, make_toc=>1, input=>['tfiles/test5.php'], ol=>0, inline=>0, overwrite=>0, bak=>'', outfile=>'test5_toc.html', toc_entry=>{ 'H2'=>1, 'H3'=>2, }, toc_end=>{ 
'H2'=>'/H2', 'H3'=>'/H3', } ); ok($result, 'generated toc from test5.php'); # compare the files $result = compare('test5_toc.html', 'tfiles/good_test5_toc.html'); ok($result, 'test5_toc.html (H3 -> H2) matches tfiles/good_test5_toc.html exactly'); # clean up if ($result) { unlink('test5_toc.html'); } # # file test5 (this file already has anchors) # (testing H3 -> H2 sequence with OL) # $result = $toc->generate_toc( make_anchors=>0, make_toc=>1, input=>['tfiles/test5.php'], ol=>1, inline=>0, overwrite=>0, bak=>'', outfile=>'test5b_toc.html', toc_entry=>{ 'H2'=>1, 'H3'=>2, }, toc_end=>{ 'H2'=>'/H2', 'H3'=>'/H3', } ); ok($result, 'generated toc with OL from test5.php'); # compare the files $result = compare('test5b_toc.html', 'tfiles/good_test5b_toc.html'); ok($result, 'test5b_toc.html (H3 -> H2 + OL) matches tfiles/good_test5b_toc.html exactly'); # clean up if ($result) { unlink('test5b_toc.html'); } # # file test6 (this file already has anchors) # (testing 2-level OL) # $result = $toc->generate_toc( make_anchors=>0, make_toc=>1, input=>['tfiles/test6.html'], ol=>1, ol_num_levels=>2, inline=>0, overwrite=>0, bak=>'', outfile=>'test6_toc.html', toc_entry=>{ 'H1'=>1, 'H2'=>2, 'H3'=>3, }, toc_end=>{ 'H1'=>'/H1', 'H2'=>'/H2', 'H3'=>'/H3', } ); ok($result, 'generated toc with OL(2) from test6.html'); # compare the files $result = compare('test6_toc.html', 'tfiles/good_test6_toc.html'); ok($result, 'test6_toc.html (L2 OL) matches tfiles/good_test6_toc.html exactly'); # clean up if ($result) { unlink('test6_toc.html'); } # # file test6 (this file already has anchors) # (testing all-level OL) # $result = $toc->generate_toc( make_anchors=>0, make_toc=>1, input=>['tfiles/test6.html'], ol=>1, ol_num_levels=>0, bak=>'', outfile=>'test6a_toc.html', toc_entry=>{ 'H1'=>1, 'H2'=>2, 'H3'=>3, }, toc_end=>{ 'H1'=>'/H1', 'H2'=>'/H2', 'H3'=>'/H3', } ); ok($result, 'generated toc with OL(0) from test6.html'); # compare the files $result = compare('test6a_toc.html', 
'tfiles/good_test6a_toc.html'); ok($result, 'test6a_toc.html (OL) matches tfiles/good_test6a_toc.html exactly'); # clean up if ($result) { unlink('test6a_toc.html'); } # # RESET file test2a # undef $toc; $toc = new HTML::GenToc(debug=>0,quiet=>1); if (-f 'test2a_anch.html') { unlink('test2a_anch.html'); } if (-f 'test2a_anch.html.org') { unlink('test2a_anch.html.org'); } $result = $toc->generate_toc( make_anchors=>1, make_toc=>0, use_id=>1, input=>['tfiles/test2.html'], outfile=>'test2a_anch.html', ); ok($result, 'generated anchors (ID) from test2.html'); # compare the files $result = compare('test2a_anch.html', 'tfiles/good_test2a_anch.html'); ok($result, 'test2a_anch.html matches tfiles/good_test2a_anch.html exactly'); $result = $toc->generate_toc( make_anchors=>0, make_toc=>1, input=>['test2a_anch.html'], inline=>1, overwrite=>1, ); ok($result, 'generated toc inline test2a_anch.html'); # compare the files $result = compare('test2a_anch.html', 'tfiles/good_test2a_toc.html'); ok($result, 'test2a_anch.html matches tfiles/good_test2a_toc.html exactly'); # clean up if ($result) { unlink('test2a_anch.html'); unlink('test2a_anch.html.org'); } # # file test7 (this file already has some anchors) # testing generation of anchors # $result = $toc->generate_toc( make_anchors=>1, use_id=>1, make_toc=>0, input=>['tfiles/test7.html'], overwrite=>0, bak=>'', outfile=>'test7a.html', toc_entry=>{ 'H1'=>1, 'H2'=>2, }, toc_end=>{ 'H1'=>'/H1', 'H2'=>'/H2', } ); ok($result, 'generated anchors from test7.html'); # compare the files $result = compare('test7a.html', 'tfiles/good_test7a.html'); ok($result, 'test7a.html matches tfiles/good_test7a.html exactly'); # clean up if ($result) { unlink('test7a.html'); } # vim: ft=perl 070_script.t000644001750001750 421011545551123 14501 0ustar00katkat000000000000HTML-GenToc-3.20/tuse Test::More tests => 8; require 't/compare.pl'; #-------------------------------------------------------------------- # Insert your test code below 
#-------------------------------------------------------------------- # clear files if (-f 'test1_anch.wml') { unlink('test1_anch.wml'); } if (-f 'test1_toc.html') { unlink('test1_toc.html'); } # now test the script my $command = "$^X -I lib scripts/hypertoc --quiet --gen_anchors --outfile test1_anch.wml tfiles/test1.wml"; my $result = system($command); ok($result == 0, 'hypertoc generated anchors from test1.wml'); # compare the files $result = compare('test1_anch.wml', 'tfiles/good_test1_anch.wml'); ok($result, 'hypertoc: test1_anch.wml matches tfiles/good_test1_anch.wml exactly'); $command = "$^X -I lib scripts/hypertoc --gen_toc --quiet --outfile test1_toc.html test1_anch.wml"; my $result2 = system($command); ok($result2 == 0, 'hypertoc generated toc from test1_anch.wml'); # compare the files $result2 = compare('test1_toc.html', 'tfiles/good_test1_toc.html'); ok($result2, 'hypertoc: test1_toc.html matches tfiles/good_test1_toc.html exactly'); # clean up test1 if ($result && $result2) { unlink('test1_anch.wml'); unlink('test1_toc.html'); } # # test with both generate options together # $command = "$^X -I lib scripts/hypertoc --gen_anchors --quiet --gen_toc --outfile test1a_toc.html tfiles/test1.wml"; $result = system($command); ok($result == 0, 'hypertoc generated toc from test1.wml'); # compare the files $result = compare('test1a_toc.html', 'tfiles/good_test1a_toc.html'); ok($result, 'hypertoc: test1a_toc.html matches tfiles/good_test1a_toc.html exactly'); # clean up test1 if ($result) { unlink('test1a_toc.html'); } # # test with option file # $command = "$^X -I lib scripts/hypertoc --argfile tfiles/test1b.args tfiles/test1.wml"; $result = system($command); ok($result == 0, 'hypertoc generated toc (argfile) from test1.wml'); # compare the files $result = compare('test1b_toc.html', 'tfiles/good_test1a_toc.html'); ok($result, 'hypertoc: test1b_toc.html matches tfiles/good_test1a_toc.html exactly'); # clean up test1 if ($result) { unlink('test1b_toc.html'); } 
00-compile.t000644001750001750 204111545551123 14454 0ustar00katkat000000000000HTML-GenToc-3.20/t#!perl use strict; use warnings; use Test::More; use File::Find; use File::Temp qw{ tempdir }; my @modules; find( sub { return if $File::Find::name !~ /\.pm\z/; my $found = $File::Find::name; $found =~ s{^lib/}{}; $found =~ s{[/\\]}{::}g; $found =~ s/\.pm$//; # nothing to skip push @modules, $found; }, 'lib', ); my @scripts = glob "bin/*"; my $plan = scalar(@modules) + scalar(@scripts); $plan ? (plan tests => $plan) : (plan skip_all => "no tests to run"); { # fake home for cpan-testers # no fake requested ## local $ENV{HOME} = tempdir( CLEANUP => 1 ); like( qx{ $^X -Ilib -e "require $_; print '$_ ok'" }, qr/^\s*$_ ok/s, "$_ loaded ok" ) for sort @modules; SKIP: { eval "use Test::Script 1.05; 1;"; skip "Test::Script needed to test script compilation", scalar(@scripts) if $@; foreach my $file ( @scripts ) { my $script = $file; $script =~ s!.*/!!; script_compiles( $file, "$script script compiles" ); } } } 020_strings.t000644001750001750 771711545551123 14700 0ustar00katkat000000000000HTML-GenToc-3.20/tuse Test::More tests => 7; use HTML::GenToc; # Insert your test code below #=================================================== $toc = new HTML::GenToc(debug=>0, quiet=>1); #---------------------------------------------------------- # string input and output $html1 ="

    Cool header

    This is a paragraph.

    Getting Cooler

    Another paragraph.

    "; $html2 ="

    Cool header

    This is a paragraph.

    Getting Cooler

    Another paragraph.

    "; $out_str = $toc->generate_toc( make_anchors=>1, make_toc=>0, to_string=>1, filenames=>["fred.html"], input=>$html1, toc_entry=>{ 'H1' =>1, 'H2' =>2, }, toc_end=>{ 'H1' =>'/H1', 'H2' =>'/H2', }, ); is($out_str, $html2, "(1) generate_anchors matches strings"); $out_str = $toc->generate_toc( make_anchors=>0, make_toc=>1, to_string=>1, filenames=>["fred.html"], input=>$html2, ); $ok_toc_str1=' Table of Contents

    Table of Contents

    '; is($out_str, $ok_toc_str1, "(2) generate_toc matches toc string"); $out_str = $toc->generate_toc( make_anchors=>0, make_toc=>1, to_string=>1, filenames=>["fred.html"], input=>$html2, inline=>1, toc_tag=>'/H1', toc_tag_replace=>0, toclabel=>'', ); $ok_toc_str2='

    Cool header

    This is a paragraph.

    Getting Cooler

    Another paragraph.

    '; is($out_str, $ok_toc_str2, "(3) generate_toc matches inline toc string"); # # Reset undef $toc; $toc = new HTML::GenToc(debug=>0, quiet=>1); $html1 ="

    Cool header

    This is a paragraph.

    Getting Cooler

    Another paragraph.

    "; $html2 ="

    Cool header

    This is a paragraph.

    Getting Cooler

    Another paragraph.

    "; $out_str = $toc->generate_toc( make_anchors=>1, make_toc=>0, to_string=>1, use_id=>1, filenames=>["fred.html"], input=>$html1, toc_entry=>{ 'H1' =>1, 'H2' =>2, }, toc_end=>{ 'H1' =>'/H1', 'H2' =>'/H2', }, ); is($out_str, $html2, "(4) generate_anchors (id) matches strings"); $out_str = $toc->generate_toc( make_anchors=>0, make_toc=>1, to_string=>1, filenames=>["fred.html"], input=>$html2, ); $ok_toc_str1=' Table of Contents

    Table of Contents

    '; is($out_str, $ok_toc_str1, "(5) generate_toc (id) matches toc string"); # ignore sole first $out_str = $toc->generate_toc( make_anchors=>0, make_toc=>1, to_string=>1, filenames=>["fred.html"], ignore_sole_first=>1, input=>$html2, ); $ok_toc_str1=' Table of Contents

    Table of Contents

    '; is($out_str, $ok_toc_str1, "(6) generate_toc (ignore_sole_first) matches toc string"); # ignore_only_one $html1 =<Cool header

    This is a paragraph.

    EOT $out_str = $toc->generate_toc( to_string=>1, use_id=>1, inline=>1, ignore_only_one=>1, toc_tag=>'/h1', input=>$html1, ); $ok_toc_str1 =<Cool header

    This is a paragraph.

    EOT is($out_str, $ok_toc_str1, "(7) generate_toc (ignore_only_one) matches string"); 030_anchors.t000644001750001750 1061611545551123 14655 0ustar00katkat000000000000HTML-GenToc-3.20/tuse Test::More tests => 1; use HTML::GenToc; #=================================================== my $toc = new HTML::GenToc(); my $input = <<'HTML';

    The Big Step 1

    The first heading text hoes here

    The Big Step 2

    This is the second heading text

    second header, first subheader

    Some subheader text here

    second header, second subheader

    Another piece of subheader text here

    The Big Step

    Third heading text

    The Big Step

    Fourth heading text; anchor above needed uniquifying

    The big Step

    Per http://www.w3.org/TR/REC-html40/struct/links.html#h-12.2.1, "Anchor names must be unique within a document. Anchor names that differ only in case may not appear in the same document."

    The Big Step #6

    The number/hash sign is allowed in fragments; the fragment starts with the first hash. No spec as a reference for this, but the anchors work in Firefox 3 and IE 6.

    Calculation #7: 7/5>3 or <2?

    Hash marks in fragments work, as well as '/' and '?' signs. < and > are escaped.

    #8: start with a number (hash) [pound] {comment} sign

    HTML my $output; =pod Test 1 --- 1. SEO-friendly anchors --------------------------------------------------------- Anchors should be generated with SEO-friendly names, i.e. out of the entire token text, instead of being numeric or reduced to the first word(s) of the token. In the spirit of http://seo2.0.onreact.com/top-10-fatal-url-design-mistakes, compare: http://beachfashion.com/photos/Pamela_Anderson#In_red_swimsuit_in_Baywatch vs. http://beachfashion.com/photos/Pamela_Anderson#in Which one speaks your language more, which one will you rather click? The anchor names generated are compliant with XHTML1.0 Strict. Also, per the HTML 4.01 spec, anchors that differ only in case may not appear in the same document and anchor names should be restricted to ASCII characters. =cut $output = $toc->generate_toc( input => $input, inline => 1, toc_tag => 'tochere', toc_tag_replace => 1, to_string => 1, ); my $good_output = <<'EOT';

    Table of Contents

    The Big Step 1

    The first heading text hoes here

    The Big Step 2

    This is the second heading text

    second header, first subheader

    Some subheader text here

    second header, second subheader

    Another piece of subheader text here

    The Big Step

    Third heading text

    The Big Step

    Fourth heading text; anchor above needed uniquifying

    The big Step

    Per http://www.w3.org/TR/REC-html40/struct/links.html#h-12.2.1, "Anchor names must be unique within a document. Anchor names that differ only in case may not appear in the same document."

    The Big Step #6

    The number/hash sign is allowed in fragments; the fragment starts with the first hash. No spec as a reference for this, but the anchors work in Firefox 3 and IE 6.

    Calculation #7: 7/5>3 or <2?

    Hash marks in fragments work, as well as '/' and '?' signs. < and > are escaped.

    #8: start with a number (hash) [pound] {comment} sign

    EOT is($output, $good_output, "(1) SEO-friendly anchors match"); tfiles000755001750001750 011545551123 13311 5ustar00katkat000000000000HTML-GenToc-3.20test1.wml000644001750001750 154411545551123 15235 0ustar00katkat000000000000HTML-GenToc-3.20/tfiles#------------------------------------------------------------------------- # Test WML page #------------------------------------------------------------------------- $(body_class=withimage) $(style_sheet2=defimg.css) $(no_index=true) $(header_style=) #include "prologue.wml" $(TITLE=The Title)

    Wow

    Authors

    Kathryn Andersen < kitty@example.com> is the author of HTML::GenToc.

    Earl Hood is the author of htmltoc.

    Authors Golly

    Mary had a little lamb
    Its fleece was white as snow
    And everywhere that Mary went
    The lamb was sure to go.

    Authors Golly Gee

    "BOOM! Sooner or later. *BOOM*!"
    	-- Ivanova, "Grail" (Babylon 5)
    
    test5.php000644001750001750 1753511545551123 15260 0ustar00katkat000000000000HTML-GenToc-3.20/tfiles Author Swellison

    Swellison

    Archaeology 701 (Sentinel)

    Reviewed by Kathryn Andersen on 22 July 2000 (2)

    Cool to see Jim in Blair's world, for once. Evidence gathering is evidence gathering, though policemen don't usually have to dig for it. Nice.

    Platinum (Sentinel)

    Reviewed by Kathryn Andersen on 26 August 2000 (9)

    A lot of cool senses work in this story, I really liked that. And I enjoyed the banter between the guys. And it was a case story too!

    Routine Traffic Stop (Sentinel/ER)

    Reviewed by Kathryn Andersen on 29 July 2001 (14)

    This is one of the stories written for the "crossover" Lyric Wheel, though the ER stuff is more of a cameo -- the main concentration is on Jim and Blair. There are two parts to this story -- the action and the owies. I found the action the most interesting bit, the plot deliciously ironic. Jim and Blair on the beat, and it's just a routine traffic stop -- NOT! (grin) The bit at the hospital is good because the ER characters are, well, from ER (therefore not cyphers as they often are) but I found some parts with Jim at the hospital were a bit soppy -- saying things aloud that he would think, but probably wouldn't say -- but you could argue that since he was effectively alone, there was no difference...

    Faux Paws Productions

    (520) Wind Shift (FPP-506) (Sentinel)

    Reviewed by Kathryn Andersen on 26 August 2000 (1)

    This was good. I liked the little bit of continuity with Deal's Way. This was just a good case story, with a cool little Jim flashback in there.


    Prev Author: Suburban House Elf Next Author: TAE

    test3.wml000644001750001750 166711545551123 15245 0ustar00katkat000000000000HTML-GenToc-3.20/tfiles#------------------------------------------------------------------------- # Test WML page #------------------------------------------------------------------------- $(body_class=withimage) $(style_sheet2=defimg.css) $(no_index=true) $(header_style=) #include "prologue.wml" $(TITLE=The Title)

    Wow

    Authors

    Kathryn Andersen < kitty@example.com> is the author of HTML::GenToc.

    Earl Hood is the author of htmltoc.

    Authors Golly

    Mary had a little lamb
    Its fleece was white as snow
    And everywhere that Mary went
    The lamb was sure to go.

    Authors Golly Gee

    "BOOM! Sooner or later. *BOOM*!"
    	-- Ivanova, "Grail" (Babylon 5)
    

    And back again

    Here we un-nest ourselves a bit.

    scripts000755001750001750 011545551123 13512 5ustar00katkat000000000000HTML-GenToc-3.20hypertoc000644001750001750 6164211545551123 15462 0ustar00katkat000000000000HTML-GenToc-3.20/scripts#!/usr/bin/env perl use strict; =head1 NAME hypertoc - generate a table of contents for HTML documents =head1 VERSION version 3.20 =head1 SYNOPSIS hypertoc --help | --manpage | --man_help | --man hypertoc [--bak I ] [ --debug ] [ --entrysep I ] [ --footer I ] [ --header I ] [ --ignore_only_one ] [ --ignore_sole_first ] [ --inline ] [ --make_anchors ] [ --make_toc ] [ --notoc_match I ] [ --ol | --nool ] [ --ol_num_levels ] [ --outfile I ] [ --overwrite ] [ --quiet ] [ --textonly ] [ --title I ] { --toc_after I } { --toc_before I } { --toc_end I } { --toc_entry I } [ --toc_label I ] [ --toc_only | --notoc_only ] [ --toc_tag I ] [ --toc_tag_replace ] [ --use_id ] [ --useorg ] file ... =head1 DESCRIPTION hypertoc allows you to specify "significant elements" that will be hyperlinked to in a "Table of Contents" (ToC) for a given set of HTML documents. Basically, the ToC generated is a multi-level list containing links to the significant elements. hypertoc inserts the links into the ToC to significant elements at a level specified by the user. B<Example:> If H1s are specified as level 1, then they appear in the first level list of the ToC. If H2s are specified as a level 2, then they appear in a second level list in the ToC. There are two aspects to the ToC generation: (1) putting suitable anchors into the HTML documents (--make_anchors), and (2) generating the ToC from HTML documents which have anchors in them for the ToC to link to (--make_toc). One can choose to do one or both of these. hypertoc also supports the ability to incorporate the ToC into the HTML document itself via the --inline option. In order for hypertoc to support linking to significant elements, hypertoc inserts anchors into the significant elements. 
One can use hypertoc as a filter, outputing the result to another file, or one can overwrite the original file, with the original backed up with a suffix (default: "org") appended to the filename. One can also define options in a config file as well as on the command-line. =head1 OPTIONS Options can start with "--" or "-"; boolean options can be negated by preceding them with "no"; options with hash or array values can be added to by giving the option again for each value. See L for more information. =over =item --argfile I The name of a file to read more options from. This can be used more than once. For example: --argfile your.args --argfile my.args See L for more information. =item --bak --bak I If the input file/files is/are being overwritten (--overwrite is on), copy the original file to "I.I". If the value is empty, there is no backup file written. (default:org) =item --debug Enable verbose debugging output. Used for debugging this module; in other words, don't bother. (default:off) =item --entrysep --entrysep I Separator string for non-
  • item entries (default: ", ") =item --footer --footer I File containing footer text for table of contents. =item --header --header I File containing header text for table of contents. =item --help Print a short help message and exit. =item --ignore_only_one If there would be only one item in the ToC, don't make a ToC. =item --ignore_sole_first If the first item in the ToC is of the highest level, AND it is the only one of that level, ignore it. This is useful in web-pages where there is only one H1 header but one doesn't know beforehand whether there will be only one. =item --inline Put ToC in document at a given point. See L for more information. =item --make_anchors | --gen_anchors Create anchors for the table-of-contents to link to. =item --make_toc | --gen_toc Make a Table-of-Contents which links to anchored significant elements. =item --man_help | --manpage | --man Print all documentation and exit. =item --notoc_match --notoc_match I If there are certain individual tags you don't wish to include in the table of contents, even though they match the "significant elements", then if this pattern matches contents inside the tag (not the body), then that tag will not be included, either in generating anchors nor in generating the ToC. (default: class="notoc") =item --ol | --nool Use an ordered list for Table-of-Contents entries (to a given depth). If --ol is false (i.e. --nool is set) then I use an ordered list for ToC entries. (default:false) (See --ol_num_levels to determine how deep the ordered-list listing goes) =item --ol_num_levels The number of levels deep the OL listing will go if --ol is true. If set to zero, will use an ordered list for all levels. (default:1) =item --outfile --outfile I File to write the output to. This is where the modified HTML output and the Table-of-Contents goes to. If you give '-' as the filename, then output will go to STDOUT. (default: STDOUT) =item --overwrite Overwrite the input file with the output. 
If this is in effect, --outfile is ignored. Used in I for creating the anchors "in place" and in I if the --inline option is in effect. (default:off) =item --quiet Suppress informative messages. (default: off) =item --textonly Use only text content in significant elements. =item --title --title I Title for ToC page (if not using --header or --inline or --toc_only) (default: "Table of Contents") =item --toc_after --toc_after I=I --toc_after "H2=" For defining layout of significant elements in the ToC. The I is the HTML tag which marks the start of the element. The I is what is required to be appended to the Table of Contents entry generated for that tag. This is a cumulative hash argument. (default: undefined) =item --toc_before --toc_before I=I --toc_before "H2=" For defining the layout of significant elements in the ToC. The I is the HTML tag which marks the start of the element. The I is what is required to be prepended to the Table of Contents entry generated for that tag. This is a cumulative hash argument. (default: undefined) =item --toc_end --toc_end I=I --toc_end "H1=/H1" For defining significant elements. The I is the HTML tag which marks the start of the element. The I is the HTML tag which marks the end of the element. When matching in the input file, case is ignored (but make sure that all your I options referring to the same tag are exactly the same!). This is a cumulative hash argument. (default: H1=/H1 H2=/H2) =item --toc_entry --toc_entry I=I --toc_entry "TITLE=1" --toc_entry "H1=2" For defining significant elements. The I is the HTML tag which marks the start of the element. The I is what level the tag is considered to be. The value of I must be numeric, and non-zero. If the value is negative, consecutive entries represented by the significant_element will be separated by the value set by the --entrysep option. This is a cumulative hash argument. (default: H1=1 H2=2) =item --toc_label | --toclabel --toc_label I HTML text that labels the ToC. Always used. 
(default: "

    <h1>Table of Contents</h1>

    ") =item --toc_only | --notoc_only Output only the Table of Contents, that is, the Table of Contents plus the toc_label. If there is a --header or a --footer, these will also be output. If --toc_only is false (i.e. --notoc_only is set) then if there is no --header, and --inline is not true, then a suitable HTML page header will be output, and if there is no --footer and --inline is not true, then a HTML page footer will be output. (default:--notoc_only) =item --toc_tag --toc_tag I If a ToC is to be included inline, this is the pattern which is used to match the tag where the ToC should be put. This can be a start-tag, an end-tag or a comment, but the E should be left out; that is, if you want the ToC to be placed after the BODY tag, then give "BODY". If you want a special comment tag to make where the ToC should go, then include the comment marks, for example: "!--toc--" (default:BODY) =item --toc_tag_replace In conjunction with --toc_tag, this is a flag to say whether the given tag should be replaced, or if the ToC should be put after the tag. This can be useful if your toc_tag is a comment and you don't need it after you have the ToC in place. (default:false) =item --use_id Use id="I" for anchors rather than anchors. However if an anchor already exists for a Significant Element, this won't make an ID for that particular element. =item --useorg Use pre-existing backup files as the input source; that is, files of the form I.I (see --bak). =back =head1 FILE FORMATS =head2 Options Files Options can be given in files as well as on the command-line by using the --argfile I option in the command-line. Also, the files ~/.hypertocrc and ./.hypertocrc are checked for options. The format is as follows: Lines starting with # are comments. Lines enclosed in PoD markers are also comments. Blank lines are ignored. The options themselves should be given the way they would be on the command line, that is, the option name (I the --) followed by its value (if any). 
For example: # set the ToC to be three-level --toc_entry H1=1 --toc_entry H2=2 --toc_entry H3=3 --toc_end H1=/H1 --toc_end H2=/H2 --toc_end H3=/H3 Option files can be nested, by giving an --argfile I argument inside the option file, it will go and get that referred file as well. See L for more information. =head1 DETAILS =head2 Significant Elements Here are some examples of defining the significant elements for your Table of Contents. =head3 Example of Default The following reflects the default setting if nothing is explicitly specified: --toc_entry "H1=1" --toc_end "H1=/H1" --toc_entry "H2=2" --toc_end "H2=/H2" Or, if it was defined in one of the possible L: # default settings --toc_entry H1=1 --toc_end H1=/H1 --toc_entry H2=2 --toc_end H2=/H2 =head3 Example of before/after The following options make use of the before/after options: # An options file that adds some formatting # make level 1 ToC entries --toc_entry H1=1 --toc_end H1=/H1 --toc_before H1= --toc_after H1= # make level 2 ToC entries --toc_entry H2=2 --toc_end H2=/H2 --toc_before H2= --toc_after H2= # Make level 3 entries as is --toc_entry H3=3 --toc_end H3=/H3 =head3 Example of custom end The following options try to index definition terms: # An options file that can work for Glossary type documents --toc_entry H1=1 --toc_end H1=/H1 --toc_entry H2=2 --toc_end H2=/H2 # Assumes document has a DD for each DT, otherwise ToC # will get entries with alot of text. --toc_entry DT=3 --toc_end DT=DD --toc_before DT= --toc_after DT= =head2 Formatting the ToC The --toc_entry etc. options give you control on how the ToC entries may look, but there are other options to affect the final appearance of the ToC file created. With the --header option, the contents of the given file will be prepended before the generated ToC. This allows you to have introductory text, or any other text, before the ToC. 

=over

=item Note:

If you use the --header option, make sure the file specified contains the opening HTML tag, the HEAD element (containing the TITLE element), and the opening BODY tag. However, these tags/elements should not be in the header file if the --inline option is used. See L</Inlining the ToC> for information on what the header file should contain for inlining the ToC.

=back

With the --toc_label option, the contents of the given string will be prepended before the generated ToC (but after any text taken from a --header file).

With the --footer option, the contents of the file will be appended after the generated ToC.

=over

=item Note:

If you use the --footer option, make sure the file includes the closing BODY and HTML tags (unless, of course, you are using the --inline option).

=back

If the --header option is not specified, the appropriate starting HTML markup will be added, unless the --toc_only option is specified. If the --footer option is not specified, the appropriate closing HTML markup will be added, unless the --toc_only option is specified.

If you do not want or need to deal with header and footer files, you can instead specify the title of the ToC file with the --title option, and a heading, or label, to put before the list of ToC entries with the --toc_label option. Both options have default values; see L</OPTIONS> for more information on each option.

If you do not want HTML page tags to be supplied, and just want the ToC itself, then specify the --toc_only option. If there are no --header or --footer files, then this will simply output the contents of --toc_label and the ToC itself.

=head2 Inlining the ToC

The ability to incorporate the ToC directly into an HTML document is supported via the --inline option.

Inlining will be done on the first file in the list of files processed, and will only be done if that file contains an opening tag matching the --toc_tag value.
If --overwrite is true, then the first file in the list will be overwritten, with the generated ToC inserted at the appropriate spot. Otherwise a modified version of the first file is output to either STDOUT or to the output file defined by the --outfile option. The options --toc_tag and --toc_tag_replace are used to determine where and how the ToC is inserted into the output. =head3 Example 1 # this is the default --toc_tag BODY --notoc_tag_replace This will put the generated ToC after the BODY tag of the first file. If the --header option is specified, then the contents of the specified file are inserted after the BODY tag. If the --toc_label option is not empty, then the text specified by the --toc_label option is inserted. Then the ToC is inserted, and finally, if the --footer option is specified, it inserts the footer. Then the rest of the input file follows as it was before. =head3 Example 2 --toc_tag '!--toc--' --toc_tag_replace This will put the generated ToC after the first comment of the form , and that comment will be replaced by the ToC (in the order --header --toc_label ToC --footer) followed by the rest of the input file. =over =item Note: The header file should not contain the beginning HTML tag and HEAD element since the HTML file being processed should already contain these tags/elements. =back =head1 EXAMPLES =head2 Create an inline ToC for one file hypertoc --inline --make_anchors --overwrite --make_toc index.html This will create anchors in C, create a ToC with a heading of "Table of Contents" and place it after the BODY tag of C. The file index.html.org will contain the original index.html file, without ToC or anchors. =head2 Create a ToC file from multiple files First, create the anchors. 
hypertoc --make_anchors --overwrite index.html fred.html george.html Then create the ToC hypertoc --make_toc --outfile table.html index.html fred.html george.html =head2 Create an inline ToC after the first heading of the first file hypertoc --make_anchors --inline --overwrite --make_toc --toc_tag /H1 \ --notoc_tag_replace --toc_label "" index.html fred.html george.html This will create anchors in the C, C and C files, create a ToC with no header and place it after the first H1 header in C and back up the original files to C, C and C =head2 Create an inline ToC with custom elements hypertoc --quiet --make_anchors --bak "" --overwrite \ --make_toc --inline --toc_label "" --toc_tag '!--toc--' \ --toc_tag_replace \ --toc_entry H2=1 --toc_entry H3=2 \ --toc_end H2=/H2 --toc_end H3=/H3 myfile.html This will create an inline ToC overwriting the original file, and replacing a comment, and which takes H2 headers as level 1 and H3 headers as level 2. This can be useful where the .html file is generated by some other process, and you can then create the ToC as the last step. =head2 Create a ToC with custom elements hypertoc --quiet --make_anchors --bak "" --overwrite \ --toc_entry TITLE=1 --toc_end TITLE=/TITLE --toc_entry H2=2 --toc_entry H3=3 \ --toc_end H2=/H2 --toc_end H3=/H3 \ --make_toc --outfile index.html \ mary.html fred.html george.html This creates anchors at the H2 and H3 elements, and creates a ToC file called index.html, indexing on the TITLE, and the H2 and H3 elements. 
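
The repeated --toc_entry and --toc_end options in the example above accumulate into hashes rather than overwriting one another. The following is a minimal sketch of that behaviour, assuming only the core Getopt::Long module (its C<GetOptionsFromArray> function and the C<=s%> hash specifier, which is also the specifier hypertoc's own option table uses). The C<parse_toc_options> helper is a hypothetical name introduced for illustration and is not part of hypertoc.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper (not part of hypertoc): parse hypertoc-style
# arguments into an options hash plus the leftover file names.
sub parse_toc_options {
    my @argv = @_;
    my %opts;
    GetOptionsFromArray(\@argv, \%opts,
        'toc_entry=s%',    # cumulative hash option: TAG=LEVEL
        'toc_end=s%',      # cumulative hash option: TAG=ENDTAG
    ) or die "bad options\n";
    # Anything that was not an option is treated as an input file.
    return (\%opts, \@argv);
}

my ($opts, $files) = parse_toc_options(
    '--toc_entry', 'TITLE=1', '--toc_end', 'TITLE=/TITLE',
    '--toc_entry', 'H2=2',    '--toc_entry', 'H3=3',
    '--toc_end',   'H2=/H2',  '--toc_end',   'H3=/H3',
    'mary.html', 'fred.html', 'george.html',
);

# Show the accumulated tag => level and tag => end-tag pairs.
for my $tag (sort keys %{ $opts->{toc_entry} }) {
    printf "%-5s level %s, ends at %s\n",
        $tag, $opts->{toc_entry}{$tag}, $opts->{toc_end}{$tag};
}
print "files: @$files\n";
```

The two resulting hash references have the same shape as the toc_entry and toc_end arguments documented for HTML::GenToc, which is why each --toc_entry on the command line should be paired with a --toc_end for the same tag.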
=head2 Create a ToC with custom elements and options file Given an options file called 'custom.opt' as follows: # Title, H2 and H3 --toc_entry TITLE=1 --toc_end TITLE=/TITLE --toc_entry H2=2 --toc_end H2=/H2 --toc_entry H3=3 --toc_end H3=/H3 then the previous example can have shorter command lines as follows: hypertoc --quiet --make_anchors --bak "" --overwrite \ --argfile custom.opt --make_toc --outfile index.html mary.html fred.html george.html =head1 NOTES =over =item * hypertoc is smart enough to detect anchors inside significant elements. If the anchor defines the NAME attribute, hypertoc uses the value. Else, it adds its own NAME attribute to the anchor. If --use_id is true, then it likewise checks for and uses IDs. =item * The TITLE element is treated specially if specified as a significant element. It is illegal to insert anchors (A) into TITLE elements. Therefore, hypertoc will actually link to the filename itself instead of the TITLE element of the document. =item * hypertoc will ignore a significant element if it does not contain any non-whitespace characters. A warning message is generated if such a condition exists. =item * If you have a sequence of significant elements that change in a slightly disordered fashion, such as H1 -> H3 -> H2 or even H2 -> H1, though hypertoc deals with this to create a list which is still good HTML, if you are using an ordered list to that depth, then you will get strange numbering, as an extra list element will have been inserted to nest the elements at the correct level. For example (H2 -> H1 with --ol_num_levels=1): 1. * My H2 Header 2. My H1 Header For example (H1 -> H3 -> H2 with --ol_num_levels=0 and H3 also being significant): 1. My H1 Header 1. 1. My H3 Header 2. My H2 Header 2. My Second H1 Header In cases such as this it may be better not to use the --ol option. 
=item * If one is not using --overwrite when generating anchors, then the command needs to be done in two passes, in order to give the correct filenames (the ones with the actual anchors in them) to the ToC generation part. Otherwise the ToC will have anchors pointing to files that don't have them. =item * When using --inline, care needs to be taken if overwriting -- if one sets the ToC to be included after a given tag (such as the default BODY) then if one runs the command repeatedly one could get multiple ToCs in the same file, one after the other. =back =head1 CAVEATS =over =item * Version 3.10 (and above) generates more verbose (SEO-friendly) anchors than prior versions. Thus anchors generated with earlier versions will not match version 3.10 anchors. =item * Version 3.00 (and above) of hypertoc behaves somewhat differently than Version 2.x of hypertoc. It is now designed to do everything in one pass, and has dropped certain options: the --infile option is no longer used (all filenames are put at the end of the command); the --toc_file option no longer exists; use the --outfile option instead; the --tocmap option is no longer supported. It now generates lower-case tags rather than upper-case ones. =item * hypertoc is not very efficient (memory and speed), and can be slow for large documents. =item * Now that generation of anchors and of the ToC are done in one pass, even more memory is used than was the case before. This is more notable when processing multiple files, since all files are read into memory before processing them. =item * Invalid markup will be generated if a significant element is contained inside of an anchor. For example:

    <a name="foo"><h1>The FOO command</h1></a>

will be converted to (if h1 is a significant element),

    <a name="foo"><h1><a name="xtocidX">The</a> FOO command</h1></a>

which is illegal since anchors cannot be nested.

It is better style to put anchor statements within the element to be anchored. For example, the following is preferred:

    <h1><a name="foo">The FOO command</a></h1>

hypertoc will detect the "foo" NAME and use it.

Even better is to use IDs:

    <h1 id="foo">The FOO command</h1>

    =item * NAME attributes without quotes are not recognized. =back =head1 BUGS Tell me about them. =head1 REQUIRES Getopt::Long Getopt::ArgvFile File::Basename Pod::Usage HTML::LinkList HTML::Entities HTML::GenToc =head1 SCRIPT CATEGORIES Web =head1 ENVIRONMENT =over =item HOME hypertoc looks in the HOME directory for config files. =back =head1 FILES =over =item C<~/.hypertocrc> User configuration file. =item C<.hypertocrc> Configuration file in the current working directory; overrides options in C<~/.hypertocrc> and is overridden by command-line options. =back =head1 SEE ALSO perl(1) htmltoc(1) HTML::GenToc Getopt::ArgvFile Getopt::Long =head1 AUTHOR Kathryn Andersen http://www.katspace.org/tools/hypertoc/ Based on htmltoc by Earl Hood ehood AT medusa.acs.uci.edu Contributions from Dan Dascalescu, =head1 COPYRIGHT Copyright (C) 1994-1997 Earl Hood, ehood AT medusa.acs.uci.edu Copyright (C) 2002-2008 Kathryn Andersen This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 

=cut

#################################################################
# Includes
use Getopt::Long 2.34;
use Pod::Usage;
use File::Basename;
use HTML::GenToc;

#################################################################
# Subroutines

# Set up the default argument values.
sub init_data ($) {
    my $data_ref = shift;

    my %args = ();
    $args{manpage} = 0;
    $args{debug} = 0;
    $args{quiet} = 0;

    $data_ref->{args} = \%args;
}

# Read options from rc files and the command line, then build the
# HTML::GenToc object from them.
sub process_args ($) {
    my $data_ref = shift;
    my $args_ref = $data_ref->{args};

    my $ok = 1;

    # check the rc file if we can
    if (eval("require Getopt::ArgvFile")) {
        my $nameBuilder=sub
        {
            my $bn = basename($_[0], '');
            [".${bn}rc", ".${bn}/config", ".config/${bn}/config"];
        };
        Getopt::ArgvFile::argvFile(
            startupFilename=>$nameBuilder,
            fileOption=>'argfile',
            home=>1,
            current=>1);
    }

    my $op = new Getopt::Long::Parser;
    $op->configure(qw(auto_version auto_help));
    # keep the parser's status so that bad options trigger the usage message
    $ok = $op->getoptions($args_ref,
        'manpage|man_help',
        'debug',
        'quiet!',
        'bak=s',
        'entrysep=s',
        'footer=s',
        'ignore_sole_first!',
        'inline!',
        'header=s',
        'notoc_match=s',
        'ol|ordered_list!',
        'ol_num_levels=n',
        'overwrite!',
        'textonly!',
        'title=s',
        'toclabel|toc_label=s',
        'tocmap=s',
        'outfile|toc_file|tocfile=s',
        'toc_tag|toctag=s',
        'toc_tag_replace!',
        'toc_only!',
        'toc_entry=s%',
        'toc_end=s%',
        'toc_before=s%',
        'toc_after=s%',
        'use_id!',
        'useorg!',
        'make_toc|gen_toc|generate_toc',
        'make_anchors|gen_anchors|generate_anchors',
    );
    if (!$ok)
    {
        pod2usage({ -message => "$0",
                    -exitval => 1,
                    -verbose => 0,
            });
    }

    if ($args_ref->{'manpage'})
    {
        pod2usage({ -message => "$0",
                    -exitval => 0,
                    -verbose => 2,
            });
    }

    # transfer script-only things to the data-ref
    $data_ref->{make_anchors} = $args_ref->{make_anchors};
    undef $args_ref->{make_anchors};
    $data_ref->{make_toc} = $args_ref->{make_toc};
    undef $args_ref->{make_toc};
    undef $args_ref->{manpage};

    # make the object
    my $toc = HTML::GenToc->new(%{$args_ref});
    $data_ref->{toc} = $toc;
}

#################################################################
# Main

MAIN: {
    my %data = ();
    my $result = 0;
    init_data(\%data);
    process_args(\%data);

    # Now the remainder must be input-files
    my @infiles = ();
    push @infiles, @ARGV;
    if (!$data{toc}->{quiet})
    {
        print STDERR "To process: ", join(" ", @infiles), "\n";
    }
    $result = $data{toc}->generate_toc(
        make_anchors=>$data{make_anchors},
        make_toc=>$data{make_toc},
        input=>\@infiles,
    );
    if ($result)
    {
        exit 0;
    }
    else
    {
        exit 1;
    }
}

# vim: sw=4 sts=4 ai

HTML-GenToc-3.20/tfiles/test7.html

Out of Death Was I Born...

    Out of Death Was I Born...

    The Blake's 7 crew in a Sime~Gen universe

    It all began with a typo.

    Crew dynamics:
    If Avon, Cally, and Vila are Simes and Blake, Jenna and Gan are Gens, we have a stable first and second-season interaction. Orac doesn't affect the equation, but Gan's death certainly does.

    At the start of the third season, we still have Avon, Cally and Vila, and they're joined by Tarrant and Dayna. Tarrant's a Sime, and I'd assume that weapons-specialist Dayna is also. And naturally Soolin is, too. This in itself goes a long way toward explaining the difference between the first two seasons and the second two...first and second season, they're a balanced set of transfer partners; third and fourth seasons they're a hunting pack of Sime raiders. (MD)

    When Cally died, the situation became less stable, because they'd lost their one strong Channel, so they did have to raid. (KA)

    The Federation:
    I don't think a Sime~Gen Federation would be like the Tecton; perhaps one reason why the Federation is into conquering is the neverending search for more Gens... (Yes, assumption that planets had been settled by Simes and Gens, not by Ancients) Some planets could well be Gen enclaves, but the Federation would be Sime-dominated.

    There would obviously have to be a Channel distribution system in place, or the Federation would have succumbed to Zelerod's Doom long ago, but I expect that killing isn't illegal either. It's just really really expensive. Choice kills. Ohnj Verlis probably didn't (just) deal in slaves, but in Gens. Gen relatives of deserters would not be sent to the slave pits, they would be sent to the Gen pits; not for death, but to be milked by the government Channels until they died. Gens would be second-class citizens, but I don't think they would be classed as non-persons. All channels would be forced to work for the government - but Cally, not being a Federation citizen, would have received different Channel training - Auron-style training, whatever that was. (KA)

    Huh? What is this Sime~Gen stuff?

    For those of you completely confused, a small explanation must be given. The Sime~Gen universe is a fictional creation of Jacqueline Lichtenberg in which humanity has mutated into two forms - the selyn-producing Gens, and the tentacled (on their arms) Simes which feed on selyn. Selyn is a kind of life-energy which Simes can detect. Unfortunately, for most Simes, taking the selyn from a Gen kills the Gen. A Channel is a Sime who can take selyn from a Gen without killing them, and channel it to other Simes. Some Gens also, with proper talent and training, are able to donate selyn to any sime (not just Channels) without being killed.

    If you want to know more, check out the Sime~Gen pages.

HTML-GenToc-3.20/tfiles/test2.html

    The Title

    KatSpace
    KatSpace
    Last touched 2002-01-26 11:47:22


     

    Wow

    Authors

    Kathryn Andersen < kitty@example.com> is the author of HTML::GenToc.

    Earl Hood is the author of htmltoc.  

    Golly

    Mary had a little lamb
    Its fleece was white as snow
    And everywhere that Mary went
    The lamb was sure to go.

    Suppression

    These are things which no eye hast seen, not ear heard, neither hast they spoken of it unto any man, fish, nor fowl.


      *  <<  *  <  *  >  *  >>  *   *   Top of Page   *   About Site   *   Pretty Logos   *   Validate Me!   *  
HTML-GenToc-3.20/tfiles/test6.html

    Weather

    Weather

    Storm

    Wind

    * Overfiend sighs
    <Overfiend> Netscape sucks.
    <Overfiend> It is a house of cards resting on a bed of quicksand.
    <Espy> during an earthquake
    <Overfiend> in a tornado

    Rain

    rain falls where clouds come
    sun shines where clouds go
    clouds just come and go

    -- Florian Gutzwiller

    Rainy days and Mondays always get me down.

    Rainy days and automatic weapons always get me down.

    Ho! Ho! Ho! to the bottle I go
    To heal my heart and drown my woe.
    Rain may fall and wind may blow,
    And many miles be still to go,
    But under a tall tree I will lie,
    And let the clouds go sailing by.

    -- J. R. R. Tolkien

    "Rain. Frost in low lying areas. Temperature 60 degrees. Suggest t-shirt, vest, flannel shirt, and blue jeans. Shoes, optional."

    -- Computer to Sydney (VR.5: VR.5)

    Sun

    A day without sunshine is like a day without orange juice.

    A day without sunshine is like night.

    Simon: You think he's here to buy the ice? Gaines: This time of year, he ain't coming for the sunshine.

    (The Sentinel: The Debt)

HTML-GenToc-3.20/tfiles/test4.html

    Wow

    Wow

    Authors

    Kathryn Andersen <kitty@example.com> is the author of HTML::GenToc.

    Earl Hood is the author of htmltoc.

    Marytude

    Mary Mary

    Mary had a little lamb
    Its fleece was white as snow
    And everywhere that Mary went
    The lamb was sure to go.

    Mary Mary

    In every job that must be done, there is an element of fun.
    Find the fun and snap! The job's a game.
    And every task you undertake, becomes a piece of cake,
    a lark, a spree; it's very clear to see.
      -- Mary Poppins

    Suppression

    These are things which no eye hast seen, not ear heard, neither hast they spoken of it unto any man, fish, nor fowl.

    Emergence

    Whenever the literary German dives into a sentence, that is the last you are going to see of him until he emerges on the other side of the Atlantic with his verb in his mouth.
      -- Mark Twain "A Connecticut Yankee in King Arthur's Court"
HTML-GenToc-3.20/lib/HTML/GenToc.pm

package HTML::GenToc;
BEGIN {
  $HTML::GenToc::VERSION = '3.20';
}
use strict;

=head1 NAME

HTML::GenToc - Generate a Table of Contents for HTML documents.

=head1 VERSION

version 3.20

=head1 SYNOPSIS

  use HTML::GenToc;

  # create a new object
  my $toc = new HTML::GenToc();

  my $toc = new HTML::GenToc(title=>"Table of Contents",
                             toc_entry=>{
                                 H1=>1,
                                 H2=>2
                             },
                             toc_end=>{
                                 H1=>'/H1',
                                 H2=>'/H2'
                             }
      );

  # generate a ToC from a file
  $toc->generate_toc(input=>$html_file,
                     footer=>$footer_file,
                     header=>$header_file
      );

=head1 DESCRIPTION

HTML::GenToc generates anchors and a table of contents for HTML documents. Depending on the arguments, it will insert the information it generates, or output to a string, a separate file or STDOUT.

While it defaults to taking H1 and H2 elements as the significant elements to put into the table of contents, any tag can be defined as a significant element. Also, it doesn't matter if the input HTML code is complete, pure HTML; one can input pseudo-html or page-fragments, which makes it suitable for using on templates and HTML meta-languages such as WML.

Also included in the distribution is hypertoc, a script which uses the module so that one can process files on the command-line in a user-friendly manner.

=head1 DETAILS

The ToC generated is a multi-level list containing links to the significant elements. HTML::GenToc inserts the links into the ToC to significant elements at a level specified by the user.

B<Example:>

If H1s are specified as level 1, then they appear in the first level list of the ToC. If H2s are specified as a level 2, then they appear in a second level list in the ToC.

Information on the significant elements and the level at which they should occur is passed in to the methods used by this object, or one can use the defaults.
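
The level mapping described above can be pictured with a short sketch. This is not the module's implementation; it is a minimal illustration, assuming a hypothetical list of already-extracted headings and a hypothetical C<outline> helper, of how a toc_entry-style hash decides which tags appear in the ToC and how deeply each one is nested.

```perl
use strict;
use warnings;

# Same shape as the default toc_entry hash: TAG => LEVEL.
my %toc_entry = (H1 => 1, H2 => 2);

# Hypothetical, already-extracted headings as [TAG, TEXT] pairs.
my @headings = (
    [H1 => 'Weather'],
    [H2 => 'Storm'],
    [H2 => 'Rain'],
    [H3 => 'Drizzle'],    # H3 is not in %toc_entry, so it is skipped
    [H1 => 'Seasons'],
);

# Illustrative helper (not part of HTML::GenToc): indent each
# significant heading by two spaces per level below the first.
sub outline {
    my ($levels, @items) = @_;
    my @lines;
    for my $item (@items) {
        my ($tag, $text) = @$item;
        # Tags without a level are not significant elements.
        my $level = $levels->{uc $tag} or next;
        push @lines, ('  ' x ($level - 1)) . "* $text";
    }
    return @lines;
}

print "$_\n" for outline(\%toc_entry, @headings);
```

Adding C<< H3 => 3 >> to the hash would make the skipped heading appear as a third-level entry, which mirrors how extra tags are declared significant through toc_entry.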
There are two phases to the ToC generation. The first phase is to put suitable anchors into the HTML documents, and the second phase is to generate the ToC from HTML documents which have anchors in them for the ToC to link to. For more information on controlling the contents of the created ToC, see L. HTML::GenToc also supports the ability to incorporate the ToC into the HTML document itself via the B option. See L for more information. In order for HTML::GenToc to support linking to significant elements, HTML::GenToc inserts anchors into the significant elements. One can use HTML::GenToc as a filter, outputing the result to another file, or one can overwrite the original file, with the original backed up with a suffix (default: "org") appended to the filename. One can also output the result to a string. =head1 METHODS Default arguments can be set when the object is created, and overridden by setting arguments when the generate_toc method is called. Arguments are given as a hash of arguments. =cut use Data::Dumper; use HTML::SimpleParse; use HTML::Entities; use HTML::LinkList; ################################################################# #---------------------------------------------------------------# # Object interface #---------------------------------------------------------------# =head2 Method -- new $toc = new HTML::GenToc(); $toc = new HTML::GenToc(toc_entry=>\%my_toc_entry, toc_end=>\%my_toc_end, bak=>'bak', ... ); Creates a new HTML::GenToc object. These arguments will be used as defaults in invocations of other methods. See L for possible arguments. =cut sub new { my $invocant = shift; my $class = ref($invocant) || $invocant; # Object or class name my $self = { debug => 0, bak => 'org', entrysep => ', ', footer => '', inline => 0, header => '', input => '', notoc_match => 'class="notoc"', ol => 0, ol_num_levels => 1, overwrite => 0, outfile => '-', quiet => 0, textonly => 0, title => 'Table of Contents', toclabel => '

    <h1>Table of Contents</h1>

    ', toc_tag => '^BODY', toc_tag_replace => 0, toc_only => 0, # define TOC entry elements toc_entry => { 'H1'=>1, 'H2'=>2, }, # TOC entry element terminators toc_end => { 'H1'=>'/H1', 'H2'=>'/H2', }, useorg => 0, @_ }; # bless self bless($self, $class); if ($self->{debug}) { print STDERR Dumper($self); } return $self; } # new =head2 generate_toc $toc->generate_toc(outfile=>"index2.html"); my $result_str = $toc->generate_toc(to_string=>1); Generates a table of contents for the significant elements in the HTML documents, optionally generating anchors for them first. B =over =item bak bak => I If the input file/files is/are being overwritten (B is on), copy the original file to "I.I". If the value is empty, B backup file will be created. (default:org) =item debug debug => 1 Enable verbose debugging output. Used for debugging this module; in other words, don't bother. (default:off) =item entrysep entrysep => I Separator string for non-
  • item entries (default: ", ") =item filenames filenames => \@filenames The filenames to use when creating table-of-contents links. This overrides the filenames given in the B option, and is expected to have exactly the same number of elements. This can also be used when passing in string-content to the B option, to give a (fake) filename to use for the links relating to that content. =item footer footer => I Either the filename of the file containing footer text for ToC; or a string containing the footer text. =item header header => I Either the filename of the file containing header text for ToC; or a string containing the header text. =item ignore_only_one ignore_only_one => 1 If there would be only one item in the ToC, don't make a ToC. =item ignore_sole_first ignore_sole_first => 1 If the first item in the ToC is of the highest level, AND it is the only one of that level, ignore it. This is useful in web-pages where there is only one H1 header but one doesn't know beforehand whether there will be only one. =item inline inline => 1 Put ToC in document at a given point. See L for more information. =item input input => \@filenames input => $content This is expected to be either a reference to an array of filenames, or a string containing content to process. The three main uses would be: =over =item (a) you have more than one file to process, so pass in multiple filenames =item (b) you have one file to process, so pass in its filename as the only array item =item (c) you have HTML content to process, so pass in just the content as a string =back (default:undefined) =item notoc_match notoc_match => I If there are certain individual tags you don't wish to include in the table of contents, even though they match the "significant elements", then if this pattern matches contents inside the tag (not the body), then that tag will not be included, either in generating anchors nor in generating the ToC. 
(default: C<class="notoc">)

=item ol

    ol => 1

Use an ordered list for level 1 ToC entries.

=item ol_num_levels

    ol_num_levels => 2

The number of levels deep the OL listing will go if B<ol> is true. If set to zero, will use an ordered list for all levels. (default:1)

=item overwrite

    overwrite => 1

Overwrite the input file with the output. (default:off)

=item outfile

    outfile => I<file>

File to write the output to. This is where the modified HTML output goes to. Note that it doesn't make sense to use this option if you are processing more than one file. If you give '-' as the filename, then output will go to STDOUT. (default: STDOUT)

=item quiet

    quiet => 1

Suppress informative messages. (default: off)

=item textonly

    textonly => 1

Use only text content in significant elements.

=item title

    title => I<string>

Title for ToC page (if not using B<header>
      or B or B) (default: "Table of Contents") =item toc_after toc_after => \%toc_after_data %toc_after_data = { I => I, I => I }; toc_after => { H2=>'' } For defining layout of significant elements in the ToC. This expects a reference to a hash of tag=>suffix pairs. The I is the HTML tag which marks the start of the element. The I is what is required to be appended to the Table of Contents entry generated for that tag. (default: undefined) =item toc_before toc_before => \%toc_before_data %toc_before_data = { I => I, I => I }; toc_before=>{ H2=>'' } For defining the layout of significant elements in the ToC. The I is the HTML tag which marks the start of the element. The I is what is required to be prepended to the Table of Contents entry generated for that tag. (default: undefined) =item toc_end toc_end => \%toc_end_data %toc_end_data = { I => I, I => I }; toc_end => { H1 => '/H1', H2 => '/H2' } For defining significant elements. The I is the HTML tag which marks the start of the element. The I the HTML tag which marks the end of the element. When matching in the input file, case is ignored (but make sure that all your I options referring to the same tag are exactly the same!). =item toc_entry toc_entry => \%toc_entry_data %toc_entry_data = { I => I, I => I }; toc_entry => { H1 => 1, H2 => 2 } For defining significant elements. The I is the HTML tag which marks the start of the element. The I is what level the tag is considered to be. The value of I must be numeric, and non-zero. If the value is negative, consective entries represented by the significant_element will be separated by the value set by B option. =item toclabel toclabel => I HTML text that labels the ToC. Always used. (default: "

      <h1>Table of Contents</h1>

      ") =item toc_tag toc_tag => I If a ToC is to be included inline, this is the pattern which is used to match the tag where the ToC should be put. This can be a start-tag, an end-tag or a comment, but the E should be left out; that is, if you want the ToC to be placed after the BODY tag, then give "BODY". If you want a special comment tag to make where the ToC should go, then include the comment marks, for example: "!--toc--" (default:BODY) =item toc_tag_replace toc_tag_replace => 1 In conjunction with B, this is a flag to say whether the given tag should be replaced, or if the ToC should be put after the tag. This can be useful if your toc_tag is a comment and you don't need it after you have the ToC in place. (default:false) =item toc_only toc_only => 1 Output only the Table of Contents, that is, the Table of Contents plus the toclabel. If there is a B
      or a B<footer>