pax_global_header00006660000000000000000000000064131244511670014516gustar00rootroot0000000000000052 comment=a205ff00c83eb0b3fb68219c03fe44dd5a824d34 libkiwix-0.2.0/000077500000000000000000000000001312445116700133375ustar00rootroot00000000000000libkiwix-0.2.0/.travis.yml000066400000000000000000000004251312445116700154510ustar00rootroot00000000000000language: cpp dist: trusty sudo: required cache: ccache install: travis/install_deps.sh script: travis/compile.sh env: - PLATFORM="native_static" - PLATFORM="native_dyn" - PLATFORM="win32_static" - PLATFORM="win32_dyn" - PLATFORM="android_arm" - PLATFORM="android_arm64" libkiwix-0.2.0/COPYING000066400000000000000000001043741312445116700144030ustar00rootroot00000000000000 GNU GENERAL PUBLIC LICENSE Version 3, 29 June 2007 Copyright (C) 2007 Free Software Foundation, Inc. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The GNU General Public License is a free, copyleft license for software and other kinds of works. The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change all versions of a program--to make sure it remains free software for all its users. We, the Free Software Foundation, use the GNU General Public License for most of our software; it applies also to any other work released this way by its authors. You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things. 
To protect your rights, we need to prevent others from denying you these rights or asking you to surrender the rights. Therefore, you have certain responsibilities if you distribute copies of the software, or if you modify it: responsibilities to respect the freedom of others. For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it. For the developers' and authors' protection, the GPL clearly explains that there is no warranty for this free software. For both users' and authors' sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions. Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. This is fundamentally incompatible with the aim of protecting users' freedom to change the software. The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable. Therefore, we have designed this version of the GPL to prohibit the practice for those products. If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users. Finally, every program is threatened constantly by software patents. 
States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary. To prevent this, the GPL assures that patents cannot be used to render the program non-free. The precise terms and conditions for copying, distribution and modification follow. TERMS AND CONDITIONS 0. Definitions. "This License" refers to version 3 of the GNU General Public License. "Copyright" also means copyright-like laws that apply to other kinds of works, such as semiconductor masks. "The Program" refers to any copyrightable work licensed under this License. Each licensee is addressed as "you". "Licensees" and "recipients" may be individuals or organizations. To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work. A "covered work" means either the unmodified Program or a work based on the Program. To "propagate" a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well. To "convey" a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying. 
An interactive user interface displays "Appropriate Legal Notices" to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion. 1. Source Code. The "source code" for a work means the preferred form of the work for making modifications to it. "Object code" means any non-source form of a work. A "Standard Interface" means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language. The "System Libraries" of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A "Major Component", in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it. The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. 
However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work. The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source. The Corresponding Source for a work in source code form is that same work. 2. Basic Permissions. All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law. You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you. 
Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary. 3. Protecting Users' Legal Rights From Anti-Circumvention Law. No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures. When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures. 4. Conveying Verbatim Copies. You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program. You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee. 5. Conveying Modified Source Versions. You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions: a) The work must carry prominent notices stating that you modified it, and giving a relevant date. 
b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to "keep intact all notices". c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it. d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so. A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate. 6. Conveying Non-Source Forms. You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways: a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange. 
b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge. c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b. d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements. e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d. 
A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work. A "User Product" is either (1) a "consumer product", which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, "normally used" refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product. "Installation Information" for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made. If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. 
But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM). The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network. Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying. 7. Additional Terms. "Additional permissions" are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions. When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission. 
Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms: a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or d) Limiting the use for publicity purposes of names of licensors or authors of the material; or e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors. All other non-permissive additional terms are considered "further restrictions" within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying. 
If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms. Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way. 8. Termination. You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11). However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation. Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice. Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10. 9. Acceptance Not Required for Having Copies. You are not required to accept this License in order to receive or run a copy of the Program. 
Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so. 10. Automatic Licensing of Downstream Recipients. Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License. An "entity transaction" is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts. You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it. 11. Patents. A "contributor" is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. 
The work thus licensed is called the contributor's "contributor version". A contributor's "essential patent claims" are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, "control" includes the right to grant patent sublicenses in a manner consistent with the requirements of this License. Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version. In the following three paragraphs, a "patent license" is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To "grant" such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party. If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. 
"Knowingly relying" means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid. If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it. A patent license is "discriminatory" if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007. Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law. 12. No Surrender of Others' Freedom. 
If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program. 13. Use with the GNU Affero General Public License. Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such. 14. Revised Versions of this License. The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation. 
If the Program specifies that a proxy can decide which future versions of the GNU General Public License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Program. Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version. 15. Disclaimer of Warranty. THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 16. Limitation of Liability. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 17. Interpretation of Sections 15 and 16. 
If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see . Also add information on how to contact you by electronic and paper mail. If the program does terminal interaction, make it output a short notice like this when it starts in an interactive mode: Copyright (C) This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. 
The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, your program's commands might be different; for a GUI interface, you would use an "about box". You should also get your employer (if you work as a programmer) or school, if any, to sign a "copyright disclaimer" for the program, if necessary. For more information on this, and how to apply and follow the GNU GPL, see . The GNU General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. But first, please read .
libkiwix-0.2.0/ChangeLog000066400000000000000000000012151312445116700151100ustar00rootroot00000000000000
kiwix-lib 0.2.0
===============

 * Generate the snippet from the article content if the snippet is not
   directly in the database.
   This provides better snippets, as they now depend on the query.
 * Use the stopwords and the language stored in the fulltext index database to
   parse the user query.
 * Remove the indexer functionality.
 * Move to the C++11 standard.
 * Use the fulltext search of the zimlib.
   We still have the fulltext search code in kiwix-lib to be able to search in
   a fulltext index stored beside a zim file. (To be removed in the future.)
 * A few API changes:
   * Change a lot of `Reader` methods to const methods.
 * Fix some crashes.

libkiwix-0.2.0/README.md000066400000000000000000000064761312445116700146330ustar00rootroot00000000000000
Kiwix library
=============

The Kiwix library provides the Kiwix software core. It contains the code shared by all Kiwix ports (Windows, Linux, OSX, Android, ...).

Disclaimer
----------

This document assumes you have a little knowledge about software compilation.
If you experience difficulties with the dependencies or with the Kiwix library compilation itself, we recommend having a look at [kiwix-build](https://github.com/kiwix/kiwix-build).

Preamble
--------

Although the Kiwix library can be compiled/cross-compiled on/for many systems, the following documentation explains how to do it on POSIX ones. It is primarily intended for GNU/Linux systems and has been tested on recent releases of Ubuntu and Fedora.

Dependencies
------------

The Kiwix library relies on many third-party software libraries. They are prerequisites to the Kiwix library compilation. The following libraries need to be available:

* ICU ................................... http://site.icu-project.org/ (package libicu-dev on Ubuntu)
* ZIM ........................................ http://www.openzim.org/ (package libzim-dev on Ubuntu)
* Pugixml ........................................ http://pugixml.org/ (package libpugixml-dev on Ubuntu)
* ctpp2 ........................................ http://ctpp.havoc.ru/ (package libctpp2-dev on Ubuntu)
* Xapian ......................................... https://xapian.org/ (package libxapian-dev on Ubuntu)

These dependencies may or may not be packaged by your operating system. They may also be packaged but only in an older version. The compilation script will tell you if one of them is missing or too old. In the worst case, you will have to download and compile a bleeding-edge version by hand.

If you want to install these dependencies locally, then use the kiwix-lib directory as the install prefix.

If you compile ctpp2 from source and want to compile the Kiwix library statically, then you will probably need to rename the ctpp2 static library from ctpp2-st.a to ctpp2.a.

Environment
-----------

The Kiwix library builds using [Meson](http://mesonbuild.com/) version 0.34 or higher. Meson itself relies on Ninja, pkg-config and a few other compilation tools.
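Before installing anything, it can help to see which of these tools are already on your `PATH`. The following is a minimal sketch rather than part of the official build instructions; the `check_tool` helper is a hypothetical name introduced here for illustration, and `ninja-build` is included because Ninja is packaged under that name on some systems.

```shell
# Hypothetical helper: report whether a command is available on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

# Check the build tools mentioned in this section, one line per tool.
for tool in meson ninja ninja-build pkg-config; do
  check_tool "$tool"
done
```

If Meson is reported as found, `meson --version` prints its version so that it can be compared against the 0.34 minimum.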
First install a few common compilation tools:

* Automake
* Libtool
* Virtualenv
* Pkg-config

Then install Meson itself:

```
virtualenv -p python3 ./ # Create virtualenv
source bin/activate      # Activate the virtualenv
pip install meson        # Install Meson
hash -r                  # Refresh bash paths
```

Finally download and build Ninja locally:

```
git clone git://github.com/ninja-build/ninja.git
cd ninja
git checkout release
./configure.py --bootstrap
mkdir ../bin
cp ninja ../bin
cd ..
```

Compilation
-----------

Once all dependencies are installed, you can compile kiwix-lib with:

```
mkdir build
meson . build
cd build
ninja
```

By default, it will compile dynamically linked libraries. If you want statically linked libraries, you can add the `--default-library=static` option to the Meson command.

Depending on your system, `ninja` may be called `ninja-build`.

Installation
------------

If you want to install the libraries you have just compiled on your system, run:

```
ninja install
cd ..
```

You might need to run the command as root, depending on where you want to install the libraries.

License
-------

GPLv3 or later, see COPYING for more details.

libkiwix-0.2.0/include/000077500000000000000000000000001312445116700147625ustar00rootroot00000000000000
libkiwix-0.2.0/include/common/000077500000000000000000000000001312445116700162525ustar00rootroot00000000000000
libkiwix-0.2.0/include/common/base64.h000066400000000000000000000002101312445116700175000ustar00rootroot00000000000000
#include std::string base64_encode(unsigned char const* , unsigned int len); std::string base64_decode(std::string const& s);
libkiwix-0.2.0/include/common/networkTools.h000066400000000000000000000024041312445116700211350ustar00rootroot00000000000000
/* * Copyright 2012 Emmanuel Engelhart * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version.
* * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. */ #ifndef KIWIX_NETWORKTOOLS_H #define KIWIX_NETWORKTOOLS_H #ifdef _WIN32 #include #include #else #include #include #include #include #include #include #include #include #include #endif #include #include #include #include namespace kiwix { std::map getNetworkInterfaces(); std::string getBestPublicIp(); } #endif libkiwix-0.2.0/include/common/otherTools.h000066400000000000000000000017041312445116700205670ustar00rootroot00000000000000/* * Copyright 2014 Emmanuel Engelhart * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. 
*/ #ifndef KIWIX_OTHERTOOLS_H #define KIWIX_OTHERTOOLS_H #ifdef _WIN32 #include #else #include #endif namespace kiwix { void sleep(unsigned int milliseconds); } #endif libkiwix-0.2.0/include/common/pathTools.h000066400000000000000000000036541312445116700204100ustar00rootroot00000000000000/* * Copyright 2011 Emmanuel Engelhart * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. 
*/ #ifndef KIWIX_PATHTOOLS_H #define KIWIX_PATHTOOLS_H #include #include #include #include #include #include #include #include #include #include #include #include #ifdef _WIN32 #include #endif #include "stringTools.h" using namespace std; bool isRelativePath(const string &path); string computeAbsolutePath(const string path, const string relativePath); string computeRelativePath(const string path, const string absolutePath); string removeLastPathElement(const string path, const bool removePreSeparator = false, const bool removePostSeparator = false); string appendToDirectory(const string &directoryPath, const string &filename); unsigned int getFileSize(const string &path); string getFileSizeAsString(const string &path); bool fileExists(const string &path); bool makeDirectory(const string &path); bool copyFile(const string &sourcePath, const string &destPath); string getLastPathElement(const string &path); string getExecutablePath(); string getCurrentDirectory(); bool writeTextFile(const string &path, const string &content); #endif libkiwix-0.2.0/include/common/regexTools.h000066400000000000000000000023111312445116700205530ustar00rootroot00000000000000/* * Copyright 2011 Emmanuel Engelhart * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. 
*/ #ifndef KIWIX_REGEXTOOLS_H #define KIWIX_REGEXTOOLS_H #include #include #include #include bool matchRegex(const std::string &content, const std::string &regex); std::string replaceRegex(const std::string &content, const std::string &replacement, const std::string &regex); std::string appendToFirstOccurence(const std::string &content, const std::string regex, const std::string &replacement); #endif libkiwix-0.2.0/include/common/stringTools.h000066400000000000000000000041361312445116700207560ustar00rootroot00000000000000/* * Copyright 2011-2012 Emmanuel Engelhart * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA.
*/ #ifndef KIWIX_STRINGTOOLS_H #define KIWIX_STRINGTOOLS_H #include #include #include #include #include #include #include "pathTools.h" namespace kiwix { #ifndef __ANDROID__ std::string beautifyInteger(const unsigned int number); std::string beautifyFileSize(const unsigned int number); std::string urlEncode(const std::string &c); void printStringInHexadecimal(const char *s); void printStringInHexadecimal(UnicodeString s); void stringReplacement(std::string& str, const std::string& oldStr, const std::string& newStr); std::string encodeDiples(const std::string& str); #endif std::string removeAccents(const std::string &text); void loadICUExternalTables(); std::string urlDecode(const std::string &c); std::vector split(const std::string&, const std::string&); std::vector split(const char*, const char*); std::vector split(const std::string&, const char*); std::vector split(const char*, const std::string&); std::string ucAll(const std::string &word); std::string lcAll(const std::string &word); std::string ucFirst(const std::string &word); std::string lcFirst(const std::string &word); std::string toTitle(const std::string &word); std::string normalize(const std::string &word); } #endif libkiwix-0.2.0/include/common/tree.h000066400000000000000000002421331312445116700173670ustar00rootroot00000000000000 // STL-like templated tree class. // // Copyright (C) 2001-2011 Kasper Peeters // Distributed under the GNU General Public License version 3. // // When used together with the htmlcxx library to create // HTML::Node template instances, the GNU Lesser General Public // version 2 applies. Special permission to use tree.hh under // the LGPL for other projects can be requested from the author. /** \mainpage tree.hh \author Kasper Peeters \version 2.81 \date 23-Aug-2011 \see http://tree.phi-sci.com/ \see http://tree.phi-sci.com/ChangeLog The tree.hh library for C++ provides an STL-like container class for n-ary trees, templated over the data stored at the nodes. 
Various types of iterators are provided (post-order, pre-order, and others). Where possible the access methods are compatible with the STL or alternative algorithms are available. */ #ifndef tree_hh_ #define tree_hh_ #include #include #include #include #include #include #include #include /// A node in the tree, combining links to other nodes as well as the actual data. template class tree_node_ { // size: 5*4=20 bytes (on 32 bit arch), can be reduced by 8. public: tree_node_(); tree_node_(const T&); tree_node_ *parent; tree_node_ *first_child, *last_child; tree_node_ *prev_sibling, *next_sibling; T data; }; // __attribute__((packed)); template tree_node_::tree_node_() : parent(0), first_child(0), last_child(0), prev_sibling(0), next_sibling(0) { } template tree_node_::tree_node_(const T& val) : parent(0), first_child(0), last_child(0), prev_sibling(0), next_sibling(0), data(val) { } template > > class tree { protected: typedef tree_node_ tree_node; public: /// Value of the data stored at a node. typedef T value_type; class iterator_base; class pre_order_iterator; class post_order_iterator; class sibling_iterator; class leaf_iterator; tree(); tree(const T&); tree(const iterator_base&); tree(const tree&); ~tree(); tree& operator=(const tree&); /// Base class for iterators, only pointers stored, no traversal logic. #ifdef __SGI_STL_PORT class iterator_base : public stlport::bidirectional_iterator { #else class iterator_base { #endif public: typedef T value_type; typedef T* pointer; typedef T& reference; typedef size_t size_type; typedef ptrdiff_t difference_type; typedef std::bidirectional_iterator_tag iterator_category; iterator_base(); iterator_base(tree_node *); T& operator*() const; T* operator->() const; /// When called, the next increment/decrement skips children of this node. void skip_children(); void skip_children(bool skip); /// Number of children of the node pointed to by the iterator. 
unsigned int number_of_children() const; sibling_iterator begin() const; sibling_iterator end() const; tree_node *node; protected: bool skip_current_children_; }; /// Depth-first iterator, first accessing the node, then its children. class pre_order_iterator : public iterator_base { public: pre_order_iterator(); pre_order_iterator(tree_node *); pre_order_iterator(const iterator_base&); pre_order_iterator(const sibling_iterator&); bool operator==(const pre_order_iterator&) const; bool operator!=(const pre_order_iterator&) const; pre_order_iterator& operator++(); pre_order_iterator& operator--(); pre_order_iterator operator++(int); pre_order_iterator operator--(int); pre_order_iterator& operator+=(unsigned int); pre_order_iterator& operator-=(unsigned int); }; /// Depth-first iterator, first accessing the children, then the node itself. class post_order_iterator : public iterator_base { public: post_order_iterator(); post_order_iterator(tree_node *); post_order_iterator(const iterator_base&); post_order_iterator(const sibling_iterator&); bool operator==(const post_order_iterator&) const; bool operator!=(const post_order_iterator&) const; post_order_iterator& operator++(); post_order_iterator& operator--(); post_order_iterator operator++(int); post_order_iterator operator--(int); post_order_iterator& operator+=(unsigned int); post_order_iterator& operator-=(unsigned int); /// Set iterator to the first child as deep as possible down the tree. 
void descend_all(); }; /// Breadth-first iterator, using a queue class breadth_first_queued_iterator : public iterator_base { public: breadth_first_queued_iterator(); breadth_first_queued_iterator(tree_node *); breadth_first_queued_iterator(const iterator_base&); bool operator==(const breadth_first_queued_iterator&) const; bool operator!=(const breadth_first_queued_iterator&) const; breadth_first_queued_iterator& operator++(); breadth_first_queued_iterator operator++(int); breadth_first_queued_iterator& operator+=(unsigned int); private: std::queue traversal_queue; }; /// The default iterator types throughout the tree class. typedef pre_order_iterator iterator; typedef breadth_first_queued_iterator breadth_first_iterator; /// Iterator which traverses only the nodes at a given depth from the root. class fixed_depth_iterator : public iterator_base { public: fixed_depth_iterator(); fixed_depth_iterator(tree_node *); fixed_depth_iterator(const iterator_base&); fixed_depth_iterator(const sibling_iterator&); fixed_depth_iterator(const fixed_depth_iterator&); bool operator==(const fixed_depth_iterator&) const; bool operator!=(const fixed_depth_iterator&) const; fixed_depth_iterator& operator++(); fixed_depth_iterator& operator--(); fixed_depth_iterator operator++(int); fixed_depth_iterator operator--(int); fixed_depth_iterator& operator+=(unsigned int); fixed_depth_iterator& operator-=(unsigned int); tree_node *top_node; }; /// Iterator which traverses only the nodes which are siblings of each other. 
class sibling_iterator : public iterator_base { public: sibling_iterator(); sibling_iterator(tree_node *); sibling_iterator(const sibling_iterator&); sibling_iterator(const iterator_base&); bool operator==(const sibling_iterator&) const; bool operator!=(const sibling_iterator&) const; sibling_iterator& operator++(); sibling_iterator& operator--(); sibling_iterator operator++(int); sibling_iterator operator--(int); sibling_iterator& operator+=(unsigned int); sibling_iterator& operator-=(unsigned int); tree_node *range_first() const; tree_node *range_last() const; tree_node *parent_; private: void set_parent_(); }; /// Iterator which traverses only the leaves. class leaf_iterator : public iterator_base { public: leaf_iterator(); leaf_iterator(tree_node *, tree_node *top=0); leaf_iterator(const sibling_iterator&); leaf_iterator(const iterator_base&); bool operator==(const leaf_iterator&) const; bool operator!=(const leaf_iterator&) const; leaf_iterator& operator++(); leaf_iterator& operator--(); leaf_iterator operator++(int); leaf_iterator operator--(int); leaf_iterator& operator+=(unsigned int); leaf_iterator& operator-=(unsigned int); private: tree_node *top_node; }; /// Return iterator to the beginning of the tree. inline pre_order_iterator begin() const; /// Return iterator to the end of the tree. inline pre_order_iterator end() const; /// Return post-order iterator to the beginning of the tree. post_order_iterator begin_post() const; /// Return post-order end iterator of the tree. post_order_iterator end_post() const; /// Return fixed-depth iterator to the first node at a given depth from the given iterator. fixed_depth_iterator begin_fixed(const iterator_base&, unsigned int) const; /// Return fixed-depth end iterator. fixed_depth_iterator end_fixed(const iterator_base&, unsigned int) const; /// Return breadth-first iterator to the first node at a given depth. breadth_first_queued_iterator begin_breadth_first() const; /// Return breadth-first end iterator. 
breadth_first_queued_iterator end_breadth_first() const; /// Return sibling iterator to the first child of given node. sibling_iterator begin(const iterator_base&) const; /// Return sibling end iterator for children of given node. sibling_iterator end(const iterator_base&) const; /// Return leaf iterator to the first leaf of the tree. leaf_iterator begin_leaf() const; /// Return leaf end iterator for entire tree. leaf_iterator end_leaf() const; /// Return leaf iterator to the first leaf of the subtree at the given node. leaf_iterator begin_leaf(const iterator_base& top) const; /// Return leaf end iterator for the subtree at the given node. leaf_iterator end_leaf(const iterator_base& top) const; /// Return iterator to the parent of a node. template static iter parent(iter); /// Return iterator to the previous sibling of a node. template iter previous_sibling(iter) const; /// Return iterator to the next sibling of a node. template iter next_sibling(iter) const; /// Return iterator to the next node at a given depth. template iter next_at_same_depth(iter) const; /// Erase all nodes of the tree. void clear(); /// Erase element at position pointed to by iterator, return incremented iterator. template iter erase(iter); /// Erase all children of the node pointed to by iterator. void erase_children(const iterator_base&); /// Insert empty node as last/first child of node pointed to by position. template iter append_child(iter position); template iter prepend_child(iter position); /// Insert node as last/first child of node pointed to by position. template iter append_child(iter position, const T& x); template iter prepend_child(iter position, const T& x); /// Append the node (plus its children) at other_position as last/first child of position. template iter append_child(iter position, iter other_position); template iter prepend_child(iter position, iter other_position); /// Append the nodes in the from-to range (plus their children) as last/first children of position. 
template iter append_children(iter position, sibling_iterator from, sibling_iterator to); template iter prepend_children(iter position, sibling_iterator from, sibling_iterator to); /// Short-hand to insert topmost node in otherwise empty tree. pre_order_iterator set_head(const T& x); /// Insert node as previous sibling of node pointed to by position. template iter insert(iter position, const T& x); /// Specialisation of previous member. sibling_iterator insert(sibling_iterator position, const T& x); /// Insert node (with children) pointed to by subtree as previous sibling of node pointed to by position. template iter insert_subtree(iter position, const iterator_base& subtree); /// Insert node as next sibling of node pointed to by position. template iter insert_after(iter position, const T& x); /// Insert node (with children) pointed to by subtree as next sibling of node pointed to by position. template iter insert_subtree_after(iter position, const iterator_base& subtree); /// Replace node at 'position' with other node (keeping same children); 'position' becomes invalid. template iter replace(iter position, const T& x); /// Replace node at 'position' with subtree starting at 'from' (do not erase subtree at 'from'); see above. template iter replace(iter position, const iterator_base& from); /// Replace string of siblings (plus their children) with copy of a new string (with children); see above sibling_iterator replace(sibling_iterator orig_begin, sibling_iterator orig_end, sibling_iterator new_begin, sibling_iterator new_end); /// Move all children of node at 'position' to be siblings, returns position. template iter flatten(iter position); /// Move nodes in range to be children of 'position'. template iter reparent(iter position, sibling_iterator begin, sibling_iterator end); /// Move all child nodes of 'from' to be children of 'position'. 
template iter reparent(iter position, iter from); /// Replace node with a new node, making the old node a child of the new node. template iter wrap(iter position, const T& x); /// Move 'source' node (plus its children) to become the next sibling of 'target'. template iter move_after(iter target, iter source); /// Move 'source' node (plus its children) to become the previous sibling of 'target'. template iter move_before(iter target, iter source); sibling_iterator move_before(sibling_iterator target, sibling_iterator source); /// Move 'source' node (plus its children) to become the node at 'target' (erasing the node at 'target'). template iter move_ontop(iter target, iter source); /// Merge with other tree, creating new branches and leaves only if they are not already present. void merge(sibling_iterator, sibling_iterator, sibling_iterator, sibling_iterator, bool duplicate_leaves=false); /// Sort (std::sort only moves values of nodes, this one moves children as well). void sort(sibling_iterator from, sibling_iterator to, bool deep=false); template void sort(sibling_iterator from, sibling_iterator to, StrictWeakOrdering comp, bool deep=false); /// Compare two ranges of nodes (compares nodes as well as tree structure). template bool equal(const iter& one, const iter& two, const iter& three) const; template bool equal(const iter& one, const iter& two, const iter& three, BinaryPredicate) const; template bool equal_subtree(const iter& one, const iter& two) const; template bool equal_subtree(const iter& one, const iter& two, BinaryPredicate) const; /// Extract a new tree formed by the range of siblings plus all their children. tree subtree(sibling_iterator from, sibling_iterator to) const; void subtree(tree&, sibling_iterator from, sibling_iterator to) const; /// Exchange the node (plus subtree) with its sibling node (do nothing if no sibling present). 
void swap(sibling_iterator it); /// Exchange two nodes (plus subtrees) void swap(iterator, iterator); /// Count the total number of nodes. size_t size() const; /// Count the total number of nodes below the indicated node (plus one). size_t size(const iterator_base&) const; /// Check if tree is empty. bool empty() const; /// Compute the depth to the root or to a fixed other iterator. static int depth(const iterator_base&); static int depth(const iterator_base&, const iterator_base&); /// Determine the maximal depth of the tree. An empty tree has max_depth=-1. int max_depth() const; /// Determine the maximal depth of the tree with top node at the given position. int max_depth(const iterator_base&) const; /// Count the number of children of node at position. static unsigned int number_of_children(const iterator_base&); /// Count the number of siblings (left and right) of node at iterator. Total nodes at this level is +1. unsigned int number_of_siblings(const iterator_base&) const; /// Determine whether node at position is in the subtrees with root in the range. bool is_in_subtree(const iterator_base& position, const iterator_base& begin, const iterator_base& end) const; /// Determine whether the iterator is an 'end' iterator and thus not actually pointing to a node. bool is_valid(const iterator_base&) const; /// Find the lowest common ancestor of two nodes, that is, the deepest node such that /// both nodes are descendants of it. iterator lowest_common_ancestor(const iterator_base&, const iterator_base &) const; /// Determine the index of a node in the range of siblings to which it belongs. unsigned int index(sibling_iterator it) const; /// Inverse of 'index': return the n-th child of the node at position. 
static sibling_iterator child(const iterator_base& position, unsigned int); /// Return iterator to the sibling indicated by index sibling_iterator sibling(const iterator_base& position, unsigned int); /// For debugging only: verify internal consistency by inspecting all pointers in the tree /// (which will also trigger a valgrind error in case something got corrupted). void debug_verify_consistency() const; /// Comparator class for iterators (compares pointer values; why doesn't this work automatically?) class iterator_base_less { public: bool operator()(const typename tree::iterator_base& one, const typename tree::iterator_base& two) const { return one.node < two.node; } }; tree_node *head, *feet; // head/feet are always dummy; if an iterator points to them it is invalid private: tree_node_allocator alloc_; void head_initialise_(); void copy_(const tree& other); /// Comparator class for two nodes of a tree (used for sorting and searching). template class compare_nodes { public: compare_nodes(StrictWeakOrdering comp) : comp_(comp) {}; bool operator()(const tree_node *a, const tree_node *b) { return comp_(a->data, b->data); } private: StrictWeakOrdering comp_; }; }; //template //class iterator_base_less { // public: // bool operator()(const typename tree::iterator_base& one, // const typename tree::iterator_base& two) const // { // txtout << "operatorclass<" << one.node < two.node << std::endl; // return one.node < two.node; // } //}; // template // bool operator<(const typename tree::iterator& one, // const typename tree::iterator& two) // { // txtout << "operator< " << one.node < two.node << std::endl; // if(one.node < two.node) return true; // return false; // } // // template // bool operator==(const typename tree::iterator& one, // const typename tree::iterator& two) // { // txtout << "operator== " << one.node == two.node << std::endl; // if(one.node == two.node) return true; // return false; // } // // template // bool operator>(const typename 
tree::iterator_base& one, // const typename tree::iterator_base& two) // { // txtout << "operator> " << one.node < two.node << std::endl; // if(one.node > two.node) return true; // return false; // } // Tree template tree::tree() { head_initialise_(); } template tree::tree(const T& x) { head_initialise_(); set_head(x); } template tree::tree(const iterator_base& other) { head_initialise_(); set_head((*other)); replace(begin(), other); } template tree::~tree() { clear(); alloc_.destroy(head); alloc_.destroy(feet); alloc_.deallocate(head,1); alloc_.deallocate(feet,1); } template void tree::head_initialise_() { head = alloc_.allocate(1,0); // MSVC does not have default second argument feet = alloc_.allocate(1,0); alloc_.construct(head, tree_node_()); alloc_.construct(feet, tree_node_()); head->parent=0; head->first_child=0; head->last_child=0; head->prev_sibling=0; //head; head->next_sibling=feet; //head; feet->parent=0; feet->first_child=0; feet->last_child=0; feet->prev_sibling=head; feet->next_sibling=0; } template tree& tree::operator=(const tree& other) { if(this != &other) copy_(other); return *this; } template tree::tree(const tree& other) { head_initialise_(); copy_(other); } template void tree::copy_(const tree& other) { clear(); pre_order_iterator it=other.begin(), to=begin(); while(it!=other.end()) { to=insert(to, (*it)); it.skip_children(); ++it; } to=begin(); it=other.begin(); while(it!=other.end()) { to=replace(to, it); to.skip_children(); it.skip_children(); ++to; ++it; } } template void tree::clear() { if(head) while(head->next_sibling!=feet) erase(pre_order_iterator(head->next_sibling)); } template void tree::erase_children(const iterator_base& it) { // std::cout << "erase_children " << it.node << std::endl; if(it.node==0) return; tree_node *cur=it.node->first_child; tree_node *prev=0; while(cur!=0) { prev=cur; cur=cur->next_sibling; erase_children(pre_order_iterator(prev)); // kp::destructor(&prev->data); alloc_.destroy(prev); 
		alloc_.deallocate(prev,1);
	}
	it.node->first_child=0;
	it.node->last_child=0;
	// std::cout << "exit" << std::endl;
}

template <class T, class tree_node_allocator>
template <class iter>
iter tree<T, tree_node_allocator>::erase(iter it)
{
	tree_node *cur=it.node;
	assert(cur!=head);
	iter ret=it;
	ret.skip_children();
	++ret;
	erase_children(it);
	if(cur->prev_sibling==0) {
		cur->parent->first_child=cur->next_sibling;
	}
	else {
		cur->prev_sibling->next_sibling=cur->next_sibling;
	}
	if(cur->next_sibling==0) {
		cur->parent->last_child=cur->prev_sibling;
	}
	else {
		cur->next_sibling->prev_sibling=cur->prev_sibling;
	}

	// kp::destructor(&cur->data);
	alloc_.destroy(cur);
	alloc_.deallocate(cur,1);
	return ret;
}

template <class T, class tree_node_allocator>
typename tree<T, tree_node_allocator>::pre_order_iterator tree<T, tree_node_allocator>::begin() const
{
	return pre_order_iterator(head->next_sibling);
}

template <class T, class tree_node_allocator>
typename tree<T, tree_node_allocator>::pre_order_iterator tree<T, tree_node_allocator>::end() const
{
	return pre_order_iterator(feet);
}

template <class T, class tree_node_allocator>
typename tree<T, tree_node_allocator>::breadth_first_queued_iterator tree<T, tree_node_allocator>::begin_breadth_first() const
{
	return breadth_first_queued_iterator(head->next_sibling);
}

template <class T, class tree_node_allocator>
typename tree<T, tree_node_allocator>::breadth_first_queued_iterator tree<T, tree_node_allocator>::end_breadth_first() const
{
	return breadth_first_queued_iterator();
}

template <class T, class tree_node_allocator>
typename tree<T, tree_node_allocator>::post_order_iterator tree<T, tree_node_allocator>::begin_post() const
{
	tree_node *tmp=head->next_sibling;
	if(tmp!=feet) {
		while(tmp->first_child)
			tmp=tmp->first_child;
	}
	return post_order_iterator(tmp);
}

template <class T, class tree_node_allocator>
typename tree<T, tree_node_allocator>::post_order_iterator tree<T, tree_node_allocator>::end_post() const
{
	return post_order_iterator(feet);
}

template <class T, class tree_node_allocator>
typename tree<T, tree_node_allocator>::fixed_depth_iterator tree<T, tree_node_allocator>::begin_fixed(const iterator_base& pos, unsigned int dp) const
{
	typename tree<T, tree_node_allocator>::fixed_depth_iterator ret;
	ret.top_node=pos.node;

	tree_node *tmp=pos.node;
	unsigned int curdepth=0;
	while(curdepth<dp) { // go down one level
		while(tmp->first_child==0) {
			if(tmp->next_sibling==0) {
				// try to walk up and then right again
				do {
					if(tmp==ret.top_node)
						throw std::range_error("tree: begin_fixed out of range");
					tmp=tmp->parent;
					if(tmp==0)
						throw std::range_error("tree: begin_fixed out of range");
					--curdepth;
				} while(tmp->next_sibling==0);
			}
			tmp=tmp->next_sibling;
		}
		tmp=tmp->first_child;
		++curdepth;
	}

	ret.node=tmp;
	return ret;
}

template <class T, class tree_node_allocator>
typename tree<T, tree_node_allocator>::fixed_depth_iterator tree<T, tree_node_allocator>::end_fixed(const iterator_base& pos, unsigned int dp) const
{
	assert(1==0); // FIXME: not correct yet: use is_valid() as a temporary workaround
	tree_node *tmp=pos.node;
	unsigned int curdepth=1;
	while(curdepth<dp) { // go down one level
		while(tmp->first_child==0) {
			tmp=tmp->next_sibling;
			if(tmp==0)
				throw std::range_error("tree: end_fixed out of range");
		}
		tmp=tmp->first_child;
		++curdepth;
	}
	return tmp;
}

template <class T, class tree_node_allocator>
typename tree<T, tree_node_allocator>::sibling_iterator tree<T, tree_node_allocator>::begin(const iterator_base& pos) const
{
	assert(pos.node!=0);
	if(pos.node->first_child==0) {
		return end(pos);
	}
	return pos.node->first_child;
}

template <class T, class tree_node_allocator>
typename tree<T, tree_node_allocator>::sibling_iterator tree<T, tree_node_allocator>::end(const iterator_base& pos) const
{
	sibling_iterator ret(0);
	ret.parent_=pos.node;
	return ret;
}

template <class T, class tree_node_allocator>
typename tree<T, tree_node_allocator>::leaf_iterator tree<T, tree_node_allocator>::begin_leaf() const
{
	tree_node *tmp=head->next_sibling;
	if(tmp!=feet) {
		while(tmp->first_child)
			tmp=tmp->first_child;
	}
	return leaf_iterator(tmp);
}

template <class T, class tree_node_allocator>
typename tree<T, tree_node_allocator>::leaf_iterator tree<T, tree_node_allocator>::end_leaf() const
{
	return leaf_iterator(feet);
}

template <class T, class tree_node_allocator>
typename tree<T, tree_node_allocator>::leaf_iterator tree<T, tree_node_allocator>::begin_leaf(const iterator_base& top) const
{
	tree_node *tmp=top.node;
	while(tmp->first_child)
		tmp=tmp->first_child;
	return leaf_iterator(tmp, top.node);
}

template <class T, class tree_node_allocator>
typename tree<T, tree_node_allocator>::leaf_iterator tree<T, tree_node_allocator>::end_leaf(const iterator_base& top) const
{
	return leaf_iterator(top.node, top.node);
}

template <class T, class tree_node_allocator>
template <typename iter>
iter tree<T, tree_node_allocator>::parent(iter position)
{
	assert(position.node!=0);
	return iter(position.node->parent);
}

template <class T, class tree_node_allocator>
template <typename iter>
iter tree<T, tree_node_allocator>::previous_sibling(iter position) const
{
	assert(position.node!=0);
	iter ret(position);
	ret.node=position.node->prev_sibling;
	return ret;
}

template <class T, class tree_node_allocator>
template <typename iter>
iter tree<T, tree_node_allocator>::next_sibling(iter position) const
{
	assert(position.node!=0);
	iter ret(position);
	ret.node=position.node->next_sibling;
	return ret;
}

template <class T, class tree_node_allocator>
template <typename iter>
iter tree<T, tree_node_allocator>::next_at_same_depth(iter position) const
{
	// We make use of a temporary fixed_depth
	// iterator to implement this.

	typename tree<T, tree_node_allocator>::fixed_depth_iterator tmp(position.node);

	++tmp;
	return iter(tmp);

	// assert(position.node!=0);
	// iter ret(position);
	//
	// if(position.node->next_sibling) {
	// 	ret.node=position.node->next_sibling;
	// }
	// else {
	// 	int relative_depth=0;
	//   upper:
	// 	do {
	// 		ret.node=ret.node->parent;
	// 		if(ret.node==0) return ret;
	// 		--relative_depth;
	// 	} while(ret.node->next_sibling==0);
	//   lower:
	// 	ret.node=ret.node->next_sibling;
	// 	while(ret.node->first_child==0) {
	// 		if(ret.node->next_sibling==0)
	// 			goto upper;
	// 		ret.node=ret.node->next_sibling;
	// 		if(ret.node==0) return ret;
	// 	}
	// 	while(relative_depth<0 && ret.node->first_child!=0) {
	// 		ret.node=ret.node->first_child;
	// 		++relative_depth;
	// 	}
	// 	if(relative_depth<0) {
	// 		if(ret.node->next_sibling==0) goto upper;
	// 		else goto lower;
	// 	}
	// }
	// return ret;
}

template <class T, class tree_node_allocator>
template <typename iter>
iter tree<T, tree_node_allocator>::append_child(iter position)
{
	assert(position.node!=head);
	assert(position.node!=feet);
	assert(position.node);

	tree_node *tmp=alloc_.allocate(1,0);
	alloc_.construct(tmp, tree_node_<T>());
	// kp::constructor(&tmp->data);
	tmp->first_child=0;
	tmp->last_child=0;

	tmp->parent=position.node;
	if(position.node->last_child!=0) {
		position.node->last_child->next_sibling=tmp;
	}
	else {
		position.node->first_child=tmp;
	}
	tmp->prev_sibling=position.node->last_child;
	position.node->last_child=tmp;
	tmp->next_sibling=0;
	return tmp;
}

template <class T, class tree_node_allocator>
template <typename iter>
iter tree<T, tree_node_allocator>::prepend_child(iter position)
{
	assert(position.node!=head);
	assert(position.node!=feet);
	assert(position.node);

	tree_node *tmp=alloc_.allocate(1,0);
	alloc_.construct(tmp, tree_node_<T>());
	// kp::constructor(&tmp->data);
	tmp->first_child=0;
	tmp->last_child=0;

	tmp->parent=position.node;
	if(position.node->first_child!=0) {
		position.node->first_child->prev_sibling=tmp;
	}
	else {
		position.node->last_child=tmp;
	}
	tmp->next_sibling=position.node->first_child;
	position.node->first_child=tmp;
	tmp->prev_sibling=0;
	return tmp;
}

template <class T, class tree_node_allocator>
template <class iter>
iter
tree::append_child(iter position, const T& x) { // If your program fails here you probably used 'append_child' to add the top // node to an empty tree. From version 1.45 the top element should be added // using 'insert'. See the documentation for further information, and sorry about // the API change. assert(position.node!=head); assert(position.node!=feet); assert(position.node); tree_node* tmp = alloc_.allocate(1,0); alloc_.construct(tmp, x); // kp::constructor(&tmp->data, x); tmp->first_child=0; tmp->last_child=0; tmp->parent=position.node; if(position.node->last_child!=0) { position.node->last_child->next_sibling=tmp; } else { position.node->first_child=tmp; } tmp->prev_sibling=position.node->last_child; position.node->last_child=tmp; tmp->next_sibling=0; return tmp; } template template iter tree::prepend_child(iter position, const T& x) { assert(position.node!=head); assert(position.node!=feet); assert(position.node); tree_node* tmp = alloc_.allocate(1,0); alloc_.construct(tmp, x); // kp::constructor(&tmp->data, x); tmp->first_child=0; tmp->last_child=0; tmp->parent=position.node; if(position.node->first_child!=0) { position.node->first_child->prev_sibling=tmp; } else { position.node->last_child=tmp; } tmp->next_sibling=position.node->first_child; position.node->first_child=tmp; tmp->prev_sibling=0; return tmp; } template template iter tree::append_child(iter position, iter other) { assert(position.node!=head); assert(position.node!=feet); assert(position.node); sibling_iterator aargh=append_child(position, value_type()); return replace(aargh, other); } template template iter tree::prepend_child(iter position, iter other) { assert(position.node!=head); assert(position.node!=feet); assert(position.node); sibling_iterator aargh=prepend_child(position, value_type()); return replace(aargh, other); } template template iter tree::append_children(iter position, sibling_iterator from, sibling_iterator to) { assert(position.node!=head); assert(position.node!=feet); 
assert(position.node); iter ret=from; while(from!=to) { insert_subtree(position.end(), from); ++from; } return ret; } template template iter tree::prepend_children(iter position, sibling_iterator from, sibling_iterator to) { assert(position.node!=head); assert(position.node!=feet); assert(position.node); iter ret=from; while(from!=to) { insert_subtree(position.begin(), from); ++from; } return ret; } template typename tree::pre_order_iterator tree::set_head(const T& x) { assert(head->next_sibling==feet); return insert(iterator(feet), x); } template template iter tree::insert(iter position, const T& x) { if(position.node==0) { position.node=feet; // Backward compatibility: when calling insert on a null node, // insert before the feet. } tree_node* tmp = alloc_.allocate(1,0); alloc_.construct(tmp, x); // kp::constructor(&tmp->data, x); tmp->first_child=0; tmp->last_child=0; tmp->parent=position.node->parent; tmp->next_sibling=position.node; tmp->prev_sibling=position.node->prev_sibling; position.node->prev_sibling=tmp; if(tmp->prev_sibling==0) { if(tmp->parent) // when inserting nodes at the head, there is no parent tmp->parent->first_child=tmp; } else tmp->prev_sibling->next_sibling=tmp; return tmp; } template typename tree::sibling_iterator tree::insert(sibling_iterator position, const T& x) { tree_node* tmp = alloc_.allocate(1,0); alloc_.construct(tmp, x); // kp::constructor(&tmp->data, x); tmp->first_child=0; tmp->last_child=0; tmp->next_sibling=position.node; if(position.node==0) { // iterator points to end of a subtree tmp->parent=position.parent_; tmp->prev_sibling=position.range_last(); tmp->parent->last_child=tmp; } else { tmp->parent=position.node->parent; tmp->prev_sibling=position.node->prev_sibling; position.node->prev_sibling=tmp; } if(tmp->prev_sibling==0) { if(tmp->parent) // when inserting nodes at the head, there is no parent tmp->parent->first_child=tmp; } else tmp->prev_sibling->next_sibling=tmp; return tmp; } template template iter 
tree::insert_after(iter position, const T& x) { tree_node* tmp = alloc_.allocate(1,0); alloc_.construct(tmp, x); // kp::constructor(&tmp->data, x); tmp->first_child=0; tmp->last_child=0; tmp->parent=position.node->parent; tmp->prev_sibling=position.node; tmp->next_sibling=position.node->next_sibling; position.node->next_sibling=tmp; if(tmp->next_sibling==0) { if(tmp->parent) // when inserting nodes at the head, there is no parent tmp->parent->last_child=tmp; } else { tmp->next_sibling->prev_sibling=tmp; } return tmp; } template template iter tree::insert_subtree(iter position, const iterator_base& subtree) { // insert dummy iter it=insert(position, value_type()); // replace dummy with subtree return replace(it, subtree); } template template iter tree::insert_subtree_after(iter position, const iterator_base& subtree) { // insert dummy iter it=insert_after(position, value_type()); // replace dummy with subtree return replace(it, subtree); } // template // template // iter tree::insert_subtree(sibling_iterator position, iter subtree) // { // // insert dummy // iter it(insert(position, value_type())); // // replace dummy with subtree // return replace(it, subtree); // } template template iter tree::replace(iter position, const T& x) { // kp::destructor(&position.node->data); // kp::constructor(&position.node->data, x); position.node->data=x; // alloc_.destroy(position.node); // alloc_.construct(position.node, x); return position; } template template iter tree::replace(iter position, const iterator_base& from) { assert(position.node!=head); tree_node *current_from=from.node; tree_node *start_from=from.node; tree_node *current_to =position.node; // replace the node at position with head of the replacement tree at from // std::cout << "warning!" << position.node << std::endl; erase_children(position); // std::cout << "no warning!" 
	//    << std::endl;
	tree_node* tmp = alloc_.allocate(1,0);
	alloc_.construct(tmp, (*from));
	// kp::constructor(&tmp->data, (*from));
	tmp->first_child=0;
	tmp->last_child=0;
	if(current_to->prev_sibling==0) {
		if(current_to->parent!=0)
			current_to->parent->first_child=tmp;
	}
	else {
		current_to->prev_sibling->next_sibling=tmp;
	}
	tmp->prev_sibling=current_to->prev_sibling;
	if(current_to->next_sibling==0) {
		if(current_to->parent!=0)
			current_to->parent->last_child=tmp;
	}
	else {
		current_to->next_sibling->prev_sibling=tmp;
	}
	tmp->next_sibling=current_to->next_sibling;
	tmp->parent=current_to->parent;
	// kp::destructor(&current_to->data);
	alloc_.destroy(current_to);
	alloc_.deallocate(current_to,1);
	current_to=tmp;

	// only at this stage can we fix 'last'
	tree_node *last=from.node->next_sibling;

	pre_order_iterator toit=tmp;
	// copy all children
	do {
		assert(current_from!=0);
		if(current_from->first_child != 0) {
			current_from=current_from->first_child;
			toit=append_child(toit, current_from->data);
		}
		else {
			while(current_from->next_sibling==0 && current_from!=start_from) {
				current_from=current_from->parent;
				toit=parent(toit);
				assert(current_from!=0);
			}
			current_from=current_from->next_sibling;
			if(current_from!=last) {
				toit=append_child(parent(toit), current_from->data);
			}
		}
	} while(current_from!=last);

	return current_to;
}

template <class T, class tree_node_allocator>
typename tree<T, tree_node_allocator>::sibling_iterator tree<T, tree_node_allocator>::replace(
	sibling_iterator orig_begin,
	sibling_iterator orig_end,
	sibling_iterator new_begin,
	sibling_iterator new_end)
{
	tree_node *orig_first=orig_begin.node;
	tree_node *new_first=new_begin.node;
	tree_node *orig_last=orig_first;
	while((++orig_begin)!=orig_end)
		orig_last=orig_last->next_sibling;
	tree_node *new_last=new_first;
	while((++new_begin)!=new_end)
		new_last=new_last->next_sibling;

	// insert all siblings in new_first..new_last before orig_first
	bool first=true;
	pre_order_iterator ret;
	while(1==1) {
		pre_order_iterator tt=insert_subtree(pre_order_iterator(orig_first), pre_order_iterator(new_first));
		if(first) {
			ret=tt;
first=false; } if(new_first==new_last) break; new_first=new_first->next_sibling; } // erase old range of siblings bool last=false; tree_node *next=orig_first; while(1==1) { if(next==orig_last) last=true; next=next->next_sibling; erase((pre_order_iterator)orig_first); if(last) break; orig_first=next; } return ret; } template template iter tree::flatten(iter position) { if(position.node->first_child==0) return position; tree_node *tmp=position.node->first_child; while(tmp) { tmp->parent=position.node->parent; tmp=tmp->next_sibling; } if(position.node->next_sibling) { position.node->last_child->next_sibling=position.node->next_sibling; position.node->next_sibling->prev_sibling=position.node->last_child; } else { position.node->parent->last_child=position.node->last_child; } position.node->next_sibling=position.node->first_child; position.node->next_sibling->prev_sibling=position.node; position.node->first_child=0; position.node->last_child=0; return position; } template template iter tree::reparent(iter position, sibling_iterator begin, sibling_iterator end) { tree_node *first=begin.node; tree_node *last=first; assert(first!=position.node); if(begin==end) return begin; // determine last node while((++begin)!=end) { last=last->next_sibling; } // move subtree if(first->prev_sibling==0) { first->parent->first_child=last->next_sibling; } else { first->prev_sibling->next_sibling=last->next_sibling; } if(last->next_sibling==0) { last->parent->last_child=first->prev_sibling; } else { last->next_sibling->prev_sibling=first->prev_sibling; } if(position.node->first_child==0) { position.node->first_child=first; position.node->last_child=last; first->prev_sibling=0; } else { position.node->last_child->next_sibling=first; first->prev_sibling=position.node->last_child; position.node->last_child=last; } last->next_sibling=0; tree_node *pos=first; for(;;) { pos->parent=position.node; if(pos==last) break; pos=pos->next_sibling; } return first; } template template iter 
tree::reparent(iter position, iter from) { if(from.node->first_child==0) return position; return reparent(position, from.node->first_child, end(from)); } template template iter tree::wrap(iter position, const T& x) { assert(position.node!=0); sibling_iterator fr=position, to=position; ++to; iter ret = insert(position, x); reparent(ret, fr, to); return ret; } template template iter tree::move_after(iter target, iter source) { tree_node *dst=target.node; tree_node *src=source.node; assert(dst); assert(src); if(dst==src) return source; if(dst->next_sibling) if(dst->next_sibling==src) // already in the right spot return source; // take src out of the tree if(src->prev_sibling!=0) src->prev_sibling->next_sibling=src->next_sibling; else src->parent->first_child=src->next_sibling; if(src->next_sibling!=0) src->next_sibling->prev_sibling=src->prev_sibling; else src->parent->last_child=src->prev_sibling; // connect it to the new point if(dst->next_sibling!=0) dst->next_sibling->prev_sibling=src; else dst->parent->last_child=src; src->next_sibling=dst->next_sibling; dst->next_sibling=src; src->prev_sibling=dst; src->parent=dst->parent; return src; } template template iter tree::move_before(iter target, iter source) { tree_node *dst=target.node; tree_node *src=source.node; assert(dst); assert(src); if(dst==src) return source; if(dst->prev_sibling) if(dst->prev_sibling==src) // already in the right spot return source; // take src out of the tree if(src->prev_sibling!=0) src->prev_sibling->next_sibling=src->next_sibling; else src->parent->first_child=src->next_sibling; if(src->next_sibling!=0) src->next_sibling->prev_sibling=src->prev_sibling; else src->parent->last_child=src->prev_sibling; // connect it to the new point if(dst->prev_sibling!=0) dst->prev_sibling->next_sibling=src; else dst->parent->first_child=src; src->prev_sibling=dst->prev_sibling; dst->prev_sibling=src; src->next_sibling=dst; src->parent=dst->parent; return src; } // specialisation for sibling_iterators 
template typename tree::sibling_iterator tree::move_before(sibling_iterator target, sibling_iterator source) { tree_node *dst=target.node; tree_node *src=source.node; tree_node *dst_prev_sibling; if(dst==0) { // must then be an end iterator dst_prev_sibling=target.parent_->last_child; assert(dst_prev_sibling); } else dst_prev_sibling=dst->prev_sibling; assert(src); if(dst==src) return source; if(dst_prev_sibling) if(dst_prev_sibling==src) // already in the right spot return source; // take src out of the tree if(src->prev_sibling!=0) src->prev_sibling->next_sibling=src->next_sibling; else src->parent->first_child=src->next_sibling; if(src->next_sibling!=0) src->next_sibling->prev_sibling=src->prev_sibling; else src->parent->last_child=src->prev_sibling; // connect it to the new point if(dst_prev_sibling!=0) dst_prev_sibling->next_sibling=src; else target.parent_->first_child=src; src->prev_sibling=dst_prev_sibling; if(dst) { dst->prev_sibling=src; src->parent=dst->parent; } src->next_sibling=dst; return src; } template template iter tree::move_ontop(iter target, iter source) { tree_node *dst=target.node; tree_node *src=source.node; assert(dst); assert(src); if(dst==src) return source; // if(dst==src->prev_sibling) { // // } // remember connection points tree_node *b_prev_sibling=dst->prev_sibling; tree_node *b_next_sibling=dst->next_sibling; tree_node *b_parent=dst->parent; // remove target erase(target); // take src out of the tree if(src->prev_sibling!=0) src->prev_sibling->next_sibling=src->next_sibling; else src->parent->first_child=src->next_sibling; if(src->next_sibling!=0) src->next_sibling->prev_sibling=src->prev_sibling; else src->parent->last_child=src->prev_sibling; // connect it to the new point if(b_prev_sibling!=0) b_prev_sibling->next_sibling=src; else b_parent->first_child=src; if(b_next_sibling!=0) b_next_sibling->prev_sibling=src; else b_parent->last_child=src; src->prev_sibling=b_prev_sibling; src->next_sibling=b_next_sibling; 
	src->parent=b_parent;
	return src;
}

template <class T, class tree_node_allocator>
void tree<T, tree_node_allocator>::merge(sibling_iterator to1, sibling_iterator to2,
                                         sibling_iterator from1, sibling_iterator from2,
                                         bool duplicate_leaves)
{
	sibling_iterator fnd;
	while(from1!=from2) {
		if((fnd=std::find(to1, to2, (*from1))) != to2) { // element found
			if(from1.begin()==from1.end()) { // full depth reached
				if(duplicate_leaves)
					append_child(parent(to1), (*from1));
			}
			else { // descend further
				merge(fnd.begin(), fnd.end(), from1.begin(), from1.end(), duplicate_leaves);
			}
		}
		else { // element missing
			insert_subtree(to2, from1);
		}
		++from1;
	}
}

template <class T, class tree_node_allocator>
void tree<T, tree_node_allocator>::sort(sibling_iterator from, sibling_iterator to, bool deep)
{
	std::less<T> comp;
	sort(from, to, comp, deep);
}

template <class T, class tree_node_allocator>
template <class StrictWeakOrdering>
void tree<T, tree_node_allocator>::sort(sibling_iterator from, sibling_iterator to,
                                        StrictWeakOrdering comp, bool deep)
{
	if(from==to) return;
	// make list of sorted nodes
	// CHECK: if multiset stores equivalent nodes in the order in which they
	// are inserted, then this routine should be called 'stable_sort'.
	std::multiset<tree_node *, compare_nodes<StrictWeakOrdering> > nodes(comp);
	sibling_iterator it=from, it2=to;
	while(it != to) {
		nodes.insert(it.node);
		++it;
	}
	// reassemble
	--it2;

	// prev and next are the nodes before and after the sorted range
	tree_node *prev=from.node->prev_sibling;
	tree_node *next=it2.node->next_sibling;
	typename std::multiset<tree_node *, compare_nodes<StrictWeakOrdering> >::iterator nit=nodes.begin(), eit=nodes.end();
	if(prev==0) {
		if((*nit)->parent!=0) // to catch "sorting the head" situations, when there is no parent
			(*nit)->parent->first_child=(*nit);
	}
	else prev->next_sibling=(*nit);

	--eit;
	while(nit!=eit) {
		(*nit)->prev_sibling=prev;
		if(prev)
			prev->next_sibling=(*nit);
		prev=(*nit);
		++nit;
	}
	// prev now points to the last-but-one node in the sorted range
	if(prev)
		prev->next_sibling=(*eit);

	// eit points to the last node in the sorted range.
	(*eit)->next_sibling=next;
	(*eit)->prev_sibling=prev; // missed in the loop above
	if(next==0) {
		if((*eit)->parent!=0) // to catch "sorting the head" situations, when there is no parent
			(*eit)->parent->last_child=(*eit);
	}
	else next->prev_sibling=(*eit);

	if(deep) { // sort the children of each node too
		sibling_iterator bcs(*nodes.begin());
		sibling_iterator ecs(*eit);
		++ecs;
		while(bcs!=ecs) {
			sort(begin(bcs), end(bcs), comp, deep);
			++bcs;
		}
	}
}

template <class T, class tree_node_allocator>
template <typename iter>
bool tree<T, tree_node_allocator>::equal(const iter& one_, const iter& two, const iter& three_) const
{
	std::equal_to<T> comp;
	return equal(one_, two, three_, comp);
}

template <class T, class tree_node_allocator>
template <typename iter>
bool tree<T, tree_node_allocator>::equal_subtree(const iter& one_, const iter& two_) const
{
	std::equal_to<T> comp;
	return equal_subtree(one_, two_, comp);
}

template <class T, class tree_node_allocator>
template <typename iter, class BinaryPredicate>
bool tree<T, tree_node_allocator>::equal(const iter& one_, const iter& two, const iter& three_, BinaryPredicate fun) const
{
	pre_order_iterator one(one_), three(three_);

	// if(one==two && is_valid(three) && three.number_of_children()!=0)
	// 	return false;
	while(one!=two && is_valid(three)) {
		if(!fun(*one,*three))
			return false;
		if(one.number_of_children()!=three.number_of_children())
			return false;
		++one;
		++three;
	}
	return true;
}

template <class T, class tree_node_allocator>
template <typename iter, class BinaryPredicate>
bool tree<T, tree_node_allocator>::equal_subtree(const iter& one_, const iter& two_, BinaryPredicate fun) const
{
	pre_order_iterator one(one_), two(two_);

	if(!fun(*one,*two)) return false;
	if(number_of_children(one)!=number_of_children(two)) return false;
	return equal(begin(one),end(one),begin(two),fun);
}

template <class T, class tree_node_allocator>
tree<T, tree_node_allocator> tree<T, tree_node_allocator>::subtree(sibling_iterator from, sibling_iterator to) const
{
	tree tmp;
	tmp.set_head(value_type());
	tmp.replace(tmp.begin(), tmp.end(), from, to);
	return tmp;
}

template <class T, class tree_node_allocator>
void tree<T, tree_node_allocator>::subtree(tree& tmp, sibling_iterator from, sibling_iterator to) const
{
	tmp.set_head(value_type());
	tmp.replace(tmp.begin(), tmp.end(), from, to);
}

template <class T, class tree_node_allocator>
size_t tree<T, tree_node_allocator>::size() const
{
	size_t i=0;
	pre_order_iterator it=begin(), eit=end();
	while(it!=eit) {
		++i;
		++it;
	}
	return i;
}

template <class T, class tree_node_allocator>
size_t tree<T, tree_node_allocator>::size(const
iterator_base& top) const { size_t i=0; pre_order_iterator it=top, eit=top; eit.skip_children(); ++eit; while(it!=eit) { ++i; ++it; } return i; } template bool tree::empty() const { pre_order_iterator it=begin(), eit=end(); return (it==eit); } template int tree::depth(const iterator_base& it) { tree_node* pos=it.node; assert(pos!=0); int ret=0; while(pos->parent!=0) { pos=pos->parent; ++ret; } return ret; } template int tree::depth(const iterator_base& it, const iterator_base& root) { tree_node* pos=it.node; assert(pos!=0); int ret=0; while(pos->parent!=0 && pos!=root.node) { pos=pos->parent; ++ret; } return ret; } template int tree::max_depth() const { int maxd=-1; for(tree_node *it = head->next_sibling; it!=feet; it=it->next_sibling) maxd=std::max(maxd, max_depth(it)); return maxd; } template int tree::max_depth(const iterator_base& pos) const { tree_node *tmp=pos.node; if(tmp==0 || tmp==head || tmp==feet) return -1; int curdepth=0, maxdepth=0; while(true) { // try to walk the bottom of the tree while(tmp->first_child==0) { if(tmp==pos.node) return maxdepth; if(tmp->next_sibling==0) { // try to walk up and then right again do { tmp=tmp->parent; if(tmp==0) return maxdepth; --curdepth; } while(tmp->next_sibling==0); } if(tmp==pos.node) return maxdepth; tmp=tmp->next_sibling; } tmp=tmp->first_child; ++curdepth; maxdepth=std::max(curdepth, maxdepth); } } template unsigned int tree::number_of_children(const iterator_base& it) { tree_node *pos=it.node->first_child; if(pos==0) return 0; unsigned int ret=1; // while(pos!=it.node->last_child) { // ++ret; // pos=pos->next_sibling; // } while((pos=pos->next_sibling)) ++ret; return ret; } template unsigned int tree::number_of_siblings(const iterator_base& it) const { tree_node *pos=it.node; unsigned int ret=0; // count forward while(pos->next_sibling && pos->next_sibling!=head && pos->next_sibling!=feet) { ++ret; pos=pos->next_sibling; } // count backward pos=it.node; while(pos->prev_sibling && pos->prev_sibling!=head && 
pos->prev_sibling!=feet) { ++ret; pos=pos->prev_sibling; } return ret; } template void tree::swap(sibling_iterator it) { tree_node *nxt=it.node->next_sibling; if(nxt) { if(it.node->prev_sibling) it.node->prev_sibling->next_sibling=nxt; else it.node->parent->first_child=nxt; nxt->prev_sibling=it.node->prev_sibling; tree_node *nxtnxt=nxt->next_sibling; if(nxtnxt) nxtnxt->prev_sibling=it.node; else it.node->parent->last_child=it.node; nxt->next_sibling=it.node; it.node->prev_sibling=nxt; it.node->next_sibling=nxtnxt; } } template void tree::swap(iterator one, iterator two) { // if one and two are adjacent siblings, use the sibling swap if(one.node->next_sibling==two.node) swap(one); else if(two.node->next_sibling==one.node) swap(two); else { tree_node *nxt1=one.node->next_sibling; tree_node *nxt2=two.node->next_sibling; tree_node *pre1=one.node->prev_sibling; tree_node *pre2=two.node->prev_sibling; tree_node *par1=one.node->parent; tree_node *par2=two.node->parent; // reconnect one.node->parent=par2; one.node->next_sibling=nxt2; if(nxt2) nxt2->prev_sibling=one.node; else par2->last_child=one.node; one.node->prev_sibling=pre2; if(pre2) pre2->next_sibling=one.node; else par2->first_child=one.node; two.node->parent=par1; two.node->next_sibling=nxt1; if(nxt1) nxt1->prev_sibling=two.node; else par1->last_child=two.node; two.node->prev_sibling=pre1; if(pre1) pre1->next_sibling=two.node; else par1->first_child=two.node; } } // template // tree::iterator tree::find_subtree( // sibling_iterator subfrom, sibling_iterator subto, iterator from, iterator to, // BinaryPredicate fun) const // { // assert(1==0); // this routine is not finished yet. // while(from!=to) { // if(fun(*subfrom, *from)) { // // } // } // return to; // } template bool tree::is_in_subtree(const iterator_base& it, const iterator_base& begin, const iterator_base& end) const { // FIXME: this should be optimised. 
	pre_order_iterator tmp=begin;
	while(tmp!=end) {
		if(tmp==it) return true;
		++tmp;
	}
	return false;
}

template <class T, class tree_node_allocator>
bool tree<T, tree_node_allocator>::is_valid(const iterator_base& it) const
{
	if(it.node==0 || it.node==feet || it.node==head) return false;
	else return true;
}

template <class T, class tree_node_allocator>
typename tree<T, tree_node_allocator>::iterator tree<T, tree_node_allocator>::lowest_common_ancestor(
	const iterator_base& one, const iterator_base& two) const
{
	std::set<iterator, iterator_base_less> parents;

	// Walk up from 'one' storing all parents.
	iterator walk=one;
	do {
		walk=parent(walk);
		parents.insert(walk);
	} while( is_valid(parent(walk)) );

	// Walk up from 'two' until we encounter a node in parents.
	walk=two;
	do {
		walk=parent(walk);
		if(parents.find(walk) != parents.end()) break;
	} while( is_valid(parent(walk)) );

	return walk;
}

template <class T, class tree_node_allocator>
unsigned int tree<T, tree_node_allocator>::index(sibling_iterator it) const
{
	unsigned int ind=0;
	if(it.node->parent==0) {
		while(it.node->prev_sibling!=head) {
			it.node=it.node->prev_sibling;
			++ind;
		}
	}
	else {
		while(it.node->prev_sibling!=0) {
			it.node=it.node->prev_sibling;
			++ind;
		}
	}
	return ind;
}

template <class T, class tree_node_allocator>
typename tree<T, tree_node_allocator>::sibling_iterator tree<T, tree_node_allocator>::sibling(const iterator_base& it, unsigned int num)
{
	tree_node *tmp;
	if(it.node->parent==0) {
		tmp=head->next_sibling;
		while(num) {
			tmp = tmp->next_sibling;
			--num;
		}
	}
	else {
		tmp=it.node->parent->first_child;
		while(num) {
			assert(tmp!=0);
			tmp = tmp->next_sibling;
			--num;
		}
	}
	return tmp;
}

template <class T, class tree_node_allocator>
void tree<T, tree_node_allocator>::debug_verify_consistency() const
{
	iterator it=begin();
	while(it!=end()) {
		if(it.node->parent!=0) {
			if(it.node->prev_sibling==0)
				assert(it.node->parent->first_child==it.node);
			else
				assert(it.node->prev_sibling->next_sibling==it.node);
			if(it.node->next_sibling==0)
				assert(it.node->parent->last_child==it.node);
			else
				assert(it.node->next_sibling->prev_sibling==it.node);
		}
		++it;
	}
}

template <class T, class tree_node_allocator>
typename tree<T, tree_node_allocator>::sibling_iterator tree<T, tree_node_allocator>::child(const iterator_base& it, unsigned int num)
{
	tree_node *tmp=it.node->first_child;
	while(num--) {
		assert(tmp!=0);
		tmp=tmp->next_sibling;
	}
	return tmp;
}

// Iterator base

template <class T, class tree_node_allocator>
tree::iterator_base::iterator_base() : node(0), skip_current_children_(false) { } template tree::iterator_base::iterator_base(tree_node *tn) : node(tn), skip_current_children_(false) { } template T& tree::iterator_base::operator*() const { return node->data; } template T* tree::iterator_base::operator->() const { return &(node->data); } template bool tree::post_order_iterator::operator!=(const post_order_iterator& other) const { if(other.node!=this->node) return true; else return false; } template bool tree::post_order_iterator::operator==(const post_order_iterator& other) const { if(other.node==this->node) return true; else return false; } template bool tree::pre_order_iterator::operator!=(const pre_order_iterator& other) const { if(other.node!=this->node) return true; else return false; } template bool tree::pre_order_iterator::operator==(const pre_order_iterator& other) const { if(other.node==this->node) return true; else return false; } template bool tree::sibling_iterator::operator!=(const sibling_iterator& other) const { if(other.node!=this->node) return true; else return false; } template bool tree::sibling_iterator::operator==(const sibling_iterator& other) const { if(other.node==this->node) return true; else return false; } template bool tree::leaf_iterator::operator!=(const leaf_iterator& other) const { if(other.node!=this->node) return true; else return false; } template bool tree::leaf_iterator::operator==(const leaf_iterator& other) const { if(other.node==this->node && other.top_node==this->top_node) return true; else return false; } template typename tree::sibling_iterator tree::iterator_base::begin() const { if(node->first_child==0) return end(); sibling_iterator ret(node->first_child); ret.parent_=this->node; return ret; } template typename tree::sibling_iterator tree::iterator_base::end() const { sibling_iterator ret(0); ret.parent_=node; return ret; } template void tree::iterator_base::skip_children() { skip_current_children_=true; } template void 
tree::iterator_base::skip_children(bool skip) { skip_current_children_=skip; } template unsigned int tree::iterator_base::number_of_children() const { tree_node *pos=node->first_child; if(pos==0) return 0; unsigned int ret=1; while(pos!=node->last_child) { ++ret; pos=pos->next_sibling; } return ret; } // Pre-order iterator template tree::pre_order_iterator::pre_order_iterator() : iterator_base(0) { } template tree::pre_order_iterator::pre_order_iterator(tree_node *tn) : iterator_base(tn) { } template tree::pre_order_iterator::pre_order_iterator(const iterator_base &other) : iterator_base(other.node) { } template tree::pre_order_iterator::pre_order_iterator(const sibling_iterator& other) : iterator_base(other.node) { if(this->node==0) { if(other.range_last()!=0) this->node=other.range_last(); else this->node=other.parent_; this->skip_children(); ++(*this); } } template typename tree::pre_order_iterator& tree::pre_order_iterator::operator++() { assert(this->node!=0); if(!this->skip_current_children_ && this->node->first_child != 0) { this->node=this->node->first_child; } else { this->skip_current_children_=false; while(this->node->next_sibling==0) { this->node=this->node->parent; if(this->node==0) return *this; } this->node=this->node->next_sibling; } return *this; } template typename tree::pre_order_iterator& tree::pre_order_iterator::operator--() { assert(this->node!=0); if(this->node->prev_sibling) { this->node=this->node->prev_sibling; while(this->node->last_child) this->node=this->node->last_child; } else { this->node=this->node->parent; if(this->node==0) return *this; } return *this; } template typename tree::pre_order_iterator tree::pre_order_iterator::operator++(int) { pre_order_iterator copy = *this; ++(*this); return copy; } template typename tree::pre_order_iterator tree::pre_order_iterator::operator--(int) { pre_order_iterator copy = *this; --(*this); return copy; } template typename tree::pre_order_iterator& tree::pre_order_iterator::operator+=(unsigned 
int num) { while(num>0) { ++(*this); --num; } return (*this); } template typename tree::pre_order_iterator& tree::pre_order_iterator::operator-=(unsigned int num) { while(num>0) { --(*this); --num; } return (*this); } // Post-order iterator template tree::post_order_iterator::post_order_iterator() : iterator_base(0) { } template tree::post_order_iterator::post_order_iterator(tree_node *tn) : iterator_base(tn) { } template tree::post_order_iterator::post_order_iterator(const iterator_base &other) : iterator_base(other.node) { } template tree::post_order_iterator::post_order_iterator(const sibling_iterator& other) : iterator_base(other.node) { if(this->node==0) { if(other.range_last()!=0) this->node=other.range_last(); else this->node=other.parent_; this->skip_children(); ++(*this); } } template typename tree::post_order_iterator& tree::post_order_iterator::operator++() { assert(this->node!=0); if(this->node->next_sibling==0) { this->node=this->node->parent; this->skip_current_children_=false; } else { this->node=this->node->next_sibling; if(this->skip_current_children_) { this->skip_current_children_=false; } else { while(this->node->first_child) this->node=this->node->first_child; } } return *this; } template typename tree::post_order_iterator& tree::post_order_iterator::operator--() { assert(this->node!=0); if(this->skip_current_children_ || this->node->last_child==0) { this->skip_current_children_=false; while(this->node->prev_sibling==0) this->node=this->node->parent; this->node=this->node->prev_sibling; } else { this->node=this->node->last_child; } return *this; } template typename tree::post_order_iterator tree::post_order_iterator::operator++(int) { post_order_iterator copy = *this; ++(*this); return copy; } template typename tree::post_order_iterator tree::post_order_iterator::operator--(int) { post_order_iterator copy = *this; --(*this); return copy; } template typename tree::post_order_iterator& tree::post_order_iterator::operator+=(unsigned int num) { 
while(num>0) { ++(*this); --num; } return (*this); } template typename tree::post_order_iterator& tree::post_order_iterator::operator-=(unsigned int num) { while(num>0) { --(*this); --num; } return (*this); } template void tree::post_order_iterator::descend_all() { assert(this->node!=0); while(this->node->first_child) this->node=this->node->first_child; } // Breadth-first iterator template tree::breadth_first_queued_iterator::breadth_first_queued_iterator() : iterator_base() { } template tree::breadth_first_queued_iterator::breadth_first_queued_iterator(tree_node *tn) : iterator_base(tn) { traversal_queue.push(tn); } template tree::breadth_first_queued_iterator::breadth_first_queued_iterator(const iterator_base& other) : iterator_base(other.node) { traversal_queue.push(other.node); } template bool tree::breadth_first_queued_iterator::operator!=(const breadth_first_queued_iterator& other) const { if(other.node!=this->node) return true; else return false; } template bool tree::breadth_first_queued_iterator::operator==(const breadth_first_queued_iterator& other) const { if(other.node==this->node) return true; else return false; } template typename tree::breadth_first_queued_iterator& tree::breadth_first_queued_iterator::operator++() { assert(this->node!=0); // Add child nodes and pop current node sibling_iterator sib=this->begin(); while(sib!=this->end()) { traversal_queue.push(sib.node); ++sib; } traversal_queue.pop(); if(traversal_queue.size()>0) this->node=traversal_queue.front(); else this->node=0; return (*this); } template typename tree::breadth_first_queued_iterator tree::breadth_first_queued_iterator::operator++(int) { breadth_first_queued_iterator copy = *this; ++(*this); return copy; } template typename tree::breadth_first_queued_iterator& tree::breadth_first_queued_iterator::operator+=(unsigned int num) { while(num>0) { ++(*this); --num; } return (*this); } // Fixed depth iterator template tree::fixed_depth_iterator::fixed_depth_iterator() : iterator_base() 
{ } template tree::fixed_depth_iterator::fixed_depth_iterator(tree_node *tn) : iterator_base(tn), top_node(0) { } template tree::fixed_depth_iterator::fixed_depth_iterator(const iterator_base& other) : iterator_base(other.node), top_node(0) { } template tree::fixed_depth_iterator::fixed_depth_iterator(const sibling_iterator& other) : iterator_base(other.node), top_node(0) { } template tree::fixed_depth_iterator::fixed_depth_iterator(const fixed_depth_iterator& other) : iterator_base(other.node), top_node(other.top_node) { } template bool tree::fixed_depth_iterator::operator==(const fixed_depth_iterator& other) const { if(other.node==this->node && other.top_node==top_node) return true; else return false; } template bool tree::fixed_depth_iterator::operator!=(const fixed_depth_iterator& other) const { if(other.node!=this->node || other.top_node!=top_node) return true; else return false; } template typename tree::fixed_depth_iterator& tree::fixed_depth_iterator::operator++() { assert(this->node!=0); if(this->node->next_sibling) { this->node=this->node->next_sibling; } else { int relative_depth=0; upper: do { if(this->node==this->top_node) { this->node=0; // FIXME: return a proper fixed_depth end iterator once implemented return *this; } this->node=this->node->parent; if(this->node==0) return *this; --relative_depth; } while(this->node->next_sibling==0); lower: this->node=this->node->next_sibling; while(this->node->first_child==0) { if(this->node->next_sibling==0) goto upper; this->node=this->node->next_sibling; if(this->node==0) return *this; } while(relative_depth<0 && this->node->first_child!=0) { this->node=this->node->first_child; ++relative_depth; } if(relative_depth<0) { if(this->node->next_sibling==0) goto upper; else goto lower; } } return *this; } template typename tree::fixed_depth_iterator& tree::fixed_depth_iterator::operator--() { assert(this->node!=0); if(this->node->prev_sibling) { this->node=this->node->prev_sibling; } else { int relative_depth=0; 
upper: do { if(this->node==this->top_node) { this->node=0; return *this; } this->node=this->node->parent; if(this->node==0) return *this; --relative_depth; } while(this->node->prev_sibling==0); lower: this->node=this->node->prev_sibling; while(this->node->last_child==0) { if(this->node->prev_sibling==0) goto upper; this->node=this->node->prev_sibling; if(this->node==0) return *this; } while(relative_depth<0 && this->node->last_child!=0) { this->node=this->node->last_child; ++relative_depth; } if(relative_depth<0) { if(this->node->prev_sibling==0) goto upper; else goto lower; } } return *this; // // // assert(this->node!=0); // if(this->node->prev_sibling!=0) { // this->node=this->node->prev_sibling; // assert(this->node!=0); // if(this->node->parent==0 && this->node->prev_sibling==0) // head element // this->node=0; // } // else { // tree_node *par=this->node->parent; // do { // par=par->prev_sibling; // if(par==0) { // FIXME: need to keep track of this! // this->node=0; // return *this; // } // } while(par->last_child==0); // this->node=par->last_child; // } // return *this; } template typename tree::fixed_depth_iterator tree::fixed_depth_iterator::operator++(int) { fixed_depth_iterator copy = *this; ++(*this); return copy; } template typename tree::fixed_depth_iterator tree::fixed_depth_iterator::operator--(int) { fixed_depth_iterator copy = *this; --(*this); return copy; } template typename tree::fixed_depth_iterator& tree::fixed_depth_iterator::operator-=(unsigned int num) { while(num>0) { --(*this); --(num); } return (*this); } template typename tree::fixed_depth_iterator& tree::fixed_depth_iterator::operator+=(unsigned int num) { while(num>0) { ++(*this); --(num); } return *this; } // Sibling iterator template tree::sibling_iterator::sibling_iterator() : iterator_base() { set_parent_(); } template tree::sibling_iterator::sibling_iterator(tree_node *tn) : iterator_base(tn) { set_parent_(); } template tree::sibling_iterator::sibling_iterator(const 
iterator_base& other) : iterator_base(other.node) { set_parent_(); } template tree::sibling_iterator::sibling_iterator(const sibling_iterator& other) : iterator_base(other), parent_(other.parent_) { } template void tree::sibling_iterator::set_parent_() { parent_=0; if(this->node==0) return; if(this->node->parent!=0) parent_=this->node->parent; } template typename tree::sibling_iterator& tree::sibling_iterator::operator++() { if(this->node) this->node=this->node->next_sibling; return *this; } template typename tree::sibling_iterator& tree::sibling_iterator::operator--() { if(this->node) this->node=this->node->prev_sibling; else { assert(parent_); this->node=parent_->last_child; } return *this; } template typename tree::sibling_iterator tree::sibling_iterator::operator++(int) { sibling_iterator copy = *this; ++(*this); return copy; } template typename tree::sibling_iterator tree::sibling_iterator::operator--(int) { sibling_iterator copy = *this; --(*this); return copy; } template typename tree::sibling_iterator& tree::sibling_iterator::operator+=(unsigned int num) { while(num>0) { ++(*this); --num; } return (*this); } template typename tree::sibling_iterator& tree::sibling_iterator::operator-=(unsigned int num) { while(num>0) { --(*this); --num; } return (*this); } template typename tree::tree_node *tree::sibling_iterator::range_first() const { tree_node *tmp=parent_->first_child; return tmp; } template typename tree::tree_node *tree::sibling_iterator::range_last() const { return parent_->last_child; } // Leaf iterator template tree::leaf_iterator::leaf_iterator() : iterator_base(0), top_node(0) { } template tree::leaf_iterator::leaf_iterator(tree_node *tn, tree_node *top) : iterator_base(tn), top_node(top) { } template tree::leaf_iterator::leaf_iterator(const iterator_base &other) : iterator_base(other.node), top_node(0) { } template tree::leaf_iterator::leaf_iterator(const sibling_iterator& other) : iterator_base(other.node), top_node(0) { if(this->node==0) { 
if(other.range_last()!=0) this->node=other.range_last(); else this->node=other.parent_; ++(*this); } } template typename tree::leaf_iterator& tree::leaf_iterator::operator++() { assert(this->node!=0); if(this->node->first_child!=0) { // current node is no longer leaf (children got added) while(this->node->first_child) this->node=this->node->first_child; } else { while(this->node->next_sibling==0) { if (this->node->parent==0) return *this; this->node=this->node->parent; if (top_node != 0 && this->node==top_node) return *this; } this->node=this->node->next_sibling; while(this->node->first_child) this->node=this->node->first_child; } return *this; } template typename tree::leaf_iterator& tree::leaf_iterator::operator--() { assert(this->node!=0); while (this->node->prev_sibling==0) { if (this->node->parent==0) return *this; this->node=this->node->parent; if (top_node !=0 && this->node==top_node) return *this; } this->node=this->node->prev_sibling; while(this->node->last_child) this->node=this->node->last_child; return *this; } template typename tree::leaf_iterator tree::leaf_iterator::operator++(int) { leaf_iterator copy = *this; ++(*this); return copy; } template typename tree::leaf_iterator tree::leaf_iterator::operator--(int) { leaf_iterator copy = *this; --(*this); return copy; } template typename tree::leaf_iterator& tree::leaf_iterator::operator+=(unsigned int num) { while(num>0) { ++(*this); --num; } return (*this); } template typename tree::leaf_iterator& tree::leaf_iterator::operator-=(unsigned int num) { while(num>0) { --(*this); --num; } return (*this); } #endif // Local variables: // default-tab-width: 3 // End: libkiwix-0.2.0/include/ctpp2/000077500000000000000000000000001312445116700160125ustar00rootroot00000000000000libkiwix-0.2.0/include/ctpp2/CTPP2VMStringLoader.hpp000066400000000000000000000036261312445116700221430ustar00rootroot00000000000000/* * Copyright 2013 Renaud Gaudin * * This program is free software; you can redistribute it and/or modify * 
it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. */ #ifndef _CTPP2_VM_STRING_LOADER_HPP__ #define _CTPP2_VM_STRING_LOADER_HPP__ 1 #include #include #include #include #include #include #include #include #include #include #include #include #include /** @file VMStringLoader.hpp @brief Load program core from file */ namespace CTPP // C++ Template Engine { // FWD struct VMExecutable; /** @class VMStringLoader CTPP2VMStringLoader.hpp @brief Load program core from file */ class CTPP2DECL VMStringLoader: public VMLoader { public: /** */ VMStringLoader(CCHAR_P rawContent, size_t rawContentSize); /** @brief Get ready-to-run program */ const VMMemoryCore * GetCore() const; /** @brief A destructor */ ~VMStringLoader() throw(); private: /** Program core */ VMExecutable * oCore; /** Ready-to-run program */ VMMemoryCore * pVMMemoryCore; }; } // namespace CTPP #endif // _CTPP2_VM_STRING_LOADER_HPP__ // End. libkiwix-0.2.0/include/kiwix.h000066400000000000000000000015011312445116700162630ustar00rootroot00000000000000/* * Copyright 2011 Emmanuel Engelhart * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. 
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
 * MA 02110-1301, USA.
 */

#ifndef KIWIX_H
#define KIWIX_H

#include "library.h"

#endif
libkiwix-0.2.0/include/library.h000066400000000000000000000055721312445116700166100ustar00rootroot00000000000000
/*
 * Copyright 2011 Emmanuel Engelhart
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 3 of the License, or
 * any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
 * MA 02110-1301, USA.
 */

#ifndef KIWIX_LIBRARY_H
#define KIWIX_LIBRARY_H

#include
#include
#include
#include
#include
#include
#include "common/stringTools.h"
#include "common/regexTools.h"

#define KIWIX_LIBRARY_VERSION "20110515"

using namespace std;

namespace kiwix {

  enum supportedIndexType { UNKNOWN, XAPIAN };

  class Book {

  public:
    Book();
    ~Book();

    static bool sortByLastOpen(const Book &a, const Book &b);
    static bool sortByTitle(const Book &a, const Book &b);
    static bool sortBySize(const Book &a, const Book &b);
    static bool sortByDate(const Book &a, const Book &b);
    static bool sortByCreator(const Book &a, const Book &b);
    static bool sortByPublisher(const Book &a, const Book &b);
    static bool sortByLanguage(const Book &a, const Book &b);
    string getHumanReadableIdFromPath();

    string id;
    string path;
    string pathAbsolute;
    string last;
    string indexPath;
    string indexPathAbsolute;
    supportedIndexType indexType;
    string title;
    string description;
    string language;
    string creator;
    string publisher;
    string date;
    string url;
    string name;
    string tags;
    string origId;
    string articleCount;
    string mediaCount;
    bool readOnly;
    string size;
    string favicon;
    string faviconMimeType;
  };

  class Library {

  public:
    Library();
    ~Library();

    string version;
    bool addBook(const Book &book);
    bool removeBookByIndex(const unsigned int bookIndex);
    vector<Book> books;

    /*
     * 'current' stores the id of the current content/book in the
     * library. It is used to load a content by default. As Kiwix may
     * work with many library XML files, "current" may be defined many
     * times with different values. The last XML file read has
     * priority. Although we do not have a library object for each
     * file, we want to be able to fall back to an 'old' current book
     * if the one that should be loaded fails. That is why we need a
     * stack here.
     */
    stack<string> current;
  };

}

#endif
libkiwix-0.2.0/include/manager.h000066400000000000000000000062561312445116700165560ustar00rootroot00000000000000
/*
 * Copyright 2011 Emmanuel Engelhart
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 3 of the License, or
 * any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
 * MA 02110-1301, USA.
 */

#ifndef KIWIX_MANAGER_H
#define KIWIX_MANAGER_H

#include
#include
#include
#include
#include "common/base64.h"
#include "common/regexTools.h"
#include "common/pathTools.h"
#include "library.h"
#include "reader.h"

using namespace std;

namespace kiwix {

  enum supportedListMode { LASTOPEN, REMOTE, LOCAL };
  enum supportedListSortBy { TITLE, SIZE, DATE, CREATOR, PUBLISHER };

  class Manager {

  public:
    Manager();
    ~Manager();

    bool readFile(const string path, const bool readOnly = true);
    bool readFile(const string nativePath, const string UTF8Path, const bool readOnly = true);
    bool readXml(const string xml, const bool readOnly = true, const string libraryPath = "");
    bool writeFile(const string path);
    bool removeBookByIndex(const unsigned int bookIndex);
    bool removeBookById(const string id);
    bool setCurrentBookId(const string id);
    string getCurrentBookId();
    bool setBookIndex(const string id, const string path, const supportedIndexType type);
    bool setBookIndex(const string id, const string path);
    bool setBookPath(const string id, const string path);
    string 
addBookFromPathAndGetId(const string pathToOpen, const string pathToSave = "", const string url = "", const bool checkMetaData = false); bool addBookFromPath(const string pathToOpen, const string pathToSave = "", const string url = "", const bool checkMetaData = false); Library cloneLibrary(); bool getBookById(const string id, Book &book); bool getCurrentBook(Book &book); unsigned int getBookCount(const bool localBooks, const bool remoteBooks); bool updateBookLastOpenDateById(const string id); void removeBookPaths(); bool listBooks(const supportedListMode mode, const supportedListSortBy sortBy, const unsigned int maxSize, const string language, const string creator, const string publisher, const string search); vector getBooksLanguages(); vector getBooksCreators(); vector getBooksPublishers(); vector getBooksIds(); string writableLibraryPath; vector bookIdList; protected: kiwix::Library library; bool readBookFromPath(const string path, Book *book = NULL); bool parseXmlDom(const pugi::xml_document &doc, const bool readOnly, const string libraryPath); private: void checkAndCleanBookPaths(Book &book, const string &libraryPath); }; } #endif libkiwix-0.2.0/include/meson.build000066400000000000000000000007631312445116700171320ustar00rootroot00000000000000headers = [ 'library.h', 'manager.h', 'reader.h', 'searcher.h' ] if xapian_dep.found() headers += ['xapianSearcher.h'] endif install_headers(headers, subdir:'kiwix') install_headers( 'common/base64.h', 'common/networkTools.h', 'common/otherTools.h', 'common/pathTools.h', 'common/regexTools.h', 'common/stringTools.h', 'common/tree.h', subdir:'kiwix/common' ) if has_ctpp2_dep install_headers( 'ctpp2/CTPP2VMStringLoader.hpp', subdir:'kiwix/ctpp2' ) endif libkiwix-0.2.0/include/reader.h000066400000000000000000000075401312445116700164030ustar00rootroot00000000000000/* * Copyright 2011 Emmanuel Engelhart * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public 
License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. */ #ifndef KIWIX_READER_H #define KIWIX_READER_H #include #include #include #include #include #include #include #include #include #include "common/pathTools.h" #include "common/stringTools.h" using namespace std; namespace kiwix { class Reader { public: Reader(const string zimFilePath); ~Reader(); void reset(); unsigned int getArticleCount() const; unsigned int getMediaCount() const; unsigned int getGlobalCount() const; string getZimFilePath() const; string getId() const; string getRandomPageUrl() const; string getFirstPageUrl() const; string getMainPageUrl() const; bool getMetatag(const string &url, string &content) const; string getTitle() const; string getDescription() const; string getLanguage() const; string getName() const; string getTags() const; string getDate() const; string getCreator() const; string getPublisher() const; string getOrigId() const; bool getFavicon(string &content, string &mimeType) const; bool getPageUrlFromTitle(const string &title, string &url) const; bool getMimeTypeByUrl(const string &url, string &mimeType) const; bool getContentByUrl(const string &url, string &content, unsigned int &contentLength, string &contentType) const; bool getContentByEncodedUrl(const string &url, string &content, unsigned int &contentLength, string &contentType, string &baseUrl) const; bool getContentByEncodedUrl(const string &url, string &content, unsigned int &contentLength, string &contentType) 
const; bool getContentByDecodedUrl(const string &url, string &content, unsigned int &contentLength, string &contentType, string &baseUrl) const; bool getContentByDecodedUrl(const string &url, string &content, unsigned int &contentLength, string &contentType) const; bool searchSuggestions(const string &prefix, unsigned int suggestionsCount, const bool reset = true); bool searchSuggestionsSmart(const string &prefix, unsigned int suggestionsCount); bool urlExists(const string &url) const; bool hasFulltextIndex() const; std::vector getTitleVariants(const std::string &title) const; bool getNextSuggestion(string &title); bool getNextSuggestion(string &title, string &url); bool canCheckIntegrity() const; bool isCorrupted() const; bool parseUrl(const string &url, char *ns, string &title) const; unsigned int getFileSize() const; zim::File* getZimFileHandler() const; bool getArticleObjectByDecodedUrl(const string &url, zim::Article &article) const; protected: zim::File* zimFileHandler; zim::size_type firstArticleOffset; zim::size_type lastArticleOffset; zim::size_type currentArticleOffset; zim::size_type nsACount; zim::size_type nsICount; std::string zimFilePath; std::vector< std::vector > suggestions; std::vector< std::vector >::iterator suggestionsOffset; private: std::map parseCounterMetadata() const; }; } #endif libkiwix-0.2.0/include/searcher.h000066400000000000000000000051651312445116700167360ustar00rootroot00000000000000/* * Copyright 2011 Emmanuel Engelhart * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. */ #ifndef KIWIX_SEARCHER_H #define KIWIX_SEARCHER_H #include #include #include #include #include #include #include #include #include "common/pathTools.h" #include "common/stringTools.h" #include #include "kiwix_config.h" using namespace std; namespace kiwix { class Reader; class Result { public: virtual ~Result() {}; virtual std::string get_url() = 0; virtual std::string get_title() = 0; virtual int get_score() = 0; virtual std::string get_snippet() = 0; virtual int get_wordCount() = 0; virtual int get_size() = 0; }; struct SearcherInternal; class Searcher { public: Searcher(const string &xapianDirectoryPath, Reader* reader); ~Searcher(); void search(std::string &search, unsigned int resultStart, unsigned int resultEnd, const bool verbose=false); Result* getNextResult(); void restart_search(); unsigned int getEstimatedResultCount(); bool setProtocolPrefix(const std::string prefix); bool setSearchProtocolPrefix(const std::string prefix); void reset(); void setContentHumanReadableId(const string &contentHumanReadableId); #ifdef ENABLE_CTPP2 string getHtml(); #endif protected: std::string beautifyInteger(const unsigned int number); void closeIndex() ; void searchInIndex(string &search, const unsigned int resultStart, const unsigned int resultEnd, const bool verbose=false); Reader* reader; SearcherInternal* internal; std::string searchPattern; std::string protocolPrefix; std::string searchProtocolPrefix; std::string template_ct2; unsigned int resultCountPerPage; unsigned int estimatedResultCount; unsigned int resultStart; unsigned int resultEnd; std::string contentHumanReadableId; }; } #endif libkiwix-0.2.0/include/xapianSearcher.h000066400000000000000000000046771312445116700201060ustar00rootroot00000000000000/* * Copyright 2011 Emmanuel Engelhart * * This 
program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. */ #ifndef KIWIX_XAPIAN_SEARCHER_H #define KIWIX_XAPIAN_SEARCHER_H #include #include "searcher.h" #include "reader.h" #include #include using namespace std; namespace kiwix { class XapianSearcher; class XapianResult : public Result { public: XapianResult(XapianSearcher* searcher, Xapian::MSetIterator& iterator); virtual ~XapianResult() {}; virtual std::string get_url(); virtual std::string get_title(); virtual int get_score(); virtual std::string get_snippet(); virtual int get_wordCount(); virtual int get_size(); private: XapianSearcher* searcher; Xapian::MSetIterator iterator; Xapian::Document document; }; class NoXapianIndexInZim: public exception { virtual const char* what() const throw() { return "There is no fulltext index in the zim file"; } }; class XapianSearcher { friend class XapianResult; public: XapianSearcher(const string &xapianDirectoryPath, Reader* reader); virtual ~XapianSearcher() {}; void searchInIndex(string &search, const unsigned int resultStart, const unsigned int resultEnd, const bool verbose=false); virtual Result* getNextResult(); void restart_search(); Xapian::MSet results; protected: void closeIndex(); void openIndex(const string &xapianDirectoryPath); void setup_queryParser(); Reader* reader; Xapian::Database readableDatabase; std::string language; std::string 
stopwords;
    Xapian::QueryParser queryParser;
    Xapian::Stem stemmer;
    Xapian::SimpleStopper stopper;
    Xapian::MSetIterator current_result;
    std::map valuesmap;
  };

}

#endif
libkiwix-0.2.0/kiwix.pc.in000066400000000000000000000004371312445116700154270ustar00rootroot00000000000000
prefix=@prefix@
libdir=${prefix}/lib64
includedir=${prefix}/include

Name: libkiwix
Description: A library that contains a lot of things used by other kiwix programs
Version: 1.0
Requires: @requires@
Libs: -L${libdir} -lkiwix @extra_libs@
Cflags: -I${includedir}/ @extra_cflags@
libkiwix-0.2.0/meson.build000066400000000000000000000065001312445116700155020ustar00rootroot00000000000000
project('kiwixlib', 'cpp',
  version : '0.2.0',
  license : 'GPL',
  default_options : ['c_std=c11', 'cpp_std=c++11'])

compiler = meson.get_compiler('cpp')
find_library_in_compiler = meson.version().version_compare('>=0.31.0')

thread_dep = dependency('threads')
libicu_dep = dependency('icu-i18n')
libzim_dep = dependency('libzim')
pugixml_dep = dependency('pugixml')

ctpp2_include_path = ''
has_ctpp2_dep = false
ctpp2_prefix_install = get_option('ctpp2-install-prefix')
ctpp2_link_args = []
if ctpp2_prefix_install == ''
  if compiler.has_header('ctpp2/CTPP2Logger.hpp')
    if find_library_in_compiler
      ctpp2_lib = compiler.find_library('ctpp2')
    else
      ctpp2_lib = find_library('ctpp2')
    endif
    ctpp2_link_args = ['-lctpp2']
    if meson.is_cross_build() and host_machine.system() == 'windows'
      if find_library_in_compiler
        iconv_lib = compiler.find_library('iconv', required:false)
      else
        iconv_lib = find_library('iconv', required:false)
      endif
      if iconv_lib.found()
        ctpp2_link_args += ['-liconv']
      endif
    endif
    has_ctpp2_dep = true
    ctpp2_dep = declare_dependency(link_args:ctpp2_link_args)
  else
    message('ctpp2/CTPP2Logger.hpp not found. Compiling without CTPP2 support')
  endif
else
  if not find_library_in_compiler
    error('For custom ctpp2_prefix_install you need a meson version >=0.31.0')
  endif
  ctpp2_include_path = ctpp2_prefix_install + '/include'
  ctpp2_include_args = ['-I'+ctpp2_include_path]
  if compiler.has_header('ctpp2/CTPP2Logger.hpp', args:ctpp2_include_args)
    ctpp2_include_dir = include_directories(ctpp2_include_path, is_system:true)
    ctpp2_lib_path = ctpp2_prefix_install+'/lib'
    ctpp2_lib = compiler.find_library('ctpp2', dirs:ctpp2_lib_path)
    ctpp2_link_args = ['-L'+ctpp2_lib_path, '-lctpp2']
    if meson.is_cross_build() and host_machine.system() == 'windows'
      iconv_lib = compiler.find_library('iconv', required:false)
      if iconv_lib.found()
        ctpp2_link_args += ['-liconv']
      endif
    endif
    has_ctpp2_dep = true
    ctpp2_dep = declare_dependency(include_directories:ctpp2_include_dir, link_args:ctpp2_link_args)
  else
    message('ctpp2/CTPP2Logger.hpp not found. Compiling without CTPP2 support')
  endif
endif

xapian_dep = dependency('xapian-core', required:false)

all_deps = [thread_dep, libicu_dep, libzim_dep, xapian_dep, pugixml_dep]
if has_ctpp2_dep
  all_deps += [ctpp2_dep]
endif

inc = include_directories('include')

conf = configuration_data()
conf.set('VERSION', '"@0@"'.format(meson.project_version()))
conf.set('ENABLE_CTPP2', has_ctpp2_dep)

subdir('include')
subdir('scripts')
subdir('static')
subdir('src')

pkg_requires = ['libzim', 'icu-i18n', 'pugixml']
if xapian_dep.found()
  pkg_requires += ['xapian-core']
endif

extra_libs = []
extra_cflags = ''
if has_ctpp2_dep
  extra_libs += ctpp2_link_args
  if ctpp2_include_path != ''
    extra_cflags = '-I'+ctpp2_include_path
  endif
endif

pkg_conf = configuration_data()
pkg_conf.set('prefix', get_option('prefix'))
pkg_conf.set('requires', ' '.join(pkg_requires))
pkg_conf.set('extra_libs', ' '.join(extra_libs))
pkg_conf.set('extra_cflags', extra_cflags)
configure_file(output : 'kiwix.pc',
               configuration : pkg_conf,
               input : 'kiwix.pc.in',
               install_dir: get_option('libdir')+'/pkgconfig'
              )
libkiwix-0.2.0/meson_options.txt000066400000000000000000000003421312445116700167730ustar00rootroot00000000000000option('ctpp2-install-prefix', type : 'string', value : '', description : 'Prefix where ctpp libs have been installed') option('android', type : 'boolean', value : false, description : 'Do we make a kiwix-lib for android') libkiwix-0.2.0/scripts/000077500000000000000000000000001312445116700150265ustar00rootroot00000000000000libkiwix-0.2.0/scripts/compile_resources.py000077500000000000000000000116631312445116700211340ustar00rootroot00000000000000#!/usr/bin/env python3 import argparse import os.path import re def full_identifier(filename): parts = os.path.normpath(filename).split(os.sep) parts = [to_identifier(part) for part in parts] print(filename, parts) return parts def to_identifier(name): ident = re.sub(r'[^0-9a-zA-Z]', '_', name) if ident[0].isnumeric(): return "_"+ident return ident resource_impl_template = """ static const unsigned char {data_identifier}[] = {{ {resource_content} }}; namespace RESOURCE {{ {namespaces_open} const std::string {identifier} = init_resource("{env_identifier}", {data_identifier}, {resource_len}); {namespaces_close} }} """ resource_getter_template = """ if (name == "{common_name}") return RESOURCE::{identifier}; """ resource_decl_template = """{namespaces_open} extern const std::string {identifier}; {namespaces_close}""" class Resource: def __init__(self, base_dir, filename): filename = filename.strip() self.filename = filename self.identifier = full_identifier(filename) with open(os.path.join(base_dir, filename), 'rb') as f: self.data = f.read() def dump_impl(self): nb_row = len(self.data)//16 + (1 if len(self.data) % 16 else 0) sliced = (self.data[i*16:(i+1)*16] for i in range(nb_row)) return resource_impl_template.format( data_identifier="_".join([""]+self.identifier), resource_content=",\n ".join(", ".join("{:#04x}".format(i) for i in r) for r in sliced), resource_len=len(self.data), namespaces_open=" ".join("namespace
{} {{".format(id) for id in self.identifier[:-1]), namespaces_close=" ".join(["}"]*(len(self.identifier)-1)), identifier=self.identifier[-1], env_identifier="RES_"+"_".join(self.identifier)+"_PATH" ) def dump_getter(self): return resource_getter_template.format( common_name=self.filename, identifier="::".join(self.identifier) ) def dump_decl(self): return resource_decl_template.format( namespaces_open=" ".join("namespace {} {{".format(id) for id in self.identifier[:-1]), namespaces_close=" ".join(["}"]*(len(self.identifier)-1)), identifier=self.identifier[-1] ) master_c_template = """//This file is automatically generated. Do not modify it. #include #include #include #include "{include_file}" class ResourceNotFound : public std::runtime_error {{ public: ResourceNotFound(const std::string& what_arg): std::runtime_error(what_arg) {{ }}; }}; static std::string init_resource(const char* name, const unsigned char* content, int len) {{ char * resPath = getenv(name); if (NULL == resPath) return std::string(reinterpret_cast<const char*>(content), len); std::ifstream ifs(resPath); if (!ifs.good()) return std::string(reinterpret_cast<const char*>(content), len); return std::string( (std::istreambuf_iterator<char>(ifs)), (std::istreambuf_iterator<char>() )); }} const std::string& getResource_{basename}(const std::string& name) {{ {RESOURCES_GETTER} throw ResourceNotFound("Resource not found."); }} {RESOURCES} """ def gen_c_file(resources, basename): return master_c_template.format( RESOURCES="\n\n".join(r.dump_impl() for r in resources), RESOURCES_GETTER="\n\n".join(r.dump_getter() for r in resources), include_file=basename, basename=to_identifier(basename) ) master_h_template = """//This file is automatically generated. Do not modify it.
#ifndef KIWIX_{BASENAME} #define KIWIX_{BASENAME} #include namespace RESOURCE {{ {RESOURCES} }}; const std::string& getResource_{basename}(const std::string& name); #define getResource(a) (getResource_{basename}(a)) #endif // KIWIX_{BASENAME} """ def gen_h_file(resources, basename): return master_h_template.format( RESOURCES="\n ".join(r.dump_decl() for r in resources), BASENAME=basename.upper(), basename=basename, ) if __name__ == "__main__": parser = argparse.ArgumentParser() parser.add_argument('--cxxfile', help='The Cpp file name to generate') parser.add_argument('--hfile', help='The h file name to generate') parser.add_argument('resource_file', help='The list of resources to compile.') args = parser.parse_args() base_dir = os.path.dirname(os.path.realpath(args.resource_file)) with open(args.resource_file, 'r') as f: resources = [Resource(base_dir, filename) for filename in f.readlines()] h_identifier = to_identifier(os.path.basename(args.hfile)) with open(args.hfile, 'w') as f: f.write(gen_h_file(resources, h_identifier)) with open(args.cxxfile, 'w') as f: f.write(gen_c_file(resources, os.path.basename(args.hfile))) libkiwix-0.2.0/scripts/meson.build000066400000000000000000000001721312445116700171700ustar00rootroot00000000000000 res_compiler = find_program('compile_resources.py') install_data(res_compiler.path(), install_dir:get_option('bindir')) libkiwix-0.2.0/src/000077500000000000000000000000001312445116700141265ustar00rootroot00000000000000libkiwix-0.2.0/src/android/000077500000000000000000000000001312445116700155465ustar00rootroot00000000000000libkiwix-0.2.0/src/android/AndroidManifest.xml000066400000000000000000000003711312445116700213400ustar00rootroot00000000000000 libkiwix-0.2.0/src/android/gen_kiwix.sh000077500000000000000000000002571312445116700200750ustar00rootroot00000000000000#!/usr/bin/env bash set -e BUILD_PATH=$(pwd) javac -d $BUILD_PATH/src/android $1 $2 $3 $4 cd $BUILD_PATH/src/android javah -jni org.kiwix.kiwixlib.JNIKiwix cd $BUILD_PATH 
libkiwix-0.2.0/src/android/kiwix.cpp000066400000000000000000000310551312445116700174110ustar00rootroot00000000000000#include #include "org_kiwix_kiwixlib_JNIKiwix.h" #include #include #include #include #include "unicode/putil.h" #include "reader.h" #include "searcher.h" #include "common/base64.h" #include #define LOGI(...) __android_log_print(ANDROID_LOG_INFO, "kiwix", __VA_ARGS__) #include #include #include #include #include /* global variables */ kiwix::Reader *reader = NULL; kiwix::Searcher *searcher = NULL; static pthread_mutex_t readerLock = PTHREAD_MUTEX_INITIALIZER; static pthread_mutex_t searcherLock = PTHREAD_MUTEX_INITIALIZER; /* c2jni type conversion functions */ jboolean c2jni(const bool &val) { return val ? JNI_TRUE : JNI_FALSE; } jstring c2jni(const std::string &val, JNIEnv *env) { return env->NewStringUTF(val.c_str()); } jint c2jni(const int val) { return (jint)val; } jint c2jni(const unsigned val) { return (jint)val; } /* jni2c type conversion functions */ bool jni2c(const jboolean &val) { return val == JNI_TRUE; } std::string jni2c(const jstring &val, JNIEnv *env) { const char *chars = env->GetStringUTFChars(val, 0); if (chars == NULL) return std::string(); std::string result(chars); /* Release the buffer obtained from the JVM to avoid leaking it */ env->ReleaseStringUTFChars(val, chars); return result; } int jni2c(const jint val) { return (int)val; } /* Method to deal with variable passed by reference */ void setStringObjValue(const std::string &value, const jobject obj, JNIEnv *env) { jclass objClass = env->GetObjectClass(obj); jfieldID objFid = env->GetFieldID(objClass, "value", "Ljava/lang/String;"); env->SetObjectField(obj, objFid, c2jni(value, env)); } void setIntObjValue(const int value, const jobject obj, JNIEnv *env) { jclass objClass = env->GetObjectClass(obj); jfieldID objFid = env->GetFieldID(objClass, "value", "I"); env->SetIntField(obj, objFid, value); } void setBoolObjValue(const bool value, const jobject obj, JNIEnv *env) { jclass objClass = env->GetObjectClass(obj); jfieldID objFid = env->GetFieldID(objClass, "value", "Z"); /* The field has signature "Z" (boolean), so it needs the boolean setter */ env->SetBooleanField(obj, objFid, c2jni(value)); } /* Kiwix library functions */ JNIEXPORT
jstring JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_getMainPage(JNIEnv *env, jobject obj) { jstring url = NULL; pthread_mutex_lock(&readerLock); if (reader != NULL) { try { std::string cUrl = reader->getMainPageUrl(); url = c2jni(cUrl, env); } catch (...) { std::cerr << "Unable to get ZIM main page" << std::endl; } } pthread_mutex_unlock(&readerLock); return url; } JNIEXPORT jstring JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_getId(JNIEnv *env, jobject obj) { jstring id = NULL; pthread_mutex_lock(&readerLock); if (reader != NULL) { try { std::string cId = reader->getId(); id = c2jni(cId, env); } catch (...) { std::cerr << "Unable to get ZIM id" << std::endl; } } pthread_mutex_unlock(&readerLock); return id; } JNIEXPORT jint JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_getFileSize(JNIEnv *env, jobject obj) { jint size = 0; pthread_mutex_lock(&readerLock); if (reader != NULL) { try { int cSize = reader->getFileSize(); size = c2jni(cSize); } catch (...) { std::cerr << "Unable to get ZIM file size" << std::endl; } } pthread_mutex_unlock(&readerLock); return size; } JNIEXPORT jstring JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_getCreator(JNIEnv *env, jobject obj) { jstring creator = NULL; pthread_mutex_lock(&readerLock); if (reader != NULL) { try { std::string cCreator = reader->getCreator(); creator = c2jni(cCreator, env); } catch (...) { std::cerr << "Unable to get ZIM creator" << std::endl; } } pthread_mutex_unlock(&readerLock); return creator; } JNIEXPORT jstring JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_getPublisher(JNIEnv *env, jobject obj) { jstring publisher = NULL; pthread_mutex_lock(&readerLock); if (reader != NULL) { try { std::string cPublisher = reader->getPublisher(); publisher = c2jni(cPublisher, env); } catch (...)
{ std::cerr << "Unable to get ZIM publisher" << std::endl; } } pthread_mutex_unlock(&readerLock); return publisher; } JNIEXPORT jstring JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_getName(JNIEnv *env, jobject obj) { jstring name = NULL; pthread_mutex_lock(&readerLock); if (reader != NULL) { try { std::string cName = reader->getName(); name = c2jni(cName, env); } catch (...) { std::cerr << "Unable to get ZIM name" << std::endl; } } pthread_mutex_unlock(&readerLock); return name; } JNIEXPORT jstring JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_getFavicon(JNIEnv *env, jobject obj) { jstring favicon = NULL; pthread_mutex_lock(&readerLock); if (reader != NULL) { try { std::string cContent; std::string cMime; reader->getFavicon(cContent, cMime); favicon = c2jni(base64_encode(reinterpret_cast<const unsigned char*>(cContent.c_str()), cContent.length()), env); } catch (...) { std::cerr << "Unable to get ZIM favicon" << std::endl; } } pthread_mutex_unlock(&readerLock); return favicon; } JNIEXPORT jstring JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_getDate(JNIEnv *env, jobject obj) { jstring date = NULL; pthread_mutex_lock(&readerLock); if (reader != NULL) { try { std::string cDate = reader->getDate(); date = c2jni(cDate, env); } catch (...) { std::cerr << "Unable to get ZIM date" << std::endl; } } pthread_mutex_unlock(&readerLock); return date; } JNIEXPORT jstring JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_getLanguage(JNIEnv *env, jobject obj) { jstring language = NULL; pthread_mutex_lock(&readerLock); if (reader != NULL) { try { std::string cLanguage = reader->getLanguage(); language = c2jni(cLanguage, env); } catch (...)
{ std::cerr << "Unable to get ZIM language" << std::endl; } } pthread_mutex_unlock(&readerLock); return language; } JNIEXPORT jstring JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_getMimeType(JNIEnv *env, jobject obj, jstring url) { jstring mimeType = NULL; pthread_mutex_lock(&readerLock); if (reader != NULL) { std::string cUrl = jni2c(url, env); try { std::string cMimeType; reader->getMimeTypeByUrl(cUrl, cMimeType); mimeType = c2jni(cMimeType, env); } catch (...) { std::cerr << "Unable to get mime-type for url " << cUrl << std::endl; } } pthread_mutex_unlock(&readerLock); return mimeType; } JNIEXPORT jboolean JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_loadZIM(JNIEnv *env, jobject obj, jstring path) { jboolean retVal = JNI_TRUE; std::string cPath = jni2c(path, env); pthread_mutex_lock(&readerLock); try { if (reader != NULL) delete reader; reader = new kiwix::Reader(cPath); } catch (...) { std::cerr << "Unable to load ZIM " << cPath << std::endl; reader = NULL; retVal = JNI_FALSE; } pthread_mutex_unlock(&readerLock); return retVal; } JNIEXPORT jbyteArray JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_getContent(JNIEnv *env, jobject obj, jstring url, jobject mimeTypeObj, jobject sizeObj) { /* Default values */ setStringObjValue("", mimeTypeObj, env); setIntObjValue(0, sizeObj, env); jbyteArray data = env->NewByteArray(0); /* Retrieve the content */ if (reader != NULL) { std::string cUrl = jni2c(url, env); std::string cData; std::string cMimeType; unsigned int cSize = 0; pthread_mutex_lock(&readerLock); try { if (reader->getContentByUrl(cUrl, cData, cSize, cMimeType)) { data = env->NewByteArray(cSize); env->SetByteArrayRegion(data, 0, cSize, reinterpret_cast<const jbyte*>(cData.c_str())); setStringObjValue(cMimeType, mimeTypeObj, env); setIntObjValue(cSize, sizeObj, env); } } catch (...)
{ std::cerr << "Unable to get content for url " << cUrl << std::endl; } pthread_mutex_unlock(&readerLock); } return data; } JNIEXPORT jboolean JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_searchSuggestions (JNIEnv *env, jobject obj, jstring prefix, jint count) { jboolean retVal = JNI_FALSE; std::string cPrefix = jni2c(prefix, env); unsigned int cCount = jni2c(count); pthread_mutex_lock(&readerLock); try { if (reader != NULL) { if (reader->searchSuggestionsSmart(cPrefix, cCount)) { retVal = JNI_TRUE; } } } catch (...) { std::cerr << "Unable to search suggestions for pattern " << cPrefix << std::endl; } pthread_mutex_unlock(&readerLock); return retVal; } JNIEXPORT jboolean JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_getNextSuggestion (JNIEnv *env, jobject obj, jobject titleObj) { jboolean retVal = JNI_FALSE; std::string cTitle; pthread_mutex_lock(&readerLock); try { if (reader != NULL) { if (reader->getNextSuggestion(cTitle)) { setStringObjValue(cTitle, titleObj, env); retVal = JNI_TRUE; } } } catch (...) { std::cerr << "Unable to get next suggestion" << std::endl; } pthread_mutex_unlock(&readerLock); return retVal; } JNIEXPORT jboolean JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_getPageUrlFromTitle (JNIEnv *env, jobject obj, jstring title, jobject urlObj) { jboolean retVal = JNI_FALSE; std::string cTitle = jni2c(title, env); std::string cUrl; pthread_mutex_lock(&readerLock); try { if (reader != NULL) { if (reader->getPageUrlFromTitle(cTitle, cUrl)) { setStringObjValue(cUrl, urlObj, env); retVal = JNI_TRUE; } } } catch (...) 
{ std::cerr << "Unable to get URL for title " << cTitle << std::endl; } pthread_mutex_unlock(&readerLock); return retVal; } JNIEXPORT jboolean JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_getTitle (JNIEnv *env , jobject obj, jobject titleObj) { jboolean retVal = JNI_FALSE; std::string cTitle; pthread_mutex_lock(&readerLock); try { if (reader != NULL) { cTitle = reader->getTitle(); setStringObjValue(cTitle, titleObj, env); retVal = JNI_TRUE; } } catch (...) { std::cerr << "Unable to get ZIM title" << std::endl; } pthread_mutex_unlock(&readerLock); return retVal; } JNIEXPORT jstring JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_getDescription(JNIEnv *env, jobject obj) { jstring description = NULL; pthread_mutex_lock(&readerLock); if (reader != NULL) { try { std::string cDescription = reader->getDescription(); description = c2jni(cDescription, env); } catch (...) { std::cerr << "Unable to get ZIM description" << std::endl; } } pthread_mutex_unlock(&readerLock); return description; } JNIEXPORT jboolean JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_getRandomPage (JNIEnv *env, jobject obj, jobject urlObj) { jboolean retVal = JNI_FALSE; std::string cUrl; pthread_mutex_lock(&readerLock); try { if (reader != NULL) { cUrl = reader->getRandomPageUrl(); setStringObjValue(cUrl, urlObj, env); retVal = JNI_TRUE; } } catch (...) { std::cerr << "Unable to get random page" << std::endl; } pthread_mutex_unlock(&readerLock); return retVal; } JNIEXPORT void JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_setDataDirectory (JNIEnv *env, jobject obj, jstring dirStr) { std::string cPath = jni2c(dirStr, env); pthread_mutex_lock(&readerLock); try { u_setDataDirectory(cPath.c_str()); } catch (...)
{ std::cerr << "Unable to set data directory " << cPath << std::endl; } pthread_mutex_unlock(&readerLock); } JNIEXPORT jboolean JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_loadFulltextIndex(JNIEnv *env, jobject obj, jstring path) { jboolean retVal = JNI_TRUE; std::string cPath = jni2c(path, env); pthread_mutex_lock(&searcherLock); try { /* Free any previously loaded searcher before replacing it */ if (searcher != NULL) delete searcher; searcher = new kiwix::Searcher(cPath, reader); } catch (...) { searcher = NULL; retVal = JNI_FALSE; std::cerr << "Unable to load full text index " << cPath << std::endl; } pthread_mutex_unlock(&searcherLock); return retVal; } JNIEXPORT jstring JNICALL Java_org_kiwix_kiwixlib_JNIKiwix_indexedQuery (JNIEnv *env, jclass obj, jstring query, jint count) { std::string cQuery = jni2c(query, env); unsigned int cCount = jni2c(count); kiwix::Result *p_result; std::string result; pthread_mutex_lock(&searcherLock); try { if (searcher != NULL) { searcher->search(cQuery, 0, cCount); while ( (p_result = searcher->getNextResult()) && !(p_result->get_title().empty()) && !(p_result->get_url().empty())) { result += p_result->get_title() + "\n"; delete p_result; } } } catch (...)
{ std::cerr << "Unable to make indexed query " << cQuery << std::endl; } pthread_mutex_unlock(&searcherLock); return env->NewStringUTF(result.c_str()); } libkiwix-0.2.0/src/android/meson.build000066400000000000000000000011011312445116700177010ustar00rootroot00000000000000 jni_generator = find_program('gen_kiwix.sh') kiwix_jni = custom_target('jni', input: ['org/kiwix/kiwixlib/JNIKiwix.java', 'org/kiwix/kiwixlib/JNIKiwixInt.java', 'org/kiwix/kiwixlib/JNIKiwixString.java', 'org/kiwix/kiwixlib/JNIKiwixBool.java'], output: ['org_kiwix_kiwixlib_JNIKiwix.h'], command:[jni_generator, '@INPUT@'] ) kiwix_sources += ['android/kiwix.cpp', kiwix_jni] install_subdir('org', install_dir: 'kiwix-lib/java') install_subdir('res', install_dir: 'kiwix-lib') install_data('AndroidManifest.xml', install_dir: 'kiwix-lib') libkiwix-0.2.0/src/android/org/000077500000000000000000000000001312445116700163355ustar00rootroot00000000000000libkiwix-0.2.0/src/android/org/kiwix/000077500000000000000000000000001312445116700174705ustar00rootroot00000000000000libkiwix-0.2.0/src/android/org/kiwix/kiwixlib/000077500000000000000000000000001312445116700213125ustar00rootroot00000000000000libkiwix-0.2.0/src/android/org/kiwix/kiwixlib/JNIKiwix.java000066400000000000000000000041111312445116700236060ustar00rootroot00000000000000/* * Copyright 2013 * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. */ package org.kiwix.kiwixlib; import org.kiwix.kiwixlib.JNIKiwixString; import org.kiwix.kiwixlib.JNIKiwixBool; import org.kiwix.kiwixlib.JNIKiwixInt; public class JNIKiwix { static { System.loadLibrary("kiwix"); } public native String getMainPage(); public native String getId(); public native String getLanguage(); public native String getMimeType(String url); public native boolean loadZIM(String path); public native boolean loadFulltextIndex(String path); public native byte[] getContent(String url, JNIKiwixString mimeType, JNIKiwixInt size); public native boolean searchSuggestions(String prefix, int count); public native boolean getNextSuggestion(JNIKiwixString title); public native boolean getPageUrlFromTitle(String title, JNIKiwixString url); public native boolean getTitle(JNIKiwixString title); public native String getDescription(); public native String getDate(); public native String getFavicon(); public native String getCreator(); public native String getPublisher(); public native String getName(); public native int getFileSize(); public native int getArticleCount(); public native int getMediaCount(); public native boolean getRandomPage(JNIKiwixString url); public native void setDataDirectory(String icuDataDir); public static native String indexedQuery(String db, int count); } libkiwix-0.2.0/src/android/org/kiwix/kiwixlib/JNIKiwixBool.java000066400000000000000000000014611312445116700244270ustar00rootroot00000000000000/* * Copyright 2013 * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. 
* * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. */ package org.kiwix.kiwixlib; public class JNIKiwixBool { public boolean value; } libkiwix-0.2.0/src/android/org/kiwix/kiwixlib/JNIKiwixInt.java000066400000000000000000000014551312445116700242710ustar00rootroot00000000000000/* * Copyright 2013 * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. */ package org.kiwix.kiwixlib; public class JNIKiwixInt { public int value; } libkiwix-0.2.0/src/android/org/kiwix/kiwixlib/JNIKiwixString.java000066400000000000000000000014621312445116700250030ustar00rootroot00000000000000/* * Copyright 2013 * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. 
* * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. */ package org.kiwix.kiwixlib; public class JNIKiwixString { public String value; } libkiwix-0.2.0/src/android/res/000077500000000000000000000000001312445116700163375ustar00rootroot00000000000000libkiwix-0.2.0/src/android/res/values/000077500000000000000000000000001312445116700176365ustar00rootroot00000000000000libkiwix-0.2.0/src/android/res/values/strings.xml000066400000000000000000000001061312445116700220460ustar00rootroot00000000000000 Kiwix Lib libkiwix-0.2.0/src/common/000077500000000000000000000000001312445116700154165ustar00rootroot00000000000000libkiwix-0.2.0/src/common/base64.cpp000066400000000000000000000072301312445116700172100ustar00rootroot00000000000000/* base64.cpp and base64.h Copyright (C) 2004-2008 René Nyffenegger This source code is provided 'as-is', without any express or implied warranty. In no event will the author be held liable for any damages arising from the use of this software. Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions: 1. The origin of this source code must not be misrepresented; you must not claim that you wrote the original source code. If you use this source code in a product, an acknowledgment in the product documentation would be appreciated but is not required. 2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original source code. 3. 
This notice may not be removed or altered from any source distribution. René Nyffenegger rene.nyffenegger@adp-gmbh.ch */ #include #include static const std::string base64_chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" "abcdefghijklmnopqrstuvwxyz" "0123456789+/"; static inline bool is_base64(unsigned char c) { return (isalnum(c) || (c == '+') || (c == '/')); } std::string base64_encode(unsigned char const* bytes_to_encode, unsigned int in_len) { std::string ret; int i = 0; int j = 0; unsigned char char_array_3[3]; unsigned char char_array_4[4]; while (in_len--) { char_array_3[i++] = *(bytes_to_encode++); if (i == 3) { char_array_4[0] = (char_array_3[0] & 0xfc) >> 2; char_array_4[1] = ((char_array_3[0] & 0x03) << 4) + ((char_array_3[1] & 0xf0) >> 4); char_array_4[2] = ((char_array_3[1] & 0x0f) << 2) + ((char_array_3[2] & 0xc0) >> 6); char_array_4[3] = char_array_3[2] & 0x3f; for(i = 0; (i <4) ; i++) ret += base64_chars[char_array_4[i]]; i = 0; } } if (i) { for(j = i; j < 3; j++) char_array_3[j] = '\0'; char_array_4[0] = (char_array_3[0] & 0xfc) >> 2; char_array_4[1] = ((char_array_3[0] & 0x03) << 4) + ((char_array_3[1] & 0xf0) >> 4); char_array_4[2] = ((char_array_3[1] & 0x0f) << 2) + ((char_array_3[2] & 0xc0) >> 6); char_array_4[3] = char_array_3[2] & 0x3f; for (j = 0; (j < i + 1); j++) ret += base64_chars[char_array_4[j]]; while((i++ < 3)) ret += '='; } return ret; } std::string base64_decode(std::string const& encoded_string) { int in_len = encoded_string.size(); int i = 0; int j = 0; int in_ = 0; unsigned char char_array_4[4], char_array_3[3]; std::string ret; while (in_len-- && ( encoded_string[in_] != '=') && is_base64(encoded_string[in_])) { char_array_4[i++] = encoded_string[in_]; in_++; if (i ==4) { for (i = 0; i <4; i++) char_array_4[i] = base64_chars.find(char_array_4[i]); char_array_3[0] = (char_array_4[0] << 2) + ((char_array_4[1] & 0x30) >> 4); char_array_3[1] = ((char_array_4[1] & 0xf) << 4) + ((char_array_4[2] & 0x3c) >> 2); char_array_3[2] = ((char_array_4[2] 
& 0x3) << 6) + char_array_4[3]; for (i = 0; (i < 3); i++) ret += char_array_3[i]; i = 0; } } if (i) { for (j = i; j <4; j++) char_array_4[j] = 0; for (j = 0; j <4; j++) char_array_4[j] = base64_chars.find(char_array_4[j]); char_array_3[0] = (char_array_4[0] << 2) + ((char_array_4[1] & 0x30) >> 4); char_array_3[1] = ((char_array_4[1] & 0xf) << 4) + ((char_array_4[2] & 0x3c) >> 2); char_array_3[2] = ((char_array_4[2] & 0x3) << 6) + char_array_4[3]; for (j = 0; (j < i - 1); j++) ret += char_array_3[j]; } return ret; } libkiwix-0.2.0/src/common/networkTools.cpp000066400000000000000000000111021312445116700206270ustar00rootroot00000000000000/* * Copyright 2012 Emmanuel Engelhart * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. */ #include std::map kiwix::getNetworkInterfaces() { std::map interfaces; #ifdef _WIN32 SOCKET sd = WSASocket(AF_INET, SOCK_DGRAM, 0, 0, 0, 0); if (sd == SOCKET_ERROR) { std::cerr << "Failed to get a socket. 
Error " << WSAGetLastError() << std::endl; return interfaces; } INTERFACE_INFO InterfaceList[20]; unsigned long nBytesReturned; if (WSAIoctl(sd, SIO_GET_INTERFACE_LIST, 0, 0, &InterfaceList, sizeof(InterfaceList), &nBytesReturned, 0, 0) == SOCKET_ERROR) { std::cerr << "Failed calling WSAIoctl: error " << WSAGetLastError() << std::endl; return interfaces; } int nNumInterfaces = nBytesReturned / sizeof(INTERFACE_INFO); for (int i = 0; i < nNumInterfaces; ++i) { sockaddr_in *pAddress; pAddress = (sockaddr_in *) & (InterfaceList[i].iiAddress); /* Add to the map */ std::string interfaceName = std::string(inet_ntoa(pAddress->sin_addr)); std::string interfaceIp = std::string(inet_ntoa(pAddress->sin_addr)); interfaces.insert(std::pair(interfaceName, interfaceIp)); } #else /* Get Network interfaces information */ char buf[16384]; struct ifconf ifconf; int fd = socket(PF_INET, SOCK_DGRAM, 0); /* Only IPV4 */ ifconf.ifc_len=sizeof buf; ifconf.ifc_buf=buf; if(ioctl(fd, SIOCGIFCONF, &ifconf)!=0) { perror("ioctl(SIOCGIFCONF)"); exit(EXIT_FAILURE); } /* Go through each interface */ int i; size_t len; struct ifreq *ifreq; ifreq = ifconf.ifc_req; for (i = 0; i < ifconf.ifc_len; ) { if (ifreq->ifr_addr.sa_family == AF_INET) { /* Get the network interface ip */ char host[128] = { 0 }; const int error = getnameinfo(&(ifreq->ifr_addr), sizeof ifreq->ifr_addr, host, sizeof host, 0, 0, NI_NUMERICHOST); if (!error) { std::string interfaceName = std::string(ifreq->ifr_name); std::string interfaceIp = std::string(host); /* Add to the map */ interfaces.insert(std::pair(interfaceName, interfaceIp)); } else { perror("getnameinfo()"); } } /* some systems have ifr_addr.sa_len and adjust the length that * way, but not mine. 
weird */ #ifndef __linux__ len=IFNAMSIZ + ifreq->ifr_addr.sa_len; #else len=sizeof *ifreq; #endif ifreq=(struct ifreq*)((char*)ifreq+len); i+=len; } #endif return interfaces; } std::string kiwix::getBestPublicIp() { std::map interfaces = kiwix::getNetworkInterfaces(); #ifndef _WIN32 const char* const prioritizedNames[] = { "eth0", "eth1", "wlan0", "wlan1", "en0", "en1" }; const int count = (sizeof prioritizedNames) / (sizeof prioritizedNames[0]); for (int i = 0; i < count; ++i) { std::map::const_iterator it = interfaces.find(prioritizedNames[i]); if (it != interfaces.end()) return it->second; } #endif for (std::map::iterator iter = interfaces.begin(); iter != interfaces.end(); ++iter) { std::string interfaceIp = iter->second; if (interfaceIp.length() >= 7 && interfaceIp.substr(0, 7) == "192.168") return interfaceIp; } for (std::map::iterator iter = interfaces.begin(); iter != interfaces.end(); ++iter) { std::string interfaceIp = iter->second; if (interfaceIp.length() >= 7 && interfaceIp.substr(0, 7) == "172.16.") return interfaceIp; } for (std::map::iterator iter = interfaces.begin(); iter != interfaces.end(); ++iter) { std::string interfaceIp = iter->second; if (interfaceIp.length() >= 3 && interfaceIp.substr(0, 3) == "10.") return interfaceIp; } return "127.0.0.1"; } libkiwix-0.2.0/src/common/otherTools.cpp000066400000000000000000000016551312445116700202730ustar00rootroot00000000000000/* * Copyright 2014 Emmanuel Engelhart * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. */ #include void kiwix::sleep(unsigned int milliseconds) { #ifdef _WIN32 Sleep(milliseconds); #else usleep(1000 * milliseconds); #endif } libkiwix-0.2.0/src/common/pathTools.cpp000066400000000000000000000143351312445116700201050ustar00rootroot00000000000000/* * Copyright 2011-2014 Emmanuel Engelhart * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. */ #include #ifdef __APPLE__ #include #include #elif _WIN32 #include #include "shlwapi.h" #include #define getcwd _getcwd // stupid MSFT "deprecation" warning #endif #ifdef _WIN32 #else #include #endif #ifdef _WIN32 #define SEPARATOR "\\" #else #define SEPARATOR "/" #include #endif #include #ifndef PATH_MAX #define PATH_MAX 1024 #endif bool isRelativePath(const string &path) { #ifdef _WIN32 return path.empty() || path.substr(1, 2) == ":\\" ? false : true; #else return path.empty() || path.substr(0, 1) == "/" ? 
false : true; #endif } string computeRelativePath(const string path, const string absolutePath) { std::vector pathParts = kiwix::split(path, SEPARATOR); std::vector absolutePathParts = kiwix::split(absolutePath, SEPARATOR); unsigned int commonCount = 0; while (commonCount < pathParts.size() && commonCount < absolutePathParts.size() && pathParts[commonCount] == absolutePathParts[commonCount]) { if (!pathParts[commonCount].empty()) { commonCount++; } } string relativePath; #ifdef _WIN32 /* On Windows you have a token more because the root is represented by a letter */ if (commonCount == 0) { relativePath = "../"; } #endif for (unsigned int i = commonCount ; i < pathParts.size() ; i++) { relativePath += "../"; } for (unsigned int i = commonCount ; i < absolutePathParts.size() ; i++) { relativePath += absolutePathParts[i]; relativePath += i + 1 < absolutePathParts.size() ? "/" : ""; } return relativePath; } /* Warning: the relative path must be with slashes */ string computeAbsolutePath(const string path, const string relativePath) { string absolutePath; if (path.empty()) { char *path=NULL; size_t size = 0; #ifdef _WIN32 path = _getcwd(path, size); #else path = getcwd(path, size); #endif absolutePath = string(path) + SEPARATOR; } else { absolutePath = path.substr(path.length() - 1, 1) == SEPARATOR ? 
path : path + SEPARATOR; } #if _WIN32 char *cRelativePath = _strdup(relativePath.c_str()); #else char *cRelativePath = strdup(relativePath.c_str()); #endif char *token = strtok(cRelativePath, "/"); while (token != NULL) { if (string(token) == "..") { absolutePath = removeLastPathElement(absolutePath, true, false); token = strtok(NULL, "/"); } else if (strcmp(token, ".") && strcmp(token, "")) { absolutePath += string(token); token = strtok(NULL, "/"); if (token != NULL) absolutePath += SEPARATOR; } else { token = strtok(NULL, "/"); } } return absolutePath; } string removeLastPathElement(const string path, const bool removePreSeparator, const bool removePostSeparator) { string newPath = path; size_t offset = newPath.find_last_of(SEPARATOR); if (removePreSeparator && #ifndef _WIN32 offset != newPath.find_first_of(SEPARATOR) && #endif offset == newPath.length()-1) { newPath = newPath.substr(0, offset); offset = newPath.find_last_of(SEPARATOR); } newPath = removePostSeparator ? newPath.substr(0, offset) : newPath.substr(0, offset+1); return newPath; } string appendToDirectory(const string &directoryPath, const string &filename) { string newPath = directoryPath + SEPARATOR + filename; return newPath; } string getLastPathElement(const string &path) { return path.substr(path.find_last_of(SEPARATOR) + 1); } unsigned int getFileSize(const string &path) { #ifdef _WIN32 struct _stat filestatus; _stat(path.c_str(), &filestatus); #else struct stat filestatus; stat(path.c_str(), &filestatus); #endif return filestatus.st_size / 1024; } string getFileSizeAsString(const string &path) { ostringstream convert; convert << getFileSize(path); return convert.str(); } bool fileExists(const string &path) { #ifdef _WIN32 return PathFileExists(path.c_str()); #else bool flag = false; fstream fin; fin.open(path.c_str(), ios::in); if (fin.is_open()) { flag = true; } fin.close(); return flag; #endif } bool makeDirectory(const string &path) { #ifdef _WIN32 int status = _mkdir(path.c_str()); #else 
int status = mkdir(path.c_str(), S_IRWXU | S_IRWXG | S_IROTH | S_IXOTH); #endif return status == 0; } /* Try to create a link and if does not work then make a copy */ bool copyFile(const string &sourcePath, const string &destPath) { try { #ifndef _WIN32 if (link(sourcePath.c_str(), destPath.c_str()) != 0) { #endif std::ifstream infile(sourcePath.c_str(), std::ios_base::binary); std::ofstream outfile(destPath.c_str(), std::ios_base::binary); outfile << infile.rdbuf(); #ifndef _WIN32 } #endif } catch (exception &e) { cerr << e.what() << endl; return false; } return true; } string getExecutablePath() { char binRootPath[PATH_MAX]; #ifdef _WIN32 GetModuleFileName( NULL, binRootPath, PATH_MAX); return std::string(binRootPath); #elif __APPLE__ uint32_t max = (uint32_t)PATH_MAX; _NSGetExecutablePath(binRootPath, &max); return std::string(binRootPath); #else ssize_t size = readlink("/proc/self/exe", binRootPath, PATH_MAX); if (size != -1) { return std::string(binRootPath, size); } #endif return ""; } bool writeTextFile(const string &path, const string &content) { std::ofstream file; file.open(path.c_str()); file << content; file.close(); return true; } string getCurrentDirectory() { char* a_cwd = getcwd(NULL,0); string s_cwd(a_cwd); free(a_cwd); return s_cwd; } libkiwix-0.2.0/src/common/regexTools.cpp000066400000000000000000000053301312445116700202560ustar00rootroot00000000000000/* * Copyright 2011 Emmanuel Engelhart * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
 * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. */ #include std::map<std::string, RegexMatcher *> regexCache; RegexMatcher *buildRegex(const std::string &regex) { RegexMatcher *matcher; std::map<std::string, RegexMatcher *>::iterator itr = regexCache.find(regex); /* Regex is in cache */ if (itr != regexCache.end()) { matcher = itr->second; } /* Regex needs to be parsed (and cached) */ else { UErrorCode status = U_ZERO_ERROR; UnicodeString uregex = UnicodeString(regex.c_str()); matcher = new RegexMatcher(uregex, UREGEX_CASE_INSENSITIVE, status); regexCache[regex] = matcher; } return matcher; } /* todo */ void freeRegexCache() { } bool matchRegex(const std::string &content, const std::string &regex) { ucnv_setDefaultName("UTF-8"); UnicodeString ucontent = UnicodeString(content.c_str()); RegexMatcher *matcher = buildRegex(regex); matcher->reset(ucontent); return matcher->find(); } std::string replaceRegex(const std::string &content, const std::string &replacement, const std::string &regex) { ucnv_setDefaultName("UTF-8"); UnicodeString ucontent = UnicodeString(content.c_str()); UnicodeString ureplacement = UnicodeString(replacement.c_str()); RegexMatcher *matcher = buildRegex(regex); matcher->reset(ucontent); UErrorCode status = U_ZERO_ERROR; UnicodeString uresult = matcher->replaceAll(ureplacement, status); std::string tmp; uresult.toUTF8String(tmp); return tmp; } std::string appendToFirstOccurence(const std::string &content, const std::string regex, const std::string &replacement) { ucnv_setDefaultName("UTF-8"); UnicodeString ucontent = UnicodeString(content.c_str()); UnicodeString ureplacement = UnicodeString(replacement.c_str()); RegexMatcher *matcher = buildRegex(regex); matcher->reset(ucontent); if (matcher->find()) { UErrorCode status = U_ZERO_ERROR; ucontent.insert(matcher->end(status), ureplacement); std::string tmp; ucontent.toUTF8String(tmp); return tmp; } 
return content; } libkiwix-0.2.0/src/common/stringTools.cpp000066400000000000000000000166421312445116700204620ustar00rootroot00000000000000/* * Copyright 2011 Emmanuel Engelhart * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. */ #include #include #include #include #include #include #include /* tell ICU where to find its dat file (tables) */ void kiwix::loadICUExternalTables() { #ifdef __APPLE__ std::string executablePath = getExecutablePath(); std::string executableDirectory = removeLastPathElement(executablePath); std::string datPath = computeAbsolutePath(executableDirectory, "icudt49l.dat"); try { u_setDataDirectory(datPath.c_str()); } catch (exception &e) { std::cerr << e.what() << std::endl; } #endif } std::string kiwix::removeAccents(const std::string &text) { loadICUExternalTables(); ucnv_setDefaultName("UTF-8"); UErrorCode status = U_ZERO_ERROR; Transliterator *removeAccentsTrans = Transliterator::createInstance("Lower; NFD; [:M:] remove; NFC", UTRANS_FORWARD, status); UnicodeString ustring = UnicodeString(text.c_str()); removeAccentsTrans->transliterate(ustring); delete removeAccentsTrans; std::string unaccentedText; ustring.toUTF8String(unaccentedText); return unaccentedText; } #ifndef __ANDROID__ /* Prepare integer for display */ std::string kiwix::beautifyInteger(const unsigned int number) { std::stringstream numberStream; 
numberStream << number; std::string numberString = numberStream.str(); signed int offset = numberString.size() - 3; while (offset > 0) { numberString.insert(offset, ","); offset -= 3; } return numberString; } std::string kiwix::beautifyFileSize(const unsigned int number) { if (number > 1024*1024) { return kiwix::beautifyInteger(number/(1024*1024)) + " GB"; } else { return kiwix::beautifyInteger(number/1024 != 0 ? number/1024 : 1) + " MB"; } } void kiwix::printStringInHexadecimal(UnicodeString s) { std::cout << std::showbase << std::hex; for (int i=0; i", ">"); return result; } // Urlencode //based on javascript encodeURIComponent() std::string char2hex(char dec) { char dig1 = (dec&0xF0)>>4; char dig2 = (dec&0x0F); if ( 0<= dig1 && dig1<= 9) dig1+=48; //0,48inascii if (10<= dig1 && dig1<=15) dig1+=97-10; //a,97inascii if ( 0<= dig2 && dig2<= 9) dig2+=48; if (10<= dig2 && dig2<=15) dig2+=97-10; std::string r; r.append( &dig1, 1); r.append( &dig2, 1); return r; } std::string kiwix::urlEncode(const std::string &c) { std::string escaped=""; int max = c.length(); for(int i=0; i> std::hex >> Z; return char (Z); } std::string kiwix::urlDecode(const std::string &originalUrl) { std::string url = originalUrl; std::string::size_type pos = 0; while ((pos = url.find('%', pos)) != std::string::npos && pos + 2 < url.length()) { url.replace(pos, 3, 1, charFromHex(url.substr(pos + 1, 2))); ++pos; } return url; } /* Split string in a token array */ std::vector kiwix::split(const std::string & str, const std::string & delims=" *-") { std::string::size_type lastPos = str.find_first_not_of(delims, 0); std::string::size_type pos = str.find_first_of(delims, lastPos); std::vector tokens; while (std::string::npos != pos || std::string::npos != lastPos) { tokens.push_back(str.substr(lastPos, pos - lastPos)); lastPos = str.find_first_not_of(delims, pos); pos = str.find_first_of(delims, lastPos); } return tokens; } std::vector kiwix::split(const char* lhs, const char* rhs){ const std::string 
m1 (lhs), m2 (rhs); return split(m1, m2); } std::vector<std::string> kiwix::split(const char* lhs, const std::string& rhs){ return split(lhs, rhs.c_str()); } std::vector<std::string> kiwix::split(const std::string& lhs, const char* rhs){ return split(lhs.c_str(), rhs); } std::string kiwix::ucFirst (const std::string &word) { if (word.empty()) return ""; std::string result; UnicodeString unicodeWord(word.c_str()); UnicodeString unicodeFirstLetter = UnicodeString(unicodeWord, 0, 1).toUpper(); unicodeWord.replace(0, 1, unicodeFirstLetter); unicodeWord.toUTF8String(result); return result; } std::string kiwix::ucAll (const std::string &word) { if (word.empty()) return ""; std::string result; UnicodeString unicodeWord(word.c_str()); unicodeWord.toUpper().toUTF8String(result); return result; } std::string kiwix::lcFirst (const std::string &word) { if (word.empty()) return ""; std::string result; UnicodeString unicodeWord(word.c_str()); UnicodeString unicodeFirstLetter = UnicodeString(unicodeWord, 0, 1).toLower(); unicodeWord.replace(0, 1, unicodeFirstLetter); unicodeWord.toUTF8String(result); return result; } std::string kiwix::lcAll (const std::string &word) { if (word.empty()) return ""; std::string result; UnicodeString unicodeWord(word.c_str()); unicodeWord.toLower().toUTF8String(result); return result; } std::string kiwix::toTitle (const std::string &word) { if (word.empty()) return ""; std::string result; UnicodeString unicodeWord(word.c_str()); unicodeWord = unicodeWord.toTitle(0); unicodeWord.toUTF8String(result); return result; } std::string kiwix::normalize (const std::string &word) { return kiwix::lcAll(word); } libkiwix-0.2.0/src/config.h.in000066400000000000000000000000611312445116700161460ustar00rootroot00000000000000 #mesondefine VERSION #mesondefine ENABLE_CTPP2 
libkiwix-0.2.0/src/ctpp2/000077500000000000000000000000001312445116700151565ustar00rootroot00000000000000libkiwix-0.2.0/src/ctpp2/CTPP2VMStringLoader.cpp000066400000000000000000000142671312445116700213050ustar00rootroot00000000000000/* * Copyright 2013 Renaud Gaudin * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. */ #include namespace CTPP // C++ Template Engine { // // Convert byte order // static void ConvertExecutable(VMExecutable * oCore) { // Code entry point oCore -> entry_point = Swap32(oCore -> entry_point); // Offset of code segment oCore -> code_offset = Swap32(oCore -> code_offset); // Code segment size oCore -> code_size = Swap32(oCore -> code_size); // Offset of static text segment oCore -> syscalls_offset = Swap32(oCore -> syscalls_offset); // Static text segment size oCore -> syscalls_data_size = Swap32(oCore -> syscalls_data_size); // Offset of static text index segment oCore -> syscalls_index_offset = Swap32(oCore -> syscalls_index_offset); // Static text index segment size oCore -> syscalls_index_size = Swap32(oCore -> syscalls_index_size); // Offset of static data segment oCore -> static_data_offset = Swap32(oCore -> static_data_offset); // Static data segment size oCore -> static_data_data_size = Swap32(oCore -> static_data_data_size); // Offset of static text segment oCore -> static_text_offset = Swap32(oCore -> 
static_text_offset); // Static text segment size oCore -> static_text_data_size = Swap32(oCore -> static_text_data_size); // Offset of static text index segment oCore -> static_text_index_offset = Swap32(oCore -> static_text_index_offset); // Static text index segment size oCore -> static_text_index_size = Swap32(oCore -> static_text_index_size); // Version 2.2+ // Offset of static data bit index oCore -> static_data_bit_index_offset = Swap32(oCore -> static_data_bit_index_offset); /// Offset of static data bit index oCore -> static_data_bit_index_size = Swap32(oCore -> static_data_bit_index_size); // Platform oCore -> platform = Swap64(oCore -> platform); // Ugly-jolly hack! // ... dereferencing type-punned pointer will break strict-aliasing rules ... UINT_64 iTMP; memcpy(&iTMP, &(oCore -> ieee754double), sizeof(UINT_64)); iTMP = Swap64(iTMP); memcpy(&(oCore -> ieee754double), &iTMP, sizeof(UINT_64)); // Cyclic Redundancy Check oCore -> crc = 0; // Convert data structures // Convert code segment VMInstruction * pInstructions = const_cast(VMExecutable::GetCodeSeg(oCore)); UINT_32 iI = 0; UINT_32 iSteps = oCore -> code_size / sizeof(VMInstruction); for(iI = 0; iI < iSteps; ++iI) { pInstructions -> instruction = Swap32(pInstructions -> instruction); pInstructions -> argument = Swap32(pInstructions -> argument); pInstructions -> reserved = Swap64(pInstructions -> reserved); ++pInstructions; } // Convert syscalls index TextDataIndex * pTextIndex = const_cast(VMExecutable::GetSyscallsIndexSeg(oCore)); iSteps = oCore -> syscalls_index_size / sizeof(TextDataIndex); for(iI = 0; iI < iSteps; ++iI) { pTextIndex -> offset = Swap32(pTextIndex -> offset); pTextIndex -> length = Swap32(pTextIndex -> length); ++pTextIndex; } // Convert static text index pTextIndex = const_cast(VMExecutable::GetStaticTextIndexSeg(oCore)); iSteps = oCore -> static_text_index_size / sizeof(TextDataIndex); for(iI = 0; iI < iSteps; ++iI) { pTextIndex -> offset = Swap32(pTextIndex -> offset); 
pTextIndex -> length = Swap32(pTextIndex -> length); ++pTextIndex; } // Convert static data StaticDataVar * pStaticDataVar = const_cast(VMExecutable::GetStaticDataSeg(oCore)); iSteps = oCore -> static_data_data_size / sizeof(StaticDataVar); for(iI = 0; iI < iSteps; ++iI) { (*pStaticDataVar).i_data = Swap64((*pStaticDataVar).i_data); ++pStaticDataVar; } } // // Constructor // VMStringLoader::VMStringLoader(CCHAR_P rawContent, size_t rawContentSize) { oCore = (VMExecutable *)malloc(rawContentSize + 1); memcpy(oCore, rawContent, rawContentSize); if (oCore -> magic[0] == 'C' && oCore -> magic[1] == 'T' && oCore -> magic[2] == 'P' && oCore -> magic[3] == 'P') { // Check version if (oCore -> version[0] >= 1) { // Platform-dependent data (byte order) if (oCore -> platform == 0x4142434445464748ull) { #ifdef _DEBUG fprintf(stderr, "Big/Little Endian conversion: Nothing to do\n"); #endif // Nothing to do, only check crc UINT_32 iCRC = oCore -> crc; oCore -> crc = 0; // Calculate CRC of file // KELSON: next line used to refer to oStat.st_size // changed it to rawContentSize if (iCRC != crc32((UCCHAR_P)oCore, rawContentSize)) { free(oCore); throw CTPPLogicError("CRC checksum invalid"); } } // Platform-dependent data (byte order) else if (oCore -> platform == 0x4847464544434241ull) { // Need to reconvert data #ifdef _DEBUG fprintf(stderr, "Big/Little Endian conversion: Need to reconvert core\n"); #endif ConvertExecutable(oCore); } else { free(oCore); throw CTPPLogicError("Conversion of middle-end architecture does not supported."); } // Check IEEE 754 format if (oCore -> ieee754double != 15839800103804824402926068484019465486336.0) { free(oCore); throw CTPPLogicError("IEEE 754 format is broken, cannot convert file"); } } pVMMemoryCore = new VMMemoryCore(oCore); } else { free(oCore); throw CTPPLogicError("Not an CTPP bytecode file."); } } // // Get ready-to-run program // const VMMemoryCore * VMStringLoader::GetCore() const { return pVMMemoryCore; } // // A destructor // 
VMStringLoader::~VMStringLoader() throw() { delete pVMMemoryCore; free(oCore); } } // namespace CTPP // End. libkiwix-0.2.0/src/library.cpp000066400000000000000000000071141312445116700163010ustar00rootroot00000000000000/* * Copyright 2011 Emmanuel Engelhart * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. */ #include "library.h" namespace kiwix { /* Constructor */ Book::Book(): readOnly(false) { } /* Destructor */ Book::~Book() { } /* Sort functions */ bool Book::sortByLastOpen(const kiwix::Book &a, const kiwix::Book &b) { return atoi(a.last.c_str()) > atoi(b.last.c_str()); } bool Book::sortByTitle(const kiwix::Book &a, const kiwix::Book &b) { return strcmp(a.title.c_str(), b.title.c_str()) < 0; } bool Book::sortByDate(const kiwix::Book &a, const kiwix::Book &b) { return strcmp(a.date.c_str(), b.date.c_str()) > 0; } bool Book::sortBySize(const kiwix::Book &a, const kiwix::Book &b) { return atoi(a.size.c_str()) < atoi(b.size.c_str()); } bool Book::sortByPublisher(const kiwix::Book &a, const kiwix::Book &b) { return strcmp(a.publisher.c_str(), b.publisher.c_str()) < 0; } bool Book::sortByCreator(const kiwix::Book &a, const kiwix::Book &b) { return strcmp(a.creator.c_str(), b.creator.c_str()) < 0; } bool Book::sortByLanguage(const kiwix::Book &a, const kiwix::Book &b) { return strcmp(a.language.c_str(), b.language.c_str()) < 0; } 
std::string Book::getHumanReadableIdFromPath() { std::string id = pathAbsolute; if (!id.empty()) { /* removeAccents() returns the transformed string, so keep the result */ id = kiwix::removeAccents(id); #ifdef _WIN32 id = replaceRegex(id, "", "^.*\\\\"); #else id = replaceRegex(id, "", "^.*/"); #endif id = replaceRegex(id, "", "\\.zim[a-z]*$"); id = replaceRegex(id, "_", " "); id = replaceRegex(id, "plus", "\\+"); } return id; } /* Constructor */ Library::Library(): version(KIWIX_LIBRARY_VERSION) { } /* Destructor */ Library::~Library() { } bool Library::addBook(const Book &book) { /* Try to find it */ std::vector<kiwix::Book>::iterator itr; for ( itr = this->books.begin(); itr != this->books.end(); ++itr ) { if (itr->id == book.id) { if (!itr->readOnly) { itr->readOnly = book.readOnly; if (itr->path.empty()) itr->path = book.path; if (itr->pathAbsolute.empty()) itr->pathAbsolute = book.pathAbsolute; if (itr->url.empty()) itr->url = book.url; if (itr->tags.empty()) itr->tags = book.tags; if (itr->name.empty()) itr->name = book.name; if (itr->indexPath.empty()) { itr->indexPath = book.indexPath; itr->indexType = book.indexType; } if (itr->indexPathAbsolute.empty()) { itr->indexPathAbsolute = book.indexPathAbsolute; itr->indexType = book.indexType; } if (itr->faviconMimeType.empty()) { itr->favicon = book.favicon; itr->faviconMimeType = book.faviconMimeType; } } return false; } } /* otherwise */ this->books.push_back(book); return true; } bool Library::removeBookByIndex(const unsigned int bookIndex) { books.erase(books.begin()+bookIndex); return true; } } libkiwix-0.2.0/src/manager.cpp000066400000000000000000000427171312445116700162570ustar00rootroot00000000000000/* * Copyright 2011 Emmanuel Engelhart * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. 
* * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. */ #include "manager.h" namespace kiwix { /* Constructor */ Manager::Manager() : writableLibraryPath("") { } /* Destructor */ Manager::~Manager() { } bool Manager::parseXmlDom(const pugi::xml_document &doc, const bool readOnly, const string libraryPath) { pugi::xml_node libraryNode = doc.child("library"); if (strlen(libraryNode.attribute("current").value())) this->setCurrentBookId(libraryNode.attribute("current").value()); string libraryVersion = libraryNode.attribute("version").value(); for (pugi::xml_node bookNode = libraryNode.child("book"); bookNode; bookNode = bookNode.next_sibling("book")) { bool ok = true; kiwix::Book book; book.readOnly = readOnly; book.id = bookNode.attribute("id").value(); book.path = bookNode.attribute("path").value(); book.last = (std::string(bookNode.attribute("last").value()) != "undefined" ? 
bookNode.attribute("last").value() : ""); book.indexPath = bookNode.attribute("indexPath").value(); book.indexType = XAPIAN; book.title = bookNode.attribute("title").value(); book.name = bookNode.attribute("name").value(); book.tags = bookNode.attribute("tags").value(); book.description = bookNode.attribute("description").value(); book.language = bookNode.attribute("language").value(); book.date = bookNode.attribute("date").value(); book.creator = bookNode.attribute("creator").value(); book.publisher = bookNode.attribute("publisher").value(); book.url = bookNode.attribute("url").value(); book.origId = bookNode.attribute("origId").value(); book.articleCount = bookNode.attribute("articleCount").value(); book.mediaCount = bookNode.attribute("mediaCount").value(); book.size = bookNode.attribute("size").value(); book.favicon = bookNode.attribute("favicon").value(); book.faviconMimeType = bookNode.attribute("faviconMimeType").value(); /* Check absolute and relative paths */ this->checkAndCleanBookPaths(book, libraryPath); /* Update the book properties with the new importer */ if (libraryVersion.empty() || atoi(libraryVersion.c_str()) <= atoi(KIWIX_LIBRARY_VERSION)) { if (!book.path.empty()) { ok = this->readBookFromPath(book.pathAbsolute); } } if (ok) { library.addBook(book); } } return true; } bool Manager::readXml(const string xml, const bool readOnly, const string libraryPath) { pugi::xml_document doc; pugi::xml_parse_result result = doc.load_buffer_inplace((void*)xml.data(), xml.size()); if (result) { this->parseXmlDom(doc, readOnly, libraryPath); } return true; } bool Manager::readFile(const string path, const bool readOnly) { return this->readFile(path, path, readOnly); } bool Manager::readFile(const string nativePath, const string UTF8Path, const bool readOnly) { bool retVal = true; pugi::xml_document doc; pugi::xml_parse_result result = doc.load_file(nativePath.c_str()); if (result) { this->parseXmlDom(doc, readOnly, UTF8Path); } else { retVal = false; } /* This 
has to be set (although if the file does not exists) to be * able to know where to save the library if new content are * available */ if (!readOnly) { this->writableLibraryPath = UTF8Path; } return retVal; } bool Manager::writeFile(const string path) { pugi::xml_document doc; /* Add the library node */ pugi::xml_node libraryNode = doc.append_child("library"); if (!getCurrentBookId().empty()) { libraryNode.append_attribute("current") = getCurrentBookId().c_str(); } if (!library.version.empty()) libraryNode.append_attribute("version") = library.version.c_str(); /* Add each book */ std::vector::iterator itr; for ( itr = library.books.begin(); itr != library.books.end(); ++itr ) { if (!itr->readOnly) { this->checkAndCleanBookPaths(*itr, path); pugi::xml_node bookNode = libraryNode.append_child("book"); bookNode.append_attribute("id") = itr->id.c_str(); if (!itr->path.empty()) bookNode.append_attribute("path") = itr->path.c_str(); if (!itr->last.empty() && itr->last != "undefined") { bookNode.append_attribute("last") = itr->last.c_str(); } if (!itr->indexPath.empty()) bookNode.append_attribute("indexPath") = itr->indexPath.c_str(); if (!itr->indexPath.empty() || !itr->indexPathAbsolute.empty()) { if (itr->indexType == XAPIAN) bookNode.append_attribute("indexType") = "xapian"; } if (itr->origId.empty()) { if (!itr->title.empty()) bookNode.append_attribute("title") = itr->title.c_str(); if (!itr->name.empty()) bookNode.append_attribute("name") = itr->name.c_str(); if (!itr->tags.empty()) bookNode.append_attribute("tags") = itr->tags.c_str(); if (!itr->description.empty()) bookNode.append_attribute("description") = itr->description.c_str(); if (!itr->language.empty()) bookNode.append_attribute("language") = itr->language.c_str(); if (!itr->creator.empty()) bookNode.append_attribute("creator") = itr->creator.c_str(); if (!itr->publisher.empty()) bookNode.append_attribute("publisher") = itr->publisher.c_str(); if (!itr->favicon.empty()) bookNode.append_attribute("favicon") = 
itr->favicon.c_str(); if (!itr->faviconMimeType.empty()) bookNode.append_attribute("faviconMimeType") = itr->faviconMimeType.c_str(); } if (!itr->date.empty()) bookNode.append_attribute("date") = itr->date.c_str(); if (!itr->url.empty()) bookNode.append_attribute("url") = itr->url.c_str(); if (!itr->origId.empty()) bookNode.append_attribute("origId") = itr->origId.c_str(); if (!itr->articleCount.empty()) bookNode.append_attribute("articleCount") = itr->articleCount.c_str(); if (!itr->mediaCount.empty()) bookNode.append_attribute("mediaCount") = itr->mediaCount.c_str(); if (!itr->size.empty()) bookNode.append_attribute("size") = itr->size.c_str(); } } /* saving file */ doc.save_file(path.c_str()); return true; } bool Manager::setCurrentBookId(const string id) { if (library.current.empty() || library.current.top() != id) { if (id.empty() && !library.current.empty()) library.current.pop(); else library.current.push(id); } return true; } string Manager::getCurrentBookId() { return library.current.empty() ? "" : library.current.top(); } /* Add a book to the library. Return empty string if failed, book id otherwise */ string Manager::addBookFromPathAndGetId(const string pathToOpen, const string pathToSave, const string url, const bool checkMetaData) { kiwix::Book book; if (this->readBookFromPath(pathToOpen, &book)) { if (pathToSave != pathToOpen) { book.path = pathToSave; book.pathAbsolute = isRelativePath(pathToSave) ? 
computeAbsolutePath(removeLastPathElement(writableLibraryPath, true, false), pathToSave) : pathToSave; } if (!checkMetaData || (checkMetaData && !book.title.empty() && !book.language.empty() && !book.date.empty())) { book.url = url; library.addBook(book); return book.id; } } return ""; } /* Wrapper over Manager::addBookFromPath which return a bool instead of a string */ bool Manager::addBookFromPath(const string pathToOpen, const string pathToSave, const string url, const bool checkMetaData) { return !(this->addBookFromPathAndGetId(pathToOpen, pathToSave, url, checkMetaData).empty()); } bool Manager::readBookFromPath(const string path, kiwix::Book *book) { try { kiwix::Reader *reader = new kiwix::Reader(path); if (book != NULL) { book->path = path; book->pathAbsolute = path; book->id = reader->getId(); book->description = reader->getDescription(); book->language = reader->getLanguage(); book->date = reader->getDate(); book->creator = reader->getCreator(); book->publisher = reader->getPublisher(); book->title = reader->getTitle(); book->name = reader->getName(); book->tags = reader->getTags(); book->origId = reader->getOrigId(); std::ostringstream articleCountStream; articleCountStream << reader->getArticleCount(); book->articleCount = articleCountStream.str(); std::ostringstream mediaCountStream; mediaCountStream << reader->getMediaCount(); book->mediaCount = mediaCountStream.str(); ostringstream convert; convert << reader->getFileSize(); book->size = convert.str(); string favicon; string faviconMimeType; if (reader->getFavicon(favicon, faviconMimeType)) { book->favicon = base64_encode(reinterpret_cast(favicon.c_str()), favicon.length()); book->faviconMimeType = faviconMimeType; } } delete reader; } catch (const std::exception& e) { std::cerr << e.what() << std::endl; return false; } return true; } bool Manager::removeBookByIndex(const unsigned int bookIndex) { return this->library.removeBookByIndex(bookIndex); } bool Manager::removeBookById(const string id) { 
unsigned int bookIndex = 0; std::vector<kiwix::Book>::iterator itr; for ( itr = library.books.begin(); itr != library.books.end(); ++itr ) { if ( itr->id == id) { return this->library.removeBookByIndex(bookIndex); } bookIndex++; } return false; } vector<string> Manager::getBooksLanguages() { std::vector<std::string> booksLanguages; std::vector<kiwix::Book>::iterator itr; std::map<std::string, bool> booksLanguagesMap; std::sort(library.books.begin(), library.books.end(), kiwix::Book::sortByLanguage); for (itr = library.books.begin(); itr != library.books.end(); ++itr) { if (booksLanguagesMap.find(itr->language) == booksLanguagesMap.end()) { if (itr->origId.empty()) { booksLanguagesMap[itr->language] = true; booksLanguages.push_back(itr->language); } } } return booksLanguages; } vector<string> Manager::getBooksCreators() { std::vector<std::string> booksCreators; std::vector<kiwix::Book>::iterator itr; std::map<std::string, bool> booksCreatorsMap; std::sort(library.books.begin(), library.books.end(), kiwix::Book::sortByCreator); for (itr = library.books.begin(); itr != library.books.end(); ++itr) { if (booksCreatorsMap.find(itr->creator) == booksCreatorsMap.end()) { if (itr->origId.empty()) { booksCreatorsMap[itr->creator] = true; booksCreators.push_back(itr->creator); } } } return booksCreators; } vector<string> Manager::getBooksIds() { std::vector<std::string> booksIds; std::vector<kiwix::Book>::iterator itr; for ( itr = library.books.begin(); itr != library.books.end(); ++itr ) { booksIds.push_back(itr->id); } return booksIds; } vector<string> Manager::getBooksPublishers() { std::vector<std::string> booksPublishers; std::vector<kiwix::Book>::iterator itr; std::map<std::string, bool> booksPublishersMap; std::sort(library.books.begin(), library.books.end(), kiwix::Book::sortByPublisher); for ( itr = library.books.begin(); itr != library.books.end(); ++itr ) { if (booksPublishersMap.find(itr->publisher) == booksPublishersMap.end()) { if (itr->origId.empty()) { booksPublishersMap[itr->publisher] = true; booksPublishers.push_back(itr->publisher); } } } return booksPublishers; } kiwix::Library Manager::cloneLibrary() { return this->library; } bool 
Manager::getCurrentBook(Book &book) { string currentBookId = getCurrentBookId(); if (currentBookId.empty()) { return false; } else { getBookById(currentBookId, book); return true; } } bool Manager::getBookById(const string id, Book &book) { std::vector<kiwix::Book>::iterator itr; for ( itr = library.books.begin(); itr != library.books.end(); ++itr ) { if ( itr->id == id) { book = *itr; return true; } } return false; } bool Manager::updateBookLastOpenDateById(const string id) { std::vector<kiwix::Book>::iterator itr; for ( itr = library.books.begin(); itr != library.books.end(); ++itr ) { if ( itr->id == id) { char unixdate[12]; sprintf (unixdate, "%d", (int)time(NULL)); itr->last = unixdate; return true; } } return false; } bool Manager::setBookIndex(const string id, const string path, const supportedIndexType type) { std::vector<kiwix::Book>::iterator itr; for ( itr = library.books.begin(); itr != library.books.end(); ++itr ) { if ( itr->id == id) { itr->indexPath = path; itr->indexPathAbsolute = isRelativePath(path) ? computeAbsolutePath(removeLastPathElement(writableLibraryPath, true, false), path) : path; itr->indexType = type; return true; } } return false; } bool Manager::setBookIndex(const string id, const string path) { return this->setBookIndex(id, path, XAPIAN); } bool Manager::setBookPath(const string id, const string path) { std::vector<kiwix::Book>::iterator itr; for ( itr = library.books.begin(); itr != library.books.end(); ++itr ) { if ( itr->id == id) { itr->path = path; itr->pathAbsolute = isRelativePath(path) ?
computeAbsolutePath(removeLastPathElement(writableLibraryPath, true, false), path) : path; return true; } } return false; } void Manager::removeBookPaths() { std::vector<kiwix::Book>::iterator itr; for ( itr = library.books.begin(); itr != library.books.end(); ++itr ) { itr->path = ""; itr->pathAbsolute = ""; } } unsigned int Manager::getBookCount(const bool localBooks, const bool remoteBooks) { unsigned int result = 0; std::vector<kiwix::Book>::iterator itr; for ( itr = library.books.begin(); itr != library.books.end(); ++itr ) { if ((!itr->path.empty() && localBooks) || (itr->path.empty() && remoteBooks)) result++; } return result; } bool Manager::listBooks(const supportedListMode mode, const supportedListSortBy sortBy, const unsigned int maxSize, const string language, const string creator, const string publisher, const string search) { this->bookIdList.clear(); std::vector<kiwix::Book>::iterator itr; /* Sort */ if (sortBy == TITLE) { std::sort(library.books.begin(), library.books.end(), kiwix::Book::sortByTitle); } else if (sortBy == SIZE) { std::sort(library.books.begin(), library.books.end(), kiwix::Book::sortBySize); } else if (sortBy == DATE) { std::sort(library.books.begin(), library.books.end(), kiwix::Book::sortByDate); } else if (sortBy == CREATOR) { std::sort(library.books.begin(), library.books.end(), kiwix::Book::sortByCreator); } else if (sortBy == PUBLISHER) { std::sort(library.books.begin(), library.books.end(), kiwix::Book::sortByPublisher); } /* Special sort for LASTOPEN */ if (mode == LASTOPEN) { std::sort(library.books.begin(), library.books.end(), kiwix::Book::sortByLastOpen); for ( itr = library.books.begin(); itr != library.books.end(); ++itr ) { if (!itr->last.empty()) this->bookIdList.push_back(itr->id); } } else { /* Generate the list of book id */ for ( itr = library.books.begin(); itr != library.books.end(); ++itr ) { bool ok = true; if (mode == LOCAL && itr->path.empty()) ok = false; if (ok == true && mode == REMOTE && (!itr->path.empty() || itr->url.empty())) ok = false;
if (ok == true && maxSize != 0 && (unsigned int)atoi(itr->size.c_str()) > maxSize * 1024 * 1024) ok = false; if (ok == true && !language.empty() && !matchRegex(itr->language, language)) ok = false; if (ok == true && !creator.empty() && itr->creator != creator) ok = false; if (ok == true && !publisher.empty() && itr->publisher != publisher) ok = false; if ((ok == true && !search.empty()) && !(matchRegex(itr->title, "\\Q" + search + "\\E") || matchRegex(itr->description, "\\Q" + search + "\\E") || matchRegex(itr->language, "\\Q" + search + "\\E") )) ok = false; if (ok == true) { this->bookIdList.push_back(itr->id); } } } return true; } void Manager::checkAndCleanBookPaths(Book &book, const string &libraryPath) { if (!book.path.empty()) { if (isRelativePath(book.path)) { book.pathAbsolute = computeAbsolutePath(removeLastPathElement(libraryPath, true, false), book.path); } else { book.pathAbsolute = book.path; book.path = computeRelativePath(removeLastPathElement(libraryPath, true, false), book.pathAbsolute); } } if (!book.indexPath.empty()) { if (isRelativePath(book.indexPath)) { book.indexPathAbsolute = computeAbsolutePath(removeLastPathElement(libraryPath, true, false), book.indexPath); } else { book.indexPathAbsolute = book.indexPath; book.indexPath = computeRelativePath(removeLastPathElement(libraryPath, true, false), book.indexPathAbsolute); } } } } libkiwix-0.2.0/src/meson.build000066400000000000000000000021241312445116700162670ustar00rootroot00000000000000kiwix_sources = [ 'library.cpp', 'manager.cpp', 'reader.cpp', 'searcher.cpp', 'common/base64.cpp', 'common/pathTools.cpp', 'common/regexTools.cpp', 'common/stringTools.cpp', 'common/networkTools.cpp', 'common/otherTools.cpp', 'xapian/htmlparse.cc', 'xapian/myhtmlparse.cc' ] kiwix_sources += lib_resources if xapian_dep.found() kiwix_sources += ['xapianSearcher.cpp'] endif if get_option('android') subdir('android') install_dir = 'kiwix-lib/jniLibs/' + meson.get_cross_property('android_abi') else install_dir = 
get_option('libdir') endif if has_ctpp2_dep kiwix_sources += ['ctpp2/CTPP2VMStringLoader.cpp'] endif config_h = configure_file(output : 'kiwix_config.h', configuration : conf, input : 'config.h.in') install_headers(config_h, subdir:'kiwix') kiwixlib = library('kiwix', kiwix_sources, include_directories : inc, dependencies : all_deps, version: meson.project_version(), install: true, install_dir: install_dir) libkiwix-0.2.0/src/reader.cpp000066400000000000000000000456441312445116700161110ustar00rootroot00000000000000/* * Copyright 2011 Emmanuel Engelhart * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA. 
*/ #include "reader.h" #include inline char hi(char v) { char hex[] = "0123456789abcdef"; return hex[(v >> 4) & 0xf]; } inline char lo(char v) { char hex[] = "0123456789abcdef"; return hex[v & 0xf]; } std::string hexUUID (std::string in) { std::ostringstream out; for (unsigned n = 0; n < 4; ++n) out << hi(in[n]) << lo(in[n]); out << '-'; for (unsigned n = 4; n < 6; ++n) out << hi(in[n]) << lo(in[n]); out << '-'; for (unsigned n = 6; n < 8; ++n) out << hi(in[n]) << lo(in[n]); out << '-'; for (unsigned n = 8; n < 10; ++n) out << hi(in[n]) << lo(in[n]); out << '-'; for (unsigned n = 10; n < 16; ++n) out << hi(in[n]) << lo(in[n]); std::string op=out.str(); return op; } namespace kiwix { /* Constructor */ Reader::Reader(const string zimFilePath) : zimFileHandler(NULL) { string tmpZimFilePath = zimFilePath; /* Remove potential trailing zimaa */ size_t found = tmpZimFilePath.rfind("zimaa"); if (found != string::npos && tmpZimFilePath.size() > 5 && found == tmpZimFilePath.size() - 5) { tmpZimFilePath.resize(tmpZimFilePath.size() - 2); } this->zimFileHandler = new zim::File(tmpZimFilePath); if (this->zimFileHandler != NULL) { this->firstArticleOffset = this->zimFileHandler->getNamespaceBeginOffset('A'); this->lastArticleOffset = this->zimFileHandler->getNamespaceEndOffset('A'); this->currentArticleOffset = this->firstArticleOffset; this->nsACount = this->zimFileHandler->getNamespaceCount('A'); this->nsICount = this->zimFileHandler->getNamespaceCount('I'); this->zimFilePath = zimFilePath; } /* initialize random seed: */ srand ( time(NULL) ); } /* Destructor */ Reader::~Reader() { if (this->zimFileHandler != NULL) { delete this->zimFileHandler; } } zim::File* Reader::getZimFileHandler() const { return this->zimFileHandler; } /* Reset the cursor for GetNextArticle() */ void Reader::reset() { this->currentArticleOffset = this->firstArticleOffset; } std::map Reader::parseCounterMetadata() const { std::map counters; string mimeType, item, counterString; unsigned int counter; 
zim::Article article = this->zimFileHandler->getArticle('M',"Counter"); if ( article.good() ) { stringstream ssContent(article.getData()); while(getline(ssContent, item, ';')) { stringstream ssItem(item); getline(ssItem, mimeType, '='); getline(ssItem, counterString, '='); if (!counterString.empty() && !mimeType.empty()) { /* Skip entries whose counter is not a valid number */ if (sscanf(counterString.c_str(), "%u", &counter) == 1) { counters.insert(std::make_pair(mimeType, counter)); } } } } return counters; } /* Get the count of articles which can be indexed/displayed */ unsigned int Reader::getArticleCount() const { auto counterMap = this->parseCounterMetadata(); unsigned int counter = 0; if (counterMap.empty()) { counter = this->nsACount; } else { auto it = counterMap.find("text/html"); if (it != counterMap.end()) counter = it->second; } return counter; } /* Get the count of media items in the ZIM file */ unsigned int Reader::getMediaCount() const { auto counterMap = this->parseCounterMetadata(); unsigned int counter = 0; if (counterMap.empty()) counter = this->nsICount; else { auto it = counterMap.find("image/jpeg"); if (it != counterMap.end()) counter += it->second; it = counterMap.find("image/gif"); if (it != counterMap.end()) counter += it->second; it = counterMap.find("image/png"); if (it != counterMap.end()) counter += it->second; } return counter; } /* Get the total of all items of a ZIM file, redirects included */ unsigned int Reader::getGlobalCount() const { return this->zimFileHandler->getCountArticles(); } /* Return the UID of the ZIM file */ string Reader::getId() const { std::ostringstream s; s << this->zimFileHandler->getFileheader().getUuid(); return s.str(); } /* Return a page url from a title */ bool Reader::getPageUrlFromTitle(const string &title, string &url) const { /* Extract the content from the zim file */ zim::Article article = this->zimFileHandler->getArticleByTitle('A', title); if ( !
article.good() ) { return false; } unsigned int loopCounter = 0; while (article.isRedirect() && loopCounter++<42) { article = article.getRedirectArticle(); } url = article.getLongUrl(); return true; } /* Return the URL of a random article (never the main page) */ string Reader::getRandomPageUrl() const { zim::Article article; zim::size_type idx; std::string mainPageUrl = this->getMainPageUrl(); do { idx = this->firstArticleOffset + (zim::size_type)((double)rand() / ((double)RAND_MAX + 1) * this->nsACount); article = zimFileHandler->getArticle(idx); } while (article.getLongUrl() == mainPageUrl); return article.getLongUrl(); } /* Return the welcome page URL */ string Reader::getMainPageUrl() const { string url = ""; if (this->zimFileHandler->getFileheader().hasMainPage()) { zim::Article article = zimFileHandler->getArticle(this->zimFileHandler->getFileheader().getMainPage()); url = article.getLongUrl(); if (url.empty()) { url = getFirstPageUrl(); } } else { url = getFirstPageUrl(); } return url; } bool Reader::getFavicon(string &content, string &mimeType) const { unsigned int contentLength = 0; this->getContentByUrl( "/-/favicon.png", content, contentLength, mimeType); if (content.empty()) { this->getContentByUrl( "/I/favicon.png", content, contentLength, mimeType); if (content.empty()) { this->getContentByUrl( "/I/favicon", content, contentLength, mimeType); if (content.empty()) { this->getContentByUrl( "/-/favicon", content, contentLength, mimeType); } } } return content.empty() ?
false : true; } string Reader::getZimFilePath() const { return this->zimFilePath; } /* Return a metatag value */ bool Reader::getMetatag(const string &name, string &value) const { unsigned int contentLength = 0; string contentType = ""; return this->getContentByUrl( "/M/" + name, value, contentLength, contentType); } string Reader::getTitle() const { string value; this->getMetatag("Title", value); if (value.empty()) { value = getLastPathElement(zimFileHandler->getFilename()); std::replace(value.begin(), value.end(), '_', ' '); size_t pos = value.find(".zim"); value = value.substr(0, pos); } return value; } string Reader::getName() const { string value; this->getMetatag("Name", value); return value; } string Reader::getTags() const { string value; this->getMetatag("Tags", value); return value; } string Reader::getDescription() const{ string value; this->getMetatag("Description", value); /* Mediawiki Collection tends to use the "Subtitle" name */ if (value.empty()) { this->getMetatag("Subtitle", value); } return value; } string Reader::getLanguage() const { string value; this->getMetatag("Language", value); return value; } string Reader::getDate() const { string value; this->getMetatag("Date", value); return value; } string Reader::getCreator() const { string value; this->getMetatag("Creator", value); return value; } string Reader::getPublisher() const { string value; this->getMetatag("Publisher", value); return value; } string Reader::getOrigId() const { string value; this->getMetatag("startfileuid", value); if(value.empty()) return ""; std::string id=value; std::string origID; std::string temp=""; unsigned int k=0; char tempArray[16]=""; for(unsigned int i=0; igetNamespaceBeginOffset('A'); zim::Article article = zimFileHandler->getArticle(firstPageOffset); return article.getLongUrl(); } bool Reader::parseUrl(const string &url, char *ns, string &title) const { /* Offset to visit the url */ unsigned int urlLength = url.size(); unsigned int offset = 0; /* Ignore the 
'/' */ while ((offset < urlLength) && (url[offset] == '/')) offset++; /* Get namespace */ while ((offset < urlLength) && (url[offset] != '/')) { *ns= url[offset]; offset++; } /* Ignore the '/' */ while ((offset < urlLength) && (url[offset] == '/')) offset++; /* Get content title */ unsigned int titleOffset = offset; while (offset < urlLength) { offset++; } /* unescape title */ title = url.substr(titleOffset, offset - titleOffset); return true; } /* Return article by url */ bool Reader::getArticleObjectByDecodedUrl(const string &url, zim::Article &article) const { if (this->zimFileHandler == NULL) { return false; } /* Parse the url */ char ns = 0; string urlStr; this->parseUrl(url, &ns, urlStr); /* Main page */ if (urlStr.empty() && ns == 0) { this->parseUrl(this->getMainPageUrl(), &ns, urlStr); } /* Extract the content from the zim file */ article = zimFileHandler->getArticle(ns, urlStr); return article.good(); } /* Return the mimeType without the content */ bool Reader::getMimeTypeByUrl(const string &url, string &mimeType) const { if (this->zimFileHandler == NULL) { return false; } zim::Article article; if (this->getArticleObjectByDecodedUrl(url, article)) { try { mimeType = article.getMimeType(); } catch (exception &e) { cerr << "Unable to get the mimetype for " << url << ":" << e.what() << endl; mimeType = "application/octet-stream"; } return true; } else { mimeType = ""; return false; } } /* Get a content from a zim file */ bool Reader::getContentByUrl(const string &url, string &content, unsigned int &contentLength, string &contentType) const { return this->getContentByEncodedUrl(url, content, contentLength, contentType); } bool Reader::getContentByEncodedUrl(const string &url, string &content, unsigned int &contentLength, string &contentType, string &baseUrl) const { return this->getContentByDecodedUrl(kiwix::urlDecode(url), content, contentLength, contentType, baseUrl); } bool Reader::getContentByEncodedUrl(const string &url, string &content, unsigned int 
&contentLength, string &contentType) const { std::string stubRedirectUrl; /* The five-argument overload decodes the url itself; do not decode twice */ return this->getContentByEncodedUrl(url, content, contentLength, contentType, stubRedirectUrl); } bool Reader::getContentByDecodedUrl(const string &url, string &content, unsigned int &contentLength, string &contentType) const { std::string stubRedirectUrl; /* The url is already decoded */ return this->getContentByDecodedUrl(url, content, contentLength, contentType, stubRedirectUrl); } bool Reader::getContentByDecodedUrl(const string &url, string &content, unsigned int &contentLength, string &contentType, string &baseUrl) const { content=""; contentType=""; contentLength = 0; zim::Article article; if ( ! this->getArticleObjectByDecodedUrl(url, article)) { return false; } /* If redirect */ unsigned int loopCounter = 0; while (article.isRedirect() && loopCounter++<42) { article = article.getRedirectArticle(); } if (loopCounter < 42) { /* Compute base url (might be different from the url if redirects) */ baseUrl = "/" + std::string(1, article.getNamespace()) + "/" + article.getUrl(); /* Get the content mime-type */ try { contentType = string(article.getMimeType().data(), article.getMimeType().size()); } catch (exception &e) { cerr << "Unable to get the mimetype for "<< baseUrl<< ":" << e.what() << endl; contentType = "application/octet-stream"; } /* Get the data */ content = string(article.getData().data(), article.getArticleSize()); } /* Try to set a stub HTML header/footer if necessary */ if (contentType.find("text/html") != string::npos && content.find("<body") == std::string::npos && content.find("<BODY") == std::string::npos) { content = "<html><head><title>" + article.getTitle() + "</title><meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\" /></head><body>" + content + "</body></html>"; } /* Get the data length */ contentLength = article.getArticleSize(); return true; } /* Check if an article exists */ bool Reader::urlExists(const string &url) const { char ns = 0; string titleStr; this->parseUrl(url, &ns, titleStr); titleStr = "/" + titleStr; zim::File::const_iterator findItr = zimFileHandler->find(ns, titleStr); return findItr != zimFileHandler->end() && findItr->getUrl() == titleStr; } /* Does the ZIM
file have a fulltext index */ bool Reader::hasFulltextIndex() const { return this->urlExists("/Z/fulltextIndex/xapian"); } /* Search titles by prefix */ bool Reader::searchSuggestions(const string &prefix, unsigned int suggestionsCount, const bool reset) { bool retVal = false; zim::File::const_iterator articleItr; /* Reset the suggestions, otherwise check whether the number of suggestions already exceeds suggestionsCount */ if (reset) { this->suggestions.clear(); this->suggestionsOffset = this->suggestions.begin(); } else { if (this->suggestions.size() > suggestionsCount) { return false; } } /* Return if no prefix */ if (prefix.size() == 0) { return false; } for (articleItr = zimFileHandler->findByTitle('A', prefix); articleItr != zimFileHandler->end() && articleItr->getTitle().compare(0, prefix.size(), prefix) == 0 && this->suggestions.size() < suggestionsCount ; ++articleItr) { /* Extract the interesting part of article title & url */ std::string normalizedArticleTitle = kiwix::normalize(articleItr->getTitle()); std::string articleFinalUrl = "/A/"+articleItr->getUrl(); if (articleItr->isRedirect()) { zim::Article article = *articleItr; unsigned int loopCounter = 0; while (article.isRedirect() && loopCounter++<42) { article = article.getRedirectArticle(); } articleFinalUrl = "/A/"+article.getUrl(); } /* Go through all already found suggestions and skip if this article is already in the suggestions list (with another title) */ bool insert = true; std::vector< std::vector<std::string> >::iterator suggestionItr; for (suggestionItr = this->suggestions.begin(); suggestionItr != this->suggestions.end(); suggestionItr++) { int result = normalizedArticleTitle.compare((*suggestionItr)[2]); if (result == 0 && articleFinalUrl.compare((*suggestionItr)[1]) == 0) { insert = false; break; } else if (result < 0) { break; } } /* Insert if possible */ if (insert) { std::vector<std::string> suggestion; suggestion.push_back(articleItr->getTitle()); suggestion.push_back(articleFinalUrl);
suggestion.push_back(normalizedArticleTitle); this->suggestions.insert(suggestionItr, suggestion); } /* Suggestions were found */ retVal = true; } /* Set the cursor to the beginning */ this->suggestionsOffset = this->suggestions.begin(); return retVal; } std::vector<std::string> Reader::getTitleVariants(const std::string &title) const { std::vector<std::string> variants; variants.push_back(title); variants.push_back(kiwix::ucFirst(title)); variants.push_back(kiwix::lcFirst(title)); variants.push_back(kiwix::toTitle(title)); return variants; } /* Try also a few variations of the prefix to have better results */ bool Reader::searchSuggestionsSmart(const string &prefix, unsigned int suggestionsCount) { std::vector<std::string> variants = this->getTitleVariants(prefix); /* Must start as false: it is OR-ed with each search result below */ bool retVal = false; this->suggestions.clear(); this->suggestionsOffset = this->suggestions.begin(); for (std::vector<std::string>::iterator variantsItr = variants.begin(); variantsItr != variants.end(); variantsItr++) { retVal = this->searchSuggestions(*variantsItr, suggestionsCount, false) || retVal; } return retVal; } /* Get next suggestion */ bool Reader::getNextSuggestion(string &title) { if (this->suggestionsOffset != this->suggestions.end()) { /* title */ title = (*(this->suggestionsOffset))[0]; /* increment the cursor for the next call */ this->suggestionsOffset++; return true; } return false; } bool Reader::getNextSuggestion(string &title, string &url) { if (this->suggestionsOffset != this->suggestions.end()) { /* title */ title = (*(this->suggestionsOffset))[0]; url = (*(this->suggestionsOffset))[1]; /* increment the cursor for the next call */ this->suggestionsOffset++; return true; } return false; } /* Check if the file has a checksum */ bool Reader::canCheckIntegrity() const { return this->zimFileHandler->getChecksum() != ""; } /* Return true if corrupted, false otherwise */ bool Reader::isCorrupted() const { try { if (this->zimFileHandler->verify() == true) return false; } catch (exception &e) { cerr << e.what() << endl; return true; } return
true; } /* Return the file size in KiB; also works for split files */ unsigned int Reader::getFileSize() const { zim::File *file = this->getZimFileHandler(); zim::offset_type size = 0; if (file != NULL) { size = file->getFilesize(); } return (size / 1024); } } libkiwix-0.2.0/src/searcher.cpp000066400000000000000000000204631312445116700164330ustar00rootroot00000000000000/* * Copyright 2011 Emmanuel Engelhart * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, * MA 02110-1301, USA.
*/ #include "searcher.h" #include "xapianSearcher.h" #include "reader.h" #include "kiwixlib-resources.h" #include #ifdef ENABLE_CTPP2 #include #include #include #include "ctpp2/CTPP2VMStringLoader.hpp" using namespace CTPP; #endif namespace kiwix { class _Result : public Result { public: _Result(Searcher* searcher, zim::Search::iterator& iterator); virtual ~_Result() {}; virtual std::string get_url(); virtual std::string get_title(); virtual int get_score(); virtual std::string get_snippet(); virtual int get_wordCount(); virtual int get_size(); private: Searcher* searcher; zim::Search::iterator iterator; }; struct SearcherInternal { const zim::Search *_search; XapianSearcher *_xapianSearcher; zim::Search::iterator current_iterator; SearcherInternal() : _search(NULL), _xapianSearcher(NULL) {} ~SearcherInternal() { if ( _search != NULL ) delete _search; if ( _xapianSearcher != NULL ) delete _xapianSearcher; } }; /* Constructor */ Searcher::Searcher(const string &xapianDirectoryPath, Reader* reader) : reader(reader), internal(new SearcherInternal()), searchPattern(""), protocolPrefix("zim://"), searchProtocolPrefix("search://?"), resultCountPerPage(0), estimatedResultCount(0), resultStart(0), resultEnd(0) { template_ct2 = RESOURCE::results_ct2; loadICUExternalTables(); if ( !reader || !reader->hasFulltextIndex() ) { internal->_xapianSearcher = new XapianSearcher(xapianDirectoryPath, reader); } } /* Destructor */ Searcher::~Searcher() { delete internal; } /* Search strings in the database */ void Searcher::search(std::string &search, unsigned int resultStart, unsigned int resultEnd, const bool verbose) { this->reset(); if (verbose == true) { cout << "Performing query `" << search << "'" << endl; } /* If resultEnd & resultStart inverted */ if (resultStart > resultEnd) { resultEnd += resultStart; resultStart = resultEnd - resultStart; resultEnd -= resultStart; } /* Try to find results */ if (resultStart != resultEnd) { /* Avoid big researches */ this->resultCountPerPage 
= resultEnd - resultStart; if (this->resultCountPerPage > 70) { resultEnd = resultStart + 70; this->resultCountPerPage = 70; } /* Perform the search */ this->searchPattern = search; this->resultStart = resultStart; this->resultEnd = resultEnd; string unaccentedSearch = removeAccents(search); if ( internal->_xapianSearcher ) { internal->_xapianSearcher->searchInIndex(unaccentedSearch, resultStart, resultEnd, verbose); this->estimatedResultCount = internal->_xapianSearcher->results.get_matches_estimated(); } else { internal->_search = this->reader->getZimFileHandler()->search(unaccentedSearch, resultStart, resultEnd); internal->current_iterator = internal->_search->begin(); this->estimatedResultCount = internal->_search->get_matches_estimated(); } } return; } void Searcher::restart_search() { if ( internal->_xapianSearcher ) { internal->_xapianSearcher->restart_search(); } else { internal->current_iterator = internal->_search->begin(); } } Result* Searcher::getNextResult() { if ( internal->_xapianSearcher ) { return internal->_xapianSearcher->getNextResult(); } else if (internal->current_iterator != internal->_search->end()) { Result* result = new _Result(this, internal->current_iterator); internal->current_iterator++; return result; } return NULL; } /* Reset the results */ void Searcher::reset() { this->estimatedResultCount = 0; this->searchPattern = ""; return; } /* Return the result count estimation */ unsigned int Searcher::getEstimatedResultCount() { return this->estimatedResultCount; } bool Searcher::setProtocolPrefix(const std::string prefix) { this->protocolPrefix = prefix; return true; } bool Searcher::setSearchProtocolPrefix(const std::string prefix) { this->searchProtocolPrefix = prefix; return true; } void Searcher::setContentHumanReadableId(const string &contentHumanReadableId) { this->contentHumanReadableId = contentHumanReadableId; } _Result::_Result(Searcher* searcher, zim::Search::iterator& iterator): searcher(searcher), iterator(iterator) { } 
std::string _Result::get_url() { return iterator.get_url(); } std::string _Result::get_title() { return iterator.get_title(); } int _Result::get_score() { return iterator.get_score(); } std::string _Result::get_snippet() { return iterator.get_snippet(); } int _Result::get_size() { return iterator.get_size(); } int _Result::get_wordCount() { return iterator.get_wordCount(); } #ifdef ENABLE_CTPP2 string Searcher::getHtml() { SimpleVM oSimpleVM; // Fill data CDT oData; CDT resultsCDT(CDT::ARRAY_VAL); this->restart_search(); Result * p_result = NULL; while ( (p_result = this->getNextResult()) ) { CDT result; result["title"] = p_result->get_title(); result["url"] = p_result->get_url(); result["snippet"] = p_result->get_snippet(); if (p_result->get_size() >= 0) result["size"] = kiwix::beautifyInteger(p_result->get_size()); if (p_result->get_wordCount() >= 0) result["wordCount"] = kiwix::beautifyInteger(p_result->get_wordCount()); resultsCDT.PushBack(result); delete p_result; } this->restart_search(); oData["results"] = resultsCDT; // pages CDT pagesCDT(CDT::ARRAY_VAL); unsigned int pageStart = this->resultStart / this->resultCountPerPage >= 5 ? this->resultStart / this->resultCountPerPage - 4 : 0; unsigned int pageCount = this->estimatedResultCount / this->resultCountPerPage + 1 - pageStart; if (pageCount > 10) pageCount = 10; else if (pageCount == 1) pageCount = 0; for (unsigned int i=pageStart; iresultCountPerPage; page["end"] = (i+1) * this->resultCountPerPage; if (i * this->resultCountPerPage == this->resultStart) page["selected"] = true; pagesCDT.PushBack(page); } oData["pages"] = pagesCDT; oData["count"] = kiwix::beautifyInteger(this->estimatedResultCount); oData["searchPattern"] = kiwix::encodeDiples(this->searchPattern); oData["searchPatternEncoded"] = urlEncode(this->searchPattern); oData["resultStart"] = this->resultStart + 1; oData["resultEnd"] = (this->resultEnd > this->estimatedResultCount ? 
this->estimatedResultCount : this->resultEnd); oData["resultRange"] = this->resultCountPerPage; oData["resultLastPageStart"] = this->estimatedResultCount > this->resultCountPerPage ? this->estimatedResultCount - this->resultCountPerPage : 0; oData["protocolPrefix"] = this->protocolPrefix; oData["searchProtocolPrefix"] = this->searchProtocolPrefix; oData["contentId"] = this->contentHumanReadableId; VMStringLoader oLoader(template_ct2.c_str(), template_ct2.size()); FileLogger oLogger(stderr); // DEBUG only (write output to stdout) // oSimpleVM.Run(oData, oLoader, stdout, oLogger); std::string sResult; oSimpleVM.Run(oData, oLoader, sResult, oLogger); return sResult; } #endif } libkiwix-0.2.0/src/xapian/000077500000000000000000000000001312445116700154065ustar00rootroot00000000000000libkiwix-0.2.0/src/xapian/htmlparse.cc000066400000000000000000000232701312445116700177200ustar00rootroot00000000000000/* htmlparse.cc: simple HTML parser for omega indexer * * Copyright 1999,2000,2001 BrightStation PLC * Copyright 2001 Ananova Ltd * Copyright 2002,2006,2007,2008 Olly Betts * * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public License as * published by the Free Software Foundation; either version 2 of the * License, or (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301
 * USA
 */

// #include <config.h>

#include "htmlparse.h"

#include <xapian.h>

// #include "utf8convert.h"

#include <algorithm>

#include <ctype.h>
#include <cstring>
#include <stdio.h>
#include <stdlib.h>

using namespace std;

inline void
lowercase_string(string &str)
{
    for (string::iterator i = str.begin(); i != str.end(); ++i) {
        *i = tolower(static_cast<unsigned char>(*i));
    }
}

map<string, unsigned int> HtmlParser::named_ents;

inline static bool
p_notdigit(char c)
{
    return !isdigit(static_cast<unsigned char>(c));
}

inline static bool
p_notxdigit(char c)
{
    return !isxdigit(static_cast<unsigned char>(c));
}

inline static bool
p_notalnum(char c)
{
    return !isalnum(static_cast<unsigned char>(c));
}

inline static bool
p_notwhitespace(char c)
{
    return !isspace(static_cast<unsigned char>(c));
}

inline static bool
p_nottag(char c)
{
    return !isalnum(static_cast<unsigned char>(c)) &&
        c != '.' && c != '-' && c != ':'; // ':' for XML namespaces.
}

inline static bool
p_whitespacegt(char c)
{
    return isspace(static_cast<unsigned char>(c)) || c == '>';
}

inline static bool
p_whitespaceeqgt(char c)
{
    return isspace(static_cast<unsigned char>(c)) || c == '=' || c == '>';
}

bool
HtmlParser::get_parameter(const string & param, string & value)
{
    map<string, string>::const_iterator i = parameters.find(param);
    if (i == parameters.end()) return false;
    value = i->second;
    return true;
}

HtmlParser::HtmlParser()
{
    static const struct ent { const char *n; unsigned int v; } ents[] = {
#include "namedentities.h"
        { NULL, 0 }
    };
    if (named_ents.empty()) {
        const struct ent *i = ents;
        while (i->n) {
            named_ents[string(i->n)] = i->v;
            ++i;
        }
    }
}

void
HtmlParser::decode_entities(string &s)
{
    // We need a const_iterator version of s.end() - otherwise the
    // find() and find_if() templates don't work...
    string::const_iterator amp = s.begin(), s_end = s.end();
    while ((amp = find(amp, s_end, '&')) != s_end) {
        unsigned int val = 0;
        string::const_iterator end, p = amp + 1;
        if (p != s_end && *p == '#') {
            p++;
            if (p != s_end && (*p == 'x' || *p == 'X')) {
                // hex
                p++;
                end = find_if(p, s_end, p_notxdigit);
                sscanf(s.substr(p - s.begin(), end - p).c_str(), "%x", &val);
            } else {
                // number
                end = find_if(p, s_end, p_notdigit);
                val = atoi(s.substr(p - s.begin(), end - p).c_str());
            }
        } else {
            end = find_if(p, s_end, p_notalnum);
            string code = s.substr(p - s.begin(), end - p);
            map<string, unsigned int>::const_iterator i;
            i = named_ents.find(code);
            if (i != named_ents.end()) val = i->second;
        }
        if (end < s_end && *end == ';') end++;
        if (val) {
            string::size_type amp_pos = amp - s.begin();
            if (val < 0x80) {
                s.replace(amp_pos, end - amp, 1u, char(val));
            } else {
                // Convert unicode value val to UTF-8.
                char seq[4];
                unsigned len = Xapian::Unicode::nonascii_to_utf8(val, seq);
                s.replace(amp_pos, end - amp, seq, len);
            }
            s_end = s.end();
            // We've modified the string, so the iterators are no longer
            // valid...
            amp = s.begin() + amp_pos + 1;
        } else {
            amp = end;
        }
    }
}

void
HtmlParser::parse_html(const string &body)
{
    in_script = false;

    parameters.clear();
    string::const_iterator start = body.begin();

    while (true) {
        // Skip through until we find an HTML tag, a comment, or the end of
        // document. Ignore isolated occurrences of `<' which don't start
        // a tag or comment.
        string::const_iterator p = start;
        while (true) {
            p = find(p, body.end(), '<');
            if (p == body.end()) break;
            unsigned char ch = *(p + 1);

            // Tag, closing tag, or comment (or SGML declaration).
            if ((!in_script && isalpha(ch)) || ch == '/' || ch == '!') break;

            if (ch == '?') {
                // PHP code or XML declaration.
                // XML declaration is only valid at the start of the first line.
                // FIXME: need to deal with BOMs...
                if (p != body.begin() || body.size() < 20) break;

                // XML declaration looks something like this:
                // <?xml version="1.0" encoding="UTF-8"?>
                if (p[2] != 'x' || p[3] != 'm' || p[4] != 'l') break;
                if (strchr(" \t\r\n", p[5]) == NULL) break;

                string::const_iterator decl_end = find(p + 6, body.end(), '?');
                if (decl_end == body.end()) break;

                // Default charset for XML is UTF-8.
                charset = "UTF-8";

                string decl(p + 6, decl_end);
                size_t enc = decl.find("encoding");
                if (enc == string::npos) break;

                enc = decl.find_first_not_of(" \t\r\n", enc + 8);
                if (enc == string::npos || enc == decl.size()) break;

                if (decl[enc] != '=') break;

                enc = decl.find_first_not_of(" \t\r\n", enc + 1);
                if (enc == string::npos || enc == decl.size()) break;

                if (decl[enc] != '"' && decl[enc] != '\'') break;

                char quote = decl[enc++];
                size_t enc_end = decl.find(quote, enc);

                if (enc != string::npos)
                    charset = decl.substr(enc, enc_end - enc);

                break;
            }
            p++;
        }

        // Process text up to start of tag.
        if (p > start) {
            string text = body.substr(start - body.begin(), p - start);
            // convert_to_utf8(text, charset);
            decode_entities(text);
            process_text(text);
        }

        if (p == body.end()) break;

        start = p + 1;

        if (start == body.end()) break;

        if (*start == '!') {
            if (++start == body.end()) break;
            if (++start == body.end()) break;
            // comment or SGML declaration
            if (*(start - 1) == '-' && *start == '-') {
                ++start;
                string::const_iterator close = find(start, body.end(), '>');
                // An unterminated comment swallows rest of document
                // (like Netscape, but unlike MSIE IIRC)
                if (close == body.end()) break;

                p = close;
                // look for -->
                while (p != body.end() && (*(p - 1) != '-' || *(p - 2) != '-'))
                    p = find(p + 1, body.end(), '>');

                if (p != body.end()) {
                    // Check for htdig's "ignore this bit" comments.
                    if (p - start == 15 && string(start, p - 2) == "htdig_noindex") {
                        string::size_type i;
                        i = body.find("<!--/htdig_noindex-->", p + 1 - body.begin());
                        if (i == string::npos) break;
                        start = body.begin() + i + 21;
                        continue;
                    }
                    // If we found --> skip to there.
                    start = p;
                } else {
                    // Otherwise skip to the first > we found (as Netscape does).
                    start = close;
                }
            } else {
                // just an SGML declaration, perhaps giving the DTD - ignore it
                start = find(start - 1, body.end(), '>');
                if (start == body.end()) break;
            }
            ++start;
        } else if (*start == '?') {
            if (++start == body.end()) break;
            // PHP - swallow until ?> or EOF
            start = find(start + 1, body.end(), '>');

            // look for ?>
            while (start != body.end() && *(start - 1) != '?')
                start = find(start + 1, body.end(), '>');

            // unterminated PHP swallows rest of document (rather arbitrarily
            // but it avoids polluting the database when things go wrong)
            if (start != body.end()) ++start;
        } else {
            // opening or closing tag
            int closing = 0;

            if (*start == '/') {
                closing = 1;
                start = find_if(start + 1, body.end(), p_notwhitespace);
            }

            p = start;
            start = find_if(start, body.end(), p_nottag);
            string tag = body.substr(p - body.begin(), start - p);
            // convert tagname to lowercase
            lowercase_string(tag);

            if (closing) {
                closing_tag(tag);
                if (in_script && tag == "script") in_script = false;

                /* ignore any bogus parameters on closing tags */
                p = find(start, body.end(), '>');
                if (p == body.end()) break;
                start = p + 1;
            } else {
                // FIXME: parse parameters lazily.
                while (start < body.end() && *start != '>') {
                    string name, value;

                    p = find_if(start, body.end(), p_whitespaceeqgt);

                    name.assign(body, start - body.begin(), p - start);

                    p = find_if(p, body.end(), p_notwhitespace);

                    start = p;
                    if (start != body.end() && *start == '=') {
                        start = find_if(start + 1, body.end(), p_notwhitespace);

                        p = body.end();

                        int quote = *start;
                        if (quote == '"' || quote == '\'') {
                            start++;
                            p = find(start, body.end(), quote);
                        }

                        if (p == body.end()) {
                            // unquoted or no closing quote
                            p = find_if(start, body.end(), p_whitespacegt);
                        }
                        value.assign(body, start - body.begin(), p - start);
                        start = find_if(p, body.end(), p_notwhitespace);

                        if (!name.empty()) {
                            // convert parameter name to lowercase
                            lowercase_string(name);
                            // in case of multiple entries, use the first
                            // (as Netscape does)
                            parameters.insert(make_pair(name, value));
                        }
                    }
                }
#if 0
                cout << "<" << tag;
                map<string, string>::const_iterator x;
                for (x = parameters.begin(); x != parameters.end(); x++) {
                    cout << " " << x->first << "=\"" << x->second << "\"";
                }
                cout << ">\n";
#endif
                opening_tag(tag);
                parameters.clear();

                // In