pax_global_header: comment=91bc516e312b292bd2fd46806c0214c2e49c64d6

whitedb-0.7.2/

whitedb-0.7.2/AUTHORS

Authors
=======

Project lead
  T.Tammet

Main programming
  T.Tammet and P.Järv (priit@cc.ttu.ee)

Additional programming:
  E.Reilent (t-tree)
  A.Puusepp (java bindings),
  A.Rebane (dump/log parts),
  M.Puju (prototype version)

WhiteDB uses several packages developed by other people. Information can be
found in README and AUTHORS files in corresponding folders.

whitedb-0.7.2/Bootstrap

#!/bin/sh
# $Id: $
# $Source: $
# Booting up the GNU automake, autoconf, etc system:
# not needed after the configure executable script has been built

# Need to use autoconf >= 2.50 and automake >= 1.5. This allows user to
# set these variables in their environment, or to just use the defaults below.
# This is needed since some systems still use autoconf-2.13 and automake-1.4 as
# the defaults (e.g. debian).
: ${ACLOCAL=aclocal}
: ${AUTOMAKE=automake}
: ${AUTOCONF=autoconf}
: ${AUTOHEADER=autoheader}
: ${LIBTOOLIZE=libtoolize}

command -v glibtoolize && LIBTOOLIZE=glibtoolize

set -e
set -x
(
${ACLOCAL}
${AUTOHEADER}
${AUTOCONF}
${LIBTOOLIZE} --no-warn
[ -d config-aux ] || mkdir config-aux
${AUTOMAKE} -a -c
)
rm -f config.cache

whitedb-0.7.2/COPYING

                    GNU GENERAL PUBLIC LICENSE
                       Version 3, 29 June 2007

 Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.

                            Preamble

The GNU General Public License is a free, copyleft license for software and
other kinds of works.

The licenses for most software and other practical works are designed to take
away your freedom to share and change the works. By contrast, the GNU General
Public License is intended to guarantee your freedom to share and change all
versions of a program--to make sure it remains free software for all its users.
We, the Free Software Foundation, use the GNU General Public License for most
of our software; it applies also to any other work released this way by its
authors. You can apply it to your programs, too.

When we speak of free software, we are referring to freedom, not price. Our
General Public Licenses are designed to make sure that you have the freedom to
distribute copies of free software (and charge for them if you wish), that you
receive source code or can get it if you want it, that you can change the
software or use pieces of it in new free programs, and that you know you can do
these things.

To protect your rights, we need to prevent others from denying you these rights
or asking you to surrender the rights. Therefore, you have certain
responsibilities if you distribute copies of the software, or if you modify it:
responsibilities to respect the freedom of others.

For example, if you distribute copies of such a program, whether gratis or for
a fee, you must pass on to the recipients the same freedoms that you received.
You must make sure that they, too, receive or can get the source code. And you
must show them these terms so they know their rights.
Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it. For the developers' and authors' protection, the GPL clearly explains that there is no warranty for this free software. For both users' and authors' sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions. Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. This is fundamentally incompatible with the aim of protecting users' freedom to change the software. The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable. Therefore, we have designed this version of the GPL to prohibit the practice for those products. If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users. Finally, every program is threatened constantly by software patents. States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary. To prevent this, the GPL assures that patents cannot be used to render the program non-free. The precise terms and conditions for copying, distribution and modification follow. TERMS AND CONDITIONS 0. Definitions. "This License" refers to version 3 of the GNU General Public License. "Copyright" also means copyright-like laws that apply to other kinds of works, such as semiconductor masks. "The Program" refers to any copyrightable work licensed under this License. Each licensee is addressed as "you". "Licensees" and "recipients" may be individuals or organizations. To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work. A "covered work" means either the unmodified Program or a work based on the Program. To "propagate" a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well. To "convey" a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying. An interactive user interface displays "Appropriate Legal Notices" to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion. 1. 
Source Code. The "source code" for a work means the preferred form of the work for making modifications to it. "Object code" means any non-source form of a work. A "Standard Interface" means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language. The "System Libraries" of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A "Major Component", in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it. The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work. The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source. The Corresponding Source for a work in source code form is that same work. 2. Basic Permissions. All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law. You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you. Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary. 3. Protecting Users' Legal Rights From Anti-Circumvention Law. 
No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures. When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures. 4. Conveying Verbatim Copies. You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program. You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee. 5. Conveying Modified Source Versions. You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions: a) The work must carry prominent notices stating that you modified it, and giving a relevant date. b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to "keep intact all notices". c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it. d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so. A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate. 6. Conveying Non-Source Forms. 
You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways: a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange. b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge. c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b. d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements. e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d. A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work. A "User Product" is either (1) a "consumer product", which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, "normally used" refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product. 
"Installation Information" for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made. If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM). The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network. Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying. 7. Additional Terms. "Additional permissions" are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions. When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission. 
Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms: a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or d) Limiting the use for publicity purposes of names of licensors or authors of the material; or e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors. All other non-permissive additional terms are considered "further restrictions" within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying. If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms. Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way. 8. Termination. You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11). However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation. Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice. Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10. 9. 
Acceptance Not Required for Having Copies. You are not required to accept this License in order to receive or run a copy of the Program. Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so. 10. Automatic Licensing of Downstream Recipients. Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License. An "entity transaction" is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts. You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it. 11. Patents. A "contributor" is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. The work thus licensed is called the contributor's "contributor version". A contributor's "essential patent claims" are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, "control" includes the right to grant patent sublicenses in a manner consistent with the requirements of this License. Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version. In the following three paragraphs, a "patent license" is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To "grant" such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party. 
If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. "Knowingly relying" means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid. If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it. A patent license is "discriminatory" if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007. Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law. 12. No Surrender of Others' Freedom. If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program. 13. Use with the GNU Affero General Public License. Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. 
The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such. 14. Revised Versions of this License. The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation. If the Program specifies that a proxy can decide which future versions of the GNU General Public License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Program. Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version. 15. Disclaimer of Warranty. THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 16. Limitation of Liability. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 17. Interpretation of Sections 15 and 16. If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. 
It is safest to attach them to the start of each source file to most
effectively state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.

    <one line to give the program's name and a brief idea of what it does.>
    Copyright (C) <year>  <name of author>

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.

Also add information on how to contact you by electronic and paper mail.

If the program does terminal interaction, make it output a short notice like
this when it starts in an interactive mode:

    <program>  Copyright (C) <year>  <name of author>
    This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
    This is free software, and you are welcome to redistribute it
    under certain conditions; type `show c' for details.

The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, your program's commands might
be different; for a GUI interface, you would use an "about box".

You should also get your employer (if you work as a programmer) or school, if
any, to sign a "copyright disclaimer" for the program, if necessary. For more
information on this, and how to apply and follow the GNU GPL, see
<http://www.gnu.org/licenses/>.

The GNU General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may consider
it more useful to permit linking proprietary applications with the library. If
this is what you want to do, use the GNU Lesser General Public License instead
of this License. But first, please read
<http://www.gnu.org/philosophy/why-not-lgpl.html>.

whitedb-0.7.2/ChangeLog

ChangeLog
=========

Nothing here: see https://github.com/priitj/whitedb for history

whitedb-0.7.2/Db/

whitedb-0.7.2/Db/Makefile.am

#
# - - - - main db sources - - -

noinst_LTLIBRARIES = libDb.la
libDb_la_SOURCES = dbmem.c dbmem.h\
 dballoc.c dballoc.h dbfeatures.h\
 dbdata.c dbdata.h\
 dbtest.c dbtest.h\
 dblock.c dblock.h\
 dbdump.c dbdump.h crc1.h\
 dblog.c dblog.h\
 dbhash.c dbhash.h\
 dbindex.c dbindex.h\
 dbcompare.c dbcompare.h\
 dbquery.c dbquery.h\
 dbutil.c dbutil.h\
 dbmpool.c dbmpool.h\
 dbjson.c dbjson.h\
 dbschema.c dbschema.h

if RAPTOR
AM_CFLAGS += `$(RAPTOR_CONFIG) --cflags`
endif

whitedb-0.7.2/Db/crc1.h

/* * zlib/libpng license * Copyright (c) 2000-2004 mypapit * * This software is provided 'as-is', without any express or implied warranty. * In no event will the authors be held liable for any damages arising from the * use of this software. * * Permission is granted to anyone to use this software for any purpose, including * commercial applications, and to alter it and redistribute it freely, subject to * the following restrictions: * * 1. The origin of this software must not be misrepresented; you must not claim * that you wrote the original software.
If you use this software in a product, an * acknowledgment in the product documentation would be appreciated but is not required. * * 2. Altered source versions must be plainly marked as such, and must not be * misrepresented as being the original software. * * 3. This notice may not be removed or altered from any source distribution. */ /** @file crc1.h * CRC32 calculator from minicrc project. */ /* table of CRC-32's of all single-byte values (made by makecrc.c) */ gint32 crc_table[256] = { 0x00000000L, 0x77073096L, 0xee0e612cL, 0x990951baL, 0x076dc419L, 0x706af48fL, 0xe963a535L, 0x9e6495a3L, 0x0edb8832L, 0x79dcb8a4L, 0xe0d5e91eL, 0x97d2d988L, 0x09b64c2bL, 0x7eb17cbdL, 0xe7b82d07L, 0x90bf1d91L, 0x1db71064L, 0x6ab020f2L, 0xf3b97148L, 0x84be41deL, 0x1adad47dL, 0x6ddde4ebL, 0xf4d4b551L, 0x83d385c7L, 0x136c9856L, 0x646ba8c0L, 0xfd62f97aL, 0x8a65c9ecL, 0x14015c4fL, 0x63066cd9L, 0xfa0f3d63L, 0x8d080df5L, 0x3b6e20c8L, 0x4c69105eL, 0xd56041e4L, 0xa2677172L, 0x3c03e4d1L, 0x4b04d447L, 0xd20d85fdL, 0xa50ab56bL, 0x35b5a8faL, 0x42b2986cL, 0xdbbbc9d6L, 0xacbcf940L, 0x32d86ce3L, 0x45df5c75L, 0xdcd60dcfL, 0xabd13d59L, 0x26d930acL, 0x51de003aL, 0xc8d75180L, 0xbfd06116L, 0x21b4f4b5L, 0x56b3c423L, 0xcfba9599L, 0xb8bda50fL, 0x2802b89eL, 0x5f058808L, 0xc60cd9b2L, 0xb10be924L, 0x2f6f7c87L, 0x58684c11L, 0xc1611dabL, 0xb6662d3dL, 0x76dc4190L, 0x01db7106L, 0x98d220bcL, 0xefd5102aL, 0x71b18589L, 0x06b6b51fL, 0x9fbfe4a5L, 0xe8b8d433L, 0x7807c9a2L, 0x0f00f934L, 0x9609a88eL, 0xe10e9818L, 0x7f6a0dbbL, 0x086d3d2dL, 0x91646c97L, 0xe6635c01L, 0x6b6b51f4L, 0x1c6c6162L, 0x856530d8L, 0xf262004eL, 0x6c0695edL, 0x1b01a57bL, 0x8208f4c1L, 0xf50fc457L, 0x65b0d9c6L, 0x12b7e950L, 0x8bbeb8eaL, 0xfcb9887cL, 0x62dd1ddfL, 0x15da2d49L, 0x8cd37cf3L, 0xfbd44c65L, 0x4db26158L, 0x3ab551ceL, 0xa3bc0074L, 0xd4bb30e2L, 0x4adfa541L, 0x3dd895d7L, 0xa4d1c46dL, 0xd3d6f4fbL, 0x4369e96aL, 0x346ed9fcL, 0xad678846L, 0xda60b8d0L, 0x44042d73L, 0x33031de5L, 0xaa0a4c5fL, 0xdd0d7cc9L, 0x5005713cL, 0x270241aaL, 0xbe0b1010L, 0xc90c2086L, 0x5768b525L, 0x206f85b3L, 0xb966d409L, 0xce61e49fL, 0x5edef90eL, 0x29d9c998L, 0xb0d09822L, 0xc7d7a8b4L, 0x59b33d17L, 0x2eb40d81L, 0xb7bd5c3bL, 0xc0ba6cadL, 0xedb88320L, 0x9abfb3b6L, 0x03b6e20cL, 0x74b1d29aL, 0xead54739L, 0x9dd277afL, 0x04db2615L, 0x73dc1683L, 0xe3630b12L, 0x94643b84L, 0x0d6d6a3eL, 0x7a6a5aa8L, 0xe40ecf0bL, 0x9309ff9dL, 0x0a00ae27L, 0x7d079eb1L, 0xf00f9344L, 0x8708a3d2L, 0x1e01f268L, 0x6906c2feL, 0xf762575dL, 0x806567cbL, 0x196c3671L, 0x6e6b06e7L, 0xfed41b76L, 0x89d32be0L, 0x10da7a5aL, 0x67dd4accL, 0xf9b9df6fL, 0x8ebeeff9L, 0x17b7be43L, 0x60b08ed5L, 0xd6d6a3e8L, 0xa1d1937eL, 0x38d8c2c4L, 0x4fdff252L, 0xd1bb67f1L, 0xa6bc5767L, 0x3fb506ddL, 0x48b2364bL, 0xd80d2bdaL, 0xaf0a1b4cL, 0x36034af6L, 0x41047a60L, 0xdf60efc3L, 0xa867df55L, 0x316e8eefL, 0x4669be79L, 0xcb61b38cL, 0xbc66831aL, 0x256fd2a0L, 0x5268e236L, 0xcc0c7795L, 0xbb0b4703L, 0x220216b9L, 0x5505262fL, 0xc5ba3bbeL, 0xb2bd0b28L, 0x2bb45a92L, 0x5cb36a04L, 0xc2d7ffa7L, 0xb5d0cf31L, 0x2cd99e8bL, 0x5bdeae1dL, 0x9b64c2b0L, 0xec63f226L, 0x756aa39cL, 0x026d930aL, 0x9c0906a9L, 0xeb0e363fL, 0x72076785L, 0x05005713L, 0x95bf4a82L, 0xe2b87a14L, 0x7bb12baeL, 0x0cb61b38L, 0x92d28e9bL, 0xe5d5be0dL, 0x7cdcefb7L, 0x0bdbdf21L, 0x86d3d2d4L, 0xf1d4e242L, 0x68ddb3f8L, 0x1fda836eL, 0x81be16cdL, 0xf6b9265bL, 0x6fb077e1L, 0x18b74777L, 0x88085ae6L, 0xff0f6a70L, 0x66063bcaL, 0x11010b5cL, 0x8f659effL, 0xf862ae69L, 0x616bffd3L, 0x166ccf45L, 0xa00ae278L, 0xd70dd2eeL, 0x4e048354L, 0x3903b3c2L, 0xa7672661L, 0xd06016f7L, 0x4969474dL, 0x3e6e77dbL, 0xaed16a4aL, 0xd9d65adcL, 0x40df0b66L, 
0x37d83bf0L, 0xa9bcae53L, 0xdebb9ec5L, 0x47b2cf7fL, 0x30b5ffe9L, 0xbdbdf21cL,
0xcabac28aL, 0x53b39330L, 0x24b4a3a6L, 0xbad03605L, 0xcdd70693L, 0x54de5729L,
0x23d967bfL, 0xb3667a2eL, 0xc4614ab8L, 0x5d681b02L, 0x2a6f2b94L, 0xb40bbe37L,
0xc30c8ea1L, 0x5a05df1bL, 0x2d02ef8dL
};

/* Standard table-driven CRC-32 update. The loop body is a reconstruction:
   the characters between '<' and '>' were lost in this copy, and only the
   "crc >> 8" fragment survived; the code below is the usual formulation
   matching crc_table above. */
static gint32
update_crc32(char *buf, gint n, gint32 crc) {
  register gint i;

  crc ^= 0xffffffff;
  for (i=0; i<n; i++)
    crc = crc_table[(crc ^ buf[i]) & 0xff] ^ (crc >> 8);
  return crc ^= 0xffffffff;
}

whitedb-0.7.2/Db/dballoc.c

/*
* $Id: $
* $Version: $
*
* Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009
* Copyright (c) Priit Järv 2013
*
* Contact: tanel.tammet@gmail.com
*
* This file is part of WhiteDB
*
* WhiteDB is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* WhiteDB is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with WhiteDB. If not, see <http://www.gnu.org/licenses/>.
*
*/

/** @file dballoc.c
 * Database initialisation and common allocation/deallocation procedures:
 * areas, subareas, objects, strings etc.
 *
 */

/* ====== Includes =============== */

/* NOTE: the four system header names were stripped from this copy.
   <stdio.h>, <stdlib.h> and <string.h> are implied by the printf/NULL/
   memset/strlen usage below; the fourth header name is not recoverable. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stddef.h>   /* uncertain: original header name lost */

#ifdef __cplusplus
extern "C" {
#endif

#ifdef _WIN32
#include "../config-w32.h"
#else
#include "../config.h"
#endif
#include "dballoc.h"
#include "dbfeatures.h"
#include "dblock.h"
#include "dbtest.h"
#include "dbindex.h"

/* don't output 'segment does not have enough space' messages */
#define SUPPRESS_LOWLEVEL_ERR 1

/* ====== Private headers and defs ======== */

/* ======= Private protos ================ */

static gint init_db_subarea(void* db, void* area_header, gint index, gint size);
static gint alloc_db_segmentchunk(void* db, gint size); // allocates a next chunk from db memory segment
static gint init_syn_vars(void* db);
static gint init_extdb(void* db);
static gint init_db_index_area_header(void* db);
static gint init_logging(void* db);
static gint init_strhash_area(void* db, db_hash_area_header* areah);
static gint init_hash_subarea(void* db, db_hash_area_header* areah, gint arraylength);
#ifdef USE_REASONER
static gint init_anonconst_table(void* db);
static gint intern_anonconst(void* db, char* str, gint enr);
#endif
static gint make_subarea_freelist(void* db, void* area_header, gint arrayindex);
static gint init_area_buckets(void* db, void* area_header);
static gint init_subarea_freespace(void* db, void* area_header, gint arrayindex);
static gint extend_fixedlen_area(void* db, void* area_header);
static gint split_free(void* db, void* area_header, gint nr, gint* freebuckets, gint i);
static gint extend_varlen_area(void* db, void* area_header, gint minbytes);
static gint show_dballoc_error_nr(void* db, char* errmsg, gint nr);
static gint show_dballoc_error(void* db, char* errmsg);

/* ====== Functions ============== */

/* -------- segment header initialisation ---------- */

/** starts and completes memsegment initialisation
*
* should be called after new memsegment is allocated
*/

gint wg_init_db_memsegment(void* db, gint key, gint size) {
  db_memsegment_header* dbh = dbmemsegh(db);
  gint tmp;
  gint free;
  gint i;

  // set main global values for db
  dbh->mark=(gint32) MEMSEGMENT_MAGIC_INIT;
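  /* The magic mark is set in two phases: MEMSEGMENT_MAGIC_INIT here, while
     the areas below are still being set up, and MEMSEGMENT_MAGIC_MARK at the
     very end of this function once everything has succeeded, so a reader
     attaching to the segment can tell a half-initialised segment from a
     valid one by checking dbh->mark. */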
dbh->version=(gint32) MEMSEGMENT_VERSION; dbh->features=(gint32) MEMSEGMENT_FEATURES; dbh->checksum=0; dbh->size=size; dbh->initialadr=(gint)dbh; /* XXX: this assumes pointer size. Currently harmless * because initialadr isn't used much. */ dbh->key=key; /* might be 0 if local memory used */ #ifdef CHECK if(((gint) dbh)%SUBAREA_ALIGNMENT_BYTES) show_dballoc_error(dbh,"db base pointer has bad alignment (ignoring)"); #endif // set correct alignment for free free=sizeof(db_memsegment_header); // set correct alignment for free i=SUBAREA_ALIGNMENT_BYTES-(free%SUBAREA_ALIGNMENT_BYTES); if (i==SUBAREA_ALIGNMENT_BYTES) i=0; dbh->free=free+i; // allocate and initialise subareas //datarec tmp=init_db_subarea(db,&(dbh->datarec_area_header),0,INITIAL_SUBAREA_SIZE); if (tmp) { show_dballoc_error(db," cannot create datarec area"); return -1; } (dbh->datarec_area_header).fixedlength=0; tmp=init_area_buckets(db,&(dbh->datarec_area_header)); // fill buckets with 0-s if (tmp) { show_dballoc_error(db," cannot initialize datarec area buckets"); return -1; } tmp=init_subarea_freespace(db,&(dbh->datarec_area_header),0); // mark and store free space in subarea 0 if (tmp) { show_dballoc_error(db," cannot initialize datarec subarea 0"); return -1; } //longstr tmp=init_db_subarea(db,&(dbh->longstr_area_header),0,INITIAL_SUBAREA_SIZE); if (tmp) { show_dballoc_error(db," cannot create longstr area"); return -1; } (dbh->longstr_area_header).fixedlength=0; tmp=init_area_buckets(db,&(dbh->longstr_area_header)); // fill buckets with 0-s if (tmp) { show_dballoc_error(db," cannot initialize longstr area buckets"); return -1; } tmp=init_subarea_freespace(db,&(dbh->longstr_area_header),0); // mark and store free space in subarea 0 if (tmp) { show_dballoc_error(db," cannot initialize longstr subarea 0"); return -1; } //listcell tmp=init_db_subarea(db,&(dbh->listcell_area_header),0,INITIAL_SUBAREA_SIZE); if (tmp) { show_dballoc_error(db," cannot create listcell area"); return -1; } (dbh->listcell_area_header).fixedlength=1; (dbh->listcell_area_header).objlength=sizeof(gcell); tmp=make_subarea_freelist(db,&(dbh->listcell_area_header),0); // freelist into subarray 0 if (tmp) { show_dballoc_error(db," cannot initialize listcell area"); return -1; } //shortstr tmp=init_db_subarea(db,&(dbh->shortstr_area_header),0,INITIAL_SUBAREA_SIZE); if (tmp) { show_dballoc_error(db," cannot create short string area"); return -1; } (dbh->shortstr_area_header).fixedlength=1; (dbh->shortstr_area_header).objlength=SHORTSTR_SIZE; tmp=make_subarea_freelist(db,&(dbh->shortstr_area_header),0); // freelist into subarray 0 if (tmp) { show_dballoc_error(db," cannot initialize shortstr area"); return -1; } //word tmp=init_db_subarea(db,&(dbh->word_area_header),0,INITIAL_SUBAREA_SIZE); if (tmp) { show_dballoc_error(db," cannot create word area"); return -1; } (dbh->word_area_header).fixedlength=1; (dbh->word_area_header).objlength=sizeof(gint); tmp=make_subarea_freelist(db,&(dbh->word_area_header),0); // freelist into subarray 0 if (tmp) { show_dballoc_error(db," cannot initialize word area"); return -1; } //doubleword tmp=init_db_subarea(db,&(dbh->doubleword_area_header),0,INITIAL_SUBAREA_SIZE); if (tmp) { show_dballoc_error(db," cannot create doubleword area"); return -1; } (dbh->doubleword_area_header).fixedlength=1; (dbh->doubleword_area_header).objlength=2*sizeof(gint); tmp=make_subarea_freelist(db,&(dbh->doubleword_area_header),0); // freelist into subarray 0 if (tmp) { show_dballoc_error(db," cannot initialize doubleword area"); return -1; } /* index 
structures also user fixlen object storage: * tnode area - contains index nodes * index header area - contains index headers * index template area - contains template headers * index hash area - varlen storage for hash buckets * index lookup data takes up relatively little space so we allocate * the smallest chunk allowed for the headers. */ tmp=init_db_subarea(db,&(dbh->tnode_area_header),0,INITIAL_SUBAREA_SIZE); if (tmp) { show_dballoc_error(db," cannot create tnode area"); return -1; } (dbh->tnode_area_header).fixedlength=1; (dbh->tnode_area_header).objlength=sizeof(struct wg_tnode); tmp=make_subarea_freelist(db,&(dbh->tnode_area_header),0); if (tmp) { show_dballoc_error(db," cannot initialize tnode area"); return -1; } tmp=init_db_subarea(db,&(dbh->indexhdr_area_header),0,MINIMAL_SUBAREA_SIZE); if (tmp) { show_dballoc_error(db," cannot create index header area"); return -1; } (dbh->indexhdr_area_header).fixedlength=1; (dbh->indexhdr_area_header).objlength=sizeof(wg_index_header); tmp=make_subarea_freelist(db,&(dbh->indexhdr_area_header),0); if (tmp) { show_dballoc_error(db," cannot initialize index header area"); return -1; } #ifdef USE_INDEX_TEMPLATE tmp=init_db_subarea(db,&(dbh->indextmpl_area_header),0,MINIMAL_SUBAREA_SIZE); if (tmp) { show_dballoc_error(db," cannot create index header area"); return -1; } (dbh->indextmpl_area_header).fixedlength=1; (dbh->indextmpl_area_header).objlength=sizeof(wg_index_template); tmp=make_subarea_freelist(db,&(dbh->indextmpl_area_header),0); if (tmp) { show_dballoc_error(db," cannot initialize index header area"); return -1; } #endif tmp=init_db_subarea(db,&(dbh->indexhash_area_header),0,INITIAL_SUBAREA_SIZE); if (tmp) { show_dballoc_error(db," cannot create indexhash area"); return -1; } (dbh->indexhash_area_header).fixedlength=0; tmp=init_area_buckets(db,&(dbh->indexhash_area_header)); // fill buckets with 0-s if (tmp) { show_dballoc_error(db," cannot initialize indexhash area buckets"); return -1; } tmp=init_subarea_freespace(db,&(dbh->indexhash_area_header),0); if (tmp) { show_dballoc_error(db," cannot initialize indexhash subarea 0"); return -1; } /* initialize other structures */ /* initialize strhash array area */ tmp=init_strhash_area(db,&(dbh->strhash_area_header)); if (tmp) { show_dballoc_error(db," cannot create strhash array area"); return -1; } /* initialize synchronization */ tmp=init_syn_vars(db); if (tmp) { show_dballoc_error(db," cannot initialize synchronization area"); return -1; } /* initialize external database register */ tmp=init_extdb(db); if (tmp) { show_dballoc_error(db," cannot initialize external db register"); return -1; } /* initialize index structures */ tmp=init_db_index_area_header(db); if (tmp) { show_dballoc_error(db," cannot initialize index header area"); return -1; } #ifdef USE_REASONER /* initialize anonconst table */ tmp=init_anonconst_table(db); if (tmp) { show_dballoc_error(db," cannot initialize anonconst table"); return -1; } #endif /* initialize logging structures */ tmp=init_logging(db); /* tmp=init_db_subarea(db,&(dbh->logging_area_header),0,INITIAL_SUBAREA_SIZE); if (tmp) { show_dballoc_error(db," cannot create logging area"); return -1; } (dbh->logging_area_header).fixedlength=0; tmp=init_area_buckets(db,&(dbh->logging_area_header)); // fill buckets with 0-s if (tmp) { show_dballoc_error(db," cannot initialize logging area buckets"); return -1; }*/ /* Database is initialized, mark it as valid */ dbh->mark=(gint32) MEMSEGMENT_MAGIC_MARK; return 0; } /** initializes a subarea. 
subarea is used for actual data obs allocation * * returns 0 if ok, negative otherwise; * * called - several times - first by wg_init_db_memsegment, then as old subareas * get filled up */ static gint init_db_subarea(void* db, void* area_header, gint index, gint size) { db_area_header* areah; gint segmentchunk; gint i; gint asize; //printf("init_db_subarea called with size %d \n",size); if (sizesubarea_array)[index]).size=size; ((areah->subarea_array)[index]).offset=segmentchunk; // set correct alignment for alignedoffset i=SUBAREA_ALIGNMENT_BYTES-(segmentchunk%SUBAREA_ALIGNMENT_BYTES); if (i==SUBAREA_ALIGNMENT_BYTES) i=0; ((areah->subarea_array)[index]).alignedoffset=segmentchunk+i; // set correct alignment for alignedsize asize=(size-i); i=asize-(asize%MIN_VARLENOBJ_SIZE); ((areah->subarea_array)[index]).alignedsize=i; // set last index and freelist areah->last_subarea_index=index; areah->freelist=0; return 0; } /** allocates a new segment chunk from the segment * * returns offset if successful, 0 if no more space available * if 0 returned, no allocation performed: can try with a smaller value * used for allocating all subareas * * Alignment is guaranteed to SUBAREA_ALIGNMENT_BYTES */ static gint alloc_db_segmentchunk(void* db, gint size) { db_memsegment_header* dbh = dbmemsegh(db); gint lastfree; gint nextfree; gint i; lastfree=dbh->free; nextfree=lastfree+size; if (nextfree<0) { show_dballoc_error_nr(db,"trying to allocate next segment exceeds positive int limit",size); return 0; } // set correct alignment for nextfree i=SUBAREA_ALIGNMENT_BYTES-(nextfree%SUBAREA_ALIGNMENT_BYTES); if (i==SUBAREA_ALIGNMENT_BYTES) i=0; nextfree=nextfree+i; if (nextfree>=(dbh->size)) { #ifndef SUPPRESS_LOWLEVEL_ERR show_dballoc_error_nr(db,"segment does not have enough space for the required chunk of size",size); #endif return 0; } dbh->free=nextfree; return lastfree; } /** initializes sync variable storage * * returns 0 if ok, negative otherwise; * Note that a basic spinlock area is initialized even if locking * is disabled, this is done for better memory image compatibility. */ static gint init_syn_vars(void* db) { db_memsegment_header* dbh = dbmemsegh(db); gint i; #if !defined(LOCK_PROTO) || (LOCK_PROTO < 3) /* rpspin, wpspin */ /* calculate aligned pointer */ i = ((gint) (dbh->locks._storage) + SYN_VAR_PADDING - 1) & -SYN_VAR_PADDING; dbh->locks.global_lock = dbaddr(db, (void *) i); dbh->locks.writers = dbaddr(db, (void *) (i + SYN_VAR_PADDING)); #else i = alloc_db_segmentchunk(db, SYN_VAR_PADDING * (MAX_LOCKS+2)); if(!i) return -1; /* re-align (SYN_VAR_PADDING <> SUBAREA_ALIGNMENT_BYTES) */ i = (i + SYN_VAR_PADDING - 1) & -SYN_VAR_PADDING; dbh->locks.queue_lock = i; dbh->locks.storage = i + SYN_VAR_PADDING; dbh->locks.max_nodes = MAX_LOCKS; dbh->locks.freelist = dbh->locks.storage; /* dummy, wg_init_locks() will overwrite this */ #endif /* allocating space was successful, set the initial state */ return wg_init_locks(db); } /** initializes external database register * * returns 0 if ok, negative otherwise; */ static gint init_extdb(void* db) { db_memsegment_header* dbh = dbmemsegh(db); int i; dbh->extdbs.count = 0; for(i=0; iextdbs.offset[i] = 0; dbh->extdbs.size[i] = 0; } return 0; } /** initializes main index area * Currently this function only sets up an empty index table. The rest * of the index storage is initialized by wg_init_db_memsegment(). 
* returns 0 if ok */ static gint init_db_index_area_header(void* db) { db_memsegment_header* dbh = dbmemsegh(db); dbh->index_control_area_header.number_of_indexes=0; memset(dbh->index_control_area_header.index_table, 0, (MAX_INDEXED_FIELDNR+1)*sizeof(gint)); dbh->index_control_area_header.index_list=0; #ifdef USE_INDEX_TEMPLATE dbh->index_control_area_header.index_template_list=0; memset(dbh->index_control_area_header.index_template_table, 0, (MAX_INDEXED_FIELDNR+1)*sizeof(gint)); #endif return 0; } /** initializes logging area * */ static gint init_logging(void* db) { db_memsegment_header* dbh = dbmemsegh(db); dbh->logging.active = 0; dbh->logging.dirty = 0; dbh->logging.serial = 1; /* non-zero, so that zero value in db handle * indicates uninitialized state. */ return 0; } /** initializes strhash area * */ static gint init_strhash_area(void* db, db_hash_area_header* areah) { db_memsegment_header* dbh = dbmemsegh(db); gint arraylength; if(STRHASH_SIZE > 0.01 && STRHASH_SIZE < 50) { arraylength = (gint) ((dbh->size+1) * (STRHASH_SIZE/100.0)) / sizeof(gint); } else { arraylength = DEFAULT_STRHASH_LENGTH; } return init_hash_subarea(db, areah, arraylength); } /** initializes hash area * */ static gint init_hash_subarea(void* db, db_hash_area_header* areah, gint arraylength) { gint segmentchunk; gint i; gint asize; gint j; //printf("init_hash_subarea called with arraylength %d \n",arraylength); asize=((arraylength+1)*sizeof(gint))+(2*SUBAREA_ALIGNMENT_BYTES); // 2* just to be safe //printf("asize: %d \n",asize); //if (asize<100) return -1; // errcase to filter out stupid requests segmentchunk=alloc_db_segmentchunk(db,asize); //printf("segmentchunk: %d \n",segmentchunk); if (!segmentchunk) return -2; // errcase areah->offset=segmentchunk; areah->size=asize; areah->arraylength=arraylength; // set correct alignment for arraystart i=SUBAREA_ALIGNMENT_BYTES-(segmentchunk%SUBAREA_ALIGNMENT_BYTES); if (i==SUBAREA_ALIGNMENT_BYTES) i=0; areah->arraystart=segmentchunk+i; i=areah->arraystart; for(j=0;janonconst.anonconst_nr=0; dbh->anonconst.anonconst_funs=0; // clearing is not really necessary for(i=0;ianonconst.anonconst_table)[i]=0; } if (intern_anonconst(db,ACONST_TRUE_STR,ACONST_TRUE)) return 1; if (intern_anonconst(db,ACONST_FALSE_STR,ACONST_FALSE)) return 1; if (intern_anonconst(db,ACONST_IF_STR,ACONST_IF)) return 1; if (intern_anonconst(db,ACONST_NOT_STR,ACONST_NOT)) return 1; if (intern_anonconst(db,ACONST_AND_STR,ACONST_AND)) return 1; if (intern_anonconst(db,ACONST_OR_STR,ACONST_OR)) return 1; if (intern_anonconst(db,ACONST_IMPLIES_STR,ACONST_IMPLIES)) return 1; if (intern_anonconst(db,ACONST_XOR_STR,ACONST_XOR)) return 1; if (intern_anonconst(db,ACONST_LESS_STR,ACONST_LESS)) return 1; if (intern_anonconst(db,ACONST_EQUAL_STR,ACONST_EQUAL)) return 1; if (intern_anonconst(db,ACONST_GREATER_STR,ACONST_GREATER)) return 1; if (intern_anonconst(db,ACONST_LESSOREQUAL_STR,ACONST_LESSOREQUAL)) return 1; if (intern_anonconst(db,ACONST_GREATEROREQUAL_STR,ACONST_GREATEROREQUAL)) return 1; if (intern_anonconst(db,ACONST_ISZERO_STR,ACONST_ISZERO)) return 1; if (intern_anonconst(db,ACONST_ISEMPTYSTR_STR,ACONST_ISEMPTYSTR)) return 1; if (intern_anonconst(db,ACONST_PLUS_STR,ACONST_PLUS)) return 1; if (intern_anonconst(db,ACONST_MINUS_STR,ACONST_MINUS)) return 1; if (intern_anonconst(db,ACONST_MULTIPLY_STR,ACONST_MULTIPLY)) return 1; if (intern_anonconst(db,ACONST_DIVIDE_STR,ACONST_DIVIDE)) return 1; if (intern_anonconst(db,ACONST_STRCONTAINS_STR,ACONST_STRCONTAINS)) return 1; if 
(intern_anonconst(db,ACONST_STRCONTAINSICASE_STR,ACONST_STRCONTAINSICASE)) return 1; if (intern_anonconst(db,ACONST_SUBSTR_STR,ACONST_SUBSTR)) return 1; if (intern_anonconst(db,ACONST_STRLEN_STR,ACONST_STRLEN)) return 1; ++(dbh->anonconst.anonconst_nr); // max used slot + 1 dbh->anonconst.anonconst_funs=dbh->anonconst.anonconst_nr; return 0; } /** internalizes new anonymous constants: used in init * */ static gint intern_anonconst(void* db, char* str, gint enr) { db_memsegment_header* dbh = dbmemsegh(db); gint nr; gint uri; nr=decode_anonconst(enr); if (nr<0 || nr>=ANONCONST_TABLE_SIZE) { show_dballoc_error_nr(db,"inside intern_anonconst: nr given out of range: ", nr); return 1; } uri=wg_encode_uri(db,str,NULL); if (uri==WG_ILLEGAL) { show_dballoc_error_nr(db,"inside intern_anonconst: cannot create an uri of size ",strlen(str)); return 1; } (dbh->anonconst.anonconst_table)[nr]=uri; if (dbh->anonconst.anonconst_nranonconst.anonconst_nr)=nr; return 0; } #endif /* -------- freelists creation ---------- */ /** create freelist for an area * * used for initialising (sub)areas used for fixed-size allocation * * returns 0 if ok * * speed stats: * * 10000 * creation of 100000 elems (1 000 000 000 or 1G ops) takes 1.2 sec on penryn * 1000 * creation of 1000000 elems (1 000 000 000 or 1G ops) takes 3.4 sec on penryn * */ static gint make_subarea_freelist(void* db, void* area_header, gint arrayindex) { db_area_header* areah; gint objlength; gint max; gint size; gint offset; gint i; // general area info areah=(db_area_header*)area_header; objlength=areah->objlength; //subarea info size=((areah->subarea_array)[arrayindex]).alignedsize; offset=((areah->subarea_array)[arrayindex]).alignedoffset; // create freelist max=(offset+size)-(2*objlength); for(i=offset;i<=max;i=i+objlength) { dbstore(db,i,i+objlength); } dbstore(db,i,0); (areah->freelist)=offset; // //printf("(areah->freelist) %d \n",(areah->freelist)); return 0; } /* -------- buckets creation ---------- */ /** fill bucket data for an area * * used for initialising areas used for variable-size allocation * * returns 0 if ok, not 0 if error * */ gint init_area_buckets(void* db, void* area_header) { db_area_header* areah; gint* freebuckets; gint i; // general area info areah=(db_area_header*)area_header; freebuckets=areah->freebuckets; // empty all buckets for(i=0;ifreebuckets; //subarea info size=((areah->subarea_array)[arrayindex]).alignedsize; offset=((areah->subarea_array)[arrayindex]).alignedoffset; // if the previous area exists, store current victim to freelist if (arrayindex>0) { dv=freebuckets[DVBUCKET]; dvsize=freebuckets[DVSIZEBUCKET]; if (dv!=0 && dvsize>=MIN_VARLENOBJ_SIZE) { dbstore(db,dv,makefreeobjectsize(dvsize)); // store new size with freebit to the second half of object dbstore(db,dv+dvsize-sizeof(gint),makefreeobjectsize(dvsize)); dvindex=wg_freebuckets_index(db,dvsize); freelist=freebuckets[dvindex]; if (freelist!=0) dbstore(db,freelist+2*sizeof(gint),dv); // update prev ptr dbstore(db,dv+sizeof(gint),freelist); // store previous freelist dbstore(db,dv+2*sizeof(gint),dbaddr(db,&freebuckets[dvindex])); // store ptr to previous freebuckets[dvindex]=dv; // store offset to correct bucket //printf("in init_subarea_freespace: \n PUSHED DV WITH SIZE %d TO FREELIST TO BUCKET %d:\n", // dvsize,dvindex); //show_bucket_freeobjects(db,freebuckets[dvindex]); } } // create two minimal in-use objects never to be freed: marking beginning // and end of free area via in-use bits in size // beginning of free area 
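
  /* For orientation (summarising the layout description in dballoc.h): after this
     function the aligned part of a varlen subarea looks like

       | start marker       | designated victim (all free space)  | end marker          |
       | MIN_VARLENOBJ_SIZE | alignedsize - 2*MIN_VARLENOBJ_SIZE  | MIN_VARLENOBJ_SIZE  |

     with both markers tagged as special in-use objects, so that merging in
     wg_free_object() can never walk past a subarea boundary. */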
dbstore(db,offset,makespecialusedobjectsize(MIN_VARLENOBJ_SIZE)); // lowest bit 0 means in use dbstore(db,offset+sizeof(gint),SPECIALGINT1START); // next ptr dbstore(db,offset+2*sizeof(gint),0); // prev ptr dbstore(db,offset+MIN_VARLENOBJ_SIZE-sizeof(gint),MIN_VARLENOBJ_SIZE); // len to end as well // end of free area endmarkobj=offset+size-MIN_VARLENOBJ_SIZE; dbstore(db,endmarkobj,makespecialusedobjectsize(MIN_VARLENOBJ_SIZE)); // lowest bit 0 means in use dbstore(db,endmarkobj+sizeof(gint),SPECIALGINT1END); // next ptr dbstore(db,endmarkobj+2*sizeof(gint),0); // prev ptr dbstore(db,endmarkobj+MIN_VARLENOBJ_SIZE-sizeof(gint),MIN_VARLENOBJ_SIZE); // len to end as well // calc where real free area starts and what is the size freeoffset=offset+MIN_VARLENOBJ_SIZE; freesize=size-2*MIN_VARLENOBJ_SIZE; // put whole free area into one free object // store the single free object as a designated victim dbstore(db,freeoffset,makespecialusedobjectsize(freesize)); // length without free bits: victim not marked free dbstore(db,freeoffset+sizeof(gint),SPECIALGINT1DV); // marks that it is a dv kind of special object freebuckets[DVBUCKET]=freeoffset; freebuckets[DVSIZEBUCKET]=freesize; // alternative: store the single free object to correct bucket /* dbstore(db,freeoffset,setcfree(freesize)); // size with free bits stored to beginning of object dbstore(db,freeoffset+sizeof(gint),0); // empty ptr to remaining obs stored after size i=freebuckets_index(db,freesize); if (i<0) { show_dballoc_error_nr(db,"initialising free object failed for ob size ",freesize); return -1; } dbstore(db,freeoffset+2*sizeof(gint),dbaddr(db,&freebuckets[i])); // ptr to previous stored freebuckets[i]=freeoffset; */ return 0; } /* -------- fixed length object allocation and freeing ---------- */ /** allocate a new fixed-len object * * return offset if ok, 0 if allocation fails */ gint wg_alloc_fixlen_object(void* db, void* area_header) { db_area_header* areah; gint freelist; areah=(db_area_header*)area_header; freelist=areah->freelist; if (!freelist) { if(!extend_fixedlen_area(db,areah)) { show_dballoc_error_nr(db,"cannot extend fixed length object area for size ",areah->objlength); return 0; } freelist=areah->freelist; if (!freelist) { show_dballoc_error_nr(db,"no free fixed length objects available for size ",areah->objlength); return 0; } else { areah->freelist=dbfetch(db,freelist); return freelist; } } else { areah->freelist=dbfetch(db,freelist); return freelist; } } /** create and initialise a new subarea for fixed-len obs area * * returns allocated size if ok, 0 if failure * used when the area has no more free space * */ static gint extend_fixedlen_area(void* db, void* area_header) { gint i; gint tmp; gint size, newsize; db_area_header* areah; areah=(db_area_header*)area_header; i=areah->last_subarea_index; if (i+1>=SUBAREA_ARRAY_SIZE) { show_dballoc_error_nr(db, " no more subarea array elements available for fixedlen of size: ",areah->objlength); return 0; // no more subarea array elements available } size=((areah->subarea_array)[i]).size; // last allocated subarea size // make tmp power-of-two times larger newsize=size<<1; //printf("fixlen OLD SUBAREA SIZE WAS %d NEW SUBAREA SIZE SHOULD BE %d\n",size,newsize); while(newsize >= MINIMAL_SUBAREA_SIZE) { if(!init_db_subarea(db,areah,i+1,newsize)) { goto done; } /* fall back to smaller size */ newsize>>=1; //printf("REQUIRED SPACE FAILED, TRYING %d\n",newsize); } show_dballoc_error_nr(db," cannot extend datarec area with a new subarea of size: ",newsize<<1); return 0; done: // 
here we have successfully allocated a new subarea tmp=make_subarea_freelist(db,areah,i+1); // fill with a freelist, store ptrs if (tmp) { show_dballoc_error(db," cannot initialize new subarea"); return 0; } return newsize; } /** free an existing listcell * * the object is added to the freelist * */ void wg_free_listcell(void* db, gint offset) { dbstore(db,offset,(dbmemsegh(db)->listcell_area_header).freelist); (dbmemsegh(db)->listcell_area_header).freelist=offset; } /** free an existing shortstr object * * the object is added to the freelist * */ void wg_free_shortstr(void* db, gint offset) { dbstore(db,offset,(dbmemsegh(db)->shortstr_area_header).freelist); (dbmemsegh(db)->shortstr_area_header).freelist=offset; } /** free an existing word-len object * * the object is added to the freelist * */ void wg_free_word(void* db, gint offset) { dbstore(db,offset,(dbmemsegh(db)->word_area_header).freelist); (dbmemsegh(db)->word_area_header).freelist=offset; } /** free an existing doubleword object * * the object is added to the freelist * */ void wg_free_doubleword(void* db, gint offset) { dbstore(db,offset,(dbmemsegh(db)->doubleword_area_header).freelist); //bug fixed here (dbmemsegh(db)->doubleword_area_header).freelist=offset; } /** free an existing tnode object * * the object is added to the freelist * */ void wg_free_tnode(void* db, gint offset) { dbstore(db,offset,(dbmemsegh(db)->tnode_area_header).freelist); (dbmemsegh(db)->tnode_area_header).freelist=offset; } /** free generic fixlen object * * the object is added to the freelist * */ void wg_free_fixlen_object(void* db, db_area_header *hdr, gint offset) { dbstore(db,offset,hdr->freelist); hdr->freelist=offset; } /* -------- variable length object allocation and freeing ---------- */ /** allocate a new object of given length * * returns correct offset if ok, 0 in case of error * */ gint wg_alloc_gints(void* db, void* area_header, gint nr) { gint wantedbytes; // actually wanted size in bytes, stored in object header gint usedbytes; // amount of bytes used: either wantedbytes or bytes+4 (obj must be 8 aligned) gint* freebuckets; gint res, nextobject; gint nextel; gint i; gint j; gint tmp; gint size; db_area_header* areah; areah=(db_area_header*)area_header; wantedbytes=nr*sizeof(gint); // object sizes are stored in bytes if (wantedbytes<0) return 0; // cannot allocate negative or zero sizes if (wantedbytes<=MIN_VARLENOBJ_SIZE) usedbytes=MIN_VARLENOBJ_SIZE; /* XXX: modifying the next line breaks encode_query_param_unistr(). 
* Rewrite this using macros to reduce the chance of accidental breakage */ else if (wantedbytes%8) usedbytes=wantedbytes+4; else usedbytes=wantedbytes; //printf("wg_alloc_gints called with nr %d and wantedbytes %d and usedbytes %d\n",nr,wantedbytes,usedbytes); // first find if suitable length free object is available freebuckets=areah->freebuckets; if (usedbytes=usedbytes+MIN_VARLENOBJ_SIZE) { // found one somewhat larger: now split and store the rest res=freebuckets[i]; tmp=split_free(db,areah,usedbytes,freebuckets,i); if (tmp<0) return 0; // error case // prev elem cannot be free (no consecutive free elems) dbstore(db,res,makeusedobjectsizeprevused(wantedbytes)); // store wanted size to the returned object return res; } } // next try to use the cached designated victim for creating objects off beginning // designated victim is not marked free by header and is not present in any freelist size=freebuckets[DVSIZEBUCKET]; if (usedbytes<=size && freebuckets[DVBUCKET]!=0) { res=freebuckets[DVBUCKET]; if (usedbytes==size) { // found a designated victim of exactly right size, dv is used up and disappears freebuckets[DVBUCKET]=0; freebuckets[DVSIZEBUCKET]=0; // prev elem of dv cannot be free dbstore(db,res,makeusedobjectsizeprevused(wantedbytes)); // store wanted size to the returned object return res; } else if (usedbytes+MIN_VARLENOBJ_SIZE<=size) { // found a designated victim somewhat larger: take the first part and keep the rest as dv dbstore(db,res+usedbytes,makespecialusedobjectsize(size-usedbytes)); // store smaller size to victim, turn off free bits dbstore(db,res+usedbytes+sizeof(gint),SPECIALGINT1DV); // marks that it is a dv kind of special object freebuckets[DVBUCKET]=res+usedbytes; // point to rest of victim freebuckets[DVSIZEBUCKET]=size-usedbytes; // rest of victim becomes shorter // prev elem of dv cannot be free dbstore(db,res,makeusedobjectsizeprevused(wantedbytes)); // store wanted size to the returned object return res; } } // next try to find first free object in exact-length buckets (shorter first) for(i=usedbytes+1;i=usedbytes+MIN_VARLENOBJ_SIZE) { // found one somewhat larger: now split and store the rest res=freebuckets[i]; tmp=split_free(db,areah,usedbytes,freebuckets,i); if (tmp<0) return 0; // error case // prev elem cannot be free (no consecutive free elems) dbstore(db,res,makeusedobjectsizeprevused(wantedbytes)); // store wanted size to the returned object return res; } } // next try to find first free object in var-length buckets (shorter first) for(i=wg_freebuckets_index(db,usedbytes);i=usedbytes+MIN_VARLENOBJ_SIZE) { // found one somewhat larger: now split and store the rest res=freebuckets[i]; //printf("db %d,nr %d,freebuckets %d,i %d\n",db,(int)nr,(int)freebuckets,(int)i); tmp=split_free(db,areah,usedbytes,freebuckets,i); if (tmp<0) return 0; // error case // prev elem cannot be free (no consecutive free elems) dbstore(db,res,makeusedobjectsizeprevused(wantedbytes)); // store wanted size to the returned object return res; } } } // down here we have found no suitable dv or free object to use for allocation // try to get a new memory area //printf("ABOUT TO CREATE A NEW SUBAREA\n"); tmp=extend_varlen_area(db,areah,usedbytes); if (!tmp) { show_dballoc_error(db," cannot initialize new varlen subarea"); return 0; } // here we have successfully allocated a new subarea // call self recursively: this call will use the new free area tmp=wg_alloc_gints(db,areah,nr); //show_db_memsegment_header(db); return tmp; } /** create and initialise a new subarea for var-len obs area 
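 *
 * (Usage sketch for wg_alloc_gints()/wg_free_object() above -- illustrative only;
 * the datarec area is used as an example and error handling is minimal. The
 * allocator keeps the object's size header in the first gint of the returned
 * object, so payload starts one gint further:
 *
 *   static gint varlen_roundtrip_example(void *db) {
 *     db_area_header *areah = &(dbmemsegh(db)->datarec_area_header);
 *     gint obj = wg_alloc_gints(db, areah, 10);   // room for 10 gints
 *     if (!obj) return -1;                        // allocation failed
 *     dbstore(db, obj + sizeof(gint), 42);        // first payload gint
 *     return wg_free_object(db, areah, obj);      // 0 on success
 *   }
 * )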
* * returns allocated size if ok, 0 if failure * used when the area has no more free space * * bytes indicates the minimal required amount: * could be extended much more, but not less than bytes * */ static gint extend_varlen_area(void* db, void* area_header, gint minbytes) { gint i; gint tmp; gint size, minsize, newsize; db_area_header* areah; areah=(db_area_header*)area_header; i=areah->last_subarea_index; if (i+1>=SUBAREA_ARRAY_SIZE) { show_dballoc_error_nr(db," no more subarea array elements available for datarec: ",i); return 0; // no more subarea array elements available } size=((areah->subarea_array)[i]).size; // last allocated subarea size minsize=minbytes+SUBAREA_ALIGNMENT_BYTES+2*(MIN_VARLENOBJ_SIZE); // minimum allowed #ifdef CHECK if(minsize<0) { /* sanity check */ show_dballoc_error_nr(db, "invalid number of bytes requested: ", minbytes); return 0; } #endif if(minsize=0 && newsize= minsize) { if(!init_db_subarea(db,areah,i+1,newsize)) { goto done; } /* fall back to smaller size */ newsize>>=1; //printf("REQUIRED SPACE FAILED, TRYING %d\n",newsize); } show_dballoc_error_nr(db," cannot extend datarec area with a new subarea of size: ",newsize<<1); return 0; done: // here we have successfully allocated a new subarea tmp=init_subarea_freespace(db,areah,i+1); // mark beg and end, store new victim if (tmp) { show_dballoc_error(db," cannot initialize new subarea"); return 0; } return newsize; } /** splits a free object into a smaller new object and the remainder, stores remainder to right list * * returns 0 if ok, negative nr in case of error * we assume we always split the first elem in a bucket freelist * we also assume the remainder is >=MIN_VARLENOBJ_SIZE * */ static gint split_free(void* db, void* area_header, gint nr, gint* freebuckets, gint i) { gint object; gint oldsize; gint oldnextptr; gint splitsize; gint splitobject; gint splitindex; gint freelist; gint dv; gint dvsize; gint dvindex; object=freebuckets[i]; // object offset oldsize=dbfetch(db,object); // first gint at offset if (!isfreeobject(oldsize)) return -1; // not really a free object! 
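  /* (Reminder of the 2-bit header tags tested above, per the summary in dballoc.h:
       00  in use, previous object in use      10  in use, previous object free
       01  free                                11  special in-use (dv / subarea marker) ) */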
oldsize=getfreeobjectsize(oldsize); // remove free bits, get real size // observe object is first obj in freelist, hence no free obj at prevptr oldnextptr=dbfetch(db,object+sizeof(gint)); // second gint at offset // store new size at offset (beginning of object) and mark as used with used prev // observe that a free object cannot follow another free object, hence we know prev is used dbstore(db,object,makeusedobjectsizeprevused(nr)); freebuckets[i]=oldnextptr; // store ptr to next elem into bucket ptr splitsize=oldsize-nr; // remaining size splitobject=object+nr; // offset of the part left // we may store the splitobject as a designated victim instead of a suitable freelist // but currently this is disallowed and the underlying code is not really finished: // marking of next used object prev-free/prev-used is missing // instead of this code we rely on using a newly freed object as dv is larger than dv dvsize=freebuckets[DVSIZEBUCKET]; if (0) { // (splitsize>dvsize) { // store splitobj as a new designated victim, but first store current victim to freelist, if possible dv=freebuckets[DVBUCKET]; if (dv!=0) { if (dvsizefreebuckets; // first try to merge with the previous free object, if so marked if (isnormalusedobjectprevfree(objecthead)) { //printf("**** about to merge object %d on free with prev %d !\n",object,prevobject); // use the size of the previous (free) object stored at the end of the previous object prevobjectsize=getfreeobjectsize(dbfetch(db,(object-sizeof(gint)))); prevobject=object-prevobjectsize; prevobjecthead=dbfetch(db,prevobject); if (!isfreeobject(prevobjecthead) || !getfreeobjectsize(prevobject)==prevobjectsize) { show_dballoc_error(db,"wg_free_object notices corruption: previous object is not ok free object"); return -4; // corruption noticed } // remove prev object from its freelist // first, get necessary information prevnextptr=dbfetch(db,prevobject+sizeof(gint)); prevprevptr=dbfetch(db,prevobject+2*sizeof(gint)); previndex=wg_freebuckets_index(db,prevobjectsize); freelist=freebuckets[previndex]; // second, really remove prev object from freelist if (freelist==prevobject) { // prev object pointed to directly from bucket freebuckets[previndex]=prevnextptr; // modify prev prev if (prevnextptr!=0) dbstore(db,prevnextptr+2*sizeof(gint),prevprevptr); // modify prev next } else { // prev object pointed to from another object, not directly bucket // next of prev of prev will point to next of next dbstore(db,prevprevptr+sizeof(gint),prevnextptr); // prev of next of prev will prev-point to prev of prev if (prevnextptr!=0) dbstore(db,prevnextptr+2*sizeof(gint),prevprevptr); } // now treat the prev object as the current object to be freed! object=prevobject; size=size+prevobjectsize; } else if ((freebuckets[DVBUCKET]+freebuckets[DVSIZEBUCKET])==object) { // should merge with a previous dv object=freebuckets[DVBUCKET]; size=size+freebuckets[DVSIZEBUCKET]; // increase size to cover dv as well // modify dv size information in area header: dv will extend to freed object freebuckets[DVSIZEBUCKET]=size; // store dv size and marker to dv head dbstore(db,object,makespecialusedobjectsize(size)); dbstore(db,object+sizeof(gint),SPECIALGINT1DV); return 0; // do not store anything to freebuckets!! 
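  /* (The designated victim is deliberately kept out of every freelist bucket: it is
     tagged as a special in-use object and tracked only through freebuckets[DVBUCKET]
     and freebuckets[DVSIZEBUCKET], hence the early return above.) */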
} // next, try to merge with the next object: either free object or dv // also, if next object is normally used instead, mark it as following the free object nextobject=object+size; nextobjecthead=dbfetch(db,nextobject); if (isfreeobject(nextobjecthead)) { // should merge with a following free object //printf("**** about to merge object %d on free with next %d !\n",object,nextobject); size=size+getfreeobjectsize(nextobjecthead); // increase size to cover next object as well // remove next object from its freelist // first, get necessary information nextnextptr=dbfetch(db,nextobject+sizeof(gint)); nextprevptr=dbfetch(db,nextobject+2*sizeof(gint)); nextindex=wg_freebuckets_index(db,getfreeobjectsize(nextobjecthead)); freelist=freebuckets[nextindex]; // second, really remove next object from freelist if (freelist==nextobject) { // next object pointed to directly from bucket freebuckets[nextindex]=nextnextptr; // modify next prev if (nextnextptr!=0) dbstore(db,nextnextptr+2*sizeof(gint),nextprevptr); // modify next next } else { // next object pointed to from another object, not directly bucket // prev of next will point to next of next dbstore(db,nextprevptr+sizeof(gint),nextnextptr); // next of next will prev-point to prev of next if (nextnextptr!=0) dbstore(db,nextnextptr+2*sizeof(gint),nextprevptr); } } else if (isspecialusedobject(nextobjecthead) && nextobject==freebuckets[DVBUCKET]) { // should merge with a following dv size=size+freebuckets[DVSIZEBUCKET]; // increase size to cover next object as well // modify dv information in area header freebuckets[DVBUCKET]=object; freebuckets[DVSIZEBUCKET]=size; // store dv size and marker to dv head dbstore(db,object,makespecialusedobjectsize(size)); dbstore(db,object+sizeof(gint),SPECIALGINT1DV); return 0; // do not store anything to freebuckets!! } else if (isnormalusedobject(nextobjecthead)) { // mark the next used object as following a free object dbstore(db,nextobject,makeusedobjectsizeprevfree(dbfetch(db,nextobject))); } // we do no special actions in case next object is end marker // maybe the newly freed object is larger than the designated victim? // if yes, use the newly freed object as a new designated victim // and afterwards put the old dv to freelist if (size>freebuckets[DVSIZEBUCKET]) { dv=freebuckets[DVBUCKET]; dvsize=freebuckets[DVSIZEBUCKET]; freebuckets[DVBUCKET]=object; freebuckets[DVSIZEBUCKET]=size; dbstore(db,object,makespecialusedobjectsize(size)); dbstore(db,object+sizeof(gint),SPECIALGINT1DV); // set the next used object mark to prev-used! 
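  /* (Once the freed object becomes the new designated victim it is no longer "free"
     in the header-tag sense, so the object following it must be re-tagged as having
     an in-use predecessor; the displaced old victim gets the opposite marking below
     before it is pushed to a freelist.) */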
nextobject=object+size; tmp=dbfetch(db,nextobject); if (isnormalusedobject(tmp)) dbstore(db,nextobject,makeusedobjectsizeprevused(tmp)); // dv handling if (dv==0) return 0; // if no dv actually, then nothing to put to freelist // set the object point to dv to make it put into freelist after // but first mark the next object after dv as following free nextobject=dv+dvsize; tmp=dbfetch(db,nextobject); if (isnormalusedobject(tmp)) dbstore(db,nextobject,makeusedobjectsizeprevfree(tmp)); // let the old dv be handled as object to be put to freelist after object=dv; size=dvsize; } // store freed (or freed and merged) object to the correct bucket, // except for dv-merge cases above (returns earlier) i=wg_freebuckets_index(db,size); bucketfreelist=freebuckets[i]; if (bucketfreelist!=0) dbstore(db,bucketfreelist+2*sizeof(gint),object); // update prev ptr dbstore(db,object,makefreeobjectsize(size)); // store size and freebit dbstore(db,object+size-sizeof(gint),makefreeobjectsize(size)); // store size and freebit dbstore(db,object+sizeof(gint),bucketfreelist); // store previous freelist dbstore(db,object+2*sizeof(gint),dbaddr(db,&freebuckets[i])); // store prev ptr freebuckets[i]=object; return 0; } /* Tanel Tammet http://www.epl.ee/?i=112121212 Kuiv tn 9, Tallinn, Estonia +3725524876 len | refcount | xsd:type | namespace | contents .... | header: 4*4=16 bytes 128 bytes */ /***************** Child database functions ******************/ /* Register external database offset * * Stores offset and size of an external database. This allows * recognizing external pointers/offsets and computing their * base offset. * * Once external data is stored to the database, the memory * image can no longer be saved/restored. */ gint wg_register_external_db(void *db, void *extdb) { #ifdef USE_CHILD_DB db_memsegment_header* dbh = dbmemsegh(db); #ifdef CHECK if(dbh->key != 0) { show_dballoc_error(db, "external references not allowed in a shared memory db"); return -1; } #endif if(dbh->index_control_area_header.number_of_indexes > 0) { return show_dballoc_error(db, "Database has indexes, external references not allowed"); } if(dbh->extdbs.count >= MAX_EXTDB) { show_dballoc_error(db, "cannot register external database"); } else { dbh->extdbs.offset[dbh->extdbs.count] = ptrtooffset(db, dbmemsegh(extdb)); dbh->extdbs.size[dbh->extdbs.count++] = \ dbmemsegh(extdb)->size; } return 0; #else show_dballoc_error(db, "child database support is not enabled"); return -1; #endif } /******************** Hash index support *********************/ /* * Initialize a new hash table for an index. */ gint wg_create_hash(void *db, db_hash_area_header* areah, gint size) { if(size <= 0) size = DEFAULT_IDXHASH_LENGTH; if(init_hash_subarea(db, areah, size)) { return show_dballoc_error(db," cannot create strhash array area"); } return 0; } /********** Helper functions for accessing the header ********/ /* * Return free space in segment (in bytes) * Also tries to predict whether it is possible to allocate more * space in the segment. */ gint wg_database_freesize(void *db) { db_memsegment_header* dbh = dbmemsegh(db); gint freesize = dbh->size - dbh->free; return (freesize < MINIMAL_SUBAREA_SIZE ? 
0 : freesize); } /* * Return total segment size (in bytes) */ gint wg_database_size(void *db) { db_memsegment_header* dbh = dbmemsegh(db); return dbh->size; } /* --------------- error handling ------------------------------*/ /** called with err msg when an allocation error occurs * * may print or log an error * does not do any jumps etc */ static gint show_dballoc_error(void* db, char* errmsg) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"db memory allocation error: %s\n",errmsg); #endif return -1; } /** called with err msg and err nr when an allocation error occurs * * may print or log an error * does not do any jumps etc */ static gint show_dballoc_error_nr(void* db, char* errmsg, gint nr) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"db memory allocation error: %s %d\n", errmsg, (int) nr); #endif return -1; } #ifdef __cplusplus } #endif whitedb-0.7.2/Db/dballoc.h000066400000000000000000000535111226454622500152610ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009 * Copyright (c) Priit Järv 2013 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dballoc.h * Public headers for database heap allocation procedures. */ #ifndef DEFINED_DBALLOC_H #define DEFINED_DBALLOC_H /* For gint/wg_int types */ #include #ifndef _MSC_VER #include #endif #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #define USE_DATABASE_HANDLE /* Levels of allocation used: - Memory segment allocation: gives a large contiguous area of memory (typically shared memory). Could be extended later (has to be contiguous). - Inside the contiguous memory segment: Allocate usage areas for different heaps (data records, strings, doubles, lists, etc). Each area is typically not contiguous: can consist of several subareas of different length. Areas have different object allocation principles: - fixed-length object area (e.g. list cells) allocation uses pre-calced freelists - various-length object area (e.g. data records) allocation uses ordinary allocation techniques: - objects initialised from next free / designated victim object, split as needed - short freed objects are put into freelists in size-corresponding buckets - large freed object lists contain objects of different sizes - Data object allocation: data records, strings, list cells etc. Allocated in corresponding subareas. list area: 8M is filled 16 M area 32 datarec area: 8M is filled 16 M area 32 M area Fixlen allocation: - Fixlen objects are allocated using a pre-calced singly-linked freelist. When one subarea is exhausted(freelist empty), a new subarea is taken, it is organised into a long freelist and the beginning of the freelist is stored in db_area_header.freelist. - Each freelist element is one fixlen object. The first gint of the object is an offset of the next freelist element. The list is terminated with 0. 
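
   For illustration, the fixlen allocation fast path (wg_alloc_fixlen_object in
   dballoc.c) is just a pop from that singly-linked freelist:

     freelist = areah->freelist;                  // offset of first free object, 0 if none
     if (freelist) {
       areah->freelist = dbfetch(db, freelist);   // the next free object sits in the 1st gint
       return freelist;                           // hand out the old list head
     }
     // otherwise a new subarea is created and organised into a fresh freelist
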
Varlen allocation follows the main ideas of the Doug Lea allocator: - the minimum size to allocate is 4 gints (MIN_VARLENOBJ_SIZE) and all objects should be aligned at least to a gint. - each varlen area contains a number of gint-size buckets for storing different doubly-linked freelists. The buckets are: - EXACTBUCKETS_NR of buckets for exact object size. Contains an offset of the first free object of this size. - VARBUCKETS_NR of buckets for variable (interval between prev and next) object size, growing exponentially. Contains an offset of the first free object in this size interval. - EXACTBUCKETS_NR+VARBUCKETS_NR+1 is a designated victim (marked as in use): offset of the preferred place to split off new objects. Initially the whole free area is made one big designated victim. - EXACTBUCKETS_NR+VARBUCKETS_NR+2 is a size of the designated victim. - a free object contains gints: - size (in bytes) with last two bits marked (i.e. not part of size!): - last bits: 00 - offset of the next element in the freelist (terminated with 0). - offset of the previous element in the freelist (can be offset of the bucket!) ... arbitrary nr of bytes ... - size (in bytes) with last two bits marked as the initial size gint. This repeats the initial size gint and is located at the very end of the memory block. - an in-use object contains gints: - size (in bytes) with mark bits and assumptions: - last 2 bits markers, not part of size: - for normal in-use objects with in-use predecessor 00 - for normal in-use objects with free predecessor 10 - for specials (dv area and start/end markers) 11 - real size taken is always 8-aligned (minimal granularity 8 bytes) - size gint may be not 8-aligned if 32-bit gint used (but still has to be 4-aligned). In this case: - if size gint is not 8-aligned, real size taken either: - if size less than MIN_VARLENOBJ_SIZE, then MIN_VARLENOBJ_SIZE - else size+4 bytes (but used size is just size, no bytes added) - usable gints following - a designated victim is marked to be in use: - the first gint has last bits 11 to differentiate from normal in-use objects (00 or 10 bits) - the second gint contains 0 to indicate that it is a dv object, and not start marker (1) or end marker (2) - all the following gints are arbitrary and contain no markup. - the first 4 gints and the last 4 gints of each subarea are marked as in-use objects, although they should be never used! The reason is to give a markup for subarea beginning and end. 
- last bits 10 to differentiate from normal in-use objects (00 bits) - the next gint is 1 for start marker an 2 for end marker - the following 2 gints are arbitrary and contain no markup - summary of end bits for various objects: - 00 in-use normal object with in-use previous object - 10 in-use normal object with a free previous object - 01 free object - 11 in-use special object (dv or start/end marker) */ #define MEMSEGMENT_MAGIC_MARK 1232319011 /** enables to check that we really have db pointer */ #define MEMSEGMENT_MAGIC_INIT 1916950123 /** init time magic */ #define MEMSEGMENT_VERSION ((VERSION_REV<<16)|\ (VERSION_MINOR<<8)|(VERSION_MAJOR)) /** written to dump headers for compatibilty checking */ #define SUBAREA_ARRAY_SIZE 64 /** nr of possible subareas in each area */ #define INITIAL_SUBAREA_SIZE 8192 /** size of the first created subarea (bytes) */ #define MINIMAL_SUBAREA_SIZE 8192 /** checked before subarea creation to filter out stupid requests */ #define SUBAREA_ALIGNMENT_BYTES 8 /** subarea alignment */ #define SYN_VAR_PADDING 128 /** sync variable padding in bytes */ #if (LOCK_PROTO==3) #define MAX_LOCKS 64 /** queue size (currently fixed :-() */ #endif #define EXACTBUCKETS_NR 256 /** amount of free ob buckets with exact length */ #define VARBUCKETS_NR 32 /** amount of free ob buckets with varying length */ #define CACHEBUCKETS_NR 2 /** buckets used as special caches */ #define DVBUCKET EXACTBUCKETS_NR+VARBUCKETS_NR /** cachebucket: designated victim offset */ #define DVSIZEBUCKET EXACTBUCKETS_NR+VARBUCKETS_NR+1 /** cachebucket: byte size of designated victim */ #define MIN_VARLENOBJ_SIZE (4*(gint)(sizeof(gint))) /** minimal size of variable length object */ #define SHORTSTR_SIZE 32 /** max len of short strings */ /* defaults, used when there is no user-supplied or computed value */ #define DEFAULT_STRHASH_LENGTH 10000 /** length of the strhash array (nr of array elements) */ #define DEFAULT_IDXHASH_LENGTH 10000 /** hash index hash size */ #define ANONCONST_TABLE_SIZE 200 /** length of the table containing predefined anonconst uri ptrs */ /* ====== general typedefs and macros ======= */ // integer and address fetch and store typedef ptrdiff_t gint; /** always used instead of int. Pointers are also handled as gint. 
*/ #ifndef _MSC_VER /* MSVC on Win32 */ typedef int32_t gint32; /** 32-bit fixed size storage */ typedef int64_t gint64; /** 64-bit fixed size storage */ #else typedef __int32 gint32; /** 32-bit fixed size storage */ typedef __int64 gint64; /** 64-bit fixed size storage */ #endif #ifdef USE_DATABASE_HANDLE #define dbmemseg(x) ((void *)(((db_handle *) x)->db)) #define dbmemsegh(x) ((db_memsegment_header *)(((db_handle *) x)->db)) #define dbmemsegbytes(x) ((char *)(((db_handle *) x)->db)) #else #define dbmemseg(x) ((void *)(x)) #define dbmemsegh(x) ((db_memsegment_header *)(x)) #define dbmemsegbytes(x) ((char *)(x)) #endif #define dbfetch(db,offset) (*((gint*)(dbmemsegbytes(db)+(offset)))) /** get gint from address */ #define dbstore(db,offset,data) (*((gint*)(dbmemsegbytes(db)+(offset)))=data) /** store gint to address */ #define dbaddr(db,realptr) ((gint)(((char*)(realptr))-dbmemsegbytes(db))) /** give offset of real adress */ #define offsettoptr(db,offset) ((void*)(dbmemsegbytes(db)+(offset))) /** give real address from offset */ #define ptrtooffset(db,realptr) (dbaddr((db),(realptr))) #define dbcheckh(dbh) (dbh!=NULL && *((gint32 *) dbh)==MEMSEGMENT_MAGIC_MARK) /** check that correct db ptr */ #define dbcheck(db) dbcheckh(dbmemsegh(db)) /** check that correct db ptr */ #define dbcheckhinit(dbh) (dbh!=NULL && *((gint32 *) dbh)==MEMSEGMENT_MAGIC_INIT) #define dbcheckinit(db) dbcheckhinit(dbmemsegh(db)) /* ==== fixlen object allocation macros ==== */ #define alloc_listcell(db) wg_alloc_fixlen_object((db),&(dbmemsegh(db)->listcell_area_header)) #define alloc_shortstr(db) wg_alloc_fixlen_object((db),&(dbmemsegh(db)->shortstr_area_header)) #define alloc_word(db) wg_alloc_fixlen_object((db),&(dbmemsegh(db)->word_area_header)) #define alloc_doubleword(db) wg_alloc_fixlen_object((db),&(dbmemsegh(db)->doubleword_area_header)) /* ==== varlen object allocation special macros ==== */ #define isfreeobject(i) (((i) & 3)==1) /** end bits 01 */ #define isnormalusedobject(i) (!((i) & 1)) /** end bits either 00 or 10, i.e. last bit 0 */ #define isnormalusedobjectprevused(i) (!((i) & 3)) /** end bits 00 */ #define isnormalusedobjectprevfree(i) (((i) & 3)==2) /** end bits 10 */ #define isspecialusedobject(i) (((i) & 3) == 3) /** end bits 11 */ #define getfreeobjectsize(i) ((i) & ~3) /** mask off two lowest bits: just keep all higher */ /** small size marks always use MIN_VARLENOBJ_SIZE, * non-8-aligned size marks mean obj really takes 4 more bytes (all real used sizes are 8-aligned) */ #define getusedobjectsize(i) (((i) & ~3)<=MIN_VARLENOBJ_SIZE ? MIN_VARLENOBJ_SIZE : ((((i) & ~3)%8) ? (((i) & ~3)+4) : ((i) & ~3)) ) #define getspecialusedobjectsize(i) ((i) & ~3) /** mask off two lowest bits: just keep all higher */ #define getusedobjectwantedbytes(i) ((i) & ~3) #define getusedobjectwantedgintsnr(i) (((i) & ~3)>>((sizeof(gint)==4) ? 
2 : 3)) /** divide pure size by four or eight */ #define makefreeobjectsize(i) (((i) & ~3)|1) /** set lowest bits to 01: current object is free */ #define makeusedobjectsizeprevused(i) ((i) & ~3) /** set lowest bits to 00 */ #define makeusedobjectsizeprevfree(i) (((i) & ~3)|2) /** set lowest bits to 10 */ #define makespecialusedobjectsize(i) ((i)|3) /** set lowest bits to 11 */ #define SPECIALGINT1DV 1 /** second gint of a special in use dv area */ #define SPECIALGINT1START 0 /** second gint of a special in use start marker area, should be 0 */ #define SPECIALGINT1END 0 /** second gint of a special in use end marker area, should be 0 */ // #define setpfree(i) ((i) | 2) /** set next lowest bit to 1: previous object is free ???? */ /* === data structures used in allocated areas ===== */ /** general list cell: a pair of two integers (both can be also used as pointers) */ typedef struct { gint car; /** first element */ gint cdr;} /** second element, often a pointer to the rest of the list */ gcell; #define car(cell) (((gint)((gcell*)(cell)))->car) /** get list cell first elem gint */ #define cdr(cell) (((gint)((gcell*)(cell)))->cdr) /** get list cell second elem gint */ /* index related stuff */ #define MAX_INDEX_FIELDS 10 /** maximum number of fields in one index */ #define MAX_INDEXED_FIELDNR 127 /** limits the size of field/index table */ #ifndef TTREE_CHAINED_NODES #define WG_TNODE_ARRAY_SIZE 10 #else #define WG_TNODE_ARRAY_SIZE 8 #endif /* logging related */ #define maxnumberoflogrows 10 /* external database stuff */ #define MAX_EXTDB 20 /* ====== segment/area header data structures ======== */ /* memory segment structure: ------------- db_memsegment_header - - - - - - - db_area_header - - - - db_subarea_header ... db_subarea_header - - - - - - - ... - - - - - - - db_area_header - - - - db_subarea_header ... db_subarea_header ---------------- various actual subareas ---------------- */ /** located inside db_area_header: one single memory subarea header * * alignedoffset should be always used: it may come some bytes after offset */ typedef struct _db_subarea_header { gint size; /** size of subarea */ gint offset; /** subarea exact offset from segment start: do not use for objects! */ gint alignedsize; /** subarea object alloc usable size: not necessarily to end of area */ gint alignedoffset; /** subarea start as to be used for object allocation */ } db_subarea_header; /** located inside db_memsegment_header: one single memory area header * */ typedef struct _db_area_header { gint fixedlength; /** 1 if fixed length area, 0 if variable length */ gint objlength; /** only for fixedlength: length of allocatable obs in bytes */ gint freelist; /** freelist start: if 0, then no free objects available */ gint last_subarea_index; /** last used subarea index (0,...,) */ db_subarea_header subarea_array[SUBAREA_ARRAY_SIZE]; /** array of subarea headers */ gint freebuckets[EXACTBUCKETS_NR+VARBUCKETS_NR+CACHEBUCKETS_NR]; /** array of subarea headers */ } db_area_header; /** synchronization structures in shared memory * * Note that due to the similarity we can keep the memory images * using the wpspin and rpspin protocols compatible. 
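 *
 * (With the default rpspin/wpspin protocols only global_lock and writers are
 * used; init_syn_vars() in dballoc.c carves both out of the padded _storage
 * block. Under the tfqueue protocol a separate queue node storage of MAX_LOCKS
 * cells is allocated from the segment instead.)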
*/ typedef struct { #if !defined(LOCK_PROTO) || (LOCK_PROTO < 3) /* rpspin, wpspin */ gint global_lock; /** db offset to cache-aligned sync variable */ gint writers; /** db offset to cache-aligned writer count */ char _storage[SYN_VAR_PADDING*3]; /** padded storage */ #else /* tfqueue */ gint tail; /** db offset to last queue node */ gint queue_lock; /** db offset to cache-aligned sync variable */ gint storage; /** db offset to queue node storage */ gint max_nodes; /** number of cells in queue node storage */ gint freelist; /** db offset to the top of the allocation stack */ #endif } syn_var_area; /** hash area header * */ typedef struct _db_hash_area_header { gint size; /** size of subarea */ gint offset; /** subarea exact offset from segment start: do not use for array! */ gint arraysize; /** subarea object alloc usable size: not necessarily to end of area */ gint arraystart; /** subarea start as to be used for object allocation */ gint arraylength; /** nr of elements in the hash array */ } db_hash_area_header; /** * T-tree specific index header fields */ struct __wg_ttree_header { gint offset_root_node; #ifdef TTREE_CHAINED_NODES gint offset_max_node; /** last node in chain */ gint offset_min_node; /** first node in chain */ #endif }; /** * Hash-specific index header fields */ struct __wg_hashidx_header { db_hash_area_header hasharea; }; /** control data for one index * */ typedef struct { gint type; gint fields; /** number of fields in index */ gint rec_field_index[MAX_INDEX_FIELDS]; /** field numbers for this index */ union { struct __wg_ttree_header t; struct __wg_hashidx_header h; } ctl; /** shared fields for different index types */ gint template_offset; /** matchrec template, 0 if full index */ } wg_index_header; /** index mask meta-info * */ #ifdef USE_INDEX_TEMPLATE typedef struct { gint fixed_columns; /** number of fixed columns in the template */ gint offset_matchrec; /** offset to the record that stores the fields */ gint refcount; /** number of indexes using this template */ } wg_index_template; #endif /** highest level index management data * contains lookup table by field number and memory management data */ typedef struct { gint number_of_indexes; /** unused, reserved */ gint index_list; /** master index list */ gint index_table[MAX_INDEXED_FIELDNR+1]; /** index lookup by column */ #ifdef USE_INDEX_TEMPLATE gint index_template_list; /** sorted list of index masks */ gint index_template_table[MAX_INDEXED_FIELDNR+1]; /** masks indexed by column */ #endif } db_index_area_header; /** Registered external databases * Offsets of data in these databases are recognized properly * by the data store/retrieve/compare functions. */ typedef struct { gint count; /** number of records */ gint offset[MAX_EXTDB]; /** offsets of external databases */ gint size[MAX_EXTDB]; /** corresponding sizes of external databases */ } extdb_area; /** logging management * */ typedef struct { gint active; /** logging mode on/off */ gint dirty; /** log file is clean/dirty */ gint serial; /** incremented when the log file is backed up */ } db_logging_area_header; /** anonconst area header * */ #ifdef USE_REASONER typedef struct _db_anonconst_area_header { gint anonconst_nr; gint anonconst_funs; gint anonconst_table[ANONCONST_TABLE_SIZE]; } db_anonconst_area_header; #endif /** located at the very beginning of the memory segment * */ typedef struct _db_memsegment_header { // core info about segment /****** fixed size part of the header. 
Do not edit this without * also editing the code that checks the header in dbmem.c */ gint32 mark; /** fixed uncommon int to check if really a segment */ gint32 version; /** db engine version to check dump file compatibility */ gint32 features; /** db engine compile-time features */ gint32 checksum; /** dump file checksum */ /* end of fixed size header ******/ gint size; /** segment size in bytes */ gint free; /** pointer to first free area in segment (aligned) */ gint initialadr; /** initial segment address, only valid for creator */ gint key; /** global shared mem key */ // areas db_area_header datarec_area_header; db_area_header longstr_area_header; db_area_header listcell_area_header; db_area_header shortstr_area_header; db_area_header word_area_header; db_area_header doubleword_area_header; // hash structures db_hash_area_header strhash_area_header; // index structures db_index_area_header index_control_area_header; db_area_header tnode_area_header; db_area_header indexhdr_area_header; db_area_header indextmpl_area_header; db_area_header indexhash_area_header; // logging structures db_logging_area_header logging; // anonconst table #ifdef USE_REASONER db_anonconst_area_header anonconst; #endif // statistics // field/table name structures syn_var_area locks; /** currently holds a single global lock */ extdb_area extdbs; /** offset ranges of external databases */ } db_memsegment_header; #ifdef USE_DATABASE_HANDLE /** Database handle in local memory. Contains the pointer to the * shared memory area. */ typedef struct { db_memsegment_header *db; /** shared memory header */ void *logdata; /** log data structure in local memory */ } db_handle; #endif /* --------- anonconsts: special uris with attached funs ----------- */ #ifdef USE_REASONER #define ACONST_FALSE_STR "false" #define ACONST_FALSE encode_anonconst(0) #define ACONST_TRUE_STR "true" #define ACONST_TRUE encode_anonconst(1) #define ACONST_IF_STR "if" #define ACONST_IF encode_anonconst(2) #define ACONST_NOT_STR "not" #define ACONST_NOT encode_anonconst(3) #define ACONST_AND_STR "and" #define ACONST_AND encode_anonconst(4) #define ACONST_OR_STR "or" #define ACONST_OR encode_anonconst(5) #define ACONST_IMPLIES_STR "implies" #define ACONST_IMPLIES encode_anonconst(6) #define ACONST_XOR_STR "xor" #define ACONST_XOR encode_anonconst(7) #define ACONST_LESS_STR "<" #define ACONST_LESS encode_anonconst(8) #define ACONST_EQUAL_STR "=" #define ACONST_EQUAL encode_anonconst(9) #define ACONST_GREATER_STR ">" #define ACONST_GREATER encode_anonconst(10) #define ACONST_LESSOREQUAL_STR "<=" #define ACONST_LESSOREQUAL encode_anonconst(11) #define ACONST_GREATEROREQUAL_STR ">=" #define ACONST_GREATEROREQUAL encode_anonconst(12) #define ACONST_ISZERO_STR "zero" #define ACONST_ISZERO encode_anonconst(13) #define ACONST_ISEMPTYSTR_STR "strempty" #define ACONST_ISEMPTYSTR encode_anonconst(14) #define ACONST_PLUS_STR "+" #define ACONST_PLUS encode_anonconst(15) #define ACONST_MINUS_STR "!-" #define ACONST_MINUS encode_anonconst(16) #define ACONST_MULTIPLY_STR "*" #define ACONST_MULTIPLY encode_anonconst(17) #define ACONST_DIVIDE_STR "/" #define ACONST_DIVIDE encode_anonconst(18) #define ACONST_STRCONTAINS_STR "strcontains" #define ACONST_STRCONTAINS encode_anonconst(19) #define ACONST_STRCONTAINSICASE_STR "strcontainsicase" #define ACONST_STRCONTAINSICASE encode_anonconst(20) #define ACONST_SUBSTR_STR "substr" #define ACONST_SUBSTR encode_anonconst(21) #define ACONST_STRLEN_STR "strlen" #define ACONST_STRLEN encode_anonconst(22) #endif /* ==== Protos ==== 
*/ gint wg_init_db_memsegment(void* db, gint key, gint size); // creates initial memory structures for a new db gint wg_alloc_fixlen_object(void* db, void* area_header); gint wg_alloc_gints(void* db, void* area_header, gint nr); void wg_free_listcell(void* db, gint offset); void wg_free_shortstr(void* db, gint offset); void wg_free_word(void* db, gint offset); void wg_free_doubleword(void* db, gint offset); void wg_free_tnode(void* db, gint offset); void wg_free_fixlen_object(void* db, db_area_header *hdr, gint offset); gint wg_freebuckets_index(void* db, gint size); gint wg_free_object(void* db, void* area_header, gint object) ; #if 0 void *wg_create_child_db(void* db, gint size); #endif gint wg_register_external_db(void *db, void *extdb); gint wg_create_hash(void *db, db_hash_area_header* areah, gint size); gint wg_database_freesize(void *db); gint wg_database_size(void *db); /* ------- testing ------------ */ #endif /* DEFINED_DBALLOC_H */ whitedb-0.7.2/Db/dbapi.h000066400000000000000000000341011226454622500147320ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbapi.h * * Wg database api for public use. * */ #ifndef DEFINED_DBAPI_H #define DEFINED_DBAPI_H /* For gint/wg_int types */ #include #ifdef __cplusplus extern "C" { #endif /* --- built-in data type numbers ----- */ /* the built-in data types are primarily for api purposes. internally, some of these types like int, str etc have several different ways to encode along with different bit masks */ #define WG_NULLTYPE 1 #define WG_RECORDTYPE 2 #define WG_INTTYPE 3 #define WG_DOUBLETYPE 4 #define WG_STRTYPE 5 #define WG_XMLLITERALTYPE 6 #define WG_URITYPE 7 #define WG_BLOBTYPE 8 #define WG_CHARTYPE 9 #define WG_FIXPOINTTYPE 10 #define WG_DATETYPE 11 #define WG_TIMETYPE 12 #define WG_ANONCONSTTYPE 13 #define WG_VARTYPE 14 /* Illegal encoded data indicator */ #define WG_ILLEGAL 0xff /* Query "arglist" parameters */ #define WG_COND_EQUAL 0x0001 /** = */ #define WG_COND_NOT_EQUAL 0x0002 /** != */ #define WG_COND_LESSTHAN 0x0004 /** < */ #define WG_COND_GREATER 0x0008 /** > */ #define WG_COND_LTEQUAL 0x0010 /** <= */ #define WG_COND_GTEQUAL 0x0020 /** >= */ /* Query types. Python extension module uses the API and needs these. 
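
   (A minimal usage sketch of the query interface declared further below --
   illustrative only: the database name and size are arbitrary and error
   handling is omitted. It fetches every record whose column 0 equals 42:

     #include "dbapi.h"

     void example_scan(void) {
       void *db = wg_attach_database("1000", 2000000);   // name/size arbitrary here
       wg_query_arg arg;
       wg_query *q;
       void *rec;

       arg.column = 0;
       arg.cond = WG_COND_EQUAL;
       arg.value = wg_encode_query_param_int(db, 42);
       q = wg_make_query(db, NULL, 0, &arg, 1);          // no matchrec, one condition
       while((rec = wg_fetch(db, q)))
         wg_print_record(db, (wg_int *) rec);
       wg_free_query(db, q);
       wg_free_query_param(db, arg.value);
       wg_detach_database(db);
     }
   )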
*/ #define WG_QTYPE_TTREE 0x01 #define WG_QTYPE_HASH 0x02 #define WG_QTYPE_SCAN 0x04 #define WG_QTYPE_PREFETCH 0x80 /* Direct access to field */ #define RECORD_HEADER_GINTS 3 #define wg_field_addr(db,record,fieldnr) (((wg_int*)(record))+RECORD_HEADER_GINTS+(fieldnr)) /* WhiteDB data types */ typedef ptrdiff_t wg_int; typedef size_t wg_uint; /** Query argument list object */ typedef struct { wg_int column; /** column (field) number this argument applies to */ wg_int cond; /** condition (equal, less than, etc) */ wg_int value; /** encoded value */ } wg_query_arg; /** Query object */ typedef struct { wg_int qtype; /** Query type (T-tree, hash, full scan, prefetch) */ /* Argument list based query is the only one supported at the moment. */ wg_query_arg *arglist; /** check each row in result set against these */ wg_int argc; /** number of elements in arglist */ wg_int column; /** index on this column used */ /* Fields for T-tree query (XXX: some may be re-usable for * other types as well) */ wg_int curr_offset; wg_int end_offset; wg_int curr_slot; wg_int end_slot; wg_int direction; /* Fields for full scan */ wg_int curr_record; /** offset of the current record */ /* Fields for prefetch; with/without mpool */ void *mpool; /** storage for row offsets */ void *curr_page; /** current page of results */ wg_int curr_pidx; /** current index on page */ wg_uint res_count; /** number of rows in results */ } wg_query; /* prototypes of wg database api functions */ /* ------- attaching and detaching a database ----- */ void* wg_attach_database(char* dbasename, wg_int size); // returns a pointer to the database, NULL if failure void* wg_attach_existing_database(char* dbasename); // like wg_attach_database, but does not create a new base void* wg_attach_logged_database(char* dbasename, wg_int size); // like wg_attach_database, but activates journal logging on creation int wg_detach_database(void* dbase); // detaches a database: returns 0 if OK int wg_delete_database(char* dbasename); // deletes a database: returns 0 if OK /* ------- attaching and detaching a local db ----- */ void* wg_attach_local_database(wg_int size); void wg_delete_local_database(void* dbase); /* ------- functions to query database state ------ */ wg_int wg_database_freesize(void *db); wg_int wg_database_size(void *db); /* -------- creating and scanning records --------- */ void* wg_create_record(void* db, wg_int length); ///< returns NULL when error, ptr to rec otherwise void* wg_create_raw_record(void* db, wg_int length); ///< returns NULL when error, ptr to rec otherwise wg_int wg_delete_record(void* db, void *rec); ///< returns 0 on success, non-0 on error void* wg_get_first_record(void* db); ///< returns NULL when error or no recs void* wg_get_next_record(void* db, void* record); ///< returns NULL when error or no more recs /* -------- setting and fetching record field values --------- */ wg_int wg_get_record_len(void* db, void* record); ///< returns negative int when error wg_int* wg_get_record_dataarray(void* db, void* record); ///< pointer to record data array start // following field setting functions return negative int when err, 0 when ok wg_int wg_set_field(void* db, void* record, wg_int fieldnr, wg_int data); wg_int wg_set_new_field(void* db, void* record, wg_int fieldnr, wg_int data); wg_int wg_set_int_field(void* db, void* record, wg_int fieldnr, wg_int data); wg_int wg_set_double_field(void* db, void* record, wg_int fieldnr, double data); wg_int wg_set_str_field(void* db, void* record, wg_int fieldnr, char* data); wg_int 
wg_update_atomic_field(void* db, void* record, wg_int fieldnr, wg_int data, wg_int old_data); wg_int wg_set_atomic_field(void* db, void* record, wg_int fieldnr, wg_int data); wg_int wg_add_int_atomic_field(void* db, void* record, wg_int fieldnr, int data); wg_int wg_get_field(void* db, void* record, wg_int fieldnr); // returns 0 when error wg_int wg_get_field_type(void* db, void* record, wg_int fieldnr); // returns 0 when error /* ---------- general operations on encoded data -------- */ wg_int wg_get_encoded_type(void* db, wg_int data); wg_int wg_free_encoded(void* db, wg_int data); /* -------- encoding and decoding data: records contain encoded data only ---------- */ wg_int wg_encode_null(void* db, wg_int data); wg_int wg_decode_null(void* db, wg_int data); // int wg_int wg_encode_int(void* db, wg_int data); wg_int wg_decode_int(void* db, wg_int data); // double wg_int wg_encode_double(void* db, double data); double wg_decode_double(void* db, wg_int data); // fixpoint wg_int wg_encode_fixpoint(void* db, double data); double wg_decode_fixpoint(void* db, wg_int data); // date and time wg_int wg_encode_date(void* db, int data); int wg_decode_date(void* db, wg_int data); wg_int wg_encode_time(void* db, int data); int wg_decode_time(void* db, wg_int data); int wg_current_utcdate(void* db); int wg_current_localdate(void* db); int wg_current_utctime(void* db); int wg_current_localtime(void* db); int wg_strf_iso_datetime(void* db, int date, int time, char* buf); int wg_strp_iso_date(void* db, char* buf); int wg_strp_iso_time(void* db, char* inbuf); int wg_ymd_to_date(void* db, int yr, int mo, int day); int wg_hms_to_time(void* db, int hr, int min, int sec, int prt); void wg_date_to_ymd(void* db, int date, int *yr, int *mo, int *day); void wg_time_to_hms(void* db, int time, int *hr, int *min, int *sec, int *prt); // str (standard C string: zero-terminated array of chars) // along with optional attached language indicator str wg_int wg_encode_str(void* db, char* str, char* lang); ///< let lang==NULL if not used char* wg_decode_str(void* db, wg_int data); char* wg_decode_str_lang(void* db, wg_int data); wg_int wg_decode_str_len(void* db, wg_int data); wg_int wg_decode_str_lang_len(void* db, wg_int data); wg_int wg_decode_str_copy(void* db, wg_int data, char* strbuf, wg_int buflen); wg_int wg_decode_str_lang_copy(void* db, wg_int data, char* langbuf, wg_int buflen); // xmlliteral (standard C string: zero-terminated array of chars) // along with obligatory attached xsd:type str wg_int wg_encode_xmlliteral(void* db, char* str, char* xsdtype); char* wg_decode_xmlliteral(void* db, wg_int data); char* wg_decode_xmlliteral_xsdtype(void* db, wg_int data); wg_int wg_decode_xmlliteral_len(void* db, wg_int data); wg_int wg_decode_xmlliteral_xsdtype_len(void* db, wg_int data); wg_int wg_decode_xmlliteral_copy(void* db, wg_int data, char* strbuf, wg_int buflen); wg_int wg_decode_xmlliteral_xsdtype_copy(void* db, wg_int data, char* strbuf, wg_int buflen); // uri (standard C string: zero-terminated array of chars) // along with an optional namespace str wg_int wg_encode_uri(void* db, char* str, char* nspace); ///< let nspace==NULL if not used char* wg_decode_uri(void* db, wg_int data); char* wg_decode_uri_prefix(void* db, wg_int data); wg_int wg_decode_uri_len(void* db, wg_int data); wg_int wg_decode_uri_prefix_len(void* db, wg_int data); wg_int wg_decode_uri_copy(void* db, wg_int data, char* strbuf, wg_int buflen); wg_int wg_decode_uri_prefix_copy(void* db, wg_int data, char* strbuf, wg_int buflen); // blob 
(binary large object, i.e. any kind of data) // along with an obligatory length in bytes wg_int wg_encode_blob(void* db, char* str, char* type, wg_int len); char* wg_decode_blob(void* db, wg_int data); char* wg_decode_blob_type(void* db, wg_int data); wg_int wg_decode_blob_len(void* db, wg_int data); wg_int wg_decode_blob_copy(void* db, wg_int data, char* strbuf, wg_int buflen); wg_int wg_decode_blob_type_len(void* db, wg_int data); wg_int wg_decode_blob_type_copy(void* db, wg_int data, char* langbuf, wg_int buflen); /// ptr to record wg_int wg_encode_record(void* db, void* data); void* wg_decode_record(void* db, wg_int data); /// char wg_int wg_encode_char(void* db, char data); char wg_decode_char(void* db, wg_int data); // anonconst wg_int wg_encode_anonconst(void* db, char* str); char* wg_decode_anonconst(void* db, wg_int data); // var wg_int wg_encode_var(void* db, wg_int varnr); wg_int wg_decode_var(void* db, wg_int data); /* --- dumping and restoring -------- */ wg_int wg_dump(void * db,char* fileName); // dump shared memory database to the disk wg_int wg_import_dump(void * db,char* fileName); // import database from the disk wg_int wg_start_logging(void *db); /* activate journal logging globally */ wg_int wg_stop_logging(void *db); /* deactivate journal logging */ wg_int wg_replay_log(void *db, char *filename); /* restore from journal */ /* ---------- concurrency support ---------- */ wg_int wg_start_write(void * dbase); /* start write transaction */ wg_int wg_end_write(void * dbase, wg_int lock); /* end write transaction */ wg_int wg_start_read(void * dbase); /* start read transaction */ wg_int wg_end_read(void * dbase, wg_int lock); /* end read transaction */ /* ------------- utilities ----------------- */ void wg_print_db(void *db); void wg_print_record(void *db, wg_int* rec); void wg_snprint_value(void *db, wg_int enc, char *buf, int buflen); wg_int wg_parse_and_encode(void *db, char *buf); wg_int wg_parse_and_encode_param(void *db, char *buf); void wg_export_db_csv(void *db, char *filename); wg_int wg_import_db_csv(void *db, char *filename); /* ---------- query functions -------------- */ wg_query *wg_make_query(void *db, void *matchrec, wg_int reclen, wg_query_arg *arglist, wg_int argc); #define wg_make_prefetch_query wg_make_query wg_query *wg_make_query_rc(void *db, void *matchrec, wg_int reclen, wg_query_arg *arglist, wg_int argc, wg_uint rowlimit); void *wg_fetch(void *db, wg_query *query); void wg_free_query(void *db, wg_query *query); wg_int wg_encode_query_param_null(void *db, char *data); wg_int wg_encode_query_param_record(void *db, void *data); wg_int wg_encode_query_param_char(void *db, char data); wg_int wg_encode_query_param_fixpoint(void *db, double data); wg_int wg_encode_query_param_date(void *db, int data); wg_int wg_encode_query_param_time(void *db, int data); wg_int wg_encode_query_param_var(void *db, wg_int data); wg_int wg_encode_query_param_int(void *db, wg_int data); wg_int wg_encode_query_param_double(void *db, double data); wg_int wg_encode_query_param_str(void *db, char *data, char *lang); wg_int wg_encode_query_param_xmlliteral(void *db, char *data, char *xsdtype); wg_int wg_encode_query_param_uri(void *db, char *data, char *prefix); wg_int wg_free_query_param(void* db, wg_int data); void *wg_find_record(void *db, wg_int fieldnr, wg_int cond, wg_int data, void* lastrecord); void *wg_find_record_null(void *db, wg_int fieldnr, wg_int cond, char *data, void* lastrecord); void *wg_find_record_record(void *db, wg_int fieldnr, wg_int cond, void *data, 
    void* lastrecord);
void *wg_find_record_char(void *db, wg_int fieldnr, wg_int cond, char data,
    void* lastrecord);
void *wg_find_record_fixpoint(void *db, wg_int fieldnr, wg_int cond, double data,
    void* lastrecord);
void *wg_find_record_date(void *db, wg_int fieldnr, wg_int cond, int data,
    void* lastrecord);
void *wg_find_record_time(void *db, wg_int fieldnr, wg_int cond, int data,
    void* lastrecord);
void *wg_find_record_var(void *db, wg_int fieldnr, wg_int cond, wg_int data,
    void* lastrecord);
void *wg_find_record_int(void *db, wg_int fieldnr, wg_int cond, int data,
    void* lastrecord);
void *wg_find_record_double(void *db, wg_int fieldnr, wg_int cond, double data,
    void* lastrecord);
void *wg_find_record_str(void *db, wg_int fieldnr, wg_int cond, char *data,
    void* lastrecord);
void *wg_find_record_xmlliteral(void *db, wg_int fieldnr, wg_int cond, char *data,
    char *xsdtype, void* lastrecord);
void *wg_find_record_uri(void *db, wg_int fieldnr, wg_int cond, char *data,
    char *prefix, void* lastrecord);

/* ---------- child database handling ------ */

wg_int wg_register_external_db(void *db, void *extdb);
wg_int wg_encode_external_data(void *db, void *extdb, wg_int encoded);

/* ---------- JSON document I/O ------------ */

wg_int wg_parse_json_file(void *db, char *filename);
wg_int wg_parse_json_document(void *db, char *buf);

#ifdef __cplusplus
}
#endif

#endif /* DEFINED_DBAPI_H */
whitedb-0.7.2/Db/dbcompare.c000066400000000000000000000211711226454622500156050ustar00rootroot00000000000000/*
 * $Id: $
 * $Version: $
 *
 * Copyright (c) Priit Järv 2010
 *
 * This file is part of WhiteDB
 *
 * WhiteDB is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * WhiteDB is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with WhiteDB. If not, see <http://www.gnu.org/licenses/>.
 *
 */

/** @file dbcompare.c
 * Data comparison functions.
 */

/* ====== Includes =============== */

#include <string.h>

#ifdef __cplusplus
extern "C" {
#endif

#include "dbdata.h"

/* ====== Private headers and defs ======== */

#include "dbcompare.h"

/* ====== Functions ============== */

/** Compare two encoded values
 * a, b - encoded values
 * returns WG_GREATER, WG_EQUAL or WG_LESSTHAN
 * assumes that a and b themselves are not equal and so
 * their decoded values need to be examined (which could still
 * be equal for some data types).
 * depth - recursion depth for records
 */
gint wg_compare(void *db, gint a, gint b, int depth) {
/* a very simplistic version of the function:
 * - we get the types of the variables
 * - if the types match, compare the decoded values
 * - otherwise compare the type codes (not really scientific,
 * but will provide a means of ordering values).
 *
 * One important point that should be observed here is
 * that the returned values should be consistent when
 * comparing A to B and then B to A. This applies to cases
 * where we have no reason to think one is greater than
 * the other from the *user's* point of view, but for use
 * in T-tree index and similar, values need to be consistently
 * ordered. Examples include unknown types and record pointers
 * (once recursion depth runs out).
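 *
 * Illustrative caller-side sketch (an editorial example, not part of the
 * original source; assumes db is a valid database handle, the values are
 * made up):
 *
 *   gint a = wg_encode_int(db, 2);
 *   gint b = wg_encode_int(db, 10);
 *   // WG_COMPARE() from dbcompare.h short-circuits the a==b case and
 *   // otherwise calls wg_compare() with WG_COMPARE_REC_DEPTH:
 *   if(WG_COMPARE(db, a, b) == WG_LESSTHAN) {
 *     // 2 orders before 10
 *   }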
 */
  /* XXX: might be able to save time here to mask and compare
   * the type bits instead */
  gint typea = wg_get_encoded_type(db, a);
  gint typeb = wg_get_encoded_type(db, b);

  /* assume types are >2 (NULLs are always equal) and
   * <13 (not implemented as of now)
   * XXX: all of this will fall apart if type codes
   * are somehow rearranged :-) */
  if(typeb==typea) {
    if(typea>WG_CHARTYPE) { /* > 9, not a string */
      if(typea>WG_FIXPOINTTYPE) {
        /* date or time. Compare decoded gints */
        gint deca, decb;
        if(typea==WG_DATETYPE) {
          deca = wg_decode_date(db, a);
          decb = wg_decode_date(db, b);
        } else if(typea==WG_TIMETYPE) {
          deca = wg_decode_time(db, a);
          decb = wg_decode_time(db, b);
        } else if(typea==WG_VARTYPE) {
          deca = wg_decode_var(db, a);
          decb = wg_decode_var(db, b);
        } else {
          /* anon const or other new type, no idea how to compare */
          return (a>b ? WG_GREATER : WG_LESSTHAN);
        }
        return (deca>decb ? WG_GREATER : WG_LESSTHAN);
      } else {
        /* fixpoint, need to compare doubles */
        double deca, decb;
        deca = wg_decode_fixpoint(db, a);
        decb = wg_decode_fixpoint(db, b);
        return (deca>decb ? WG_GREATER : WG_LESSTHAN);
      }
    }
    else if(typea<WG_STRTYPE) { /* < 5: record, int or double */
      if(typea==WG_RECORDTYPE) {
        void *deca, *decb;
        deca = wg_decode_record(db, a);
        decb = wg_decode_record(db, b);
        if(!depth) {
          /* Recursion depth exhausted: compare the pointers, which
           * at least gives a consistent ordering. */
          return ((gint) deca > (gint) decb ? WG_GREATER : WG_LESSTHAN);
        } else {
          int i;
#ifdef USE_CHILD_DB
          void *parenta, *parentb;
#endif
          int lena = wg_get_record_len(db, deca);
          int lenb = wg_get_record_len(db, decb);

#ifdef USE_CHILD_DB
          /* Determine, if the records are inside the memory area belonging
           * to our current base address. If it is outside, the encoded
           * values inside the record contain offsets in relation to
           * a different base address and need to be translated. */
          parenta = wg_get_rec_owner(db, deca);
          parentb = wg_get_rec_owner(db, decb);
#endif

          /* XXX: Currently we're considering records of differing lengths
           * non-equal without comparing the elements */
          if(lena!=lenb)
            return (lena>lenb ? WG_GREATER : WG_LESSTHAN);

          /* Recursively check each element in the record. If they
           * are not equal, break and return with the obtained value */
          for(i=0; i<lena; i++) {
            gint elema = wg_get_field(db, deca, i);
            gint elemb = wg_get_field(db, decb, i);
#ifdef USE_CHILD_DB
            /* Translate externally owned values so that they are
             * comparable to data encoded against our base address. */
            if(parenta != dbmemseg(db))
              elema = wg_encode_external_data(db, parenta, elema);
            if(parentb != dbmemseg(db))
              elemb = wg_encode_external_data(db, parentb, elemb);
#endif
            if(elema != elemb) {
              gint cr = wg_compare(db, elema, elemb, depth-1);
              if(cr != WG_EQUAL)
                return cr;
            }
          }
          return WG_EQUAL; /* all the elements were equal */
        }
      } else if(typea==WG_INTTYPE) {
        gint deca, decb;
        deca = wg_decode_int(db, a);
        decb = wg_decode_int(db, b);
        if(deca==decb) return WG_EQUAL; /* full ints can decode to equal values */
        return (deca>decb ? WG_GREATER : WG_LESSTHAN);
      } else {
        /* WG_DOUBLETYPE */
        double deca, decb;
        deca = wg_decode_double(db, a);
        decb = wg_decode_double(db, b);
        if(deca==decb) return WG_EQUAL; /* decoded doubles can be equal */
        return (deca>decb ? WG_GREATER : WG_LESSTHAN);
      }
    } else { /* string */
      /* Need to compare the characters. In case of 0-terminated
       * strings we use strcmp() directly, which in glibc is heavily
       * optimised. In case of blob type we need to query the length
       * and use memcmp(). */
      char *deca, *decb, *exa=NULL, *exb=NULL;
      char buf[4];
      gint res;
      if(typea==WG_STRTYPE) { /* lang is ignored */
        deca = wg_decode_str(db, a);
        decb = wg_decode_str(db, b);
      } else if(typea==WG_URITYPE) {
        exa = wg_decode_uri_prefix(db, a);
        exb = wg_decode_uri_prefix(db, b);
        deca = wg_decode_uri(db, a);
        decb = wg_decode_uri(db, b);
      } else if(typea==WG_XMLLITERALTYPE) {
        exa = wg_decode_xmlliteral_xsdtype(db, a);
        exb = wg_decode_xmlliteral_xsdtype(db, b);
        deca = wg_decode_xmlliteral(db, a);
        decb = wg_decode_xmlliteral(db, b);
      } else if(typea==WG_CHARTYPE) {
        buf[0] = wg_decode_char(db, a);
        buf[1] = '\0';
        buf[2] = wg_decode_char(db, b);
        buf[3] = '\0';
        deca = buf;
        decb = &buf[2];
      } else { /* WG_BLOBTYPE */
        deca = wg_decode_blob(db, a);
        decb = wg_decode_blob(db, b);
      }

      if(exa || exb) {
        /* String type where extra information is significant
         * (we're ignoring this for plain strings and blobs).
         * If extra part is equal, normal comparison continues. If
         * one string is missing altogether, it is considered to be
         * smaller than the other string.
*/ if(!exb) { if(exa[0]) return WG_GREATER; } else if(!exa) { if(exb[0]) return WG_LESSTHAN; } else { res = strcmp(exa, exb); if(res > 0) return WG_GREATER; else if(res < 0) return WG_LESSTHAN; } } #if 0 /* paranoia check */ if(!deca || !decb) { if(decb) if(decb[0]) return WG_LESSTHAN; } else if(deca) { if(deca[0]) return WG_GREATER; } return WG_EQUAL; } #endif if(typea==WG_BLOBTYPE) { /* Blobs are not 0-terminated */ int lena = wg_decode_blob_len(db, a); int lenb = wg_decode_blob_len(db, b); res = memcmp(deca, decb, (lena < lenb ? lena : lenb)); if(!res) res = lena - lenb; } else { res = strcmp(deca, decb); } if(res > 0) return WG_GREATER; else if(res < 0) return WG_LESSTHAN; else return WG_EQUAL; } } else return (typea>typeb ? WG_GREATER : WG_LESSTHAN); } #ifdef __cplusplus } #endif whitedb-0.7.2/Db/dbcompare.h000066400000000000000000000034111226454622500156070ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Priit Järv 2010 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbcompare.h * Public headers for data comparison functions. */ #ifndef DEFINED_DBCOMPARE_H #define DEFINED_DBCOMPARE_H #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif /* For gint data type */ #include "dbdata.h" /* ==== Public macros ==== */ #define WG_EQUAL 0 #define WG_GREATER 1 #define WG_LESSTHAN -1 /* If backlinking is enabled, records can be compared by their * contents instead of just pointers. With no backlinking this * is disabled so that records' comparative values do not change * when updating their contents. */ #ifdef USE_BACKLINKING #define WG_COMPARE_REC_DEPTH 7 /** recursion depth for record comparison */ #else #define WG_COMPARE_REC_DEPTH 0 #endif /* wrapper macro for wg_compare(), if encoded values are * equal they will also decode to an equal value and so * we can avoid calling the function. */ #define WG_COMPARE(d,a,b) (a==b ? WG_EQUAL :\ wg_compare(d,a,b,WG_COMPARE_REC_DEPTH)) /* ==== Protos ==== */ gint wg_compare(void *db, gint a, gint b, int depth); #endif /* DEFINED_DBCOMPARE_H */ whitedb-0.7.2/Db/dbdata.c000066400000000000000000002530421226454622500150740ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . 
 *
 */

/** @file dbdata.c
 *  Procedures for handling actual data: strings, integers, records, etc
 *
 */

/* ====== Includes =============== */

#ifdef _WIN32
#define WIN32_LEAN_AND_MEAN /* For Sleep() */
#include <windows.h>
#endif

#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <string.h>
#include <sys/timeb.h>
#include <time.h>
//#include <math.h>

#ifdef __cplusplus
extern "C" {
#endif

#ifdef _WIN32
#include "../config-w32.h"
#else
#include "../config.h"
#endif
#include "dballoc.h"
#include "dbdata.h"
#include "dbhash.h"
#include "dblog.h"
#include "dbindex.h"
#include "dbcompare.h"
#include "dblock.h"

/* ====== Private headers and defs ======== */

#ifdef _WIN32
//Thread-safe localtime_r appears not to be present on windows: emulate using win localtime, which is thread-safe.
static struct tm * localtime_r (const time_t *timer, struct tm *result);
#define sscanf sscanf_s  // warning: needs extra buflen args for string etc params
#define snprintf sprintf_s
#endif

/* ======= Private protos ================ */

#ifdef USE_BACKLINKING
static gint remove_backlink_index_entries(void *db, gint *record, gint value, gint depth);
static gint restore_backlink_index_entries(void *db, gint *record, gint value, gint depth);
#endif

static int isleap(unsigned yr);
static unsigned months_to_days (unsigned month);
static long years_to_days (unsigned yr);
static long ymd_to_scalar (unsigned yr, unsigned mo, unsigned day);
static void scalar_to_ymd (long scalar, unsigned *yr, unsigned *mo, unsigned *day);
static gint free_field_encoffset(void* db,gint encoffset);
static gint find_create_longstr(void* db, char* data, char* extrastr, gint type, gint length);
#ifdef USE_CHILD_DB
static void *get_ptr_owner(void *db, gint encoded);
static int is_local_offset(void *db, gint offset);
#endif
static gint show_data_error(void* db, char* errmsg);
static gint show_data_error_nr(void* db, char* errmsg, gint nr);
static gint show_data_error_double(void* db, char* errmsg, double nr);
static gint show_data_error_str(void* db, char* errmsg, char* str);

/* ====== Functions ============== */

/* ------------ full record handling ---------------- */

void* wg_create_record(void* db, wg_int length) {
  void *rec = wg_create_raw_record(db, length);
  /* Index all the created NULL fields to ensure index consistency */
  if(rec) {
    if(wg_index_add_rec(db, rec) < -1)
      return NULL; /* index error */
  }
  return rec;
}

/*
 * Creates the record and initializes the fields
 * to NULL, but does not update indexes. This is useful in two
 * scenarios: 1. fields are immediately initialized to something
 * else, making indexing NULLs useless 2. record will have
 * a RECORD_META_NOTDATA bit set, so the fields should not
 * be indexed at all.
 *
 * In the first case, it is required that wg_set_new_field()
 * is called on all the fields in the record. In the second case,
 * the caller is responsible for setting the meta bits, however
 * it is not mandatory to re-initialize all the fields.
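 *
 * A minimal editorial sketch of the first scenario (not part of the
 * original source; assumes db is a valid database handle, field numbers
 * are made up):
 *
 *   void *rec = wg_create_raw_record(db, 2);
 *   if(rec) {
 *     // every field must get a value via wg_set_new_field()
 *     wg_set_new_field(db, rec, 0, wg_encode_int(db, 42));
 *     wg_set_new_field(db, rec, 1, 0);   // 0 is the encoded NULL
 *   }
 *
 * When the fields cannot all be filled in right away, wg_create_record()
 * above is the safe choice, since it also indexes the NULL fields.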
 */
void* wg_create_raw_record(void* db, wg_int length) {
  gint offset;
  gint i;

#ifdef CHECK
  if (!dbcheck(db)) {
    show_data_error_nr(db,"wrong database pointer given to wg_create_record with length ",length);
    return 0;
  }
  if(length < 0) {
    show_data_error_nr(db, "invalid record length:",length);
    return 0;
  }
#endif

#ifdef USE_DBLOG
  /* Log first, modify shared memory next */
  if(dbmemsegh(db)->logging.active) {
    if(wg_log_create_record(db, length))
      return 0;
  }
#endif

  offset=wg_alloc_gints(db,
                     &(dbmemsegh(db)->datarec_area_header),
                     length+RECORD_HEADER_GINTS);
  if (!offset) {
    show_data_error_nr(db,"cannot create a record of size ",length);
#ifdef USE_DBLOG
    if(dbmemsegh(db)->logging.active) {
      wg_log_encval(db, 0);
    }
#endif
    return 0;
  }

  /* Init header */
  dbstore(db, offset+RECORD_META_POS*sizeof(gint), 0);
  dbstore(db, offset+RECORD_BACKLINKS_POS*sizeof(gint), 0);
  for(i=RECORD_HEADER_GINTS;i<length+RECORD_HEADER_GINTS;i++) {
    dbstore(db, offset+(i*(sizeof(gint))), 0);
  }

#ifdef USE_DBLOG
  if(dbmemsegh(db)->logging.active) {
    if(wg_log_encval(db, offset))
      return 0; /* journal error */
  }
#endif

  return offsettoptr(db,offset);
}

/** Delete record from database
 * returns 0 on success
 * returns -1 if the record is referenced by others and cannot be deleted.
 * returns -2 on general error
 * returns -3 on fatal error
 *
 * XXX: when USE_BACKLINKING is off, this function should be used
 * with extreme care.
 */
gint wg_delete_record(void* db, void *rec) {
  gint offset;
  gint* dptr;
  gint* dendptr;
  gint data;

#ifdef CHECK
  if (!dbcheck(db)) {
    show_data_error(db, "wrong database pointer given to wg_delete_record");
    return -2;
  }
#endif

#ifdef USE_BACKLINKING
  if(*((gint *) rec + RECORD_BACKLINKS_POS))
    return -1;
#endif

#ifdef USE_DBLOG
  /* Log first, modify shared memory next */
  if(dbmemsegh(db)->logging.active) {
    if(wg_log_delete_record(db, ptrtooffset(db, rec)))
      return -3;
  }
#endif

  /* Remove data from index */
  if(!is_special_record(rec)) {
    if(wg_index_del_rec(db, rec) < -1)
      return -3; /* index error */
  }

  offset = ptrtooffset(db, rec);
#if defined(CHECK) && defined(USE_CHILD_DB)
  /* Check if it's a local record */
  if(!is_local_offset(db, offset)) {
    show_data_error(db, "not deleting an external record");
    return -2;
  }
#endif

  /* Loop over fields, freeing them */
  dendptr = (gint *) (((char *) rec) + datarec_size_bytes(*((gint *)rec)));
  for(dptr=(gint *)rec+RECORD_HEADER_GINTS; dptr<dendptr; dptr++) {
    data = *dptr;

#ifdef USE_BACKLINKING
    /* If the field references another record, remove the backlink
     * in that record which points back to the record being deleted. */
    if(wg_get_encoded_type(db, data) == WG_RECORDTYPE) {
      gint *child = (gint *) wg_decode_record(db, data);
      gint *next_offset = child + RECORD_BACKLINKS_POS;
      gcell *old = NULL;

      while(*next_offset) {
        old = (gcell *) offsettoptr(db, *next_offset);
        if(old->car == offset) {
          gint old_offset = *next_offset;
          *next_offset = old->cdr; /* remove from list chain */
          wg_free_listcell(db, old_offset); /* free storage */
          goto recdel_backlink_removed;
        }
        next_offset = &(old->cdr);
      }
      show_data_error(db, "Corrupt backlink chain");
      return -3; /* backlink error */
    }
recdel_backlink_removed:
#endif

    if(isptr(data)) free_field_encoffset(db,data);
  }

  /* Free the record storage */
  wg_free_object(db, &(dbmemsegh(db)->datarec_area_header), offset);

  return 0;
}

/** Get the first data record from the database
 *  Uses header meta bits to filter out special records
 *  (rules, system records etc)
 */
void* wg_get_first_record(void* db) {
  void *res = wg_get_first_raw_record(db);
  if(res && is_special_record(res))
    return wg_get_next_record(db, res); /* find first data record */
  return res;
}

/** Get the next data record from the database
 *  Uses header meta bits to filter out special records
 */
void* wg_get_next_record(void* db, void* record) {
  void *res = record;
  do {
    res = wg_get_next_raw_record(db, res);
  } while(res && is_special_record(res));
  return res;
}
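/* Editorial usage sketch (not part of the original source): iterating
 * over all data records; assumes db is a valid database handle.
 *
 *   void *rec = wg_get_first_record(db);
 *   while(rec) {
 *     void *next = wg_get_next_record(db, rec);  // fetch next before deleting
 *     if(wg_get_record_len(db, rec) == 0)
 *       wg_delete_record(db, rec);   // returns -1 if still referenced
 *     rec = next;
 *   }
 */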
show_data_error(db,"wrong database pointer given to wg_get_first_record"); return NULL; } #endif arrayadr=&((dbmemsegh(db)->datarec_area_header).subarea_array[0]); firstoffset=((arrayadr[0]).alignedoffset); // do NOT skip initial "used" marker //printf("arrayadr %x firstoffset %d \n",(uint)arrayadr,firstoffset); res=wg_get_next_raw_record(db,offsettoptr(db,firstoffset)); return res; } /** Get the next record from the database * */ void* wg_get_next_raw_record(void* db, void* record) { gint curoffset; gint head; db_subarea_header* arrayadr; gint last_subarea_index; gint i; gint found; gint subareastart; gint subareaend; gint freemarker; curoffset=ptrtooffset(db,record); //printf("curroffset %d record %x\n",curoffset,(uint)record); #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_get_first_record"); return NULL; } head=dbfetch(db,curoffset); if (isfreeobject(head)) { show_data_error(db,"wrong record pointer (free) given to wg_get_next_record"); return NULL; } #endif freemarker=0; //assume input pointer to used object head=dbfetch(db,curoffset); while(1) { // increase offset to next memory block curoffset=curoffset+(freemarker ? getfreeobjectsize(head) : getusedobjectsize(head)); head=dbfetch(db,curoffset); //printf("new curoffset %d head %d isnormaluseobject %d isfreeobject %d \n", // curoffset,head,isnormalusedobject(head),isfreeobject(head)); // check if found a normal used object if (isnormalusedobject(head)) return offsettoptr(db,curoffset); //return ptr to normal used object if (isfreeobject(head)) { freemarker=1; // loop start leads us to next object } else { // found a special object (dv or end marker) freemarker=0; if (dbfetch(db,curoffset+sizeof(gint))==SPECIALGINT1DV) { // we have reached a dv object continue; // loop start leads us to next object } else { // we have reached an end marker, have to find the next subarea // first locate subarea for this offset arrayadr=&((dbmemsegh(db)->datarec_area_header).subarea_array[0]); last_subarea_index=(dbmemsegh(db)->datarec_area_header).last_subarea_index; found=0; for(i=0;(i<=last_subarea_index)&&(i=subareastart && curoffsetlast_subarea_index || i>=SUBAREA_ARRAY_SIZE) { //printf("next used object not found: i %d curoffset %d \n",i,curoffset); return NULL; } //printf("taking next subarea i %d\n",i); curoffset=((arrayadr[i]).alignedoffset); // curoffset is now the special start marker head=dbfetch(db,curoffset); // loop start will lead us to next object from special marker } } } } /* ------------ backlink chain recursive functions ------------------- */ #ifdef USE_BACKLINKING /** Remove index entries in backlink chain recursively. * Needed for index maintenance when records are compared by their * contens, as change in contents also changes the value of the entire * record and thus affects it's placement in the index. * Returns 0 for success * Returns -1 in case of errors. */ static gint remove_backlink_index_entries(void *db, gint *record, gint value, gint depth) { gint col, length, err = 0; db_memsegment_header *dbh = dbmemsegh(db); if(!is_special_record(record)) { /* Find all fields in the record that match value (which is actually * a reference to a child record in encoded form) and remove it from * indexes. It will be recreated in the indexes by wg_set_field() later. 
     */
    length = getusedobjectwantedgintsnr(*record) - RECORD_HEADER_GINTS;
    if(length > MAX_INDEXED_FIELDNR)
      length = MAX_INDEXED_FIELDNR + 1;

    for(col=0; col<length; col++) {
      if(*(record + RECORD_HEADER_GINTS + col) == value) {
        if(dbh->index_control_area_header.index_table[col]) {
          if(wg_index_del_field(db, record, col) < -1)
            return -1;
        }
      }
    }
  }

  /* If recursive depth is not exhausted, continue with the parents
   * of this record. */
  if(depth > 0) {
    gint backlink_list = *(record + RECORD_BACKLINKS_POS);
    if(backlink_list) {
      gcell *next = (gcell *) offsettoptr(db, backlink_list);
      for(;;) {
        err = remove_backlink_index_entries(db,
          (gint *) offsettoptr(db, next->car),
          wg_encode_record(db, record), depth-1);
        if(err)
          return err;
        if(!next->cdr)
          break;
        next = (gcell *) offsettoptr(db, next->cdr);
      }
    }
  }

  return 0;
}

/** Add index entries in backlink chain recursively.
 *  Called after doing remove_backlink_index_entries() and updating
 *  data in the record that originated the call. This recreates the
 *  entries in the indexes for all the records that were affected.
 *  Returns 0 for success
 *  Returns -1 in case of errors.
 */
static gint restore_backlink_index_entries(void *db, gint *record,
  gint value, gint depth) {
  gint col, length, err = 0;
  db_memsegment_header *dbh = dbmemsegh(db);

  if(!is_special_record(record)) {
    /* Find all fields in the record that match value (which is actually
     * a reference to a child record in encoded form) and add it back to
     * indexes. */
    length = getusedobjectwantedgintsnr(*record) - RECORD_HEADER_GINTS;
    if(length > MAX_INDEXED_FIELDNR)
      length = MAX_INDEXED_FIELDNR + 1;

    for(col=0; col<length; col++) {
      if(*(record + RECORD_HEADER_GINTS + col) == value) {
        if(dbh->index_control_area_header.index_table[col]) {
          if(wg_index_add_field(db, record, col) < -1)
            return -1;
        }
      }
    }
  }

  /* Continue to the parents until depth==0 */
  if(depth > 0) {
    gint backlink_list = *(record + RECORD_BACKLINKS_POS);
    if(backlink_list) {
      gcell *next = (gcell *) offsettoptr(db, backlink_list);
      for(;;) {
        err = restore_backlink_index_entries(db,
          (gint *) offsettoptr(db, next->car),
          wg_encode_record(db, record), depth-1);
        if(err)
          return err;
        if(!next->cdr)
          break;
        next = (gcell *) offsettoptr(db, next->cdr);
      }
    }
  }

  return 0;
}

#endif

/* ------------ field handling: data storage and fetching ---------------- */

wg_int wg_get_record_len(void* db, void* record) {
#ifdef CHECK
  if (!dbcheck(db)) {
    show_data_error(db,"wrong database pointer given to wg_get_record_len");
    return -1;
  }
#endif
  return ((gint)(getusedobjectwantedgintsnr(*((gint*)record))))-RECORD_HEADER_GINTS;
}

wg_int* wg_get_record_dataarray(void* db, void* record) {
#ifdef CHECK
  if (!dbcheck(db)) {
    show_data_error(db,"wrong database pointer given to wg_get_record_dataarray");
    return NULL;
  }
#endif
  return (((gint*)record)+RECORD_HEADER_GINTS);
}

/** Update contents of one field
 * returns 0 if successful
 * returns -1 if invalid db pointer passed (by recordcheck macro)
 * returns -2 if invalid record passed (by recordcheck macro)
 * returns -3 for fatal index error
 * returns -4 for backlink-related error
 * returns -5 for invalid external data
 * returns -6 for journal error
 */
wg_int wg_set_field(void* db, void* record, wg_int fieldnr, wg_int data) {
  gint* fieldadr;
  gint fielddata;
  gint* strptr;
#ifdef USE_BACKLINKING
  gint backlink_list; /** start of backlinks for this record */
  gint rec_enc = WG_ILLEGAL; /** this record as encoded value.
*/ #endif db_memsegment_header *dbh = dbmemsegh(db); #ifdef USE_CHILD_DB void *offset_owner = dbmemseg(db); #endif #ifdef CHECK recordcheck(db,record,fieldnr,"wg_set_field"); #endif #ifdef USE_DBLOG /* Do not proceed before we've logged the operation */ if(dbh->logging.active) { if(wg_log_set_field(db,record,fieldnr,data)) return -6; /* journal error, cannot write */ } #endif /* Read the old encoded value */ fieldadr=((gint*)record)+RECORD_HEADER_GINTS+fieldnr; fielddata=*fieldadr; /* Update index(es) while the old value is still in the db */ #ifdef USE_INDEX_TEMPLATE if(!is_special_record(record) && fieldnr<=MAX_INDEXED_FIELDNR &&\ (dbh->index_control_area_header.index_table[fieldnr] ||\ dbh->index_control_area_header.index_template_table[fieldnr])) { #else if(!is_special_record(record) && fieldnr<=MAX_INDEXED_FIELDNR &&\ dbh->index_control_area_header.index_table[fieldnr]) { #endif if(wg_index_del_field(db, record, fieldnr) < -1) return -3; /* index error */ } /* If there are backlinks, go up the chain and remove the reference * to this record from all indexes (updating a field in the record * causes the value of the record to change). Note that we only go * as far as the recursive comparison depth - records higher in the * hierarchy are not affected. */ #if defined(USE_BACKLINKING) && (WG_COMPARE_REC_DEPTH > 0) backlink_list = *((gint *) record + RECORD_BACKLINKS_POS); if(backlink_list) { gint err; gcell *next = (gcell *) offsettoptr(db, backlink_list); rec_enc = wg_encode_record(db, record); for(;;) { err = remove_backlink_index_entries(db, (gint *) offsettoptr(db, next->car), rec_enc, WG_COMPARE_REC_DEPTH-1); if(err) { return -4; /* override the error code, for now. */ } if(!next->cdr) break; next = (gcell *) offsettoptr(db, next->cdr); } } #endif #ifdef USE_CHILD_DB /* Get the offset owner */ if(isptr(data)) { offset_owner = get_ptr_owner(db, data); if(!offset_owner) { show_data_error(db, "External reference not recognized"); return -5; } } #endif #ifdef USE_BACKLINKING /* Is the old field value a record pointer? If so, remove the backlink. * XXX: this can be optimized to use a custom macro instead of * wg_get_encoded_type(). 
*/ #ifdef USE_CHILD_DB /* Only touch local records */ if(wg_get_encoded_type(db, fielddata) == WG_RECORDTYPE && offset_owner == dbmemseg(db)) { #else if(wg_get_encoded_type(db, fielddata) == WG_RECORDTYPE) { #endif gint *rec = (gint *) wg_decode_record(db, fielddata); gint *next_offset = rec + RECORD_BACKLINKS_POS; gint parent_offset = ptrtooffset(db, record); gcell *old = NULL; while(*next_offset) { old = (gcell *) offsettoptr(db, *next_offset); if(old->car == parent_offset) { gint old_offset = *next_offset; *next_offset = old->cdr; /* remove from list chain */ wg_free_listcell(db, old_offset); /* free storage */ goto setfld_backlink_removed; } next_offset = &(old->cdr); } show_data_error(db, "Corrupt backlink chain"); return -4; /* backlink error */ } setfld_backlink_removed: #endif //printf("wg_set_field adr %d offset %d\n",fieldadr,ptrtooffset(db,fieldadr)); if (isptr(fielddata)) { //printf("wg_set_field freeing old data\n"); free_field_encoffset(db,fielddata); } (*fieldadr)=data; // store data to field #ifdef USE_CHILD_DB if (islongstr(data) && offset_owner == dbmemseg(db)) { #else if (islongstr(data)) { #endif // increase data refcount for longstr-s strptr = (gint *) offsettoptr(db,decode_longstr_offset(data)); ++(*(strptr+LONGSTR_REFCOUNT_POS)); } /* Update index after new value is written */ #ifdef USE_INDEX_TEMPLATE if(!is_special_record(record) && fieldnr<=MAX_INDEXED_FIELDNR &&\ (dbh->index_control_area_header.index_table[fieldnr] ||\ dbh->index_control_area_header.index_template_table[fieldnr])) { #else if(!is_special_record(record) && fieldnr<=MAX_INDEXED_FIELDNR &&\ dbh->index_control_area_header.index_table[fieldnr]) { #endif if(wg_index_add_field(db, record, fieldnr) < -1) return -3; } #ifdef USE_BACKLINKING /* Is the new field value a record pointer? If so, add a backlink */ #ifdef USE_CHILD_DB if(wg_get_encoded_type(db, data) == WG_RECORDTYPE && offset_owner == dbmemseg(db)) { #else if(wg_get_encoded_type(db, data) == WG_RECORDTYPE) { #endif gint *rec = (gint *) wg_decode_record(db, data); gint *next_offset = rec + RECORD_BACKLINKS_POS; gint new_offset = wg_alloc_fixlen_object(db, &(dbmemsegh(db)->listcell_area_header)); gcell *new_cell = (gcell *) offsettoptr(db, new_offset); while(*next_offset) next_offset = &(((gcell *) offsettoptr(db, *next_offset))->cdr); new_cell->car = ptrtooffset(db, record); new_cell->cdr = 0; *next_offset = new_offset; } #endif #if defined(USE_BACKLINKING) && (WG_COMPARE_REC_DEPTH > 0) /* Create new entries in indexes in all referring records */ if(backlink_list) { gint err; gcell *next = (gcell *) offsettoptr(db, backlink_list); for(;;) { err = restore_backlink_index_entries(db, (gint *) offsettoptr(db, next->car), rec_enc, WG_COMPARE_REC_DEPTH-1); if(err) { return -4; } if(!next->cdr) break; next = (gcell *) offsettoptr(db, next->cdr); } } #endif return 0; } /** Write contents of one field. * * Used to initialize fields in records that have been created with * wg_create_raw_record(). * * This function ignores the previous contents of the field. The * rationale is that newly created fields do not have any meaningful * content and this allows faster writing. It is up to the programmer * to ensure that this function is not called on fields that already * contain data. 
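 *
 * Editorial example (not part of the original source; assumes db is a
 * valid database handle) of filling a freshly created raw record while
 * checking the encoder result:
 *
 *   void *rec = wg_create_raw_record(db, 2);
 *   gint str = wg_encode_str(db, "example", NULL);
 *   if(rec && str != WG_ILLEGAL) {
 *     wg_set_new_field(db, rec, 0, str);
 *     wg_set_new_field(db, rec, 1, wg_encode_int(db, 1));
 *   }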
* * returns 0 if successful * returns -1 if invalid db pointer passed * returns -2 if invalid record or field passed * returns -3 for fatal index error * returns -4 for backlink-related error * returns -5 for invalid external data * returns -6 for journal error */ wg_int wg_set_new_field(void* db, void* record, wg_int fieldnr, wg_int data) { gint* fieldadr; gint* strptr; #ifdef USE_BACKLINKING gint backlink_list; /** start of backlinks for this record */ #endif db_memsegment_header *dbh = dbmemsegh(db); #ifdef USE_CHILD_DB void *offset_owner = dbmemseg(db); #endif #ifdef CHECK recordcheck(db,record,fieldnr,"wg_set_field"); #endif #ifdef USE_DBLOG /* Do not proceed before we've logged the operation */ if(dbh->logging.active) { if(wg_log_set_field(db,record,fieldnr,data)) return -6; /* journal error, cannot write */ } #endif #ifdef USE_CHILD_DB /* Get the offset owner */ if(isptr(data)) { offset_owner = get_ptr_owner(db, data); if(!offset_owner) { show_data_error(db, "External reference not recognized"); return -5; } } #endif /* Write new value */ fieldadr=((gint*)record)+RECORD_HEADER_GINTS+fieldnr; #ifdef CHECK if(*fieldadr) { show_data_error(db,"wg_set_new_field called on field that contains data"); return -2; } #endif (*fieldadr)=data; #ifdef USE_CHILD_DB if (islongstr(data) && offset_owner == dbmemseg(db)) { #else if (islongstr(data)) { #endif // increase data refcount for longstr-s strptr = (gint *) offsettoptr(db,decode_longstr_offset(data)); ++(*(strptr+LONGSTR_REFCOUNT_POS)); } /* Update index after new value is written */ #ifdef USE_INDEX_TEMPLATE if(!is_special_record(record) && fieldnr<=MAX_INDEXED_FIELDNR &&\ (dbh->index_control_area_header.index_table[fieldnr] ||\ dbh->index_control_area_header.index_template_table[fieldnr])) { #else if(!is_special_record(record) && fieldnr<=MAX_INDEXED_FIELDNR &&\ dbh->index_control_area_header.index_table[fieldnr]) { #endif if(wg_index_add_field(db, record, fieldnr) < -1) return -3; } #ifdef USE_BACKLINKING /* Is the new field value a record pointer? If so, add a backlink */ #ifdef USE_CHILD_DB if(wg_get_encoded_type(db, data) == WG_RECORDTYPE && offset_owner == dbmemseg(db)) { #else if(wg_get_encoded_type(db, data) == WG_RECORDTYPE) { #endif gint *rec = (gint *) wg_decode_record(db, data); gint *next_offset = rec + RECORD_BACKLINKS_POS; gint new_offset = wg_alloc_fixlen_object(db, &(dbmemsegh(db)->listcell_area_header)); gcell *new_cell = (gcell *) offsettoptr(db, new_offset); while(*next_offset) next_offset = &(((gcell *) offsettoptr(db, *next_offset))->cdr); new_cell->car = ptrtooffset(db, record); new_cell->cdr = 0; *next_offset = new_offset; } #endif #if defined(USE_BACKLINKING) && (WG_COMPARE_REC_DEPTH > 0) /* Create new entries in indexes in all referring records. Normal * usage scenario would be that the record is also new, so that * there are no backlinks, however this is not guaranteed. 
*/ backlink_list = *((gint *) record + RECORD_BACKLINKS_POS); if(backlink_list) { gint err; gcell *next = (gcell *) offsettoptr(db, backlink_list); gint rec_enc = wg_encode_record(db, record); for(;;) { err = restore_backlink_index_entries(db, (gint *) offsettoptr(db, next->car), rec_enc, WG_COMPARE_REC_DEPTH-1); if(err) { return -4; } if(!next->cdr) break; next = (gcell *) offsettoptr(db, next->cdr); } } #endif return 0; } wg_int wg_set_int_field(void* db, void* record, wg_int fieldnr, gint data) { gint fielddata; fielddata=wg_encode_int(db,data); //printf("wg_set_int_field data %d encoded %d\n",data,fielddata); if (fielddata==WG_ILLEGAL) return -1; return wg_set_field(db,record,fieldnr,fielddata); } wg_int wg_set_double_field(void* db, void* record, wg_int fieldnr, double data) { gint fielddata; fielddata=wg_encode_double(db,data); if (fielddata==WG_ILLEGAL) return -1; return wg_set_field(db,record,fieldnr,fielddata); } wg_int wg_set_str_field(void* db, void* record, wg_int fieldnr, char* data) { gint fielddata; fielddata=wg_encode_str(db,data,NULL); if (fielddata==WG_ILLEGAL) return -1; return wg_set_field(db,record,fieldnr,fielddata); } wg_int wg_set_rec_field(void* db, void* record, wg_int fieldnr, void* data) { gint fielddata; fielddata=wg_encode_record(db,data); if (fielddata==WG_ILLEGAL) return -1; return wg_set_field(db,record,fieldnr,fielddata); } /** Special case of updating a field value without a write-lock. * * Operates like wg_set_field but takes a previous value in a field * as an additional argument for atomicity check. * * This special case does not require a write lock: however, * you MUST still get a read-lock before the operation while * doing parallel processing, otherwise the operation * may corrupt the database: no complex write operations should * happen in parallel to this operation. * * NB! the operation may still confuse other parallel readers, changing * the value in a record they have just read. Use only if this is * known to not create problems for other processes. * * It can be only used to write an immediate value (NULL, short int, * char, date, time) to a non-indexed field containing also an * immediate field: checks whether these conditions hold. * * The operation will fail if the original value passed has been * overwritten before we manage to store a new value: this is * a guaranteed atomic check and enables correct operation of * several parallel wg_set_atomic_field operations * changing the same field. 
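 *
 * Editorial usage sketch (not from the original source; assumes db and
 * rec are valid, field 3 holds an immediate value, is not indexed and
 * journal logging is off):
 *
 *   gint old = wg_get_field(db, rec, 3);
 *   gint res = wg_update_atomic_field(db, rec, 3, wg_encode_int(db, 1), old);
 *   if(res == -15) {
 *     // another process changed the field after we read it:
 *     // re-read and retry, or use wg_set_atomic_field() which loops itself
 *   }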
* * returns 0 if successful * returns -1 if wrong db pointer * returns -2 if wrong fieldnr * returns -10 if new value non-immediate * returns -11 if old value non-immediate * returns -12 if cannot fetch old data * returns -13 if the field has an index * returns -14 if logging is active * returns -15 if the field value has been changed from old_data * may return other field-setting error codes from wg_set_new_field * */ wg_int wg_update_atomic_field(void* db, void* record, wg_int fieldnr, wg_int data, wg_int old_data) { gint* fieldadr; db_memsegment_header *dbh = dbmemsegh(db); gint tmp; // basic sanity check #ifdef CHECK recordcheck(db,record,fieldnr,"wg_update_atomic_field"); #endif // check whether new value and old value are direct values in a record if (!isimmediatedata(data)) return -10; if (!isimmediatedata(old_data)) return -11; // check whether there is index on the field #ifdef USE_INDEX_TEMPLATE if(!is_special_record(record) && fieldnr<=MAX_INDEXED_FIELDNR &&\ (dbh->index_control_area_header.index_table[fieldnr] ||\ dbh->index_control_area_header.index_template_table[fieldnr])) { #else if(!is_special_record(record) && fieldnr<=MAX_INDEXED_FIELDNR &&\ dbh->index_control_area_header.index_table[fieldnr]) { #endif return -13; } // check that no logging is used #ifdef USE_DBLOG if(dbh->logging.active) { return -14; } #endif // checks passed, do atomic field setting fieldadr=((gint*)record)+RECORD_HEADER_GINTS+fieldnr; tmp=wg_compare_and_swap(fieldadr, old_data, data); if (tmp) return 0; else return -15; } /** Special case of setting a field value without a write-lock. * * Calls wg_update_atomic_field iteratively until compare-and-swap succeeds. * * The restrictions and error codes from wg_update_atomic_field apply. * returns 0 if successful * returns -1...-15 with an error defined before in wg_update_atomic_field. * returns -17 if atomic assignment failed after a large number (1000) of tries */ wg_int wg_set_atomic_field(void* db, void* record, wg_int fieldnr, wg_int data) { gint* fieldadr; gint old,r; int i; #ifdef _WIN32 int ts=1; #else struct timespec ts; #endif // basic sanity check #ifdef CHECK recordcheck(db,record,fieldnr,"wg_set_atomic_field"); #endif fieldadr=((gint*)record)+RECORD_HEADER_GINTS+fieldnr; for(i=0;;i++) { // loop until preconditions fail or addition succeeds and // the old value is not changed during compare-and-swap old=*fieldadr; r=wg_update_atomic_field(db,record,fieldnr,data,old); if (!r) return 0; if (r!=-15) return r; // -15 is field changed error // here compare-and-swap failed, try again if (i>1000) return -17; // possibly a deadlock if (i%10!=0) continue; // sleep only every tenth loop // several loops passed, sleep a bit #ifdef _WIN32 Sleep(ts); // 1000 for loops take ca 0.1 sec #else ts.tv_sec=0; ts.tv_nsec=100+i; nanosleep(&ts,NULL); // 1000 for loops take ca 60 microsec #endif } return -17; // should not reach here } /** Special case of adding to an int field without a write-lock. * * fieldnr must contain a smallint and the result of addition * must also be a smallint. * * The restrictions and error codes from wg_update_atomic_field apply. * * returns 0 if successful * returns -1...-15 with an error defined before in wg_set_atomic_field. 
* returns -16 if the result of the addition does not fit into a smallint * returns -17 if atomic assignment failed after a large number (1000) of tries * */ wg_int wg_add_int_atomic_field(void* db, void* record, wg_int fieldnr, int data) { gint* fieldadr; gint old,nxt,r; int i,sum; #ifdef _WIN32 int ts=1; #else struct timespec ts; #endif // basic sanity check #ifdef CHECK recordcheck(db,record,fieldnr,"wg_add_int_atomic_field"); #endif fieldadr=((gint*)record)+RECORD_HEADER_GINTS+fieldnr; for(i=0;;i++) { // loop until preconditions fail or addition succeeds and // the old value is not changed during compare-and-swap old=*fieldadr; if (!issmallint(old)) return -11; sum=wg_decode_int(db,(gint)old)+data; if (!fits_smallint(sum)) return -16; nxt=encode_smallint(sum); r=wg_update_atomic_field(db,record,fieldnr,nxt,old); if (!r) return 0; if (r!=-15) return r; // -15 is field changed error // here compare-and-swap failed, try again if (i>1000) return -17; // possibly a deadlock if (i%10!=0) continue; // sleep only every tenth loop // several loops passed, sleep a bit #ifdef _WIN32 Sleep(ts); // 1000 for loops take ca 0.1 sec #else ts.tv_sec=0; ts.tv_nsec=100+i; nanosleep(&ts,NULL); // 1000 for loops take ca 60 microsec #endif } return -17; // should not reach here } wg_int wg_get_field(void* db, void* record, wg_int fieldnr) { #ifdef CHECK if (!dbcheck(db)) { show_data_error_nr(db,"wrong database pointer given to wg_get_field",fieldnr); return WG_ILLEGAL; } if (fieldnr<0 || (getusedobjectwantedgintsnr(*((gint*)record))<=fieldnr+RECORD_HEADER_GINTS)) { show_data_error_nr(db,"wrong field number given to wg_get_field",fieldnr);\ return WG_ILLEGAL; } #endif //printf("wg_get_field adr %d offset %d\n", // (((gint*)record)+RECORD_HEADER_GINTS+fieldnr), // ptrtooffset(db,(((gint*)record)+RECORD_HEADER_GINTS+fieldnr))); return *(((gint*)record)+RECORD_HEADER_GINTS+fieldnr); } wg_int wg_get_field_type(void* db, void* record, wg_int fieldnr) { #ifdef CHECK if (!dbcheck(db)) { show_data_error_nr(db,"wrong database pointer given to wg_get_field_type",fieldnr);\ return 0; } if (fieldnr<0 || (getusedobjectwantedgintsnr(*((gint*)record))<=fieldnr+RECORD_HEADER_GINTS)) { show_data_error_nr(db,"wrong field number given to wg_get_field_type",fieldnr);\ return 0; } #endif return wg_get_encoded_type(db,*(((gint*)record)+RECORD_HEADER_GINTS+fieldnr)); } /* ------------- general operations -------------- */ wg_int wg_free_encoded(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_free_encoded"); return 0; } #endif if (isptr(data)) { gint *strptr; /* XXX: Major hack: since free_field_encoffset() decrements * the refcount, but wg_encode_str() does not (which is correct), * before, increment the refcount once before we free the * object. If the string is in use already, this will be a * no-op, otherwise it'll be successfully freed anyway. 
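     *
     * Editorial note with a usage sketch (not from the original source):
     * wg_free_encoded() is meant for values that were encoded but never
     * ended up stored in any record, e.g.
     *
     *   gint enc = wg_encode_str(db, "scratch", NULL);
     *   if(enc != WG_ILLEGAL && wg_set_field(db, rec, 0, enc) < 0)
     *     wg_free_encoded(db, enc);  // not referenced by any record, release it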
*/ #ifdef USE_CHILD_DB if (islongstr(data) && is_local_offset(db, decode_longstr_offset(data))) { #else if (islongstr(data)) { #endif // increase data refcount for longstr-s strptr = (gint *) offsettoptr(db,decode_longstr_offset(data)); ++(*(strptr+LONGSTR_REFCOUNT_POS)); } return free_field_encoffset(db,data); } return 0; } /** properly removes ptr (offset) to data * * assumes fielddata is offset to allocated data * depending on type of fielddata either deallocates pointed data or * removes data back ptr or decreases refcount * * in case fielddata points to record or longstring, these * are freed only if they have no more pointers * * returns non-zero in case of error */ static gint free_field_encoffset(void* db,gint encoffset) { gint offset; #if 0 gint* dptr; gint* dendptr; gint data; gint i; #endif gint tmp; gint* objptr; gint* extrastr; // takes last three bits to decide the type // fullint is represented by two options: 001 and 101 switch(encoffset&NORMALPTRMASK) { case DATARECBITS: #if 0 /* This section of code in quarantine */ // remove from list // refcount check offset=decode_datarec_offset(encoffset); tmp=dbfetch(db,offset+sizeof(gint)*LONGSTR_REFCOUNT_POS); tmp--; if (tmp>0) { dbstore(db,offset+LONGSTR_REFCOUNT_POS,tmp); } else { // free frompointers structure // loop over fields, freeing them dptr=offsettoptr(db,offset); dendptr=(gint*)(((char*)dptr)+datarec_size_bytes(*dptr)); for(i=0,dptr=dptr+RECORD_HEADER_GINTS;dptrdatarec_area_header),offset); } #endif break; case LONGSTRBITS: offset=decode_longstr_offset(encoffset); #ifdef USE_CHILD_DB if(!is_local_offset(db, offset)) break; /* Non-local reference, ignore it */ #endif // refcount check tmp=dbfetch(db,offset+sizeof(gint)*LONGSTR_REFCOUNT_POS); tmp--; if (tmp>0) { dbstore(db,offset+sizeof(gint)*LONGSTR_REFCOUNT_POS,tmp); } else { objptr = (gint *) offsettoptr(db,offset); extrastr=(gint*)(((char*)(objptr))+(sizeof(gint)*LONGSTR_EXTRASTR_POS)); tmp=*extrastr; // remove from hash wg_remove_from_strhash(db,encoffset); // remove extrastr if (tmp!=0) free_field_encoffset(db,tmp); *extrastr=0; // really free object from area wg_free_object(db,&(dbmemsegh(db)->longstr_area_header),offset); } break; case SHORTSTRBITS: #ifdef USE_CHILD_DB offset = decode_shortstr_offset(encoffset); if(!is_local_offset(db, offset)) break; /* Non-local reference, ignore it */ wg_free_shortstr(db, offset); #else wg_free_shortstr(db,decode_shortstr_offset(encoffset)); #endif break; case FULLDOUBLEBITS: #ifdef USE_CHILD_DB offset = decode_fulldouble_offset(encoffset); if(!is_local_offset(db, offset)) break; /* Non-local reference, ignore it */ wg_free_doubleword(db, offset); #else wg_free_doubleword(db,decode_fulldouble_offset(encoffset)); #endif break; case FULLINTBITSV0: #ifdef USE_CHILD_DB offset = decode_fullint_offset(encoffset); if(!is_local_offset(db, offset)) break; /* Non-local reference, ignore it */ wg_free_word(db, offset); #else wg_free_word(db,decode_fullint_offset(encoffset)); #endif break; case FULLINTBITSV1: #ifdef USE_CHILD_DB offset = decode_fullint_offset(encoffset); if(!is_local_offset(db, offset)) break; /* Non-local reference, ignore it */ wg_free_word(db, offset); #else wg_free_word(db,decode_fullint_offset(encoffset)); #endif break; } return 0; } /* ------------- data encoding and decoding ------------ */ /** determines the type of encoded data * * returns a zero-or-bigger macro integer value from wg_db_api.h beginning: * * #define WG_NULLTYPE 1 * #define WG_RECORDTYPE 2 * #define WG_INTTYPE 3 * #define WG_DOUBLETYPE 4 * #define 
WG_STRTYPE 5 * ... etc ... * * returns a negative number -1 in case of error * */ wg_int wg_get_encoded_type(void* db, wg_int data) { gint fieldoffset; gint tmp; #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_get_encoded_type"); return 0; } #endif if (!data) return WG_NULLTYPE; if (((data)&NONPTRBITS)==NONPTRBITS) { // data is one of the non-pointer types if (isvar(data)) return (gint)WG_VARTYPE; if (issmallint(data)) return (gint)WG_INTTYPE; switch(data&LASTBYTEMASK) { case CHARBITS: return WG_CHARTYPE; case FIXPOINTBITS: return WG_FIXPOINTTYPE; case DATEBITS: return WG_DATETYPE; case TIMEBITS: return WG_TIMETYPE; case TINYSTRBITS: return WG_STRTYPE; case VARBITS: return WG_VARTYPE; case ANONCONSTBITS: return WG_ANONCONSTTYPE; default: return -1; } } // here we know data must be of ptr type // takes last three bits to decide the type // fullint is represented by two options: 001 and 101 //printf("cp0\n"); switch(data&NORMALPTRMASK) { case DATARECBITS: return (gint)WG_RECORDTYPE; case LONGSTRBITS: //printf("cp1\n"); fieldoffset=decode_longstr_offset(data)+LONGSTR_META_POS*sizeof(gint); //printf("fieldoffset %d\n",fieldoffset); tmp=dbfetch(db,fieldoffset); //printf("str meta %d lendiff %d subtype %d\n", // tmp,(tmp&LONGSTR_META_LENDIFMASK)>>LONGSTR_META_LENDIFSHFT,tmp&LONGSTR_META_TYPEMASK); return tmp&LONGSTR_META_TYPEMASK; // WG_STRTYPE, WG_URITYPE, WG_XMLLITERALTYPE case SHORTSTRBITS: return (gint)WG_STRTYPE; case FULLDOUBLEBITS: return (gint)WG_DOUBLETYPE; case FULLINTBITSV0: return (gint)WG_INTTYPE; case FULLINTBITSV1: return (gint)WG_INTTYPE; default: return -1; } return 0; } char* wg_get_type_name(void* db, wg_int type) { switch (type) { case WG_NULLTYPE: return "null"; case WG_RECORDTYPE: return "record"; case WG_INTTYPE: return "int"; case WG_DOUBLETYPE: return "double"; case WG_STRTYPE: return "string"; case WG_XMLLITERALTYPE: return "xmlliteral"; case WG_URITYPE: return "uri"; case WG_BLOBTYPE: return "blob"; case WG_CHARTYPE: return "char"; case WG_FIXPOINTTYPE: return "fixpoint"; case WG_DATETYPE: return "date"; case WG_TIMETYPE: return "time"; case WG_ANONCONSTTYPE: return "anonconstant"; case WG_VARTYPE: return "var"; default: return "unknown"; } } wg_int wg_encode_null(void* db, char* data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_encode_null"); return WG_ILLEGAL; } if (data!=NULL) { show_data_error(db,"data given to wg_encode_null is not NULL"); return WG_ILLEGAL; } #endif #ifdef USE_DBLOG /* Skip logging values that do not cause storage allocation. if(dbh->logging.active) { if(wg_log_encode(db, WG_NULLTYPE, NULL, 0, NULL, 0)) return WG_ILLEGAL; } */ #endif return (gint)0; } char* wg_decode_null(void* db,wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_null"); return NULL; } if (data!=(gint)0) { show_data_error(db,"data given to wg_decode_null is not an encoded NULL"); return NULL; } #endif return NULL; } wg_int wg_encode_int(void* db, wg_int data) { gint offset; #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_encode_int"); return WG_ILLEGAL; } #endif if (fits_smallint(data)) { return encode_smallint(data); } else { #ifdef USE_DBLOG /* Log before allocating. Note this call is skipped when * we have a small int. 
*/ if(dbmemsegh(db)->logging.active) { if(wg_log_encode(db, WG_INTTYPE, &data, 0, NULL, 0)) return WG_ILLEGAL; } #endif offset=alloc_word(db); if (!offset) { show_data_error_nr(db,"cannot store an integer in wg_set_int_field: ",data); #ifdef USE_DBLOG if(dbmemsegh(db)->logging.active) { wg_log_encval(db, WG_ILLEGAL); } #endif return WG_ILLEGAL; } dbstore(db,offset,data); #ifdef USE_DBLOG if(dbmemsegh(db)->logging.active) { if(wg_log_encval(db, encode_fullint_offset(offset))) return WG_ILLEGAL; /* journal error */ } #endif return encode_fullint_offset(offset); } } wg_int wg_decode_int(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_int"); return 0; } #endif if (issmallint(data)) return decode_smallint(data); if (isfullint(data)) return dbfetch(db,decode_fullint_offset(data)); show_data_error_nr(db,"data given to wg_decode_int is not an encoded int: ",data); return 0; } wg_int wg_encode_char(void* db, char data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_encode_char"); return WG_ILLEGAL; } #endif #ifdef USE_DBLOG /* Skip logging values that do not cause storage allocation. if(dbh->logging.active) { if(wg_log_encode(db, WG_CHARTYPE, &data, 0, NULL, 0)) return WG_ILLEGAL; } */ #endif return (wg_int)(encode_char((wg_int)data)); } char wg_decode_char(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_char"); return 0; } #endif return (char)(decode_char(data)); } wg_int wg_encode_double(void* db, double data) { gint offset; #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_encode_double"); return WG_ILLEGAL; } #endif #ifdef USE_DBLOG /* Log before allocating. */ if(dbmemsegh(db)->logging.active) { if(wg_log_encode(db, WG_DOUBLETYPE, &data, 0, NULL, 0)) return WG_ILLEGAL; } #endif if (0) { // possible future case for tiny floats } else { offset=alloc_doubleword(db); if (!offset) { show_data_error_double(db,"cannot store a double in wg_set_double_field: ",data); #ifdef USE_DBLOG if(dbmemsegh(db)->logging.active) { wg_log_encval(db, WG_ILLEGAL); } #endif return WG_ILLEGAL; } *((double*)(offsettoptr(db,offset)))=data; #ifdef USE_DBLOG if(dbmemsegh(db)->logging.active) { if(wg_log_encval(db, encode_fulldouble_offset(offset))) return WG_ILLEGAL; /* journal error */ } #endif return encode_fulldouble_offset(offset); } } double wg_decode_double(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_double"); return 0; } #endif if (isfulldouble(data)) return *((double*)(offsettoptr(db,decode_fulldouble_offset(data)))); show_data_error_nr(db,"data given to wg_decode_double is not an encoded double: ",data); return 0; } wg_int wg_encode_fixpoint(void* db, double data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_encode_fixpoint"); return WG_ILLEGAL; } if (!fits_fixpoint(data)) { show_data_error(db,"argument given to wg_encode_fixpoint too big or too small"); return WG_ILLEGAL; } #endif #ifdef USE_DBLOG /* Skip logging values that do not cause storage allocation. 
if(dbh->logging.active) { if(wg_log_encode(db, WG_FIXPOINTTYPE, &data, 0, NULL, 0)) return WG_ILLEGAL; } */ #endif return encode_fixpoint(data); } double wg_decode_fixpoint(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_double"); return 0; } #endif if (isfixpoint(data)) return decode_fixpoint(data); show_data_error_nr(db,"data given to wg_decode_fixpoint is not an encoded fixpoint: ",data); return 0; } wg_int wg_encode_date(void* db, int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_encode_date"); return WG_ILLEGAL; } if (!fits_date(data)) { show_data_error(db,"argument given to wg_encode_date too big or too small"); return WG_ILLEGAL; } #endif #ifdef USE_DBLOG /* Skip logging values that do not cause storage allocation. if(dbh->logging.active) { if(wg_log_encode(db, WG_DATETYPE, &data, 0, NULL, 0)) return WG_ILLEGAL; } */ #endif return encode_date(data); } int wg_decode_date(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_date"); return 0; } #endif if (isdate(data)) return decode_date(data); show_data_error_nr(db,"data given to wg_decode_date is not an encoded date: ",data); return 0; } wg_int wg_encode_time(void* db, int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_encode_time"); return WG_ILLEGAL; } if (!fits_time(data)) { show_data_error(db,"argument given to wg_encode_time too big or too small"); return WG_ILLEGAL; } #endif #ifdef USE_DBLOG /* Skip logging values that do not cause storage allocation. if(dbh->logging.active) { if(wg_log_encode(db, WG_TIMETYPE, &data, 0, NULL, 0)) return WG_ILLEGAL; } */ #endif return encode_time(data); } int wg_decode_time(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_time"); return 0; } #endif if (istime(data)) return decode_time(data); show_data_error_nr(db,"data given to wg_decode_time is not an encoded time: ",data); return 0; } int wg_current_utcdate(void* db) { time_t ts; int epochadd=719163; // y 1970 m 1 d 1 ts=time(NULL); // secs since Epoch 1970 return (int)(ts/(24*60*60))+epochadd; } int wg_current_localdate(void* db) { time_t esecs; int res; struct tm ctime; esecs=time(NULL); // secs since Epoch 1970tstruct.time; localtime_r(&esecs,&ctime); res=ymd_to_scalar(ctime.tm_year+1900,ctime.tm_mon+1,ctime.tm_mday); return res; } int wg_current_utctime(void* db) { struct timeb tstruct; int esecs; int days; int secs; int milli; int secsday=24*60*60; ftime(&tstruct); esecs=(int)(tstruct.time); milli=tstruct.millitm; days=esecs/secsday; secs=esecs-(days*secsday); return (secs*100)+(milli/10); } int wg_current_localtime(void* db) { struct timeb tstruct; time_t esecs; int secs; int milli; struct tm ctime; ftime(&tstruct); esecs=tstruct.time; milli=tstruct.millitm; localtime_r(&esecs,&ctime); secs=ctime.tm_hour*60*60+ctime.tm_min*60+ctime.tm_sec; return (secs*100)+(milli/10); } int wg_strf_iso_datetime(void* db, int date, int time, char* buf) { unsigned yr, mo, day, hr, min, sec, spart; int t=time; int c; hr=t/(60*60*100); t=t-(hr*(60*60*100)); min=t/(60*100); t=t-(min*(60*100)); sec=t/100; t=t-(sec*(100)); spart=t; scalar_to_ymd(date,&yr,&mo,&day); c=snprintf(buf,24,"%04d-%02d-%02dT%02d:%02d:%02d.%02d",yr,mo,day,hr,min,sec,spart); return(c); } int wg_strp_iso_date(void* db, char* inbuf) { int sres; int yr=0; int mo=0; int day=0; int res; 
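  /* Editorial usage sketch for the date and time helpers (not from the
   * original source; assumes db is a valid database handle and rec an
   * existing record):
   *
   *   int d = wg_strp_iso_date(db, "2014-01-17");   // -1 on parse error
   *   int t = wg_strp_iso_time(db, "12:30:00.00");
   *   if(d >= 0 && t >= 0) {
   *     char buf[32];
   *     wg_strf_iso_datetime(db, d, t, buf);  // "2014-01-17T12:30:00.00"
   *     wg_set_field(db, rec, 0, wg_encode_date(db, d));
   *     wg_set_field(db, rec, 1, wg_encode_time(db, t));
   *   }
   */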
sres=sscanf(inbuf,"%4d-%2d-%2d",&yr,&mo,&day); if (sres<3 || yr<0 || mo<1 || mo>12 || day<1 || day>31) return -1; res=ymd_to_scalar(yr,mo,day); return res; } int wg_strp_iso_time(void* db, char* inbuf) { int sres; int hr=0; int min=0; int sec=0; int prt=0; sres=sscanf(inbuf,"%2d:%2d:%2d.%2d",&hr,&min,&sec,&prt); if (sres<3 || hr<0 || hr>24 || min<0 || min>60 || sec<0 || sec>60 || prt<0 || prt>99) return -1; return hr*(60*60*100)+min*(60*100)+sec*100+prt; } int wg_ymd_to_date(void* db, int yr, int mo, int day) { if (yr<0 || mo<1 || mo>12 || day<1 || day>31) return -1; return ymd_to_scalar(yr,mo,day); } int wg_hms_to_time(void* db, int hr, int min, int sec, int prt) { if (hr<0 || hr>24 || min<0 || min>60 || sec<0 || sec>60 || prt<0 || prt>99) return -1; return hr*(60*60*100)+min*(60*100)+sec*100+prt; } void wg_date_to_ymd(void* db, int date, int *yr, int *mo, int *day) { unsigned int y, m, d; scalar_to_ymd(date, &y, &m, &d); *yr=y; *mo=m; *day=d; } void wg_time_to_hms(void* db, int time, int *hr, int *min, int *sec, int *prt) { int t=time; *hr=t/(60*60*100); t=t-(*hr * (60*60*100)); *min=t/(60*100); t=t-(*min * (60*100)); *sec=t/100; t=t-(*sec * (100)); *prt=t; } // record wg_int wg_encode_record(void* db, void* data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_encode_char"); return WG_ILLEGAL; } #endif #ifdef USE_DBLOG /* Skip logging values that do not cause storage allocation. if(dbh->logging.active) { if(wg_log_encode(db, WG_RECORDTYPE, &data, 0, NULL, 0)) return WG_ILLEGAL; } */ #endif return (wg_int)(encode_datarec_offset(ptrtooffset(db,data))); } void* wg_decode_record(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_encode_char"); return 0; } #endif return (void*)(offsettoptr(db,decode_datarec_offset(data))); } /* ============================================ Separate string, xmlliteral, uri, blob funs call universal funs defined later ============================================== */ /* string */ wg_int wg_encode_str(void* db, char* str, char* lang) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_encode_str"); return WG_ILLEGAL; } if (str==NULL) { show_data_error(db,"NULL string ptr given to wg_encode_str"); return WG_ILLEGAL; } #endif /* Logging handled inside wg_encode_unistr() */ return wg_encode_unistr(db,str,lang,WG_STRTYPE); } char* wg_decode_str(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_str"); return NULL; } if (!data) { show_data_error(db,"data given to wg_decode_str is 0, not an encoded string"); return NULL; } #endif return wg_decode_unistr(db,data,WG_STRTYPE); } wg_int wg_decode_str_len(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_str_len"); return -1; } if (!data) { show_data_error(db,"data given to wg_decode_str_len is 0, not an encoded string"); return -1; } #endif return wg_decode_unistr_len(db,data,WG_STRTYPE); } wg_int wg_decode_str_copy(void* db, wg_int data, char* strbuf, wg_int buflen) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_str_copy"); return -1; } if (!data) { show_data_error(db,"data given to wg_decode_str_copy is 0, not an encoded string"); return -1; } if (strbuf==NULL) { show_data_error(db,"buffer given to wg_decode_str_copy is 0, not a valid buffer pointer"); return -1; } if (buflen<1) { 
show_data_error(db,"buffer len given to wg_decode_str_copy is 0 or less"); return -1; } #endif return wg_decode_unistr_copy(db,data,strbuf,buflen,WG_STRTYPE); } char* wg_decode_str_lang(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_str"); return NULL; } if (!data) { show_data_error(db,"data given to wg_decode_str_lang is 0, not an encoded string"); return NULL; } #endif return wg_decode_unistr_lang(db,data,WG_STRTYPE); } wg_int wg_decode_str_lang_len(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_str_lang_len"); return -1; } if (!data) { show_data_error(db,"data given to wg_decode_str_lang_len is 0, not an encoded string"); return -1; } #endif return wg_decode_unistr_lang_len(db,data,WG_STRTYPE); } wg_int wg_decode_str_lang_copy(void* db, wg_int data, char* langbuf, wg_int buflen) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_str_lang_copy"); return -1; } if (!data) { show_data_error(db,"data given to wg_decode_str_lang_copy is 0, not an encoded string"); return -1; } if (langbuf==NULL) { show_data_error(db,"buffer given to wg_decode_str_lang_copy is 0, not a valid buffer pointer"); return -1; } if (buflen<1) { show_data_error(db,"buffer len given to wg_decode_str_lang_copy is 0 or less"); return -1; } #endif return wg_decode_unistr_lang_copy(db,data,langbuf,buflen,WG_STRTYPE); } /* xmlliteral */ wg_int wg_encode_xmlliteral(void* db, char* str, char* xsdtype) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_encode_xmlliteral"); return WG_ILLEGAL; } if (str==NULL) { show_data_error(db,"NULL string ptr given to wg_encode_xmlliteral"); return WG_ILLEGAL; } if (xsdtype==NULL) { show_data_error(db,"NULL xsdtype ptr given to wg_encode_xmlliteral"); return WG_ILLEGAL; } #endif /* Logging handled inside wg_encode_unistr() */ return wg_encode_unistr(db,str,xsdtype,WG_XMLLITERALTYPE); } char* wg_decode_xmlliteral(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_xmlliteral"); return NULL; } if (!data) { show_data_error(db,"data given to wg_decode_xmlliteral is 0, not an encoded xmlliteral"); return NULL; } #endif return wg_decode_unistr(db,data,WG_XMLLITERALTYPE); } wg_int wg_decode_xmlliteral_len(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_xmlliteral_len"); return -1; } if (!data) { show_data_error(db,"data given to wg_decode_xmlliteral_len is 0, not an encoded xmlliteral"); return -1; } #endif return wg_decode_unistr_len(db,data,WG_XMLLITERALTYPE); } wg_int wg_decode_xmlliteral_copy(void* db, wg_int data, char* strbuf, wg_int buflen) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_xmlliteral_copy"); return -1; } if (!data) { show_data_error(db,"data given to wg_decode_xmlliteral_copy is 0, not an encoded xmlliteral"); return -1; } if (strbuf==NULL) { show_data_error(db,"buffer given to wg_decode_xmlliteral_copy is 0, not a valid buffer pointer"); return -1; } if (buflen<1) { show_data_error(db,"buffer len given to wg_decode_xmlliteral_copy is 0 or less"); return -1; } #endif return wg_decode_unistr_copy(db,data,strbuf,buflen,WG_XMLLITERALTYPE); } char* wg_decode_xmlliteral_xsdtype(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database 
pointer given to wg_decode_xmlliteral"); return NULL; } if (!data) { show_data_error(db,"data given to wg_decode_xmlliteral_xsdtype is 0, not an encoded xmlliteral"); return NULL; } #endif return wg_decode_unistr_lang(db,data,WG_XMLLITERALTYPE); } wg_int wg_decode_xmlliteral_xsdtype_len(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_xmlliteral_xsdtype_len"); return -1; } if (!data) { show_data_error(db,"data given to wg_decode_xmlliteral_lang_xsdtype is 0, not an encoded xmlliteral"); return -1; } #endif return wg_decode_unistr_lang_len(db,data,WG_XMLLITERALTYPE); } wg_int wg_decode_xmlliteral_xsdtype_copy(void* db, wg_int data, char* langbuf, wg_int buflen) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_xmlliteral_xsdtype_copy"); return -1; } if (!data) { show_data_error(db,"data given to wg_decode_xmlliteral_xsdtype_copy is 0, not an encoded xmlliteral"); return -1; } if (langbuf==NULL) { show_data_error(db,"buffer given to wg_decode_xmlliteral_xsdtype_copy is 0, not a valid buffer pointer"); return -1; } if (buflen<1) { show_data_error(db,"buffer len given to wg_decode_xmlliteral_xsdtype_copy is 0 or less"); return -1; } #endif return wg_decode_unistr_lang_copy(db,data,langbuf,buflen,WG_XMLLITERALTYPE); } /* uri */ wg_int wg_encode_uri(void* db, char* str, char* prefix) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_encode_uri"); return WG_ILLEGAL; } if (str==NULL) { show_data_error(db,"NULL string ptr given to wg_encode_uri"); return WG_ILLEGAL; } #endif /* Logging handled inside wg_encode_unistr() */ return wg_encode_unistr(db,str,prefix,WG_URITYPE); } char* wg_decode_uri(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_uri"); return NULL; } if (!data) { show_data_error(db,"data given to wg_decode_uri is 0, not an encoded string"); return NULL; } #endif return wg_decode_unistr(db,data,WG_URITYPE); } wg_int wg_decode_uri_len(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_uri_len"); return -1; } if (!data) { show_data_error(db,"data given to wg_decode_uri_len is 0, not an encoded string"); return -1; } #endif return wg_decode_unistr_len(db,data,WG_URITYPE); } wg_int wg_decode_uri_copy(void* db, wg_int data, char* strbuf, wg_int buflen) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_uri_copy"); return -1; } if (!data) { show_data_error(db,"data given to wg_decode_uri_copy is 0, not an encoded string"); return -1; } if (strbuf==NULL) { show_data_error(db,"buffer given to wg_decode_uri_copy is 0, not a valid buffer pointer"); return -1; } if (buflen<1) { show_data_error(db,"buffer len given to wg_decode_uri_copy is 0 or less"); return -1; } #endif return wg_decode_unistr_copy(db,data,strbuf,buflen,WG_URITYPE); } char* wg_decode_uri_prefix(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_uri_prefix"); return NULL; } if (!data) { show_data_error(db,"data given to wg_decode_uri_prefix is 0, not an encoded uri"); return NULL; } #endif return wg_decode_unistr_lang(db,data,WG_URITYPE); } wg_int wg_decode_uri_prefix_len(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_uri_prefix_len"); return -1; } if 
(!data) { show_data_error(db,"data given to wg_decode_uri_prefix_len is 0, not an encoded string"); return -1; } #endif return wg_decode_unistr_lang_len(db,data,WG_URITYPE); } wg_int wg_decode_uri_prefix_copy(void* db, wg_int data, char* langbuf, wg_int buflen) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_uri_prefix_copy"); return -1; } if (!data) { show_data_error(db,"data given to wg_decode_uri_prefix_copy is 0, not an encoded string"); return -1; } if (langbuf==NULL) { show_data_error(db,"buffer given to wg_decode_uri_prefix_copy is 0, not a valid buffer pointer"); return -1; } if (buflen<1) { show_data_error(db,"buffer len given to wg_decode_uri_prefix_copy is 0 or less"); return -1; } #endif return wg_decode_unistr_lang_copy(db,data,langbuf,buflen,WG_URITYPE); } /* blob */ wg_int wg_encode_blob(void* db, char* str, char* type, wg_int len) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_encode_blob"); return WG_ILLEGAL; } if (str==NULL) { show_data_error(db,"NULL string ptr given to wg_encode_blob"); return WG_ILLEGAL; } #endif return wg_encode_uniblob(db,str,type,WG_BLOBTYPE,len); } char* wg_decode_blob(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_blob"); return NULL; } if (!data) { show_data_error(db,"data given to wg_decode_blob is 0, not an encoded string"); return NULL; } #endif return wg_decode_unistr(db,data,WG_BLOBTYPE); } wg_int wg_decode_blob_len(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_blob_len"); return -1; } if (!data) { show_data_error(db,"data given to wg_decode_blob_len is 0, not an encoded string"); return -1; } #endif return wg_decode_unistr_len(db,data,WG_BLOBTYPE)+1; } wg_int wg_decode_blob_copy(void* db, wg_int data, char* strbuf, wg_int buflen) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_blob_copy"); return -1; } if (!data) { show_data_error(db,"data given to wg_decode_blob_copy is 0, not an encoded string"); return -1; } if (strbuf==NULL) { show_data_error(db,"buffer given to wg_decode_blob_copy is 0, not a valid buffer pointer"); return -1; } if (buflen<1) { show_data_error(db,"buffer len given to wg_decode_blob_copy is 0 or less"); return -1; } #endif return wg_decode_unistr_copy(db,data,strbuf,buflen,WG_BLOBTYPE); } char* wg_decode_blob_type(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_blob_type"); return NULL; } if (!data) { show_data_error(db,"data given to wg_decode_blob_type is 0, not an encoded blob"); return NULL; } #endif return wg_decode_unistr_lang(db,data,WG_BLOBTYPE); } wg_int wg_decode_blob_type_len(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_blob_type_len"); return -1; } if (!data) { show_data_error(db,"data given to wg_decode_blob_type_len is 0, not an encoded string"); return -1; } #endif return wg_decode_unistr_lang_len(db,data,WG_BLOBTYPE); } wg_int wg_decode_blob_type_copy(void* db, wg_int data, char* langbuf, wg_int buflen) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_blob_type_copy"); return -1; } if (!data) { show_data_error(db,"data given to wg_decode_blob_type_copy is 0, not an encoded string"); return -1; } if (langbuf==NULL) { 
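  /* Blobs may contain zero bytes, so wg_encode_blob() takes an explicit
     length and the stored length is used when reading the data back.
     Illustrative sketch (db is assumed to be an attached handle; the type
     string here is just an example label):

       char raw[4] = {0x01, 0x00, 0x02, 0x03};
       wg_int enc = wg_encode_blob(db, raw, "octet-stream", 4);
       if (enc != WG_ILLEGAL) {
         char out[16];
         wg_int n = wg_decode_blob_len(db, enc);       // stored byte count
         wg_decode_blob_copy(db, enc, out, sizeof(out));
       }
  */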
show_data_error(db,"buffer given to wg_decode_blob_type_copy is 0, not a valid buffer pointer"); return -1; } if (buflen<1) { show_data_error(db,"buffer len given to wg_decode_blob_type_copy is 0 or less"); return -1; } #endif return wg_decode_unistr_lang_copy(db,data,langbuf,buflen,WG_BLOBTYPE); } /* anonconst */ wg_int wg_encode_anonconst(void* db, char* str) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_encode_anonconst"); return WG_ILLEGAL; } if (str==NULL) { show_data_error(db,"NULL string ptr given to wg_encode_anonconst"); return WG_ILLEGAL; } #endif //return wg_encode_unistr(db,str,NULL,WG_ANONCONSTTYPE); /* Logging handled inside wg_encode_unistr() */ return wg_encode_unistr(db,str,NULL,WG_URITYPE); } char* wg_decode_anonconst(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_anonconst"); return NULL; } if (!data) { show_data_error(db,"data given to wg_decode_anonconst is 0, not an encoded anonconst"); return NULL; } #endif //return wg_decode_unistr(db,data,WG_ANONCONSTTYPE); return wg_decode_unistr(db,data,WG_URITYPE); } /* var */ wg_int wg_encode_var(void* db, wg_int varnr) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_encode_var"); return WG_ILLEGAL; } if (!fits_var(varnr)) { show_data_error(db,"int given to wg_encode_var too big/small"); return WG_ILLEGAL; } #endif #ifdef USE_DBLOG /* Skip logging values that do not cause storage allocation. if(dbh->logging.active) { if(wg_log_encode(db, WG_VARTYPE, &data, 0, NULL, 0)) return WG_ILLEGAL; } */ #endif return encode_var(varnr); } wg_int wg_decode_var(void* db, wg_int data) { #ifdef CHECK if (!dbcheck(db)) { show_data_error(db,"wrong database pointer given to wg_decode_var"); return -1; } if (!data) { show_data_error(db,"data given to wg_decode_var is 0, not an encoded var"); return -1; } #endif return decode_var(data); } /* ============================================ Universal funs for string, xmlliteral, uri, blob ============================================== */ gint wg_encode_unistr(void* db, char* str, char* lang, gint type) { gint offset; gint len; #ifdef USETINYSTR gint res; #endif char* dptr; char* sptr; char* dendptr; len=(gint)(strlen(str)); #ifdef USE_DBLOG /* Log before allocating. 
*/ if(dbmemsegh(db)->logging.active) { gint extlen = 0; if(lang) extlen = strlen(lang); if(wg_log_encode(db, type, str, len, lang, extlen)) return WG_ILLEGAL; } #endif #ifdef USETINYSTR /* XXX: add tinystr support to logging */ #ifdef USE_DBLOG #error USE_DBLOG and USETINYSTR are incompatible #endif if (lang==NULL && type==WG_STRTYPE && len<(sizeof(gint)-1)) { res=TINYSTRBITS; // first zero the field and set last byte to mask if (LITTLEENDIAN) { dptr=((char*)(&res))+1; // type bits stored in lowest addressed byte } else { dptr=((char*)(&res)); // type bits stored in highest addressed byte } memcpy(dptr,str,len+1); return res; } #endif if (lang==NULL && type==WG_STRTYPE && lenlogging.active) { wg_log_encval(db, WG_ILLEGAL); } #endif return WG_ILLEGAL; } // loop over bytes, storing them starting from offset dptr = (char *) offsettoptr(db,offset); dendptr=dptr+SHORTSTR_SIZE; // //strcpy(dptr,sptr); //memset(dptr+len,0,SHORTSTR_SIZE-len); // for(sptr=str; (*dptr=*sptr)!=0; sptr++, dptr++) {}; // copy string for(dptr++; dptrlogging.active) { if(wg_log_encval(db, encode_shortstr_offset(offset))) return WG_ILLEGAL; /* journal error */ } #endif return encode_shortstr_offset(offset); //dbstore(db,ptrtoffset(record)+RECORD_HEADER_GINTS+fieldnr,encode_shortstr_offset(offset)); } else { offset=find_create_longstr(db,str,lang,type,len+1); if (!offset) { show_data_error_nr(db,"cannot create a string of size ",len); #ifdef USE_DBLOG if(dbmemsegh(db)->logging.active) { wg_log_encval(db, WG_ILLEGAL); } #endif return WG_ILLEGAL; } #ifdef USE_DBLOG if(dbmemsegh(db)->logging.active) { if(wg_log_encval(db, encode_longstr_offset(offset))) return WG_ILLEGAL; /* journal error */ } #endif return encode_longstr_offset(offset); } } gint wg_encode_uniblob(void* db, char* str, char* lang, gint type, gint len) { gint offset; if (0) { } else { offset=find_create_longstr(db,str,lang,type,len); if (!offset) { show_data_error_nr(db,"cannot create a blob of size ",len); return WG_ILLEGAL; } return encode_longstr_offset(offset); } } static gint find_create_longstr(void* db, char* data, char* extrastr, gint type, gint length) { db_memsegment_header* dbh = dbmemsegh(db); gint offset; size_t i; gint tmp; gint lengints; gint lenrest; char* lstrptr; gint old=0; int hash; gint hasharrel; gint res; if (0) { } else { // find hash, check if exists and use if found hash=wg_hash_typedstr(db,data,extrastr,type,length); //hasharrel=((gint*)(offsettoptr(db,((db->strhash_area_header).arraystart))))[hash]; hasharrel=dbfetch(db,((dbh->strhash_area_header).arraystart)+(sizeof(gint)*hash)); //printf("hash %d((dbh->strhash_area_header).arraystart)+(sizeof(gint)*hash) %d hasharrel %d\n", // hash,((dbh->strhash_area_header).arraystart)+(sizeof(gint)*hash), hasharrel); if (hasharrel) old=wg_find_strhash_bucket(db,data,extrastr,type,length,hasharrel); //printf("old %d \n",old); if (old) { //printf("str found in hash\n"); return old; } //printf("str not found in hash\n"); //printf("hasharrel 1 %d \n",hasharrel); // equal string not found in hash // allocate a new string lengints=length/sizeof(gint); // 7/4=1, 8/4=2, 9/4=2, lenrest=length%sizeof(gint); // 7%4=3, 8%4=0, 9%4=1, if (lenrest) lengints++; offset=wg_alloc_gints(db, &(dbmemsegh(db)->longstr_area_header), lengints+LONGSTR_HEADER_GINTS); if (!offset) { //show_data_error_nr(db,"cannot create a data string/blob of size ",length); return 0; } lstrptr=(char*)(offsettoptr(db,offset)); // store string contents memcpy(lstrptr+(LONGSTR_HEADER_GINTS*sizeof(gint)),data,length); //zero the rest 
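  /* Long strings are interned: the hash lookup above returns an existing
     longstr object when an identical (string, extra string, type) triple
     is already stored, so equal values share a single object. For a new
     object the requested length is rounded up to whole gints and the
     unused tail is zeroed; the difference between the allocated object
     length and the actual byte count is recorded in the meta field
     (LONGSTR_META_LENDIFMASK) so that decoding can recover the exact
     length later. */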
for(i=0;lenrest && istrhash_area_header).arraystart)+(sizeof(gint)*hash),res); //printf("hasharrel 2 %d \n",hasharrel); dbstore(db,offset+LONGSTR_HASHCHAIN_POS*sizeof(gint),hasharrel); // store old hash array el // return result return res; } } char* wg_decode_unistr(void* db, gint data, gint type) { gint* objptr; char* dataptr; #ifdef USETINYSTR if (type==WG_STRTYPE && istinystr(data)) { if (LITTLEENDIAN) { dataptr=((char*)(&data))+1; // type bits stored in lowest addressed byte } else { dataptr=((char*)(&data)); // type bits stored in highest addressed byte } return dataptr; } #endif if (isshortstr(data)) { dataptr=(char*)(offsettoptr(db,decode_shortstr_offset(data))); return dataptr; } if (islongstr(data)) { objptr = (gint *) offsettoptr(db,decode_longstr_offset(data)); dataptr=((char*)(objptr))+(LONGSTR_HEADER_GINTS*sizeof(gint)); return dataptr; } show_data_error(db,"data given to wg_decode_unistr is not an encoded string"); return NULL; } char* wg_decode_unistr_lang(void* db, gint data, gint type) { gint* objptr; gint* fldptr; gint fldval; char* res; #ifdef USETINYSTR if (type==WG_STRTYPE && istinystr(data)) { return NULL; } #endif if (type==WG_STRTYPE && isshortstr(data)) { return NULL; } if (islongstr(data)) { objptr = (gint *) offsettoptr(db,decode_longstr_offset(data)); fldptr=((gint*)objptr)+LONGSTR_EXTRASTR_POS; fldval=*fldptr; if (fldval==0) return NULL; res=wg_decode_unistr(db,fldval,type); return res; } show_data_error(db,"data given to wg_decode_unistr_lang is not an encoded string"); return NULL; } /** * return length of the main string, not including terminating 0 * * */ gint wg_decode_unistr_len(void* db, gint data, gint type) { char* dataptr; gint* objptr; gint objsize; gint strsize; #ifdef USETINYSTR if (type==WG_STRTYPE && istinystr(data)) { if (LITTLEENDIAN) { dataptr=((char*)(&data))+1; // type bits stored in lowest addressed byte } else { dataptr=((char*)(&data)); // type bits stored in highest addressed byte } strsize=strlen(dataptr); return strsize; } #endif if (isshortstr(data)) { dataptr=(char*)(offsettoptr(db,decode_shortstr_offset(data))); strsize=strlen(dataptr); return strsize; } if (islongstr(data)) { objptr = (gint *) offsettoptr(db,decode_longstr_offset(data)); objsize=getusedobjectsize(*objptr); dataptr=((char*)(objptr))+(LONGSTR_HEADER_GINTS*sizeof(gint)); //printf("dataptr to read from %d str '%s' of len %d\n",dataptr,dataptr,strlen(dataptr)); strsize=objsize-(((*(objptr+LONGSTR_META_POS))&LONGSTR_META_LENDIFMASK)>>LONGSTR_META_LENDIFSHFT); return strsize-1; } show_data_error(db,"data given to wg_decode_unistr_len is not an encoded string"); return 0; } /** * copy string, return length of a copied string, not including terminating 0 * * return -1 in case of error * */ gint wg_decode_unistr_copy(void* db, gint data, char* strbuf, gint buflen, gint type) { gint i; gint* objptr; char* dataptr; gint objsize; gint strsize; #ifdef USETINYSTR if (type==WG_STRTYPE && istinystr(data)) { if (LITTLEENDIAN) { dataptr=((char*)(&data))+1; // type bits stored in lowest addressed byte } else { dataptr=((char*)(&data)); // type bits stored in highest addressed byte } strsize=strlen(dataptr)+1; if (strsize>=sizeof(gint)) { show_data_error_nr(db,"wrong data stored as tinystr, impossible length:",strsize); return 0; } if (buflen=buflen) { show_data_error_nr(db,"insufficient buffer length given to wg_decode_unistr_copy:",buflen); return -1; } *strbuf=*dataptr; } *strbuf=0; return i-1; } if (islongstr(data)) { objptr = (gint *) offsettoptr(db,decode_longstr_offset(data)); 
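    /* getusedobjectsize() gives the padded allocation size of the longstr
       object; subtracting the "lendif" value kept in its meta field yields
       the real byte count including the terminating zero, which is why
       these decode functions return that value minus one as the string
       length. */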
objsize=getusedobjectsize(*objptr); dataptr=((char*)(objptr))+(LONGSTR_HEADER_GINTS*sizeof(gint)); //printf("dataptr to read from %d str '%s' of len %d\n",dataptr,dataptr,strlen(dataptr)); strsize=objsize-(((*(objptr+LONGSTR_META_POS))&LONGSTR_META_LENDIFMASK)>>LONGSTR_META_LENDIFSHFT); //printf("objsize %d metaptr %d meta %d lendiff %d strsize %d \n", // objsize,((gint*)objptr+LONGSTR_META_POS),*((gint*)objptr+LONGSTR_META_POS), // (((*(objptr+LONGSTR_META_POS))&LONGSTR_META_LENDIFMASK)>>LONGSTR_META_LENDIFSHFT),strsize); if(buflen=buflen) { show_data_error_nr(db,"insufficient buffer length given to wg_decode_unistr_lang_copy:",buflen); return -1; } memcpy(strbuf,langptr,len+1); return len; } /* ----------- calendar and time functions ------------------- */ /* Scalar date routines used are written and given to public domain by Ray Gardner. */ static int isleap(unsigned yr) { return yr % 400 == 0 || (yr % 4 == 0 && yr % 100 != 0); } static unsigned months_to_days (unsigned month) { return (month * 3057 - 3007) / 100; } static long years_to_days (unsigned yr) { return yr * 365L + yr / 4 - yr / 100 + yr / 400; } static long ymd_to_scalar (unsigned yr, unsigned mo, unsigned day) { long scalar; scalar = day + months_to_days(mo); if ( mo > 2 ) /* adjust if past February */ scalar -= isleap(yr) ? 1 : 2; yr--; scalar += years_to_days(yr); return scalar; } static void scalar_to_ymd (long scalar, unsigned *yr, unsigned *mo, unsigned *day) { unsigned n; /* compute inverse of years_to_days() */ for ( n = (unsigned)((scalar * 400L) / 146097L); years_to_days(n) < scalar;) n++; /* 146097 == years_to_days(400) */ *yr = n; n = (unsigned)(scalar - years_to_days(n-1)); if ( n > 59 ) { /* adjust if past February */ n += 2; if (isleap(*yr)) n -= n > 62 ? 1 : 2; } *mo = (n * 100 + 3007) / 3057; /* inverse of months_to_days() */ *day = n - months_to_days(*mo); } /* Thread-safe localtime_r appears not to be present on windows: emulate using win localtime_s, which is thread-safe */ #ifdef _WIN32 static struct tm * localtime_r (const time_t *timer, struct tm *result) { struct tm local_result; int res; res = localtime_s (&local_result,timer); if (!res) return NULL; //if (local_result == NULL || result == NULL) return NULL; memcpy (result, &local_result, sizeof (result)); return result; } #endif /* ------ value offset translation ---- */ /* Translate externally encoded value in relation to current base address * * Data argument is a value encoded in the database extdb. Returned value is * translated so that it can be used in WhiteDB API functions with the * database db. 
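 *
 * A minimal usage sketch (assuming both databases are already attached
 * and extdb has been registered with db through the external database
 * API, which is not part of this file; rec and extrec stand for records
 * created elsewhere):
 *
 *   gint ext = wg_get_field(extdb, extrec, 0);
 *   gint loc = wg_encode_external_data(db, extdb, ext);
 *   if (loc != WG_ILLEGAL)
 *     wg_set_field(db, rec, 0, loc);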
*/ gint wg_encode_external_data(void *db, void *extdb, gint encoded) { #ifdef USE_CHILD_DB return wg_translate_hdroffset(db, dbmemseg(extdb), encoded); #else show_data_error(db, "child databases support is not enabled."); return WG_ILLEGAL; #endif } #ifdef USE_CHILD_DB gint wg_translate_hdroffset(void *db, void *exthdr, gint encoded) { gint extoff = ptrtooffset(db, exthdr); /* relative offset of external db */ /* Only pointer-type values need translating */ if(isptr(encoded)) { switch(encoded&NORMALPTRMASK) { case DATARECBITS: return encode_datarec_offset( decode_datarec_offset(encoded) + extoff); case LONGSTRBITS: return encode_longstr_offset( decode_longstr_offset(encoded) + extoff); case SHORTSTRBITS: return encode_shortstr_offset( decode_shortstr_offset(encoded) + extoff); case FULLDOUBLEBITS: return encode_fulldouble_offset( decode_fulldouble_offset(encoded) + extoff); case FULLINTBITSV0: case FULLINTBITSV1: return encode_fullint_offset( decode_fullint_offset(encoded) + extoff); default: /* XXX: it's not entirely correct to fail silently here, but * we can only end up here if new pointer types are added without * updating this function. */ break; } } return encoded; } /** Return base address that an encoded value is "native" to. * * The external database must be registered first for the offset * to be recognized. Returns NULL if none of the registered * databases match. */ static void *get_ptr_owner(void *db, gint encoded) { gint offset = 0; if(isptr(encoded)) { switch(encoded&NORMALPTRMASK) { case DATARECBITS: offset = decode_datarec_offset(encoded); case LONGSTRBITS: offset = decode_longstr_offset(encoded); case SHORTSTRBITS: offset = decode_shortstr_offset(encoded); case FULLDOUBLEBITS: offset = decode_fulldouble_offset(encoded); case FULLINTBITSV0: case FULLINTBITSV1: offset = decode_fullint_offset(encoded); default: break; } } else { return dbmemseg(db); /* immediate values default to "Local" */ } if(!offset) return NULL; /* data values do not point at memsegment header * start anyway. */ if(offset > 0 && offset < dbmemsegh(db)->size) { return dbmemseg(db); /* "Local" record */ } else { int i; db_memsegment_header* dbh = dbmemsegh(db); for(i=0; iextdbs.count; i++) { if(offset > dbh->extdbs.offset[i] && \ offset < dbh->extdbs.offset[i] + dbh->extdbs.size[i]) { return (void *) (dbmemsegbytes(db) + dbh->extdbs.offset[i]); } } return NULL; } } /** Check if an offset is "native" to the current database. * * Returns 1 if the offset is local, 0 otherwise. */ static int is_local_offset(void *db, gint offset) { if(offset > 0 && offset < dbmemsegh(db)->size) { return 1; /* "Local" data */ } return 0; } #endif /** Return base address that the record belongs to. * * Takes pointer values as arguments. * The external database must be registered first for the offset * to be recognized. Returns NULL if none of the registered * databases match. * XXX: needed to compile the lib under windows even * if child databases are disabled. 
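 * As a quick illustration: for a record pointer rec, a return value equal
 * to dbmemseg(db) means the record lives in the local database, while a
 * different non-NULL result is the base address of the registered
 * external database that contains it.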
*/ void *wg_get_rec_owner(void *db, void *rec) { int i; db_memsegment_header* dbh = dbmemsegh(db); if((gint) rec > (gint) dbmemseg(db)) { void *eodb = (void *) (dbmemsegbytes(db) + dbh->size); if((gint) rec < (gint) eodb) return dbmemseg(db); /* "Local" record */ } for(i=0; iextdbs.count; i++) { void *base = (void *) (dbmemsegbytes(db) + dbh->extdbs.offset[i]); void *eodb = (void *) (((char *) base) + dbh->extdbs.size[i]); if((gint) rec > (gint) base && (gint) rec < (gint) eodb) { return base; } } show_data_error(db, "invalid pointer in wg_get_rec_base_offset"); return NULL; } /* ------------ errors ---------------- */ static gint show_data_error(void* db, char* errmsg) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"wg data handling error: %s\n",errmsg); #endif return -1; } static gint show_data_error_nr(void* db, char* errmsg, gint nr) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"wg data handling error: %s %d\n", errmsg, (int) nr); #endif return -1; } static gint show_data_error_double(void* db, char* errmsg, double nr) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"wg data handling error: %s %f\n",errmsg,nr); #endif return -1; } static gint show_data_error_str(void* db, char* errmsg, char* str) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"wg data handling error: %s %s\n",errmsg,str); #endif return -1; } #ifdef __cplusplus } #endif whitedb-0.7.2/Db/dbdata.h000066400000000000000000000477741226454622500151160ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbdata.h * Datatype encoding defs and public headers for actual data handling procedures. */ #ifndef DEFINED_DBDATA_H #define DEFINED_DBDATA_H #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "dballoc.h" // ============= external funs defs ============ #ifndef _WIN32 extern double round(double); #else /* round as a macro (no libm equivalent for MSVC) */ #define round(x) ((double) floor((double) x + 0.5)) #endif // ============= api part starts ================ /* --- built-in data type numbers ----- */ /* the built-in data types are primarily for api purposes. 
internally, some of these types like int, str etc have several different ways to encode along with different bit masks */ #define WG_NULLTYPE 1 #define WG_RECORDTYPE 2 #define WG_INTTYPE 3 #define WG_DOUBLETYPE 4 #define WG_STRTYPE 5 #define WG_XMLLITERALTYPE 6 #define WG_URITYPE 7 #define WG_BLOBTYPE 8 #define WG_CHARTYPE 9 #define WG_FIXPOINTTYPE 10 #define WG_DATETYPE 11 #define WG_TIMETYPE 12 #define WG_ANONCONSTTYPE 13 // not implemented yet #define WG_VARTYPE 14 // not implemented yet /* Illegal encoded data indicator */ #define WG_ILLEGAL 0xff /* prototypes of wg database api functions */ typedef ptrdiff_t wg_int; typedef size_t wg_uint; // used in time enc /* -------- creating and scanning records --------- */ void* wg_create_record(void* db, wg_int length); ///< returns NULL when error, ptr to rec otherwise void* wg_create_raw_record(void* db, wg_int length); ///< returns NULL when error, ptr to rec otherwise wg_int wg_delete_record(void* db, void *rec); ///< returns 0 on success, non-0 on error void* wg_get_first_record(void* db); ///< returns NULL when error or no recs void* wg_get_next_record(void* db, void* record); ///< returns NULL when error or no more recs void* wg_get_first_raw_record(void* db); void* wg_get_next_raw_record(void* db, void* record); /* -------- setting and fetching record field values --------- */ wg_int wg_get_record_len(void* db, void* record); ///< returns negative int when error wg_int* wg_get_record_dataarray(void* db, void* record); ///< pointer to record data array start // following field setting functions return negative int when err, 0 when ok wg_int wg_set_field(void* db, void* record, wg_int fieldnr, wg_int data); wg_int wg_set_new_field(void* db, void* record, wg_int fieldnr, wg_int data); wg_int wg_set_int_field(void* db, void* record, wg_int fieldnr, wg_int data); wg_int wg_set_double_field(void* db, void* record, wg_int fieldnr, double data); wg_int wg_set_str_field(void* db, void* record, wg_int fieldnr, char* data); wg_int wg_update_atomic_field(void* db, void* record, wg_int fieldnr, wg_int data, wg_int old_data); wg_int wg_set_atomic_field(void* db, void* record, wg_int fieldnr, wg_int data); wg_int wg_add_int_atomic_field(void* db, void* record, wg_int fieldnr, int data); wg_int wg_get_field(void* db, void* record, wg_int fieldnr); // returns 0 when error wg_int wg_get_field_type(void* db, void* record, wg_int fieldnr); // returns 0 when error /* ---------- general operations on encoded data -------- */ wg_int wg_get_encoded_type(void* db, wg_int data); char* wg_get_type_name(void* db, wg_int type); wg_int wg_free_encoded(void* db, wg_int data); /* -------- encoding and decoding data: records contain encoded data only ---------- */ // null wg_int wg_encode_null(void* db, char* data); char* wg_decode_null(void* db, wg_int data); // int wg_int wg_encode_int(void* db, wg_int data); wg_int wg_decode_int(void* db, wg_int data); // char wg_int wg_encode_char(void* db, char data); char wg_decode_char(void* db, wg_int data); // double wg_int wg_encode_double(void* db, double data); double wg_decode_double(void* db, wg_int data); // fixpoint wg_int wg_encode_fixpoint(void* db, double data); double wg_decode_fixpoint(void* db, wg_int data); // date and time wg_int wg_encode_date(void* db, int data); int wg_decode_date(void* db, wg_int data); wg_int wg_encode_time(void* db, int data); int wg_decode_time(void* db, wg_int data); int wg_current_utcdate(void* db); int wg_current_localdate(void* db); int wg_current_utctime(void* db); int 
wg_current_localtime(void* db); int wg_strf_iso_datetime(void* db, int date, int time, char* buf); int wg_strp_iso_date(void* db, char* buf); int wg_strp_iso_time(void* db, char* inbuf); int wg_ymd_to_date(void* db, int yr, int mo, int day); int wg_hms_to_time(void* db, int hr, int min, int sec, int prt); void wg_date_to_ymd(void* db, int date, int *yr, int *mo, int *day); void wg_time_to_hms(void* db, int time, int *hr, int *min, int *sec, int *prt); //record wg_int wg_encode_record(void* db, void* data); void* wg_decode_record(void* db, wg_int data); // str (standard C string: zero-terminated array of chars) // along with optional attached language indicator str wg_int wg_encode_str(void* db, char* str, char* lang); ///< let lang==NULL if not used char* wg_decode_str(void* db, wg_int data); char* wg_decode_str_lang(void* db, wg_int data); wg_int wg_decode_str_len(void* db, wg_int data); wg_int wg_decode_str_lang_len(void* db, wg_int data); wg_int wg_decode_str_copy(void* db, wg_int data, char* strbuf, wg_int buflen); wg_int wg_decode_str_lang_copy(void* db, wg_int data, char* langbuf, wg_int buflen); // xmlliteral (standard C string: zero-terminated array of chars) // along with obligatory attached xsd:type str wg_int wg_encode_xmlliteral(void* db, char* str, char* xsdtype); char* wg_decode_xmlliteral(void* db, wg_int data); char* wg_decode_xmlliteral_xsdtype(void* db, wg_int data); wg_int wg_decode_xmlliteral_len(void* db, wg_int data); wg_int wg_decode_xmlliteral_xsdtype_len(void* db, wg_int data); wg_int wg_decode_xmlliteral_copy(void* db, wg_int data, char* strbuf, wg_int buflen); wg_int wg_decode_xmlliteral_xsdtype_copy(void* db, wg_int data, char* strbuf, wg_int buflen); // uri (standard C string: zero-terminated array of chars) // along with an optional prefix str wg_int wg_encode_uri(void* db, char* str, char* prefix); char* wg_decode_uri(void* db, wg_int data); char* wg_decode_uri_prefix(void* db, wg_int data); wg_int wg_decode_uri_len(void* db, wg_int data); wg_int wg_decode_uri_prefix_len(void* db, wg_int data); wg_int wg_decode_uri_copy(void* db, wg_int data, char* strbuf, wg_int buflen); wg_int wg_decode_uri_prefix_copy(void* db, wg_int data, char* strbuf, wg_int buflen); // blob (binary large object, i.e. any kind of data) // along with an obligatory length in bytes wg_int wg_encode_blob(void* db, char* str, char* type, wg_int len); char* wg_decode_blob(void* db, wg_int data); char* wg_decode_blob_type(void* db, wg_int data); wg_int wg_decode_blob_len(void* db, wg_int data); wg_int wg_decode_blob_copy(void* db, wg_int data, char* strbuf, wg_int buflen); wg_int wg_decode_blob_type_len(void* db, wg_int data); wg_int wg_decode_blob_type_copy(void* db, wg_int data, char* langbuf, wg_int buflen); // anonconst wg_int wg_encode_anonconst(void* db, char* str); char* wg_decode_anonconst(void* db, wg_int data); // var wg_int wg_encode_var(void* db, wg_int varnr); wg_int wg_decode_var(void* db, wg_int data); // ================ api part ends ================ /* Record header structure. Position 0 is always reserved * for size. */ #define RECORD_HEADER_GINTS 3 #define RECORD_META_POS 1 /** metainfo, reserved for future use */ #define RECORD_BACKLINKS_POS 2 /** backlinks structure offset */ #define LITTLEENDIAN 1 ///< (intel is little-endian) difference in encoding tinystr //#define USETINYSTR 1 ///< undef to prohibit usage of tinystr /* Record meta bits. 
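   These values are stored in the gint at RECORD_META_POS of a record and
   may be combined; the is_special_record()/is_plain_record() and
   is_schema_*() macros below simply test the corresponding bits.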
*/ #define RECORD_META_NOTDATA 0x1 /** Record is a "special" record (not data) */ #define RECORD_META_MATCH 0x2 /** "match" record (needs NOTDATA as well) */ #define RECORD_META_DOC 0x10 /** schema bits: top-level document */ #define RECORD_META_OBJECT 0x20 /** schema bits: object */ #define RECORD_META_ARRAY 0x40 /** schema bits: array */ #define is_special_record(r) (*((gint *) r + RECORD_META_POS) &\ RECORD_META_NOTDATA) #define is_plain_record(r) (*((gint *) r + RECORD_META_POS) == 0) #define is_schema_array(r) (*((gint *) r + RECORD_META_POS) &\ RECORD_META_ARRAY) #define is_schema_object(r) (*((gint *) r + RECORD_META_POS) &\ RECORD_META_OBJECT) #define is_schema_document(r) (*((gint *) r + RECORD_META_POS) &\ RECORD_META_DOC) // recognising gint types as gb types: bits, shifts, masks /* special value null (unassigned) integer 0 Pointers to word-len ints end with ?01 = not eq Pointers to data records end with 000 = not eq Pointers to long string records end with 100 = eq Pointers to doubleword-len doubles end with 010 = not eq Pointers to 32byte string records end with 110 = not eq Immediate integers end with 011 = is eq (Other immediates 111 (continued below)) Immediate vars end with 0111 // not implemented yet Immediate short fixpoints 0000 1111 = is eq Immediate chars 0001 1111 = is eq Immediate dates 0010 1111 = is eq Immediate times 0011 1111 = is eq // Immediate tiny strings 0100 1111 = is eq // not used yet Immediate anon constants 0101 1111 = is eq // not implemented yet */ /* --- encoding and decoding basic data ---- */ #define SMALLINTBITS 0x3 ///< int ends with 011 #define SMALLINTSHFT 3 #define SMALLINTMASK 0x7 #define fits_smallint(i) ((((i)<>SMALLINTSHFT)==i) #define encode_smallint(i) (((i)<>SMALLINTSHFT) #define FULLINTBITS 0x1 ///< full int ptr ends with 01 #define FULLINTBITSV0 0x1 ///< full int type as 3-bit nr version 0: 001 #define FULLINTBITSV1 0x5 ///< full int type as 3-bit nr version 1: 101 #define FULLINTMASK 0x3 #define encode_fullint_offset(i) ((i)|FULLINTBITS) #define decode_fullint_offset(i) ((i) & ~FULLINTMASK) #define DATARECBITS 0x0 ///< datarec ptr ends with 000 #define DATARECMASK 0x7 #define encode_datarec_offset(i) (i) #define decode_datarec_offset(i) (i) #define LONGSTRBITS 0x4 ///< longstr ptr ends with 100 #define LONGSTRMASK 0x7 #define encode_longstr_offset(i) ((i)|LONGSTRBITS) #define decode_longstr_offset(i) ((i) & ~LONGSTRMASK) #define FULLDOUBLEBITS 0x2 ///< full double ptr ends with 010 #define FULLDOUBLEMASK 0x7 #define encode_fulldouble_offset(i) ((i)|FULLDOUBLEBITS) #define decode_fulldouble_offset(i) ((i) & ~FULLDOUBLEMASK) #define SHORTSTRBITS 0x6 ///< short str ptr ends with 110 #define SHORTSTRMASK 0x7 #define encode_shortstr_offset(i) ((i)|SHORTSTRBITS) #define decode_shortstr_offset(i) ((i) & ~SHORTSTRMASK) /* --- encoding and decoding other data ---- */ #define VARMASK 0xf #define VARSHFT 4 #define VARBITS 0x7 ///< var ends with 0111 #define fits_var(i) ((((i)<>VARSHFT)==i) #define encode_var(i) (((i)<>VARSHFT) #define CHARMASK 0xff #define CHARSHFT 8 #define CHARBITS 0x1f ///< char ends with 0001 1111 #define encode_char(i) (((i)<>CHARSHFT) #define DATEMASK 0xff #define DATESHFT 8 #define DATEBITS 0x2f ///< date ends with 0010 1111 #define MAXDATE 128*255*255 #define MINDATE -128*255*255 #define fits_date(i) (((i)<=MAXDATE) && ((i)>=MINDATE)) #define encode_date(i) (((i)<>DATESHFT) #define TIMEMASK 0xff #define TIMESHFT 8 #define TIMEBITS 0x3f ///< time ends with 0011 1111 #define MAXTIME 24*60*60*100 #define MINTIME 0 #define 
fits_time(i) (((i)<=MAXTIME) && ((i)>=MINTIME)) #define encode_time(i) (((i)<>TIMESHFT)) #define FIXPOINTMASK 0xff #define FIXPOINTSHFT 8 #define FIXPOINTBITS 0xf ///< fixpoint ends with 0000 1111 #define MAXFIXPOINT 800 #define MINFIXPOINT -800 #define FIXPOINTDIVISOR 10000.0 #define fits_fixpoint(i) (((i)<=MAXFIXPOINT) && ((i)>=MINFIXPOINT)) #define encode_fixpoint(i) ((((int)(round((i)*(double)FIXPOINTDIVISOR)))<>FIXPOINTSHFT)/(double)FIXPOINTDIVISOR)) #define TINYSTRMASK 0xff #define TINYSTRSHFT 8 #define TINYSTRBITS 0x4f ///< tiny str ends with 0100 1111 #define ANONCONSTMASK 0xff #define ANONCONSTSHFT 8 #define ANONCONSTBITS 0x5f ///< anon const ends with 0101 1111 #define encode_anonconst(i) (((i)<>ANONCONSTSHFT) /* --- recognizing data ---- */ #define NORMALPTRMASK 0x7 ///< all pointers except fullint #define NONPTRBITS 0x3 #define LASTFOURBITSMASK 0xf #define PRELASTFOURBITSMASK 0xf0 #define LASTBYTEMASK 0xff #define isptr(i) ((i) && (((i)&NONPTRBITS)!=NONPTRBITS)) #define isdatarec(i) (((i)&DATARECMASK)==DATARECBITS) #define isfullint(i) (((i)&FULLINTMASK)==FULLINTBITS) #define isfulldouble(i) (((i)&FULLDOUBLEMASK)==FULLDOUBLEBITS) #define isshortstr(i) (((i)&SHORTSTRMASK)==SHORTSTRBITS) #define islongstr(i) (((i)&LONGSTRMASK)==LONGSTRBITS) #define issmallint(i) (((i)&SMALLINTMASK)==SMALLINTBITS) #define isvar(i) (((i)&VARMASK)==VARBITS) #define ischar(i) (((i)&CHARMASK)==CHARBITS) #define isfixpoint(i) (((i)&FIXPOINTMASK)==FIXPOINTBITS) #define isdate(i) (((i)&DATEMASK)==DATEBITS) #define istime(i) (((i)&TIMEMASK)==TIMEBITS) #define istinystr(i) (((i)&TINYSTRMASK)==TINYSTRBITS) #define isanonconst(i) (((i)&ANONCONSTMASK)==ANONCONSTBITS) #define isimmediatedata(i) ((i)==0 || (!isptr(i) && !isfullint(i))) /* ------ metainfo and special data items --------- */ #define datarec_size_bytes(i) (getusedobjectwantedbytes(i)) #define datarec_end_ptr(i) /* --------- record and longstr data object structure ---------- */ /* record data object gint usage from start: 0: encodes length in bytes. length is aligned to sizeof gint 1: pointer to next sibling 2: pointer to prev sibling or parent 3: data gints ... ---- conventional database rec ---------- car1: id: 10 model: ford licenceplate: 123LGH owner: 20 (we will have ptr to rec 20) car2: id: 11 model: opel licenceplate: 456RMH owner: 20 (we will have ptr to rec 20) person1: parents: list of pointers to person1? 
id: 20 fname: John lname: Brown ---- xml node ------- xml-corresponding rdf triplets _10 model ford _10 licenceplate 123LGH _11 model opel _11 licenceplate 456RMH _20 fname john _20 lname brown _20 owns _10 _20 owns _11 (?x fname john) & (?x lname brown) & (?x owns ?y) & (?y model ford) => answer(?y) solution: - locate from value index brown - instantiate ?x with _20 - scan _20 value occurrences with pred lname to find john - scan _20 subject occurrences with pred owns to find _10 - scan _10 subject occurrences with pred model to find ford ----normal rdf triplets ----- _10 model ford _10 licenceplate 123LGH _10 owner _20 _11 model opel _11 licenceplate 456RMH _11 owner _20 _20 fname john _20 lname brown (?x fname john) & (?x lname brown) & (?y owner ?x) & (?y model ford) => answer(?y) solution: - locate from value index brown - instantiate ?x with _20 - scan _20 value occurrences with pred lname to find john - scan _20 value occurrences with pred owner to find _10 - scan _10 subject occurrences with pred model to find ford --------- fromptr structure ------- fld 1 pts to either directly (single) occurrence or rec of occurrences: single occ case: - last bit zero indicates direct ptr to rec - two previous bits indicate position in rec (0-3) multiple (or far pos) case: - last bit 1 indicates special pos list array ptr: pos array: recsize position fld nr, ptr to (single) rec or to corresp list of recs position fld nr, ptr to (single) rec or to corresp list o recs ... where corresp list is made of pairs (list cells): ptr to rec ptr to next list cell alternative: ptr to rec ptr to rec ptr to rec ptr to rec ptr to next block */ /* record data object gint usage from start: 0: encodes data obj length in bytes. length is aligned to sizeof gint 1: metainfo, incl object type: - last byte object type - top-level/dependent bit - original/derived bit 2: backlinks 3: actual gints .... ... */ /* longstr/xmlliteral/uri/blob data object gint usage from start: 0: encodes data obj length in bytes. length is aligned to sizeof gint 1: metainfo, incl object type (longstr/xmlliteral/uri/blob/datarec etc): - last byte object type - byte before last: nr to delete from obj length to get real actual-bytes length 2: refcount 3: backlinks 4: pointer to next longstr in the hash bucket, 0 if no following 5: lang/xsdtype/namespace str (offset): if 0 not present 6: actual bytes .... ... 
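An illustrative size calculation (ignoring any extra padding the allocator
itself may add): storing the string "abcd" (4 characters plus the
terminating zero, 5 bytes) on a 32-bit build needs 2 data gints (8 bytes)
after the 6 header gints (24 bytes), giving an object length of 32 bytes;
the metainfo field then records 32-5=27 as the number to delete from the
object length to get the real actual-bytes length.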
*/ #define LONGSTR_HEADER_GINTS 6 /** including obj length gint */ #define LONGSTR_META_POS 1 /** metainfo, incl object type (longstr/xmlliteral/uri/blob/datarec etc) last byte (low 0) object type (WG_STRTYPE,WG_XMLLITERALTYPE, etc) byte before last (low 1): lendif: nr to delete from obj length to get real actual-bytes length of str low 2: unused low 3: unused */ #define LONGSTR_META_LENDIFMASK 0xFF00 /** second lowest bytes contains lendif*/ #define LONGSTR_META_LENDIFSHFT 8 /** shift 8 bits right to get lendif */ #define LONGSTR_META_TYPEMASK 0xFF /*** lowest byte contains actual subtype: str,uri,xmllliteral */ #define LONGSTR_REFCOUNT_POS 2 /** reference count, if 0, delete*/ #define LONGSTR_BACKLINKS_POS 3 /** backlinks structure offset */ #define LONGSTR_HASHCHAIN_POS 4 /** offset of next longstr in the hash bucket, 0 if no following */ #define LONGSTR_EXTRASTR_POS 5 /** lang/xsdtype/namespace str (encoded offset): if 0 not present */ /* --------- error handling ------------ */ #define recordcheck(db,record,fieldnr,opname) { \ if (!dbcheck(db)) {\ show_data_error_str(db,"wrong database pointer given to ",opname);\ return -1;\ }\ if (fieldnr<0 || getusedobjectwantedgintsnr(*((gint*)record))<=fieldnr+RECORD_HEADER_GINTS) {\ show_data_error_str(db,"wrong field number given to ",opname);\ return -2;\ }\ } /* ==== Protos ==== */ //void free_field_data(void* db,gint fielddata, gint fromrecoffset, gint fromrecfield); gint wg_encode_unistr(void* db, char* str, char* lang, gint type); ///< let lang==NULL if not used gint wg_encode_uniblob(void* db, char* str, char* lang, gint type, gint len); char* wg_decode_unistr(void* db, wg_int data, gint type); char* wg_decode_unistr_lang(void* db, wg_int data, gint type); gint wg_decode_unistr_len(void* db, wg_int data, gint type); gint wg_decode_unistr_lang_len(void* db, wg_int data, gint type); gint wg_decode_unistr_copy(void* db, wg_int data, char* strbuf, wg_int buflen, gint type); gint wg_decode_unistr_lang_copy(void* db, wg_int data, char* langbuf, wg_int buflen, gint type); gint wg_encode_external_data(void *db, void *extdb, gint encoded); #ifdef USE_CHILD_DB gint wg_translate_hdroffset(void *db, void *exthdr, gint encoded); void *wg_get_rec_owner(void *db, void *rec); #endif #endif /* DEFINED_DBDATA_H */ whitedb-0.7.2/Db/dbdump.c000066400000000000000000000223121226454622500151220ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Andri Rebane 2009, Priit Järv 2009,2010 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . 
* */ /** @file dbdump.c * DB dumping support for WhiteDB memory database * */ /* ====== Includes =============== */ #include #include #ifdef _WIN32 #define WIN32_LEAN_AND_MEAN #include #include #endif #ifdef __cplusplus extern "C" { #endif #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "dballoc.h" #include "dbmem.h" #include "dblock.h" #include "dblog.h" /* ====== Private headers and defs ======== */ #include "dbdump.h" #include "crc1.h" /* ======= Private protos ================ */ static gint show_dump_error(void *db, char *errmsg); static gint show_dump_error_str(void *db, char *errmsg, char *str); /* ====== Functions ============== */ /** dump shared memory to the disk. * Returns 0 when successful (no error). * -1 non-fatal error (db may continue) * -2 fatal error (should abort db) * This function is parallel-safe (may run during normal db usage) */ gint wg_dump(void * db,char fileName[]) { return wg_dump_internal(db, fileName, 1); } /** Handle the actual dumping (called by the API wrapper) * if locking is non-zero, properly acquire locks on the database. * Otherwise do a rescue dump by copying the memory image without locking. */ gint wg_dump_internal(void * db, char fileName[], int locking) { FILE *f; db_memsegment_header* dbh = dbmemsegh(db); gint dbsize = dbh->free; /* first unused offset - 0 = db size */ #ifdef USE_DBLOG gint active; #endif gint err = -1; gint lock_id = 0; gint32 crc; #ifdef CHECK if(dbh->extdbs.count != 0) { show_dump_error(db, "Database contains external references"); } #endif /* Open the dump file */ #ifdef _WIN32 if(fopen_s(&f, fileName, "wb")) { #else if(!(f = fopen(fileName, "wb"))) { #endif show_dump_error(db, "Error opening file"); return -1; } #ifndef USE_DBLOG /* Get shared lock on the db */ if(locking) { lock_id = db_rlock(db, DEFAULT_LOCK_TIMEOUT); if(!lock_id) { show_dump_error(db, "Failed to lock the database for dump"); return -1; } } #else /* Get exclusive lock on the db, we need to modify the logging area */ if(locking) { lock_id = db_wlock(db, DEFAULT_LOCK_TIMEOUT); if(!lock_id) { show_dump_error(db, "Failed to lock the database for dump"); return -1; } } active = dbh->logging.active; if(active) { wg_stop_logging(db); } #endif /* Compute the CRC32 of the used area */ crc = update_crc32(dbmemsegbytes(db), dbsize, 0x0); /* Now, write the memory area to file */ if(fwrite(dbmemseg(db), dbsize, 1, f) == 1) { /* Overwrite checksum field */ fseek(f, ptrtooffset(db, &(dbh->checksum)), SEEK_SET); if(fwrite(&crc, sizeof(gint32), 1, f) == 1) { err = 0; } } if(err) show_dump_error(db, "Error writing file"); #ifndef USE_DBLOG /* We're done writing */ if(locking) { if(!db_rulock(db, lock_id)) { show_dump_error(db, "Failed to unlock the database"); err = -2; /* This error should be handled as fatal */ } } #else /* restart logging */ if(active) { dbh->logging.dirty = 0; if(wg_start_logging(db)) { err = -2; /* Failed to re-initialize log */ } } if(locking) { if(!db_wulock(db, lock_id)) { show_dump_error(db, "Failed to unlock the database"); err = -2; /* Write lock failure --> fatal */ } } #endif fflush(f); fclose(f); return err; } /* This has to be large enough to hold all the relevant * fields in the header during the first pass of the read. * (Currently this is the first 24 bytes of the dump file) */ #define BUFSIZE 8192 /** Check dump file for compatibility and errors. * Returns 0 when successful (no error). 
* -1 on system error (cannot open file, no memory) * -2 header is incompatible * -3 on file integrity error (size mismatch, CRC32 error). * * Sets minsize to minimum required segment size and maxsize * to original memory image size if check was successful. Otherwise * the contents of these variables may be undefined. */ gint wg_check_dump(void *db, char fileName[], gint *minsize, gint *maxsize) { char *buf; FILE *f; gint len, filesize; gint32 crc, dump_crc; gint err = -1; /* Attempt to open the dump file */ #ifdef _WIN32 if(fopen_s(&f, fileName, "rb")) { #else if(!(f = fopen(fileName, "rb"))) { #endif show_dump_error(db, "Error opening file"); return -1; } buf = (char *) malloc(BUFSIZE); if(!buf) { show_dump_error(db, "malloc error in wg_import_dump"); goto abort1; } /* First pass of reading. Examine the header. */ if(fread(buf, BUFSIZE, 1, f) != 1) { show_dump_error(db, "Error reading dump header"); goto abort2; } if(wg_check_header_compat((db_memsegment_header *) buf)) { show_dump_error_str(db, "Incompatible dump file", fileName); wg_print_code_version(); wg_print_header_version((db_memsegment_header *) buf); err = -2; goto abort2; } *minsize = ((db_memsegment_header *) buf)->free; *maxsize = ((db_memsegment_header *) buf)->size; /* Now check file integrity. */ dump_crc = ((db_memsegment_header *) buf)->checksum; ((db_memsegment_header *) buf)->checksum = 0; len = BUFSIZE; filesize = 0; crc = 0; do { filesize += len; crc = update_crc32(buf, len, crc); } while((len=fread(buf,1,BUFSIZE,f)) > 0); if(filesize != *minsize) { show_dump_error_str(db, "File size incorrect", fileName); err = -3; } else if(crc != dump_crc) { show_dump_error_str(db, "File CRC32 incorrect", fileName); err = -3; } else err = 0; /* Check for registered external data sources */ if(((db_memsegment_header *) buf)->extdbs.count != 0) { show_dump_error(db, "Dump contains external references"); err = -2; } abort2: free(buf); abort1: fclose(f); return err; } /** Import database dump from disk. * Returns 0 when successful (no error). * -1 non-fatal error (db may continue) * -2 fatal error (should abort db) * * this function is NOT parallel-safe. Other processes accessing * db concurrently may cause undefined behaviour (including data loss) */ gint wg_import_dump(void * db,char fileName[]) { db_memsegment_header* dumph; FILE *f; db_memsegment_header* dbh = dbmemsegh(db); gint dbsize = -1, newsize; gint err = -1; #ifdef USE_DBLOG gint active = dbh->logging.active; #endif /* Attempt to open the dump file */ #ifdef _WIN32 if(fopen_s(&f, fileName, "rb")) { #else if(!(f = fopen(fileName, "rb"))) { #endif show_dump_error(db, "Error opening file"); return -1; } /* Examine the dump header. We only read the size, it is * implied that the integrity and compatibility were verified * earlier. */ dumph = (db_memsegment_header *) malloc(sizeof(db_memsegment_header)); if(!dumph) { show_dump_error(db, "malloc error in wg_import_dump"); } else if(fread(dumph, sizeof(db_memsegment_header), 1, f) != 1) { show_dump_error(db, "Error reading dump header"); } else { dbsize = dumph->free; if(dumph->extdbs.count != 0) { show_dump_error(db, "Dump contains external references"); goto abort; } } if(dumph) free(dumph); /* 0 > dbsize >= dbh->size indicates that we were able to read the dump * and it contained a memory image that fits in our current shared memory. */ if(dbh->size < dbsize) { show_dump_error(db, "Data does not fit in shared memory area"); } else if(dbsize > 0) { /* We have a compatible dump file. 
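 * The image is read straight over the current shared memory segment: the
 * segment's own size is preserved (saved in newsize and restored after a
 * successful read) and the checksum field is cleared, since it only
 * describes the on-disk image.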
*/ newsize = dbh->size; fseek(f, 0, SEEK_SET); if(fread(dbmemseg(db), dbsize, 1, f) != 1) { show_dump_error(db, "Error reading dump file"); err = -2; /* database is in undetermined state now */ } else { err = 0; dbh->size = newsize; dbh->checksum = 0; } } abort: fclose(f); /* any errors up to now? */ if(err) return err; /* Initialize db state */ #ifdef USE_DBLOG /* restart logging */ dbh->logging.dirty = 0; dbh->logging.active = 0; if(active) { /* state inherited from memory */ if(wg_start_logging(db)) { return -2; /* Failed to re-initialize log */ } } #endif return wg_init_locks(db); } /* ------------ error handling ---------------- */ static gint show_dump_error(void *db, char *errmsg) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"wg dump error: %s.\n", errmsg); #endif return -1; } static gint show_dump_error_str(void *db, char *errmsg, char *str) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"wg dump error: %s: %s.\n", errmsg, str); #endif return -1; } #ifdef __cplusplus } #endif whitedb-0.7.2/Db/dbdump.h000066400000000000000000000025721226454622500151350ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Andri Rebane 2009 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbdump.h * Public headers for memory dumping to the disk. */ #ifndef DEFINED_DBDUMP_H #define DEFINED_DBDUMP_H #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif /* ====== data structures ======== */ /* ==== Protos ==== */ gint wg_dump(void * db,char fileName[]); /* dump shared memory database to the disk */ gint wg_dump_internal(void * db,char fileName[], int locking); /* handle the dump */ gint wg_import_dump(void * db,char fileName[]); /* import database from the disk */ gint wg_check_dump(void *db, char fileName[], gint *mixsize, gint *maxsize); /* check the dump file and get the db size */ #endif /* DEFINED_DBDUMP_H */ whitedb-0.7.2/Db/dbfeatures.h000066400000000000000000000041101226454622500157740ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Priit Järv 2010 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . 
* */ /** @file dbfeatures.h * Constructs bit vector of libwgdb compile-time features */ #ifndef DEFINED_DBFEATURES_H #define DEFINED_DBFEATURES_H #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif /* Used to check for individual features */ #define FEATURE_BITS_64BIT 0x1 #define FEATURE_BITS_QUEUED_LOCKS 0x2 #define FEATURE_BITS_TTREE_CHAINED 0x4 #define FEATURE_BITS_BACKLINK 0x8 #define FEATURE_BITS_CHILD_DB 0x10 #define FEATURE_BITS_INDEX_TMPL 0x20 /* Construct the bit vector */ #ifdef HAVE_64BIT_GINT #define FEATURE_BITS_01 FEATURE_BITS_64BIT #else #define FEATURE_BITS_01 0x0 #endif #if (LOCK_PROTO==3) #define FEATURE_BITS_02 FEATURE_BITS_QUEUED_LOCKS #else #define FEATURE_BITS_02 0x0 #endif #ifdef TTREE_CHAINED_NODES #define FEATURE_BITS_03 FEATURE_BITS_TTREE_CHAINED #else #define FEATURE_BITS_03 0x0 #endif #ifdef USE_BACKLINKING #define FEATURE_BITS_04 FEATURE_BITS_BACKLINK #else #define FEATURE_BITS_04 0x0 #endif #ifdef USE_CHILD_DB #define FEATURE_BITS_05 FEATURE_BITS_CHILD_DB #else #define FEATURE_BITS_05 0x0 #endif #ifdef USE_INDEX_TEMPLATE #define FEATURE_BITS_06 FEATURE_BITS_INDEX_TMPL #else #define FEATURE_BITS_06 0x0 #endif #define MEMSEGMENT_FEATURES (FEATURE_BITS_01 |\ FEATURE_BITS_02 |\ FEATURE_BITS_03 |\ FEATURE_BITS_04 |\ FEATURE_BITS_05 |\ FEATURE_BITS_06) #endif /* DEFINED_DBFEATURES_H */ whitedb-0.7.2/Db/dbhash.c000066400000000000000000000653631226454622500151150ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009 * Copyright (c) Priit Järv 2013 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbhash.c * Hash operations for strings and other datatypes. * * */ /* ====== Includes =============== */ #include #include #include #include #ifdef __cplusplus extern "C" { #endif #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "dbhash.h" #include "dbdata.h" #include "dbmpool.h" /* ====== Private headers and defs ======== */ /* Bucket capacity > 1 reduces the impact of collisions */ #define GINTHASH_BUCKETCAP 7 /* Level 24 hash consumes approx 640MB with bucket capacity 3 on 32-bit * architecture and about twice as much on 64-bit systems. With bucket * size increased to 7 (which is more space efficient due to imperfect * hash distribution) we can reduce the level by 1 for the same space * requirements. */ #define GINTHASH_MAXLEVEL 23 /* rehash keys (useful for lowering the impact of bad distribution) */ #define GINTHASH_SCRAMBLE(v) (rehash_gint(v)) /*#define GINTHASH_SCRAMBLE(v) (v)*/ typedef struct { gint level; /* local level */ gint fill; /* slots filled / next slot index */ gint key[GINTHASH_BUCKETCAP + 1]; /* includes one overflow slot */ gint value[GINTHASH_BUCKETCAP + 1]; } ginthash_bucket; /* Dynamic local memory hashtable for gint key/value pairs. Resize * is handled using the extendible hashing algorithm. 
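/* ===== Editor's illustrative sketch (not part of the original WhiteDB source) =====
 * dbfeatures.h above folds the compile-time options into a single bit vector
 * (MEMSEGMENT_FEATURES).  The sketch below shows how a feature vector taken
 * from another memory image could be compared against the running build.
 * The `image_features' parameter is a hypothetical input, not an actual
 * field name from this code base.
 */
static int example_features_compatible(gint image_features)
{
  /* Any differing compile-time option makes the memory images incompatible,
   * so a plain equality test is the safe check. */
  if(image_features != MEMSEGMENT_FEATURES)
    return 0;
  /* Individual bits can still be inspected for diagnostics: */
  if(MEMSEGMENT_FEATURES & FEATURE_BITS_64BIT) {
    /* this build uses 64-bit gints */
  }
  return 1;
}
/* ===== end of editor's sketch ===== */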
* Note: we don't use 0-level hash, so buckets[0] is unused. */ typedef struct { gint level; /* global level */ ginthash_bucket **directory; /* bucket pointers, contiguous memory */ void *mpool; /* dbmpool storage */ } ext_ginthash; #ifdef HAVE_64BIT_GINT #define FNV_offset_basis ((wg_uint) 14695981039346656037ULL) #define FNV_prime ((wg_uint) 1099511628211ULL) #else #define FNV_offset_basis ((wg_uint) 2166136261UL) #define FNV_prime ((wg_uint) 16777619UL) #endif /* ======= Private protos ================ */ // static gint show_consistency_error(void* db, char* errmsg); static gint show_consistency_error_nr(void* db, char* errmsg, gint nr) ; // static gint show_consistency_error_double(void* db, char* errmsg, double nr); // static gint show_consistency_error_str(void* db, char* errmsg, char* str); static gint show_hash_error(void* db, char* errmsg); static gint show_ginthash_error(void *db, char* errmsg); static wg_uint hash_bytes(void *db, char *data, gint length, gint hashsz); static gint find_idxhash_bucket(void *db, char *data, gint length, gint *chainoffset); static gint rehash_gint(gint val); static gint grow_ginthash(void *db, ext_ginthash *tbl); static ginthash_bucket *ginthash_newbucket(void *db, ext_ginthash *tbl); static ginthash_bucket *ginthash_splitbucket(void *db, ext_ginthash *tbl, ginthash_bucket *bucket); static gint add_to_bucket(ginthash_bucket *bucket, gint key, gint value); static gint remove_from_bucket(ginthash_bucket *bucket, int idx); /* ====== Functions ============== */ /* ------------- strhash operations ------------------- */ /* Hash function for two-part strings and blobs. * * Based on sdbm. * */ int wg_hash_typedstr(void* db, char* data, char* extrastr, gint type, gint length) { char* endp; unsigned long hash = 0; int c; //printf("in wg_hash_typedstr %s %s %d %d \n",data,extrastr,type,length); if (data!=NULL) { for(endp=data+length; datastrhash_area_header).arraylength); } /* Find longstr from strhash bucket chain * * */ gint wg_find_strhash_bucket(void* db, char* data, char* extrastr, gint type, gint size, gint hashchain) { //printf("wg_find_strhash_bucket called %s %s type %d size %d hashchain %d\n",data,extrastr,type,size,hashchain); for(;hashchain!=0; hashchain=dbfetch(db,decode_longstr_offset(hashchain)+LONGSTR_HASHCHAIN_POS*sizeof(gint))) { if (wg_right_strhash_bucket(db,hashchain,data,extrastr,type,size)) { // found equal longstr, return it //printf("wg_find_strhash_bucket found hashchain %d\n",hashchain); return hashchain; } } return 0; } /* Check whether longstr hash bucket matches given new str * * */ int wg_right_strhash_bucket (void* db, gint longstr, char* cstr, char* cextrastr, gint ctype, gint cstrsize) { char* str; char* extrastr; int strsize; gint type; //printf("wg_right_strhash_bucket called with %s %s type %d size %d\n", // cstr,cextrastr,ctype,cstrsize); type=wg_get_encoded_type(db,longstr); if (type!=ctype) return 0; strsize=wg_decode_str_len(db,longstr)+1; if (strsize!=cstrsize) return 0; str=wg_decode_str(db,longstr); if ((cstr==NULL && str!=NULL) || (cstr!=NULL && str==NULL)) return 0; if ((cstr!=NULL) && (memcmp(str,cstr,cstrsize))) return 0; extrastr=wg_decode_str_lang(db,longstr); if ((cextrastr==NULL && extrastr!=NULL) || (cextrastr!=NULL && extrastr==NULL)) return 0; if ((cextrastr!=NULL) && (strcmp(extrastr,cextrastr))) return 0; return 1; } /* Remove longstr from strhash * * Internal langstr etc are not removed by this op. 
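/* ===== Editor's illustrative sketch (not part of the original WhiteDB source) =====
 * The string hash above is documented as being "based on sdbm".  The classic
 * sdbm recurrence is hash = c + (hash << 6) + (hash << 16) - hash; a
 * self-contained version over a byte buffer, truncated to a table size, is
 * shown below.  It illustrates the idea only and is not claimed to be
 * byte-for-byte identical to wg_hash_typedstr().
 */
static unsigned long example_sdbm_hash(const char *data, long length,
                                       unsigned long tablesize)
{
  unsigned long hash = 0;
  const char *endp;

  for(endp = data + length; data < endp; data++) {
    int c = (int)(unsigned char) *data;
    hash = c + (hash << 6) + (hash << 16) - hash;
  }
  return hash % tablesize;   /* bucket index in a fixed-size hash array */
}
/* ===== end of editor's sketch ===== */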
* */ gint wg_remove_from_strhash(void* db, gint longstr) { db_memsegment_header* dbh = dbmemsegh(db); gint type; gint* extrastrptr; char* extrastr; char* data; gint length; gint hash; gint chainoffset; gint hashchain; gint nextchain; gint offset; gint* objptr; gint fldval; gint objsize; gint strsize; gint* typeptr; //printf("wg_remove_from_strhash called on %d\n",longstr); //wg_debug_print_value(db,longstr); //printf("\n\n"); offset=decode_longstr_offset(longstr); objptr=(gint*) offsettoptr(db,offset); // get string data elements //type=objptr=offsettoptr(db,decode_longstr_offset(data)); extrastrptr=(gint *) (((char*)(objptr))+(LONGSTR_EXTRASTR_POS*sizeof(gint))); fldval=*extrastrptr; if (fldval==0) extrastr=NULL; else extrastr=wg_decode_str(db,fldval); data=((char*)(objptr))+(LONGSTR_HEADER_GINTS*sizeof(gint)); objsize=getusedobjectsize(*objptr); strsize=objsize-(((*(objptr+LONGSTR_META_POS))&LONGSTR_META_LENDIFMASK)>>LONGSTR_META_LENDIFSHFT); length=strsize; typeptr=(gint*)(((char*)(objptr))+(+LONGSTR_META_POS*sizeof(gint))); type=(*typeptr)&LONGSTR_META_TYPEMASK; //type=wg_get_encoded_type(db,longstr); // get hash of data elements and find the location in hashtable/chains hash=wg_hash_typedstr(db,data,extrastr,type,length); chainoffset=((dbh->strhash_area_header).arraystart)+(sizeof(gint)*hash); hashchain=dbfetch(db,chainoffset); while(hashchain!=0) { if (hashchain==longstr) { nextchain=dbfetch(db,decode_longstr_offset(hashchain)+(LONGSTR_HASHCHAIN_POS*sizeof(gint))); dbstore(db,chainoffset,nextchain); return 0; } chainoffset=decode_longstr_offset(hashchain)+(LONGSTR_HASHCHAIN_POS*sizeof(gint)); hashchain=dbfetch(db,chainoffset); } show_consistency_error_nr(db,"string not found in hash during deletion, offset",offset); return -1; } /* -------------- hash index support ------------------ */ #define CONCAT_FOR_HASHING(d, b, e, l, bb, en) \ if(e) { \ gint xl = wg_decode_xmlliteral_xsdtype_len(d, en); \ bb = malloc(xl + l + 1); \ if(!bb) \ return 0; \ memcpy(bb, e, xl); \ bb[xl] = '\0'; \ memcpy(bb + xl + 1, b, l); \ b = bb; \ l += xl + 1; \ } /* * Return an encoded value as a decoded byte array. * It should be freed afterwards. * returns the number of bytes in the array. * returns 0 if the decode failed. * * NOTE: to differentiate between identical byte strings * the value is prefixed with a type identifier. * TODO: For values with varying length that can contain * '\0' bytes, add length to the prefix. 
*/ gint wg_decode_for_hashing(void *db, gint enc, char **decbytes) { gint len; gint type; gint ptrdata; int intdata; double doubledata; char *bytedata; char *exdata, *buf = NULL, *outbuf; type = wg_get_encoded_type(db, enc); switch(type) { case WG_NULLTYPE: len = sizeof(gint); ptrdata = 0; bytedata = (char *) &ptrdata; break; case WG_RECORDTYPE: len = sizeof(gint); ptrdata = (gint) wg_decode_record(db, enc); bytedata = (char *) &ptrdata; break; case WG_INTTYPE: len = sizeof(int); intdata = wg_decode_int(db, enc); bytedata = (char *) &intdata; break; case WG_DOUBLETYPE: len = sizeof(double); doubledata = wg_decode_double(db, enc); bytedata = (char *) &doubledata; break; case WG_FIXPOINTTYPE: len = sizeof(double); doubledata = wg_decode_fixpoint(db, enc); bytedata = (char *) &doubledata; break; case WG_STRTYPE: len = wg_decode_str_len(db, enc); bytedata = wg_decode_str(db, enc); break; case WG_URITYPE: len = wg_decode_uri_len(db, enc); bytedata = wg_decode_uri(db, enc); exdata = wg_decode_uri_prefix(db, enc); CONCAT_FOR_HASHING(db, bytedata, exdata, len, buf, enc) break; case WG_XMLLITERALTYPE: len = wg_decode_xmlliteral_len(db, enc); bytedata = wg_decode_xmlliteral(db, enc); exdata = wg_decode_xmlliteral_xsdtype(db, enc); CONCAT_FOR_HASHING(db, bytedata, exdata, len, buf, enc) break; case WG_CHARTYPE: len = sizeof(int); intdata = wg_decode_char(db, enc); bytedata = (char *) &intdata; break; case WG_DATETYPE: len = sizeof(int); intdata = wg_decode_date(db, enc); bytedata = (char *) &intdata; break; case WG_TIMETYPE: len = sizeof(int); intdata = wg_decode_time(db, enc); bytedata = (char *) &intdata; break; case WG_VARTYPE: len = sizeof(int); intdata = wg_decode_var(db, enc); bytedata = (char *) &intdata; break; case WG_ANONCONSTTYPE: /* Ignore anonconst */ default: return 0; } /* Form the hashable buffer. It is not 0-terminated */ outbuf = malloc(len + 1); if(outbuf) { outbuf[0] = (char) type; memcpy(outbuf + 1, bytedata, len++); *decbytes = outbuf; } else { /* Indicate failure */ len = 0; } if(buf) free(buf); return len; } /* * Calculate a hash for a byte buffer. Truncates the hash to given size. */ static wg_uint hash_bytes(void *db, char *data, gint length, gint hashsz) { char* endp; wg_uint hash = 0; if (data!=NULL) { for(endp=data+length; dataarraylength); head_offset = (ha->arraystart)+(sizeof(gint) * hash); head = dbfetch(db, head_offset); /* Traverse the hash chain to check if there is a matching * hash string already */ bucket = find_idxhash_bucket(db, data, length, &head_offset); if(!bucket) { size_t i; gint lengints, lenrest; char* dptr; /* Make a new bucket */ lengints = length / sizeof(gint); lenrest = length % sizeof(gint); if(lenrest) lengints++; bucket = wg_alloc_gints(db, &(dbh->indexhash_area_header), lengints + HASHIDX_HEADER_SIZE); if(!bucket) { return -1; } /* Copy the byte data */ dptr = (char *) (offsettoptr(db, bucket + HASHIDX_HEADER_SIZE*sizeof(gint))); memcpy(dptr, data, length); for(i=0;lenrest && iarraystart)+(sizeof(gint) * hash)), bucket); dbstore(db, bucket + HASHIDX_HASHCHAIN_POS*sizeof(gint), head); } /* Add the record offset to the list. */ rec_head = dbfetch(db, bucket + HASHIDX_RECLIST_POS*sizeof(gint)); rec_offset = wg_alloc_fixlen_object(db, &(dbh->listcell_area_header)); rec_cell = (gcell *) offsettoptr(db, rec_offset); rec_cell->car = offset; rec_cell->cdr = rec_head; dbstore(db, bucket + HASHIDX_RECLIST_POS*sizeof(gint), rec_offset); return 0; } /* * Remove an offset from the index hash. * * Returns 0 on success * Returns -1 on error. 
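/* ===== Editor's illustrative sketch (not part of the original WhiteDB source) =====
 * Shows how the decode-then-store pattern suggested by wg_decode_for_hashing()
 * and wg_idxhash_store() above could be used to index one field of a record.
 * This mirrors, but is not taken verbatim from, the internal hash_add_row()
 * logic; `ha' is an already initialized hash area and free() pairs with the
 * malloc() done inside the decoder (stdlib is assumed to be in scope).
 */
static gint example_hash_one_field(void *db, db_hash_area_header *ha,
                                   void *rec, gint column)
{
  char *bytes = NULL;
  gint enc = wg_get_field(db, rec, column);
  gint len = wg_decode_for_hashing(db, enc, &bytes);
  gint err;

  if(len <= 0)
    return -1;                     /* type not hashable or decode failed */
  err = wg_idxhash_store(db, ha, bytes, len, ptrtooffset(db, rec));
  free(bytes);                     /* buffer was allocated by the decoder */
  return err;
}
/* ===== end of editor's sketch ===== */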
*/ gint wg_idxhash_remove(void* db, db_hash_area_header *ha, char* data, gint length, gint offset) { wg_uint hash; gint bucket_offset, bucket; gint *next_offset, *reclist_offset; hash = hash_bytes(db, data, length, ha->arraylength); bucket_offset = (ha->arraystart)+(sizeof(gint) * hash); /* points to head */ /* Find the correct bucket. */ bucket = find_idxhash_bucket(db, data, length, &bucket_offset); if(!bucket) { return show_hash_error(db, "wg_idxhash_remove: Hash value not found."); } /* Remove the record offset from the list. */ reclist_offset = offsettoptr(db, bucket + HASHIDX_RECLIST_POS*sizeof(gint)); next_offset = reclist_offset; while(*next_offset) { gcell *rec_cell = (gcell *) offsettoptr(db, *next_offset); if(rec_cell->car == offset) { gint rec_offset = *next_offset; *next_offset = rec_cell->cdr; /* remove from list chain */ wg_free_listcell(db, rec_offset); /* free storage */ goto is_bucket_empty; } next_offset = &(rec_cell->cdr); } return show_hash_error(db, "wg_idxhash_remove: Offset not found"); is_bucket_empty: if(!(*reclist_offset)) { gint nextchain = dbfetch(db, bucket + HASHIDX_HASHCHAIN_POS*sizeof(gint)); dbstore(db, bucket_offset, nextchain); wg_free_object(db, &(dbmemsegh(db)->indexhash_area_header), bucket); } return 0; } /* * Retrieve the list of matching offsets from the hash. * * Returns the offset to head of the linked list. * Returns 0 if value was not found. */ gint wg_idxhash_find(void* db, db_hash_area_header *ha, char* data, gint length) { wg_uint hash; gint head_offset, bucket; hash = hash_bytes(db, data, length, ha->arraylength); head_offset = (ha->arraystart)+(sizeof(gint) * hash); /* points to head */ /* Find the correct bucket. */ bucket = find_idxhash_bucket(db, data, length, &head_offset); if(!bucket) return 0; return dbfetch(db, bucket + HASHIDX_RECLIST_POS*sizeof(gint)); } /* ------- local-memory extendible gint hash ---------- */ /* * Dynamically growing gint hash. * * Implemented in local memory for temporary usage (database memory is not well * suited as it is not resizable). Uses the extendible hashing algorithm * proposed by Fagin et al '79 as this allows the use of simple, easily * disposable data structures. */ /** Initialize the hash table. * The initial hash level is 1. * returns NULL on failure. */ void *wg_ginthash_init(void *db) { ext_ginthash *tbl = malloc(sizeof(ext_ginthash)); if(!tbl) { show_ginthash_error(db, "Failed to allocate table."); return NULL; } memset(tbl, 0, sizeof(ext_ginthash)); if(grow_ginthash(db, tbl)) { /* initial level is set to 1 */ free(tbl); return NULL; } return tbl; } /** Add a key/value pair to the hash table. * tbl should be created with wg_ginthash_init() * Returns 0 on success * Returns -1 on failure */ gint wg_ginthash_addkey(void *db, void *tbl, gint key, gint val) { size_t dirsize = 1<<((ext_ginthash *)tbl)->level; size_t hash = GINTHASH_SCRAMBLE(key) & (dirsize - 1); ginthash_bucket *bucket = ((ext_ginthash *)tbl)->directory[hash]; /*static gint keys = 0;*/ /* printf("add: %d hash %d items %d\n", key, hash, ++keys); */ if(!bucket) { /* allocate a new bucket, store value, we're done */ bucket = ginthash_newbucket(db, (ext_ginthash *) tbl); if(!bucket) return -1; bucket->level = ((ext_ginthash *) tbl)->level; add_to_bucket(bucket, key, val); /* Always fits, no check needed */ ((ext_ginthash *)tbl)->directory[hash] = bucket; } else { add_to_bucket(bucket, key, val); while(bucket->fill > GINTHASH_BUCKETCAP) { ginthash_bucket *newb; /* Overflow, bucket split needed. 
*/ if(!(newb = ginthash_splitbucket(db, (ext_ginthash *)tbl, bucket))) return -1; /* Did everything flow to the new bucket, causing another overflow? */ if(newb->fill > GINTHASH_BUCKETCAP) { bucket = newb; /* Keep splitting */ } } } return 0; } /** Fetch a value from the hash table. * If the value is not found, returns -1 (val is unmodified). * Otherwise returns 0; contents of val is replaced with the * value from the hash table. */ gint wg_ginthash_getkey(void *db, void *tbl, gint key, gint *val) { size_t dirsize = 1<<((ext_ginthash *)tbl)->level; size_t hash = GINTHASH_SCRAMBLE(key) & (dirsize - 1); ginthash_bucket *bucket = ((ext_ginthash *)tbl)->directory[hash]; if(bucket) { int i; for(i=0; ifill; i++) { if(bucket->key[i] == key) { *val = bucket->value[i]; return 0; } } } return -1; } /** Release all memory allocated for the hash table. * */ void wg_ginthash_free(void *db, void *tbl) { if(tbl) { if(((ext_ginthash *) tbl)->directory) free(((ext_ginthash *) tbl)->directory); if(((ext_ginthash *) tbl)->mpool) wg_free_mpool(db, ((ext_ginthash *) tbl)->mpool); free(tbl); } } /** Scramble a gint value * This is useful when dealing with aligned offsets, that are * multiples of 4, 8 or larger values and thus waste the majority * of the directory space when used directly. * Uses FNV-1a. */ static gint rehash_gint(gint val) { int i; wg_uint hash = FNV_offset_basis; for(i=0; ilevel + 1; if(newlevel >= GINTHASH_MAXLEVEL) return show_ginthash_error(db, "Maximum level exceeded."); if((tmp = realloc((void *) tbl->directory, (1<directory = (ginthash_bucket **) tmp; if(tbl->level) { size_t i; size_t dirsize = 1<level; /* duplicate the existing pointers. */ for(i=0; idirectory[dirsize + i] = tbl->directory[i]; } else { /* Initialize the memory pool (2 buckets) */ if((tmp = wg_create_mpool(db, 2*sizeof(ginthash_bucket)))) { tbl->mpool = tmp; /* initial directory is empty */ memset(tbl->directory, 0, 2*sizeof(ginthash_bucket *)); } else { return show_ginthash_error(db, "Failed to allocate bucket pool."); } } } else { return show_ginthash_error(db, "Failed to reallocate directory."); } tbl->level = newlevel; return 0; } /** Allocate a new bucket. * */ static ginthash_bucket *ginthash_newbucket(void *db, ext_ginthash *tbl) { ginthash_bucket *bucket = (ginthash_bucket *) \ wg_alloc_mpool(db, tbl->mpool, sizeof(ginthash_bucket)); if(bucket) { /* bucket->level = tbl->level; */ bucket->fill = 0; } return bucket; } /** Split a bucket. 
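/* ===== Editor's illustrative sketch (not part of the original WhiteDB source) =====
 * rehash_gint() above scrambles gint keys with FNV-1a over the bytes of the
 * value, using the FNV_offset_basis / FNV_prime constants defined earlier.
 * A self-contained FNV-1a over an arbitrary buffer looks like this; the
 * 32-bit constants are written out literally here, while the code above
 * switches to the 64-bit pair when gint is 64 bits wide.
 */
static unsigned long example_fnv1a(const void *data, size_t len)
{
  const unsigned char *p = (const unsigned char *) data;
  unsigned long hash = 2166136261UL;    /* 32-bit FNV offset basis */
  size_t i;

  for(i = 0; i < len; i++) {
    hash ^= p[i];                       /* XOR in one byte ... */
    hash *= 16777619UL;                 /* ... then multiply by the prime */
  }
  return hash;
}
/* ===== end of editor's sketch ===== */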
* Returns the newly created bucket on success * Returns NULL on failure (likely cause being out of memory) */ static ginthash_bucket *ginthash_splitbucket(void *db, ext_ginthash *tbl, ginthash_bucket *bucket) { gint msbmask, lowbits; int i; ginthash_bucket *newbucket; if(bucket->level == tbl->level) { /* can't split at this level anymore, extend directory */ /*printf("grow: curr level %d\n", tbl->level);*/ if(grow_ginthash(db, (ext_ginthash *) tbl)) return NULL; } /* Hash values for the new level (0+lowbits, msb+lowbits) */ msbmask = (1<<(bucket->level++)); lowbits = GINTHASH_SCRAMBLE(bucket->key[0]) & (msbmask - 1); /* Create a bucket to split into */ newbucket = ginthash_newbucket(db, tbl); if(!newbucket) return NULL; newbucket->level = bucket->level; /* Split the entries based on the most significant * bit for the local level hash (the ones with msb set are relocated) */ for(i=bucket->fill-1; i>=0; i--) { gint k_i = bucket->key[i]; if(GINTHASH_SCRAMBLE(k_i) & msbmask) { add_to_bucket(newbucket, k_i, remove_from_bucket(bucket, i)); /* printf("reassign: %d hash %d --> %d\n", k_i, lowbits, msbmask | lowbits); */ } } /* Update the directory */ if(bucket->level == tbl->level) { /* There are just two pointers pointing to bucket, * we can compute the location of the one that has the index * with msb set. The other one's contents do not need to be * modified. */ tbl->directory[msbmask | lowbits] = newbucket; } else { /* The pointers that need to be updated have indexes * of xxx1yyyy where 1 is the msb in the index of the new * bucket, yyyy is the hash value of the bucket masked * by the previous level and xxx are all combinations of * bits that still remain masked by the local level after * the split. The pointers xxx0yyyy will remain pointing * to the old bucket. */ size_t msbbuckets = 1<<(tbl->level - bucket->level), j; for(j=0; jlevel) | msbmask | lowbits; /* XXX: paranoia check, remove in production */ if(tbl->directory[k] != bucket) return NULL; tbl->directory[k] = newbucket; } } return newbucket; } /** Add a key/value pair to bucket. * Returns bucket fill. */ static gint add_to_bucket(ginthash_bucket *bucket, gint key, gint value) { #ifdef CHECK if(bucket->fill > GINTHASH_BUCKETCAP) { /* Should never happen */ return bucket->fill + 1; } else { #endif bucket->key[bucket->fill] = key; bucket->value[bucket->fill] = value; return ++(bucket->fill); #ifdef CHECK } #endif } /** Remove an indexed value from bucket. * Returns the value. */ static gint remove_from_bucket(ginthash_bucket *bucket, int idx) { int i; gint val = bucket->value[idx]; for(i=idx; i=bucket->fill are always undefined * and shouldn't be accessed directly. 
*/ bucket->key[i] = bucket->key[i+1]; bucket->value[i] = bucket->value[i+1]; } bucket->fill--; return val; } /* ------------- error handling ------------------- */ /* static gint show_consistency_error(void* db, char* errmsg) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"wg consistency error: %s\n",errmsg); #endif return -1; } */ static gint show_consistency_error_nr(void* db, char* errmsg, gint nr) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"wg consistency error: %s %d\n", errmsg, (int) nr); return -1; #endif } /* static gint show_consistency_error_double(void* db, char* errmsg, double nr) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"wg consistency error: %s %f\n",errmsg,nr); #endif return -1; } static gint show_consistency_error_str(void* db, char* errmsg, char* str) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"wg consistency error: %s %s\n",errmsg,str); #endif return -1; } */ static gint show_hash_error(void* db, char* errmsg) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"wg hash error: %s\n",errmsg); #endif return -1; } static gint show_ginthash_error(void *db, char* errmsg) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"wg gint hash error: %s\n", errmsg); #endif return -1; } /* #include "pstdint.h" // Replace with if appropriate #undef get16bits #if (defined(__GNUC__) && defined(__i386__)) || defined(__WATCOMC__) \ || defined(_MSC_VER) || defined (__BORLANDC__) || defined (__TURBOC__) #define get16bits(d) (*((const uint16_t *) (d))) #endif #if !defined (get16bits) #define get16bits(d) ((((uint32_t)(((const uint8_t *)(d))[1])) << 8)\ +(uint32_t)(((const uint8_t *)(d))[0]) ) #endif uint32_t SuperFastHash (const char * data, int len) { uint32_t hash = len, tmp; int rem; if (len <= 0 || data == NULL) return 0; rem = len & 3; len >>= 2; // Main loop for (;len > 0; len--) { hash += get16bits (data); tmp = (get16bits (data+2) << 11) ^ hash; hash = (hash << 16) ^ tmp; data += 2*sizeof (uint16_t); hash += hash >> 11; } // Handle end cases switch (rem) { case 3: hash += get16bits (data); hash ^= hash << 16; hash ^= data[sizeof (uint16_t)] << 18; hash += hash >> 11; break; case 2: hash += get16bits (data); hash ^= hash << 11; hash += hash >> 17; break; case 1: hash += *data; hash ^= hash << 10; hash += hash >> 1; } // Force "avalanching" of final 127 bits hash ^= hash << 3; hash += hash >> 5; hash ^= hash << 4; hash += hash >> 17; hash ^= hash << 25; hash += hash >> 6; return hash; } */ #ifdef __cplusplus } #endif whitedb-0.7.2/Db/dbhash.h000066400000000000000000000042101226454622500151020ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009 * Copyright (c) Priit Järv 2013 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbhash.h * Public headers for hash-related procedures. 
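/* ===== Editor's illustrative sketch (not part of the original WhiteDB source) =====
 * Minimal usage of the local-memory extendible gint hash whose prototypes
 * appear in dbhash.h below: create a table, map a couple of gint keys to
 * values, look one up and release the table.  The keys and values are
 * arbitrary placeholders.
 */
void example_ginthash_usage(void *db)
{
  void *tbl = wg_ginthash_init(db);
  gint val;

  if(!tbl)
    return;
  wg_ginthash_addkey(db, tbl, 1024, 42);   /* key 1024 -> value 42 */
  wg_ginthash_addkey(db, tbl, 2048, 43);
  if(!wg_ginthash_getkey(db, tbl, 1024, &val)) {
    /* val is 42 here; a non-zero return would mean the key is absent */
  }
  wg_ginthash_free(db, tbl);
}
/* ===== end of editor's sketch ===== */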
*/ #ifndef DEFINED_DBHASH_H #define DEFINED_DBHASH_H #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "dballoc.h" /* ==== Public macros ==== */ #define HASHIDX_META_POS 1 #define HASHIDX_RECLIST_POS 2 #define HASHIDX_HASHCHAIN_POS 3 #define HASHIDX_HEADER_SIZE 4 /* ==== Protos ==== */ int wg_hash_typedstr(void* db, char* data, char* extrastr, gint type, gint length); gint wg_find_strhash_bucket(void* db, char* data, char* extrastr, gint type, gint size, gint hashchain); int wg_right_strhash_bucket (void* db, gint longstr, char* cstr, char* cextrastr, gint ctype, gint cstrsize); gint wg_remove_from_strhash(void* db, gint longstr); gint wg_decode_for_hashing(void *db, gint enc, char **decbytes); gint wg_idxhash_store(void* db, db_hash_area_header *ha, char* data, gint length, gint offset); gint wg_idxhash_remove(void* db, db_hash_area_header *ha, char* data, gint length, gint offset); gint wg_idxhash_find(void* db, db_hash_area_header *ha, char* data, gint length); void *wg_ginthash_init(void *db); gint wg_ginthash_addkey(void *db, void *tbl, gint key, gint val); gint wg_ginthash_getkey(void *db, void *tbl, gint key, gint *val); void wg_ginthash_free(void *db, void *tbl); #endif /* DEFINED_DBHASH_H */ whitedb-0.7.2/Db/dbindex.c000066400000000000000000002733011226454622500152720ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Enar Reilent 2009, Priit Järv 2010,2011 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbindex.c * Implementation of T-tree index */ /* ====== Includes =============== */ #include #include #include #ifdef __cplusplus extern "C" { #endif #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "dbdata.h" #include "dbindex.h" #include "dbcompare.h" #include "dbhash.h" /* ====== Private defs =========== */ #define LL_CASE 0 #define LR_CASE 1 #define RL_CASE 2 #define RR_CASE 3 #ifndef max #define max(a,b) (a>b ? 
a : b) #endif #define HASHIDX_OP_STORE 1 #define HASHIDX_OP_REMOVE 2 #define HASHIDX_OP_FIND 3 /* ======= Private protos ================ */ #ifndef TTREE_SINGLE_COMPARE static gint db_find_bounding_tnode(void *db, gint rootoffset, gint key, gint *result, struct wg_tnode *rb_node); #endif static int db_which_branch_causes_overweight(void *db, struct wg_tnode *root); static int db_rotate_ttree(void *db, gint index_id, struct wg_tnode *root, int overw); static gint ttree_add_row(void *db, gint index_id, void *rec); static gint ttree_remove_row(void *db, gint index_id, void * rec); static gint create_ttree_index(void *db, gint index_id); static gint drop_ttree_index(void *db, gint column); static gint insert_into_list(void *db, gint *head, gint value); static void delete_from_list(void *db, gint *head); #ifdef USE_INDEX_TEMPLATE static gint add_index_template(void *db, gint *matchrec, gint reclen); static gint find_index_template(void *db, gint *matchrec, gint reclen); static gint remove_index_template(void *db, gint template_offset); #endif static gint hash_add_row(void *db, gint index_id, void *rec); static gint hash_remove_row(void *db, gint index_id, void *rec); static gint hash_recurse(void *db, wg_index_header *hdr, char *prefix, gint prefixlen, gint *values, gint count, void *rec, gint op, gint expand); static gint hash_extend_prefix(void *db, wg_index_header *hdr, char *prefix, gint prefixlen, gint nextval, gint *values, gint count, void *rec, gint op, gint expand); static gint create_hash_index(void *db, gint index_id); static gint drop_hash_index(void *db, gint index_id); static gint sort_columns(gint *sorted_cols, gint *columns, gint col_count); static gint show_index_error(void* db, char* errmsg); static gint show_index_error_nr(void* db, char* errmsg, gint nr); /* ====== Functions ============== */ /* * Index implementation: * - T-Tree, as described by Lehman & Carey '86 * This includes search with a single compare per node, enabled by * defining TTREE_SINGLE_COMPARE * * - improvements loosely based on T* tree (Kim & Choi '96) * Nodes have predecessor and successor pointers. This is turned * on by defining TTREE_CHAINED_NODES. Other alterations described in * the original T* tree paper were not implemented. * * - hash index (allows multi-column indexes) (not done yet) * * Index metainfo: * data about indexes in system is stored in dbh->index_control_area_header * * index_table[] - 0 - 0 - v - 0 - 0 - v - 0 * | | * index hdr A <--- list elem list elem ---> index hdr B * ^ 0 v * | | * ----------------------- list elem * 0 * * index_table is a fixed size array that contains offsets to index * lists by database field (column) number. Index lists themselves contain * offsets to index headers. This arrangement is used so that one * index can be referred to from several fields (index headers are * unique, index list elements are not). * * In the above example, A is a (hash) index on columns 2 and 5, while B * is an index on column 5. * * Note: offset to index header struct is also used as an index id. 
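/* ===== Editor's illustrative sketch (not part of the original WhiteDB source) =====
 * The metainfo comment above describes a fixed-size index_table[] whose
 * per-column entries point to lists of index headers, with the header offset
 * doubling as the index id.  The sketch below walks such a list using a
 * deliberately simplified, hypothetical list element struct; the real
 * structures live in dbindex.h and differ in their exact fields.
 */
typedef struct example_ilist_elem {
  gint header_offset;   /* offset of an index header, 0 if unused */
  gint next;            /* offset of the next list element, 0 terminates */
} example_ilist_elem;

static gint example_first_index_for_column(void *db, gint *index_table,
                                           gint column)
{
  gint elem_offset = index_table[column];
  while(elem_offset) {
    example_ilist_elem *e =
      (example_ilist_elem *) offsettoptr(db, elem_offset);
    if(e->header_offset)
      return e->header_offset;   /* offset doubles as the index id */
    elem_offset = e->next;
  }
  return 0;                      /* column has no index */
}
/* ===== end of editor's sketch ===== */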
*/ /* ------------------- T-tree private functions ------------- */ #ifndef TTREE_SINGLE_COMPARE /** * returns bounding node offset or if no really bounding node exists, then the closest node */ static gint db_find_bounding_tnode(void *db, gint rootoffset, gint key, gint *result, struct wg_tnode *rb_node) { struct wg_tnode * node = (struct wg_tnode *)offsettoptr(db,rootoffset); /* Original tree search algorithm: compares both bounds of * the node to determine immediately if the value falls between them. */ if(WG_COMPARE(db, key, node->current_min) == WG_LESSTHAN) { /* if(key < node->current_max) */ if(node->left_child_offset != 0) return db_find_bounding_tnode(db, node->left_child_offset, key, result, NULL); else { *result = DEAD_END_LEFT_NOT_BOUNDING; return rootoffset; } } else if(WG_COMPARE(db, key, node->current_max) != WG_GREATER) { *result = REALLY_BOUNDING_NODE; return rootoffset; } else { /* if(key > node->current_max) */ if(node->right_child_offset != 0) return db_find_bounding_tnode(db, node->right_child_offset, key, result, NULL); else{ *result = DEAD_END_RIGHT_NOT_BOUNDING; return rootoffset; } } } #else /* "rightmost" node search is the improved tree search described in * the original T-tree paper. */ #define db_find_bounding_tnode wg_search_ttree_rightmost #endif /** * returns the description of imbalance - 4 cases possible * LL - left child of the left child is overweight * LR - right child of the left child is overweight * etc */ static int db_which_branch_causes_overweight(void *db, struct wg_tnode *root){ struct wg_tnode *child; if(root->left_subtree_height > root->right_subtree_height){ child = (struct wg_tnode *)offsettoptr(db,root->left_child_offset); if(child->left_subtree_height >= child->right_subtree_height)return LL_CASE; else return LR_CASE; }else{ child = (struct wg_tnode *)offsettoptr(db,root->right_child_offset); if(child->left_subtree_height > child->right_subtree_height)return RL_CASE; else return RR_CASE; } } static int db_rotate_ttree(void *db, gint index_id, struct wg_tnode *root, int overw){ gint grandparent = root->parent_offset; gint initialrootoffset = ptrtooffset(db,root); struct wg_tnode *r = NULL; struct wg_tnode *g = (struct wg_tnode *)offsettoptr(db,grandparent); wg_index_header *hdr = (wg_index_header *)offsettoptr(db,index_id); gint column = hdr->rec_field_index[0]; /* always one column for T-tree */ if(overw == LL_CASE){ /* A B * B C D A * D E -> N E C * N */ //printf("LL_CASE\n"); //save some stuff into variables for later use gint offset_left_child = root->left_child_offset; gint offset_right_grandchild = ((struct wg_tnode *)offsettoptr(db,offset_left_child))->right_child_offset; gint right_grandchild_height = ((struct wg_tnode *)offsettoptr(db,offset_left_child))->right_subtree_height; //first switch: E goes to A's left child root->left_child_offset = offset_right_grandchild; root->left_subtree_height = right_grandchild_height; if(offset_right_grandchild != 0){ ((struct wg_tnode *)offsettoptr(db,offset_right_grandchild))->parent_offset = ptrtooffset(db,root); } //second switch: A goes to B's right child ((struct wg_tnode *)offsettoptr(db,offset_left_child)) -> right_child_offset = ptrtooffset(db,root); ((struct wg_tnode *)offsettoptr(db,offset_left_child)) -> right_subtree_height = max(root->left_subtree_height,root->right_subtree_height)+1; root->parent_offset = offset_left_child; //for later grandparent fix r = (struct wg_tnode *)offsettoptr(db,offset_left_child); }else if(overw == RR_CASE){ /* A C * B C A E * D E -> B D N * N */ 
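/* ===== Editor's illustrative sketch (not part of the original WhiteDB source) =====
 * db_which_branch_causes_overweight() above classifies an imbalance into the
 * LL/LR/RL/RR cases from subtree heights.  The same decision on plain
 * integers, detached from the node structures, looks like this; the struct
 * and function names are invented for the example.
 */
typedef struct { int left_h, right_h; } example_heights;

static int example_rotation_case(example_heights parent,
                                 example_heights taller_child)
{
  if(parent.left_h > parent.right_h)   /* left subtree is overweight */
    return (taller_child.left_h >= taller_child.right_h) ? LL_CASE : LR_CASE;
  else                                 /* right subtree is overweight */
    return (taller_child.left_h > taller_child.right_h) ? RL_CASE : RR_CASE;
}
/* ===== end of editor's sketch ===== */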
//printf("RR_CASE\n"); //save some stuff into variables for later use gint offset_right_child = root->right_child_offset; gint offset_left_grandchild = ((struct wg_tnode *)offsettoptr(db,offset_right_child))->left_child_offset; gint left_grandchild_height = ((struct wg_tnode *)offsettoptr(db,offset_right_child))->left_subtree_height; //first switch: D goes to A's right child root->right_child_offset = offset_left_grandchild; root->right_subtree_height = left_grandchild_height; if(offset_left_grandchild != 0){ ((struct wg_tnode *)offsettoptr(db,offset_left_grandchild))->parent_offset = ptrtooffset(db,root); } //second switch: A goes to C's left child ((struct wg_tnode *)offsettoptr(db,offset_right_child)) -> left_child_offset = ptrtooffset(db,root); ((struct wg_tnode *)offsettoptr(db,offset_right_child)) -> left_subtree_height = max(root->right_subtree_height,root->left_subtree_height)+1; root->parent_offset = offset_right_child; //for later grandparent fix r = (struct wg_tnode *)offsettoptr(db,offset_right_child); }else if(overw == LR_CASE){ /* A E * B C B A * D E -> D F G C * F G N * N */ struct wg_tnode *bb, *ee; //save some stuff into variables for later use gint offset_left_child = root->left_child_offset; gint offset_right_grandchild = ((struct wg_tnode *)offsettoptr(db,offset_left_child))->right_child_offset; //first swtich: G goes to A's left child ee = (struct wg_tnode *)offsettoptr(db,offset_right_grandchild); root -> left_child_offset = ee -> right_child_offset; root -> left_subtree_height = ee -> right_subtree_height; if(ee -> right_child_offset != 0){ ((struct wg_tnode *)offsettoptr(db,ee->right_child_offset))->parent_offset = ptrtooffset(db, root); } //second switch: F goes to B's right child bb = (struct wg_tnode *)offsettoptr(db,offset_left_child); bb -> right_child_offset = ee -> left_child_offset; bb -> right_subtree_height = ee -> left_subtree_height; if(ee -> left_child_offset != 0){ ((struct wg_tnode *)offsettoptr(db,ee->left_child_offset))->parent_offset = offset_left_child; } //third switch: B goes to E's left child /* The Lehman/Carey "special" LR rotation - instead of creating * an internal node with one element, the values of what will become the * left child will be moved over to the parent, thus ensuring the internal * node is adequately filled. This is only allowed if E is a leaf. 
*/ if(ee->number_of_elements == 1 && !ee->right_child_offset &&\ !ee->left_child_offset && bb->number_of_elements == WG_TNODE_ARRAY_SIZE){ int i; /* Create space for elements from B */ ee->array_of_values[bb->number_of_elements - 1] = ee->array_of_values[0]; /* All the values moved are smaller than in E */ for(i=1; inumber_of_elements; i++) ee->array_of_values[i-1] = bb->array_of_values[i]; ee->number_of_elements = bb->number_of_elements; /* Examine the new leftmost element to find current_min */ ee->current_min = wg_get_field(db, (void *)offsettoptr(db, ee->array_of_values[0]), column); bb -> number_of_elements = 1; bb -> current_max = bb -> current_min; } //then switch the nodes ee -> left_child_offset = offset_left_child; ee -> left_subtree_height = max(bb->right_subtree_height,bb->left_subtree_height)+1; bb -> parent_offset = offset_right_grandchild; //fourth switch: A goes to E's right child ee -> right_child_offset = ptrtooffset(db, root); ee -> right_subtree_height = max(root->right_subtree_height,root->left_subtree_height)+1; root -> parent_offset = offset_right_grandchild; //for later grandparent fix r = ee; }else if(overw == RL_CASE){ /* A E * C B A B * E D -> C G F D * G F N * N */ struct wg_tnode *bb, *ee; //save some stuff into variables for later use gint offset_right_child = root->right_child_offset; gint offset_left_grandchild = ((struct wg_tnode *)offsettoptr(db,offset_right_child))->left_child_offset; //first swtich: G goes to A's left child ee = (struct wg_tnode *)offsettoptr(db,offset_left_grandchild); root -> right_child_offset = ee -> left_child_offset; root -> right_subtree_height = ee -> left_subtree_height; if(ee -> left_child_offset != 0){ ((struct wg_tnode *)offsettoptr(db,ee->left_child_offset))->parent_offset = ptrtooffset(db, root); } //second switch: F goes to B's right child bb = (struct wg_tnode *)offsettoptr(db,offset_right_child); bb -> left_child_offset = ee -> right_child_offset; bb -> left_subtree_height = ee -> right_subtree_height; if(ee -> right_child_offset != 0){ ((struct wg_tnode *)offsettoptr(db,ee->right_child_offset))->parent_offset = offset_right_child; } //third switch: B goes to E's right child /* "special" RL rotation - see comments for LR_CASE */ if(ee->number_of_elements == 1 && !ee->right_child_offset &&\ !ee->left_child_offset && bb->number_of_elements == WG_TNODE_ARRAY_SIZE){ int i; /* All the values moved are larger than in E */ for(i=1; inumber_of_elements; i++) ee->array_of_values[i] = bb->array_of_values[i-1]; ee->number_of_elements = bb->number_of_elements; /* Examine the new rightmost element to find current_max */ ee->current_max = wg_get_field(db, (void *)offsettoptr(db, ee->array_of_values[ee->number_of_elements - 1]), column); /* Remaining B node array element should sit in slot 0 */ bb->array_of_values[0] = \ bb->array_of_values[bb->number_of_elements - 1]; bb -> number_of_elements = 1; bb -> current_min = bb -> current_max; } ee -> right_child_offset = offset_right_child; ee -> right_subtree_height = max(bb->right_subtree_height,bb->left_subtree_height)+1; bb -> parent_offset = offset_left_grandchild; //fourth switch: A goes to E's right child ee -> left_child_offset = ptrtooffset(db, root); ee -> left_subtree_height = max(root->right_subtree_height,root->left_subtree_height)+1; root -> parent_offset = offset_left_grandchild; //for later grandparent fix r = ee; } //fix grandparent - regardless of current 'overweight' case if(grandparent == 0){//'grandparent' is index header data r->parent_offset = 0; //TODO more error 
check here TTREE_ROOT_NODE(hdr) = ptrtooffset(db,r); }else{//grandparent is usual node //printf("change grandparent node\n"); r -> parent_offset = grandparent; if(g->left_child_offset == initialrootoffset){//new subtree must replace the left child of grandparent g->left_child_offset = ptrtooffset(db,r); g->left_subtree_height = max(r->left_subtree_height,r->right_subtree_height)+1; }else{ g->right_child_offset = ptrtooffset(db,r); g->right_subtree_height = max(r->left_subtree_height,r->right_subtree_height)+1; } } return 0; } /** inserts pointer to data row into index tree structure * * returns: * 0 - on success * -1 - if error */ static gint ttree_add_row(void *db, gint index_id, void *rec) { gint rootoffset, column; gint newvalue, boundtype, bnodeoffset, newoffset; struct wg_tnode *node; wg_index_header *hdr = (wg_index_header *)offsettoptr(db,index_id); db_memsegment_header* dbh = dbmemsegh(db); rootoffset = TTREE_ROOT_NODE(hdr); #ifdef CHECK if(rootoffset == 0){ #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"index at offset %d does not exist\n", (int) index_id); #endif return -1; } #endif column = hdr->rec_field_index[0]; /* always one column for T-tree */ //extract real value from the row (rec) newvalue = wg_get_field(db, rec, column); //find bounding node for the value bnodeoffset = db_find_bounding_tnode(db, rootoffset, newvalue, &boundtype, NULL); node = (struct wg_tnode *)offsettoptr(db,bnodeoffset); newoffset = 0;//save here the offset of newly created tnode - 0 if no node added into the tree //if bounding node exists - follow one algorithm, else the other if(boundtype == REALLY_BOUNDING_NODE){ //check if the node has room for a new entry if(node->number_of_elements < WG_TNODE_ARRAY_SIZE){ int i, j; gint cr; /* add array entry and update control data. We keep the * array sorted, smallest values left. */ for(i=0; inumber_of_elements; i++) { /* The node is small enough for naive scans to be * "good enough" inside the node. Note that we * branch into re-sort loop as early as possible * with >= operator (> would be algorithmically correct too) * since here the compare is more expensive than the slot * copying. */ cr = WG_COMPARE(db, wg_get_field(db, (void *)offsettoptr(db,node->array_of_values[i]), column), newvalue); if(cr != WG_LESSTHAN) { /* value >= newvalue */ /* Push remaining values to the right */ for(j=node->number_of_elements; j>i; j--) node->array_of_values[j] = node->array_of_values[j-1]; break; } } /* i is either number_of_elements or a vacated slot * in the array now. */ node->array_of_values[i] = ptrtooffset(db,rec); node->number_of_elements++; /* Update min. Due to the >= comparison max is preserved * in this case. Note that we are overwriting values that * WG_COMPARE() may deem equal. This is intentional, because other * parts of T-tree algorithm rely on encoded values of min/max fields * to be in sync with the leftmost/rightmost slots. */ if(i==0) { node->current_min = newvalue; } } else{ //still, insert the value here, but move minimum out of this node //get the minimum element from this node int i, j; gint cr, minvalue, minvaluerowoffset; minvalue = node->current_min; minvaluerowoffset = node->array_of_values[0]; /* Now scan for the matching slot. However, since * we already know the 0 slot will be re-filled, we * do this scan (and sort) in reverse order, compared to the case * where array had some space left. 
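/* ===== Editor's illustrative sketch (not part of the original WhiteDB source) =====
 * Inside a T-tree node the value array is kept sorted on insert by shifting
 * larger slots to the right, as in ttree_add_row() above.  The stripped-down
 * version below shows only the shifting pattern: it compares plain gints,
 * whereas the real code stores record offsets and compares the decoded field
 * values through WG_COMPARE().  The caller must guarantee room for one more
 * slot.
 */
static void example_sorted_insert(gint *arr, int fill, gint newval)
{
  int i = fill;
  while(i > 0 && arr[i-1] > newval) {
    arr[i] = arr[i-1];        /* push larger values one slot to the right */
    i--;
  }
  arr[i] = newval;            /* drop the new value into the vacated slot */
}
/* ===== end of editor's sketch ===== */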
*/ for(i=WG_TNODE_ARRAY_SIZE-1; i>0; i--) { cr = WG_COMPARE(db, wg_get_field(db, (void *)offsettoptr(db,node->array_of_values[i]), column), newvalue); if(cr != WG_GREATER) { /* value <= newvalue */ /* Push remaining values to the left */ for(j=0; jarray_of_values[j] = node->array_of_values[j+1]; break; } } /* i is either 0 or a freshly vacated slot */ node->array_of_values[i] = ptrtooffset(db,rec); /* Update minimum. Thanks to the sorted array, we know for a fact * that the minimum sits in slot 0. */ if(i==0) { node->current_min = newvalue; } else { node->current_min = wg_get_field(db, (void *)offsettoptr(db,node->array_of_values[0]), column); /* The scan for the free slot starts from the right and * tries to exit as fast as possible. So it's possible that * the rightmost slot was changed. */ if(i == WG_TNODE_ARRAY_SIZE-1) { node->current_max = newvalue; } } //proceed to the node that holds greatest lower bound - must be leaf (can be the initial bounding node) if(node->left_child_offset != 0){ #ifndef TTREE_CHAINED_NODES gint greatestlb = wg_ttree_find_glb_node(db,node->left_child_offset); #else gint greatestlb = node->pred_offset; #endif node = (struct wg_tnode *)offsettoptr(db, greatestlb); } //if the greatest lower bound node has room, insert value //otherwise make the new node as right child and put the value there if(node->number_of_elements < WG_TNODE_ARRAY_SIZE){ //add array entry and update control data node->array_of_values[node->number_of_elements] = minvaluerowoffset;//save offset, use first free slot node->number_of_elements++; node->current_max = minvalue; }else{ //create, initialize and save first value struct wg_tnode *leaf; gint newnode = wg_alloc_fixlen_object(db, &dbh->tnode_area_header); if(newnode == 0)return -1; leaf =(struct wg_tnode *)offsettoptr(db,newnode); leaf->parent_offset = ptrtooffset(db,node); leaf->left_subtree_height = 0; leaf->right_subtree_height = 0; leaf->current_max = minvalue; leaf->current_min = minvalue; leaf->number_of_elements = 1; leaf->left_child_offset = 0; leaf->right_child_offset = 0; leaf->array_of_values[0] = minvaluerowoffset; /* If the original, full node did not have a left child, then * there also wasn't a separate GLB node, so we are adding one now * as the left child. Otherwise, the new node is added as the right * child to the current GLB node. 
*/ if(bnodeoffset == ptrtooffset(db,node)) { node->left_child_offset = newnode; #ifdef TTREE_CHAINED_NODES /* Create successor / predecessor relationship */ leaf->succ_offset = ptrtooffset(db, node); leaf->pred_offset = node->pred_offset; if(node->pred_offset) { struct wg_tnode *pred = \ (struct wg_tnode *) offsettoptr(db, node->pred_offset); pred->succ_offset = newnode; } else { TTREE_MIN_NODE(hdr) = newnode; } node->pred_offset = newnode; #endif } else { #ifdef TTREE_CHAINED_NODES struct wg_tnode *succ; #endif node->right_child_offset = newnode; #ifdef TTREE_CHAINED_NODES /* Insert the new node in the sequential chain between * the original node and the GLB node found */ leaf->succ_offset = node->succ_offset; leaf->pred_offset = ptrtooffset(db, node); #ifdef CHECK if(!node->succ_offset) { show_index_error(db, "GLB with no successor, panic"); return -1; } else { #endif succ = (struct wg_tnode *) offsettoptr(db, leaf->succ_offset); succ->pred_offset = newnode; #ifdef CHECK } #endif node->succ_offset = newnode; #endif /* TTREE_CHAINED_NODES */ } newoffset = newnode; } } }//the bounding node existed - first algorithm else{// bounding node does not exist //try to insert the new value to that node - becoming new min or max //if the node has room for a new entry if(node->number_of_elements < WG_TNODE_ARRAY_SIZE){ int i; /* add entry, keeping the array sorted (see also notes for the * bounding node case. The difference this time is that we already * know if this value is becoming the new min or max). */ if(boundtype == DEAD_END_LEFT_NOT_BOUNDING) { /* our new value is the new min, push everything right */ for(i=node->number_of_elements; i>0; i--) node->array_of_values[i] = node->array_of_values[i-1]; node->array_of_values[0] = ptrtooffset(db,rec); node->current_min = newvalue; } else { /* DEAD_END_RIGHT_NOT_BOUNDING */ /* even simpler case, new value is added to the right */ node->array_of_values[node->number_of_elements] = ptrtooffset(db,rec); node->current_max = newvalue; } node->number_of_elements++; /* XXX: not clear if the empty node can occur here. Until this * is checked, we'll be paranoid and overwrite both min and max. 
*/ if(node->number_of_elements==1) { node->current_max = newvalue; node->current_min = newvalue; } }else{ //make a new node and put data there struct wg_tnode *leaf; gint newnode = wg_alloc_fixlen_object(db, &dbh->tnode_area_header); if(newnode == 0)return -1; leaf =(struct wg_tnode *)offsettoptr(db,newnode); leaf->parent_offset = ptrtooffset(db,node); leaf->left_subtree_height = 0; leaf->right_subtree_height = 0; leaf->current_max = newvalue; leaf->current_min = newvalue; leaf->number_of_elements = 1; leaf->left_child_offset = 0; leaf->right_child_offset = 0; leaf->array_of_values[0] = ptrtooffset(db,rec); newoffset = newnode; //set new node as left or right leaf if(boundtype == DEAD_END_LEFT_NOT_BOUNDING){ node->left_child_offset = newnode; #ifdef TTREE_CHAINED_NODES /* Set the new node as predecessor of the parent */ leaf->succ_offset = ptrtooffset(db, node); leaf->pred_offset = node->pred_offset; if(node->pred_offset) { /* Notify old predecessor that the node following * it changed */ struct wg_tnode *pred = \ (struct wg_tnode *) offsettoptr(db, node->pred_offset); pred->succ_offset = newnode; } else { TTREE_MIN_NODE(hdr) = newnode; } node->pred_offset = newnode; #endif }else if(boundtype == DEAD_END_RIGHT_NOT_BOUNDING){ node->right_child_offset = newnode; #ifdef TTREE_CHAINED_NODES /* Set the new node as successor of the parent */ leaf->succ_offset = node->succ_offset; leaf->pred_offset = ptrtooffset(db, node); if(node->succ_offset) { /* Notify old successor that the node preceding * it changed */ struct wg_tnode *succ = \ (struct wg_tnode *) offsettoptr(db, node->succ_offset); succ->pred_offset = newnode; } else { TTREE_MAX_NODE(hdr) = newnode; } node->succ_offset = newnode; #endif } } }//no bounding node found - algorithm 2 //if new node was added to tree - must update child height data in nodes from leaf to root //or until find a node with imbalance //then determine the bad balance case: LL, LR, RR or RL and execute proper rotation if(newoffset){ struct wg_tnode *child = (struct wg_tnode *)offsettoptr(db,newoffset); struct wg_tnode *parent; int left = 0; while(child->parent_offset != 0){//this is not a root int balance; parent = (struct wg_tnode *)offsettoptr(db,child->parent_offset); //determine which child the child is, left or right one if(parent->left_child_offset == ptrtooffset(db,child)) left = 1; else left = 0; //increment parent left or right subtree height if(left)parent->left_subtree_height++; else parent->right_subtree_height++; //check balance balance = parent->left_subtree_height - parent->right_subtree_height; if(balance == 0) { /* As a result of adding a new node somewhere below, left * and right subtrees of the node we're checking became * of EQUAL height. This means that changes in subtree heights * do not propagate any further (the max depth in this node * dit NOT change). 
*/ break; } if(balance > 1 || balance < -1){//must rebalance //the current parent is root for balancing operation //determine the branch that causes overweight int overw = db_which_branch_causes_overweight(db,parent); //fix balance db_rotate_ttree(db,index_id,parent,overw); break;//while loop because balance does not change in the next levels }else{//just proceed to the parent node child = parent; } } } return 0; } /** removes pointer to data row from index tree structure * * returns: * 0 - on success * -1 - if error, index doesnt exist * -2 - if error, no bounding node for key * -3 - if error, boundig node exists, value not * -4 - if error, tree not in balance */ static gint ttree_remove_row(void *db, gint index_id, void * rec) { int i, found; gint key, rootoffset, column, boundtype, bnodeoffset; gint rowoffset; struct wg_tnode *node, *parent; wg_index_header *hdr = (wg_index_header *)offsettoptr(db,index_id); rootoffset = TTREE_ROOT_NODE(hdr); #ifdef CHECK if(rootoffset == 0){ #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"index at offset %d does not exist\n", (int) index_id); #endif return -1; } #endif column = hdr->rec_field_index[0]; /* always one column for T-tree */ key = wg_get_field(db, rec, column); rowoffset = ptrtooffset(db, rec); /* find bounding node for the value. Since non-unique values * are allowed, we will find the leftmost node and scan * right from there (we *need* the exact row offset). */ bnodeoffset = wg_search_ttree_leftmost(db, rootoffset, key, &boundtype, NULL); node = (struct wg_tnode *)offsettoptr(db,bnodeoffset); //if bounding node does not exist - error if(boundtype != REALLY_BOUNDING_NODE) return -2; /* find the record inside the node. This is an expensive loop if there * are many repeated values, so unnecessary deleting should be avoided * on higher level. 
*/ found = -1; for(;;) { for(i=0;inumber_of_elements;i++){ if(node->array_of_values[i] == rowoffset) { found = i; goto found_row; } } bnodeoffset = TNODE_SUCCESSOR(db, node); if(!bnodeoffset) break; /* no more successors */ node = (struct wg_tnode *)offsettoptr(db,bnodeoffset); if(WG_COMPARE(db, node->current_min, key) == WG_GREATER) break; /* successor is not a bounding node */ } found_row: if(found == -1) return -3; //delete the key and rearrange other elements node->number_of_elements--; if(found < node->number_of_elements) { /* not the last element */ /* slide the elements to the right of the found value * one step to the left */ for(i=found; inumber_of_elements; i++) node->array_of_values[i] = node->array_of_values[i+1]; } /* Update min/max */ if(found==node->number_of_elements && node->number_of_elements != 0) { /* Rightmost element was removed, so new max should be updated to * the new rightmost value */ node->current_max = wg_get_field(db, (void *)offsettoptr(db, node->array_of_values[node->number_of_elements - 1]), column); } else if(found==0 && node->number_of_elements != 0) { /* current_min removed, update to new leftmost value */ node->current_min = wg_get_field(db, (void *)offsettoptr(db, node->array_of_values[0]), column); } //check underflow and take some actions if needed if(node->number_of_elements < 5){//TODO use macro //if the node is internal node - borrow its gratest lower bound from the node where it is if(node->left_child_offset != 0 && node->right_child_offset != 0){//internal node #ifndef TTREE_CHAINED_NODES gint greatestlb = wg_ttree_find_glb_node(db,node->left_child_offset); #else gint greatestlb = node->pred_offset; #endif struct wg_tnode *glbnode = (struct wg_tnode *)offsettoptr(db, greatestlb); /* Make space for a new min value */ for(i=node->number_of_elements; i>0; i--) node->array_of_values[i] = node->array_of_values[i-1]; /* take the glb value (always the rightmost in the array) and * insert it in our node */ node -> array_of_values[0] = \ glbnode->array_of_values[glbnode->number_of_elements-1]; node -> number_of_elements++; node -> current_min = glbnode -> current_max; if(node->number_of_elements == 1) /* we just got our first element */ node->current_max = glbnode -> current_max; glbnode -> number_of_elements--; //reset new max for glbnode if(glbnode->number_of_elements != 0) { glbnode->current_max = wg_get_field(db, (void *)offsettoptr(db, glbnode->array_of_values[glbnode->number_of_elements - 1]), column); } node = glbnode; } } //now variable node points to the node which really lost an element //this is definitely leaf or half-leaf //if the node is empty - free it and rebalanc the tree parent = NULL; //delete the empty leaf if(node->left_child_offset == 0 && node->right_child_offset == 0 && node->number_of_elements == 0){ if(node->parent_offset != 0){ parent = (struct wg_tnode *)offsettoptr(db, node->parent_offset); //was it left or right child if(parent->left_child_offset == ptrtooffset(db,node)){ parent->left_child_offset=0; parent->left_subtree_height=0; }else{ parent->right_child_offset=0; parent->right_subtree_height=0; } } #ifdef TTREE_CHAINED_NODES /* Remove the node from sequential chain */ if(node->succ_offset) { struct wg_tnode *succ = \ (struct wg_tnode *) offsettoptr(db, node->succ_offset); succ->pred_offset = node->pred_offset; } else { TTREE_MAX_NODE(hdr) = node->pred_offset; } if(node->pred_offset) { struct wg_tnode *pred = \ (struct wg_tnode *) offsettoptr(db, node->pred_offset); pred->succ_offset = node->succ_offset; } else { 
TTREE_MIN_NODE(hdr) = node->succ_offset; } #endif /* Free the node, unless it's the root node */ if(node != offsettoptr(db, TTREE_ROOT_NODE(hdr))) { wg_free_tnode(db, ptrtooffset(db,node)); } else { /* Set empty state of root node */ node->current_max = WG_ILLEGAL; node->current_min = WG_ILLEGAL; #ifdef TTREE_CHAINED_NODES TTREE_MAX_NODE(hdr) = TTREE_ROOT_NODE(hdr); TTREE_MIN_NODE(hdr) = TTREE_ROOT_NODE(hdr); #endif } //rebalance if needed } //or if the node was a half-leaf, see if it can be merged with its leaf if((node->left_child_offset == 0 && node->right_child_offset != 0) || (node->left_child_offset != 0 && node->right_child_offset == 0)){ int elements = node->number_of_elements; int left; struct wg_tnode *child; if(node->left_child_offset != 0){ child = (struct wg_tnode *)offsettoptr(db, node->left_child_offset); left = 1;//true }else{ child = (struct wg_tnode *)offsettoptr(db, node->right_child_offset); left = 0;//false } elements += child->number_of_elements; if(!(child->left_subtree_height == 0 && child->right_subtree_height == 0)){ show_index_error(db, "index tree is not balanced, deleting algorithm doesn't work"); return -4; } //if possible move all elements from child to node and free child if(elements <= WG_TNODE_ARRAY_SIZE){ int i = node->number_of_elements; int j; node->number_of_elements = elements; if(left){ /* Left child elements are all smaller than in current node */ for(j=i-1; j>=0; j--){ node->array_of_values[j + child->number_of_elements] = \ node->array_of_values[j]; } for(j=0;jnumber_of_elements;j++){ node->array_of_values[j]=child->array_of_values[j]; } node->left_subtree_height=0; node->left_child_offset=0; node->current_min=child->current_min; if(!i) node->current_max=child->current_max; /* parent was empty */ }else{ /* Right child elements are all larger than in current node */ for(j=0;jnumber_of_elements;j++){ node->array_of_values[i+j]=child->array_of_values[j]; } node->right_subtree_height=0; node->right_child_offset=0; node->current_max=child->current_max; if(!i) node->current_min=child->current_min; /* parent was empty */ } #ifdef TTREE_CHAINED_NODES /* Remove the child from sequential chain */ if(child->succ_offset) { struct wg_tnode *succ = \ (struct wg_tnode *) offsettoptr(db, child->succ_offset); succ->pred_offset = child->pred_offset; } else { TTREE_MAX_NODE(hdr) = child->pred_offset; } if(child->pred_offset) { struct wg_tnode *pred = \ (struct wg_tnode *) offsettoptr(db, child->pred_offset); pred->succ_offset = child->succ_offset; } else { TTREE_MIN_NODE(hdr) = child->succ_offset; } #endif wg_free_tnode(db, ptrtooffset(db, child)); if(node->parent_offset) { parent = (struct wg_tnode *)offsettoptr(db, node->parent_offset); if(parent->left_child_offset==ptrtooffset(db,node)){ parent->left_subtree_height=1; }else{ parent->right_subtree_height=1; } } } } //check balance and update subtree height data //stop when find a node where subtree heights dont change if(parent != NULL){ int balance, height; for(;;) { balance = parent->left_subtree_height - parent->right_subtree_height; if(balance > 1 || balance < -1){//must rebalance //the current parent is root for balancing operation //rotarion fixes subtree heights in grandparent //determine the branch that causes overweight int overw = db_which_branch_causes_overweight(db,parent); //fix balance db_rotate_ttree(db,index_id,parent,overw); } else if(parent->parent_offset) { struct wg_tnode *gp; //manually set grandparent subtree heights height = max(parent->left_subtree_height,parent->right_subtree_height); gp = 
(struct wg_tnode *)offsettoptr(db, parent->parent_offset); if(gp->left_child_offset==ptrtooffset(db,parent)){ gp->left_subtree_height=1+height; }else{ gp->right_subtree_height=1+height; } } if(!parent->parent_offset) break; /* root node reached */ parent = (struct wg_tnode *)offsettoptr(db, parent->parent_offset); } } return 0; } /* ------------------- T-tree public functions ---------------- */ /** * returns offset to data row: * -1 - error, index does not exist * 0 - if key NOT found * other integer - if key found (= offset to data row) * XXX: with duplicate values, which one is returned is somewhat * undetermined, so this function is mainly for early development/testing */ gint wg_search_ttree_index(void *db, gint index_id, gint key){ int i; gint rootoffset, bnodetype, bnodeoffset; gint rowoffset, column; struct wg_tnode * node; wg_index_header *hdr = (wg_index_header *)offsettoptr(db,index_id); rootoffset = TTREE_ROOT_NODE(hdr); #ifdef CHECK /* XXX: This is a rather weak check but might catch some errors */ if(rootoffset == 0){ #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"index at offset %d does not exist\n", (int) index_id); #endif return -1; } #endif /* Find the leftmost bounding node */ bnodeoffset = wg_search_ttree_leftmost(db, rootoffset, key, &bnodetype, NULL); node = (struct wg_tnode *)offsettoptr(db,bnodeoffset); if(bnodetype != REALLY_BOUNDING_NODE) return 0; column = hdr->rec_field_index[0]; /* always one column for T-tree */ /* find the record inside the node. */ for(;;) { for(i=0;inumber_of_elements;i++){ rowoffset = node->array_of_values[i]; if(WG_COMPARE(db, wg_get_field(db, (void *)offsettoptr(db,rowoffset), column), key) == WG_EQUAL) { return rowoffset; } } /* Normally we cannot end up here. We'll keep the code in case * implementation of wg_compare() changes in the future. */ bnodeoffset = TNODE_SUCCESSOR(db, node); if(!bnodeoffset) break; /* no more successors */ node = (struct wg_tnode *)offsettoptr(db,bnodeoffset); if(WG_COMPARE(db, node->current_min, key) == WG_GREATER) break; /* successor is not a bounding node */ } return 0; } /* * The following pairs of functions implement tree traversal. Only * wg_ttree_find_glb_node() is used for the upkeep of T-tree (insert, delete, * re-balance), the rest are required for sequential scan and range queries * when the tree is implemented without predecessor and successor pointers. */ #ifndef TTREE_CHAINED_NODES /** find greatest lower bound node * returns offset of the (half-) leaf node with greatest lower bound * goes only right - so: must call on the left child of the internal * which we are looking the GLB node for. */ gint wg_ttree_find_glb_node(void *db, gint nodeoffset) { struct wg_tnode * node = (struct wg_tnode *)offsettoptr(db,nodeoffset); if(node->right_child_offset != 0) return wg_ttree_find_glb_node(db, node->right_child_offset); else return nodeoffset; } /** find least upper bound node * returns offset of the (half-) leaf node with least upper bound * Call with the right child of an internal node as argument. */ gint wg_ttree_find_lub_node(void *db, gint nodeoffset) { struct wg_tnode * node = (struct wg_tnode *)offsettoptr(db,nodeoffset); if(node->left_child_offset != 0) return wg_ttree_find_lub_node(db, node->left_child_offset); else return nodeoffset; } /** find predecessor of a leaf. * Returns offset of the internal node which holds the value * immediately preceeding the current_min of the leaf. * If the search hit root (the leaf could be the leftmost one in * the tree) the function returns 0. 
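 * (Clarifying note, derived from the code below: the climb continues while
 *  the current node is a left child; it stops at the first ancestor whose
 *  right subtree contains the leaf, and that ancestor holds the value
 *  immediately preceding the leaf's current_min.)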
* This is the reverse of finding the LUB node. */ gint wg_ttree_find_leaf_predecessor(void *db, gint nodeoffset) { struct wg_tnode *node, *parent; node = (struct wg_tnode *)offsettoptr(db,nodeoffset); if(node->parent_offset) { parent = (struct wg_tnode *) offsettoptr(db, node->parent_offset); /* If the current node was left child of the parent, the immediate * parent has larger values, so we need to climb to the next * level with our search. */ if(parent->left_child_offset == nodeoffset) return wg_ttree_find_leaf_predecessor(db, node->parent_offset); } return node->parent_offset; } /** find successor of a leaf. * Returns offset of the internal node which holds the value * immediately succeeding the current_max of the leaf. * Returns 0 if there is no successor. * This is the reverse of finding the GLB node. */ gint wg_ttree_find_leaf_successor(void *db, gint nodeoffset) { struct wg_tnode *node, *parent; node = (struct wg_tnode *)offsettoptr(db,nodeoffset); if(node->parent_offset) { parent = (struct wg_tnode *) offsettoptr(db, node->parent_offset); if(parent->right_child_offset == nodeoffset) return wg_ttree_find_leaf_successor(db, node->parent_offset); } return node->parent_offset; } #endif /* TTREE_CHAINED_NODES */ /* * Functions to support range queries (and fetching multiple * duplicate values) using T-tree index. Since the nodes can be * traversed sequentially, the simplest way to implement queries that * have result sets is to find leftmost (or rightmost) value that * meets the query conditions and scan right (or left) from there. */ /** Find rightmost node containing given value * returns NULL if node was not found */ gint wg_search_ttree_rightmost(void *db, gint rootoffset, gint key, gint *result, struct wg_tnode *rb_node) { struct wg_tnode * node; #ifdef TTREE_SINGLE_COMPARE node = (struct wg_tnode *)offsettoptr(db,rootoffset); /* Improved(?) tree search algorithm with a single compare per node. * only lower bound is examined, if the value is larger the right subtree * is selected immediately. If the search ends in a dead end, the node where * the right branch was taken is examined again. */ if(WG_COMPARE(db, key, node->current_min) == WG_LESSTHAN) { /* key < node->current_min */ if(node->left_child_offset != 0) { return wg_search_ttree_rightmost(db, node->left_child_offset, key, result, rb_node); } else if (rb_node) { /* Dead end, but we still have an unexamined node left */ if(WG_COMPARE(db, key, rb_node->current_max) != WG_GREATER) { /* key<=rb_node->current_max */ *result = REALLY_BOUNDING_NODE; return ptrtooffset(db, rb_node); } } /* No left child, no rb_node or it's right bound was not interesting */ *result = DEAD_END_LEFT_NOT_BOUNDING; return rootoffset; } else { if(node->right_child_offset != 0) { /* Here we jump the gun and branch to right, ignoring the * current_max of the node (therefore avoiding one expensive * compare operation). 
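 * The skipped node is passed down as the rb_node argument so that its
 * current_max can still be examined if the descent below ends in a
 * dead end to the left.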
*/ return wg_search_ttree_rightmost(db, node->right_child_offset, key, result, node); } else if(WG_COMPARE(db, key, node->current_max) != WG_GREATER) { /* key<=node->current_max */ *result = REALLY_BOUNDING_NODE; return rootoffset; } /* key is neither left of or inside this node and * there is no right child */ *result = DEAD_END_RIGHT_NOT_BOUNDING; return rootoffset; } #else gint bnodeoffset; bnodeoffset = db_find_bounding_tnode(db, rootoffset, key, result, NULL); if(*result != REALLY_BOUNDING_NODE) return bnodeoffset; /* There is at least one node with the key we're interested in, * now make sure we have the rightmost */ node = offsettoptr(db, bnodeoffset); while(WG_COMPARE(db, node->current_max, key) == WG_EQUAL) { gint nextoffset = TNODE_SUCCESSOR(db, node); if(nextoffset) { struct wg_tnode *next = offsettoptr(db, nextoffset); if(WG_COMPARE(db, next->current_min, key) == WG_GREATER) /* next->current_min > key */ break; /* overshot */ node = next; } else break; /* last node in chain */ } return ptrtooffset(db, node); #endif } /** Find leftmost node containing given value * returns NULL if node was not found */ gint wg_search_ttree_leftmost(void *db, gint rootoffset, gint key, gint *result, struct wg_tnode *lb_node) { struct wg_tnode * node; #ifdef TTREE_SINGLE_COMPARE node = (struct wg_tnode *)offsettoptr(db,rootoffset); /* Rightmost bound search mirrored */ if(WG_COMPARE(db, key, node->current_max) == WG_GREATER) { /* key > node->current_max */ if(node->right_child_offset != 0) { return wg_search_ttree_leftmost(db, node->right_child_offset, key, result, lb_node); } else if (lb_node) { /* Dead end, but we still have an unexamined node left */ if(WG_COMPARE(db, key, lb_node->current_min) != WG_LESSTHAN) { /* key>=lb_node->current_min */ *result = REALLY_BOUNDING_NODE; return ptrtooffset(db, lb_node); } } *result = DEAD_END_RIGHT_NOT_BOUNDING; return rootoffset; } else { if(node->left_child_offset != 0) { return wg_search_ttree_leftmost(db, node->left_child_offset, key, result, node); } else if(WG_COMPARE(db, key, node->current_min) != WG_LESSTHAN) { /* key>=node->current_min */ *result = REALLY_BOUNDING_NODE; return rootoffset; } *result = DEAD_END_LEFT_NOT_BOUNDING; return rootoffset; } #else gint bnodeoffset; bnodeoffset = db_find_bounding_tnode(db, rootoffset, key, result, NULL); if(*result != REALLY_BOUNDING_NODE) return bnodeoffset; /* One (we don't know which) bounding node found, traverse the * tree to the leftmost. */ node = offsettoptr(db, bnodeoffset); while(WG_COMPARE(db, node->current_min, key) == WG_EQUAL) { gint prevoffset = TNODE_PREDECESSOR(db, node); if(prevoffset) { struct wg_tnode *prev = offsettoptr(db, prevoffset); if(WG_COMPARE(db, prev->current_max, key) == WG_LESSTHAN) /* prev->current_max < key */ break; /* overshot */ node = prev; } else break; /* first node in chain */ } return ptrtooffset(db, node); #endif } /** Find first occurrence of a value in a T-tree node * returns the number of the slot. If the value itself * is missing, the location of the first value that * exceeds it is returned. */ gint wg_search_tnode_first(void *db, gint nodeoffset, gint key, gint column) { gint i, encoded; struct wg_tnode *node = (struct wg_tnode *) offsettoptr(db, nodeoffset); for(i=0; inumber_of_elements; i++) { /* Naive scan is ok for small values of WG_TNODE_ARRAY_SIZE. 
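 * (Editorial aside, not from the original source: a binary search over
 *  array_of_values could be substituted here if WG_TNODE_ARRAY_SIZE were
 *  made larger; with the default node sizes the linear scan is cheap.)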
*/ encoded = wg_get_field(db, (void *)offsettoptr(db,node->array_of_values[i]), column); if(WG_COMPARE(db, encoded, key) != WG_LESSTHAN) /* encoded >= key */ return i; } return -1; } /** Find last occurrence of a value in a T-tree node * returns the number of the slot. If the value itself * is missing, the location of the first value that * is smaller (when scanning from right to left) is returned. */ gint wg_search_tnode_last(void *db, gint nodeoffset, gint key, gint column) { gint i, encoded; struct wg_tnode *node = (struct wg_tnode *) offsettoptr(db, nodeoffset); for(i=node->number_of_elements -1; i>=0; i--) { encoded = wg_get_field(db, (void *)offsettoptr(db,node->array_of_values[i]), column); if(WG_COMPARE(db, encoded, key) != WG_GREATER) /* encoded <= key */ return i; } return -1; } /** Create T-tree index on a column * returns: * 0 - on success * -1 - error (failed to create the index) */ static gint create_ttree_index(void *db, gint index_id){ gint node; unsigned int rowsprocessed; struct wg_tnode *nodest; void *rec; db_memsegment_header* dbh = dbmemsegh(db); wg_index_header *hdr = (wg_index_header *) offsettoptr(db, index_id); gint column = hdr->rec_field_index[0]; /* allocate (+ init) root node for new index tree and save * the offset into index_array */ node = wg_alloc_fixlen_object(db, &dbh->tnode_area_header); nodest =(struct wg_tnode *)offsettoptr(db,node); nodest->parent_offset = 0; nodest->left_subtree_height = 0; nodest->right_subtree_height = 0; nodest->current_max = WG_ILLEGAL; nodest->current_min = WG_ILLEGAL; nodest->number_of_elements = 0; nodest->left_child_offset = 0; nodest->right_child_offset = 0; #ifdef TTREE_CHAINED_NODES nodest->succ_offset = 0; nodest->pred_offset = 0; #endif TTREE_ROOT_NODE(hdr) = node; #ifdef TTREE_CHAINED_NODES TTREE_MIN_NODE(hdr) = node; TTREE_MAX_NODE(hdr) = node; #endif //scan all the data - make entry for every suitable row rec = wg_get_first_record(db); rowsprocessed = 0; while(rec != NULL) { if(column >= wg_get_record_len(db, rec)) { rec=wg_get_next_record(db,rec); continue; } if(MATCH_TEMPLATE(db, hdr, rec)) { ttree_add_row(db, index_id, rec); rowsprocessed++; } rec=wg_get_next_record(db,rec); } #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"new index created on rec field %d into slot %d and %d data rows inserted\n", (int) column, (int) index_id, rowsprocessed); #endif return 0; } /** Drop T-tree index by id * Frees the memory in the T-node area * returns: * 0 - on success * -1 - error */ static gint drop_ttree_index(void *db, gint index_id){ struct wg_tnode *node; wg_index_header *hdr; hdr = (wg_index_header *) offsettoptr(db, index_id); /* Free the T-node memory. This is trivial for chained nodes, since * once we've found a successor for a node it can be deleted and * forgotten about. For plain T-tree this does not work since tree * traversal often runs down and up parent-child chains, which means * that some parents cannot be deleted before their children. 
*/ node = NULL; #ifdef TTREE_CHAINED_NODES if(TTREE_MIN_NODE(hdr)) node = (struct wg_tnode *) offsettoptr(db, TTREE_MIN_NODE(hdr)); else if(TTREE_ROOT_NODE(hdr)) /* normally this does not happen */ node = (struct wg_tnode *) offsettoptr(db, TTREE_ROOT_NODE(hdr)); while(node) { gint deleteme = ptrtooffset(db, node); if(node->succ_offset) node = (struct wg_tnode *) offsettoptr(db, node->succ_offset); else node = NULL; wg_free_tnode(db, deleteme); } #else /* XXX: not implemented */ show_index_error(db, "Warning: T-node memory cannot be deallocated"); #endif return 0; } /* -------------- Hash index private functions ------------- */ /** inserts pointer to data row into index tree structure * returns: * 0 - on success * -1 - if error */ static gint hash_add_row(void *db, gint index_id, void *rec) { wg_index_header *hdr = (wg_index_header *)offsettoptr(db,index_id); gint i; gint values[MAX_INDEX_FIELDS]; for(i=0; ifields; i++) { values[i] = wg_get_field(db, rec, hdr->rec_field_index[i]); } return hash_recurse(db, hdr, NULL, 0, values, hdr->fields, rec, HASHIDX_OP_STORE, (hdr->type == WG_INDEX_TYPE_HASH_JSON)); } /** Remove all entries connected to a row from hash index * returns: * 0 - on success * -1 - if error */ static gint hash_remove_row(void *db, gint index_id, void *rec) { wg_index_header *hdr = (wg_index_header *)offsettoptr(db,index_id); gint i; gint values[MAX_INDEX_FIELDS]; for(i=0; ifields; i++) { values[i] = wg_get_field(db, rec, hdr->rec_field_index[i]); } return hash_recurse(db, hdr, NULL, 0, values, hdr->fields, rec, HASHIDX_OP_REMOVE, (hdr->type == WG_INDEX_TYPE_HASH_JSON)); } /** * Construct a byte array for hashing recursively. * Hash it when it is complete. * * If we have a JSON index *and* we're acting on an indexable row, * all arrays are expanded. This does not happen if we're called * by updating a value *in* an array. * * returns: * 0 - on success * -1 - on error */ static gint hash_recurse(void *db, wg_index_header *hdr, char *prefix, gint prefixlen, gint *values, gint count, void *rec, gint op, gint expand) { if(count) { gint nextvalue = values[0]; if(expand) { /* In case of a JSON/array index, check the value */ if(wg_get_encoded_type(db, nextvalue) == WG_RECORDTYPE) { void *valrec = wg_decode_record(db, nextvalue); if(is_schema_array(valrec)) { /* expand the array */ gint i, reclen, retv = 0; reclen = wg_get_record_len(db, valrec); for(i=0; itype; gint firstcol = hdr->rec_field_index[0]; gint i; /* Initialize the hash table (0 - use default size) */ if(wg_create_hash(db, HASHIDX_ARRAYP(hdr), 0)) return -1; /* Add existing records */ rec = wg_get_first_record(db); rowsprocessed = 0; while(rec != NULL) { if(firstcol >= wg_get_record_len(db, rec)) { rec=wg_get_next_record(db,rec); continue; } if(MATCH_TEMPLATE(db, hdr, rec)) { if(type == WG_INDEX_TYPE_HASH_JSON) { /* Ignore array and object records. Their data is indexed * from the rows that point to them. */ if(is_plain_record(rec)) { hash_add_row(db, index_id, rec); rowsprocessed++; } } else { /* Add all rows normally */ hash_add_row(db, index_id, rec); rowsprocessed++; } } rec=wg_get_next_record(db,rec); } #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"new hash index created on ("); #endif for(i=0; ifields; i++) { #ifdef WG_NO_ERRPRINT #else #ifdef _WIN32 fprintf(stderr,"%s%Id", (i ? "," : ""), hdr->rec_field_index[i]); #else fprintf(stderr,"%s%td", (i ? 
"," : ""), hdr->rec_field_index[i]); #endif #endif } #ifdef WG_NO_ERRPRINT #else fprintf(stderr,") into slot %d and %d data rows inserted\n", (int) index_id, rowsprocessed); #endif return 0; } /** Drop a hash index by id * returns: * 0 - on success * -1 - error * * XXX: implement this. Needs some method of de-allocating or reusing * the main hash table (list cells/varlen storage can be freed piece by * piece if necessary). */ static gint drop_hash_index(void *db, gint index_id){ show_index_error(db, "Cannot drop hash index: not implemented"); return -1; } /* -------------- Hash index public functions -------------- */ /** * Search the hash index for given values. * * returns offset to data row: * -1 - error * 0 - if key NOT found * >0 - offset to the linked list that contains the row offsets */ gint wg_search_hash(void *db, gint index_id, gint *values, gint count) { wg_index_header *hdr = (wg_index_header *) offsettoptr(db, index_id); #ifdef CHECK gint type = wg_get_index_type(db, index_id); /* also validates the id */ if(type < 0) return type; if(type != WG_INDEX_TYPE_HASH && type != WG_INDEX_TYPE_HASH_JSON) return show_index_error(db, "wg_search_hash: Not a hash index"); if(hdr->fields != count) { show_index_error(db, "Number of indexed fields does not match"); return -1; } #endif return hash_recurse(db, hdr, NULL, 0, values, count, NULL, HASHIDX_OP_FIND, 0); } /* ----------------- Index template functions -------------- */ /** Insert into list * * helper function to insert list elements. Takes address of * a variable containing an offset to the first element (that * offset may be 0 for empty lists or when appending). */ static gint insert_into_list(void *db, gint *head, gint value) { db_memsegment_header* dbh = dbmemsegh(db); gint old = *head; *head = wg_alloc_fixlen_object(db, &dbh->listcell_area_header); if(*head) { gcell *listelem = (gcell *) offsettoptr(db, *head); listelem->car = value; listelem->cdr = old; } return *head; } /** Delete from list * * helper function to delete list elements. Deletes the current * element. */ static void delete_from_list(void *db, gint *head) { db_memsegment_header* dbh = dbmemsegh(db); gcell *listelem = (gcell *) offsettoptr(db, *head); *head = listelem->cdr; /* Free the vacated list element */ wg_free_fixlen_object(db, &dbh->listcell_area_header, ptrtooffset(db, listelem)); } #ifdef USE_INDEX_TEMPLATE /** Add index template * * Takes a gint array that represents an template for records * that are inserted into an index. Creates a database record * from that array and links the record into an ordered list. * * Returns offset to the created match record, if successful * Returns 0 on error. */ static gint add_index_template(void *db, gint *matchrec, gint reclen) { gint *ilist, *meta; void *rec; db_memsegment_header* dbh = dbmemsegh(db); wg_index_template *tmpl; gint fixed_columns = 0, template_offset = 0, last_fixed = 0; int i; /* Find the number of fixed columns in the template */ for(i=0; iindex_control_area_header.index_template_list; while(*ilist) { gcell *ilistelem = (gcell *) offsettoptr(db, *ilist); if(!ilistelem->car) { show_index_error(db, "Invalid header in index tempate list"); return 0; } tmpl = (wg_index_template *) offsettoptr(db, ilistelem->car); if(tmpl->fixed_columns == fixed_columns) { rec = offsettoptr(db, tmpl->offset_matchrec); if(reclen != wg_get_record_len(db, rec)) goto nextelem; /* match not possible */ for(i=0; icar; } else if(tmpl->fixed_columns < fixed_columns) { /* No matching record found. 
New template should be inserted * ahead of current element. */ break; } nextelem: ilist = &ilistelem->cdr; } /* Create the new match record */ rec = wg_create_raw_record(db, reclen); if(!rec) return 0; for(i=0; iindextmpl_area_header); tmpl = (wg_index_template *) offsettoptr(db, template_offset); tmpl->offset_matchrec = ptrtooffset(db, rec); tmpl->fixed_columns = fixed_columns; /* Insert it into the template list */ if(!insert_into_list(db, ilist, template_offset)) return 0; return template_offset; } /** Find index template * * Takes a gint array that represents an template for records * that are inserted into an index. Checks if a matching template * exists in a database. This function is used for finding an * index. * * Returns the template offset on success. * Returns 0 on error. */ static gint find_index_template(void *db, gint *matchrec, gint reclen) { gint *ilist; void *rec; db_memsegment_header* dbh = dbmemsegh(db); wg_index_template *tmpl; gint fixed_columns = 0, last_fixed = 0; int i; /* Get some statistics about the match record and validate it */ for(i=0; iindex_control_area_header.index_template_list; while(*ilist) { gcell *ilistelem = (gcell *) offsettoptr(db, *ilist); if(!ilistelem->car) { show_index_error(db, "Invalid header in index tempate list"); return 0; } tmpl = (wg_index_template *) offsettoptr(db, ilistelem->car); if(tmpl->fixed_columns == fixed_columns) { rec = offsettoptr(db, tmpl->offset_matchrec); if(reclen != wg_get_record_len(db, rec)) goto nextelem; /* match not possible */ for(i=0; icar; } else if(tmpl->fixed_columns < fixed_columns) { /* No matching record found. New template should be inserted * ahead of current element. */ break; } nextelem: ilist = &ilistelem->cdr; } return 0; } /** Remove index template * * Caller should make sure that the template is no longer * referenced by any indexes before calling this. */ static gint remove_index_template(void *db, gint template_offset) { gint *ilist; void *rec; db_memsegment_header* dbh = dbmemsegh(db); wg_index_template *tmpl; tmpl = (wg_index_template *) offsettoptr(db, template_offset); /* Delete the database record */ rec = offsettoptr(db, tmpl->offset_matchrec); wg_delete_record(db, rec); /* Remove from template list */ ilist = &dbh->index_control_area_header.index_template_list; while(*ilist) { gcell *ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car == template_offset) { delete_from_list(db, ilist); break; } ilist = &ilistelem->cdr; } /* Free the template */ wg_free_fixlen_object(db, &dbh->indextmpl_area_header, template_offset); return 0; } /** Check if a record matches a template * * Returns 1 if they match * Otherwise, returns 0 */ gint wg_match_template(void *db, wg_index_template *tmpl, void *rec) { void *matchrec; gint reclen, mreclen; int i; #ifdef CHECK /* Paranoia */ if(!tmpl->offset_matchrec) { show_index_error(db, "Invalid match record template"); return 0; } #endif matchrec = offsettoptr(db, tmpl->offset_matchrec); mreclen = wg_get_record_len(db, matchrec); reclen = wg_get_record_len(db, rec); if(mreclen > reclen) { /* Match records always end in a fixed column, so * this is guaranteed to be a mismatch */ return 0; } else if(mreclen < reclen) { /* Fields outside the template always match */ reclen = mreclen; } for(i=0; i prev) lowest = columns[j]; } if(lowest == MAX_INDEXED_FIELDNR + 1) break; sorted_cols[i++] = lowest; prev = lowest; }; return i; } /** Create an index. * * Single-column backward compatibility wrapper. 
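 *
 * An illustrative call (not part of the original source), assuming an
 * attached database handle db and records that have an indexable
 * column 0:
 *
 *   if(wg_create_index(db, 0, WG_INDEX_TYPE_TTREE, NULL, 0) < 0)
 *     fprintf(stderr, "failed to create T-tree index\n");
 *
 * This simply forwards to wg_create_multi_index() with a one-element
 * column array.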
*/ gint wg_create_index(void *db, gint column, gint type, gint *matchrec, gint reclen) { return wg_create_multi_index(db, &column, 1, type, matchrec, reclen); } /** Create an index. * * Arguments - * type - WG_INDEX_TYPE_TTREE - single-column T-tree index * WG_INDEX_TYPE_TTREE_JSON - T-tree for JSON schema * WG_INDEX_TYPE_HASH - multi-column hash index * WG_INDEX_TYPE_HASH_JSON - hash index with JSON features * * columns - array of column numbers * col_count - size of the column number array * * matchrec - array of gints * reclen - size of matchrec * If matchrec is NULL, regular index will be created. Otherwise, * only database records that match the template defined by * matchrec are inserted in this index. */ gint wg_create_multi_index(void *db, gint *columns, gint col_count, gint type, gint *matchrec, gint reclen) { gint index_id, template_offset = 0, i; wg_index_header *hdr; #ifdef USE_INDEX_TEMPLATE wg_index_template *tmpl = NULL; gint fixed_columns = 0; #endif gint *ilist[MAX_INDEX_FIELDS]; gint sorted_cols[MAX_INDEX_FIELDS]; db_memsegment_header* dbh = dbmemsegh(db); /* Check the arguments */ #ifdef CHECK if (!dbcheck(db)) { show_index_error(db, "Invalid database pointer in wg_create_multi_index"); return -1; } if(!columns) { show_index_error(db, "columns list is a NULL pointer"); return -1; } #endif #ifdef USE_CHILD_DB /* Workaround to handle external refs/ttree issue */ if(dbh->extdbs.count > 0) { return show_index_error(db, "Database has external data, "\ "indexes disabled."); } #endif /* Column count validation */ if(col_count < 1) { show_index_error(db, "need at least one indexed column"); return -1; } else if(col_count > MAX_INDEX_FIELDS) { show_index_error_nr(db, "Max allowed indexed fields", MAX_INDEX_FIELDS); return -1; } else if(col_count > 1 &&\ (type == WG_INDEX_TYPE_TTREE || type == WG_INDEX_TYPE_TTREE_JSON)) { show_index_error(db, "Cannot create a T-tree index on multiple columns"); return -1; } if(sort_columns(sorted_cols, columns, col_count) < col_count) { show_index_error(db, "Duplicate columns not allowed"); return -1; } for(i=0; i MAX_INDEXED_FIELDNR) { show_index_error_nr(db, "Max allowed column number", MAX_INDEXED_FIELDNR); return -1; } } #ifdef USE_INDEX_TEMPLATE /* Handle the template */ if(matchrec) { if(!reclen) { show_index_error(db, "Zero-length match record not allowed"); return -1; } if(reclen > MAX_INDEXED_FIELDNR+1) { show_index_error_nr(db, "Match record too long, max", MAX_INDEXED_FIELDNR+1); return -1; } /* Sanity check */ for(i=0; ifixed_columns; } #endif /* Scan to the end of index chain for each column. If templates are used, * new indexes are inserted in between list elements to maintain * the chains sorted by number of fixed columns. */ for(i=0; iindex_control_area_header.index_table[column]; while(*(ilist[i])) { gcell *ilistelem = (gcell *) offsettoptr(db, *(ilist[i])); if(!ilistelem->car) { show_index_error(db, "Invalid header in index list"); return -1; } hdr = (wg_index_header *) offsettoptr(db, ilistelem->car); /* If this is the first column, check for a matching index. * Note that this is simplified by having the column lists sorted. 
*/ if(!i && hdr->type==type && template_offset==hdr->template_offset &&\ hdr->fields==col_count) { gint j, match = 1; /* Compare the field lists */ for(j=0; jrec_field_index[j] != sorted_cols[j]) { match = 0; break; } } if(match) { show_index_error(db, "Identical index already exists on the column"); return -1; } } #ifdef USE_INDEX_TEMPLATE if(hdr->template_offset) { wg_index_template *t = \ (wg_index_template *) offsettoptr(db, hdr->template_offset); if(t->fixed_columns < fixed_columns) break; /* new template is more promising, insert here */ } else if(fixed_columns) { /* Current list element does not have a template, so * the new one should be inserted before it. */ break; } #endif ilist[i] = &ilistelem->cdr; } } /* Add new index header */ index_id = wg_alloc_fixlen_object(db, &dbh->indexhdr_area_header); for(i=0; itype = type; hdr->fields = col_count; for(i=0; i < col_count; i++) { hdr->rec_field_index[i] = sorted_cols[i]; } hdr->template_offset = template_offset; /* create the actual index */ switch(hdr->type) { case WG_INDEX_TYPE_TTREE: case WG_INDEX_TYPE_TTREE_JSON: create_ttree_index(db, index_id); break; case WG_INDEX_TYPE_HASH: case WG_INDEX_TYPE_HASH_JSON: if(create_hash_index(db, index_id)) return -1; break; default: show_index_error(db, "Invalid index type"); return -1; } /* Add to master list */ if(!insert_into_list(db, &dbh->index_control_area_header.index_list ,index_id)) return -1; #ifdef USE_INDEX_TEMPLATE if(hdr->template_offset) { int i; /* Update the template index */ for(i=0; iindex_control_area_header.index_template_table[i]), index_id)) return 0; } } } #endif /* increase index counter */ dbh->index_control_area_header.number_of_indexes++; #ifdef USE_INDEX_TEMPLATE if(tmpl) tmpl->refcount++; #endif return 0; } /** Drop index by index id * * returns: * 0 - on success * -1 - error */ gint wg_drop_index(void *db, gint index_id){ int i; wg_index_header *hdr = NULL; gint *ilist; gcell *ilistelem; db_memsegment_header* dbh = dbmemsegh(db); /* Locate the header */ ilist = &dbh->index_control_area_header.index_list; while(*ilist) { ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car == index_id) { hdr = (wg_index_header *) offsettoptr(db, index_id); /* Delete current element */ delete_from_list(db, ilist); break; } ilist = &ilistelem->cdr; } if(!hdr) { show_index_error_nr(db, "Invalid index for delete", index_id); return -1; } /* Remove the index from index table */ for(i=0; ifields; i++) { int column = hdr->rec_field_index[i]; ilist = &dbh->index_control_area_header.index_table[column]; while(*ilist) { ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car == index_id) { delete_from_list(db, ilist); break; } ilist = &ilistelem->cdr; } } #ifdef USE_INDEX_TEMPLATE if(hdr->template_offset) { wg_index_template *tmpl = \ (wg_index_template *) offsettoptr(db, hdr->template_offset); void *matchrec = offsettoptr(db, tmpl->offset_matchrec); gint reclen = wg_get_record_len(db, matchrec); /* Remove from template index */ for(i=0; iindex_control_area_header.index_template_table[i]; while(*ilist) { ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car == index_id) { delete_from_list(db, ilist); break; } ilist = &ilistelem->cdr; } } } } #endif /* Drop the index */ switch(hdr->type) { case WG_INDEX_TYPE_TTREE: case WG_INDEX_TYPE_TTREE_JSON: if(drop_ttree_index(db, index_id)) return -1; break; case WG_INDEX_TYPE_HASH: case WG_INDEX_TYPE_HASH_JSON: if(drop_hash_index(db, index_id)); return -1; break; default: show_index_error(db, "Invalid index type"); 
return -1; } #ifdef USE_INDEX_TEMPLATE if(hdr->template_offset) { wg_index_template *tmpl = \ (wg_index_template *) offsettoptr(db, hdr->template_offset); if(!(--(tmpl->refcount))) remove_index_template(db, hdr->template_offset); } #endif /* Now free the header */ wg_free_fixlen_object(db, &dbh->indexhdr_area_header, index_id); /* decrement index counter */ dbh->index_control_area_header.number_of_indexes--; return 0; } /** Find index id (index header) by column. * * Single-column backward compatibility wrapper. */ gint wg_column_to_index_id(void *db, gint column, gint type, gint *matchrec, gint reclen) { return wg_multi_column_to_index_id(db, &column, 1, type, matchrec, reclen); } /** Find index id (index header) by column(s) * Supports all types of indexes, calling program should examine the * header of returned index to decide how to proceed. Alternatively, * if type is not 0 then only indexes of the given type are * returned. * * If matchrec is NULL, "full" index is returned. Otherwise * the function attempts to locate a matching template. * * returns: * -1 if no index found * offset > 0 if index found - index id */ gint wg_multi_column_to_index_id(void *db, gint *columns, gint col_count, gint type, gint *matchrec, gint reclen) { int i; gint template_offset = 0; db_memsegment_header* dbh = dbmemsegh(db); gint *ilist; gcell *ilistelem; gint sorted_cols[MAX_INDEX_FIELDS]; #ifdef USE_INDEX_TEMPLATE /* Validate the match record and find the template */ if(matchrec) { if(!reclen) { show_index_error(db, "Zero-length match record not allowed"); return -1; } if(reclen > MAX_INDEXED_FIELDNR+1) { show_index_error_nr(db, "Match record too long, max", MAX_INDEXED_FIELDNR+1); return -1; } template_offset = find_index_template(db, matchrec, reclen); if(!template_offset) { /* No matching template */ return -1; } } #endif /* Column count validation */ if(col_count < 1) { show_index_error(db, "need at least one indexed column"); return -1; } else if(col_count > MAX_INDEX_FIELDS) { show_index_error_nr(db, "Max allowed indexed fields", MAX_INDEX_FIELDS); return -1; } if(col_count > 1) { if(sort_columns(sorted_cols, columns, col_count) < col_count) { show_index_error(db, "Duplicate columns not allowed"); return -1; } } else { sorted_cols[0] = columns[0]; } for(i=0; i MAX_INDEXED_FIELDNR) { show_index_error_nr(db, "Max allowed column number", MAX_INDEXED_FIELDNR); return -1; } } /* Find all indexes on the first column */ ilist = &dbh->index_control_area_header.index_table[sorted_cols[0]]; while(*ilist) { ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car) { wg_index_header *hdr = \ (wg_index_header *) offsettoptr(db, ilistelem->car); #ifndef USE_INDEX_TEMPLATE if(!type || type==hdr->type) { #else if((!type || type==hdr->type) &&\ hdr->template_offset == template_offset) { #endif if(hdr->fields == col_count) { for(i=0; irec_field_index[i]!=sorted_cols[i]) goto nextindex; } return ilistelem->car; /* index id */ } } } nextindex: ilist = &ilistelem->cdr; } return -1; } /** Return index type by index id * * returns: * -1 if no index found * type >= 0 if index found */ gint wg_get_index_type(void *db, gint index_id) { wg_index_header *hdr = NULL; gint *ilist; gcell *ilistelem; db_memsegment_header* dbh = dbmemsegh(db); /* Locate the header */ ilist = &dbh->index_control_area_header.index_list; while(*ilist) { ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car == index_id) { hdr = (wg_index_header *) offsettoptr(db, index_id); break; } ilist = &ilistelem->cdr; } if(!hdr) { 
show_index_error_nr(db, "Invalid index_id", index_id); return -1; } return hdr->type; } /** Return index template by index id * * Returns a pointer to the gint array used for the index template. * reclen is set to the length of the array. The pointer may not * be freed and it's contents should be accessed read-only. * * If the index is not found or has no template, NULL is returned. * In that case contents of *reclen are unmodified. */ void * wg_get_index_template(void *db, gint index_id, gint *reclen) { #ifdef USE_INDEX_TEMPLATE wg_index_header *hdr = NULL; gint *ilist; gcell *ilistelem; db_memsegment_header* dbh = dbmemsegh(db); wg_index_template *tmpl = NULL; void *matchrec; /* Locate the header */ ilist = &dbh->index_control_area_header.index_list; while(*ilist) { ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car == index_id) { hdr = (wg_index_header *) offsettoptr(db, index_id); break; } ilist = &ilistelem->cdr; } if(!hdr) { show_index_error_nr(db, "Invalid index_id", index_id); return NULL; } if(!hdr->template_offset) { return NULL; } tmpl = (wg_index_template *) offsettoptr(db, hdr->template_offset); #ifdef CHECK if(!tmpl->offset_matchrec) { show_index_error(db, "Invalid match record template"); return NULL; } #endif matchrec = offsettoptr(db, tmpl->offset_matchrec); *reclen = wg_get_record_len(db, matchrec); return wg_get_record_dataarray(db, matchrec); #else return NULL; #endif } /** Return all indexes in database. * * Returns a pointer to a NEW allocated array of index id-s. * count is initialized to the number of indexes in the array. * * Returns NULL if there are no indexes. */ void * wg_get_all_indexes(void *db, gint *count) { int column; db_memsegment_header* dbh = dbmemsegh(db); gint *ilist; gint *res; *count = 0; if(!dbh->index_control_area_header.number_of_indexes) { return NULL; } res = (gint *) malloc(dbh->index_control_area_header.number_of_indexes *\ sizeof(gint)); if(!res) { show_index_error(db, "Memory allocation failed"); return NULL; } for(column=0; column<=MAX_INDEXED_FIELDNR; column++) { ilist = &dbh->index_control_area_header.index_table[column]; while(*ilist) { gcell *ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car) { res[(*count)++] = ilistelem->car; } ilist = &ilistelem->cdr; } } if(*count != dbh->index_control_area_header.number_of_indexes) { show_index_error(db, "Index control area is corrupted"); free(res); return NULL; } return res; } #define INDEX_ADD_ROW(d, h, i, r) \ switch(h->type) { \ case WG_INDEX_TYPE_TTREE: \ if(ttree_add_row(d, i, r)) \ return -2; \ break; \ case WG_INDEX_TYPE_TTREE_JSON: \ if(is_plain_record(r)) { \ if(ttree_add_row(d, i, r)) \ return -2; \ } \ break; \ case WG_INDEX_TYPE_HASH: \ if(hash_add_row(d, i, r)) \ return -2; \ break; \ case WG_INDEX_TYPE_HASH_JSON: \ if(is_plain_record(r)) { \ if(hash_add_row(d, i, r)) \ return -2; \ } \ break; \ default: \ show_index_error(db, "unknown index type, ignoring"); \ break; \ } #define INDEX_REMOVE_ROW(d, h, i, r) \ switch(h->type) { \ case WG_INDEX_TYPE_TTREE: \ if(ttree_remove_row(d, i, r) < -2) \ return -2; \ break; \ case WG_INDEX_TYPE_TTREE_JSON: \ if(is_plain_record(r)) { \ if(ttree_remove_row(d, i, r) < -2) \ return -2; \ } \ break; \ case WG_INDEX_TYPE_HASH: \ if(hash_remove_row(d, i, r) < -2) \ return -2; \ break; \ case WG_INDEX_TYPE_HASH_JSON: \ if(is_plain_record(r)) { \ if(hash_remove_row(d, i, r) < -2) \ return -2; \ } \ break; \ default: \ show_index_error(db, "unknown index type, ignoring"); \ break; \ } /** Add data of one field to all 
indexes * Loops over indexes in one field and inserts the data into * each one of them. * returns 0 for success * returns -1 for invalid arguments * returns -2 for error (insert failed, index is no longer consistent) */ gint wg_index_add_field(void *db, void *rec, gint column) { gint *ilist; gcell *ilistelem; db_memsegment_header* dbh = dbmemsegh(db); #ifdef CHECK /* XXX: if used from wg_set_field() only, this is redundant */ if(column > MAX_INDEXED_FIELDNR || column >= wg_get_record_len(db, rec)) return -1; if(is_special_record(rec)) return -1; #endif #if 0 /* XXX: if used from wg_set_field() only, this is redundant */ if(!dbh->index_control_area_header.index_table[column]) return -1; #endif ilist = &dbh->index_control_area_header.index_table[column]; while(*ilist) { ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car) { wg_index_header *hdr = \ (wg_index_header *) offsettoptr(db, ilistelem->car); if(MATCH_TEMPLATE(db, hdr, rec)) { INDEX_ADD_ROW(db, hdr, ilistelem->car, rec) } } ilist = &ilistelem->cdr; } #ifdef USE_INDEX_TEMPLATE /* Other candidates are indexes that have match * records. The current record may have become compatible * with their template. */ ilist = &dbh->index_control_area_header.index_template_table[column]; while(*ilist) { ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car) { wg_index_header *hdr = \ (wg_index_header *) offsettoptr(db, ilistelem->car); if(MATCH_TEMPLATE(db, hdr, rec)) { INDEX_ADD_ROW(db, hdr, ilistelem->car, rec) } } ilist = &ilistelem->cdr; } #endif return 0; } /** Add data of one record to all indexes * Convinience function to add an entire record into * all indexes in the database. * returns 0 on success, -2 on error * (-1 is skipped to have consistent error codes for add/del functions) */ gint wg_index_add_rec(void *db, void *rec) { gint i; db_memsegment_header* dbh = dbmemsegh(db); gint reclen = wg_get_record_len(db, rec); #ifdef CHECK if(is_special_record(rec)) return -1; #endif if(reclen > MAX_INDEXED_FIELDNR) reclen = MAX_INDEXED_FIELDNR + 1; for(i=0;iindex_control_area_header.index_table[i]; while(*ilist) { ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car) { wg_index_header *hdr = \ (wg_index_header *) offsettoptr(db, ilistelem->car); if(hdr->rec_field_index[0] >= i) { /* A little trick: we only update index if the * first column in the column list matches. The reasoning * behind this is that we only want to update each index * once, for multi-column indexes we can rest assured that * the work was already done. */ if(MATCH_TEMPLATE(db, hdr, rec)) { INDEX_ADD_ROW(db, hdr, ilistelem->car, rec) } } } ilist = &ilistelem->cdr; } #ifdef USE_INDEX_TEMPLATE ilist = &dbh->index_control_area_header.index_template_table[i]; while(*ilist) { ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car) { wg_index_header *hdr = \ (wg_index_header *) offsettoptr(db, ilistelem->car); wg_index_template *tmpl = \ (wg_index_template *) offsettoptr(db, hdr->template_offset); void *matchrec; gint mreclen; int j, firstmatch = -1; /* Here the check for a match is slightly more complicated. * If there is a match *but* the current column is not the * first fixed one in the template, the match has * already occurred earlier. 
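 * For example, if the template fixes columns 2 and 5, the row is added
 * to the index only while processing column 2; when column 5 is reached
 * the same match would be detected again and must be skipped.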
*/ matchrec = offsettoptr(db, tmpl->offset_matchrec); mreclen = wg_get_record_len(db, matchrec); if(mreclen > reclen) { goto nexttmpl1; } for(j=0; jcar, rec) } } nexttmpl1: ilist = &ilistelem->cdr; } #endif } return 0; } /** Delete data of one field from all indexes * Loops over indexes in one column and removes the references * to the record from all of them. * returns 0 for success * returns -1 for invalid arguments * returns -2 for error (delete failed, possible index corruption) */ gint wg_index_del_field(void *db, void *rec, gint column) { gint *ilist; gcell *ilistelem; db_memsegment_header* dbh = dbmemsegh(db); #ifdef CHECK /* XXX: if used from wg_set_field() only, this is redundant */ if(column > MAX_INDEXED_FIELDNR || column >= wg_get_record_len(db, rec)) return -1; if(is_special_record(rec)) return -1; #endif #if 0 /* XXX: if used from wg_set_field() only, this is redundant */ if(!dbh->index_control_area_header.index_table[column]) return -1; #endif /* Find all indexes on the column */ ilist = &dbh->index_control_area_header.index_table[column]; while(*ilist) { ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car) { wg_index_header *hdr = \ (wg_index_header *) offsettoptr(db, ilistelem->car); if(MATCH_TEMPLATE(db, hdr, rec)) { INDEX_REMOVE_ROW(db, hdr, ilistelem->car, rec) } } ilist = &ilistelem->cdr; } #ifdef USE_INDEX_TEMPLATE /* Find all indexes on the column */ ilist = &dbh->index_control_area_header.index_template_table[column]; while(*ilist) { ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car) { wg_index_header *hdr = \ (wg_index_header *) offsettoptr(db, ilistelem->car); if(MATCH_TEMPLATE(db, hdr, rec)) { INDEX_REMOVE_ROW(db, hdr, ilistelem->car, rec) } } ilist = &ilistelem->cdr; } #endif return 0; } /* Delete data of one record from all indexes * Should be called from wg_delete_record() * returns 0 for success * returns -2 for error (delete failed, index presumably corrupt) */ gint wg_index_del_rec(void *db, void *rec) { gint i; db_memsegment_header* dbh = dbmemsegh(db); gint reclen = wg_get_record_len(db, rec); #ifdef CHECK if(is_special_record(rec)) return -1; #endif if(reclen > MAX_INDEXED_FIELDNR) reclen = MAX_INDEXED_FIELDNR + 1; for(i=0;iindex_control_area_header.index_table[i]; while(*ilist) { ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car) { wg_index_header *hdr = \ (wg_index_header *) offsettoptr(db, ilistelem->car); if(hdr->rec_field_index[0] >= i) { /* Ignore second, third etc references to multi-column * indexes. XXX: This only works if index table is scanned * sequentially, from position 0. See also comment for * wg_index_add_rec command. 
*/ if(MATCH_TEMPLATE(db, hdr, rec)) { INDEX_REMOVE_ROW(db, hdr, ilistelem->car, rec) } } } ilist = &ilistelem->cdr; } #ifdef USE_INDEX_TEMPLATE ilist = &dbh->index_control_area_header.index_template_table[i]; while(*ilist) { ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car) { wg_index_header *hdr = \ (wg_index_header *) offsettoptr(db, ilistelem->car); wg_index_template *tmpl = \ (wg_index_template *) offsettoptr(db, hdr->template_offset); void *matchrec; gint mreclen; int j, firstmatch = -1; /* Similar check as in wg_index_add_rec() */ matchrec = offsettoptr(db, tmpl->offset_matchrec); mreclen = wg_get_record_len(db, matchrec); if(mreclen > reclen) { goto nexttmpl2; /* no match */ } for(j=0; jcar, rec) } } nexttmpl2: ilist = &ilistelem->cdr; } #endif } return 0; } /* --------------- error handling ------------------------------*/ /** called with err msg * * may print or log an error * does not do any jumps etc */ static gint show_index_error(void* db, char* errmsg) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"index error: %s\n",errmsg); #endif return -1; } /** called with err msg and additional int data * * may print or log an error * does not do any jumps etc */ static gint show_index_error_nr(void* db, char* errmsg, gint nr) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"index error: %s %d\n", errmsg, (int) nr); #endif return -1; } #ifdef __cplusplus } #endif whitedb-0.7.2/Db/dbindex.h000066400000000000000000000116601226454622500152750ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Enar Reilent 2009, Priit Järv 2010,2011 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dblock.h * Public headers for indexing routines */ #ifndef DEFINED_DBINDEX_H #define DEFINED_DBINDEX_H #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif /* For gint data type */ #include "dbdata.h" /* ==== Public macros ==== */ #define REALLY_BOUNDING_NODE 0 #define DEAD_END_LEFT_NOT_BOUNDING 1 #define DEAD_END_RIGHT_NOT_BOUNDING 2 #ifdef TTREE_CHAINED_NODES #define TNODE_SUCCESSOR(d, x) (x->succ_offset) #define TNODE_PREDECESSOR(d, x) (x->pred_offset) #else #define TNODE_SUCCESSOR(d, x) (x->right_child_offset ? \ wg_ttree_find_lub_node(d, x->right_child_offset) : \ wg_ttree_find_leaf_successor(d, ptrtooffset(d, x))) #define TNODE_PREDECESSOR(d, x) (x->left_child_offset ? \ wg_ttree_find_glb_node(d, x->left_child_offset) : \ wg_ttree_find_leaf_predecessor(d, ptrtooffset(d, x))) #endif /* Check if record matches index (takes pointer arguments) */ #ifndef USE_INDEX_TEMPLATE #define MATCH_TEMPLATE(d, h, r) 1 #else #define MATCH_TEMPLATE(d, h, r) (h->template_offset ? 
\ wg_match_template(d, \ (wg_index_template *) offsettoptr(d, h->template_offset), r) : 1) #endif #define WG_INDEX_TYPE_TTREE 50 #define WG_INDEX_TYPE_TTREE_JSON 51 #define WG_INDEX_TYPE_HASH 60 #define WG_INDEX_TYPE_HASH_JSON 61 /* Index header helpers */ #define TTREE_ROOT_NODE(x) (x->ctl.t.offset_root_node) #ifdef TTREE_CHAINED_NODES #define TTREE_MIN_NODE(x) (x->ctl.t.offset_min_node) #define TTREE_MAX_NODE(x) (x->ctl.t.offset_max_node) #endif #define HASHIDX_ARRAYP(x) (&(x->ctl.h.hasharea)) /* ====== data structures ======== */ /** structure of t-node * (array of data pointers, pointers to parent/children nodes, control data) * overall size is currently 64 bytes (cache line?) if array size is 10, * with extra node chaining pointers the array size defaults to 8. */ struct wg_tnode{ gint parent_offset; gint current_max; /** encoded value */ gint current_min; /** encoded value */ short number_of_elements; unsigned char left_subtree_height; unsigned char right_subtree_height; gint array_of_values[WG_TNODE_ARRAY_SIZE]; gint left_child_offset; gint right_child_offset; #ifdef TTREE_CHAINED_NODES gint succ_offset; /** forward (smaller to larger) sequential chain */ gint pred_offset; /** backward sequential chain */ #endif }; /* ==== Protos ==== */ /* API functions (copied in indexapi.h) */ gint wg_create_index(void *db, gint column, gint type, gint *matchrec, gint reclen); gint wg_create_multi_index(void *db, gint *columns, gint col_count, gint type, gint *matchrec, gint reclen); gint wg_drop_index(void *db, gint index_id); gint wg_column_to_index_id(void *db, gint column, gint type, gint *matchrec, gint reclen); gint wg_multi_column_to_index_id(void *db, gint *columns, gint col_count, gint type, gint *matchrec, gint reclen); gint wg_get_index_type(void *db, gint index_id); void * wg_get_index_template(void *db, gint index_id, gint *reclen); void * wg_get_all_indexes(void *db, gint *count); /* WhiteDB internal functions */ gint wg_search_ttree_index(void *db, gint index_id, gint key); #ifndef TTREE_CHAINED_NODES gint wg_ttree_find_glb_node(void *db, gint nodeoffset); gint wg_ttree_find_lub_node(void *db, gint nodeoffset); gint wg_ttree_find_leaf_predecessor(void *db, gint nodeoffset); gint wg_ttree_find_leaf_successor(void *db, gint nodeoffset); #endif gint wg_search_ttree_rightmost(void *db, gint rootoffset, gint key, gint *result, struct wg_tnode *rb_node); gint wg_search_ttree_leftmost(void *db, gint rootoffset, gint key, gint *result, struct wg_tnode *lb_node); gint wg_search_tnode_first(void *db, gint nodeoffset, gint key, gint column); gint wg_search_tnode_last(void *db, gint nodeoffset, gint key, gint column); gint wg_search_hash(void *db, gint index_id, gint *values, gint count); #ifdef USE_INDEX_TEMPLATE gint wg_match_template(void *db, wg_index_template *tmpl, void *rec); #endif gint wg_index_add_field(void *db, void *rec, gint column); gint wg_index_add_rec(void *db, void *rec); gint wg_index_del_field(void *db, void *rec, gint column); gint wg_index_del_rec(void *db, void *rec); #endif /* DEFINED_DBINDEX_H */ whitedb-0.7.2/Db/dbjson.c000066400000000000000000000433351226454622500151360ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Priit Järv 2013 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. 
* * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbjson.c * WhiteDB JSON input and output. */ /* ====== Includes =============== */ #include #include #include /* ====== Private headers and defs ======== */ #ifdef __cplusplus extern "C" { #endif /*#ifdef _WIN32 #define WIN32_LEAN_AND_MEAN #include #include #endif*/ #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "dbdata.h" #include "dbcompare.h" #include "dbschema.h" #include "dbjson.h" #include "dbutil.h" #include "../json/yajl_api.h" #ifdef _WIN32 #define strncpy(d, s, sz) strncpy_s(d, sz+1, s, sz) #define strnlen strnlen_s #endif #ifdef USE_BACKLINKING #if !defined(WG_COMPARE_REC_DEPTH) || (WG_COMPARE_REC_DEPTH < 2) #error WG_COMPARE_REC_DEPTH not defined or too small #else #define MAX_DEPTH WG_COMPARE_REC_DEPTH #endif #else /* !USE_BACKLINKING */ #define MAX_DEPTH 99 /* no reason to limit */ #endif typedef enum { ARRAY, OBJECT } stack_entry_t; struct __stack_entry_elem { gint enc; struct __stack_entry_elem *next; }; typedef struct __stack_entry_elem stack_entry_elem; typedef struct { stack_entry_t type; stack_entry_elem *head; stack_entry_elem *tail; char last_key[80]; int size; } stack_entry; typedef struct { int state; stack_entry stack[MAX_DEPTH]; int stack_ptr; void *db; int isparam; void **document; } parser_context; /* ======= Private protos ================ */ static int push(parser_context *ctx, stack_entry_t type); static int pop(parser_context *ctx); static int add_elem(parser_context *ctx, gint enc); static int add_key(parser_context *ctx, char *key); static int add_literal(parser_context *ctx, gint val); static gint run_json_parser(void *db, char *buf, yajl_callbacks *cb, int isparam, void **document); static int check_push_cb(void* cb_ctx); static int check_pop_cb(void* cb_ctx); static int array_begin_cb(void* cb_ctx); static int array_end_cb(void* cb_ctx); static int object_begin_cb(void* cb_ctx); static int object_end_cb(void* cb_ctx); static int elem_integer_cb(void* cb_ctx, long long intval); static int elem_double_cb(void* cb_ctx, double doubleval); static int object_key_cb(void* cb_ctx, const unsigned char * strval, size_t strl); static int elem_string_cb(void* cb_ctx, const unsigned char * strval, size_t strl); static int pretty_print_json(void *db, FILE *f, void *rec, int indent, int comma, int newline); static gint show_json_error(void *db, char *errmsg); static gint show_json_error_fn(void *db, char *errmsg, char *filename); static gint show_json_error_byte(void *db, char *errmsg, int byte); /* ======== Data ========================= */ yajl_callbacks validate_cb = { NULL, NULL, NULL, NULL, NULL, NULL, check_push_cb, NULL, check_pop_cb, check_push_cb, check_pop_cb }; yajl_callbacks input_cb = { NULL, NULL, elem_integer_cb, elem_double_cb, NULL, elem_string_cb, object_begin_cb, object_key_cb, object_end_cb, array_begin_cb, array_end_cb }; /* ====== Functions ============== */ /** * Parse an input file. Does an initial pass to verify the syntax * of the input and passes it on to the document parser. * XXX: caches the data in memory, so this is very unsuitable * for large files. 
An alternative would be to feed bytes directly * to the document parser and roll the transaction back, if something fails; */ #define WG_JSON_INPUT_CHUNK 16384 gint wg_parse_json_file(void *db, char *filename) { char *buf = NULL; FILE *f = NULL; int count = 0, result = 0, bufsize = 0, depth = 0; yajl_handle hand = NULL; buf = malloc(WG_JSON_INPUT_CHUNK); if(!buf) { return show_json_error(db, "Failed to allocate memory"); } bufsize = WG_JSON_INPUT_CHUNK; if(!filename) { #ifdef _WIN32 printf("reading JSON from stdin, press CTRL-Z and ENTER when done\n"); #else printf("reading JSON from stdin, press CTRL-D when done\n"); #endif fflush(stdout); f = stdin; } else { #ifdef _WIN32 if(fopen_s(&f, filename, "r")) { #else if(!(f = fopen(filename, "r"))) { #endif show_json_error_fn(db, "Failed to open input", filename); result = -1; goto done; } } /* setup parser */ hand = yajl_alloc(&validate_cb, NULL, (void *) &depth); yajl_config(hand, yajl_allow_comments, 1); while(!feof(f)) { int rd = fread((void *) &buf[count], 1, WG_JSON_INPUT_CHUNK, f); if(rd == 0) { if(!feof(f)) { show_json_error_byte(db, "Read error", count); result = -1; } goto done; } if(yajl_parse(hand, (unsigned char *) &buf[count], rd) != yajl_status_ok) { unsigned char *errtxt = yajl_get_error(hand, 1, (unsigned char *) &buf[count], rd); show_json_error(db, (char *) errtxt); yajl_free_error(hand, errtxt); result = -1; goto done; } count += rd; if(count >= bufsize) { void *tmp = realloc(buf, bufsize + WG_JSON_INPUT_CHUNK); if(!tmp) { show_json_error(db, "Failed to allocate additional memory"); result = -1; goto done; } buf = tmp; bufsize += WG_JSON_INPUT_CHUNK; } } if(yajl_complete_parse(hand) != yajl_status_ok) { show_json_error(db, "Syntax error (JSON not properly terminated?)"); result = -1; goto done; } buf[count] = '\0'; result = wg_parse_json_document(db, buf); done: if(buf) free(buf); if(filename && f) fclose(f); if(hand) yajl_free(hand); return result; } /* Parse a JSON buffer. * The data is inserted in database using the JSON schema. * * returns 0 for success. * returns -1 on non-fatal error. * returns -2 if database is left non-consistent due to an error. */ gint wg_parse_json_document(void *db, char *buf) { void *document = NULL; /* ignore */ return run_json_parser(db, buf, &input_cb, 0, &document); } /* Parse a JSON parameter(s). * The data is inserted in database as "special" records. * * returns 0 for success. * returns -1 on non-fatal error. * returns -2 if database is left non-consistent due to an error. */ gint wg_parse_json_param(void *db, char *buf, void **document) { return run_json_parser(db, buf, &input_cb, 1, document); } /* Run JSON parser. * The data is inserted in the database. If there are any errors, the * database will currently remain in an inconsistent state, so beware. * * if isparam is specified, the data will not be indexed nor returned * by wg_get_*_record() calls. * * if the call is successful, *document contains a pointer to the * top-level record. * * returns 0 for success. * returns -1 on non-fatal error. * returns -2 if database is left non-consistent due to an error. 
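 *
 * Callers normally reach this through the public wrappers above; an
 * illustrative example (hypothetical input string, assuming an attached
 * database handle db):
 *
 *   if(wg_parse_json_document(db, "{\"name\": \"test\", \"id\": 1}") < 0)
 *     fprintf(stderr, "JSON import failed\n");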
*/ static gint run_json_parser(void *db, char *buf, yajl_callbacks *cb, int isparam, void **document) { int count = 0, result = 0; yajl_handle hand = NULL; char *iptr = buf; parser_context ctx; /* setup context */ ctx.state = 0; ctx.stack_ptr = -1; ctx.db = db; ctx.isparam = isparam; ctx.document = document; /* setup parser */ hand = yajl_alloc(cb, NULL, (void *) &ctx); yajl_config(hand, yajl_allow_comments, 1); while((count = strnlen(iptr, WG_JSON_INPUT_CHUNK)) > 0) { if(yajl_parse(hand, (unsigned char *) iptr, count) != yajl_status_ok) { show_json_error(db, "JSON parsing failed"); result = -2; /* Fatal error */ goto done; } iptr += count; } if(yajl_complete_parse(hand) != yajl_status_ok) { show_json_error(db, "JSON parsing failed"); result = -2; /* Fatal error */ } done: if(hand) yajl_free(hand); return result; } static int check_push_cb(void* cb_ctx) { int *depth = (int *) cb_ctx; if(++(*depth) >= MAX_DEPTH) { return 0; } return 1; } static int check_pop_cb(void* cb_ctx) { int *depth = (int *) cb_ctx; --(*depth); return 1; } /** * Push an object or an array on the stack. */ static int push(parser_context *ctx, stack_entry_t type) { stack_entry *e; if(++ctx->stack_ptr >= MAX_DEPTH) /* paranoia, parser guards from this */ return 0; e = &ctx->stack[ctx->stack_ptr]; e->size = 0; e->type = type; e->head = NULL; e->tail = NULL; return 1; } /** * Pop an object or an array from the stack. * If this is not the top level in the document, the object is also added * as an element on the previous level. */ static int pop(parser_context *ctx) { stack_entry *e; void *rec; int ret, istoplevel; if(ctx->stack_ptr < 0) return 0; e = &ctx->stack[ctx->stack_ptr--]; /* is it a top level object? */ if(ctx->stack_ptr < 0) { istoplevel = 1; } else { istoplevel = 0; } if(e->type == ARRAY) { rec = wg_create_array(ctx->db, e->size, istoplevel, ctx->isparam); } else { rec = wg_create_object(ctx->db, e->size, istoplevel, ctx->isparam); } /* add elements to the database */ if(rec) { stack_entry_elem *curr = e->head; int i = 0; ret = 1; while(curr) { if(wg_set_field(ctx->db, rec, i++, curr->enc)) { ret = 0; break; } curr = curr->next; } if(istoplevel) *(ctx->document) = rec; } else { ret = 0; } /* free the elements */ while(e->head) { stack_entry_elem *tmp = e->head; e->head = e->head->next; free(tmp); } e->tail = NULL; e->size = 0; /* is it an element of something? */ if(!istoplevel && rec && ret) { gint enc = wg_encode_record(ctx->db, rec); ret = add_literal(ctx, enc); } return ret; } /** * Append an element to the current stack entry. */ static int add_elem(parser_context *ctx, gint enc) { stack_entry *e; stack_entry_elem *tmp; if(ctx->stack_ptr < 0 || ctx->stack_ptr >= MAX_DEPTH) return 0; /* paranoia */ e = &ctx->stack[ctx->stack_ptr]; tmp = (stack_entry_elem *) malloc(sizeof(stack_entry_elem)); if(!tmp) return 0; if(!e->tail) { e->head = tmp; } else { e->tail->next = tmp; } e->tail = tmp; e->size++; tmp->enc = enc; tmp->next = NULL; return 1; } /** * Store a key in the current stack entry. */ static int add_key(parser_context *ctx, char *key) { stack_entry *e; if(ctx->stack_ptr < 0 || ctx->stack_ptr >= MAX_DEPTH) return 0; /* paranoia */ e = &ctx->stack[ctx->stack_ptr]; strncpy(e->last_key, key, 80); e->last_key[79] = '\0'; return 1; } /** * Add a literal value. If it's inside an object, generate * a key-value pair using the last key. Otherwise insert * it directly. 
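 * (E.g. for the pair "a": 1 inside an object, the key "a" is encoded as
 *  a string, a key-value pair record is built with wg_create_kvpair(),
 *  and the encoded pair record is appended as the element.)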
*/ static int add_literal(parser_context *ctx, gint val) { stack_entry *e; if(ctx->stack_ptr < 0 || ctx->stack_ptr >= MAX_DEPTH) return 0; /* paranoia */ e = &ctx->stack[ctx->stack_ptr]; if(e->type == ARRAY) { return add_elem(ctx, val); } else { void *rec; gint key = wg_encode_str(ctx->db, e->last_key, NULL); if(key == WG_ILLEGAL) return 0; rec = wg_create_kvpair(ctx->db, key, val, ctx->isparam); if(!rec) return 0; return add_elem(ctx, wg_encode_record(ctx->db, rec)); } } #define OUT_INDENT(x,i,f) \ for(i=0; istack_ptr+1, i, stdout) printf("BEGIN ARRAY\n");*/ if(!push(ctx, ARRAY)) return 0; return 1; } static int array_end_cb(void* cb_ctx) { /* int i;*/ parser_context *ctx = (parser_context *) cb_ctx; if(!pop(ctx)) return 0; /* OUT_INDENT(ctx->stack_ptr+1, i, stdout) printf("END ARRAY\n");*/ return 1; } static int object_begin_cb(void* cb_ctx) { /* int i;*/ parser_context *ctx = (parser_context *) cb_ctx; /* OUT_INDENT(ctx->stack_ptr+1, i, stdout) printf("BEGIN object\n");*/ if(!push(ctx, OBJECT)) return 0; return 1; } static int object_end_cb(void* cb_ctx) { /* int i;*/ parser_context *ctx = (parser_context *) cb_ctx; if(!pop(ctx)) return 0; /* OUT_INDENT(ctx->stack_ptr+1, i, stdout) printf("END object\n");*/ return 1; } static int elem_integer_cb(void* cb_ctx, long long intval) { /* int i;*/ gint val; parser_context *ctx = (parser_context *) cb_ctx; val = wg_encode_int(ctx->db, (gint) intval); if(val == WG_ILLEGAL) return 0; if(!add_literal(ctx, val)) return 0; /* OUT_INDENT(ctx->stack_ptr+1, i, stdout) printf("INTEGER: %d\n", (int) intval);*/ return 1; } static int elem_double_cb(void* cb_ctx, double doubleval) { /* int i;*/ gint val; parser_context *ctx = (parser_context *) cb_ctx; val = wg_encode_double(ctx->db, doubleval); if(val == WG_ILLEGAL) return 0; if(!add_literal(ctx, val)) return 0; /* OUT_INDENT(ctx->stack_ptr+1, i, stdout) printf("FLOAT: %.6f\n", doubleval);*/ return 1; } static int object_key_cb(void* cb_ctx, const unsigned char * strval, size_t strl) { /* int i;*/ int res = 1; parser_context *ctx = (parser_context *) cb_ctx; char *buf = malloc(strl + 1); if(!buf) return 0; strncpy(buf, (char *) strval, strl); buf[strl] = '\0'; if(!add_key(ctx, buf)) { res = 0; } /* OUT_INDENT(ctx->stack_ptr+1, i, stdout) printf("KEY: %s\n", buf);*/ free(buf); return res; } static int elem_string_cb(void* cb_ctx, const unsigned char * strval, size_t strl) { /* int i;*/ int res = 1; gint val; parser_context *ctx = (parser_context *) cb_ctx; char *buf = malloc(strl + 1); if(!buf) return 0; strncpy(buf, (char *) strval, strl); buf[strl] = '\0'; val = wg_encode_str(ctx->db, buf, NULL); if(val == WG_ILLEGAL) { res = 0; } else if(!add_literal(ctx, val)) { res = 0; } /* OUT_INDENT(ctx->stack_ptr+1, i, stdout) printf("STRING: %s\n", buf);*/ free(buf); return res; } /* * Print a JSON document into the given stream. */ void wg_print_json_document(void *db, FILE *f, void *document) { if(!is_schema_document(document)) { /* Paranoia check. This increases the probability we're dealing * with records belonging to a proper schema. Omitting this check * would allow printing parts of documents as well. */ show_json_error(db, "Given record is not a document"); return; } pretty_print_json(db, f, document, 0, 0, 1); } /* * Recursively print JSON elements (using the JSON schema) * Returns 0 on success * Returns -1 on error. 
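 *
 * Callers use the public entry point above; a minimal sketch
 * (illustrative, assumes `doc` points to a top-level document record,
 * e.g. the pointer filled in by wg_parse_json_param(), and `db` is a
 * valid handle):
 *
 *   wg_print_json_document(db, stdout, doc);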
*/ static int pretty_print_json(void *db, FILE *f, void *rec, int indent, int comma, int newline) { if(is_schema_object(rec)) { gint i, reclen; /*OUT_INDENT(indent, i, f);*/ fprintf(f, "%s{\n", (comma ? "," : "")); reclen = wg_get_record_len(db, rec); for(i=0; i. * */ /** @file dbjson.h * Public headers for JSON I/O. */ #ifndef DEFINED_DBJSON_H #define DEFINED_DBJSON_H /* ====== data structures ======== */ /* ==== Protos ==== */ gint wg_parse_json_file(void *db, char *filename); gint wg_parse_json_document(void *db, char *buf); gint wg_parse_json_param(void *db, char *buf, void **document); void wg_print_json_document(void *db, FILE *f, void *document); #endif /* DEFINED_DBJSON_H */ whitedb-0.7.2/Db/dblock.c000066400000000000000000001013461226454622500151120ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Priit Järv 2009, 2010, 2011, 2013 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dblock.c * Concurrent access support for WhiteDB memory database * * Note: this file contains compiler and target-specific code. * For compiling on plaforms that do not have support for * specific opcodes needed for atomic operations and spinlocks, * locking may be disabled by ./configure --disable-locking * or by editing the appropriate config-xxx.h file. This will * allow the code to compile, but concurrent access will NOT * work. */ /* ====== Includes =============== */ #include #ifdef _WIN32 #define WIN32_LEAN_AND_MEAN #include #else #include #include #endif #ifdef __cplusplus extern "C" { #endif #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "dballoc.h" #include "dblock.h" #if (LOCK_PROTO==TFQUEUE) #ifdef __linux__ #include #include #include #include #endif #endif /* ====== Private headers and defs ======== */ #define compare_and_swap wg_compare_and_swap // wg_ prefix used in dblock.h, non-wg below #ifndef LOCK_PROTO #define DUMMY_ATOMIC_OPS /* allow compilation on unsupported platforms */ #endif #if (LOCK_PROTO==RPSPIN) || (LOCK_PROTO==WPSPIN) #define WAFLAG 0x1 /* writer active flag */ #define RC_INCR 0x2 /* increment step for reader count */ #else /* classes of locks. */ #define LOCKQ_READ 0x02 #define LOCKQ_WRITE 0x04 #endif /* Macro to emit Pentium 4 "pause" instruction. */ #if !defined(LOCK_PROTO) #define MM_PAUSE #elif defined(__GNUC__) #if defined(__i686__) || defined(__amd64__) /* assume SSE2 support */ #define MM_PAUSE {\ __asm__ __volatile__("pause;\n");\ } #else #define MM_PAUSE #endif #elif defined(_WIN32) #include #define MM_PAUSE { _mm_pause(); } #endif /* Helper function for implementing atomic operations * with gcc 4.3 / ARM EABI by Julian Brown. * This works on Linux ONLY. 
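 * The helper lives at the fixed kuser address 0xffff0fc0 and returns 0
 * only when *ptr still held oldval and was replaced by newval; callers
 * below therefore retry in a loop, e.g.
 *
 *   do {
 *     tmp = *ptr;
 *     failure = kernel_cmpxchg(tmp, tmp + incr, (int *) ptr);
 *   } while (failure != 0);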
*/ #if defined(__ARM_EABI__) && defined(__linux__) typedef int (kernel_cmpxchg_t) (int oldval, int newval, int *ptr); #define kernel_cmpxchg (*(kernel_cmpxchg_t *) 0xffff0fc0) #endif /* For easier testing of GCC version */ #ifdef __GNUC__ #define GCC_VERSION (__GNUC__ * 10000 \ + __GNUC_MINOR__ * 100 \ + __GNUC_PATCHLEVEL__) #endif /* Spinlock timings * SPIN_COUNT: how many cycles until CPU is yielded * SLEEP_MSEC and SLEEP_NSEC: increment of wait time after each cycle */ #ifdef _WIN32 #define SPIN_COUNT 100000 /* Windows scheduling seems to force this */ #define SLEEP_MSEC 1 /* minimum resolution is 1 millisecond */ #else #define SPIN_COUNT 500 /* shorter spins perform better with Linux */ #define SLEEP_NSEC 500000 /* 500 microseconds */ #endif #ifdef _WIN32 /* XXX: quick hack for MSVC. Should probably find a cleaner solution */ #define inline __inline #endif #ifdef _WIN32 #define INIT_SPIN_TIMEOUT(t) #else /* timings are in nsec */ #define INIT_SPIN_TIMEOUT(t) \ if(t > INT_MAX/1000000) /* hack: primitive overflow protection */ \ t = INT_MAX; \ else \ t *= 1000000; #endif #ifdef _WIN32 #define UPDATE_SPIN_TIMEOUT(t, ts) t -= ts; #else #define UPDATE_SPIN_TIMEOUT(t, ts) t -= ts.tv_nsec; #endif #define INIT_QLOCK_TIMEOUT(t, ts) \ ts.tv_sec = t / 1000; \ ts.tv_nsec = t % 1000; #define ALLOC_LOCK(d, l) \ l = alloc_lock(d); \ if(!l) { \ unlock_queue(d); \ show_lock_error(d, "Failed to allocate lock"); \ return 0; \ } #define DEQUEUE_LOCK(d, dbh, l, lp) \ if(lp->prev) { \ lock_queue_node *pp = offsettoptr(d, lp->prev); \ pp->next = lp->next; \ } \ if(lp->next) { \ lock_queue_node *np = offsettoptr(d, lp->next); \ np->prev = lp->prev; \ } else if(dbh->locks.tail == l) { \ dbh->locks.tail = lp->prev; \ } /* ======= Private protos ================ */ static inline void atomic_increment(volatile gint *ptr, gint incr); static inline void atomic_and(volatile gint *ptr, gint val); static inline gint fetch_and_add(volatile gint *ptr, gint incr); static inline gint fetch_and_store(volatile gint *ptr, gint val); // static inline gint compare_and_swap(volatile gint *ptr, gint oldv, gint newv); #if (LOCK_PROTO==TFQUEUE) static inline gint alloc_lock(void * db); static inline void free_lock(void * db, gint node); /*static gint deref_link(void *db, volatile gint *link);*/ #ifdef __linux__ static inline void futex_wait(volatile gint *addr1, int val1); static inline int futex_trywait(volatile gint *addr1, int val1, struct timespec *timeout); static inline void futex_wake(volatile gint *addr1, int val1); #endif #endif static gint show_lock_error(void *db, char *errmsg); /* ====== Functions ============== */ /* -------------- helper functions -------------- */ /* * System- and platform-dependent atomic operations */ /** Atomic increment. On x86 platform, this is internally * the same as fetch_and_add(). */ static inline void atomic_increment(volatile gint *ptr, gint incr) { #if defined(DUMMY_ATOMIC_OPS) *ptr += incr; #elif defined(__GNUC__) #if defined(_MIPS_ARCH) gint tmp1, tmp2; /* XXX: any way to get rid of these? 
*/ __asm__ __volatile__( ".set noreorder\n\t" "1: ll %0,%4\n\t" /* load old */ "add %1,%0,%3\n\t" /* compute tmp2=tmp1+incr */ "sc %1,%2\n\t" /* store new */ "beqz %1,1b\n\t" /* SC failed, retry */ "sync\n\t" ".set reorder\n\t" : "=&r" (tmp1), "=&r" (tmp2), "=m" (*ptr) : "r" (incr), "m" (*ptr) : "memory"); #elif (GCC_VERSION < 40400) && defined(__ARM_EABI__) && defined(__linux__) gint failure, tmp; do { tmp = *ptr; failure = kernel_cmpxchg(tmp, tmp + incr, (int *) ptr); } while (failure != 0); #else /* try gcc intrinsic */ __sync_fetch_and_add(ptr, incr); #endif #elif defined(_WIN32) _InterlockedExchangeAdd(ptr, incr); #else #error Atomic operations not implemented for this compiler #endif } /** Atomic AND operation. */ static inline void atomic_and(volatile gint *ptr, gint val) { #if defined(DUMMY_ATOMIC_OPS) *ptr &= val; #elif defined(__GNUC__) #if defined(_MIPS_ARCH) gint tmp1, tmp2; __asm__ __volatile__( ".set noreorder\n\t" "1: ll %0,%4\n\t" /* load old */ "and %1,%0,%3\n\t" /* compute tmp2=tmp1 & val; */ "sc %1,%2\n\t" /* store new */ "beqz %1,1b\n\t" /* SC failed, retry */ "sync\n\t" ".set reorder\n\t" : "=&r" (tmp1), "=&r" (tmp2), "=m" (*ptr) : "r" (val), "m" (*ptr) : "memory"); #elif (GCC_VERSION < 40400) && defined(__ARM_EABI__) && defined(__linux__) gint failure, tmp; do { tmp = *ptr; failure = kernel_cmpxchg(tmp, tmp & val, (int *) ptr); } while (failure != 0); #else /* try gcc intrinsic */ __sync_fetch_and_and(ptr, val); #endif #elif defined(_WIN32) _InterlockedAnd(ptr, val); #else #error Atomic operations not implemented for this compiler #endif } /** Atomic OR operation. */ static inline void atomic_or(volatile gint *ptr, gint val) { #if defined(DUMMY_ATOMIC_OPS) *ptr |= val; #elif defined(__GNUC__) #if defined(_MIPS_ARCH) gint tmp1, tmp2; __asm__ __volatile__( ".set noreorder\n\t" "1: ll %0,%4\n\t" /* load old */ "or %1,%0,%3\n\t" /* compute tmp2=tmp1 | val; */ "sc %1,%2\n\t" /* store new */ "beqz %1,1b\n\t" /* SC failed, retry */ "sync\n\t" ".set reorder\n\t" : "=&r" (tmp1), "=&r" (tmp2), "=m" (*ptr) : "r" (val), "m" (*ptr) : "memory"); #elif (GCC_VERSION < 40400) && defined(__ARM_EABI__) && defined(__linux__) gint failure, tmp; do { tmp = *ptr; failure = kernel_cmpxchg(tmp, tmp | val, (int *) ptr); } while (failure != 0); #else /* try gcc intrinsic */ __sync_fetch_and_or(ptr, val); #endif #elif defined(_WIN32) _InterlockedOr(ptr, val); #else #error Atomic operations not implemented for this compiler #endif } /** Fetch and (dec|inc)rement. Returns value before modification. */ static inline gint fetch_and_add(volatile gint *ptr, gint incr) { #if defined(DUMMY_ATOMIC_OPS) gint tmp = *ptr; *ptr += incr; return tmp; #elif defined(__GNUC__) #if defined(_MIPS_ARCH) gint ret, tmp; __asm__ __volatile__( ".set noreorder\n\t" "1: ll %0,%4\n\t" /* load old */ "add %1,%0,%3\n\t" /* compute tmp=ret+incr */ "sc %1,%2\n\t" /* store new */ "beqz %1,1b\n\t" /* SC failed, retry */ "sync\n\t" ".set reorder\n\t" : "=&r" (ret), "=&r" (tmp), "=m" (*ptr) : "r" (incr), "m" (*ptr) : "memory"); return ret; #elif (GCC_VERSION < 40400) && defined(__ARM_EABI__) && defined(__linux__) gint failure, tmp; do { tmp = *ptr; failure = kernel_cmpxchg(tmp, tmp + incr, (int *) ptr); } while (failure != 0); return tmp; #else /* try gcc intrinsic */ return __sync_fetch_and_add(ptr, incr); #endif #elif defined(_WIN32) return _InterlockedExchangeAdd(ptr, incr); #else #error Atomic operations not implemented for this compiler #endif } /** Atomic fetch and store. Swaps two values. 
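 * Returns the contents of *ptr as they were before the store, e.g.
 * (illustrative):
 *
 *   gint old = fetch_and_store(&flag, 1);  - afterwards flag is 1 and
 *                                            old holds its prior value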
*/ static inline gint fetch_and_store(volatile gint *ptr, gint val) { /* Despite the name, the GCC builtin should just * issue XCHG operation. There is no testing of * anything, just lock the bus and swap the values, * as per Intel's opcode reference. * * XXX: not available on all compiler targets :-( */ #if defined(DUMMY_ATOMIC_OPS) gint tmp = *ptr; *ptr = val; return tmp; #elif defined(__GNUC__) #if defined(_MIPS_ARCH) gint ret, tmp; __asm__ __volatile__( ".set noreorder\n\t" "1: ll %0,%4\n\t" /* load old */ "move %1,%3\n\t" "sc %1,%2\n\t" /* store new */ "beqz %1,1b\n\t" /* SC failed, retry */ "sync\n\t" ".set reorder\n\t" : "=&r" (ret), "=&r" (tmp), "=m" (*ptr) : "r" (val), "m" (*ptr) : "memory"); return ret; #elif (GCC_VERSION < 40400) && defined(__ARM_EABI__) && defined(__linux__) gint failure, oldval; do { oldval = *ptr; failure = kernel_cmpxchg(oldval, val, (int *) ptr); } while (failure != 0); return oldval; #else /* try gcc intrinsic */ return __sync_lock_test_and_set(ptr, val); #endif #elif defined(_WIN32) return _InterlockedExchange(ptr, val); #else #error Atomic operations not implemented for this compiler #endif } /** Compare and swap. If value at ptr equals old, set it to * new and return 1. Otherwise the function returns 0. */ inline gint wg_compare_and_swap(volatile gint *ptr, gint oldv, gint newv) { #if defined(DUMMY_ATOMIC_OPS) if(*ptr == oldv) { *ptr = newv; return 1; } return 0; #elif defined(__GNUC__) #if defined(_MIPS_ARCH) gint ret; __asm__ __volatile__( ".set noreorder\n\t" "1: ll %0,%4\n\t" "bne %0,%2,2f\n\t" /* *ptr!=oldv, return *ptr */ "move %0,%3\n\t" "sc %0,%1\n\t" "beqz %0,1b\n\t" /* SC failed, retry */ "move %0,%2\n\t" /* return oldv (*ptr==newv now) */ "2: sync\n\t" ".set reorder\n\t" : "=&r" (ret), "=m" (*ptr) : "r" (oldv), "r" (newv), "m" (*ptr) : "memory"); return ret == oldv; #elif (GCC_VERSION < 40400) && defined(__ARM_EABI__) && defined(__linux__) gint failure = kernel_cmpxchg(oldv, newv, (int *) ptr); return (failure == 0); #else /* try gcc intrinsic */ return __sync_bool_compare_and_swap(ptr, oldv, newv); #endif #elif defined(_WIN32) return (_InterlockedCompareExchange(ptr, newv, oldv) == oldv); #else #error Atomic operations not implemented for this compiler #endif } /* ----------- read and write transaction support ----------- */ /* * Read and write transactions are currently realized using database * level locking. The rest of the db API is implemented independently - * therefore use of the locking routines does not automatically guarantee * isolation, rather, all of the concurrently accessing clients are expected * to follow the same protocol. */ /** Start write transaction * Current implementation: acquire database level exclusive lock */ gint wg_start_write(void * db) { return db_wlock(db, DEFAULT_LOCK_TIMEOUT); } /** End write transaction * Current implementation: release database level exclusive lock */ gint wg_end_write(void * db, gint lock) { return db_wulock(db, lock); } /** Start read transaction * Current implementation: acquire database level shared lock */ gint wg_start_read(void * db) { return db_rlock(db, DEFAULT_LOCK_TIMEOUT); } /** End read transaction * Current implementation: release database level shared lock */ gint wg_end_read(void * db, gint lock) { return db_rulock(db, lock); } /* * The following functions implement a giant shared/exclusive * lock on the database. * * Algorithms used for locking: * * 1. Simple reader-preference lock using a single global sync * variable (described by Mellor-Crummey & Scott '92). * 2. 
A writer-preference spinlock based on the above. * 3. A task-fair lock implemented using a queue. Similar to * the queue-based MCS rwlock, but uses futexes to synchronize * the waiting processes. */ #if (LOCK_PROTO==RPSPIN) /** Acquire database level exclusive lock (reader-preference spinlock) * Blocks until lock is acquired. * If USE_LOCK_TIMEOUT is defined, may return without locking */ #ifdef USE_LOCK_TIMEOUT gint db_rpspin_wlock(void * db, gint timeout) { #else gint db_rpspin_wlock(void * db) { #endif int i; #ifdef _WIN32 int ts; #else struct timespec ts; #endif volatile gint *gl; #ifdef CHECK if (!dbcheck(db)) { show_lock_error(db, "Invalid database pointer in db_wlock"); return 0; } #endif gl = (gint *) offsettoptr(db, dbmemsegh(db)->locks.global_lock); /* First attempt at getting the lock without spinning */ if(compare_and_swap(gl, 0, WAFLAG)) return 1; #ifdef _WIN32 ts = SLEEP_MSEC; #else ts.tv_sec = 0; ts.tv_nsec = SLEEP_NSEC; #endif #ifdef USE_LOCK_TIMEOUT INIT_SPIN_TIMEOUT(timeout) #endif /* Spin loop */ for(;;) { for(i=0; ilocks.global_lock); /* Clear the writer active flag */ atomic_and(gl, ~(WAFLAG)); return 1; } /** Acquire database level shared lock (reader-preference spinlock) * Increments reader count, blocks until there are no active * writers. * If USE_LOCK_TIMEOUT is defined, may return without locking. */ #ifdef USE_LOCK_TIMEOUT gint db_rpspin_rlock(void * db, gint timeout) { #else gint db_rpspin_rlock(void * db) { #endif int i; #ifdef _WIN32 int ts; #else struct timespec ts; #endif volatile gint *gl; #ifdef CHECK if (!dbcheck(db)) { show_lock_error(db, "Invalid database pointer in db_rlock"); return 0; } #endif gl = (gint *) offsettoptr(db, dbmemsegh(db)->locks.global_lock); /* Increment reader count atomically */ fetch_and_add(gl, RC_INCR); /* Try getting the lock without pause */ if(!((*gl) & WAFLAG)) return 1; #ifdef _WIN32 ts = SLEEP_MSEC; #else ts.tv_sec = 0; ts.tv_nsec = SLEEP_NSEC; #endif #ifdef USE_LOCK_TIMEOUT INIT_SPIN_TIMEOUT(timeout) #endif /* Spin loop */ for(;;) { for(i=0; ilocks.global_lock); /* Decrement reader count */ fetch_and_add(gl, -RC_INCR); return 1; } #elif (LOCK_PROTO==WPSPIN) /** Acquire database level exclusive lock (writer-preference spinlock) * Blocks until lock is acquired. */ #ifdef USE_LOCK_TIMEOUT gint db_wpspin_wlock(void * db, gint timeout) { #else gint db_wpspin_wlock(void * db) { #endif int i; #ifdef _WIN32 int ts; #else struct timespec ts; #endif volatile gint *gl, *w; #ifdef CHECK if (!dbcheck(db)) { show_lock_error(db, "Invalid database pointer in db_wlock"); return 0; } #endif gl = (gint *) offsettoptr(db, dbmemsegh(db)->locks.global_lock); w = (gint *) offsettoptr(db, dbmemsegh(db)->locks.writers); /* Let the readers know a writer is present */ atomic_increment(w, 1); /* First attempt at getting the lock without spinning */ if(compare_and_swap(gl, 0, WAFLAG)) return 1; #ifdef _WIN32 ts = SLEEP_MSEC; #else ts.tv_sec = 0; ts.tv_nsec = SLEEP_NSEC; #endif #ifdef USE_LOCK_TIMEOUT INIT_SPIN_TIMEOUT(timeout) #endif /* Spin loop */ for(;;) { for(i=0; ilocks.global_lock); w = (gint *) offsettoptr(db, dbmemsegh(db)->locks.writers); /* Clear the writer active flag */ atomic_and(gl, ~(WAFLAG)); /* writers-- */ atomic_increment(w, -1); return 1; } /** Acquire database level shared lock (writer-preference spinlock) * Blocks until there are no active or waiting writers, then increments * reader count atomically. 
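 *
 * Applications normally reach these routines through the transaction
 * wrappers defined above; a minimal sketch (illustrative, assumes a
 * valid handle `db`):
 *
 *   gint lock = wg_start_read(db);
 *   if(lock) {
 *     .. read records ..
 *     wg_end_read(db, lock);
 *   }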
*/ #ifdef USE_LOCK_TIMEOUT gint db_wpspin_rlock(void * db, gint timeout) { #else gint db_wpspin_rlock(void * db) { #endif int i; #ifdef _WIN32 int ts; #else struct timespec ts; #endif volatile gint *gl, *w; #ifdef CHECK if (!dbcheck(db)) { show_lock_error(db, "Invalid database pointer in db_rlock"); return 0; } #endif gl = (gint *) offsettoptr(db, dbmemsegh(db)->locks.global_lock); w = (gint *) offsettoptr(db, dbmemsegh(db)->locks.writers); /* Try locking without spinning */ if(!(*w)) { gint readers = (*gl) & ~WAFLAG; if(compare_and_swap(gl, readers, readers + RC_INCR)) return 1; } #ifdef USE_LOCK_TIMEOUT INIT_SPIN_TIMEOUT(timeout) #endif for(;;) { #ifdef _WIN32 ts = SLEEP_MSEC; #else ts.tv_sec = 0; ts.tv_nsec = SLEEP_NSEC; #endif /* Spin-wait until writers disappear */ while(*w) { for(i=0; ilocks.global_lock); /* Decrement reader count */ atomic_increment(gl, -RC_INCR); return 1; } #elif (LOCK_PROTO==TFQUEUE) /** Acquire the queue mutex. */ static void lock_queue(void * db) { int i; #ifdef _WIN32 int ts; #else struct timespec ts; #endif volatile gint *gl; /* skip the database pointer check, this function is not called directly */ gl = (gint *) offsettoptr(db, dbmemsegh(db)->locks.queue_lock); /* First attempt at getting the lock without spinning */ if(compare_and_swap(gl, 0, 1)) return; #ifdef _WIN32 ts = SLEEP_MSEC; #else ts.tv_sec = 0; ts.tv_nsec = SLEEP_NSEC; #endif /* Spin loop */ for(;;) { for(i=0; ilocks.queue_lock); *gl = 0; } /** Acquire database level exclusive lock (task-fair queued lock) * Blocks until lock is acquired. * If USE_LOCK_TIMEOUT is defined, may return without locking */ #ifdef USE_LOCK_TIMEOUT gint db_tfqueue_wlock(void * db, gint timeout) { #else gint db_tfqueue_wlock(void * db) { #endif #ifdef _WIN32 int ts; #else struct timespec ts; #endif gint lock, prev; lock_queue_node *lockp; db_memsegment_header* dbh; #ifdef CHECK if (!dbcheck(db)) { show_lock_error(db, "Invalid database pointer in db_wlock"); return 0; } #endif dbh = dbmemsegh(db); lock_queue(db); ALLOC_LOCK(db, lock) prev = dbh->locks.tail; dbh->locks.tail = lock; lockp = (lock_queue_node *) offsettoptr(db, lock); lockp->class = LOCKQ_WRITE; lockp->prev = prev; lockp->next = 0; if(prev) { lock_queue_node *prevp = offsettoptr(db, prev); prevp->next = lock; lockp->waiting = 1; } else { lockp->waiting = 0; } unlock_queue(db); if(lockp->waiting) { #ifdef __linux__ #ifdef USE_LOCK_TIMEOUT INIT_QLOCK_TIMEOUT(timeout, ts) if(futex_trywait(&lockp->waiting, 1, &ts) == ETIMEDOUT) { lock_queue(db); DEQUEUE_LOCK(db, dbh, lock, lockp) free_lock(db, lock); unlock_queue(db); return 0; } #else futex_wait(&lockp->waiting, 1); #endif #else /* XXX: add support for other platforms */ #error This code needs Linux SYS_futex service to function #endif } return lock; } /** Release database level exclusive lock (task-fair queued lock) */ gint db_tfqueue_wulock(void * db, gint lock) { lock_queue_node *lockp; db_memsegment_header* dbh; volatile gint *syn_addr = NULL; #ifdef CHECK if (!dbcheck(db)) { show_lock_error(db, "Invalid database pointer in db_wulock"); return 0; } #endif dbh = dbmemsegh(db); lockp = (lock_queue_node *) offsettoptr(db, lock); lock_queue(db); if(lockp->next) { lock_queue_node *nextp = offsettoptr(db, lockp->next); nextp->waiting = 0; nextp->prev = 0; /* we're a writer lock, head of the queue */ syn_addr = &nextp->waiting; } else if(dbh->locks.tail == lock) { dbh->locks.tail = 0; } free_lock(db, lock); unlock_queue(db); if(syn_addr) { #ifdef __linux__ futex_wake(syn_addr, 1); #else /* XXX: add support for 
other platforms */ #error This code needs Linux SYS_futex service to function #endif } return 1; } /** Acquire database level shared lock (task-fair queued lock) * If USE_LOCK_TIMEOUT is defined, may return without locking. */ #ifdef USE_LOCK_TIMEOUT gint db_tfqueue_rlock(void * db, gint timeout) { #else gint db_tfqueue_rlock(void * db) { #endif #ifdef _WIN32 int ts; #else struct timespec ts; #endif gint lock, prev; lock_queue_node *lockp; db_memsegment_header* dbh; #ifdef CHECK if (!dbcheck(db)) { show_lock_error(db, "Invalid database pointer in db_rlock"); return 0; } #endif dbh = dbmemsegh(db); lock_queue(db); ALLOC_LOCK(db, lock) prev = dbh->locks.tail; dbh->locks.tail = lock; lockp = (lock_queue_node *) offsettoptr(db, lock); lockp->class = LOCKQ_READ; lockp->prev = prev; lockp->next = 0; if(prev) { lock_queue_node *prevp = (lock_queue_node *) offsettoptr(db, prev); prevp->next = lock; if(prevp->class == LOCKQ_READ && prevp->waiting == 0) { lockp->waiting = 0; } else { lockp->waiting = 1; } } else { lockp->waiting = 0; } unlock_queue(db); if(lockp->waiting) { volatile gint *syn_addr = NULL; #ifdef __linux__ #ifdef USE_LOCK_TIMEOUT INIT_QLOCK_TIMEOUT(timeout, ts) if(futex_trywait(&lockp->waiting, 1, &ts) == ETIMEDOUT) { lock_queue(db); DEQUEUE_LOCK(db, dbh, lock, lockp) free_lock(db, lock); unlock_queue(db); return 0; } #else futex_wait(&lockp->waiting, 1); #endif #else /* XXX: add support for other platforms */ #error This code needs Linux SYS_futex service to function #endif lock_queue(db); if(lockp->next) { lock_queue_node *nextp = offsettoptr(db, lockp->next); if(nextp->class == LOCKQ_READ && nextp->waiting) { nextp->waiting = 0; syn_addr = &nextp->waiting; } } unlock_queue(db); if(syn_addr) { #ifdef __linux__ futex_wake(syn_addr, 1); #else /* XXX: add support for other platforms */ #error This code needs Linux SYS_futex service to function #endif } } return lock; } /** Release database level shared lock (task-fair queued lock) */ gint db_tfqueue_rulock(void * db, gint lock) { lock_queue_node *lockp; db_memsegment_header* dbh; volatile gint *syn_addr = NULL; #ifdef CHECK if (!dbcheck(db)) { show_lock_error(db, "Invalid database pointer in db_rulock"); return 0; } #endif dbh = dbmemsegh(db); lockp = (lock_queue_node *) offsettoptr(db, lock); lock_queue(db); if(lockp->prev) { lock_queue_node *prevp = offsettoptr(db, lockp->prev); prevp->next = lockp->next; } if(lockp->next) { lock_queue_node *nextp = offsettoptr(db, lockp->next); nextp->prev = lockp->prev; if(nextp->waiting && (!lockp->prev || nextp->class == LOCKQ_READ)) { nextp->waiting = 0; syn_addr = &nextp->waiting; } } else if(dbh->locks.tail == lock) { dbh->locks.tail = lockp->prev; } free_lock(db, lock); unlock_queue(db); if(syn_addr) { #ifdef __linux__ futex_wake(syn_addr, 1); #else /* XXX: add support for other platforms */ #error This code needs Linux SYS_futex service to function #endif } return 1; } #endif /* LOCK_PROTO */ /** Initialize locking subsystem. * Not parallel-safe, so should be run during database init. * * Note that this function is called even if locking is disabled. 
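 *
 * Under the TFQUEUE protocol this also rebuilds the freelist of lock
 * queue cells: cells are laid out back to back, SYN_VAR_PADDING bytes
 * apart starting at locks.storage, each cell's next_cell pointing to
 * the following cell and the last one set to 0, with locks.freelist
 * left pointing at the first cell.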
*/ gint wg_init_locks(void * db) { #if (LOCK_PROTO==TFQUEUE) gint i, chunk_wall; lock_queue_node *tmp = NULL; #endif db_memsegment_header* dbh; #ifdef CHECK if (!dbcheck(db) && !dbcheckinit(db)) { show_lock_error(db, "Invalid database pointer in wg_init_locks"); return -1; } #endif dbh = dbmemsegh(db); #if (LOCK_PROTO==TFQUEUE) chunk_wall = dbh->locks.storage + dbh->locks.max_nodes*SYN_VAR_PADDING; for(i=dbh->locks.storage; inext_cell = i; /* offset of next cell */ } tmp->next_cell=0; /* last node */ /* top of the stack points to first cell in chunk */ dbh->locks.freelist = dbh->locks.storage; /* reset the state */ dbh->locks.tail = 0; /* 0 is considered invalid offset==>no value */ dbstore(db, dbh->locks.queue_lock, 0); #else dbstore(db, dbh->locks.global_lock, 0); dbstore(db, dbh->locks.writers, 0); #endif return 0; } #if (LOCK_PROTO==TFQUEUE) /* ---------- memory management for queued locks ---------- */ /* * Queued locks algorithm assumes allocating memory cells * for each lock. These cells need to be memory-aligned to * allow spinlocks run locally, but more importantly, allocation * and freeing of the cells has to be implemented in a lock-free * manner. * * The method used in the initial implementation is freelist * with reference counts (generally described by Valois '95, * actual code is based on examples from * http://www.non-blocking.com/Eng/services-technologies_non-blocking-lock-free.htm) * * XXX: code untested currently * XXX: Mellor-Crummey & Scott algorithm possibly does not need * refcounts. If so, they should be #ifdef-ed out, but * kept for possible future expansion. */ /** Allocate memory cell for a lock. * Used internally only, so we assume the passed db pointer * is already validated. * * Returns offset to allocated cell. */ #if 0 static inline gint alloc_lock(void * db) { db_memsegment_header* dbh = dbmemsegh(db); lock_queue_node *tmp; for(;;) { gint t = dbh->locks.freelist; if(!t) return 0; /* end of chain :-( */ tmp = (lock_queue_node *) offsettoptr(db, t); fetch_and_add(&(tmp->refcount), 2); if(compare_and_swap(&(dbh->locks.freelist), t, tmp->next_cell)) { fetch_and_add(&(tmp->refcount), -1); /* clear lsb */ return t; } free_lock(db, t); } return 0; /* dummy */ } /** Release memory cell for a lock. * Used internally only. */ static inline void free_lock(void * db, gint node) { db_memsegment_header* dbh = dbmemsegh(db); lock_queue_node *tmp; volatile gint t; tmp = (lock_queue_node *) offsettoptr(db, node); /* Clear reference */ fetch_and_add(&(tmp->refcount), -2); /* Try to set lsb */ if(compare_and_swap(&(tmp->refcount), 0, 1)) { /* XXX: if(tmp->next_cell) free_lock(db, tmp->next_cell); */ do { t = dbh->locks.freelist; tmp->next_cell = t; } while (!compare_and_swap(&(dbh->locks.freelist), t, node)); } } /** De-reference (release pointer to) a link. * Used internally only. 
*/ static inline gint deref_link(void *db, volatile gint *link) { lock_queue_node *tmp; volatile gint t; for(;;) { t = *link; if(t == 0) return 0; tmp = (lock_queue_node *) offsettoptr(db, t); fetch_and_add(&(tmp->refcount), 2); if(t == *link) return t; free_lock(db, t); } } #else /* Simple lock memory allocation (non lock-free) */ static inline gint alloc_lock(void * db) { db_memsegment_header* dbh = dbmemsegh(db); gint t = dbh->locks.freelist; lock_queue_node *tmp; if(!t) return 0; /* end of chain :-( */ tmp = (lock_queue_node *) offsettoptr(db, t); dbh->locks.freelist = tmp->next_cell; return t; } static inline void free_lock(void * db, gint node) { db_memsegment_header* dbh = dbmemsegh(db); lock_queue_node *tmp = (lock_queue_node *) offsettoptr(db, node); tmp->next_cell = dbh->locks.freelist; dbh->locks.freelist = node; } #endif #ifdef __linux__ /* Futex operations */ static inline void futex_wait(volatile gint *addr1, int val1) { syscall(SYS_futex, (void *) addr1, FUTEX_WAIT, val1, NULL); } static inline int futex_trywait(volatile gint *addr1, int val1, struct timespec *timeout) { if(syscall(SYS_futex, (void *) addr1, FUTEX_WAIT, val1, timeout) == -1) return errno; /* On Linux, this is thread-safe. Caution needed however */ else return 0; } static inline void futex_wake(volatile gint *addr1, int val1) { syscall(SYS_futex, (void *) addr1, FUTEX_WAKE, val1); } #endif #endif /* LOCK_PROTO==TFQUEUE */ /* ------------ error handling ---------------- */ static gint show_lock_error(void *db, char *errmsg) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"wg locking error: %s.\n", errmsg); #endif return -1; } #ifdef __cplusplus } #endif whitedb-0.7.2/Db/dblock.h000066400000000000000000000115551226454622500151210ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Priit Järv 2009, 2013 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dblock.h * Public headers for concurrent access routines. */ #ifndef DEFINED_DBLOCK_H #define DEFINED_DBLOCK_H #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif /* ==== Public macros ==== */ /* XXX: move to configure.in / config-xxx.h */ #define USE_LOCK_TIMEOUT 1 #define DEFAULT_LOCK_TIMEOUT 2000 /* in ms */ /* Lock protocol */ #define RPSPIN 1 #define WPSPIN 2 #define TFQUEUE 3 /* ====== data structures ======== */ #if (LOCK_PROTO==TFQUEUE) /* Queue nodes are stored locally in allocated cells. * The size of this structure can never exceed SYN_VAR_PADDING * defined in dballoc.h. */ struct __lock_queue_node { /* XXX: do we need separate links for stack? Or even, does * it break correctness? 
*/ gint next_cell; /* freelist chain (db offset) */ gint class; /* LOCKQ_READ, LOCKQ_WRITE */ volatile gint waiting; /* sync variable */ volatile gint next; /* queue chain (db offset) */ volatile gint prev; /* queue chain */ }; typedef struct __lock_queue_node lock_queue_node; #endif /* ==== Protos ==== */ /* API functions (copied in dbapi.h) */ gint wg_start_write(void * dbase); /* start write transaction */ gint wg_end_write(void * dbase, gint lock); /* end write transaction */ gint wg_start_read(void * dbase); /* start read transaction */ gint wg_end_read(void * dbase, gint lock); /* end read transaction */ /* WhiteDB internal functions */ #ifndef _WIN32 inline gint wg_compare_and_swap(volatile gint *ptr, gint oldv, gint newv); #else __inline gint wg_compare_and_swap(volatile gint *ptr, gint oldv, gint newv); #endif gint wg_init_locks(void * db); /* (re-) initialize locking subsystem */ #if (LOCK_PROTO==RPSPIN) #ifdef USE_LOCK_TIMEOUT gint db_rpspin_wlock(void * dbase, gint timeout); #define db_wlock(d, t) db_rpspin_wlock(d, t) #else gint db_rpspin_wlock(void * dbase); /* get DB level X lock */ #define db_wlock(d, t) db_rpspin_wlock(d) #endif gint db_rpspin_wulock(void * dbase); /* release DB level X lock */ #define db_wulock(d, l) db_rpspin_wulock(d) #ifdef USE_LOCK_TIMEOUT gint db_rpspin_rlock(void * dbase, gint timeout); #define db_rlock(d, t) db_rpspin_rlock(d, t) #else gint db_rpspin_rlock(void * dbase); /* get DB level S lock */ #define db_rlock(d, t) db_rpspin_rlock(d) #endif gint db_rpspin_rulock(void * dbase); /* release DB level S lock */ #define db_rulock(d, l) db_rpspin_rulock(d) #elif (LOCK_PROTO==WPSPIN) #ifdef USE_LOCK_TIMEOUT gint db_wpspin_wlock(void * dbase, gint timeout); #define db_wlock(d, t) db_wpspin_wlock(d, t) #else gint db_wpspin_wlock(void * dbase); /* get DB level X lock */ #define db_wlock(d, t) db_wpspin_wlock(d) #endif gint db_wpspin_wulock(void * dbase); /* release DB level X lock */ #define db_wulock(d, l) db_wpspin_wulock(d) #ifdef USE_LOCK_TIMEOUT gint db_wpspin_rlock(void * dbase, gint timeout); #define db_rlock(d, t) db_wpspin_rlock(d, t) #else gint db_wpspin_rlock(void * dbase); /* get DB level S lock */ #define db_rlock(d, t) db_wpspin_rlock(d) #endif gint db_wpspin_rulock(void * dbase); /* release DB level S lock */ #define db_rulock(d, l) db_wpspin_rulock(d) #elif (LOCK_PROTO==TFQUEUE) #ifdef USE_LOCK_TIMEOUT gint db_tfqueue_wlock(void * dbase, gint timeout); #define db_wlock(d, t) db_tfqueue_wlock(d, t) #else gint db_tfqueue_wlock(void * dbase); /* get DB level X lock */ #define db_wlock(d, t) db_tfqueue_wlock(d) #endif gint db_tfqueue_wulock(void * dbase, gint lock); /* release DB level X lock */ #define db_wulock(d, l) db_tfqueue_wulock(d, l) #ifdef USE_LOCK_TIMEOUT gint db_tfqueue_rlock(void * dbase, gint timeout); #define db_rlock(d, t) db_tfqueue_rlock(d, t) #else gint db_tfqueue_rlock(void * dbase); /* get DB level S lock */ #define db_rlock(d, t) db_tfqueue_rlock(d) #endif gint db_tfqueue_rulock(void * dbase, gint lock); /* release DB level S lock */ #define db_rulock(d, l) db_tfqueue_rulock(d, l) #else /* undefined or invalid value, disable locking */ #define db_wlock(d, t) (1) #define db_wulock(d, l) (1) #define db_rlock(d, t) (1) #define db_rulock(d, l) (1) #endif /* LOCK_PROTO */ #endif /* DEFINED_DBLOCK_H */ whitedb-0.7.2/Db/dblog.c000066400000000000000000000714421226454622500147460ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Andri Rebane 2009 * Copyright (c) Priit Järv 2013 * * This file is part of 
WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dblog.c * DB logging support for WhiteDB memory database * */ /* ====== Includes =============== */ #include #include #include #include #include #ifdef _WIN32 #include #include #include #include #include #else #include #include #include #endif #ifdef __cplusplus extern "C" { #endif #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "dballoc.h" #include "dbdata.h" #include "dbhash.h" /* ====== Private headers and defs ======== */ #include "dblog.h" #if defined(USE_DBLOG) && !defined(USE_DATABASE_HANDLE) #error Logging requires USE_DATABASE_HANDLE #endif #ifdef _WIN32 #define snprintf(s, sz, f, ...) _snprintf_s(s, sz+1, sz, f, ## __VA_ARGS__) #define inline _inline #endif #ifndef _WIN32 #define JOURNAL_FAIL(f, e) \ close(f); \ return e; #else #define JOURNAL_FAIL(f, e) \ _close(f); \ return e; #endif #define GET_LOG_BYTE(d, f, v) \ if((v = fgetc(f)) == EOF) { \ return show_log_error(d, "Failed to read log entry"); \ } #define GET_LOG_CMD(d, f, v) \ if((v = fgetc(f)) == EOF) { \ if(feof(f)) break; \ else return show_log_error(d, "Failed to read log entry"); \ } /* Does not emit a message as fget_varint() does that already. */ #define GET_LOG_VARINT(d, f, v, e) \ if(fget_varint(d, f, (wg_uint *) &v)) { \ return e; \ } #ifdef HAVE_64BIT_GINT #define VARINT_SIZE 9 #else #define VARINT_SIZE 5 #endif /* ====== data structures ======== */ /* ======= Private protos ================ */ #ifdef USE_DBLOG static int backup_journal(void *db, char *journal_fn); static gint check_journal(void *db, int fd); static int open_journal(void *db, int create); static gint add_tran_offset(void *db, void *table, gint old, gint new); static gint add_tran_enc(void *db, void *table, gint old, gint new); static gint translate_offset(void *db, void *table, gint offset); static gint translate_encoded(void *db, void *table, gint enc); static gint recover_encode(void *db, FILE *f, gint type); static gint recover_journal(void *db, FILE *f, void *table); static gint write_log_buffer(void *db, void *buf, int buflen); #endif /* USE_DBLOG */ static gint show_log_error(void *db, char *errmsg); /* ====== Functions ============== */ #ifdef USE_DBLOG /** Check the file magic of the journal file. * * Since the files are opened in append mode, we don't need to * seek before or after reading the header (on Linux). */ static gint check_journal(void *db, int fd) { char buf[WG_JOURNAL_MAGIC_BYTES + 1]; #ifndef _WIN32 if(read(fd, buf, WG_JOURNAL_MAGIC_BYTES) != WG_JOURNAL_MAGIC_BYTES) { #else if(_read(fd, buf, WG_JOURNAL_MAGIC_BYTES) != WG_JOURNAL_MAGIC_BYTES) { #endif return show_log_error(db, "Error checking log file"); } buf[WG_JOURNAL_MAGIC_BYTES] = '\0'; if(strncmp(buf, WG_JOURNAL_MAGIC, WG_JOURNAL_MAGIC_BYTES)) { return show_log_error(db, "Bad log file magic"); } return 0; } /** Rename the existing journal. 
* * Uses a naming scheme of xxx.yy where xxx is the journal filename * and yy is a sequence number that is incremented. * * Returns 0 on success. * Returns -1 on failure. */ static int backup_journal(void *db, char *journal_fn) { int i, logidx, err; time_t oldest = 0; /* keep this buffer large enough to fit the backup counter length */ char journal_backup[WG_JOURNAL_FN_BUFSIZE + 10]; for(i=0, logidx=0; i tmp.st_mtime) { oldest = tmp.st_mtime; logidx = i; } } /* at this point, logidx points to either an available backup * filename or the oldest existing backup (which will be overwritten). * If all else fails, filename xxx.0 is used. */ snprintf(journal_backup, WG_JOURNAL_FN_BUFSIZE + 10, "%s.%d", journal_fn, logidx); #ifdef _WIN32 _unlink(journal_backup); #endif err = rename(journal_fn, journal_backup); if(!err) { db_memsegment_header* dbh = dbmemsegh(db); dbh->logging.serial++; /* new journal file */ } return err; } /** Open the journal file. * * In create mode, we also take care of the backup copy. */ static int open_journal(void *db, int create) { char journal_fn[WG_JOURNAL_FN_BUFSIZE]; db_memsegment_header* dbh = dbmemsegh(db); int addflags = 0; int fd = -1; #ifndef _WIN32 mode_t savemask = 0; #endif #ifndef _WIN32 snprintf(journal_fn, WG_JOURNAL_FN_BUFSIZE, "%s.%td", WG_JOURNAL_FILENAME, dbh->key); #else snprintf(journal_fn, WG_JOURNAL_FN_BUFSIZE, "%s.%Id", WG_JOURNAL_FILENAME, dbh->key); #endif if(create) { #ifndef _WIN32 struct stat tmp; savemask = umask(WG_JOURNAL_UMASK); addflags |= O_CREAT; #else struct _stat tmp; addflags |= _O_CREAT; #endif #ifndef _WIN32 if(!dbh->logging.dirty && !stat(journal_fn, &tmp)) { #else if(!dbh->logging.dirty && !_stat(journal_fn, &tmp)) { #endif if(backup_journal(db, journal_fn)) { show_log_error(db, "Failed to back up the existing journal."); goto abort; } } } #ifndef _WIN32 if((fd = open(journal_fn, addflags|O_APPEND|O_RDWR, S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH)) == -1) { #else if(_sopen_s(&fd, journal_fn, addflags|_O_APPEND|_O_BINARY|_O_RDWR, _SH_DENYNO, _S_IREAD|_S_IWRITE)) { #endif show_log_error(db, "Error opening log file"); } abort: if(create) { #ifndef _WIN32 umask(savemask); #endif } return fd; } /** Varint encoder * this needs to be fast, so we don't check array size. It must * be at least 9 bytes. * based on http://stackoverflow.com/a/2982965 */ static inline size_t enc_varint(unsigned char *buf, wg_uint val) { buf[0] = (unsigned char)(val | 0x80); if(val >= (1 << 7)) { buf[1] = (unsigned char)((val >> 7) | 0x80); if(val >= (1 << 14)) { buf[2] = (unsigned char)((val >> 14) | 0x80); if(val >= (1 << 21)) { buf[3] = (unsigned char)((val >> 21) | 0x80); if(val >= (1 << 28)) { #ifndef HAVE_64BIT_GINT buf[4] = (unsigned char)(val >> 28); return 5; #else buf[4] = (unsigned char)((val >> 28) | 0x80); if(val >= ((wg_uint) 1 << 35)) { buf[5] = (unsigned char)((val >> 35) | 0x80); if(val >= ((wg_uint) 1 << 42)) { buf[6] = (unsigned char)((val >> 42) | 0x80); if(val >= ((wg_uint) 1 << 49)) { buf[7] = (unsigned char)((val >> 49) | 0x80); if(val >= ((wg_uint) 1 << 56)) { buf[8] = (unsigned char)(val >> 56); return 9; } else { buf[7] &= 0x7f; return 8; } } else { buf[6] &= 0x7f; return 7; } } else { buf[5] &= 0x7f; return 6; } } else { buf[4] &= 0x7f; return 5; } #endif } else { buf[3] &= 0x7f; return 4; } } else { buf[2] &= 0x7f; return 3; } } else { buf[1] &= 0x7f; return 2; } } else { buf[0] &= 0x7f; return 1; } } #if 0 /** Varint decoder * returns the number of bytes consumed (so that the caller * knows where the next value starts). 
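 * The encoding is the little-endian base-128 format produced by
 * enc_varint() above: each byte carries 7 payload bits and the high bit
 * flags that more bytes follow. For example the value 300 (0x12c) is
 * stored as the two bytes 0xac 0x02, and decoding it consumes those two
 * bytes and yields 300 again.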
Note that this approach * assumes we're using a read buffer - this is acceptable and * probably preferable when doing the journal replay. */ static inline size_t dec_varint(unsigned char *buf, wg_uint *val) { wg_uint tmp = buf[0] & 0x7f; if(buf[0] & 0x80) { tmp |= ((buf[1] & 0x7f) << 7); if(buf[1] & 0x80) { tmp |= ((buf[2] & 0x7f) << 14); if(buf[2] & 0x80) { tmp |= ((buf[3] & 0x7f) << 21); if(buf[3] & 0x80) { #ifndef HAVE_64BIT_GINT tmp |= (buf[4] << 28); *val = tmp; return 5; #else tmp |= ((wg_uint) (buf[4] & 0x7f) << 28); if(buf[4] & 0x80) { tmp |= ((wg_uint) (buf[5] & 0x7f) << 35); if(buf[5] & 0x80) { tmp |= ((wg_uint) (buf[6] & 0x7f) << 42); if(buf[6] & 0x80) { tmp |= ((wg_uint) (buf[7] & 0x7f) << 49); if(buf[7] & 0x80) { tmp |= ((wg_uint) buf[8] << 56); *val = tmp; return 9; } else { *val = tmp; return 8; } } else { *val = tmp; return 7; } } else { *val = tmp; return 6; } } else { *val = tmp; return 5; } #endif } else { *val = tmp; return 4; } } else { *val = tmp; return 3; } } else { *val = tmp; return 2; } } else { *val = tmp; return 1; } } #endif /** Read varint from a buffered stream * returns 0 on success * returns -1 on error */ static inline int fget_varint(void *db, FILE *f, wg_uint *val) { register int c; wg_uint tmp; GET_LOG_BYTE(db, f, c) tmp = c & 0x7f; if(c & 0x80) { GET_LOG_BYTE(db, f, c) tmp |= ((c & 0x7f) << 7); if(c & 0x80) { GET_LOG_BYTE(db, f, c) tmp |= ((c & 0x7f) << 14); if(c & 0x80) { GET_LOG_BYTE(db, f, c) tmp |= ((c & 0x7f) << 21); if(c & 0x80) { GET_LOG_BYTE(db, f, c) #ifndef HAVE_64BIT_GINT tmp |= (c << 28); #else tmp |= ((wg_uint) (c & 0x7f) << 28); if(c & 0x80) { GET_LOG_BYTE(db, f, c) tmp |= ((wg_uint) (c & 0x7f) << 35); if(c & 0x80) { GET_LOG_BYTE(db, f, c) tmp |= ((wg_uint) (c & 0x7f) << 42); if(c & 0x80) { GET_LOG_BYTE(db, f, c) tmp |= ((wg_uint) (c & 0x7f) << 49); if(c & 0x80) { GET_LOG_BYTE(db, f, c) tmp |= ((wg_uint) c << 56); } } } } #endif } } } } *val = tmp; return 0; } /** Add a log recovery translation entry * Uses extendible gint hashtable internally. 
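 *
 * During replay a record or encoded value may end up at a different
 * offset than the one recorded in the journal; the (old, new) pair is
 * remembered here so that later log entries referring to `old` can be
 * rewritten. The pattern in recover_journal() is roughly:
 *
 *   rec = wg_create_record(db, length);
 *   newoffset = ptrtooffset(db, rec);
 *   if(newoffset != offset)
 *     add_tran_offset(db, table, offset, newoffset);
 *
 * translate_offset() later falls back to the original value when no
 * mapping was stored.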
*/ static gint add_tran_offset(void *db, void *table, gint old, gint new) { return wg_ginthash_addkey(db, table, old, new); } /** Wrapper around add_tran_offset() to handle encoded data * */ static gint add_tran_enc(void *db, void *table, gint old, gint new) { if(isptr(old)) { gint offset, newoffset; switch(old & NORMALPTRMASK) { case LONGSTRBITS: offset = decode_longstr_offset(old); newoffset = decode_longstr_offset(new); return add_tran_offset(db, table, offset, newoffset); case SHORTSTRBITS: offset = decode_shortstr_offset(old); newoffset = decode_shortstr_offset(new); return add_tran_offset(db, table, offset, newoffset); case FULLDOUBLEBITS: offset = decode_fulldouble_offset(old); newoffset = decode_fulldouble_offset(new); return add_tran_offset(db, table, offset, newoffset); case FULLINTBITSV0: case FULLINTBITSV1: offset = decode_fullint_offset(old); newoffset = decode_fullint_offset(new); return add_tran_offset(db, table, offset, newoffset); default: return 0; } } return 0; } /** Translate a log offset * */ static gint translate_offset(void *db, void *table, gint offset) { gint newoffset; if(wg_ginthash_getkey(db, table, offset, &newoffset)) return offset; else return newoffset; } /** Wrapper around translate_offset() to handle encoded data * */ static gint translate_encoded(void *db, void *table, gint enc) { if(isptr(enc)) { gint offset; switch(enc & NORMALPTRMASK) { case DATARECBITS: return translate_offset(db, table, enc); case LONGSTRBITS: offset = decode_longstr_offset(enc); return encode_longstr_offset(translate_offset(db, table, offset)); case SHORTSTRBITS: offset = decode_shortstr_offset(enc); return encode_shortstr_offset(translate_offset(db, table, offset)); case FULLDOUBLEBITS: offset = decode_fulldouble_offset(enc); return encode_fulldouble_offset(translate_offset(db, table, offset)); case FULLINTBITSV0: case FULLINTBITSV1: offset = decode_fullint_offset(enc); return encode_fullint_offset(translate_offset(db, table, offset)); default: return enc; } } return enc; } /** Parse an encode entry from the log. 
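 * The layout after the command byte depends on the type bits:
 * WG_INTTYPE is followed by a raw int, WG_DOUBLETYPE by a raw double,
 * and the string-like types by two varints (data length, extdata
 * length) followed by the data and extdata bytes themselves. The
 * trailing varint holding the originally encoded value is read by the
 * caller, not here.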
* */ gint recover_encode(void *db, FILE *f, gint type) { char *strbuf, *extbuf; gint length = 0, extlength = 0, enc; int intval; double doubleval; switch(type) { case WG_INTTYPE: if(fread((char *) &intval, sizeof(int), 1, f) != 1) { show_log_error(db, "Failed to read log entry"); return WG_ILLEGAL; } return wg_encode_int(db, intval); case WG_DOUBLETYPE: if(fread((char *) &doubleval, sizeof(double), 1, f) != 1) { show_log_error(db, "Failed to read log entry"); return WG_ILLEGAL; } return wg_encode_double(db, doubleval); case WG_STRTYPE: case WG_URITYPE: case WG_XMLLITERALTYPE: case WG_ANONCONSTTYPE: case WG_BLOBTYPE: /* XXX: no encode func for this yet */ /* strings with extdata */ GET_LOG_VARINT(db, f, length, WG_ILLEGAL) GET_LOG_VARINT(db, f, extlength, WG_ILLEGAL) strbuf = (char *) malloc(length + 1); if(!strbuf) { show_log_error(db, "Failed to allocate buffers"); return WG_ILLEGAL; } if(fread(strbuf, 1, length, f) != length) { show_log_error(db, "Failed to read log entry"); free(strbuf); return WG_ILLEGAL; } strbuf[length] = '\0'; if(extlength) { extbuf = (char *) malloc(extlength + 1); if(!extbuf) { free(strbuf); show_log_error(db, "Failed to allocate buffers"); return WG_ILLEGAL; } if(fread(extbuf, 1, extlength, f) != extlength) { show_log_error(db, "Failed to read log entry"); free(strbuf); free(extbuf); return WG_ILLEGAL; } extbuf[extlength] = '\0'; } else { extbuf = NULL; } enc = wg_encode_unistr(db, strbuf, extbuf, type); free(strbuf); if(extbuf) free(extbuf); return enc; default: break; } return show_log_error(db, "Unsupported data type"); } /** Parse the journal file. Used internally only. * */ static gint recover_journal(void *db, FILE *f, void *table) { int c; gint length = 0, offset = 0, newoffset; gint col = 0, enc = 0, newenc; void *rec; for(;;) { GET_LOG_CMD(db, f, c) switch((unsigned char) c & WG_JOURNAL_ENTRY_CMDMASK) { case WG_JOURNAL_ENTRY_CRE: GET_LOG_VARINT(db, f, length, -1) GET_LOG_VARINT(db, f, offset, -1) rec = wg_create_record(db, length); if(offset != 0) { /* XXX: should we have even tried if this failed earlier? 
*/ if(!rec) { return show_log_error(db, "Failed to create a new record"); } newoffset = ptrtooffset(db, rec); if(newoffset != offset) { if(add_tran_offset(db, table, offset, newoffset)) { return show_log_error(db, "Failed to parse log "\ "(out of translation memory)"); } } } break; case WG_JOURNAL_ENTRY_DEL: GET_LOG_VARINT(db, f, offset, -1) newoffset = translate_offset(db, table, offset); rec = offsettoptr(db, newoffset); if(wg_delete_record(db, rec) < -1) { return show_log_error(db, "Failed to delete a record"); } break; case WG_JOURNAL_ENTRY_ENC: newenc = recover_encode(db, f, (unsigned char) c & WG_JOURNAL_ENTRY_TYPEMASK); GET_LOG_VARINT(db, f, enc, -1) if(enc != WG_ILLEGAL) { /* Encode was supposed to succeed */ if(newenc == WG_ILLEGAL) { return -1; } if(newenc != enc) { if(add_tran_enc(db, table, enc, newenc)) { return show_log_error(db, "Failed to parse log "\ "(out of translation memory)"); } } } break; case WG_JOURNAL_ENTRY_SET: GET_LOG_VARINT(db, f, offset, -1) GET_LOG_VARINT(db, f, col, -1) GET_LOG_VARINT(db, f, enc, -1) newoffset = translate_offset(db, table, offset); rec = offsettoptr(db, newoffset); newenc = translate_encoded(db, table, enc); if(wg_set_field(db, rec, col, newenc)) { return show_log_error(db, "Failed to set field data"); } break; default: return show_log_error(db, "Invalid log entry"); } } return 0; } #endif /* USE_DBLOG */ /** Set up the logging area in the database handle * Normally called when opening the database connection. */ gint wg_init_handle_logdata(void *db) { #ifdef USE_DBLOG db_handle_logdata **ld = \ (db_handle_logdata **) &(((db_handle *) db)->logdata); *ld = malloc(sizeof(db_handle_logdata)); if(!(*ld)) { return show_log_error(db, "Error initializing local log data"); } memset(*ld, 0, sizeof(db_handle_logdata)); (*ld)->fd = -1; #endif return 0; } /** Clean up the state of logging in the database handle. * Normally called when closing the database connection. */ void wg_cleanup_handle_logdata(void *db) { #ifdef USE_DBLOG db_handle_logdata *ld = \ (db_handle_logdata *) (((db_handle *) db)->logdata); if(ld) { if(ld->fd >= 0) { #ifndef _WIN32 close(ld->fd); #else _close(ld->fd); #endif ld->fd = -1; } free(ld); ((db_handle *) db)->logdata = NULL; } #endif } /** Activate logging * * When successful, does the following: * opens the logfile and initializes it; * sets the logging active flag. * * Security concerns: * - the log file name is compiled in (so we can't trick other * processes into writing over files they're not supposed to) * - the log file has a magic header (see above, avoid accidentally * destroying files) * - the process that initialized logging needs to have write * access to the log file. 
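 *
 * Typical call sequence (illustrative sketch, assumes a valid handle):
 *
 *   if(wg_start_logging(db) == 0) {
 *     .. subsequent updates are appended to the journal ..
 *   }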
* * Returns 0 on success * Returns -1 when logging is already active * Returns -2 when the function failed and logging is not active * Returns -3 when additionally, the log file was possibly destroyed */ gint wg_start_logging(void *db) { #ifdef USE_DBLOG db_memsegment_header* dbh = dbmemsegh(db); /* db_handle_logdata *ld = ((db_handle *) db)->logdata;*/ int fd; if(dbh->logging.active) { show_log_error(db, "Logging is already active"); return -1; } if((fd = open_journal(db, 1)) == -1) { show_log_error(db, "Error opening log file"); return -2; } if(!dbh->logging.dirty) { /* logfile is clean, re-initialize */ /* fseek(f, 0, SEEK_SET); */ #ifndef _WIN32 ftruncate(fd, 0); /* XXX: this is a no-op with backups */ if(write(fd, WG_JOURNAL_MAGIC, WG_JOURNAL_MAGIC_BYTES) != \ WG_JOURNAL_MAGIC_BYTES) { #else _chsize_s(fd, 0); if(_write(fd, WG_JOURNAL_MAGIC, WG_JOURNAL_MAGIC_BYTES) != \ WG_JOURNAL_MAGIC_BYTES) { #endif show_log_error(db, "Error initializing log file"); JOURNAL_FAIL(fd, -3) } } else { /* check the magic header */ if(check_journal(db, fd)) { JOURNAL_FAIL(fd, -2) } } #if 0 /* Keep using this handle */ ld->fd = fd; ld->serial = dbh->logging.serial; #else #ifndef _WIN32 close(fd); #else _close(fd); #endif #endif dbh->logging.active = 1; return 0; #else return show_log_error(db, "Logging is disabled"); #endif /* USE_DBLOG */ } /** Turn journal logging off. * * Returns 0 on success * Returns non-zero on failure */ gint wg_stop_logging(void *db) { #ifdef USE_DBLOG db_memsegment_header* dbh = dbmemsegh(db); if(!dbh->logging.active) { show_log_error(db, "Logging is not active"); return -1; } dbh->logging.active = 0; return 0; #else return show_log_error(db, "Logging is disabled"); #endif /* USE_DBLOG */ } /** Replay journal file. * * Requires exclusive access to the database. * Marks the log as clean, but does not re-initialize the file. * * Returns 0 on success * Returns -1 on non-fatal error (database unmodified) * Returns -2 on fatal error (database inconsistent) */ gint wg_replay_log(void *db, char *filename) { #ifdef USE_DBLOG db_memsegment_header* dbh = dbmemsegh(db); gint active, err = 0; void *tran_tbl; int fd; FILE *f; #ifndef _WIN32 if((fd = open(filename, O_RDONLY)) == -1) { #else if(_sopen_s(&fd, filename, _O_RDONLY|_O_BINARY, _SH_DENYNO, 0)) { #endif show_log_error(db, "Error opening log file"); return -1; } if(check_journal(db, fd)) { err = -1; goto abort2; } active = dbh->logging.active; dbh->logging.active = 0; /* turn logging off before restoring */ /* Reading can be done with buffered IO */ #ifndef _WIN32 f = fdopen(fd, "r"); #else f = _fdopen(fd, "rb"); #endif /* XXX: may consider fcntl-locking here */ /* restore the log contents */ tran_tbl = wg_ginthash_init(db); if(!tran_tbl) { show_log_error(db, "Failed to create log translation table"); err = -1; goto abort1; } if(recover_journal(db, f, tran_tbl)) { err = -2; goto abort0; } dbh->logging.dirty = 0; /* on success, set the log as clean. */ abort0: wg_ginthash_free(db, tran_tbl); abort1: fclose(f); abort2: if(!err && active) { if(wg_start_logging(db)) { show_log_error(db, "Log restored but failed to reactivate logging"); err = -2; } } return err; #else return show_log_error(db, "Logging is disabled"); #endif /* USE_DBLOG */ } #ifdef USE_DBLOG /** Write a byte buffer to the log file. 
* */ static gint write_log_buffer(void *db, void *buf, int buflen) { db_memsegment_header* dbh = dbmemsegh(db); db_handle_logdata *ld = \ (db_handle_logdata *) (((db_handle *) db)->logdata); if(ld->fd >= 0 && ld->serial != dbh->logging.serial) { /* Stale file descriptor, get a new one */ #ifndef _WIN32 close(ld->fd); #else _close(ld->fd); #endif ld->fd = -1; } if(ld->fd < 0) { int fd; if((fd = open_journal(db, 0)) == -1) { show_log_error(db, "Error opening log file"); } else { if(check_journal(db, fd)) { #ifndef _WIN32 close(fd); #else _close(fd); #endif } else { /* fseek(f, 0, SEEK_END); */ ld->fd = fd; ld->serial = dbh->logging.serial; } } } if(ld->fd < 0) return -1; /* Always mark log as dirty when writing something */ dbh->logging.dirty = 1; #ifndef _WIN32 if(write(ld->fd, (char *) buf, buflen) != buflen) { #else if(_write(ld->fd, (char *) buf, buflen) != buflen) { #endif show_log_error(db, "Error writing to log file"); JOURNAL_FAIL(ld->fd, -5) } return 0; } #endif /* USE_DBLOG */ /* * Operations (and data) logged: * * WG_JOURNAL_ENTRY_CRE - create a record (length) * followed by a single varint field that contains the newly allocated offset * WG_JOURNAL_ENTRY_DEL - delete a record (offset) * WG_JOURNAL_ENTRY_ENC - encode a value (data bytes, extdata if applicable) * followed by a single varint field that contains the encoded value * WG_JOURNAL_ENTRY_SET - set a field value (record offset, column, encoded value) * * lengths, offsets and encoded values are stored as varints */ /** Log the creation of a record. * This call should always be followed by wg_log_encval() * * We assume that dbh->logging.active flag is checked before calling this. */ gint wg_log_create_record(void *db, gint length) { #ifdef USE_DBLOG unsigned char buf[1 + VARINT_SIZE], *optr; buf[0] = WG_JOURNAL_ENTRY_CRE; optr = &buf[1]; optr += enc_varint(optr, (wg_uint) length); return write_log_buffer(db, (void *) buf, optr - buf); #else return show_log_error(db, "Logging is disabled"); #endif /* USE_DBLOG */ } /** Log the deletion of a record. * */ gint wg_log_delete_record(void *db, gint enc) { #ifdef USE_DBLOG unsigned char buf[1 + VARINT_SIZE], *optr; buf[0] = WG_JOURNAL_ENTRY_DEL; optr = &buf[1]; optr += enc_varint(optr, (wg_uint) enc); return write_log_buffer(db, (void *) buf, optr - buf); #else return show_log_error(db, "Logging is disabled"); #endif /* USE_DBLOG */ } /** Log the result of an encode operation. Also handles records. * * If the encode function or record creation failed, call this * with WG_ILLEGAL to indicate the failure of the operation. */ gint wg_log_encval(void *db, gint enc) { #ifdef USE_DBLOG unsigned char buf[VARINT_SIZE]; size_t buflen = enc_varint(buf, (wg_uint) enc); return write_log_buffer(db, (void *) buf, buflen); #else return show_log_error(db, "Logging is disabled"); #endif /* USE_DBLOG */ } /** Log an encode operation. * * This is the most expensive log operation as we need to write the * chunk of data to be encoded. 
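 *
 * The call is paired with wg_log_encval(), which appends the value the
 * encode actually produced (illustrative sketch of the pattern, not a
 * verbatim excerpt from the encode functions):
 *
 *   wg_log_encode(db, WG_STRTYPE, str, strlen(str), NULL, 0);
 *   enc = .. perform the actual encode ..;
 *   wg_log_encval(db, enc);   - or WG_ILLEGAL if the encode failed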
*/ gint wg_log_encode(void *db, gint type, void *data, gint length, void *extdata, gint extlength) { #ifdef USE_DBLOG unsigned char *buf, *optr, *oend, *iptr; size_t buflen = 0; int err; switch(type) { case WG_NULLTYPE: case WG_RECORDTYPE: case WG_CHARTYPE: case WG_DATETYPE: case WG_TIMETYPE: case WG_VARTYPE: case WG_FIXPOINTTYPE: /* Shared memory not altered, don't log */ return 0; break; case WG_INTTYPE: /* int argument */ if(fits_smallint(*((int *) data))) { return 0; /* self-contained, don't log */ } else { buflen = 1 + sizeof(int); buf = (unsigned char *) malloc(buflen); optr = buf + 1; *((int *) optr) = *((int *) data); } break; case WG_DOUBLETYPE: /* double precision argument */ buflen = 1 + sizeof(double); buf = (unsigned char *) malloc(buflen); optr = buf + 1; *((double *) optr) = *((double *) data); break; case WG_STRTYPE: case WG_URITYPE: case WG_XMLLITERALTYPE: case WG_ANONCONSTTYPE: case WG_BLOBTYPE: /* XXX: no encode func for this yet */ /* strings with extdata */ buflen = 1 + 2*VARINT_SIZE + length + extlength; buf = (unsigned char *) malloc(buflen); /* data and extdata length */ optr = buf + 1; optr += enc_varint(optr, (wg_uint) length); optr += enc_varint(optr, (wg_uint) extlength); buflen -= 1 + 2*VARINT_SIZE - (optr - buf); /* actual size known */ /* data */ oend = optr + length; iptr = (unsigned char *) data; while(optr < oend) *(optr++) = *(iptr++); /* extdata */ oend = optr + extlength; iptr = (unsigned char *) extdata; while(optr < oend) *(optr++) = *(iptr++); break; default: return show_log_error(db, "Unsupported data type"); } /* Add a fixed prefix */ buf[0] = WG_JOURNAL_ENTRY_ENC | type; err = write_log_buffer(db, (void *) buf, buflen); free(buf); return err; #else return show_log_error(db, "Logging is disabled"); #endif /* USE_DBLOG */ } /** Log setting a data field. * * We assume that dbh->logging.active flag is checked before calling this. */ gint wg_log_set_field(void *db, void *rec, gint col, gint data) { #ifdef USE_DBLOG unsigned char buf[1 + 3*VARINT_SIZE], *optr; buf[0] = WG_JOURNAL_ENTRY_SET; optr = &buf[1]; optr += enc_varint(optr, (wg_uint) ptrtooffset(db, rec)); optr += enc_varint(optr, (wg_uint) col); optr += enc_varint(optr, (wg_uint) data); return write_log_buffer(db, (void *) buf, optr - buf); #else return show_log_error(db, "Logging is disabled"); #endif /* USE_DBLOG */ } /* ------------ error handling ---------------- */ static gint show_log_error(void *db, char *errmsg) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"wg log error: %s.\n", errmsg); #endif return -1; } #ifdef __cplusplus } #endif whitedb-0.7.2/Db/dblog.h000066400000000000000000000042051226454622500147440ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Andri Rebane 2009 * Copyright (c) Priit Järv 2013 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dblog.h * Public headers for the recovery journal. 
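 *
 * Each journal entry begins with one command byte: the top two bits select
 * the operation (the ENC/CRE/DEL/SET constants below), and for ENC entries
 * the low six bits carry the data type. The remaining fields are
 * varint-encoded lengths, offsets and values; string-like ENC entries are
 * additionally followed by the raw data and extdata bytes.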
*/ #ifndef DEFINED_DBLOG_H #define DEFINED_DBLOG_H #ifndef _WIN32 #define WG_JOURNAL_FILENAME DBLOG_DIR "/wgdb.journal" #else #define WG_JOURNAL_FILENAME DBLOG_DIR "\\wgdb_journal" #endif #define WG_JOURNAL_FN_BUFSIZE (sizeof(WG_JOURNAL_FILENAME) + 20) #define WG_JOURNAL_UMASK 0 #define WG_JOURNAL_MAX_BACKUPS 10 #define WG_JOURNAL_MAGIC "wgdb" #define WG_JOURNAL_MAGIC_BYTES 4 #define WG_JOURNAL_ENTRY_ENC ((unsigned char) 0) /* top bits clear |= type */ #define WG_JOURNAL_ENTRY_CRE ((unsigned char) 0x40) #define WG_JOURNAL_ENTRY_DEL ((unsigned char) 0x80) #define WG_JOURNAL_ENTRY_SET ((unsigned char) 0xc0) #define WG_JOURNAL_ENTRY_CMDMASK (0xc0) #define WG_JOURNAL_ENTRY_TYPEMASK (0x3f) /* ====== data structures ======== */ typedef struct { FILE *f; int fd; gint serial; } db_handle_logdata; /* ==== Protos ==== */ gint wg_init_handle_logdata(void *db); void wg_cleanup_handle_logdata(void *db); gint wg_start_logging(void *db); gint wg_stop_logging(void *db); gint wg_replay_log(void *db, char *filename); gint wg_log_create_record(void *db, gint length); gint wg_log_delete_record(void *db, gint enc); gint wg_log_encval(void *db, gint enc); gint wg_log_encode(void *db, gint type, void *data, gint length, void *extdata, gint extlength); gint wg_log_set_field(void *db, void *rec, gint col, gint data); #endif /* DEFINED_DBLOG_H */ whitedb-0.7.2/Db/dbmem.c000066400000000000000000000465031226454622500147430ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbmem.c * Allocating/detaching system memory: shared memory and allocated ordinary memory * */ /* ====== Includes =============== */ #include #include #include #include #ifdef _WIN32 #include #else #include #include #endif #ifdef __cplusplus extern "C" { #endif #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "dballoc.h" #include "dbfeatures.h" #include "dbmem.h" #include "dblog.h" /* ====== Private headers and defs ======== */ /* ======= Private protos ================ */ static void* link_shared_memory(int key); static void* create_shared_memory(int key, gint size); static int free_shared_memory(int key); static int detach_shared_memory(void* shmptr); #ifdef USE_DATABASE_HANDLE static void *init_dbhandle(void); static void free_dbhandle(void *dbhandle); #endif static gint show_memory_error(char *errmsg); #ifdef _WIN32 static gint show_memory_error_nr(char* errmsg, int nr); #endif /* ====== Functions ============== */ /* ----------- dbase creation and deletion api funs ------------------ */ /** returns a pointer to the database, NULL if failure * * In case database with dbasename exists, the returned pointer * points to the existing database. 
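 * (The name is parsed as a numeric shared memory key; a missing or invalid
 * name falls back to DEFAULT_MEMDBASE_KEY.)
 *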
* * If there exists no database with dbasename, a new database * is created in shared memory with size in bytes * * If size is not 0 and the database exists, the size of the * existing segment is required to be >= requested size, * otherwise the operation fails. * */ void* wg_attach_database(char* dbasename, gint size){ void* shm = wg_attach_memsegment(dbasename, size, size, 1, 0); if(shm) { int err; /* Check the header for compatibility. * XXX: this is not required for a fresh database. */ if((err = wg_check_header_compat(dbmemsegh(shm)))) { if(err < -1) { show_memory_error("Existing segment header is incompatible"); wg_print_code_version(); wg_print_header_version(dbmemsegh(shm)); } return NULL; } } return shm; } /** returns a pointer to the existing database, NULL if * there is no database with dbasename. * */ void* wg_attach_existing_database(char* dbasename){ void* shm = wg_attach_memsegment(dbasename, 0, 0, 0, 0); if(shm) { int err; /* Check the header for compatibility. * XXX: this is not required for a fresh database. */ if((err = wg_check_header_compat(dbmemsegh(shm)))) { if(err < -1) { show_memory_error("Existing segment header is incompatible"); wg_print_code_version(); wg_print_header_version(dbmemsegh(shm)); } return NULL; } } return shm; } /** returns a pointer to the existing database, NULL if failure. * * Starts journal logging in the database. */ void* wg_attach_logged_database(char* dbasename, gint size){ void* shm = wg_attach_memsegment(dbasename, size, size, 1, 1); if(shm) { int err; /* Check the header for compatibility. * XXX: this is not required for a fresh database. */ if((err = wg_check_header_compat(dbmemsegh(shm)))) { if(err < -1) { show_memory_error("Existing segment header is incompatible"); wg_print_code_version(); wg_print_header_version(dbmemsegh(shm)); } return NULL; } } return shm; } /** Attach to shared memory segment. * Normally called internally by wg_attach_database() * May be called directly if the version compatibility of the * memory image is not relevant (such as, when importing a dump * file). */ void* wg_attach_memsegment(char* dbasename, gint minsize, gint size, int create, int logging){ #ifdef USE_DATABASE_HANDLE void *dbhandle; #endif void* shm; int err; int key=0; #ifdef USE_DATABASE_HANDLE dbhandle = init_dbhandle(); if(!dbhandle) return NULL; #endif // default args handling if (dbasename!=NULL) key=strtol(dbasename,NULL,10); if (key<=0 || key==INT_MIN || key==INT_MAX) key=DEFAULT_MEMDBASE_KEY; if (minsize<0) minsize=0; if (sizesize < minsize) { show_memory_error("Existing segment is too small"); #ifdef USE_DATABASE_HANDLE free_dbhandle(dbhandle); #endif return NULL; } } #if defined(USE_DATABASE_HANDLE) && defined(USE_DBLOG) if(logging) { /* If logging was requested and we're not initializing a new * segment, we should fail here if the existing database is * not actively logging. */ if(!((db_memsegment_header *) shm)->logging.active) { show_memory_error("Existing memory segment has no journal"); free_dbhandle(dbhandle); return NULL; } } #endif #ifdef USE_DATABASE_HANDLE ((db_handle *) dbhandle)->db = shm; #endif } else if (!create) { /* linking to already existing block failed do not create a new base */ #ifdef USE_DATABASE_HANDLE free_dbhandle(dbhandle); #endif return NULL; } else { /* linking to already existing block failed */ /* create a new block if createnew_flag set * * When creating a new base, we have to select the size for the * memory segment. There are three possible scenarios: * - no size was requested. Use the default size. 
* - specific size was requested. Use it. * - a size and a minimum size were provided. First try the size * given, if that fails fall back to minimum size. */ if(!size) size = DEFAULT_MEMDBASE_SIZE; shm = create_shared_memory(key, size); if(!shm && minsize && minsizedb = shm; err=wg_init_db_memsegment(dbhandle, key, size); #ifdef USE_DBLOG if(!err && logging) err = wg_start_logging(dbhandle); #endif #else err=wg_init_db_memsegment(shm,key,size); #endif if(err) { show_memory_error("Database initialization failed"); free_shared_memory(key); #ifdef USE_DATABASE_HANDLE free_dbhandle(dbhandle); #endif return NULL; } } } #ifdef USE_DATABASE_HANDLE return dbhandle; #else return shm; #endif } /** Detach database * * returns 0 if OK */ int wg_detach_database(void* dbase) { int err = detach_shared_memory(dbmemseg(dbase)); #ifdef USE_DATABASE_HANDLE if(!err) { free_dbhandle(dbase); } #endif return err; } /** Delete a database * * returns 0 if OK */ int wg_delete_database(char* dbasename) { int key=0; // default args handling if (dbasename!=NULL) key=strtol(dbasename,NULL,10); if (key<=0 || key==INT_MIN || key==INT_MAX) key=DEFAULT_MEMDBASE_KEY; return free_shared_memory(key); } /* --------- local memory db creation and deleting ---------- */ /** Create a database in local memory * returns a pointer to the database, NULL if failure. */ void* wg_attach_local_database(gint size) { void* shm; #ifdef USE_DATABASE_HANDLE void *dbhandle = init_dbhandle(); if(!dbhandle) return NULL; #endif if (size<=0) size=DEFAULT_MEMDBASE_SIZE; shm = (void *) malloc(size); if (shm==NULL) { show_memory_error("malloc failed"); return NULL; } else { /* key=0 - no shared memory associated */ #ifdef USE_DATABASE_HANDLE ((db_handle *) dbhandle)->db = shm; if(wg_init_db_memsegment(dbhandle, 0, size)) { #else if(wg_init_db_memsegment(shm, 0, size)) { #endif show_memory_error("Database initialization failed"); free(shm); #ifdef USE_DATABASE_HANDLE free_dbhandle(dbhandle); #endif return NULL; } } #ifdef USE_DATABASE_HANDLE return dbhandle; #else return shm; #endif } /** Free a database in local memory * frees the allocated memory. */ void wg_delete_local_database(void* dbase) { if(dbase) { void *localmem = dbmemseg(dbase); if(localmem) free(localmem); #ifdef USE_DATABASE_HANDLE free_dbhandle(dbase); #endif } } /* -------------------- database handle management -------------------- */ #ifdef USE_DATABASE_HANDLE static void *init_dbhandle() { void *dbhandle = malloc(sizeof(db_handle)); if(!dbhandle) { show_memory_error("Failed to allocate the db handle"); return NULL; } else { memset(dbhandle, 0, sizeof(db_handle)); } #ifdef USE_DBLOG if(wg_init_handle_logdata(dbhandle)) { free(dbhandle); return NULL; } #endif return dbhandle; } static void free_dbhandle(void *dbhandle) { #ifdef USE_DBLOG wg_cleanup_handle_logdata(dbhandle); #endif free(dbhandle); } #endif /* ----------------- memory image/dump compatibility ------------------ */ /** Check compatibility of memory image (or dump file) header * * Note: unlike API functions, this functions works directly on * the (db_memsegment_header *) pointer. 
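 *
 * Illustrative use, mirroring the attach functions above (a sketch only;
 * shm is an attached segment pointer):
 *
 *   int err = wg_check_header_compat(dbmemsegh(shm));
 *   if(err < -1) {
 *     wg_print_code_version();
 *     wg_print_header_version(dbmemsegh(shm));
 *   }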
* * returns 0 if header is compatible with current executable * returns -1 if header is not recognizable * returns -2 if header has wrong endianness * returns -3 if header version does not match * returns -4 if compile-time features do not match */ int wg_check_header_compat(db_memsegment_header *dbh) { /* * Check: * - magic marker (including endianness) * - version */ if(!dbcheckh(dbh)) { gint32 magic = MEMSEGMENT_MAGIC_MARK; char *magic_bytes = (char *) &magic; char *header_bytes = (char *) dbh; if(magic_bytes[0]==header_bytes[3] && magic_bytes[1]==header_bytes[2] &&\ magic_bytes[2]==header_bytes[1] && magic_bytes[3]==header_bytes[0]) { return -2; /* wrong endianness */ } else { return -1; /* unknown marker (not a valid header) */ } } if(dbh->version!=MEMSEGMENT_VERSION) { return -3; } if(dbh->features!=MEMSEGMENT_FEATURES) { return -4; } return 0; } void wg_print_code_version(void) { int i = 1; char *i_bytes = (char *) &i; printf("\nlibwgdb version: %d.%d.%d\n", VERSION_MAJOR, VERSION_MINOR, VERSION_REV); printf("byte order: %s endian\n", (i_bytes[0]==1 ? "little" : "big")); printf("compile-time features:\n"\ "64-bit encoded data: %s\n"\ "queued locks: %s\n"\ "chained nodes in T-tree: %s\n"\ "record backlinking: %s\n"\ "child databases: %s\n"\ "index templates: %s\n", (MEMSEGMENT_FEATURES & FEATURE_BITS_64BIT ? "yes" : "no"), (MEMSEGMENT_FEATURES & FEATURE_BITS_QUEUED_LOCKS ? "yes" : "no"), (MEMSEGMENT_FEATURES & FEATURE_BITS_TTREE_CHAINED ? "yes" : "no"), (MEMSEGMENT_FEATURES & FEATURE_BITS_BACKLINK ? "yes" : "no"), (MEMSEGMENT_FEATURES & FEATURE_BITS_CHILD_DB ? "yes" : "no"), (MEMSEGMENT_FEATURES & FEATURE_BITS_INDEX_TMPL ? "yes" : "no")); } void wg_print_header_version(db_memsegment_header *dbh) { gint32 version, features; gint32 magic = MEMSEGMENT_MAGIC_MARK; char *magic_bytes = (char *) &magic; char *header_bytes = (char *) dbh; char magic_lsb = (char) (MEMSEGMENT_MAGIC_MARK & 0xff); /* Header might be incompatible, but to display version and feature * information, we still need to read it somehow, even if * it has wrong endianness. */ if(magic_bytes[0]==header_bytes[3] && magic_bytes[1]==header_bytes[2] &&\ magic_bytes[2]==header_bytes[1] && magic_bytes[3]==header_bytes[0]) { char *f1 = (char *) &(dbh->version); char *t1 = (char *) &version; char *f2 = (char *) &(dbh->features); char *t2 = (char *) &features; int i; for(i=0; i<4; i++) { t1[i] = f1[3-i]; t2[i] = f2[3-i]; } } else { version = dbh->version; features = dbh->features; } printf("\nheader version: %d.%d.%d\n", (version & 0xff), ((version>>8) & 0xff), ((version>>16) & 0xff)); printf("byte order: %s endian\n", (header_bytes[0]==magic_lsb ? "little" : "big")); printf("compile-time features:\n"\ "64-bit encoded data: %s\n"\ "queued locks: %s\n"\ "chained nodes in T-tree: %s\n"\ "record backlinking: %s\n"\ "child databases: %s\n"\ "index templates: %s\n", (features & FEATURE_BITS_64BIT ? "yes" : "no"), (features & FEATURE_BITS_QUEUED_LOCKS ? "yes" : "no"), (features & FEATURE_BITS_TTREE_CHAINED ? "yes" : "no"), (features & FEATURE_BITS_BACKLINK ? "yes" : "no"), (features & FEATURE_BITS_CHILD_DB ? "yes" : "no"), (features & FEATURE_BITS_INDEX_TMPL ? 
"yes" : "no")); } /* --------------- dbase create/delete ops not in api ----------------- */ static void* link_shared_memory(int key) { void *shm; #ifdef _WIN32 char fname[MAX_FILENAME_SIZE]; HANDLE hmapfile; sprintf_s(fname,MAX_FILENAME_SIZE-1,"%d",key); hmapfile = OpenFileMapping( FILE_MAP_ALL_ACCESS, // read/write access FALSE, // do not inherit the name fname); // name of mapping object errno = 0; if (hmapfile == NULL) { /* this is an expected error, message in most cases not wanted */ return NULL; } shm = (void*) MapViewOfFile(hmapfile, // handle to map object FILE_MAP_ALL_ACCESS, // read/write permission 0, 0, 0); // size of mapping if (shm == NULL) { show_memory_error_nr("Could not map view of file", (int) GetLastError()); CloseHandle(hmapfile); return NULL; } return shm; #else int shmflg; /* shmflg to be passed to shmget() */ int shmid; /* return value from shmget() */ errno = 0; // Link to existing segment shmflg=0666; shmid=shmget((key_t)key, 0, shmflg); if (shmid < 0) { return NULL; } // Attach the segment to our data space shm=shmat(shmid,NULL,0); if (shm==(char *) -1) { show_memory_error("attaching shared memory segment failed"); return NULL; } return (void*) shm; #endif } static void* create_shared_memory(int key, gint size) { void *shm; #ifdef _WIN32 char fname[MAX_FILENAME_SIZE]; HANDLE hmapfile; sprintf_s(fname,MAX_FILENAME_SIZE-1,"%d",key); hmapfile = CreateFileMapping( INVALID_HANDLE_VALUE, // use paging file NULL, // default security PAGE_READWRITE, // read/write access 0, // max. object size size, // buffer size fname); // name of mapping object errno = 0; if (hmapfile == NULL) { show_memory_error_nr("Could not create file mapping object", (int) GetLastError()); return NULL; } shm = (void*) MapViewOfFile(hmapfile, // handle to map object FILE_MAP_ALL_ACCESS, // read/write permission 0, 0, 0); if (shm == NULL) { show_memory_error_nr("Could not map view of file", (int) GetLastError()); CloseHandle(hmapfile); return NULL; } return shm; #else int shmflg; /* shmflg to be passed to shmget() */ int shmid; /* return value from shmget() */ // Create the segment shmflg=IPC_CREAT | IPC_EXCL | 0666; shmid=shmget((key_t)key,size,shmflg); if (shmid < 0) { switch(errno) { case EEXIST: show_memory_error("creating shared memory segment: "\ "Race condition detected when initializing"); break; case EINVAL: show_memory_error("creating shared memory segment: "\ "Specified segment size too large or too small"); break; case ENOMEM: show_memory_error("creating shared memory segment: "\ "Not enough physical memory"); break; default: /* Generic error */ show_memory_error("creating shared memory segment failed"); break; } return NULL; } // Attach the segment to our data space shm=shmat(shmid,NULL,0); if (shm==(char *) -1) { show_memory_error("attaching shared memory segment failed"); return NULL; } return (void*) shm; #endif } static int free_shared_memory(int key) { #ifdef _WIN32 return 0; #else int shmflg; /* shmflg to be passed to shmget() */ int shmid; /* return value from shmget() */ int tmp; errno = 0; // Link to existing segment shmflg=0666; shmid=shmget((key_t)key, 0, shmflg); if (shmid < 0) { switch(errno) { case EACCES: show_memory_error("linking to shared memory segment (for freeing): "\ "Access denied"); break; case ENOENT: show_memory_error("linking to shared memory segment (for freeing): "\ "Segment does not exist"); break; default: show_memory_error("linking to shared memory segment (for freeing) failed"); break; } return -1; } // Free the segment tmp=shmctl(shmid, IPC_RMID, 
NULL); if (tmp==-1) { switch(errno) { case EPERM: show_memory_error("freeing shared memory segment: "\ "Permission denied"); break; default: show_memory_error("freeing shared memory segment failed"); break; } return -2; } return 0; #endif } static int detach_shared_memory(void* shmptr) { #ifdef _WIN32 return 0; #else int tmp; // detach the segment tmp=shmdt(shmptr); if (tmp==-1) { show_memory_error("detaching shared memory segment failed"); return -2; } return 0; #endif } /* ------------ error handling ---------------- */ /** Handle memory error * since these errors mostly indicate a fatal error related to database * memory allocation, the db pointer is not very useful here and is * omitted. */ static gint show_memory_error(char *errmsg) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"wg memory error: %s.\n", errmsg); #endif return -1; } #ifdef _WIN32 static gint show_memory_error_nr(char* errmsg, int nr) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"db memory allocation error: %s %d\n", errmsg, nr); #endif return -1; } #endif #ifdef __cplusplus } #endif whitedb-0.7.2/Db/dbmem.h000066400000000000000000000045021226454622500147410ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbmem.h * Public headers for database memory handling. 
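 *
 * A minimal attach/detach sketch, for illustration only (the key string and
 * segment size are arbitrary):
 *
 *   void *db = wg_attach_database("1000", 2000000);
 *   if(db) {
 *     ... read and write records ...
 *     wg_detach_database(db);    // detaches this process only
 *   }
 *   wg_delete_database("1000");  // removes the shared segment itself
 *
 * wg_attach_local_database()/wg_delete_local_database() follow the same
 * pattern for a process-private, malloc-backed database.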
*/ #ifndef DEFINED_DBMEM_H #define DEFINED_DBMEM_H #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #define DEFAULT_MEMDBASE_KEY 1000 //#define DEFAULT_MEMDBASE_SIZE 1000000 // 1 meg #define DEFAULT_MEMDBASE_SIZE 10000000 // 10 meg //#define DEFAULT_MEMDBASE_SIZE 800000000 // 800 meg //#define DEFAULT_MEMDBASE_SIZE 2000000000 #define MAX_FILENAME_SIZE 100 /* ====== data structures ======== */ /* ==== Protos ==== */ void* wg_attach_database(char* dbasename, gint size); // returns a pointer to the database, NULL if failure void* wg_attach_existing_database(char* dbasename); // like wg_attach_database, but does not create a new base void* wg_attach_logged_database(char* dbasename, gint size); // like wg_attach_database, but activates journal logging on creation void* wg_attach_memsegment(char* dbasename, gint minsize, gint size, int create, int logging); // same as wg_attach_database, does not check contents int wg_detach_database(void* dbase); // detaches a database: returns 0 if OK int wg_delete_database(char* dbasename); // deletes a database: returns 0 if OK int wg_check_header_compat(db_memsegment_header *dbh); // check memory image compatibility void wg_print_code_version(void); // show libwgdb version info void wg_print_header_version(db_memsegment_header *dbh); // show version info from header void* wg_attach_local_database(gint size); void wg_delete_local_database(void* dbase); #endif /* DEFINED_DBMEM_H */ whitedb-0.7.2/Db/dbmpool.c000066400000000000000000000317441226454622500153140ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbmpool.c * Allocating data using a temporary memory pool. 
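 *
 * Typical lifecycle, shown only as an illustration (sizes are arbitrary):
 *
 *   void *mpool = wg_create_mpool(db, 1024);   // initial free space in bytes
 *   char *buf = (char *) wg_alloc_mpool(db, mpool, 100);
 *   ... fill buf; further allocations grow the pool automatically ...
 *   wg_free_mpool(db, mpool);                  // releases every allocation at once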
* */ /* ====== Includes =============== */ #include #include #include #include #ifdef _WIN32 #include #else #include #include #endif #ifdef __cplusplus extern "C" { #endif #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "dballoc.h" #include "dbmem.h" #include "dbapi.h" /* ====== Private headers and defs ======== */ #define NROF_SUBAREAS 100 // size of subarea array #define MIN_FIRST_SUBAREA_SIZE 1024 // first free area minimum: if less asked, this given #define ALIGNMENT_BYTES 4 // every val returned by wg_alloc_mpool is aligned to this #define TYPEMASK 1 // memory pool convenience objects type mask for address #define PAIRBITS 0 // memory pool convenience objects type bit for pairs (lists) #define ATOMBITS 1 // memory pool convenience objects type bit for atoms (strings etc) /** located inside mpool_header: one single memory subarea header * * */ typedef struct _wg_mpoolsubarea_header { int size; /** size of subarea in bytes */ void* area_start; /** pointer to the first byte of the subarea */ void* area_end; /** pointer to the first byte after the subarea */ } mpool_subarea_header; /** memory pool management data * stored in the beginning of the first segment of mempool * */ typedef struct { void* freeptr; /** pointer to the next free location in the pool */ int cur_subarea; /** index of the currently used subarea in the subarea_table (starts with 0) */ int nrof_subareas; /** full nr of rows in the subarea table */ mpool_subarea_header subarea_table[NROF_SUBAREAS]; /** subarea information (mpool_subarea_header) table */ } mpool_header; /* ======= Private protos ================ */ static int extend_mpool(void* db, void* mpool, int minbytes); static int show_mpool_error(void* db, char* errmsg); static int show_mpool_error_nr(void* db, char* errmsg, int nr); static void wg_mpool_print_aux(void* db, void* ptr, int depth, int pflag); /* ====== Functions for mpool creating/extending/allocating/destroying ============== */ /** create and initialise a new memory pool * * initial pool has at least origbytes of free space * mpool is extended automatically later when space used up * returns void* pointer to mpool if OK, NULL if failure * * does a single malloc (latex extensions do further mallocs) */ void* wg_create_mpool(void* db, int origbytes) { int bytes; void* mpool; mpool_header* mpoolh; int puresize; void* nextptr; int i; if (origbytesfreeptr)=nextptr; (mpoolh->cur_subarea)=0; ((mpoolh->subarea_table)[0]).size=puresize; ((mpoolh->subarea_table)[0]).area_start=mpool; ((mpoolh->subarea_table)[0]).area_end=(void*)(((char*)mpool)+bytes); return mpool; } /** extend an existing memory pool * * called automatically when mpool space used up * does one malloc for a new subarea * */ static int extend_mpool(void* db, void* mpool, int minbytes) { int cursize; int bytes; void* subarea; mpool_header* mpoolh; int i; void* nextptr; mpoolh=(mpool_header*)mpool; cursize=((mpoolh->subarea_table)[(mpoolh->cur_subarea)]).size; bytes=cursize; for(i=0;i<100;i++) { bytes=bytes*2; if (bytes>=(minbytes+ALIGNMENT_BYTES)) break; } subarea=malloc(bytes); if (subarea==NULL) { show_mpool_error_nr(db, " cannot extend mpool to size: ",minbytes); return -1; } (mpoolh->freeptr)=subarea; (mpoolh->cur_subarea)++; ((mpoolh->subarea_table)[mpoolh->cur_subarea]).size=bytes; nextptr=subarea; // set correct alignment for nextptr i=((size_t)nextptr)%ALIGNMENT_BYTES; if (i!=0) nextptr=((char*)nextptr)+(ALIGNMENT_BYTES-i); // aligment now ok (mpoolh->freeptr)=nextptr; 
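  /* Record the new subarea's bounds in the table so that wg_free_mpool()
     can release this malloc()-ed block later. */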
((mpoolh->subarea_table)[mpoolh->cur_subarea]).area_start=subarea; ((mpoolh->subarea_table)[mpoolh->cur_subarea]).area_end=(void*)(((char*)subarea)+bytes); return 0; } /** free the whole memory pool * * frees all the malloced subareas and initial mpool * */ void wg_free_mpool(void* db, void* mpool) { int i; mpool_header* mpoolh; mpoolh=(mpool_header*)mpool; i=mpoolh->cur_subarea; for(;i>0;i--) { free(((mpoolh->subarea_table)[i]).area_start); } free(mpool); } /** allocate bytes from a memory pool: analogous to malloc * * mpool is extended automatically if not enough free space present * returns void* pointer to a memory block if OK, NULL if failure * */ void* wg_alloc_mpool(void* db, void* mpool, int bytes) { void* curptr; void* nextptr; mpool_header* mpoolh; void* curend; int tmp; int i; if (bytes<=0) { show_mpool_error_nr(db, " trying to allocate too little from mpool: ",bytes); return NULL; } if (mpool==NULL) { show_mpool_error(db," mpool passed to wg_alloc_mpool is NULL "); return NULL; } mpoolh=(mpool_header*)mpool; nextptr=(void*)(((char*)(mpoolh->freeptr))+bytes); curend=((mpoolh->subarea_table)[(mpoolh->cur_subarea)]).area_end; if (nextptr>curend) { tmp=extend_mpool(db,mpool,bytes); if (tmp!=0) { show_mpool_error_nr(db," cannot extend mpool size by: ",bytes); return NULL; } nextptr=((char*)(mpoolh->freeptr))+bytes; } curptr=mpoolh->freeptr; // set correct alignment for nextptr i=((size_t)nextptr)%ALIGNMENT_BYTES; if (i!=0) nextptr=((char*)nextptr)+(ALIGNMENT_BYTES-i); // alignment now ok mpoolh->freeptr=nextptr; return curptr; } /* ====== Convenience functions for using data allocated from mpool ================= */ /* Core object types are pairs and atoms plus 0 (NULL). Lists are formed by pairs of gints. Each pair starts at address with two last bits 0. The first element of the pair points to the contents of the cell, the second to rest. Atoms may contain strings, ints etc etc. Each atom starts at address with last bit 1. The first byte of the atom indicates its type. The following bytes are content, always encoded as a 0-terminated string or TWO consequent 0-terminated strings. The atom type byte contains dbapi.h values: STRING, CONVERSION TO BE DETERMINED LATER: 0 #define WG_NULLTYPE 1 #define WG_RECORDTYPE 2 #define WG_INTTYPE 3 #define WG_DOUBLETYPE 4 #define WG_STRTYPE 5 #define WG_XMLLITERALTYPE 6 #define WG_URITYPE 7 #define WG_BLOBTYPE 8 #define WG_CHARTYPE 9 #define WG_FIXPOINTTYPE 10 #define WG_DATETYPE 11 #define WG_TIMETYPE 12 #define WG_ANONCONSTTYPE 13 #define WG_VARTYPE 14 #define WG_ILLEGAL 0xff Atom types 5-8 (strings, xmlliterals, uris, blobs) contain TWO consequent strings, first the main, terminating 0, then the second (lang, namespace etc) and the terminating 0. Two terminating 0-s after the first indicates the missing second string (NULL). Other types are simply terminated by two 0-s. 
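
For illustration only (not part of the original description), a two-element
list of strings could be built and printed like this:

  void *a = wg_mkatom(db, mpool, WG_STRTYPE, "foo", NULL);
  void *b = wg_mkatom(db, mpool, WG_STRTYPE, "bar", NULL);
  void *lst = wg_mkpair(db, mpool, a, wg_mkpair(db, mpool, b, NULL));
  wg_mpool_print(db, lst);    // prints something like (s:foo s:bar)
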
*/ // ------------- pairs ---------------- int wg_ispair(void* db, void* ptr) { return (ptr!=NULL && ((((gint)ptr)&TYPEMASK)==PAIRBITS)); } void* wg_mkpair(void* db, void* mpool, void* x, void* y) { void* ptr; ptr=wg_alloc_mpool(db,mpool,sizeof(gint)*2); if (ptr==NULL) { show_mpool_error(db,"cannot create a pair in mpool"); return NULL; } *((gint*)ptr)=(gint)x; *((gint*)ptr+1)=(gint)y; return ptr; } void* wg_first(void* db, void* ptr) { return (void*)(*((gint*)ptr)); } void* wg_rest(void* db, void *ptr) { return (void*)(*((gint*)ptr+1)); } int wg_listtreecount(void* db, void *ptr) { if (wg_ispair(db,ptr)) return wg_listtreecount(db,wg_first(db,ptr)) + wg_listtreecount(db,wg_rest(db,ptr)); else return 1; } // ------------ atoms ------------------ int wg_isatom(void* db, void* ptr) { return (ptr!=NULL && ((((gint)ptr)&TYPEMASK)==ATOMBITS)); } void* wg_mkatom(void* db, void* mpool, int type, char* str1, char* str2) { char* ptr; char* curptr; int size=2; if (str1!=NULL) size=size+strlen(str1); size++; if (str2!=NULL) size=size+strlen(str2); size++; ptr=(char*)(wg_alloc_mpool(db,mpool,size)); if (ptr==NULL) { show_mpool_error(db,"cannot create an atom in mpool"); return NULL; } ptr++; // shift one pos right to set address last byte 1 curptr=ptr; *curptr=(char)type; curptr++; if (str1!=NULL) { while((*curptr++ = *str1++)); } else { *curptr=(char)0; curptr++; } if (str2!=NULL) { while((*curptr++ = *str2++)); } else { *curptr=(char)0; curptr++; } return ptr; } int wg_atomtype(void* db, void* ptr) { if (ptr==NULL) return 0; else return (gint)*((char*)ptr); } char* wg_atomstr1(void* db, void* ptr) { if (ptr==NULL) return NULL; if (*(((char*)ptr)+1)==(char)0) return NULL; else return ((char*)ptr)+1; } char* wg_atomstr2(void* db, void* ptr) { if (ptr==NULL) return NULL; ptr=(char*)ptr+strlen((char*)ptr)+1; if (*(((char*)ptr)+1)==(char)0) return NULL; else return ((char*)ptr); } // ------------ printing ------------------ void wg_mpool_print(void* db, void* ptr) { wg_mpool_print_aux(db,ptr,0,1); } static void wg_mpool_print_aux(void* db, void* ptr, int depth, int pflag) { int type; char* p; int count; int ppflag=0; int i; void *curptr; if (ptr==NULL) { printf("()"); } else if (wg_isatom(db,ptr)) { type=wg_atomtype(db,ptr); switch (type) { case 0: printf("_:"); break; case WG_NULLTYPE: printf("n:"); break; case WG_RECORDTYPE: printf("r:"); break; case WG_INTTYPE: printf("i:"); break; case WG_DOUBLETYPE: printf("d:"); break; case WG_STRTYPE: printf("s:"); break; case WG_XMLLITERALTYPE: printf("x:"); break; case WG_URITYPE: printf("u:"); break; case WG_BLOBTYPE: printf("b:"); break; case WG_CHARTYPE: printf("c:"); break; case WG_FIXPOINTTYPE: printf("f:"); break; case WG_DATETYPE: printf("date:"); break; case WG_TIMETYPE: printf("time:"); break; case WG_ANONCONSTTYPE: printf("a:"); break; case WG_VARTYPE: printf("?:"); break; default: printf("!:"); } p=wg_atomstr1(db,ptr); if (p!=NULL) { if (strchr(p,' ')!=NULL || strchr(p,'\n')!=NULL || strchr(p,'\t')!=NULL) { printf("\"%s\"",p); } else { printf("%s",p); } } else { printf("\"\""); } p=wg_atomstr2(db,ptr); if (p!=NULL) { if (strchr(p,' ')!=NULL || strchr(p,'\n')!=NULL || strchr(p,'\t')!=NULL) { printf("^^\"%s\"",p); } else { printf("^^%s",p); } } } else { if (pflag && wg_listtreecount(db,ptr)>10) ppflag=1; printf ("("); for(curptr=ptr, count=0;curptr!=NULL && !wg_isatom(db,curptr);curptr=wg_rest(db,curptr), count++) { if (count>0) { if (ppflag) { printf("\n"); for(i=0;i. * */ /** @file dbmpool.h * Public headers for memory pool utilities. 
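 *
 * Illustrative traversal of a list built with wg_mkpair(), a sketch only:
 *
 *   void *p;
 *   for(p = lst; wg_ispair(db, p); p = wg_rest(db, p)) {
 *     void *elem = wg_first(db, p);
 *     if(wg_isatom(db, elem)) {
 *       char *s = wg_atomstr1(db, elem);
 *       printf("%s\n", (s ? s : ""));
 *     }
 *   }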
*/ #ifndef DEFINED_DBMPOOL_H #define DEFINED_DBMPOOL_H #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif /* ====== data structures ======== */ /* ==== Protos ==== */ void* wg_create_mpool(void* db, int bytes); // call this to init pool with initial size bytes void* wg_alloc_mpool(void* db, void* mpool, int bytes); // call each time you want to "malloc": // automatically extends pool if no space left void wg_free_mpool(void* db, void* mpool); // remove the whole pool int wg_ispair(void* db, void* ptr); void* wg_mkpair(void* db, void* mpool, void* x, void* y); void* wg_first(void* db, void* ptr); void* wg_rest(void* db, void *ptr); int wg_listtreecount(void* db, void *ptr); int wg_isatom(void* db, void* ptr); void* wg_mkatom(void* db, void* mpool, int type, char* str1, char* str2); int wg_atomtype(void* db, void* ptr); char* wg_atomstr1(void* db, void* ptr); char* wg_atomstr2(void* db, void* ptr); void wg_mpool_print(void* db, void* ptr); #endif /* DEFINED_DBMPOOL_H */ whitedb-0.7.2/Db/dbquery.c000066400000000000000000001650401226454622500153300ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Priit Järv 2010,2011,2013 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbquery.c * WhiteDB query engine. */ /* ====== Includes =============== */ #include #include #include /* ====== Private headers and defs ======== */ #ifdef __cplusplus extern "C" { #endif #include "dballoc.h" #include "dbquery.h" #include "dbcompare.h" #include "dbmpool.h" #include "dbschema.h" /* T-tree based scoring */ #define TTREE_SCORE_EQUAL 5 #define TTREE_SCORE_BOUND 2 #define TTREE_SCORE_NULL -1 /** penalty for null values, which * are likely to be abundant */ #define TTREE_SCORE_MASK 5 /** matching field in template */ /* Query flags for internal use */ #define QUERY_FLAGS_PREFETCH 0x1000 #define QUERY_RESULTSET_PAGESIZE 63 /* mpool is aligned, so we can align * the result pages too by selecting an * appropriate size */ /* Emulate array index when doing a scan of key-value pairs * in a JSON query. * If this is not desirable, commenting this out makes * scans somewhat faster. 
*/ #define JSON_SCAN_UNWRAP_ARRAY struct __query_result_page { gint rows[QUERY_RESULTSET_PAGESIZE]; struct __query_result_page *next; }; typedef struct __query_result_page query_result_page; typedef struct { query_result_page *page; /** current page of results */ gint pidx; /** current index on page (reading) */ } query_result_cursor; typedef struct { void *mpool; /** storage for row offsets */ query_result_page *first_page; /** first page of results, for rewinding */ query_result_cursor wcursor; /** read cursor */ query_result_cursor rcursor; /** write cursor */ gint res_count; /** number of rows in results */ } query_result_set; /* ======= Private protos ================ */ static gint most_restricting_column(void *db, wg_query_arg *arglist, gint argc, gint *index_id); static gint check_arglist(void *db, void *rec, wg_query_arg *arglist, gint argc); static gint prepare_params(void *db, void *matchrec, gint reclen, wg_query_arg *arglist, gint argc, wg_query_arg **farglist, gint *fargc); static gint find_ttree_bounds(void *db, gint index_id, gint col, gint start_bound, gint end_bound, gint start_inclusive, gint end_inclusive, gint *curr_offset, gint *curr_slot, gint *end_offset, gint *end_slot); static wg_query *internal_build_query(void *db, void *matchrec, gint reclen, wg_query_arg *arglist, gint argc, gint flags, wg_uint rowlimit); static query_result_set *create_resultset(void *db); static void free_resultset(void *db, query_result_set *set); static void rewind_resultset(void *db, query_result_set *set); static gint append_resultset(void *db, query_result_set *set, gint offset); static gint fetch_resultset(void *db, query_result_set *set); static query_result_set *intersect_resultset(void *db, query_result_set *seta, query_result_set *setb); static gint encode_query_param_unistr(void *db, char *data, gint type, char *extdata, int length); static gint show_query_error(void* db, char* errmsg); /*static gint show_query_error_nr(void* db, char* errmsg, gint nr);*/ /* ====== Functions ============== */ /** Find most restricting column from query argument list * This is probably a reasonable approach to optimize queries * based on T-tree indexes, but might be difficult to combine * with hash indexes. * XXX: currently only considers the existence of T-tree * index and nothing else. */ static gint most_restricting_column(void *db, wg_query_arg *arglist, gint argc, gint *index_id) { struct column_score { gint column; int score; int index_id; }; struct column_score *sc; int i, j, mrc_score = -1; gint mrc = -1; db_memsegment_header* dbh = dbmemsegh(db); sc = (struct column_score *) malloc(argc * sizeof(struct column_score)); if(!sc) { show_query_error(db, "Failed to allocate memory"); return -1; } /* Scan through the arguments and calculate accumulated score * for each column. */ for(i=0; iindex_control_area_header.index_table[sc[i].column]; while(*ilist) { gcell *ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car) { wg_index_header *hdr = \ (wg_index_header *) offsettoptr(db, ilistelem->car); if(hdr->type == WG_INDEX_TYPE_TTREE) { #ifdef USE_INDEX_TEMPLATE /* If index templates are available, we can increase the * score of the index if the template has any columns matching * the query parameters. On the other hand, in case of a * mismatch the index is unusable and has to be skipped. 
* The indexes are sorted in the order of fixed columns in * the template, so if there is a match, the search is * complete (remaining index are likely to be worse) */ if(hdr->template_offset) { wg_index_template *tmpl = \ (wg_index_template *) offsettoptr(db, hdr->template_offset); void *matchrec = offsettoptr(db, tmpl->offset_matchrec); gint reclen = wg_get_record_len(db, matchrec); for(j=0; jcar; break; } } #ifdef USE_INDEX_TEMPLATE nextindex: #endif ilist = &ilistelem->cdr; } } if(!sc[i].index_id) sc[i].score = 0; /* no index, score reset */ if(sc[i].score > mrc_score) { mrc_score = sc[i].score; mrc = sc[i].column; *index_id = sc[i].index_id; } } /* TODO: does the best score have no index? In that case, * try to locate an index that would restrict at least * some columns. */ free(sc); return mrc; } /** Check a record against list of conditions * returns 1 if the record matches * returns 0 if the record fails at least one condition */ static gint check_arglist(void *db, void *rec, wg_query_arg *arglist, gint argc) { int i, reclen; reclen = wg_get_record_len(db, rec); for(i=0; inumber_of_elements <= cs) { /* Crossed node boundary */ co = TNODE_SUCCESSOR(db, node); cs = 0; } } else if(boundtype == DEAD_END_RIGHT_NOT_BOUNDING) { /* Since exact value was not found, this case is exactly * the same as with the inclusive range. */ node = (struct wg_tnode *) offsettoptr(db, co); co = TNODE_SUCCESSOR(db, node); cs = 0; } else if(boundtype == DEAD_END_LEFT_NOT_BOUNDING) { /* No exact value in tree, same as inclusive range */ cs = 0; } } } /* Finding of the end of the range is more or less opposite * of finding the beginning. */ if(end_bound==WG_ILLEGAL) { /* Rightmost node in index */ #ifdef TTREE_CHAINED_NODES eo = TTREE_MAX_NODE(hdr); #else /* GLB search on root node returns the rightmost node in tree */ eo = wg_ttree_find_glb_node(db, TTREE_ROOT_NODE(hdr)); #endif if(eo) { node = (struct wg_tnode *) offsettoptr(db, eo); es = node->number_of_elements - 1; /* rightmost slot */ } } else { gint boundtype; if(end_inclusive) { /* Find the rightmost node with a given value and the * righmost slot that is equal or smaller than that value */ eo = wg_search_ttree_rightmost(db, TTREE_ROOT_NODE(hdr), end_bound, &boundtype, NULL); if(boundtype == REALLY_BOUNDING_NODE) { es = wg_search_tnode_last(db, eo, end_bound, col); if(es == -1) { show_query_error(db, "Ending index node was bad"); return -1; } } else if(boundtype == DEAD_END_RIGHT_NOT_BOUNDING) { /* Last node containing values in range. */ node = (struct wg_tnode *) offsettoptr(db, eo); es = node->number_of_elements - 1; } else if(boundtype == DEAD_END_LEFT_NOT_BOUNDING) { /* Previous node should be in range. */ node = (struct wg_tnode *) offsettoptr(db, eo); eo = TNODE_PREDECESSOR(db, node); if(eo) { node = (struct wg_tnode *) offsettoptr(db, eo); es = node->number_of_elements - 1; /* rightmost */ } } } else { /* For non-inclusive, we need the leftmost node and * the first slot-1. 
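 * (The result range then ends immediately before the first slot that holds
 * end_bound.)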
*/ eo = wg_search_ttree_leftmost(db, TTREE_ROOT_NODE(hdr), end_bound, &boundtype, NULL); if(boundtype == REALLY_BOUNDING_NODE) { es = wg_search_tnode_first(db, eo, end_bound, col); if(es == -1) { show_query_error(db, "Ending index node was bad"); return -1; } es--; if(es < 0) { /* Crossed node boundary */ node = (struct wg_tnode *) offsettoptr(db, eo); eo = TNODE_PREDECESSOR(db, node); if(eo) { node = (struct wg_tnode *) offsettoptr(db, eo); es = node->number_of_elements - 1; } } } else if(boundtype == DEAD_END_RIGHT_NOT_BOUNDING) { /* No exact value in tree, same as inclusive range */ node = (struct wg_tnode *) offsettoptr(db, eo); es = node->number_of_elements - 1; } else if(boundtype == DEAD_END_LEFT_NOT_BOUNDING) { /* No exact value in tree, same as inclusive range */ node = (struct wg_tnode *) offsettoptr(db, eo); eo = TNODE_PREDECESSOR(db, node); if(eo) { node = (struct wg_tnode *) offsettoptr(db, eo); es = node->number_of_elements - 1; /* rightmost slot */ } } } } /* Now detect the cases where the above bound search * has produced a result with an empty range. */ if(co) { /* Value could be bounded inside a node, but actually * not present. Note that we require the end_slot to be * >= curr_slot, this implies that query->direction == 1. */ if(eo == co && es < cs) { co = 0; /* query will return no rows */ eo = 0; } else if(!eo) { /* If one offset is 0 the other should be forced to 0, so that * if we want to switch direction we won't run into any surprises. */ co = 0; } else { /* Another case we have to watch out for is when we have a * range that fits in the space between two nodes. In that case * the end offset will end up directly left of the start offset. */ node = (struct wg_tnode *) offsettoptr(db, co); if(eo == TNODE_PREDECESSOR(db, node)) { co = 0; /* no rows */ eo = 0; } } } else { eo = 0; /* again, if one offset is 0, * the other should be, too */ } *curr_offset = co; *curr_slot = cs; *end_offset = eo; *end_slot = es; return 0; } /** Create a query object. * * matchrec - array of encoded integers. Can be a pointer to a database record * or a user-allocated array. If reclen is 0, it is treated as a native * database record. If reclen is non-zero, reclen number of gint-sized * words is read, starting from the pointer. * * Fields of type WG_VARTYPE in matchrec are treated as wildcards. Other * types, including NULL, are used as "equals" conditions. * * arglist - array of wg_query_arg objects. The size is must be given * by argc. * * flags - type of query requested and other parameters * * rowlimit - maximum number of rows fetched. Only has an effect if * QUERY_FLAGS_PREFETCH is set. * * returns NULL if constructing the query fails. Otherwise returns a pointer * to a wg_query object. */ static wg_query *internal_build_query(void *db, void *matchrec, gint reclen, wg_query_arg *arglist, gint argc, gint flags, wg_uint rowlimit) { wg_query *query; wg_query_arg *full_arglist; gint fargc = 0; gint col, index_id = -1; int i; #ifdef CHECK if (!dbcheck(db)) { /* XXX: currently show_query_error would work too */ #ifdef WG_NO_ERRPRINT #else fprintf(stderr, "Invalid database pointer in wg_make_query.\n"); #endif return NULL; } #endif /* Check and prepare the parameters. If there was an error, * prepare_params() does it's own cleanup so we can (and should) * return immediately. 
*/ if(prepare_params(db, matchrec, reclen, arglist, argc, &full_arglist, &fargc)) { return NULL; } query = (wg_query *) malloc(sizeof(wg_query)); if(!query) { show_query_error(db, "Failed to allocate memory"); return NULL; } if(fargc) { /* Find the best (hopefully) index to base the query on. * Then initialise the query object to the first row in the * query result set. * XXX: only considering T-tree indexes now. */ col = most_restricting_column(db, full_arglist, fargc, &index_id); } else { /* Create a "full scan" query with no arguments. */ index_id = -1; full_arglist = NULL; /* redundant/paranoia */ } if(index_id > 0) { int start_inclusive = 0, end_inclusive = 0; gint start_bound = WG_ILLEGAL; /* encoded values */ gint end_bound = WG_ILLEGAL; query->qtype = WG_QTYPE_TTREE; query->column = col; query->curr_offset = 0; query->curr_slot = -1; query->end_offset = 0; query->end_slot = -1; query->direction = 1; /* Determine the bounds for the given column/index. * * Examples of using rightmost and leftmost bounds in T-tree queries: * val = 5 ==> * find leftmost (A) and rightmost (B) nodes that contain value 5. * Follow nodes sequentially from A until B is reached. * val > 1 & val < 7 ==> * find rightmost node with value 1 (A). Find leftmost node with * value 7 (B). Find the rightmost value in A that still equals 1. * The value immediately to the right is the beginning of the result * set and the value immediately to the left of the first occurrence * of 7 in B is the end of the result set. * val > 1 & val <= 7 ==> * A is the same as above. Find rightmost node with value 7 (B). The * beginning of the result set is the same as above, the end is the * last slot in B with value 7. * val <= 1 ==> * find rightmost node with value 1. Find the last (rightmost) slot * containing 1. The result set begins with that value, scan left * until the end of chain is reached. */ for(i=0; i= 1 & val <= 1 */ if(start_bound==WG_ILLEGAL ||\ WG_COMPARE(db, start_bound, full_arglist[i].value)==WG_LESSTHAN) { start_bound = full_arglist[i].value; start_inclusive = 1; } if(end_bound==WG_ILLEGAL ||\ WG_COMPARE(db, end_bound, full_arglist[i].value)==WG_GREATER) { end_bound = full_arglist[i].value; end_inclusive = 1; } break; case WG_COND_LESSTHAN: /* No earlier right bound or new end bound is a smaller * value (reducing the result set). The result set is also * possibly reduced if the value is equal, because this * condition is non-inclusive. 
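 * (For example, an earlier inclusive bound of 5 combined with a new
 * "less than 5" condition leaves the tighter, non-inclusive bound.)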
*/ if(end_bound==WG_ILLEGAL ||\ WG_COMPARE(db, end_bound, full_arglist[i].value)!=WG_LESSTHAN) { end_bound = full_arglist[i].value; end_inclusive = 0; } break; case WG_COND_GREATER: /* No earlier left bound or new left bound is >= of old value */ if(start_bound==WG_ILLEGAL ||\ WG_COMPARE(db, start_bound, full_arglist[i].value)!=WG_GREATER) { start_bound = full_arglist[i].value; start_inclusive = 0; } break; case WG_COND_LTEQUAL: /* Similar to "less than", but inclusive */ if(end_bound==WG_ILLEGAL ||\ WG_COMPARE(db, end_bound, full_arglist[i].value)==WG_GREATER) { end_bound = full_arglist[i].value; end_inclusive = 1; } break; case WG_COND_GTEQUAL: /* Similar to "greater", but inclusive */ if(start_bound==WG_ILLEGAL ||\ WG_COMPARE(db, start_bound, full_arglist[i].value)==WG_LESSTHAN) { start_bound = full_arglist[i].value; start_inclusive = 1; } break; case WG_COND_NOT_EQUAL: /* Force use of full argument list to check each row in the result * set since we have a condition we cannot satisfy using * a continuous range of T-tree values alone */ query->column = -1; break; default: show_query_error(db, "Invalid condition (ignoring)"); break; } } /* Simple sanity check. Is start_bound greater than end_bound? */ if(start_bound!=WG_ILLEGAL && end_bound!=WG_ILLEGAL &&\ WG_COMPARE(db, start_bound, end_bound) == WG_GREATER) { /* return empty query */ query->argc = 0; query->arglist = NULL; free(full_arglist); return query; } /* Now find the bounding nodes for the query */ if(find_ttree_bounds(db, index_id, col, start_bound, end_bound, start_inclusive, end_inclusive, &query->curr_offset, &query->curr_slot, &query->end_offset, &query->end_slot)) { free(query); free(full_arglist); return NULL; } /* XXX: here we can reverse the direction and switch the start and * end nodes/slots, if "descending" sort order is needed. */ } else { /* Nothing better than full scan available */ void *rec; query->qtype = WG_QTYPE_SCAN; query->column = -1; /* no special column, entire argument list * should be checked for each row */ rec = wg_get_first_record(db); if(rec) query->curr_record = ptrtooffset(db, rec); else query->curr_record = 0; } /* Now attach the argument list to the query. If the query is based * on a column index, we will create a slimmer copy that does not contain * the conditions already satisfied by the index bounds. */ if(query->column == -1) { query->arglist = full_arglist; query->argc = fargc; } else { int cnt = 0; for(i=0; icolumn) cnt++; } /* The argument list is reduced, but still contains columns */ if(cnt) { int j; query->arglist = (wg_query_arg *) malloc(cnt * sizeof(wg_query_arg)); if(!query->arglist) { show_query_error(db, "Failed to allocate memory"); free(query); free(full_arglist); return NULL; } for(i=0, j=0; icolumn) { query->arglist[j].column = full_arglist[i].column; query->arglist[j].cond = full_arglist[i].cond; query->arglist[j++].value = full_arglist[i].value; } } } else query->arglist = NULL; query->argc = cnt; free(full_arglist); /* Now we have a reduced argument list, free * the original one */ } /* Now handle any post-processing required. 
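 * (With QUERY_FLAGS_PREFETCH set this means materialising the whole result
 * set into mpool-backed pages below.)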
*/ if(flags & QUERY_FLAGS_PREFETCH) { query_result_page **prevnext; query_result_page *currpage; void *rec; query->curr_page = NULL; /* initialize as empty */ query->curr_pidx = 0; query->res_count = 0; /* XXX: could move this inside the loop (speeds up empty * query, slows down other queries) */ query->mpool = wg_create_mpool(db, sizeof(query_result_page)); if(!query->mpool) { show_query_error(db, "Failed to allocate result memory pool"); wg_free_query(db, query); return NULL; } i = QUERY_RESULTSET_PAGESIZE; prevnext = (query_result_page **) &(query->curr_page); while((rec = wg_fetch(db, query))) { if(i >= QUERY_RESULTSET_PAGESIZE) { currpage = (query_result_page *) \ wg_alloc_mpool(db, query->mpool, sizeof(query_result_page)); if(!currpage) { show_query_error(db, "Failed to allocate a resultset row"); wg_free_query(db, query); return NULL; } memset(currpage->rows, 0, sizeof(gint) * QUERY_RESULTSET_PAGESIZE); *prevnext = currpage; prevnext = &(currpage->next); currpage->next = NULL; i = 0; } currpage->rows[i++] = ptrtooffset(db, rec); query->res_count++; if(rowlimit && query->res_count >= rowlimit) break; } /* Finally, convert the query type. */ query->qtype = WG_QTYPE_PREFETCH; } return query; } /** Create a query object and pre-fetch all data rows. * * Allocates enough space to hold all row offsets, fetches them and stores * them in an array. Isolation is not guaranteed in any way, shape or form, * but can be implemented on top by the user. * * returns NULL if constructing the query fails. Otherwise returns a pointer * to a wg_query object. */ wg_query *wg_make_query(void *db, void *matchrec, gint reclen, wg_query_arg *arglist, gint argc) { return internal_build_query(db, matchrec, reclen, arglist, argc, QUERY_FLAGS_PREFETCH, 0); } /** Create a query object and pre-fetch rowlimit number of rows. * * returns NULL if constructing the query fails. Otherwise returns a pointer * to a wg_query object. */ wg_query *wg_make_query_rc(void *db, void *matchrec, gint reclen, wg_query_arg *arglist, gint argc, wg_uint rowlimit) { return internal_build_query(db, matchrec, reclen, arglist, argc, QUERY_FLAGS_PREFETCH, rowlimit); } /** Return next record from the query object * returns NULL if no more records */ void *wg_fetch(void *db, wg_query *query) { void *rec; #ifdef CHECK if (!dbcheck(db)) { /* XXX: currently show_query_error would work too */ #ifdef WG_NO_ERRPRINT #else fprintf(stderr, "Invalid database pointer in wg_fetch.\n"); #endif return NULL; } if(!query) { show_query_error(db, "Invalid query object"); return NULL; } #endif if(query->qtype == WG_QTYPE_SCAN) { for(;;) { void *next; if(!query->curr_record) { /* Query exhausted */ return NULL; } rec = offsettoptr(db, query->curr_record); /* Pre-fetch the next record */ next = wg_get_next_record(db, rec); if(next) query->curr_record = ptrtooffset(db, next); else query->curr_record = 0; /* Check the record against all conditions; if it does * not match, go to next iteration. */ if(!query->arglist || \ check_arglist(db, rec, query->arglist, query->argc)) return rec; } } else if(query->qtype == WG_QTYPE_TTREE) { struct wg_tnode *node; for(;;) { if(!query->curr_offset) { /* No more nodes to examine */ return NULL; } node = (struct wg_tnode *) offsettoptr(db, query->curr_offset); rec = offsettoptr(db, node->array_of_values[query->curr_slot]); /* Increment the slot/and or node cursors before we * return. If the current node does not satisfy the * argument list we may need to do this multiple times. 
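 * (Hence the enclosing for(;;) loop: the cursor keeps advancing until a row
 * passes check_arglist() or the range is exhausted.)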
*/ if(query->curr_offset==query->end_offset && \ query->curr_slot==query->end_slot) { /* Last slot reached, mark the query as exchausted */ query->curr_offset = 0; } else { /* Some rows still left */ query->curr_slot += query->direction; if(query->curr_slot < 0) { #ifdef CHECK if(query->end_offset==query->curr_offset) { /* This should not happen */ show_query_error(db, "Warning: end slot mismatch, possible bug"); query->curr_offset = 0; } else { #endif query->curr_offset = TNODE_PREDECESSOR(db, node); if(query->curr_offset) { node = (struct wg_tnode *) offsettoptr(db, query->curr_offset); query->curr_slot = node->number_of_elements - 1; } #ifdef CHECK } #endif } else if(query->curr_slot >= node->number_of_elements) { #ifdef CHECK if(query->end_offset==query->curr_offset) { /* This should not happen */ show_query_error(db, "Warning: end slot mismatch, possible bug"); query->curr_offset = 0; } else { #endif query->curr_offset = TNODE_SUCCESSOR(db, node); query->curr_slot = 0; #ifdef CHECK } #endif } } /* If there are no extra conditions or the row satisfies * all the conditions, we can return. */ if(!query->arglist || \ check_arglist(db, rec, query->arglist, query->argc)) return rec; } } if(query->qtype == WG_QTYPE_PREFETCH) { if(query->curr_page) { query_result_page *currpage = (query_result_page *) query->curr_page; gint offset = currpage->rows[query->curr_pidx++]; if(!offset) { /* page not filled completely */ query->curr_page = NULL; return NULL; } else { if(query->curr_pidx >= QUERY_RESULTSET_PAGESIZE) { query->curr_page = (void *) (currpage->next); query->curr_pidx = 0; } } return offsettoptr(db, offset); } else return NULL; } else { show_query_error(db, "Unsupported query type"); return NULL; } } /** Release the memory allocated for the query */ void wg_free_query(void *db, wg_query *query) { if(query->arglist) free(query->arglist); if(query->qtype==WG_QTYPE_PREFETCH && query->mpool) wg_free_mpool(db, query->mpool); free(query); } /* ----------- query parameter preparing functions -------------*/ /* Types that use no storage are encoded * using standard API functions. */ gint wg_encode_query_param_null(void *db, char *data) { return wg_encode_null(db, data); } gint wg_encode_query_param_record(void *db, void *data) { return wg_encode_record(db, data); } gint wg_encode_query_param_char(void *db, char data) { return wg_encode_char(db, data); } gint wg_encode_query_param_fixpoint(void *db, double data) { return wg_encode_fixpoint(db, data); } gint wg_encode_query_param_date(void *db, int data) { return wg_encode_date(db, data); } gint wg_encode_query_param_time(void *db, int data) { return wg_encode_time(db, data); } gint wg_encode_query_param_var(void *db, gint data) { return wg_encode_var(db, data); } /* Types using storage are encoded by emulating the behaviour * of dbdata.c functions. 
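 *
 * Illustrative use of an encoded query parameter (a sketch, not from the
 * original comment; cleanup of locally allocated parameters is omitted):
 *
 *   wg_query_arg arg;
 *   arg.column = 2;
 *   arg.cond = WG_COND_EQUAL;
 *   arg.value = wg_encode_query_param_int(db, 42);
 *   wg_query *q = wg_make_query(db, NULL, 0, &arg, 1);
 *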
Some assumptions are made about storage * size of the data (but similar assumptions exist in dbdata.c) */ gint wg_encode_query_param_int(void *db, gint data) { void *dptr; if(fits_smallint(data)) { return encode_smallint(data); } else { dptr=malloc(sizeof(gint)); if(!dptr) { show_query_error(db, "Failed to encode query parameter"); return WG_ILLEGAL; } *((gint *) dptr) = data; return encode_fullint_offset(ptrtooffset(db, dptr)); } } gint wg_encode_query_param_double(void *db, double data) { void *dptr; dptr=malloc(2*sizeof(gint)); if(!dptr) { show_query_error(db, "Failed to encode query parameter"); return WG_ILLEGAL; } *((double *) dptr) = data; return encode_fulldouble_offset(ptrtooffset(db, dptr)); } gint wg_encode_query_param_str(void *db, char *data, char *lang) { if(data) { return encode_query_param_unistr(db, data, WG_STRTYPE, lang, strlen(data)); } else { show_query_error(db, "NULL pointer given as parameter"); return WG_ILLEGAL; } } gint wg_encode_query_param_xmlliteral(void *db, char *data, char *xsdtype) { if(data) { return encode_query_param_unistr(db, data, WG_XMLLITERALTYPE, xsdtype, strlen(data)); } else { show_query_error(db, "NULL pointer given as parameter"); return WG_ILLEGAL; } } gint wg_encode_query_param_uri(void *db, char *data, char *prefix) { if(data) { return encode_query_param_unistr(db, data, WG_URITYPE, prefix, strlen(data)); } else { show_query_error(db, "NULL pointer given as parameter"); return WG_ILLEGAL; } } /* Encode shortstr- or longstr-compatible data in local memory. * string type without lang is handled as "short", ignoring the * actual length. All other types require longstr storage to * handle the extdata field. */ static gint encode_query_param_unistr(void *db, char *data, gint type, char *extdata, int length) { void *dptr; if(type == WG_STRTYPE && extdata == NULL) { dptr=malloc(length+1); if(!dptr) { show_query_error(db, "Failed to encode query parameter"); return WG_ILLEGAL; } memcpy((char *) dptr, data, length); ((char *) dptr)[length] = '\0'; return encode_shortstr_offset(ptrtooffset(db, dptr)); } else { size_t i; int extlen = 0; int dlen, lengints, lenrest; gint offset, meta; if(type != WG_BLOBTYPE) length++; /* include the terminating 0 */ /* Determine storage size */ lengints = length / sizeof(gint); lenrest = length % sizeof(gint); if(lenrest) lengints++; dlen = sizeof(gint) * (LONGSTR_HEADER_GINTS + lengints); /* Emulate the behaviour of wg_alloc_gints() */ if(dlen < MIN_VARLENOBJ_SIZE) dlen = MIN_VARLENOBJ_SIZE; if(dlen % 8) dlen += 4; if(extdata) { extlen = strlen(extdata); } dptr=malloc(dlen + (extdata ? extlen + 1 : 0)); if(!dptr) { show_query_error(db, "Failed to encode query parameter"); return WG_ILLEGAL; } offset = ptrtooffset(db, dptr); /* Copy the data, fill the remainder with zeroes */ memcpy((char *) dptr + (LONGSTR_HEADER_GINTS*sizeof(gint)), data, length); for(i=0; lenrest && ircursor.page = NULL; /* initialize as empty */ set->rcursor.pidx = 0; set->wcursor.page = NULL; set->wcursor.pidx = QUERY_RESULTSET_PAGESIZE; /* new page needed */ set->first_page = NULL; set->res_count = 0; set->mpool = wg_create_mpool(db, sizeof(query_result_page)); if(!set->mpool) { show_query_error(db, "Failed to allocate result memory pool"); free(set); return NULL; } return set; } /* * Free the resultset and it's memory pool */ static void free_resultset(void *db, query_result_set *set) { if(set->mpool) wg_free_mpool(db, set->mpool); free(set); } /* * Set the resultset pointers to the beginning of the * first results page. 
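 * After a rewind, fetch_resultset() restarts from the first stored offset;
 * if the set is empty (first_page is NULL) the next fetch simply returns 0.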
*/ static void rewind_resultset(void *db, query_result_set *set) { set->rcursor.page = set->first_page; set->rcursor.pidx = 0; } /* * Append an offset to the result set. * returns 0 on success. * returns -1 on error. */ static gint append_resultset(void *db, query_result_set *set, gint offset) { if(set->wcursor.pidx >= QUERY_RESULTSET_PAGESIZE) { query_result_page *newpage = (query_result_page *) \ wg_alloc_mpool(db, set->mpool, sizeof(query_result_page)); if(!newpage) { return show_query_error(db, "Failed to allocate a resultset page"); } memset(newpage->rows, 0, sizeof(gint) * QUERY_RESULTSET_PAGESIZE); newpage->next = NULL; if(set->wcursor.page) { set->wcursor.page->next = newpage; } else { /* first_page==NULL implied */ set->first_page = newpage; set->rcursor.page = newpage; } set->wcursor.page = newpage; set->wcursor.pidx = 0; } set->wcursor.page->rows[set->wcursor.pidx++] = offset; set->res_count++; return 0; } /* * Fetch the next offset from the result set. * returns 0 if the set is exhausted. */ static gint fetch_resultset(void *db, query_result_set *set) { if(set->rcursor.page) { gint offset = set->rcursor.page->rows[set->rcursor.pidx++]; if(!offset) { /* page not filled completely. Mark set as exhausted. */ set->rcursor.page = NULL; } else { if(set->rcursor.pidx >= QUERY_RESULTSET_PAGESIZE) { set->rcursor.page = set->rcursor.page->next; set->rcursor.pidx = 0; } } return offset; } return 0; } /* * Create an intersection of two result sets. * Returns a new result set (can be empty). * Returns NULL on error. */ static query_result_set *intersect_resultset(void *db, query_result_set *seta, query_result_set *setb) { gint offseta; query_result_set *intersection; if(!(intersection = create_resultset(db))) { return NULL; } rewind_resultset(db, seta); while((offseta = fetch_resultset(db, seta))) { gint offsetb; rewind_resultset(db, setb); while((offsetb = fetch_resultset(db, setb))) { if(offseta == offsetb) { gint err = append_resultset(db, intersection, offseta); if(err) { free_resultset(db, intersection); return NULL; } break; } } } return intersection; } /* * Create a result set that contains only unique rows. * Returns a new result set (can be empty). * Returns NULL on error. */ static query_result_set *unique_resultset(void *db, query_result_set *set) { gint offset; query_result_set *unique; if(!(unique = create_resultset(db))) { return NULL; } rewind_resultset(db, set); while((offset = fetch_resultset(db, set))) { gint offsetu, found = 0; rewind_resultset(db, unique); while((offsetu = fetch_resultset(db, unique))) { if(offset == offsetu) { found = 1; break; } } if(!found) { /* We're now at the end of the set and may append normally. */ gint err = append_resultset(db, unique, offset); if(err) { free_resultset(db, unique); return NULL; } } } return unique; } /* ------------------- (JSON) document query -------------------*/ #define ADD_DOC_TO_RESULTSET(db, ns, cr, doc, err) \ if(doc) { \ err = append_resultset(db, ns, ptrtooffset(db, doc)); \ } else { \ err = show_query_error(db, "Failed to retrieve the document"); \ } \ if(err) { \ free_resultset(db, ns); \ if(cr) \ free_resultset(db, cr); \ return NULL; \ } /* * Find a list of documents that contain the key-value pairs. * Returns a prefetch query object. * Returns NULL on error. 
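 * A minimal usage sketch (the key and value strings are illustrative and
 * db is assumed to be an attached database holding JSON documents):
 *   wg_json_query_arg arg;
 *   arg.key = wg_encode_query_param_str(db, "name", NULL);
 *   arg.value = wg_encode_query_param_str(db, "Fred", NULL);
 *   q = wg_make_json_query(db, &arg, 1);
 *   while(q && (doc = wg_fetch(db, q))) { ... use the document record ... }
 *   if(q) wg_free_query(db, q);
 *   wg_free_query_param(db, arg.key);
 *   wg_free_query_param(db, arg.value);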
*/ wg_query *wg_make_json_query(void *db, wg_json_query_arg *arglist, gint argc) { wg_query *query = NULL; query_result_set *curr_res = NULL; gint index_id = -1; gint icols[2], i; #ifdef CHECK if(!arglist || argc < 1) { show_query_error(db, "Not enough parameters"); return NULL; } if (!dbcheck(db)) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr, "Invalid database pointer in wg_make_json_query.\n"); #endif return NULL; } #endif /* Get index */ icols[0] = WG_SCHEMA_KEY_OFFSET; icols[1] = WG_SCHEMA_VALUE_OFFSET; index_id = wg_multi_column_to_index_id(db, icols, 2, WG_INDEX_TYPE_HASH_JSON, NULL, 0); /* Iterate over the argument pairs. * XXX: it is possible that getting the first set from index and * doing a scan to check the remaining arguments is faster than * doing the intersect operation of sets retrieved from index. * XXX: given that we don't index complex structures, reorder * arguments so that immediate values come first. */ for(i=0; i 0 &&\ wg_get_encoded_type(db, arglist[i].value) != WG_RECORDTYPE) { /* Fetch the matching rows from the index, then retrieve the * documents they belong to. */ gint values[2]; gint reclist_offset; values[0] = arglist[i].key; values[1] = arglist[i].value; reclist_offset = wg_search_hash(db, index_id, values, 2); if(reclist_offset > 0) { gint *nextoffset = &reclist_offset; while(*nextoffset) { gcell *rec_cell = (gcell *) offsettoptr(db, *nextoffset); gint err = -1; void *document = \ wg_find_document(db, offsettoptr(db, rec_cell->car)); ADD_DOC_TO_RESULTSET(db, next_set, curr_res, document, err) nextoffset = &(rec_cell->cdr); } } } else { /* No index, do a scan. This also happens if the value * is a complex structure. * XXX: if i>0 scan curr_res instead! (duh) */ gint *rec = wg_get_first_record(db); while(rec) { gint reclen = wg_get_record_len(db, rec); if(reclen > WG_SCHEMA_VALUE_OFFSET) { /* XXX: assume key * before value */ #ifndef JSON_SCAN_UNWRAP_ARRAY if(WG_COMPARE(db, wg_get_field(db, rec, WG_SCHEMA_KEY_OFFSET), arglist[i].key) == WG_EQUAL &&\ WG_COMPARE(db, wg_get_field(db, rec, WG_SCHEMA_VALUE_OFFSET), arglist[i].value) == WG_EQUAL) { gint err = -1; void *document = wg_find_document(db, rec); ADD_DOC_TO_RESULTSET(db, next_set, curr_res, document, err) } #else if(WG_COMPARE(db, wg_get_field(db, rec, WG_SCHEMA_KEY_OFFSET), arglist[i].key) == WG_EQUAL) { gint k = wg_get_field(db, rec, WG_SCHEMA_VALUE_OFFSET); if(WG_COMPARE(db, k, arglist[i].value) == WG_EQUAL) { /* Direct match. */ gint err = -1; void *document = wg_find_document(db, rec); ADD_DOC_TO_RESULTSET(db, next_set, curr_res, document, err) } else if(wg_get_encoded_type(db, k) == WG_RECORDTYPE) { /* No direct match, but if it is a record AND an array, * scan the array contents. */ void *arec = wg_decode_record(db, k); if(is_schema_array(arec)) { gint areclen = wg_get_record_len(db, arec); int j; for(j=0; jres_count < next_set->res_count) { /* minor optimization */ tmp_set = intersect_resultset(db, curr_res, next_set); } else { tmp_set = intersect_resultset(db, next_set, curr_res); } free_resultset(db, curr_res); free_resultset(db, next_set); if(!tmp_set) { return NULL; } else { curr_res = tmp_set; } } else { /* This set becomes the working resultset */ curr_res = next_set; } } /* Initialize query object */ query = (wg_query *) malloc(sizeof(wg_query)); if(!query) { free_resultset(db, curr_res); show_query_error(db, "Failed to allocate memory"); return NULL; } query->qtype = WG_QTYPE_PREFETCH; query->arglist = NULL; query->argc = 0; query->column = -1; /* Copy the result. 
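 * The page chain and the memory pool built up in curr_res are handed over
 * to the query object as-is; only the wrapper struct is freed below.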
*/ query->curr_page = curr_res->first_page; query->curr_pidx = 0; query->res_count = curr_res->res_count; query->mpool = curr_res->mpool; free(curr_res); /* contents were inherited, dispose of the struct */ return query; } /* ------------------ simple query functions -------------------*/ void *wg_find_record(void *db, gint fieldnr, gint cond, gint data, void* lastrecord) { gint index_id = -1; /* find index on colum */ if(cond != WG_COND_NOT_EQUAL) { index_id = wg_multi_column_to_index_id(db, &fieldnr, 1, WG_INDEX_TYPE_TTREE, NULL, 0); } if(index_id > 0) { int start_inclusive = 1, end_inclusive = 1; /* WG_ILLEGAL is interpreted as "no bound" */ gint start_bound = WG_ILLEGAL; gint end_bound = WG_ILLEGAL; gint curr_offset = 0, curr_slot = -1, end_offset = 0, end_slot = -1; void *prev = NULL; switch(cond) { case WG_COND_EQUAL: start_bound = end_bound = data; break; case WG_COND_LESSTHAN: end_bound = data; end_inclusive = 0; break; case WG_COND_GREATER: start_bound = data; start_inclusive = 0; break; case WG_COND_LTEQUAL: end_bound = data; break; case WG_COND_GTEQUAL: start_bound = data; break; default: show_query_error(db, "Invalid condition (ignoring)"); return NULL; } if(find_ttree_bounds(db, index_id, fieldnr, start_bound, end_bound, start_inclusive, end_inclusive, &curr_offset, &curr_slot, &end_offset, &end_slot)) { return NULL; } /* We have the bounds, scan to lastrecord */ while(curr_offset) { struct wg_tnode *node = (struct wg_tnode *) offsettoptr(db, curr_offset); void *rec = offsettoptr(db, node->array_of_values[curr_slot]); if(prev == lastrecord) { /* if lastrecord is NULL, first match returned */ return rec; } prev = rec; if(curr_offset==end_offset && curr_slot==end_slot) { /* Last slot reached */ break; } else { /* Some rows still left */ curr_slot += 1; /* direction implied as 1 */ if(curr_slot >= node->number_of_elements) { #ifdef CHECK if(end_offset==curr_offset) { /* This should not happen */ show_query_error(db, "Warning: end slot mismatch, possible bug"); break; } else { #endif curr_offset = TNODE_SUCCESSOR(db, node); curr_slot = 0; #ifdef CHECK } #endif } } } } else { /* no index (or cond == WG_COND_NOT_EQUAL), do a scan */ wg_query_arg arg; void *rec; if(lastrecord) { rec = wg_get_next_record(db, lastrecord); } else { rec = wg_get_first_record(db); } arg.column = fieldnr; arg.cond = cond; arg.value = data; while(rec) { if(check_arglist(db, rec, &arg, 1)) { return rec; } rec = wg_get_next_record(db, rec); } } /* No records found (this can also happen if matching records were * found but lastrecord does not match any of them or matches the * very last one). 
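 * To iterate over all matches, a caller passes the previously returned
 * record back in as lastrecord until NULL comes back, e.g. (a sketch; the
 * field number and value are illustrative):
 *   rec = wg_find_record_int(db, 0, WG_COND_EQUAL, 42, NULL);
 *   while(rec) {
 *     ... use rec ...
 *     rec = wg_find_record_int(db, 0, WG_COND_EQUAL, 42, rec);
 *   }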
*/ return NULL; } /* * Wrapper function for wg_find_record with unencoded data (null) */ void *wg_find_record_null(void *db, gint fieldnr, gint cond, char *data, void* lastrecord) { gint enc = wg_encode_query_param_null(db, data); void *rec = wg_find_record(db, fieldnr, cond, enc, lastrecord); return rec; } /* * Wrapper function for wg_find_record with unencoded data (record) */ void *wg_find_record_record(void *db, gint fieldnr, gint cond, void *data, void* lastrecord) { gint enc = wg_encode_query_param_record(db, data); void *rec = wg_find_record(db, fieldnr, cond, enc, lastrecord); return rec; } /* * Wrapper function for wg_find_record with unencoded data (char) */ void *wg_find_record_char(void *db, gint fieldnr, gint cond, char data, void* lastrecord) { gint enc = wg_encode_query_param_char(db, data); void *rec = wg_find_record(db, fieldnr, cond, enc, lastrecord); return rec; } /* * Wrapper function for wg_find_record with unencoded data (fixpoint) */ void *wg_find_record_fixpoint(void *db, gint fieldnr, gint cond, double data, void* lastrecord) { gint enc = wg_encode_query_param_fixpoint(db, data); void *rec = wg_find_record(db, fieldnr, cond, enc, lastrecord); return rec; } /* * Wrapper function for wg_find_record with unencoded data (date) */ void *wg_find_record_date(void *db, gint fieldnr, gint cond, int data, void* lastrecord) { gint enc = wg_encode_query_param_date(db, data); void *rec = wg_find_record(db, fieldnr, cond, enc, lastrecord); return rec; } /* * Wrapper function for wg_find_record with unencoded data (time) */ void *wg_find_record_time(void *db, gint fieldnr, gint cond, int data, void* lastrecord) { gint enc = wg_encode_query_param_time(db, data); void *rec = wg_find_record(db, fieldnr, cond, enc, lastrecord); return rec; } /* * Wrapper function for wg_find_record with unencoded data (var) */ void *wg_find_record_var(void *db, gint fieldnr, gint cond, gint data, void* lastrecord) { gint enc = wg_encode_query_param_var(db, data); void *rec = wg_find_record(db, fieldnr, cond, enc, lastrecord); return rec; } /* * Wrapper function for wg_find_record with unencoded data (int) */ void *wg_find_record_int(void *db, gint fieldnr, gint cond, int data, void* lastrecord) { gint enc = wg_encode_query_param_int(db, data); void *rec = wg_find_record(db, fieldnr, cond, enc, lastrecord); wg_free_query_param(db, enc); return rec; } /* * Wrapper function for wg_find_record with unencoded data (double) */ void *wg_find_record_double(void *db, gint fieldnr, gint cond, double data, void* lastrecord) { gint enc = wg_encode_query_param_double(db, data); void *rec = wg_find_record(db, fieldnr, cond, enc, lastrecord); wg_free_query_param(db, enc); return rec; } /* * Wrapper function for wg_find_record with unencoded data (string) */ void *wg_find_record_str(void *db, gint fieldnr, gint cond, char *data, void* lastrecord) { gint enc = wg_encode_query_param_str(db, data, NULL); void *rec = wg_find_record(db, fieldnr, cond, enc, lastrecord); wg_free_query_param(db, enc); return rec; } /* * Wrapper function for wg_find_record with unencoded data (xmlliteral) */ void *wg_find_record_xmlliteral(void *db, gint fieldnr, gint cond, char *data, char *xsdtype, void* lastrecord) { gint enc = wg_encode_query_param_xmlliteral(db, data, xsdtype); void *rec = wg_find_record(db, fieldnr, cond, enc, lastrecord); wg_free_query_param(db, enc); return rec; } /* * Wrapper function for wg_find_record with unencoded data (uri) */ void *wg_find_record_uri(void *db, gint fieldnr, gint cond, char *data, char *prefix, 
void* lastrecord) { gint enc = wg_encode_query_param_uri(db, data, prefix); void *rec = wg_find_record(db, fieldnr, cond, enc, lastrecord); wg_free_query_param(db, enc); return rec; } /* --------------- error handling ------------------------------*/ /** called with err msg * * may print or log an error * does not do any jumps etc */ static gint show_query_error(void* db, char* errmsg) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"query error: %s\n",errmsg); #endif return -1; } #if 0 /** called with err msg and additional int data * * may print or log an error * does not do any jumps etc */ static gint show_query_error_nr(void* db, char* errmsg, gint nr) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"query error: %s %d\n",errmsg,nr); #endif return -1; } #endif #ifdef __cplusplus } #endif whitedb-0.7.2/Db/dbquery.h000066400000000000000000000122741226454622500153350ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Priit Järv 2010,2011,2013 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbcompare.h * Public headers for WhiteDB query engine. */ #ifndef DEFINED_DBQUERY_H #define DEFINED_DBQUERY_H #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "dbdata.h" #include "dbindex.h" /* ==== Public macros ==== */ #define WG_COND_EQUAL 0x0001 /** = */ #define WG_COND_NOT_EQUAL 0x0002 /** != */ #define WG_COND_LESSTHAN 0x0004 /** < */ #define WG_COND_GREATER 0x0008 /** > */ #define WG_COND_LTEQUAL 0x0010 /** <= */ #define WG_COND_GTEQUAL 0x0020 /** >= */ #define WG_QTYPE_TTREE 0x01 #define WG_QTYPE_HASH 0x02 #define WG_QTYPE_SCAN 0x04 #define WG_QTYPE_PREFETCH 0x80 /* ====== data structures ======== */ /** Query argument list object */ typedef struct { gint column; /** column (field) number this argument applies to */ gint cond; /** condition (equal, less than, etc) */ gint value; /** encoded value */ } wg_query_arg; typedef struct { gint key; /** encoded key */ gint value; /** encoded value */ } wg_json_query_arg; /** Query object */ typedef struct { gint qtype; /** Query type (T-tree, hash, full scan, prefetch) */ /* Argument list based query is the only one supported at the moment. 
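 * (i.e. queries are built from an optional match record plus an array of
 * wg_query_arg conditions; there is no separate query language at this
 * level)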
*/ wg_query_arg *arglist; /** check each row in result set against these */ gint argc; /** number of elements in arglist */ gint column; /** index on this column used */ /* Fields for T-tree query (XXX: some may be re-usable for * other types as well) */ gint curr_offset; gint end_offset; gint curr_slot; gint end_slot; gint direction; /* Fields for full scan */ gint curr_record; /** offset of the current record */ /* Fields for prefetch */ void *mpool; /** storage for row offsets */ void *curr_page; /** current page of results */ gint curr_pidx; /** current index on page */ wg_uint res_count; /** number of rows in results */ } wg_query; /* ==== Protos ==== */ wg_query *wg_make_query(void *db, void *matchrec, gint reclen, wg_query_arg *arglist, gint argc); #define wg_make_prefetch_query wg_make_query wg_query *wg_make_query_rc(void *db, void *matchrec, gint reclen, wg_query_arg *arglist, gint argc, wg_uint rowlimit); wg_query *wg_make_json_query(void *db, wg_json_query_arg *arglist, gint argc); void *wg_fetch(void *db, wg_query *query); void wg_free_query(void *db, wg_query *query); gint wg_encode_query_param_null(void *db, char *data); gint wg_encode_query_param_record(void *db, void *data); gint wg_encode_query_param_char(void *db, char data); gint wg_encode_query_param_fixpoint(void *db, double data); gint wg_encode_query_param_date(void *db, int data); gint wg_encode_query_param_time(void *db, int data); gint wg_encode_query_param_var(void *db, gint data); gint wg_encode_query_param_int(void *db, gint data); gint wg_encode_query_param_double(void *db, double data); gint wg_encode_query_param_str(void *db, char *data, char *lang); gint wg_encode_query_param_xmlliteral(void *db, char *data, char *xsdtype); gint wg_encode_query_param_uri(void *db, char *data, char *prefix); gint wg_free_query_param(void* db, gint data); void *wg_find_record(void *db, gint fieldnr, gint cond, gint data, void* lastrecord); void *wg_find_record_null(void *db, gint fieldnr, gint cond, char *data, void* lastrecord); void *wg_find_record_record(void *db, gint fieldnr, gint cond, void *data, void* lastrecord); void *wg_find_record_char(void *db, gint fieldnr, gint cond, char data, void* lastrecord); void *wg_find_record_fixpoint(void *db, gint fieldnr, gint cond, double data, void* lastrecord); void *wg_find_record_date(void *db, gint fieldnr, gint cond, int data, void* lastrecord); void *wg_find_record_time(void *db, gint fieldnr, gint cond, int data, void* lastrecord); void *wg_find_record_var(void *db, gint fieldnr, gint cond, gint data, void* lastrecord); void *wg_find_record_int(void *db, gint fieldnr, gint cond, int data, void* lastrecord); void *wg_find_record_double(void *db, gint fieldnr, gint cond, double data, void* lastrecord); void *wg_find_record_str(void *db, gint fieldnr, gint cond, char *data, void* lastrecord); void *wg_find_record_xmlliteral(void *db, gint fieldnr, gint cond, char *data, char *xsdtype, void* lastrecord); void *wg_find_record_uri(void *db, gint fieldnr, gint cond, char *data, char *prefix, void* lastrecord); #endif /* DEFINED_DBQUERY_H */ whitedb-0.7.2/Db/dbschema.c000066400000000000000000000150101226454622500154120ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Priit Järv 2013 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. 
* * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbschema.c * WhiteDB (semi-)structured data representation */ /* ====== Includes =============== */ #include /* ====== Private headers and defs ======== */ #ifdef __cplusplus extern "C" { #endif #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "dbdata.h" #include "dbcompare.h" #include "dbindex.h" #include "dbschema.h" /* ======== Data ========================= */ /* ======= Private protos ================ */ #ifdef USE_BACKLINKING static void *find_document_recursive(void *db, gint *rec, int depth); #endif static gint delete_record_recursive(void *db, void *rec, int depth); static gint show_schema_error(void *db, char *errmsg); /* ====== Functions ============== */ /* * Create a data triple (subj, prop, ob) * May also be called to create key-value pairs with (NULL, key, value) * if isparam is non-0, the data is not indexed. * returns the new record * returns NULL on error. */ void *wg_create_triple(void *db, gint subj, gint prop, gint ob, gint isparam) { void *rec = wg_create_raw_record(db, WG_SCHEMA_TRIPLE_SIZE); gint *meta; if(rec) { meta = ((gint *) rec + RECORD_META_POS); if(isparam) { *meta |= (RECORD_META_NOTDATA|RECORD_META_MATCH); } else if(wg_index_add_rec(db, rec) < -1) { return NULL; /* index error */ } if(wg_set_field(db, rec, WG_SCHEMA_TRIPLE_OFFSET, subj)) return NULL; if(wg_set_field(db, rec, WG_SCHEMA_TRIPLE_OFFSET + 1, prop)) return NULL; if(wg_set_field(db, rec, WG_SCHEMA_TRIPLE_OFFSET + 2, ob)) return NULL; } return rec; } /* * Create an empty (JSON) array of given size. * if isparam is non-0, the data is not indexed (incl. when updating later) * if isdocument is non-0, the record represents a top-level document * returns the new record * returns NULL on error. */ void *wg_create_array(void *db, gint size, gint isdocument, gint isparam) { void *rec = wg_create_raw_record(db, size); gint *meta; if(rec) { meta = ((gint *) rec + RECORD_META_POS); *meta |= RECORD_META_ARRAY; if(isdocument) *meta |= RECORD_META_DOC; if(isparam) { *meta |= (RECORD_META_NOTDATA|RECORD_META_MATCH); } else if(wg_index_add_rec(db, rec) < -1) { return NULL; /* index error */ } } return rec; } /* * Create an empty (JSON) object of given size. * if isparam is non-0, the data is not indexed (incl. when updating later) * if isdocument is non-0, the record represents a top-level document * returns the new record * returns NULL on error. */ void *wg_create_object(void *db, gint size, gint isdocument, gint isparam) { void *rec = wg_create_raw_record(db, size); gint *meta; if(rec) { meta = ((gint *) rec + RECORD_META_POS); *meta |= RECORD_META_OBJECT; if(isdocument) *meta |= RECORD_META_DOC; if(isparam) { *meta |= (RECORD_META_NOTDATA|RECORD_META_MATCH); } else if(wg_index_add_rec(db, rec) < -1) { return NULL; /* index error */ } } return rec; } /* * Find a top-level document that the record belongs to. * returns the document pointer on success * returns NULL if the document was not found. 
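 * Note that locating the enclosing document requires backlinking support
 * (USE_BACKLINKING); without it an error is reported and NULL is returned.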
*/ void *wg_find_document(void *db, void *rec) { #ifndef USE_BACKLINKING show_schema_error(db, "Backlinks are required to find complete documents"); return NULL; #else return find_document_recursive(db, (gint *) rec, WG_COMPARE_REC_DEPTH-1); #endif } #ifdef USE_BACKLINKING /* * Find a document recursively. * iterates through the backlink chain and checks each parent recursively. * Returns the pointer to the (first) found document. * Returns NULL if nothing found. * XXX: if a document links to the contents of another document, it * can "hijack" it in the search results this way. The priority * depends on the position(s) in the backlink chain, as this is a depth-first * search. */ static void *find_document_recursive(void *db, gint *rec, int depth) { if(is_schema_document(rec)) return rec; if(depth > 0) { gint backlink_list = *(rec + RECORD_BACKLINKS_POS); if(backlink_list) { gcell *next = (gcell *) offsettoptr(db, backlink_list); for(;;) { void *res = find_document_recursive(db, (gint *) offsettoptr(db, next->car), depth-1); if(res) return res; /* Something was found recursively */ if(!next->cdr) break; next = (gcell *) offsettoptr(db, next->cdr); } } } return NULL; /* Depth exhausted or nothing found. */ } #endif /* * Delete a top-level document * returns 0 on success * returns -1 on error */ gint wg_delete_document(void *db, void *document) { #ifdef CHECK if(!is_schema_document(document)) { return show_schema_error(db, "wg_delete_document: not a document"); } #endif #ifndef USE_BACKLINKING return delete_record_recursive(db, document, 99); #else return delete_record_recursive(db, document, WG_COMPARE_REC_DEPTH); #endif } /* * Delete a record and all the records it points to. * This is safe to call on JSON documents. */ static gint delete_record_recursive(void *db, void *rec, int depth) { gint i, reclen; if(depth <= 0) { return show_schema_error(db, "deleting record: recursion too deep"); } reclen = wg_get_record_len(db, rec); for(i=0; i. * */ /** @file dbschema.h * Public headers for the strucured data functions. */ #ifndef DEFINED_DBSCHEMA_H #define DEFINED_DBSCHEMA_H /* ==== Public macros ==== */ #define WG_SCHEMA_TRIPLE_SIZE 3 #define WG_SCHEMA_TRIPLE_OFFSET 0 #define WG_SCHEMA_KEY_OFFSET (WG_SCHEMA_TRIPLE_OFFSET + 1) #define WG_SCHEMA_VALUE_OFFSET (WG_SCHEMA_TRIPLE_OFFSET + 2) /* ====== data structures ======== */ /* ==== Protos ==== */ void *wg_create_triple(void *db, gint subj, gint prop, gint ob, gint isparam); #define wg_create_kvpair(db, key, val, ip) \ wg_create_triple(db, 0, key, val, ip) void *wg_create_array(void *db, gint size, gint isdocument, gint isparam); void *wg_create_object(void *db, gint size, gint isdocument, gint isparam); void *wg_find_document(void *db, void *rec); gint wg_delete_document(void *db, void *document); #endif /* DEFINED_DBSCHEMA_H */ whitedb-0.7.2/Db/dbtest.c000066400000000000000000004421021226454622500151370ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009 * Copyright (c) Priit Järv 2010, 2011, 2012, 2013 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. 
* * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbtest.c * Database testing, checking and report procedures * */ /* ====== Includes =============== */ #include #include #include #include #include #include #include #ifndef _WIN32 #include #else #include #include #include #include #endif #ifdef __cplusplus extern "C" { #endif #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "dballoc.h" #include "dbdata.h" #include "dbhash.h" #include "dbtest.h" #include "dbindex.h" #include "dbmem.h" #include "dbutil.h" #include "dbquery.h" #include "dbcompare.h" #include "dblog.h" #include "dbschema.h" #include "dbjson.h" /* ====== Private headers and defs ======== */ #ifdef _WIN32 #define snprintf sprintf_s #endif /* ======= Private protos ================ */ static int do_check_parse_encode(void *db, gint enc, gint exptype, void *expval, int printlevel); static gint check_varlen_area(void* db, void* area_header); static gint check_varlen_area_freelist(void* db, void* area_header); static gint check_bucket_freeobjects(void* db, void* area_header, gint bucketindex); static gint check_varlen_area_markers(void* db, void* area_header); static gint check_varlen_area_dv(void* db, void* area_header); static gint check_object_in_areabounds(void*db,void* area_header,gint offset,gint size); static gint check_varlen_area_scan(void* db, void* area_header); static gint check_varlen_object_infreelist(void* db, void* area_header, gint offset, gint isfree); static int guarded_strlen(char* str); static int guarded_strcmp(char* a, char* b); static int bufguarded_strcmp(char* a, char* b); static int validate_index(void *db, void *rec, int rows, int column, int printlevel); #ifdef USE_CHILD_DB static int childdb_mkindex(void *db, int cnt); static int childdb_ckindex(void *db, int cnt, int printlevel); static int childdb_dropindex(void *db, int cnt); #endif static gint longstr_in_hash(void* db, char* data, char* extrastr, gint type, gint length); static int is_offset_in_list(void *db, gint reclist_offset, gint offset); static int check_matching_rows(void *db, int col, int cond, void *val, gint type, int expected, int printlevel); static int check_db_rows(void *db, int expected, int printlevel); static int check_sanity(void *db); /* ====== Functions ============== */ /** Run database tests. * Allows each test to be run in separate locally allocated databases, * if necessary. * * returns 0 if no errors. * otherwise returns error code. 
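 * Test groups are selected with bitwise-or'ed flags and printlevel controls
 * verbosity (0 silent, 1 errors only, 2 verbose), e.g. (a sketch):
 *   err = wg_run_tests(WG_TEST_COMMON | WG_TEST_INDEX, 2);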
*/ int wg_run_tests(int tests, int printlevel) { int tmp = 0; void *db = NULL; if(tests & WG_TEST_COMMON) { db = wg_attach_local_database(800000); wg_show_db_memsegment_header(db); tmp=check_sanity(db); if (tmp==0) tmp=wg_check_db(db); if (tmp==0) tmp=wg_check_datatype_writeread(db,printlevel); if (tmp==0) tmp=wg_check_parse_encode(db,printlevel); if (tmp==0) tmp=wg_check_backlinking(db,printlevel); if (tmp==0) tmp=wg_check_compare(db,printlevel); if (tmp==0) tmp=wg_check_query_param(db,printlevel); if (tmp==0) tmp=wg_check_db(db); if (tmp==0) tmp=wg_check_strhash(db,printlevel); if (tmp==0) tmp=wg_test_index2(db,printlevel); if (tmp==0) tmp=wg_check_childdb(db,printlevel); wg_delete_local_database(db); if (tmp==0) { /* separate database for the schema */ db = wg_attach_local_database(800000); tmp=wg_check_schema(db,printlevel); /* run this first */ if (tmp==0) tmp=wg_check_json_parsing(db,printlevel); if (tmp==0) tmp=wg_check_idxhash(db,printlevel); wg_delete_local_database(db); } if (tmp==0) { printf("\n***** Quick tests passed ******\n"); } else { printf("\n***** Quick test failed ******\n"); return tmp; } } if(tests & WG_TEST_INDEX) { db = wg_attach_local_database(20000000); tmp = wg_test_index1(db, 50, printlevel); wg_delete_local_database(db); if (tmp) { printf("\n***** Index test failed ******\n"); return tmp; } else { printf("\n***** Index test succeeded ******\n"); } } if(tests & WG_TEST_QUERY) { db = wg_attach_local_database(120000000); tmp = wg_test_query(db, 4, printlevel); wg_delete_local_database(db); if (tmp) { printf("\n***** Query test failed ******\n"); return tmp; } else { printf("\n***** Query test succeeded ******\n"); } } if(tests & WG_TEST_LOG) { db = wg_attach_local_database(800000); tmp = wg_check_log(db, printlevel); wg_delete_local_database(db); if (tmp) { printf("\n***** Log test failed ******\n"); return tmp; } else { printf("\n***** Log test succeeded ******\n"); } } /* Add other tests here */ return tmp; } /* ---------------- overviews, statistics ---------------------- */ /** print an overview of full memsegment memory usage and addresses * * */ void wg_show_db_memsegment_header(void* db) { db_memsegment_header* dbh = dbmemsegh(db); printf("\nShowing db segment information\n"); printf("==============================\n"); printf("mark %d\n", (int) dbh->mark); #ifdef _WIN32 printf("size %Id\n", dbh->size); printf("free %Id\n", dbh->free); #else printf("size %td\n", dbh->size); printf("free %td\n", dbh->free); #endif printf("initialadr %p\n", (void *) dbh->initialadr); printf("key %d\n", (int) dbh->key); printf("segment header size %d\n", (int) sizeof(db_memsegment_header)); printf("subarea array size %d\n",SUBAREA_ARRAY_SIZE); printf("\ndatarec_area\n"); printf("-------------\n"); wg_show_db_area_header(db,&(dbh->datarec_area_header)); printf("\nlongstr_area\n"); printf("-------------\n"); wg_show_db_area_header(db,&(dbh->longstr_area_header)); printf("\nlistcell_area\n"); printf("-------------\n"); wg_show_db_area_header(db,&(dbh->listcell_area_header)); printf("\nshortstr_area\n"); printf("-------------\n"); wg_show_db_area_header(db,&(dbh->shortstr_area_header)); printf("\nword_area\n"); printf("-------------\n"); wg_show_db_area_header(db,&(dbh->word_area_header)); printf("\ndoubleword_area\n"); printf("-------------\n"); wg_show_db_area_header(db,&(dbh->doubleword_area_header)); printf("\ntnode_area\n"); printf("-------------\n"); wg_show_db_area_header(db,&(dbh->tnode_area_header)); } /** print an overview of a single area memory usage and addresses * * 
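 * (called by wg_show_db_memsegment_header for each area; it can also be
 * invoked directly with a pointer to one of the area headers inside
 * db_memsegment_header, e.g. &(dbh->datarec_area_header))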
*/ void wg_show_db_area_header(void* db, void* area_header) { db_area_header* areah; gint i; areah=(db_area_header*)area_header; if (areah->fixedlength) { printf("fixedlength with objlength %d bytes\n", (int) areah->objlength); printf("freelist %d\n", (int) areah->freelist); printf("freelist len %d\n", (int) wg_count_freelist(db,areah->freelist)); } else { printf("varlength\n"); } printf("last_subarea_index %d\n", (int) areah->last_subarea_index); for (i=0;i<=(areah->last_subarea_index);i++) { printf("subarea nr %d \n", (int) i); printf(" size %d\n", (int) ((areah->subarea_array)[i]).size); printf(" offset %d\n", (int) ((areah->subarea_array)[i]).offset); printf(" alignedsize %d\n", (int) ((areah->subarea_array)[i]).alignedsize); printf(" alignedoffset %d\n", (int) ((areah->subarea_array)[i]).alignedoffset); } for (i=0;ifreebuckets)[i]!=0) { printf("bucket nr %d \n", (int) i); if (ifreebuckets)[i])); wg_show_bucket_freeobjects(db,(areah->freebuckets)[i]); } else { printf(" is varbucket at offset %d \n", (int) dbaddr(db,&(areah->freebuckets)[i])); wg_show_bucket_freeobjects(db,(areah->freebuckets)[i]); } } } if ((areah->freebuckets)[DVBUCKET]!=0) { printf("bucket nr %d at offset %d \n contains dv at offset %d with size %d(%d) and end %d \n", DVBUCKET, (int) dbaddr(db,&(areah->freebuckets)[DVBUCKET]), (int) (areah->freebuckets)[DVBUCKET], (int) ((areah->freebuckets)[DVSIZEBUCKET]>0 ? dbfetch(db,(areah->freebuckets)[DVBUCKET]) : -1), (int) (areah->freebuckets)[DVSIZEBUCKET], (int) ((areah->freebuckets)[DVBUCKET]+(areah->freebuckets)[DVSIZEBUCKET])); } } /** show a list of free objects in a bucket * */ void wg_show_bucket_freeobjects(void* db, gint freelist) { gint size; gint freebits; gint nextptr; gint prevptr; while(freelist!=0) { size=getfreeobjectsize(dbfetch(db,freelist)); freebits=dbfetch(db,freelist) & 3; nextptr=dbfetch(db,freelist+sizeof(gint)); prevptr=dbfetch(db,freelist+2*sizeof(gint)); printf(" object offset %d end %d freebits %d size %d nextptr %d prevptr %d \n", (int) freelist, (int) (freelist+size), (int) freebits, (int) size, (int) nextptr, (int) prevptr); freelist=nextptr; } } /** count elements in a freelist * */ gint wg_count_freelist(void* db, gint freelist) { gint i; //printf("freelist %d dbfetch(db,freelist) %d\n",freelist,dbfetch(db,freelist)); for(i=0;freelist; freelist=dbfetch(db,freelist)) { //printf("i %d freelist %u\n",i,(uint)freelist); i++; } return i; } /* --------------- datatype conversion/writing/reading testing ------------------------------*/ /** printlevel: 0 no print, 1 err print, 2 full print */ gint wg_check_datatype_writeread(void* db, int printlevel) { int p; int i; int j; int k,r,m; int tries; // encoded and decoded data gint enc; gint* rec; char* nulldec; int intdec; char chardec; double doubledec; char* strdec; int len; int tmplen; int tmp; int decbuflen=1000; char decbuf[1000]; char encbuf[1000]; // amount of tested data int nulldata_nr=1; int chardata_nr=2; int vardata_nr=2; int intdata_nr=4; int doubledata_nr=4; int fixpointdata_nr=5; int datedata_nr=4; int timedata_nr=4; int datevecdata_nr=4; int datevecbad_nr=2; int timevecdata_nr=4; int timevecbad_nr=4; int strdata_nr=5; int xmlliteraldata_nr=2; int uridata_nr=2; int blobdata_nr=3; int recdata_nr=10; // tested data buffers char* nulldata[10]; char chardata[10]; int vardata[10]; int intdata[10]; double doubledata[10]; double fixpointdata[10]; int timedata[10]; int datedata[10]; char* strdata[10]; char* strextradata[10]; char* xmlliteraldata[10]; char* xmlliteralextradata[10]; char* 
uridata[10]; char* uriextradata[10]; char* blobdata[10]; char* blobextradata[10]; int bloblendata[10]; int recdata[10]; int tmpvec[4]; int datevecdata[][3] = { {1, 1, 1}, {2010, 1, 1}, {2010, 4, 30}, {5997, 1, 6} }; int datevecbad[][3] = { {1, -1, 2}, {1990, 7, 32}, {2010, 2, 29}, {2010, 4, 31} }; int timevecdata[][4] = { {0, 0, 0, 0}, {0, 10, 20, 3}, {24, 0, 0, 0}, {13, 32, 0, 3} }; int timevecbad[][4] = { {1, -1, 2, 99}, {1, 1, 1, 101}, {25, 2, 1, 0}, {23, 12, 73, 0} }; p=printlevel; tries=1; if (p>1) printf("********* check_datatype_writeread starts ************\n"); // initialise tested data nulldata[0]=NULL; chardata[0]='p'; chardata[1]=' '; intdata[0]=0; intdata[1]=100; intdata[2]=-50; intdata[3]=100200; doubledata[0]=0; doubledata[1]=1000; doubledata[2]=0.45678; doubledata[3]=-45.991; datedata[1]=733773; // 2010 m 1 d 1 datedata[2]=733892; // 2010 m 4 d 30 datedata[0]=1; datedata[3]=6000*365; timedata[0]=0; timedata[1]=10*(60*100)+20*100+3; timedata[2]=24*60*60*100; timedata[3]=14*60*58*100+3; fixpointdata[0]=0; fixpointdata[1]=1.23; fixpointdata[2]=790.3456; fixpointdata[3]=-799.7891; fixpointdata[4]=0.002345678; strdata[0]="abc"; strdata[1]="abcdefghijklmnop"; strdata[2]="1234567890123456789012345678901234567890"; strdata[3]=""; strdata[4]=""; strextradata[0]=NULL; strextradata[1]=NULL; strextradata[2]="op12345"; strextradata[3]="asdasdasdsd"; strextradata[4]=NULL; xmlliteraldata[0]="ffoo"; xmlliteraldata[1]="ffooASASASasaasweerrtttyyuuu"; xmlliteralextradata[0]="bar:we"; xmlliteralextradata[1]="bar:weasdasdasdasdasdasdasdasdasdasdasdasdasddas"; uridata[0]="dasdasdasd"; uridata[1]="dasdasdasd12345678901234567890"; uriextradata[0]=""; uriextradata[1]="fofofofof"; blobdata[0]=(char*)malloc(10); for(i=0;i<10;i++) *(blobdata[0]+i)=i+65; blobextradata[0]="type1"; bloblendata[0]=10; blobdata[1]=(char*)malloc(1000); for(i=0;i<1000;i++) *(blobdata[1]+i)=(i%10)+65; //i%256; blobextradata[1]="type2"; bloblendata[1]=200; blobdata[2]=(char*)malloc(10); for(i=0;i<10;i++) *(blobdata[2]+i)=i%256; blobextradata[2]=NULL; bloblendata[2]=10; recdata[0]=0; recdata[1]=1; recdata[2]=2; recdata[3]=3; recdata[4]=4; recdata[5]=5; recdata[6]=100; recdata[7]=101; recdata[8]=10000; recdata[9]=10001; vardata[0]=0; vardata[1]=999882; for (i=0;i1) printf("checking null enc/dec\n"); enc=wg_encode_null(db,nulldata[j]); if (wg_get_encoded_type(db,enc)!=WG_NULLTYPE) { if (p) printf("check_datatype_writeread gave error: null enc not right type \n"); return 1; } nulldec=wg_decode_null(db,enc); if (nulldata[j]!=nulldec) { if (p) printf("check_datatype_writeread gave error: null enc/dec \n"); return 1; } } // char test for (j=0;j1) printf("checking char enc/dec for j %d, value '%c'\n",j,chardata[j]); enc=wg_encode_char(db,chardata[j]); if (wg_get_encoded_type(db,enc)!=WG_CHARTYPE) { if (p) printf("check_datatype_writeread gave error: char enc not right type for j %d value '%c'\n", j,chardata[j]); return 1; } chardec=wg_decode_char(db,enc); if (chardata[j]!=chardec) { if (p) printf("check_datatype_writeread gave error: char enc/dec for j %d enc value '%c' dec value '%c'\n", j,chardata[j],chardec); return 1; } } // int test for (j=0;j1) printf("checking int enc/dec for j %d, value %d\n",j,intdata[j]); enc=wg_encode_int(db,intdata[j]); if (wg_get_encoded_type(db,enc)!=WG_INTTYPE) { if (p) printf("check_datatype_writeread gave error: int enc not right type for j %d value %d\n", j,intdata[j]); return 1; } intdec=wg_decode_int(db,enc); if (intdata[j]!=intdec) { if (p) printf("check_datatype_writeread gave error: int enc/dec 
for j %d enc value %d dec value %d\n", j,intdata[j],intdec); return 1; } } // double test for (j=0;j1) printf("checking double enc/dec for j %d, value %e\n",j,doubledata[j]); enc=wg_encode_double(db,doubledata[j]); if (wg_get_encoded_type(db,enc)!=WG_DOUBLETYPE) { if (p) printf("check_datatype_writeread gave error: double enc not right type for j %d value %e\n", j,doubledata[j]); return 1; } doubledec=wg_decode_double(db,enc); if (doubledata[j]!=doubledec) { if (p) printf("check_datatype_writeread gave error: double enc/dec for j %d enc value %e dec value %e\n", j,doubledata[j],doubledec); return 1; } } // date test for (j=0;j1) printf("checking date enc/dec for j %d, value %d\n",j,datedata[j]); enc=wg_encode_date(db,datedata[j]); if (wg_get_encoded_type(db,enc)!=WG_DATETYPE) { if (p) printf("check_datatype_writeread gave error: date enc not right type for j %d value %d\n", j,intdata[j]); return 1; } intdec=wg_decode_date(db,enc); if (datedata[j]!=intdec) { if (p) printf("check_datatype_writeread gave error: date enc/dec for j %d enc value %d dec value %d\n", j,datedata[j],intdec); return 1; } } for (j=0;j1) printf("checking building dates from vectors for j %d, expected value %d\n",j,datedata[j]); tmp=wg_ymd_to_date(db, datevecdata[j][0], datevecdata[j][1], datevecdata[j][2]); if(tmp != datedata[j]) { if (p) printf("check_datatype_writeread gave error: scalar date returned was %d\n",tmp); return 1; } wg_date_to_ymd(db, tmp, &tmpvec[0], &tmpvec[1], &tmpvec[2]); if(tmpvec[0]!=datevecdata[j][0] || tmpvec[1]!=datevecdata[j][1] ||\ tmpvec[2]!=datevecdata[j][2]) { if (p) printf("check_datatype_writeread gave error: scalar date reverse conversion failed for j %d\n",j); return 1; } } for (j=0;j1) printf("checking invalid date input for j %d\n",j); tmp=wg_ymd_to_date(db, datevecbad[j][0], datevecbad[j][1], datevecbad[j][2]); if(tmp != -1) { if (p) printf("check_datatype_writeread gave error: invalid date j %d did not cause an error\n", j); return 1; } } // time test for (j=0;j1) printf("checking time enc/dec for j %d, value %d\n",j,timedata[j]); enc=wg_encode_time(db,timedata[j]); if (wg_get_encoded_type(db,enc)!=WG_TIMETYPE) { if (p) printf("check_datatype_writeread gave error: time enc not right type for j %d value %d\n", j,timedata[j]); return 1; } intdec=wg_decode_time(db,enc); if (timedata[j]!=intdec) { if (p) printf("check_datatype_writeread gave error: time enc/dec for j %d enc value %d dec value %d\n", j,timedata[j],intdec); return 1; } } for (j=0;j1) printf("checking building times from vectors for j %d, expected value %d\n",j,timedata[j]); tmp=wg_hms_to_time(db, timevecdata[j][0], timevecdata[j][1], timevecdata[j][2], timevecdata[j][3]); if(tmp != timedata[j]) { if (p) printf("check_datatype_writeread gave error: scalar time returned was %d\n",tmp); return 1; } wg_time_to_hms(db, tmp, &tmpvec[0], &tmpvec[1], &tmpvec[2], &tmpvec[3]); if(tmpvec[0]!=timevecdata[j][0] || tmpvec[1]!=timevecdata[j][1] ||\ tmpvec[2]!=timevecdata[j][2] || tmpvec[3]!=timevecdata[j][3]) { if (p) printf("check_datatype_writeread gave error: scalar time reverse conversion failed for j %d\n",j); return 1; } } for (j=0;j1) printf("checking invalid time input for j %d\n",j); tmp=wg_hms_to_time(db, timevecbad[j][0], timevecbad[j][1], timevecbad[j][2], timevecbad[j][3]); if(tmp != -1) { if (p) printf("check_datatype_writeread gave error: invalid time j %d did not cause an error\n", j); return 1; } } // datetime test for (j=0;j1) printf("checking strf iso datetime conv for j %d, date %d time 
%d\n",j,datedata[j],timedata[j]); for(k=0;k<1000;k++) decbuf[k]=0; for(k=0;k<1000;k++) encbuf[k]=0; wg_strf_iso_datetime(db,datedata[j],timedata[j],decbuf); if (p>1) printf("wg_strf_iso_datetime gives %s ",decbuf); k=wg_strp_iso_date(db,decbuf); r=wg_strp_iso_time(db,decbuf+11); //printf("k is %d r is %d\n",k,r); if (1) { if (k>=0 && r>=0) { wg_strf_iso_datetime(db,k,r,encbuf); if (strcmp(decbuf,encbuf)) { if(p) printf("check_datatype_writeread gave error: wg_strf_iso_datetime gives %s and rev op gives %s\n", decbuf,encbuf); return 1; } if (p>1) printf("rev gives %s\n",decbuf); } else { if(p) printf("check_datatype_writeread gave error: wg_strp_iso_date gives %d and wg_strp_iso_time gives %d on %s\n", k,r,decbuf); return 1; } } } // current date and time test for(k=0;k<1000;k++) encbuf[k]=0; m=wg_current_utcdate(db); k=wg_current_localdate(db); r=wg_current_utctime(db); j=wg_current_localtime(db); if (p>1) { wg_strf_iso_datetime(db,m,r,encbuf); printf("checking wg_current_utcdate/utctime: %s\n",encbuf); wg_strf_iso_datetime(db,k,j,encbuf); printf("checking wg_current_localdate/localtime: %s\n",encbuf); } // fixpoint test for (j=0;j1) printf("checking fixpoint enc/dec for j %d, value %f\n",j,fixpointdata[j]); enc=wg_encode_fixpoint(db,fixpointdata[j]); if (wg_get_encoded_type(db,enc)!=WG_FIXPOINTTYPE) { if (p) printf("check_datatype_writeread gave error: fixpoint enc not right type for j %d value %e\n", j,doubledata[j]); return 1; } doubledec=wg_decode_fixpoint(db,enc); if (round(FIXPOINTDIVISOR*fixpointdata[j])!=round(FIXPOINTDIVISOR*doubledec)) { //(fixpointdata[j]!=doubledec) { if (p) printf("check_datatype_writeread gave error: fixpoint enc/dec for j %d enc value %f dec value %f\n", j,fixpointdata[j],doubledec); return 1; } } // str test for (j=0;j1) printf("checking str enc/dec for j %d, value \"%s\", extra \"%s\"\n", j,strdata[j],strextradata[j]); enc=wg_encode_str(db,strdata[j],strextradata[j]); if (wg_get_encoded_type(db,enc)!=WG_STRTYPE) { if (p) printf("check_datatype_writeread gave error: str enc not right type for j %d value \"%s\", extra \"%s\"\n", j,strdata[j],strextradata[j]); return 1; } len=wg_decode_str_len(db,enc); if (len!=guarded_strlen(strdata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_str_len for j %d value \"%s\" extra \"%s\" enc len %d dec len %d\n", j,strdata[j],strextradata[j],guarded_strlen(strdata[j]),len); return 1; } strdec=wg_decode_str(db,enc); if (guarded_strcmp(strdata[j],strdec)) { if (p) printf("check_datatype_writeread gave error: wg_decode_str for j %d value \"%s\" extra \"%s\"\n", j,strdata[j],strextradata[j]); return 1; } tmplen=wg_decode_str_copy(db,enc,decbuf,decbuflen); if (tmplen!=guarded_strlen(strdata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_str_copy len for j %d value \"%s\" extra \"%s\" enc len %d dec len %d\n\n", j,strdata[j],strextradata[j],guarded_strlen(strdata[j]),tmplen); return 1; } if (bufguarded_strcmp(decbuf,strdata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_str_copy for j %d value \"%s\" extra \"%s\" dec main \"%s\"\n", j,strdata[j],strextradata[j],decbuf); return 1; } len=wg_decode_str_lang_len(db,enc); if (len!=guarded_strlen(strextradata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_str_lang_len for j %d value \"%s\" extra \"%s\" enc len %d dec len %d\n", j,strdata[j],strextradata[j],guarded_strlen(strextradata[j]),len); return 1; } strdec=wg_decode_str_lang(db,enc); if (guarded_strcmp(strextradata[j],strdec)) { if (p) 
printf("check_datatype_writeread gave error: wg_decode_str_lang for j %d value \"%s\" extra \"%s\"\n", j,strdata[j],strextradata[j]); return 1; } tmplen=wg_decode_str_lang_copy(db,enc,decbuf,decbuflen); if (tmplen!=guarded_strlen(strextradata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_str_lang_copy len for j %d value \"%s\" extra \"%s\" enc len %d dec len %d\n\n", j,strdata[j],strextradata[j],guarded_strlen(strextradata[j]),tmplen); return 1; } if (bufguarded_strcmp(decbuf,strextradata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_str_lang_copy for j %d value \"%s\" extra \"%s\" dec extra \"%s\"\n", j,strdata[j],strextradata[j],decbuf); return 1; } } // xmllit test for (j=0;j1) printf("checking xmlliteral enc/dec for j %d, value \"%s\", extra \"%s\"\n", j,xmlliteraldata[j],xmlliteralextradata[j]); enc=wg_encode_xmlliteral(db,xmlliteraldata[j],xmlliteralextradata[j]); if (wg_get_encoded_type(db,enc)!=WG_XMLLITERALTYPE) { if (p) printf("check_datatype_writeread gave error: xmlliteral enc not right type for j %d value \"%s\", extra \"%s\"\n", j,xmlliteraldata[j],xmlliteralextradata[j]); return 1; } len=wg_decode_xmlliteral_len(db,enc); if (len!=guarded_strlen(xmlliteraldata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_xmlliteral_len for j %d value \"%s\" extra \"%s\" enc len %d dec len %d\n", j,xmlliteraldata[j],xmlliteralextradata[j],guarded_strlen(xmlliteraldata[j]),len); return 1; } strdec=wg_decode_xmlliteral(db,enc); if (guarded_strcmp(xmlliteraldata[j],strdec)) { if (p) printf("check_datatype_writeread gave error: wg_decode_xmlliteral for j %d value \"%s\" extra \"%s\"\n", j,xmlliteraldata[j],xmlliteralextradata[j]); return 1; } tmplen=wg_decode_xmlliteral_copy(db,enc,decbuf,decbuflen); if (tmplen!=guarded_strlen(xmlliteraldata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_xmlliteral_copy len for j %d value \"%s\" extra \"%s\" enc len %d dec len %d\n\n", j,xmlliteraldata[j],xmlliteralextradata[j],guarded_strlen(xmlliteraldata[j]),tmplen); return 1; } if (bufguarded_strcmp(decbuf,xmlliteraldata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_xmlliteral_copy for j %d value \"%s\" extra \"%s\" dec main \"%s\"\n", j,xmlliteraldata[j],xmlliteralextradata[j],decbuf); return 1; } len=wg_decode_xmlliteral_xsdtype_len(db,enc); if (len!=guarded_strlen(xmlliteralextradata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_xmlliteral_xsdtype_len for j %d value \"%s\" extra \"%s\" enc len %d dec len %d\n", j,xmlliteraldata[j],xmlliteralextradata[j],guarded_strlen(xmlliteralextradata[j]),len); return 1; } strdec=wg_decode_xmlliteral_xsdtype(db,enc); if (guarded_strcmp(xmlliteralextradata[j],strdec)) { if (p) printf("check_datatype_writeread gave error: wg_decode_xmlliteral_xsdtype for j %d value \"%s\" extra \"%s\"\n", j,xmlliteraldata[j],xmlliteralextradata[j]); return 1; } tmplen=wg_decode_xmlliteral_xsdtype_copy(db,enc,decbuf,decbuflen); if (tmplen!=guarded_strlen(xmlliteralextradata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_xmlliteral_xsdtype_copy len for j %d value \"%s\" extra \"%s\" enc len %d dec len %d\n\n", j,xmlliteraldata[j],xmlliteralextradata[j],guarded_strlen(xmlliteralextradata[j]),tmplen); return 1; } if (bufguarded_strcmp(decbuf,xmlliteralextradata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_xmlliteral_xsdtype_copy for j %d value \"%s\" extra \"%s\" dec extra \"%s\"\n", 
j,xmlliteraldata[j],xmlliteralextradata[j],decbuf); return 1; } } // uri test for (j=0;j1) printf("checking uri enc/dec for j %d, value \"%s\", extra \"%s\"\n", j,uridata[j],uriextradata[j]); enc=wg_encode_uri(db,uridata[j],uriextradata[j]); if (wg_get_encoded_type(db,enc)!=WG_URITYPE) { if (p) printf("check_datatype_writeread gave error: uri enc not right type for j %d value \"%s\", extra \"%s\"\n", j,uridata[j],uriextradata[j]); return 1; } len=wg_decode_uri_len(db,enc); if (len!=guarded_strlen(uridata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_uri_len for j %d value \"%s\" extra \"%s\" enc len %d dec len %d\n", j,uridata[j],uriextradata[j],guarded_strlen(uridata[j]),len); return 1; } strdec=wg_decode_uri(db,enc); if (guarded_strcmp(uridata[j],strdec)) { if (p) printf("check_datatype_writeread gave error: wg_decode_uri for j %d value \"%s\" extra \"%s\"\n", j,uridata[j],uriextradata[j]); return 1; } tmplen=wg_decode_uri_copy(db,enc,decbuf,decbuflen); if (tmplen!=guarded_strlen(uridata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_uri_copy len for j %d value \"%s\" extra \"%s\" enc len %d dec len %d\n\n", j,uridata[j],uriextradata[j],guarded_strlen(uridata[j]),tmplen); return 1; } if (bufguarded_strcmp(decbuf,uridata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_uri_copy for j %d value \"%s\" extra \"%s\" dec main \"%s\"\n", j,uridata[j],uriextradata[j],decbuf); return 1; } len=wg_decode_uri_prefix_len(db,enc); if (len!=guarded_strlen(uriextradata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_uri_prefix_len for j %d value \"%s\" extra \"%s\" enc len %d dec len %d\n", j,uridata[j],uriextradata[j],guarded_strlen(uriextradata[j]),len); return 1; } strdec=wg_decode_uri_prefix(db,enc); if (guarded_strcmp(uriextradata[j],strdec)) { if (p) printf("check_datatype_writeread gave error: wg_decode_uri_prefix for j %d value \"%s\" extra \"%s\"\n", j,uridata[j],uriextradata[j]); return 1; } tmplen=wg_decode_uri_prefix_copy(db,enc,decbuf,decbuflen); if (tmplen!=guarded_strlen(uriextradata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_uri_prefix_copy len for j %d value \"%s\" extra \"%s\" enc len %d dec len %d\n\n", j,uridata[j],uriextradata[j],guarded_strlen(uriextradata[j]),tmplen); return 1; } if (bufguarded_strcmp(decbuf,uriextradata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_uri_prefix_copy for j %d value \"%s\" extra \"%s\" dec extra \"%s\"\n", j,uridata[j],uriextradata[j],decbuf); return 1; } } // blob test for (j=0;j1) printf("checking blob enc/dec for j %d, len %d extra \"%s\"\n", j,bloblendata[j],blobextradata[j]); enc=wg_encode_blob(db,blobdata[j],blobextradata[j],bloblendata[j]); if (!enc) { if (p) printf("check_datatype_writeread gave error: cannot create a blob\n"); return 1; } if (wg_get_encoded_type(db,enc)!=WG_BLOBTYPE) { if (p) printf("check_datatype_writeread gave error: blob enc not right type for j %d len %d, extra \"%s\"\n", j,bloblendata[j],blobextradata[j]); return 1; } len=wg_decode_blob_len(db,enc); if (len!=bloblendata[j]) { if (p) printf("check_datatype_writeread gave error: wg_decode_blob_len for j %d len %d extra \"%s\" enc len %d dec len %d\n", j,bloblendata[j],blobextradata[j],bloblendata[j],len); return 1; } strdec=wg_decode_blob(db,enc); if (memcmp(blobdata[j],strdec,bloblendata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_blob for j %d len %d extra \"%s\"\n", j,bloblendata[j],blobextradata[j]); return 1; 
} tmplen=wg_decode_blob_copy(db,enc,decbuf,decbuflen); if (tmplen!=bloblendata[j]) { if (p) printf("check_datatype_writeread gave error: wg_decode_blob_copy len for j %d len %d extra \"%s\" enc len %d dec len %d\n\n", j,bloblendata[j],blobextradata[j],bloblendata[j],tmplen); return 1; } if (memcmp(decbuf,blobdata[j],bloblendata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_blob_copy for j %d len %d extra \"%s\" dec len %d\n", j,bloblendata[j],blobextradata[j],tmplen); return 1; } len=wg_decode_blob_type_len(db,enc); if (len!=guarded_strlen(blobextradata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_blob_type_len for j %d len %d extra \"%s\" enc len %d dec len %d\n", j,bloblendata[j],blobextradata[j],guarded_strlen(blobextradata[j]),len); return 1; } strdec=wg_decode_blob_type(db,enc); if (guarded_strcmp(blobextradata[j],strdec)) { if (p) printf("check_datatype_writeread gave error: wg_decode_blob_type for j %d len %d extra \"%s\"\n", j,bloblendata[j],blobextradata[j]); return 1; } tmplen=wg_decode_blob_type_copy(db,enc,decbuf,decbuflen); if (tmplen!=guarded_strlen(blobextradata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_blob_type_copy len for j %d len %d extra \"%s\" enc len %d dec len %d\n\n", j,bloblendata[j],blobextradata[j],guarded_strlen(blobextradata[j]),tmplen); return 1; } if (bufguarded_strcmp(decbuf,blobextradata[j])) { if (p) printf("check_datatype_writeread gave error: wg_decode_blob_type_copy for j %d len %d extra \"%s\" dec extra \"%s\"\n", j,bloblendata[j],blobextradata[j],decbuf); return 1; } } // rec test for (j=0;j1) printf("checking rec creation, content read/write for j %d, length %d\n",j,recdata[j]); rec=(gint *)wg_create_record(db,recdata[j]); if (rec==NULL) { if (p) printf("check_datatype_writeread gave error: creating record for j %d len %d failed\n", j,recdata[j]); return 1; } /* the following code can't be correct - rec is a pointer, not encoded value if (wg_get_encoded_type(db,(gint)rec)!=WG_RECORDTYPE) { if (p) printf("check_datatype_writeread gave error: created record not right type for j %d len %d\n", j,recdata[j]); return 1; } */ tmplen=wg_get_record_len(db,rec); if (tmplen!=recdata[j]) { if (p) printf("check_datatype_writeread gave error: wg_get_record_len gave %d for rec of len %d\n", tmplen,recdata[j]); return 1; } for(k=0;k1) printf("checking var enc/dec for j %d, value %d\n",j,vardata[j]); enc=wg_encode_var(db,vardata[j]); if (wg_get_encoded_type(db,enc)!=WG_VARTYPE) { if (p) printf("check_datatype_writeread gave error: var enc not right type for j %d value %d\n", j,vardata[j]); return 1; } intdec=wg_decode_var(db,enc); if (vardata[j]!=intdec) { if (p) printf("check_datatype_writeread gave error: var enc/dec for j %d enc value %d dec value %d\n", j,vardata[j],intdec); return 1; } } /* Test string decode with insufficient buffer size */ if (p>1) printf("checking decoding data that doesn't fit the decode buffer "\ "(expecting some errors)\n"); enc=wg_encode_str(db, "00000000001111111111", NULL); /* shortstr, len=20 */ memset(decbuf, 0, decbuflen); if(wg_decode_str_copy(db, enc, decbuf, 10) > 0) { /* we expect this to fail, but if it succeeds, let's check if the * buffer size was honored */ if(strlen(decbuf) != 10) { if(p) printf("check_datatype_writeread gave error: "\ "buffer overflow when decoding a shortstr\n"); return 1; } } enc=wg_encode_str(db, "0000000000111111111", "et"); /* longstr, len=19 */ memset(decbuf, 0, decbuflen); if(wg_decode_str_copy(db, enc, decbuf, 11) > 0) { 
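/*
 * Illustration sketch, not part of the original test suite: minimal record
 * usage matching the record test above. The helper name, field count and
 * values are made up; wg_set_field() returning 0 on success follows the
 * convention used by the index tests later in this file.
 */
static int example_record_usage(void* db) {
  void* rec;

  rec = wg_create_record(db, 3);             /* a record with 3 fields */
  if (rec == NULL) return 1;
  if (wg_set_field(db, rec, 0, wg_encode_int(db, 42))) return 1;
  if (wg_set_field(db, rec, 1, wg_encode_str(db, "hello", NULL))) return 1;
  if (wg_set_field(db, rec, 2, wg_encode_double(db, 1.5))) return 1;

  if (wg_get_record_len(db, rec) != 3) return 1;
  /* fields come back as encoded values and are decoded by type */
  if (wg_decode_int(db, wg_get_field(db, rec, 0)) != 42) return 1;
  return strcmp(wg_decode_str(db, wg_get_field(db, rec, 1)), "hello") ? 1 : 0;
}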
    if(strlen(decbuf) != 11) {
      if(p)
        printf("check_datatype_writeread gave error: "\
          "buffer overflow when decoding a longstr\n");
      return 1;
    }
  }

  enc=wg_encode_blob(db, "000000000011111111", "blobtype", 18); /* blob */
  memset(decbuf, 0, decbuflen);
  if(wg_decode_blob_copy(db, enc, decbuf, 12) > 0) {
    if(strlen(decbuf) != 12) {
      if(p)
        printf("check_datatype_writeread gave error: "\
          "buffer overflow when decoding a blob\n");
      return 1;
    }
  }
  }
  if (p>1)
    printf("********* check_datatype_writeread ended without errors ************\n");
  return 0;
}

static int guarded_strlen(char* str) {
  if (str==NULL) return 0;
  else return strlen(str);
}

static int guarded_strcmp(char* a, char* b) {
  if (a==NULL && b!=NULL) return 1;
  if (a!=NULL && b==NULL) return -1;
  if (a==NULL && b==NULL) return 0;
  else return strcmp(a,b);
}

static int bufguarded_strcmp(char* a, char* b) {
  if (a==NULL && b==NULL) return 0;
  if (a==NULL && strlen(b)==0) return 0;
  if (b==NULL && strlen(a)==0) return 0;
  if (a==NULL && b!=NULL) return 1;
  if (a!=NULL && b==NULL) return -1;
  else return strcmp(a,b);
}

/* ------------------------ test record linking ------------------------------*/

gint wg_check_backlinking(void* db, int printlevel) {
#ifdef USE_BACKLINKING
  int p;
  int tmp;
  gint *rec, *rec2, *rec3;

  p = printlevel;
  if (p>1)
    printf("********* checking record linking and deleting ************\n");
  rec=(gint *) wg_create_record(db,2);
  rec2=(gint *) wg_create_record(db,2);
  rec3=(gint *) wg_create_record(db,1);
  if (rec==NULL || rec2==NULL || rec3==NULL) {
    if (p) printf("unexpected error: rec creation failed\n");
    return 1;
  }
  wg_set_field(db, rec, 0, wg_encode_int(db, 10));
  wg_set_field(db, rec, 1, wg_encode_str(db, "hello", NULL));
  wg_set_field(db, rec2, 0, wg_encode_record(db, rec));
  wg_set_field(db, rec2, 1, wg_encode_str(db, "hi", NULL));
  wg_set_field(db, rec3, 0, wg_encode_record(db, rec2));

  /* this should fail */
  tmp = wg_delete_record(db, rec);
  if(tmp != -1) {
    if (p)
      printf("check_backlinking: deleting referenced record, expected %d, received %d\n",
        -1, (int) tmp);
    return 1;
  }
  /* this should also fail */
  tmp = wg_delete_record(db, rec2);
  if(tmp != -1) {
    if (p)
      printf("check_backlinking: deleting referenced record, expected %d, received %d\n",
        -1, (int) tmp);
    return 1;
  }
  wg_set_field(db, rec3, 0, 0);
  /* this should now succeed */
  tmp = wg_delete_record(db, rec2);
  if(tmp != 0) {
    if (p)
      printf("check_backlinking: deleting no longer referenced record, expected %d, received %d\n",
        0, (int) tmp);
    return 1;
  }
  /* this should also succeed */
  tmp = wg_delete_record(db, rec);
  if(tmp != 0) {
    if (p)
      printf("check_backlinking: deleting child of deleted record, expected %d, received %d\n",
        0, (int) tmp);
    return 1;
  }
  /* and this should succeed */
  tmp = wg_delete_record(db, rec3);
  if(tmp != 0) {
    if (p)
      printf("check_backlinking: deleting record, expected %d, received %d\n",
        0, (int) tmp);
    return 1;
  }
  if (p>1)
    printf("********* check_backlinking: no errors ************\n");
#else
  printf("check_backlinking: disabled, skipping checks\n");
#endif
  return 0;
}

/* ------------------------ test string parsing ------------------------------*/

static int do_check_parse_encode(void *db, gint enc, gint exptype, void *expval,
 int printlevel) {
  int i, p=printlevel, tmp;
  gint intdec;
  double doubledec, diff;
  char* strdec;
  int vecdec[4];
  gint enctype;

  enctype = wg_get_encoded_type(db, enc);
  if(enctype != exptype) {
    if(p)
      printf("check_parse_encode: expected type %s, got type %s\n",
        wg_get_type_name(db, exptype), wg_get_type_name(db, enctype));
    return 1;
  }

  switch(enctype) {
    case
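/*
 * Illustration sketch, not part of the original test suite: the deletion
 * order that wg_check_backlinking() above verifies, as straight-line code.
 * The helper name is made up; it assumes USE_BACKLINKING, under which
 * wg_delete_record() returns -1 while a record is still referenced and 0
 * once the reference has been cleared.
 */
static int example_backlink_delete(void* db) {
  void *child, *parent;

  child = wg_create_record(db, 1);
  parent = wg_create_record(db, 1);
  if (child == NULL || parent == NULL) return 1;
  wg_set_field(db, child, 0, wg_encode_int(db, 1));
  wg_set_field(db, parent, 0, wg_encode_record(db, child)); /* parent -> child */

  if (wg_delete_record(db, child) != -1) return 1; /* refused: still referenced */
  wg_set_field(db, parent, 0, 0);                  /* clear the reference */
  if (wg_delete_record(db, child) != 0) return 1;  /* now allowed */
  return wg_delete_record(db, parent) ? 1 : 0;
}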
WG_NULLTYPE: if(wg_decode_null(db, enc) != NULL) { if(p) printf("check_parse_encode: expected value NULL, got %d (encoded)\n", (int) enc); return 1; } break; case WG_INTTYPE: intdec = wg_decode_int(db, enc); if(intdec != *((gint *) expval)) { if(p) printf("check_parse_encode: expected value %d, got %d\n", (int) *((gint *) expval), (int) intdec); return 1; } break; case WG_DOUBLETYPE: doubledec = wg_decode_double(db, enc); diff = doubledec - *((double *) expval); if(diff < -0.000001 || diff > 0.000001) { if(p) printf("check_parse_encode: expected value %f, got %f\n", *((double *) expval), doubledec); return 1; } break; case WG_STRTYPE: strdec = wg_decode_str(db, enc); if(bufguarded_strcmp(strdec, (char *) expval)) { if(p) printf("check_parse_encode: expected value \"%s\", got \"%s\"\n", (char *) expval, strdec); return 1; } break; case WG_DATETYPE: tmp = wg_decode_date(db, enc); wg_date_to_ymd(db, tmp, &vecdec[0], &vecdec[1], &vecdec[2]); for(i=0; i<3; i++) { if(vecdec[i] != ((int *) expval)[i]) { if(p) printf("check_parse_encode: "\ "date vector pos %d expected value %d, got %d\n", i, ((int *) expval)[i], vecdec[i]); return 1; } } break; case WG_TIMETYPE: tmp = wg_decode_time(db, enc); wg_time_to_hms(db, tmp, &vecdec[0], &vecdec[1], &vecdec[2], &vecdec[3]); for(i=0; i<4; i++) { if(vecdec[i] != ((int *) expval)[i]) { if(p) printf("check_parse_encode: "\ "time vector pos %d expected value %d, got %d\n", i, ((int *) expval)[i], vecdec[i]); return 1; } } break; default: printf("check_parse_encode: unexpected type %s\n", wg_get_type_name(db, enctype)); return 1; } return 0; } gint wg_check_parse_encode(void* db, int printlevel) { int p, i; const char *testinput[] = { "", /* empty string - NULL */ " ", /* space - string */ "\r\t \n\r\t \b\xff", /* various whitespace and other junk */ "üöäõõõü ÄÖÜÕ", /* ISO-8859-1 encoded string */ "\xc3\xb5\xc3\xa4\xc3\xb6\xc3\xbc \xc3\x95\xc3\x84\xc3\x96\xc3\x9c", /* UTF-8 */ "0", /* integer */ "5435354534", /* a large integer, parsed as string if strtol() is 32-bit */ "54312313214385290438390523442348932048234324348930243242342342389"\ "4380148902432428904283323892374282394832423", /* a very large integer */ "7.432432", /* floating point (CSV_DECIMAL_SEPARATOR in dbutil.c) */ "-7899", /* negative integer */ "-14324.432432", /* negative floating point number */ "-tere", /* something that is not a negative number */ "0.88872d", /* a number with garbage appended */ " 995", /* a number that is parsed as a string */ "1996-01-01", /* iso8601 date */ "2038-12-12", /* same, in the future */ "12:01:17", /* iso8601 time */ "23:01:17.87", /* iso8601 time, with fractions */ "09:01", /* time, no seconds */ NULL /* terminator */ }; /* verification data */ gint intval[] = { 0, (sizeof(long) > 4 ? (gint) 5435354534L : 0), -7899 }; double doubleval[] = { 7.432432, -14324.432432 }; int datevec[][3] = { {1996, 1, 1}, {2038, 12, 12} }; int timevec[][4] = { {12, 1, 17, 0}, {23, 1, 17, 87} }; /* should match testinput */ gint testtype[] = { WG_NULLTYPE, WG_STRTYPE, WG_STRTYPE, WG_STRTYPE, WG_STRTYPE, WG_INTTYPE, (sizeof(long) > 4 ? WG_INTTYPE : WG_STRTYPE), WG_STRTYPE, WG_DOUBLETYPE, WG_INTTYPE, WG_DOUBLETYPE, WG_STRTYPE, WG_STRTYPE, WG_STRTYPE, WG_DATETYPE, WG_DATETYPE, WG_TIMETYPE, WG_TIMETYPE, WG_STRTYPE, -1, /* unused */ }; /* map to verification data, recast to correct type when used */ void *testval[] = { NULL, /* unused */ (void *) testinput[1], (void *) testinput[2], (void *) testinput[3], (void *) testinput[4], (void *) &intval[0], (sizeof(long) > 4 ? 
(void *) &intval[1] : (void *) testinput[6]), (void *) testinput[7], (void *) &doubleval[0], (void *) &intval[2], (void *) &doubleval[1], (void *) testinput[11], (void *) testinput[12], (void *) testinput[13], (void *) datevec[0], (void *) datevec[1], (void *) timevec[0], (void *) timevec[1], (void *) testinput[18], NULL /* unused */ }; p=printlevel; if (p>1) printf("********* testing string parsing ************\n"); i=0; while(testinput[i]) { gint encv, encp; /* Announce */ if(p>1) { printf("parsing string: \"%s\"\n", testinput[i]); } /* Parse and encode */ encv = wg_parse_and_encode(db, (char *) testinput[i]); encp = wg_parse_and_encode_param(db, (char *) testinput[i]); /* Check */ if(encv == WG_ILLEGAL) { if(p) printf("check_parse_encode: encode value failed, got WG_ILLEGAL\n"); return 1; } else if(do_check_parse_encode(db, encv, testtype[i], testval[i], p)) { return 1; } if(encp == WG_ILLEGAL) { if(p) printf("check_parse_encode: encode param failed, got WG_ILLEGAL\n"); return 1; } else if(do_check_parse_encode(db, encp, testtype[i], testval[i], p)) { return 1; } /* Free */ wg_free_encoded(db, encv); wg_free_query_param(db, encp); i++; } if (p>1) printf("********* check_parse_encode: no errors ************\n"); return 0; } /* ------------------------ test comparison ------------------------------*/ gint wg_check_compare(void* db, int printlevel) { int i, j; gint testdata[28]; void *rec1, *rec2, *rec3; testdata[0] = wg_encode_null(db, 0); testdata[4] = wg_encode_int(db, -321784); testdata[5] = wg_encode_int(db, 34531); testdata[6] = wg_encode_double(db, 0.000000001); testdata[7] = wg_encode_double(db, 0.00000001); testdata[8] = wg_encode_str(db, "", NULL); testdata[9] = wg_encode_str(db, "XX", NULL); testdata[10] = wg_encode_str(db, "this is a string", NULL); testdata[11] = wg_encode_str(db, "this is a string ", NULL); testdata[12] = wg_encode_xmlliteral(db, "this is a string ", "foo:bar"); testdata[13] = wg_encode_xmlliteral(db, "this is a string ", "foo:bart"); testdata[14] = wg_encode_uri(db, "www.amazon.com", "http://"); testdata[15] = wg_encode_uri(db, "www.yahoo.com", "http://"); testdata[16] = wg_encode_blob(db, "\0\0\045\120\104\106\055\061\0\056\065\012\045\045", "blob", 14); testdata[17] = wg_encode_blob(db, "\0\0\045\120\104\106\055\061\001\056\065\012\044", "blob", 13); testdata[18] = wg_encode_char(db, 'C'); testdata[19] = wg_encode_char(db, 'c'); testdata[20] = wg_encode_fixpoint(db, -7.25); testdata[21] = wg_encode_fixpoint(db, -7.2); testdata[22] = wg_encode_date(db, wg_ymd_to_date(db, 2010, 4, 1)); testdata[23] = wg_encode_date(db, wg_ymd_to_date(db, 2010, 4, 30)); testdata[24] = wg_encode_time(db, wg_hms_to_time(db, 13, 32, 0, 3)); testdata[25] = wg_encode_time(db, wg_hms_to_time(db, 24, 0, 0, 0)); testdata[26] = wg_encode_var(db, 7); testdata[27] = wg_encode_var(db, 10); /* create records in reverse order to catch offset comparison */ rec3 = wg_create_raw_record(db, 3); wg_set_new_field(db, rec3, 0, testdata[4]); wg_set_new_field(db, rec3, 1, testdata[23]); wg_set_new_field(db, rec3, 2, testdata[9]); rec2 = wg_create_raw_record(db, 3); wg_set_new_field(db, rec2, 0, testdata[4]); wg_set_new_field(db, rec2, 1, testdata[14]); wg_set_new_field(db, rec2, 2, testdata[9]); rec1 = wg_create_raw_record(db, 2); testdata[1] = wg_encode_record(db, rec1); testdata[2] = wg_encode_record(db, rec2); testdata[3] = wg_encode_record(db, rec3); if(printlevel>1) printf("********* testing data comparison ************\n"); for(i=0; i<26; i++) { for(j=i; j<26; j++) { if(i==j) { 
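/*
 * Illustration sketch, not part of the original test suite: automatic type
 * detection with wg_parse_and_encode(), matching the expectations encoded
 * in testinput[]/testtype[] above. The helper name is made up; encoded
 * values are released with wg_free_encoded() as in the test loop.
 */
static int example_parse_and_encode(void* db) {
  gint enc;

  enc = wg_parse_and_encode(db, "-7899");        /* parsed as an integer */
  if (enc == WG_ILLEGAL || wg_get_encoded_type(db, enc) != WG_INTTYPE) return 1;
  wg_free_encoded(db, enc);

  enc = wg_parse_and_encode(db, "1996-01-01");   /* ISO-8601 date */
  if (enc == WG_ILLEGAL || wg_get_encoded_type(db, enc) != WG_DATETYPE) return 1;
  wg_free_encoded(db, enc);

  enc = wg_parse_and_encode(db, "-tere");        /* not a number: stays a string */
  if (enc == WG_ILLEGAL || wg_get_encoded_type(db, enc) != WG_STRTYPE) return 1;
  wg_free_encoded(db, enc);
  return 0;
}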
if(WG_COMPARE(db, testdata[i], testdata[j]) != WG_EQUAL) { if(printlevel) { printf("value1: "); wg_debug_print_value(db, testdata[i]); printf(" value2: "); wg_debug_print_value(db, testdata[j]); printf("\nvalue1 and value2 should have been equal\n"); } return 1; } } else #if WG_COMPARE_REC_DEPTH < 2 if(wg_get_encoded_type(db, testdata[i]) != \ wg_get_encoded_type(db, testdata[j]) || \ wg_get_encoded_type(db, testdata[i]) != WG_RECORDTYPE) #endif { if(WG_COMPARE(db, testdata[i], testdata[j]) != WG_LESSTHAN) { if(printlevel) { printf("value1: "); wg_debug_print_value(db, testdata[i]); printf(" value2: "); wg_debug_print_value(db, testdata[j]); printf("\nvalue1 should have been less than value2\n"); } return 1; } if(WG_COMPARE(db, testdata[j], testdata[i]) != WG_GREATER) { if(printlevel) { printf("value1: "); wg_debug_print_value(db, testdata[i]); printf(" value2: "); wg_debug_print_value(db, testdata[j]); printf("\nvalue1 should have been greater than value2\n"); } return 1; } } } } if(printlevel>1) printf("********* check_compare: no errors ************\n"); return 0; } /* -------------------- test query parameter encoding --------------------*/ gint wg_check_query_param(void* db, int printlevel) { gint encv, encp, tmp; int i; char *strdata[] = { "RjlTKUoxfhdqLiIz", "llWsdbuVGhoGqjs", "HRmUHyBkMKiqsu", "NcDoCfVjFPgWh", "ESGgFsyEcGLI", "PxPGipbFQgq", "UdDVsnFVKA", "JnhQcGTnC", "KxKPyzju", NULL }; if(printlevel>1) printf("********* testing query parameter encoding ************\n"); /* Data that does not require storage allocation */ encv = wg_encode_null(db, 0); encp = wg_encode_query_param_null(db, 0); if(encv != encp) { if(printlevel) { printf("check_query_param: encoded NULL parameter (%d)"\ "was not equal to encoded NULL value (%d)\n", (int) encp, (int) encv); } return 1; } encv = wg_encode_char(db, 'X'); encp = wg_encode_query_param_char(db, 'X'); if(encv != encp) { if(printlevel) { printf("check_query_param: encoded char parameter (%d) "\ "was not equal to encoded char value (%d)\n", (int) encp, (int) encv); } return 1; } encv = wg_encode_fixpoint(db, 37.596); encp = wg_encode_query_param_fixpoint(db, 37.596); if(encv != encp) { if(printlevel) { printf("check_query_param: encoded fixpoint parameter (%d) "\ "was not equal to encoded fixpoint value (%d)\n", (int) encp, (int) encv); } return 1; } tmp = wg_ymd_to_date(db, 1859, 7, 13); encv = wg_encode_date(db, tmp); encp = wg_encode_query_param_date(db, tmp); if(encv != encp) { if(printlevel) { printf("check_query_param: encoded date parameter (%d) "\ "was not equal to encoded date value (%d)\n", (int) encp, (int) encv); } return 1; } tmp = wg_hms_to_time(db, 17, 15, 0, 0); encv = wg_encode_time(db, tmp); encp = wg_encode_query_param_time(db, tmp); if(encv != encp) { if(printlevel) { printf("check_query_param: encoded time parameter (%d) "\ "was not equal to encoded time value (%d)\n", (int) encp, (int) encv); } return 1; } encv = wg_encode_var(db, 2); encp = wg_encode_query_param_var(db, 2); if(encv != encp) { if(printlevel) { printf("check_query_param: encoded var parameter (%d) "\ "was not equal to encoded var value (%d)\n", (int) encp, (int) encv); } return 1; } /* Smallint */ encv = wg_encode_int(db, 77); encp = wg_encode_query_param_int(db, 77); if(encv != encp) { if(printlevel) { printf("check_query_param: encoded int parameter (%d) "\ "was not equal to encoded int value (%d)\n", (int) encp, (int) encv); } return 1; } /* Data that requires storage */ if(sizeof(gint) > 4) { tmp = (gint) 3152921502073741877L; } else { tmp = 
2073741877; } encp = wg_encode_query_param_int(db, tmp); if(!isfullint(encp)) { if(printlevel) { printf("check_query_param: encoded int parameter (%d) "\ "had bad encoding (does not look like a full int)\n", (int) encp); } wg_free_query_param(db, encp); return 1; } if((gint) (dbfetch(db, decode_fullint_offset(encp))) != tmp) { if(printlevel) { printf("check_query_param: encoded int parameter (%d) "\ "contained an invalid value\n", (int) encp); } wg_free_query_param(db, encp); return 1; } tmp = decode_fullint_offset(encp); if(tmp > 0 && tmp < dbmemsegh(db)->free) { if(printlevel) { printf("check_query_param: encoded int parameter (%d) "\ "had an invalid offset\n", (int) encp); } wg_free_query_param(db, encp); return 1; } wg_free_query_param(db, encp); encp = wg_encode_query_param_double(db, 0.00000000000324445); if(!isfulldouble(encp)) { if(printlevel) { printf("check_query_param: encoded double parameter (%d) "\ "had bad encoding (does not look like a double)\n", (int) encp); } wg_free_query_param(db, encp); return 1; } else { double val = wg_decode_double(db, encp); double diff = val - 0.00000000000324445; if(diff > 0.00000000000000001 || diff < -0.00000000000000001) { if(printlevel) { printf("check_query_param: encoded double parameter (%d) "\ "contained an invalid value (delta: %f)\n", (int) encp, diff); } wg_free_query_param(db, encp); return 1; } tmp = decode_fulldouble_offset(encp); if(tmp > 0 && tmp < dbmemsegh(db)->free) { if(printlevel) { printf("check_query_param: encoded double parameter (%d) "\ "had an invalid offset\n", (int) encp); } wg_free_query_param(db, encp); return 1; } } wg_free_query_param(db, encp); encp = wg_encode_query_param_str(db, "lalalalalalalalalalalalalalalalalalalala", NULL); if(!isshortstr(encp)) { if(printlevel) { printf("check_query_param: encoded longstr parameter (%d) "\ "had bad encoding (should be encoded as shortstr)\n", (int) encp); } wg_free_query_param(db, encp); return 1; } else { char *val = wg_decode_str(db, encp); if(strcmp(val, "lalalalalalalalalalalalalalalalalalalala")) { if(printlevel) { printf("check_query_param: encoded longstr parameter (%d) "\ "decoded to an invalid value \"%s\"\n", (int) encp, val); } wg_free_query_param(db, encp); return 1; } if(wg_decode_str_len(db, encp) != 40) { if(printlevel) { printf("check_query_param: encoded longstr parameter (%d) "\ "had invalid length\n", (int) encp); } wg_free_query_param(db, encp); return 1; } tmp = decode_shortstr_offset(encp); if(tmp > 0 && tmp < dbmemsegh(db)->free) { if(printlevel) { printf("check_query_param: encoded longstr parameter (%d) "\ "had an invalid offset\n", (int) encp); } wg_free_query_param(db, encp); return 1; } } wg_free_query_param(db, encp); encp = wg_encode_query_param_str(db, "", NULL); if(wg_get_encoded_type(db, encp) != WG_STRTYPE) { if(printlevel) { printf("check_query_param: encoded empty string parameter (%d) "\ "had bad type (should be WG_STRTYPE)\n", (int) encp); } wg_free_query_param(db, encp); return 1; } else { char *val = wg_decode_str(db, encp); if(strcmp(val, "")) { if(printlevel) { printf("check_query_param: encoded empty string parameter (%d) "\ "decoded to an invalid value \"%s\"\n", (int) encp, val); } wg_free_query_param(db, encp); return 1; } if(wg_decode_str_len(db, encp) != 0) { if(printlevel) { printf("check_query_param: encoded empty string parameter (%d) "\ "had invalid length\n", (int) encp); } wg_free_query_param(db, encp); return 1; } tmp = decode_shortstr_offset(encp); if(tmp > 0 && tmp < dbmemsegh(db)->free) { if(printlevel) { 
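/*
 * Illustration sketch, not part of the original test suite: query parameter
 * encoding as verified above. The helper name and values are made up; the
 * WG_ILLEGAL error convention is assumed to match the other encode calls in
 * this file. Parameters that need storage (here a long string) live outside
 * the shared data area and must be freed with wg_free_query_param().
 */
static int example_query_param(void* db) {
  gint encp;

  encp = wg_encode_query_param_str(db, "lalalalalalalalalalalalalalalalalalalala", NULL);
  if (encp == WG_ILLEGAL) return 1;
  /* a parameter decodes exactly like a normal encoded string */
  if (strcmp(wg_decode_str(db, encp), "lalalalalalalalalalalalalalalalalalalala")) {
    wg_free_query_param(db, encp);
    return 1;
  }
  wg_free_query_param(db, encp);   /* always release the parameter */

  /* small immediate values encode identically to ordinary values */
  return (wg_encode_query_param_int(db, 77) != wg_encode_int(db, 77)) ? 1 : 0;
}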
printf("check_query_param: encoded empty string parameter (%d) "\ "had an invalid offset\n", (int) encp); } wg_free_query_param(db, encp); return 1; } } wg_free_query_param(db, encp); i=0; while(strdata[i]) { encp = wg_encode_query_param_str(db, strdata[i], "et"); if(!islongstr(encp)) { if(printlevel) { printf("check_query_param: encoded string parameter (%d) "\ "had bad encoding (should be encoded as longstr)\n", (int) encp); } wg_free_query_param(db, encp); return 1; } else { char *val = wg_decode_str(db, encp); char cbuf[17]; int strl = strlen(strdata[i]), encl; if(strcmp(val, strdata[i])) { if(printlevel) { printf("check_query_param: encoded string parameter (%d) "\ "decoded to an invalid value \"%s\"\n", (int) encp, val); } wg_free_query_param(db, encp); return 1; } if((encl = wg_decode_str_len(db, encp)) != strl) { if(printlevel) { printf("check_query_param: encoded string parameter \"%s\" "\ "had invalid length (%d != %d)\n", strdata[i], encl, strl); } wg_free_query_param(db, encp); return 1; } if(wg_decode_str_copy(db, encp, cbuf, 17) != strl) { if(printlevel) { printf("check_query_param: wg_decode_str_copy(): invalid length\n"); } wg_free_query_param(db, encp); return 1; } if(strcmp(cbuf, strdata[i])) { if(printlevel) { printf("check_query_param: copy of encoded string parameter (%d) "\ "is an invalid value \"%s\"\n", (int) encp, cbuf); } wg_free_query_param(db, encp); return 1; } val = wg_decode_str_lang(db, encp); if(strcmp(val, "et")) { if(printlevel) { printf("check_query_param: encoded string parameter (%d) "\ "had invalid language \"%s\"\n", (int) encp, val); } wg_free_query_param(db, encp); return 1; } if(wg_decode_str_lang_len(db, encp) != 2) { if(printlevel) { printf("check_query_param: encoded string parameter (%d) language "\ "had invalid length\n", (int) encp); } wg_free_query_param(db, encp); return 1; } if(wg_decode_str_lang_copy(db, encp, cbuf, 17) != 2) { if(printlevel) { printf("check_query_param: wg_decode_str_lang_copy(): "\ "invalid length\n"); } wg_free_query_param(db, encp); return 1; } if(strcmp(cbuf, "et")) { if(printlevel) { printf("check_query_param: copy of encoded string parameter's (%d) "\ "language is an invalid value \"%s\"\n", (int) encp, cbuf); } wg_free_query_param(db, encp); return 1; } tmp = decode_longstr_offset(encp); if(tmp > 0 && tmp < dbmemsegh(db)->free) { if(printlevel) { printf("check_query_param: encoded string parameter (%d) "\ "had an invalid offset\n", (int) encp); } wg_free_query_param(db, encp); return 1; } } wg_free_query_param(db, encp); i++; } encp = wg_encode_query_param_xmlliteral(db, "VwwEtCiQQLvcIoB", "ACWzCMGGFVcZBjk"); if(!islongstr(encp)) { if(printlevel) { printf("check_query_param: encoded string parameter (%d) "\ "had bad encoding (should be encoded as longstr)\n", (int) encp); } wg_free_query_param(db, encp); return 1; } else { char *val = wg_decode_xmlliteral(db, encp); char cbuf[16]; int encl; if(strcmp(val, "VwwEtCiQQLvcIoB")) { if(printlevel) { printf("check_query_param: encoded XML literal param (%d) "\ "decoded to an invalid value \"%s\"\n", (int) encp, val); } wg_free_query_param(db, encp); return 1; } if((encl = wg_decode_xmlliteral_len(db, encp)) != 15) { if(printlevel) { printf("check_query_param: encoded XML literal param \"%s\" "\ "had invalid length (%d != 15)\n", strdata[i], encl); } wg_free_query_param(db, encp); return 1; } if(wg_decode_xmlliteral_copy(db, encp, cbuf, 16) != 15) { if(printlevel) { printf("check_query_param: wg_decode_xmlliteral_copy(): "\ "invalid length\n"); } wg_free_query_param(db, 
encp); return 1; } if(strcmp(cbuf, "VwwEtCiQQLvcIoB")) { if(printlevel) { printf("check_query_param: copy of encoded XML literal param (%d) "\ "is an invalid value \"%s\"\n", (int) encp, cbuf); } wg_free_query_param(db, encp); return 1; } val = wg_decode_xmlliteral_xsdtype(db, encp); if(strcmp(val, "ACWzCMGGFVcZBjk")) { if(printlevel) { printf("check_query_param: encoded XML literal param (%d) "\ "had invalid language \"%s\"\n", (int) encp, val); } wg_free_query_param(db, encp); return 1; } if(wg_decode_xmlliteral_xsdtype_len(db, encp) != 15) { if(printlevel) { printf("check_query_param: encoded XML literal param (%d) type "\ "had invalid length\n", (int) encp); } wg_free_query_param(db, encp); return 1; } if(wg_decode_xmlliteral_xsdtype_copy(db, encp, cbuf, 16) != 15) { if(printlevel) { printf("check_query_param: wg_decode_xmlliteral_xsdtype_copy(): "\ "invalid length\n"); } wg_free_query_param(db, encp); return 1; } if(strcmp(cbuf, "ACWzCMGGFVcZBjk")) { if(printlevel) { printf("check_query_param: copy of encoded XML literal param's "\ "(%d) type is an invalid value \"%s\"\n", (int) encp, cbuf); } wg_free_query_param(db, encp); return 1; } tmp = decode_longstr_offset(encp); if(tmp > 0 && tmp < dbmemsegh(db)->free) { if(printlevel) { printf("check_query_param: encoded XML literal param (%d) "\ "had an invalid offset\n", (int) encp); } wg_free_query_param(db, encp); return 1; } } wg_free_query_param(db, encp); encp = wg_encode_query_param_uri(db, "GCwepgqnKqcxnTj", "WdszkaEjrhEjgNS"); if(!islongstr(encp)) { if(printlevel) { printf("check_query_param: encoded string parameter (%d) "\ "had bad encoding (should be encoded as longstr)\n", (int) encp); } wg_free_query_param(db, encp); return 1; } else { char *val = wg_decode_uri(db, encp); char cbuf[16]; int encl; if(strcmp(val, "GCwepgqnKqcxnTj")) { if(printlevel) { printf("check_query_param: encoded URI parameter (%d) "\ "decoded to an invalid value \"%s\"\n", (int) encp, val); } wg_free_query_param(db, encp); return 1; } if((encl = wg_decode_uri_len(db, encp)) != 15) { if(printlevel) { printf("check_query_param: encoded URI parameter \"%s\" "\ "had invalid length (%d != 15)\n", strdata[i], encl); } wg_free_query_param(db, encp); return 1; } if(wg_decode_uri_copy(db, encp, cbuf, 16) != 15) { if(printlevel) { printf("check_query_param: wg_decode_uri_copy(): "\ "invalid length\n"); } wg_free_query_param(db, encp); return 1; } if(strcmp(cbuf, "GCwepgqnKqcxnTj")) { if(printlevel) { printf("check_query_param: copy of encoded URI parameter (%d) "\ "is an invalid value \"%s\"\n", (int) encp, cbuf); } wg_free_query_param(db, encp); return 1; } val = wg_decode_uri_prefix(db, encp); if(strcmp(val, "WdszkaEjrhEjgNS")) { if(printlevel) { printf("check_query_param: encoded URI parameter (%d) "\ "had invalid language \"%s\"\n", (int) encp, val); } wg_free_query_param(db, encp); return 1; } if(wg_decode_uri_prefix_len(db, encp) != 15) { if(printlevel) { printf("check_query_param: encoded URI parameter (%d) type "\ "had invalid length\n", (int) encp); } wg_free_query_param(db, encp); return 1; } if(wg_decode_uri_prefix_copy(db, encp, cbuf, 16) != 15) { if(printlevel) { printf("check_query_param: wg_decode_uri_prefix_copy(): "\ "invalid length\n"); } wg_free_query_param(db, encp); return 1; } if(strcmp(cbuf, "WdszkaEjrhEjgNS")) { if(printlevel) { printf("check_query_param: copy of encoded URI parameter's "\ "(%d) type is an invalid value \"%s\"\n", (int) encp, cbuf); } wg_free_query_param(db, encp); return 1; } tmp = decode_longstr_offset(encp); if(tmp > 0 && 
tmp < dbmemsegh(db)->free) { if(printlevel) { printf("check_query_param: encoded URI parameter (%d) "\ "had an invalid offset\n", (int) encp); } wg_free_query_param(db, encp); return 1; } } wg_free_query_param(db, encp); if(printlevel>1) printf("********* check_query_param: no errors ************\n"); return 0; } /* --------------- allocation, storage, updafe and deallocation tests ---------- */ /* gint wg_check_allocation_deallocation(void* db, int printlevel) { rec* records[1000]; gint strs[1000]; int count; int n; int i; int j; char tmpstr[1000]; char str; count=2; n=3; for(i=0;i1) printf("********* testing strhash ********** \n"); /*for(i=0;i<100;i++) strs[i]=0;*/ if (p>1) printf("---------- initial hashtable -----------\n"); if (p>1) wg_show_strhash(db); if (p>1) printf("---------- testing str creation --------- \n"); for (i=0;i1) printf("wg_set_field rec %d fld %d str '%s' lang '%s' encoded %d\n", (int)i,(int)j,instrbuf,lang,(int)enc); wg_set_field(db,rec,j,enc); if (!longstr_in_hash(db,instrbuf,lang,WG_STRTYPE,strlen(instrbuf)+1)) { if (p) printf("wg_check_strhash gave error: stored str not present in strhash: \"%s\" lang \"%s\" \n",instrbuf,lang); return 1; } } } if (p>1) printf("---------- hashtable after str adding -----------\n"); if (p>1) wg_show_strhash(db); if (p>1) printf("---------- testing str removals by overwriting data --------- \n"); for (i=0;i=0;i--) { if (strs[i]!=0) { if (p>1) printf("removing str nr %d enc %d\n",i,strs[i]); j=wg_remove_from_strhash(db,strs[i]); if (p>1) printf("removal result %d\n",j); wg_show_strhash(db); } } */ if (p>1) printf("---------- ending str removals, testing if strs removed from hash ----------\n"); for (i=0;i1) printf("---------- hashtable after str removals -----------\n"); if (p>1) wg_show_strhash(db); if (p>1)printf("********* strhash testing ended without errors ********** \n"); return 0; } static gint longstr_in_hash(void* db, char* data, char* extrastr, gint type, gint length) { db_memsegment_header* dbh = dbmemsegh(db); gint old=0; int hash; gint hasharrel; if (0) { } else { // find hash, check if exists hash=wg_hash_typedstr(db,data,extrastr,type,length); //hasharrel=((gint*)(offsettoptr(db,((db->strhash_area_header).arraystart))))[hash]; hasharrel=dbfetch(db,((dbh->strhash_area_header).arraystart)+(sizeof(gint)*hash)); //printf("hash %d ((dbh->strhash_area_header).arraystart)+(sizeof(gint)*hash) %d hasharrel %d\n", // hash,((dbh->strhash_area_header).arraystart)+(sizeof(gint)*hash), hasharrel); if (hasharrel) old=wg_find_strhash_bucket(db,data,extrastr,type,length,hasharrel); //printf("old %d \n",old); if (old) { //printf("str found in hash\n"); return 1; } //printf("str not found in hash\n"); return 0; } } void wg_show_strhash(void* db) { db_memsegment_header* dbh = dbmemsegh(db); gint i; gint hashchain; /*gint lasthashchain;*/ gint type; //gint offset; //gint refc; //int encoffset; printf("\nshowing strhash table and buckets\n"); printf("-----------------------------------\n"); printf("configured strhash size %d (%% of db size)\n",STRHASH_SIZE); printf("size %d\n", (int) (dbh->strhash_area_header).size); printf("offset %d\n", (int) (dbh->strhash_area_header).offset); printf("arraystart %d\n", (int) (dbh->strhash_area_header).arraystart); printf("arraylength %d\n", (int) (dbh->strhash_area_header).arraylength); printf("nonempty hash buckets:\n"); for(i=0;i<(dbh->strhash_area_header).arraylength;i++) { hashchain=dbfetch(db,(dbh->strhash_area_header).arraystart+(sizeof(gint)*i)); /*lasthashchain=hashchain; */ if (hashchain!=0) { 
printf("%d: contains %d encoded offset to chain\n", (int) i, (int) hashchain); for(;hashchain!=0; hashchain=dbfetch(db,decode_longstr_offset(hashchain)+LONGSTR_HASHCHAIN_POS*sizeof(gint))) { //printf("hashchain %d decode_longstr_offset(hashchain) %d fulladr %d contents %d\n", // hashchain, // decode_longstr_offset(hashchain), // (decode_longstr_offset(hashchain)+LONGSTR_HASHCHAIN_POS*sizeof(gint)), // dbfetch(db,decode_longstr_offset(hashchain)+LONGSTR_HASHCHAIN_POS*sizeof(gint))); type=wg_get_encoded_type(db,hashchain); printf(" "); wg_debug_print_value(db,hashchain); printf("\n"); //printf(" type %s",wg_get_type_name(db,type)); if (type==WG_BLOBTYPE) { //printf(" len %d\n",wg_decode_str_len(db,hashchain)); } else if (type==WG_STRTYPE || type==WG_XMLLITERALTYPE || type==WG_URITYPE || type== WG_ANONCONSTTYPE) { } else { printf("ERROR: wrong type in strhash bucket\n"); exit(0); } /*lasthashchain=hashchain;*/ } } } } /* --------------- allocation/memory checking and testing ------------------------------*/ /** check if varlen freelist is ok * * return 0 if ok, error nr if wrong * in case of error an errmsg is printed and function returns immediately * */ gint wg_check_db(void* db) { gint res; db_memsegment_header* dbh = dbmemsegh(db); printf("\nchecking datarec area\n"); printf("-----------------------\n"); res=check_varlen_area(db,&(dbh->datarec_area_header)); if (res) return res; printf("\narea test passed ok\n"); printf("\nchecking longstr area\n"); printf("-----------------------\n"); res=check_varlen_area(db,&(dbh->longstr_area_header)); if (res) return res; printf("\narea test passed ok\n"); printf("\nwhole test passed ok\n"); return 0; } static gint check_varlen_area(void* db, void* area_header) { gint res; res=check_varlen_area_markers(db,area_header); if (res) return res; res=check_varlen_area_dv(db,area_header); if (res) return res; res=check_varlen_area_freelist(db,area_header); if (res) return res; res=check_varlen_area_scan(db,area_header); if (res) return res; return 0; } static gint check_varlen_area_freelist(void* db, void* area_header) { db_area_header* areah; gint i; gint res; areah=(db_area_header*)area_header; for (i=0;ifreebuckets)[i]!=0) { //printf("checking bucket nr %d \n",i); if (ifreebuckets)[i])); res=check_bucket_freeobjects(db,areah,i); if (res) return res; } else { //printf(" is varbucket at offset %d \n",dbaddr(db,&(areah->freebuckets)[i])); res=check_bucket_freeobjects(db,areah,i); if (res) return res; } } } return 0; } static gint check_bucket_freeobjects(void* db, void* area_header, gint bucketindex) { db_area_header* areah; gint freelist; gint size; gint nextptr; gint prevptr; gint prevfreelist; gint tmp; areah=(db_area_header*)area_header; freelist=(areah->freebuckets)[bucketindex]; prevfreelist=ptrtooffset(db,&((areah->freebuckets)[bucketindex])); while(freelist!=0) { if (!isfreeobject(dbfetch(db,freelist))) { printf("varlen freelist object error:\n"); printf("object at offset %d has size gint %d which is not marked free\n", (int) freelist, (int) dbfetch(db,freelist)); return 1; } size=getfreeobjectsize(dbfetch(db,freelist)); if (bucketindex!=wg_freebuckets_index(db,size)) { printf("varlen freelist object error:\n"); printf("object at offset %d with size %d is in wrong bucket %d instead of right %d\n", (int) freelist, (int) size, (int) bucketindex, (int) wg_freebuckets_index(db,size)); return 2; } if (getfreeobjectsize(dbfetch(db,freelist+size-sizeof(gint)))!=size) { printf("varlen freelist object error:\n"); printf("object at offset %d has wrong end size 
%d which is not same as start size %d\n", (int) freelist, (int) dbfetch(db,freelist+size-sizeof(gint)), (int) size); return 3; } nextptr=dbfetch(db,freelist+sizeof(gint)); prevptr=dbfetch(db,freelist+2*sizeof(gint)); if (prevptr!=prevfreelist) { printf("varlen freelist object error:\n"); printf("object at offset %d has a wrong prevptr: %d instead of %d\n", (int) freelist, (int) prevptr, (int) prevfreelist); return 4; } tmp=check_object_in_areabounds(db,area_header,freelist,size); if (tmp) { printf("varlen freelist object error:\n"); if (tmp==1) { printf("object at offset %d does not start in the area bounds\n", (int) freelist); return 5; } else { printf("object at offset %d does not end (%d) in the same area it starts\n", (int) freelist, (int) (freelist+size)); return 6; } } //printf(" ok freeobject offset %d end %d size %d nextptr %d prevptr %d \n", // freelist,freelist+size,size,nextptr,prevptr); prevfreelist=freelist; freelist=nextptr; } return 0; } static gint check_varlen_area_markers(void* db, void* area_header) { /*db_subarea_header* arrayadr;*/ db_area_header* areah; gint last_subarea_index; gint i; gint size; gint subareastart; /*gint subareaend;*/ gint offset; gint head; areah=(db_area_header*)area_header; /*arrayadr=(areah->subarea_array);*/ last_subarea_index=areah->last_subarea_index; for(i=0;(i<=last_subarea_index)&&(isubarea_array)[i]).alignedsize; subareastart=((areah->subarea_array)[i]).alignedoffset; /*subareaend=(((areah->subarea_array)[i]).alignedoffset)+size;*/ // start marker offset=subareastart; head=dbfetch(db,offset); if (!isspecialusedobject(head)) { printf("start marker at offset %d has head %d which is not specialusedobject\n", (int) offset, (int) head); return 21; } if (getspecialusedobjectsize(head)!=MIN_VARLENOBJ_SIZE) { printf("start marker at offset %d has size %d which is not MIN_VARLENOBJ_SIZE %d\n", (int) offset, (int) getspecialusedobjectsize(head), (int) MIN_VARLENOBJ_SIZE); return 22; } if (dbfetch(db,offset+sizeof(gint))!=SPECIALGINT1START) { printf("start marker at offset %d has second gint %d which is not SPECIALGINT1START %d\n", (int) offset, (int) dbfetch(db,offset+sizeof(gint)), SPECIALGINT1START ); return 23; } //end marker offset=offset+size-MIN_VARLENOBJ_SIZE; head=dbfetch(db,offset); if (!isspecialusedobject(head)) { printf("end marker at offset %d has head %d which is not specialusedobject\n", (int) offset, (int) head); return 21; } if (getspecialusedobjectsize(head)!=MIN_VARLENOBJ_SIZE) { printf("end marker at offset %d has size %d which is not MIN_VARLENOBJ_SIZE %d\n", (int) offset, (int) getspecialusedobjectsize(head), (int) MIN_VARLENOBJ_SIZE); return 22; } if (dbfetch(db,offset+sizeof(gint))!=SPECIALGINT1END) { printf("end marker at offset %d has second gint %d which is not SPECIALGINT1END %d\n", (int) offset, (int) dbfetch(db,offset+sizeof(gint)), SPECIALGINT1END ); return 23; } } return 0; } static gint check_varlen_area_dv(void* db, void* area_header) { db_area_header* areah; gint dv; gint tmp; areah=(db_area_header*)area_header; dv=(areah->freebuckets)[DVBUCKET]; if (dv!=0) { printf("checking dv: bucket nr %d at offset %d \ncontains dv at offset %d with size %d(%d) and end %d \n", DVBUCKET, (int) dbaddr(db,&(areah->freebuckets)[DVBUCKET]), (int) dv, (int) ((areah->freebuckets)[DVSIZEBUCKET]>0 ? 
dbfetch(db,(areah->freebuckets)[DVBUCKET]) : -1), (int) (areah->freebuckets)[DVSIZEBUCKET], (int) ((areah->freebuckets)[DVBUCKET]+(areah->freebuckets)[DVSIZEBUCKET])); if (!isspecialusedobject(dbfetch(db,dv))) { printf("dv at offset %d has head %d which is not marked specialusedobject\n", (int) dv, (int) dbfetch(db,dv)); return 10; } if ((areah->freebuckets)[DVSIZEBUCKET]!=getspecialusedobjectsize(dbfetch(db,dv))) { printf("dv at offset %d has head %d with size %d which is different from freebuckets[DVSIZE] %d\n", (int) dv, (int) dbfetch(db,dv), (int) getspecialusedobjectsize(dbfetch(db,dv)), (int) (areah->freebuckets)[DVSIZEBUCKET]); return 11; } if (getspecialusedobjectsize(dbfetch(db,dv))subarea_array); last_subarea_index=areah->last_subarea_index; found=0; for(i=0;(i<=last_subarea_index)&&(i=subareastart && offsetsubareaend) { return 1; } found=1; break; } } if (!found) { return 2; } else { return 0; } } static gint check_varlen_area_scan(void* db, void* area_header) { db_area_header* areah; gint dv; gint tmp; /*db_subarea_header* arrayadr;*/ gint firstoffset; gint curoffset; gint head; gint last_subarea_index; gint i; gint subareastart; gint subareaend; gint freemarker; gint dvmarker; gint usedcount=0; gint usedbytesrealcount=0; gint usedbyteswantedcount=0; gint freecount=0; gint freebytescount=0; gint dvcount=0; gint dvbytescount=0; gint size; /*gint offset;*/ areah=(db_area_header*)area_header; /*arrayadr=(areah->subarea_array);*/ last_subarea_index=areah->last_subarea_index; dv=(areah->freebuckets)[DVBUCKET]; for(i=0;(i<=last_subarea_index)&&(isubarea_array)[i]).alignedsize; subareastart=((areah->subarea_array)[i]).alignedoffset; subareaend=(((areah->subarea_array)[i]).alignedoffset)+size; // start marker /*offset=subareastart; */ firstoffset=subareastart; // do not skip initial "used" marker curoffset=firstoffset; //printf("curroffset %d record %x\n",curoffset,(uint)record); freemarker=0; //assume first object is a special in-use marker dvmarker=0; // assume first object is not dv head=dbfetch(db,curoffset); while(1) { // increase offset to next memory block curoffset=curoffset+(freemarker ? 
getfreeobjectsize(head) : getusedobjectsize(head)); if (curoffset>=(subareastart+size)) { printf("object areanr %d offset %d size %d starts at or after area end %d\n", (int) i, (int) curoffset, (int) getusedobjectsize(head), (int) (subareastart+size)); return 32; } head=dbfetch(db,curoffset); //printf("new curoffset %d head %d isnormaluseobject %d isfreeobject %d \n", // curoffset,head,isnormalusedobject(head),isfreeobject(head)); // check if found a normal used object if (isnormalusedobject(head)) { if (freemarker && !isnormalusedobjectprevfree(head)) { printf("inuse normal object areanr %d offset %d size %d follows free but is not marked to follow free\n", (int) i, (int) curoffset, (int) getusedobjectsize(head)); return 31; } else if (!freemarker && !isnormalusedobjectprevused(head)) { printf("inuse normal object areanr %d offset %d size %d follows used but is not marked to follow used\n", (int) i, (int) curoffset, (int) getusedobjectsize(head)); return 32; } tmp=check_varlen_object_infreelist(db,area_header,curoffset,0); if (tmp!=0) return tmp; freemarker=0; dvmarker=0; usedcount++; usedbytesrealcount+=getusedobjectsize(head); usedbyteswantedcount+=getfreeobjectsize(head); // just remove two lowest bits } else if (isfreeobject(head)) { if (freemarker) { printf("free object areanr %d offset %d size %d follows free\n", (int) i, (int) curoffset, (int) getfreeobjectsize(head)); return 33; } if (dvmarker) { printf("free object areanr %d offset %d size %d follows dv\n", (int) i, (int) curoffset, (int) getfreeobjectsize(head)); return 34; } tmp=check_varlen_object_infreelist(db,area_header,curoffset,1); if (tmp!=1) { printf("free object areanr %d offset %d size %d not found in freelist\n", (int) i, (int) curoffset, (int) getfreeobjectsize(head)); return 55; } freemarker=1; dvmarker=0; freecount++; freebytescount+=getfreeobjectsize(head); // loop start leads us to next object } else { // found a special object (dv or end marker) if (dbfetch(db,curoffset+sizeof(gint))==SPECIALGINT1DV) { // we have reached a dv object if (curoffset!=dv) { printf("dv object found areanr %d offset %d size %d not marked as in[DVBUCKET] %d\n", (int) i, (int) curoffset, (int) getspecialusedobjectsize(head), (int) dv); return 35; } if (dvcount!=0) { printf("second dv object found areanr %d offset %d size %d\n", (int) i, (int) curoffset, (int) getspecialusedobjectsize(head)); return 36; } if (getspecialusedobjectsize(head)freebuckets)[bucketindex]; /*prevfreelist=0;*/ while(freelist!=0) { objsize=getfreeobjectsize(dbfetch(db,freelist)); if (isfree) { if (offset==freelist) return 1; } else { if (offset==freelist) { printf("used object at offset %d in freelist for bucket %d\n", (int) offset, (int) bucketindex); return 51; } if (offset>freelist && freelist+objsize>offset) { printf("used object at offset %d inside freelist object at %d size %d for bucket %d\n", (int) offset, (int) freelist, (int) objsize, (int) bucketindex); return 52; } } freelist=dbfetch(db,freelist+sizeof(gint)); } return 0; } /* --------------------- index testing ------------------------ */ /** Test data inserting with indexed column * */ gint wg_test_index1(void *db, int magnitude, int printlevel) { const int dbsize = 50*magnitude, rand_updates = magnitude; int i, j; void *start = NULL, *rec = NULL; gint oldv, newv; db_memsegment_header* dbh = dbmemsegh(db); #ifdef _WIN32 srand(102435356); #else srandom(102435356); /* fixed seed for repeatable sequences */ #endif if(wg_column_to_index_id(db, 0, WG_INDEX_TYPE_TTREE, NULL, 0) == -1) { if(printlevel > 1) 
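/*
 * Illustration sketch, not part of the original test suite: creating a
 * T-tree index on column 0 and inserting indexed rows, the same pattern
 * wg_test_index1() uses below. The helper name, row count and values are
 * made up for the example.
 */
static int example_indexed_insert(void* db) {
  int i;

  /* create the index only if the column does not already have one */
  if (wg_column_to_index_id(db, 0, WG_INDEX_TYPE_TTREE, NULL, 0) == -1) {
    if (wg_create_index(db, 0, WG_INDEX_TYPE_TTREE, NULL, 0)) return 1;
  }
  /* ordinary field updates keep the index in sync automatically */
  for (i = 0; i < 100; i++) {
    void* rec = wg_create_record(db, 1);
    if (rec == NULL) return 1;
    if (wg_set_field(db, rec, 0, wg_encode_int(db, (i * 7) % 13))) return 1;
  }
  return 0;
}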
printf("no index found on column 0, creating.\n"); if(wg_create_index(db, 0, WG_INDEX_TYPE_TTREE, NULL, 0)) { if(printlevel) fprintf(stderr, "index creation failed, aborting.\n"); return -3; } } if(printlevel > 1) { printf("------- tnode_area stats before insert --------\n"); wg_show_db_area_header(db,&(dbh->tnode_area_header)); } /* 1st loop: insert data in set 1 */ for(i=0; i>4; #else newv = random()>>4; #endif if(wg_set_field(db, rec, 0, wg_encode_int(db, newv))) { if(printlevel) fprintf(stderr, "insert error, aborting.\n"); return -1; } } if(validate_index(db, start, dbsize, 0, printlevel)) { if(printlevel) fprintf(stderr, "index validation failed after insert.\n"); return -2; } if(printlevel > 1) { printf("------- tnode_area stats after insert --------\n"); wg_show_db_area_header(db,&(dbh->tnode_area_header)); } /* 2nd loop: keep updating with random data */ for(j=0; j>4; #else newv = random()>>4; #endif if(wg_set_field(db, rec, 0, wg_encode_int(db, newv))) { if(printlevel) { printf("loop: %d row: %d old: %d new: %d\n", j, i, (int) oldv, (int) newv); fprintf(stderr, "insert error, aborting.\n"); } return -2; } if(validate_index(db, start, dbsize, 0, printlevel)) { if(printlevel) { printf("loop: %d row: %d old: %d new: %d\n", j, i, (int) oldv, (int) newv); fprintf(stderr, "index validation failed after update.\n"); } return -2; } } } if(printlevel > 1) { printf("------- tnode_area stats after update --------\n"); wg_show_db_area_header(db,&(dbh->tnode_area_header)); } return 0; } /** Quick index test to check basic behaviour * indexes existing data in database and validates the resulting index */ gint wg_test_index2(void *db, int printlevel) { int i, dbsize; void *rec, *start; if (printlevel>1) printf("********* testing T-tree index ********** \n"); for(i=0; i<10; i++) { if(wg_column_to_index_id(db, i, WG_INDEX_TYPE_TTREE, NULL, 0) == -1) { if(wg_create_index(db, i, WG_INDEX_TYPE_TTREE, NULL, 0)) { if (printlevel) printf("index creation failed, aborting.\n"); return -3; } } } start = rec = wg_get_first_record(db); dbsize = 0; /* Get the number of records in database */ while(rec) { dbsize++; rec = wg_get_next_record(db, rec); } if(!dbsize) return 0; /* no data, so nothing more to do */ for(i=0; i<10; i++) { if(validate_index(db, start, dbsize, i, printlevel)) { if (printlevel) printf("index validation failed.\n"); return -2; } } if (printlevel>1) printf("********* index test successful ********** \n"); return 0; } /** Validate index * 1. validates a set of rows starting from *rec. * 2. checks tree balance * 3. 
checks tree min/max values * returns 0 if no errors found * returns -1 if value was not indexed * returns -2 if there was another error */ static int validate_index(void *db, void *rec, int rows, int column, int printlevel) { gint index_id = wg_column_to_index_id(db, column, WG_INDEX_TYPE_TTREE, NULL, 0); gint tnode_offset; wg_index_header *hdr; if(index_id == -1) return -2; /* Check if all values are indexed */ while(rec && rows) { if(wg_get_record_len(db, rec) > column) { gint val = wg_get_field(db, rec, column); if(wg_search_ttree_index(db, index_id, val) < 1) { if(printlevel) { printf("missing: %d\n", (int) val); } return -1; } } rec = wg_get_next_record(db, rec); rows--; } hdr = (wg_index_header *) offsettoptr(db, index_id); if(((struct wg_tnode *)(offsettoptr(db, TTREE_ROOT_NODE(hdr))))->parent_offset != 0) { if(printlevel) printf("root node parent offset is not 0\n"); return -2; } #ifdef TTREE_CHAINED_NODES if(TTREE_MIN_NODE(hdr) == 0) { if(printlevel) printf("min node offset is 0\n"); return -2; } if(TTREE_MAX_NODE(hdr) == 0) { if(printlevel) printf("max node offset is 0\n"); return -2; } #endif #ifdef TTREE_CHAINED_NODES tnode_offset = TTREE_MIN_NODE(hdr); #else tnode_offset = wg_ttree_find_lub_node(db, TTREE_ROOT_NODE(hdr)); #endif while(tnode_offset) { int diff; gint minval, maxval; struct wg_tnode *node = (struct wg_tnode *) offsettoptr(db, tnode_offset); /* Check index tree balance */ diff = node->left_subtree_height - node->right_subtree_height; if(diff < -1 || diff > 1) return -2; /* Check min/max values */ minval = wg_get_field(db, offsettoptr(db, node->array_of_values[0]), column); maxval = wg_get_field(db, offsettoptr(db, node->array_of_values[node->number_of_elements - 1]), column); if(minval != node->current_min) { if(printlevel) { printf("current_min invalid: %d is: %d should be: %d\n", (int) tnode_offset, (int) node->current_min, (int) minval); } return -2; } if(maxval != node->current_max) { if(printlevel) { printf("current_max invalid: %d is: %d should be: %d\n", (int) tnode_offset, (int) node->current_max, (int) maxval); } return -2; } tnode_offset = TNODE_SUCCESSOR(db, node); } return 0; } /* -------------------- child db testing ------------------------ */ #ifdef USE_CHILD_DB static int childdb_mkindex(void *db, int cnt) { int i; for(i=0; i 1) printf("checking (%p %d).\n", dbmemseg(db), i); if(validate_index(db, start, dbsize, i, printlevel)) { if(printlevel) printf("index validation failed (%p %d).\n", dbmemseg(db), i); return 0; } } return 1; } static int childdb_dropindex(void *db, int cnt) { int i; for(i=0; i1) { printf("********* testing child database ********** \n"); } foo = wg_attach_local_database(500000); if(foo) { if(printlevel>1) { #ifndef _WIN32 printf("Parent: %p free %td.\nChild: %p free %td extdbs %td size %td\n", #else printf("Parent: %p free %Id.\nChild: %p free %Id extdbs %Id size %Id\n", #endif dbmemseg(db), dbmemsegh(db)->free, dbmemseg(foo), dbmemsegh(foo)->free, dbmemsegh(foo)->extdbs.count, dbmemsegh(foo)->size); } } else { printf("Failed to attach to local database.\n"); return 1; } if(dbmemsegh(db)->key != 0) { /* Test invalid registering */ if(!wg_register_external_db(db, foo)) { if(printlevel) printf("Registering the local db in a shared db succeeded, should have failed\n"); wg_delete_local_database(foo); return 1; } } /* Records in parent db */ rec1 = (void *) wg_create_raw_record(db, 3); rec2 = (void *) wg_create_raw_record(db, 3); str1 = wg_encode_str(db, "hello", NULL); wg_set_new_field(db, rec1, 0, str1); wg_set_new_field(db, 
rec1, 1, wg_encode_str(db, "world", NULL)); wg_set_new_field(db, rec1, 2, wg_encode_double(db, 1.234)); wg_set_new_field(db, rec2, 0, wg_encode_record(db, rec1)); str2 = wg_encode_str(db, "bar", NULL); wg_set_new_field(db, rec2, 1, str2); /* Records in child db */ foorec1 = (void *) wg_create_raw_record(foo, 3); foorec2 = (void *) wg_create_raw_record(foo, 3); tmp = wg_encode_external_data(foo, db, str1); /* Try storing external data */ if(printlevel>1) { printf("Expecting an error: \"wg data handling error: "\ "External reference not recognized\".\n"); } if(!wg_set_new_field(foo, foorec1, 0, tmp)) { if(printlevel) printf("Storing external data succeeded, should have failed\n"); wg_delete_local_database(foo); return 1; } /* Test indexes */ if(printlevel>1) { printf("Testing child database index.\n"); } if(!childdb_mkindex(foo, 3)) { if(printlevel) printf("Child database index creation failed\n"); wg_delete_local_database(foo); return 1; } if(!childdb_ckindex(foo, 3, printlevel)) { if(printlevel) printf("Child database index test failed\n"); wg_delete_local_database(foo); return 1; } /* Test registering (should fail, as we have indexes) */ if(printlevel>1) { printf("Expecting an error: \"db memory allocation error: "\ "Database has indexes, external references not allowed\".\n"); } if(!wg_register_external_db(foo, db)) { if(printlevel) printf("Registering the external db succeeded, but we have indexes\n"); wg_delete_local_database(foo); return 1; } if(!childdb_dropindex(foo, 3)) { if(printlevel) printf("Dropping indexes failed\n"); wg_delete_local_database(foo); return 1; } /* Test registering again */ if(wg_register_external_db(foo, db)) { if(printlevel) printf("Registering the shared db in local db failed, should have succeeded\n"); wg_delete_local_database(foo); return 1; } if(printlevel>1) { printf("Expecting an error: \"index error: "\ "Database has external data, indexes disabled\".\n"); } if(childdb_mkindex(foo, 1)) { if(printlevel) printf("Child database index creation succeeded (should have failed)\n"); wg_delete_local_database(foo); return 1; } /* Storing external data should now work */ if(wg_set_new_field(foo, foorec1, 0, tmp)) { if(printlevel) printf("Storing external data failed, should have succeeded\n"); wg_delete_local_database(foo); return 1; } wg_set_new_field(foo, foorec1, 1, wg_encode_str(foo, "local data", NULL)); tmp = wg_encode_external_data(foo, db, wg_encode_record(db, rec1)); wg_set_new_field(foo, foorec2, 0, tmp); wg_set_new_field(foo, foorec2, 1, wg_encode_str(foo, "more local data", NULL)); tmp = wg_encode_external_data(foo, db, str2); wg_set_new_field(foo, foorec2, 2, tmp); if(printlevel>1) { printf("Testing data comparing.\n"); } /* Test comparing */ foorec3 = (void *) wg_create_raw_record(foo, 3); foorec4 = (void *) wg_create_raw_record(foo, 3); wg_set_new_field(foo, foorec3, 0, wg_encode_str(foo, "hello", NULL)); wg_set_new_field(foo, foorec3, 1, wg_encode_str(foo, "world", NULL)); wg_set_new_field(foo, foorec3, 2, wg_encode_double(foo, 1.234)); wg_set_new_field(foo, foorec4, 0, wg_encode_record(foo, foorec3)); wg_set_new_field(foo, foorec4, 1, wg_encode_str(foo, "more local data", NULL)); tmp = wg_encode_external_data(foo, db, str2); wg_set_new_field(foo, foorec4, 2, tmp); #if WG_COMPARE_REC_DEPTH > 2 /* foorec2 and foorec4 should be equal */ if(WG_COMPARE(foo, wg_encode_record(foo, foorec2), wg_encode_record(foo, foorec4)) != WG_EQUAL) { if(printlevel) printf("foorec2 and foorec4 were not equal, but should be.\n"); wg_delete_local_database(foo); return 1; 
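/*
 * Illustration sketch, not part of the original test suite: linking a local
 * ("child") database to a parent database, the pattern wg_check_childdb()
 * exercises here. The helper name and sizes are made up; it requires
 * USE_CHILD_DB and a child database without indexes, and the parent must be
 * registered before its data can be referenced.
 */
static int example_external_reference(void* db) {
#ifdef USE_CHILD_DB
  void *local, *parent_rec, *local_rec;
  gint ext;

  local = wg_attach_local_database(500000);
  if (local == NULL) return 1;
  if (wg_register_external_db(local, db)) {    /* register the parent first */
    wg_delete_local_database(local);
    return 1;
  }
  parent_rec = wg_create_raw_record(db, 1);
  wg_set_new_field(db, parent_rec, 0, wg_encode_int(db, 5));

  local_rec = wg_create_raw_record(local, 1);
  ext = wg_encode_external_data(local, db, wg_encode_record(db, parent_rec));
  if (wg_set_new_field(local, local_rec, 0, ext)) {  /* store the external ref */
    wg_delete_local_database(local);
    return 1;
  }
  wg_delete_local_database(local);
  return 0;
#else
  return 0;
#endif
}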
} /* rec1 and foorec3 should be equal */ if(WG_COMPARE(foo, wg_encode_external_data(foo, db, wg_encode_record(db, rec1)), wg_encode_record(foo, foorec3)) != WG_EQUAL) { if(printlevel) printf("rec1 and foorec3 were not equal, but should be.\n"); wg_delete_local_database(foo); return 1; } #endif /* sanity check: foorec3 and foorec4 should not be equal */ if(WG_COMPARE(foo, wg_encode_record(foo, foorec3), wg_encode_record(foo, foorec4)) == WG_EQUAL) { if(printlevel) printf("foorec3 and foorec4 were equal, but should not be.\n"); wg_delete_local_database(foo); return 1; } #ifdef USE_BACKLINKING /* Test deleting */ if(wg_delete_record(db, rec1) != -1) { if(printlevel) printf("Deleting referenced parent rec1 succeeded (should have failed)\n"); wg_delete_local_database(foo); return 1; } #else if(wg_delete_record(db, rec1) != 0) { if(printlevel) printf("Deleting parent rec1 failed (should have succeeded)\n"); wg_delete_local_database(foo); return 1; } #endif if(wg_delete_record(db, rec2) != 0) { if(printlevel) printf("Deleting non-referenced parent rec2 failed (should have succeeded)\n"); wg_delete_local_database(foo); return 1; } if(wg_delete_record(foo, foorec2) != 0) { if(printlevel) printf("Deleting child foorec2 failed (should have succeeded)\n"); wg_delete_local_database(foo); return 1; } /* right now string refcounts are a bit fishy... skip this */ /* wg_set_field(foo, foorec4, 2, tmp); */ /* this should fail, but we don't want to interact with the * filesystem in these automated tests wg_dump(foo, "invalid.bin");*/ wg_delete_local_database(foo); if(printlevel>1) printf("********* child database test successful ********** \n"); #else printf("child databases disabled, skipping checks\n"); #endif return 0; } /* ---------------- schema/JSON related tests ----------------- */ /* * Run this on a dedicated database to check the effects * of param bits and deleting. */ gint wg_check_schema(void* db, int printlevel) { void *rec, *arec, *orec, *trec; gint *gptr; gint tmp1, tmp2, tmp3; if(printlevel>1) { printf("********* testing schema functions ********** \n"); } tmp1 = wg_encode_int(db, 99); tmp2 = wg_encode_int(db, 98); tmp3 = wg_encode_int(db, 97); /* Triple */ rec = wg_create_triple(db, tmp1, tmp2, tmp3, 0); /* Check the record (fields and meta bits). * it is not a param. */ gptr = ((gint *) rec + RECORD_META_POS); if(*gptr) { if(printlevel) { printf("plain triple is expected to have no meta bits\n"); } return 1; } gptr = ((gint *) rec + RECORD_HEADER_GINTS + WG_SCHEMA_TRIPLE_OFFSET); if(*gptr != tmp1) { if(printlevel) printf("triple field 1 does not match\n"); return 1; } if(*(gptr+1) != tmp2) { if(printlevel) printf("triple field 2 does not match\n"); return 1; } if(*(gptr+2) != tmp3) { if(printlevel) printf("triple field 3 does not match\n"); return 1; } /* the next triple is a param. */ rec = wg_create_triple(db, tmp1, tmp2, tmp3, 1); gptr = ((gint *) rec + RECORD_META_POS); if(*gptr != (RECORD_META_NOTDATA|RECORD_META_MATCH)) { if(printlevel) { #ifndef _WIN32 printf("param triple had invalid meta bits (%td)\n", *gptr); #else printf("param triple had invalid meta bits (%Id)\n", *gptr); #endif } return 1; } /* kv-pair */ rec = wg_create_kvpair(db, tmp2, tmp3, 1); /* Check the record (fields and meta bits). * it is a param. 
*/ gptr = ((gint *) rec + RECORD_META_POS); if(*gptr != (RECORD_META_NOTDATA|RECORD_META_MATCH)) { if(printlevel) { #ifndef _WIN32 printf("param kv-pair had invalid meta bits (%td)\n", *gptr); #else printf("param kv-pair had invalid meta bits (%Id)\n", *gptr); #endif } return 1; } gptr = ((gint *) rec + RECORD_HEADER_GINTS + WG_SCHEMA_TRIPLE_OFFSET); if(*gptr != 0) { if(printlevel) printf("kv-pair prefix is not NULL\n"); return 1; } if(*(gptr+1) != tmp2) { if(printlevel) printf("kv-pair key does not match\n"); return 1; } if(*(gptr+2) != tmp3) { if(printlevel) printf("kv-pair value does not match\n"); return 1; } /* this is not a param. */ rec = wg_create_triple(db, tmp1, tmp2, tmp3, 0); gptr = ((gint *) rec + RECORD_META_POS); if(*gptr) { if(printlevel) { printf("plain kv-pair is expected to have no meta bits\n"); } return 1; } /* params should be invisible */ if(check_db_rows(db, 2, printlevel)) { if(printlevel) printf("row count check failed (should have 2 non-param rows).\n"); return 1; } /* Object */ orec = wg_create_object(db, 1, 0, 0); if(wg_get_record_len(db, orec) != 1) { if(printlevel) { printf("object had invalid length\n"); } return 1; } gptr = ((gint *) orec + RECORD_META_POS); if(*gptr != RECORD_META_OBJECT) { if(printlevel) { #ifndef _WIN32 printf("object (nonparam) had invalid meta bits (%td)\n", *gptr); #else printf("object (nonparam) had invalid meta bits (%Id)\n", *gptr); #endif } return 1; } wg_set_field(db, orec, 0, wg_encode_record(db, rec)); /* Array. It has the document bit set. */ arec = wg_create_array(db, 4, 1, 0); if(wg_get_record_len(db, arec) != 4) { if(printlevel) { printf("array had invalid length\n"); } return 1; } gptr = ((gint *) arec + RECORD_META_POS); if(*gptr != (RECORD_META_ARRAY|RECORD_META_DOC)) { if(printlevel) { #ifndef _WIN32 printf("array (doc, nonparam) had invalid meta bits (%td)\n", *gptr); #else printf("array (doc, nonparam) had invalid meta bits (%Id)\n", *gptr); #endif } return 1; } /* Form the document. */ wg_set_field(db, arec, 0, tmp3); wg_set_field(db, arec, 1, tmp2); wg_set_field(db, arec, 2, tmp1); wg_set_field(db, arec, 3, wg_encode_record(db, orec)); #ifdef USE_BACKLINKING /* Locate the document through an element. */ trec = wg_find_document(db, rec); if(trec != arec) { if(printlevel) { printf("wg_find_document() failed\n"); } return 1; } #endif if(wg_delete_document(db, arec)) { if(printlevel) { printf("wg_delete_document() failed\n"); } return 1; } /* of the two rows in db earlier, one was included in the * deleted document. One should be remaining. */ if(check_db_rows(db, 1, printlevel)) { if(printlevel) printf("Invalid number of remaining rows after deleting.\n"); return 1; } /* Check the param bits of object and array. */ orec = wg_create_object(db, 5, 0, 1); gptr = ((gint *) orec + RECORD_META_POS); if(*gptr != (RECORD_META_OBJECT|RECORD_META_NOTDATA|RECORD_META_MATCH)) { if(printlevel) { #ifndef _WIN32 printf("object (param) had invalid meta bits (%td)\n", *gptr); #else printf("object (param) had invalid meta bits (%Id)\n", *gptr); #endif } return 1; } arec = wg_create_array(db, 6, 0, 1); gptr = ((gint *) arec + RECORD_META_POS); if(*gptr != (RECORD_META_ARRAY|RECORD_META_NOTDATA|RECORD_META_MATCH)) { if(printlevel) { #ifndef _WIN32 printf("array (param) had invalid meta bits (%td)\n", *gptr); #else printf("array (param) had invalid meta bits (%Id)\n", *gptr); #endif } return 1; } /* we added params, row count should not increase. 
*/ if(check_db_rows(db, 1, printlevel)) { if(printlevel) printf("Invalid number of remaining rows after deleting.\n"); return 1; } if(printlevel>1) printf("********* schema test successful ********** \n"); return 0; } /* * Test JSON parsing. This produces some errors in stderr * which is expected (rely on the return value to check for success). */ gint wg_check_json_parsing(void* db, int printlevel) { void *doc, *rec; gint enc; char *json1 = "[7,8,9]"; /* ok */ char *json2 = "{ \"a\":{\n\"b\": 55.0\n}, \"c\"\n:\"hello\","\ "\"d\"\t:[\n]}"; /* ok */ char *json3 = "25"; /* fail */ char *json4 = "{ \"a\":{\"b\": 55.0}, \"c\":\"hello\""; /* fail */ if(printlevel>1) { printf("********* testing JSON parsing functions ********** \n"); } /* parse input buf. */ if(wg_parse_json_document(db, json1)) { if(printlevel) printf("Parsing a valid document failed.\n"); return 1; } /* Use the param parser to get direct access to the * document structure. */ doc = NULL; if(wg_parse_json_param(db, json2, &doc)) { if(printlevel) printf("Parsing a valid document failed.\n"); return 1; } if(!doc) { if(printlevel) printf("Param parser did not return a document.\n"); return 1; } /* examine structure */ if(wg_get_record_len(db, doc) != 3) { if(printlevel) printf("Document structure error: bad object length.\n"); return 1; } if(!is_special_record(doc) || !is_schema_document(doc) ||\ !is_schema_object(doc)) { if(printlevel) { printf("Document structure error: invalid meta type\n"); } return 1; } /* first kv-pair */ enc = wg_get_field(db, doc, 0); if(wg_get_encoded_type(db, enc) != WG_RECORDTYPE) { if(printlevel) printf("Document structure error: bad object element(0).\n"); return 1; } rec = wg_decode_record(db, enc); enc = wg_get_field(db, rec, WG_SCHEMA_KEY_OFFSET); if(wg_get_encoded_type(db, enc) != WG_STRTYPE) { if(printlevel) printf("Document structure error: bad key type.\n"); return 1; } if(strncmp("a", wg_decode_str(db, enc), 1)) { if(printlevel) printf("Document structure error: bad key string.\n"); return 1; } enc = wg_get_field(db, rec, WG_SCHEMA_VALUE_OFFSET); if(wg_get_encoded_type(db, enc) != WG_RECORDTYPE) { if(printlevel) printf("Document structure error: bad value type.\n"); return 1; } rec = wg_decode_record(db, enc); if(wg_get_record_len(db, rec) != 1) { if(printlevel) printf("Document structure error: bad sub-object length.\n"); return 1; } if(!is_schema_object(rec)) { if(printlevel) { printf("Document structure error: sub-object has invalid meta type\n"); } } enc = wg_get_field(db, rec, 0); if(wg_get_encoded_type(db, enc) != WG_RECORDTYPE) { if(printlevel) printf("Document structure error: bad sub-object element(0).\n"); return 1; } rec = wg_decode_record(db, enc); enc = wg_get_field(db, rec, WG_SCHEMA_KEY_OFFSET); if(wg_get_encoded_type(db, enc) != WG_STRTYPE) { if(printlevel) printf("Document structure error: bad subobj key type.\n"); return 1; } if(strncmp("b", wg_decode_str(db, enc), 1)) { if(printlevel) printf("Document structure error: bad subobj key string.\n"); return 1; } enc = wg_get_field(db, rec, WG_SCHEMA_VALUE_OFFSET); if(wg_get_encoded_type(db, enc) != WG_DOUBLETYPE) { if(printlevel) printf("Document structure error: bad subobj value type.\n"); return 1; } if(wg_decode_double(db, enc) >= 55.1 ||\ wg_decode_double(db, enc) <= 54.9) { if(printlevel) printf("Document structure error: bad subobj value.\n"); return 1; } /* second kv-pair */ enc = wg_get_field(db, doc, 1); if(wg_get_encoded_type(db, enc) != WG_RECORDTYPE) { if(printlevel) printf("Document structure error: bad object 
element(1).\n"); return 1; } rec = wg_decode_record(db, enc); enc = wg_get_field(db, rec, WG_SCHEMA_KEY_OFFSET); if(wg_get_encoded_type(db, enc) != WG_STRTYPE) { if(printlevel) printf("Document structure error: bad key type.\n"); return 1; } if(strncmp("c", wg_decode_str(db, enc), 1)) { if(printlevel) printf("Document structure error: bad key string.\n"); return 1; } enc = wg_get_field(db, rec, WG_SCHEMA_VALUE_OFFSET); if(wg_get_encoded_type(db, enc) != WG_STRTYPE) { if(printlevel) printf("Document structure error: value type.\n"); return 1; } if(strncmp("hello", wg_decode_str(db, enc), 5)) { if(printlevel) printf("Document structure error: bad value.\n"); return 1; } /* third kv-pair */ enc = wg_get_field(db, doc, 2); if(wg_get_encoded_type(db, enc) != WG_RECORDTYPE) { if(printlevel) printf("Document structure error: bad object element(0).\n"); return 1; } rec = wg_decode_record(db, enc); enc = wg_get_field(db, rec, WG_SCHEMA_KEY_OFFSET); if(wg_get_encoded_type(db, enc) != WG_STRTYPE) { if(printlevel) printf("Document structure error: bad key type.\n"); return 1; } if(strncmp("d", wg_decode_str(db, enc), 1)) { if(printlevel) printf("Document structure error: bad key string.\n"); return 1; } enc = wg_get_field(db, rec, WG_SCHEMA_VALUE_OFFSET); if(wg_get_encoded_type(db, enc) != WG_RECORDTYPE) { if(printlevel) printf("Document structure error: bad value type.\n"); return 1; } rec = wg_decode_record(db, enc); if(!is_schema_array(rec)) { if(printlevel) { printf("Document structure error: bad value (array expected)\n"); } } if(wg_get_record_len(db, rec) != 0) { if(printlevel) printf("Document structure error: bad array length.\n"); return 1; } /* Invalid documents, expect a failure. */ if(!wg_parse_json_document(db, json3)) { if(printlevel) printf("Parsing an invalid document succeeded.\n"); return 1; } if(!wg_parse_json_param(db, json4, &doc)) { if(printlevel) printf("Parsing an invalid document succeeded.\n"); return 1; } if(printlevel>1) printf("********* JSON parsing test successful ********** \n"); return 0; } /* * Returns 1 if the offset is in list. * Returns 0 otherwise. 
*/ static int is_offset_in_list(void *db, gint reclist_offset, gint offset) { if(reclist_offset > 0) { gint *nextoffset = &reclist_offset; while(*nextoffset) { gcell *rec_cell = (gcell *) offsettoptr(db, *nextoffset); if(rec_cell->car == offset) return 1; nextoffset = &(rec_cell->cdr); } } return 0; } /* * Test index hash (low-level functions) */ gint wg_check_idxhash(void* db, int printlevel) { db_hash_area_header ha; struct { char *data; gint offsets[10]; int delidx; } rowdata[] = { { "0iQ1vMvGX5wfsjLTssyx", { 5709281, 5769186, 0, 0, 0, 0, 0, 0, 0, 0 }, 1 }, { "1jP3hJxO61QVscBEKu9", { 3510018, 8944261, 8172536, 4346587, 0, 0, 0, 0, 0, 0 }, 2 }, { "yLMt2eSQuIi3ChQlI0", { 6587099, 6385516, 0, 0, 0, 0, 0, 0, 0, 0 }, 1 }, { "ZlGS9cVX7fE1v7H6m", { 2059694, 1981000, 8360987, 752526, 6435820, 240982, 323628, 8875951, 0, 0 }, 1 }, { "duflillyRviJ1ZvH", { 6711262, 9685175, 4070003, 5977585, 9671591, 5321015, 7499127, 9101853, 0, 0 }, 2 }, { "USLP83gH6f4pNYJ", { 8759349, 436333, 0, 0, 0, 0, 0, 0, 0, 0 }, 1 }, { "yHIDgxlEA7RLAx", { 7613500, 534106, 4361094, 1506219, 0, 0, 0, 0, 0, 0 }, 1 }, { " ", { 6588510, 6253610, 9020726, 8514572, 9378303, 1100373, 0, 0, 0, 0 }, 2 }, { " ", { 8185484, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, 0 }, { "yHIDgxlEA7R", { 2542797, 6481658, 214793, 943434, 2934816, 9503963, 1374313, 0, 0, 0 }, 4 }, { NULL, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, -1 } }; int i; if(printlevel>1) { printf("********* testing index hash functions ********** \n"); } /* Create a tiny hash table to allow hash chains to be created. */ if(wg_create_hash(db, &ha, 4)) { if(printlevel) printf("Failed to create the hash table.\n"); return 1; } /* Insert rows in order of columns */ for(i=0; i<10; i++) { int j; for(j=0; rowdata[j].data; j++) { if(rowdata[j].offsets[i]) { if(wg_idxhash_store(db, &ha, rowdata[j].data, strlen(rowdata[j].data), rowdata[j].offsets[i])) { if(printlevel) printf("Hash table insertion failed (j=%d i=%d).\n", j, i); return 1; } } } } /* Check that each offset is present */ for(i=0; rowdata[i].data; i++) { int j; gint list = wg_idxhash_find(db, &ha, rowdata[i].data, strlen(rowdata[i].data)); for(j=0; j<10 && rowdata[i].offsets[j]; j++) { if(!is_offset_in_list(db, list, rowdata[i].offsets[j])) { if(printlevel) printf("Offset missing in hash table (i=%d j=%d).\n", i, j); return 1; } } } /* Delete the rows designated by delidx */ for(i=0; rowdata[i].data; i++) { if(wg_idxhash_remove(db, &ha, rowdata[i].data, strlen(rowdata[i].data), rowdata[i].offsets[rowdata[i].delidx])) { if(printlevel) printf("Hash table deletion failed (i=%d delidx=%d).\n", i, rowdata[i].delidx); return 1; } } /* Check that the deleted row is not present and that all the others are */ for(i=0; rowdata[i].data; i++) { int j; gint list = wg_idxhash_find(db, &ha, rowdata[i].data, strlen(rowdata[i].data)); for(j=0; j<10 && rowdata[i].offsets[j]; j++) { if(j == rowdata[i].delidx) { /* Should be missing */ if(is_offset_in_list(db, list, rowdata[i].offsets[j])) { if(printlevel) printf("Offset not correctly deleted (i=%d delidx=%d).\n", i, rowdata[i].delidx); return 1; } } else { /* Should be present */ if(!is_offset_in_list(db, list, rowdata[i].offsets[j])) { if(printlevel) printf("Offset missing in hash table (i=%d j=%d).\n", i, j); return 1; } } } } if(printlevel>1) printf("********* index hash test successful ********** \n"); return 0; } /* --------------------- query testing ------------------------ */ /** * Fetch all rows where "col" "cond" "val" is true * (where cond is a comparison operator - equal, less than etc) * Check that 
the val matches the field value in returned records. * Check that the number of rows matches the expected value */ static int check_matching_rows(void *db, int col, int cond, void *val, gint type, int expected, int printlevel) { void *rec = NULL; wg_query *query = NULL; wg_query_arg arglist; int cnt; arglist.column = col; arglist.cond = cond; switch(type) { case WG_INTTYPE: arglist.value = wg_encode_query_param_int(db, *((gint *) val)); break; case WG_DOUBLETYPE: arglist.value = wg_encode_query_param_double(db, *((double *) val)); break; case WG_STRTYPE: arglist.value = wg_encode_query_param_str(db, (char *) val, NULL); break; default: return -1; } query = wg_make_query(db, NULL, 0, &arglist, 1); if(!query) { return -2; } if(query->res_count != expected) { if(printlevel) printf("check_matching_rows: res_count mismatch (%d != %d)\n", (int) query->res_count, expected); return -3; } cnt = 0; while((rec = wg_fetch(db, query))) { gint enc = wg_get_field(db, rec, col); if(cond == WG_COND_EQUAL) { switch(type) { case WG_INTTYPE: if(wg_decode_int(db, enc) != *((int *) val)) { if(printlevel) printf("check_matching_rows: int value mismatch\n"); return -4; } break; case WG_DOUBLETYPE: if(wg_decode_double(db, enc) != *((double *) val)) { if(printlevel) printf("check_matching_rows: double value mismatch\n"); return -4; } break; case WG_STRTYPE: if(strcmp(wg_decode_str(db, enc), (char *) val)) { if(printlevel) printf("check_matching_rows: string value mismatch\n"); return -4; } break; default: break; } } cnt++; } if(cnt != expected) { if(printlevel) printf("check_matching_rows: actual count mismatch (%d != %d)\n", cnt, expected); return -5; } wg_free_query(db, query); wg_free_query_param(db, arglist.value); return 0; } /** * version of check_matching_rows() using wg_find_record_*() */ static int check_matching_rows_find(void *db, int col, int cond, void *val, gint type, int expected, int printlevel) { void *rec = NULL; int cnt = 0; for(;;) { switch(type) { case WG_INTTYPE: rec = wg_find_record_int(db, col, cond, *((int *) val), rec); break; case WG_DOUBLETYPE: rec = wg_find_record_double(db, col, cond, *((double *) val), rec); break; case WG_STRTYPE: rec = wg_find_record_str(db, col, cond, (char *) val, rec); break; default: break; } if(!rec) break; cnt++; } if(cnt != expected) { if(printlevel) printf("check_matching_rows_find: actual count mismatch (%d != %d)\n", cnt, expected); return -5; } return 0; } /** * Count db rows */ static int check_db_rows(void *db, int expected, int printlevel) { void *rec = NULL; int cnt; rec = wg_get_first_record(db); cnt = 0; while(rec) { cnt++; rec = wg_get_next_record(db, rec); } if(cnt != expected) { if(printlevel) printf("check_db_rows: actual count mismatch (%d != %d)\n", cnt, expected); return -1; } return 0; } /** * Basic query tests */ gint wg_test_query(void *db, int magnitude, int printlevel) { const int dbsize = 50*magnitude; int i, j, k; void *rec = NULL; if(wg_column_to_index_id(db, 0, WG_INDEX_TYPE_TTREE, NULL, 0) == -1) { if(printlevel > 1) printf("no index found on column 0, creating.\n"); if(wg_create_index(db, 0, WG_INDEX_TYPE_TTREE, NULL, 0)) { if(printlevel) printf("index creation failed, aborting.\n"); return -3; } } if(printlevel > 1) printf("------- Inserting test data --------\n"); /* Create predictable data */ for(i=0; i 1) printf("no index found on column 2, creating.\n"); if(wg_create_index(db, 2, WG_INDEX_TYPE_TTREE, NULL, 0)) { if(printlevel) printf("index creation failed, aborting.\n"); return -3; } } if(printlevel > 1) printf("------- 
Running read query tests --------\n"); /* Content check read queries */ for(i=0; i 1) printf("------- Running find tests --------\n"); /* Content check read queries */ for(i=0; i 1) printf("------- Updating test data --------\n"); /* Update queries */ for(i=0; i 1) printf("------- Running read query tests --------\n"); /* Content check read queries, iteration 2 */ for(i=0; i 1) printf("------- Running delete queries --------\n"); /* Delete query */ for(i=0; i 1) printf("------- Checking row count --------\n"); /* Database scan */ if(check_db_rows(db, dbsize * (50 * 50 - 30 * 20), printlevel)) { if(printlevel) printf("row count check failed.\n"); return -7; } return 0; } /* ------------------------- log testing ------------------------ */ #ifndef _WIN32 #define LOG_TESTFILE "/tmp/wgdb.logtest" #else #define LOG_TESTFILE "c:\\windows\\temp\\wgdb.logtest" #endif gint wg_check_log(void* db, int printlevel) { #if defined(USE_DBLOG) db_memsegment_header* dbh = dbmemsegh(db); db_handle_logdata *ld = ((db_handle *) db)->logdata; void *clonedb; void *rec1, *rec2; gint tmp, str1, str2; char logfn[100]; int i, err, pid; int fd; if(printlevel>1) { printf("********* testing journal logging ********** \n"); } /* Set up the temporary log. We don't use the standard method as * that might interfere with real database logs. Also, normally * local databases are not logged. */ #ifndef _WIN32 pid = getpid(); #else pid = _getpid(); #endif snprintf(logfn, 99, "%s.%d", LOG_TESTFILE, pid); logfn[99] = '\0'; #ifdef _WIN32 if(_sopen_s(&fd, logfn, _O_CREAT|_O_APPEND|_O_BINARY|_O_RDWR, _SH_DENYNO, _S_IREAD|_S_IWRITE)) { #else if((fd = open(logfn, O_CREAT|O_APPEND|O_RDWR, S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH)) == -1) { #endif if(printlevel) printf("Failed to open the test journal\n"); return 1; } #ifndef _WIN32 if(write(fd, WG_JOURNAL_MAGIC, WG_JOURNAL_MAGIC_BYTES) != \ WG_JOURNAL_MAGIC_BYTES) { if(printlevel) printf("Failed to initialize the test journal\n"); close(fd); return 1; } #else if(_write(fd, WG_JOURNAL_MAGIC, WG_JOURNAL_MAGIC_BYTES) != \ WG_JOURNAL_MAGIC_BYTES) { if(printlevel) printf("Failed to initialize the test journal\n"); _close(fd); return 1; } #endif ld->fd = fd; ld->serial = dbh->logging.serial; dbh->logging.active = 1; /* Do various operations in the database: * Encode short/long strings, doubles, ints * Create records * Delete records * Set fields */ str1 = wg_encode_str(db, "0000000001000000000200000000030000000004", NULL); str2 = wg_encode_str(db, "00000000010000000002", NULL); tmp = wg_encode_double(db, -6543.3412); rec1 = wg_create_record(db, 7); wg_set_field(db, rec1, 4, str1); wg_set_field(db, rec1, 5, str2); wg_set_field(db, rec1, 6, tmp); if(printlevel) printf("Expecting a field index error:\n"); wg_set_field(db, rec1, 7, 0); /* Failed operation, shouldn't be logged */ rec2 = wg_create_record(db, 6); wg_set_field(db, rec2, 1, str1); wg_set_field(db, rec2, 3, str2); wg_set_field(db, rec2, 5, tmp); wg_delete_record(db, rec1); rec1 = wg_create_record(db, 10); for(i=0; i<10; i++) wg_set_field(db, rec1, i, wg_encode_int(db, (~((gint) 0))-i)); #ifndef _WIN32 close(ld->fd); #else _close(ld->fd); #endif ld->fd = -1; /* Replay the log in a clone database. * Note that replay normally restarts logging using the * standard configuration, but here this is not the case as * the logging.active flag is not set in the local database. 
*/ clonedb = wg_attach_local_database(800000); if(!clonedb) { if(printlevel) printf("Failed to create a second memory database\n"); remove(logfn); return 1; } if(wg_replay_log(clonedb, logfn)) { if(printlevel) printf("Failed to replay the journal\n"); wg_delete_local_database(clonedb); remove(logfn); return 1; } err = 0; /* Compare the databases */ rec1 = wg_get_first_record(db); rec2 = wg_get_first_record(clonedb); while(rec1) { int len1, len2; if(!rec2) { if(printlevel) printf("Error: clone database had fewer records\n"); err = 1; break; } len1 = wg_get_record_len(db, rec1); len2 = wg_get_record_len(clonedb, rec2); if(len1 != len2) { if(printlevel) printf("Error: records had different lengths\n"); err = 1; break; } for(i=0; i1) printf("********* journal logging test successful ********** \n"); #else printf("logging disabled, skipping checks\n"); #endif return 0; } /* ------------------ bulk testdata generation ---------------- */ /* Asc/desc/mix integer data functions originally written by Enar Reilent. * these functions will generate integer data of given * record size into database. */ /** Generate integer data with ascending values * */ int wg_genintdata_asc(void *db, int databasesize, int recordsize){ int i, j, tmp; void *rec; wg_int value = 0; int increment = 1; int incrementincrement = 17; int k = 0; for (i=0;i", (int) ptrdata); //len = strlen(buf); //if(buflen - len > 1) // snprint_record(db, (wg_int*)ptrdata, buf+len, buflen-len); break; case WG_INTTYPE: intdata = wg_decode_int(db, enc); if (issmallint(enc)) snprintf(buf, buflen, "smallint:%d", intdata); else snprintf(buf, buflen, "longint:%d", intdata); break; case WG_DOUBLETYPE: doubledata = wg_decode_double(db, enc); snprintf(buf, buflen, "double:%f", doubledata); break; case WG_STRTYPE: strdata = wg_decode_str(db, enc); if ((enc&NORMALPTRMASK)==LONGSTRBITS) { /*fieldoffset=decode_longstr_offset(enc)+LONGSTR_META_POS*sizeof(gint);*/ //printf("fieldoffset %d\n",fieldoffset); /*tmp=dbfetch(db,fieldoffset); */ offset=decode_longstr_offset(enc); refc=dbfetch(db,offset+LONGSTR_REFCOUNT_POS*sizeof(gint)); if (1) { //(tmp&LONGSTR_META_TYPEMASK)==WG_STRTYPE) { snprintf(buf, buflen, "longstr: len %d refcount %d str \"%s\" extrastr \"%s\"", (int) wg_decode_unistr_len(db,enc,type), (int) refc, wg_decode_unistr(db,enc,type), wg_decode_unistr_lang(db,enc,type)); } /* } else if ((tmp&LONGSTR_META_TYPEMASK)==WG_URITYPE) { snprintf(buf, buflen, "uri:\"%s\"", strdata); } else if ((tmp&LONGSTR_META_TYPEMASK)==WG_XMLLITERALTYPE) { snprintf(buf, buflen, "xmlliteral:\"%s\"", strdata); } else { snprintf(buf, buflen, "unknown_str_subtype %d",tmp&LONGSTR_META_TYPEMASK); } */ } else { snprintf(buf, buflen, "shortstr: len %d str \"%s\"", (int) wg_decode_str_len(db,enc), wg_decode_str(db,enc)); } break; case WG_URITYPE: strdata = wg_decode_uri(db, enc); exdata = wg_decode_uri_prefix(db, enc); snprintf(buf, buflen, "uri:\"%s%s\"", exdata, strdata); break; case WG_XMLLITERALTYPE: strdata = wg_decode_xmlliteral(db, enc); exdata = wg_decode_xmlliteral_xsdtype(db, enc); snprintf(buf, buflen, "xmlliteral:\"%s\"", exdata, strdata); break; case WG_BLOBTYPE: //strdata = wg_decode_blob(db, enc); //exdata = wg_decode_xmlliteral_xsdtype(db, enc); snprintf(buf, buflen, "blob: len %d extralen %d", (int) wg_decode_blob_len(db,enc), (int) wg_decode_blob_type_len(db,enc)); break; case WG_CHARTYPE: intdata = wg_decode_char(db, enc); snprintf(buf, buflen, "char:%c", (char) intdata); break; case WG_DATETYPE: intdata = wg_decode_date(db, enc); 
wg_strf_iso_datetime(db,intdata,0,strbuf); strbuf[10]=0; snprintf(buf, buflen, "date:%s", intdata,strbuf); break; case WG_TIMETYPE: intdata = wg_decode_time(db, enc); wg_strf_iso_datetime(db,1,intdata,strbuf); snprintf(buf, buflen, "time:%s",intdata,strbuf+11); break; default: snprintf(buf, buflen, ""); break; } printf("enc %d %s", (int) enc, buf); } /** * General sanity checks */ static int check_sanity(void *db) { #ifdef HAVE_64BIT_GINT if(sizeof(gint) != 8) { printf("gint size sanity check failed\n"); return 1; } #else if(sizeof(gint) != 4) { printf("gint size sanity check failed\n"); return 1; } #endif return 0; } #ifdef __cplusplus } #endif whitedb-0.7.2/Db/dbtest.h000066400000000000000000000052161226454622500151450ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbtest.h * Public headers for database testing procedures. */ #ifndef DEFINED_DBTEST_H #define DEFINED_DBTEST_H #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif /* ====== general typedefs and macros ======= */ #define WG_TEST_COMMON 0x01 #define WG_TEST_INDEX 0x02 #define WG_TEST_QUERY 0x04 #define WG_TEST_LOG 0x08 #define WG_TEST_QUICK (WG_TEST_COMMON|WG_TEST_LOG) #define WG_TEST_FULL (WG_TEST_QUICK|WG_TEST_INDEX|WG_TEST_QUERY) /* ==== Protos ==== */ int wg_run_tests(int tests, int printlevel); gint wg_check_db(void* db); gint wg_check_datatype_writeread(void* db, int printlevel); gint wg_check_backlinking(void* db, int printlevel); gint wg_check_parse_encode(void* db, int printlevel); gint wg_check_compare(void* db, int printlevel); gint wg_check_query_param(void* db, int printlevel); gint wg_check_strhash(void* db, int printlevel); gint wg_test_index1(void *db, int magnitude, int printlevel); gint wg_test_index2(void *db, int printlevel); gint wg_check_childdb(void* db, int printlevel); gint wg_check_schema(void* db, int printlevel); gint wg_check_json_parsing(void* db, int printlevel); gint wg_check_idxhash(void* db, int printlevel); gint wg_test_query(void *db, int magnitude, int printlevel); gint wg_check_log(void* db, int printlevel); void wg_show_db_memsegment_header(void* db); void wg_show_db_area_header(void* db, void* area_header); void wg_show_bucket_freeobjects(void* db, gint freelist); void wg_show_strhash(void* db); gint wg_count_freelist(void* db, gint freelist); int wg_genintdata_asc(void *db, int databasesize, int recordsize); int wg_genintdata_desc(void *db, int databasesize, int recordsize); int wg_genintdata_mix(void *db, int databasesize, int recordsize); void wg_debug_print_value(void *db, gint data); void wg_show_strhash(void* db); /* ------- testing ------------ */ #endif /* DEFINED_DBTEST_H */ whitedb-0.7.2/Db/dbutil.c000066400000000000000000001072031226454622500151350ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Priit 
Järv 2010,2011,2012,2013 * * Minor mods by Tanel Tammet. Triple handler for raptor and raptor * rdf parsing originally written by Tanel Tammet. * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbutil.c * Miscellaneous utility functions. */ /* ====== Includes =============== */ #include #include #include #include #ifdef HAVE_RAPTOR #include #endif /* ====== Private headers and defs ======== */ #ifdef __cplusplus extern "C" { #endif #include "dbdata.h" #include "dbutil.h" #include "dbquery.h" #ifdef _WIN32 #define snprintf(s, sz, f, ...) _snprintf_s(s, sz+1, sz, f, ## __VA_ARGS__) #define strncpy(d, s, sz) strncpy_s(d, sz+1, s, sz) #else /* Use error-detecting versions for other C libs */ #define atof(s) strtod(s, NULL) #define atol(s) strtol(s, NULL, 10) #endif #define CSV_FIELD_BUF 4096 /** max size of csv I/O field */ #define CSV_FIELD_SEPARATOR ',' /** field separator, comma or semicolon */ #define CSV_DECIMAL_SEPARATOR '.' /** comma or dot */ #define CSV_ENCDATA_BUF 10 /** initial storage for encoded (gint) data */ #define MAX_URI_SCHEME 10 /* ======== Data ========================= */ /** Recognized URI schemes (used when parsing input data) * when adding new schemes, check that MAX_URI_SCHEME is enough to * store the entire scheme + '\0' */ struct uri_scheme_info { char *prefix; int length; } uri_scheme_table[] = { { "urn:", 4 }, { "file:", 5 }, { "http://", 7 }, { "https://", 8 }, { "mailto:", 7 }, { NULL, 0 } }; /* ======= Private protos ================ */ static gint show_io_error(void *db, char *errmsg); static gint show_io_error_str(void *db, char *errmsg, char *str); static void snprint_record(void *db, wg_int* rec, char *buf, int buflen); static void csv_escaped_str(void *db, char *iptr, char *buf, int buflen); static void snprint_value_csv(void *db, gint enc, char *buf, int buflen); #if 0 static gint parse_and_encode_uri(void *db, char *buf); #endif static gint parse_input_type(void *db, char *buf, gint *intdata, double *doubledata, gint *datetime); static gint fread_csv(void *db, FILE *f); #ifdef HAVE_RAPTOR static gint import_raptor(void *db, gint pref_fields, gint suff_fields, gint (*callback) (void *, void *), char *filename, raptor_parser *rdf_parser); static void handle_triple(void* user_data, const raptor_statement* triple); static raptor_uri *dburi_to_raptoruri(void *db, gint enc); static gint export_raptor(void *db, gint pref_fields, char *filename, raptor_serializer *rdf_serializer); #endif /* ====== Functions ============== */ /** Print contents of database. 
* */ void wg_print_db(void *db) { void *rec; rec = wg_get_first_record(db); while(rec) { wg_print_record(db, (gint *) rec); printf("\n"); rec = wg_get_next_record(db,rec); } } /** Print single record * */ void wg_print_record(void *db, wg_int* rec) { wg_int len, enc; int i; char strbuf[256]; #ifdef USE_CHILD_DB void *parent; #endif if (rec==NULL) { printf("\n"); return; } #ifdef USE_CHILD_DB parent = wg_get_rec_owner(db, rec); #endif len = wg_get_record_len(db, rec); printf("["); for(i=0; i\n"); return; } if(buflen < 2) return; *buf++ = '['; buflen--; #ifdef USE_CHILD_DB parent = wg_get_rec_owner(db, rec); #endif strbuf = malloc(buflen); if(strbuf) { int i, strbuflen; gint enc; gint len = wg_get_record_len(db, rec); for(i=0; i 1) *buf++ = ']'; *buf = '\0'; } /** Print a single, encoded value * The value is written into a character buffer. */ void wg_snprint_value(void *db, gint enc, char *buf, int buflen) { gint ptrdata; int intdata, len; char *strdata, *exdata; double doubledata; char strbuf[80]; buflen--; /* snprintf adds '\0' */ switch(wg_get_encoded_type(db, enc)) { case WG_NULLTYPE: snprintf(buf, buflen, "NULL"); break; case WG_RECORDTYPE: ptrdata = (gint) wg_decode_record(db, enc); snprintf(buf, buflen, "", (int) ptrdata); len = strlen(buf); if(buflen - len > 1) snprint_record(db, (wg_int*)ptrdata, buf+len, buflen-len); break; case WG_INTTYPE: intdata = wg_decode_int(db, enc); snprintf(buf, buflen, "%d", intdata); break; case WG_DOUBLETYPE: doubledata = wg_decode_double(db, enc); snprintf(buf, buflen, "%f", doubledata); break; case WG_FIXPOINTTYPE: doubledata = wg_decode_fixpoint(db, enc); snprintf(buf, buflen, "%f", doubledata); break; case WG_STRTYPE: strdata = wg_decode_str(db, enc); snprintf(buf, buflen, "\"%s\"", strdata); break; case WG_URITYPE: strdata = wg_decode_uri(db, enc); exdata = wg_decode_uri_prefix(db, enc); if (exdata==NULL) snprintf(buf, buflen, "%s", strdata); else snprintf(buf, buflen, "%s:%s", exdata, strdata); break; case WG_XMLLITERALTYPE: strdata = wg_decode_xmlliteral(db, enc); exdata = wg_decode_xmlliteral_xsdtype(db, enc); snprintf(buf, buflen, "\"%s\"", exdata, strdata); break; case WG_CHARTYPE: intdata = wg_decode_char(db, enc); snprintf(buf, buflen, "%c", (char) intdata); break; case WG_DATETYPE: intdata = wg_decode_date(db, enc); wg_strf_iso_datetime(db,intdata,0,strbuf); strbuf[10]=0; snprintf(buf, buflen, "%s", intdata,strbuf); break; case WG_TIMETYPE: intdata = wg_decode_time(db, enc); wg_strf_iso_datetime(db,1,intdata,strbuf); snprintf(buf, buflen, "%s",intdata,strbuf+11); break; case WG_VARTYPE: intdata = wg_decode_var(db, enc); snprintf(buf, buflen, "?%d", intdata); break; case WG_ANONCONSTTYPE: strdata = wg_decode_anonconst(db, enc); snprintf(buf, buflen, "!%s",strdata); break; default: snprintf(buf, buflen, ""); break; } } /** Create CSV-formatted quoted string * */ static void csv_escaped_str(void *db, char *iptr, char *buf, int buflen) { char *optr; #ifdef CHECK if(buflen < 3) { show_io_error(db, "CSV field buffer too small"); return; } #endif optr = buf; *optr++ = '"'; buflen--; /* space for terminating quote */ while(*iptr) { /* \0 terminates */ int nextsz = 1; if(*iptr == '"') nextsz++; /* Will our string fit? 
*/ if(((gint)optr + nextsz - (gint)buf) < buflen) { *optr++ = *iptr; if(*iptr++ == '"') *optr++ = '"'; /* quote -> double quote */ } else break; } *optr++ = '"'; /* CSV string terminator */ *optr = '\0'; /* C string terminator */ } /** Print a single, encoded value, into a CSV-friendly format * The value is written into a character buffer. */ static void snprint_value_csv(void *db, gint enc, char *buf, int buflen) { int intdata, ilen; double doubledata; char strbuf[80], *ibuf; buflen--; /* snprintf adds '\0' */ switch(wg_get_encoded_type(db, enc)) { case WG_NULLTYPE: buf[0] = '\0'; /* output an empty field */ break; case WG_RECORDTYPE: intdata = ptrtooffset(db, wg_decode_record(db, enc)); snprintf(buf, buflen, "\"\"", intdata); break; case WG_INTTYPE: intdata = wg_decode_int(db, enc); snprintf(buf, buflen, "%d", intdata); break; case WG_DOUBLETYPE: doubledata = wg_decode_double(db, enc); snprintf(buf, buflen, "%f", doubledata); break; case WG_FIXPOINTTYPE: doubledata = wg_decode_fixpoint(db, enc); snprintf(buf, buflen, "%f", doubledata); break; case WG_STRTYPE: csv_escaped_str(db, wg_decode_str(db, enc), buf, buflen); break; case WG_XMLLITERALTYPE: csv_escaped_str(db, wg_decode_xmlliteral(db, enc), buf, buflen); break; case WG_URITYPE: /* More efficient solutions are possible, but here we simply allocate * enough storage to concatenate the URI before encoding it for CSV. */ ilen = wg_decode_uri_len(db, enc); ilen += wg_decode_uri_prefix_len(db, enc); ibuf = (char *) malloc(ilen + 1); if(!ibuf) { show_io_error(db, "Failed to allocate memory"); return; } snprintf(ibuf, ilen+1, "%s%s", wg_decode_uri_prefix(db, enc), wg_decode_uri(db, enc)); csv_escaped_str(db, ibuf, buf, buflen); free(ibuf); break; case WG_CHARTYPE: intdata = wg_decode_char(db, enc); snprintf(buf, buflen, "%c", (char) intdata); break; case WG_DATETYPE: intdata = wg_decode_date(db, enc); wg_strf_iso_datetime(db,intdata,0,strbuf); strbuf[10]=0; snprintf(buf, buflen, "%s", strbuf); break; case WG_TIMETYPE: intdata = wg_decode_time(db, enc); wg_strf_iso_datetime(db,1,intdata,strbuf); snprintf(buf, buflen, "%s", strbuf+11); break; default: snprintf(buf, buflen, "\"\""); break; } } /** Try parsing an URI from a string. * Returns encoded WG_URITYPE field when successful * Returns WG_ILLEGAL on error * * XXX: this is a very naive implementation. Something more robust * is needed. * * XXX: currently unused. */ #if 0 static gint parse_and_encode_uri(void *db, char *buf) { gint encoded = WG_ILLEGAL; struct uri_scheme_info *next = uri_scheme_table; /* Try matching to a known scheme */ while(next->prefix) { if(!strncmp(buf, next->prefix, next->length)) { /* We have a matching URI scheme. * XXX: check this code for correct handling of prefix. */ int urilen = strlen(buf); char *prefix = (char *) malloc(urilen + 1); char *dataptr; if(!prefix) break; strncpy(prefix, buf, urilen); dataptr = prefix + urilen; while(--dataptr >= prefix) { switch(*dataptr) { case ':': case '/': case '#': *(dataptr+1) = '\0'; goto prefix_marked; default: break; } } prefix_marked: encoded = wg_encode_uri(db, buf+((gint)dataptr-(gint)prefix+1), prefix); free(prefix); break; } next++; } return encoded; } #endif /** Parse value from string, encode it for WhiteDB * returns WG_ILLEGAL if value could not be parsed or * encoded. * * See the comment for parse_input_type() for the supported types. * If other conversions fail, data will be encoded as string. 
*/ gint wg_parse_and_encode(void *db, char *buf) { gint intdata = 0; double doubledata = 0; gint encoded = WG_ILLEGAL, res = 0; switch(parse_input_type(db, buf, &intdata, &doubledata, &res)) { case WG_NULLTYPE: encoded = 0; break; case WG_INTTYPE: encoded = wg_encode_int(db, intdata); break; case WG_DOUBLETYPE: encoded = wg_encode_double(db, doubledata); break; case WG_STRTYPE: encoded = wg_encode_str(db, buf, NULL); break; case WG_DATETYPE: encoded = wg_encode_date(db, res); break; case WG_TIMETYPE: encoded = wg_encode_time(db, res); break; default: break; } return encoded; } /** Parse value from string, encode it as a query parameter. * returns WG_ILLEGAL if value could not be parsed or * encoded. * * Parameters encoded like this should be freed with * wg_free_query_param() and cannot be used interchangeably * with other encoded values. */ gint wg_parse_and_encode_param(void *db, char *buf) { gint intdata = 0; double doubledata = 0; gint encoded = WG_ILLEGAL, res = 0; switch(parse_input_type(db, buf, &intdata, &doubledata, &res)) { case WG_NULLTYPE: encoded = 0; break; case WG_INTTYPE: encoded = wg_encode_query_param_int(db, intdata); break; case WG_DOUBLETYPE: encoded = wg_encode_query_param_double(db, doubledata); break; case WG_STRTYPE: encoded = wg_encode_query_param_str(db, buf, NULL); break; case WG_DATETYPE: encoded = wg_encode_query_param_date(db, res); break; case WG_TIMETYPE: encoded = wg_encode_query_param_time(db, res); break; default: break; } return encoded; } /** Detect the type of input data in string format. * * Supports following data types: * NULL - empty string * int - plain integer * double - floating point number in fixed decimal notation * date - ISO8601 date * time - ISO8601 time+fractions of second. * string - input data that does not match the above types * * Does NOT support ambiguous types: * fixpoint - floating point number in fixed decimal notation * uri - string starting with an URI prefix * char - single character * * Does NOT support types which would require a special encoding * scheme in string form: * record, XML literal, blob, anon const, variables * * Return values: * 0 - value type could not be parsed or detected * WG_NULLTYPE - NULL * WG_INTTYPE - int, *intdata contains value * WG_DOUBLETYPE - double, *doubledata contains value * WG_DATETYPE - date, *datetime contains internal representation * WG_TIMETYPE - time, *datetime contains internal representation * WG_STRTYPE - string, use entire buf * * Since leading whitespace makes type guesses fail, it invariably * causes WG_STRTYPE to be returned. */ static gint parse_input_type(void *db, char *buf, gint *intdata, double *doubledata, gint *datetime) { gint type = 0; char c = buf[0]; if(c == 0) { /* empty fields become NULL-s */ type = WG_NULLTYPE; } else if((c >= '0' && c <= '9') ||\ (c == '-' && buf[1] >= '0' && buf[1] <= '9')) { /* This could be one of int, double, date or time */ if(c != '-' && (*datetime = wg_strp_iso_date(db, buf)) >= 0) { type = WG_DATETYPE; } else if(c != '-' && (*datetime = wg_strp_iso_time(db, buf)) >= 0) { type = WG_TIMETYPE; } else { /* Examine the field contents to distinguish between float * and int, then convert using atol()/atof(). sscanf() tends to * be too optimistic about the conversion, especially under Win32. */ char numbuf[80]; char *ptr = buf, *wptr = numbuf, *decptr = NULL; int decsep = 0; while(*ptr) { if(*ptr == CSV_DECIMAL_SEPARATOR) { decsep++; decptr = wptr; } else if((*ptr < '0' || *ptr > '9') && ptr != buf) { /* Non-numeric. 
Mark this as an invalid number * by abusing the decimal separator count. */ decsep = 2; break; } *(wptr++) = *(ptr++); if((int) (wptr - numbuf) >= 79) break; } *wptr = '\0'; if(decsep==1) { char tmp = *decptr; *decptr = '.'; /* ignore locale, force conversion by plain atof() */ *doubledata = atof(numbuf); if(errno!=ERANGE && errno!=EINVAL) { type = WG_DOUBLETYPE; } else { errno = 0; /* Under Win32, successful calls don't do this? */ } *decptr = tmp; /* conversion might have failed, restore string */ } else if(!decsep) { *intdata = atol(numbuf); if(errno!=ERANGE && errno!=EINVAL) { type = WG_INTTYPE; } else { errno = 0; } } } } if(type == 0) { /* Default type is string */ type = WG_STRTYPE; } return type; } /** Write single record to stream in CSV format * */ void wg_fprint_record_csv(void *db, wg_int* rec, FILE *f) { wg_int len, enc; int i; char *strbuf; if(rec==NULL) { show_io_error(db, "null record pointer"); return; } strbuf = (char *) malloc(CSV_FIELD_BUF); if(strbuf==NULL) { show_io_error(db, "Failed to allocate memory"); return; } len = wg_get_record_len(db, rec); for(i=0; i= encdata_sz) { gint *tmp; encdata_sz += CSV_ENCDATA_BUF; tmp = (gint *) realloc(encdata, sizeof(gint) * encdata_sz); if(tmp==NULL) { err = -3; show_io_error(db, "Failed to allocate memory"); break; } else encdata = tmp; } /* Do the actual parsing. This also allocates database-side * storage for the new data. */ enc = wg_parse_and_encode(db, strbuf); if(enc == WG_ILLEGAL) { show_io_error_str(db, "Warning: failed to parse", strbuf); enc = 0; /* continue anyway */ } encdata[reclen++] = enc; } if(commit_record) { /* Need to save the record to database. */ int i; void *rec; commit_record = 0; if(!reclen) continue; /* Ignore empty rows */ rec = wg_create_record(db, reclen); if(!rec) { err = -2; show_io_error(db, "Failed to create record"); break; } for(i=0; i 0) err = -1; /* XXX: not clear if fatal errors can occur here */ if(!user_data.count && err > -1) err = -1; /* No rows read. File was total garbage? */ if(err > user_data.error) err = user_data.error; /* More severe database error. 
*/ raptor_free_uri(base_uri); raptor_free_uri(uri); raptor_free_memory(uri_string); return (gint) err; } /** Triple handler for raptor * Stores the triples parsed by raptor into database */ static void handle_triple(void* user_data, const raptor_statement* triple) { void* rec; struct wg_triple_handler_params *params = \ (struct wg_triple_handler_params *) user_data; gint enc; rec=wg_create_record(params->db, params->pref_fields + 3 + params->suff_fields); if (!rec) { show_io_error(params->db, "cannot create a new record"); params->error = -2; raptor_parse_abort(params->rdf_parser); } /* Field storage order: predicate, subject, object */ enc = parse_and_encode_uri(params->db, (char*)(triple->predicate)); if(enc==WG_ILLEGAL ||\ wg_set_field(params->db, rec, params->pref_fields, enc)) { show_io_error(params->db, "failed to store field"); params->error = -2; raptor_parse_abort(params->rdf_parser); } enc = parse_and_encode_uri(params->db, (char*)(triple->subject)); if(enc==WG_ILLEGAL ||\ wg_set_field(params->db, rec, params->pref_fields+1, enc)) { show_io_error(params->db, "failed to store field"); params->error = -2; raptor_parse_abort(params->rdf_parser); } if ((triple->object_type)==RAPTOR_IDENTIFIER_TYPE_RESOURCE) { enc = parse_and_encode_uri(params->db, (char*)(triple->object)); } else if ((triple->object_type)==RAPTOR_IDENTIFIER_TYPE_ANONYMOUS) { /* Fixed prefix urn:local: */ enc=wg_encode_uri(params->db, (char*)(triple->object), "urn:local:"); } else if ((triple->object_type)==RAPTOR_IDENTIFIER_TYPE_LITERAL) { if ((triple->object_literal_datatype)==NULL) { enc=wg_encode_str(params->db,(char*)(triple->object), (char*)(triple->object_literal_language)); } else { enc=wg_encode_xmlliteral(params->db, (char*)(triple->object), (char*)(triple->object_literal_datatype)); } } else { show_io_error(params->db, "Unknown triple object type"); /* XXX: is this fatal? Maybe we should set error and continue here */ params->error = -2; raptor_parse_abort(params->rdf_parser); } if(enc==WG_ILLEGAL ||\ wg_set_field(params->db, rec, params->pref_fields+2, enc)) { show_io_error(params->db, "failed to store field"); params->error = -2; raptor_parse_abort(params->rdf_parser); } /* After correctly storing the triple, call the designated callback */ if(params->callback) { if((*(params->callback)) (params->db, rec)) { show_io_error(params->db, "record callback failed"); params->error = -2; raptor_parse_abort(params->rdf_parser); } } params->count++; } /** WhiteDB RDF parsing callback * This callback does nothing, but is always called when RDF files * are imported using wgdb commandline tool. If import API is used from * user application, alternative callback functions can be implemented * in there. * * Callback functions are expected to return 0 on success and * <0 on errors that cause the database to go into an invalid state. */ gint wg_rdfparse_default_callback(void *db, void *rec) { return 0; } /** Export triple data to file * wrapper for export_raptor(), allows user to specify serializer type. * * raptor provides an API to enumerate serializers. This is not * utilized here. 
*/ gint wg_export_raptor_file(void *db, gint pref_fields, char *filename, char *serializer) { raptor_serializer *rdf_serializer=NULL; gint err = 0; raptor_init(); rdf_serializer = raptor_new_serializer(serializer); if(!rdf_serializer) return -1; err = export_raptor(db, pref_fields, filename, rdf_serializer); raptor_free_serializer(rdf_serializer); raptor_finish(); return err; } /** Export triple data to file, instructing raptor to use rdfxml serializer * */ gint wg_export_raptor_rdfxml_file(void *db, gint pref_fields, char *filename) { return wg_export_raptor_file(db, pref_fields, filename, "rdfxml"); } /** Convert wgdb URI field to raptor URI * Helper function. Caller is responsible for calling raptor_free_uri() * when the returned value is no longer needed. */ static raptor_uri *dburi_to_raptoruri(void *db, gint enc) { raptor_uri *tmpuri = raptor_new_uri((unsigned char *) wg_decode_uri_prefix(db, enc)); raptor_uri *uri = raptor_new_uri_from_uri_local_name(tmpuri, (unsigned char *) wg_decode_uri(db, enc)); raptor_free_uri(tmpuri); return uri; } /** File-based raptor export function * Uses WhiteDB-specific API parameters of: * pref_fields * suff_fields * * Expects an initialized serializer as an argument. * returns 0 on success. * returns -1 on errors (no fatal errors that would corrupt * the database are expected here). */ static gint export_raptor(void *db, gint pref_fields, char *filename, raptor_serializer *rdf_serializer) { int err, minsize; raptor_statement *triple; void *rec; err = raptor_serialize_start_to_filename(rdf_serializer, filename); if(err) return -1; /* initialization failed somehow */ /* Start constructing triples and sending them to the serializer. */ triple = (raptor_statement *) malloc(sizeof(raptor_statement)); if(!triple) { show_io_error(db, "Failed to allocate memory"); return -1; } memset(triple, 0, sizeof(raptor_statement)); rec = wg_get_first_record(db); minsize = pref_fields + 3; while(rec) { if(wg_get_record_len(db, rec) >= minsize) { gint enc = wg_get_field(db, rec, pref_fields); if(wg_get_encoded_type(db, enc) == WG_URITYPE) { triple->predicate = dburi_to_raptoruri(db, enc); } else if(wg_get_encoded_type(db, enc) == WG_STRTYPE) { triple->predicate = (void *) raptor_new_uri( (unsigned char *) wg_decode_str(db, enc)); } else { show_io_error(db, "Bad field type for predicate"); err = -1; goto done; } triple->predicate_type = RAPTOR_IDENTIFIER_TYPE_RESOURCE; enc = wg_get_field(db, rec, pref_fields + 1); if(wg_get_encoded_type(db, enc) == WG_URITYPE) { triple->subject = dburi_to_raptoruri(db, enc); } else if(wg_get_encoded_type(db, enc) == WG_STRTYPE) { triple->subject = (void *) raptor_new_uri( (unsigned char *) wg_decode_str(db, enc)); } else { show_io_error(db, "Bad field type for subject"); err = -1; goto done; } triple->subject_type = RAPTOR_IDENTIFIER_TYPE_RESOURCE; enc = wg_get_field(db, rec, pref_fields + 2); triple->object_literal_language = NULL; triple->object_literal_datatype = NULL; if(wg_get_encoded_type(db, enc) == WG_URITYPE) { triple->object = dburi_to_raptoruri(db, enc); triple->object_type = RAPTOR_IDENTIFIER_TYPE_RESOURCE; } else if(wg_get_encoded_type(db, enc) == WG_XMLLITERALTYPE) { triple->object = (void *) raptor_new_uri( (unsigned char *) wg_decode_xmlliteral(db, enc)); triple->object_literal_datatype = raptor_new_uri( (unsigned char *) wg_decode_xmlliteral_xsdtype(db, enc)); triple->object_type = RAPTOR_IDENTIFIER_TYPE_LITERAL; } else if(wg_get_encoded_type(db, enc) == WG_STRTYPE) { triple->object = (void *) wg_decode_str(db, enc); 
triple->object_literal_language =\ (unsigned char *) wg_decode_str_lang(db, enc); triple->object_type = RAPTOR_IDENTIFIER_TYPE_LITERAL; } else { show_io_error(db, "Bad field type for object"); err = -1; goto done; } /* Write the triple */ raptor_serialize_statement(rdf_serializer, triple); /* Cleanup current triple */ raptor_free_uri((raptor_uri *) triple->subject); raptor_free_uri((raptor_uri *) triple->predicate); if(triple->object_type == RAPTOR_IDENTIFIER_TYPE_RESOURCE) raptor_free_uri((raptor_uri *) triple->object); else if(triple->object_literal_datatype) raptor_free_uri((raptor_uri *) triple->object_literal_datatype); } rec = wg_get_next_record(db, rec); } done: raptor_serialize_end(rdf_serializer); free(triple); return (gint) err; } #endif /* HAVE_RAPTOR */ /* ------------ error handling ---------------- */ static gint show_io_error(void *db, char *errmsg) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"I/O error: %s.\n", errmsg); #endif return -1; } static gint show_io_error_str(void *db, char *errmsg, char *str) { #ifdef WG_NO_ERRPRINT #else fprintf(stderr,"I/O error: %s: %s.\n", errmsg, str); #endif return -1; } #ifdef __cplusplus } #endif whitedb-0.7.2/Db/dbutil.h000066400000000000000000000047651226454622500151530ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Priit Järv 2010 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbutil.h * Public headers for miscellaneous functions. 
*/ #ifndef DEFINED_DBUTIL_H #define DEFINED_DBUTIL_H #ifdef HAVE_RAPTOR #include #endif #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif /* ====== data structures ======== */ #ifdef HAVE_RAPTOR struct wg_triple_handler_params { void *db; int pref_fields; /** number of fields preceeding the triple */ int suff_fields; /** number of fields to reserve at the end */ gint (*callback) (void *, void *); /** function called after *the triple is stored */ raptor_parser *rdf_parser; /** parser object */ int count; /** return status: rows parsed */ int error; /** return status: error level */ }; #endif /* ==== Protos ==== */ /* API functions (copied in dbapi.h) */ void wg_print_db(void *db); void wg_print_record(void *db, gint* rec); void wg_snprint_value(void *db, gint enc, char *buf, int buflen); gint wg_parse_and_encode(void *db, char *buf); gint wg_parse_and_encode_param(void *db, char *buf); void wg_export_db_csv(void *db, char *filename); gint wg_import_db_csv(void *db, char *filename); /* Separate raptor API (copied in rdfapi.h) */ #ifdef HAVE_RAPTOR gint wg_import_raptor_file(void *db, gint pref_fields, gint suff_fields, gint (*callback) (void *, void *), char *filename); gint wg_import_raptor_rdfxml_file(void *db, gint pref_fields, gint suff_fields, gint (*callback) (void *, void *), char *filename); gint wg_rdfparse_default_callback(void *db, void *rec); gint wg_export_raptor_file(void *db, gint pref_fields, char *filename, char *serializer); gint wg_export_raptor_rdfxml_file(void *db, gint pref_fields, char *filename); #endif #endif /* DEFINED_DBUTIL_H */ whitedb-0.7.2/Db/indexapi.h000066400000000000000000000033721226454622500154620ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Priit Järv 2011 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file indexapi.h * * Index management API for WhiteDB. 
* * wg_int type is defined in dbapi.h */ #ifndef DEFINED_INDEXAPI_H #define DEFINED_INDEXAPI_H /* Public macros */ #define WG_INDEX_TYPE_TTREE 50 #define WG_INDEX_TYPE_TTREE_JSON 51 #define WG_INDEX_TYPE_HASH 60 #define WG_INDEX_TYPE_HASH_JSON 61 /* Public protos */ wg_int wg_create_index(void *db, wg_int column, wg_int type, wg_int *matchrec, wg_int reclen); wg_int wg_create_multi_index(void *db, wg_int *columns, wg_int col_count, wg_int type, wg_int *matchrec, wg_int reclen); wg_int wg_drop_index(void *db, wg_int index_id); wg_int wg_column_to_index_id(void *db, wg_int column, wg_int type, wg_int *matchrec, wg_int reclen); wg_int wg_multi_column_to_index_id(void *db, wg_int *columns, wg_int col_count, wg_int type, wg_int *matchrec, wg_int reclen); wg_int wg_get_index_type(void *db, wg_int index_id); void * wg_get_index_template(void *db, wg_int index_id, wg_int *reclen); void * wg_get_all_indexes(void *db, wg_int *count); #endif /* DEFINED_INDEXAPI_H */ whitedb-0.7.2/Db/rdfapi.h000066400000000000000000000025741226454622500151310ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Priit Järv 2010 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file rdfapi.h * * RDF parsing API for WhiteDB, depends on libraptor. * */ #ifndef DEFINED_RDFAPI_H #define DEFINED_RDFAPI_H wg_int wg_import_raptor_file(void *db, wg_int pref_fields, wg_int suff_fields, wg_int (*callback) (void *, void *), char *filename); wg_int wg_import_raptor_rdfxml_file(void *db, wg_int pref_fields, wg_int suff_fields, wg_int (*callback) (void *, void *), char *filename); wg_int wg_rdfparse_default_callback(void *db, void *rec); wg_int wg_export_raptor_file(void *db, wg_int pref_fields, char *filename, char *serializer); wg_int wg_export_raptor_rdfxml_file(void *db, wg_int pref_fields, char *filename); #endif /* DEFINED_RDFAPI_H */ whitedb-0.7.2/Doc/000077500000000000000000000000001226454622500136635ustar00rootroot00000000000000whitedb-0.7.2/Doc/Install.txt000066400000000000000000000114421226454622500160340ustar00rootroot00000000000000WhiteDB Installation ==================== Introduction ------------ There are two primary ways you can use the distribution package: - compile the database library ('libwgdb.so' under Linux, 'wgdb.dll' under Windows) and link your application against that - compile your application program by including the database files directly In both of these cases your application using WhiteDB calls should include the API header file: 'dbapi.h' In addition, you may want to compile the included utility programs (`wgdb` and `indextool`) to manage the database. Quick-start instructions ------------------------ Under Linux, type ./configure make make install This produces the database utilities, the library and installs the them together with the database header files. Under Windows, check that you have MSVC installed. 
Open the command prompt with the Visual C environment configured and type:

  compile.bat

This produces the database utilities, 'wgdb.lib' and 'wgdb.dll'.

The shared memory
-----------------

Under Linux, the default memory settings are sufficient for testing and
initial evaluation. To increase the maximum amount of shared memory, type:

  sysctl kernel.shmmax=100000000

This example sets the available shared memory to 100M bytes.

Under Mac OS X you need to set kern.sysv.shmmax and kern.sysv.shmall; type:

  sudo sysctl -w kern.sysv.shmmax=1073741824
  sudo sysctl -w kern.sysv.shmall=262144

You can add these settings to '/etc/sysctl.conf' to make them permanent.

Under Windows, the shared memory is not persistent. To maintain a persistent
database, use

  wgdb server 100000000

This example creates a shared memory database of 100M bytes. Once this
process is terminated, the shared memory is destroyed.

The configure script
--------------------

Some of the more relevant options to the configure script are:

'--prefix=PREFIX' specifies the directory the program is installed under.
The binaries go in 'PREFIX/bin', the header files in 'PREFIX/include/whitedb'
and the libraries in 'PREFIX/lib'. The Python modules, if compiled, will be
placed in 'PREFIX/lib/pythonX.Y/site-packages', where X.Y is the Python
version.

'--with-python' compiles the Python bindings. By default, the configure
script attempts to automatically locate a suitable version of Python.
Use '--with-python=/usr/bin/pythonX.Y' to point to a specific version of
Python.

'--enable-locking' changes the locking protocol. The available options are:
'rpspin' (a reader preference spinlock), 'wpspin' (a writer preference
spinlock), 'tfqueue' (task-fair queue, no preference) and 'no' (locking is
disabled). The default value is 'tfqueue', which performs best under heavy
workload. For simple applications 'rpspin' may be preferable, as it has
lower overhead.

'--enable-logging' enables the journal log of the database. Still somewhat
experimental; off by default.

'--enable-reasoner' enables the Gandalf reasoner. Disabled by default.

'--disable-backlink' disables references between records. May be used to
increase performance if the database records never contain any links to
other records.

'--disable-checking' disables sanity checking in many internal database
operations. Increases performance by a small percentage.

`./configure --help` will provide the full list of available options.

Building the repository version
-------------------------------

The github repository (https://github.com/priitj/whitedb) does not contain
a pre-generated configure script. You'll need the autoconf and automake
packages; if you have those installed, run:

  ./Bootstrap

This generates the `configure` script and other scripts used by the
autotools build.

Building the utilities without configure and GNU make
-----------------------------------------------------

The `compile.sh` script is provided to allow compiling the utilities with
the C compiler. This is intended to simplify building in cross-target or
embedded environments. It is assumed that the GNU C Compiler (`gcc`) is
used. When the script is executed the first time, it copies 'config-gcc.h'
to 'config.h', unless that file is already present. Edit 'config.h' to
change database options.

Under Windows, `compile.bat` serves a similar function. To change the
database options, edit the 'config-w32.h' file.
Note that in both cases, the config file for building the utilities `indextool` and `wgdb` should match the config file for building your database application. Not building anything --------------------- Building the database library and the utilities is not strictly necessary. Alternatively you may compile the database sources directly into your program. See 'Examples/compile_demo.sh' ('Examples\compile_demo.bat' under Windows). This compiles the demo program 'demo.c' with the WhiteDB source files. These programs and scripts may be used as templates for creating database applications. whitedb-0.7.2/Doc/Manual.txt000066400000000000000000001532621226454622500156500ustar00rootroot00000000000000WhiteDB shared memory database ============================== Principles and goals --------------------- WhiteDB is a lightweight database library operating fully in main memory. Disk is used only for dumping/restoring the database and logging. Data is persistently kept in the shared memory area: it is available simultaneously to all processes and is kept intact even if no processes are currently using the database. WhiteDB has no server process. Data is read and written directly from/to memory; no sockets are used between WhiteDB and the application using WhiteDB. WhiteDB keeps data as N-tuples: each database record is a tuple of N elements. Each element (record field) may have an arbitrary type amongst the types provided by WhiteDB. Each record field contains exactly one integer (4 bytes or 8 bytes). Datatypes which cannot fit into one integer are allocated separately and the record field contains an (encoded) pointer to the real data. WhiteDB is written in pure C in a portable manner and should compile and function without additional porting at least under Linux (gcc) and Windows (native Windows C compiler cl). It has Python and experimental Java bindings. The Python bindings and their usage are explained in the separate manual 'python.txt'. WhiteDB has several goals: - speed - portability - small footprint and low memory usage - usability as an rdf database - usability as an extended rdf database, xml database and outside these scopes - seamless integration with the Gandalf rule engine (work in progress) NOTE: The name 'wgdb' is also used in some places, such as the name of loadable modules and libraries. In documentation it may be used interchangeably with WhiteDB; the letter 'G' refers to the Gandalf reasoner. Obtaining and licence --------------------- WhiteDB releases can be obtained from http://www.whitedb.org The development version can be obtained from the source repository: https://github.com/priitj/whitedb WhiteDB is licensed under GPL version 3. Using WhiteDB in applications ----------------------------- See 'demo.c' and 'query.c' in the 'Examples' directory of the distribution package for complete examples of basic database usage. Compiling and linking against WhiteDB installation ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Include the API headers in your programs: [source,C] ---- #include <whitedb/dbapi.h> #include <whitedb/rdfapi.h> /* only for using the raptor API */ #include <whitedb/indexapi.h> /* only for using the index API */ ---- - Add -lwgdb to LDFLAGS in your Makefile or linker arguments If you used a non-standard installation prefix, using -I and -L compiler/linker flags is required as usual.
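For example, with the default installation prefix a minimal build command could look something like the following (the file name 'yourprog.c' is just a placeholder):

  gcc yourprog.c -o yourprog -lwgdb

If a non-standard installation prefix such as '/opt/whitedb' was used, add the matching -I/opt/whitedb/include and -L/opt/whitedb/lib flags.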
Dynamic linking under Windows ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Include the API headers [source,C] ---- #include <dbapi.h> #include <rdfapi.h> /* only for using the raptor API */ #include <indexapi.h> /* only for using the index API */ ---- This requires providing the header file directory to the compiler. - Compile and link against the library cl.exe /I"..\whitedb-0.6\Db" yourprog.c ..\whitedb-0.6\wgdb.lib This produces 'yourprog.exe' that requires 'wgdb.dll' to run. The 'compile.bat' script and the Examples directory also contain compilation examples. Compiling with database source files ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ See 'Examples/compile_demo.sh' ('Examples\compile_demo.bat' under Windows). This compiles the demo program 'demo.c' with the WhiteDB source files. These programs and scripts may be used as templates for creating database applications. Database API ------------- The database API prototypes and macros are all found in the 'Db/dbapi.h' file. You should include this single header file in all the files of your application calling WhiteDB functions. The database API has functions for: - creating and deleting the database - creating and deleting records - setting and reading record fields - encoding and decoding data stored in the record fields - dumping and restoring database contents to/from disk - read and write locking the database for concurrency control It is a good idea to check the usage of API calls from the example program 'Examples/demo.c'. Preliminaries ~~~~~~~~~~~~~ All the API calls follow these principles: - each function has a wg_ prefix. - function names are all lower case, _ used as a separator - each function takes the pointer to the database as a first argument. The database pointer is obtained when creating a new database or attaching to an existing one. You can have several databases open at any time: they will simply have different pointers. Observe that the pointer you will get from two different processes for the same database will be different. The record pointer is returned when creating records or when fetching query results. This `void *` type pointer points directly to the record data in the shared memory segment and should be used with all the functions that read or manipulate record fields. You can also encode a record pointer and write it into another record, forming a link between records. All the record fields are ordinary C integers (32 or 64 bits). In order to allow exact control over the integer length the datatype `wg_int` is used for all encoded data. This datatype is in normal usage equivalent (typedef-d) to an int (or a 64-bit integer if the database is configured as 64-bit). Strings given to the API functions are ordinary 0-terminated C strings; their length is an ordinary C string length as computed by strlen. Checks and errors ~~~~~~~~~~~~~~~~~ The WhiteDB library performs a few checks for most library operations to ensure sanity. Checking causes a very small speed penalty and can be disabled by setting '--disable-checking' during installation. One of the standard checks is whether the database pointer passed as the first argument is not NULL and the first segment of the database area contains the specific integer indicating that the segment is actually created as a WhiteDB database. Whenever a record field is accessed, WhiteDB checks that the field number is not larger than the record length. Validity checks are also performed during data decoding and encoding. In case WhiteDB recognizes an error, the API function called returns an error value specified in the API doc.
For example, failed record creation returns a NULL pointer. A WG_ILLEGAL value is returned in case of encoding error, NULL in case of string decoding errors, -1 in case of length decoding errors. In addition to returning a specific error value, WhiteDB prints an error message to stderr. In some cases the error message is a small error trace through several layers of internal calls. Printing to stderr can be inhibited by defining a macro WG_NO_ERRPRINT during WhiteDB compilation. Notice that in error cases, nothing is printed to stdout (only stderr) and WhiteDB does not exit: the corresponding API call returns an error value which should be handled by the code calling the API function. Creating and deleting the database ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Functions: [source,C] ---- void* wg_attach_database(char* dbasename, wg_int size); void* wg_attach_existing_database(char* dbasename); void* wg_attach_logged_database(char* dbasename, wg_int size); int wg_detach_database(void* dbase); int wg_delete_database(char* dbasename); void* wg_attach_local_database(wg_int size); void wg_delete_local_database(void* dbase); ---- Details: void* wg_attach_database(char* dbasename, int size) Returns a pointer to the database, NULL if failure. Size in bytes. Created database is a contiguous block of shared memory of size bytes. It cannot be shrinked or extended later. The returned pointer should be passed to all the WhiteDB API calls as the first parameter. Database name should be an integer. The call wg_attach_database(NULL, 0) creates a database with a default name ("1000") and default size 10000000 (10 megabytes). Both defaults can be configured from 'Db/dbmem.h'. If the size parameter is > 0, the named shared memory segment exists and it is smaller than the given size, the call returns NULL. NOTE: The typical default shared memory allocatable size of a linux system is under 100 megabytes. You can see the allocatable size in bytes by doing `cat /proc/sys/kernel/shmmax`. You can set the shared memory size by becoming root and doing `echo shared_memory_size > /proc/sys/kernel/shmmax` where shared_memory_size is a number of bytes. void* wg_attach_existing_database(char* dbasename) Like `wg_attach_database()`, but does not create a new database when no database with name dbasename exists. In the latter case returns NULL. void* wg_attach_logged_database(char* dbasename, wg_int size) Like `wg_attach_database()`, but starts journal logging when the database is initialized. If the named segment already exists and does not have logging enabled, the function returns NULL. int wg_detach_database(void* dbase) Detaches a database: returns 0 if OK. Exiting from the process detaches database automatically. int wg_delete_database(char* dbasename) Deletes a database: returns 0 if OK. NB! Database is not deleted unless all processes who have previously attached have detached from it and at least one process has made a delete call. void* wg_attach_local_database(int size) Returns a pointer to local memory database, NULL if failure. Size is given in bytes. The database is allocated in the private memory of the process and will neither be readable to other processes nor persist when the process closes. In every other aspect the database behaves similarly to a shared memory database. void wg_delete_local_database(void* dbase) Deletes a local memory database. Memory allocated for the database will be freed. 
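To illustrate how these calls fit together, here is a minimal sketch of attaching to (or creating) a database, detaching and optionally freeing it; the segment name "1000" and the size are arbitrary example values and the header path assumes a standard installation:

[source,C]
----
#include <whitedb/dbapi.h>

int main(void) {
  void *db = wg_attach_database("1000", 2000000); /* attach, creating if needed */
  if(!db)
    return 1;                   /* attaching failed */
  /* ... database operations ... */
  wg_detach_database(db);       /* data stays in shared memory */
  /* wg_delete_database("1000"); would free the whole segment */
  return 0;
}
----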
Creating, deleting, scanning records ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Functions: [source,C] ---- void* wg_create_record(void* db, wg_int length); void* wg_create_raw_record(void* db, wg_int length); wg_int wg_delete_record(void* db, void *rec); void* wg_get_first_record(void* db); void* wg_get_next_record(void* db, void* record); ---- Details: void* wg_create_record(void* db, wg_int length) Creates a new record of length length and initialises all fields to 0 (used as a NULL value in WhiteDB). Returns NULL when error, ptr to record otherwise. void* wg_create_raw_record(void* db, wg_int length) Same as wg_create_record(), except the initial field values are not indexed. Use together with wg_set_new_field(). wg_int wg_delete_record(void* db, void *rec) Deletes a record with a pointer rec. Returns 0 if OK, non-0 on error. You should not worry about deallocation of data in the record fields: this is done automatically. void* wg_get_first_record(void* db) Returns first record pointer, NULL when error or no records available. void* wg_get_next_record(void* db, void* record) Returns next record pointer, NULL when error or no records available. record parameter is a pointer to the (previous) record. Setting and reading record fields ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Functions: [source,C] ---- wg_int wg_get_record_len(void* db, void* record); wg_int wg_set_field(void* db, void* record, wg_int fieldnr, wg_int data); wg_int wg_set_new_field(void* db, void* record, wg_int fieldnr, wg_int data); wg_int wg_get_field(void* db, void* record, wg_int fieldnr); wg_int wg_get_field_type(void* db, void* record, wg_int fieldnr); wg_int wg_set_int_field(void* db, void* record, wg_int fieldnr, wg_int data); wg_int wg_set_double_field(void* db, void* record, wg_int fieldnr, double data); wg_int wg_set_str_field(void* db, void* record, wg_int fieldnr, char* data); wg_int* wg_field_addr(void* db, void* record, wg_int fieldnr); ---- Details: wg_int wg_get_record_len(void* db, void* record) Gives record length (0,...). Returns negative int when error. wg_int wg_set_field(void* db, void* record, wg_int fieldnr, wg_int data) Sets field fieldnr value to encoded data. Field numbers start from 0. Passed data must be 0 (NULL value) or encoded (see next chapter). Returns negative int when err, 0 when ok. Do not worry about deallocating earlier data in the field: this is done automatically. wg_int wg_set_new_field(void* db, void* record, wg_int fieldnr, wg_int data) Same as wg_set_field() except it can only be used to write the contents of newly created fields that do not have values. Writing will be somewhat faster than with wg_set_field(). It is the responsibility of the caller to ensure that the field to be written really is one that contains no earlier data. Use together with wg_create_raw_record(). wg_int wg_get_field(void* db, void* record, wg_int fieldnr) Returns encoded data in field fieldnr. Data should be decoded later for ordinary use, see next chapter. wg_int wg_get_field_type(void* db, void* record, wg_int fieldnr) Returns datatype in field fieldnr. 
Datatypes are defined by these macros, avoid using corresponding numbers, since these may change: [source,C] ---- #define WG_NULLTYPE 1 #define WG_RECORDTYPE 2 #define WG_INTTYPE 3 #define WG_DOUBLETYPE 4 #define WG_STRTYPE 5 #define WG_XMLLITERALTYPE 6 #define WG_URITYPE 7 #define WG_BLOBTYPE 8 #define WG_CHARTYPE 9 #define WG_FIXPOINTTYPE 10 #define WG_DATETYPE 11 #define WG_TIMETYPE 12 ---- The following are convenience functions for common datatypes: wg_int wg_set_int_field(void* db, void* record, wg_int fieldnr, wg_int data) Like wg_set_field but automatically encodes data: pass ordinary integer. wg_int wg_set_double_field(void* db, void* record, wg_int fieldnr, double data) Like wg_set_field but automatically encodes data: pass ordinary double. wg_int wg_set_str_field(void* db, void* record, wg_int fieldnr, char* data) Like wg_set_field but automatically encodes data: pass ordinary null-terminated string. The following is a macro returning an address (C pointer) of a field: wg_int* wg_field_addr(void* db, void* record, wg_int fieldnr) Avoid wg_field_addr in normal cases: use wg_get_field and wg_set_field instead. The wg_field_addr macro performs no checks whatsoever: it is useful only for achieving maximum speed. While it is safe to read a value from the address returned, use it with extreme caution when storing data to the field. It is OK to directly store an encoded value to the field only if it currently contains an immediate value (immediates are NULL, short integer, date, time, char), is not indexed and no logging is used. Encoding and decoding data stored in the record fields ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The general principle of data storage in records is that each datatype has to be encoded before storage and decoded after reading before ordinary usage. Data stored in the fields is deallocated automatically if not used any more in any records. Hence you should not use the decoded data in your own variables after storage, unless you are sure the corresponding records are not deleted before you are using your variables again. The encoding principles are following, from smallest and fastest to largest and slowest: - 0, small (28 bit) integers, fixpoint doubles, chars, dates and times are stored directly in the field, no additional allocation is done, no special deallocation is done. - Records are encoded as an offset from the start of the shared memory segment to the start of the record. The encoded value is stored directly in a field. It can be decoded into a direct pointer to the start of the record data in the shared memory. - large integers and doubles are allocated one copy per data item, in a 4 byte or 8 byte chunk. - Short simple strings up to 32 bytes are allocated one copy per data item, always 32 bytes. - Long strings, strings with added language property, xmlliterals, uris, blobs are kept uniquely: only one copy of each item is allocated. They are deallocated automatically when the reference count falls to zero (reference counting garbage collection is used). - Long strings, xmlliterals, uris and blobs have different types (not equal even if they look the same when printed) and they all contain two strings: * main part (string, xmlliteral, uri, blob) * extra part (string language, xmlliteral namespace, uri prefix, blob type) where all these are ordinary 0-terminated C strings except blob, which is not 0-terminated. It is always possible to give a NULL value as an extra part. 
- Strings and blob returned by decoding strings, xmlliterals, uris and blobs should not be changed or used directly except for immediate copying to buffer. Prefer to use the decode...copy functions instead of direct decode functions giving a pointer to a string in the database. - A WG_ILLEGAL value is returned in case of encoding error. A value returned in case of decoding error is sometimes not recognizable as an error. In string-type value decoding NULL is returned in case of decoding errors, length and date/time decoding errors return -1. Functions: [source,C] ---- wg_int wg_get_encoded_type(void* db, wg_int data); wg_int wg_free_encoded(void* db, wg_int data); wg_int wg_encode_null(void* db, wg_int data); wg_int wg_decode_null(void* db, wg_int data); wg_int wg_encode_int(void* db, wg_int data); wg_int wg_decode_int(void* db, wg_int data); wg_int wg_encode_char(void* db, char data); char wg_decode_char(void* db, wg_int data); wg_int wg_encode_record(void* db, void* data); void* wg_decode_record(void* db, wg_int data); wg_int wg_encode_double(void* db, double data); double wg_decode_double(void* db, wg_int data); wg_int wg_encode_fixpoint(void* db, double data); double wg_decode_fixpoint(void* db, wg_int data); wg_int wg_encode_date(void* db, int data); int wg_decode_date(void* db, wg_int data); wg_int wg_encode_time(void* db, int data); int wg_decode_time(void* db, wg_int data); int wg_current_utcdate(void* db); int wg_current_localdate(void* db); int wg_current_utctime(void* db); int wg_current_localtime(void* db); int wg_strf_iso_datetime(void* db, int date, int time, char* buf); int wg_strp_iso_date(void* db, char* buf); int wg_strp_iso_time(void* db, char* inbuf); int wg_ymd_to_date(void* db, int yr, int mo, int day); int wg_hms_to_time(void* db, int hr, int min, int sec, int prt); void wg_date_to_ymd(void* db, int date, int *yr, int *mo, int *day); void wg_time_to_hms(void* db, int time, int *hr, int *min, int *sec, int *prt); wg_int wg_encode_str(void* db, char* str, char* lang); char* wg_decode_str(void* db, wg_int data); char* wg_decode_str_lang(void* db, wg_int data); wg_int wg_decode_str_len(void* db, wg_int data); wg_int wg_decode_str_lang_len(void* db, wg_int data); wg_int wg_decode_str_copy(void* db, wg_int data, char* strbuf, wg_int buflen); wg_int wg_decode_str_lang_copy(void* db, wg_int data, char* langbuf, wg_int buflen); wg_int wg_encode_xmlliteral(void* db, char* str, char* xsdtype); char* wg_decode_xmlliteral_copy(void* db, wg_int data); char* wg_decode_xmlliteral_xsdtype_copy(void* db, wg_int data); wg_int wg_decode_xmlliteral_len(void* db, wg_int data); wg_int wg_decode_xmlliteral_xsdtype_len(void* db, wg_int data); wg_int wg_decode_xmlliteral(void* db, wg_int data, char* strbuf, wg_int buflen); wg_int wg_decode_xmlliteral_xsdtype(void* db, wg_int data, char* strbuf, wg_int buflen); wg_int wg_encode_uri(void* db, char* str, char* nspace); char* wg_decode_uri(void* db, wg_int data); char* wg_decode_uri_prefix(void* db, wg_int data); wg_int wg_decode_uri_len(void* db, wg_int data); wg_int wg_decode_uri_prefix_len(void* db, wg_int data); wg_int wg_decode_uri_copy(void* db, wg_int data, char* strbuf, wg_int buflen); wg_int wg_decode_uri_prefix_copy(void* db, wg_int data, char* strbuf, wg_int buflen); wg_int wg_encode_blob(void* db, char* str, char* type, wg_int len); char* wg_decode_blob(void* db, wg_int data); char* wg_decode_blob_type(void* db, wg_int data); wg_int wg_decode_blob_len(void* db, wg_int data); wg_int wg_decode_blob_copy(void* db, wg_int data, char* 
strbuf, wg_int buflen); wg_int wg_decode_blob_type_len(void* db, wg_int data); wg_int wg_decode_blob_type_copy(void* db, wg_int data, char* langbuf, wg_int buflen); wg_int wg_encode_var(void* db, wg_int data); wg_int wg_decode_var(void* db, wg_int data); ---- Details: wg_int wg_get_encoded_type(void* db, wg_int data) Return a type of the encoded data (see the documentation for `wg_get_field_type()`) wg_int wg_free_encoded(void* db, wg_int data) Deallocate encoded data. You need to deallocate data if and only if you have encoded it yourself (not read from the field) and have not stored it into any fields. In case the data is stored in a field, you should never deallocate it, otherwise unexpected errors will occur. In case a field is written over or a record is deleted, deallocation is done automatically and properly. wg_int wg_encode_null(void* db, wg_int data) wg_int wg_decode_null(void* db, wg_int data) Not strictly needed; encoded value 0 stands for NULL. wg_int wg_encode_int(void* db, wg_int data) wg_int wg_decode_int(void* db, wg_int data) Encode/decode integers. Observe that shorter integers (28 bits) take less space and are a bit faster: they are kept directly in the field. wg_int wg_encode_char(void* db, char data) char wg_decode_char(void* db, wg_int data) Encode/decode a single char. Kept directly in the field. wg_int wg_encode_record(void* db, void* data) void* wg_decode_record(void* db, wg_int data) Encodes/decode a pointer to the record. wg_int wg_encode_double(void* db, double data) double wg_decode_double(void* db, wg_int data) Encode/decode ordinary doubles. Allocated separately. wg_int wg_encode_fixpoint(void* db, double data) double wg_decode_fixpoint(void* db, wg_int data) Encode/decode doubles as small and fast fixpoint numbers. Data must be a double between -800...800, four places after comma are kept after rounding. wg_int wg_encode_date(void* db, int data) int wg_decode_date(void* db, wg_int data) Unencoded date is a number of years since year 0. Use 1 as the first year. Kept directly in the field. wg_int wg_encode_time(void* db, int data) int wg_decode_time(void* db, wg_int data) Unencoded time is a number of 100-ths of a seconds past midnight. Kept directly in the field. int wg_current_utcdate(void* db) int wg_current_localdate(void* db) int wg_current_utctime(void* db) int wg_current_localtime(void* db) Gives current unencoded date or time, either utc or local. int wg_strf_iso_datetime(void* db, int date, int time, char* buf) Stores unencoded date and time as an iso datetime with 100-ths of seconds in the buf using iso format like 2010-03-31T12:59:00.33 int wg_strp_iso_date(void* db, char* buf) int wg_strp_iso_time(void* db, char* inbuf) Parses unencoded date or time from the part of iso string like 2010-03-31 or 12:59:00.33 and returns it. int wg_ymd_to_date(void* db, int yr, int mo, int day) int wg_hms_to_time(void* db, int hr, int min, int sec, int prt) Return scalar date or time like the above ISO string parsing functions, except the parameters are given as integer values (for ex: 2010, 1, 7). void wg_date_to_ymd(void* db, int date, int *yr, int *mo, int *day) void wg_time_to_hms(void* db, int time, int *hr, int *min, int *sec, int *prt) Reverse conversion functions for scalar date and time into separate integer values. 
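As a small sketch of how the date and time helpers combine with the encoding functions (here `rec` is assumed to be an existing record with at least two fields):

[source,C]
----
/* store the date 2010-01-07 and the current local time into rec */
int scalar_date = wg_ymd_to_date(db, 2010, 1, 7);  /* unencoded scalar date */
int scalar_time = wg_current_localtime(db);        /* unencoded scalar time */
wg_set_field(db, rec, 0, wg_encode_date(db, scalar_date));
wg_set_field(db, rec, 1, wg_encode_time(db, scalar_time));
----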
wg_int wg_encode_str(void* db, char* str, char* lang) char* wg_decode_str(void* db, wg_int data) char* wg_decode_str_lang(void* db, wg_int data) wg_int wg_decode_str_len(void* db, wg_int data) wg_int wg_decode_str_lang_len(void* db, wg_int data) wg_int wg_decode_str_copy(void* db, wg_int data, char* strbuf, wg_int buflen) wg_int wg_decode_str_lang_copy(void* db, wg_int data, char* langbuf, wg_int buflen) All strings are 0-terminated standard C strings. Lang parameter is the extra-string which may be given 0. Simple decode returns a pointer to the string. `wg_decode_str_copy()` copies the string to the given buffer with a given buflen. A WG_ILLEGAL value is returned in case of encoding error, NULL in case of string decoding errors, -1 in case of length decoding errors. wg_int wg_encode_xmlliteral(void* db, char* str, char* xsdtype) char* wg_decode_xmlliteral_copy(void* db, wg_int data) char* wg_decode_xmlliteral_xsdtype_copy(void* db, wg_int data) wg_int wg_decode_xmlliteral_len(void* db, wg_int data) wg_int wg_decode_xmlliteral_xsdtype_len(void* db, wg_int data) wg_int wg_decode_xmlliteral(void* db, wg_int data, char* strbuf, wg_int buflen) wg_int wg_decode_xmlliteral_xsdtype(void* db, wg_int data, char* strbuf, wg_int buflen) Analogous to str functions, the extra-string represents xmlliteral xsdtype, may be NULL. wg_int wg_encode_uri(void* db, char* str, char* nspace) char* wg_decode_uri(void* db, wg_int data) char* wg_decode_uri_prefix(void* db, wg_int data) wg_int wg_decode_uri_len(void* db, wg_int data) wg_int wg_decode_uri_prefix_len(void* db, wg_int data) wg_int wg_decode_uri_copy(void* db, wg_int data, char* strbuf, wg_int buflen) wg_int wg_decode_uri_prefix_copy(void* db, wg_int data, char* strbuf, wg_int buflen) Analogous to str functions, the extra-string represents uri prefix, may be NULL. wg_int wg_encode_blob(void* db, char* str, char* type, wg_int len) char* wg_decode_blob(void* db, wg_int data) char* wg_decode_blob_type(void* db, wg_int data) wg_int wg_decode_blob_len(void* db, wg_int data) wg_int wg_decode_blob_copy(void* db, wg_int data, char* strbuf, wg_int buflen) wg_int wg_decode_blob_type_len(void* db, wg_int data) wg_int wg_decode_blob_type_copy(void* db, wg_int data, char* langbuf, wg_int buflen) Analogous to str functions, except that: - data is not 0-terminated, length must be always passed. - the extra-string represents blob type, may be NULL wg_int wg_encode_var(void* db, wg_int data) wg_int wg_decode_var(void* db, wg_int data) Data to be encoded is a variable identifier which is an integer. Values up to 28 bit size may be safely used on any modern hardware. Dumping and restoring database contents to/from disk ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Functions: [source,C] ---- wg_int wg_dump(void * db,char* fileName); wg_int wg_import_dump(void * db,char* fileName); wg_int wg_start_logging(void *db); wg_int wg_stop_logging(void *db); wg_int wg_replay_log(void *db, char *filename); ---- Details: wg_int wg_dump(void * db,char* fileName) Dump shared memory database to the disk. If the database has journal logging enabled, this will also restart the journal (creating a fresh journal file). Returns 0 on success, -1 on non-fatal error and -2 on a fatal error. In case of a fatal error, the database is in a corrupt state and should not (or cannot) be used further. wg_int wg_import_dump(void * db,char* fileName) Import database from the disk. 
If the database has journal logging enabled, this will also start the journal log (creating a fresh journal file) when the import is completed. Note that whether the journal is enabled is determined by the *current* memory segment, not the state of the database at the moment the dump was created. Returns 0 on success, -1 on non-fatal error and -2 on a fatal error. In case of a fatal error, the database is in a corrupt state. Otherwise, the import failed (dump file not found or incompatible format), but the memory image was not modified. wg_int wg_start_logging(void *db) Start the journal log. The journal logs are created in the directory determined at compilation time and have a name following the pattern 'wgdb.journal.' where 'shmname' is the name of the database. Call to this function always causes a new journal file to be created. When a previous journal file exists at the time the journal is started, it is backed up into a file named 'wgdb.journal..' where 'serial' is the next available suffix. If there are too many backups already present, the oldest one is overwritten instead. Returns 0 on success, -1 when logging is already active, -2 when the function failed and logging is not active and -3 when additionally, the log file was possibly destroyed NOTE: Normally, the journal is started upon the database creation by calling `wg_attach_logged_database()` and it is not necessary to call this function. wg_int wg_stop_logging(void *db) Suspend the journal log. None of the writes by any client connected to the database will be logged from this point. Returns 0 on success, non-zero on failure. NOTE: Normally it is not necessary to manually stop and start the journal. wg_int wg_replay_log(void *db, char *filename) Restore the database from the journal. If logging is enabled, this function will also suspend the journal during the restore and restart it afterwards (creating a fresh journal file). Returns 0 on success, -1 on non-fatal error and -2 on a fatal error. In case of a fatal error, the database is in a corrupt state. Otherwise, the replay failed, but the database currently in memory was not modified. Journal restarts and filenames ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The current journal file always has the name 'wgdb.journal.' (for example, 'wgdb.journal.1000'). If the database has logging enabled, all of the writes will be recorded in that file. Journal restarts will cause the current journal to be backed up and the 'wgdb.journal.' file will be replaced with a fresh journal. Example 1: Only 'wgdb.journal.99' exists. The database is dumped to the disk, causing a journal restart. The filenames after the restart will be: wgdb.journal.99 --> wgdb.journal.99.0 a new empty journal --> wgdb.journal.99 Example 2: The current journal is 'wgdb.journal.1000'. There is also an older backup with the name 'wgdb.journal.1000.0'. The new filenames: wgdb.journal.1000 --> wgdb.journal.1000.1 wgdb.journal.1000.0 --> wgdb.journal.1000.0 (unchanged) a new empty journal --> wgdb.journal.1000 Example 3: There are 10 backups (the maximum amount the database is configured to handle). The oldest one of them is 'wgdb.journal.1000.0', the newest one is 'wgdb.journal.1000.9'. There is also the current journal 'wgdb.journal.1000'. After the restart, the filenames are: wgdb.journal.1000 --> wgdb.journal.1000.0 (overwriting the oldest backup) wgdb.journal.1000.1 --> wgdb.journal.1000.1 (unchanged) ... 
wgdb.journal.1000.9 --> wgdb.journal.1000.9 (unchanged) a new empty journal --> wgdb.journal.1000 Interaction between dump files and journal ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ If the database has journal logging enabled, the latest database state is normally recoverable by importing the latest dump (if it exists) and replaying the journal created after that dump (or at the initialization of the database, if there is no dump). When dumping the database, the journal will be restarted and will be generated into a new file. Importing a dump will also have the same effect. Journal replay also causes the journal to be restarted so that the point of restore is distinguishable later. However, in this situation, the latest database state will be represented, incrementally, by the latest dump, the recovered journal and the new journal (until a new dump is created). Read and write locking the database for concurrency control ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Functions: [source,C] ---- wg_int wg_start_write(void * dbase); /* start write transaction */ wg_int wg_end_write(void * dbase, wg_int lock); /* end write transaction */ wg_int wg_start_read(void * dbase); /* start read transaction */ wg_int wg_end_read(void * dbase, wg_int lock); /* end read transaction */ ---- Overview ^^^^^^^^ Concurrency control in WhiteDB is achieved using a single database-level shared/exclusive lock. It is implemented independently of the rest of the db API (currently) - therefore use of the locking routines does not automatically guarantee isolation. Generally, a database level lock is characterized by very low overhead but maximum possible contention. This means that processes should spend as little time between acquiring a lock and releasing it, as possible. Implementation and current limitations ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ There are three alternative implementations. - Simple reader-preference lock using a single global spinlock (described by Mellor-Crummey & Scott '92). Reader-preference means that this lock can cause writer starvation. Tests have shown good performance under N>>P conditions (N- number of processes, P- number of CPU-s). - A writer-preference version of the spinlock. - A task-fair lock implemented using a queue. This lock is not susceptible to starvation, but has higher overhead compared to the spinlocks. The waiting processes are synchronized using the futex kernel interface. Current limitations: - dead processes hold locks indefinitely. - maximum timeout with spinlocks is 2000 ms. - the task-fair lock is only supported on Linux. Configuration ^^^^^^^^^^^^^ By default, WhiteDB is compiled with the task-fair lock if it is available and reader-preference spinlock otherwise. The writer-preference lock is selected by `./configure --enable-locking=wpspin`. The reader-preference lock is selected by `./configure --enable-locking=rpspin`. When using manual build, the LOCK_PROTO macro in 'config.h' (or 'config-w32.h') can be modified to select the locking method. For plaforms that do not support the atomic operations, use `./configure --disable-locking` or edit the appropriate header file and comment out the LOCK_PROTO macro. This will allow the code to compile correctly, but the database should be used by a single user or process only. Usage ^^^^^ Getting a shared (read) lock: [source,C] ---- wg_int lock_id; void *db; /* should be initialized before calling wg_start_read() */ ... /* acquire lock. 
This function normally blocks until the lock * is acquired */ lock_id = wg_start_read(db); if(!lock_id) { /* getting the lock failed, do something */ } else { ... one or more database reads ... /* release the lock */ if(!wg_end_read(db, lock_id)) { /* this is unlikely to fail, but if it does, the consequences * could be severe, so this error should also be handled. */ } } ---- Getting an exclusive (write) lock is similar: [source,C] ---- wg_int lock_id; ... /* acquire lock. */ lock_id = wg_start_write(db); if(!lock_id) { /* getting the lock failed, do something */ } else { ... one or more database write operations ... /* release the lock */ if(!wg_end_write(db, lock_id)) { /* handle error */ } } ---- Porting ^^^^^^^ For platforms that do not support either GNU C or Win32 builtin functions that implement the atomic operations in 'dblock.c', appropriate code should be added to each of the platform-specific helper functions. The macro _MM_PAUSE can generally be defined as empty on platforms that do not support the Pentium 4/Athlon64-specific "pause" instruction. This will not have a significant effect (or in other words, the "pause" instruction is only actually useful on the aforementioned processor families). Writing safely without a write lock ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Although it is in general crucial to use wg_start_write before writing any data in a concurrent setting, in simple special cases it is possible to safely avoid write locks while writing data. The following atomic functions all assume that the field contains an immediate value (NULL, short integer, char, date or time), the value written is also immediate, the field is not indexed and logging is not activated. This guarantees that no allocation operations are performed and thus it is safe to rely on read locks (wg_start_read and wg_end_read) while writing data: [source,C] ---- wg_int wg_update_atomic_field(void* db, void* record, wg_int fieldnr, wg_int data, wg_int old_data); wg_int wg_set_atomic_field(void* db, void* record, wg_int fieldnr, wg_int data); wg_int wg_add_int_atomic_field(void* db, void* record, wg_int fieldnr, int data); ---- Details: wg_update_atomic_field(void* db, void* record, wg_int fieldnr, wg_int data, wg_int old_data); Given the assumptions described before, write data to a field which currently contains old_data. If the field does not contain old_data while writing, an error is generated and writing is cancelled: this is checked by the atomic compare-and-swap operation. Returns 0 if the operation was successful. wg_set_atomic_field(void* db, void* record, wg_int fieldnr, wg_int data); Perform the wg_update_atomic_field operation using the current value as old_data iteratively until it succeeds. In a normal situation the operation is expected to succeed immediately without any iterations. All the preconditions described before are checked. Returns 0 if the operation was successful. wg_add_int_atomic_field(void* db, void* record, wg_int fieldnr, int data); Increase or decrease an existing short integer value in the field by adding integer data to this value. Performs an atomic update operation iteratively until it succeeds. In a normal situation the operation is expected to succeed immediately without any iterations. All the preconditions described before are checked. Returns 0 if the operation was successful.
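As a sketch of the intended usage pattern, the following increments a small integer counter while holding only a read lock; it assumes that field 2 of `rec` already holds a short integer and is not indexed:

[source,C]
----
wg_int lock_id = wg_start_read(db);   /* a read lock is sufficient here */
if(lock_id) {
  /* atomically add 1 to the short integer in field 2 of rec */
  if(wg_add_int_atomic_field(db, rec, 2, 1)) {
    /* non-zero means the atomic update failed, handle the error */
  }
  wg_end_read(db, lock_id);
}
----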
The three atomic functions may return any of these errors: - -1 if wrong db pointer - -2 if wrong fieldnr - -10 if new value non-immediate - -11 if old value non-immediate - -12 if cannot fetch old data - -13 if the field has an index - -14 if logging is active - -15 if the field value has been changed from old_data - -16 if the result of the addition does not fit into a smallint - -17 if atomic assignment failed after a large number (1000) of tries Utilities ~~~~~~~~~ void wg_print_db(void *db) Print entire database contents in stdout, row by row. void wg_print_record(void *db, wg_int* rec) Print just one row, pointed to by rec. void wg_snprint_value(void *db, wg_int enc, char *buf, int buflen) Print a single, encoded value into a character buffer. wg_int wg_parse_and_encode(void *db, char *buf) Parse value from a string, encode it for WhiteDB. Returns WG_ILLEGAL if value could not be parsed or encoded. Following types are detected automatically from the input: - NULL - empty string - int - plain integer - double - floating point number in fixed decimal notation - date - ISO8601 date - time - ISO8601 time+fractions of second. - string - input data that does not match the above types Does NOT support ambiguous types: - fixpoint - floating point number in fixed decimal notation - uri - string starting with an URI prefix - char - single character Does NOT support types which would require a special encoding scheme in string form: record, XML literal, blob, anon const, variables Note that double values need to have CSV_DECIMAL_SEPARATOR as the decimal marker, independent of the system locale settings. wg_int wg_parse_and_encode_param(void *db, char *buf) Like `wg_parse_and_encode()`, except the returned value is encoded as a query parameter. Values encoded like this should be freed with wg_free_query_param() and cannot be used interchangeably with other encoded values. Query functions ~~~~~~~~~~~~~~~ [source,C] ---- wg_query *wg_make_query(void *db, void *matchrec, wg_int reclen, wg_query_arg *arglist, wg_int argc); void *wg_fetch(void *db, wg_query *query); void wg_free_query(void *db, wg_query *query); wg_int wg_encode_query_param_null(void *db, char *data); wg_int wg_encode_query_param_record(void *db, void *data); wg_int wg_encode_query_param_char(void *db, char data); wg_int wg_encode_query_param_fixpoint(void *db, double data); wg_int wg_encode_query_param_date(void *db, int data); wg_int wg_encode_query_param_time(void *db, int data); wg_int wg_encode_query_param_var(void *db, wg_int data); wg_int wg_encode_query_param_int(void *db, wg_int data); wg_int wg_encode_query_param_double(void *db, double data); wg_int wg_encode_query_param_str(void *db, char *data, char *lang); wg_int wg_encode_query_param_xmlliteral(void *db, char *data, char *xsdtype); wg_int wg_encode_query_param_uri(void *db, char *data, char *prefix); wg_int wg_free_query_param(void* db, wg_int data); ---- wg_query *wg_make_query(void *db, void *matchrec, wg_int reclen, wg_query_arg *arglist, wg_int argc) Build a query using parameters in match record and argument list formats. The match record is an array of encoded values of wg_int type. This can either be allocated by the caller, in which case the reclen should contain the size of the array, or point to an existing database record, in which case reclen must be zero. 
The argument list format consists of an array of: [source,C] ---- typedef struct { gint column; /** column (field) number this argument applies to */ gint cond; /** condition (equal, less than, etc) */ gint value; /** encoded value */ } wg_query_arg; ---- Available conditions are: WG_COND_EQUAL = WG_COND_NOT_EQUAL != WG_COND_LESSTHAN < WG_COND_GREATER > WG_COND_LTEQUAL <= WG_COND_GTEQUAL >= argc is the size of the array (at least 1 is required if arglist parameter is given). The function returns NULL if there is an error, otherwise a pointer to a query object is returned. When the query is no longer used, wg_free_query() should be called to release its memory. If arglist and matchrec are NULL, the query has no parameters and will return all the rows in the database. void *wg_fetch(void *db, wg_query *query) Fetch next row from the query result. Returns a pointer to the next row (same as `wg_get_next_record()`). Returns NULL if there are no more rows. void wg_free_query(void *db, wg_query *query) Release the memory pointed to by query. wg_int wg_encode_query_param_*() Family of functions to prepare the parameters for `wg_make_query()`. They return a WhiteDB encoded value when successful or WG_ILLEGAL on failure. Locking the database when using these functions is not required, since they do not access shared memory. wg_int wg_free_query_param(void* db, wg_int data) Free the storage allocated for the encoded data which has been prepared with the `wg_encode_query_param_*()` family of functions. It is not advisable to call this on data encoded with other functions. Simplified query functions ^^^^^^^^^^^^^^^^^^^^^^^^^^ [source,C] ---- void *wg_find_record(void *db, wg_int fieldnr, wg_int cond, wg_int data, void* lastrecord); void *wg_find_record_null(void *db, wg_int fieldnr, wg_int cond, char *data, void* lastrecord); void *wg_find_record_record(void *db, wg_int fieldnr, wg_int cond, void *data, void* lastrecord); void *wg_find_record_char(void *db, wg_int fieldnr, wg_int cond, char data, void* lastrecord); void *wg_find_record_fixpoint(void *db, wg_int fieldnr, wg_int cond, double data, void* lastrecord); void *wg_find_record_date(void *db, wg_int fieldnr, wg_int cond, int data, void* lastrecord); void *wg_find_record_time(void *db, wg_int fieldnr, wg_int cond, int data, void* lastrecord); void *wg_find_record_var(void *db, wg_int fieldnr, wg_int cond, wg_int data, void* lastrecord); void *wg_find_record_int(void *db, wg_int fieldnr, wg_int cond, int data, void* lastrecord); void *wg_find_record_double(void *db, wg_int fieldnr, wg_int cond, double data, void* lastrecord); void *wg_find_record_str(void *db, wg_int fieldnr, wg_int cond, char *data, void* lastrecord); void *wg_find_record_xmlliteral(void *db, wg_int fieldnr, wg_int cond, char *data, char *xsdtype, void* lastrecord); void *wg_find_record_uri(void *db, wg_int fieldnr, wg_int cond, char *data, char *prefix, void* lastrecord); ---- These functions provide a simplified alternative to the query functions. void *wg_find_record(void *db, wg_int fieldnr, wg_int cond, wg_int data, void* lastrecord); Returns the first record in the database where "fieldnr" "cond" "data" is true. `data` is an encoded value. `cond` is one of the conditions listed under "Query functions". The `wg_find_record_*()` group of functions are convenience functions for using unencoded data directly. The user is not required to encode or free encoded data when using these functions.
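To contrast the two interfaces before the comparison below, here is a minimal sketch using the full query interface to fetch all records whose field 0 equals the string "foo" (the column number and value are arbitrary examples):

[source,C]
----
wg_query *query;
wg_query_arg arglist[1];
void *rec;

arglist[0].column = 0;
arglist[0].cond = WG_COND_EQUAL;
arglist[0].value = wg_encode_query_param_str(db, "foo", NULL);

query = wg_make_query(db, NULL, 0, arglist, 1);
if(query) {
  while((rec = wg_fetch(db, query)) != NULL) {
    /* do something with each matching record */
  }
  wg_free_query(db, query);
}
wg_free_query_param(db, arglist[0].value);
----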
Comparison of the query interfaces ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |========================================================================= | | Full query | simplified query | query type | conjunctive | one clause only | without index | slower | faster | with index, fetch one row | slower | faster | with index, fetch many (>5) rows | faster | much slower | isolation | "read commited" | none |========================================================================= NOTE: the isolation level given here is only an approximation. Up to "serializable" is currently possible with the use of `wg_start_*()` and `wg_end_*()` functions, but this may become relaxed during future development. Child databases ~~~~~~~~~~~~~~~ Note: child db is not compiled in by default. Use `./configure --enable-childdb` or for manual build, edit the appropriate 'config-xxx.h' file and enable the USE_CHILD_DB macro. wg_int wg_register_external_db(void *db, void *extdb) Store information in db about an external database extdb. This allows storing data from extdb inside db. Returns 0 on success, negative on error. wg_int wg_encode_external_data(void *db, void *extdb, wg_int encoded) Translate an encoded value from extdb to another encoded value which may be stored into db. Physically the data (assuming there is any memory allocated) continues to reside in extdb. Child databases are databases which contain references to data (fields and records) located in another database, called parent. The requirement is that both the child and parent are located in the same virtual address space. A typical scenario is that a "main" shared memory database is used as the parent and temporary, local memory databases are created as children. Main difference between referring to local and external data is that external references are (intentionally) not tracked by the parent database. This allows instantly deleting the child databases. On the other hand, extra measures must be taken to ensure that the referenced external data stays intact while in use by the child database. Read locking the parent database should be sufficient there. Typical usage scenario ^^^^^^^^^^^^^^^^^^^^^^ (assuming parent is already created) Create a child database and assign the parent. [source,C] ---- childdb = wg_attach_local_database(size); wg_register_external_db(childdb, parentdb); ---- Use parent data in child database. Encoded data from parent database must be re-encoded before writing it to the child database. [source,C] ---- tmp = wg_encode_external_data(childdb, parentdb, parentdata); wg_set_field(childdb, childrec, 0, tmp); ---- Free child database, when done. [source,C] wg_delete_local_database(childdb); There are three main restrictions when using external references: - External references may not be written into shared memory databases. For this reason, `wg_register_external_db()` may only be called with a local (non-shared) database as the first argument. - once an external database X is registered inside another database Y, the database Y may no longer be dumped/restored. - A database that contains external references cannot be indexed. Getting information about the database state ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [source,C] ---- wg_int wg_database_freesize(void *db); wg_int wg_database_size(void *db); ---- These functions provide information about the database size and available free space. wg_int wg_database_size(void *db) Returns the total memory segment size for the database, in bytes. 
wg_int wg_database_freesize(void *db) Returns the amount of free space in the database memory segment, in bytes. Note that this is a conservative estimate, meaning that the actual amount of free space may be more, but no less, than reported. RDF parsing / exporting API --------------------------- This API is dependent on libraptor. It is not available on Win32. When compiling WhiteDB without autotools (using `compile.sh`) the API can be enabled by defining HAVE_RAPTOR in 'config.h' and modifying build scripts to link with appropriate libraries. [source,C] ---- #include "rdfapi.h" wg_int wg_import_raptor_file(void *db, wg_int pref_fields, wg_int suff_fields, wg_int (*callback) (void *, void *), char *filename); wg_int wg_import_raptor_rdfxml_file(void *db, wg_int pref_fields, wg_int suff_fields, wg_int (*callback) (void *, void *), char *filename); wg_int wg_rdfparse_default_callback(void *db, void *rec); wg_int wg_export_raptor_file(void *db, wg_int pref_fields, char *filename, char *serializer); wg_int wg_export_raptor_rdfxml_file(void *db, wg_int pref_fields, char *filename); ---- wg_int wg_import_raptor_file(void *db, wg_int pref_fields, wg_int suff_fields, wg_int (*callback) (void *, void *), char *filename) Imports an RDF file. Creates records with length = pref_fields + 3 + suff_fields. The data will be stored as follows: | pref_fields .. | predicate | subject | object | suff_fields | The file type is determined automatically from filename. Callback function should match the prototype of `wg_rdfparse_default_callback()` and can be used to calculate contents of fields other than the RDF triple. wg_int wg_import_raptor_rdfxml_file(void *db, wg_int pref_fields, wg_int suff_fields, wg_int (*callback) (void *, void *), char *filename) As above, but the file type is assumed to be RDF/XML. wg_int wg_rdfparse_default_callback(void *db, void *rec) Does nothing. Called when importing rdf files with the 'wgdb' commandline tool. May be modified to add field initialization functionality to commandline importing. wg_int wg_export_raptor_file(void *db, wg_int pref_fields, char *filename, char *serializer) Export triple data to file. The format is selected by the raptor serializer (more info about serializers can be found at http://librdf.org/raptor/. There is also a serializers enumeration function in the libraptor API). The pref_fields parameter marks the start position of the triple in WhiteDB records (the storage schema is assumed to be the same as described above for the wg_import_raptor_file() function). wg_int wg_export_raptor_rdfxml_file(void *db, wg_int pref_fields, char *filename) Export triple data to file in RDF/XML format. Index API --------- [source,C] ---- #include <whitedb/indexapi.h> wg_int wg_create_index(void *db, wg_int column, wg_int type, wg_int *matchrec, wg_int reclen); wg_int wg_drop_index(void *db, wg_int index_id); wg_int wg_column_to_index_id(void *db, wg_int column, wg_int type, wg_int *matchrec, wg_int reclen); wg_int wg_get_index_type(void *db, wg_int index_id); void * wg_get_index_template(void *db, wg_int index_id, wg_int *reclen); void * wg_get_all_indexes(void *db, wg_int *count); ---- The index API header exposes functions to create and drop indexes. wg_int wg_create_index(void *db, wg_int column, wg_int type, wg_int *matchrec, wg_int reclen) Create an index on column. Index type must be specified. Currently supported index types: WG_INDEX_TYPE_TTREE - T-tree index on single column If matchrec is NULL, a normal index is created. If matchrec is non-null, the index will be created with a template.
In this case reclen must specify the length of the array pointed to by matchrec. If an index has a template, only records that match the template are inserted into the index. Wildcards in the template are specified using WG_VARTYPE values. This function returns 0 if successful and non-0 in case of an error. wg_int wg_drop_index(void *db, wg_int index_id) Delete the specified index. Returns 0 on success, non-0 on error. wg_int wg_column_to_index_id(void *db, wg_int column, wg_int type, wg_int *matchrec, wg_int reclen) Find an index on a column. If type is specified, the first index with a matching type is returned. If type is 0, indexes of any type may be returned. If matchrec is non-NULL and WhiteDB is configured with USE_INDEX_TEMPLATE option, the provided match record will be used to locate an index with a specified template. If matchrec is NULL, this function finds a full index. Returns an index id on success. Returns -1 on error. wg_int wg_get_index_type(void *db, wg_int index_id) Finds index type. Returns type (>0) on success, -1 if the index was not found. void * wg_get_index_template(void *db, wg_int index_id, wg_int *reclen) Finds index template. Returns a pointer to the gint array used for the index template. reclen is set to the length of the array. The pointer may not be freed and it's contents should be accessed read-only. If the index is not found or has no template, NULL is returned. In that case contents of *reclen are unmodified. void * wg_get_all_indexes(void *db, wg_int *count) Returns a pointer to a NEW allocated array of index id-s. count is initialized to the number of indexes in the array. Returns NULL if there are no indexes. Examples ~~~~~~~~ Create a T-tree index on a column conditionally: [source,C] ---- if(wg_column_to_index_id(db, col, WG_INDEX_TYPE_TTREE, NULL, 0) == -1) { if(wg_create_index(db, col, WG_INDEX_TYPE_TTREE, NULL, 0)) { printf("index creation failed.\n"); } else { printf("index created.\n"); } } ---- Create an index on column 0 that only contains rows where the 2-nd column is equal to 6 (requires that WhiteDB is compiled with USE_INDEX_TEMPLATE defined in config.h): [source,C] ---- wg_int matchrec[3]; matchrec[0] = wg_encode_var(db, 0); matchrec[1] = wg_encode_var(db, 0); matchrec[2] = wg_encode_int(db, 6); if(wg_create_index(db, 0, WG_INDEX_TYPE_TTREE, matchrec, 3)) { printf("index creation failed.\n"); } ---- Delete all indexes in the database that have a template: [source,C] ---- wg_int *indexes = wg_get_all_indexes(db, &count); for(i=0; i /* or #include on Windows */ int main(int argc, char **argv) { void *db; db = wg_attach_database("1000", 2000000); return 0; } ---- First, the program needs to include the API headers. There are a few other header files distributed with WhiteDB, but 'dbapi.h' is the one we'll need for now. NOTE: The programs in the 'Examples' directory use a different way of including the headers, by referring to their location directly. This is so that the examples can be compiled before the installation of the database and it is perfectly acceptable - but let's stick to the standard way of using library headers in this tutorial. `void *db` is the database handle. Once we have the handle, we can use it in all the subsequent database operations - it will always point to the same database we originally attached to. Why stress that it is the same database? WhiteDB allows using multiple databases in parallel, without any prior configuration. 
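For instance, nothing stops us from attaching to two differently named segments at once (the second name and the sizes below are made up for illustration):

[source,C]
----
void *db1 = wg_attach_database("1000", 2000000);
void *db2 = wg_attach_database("1001", 2000000);
/* db1 and db2 now point to two independent shared memory databases */
----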
The number "1000" we give to the `wg_attach_database()` function is the key that refers to the shared memory segment containing our database. Observe that when using `wg_attach_database()`, it does not matter whether the database already exists or not - it will be created, if necessary. The size of the database will be the one we supplied, 2MB in this case. When the program exits, the database will remain in memory. When you have already created a database in shared memory, you can later use `wg_attach_existing_database(dbname)` which functions exactly as `wg_attach_database(dbname,...)` but does not create a new database. If no database with the name `dbname` is found, it simply returns NULL. This is quite handy when you want to avoid creating a new base or just want to check whether it exists already. Adding data to the database --------------------------- An empty database isn't usually much of a practical use, so we need to learn how to populate it with data. It is actually a three-step process: creating a record, encoding the data and writing to the fields of the records. Records ~~~~~~~ A WhiteDB record is a n-tuple of encoded data. The n refers to the length of the record and there is no specific limit except that it must fit inside the database memory segment (of course, the size is given as `wg_int`, the universal datatype of WhiteDB, which itself has a maximum value, but this is quite large, especially on a 64-bit system). void *rec = wg_create_record(db, 10); The datatype of the record is `void *`, just like the database handle. Now we can use `rec` any time we need to do something with the record we've created. By the way, the records do not all need to be the same size, so we could do void *rec2 = wg_create_record(db, 2); and have two records, one of them 10 fields and the other 2 fields in size. However, the size is final and cannot be changed later. Data in WhiteDB ~~~~~~~~~~~~~~~ An important distinction between WhiteDB and traditional databases is that the user can and in some cases must pop the hood open and get their hands dirty. Data encoding is one of such cases. Everything inside the database is a "WhiteDB int", or a `wg_int` when we're writing C code. These are basically numbers (32-bit or 64-bit integers, depending on your system), but for WhiteDB they contain encoded pieces of information - type of a value and the value itself or some way to access the value. So whenever we need to write something, be it a string, a number or a date to the database, first we have to encode it so that WhiteDB is ready to handle it. wg_int enc = wg_encode_int(db, 443); wg_int enc2 = wg_encode_str(db, "this is my string", NULL); The first line should be self-explanatory - `enc` is now 443 in WhiteDB's internal format. When encoding a string, be aware that the string itself will be written to the database memory segment at that point - the encoded value `enc2` will merely contain a reference to it. Also, there is a third parameter which we can ignore for simple applications. Setting field values ~~~~~~~~~~~~~~~~~~~~ You may be asking yourself why do we need to bother with encoding the values when we could simply write things like integers or character arrays directly. The main reason for that is that WhiteDB is schemaless. When we created records, we did not specify what type any of the fields were - they can be of any type. 
The encoded value is how WhiteDB can tell what type of data it is dealing with, since field 1 could be an integer in one record, a floating-point number in another one and so on. With that out of the way, let's take our encoded data and store it properly in the database: wg_set_field(db, rec, 7, enc); wg_set_field(db, rec2, 0, enc2); Field 7 of the first record now contains 443 and field 0 of the second record (which has two fields, field 0 and field 1) contains "this is my string". We didn't touch any of the other fields and if we were to look at the contents of the records now, these would be filled with NULL values. Each time a new record is created, it initially contains a row of NULL-s which the user can then overwrite with their own data. Here is our complete example ('Examples/tut2.c'): [source,C] ---- #include int main(int argc, char **argv) { void *db, *rec, *rec2; wg_int enc, enc2; db = wg_attach_database("1000", 2000000); rec = wg_create_record(db, 10); rec2 = wg_create_record(db, 2); enc = wg_encode_int(db, 443); enc2 = wg_encode_str(db, "this is my string", NULL); wg_set_field(db, rec, 7, enc); wg_set_field(db, rec2, 0, enc2); return 0; } ---- It is likely that you need to deal with more types than just strings and integers. The manual will provide a full list of supported types. The wgdb utility ---------------- Once you've started working with WhiteDB, the `wgdb` tool may come in handy to manage the databases, so let's take a quick look at it. First we deal with database persistence, you may skip to "Looking at data" if you're not on Windows. Database persistence on Windows ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The way shared memory works on Windows is that it is only present as long as there is a program holding a handle to it. So when we compile and run the previous example, the data gets written to the memory but then the program terminates and the database immediately disappears. To get around that, run wgdb.exe 1000 server 2000000 in another window. That will keep the shared memory present, until you press CTRL+C. You can now run the tutorial programs and the following examples should work. Looking at data ~~~~~~~~~~~~~~~ If you ran the program from the previous section, there should be some records in memory now. Let's take a look: wgdb 1000 select 20 It should return something like this: [NULL,NULL,NULL,NULL,NULL,NULL,NULL,443,NULL,NULL] ["this is my string",NULL] The "1000" in the command is the same shared memory key we used earlier. "select" prints records from the database and "20" limits the maximum number of records that will be shown. There is also a query command that lets you specify which records you are interested in: wgdb 1000 query 7 = 443 That will only return the first record, the one where field 7 equals 443. There are other comparison operators: "!=" for not equal, "<" for less than, "<=" for less than or equal and so forth. Currently the query command does not have a row limit parameter. Modifying data ~~~~~~~~~~~~~~ The command line tools allows some data manipulation: deleting and adding records. The "del" command has the same syntax as the query command, so wgdb 1000 del 7 = 443 will delete the first row from the database. We can also add records, but only integer and string values are recognized this way - dealing with other types unambiguously would become complicated. wgdb 1000 add 1 2 3 This created a record with the length 3 and inserted three integer values in it. Let's see what the database now contains. 
By the way, since "1000" is the default key, we may omit it:

  wgdb select 20

The entire contents of the database will now be:

  ["this is my string",NULL]
  [1,2,3]

Freeing the memory
~~~~~~~~~~~~~~~~~~

At some point we may need to delete the database for whatever reason. The
`wgdb` tool will help:

  wgdb 1000 free

The database with the given key will be freed. Again, "1000" may be omitted
as it is the default.

Making queries
--------------

Finding matching records
~~~~~~~~~~~~~~~~~~~~~~~~

Finding records that match some condition is easy:

  void *rec = wg_find_record_int(db, 7, WG_COND_EQUAL, 443, NULL);

This returns the first record that has the integer 443 in field 7. That much
is obvious, but some of the parameters might need extra explanation.

First, just a reminder that `db` is the database handle we've been using each
time we call a WhiteDB function. As a second parameter we give the number of
the field that the database engine should check against the value we've given.

The third parameter is the condition: we need some way of stating that we want
records where "field 7" "equals" "443" so that is the "equals" part. There are
other conditions, for example, if we substituted WG_COND_LESSTHAN there, we
would receive a record where the value in field 7 is less than 443. The full
list of possible conditions is given in 'Manual.txt'.

The fourth parameter is, of course, the value. The function we called ended
with '_int' and that parameter should also be of the type `int`. There is a
function for most of WhiteDB's datatypes, so if we were looking for a string
we would use `wg_find_record_str()` instead.

Now let's turn our attention to the mysterious NULL parameter. Remember that
our example function call returned the *first* record that matched our
parameters? What if there are more matching records and we want to find those
too? That can be done:

  void *nextrec = wg_find_record_int(db, 7, WG_COND_EQUAL, 443, rec);

Instead of the NULL, we can give the record that the function returned last
time and WhiteDB will return the next one that matches the same condition.

The following example will call `wg_find_record_int()` in a cycle, finding all
the matching records from the database. We're adding some records, so it will
print "Found a record..." at least once. Run it multiple times and see the
number of matching records increase ('Examples/tut3.c'):

[source,C]
----
#include <stdio.h>
#include <whitedb/dbapi.h>

int main(int argc, char **argv) {
  void *db, *rec;
  wg_int enc;

  db = wg_attach_database("1000", 2000000);

  /* create some records for testing */
  rec = wg_create_record(db, 10);
  enc = wg_encode_int(db, 443); /* will match */
  wg_set_field(db, rec, 7, enc);

  rec = wg_create_record(db, 10);
  enc = wg_encode_int(db, 442);
  wg_set_field(db, rec, 7, enc); /* will not match */

  /* now find the records that match our condition
   * "field 7 equals 443" */
  rec = wg_find_record_int(db, 7, WG_COND_EQUAL, 443, NULL);
  while(rec) {
    printf("Found a record where field 7 is 443\n");
    rec = wg_find_record_int(db, 7, WG_COND_EQUAL, 443, rec);
  }
  return 0;
}
----

Full query interface
~~~~~~~~~~~~~~~~~~~~

The above method of finding data is convenient, but it can be too limited (and
in some specific cases inefficient) so eventually we may need to make use of
full queries. The query API has a number of features we will not be discussing
here, instead we'll look at only the basic steps.

Running a query in WhiteDB consists of preparing the argument list, creating
the query itself and then fetching the matching records from the query.
  wg_query_arg arglist[2];

  arglist[0].column = 7;
  arglist[0].cond = WG_COND_EQUAL;
  arglist[0].value = wg_encode_query_param_int(db, 443);

The `wg_query_arg` type is where we store one condition that the returned
records should match (or, a "clause" of the query, if you like more exact
terminology). Here we've specified again that we'd like to find records where
"field 7 equals 443".

Notice that we declared `arglist` as an array of 2 elements? Well, this is
because we can give more than one argument:

  arglist[1].column = 6;
  arglist[1].cond = WG_COND_EQUAL;
  arglist[1].value = wg_encode_query_param_null(db, NULL);

Now we're looking for records where "field 7 equals 443 and field 6 equals
NULL". The value for both arguments is encoded and it's recommended to use the
special `wg_encode_query_param_*()` functions for that purpose.

  wg_query *query = wg_make_query(db, NULL, 0, arglist, 2);

We pass the argument list and its size, which is 2, to WhiteDB (ignore the
other parameters for now, they're not used if you use the argument list). We
receive a query object in return and are finally ready to start fetching
records:

  void *rec = wg_fetch(db, query);

The `wg_fetch()` function will return a different record each time you call it
and eventually it will return NULL, meaning that you've already fetched all
the records that match our argument list.

Finally we should do some housekeeping, as queries may take up quite a bit of
memory:

  wg_free_query(db, query);
  wg_free_query_param(db, arglist[0].value);
  wg_free_query_param(db, arglist[1].value);

That was quite a bit of work to do essentially the same thing we achieved with
the help of just one function earlier, but it will also give you more power
and flexibility. The following program ('Examples/tut4.c') summarizes what
we've looked at here:

[source,C]
----
#include <stdio.h>
#include <whitedb/dbapi.h>

int main(int argc, char **argv) {
  void *db, *rec;
  wg_int enc;
  wg_query_arg arglist[2]; /* holds the arguments to the query */
  wg_query *query;         /* used to fetch the query results */

  db = wg_attach_database("1000", 2000000);

  /* just in case, create some records for testing */
  rec = wg_create_record(db, 10);
  enc = wg_encode_int(db, 443); /* will match */
  wg_set_field(db, rec, 7, enc);

  rec = wg_create_record(db, 10);
  enc = wg_encode_int(db, 442);
  wg_set_field(db, rec, 7, enc); /* will not match */

  /* now find the records that match the condition
   * "field 7 equals 443 and field 6 equals NULL". The
   * second part is a bit redundant but we're adding it
   * to show the use of the argument list.
   */
  arglist[0].column = 7;
  arglist[0].cond = WG_COND_EQUAL;
  arglist[0].value = wg_encode_query_param_int(db, 443);

  arglist[1].column = 6;
  arglist[1].cond = WG_COND_EQUAL;
  arglist[1].value = wg_encode_query_param_null(db, NULL);

  query = wg_make_query(db, NULL, 0, arglist, 2);

  while((rec = wg_fetch(db, query))) {
    printf("Found a record where field 7 is 443 and field 6 is NULL\n");
  }

  /* Free the memory allocated for the query */
  wg_free_query(db, query);
  wg_free_query_param(db, arglist[0].value);
  wg_free_query_param(db, arglist[1].value);

  return 0;
}
----

You may run this program a couple of times, then run `wgdb select 20` to
verify that the tutorial program prints the correct number of rows.

Doing things properly
---------------------

The examples we've been following up to now have been a bit sloppy. We haven't
bothered to check whether the WhiteDB functions fail or succeed, nor to clean
up after we were done - there was only the bit about freeing queries which was
just too important to ignore.
The first thing you should consider is that attaching to a database can
sometimes fail. For example, it is possible that you requested more shared
memory than the system configuration allows. So do a check like this:

  void *db = wg_attach_database("1000", 1000000);
  if(db == NULL) {
    /* do something to handle the error */
  }

Creating records or encoding data can fail if the database is full - actually
a common occurrence since memory databases are naturally smaller than the
traditional disk-based ones.

  void *rec = wg_create_record(db, 1000);
  if(rec == NULL) {
    /* record was not created, can't use it */
  }

  wg_int enc = wg_encode_str(db, "This could fail", NULL);
  if(enc == WG_ILLEGAL) {
    /* string encoding failed */
  }

Notice the WG_ILLEGAL value? This is a special encoded value that WhiteDB uses
to tell you that something went wrong. All `wg_encode_*()` functions return
that so it is easy to check for encode errors.

Decoding values can also fail. The most obvious case is when you expect some
field in a record to be of a certain type, but some other program or user has
written a different value there. This can sometimes be tricky to detect so you
should consult the 'Manual.txt' file on how a particular `wg_decode_*()`
function behaves. If the database is used in a way that makes it difficult to
predict what type of data a field contains, there is a way to find out:

  if(wg_get_field_type(db, rec, 0) == WG_STRTYPE) {
    printf("Field 0 in rec is a string\n");
  }

Finally, here's something that is nice to do whenever a program stops using a
database:

  wg_detach_database(db);

This detaches us from the shared memory and also frees any memory that may be
allocated for the database handle.

Let's try to apply all of those things in practice ('Examples/tut5.c'). The
program creates records in a loop and writes a short string value in each of
them. We have a small database, so soon we'll run out of space, causing some
errors which we are now able to cope with:

[source,C]
----
#include <stdio.h>
#include <stdlib.h>
#include <whitedb/dbapi.h>

int main(int argc, char **argv) {
  void *db, *rec, *lastrec;
  wg_int enc;
  int i;

  db = wg_attach_database("1000", 1000000); /* 1MB should fill up fast */
  if(!db) {
    printf("ERR: Could not attach to database.\n");
    exit(1);
  }

  lastrec = NULL;
  for(i=0;;i++) {
    char buf[20];

    rec = wg_create_record(db, 1);
    if(!rec) {
      printf("ERR: Failed to create a record (made %d so far)\n", i);
      break;
    }
    lastrec = rec;

    sprintf(buf, "%d", i); /* better to use snprintf() in real applications */
    enc = wg_encode_str(db, buf, NULL);
    if(enc == WG_ILLEGAL) {
      printf("ERR: Failed to encode a string (%d records currently)\n", i+1);
      break;
    }

    if(wg_set_field(db, rec, 0, enc)) {
      printf("ERR: This error is less likely, but wg_set_field() failed.\n");
      break;
    }
  }

  /* For educational purposes, let's pretend we're interested in what's
   * stored in the last record. */
  if(lastrec) {
    char *str = wg_decode_str(db, wg_get_field(db, lastrec, 0));
    if(!str) {
      printf("ERR: Decoding the string field failed.\n");
      if(wg_get_field_type(db, lastrec, 0) != WG_STRTYPE) {
        printf("ERR: The field type is not string - "
          "should have checked that first!\n");
      }
    }
  }

  wg_detach_database(db);
  return 0;
}
----

To try this out, let's first try to cause an error on attaching. Type these
commands:

  wgdb free
  wgdb create 999999

Then run the example program. Since we've already created a database and it's
smaller than what the program is requesting, it should complain and exit
early. When done with this, type `wgdb free` again to delete the smaller
database.
Run the example program again, this time there is nothing in the way so it can create the database named "1000" itself. The results can be a bit unpredictable, but you should either see a record creation error or a number of errors related to a string field. Of course, this does not mean that the program failed or did nothing useful - a number of records were created successfully, we just eventually ran out of database space. Everything our program printed has "ERR:" in front of it - the rest of the messages come from WhiteDB. Parallel use ------------ WhiteDB never locks the database for you. Whenever something is read or written, the engine just goes and does it without checking whether some other program (or user) is currently using the database. This goes with the philosophy of speed and simplicity. But there are many use cases where parallel use of the database is needed and in those cases everybody cannot just crowd the database at the same time and start making changes to the shared memory area - that could result in inconsistent data, or worse, a corrupt and useless database. Fortunately, WhiteDB does provide the user with the tools to handle that. The rule of thumb is that you need these concurrency control functions whenever the database is *both read and written* by several processes and possibly at the same time. For example, if a database is serving data to a webserver and there are occasional updates to the data (without shutting down the webserver), that would qualify as needing concurrency control. To implement this concurrency control, we first request permission to read whenever we are about to read something from the database and similarly declare our intention to write to the database. Once we're done we inform the database engine that we're finished so others may proceed. So, wg_int lock_id = wg_start_read(db); /* do some reading */ wg_end_read(db, lock_id); requests permission to read. There may be some time until the function `wg_start_read()` returns - it may need to wait for some other process to finish whatever it is doing with the database. Once it returns the `lock_id`, we have shared access to the database - we may read it safely and so may other processes, but no one can write anything. `wg_end_read()` declares that we no longer need the read access. It is quite possible that `wg_start_read()` fails - it can happen under heavy load or if some other process is hogging the database for a long time. We should always check: lock_id = wg_start_read(db); if(!lock_id) { printf("wg_start_read() timed out\n"); exit(1); /* or go and retry, whatever is appropriate */ } Getting write access is similar, the major difference is that once we get the permission, we have exclusive access - everyone else has to wait until we're done adding or updating the data: lock_id = wg_start_write(db); if(!lock_id) { printf("wg_start_write() timed out\n"); exit(1); } /* do some writing */ wg_end_write(db, lock_id); NOTE: At present time these functions behave exactly like operating on a single big database-level lock. This tutorial does not make a secret of it, however, the future direction may be that they start behaving more like transactions. The important thing to remember is that the purpose is to allow you to read and write data safely, without corruption and inconsistency. By following the pattern described here, your program will continue to work unmodified, no matter how WhiteDB implements things internally. 
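If giving up on a timed-out lock is not acceptable, the lock call can simply be retried. The sketch below shows one way to do that; the retry count and the one-second pause are arbitrary choices, and `sleep()` from 'unistd.h' assumes a POSIX system:

[source,C]
----
#include <stdio.h>
#include <unistd.h>           /* sleep(); assumes a POSIX system */
#include <whitedb/dbapi.h>

int main(int argc, char **argv) {
  void *db = wg_attach_database("1000", 2000000);
  wg_int lock_id = 0;
  int attempts = 5;           /* arbitrary retry count */

  if(!db) return 1;

  /* keep asking for the write lock until we get it or run out of attempts */
  while(attempts-- && !(lock_id = wg_start_write(db)))
    sleep(1);                 /* arbitrary pause between attempts */

  if(!lock_id) {
    printf("could not get the write lock, giving up\n");
  } else {
    /* do some writing */
    wg_end_write(db, lock_id);
  }
  wg_detach_database(db);
  return 0;
}
----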
To illustrate parallel use, we will implement a counter that is incremented from two programs simultaneously. This kind of example is frequently used in parallel programming tutorials, when done naively the counter counts incorrectly because the processes end up ignoring some of the increments done by the other process. We will place this counter value inside a WhiteDB database ('Examples/tut6.c'). [source,C] ---- #include #include #include #define NUM_INCREMENTS 100000 void die(void *db, int err) { wg_detach_database(db); exit(err); } int main(int argc, char **argv) { void *db, *rec; wg_int lock_id; int i, val; if(!(db = wg_attach_database("1000", 1000000))) { exit(1); /* failed to attach */ } /* First we need to make sure both counting programs start at the * same time (otherwise the example would be boring). */ lock_id = wg_start_read(db); rec = wg_get_first_record(db); /* our database only contains one record, * so we don't need to make a query. */ wg_end_read(db, lock_id); if(!rec) { /* There is no record yet, we're the first to run and have * to set up the counter. */ lock_id = wg_start_write(db); if(!lock_id) die(db, 2); rec = wg_create_record(db, 1); wg_end_write(db, lock_id); if(!rec) die(db, 3); printf("Press a key when all the counter programs have been started."); fgetc(stdin); /* Setting the counter to 0 lets each counting program know it can * start counting now. */ lock_id = wg_start_write(db); if(!lock_id) die(db, 2); wg_set_field(db, rec, 0, wg_encode_int(db, 0)); wg_end_write(db, lock_id); } else { /* Some other program has started first, we wait until the counter * is ready. */ int ready = 0; while(!ready) { lock_id = wg_start_read(db); if(!lock_id) die(db, 2); if(wg_get_field_type(db, rec, 0) == WG_INTTYPE) ready = 1; wg_end_read(db, lock_id); } } /* Now start the actual counting. */ for(i=0; i #include int main(int argc, char **argv) { void *db, *rec, *rec2, *rec3; wg_int enc; if(!(db = wg_attach_database("1000", 2000000))) exit(1); /* failed to attach */ rec = wg_create_record(db, 2); /* this is some record */ rec2 = wg_create_record(db, 3); /* this is another record */ rec3 = wg_create_record(db, 4); /* this is a third record */ if(!rec || !rec2 || !rec3) exit(2); /* Add some content */ wg_set_field(db, rec, 1, wg_encode_str(db, "hello", NULL)); wg_set_field(db, rec2, 0, wg_encode_str(db, "I'm pointing to other records", NULL)); wg_set_field(db, rec3, 0, wg_encode_str(db, "I'm linked from two records", NULL)); /* link the records to each other */ enc = wg_encode_record(db, rec); wg_set_field(db, rec2, 2, enc); /* rec2[2] points to rec */ enc = wg_encode_record(db, rec3); wg_set_field(db, rec2, 1, enc); /* rec2[1] points to rec3 */ wg_set_field(db, rec, 0, enc); /* rec[0] points to rec3 */ wg_detach_database(db); return 0; } ---- When to use the network model ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ NOTE: the code included in this section isn't here to show you how to do things but rather to illustrate what would happen if things *were* done this way. Consider our earlier example with two records, one of them containing the important message "hello". Let's try to use techniques that work in relational databases to accomplish the same thing. 
Storing the data isn't complicated: void *rec = wg_create_record(db, 2); void *rec2 = wg_create_record(db, 3); wg_int hello_id = wg_encode_int(db, 11913); /* an arbitrary record id */ wg_set_field(db, rec, 0, hello_id); /* assign the id */ wg_set_field(db, rec, 1, wg_encode_str(db, "hello", NULL)); wg_set_field(db, rec2, 2, hello_id); /* reference by id */ So far so good. The records should now contain `[11913, "hello"]` and `[NULL, NULL, 11913]`. Pretend again that we have just the `rec2` (as a result of a query, for example) and need the contents of `rec`: int hello_id = wg_decode_int(db, wg_get_field(db, rec2, 2)); /* this will search the database for 11913 */ void *rec = wg_find_record_int(db, 0, WG_COND_EQUAL, hello_id, NULL); char *message = wg_decode_str(db, wg_get_field(db, rec, 1)); If you have an index on field 0, this doesn't look that bad. Sure, the value "11913" needs to be looked up, but `wg_find_record_int()` can manage it reasonably fast. What if you had to do this inside a loop, though? How about a nested loop? Executing a million queries isn't that fun anymore, even if they don't take up time individually - `wg_decode_record()` would have taken a fraction of that. The lesson to learn from here is that whenever you have a data model where objects have relations to each other and need to perform JOIN type queries on them, it can be much faster in WhiteDB to implement the "join" in advance by linking the object records to each other. Those links can then be navigated rapidly to collect all the data. Another use case is storing semi-structured data. We have used the network model in our implementation of JSON documents in WhiteDB (a work in progress) where for example an array can contain other arrays or objects: the record linking feature allows us to accomplish this elegantly by making the array a record whose fields contain links to other records holding the child elements. More examples ------------- There are a few more examples distributed with WhiteDB that were not covered in this tutorial. You may look at 'Examples/demo.c' and 'Examples/query.c' that should be commented well enough to be understandable by now. Examples in Examples/speed are geared towards speed testing and are covered in http://whitedb.org/speed.html A bit more involved example to look at is 'Examples/dserve.c': making queries from WhiteDB with a simple REST cgi program giving json or csv output. `dserve` is useful both as it is and as an example/template for making your own tools for WhiteDB data handling: see http://whitedb.org/tools.html whitedb-0.7.2/Doc/Tutorial2html.sed000066400000000000000000000006731226454622500171400ustar00rootroot00000000000000s/^Connecting to the database/\[\[anchor-1\]\]\n&/ s/"Connecting to the database"/xref:anchor-1\[\]/ s/^Looking at data/\[\[anchor-2\]\]\n&/ s/"Looking at data"/xref:anchor-2\[\]/ s/'python\.txt'/link:python\.html\[Python documentation\]/ s/'Manual\.txt' file/link:Manual\.html\[Manual\]/ s/The manual will provide/link:Manual\.html\[The manual\] will provide/ s/"<="/"\\<="/ s/given in 'Manual\.txt'/given in link:Manual\.html\[the manual\]/ whitedb-0.7.2/Doc/Utilities.txt000066400000000000000000000441221226454622500164020ustar00rootroot00000000000000WhiteDB command line utilities ============================== wgdb - general database management ---------------------------------- This is a simple command-line utility that allows creating and freeing a database, dump, import, run some tests and more. 
Usage: wgdb [shmname] [command arguments] The shared memory name identifies the database to use and is an arbitrary numeric value. If it is omitted, the default ("1000") will be used. Commands commonly available: help (or "-h") - display this text. version (or "-v") - display libwgdb version. free - free shared memory. export [-f] - write memory dump to disk (-f: force dump even if unable to get lock) import [-l] - read memory dump from disk. Overwrites existing memory contents (-l: enable logging after import). exportcsv - export data to a CSV file. importcsv - import data from a CSV file. replay - replay a journal file. test - run quick database tests. fulltest - run in-depth database tests. header - print header data. fill [asc | desc | mix] - fill db with integer data. add .. - store data row (only int or str recognized) select [start from] - print db contents. query "" .. - basic query. del "" .. - like query. Matching rows are deleted from database. server [-l] [size b] - provide persistent shared memory for other processes (Windows). (-l: enable logging in the database). create [-l] [size] - create empty db of given size (non-Windows). (-l: enable logging in the database). Importing and exporting data ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Data may be exported to and imported from text files. This provides a way to exchange data between WhiteDB on different platforms or other data sources. The simplest format that is always available is CSV (comma separated values). Since there is no straightforward mapping between most of WhiteDB types and the CSV format and as CSV is not standardized, only limited support for data types is available. The following data types are recognized when importing from CSV: - NULL - empty string - int - plain integer - double - floating point number in fixed decimal notation - date - ISO8601 date - time - ISO8601 time+fractions of second. - string - input data that does not match the above types The field separator is ',' (comma). The decimal point separator is '.' (dot). WhiteDB may also provide RDF (if libraptor is available) and JSON (ongoing development, undocumented) support. Memory dumps and journal logs ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In the default configuration, WhiteDB runs in shared memory only. There are two methods of providing data persistence: memory dumps, which are snapshots of the current database contents and journal logs which contain incremental updates to the database contents. Logs are only available if the database is configured with the `./configure --enable-logging` option and need to be explicitly enabled. Managing memory dumps ^^^^^^^^^^^^^^^^^^^^^ A memory dump is the memory image of a WhiteDB database, saved into a file. It stores everything except concurrency control and journal file metadata. Images are not compatible between systems of different endianness or word size (the most common case would probably be 32-bit vs 64-bit systems). Also, different versions of the database library may use different format for the memory image, therefore WhiteDB automatically refuses to import an image made with a different library version. Type `wgdb -v` to list the compatibility information of the database library. It will display something like this: libwgdb version: 0.7.0 byte order: little endian compile-time features: 64-bit encoded data: yes queued locks: yes chained nodes in T-tree: yes record backlinking: yes child databases: no index templates: yes Memory dumps are suitable for creating snapshots of the database for backup purposes. 
Type wgdb export imagename.bin to create a backup of the current database (since the shared memory name was omitted, the `wgdb` utility will use "1000" by default). wgdb import imagename.bin Will restore the image from disk. It will completely overwrite the current memory contents. Note that if there is an existing memory segment, it needs to be large enough to fit the image. Otherwise the import will fail with an error message (and the shared memory segment will not be modified). If the `wgdb` tool is unable to access the memory image (for example, due to a programming error that causes the database to become permanently locked), a "rescue" dump can be created with wgdb export -f rescue.bin Note however, that in such cases care must be taken, as it is unknown what type of errors the image may contain. Managing journal logs ^^^^^^^^^^^^^^^^^^^^^ Journal logs provide a way of keeping a continuous backup of the database. All of the changes to the shared memory are logged incrementally and the journal logs may be played back to repeat all of those changes, bringing the database to the same state as it was at the time the log file ended. To enable journal logging, use ./configure --enable-logging --with-logdir=./logs during the building of the database. Replace './logs' with wherever you'd like the library to store the journal files. This location and the names of the journal files are not changeable during runtime due to security reasons. If you do not specify the log directory, '/tmp' (or '\windows\temp') will be used by default, which may work for testing, but is probably undesirable in the long term. Journaling must be enabled at the time of database creation. To do this from the command line, supply the `-l` switch to either the `wgdb create` command or when importing an image with the `wgdb import` command. This will cause a journal file to be created. It will be placed in 'logdir/wgdb.journal.shmname' where shmname is the database name, for example, "1000". The journal is then incrementally written, until one of the three things happens: 1. A dump file is created, either by `wgdb export` or by calling the `wg_dump()` function 2. A dump file is imported (again, by command line or API) 3. A journal file is replayed. In each of these cases, the current journal file is backed up, appending '.0' to its name (or '.1', if '.0' already exists and so forth) and a fresh journal file is started. The first case can be considered normal usage, as it creates a snapshot of the database. Since this snapshot contains everything that the journal has logged up to this point, the journal is no longer necessary for recovery and a fresh one will be used. Importing the dump file will either be part of a recovery process or to simply work with a new image. In either case, the previous journal has become irrelevant to the current database contents, making it necessary to start a new one. Finally, a journal replay itself will be not logged in a journal. Therefore, the database contents after the replay and the journal file that was in use during the replay have become inconsistent, similarly than whan happens with importing a memory dump. Generally, journal backups caused by these recovery actions should be cleaned up or moved away. The user is expected to handle this manually, case by case. 
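The same kind of snapshot can also be created from application code instead of the command line, which is handy for scheduled backups inside a long-running process. The sketch below assumes the dump functions exposed by the WhiteDB API (`wg_dump()`, mentioned above, and its counterpart `wg_import_dump()`), treats a non-zero return value as failure and uses an arbitrary file name; creating a dump this way restarts the journal just like `wgdb export` does:

[source,C]
----
#include <stdio.h>
#include <whitedb/dbapi.h>

int main(int argc, char **argv) {
  void *db = wg_attach_database("1000", 2000000);
  if(!db)
    return 1;

  /* Write a snapshot of the current database contents to disk. */
  if(wg_dump(db, "imagename.bin"))
    printf("dump failed\n");

  wg_detach_database(db);
  return 0;
}
----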
Journal log example ^^^^^^^^^^^^^^^^^^^ Assuming we use the './logs' directory to store the journal files and the database has been compiled with journal support, let's start by enabling the journal and adding data to the database: wgdb create -l wgdb 1011 add 1 2 3 The contents of the './logs' directory will now be: -rw-rw-rw- 1 user group 27 Dec 7 22:00 wgdb.journal.1011 And the contents of the database, by typing `wgdb 1011 select 10`: [1,2,3] Let's create a memory dump of this database. Make sure you don't have 'example.bin' already, as it will be overwritten. wgdb 1011 export example.bin Now we have these log files: -rw-rw-rw- 1 user group 4 Dec 7 22:04 wgdb.journal.1011 -rw-rw-rw- 1 user group 27 Dec 7 22:00 wgdb.journal.1011.0 Note that the original file received the suffix '.0' and a new one was created in it's place. Let's add more data: wgdb 1011 add Hello world The database now contains: [1,2,3] ["Hello","world"] Assume next that something destroyed our database. Try `wgdb 1011 free`. We now know that we have a recent dump called 'example.bin' and some journals: -rw-rw-rw- 1 user group 47 Dec 7 22:06 wgdb.journal.1011 -rw-rw-rw- 1 user group 27 Dec 7 22:00 wgdb.journal.1011.0 The newest journal is what we're interested in. However, *keep in mind* that importing the dump restarts the journal. Nothing will happen to our precious log file, but in order to avoid confusion, let's move it somewhere safe first: mv -i ./logs/wgdb.journal.1011 ./logs/recover.me.1011 Import the dump (note that the '-l' switch is used to re-enable journaling so that our updates will be logged again after we've completed the recovery): wgdb 1011 import -l example.bin Try to list the database contents (`wgdb 1011 select 10`): [1,2,3] Finally, recover the log: wgdb 1011 replay ./logs/recover.me.1011 And the database contents after this will be [1,2,3] ["Hello","world"] We managed to restore our latest state of the database by first importing the dump file, then replaying the journal file that was written after the dump was created. If we would now continue to modify the database, the state could be recovered by importing 'example.bin' and then replaying 'recover.me.1011' and the latest journal file 'wgdb.journal.1011' in that order. The recovery process creates a number of intermediate journals which we may now clean up by `rm ./logs/wgdb.journal.1011.?`. CAUTION: in this case, we moved the log file that we cared about, away first. This is why this command is safe to use here. In general, however, it is better to make sure that the deleted log files do not contain anything not backed up elsewhere. The easiest way would be to verify that the database is healthy and create a fresh memory dump. An automated backup procedure ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ A reasonable procedure that helps keeping track of image dumps and journals could be implemented with a following script: 1. dump the memory segment into a file such as 'backup.YYYYMMDD.shmname.bin' 2. check that the dump was successful 3. move 'logdir/wgdb.journal.shmname.0' to 'journal.YYYYMMDD.shmname' This way, 'logdir' always contains only the current journal. Assuming that YYYYMMDD represents the current date, memory dumps and journals can be archived and accessed by date. If recovery is needed, the database can be restored from the latest image and the current journal in 'logdir/wgdb.journal.shmname' (which should first be archived separately). Any journal backups in 'logdir' should then be removed after the restore is successful, to ensure that step 3. 
archives the correct journal file next time. indextool - index management ---------------------------- `indextool` is for listing and managing database indexes. It takes the shared memory name as an argument. If it is omitted, the default ("1000") will be used. The `logtree` and `dumphash` commands are for debugging and may produce a lot of output. indextool [shmname] createindex - create ttree index indextool [shmname] createhash - create hash index (JSON support) indextool [shmname] dropindex - delete ttree index indextool [shmname] list - list all indexes in database indextool [shmname] logtree [filename] - log tree indextool [shmname] dumphash - print hash table dserve - simple REST queries with json -------------------------------------- This is a simple REST service tool taking a query described with cgi parameters and printing json or csv output. `dserve` is useful both as a ready-made tool and as an example/template for making your own tools for WhiteDB data handling. `dserve` is not compiled by default. Find it under the Examples folder as a single source file dserve.c, modify and compile yourself by doing: gcc dserve.c -o dserve -O2 -lwgdb Copy the resulting executable under the cgi folder of the Apache server. You can use `dserve` as a cgi program taking a few GET parameters like this http://myserver.com/cgi-bin/dserve?op=search&from=0&count=5 or as a command line program: on the command line simply pass the urlencoded query string as a single argument, for example dserve 'op=search&from=2&count=3' This will print rows 2..4 of the database, like this: ---- content-length: 110 content-type: application/json [ [1,1.1,"a simple string",[10,[],"point to me"],"2013-10-24","23:17:36.68"], [12,"an:uri",-2000], [23,-12] ] ---- Observe that the fourth field contains a pointer to another record, printed as a sublist. The same example with a small addition: dserve 'op=search&from=2&count=3&showid=yes' The added `showid=yes` parameter prepends automatic record id-s (encoded offsets) to each record as a first element: ---- content-length: 134 content-type: application/json [ [23368,1,1.1,"a simple string",[23304,10,[],"point to me"],"2013-10-24","23:17:36.68"], [23408,12,"an:uri",-2000], [23432,23,-12] ] ---- The third example asks for three first records with field 0 less than 20: ---- dserve 'op=search&field=0&value=20&compare=lessthan&count=3' content-length: 178 content-type: application/json [ [10,[],"point to me"], [0,0.1,"a simple string",[10,[],"point to me"],"2013-10-24","23:17:36.68"], [1,1.1,"a simple string",[10,[],"point to me"],"2013-10-24","23:17:36.68"] ] ---- Dserve does not facilitate adding or updating records in the database. We may add such a cgi utility in the future, but anyway, it is a good idea to write your own utilities which fit your own needs exactly. Most of the dserve parameters are optional, i.e. have sensible defaults. Limits, error messages etc are all configurable by changing the macro definitions at the beginning of the C source. Query by field values ~~~~~~~~~~~~~~~~~~~~~ You have to indicate the field numbers and the values to compare the contents of these fields with. The whole set of cgi parameters is as follows: - *db* : numeric database name. Default 1000. - *op* : either `search` or `recids`. Here we assume `search`, will cover `recids` later. - *field* : field number (0,1,2, etc) to check against the compared value. - *value* : value to check the field contents against. Examples: value=32, value=sometext. 
- *type* : datatype of the value: null, int, double, str, char or record. Guessed from the value by default. - *compare* : equal, not_equal, lessthan, greater, ltequal or gtequal. Default `equal`. - *from* : skip initial matching records, start from the result nr given here. Default 0. - *count* : max number of matching records output. Default 10000. - *depth* : max depth of nested trees of records. Default 100, hard limit 10000. Set to 0 for hiding all sublists. Modify the macro MAX_DEPTH_HARD to change the hard limit. - *format* : either `json` or `csv`. Default `json`. - *escape* : escaping special characters in json strings, either `no` (replace nothing) , `url` (urlencode %, " and all non-printable and non-ascii characters, i.e. under 32 and over 126, to be completely ascii-safe) or `json` (not ascii-safe: replace only the minimal set indicated in the rfc). Default `json`. NB! this parameter has no effect for csv, where only " gets replaced with "". - *showid : `yes` or `no`, if `yes`, print the record offset (automatic id) as the first element of each record, moving all the other fields one position to the right. Default `no`. All the input values are assumed to be urlencoded. The json output is a list of matching records. Each record is also presented as a list. The list elements are integers, doubles, strings or records (again represented as lists) pointed to from the field. Null is printed as an empty list []. All the other datatypes, like dates, times, URI-s, strings with a language attribute etc in WhiteDB are converted to standard strings in a fairly intuitive manner. Blobs are url-encoded. For csv and too deep json branches the full internal record is represented by its offset (encoded pointer, i.e. automatic id). Notice that for a graph database the json output can be a complex tree of records. Cycles are possible, but can be inhibited by the depth parameter. Errors are reported by printing an error string as a single element of the output list, both for json and csv, like this: ---- content-length: 48 content-type: application/json ["unrecognized op: use op=search or op=recids"] ---- On Linux `dserve` should be able to free locks and detach the database even in case of hard errors like segfaults. Importantly, you can give several sets of field/value/compare/count parameters to perform a complex and-query. Example: dserve 'op=search&field=1&value=100&type=int&compare=lessthan&field=2&value=10.3&type=double&compare=greater' In case you use several fields in the query, you have to fill the otherwise optional type and compare parameters for all the fields indicated. BTW, it is OK to put the same field into several comparisons. You can also have no fields in the query at all. In this case the full database will be traversed and printed according to the parameters given. Query by record id-s ~~~~~~~~~~~~~~~~~~~~ The second way to query is to simply indicate a list of record id-s (offsets: encoded record pointers) like this: dserve 'op=recids&recids=12000,10236,22458' You can learn the record id-s by using the `showid=yes` parameter described before. The database selection parameter `db` and the generic formatting parameters `depth`,`format`,`escape` and `showid` function exactly as described above. 
Good to know ~~~~~~~~~~~~ Things to note and useful code examples in dserve: - does not require additional libraries except wgdb - uses readlocks, does not use writelocks - cgi parameter parsing - various printing routines for WhiteDB values - text is printed into a an automatically growing string buffer - error and timeout handling with signals: when a signal arrives, free the readlock, detach database and output ["internal error"] or ["timeout"]. Available on Linux only. whitedb-0.7.2/Doc/python.txt000066400000000000000000000744571226454622500157660ustar00rootroot00000000000000WhiteDB python bindings ======================== About this document ------------------- The second part, "Compilation and Installation" describes the compilation, installation and general usage of WhiteDB Python bindings. The third part, "wgdb.so (wgdb.pyd) module", describes the immediate low level API provided by the wgdb module. This API (in most cases) directly wraps functions provided by libwgdb. The last part, "whitedb.py module (high level API)" describes the DBI-style API, which is designed for convinience of usage and is not speed-optimized at the moment (start there if you just want to know how to put stuff into the database using Python). The examples in this document were create using Python 2. They should be syntactically correct for Python 3, but can produce slightly different output (particularly, the `print` statement vs the `print()` function). Compilation and Installation ---------------------------- Compiling Python bindings ~~~~~~~~~~~~~~~~~~~~~~~~~ Python module is not compiled by default. `./configure --with-python` enables the compilation (provided that the configure script is able to locate the 'Python.h' file in the system. If not, it is assumed that Python is not properly installed and WhiteDB will be compiled without Python bindings). When building manually, use the separate scripts in Python directory. Check that the Python path in 'compile.sh' ('compile.bat' for Windows) matches your system. Installation ~~~~~~~~~~~~ The high level 'whitedb.py' module expects to find the compiled 'wgdb.so' module in the same directory it resides in. To install the modules, they can be copied to Python site-packages directory manually. Compatibility ~~~~~~~~~~~~~ Minimum version of Python required: 2.5. Other tested versions: 2.6, 2.7 and 3.3. Note that Python 3 is supported but is not extensively tested yet. wgdb.so (wgdb.pyd) module ------------------------- Attaching and deleting a database ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ FUNCTIONS attach_database(shmname='', size=0, local=0) Connect to a shared memory database. If the database with the given name does not exist, it is created. If local is non-zero, the parameter shmname is ignored and the database is created in local memory instead. attach_existing_database(shmname) Connect to a shared memory database. Fails if the database with the given name does not exist. delete_database(shmname) Delete a shared memory database. detach_database(db) Detach from shared memory database. If the database is in the local memory, it is deleted. `attach_database()` allows keyword arguments. If either database name or size are omitted, default values are used. Note that the shared memory name is expected to be converted by `strtol()`. `detach_database()` tells the system that the current process is no longer interested in reading the shared memory. This allows the system to free the shared memory (applies to SysV IPC model - not Win32). 
In case of a local database, the allocated memory is freed on all systems. Examples: >>> a=wgdb.attach_database() >>> b=wgdb.attach_database("1001") >>> c=wgdb.attach_database(size=3000000) >>> d=wgdb.attach_database(size=500000, shmname="9999") >>> d=wgdb.attach_database(local=1) >>> wgdb.detach_database(d) `attach_existing_database()` requires that a shared memory base with the given name exists. >>> d=wgdb.attach_existing_database("1002") Traceback (most recent call last): File "", line 1, in wgdb.error: Failed to attach to database. >>> d=wgdb.attach_existing_database() `delete_database()` takes a single argument. If this is omitted, the default value will be used. >>> wgdb.delete_database("1001") >>> wgdb.delete_database() Exception handling. ~~~~~~~~~~~~~~~~~~~ wgdb module defines a `wgdb.error` exception object that can be used in error handling: >>> try: ... a=wgdb.attach_database() ... except wgdb.error, msg: ... print ('wgdb error') ... except: ... print ('other error') ... Creating and manipulating records ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ FUNCTIONS create_record(db, length) Create a record with given length. create_raw_record(db, length) Create a record without indexing the fields. delete_record(db, rec) Delete a record. get_first_record(db) Fetch first record from database. get_next_record(db, rec) Fetch next record from database. get_record_len(db, rec) Get record length (number of fields). is_record(rec) Determine if object is a WhiteDB record. `db` is an object returned by `wgdb.attach_database()`. `rec` is an object returned by `get_first_record()` or other similar functions that return a record. Examples: >>> d=wgdb.attach_database() ... >>> a=wgdb.create_record(d,5) >>> a >>> b=wgdb.create_record(d,3) >>> b >>> rec=wgdb.get_first_record(d) >>> wgdb.get_record_len(d,rec) 5 >>> rec >>> rec=wgdb.get_next_record(d,rec) >>> wgdb.get_record_len(d,rec) 3 >>> rec >>> rec=wgdb.get_next_record(d,rec) Traceback (most recent call last): File "", line 1, in wgdb.error: Failed to fetch a record. Writing and reading field contents. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ wgdb module handles data type conversion between Python and WhiteDB. Field contents will be converted to Python object when reading data and again encoded into field data when writing to database. Currently supported types include: None, int, float, string (regular 0-terminated string. Raw binary data is not allowed), record. Setting a field to None is equivalent to clearing the field data. Similarly, unwritten fields will be returned to Python as containing None. FUNCTIONS get_field(db, rec, fieldnr) Get field data decoded to corresponding Python type. set_field(db, rec, fieldnr, data, encoding=0, ext_str="") Set field value. set_new_field(db, rec, fieldnr, data, encoding=0, ext_str="") Set field value (assumes no previous content). `db` is an object returned by `wgdb.attach_database()`. `rec` is an object returned by `get_first_record()` or other similar functions that return a record. Encoding (or field type) is an optional keyword argument. If it is omitted, the type of the field is determined by the Python type. Following encoding types are defined by the wgdb module: BLOBTYPE CHARTYPE - Python string (length 1, longer is allowed but ignored) DATETYPE - datetime.date() DOUBLETYPE - default encoding for Python float FIXPOINTTYPE - Python float (small, low precision real numbers) INTTYPE - default encoding for Python int NULLTYPE - Python None RECORDTYPE - wgdb.Record type. 
STRTYPE - default encoding for Python string TIMETYPE - datetime.time() URITYPE - Python string. ext_str defines URI prefix XMLLITERALTYPE - Python string. ext_str defines type. VARTYPE - (varnum, VARTYPE) tuple. `ext_str` is an optional keyword argument. For string types it has varied meaning depending on the type selected. For other types it is ignored. Examples: >>> d=wgdb.attach_database() ... >>> tmp=wgdb.create_record(d,4) >>> tmp >>> print (wgdb.get_field(d,tmp,0),) (None,) >>> wgdb.set_field(d,tmp,0,0) >>> wgdb.set_field(d,tmp,1,256) >>> wgdb.set_field(d,tmp,2,78.3345) >>> wgdb.set_field(d,tmp,3,"hello") >>> print (wgdb.get_field(d,tmp,0),) (0,) >>> print (wgdb.get_field(d,tmp,1),) (256,) >>> print (wgdb.get_field(d,tmp,2),) (78.334500000000006,) >>> print (wgdb.get_field(d,tmp,3),) ('hello',) >>> wgdb.set_field(d,tmp,3,None) >>> print (wgdb.get_field(d,tmp,3),) (None,) Example with a field pointing to another record: >>> tmp=wgdb.create_record(d,4) >>> n=wgdb.create_record(d,4) >>> wgdb.set_field(d,tmp,3,n) >>> wgdb.set_field(d,n,0,1) >>> uu=wgdb.get_field(d,tmp,3) >>> uu >>> wgdb.get_field(d,uu,0) 1 Example with using specific encoding: >>> d=wgdb.attach_database() >>> tmp=wgdb.create_record(d,1) >>> wgdb.set_field(d,tmp,0,"Hello") >>> wgdb.get_field(d,tmp,0) 'Hello' >>> wgdb.set_field(d,tmp,0,"Hello", wgdb.STRTYPE) >>> wgdb.get_field(d,tmp,0) 'Hello' >>> wgdb.set_field(d,tmp,0,"Hello", wgdb.CHARTYPE) >>> wgdb.get_field(d,tmp,0) 'H' >>> wgdb.set_field(d,tmp,0,"H", wgdb.FIXPOINTTYPE) Traceback (most recent call last): File "", line 1, in TypeError: Requested encoding is not supported. Transaction handling ~~~~~~~~~~~~~~~~~~~~ Logical level of transaction handling is provided by the wgdb module. These functions should guarantee safe concurrent usage, however the method of providing that concurrency is up to the database engine (in simplest case, the method is a database level lock). FUNCTIONS end_read(db, lock_id) Finish reading transaction. end_write(db, lock_id) Finish writing transaction. start_read(db) Start reading transaction. start_write(db) Start writing transaction. Parameter `lock_id` is returned by `start_write()` and `start_read()` functions. The same lock id should be passed to `end_write()` and `end_read()` functions, respectively. Depending on the locking mode used, the id may or may not be meaningful, but in any case this should be handled by the database itself. If timeouts are enabled, `start_read()` and `start_write()` will raise the `wgdb.error` exception upon failure to acquire the lock. Examples: >>> d=wgdb.attach_database() ... >>> l=wgdb.start_write(d) >>> wgdb.create_record(d, 5) >>> wgdb.end_write(d,l) >>> l=wgdb.start_read(d) >>> wgdb.get_first_record(d) >>> wgdb.end_read(d,l) Date and time fields. ~~~~~~~~~~~~~~~~~~~~~ WhiteDB uses a compact encoding for date and time values, which is translated to and from Python datetime representation on the wgdb module level. See Python `datetime` module documentation for more information on how to construct and use date and time objects. Note that tzinfo field of the time object and general timezone awareness supported by the datetime module is ignored on wgdb module level. In practical applications, it's recommended to treat all time fields as UTC or local time. 
Examples: >>> import wgdb >>> import datetime >>> d=wgdb.attach_database() >>> tmp=wgdb.create_record(d,1) >>> a=datetime.date(1990,1,2) >>> wgdb.set_field(d,tmp,0,a) >>> x=wgdb.get_field(d,tmp,0) >>> x datetime.date(1990, 1, 2) >>> x.day 2 >>> x.month 1 >>> x.year 1990 >>> b=datetime.time(12,5) >>> wgdb.set_field(d,tmp,0,b) >>> x=wgdb.get_field(d,tmp,0) >>> x datetime.time(12, 5) >>> x.hour 12 >>> x.minute 5 >>> x.second 0 >>> x.microsecond 0 Queries ~~~~~~~ wgdb module provides a direct wrapper for `wg_make_query()` and `wg_fetch()` functions. The query building function uses a similar convention for handling wgdb data types as the 'whitedb.py' module (see "Specifying field encoding and extended information") - data values in query parameters may be given as immediate Python values or as tuples that add the field type and extra string information. FUNCTIONS fetch(db, query) Fetch next record from a query. free_query(db, query) Unallocates the memory (local and shared) used by the query. make_query(db, matchrec, arglist) Create a query object. `query` is the `wgdb.Query` object returned by the `make_query()` method. `matchrec` is either a sequence of values or a reference to an actual database record. In either case, rows that have exactly matching fields will be returned. The query object has a read-only attribute `res_count` that contains the number of matching rows. If the number of rows is not known, `query.res_count` will be None. `arglist` is a list of 3-tuples (column, condition, value). Conditions (defined in wgdb module) may be: COND_EQUAL COND_NOT_EQUAL COND_LESSTHAN COND_GREATER COND_LTEQUAL COND_GTEQUAL Both `matchrec` and `arglist` are optional keyword arguments. If neither is provided, the query will return all the rows in the database. Example: >>> d=wgdb.attach_database() >>> tmp=wgdb.create_record(d,2) >>> tmp >>> wgdb.set_field(d,tmp,0,2) >>> wgdb.set_field(d,tmp,1,"hello") >>> tmp=wgdb.create_record(d,2) >>> tmp >>> wgdb.set_field(d,tmp,0,3) >>> wgdb.set_field(d,tmp,1,4) >>> # column 0 equals 2 ... q=wgdb.make_query(d, arglist=[(0,wgdb.COND_EQUAL,2)]) >>> wgdb.fetch(d, q) >>> # column 1 does not equal "hello", column 0 is less than 100 ... q=wgdb.make_query(d, arglist=[(1,wgdb.COND_NOT_EQUAL,"hello"), ... (0,wgdb.COND_LESSTHAN,100)]) >>> wgdb.fetch(d, q) >>> # use match record ... q=wgdb.make_query(d, [3, 4]) >>> wgdb.fetch(d, q) >>> # all rows. ... q=wgdb.make_query(d) >>> q.res_count # number of rows matching 2 >>> wgdb.fetch(d, q) >>> wgdb.fetch(d, q) >>> wgdb.fetch(d, q) # runs out of rows Traceback (most recent call last): File "", line 1, in wgdb.error: Failed to fetch a record. >>> whitedb.py module (high level API) ----------------------------------- Overview ~~~~~~~~ High level access to database is provided by 'whitedb.py' module. This module requires the low level 'wgdb.so' ('wgdb.pyd' on Windows) module. CLASSES Connection Cursor Record wgdb.error(exceptions.StandardError) DatabaseError DataError InternalError ProgrammingError class Connection | The Connection class acts as a container for | wgdb.Database and provides all connection-related | and record accessing functions. | | Methods defined here: | | __init__(self, shmname=None, shmsize=0) | | atomic_create_record(self, fields) | Create a record and set field contents atomically. | | atomic_update_record(self, rec, fields) | Set the contents of the entire record atomically. | | close(self) | Close the connection. 
| | commit(self) | Commit the transaction (no-op) | | create_record(self, size) | Create new record with given size. | | cursor(self) | Return a DBI-style database cursor | | delete_record(self, rec) | Delete record. | | end_read(self) | Finish reading transaction | | end_write(self) | Finish writing transaction | | fetch(self, query) | Get next record from query result set. | | first_record(self) | Get first record from database. | | free_query(self, cur) | Free query belonging to a cursor. | | get_field(self, rec, fieldnr) | Return data field contents | | insert(self, fields) | Insert a record into database | | make_query(self, matchrec=None, *arg, **kwarg) | Create a query object. | | next_record(self, rec) | Get next record from database. | | rollback(self) | Roll back the transaction (no-op) | | set_field(self, rec, fieldnr, data, *arg, **kwarg) | Set data field contents | | set_locking(self, mode) | Set locking mode (1=on, 0=off) | | start_read(self) | Start reading transaction | | start_write(self) | Start writing transaction class Cursor | Cursor object. Supports wgdb-style queries based on match | records or argument lists. Does not currently support SQL. | | Methods defined here: | | __init__(self, conn) | | close(self) | Close the cursor | | execute(self, sql='', matchrec=None, arglist=None) | Execute a database query | | fetchall(self) | Fetch all (remaining) records from the result set | | fetchone(self) | Fetch the next record from the result set | | get__query(self) | Return low level query object | | insert(self, fields) | Insert a record into database --DEPRECATED-- | | set__query(self, query) | Overwrite low level query object class DataError(DatabaseError) | Exception class to indicate invalid data passed to the db adapter class DatabaseError(wgdb.error) | Base class for database errors class InternalError(DatabaseError) | Exception class to indicate invalid internal state of the module class ProgrammingError(DatabaseError) | Exception class to indicate invalid database usage class Record | Record data representation. Allows field-level and record-level | manipulation of data. Supports iterator and (partial) sequence protocol. | | Methods defined here: | | __getitem__(self, index) | # sequence protocol | | __init__(self, conn, rec) | | __iter__(self) | # iterator protocol | | __setitem__(self, index, data, *arg, **kwarg) | | delete(self) | Delete the record from database | | get__rec(self) | Return low level record object | | get_field(self, fieldnr) | Return data field contents | | get_size(self) | Return record size | | set__rec(self, rec) | Overwrite low level record object | | set_field(self, fieldnr, data, *arg, **kwarg) | Set data field contents with optional encoding | | update(self, fields) | Set the contents of the entire record FUNCTIONS connect(shmname=None, shmsize=0, local=0) Attaches to (or creates) a database. Returns a database object Examples: Connecting to database with default parameters (see examples for `wgdb.attach_database()` for possible arguments and their usage). >>> import whitedb >>> d=whitedb.connect() Cursor methods. Calling `execute()` without any parameters creates a query that returns all the rows in the database. At first the record set will be emtpy, then we insert one using the `insert()` method provided by the connection object. It will subsequently be returned by the query. 
>>> c=d.cursor() >>> c.execute() >>> c.fetchall() [] >>> d.insert(("This", "is", "my", 1.0, "record")) >>> c.execute() >>> rows=c.fetchall() >>> rows [] The `Record` class has some aspects of a sequence and also works as an iterator. To simply access the entire contents of the record, it can be converted to a normal sequence, such as with the `tuple()` function. Fields may be accessed by their index as well: >>> r=rows[0] >>> r[1] 'is' >>> r[2] 'my' >>> tuple(r) ('This', 'is', 'my', 1.0, 'record') >>> for column in r: print (column) ... This is my 1.0 record Record methods. We create a new record, then attempt to modify a single field and the full record. The last attempt will fail because record size is fixed. >>> new=d.insert(('My', 2, 'record')) >>> new >>> c.execute() >>> rows=c.fetchall() >>> rows [, ] >>> new.get_field(1) 2 >>> new.set_field(1, 2.0) >>> tuple(new) ('My', 2.0, 'record') >>> new.update(('this','will','not','fit')) wg data handling error: wrong field number given to wg_set_field Traceback (most recent call last): File "", line 1, in File "whitedb.py", line 433, in update self._conn.atomic_update_record(self, fields) File "whitedb.py", line 242, in atomic_update_record wgdb.set_field(*fargs) wgdb.error: Failed to set field value. >>> tuple(new) ('this', 'will', 'not') Records can be deleted like so (when using the method provided by the `Record` object, the Python level object itself will remain, but the database record will no longer be accessible): >>> new.delete() >>> new >>> tuple(new) Traceback (most recent call last): File "", line 1, in File "whitedb.py", line 442, in __iter__ yield self.get_field(fieldnr) File "whitedb.py", line 416, in get_field return self._conn.get_field(self, fieldnr) File "whitedb.py", line 264, in get_field data = wgdb.get_field(self._db, rec.get__rec(), fieldnr) TypeError: argument 2 must be wgdb.Record, not None Connections can be closed, after which the cursors and records created using that connection will no longer be usable. NOTE: if `Connection.close()` method is used, it is recommended to close all cursors first. >>> c.close() >>> d.close() >>> tuple(new) Traceback (most recent call last): File "", line 1, in File "whitedb.py", line 442, in __iter__ yield self.get_field(fieldnr) File "whitedb.py", line 416, in get_field return self._conn.get_field(self, fieldnr) File "whitedb.py", line 262, in get_field self.start_read() File "whitedb.py", line 108, in start_read self._lock_id = wgdb.start_read(self._db) TypeError: argument 1 must be wgdb.Database, not None Linked records ~~~~~~~~~~~~~~ WhiteDB record fields may contain references to other records. In high level API, these records are represented as instances of `whitedb.Record` class. Note that it is not useful to create such instances directly. Instances of `Record` class are always returned by WhiteDB operations (creating new records or retrieving existing ones). 
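As a quick illustration of record linking (a hedged sketch that uses only the calls documented here, with made-up field values), a reference is stored simply by passing an existing `Record` as a field value and is dereferenced when the field is read back; the example that follows shows the same behaviour in an interactive session.

  import whitedb

  d = whitedb.connect()
  parent = d.insert((1, 2, 3))             # a whitedb.Record instance
  child = d.insert(("points to", parent))  # field 1 stores a reference to parent

  linked = child[1]                        # reading the field yields a Record again
  print(tuple(linked))                     # (1, 2, 3)

  parent.set_field(0, 99)                  # changes made via the original record...
  print(tuple(child[1]))                   # (99, 2, 3) ...are visible through the link

  d.close()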
Example of linking to other records: >>> import whitedb >>> d=whitedb.connect() >>> rec=d.insert((1,2,3,4,5)) >>> c=d.cursor() >>> c.execute() >>> tuple(c.fetchone()) (1, 2, 3, 4, 5) >>> d.insert(('1st linked record', rec)) >>> d.insert(('2nd linked record', rec)) >>> c.execute() >>> l=c.fetchall() >>> list(map(tuple,l)) [(1, 2, 3, 4, 5), ('1st linked record', ), ('2nd linked record', )] Changing the contents of the original record will be visible through the records that refer to it: >>> linked=l[-2:] >>> linked [, ] >>> list(map(lambda x: tuple(x[1]), linked)) [(1, 2, 3, 4, 5), (1, 2, 3, 4, 5)] >>> rec.set_field(3, 99) >>> list(map(lambda x: tuple(x[1]), linked)) [(1, 2, 3, 99, 5), (1, 2, 3, 99, 5)] Transaction support ~~~~~~~~~~~~~~~~~~~ Transactions are handled internally by the whitedb module. By default the concurrency support is turned on and each database read or write is treated as a separate transaction. The user can turn this behaviour on and off (when there is a single database user, there will be a small performance gain with locking turned off). Turning locking (or transactional) mode off: >>> d=whitedb.connect() >>> d.set_locking(0) Turning it back on: >>> d.set_locking(1) Specifying field encoding and extended information. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The user can explicitly state which encoding should be used when writing data to the database. Examples of encodings where this is useful are 1-character strings and small fixed-point numbers. When encoded as such they consume less storage space in database and may speed up access as well. Allowed types are listed under the section "Writing and reading field contents". Example: >>> import whitedb >>> d=whitedb.connect() >>> r=d.insert((None,)) >>> r.set_field(0,"Hello") >>> tuple(r) ('Hello',) >>> r.set_field(0,"Hello",whitedb.wgdb.STRTYPE) >>> tuple(r) ('Hello',) >>> r.set_field(0,"Hello", encoding=whitedb.wgdb.CHARTYPE) >>> tuple(r) ('H',) >>> r.set_field(0,"Hello",whitedb.wgdb.INTTYPE) Traceback (most recent call last): File "", line 1, in File "whitedb.py", line 422, in set_field return self._conn.set_field(self, fieldnr, data, *arg, **kwarg) File "whitedb.py", line 283, in set_field rec.get__rec(), fieldnr, data, *arg, **kwarg) TypeError: Requested encoding is not supported. Some string types allow extra information, stored together with the value. This can be done by adding the `ext_str` keyword parameter. The specific types and meaning of the extra information: STRTYPE - language URITYPE - URI prefix XMLLITERAL - XML literal type Example: >>> r=d.create_record(3) >>> r >>> r.set_field(0, "#example", whitedb.wgdb.URITYPE, "http://example.com/myns") >>> r.set_field(1, "True", ext_str="xsd:boolean", encoding=whitedb.wgdb.XMLLITERALTYPE) >>> r.set_field(2, "#object_id", encoding=whitedb.wgdb.URITYPE) >>> tuple(r) ('http://example.com/myns#example', 'True', '#object_id') Finally, `Connection.insert()` method and `Record.update()` method allow the user to supply the additional field encoding and extra string parameters together with the data value. Field values passed to these methods may be given as tuples (data, encoding) or (data, encoding, ext_str). These additional parameters will be passed on to the database in a similar way to the positional parameters in the above examples with the `set_field()` method. If ext_str is given, encoding must also be present. Passing 0 for the encoding lets the wgdb module select the default encoding. 
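For illustration, the hypothetical snippet below (a sketch with made-up values and a made-up namespace URL) inserts one record in which each field carries its own encoding tuple; the example that follows exercises the same convention interactively.

  import whitedb

  d = whitedb.connect()
  r = d.insert((
      42,                                              # plain value, default encoding
      ("H", whitedb.wgdb.CHARTYPE),                    # (data, encoding)
      ("#id", whitedb.wgdb.URITYPE, "http://example.com/ns/"),  # (data, encoding, ext_str)
      ("hello", 0, "en"),                              # 0 = let wgdb pick the encoding
  ))
  print(tuple(r))   # e.g. (42, 'H', 'http://example.com/ns/#id', 'hello')
  d.close()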
Example: >>> r=d.insert((1,2.0,"3")) >>> tuple(r) (1, 2.0, '3') >>> r.update((None,None,("hello",whitedb.wgdb.CHARTYPE))) >>> tuple(r) (None, None, 'h') >>> r.update((None,None,("hello",0,"en"))) >>> tuple(r) (None, None, 'hello') >>> r=d.insert((("#example",whitedb.wgdb.URITYPE,"http://mydomain.org/"), ... ("False",whitedb.wgdb.XMLLITERALTYPE,"xsd:boolean"))) >>> tuple(r) ('http://mydomain.org/#example', 'False') >>> import math >>> r.update((math.pi,(math.pi,whitedb.wgdb.FIXPOINTTYPE))) >>> tuple(r) (3.1415926535897931, 3.1415999999999999) Using dates and times. ~~~~~~~~~~~~~~~~~~~~~~ Date and time support is implemented using the datetime module included with the standard Python distribution. Storing a `datetime.date` object in the database creates a WhiteDB date type field, similarly a `datetime.time` object is stored as a time field. When reading the database, low-level wgdb module converts the times and dates to datetime.date/time instances again. Timezones are not supported through the wgdb API, so timezone-awareness should be implemented on the application level, if needed. Date and time fields combined can be used to construct datetime data. Example: >>> import whitedb >>> import datetime >>> d=whitedb.connect() >>> a=datetime.date(2010,3,31) >>> b=datetime.time(12,59,microsecond=330000) >>> rec=d.insert((a,b)) >>> tuple(rec) (datetime.date(2010, 3, 31), datetime.time(12, 59, 0, 330000)) >>> rec[0].month 3 >>> rec[1].hour 12 Example of using combined date and time fields as a datetime object (continuing previous example): >>> x=datetime.datetime.combine(rec[0], rec[1]) >>> x datetime.datetime(2010, 3, 31, 12, 59, 0, 330000) >>> x.strftime("%d.%m.%Y") '31.03.2010' >>> x.ctime() 'Wed Mar 31 12:59:00 2010' Using queries. ~~~~~~~~~~~~~~ The `execute()` method of `Cursor` class implements non-DBI, WhiteDB-specific extensions. These can be used to query data that matches specific conditions. SQL support is currently not implemented in libwgdb and is non-functional in the whitedb Python module. Optional keyword parameters to `execute()`: - sql - ignored - matchrec - may be either a sequence of values or a whitedb.Record instance that points to an actual record in the database. In the first case, records with fields matching the values in the sequence will be returned. In the second case, equivalent records (including the match record itself) will be returned. - arglist - sequence of 3-tuples (column, condition, value) Values are either immediate Python values or tuples with extended type information (see "Specifying field encoding and extended information"). For the possible conditions, see the section "Queries". `arglist` and `matchrec` parameters may be present simultaneously. Also, `arglist` parameter may contain multiple conditions for one column. If neither parameter is present, the result set will include all the rows in the database unconditionally. After calling `execute()`, the attribute `rowcount` will indicate the number of rows matching the query (unless that information is not available from the wgdb layer). 
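The sketch below (illustrative values only, assuming the default database) shows the `arglist` and `matchrec` keyword parameters together with the `rowcount` attribute; the sessions that follow demonstrate each option separately.

  import whitedb
  from wgdb import COND_EQUAL, COND_LESSTHAN

  d = whitedb.connect()
  c = d.cursor()
  d.insert((2, 3, 4))
  d.insert(("Hello", 110))

  # arglist: (column, condition, value) triples; all conditions must hold
  c.execute(arglist=[(0, COND_EQUAL, 2), (1, COND_LESSTHAN, 100)])
  print(c.rowcount)                        # number of matching rows, when known
  print([tuple(r) for r in c.fetchall()])  # [(2, 3, 4)]

  # matchrec: rows whose fields equal the given values are returned
  c.execute(matchrec=[2, 3, 4])
  print([tuple(r) for r in c.fetchall()])  # [(2, 3, 4)]

  c.close()
  d.close()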
Examples: >>> import whitedb >>> from wgdb import COND_EQUAL, COND_LESSTHAN, COND_NOT_EQUAL >>> d=whitedb.connect() >>> d.insert((2,3,4)) >>> d.insert(("Hello", 110)) One condition (column 0 should not equal 2): >>> c=d.cursor() >>> c.execute(arglist=[(0, COND_NOT_EQUAL, 2)]) >>> tuple(c.fetchone()) ('Hello', 110) Multiple conditions (column 1 should be < 100, column 0 should equal 2): >>> c.execute(arglist=[(1, COND_LESSTHAN, 100), (0, COND_EQUAL, 2)]) >>> r=c.fetchone() >>> tuple(r) (2, 3, 4) Try match record: >>> d.insert((2,3,4,5)) >>> c.execute(matchrec=r) >>> c.rowcount 2 >>> list(map(tuple, c.fetchall())) [(2, 3, 4), (2, 3, 4, 5)] Empty query (all database rows match): >>> c.execute() >>> list(map(tuple, c.fetchall())) [(2, 3, 4), ('Hello', 110), (2, 3, 4, 5)] libwgdb query engine treats all match record fields that are of WG_VARTYPE, as wildcards. This can be used in Python-side match records as well (VARTYPE field is constructed using the extended type syntax convention): >>> x=(0, whitedb.wgdb.VARTYPE) >>> c.execute(matchrec=[2,x,4]) >>> list(map(tuple, c.fetchall())) [(2, 3, 4), (2, 3, 4, 5)] Database record fields can also be of WG_VARTYPE. Note that the query using such match record also returns the match record itself. The variables are represented as (varnum, VARTYPE) tuples in Python: >>> var_rec=d.insert((x,x,4)) >>> var_rec >>> c.execute(matchrec=var_rec) >>> list(map(tuple, c.fetchall())) [(2, 3, 4), (2, 3, 4, 5), ((0, 14), (0, 14), 4)] whitedb-0.7.2/Doc/python2html.sed000066400000000000000000000015401226454622500166500ustar00rootroot00000000000000s/^Compilation and Installation/\[\[anchor-1\]\]\n&/ s/The second part, "Compilation and Installation"/xref:anchor-1\[\]/ s/^wgdb.so (wgdb.pyd) module/\[\[anchor-2\]\]\n&/ s/The third part, "wgdb.so (wgdb.pyd) module"/xref:anchor-2\[\]/ s/^whitedb.py module (high level API)/\[\[anchor-3\]\]\n&/ s/The last part, "whitedb.py module (high level API)"/xref:anchor-3\[\]/ s/^Specifying field encoding and extended information/\[\[anchor-4\]\]\n&/ s/"Specifying field encoding and extended information"/xref:anchor-4\[\]/ s/^`attach_database()` allows/\[\[anchor-5,attach_database()\]\]\n&/ s/^`wgdb.attach_database()`\( for possible\)/xref:anchor-5\[\]\1/ s/^Writing and reading field contents/\[\[anchor-6\]\]\n&/ s/section "Writing and reading field contents"/xref:anchor-6\[\]/ s/^Queries/\[\[anchor-7\]\]\n&/ s/\(see the\) section "Queries"/\1 xref:anchor-7\[\]/ whitedb-0.7.2/Doxyfile000066400000000000000000001530561226454622500146760ustar00rootroot00000000000000# Doxyfile 1.5.4 # This file describes the settings to be used by the documentation system # doxygen (www.doxygen.org) for a project # # All text after a hash (#) is considered a comment and will be ignored # The format is: # TAG = value [value, ...] # For lists items can also be appended using: # TAG += value [value, ...] # Values that contain spaces should be placed between quotes (" ") #--------------------------------------------------------------------------- # Project related configuration options #--------------------------------------------------------------------------- # This tag specifies the encoding used for all characters in the config file that # follow. The default is UTF-8 which is also the encoding used for all text before # the first occurrence of this tag. Doxygen uses libiconv (or the iconv built into # libc) for the transcoding. See http://www.gnu.org/software/libiconv for the list of # possible encodings. 
DOXYFILE_ENCODING = UTF-8 # The PROJECT_NAME tag is a single word (or a sequence of words surrounded # by quotes) that should identify the project. PROJECT_NAME = WhiteDB # The PROJECT_NUMBER tag can be used to enter a project or revision number. # This could be handy for archiving the generated documentation or # if some version control system is used. PROJECT_NUMBER = # The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute) # base path where the generated documentation will be put. # If a relative path is entered, it will be relative to the location # where doxygen was started. If left blank the current directory will be used. OUTPUT_DIRECTORY = Doc/ # If the CREATE_SUBDIRS tag is set to YES, then doxygen will create # 4096 sub-directories (in 2 levels) under the output directory of each output # format and will distribute the generated files over these directories. # Enabling this option can be useful when feeding doxygen a huge amount of # source files, where putting all generated files in the same directory would # otherwise cause performance problems for the file system. CREATE_SUBDIRS = NO # The OUTPUT_LANGUAGE tag is used to specify the language in which all # documentation generated by doxygen is written. Doxygen will use this # information to generate all constant output in the proper language. # The default language is English, other supported languages are: # Afrikaans, Arabic, Brazilian, Catalan, Chinese, Chinese-Traditional, # Croatian, Czech, Danish, Dutch, Finnish, French, German, Greek, Hungarian, # Italian, Japanese, Japanese-en (Japanese with English messages), Korean, # Korean-en, Lithuanian, Norwegian, Polish, Portuguese, Romanian, Russian, # Serbian, Slovak, Slovene, Spanish, Swedish, and Ukrainian. OUTPUT_LANGUAGE = English # If the BRIEF_MEMBER_DESC tag is set to YES (the default) Doxygen will # include brief member descriptions after the members that are listed in # the file and class documentation (similar to JavaDoc). # Set to NO to disable this. BRIEF_MEMBER_DESC = YES # If the REPEAT_BRIEF tag is set to YES (the default) Doxygen will prepend # the brief description of a member or function before the detailed description. # Note: if both HIDE_UNDOC_MEMBERS and BRIEF_MEMBER_DESC are set to NO, the # brief descriptions will be completely suppressed. REPEAT_BRIEF = YES # This tag implements a quasi-intelligent brief description abbreviator # that is used to form the text in various listings. Each string # in this list, if found as the leading text of the brief description, will be # stripped from the text and the result after processing the whole list, is # used as the annotated text. Otherwise, the brief description is used as-is. # If left blank, the following values are used ("$name" is automatically # replaced with the name of the entity): "The $name class" "The $name widget" # "The $name file" "is" "provides" "specifies" "contains" # "represents" "a" "an" "the" ABBREVIATE_BRIEF = # If the ALWAYS_DETAILED_SEC and REPEAT_BRIEF tags are both set to YES then # Doxygen will generate a detailed section even if there is only a brief # description. ALWAYS_DETAILED_SEC = YES # If the INLINE_INHERITED_MEMB tag is set to YES, doxygen will show all # inherited members of a class in the documentation of that class as if those # members were ordinary class members. Constructors, destructors and assignment # operators of the base classes will not be shown. 
INLINE_INHERITED_MEMB = NO # If the FULL_PATH_NAMES tag is set to YES then Doxygen will prepend the full # path before files name in the file list and in the header files. If set # to NO the shortest path that makes the file name unique will be used. FULL_PATH_NAMES = YES # If the FULL_PATH_NAMES tag is set to YES then the STRIP_FROM_PATH tag # can be used to strip a user-defined part of the path. Stripping is # only done if one of the specified strings matches the left-hand part of # the path. The tag can be used to show relative paths in the file list. # If left blank the directory from which doxygen is run is used as the # path to strip. STRIP_FROM_PATH = # The STRIP_FROM_INC_PATH tag can be used to strip a user-defined part of # the path mentioned in the documentation of a class, which tells # the reader which header file to include in order to use a class. # If left blank only the name of the header file containing the class # definition is used. Otherwise one should specify the include paths that # are normally passed to the compiler using the -I flag. STRIP_FROM_INC_PATH = # If the SHORT_NAMES tag is set to YES, doxygen will generate much shorter # (but less readable) file names. This can be useful is your file systems # doesn't support long names like on DOS, Mac, or CD-ROM. SHORT_NAMES = NO # If the JAVADOC_AUTOBRIEF tag is set to YES then Doxygen # will interpret the first line (until the first dot) of a JavaDoc-style # comment as the brief description. If set to NO, the JavaDoc # comments will behave just like regular Qt-style comments # (thus requiring an explicit @brief command for a brief description.) JAVADOC_AUTOBRIEF = YES # If the QT_AUTOBRIEF tag is set to YES then Doxygen will # interpret the first line (until the first dot) of a Qt-style # comment as the brief description. If set to NO, the comments # will behave just like regular Qt-style comments (thus requiring # an explicit \brief command for a brief description.) QT_AUTOBRIEF = NO # The MULTILINE_CPP_IS_BRIEF tag can be set to YES to make Doxygen # treat a multi-line C++ special comment block (i.e. a block of //! or /// # comments) as a brief description. This used to be the default behaviour. # The new default is to treat a multi-line C++ comment block as a detailed # description. Set this tag to YES if you prefer the old behaviour instead. MULTILINE_CPP_IS_BRIEF = NO # If the DETAILS_AT_TOP tag is set to YES then Doxygen # will output the detailed description near the top, like JavaDoc. # If set to NO, the detailed description appears after the member # documentation. DETAILS_AT_TOP = NO # If the INHERIT_DOCS tag is set to YES (the default) then an undocumented # member inherits the documentation from any documented member that it # re-implements. INHERIT_DOCS = YES # If the SEPARATE_MEMBER_PAGES tag is set to YES, then doxygen will produce # a new page for each member. If set to NO, the documentation of a member will # be part of the file/class/namespace that contains it. SEPARATE_MEMBER_PAGES = NO # The TAB_SIZE tag can be used to set the number of spaces in a tab. # Doxygen uses this value to replace tabs by spaces in code fragments. TAB_SIZE = 2 # This tag can be used to specify a number of aliases that acts # as commands in the documentation. An alias has the form "name=value". # For example adding "sideeffect=\par Side Effects:\n" will allow you to # put the command \sideeffect (or @sideeffect) in the documentation, which # will result in a user-defined paragraph with heading "Side Effects:". 
# You can put \n's in the value part of an alias to insert newlines. ALIASES = # Set the OPTIMIZE_OUTPUT_FOR_C tag to YES if your project consists of C # sources only. Doxygen will then generate output that is more tailored for C. # For instance, some of the names that are used will be different. The list # of all members will be omitted, etc. OPTIMIZE_OUTPUT_FOR_C = YES # Set the OPTIMIZE_OUTPUT_JAVA tag to YES if your project consists of Java # sources only. Doxygen will then generate output that is more tailored for Java. # For instance, namespaces will be presented as packages, qualified scopes # will look different, etc. OPTIMIZE_OUTPUT_JAVA = NO # If you use STL classes (i.e. std::string, std::vector, etc.) but do not want to # include (a tag file for) the STL sources as input, then you should # set this tag to YES in order to let doxygen match functions declarations and # definitions whose arguments contain STL classes (e.g. func(std::string); v.s. # func(std::string) {}). This also make the inheritance and collaboration # diagrams that involve STL classes more complete and accurate. BUILTIN_STL_SUPPORT = NO # If you use Microsoft's C++/CLI language, you should set this option to YES to # enable parsing support. CPP_CLI_SUPPORT = NO # Set the SIP_SUPPORT tag to YES if your project consists of sip sources only. # Doxygen will parse them like normal C++ but will assume all classes use public # instead of private inheritance when no explicit protection keyword is present. SIP_SUPPORT = NO # If member grouping is used in the documentation and the DISTRIBUTE_GROUP_DOC # tag is set to YES, then doxygen will reuse the documentation of the first # member in the group (if any) for the other members of the group. By default # all members of a group must be documented explicitly. DISTRIBUTE_GROUP_DOC = NO # Set the SUBGROUPING tag to YES (the default) to allow class member groups of # the same type (for instance a group of public functions) to be put as a # subgroup of that type (e.g. under the Public Functions section). Set it to # NO to prevent subgrouping. Alternatively, this can be done per class using # the \nosubgrouping command. SUBGROUPING = YES # When TYPEDEF_HIDES_STRUCT is enabled, a typedef of a struct (or union) is # documented as struct with the name of the typedef. So # typedef struct TypeS {} TypeT, will appear in the documentation as a struct # with name TypeT. When disabled the typedef will appear as a member of a file, # namespace, or class. And the struct will be named TypeS. This can typically # be useful for C code where the coding convention is that all structs are # typedef'ed and only the typedef is referenced never the struct's name. TYPEDEF_HIDES_STRUCT = YES #--------------------------------------------------------------------------- # Build related configuration options #--------------------------------------------------------------------------- # If the EXTRACT_ALL tag is set to YES doxygen will assume all entities in # documentation are documented, even if no documentation was available. # Private class members and static file members will be hidden unless # the EXTRACT_PRIVATE and EXTRACT_STATIC tags are set to YES EXTRACT_ALL = YES # If the EXTRACT_PRIVATE tag is set to YES all private members of a class # will be included in the documentation. EXTRACT_PRIVATE = YES # If the EXTRACT_STATIC tag is set to YES all static members of a file # will be included in the documentation. 
EXTRACT_STATIC = YES # If the EXTRACT_LOCAL_CLASSES tag is set to YES classes (and structs) # defined locally in source files will be included in the documentation. # If set to NO only classes defined in header files are included. EXTRACT_LOCAL_CLASSES = YES # This flag is only useful for Objective-C code. When set to YES local # methods, which are defined in the implementation section but not in # the interface are included in the documentation. # If set to NO (the default) only methods in the interface are included. EXTRACT_LOCAL_METHODS = NO # If this flag is set to YES, the members of anonymous namespaces will be extracted # and appear in the documentation as a namespace called 'anonymous_namespace{file}', # where file will be replaced with the base name of the file that contains the anonymous # namespace. By default anonymous namespace are hidden. EXTRACT_ANON_NSPACES = NO # If the HIDE_UNDOC_MEMBERS tag is set to YES, Doxygen will hide all # undocumented members of documented classes, files or namespaces. # If set to NO (the default) these members will be included in the # various overviews, but no documentation section is generated. # This option has no effect if EXTRACT_ALL is enabled. HIDE_UNDOC_MEMBERS = NO # If the HIDE_UNDOC_CLASSES tag is set to YES, Doxygen will hide all # undocumented classes that are normally visible in the class hierarchy. # If set to NO (the default) these classes will be included in the various # overviews. This option has no effect if EXTRACT_ALL is enabled. HIDE_UNDOC_CLASSES = NO # If the HIDE_FRIEND_COMPOUNDS tag is set to YES, Doxygen will hide all # friend (class|struct|union) declarations. # If set to NO (the default) these declarations will be included in the # documentation. HIDE_FRIEND_COMPOUNDS = NO # If the HIDE_IN_BODY_DOCS tag is set to YES, Doxygen will hide any # documentation blocks found inside the body of a function. # If set to NO (the default) these blocks will be appended to the # function's detailed documentation block. HIDE_IN_BODY_DOCS = NO # The INTERNAL_DOCS tag determines if documentation # that is typed after a \internal command is included. If the tag is set # to NO (the default) then the documentation will be excluded. # Set it to YES to include the internal documentation. INTERNAL_DOCS = NO # If the CASE_SENSE_NAMES tag is set to NO then Doxygen will only generate # file names in lower-case letters. If set to YES upper-case letters are also # allowed. This is useful if you have classes or files whose names only differ # in case and if your file system supports case sensitive file names. Windows # and Mac users are advised to set this option to NO. CASE_SENSE_NAMES = YES # If the HIDE_SCOPE_NAMES tag is set to NO (the default) then Doxygen # will show members with their full class and namespace scopes in the # documentation. If set to YES the scope will be hidden. HIDE_SCOPE_NAMES = NO # If the SHOW_INCLUDE_FILES tag is set to YES (the default) then Doxygen # will put a list of the files that are included by a file in the documentation # of that file. SHOW_INCLUDE_FILES = YES # If the INLINE_INFO tag is set to YES (the default) then a tag [inline] # is inserted in the documentation for inline members. INLINE_INFO = YES # If the SORT_MEMBER_DOCS tag is set to YES (the default) then doxygen # will sort the (detailed) documentation of file and class members # alphabetically by member name. If set to NO the members will appear in # declaration order. 
SORT_MEMBER_DOCS = YES # If the SORT_BRIEF_DOCS tag is set to YES then doxygen will sort the # brief documentation of file, namespace and class members alphabetically # by member name. If set to NO (the default) the members will appear in # declaration order. SORT_BRIEF_DOCS = NO # If the SORT_BY_SCOPE_NAME tag is set to YES, the class list will be # sorted by fully-qualified names, including namespaces. If set to # NO (the default), the class list will be sorted only by class name, # not including the namespace part. # Note: This option is not very useful if HIDE_SCOPE_NAMES is set to YES. # Note: This option applies only to the class list, not to the # alphabetical list. SORT_BY_SCOPE_NAME = NO # The GENERATE_TODOLIST tag can be used to enable (YES) or # disable (NO) the todo list. This list is created by putting \todo # commands in the documentation. GENERATE_TODOLIST = YES # The GENERATE_TESTLIST tag can be used to enable (YES) or # disable (NO) the test list. This list is created by putting \test # commands in the documentation. GENERATE_TESTLIST = YES # The GENERATE_BUGLIST tag can be used to enable (YES) or # disable (NO) the bug list. This list is created by putting \bug # commands in the documentation. GENERATE_BUGLIST = YES # The GENERATE_DEPRECATEDLIST tag can be used to enable (YES) or # disable (NO) the deprecated list. This list is created by putting # \deprecated commands in the documentation. GENERATE_DEPRECATEDLIST= YES # The ENABLED_SECTIONS tag can be used to enable conditional # documentation sections, marked by \if sectionname ... \endif. ENABLED_SECTIONS = # The MAX_INITIALIZER_LINES tag determines the maximum number of lines # the initial value of a variable or define consists of for it to appear in # the documentation. If the initializer consists of more lines than specified # here it will be hidden. Use a value of 0 to hide initializers completely. # The appearance of the initializer of individual variables and defines in the # documentation can be controlled using \showinitializer or \hideinitializer # command in the documentation regardless of this setting. MAX_INITIALIZER_LINES = 30 # Set the SHOW_USED_FILES tag to NO to disable the list of files generated # at the bottom of the documentation of classes and structs. If set to YES the # list will mention the files that were used to generate the documentation. SHOW_USED_FILES = YES # If the sources in your project are distributed over multiple directories # then setting the SHOW_DIRECTORIES tag to YES will show the directory hierarchy # in the documentation. The default is NO. SHOW_DIRECTORIES = YES # The FILE_VERSION_FILTER tag can be used to specify a program or script that # doxygen should invoke to get the current version for each file (typically from the # version control system). Doxygen will invoke the program by executing (via # popen()) the command , where is the value of # the FILE_VERSION_FILTER tag, and is the name of an input file # provided by doxygen. Whatever the program writes to standard output # is used as the file version. See the manual for examples. FILE_VERSION_FILTER = #--------------------------------------------------------------------------- # configuration options related to warning and progress messages #--------------------------------------------------------------------------- # The QUIET tag can be used to turn on/off the messages that are generated # by doxygen. Possible values are YES and NO. If left blank NO is used. 
QUIET = NO # The WARNINGS tag can be used to turn on/off the warning messages that are # generated by doxygen. Possible values are YES and NO. If left blank # NO is used. WARNINGS = YES # If WARN_IF_UNDOCUMENTED is set to YES, then doxygen will generate warnings # for undocumented members. If EXTRACT_ALL is set to YES then this flag will # automatically be disabled. WARN_IF_UNDOCUMENTED = YES # If WARN_IF_DOC_ERROR is set to YES, doxygen will generate warnings for # potential errors in the documentation, such as not documenting some # parameters in a documented function, or documenting parameters that # don't exist or using markup commands wrongly. WARN_IF_DOC_ERROR = YES # This WARN_NO_PARAMDOC option can be abled to get warnings for # functions that are documented, but have no documentation for their parameters # or return value. If set to NO (the default) doxygen will only warn about # wrong or incomplete parameter documentation, but not about the absence of # documentation. WARN_NO_PARAMDOC = NO # The WARN_FORMAT tag determines the format of the warning messages that # doxygen can produce. The string should contain the $file, $line, and $text # tags, which will be replaced by the file and line number from which the # warning originated and the warning text. Optionally the format may contain # $version, which will be replaced by the version of the file (if it could # be obtained via FILE_VERSION_FILTER) WARN_FORMAT = "$file:$line: $text" # The WARN_LOGFILE tag can be used to specify a file to which warning # and error messages should be written. If left blank the output is written # to stderr. WARN_LOGFILE = #--------------------------------------------------------------------------- # configuration options related to the input files #--------------------------------------------------------------------------- # The INPUT tag can be used to specify the files and/or directories that contain # documented source files. You may enter file names like "myfile.cpp" or # directories like "/usr/src/myproject". Separate the files or directories # with spaces. INPUT = Main Db # This tag can be used to specify the character encoding of the source files that # doxygen parses. Internally doxygen uses the UTF-8 encoding, which is also the default # input encoding. Doxygen uses libiconv (or the iconv built into libc) for the transcoding. # See http://www.gnu.org/software/libiconv for the list of possible encodings. INPUT_ENCODING = UTF-8 # If the value of the INPUT tag contains directories, you can use the # FILE_PATTERNS tag to specify one or more wildcard pattern (like *.cpp # and *.h) to filter out the source-files in the directories. If left # blank the following patterns are tested: # *.c *.cc *.cxx *.cpp *.c++ *.java *.ii *.ixx *.ipp *.i++ *.inl *.h *.hh *.hxx # *.hpp *.h++ *.idl *.odl *.cs *.php *.php3 *.inc *.m *.mm *.py *.f90 FILE_PATTERNS = # The RECURSIVE tag can be used to turn specify whether or not subdirectories # should be searched for input files as well. Possible values are YES and NO. # If left blank NO is used. RECURSIVE = NO # The EXCLUDE tag can be used to specify files and/or directories that should # excluded from the INPUT source files. This way you can easily exclude a # subdirectory from a directory tree whose root is specified with the INPUT tag. EXCLUDE = # The EXCLUDE_SYMLINKS tag can be used select whether or not files or # directories that are symbolic links (a Unix filesystem feature) are excluded # from the input. 
EXCLUDE_SYMLINKS = NO # If the value of the INPUT tag contains directories, you can use the # EXCLUDE_PATTERNS tag to specify one or more wildcard patterns to exclude # certain files from those directories. Note that the wildcards are matched # against the file with absolute path, so to exclude all test directories # for example use the pattern */test/* EXCLUDE_PATTERNS = # The EXCLUDE_SYMBOLS tag can be used to specify one or more symbol names # (namespaces, classes, functions, etc.) that should be excluded from the output. # The symbol name can be a fully qualified name, a word, or if the wildcard * is used, # a substring. Examples: ANamespace, AClass, AClass::ANamespace, ANamespace::*Test EXCLUDE_SYMBOLS = # The EXAMPLE_PATH tag can be used to specify one or more files or # directories that contain example code fragments that are included (see # the \include command). EXAMPLE_PATH = # If the value of the EXAMPLE_PATH tag contains directories, you can use the # EXAMPLE_PATTERNS tag to specify one or more wildcard pattern (like *.cpp # and *.h) to filter out the source-files in the directories. If left # blank all files are included. EXAMPLE_PATTERNS = # If the EXAMPLE_RECURSIVE tag is set to YES then subdirectories will be # searched for input files to be used with the \include or \dontinclude # commands irrespective of the value of the RECURSIVE tag. # Possible values are YES and NO. If left blank NO is used. EXAMPLE_RECURSIVE = NO # The IMAGE_PATH tag can be used to specify one or more files or # directories that contain image that are included in the documentation (see # the \image command). IMAGE_PATH = # The INPUT_FILTER tag can be used to specify a program that doxygen should # invoke to filter for each input file. Doxygen will invoke the filter program # by executing (via popen()) the command , where # is the value of the INPUT_FILTER tag, and is the name of an # input file. Doxygen will then use the output that the filter program writes # to standard output. If FILTER_PATTERNS is specified, this tag will be # ignored. INPUT_FILTER = # The FILTER_PATTERNS tag can be used to specify filters on a per file pattern # basis. Doxygen will compare the file name with each pattern and apply the # filter if there is a match. The filters are a list of the form: # pattern=filter (like *.cpp=my_cpp_filter). See INPUT_FILTER for further # info on how filters are used. If FILTER_PATTERNS is empty, INPUT_FILTER # is applied to all files. FILTER_PATTERNS = # If the FILTER_SOURCE_FILES tag is set to YES, the input filter (if set using # INPUT_FILTER) will be used to filter the input files when producing source # files to browse (i.e. when SOURCE_BROWSER is set to YES). FILTER_SOURCE_FILES = NO #--------------------------------------------------------------------------- # configuration options related to source browsing #--------------------------------------------------------------------------- # If the SOURCE_BROWSER tag is set to YES then a list of source files will # be generated. Documented entities will be cross-referenced with these sources. # Note: To get rid of all source code in the generated output, make sure also # VERBATIM_HEADERS is set to NO. If you have enabled CALL_GRAPH or CALLER_GRAPH # then you must also enable this option. If you don't then doxygen will produce # a warning and turn it on anyway SOURCE_BROWSER = YES # Setting the INLINE_SOURCES tag to YES will include the body # of functions and classes directly in the documentation. 
INLINE_SOURCES = NO # Setting the STRIP_CODE_COMMENTS tag to YES (the default) will instruct # doxygen to hide any special comment blocks from generated source code # fragments. Normal C and C++ comments will always remain visible. STRIP_CODE_COMMENTS = YES # If the REFERENCED_BY_RELATION tag is set to YES (the default) # then for each documented function all documented # functions referencing it will be listed. REFERENCED_BY_RELATION = YES # If the REFERENCES_RELATION tag is set to YES (the default) # then for each documented function all documented entities # called/used by that function will be listed. REFERENCES_RELATION = YES # If the REFERENCES_LINK_SOURCE tag is set to YES (the default) # and SOURCE_BROWSER tag is set to YES, then the hyperlinks from # functions in REFERENCES_RELATION and REFERENCED_BY_RELATION lists will # link to the source code. Otherwise they will link to the documentstion. REFERENCES_LINK_SOURCE = YES # If the USE_HTAGS tag is set to YES then the references to source code # will point to the HTML generated by the htags(1) tool instead of doxygen # built-in source browser. The htags tool is part of GNU's global source # tagging system (see http://www.gnu.org/software/global/global.html). You # will need version 4.8.6 or higher. USE_HTAGS = NO # If the VERBATIM_HEADERS tag is set to YES (the default) then Doxygen # will generate a verbatim copy of the header file for each class for # which an include is specified. Set to NO to disable this. VERBATIM_HEADERS = YES #--------------------------------------------------------------------------- # configuration options related to the alphabetical class index #--------------------------------------------------------------------------- # If the ALPHABETICAL_INDEX tag is set to YES, an alphabetical index # of all compounds will be generated. Enable this if the project # contains a lot of classes, structs, unions or interfaces. ALPHABETICAL_INDEX = NO # If the alphabetical index is enabled (see ALPHABETICAL_INDEX) then # the COLS_IN_ALPHA_INDEX tag can be used to specify the number of columns # in which this list will be split (can be a number in the range [1..20]) COLS_IN_ALPHA_INDEX = 5 # In case all classes in a project start with a common prefix, all # classes will be put under the same header in the alphabetical index. # The IGNORE_PREFIX tag can be used to specify one or more prefixes that # should be ignored while generating the index headers. IGNORE_PREFIX = #--------------------------------------------------------------------------- # configuration options related to the HTML output #--------------------------------------------------------------------------- # If the GENERATE_HTML tag is set to YES (the default) Doxygen will # generate HTML output. GENERATE_HTML = YES # The HTML_OUTPUT tag is used to specify where the HTML docs will be put. # If a relative path is entered the value of OUTPUT_DIRECTORY will be # put in front of it. If left blank `html' will be used as the default path. HTML_OUTPUT = html # The HTML_FILE_EXTENSION tag can be used to specify the file extension for # each generated HTML page (for example: .htm,.php,.asp). If it is left blank # doxygen will generate files with .html extension. HTML_FILE_EXTENSION = .html # The HTML_HEADER tag can be used to specify a personal HTML header for # each generated HTML page. If it is left blank doxygen will generate a # standard header. HTML_HEADER = # The HTML_FOOTER tag can be used to specify a personal HTML footer for # each generated HTML page. 
If it is left blank doxygen will generate a # standard footer. HTML_FOOTER = # The HTML_STYLESHEET tag can be used to specify a user-defined cascading # style sheet that is used by each HTML page. It can be used to # fine-tune the look of the HTML output. If the tag is left blank doxygen # will generate a default style sheet. Note that doxygen will try to copy # the style sheet file to the HTML output directory, so don't put your own # stylesheet in the HTML output directory as well, or it will be erased! HTML_STYLESHEET = # If the HTML_ALIGN_MEMBERS tag is set to YES, the members of classes, # files or namespaces will be aligned in HTML using tables. If set to # NO a bullet list will be used. HTML_ALIGN_MEMBERS = YES # If the GENERATE_HTMLHELP tag is set to YES, additional index files # will be generated that can be used as input for tools like the # Microsoft HTML help workshop to generate a compressed HTML help file (.chm) # of the generated HTML documentation. GENERATE_HTMLHELP = NO # If the HTML_DYNAMIC_SECTIONS tag is set to YES then the generated HTML # documentation will contain sections that can be hidden and shown after the # page has loaded. For this to work a browser that supports # JavaScript and DHTML is required (for instance Mozilla 1.0+, Firefox # Netscape 6.0+, Internet explorer 5.0+, Konqueror, or Safari). HTML_DYNAMIC_SECTIONS = NO # If the GENERATE_HTMLHELP tag is set to YES, the CHM_FILE tag can # be used to specify the file name of the resulting .chm file. You # can add a path in front of the file if the result should not be # written to the html output directory. CHM_FILE = # If the GENERATE_HTMLHELP tag is set to YES, the HHC_LOCATION tag can # be used to specify the location (absolute path including file name) of # the HTML help compiler (hhc.exe). If non-empty doxygen will try to run # the HTML help compiler on the generated index.hhp. HHC_LOCATION = # If the GENERATE_HTMLHELP tag is set to YES, the GENERATE_CHI flag # controls if a separate .chi index file is generated (YES) or that # it should be included in the master .chm file (NO). GENERATE_CHI = NO # If the GENERATE_HTMLHELP tag is set to YES, the BINARY_TOC flag # controls whether a binary table of contents is generated (YES) or a # normal table of contents (NO) in the .chm file. BINARY_TOC = NO # The TOC_EXPAND flag can be set to YES to add extra items for group members # to the contents of the HTML help documentation and to the tree view. TOC_EXPAND = NO # The DISABLE_INDEX tag can be used to turn on/off the condensed index at # top of each HTML page. The value NO (the default) enables the index and # the value YES disables it. DISABLE_INDEX = NO # This tag can be used to set the number of enum values (range [1..20]) # that doxygen will group on one line in the generated HTML documentation. ENUM_VALUES_PER_LINE = 4 # If the GENERATE_TREEVIEW tag is set to YES, a side panel will be # generated containing a tree-like index structure (just like the one that # is generated for HTML Help). For this to work a browser that supports # JavaScript, DHTML, CSS and frames is required (for instance Mozilla 1.0+, # Netscape 6.0+, Internet explorer 5.0+, or Konqueror). Windows users are # probably better off using the HTML help feature. GENERATE_TREEVIEW = NO # If the treeview is enabled (see GENERATE_TREEVIEW) then this tag can be # used to set the initial width (in pixels) of the frame in which the tree # is shown. 
TREEVIEW_WIDTH = 250 #--------------------------------------------------------------------------- # configuration options related to the LaTeX output #--------------------------------------------------------------------------- # If the GENERATE_LATEX tag is set to YES (the default) Doxygen will # generate Latex output. GENERATE_LATEX = NO # The LATEX_OUTPUT tag is used to specify where the LaTeX docs will be put. # If a relative path is entered the value of OUTPUT_DIRECTORY will be # put in front of it. If left blank `latex' will be used as the default path. LATEX_OUTPUT = latex # The LATEX_CMD_NAME tag can be used to specify the LaTeX command name to be # invoked. If left blank `latex' will be used as the default command name. LATEX_CMD_NAME = latex # The MAKEINDEX_CMD_NAME tag can be used to specify the command name to # generate index for LaTeX. If left blank `makeindex' will be used as the # default command name. MAKEINDEX_CMD_NAME = makeindex # If the COMPACT_LATEX tag is set to YES Doxygen generates more compact # LaTeX documents. This may be useful for small projects and may help to # save some trees in general. COMPACT_LATEX = NO # The PAPER_TYPE tag can be used to set the paper type that is used # by the printer. Possible values are: a4, a4wide, letter, legal and # executive. If left blank a4wide will be used. PAPER_TYPE = a4wide # The EXTRA_PACKAGES tag can be to specify one or more names of LaTeX # packages that should be included in the LaTeX output. EXTRA_PACKAGES = # The LATEX_HEADER tag can be used to specify a personal LaTeX header for # the generated latex document. The header should contain everything until # the first chapter. If it is left blank doxygen will generate a # standard header. Notice: only use this tag if you know what you are doing! LATEX_HEADER = # If the PDF_HYPERLINKS tag is set to YES, the LaTeX that is generated # is prepared for conversion to pdf (using ps2pdf). The pdf file will # contain links (just like the HTML output) instead of page references # This makes the output suitable for online browsing using a pdf viewer. PDF_HYPERLINKS = NO # If the USE_PDFLATEX tag is set to YES, pdflatex will be used instead of # plain latex in the generated Makefile. Set this option to YES to get a # higher quality PDF documentation. USE_PDFLATEX = NO # If the LATEX_BATCHMODE tag is set to YES, doxygen will add the \\batchmode. # command to the generated LaTeX files. This will instruct LaTeX to keep # running if errors occur, instead of asking the user for help. # This option is also used when generating formulas in HTML. LATEX_BATCHMODE = NO # If LATEX_HIDE_INDICES is set to YES then doxygen will not # include the index chapters (such as File Index, Compound Index, etc.) # in the output. LATEX_HIDE_INDICES = NO #--------------------------------------------------------------------------- # configuration options related to the RTF output #--------------------------------------------------------------------------- # If the GENERATE_RTF tag is set to YES Doxygen will generate RTF output # The RTF output is optimized for Word 97 and may not look very pretty with # other RTF readers or editors. GENERATE_RTF = NO # The RTF_OUTPUT tag is used to specify where the RTF docs will be put. # If a relative path is entered the value of OUTPUT_DIRECTORY will be # put in front of it. If left blank `rtf' will be used as the default path. RTF_OUTPUT = rtf # If the COMPACT_RTF tag is set to YES Doxygen generates more compact # RTF documents. 
This may be useful for small projects and may help to # save some trees in general. COMPACT_RTF = NO # If the RTF_HYPERLINKS tag is set to YES, the RTF that is generated # will contain hyperlink fields. The RTF file will # contain links (just like the HTML output) instead of page references. # This makes the output suitable for online browsing using WORD or other # programs which support those fields. # Note: wordpad (write) and others do not support links. RTF_HYPERLINKS = NO # Load stylesheet definitions from file. Syntax is similar to doxygen's # config file, i.e. a series of assignments. You only have to provide # replacements, missing definitions are set to their default value. RTF_STYLESHEET_FILE = # Set optional variables used in the generation of an rtf document. # Syntax is similar to doxygen's config file. RTF_EXTENSIONS_FILE = #--------------------------------------------------------------------------- # configuration options related to the man page output #--------------------------------------------------------------------------- # If the GENERATE_MAN tag is set to YES (the default) Doxygen will # generate man pages GENERATE_MAN = NO # The MAN_OUTPUT tag is used to specify where the man pages will be put. # If a relative path is entered the value of OUTPUT_DIRECTORY will be # put in front of it. If left blank `man' will be used as the default path. MAN_OUTPUT = man # The MAN_EXTENSION tag determines the extension that is added to # the generated man pages (default is the subroutine's section .3) MAN_EXTENSION = .3 # If the MAN_LINKS tag is set to YES and Doxygen generates man output, # then it will generate one additional man file for each entity # documented in the real man page(s). These additional files # only source the real man page, but without them the man command # would be unable to find the correct page. The default is NO. MAN_LINKS = NO #--------------------------------------------------------------------------- # configuration options related to the XML output #--------------------------------------------------------------------------- # If the GENERATE_XML tag is set to YES Doxygen will # generate an XML file that captures the structure of # the code including all documentation. GENERATE_XML = NO # The XML_OUTPUT tag is used to specify where the XML pages will be put. # If a relative path is entered the value of OUTPUT_DIRECTORY will be # put in front of it. If left blank `xml' will be used as the default path. XML_OUTPUT = xml # The XML_SCHEMA tag can be used to specify an XML schema, # which can be used by a validating XML parser to check the # syntax of the XML files. XML_SCHEMA = # The XML_DTD tag can be used to specify an XML DTD, # which can be used by a validating XML parser to check the # syntax of the XML files. XML_DTD = # If the XML_PROGRAMLISTING tag is set to YES Doxygen will # dump the program listings (including syntax highlighting # and cross-referencing information) to the XML output. Note that # enabling this will significantly increase the size of the XML output. XML_PROGRAMLISTING = YES #--------------------------------------------------------------------------- # configuration options for the AutoGen Definitions output #--------------------------------------------------------------------------- # If the GENERATE_AUTOGEN_DEF tag is set to YES Doxygen will # generate an AutoGen Definitions (see autogen.sf.net) file # that captures the structure of the code including all # documentation. 
Note that this feature is still experimental # and incomplete at the moment. GENERATE_AUTOGEN_DEF = NO #--------------------------------------------------------------------------- # configuration options related to the Perl module output #--------------------------------------------------------------------------- # If the GENERATE_PERLMOD tag is set to YES Doxygen will # generate a Perl module file that captures the structure of # the code including all documentation. Note that this # feature is still experimental and incomplete at the # moment. GENERATE_PERLMOD = NO # If the PERLMOD_LATEX tag is set to YES Doxygen will generate # the necessary Makefile rules, Perl scripts and LaTeX code to be able # to generate PDF and DVI output from the Perl module output. PERLMOD_LATEX = NO # If the PERLMOD_PRETTY tag is set to YES the Perl module output will be # nicely formatted so it can be parsed by a human reader. This is useful # if you want to understand what is going on. On the other hand, if this # tag is set to NO the size of the Perl module output will be much smaller # and Perl will parse it just the same. PERLMOD_PRETTY = YES # The names of the make variables in the generated doxyrules.make file # are prefixed with the string contained in PERLMOD_MAKEVAR_PREFIX. # This is useful so different doxyrules.make files included by the same # Makefile don't overwrite each other's variables. PERLMOD_MAKEVAR_PREFIX = #--------------------------------------------------------------------------- # Configuration options related to the preprocessor #--------------------------------------------------------------------------- # If the ENABLE_PREPROCESSING tag is set to YES (the default) Doxygen will # evaluate all C-preprocessor directives found in the sources and include # files. ENABLE_PREPROCESSING = YES # If the MACRO_EXPANSION tag is set to YES Doxygen will expand all macro # names in the source code. If set to NO (the default) only conditional # compilation will be performed. Macro expansion can be done in a controlled # way by setting EXPAND_ONLY_PREDEF to YES. MACRO_EXPANSION = NO # If the EXPAND_ONLY_PREDEF and MACRO_EXPANSION tags are both set to YES # then the macro expansion is limited to the macros specified with the # PREDEFINED and EXPAND_AS_DEFINED tags. EXPAND_ONLY_PREDEF = NO # If the SEARCH_INCLUDES tag is set to YES (the default) the includes files # in the INCLUDE_PATH (see below) will be search if a #include is found. SEARCH_INCLUDES = YES # The INCLUDE_PATH tag can be used to specify one or more directories that # contain include files that are not input files but should be processed by # the preprocessor. INCLUDE_PATH = # You can use the INCLUDE_FILE_PATTERNS tag to specify one or more wildcard # patterns (like *.h and *.hpp) to filter out the header-files in the # directories. If left blank, the patterns specified with FILE_PATTERNS will # be used. INCLUDE_FILE_PATTERNS = # The PREDEFINED tag can be used to specify one or more macro names that # are defined before the preprocessor is started (similar to the -D option of # gcc). The argument of the tag is a list of macros of the form: name # or name=definition (no spaces). If the definition and the = are # omitted =1 is assumed. To prevent a macro definition from being # undefined via #undef or recursively expanded use the := operator # instead of the = operator. PREDEFINED = # If the MACRO_EXPANSION and EXPAND_ONLY_PREDEF tags are set to YES then # this tag can be used to specify a list of macro names that should be expanded. 
# The macro definition that is found in the sources will be used. # Use the PREDEFINED tag if you want to use a different macro definition. EXPAND_AS_DEFINED = # If the SKIP_FUNCTION_MACROS tag is set to YES (the default) then # doxygen's preprocessor will remove all function-like macros that are alone # on a line, have an all uppercase name, and do not end with a semicolon. Such # function macros are typically used for boiler-plate code, and will confuse # the parser if not removed. SKIP_FUNCTION_MACROS = NO #--------------------------------------------------------------------------- # Configuration::additions related to external references #--------------------------------------------------------------------------- # The TAGFILES option can be used to specify one or more tagfiles. # Optionally an initial location of the external documentation # can be added for each tagfile. The format of a tag file without # this location is as follows: # TAGFILES = file1 file2 ... # Adding location for the tag files is done as follows: # TAGFILES = file1=loc1 "file2 = loc2" ... # where "loc1" and "loc2" can be relative or absolute paths or # URLs. If a location is present for each tag, the installdox tool # does not have to be run to correct the links. # Note that each tag file must have a unique name # (where the name does NOT include the path) # If a tag file is not located in the directory in which doxygen # is run, you must also specify the path to the tagfile here. TAGFILES = # When a file name is specified after GENERATE_TAGFILE, doxygen will create # a tag file that is based on the input files it reads. GENERATE_TAGFILE = # If the ALLEXTERNALS tag is set to YES all external classes will be listed # in the class index. If set to NO only the inherited external classes # will be listed. ALLEXTERNALS = NO # If the EXTERNAL_GROUPS tag is set to YES all external groups will be listed # in the modules index. If set to NO, only the current project's groups will # be listed. EXTERNAL_GROUPS = YES # The PERL_PATH should be the absolute path and name of the perl script # interpreter (i.e. the result of `which perl'). PERL_PATH = /usr/bin/perl #--------------------------------------------------------------------------- # Configuration options related to the dot tool #--------------------------------------------------------------------------- # If the CLASS_DIAGRAMS tag is set to YES (the default) Doxygen will # generate a inheritance diagram (in HTML, RTF and LaTeX) for classes with base # or super classes. Setting the tag to NO turns the diagrams off. Note that # this option is superseded by the HAVE_DOT option below. This is only a # fallback. It is recommended to install and use dot, since it yields more # powerful graphs. CLASS_DIAGRAMS = YES # You can define message sequence charts within doxygen comments using the \msc # command. Doxygen will then run the mscgen tool (see http://www.mcternan.me.uk/mscgen/) to # produce the chart and insert it in the documentation. The MSCGEN_PATH tag allows you to # specify the directory where the mscgen tool resides. If left empty the tool is assumed to # be found in the default search path. MSCGEN_PATH = # If set to YES, the inheritance and collaboration graphs will hide # inheritance and usage relations if the target is undocumented # or is not a class. HIDE_UNDOC_RELATIONS = YES # If you set the HAVE_DOT tag to YES then doxygen will assume the dot tool is # available from the path. 
This tool is part of Graphviz, a graph visualization # toolkit from AT&T and Lucent Bell Labs. The other options in this section # have no effect if this option is set to NO (the default) HAVE_DOT = NO # If the CLASS_GRAPH and HAVE_DOT tags are set to YES then doxygen # will generate a graph for each documented class showing the direct and # indirect inheritance relations. Setting this tag to YES will force the # the CLASS_DIAGRAMS tag to NO. CLASS_GRAPH = YES # If the COLLABORATION_GRAPH and HAVE_DOT tags are set to YES then doxygen # will generate a graph for each documented class showing the direct and # indirect implementation dependencies (inheritance, containment, and # class references variables) of the class with other documented classes. COLLABORATION_GRAPH = YES # If the GROUP_GRAPHS and HAVE_DOT tags are set to YES then doxygen # will generate a graph for groups, showing the direct groups dependencies GROUP_GRAPHS = YES # If the UML_LOOK tag is set to YES doxygen will generate inheritance and # collaboration diagrams in a style similar to the OMG's Unified Modeling # Language. UML_LOOK = NO # If set to YES, the inheritance and collaboration graphs will show the # relations between templates and their instances. TEMPLATE_RELATIONS = NO # If the ENABLE_PREPROCESSING, SEARCH_INCLUDES, INCLUDE_GRAPH, and HAVE_DOT # tags are set to YES then doxygen will generate a graph for each documented # file showing the direct and indirect include dependencies of the file with # other documented files. INCLUDE_GRAPH = YES # If the ENABLE_PREPROCESSING, SEARCH_INCLUDES, INCLUDED_BY_GRAPH, and # HAVE_DOT tags are set to YES then doxygen will generate a graph for each # documented header file showing the documented files that directly or # indirectly include this file. INCLUDED_BY_GRAPH = YES # If the CALL_GRAPH, SOURCE_BROWSER and HAVE_DOT tags are set to YES then doxygen will # generate a call dependency graph for every global function or class method. # Note that enabling this option will significantly increase the time of a run. # So in most cases it will be better to enable call graphs for selected # functions only using the \callgraph command. CALL_GRAPH = NO # If the CALLER_GRAPH, SOURCE_BROWSER and HAVE_DOT tags are set to YES then doxygen will # generate a caller dependency graph for every global function or class method. # Note that enabling this option will significantly increase the time of a run. # So in most cases it will be better to enable caller graphs for selected # functions only using the \callergraph command. CALLER_GRAPH = NO # If the GRAPHICAL_HIERARCHY and HAVE_DOT tags are set to YES then doxygen # will graphical hierarchy of all classes instead of a textual one. GRAPHICAL_HIERARCHY = YES # If the DIRECTORY_GRAPH, SHOW_DIRECTORIES and HAVE_DOT tags are set to YES # then doxygen will show the dependencies a directory has on other directories # in a graphical way. The dependency relations are determined by the #include # relations between the files in the directories. DIRECTORY_GRAPH = YES # The DOT_IMAGE_FORMAT tag can be used to set the image format of the images # generated by dot. Possible values are png, jpg, or gif # If left blank png will be used. DOT_IMAGE_FORMAT = png # The tag DOT_PATH can be used to specify the path where the dot tool can be # found. If left blank, it is assumed the dot tool can be found in the path. 
DOT_PATH = # The DOTFILE_DIRS tag can be used to specify one or more directories that # contain dot files that are included in the documentation (see the # \dotfile command). DOTFILE_DIRS = # The MAX_DOT_GRAPH_MAX_NODES tag can be used to set the maximum number of # nodes that will be shown in the graph. If the number of nodes in a graph # becomes larger than this value, doxygen will truncate the graph, which is # visualized by representing a node as a red box. Note that doxygen if the number # of direct children of the root node in a graph is already larger than # MAX_DOT_GRAPH_NOTES then the graph will not be shown at all. Also note # that the size of a graph can be further restricted by MAX_DOT_GRAPH_DEPTH. DOT_GRAPH_MAX_NODES = 50 # The MAX_DOT_GRAPH_DEPTH tag can be used to set the maximum depth of the # graphs generated by dot. A depth value of 3 means that only nodes reachable # from the root by following a path via at most 3 edges will be shown. Nodes # that lay further from the root node will be omitted. Note that setting this # option to 1 or 2 may greatly reduce the computation time needed for large # code bases. Also note that the size of a graph can be further restricted by # DOT_GRAPH_MAX_NODES. Using a depth of 0 means no depth restriction. MAX_DOT_GRAPH_DEPTH = 0 # Set the DOT_TRANSPARENT tag to YES to generate images with a transparent # background. This is disabled by default, which results in a white background. # Warning: Depending on the platform used, enabling this option may lead to # badly anti-aliased labels on the edges of a graph (i.e. they become hard to # read). DOT_TRANSPARENT = YES # Set the DOT_MULTI_TARGETS tag to YES allow dot to generate multiple output # files in one run (i.e. multiple -o and -T options on the command line). This # makes dot run faster, but since only newer versions of dot (>1.8.10) # support this, this feature is disabled by default. DOT_MULTI_TARGETS = NO # If the GENERATE_LEGEND tag is set to YES (the default) Doxygen will # generate a legend page explaining the meaning of the various boxes and # arrows in the dot generated graphs. GENERATE_LEGEND = YES # If the DOT_CLEANUP tag is set to YES (the default) Doxygen will # remove the intermediate dot files that are used to generate # the various graphs. DOT_CLEANUP = YES #--------------------------------------------------------------------------- # Configuration::additions related to the search engine #--------------------------------------------------------------------------- # The SEARCHENGINE tag specifies whether or not a search engine should be # used. If set to NO the values of all tags below this one will be ignored. 
SEARCHENGINE = NO whitedb-0.7.2/Examples/000077500000000000000000000000001226454622500147345ustar00rootroot00000000000000whitedb-0.7.2/Examples/Makefile.am000066400000000000000000000012471226454622500167740ustar00rootroot00000000000000# $Id: $ # $Source: $ # # Compile test program(s) # ---- options ---- # ---- targets ---- noinst_PROGRAMS = demo query #if RAPTOR #noinst_PROGRAMS += raptortry #endif # ---- extra dependencies, flags, etc ----- LIBDEPS = -lm # dependency from libm round() should be removed if RAPTOR LIBDEPS += `$(RAPTOR_CONFIG) --libs` endif #if RAPTOR #raptortry_CFLAGS = $(AM_CFLAGS) `$(RAPTOR_CONFIG) --cflags` #endif AM_LDFLAGS = $(LIBDEPS) # ----- all sources for the created programs ----- #raptortry_SOURCES = raptortry.c #raptortry_LDADD = ../Main/libwgdb.la demo_SOURCES = demo.c demo_LDADD = ../Main/libwgdb.la query_SOURCES = query.c query_LDADD = ../Main/libwgdb.la whitedb-0.7.2/Examples/compile_demo.bat000077500000000000000000000003701226454622500200630ustar00rootroot00000000000000cl /Ox /W3 /I..\Db demo.c ..\Db\dbmem.c ..\Db\dballoc.c ..\Db\dbdata.c ..\Db\dblock.c ..\DB\dbindex.c ..\Db\dblog.c ..\Db\dbhash.c ..\Db\dbcompare.c ..\Db\dbquery.c ..\Db\dbutil.c ..\Db\dbmpool.c ..\Db\dbjson.c ..\Db\dbschema.c ..\json\yajl_all.c whitedb-0.7.2/Examples/compile_demo.sh000077500000000000000000000004241226454622500177270ustar00rootroot00000000000000#/bin/sh gcc -O3 -march=pentium4 -o demo demo.c ../Db/dbmem.c ../Db/dballoc.c ../Db/dbdata.c ../Db/dblock.c ../Db/dbindex.c ../Db/dblog.c ../Db/dbhash.c ../Db/dbcompare.c ../Db/dbquery.c ../Db/dbutil.c ../Db/dbmpool.c ../Db/dbjson.c ../Db/dbschema.c ../json/yajl_all.c -lm whitedb-0.7.2/Examples/compile_query.bat000077500000000000000000000004101226454622500202770ustar00rootroot00000000000000cl /Ox /W3 /I..\Db query.c ..\Db\dbmem.c ..\Db\dballoc.c ..\Db\dbdata.c ..\Db\dblock.c ..\DB\dbindex.c ..\Db\dblog.c ..\Db\dbhash.c ..\Db\dbcompare.c ..\Db\dbquery.c ..\Db\dbutil.c ..\Db\dbtest.c ..\Db\dbmpool.c ..\Db\dbjson.c ..\Db\dbschema.c ..\json\yajl_all.c whitedb-0.7.2/Examples/compile_query.sh000077500000000000000000000004461226454622500201540ustar00rootroot00000000000000#/bin/sh gcc -O3 -march=pentium4 -o query query.c ../Db/dbmem.c ../Db/dballoc.c ../Db/dbdata.c ../Db/dblock.c ../Db/dbindex.c ../Db/dblog.c ../Db/dbhash.c ../Db/dbcompare.c ../Db/dbquery.c ../Db/dbutil.c ../Db/dbtest.c ../Db/dbmpool.c ../Db/dbjson.c ../Db/dbschema.c ../json/yajl_all.c -lm whitedb-0.7.2/Examples/demo.c000066400000000000000000000205201226454622500160230ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Priit Järv 2010 * * Minor mods by Tanel Tammet * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . 
* */ /** @file demo.c * Demonstration of WhiteDB low-level API usage */ /* ====== Includes =============== */ #include #include #include /* Include dbapi.h for WhiteDB API functions */ #include "../Db/dbapi.h" #ifdef __cplusplus extern "C" { #endif /* ====== Private defs =========== */ /* ======= Private protos ================ */ void run_demo(void *db); /* ====== Functions ============== */ /** Init database, run demo, drop database * Command line arguments are ignored. */ int main(int argc, char **argv) { void* shmptr; /* Create a database with custom key and 2M size */ shmptr=wg_attach_database("9273", 2000000); /* Using default key and size: shmptr=wg_attach_database(NULL, 0); */ if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } /* We have successfully attached, run the demo code */ run_demo(shmptr); /* Clean up. The shared memory area is released. This is * useful for the purposes of this demo, but might best be * avoided for more persistent databases. */ wg_delete_database("9273"); /* Database with default key: wg_delete_database(NULL); */ exit(0); } /** Run demo code. * Uses various database API functions. */ void run_demo(void* db) { void *rec = NULL, *firstrec = NULL, *nextrec = NULL; /* Pointers to a database record */ wg_int enc; /* Encoded data */ wg_int lock_id; /* Id of an acquired lock (for releasing it later) */ wg_int len; int i; int intdata, datedata, timedata; char strbuf[80]; printf("********* Starting demo ************\n"); /* Begin by creating a simple record of 3 fields and fill it * with integer data. */ printf("Creating first record.\n"); rec=wg_create_record(db, 3); if (rec==NULL) { printf("rec creation error.\n"); return; } /* Encode a field, checking for errors */ enc = wg_encode_int(db, 44); if(enc==WG_ILLEGAL) { printf("failed to encode an integer.\n"); return; } /* Negative return value shows that an error occurred */ if(wg_set_field(db, rec, 0, enc) < 0) { printf("failed to store a field.\n"); return; } /* Skip error checking for the sake of brevity for the rest of fields */ enc = wg_encode_int(db, -199999); wg_set_field(db, rec, 1, enc); wg_set_field(db, rec, 2, wg_encode_int(db, 0)); /* Now examine the record we have created. Get record length, * encoded value of each field, data type and decoded value. */ /* Negative return value shows an error. 
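   Here that applies to wg_get_record_len(); note that the encoding calls
   above signal failure differently, by returning WG_ILLEGAL.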
*/ len = wg_get_record_len(db, rec); if(len < 0) { printf("failed to get record length.\n"); return; } printf("Size of created record at %p was: %d\n", rec, (int) len); for(i=0; i #include #include #include #include // for alarm and termination signal handling #include // #if _MSC_VER // no alarm on windows #else #include // for alarm #endif /* =============== configuration macros =================== */ #define DEFAULT_DATABASE "1000" #define TIMEOUT_SECONDS 2 #define JSON_CONTENT_TYPE "content-type: application/json\r\n\r\n" #define CSV_CONTENT_TYPE "content-type: text/csv\r\n\r\n" #define CONTENT_LENGTH "content-length: %d\r\n" #define MAXQUERYLEN 2000 // query string length limit #define MAXPARAMS 100 // max number of cgi params in query #define MAXCOUNT 10000 // max number of result records #define MAXIDS 1000 // max number of rec id-s in recids query #define INITIAL_MALLOC 1000 // initially malloced result size #define MAX_MALLOC 100000000 // max malloced result size #define MIN_STRLEN 100 // fixed-len obj strlen, add this to strlen for print-space need #define STRLEN_FACTOR 6 // might need 6*strlen for json encoding #define DOUBLE_FORMAT "%g" // snprintf format for printing double #define JS_NULL "[]" #define CSV_SEPARATOR ',' // must be a single char #define MAX_DEPTH_DEFAULT 100 // can be increased #define MAX_DEPTH_HARD 10000 // too deep rec nesting will cause stack overflow in the printer #define TIMEOUT_ERR "timeout" #define INTERNAL_ERR "internal error" #define NOQUERY_ERR "no query" #define LONGQUERY_ERR "too long query" #define MALFQUERY_ERR "malformed query" #define UNKNOWN_PARAM_ERR "unrecognized parameter: %s" #define UNKNOWN_PARAM_VALUE_ERR "unrecognized value %s for parameter %s" #define NO_OP_ERR "no op given: use op=opname for opname in search" #define UNKNOWN_OP_ERR "unrecognized op: use op=search or op=recids" #define NO_FIELD_ERR "no field given" #define NO_VALUE_ERR "no value given" #define DB_PARAM_ERR "use db=name with a numeric name for a concrete database" #define DB_ATTACH_ERR "no database found: use db=name with a numeric name for a concrete database" #define FIELD_ERR "unrecognized field: use an integer starting from 0" #define COND_ERR "unrecognized compare: use equal, not_equal, lessthan, greater, ltequal or gtequal" #define INTYPE_ERR "unrecognized type: use null, int, double, str, char or record " #define INVALUE_ERR "did not find a value to use for comparison" #define INVALUE_TYPE_ERR "value does not match type" #define LOCK_ERR "database locked" #define LOCK_RELEASE_ERR "releasing read lock failed: database may be in deadlock" #define MALLOC_ERR "cannot allocate enough memory for result string" #define QUERY_ERR "query creation failed" #define DECODE_ERR "field data decoding failed" #define JS_TYPE_ERR "\"\"" // currently this will be shown also for empty string #if _MSC_VER // microsoft compatibility #define snprintf _snprintf #endif /* =============== protos =================== */ void timeout_handler(int signal); void termination_handler(int signal); void print_final(char* str,int format); char* search(char* database, char* inparams[], char* invalues[], int count, int* hformat); char* recids(char* database, char* inparams[], char* invalues[], int incount, int* hformat); static wg_int encode_incomp(void* db, char* incomp); static wg_int encode_intype(void* db, char* intype); static wg_int encode_invalue(void* db, char* invalue, wg_int type); static int isint(char* s); static int isdbl(char* s); static int parse_query(char* query, int ql, char* 
params[], char* values[]); static char* urldecode(char *indst, char *src); int sprint_record(void *db, wg_int *rec, char **buf, int *bufsize, char **bptr, int format, int showid, int depth, int maxdepth, int strenc); char* sprint_value(void *db, wg_int enc, char **buf, int *bufsize, char **bptr, int format, int showid, int depth, int maxdepth, int strenc); int sprint_string(char* bptr, int limit, char* strdata, int strenc); int sprint_blob(char* bptr, int limit, char* strdata, int strenc); int sprint_append(char** buf, char* str, int l); static char* str_new(int len); static int str_guarantee_space(char** stradr, int* strlenadr, char** ptr, int needed); void err_clear_detach_halt(void* db, wg_int lock_id, char* errstr); void errhalt(char* str); /* =============== globals =================== */ // global vars are used only for enabling signal/error handlers // to free the lock and detach from the database: // set/cleared after taking/releasing lock, attaching/detaching database // global_format used in error handler to select content-type header void* global_db=NULL; // NULL iff not attached wg_int global_lock_id=0; // 0 iff not locked int global_format=1; // 1 json, 0 csv /* =============== main =================== */ int main(int argc, char **argv) { int i=0; char* inquery=NULL; char query[MAXQUERYLEN]; int pcount=0; int ql,found; char* res=NULL; char* database=DEFAULT_DATABASE; char* params[MAXPARAMS]; char* values[MAXPARAMS]; int hformat=1; // for header 0: csv, 1: json: reset later after reading params // Set up timeout signal and abnormal termination handlers: // the termination handler clears the read lock and detaches database. // This may fail, however, for some lock strategies and in case // nontrivial operations are taken in the handler. #if _MSC_VER // no signals on windows #else signal(SIGSEGV,termination_handler); signal(SIGINT,termination_handler); signal(SIGFPE,termination_handler); signal(SIGABRT,termination_handler); signal(SIGTERM,termination_handler); signal(SIGALRM,timeout_handler); alarm(TIMEOUT_SECONDS); #endif // for debugging print the plain content-type immediately // printf("content-type: text/plain\r\n"); // get the cgi query: passed by server or given on the command line inquery=getenv("QUERY_STRING"); if (inquery==NULL && argc>1) inquery=argv[1]; // or use your own query string for testing a la // inquery="db=1000&op=search&field=1&value=2&compare=equal&type=record&from=0&count=3"; // parse the query if (inquery==NULL || inquery[0]=='\0') errhalt(NOQUERY_ERR); ql=strlen(inquery); if (ql>MAXQUERYLEN) errhalt(LONGQUERY_ERR); strcpy((char*)query,inquery); //printf("query: %s\n",query); pcount=parse_query(query,ql,params,values); if (pcount<=0) errhalt(MALFQUERY_ERR); //for(i=0;iMAX_DEPTH_HARD) maxdepth=MAX_DEPTH_HARD; rec=wg_get_first_record(db); do { if (rcount>=from) { gcount++; if (gcount>count) break; str_guarantee_space(&strbuffer,&strbufferlen,&strbufferptr,MIN_STRLEN); if (gcount>1 && format!=0) { // json and not first row snprintf(strbufferptr,MIN_STRLEN,",\n"); strbufferptr+=2; } sprint_record(db,rec,&strbuffer,&strbufferlen,&strbufferptr,format,showid,0,maxdepth,escape); if (format==0) { // csv str_guarantee_space(&strbuffer,&strbufferlen,&strbufferptr,MIN_STRLEN); snprintf(strbufferptr,MIN_STRLEN,"\r\n"); strbufferptr+=2; } } rec=wg_get_next_record(db,rec); rcount++; } while(rec!=NULL); if (!wg_end_read(db, lock_id)) { // release read lock err_clear_detach_halt(db,lock_id,LOCK_RELEASE_ERR); } global_lock_id=0; // only for handling errors 
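  // the read lock has been released above; detach from shared memory,
  // append the closing "]" for JSON output and return the result buffer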
wg_detach_database(db); global_db=NULL; // only for handling errors str_guarantee_space(&strbuffer,&strbufferlen,&strbufferptr,MIN_STRLEN); if (format!=0) { // json snprintf(strbufferptr,MIN_STRLEN,"\n]"); strbufferptr+=3; } return strbuffer; } // ------------ normal search case: --------- // create a query list datastructure for(i=0;iMAX_DEPTH_HARD) maxdepth=MAX_DEPTH_HARD; while((rec = wg_fetch(db, wgquery))) { if (rcount>=from) { gcount++; str_guarantee_space(&strbuffer,&strbufferlen,&strbufferptr,MIN_STRLEN); if (gcount>1 && format!=0) { // json and not first row snprintf(strbufferptr,MIN_STRLEN,",\n"); strbufferptr+=2; } sprint_record(db,rec,&strbuffer,&strbufferlen,&strbufferptr,format,showid,0,maxdepth,escape); if (format==0) { // csv str_guarantee_space(&strbuffer,&strbufferlen,&strbufferptr,MIN_STRLEN); snprintf(strbufferptr,MIN_STRLEN,"\r\n"); strbufferptr+=2; } } rcount++; if (gcount>=count) break; } // free query datastructure, release lock, detach for(i=0;i0) ids[x++]=atoi(cids+j); if (x>=MAXIDS) break; for(;jMAX_DEPTH_HARD) maxdepth=MAX_DEPTH_HARD; for(j=0; ids[j]!=0 && jcount) break; str_guarantee_space(&strbuffer,&strbufferlen,&strbufferptr,MIN_STRLEN); if (gcount>1 && format!=0) { // json and not first row snprintf(strbufferptr,MIN_STRLEN,",\n"); strbufferptr+=2; } sprint_record(db,rec,&strbuffer,&strbufferlen,&strbufferptr,format,showid,0,maxdepth,escape); if (format==0) { // csv str_guarantee_space(&strbuffer,&strbufferlen,&strbufferptr,MIN_STRLEN); snprintf(strbufferptr,MIN_STRLEN,"\r\n"); strbufferptr+=2; } rec=wg_get_next_record(db,rec); } if (!wg_end_read(db, lock_id)) { // release read lock err_clear_detach_halt(db,lock_id,LOCK_RELEASE_ERR); } global_lock_id=0; // only for handling errors wg_detach_database(db); global_db=NULL; // only for handling errors str_guarantee_space(&strbuffer,&strbufferlen,&strbufferptr,MIN_STRLEN); if (format!=0) { // json snprintf(strbufferptr,MIN_STRLEN,"\n]"); strbufferptr+=3; } return strbuffer; } /* *************** encode cgi params as query vals ******************** */ static wg_int encode_incomp(void* db, char* incomp) { if (incomp==NULL || incomp=='\0') return WG_COND_EQUAL; else if (!strcmp(incomp,"equal")) return WG_COND_EQUAL; else if (!strcmp(incomp,"not_equal")) return WG_COND_NOT_EQUAL; else if (!strcmp(incomp,"lessthan")) return WG_COND_LESSTHAN; else if (!strcmp(incomp,"greater")) return WG_COND_GREATER; else if (!strcmp(incomp,"ltequal")) return WG_COND_LTEQUAL; else if (!strcmp(incomp,"gtequal")) return WG_COND_GTEQUAL; else err_clear_detach_halt(db,0,COND_ERR); return WG_COND_EQUAL; // this return never happens } static wg_int encode_intype(void* db, char* intype) { if (intype==NULL || intype=='\0') return 0; else if (!strcmp(intype,"null")) return WG_NULLTYPE; else if (!strcmp(intype,"int")) return WG_INTTYPE; else if (!strcmp(intype,"record")) return WG_RECORDTYPE; else if (!strcmp(intype,"double")) return WG_DOUBLETYPE; else if (!strcmp(intype,"str")) return WG_STRTYPE; else if (!strcmp(intype,"char")) return WG_CHARTYPE; else err_clear_detach_halt(db,0,INTYPE_ERR); return 0; // this return never happens } static wg_int encode_invalue(void* db, char* invalue, wg_int type) { if (invalue==NULL) { err_clear_detach_halt(db,0,INVALUE_ERR); return 0; // this return never happens } if (type==WG_NULLTYPE) return wg_encode_query_param_null(db,NULL); else if (type==WG_INTTYPE) { if (!isint(invalue)) err_clear_detach_halt(db,0,INVALUE_TYPE_ERR); return wg_encode_query_param_int(db,atoi(invalue)); } else if (type==WG_RECORDTYPE) 
{ if (!isint(invalue)) err_clear_detach_halt(db,0,INVALUE_TYPE_ERR); return (wg_int)atoi(invalue); } else if (type==WG_DOUBLETYPE) { if (!isdbl(invalue)) err_clear_detach_halt(db,0,INVALUE_TYPE_ERR); return wg_encode_query_param_double(db,strtod(invalue,NULL)); } else if (type==WG_STRTYPE) { return wg_encode_query_param_str(db,invalue,NULL); } else if (type==WG_CHARTYPE) { return wg_encode_query_param_char(db,invalue[0]); } else if (type==0 && isint(invalue)) { return wg_encode_query_param_int(db,atoi(invalue)); } else if (type==0 && isdbl(invalue)) { return wg_encode_query_param_double(db,strtod(invalue,NULL)); } else if (type==0) { return wg_encode_query_param_str(db,invalue,NULL); } else { err_clear_detach_halt(db,0,INVALUE_TYPE_ERR); return 0; // this return never happens } } /* ******************* cgi query parsing ******************** */ /* query parser: split by & and =, urldecode param and value return param count or -1 for error */ static int parse_query(char* query, int ql, char* params[], char* values[]) { int count=0; int i,pi,vi; for(i=0;i=ql) return -1; vi=i; for(;i=MAXPARAMS) return -1; params[count]=urldecode(query+pi,query+pi); values[count]=urldecode(query+vi,query+vi); count++; } return count; } /* urldecode used by query parser */ static char* urldecode(char *indst, char *src) { char a, b; char* endptr; char* dst; dst=indst; if (src==NULL || src[0]=='\0') return indst; endptr=src+strlen(src); while (*src) { if ((*src == '%') && (src+2= 'a') a -= 'A'-'a'; if (a >= 'A') a -= ('A' - 10); else a -= '0'; if (b >= 'a') b -= 'A'-'a'; if (b >= 'A') b -= ('A' - 10); else b -= '0'; *dst++ = 16*a+b; src+=3; } else { *dst++ = *src++; } } *dst++ = '\0'; return indst; } /* **************** guess string datatype ***************** */ /* return 1 iff s contains numerals only */ static int isint(char* s) { if (s==NULL) return 0; while(*s!='\0') { if (!isdigit(*s)) return 0; s++; } return 1; } /* return 1 iff s contains numerals plus single optional period only */ static int isdbl(char* s) { int c=0; if (s==NULL) return 0; while(*s!='\0') { if (!isdigit(*s)) { if (*s=='.') c++; else return 0; if (c>1) return 0; } s++; } return 1; } /* **************** json printing **************** */ /** Print a record, handling records recursively The value is written into a character buffer. 
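   Illustrative output, with hypothetical field values: a record holding the
   int 1, the double 2.5 and the string foo would print roughly as
   [1,2.5,"foo"] in json format and as 1,2.5,"foo" in csv format.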
db: database pointer rec: rec to be printed buf: address of the whole string buffer start (not the start itself) bufsize: address of the actual pointer to start printing at in buffer bptr: address of the whole string buffer format: 0 csv, 1 json showid: print record id for record: 0 no show, 1 first (extra) elem of record depth: current depth in a nested record tree (increases from initial 0) maxdepth: limit on printing records nested via record pointers (0: no nesting) strenc==0: nothing is escaped at all strenc==1: non-ascii chars and % and " urlencoded strenc==2: json utf-8 encoding, not ascii-safe */ int sprint_record(void *db, wg_int *rec, char **buf, int *bufsize, char **bptr, int format, int showid, int depth, int maxdepth, int strenc) { int i,limit; wg_int enc, len; #ifdef USE_CHILD_DB void *parent; #endif limit=MIN_STRLEN; str_guarantee_space(buf, bufsize, bptr, MIN_STRLEN); if (rec==NULL) { snprintf(*bptr, limit, JS_NULL); (*bptr)+=strlen(JS_NULL); return 1; } if (format!=0) { // json **bptr= '['; (*bptr)++; } #ifdef USE_CHILD_DB parent = wg_get_rec_owner(db, rec); #endif if (1) { len = wg_get_record_len(db, rec); if (showid) { // add record id (offset) as the first extra elem of record snprintf(*bptr, limit-1, "%d",wg_encode_record(db,rec)); *bptr=*bptr+strlen(*bptr); } for(i=0; i=maxdepth) { snprintf(*bptr, limit,"%d", (int)enc); // record offset (i.e. id) return *bptr+strlen(*bptr); } else { // recursive print ptrdata = wg_decode_record(db, enc); sprint_record(db,ptrdata,buf,bufsize,bptr,format,showid,depth+1,maxdepth,strenc); **bptr='\0'; return *bptr; } break; case WG_INTTYPE: intdata = wg_decode_int(db, enc); str_guarantee_space(buf, bufsize, bptr, MIN_STRLEN); snprintf(*bptr, limit, "%d", intdata); return *bptr+strlen(*bptr); case WG_DOUBLETYPE: doubledata = wg_decode_double(db, enc); str_guarantee_space(buf, bufsize, bptr, MIN_STRLEN); snprintf(*bptr, limit, DOUBLE_FORMAT, doubledata); return *bptr+strlen(*bptr); case WG_FIXPOINTTYPE: doubledata = wg_decode_fixpoint(db, enc); str_guarantee_space(buf, bufsize, bptr, MIN_STRLEN); snprintf(*bptr, limit, DOUBLE_FORMAT, doubledata); return *bptr+strlen(*bptr); case WG_STRTYPE: strdata = wg_decode_str(db, enc); exdata = wg_decode_str_lang(db,enc); if (strdata!=NULL) strl1=strlen(strdata); else strl1=0; if (exdata!=NULL) strl2=strlen(exdata); else strl2=0; str_guarantee_space(buf, bufsize, bptr, MIN_STRLEN+STRLEN_FACTOR*(strl1+strl2)); sprint_string(*bptr,(strl1+strl2),strdata,strenc); if (exdata!=NULL) { snprintf(*bptr+strl1+1,limit,"@%s\"", exdata); } return *bptr+strlen(*bptr); case WG_URITYPE: strdata = wg_decode_uri(db, enc); exdata = wg_decode_uri_prefix(db, enc); if (strdata!=NULL) strl1=strlen(strdata); else strl1=0; if (exdata!=NULL) strl2=strlen(exdata); else strl2=0; limit=MIN_STRLEN+STRLEN_FACTOR*(strl1+strl2); str_guarantee_space(buf, bufsize, bptr, limit); if (exdata==NULL) snprintf(*bptr, limit, "\"%s\"", strdata); else snprintf(*bptr, limit, "\"%s:%s\"", exdata, strdata); return *bptr+strlen(*bptr); case WG_XMLLITERALTYPE: strdata = wg_decode_xmlliteral(db, enc); exdata = wg_decode_xmlliteral_xsdtype(db, enc); if (strdata!=NULL) strl1=strlen(strdata); else strl1=0; if (exdata!=NULL) strl2=strlen(exdata); else strl2=0; limit=MIN_STRLEN+STRLEN_FACTOR*(strl1+strl2); str_guarantee_space(buf, bufsize, bptr, limit); snprintf(*bptr, limit, "\"%s:%s\"", exdata, strdata); return *bptr+strlen(*bptr); case WG_CHARTYPE: intdata = wg_decode_char(db, enc); str_guarantee_space(buf, bufsize, bptr, MIN_STRLEN); snprintf(*bptr, 
limit, "\"%c\"", (char) intdata); return *bptr+strlen(*bptr); case WG_DATETYPE: intdata = wg_decode_date(db, enc); wg_strf_iso_datetime(db,intdata,0,strbuf); strbuf[10]=0; str_guarantee_space(buf, bufsize, bptr, MIN_STRLEN); snprintf(*bptr, limit, "\"%s\"",strbuf); return *bptr+strlen(*bptr); case WG_TIMETYPE: intdata = wg_decode_time(db, enc); wg_strf_iso_datetime(db,1,intdata,strbuf); str_guarantee_space(buf, bufsize, bptr, MIN_STRLEN); snprintf(*bptr, limit, "\"%s\"",strbuf+11); return *bptr+strlen(*bptr); case WG_VARTYPE: intdata = wg_decode_var(db, enc); str_guarantee_space(buf, bufsize, bptr, MIN_STRLEN); snprintf(*bptr, limit, "\"?%d\"", intdata); return *bptr+strlen(*bptr); case WG_BLOBTYPE: strdata = wg_decode_blob(db, enc); strl=wg_decode_blob_len(db, enc); limit=MIN_STRLEN+STRLEN_FACTOR*strlen(strdata); str_guarantee_space(buf, bufsize, bptr, limit); sprint_blob(*bptr,strl,strdata,strenc); return *bptr+strlen(*bptr); default: str_guarantee_space(buf, bufsize, bptr, MIN_STRLEN); snprintf(*bptr, limit, JS_TYPE_ERR); return *bptr+strlen(*bptr); } } /* Print string with several encoding/escaping options. It must be guaranteed beforehand that there is enough room in the buffer. bptr: direct pointer to location in buffer where to start writing limit: max nr of chars traversed (NOT limiting output len) strdata: pointer to printed string strenc==0: nothing is escaped at all strenc==1: non-ascii chars and % and " urlencoded strenc==2: json utf-8 encoding, not ascii-safe strenc==3: csv encoding, only " replaced for "" For proper json tools see: json rfc http://www.ietf.org/rfc/rfc4627.txt ccan json tool http://git.ozlabs.org/?p=ccan;a=tree;f=ccan/json Jansson json tool https://jansson.readthedocs.org/en/latest/ Parser http://linuxprograms.wordpress.com/category/json-c/ */ int sprint_string(char* bptr, int limit, char* strdata, int strenc) { unsigned char c; char *sptr; char *hex_chars="0123456789abcdef"; int i; sptr=strdata; *bptr++='"'; if (sptr==NULL) { *bptr++='"'; *bptr='\0'; return 1; } if (!strenc) { // nothing is escaped at all for(i=0;i126) { *bptr++='%'; *bptr++=hex_chars[c >> 4]; *bptr++=hex_chars[c & 0xf]; } else { *bptr++=c; } } } else { // json encoding; chars before ' ' are are escaped with \u00 sptr=strdata; for(i=0;i> 4]; *bptr++=hex_chars[c & 0xf]; } else { *bptr++=c; } } } } *bptr++='"'; *bptr='\0'; return 1; } int sprint_blob(char* bptr, int limit, char* strdata, int strenc) { unsigned char c; char *sptr; char *hex_chars="0123456789abcdef"; int i; sptr=strdata; *bptr++='"'; if (sptr==NULL) { *bptr++='"'; *bptr='\0'; return 1; } // non-ascii chars and % and " urlencoded for(i=0;i126) { *bptr++='%'; *bptr++=hex_chars[c >> 4]; *bptr++=hex_chars[c & 0xf]; } else { *bptr++=c; } } *bptr++='"'; *bptr='\0'; return 1; } int sprint_append(char** bptr, char* str, int l) { int i; for(i=0;i(*strlenadr-(int)((*ptr)-(*stradr)))) { used=(int)((*ptr)-(*stradr)); newlen=(*strlenadr)*2; if (newlenMAX_MALLOC) { if (*stradr!=NULL) free(*stradr); err_clear_detach_halt(global_db,global_lock_id,MALLOC_ERR); return 0; // never returns } //printf("needed %d oldlen %d used %d newlen %d \n",needed,*strlenadr,used,newlen); tmp=realloc(*stradr,newlen); if (tmp==NULL) { if (*stradr!=NULL) free(*stradr); err_clear_detach_halt(global_db,global_lock_id,MALLOC_ERR); return 0; // never returns } tmp[newlen-1]=0; // set last byte to 0 //printf("oldstradr %d newstradr %d oldptr %d newptr %d \n",(int)*stradr,(int)tmp,(int)*ptr,(int)tmp+used); *stradr=tmp; *strlenadr=newlen; *ptr=tmp+used; return 1; } return 
1; } /* ******************* errors ******************** */ /* called in case of internal errors by the signal catcher: it is crucial that the locks are released and db detached */ void termination_handler(int xsignal) { err_clear_detach_halt(global_db,global_lock_id,INTERNAL_ERR); } /* called in case of timeout by the signal catcher: it is crucial that the locks are released and db detached */ void timeout_handler(int signal) { err_clear_detach_halt(global_db,global_lock_id,TIMEOUT_ERR); } /* normal termination call: free locks, detach, call errprint and halt */ void err_clear_detach_halt(void* db, wg_int lock_id, char* errstr) { int r; if (lock_id) { r=wg_end_read(db, lock_id); global_lock_id=0; // only for handling errors } if (db!=NULL) wg_detach_database(db); global_db=NULL; // only for handling errors errhalt(errstr); } /* error output and immediate halt */ void errhalt(char* str) { char buf[1000]; snprintf(buf,1000,"[\"%s\"]",str); print_final(buf,global_format); exit(0); } whitedb-0.7.2/Examples/query.c000066400000000000000000000236161226454622500162550ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Priit Järv 2010,2011,2013 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file query.c * Demonstration of various queries to WhiteDB database. * This program also uses locking to show how to handle queries * in a parallel environment. */ /* ====== Includes =============== */ #include #include #include #include "../Db/dbapi.h" #include "../Db/indexapi.h" /* Extra protos for demo data (not in dbapi.h) */ int wg_genintdata_mix(void *db, int databasesize, int recordsize); /* ====== Private defs =========== */ /* ======= Private protos ================ */ void run_querydemo(void *db); void fetchall(void *db, wg_query *q); /* ====== Functions ============== */ /** Init database, run demo, drop database * Command line arguments are ignored. */ int main(int argc, char **argv) { char* shmptr; /* Create a database with custom key and size */ shmptr=wg_attach_database("9722", 2000000); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } /* Database was created, run demo and clean up. */ run_querydemo(shmptr); wg_delete_database("9722"); exit(0); } /** Run demo queries. 
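 * Creates T-tree indexes, fills the database with test data under a write
 * lock, then builds several index-backed, full-scan and match-record queries
 * under read locks and prints the rows returned by each.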
* */ void run_querydemo(void* db) { void *rec = NULL, *firstrec = NULL; wg_int lock_id; wg_query_arg arglist[5]; /* holds query parameters */ wg_query *query; /* query object */ wg_int matchrec[10]; /* match record query parameter */ int i; printf("********* Starting query demo ************\n"); /* Create indexes on the database */ if(wg_create_index(db, 0, WG_INDEX_TYPE_TTREE, NULL, 0)) { fprintf(stderr, "index creation failed, aborting.\n"); return; } if(wg_create_index(db, 2, WG_INDEX_TYPE_TTREE, NULL, 0)) { fprintf(stderr, "index creation failed, aborting.\n"); return; } if(wg_create_index(db, 3, WG_INDEX_TYPE_TTREE, NULL, 0)) { fprintf(stderr, "index creation failed, aborting.\n"); return; } /* Take a write lock until we're done writing to the database. */ lock_id = wg_start_write(db); if(!lock_id) { fprintf(stderr, "failed to get write lock, aborting.\n"); return; /* lock timed out */ } /* Generate test data */ wg_genintdata_mix(db, 20, 4); /* Add some non-unique values */ firstrec = rec = wg_get_first_record(db); i = 0; while(rec) { wg_set_field(db, rec, 0, wg_encode_int(db, i++ % 3)); if(i<=6) wg_set_field(db, rec, 3, wg_encode_int(db, 6)); rec = wg_get_next_record(db, rec); } printf("Database test data contents\n"); wg_print_db(db); /* Release the write lock. We could have released it before wg_print_db() * and acquired a separate read lock instead. That would have been correct, * but unnecessary for this demo. */ wg_end_write(db, lock_id); /* Encode query arguments. We will use the wg_encode_query_param*() * family of functions which do not write to the shared memory * area, therefore locking is not required at this point. * * Basic query 1: column 2 less than 30 */ arglist[0].column = 2; arglist[0].cond = WG_COND_LESSTHAN; arglist[0].value = wg_encode_query_param_int(db, 30); /* Take read lock. No alterations should be allowed * during the building of the query. */ lock_id = wg_start_read(db); if(!lock_id) { fprintf(stderr, "failed to get read lock, aborting.\n"); return; /* lock timed out */ } query = wg_make_query(db, NULL, 0, arglist, 1); if(!query) { fprintf(stderr, "failed to build query, aborting.\n"); return; } /* We keep the lock before using the results for best possible * isolation. In some cases this is not necessary. */ printf("Printing results for query 1: column 2 less than 30\n"); fetchall(db, query); /* Release the read lock */ wg_end_read(db, lock_id); wg_free_query(db, query); /* free the memory */ /* Basic query 2: col 2 > 21 and col 2 <= 111 */ arglist[0].column = 2; arglist[0].cond = WG_COND_GREATER; arglist[0].value = wg_encode_query_param_int(db, 21); arglist[1].column = 2; arglist[1].cond = WG_COND_LTEQUAL; arglist[1].value = wg_encode_query_param_int(db, 111); lock_id = wg_start_read(db); if(!lock_id) { fprintf(stderr, "failed to get read lock, aborting.\n"); return; /* lock timed out */ } query = wg_make_query(db, NULL, 0, arglist, 2); if(!query) { fprintf(stderr, "failed to build query, aborting.\n"); return; } printf("Printing results for query 2: col 2 > 21 and col 2 <= 111\n"); fetchall(db, query); wg_end_read(db, lock_id); wg_free_query(db, query); /* Basic query 3: match all records [ 0, ...]. Fields that * are beyond the size of matchrec implicitly become wildcards. 
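   * (With the length-1 match record built below only column 0 is constrained;
   * e.g. hypothetical rows [0, 25, 30, 6] and [0, 99, 100, 101] would both
   * match, while rows with any other value in column 0 would not.)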
*/ matchrec[0] = wg_encode_query_param_int(db, 0); lock_id = wg_start_read(db); if(!lock_id) { fprintf(stderr, "failed to get read lock, aborting.\n"); return; /* lock timed out */ } query = wg_make_query(db, matchrec, 1, NULL, 0); if(!query) { fprintf(stderr, "failed to build query, aborting.\n"); return; } printf("Printing results for query 3: all records that match [ 0, ... ]\n"); fetchall(db, query); wg_end_read(db, lock_id); wg_free_query(db, query); /* Combine the parameters of queries 2 and 3 (it is allowed to * mix both types of arguments). */ lock_id = wg_start_read(db); if(!lock_id) { fprintf(stderr, "failed to get read lock, aborting.\n"); return; /* lock timed out */ } query = wg_make_query(db, matchrec, 1, arglist, 2); if(!query) { fprintf(stderr, "failed to build query, aborting.\n"); return; } printf("Printing results for combined queries 2 and 3\n"); fetchall(db, query); wg_end_read(db, lock_id); wg_free_query(db, query); /* Add an extra condition */ arglist[2].column = 3; arglist[2].cond = WG_COND_EQUAL; arglist[2].value = wg_encode_query_param_int(db, 112); lock_id = wg_start_read(db); if(!lock_id) { fprintf(stderr, "failed to get read lock, aborting.\n"); return; /* lock timed out */ } query = wg_make_query(db, matchrec, 1, arglist, 3); if(!query) { fprintf(stderr, "failed to build query, aborting.\n"); return; } printf("Adding extra condtion to previous queries: col 3 = 112\n"); fetchall(db, query); wg_end_read(db, lock_id); wg_free_query(db, query); /* Non-indexed columns may be used too. This will produce * a "full scan" query with non-ordered results. */ arglist[0].column = 1; arglist[0].cond = WG_COND_GREATER; arglist[0].value = wg_encode_query_param_int(db, 20); arglist[1].column = 1; arglist[1].cond = WG_COND_LTEQUAL; arglist[1].value = wg_encode_query_param_int(db, 110); lock_id = wg_start_read(db); if(!lock_id) { fprintf(stderr, "failed to get read lock, aborting.\n"); return; /* lock timed out */ } query = wg_make_query(db, NULL, 0, arglist, 2); if(!query) { fprintf(stderr, "failed to build query, aborting.\n"); return; } printf("Printing results for non-indexed column: col 1 > 20 and col 1 <= 110\n"); fetchall(db, query); wg_end_read(db, lock_id); wg_free_query(db, query); /* More complete match record. Use variable field type * for wildcards. The identifier used for the variable is not * important currently. */ matchrec[0] = wg_encode_query_param_int(db, 1); matchrec[1] = wg_encode_query_param_var(db, 0); matchrec[2] = wg_encode_query_param_var(db, 0); matchrec[3] = wg_encode_query_param_int(db, 6); lock_id = wg_start_read(db); if(!lock_id) { fprintf(stderr, "failed to get read lock, aborting.\n"); return; /* lock timed out */ } query = wg_make_query(db, matchrec, 4, NULL, 0); if(!query) { fprintf(stderr, "failed to build query, aborting.\n"); return; } printf("Printing results for match query: records like [ 1, *, *, 6 ]\n"); fetchall(db, query); wg_end_read(db, lock_id); wg_free_query(db, query); lock_id = wg_start_read(db); if(!lock_id) { fprintf(stderr, "failed to get read lock, aborting.\n"); return; /* lock timed out */ } /* Arguments may be omitted. This causes the query to return * all the rows in the database. 
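   * (Effectively a full scan, comparable to walking the records directly
   * with wg_get_first_record() and wg_get_next_record().)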
*/ query = wg_make_query(db, NULL, 0, NULL, 0); if(!query) { fprintf(stderr, "failed to build query, aborting.\n"); return; } printf("Printing results for a query with no arguments\n"); fetchall(db, query); wg_end_read(db, lock_id); wg_free_query(db, query); lock_id = wg_start_read(db); if(!lock_id) { fprintf(stderr, "failed to get read lock, aborting.\n"); return; /* lock timed out */ } /* Finally, try matching to a database record. Depending on the * test data, this sould return at least one record. Note that * reclen 0 indicates that the record should be taken from database. */ query = wg_make_query(db, firstrec, 0, NULL, 0); if(!query) { fprintf(stderr, "failed to build query, aborting.\n"); return; } printf("Printing records matching the first record in database\n"); fetchall(db, query); wg_end_read(db, lock_id); wg_free_query(db, query); printf("********* Demo ended ************\n"); } /** Fetch the results of a single query * */ void fetchall(void *db, wg_query *q) { void *rec = wg_fetch(db, q); while(rec) { wg_print_record(db, rec); printf("\n"); rec = wg_fetch(db, q); } printf("---- end of data ----\n"); } whitedb-0.7.2/Examples/raptortry.c000066400000000000000000000070311226454622500171470ustar00rootroot00000000000000#include #include #include #ifdef _WIN32 #include // for _getch #endif #include #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "../Db/dbmem.h" #include "../Db/dballoc.h" #include "../Db/dbdata.h" //#include "../Db/dbapi.h" #include "../Db/dbtest.h" #include "../Db/dbdump.h" #include "../Db/dblog.h" /* rdfprint.c: print triples from parsing RDF/XML */ /* gcc -o rdfprint rdfprint.c `raptor-config --cflags` `raptor-config --libs` gcc -o raptortry raptortry.c `raptor-config --cflags` `raptor-config --libs` $ ./rdfprint raptor.rdf _:genid1 . _:genid1 "Raptor" . _:genid1 . ... */ int tcount=0; void handle_triple(void* user_data, const raptor_statement* triple) { int print=1; void* rec; void* db=user_data; gint enc; gint tmp; rec=wg_create_record(db,3); if (!rec) { printf("\n cannot create a new record, tcount %d\n",tcount); exit(0); } if (print) { raptor_print_statement_as_ntriples(triple, stdout); fputc('\n', stdout); printf("s: %s\n",(char*)(triple->subject)); printf("p: %s\n",(char*)(triple->predicate)); printf("o: %s\n",(char*)(triple->object)); } enc=wg_encode_str(db,(char*)(triple->subject),NULL); tmp=wg_set_field(db,rec,0,enc); enc=wg_encode_str(db,(char*)(triple->predicate),NULL); tmp=wg_set_field(db,rec,1,enc); if ((triple->object_type)==RAPTOR_IDENTIFIER_TYPE_RESOURCE) { if (print) printf("t: resource\n"); enc=wg_encode_str(db,(char*)(triple->subject),NULL); tmp=wg_set_field(db,rec,2,enc); } else if ((triple->object_type)==RAPTOR_IDENTIFIER_TYPE_ANONYMOUS) { if (print) printf("t: anonymous\n"); enc=wg_encode_str(db,(char*)(triple->subject),NULL); tmp=wg_set_field(db,rec,2,enc); } else if ((triple->object_type)==RAPTOR_IDENTIFIER_TYPE_LITERAL) { if (print) printf("t: literal\n"); if (print) printf("d: %s\n",(char*)(triple->object_literal_datatype)); if (print) printf("l: %s\n",(char*)(triple->object_literal_language)); if ((triple->object_literal_datatype)==NULL) { enc=wg_encode_str(db,(char*)(triple->subject),(char*)(triple->object_literal_language)); tmp=wg_set_field(db,rec,2,enc); } else { enc=wg_encode_str(db,(char*)(triple->subject),(char*)(triple->object_literal_datatype)); tmp=wg_set_field(db,rec,2,enc); } } else { printf("ERROR! 
Unknown triple object type.\n"); exit(0); } tcount++; } int main(int argc, char *argv[]) { raptor_parser* rdf_parser=NULL; unsigned char *uri_string; raptor_uri *uri, *base_uri; char* shmname=NULL; char* shmptr; if (!strcmp(argv[1],"load")) { shmptr=wg_attach_database(shmname,0); raptor_init(); //rdf_parser=raptor_new_parser("rdfxml"); rdf_parser=raptor_new_parser("turtle"); raptor_set_statement_handler(rdf_parser, shmptr, handle_triple); uri_string=raptor_uri_filename_to_uri_string(argv[2]); uri=raptor_new_uri(uri_string); base_uri=raptor_uri_copy(uri); raptor_parse_file(rdf_parser, uri, base_uri); raptor_free_parser(rdf_parser); raptor_free_uri(base_uri); raptor_free_uri(uri); raptor_free_memory(uri_string); raptor_finish(); printf("tcount %d \n",tcount); } else if (!strcmp(argv[1],"run1")) { } return 0; } whitedb-0.7.2/Examples/speed/000077500000000000000000000000001226454622500160345ustar00rootroot00000000000000whitedb-0.7.2/Examples/speed/README000066400000000000000000000001101226454622500167040ustar00rootroot00000000000000Simple single-core speed tests covered in http://whitedb.org/speed.html whitedb-0.7.2/Examples/speed/speed1.c000066400000000000000000000011201226454622500173530ustar00rootroot00000000000000/* creating and deleting a 1 GB database 1000 times: real 0m9.694s user 0m2.044s sys 0m7.642s creating and deleting a 10 MB database 100000 times: real 0m12.800s user 0m2.622s sys 0m10.137s */ #include #include #include int main(int argc, char **argv) { void *db; char *name="1"; int i; for(i=0;i<1000;i++) { // 100000 db = wg_attach_database(name,1000000000); // 10000000 if (!db) { printf("failed at try %d\n", i); exit(0); } wg_detach_database(db); wg_delete_database(name); } printf("i %d\n", i); return 0; } whitedb-0.7.2/Examples/speed/speed10.c000066400000000000000000000015371226454622500174470ustar00rootroot00000000000000/* creating and immediately filling with integer data 10 million records of 5 fields in a 1 GIG database. Field values are computed as the last five digits of the record number, thus storing each number between 0...100000 to 100 different records. Database created will be later used by speed11 for scanning. real 0m0.483s user 0m0.381s sys 0m0.101s */ #include #include #include int main(int argc, char **argv) { void *db, *rec; char *name="10"; int i,j; db = wg_attach_database(name, 1000000000); if (!db) { printf("db creation failed \n"); exit(0); } for(i=0;i<10000000;i++) { rec = wg_create_raw_record(db, 5); if (!rec) { printf("record creation failed \n"); exit(0); } wg_set_new_field(db,rec,3,wg_encode_int(db,i%100000)); } printf("i %d\n", i); return 0; } whitedb-0.7.2/Examples/speed/speed11.c000066400000000000000000000020331226454622500174400ustar00rootroot00000000000000/* Scan through 10 million records in a pre-built database, counting all these records which have integer 123 as the value of the third field. Database was created earlier by speed10. 
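   A plausible build command, by analogy with the note in speed21.c below
   (the exact flags are an assumption): gcc speed11.c -o speed11 -O2 -lwgdb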
Do not forget to delete database later, a la: wgdb 10 free real 0m0.201s user 0m0.157s sys 0m0.044s */ #include #include #include int main(int argc, char **argv) { void *db, *rec; char *name="10"; int i=0; int count=0; wg_int encval; db = wg_attach_database(name, 1000000000); if (!db) { printf("db attaching failed \n"); exit(0); } encval=wg_encode_int(db,123); // encode for faster comparison in the loop rec=wg_get_first_record(db); do { //if (wg_decode_int(db,wg_get_field(db,rec,3))==123) count++; // a bit slower alternative if (wg_get_field(db,rec,3)==encval) count++; rec=wg_get_next_record(db,rec); i++; } while(rec!=NULL); wg_free_encoded(db,encval); // have to free encval since we did not store it to db printf("i %d, count %d\n", i,count); return 0; } whitedb-0.7.2/Examples/speed/speed12.c000066400000000000000000000011611226454622500174420ustar00rootroot00000000000000/* create an index for the field 3 in the previously built database of 10 million records. real 0m6.540s user 0m6.436s sys 0m0.098s */ #include #include // must additionally include indexapi.h #include #include int main(int argc, char **argv) { void *db, *rec; char *name="10"; int tmp; db = wg_attach_database(name, 1000000000); if (!db) { printf("db creation failed \n"); exit(0); } tmp=wg_create_index(db,3,WG_INDEX_TYPE_TTREE,NULL,0); if (tmp) printf("Index creation failed\n"); else printf("Index creation succeeded\n"); return 0; } whitedb-0.7.2/Examples/speed/speed13.c000066400000000000000000000026641226454622500174540ustar00rootroot00000000000000/* Outer loop: run the inner loop million times to obtain sensible timings. Inner loop: prepare query and search from the index on 10 million records, counting all these which have integer 123 as the value of the third field, using query on the indexed field. There are 100 of such values. Database was created earlier by speed11 and indexed by speed 12. Do not forget to delete database later, a la: wgdb 10 free Outer loop time (i.e. 1 million identical query building / performing / deallocating operations) altogether: real 0m3.256s user 0m3.252s sys 0m0.001s */ #include #include #include int main(int argc, char **argv) { void *db, *rec; char *name="10"; int i; int count=0; wg_query *query; wg_query_arg arglist[5]; db = wg_attach_database(name, 1000000000); if (!db) { printf("db attaching failed \n"); exit(0); } // outer loop is just for sensible timing: do the same thing 1000 times for(i=0;i<1000000;i++) { arglist[0].column = 3; arglist[0].cond = WG_COND_EQUAL; arglist[0].value = wg_encode_query_param_int(db,123); query = wg_make_query(db, NULL, 0, arglist, 1); if(!query) { printf("query creation failed \n"); exit(0); } while((rec = wg_fetch(db, query))) { count++; //wg_print_record(db, rec); printf("\n"); } wg_free_query(db,query); } printf("count altogether for i %d runs: %d\n", i, count); return 0; } whitedb-0.7.2/Examples/speed/speed15.c000066400000000000000000000025421226454622500174510ustar00rootroot00000000000000/* creating a flat pointer list of 10 million records of 5 fields in a 1 GB database: store a pointer to the previous record to field 3 of each record, store a pointer to the last record to field 2 of first record. Observe that we use standard C int 0 for the NULL pointer, this is also what wg_encode_null(db,0) always gives. 
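   Resulting layout, as built by the code below: field 2 of the first record
   points to the last record created, field 3 of every record points to the
   record created just before it, and field 3 of the first record holds the
   encoded NULL that terminates the chain.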
real 0m0.666s user 0m0.516s sys 0m0.150s */ #include #include #include int main(int argc, char **argv) { void *db, *rec, *firstrec, *lastrec; char *name="15"; int i; db = wg_attach_database(name, 1000000000); if (!db) { printf("db creation failed \n"); exit(0); } rec = wg_create_raw_record(db, 5); firstrec=rec; // store for use in the end lastrec=rec; for(i=1;i<10000000;i++) { rec = wg_create_raw_record(db, 5); if (!rec) { printf("record creation failed \n"); exit(0); } // store a pointer to the previously built record wg_set_new_field(db,rec,3,wg_encode_record(db,lastrec)); lastrec=rec; } // field 3 of the first record will be an encoded NULL pointer // which is always just (wg_int)0 wg_set_new_field(db,firstrec,3,wg_encode_null(db,0)); // field 2 of the first record will be a pointer to the last record wg_set_new_field(db,firstrec,2,wg_encode_record(db,lastrec)); printf("i %d\n", i); wg_detach_database(db); return 0; } whitedb-0.7.2/Examples/speed/speed16.c000066400000000000000000000022741226454622500174540ustar00rootroot00000000000000/* traversing a flat pre-built (in speed15.c) pointer list of 10 million records of 5 fields in a 1 GB database: a pointer to the previous record is stored in field 3 of each record, a pointer to the last record is stored in field 2 of first record. Observe that we use standard C int 0 for the NULL pointer, this is also what wg_encode_null(db,0) always gives. Database was created earlier by speed15. Do not forget to delete database later, a la: wgdb 15 free real 0m0.153s user 0m0.110s sys 0m0.043s */ #include #include #include int main(int argc, char **argv) { void *db, *rec; char *name="15"; int i; wg_int encptr; db = wg_attach_database(name, 1000000000); if (!db) { printf("db creation failed \n"); exit(0); } rec=wg_get_first_record(db); // get a pointer to the last record rec=wg_decode_record(db,wg_get_field(db,rec,2)); i=1; while(1) { encptr=wg_get_field(db,rec,3); // encptr is not yet decoded if (encptr==(wg_int)0) break; // encoded null is always standard 0 rec=wg_decode_record(db,encptr); // get a pointer to the previous record i++; } printf("i %d\n", i); wg_detach_database(db); return 0; }whitedb-0.7.2/Examples/speed/speed2.c000066400000000000000000000012621226454622500173630ustar00rootroot00000000000000/* creating 10 million records of 5 fields in a 1 GB database real 0m0.586s user 0m0.473s sys 0m0.113s creating 10 million records of 9 fields in a 1 GB database real 0m0.812s user 0m0.645s sys 0m0.166s */ #include #include #include int main(int argc, char **argv) { void *db, *rec; char *name="2"; int i; db = wg_attach_database(name, 1000000000); if (!db) { printf("db creation failed \n"); exit(0); } for(i=0;i<10000000;i++) { rec = wg_create_record(db, 9); if (!rec) { printf("record creation failed \n"); exit(0); } } printf("i %d\n", i); wg_detach_database(db); wg_delete_database(name); return 0; } whitedb-0.7.2/Examples/speed/speed20.c000066400000000000000000000071101226454622500174410ustar00rootroot00000000000000/* call with speed20 N where N is the number of segments (N >= 1), default 1 compile with gcc speed21.c -o speed21 -O2 -lwgdb -lpthread creating a flat pointer list of 10 million records of 5 fields in a 1 GB database: store a pointer to the previous record to field 3 of each record, thus first list elems are actually created last. Store lstlen-i to field 1. Additionally create a ctrl record with pointers to the middle of the list, to be used for parallel multicore scanning of the list later. 
Ctrl record fields: 0: type (unused) 1: ptr field (3 here) 2: start pointer 3: midpointer0 4: midpointer1 ... N: ptr to last record N+1: NULL Observe that we use standard C int 0 for the NULL pointer, this is also what wg_encode_null(db,0) always gives. */ #include #include #include #define DB_NAME "20" int main(int argc, char **argv) { void *db, *rec, *ctrlrec, *firstrec, *lastrec; char *name=DB_NAME; int i; int lstlen=10000000; // total nr of elems in list int ptrfld=3; // field where a pointer is stored int segnr=1; // total number of segments int midcount=0; // middle ptr count int midlasti=0; // last i where midpoint stored int midseglen; // mid segment length wg_int tmp; if (argc>=2) { segnr=atoi(argv[1]); } printf("creating a list with %d segments \n",segnr); midseglen=lstlen/segnr; // mid segment length db = wg_attach_database(name, 1000000000); if (!db) { printf("db creation failed \n"); exit(0); } ctrlrec = wg_create_record(db, 1000); // this will contain info about the list // build the list firstrec=wg_create_raw_record(db, 5); lastrec=firstrec; // next ptr from the last record is an encoded NULL pointer wg_set_new_field(db,firstrec,ptrfld,wg_encode_null(db,0)); wg_set_new_field(db,firstrec,1,wg_encode_int(db,lstlen)); for(i=1;i=midseglen) { // this lst is built from end to beginning wg_set_field(db,ctrlrec,2+(segnr-1)-midcount,wg_encode_record(db,rec)); printf("\nmidpoint %d at i %d to field %d val %d",midcount,i,2+(segnr-1)-midcount, (int)(db,wg_get_field(db,ctrlrec,2+(segnr-1)-midcount))); midlasti=i; midcount++; } } // set ctrlrec fields: type,ptr field,first pointer,midpointer1,midpointer2,... wg_set_field(db,ctrlrec,0,wg_encode_int(db,1)); // type not used wg_set_field(db,ctrlrec,1,wg_encode_int(db,ptrfld)); // ptrs at field 3 wg_set_field(db,ctrlrec,2,wg_encode_record(db,lastrec)); // lst starts here wg_set_field(db,ctrlrec,2+segnr,wg_encode_record(db,firstrec)); // last record in a list printf("\nfinal i %d\n", i); printf("\nctrl rec ptr fld val %d \n",wg_decode_int(db,wg_get_field(db,ctrlrec,1))); for(i=0;i<1000;i++) { tmp=wg_get_field(db,ctrlrec,2+i); if (!(int)tmp) break; printf("ptr %d value %d content %d\n",i,(int)tmp, wg_decode_int(db,wg_get_field(db,wg_decode_record(db,tmp),1)) ); } wg_detach_database(db); return 0; } whitedb-0.7.2/Examples/speed/speed21.c000066400000000000000000000073251226454622500174520ustar00rootroot00000000000000/* traversing a flat pre-built (in speed20.c) pointer list of 10 million records of 5 fields in a 1 GB database using a ctrl record to scan different segments of the list in parallel. Nr of segments is built by speed20: control the number there. Compile with gcc speed21.c -o speed21 -O2 -lwgdb -lpthread Ctrl record (see speed20.c) fields: 0: type (unused) 1: ptr field (3 here) 2: start pointer 3: midpointer0 4: midpointer1 ... N: ptr to last record N+1: NULL Database was created earlier by speed20 and is not freed here. one thread: real 0m0.110s user 0m0.064s sys 0m0.045s two threads: real 0m0.064s user 0m0.071s sys 0m0.048s three threads: real 0m0.047s user 0m0.073s sys 0m0.052s four threads: real 0m0.041s user 0m0.072s sys 0m0.057s eight threads: real 0m0.032s user 0m0.097s sys 0m0.062s sixteen threads real 0m0.033s user 0m0.088s sys 0m0.076s */ #include #include #include #include #define RECORD_HEADER_GINTS 3 #define wg_field_addr(db,record,fieldnr) (((wg_int*)record)+RECORD_HEADER_GINTS+fieldnr) #define MAX_THREADS 100 #define DB_NAME "20" void *process(void *targ); struct thread_data{ int thread_id; // 0,1,.. 
void* db; // db handler wg_int fptr; // first pointer wg_int lptr; // last pointer int ptrfld; // pointer field in rec int res; // stored by thread }; int main(int argc, char **argv) { void *db, *ctrl, *rec; char *name=DB_NAME; int i,ptrfld,ptrs,rc; wg_int encptr,tmp,first,next,last; pthread_t threads[MAX_THREADS]; struct thread_data tdata[MAX_THREADS]; pthread_attr_t attr; long tid; db = wg_attach_database(name, 1000000000); if (!db) { printf("db creation failed \n"); exit(0); } // get values from cntrl record ctrl=wg_get_first_record(db); ptrfld=wg_decode_int(db,wg_get_field(db,ctrl,1)); first=wg_get_field(db,ctrl,2); for(ptrs=0;ptrs<10000;ptrs++) { tmp=wg_get_field(db,ctrl,3+ptrs); if ((int)tmp==0) break; last=tmp; } printf("\nsegments found: %d \n",ptrs); if (ptrs>=MAX_THREADS) { printf("List segment nr larger than MAX_THREADS, exiting\n"); wg_detach_database(db); pthread_exit(NULL); return 0; } // prepare and create threads pthread_attr_init(&attr); pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE); for(tid=0;tidthread_id; db=tdata->db; ptrfld=tdata->ptrfld; rec=wg_decode_record(db,tdata->fptr); lptr=wg_decode_record(db,tdata->lptr); printf("thread %d starts, first el value %d \n",tid, wg_decode_int(db,wg_get_field(db,rec,1))); i=1; while(1) { encptr=*(wg_field_addr(db,rec,ptrfld)); // encptr is not yet decoded rec=wg_decode_record(db,encptr); // get a pointer to the previous record if (rec==lptr) break; i++; } tdata->res=i; printf ("thread %d finishing with res %d \n",tid,i); pthread_exit((void*) tid); } whitedb-0.7.2/Examples/speed/speed3.c000066400000000000000000000011231226454622500173600ustar00rootroot00000000000000/* creating 10 million raw records of 5 fields in a 1 GB database real 0m0.349s user 0m0.217s sys 0m0.131s */ #include #include #include int main(int argc, char **argv) { void *db, *rec; char *name="3"; int i; db = wg_attach_database(name, 1000000000); if (!db) { printf("db creation failed \n"); exit(0); } for(i=0;i<10000000;i++) { rec = wg_create_raw_record(db, 5); if (!rec) { printf("record creation failed \n"); exit(0); } } printf("i %d\n", i); wg_detach_database(db); wg_delete_database(name); return 0; } whitedb-0.7.2/Examples/speed/speed4.c000066400000000000000000000012071226454622500173640ustar00rootroot00000000000000/* creating and immediately deleting 10 million records of 5 fields in a 1 GIG database real 0m1.160s user 0m1.149s sys 0m0.009s */ #include #include #include int main(int argc, char **argv) { void *db, *rec; char *name="4"; int i; db = wg_attach_database(name, 1000000000); if (!db) { printf("db creation failed \n"); exit(0); } for(i=0;i<10000000;i++) { rec = wg_create_record(db, 5); if (!rec) { printf("record creation failed \n"); exit(0); } wg_delete_record(db, rec); } printf("i %d\n", i); wg_detach_database(db); wg_delete_database(name); return 0; } whitedb-0.7.2/Examples/speed/speed5.c000066400000000000000000000017561226454622500173760ustar00rootroot00000000000000/* creating and immediately filling with data 10 million records of 5 fields in a 1 GIG database real 0m1.163s user 0m1.042s sys 0m0.120s adding code to read back the value and add it to a counter: real 0m1.768s user 0m1.643s sys 0m0.124s doing record filling with 1000 records of length 50 thousand: real 0m0.941s user 0m0.863s sys 0m0.077s */ #include #include #include int main(int argc, char **argv) { void *db, *rec; char *name="51"; int i,j,count=0; db = wg_attach_database(name, 1000000000); if (!db) { printf("db creation failed \n"); exit(0); } for(i=0;i<10000000;i++) { rec 
= wg_create_raw_record(db, 5); if (!rec) { printf("record creation failed \n"); exit(0); } for (j=0;j<5;j++) { wg_set_new_field(db,rec,j,wg_encode_int(db,i+j)); //count+=wg_decode_int(db,wg_get_field(db, rec, j)); } } printf("i %d count %d\n", i,count); wg_detach_database(db); wg_delete_database(name); return 0; } whitedb-0.7.2/Examples/speed/speed6.c000066400000000000000000000015131226454622500173660ustar00rootroot00000000000000/* creating and immediately filling with string data 1 million records of 5 fields in a 1 GIG database: the string is 30 bytes long and is encoded each time using up 32 bytes real 0m0.403s user 0m0.283s sys 0m0.119s */ #include #include #include int main(int argc, char **argv) { void *db, *rec; char *name="69"; int i,j; char* content="01234567890"; db = wg_attach_database(name, 1000000000); if (!db) { printf("db creation failed \n"); exit(0); } for(i=0;i<1000000;i++) { rec = wg_create_raw_record(db, 5); if (!rec) { printf("record creation failed \n"); exit(0); } for (j=0;j<5;j++) { wg_set_new_field(db,rec,j,wg_encode_str(db,content,NULL)); } } printf("i %d \n", i); wg_detach_database(db); wg_delete_database(name); return 0; } whitedb-0.7.2/Examples/speed/speed7.c000066400000000000000000000016231226454622500173710ustar00rootroot00000000000000/* creating and immediately filling with string data 1 million records of 5 fields in a 1 GIG database: the string is 100 bytes long and is encoded each time. real 0m1.464s user 0m1.446s sys 0m0.017s */ #include #include #include int main(int argc, char **argv) { void *db, *rec; char *name="7"; int i,j; char* content="0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"; db = wg_attach_database(name, 1000000000); if (!db) { printf("db creation failed \n"); exit(0); } for(i=0;i<1000000;i++) { rec = wg_create_raw_record(db, 5); if (!rec) { printf("record creation failed \n"); exit(0); } for (j=0;j<5;j++) { wg_set_new_field(db,rec,j,wg_encode_str(db,content,NULL)); } } printf("i %d \n", i); wg_detach_database(db); wg_delete_database(name); return 0; } whitedb-0.7.2/Examples/speed/speed8.c000066400000000000000000000014211226454622500173660ustar00rootroot00000000000000/* creating and immediately filling with double data 1 million records of 5 fields in a 1 GIG database. The double value is encoded each time. 
   real 0m0.190s
   user 0m0.144s
   sys  0m0.045s
*/

#include <whitedb/dbapi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
  void *db, *rec;
  char *name="9";
  int i,j;

  db = wg_attach_database(name, 1000000000);
  if (!db) { printf("db creation failed \n"); exit(0); }
  for(i=0;i<1000000;i++) {
    rec = wg_create_raw_record(db, 5);
    if (!rec) { printf("record creation failed \n"); exit(0); }
    for (j=0;j<5;j++) {
      wg_set_new_field(db,rec,j,wg_encode_double(db,(double)(i+j)));
    }
  }
  printf("i %d \n", i);
  wg_detach_database(db);
  wg_delete_database(name);
  return 0;
}
whitedb-0.7.2/Examples/tut1.c000066400000000000000000000002541226454622500157760ustar00rootroot00000000000000
#include <whitedb/dbapi.h> /* or #include <dbapi.h> on Windows */

int main(int argc, char **argv) {
  void *db;
  db = wg_attach_database("1000", 2000000);
  return 0;
}
whitedb-0.7.2/Examples/tut2.c000066400000000000000000000006111226454622500157740ustar00rootroot00000000000000
#include <whitedb/dbapi.h>

int main(int argc, char **argv) {
  void *db, *rec, *rec2;
  wg_int enc, enc2;

  db = wg_attach_database("1000", 2000000);
  rec = wg_create_record(db, 10);
  rec2 = wg_create_record(db, 2);
  enc = wg_encode_int(db, 443);
  enc2 = wg_encode_str(db, "this is my string", NULL);
  wg_set_field(db, rec, 7, enc);
  wg_set_field(db, rec2, 0, enc2);
  return 0;
}
whitedb-0.7.2/Examples/tut3.c000066400000000000000000000013331226454622500157770ustar00rootroot00000000000000
#include <whitedb/dbapi.h>
#include <stdio.h>

int main(int argc, char **argv) {
  void *db, *rec;
  wg_int enc;

  db = wg_attach_database("1000", 2000000);

  /* create some records for testing */
  rec = wg_create_record(db, 10);
  enc = wg_encode_int(db, 443); /* will match */
  wg_set_field(db, rec, 7, enc);
  rec = wg_create_record(db, 10);
  enc = wg_encode_int(db, 442);
  wg_set_field(db, rec, 7, enc); /* will not match */

  /* now find the records that match our condition
   * "field 7 equals 443" */
  rec = wg_find_record_int(db, 7, WG_COND_EQUAL, 443, NULL);
  while(rec) {
    printf("Found a record where field 7 is 443\n");
    rec = wg_find_record_int(db, 7, WG_COND_EQUAL, 443, rec);
  }
  return 0;
}
whitedb-0.7.2/Examples/tut4.c000066400000000000000000000025441226454622500160050ustar00rootroot00000000000000
#include <whitedb/dbapi.h>
#include <stdio.h>

int main(int argc, char **argv) {
  void *db, *rec;
  wg_int enc;
  wg_query_arg arglist[2]; /* holds the arguments to the query */
  wg_query *query;         /* used to fetch the query results */

  db = wg_attach_database("1000", 2000000);

  /* just in case, create some records for testing */
  rec = wg_create_record(db, 10);
  enc = wg_encode_int(db, 443); /* will match */
  wg_set_field(db, rec, 7, enc);
  rec = wg_create_record(db, 10);
  enc = wg_encode_int(db, 442);
  wg_set_field(db, rec, 7, enc); /* will not match */

  /* now find the records that match the condition
   * "field 7 equals 443 and field 6 equals NULL". The
   * second part is a bit redundant but we're adding it
   * to show the use of the argument list.
*/ arglist[0].column = 7; arglist[0].cond = WG_COND_EQUAL; arglist[0].value = wg_encode_query_param_int(db, 443); arglist[1].column = 6; arglist[1].cond = WG_COND_EQUAL; arglist[1].value = wg_encode_query_param_null(db, NULL); query = wg_make_query(db, NULL, 0, arglist, 2); while((rec = wg_fetch(db, query))) { printf("Found a record where field 7 is 443 and field 6 is NULL\n"); } /* Free the memory allocated for the query */ wg_free_query(db, query); wg_free_query_param(db, arglist[0].value); wg_free_query_param(db, arglist[1].value); return 0; } whitedb-0.7.2/Examples/tut5.c000066400000000000000000000025601226454622500160040ustar00rootroot00000000000000#include #include #include int main(int argc, char **argv) { void *db, *rec, *lastrec; wg_int enc; int i; db = wg_attach_database("1000", 1000000); /* 1MB should fill up fast */ if(!db) { printf("ERR: Could not attach to database.\n"); exit(1); } lastrec = NULL; for(i=0;;i++) { char buf[20]; rec = wg_create_record(db, 1); if(!rec) { printf("ERR: Failed to create a record (made %d so far)\n", i); break; } lastrec = rec; sprintf(buf, "%d", i); /* better to use snprintf() in real applications */ enc = wg_encode_str(db, buf, NULL); if(enc == WG_ILLEGAL) { printf("ERR: Failed to encode a string (%d records currently)\n", i+1); break; } if(wg_set_field(db, rec, 0, enc)) { printf("ERR: This error is less likely, but wg_set_field() failed.\n"); break; } } /* For educational purposes, let's pretend we're interested in what's * stored in the last record. */ if(lastrec) { char *str = wg_decode_str(db, wg_get_field(db, lastrec, 0)); if(!str) { printf("ERR: Decoding the string field failed.\n"); if(wg_get_field_type(db, lastrec, 0) != WG_STRTYPE) { printf("ERR: The field type is not string - " "should have checked that first!\n"); } } } wg_detach_database(db); return 0; } whitedb-0.7.2/Examples/tut6.c000066400000000000000000000041421226454622500160030ustar00rootroot00000000000000#include #include #include #define NUM_INCREMENTS 100000 void die(void *db, int err) { wg_detach_database(db); exit(err); } int main(int argc, char **argv) { void *db, *rec; wg_int lock_id; int i, val; if(!(db = wg_attach_database("1000", 1000000))) { exit(1); /* failed to attach */ } /* First we need to make sure both counting programs start at the * same time (otherwise the example would be boring). */ lock_id = wg_start_read(db); rec = wg_get_first_record(db); /* our database only contains one record, * so we don't need to make a query. */ wg_end_read(db, lock_id); if(!rec) { /* There is no record yet, we're the first to run and have * to set up the counter. */ lock_id = wg_start_write(db); if(!lock_id) die(db, 2); rec = wg_create_record(db, 1); wg_end_write(db, lock_id); if(!rec) die(db, 3); printf("Press a key when all the counter programs have been started."); fgetc(stdin); /* Setting the counter to 0 lets each counting program know it can * start counting now. */ lock_id = wg_start_write(db); if(!lock_id) die(db, 2); wg_set_field(db, rec, 0, wg_encode_int(db, 0)); wg_end_write(db, lock_id); } else { /* Some other program has started first, we wait until the counter * is ready. */ int ready = 0; while(!ready) { lock_id = wg_start_read(db); if(!lock_id) die(db, 2); if(wg_get_field_type(db, rec, 0) == WG_INTTYPE) ready = 1; wg_end_read(db, lock_id); } } /* Now start the actual counting. 
*/ for(i=0; i #include int main(int argc, char **argv) { void *db, *rec, *rec2, *rec3; wg_int enc; if(!(db = wg_attach_database("1000", 2000000))) exit(1); /* failed to attach */ rec = wg_create_record(db, 2); /* this is some record */ rec2 = wg_create_record(db, 3); /* this is another record */ rec3 = wg_create_record(db, 4); /* this is a third record */ if(!rec || !rec2 || !rec3) exit(2); /* Add some content */ wg_set_field(db, rec, 1, wg_encode_str(db, "hello", NULL)); wg_set_field(db, rec2, 0, wg_encode_str(db, "I'm pointing to other records", NULL)); wg_set_field(db, rec3, 0, wg_encode_str(db, "I'm linked from two records", NULL)); /* link the records to each other */ enc = wg_encode_record(db, rec); wg_set_field(db, rec2, 2, enc); /* rec2[2] points to rec */ enc = wg_encode_record(db, rec3); wg_set_field(db, rec2, 1, enc); /* rec2[1] points to rec3 */ wg_set_field(db, rec, 0, enc); /* rec[0] points to rec3 */ wg_detach_database(db); return 0; } whitedb-0.7.2/GPLLICENCE000066400000000000000000001045131226454622500145120ustar00rootroot00000000000000 GNU GENERAL PUBLIC LICENSE Version 3, 29 June 2007 Copyright (C) 2007 Free Software Foundation, Inc. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The GNU General Public License is a free, copyleft license for software and other kinds of works. The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change all versions of a program--to make sure it remains free software for all its users. We, the Free Software Foundation, use the GNU General Public License for most of our software; it applies also to any other work released this way by its authors. You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things. To protect your rights, we need to prevent others from denying you these rights or asking you to surrender the rights. Therefore, you have certain responsibilities if you distribute copies of the software, or if you modify it: responsibilities to respect the freedom of others. For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it. For the developers' and authors' protection, the GPL clearly explains that there is no warranty for this free software. For both users' and authors' sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions. Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. 
This is fundamentally incompatible with the aim of protecting users' freedom to change the software. The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable. Therefore, we have designed this version of the GPL to prohibit the practice for those products. If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users. Finally, every program is threatened constantly by software patents. States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary. To prevent this, the GPL assures that patents cannot be used to render the program non-free. The precise terms and conditions for copying, distribution and modification follow. TERMS AND CONDITIONS 0. Definitions. "This License" refers to version 3 of the GNU General Public License. "Copyright" also means copyright-like laws that apply to other kinds of works, such as semiconductor masks. "The Program" refers to any copyrightable work licensed under this License. Each licensee is addressed as "you". "Licensees" and "recipients" may be individuals or organizations. To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work. A "covered work" means either the unmodified Program or a work based on the Program. To "propagate" a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well. To "convey" a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying. An interactive user interface displays "Appropriate Legal Notices" to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion. 1. Source Code. The "source code" for a work means the preferred form of the work for making modifications to it. "Object code" means any non-source form of a work. A "Standard Interface" means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language. 
The "System Libraries" of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A "Major Component", in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it. The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work. The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source. The Corresponding Source for a work in source code form is that same work. 2. Basic Permissions. All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law. You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you. Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary. 3. Protecting Users' Legal Rights From Anti-Circumvention Law. No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures. 
When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures. 4. Conveying Verbatim Copies. You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program. You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee. 5. Conveying Modified Source Versions. You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions: a) The work must carry prominent notices stating that you modified it, and giving a relevant date. b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to "keep intact all notices". c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it. d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so. A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate. 6. Conveying Non-Source Forms. You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways: a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange. 
b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge. c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b. d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements. e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d. A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work. A "User Product" is either (1) a "consumer product", which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, "normally used" refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product. "Installation Information" for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made. 
If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM). The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network. Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying. 7. Additional Terms. "Additional permissions" are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions. When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission. 
Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms: a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or d) Limiting the use for publicity purposes of names of licensors or authors of the material; or e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors. All other non-permissive additional terms are considered "further restrictions" within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying. If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms. Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way. 8. Termination. You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11). However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation. Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice. Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10. 9. 
Acceptance Not Required for Having Copies. You are not required to accept this License in order to receive or run a copy of the Program. Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so. 10. Automatic Licensing of Downstream Recipients. Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License. An "entity transaction" is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts. You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it. 11. Patents. A "contributor" is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. The work thus licensed is called the contributor's "contributor version". A contributor's "essential patent claims" are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, "control" includes the right to grant patent sublicenses in a manner consistent with the requirements of this License. Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version. In the following three paragraphs, a "patent license" is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To "grant" such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party. 
If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. "Knowingly relying" means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid. If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it. A patent license is "discriminatory" if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007. Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law. 12. No Surrender of Others' Freedom. If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program. 13. Use with the GNU Affero General Public License. Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. 
The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such. 14. Revised Versions of this License. The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation. If the Program specifies that a proxy can decide which future versions of the GNU General Public License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Program. Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version. 15. Disclaimer of Warranty. THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 16. Limitation of Liability. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 17. Interpretation of Sections 15 and 16. If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. 
It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see . Also add information on how to contact you by electronic and paper mail. If the program does terminal interaction, make it output a short notice like this when it starts in an interactive mode: Copyright (C) This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, your program's commands might be different; for a GUI interface, you would use an "about box". You should also get your employer (if you work as a programmer) or school, if any, to sign a "copyright disclaimer" for the program, if necessary. For more information on this, and how to apply and follow the GNU GPL, see . The GNU General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. But first, please read . whitedb-0.7.2/INSTALL000066400000000000000000000000631226454622500142060ustar00rootroot00000000000000See Doc/Install.txt for installation instructions. whitedb-0.7.2/MANIFEST000066400000000000000000000011021226454622500143010ustar00rootroot00000000000000MANIFEST ======== This is the top folder of the WhiteDB system. 
FOLDERS ------- Main components: Doc : plain text documentation Main : top level source, main compiled binaries Db : WhiteDB core source; start here json : json handling Python : Python bindings java : java bindings Under development: Reasoner : reasoner core Parser : parsers for various input languages: used by reasoner Printer : printing and other output functions: used by reasoner Rexamples: reasoner examples whitedb-0.7.2/Main/000077500000000000000000000000001226454622500140425ustar00rootroot00000000000000whitedb-0.7.2/Main/Makefile.am000066400000000000000000000021521226454622500160760ustar00rootroot00000000000000# $Id: $ # $Source: $ # # Creating the WhiteDB binaries # ---- options ---- # ---- path variables ---- dbdir=../Db printerdir=../Printer parserdir=../Parser reasonerdir=../Reasoner jsondir=../json # ---- targets ---- lib_LTLIBRARIES = libwgdb.la bin_PROGRAMS = wgdb stresstest indextool pkginclude_HEADERS = $(dbdir)/dbapi.h $(dbdir)/rdfapi.h $(dbdir)/indexapi.h # ---- extra dependencies, flags, etc ----- LIBDEPS = if RAPTOR LIBDEPS += `$(RAPTOR_CONFIG) --libs` endif AM_LDFLAGS = $(LIBDEPS) stresstest_LIBS=$(PTHREAD_LIBS) stresstest_CFLAGS=$(AM_CFLAGS) $(PTHREAD_CFLAGS) stresstest_LDFLAGS= -static $(PTHREAD_CFLAGS) $(LIBDEPS) stresstest_CC=$(PTHREAD_CC) libwgdb_la_LDFLAGS = # ----- all sources for the created programs ----- libwgdb_la_SOURCES = libwgdb_la_LIBADD = $(dbdir)/libDb.la ${jsondir}/libjson.la if REASONER libwgdb_la_LIBADD += $(parserdir)/libParser.la \ $(printerdir)/libPrinter.la $(reasonerdir)/libReasoner.la endif wgdb_SOURCES = wgdb.c wgdb_LDADD = libwgdb.la stresstest_SOURCES = stresstest.c stresstest_LDADD = libwgdb.la indextool_SOURCES = indextool.c indextool_LDADD = libwgdb.la whitedb-0.7.2/Main/indextool.c000066400000000000000000000271101226454622500162140ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Enar Reilent 2009 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . 
* */ /** @file indextool.c * Command line utility for index manipulation */ /* ====== Includes =============== */ #include #include #ifdef __cplusplus extern "C" { #endif #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "../Db/dballoc.h" #include "../Db/dbmem.h" #include "../Db/dbindex.h" #include "../Db/dbhash.h" #include "../Db/dbutil.h" /* ====== Private headers and defs ======== */ #ifdef _WIN32 #define sscanf sscanf_s /* XXX: This will break for string parameters */ #endif /* ======= Private protos ================ */ void print_tree(void *db, FILE *file, struct wg_tnode *node, int col); int log_tree(void *db, char *file, struct wg_tnode *node, int col); void dump_hash(void *db, FILE *file, db_hash_area_header *ha); wg_index_header *get_index_by_id(void *db, gint index_id); void print_indexes(void *db, FILE *f); /* ====== Functions ============== */ static int printhelp(){ printf("\nindextool user commands:\n" \ "indextool [shmname] createindex - create ttree index\n" \ "indextool [shmname] createhash - create hash index " \ "(JSON support)\n" \ "indextool [shmname] dropindex - delete an index\n" \ "indextool [shmname] list - list all indexes in database\n" \ "indextool [shmname] logtree [filename] - log tree\n" \ "indextool [shmname] dumphash - print hash table\n\n"); return 0; } int main(int argc, char **argv) { char* shmname = NULL; void *db; int i, scan_to, shmsize; if(argc < 3) scan_to = argc; else scan_to = 3; shmsize = 0; /* 0 size causes default size to be used */ /* Similar command parser as in wgdb.c */ for(i=1; i (i+2)) a = argv[i+2]; hdr = get_index_by_id(db, index_id); if(hdr) { if(hdr->type != WG_INDEX_TYPE_TTREE && \ hdr->type != WG_INDEX_TYPE_TTREE_JSON) { fprintf(stderr, "Index type not supported.\n"); return 0; } log_tree(db, a, (struct wg_tnode *) offsettoptr(db, TTREE_ROOT_NODE(hdr)), hdr->rec_field_index[0]); } else { fprintf(stderr, "Invalid index id.\n"); return 0; } return 0; } else if(!strcmp(argv[i], "dumphash")) { int index_id; wg_index_header *hdr; if(argc < (i+1)) { printhelp(); return 0; } db = (void *) wg_attach_database(shmname, shmsize); if(!db) { fprintf(stderr, "Failed to attach to database.\n"); return 0; } sscanf(argv[i+1], "%d", &index_id); hdr = get_index_by_id(db, index_id); if(hdr) { if(hdr->type != WG_INDEX_TYPE_HASH && \ hdr->type != WG_INDEX_TYPE_HASH_JSON) { fprintf(stderr, "Index type not supported.\n"); return 0; } dump_hash(db, stdout, HASHIDX_ARRAYP(hdr)); } else { fprintf(stderr, "Invalid index id.\n"); return 0; } return 0; } shmname = argv[1]; /* assuming two loops max */ i++; } printhelp(); return 0; } void print_tree(void *db, FILE *file, struct wg_tnode *node, int col){ int i; char strbuf[256]; fprintf(file,"\n", (int) ptrtooffset(db, node)); fprintf(file,"%d",node->number_of_elements); fprintf(file,"\n"); fprintf(file,"%d",node->left_subtree_height); fprintf(file,"\n"); fprintf(file,"%d",node->right_subtree_height); fprintf(file,"\n"); #ifdef TTREE_CHAINED_NODES fprintf(file,"%d\n", (int) node->succ_offset); fprintf(file,"%d\n", (int) node->pred_offset); #endif wg_snprint_value(db, node->current_min, strbuf, 255); fprintf(file,"%s ",strbuf); wg_snprint_value(db, node->current_max, strbuf, 255); fprintf(file,"%s\n",strbuf); fprintf(file,""); for(i=0;inumber_of_elements;i++){ wg_int encoded = wg_get_field(db, (struct wg_tnode *) offsettoptr(db,node->array_of_values[i]), col); wg_snprint_value(db, encoded, strbuf, 255); fprintf(file, "%s ", strbuf); } fprintf(file,"\n"); fprintf(file,"\n"); 
if(node->left_child_offset == 0)fprintf(file,"null"); else{ print_tree(db,file, (struct wg_tnode *) offsettoptr(db,node->left_child_offset),col); } fprintf(file,"\n"); fprintf(file,"\n"); if(node->right_child_offset == 0)fprintf(file,"null"); else{ print_tree(db,file, (struct wg_tnode *) offsettoptr(db,node->right_child_offset),col); } fprintf(file,"\n"); fprintf(file,"\n"); } int log_tree(void *db, char *file, struct wg_tnode *node, int col){ #ifdef _WIN32 FILE *filee; fopen_s(&filee, file, "w"); #else FILE *filee = fopen(file,"w"); #endif print_tree(db,filee,node,col); fflush(filee); fclose(filee); return 0; } void dump_hash(void *db, FILE *file, db_hash_area_header *ha) { gint i; for(i=0; iarraylength; i++) { gint bucket = dbfetch(db, (ha->arraystart)+(sizeof(gint) * i)); if(bucket) { #ifdef _WIN32 fprintf(file, "hash: %Id\n", i); #else fprintf(file, "hash: %td\n", i); #endif while(bucket) { gint j, rec_offset; gint length = dbfetch(db, bucket + HASHIDX_META_POS*sizeof(gint)); unsigned char *dptr = offsettoptr(db, bucket + \ HASHIDX_HEADER_SIZE*sizeof(gint)); /* Hash string dump */ #ifdef _WIN32 fprintf(file, " offset: %Id ", bucket); #else fprintf(file, " offset: %td ", bucket); #endif for(j=0; j 126) fputc('.', file); else fputc(dptr[j], file); } fprintf(file, ")\n"); /* Offset dump */ fprintf(file, " records:"); rec_offset = dbfetch(db, bucket + HASHIDX_RECLIST_POS*sizeof(gint)); while(rec_offset) { gcell *rec_cell = (gcell *) offsettoptr(db, rec_offset); #ifdef _WIN32 fprintf(file, " %Id", rec_cell->car); #else fprintf(file, " %td", rec_cell->car); #endif rec_offset = rec_cell->cdr; } fprintf(file, "\n"); bucket = dbfetch(db, bucket + HASHIDX_HASHCHAIN_POS*sizeof(gint)); } } } } /* Find index by id * * helper function to validate index id-s. Checks if the * index is present in master list before converting the offset * into pointer. */ wg_index_header *get_index_by_id(void *db, gint index_id) { wg_index_header *hdr = NULL; db_memsegment_header* dbh = dbmemsegh(db); gint *ilist = &dbh->index_control_area_header.index_list; /* Locate the header */ while(*ilist) { gcell *ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car == index_id) { hdr = (wg_index_header *) offsettoptr(db, index_id); break; } ilist = &ilistelem->cdr; } return hdr; } void print_indexes(void *db, FILE *f) { int column; db_memsegment_header* dbh = dbmemsegh(db); gint *ilist; if(!dbh->index_control_area_header.number_of_indexes) { fprintf(f, "No indexes in the database.\n"); return; } else { fprintf(f, "col\ttype\tmulti\tid\tmask\n"); } for(column=0; column<=MAX_INDEXED_FIELDNR; column++) { ilist = &dbh->index_control_area_header.index_table[column]; while(*ilist) { gcell *ilistelem = (gcell *) offsettoptr(db, *ilist); if(ilistelem->car) { char typestr[3]; wg_index_header *hdr = \ (wg_index_header *) offsettoptr(db, ilistelem->car); typestr[2] = '\0'; switch(hdr->type) { case WG_INDEX_TYPE_TTREE: typestr[0] = 'T'; typestr[1] = '\0'; break; case WG_INDEX_TYPE_TTREE_JSON: typestr[0] = 'T'; typestr[1] = 'J'; break; case WG_INDEX_TYPE_HASH: typestr[0] = '#'; typestr[1] = '\0'; break; case WG_INDEX_TYPE_HASH_JSON: typestr[0] = '#'; typestr[1] = 'J'; break; default: break; } fprintf(f, "%d\t%s\t%d\t%d\t%s\n", column, typestr, (int) hdr->fields, (int) ilistelem->car, #ifndef USE_INDEX_TEMPLATE "-"); #else (hdr->template_offset ? 
"Y" : "N")); #endif } ilist = &ilistelem->cdr; } } } #ifdef __cplusplus } #endif whitedb-0.7.2/Main/stresstest.c000066400000000000000000000323101226454622500164300ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Priit Järv 2009 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file stresstest.c * generate load with writer and reader threads * Currently supports two thread API-s: libpthread and Win32 */ /* ====== Includes =============== */ #include #include #ifdef _WIN32 #define WIN32_LEAN_AND_MEAN #include #include #else #include #endif #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #ifdef HAVE_PTHREAD #include #endif #ifdef __cplusplus extern "C" { #endif #include "../Db/dballoc.h" #include "../Db/dbmem.h" #include "../Db/dbdata.h" #include "../Db/dbtest.h" #include "../Db/dblock.h" /* ====== Private defs =========== */ #define DBSIZE 10000000 #define WORKLOAD 100000 #define REC_SIZE 5 #define CHATTY_THREADS 1 #define SYNC_THREADS 1 /* Use libpthread rwlock to create a reference * benchmark for measuring the performance of * dblock.c spinlocks against. */ /* #define BENCHMARK 1 */ typedef struct { int threadid; void *db; #ifdef HAVE_PTHREAD pthread_t pth; #elif defined(_WIN32) HANDLE hThread; #endif } pt_data; #if defined(_WIN32) typedef DWORD worker_t; #else /* compatible with libpthread */ typedef void * worker_t; #endif /* ======= Private protos ================ */ int prepare_data(void *db); void check_data(void *db, int wcnt); void run_workers(void *db, int rcnt, int wcnt); worker_t writer_thread(void * threadarg); worker_t reader_thread(void * threadarg); /* ====== Global vars ======== */ #ifdef SYNC_THREADS #if defined(HAVE_PTHREAD) pthread_mutex_t twait_mutex; pthread_cond_t twait_cv; #elif defined(_WIN32) HANDLE twait_ev; #endif volatile int twait_cnt; /* count of workers in wait state */ #endif #if defined(BENCHMARK) && defined(HAVE_PTHREAD) pthread_rwlock_t rwlock; #endif /* ====== Functions ============== */ int main(int argc, char **argv) { char* shmname = NULL; void* shmptr; int rcnt = -1, wcnt = -1; #ifndef _WIN32 struct timeval tv; #endif unsigned long long start_ms, end_ms; if(argc==4) { shmname = argv[1]; rcnt = atol(argv[2]); wcnt = atol(argv[3]); } if(rcnt<0 || wcnt<0) { fprintf(stderr, "usage: %s \n", argv[0]); exit(1); } shmptr=wg_attach_database(shmname,DBSIZE); if (shmptr==NULL) exit(2); if(prepare_data(shmptr)) { wg_delete_database(shmname); exit(3); } #ifdef _WIN32 start_ms = (unsigned long long) GetTickCount(); #else gettimeofday(&tv, NULL); start_ms = tv.tv_sec * 1000 + tv.tv_usec / 1000; #endif run_workers(shmptr, rcnt, wcnt); #ifdef _WIN32 end_ms = (unsigned long long) GetTickCount(); #else gettimeofday(&tv, NULL); end_ms = tv.tv_sec * 1000 + tv.tv_usec / 1000; #endif check_data(shmptr, wcnt); fprintf(stdout, "elapsed: %d ms\n", (int) (end_ms - start_ms)); wg_delete_database(shmname); exit(0); } /** * Precreate data for workers */ int 
prepare_data(void *db) { int i; for (i=0; i= tcnt) break; #ifdef HAVE_PTHREAD pthread_mutex_unlock(&twait_mutex); #endif } /* Now wake up all threads */ #if defined(HAVE_PTHREAD) pthread_cond_broadcast(&twait_cv); pthread_mutex_unlock(&twait_mutex); #elif defined(_WIN32) SetEvent(twait_ev); #endif #endif /* SYNC_THREADS */ /* Join the workers (wait for them to complete) */ for(i=0; idb; threadid = ((pt_data *) threadarg)->threadid; #ifdef CHATTY_THREADS fprintf(stdout, "Writer thread %d started.\n", threadid); #endif #ifdef SYNC_THREADS /* Increment the thread counter to inform the caller * that we are entering wait state. */ #ifdef HAVE_PTHREAD pthread_mutex_lock(&twait_mutex); #endif twait_cnt++; #if defined(HAVE_PTHREAD) pthread_cond_wait(&twait_cv, &twait_mutex); pthread_mutex_unlock(&twait_mutex); #elif defined(_WIN32) WaitForSingleObject(twait_ev, INFINITE); #endif #endif /* SYNC_THREADS */ frec = wg_get_first_record(db); for(i=0; idb; threadid = ((pt_data *) threadarg)->threadid; #ifdef CHATTY_THREADS fprintf(stdout, "Reader thread %d started.\n", threadid); #endif #ifdef SYNC_THREADS /* Enter wait state */ #ifdef HAVE_PTHREAD pthread_mutex_lock(&twait_mutex); #endif twait_cnt++; #if defined(HAVE_PTHREAD) pthread_cond_wait(&twait_cv, &twait_mutex); pthread_mutex_unlock(&twait_mutex); #elif defined(_WIN32) WaitForSingleObject(twait_ev, INFINITE); #endif #endif /* SYNC_THREADS */ for(i=0; i. * */ /** @file wgdb.c * WhiteDB database tool: command line utility */ /* ====== Includes =============== */ #include #include #include #include #include #ifdef _WIN32 #include // for _getch #endif #ifdef __cplusplus extern "C" { #endif #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "../Db/dballoc.h" #include "../Db/dbmem.h" #include "../Db/dbdata.h" #include "../Db/dbtest.h" #include "../Db/dbdump.h" #include "../Db/dblog.h" #include "../Db/dbquery.h" #include "../Db/dbutil.h" #include "../Db/dblock.h" #include "../Db/dbjson.h" #include "../Db/dbschema.h" #ifdef USE_REASONER #include "../Parser/dbparse.h" #endif /* ====== Private defs =========== */ #ifdef _WIN32 #define sscanf sscanf_s /* XXX: This will break for string parameters */ #endif #define TESTREC_SIZE 3 #define FLAGS_FORCE 0x1 #define FLAGS_LOGGING 0x2 /* Helper macros for database lock management */ #define RLOCK(d,i) i = wg_start_read(d); \ if(!i) { \ fprintf(stderr, "Failed to get database lock\n"); \ break; \ } #define WLOCK(d,i) i = wg_start_write(d); \ if(!i) { \ fprintf(stderr, "Failed to get database lock\n"); \ break; \ } #define RULOCK(d,i) if(i) { \ wg_end_read(d,i); \ i = 0; \ } #define WULOCK(d,i) if(i) { \ wg_end_write(d,i); \ i = 0; \ } /* ======= Private protos ================ */ gint parse_shmsize(char *arg); gint parse_flag(char *arg); wg_query_arg *make_arglist(void *db, char **argv, int argc, int *sz); void free_arglist(void *db, wg_query_arg *arglist, int sz); void query(void *db, char **argv, int argc); void del(void *db, char **argv, int argc); void selectdata(void *db, int howmany, int startingat); int add_row(void *db, char **argv, int argc); wg_json_query_arg *make_json_arglist(void *db, char *json, int *sz, void **doc); void findjson(void *db, char *json); /* ====== Functions ============== */ /* how to set 500 meg of shared memory: su echo 500000000 > /proc/sys/kernel/shmmax */ /** usage: display command line help. * */ void usage(char *prog) { printf("usage: %s [shmname] [command arguments]\n"\ "Where:\n"\ " shmname - (numeric) shared memory name for database. 
May be omitted.\n"\ " command - required, one of:\n\n"\ " help (or \"-h\") - display this text.\n"\ " version (or \"-v\") - display libwgdb version.\n"\ " free - free shared memory.\n"\ " export [-f] - write memory dump to disk (-f: force dump "\ "even if unable to get lock)\n"\ " import [-l] - read memory dump from disk. Overwrites "\ " existing memory contents (-l: enable logging after import).\n"\ " exportcsv - export data to a CSV file.\n"\ " importcsv - import data from a CSV file.\n", prog); #ifdef USE_REASONER printf(" importotter - import facts/rules from "\ "otter syntax file.\n"\ " importprolog - import facts/rules from "\ "prolog syntax file.\n"\ " runreasoner - run the reasoner on facts/rules in the database.\n"); #endif #ifdef HAVE_RAPTOR printf(" exportrdf - export data to a RDF/XML file.\n"\ " importrdf - import data from a RDF file.\n"); #endif #ifdef USE_DBLOG printf(" replay - replay a journal file.\n"); #endif printf(" test - run quick database tests.\n"\ " fulltest - run in-depth database tests.\n"\ " header - print header data.\n"\ " fill [asc | desc | mix] - fill db with integer data.\n"\ " add .. - store data row (only int or str recognized)\n"\ " select [start from] - print db contents.\n"\ " query \"\" .. - basic query.\n"\ " del \"\" .. - like query. Matching rows "\ "are deleted from database.\n"\ " addjson [] - store a json document.\n"\ " findjson - find documents with matching keys/values.\n"); #ifdef _WIN32 printf(" server [-l] [size] - provide persistent shared memory for "\ "other processes (-l: enable logging in the database). Will allocate "\ "requested amount of memory and sleep; "\ "Ctrl+C aborts and releases the memory.\n"); #else printf(" create [-l] [size] - create empty db of given size "\ "(-l: enable logging in the database).\n"); #endif printf("\nCommands may have variable number of arguments. "\ "Commands that take values as arguments have limited support "\ "for parsing various data types (see manual for details). Size "\ "may be given as bytes or in larger units by appending k, M or G "\ "to the size argument.\n"); } /** Handle the user-supplied database size (or pick a reasonable * substitute). Parses up to 32-bit values, but the user may * append up to 'G' for larger bases on 64-bit systems. 
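 * Worked examples (illustrative, added by the editor; the multipliers
 * correspond to the decimal units used in the switch statement below):
 *   "512k" -> 512 * 1000       = 512000 bytes
 *   "100M" -> 100 * 1000000    = 100000000 bytes
 *   "2G"   -> 2   * 1000000000 = 2000000000 bytes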
*/ gint parse_shmsize(char *arg) { char *trailing = NULL; long maxv = LONG_MAX, mult = 1, val = strtol(arg, &trailing, 10); if((val == LONG_MAX || val == LONG_MIN) && errno==ERANGE) { fprintf(stderr, "Numeric value out of range (try k, M, G?)\n"); } else if(trailing) { switch(trailing[0]) { case 'k': case 'K': mult = 1000; break; case 'm': case 'M': mult = 1000000; break; case 'g': case 'G': mult = 1000000000; break; default: break; } } #ifndef HAVE_64BIT_GINT maxv /= mult; #endif if(val > maxv) { fprintf(stderr, "Requested segment size not supported (using %ld)\n", mult * maxv); val = maxv; } return (gint) val * (gint) mult; } /** Handle a command-line flag * */ gint parse_flag(char *arg) { while(arg[0] == '-') arg++; switch(arg[0]) { case 'f': return FLAGS_FORCE; case 'l': return FLAGS_LOGGING; default: fprintf(stderr, "Unrecognized option: `%c'\n", arg[0]); break; } return 0; } /** top level for the database command line tool * * */ int main(int argc, char **argv) { char *shmname = NULL; void *shmptr = NULL; int i, scan_to; gint shmsize; wg_int rlock = 0; wg_int wlock = 0; /* look for commands in argv[1] or argv[2] */ if(argc < 3) scan_to = argc; else scan_to = 3; shmsize = 0; /* 0 size causes default size to be used */ /* 1st loop through, shmname is NULL for default. If * the first argument is not a recognizable command, it * is assumed to be the shmname and the next argument * is checked against known commands. */ for(i=1; i(i+1) && !strcmp(argv[i],"import")){ wg_int err, minsize, maxsize; int flags = 0; if(argv[i+1][0] == '-') { flags = parse_flag(argv[++i]); if(argc<=(i+1)) { /* Filename argument missing */ usage(argv[0]); exit(1); } } err = wg_check_dump(NULL, argv[i+1], &minsize, &maxsize); if(err) { fprintf(stderr, "Import failed.\n"); break; } shmptr=wg_attach_memsegment(shmname, minsize, maxsize, 1, (flags & FLAGS_LOGGING)); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } /* Locking is handled internally by the dbdump.c functions */ err = wg_import_dump(shmptr,argv[i+1]); if(!err) printf("Database imported.\n"); else if(err<-1) fprintf(stderr, "Fatal error in wg_import_dump, db may have"\ " become corrupt\n"); else fprintf(stderr, "Import failed.\n"); break; } else if(argc>(i+1) && !strcmp(argv[i],"export")){ wg_int err; int flags = 0; if(argv[i+1][0] == '-') { flags = parse_flag(argv[++i]); if(argc<=(i+1)) { /* Filename argument missing */ usage(argv[0]); exit(1); } } shmptr=wg_attach_database(shmname, shmsize); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } /* Locking is handled internally by the dbdump.c functions */ if(flags & FLAGS_FORCE) err = wg_dump_internal(shmptr,argv[i+1], 0); else err = wg_dump(shmptr,argv[i+1]); if(err<-1) fprintf(stderr, "Fatal error in wg_dump, db may have"\ " become corrupt\n"); else if(err) fprintf(stderr, "Export failed.\n"); break; } #ifdef USE_DBLOG else if(argc>(i+1) && !strcmp(argv[i],"replay")){ wg_int err; shmptr=wg_attach_database(shmname, shmsize); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } WLOCK(shmptr, wlock); err = wg_replay_log(shmptr,argv[i+1]); WULOCK(shmptr, wlock); if(!err) printf("Log suggessfully imported from file.\n"); else if(err<-1) fprintf(stderr, "Fatal error when importing, database may have "\ "become corrupt\n"); else fprintf(stderr, "Failed to import log (database unmodified).\n"); break; } #endif else if(argc>(i+1) && !strcmp(argv[i],"exportcsv")){ shmptr=wg_attach_database(shmname, shmsize); if(!shmptr) { fprintf(stderr, 
"Failed to attach to database.\n"); exit(1); } RLOCK(shmptr, wlock); wg_export_db_csv(shmptr,argv[i+1]); RULOCK(shmptr, wlock); break; } else if(argc>(i+1) && !strcmp(argv[i],"importcsv")){ wg_int err; shmptr=wg_attach_database(shmname, shmsize); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } WLOCK(shmptr, wlock); err = wg_import_db_csv(shmptr,argv[i+1]); WULOCK(shmptr, wlock); if(!err) printf("Data imported from file.\n"); else if(err<-1) fprintf(stderr, "Fatal error when importing, data may be partially"\ " imported\n"); else fprintf(stderr, "Import failed.\n"); break; } #ifdef USE_REASONER else if(argc>(i+1) && !strcmp(argv[i],"importprolog")){ wg_int err; shmptr=wg_attach_database(shmname, shmsize); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } err = wg_import_prolog_file(shmptr,argv[i+1]); if(!err) printf("Data imported from prolog file.\n"); else if(err<-1) fprintf(stderr, "Fatal error when importing, data may be partially"\ " imported\n"); else fprintf(stderr, "Import failed.\n"); break; } else if(argc>(i+1) && !strcmp(argv[i],"importotter")){ wg_int err; shmptr=wg_attach_database(shmname, shmsize); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } err = wg_import_otter_file(shmptr,argv[i+1]); if(!err) printf("Data imported from otter file.\n"); else if(err<-1) fprintf(stderr, "Fatal error when importing otter file, data may be partially"\ " imported\n"); else fprintf(stderr, "Import failed.\n"); break; } else if(argc>i && !strcmp(argv[i],"runreasoner")){ wg_int err; shmptr=wg_attach_database(shmname, shmsize); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } //printf("about to call wg_run_reasoner\n"); err = wg_run_reasoner(shmptr,argc,argv); //if(!err); //printf("wg_run_reasoner finished ok.\n"); //else //fprintf(stderr, "wg_run_reasoner finished with an error %d.\n",err); //break; break; } else if(argc>i && !strcmp(argv[i],"testreasoner")){ wg_int err; //printf("about to call wg_test_reasoner\n"); err = wg_test_reasoner(argc,argv); //if(!err); //printf("wg_test_reasoner finished ok.\n"); //else //fprintf(stderr, "wg_test_reasoner finished with an error %d.\n",err); //break; break; } #endif #ifdef HAVE_RAPTOR else if(argc>(i+2) && !strcmp(argv[i],"exportrdf")){ wg_int err; int pref_fields = atol(argv[i+1]); shmptr=wg_attach_database(shmname, shmsize); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } printf("Exporting with %d prefix fields.\n", pref_fields); RLOCK(shmptr, wlock); err = wg_export_raptor_rdfxml_file(shmptr, pref_fields, argv[i+2]); RULOCK(shmptr, wlock); if(err) fprintf(stderr, "Export failed.\n"); break; } else if(argc>(i+3) && !strcmp(argv[i],"importrdf")){ wg_int err; int pref_fields = atol(argv[i+1]); int suff_fields = atol(argv[i+2]); shmptr=wg_attach_database(shmname, shmsize); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } printf("Importing with %d prefix fields, %d suffix fields.\n,", pref_fields, suff_fields); WLOCK(shmptr, wlock); err = wg_import_raptor_file(shmptr, pref_fields, suff_fields, wg_rdfparse_default_callback, argv[i+3]); WULOCK(shmptr, wlock); if(!err) printf("Data imported from file.\n"); else if(err<-1) fprintf(stderr, "Fatal error when importing, data may be partially"\ " imported\n"); else fprintf(stderr, "Import failed.\n"); break; } #endif else if(!strcmp(argv[i],"test")) { /* This test function does it's own memory allocation. 
*/ wg_run_tests(WG_TEST_QUICK, 2); break; } else if(!strcmp(argv[i],"fulltest")) { wg_run_tests(WG_TEST_FULL, 2); break; } else if(!strcmp(argv[i], "header")) { shmptr=wg_attach_database(shmname, shmsize); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } RLOCK(shmptr, wlock); wg_show_db_memsegment_header(shmptr); RULOCK(shmptr, wlock); break; } #ifdef _WIN32 else if(!strcmp(argv[i],"server")) { int flags = 0; if(argc>(i+1) && argv[i+1][0] == '-') { flags = parse_flag(argv[++i]); } if(argc>(i+1)) { shmsize = parse_shmsize(argv[i+1]); if(!shmsize) fprintf(stderr, "Failed to parse memory size, using default.\n"); } shmptr=wg_attach_memsegment(shmname, shmsize, shmsize, 1, (flags & FLAGS_LOGGING)); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } printf("Press Ctrl-C to end and release the memory.\n"); while(_getch() != 3); break; } #else else if(!strcmp(argv[i],"create")) { int flags = 0; if(argc>(i+1) && argv[i+1][0] == '-') { flags = parse_flag(argv[++i]); } if(argc>(i+1)) { shmsize = parse_shmsize(argv[i+1]); if(!shmsize) fprintf(stderr, "Failed to parse memory size, using default.\n"); } shmptr=wg_attach_memsegment(shmname, shmsize, shmsize, 1, (flags & FLAGS_LOGGING)); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } break; } #endif else if(argc>(i+1) && !strcmp(argv[i], "fill")) { int rows = atol(argv[i+1]); if(!rows) { fprintf(stderr, "Invalid number of rows.\n"); exit(1); } shmptr=wg_attach_database(shmname, shmsize); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } WLOCK(shmptr, wlock); if(argc > (i+2) && !strcmp(argv[i+2], "mix")) wg_genintdata_mix(shmptr, rows, TESTREC_SIZE); else if(argc > (i+2) && !strcmp(argv[i+2], "desc")) wg_genintdata_desc(shmptr, rows, TESTREC_SIZE); else wg_genintdata_asc(shmptr, rows, TESTREC_SIZE); WULOCK(shmptr, wlock); printf("Data inserted\n"); break; } else if(argc>(i+1) && !strcmp(argv[i],"select")) { int rows = atol(argv[i+1]); int from = 0; if(!rows) { fprintf(stderr, "Invalid number of rows.\n"); exit(1); } if(argc > (i+2)) from = atol(argv[i+2]); shmptr=wg_attach_database(shmname, shmsize); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } RLOCK(shmptr, wlock); selectdata(shmptr, rows, from); RULOCK(shmptr, wlock); break; } else if(argc>(i+1) && !strcmp(argv[i],"add")) { int err; shmptr=wg_attach_database(shmname, shmsize); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } WLOCK(shmptr, wlock); err = add_row(shmptr, argv+i+1, argc-i-1); WULOCK(shmptr, wlock); if(!err) printf("Row added.\n"); break; } else if(argc>(i+2) && !strcmp(argv[i],"del")) { shmptr=wg_attach_database(shmname, shmsize); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } /* Delete works like query(), except deletes the matching rows */ del(shmptr, argv+i+1, argc-i-1); break; break; } else if(argc>(i+3) && !strcmp(argv[i],"query")) { shmptr=wg_attach_database(shmname, shmsize); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } /* Query handles it's own locking */ query(shmptr, argv+i+1, argc-i-1); break; } else if(argc>i && !strcmp(argv[i],"addjson")){ wg_int err; shmptr=wg_attach_database(shmname, shmsize); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } WLOCK(shmptr, wlock); /* the filename parameter is optional */ err = wg_parse_json_file(shmptr, (argc>(i+1) ? 
argv[i+1] : NULL)); WULOCK(shmptr, wlock); if(!err) printf("JSON document imported.\n"); else if(err<-1) fprintf(stderr, "Fatal error when importing, data may be partially"\ " imported\n"); else fprintf(stderr, "Import failed.\n"); break; } else if(argc>(i+1) && !strcmp(argv[i],"findjson")) { shmptr=wg_attach_database(shmname, shmsize); if(!shmptr) { fprintf(stderr, "Failed to attach to database.\n"); exit(1); } WLOCK(shmptr, wlock); findjson(shmptr, argv[i+1]); WULOCK(shmptr, wlock); break; } shmname = argv[1]; /* no match, assume shmname was given */ } if(i==scan_to) { /* loop completed normally ==> no commands found */ usage(argv[0]); } if(shmptr) { RULOCK(shmptr, rlock); WULOCK(shmptr, wlock); wg_detach_database(shmptr); } exit(0); } /** Parse row matching parameters from the command line * * argv should point to the part in argument list where the * parameters start. * * If the parsing is successful, *sz holds the size of the argument list. * Otherwise that value should be ignored; the return value of the * function should be used to check for success. */ wg_query_arg *make_arglist(void *db, char **argv, int argc, int *sz) { int c, i, j, qargc; char cond[80]; wg_query_arg *arglist; gint encoded; qargc = argc / 3; *sz = qargc; arglist = (wg_query_arg *) malloc(qargc * sizeof(wg_query_arg)); if(!arglist) return NULL; for(i=0,j=0; i=", 2)) arglist[i].cond = WG_COND_GTEQUAL; else if(!strncmp(cond, "<", 1)) arglist[i].cond = WG_COND_LESSTHAN; else if(!strncmp(cond, ">", 1)) arglist[i].cond = WG_COND_GREATER; else { fprintf(stderr, "invalid condition %s\n", cond); free_arglist(db, arglist, qargc); return NULL; } } return arglist; } /** Free the argument list created by make_arglist() * */ void free_arglist(void *db, wg_query_arg *arglist, int sz) { if(arglist) { int i; for(i=0; icolumn, q->qtype); */ rec = wg_fetch(db, q); while(rec) { wg_print_record(db, (gint *) rec); printf("\n"); rec = wg_fetch(db, q); } wg_free_query(db, q); abrt2: wg_end_read(db, lock_id); abrt1: free_arglist(db, arglist, qargc); } /** Delete rows * Like query(), except the selected rows are deleted. */ void del(void *db, char **argv, int argc) { int qargc; void *rec = NULL; wg_query *q; wg_query_arg *arglist; gint lock_id; arglist = make_arglist(db, argv, argc, &qargc); if(!arglist) return; /* Use maximum isolation */ if(!(lock_id = wg_start_write(db))) { fprintf(stderr, "failed to get lock on database\n"); goto abrt1; } q = wg_make_query(db, NULL, 0, arglist, qargc); if(!q) goto abrt2; if(q->res_count > 0) { printf("Deleting %d rows...", (int) q->res_count); rec = wg_fetch(db, q); while(rec) { wg_delete_record(db, (gint *) rec); rec = wg_fetch(db, q); } printf(" done\n"); } wg_free_query(db, q); abrt2: wg_end_write(db, lock_id); abrt1: free_arglist(db, arglist, qargc); } /** Print rows from database * */ void selectdata(void *db, int howmany, int startingat) { void *rec = wg_get_first_record(db); int i, count; for(i=0;i. * */ /** @file dbgenparse.h * Top level/generic headers and defs for parsers * */ #ifndef DEFINED_DBGENPARSE_H #define DEFINED_DBGENPARSE_H #include "../Db/dbdata.h" #include "../Db/dbmpool.h" #include "dbparse.h" #define parseprintf(...) 
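/* Editorial note (not part of the original header): the MKWG* wrappers
 * defined below are the constructors used by the bison grammar actions in
 * dbotter.y and dbprolog.y.  MKWGPAIR builds an mpool cons cell through
 * wg_mkpair(), while MKWGINT, MKWGSTRING, MKWGCONST etc. wrap wg_mkatom()
 * with the matching WG_*TYPE tag.  For example, the otter grammar builds
 * a compound term with:
 *     term: prim '(' termlist ')'   { $$ = MKWGPAIR(PP, $1, $3); }
 */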
#define MKWGPAIR(pp,x,y) (wg_mkpair(((parse_parm*)pp)->db,((parse_parm*)pp)->mpool,x,y)) #define MKWGINT(pp,x) (wg_mkatom(((parse_parm*)pp)->db,((parse_parm*)pp)->mpool,WG_INTTYPE,x,NULL)) #define MKWGFLOAT(pp,x) (wg_mkatom(((parse_parm*)pp)->db,((parse_parm*)pp)->mpool,WG_DOUBLETYPE,x,NULL)) #define MKWGDATE(pp,x) (wg_mkatom(((parse_parm*)pp)->db,((parse_parm*)pp)->mpool,WG_DATETYPE,x,NULL)) #define MKWGTIME(pp,x) (wg_mkatom(((parse_parm*)pp)->db,((parse_parm*)pp)->mpool,WG_TIMETYPE,x,NULL)) #define MKWGID(pp,x) (wg_mkatom(((parse_parm*)pp)->db,((parse_parm*)pp)->mpool,WG_URITYPE,x,NULL)) #define MKWGURI(pp,x) (wg_mkatom(((parse_parm*)pp)->db,((parse_parm*)pp)->mpool,WG_URITYPE,x,NULL)) #define MKWGSTRING(pp,x) (wg_mkatom(((parse_parm*)pp)->db,((parse_parm*)pp)->mpool,WG_STRTYPE,x,NULL)) #define MKWGCONST(pp,x) (wg_mkatom(((parse_parm*)pp)->db,((parse_parm*)pp)->mpool,WG_ANONCONSTTYPE,x,NULL)) #define MKWGVAR(pp,x) (wg_mkatom(((parse_parm*)pp)->db,((parse_parm*)pp)->mpool,WG_VARTYPE,x,NULL)) #define MKWGNIL NULL // ---- reeentrant ---- typedef struct parse_parm_s { void *yyscanner; // has to be present char *buf; // for parse from str case int pos; // for parse from str case int length; // for parse from str case char* filename; // for err handling void* result; // parser result void* db; // database pointer void* mpool; // mpool pointer char* foo; // if NULL, use input from stdin, else from buf (str case) } parse_parm; #define YYSTYPE char* #define YY_EXTRA_TYPE parse_parm * #endif whitedb-0.7.2/Parser/dbotter.l000066400000000000000000000142621226454622500162370ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbotter.l * Lexer rules for otter parser * */ %{ #include #include #include "dbotterparse.h" #include "dbotter.tab.h" /* reentrant stuff starts */ #define PARM yyget_extra(yyscanner) /* YYERROR_VERBOSE. */ /* #define YY_INPUT(buffer, res, max_size) \ if (PARM->foo!=NULL) { \ do { \ if (PARM->pos >= PARM->length) \ res = YY_NULL; \ else \ { \ res = PARM->length - PARM->pos; \ res > (int)max_size ? res = max_size : 0; \ memcpy(buffer, PARM->buf + PARM->pos, res); \ PARM->pos += res; \ } \ } while (0);\ } else { \ int c = getchar(); \ res = ((c == EOF) ? YY_NULL : (buffer[0] = c, 1)); \ } */ #define YY_INPUT(buffer, res, max_size) \ if (PARM->foo!=NULL) { \ do { \ if (PARM->pos >= PARM->length) \ res = YY_NULL; \ else \ { \ res = PARM->length - PARM->pos; \ res > (int)max_size ? 
res = max_size : 0; \ memcpy(buffer, PARM->buf + PARM->pos, res); \ PARM->pos += res; \ } \ } while (0);\ } else { \ int n = fread(buffer,1,max_size,stdin); \ if (n<=0) res=YY_NULL;\ else res=n; \ } /* void lex_parsestr(const char *s) { YY_BUFFER_STATE yyhandle; yyhandle = YY_CURRENT_BUFFER; yy_scan_string(s); yylex(); yy_delete_buffer(YY_CURRENT_BUFFER); yy_switch_to_buffer(yyhandle); } */ char linebuf[1024]; char elmparsestrbuf[1024]; char *s; %} %option reentrant %option bison-bridge %option noyywrap %option yylineno %option nounput %option noinput %x STRSTATE %x QUOTESTATE %x COMMENT DIGIT [0-9] ID [A-z][A-z0-9_:+\-*/<>=]* %% "+"|"!-"|"*"|"/"|"<"|">"|"="|"<="|">=" { parseprintf( "an op: %s", yytext); *yylval=yytext; return URI; } {DIGIT}+ { parseprintf( "An integer: %s (%d)\n",yytext,atoi(yytext)); *yylval=yytext; return INT; } {DIGIT}+"."{DIGIT}+ { parseprintf( "A float: %s", yytext); *yylval=yytext; return FLOAT; } {DIGIT}{DIGIT}{DIGIT}{DIGIT}"-"{DIGIT}{DIGIT}"-"{DIGIT}{DIGIT} { parseprintf( "A date: %s\n", yytext); *yylval=yytext; return DATE; } {DIGIT}{DIGIT}":"{DIGIT}{DIGIT}":"{DIGIT}{DIGIT} { parseprintf( "A time: %s\n", yytext); *yylval=yytext; return TIME; } \" { BEGIN STRSTATE; s = elmparsestrbuf; } \\n { *s++ = '\n'; } \\t { *s++ = '\t'; } \\\" { *s++ = '\"'; } \" { *s = 0; BEGIN 0; parseprintf("found '%s'\n", elmparsestrbuf); *yylval=elmparsestrbuf; return STRING; } \n { *s++ = '\n'; /* parseprintf("elm parser error: invalid string (newline in string)"); exit(1); */ } . { *s++ = *yytext; } \' { BEGIN QUOTESTATE; s = elmparsestrbuf; } \\n { *s++ = '\n'; } \\t { *s++ = '\t'; } \\\' { *s++ = '\''; } \' { *s = 0; BEGIN 0; parseprintf("found '%s'\n", elmparsestrbuf); *yylval=elmparsestrbuf; return URI; } \n { *s++ = '\n'; /* parseprintf("elm parser error: invalid string (newline in quote)"); exit(1); */ } . { *s++ = *yytext; } "/*" BEGIN(COMMENT); [^*\n]* /* eat anything that's not a '*' */ "*"+[^*/\n]* /* eat up '*'s not followed by '/'s */ \n ; "*"+"/" BEGIN(INITIAL); {ID} { parseprintf( "An identifier: %s\n", yytext ); *yylval=yytext; return URI; } [?]{ID} { parseprintf( "A variable: %s\n", yytext ); *yylval=yytext; return VAR; } "%".*\n { } /* eat up line comment until end of line */ [-] return '-'; [|] return '|'; [)] return ')'; [(] return '('; [}] return '}'; [{] return '{'; [,] return ','; [.] return '.'; <> { parseprintf("file end\n"); //return FILEEND; yyterminate(); } [\r\n] {} [ \t]+ /* eat up whitespace */ . parseprintf( "Unrecognized character: %s\n", yytext ); %% void wg_yyottererror (parse_parm* parm, void* scanner, char* msg) { //printf("\n yyerror called with xx msg %s\n",msg); printf("%s at otter file %s line %d text fragment:\n%s\n", msg,parm->filename,yyget_lineno(scanner),yyget_text(scanner)); } whitedb-0.7.2/Parser/dbotter.y000066400000000000000000000054771226454622500162640ustar00rootroot00000000000000/* Copyright (c) Mindstone 2004 * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 1, or (at your option) * any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program: the file COPYING contains this copy. 
* if not, write to the Free Software * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. * */ %{ #include #include "dbotterparse.h" //#define PP ((void*)(((* parse_parm)parm)->foo)) #define PP (parm) %} /* ** 'pure_parser' tells bison to use no global variables and create a ** reentrant parser. */ %pure_parser %parse-param {parse_parm *parm} %parse-param {void *scanner} %lex-param {yyscan_t *scanner} %token VARIABLE %token FILEEND %token INT %token FLOAT %token DATE %token TIME %token STRING %token ID %token URI %token CONST %token VAR %% /* Grammar rules and actions follow */ input: /* empty */ | sentencelist { (parm->result)=$1; } ; sentence: assertion { $$ = $1; } ; sentencelist: sentence { $$ = MKWGPAIR(PP,$1,MKWGNIL); } | sentencelist sentence { $$ = MKWGPAIR(PP,$2,$1); } ; assertion: primsentence '.' { $$ = $1; } ; primsentence: term { $$ = MKWGPAIR(PP,$1,MKWGNIL); } | loglist { $$ = $1; } ; loglist: term { $$ = MKWGPAIR(PP,$1,MKWGNIL); } | term '|' loglist { $$ = MKWGPAIR(PP,$1,$3); } ; term: prim { $$ = $1; } | prim '(' ')' { $$ = MKWGPAIR(PP,$1,NULL); } | prim '(' termlist ')' { $$ = MKWGPAIR(PP,$1,$3); } | '-' term { $$ = MKWGPAIR(PP,MKWGCONST(PP,"not"),MKWGPAIR(PP,$2,MKWGNIL)); } ; termlist: term { $$ = MKWGPAIR(PP,$1,MKWGNIL); } | term ',' termlist { $$ = MKWGPAIR(PP,$1,$3); } ; prim: INT { $$ = MKWGINT(PP,$1); } | FLOAT { $$ = MKWGFLOAT(PP,$1); } | DATE { $$ = MKWGDATE(PP,$1); } | TIME { $$ = MKWGTIME(PP,$1); } | STRING { $$ = MKWGSTRING(PP,$1); } | VAR { $$ = MKWGVAR(PP,$1); } | URI { $$ = MKWGURI(PP,$1); } | ID { $$ = MKWGCONST(PP,$1); } | CONST { $$ = MKWGCONST(PP,$1); } ; %% whitedb-0.7.2/Parser/dbotterparse.h000066400000000000000000000023701226454622500172630ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbotterparse.h * Special defs and headers for otter parser * */ #ifndef DEFINED_DBOTTERPARSE_H #define DEFINED_DBOTTERPARSE_H #include "dbgenparse.h" int wg_yyotterlex(YYSTYPE *, void *); int wg_yyotterlex_init(void **); int wg_yyotterlex_destroy(void *); void wg_yyotterset_extra(YY_EXTRA_TYPE, void *); int wg_yyotterparse(parse_parm *, void *); void wg_yyottererror (parse_parm* parm, void* scanner, char* msg); #endif whitedb-0.7.2/Parser/dbparse.c000066400000000000000000000560061226454622500162050ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. 
* * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbparse.c * Top level procedures for parsers * */ /* ====== Includes =============== */ #include #include #include #include #include #include "../Db/dbdata.h" #include "../Db/dbmem.h" #include "../Db/dballoc.h" #include "../Db/dbdata.h" #include "../Db/dbmpool.h" #include "../Printer/dbotterprint.h" #include "../Reasoner/clterm.h" #include "dbparse.h" #include "dbgenparse.h" #include "dbotterparse.h" #include "dbprologparse.h" /* ====== Private headers and defs ======== */ #define MAX_URI_SCHEME 10 #define VARDATALEN 1000 #undef DEBUG #ifdef DEBUG #define DPRINTF(...) { printf(__VA_ARGS__); } #else #define DPRINTF(...) ; #endif //static void otter_escaped_str(void *db, char *iptr, char *buf, int buflen); static int show_parse_error(void* db, char* format, ...); static int show_parse_warning(void* db, char* format, ...); /* ======== Data ========================= */ /** Recognized URI schemes (used when parsing input data) * when adding new schemes, check that MAX_URI_SCHEME is enough to * store the entire scheme + '\0' */ struct uri_scheme_info { char *prefix; int length; } uri_scheme_table_otter[] = { { "urn:", 4 }, { "file:", 5 }, { "http://", 7 }, { "https://", 8 }, { "mailto:", 7 }, { NULL, 0 } }; /* ====== Private protos ======== */ /* ====== Functions ============== */ int wr_import_otter_file(glb* g, char* filename, char* strasfile, cvec clvec) { void* db=g->db; parse_parm pp; char* fnamestr; FILE* fp; //char* buf; int pres=1; void* pres2=NULL; void *mpool; DPRINTF("wr_import_otter_file called\n"); if (strasfile==NULL) { // input from file fnamestr=filename; fp=freopen(fnamestr, "r", stdin); pp.db=db; pp.filename=fnamestr; pp.foo=NULL; // indicates file case in YY_INPUT in dbotter.l pp.result=NULL; } else { // input from string fnamestr="string"; pp.db=db; pp.filename=fnamestr; pp.foo="a"; // non-NULL indicates string case in YY_INPUT in dbotter.l pp.buf = strasfile; pp.length = strlen(strasfile); pp.pos = 0; pp.result=NULL; } mpool=wg_create_mpool(db,1000000); pp.mpool=mpool; wg_yyotterlex_init(&pp.yyscanner); wg_yyotterset_extra(&pp, pp.yyscanner); pres=wg_yyotterparse(&pp, pp.yyscanner); wg_yyotterlex_destroy(pp.yyscanner); DPRINTF("result: %d pp.result %s\n",pres,(char*)pp.result); if (!pres && pp.result!=NULL) { if ((g->print_initial_parser_result)>0) { printf("\nOtter parser result:\n"); wg_mpool_print(db,pp.result); } pres2=wr_parse_clauselist(g,mpool,clvec,pp.result); } DPRINTF("\notterparse quitting with pres2 %d .\n",(int)pres2); if (pres2==NULL) { DPRINTF("\npres2 is null.\n"); } else { //wg_mpool_print(db,pres2); if ((g->print_generic_parser_result)>0) { printf("\nGeneric parser result:\n"); wr_print_db_otter(g,(g->print_clause_detaillevel)); } } wg_free_mpool(db,mpool); if (pres || pres2==NULL) return 1; else return 0; } /* void wg_yyottererror (parse_parm* parm, void* scanner, char* msg) { printf("\n yyerror called with msg %s\n",msg); printf("\ input error at line %d token %s \n", yylineno,yytext); return; } */ /* void wg_yyottererror (const char *s) { char* errbuf; char* tmp; errbuf=malloc(1000); //(g->parser_errbuf)=errbuf; //snprintf(errbuf,1000,"input error at line %d: %s", wg_yyotterlineno, s); 
sprintf(errbuf,1000,"input error at line %d: %s", wg_yyotterlineno, s); //tmp=xml_encode_str(errbuf); tmp=errbuf; //rqlgandalferr(-1,tmp); printf ("otterparse error at line %d: %s\n", wg_yyotterlineno, s); exit(0); //printf ("parse error at line %d: %s\n", wg_yyotterlineno, s); } */ int wr_import_prolog_file(glb* g, char* filename, char* strasfile, cvec clvec) { void *db=g->db; parse_parm pp; char* fnamestr; FILE* fp; DPRINTF("Hello from dbprologparse!\n"); fnamestr=filename; fp=freopen(fnamestr, "r", stdin); pp.db=db; pp.filename=fnamestr; pp.foo="abcba"; wg_yyprologlex_init(&pp.yyscanner); wg_yyprologset_extra(&pp, pp.yyscanner); wg_yyprologparse(&pp, pp.yyscanner); wg_yyprologlex_destroy(pp.yyscanner); DPRINTF("\nprologparse quitting.\n"); return 0; } /* ---- convert parser-returned list to db records --------- */ void* wr_parse_clauselist(glb* g,void* mpool,cvec clvec,void* clauselist) { void* db=g->db; void* lpart; void* cl; void* clpart; void* lit; void* atom; int clnr=0; int litnr=0; void* fun; int isneg=0; void* tmpptr; void* atomres; gint ameta; gint tmpres2; gint setres; gint setres2; void* record=NULL; int issimple; void* termpart; void* subterm; void* resultlist=NULL; char** vardata; int i; #ifdef DEBUG DPRINTF("wg_parse_rulelist starting with clauselist\n"); wg_mpool_print(db,clauselist); DPRINTF("\n"); #endif if (clvec!=NULL) CVEC_NEXT(clvec)=CVEC_START; // create vardata block by malloc or inside mpool vardata=(char**)(wg_alloc_mpool(db,mpool,sizeof(char*)*VARDATALEN)); if (vardata==NULL) { show_parse_error(db,"cannot allocate vardata in wg_parse_clauselist\n"); return NULL; } //vardata=(char**)(malloc(sizeof(char*)*VARDATALEN)); for(i=0;i1) issimple=0; DPRINTF("\nclause issimple res %d length %d\n",issimple,litnr); // create record for a rule clause if (!issimple) { record=wr_create_rule_clause(g,litnr); if (((int)record)==0) { free(vardata); return NULL; } resultlist=wg_mkpair(db,mpool,record,resultlist); } // clear vardata block for the next clause for(i=0;idb; void* termpart; void* ret; void* subterm; int termnr=0; int deeptcount=0; int vartcount=0; void* tmpres=NULL; gint tmpres2; gint setres; void* record; DPRINTF("\nwg_parse_atom starting with isneg %d atom\n",isneg); #ifdef DEBUG wg_mpool_print(db,term); printf("\n"); #endif // examine term termnr=0; deeptcount=0; vartcount=0; for(termpart=term;wg_ispair(db,termpart);termpart=wg_rest(db,termpart),termnr++) { subterm=wg_first(db,termpart); if (subterm!=NULL && wg_ispair(db,subterm)) deeptcount++; else if (wg_atomtype(db,subterm)==WG_VARTYPE) vartcount++; } // create data record record=wr_create_atom(g,termnr); if (((int)record)==0) { return NULL; } // fill data record and do recursive calls for subterms for(termpart=term,termnr=0;wg_ispair(db,termpart);termpart=wg_rest(db,termpart),termnr++) { term=wg_first(db,termpart); #ifdef DEBUG DPRINTF("\nterm nr %d:",termnr); wg_mpool_print(db,term); printf("\n"); #endif if (!wg_ispair(db,term)) { DPRINTF("term nr %d is primitive \n",termnr); tmpres2=wr_parse_primitive(g,mpool,term,vardata); if (tmpres2==WG_ILLEGAL) { wg_delete_record(db,record); return NULL; } } else { DPRINTF("term nr %d is nonprimitive \n",termnr); tmpres=wr_parse_term(g,mpool,term,vardata); if (tmpres==NULL) { wg_delete_record(db,record); return NULL; } tmpres2=wg_encode_record(db,tmpres); } if (tmpres2==WG_ILLEGAL) return NULL; setres=wr_set_atom_subterm(g,record,termnr,tmpres2); if (setres!=0) { wg_delete_record(db,record); return NULL; } } ret=record; DPRINTF("\nwg_parse_atom ending\n"); if (ret==NULL) 
DPRINTF("\nwg_parse_atom returns NULL\n"); return ret; } void* wr_parse_term(glb* g,void* mpool,void* term, char** vardata) { void* db=g->db; void* termpart; void* ret; void* subterm; int termnr=0; int deeptcount=0; int vartcount=0; void* tmpres=NULL; gint tmpres2; gint setres; void* record; #ifdef DEBUG DPRINTF("\nwg_parse_term starting with "); wg_mpool_print(db,term); printf("\n"); #endif // examine term termnr=0; deeptcount=0; vartcount=0; for(termpart=term;wg_ispair(db,termpart);termpart=wg_rest(db,termpart),termnr++) { subterm=wg_first(db,termpart); if (subterm!=NULL && wg_ispair(db,subterm)) deeptcount++; else if (wg_atomtype(db,subterm)==WG_VARTYPE) vartcount++; } // create data record record=wr_create_term(g,termnr); if (((int)record)==0) { return NULL; } //DPRINTF("\nwg_parse_term termnr %d \n",termnr); // fill data record and do recursive calls for subterms for(termpart=term,termnr=0;wg_ispair(db,termpart);termpart=wg_rest(db,termpart),termnr++) { term=wg_first(db,termpart); #ifdef DEBUG //DPRINTF("\nterm nr %d:",termnr); //wg_mpool_print(db,term); //printf("\n"); #endif if (!wg_ispair(db,term)) { DPRINTF("term nr %d is primitive \n",termnr); tmpres2=wr_parse_primitive(g,mpool,term,vardata); if (tmpres2==WG_ILLEGAL) { wg_delete_record(db,record); return NULL; } } else { DPRINTF("term nr %d is nonprimitive \n",termnr); tmpres=wr_parse_term(g,mpool,term,vardata); if (tmpres==NULL) { wg_delete_record(db,record); return NULL; } tmpres2=wg_encode_record(db,tmpres); } if (tmpres2==WG_ILLEGAL) return NULL; setres=wr_set_term_subterm(g,record,termnr,tmpres2); if (setres!=0) { wg_delete_record(db,record); return NULL; } } ret=record; DPRINTF("\nwg_parse_term ending \n"); if (ret==NULL) DPRINTF("\nwg_parse_term returns NULL\n"); return ret; } gint wr_parse_primitive(glb* g,void* mpool,void* atomptr, char** vardata) { void *db=g->db; gint ret; int type; char* str1; char* str2; int intdata; double doubledata; int i; #ifdef DEBUG DPRINTF("\nwg_parse_primitive starting with "); wg_mpool_print(db,atomptr); printf("\n"); #endif if (atomptr==NULL) { ret=wg_encode_null(db,NULL); } else { type=wg_atomtype(db,atomptr); str1=wg_atomstr1(db,atomptr); str2=wg_atomstr2(db,atomptr); switch (type) { case 0: ret=wg_encode_null(db,NULL); break; case WG_NULLTYPE: ret=wg_encode_null(db,NULL); break; case WG_INTTYPE: intdata = atol(str1); if(errno!=ERANGE && errno!=EINVAL) { ret = wg_encode_int(db, intdata); } else { errno=0; ret=WG_ILLEGAL; } break; case WG_DOUBLETYPE: doubledata = atof(str1); if(errno!=ERANGE && errno!=EINVAL) { ret = wg_encode_double(db, doubledata); } else { errno=0; ret=WG_ILLEGAL; } break; case WG_STRTYPE: ret=wg_encode_str(db,str1,str2); break; case WG_XMLLITERALTYPE: ret=wg_encode_xmlliteral(db,str1,str2); break; case WG_URITYPE: ret=wg_encode_uri(db,str1,str2); break; //case WG_BLOBTYPE: // ret=wg_encode_blob(db,str1,str2); // break; case WG_CHARTYPE: ret=wg_encode_char(db,*str1); break; case WG_FIXPOINTTYPE: doubledata = atof(str1); if(errno!=ERANGE && errno!=EINVAL) { ret = wg_encode_fixpoint(db, doubledata); } else { errno=0; ret=WG_ILLEGAL; } break; case WG_DATETYPE: intdata=wg_strp_iso_date(db,str1); ret=wg_encode_date(db,intdata); break; case WG_TIMETYPE: intdata=wg_strp_iso_time(db,str1); ret=wg_encode_time(db,intdata); break; case WG_ANONCONSTTYPE: ret=wg_encode_anonconst(db,str1); break; case WG_VARTYPE: intdata=0; DPRINTF("starting WG_VARTYPE block\n"); for(i=0;i=VARDATALEN) { show_parse_warning(db,"too many variables in a clause: ignoring the clause"); errno=0; 
ret=WG_ILLEGAL; break; } ret=wg_encode_var(db,intdata); DPRINTF("var %d encoded ok\n",intdata); break; default: ret=wg_encode_null(db,NULL); } } DPRINTF("\nwg_parse_term ending with %d\n",ret); return ret; } /* -------------- parsing utilities ----------------------- */ /** Parse value from string, encode it for WhiteDB * returns WG_ILLEGAL if value could not be parsed or * encoded. * Supports following data types: * NULL - empty string * variable - ?x where x is a numeric character * int - plain integer * double - floating point number in fixed decimal notation * date - ISO8601 date * time - ISO8601 time+fractions of second. * uri - string starting with an URI prefix * string - other strings * Since leading whitespace generally makes type guesses fail, * it invariably causes the data to be parsed as string. */ gint wr_parse_and_encode_otter_prim(glb* g, char *buf) { void* db=g->db; int intdata; double doubledata; gint encoded = WG_ILLEGAL, res; char c = buf[0]; if(c == 0) { /* empty fields become NULL-s */ encoded = 0; } else if(c == '?' && buf[1] >= '0' && buf[1] <= '9') { /* try a variable */ intdata = atol(buf+1); if(errno!=ERANGE && errno!=EINVAL) { encoded = wg_encode_var(db, intdata); } else { errno = 0; } } else if(c >= '0' && c <= '9') { /* This could be one of int, double, date or time */ if((res = wg_strp_iso_date(db, buf)) >= 0) { encoded = wg_encode_date(db, res); } else if((res = wg_strp_iso_time(db, buf)) >= 0) { encoded = wg_encode_time(db, res); } else { /* Examine the field contents to distinguish between float * and int, then convert using atol()/atof(). sscanf() tends to * be too optimistic about the conversion, especially under Win32. */ char *ptr = buf, *decptr = NULL; int decsep = 0; while(*ptr) { if(*ptr == OTTER_DECIMAL_SEPARATOR) { decsep++; decptr = ptr; } else if(*ptr < '0' || *ptr > '9') { /* Non-numeric. Mark this as an invalid number * by abusing the decimal separator count. */ decsep = 2; break; } ptr++; } if(decsep==1) { char tmp = *decptr; *decptr = '.'; /* ignore locale, force conversion by plain atof() */ doubledata = atof(buf); if(errno!=ERANGE && errno!=EINVAL) { encoded = wg_encode_double(db, doubledata); } else { errno = 0; /* Under Win32, successful calls don't do this? */ } *decptr = tmp; /* conversion might have failed, restore string */ } else if(!decsep) { intdata = atol(buf); if(errno!=ERANGE && errno!=EINVAL) { encoded = wg_encode_int(db, intdata); } else { errno = 0; } } } } else { /* Check for uri scheme */ encoded = wr_parse_and_encode_otter_uri(g, buf); } if(encoded == WG_ILLEGAL) { /* All else failed. Try regular string. */ encoded = wg_encode_str(db, buf, NULL); } return encoded; } /** Try parsing an URI from a string. * Returns encoded WG_URITYPE field when successful * Returns WG_ILLEGAL on error * * XXX: this is a very naive implementation. Something more robust * is needed. */ gint wr_parse_and_encode_otter_uri(glb* g, char *buf) { void* db=g->db; gint encoded = WG_ILLEGAL; struct uri_scheme_info *next = uri_scheme_table_otter; /* Try matching to a known scheme */ while(next->prefix) { if(!strncmp(buf, next->prefix, next->length)) { /* We have a matching URI scheme. * XXX: check this code for correct handling of prefix. 
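 * Illustrative example (editor's addition): for the input
 * "http://example.org/foo#bar" the backwards scan below stops at '#',
 * so wg_encode_uri() receives "bar" as the data part and
 * "http://example.org/foo#" as the prefix.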
*/ int urilen = strlen(buf); char *prefix = malloc(urilen + 1); char *dataptr; if(!prefix) break; strncpy(prefix, buf, urilen); dataptr = prefix + urilen; while(--dataptr >= prefix) { switch(*dataptr) { case ':': case '/': case '#': *(dataptr+1) = '\0'; goto prefix_marked; default: break; } } prefix_marked: encoded = wg_encode_uri(db, buf+((int)dataptr-(int)prefix+1), prefix); free(prefix); break; } next++; } return encoded; } /* ------------ errors ---------------- */ static int show_parse_error(void* db, char* format, ...) { va_list args; va_start (args, format); printf("*** Parser error: "); vprintf (format, args); va_end (args); return -1; } static int show_parse_warning(void* db, char* format, ...) { va_list args; va_start (args, format); printf("*** Parser warning: "); vprintf (format, args); va_end (args); return -1; } whitedb-0.7.2/Parser/dbparse.h000066400000000000000000000033451226454622500162100ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbparse.h * Top level/generic headers and defs for parsers * */ #ifndef DEFINED_DBPARSE_H #define DEFINED_DBPARSE_H #include "../Db/dballoc.h" #include "../Reasoner/mem.h" #include "../Reasoner/glb.h" #define OTTER_DECIMAL_SEPARATOR '.' int wr_import_otter_file(glb* g, char* filename, char* strasfile, cvec clvec); //int wg_import_otter_file(void* db, char* filename, int printlevel); int wr_import_prolog_file(glb* g, char* filename, char* strasfile, cvec clvec); void* wr_parse_clauselist(glb* g,void* mpool,cvec clvec,void* clauselist); void* wr_parse_atom(glb* g,void* mpool,void* term, int isneg, int issimple, char** vardata); void* wr_parse_term(glb* g,void* mpool,void* term, char** vardata); gint wr_parse_primitive(glb* g,void* mpool,void* term, char** vardata); gint wr_parse_and_encode_otter_prim(glb* g, char *buf); gint wr_parse_and_encode_otter_uri(glb* g, char *buf); gint wr_print_parseres(glb* g, gint x); #endif whitedb-0.7.2/Parser/dbprolog.l000066400000000000000000000071421226454622500164030ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . 
* */ /** @file dbprolog.l * Lexer rules for prolog parser * */ %{ #include #include #include "dbprologparse.h" #include "dbprolog.tab.h" /* reentrant stuff starts */ #define PARM yyget_extra(yyscanner) /* #define YY_INPUT(buffer, res, max_size) \ do { \ if (PARM->pos >= PARM->length) \ res = YY_NULL; \ else \ { \ res = PARM->length - PARM->pos; \ res > (int)max_size ? res = max_size : 0; \ memcpy(buffer, PARM->buf + PARM->pos, res); \ PARM->pos += res; \ } \ } while (0) */ char linebuf[1024]; char elmparsestrbuf[1024]; char *s; %} %option reentrant %option bison-bridge %option noyywrap %option yylineno %option nounput %option noinput %x SQLSTATE %x STRSTATE ATOM [a-z#][_a-zA-Z0-9]* VAR [A-Z][_a-zA-Z0-9]* %% "%".*\n { parseprintf("A lineful of comment.\n"); /* eat line comments */ } :- { parseprintf("A :- , \"IS\"\n"); return IS; } not { parseprintf("A \"not\".\n"); return NOT; } {ATOM} { parseprintf("An atom: %s\n", yytext); *yylval=strdup(yytext); return ATOM; } {VAR} { parseprintf("A variable: %s\n", yytext); *yylval=strdup(yytext); return VAR; } [0-9]+ { parseprintf( "An integer: %s (%d)\n", yytext, atoi(yytext)); *yylval=strdup(yytext); return INT; } [0-9]+"."[0-9]+ { parseprintf( "A float: %s (%lf)\n", yytext, atof(yytext)); *yylval=strdup(yytext); return FLOAT; } "'"[^']+"'" { parseprintf("A \'-quoted string, basically an atom: %s\n", yytext); *yylval = strdup(yytext) + 1; // erase the first character ((char*)*yylval)[strlen(*yylval) - 1] = '\0'; // erase the last character ' return ATOM; } "\""[^"]+"\"" { parseprintf("A quoted string, basically an atom: %s\n", yytext); *yylval = strdup(yytext); // erase the first character //((char*)*yylval)[strlen(*yylval) - 1] = '\0'; // erase the last character ' return ATOM; } [)] return ')'; [(] return '('; [,] return ','; [.] return '.'; [;] return ';'; [!] return '!'; [~] return '~'; <> { parseprintf("file end. Read %d lines.\n", yylineno); yyterminate(); return 0; } [\r\n] { yylineno++; } [ \t]+ ;/* eat up whitespace */ . { parseprintf( "Unrecognized character: %s\n", yytext ); } %% void wg_yyprologerror(parse_parm* parm, void* scanner, char* msg) { //printf("\n yyerror called with xx msg %s\n",msg); printf("parse error at prolog file %s line %d text fragment %s: %s \n", parm->filename,yyget_lineno(scanner),yyget_text(scanner),msg); } whitedb-0.7.2/Parser/dbprolog.y000066400000000000000000000105531226454622500164200ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbprolog.y * Grammar rules for prolog parser * */ %{ #include #include "dbprologparse.h" #define DBPARM ((void*)(parm->db)) %} /* ** 'pure_parser' tells bison to use no global variables and create a ** reentrant parser. 
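** (Editor's note: combined with the %parse-param/%lex-param declarations
** below, this makes the generated parser keep its state in the parse_parm
** structure from dbgenparse.h rather than in globals.)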
*/ /*%pure_parser*/ %pure-parser %parse-param {parse_parm *parm} %parse-param {void *scanner} %lex-param {yyscan_t *scanner} %token IS %token ATOM %token FLOAT %token INT %token NOT %token VAR %token FAIL %left NOT /* TODO: make these work %left OR %left AND %nonassoc '<' '>' '=' LE GE NE %left '+' '-' %left '*' '/' */ %% /* Grammar rules and actions follow */ input: clauselist { $$ = MKWGPAIR(DBPARM,$1,MKWGNIL); } ; clauselist: /* empty */ { $$ = MKWGNIL; } | clause clauselist { $$ = MKWGPAIR(DBPARM,$1, $2); } ; clause: fact { $$ = $1; } | rule { $$ = $1; } ; fact: functionform '.' { $$ = MKWGPAIR(DBPARM,$1, MKWGNIL); } ; functionform: ATOM '(' arguments ')' { $$ = MKWGPAIR(DBPARM,MKWGSTRING(DBPARM,$1), $3); } | NOT ATOM '(' arguments ')' { $$ = MKWGPAIR(DBPARM,MKWGSTRING(DBPARM,"not"), MKWGPAIR(DBPARM,MKWGPAIR(DBPARM,MKWGSTRING(DBPARM,$2), $4), MKWGNIL)); } | '~' ATOM '(' arguments ')' { $$ = MKWGPAIR(DBPARM,MKWGSTRING(DBPARM,"~"), MKWGPAIR(DBPARM,MKWGPAIR(DBPARM,MKWGSTRING(DBPARM,$2), $4), MKWGNIL)); } ; arguments: argument { $$ = MKWGPAIR(DBPARM,$1, MKWGNIL); } | argument ',' arguments { $$ = MKWGPAIR(DBPARM,$1, $3); } ; argument: functionform { $$ = $1; } | ATOM { $$ = MKWGSTRING(DBPARM,$1); } | INT { $$ = MKWGINT(DBPARM,$1); } | FLOAT { $$ = MKWGSTRING(DBPARM,$1); } | VAR { $$ = MKWGSTRING(DBPARM,$1); } ; rule: functionform IS body '.' { $$ = MKWGPAIR(DBPARM,$1, $3); } ; body: functionform { $$ = MKWGPAIR(DBPARM,MKWGPAIR(DBPARM,MKWGSTRING(DBPARM,"not"), MKWGPAIR(DBPARM,$1, MKWGNIL)), MKWGNIL); } | functionform ',' body { $$ = MKWGPAIR(DBPARM,MKWGPAIR(DBPARM,MKWGSTRING(DBPARM,"not"), MKWGPAIR(DBPARM,$1, MKWGNIL)), $3); } | functionform ';' body { $$ = MKWGPAIR(DBPARM,MKWGPAIR(DBPARM,MKWGSTRING(DBPARM,"and"), MKWGPAIR(DBPARM,$1, $3)), MKWGNIL); } ; body: functionform { $$ = MKWGPAIR(DBPARM,$1, MKWGNIL); } | orlist { $$ = $1; } | andlist { $$ = $1; } ; orlist: { $$ = MKWGNIL; } | functionform ',' orlist { $$ = MKWGPAIR(DBPARM,$1, $3); } | functionform ',' andlist { $$ = MKWGPAIR(DBPARM,$1, $3); } ; andlist: { $$ = MKWGNIL; } | functionform ';' andlist { $$ = MKWGPAIR(DBPARM,$1, $3); } | functionform ';' orlist { $$ = MKWGPAIR(DBPARM,$1, $3); } ; fact: atomargs '.' { $$ = $1; } ; atomargs: ATOM '(' arguments ')' { $$ = MKWGPAIR(DBPARM,MKWGSTRING(DBPARM,$1), MKWGPAIR(DBPARM,$3, MKWGNIL)); } ; rule: ATOM '(' arguments ')' IS body {$$ = MKWGPAIR(DBPARM,MKWGSTRING(DBPARM,$1), MKWGPAIR(DBPARM,$3, MKWGPAIR(DBPARM,MKWGSTRING(DBPARM,":-"), $6))); } ; arguments: argument { $$ = MKWGPAIR(DBPARM,$1, MKWGNIL); } | argument ',' arguments { $$ = MKWGPAIR(DBPARM,$1, $3); } ; body: terms '.' { $$ = $1; } ; terms: term { $$ = $1; } | term ',' terms { $$ = MKWGPAIR(DBPARM,MKWGPAIR(DBPARM,MKWGSTRING(DBPARM,"and"), MKWGPAIR(DBPARM,$1, $3)), MKWGNIL); } | term ';' terms { $$ = MKWGPAIR(DBPARM,MKWGPAIR(DBPARM,MKWGSTRING(DBPARM,"or"), MKWGPAIR(DBPARM,$1, $3)), MKWGNIL); } ; term: ATOM '(' term ')' { $$ = MKWGPAIR(DBPARM,MKWGSTRING(DBPARM,$1), MKWGPAIR(DBPARM,$3, MKWGNIL)); } | NOT '(' term ')' { $$ = MKWGPAIR(DBPARM,MKWGSTRING(DBPARM,"not"), $3); } | atomargs { $$ = $1; } | '!' 
{ $$ = MKWGPAIR(DBPARM,MKWGSTRING(DBPARM,"cut"), MKWGNIL); } ; %% whitedb-0.7.2/Parser/dbprologparse.h000066400000000000000000000024011226454622500174230ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbprologparse.h * Special defs and headers for prolog parser * */ #ifndef DEFINED_PROLOGPARSE_H #define DEFINED_PROLOGPARSE_H #include "dbgenparse.h" int wg_yyprologlex(YYSTYPE *, void *); int wg_yyprologlex_init(void **); int wg_yyprologlex_destroy(void *); void wg_yyprologset_extra(YY_EXTRA_TYPE, void *); int wg_yyprologparse(parse_parm *, void *); void wg_yyprologerror (parse_parm* parm, void* scanner, char* msg); #endif whitedb-0.7.2/Printer/000077500000000000000000000000001226454622500146015ustar00rootroot00000000000000whitedb-0.7.2/Printer/Makefile.am000066400000000000000000000002001226454622500166250ustar00rootroot00000000000000# # - - - - reasoner sources - - - - noinst_LTLIBRARIES = libPrinter.la libPrinter_la_SOURCES = dbotterprint.c dbotterprint.h whitedb-0.7.2/Printer/dbotterprint.c000066400000000000000000000222551226454622500174730ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . 
* */ /** @file dbotterprint.c * Top level procedures for otterprinter * */ /* ====== Includes =============== */ #include #include #include #include #include "../Db/dbdata.h" #include "../Db/dbmem.h" #include "../Db/dballoc.h" #include "../Db/dbdata.h" #include "../Db/dbmpool.h" #include "../Db/dbutil.h" #include "../Reasoner/clterm.h" #include "dbotterprint.h" /* ====== Private headers and defs ======== */ #undef DEBUG /* ======== Data ========================= */ /* ====== Private protos ======== */ /* static gint show_print_error(void* db, char* errmsg); static gint show_print_error_nr(void* db, char* errmsg, gint nr); static gint show_print_error_str(void* db, char* errmsg, char* str); */ /* ====== Functions ============== */ void wr_print_clause(glb* g, gptr rec) { if (rec==NULL) return; wr_print_clause_otter(g,rec,(g->print_clause_detaillevel)); } void wr_print_term(glb* g, gint rec) { if (rec==(gint)NULL || rec==WG_ILLEGAL) return; wr_print_term_otter(g,rec,(g->print_clause_detaillevel)); } void wr_print_record(glb* g, gptr rec) { wg_print_record(g->db,rec); } /** Print whole db * */ void wr_print_db_otter(glb* g,int printlevel) { void* db=g->db; void *rec; rec = wg_get_first_raw_record(db); while(rec) { if (wg_rec_is_rule_clause(db,rec)) { wr_print_rule_clause_otter(g, (gint *) rec,printlevel); printf("\n"); } else if (wg_rec_is_fact_clause(db,rec)) { wr_print_fact_clause_otter(g, (gint *) rec,printlevel); printf("\n"); } rec = wg_get_next_raw_record(db,rec); } } /** Print single clause (rule/fact record) * */ void wr_print_clause_otter(glb* g, gint* rec, int printlevel) { //printf("wg_print_clause_otter called with rec "); //wg_print_record(db,rec); if (rec==NULL) { printf("NULL\n"); return; } if (wg_rec_is_rule_clause(db,rec)) { //printf("ruleclause\n"); wr_print_rule_clause_otter(g, (gint *) rec,printlevel); printf("\n"); } else if (wg_rec_is_fact_clause(db,rec)) { //printf("factclause\n"); wr_print_fact_clause_otter(g, (gint *) rec,printlevel); printf("\n"); } //printf("wg_print_clause_otter exiting\n"); } /** Print single rule record * */ void wr_print_rule_clause_otter(glb* g, gint* rec,int printlevel) { void* db=g->db; gint meta, enc; int i, len; //char strbuf[256]; #if 0 /* XXX: FIXME */ #ifdef USE_CHILD_DB gint parent; #endif #endif if (rec==NULL) {printf("NULL\n"); return;} #if 0 /* XXX: FIXME */ #ifdef USE_CHILD_DB parent = wg_get_rec_base_offset(db, rec); #endif #endif //len = wg_get_record_len(db, rec); len = wg_count_clause_atoms(db, rec); //printf("[%d ",len); for(i=0; i(CLAUSE_EXTRAHEADERLEN+2)) printf(" | "); //if (i>1 && i+10 && idb; if (rec==NULL) { printf("NULL\n"); return; } wr_print_atom_otter(g,wg_encode_record(db,rec),printlevel); printf("."); } void wr_print_atom_otter(glb* g, gint rec, int printlevel) { void* db=g->db; gptr recptr; gint len, enc; int i; if (wg_get_encoded_type(db,rec)!=WG_RECORDTYPE) { wr_print_simpleterm_otter(g,rec,printlevel); return; } #if 0 /* XXX: FIXME */ #ifdef USE_CHILD_DB gint parent; parent = wg_get_rec_base_offset(db, rec); #endif #endif recptr=wg_decode_record(db, rec); len = wg_get_record_len(db, recptr); //printf("["); for(i=0; iunify_firstuseterm)) continue; if(i>((g->unify_firstuseterm)+1)) printf(","); enc = wg_get_field(db, recptr, i); #if 0 /* XXX: FIXME */ #ifdef USE_CHILD_DB if(parent) enc = wg_encode_parent_data(parent, enc); #endif #endif if (wg_get_encoded_type(db, enc)==WG_RECORDTYPE) { wr_print_term_otter(g,enc,printlevel); } else { wr_print_simpleterm_otter(g, enc,printlevel); } if (i==(g->unify_firstuseterm)) 
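      /* the leading field (the predicate symbol) has just been printed;
         open the argument list right after it */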
printf("("); } printf(")"); } void wr_print_term_otter(glb* g, gint rec,int printlevel) { void* db=g->db; gptr recptr; gint len, enc; int i; #ifdef DEBUG printf("print_term called with enc %d and type %d \n",(int)rec,wg_get_encoded_type(db,rec)); #endif if (wg_get_encoded_type(db,rec)!=WG_RECORDTYPE) { wr_print_simpleterm_otter(g,rec,printlevel); return; } #if 0 /* XXX: FIXME */ #ifdef USE_CHILD_DB gint parent; parent = wg_get_rec_base_offset(db, rec); #endif #endif recptr=wg_decode_record(db, rec); len = wg_get_record_len(db, recptr); for(i=0; iunify_firstuseterm)) continue; if(i>((g->unify_firstuseterm)+1)) printf(","); enc = wg_get_field(db, recptr, i); #if 0 /* XXX: FIXME */ #ifdef USE_CHILD_DB if(parent) enc = wg_encode_parent_data(parent, enc); #endif #endif if (wg_get_encoded_type(db, enc)==WG_RECORDTYPE) { wr_print_term_otter(g,enc,printlevel); } else { wr_print_simpleterm_otter(g, enc,printlevel); } if (i==(g->unify_firstuseterm)) printf("("); } printf(")"); } /** Print a single, encoded value or a subrecord * */ void wr_print_simpleterm_otter(glb* g, gint enc,int printlevel) { void* db=g->db; int intdata; char *strdata, *exdata; double doubledata; char strbuf[80]; #ifdef DEBUG printf("simpleterm called with enc %d and type %d \n",(int)enc,wg_get_encoded_type(db,enc)); #endif switch(wg_get_encoded_type(db, enc)) { case WG_NULLTYPE: printf("NULL"); break; //case WG_RECORDTYPE: // ptrdata = (gint) wg_decode_record(db, enc); // wg_print_subrecord_otter(db,(gint*)ptrdata); // break; case WG_INTTYPE: intdata = wg_decode_int(db, enc); printf("%d", intdata); break; case WG_DOUBLETYPE: doubledata = wg_decode_double(db, enc); printf("%f", doubledata); break; case WG_STRTYPE: strdata = wg_decode_str(db, enc); printf("\"%s\"", strdata); break; case WG_URITYPE: strdata = wg_decode_uri(db, enc); exdata = wg_decode_uri_prefix(db, enc); if (exdata==NULL) printf("%s", strdata); else printf("%s:%s", exdata, strdata); break; case WG_XMLLITERALTYPE: strdata = wg_decode_xmlliteral(db, enc); exdata = wg_decode_xmlliteral_xsdtype(db, enc); printf("\"%s\"", exdata, strdata); break; case WG_CHARTYPE: intdata = wg_decode_char(db, enc); printf("%c", (char) intdata); break; case WG_DATETYPE: intdata = wg_decode_date(db, enc); wg_strf_iso_datetime(db,intdata,0,strbuf); strbuf[10]=0; printf("%s", intdata,strbuf); break; case WG_TIMETYPE: intdata = wg_decode_time(db, enc); wg_strf_iso_datetime(db,1,intdata,strbuf); printf("%s",intdata,strbuf+11); break; case WG_VARTYPE: intdata = wg_decode_var(db, enc); printf("?%d", intdata); break; case WG_ANONCONSTTYPE: strdata = wg_decode_anonconst(db, enc); printf("!%s", strdata); break; default: printf(""); break; } } /* ------------ errors ---------------- */ /* static gint show_print_error(void* db, char* errmsg) { printf("wg otterprint error: %s\n",errmsg); return -1; } static gint show_print_error_nr(void* db, char* errmsg, gint nr) { printf("wg parser error: %s %d\n", errmsg, (int) nr); return -1; } static gint show_print_error_str(void* db, char* errmsg, char* str) { printf("wg parser error: %s %s\n",errmsg,str); return -1; } */ whitedb-0.7.2/Printer/dbotterprint.h000066400000000000000000000031411226454622500174710ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software 
Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file dbotterprint.h * Top level/generic headers and defs for otterprinter * */ #ifndef DEFINED_DBOTTERPRINT_H #define DEFINED_DBOTTERPRINT_H #include "../Db/dballoc.h" #include "../Reasoner/types.h" #include "../Reasoner/mem.h" #include "../Reasoner/glb.h" void wr_print_clause(glb* g, gptr rec); void wr_print_term(glb* g, gint rec); void wr_print_record(glb* g, gptr rec); void wr_print_db_otter(glb* g,int printlevel); void wr_print_clause_otter(glb* g, gint* rec,int printlevel); void wr_print_rule_clause_otter(glb* g, gint* rec,int printlevel); void wr_print_fact_clause_otter(glb* g, gint* rec,int printlevel); void wr_print_atom_otter(glb* g, gint rec,int printlevel); void wr_print_term_otter(glb* g, gint rec,int printlevel); void wr_print_simpleterm_otter(glb* g, gint enc,int printlevel); #endif whitedb-0.7.2/Python/000077500000000000000000000000001226454622500144375ustar00rootroot00000000000000whitedb-0.7.2/Python/Makefile.am000066400000000000000000000011231226454622500164700ustar00rootroot00000000000000# $Id: $ # $Source: $ # # Compile Python extension module # ---- options ---- # ---- targets ---- python_PYTHON = whitedb.py WGandalf.py pyexec_LTLIBRARIES = wgdb.la # ---- path variables ---- dbdir=../Db # ---- extra dependencies, flags, etc ----- LIBDEPS = -lm # dependency from libm round() should be removed if RAPTOR LIBDEPS += `$(RAPTOR_CONFIG) --libs` endif AM_CPPFLAGS = $(PYTHON_INCLUDES) -I $(dbdir) wgdb_la_LDFLAGS = -module -avoid-version $(LIBDEPS) # ----- all sources for the created programs ----- wgdb_la_SOURCES = wgdbmodule.c wgdb_la_LIBADD = ../Main/libwgdb.la whitedb-0.7.2/Python/WGandalf.py000066400000000000000000000016761226454622500165060ustar00rootroot00000000000000#!/usr/bin/env python # -*- coding: latin-1 -*- # # Copyright (c) Priit Järv 2013 # # This file is part of WhiteDB # # WhiteDB is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # WhiteDB is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with WhiteDB. If not, see . 
"""@file WGandalf.py Backwards compatibility wrapper for WhiteDB database Python API """ from warnings import warn warn("WGandalf module is deprecated, use whitedb instead", DeprecationWarning) from whitedb import * whitedb-0.7.2/Python/compile.bat000077500000000000000000000017171226454622500165700ustar00rootroot00000000000000@rem Check that this matches your Python path set PYDIR=c:\Python25 @rem When compiling for Python 3, replace /export:initwgdb @rem with /export:PyInit_wgdb @cl /Ox /W3 /MT /I..\Db /I%PYDIR%\include wgdbmodule.c ..\Db\dbmem.c ..\Db\dballoc.c ..\Db\dbdata.c ..\Db\dblock.c ..\Db\dbtest.c ..\DB\dbdump.c ..\Db\dblog.c ..\Db\dbhash.c ..\Db\dbindex.c ..\Db\dbcompare.c ..\Db\dbquery.c ..\Db\dbutil.c ..\Db\dbmpool.c ..\Db\dbjson.c ..\Db\dbschema.c ..\json\yajl_all.c /link /dll /incremental:no /MANIFEST:NO /LIBPATH:%PYDIR%\libs /export:initwgdb /out:wgdb.pyd @rem Currently this script produced a statically linked DLL for ease of @rem testing and debugging. If dynamic linking is needed: @rem 1. replace /MT with /MD @rem 2. remove /manifest:no or just the "NO" part @rem 3. uncomment the following: @rem mt -manifest wgdb.pyd.manifest -outputresource:wgdb.pyd;2 @rem You may also need to: @rem 4. distribute msvcrxx.dll from the compiler suite with the lib whitedb-0.7.2/Python/compile.sh000077500000000000000000000005571226454622500164350ustar00rootroot00000000000000#!/bin/sh export PYDIR=/usr/include/python2.7 gcc -O3 -Wall -march=pentium4 -shared -I../Db -I${PYDIR} -o wgdb.so wgdbmodule.c ../Db/dbmem.c ../Db/dballoc.c ../Db/dbdata.c ../Db/dblock.c ../Db/dbindex.c ../Db/dbtest.c ../Db/dblog.c ../Db/dbhash.c ../Db/dbcompare.c ../Db/dbquery.c ../Db/dbutil.c ../Db/dbmpool.c ../Db/dbjson.c ../Db/dbschema.c ../json/yajl_all.c whitedb-0.7.2/Python/tests.py000066400000000000000000000677511226454622500161730ustar00rootroot00000000000000#!/usr/bin/env python # -*- coding: latin-1 -*- # # Copyright (c) Priit Järv 2012,2013 # # This file is part of WhiteDB # # WhiteDB is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # WhiteDB is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with WhiteDB. If not, see . 
"""@file tests.py Unit tests for the WhiteDB Python API """ import unittest import wgdb import whitedb import datetime MINDBSIZE=8000000 # should cover 64-bit databases that need more memory class LowLevelTest(unittest.TestCase): """Provide setUp()/tearDown() for test cases that use the low level Python API.""" def setUp(self): self.d = wgdb.attach_database(size=MINDBSIZE, local=1) def tearDown(self): wgdb.detach_database(self.d) class RecordTests(LowLevelTest): """Test low level record functionality""" def test_creation(self): """Tests record creation and low level scanning to retrieve records from the database.""" rec = wgdb.create_record(self.d, 3) self.assertTrue(wgdb.is_record(rec)) l = wgdb.get_record_len(self.d, rec) self.assertEqual(l, 3) rec2 = wgdb.create_raw_record(self.d, 678) self.assertTrue(wgdb.is_record(rec2)) l = wgdb.get_record_len(self.d, rec2) self.assertEqual(l, 678) # wgdb module only allows comparing records by contents # so we need to use recognizable data for this test. wgdb.set_field(self.d, rec, 0, 99531179) wgdb.set_field(self.d, rec2, 0, 55498756) # XXX: the following relies on certain assumptions on memory # management of WhiteDB. By the API description, the records # are not necessarily fetched in order of creation, it is just # useful for the current test case that it happens to be the case. # cand = wgdb.get_first_record(self.d) self.assertEqual(wgdb.get_field(self.d, cand, 0), 99531179) cand = wgdb.get_next_record(self.d, cand) self.assertEqual(wgdb.get_field(self.d, cand, 0), 55498756) # This, however, should always work correctly wgdb.delete_record(self.d, rec) cand = wgdb.get_first_record(self.d) self.assertEqual(wgdb.get_field(self.d, cand, 0), 55498756) def test_field_data(self): """Tests field data encoding and decoding.""" # BLOBTYPE not supported yet #blob = "\045\120\104\106\055\061\056\065\012\045"\ # "\265\355\256\373\012\063\040\060\040\157\142\152\012\074\074"\ # "\040\057\114\145\156\147\164\150\040\064\040\060\040\122\012"\ # "\040\040\040\057\106\151\154\164\145\162\040\057\106\154\141"\ # "\164\145\104\145\143\157\144\145\012\076\076\012\163\164\162"\ # "\145\141\155\012\170\234\255\227\333\152\334\060\020\100\337"\ # "\375\025\372\001\357\152\106\067\013\312\076\024\112\041\120"\ # "\150\132\103\037\102\036\366\032\010\064\064\027\350\357\167"\ # "\106\222\327\326\156\222\125\152\141\144\153\155\315\350\150"\ # "\146\064\243\175\154\224\025\316\130\141\264\024\255\103\051"\ # "\236\366\342\227\170\150\100\360\365\164\047\226\153\051\356" s1 = "Qly9y63M84Qly9y63M84Qly9y63M84Qly9y63M84Qly9y63M84Qly9y63M84" s2 = "2O15At13Iu" s3 = "A Test String" s4 = "#testobject" s5 = "http://example.com/testns" s6 = "9091270" s7 = "xsd:integer" rec = wgdb.create_record(self.d, 16) # BLOBTYPE not supported yet #wgdb.set_field(self.d, rec, 0, blob, wgdb.BLOBTYPE, "blob.pdf") #val = wgdb.get_field(self.d, rec, 0) #self.assertEqual(type(val), type(())) #self.assertEqual(len(val), 3) #self.assertEqual(val[0], blob) #self.assertEqual(val[1], wgdb.BLOBTYPE) #self.assertEqual(val[2], "blob.pdf") # CHARTYPE wgdb.set_field(self.d, rec, 1, "c", wgdb.CHARTYPE) val = wgdb.get_field(self.d, rec, 1) self.assertEqual(val, "c") # DATETYPE wgdb.set_field(self.d, rec, 2, datetime.date(2040, 7, 24)) val = wgdb.get_field(self.d, rec, 2) self.assertTrue(isinstance(val, datetime.date)) self.assertEqual(val.day, 24) self.assertEqual(val.month, 7) self.assertEqual(val.year, 2040) # DOUBLETYPE wgdb.set_field(self.d, rec, 3, -0.94794830) val = wgdb.get_field(self.d, rec, 3) 
self.assertAlmostEqual(val, -0.94794830) # FIXPOINTTYPE wgdb.set_field(self.d, rec, 4, 549.8390, wgdb.FIXPOINTTYPE) val = wgdb.get_field(self.d, rec, 4) self.assertEqual(val, 549.8390) # INTTYPE wgdb.set_field(self.d, rec, 5, 2073741877) val = wgdb.get_field(self.d, rec, 5) self.assertEqual(val, 2073741877) wgdb.set_field(self.d, rec, 6, -10) val = wgdb.get_field(self.d, rec, 6) self.assertEqual(val, -10) # NULLTYPE wgdb.set_field(self.d, rec, 7, None) val = wgdb.get_field(self.d, rec, 7) self.assertIsNone(val) # RECORDTYPE rec2 = wgdb.create_record(self.d, 1) wgdb.set_field(self.d, rec, 8, rec2) wgdb.set_field(self.d, rec2, 0, 30755904) val = wgdb.get_field(self.d, rec, 8) self.assertTrue(wgdb.is_record(val)) self.assertEqual(wgdb.get_field(self.d, val, 0), 30755904) # STRTYPE wgdb.set_field(self.d, rec, 9, s1) val = wgdb.get_field(self.d, rec, 9) self.assertEqual(val, s1) wgdb.set_field(self.d, rec, 10, s2, wgdb.STRTYPE) val = wgdb.get_field(self.d, rec, 10) self.assertEqual(val, s2) # extra string not supported yet #wgdb.set_field(self.d, rec, 11, s3, ext_str="en") #val = wgdb.get_field(self.d, rec, 11) #self.assertEqual(val, s3) # TIMETYPE wgdb.set_field(self.d, rec, 12, datetime.time(23, 44, 6)) val = wgdb.get_field(self.d, rec, 12) self.assertTrue(isinstance(val, datetime.time)) self.assertEqual(val.hour, 23) self.assertEqual(val.minute, 44) self.assertEqual(val.second, 6) # URITYPE wgdb.set_field(self.d, rec, 13, s4, wgdb.URITYPE, s5) val = wgdb.get_field(self.d, rec, 13) self.assertEqual(val, s5 + s4) # XMLLITERALTYPE wgdb.set_field(self.d, rec, 14, s6, wgdb.XMLLITERALTYPE, s7) val = wgdb.get_field(self.d, rec, 14) self.assertEqual(val, s6) # VARTYPE # when decoded, a tuple is returned that contains the # value and database (kind of a representation of vartype). wgdb.set_field(self.d, rec, 15, 2, wgdb.VARTYPE) val = wgdb.get_field(self.d, rec, 15) self.assertEqual(type(val), type(())) self.assertEqual(len(val), 2) self.assertEqual(val[0], 2) self.assertEqual(val[1], wgdb.VARTYPE) class LowLevelQueryTest(LowLevelTest): """Helper functions for query testing""" def fetch(self, query): try: rec = wgdb.fetch(self.d, query) except wgdb.error: rec = None return rec def get_first_record(self): try: rec = wgdb.get_first_record(self.d) except wgdb.error: rec = None return rec def get_next_record(self, rec): try: rec = wgdb.get_next_record(self.d, rec) except wgdb.error: rec = None return rec class QueryTests(LowLevelQueryTest): """Test low level query functions""" def make_testdata(self, dbsize): """Generates patterned test data for the query.""" for i in range(dbsize): for j in range(50): for k in range(50): rec = wgdb.create_record(self.d, 3) c1 = str(10 * i) c2 = 100 * j c3 = float(1000 * k) wgdb.set_field(self.d, rec, 0, c1) wgdb.set_field(self.d, rec, 1, c2) wgdb.set_field(self.d, rec, 2, c3) def check_matching_rows(self, col, cond, val, expected): """Fetch all rows where "col" "cond" "val" is true (where cond is a comparison operator - equal, less than etc) Check that the val matches the field value in returned records. 
Check that the number of rows matches the expected value""" query = wgdb.make_query(self.d, arglist = [(col, cond, val)]) # XXX: should check rowcount here when it's implemented # self.assertEqual(expected, query rowcount) cnt = 0 rec = self.fetch(query) while rec is not None: dbval = wgdb.get_field(self.d, rec, col) self.assertEqual(type(val), type(dbval)) self.assertEqual(val, dbval) cnt += 1 rec = self.fetch(query) self.assertEqual(cnt, expected) def check_db_rows(self, expected): """Count db rows.""" cnt = 0 rec = self.get_first_record() while rec is not None: cnt += 1 rec = self.get_next_record(rec) self.assertEqual(cnt, expected) def test_query(self): """Tests various queries: - read pre-generated content; - update content; - read updated content; - delete rows; - check row count after deleting. """ dbsize = 10 # use a fairly small database self.make_testdata(dbsize) # Content check read queries for i in range(dbsize): val = str(10 * i) self.check_matching_rows(0, wgdb.COND_EQUAL, val, 50*50) for i in range(50): val = 100 * i self.check_matching_rows(1, wgdb.COND_EQUAL, val, dbsize*50) for i in range(50): val = float(1000 * i) self.check_matching_rows(2, wgdb.COND_EQUAL, val, dbsize*50) # Update queries for i in range(dbsize): c1 = str(10 * i) query = wgdb.make_query(self.d, arglist = [(0, wgdb.COND_EQUAL, c1)]) rec = self.fetch(query) while rec is not None: c2 = wgdb.get_field(self.d, rec, 1) wgdb.set_field(self.d, rec, 1, c2 - 34555) rec = self.fetch(query) for i in range(50): c2 = 100 * i - 34555 query = wgdb.make_query(self.d, arglist = [(1, wgdb.COND_EQUAL, c2)]) rec = self.fetch(query) while rec is not None: c3 = wgdb.get_field(self.d, rec, 2) wgdb.set_field(self.d, rec, 2, c3 + 177889.576) rec = self.fetch(query) for i in range(50): c3 = 1000 * i + 177889.576 query = wgdb.make_query(self.d, arglist = [(2, wgdb.COND_EQUAL, c3)]) rec = self.fetch(query) while rec is not None: c1val = int(wgdb.get_field(self.d, rec, 0)) c1 = str(c1val + 99) wgdb.set_field(self.d, rec, 0, c1) rec = self.fetch(query) # Content check read queries, iteration 2 for i in range(dbsize): val = str(10 * i + 99) self.check_matching_rows(0, wgdb.COND_EQUAL, val, 50*50) for i in range(50): val = 100 * i - 34555 self.check_matching_rows(1, wgdb.COND_EQUAL, val, dbsize*50) for i in range(50): val = 1000 * i + 177889.576 self.check_matching_rows(2, wgdb.COND_EQUAL, val, dbsize*50) # Delete query for i in range(dbsize): c1 = str(10 * i + 99) arglist = [ (0, wgdb.COND_EQUAL, c1), (1, wgdb.COND_GREATER, -30556), # 10 matching (2, wgdb.COND_LESSTHAN, 217889.575) # 40 matching ] query = wgdb.make_query(self.d, arglist = arglist) rec = self.fetch(query) while rec is not None: wgdb.delete_record(self.d, rec) rec = self.fetch(query) # Database scan self.check_db_rows(dbsize * (50 * 50 - 10 * 40)) class QueryParamTests(LowLevelQueryTest): """Test query parameter encoding through the wgdb module""" def test_params(self): """Test encoding parameters""" marker = "This is a marker" s1 = "ctGXioJeeUkTrxiSGaWxqFujCyWHJkmveMQXEnrHAMomjuPjKqUHlUtCVjOT" s2 = "zjXNNGYUBjmdCrLaAaKv" s3 = "GRvWOVYBMObOzWPqVFCt" s4 = "#eNijRGUJbuHoJEMxRUCQ" s5 = "http://example.com/?UQCOtBzWkdipHplZqwQF" s6 = "KqKVvVhVcxbLssirtydJ" s7 = "xsd:Name" # this row shouldn't be returned by the queries (except # the NULL query) rec0 = wgdb.create_record(self.d, 15) wgdb.set_field(self.d, rec0, 0, "This is not a marker") # this row should be returned by the queries rec = wgdb.create_record(self.d, 15) wgdb.set_field(self.d, rec, 0, marker) # CHARTYPE 
wgdb.set_field(self.d, rec, 1, "Z", wgdb.CHARTYPE) # DATETYPE wgdb.set_field(self.d, rec, 2, datetime.date(1943, 2, 28)) # DOUBLETYPE wgdb.set_field(self.d, rec, 3, 105819.387451) # FIXPOINTTYPE wgdb.set_field(self.d, rec, 4, 783.799, wgdb.FIXPOINTTYPE) # INTTYPE wgdb.set_field(self.d, rec, 5, -871043) # NULLTYPE wgdb.set_field(self.d, rec, 7, None) # RECORDTYPE wgdb.set_field(self.d, rec, 8, rec0) # STRTYPE wgdb.set_field(self.d, rec, 9, s1) wgdb.set_field(self.d, rec, 10, s2, wgdb.STRTYPE) wgdb.set_field(self.d, rec, 11, s3, ext_str="en") # TIMETYPE wgdb.set_field(self.d, rec, 12, datetime.time(11, 22, 33)) # URITYPE wgdb.set_field(self.d, rec, 13, s4, wgdb.URITYPE, s5) # XMLLITERALTYPE wgdb.set_field(self.d, rec, 14, s6, wgdb.XMLLITERALTYPE, s7) # CHARTYPE query = wgdb.make_query(self.d, arglist = [(1, wgdb.COND_EQUAL, ("Z", wgdb.CHARTYPE))]) self.assertEqual(query.res_count, 1) rec = self.fetch(query) self.assertEqual(wgdb.get_field(self.d, rec, 0), marker) self.assertIsNone(self.fetch(query)) # DATETYPE query = wgdb.make_query(self.d, arglist = [(2, wgdb.COND_EQUAL, datetime.date(1943, 2, 28))]) self.assertEqual(query.res_count, 1) rec = self.fetch(query) self.assertEqual(wgdb.get_field(self.d, rec, 0), marker) self.assertIsNone(self.fetch(query)) # DOUBLETYPE query = wgdb.make_query(self.d, arglist = [(3, wgdb.COND_EQUAL, 105819.387451)]) self.assertEqual(query.res_count, 1) rec = self.fetch(query) self.assertEqual(wgdb.get_field(self.d, rec, 0), marker) self.assertIsNone(self.fetch(query)) # FIXPOINTTYPE query = wgdb.make_query(self.d, arglist = [(4, wgdb.COND_EQUAL, (783.799, wgdb.FIXPOINTTYPE))]) self.assertEqual(query.res_count, 1) rec = self.fetch(query) self.assertEqual(wgdb.get_field(self.d, rec, 0), marker) self.assertIsNone(self.fetch(query)) # INTTYPE query = wgdb.make_query(self.d, arglist = [(5, wgdb.COND_EQUAL, -871043)]) self.assertEqual(query.res_count, 1) rec = self.fetch(query) self.assertEqual(wgdb.get_field(self.d, rec, 0), marker) self.assertIsNone(self.fetch(query)) # NULLTYPE query = wgdb.make_query(self.d, arglist = [(7, wgdb.COND_EQUAL, None)]) self.assertEqual(query.res_count, 2) self.assertIsNotNone(self.fetch(query)) self.assertIsNotNone(self.fetch(query)) self.assertIsNone(self.fetch(query)) # RECORDTYPE query = wgdb.make_query(self.d, arglist = [(8, wgdb.COND_EQUAL, rec0)]) self.assertEqual(query.res_count, 1) rec = self.fetch(query) self.assertEqual(wgdb.get_field(self.d, rec, 0), marker) self.assertIsNone(self.fetch(query)) # STRTYPE query = wgdb.make_query(self.d, arglist = [(9, wgdb.COND_EQUAL, s1)]) self.assertEqual(query.res_count, 1) rec = self.fetch(query) self.assertEqual(wgdb.get_field(self.d, rec, 0), marker) self.assertIsNone(self.fetch(query)) query = wgdb.make_query(self.d, arglist = [(10, wgdb.COND_EQUAL, (s2, wgdb.STRTYPE))]) self.assertEqual(query.res_count, 1) rec = self.fetch(query) self.assertEqual(wgdb.get_field(self.d, rec, 0), marker) self.assertIsNone(self.fetch(query)) query = wgdb.make_query(self.d, arglist = [(11, wgdb.COND_EQUAL, (s3, wgdb.STRTYPE, "en"))]) self.assertEqual(query.res_count, 1) rec = self.fetch(query) self.assertEqual(wgdb.get_field(self.d, rec, 0), marker) self.assertIsNone(self.fetch(query)) # TIMETYPE query = wgdb.make_query(self.d, arglist = [(12, wgdb.COND_EQUAL, datetime.time(11, 22, 33))]) self.assertEqual(query.res_count, 1) rec = self.fetch(query) self.assertEqual(wgdb.get_field(self.d, rec, 0), marker) self.assertIsNone(self.fetch(query)) # URITYPE query = wgdb.make_query(self.d, arglist = [(13, 
wgdb.COND_EQUAL, (s4, wgdb.URITYPE, s5))]) self.assertEqual(query.res_count, 1) rec = self.fetch(query) self.assertEqual(wgdb.get_field(self.d, rec, 0), marker) self.assertIsNone(self.fetch(query)) # XMLLITERALTYPE query = wgdb.make_query(self.d, arglist = [(14, wgdb.COND_EQUAL, (s6, wgdb.XMLLITERALTYPE, s7))]) self.assertEqual(query.res_count, 1) rec = self.fetch(query) self.assertEqual(wgdb.get_field(self.d, rec, 0), marker) self.assertIsNone(self.fetch(query)) class WhiteDBTest(unittest.TestCase): """Provide setUp()/tearDown() for test cases that use the WhiteDB module API.""" def setUp(self): self.d = whitedb.connect(shmsize=MINDBSIZE, local=1) def tearDown(self): self.d.close() def check_db_rows(self, expected): """Count db rows.""" cnt = 0 rec = self.d.first_record() while rec is not None: cnt += 1 rec = self.d.next_record(rec) self.assertEqual(cnt, expected) class WhiteDBConnection(WhiteDBTest): """Test WhiteDB connection class methods. Does not cover the functionality that is normally accessed through Cursor and Record classes""" def test_creation(self): """Tests record creation and low level scanning to retrieve records from the database.""" rec = self.d.create_record(3) self.assertTrue(isinstance(rec, whitedb.Record)) rec = self.d.atomic_create_record([0, 0, 0]) self.assertTrue(isinstance(rec, whitedb.Record)) rec = self.d.insert([0, 0, 0]) self.assertTrue(isinstance(rec, whitedb.Record)) with self.assertRaises(whitedb.DataError): self.d.insert([]) with self.assertRaises(whitedb.DataError): self.d.create_record(-3) def test_fielddata(self): """Test field data reading and writing on connection level. This would be normally accessed through the record, but we depend on these functions to check the first/next records""" rec = self.d.create_record(20) self.d.set_field(rec, 6, 372296787) # regular data self.d.set_field(rec, 13, "2467305", whitedb.wgdb.CHARTYPE) # data with encoding self.d.set_field(rec, 19, "#907735743", whitedb.wgdb.URITYPE, "http://unittest/") # data with extstr self.assertEqual(self.d.get_field(rec, 6), 372296787) self.assertEqual(self.d.get_field(rec, 13), "2") self.assertEqual(self.d.get_field(rec, 19), "http://unittest/#907735743") def test_firstnext(self): """Test fetching the first and next record""" self.d.insert([112060684]) self.d.insert([566973731]) rec = self.d.first_record() self.assertEqual(self.d.get_field(rec, 0), 112060684) rec = self.d.next_record(rec) self.assertEqual(self.d.get_field(rec, 0), 566973731) class WhiteDBRecord(WhiteDBTest): """Test WhiteDB Record class""" def test_highlevel(self): """Tests high level record functionality.""" rec = self.d.insert([197622332, (2.67985826, whitedb.wgdb.DOUBLETYPE), ("874485001", whitedb.wgdb.XMLLITERALTYPE,"xsd:integer") ]) self.assertTrue(isinstance(rec, whitedb.Record)) self.assertEqual(len(rec), 3) self.assertEqual(rec[0], 197622332) self.assertAlmostEqual(rec[1], 2.67985826) self.assertEqual(rec[2], "874485001") # XXX: test len() here once implemented. 
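    # For orientation, a sketch of the higher-level whitedb wrapper used by
    # these tests (illustrative only; assumes whitedb and wgdb are
    # importable, and local=1 gives a private in-process database as in the
    # fixtures):
    #
    #   import whitedb, wgdb
    #   d = whitedb.connect(shmsize=8000000, local=1)
    #   d.insert(["row", 1])
    #   cur = d.cursor()
    #   cur.execute(arglist=[(1, wgdb.COND_EQUAL, 1)])
    #   for row in cur.fetchall():
    #       print(row[0])               # -> "row"
    #   d.close()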
def test_deletion(self): """Tests deleting a record""" self.check_db_rows(0) rec = self.d.insert([None]) self.check_db_rows(1) rec.delete() self.check_db_rows(0) def test_update(self): """Test record updating""" rec = self.d.insert([None, None, None, None, 630781304]) rec.update(["This", "is", "an", "update", 345849564]) self.assertEqual(rec[0], "This") self.assertEqual(rec[1], "is") self.assertEqual(rec[2], "an") self.assertEqual(rec[3], "update") self.assertEqual(rec[4], 345849564) with self.assertRaises(whitedb.wgdb.error): # too long rec.update([None, None, None, None, 630781304, None]) # nevertheless, fields that fit are overwritten self.assertEqual(rec[4], 630781304) def test_fielddata(self): """Test set and get field functions""" rec = self.d.create_record(3) rec.set_field(0, "168691904") with self.assertRaises(TypeError): rec.set_field(1, ("notanumber", whitedb.wgdb.INTTYPE)) with self.assertRaises(TypeError): rec.set_field(2, (248557089, 959010401)) with self.assertRaises(whitedb.DataError): rec.set_field(3, "no such field") self.assertEqual(rec.get_field(0), "168691904") self.assertEqual(rec.get_field(1), None) self.assertEqual(rec.get_field(2), None) with self.assertRaises(whitedb.DataError): rec.get_field(3) def test_getsize(self): """Test record size helper function""" rec = self.d.create_record(275) self.assertEqual(rec.get_size(), 275) l = [ 0, 0, 0, 0 ] rec = self.d.insert(l) self.assertEqual(rec.get_size(), len(l)) def test_linkrec(self): """Test linked records""" rec = self.d.insert([737483554]) rec2 = self.d.insert([859310257, rec]) self.assertTrue(isinstance(rec2[1], whitedb.Record)) self.assertEqual(rec2[1][0], 737483554) rec[0] = 284107294 self.assertEqual(rec2.get_field(1).get_field(0), 284107294) class WhiteDBCursor(WhiteDBTest): """Test WhiteDB Cursor class""" def make_testdata(self): rows = [ [5038, 933, 2513, 3743, 1068], [1459, 6185, 8457, 277, 171], [7261, 9882, 172, 7034, 755], [3751, 3690, 9976, 1225, 5825], [9910, 8478, 595, 924, 8804], [6801, 745, 5993, 6331, 7807], [5255, 2481, 595, 5685, 8532], [4579, 9155, 595, 478, 1167], [6753, 3518, 5928, 9286, 1637], [2781, 3919, 786, 9286, 7953] ] for row in rows: self.d.insert(row) def count_results(self, cur): cnt = 0 while cur.fetchone() is not None: cnt += 1 return cnt def test_basic(self): """Tests record creation and low level scanning to retrieve records from the database.""" cur = self.d.cursor() self.assertTrue(isinstance(cur, whitedb.Cursor)) cur.execute() self.assertIsNone(cur.fetchone()) self.d.insert([None, None, 846516765]) cur.execute() self.assertEqual(cur.rowcount, 1) rec = cur.fetchone() self.assertEqual(rec[2], 846516765) def test_matchrec(self): """Test query with a match record""" self.make_testdata() wildcard = (0, whitedb.wgdb.VARTYPE) cur = self.d.cursor() # list matchrec cur.execute(matchrec = [wildcard, wildcard, wildcard, 9286, wildcard]) self.assertEqual(cur.rowcount, 2) self.assertEqual(self.count_results(cur), 2) cur.execute(matchrec = [wildcard, wildcard, wildcard, 9286, 7953]) self.assertEqual(cur.rowcount, 1) self.assertEqual(self.count_results(cur), 1) cur.execute(matchrec = [None, wildcard, wildcard, 9286, 7953]) self.assertEqual(cur.rowcount, 0) self.assertEqual(self.count_results(cur), 0) # shorter record with matching field values cur.execute(matchrec = [5038, 933, 2513]) self.assertEqual(cur.rowcount, 1) self.assertEqual(self.count_results(cur), 1) # actual record matchrec rec = self.d.insert([wildcard, wildcard, 595, wildcard, wildcard]) cur.execute(matchrec = rec) 
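        # three of the pre-made rows have 595 in column 2, and the match
        # record itself was inserted into the database, hence 4 matches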
self.assertEqual(cur.rowcount, 4) self.assertEqual(self.count_results(cur), 4) # shorter record with matching field values rec = self.d.insert([2781, 3919, 786, 9286]) cur.execute(matchrec = rec) self.assertEqual(cur.rowcount, 2) self.assertEqual(self.count_results(cur), 2) def test_arglist(self): """Test query with an argument list""" self.make_testdata() cur = self.d.cursor() # one condition: COND_EQUAL cur.execute(arglist = [(2, wgdb.COND_EQUAL, 595)]) self.assertEqual(cur.rowcount, 3) self.assertEqual(self.count_results(cur), 3) # inverse of previous query: COND_NOT_EQUAL cur.execute(arglist = [(2, wgdb.COND_NOT_EQUAL, 595)]) self.assertEqual(cur.rowcount, 7) self.assertEqual(self.count_results(cur), 7) # two conditions: COND_LESSTHAN, COND_GREATER cur.execute(arglist = [(0, wgdb.COND_LESSTHAN, 6801), (4, wgdb.COND_GREATER, 1637)]) self.assertEqual(cur.rowcount, 3) self.assertEqual(self.count_results(cur), 3) # inclusive versions of previous query: COND_LTEQUAL, COND_GTEQUAL cur.execute(arglist = [(0, wgdb.COND_LTEQUAL, 6801), (4, wgdb.COND_GTEQUAL, 1637)]) self.assertEqual(cur.rowcount, 5) self.assertEqual(self.count_results(cur), 5) def test_fetch(self): """Test the fetchall() and fetchone() functions""" self.make_testdata() cur = self.d.cursor() with self.assertRaises(whitedb.ProgrammingError): cur.fetchone() cur.execute(arglist = [(3, wgdb.COND_NOT_EQUAL, 9286)]) self.assertEqual(cur.rowcount, 8) rows = cur.fetchall() self.assertEqual(len(rows), 8) for row in rows: self.assertNotEqual(row[3], 9286) cur.execute(arglist = [(3, wgdb.COND_NOT_EQUAL, 9286)]) self.assertEqual(cur.rowcount, 8) cnt = 0 row = cur.fetchone() while row is not None: cnt += 1 self.assertNotEqual(row[3], 9286) row = cur.fetchone() self.assertEqual(cnt, 8) cur.execute(arglist = [(3, wgdb.COND_NOT_EQUAL, 9286)]) cur.close() with self.assertRaises(whitedb.ProgrammingError): cur.fetchone() if __name__ == "__main__": unittest.main() whitedb-0.7.2/Python/wgdbmodule.c000066400000000000000000001577021226454622500167500ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Priit Järv 2009, 2010, 2013 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . 
* */ /** @file wgdbmodule.c * Python extension module for accessing WhiteDB database * */ /* ====== Includes =============== */ #include #include #ifdef _WIN32 #define WIN32_LEAN_AND_MEAN #include #endif #include "dbapi.h" /* ====== Private headers and defs ======== */ #if PY_VERSION_HEX >= 0x03000000 #define PYTHON3 #define ENCODEERR "surrogateescape" /* handling of charset mismatches */ #if PY_VERSION_HEX >= 0x03030000 #define HAVE_LOCALEENC /* locale dependent string encode function exists */ #endif #endif struct module_state { PyObject *wgdb_error; }; typedef struct { PyObject_HEAD void *db; int local; } wg_database; typedef struct { PyObject_HEAD void *rec; } wg_record; typedef struct { PyObject_HEAD wg_query *query; wg_database *db; wg_query_arg *arglist; int argc; void *matchrec; int reclen; } wg_query_ob; /* append _ob to avoid name clash with dbapi.h */ /* ======= Private protos ================ */ static PyObject *wgdb_attach_database(PyObject *self, PyObject *args, PyObject *kwds); static PyObject *wgdb_attach_existing_database(PyObject *self, PyObject *args); static PyObject *wgdb_delete_database(PyObject *self, PyObject *args); static PyObject *wgdb_detach_database(PyObject *self, PyObject *args); static PyObject *wgdb_create_record(PyObject *self, PyObject *args); static PyObject *wgdb_create_raw_record(PyObject *self, PyObject *args); static PyObject *wgdb_get_first_record(PyObject *self, PyObject *args); static PyObject *wgdb_get_next_record(PyObject *self, PyObject *args); static PyObject *wgdb_get_record_len(PyObject *self, PyObject *args); static PyObject *wgdb_is_record(PyObject *self, PyObject *args); static PyObject *wgdb_delete_record(PyObject *self, PyObject *args); static wg_int pytype_to_wgtype(PyObject *data, wg_int ftype); static wg_int encode_pyobject_null(wg_database *db); static wg_int encode_pyobject_record(wg_database *db, PyObject *data); static wg_int encode_pyobject_int(wg_database *db, PyObject *data, int param); static wg_int encode_pyobject_double(wg_database *db, PyObject *data, int param); static wg_int encode_pyobject_str(wg_database *db, PyObject *data, char *ext_str, int param); static wg_int encode_pyobject_uri(wg_database *db, PyObject *data, char *ext_str, int param); static wg_int encode_pyobject_xmlliteral(wg_database *db, PyObject *data, char *ext_str, int param); static wg_int encode_pyobject_char(wg_database *db, PyObject *data, int param); static wg_int encode_pyobject_fixpoint(wg_database *db, PyObject *data, int param); static wg_int encode_pyobject_date(wg_database *db, PyObject *data, int param); static wg_int encode_pyobject_time(wg_database *db, PyObject *data, int param); static wg_int encode_pyobject_var(wg_database *db, PyObject *data, int param); static wg_int encode_pyobject(wg_database *db, PyObject *data, wg_int ftype, char *ext_str, int param); static wg_int encode_pyobject_ext(PyObject *self, wg_database *db, PyObject *obj, int param); static PyObject *wgdb_set_field(PyObject *self, PyObject *args, PyObject *kwds); static PyObject *wgdb_set_new_field(PyObject *self, PyObject *args, PyObject *kwds); static PyObject *wgdb_get_field(PyObject *self, PyObject *args); static PyObject *wgdb_start_write(PyObject *self, PyObject *args); static PyObject *wgdb_end_write(PyObject *self, PyObject *args); static PyObject *wgdb_start_read(PyObject *self, PyObject *args); static PyObject *wgdb_end_read(PyObject *self, PyObject *args); static int parse_query_params(PyObject *self, PyObject *args, PyObject *kwds, wg_query_ob *query); static 
PyObject * wgdb_make_query(PyObject *self, PyObject *args, PyObject *kwds); static PyObject * wgdb_fetch(PyObject *self, PyObject *args); static PyObject * wgdb_free_query(PyObject *self, PyObject *args); static void free_query(wg_query_ob *obj); static void wg_database_dealloc(wg_database *obj); static void wg_query_dealloc(wg_query_ob *obj); static PyObject *wg_database_repr(wg_database *obj); static PyObject *wg_record_repr(wg_record *obj); static PyObject *wg_query_repr(wg_query_ob *obj); static PyObject *wg_query_get_res_count(wg_query_ob *obj, void *closure); static int wg_query_set_res_count(wg_query_ob *obj, PyObject *value, void *closure); static void wgdb_error_setstring(PyObject *self, char *err); /* ============= Private vars ============ */ /** Module state, contains the exception object */ #ifndef PYTHON3 static struct module_state _state; #endif /** Database object type */ static PyTypeObject wg_database_type = { #ifndef PYTHON3 PyObject_HEAD_INIT(NULL) 0, /*ob_size*/ #else PyVarObject_HEAD_INIT(NULL, 0) #endif "wgdb.Database", /*tp_name*/ sizeof(wg_database), /*tp_basicsize*/ 0, /*tp_itemsize*/ (destructor) wg_database_dealloc, /*tp_dealloc*/ 0, /*tp_print*/ 0, /*tp_getattr*/ 0, /*tp_setattr*/ 0, /*tp_compare*/ (reprfunc) wg_database_repr, /*tp_repr*/ 0, /*tp_as_number*/ 0, /*tp_as_sequence*/ 0, /*tp_as_mapping*/ 0, /*tp_hash */ 0, /*tp_call*/ (reprfunc) wg_database_repr, /*tp_str*/ 0, /*tp_getattro*/ 0, /*tp_setattro*/ 0, /*tp_as_buffer*/ Py_TPFLAGS_DEFAULT, /*tp_flags*/ "WhiteDB database object", /* tp_doc */ }; /** Record object type */ static PyTypeObject wg_record_type = { #ifndef PYTHON3 PyObject_HEAD_INIT(NULL) 0, /*ob_size*/ #else PyVarObject_HEAD_INIT(NULL, 0) #endif "wgdb.Record", /*tp_name*/ sizeof(wg_record), /*tp_basicsize*/ 0, /*tp_itemsize*/ 0, /*tp_dealloc*/ 0, /*tp_print*/ 0, /*tp_getattr*/ 0, /*tp_setattr*/ 0, /*tp_compare*/ (reprfunc) wg_record_repr, /*tp_repr*/ 0, /*tp_as_number*/ 0, /*tp_as_sequence*/ 0, /*tp_as_mapping*/ 0, /*tp_hash */ 0, /*tp_call*/ (reprfunc) wg_record_repr, /*tp_str*/ 0, /*tp_getattro*/ 0, /*tp_setattro*/ 0, /*tp_as_buffer*/ Py_TPFLAGS_DEFAULT, /*tp_flags*/ "WhiteDB record object", /* tp_doc */ }; /** Data accessor functions for the Query type */ static PyGetSetDef wg_query_getset[] = { {"res_count", /* attribyte name */ (getter) wg_query_get_res_count, (setter) wg_query_set_res_count, "Number of rows in the result set", /* doc */ NULL}, /* closure, not used here */ {NULL} }; /** Query object type */ static PyTypeObject wg_query_type = { #ifndef PYTHON3 PyObject_HEAD_INIT(NULL) 0, /*ob_size*/ #else PyVarObject_HEAD_INIT(NULL, 0) #endif "wgdb.Query", /*tp_name*/ sizeof(wg_query_ob), /*tp_basicsize*/ 0, /*tp_itemsize*/ (destructor) wg_query_dealloc, /*tp_dealloc*/ 0, /*tp_print*/ 0, /*tp_getattr*/ 0, /*tp_setattr*/ 0, /*tp_compare*/ (reprfunc) wg_query_repr, /*tp_repr*/ 0, /*tp_as_number*/ 0, /*tp_as_sequence*/ 0, /*tp_as_mapping*/ 0, /*tp_hash */ 0, /*tp_call*/ (reprfunc) wg_query_repr, /*tp_str*/ 0, /*tp_getattro*/ 0, /*tp_setattro*/ 0, /*tp_as_buffer*/ Py_TPFLAGS_DEFAULT, /*tp_flags*/ "WhiteDB query object", /* tp_doc */ 0, /* tp_traverse */ 0, /* tp_clear */ 0, /* tp_richcompare */ 0, /* tp_weaklistoffset */ 0, /* tp_iter */ 0, /* tp_iternext */ 0, /* tp_methods */ 0, /* tp_members */ wg_query_getset, /* tp_getset */ }; /** Method table */ static PyMethodDef wgdb_methods[] = { {"attach_database", (PyCFunction) wgdb_attach_database, METH_VARARGS | METH_KEYWORDS, "Connect to a shared memory database. 
If the database with the "\ "given name does not exist, it is created."}, {"attach_existing_database", (PyCFunction) wgdb_attach_existing_database, METH_VARARGS, "Connect to a shared memory database. Fails if the database with the "\ "given name does not exist."}, {"delete_database", wgdb_delete_database, METH_VARARGS, "Delete a shared memory database."}, {"detach_database", wgdb_detach_database, METH_VARARGS, "Detach from shared memory database."}, {"create_record", wgdb_create_record, METH_VARARGS, "Create a record with given length."}, {"create_raw_record", wgdb_create_raw_record, METH_VARARGS, "Create a record without indexing the fields."}, {"get_first_record", wgdb_get_first_record, METH_VARARGS, "Fetch first record from database."}, {"get_next_record", wgdb_get_next_record, METH_VARARGS, "Fetch next record from database."}, {"get_record_len", wgdb_get_record_len, METH_VARARGS, "Get record length (number of fields)."}, {"is_record", wgdb_is_record, METH_VARARGS, "Determine if object is a WhiteDB record."}, {"delete_record", wgdb_delete_record, METH_VARARGS, "Delete a record."}, {"set_field", (PyCFunction) wgdb_set_field, METH_VARARGS | METH_KEYWORDS, "Set field value."}, {"set_new_field", (PyCFunction) wgdb_set_new_field, METH_VARARGS | METH_KEYWORDS, "Set field value (assumes no previous content)."}, {"get_field", wgdb_get_field, METH_VARARGS, "Get field data decoded to corresponding Python type."}, {"start_write", wgdb_start_write, METH_VARARGS, "Start writing transaction."}, {"end_write", wgdb_end_write, METH_VARARGS, "Finish writing transaction."}, {"start_read", wgdb_start_read, METH_VARARGS, "Start reading transaction."}, {"end_read", wgdb_end_read, METH_VARARGS, "Finish reading transaction."}, {"make_query", (PyCFunction) wgdb_make_query, METH_VARARGS | METH_KEYWORDS, "Create a query object."}, {"fetch", wgdb_fetch, METH_VARARGS, "Fetch next record from a query."}, {"free_query", wgdb_free_query, METH_VARARGS, "Unallocates the memory (local and shared) used by the query."}, {NULL, NULL, 0, NULL} /* terminator */ }; #ifdef PYTHON3 static struct PyModuleDef wgdb_def = { PyModuleDef_HEAD_INIT, "wgdb", /* name of module */ "WhiteDB database adapter", /* module documentation, may be NULL */ sizeof(struct module_state), /* size of per-interpreter state */ wgdb_methods }; #endif /* ============== Functions ============== */ /* Wrapped wgdb API * uses wg_database_type object to store the database pointer * when making calls from python. This type is not available * generally (using non-restricted values for the pointer * would cause segfaults), only by calling wgdb_attach_database(). */ /* Functions for attaching and deleting */ /** Attach to memory database. * Python wrapper to wg_attach_database() and wg_attach_local_database() */ static PyObject * wgdb_attach_database(PyObject *self, PyObject *args, PyObject *kwds) { wg_database *db; char *shmname = NULL; wg_int sz = 0; wg_int local = 0; static char *kwlist[] = {"shmname", "size", "local", NULL}; if(!PyArg_ParseTupleAndKeywords(args, kwds, "|snn", kwlist, &shmname, &sz, &local)) return NULL; db = (wg_database *) wg_database_type.tp_alloc(&wg_database_type, 0); if(!db) return NULL; /* Now try to actually connect. Note that this may create * a new database if none is found with a matching name. In case of * a local database, a new one is allocated every time. 
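   * The local flag is remembered in the returned object so that
   * wgdb_detach_database() knows to delete the local database rather
   * than detach from shared memory.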
*/ if(!local) db->db = (void *) wg_attach_database(shmname, sz); else db->db = (void *) wg_attach_local_database(sz); if(!db->db) { wgdb_error_setstring(self, "Failed to attach to database."); wg_database_type.tp_free(db); return NULL; } db->local = local; /* Py_INCREF(db);*/ /* XXX: not needed? if we increment here, the object is never freed, even if it's unused */ return (PyObject *) db; } /** Attach to memory database. * Python wrapper to wg_attach_existing_database() */ static PyObject *wgdb_attach_existing_database(PyObject *self, PyObject *args) { wg_database *db; char *shmname = NULL; if(!PyArg_ParseTuple(args, "|s", &shmname)) return NULL; db = (wg_database *) wg_database_type.tp_alloc(&wg_database_type, 0); if(!db) return NULL; /* Try to attach to an existing database. Fails if a database * with a matching name is not found. Only applies to shared * memory databases. */ db->db = (void *) wg_attach_existing_database(shmname); if(!db->db) { wgdb_error_setstring(self, "Failed to attach to database."); wg_database_type.tp_free(db); return NULL; } db->local = 0; /* Py_INCREF(db);*/ /* XXX: not needed? if we increment here, the object is never freed, even if it's unused */ return (PyObject *) db; } /** Delete memory database. * Python wrapper to wg_delete_database() */ static PyObject * wgdb_delete_database(PyObject *self, PyObject *args) { char *shmname = NULL; int err = 0; if(!PyArg_ParseTuple(args, "|s", &shmname)) return NULL; err = wg_delete_database(shmname); if(err) { wgdb_error_setstring(self, "Failed to delete the database."); return NULL; } Py_INCREF(Py_None); return Py_None; } /** Detach from memory database. * Python wrapper to wg_detach_database() * Detaching is generally SysV-specific (so under Win32 this * is currently a no-op). * In case of a local database, wg_delete_local_database() is * called instead. */ static PyObject * wgdb_detach_database(PyObject *self, PyObject *args) { PyObject *db = NULL; if(!PyArg_ParseTuple(args, "O!", &wg_database_type, &db)) return NULL; /* Only try detaching if we have a valid pointer. */ if(((wg_database *) db)->db) { if(((wg_database *) db)->local) { /* Local database should be deleted instead */ wg_delete_local_database(((wg_database *) db)->db); } else if(wg_detach_database(((wg_database *) db)->db) < 0) { wgdb_error_setstring(self, "Failed to detach from database."); return NULL; } ((wg_database *) db)->db = NULL; /* mark as detached */ } Py_INCREF(Py_None); return Py_None; } /* Functions to manipulate records. The record is also * represented as a custom type to avoid dealing with word * size issues on different platforms. So the type is essentially * a container for the record pointer. */ /** Create a record with given length. * Python wrapper to wg_create_record() */ static PyObject * wgdb_create_record(PyObject *self, PyObject *args) { PyObject *db = NULL; wg_int length = 0; wg_record *rec; if(!PyArg_ParseTuple(args, "O!n", &wg_database_type, &db, &length)) return NULL; /* Build a new record object */ rec = (wg_record *) wg_record_type.tp_alloc(&wg_record_type, 0); if(!rec) return NULL; rec->rec = wg_create_record(((wg_database *) db)->db, length); if(!rec->rec) { wgdb_error_setstring(self, "Failed to create a record."); wg_record_type.tp_free(rec); return NULL; } /* Py_INCREF(rec);*/ /* XXX: not needed? */ return (PyObject *) rec; } /** Create a record without indexing the fields. 
* Python wrapper to wg_create_raw_record() */ static PyObject * wgdb_create_raw_record(PyObject *self, PyObject *args) { PyObject *db = NULL; wg_int length = 0; wg_record *rec; if(!PyArg_ParseTuple(args, "O!n", &wg_database_type, &db, &length)) return NULL; /* Build a new record object */ rec = (wg_record *) wg_record_type.tp_alloc(&wg_record_type, 0); if(!rec) return NULL; rec->rec = wg_create_raw_record(((wg_database *) db)->db, length); if(!rec->rec) { wgdb_error_setstring(self, "Failed to create a record."); wg_record_type.tp_free(rec); return NULL; } return (PyObject *) rec; } /** Fetch first record from database. * Python wrapper to wg_get_first_record() */ static PyObject * wgdb_get_first_record(PyObject *self, PyObject *args) { PyObject *db = NULL; wg_record *rec; if(!PyArg_ParseTuple(args, "O!", &wg_database_type, &db)) return NULL; /* Build a new record object */ rec = (wg_record *) wg_record_type.tp_alloc(&wg_record_type, 0); if(!rec) return NULL; rec->rec = wg_get_first_record(((wg_database *) db)->db); if(!rec->rec) { wgdb_error_setstring(self, "Failed to fetch a record."); wg_record_type.tp_free(rec); return NULL; } Py_INCREF(rec); return (PyObject *) rec; } /** Fetch next record from database. * Python wrapper to wg_get_next_record() */ static PyObject * wgdb_get_next_record(PyObject *self, PyObject *args) { PyObject *db = NULL, *prev = NULL; wg_record *rec; if(!PyArg_ParseTuple(args, "O!O!", &wg_database_type, &db, &wg_record_type, &prev)) return NULL; /* Build a new record object */ rec = (wg_record *) wg_record_type.tp_alloc(&wg_record_type, 0); if(!rec) return NULL; rec->rec = wg_get_next_record(((wg_database *) db)->db, ((wg_record *) prev)->rec); if(!rec->rec) { wgdb_error_setstring(self, "Failed to fetch a record."); wg_record_type.tp_free(rec); return NULL; } Py_INCREF(rec); return (PyObject *) rec; } /** Get record length (number of fields). * Python wrapper to wg_get_record_len() */ static PyObject * wgdb_get_record_len(PyObject *self, PyObject *args) { PyObject *db = NULL, *rec = NULL; wg_int len = 0; if(!PyArg_ParseTuple(args, "O!O!", &wg_database_type, &db, &wg_record_type, &rec)) return NULL; len = wg_get_record_len(((wg_database *) db)->db, ((wg_record *) rec)->rec); if(len < 0) { wgdb_error_setstring(self, "Failed to get the record length."); return NULL; } return Py_BuildValue("i", (int) len); } /** Determine, if object is a record * Instead of exposing the record type directly as wgdb.Record, * we provide this function. The reason is that we do not want * these objects to be instantiated from Python, as such instances * would have no valid record pointer to the memory database. */ static PyObject * wgdb_is_record(PyObject *self, PyObject *args) { PyObject *ob = NULL; if(!PyArg_ParseTuple(args, "O", &ob)) return NULL; if(PyObject_TypeCheck(ob, &wg_record_type)) return Py_BuildValue("i", 1); else return Py_BuildValue("i", 0); } /** Delete record. * Python wrapper to wg_delete_record() */ static PyObject * wgdb_delete_record(PyObject *self, PyObject *args) { PyObject *db = NULL, *rec = NULL; wg_int err = 0; if(!PyArg_ParseTuple(args, "O!O!", &wg_database_type, &db, &wg_record_type, &rec)) return NULL; err = wg_delete_record(((wg_database *) db)->db, ((wg_record *) rec)->rec); if(err == -1) { wgdb_error_setstring(self, "Record has references."); return NULL; } else if(err < -1) { wgdb_error_setstring(self, "Failed to delete record."); return NULL; } Py_INCREF(Py_None); return Py_None; } /* Functions to manipulate field contents. 
* * Storing data: the Python object is first converted to an appropriate * C data. Then wg_encode_*() is used to convert it to WhiteDB encoded * field data (possibly storing the actual data in the database, if the * object itself is hashed or does not fit in a field). The encoded data * is then stored with wg_set_field() or wg_set_new_field() as appropriate. * * Reading data: encoded field data is read using wg_get_field() and * examined to determine the type. If the type is recognized, the data * is converted to appropriate C data using wg_decode_*() family of * functions and finally to a Python object. */ /** Determine matching wgdb type of a Python object. * ftype argument is a type hint in some cases where there's * ambiguity due to multiple matching wgdb types. * * returns -1 if the type is known, but the type hint is invalid. * returns -2 if type is not recognized */ static wg_int pytype_to_wgtype(PyObject *data, wg_int ftype) { if(data==Py_None) { if(!ftype) return WG_NULLTYPE; else if(ftype!=WG_NULLTYPE) return -1; } #ifndef PYTHON3 else if(PyInt_Check(data)) { #else else if(PyLong_Check(data)) { #endif if(!ftype) return WG_INTTYPE; else if(ftype!=WG_INTTYPE && ftype!=WG_VARTYPE) return -1; } else if(PyFloat_Check(data)) { if(!ftype) return WG_DOUBLETYPE; else if(ftype!=WG_DOUBLETYPE && ftype!=WG_FIXPOINTTYPE) return -1; } #ifndef PYTHON3 else if(PyString_Check(data)) { #else else if(PyUnicode_Check(data)) { #endif if(!ftype) return WG_STRTYPE; else if(ftype!=WG_STRTYPE && ftype!=WG_CHARTYPE &&\ ftype!=WG_URITYPE && ftype!=WG_XMLLITERALTYPE) return -1; } else if(PyObject_TypeCheck(data, &wg_record_type)) { if(!ftype) return WG_RECORDTYPE; else if(ftype!=WG_RECORDTYPE) return -1; } else if(PyDate_Check(data)) { if(!ftype) return WG_DATETYPE; else if(ftype!=WG_DATETYPE) return -1; } else if(PyTime_Check(data)) { if(!ftype) return WG_TIMETYPE; else if(ftype!=WG_TIMETYPE) return -1; } else /* Nothing matched */ return -2; /* User-selected type was suitable */ return ftype; } /** Encode an atomic value of type WG_NULLTYPE * Always succeeds. 
*/ static wg_int encode_pyobject_null(wg_database *db) { return wg_encode_null(db->db, 0); } /** Encode an atomic value of type WG_RECORDTYPE * returns WG_ILLEGAL on failure */ static wg_int encode_pyobject_record(wg_database *db, PyObject *data) { return wg_encode_record(db->db, ((wg_record *) data)->rec); } /** Encode an atomic value of type WG_INTTYPE * returns WG_ILLEGAL on failure * if param is 1, the storage will be allocated in local memory (intended * for encoding query parameters without write locking) */ static wg_int encode_pyobject_int(wg_database *db, PyObject *data, int param) { wg_int intdata; #ifndef PYTHON3 intdata = (wg_int) PyInt_AsLong(data); #else intdata = (wg_int) PyLong_AsLong(data); #endif if(!param) { return wg_encode_int(db->db, intdata); } else { return wg_encode_query_param_int(db->db, intdata); } } /** Encode an atomic value of type WG_DOUBLETYPE * returns WG_ILLEGAL on failure * if param is 1, the storage will be allocated in local memory (intended * for encoding query parameters without write locking) */ static wg_int encode_pyobject_double(wg_database *db, PyObject *data, int param) { if(!param) { return wg_encode_double(db->db, (double) PyFloat_AsDouble(data)); } else { return wg_encode_query_param_double(db->db, (double) PyFloat_AsDouble(data)); } } /** Encode an atomic value of type WG_STRTYPE * returns WG_ILLEGAL on failure * if param is 1, the storage will be allocated in local memory (intended * for encoding query parameters without write locking) */ static wg_int encode_pyobject_str(wg_database *db, PyObject *data, char *ext_str, int param) { char *s; #ifndef PYTHON3 s = PyString_AsString(data); #elif defined(HAVE_LOCALEENC) s = PyBytes_AsString(PyUnicode_EncodeLocale(data, ENCODEERR)); #else s = PyBytes_AsString(PyUnicode_AsEncodedString(data, NULL, ENCODEERR)); #endif /* wg_encode_str is not guaranteed to check for NULL pointer */ if(s) { if(!param) { return wg_encode_str(db->db, s, ext_str); } else { return wg_encode_query_param_str(db->db, s, ext_str); } } else { return WG_ILLEGAL; } } /** Encode an atomic value of type WG_URITYPE * returns WG_ILLEGAL on failure * if param is 1, the storage will be allocated in local memory (intended * for encoding query parameters without write locking) */ static wg_int encode_pyobject_uri(wg_database *db, PyObject *data, char *ext_str, int param) { char *s; #ifndef PYTHON3 s = PyString_AsString(data); #elif defined(HAVE_LOCALEENC) s = PyBytes_AsString(PyUnicode_EncodeLocale(data, ENCODEERR)); #else s = PyBytes_AsString(PyUnicode_AsEncodedString(data, NULL, ENCODEERR)); #endif /* wg_encode_str is not guaranteed to check for NULL pointer */ if(s) { if(!param) { return wg_encode_uri(db->db, s, ext_str); } else { return wg_encode_query_param_uri(db->db, s, ext_str); } } else { return WG_ILLEGAL; } } /** Encode an atomic value of type WG_XMLLITERALTYPE * returns WG_ILLEGAL on failure * if param is 1, the storage will be allocated in local memory (intended * for encoding query parameters without write locking) */ static wg_int encode_pyobject_xmlliteral(wg_database *db, PyObject *data, char *ext_str, int param) { char *s; #ifndef PYTHON3 s = PyString_AsString(data); #elif defined(HAVE_LOCALEENC) s = PyBytes_AsString(PyUnicode_EncodeLocale(data, ENCODEERR)); #else s = PyBytes_AsString(PyUnicode_AsEncodedString(data, NULL, ENCODEERR)); #endif /* wg_encode_str is not guaranteed to check for NULL pointer */ if(s) { if(!param) { return wg_encode_xmlliteral(db->db, s, ext_str); } else { return 
wg_encode_query_param_xmlliteral(db->db, s, ext_str); } } else { return WG_ILLEGAL; } } /** Encode an atomic value of type WG_CHARTYPE * returns WG_ILLEGAL on failure * if param is 1, value is encoded as a query parameter. */ static wg_int encode_pyobject_char(wg_database *db, PyObject *data, int param) { char *s; #ifndef PYTHON3 s = PyString_AsString(data); #elif defined(HAVE_LOCALEENC) s = PyBytes_AsString(PyUnicode_EncodeLocale(data, ENCODEERR)); #else s = PyBytes_AsString(PyUnicode_AsEncodedString(data, NULL, ENCODEERR)); #endif /* wg_encode_str is not guaranteed to check for NULL pointer */ if(s) { if(!param) { return wg_encode_char(db->db, s[0]); } else { return wg_encode_query_param_char(db->db, s[0]); } } else { return WG_ILLEGAL; } } /** Encode an atomic value of type WG_FIXPOINTTYPE * returns WG_ILLEGAL on failure * if param is 1, value is encoded as a query parameter. */ static wg_int encode_pyobject_fixpoint(wg_database *db, PyObject *data, int param) { if(!param) { return wg_encode_fixpoint(db->db, (double) PyFloat_AsDouble(data)); } else { return wg_encode_query_param_fixpoint(db->db, (double) PyFloat_AsDouble(data)); } } /** Encode an atomic value of type WG_DATETYPE * returns WG_ILLEGAL on failure * if param is 1, value is encoded as a query parameter. */ static wg_int encode_pyobject_date(wg_database *db, PyObject *data, int param) { int datedata = wg_ymd_to_date(db->db, PyDateTime_GET_YEAR(data), PyDateTime_GET_MONTH(data), PyDateTime_GET_DAY(data)); if(datedata > 0) { if(!param) { return wg_encode_date(db->db, datedata); } else { return wg_encode_query_param_date(db->db, datedata); } } else { return WG_ILLEGAL; } } /** Encode an atomic value of type WG_TIMETYPE * returns WG_ILLEGAL on failure * if param is 1, value is encoded as a query parameter. */ static wg_int encode_pyobject_time(wg_database *db, PyObject *data, int param) { int timedata = wg_hms_to_time(db->db, PyDateTime_TIME_GET_HOUR(data), PyDateTime_TIME_GET_MINUTE(data), PyDateTime_TIME_GET_SECOND(data), PyDateTime_TIME_GET_MICROSECOND(data)/10000); if(timedata >= 0) { if(!param) { return wg_encode_time(db->db, timedata); } else { return wg_encode_query_param_time(db->db, timedata); } } else { return WG_ILLEGAL; } } /** Encode an atomic value of type WG_VARTYPE * returns WG_ILLEGAL on failure * if param is 1, value is encoded as a query parameter. */ static wg_int encode_pyobject_var(wg_database *db, PyObject *data, int param) { int intdata; #ifndef PYTHON3 intdata = (int) PyInt_AsLong(data); #else intdata = (int) PyLong_AsLong(data); #endif if(!param) { return wg_encode_var(db->db, intdata); } else { return wg_encode_query_param_var(db->db, intdata); } } /** Encode Python object as wgdb value of specific type. * returns WG_ILLEGAL if the conversion is not possible. The * database API may also return WG_ILLEGAL. 
*/ static wg_int encode_pyobject(wg_database *db, PyObject *data, wg_int ftype, char *ext_str, int param) { switch(ftype) { case WG_NULLTYPE: return encode_pyobject_null(db); case WG_RECORDTYPE: return encode_pyobject_record(db, data); case WG_INTTYPE: return encode_pyobject_int(db, data, param); case WG_DOUBLETYPE: return encode_pyobject_double(db, data, param); case WG_STRTYPE: return encode_pyobject_str(db, data, ext_str, param); case WG_URITYPE: return encode_pyobject_uri(db, data, ext_str, param); case WG_XMLLITERALTYPE: return encode_pyobject_xmlliteral(db, data, ext_str, param); case WG_CHARTYPE: return encode_pyobject_char(db, data, param); case WG_FIXPOINTTYPE: return encode_pyobject_fixpoint(db, data, param); case WG_DATETYPE: return encode_pyobject_date(db, data, param); case WG_TIMETYPE: return encode_pyobject_time(db, data, param); case WG_VARTYPE: return encode_pyobject_var(db, data, param); default: break; } /* Handle unknown type */ return WG_ILLEGAL; } /** Encode Python object as wgdb value * The object may be an immediate value or a tuple containing * extended type information. * * Conversion rules: * Immediate Python value -> use the default wgdb type * (value, ftype) -> use the provided field type (if possible) * (value, ftype, ext_str) -> use the field type and extra string */ static wg_int encode_pyobject_ext(PyObject *self, wg_database *db, PyObject *obj, int param) { PyObject *data; wg_int enc, ftype = 0; char *ext_str = NULL; if(PyTuple_Check(obj)) { /* Extended value. */ int extargs = PyTuple_Size(obj); if(extargs<1 || extargs>3) { PyErr_SetString(PyExc_ValueError, "Values with extended type info must be 2/3-tuples."); return WG_ILLEGAL; } data = PyTuple_GetItem(obj, 0); if(extargs > 1) { #ifndef PYTHON3 ftype = (wg_int) PyInt_AsLong(PyTuple_GetItem(obj, 1)); #else ftype = (wg_int) PyLong_AsLong(PyTuple_GetItem(obj, 1)); #endif if(ftype<0) { PyErr_SetString(PyExc_ValueError, "Invalid field type for value."); return WG_ILLEGAL; } } if(extargs > 2) { #ifndef PYTHON3 ext_str = PyString_AsString(PyTuple_GetItem(obj, 2)); #elif defined(HAVE_LOCALEENC) ext_str = PyBytes_AsString( PyUnicode_EncodeLocale(PyTuple_GetItem(obj, 2), ENCODEERR)); #else ext_str = PyBytes_AsString( PyUnicode_AsEncodedString(PyTuple_GetItem(obj, 2), NULL, ENCODEERR)); #endif if(!ext_str) { /* Error has been set in conversion */ return WG_ILLEGAL; } } } else { data = obj; } /* Now do the actual conversion. */ ftype = pytype_to_wgtype(data, ftype); if(ftype == -1) { PyErr_SetString(PyExc_TypeError, "Requested encoding is not supported."); return WG_ILLEGAL; } else if(ftype == -2) { PyErr_SetString(PyExc_TypeError, "Value is of unsupported type."); return WG_ILLEGAL; } /* Now encode the given obj using the selected type */ enc = encode_pyobject(db, data, ftype, ext_str, param); if(enc==WG_ILLEGAL) wgdb_error_setstring(self, "Value encoding error."); return enc; } /** Update field data. * Data types supported: * Python None. Translates to WhiteDB NULL (empty) field. * Python integer. * Python float. * Python string. Embedded \0 bytes are not allowed (i.e. \0 is * treated as a standard string terminator). * XXX: add language support for str type? 
* wgdb.Record object * datetime.date object * datetime.time object */ static PyObject *wgdb_set_field(PyObject *self, PyObject *args, PyObject *kwds) { PyObject *db = NULL, *rec = NULL; wg_int fieldnr, fdata = WG_ILLEGAL, err = 0, ftype = 0; PyObject *data; char *ext_str = NULL; static char *kwlist[] = {"db", "rec", "fieldnr", "data", "encoding", "ext_str", NULL}; if(!PyArg_ParseTupleAndKeywords(args, kwds, "O!O!nO|ns", kwlist, &wg_database_type, &db, &wg_record_type, &rec, &fieldnr, &data, &ftype, &ext_str)) return NULL; /* Determine the argument type. If the optional encoding * argument is not supplied, default encoding for the Python type * of the data is selected. Otherwise the user-provided encoding * is used, with the limitation that the Python type must * be compatible with the encoding. */ ftype = pytype_to_wgtype(data, ftype); if(ftype == -1) { PyErr_SetString(PyExc_TypeError, "Requested encoding is not supported."); return NULL; } else if(ftype == -2) { PyErr_SetString(PyExc_TypeError, "Argument is of unsupported type."); return NULL; } /* Now encode the given data using the selected type */ fdata = encode_pyobject((wg_database *) db, data, ftype, ext_str, 0); if(fdata==WG_ILLEGAL) { wgdb_error_setstring(self, "Field data conversion error."); return NULL; } /* Store the encoded field data in the record */ err = wg_set_field(((wg_database *) db)->db, ((wg_record *) rec)->rec, fieldnr, fdata); if(err < 0) { wgdb_error_setstring(self, "Failed to set field value."); return NULL; } Py_INCREF(Py_None); return Py_None; } /** Set field data (assumes no previous content) * Skips some bookkeeping related to the previous field * contents, making the insert faster. Using it on fields * that have previous content is likely to corrupt the database. * Otherwise identical to set_field(). */ static PyObject *wgdb_set_new_field(PyObject *self, PyObject *args, PyObject *kwds) { PyObject *db = NULL, *rec = NULL; wg_int fieldnr, fdata = WG_ILLEGAL, err = 0, ftype = 0; PyObject *data; char *ext_str = NULL; static char *kwlist[] = {"db", "rec", "fieldnr", "data", "encoding", "ext_str", NULL}; if(!PyArg_ParseTupleAndKeywords(args, kwds, "O!O!nO|ns", kwlist, &wg_database_type, &db, &wg_record_type, &rec, &fieldnr, &data, &ftype, &ext_str)) return NULL; ftype = pytype_to_wgtype(data, ftype); if(ftype == -1) { PyErr_SetString(PyExc_TypeError, "Requested encoding is not supported."); return NULL; } else if(ftype == -2) { PyErr_SetString(PyExc_TypeError, "Argument is of unsupported type."); return NULL; } fdata = encode_pyobject((wg_database *) db, data, ftype, ext_str, 0); if(fdata==WG_ILLEGAL) { wgdb_error_setstring(self, "Field data conversion error."); return NULL; } err = wg_set_new_field(((wg_database *) db)->db, ((wg_record *) rec)->rec, fieldnr, fdata); if(err < 0) { wgdb_error_setstring(self, "Failed to set field value."); return NULL; } Py_INCREF(Py_None); return Py_None; } /** Get decoded field value. * Currently supported types: * NULL - Python None * record - wgdb.Record * int - Python int * double - Python float * string - Python string * char - Python string * fixpoint - Python float * date - datetime.date * time - datetime.time */ static PyObject *wgdb_get_field(PyObject *self, PyObject *args) { PyObject *db = NULL, *rec = NULL; wg_int fieldnr, fdata, ftype; if(!PyArg_ParseTuple(args, "O!O!n", &wg_database_type, &db, &wg_record_type, &rec, &fieldnr)) return NULL; /* First retrieve the field data. The information about * the field type is encoded inside the field. 
*/ fdata = wg_get_field(((wg_database *) db)->db, ((wg_record *) rec)->rec, fieldnr); if(fdata==WG_ILLEGAL) { wgdb_error_setstring(self, "Failed to get field data."); return NULL; } /* Decode the type */ ftype = wg_get_encoded_type(((wg_database *) db)->db, fdata); if(!ftype) { wgdb_error_setstring(self, "Failed to get field type."); return NULL; } /* Decode (or retrieve) the actual data */ if(ftype==WG_NULLTYPE) { Py_INCREF(Py_None); return Py_None; } else if(ftype==WG_RECORDTYPE) { wg_record *ddata = (wg_record *) wg_record_type.tp_alloc( &wg_record_type, 0); if(!ddata) return NULL; ddata->rec = wg_decode_record(((wg_database *) db)->db, fdata); if(!ddata->rec) { wgdb_error_setstring(self, "Failed to fetch a record."); wg_record_type.tp_free(ddata); return NULL; } Py_INCREF(ddata); return (PyObject *) ddata; } else if(ftype==WG_INTTYPE) { wg_int ddata = wg_decode_int(((wg_database *) db)->db, fdata); return Py_BuildValue("n", ddata); } else if(ftype==WG_DOUBLETYPE) { double ddata = wg_decode_double(((wg_database *) db)->db, fdata); return Py_BuildValue("d", ddata); } else if(ftype==WG_STRTYPE) { char *ddata = wg_decode_str(((wg_database *) db)->db, fdata); /* Data is copied here, no leaking */ return Py_BuildValue("s", ddata); } else if(ftype==WG_URITYPE) { char *ddata = wg_decode_uri(((wg_database *) db)->db, fdata); char *ext_str = wg_decode_uri_prefix(((wg_database *) db)->db, fdata); if(ext_str) { #ifndef PYTHON3 return PyString_FromFormat("%s%s", ext_str, ddata); #else return PyUnicode_FromFormat("%s%s", ext_str, ddata); #endif } else return Py_BuildValue("s", ddata); } else if(ftype==WG_XMLLITERALTYPE) { char *ddata = wg_decode_xmlliteral(((wg_database *) db)->db, fdata); return Py_BuildValue("s", ddata); } else if(ftype==WG_CHARTYPE) { char ddata[2]; ddata[0] = wg_decode_char(((wg_database *) db)->db, fdata); ddata[1] = '\0'; return Py_BuildValue("s", ddata); /* treat as single-character string */ } else if(ftype==WG_FIXPOINTTYPE) { double ddata = wg_decode_fixpoint(((wg_database *) db)->db, fdata); return Py_BuildValue("d", ddata); } else if(ftype==WG_DATETYPE) { int year, month, day; int datedata = wg_decode_date(((wg_database *) db)->db, fdata); if(!datedata) { wgdb_error_setstring(self, "Failed to decode date."); return NULL; } wg_date_to_ymd(((wg_database *) db)->db, datedata, &year, &month, &day); return PyDate_FromDate(year, month, day); } else if(ftype==WG_TIMETYPE) { int hour, minute, second, fraction; int timedata = wg_decode_time(((wg_database *) db)->db, fdata); /* 0 is both a valid time of 00:00:00.00 and an error. So the * latter case is ignored here. */ wg_time_to_hms(((wg_database *) db)->db, timedata, &hour, &minute, &second, &fraction); return PyTime_FromTime(hour, minute, second, fraction*10000); } else if(ftype==WG_VARTYPE) { /* XXX: returns something that, if written back to database * unaltered, will result in the database field actually containing * the same variable. The literal value in Python is not very meaningful, * however this approach preserves consistency in handling the type * conversions. */ int ddata = wg_decode_var(((wg_database *) db)->db, fdata); return Py_BuildValue("(i,i)", ddata, WG_VARTYPE); } else { char buf[80]; snprintf(buf, 80, "Cannot handle field type %d.", (int) ftype); wgdb_error_setstring(self, buf); return NULL; } } /* * Functions to handle transactions. Logical level of * concurrency control with wg_start_write() and friends * is implemented here. 
In the simplest case, these functions * internally map to physical locking and unlocking, however they * should not be relied upon to do so. */ /** Start a writing transaction * Python wrapper to wg_start_write() * Returns lock id when successful, otherwise raises an exception. */ static PyObject * wgdb_start_write(PyObject *self, PyObject *args) { PyObject *db = NULL; wg_int lock_id = 0; if(!PyArg_ParseTuple(args, "O!", &wg_database_type, &db)) return NULL; lock_id = wg_start_write(((wg_database *) db)->db); if(!lock_id) { wgdb_error_setstring(self, "Failed to acquire write lock."); return NULL; } return Py_BuildValue("i", (int) lock_id); } /** Finish a writing transaction * Python wrapper to wg_end_write() * Returns None when successful, otherwise raises an exception. */ static PyObject * wgdb_end_write(PyObject *self, PyObject *args) { PyObject *db = NULL; wg_int lock_id = 0; if(!PyArg_ParseTuple(args, "O!n", &wg_database_type, &db, &lock_id)) return NULL; if(!wg_end_write(((wg_database *) db)->db, lock_id)) { wgdb_error_setstring(self, "Failed to release write lock."); return NULL; } Py_INCREF(Py_None); return Py_None; } /** Start a reading transaction * Python wrapper to wg_start_read() * Returns lock id when successful, otherwise raises an exception. */ static PyObject * wgdb_start_read(PyObject *self, PyObject *args) { PyObject *db = NULL; wg_int lock_id = 0; if(!PyArg_ParseTuple(args, "O!", &wg_database_type, &db)) return NULL; lock_id = wg_start_read(((wg_database *) db)->db); if(!lock_id) { wgdb_error_setstring(self, "Failed to acquire read lock."); return NULL; } return Py_BuildValue("i", (int) lock_id); } /** Finish a reading transaction * Python wrapper to wg_end_read() * Returns None when successful, otherwise raises an exception. */ static PyObject * wgdb_end_read(PyObject *self, PyObject *args) { PyObject *db = NULL; wg_int lock_id = 0; if(!PyArg_ParseTuple(args, "O!n", &wg_database_type, &db, &lock_id)) return NULL; if(!wg_end_read(((wg_database *) db)->db, lock_id)) { wgdb_error_setstring(self, "Failed to release read lock."); return NULL; } Py_INCREF(Py_None); return Py_None; } /* Functions to create and fetch data from queries. * The query object defined on wgdb module level stores both * the pointer to the query and all the encoded parameters - * since the parameters use database side storage, they * should preferrably be freed after the query is finished. */ /** Parse query arguments. * Creates wgdb query arglist and matchrec. */ static int parse_query_params(PyObject *self, PyObject *args, PyObject *kwds, wg_query_ob *query) { PyObject *db = NULL; PyObject *arglist = NULL; PyObject *matchrec = NULL; static char *kwlist[] = {"db", "matchrec", "arglist", NULL}; if(!PyArg_ParseTupleAndKeywords(args, kwds, "O!|OO", kwlist, &wg_database_type, &db, &matchrec, &arglist)) return 0; Py_INCREF(db); /* Make sure we don't lose database connection */ query->db = (wg_database *) db; /* Determine type of arglist */ query->argc = 0; if(arglist && arglist!=Py_None) { int len, i; if(!PySequence_Check(arglist)) { PyErr_SetString(PyExc_TypeError, "Query arglist must be a sequence."); return 0; } len = (int) PySequence_Size(arglist); /* Argument list was present. Extract the individual arguments. */ if(len > 0) { query->arglist = (wg_query_arg *) malloc(len * sizeof(wg_query_arg)); if(!query->arglist) { wgdb_error_setstring(self, "Failed to allocate memory."); return 0; } memset(query->arglist, 0, len * sizeof(wg_query_arg)); /* Now copy all the parameters. 
*/ for(i=0; i<len; i++) { PyObject *t = PySequence_GetItem(arglist, i); wg_int col, cond, enc; if(!t || !PyTuple_Check(t) || PyTuple_Size(t) != 3) { PyErr_SetString(PyExc_TypeError, "Query arguments must be 3-tuples of (column, condition, value)."); return 0; } #ifndef PYTHON3 col = (wg_int) PyInt_AsLong(PyTuple_GetItem(t, 0)); cond = (wg_int) PyInt_AsLong(PyTuple_GetItem(t, 1)); #else col = (wg_int) PyLong_AsLong(PyTuple_GetItem(t, 0)); cond = (wg_int) PyLong_AsLong(PyTuple_GetItem(t, 1)); #endif enc = encode_pyobject_ext(self, query->db, PyTuple_GetItem(t, 2), 1); if(enc==WG_ILLEGAL) { /* Error set by encode function */ return 0; } /* Finally, set the argument fields */ query->arglist[i].column = col; query->arglist[i].cond = cond; query->arglist[i].value = enc; query->argc++; /* We have successfully encoded a parameter. We're * not setting this to len immediately, because * there might be an encoding error and part of * the arguments may be left uninitialized. */ } } } query->reclen = 0; /* Determine type of matchrec */ if(matchrec && matchrec!=Py_None) { if(PyObject_TypeCheck(matchrec, &wg_record_type)) { /* Database record pointer was given. Pass it directly * to the query. */ query->matchrec = ((wg_record *) matchrec)->rec; } else if(PySequence_Check(matchrec)) { int len = (int) PySequence_Size(matchrec); if(len) { int i; /* Construct the record. */ query->matchrec = malloc(len * sizeof(wg_int)); if(!query->matchrec) { wgdb_error_setstring(self, "Failed to allocate memory."); return 0; } memset(query->matchrec, 0, len * sizeof(wg_int)); for(i=0; i<len; i++) { wg_int enc = encode_pyobject_ext(self, query->db, PySequence_GetItem(matchrec, i), 1); if(enc==WG_ILLEGAL) { /* Error set by encode function */ return 0; } ((wg_int *) query->matchrec)[i] = enc; query->reclen++; /* Count the successfully encoded fields. */ } } } else { PyErr_SetString(PyExc_TypeError, "Query match record must be a sequence or a wgdb.Record"); return 0; } } return 1; } /** Create a query object. * Python wrapper to wg_make_query() */ static PyObject * wgdb_make_query(PyObject *self, PyObject *args, PyObject *kwds) { wg_query_ob *query; /* Build a new query object */ query = (wg_query_ob *) wg_query_type.tp_alloc(&wg_query_type, 0); if(!query) return NULL; query->query = NULL; query->db = NULL; query->arglist = NULL; query->argc = 0; query->matchrec = NULL; query->reclen = 0; /* Create the arglist and matchrec from parameters. */ if(!parse_query_params(self, args, kwds, query)) { wg_query_dealloc(query); return NULL; } query->query = wg_make_query(query->db->db, query->matchrec, query->reclen, query->arglist, query->argc); if(!query->query) { wgdb_error_setstring(self, "Failed to create the query."); /* Call destructor. It should take care of all the allocated memory. */ wg_query_dealloc(query); return NULL; } return (PyObject *) query; } /** Fetch next row from a query. * Python wrapper for wg_fetch() */ static PyObject * wgdb_fetch(PyObject *self, PyObject *args) { PyObject *db = NULL, *query = NULL; wg_record *rec; if(!PyArg_ParseTuple(args, "O!O!", &wg_database_type, &db, &wg_query_type, &query)) return NULL; /* Build a new record object */ rec = (wg_record *) wg_record_type.tp_alloc(&wg_record_type, 0); if(!rec) return NULL; rec->rec = wg_fetch(((wg_database *) db)->db, ((wg_query_ob *) query)->query); if(!rec->rec) { wgdb_error_setstring(self, "Failed to fetch a record."); wg_record_type.tp_free(rec); return NULL; } Py_INCREF(rec); return (PyObject *) rec; } /** Free query. * Python wrapper to wg_free_query() * In addition, this function frees the local memory for * the arguments and attempts to free database-side encoded data. */ static PyObject * wgdb_free_query(PyObject *self, PyObject *args) { PyObject *db = NULL, *query = NULL; if(!PyArg_ParseTuple(args, "O!O!", &wg_database_type, &db, &wg_query_type, &query)) return NULL; /* XXX: since the query contains the db pointer, ignore * the database object we were given (it is still required * for consistency between the API-s and possible future * extensions).
*/ free_query((wg_query_ob *) query); Py_INCREF(Py_None); return Py_None; } /** Helper function to free local query memory * (wg_query_ob *) query->db field is used as a marker * (set to NULL for queries that do not need deallocating). */ static void free_query(wg_query_ob *obj) { if(obj->db) { #if 0 /* Suppress the warning if we use local parameters */ if(!obj->db->db) { fprintf(stderr, "Warning: database connection lost before freeing encoded data\n"); } #endif /* Allow freeing the query object. * XXX: this is hacky. db pointer may become significant * in the future, which makes this a timebomb. */ if(obj->query) wg_free_query(obj->db->db, obj->query); if(obj->arglist) { if(obj->db->db) { int i; for(i=0; i<obj->argc; i++) wg_free_query_param(obj->db->db, obj->arglist[i].value); } free(obj->arglist); } if(obj->matchrec && obj->reclen) { if(obj->db->db) { int i; for(i=0; i<obj->reclen; i++) wg_free_query_param(obj->db->db, ((wg_int *) obj->matchrec)[i]); } free(obj->matchrec); } Py_DECREF(obj->db); obj->db = NULL; } } /* Methods for data types defined by this module. */ /** Database object destructor. * Detaches from shared memory or frees local memory. */ static void wg_database_dealloc(wg_database *obj) { if(obj->db) { if(obj->local) wg_delete_local_database(obj->db); else wg_detach_database(obj->db); } #ifndef PYTHON3 obj->ob_type->tp_free((PyObject *) obj); #else Py_TYPE(obj)->tp_free((PyObject *) obj); #endif } /** Query object destructor. * Frees query and encoded query parameters. */ static void wg_query_dealloc(wg_query_ob *obj) { free_query(obj); #ifndef PYTHON3 obj->ob_type->tp_free((PyObject *) obj); #else Py_TYPE(obj)->tp_free((PyObject *) obj); #endif } /** String representation of database object. This is used for both * repr() and str() */ static PyObject *wg_database_repr(wg_database * obj) { /* XXX: this is incompatible with eval(). If a need to * eval() such representations should arise, new initialization * function is also needed for the type. */ #ifndef PYTHON3 return PyString_FromFormat("<WhiteDB database at %p>", (void *) obj->db); #else return PyUnicode_FromFormat("<WhiteDB database at %p>", (void *) obj->db); #endif } /** String representation of record object. This is used for both * repr() and str() */ static PyObject *wg_record_repr(wg_record * obj) { /* XXX: incompatible with eval(). */ #ifndef PYTHON3 return PyString_FromFormat("<WhiteDB record at %p>", (void *) obj->rec); #else return PyUnicode_FromFormat("<WhiteDB record at %p>", (void *) obj->rec); #endif } /** String representation of query object. Used for both repr() and str() */ static PyObject *wg_query_repr(wg_query_ob *obj) { /* XXX: incompatible with eval(). */ #ifndef PYTHON3 return PyString_FromFormat("<WhiteDB query at %p>", (void *) obj->query); #else return PyUnicode_FromFormat("<WhiteDB query at %p>", (void *) obj->query); #endif } /** Get the number of rows in a query result */ static PyObject *wg_query_get_res_count(wg_query_ob *obj, void *closure) { if(obj->query) { if(obj->query->qtype == WG_QTYPE_PREFETCH) { return Py_BuildValue("K", (unsigned PY_LONG_LONG) obj->query->res_count); } else { return Py_None; } } else { PyErr_SetString(PyExc_ValueError, "Invalid query object"); return NULL; } return Py_None; /* satisfy the compiler */ } /** Set the number of rows in a query result (not allowed) */ static int wg_query_set_res_count(wg_query_ob *obj, PyObject *value, void *closure) { PyErr_SetString(PyExc_AttributeError, "res_count is a read only attribute"); return -1; } /** Set module exception.
* */ static void wgdb_error_setstring(PyObject *self, char *err) { #ifndef PYTHON3 struct module_state *st = &_state; #else struct module_state *st = (struct module_state *) PyModule_GetState(self); #endif PyErr_SetString(st->wgdb_error, err); } /** Initialize module. * Standard entry point for Python extension modules, executed * during import. */ #ifdef PYTHON3 #define INITERROR return NULL; #else #define INITERROR return; #endif #ifndef PYTHON3 PyMODINIT_FUNC initwgdb(void) #else PyMODINIT_FUNC PyInit_wgdb(void) #endif { PyObject *m; struct module_state *st; wg_database_type.tp_new = PyType_GenericNew; if (PyType_Ready(&wg_database_type) < 0) INITERROR wg_record_type.tp_new = PyType_GenericNew; if (PyType_Ready(&wg_record_type) < 0) INITERROR wg_query_type.tp_new = PyType_GenericNew; if (PyType_Ready(&wg_query_type) < 0) INITERROR #ifndef PYTHON3 m = Py_InitModule3("wgdb", wgdb_methods, "WhiteDB database adapter"); #else m = PyModule_Create(&wgdb_def); #endif if(!m) INITERROR #ifndef PYTHON3 st = &_state; #else st = (struct module_state *) PyModule_GetState(m); #endif st->wgdb_error = PyErr_NewException("wgdb.error", NULL, NULL); Py_INCREF(st->wgdb_error); PyModule_AddObject(m, "error", st->wgdb_error); /* Expose wgdb internal encoding types */ PyModule_AddIntConstant(m, "NULLTYPE", WG_NULLTYPE); PyModule_AddIntConstant(m, "RECORDTYPE", WG_RECORDTYPE); PyModule_AddIntConstant(m, "INTTYPE", WG_INTTYPE); PyModule_AddIntConstant(m, "DOUBLETYPE", WG_DOUBLETYPE); PyModule_AddIntConstant(m, "STRTYPE", WG_STRTYPE); PyModule_AddIntConstant(m, "XMLLITERALTYPE", WG_XMLLITERALTYPE); PyModule_AddIntConstant(m, "URITYPE", WG_URITYPE); PyModule_AddIntConstant(m, "BLOBTYPE", WG_BLOBTYPE); PyModule_AddIntConstant(m, "CHARTYPE", WG_CHARTYPE); PyModule_AddIntConstant(m, "FIXPOINTTYPE", WG_FIXPOINTTYPE); PyModule_AddIntConstant(m, "DATETYPE", WG_DATETYPE); PyModule_AddIntConstant(m, "TIMETYPE", WG_TIMETYPE); PyModule_AddIntConstant(m, "VARTYPE", WG_VARTYPE); /* these types are not implemented yet: PyModule_AddIntConstant(m, "ANONCONSTTYPE", WG_ANONCONSTTYPE); */ /* Expose query conditions */ PyModule_AddIntConstant(m, "COND_EQUAL", WG_COND_EQUAL); PyModule_AddIntConstant(m, "COND_NOT_EQUAL", WG_COND_NOT_EQUAL); PyModule_AddIntConstant(m, "COND_LESSTHAN", WG_COND_LESSTHAN); PyModule_AddIntConstant(m, "COND_GREATER", WG_COND_GREATER); PyModule_AddIntConstant(m, "COND_LTEQUAL", WG_COND_LTEQUAL); PyModule_AddIntConstant(m, "COND_GTEQUAL", WG_COND_GTEQUAL); /* Initialize PyDateTime C API */ PyDateTime_IMPORT; #ifdef PYTHON3 return m; #endif } whitedb-0.7.2/Python/whitedb.py000066400000000000000000000336031226454622500164440ustar00rootroot00000000000000#!/usr/bin/env python # -*- coding: latin-1 -*- # # Copyright (c) Priit Järv 2009, 2010, 2013 # # This file is part of WhiteDB # # WhiteDB is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # WhiteDB is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with WhiteDB. If not, see . 
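#
# Illustration only -- the following commented sketch is not part of the
# original sources. It shows how the low-level wgdb extension module defined
# above might be used directly; the database size below is arbitrary, and
# error handling plus locking (wgdb.start_write()/wgdb.end_write()) are
# omitted for brevity:
#
#   import wgdb
#   d = wgdb.attach_database(size=2000000)   # attach to (or create) shared db
#   rec = wgdb.create_record(d, 3)           # record with 3 fields
#   wgdb.set_field(d, rec, 0, 42)            # encoded as INTTYPE by default
#   wgdb.set_field(d, rec, 1, "hello")       # encoded as STRTYPE by default
#   print(wgdb.get_field(d, rec, 1))         # -> 'hello'
#   wgdb.detach_database(d)
#
# The high-level wrapper below (Connection/Cursor/Record) is usually the more
# convenient interface.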
"""@file whitedb.py High level Python API for WhiteDB database """ # This module is implemented loosely following the guidelines # in Python DBI spec (http://www.python.org/dev/peps/pep-0249/). # Due to the wgdb feature set being much slimmer than typical # SQL databases, it does not make sense to follow DBI fully, # but where there are overlaps in functionality, similar # naming and object structure should be generally used. import wgdb ### Error classes (by DBI recommendation) ### # class DatabaseError(wgdb.error): """Base class for database errors""" pass class ProgrammingError(DatabaseError): """Exception class to indicate invalid database usage""" pass class DataError(DatabaseError): """Exception class to indicate invalid data passed to the db adapter""" pass class InternalError(DatabaseError): """Exception class to indicate invalid internal state of the module""" pass ############## DBI classes: ############### # class Connection(object): """The Connection class acts as a container for wgdb.Database and provides all connection-related and record accessing functions.""" def __init__(self, shmname=None, shmsize=0, local=0): if local: self._db = wgdb.attach_database(size=shmsize, local=local) elif shmname: self._db = wgdb.attach_database(shmname, shmsize) else: self._db = wgdb.attach_database(size=shmsize) self.shmname = shmname self.locking = 1 self._lock_id = None def close(self): """Close the connection.""" if self._db: wgdb.detach_database(self._db) self._db = None def cursor(self): """Return a DBI-style database cursor""" if self._db is None: raise InternalError("Connection is closed.") return Cursor(self) # Locking support # def set_locking(self, mode): """Set locking mode (1=on, 0=off)""" self.locking = mode def start_write(self): """Start writing transaction""" if self._lock_id: raise ProgrammingError("Transaction already started.") self._lock_id = wgdb.start_write(self._db) def end_write(self): """Finish writing transaction""" if not self._lock_id: raise ProgrammingError("No current transaction.") wgdb.end_write(self._db, self._lock_id) self._lock_id = None def start_read(self): """Start reading transaction""" if self._lock_id: raise ProgrammingError("Transaction already started.") self._lock_id = wgdb.start_read(self._db) def end_read(self): """Finish reading transaction""" if not self._lock_id: raise ProgrammingError("No current transaction.") wgdb.end_read(self._db, self._lock_id) self._lock_id = None def commit(self): """Commit the transaction (no-op)""" pass def rollback(self): """Roll back the transaction (no-op)""" pass # Record operations. Wrap wgdb.Record object into Record class. 
# def _new_record(self, rec): """Create a Record instance from wgdb record object (internal)""" r = Record(self, rec) if self.locking: self.start_read() try: r.size = wgdb.get_record_len(self._db, rec) finally: if self.locking: self.end_read() return r def first_record(self): """Get first record from database.""" if self.locking: self.start_read() try: r = wgdb.get_first_record(self._db) except wgdb.error: r = None finally: if self.locking: self.end_read() if not r: return None return self._new_record(r) def next_record(self, rec): """Get next record from database.""" if self.locking: self.start_read() try: r = wgdb.get_next_record(self._db, rec.get__rec()) except wgdb.error: r = None finally: if self.locking: self.end_read() if not r: return None return self._new_record(r) def create_record(self, size): """Create new record with given size.""" if size <= 0: raise DataError("Invalid record size") if self.locking: self.start_write() try: r = wgdb.create_record(self._db, size) finally: if self.locking: self.end_write() return self._new_record(r) def delete_record(self, rec): """Delete record.""" if self.locking: self.start_write() try: r = wgdb.delete_record(self._db, rec.get__rec()) finally: if self.locking: self.end_write() rec.set__rec(None) # prevent future usage def atomic_create_record(self, fields): """Create a record and set field contents atomically.""" if not fields: raise DataError("Cannot create an empty record") l = len(fields) tupletype = type(()) if self.locking: self.start_write() try: r = wgdb.create_raw_record(self._db, l) for i in range(l): if type(fields[i]) == tupletype: data = fields[i][0] extarg = fields[i][1:] else: data = fields[i] extarg = () if isinstance(data, Record): data = data.get__rec() fargs = (self._db, r, i, data) + extarg wgdb.set_new_field(*fargs) finally: if self.locking: self.end_write() return self._new_record(r) def atomic_update_record(self, rec, fields): """Set the contents of the entire record atomically.""" # fields should be a sequence l = len(fields) sz = rec.get_size() r = rec.get__rec() tupletype = type(()) if self.locking: self.start_write() try: for i in range(l): if type(fields[i]) == tupletype: data = fields[i][0] extarg = fields[i][1:] else: data = fields[i] extarg = () if isinstance(data, Record): data = data.get__rec() fargs = (self._db, r, i, data) + extarg wgdb.set_field(*fargs) if l < sz: # fill the remainder: for i in range(l, sz): wgdb.set_field(self._db, r, i, None) finally: if self.locking: self.end_write() # alias for atomic_create_record() def insert(self, fields): """Insert a record into database""" return self.atomic_create_record(fields) # Field operations. 
Expect Record instances as argument # def get_field(self, rec, fieldnr): """Return data field contents""" if self.locking: self.start_read() try: data = wgdb.get_field(self._db, rec.get__rec(), fieldnr) finally: if self.locking: self.end_read() if wgdb.is_record(data): return self._new_record(data) else: return data def set_field(self, rec, fieldnr, data, *arg, **kwarg): """Set data field contents""" if isinstance(data, Record): data = data.get__rec() if self.locking: self.start_write() try: r = wgdb.set_field(self._db, rec.get__rec(), fieldnr, data, *arg, **kwarg) finally: if self.locking: self.end_write() return r # Query operations # def make_query(self, matchrec=None, *arg, **kwarg): """Create a query object.""" if isinstance(matchrec, Record): matchrec = matchrec.get__rec() if self.locking: self.start_write() # write lock for parameter encoding try: query = wgdb.make_query(self._db, matchrec, *arg, **kwarg) finally: if self.locking: self.end_write() return query def fetch(self, query): """Get next record from query result set.""" if self.locking: self.start_read() try: r = wgdb.fetch(self._db, query) except wgdb.error: r = None finally: if self.locking: self.end_read() if not r: return None return self._new_record(r) def free_query(self, cur): """Free query belonging to a cursor.""" if not self._db: # plausible enough to warrant special handling raise ProgrammingError("Database closed before freeing query "\ "(Hint: use Cursor.close() before Connection.close())") if self.locking: self.start_write() # may write shared memory try: r = wgdb.free_query(self._db, cur.get__query()) finally: if self.locking: self.end_write() cur.set__query(None) # prevent future usage class Cursor(object): """Cursor object. Supports wgdb-style queries based on match records or argument lists. Does not currently support SQL.""" def __init__(self, conn): self._query = None self._conn = conn self.rowcount = -1 def get__query(self): """Return low level query object""" return self._query def set__query(self, query): """Overwrite low level query object""" self._query = query def fetchone(self): """Fetch the next record from the result set""" if not self._query: raise ProgrammingError("No results to fetch.") return self._conn.fetch(self._query) def fetchall(self): """Fetch all (remaining) records from the result set""" result = [] while 1: r = self.fetchone() if not r: break result.append(r) return result # includes sql parameter for future extension. Current # wgdb queries should use arglist and matchrec keyword parameters. def execute(self, sql="", matchrec=None, arglist=None): """Execute a database query""" self._query = self._conn.make_query(matchrec=matchrec, arglist=arglist) if self._query.res_count is not None: self.rowcount = self._query.res_count # using cursors to insert data does not make sense # in WhiteDB context, since there is no relation at all # between the current cursor state and new records. # This functionality will be moved to Connection object. def insert(self, fields): """Insert a record into database --DEPRECATED--""" return self._conn.atomic_create_record(fields) def close(self): """Close the cursor""" if self._query: self._conn.free_query(self) ############## Additional classes: ############### # class Record(object): """Record data representation. Allows field-level and record-level manipulation of data. 
Supports iterator and (partial) sequence protocol.""" def __init__(self, conn, rec): self._rec = rec self._conn = conn self.size = 0 def get__rec(self): """Return low level record object""" return self._rec def set__rec(self, rec): """Overwrite low level record object""" self._rec = rec def get_size(self): """Return record size""" return self.size def get_field(self, fieldnr): """Return data field contents""" if fieldnr < 0 or fieldnr >= self.size: raise DataError("Field number out of bounds.") return self._conn.get_field(self, fieldnr) def set_field(self, fieldnr, data, *arg, **kwarg): """Set data field contents with optional encoding""" if fieldnr < 0 or fieldnr >= self.size: raise DataError("Field number out of bounds.") return self._conn.set_field(self, fieldnr, data, *arg, **kwarg) def update(self, fields): """Set the contents of the entire record""" self._conn.atomic_update_record(self, fields) def delete(self): """Delete the record from database""" self._conn.delete_record(self) # iterator protocol def __iter__(self): for fieldnr in range(self.size): yield self.get_field(fieldnr) # sequence protocol def __getitem__(self, index): return self.get_field(index) def __setitem__(self, index, data, *arg, **kwarg): # XXX: should we allow this? # Could be counter-intuitive for users. return self.set_field(index, data, *arg, **kwarg) def __len__(self): return self.size ############## DBI API functions: ############### # def connect(shmname=None, shmsize=0, local=0): """Attaches to (or creates) a database. Returns a database object""" return Connection(shmname, shmsize, local) whitedb-0.7.2/README000066400000000000000000000034341226454622500140420ustar00rootroot00000000000000WhiteDB (wgdb) README ====================== WhiteDB is a lightweight database library operating fully in main memory. Disk is used only for dumping/restoring database and logging. Data is persistently kept in the shared memory area: it is available simultaneously to all processes and is kept intact even if no processes are currently using the database. WhiteDB has no server process. Data is read and written directly from/to memory; no sockets are used between WhiteDB and the application using WhiteDB. WhiteDB keeps data as N-tuples: each database record is a tuple of N elements. Each element (record field) may have an arbitrary type amongst the types provided by WhiteDB. Each record field contains exactly one integer (4 bytes or 8 bytes). Datatypes which cannot fit into one integer are allocated separately and the record field contains an (encoded) pointer to the real data. WhiteDB is written in pure C in a portable manner and should compile and function without additional porting at least under Linux (gcc) and Windows (native Windows C compiler cl). It has Python and experimental Java bindings. WhiteDB has several goals: - speed - portability - small footprint and low memory usage - usability as an rdf database - usability as an extended rdf database, xml database and outside these scopes - seamless integration with the Gandalf rule engine (work in progress) See http://whitedb.org for up-to-date documentation and other information. This distribution also includes various documentation: Doc/Install.txt - the installation instructions Doc/Tutorial.txt - getting started with the database Doc/Manual.txt - full C API documentation Doc/Utilities.txt - command line utilities and other programs Doc/python.txt - Python API documentation WhiteDB is licensed under GPL version 3.
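Quick example (illustration only -- see Doc/Tutorial.txt and Doc/Manual.txt
for the authoritative examples). This sketch assumes the library and headers
are installed; the include path, database name and size below are arbitrary
and error handling is omitted for brevity:

  #include <whitedb/dbapi.h>  /* adjust the include path to your installation */
  #include <stdio.h>

  int main(void) {
    void *db, *rec;
    db = wg_attach_database("1000", 2000000);        /* attach or create */
    if (!db) return 1;
    rec = wg_create_record(db, 2);                   /* record with 2 fields */
    wg_set_field(db, rec, 0, wg_encode_int(db, 42));
    wg_set_field(db, rec, 1, wg_encode_str(db, "hello", NULL));
    printf("%d\n", (int) wg_decode_int(db, wg_get_field(db, rec, 0)));
    wg_detach_database(db);
    return 0;
  }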
whitedb-0.7.2/README.asc000066400000000000000000000037101226454622500146040ustar00rootroot00000000000000WhiteDB (wgdb) README ====================== WhiteDB is a lightweight database library operating fully in main memory. Disk is used only for dumping/restoring database and logging. Data is persistently kept in the shared memory area: it is available simultaneously to all processes and is kept intact even if no processes are currently using the database. WhiteDB has no server process. Data is read and written directly from/to memory; no sockets are used between WhiteDB and the application using WhiteDB. WhiteDB keeps data as N-tuples: each database record is a tuple of N elements. Each element (record field) may have an arbitrary type amongst the types provided by WhiteDB. Each record field contains exactly one integer (4 bytes or 8 bytes). Datatypes which cannot fit into one integer are allocated separately and the record field contains an (encoded) pointer to the real data. WhiteDB is written in pure C in a portable manner and should compile and function without additional porting at least under Linux (gcc) and Windows (native Windows C compiler cl). It has Python and experimental Java bindings. WhiteDB has several goals: - speed - portability - small footprint and low memory usage - usability as an rdf database - usability as an extended rdf database, xml database and outside these scopes - seamless integration with the Gandalf rule engine (work in progress) See http://whitedb.org for up-to-date documentation and other information. This distribution also includes various documentation: - Doc/Install.txt - the installation instructions - Doc/Tutorial.txt - getting started with the database - Doc/Manual.txt - full C API documentation - Doc/Utilities.txt - command line utilities and other programs - Doc/python.txt - Python API documentation WhiteDB is licensed under GPL version 3. NOTE: if you're looking for release packages, please don't use the ones GitHub generates automatically. Get them from http://whitedb.org/download.html instead. whitedb-0.7.2/Reasoner/000077500000000000000000000000001226454622500147345ustar00rootroot00000000000000whitedb-0.7.2/Reasoner/Makefile.am000066400000000000000000000005671226454622500170000ustar00rootroot00000000000000# # - - - - reasoner sources - - - - noinst_LTLIBRARIES = libReasoner.la libReasoner_la_SOURCES = types.h rincludes.h \ printerrutils.c printerrutils.h\ rmain.c rmain.h\ rtest.c rtest.h\ glb.c glb.h\ rgenloop.c rgenloop.h\ derive.c derive.h\ subsume.c subsume.h\ unify.c unify.h\ build.c build.h\ clstore.c clstore.h\ clterm.c clterm.h\ mem.c mem.h whitedb-0.7.2/Reasoner/build.c000066400000000000000000000227041226454622500162040ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see <http://www.gnu.org/licenses/>.
* */ /** @file build.h * Term and clause building functions. * */ /* ====== Includes =============== */ #include #include #include #ifdef __cplusplus extern "C" { #endif #include "rincludes.h" /* ====== Private headers and defs ======== */ #define PRINT_LIMITS /* ======= Private protos ================ */ /* ====== Functions ============== */ /* ** subst and calc term uses global flags set before: g->build_subst // subst var values into vars g->build_calc // do fun and pred calculations g->build_dcopy // copy nonimmediate data (vs return ptr) g->build_buffer; // build everything into tmp local buffer area (vs main area) */ gptr wr_build_calc_cl(glb* g, gptr xptr) { void* db; int ruleflag; gptr yptr; gint xlen; gint xatomnr; gint xmeta, xatom, yatom; int i; gint tmp; int ilimit; //printf("wr_build_calc_cl called\n"); db=g->db; if (g->build_rename) (g->build_rename_vc)=0; ruleflag=wg_rec_is_rule_clause(db,xptr); if (!ruleflag) { // in some cases, no change, no copy: normally copy //yptr=xptr; tmp=wr_build_calc_term(g,encode_datarec_offset(pto(db,xptr))); if (tmp==WG_ILLEGAL) return NULL; // could be memory err yptr=rotp(g,tmp); } else { xlen=get_record_len(xptr); // allocate space if ((g->build_buffer)!=NULL) { yptr=wr_alloc_from_cvec(g,g->build_buffer,(RECORD_HEADER_GINTS+xlen)); } else { yptr=wg_create_raw_record(db,xlen); } if (yptr==NULL) return NULL; // copy rec header and clause header ilimit=RECORD_HEADER_GINTS+(g->unify_firstuseterm); for(i=0;ivarbanks); // loop over clause elems xatomnr=wg_count_clause_atoms(db,xptr); for(i=0;ivarbanks); } ++(g->stat_built_cl); return yptr; } /* ** subst and calc term uses global flags set before: g->build_subst // subst var values into vars g->build_calc // do fun and pred calculations g->build_dcopy // copy nonimmediate data (vs return ptr) g->build_buffer; // build everything into tmp local buffer area (vs main area) */ gint wr_build_calc_term(glb* g, gint x) { void* db; gptr xptr,yptr; gint xlen,uselen; gint tmp; // used by VARVAL_F gint vnr; gint newvar; int i; gint res; int ilimit; int substflag; //printf("wr_build_calc_term called with x %d type %d\n",x,wg_get_encoded_type(g->db,x)); if (isvar(x) && (g->build_subst || g->build_rename)) x=VARVAL_F(x,(g->varbanks)); if (!isdatarec(x)) { // now we have a simple value if (!isvar(x) || !(g->build_rename)) return x; vnr=decode_var(x); if (vnrbuild_rename_vc)+FIRST_UNREAL_VAR_NR; if ((g->build_rename_vc)>=NROF_VARSINBANK) { ++(g->stat_internlimit_discarded_cl); (g->alloc_err)=3; #ifdef PRINT_LIMITS printf("limiterr in wr_build_calc_term for renamed var nrs\n"); #endif return WG_ILLEGAL; } ++(g->build_rename_vc); SETVAR(x,encode_var(newvar),(g->varbanks),(g->varstack),(g->tmp_unify_vc)); return encode_var(((g->build_rename_banknr)*NROF_VARSINBANK)+(newvar-FIRST_UNREAL_VAR_NR)); } else { return encode_var(((g->build_rename_banknr)*NROF_VARSINBANK)+(vnr-FIRST_UNREAL_VAR_NR)); } } // now we have a datarec if (0) { } else { db=g->db; xptr=decode_record(db,x); xlen=get_record_len(xptr); //printf("wr_build_calc_term xptr %d xlen %d\n",(gint)xptr,xlen); // allocate space if ((g->build_buffer)!=NULL) { yptr=wr_alloc_from_cvec(g,g->build_buffer,(RECORD_HEADER_GINTS+xlen)); //yptr=malloc(64); } else { yptr=wg_create_raw_record(db,xlen); } if (yptr==NULL) return WG_ILLEGAL; // copy rec header and term header ilimit=RECORD_HEADER_GINTS+(g->unify_firstuseterm); for(i=0;iunify_maxuseterms) { if (((g->unify_maxuseterms)+(g->unify_firstuseterm))unify_maxuseterms)+(g->unify_firstuseterm)+RECORD_HEADER_GINTS); else 
uselen=xlen+RECORD_HEADER_GINTS; } else { uselen=xlen+RECORD_HEADER_GINTS; } substflag=(g->build_subst || g->build_rename); for(;iunify_maxuseterms) { ilimit=RECORD_HEADER_GINTS+xlen; for(;iuse_comp_funs) && wr_computable_termptr(g,yptr)) { res=wr_compute_from_termptr(g,yptr); if (res==WG_ILLEGAL) return WG_ILLEGAL; } else { res=encode_record(db,yptr); } return res; } } int wr_computable_termptr(glb* g, gptr tptr) { gint fun; gint nr; //printf("wr_computable_termptr called with rec\n"); //wg_print_record(g->db,tptr); //printf("\n"); fun=tptr[RECORD_HEADER_GINTS+(g->unify_funpos)]; //printf("cp1 fun %d type %d :\n",fun,wg_get_encoded_type(g->db,fun)); //wg_debug_print_value(g->db,fun); //printf("\n"); if (isanonconst(fun)) { nr=decode_anonconst(fun); //printf("nr %d\n",nr); if (nr<(dbmemsegh(g->db)->anonconst.anonconst_nr) && nr>=0) return 1; } return 0; } gint wr_compute_from_termptr(glb* g, gptr tptr) { gint fun; gint res; //printf("wr_compute_from_termptr called\n"); fun=tptr[RECORD_HEADER_GINTS+(g->unify_funpos)]; // assume fun is anonconst!! switch (fun) { case ACONST_PLUS: res=wr_compute_fun_plus(g,tptr); break; case ACONST_EQUAL: res=wr_compute_fun_equal(g,tptr); break; default: res=encode_record(g->db,tptr); } return res; } gint wr_compute_fun_plus(glb* g, gptr tptr) { void* db=g->db; gint len; gint a,b; gint atype, btype; gint ri; double ad,bd,rd; //printf("wr_compute_fun_plus called\n"); len=get_record_len(tptr); if (len<(g->unify_firstuseterm)+3) return encode_record(db,tptr); a=tptr[RECORD_HEADER_GINTS+(g->unify_funarg1pos)]; atype=wg_get_encoded_type(db,a); if (atype!=WG_INTTYPE && atype!=WG_DOUBLETYPE) return encode_record(db,tptr); b=tptr[RECORD_HEADER_GINTS+(g->unify_funarg2pos)]; btype=wg_get_encoded_type(db,b); if (btype!=WG_INTTYPE && btype!=WG_DOUBLETYPE) return encode_record(db,tptr); if (atype==WG_INTTYPE && btype==WG_INTTYPE) { // integer res case ri=wg_decode_int(db,a)+wg_decode_int(db,b); return wg_encode_int(db,ri); } else { // double res case if (atype==WG_INTTYPE) ad=(double)(wg_decode_int(db,a)); else ad=wg_decode_double(db,a); if (btype==WG_INTTYPE) bd=(double)(wg_decode_int(db,b)); else bd=wg_decode_double(db,b); rd=ad+bd; return wg_encode_double(db,rd); } } gint wr_compute_fun_equal(glb* g, gptr tptr) { void* db=g->db; int len; gint a,b; gint atype, btype; len=get_record_len(tptr); if (len<(g->unify_firstuseterm)+3) return encode_record(db,tptr); a=tptr[RECORD_HEADER_GINTS+(g->unify_funarg1pos)]; atype=wg_get_encoded_type(db,a); b=tptr[RECORD_HEADER_GINTS+(g->unify_funarg2pos)]; btype=wg_get_encoded_type(db,b); if (wr_equal_term(g,a,b,1)) return ACONST_TRUE; atype=wg_get_encoded_type(db,a); if (atype==WG_VARTYPE) return encode_record(db,tptr); btype=wg_get_encoded_type(db,b); if (btype==WG_VARTYPE) return encode_record(db,tptr); // here we have not equal a and b with non-var types if (atype==WG_RECORDTYPE || atype==WG_URITYPE || atype==WG_ANONCONSTTYPE || btype==WG_RECORDTYPE || btype==WG_URITYPE || btype==WG_ANONCONSTTYPE) { return encode_record(db,tptr); } else { return ACONST_FALSE; } } #ifdef __cplusplus } #endif whitedb-0.7.2/Reasoner/build.h000066400000000000000000000023531226454622500162070ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either 
version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file build.h * Term and clause building functions. * */ #ifndef DEFINED_BUILD_H #define DEFINED_BUILD_H #include "glb.h" #define CVEC_ALLOC_ALIGNMENT_BYTES 8 gptr wr_build_calc_cl(glb* g, gptr x); gint wr_build_calc_term(glb* g, gint x); int wr_computable_termptr(glb* g, gptr yptr); gint wr_compute_from_termptr(glb* g, gptr yptr); gint wr_compute_fun_plus(glb* g, gptr tptr); gint wr_compute_fun_equal(glb* g, gptr tptr); #endif whitedb-0.7.2/Reasoner/clstore.c000066400000000000000000000651011226454622500165560ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file clstore.c * Clause storage functions. */ /* ====== Includes =============== */ #ifdef __cplusplus extern "C" { #endif #include "rincludes.h" /* ====== Private defs =========== */ //#define DEBUG #undef DEBUG #undef DEBUGHASH /* ====== Private headers ======== */ static gint conststrhash(char* str); static int int_hash(int x); static int double_hash(double x); static int str_hash(char* x); static int str_dual_hash(char* x, char* y); /* ====== Functions ============== */ /* store a clause in a passive stack */ void wr_push_clpickstack_cl(glb* g, gptr cl) { #ifdef DEBUG printf("pushing to clpickstack pos %d\n",(rotp(g,g->clpickstack))[1]); #endif (g->clpickstack)=rpto(g,wr_cvec_push(g,rotp(g,(g->clpickstack)),(gint)cl)); //wr_show_clpickstack(g); } void wr_show_clpickstack(glb* g) { int i; for(i=2;i<(rotp(g,g->clpickstack))[1];i++) { printf("\nclpickstack nr %d :",i); wr_print_record(g,(gptr)((rotp(g,g->clpickstack))[i])); } } /* store a clause in a passive queue */ void wr_push_clqueue_cl(glb* g, gptr cl) { #ifdef DEBUG printf("pushing to clqueue pos %d\n",(rotp(g,g->clqueue))[1]); #endif (g->clqueue)=rpto(g,wr_cvec_push(g,rotp(g,(g->clqueue)),(gint)cl)); //wr_show_clqueue(g); } void wr_show_clqueue(glb* g) { int i; for(i=2;i<(rotp(g,g->clqueue))[1];i++) { printf("\nclqueue nr %d :",i); wr_print_record(g,(gptr)((rotp(g,g->clqueue))[i])); } } /* make a clause active */ void wr_push_clactive_cl(glb* g, gptr cl) { #ifdef DEBUG printf("pushing to clactive pos %d\n",(rotp(g,g->clactive))[1]); #endif (g->clactive)=rpto(g,wr_cvec_push(g,rotp(g,(g->clactive)),(gint)cl)); wr_cl_store_res_terms(g,cl); } void wr_show_clactive(glb* g) { int i; for(i=2;i<(rotp(g,g->clactive))[1];i++) { printf("\nclactive nr %d :",i); wr_print_record(g,(gptr)((rotp(g,g->clactive))[i])); } } /* store resolvable literals/terms 
of a clause to fast resolvable-lit list returns 0 iff ok hash adding: - separately for pos and neg literals (g->hash_pos_atoms_bits and g->hash_neg_atoms_bits) - g->hash_pos_atoms_bits is a cvec with required hash pos combinations, each as a gint of bits correponding to positions - g->hash_pos_atoms_vecs is a cvec with els pointing to hashvec-s of corresponding bit/pos values - two hash systems: - bit/pos for top-level-ground atoms - non-var top-level prefix for non-top-level-ground - unifiable atoms of each active clause are entered to hash for all given bit/pos values where all corresp subterms are non-var - usage of stored hash for unification candidates: - pick active, say -p(a,X) | -p(X,b) | r(a,b) - search the p(a hash (bits 11) for all matches and use them like p(a,c), p(a,f(X)) ... - then search the p( hash (bits 1) for all matches and use them like p(X,c), p(X,X), ... - then search the univ list for all matches and use them like X(Y,Y), ... - easy for finding unifiable active ground atoms: pick the hash with most bits nonvar: this is the best option and covers all unifiable ground atoms like for p(a,X) pick p(a, for p(a,c) pick this - suppose we search unifiable atoms for p(a,X) - we have p(a,Z): need p(a - we have p(Y,b): need p( - p(a,Z) comes up twice: as p(a and as p( - how to skip p( case? p( would be needed for finding p(Y,b) where we have var at hash bit/pos - we could also mark handled cases - normally we search unifiable atoms for ground atoms? - idea: N-nonvar subcases 0 nonvar: full list 1 nonvar: hash over all nonvars 2 nonvar: hash over all nonvars search: p(X,Y): use 1 nonvar hash use 0 nonvar hash p(a,Y) use 2 nonvar hash to find p(a,Z) and p(Z,a) use 1 nonvar hash to find p(Z,U) use 0 nonvar hash to find W(U,V) - 0-var: ground case, full bit/pos hash storage - search: just look for max bit/pos combo - 1-var: store in possible bit/pos hashes - search: - ... - N/all-var: list - idea: N-len ground prefixes suppose we search unifiable atoms for p(a,X) - we have p(a,Z): need p(a - we have p(Y,b): need p( suppose we search unifiable atoms for p(U,V) - we have p(a,Z): need p( - we have p(Y,b): need p( suppose we search unifiable atoms for W(U,V) - we have p(a,Z): need full list - we have p(Y,b): need full list NB! no overlap in ground prefix lists: each atom in exactly one p(a,X) would be in 2-pref p(Y,X) would be in 1-pref U(Y,X) would be in 0-pref search unifiers for p(a,U): - search all hashes from 2-pref to lower should find p(a,V) in 2-pref p(Y,b) in 1-pref X(Y,c) in 0-pref search unifiers for p(X,Y): should find p(a,V) in 2-pref?? no, need to put p(a,V) to 1-pref as well --- two lists: pred hash and full --- full contains everything full is used only by X(a,b) cases with var pred predhash contains all with pred predhash is used by all p(X,Y) cases? how to find X(U,V) then? 
--- just pred hash list search unifiers for p(X,Y): normal search unifiers for U(X,Y): scan all active clauses */ int wr_cl_store_res_terms(glb* g, gptr cl) { void* db=g->db; int i; int len; int ruleflag; // 0 if not rule int poscount=0; // used only for pos/neg pref int negcount=0; // used only for pos/neg pref int posok=1; // default allow int negok=1; // default allow gint meta; gint atom; int negflag; // 1 if negative int termflag; // 1 if complex atom gint hash; int addflag=0; int negadded=0; int posadded=0; vec hashvec; int tmp; #ifdef DEBUG printf("cl_store_res_terms called on cl: "); wr_print_clause(g,cl); #endif // get clause data for input clause ruleflag=wg_rec_is_rule_clause(db,cl); if (ruleflag) len = wg_count_clause_atoms(db, cl); else len=1; // for negpref check out if negative literals present if (1) { // prohibit pos or neg if ((g->negpref_strat) || (g->pospref_strat)) { if (!ruleflag) { poscount=1; negcount=0; } else { poscount=0; negcount=0; for(i=0; inegpref_strat) { if (poscount>0 && negcount>0) posok=0; } if (g->pospref_strat) { if (poscount>0 && negcount>0) negok=0; } } } } #ifdef DEBUG printf("ruleflag %d len %d poscount %d negcount %d posok %d negok %d\n", ruleflag,len,poscount,negcount,posok,negok); #endif // loop over literals #if 0 /* XXX: FIXME */ #ifdef USE_CHILD_DB if (ruleflag) parent = wg_get_rec_base_offset(db,cl); #endif #endif for(i=0; inegpref_strat) || (g->pospref_strat)) || (negflag && (g->hyperres_strat)) || (negok && negflag && !negadded) || (posok && !negflag)) { if (negflag) negadded++; else posadded++; atom=wg_get_rule_clause_atom(db,cl,i); if (wg_get_encoded_type(db,atom)==WG_RECORDTYPE) { termflag=1; #if 0 /* XXX: FIXME */ #ifdef USE_CHILD_DB if(parent) atom=wg_encode_parent_data(parent,atom); #endif #endif } addflag=1; } } if (addflag) { hash=wr_atom_funhash(g,atom); #ifdef DEBUG printf("before adding to hash negflag: %d\n",negflag); #endif if (negflag) hashvec=rotp(g,g->hash_neg_atoms); else hashvec=rotp(g,g->hash_pos_atoms); tmp=wr_clterm_add_hashlist(g,hashvec,hash,atom,cl); if (tmp) { wr_sys_exiterr2int(g,"adding term to hashlist in cl_store_res_terms, code ",tmp); return 1; } #ifdef DEBUGHASH printf("\nhash table after adding:"); wr_clterm_hashlist_print(g,hashvec); printf("\npos hash table after adding:"); wr_clterm_hashlist_print(g,rotp(g,g->hash_pos_atoms)); printf("\nneg hash table after adding:"); wr_clterm_hashlist_print(g,rotp(g,g->hash_neg_atoms)); #endif } } #ifdef DEBUG printf("cl_store_res_terms finished\n"); #endif return 0; } int wr_cl_store_res_terms_new (glb* g, gptr cl) { void* db=g->db; int i; int len; int ruleflag; // 0 if not rule int poscount=0; // used only for pos/neg pref int negcount=0; // used only for pos/neg pref int posok=1; // default allow int negok=1; // default allow gint meta; gint atom; int negflag; // 1 if negative int termflag; // 1 if complex atom //gint hash; int addflag=0; int negadded=0; int posadded=0; vec hashvec; //void* hashdata; int tmp; //int hashposbits; #ifdef DEBUG printf("cl_store_res_terms called on cl: "); wr_print_clause(g,cl); #endif // get clause data for input clause ruleflag=wg_rec_is_rule_clause(db,cl); if (ruleflag) len = wg_count_clause_atoms(db, cl); else len=1; // for negpref check out if negative literals present if (1) { // prohibit pos or neg if ((g->negpref_strat) || (g->pospref_strat)) { if (!ruleflag) { poscount=1; negcount=0; } else { poscount=0; negcount=0; for(i=0; inegpref_strat) { if (poscount>0 && negcount>0) posok=0; } if (g->pospref_strat) { if (poscount>0 && 
negcount>0) negok=0; } } } } #ifdef DEBUG printf("ruleflag %d len %d poscount %d negcount %d posok %d negok %d\n", ruleflag,len,poscount,negcount,posok,negok); #endif // loop over literals #if 0 /* XXX: FIXME */ #ifdef USE_CHILD_DB if (ruleflag) parent = wg_get_rec_base_offset(db,cl); #endif #endif for(i=0; inegpref_strat) || (g->pospref_strat)) || (negflag && (g->hyperres_strat)) || (negok && negflag && !negadded) || (posok && !negflag)) { if (negflag) negadded++; else posadded++; atom=wg_get_rule_clause_atom(db,cl,i); if (wg_get_encoded_type(db,atom)==WG_RECORDTYPE) { termflag=1; #if 0 /* XXX: FIXME */ #ifdef USE_CHILD_DB if(parent) atom=wg_encode_parent_data(parent,atom); #endif #endif } addflag=1; } } if (addflag) { #ifdef DEBUG printf("before adding to hash negflag: %d\n",negflag); #endif if (negflag) hashvec=rotp(g,g->hash_neg_atoms); else hashvec=rotp(g,g->hash_pos_atoms); tmp=wr_term_hashstore(g,hashvec,atom,cl); if (tmp) { wr_sys_exiterr2int(g,"adding term to hashlist in cl_store_res_terms, code ",tmp); return 1; } #ifdef DEBUGHASH printf("\nhash table after adding:"); wr_clterm_hashlist_print(g,hashvec); printf("\npos hash table after adding:"); wr_clterm_hashdata_print(g,rotp(g,g->hash_pos_atoms)); printf("\nneg hash table after adding:"); wr_clterm_hashdata_print(g,rotp(g,g->hash_neg_atoms)); #endif } } #ifdef DEBUG printf("cl_store_res_terms finished\n"); #endif return 0; } /* ===================================================== top level hash storage funs for atom/clause ====================================================== */ int wr_term_hashstore(glb* g, void* hashdata, gint term, gptr cl) { void* db=g->db; int pos; int bits; unsigned int hash=0; gptr tptr; int tlen; int uselen; int spos; int epos; int preflen; int prefpos; int nonvarflag; gint el; int thash; int tmp; int i; gint hashposbits=3; // bits 11 //gptr nonvarhashbitset; //int nonvarhashbitsetsize; int nonvarhashbitsetsize=2; gint nonvarhashbitset[2]; int maxhashpos=MAXHASHPOS; gint hasharr[MAXHASHPOS]; nonvarhashbitset[0]=1; nonvarhashbitset[1]=3; // find the basic props of atom for hashing tptr=decode_record(db,term); tlen=get_record_len(tptr); uselen=tlen; if (g->unify_maxuseterms) { if (((g->unify_maxuseterms)+(g->unify_firstuseterm))unify_firstuseterm)+(g->unify_maxuseterms); } spos=RECORD_HEADER_GINTS+(g->unify_firstuseterm); epos=RECORD_HEADER_GINTS+uselen; // loop over atom preflen=0; nonvarflag=1; for(pos=spos, preflen=0; pos=epos) { // fully non-var top level // precalc hash vals for positions for(pos=spos, prefpos=0; pos0 && pos>1, ++pos) { if (bits & 1) { el=*(tptr+pos); thash=wr_term_basehash(g,el); //hash=hash+thash; hash = thash + (hash << 6) + (hash << 16) - hash; } } */ // loop over hashbitset, compute complex hash and store for(i=0;i0 && pos>1, ++pos, ++preflen) { if (bits & 1) { thash=hasharr[preflen]; //hash=hash+thash; hash = thash + (hash << 6) + (hash << 16) - hash; } } if (hash<0) hash=0-hash; hash=(1+(hash%(NROF_CLTERM_HASHVEC_ELS-2))); // here store! tmp=1; // dummy //tmp=wr_clterm_add_hashlist(g,hashvec,hash,atom,cl); if (tmp) { //wr_sys_exiterr2int(g,"adding toplevel-nonvar term to hashlist in wr_term_hashstore, code ",tmp); return 1; } } return 0; } // now we have vars in top level: use predicate hashtables if (1) { el=get_field(tptr,(g->unify_funpos)); thash=wr_term_basehash(g,el); // here store! 
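  /* Vars occur at the top level, so the bit/pos hashes computed above are not
     usable here: the atom is chained under the basehash of its
     function/predicate symbol (the g->unify_funpos field) instead. */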
//tmp=1; // dummy tmp=wr_clterm_add_hashlist(g,(vec)hashdata,thash,term,cl); if (tmp) { //wr_sys_exiterr2int(g,"adding toplevel-nonvar term to hashlist in wr_term_hashstore, code ",tmp); return 1; } } // everything ok return 0; } gint wr_term_complexhash(glb* g, gint* hasharr, gint hashposbits, gint term) { int pos; int bits; unsigned int hash=0; gptr tptr; int tlen; int uselen; int spos; int epos; gint el; int thash; #ifdef DEBUG printf("wr_term_complexhash called with term %d bits %d \n",term,hashposbits); #endif tptr=decode_record(g->db,term); tlen=get_record_len(tptr); uselen=tlen; if (g->unify_maxuseterms) { if (((g->unify_maxuseterms)+(g->unify_firstuseterm))unify_firstuseterm)+(g->unify_maxuseterms); } spos=RECORD_HEADER_GINTS+(g->unify_firstuseterm); epos=RECORD_HEADER_GINTS+uselen; // first check if hashable (ie non-vars at hash positions) for(bits=hashposbits, pos=spos; bits>0 && pos>1, ++pos) { if (bits & 1) { el=*(tptr+pos); if (isvar(el)) return 0; } } // we know term is hashable: compute hash for(bits=hashposbits, pos=spos; bits>0 && pos>1, ++pos) { if (bits & 1) { el=*(tptr+pos); thash=wr_term_basehash(g,el); //hash=hash+thash; hash = thash + (hash << 6) + (hash << 16) - hash; } } if (hash<0) hash=0-hash; #ifdef DEBUG printf("wr_term_complexhash computed hash %d using NROF_CLTERM_HASHVEC_ELS-2 %d gives final res %d \n", hash,NROF_CLTERM_HASHVEC_ELS-2,1+(hash%(NROF_CLTERM_HASHVEC_ELS-2))); #endif return (gint)(1+(hash%(NROF_CLTERM_HASHVEC_ELS-2))); } gint wr_atom_funhash(glb* g, gint atom) { void* db=g->db; gint fun; gint chash; #if 0 /* XXX: FIXME */ #ifdef USE_CHILD_DB gint parent; parent=wg_get_rec_base_offset(db,cl); if(parent) enc=wg_encode_parent_data(parent, enc); #endif #endif //printf("wr_atom_funhash called\n"); fun=get_field(decode_record(db,atom),(g->unify_funpos)); //printf("fun %d\n",fun); chash=wr_term_basehash(g,fun); //printf("chash %d\n",chash); return chash; } /* ===================================================== proper hash funs for atoms and terms ====================================================== */ gint wr_term_basehash(glb* g, gint enc) { void* db=g->db; int hash; int intdata; char *strdata, *exdata; double doubledata; #ifdef DEBUG printf("wr_termhash called with enc %d visually ", enc); wr_print_simpleterm_otter(g,enc,(g->print_clause_detaillevel)); printf("\n"); printf("wg_get_encoded_type %d\n",wg_get_encoded_type(db,enc)); #endif switch(wg_get_encoded_type(db, enc)) { case WG_NULLTYPE: hash=0; break; case WG_INTTYPE: intdata = wg_decode_int(db, enc); hash=int_hash(intdata); break; case WG_DOUBLETYPE: doubledata = wg_decode_double(db, enc); hash=double_hash(doubledata); break; case WG_STRTYPE: strdata = wg_decode_unistr(db,enc,WG_STRTYPE); hash=str_hash(strdata); break; case WG_URITYPE: strdata = wg_decode_unistr(db, enc,WG_URITYPE); exdata = wg_decode_unistr_lang(db, enc,WG_URITYPE); hash=str_dual_hash(strdata,exdata); break; case WG_XMLLITERALTYPE: strdata = wg_decode_unistr(db,enc,WG_XMLLITERALTYPE); exdata = wg_decode_unistr_lang(db,enc,WG_XMLLITERALTYPE); hash=str_dual_hash(strdata,exdata); break; case WG_CHARTYPE: hash=int_hash(enc); break; case WG_DATETYPE: hash=int_hash(enc); break; case WG_TIMETYPE: hash=int_hash(enc); break; case WG_VARTYPE: hash=int_hash(0); break; case WG_ANONCONSTTYPE: hash=int_hash(enc); break; case WG_RECORDTYPE: // ptrdata = (gint) wg_decode_record(db, enc); // wg_print_subrecord_otter(db,(gint*)ptrdata); hash=1; break; default: hash=2; break; } if (hash<0) hash=0-hash; #ifdef DEBUG printf("wr_termhash 
computed hash %d using NROF_CLTERM_HASHVEC_ELS-2 %d gives final res %d \n", hash,NROF_CLTERM_HASHVEC_ELS-2,0+(hash%(NROF_CLTERM_HASHVEC_ELS-2))); #endif return (gint)(0+(hash%(NROF_CLTERM_HASHVEC_ELS-2))); } static int int_hash(int x) { unsigned int a; if (x>=0 && x>17); a -= (a<<9); a ^= (a<<4); a -= (a<<3); a ^= (a<<10); a ^= (a>>15); return (int)a; } static int double_hash(double x) { if (x==(double)0) return 20; return int_hash((int)(x*1000)); } static int str_hash(char* x) { unsigned long hash = 0; int c; if (x!=NULL) { while(1) { c = (int)(*x); if (!c) break; hash = c + (hash << 6) + (hash << 16) - hash; x++; } } return (int)hash; } static int str_dual_hash(char* x, char* y) { unsigned long hash = 0; int c; //printf("x %s y %s\n",x,y); if (x!=NULL) { while(1) { c = (int)(*x); if (!c) break; hash = c + (hash << 6) + (hash << 16) - hash; x++; } } if (y!=NULL) { while(1) { c = (int)(*y); if (!c) break; hash = c + (hash << 6) + (hash << 16) - hash; y++; } } return (int)hash; } /* ===================================================== storage to hashdata ====================================================== */ int wr_clterm_add_hashlist(glb* g, vec hashvec, gint hash, gint term, gptr cl) { void* db=g->db; gint vlen; gint cell; gptr node; gptr prevnode; gint nextnode; vlen=VEC_LEN(hashvec); if (hash>=vlen || hash<1) return 1; // err case cell=hashvec[hash]; if (cell==0) { // no hash chain yet: add first len-containing node prevnode=wr_clterm_alloc_hashnode(g); if (prevnode==NULL) { wr_sys_exiterr(g,"could not allocate node for hashlist in cl_store_res_terms"); return 1; } hashvec[hash]=pto(db,prevnode); prevnode[CLTERM_HASHNODE_LEN_POS]=1; nextnode=0; } else { // hash chain exists: first node contains counter to increase // then take next ptr for node to handle prevnode=otp(db,cell); prevnode[CLTERM_HASHNODE_LEN_POS]++; nextnode=prevnode[CLTERM_HASHNODE_NEXT_POS]; } // make new node and add to chain node=wr_clterm_alloc_hashnode(g); if (node==NULL) { wr_sys_exiterr(g,"could not allocate node for hashlist in cl_store_res_terms"); return 1; } node[CLTERM_HASHNODE_TERM_POS]=term; node[CLTERM_HASHNODE_CL_POS]=pto(db,cl); node[CLTERM_HASHNODE_NEXT_POS]=nextnode; prevnode[CLTERM_HASHNODE_NEXT_POS]=pto(db,node); return 0; } int wr_clterm_add_hashlist_new (glb* g, vec hashvec, gint hash, gint term, gptr cl) { void* db=g->db; gint vlen; gint cell; gptr node; gptr prevnode; gint nextnode; vlen=VEC_LEN(hashvec); if (hash>=vlen || hash<1) return 1; // err case cell=hashvec[hash]; if (cell==0) { // no hash chain yet: add first len-containing node prevnode=wr_clterm_alloc_hashnode(g); if (prevnode==NULL) { wr_sys_exiterr(g,"could not allocate node for hashlist in cl_store_res_terms"); return 1; } hashvec[hash]=pto(db,prevnode); prevnode[CLTERM_HASHNODE_LEN_POS]=1; nextnode=0; } else { // hash chain exists: first node contains counter to increase // then take next ptr for node to handle prevnode=otp(db,cell); prevnode[CLTERM_HASHNODE_LEN_POS]++; nextnode=prevnode[CLTERM_HASHNODE_NEXT_POS]; } // make new node and add to chain node=wr_clterm_alloc_hashnode(g); if (node==NULL) { wr_sys_exiterr(g,"could not allocate node for hashlist in cl_store_res_terms"); return 1; } node[CLTERM_HASHNODE_TERM_POS]=term; node[CLTERM_HASHNODE_CL_POS]=pto(db,cl); node[CLTERM_HASHNODE_NEXT_POS]=nextnode; prevnode[CLTERM_HASHNODE_NEXT_POS]=pto(db,node); return 0; } int wr_clterm_hashlist_len(glb* g, vec hashvec, gint hash) { gint vlen; gint cell; vlen=VEC_LEN(hashvec); if (hash>=vlen || hash<1) return -1; // err case 
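  /* the first node of a non-empty hash chain caches the chain length in
     CLTERM_HASHNODE_LEN_POS (maintained by wr_clterm_add_hashlist) */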
cell=hashvec[hash]; if (cell==0) return 0; return (int)((rotp(g,cell))[CLTERM_HASHNODE_LEN_POS]); } gint wr_clterm_hashlist_start(glb* g, vec hashvec, gint hash) { gint vlen; gint cell; printf("wr_clterm_hashlist_start len %d hash %d\n",VEC_LEN(hashvec),hash); vlen=VEC_LEN(hashvec); if (hash>=vlen || hash<1) return -1; // err case cell=hashvec[hash]; if (cell==0) return 0; // empty case return (rotp(g,cell))[CLTERM_HASHNODE_NEXT_POS]; } gint wr_clterm_hashlist_next(glb* g, vec hashvec, gint lastel) { return (rotp(g,lastel))[CLTERM_HASHNODE_NEXT_POS]; } gptr wr_clterm_alloc_hashnode(glb* g) { return sys_malloc(sizeof(gint)*CLTERM_HASHNODE_GINT_NR); } void wr_clterm_free_hashnode(glb* g,gptr node) { sys_free(node); } void wr_clterm_hashlist_free(glb* g, vec hashvec) { void* db=g->db; gint vlen; gint node; gint nextnode; int i; vlen=VEC_LEN(hashvec); //printf("\nhashvec len %d and els:\n",vlen); for(i=VEC_START;i. * */ /** @file clstore.h * Headers for clause storage functions. */ #ifndef DEFINED_CLSTORE_H #define DEFINED_CLSTORE_H /* ==== Includes ==== */ #include "types.h" #include "glb.h" /* ==== Global defines ==== */ #define CLTERM_HASHNODE_GINT_NR 3 #define CLTERM_HASHNODE_LEN_POS 0 #define CLTERM_HASHNODE_TERM_POS 0 #define CLTERM_HASHNODE_CL_POS 1 #define CLTERM_HASHNODE_NEXT_POS 2 #define MAXHASHPOS 30 /* ==== Protos ==== */ void wr_push_clpickstack_cl(glb* g, gptr cl); void wr_show_clpickstack(glb* g); void wr_push_clqueue_cl(glb* g, gptr cl); void wr_show_clqueue(glb* g); void wr_push_clactive_cl(glb* g, gptr cl); void wr_show_clactive(glb* g); int wr_cl_store_res_terms(glb* g, gptr cl); int wr_term_hashstore(glb* g, void* hashdata, gint atom, gptr cl); gint wr_term_complexhash(glb* g, gint* hasharr, gint hashposbits, gint term); gint wr_atom_funhash(glb* g, gint atom); gint wr_term_basehash(glb* g, gint enc); int wr_clterm_add_hashlist(glb* g, vec hashvec, gint hash, gint term, gptr cl); int wr_clterm_hashlist_len(glb* g, vec hashvec, gint hash); gint wr_clterm_hashlist_start(glb* g, vec hashvec, gint hash); gint wr_clterm_hashlist_next(glb* g, vec hashvec, gint lastel); gptr wr_clterm_alloc_hashnode(glb* g); void wr_clterm_free_hashnode(glb* g, gptr node); void wr_clterm_hashlist_free(glb* g, vec hashvec); void wr_clterm_hashlist_print(glb* g, vec hashvec); #endif whitedb-0.7.2/Reasoner/clterm.c000066400000000000000000000076151226454622500163770ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file clterm.c * Procedures for building clauses/terms and fetching parts. 
* */ /* ====== Includes =============== */ #include #include #include #ifdef __cplusplus extern "C" { #endif #include "rincludes.h" /* ====== Private headers and defs ======== */ /* ======= Private protos ================ */ /* ====== Functions ============== */ /* ---------------- wr functions --------------------- */ gptr wr_create_raw_record(glb* g, gint length, gint meta, gptr buffer) { gptr rec; if (buffer==NULL) { rec=wg_create_raw_record(g->db,length); if (rec==NULL) return NULL; rec[RECORD_META_POS]=meta; } else { rec=wr_alloc_from_cvec(g,buffer,length+RECORD_HEADER_GINTS); if (rec==NULL) return NULL; rec[0]=(length+RECORD_HEADER_GINTS)*sizeof(gint); rec[RECORD_BACKLINKS_POS]=0; rec[RECORD_META_POS]=meta; //printf("wr_create_raw_record created: \n"); //wg_print_record(g->db,rec); } return rec; } /* ---------------- wg functions --------------------- */ void* wr_create_rule_clause(glb* g, int litnr) { void* res; res=wg_create_raw_record(g->db,CLAUSE_EXTRAHEADERLEN+(LIT_WIDTH*litnr)); //printf("meta %d",*((gint*)res+RECORD_META_POS)); *((gint*)res+RECORD_META_POS)=(RECORD_META_NOTDATA | RECORD_META_RULE_CLAUSE); return res; } void* wr_create_fact_clause(glb* g, int litnr) { void* res; res=wg_create_raw_record(g->db,(g->unify_firstuseterm)+litnr+(g->unify_footerlen)); *((gint*)res+RECORD_META_POS)=RECORD_META_FACT_CLAUSE; return res; } void* wr_create_atom(glb* g, int termnr) { void* res; res=wg_create_raw_record(g->db,(g->unify_firstuseterm)+termnr+(g->unify_footerlen)); *((gint*)res+RECORD_META_POS)=(RECORD_META_NOTDATA | RECORD_META_ATOM); return res; } void* wr_create_term(glb* g, int termnr) { void* res; res=wg_create_raw_record(g->db,(g->unify_firstuseterm)+termnr+(g->unify_footerlen)); *((gint*)res+RECORD_META_POS)=(RECORD_META_NOTDATA | RECORD_META_TERM); return res; } void* wr_convert_atom_fact_clause(glb* g, void* atom, int isneg) { void* res; res=atom; *((gint*)res+RECORD_META_POS)=(RECORD_META_ATOM | RECORD_META_FACT_CLAUSE); return res; } int wr_set_rule_clause_atom(glb* g, void* clause, int litnr, gint atom) { //wg_set_new_field(db,clause,CLAUSE_EXTRAHEADERLEN+(LIT_WIDTH*litnr)+1,atom); *((gint*)clause+RECORD_HEADER_GINTS+CLAUSE_EXTRAHEADERLEN+(LIT_WIDTH*litnr)+1)=atom; return 0; } int wr_set_rule_clause_atom_meta(glb* g, void* clause, int litnr, gint meta) { //wg_set_new_field(db,clause,CLAUSE_EXTRAHEADERLEN+(LIT_WIDTH*litnr),meta); *((gint*)clause+RECORD_HEADER_GINTS+CLAUSE_EXTRAHEADERLEN+(LIT_WIDTH*litnr))=meta; return 0; } int wr_set_atom_subterm(glb* g, void* atom, int termnr, gint subterm) { wg_set_new_field(g->db,atom,(g->unify_firstuseterm)+termnr,subterm); return 0; } int wr_set_term_subterm(glb* g, void* term, int termnr, gint subterm) { wg_set_new_field(g->db,term,(g->unify_firstuseterm)+termnr,subterm); return 0; } #ifdef __cplusplus } #endif whitedb-0.7.2/Reasoner/clterm.h000066400000000000000000000107151226454622500163770ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file clterm.h * Procedures for building clauses/terms and fetching parts. */ #ifndef DEFINED_CLTERM_H #define DEFINED_CLTERM_H #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "../Db/dballoc.h" #include "types.h" #include "glb.h" /* ============= term and record header structure =============== */ /* // Record header structure. Position 0 is always reserved // for size. #define RECORD_HEADER_GINTS 3 #define RECORD_META_POS 1 // metainfo, reserved for future use / #define RECORD_BACKLINKS_POS 2 // backlinks structure offset / #define LITTLEENDIAN 1 ///< (intel is little-endian) difference in encoding tinystr //#define USETINYSTR 1 ///< undef to prohibit usage of tinystr // Record meta bits. #define RECORD_META_NOTDATA 0x1 // Record is a "special" record (not data) #define RECORD_META_MATCH 0x2 // "match" record (needs NOTDATA as well) #define is_special_record(r) (*((gint *) r + RECORD_META_POS) &\ RECORD_META_NOTDATA) */ #define CLAUSE_EXTRAHEADERLEN 1 //#define TERM_EXTRAHEADERLEN 1 // nr of gints in datarec before terms start // we use (g->unify_firstuseterm) instead with the same meaning //#define RECORD_META_NOTDATA 0x1 // Record is a "special" record (not data) //#define RECORD_META_MATCH 0x2 #define RECORD_META_RULE_CLAUSE (1<<3) // should be notdata as well #define RECORD_META_FACT_CLAUSE (1<<4) // should be notdata as well #define RECORD_META_ATOM (1<<5) // should be notdata as well #define RECORD_META_TERM (1<<6) // should be notdata as well #define ATOM_META_NEG encode_smallint(1) /* ============= external funs defs ============ */ /* ==== macros ===== */ #define LIT_WIDTH 2 //meta gint plus atom gint is width 2 #define LIT_META_POS 0 #define LIT_ATOM_POS 1 #define get_field(r,n) (*(((gint*)(r))+RECORD_HEADER_GINTS+(n))) #define set_field(r,n,d) (*(((gint*)record)+RECORD_HEADER_GINTS+fieldnr)=(d)) #define get_record_len(r) (((gint)(getusedobjectwantedgintsnr(*((gint*)(r)))))-RECORD_HEADER_GINTS) #define decode_record(db,d) ((void*)(offsettoptr(db,decode_datarec_offset((d))))) #define encode_record(db,d) ((gint)(encode_datarec_offset(ptrtooffset((db),(d))))) #define wg_rec_is_rule_clause(db,rec) (*((gint*)(rec)+RECORD_META_POS) & RECORD_META_RULE_CLAUSE) #define wg_rec_is_fact_clause(db,rec) (*((gint*)(rec)+RECORD_META_POS) & RECORD_META_FACT_CLAUSE) #define wg_rec_is_atom_rec(db,rec) (*((gint*)(rec)+RECORD_META_POS) & RECORD_META_ATOM) #define wg_rec_is_term_rec(db,rec) (*((gint*)(rec)+RECORD_META_POS) & RECORD_META_TERM) #define wg_get_rule_clause_atom_meta(db,rec,litnr) get_field((rec), (CLAUSE_EXTRAHEADERLEN+((litnr)*LIT_WIDTH))) #define wg_get_rule_clause_atom(db,rec,litnr) get_field((rec), (CLAUSE_EXTRAHEADERLEN+((litnr)*LIT_WIDTH)+1)) #define wg_atom_meta_is_neg(db,meta) ((meta) & ATOM_META_NEG) #define litmeta_negpolarities(meta1,meta2) (((meta1) & ATOM_META_NEG)!=((meta2) & ATOM_META_NEG)) #define wg_count_clause_atoms(db,clause) ((get_record_len((clause))-CLAUSE_EXTRAHEADERLEN)/LIT_WIDTH) /* ==== Protos ==== */ gptr wr_create_raw_record(glb* g, gint length, gint meta, gptr buffer); void* wr_create_rule_clause(glb* g, int litnr); void* wr_create_fact_clause(glb* g, int litnr); void* wr_create_atom(glb* g, int termnr); void* wr_create_term(glb* g, int termnr); void* wr_convert_atom_fact_clause(glb* g, void* atom, int isneg); int wr_set_rule_clause_atom(glb* g, void* 
clause, int litnr, gint atom); int wr_set_rule_clause_atom_meta(glb* g, void* clause, int litnr, gint meta); int wr_set_atom_subterm(glb* g, void* atom, int termnr, gint subterm); int wr_set_term_subterm(glb* g, void* term, int termnr, gint subterm); #endif whitedb-0.7.2/Reasoner/derive.c000066400000000000000000000325151226454622500163640ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file derive.c * Clause derivation functions. */ /* ====== Includes =============== */ #ifdef __cplusplus extern "C" { #endif #include "rincludes.h" /* ====== Private defs =========== */ #define DEBUG //#undef DEBUG /* ====== Private headers ======== */ /* ====== Functions ============== */ void wr_process_resolve_result(glb* g, gint xatom, gptr xcl, gint yatom, gptr ycl) { void* db=g->db; int xisrule,yisrule,xatomnr,yatomnr; int rlen; int i,tmp; gptr rptr; int rpos; gptr res; gint meta; gint blt; int ruleflag,datalen; //int clstackflag; int partialresflag; gint given_termbuf_storednext; #ifdef DEBUG printf("wr_process_resolve_result called\n"); wr_print_clause(g,xcl); printf(" : ");wr_print_term(g,xatom); printf("\n"); wr_print_clause(g,ycl); printf(" : ");wr_print_term(g,yatom); printf("\n"); wr_print_vardata(g); #endif ++(g->stat_derived_cl); ++(g->stat_binres_derived_cl); // get basic info about clauses xisrule=wg_rec_is_rule_clause(db,xcl); yisrule=wg_rec_is_rule_clause(db,ycl); if (xisrule) xatomnr=wg_count_clause_atoms(db,xcl); else xatomnr=1; if (yisrule) yatomnr=wg_count_clause_atoms(db,ycl); else yatomnr=1; // reserve sufficient space in derived_termbuf for simple sequential store of atoms: // no top-level meta kept rlen=(xatomnr+yatomnr-2)*LIT_WIDTH; if (rlen==0) { g->proof_found=1; return; } (g->derived_termbuf)[1]=2; // init termbuf rptr=wr_alloc_from_cvec(g,g->derived_termbuf,rlen); if (rptr==NULL) { ++(g->stat_internlimit_discarded_cl); wr_alloc_err(g,"could not alloc first buffer in wr_process_resolve_result "); return; // could not alloc memory, could not store clause } //printf("xisrule %d yisrule %d xatomnr %d yatomnr %d rlen %d\n", // xisrule,yisrule,xatomnr,yatomnr,rlen); // set up var rename params wr_process_resolve_result_setupsubst(g); // store all ready-built atoms sequentially, excluding duplicates // and looking for tautology: only for rule clauses needed rpos=0; if (xatomnr>1) { tmp=wr_process_resolve_result_aux(g,xcl,xatom,xatomnr,rptr,&rpos); if (!tmp) { wr_process_resolve_result_cleanupsubst(g); return; } } if (yatomnr>1) { tmp=wr_process_resolve_result_aux(g,ycl,yatom,yatomnr,rptr,&rpos); if (!tmp) { wr_process_resolve_result_cleanupsubst(g); return; } } wr_process_resolve_result_cleanupsubst(g); if (rpos==0) { g->proof_found=1; return; } // now we have stored all subst-into and renamed metas/atoms into rptr: build clause //printf("filled meta/atom 
new vec, rpos %d\n",rpos); // check whether should be stored as a ruleclause or not ruleflag=wr_process_resolve_result_isrulecl(g,rptr,rpos); // create new record if ((g->hyperres_strat) && !wr_hyperres_satellite_tmpres(g,rptr,rpos)){ partialresflag=1; //wr_process_resolve_result_setupclpickstackcopy(g); wr_process_resolve_result_setupgivencopy(g); // store buffer pos to be restored later given_termbuf_storednext=g->build_buffer; //g->build_buffer=malloc(1024); //given_termbuf_storednext=CVEC_NEXT(g->given_termbuf); g->build_buffer=wr_cvec_new(g,1000); } else { partialresflag=0; wr_process_resolve_result_setupquecopy(g); } if (ruleflag) { meta=RECORD_META_RULE_CLAUSE; datalen=rpos*LIT_WIDTH; //printf("meta %d headerlen %d datalen %d\n",meta,headerlen,datalen); res=wr_create_raw_record(g,CLAUSE_EXTRAHEADERLEN+datalen,meta,g->build_buffer); if (res==NULL) { ++(g->stat_internlimit_discarded_cl); wr_alloc_err(g,"could not alloc raw record in wr_process_resolve_result "); return; } for(i=RECORD_HEADER_GINTS;i<(RECORD_HEADER_GINTS+CLAUSE_EXTRAHEADERLEN);i++) { res[i]=0; } for(i=0;istat_internlimit_discarded_cl); wr_alloc_err(g,"could not build new atom blt in wr_process_resolve_result "); return; } res[tmp+RECORD_HEADER_GINTS+CLAUSE_EXTRAHEADERLEN+LIT_ATOM_POS]=blt; } ++(g->stat_built_cl); } else { meta=RECORD_META_FACT_CLAUSE; //if (partialresflag) blt=wr_build_calc_term(g,rptr[LIT_ATOM_POS]); //else blt=rptr[LIT_ATOM_POS]; blt=wr_build_calc_term(g,rptr[LIT_ATOM_POS]); if (blt==WG_ILLEGAL) { ++(g->stat_internlimit_discarded_cl); wr_alloc_err(g,"could not build new atom blt in wr_process_resolve_result "); return; } res=otp(db,blt); res[RECORD_META_POS]=meta; ++(g->stat_built_cl); } #ifdef DEBUG printf("\nwr_process_resolve_result generated a clause \n"); wg_print_record(db,res); printf("\n"); #endif // now the resulting clause is fully built if ((g->hyperres_strat) && !wr_hyperres_satellite_cl(g,res)) { ++(g->stat_hyperres_partial_cl); if (g->print_partial_derived_cl) { printf("+ partial derived: "); wr_print_clause(g,res); } //wr_push_clpickstack_cl(g,res); wr_clear_varstack(g,g->varstack); //wr_clear_all_varbanks(g); //wr_print_vardata(g); wr_resolve_binary_all_active(g,res); // restore buffer pos to situation before building the current clause wr_vec_free(g,g->build_buffer); g->build_buffer=given_termbuf_storednext; //CVEC_NEXT(g->given_termbuf)=given_termbuf_storednext; } else { ++(g->stat_kept_cl); if (g->print_derived_cl) { printf("+ derived: "); wr_print_clause(g,res); } // push built clause into suitable list wr_push_clqueue_cl(g,res); } } int wr_process_resolve_result_isrulecl(glb* g, gptr rptr, int rpos) { void* db; int stopflag,ruleflag, len, i; gint meta, atom, term; gptr atomptr; if (rpos!=1) { return 1; } else { // only clauses of len 1 check further db=g->db; stopflag=0; ruleflag=1; meta=rptr[LIT_META_POS]; atom=rptr[LIT_ATOM_POS]; if (isdatarec(atom) && !wg_atom_meta_is_neg(db,meta)) { atomptr=decode_record(db,atom); len=get_record_len(atomptr); for(i=(g->unify_firstuseterm); ibuild_subst=1; // subst var values into vars g->build_calc=0; // do fun and pred calculations g->build_dcopy=0; // copy nonimmediate data (vs return ptr) //g->build_buffer=NULL; // build everything into tmp buffer (vs main area) (g->given_termbuf)[1]=2; // reuse given_termbuf g->build_buffer=g->derived_termbuf; //g->build_buffer=g->queue_termbuf; g->build_rename=1; // do var renaming g->build_rename_maxseenvnr=-1; // tmp var for var renaming g->build_rename_vc=0; // tmp var for var renaming 
g->build_rename_banknr=3; // nr of bank of created vars // points to bank of created vars g->build_rename_bank=(g->varbanks)+((g->build_rename_banknr)*NROF_VARSINBANK); g->use_comp_funs=g->use_comp_funs_strat; } void wr_process_resolve_result_cleanupsubst(glb* g) { int i; for(i=0;ibuild_rename_vc;i++) { (g->build_rename_bank)[i]=UNASSIGNED; } } void wr_process_resolve_result_setupgivencopy(glb* g) { g->build_subst=0; // subst var values into vars g->build_calc=0; // do fun and pred calculations g->build_dcopy=0; // copy nonimmediate data (vs return ptr) //g->build_buffer=NULL; // build everything into tmp buffer (vs main area) //(g->given_termbuf)[1]=2; // reuse given_termbuf //g->build_buffer=g->given_termbuf; //g->build_buffer=g->given_termbuf; //g->build_buffer=NULL; // PROBLEM WAS HERE: given_termbuf not ok here g->build_rename=0; // do var renaming g->use_comp_funs=0; } void wr_process_resolve_result_setupquecopy(glb* g) { g->build_subst=0; // subst var values into vars g->build_calc=0; // do fun and pred calculations g->build_dcopy=0; // copy nonimmediate data (vs return ptr) //g->build_buffer=NULL; // build everything into tmp buffer (vs main area) //(g->given_termbuf)[1]=2; // reuse given_termbuf g->build_buffer=g->queue_termbuf; g->build_rename=0; // do var renaming g->use_comp_funs=0; } void wr_process_resolve_result_setupclpickstackcopy(glb* g) { g->build_subst=0; // subst var values into vars g->build_calc=0; // do fun and pred calculations g->build_dcopy=0; // copy nonimmediate data (vs return ptr) //g->build_buffer=NULL; // build everything into tmp buffer (vs main area) //(g->given_termbuf)[1]=2; // reuse given_termbuf g->build_buffer=g->queue_termbuf; g->build_rename=0; // do var renaming g->use_comp_funs=0; } int wr_process_resolve_result_aux (glb* g, gptr cl, gint cutatom, int atomnr, gptr rptr, int* rpos){ //void *db=g->db; int i,j; int posfoundflag; gint meta,atom,newatom,rmeta; #ifdef DEBUG printf("\nwr_process_resolve_result_aux called on atomnr %d\n",atomnr); wr_print_term(g,cutatom); #endif for(i=0;istat_internlimit_discarded_cl); wr_alloc_err(g,"could not build subst newatom in wr_process_resolve_result "); return 0; // could not alloc memory, could not store clause } if (newatom==ACONST_TRUE) { if (wg_atom_meta_is_neg(db,meta)) continue; else return 0; } if (newatom==ACONST_FALSE) { if (wg_atom_meta_is_neg(db,meta)) return 0; else continue; } posfoundflag=0; // check if xatom present somewhere earlier for(j=0;j < *rpos;j++){ if (wr_equal_term(g,newatom,rptr[(j*LIT_WIDTH)+LIT_ATOM_POS],1)) { rmeta=rptr[(j*LIT_WIDTH)+LIT_META_POS]; if (!litmeta_negpolarities(meta,rmeta)) { //same sign, drop lit posfoundflag=1; printf("\nequals found:\n"); wr_print_term(g,newatom); printf("\n"); wr_print_term(g,rptr[(j*LIT_WIDTH)+LIT_ATOM_POS]); printf("\n"); break; } else { printf("\nin wr_process_resolve_result_aux return 0\n"); // negative sign, tautology, drop clause return 0; } } } if (!posfoundflag) { // store lit rptr[((*rpos)*LIT_WIDTH)+LIT_META_POS]=meta; rptr[((*rpos)*LIT_WIDTH)+LIT_ATOM_POS]=newatom; ++(*rpos); } } printf("\nwr_process_resolve_result_aux gen clause:\n"); wr_print_clause(g,rptr); return 1; // 1 means clause is still ok. 
0 return means: drop clause } /** satellite is a fully-built hypperesolution result, not temporary result */ int wr_hyperres_satellite_cl(glb* g,gptr cl) { int len; int i; gint meta; if (cl==NULL) return 0; if (!wg_rec_is_rule_clause(g->db,cl)) { // fact clause (hence always positive) if (g->negpref_strat) return 1; else return 0; } else { // rule clause: check if contains only non-preferred len=wg_count_clause_atoms(g->db,cl); for(i=0;idb,cl,i); if (wg_atom_meta_is_neg(g->db,meta)) { if (g->negpref_strat) return 0; } else { if (g->pospref_strat) return 0; } } return 1; } } /** satellite is a fully-built hypperesolution result, not temporary result */ int wr_hyperres_satellite_tmpres(glb* g,gptr tmpres, int respos) { int i; gint tmeta; for(i=0;idb,tmeta)) { if (g->negpref_strat) return 0; } else { if (g->pospref_strat) return 0; } } return 1; } #ifdef __cplusplus } #endif whitedb-0.7.2/Reasoner/derive.h000066400000000000000000000040561226454622500163700ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file derive.h * Headers for clause derivation functions. */ #ifndef DEFINED_DERIVE_H #define DEFINED_DERIVE_H /* ==== Includes ==== */ #include "types.h" #include "glb.h" /* ==== Global defines ==== */ /* ==== Protos ==== */ void wr_process_resolve_result(glb* g, gint xatom, gptr xcl, gint yatom, gptr ycl); int wr_process_resolve_result_isrulecl(glb* g, gptr rptr, int rpos); void wr_process_resolve_result_setupsubst(glb* g); void wr_process_resolve_result_setupgivencopy(glb* g); void wr_process_resolve_result_setupquecopy(glb* g); void wr_process_resolve_result_setupclpickstackcopy(glb* g) ; void wr_process_resolve_result_cleanupsubst(glb* g); int wr_process_resolve_result_aux (glb* g, gptr cl, gint cutatom, int atomnr, gptr rptr, int* rpos); int wr_hyperres_satellite_cl(glb* g,gptr cl); int wr_hyperres_satellite_tmpres(glb* g,gptr tmpres, int respos); // void resolve_binary_all_active(gptr cl1); /* void resolve_binary(gptr cl1, gptr cl2); gptr factor_step(gptr incl); int simplify_cl_destr(gptr cl, int given_flag); int can_cut_lit(gptr litpt1, int unify_flag, int given_flag); void proc_derived_cl(gptr incl); void proc_derived_cl_binhist(gptr incl, gint clid1, gint clid2, gint litpos1, gint litpos2); void proc_input_cl(gptr incl); */ #endif whitedb-0.7.2/Reasoner/glb.c000066400000000000000000000222551226454622500156520ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. 
* * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file glb.c * Reasoner globals. * */ /* ====== Includes =============== */ #include #include #include #ifdef __cplusplus extern "C" { #endif #include "rincludes.h" /* ====== Private headers and defs ======== */ /* ======= Private protos ================ */ /* ====== Functions ============== */ /** Creates and fills in the glb structure. * */ glb* wr_glb_new_full(void* db) { glb* g; int tmp; g=wr_glb_new_simple(db); if (g==NULL) return NULL; tmp=wr_glb_init_shared_complex(g); // creates and fills in shared tables, substructures, etc if (tmp) { wr_glb_free_shared_complex(g); sys_free(g); return NULL; } tmp=wr_glb_init_local_complex(g); // creates and fills in local tables, substructures, etc if (tmp) { wr_glb_free_shared_complex(g); wr_glb_free_local_complex(g); sys_free(g); return NULL; } return g; } glb* wr_glb_new_simple(void* db) { glb* g; g=sys_malloc(sizeof(glb)); // allocate space if (g==NULL) return NULL; (g->db)=db; // store database pointer to glb structure wr_glb_init_simple(g); // fills in simple values (ints, strings etc) return g; } /** Fills in simple slots of glb structure. * */ int wr_glb_init_simple(glb* g) { (g->proof_found)=0; // becomes 1 if proof found (g->alloc_err)=0; // 0 if ok, becomes 1 or larger if alloc error occurs: // 3 out of varspace err /* unification/matching configuration */ (g->unify_samelen)=1; (g->unify_maxuseterms)=0; (g->unify_firstuseterm)=1; (g->unify_funpos)=1; (g->unify_funarg1pos)=2; // rec pos of a fun/pred first arg (g->unify_funarg2pos)=3; // rec pos of a fun/pred second arg (g->unify_footerlen)=0; /* strategy selection */ (g->hyperres_strat)=1; (g->pick_given_queue_ratio)=4; (g->pick_given_queue_ratio_counter)=0; (g->cl_keep_weightlimit)=100000; (g->unitres_strat)=0; (g->negpref_strat)=1; (g->pospref_strat)=0; (g->use_comp_funs_strat)=1; (g->use_comp_funs)=1; //(g->cl_maxweight)=1000000; //(g->cl_maxdepth)=1000000; //(g->cl_limitkept)=1; /* printout configuration */ (g->print_flag)=1; // if 0: no printout: rmain sets other flags accordingly (g->print_level_flag)=-1; // rmain uses this to set other flags accordingly // -1: use default, 0: none, 10: normal, 20: medium, 30: detailed (g->parser_print_level)=1; (g->print_initial_parser_result)=1; (g->print_generic_parser_result)=1; (g->print_initial_active_list)=1; (g->print_initial_passive_list)=1; (g->print_initial_given_cl)=1; (g->print_final_given_cl)=1; (g->print_active_cl)=1; (g->print_partial_derived_cl)=1; (g->print_derived_cl)=1; (g->print_clause_detaillevel)=1; (g->print_stats)=1; /* tmp variables */ /* build control: changed in code */ (g->build_subst)=1; // subst var values into vars (g->build_calc)=1; // do fun and pred calculations (g->build_dcopy)=0; // copy nonimmediate data (vs return ptr) (g->build_buffer)=NULL; // build everything into local tmp buffer area (vs main area) /* statistics */ (g->stat_wr_mallocs)=0; (g->stat_wr_reallocs)=0; (g->stat_wr_frees)=0; (g->stat_wr_malloc_bytes)=0; (g->stat_wr_realloc_bytes)=0; (g->stat_built_cl)=0; (g->stat_derived_cl)=0; (g->stat_binres_derived_cl)=0; (g->stat_factor_derived_cl)=0; (g->stat_kept_cl)=0; (g->stat_hyperres_partial_cl)=0; (g->stat_weight_discarded_building)=0; 
(g->stat_weight_discarded_cl)=0; (g->stat_internlimit_discarded_cl)=0; (g->stat_given_candidates)=0; (g->stat_given_used)=0; (g->stat_simplified_given)=0; (g->stat_simplified_derived)=0; (g->stat_backward_subsumed)=0; (g->stat_clsubs_attempted)=0; (g->stat_clsubs_meta_attempted)=0; (g->stat_clsubs_predsymbs_attempted)=0; (g->stat_clsubs_unit_attempted)=0; (g->stat_clsubs_full_attempted)=0; (g->stat_lit_hash_computed)=0; (g->stat_lit_hash_match_found)=0; (g->stat_lit_hash_match_miss)=0; (g->stat_lit_hash_cut_ok)=0; (g->stat_lit_hash_subsume_ok)=0; return 0; } /** Fills in shared complex slots of glb structure. * */ int wr_glb_init_shared_complex(glb* g) { // first NULL all vars (g->clbuilt)=(gint)NULL; (g->clqueue)=(gint)NULL; (g->clqueue_given)=(gint)NULL; (g->clpickstack)=(gint)NULL; (g->clactive)=(gint)NULL; (g->clweightqueue)=(gint)NULL; (g->hash_neg_atoms)=(gint)NULL; (g->hash_pos_atoms)=(gint)NULL; (g->hash_units)=(gint)NULL; (g->hash_para_terms)=(gint)NULL; // then create space (g->clbuilt)=rpto(g,wr_cvec_new(g,NROF_DYNALLOCINITIAL_ELS)); (g->clactive)=rpto(g,wr_cvec_new(g,NROF_DYNALLOCINITIAL_ELS)); (g->clpickstack)=rpto(g,wr_cvec_new(g,NROF_DYNALLOCINITIAL_ELS)); (g->clqueue)=rpto(g,wr_cvec_new(g,NROF_DYNALLOCINITIAL_ELS)); (g->clqueue_given)=1; (g->clweightqueue)=rpto(g,wr_vec_new(g,NROF_WEIGHTQUEUE_ELS)); (g->hash_neg_atoms)=rpto(g,wr_vec_new(g,NROF_CLTERM_HASHVEC_ELS)); (g->hash_pos_atoms)=rpto(g,wr_vec_new(g,NROF_CLTERM_HASHVEC_ELS)); (g->hash_units)=rpto(g,wr_vec_new(g,NROF_CLTERM_HASHVEC_ELS)); (g->hash_para_terms)=rpto(g,wr_vec_new(g,NROF_CLTERM_HASHVEC_ELS)); if (g->alloc_err) { return 1; } return 0; } /** Fills in local complex slots of glb structure. * */ int wr_glb_init_local_complex(glb* g) { // first NULL all vars (g->varbanks)=NULL; (g->varstack)=NULL; (g->given_termbuf)=NULL; (g->derived_termbuf)=NULL; (g->queue_termbuf)=NULL; (g->active_termbuf)=NULL; (g->tmp_litinf_vec)=NULL; // then create space (g->varbanks)=wr_vec_new(g,NROF_VARBANKS*NROF_VARSINBANK); //(g->varbankrepl)=wr_vec_new(g,3*NROF_VARSINBANK); (g->varstack)=wr_cvec_new(g,NROF_VARBANKS*NROF_VARSINBANK); (g->varstack)[1]=2; // first free elem //(g->tmp1_cl_vec)=wr_vec_new(g,100); //(g->tmp2_cl_vec)=wr_vec_new(g,100); //(g->tmp_litinf_vec)=wr_vec_new(g,100); (g->given_termbuf)=wr_cvec_new(g,NROF_GIVEN_TERMBUF_ELS); (g->given_termbuf)[1]=2; //(g->given_termbuf_freeindex)=2; (g->derived_termbuf)=wr_cvec_new(g,NROF_DERIVED_TERMBUF_ELS); (g->derived_termbuf)[1]=2; (g->queue_termbuf)=wr_cvec_new(g,NROF_QUEUE_TERMBUF_ELS); (g->queue_termbuf)[1]=2; (g->active_termbuf)=wr_cvec_new(g,NROF_ACTIVE_TERMBUF_ELS); (g->active_termbuf)[1]=2; (g->tmp_litinf_vec)=wr_vec_new(g,MAX_CLAUSE_LEN); // used by subsumption //(g->derived_termbuf_freeindex)=2; //(g->use_termbuf)=0; //(g->pick_given_queue_ratio)=4; //(g->pick_given_queue_ratio_counter)=0; if ((g->alloc_err)==1) { return 1; } return 0; } /** Frees the glb structure and subitems in glb. * */ int wr_glb_free(glb* g) { // first free subitems wr_glb_free_shared_simple(g); wr_glb_free_shared_complex(g); wr_glb_free_local_complex(g); wr_glb_free_local_simple(g); sys_free(g); // free whole spaces return 0; } /** Frees the glb shared simple subitems. * */ int wr_glb_free_shared_simple(glb* g) { //str_freeref(g,&(g->info)); return 0; } /** Frees the glb local simple subitems. * */ int wr_glb_free_local_simple(glb* g) { //str_freeref(g,&(g->info)); return 0; } /** Frees the glb shared complex subitems. 
* */ int wr_glb_free_shared_complex(glb* g) { wr_vec_free(g,rotp(g,g->clbuilt)); wr_vec_free(g,rotp(g,g->clactive)); wr_vec_free(g,rotp(g,g->clpickstack)); wr_vec_free(g,rotp(g,g->clqueue)); wr_vec_free(g,rotp(g,g->clweightqueue)); wr_clterm_hashlist_free(g,rotp(g,g->hash_neg_atoms)); wr_clterm_hashlist_free(g,rotp(g,g->hash_pos_atoms)); wr_clterm_hashlist_free(g,rotp(g,g->hash_units)); wr_clterm_hashlist_free(g,rotp(g,g->hash_para_terms)); return 0; } /** Frees the local glb complex subitems. * */ int wr_glb_free_local_complex(glb* g) { //wr_vec_free(g,g->varstrvec); wr_vec_free(g,g->varbanks); //wr_vec_free(g,g->varbankrepl); wr_vec_free(g,g->varstack); //wr_vec_free(g,g->tmp1_cl_vec); //wr_vec_free(g,g->tmp2_cl_vec); //wr_vec_free(g,g->tmp_litinf_vec); //wr_vec_free(g,g->termbuf); wr_vec_free(g,g->given_termbuf); wr_vec_free(g,g->derived_termbuf); wr_vec_free(g,g->queue_termbuf); wr_vec_free(g,g->active_termbuf); wr_vec_free(g,g->tmp_litinf_vec); return 0; } #ifdef __cplusplus } #endif whitedb-0.7.2/Reasoner/glb.h000066400000000000000000000146011226454622500156530ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file glb.h * Reasoner globals. * */ #ifndef DEFINED_GLB_H #define DEFINED_GLB_H #include "types.h" #define NROF_DYNALLOCINITIAL_ELS 10 #define NROF_GIVEN_TERMBUF_ELS 100000 #define NROF_DERIVED_TERMBUF_ELS 100000 #define NROF_QUEUE_TERMBUF_ELS 100000000 #define NROF_ACTIVE_TERMBUF_ELS 10000000 #define NROF_VARBANKS 5 #define NROF_VARSINBANK 1000 #define FIRST_UNREAL_VAR_NR ((NROF_VARBANKS-1)*NROF_VARSINBANK) #define NROF_WEIGHTQUEUE_ELS 50 #define NROF_CLTERM_HASHVEC_ELS 500 #define MAX_CLAUSE_LEN 1000 /* ======== Structures ========== */ /** glb contains global values for requests. * * requests should use a single global variable g, which * is a pointer to glb structure. * */ typedef struct { void* db; /**< shared mem database */ /* === shared data block === */ cveco clbuilt; /**< vector containing built clauses, newest last. 0: vec len, 1: index of next unused vec elem */ cveco clactive; cveco clpickstack; /**< vector containing built clause stack to be selected as given before using queue (hyperres eg) */ cveco clqueue; /**< vector containing kept clauses, newest last. 0: vec len, 1: index of next unused vec elem */ gint clqueue_given; /**< index of next clause to be taken from clqueue */ veco clweightqueue; veco hash_neg_atoms; veco hash_pos_atoms; veco hash_units; veco hash_para_terms; /* == local data block === */ gint proof_found; gint alloc_err; // set to 1 in case of alloc errors: should cancel search vec varbanks; // 0: input (passive), 1: given clause renamed, // 2: active clauses, 3: derived clauses, // 4: tmp rename area (vals always UNASSIGNED, never set to other vals!) 
cvec varstack; //int tmp_build_weight; //vec tmp1_cl_vec; //vec tmp2_cl_vec; vec tmp_litinf_vec; // used by subsumption cvec given_termbuf; cvec derived_termbuf; cvec queue_termbuf; cvec active_termbuf; /* unification/matching configuration */ int unify_samelen; // 1 if unifiable terms need not have same length, 0 otherwise int unify_maxuseterms; // max nr of rec elems unified one after another: t1,t2,t3 gives 3 // 0 if no limit int unify_firstuseterm; // rec pos where we start to unify int unify_funpos; // rec pos of a fun/pred uri int unify_funarg1pos; // rec pos of a fun/pred first arg int unify_funarg2pos; // rec pos of a fun/pred second arg int unify_footerlen; // obligatory amount of unused gints to add to end of each created term /* strategy selection */ int pick_given_queue_ratio; int pick_given_queue_ratio_counter; int cl_keep_weightlimit; int hyperres_strat; int unitres_strat; int negpref_strat; int pospref_strat; int use_comp_funs_strat; // general strategy int use_comp_funs; // current principle //int cl_maxweight; //int cl_maxdepth; //int cl_limitkept; /* printout configuration */ int print_flag; int print_level_flag; int parser_print_level; int print_initial_parser_result; int print_generic_parser_result; int print_initial_active_list; int print_initial_passive_list; int print_initial_given_cl; int print_final_given_cl; int print_active_cl; int print_partial_derived_cl; int print_derived_cl; int print_clause_detaillevel; int print_stats; /* tmp variables */ gint* tmp_unify_vc; // var count in unification gint tmp_unify_occcheck; // occcheck necessity in unification (changes) gint tmp_unify_do_occcheck; /* build control: changed in code */ gint build_subst; // subst var values into vars gint build_calc; // do fun and pred calculations gint build_dcopy; // copy nonimmediate data (vs return ptr) gptr build_buffer; // build everything into tmp buffer (vs main area) // points to NULL or given_termbuf, derived_termbuf etc gint build_rename; // do var renaming gint build_rename_maxseenvnr; // tmp var for var renaming gint build_rename_vc; // tmp var for var renaming gptr build_rename_bank; // points to bank of created vars gint build_rename_banknr; // nr of bank of created vars /* statistics */ int stat_wr_mallocs; int stat_wr_reallocs; int stat_wr_frees; int stat_wr_malloc_bytes; int stat_wr_realloc_bytes; int stat_built_cl; int stat_derived_cl; int stat_binres_derived_cl; int stat_factor_derived_cl; int stat_kept_cl; int stat_hyperres_partial_cl; int stat_weight_discarded_building; int stat_weight_discarded_cl; int stat_internlimit_discarded_cl; int stat_given_candidates; int stat_given_used; int stat_simplified_given; int stat_simplified_derived; int stat_backward_subsumed; int stat_clsubs_attempted; int stat_clsubs_meta_attempted; int stat_clsubs_predsymbs_attempted; int stat_clsubs_unit_attempted; int stat_clsubs_full_attempted; int stat_lit_hash_computed; int stat_lit_hash_match_found; int stat_lit_hash_match_miss; int stat_lit_hash_cut_ok; int stat_lit_hash_subsume_ok; int log_level; } glb; /* === Protos for funs in glb.c === */ glb* wr_glb_new_full(void* db); glb* wr_glb_new_simple(void* db); int wr_glb_free(glb* g); int wr_glb_init_simple(glb* g); int wr_glb_init_shared_simple(glb* g); int wr_glb_init_shared_complex(glb* g); int wr_glb_free_shared_simple(glb* g); int wr_glb_free_shared_complex(glb* g); int wr_glb_init_local_simple(glb* g); int wr_glb_init_local_complex(glb* g); int wr_glb_free_local_simple(glb* g); int wr_glb_free_local_complex(glb* g); #endif 
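/*
 * Illustrative sketch (not part of the original WhiteDB sources): a minimal
 * example of how the glb reasoner-globals block declared above is typically
 * created, used and released. It relies only on functions defined in this
 * directory (wr_glb_new_full, wr_push_clqueue_cl, wr_glb_free); the wrapper
 * name example_enqueue_clause and the assumption that the caller already
 * holds a database pointer db and a clause record cl are hypothetical.
 */
#include "rincludes.h"

static int example_enqueue_clause(void* db, gptr cl) {
  glb* g;
  g=wr_glb_new_full(db);     /* allocate glb, fill in shared and local tables */
  if (g==NULL) return -1;    /* allocation failed */
  wr_push_clqueue_cl(g,cl);  /* append clause to the passive queue (clqueue) */
  if (g->alloc_err) {        /* cvec growth reports failure via alloc_err */
    wr_glb_free(g);
    return -1;
  }
  wr_glb_free(g);            /* frees shared/local substructures and g itself */
  return 0;
}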
whitedb-0.7.2/Reasoner/mem.c000066400000000000000000000176471226454622500156750ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file mem.c * Specific memory allocation functions: vectors etc. * */ /* ====== Includes =============== */ #include #include #include #ifdef __cplusplus extern "C" { #endif #include "rincludes.h" /* ====== Private headers and defs ======== */ /* ======= Private protos ================ */ /* ====== Functions ============== */ /* low-level wrapper funs for alloc, realloc, and free */ void* wr_malloc(glb* g, int bytes) { ++(g->stat_wr_mallocs); (g->stat_wr_malloc_bytes)+=bytes; //printf("!!! wr malloc %d \n",bytes); return sys_malloc(bytes); } void* wr_realloc(glb* g, void* p, int bytes) { ++(g->stat_wr_reallocs); (g->stat_wr_realloc_bytes)+=bytes; //printf("!!! wr realloc %d \n",bytes); return sys_realloc(p,bytes); } void wr_free(glb* g, void* p) { ++(g->stat_wr_frees); sys_free(p); return; } /* ====== Functions for vec: word arrays with length at pos 0 ========== */ /** Allocate a new vec with length len, set element 0 to length, set elements to NULL * */ vec wr_vec_new(glb* g,int len) { vec res; int i; res = (vec) wr_malloc(g,((len+1)*sizeof(gint))+OVER_MALLOC_BYTES); if (res==NULL) { (g->alloc_err)=1; wr_alloc_err2int(g,"Cannot allocate memory for a vec with length",len); return NULL; } // set correct alignment for res i=VEC_ALIGNMENT_BYTES-(((gint)res)%VEC_ALIGNMENT_BYTES); if (i==VEC_ALIGNMENT_BYTES) i=0; res=(gptr)((char*)res+i); res[0]=(gint)len; //for (i=VEC_START; i<=len; i++) res[i]=0; return res; } /** Allocate a new vec with length len, set element 0 to length, set elements to NULL, * set counter (pos 1) to first empty (2) * */ cvec wr_cvec_new(glb* g,int len) { vec res; int i; if (g->alloc_err) return NULL; res = (vec) wr_malloc(g,(len+2)*sizeof(gint)); if (res==NULL) { (g->alloc_err)=1; wr_alloc_err2int(g,"Cannot reallocate memory for a cvec with length",len); return NULL; } // set correct alignment for res i=VEC_ALIGNMENT_BYTES-(((gint)res)%VEC_ALIGNMENT_BYTES); if (i==VEC_ALIGNMENT_BYTES) i=0; res=(gptr)((char*)res+i); res[0]=(gint)len; res[1]=(gint)CVEC_START; //for (i=CVEC_START; i<=len; i++) res[i]=0; //memset(res+CVEC_START,0,(len-CVEC_START)); return res; } /** Free the passed vec. 
* */ void wr_vec_free(glb* g,vec v) { if (v!=NULL) wr_free(g,v); } /** Free the passed vec and free all strings inside (assuming vec contains strings) * */ void wr_vecstr_free(glb* g,vec v) { int i; if (v==NULL) return; for(i=1;i<=(int)(v[0]);i++) { if (v[i]!=(gint)NULL) wr_str_free(g,(char*)(v[i])); } sys_free(v); } /** Free the passed vec and free all vecs and their strings inside * */ void wr_vecvecstr_free(glb* g,vec v) { int i; if (v==NULL) return; for(i=1;i<=(int)(v[0]);i++) { if (v[i]!=(gint)NULL) wr_vecstr_free(g,(vec)(v[i])); } sys_free(v); } /** Reallocate vec to contain at least i elements. * * Normally the reallocated vec contains more than i elems. * */ vec wr_vec_realloc(glb* g,vec v, int i) { int vlen; vec nvec; int nlen; vlen=(int)v[0]; if (i<=vlen) { return v; } else { if (g->alloc_err) return NULL; for(nlen=(vlen<=0 ? 2 : vlen*2); i>nlen; nlen=nlen*2); //printf("Reallocing vec from %d to %d\n",vlen,nlen); nvec=wr_realloc(g,v,(nlen+1)*sizeof(gint)); if (nvec==NULL) { (g->alloc_err)=1; wr_alloc_err2int(g,"Cannot reallocate memory for a vec with length",nlen); return NULL; } nvec[0]=(gint)nlen; //for (i=vlen+1; i<=nlen; i++) nvec[i]=0; // set new elems to 0 //memset(nvec+vlen+1,0,(nlen-vlen)-1); return nvec; } } /** Store element to pos i in the vec * * If vec is not big enough, it is automatically reallocated * */ vec wr_vec_store(glb* g,vec v, int i, gint e) { vec nvec; if (i<=(int)v[0]) { v[i]=(gint)e; return v; } else { nvec=wr_vec_realloc(g,v,i); if (nvec==NULL) { (g->alloc_err)=1; wr_alloc_err2int(g,"vec_store cannot allocate enough memory to store at",i); return NULL; } nvec[i]=(gint)e; return nvec; } } /** Store element to pos i in the cvec * * If vec is not big enough, it is automatically reallocated * Free pos is automatically moved to i+i * */ cvec wr_cvec_store(glb* g,cvec v, int i, gint e) { cvec nvec; nvec=wr_vec_store(g,v,i,e); if (nvec==NULL) return NULL; if (nvec[1]<=i) { nvec[1]=(gint)(i+1); } return nvec; } /** Store element to next free pos in the cvec * * If vec is not big enough, it is automatically reallocated * Free pos is automatically moved to i+i * */ cvec wr_cvec_push(glb* g,cvec v, gint e) { cvec nvec; nvec=wr_cvec_store(g,v,v[1],e); return nvec; } gptr wr_alloc_from_cvec(glb* g, cvec buf, gint gints) { gint pos; gint i; pos=CVEC_NEXT(buf); // set correct alignment for pos //printf("wr_alloc_from_cvec initial pos %d buf+pos %d remainder with VEC_ALIGNMENT_BYTES %d\n", // pos,buf+pos,((gint)(buf+pos))%VEC_ALIGNMENT_BYTES); i=VEC_ALIGNMENT_BYTES-(((gint)(buf+pos))%VEC_ALIGNMENT_BYTES); //printf("first i %d \n",i); if (i==VEC_ALIGNMENT_BYTES) i=0; if (i) pos++; //printf("wr_alloc_from_cvec final pos %d buf+pos %d remainder with VEC_ALIGNMENT_BYTES %d\n", // pos,buf+pos,((gint)(buf+pos))%VEC_ALIGNMENT_BYTES); if ((pos+gints)>=CVEC_LEN(buf)) { wr_alloc_err(g," local temp buffer overflow"); (g->alloc_err)=1; return NULL; } CVEC_NEXT(buf)=pos+gints; return buf+pos; } /* ====== Functions for strings ============== */ /** Allocate a new string with length len, set last element to 0 * */ char* wr_str_new(glb* g, int len) { char* res; res = (char*) wr_malloc(g,len*sizeof(char)); if (res==NULL) { (g->alloc_err)=1; wr_sys_exiterr2int(g,"Cannot allocate memory for a string with length",len); return NULL; } res[len-1]=0; return res; } /** Guarantee string space: realloc if necessary, then set last byte to 0 * */ void wr_str_guarantee_space(glb* g, char** stradr, int* strlenadr, int needed) { char* tmp; int newlen; int j; //printf("str_guarantee_space, needed: 
%d, *strlenadr: %d\n",needed,*strlenadr); if (needed>(*strlenadr)) { newlen=(*strlenadr)*2; tmp=wr_realloc(g,*stradr,newlen); //printf("str_guarantee_space, realloc done, newlen: %d\n",newlen); if (tmp==NULL) { wr_sys_exiterr2int(g,"Cannot reallocate memory for a string with length",newlen); return; } for(j=(*strlenadr)-1;j. * */ /** @file mem.h * Specific memory allocation functions: vectors etc. * */ #ifndef DEFINED_MEM_H #define DEFINED_MEM_H #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "../Db/dballoc.h" #include "../Db/dbdata.h" #include "types.h" #include "glb.h" #define sys_malloc malloc ///< never use malloc: use sys_malloc as a last resort #define sys_free free ///< never use free: use sys_free as a last resort #define sys_realloc realloc ///< use sys_realloc instead of realloc #define OVER_MALLOC_BYTES 8 ///db),(offset)))) #define rpto(g,realptr)((gint)(pto(((g)->db),(realptr)))) /* ======= prototypes ===== */ void* wr_malloc(glb* g, int bytes); void* wr_realloc(glb* g, void* p, int bytes); void wr_free(glb* g, void* p); vec wr_vec_new(glb* g, int len); cvec wr_cvec_new(glb* g,int len); void wr_vec_free(glb* g, vec v); void wr_vecstr_free(glb* g, vec v); void wr_vecvecstr_free(glb* g, vec v); vec wr_vec_realloc(glb* g, vec v, int i); vec wr_vec_store(glb* g, vec v, int i, gint e); cvec wr_cvec_store(glb* g,vec v, int i, gint e); cvec wr_cvec_push(glb* g,vec v, gint e); gptr wr_alloc_from_cvec(glb* g, cvec buf, gint gints); char* wr_str_new(glb* g, int len); void wr_str_guarantee_space(glb* g, char** stradr, int* strlenadr, int needed); void wr_str_free(glb* g, char* str); void wr_str_freeref(glb* g, char** strref); #endif whitedb-0.7.2/Reasoner/printerrutils.c000066400000000000000000000050121226454622500200240ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file printerrutils.c * printing and err handling utils. * */ /* ====== Includes =============== */ #include #include #include #ifdef __cplusplus extern "C" { #endif #include "rincludes.h" /* ====== Private headers and defs ======== */ /* ======= Private protos ================ */ /* ====== Functions ============== */ /* =========== debug printing ========================= */ /* ======================= errors ================================= */ /** Allocation error not requiring immediate exit. * */ void* wr_alloc_err(glb* g, char* errstr) { if (g->print_flag) printf("Cannot allocate memory: %s.\n",errstr); return NULL; } /** Allocation error not requiring immediate exit. * */ void* wr_alloc_err2(glb* g, char* errstr1, char* errstr2) { if (g->print_flag) printf("Cannot allocate memory: %s %s.\n",errstr1,errstr2); return NULL; } /** Allocation error not requiring immediate exit. 
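 *
 * Typical call site (sketch, mirroring mem.c):
 *   wr_alloc_err2int(g, "Cannot allocate memory for a vec with length", len);
 * The caller is expected to set g->alloc_err itself and to propagate NULL upwards.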
* */ void* wr_alloc_err2int(glb* g, char* errstr, int n) { if (g->print_flag) printf("Cannot allocate memory: %s %d.\n",errstr,n); return NULL; } /** Hard system error requiring immediate exit. * */ void wr_sys_exiterr(glb* g, char* errstr) { printf("System error in wgdb reasoner, exiting: %s.\n",errstr); exit(1); } /** Hard system error requiring immediate exit. * */ void wr_sys_exiterr2(glb* g, char* errstr1, char* errstr2) { printf("System error in wgdb reasoner, exiting: %s %s.\n",errstr1,errstr2); exit(1); } /** Hard system error requiring immediate exit. * */ void wr_sys_exiterr2int(glb* g, char* errstr, int n) { printf("System error in wgdb reasoner, exiting: %s %d.\n",errstr,n); exit(1); } #ifdef __cplusplus } #endif whitedb-0.7.2/Reasoner/printerrutils.h000066400000000000000000000027011226454622500200330ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file printerrutils.h * Headers for printing and err handling utils. * */ #ifndef DEFINED_PRINTERRUTILS_H #define DEFINED_PRINTERRUTILS_H /* ==== Includes ==== */ /* ==== Global defines ==== */ #ifdef DPRINTF #define dprintf(fmt, ...) printf(fmt, ##__VA_ARGS__) #else #define dprintf(fmt,...) ; #endif /* ==== Protos ==== */ void* wr_alloc_err(glb* g, char* errstr); void* wr_alloc_err2(glb* g, char* errstr1, char* errstr2); void* wr_alloc_err2int(glb* g, char* errstr, int n); void wr_sys_exiterr(glb* g, char* errstr); void wr_sys_exiterr2(glb* g, char* errstr1, char* errstr2); void wr_sys_exiterr2int(glb* g, char* errstr, int n); #endif whitedb-0.7.2/Reasoner/rgenloop.c000066400000000000000000000374341226454622500167400ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file wr_genloop.c * Procedures for reasoner top level search loops: given-clause, usable, sos etc. 
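 *
 * Rough shape of the given-clause loop implemented below (sketch, not the
 * literal code):
 *
 *   while ((cand = wr_pick_given_cl(g, &kept)) != NULL) {
 *     cand = wr_activate_passive_cl(g, cand);
 *     if (wr_given_cl_subsumed(g, cand)) continue;
 *     given = wr_process_given_cl(g, cand);         // rebuild into given_termbuf, rename vars
 *     if (kept) wr_add_given_cl_active_list(g, given);
 *     wr_resolve_binary_all_active(g, given);       // derive resolvents against active clauses
 *   }
 *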
* */ /* ====== Includes =============== */ #include #include #include #ifdef __cplusplus extern "C" { #endif #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "../Db/dballoc.h" #include "../Db/dbdata.h" #include "../Db/dbhash.h" #include "../Db/dblog.h" #include "../Db/dbindex.h" #include "../Db/dbcompare.h" #include "rincludes.h" /* ====== Private headers and defs ======== */ #define DEBUG //#undef DEBUG //#define QUIET #define USE_RES_TERMS // loop over active clauses in wr_resolve_binary_all_active /* ======= Private protos ================ */ static void wr_process_given_cl_setupsubst(glb* g, gptr buf, gint banknr, int reuseflag); static void wr_process_given_cl_cleanupsubst(glb* g); /* ====== Functions ============== */ int wr_genloop(glb* g) { gptr picked_given_cl_cand; gptr given_cl_cand; gptr given_cl; int i; int given_kept_flag; gptr tmp; #ifndef USE_RES_TERMS gint ipassive; gint iactive; gptr activecl; #endif #ifndef QUIET printf("========= rwr_genloop starting ========= \n"); #endif //clear_active_cl_list(); // ??? wr_clear_all_varbanks(g); if ((g->print_initial_passive_list)==1) { printf("-- initial passive list starts -- \n"); //printf("len %d next %d \n",CVEC_LEN(rotp(g,g->clqueue)),CVEC_NEXT(rotp(g,g->clqueue))); i=CVEC_START; for(; iclqueue)) ; ++i) { wr_print_clause(g,(gptr)((rotp(g,g->clqueue))[i])); } printf("-- initial passive list ends -- \n"); } if ((g->print_initial_active_list)==1) { printf("-- initial active list starts -- \n"); //printf("len %d next %d \n",CVEC_LEN(rotp(g,g->clactive)),CVEC_NEXT(rotp(g,g->clactive))); i=CVEC_START; for(; iclactive)) ; ++i) { wr_print_clause(g,(gptr)((rotp(g,g->clactive))[i])); } printf("-- initial active list ends -- \n"); } // loop until no more passive clauses available g->proof_found=0; g->clqueue_given=CVEC_START; given_kept_flag=1; for(;;) { if (g->alloc_err) { printf("Unhandled alloc_err detected in the main wr_genloop\n"); return -1; } given_kept_flag=1; // will be overwritten picked_given_cl_cand=wr_pick_given_cl(g,&given_kept_flag); // given_kept_flag will now indicate whether to add to active list or not if (g->print_initial_given_cl) { printf("*** given candidate %d: ",(g->stat_given_candidates)); wr_print_clause(g,picked_given_cl_cand); //CP0 //wr_print_vardata(g); //printf("\n"); } if (picked_given_cl_cand==NULL) { return 1; } (g->stat_given_candidates)++; //stats given_cl_cand=wr_activate_passive_cl(g,picked_given_cl_cand); if (given_cl_cand==NULL) { if (g->alloc_err) return -1; continue; } //given_cl_cand=picked_given_cl_cand; //if (given_cl_cand==GNULL) printf("activated given_cl_cand==GNULL\n"); if (given_cl_cand==NULL) continue; if (wr_given_cl_subsumed(g,given_cl_cand)) { #ifdef DEBUG printf("given cl is subsumed\n"); #endif continue; } //CP1 //wr_print_vardata(g); given_cl=wr_process_given_cl(g,given_cl_cand); //CP2 //wr_print_vardata(g); //wr_clear_all_varbanks(g); if (given_cl==NULL) { if (g->alloc_err) return -1; continue; } if (g->print_final_given_cl) { printf("*** given %d: ",(g->stat_given_used)); wr_print_clause(g,given_cl); //printf("\n"); //wr_print_vardata(g); // printf("built %d kept %d \n",(g->stat_built_cl),(g->stat_kept_cl)); //printf("\n"); } //if ((g->stat_given_used)>233) return; //223 if (given_kept_flag) { tmp=wr_add_given_cl_active_list(g,given_cl); if (tmp==NULL) { if (g->alloc_err) return -1; continue; } } // do all resolutions with the given clause #ifdef USE_RES_TERMS // normal case: active loop is done inside the wr_resolve_binary_all_active 
wr_resolve_binary_all_active(g,given_cl); if (g->proof_found) return 0; if (g->alloc_err) return -1; #else // testing/experimenting case: loop explicitly over active clauses iactive=CVEC_START; for(; iactiveclactive)); iactive++) { #ifndef QUIET printf("\n----- inner wr_genloop cycle (active) starts ----------\n"); #endif activecl=(gptr)((rotp(g,g->clactive))[iactive]); //resolve_binary(g,given_cl,activecl); if ((g->proof_found)) { return 0; } } #endif // USE_RES_TERMS } } gptr wr_pick_given_cl(glb* g, int* given_kept_flag) { gptr cl; int next; //printf("wr_pick_given_cl called with clqueue_given %d and given_kept_flag %d\n",(g->clqueue_given),given_kept_flag); //printf(" CVEC_NEXT(rotp(g,g->clqueue)) %d \n",CVEC_NEXT(rotp(g,g->clqueue))); #ifdef DEBUG printf("picking cl nr %d as given\n",g->clqueue_given); #endif //if (g->clqueue_given>=4) exit(0); // first try stack next=CVEC_NEXT(rotp(g,g->clpickstack)); if (next>CVEC_START) { cl=(gptr)((rotp(g,g->clpickstack))[next-1]); --(CVEC_NEXT(rotp(g,g->clpickstack))); // do not put cl to active list *given_kept_flag=0; if (cl!=NULL) return cl; } // then try queue next=CVEC_NEXT(rotp(g,g->clqueue)); if (next>(g->clqueue_given)) { cl=(gptr)((rotp(g,g->clqueue))[g->clqueue_given]); ++(g->clqueue_given); // do not put cl to active list *given_kept_flag=1; return cl; } // no candidates for given found return NULL; } gptr wr_activate_passive_cl(glb* g, gptr picked_given_cl_cand) { return picked_given_cl_cand; } gptr wr_process_given_cl(glb* g, gptr given_cl_cand) { gptr given_cl; #ifdef DEBUG void* db=g->db; printf("wr_process_given_cl called with \n"); printf("int %d type %d\n",given_cl_cand,wg_get_encoded_type(db,given_cl_cand)); wr_print_record(g,given_cl_cand); wr_print_clause(g,given_cl_cand); #endif wr_process_given_cl_setupsubst(g,g->given_termbuf,1,1); given_cl=wr_build_calc_cl(g,given_cl_cand); wr_process_given_cl_cleanupsubst(g); if (given_cl==NULL) return NULL; // could be memory err //wr_print_varbank(g,g->varbanks); #ifdef DEBUG printf("rebuilt as \n"); wr_print_record(g,given_cl); wr_print_clause(g,given_cl); #endif return given_cl; } gptr wr_add_given_cl_active_list(glb* g, gptr given_cl) { gptr active_cl; #ifdef DEBUG void* db=g->db; printf("wr_add_given_cl_active_list called with \n"); printf("int %d type %d\n",given_cl,wg_get_encoded_type(db,given_cl)); wr_print_record(g,given_cl); wr_print_clause(g,given_cl); #endif wr_process_given_cl_setupsubst(g,g->active_termbuf,2,0); active_cl=wr_build_calc_cl(g,given_cl); wr_process_given_cl_cleanupsubst(g); if (active_cl==NULL) return NULL; // could be memory err #ifdef DEBUG printf("wr_add_given_cl_active_list generated for storage \n"); printf("int %d type %d\n",given_cl,wg_get_encoded_type(db,active_cl)); wr_print_record(g,active_cl); wr_print_clause(g,active_cl); #endif //wr_print_varbank(g,g->varbanks); wr_push_clactive_cl(g,active_cl); (g->stat_given_used)++; // stats return active_cl; } static void wr_process_given_cl_setupsubst(glb* g, gptr buf, gint banknr, int reuseflag) { g->build_subst=0; // subst var values into vars g->build_calc=0; // do fun and pred calculations g->build_dcopy=0; // copy nonimmediate data (vs return ptr) //g->build_buffer=NULL; // build everything into tmp buffer (vs main area) if (reuseflag) buf[1]=2; // reuse given_termbuf g->build_buffer=buf; g->build_rename=1; // do var renaming g->build_rename_maxseenvnr=-1; // tmp var for var renaming g->build_rename_vc=0; // tmp var for var renaming g->build_rename_banknr=banknr; // nr of bank of created vars // points 
to bank of created vars g->build_rename_bank=(g->varbanks)+((g->build_rename_banknr)*NROF_VARSINBANK); g->tmp_unify_vc=((gptr)(g->varstack))+1; } static void wr_process_given_cl_cleanupsubst(glb* g) { int i; wr_clear_varstack(g,g->varstack); //for(i=0;ibuild_rename_vc;i++) { // (g->build_rename_bank)[i]=UNASSIGNED; //} } void wr_resolve_binary_all_active(glb* g, gptr cl) { void* db=g->db; int i; int len; int ruleflag; // 0 if not rule int poscount=0; // used only for pos/neg pref int negcount=0; // used only for pos/neg pref int posok=1; // default allow int negok=1; // default allow //gint parent; gint meta; int negflag; // 1 if negative int termflag; // 1 if complex atom gint hash; int addflag=0; int negadded=0; int posadded=0; vec hashvec; int hlen; gint node; gint xatom; gint yatom; gptr xcl; gptr ycl; int ures; //char buf[1000]; //int buflen=800; //for(i=0;inegpref_strat) || (g->pospref_strat)) { if (!ruleflag) { poscount=1; negcount=0; } else { poscount=0; negcount=0; for(i=0; inegpref_strat) { if (poscount>0 && negcount>0) posok=0; } if (g->pospref_strat) { if (poscount>0 && negcount>0) negok=0; } } } } xcl=cl; #ifdef DEBUG printf("ruleflag %d len %d poscount %d negcount %d posok %d negok %d\n", ruleflag,len,poscount,negcount,posok,negok); #endif // loop over literals #if 0 /* XXX: FIXME */ #ifdef USE_CHILD_DB if (ruleflag) parent = wg_get_rec_base_offset(db,xcl); #endif #endif for(i=0; inegpref_strat) || (g->pospref_strat)) || (negok && negflag && !negadded) || (posok && !negflag)) { if (negflag) negadded++; else posadded++; xatom=wg_get_rule_clause_atom(db,xcl,i); #ifdef DEBUG printf("atom nr %d from record \n",i); wr_print_record(g,xcl); printf("\natom\n"); wr_print_record(g,wg_decode_record(db,xatom)); printf("negflag %d\n",negflag); #endif if (wg_get_encoded_type(db,xatom)==WG_RECORDTYPE) { termflag=1; #if 0 /* XXX: FIXME */ #ifdef USE_CHILD_DB if(parent) xatom=wg_encode_parent_data(parent, xatom); #endif #endif //xatom=wg_decode_record(db,enc); } else { //printf("\ncp2 enc %d\n",xatom); } hash=wr_atom_funhash(g,xatom); //printf("hash %d\n",hash); addflag=1; } } // xcl: active clause // xatom: active atom if (addflag) { // now loop over hash vectors for all active unification candidates // ycl: cand clause // yatom: cand atom #ifndef QUIET printf("\n----- inner wr_genloop cycle (active hash list) starts ----------\n"); #endif if (negflag) hashvec=rotp(g,g->hash_pos_atoms); else hashvec=rotp(g,g->hash_neg_atoms); //wr_clterm_hashlist_print(g,hashvec); hlen=wr_clterm_hashlist_len(g,hashvec,hash); if (hlen==0) { dprintf("no matching atoms in hash\n"); continue; } node=wr_clterm_hashlist_start(g,hashvec,hash); if (node<0) { wr_sys_exiterr(g,"apparently wrong hash given to wr_clterm_hashlist_start"); return; } while(node!=0) { yatom=(otp(db,node))[CLTERM_HASHNODE_TERM_POS]; ycl=otp(db,(otp(db,node))[CLTERM_HASHNODE_CL_POS]); printf("after while(node!=0): \n"); printf("ycl: \n"); wr_print_clause(g,ycl); /* printf("xcl: \n"); wr_print_clause(g,xcl); printf("xatom: \n"); wr_print_clause(g,xatom); */ if (g->print_active_cl) { printf("* active: "); wr_print_clause(g,ycl); //printf("\n"); } #ifdef DEBUG printf("\nxatom "); wr_print_term(g,xatom); printf(" in xcl "); wr_print_clause(g,xcl); printf("yatom "); wr_print_term(g,yatom); printf(" in ycl "); wr_print_clause(g,ycl); //wg_print_record(db,ycl); //printf("calling equality check\n"); wr_print_vardata(g); #endif //printf("!!!!!!!!!!!!!!!!!!!!! 
before unification\n"); //wr_print_vardata(g); //printf("CLEAR\n"); //wr_clear_varstack(g,g->varstack); //wr_print_vardata(g); //printf("START UNIFICATION\n"); ures=wr_unify_term(g,xatom,yatom,1); // uniquestrflag=1 #ifdef DEBUG printf("unification check res: %d\n",ures); #endif //wr_print_vardata(g); //wr_print_vardata(g); //wr_clear_varstack(g,g->varstack); //wr_print_vardata(g); if (ures) { // build and process the new clause printf("\nin wr_resolve_binary_all_active to call wr_process_resolve_result\n"); wr_process_resolve_result(g,xatom,xcl,yatom,ycl); printf("\nin wr_resolve_binary_all_active after wr_process_resolve_result\n"); printf("\nxatom\n"); wr_print_term(g,xatom); if (g->proof_found || g->alloc_err) { wr_clear_varstack(g,g->varstack); return; } } wr_clear_varstack(g,g->varstack); //wr_print_vardata(g); // get next node; node=wr_clterm_hashlist_next(g,hashvec,node); } printf("\nexiting node loop\n"); } } dprintf("wr_resolve_binary_all_active finished\n"); return; } #ifdef __cplusplus } #endif whitedb-0.7.2/Reasoner/rgenloop.h000066400000000000000000000024551226454622500167400ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file rgenloop.h * Procedures for reasoner top level search loops: given-clause, usable, sos etc. * */ #ifndef DEFINED_RGENLOOP_H #define DEFINED_RGENLOOP_H #include "types.h" #include "glb.h" int wr_genloop(glb* g); gptr wr_pick_given_cl(glb* g, int* given_kept_flag); gptr wr_activate_passive_cl(glb* g, gptr picked_given_cl_cand); gptr wr_add_given_cl_active_list(glb* g, gptr given_cl); gptr wr_process_given_cl(glb* g, gptr given_cl_cand); void wr_resolve_binary_all_active(glb* g, gptr cl); #endif whitedb-0.7.2/Reasoner/rincludes.h000066400000000000000000000033451226454622500171020ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . 
* */ /** @file rincludes.h * standard includes for reasoner c files */ #ifndef DEFINED_RINCLUDES_H #define DEFINED_RINCLUDES_H /* ==== Includes ==== */ #include #include #include #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "mem.h" #include "glb.h" #include "clterm.h" #include "unify.h" #include "build.h" #include "clstore.h" #include "subsume.h" #include "derive.h" #include "rgenloop.h" #include "rmain.h" #include "printerrutils.h" #include "../Db/dbutil.h" #include "../Printer/dbotterprint.h" /* ==== Global defines ==== */ #define CP0 printf("CP0\n"); #define CP1 printf("CP1\n"); #define CP2 printf("CP2\n"); #define CP3 printf("CP3\n"); #define CP4 printf("CP4\n"); #define CP5 printf("CP5\n"); #define CP6 printf("CP6\n"); #define CP7 printf("CP7\n"); #define CP8 printf("CP8\n"); #define CP9 printf("CP9\n"); #define PRINT_LIMITS /* ==== Protos ==== */ #endif whitedb-0.7.2/Reasoner/rmain.c000066400000000000000000000277301226454622500162170ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file rmain.c * Reasoner top level: initialisation and startup. 
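 *
 * Top-level flow of wg_run_reasoner below (sketch):
 *
 *   g = wg_init_reasoner(db, argc, argv);   // allocate glb, push db clauses to the passive list
 *   ...choose printout level from g->print_level_flag...
 *   res = wr_genloop(g);                    // 0: proof found, 1: search exhausted, <0: error
 *   if (g->print_flag) wr_show_stats(g);
 *   wr_glb_free(g);
 *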
* */ /* ====== Includes =============== */ #include #include #include #ifdef __cplusplus extern "C" { #endif #include "rincludes.h" #include "../Parser/dbparse.h" /* ====== Private headers and defs ======== */ static void wr_set_no_printout(glb* g); static void wr_set_normal_printout(glb* g); static void wr_set_medium_printout(glb* g); static void wr_set_detailed_printout(glb* g); #define SHOW_SUBSUME_STATS #define SHOW_MEM_STATS /* ======= Private protos ================ */ /* ====== Functions ============== */ int wg_run_reasoner(void *db, int argc, char **argv) { glb* g; int res=1; int default_print_level=30; dprintf("wg_run_reasoner starts\n"); g=wg_init_reasoner(db,argc,argv); if (g==NULL) { //printf("Error: cannot allocate enough memory during reasoner initialization\n"); // printout in wg_init_reasoner return -1; } if (!(g->print_flag)) (g->print_level_flag)=0; if ((g->print_level_flag)<0) (g->print_level_flag)=default_print_level; if ((g->print_level_flag)==0) wr_set_no_printout(g); else if ((g->print_level_flag)<=10) wr_set_normal_printout(g); else if ((g->print_level_flag)<=20) wr_set_medium_printout(g); else if ((g->print_level_flag)<=30) wr_set_detailed_printout(g); else wr_set_normal_printout(g); res=wr_genloop(g); if (g->print_flag) { if (res==0) { printf("\n** PROOF FOUND\n"); } else if (res==1) { printf("\n** SEARCH FINISHED WITHOUT PROOF, RESULT CODE %d\n",res); } else if (res==-1) { printf("\n** SEARCH CANCELLED: MEMORY OVERFLOW %d\n",res); } else if (res<0) { printf("\n** SEARCH CANCELLED, ERROR CODE %d\n",res); } wr_show_stats(g); } wr_glb_free(g); return res; } int wg_import_otter_file(void *db, char* filename) { glb* g; int res; dprintf("wg_import_otterfile starts\n"); g=wr_glb_new_simple(db); // no complex values given to glb elements if (g==NULL) return 1; (g->parser_print_level)=0; (g->print_initial_parser_result)=0; (g->print_generic_parser_result)=0; res=wr_import_otter_file(g,filename,NULL,NULL); sys_free(g); // no complex values given to glb elements dprintf("wg_import_otterfile ends with res\n",res); return res; } int wg_import_prolog_file(void *db, char* filename) { glb* g; int res; dprintf("wg_import_prologfile starts\n"); g=wr_glb_new_simple(db); // no complex values given to glb elements if (g==NULL) return 1; res=wr_import_prolog_file(g,filename,NULL,NULL); sys_free(g); // no complex values given to glb elements dprintf("wg_import_prologfile ends with res\n",res); return res; } glb* wg_init_reasoner(void *db, int argc, char **argv) { glb* g; dprintf("init starts\n"); g=wr_glb_new_full(db); if (g==NULL) { printf("Error: cannot allocate enough memory during reasoner initialization\n"); return NULL; } dprintf("glb made\n"); dprintf("cycling over clauses to make active passive lists\n"); wr_init_active_passive_lists_std(g); //wr_init_active_passive_lists_factactive(g); //wr_init_active_passive_lists_ruleactive(g); dprintf("active passive lists made\n"); return g; } int wr_init_active_passive_lists_std(glb* g) { void *rec; void* db=(g->db); //int i; (g->proof_found)=0; //for(i=0;i<10;i++) printf(" %d ",(int)((rotp(g,g->clqueue))[i])); printf("\n"); rec = wg_get_first_raw_record(db); while(rec) { if (wg_rec_is_rule_clause(db,rec)) { #ifdef DEBUG wr_print_rule_clause_otter(g, (gint *) rec,(g->print_clause_detaillevel)); printf("\n"); #endif wr_push_clqueue_cl(g,rec); } else if (wg_rec_is_fact_clause(db,rec)) { #ifdef DEBUG wr_print_fact_clause_otter(g, (gint *) rec,(g->print_clause_detaillevel)); printf("\n"); #endif wr_push_clqueue_cl(g,rec); } 
//for(i=0;i<10;i++) printf(" %d ",(int)((rotp(g,g->clqueue))[i])); printf("\n"); rec = wg_get_next_raw_record(db,rec); } return 0; } /* int wr_init_active_passive_lists_factactive(glb* g) { void *rec; void* db=(g->db); //int i; (g->proof_found)=0; //for(i=0;i<10;i++) printf(" %d ",(int)((rotp(g,g->clqueue))[i])); printf("\n"); rec = wg_get_first_raw_record(db); while(rec) { if (wg_rec_is_rule_clause(db,rec)) { #ifdef DEBUG wr_print_rule_clause_otter(g, (gint *) rec); printf("\n"); #endif wr_push_clqueue_cl(g,rec); } else if (wg_rec_is_fact_clause(db,rec)) { #ifdef DEBUG wr_print_fact_clause_otter(g, (gint *) rec); printf("\n"); #endif wr_push_clactive_cl(g,rec); } //for(i=0;i<10;i++) printf(" %d ",(int)((rotp(g,g->clqueue))[i])); printf("\n"); rec = wg_get_next_raw_record(db,rec); } return 0; } */ /* int wr_init_active_passive_lists_ruleactive(glb* g) { void *rec; void* db=(g->db); //int i; (g->proof_found)=0; //for(i=0;i<10;i++) printf(" %d ",(int)((rotp(g,g->clqueue))[i])); printf("\n"); rec = wg_get_first_raw_record(db); while(rec) { if (wg_rec_is_rule_clause(db,rec)) { #ifdef DEBUG wr_print_rule_clause_otter(g, (gint *) rec); printf("\n"); #endif wr_push_clactive_cl(g,rec); } else if (wg_rec_is_fact_clause(db,rec)) { #ifdef DEBUG wr_print_fact_clause_otter(g, (gint *) rec); printf("\n"); #endif wr_push_clqueue_cl(g,rec); } //for(i=0;i<10;i++) printf(" %d ",(int)((rotp(g,g->clqueue))[i])); printf("\n"); rec = wg_get_next_raw_record(db,rec); } return 0; } */ /* ------------------------ Printout settings -------------------------- */ static void wr_set_no_printout(glb* g) { (g->print_flag)=0; (g->parser_print_level)=0; (g->print_initial_parser_result)=0; (g->print_generic_parser_result)=0; (g->print_initial_active_list)=0; (g->print_initial_passive_list)=0; (g->print_initial_given_cl)=0; (g->print_final_given_cl)=0; (g->print_active_cl)=0; (g->print_partial_derived_cl)=0; (g->print_derived_cl)=0; (g->print_clause_detaillevel)=0; (g->print_stats)=0; } static void wr_set_normal_printout(glb* g) { (g->print_flag)=1; (g->parser_print_level)=0; (g->print_initial_parser_result)=0; (g->print_generic_parser_result)=0; (g->print_initial_active_list)=0; (g->print_initial_passive_list)=0; (g->print_initial_given_cl)=0; (g->print_final_given_cl)=1; (g->print_active_cl)=0; (g->print_partial_derived_cl)=0; (g->print_derived_cl)=0; (g->print_clause_detaillevel)=1; (g->print_stats)=1; } static void wr_set_medium_printout(glb* g) { (g->print_flag)=1; (g->parser_print_level)=0; (g->print_initial_parser_result)=0; (g->print_generic_parser_result)=0; (g->print_initial_active_list)=1; (g->print_initial_passive_list)=1; (g->print_initial_given_cl)=1; (g->print_final_given_cl)=1; (g->print_active_cl)=1; (g->print_partial_derived_cl)=1; (g->print_derived_cl)=1; (g->print_clause_detaillevel)=1; (g->print_stats)=1; } static void wr_set_detailed_printout(glb* g) { (g->print_flag)=1; (g->parser_print_level)=1; (g->print_initial_parser_result)=1; (g->print_generic_parser_result)=1; (g->print_initial_active_list)=1; (g->print_initial_passive_list)=1; (g->print_initial_given_cl)=1; (g->print_final_given_cl)=1; (g->print_active_cl)=1; (g->print_partial_derived_cl)=1; (g->print_derived_cl)=1; (g->print_clause_detaillevel)=1; (g->print_stats)=1; } /* ----------------------------------------------- Some funs for statistics ----------------------------------------------- */ void wr_show_stats(glb* g) { if (!(g->print_stats)) return; printf("statistics:\n"); printf("----------------------------------\n"); 
printf("stat_given_used: %d\n",g->stat_given_used); printf("stat_given_candidates: %d\n",g->stat_given_candidates); //printf("stat_derived_cl: %d\n",g->stat_derived_cl); printf("stat_binres_derived_cl: %d\n",g->stat_binres_derived_cl); printf("stat_factor_derived_cl: %d\n",g->stat_factor_derived_cl); printf("stat_kept_cl: %d\n",g->stat_kept_cl); printf("stat_hyperres_partial_cl: %d\n",g->stat_hyperres_partial_cl); printf("stat_weight_discarded_building: %d\n",g->stat_weight_discarded_building); printf("stat_weight_discarded_cl: %d\n",g->stat_weight_discarded_cl); printf("stat_internlimit_discarded_cl: %d\n",g->stat_internlimit_discarded_cl); printf("stat_simplified: %d derived %d given\n", g->stat_simplified_derived, g->stat_simplified_given); printf("stat_backward_subsumed: %d\n",g->stat_backward_subsumed); printf("stat_built_cl: %d\n",g->stat_built_cl); #ifdef SHOW_SUBSUME_STATS printf("stat_clsubs_attempted: %20d\n",g->stat_clsubs_attempted); printf("stat_clsubs_meta_attempted: %20d\n",g->stat_clsubs_meta_attempted); printf("stat_clsubs_predsymbs_attempted: %20d\n",g->stat_clsubs_predsymbs_attempted); printf("stat_clsubs_unit_attempted: %20d\n",g->stat_clsubs_unit_attempted); printf("stat_clsubs_full_attempted: %20d\n",g->stat_clsubs_full_attempted); #endif #ifdef SHOW_HASH_CUT_STATS printf("stat_lit_hash_computed: %20d\n",g->stat_lit_hash_computed); printf("stat_lit_hash_match_found: %20d\n",g->stat_lit_hash_match_found); printf("stat_lit_hash_match_miss: %20d\n",g->stat_lit_hash_match_miss); printf("stat_lit_hash_cut_ok: %20d\n",g->stat_lit_hash_cut_ok); printf("stat_lit_hash_subsume_ok: %20d\n",g->stat_lit_hash_subsume_ok); #endif #ifdef SHOW_MEM_STATS //if ((g->clbuilt)!=(gint)NULL) // printf("clbuilt els %d used %d\n", // (rotp(g,g->clbuilt))[0],(rotp(g,g->clbuilt))[1]-1); if ((g->clqueue)!=(gint)NULL) printf("clqueue els %d used %d\n", (rotp(g,g->clqueue))[0],(rotp(g,g->clqueue))[1]-1); if ((g->clactive)!=(gint)NULL) printf("clactive els %d used %d\n", (rotp(g,g->clactive))[0],(rotp(g,g->clactive))[1]-1); //if ((g->clweightqueue)!=(gint)NULL) // printf("clweightqueue els %d used %d\n",((gptr)(g->clweightqueue))[0],((gptr)(g->clweightqueue))[1]-1); if ((g->queue_termbuf)!=(gint)NULL) printf("queue_termbuf els %d used %d\n",(g->queue_termbuf)[0],(g->queue_termbuf)[1]-1); if ((g->active_termbuf)!=(gint)NULL) printf("active_termbuf els %d used %d\n",(g->active_termbuf)[0],(g->active_termbuf)[1]-1); if ((g->varstack)!=(gint)NULL) printf("varstack els %d last used %d\n",(g->varstack)[0],(g->varstack)[1]-1); if ((g->given_termbuf)!=(gint)NULL) printf("given_termbuf els %d last used %d\n",(g->given_termbuf)[0],(g->given_termbuf)[1]-1); if ((g->derived_termbuf)!=(gint)NULL) printf("derived_termbuf els %d last used %d\n",(g->derived_termbuf)[0],(g->derived_termbuf)[1]-1); printf("wr_mallocs: %d\n",(g->stat_wr_mallocs)); printf("wr_reallocs: %d\n",(g->stat_wr_reallocs)); printf("wr_frees: %d\n",(g->stat_wr_frees)); printf("wr_malloc_bytes: %d\n",(g->stat_wr_malloc_bytes)); printf("wr_realloc_bytes: %d\n",(g->stat_wr_realloc_bytes)); #endif printf("----------------------------------\n"); } #ifdef __cplusplus } #endif whitedb-0.7.2/Reasoner/rmain.h000066400000000000000000000023501226454622500162130ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General 
Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file rmain.h * Reasoner top level: initialisation and startup. * */ #ifndef DEFINED_RMAIN_H #define DEFINED_RMAIN_H #include "glb.h" int wg_run_reasoner(void *db, int argc, char **argv); int wg_import_otter_file(void *db, char* filename); int wg_import_prolog_file(void *db, char* filename); glb* wg_init_reasoner(void *db, int argc, char **argv); int wr_init_active_passive_lists_std(glb* g); void wr_show_stats(glb* g); #endif whitedb-0.7.2/Reasoner/rtechdoc.txt000066400000000000000000000223651226454622500173000ustar00rootroot00000000000000Reasoner technical doc ====================== Universal small data structures =============================== code in mem.h, mem.c vec: gint array: 0 (VEC_LEN) len, 1 (VEC_START) first elem pos cvec: gint array: 0 (CVEC_LEN) len, 1 (CVEC_NEXT) first empty pos, 2 (CVEC_START) first elem pos veco: offset version of vec cveco: offset version of cvec to store, use: wr_vec_store, wr_cvec_store, wr_cvec_push a la wr_vec_store(glb* g, vec v, int i, gint e) which automatically realloc the vec if necessary and return the vec (possibly reallocated) to create a string use: wr_str_new(glb* g, int len) Universal converters ==================== code in mem.h opt/pto: offset to ptr/ptr to offset, taking db as first arg ropt/rpto: offset to ptr/ptr to offset, taking g as first arg and getting g->db Top level clause selection and processing algorithms ==================================================== code in rmain.c: ---------------- wg_run_reasoner: - wg_init_reasoner - set printout levels - wr_genloop - optionally wr_show_stats - wr_glb_free wg_init_reasoner: - initialize globals - scan all tuples in db and if rule or fact clause push to clqueue vec with wr_push_clqueue_cl(g,rec); code in rgenloop.c: ------------------- wr_genloop: - wr_clear_all_varbanks - loop indefinitely: - picked_given_cl_cand=wr_pick_given_cl(g,&given_kept_flag) - given_cl_cand=wr_activate_passive_cl(g,picked_given_cl_cand); - if (wr_given_cl_subsumed(g,given_cl_cand)) continue; - given_cl=wr_process_given_cl(g,given_cl_cand); - if (given_kept_flag) tmp=wr_add_given_cl_active_list(g,given_cl); - either - normal case (USE_RES_TERMS): wr_resolve_binary_all_active(g,given_cl) - experimental case: loop explicitly over active clauses and do resolve_binary(g,given_cl,activecl) for each wr_resolve_binary_all_active: - ruleflag=wg_rec_is_rule_clause(db,cl); - for negpref check out if negative literals present (posok=0/1 or negok=0/1) - xcl=cl; - loop over all atoms of xcl: - xatom=singleton atom or wg_get_rule_clause_atom(db,xcl,i); - hash=wr_atom_funhash(g,xatom); - hashvec=rotp(g,g->hash_neg_atoms / hash_pos_atoms); - loop over hashvec of potential unifiable literals: - yatom=(otp(db,node))[CLTERM_HASHNODE_TERM_POS]; - ycl=otp(db,(otp(db,node))[CLTERM_HASHNODE_CL_POS]); - ures=wr_unify_term(g,xatom,yatom,1); - if (ures) //build and process the new clause: wr_process_resolve_result(g,xatom,xcl,yatom,ycl); - wr_clear_varstack(g,g->varstack); code in derive.c: ----------------- wr_process_resolve_result(glb* g, gint xatom, 
gptr xcl, gint yatom, gptr ycl): - check which args are rules - allocate local termbuf rptr=wr_alloc_from_cvec(g,g->derived_termbuf,rlen); - wr_process_resolve_result_setupsubst(g); - wr_process_resolve_result_aux(g,xcl,xatom,xatomnr,rptr,&rpos); - wr_process_resolve_result_cleanupsubst(g); - build clause: different ways for hyper/non-hyper, normal steps for clauses: - res=wr_create_raw_record(...) - loop over literals, for each: - blt=wr_build_calc_term(g,rptr[tmp+LIT_ATOM_POS]); - ... and normal steps for facts: - blt=wr_build_calc_term(g,rptr[LIT_ATOM_POS]); - if hyperres and !wr_hyperres_satellite_cl(g,res): - wr_clear_varstack(g,g->varstack); - recursively: wr_resolve_binary_all_active(g,res); - push built clause into suitable list: wr_push_clqueue_cl(g,res); wr_process_resolve_result_aux(glb* g, gptr cl, gint cutatom, int atomnr, gptr rptr, int* rpos): called for rule clauses only - loop over atoms (i=0...atomnr): - meta=wg_get_rule_clause_atom_meta(db,cl,i); - atom=wg_get_rule_clause_atom(db,cl,i); - check if xatom present somewhere earlier, if not, store lit: - rptr[((*rpos)*LIT_WIDTH)+LIT_META_POS]=meta; - rptr[((*rpos)*LIT_WIDTH)+LIT_ATOM_POS]=newatom; code in build.c: ----------------- wr_build_calc_cl(glb* g, gptr xptr): wr_build_calc_term(glb* g, gint x): wr_computable_termptr(glb* g, gptr tptr): wr_compute_from_termptr(glb* g, gptr tptr): code in clstore.c: ----------------- wr_cl_store_res_terms(glb* g, gptr cl): wr_term_hashstore(glb* g, void* hashdata, gint term, gptr cl): wr_term_complexhash(glb* g, gint* hasharr, gint hashposbits, gint term): wr_atom_funhash(glb* g, gint atom): wr_clterm_add_hashlist(glb* g, vec hashvec, gint hash, gint term, gptr cl): code in unify.c: ---------------- wr_unify_term(glb* g, gint x, gint y, int uniquestrflag) - prepares g->tmp_unify_vc and g->tmp_unify_occcheck - then calls wr_unify_term_aux(glb* g, gint x, gint y, int uniquestrflag) - gets var vals: VARVAL_F(x,(g->varbanks)) - sets vars: SETVAR(x,y,g->varbanks,g->varstack,g->tmp_unify_vc); uses - vec varbanks: // 0: input (passive), 1: given clause renamed, // 2: active clauses, 3: derived clauses, // 4: tmp rename area (vals always UNASSIGNED, never set to other vals!) created in glb.c by wr_glb_init_local_complex: (g->varbanks)=wr_vec_new(g,NROF_VARBANKS*NROF_VARSINBANK); - cvec varstack: set vars are pushed to varstack for quick clearing later created in glb.c by wr_glb_init_local_complex: (g->varstack)=wr_cvec_new(g,NROF_VARBANKS*NROF_VARSINBANK); - gint* tmp_unify_vc; // var count in unification - gint gint tmp_unify_occcheck; - gint tmp_unify_do_occcheck; - etc in glb.h Global data structures ====================== code in glb.h, glb.c to access, use prefix glb-> queues and stacks ----------------- cveco clbuilt; /**< vector containing built clauses, newest last. 0: vec len, 1: index of next unused vec elem */ cveco clactive; cveco clpickstack; /**< vector containing built clause stack to be selected as given before using queue (hyperres eg) */ cveco clqueue; /**< vector containing kept clauses, newest last. 0: vec len, 1: index of next unused vec elem */ gint clqueue_given; /**< index of next clause to be taken from clqueue */ variable banks -------------- vec varbanks; // 0: input (passive), 1: given clause renamed, // 2: active clauses, 3: derived clauses, // 4: tmp rename area (vals always UNASSIGNED, never set to other vals!) 
cvec varstack; temporary buffers for term/clause building: -------------------------------------------- cvec given_termbuf; cvec derived_termbuf; cvec queue_termbuf; cvec active_termbuf; Clause and term representation ============================== code in clterm.h, clterm.c a db tuple is either a: - fact clause: wg_rec_is_fact_clause(db,rec), wr_create_fact_clause(glb* g, int litnr); - rule clause: wg_rec_is_rule_clause(db,rec), wr_create_rule_clause(glb* g, int litnr); - atom: wg_rec_is_atom_rec(db,rec), wr_create_atom(glb* g, int termnr); - term: wg_rec_is_term_rec(db,rec), wr_create_term(glb* g, int termnr); using in meta one bit in pos 3...6 from right, either RECORD_META_FACT_CLAUSE (1<<4), RECORD_META_RULE_CLAUSE (1<<3), RECORD_META_ATOM (1<<5), RECORD_META_TERM (1<<6) hence a clause is either a fact clause or rule clause: fact clauses -------------- unit clauses with no vars (ground) raw db record with len (g->unify_firstuseterm)+litnr+(g->unify_footerlen)) meta 1 gint is inititally RECORD_META_FACT_CLAUSE; rule clauses ------------ non-unit or non-ground var-containing) clauses: extraheader + meta/lit pairs with structure: 1 gint (CLAUSE_EXTRAHEADERLEN) initially RECORD_META_NOTDATA | RECORD_META_RULE_CLAUSE followed by meta/lit pairs (both one gint) in succession: LIT_WIDTH 2, LIT_META_POS 0, LIT_ATOM_POS 1 for raw fields and lengths use: get_field(r,n), set_field(r,n,d), get_record_len(r), wg_count_clause_atoms(db,clause) for fields use: wg_get_rule_clause_atom_meta(db,rec,litnr), wg_get_rule_clause_atom(db,rec,litnr) get lit polarity: wg_atom_meta_is_neg(db,meta) (uses ATOM_META_NEG encode_smallint(1)) check if two literals have different polarities: litmeta_negpolarities(meta1,meta2) atoms ------ raw db record with len (g->unify_firstuseterm)+termnr+(g->unify_footerlen)); unify_firstuseterm; // rec pos where we start to unify unify_maxuseterm; // max nr of rec elems unified one after another, 0 if no limit unify_footerlen; // obligatory amount of unused gints to add to end of each created term meta 1 gint is initially RECORD_META_NOTDATA | RECORD_META_ATOM for fields use: wr_set_atom_subterm(glb* g, void* atom, int termnr, gint subterm); terms ----- almost the same as atoms raw db record with len (g->unify_firstuseterm)+termnr+(g->unify_footerlen)); unify_firstuseterm; // rec pos where we start to unify unify_maxuseterm; // max nr of rec elems unified one after another, 0 if no limit unify_footerlen; // obligatory amount of unused gints to add to end of each created term meta 1 gint is initially RECORD_META_NOTDATA | RECORD_META_TERM for fields use: wr_set_term_subterm(glb* g, void* term, int termnr, gint subterm); Variables ========= code in unify.h and unify.c: #define UNASSIGNED WG_ILLEGAL // 0xff in dbata.h #define VARVAL_DIRECT(x,vb) (vb[decode_var(x)]) - fast var value: #define VARVAL_F(x,vb) (tmp=vb[decode_var(x)], ((tmp==UNASSIGNED) ? x : (!isvar(tmp) ? 
tmp : wr_varval(tmp,vb)))) wr_varval: - loop until !isvar(y) or value is UNASSIGNED - slower var value does the same as VARVAL_F: #define VARVAL(x,vb) (wr_varval(x,vb)) - set var: #define SETVAR(x,y,vb,vstk,vc) (vb[decode_var(x)]=y,vstk[*vc]=(gint)((gptr)vb+decode_var(x)),++(*vc)) whitedb-0.7.2/Reasoner/rtest.c000066400000000000000000000445071226454622500162530ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file rtest.c * Reasoner testing functions. * */ /* ====== Includes =============== */ #include #include #include #ifdef __cplusplus extern "C" { #endif #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "../Db/dballoc.h" #include "../Db/dbmem.h" #include "../Db/dbdata.h" //#include "../Db/dbapi.h" #include "../Db/dbtest.h" #include "../Db/dbdump.h" #include "../Db/dblog.h" #include "../Db/dbquery.h" #include "../Db/dbutil.h" #include "../Parser/dbparse.h" #include "rincludes.h" #include "rmain.h" #include "rtest.h" /* ====== Private headers and defs ======== */ /* ======= Private protos ================ */ static int wr_test_reasoner_otterparser(glb* g,int p); static int wr_test_coretests(glb* g,int p); static int wr_test_eq(glb* g, cvec clvec, char* clstr, int expres, int p); static int wr_test_unify(glb* g, cvec clvec, char* clstr, int expres, int p); static int wr_test_match(glb* g, cvec clvec, char* clstr, int expres, int p); static int wr_test_unify(glb* g, cvec clvec, char* clstr, int expres, int p); static int wr_litinf_is_clear(glb* g,vec v); /* ====== Functions ============== */ /** Run reasoner tests. * Allows each test to be run in separate locally allocated databases, * if necessary. * * returns 0 if no errors. * otherwise returns error code. 
*/ int wg_test_reasoner(int argc, char **argv) { void* db=NULL; glb* g; int tmp=0; int p=2; int localflag=1; printf("******** wg_test_reasoner starts ********* \n"); if (localflag) db=wg_attach_local_database(500000); if (db==NULL) { if (p) printf("failed to initialize database\n"); return 1; } g=wr_glb_new_full(db); if (g==NULL) { if (p) printf("failed to initialize reasoner globals\n"); return 1; } if (tmp==0) tmp=wr_test_reasoner_otterparser(g,p); if (tmp) { if (p) printf("failed to parse otter text\n"); return 1; } tmp=wr_test_coretests(g,p); //wr_init_active_passive_lists_std(g); //res=wr_genloop(g); //printf("\nresult %d\n",res); //printf("----------------------------------\n"); //wr_show_stats(g); //printf("----------------------------------\n"); wr_glb_free(g); if (!tmp) { if (p) printf("******** wg_test_reasoner ends OK ********\n"); } else { if (p) printf("******** wg_test_reasoner ends with an error ********\n"); } if (localflag) wg_delete_local_database(db); return tmp; } static int wr_test_reasoner_otterparser(glb* g,int p) { int err; int res=0; char* otterstr; int ottestrlen; otterstr="p(3). -p(?X) | =(?X,2)."; ottestrlen=strlen(otterstr); if (p>0) printf("--------- wr_test_reasoner_otterparser starts ---------\n"); //err = wr_import_otter_file(g,"otter.txt",otterstr,ottestrlen); err = wr_import_otter_file(g,"Rexamples/otter.txt",NULL,NULL); if(!err) { if (p>1) printf("Data imported from otter file OK\n"); res=0; } else if(err<-1) { if (p>0) printf("Fatal error when importing otter file, data may be partially imported\n"); res=1; } else { if (p>0) printf("Import failed.\n"); res=1; } if (p>0) printf("--------- wr_test_reasoner_otterparser ends ---------\n"); return res; } static int wr_test_coretests(glb* g,int p) { int tmp=0; cvec clvec; if (p>0) printf("--------- wr_test_reasoner_coretests starts ---------\n"); clvec=wr_cvec_new(g,1000); if (clvec==NULL) { printf("cannot allocate cvec in wr_test_coretests\n"); return 1; } (g->print_initial_parser_result)=0; (g->print_generic_parser_result)=0; if (p>0) printf("- - - wr_equal_term tests - - -\n"); if (!tmp) tmp=wr_test_eq(g,clvec,"p(1). m(1).",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(1). ?X(1).",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(1). 'a'(1).",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(1,-1,0,2,100000). p(1,-1,0,2,100000).",1,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(1,10). p(1,20).",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(1.23456). p(1.23456).",1,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(1.0). p(1.0).",1,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(1.0,-10.2,10000.00001). p(1.0,-10.2,10000.00001).",1,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(1.0,-10.2,10000.00001). p(1,-10.2,10000.00001).",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(a,'a1a1a1a1a1a1a1a1a1a1'). p(a,'a1a1a1a1a1a1a1a1a1a1').",1,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(a,'a1a1a1a1a1a1a1a1a1a1'). p(a,'a1a1a1a1a1a1a1a1a1a2').",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(\"abc\"). p(\"abc\").",1,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(\"abc\",\"x\"). p(\"abc\",\"x\").",1,p); //if (!tmp) tmp=wr_test_eq(g,clvec,"p(\"abc\",\"\",\"x\"). p(\"abc\",\"\",\"x\").",1,p); //if (!tmp) tmp=wr_test_eq(g,clvec,"p(\"abc\",\"\",\"x\"). p(\"abc\",\"\",\"y\").",1,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(\"abc\",\"xxc\"). 
p(\"abc\",\"xxC\").",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(\"0123456789012345678901234567890123456789\").\ p(\"0123456789012345678901234567890123456789\").",1,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(\"0123456789012345678901234567890123456789\").\ p(\"012345678901234567890123456789012345678\").",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(\"01234567890123456789012345678901234567a9\").\ p(\"01234567890123456789012345678901234567b9\").",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(a,bbbb,c). p(a,bbbb,c).",1,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(a,bbbb,c). p(a,bbbd,c).",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(a,bbbb). p(a,\"bbbb\").",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(a,bbbb). p(a,'bbbb').",1,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(a,'bbbb'). p(a,\"bbbb\").",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(?X,10). p(?X,10).",1,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(?X,10). p(X,10).",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(?X,?Y). p(?X,?Y).",1,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(?X,?X). p(?X,?Y).",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(1,f(a,f(1,2,c),g(1))). p(1,f(a,f(1,2,c),g(1))).",1,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(1,f(a,f(1,2,c),g(1))). p(1,f(a,f(1,3,c),g(1))).",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(1,f(a,f(1,2,c),g(1))). p(1,f(a,f(1,?X,c),g(1))).",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(1,f(a,f(1,2,c),g(1))). p(1,f(a,D,g(1))).",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(1,f(a,f(1,b,c),g(1))). p(1,f(a,f(1,\"b\",c),g(1))).",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(1,f(a,f(1,\"b\",c),g(1))). p(1,f(a,f(1,\"b\",c),g(1))).",1,p); if (p>0) printf("- - - wr_unify_term const case tests - - -\n"); if (!tmp) tmp=wr_test_unify(g,clvec,"p(1,-1,0,2,100000) | p(1,-1,0,2,100000).",1,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(1,10) | p(1,20).",0,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(1.23456) | p(1.23456).",1,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(1.0) | p(1.0).",1,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(1.0,-10.2,10000.00001) | p(1.0,-10.2,10000.00001).",1,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(1.0,-10.2,10000.00001) | p(1,-10.2,10000.00001).",0,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(a,'a1a1a1a1a1a1a1a1a1a1') | p(a,'a1a1a1a1a1a1a1a1a1a1').",1,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(a,'a1a1a1a1a1a1a1a1a1a1') | p(a,'a1a1a1a1a1a1a1a1a1a2').",0,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(\"abc\") | p(\"abc\").",1,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(\"abc\",\"x\") | p(\"abc\",\"x\").",1,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(a,bbbb,c) | p(a,bbbb,c).",1,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(a,bbbb,c) | p(a,bbbd,c).",0,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(\"0123456789012345678901234567890123456789\") |\ p(\"0123456789012345678901234567890123456789\").",1,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(\"0123456789012345678901234567890123456789\") |\ p(\"012345678901234567890123456789012345678\").",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(\"01234567890123456789012345678901234567a9\") |\ p(\"01234567890123456789012345678901234567b9\").",0,p); if (p>0) printf("- - - wr_unify_term var case tests - - -\n"); if (!tmp) tmp=wr_test_unify(g,clvec,"p(?X,10) | p(?X,10).",1,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(?X,10) | p(X,10).",1,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(?X,?Y) | p(a,b).",1,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(a,b) | p(?X,?Y).",1,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(?X,?X) | p(a,b).",0,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(a,b) | p(?X,?X).",0,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(?X,?X) | p(?X,?X).",1,p); if (!tmp) 
tmp=wr_test_unify(g,clvec,"p(?X,?X) | p(?X,?Y).",1,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(?X) | p(f(f(f(?X)))).",0,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(f(f(f(?X)))) | p(?X).",0,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(?X) | p(f(f(f(?Y)))).",1,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(?X,?X) | p(?X,f(f(f(?X)))).",0,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(?X,?X) | p(?X,f(f(f(?Y)))).",1,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(?X,?X) | p(?Y,f(f(f(?Y)))).",0,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(a,b,?X,?Y) | p(?U,?V,?U,?V).",1,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(a,b,?X,?Y) | p(?U,?V,?U,?U).",1,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(a,b,?X,?X) | p(?U,?V,?U,?V).",0,p); if (!tmp) tmp=wr_test_unify(g,clvec,"p(?X,?X,f(?B)) | p(?A,?B,?A).",0,p); if (p>0) printf("- - - wr_match_term const case tests - - -\n"); if (!tmp) tmp=wr_test_match(g,clvec,"p(1,-1,0,2,100000) | p(1,-1,0,2,100000).",1,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(1,10) | p(1,20).",0,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(1.23456) | p(1.23456).",1,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(1.0) | p(1.0).",1,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(1.0,-10.2,10000.00001) | p(1.0,-10.2,10000.00001).",1,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(1.0,-10.2,10000.00001) | p(1,-10.2,10000.00001).",0,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(a,'a1a1a1a1a1a1a1a1a1a1') | p(a,'a1a1a1a1a1a1a1a1a1a1').",1,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(a,'a1a1a1a1a1a1a1a1a1a1') | p(a,'a1a1a1a1a1a1a1a1a1a2').",0,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(\"abc\") | p(\"abc\").",1,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(\"abc\",\"x\") | p(\"abc\",\"x\").",1,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(a,bbbb,c) | p(a,bbbb,c).",1,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(a,bbbb,c) | p(a,bbbd,c).",0,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(\"0123456789012345678901234567890123456789\") |\ p(\"0123456789012345678901234567890123456789\").",1,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(\"0123456789012345678901234567890123456789\") |\ p(\"012345678901234567890123456789012345678\").",0,p); if (!tmp) tmp=wr_test_eq(g,clvec,"p(\"01234567890123456789012345678901234567a9\") |\ p(\"01234567890123456789012345678901234567b9\").",0,p); if (p>0) printf("- - - wr_match_term var case tests - - -\n"); if (!tmp) tmp=wr_test_match(g,clvec,"p(?X,10) | p(?X,10).",1,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(?X,10) | p(X,10).",1,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(?X,?Y) | p(a,b).",1,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(a,b) | p(?X,?Y).",0,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(?X,?X) | p(a,b).",0,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(a,b) | p(?X,?X).",0,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(?X,?X) | p(?X,?X).",1,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(?X,?X) | p(?X,?Y).",0,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(?X) | p(f(f(f(?X)))).",1,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(f(f(f(?X)))) | p(?X).",0,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(?X) | p(f(f(f(?Y)))).",1,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(?X,?X) | p(?X,f(f(f(?X)))).",0,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(?X,?X) | p(?X,f(f(f(?Y)))).",0,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(?X,?X) | p(?Y,f(f(f(?Y)))).",0,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(a,b,?X,?Y) | p(?U,?V,?U,?V).",0,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(?U,?V,?U,?V) | p(a,b,?X,?Y).",0,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(?U,?V,?U,?V) | p(a,?X,a,?X).",1,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(a,b,?X,?Y) | p(?U,?V,?U,?U).",0,p); if (!tmp) 
tmp=wr_test_match(g,clvec,"p(?X,?X,f(?B)) | p(?A,?B,?A).",0,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(?A,?B,?A) | p(?X,?X,f(?B)).",0,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(g(?A,?B,?A)) | p(g(?X,?X,f(?B))).",0,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(?A,?B,?A) | p(f(?A),?X,f(?A)).",1,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(g(?A,?B,?A)) | p(g(f(?A),?X,f(?A))).",1,p); if (!tmp) tmp=wr_test_match(g,clvec,"p(f(?X),?X) | p(f(?X),?Y).",0,p); if (p>0) printf("- - - wr_subsume_cl tests - - -\n"); if (!tmp) tmp=wr_test_subsume_cl(g,clvec,"p(1). p(1).",1,p); if (!tmp) tmp=wr_test_subsume_cl(g,clvec,"p(?X). p(?X).",1,p); if (!tmp) tmp=wr_test_subsume_cl(g,clvec,"p(?X). p(1).",1,p); if (!tmp) tmp=wr_test_subsume_cl(g,clvec,"p(1). p(?X).",0,p); if (!tmp) tmp=wr_test_subsume_cl(g,clvec,"p(1). p(?X) | p(1).",1,p); if (!tmp) tmp=wr_test_subsume_cl(g,clvec,"p(?X) | p(1). p(1). ",0,p); if (!tmp) tmp=wr_test_subsume_cl(g,clvec,"p(?X) | p(1). p(?X) | p(1). ",1,p); if (!tmp) tmp=wr_test_subsume_cl(g,clvec,"p(?X) | p(1). p(1) | p(?X). ",1,p); if (!tmp) tmp=wr_test_subsume_cl(g,clvec,"p(?X) | p(1). p(?X) | p(2). ",0,p); if (!tmp) tmp=wr_test_subsume_cl(g,clvec,"p(?X) | p(?Y). p(?X) | p(?X). ",1,p); if (!tmp) tmp=wr_test_subsume_cl(g,clvec,"p(?X) | p(?X). p(a) | m(b). ",0,p); if (!tmp) tmp=wr_test_subsume_cl(g,clvec,"p(?X) | p(?X). p(a) | p(b). ",0,p); if (!tmp) tmp=wr_test_subsume_cl(g,clvec,"p(?X) | p(?X). p(?X) | m(?Y). ",0,p); if (!tmp) tmp=wr_test_subsume_cl(g,clvec,"p(?X) | p(?X). p(?X) | p(?Y). ",0,p); if (!tmp) tmp=wr_test_subsume_cl(g,clvec,"p(?X) | p(?X). p(a) | p(b) | p(b). ",1,p); if (!tmp) tmp=wr_test_subsume_cl(g,clvec,"p(?X) | p(?X) | p(?Y). p(c) | p(a) | p(b) | p(b). ",1,p); if (!tmp) tmp=wr_test_subsume_cl(g,clvec,"p(?X) | p(?X) | p(?Y). p(c) | p(?X) | p(?Y) | p(?Z). 
",0,p); if (p>0) printf("--------- wr_test_reasoner_coretests ends ---------\n"); free(clvec); return tmp; } static int wr_test_eq(glb* g, cvec clvec, char* clstr, int expres, int p) { int res,tmp; gint t1,t2; if (p>1) printf("eq testing %s expected %d ",clstr,expres); tmp=wr_import_otter_file(g,NULL,clstr,clvec); if (tmp) { if (p>0) printf("\neq testing %s: otter import failed\n",clstr); return 1; } t1=rpto(g,clvec[2]); t2=rpto(g,clvec[3]); res=1; if (wr_equal_term(g,t1,t2,1)) { if (expres) res=0; } else { if (!expres) res=0; } if (p>1) { if (res) printf("test FAILED\n"); else printf("test OK\n"); } return res; } static int wr_test_unify(glb* g, cvec clvec, char* clstr, int expres, int p) { //void* db=g->db; int res,tmp; gptr cl; gint t1,t2; if (p>1) printf("unify testing %s expected %d ",clstr,expres); tmp=wr_import_otter_file(g,NULL,clstr,clvec); if (tmp) { if (p>0) printf("\nunify testing %s: otter import failed\n",clstr); return 1; } cl=(gptr)(clvec[2]); t1=wg_get_rule_clause_atom(db,cl,0); t2=wg_get_rule_clause_atom(db,cl,1); res=1; wr_clear_all_varbanks(g); tmp=wr_unify_term(g,t1,t2,1); wr_clear_varstack(g,g->varstack); if (!wr_varbanks_are_clear(g,g->varbanks)) { if (p>1) { printf(" varbanks NOT CLEARED, test FAILED\n"); } return 1; } if (tmp) { if (expres) res=0; } else { if (!expres) res=0; } if (p>1) { if (res) printf("test FAILED\n"); else printf("test OK\n"); } return res; } static int wr_test_match(glb* g, cvec clvec, char* clstr, int expres, int p) { //void* db=g->db; int res,tmp; gptr cl; gint t1,t2; if (p>1) printf("match testing %s expected %d ",clstr,expres); tmp=wr_import_otter_file(g,NULL,clstr,clvec); if (tmp) { if (p>0) printf("\nmatch testing %s: otter import failed\n",clstr); return 1; } cl=(gptr)(clvec[2]); t1=wg_get_rule_clause_atom(db,cl,0); t2=wg_get_rule_clause_atom(db,cl,1); res=1; wr_clear_all_varbanks(g); tmp=wr_match_term(g,t1,t2,1); wr_clear_varstack(g,g->varstack); if (!wr_varbanks_are_clear(g,g->varbanks)) { if (p>1) { printf(" varbanks NOT CLEARED, test FAILED\n"); } return 1; } if (tmp) { if (expres) res=0; } else { if (!expres) res=0; } if (p>1) { if (res) printf("test FAILED\n"); else printf("test OK\n"); } return res; } static int wr_test_subsume_cl(glb* g, cvec clvec, char* clstr, int expres, int p) { //void* db=g->db; int res,tmp; gptr cl1,cl2; int i2; if (p>1) printf("subsume_cl testing %s expected %d ",clstr,expres); tmp=wr_import_otter_file(g,NULL,clstr,clvec); if (tmp) { if (p>0) printf("\nunify testing %s: otter import failed\n",clstr); return 1; } // cl order is reversed!! cl1=(gptr)(clvec[3]); cl2=(gptr)(clvec[2]); res=1; wr_clear_all_varbanks(g); for(i2=1;i2<=(g->tmp_litinf_vec)[0];i2++) (g->tmp_litinf_vec)[i2]=0; tmp=wr_subsume_cl(g,cl1,cl2,1); //wr_clear_varstack(g,g->varstack); if (!wr_varbanks_are_clear(g,g->varbanks)) { if (p>1) { printf(" varbanks NOT CLEARED, test FAILED\n"); } return 1; } // no need for g->tmp_litinf_vec to be clear //if (!wr_litinf_is_clear(g,g->tmp_litinf_vec)) { // if (p>1) { // printf(" litinfo NOT CLEARED, test FAILED\n"); // } // return 1; //} if (tmp) { if (expres) res=0; } else { if (!expres) res=0; } if (p>1) { if (res) printf("test FAILED\n"); else printf("test OK\n"); } return res; } /* static int wr_litinf_is_clear(glb* g,vec v) { int i; for(i=1;i. * */ /** @file rtest.h * Reasoner testing functions. 
* */ #ifndef DEFINED_RTEST_H #define DEFINED_RTEST_H #include "glb.h" int wg_test_reasoner(int argc, char **argv); static int wr_test_subsume_cl(glb* g, cvec clvec, char* clstr, int expres, int p); #endif whitedb-0.7.2/Reasoner/subsume.c000066400000000000000000000233771226454622500165770ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file subsume.c * Subsumption functions. * */ /* ====== Includes =============== */ #include #include #include #ifdef __cplusplus extern "C" { #endif #include "rincludes.h" //#define DEBUG #undef DEBUG /* ====== Private headers and defs ======== */ /* ======= Private protos ================ */ /* ====== Functions ============== */ /* ------------------------------------------------------ clause-to-clause-list subsumption ------------------------------------------------------ */ int wr_given_cl_subsumed(glb* g, gptr given_cl) { gptr cl; int iactive; gptr actptr; gint iactivelimit; int sres; #ifdef DEBUG printf("wr_given_cl_is_subsumed is called with \n"); wr_print_clause(g,given_cl); #endif if(1) { actptr=rotp(g,g->clactive); iactivelimit=CVEC_NEXT(actptr); for(iactive=CVEC_START; iactiveslit and recursive subsumption without glit and slit present is OK then return OK else restore variable settings before last match attempt If match found during internal loop otherwise Loop over all gcl literals Loop over all scl literals if match found glit<->slit then stop internal loop else restore variable settings before last match attempt */ gint wr_subsume_cl(glb* g, gptr cl1, gptr cl2, int uniquestrflag) { void* db=g->db; int cllen1,cllen2; int i2; gint meta1,meta2; gint lit1,lit2; int vc_tmp; int mres; #ifdef DEBUG printf("wr_subsume_cl called on %d %d \n",(int)cl1,(int)cl2); wr_print_clause(g,cl1); wr_print_clause(g,cl2); #endif ++(g->stat_clsubs_attempted); // check fact clause cases first if (!wg_rec_is_rule_clause(db,cl1)) { if (!wg_rec_is_rule_clause(db,cl2)) { #ifdef DEBUG printf("both clauses are facts \n"); #endif ++(g->stat_clsubs_unit_attempted); if (wr_equal_term(g,encode_record(db,cl1),encode_record(db,cl2),uniquestrflag)) return 1; else return 0; } else { cllen2=wg_count_clause_atoms(db,cl2); lit1=encode_record(db,cl1); ++(g->stat_clsubs_unit_attempted); for(i2=0;i2stat_clsubs_unit_attempted); cllen2=1; if (cllen1>1) return 0; meta1=wg_get_rule_clause_atom_meta(db,cl1,0); if (wg_atom_meta_is_neg(d,meta1)) return 0; lit1=wg_get_rule_clause_atom(db,cl1,0); lit2=rpto(g,cl2); vc_tmp=2; mres=wr_match_term(g,lit1,lit2,uniquestrflag); if (vc_tmp!=*((g->varstack)+1)) wr_clear_varstack(g,g->varstack); return mres; } else { // cl2 is a rule clause cllen2=wg_count_clause_atoms(db,cl2); } // now both clauses are rule clauses #ifdef DEBUG printf("cllen1 %d cllen2 %d\n",cllen1,cllen2); #endif // check unit rule clause case if (cllen1==1) { 
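/* Unit rule clause case. Illustration (hypothetical clauses, not taken from
   the test suite): a one-literal clause such as p(?X) subsumes a clause like
   p(a) | q(b) because its single literal matches some literal of cl2 of
   compatible polarity; the loop below tries wr_match_term against each
   literal of cl2 in turn and clears the varstack after each attempt. */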
#ifdef DEBUG printf("unit clause subsumption case \n"); #endif ++(g->stat_clsubs_unit_attempted); ++(g->stat_clsubs_unit_attempted); meta1=wg_get_rule_clause_atom_meta(db,cl1,0); lit1=wg_get_rule_clause_atom(db,cl1,0); vc_tmp=2; *((g->varstack)+1)=vc_tmp; // zero varstack pointer for(i2=0;i2varstack)+1)) wr_clear_varstack(g,g->varstack); if (mres) return 1; } } return 0; } if (cllen1>cllen2) return 0; // now both clauses are nonunit rule clauses and we do full subsumption // prepare for subsumption: set globals etc #ifdef DEBUG printf("general subsumption case \n"); #endif g->tmp_unify_vc=(g->varstack)+1; // var counter for varstack // clear lit information vector (0 pos holds vec len) for(i2=1;i2<=cllen2;i2++) (g->tmp_litinf_vec)=wr_vec_store(g,g->tmp_litinf_vec,i2,0); ++(g->stat_clsubs_full_attempted); mres=wr_subsume_cl_aux(g,cl1,cl2, cl1+RECORD_HEADER_GINTS+CLAUSE_EXTRAHEADERLEN, cl2+RECORD_HEADER_GINTS+CLAUSE_EXTRAHEADERLEN, 0,0, cllen1,cllen2, uniquestrflag); wr_clear_varstack(g,g->varstack); return mres; } /* Each gcl literal must match a different scl literal. Take first gcl literal glit Loop over all scl literals if match found glit<->slit and recursive subsumption without glit and slit present is OK then return OK else restore variable settings before last match attempt If match found during internal loop in other words Loop over all gcl literals Loop over all scl literals if match found glit<->slit then stop internal loop else restore variable settings before last match attempt */ gint wr_subsume_cl_aux(glb* g,gptr cl1vec, gptr cl2vec, gptr litpt1, gptr litpt2, int litind1, int litind2, int cllen1, int cllen2, int uniquestrflag) { int i1,i2; gint lit1,lit2; gptr pt1,pt2; gint meta1,meta2; int foundflag; int vc_tmp; int nobackflag; #ifdef DEBUG printf("wr_subsume_cl_aux called with litind1 %d \n",litind1); #endif if(litind1tmp_litinf_vec)[i2+1]==0) { // literal not bound by subsumption yet meta2=*(pt2+LIT_META_POS); lit2=*(pt2+LIT_ATOM_POS); foundflag=0; if (!litmeta_negpolarities(meta1,meta2)) { vc_tmp=*(g->tmp_unify_vc); // store current value of varstack pointer ???????? if (wr_match_term_aux(g,lit1,lit2,uniquestrflag)) { #ifdef DEBUG printf("lit match ok with *(g->tmp_unify_vc): %d\n",*(g->tmp_unify_vc)); wr_print_vardata(g); #endif // literals were successfully matched (g->tmp_litinf_vec)[i2+1]=1; // mark as a bound literal if (vc_tmp==*(g->tmp_unify_vc)) nobackflag=1; // no need to backtrack if ((i1+1>=cllen1) || wr_subsume_cl_aux(g,cl1vec,cl2vec,pt1+(LIT_WIDTH),litpt2, i1+1,i2,cllen1,cllen2,uniquestrflag)) { // found a right match for current gcl lit #ifdef DEBUG printf("rest ok with *(g->tmp_unify_vc): %d\n",*(g->tmp_unify_vc)); wr_print_vardata(g); #endif return 1; } #ifdef DEBUG printf("rest failed with *(g->tmp_unify_vc): %d\n",*(g->tmp_unify_vc)); wr_print_vardata(g); #endif if (vc_tmp!=*(g->tmp_unify_vc)) wr_clear_varstack_topslice(g,g->varstack,vc_tmp); (g->tmp_litinf_vec)[i2+1]=0; // clear as a bound literal } else { //printf("lit match failed with *(g->tmp_unify_vc): %d\n",*(g->tmp_unify_vc)); if (vc_tmp!=*(g->tmp_unify_vc)) wr_clear_varstack_topslice(g,g->varstack,vc_tmp); } } } } // all literals checked, no way to subsume using current lit1 return 0; } // clause printf("REASONER ERROR! 
something wrong in calling wr_subsume_cl_aux\n"); //printf("litind1: %d cllen1: %d i1: %d \n", litind1,cllen1,i1); return 0; } #ifdef __cplusplus } #endif whitedb-0.7.2/Reasoner/subsume.h000066400000000000000000000024441226454622500165740ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file subsume.h * Public headers for subsumption functions. * */ #ifndef DEFINED_SUBSUME_H #define DEFINED_SUBSUME_H #include "glb.h" int wr_given_cl_subsumed(glb* g, gptr given_cl); gint wr_subsume_cl(glb* g, gptr cl1, gptr cl2, int uniquestrflag); gint wr_subsume_cl_aux(glb* g,gptr cl1vec, gptr cl2vec, gptr litpt1, gptr litpt2, int litind1, int litind2, int cllen1, int cllen2, int uniquestrflag); #endif whitedb-0.7.2/Reasoner/types.h000066400000000000000000000024571226454622500162610ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file types.h * Various datatypes. * */ #ifndef DEFINED_DATATYPES_H #define DEFINED_DATATYPES_H #ifdef _WIN32 #include "../config-w32.h" #else #include "../config.h" #endif #include "../Db/dballoc.h" #include "../Db/dbdata.h" typedef gint* gptr; typedef gint* vec; /**< array with length: 0 contains len of array*/ typedef gint* cvec; /**< array with length and freepos: 0 len of array, 1 first free pos */ typedef gint veco; /**< offset of vec */ typedef gint cveco; /**< offset of cvec */ #endif whitedb-0.7.2/Reasoner/unify.c000066400000000000000000000417731226454622500162460ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Tanel Tammet 2004,2005,2006,2007,2008,2009,2010 * * Contact: tanel.tammet@gmail.com * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file unify.c * Unification functions. * */ /* ====== Includes =============== */ #include #include #include #ifdef __cplusplus extern "C" { #endif #include "rincludes.h" //#define DEBUG #undef DEBUG /* ====== Private headers and defs ======== */ static gint wr_occurs_in(glb* g, gint x, gint y, gptr vb); /* ======= Private protos ================ */ /* ====== Functions ============== */ /** Plain term unification using g->unify_samelen and g->unify_maxuseterms * * Metainfo is not filtered out. Must be exactly the same. * */ gint wr_unify_term(glb* g, gint x, gint y, int uniquestrflag) { g->tmp_unify_vc=((gptr)(g->varstack))+1; // pointer arithmetic: &(varstack[1]) g->tmp_unify_occcheck=1; if (wr_unify_term_aux(g,x,y,uniquestrflag)) { return 1; } else { return 0; } } /** Plain term unification using g->unify_samelen and g->unify_maxuseterms * * Metainfo is not filtered out. Must be exactly the same. * */ gint wr_unify_term_aux(glb* g, gint x, gint y, int uniquestrflag) { gptr db; gint encx,ency; gint tmp; // used by VARVAL_F macro gptr xptr,yptr; int xlen,ylen,uselen,ilimit,i; #ifdef DEBUG printf("wr_unify_term_aux called with x %d ",x); wr_print_term(g,x); printf(" and y %d ",y); wr_print_term(g,y); printf("\n"); #endif // first check if immediately same: return 1 if yes if (x==y) return 1; // second, fetch var values for var args if (isvar(x)) x=VARVAL_F(x,(g->varbanks)); if (isvar(y)) y=VARVAL_F(y,(g->varbanks)); // check again if same if (x==y) return 1; // go through the ladder of possibilities // knowing that x and y are different if (!isdatarec(x)) { // x is a primitive if (!isdatarec(y)) { // both x and y are primitive if (isvar(x)) { SETVAR(x,y,g->varbanks,g->varstack,g->tmp_unify_vc); // set occcheck only if y is a var too if (g->tmp_unify_do_occcheck && isvar(y)) (g->tmp_unify_occcheck)=1; return 1; } else if (isvar(y)) { SETVAR(y,x,g->varbanks,g->varstack,g->tmp_unify_vc); // do not set occcheck here: x is a constant! 
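/* Added note: binding the variable y to the constant x cannot create a
   cyclic binding chain, so tmp_unify_occcheck is left untouched here; when
   occurs checking is enabled at all, the flag is raised only where a
   variable is bound to another variable or to a complex term, since only
   such bindings can make a later occurs check (wr_occurs_in) necessary. */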
return 1; } else { // x and y are constants if (wr_equal_ptr_primitives(g,x,y,uniquestrflag)) return 1; else return 0; } // x is primitive, but y is not } else if (isvar(x)) { // x is var, y is non-primitive if (g->tmp_unify_occcheck && wr_occurs_in(g,x,y,(gptr)(g->varbanks))) { return 0; } else { SETVAR(x,y,g->varbanks,g->varstack,g->tmp_unify_vc); if (g->tmp_unify_do_occcheck) (g->tmp_unify_occcheck)=1; return 1; } } else { // x is a constant, but y is non-primitive return 0; } // x is not primitive } else if (isvar(y)) { // x is non-primitive, y is var if (g->tmp_unify_occcheck && wr_occurs_in(g,y,x,(gptr)(g->varbanks))) { return 0; } else { SETVAR(y,x,g->varbanks,g->varstack,g->tmp_unify_vc); if (g->tmp_unify_do_occcheck) (g->tmp_unify_occcheck)=1; return 1; } // x is not primitive, y is non-var } else if (!isdatarec(y)) { // x is not primitive, y is constant return 0; } else { // x and y are both complex terms db=g->db; xptr=decode_record(db,x); yptr=decode_record(db,y); xlen=get_record_len(xptr); ylen=get_record_len(yptr); if (g->unify_samelen) { if (xlen!=ylen) return 0; uselen=xlen; } else { if (xlen<=ylen) uselen=xlen; else uselen=ylen; } if (g->unify_maxuseterms) { if (((g->unify_maxuseterms)+(g->unify_firstuseterm))unify_firstuseterm)+(g->unify_maxuseterms); } ilimit=RECORD_HEADER_GINTS+uselen; for(i=RECORD_HEADER_GINTS+(g->unify_firstuseterm); idb; gptr yptr; gint yi; int ylen,ilimit,i; gint tmp; // used by VARVAL_F #ifdef DEBUG printf("wr_occurs_in called with x %d ",x); wr_print_term(g,x); printf(" and y %d ",y); wr_print_term(g,y); printf("\n"); #endif yptr=decode_record(db,y); ylen=get_record_len(yptr); if (g->unify_maxuseterms) { if (((g->unify_maxuseterms)+(g->unify_firstuseterm))unify_firstuseterm)+(g->unify_maxuseterms); } ilimit=RECORD_HEADER_GINTS+ylen; for(i=RECORD_HEADER_GINTS+(g->unify_firstuseterm); iunify_samelen and g->unify_maxuseterms */ gint wr_match_term(glb* g, gint x, gint y, int uniquestrflag) { g->tmp_unify_vc=((gptr)(g->varstack))+1; // pointer arithmetic: &(varstack[1]) if (wr_match_term_aux(g,x,y,uniquestrflag)) { return 1; } else { return 0; } } /** Plain term matching using g->unify_samelen and g->unify_maxuseterms returns 1 iff x subsumes y assumptions: ??? GBUNASSIGNEDVAL!=1 ???? ?? xvars are initially all unassigned ?? ?? yvars are initially all unassigned ?? 
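   a concrete illustration, mirroring the wr_test_match cases in rtest.c:
   matching p(?X,?Y) against p(a,b) succeeds and binds ?X and ?Y, while
   matching p(a,b) against p(?X,?Y) fails, since only variables on the x
   side are ever bound; callers such as wr_test_match clear the variable
   banks beforehand and clear the varstack afterwards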
xvars do not have to be different from yvars */ gint wr_match_term_aux(glb* g, gint x, gint y, int uniquestrflag) { gptr db; gint xval,encx,ency; gptr xptr,yptr; int xlen,ylen,uselen,ilimit,i; gint eqencx; // used by WR_EQUAL_TERM macro #ifdef DEBUG printf("wr_match_term_aux called with x %d ",x); wr_print_term(g,x); printf(" and y %d ",y); wr_print_term(g,y); printf("\n"); #endif // check x var case immediately if (isvar(x)) { xval=VARVAL_DIRECT(x,(g->varbanks)); if (xval==UNASSIGNED) { // previously unassigned var: assign now and return SETVAR(x,y,g->varbanks,g->varstack,g->tmp_unify_vc); return 1; } else { // xval must now be equal to y, else match fails if (WR_EQUAL_TERM(g,xval,y,uniquestrflag)) return 1; return 0; } } // now x is not var if (!isdatarec(x)) { if (WR_EQUAL_TERM(g,x,y,uniquestrflag)) return 1; } if (!isdatarec(y)) return 0; // x is datarec but y is not // now x and y are different datarecs if (1) { db=g->db; xptr=decode_record(db,x); yptr=decode_record(db,y); xlen=get_record_len(xptr); ylen=get_record_len(yptr); if (g->unify_samelen) { if (xlen!=ylen) return 0; uselen=xlen; } else { if (xlen<=ylen) uselen=xlen; else uselen=ylen; } if (g->unify_maxuseterms) { if (((g->unify_maxuseterms)+(g->unify_firstuseterm))unify_firstuseterm)+(g->unify_maxuseterms); } ilimit=RECORD_HEADER_GINTS+uselen; for(i=RECORD_HEADER_GINTS+(g->unify_firstuseterm); iunify_samelen and g->unify_maxuseterms * * Metainfo is not filtered out. Must be exactly the same. * * NB! For a faster version use macro WR_EQUAL_TERM doing the same thing * and assuming the presence of the gint eqencx variable */ gint wr_equal_term(glb* g, gint x, gint y, int uniquestrflag) { gptr db; gint encx,ency; gptr xptr,yptr; int xlen,ylen,uselen,i,ilimit; gint eqencx; // used by the WR_EQUAL_TERM macro #ifdef DEBUG printf("wr_equal_term called with x %d and y %d\n",x,y); #endif // first check if immediately same: return 1 if yes if (x==y) return 1; // handle immediate check cases: for these bit suffixes x is equal to y iff x==y encx=(x&NORMALPTRMASK); if ((encx==LONGSTRBITS && uniquestrflag) || encx==SMALLINTBITS || encx==NORMALPTRMASK) return 0; // immediate value: must be unequal since x==y checked before if (!isptr(x) || !isptr(y)) return 0; // here both x and y are ptr types // quick type check: last two bits if (((x)&NONPTRBITS)!=((y)&NONPTRBITS)) return 0; // if one is datarec, the other must be as well if (!isdatarec(x)) { if (isdatarec(y)) return 0; // neither x nor y are datarecs // need to check pointed values if (wr_equal_ptr_primitives(g,x,y,uniquestrflag)) return 1; else return 0; } else { if (!isdatarec(y)) return 0; // both x and y are datarecs db=g->db; xptr=decode_record(db,x); yptr=decode_record(db,y); xlen=get_record_len(xptr); ylen=get_record_len(yptr); if (g->unify_samelen) { if (xlen!=ylen) return 0; uselen=xlen; } else { if (xlen<=ylen) uselen=xlen; else uselen=ylen; } if (g->unify_maxuseterms) { if (((g->unify_maxuseterms)+(g->unify_firstuseterm))unify_firstuseterm)+(g->unify_maxuseterms); } ilimit=RECORD_HEADER_GINTS+uselen; for(i=RECORD_HEADER_GINTS+(g->unify_firstuseterm); idb; xptr=decode_record(db,x); yptr=decode_record(db,y); xlen=get_record_len(xptr); ylen=get_record_len(yptr); if (g->unify_samelen) { if (xlen!=ylen) return 0; uselen=xlen; } else { if (xlen<=ylen) uselen=xlen; else uselen=ylen; } if (g->unify_maxuseterms) { if (((g->unify_maxuseterms)+(g->unify_firstuseterm))unify_firstuseterm)+(g->unify_maxuseterms); } ilimit=RECORD_HEADER_GINTS+uselen; 
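/* Added note: only a window of each record is compared, starting at field
   unify_firstuseterm and spanning at most unify_maxuseterms fields (all
   remaining fields when unify_maxuseterms is 0), the same convention as in
   the wr_unify_term_aux and wr_match_term_aux loops above. */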
for(i=RECORD_HEADER_GINTS+(g->unify_firstuseterm); idb,decode_fullint_offset(a))==dbfetch(g->db,decode_fullint_offset(b))) ) return 1; else return 0; case FULLDOUBLEBITS: if (isfulldouble(b) && wg_decode_double(g->db,a)==wg_decode_double(g->db,b) ) return 1; else return 0; case SHORTSTRBITS: //printf("shortstrbits \n"); if (isshortstr(b) && !memcmp((void*)(offsettoptr(g->db,decode_shortstr_offset(a))), (void*)(offsettoptr(g->db,decode_shortstr_offset(b))), SHORTSTR_SIZE)) return 1; else return 0; case LONGSTRBITS: if (uniquestrflag) { if (a==b) return 1; else return 0; } else { t1=wg_get_encoded_type(g->db,a); t2=wg_get_encoded_type(g->db,b); if (t1!=t2) return 0; l1=wg_decode_unistr_lang_len(g->db,a,t1); l2=wg_decode_unistr_lang_len(g->db,b,t2); if (11!=l2) return 0; ol=l1; l1=wg_decode_unistr_len(g->db,a,t1); l2=wg_decode_unistr_len(g->db,b,t2); if (11!=l2) return 0; s1=wg_decode_unistr_lang(g->db,a,t1); s2=wg_decode_unistr_lang(g->db,b,t2); if (s1!=s2 && (s1==NULL || s2==NULL || memcmp(s1,s2,ol))) return 0; s1=wg_decode_unistr(g->db,a,t1); s2=wg_decode_unistr(g->db,b,t2); if (s1!=s2 && (s1==NULL || s2==NULL || memcmp(s1,s2,l1))) return 0; return 1; } } return 0; } /* ------------------------------------------------------ variable handling ------------------------------------------------------ */ /** x must be a variable */ gint wr_varval(gint x, gptr vb) { gint y; // do the first test without loop y=vb[decode_var(x)]; if (y==UNASSIGNED) { return x; } else if (!isvar(y)) { return y; } else { // if variable is assigned to a variable, loop for (;;) { x=y; y=vb[decode_var(x)]; if (y==UNASSIGNED) { return x; } else if (!isvar(y)) { return y; } } } } /** x must be a variable */ void setvar(gint x, gint y, gptr vb, gptr vstk, gint* vc) { vb[decode_var(x)]=y; vstk[*vc]=(gint)(vb+decode_var(x)); // pointer arithmetic (gint ptr) (*vc)++; } /** clear single varstack varstack structure: 0: vector len 1: next free pos on stack (2 for empty stack) 2...N: pointer of the varbank cell corresponding to set var value */ void wr_clear_varstack(glb* g,vec vs) { gptr x; gptr maxpt; gint maxnr; x=vs; ++x; maxnr= *x; if (maxnr>2) { maxpt=vs+maxnr; for(++x; x2) { maxpt=vs+maxnr; for(x=vs+y; xvarbanks); len=NROF_VARBANKS*NROF_VARSINBANK; //len=x[0]; xmax=x+len; //for(x++; xvarbanks); wr_print_varstack(g,g->varstack); } void wr_print_varbank(glb* g,gptr vb){ int i; int start, end; gint cell; start=0; end=NROF_VARBANKS*NROF_VARSINBANK; printf("varbank %d:\n",(gint)vb); for(i=start;i. * */ /** @file unify.h * Unification functions. * */ #ifndef DEFINED_UNIFY_H #define DEFINED_UNIFY_H #include "glb.h" #define UNASSIGNED WG_ILLEGAL // 0xff in dbata.h #define VARVAL(x,vb) (wr_varval(x,vb)) #define VARVAL_F(x,vb) (tmp=vb[decode_var(x)], ((tmp==UNASSIGNED) ? x : (!isvar(tmp) ? tmp : wr_varval(tmp,vb)))) #define VARVAL_DIRECT(x,vb) (vb[decode_var(x)]) #define SETVAR(x,y,vb,vstk,vc) (vb[decode_var(x)]=y,vstk[*vc]=(gint)((gptr)vb+decode_var(x)),++(*vc)) // WR_EQUAL_TERM is a faster version of the wr_equal_term function, doing exactly the same thing // NB! gint eqenc unused variable must be present to call the following macro, as well as valued uniquestrflag #define WR_EQUAL_TERM(g,x,y,uniquestrflag) \ (((x)==(y)) ? 1 : \ (eqencx=(x)&NORMALPTRMASK,\ (((eqencx==LONGSTRBITS && uniquestrflag) || eqencx==SMALLINTBITS || eqencx==NORMALPTRMASK) ? 0 : \ ((!isptr(x) || !isptr(y)) ? 0 : \ ((((x)&NONPTRBITS)!=((y)&NONPTRBITS)) ? 
0 : wr_equal_term_macroaux((g),(x),(y),(uniquestrflag)) ))))) gint wr_unify_term(glb* g, gint x, gint y, int uniquestrflag); gint wr_unify_term_aux(glb* g, gint x, gint y, int uniquestrflag); gint wr_match_term_aux(glb* g, gint x, gint y, int uniquestrflag); gint wr_match_term(glb* g, gint x, gint y, int uniquestrflag); gint wr_equal_term(glb* g, gint x, gint y, int uniquestrflag); gint wr_equal_term_macroaux(glb* g, gint x, gint y, int uniquestrflag); int wr_equal_ptr_primitives(glb* g, gint a, gint b, int uniquestrflag); gint wr_varval(gint x, gptr vb); void wr_setvar(gint x, gint y, gptr vb, gptr vstk, gint* vc); void wr_clear_varstack(glb* g,vec vs); void wr_clear_varstack_topslice(glb* g, vec vs, int y); void wr_clear_all_varbanks(glb* g); void wr_print_vardata(glb* g); void wr_print_varbank(glb* g, gptr vb); void wr_print_varstack(glb* g, gptr vs); int wr_varbanks_are_clear(glb* g, gptr vb); #endif whitedb-0.7.2/Rexamples/000077500000000000000000000000001226454622500151165ustar00rootroot00000000000000whitedb-0.7.2/Rexamples/luka.txt000066400000000000000000000005611226454622500166150ustar00rootroot00000000000000 %list(usable). -P(i(?x,?y)) | -P(?x) | P(?y). %end_of_list. %list(sos). P(i(i(?x,?y),i(i(?y,?z),i(?x,?z)))). % CN1 P(i(n(?x),i(?x,?x))). % CN2 P(i(?x,i(n(?x),?y))). % CN3 %end_of_list. %list(passive). %-P(i(a,a)). % | $Ans(CN_16). %-P(i(b,i(a,b))). % | $Ans(CN_18). -P(i(i(i(a,b),c),i(b,c))).% | $Ans(CN_19). %end_of_list. whitedb-0.7.2/Rexamples/luka2.txt000066400000000000000000000034341226454622500167010ustar00rootroot00000000000000 -p(i(?X,?Y)) | -p(?X) | p(?Y). p(i(i(?X,?Y),i(i(?Y,?Z),i(?X,?Z)))). p(i(i(n(?X),?X),?X)). p(i(?X,i(n(?X),?Y))). %-p(i(a,a)). -p(i(a,i(b,a))). /* -blocker(a). eats(a_caterpillar,caterpillar_food_of(a_caterpillar)) | blocker(a). %important(?Small_animal) | -eats(?Small_animal,?Other_plant). %-important(?Small_animal) | -eats(?Small_animal,?Other_plant). -eats(?Grain_eater,?Foo) | -animal(?Animal)| -animal(?Grain_eater)| -grain(?Grain)| -eats(?Animal,?Grain_eater)| -eats(?Grain_eater,?Grain). */ /* animal(?X)| -wolf(?X). animal(?X)| -fox(?X). animal(?X)| -bird(?X). animal(?X)| -caterpillar(?X). animal(?X)| -snail(?X). wolf(a_wolf). fox(a_fox). bird(a_bird). caterpillar(a_caterpillar). snail(a_snail). grain(a_grain). plant(?X)| -grain(?X). eats(?Animal,?Plant)|eats(?Animal,?Small_animal)| -animal(?Animal)| -plant(?Plant)| -animal(?Small_animal)| -plant(?Other_plant)| -much_smaller(?Small_animal,?Animal)| -eats(?Small_animal,?Other_plant). much_smaller(?Catapillar,?Bird)| -caterpillar(?Catapillar)| -bird(?Bird). much_smaller(?Snail,?Bird)| -snail(?Snail)| -bird(?Bird). much_smaller(?Bird,?Fox)| -bird(?Bird)| -fox(?Fox). much_smaller(?Fox,?Wolf)| -fox(?Fox)| -wolf(?Wolf). -wolf(?Wolf)| -fox(?Fox)| -eats(?Wolf,?Fox). -wolf(?Wolf)| -grain(?Grain)| -eats(?Wolf,?Grain). eats(?Bird,?Catapillar)| -bird(?Bird)| -caterpillar(?Catapillar). -bird(?Bird)| -snail(?Snail)| -eats(?Bird,?Snail). plant(caterpillar_food_of(?Catapillar))| -caterpillar(?Catapillar). eats(?Catapillar,caterpillar_food_of(?Catapillar))| -caterpillar(?Catapillar). plant(snail_food_of(?Snail))| -snail(?Snail). eats(?Snail,snail_food_of(?Snail))| -snail(?Snail). -animal(?Animal)| -animal(?Grain_eater)| -grain(?Grain)| -eats(?Animal,?Grain_eater)| -eats(?Grain_eater,?Grain). */ whitedb-0.7.2/Rexamples/otter.txt000066400000000000000000000057631226454622500170270ustar00rootroot00000000000000/* animal(?X)| -wolf(?X). animal(?X)| -fox(?X). animal(?X)| -bird(?X). animal(?X)| -caterpillar(?X). 
animal(?X)| -snail(?X). wolf(a_wolf). fox(a_fox). bird(a_bird). caterpillar(a_caterpillar). snail(a_snail). grain(a_grain). plant(?X)| -grain(?X). eats(?Animal,?Plant)|eats(?Animal,?Small_animal)| -animal(?Animal)| -plant(?Plant)| -animal(?Small_animal)| -plant(?Other_plant)| -much_smaller(?Small_animal,?Animal)| -eats(?Small_animal,?Other_plant). much_smaller(?Catapillar,?Bird)| -caterpillar(?Catapillar)| -bird(?Bird). much_smaller(?Snail,?Bird)| -snail(?Snail)| -bird(?Bird). much_smaller(?Bird,?Fox)| -bird(?Bird)| -fox(?Fox). much_smaller(?Fox,?Wolf)| -fox(?Fox)| -wolf(?Wolf). -wolf(?Wolf)| -fox(?Fox)| -eats(?Wolf,?Fox). -wolf(?Wolf)| -grain(?Grain)| -eats(?Wolf,?Grain). eats(?Bird,?Catapillar)| -bird(?Bird)| -caterpillar(?Catapillar). -bird(?Bird)| -snail(?Snail)| -eats(?Bird,?Snail). plant(caterpillar_food_of(?Catapillar))| -caterpillar(?Catapillar). eats(?Catapillar,caterpillar_food_of(?Catapillar))| -caterpillar(?Catapillar). plant(snail_food_of(?Snail))| -snail(?Snail). eats(?Snail,snail_food_of(?Snail))| -snail(?Snail). -animal(?Animal)| -animal(?Grain_eater)| -grain(?Grain)| -eats(?Animal,?Grain_eater)| -eats(?Grain_eater,?Grain). */ -m(f(?X)) | -p(f(?X)) | -r(f(?X)) | s(?X). p(f(a)). r(f(a)). m(f(a)). -t(f(?X)) | -s(?X). t(f(a)). /* -p(i(?X,?Y)) | -p(?X) | p(?Y). p(i(i(?X,?Y),i(i(?Y,?Z),i(?X,?Z)))). p(i(i(n(?X),?X),?X)). p(i(?X,i(n(?X),?Y))). -p(i(a,a)). %-p(i(a,i(b,a))). %-p(i(?2000,i(?2001,?2002))) | p(i(i(i(i(?2002,?2003),i(?2001,?2003)),?2004),i(?2000,?2004))). % p(i(i(?1000,?1001),i(i(n(i(n(?1000),?1000)),i(n(?1000),?1000)),?1001))). %p(i(?X,i(n(?X),?Y))). %-p(i(a,i(b,a))) */ /* animal(?X)| -wolf(?X). animal(?X)| -fox(?X). animal(?X)| -bird(?X). animal(?X)| -caterpillar(?X). animal(?X)| -snail(?X). wolf(a_wolf). fox(a_fox). bird(a_bird). caterpillar(a_caterpillar). snail(a_snail). grain(a_grain). plant(?X)| -grain(?X). eats(?Animal,?Plant)|eats(?Animal,?Small_animal)| -animal(?Animal)| -plant(?Plant)| -animal(?Small_animal)| -plant(?Other_plant)| -much_smaller(?Small_animal,?Animal)| -eats(?Small_animal,?Other_plant). much_smaller(?Catapillar,?Bird)| -caterpillar(?Catapillar)| -bird(?Bird). much_smaller(?Snail,?Bird)| -snail(?Snail)| -bird(?Bird). much_smaller(?Bird,?Fox)| -bird(?Bird)| -fox(?Fox). much_smaller(?Fox,?Wolf)| -fox(?Fox)| -wolf(?Wolf). -wolf(?Wolf)| -fox(?Fox)| -eats(?Wolf,?Fox). -wolf(?Wolf)| -grain(?Grain)| -eats(?Wolf,?Grain). eats(?Bird,?Catapillar)| -bird(?Bird)| -caterpillar(?Catapillar). -bird(?Bird)| -snail(?Snail)| -eats(?Bird,?Snail). plant(caterpillar_food_of(?Catapillar))| -caterpillar(?Catapillar). eats(?Catapillar,caterpillar_food_of(?Catapillar))| -caterpillar(?Catapillar). plant(snail_food_of(?Snail))| -snail(?Snail). eats(?Snail,snail_food_of(?Snail))| -snail(?Snail). -animal(?Animal)| -animal(?Grain_eater)| -grain(?Grain)| -eats(?Animal,?Grain_eater)| -eats(?Grain_eater,?Grain). */ whitedb-0.7.2/Rexamples/otter1.txt000066400000000000000000000000441226454622500170730ustar00rootroot00000000000000 p(1). - p(?X) | r(?X). -r(1). whitedb-0.7.2/Rexamples/otter2.txt000066400000000000000000000000571226454622500171000ustar00rootroot00000000000000- p(1,?X). m(a,b) | s(t). kk(f(aa),b12). whitedb-0.7.2/Rexamples/otter4.txt000066400000000000000000000001051226454622500170740ustar00rootroot00000000000000# p(1,1). # p(?X,?X). # p(2,f(2)). -p(?X,?Y) | p(3,3). #-r(1). whitedb-0.7.2/Rexamples/otter5.txt000066400000000000000000000000771226454622500171050ustar00rootroot00000000000000 -p(?X) | -p(f(?X)). p(f(?X)) | p(?X). -p(?X) | p(f(?X)). 
whitedb-0.7.2/Rexamples/otter7.txt000066400000000000000000000000751226454622500171050ustar00rootroot00000000000000 p(2). r(2.0). -p(?X) | -r(?Y) | m(+(?X,?Y)). -m(4.0). whitedb-0.7.2/Rexamples/otter_parse_test.txt000066400000000000000000000002451226454622500212460ustar00rootroot00000000000000 p(X,f(b)) | -r(X) | -m(X). p(a,f(b)). -s(a,2,"2"). % ((("p" "X" ("f" "b")) ("not" ("r" "X")) ("not" ("m" "X"))) (("p" "a" ("f" "b"))) (("not" ("s" "a" 2 "2"))) ) whitedb-0.7.2/Rexamples/p1.otter000066400000000000000000000000411226454622500165100ustar00rootroot00000000000000 r(a). m(b). %-r(a) | -m(b). whitedb-0.7.2/Rexamples/rrun000077500000000000000000000001231226454622500160260ustar00rootroot00000000000000#!/bin/sh ../Main/wgdb free ../Main/wgdb importotter $1 ../Main/wgdb runreasoner whitedb-0.7.2/Rexamples/rules.txt000066400000000000000000000005631226454622500170150ustar00rootroot00000000000000%aasas k(a). r(X) :- k(X). -r(a). %m(X,Y) :- m(Y,X). %m(Y,X) :- r(X,Y). r(X,Y) :- b(X,Y,a), b(Y,X,c). % #command(me,keepaway) :- #attachedTo(X,fragile). % #attachedTo(X,fragile) :- #attachedTo(X,glass). % ---------- %playsound(me,"0") :- velocity(me,X). %#command(me,clean) :- "found-tag"(me,tag1). %#command(me,clean) :- #attachedTo(X,fragile). % --------------- whitedb-0.7.2/Rexamples/steam.txt000066400000000000000000000023741226454622500167760ustar00rootroot00000000000000 animal(?X)| -wolf(?X). animal(?X)| -fox(?X). animal(?X)| -bird(?X). animal(?X)| -caterpillar(?X). animal(?X)| -snail(?X). wolf(a_wolf). fox(a_fox). bird(a_bird). caterpillar(a_caterpillar). snail(a_snail). grain(a_grain). plant(?X)| -grain(?X). eats(?Animal,?Plant)|eats(?Animal,?Small_animal)| -animal(?Animal)| -plant(?Plant)| -animal(?Small_animal)| -plant(?Other_plant)| -much_smaller(?Small_animal,?Animal)| -eats(?Small_animal,?Other_plant). much_smaller(?Catapillar,?Bird)| -caterpillar(?Catapillar)| -bird(?Bird). much_smaller(?Snail,?Bird)| -snail(?Snail)| -bird(?Bird). much_smaller(?Bird,?Fox)| -bird(?Bird)| -fox(?Fox). much_smaller(?Fox,?Wolf)| -fox(?Fox)| -wolf(?Wolf). -wolf(?Wolf)| -fox(?Fox)| -eats(?Wolf,?Fox). -wolf(?Wolf)| -grain(?Grain)| -eats(?Wolf,?Grain). eats(?Bird,?Catapillar)| -bird(?Bird)| -caterpillar(?Catapillar). -bird(?Bird)| -snail(?Snail)| -eats(?Bird,?Snail). plant(caterpillar_food_of(?Catapillar))| -caterpillar(?Catapillar). eats(?Catapillar,caterpillar_food_of(?Catapillar))| -caterpillar(?Catapillar). plant(snail_food_of(?Snail))| -snail(?Snail). eats(?Snail,snail_food_of(?Snail))| -snail(?Snail). -animal(?Animal)| -animal(?Grain_eater)| -grain(?Grain)| -eats(?Animal,?Grain_eater)| -eats(?Grain_eater,?Grain). whitedb-0.7.2/Rexamples/viga.txt000066400000000000000000000000221226454622500165770ustar00rootroot00000000000000 m(a). p((( k. whitedb-0.7.2/Rexamples/wrong_notprovable.txt000066400000000000000000000114241226454622500214300ustar00rootroot00000000000000%-------------------------------------------------------------------------- % File : Shortened file, so there is no header %-------------------------------------------------------------------------- % clause_1, axiom. equal(?X1, ?X1). % clause_147, axiom. -p43(?X1, ?X2) | p17(?X1, ?X2). % clause_177, axiom. p5(?X1, ?X2) | -p63(?X1, ?X2). % clause_155, axiom. -p25(?X1, ?X2) | p53(?X1, ?X2). % clause_207, axiom. p37(?X1, f68(?X2, ?X1), ?X2) | -p60(?X1, ?X2). % clause_157, axiom. p18(?X1, ?X2) | -p64(?X1, ?X2). % clause_162, axiom. -p19(?X1, ?X2) | p61(?X1, ?X2). % clause_195, axiom. p31(?X1, ?X2) | -p23(?X1, ?X2). % clause_193, axiom. 
-p22(?X1, ?X2) | p52(?X1, ?X2). % clause_191, axiom. p31(?X1, ?X2) | -p59(?X1, ?X2). % clause_185, axiom. -p26(?X1, ?X2) | p48(?X1, ?X2). % clause_148, axiom. -p17(?X1, ?X2) | p58(?X1, ?X2). % clause_165, axiom. p46(?X1, ?X2) | -p28(?X1, ?X2). % clause_194, axiom. p23(?X1, ?X2) | -p52(?X1, ?X2). % clause_192, axiom. p5(?X1, ?X2) | -p31(?X1, ?X2). % clause_212, axiom. equal(?X6, ?X7) | -equal(f69(?X6, ?X1, ?X2, ?X3), ?X6) | p60(?X4, ?X5) | -p37(?X4, ?X7, ?X5) | -p37(?X4, ?X6, ?X5). % clause_164, axiom. -p36(?X1, ?X2) | p28(?X1, ?X2). % clause_144, axiom. -p14(?X1, ?X2) | p13(?X1, ?X2). % clause_174, axiom. -p56(?X1, ?X2) | p18(?X1, ?X2). % clause_184, axiom. -p1(?X1, ?X2) | p61(?X1, ?X2). % clause_152, axiom. -p43(?X1, ?X2) | p41(?X1, ?X2). % clause_153, axiom. p29(?X1, ?X2) | -p43(?X1, ?X2). % clause_150, axiom. -p17(?X1, ?X2) | p55(?X1, ?X2). % clause_198, axiom. -p61(?X1, ?X2) | -p35(?X1, ?X2). % clause_180, axiom. p1(?X1, ?X2) | -p50(?X1, ?X2). % clause_143, axiom. -p37(?X1, ?X2, ?X2). % clause_167, axiom. p29(?X1, ?X2) | -p46(?X1, ?X2). % clause_160, axiom. p55(?X1, ?X2) | -p19(?X1, ?X2). % clause_159, axiom. -p19(?X1, ?X2) | p58(?X1, ?X2). % clause_214, axiom. -p48(?X1, ?X4) | equal(?X3, ?X4) | -p44(?X1, ?X4, ?X2) | -p48(?X1, ?X3) | -p17(?X1, ?X2) | -p44(?X1, ?X3, ?X2). % clause_169, axiom. -p28(?X1, ?X2) | p27(?X1, ?X2). % clause_190, axiom. p59(?X1, ?X2) | -p62(?X1, ?X2). % clause_209, axiom. -p3(?X1, ?X2, ?X3) | -p42(?X1, ?X2) | -p47(?X1, ?X2, ?X3). % clause_146, axiom. -p5(?X1, ?X2) | p43(?X1, ?X2). % clause_197, axiom. -p65(?X1, ?X2) | -p8(?X1, ?X2). % clause_204, axiom. -p4(?X1, ?X2) | -p41(?X1, ?X2). % clause_186, axiom. p33(?X1, ?X2) | -p12(?X1, ?X2). % clause_196, axiom. -p66(?X1, ?X2) | -p45(?X1, ?X2). % clause_203, axiom. -p20(?X1, ?X2) | -p39(?X1, ?X2). % clause_161, axiom. p39(?X1, ?X2) | -p19(?X1, ?X2). % clause_149, axiom. p54(?X1, ?X2) | -p58(?X1, ?X2). % clause_183, axiom. -p1(?X1, ?X2) | p24(?X1, ?X2). % clause_145, axiom. p5(?X1, ?X2) | -p13(?X1, ?X2). % clause_151, axiom. -p17(?X1, ?X2) | p20(?X1, ?X2). % clause_166, axiom. p17(?X1, ?X2) | -p46(?X1, ?X2). % clause_168, axiom. p32(?X1, ?X2) | -p46(?X1, ?X2). % clause_181, axiom. -p1(?X1, ?X2) | p58(?X1, ?X2). % clause_201, axiom. -p32(?X1, ?X2) | -p41(?X1, ?X2). % clause_241, conjecture. -p37(c70, ?X1, c78) | p21(c70, ?X1). % clause_215, conjecture. p2(c70). % clause_219, conjecture. p34(c70, c72). % clause_247, conjecture. -p37(c70, ?X1, c76) | p42(c70, f81(?X3, ?X4)) | -p37(c70, ?X2, c77). % clause_216, conjecture. p12(c70, c71). % clause_217, conjecture. p22(c70, c71). % clause_227, conjecture. p65(c70, c75). % clause_231, conjecture. p60(c70, c78). % clause_230, conjecture. p25(c70, c77). % clause_248, conjecture. p64(c70, f81(?X3, ?X4)) | -p37(c70, ?X2, c77) | -p37(c70, ?X1, c76). % clause_249, conjecture. p3(c70, f81(?X3, ?X2), ?X3) | -p37(c70, ?X1, c77) | -p37(c70, ?X3, c76). % clause_235, conjecture. p44(c70, c73, c71). % clause_224, conjecture. p18(c70, c74). % clause_234, conjecture. p30(c70, c74, c71). % clause_244, conjecture. p7(c70, f79(?X1), ?X1, f80(?X1)) | -p37(c70, ?X1, c78). % clause_233, conjecture. p3(c70, c74, c75). % clause_250, conjecture. -p37(c70, ?X2, c77) | -p37(c70, ?X1, c76) | p47(c70, f81(?X1, ?X2), ?X2). % clause_226, conjecture. p15(c70, c75). % clause_238, conjecture. p8(c70, ?X1) | -p37(c70, ?X1, c77). % clause_223, conjecture. p49(c70, c74). % clause_229, conjecture. p25(c70, c76). % clause_242, conjecture. p56(c70, f79(?X2)) | -p37(c70, ?X1, c78). % clause_221, conjecture. 
p48(c70, c73). % clause_222, conjecture. p6(c70, c74). % clause_228, conjecture. p11(c70, c75). % clause_237, conjecture. -p37(c70, ?X1, c77) | p10(c70, ?X1). %-------------------------------------------------------------------------- whitedb-0.7.2/Server/000077500000000000000000000000001226454622500144245ustar00rootroot00000000000000whitedb-0.7.2/Server/Makefile000066400000000000000000000046121226454622500160670ustar00rootroot00000000000000# Makefile for building dserve, dservehttps and nsmeasure # # dserve: database http tool as a standalone http server or cgi # dservehttps: database https tool as a standalone https server or cgi # nsmeasure: utility for speed testing the server over http # # This makefile assumes you have built and installed the whitedb library. # # Alternatively compile directly against the whitedb library as: # # gcc dserve.c dserve_util.c dserve_net.c -o dserve -O2 -lwgdb -lpthread # gcc -DUSE_OPENSSL dserve.c dserve_util.c dserve_net.c -o dservehttps -O2 -lwgdb -lpthread -lssl -lcrypto # gcc nsmeasure.c -o nsmeasure -O2 -lpthread # # or use compile.sh directly against the whitedb source without building a library first. # # dserve can be also compiled to work as a cgi or command line tool only # without using pthreads by: # - removing #define SERVEROPTION from dserve.h # - compiling by gcc dserve.c dserve_util.c -o dserve -O2 -lwgdb # # Compiling under windows: # copy the files dbapi.h and wgdb.lib into the same folder where you compile, then build # server version: # cl /Ox /I"." Server\dserve.c Server\dserve_util.c Server\dserve_net.c wgdb.lib # cl /Ox /I"." Server\dserve.c Server\dserve_util.c wgdb.lib # or a non-server version CFLAGS = -O2 -c CFLAGSHTTPS = -O2 -DUSE_OPENSSL -Wall -c CC = gcc all: dserve dservehttps nsmeasure dserve: dserve.o dserve_util.o dserve_net.o yajl_all.o $(CC) dserve.o dserve_net.o dserve_util.o yajl_all.o -o dserve -lpthread -lwgdb dservehttps: dservehttps.o dserve_utilhttps.o dserve_nethttps.o yajl_all.o $(CC) dservehttps.o dserve_nethttps.o dserve_utilhttps.o yajl_all.o -o dservehttps -lpthread -lwgdb -lssl -lcrypto nsmeasure: nsmeasure.o $(CC) nsmeasure.o -o nsmeasure -lpthread dserve.o: dserve.h dserve.c $(CC) $(CFLAGS) dserve.c dservehttps.o: dserve.h dserve.c $(CC) $(CFLAGSHTTPS) dserve.c -o dservehttps.o dserve_net.o: dserve.h dserve_net.c $(CC) $(CFLAGS) dserve_net.c dserve_nethttps.o: dserve.h dserve_net.c $(CC) $(CFLAGSHTTPS) dserve_net.c -o dserve_nethttps.o dserve_util.o: dserve.h dserve_util.c $(CC) $(CFLAGS) dserve_util.c dserve_utilhttps.o: dserve.h dserve_util.c $(CC) $(CFLAGSHTTPS) dserve_util.c -o dserve_utilhttps.o yajl_all.o: ../json/yajl_all.h ../json/yajl_all.c $(CC) $(CFLAGS) ../json/yajl_all.c -o yajl_all.o nsmeasure.o: nsmeasure.c $(CC) $(CFLAGS) nsmeasure.c clean: nsmeasure.c rm *.o *dserve dservehttps nsmeasure *.gch whitedb-0.7.2/Server/Manifest000066400000000000000000000011121226454622500161100ustar00rootroot00000000000000Manifest: - dserve.h: header for dserve.c, dserve_net.c, dserve_util.c - dserve.c: main file for dserve/dservehttps - dserve_net.c: networking, only needed for running as a server - dserve_util.c: printing, error handling and text utilities - nsmeasure.c: server speed measurement tool - Makefile: compile dserve, dservehttps, nsmeasure against compiled whitedb - compile.sh: compile dserve, dservehttps, nsmeasure without compiling whitedb - compile.bat: compile dserve, nsmeasure for Windows - conf_example.txt: example for an optional configuration file - 
examplewhitedb-0.7.2/Server/README000066400000000000000000000050701226454622500153060ustar00rootroot00000000000000dserve README ============== dserve is a tool for performing REST queries from WhiteDB using a cgi protocol over http. Results are given in the json or csv format. dservehttps is a version of dserve using https instead of http. nsmeasure is a tool for measuring server speed. For details see http://whitedb.org/server/ Run dserver in one of three ways: * a cgi program under a web server, connecting like http://myhost.com/dserve?op=search&from=0&count=5 * as a standalone http(s) server, passing a port number as a single argument, like dserve 8080 and connecting like http://localhost:8080/dserve?op=search&from=0&count=5 or, for dservehttps compiled with USE_OPENSSL dservehttps 8081 conf.txt https://localhost:8081/dserve?op=search&from=0&count=5 * from the command line, passing a cgi-format, urlencoded query string as a single argument, like dserve 'op=search&from=0&count=5' Use the provided Makefile or compile.sh or compile.bat for compiling dserve. Use and modify the code for creating your own data servers for WhiteDB. Copyright (c) 2013, Tanel Tammet This software is under MIT licence: -------- Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ------ NB! Observe that the dserve source is under a more permissive licence than the WhiteDB library: the latter is by default under GPLv3. Thus the linked dserve is under GPLv3 unless a free commercial licence is used (see whitedb.org for details). It is OK to use the MIT licence when using this code or parts of it in other projects without linking to the whitedb library. whitedb-0.7.2/Server/compile.bat000077500000000000000000000024241226454622500165510ustar00rootroot00000000000000@rem current version does not build reasoner: added later cl /Ox /W3 Main\wgdb.c Db\dbmem.c Db\dballoc.c Db\dbdata.c Db\dblock.c Db\dbtest.c DB\dbdump.c Db\dblog.c Db\dbhash.c Db\dbindex.c Db\dbcompare.c Db\dbquery.c Db\dbutil.c Db\dbmpool.c Db\dbjson.c Db\dbschema.c json\yajl_all.c cl /Ox /W3 Main\indextool.c Db\dbmem.c Db\dballoc.c Db\dbdata.c Db\dblock.c Db\dbtest.c Db\dblog.c Db\dbhash.c Db\dbindex.c Db\dbcompare.c Db\dbquery.c Db\dbutil.c Db\dbmpool.c Db\dbjson.c Db\dbschema.c json\yajl_all.c cl /Ox /W3 Main\stresstest.c Db\dbmem.c Db\dballoc.c Db\dbdata.c Db\dblock.c Db\dblog.c Db\dbhash.c Db\dbindex.c Db\dbcompare.c Db\dbmpool.c @rem build DLL. Currently we are not using it to link executables however. @rem unlike gcc build, it is necessary to have all functions declared in @rem wgdb.def file. 
Make sure it's up to date (should list same functions as @rem Db/dbapi.h) cl /Ox /W3 /MT /Fewgdb /LD Db\dbmem.c Db\dballoc.c Db\dbdata.c Db\dblock.c Db\dbtest.c DB\dbdump.c Db\dblog.c Db\dbhash.c Db\dbindex.c Db\dbcompare.c Db\dbquery.c Db\dbutil.c Db\dbmpool.c Db\dbjson.c Db\dbschema.c json\yajl_all.c /link /def:wgdb.def /incremental:no /MANIFEST:NO @rem Example of linking against wgdb.dll @rem cl /Ox /W3 Main\stresstest.c wgdb.lib cl /Ox /W3 Main\wgdb.c wgdb.lib whitedb-0.7.2/Server/compile.sh000077500000000000000000000021101226454622500164050ustar00rootroot00000000000000#/bin/sh # alternative to compiling dserve and dservehttps with automake/make: # just run it in the Server folder # copy config.h to the current folder [ -f config.h ] || cp ../config-gcc.h config.h if [ ../config-gcc.h -nt ../config.h ]; then echo "Warning: config.h is older than config-gcc.h, consider updating it" fi # compile dserve gcc -O2 -Wall -o dserve dserve.c dserve_util.c dserve_net.c \ ../Db/dbmem.c ../Db/dballoc.c ../Db/dbdata.c \ ../Db/dblock.c ../Db/dbindex.c ../Db/dbtest.c ../Db/dbdump.c \ ../Db/dblog.c ../Db/dbhash.c ../Db/dbcompare.c ../Db/dbquery.c ../Db/dbutil.c ../Db/dbmpool.c \ ../Db/dbjson.c ../Db/dbschema.c ../json/yajl_all.c \ -lm -lpthread # compile dservehttps gcc -O2 -Wall -DUSE_OPENSSL -o dservehttps dserve.c dserve_util.c dserve_net.c \ ../Db/dbmem.c ../Db/dballoc.c ../Db/dbdata.c \ ../Db/dblock.c ../Db/dbindex.c ../Db/dbtest.c ../Db/dbdump.c \ ../Db/dblog.c ../Db/dbhash.c ../Db/dbcompare.c ../Db/dbquery.c ../Db/dbutil.c ../Db/dbmpool.c \ ../Db/dbjson.c ../Db/dbschema.c ../json/yajl_all.c \ -lm -lpthread -lssl -lcrypto whitedb-0.7.2/Server/conf_example.txt000066400000000000000000000061471226454622500176350ustar00rootroot00000000000000# Optional configuration file for dserve: use as # dserve # # The configuration file is required for dservehttps, # since key_file and cert_file must be given. # # Set multiple values by writing them to lines following # a def, one item per line, with a leading whitespace, like this: # admin_tokens=token1 # token2 # token3 # # all ops creating and deleting databases are considered admin ops. # all ops adding, updating or deleting data are considered write ops. # ops not changing any databases are considered read ops. # ------------- # Defaults and limits for creating new databases: all numeric. # default_dbase: a dbase name in case none given in request: # overrides the DEFAULT_DATABASE=1000 macro in dserve.h # default_dbase_size: default size if none given in request: # overrides the DEFAULT_DATABASE_SIZE=10000000 macro in dserve.h # set this to 0 to inhibit automatic database creation upon insert # max_dbase_size: limit for new database size: # overrides the MAX_DATABASE_SIZE=10000000000 macro in dserve.h # set this to 0 to inhibit any database creation #default_dbase=1000 #default_dbase_size=1000000 #max_dbase_size=100000000 # ------------- # Limit access to these databases only: no limit by default #dbases=1000 # 1001 # 1002 # ------------- # Limit access from these IP addresses only: no limit by default. # Use 127.0.0.1 for localhost. # You can give a first part of the IP address like 127.0: this # will match all IP addresses with this prefix. # admin ips get also write and read permissions # write ips get also read permissions. # If you set a read or write ips, be sure to set # stronger ip as well (possibly to 127.0.0.1), # otherwise the requirement is not enforced. 
# If any of these is not defined, there are no ip limits # for this and weaker kinds of operations. # Multiple values are accepted. #admin_ips=127.0.0.1 #write_ips=127.0.0.1 #read_ips=127.0.0.1 # ------------- # Limit access by secret tokens: no limit by default. # If tokens set, token=sometoken parameter required # with sometoken being in a set of tokens given here. # admin tokens give also write and read permissions # write tokens give also read permissions. # If you set a read or write token, be sure to set # stronger tokens as well (possibly to values not told # to anyone), otherwise the requirement is not enforced. # If any of these is not defined, there are no limits # for this and weaker kinds of operations. # Multiple values are accepted. #admin_tokens=secret1 #write_tokens=secret1 #read_tokens=secret1 # ------------- # path to private key file for https: needed for dservehttps only # example created by: # openssl req -x509 -nodes -days 365 -newkey rsa:2048 # -keyout exampleprivatekey.key -out examplecertificate.crt # or from this by: openssl rsa -in exampleprivatekey.key -out exampleprivatekey.pem #key_file=/home/tanel/whitedb/Server/exampleprivatekey.key # ------------- # path to certificate key file for https: needed for dservehttps only # example file created by the previous key/cert generation command #cert_file=/home/tanel/whitedb/Server/examplecertificate.crt whitedb-0.7.2/Server/dserve.c000066400000000000000000001265441226454622500160740ustar00rootroot00000000000000/* dserve.c contains the main functionality of dserve: dserve is a tool for performing REST queries from WhiteDB using a cgi protocol over http(s). Results are given in the json or csv format. Run dserver in one of three ways: * a cgi program under a web server, connecting like http://myhost.com/dserve?op=search&from=0&count=5 * as a standalone http(s) server, passing a port number as a single argument, like dserve 8080 and connecting like http://localhost:8080/dserve?op=search&from=0&count=5 or, for dservehttps compiled with USE_OPENSSL dservehttps 8081 conf.txt https://localhost:8081/dserve?op=search&from=0&count=5 * from the command line, passing a cgi-format, urlencoded query string as a single argument, like dserve 'op=search&from=0&count=5' Use the provided Makefile or compile.bat for ompiling dserve or compile directly as: gcc dserve.c dserve_util.c dserve_net.c -o dserve -O2 -lwgdb -lpthread gcc -DUSE_OPENSSL dserve.c dserve_util.c dserve_net.c -o dservehttps -O2 -lwgdb -lpthread -lssl -lcrypto dserve can be also compiled to work as a cgi or command line tool only without using pthreads by: - removing #define SERVEROPTION from dserve.h - compiling by gcc dserve.c dserve_util.c -o dserve -O2 -lwgdb Compiling under windows: copy the files dbapi.h and wgdb.lib into the same folder where you compile, then build the server version: cl /Ox /I"." dserve.c dserve_util.c dserve_net.c wgdb.lib or a non-server version cl /Ox /I"." dserve.c dserve_util.c wgdb.lib Use and modify the code for creating your own data servers for WhiteDB. See http://whitedb.org/tools.html for a detailed manual. 
Copyright (c) 2013, Tanel Tammet This software is under MIT licence: -------- Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ------ NB! Observe that the current file is under a different licence than the WhiteDB library: the latter is by default under GPLv3. Thus the linked dserve is under GPLv3 unless a free commercial licence is used (see whitedb.org for details). It is OK to use the MIT licence when using this code or parts of it in other projects without linking to the whitedb library. */ #include "dserve.h" #include #include #include #include // for alarm and termination signal handling #include // LONG_MAX #if _MSC_VER // no alarm on windows #else #include // for alarm #endif //#include "../json/yajl_all.h" /* =============== local protos =================== */ static char* get_cgi_query(thread_data_p tdata, char* inmethod); static void setup_globals(void); static char* search(thread_data_p tdata, char* inparams[], char* invalues[], int count, int opcode); static char* insert(thread_data_p tdata, char* inparams[], char* invalues[], int incount); static char* create(thread_data_p tdata, char* inparams[], char* invalues[], int incount); static char* drop(thread_data_p tdata, char* inparams[], char* invalues[], int incount); static int op_print_record(thread_data_p tdata,void* rec,int gcount); static int op_delete_record(thread_data_p tdata,void* rec); static char* handle_generic_param(thread_data_p tdata,char* key,char* value, char** token,char* errbuf); static char* handle_fld_param(thread_data_p tdata,char* key,char* value, char** sfields, char** svalues, char** stypes, int sfcount, char* errbuf); static int op_print_data_start(thread_data_p tdata, int listflag); static int op_print_data_end(thread_data_p tdata, int listflag); static int op_update_record(thread_data_p tdata,void* db, void* rec, wg_int fld, wg_int value); static void* op_create_database(thread_data_p tdata,char* database,long size); /* =============== globals =================== */ // globalptr points to a struct containing a pointer to conf, // the nr of threads and // an array of separate data blocks for each thread: global ptr // guarantees access from the signal-called termination_handler. // Except in main and termination handlers this is not used directly: // access through thread_data->global instead. 
dserve_global_p globalptr; /* yajl */ /* static int reformat_null(void * ctx) { yajl_gen g = (yajl_gen) ctx; return yajl_gen_status_ok == yajl_gen_null(g); } static int reformat_boolean(void * ctx, int boolean) { yajl_gen g = (yajl_gen) ctx; return yajl_gen_status_ok == yajl_gen_bool(g, boolean); } static int reformat_number(void * ctx, const char * s, size_t l) { yajl_gen g = (yajl_gen) ctx; return yajl_gen_status_ok == yajl_gen_number(g, s, l); } static int reformat_string(void * ctx, const unsigned char * stringVal, size_t stringLen) { yajl_gen g = (yajl_gen) ctx; return yajl_gen_status_ok == yajl_gen_string(g, stringVal, stringLen); } static int reformat_map_key(void * ctx, const unsigned char * stringVal, size_t stringLen) { yajl_gen g = (yajl_gen) ctx; return yajl_gen_status_ok == yajl_gen_string(g, stringVal, stringLen); } static int reformat_start_map(void * ctx) { yajl_gen g = (yajl_gen) ctx; return yajl_gen_status_ok == yajl_gen_map_open(g); } static int reformat_end_map(void * ctx) { yajl_gen g = (yajl_gen) ctx; return yajl_gen_status_ok == yajl_gen_map_close(g); } static int reformat_start_array(void * ctx) { yajl_gen g = (yajl_gen) ctx; //printf("array start depth %d\n",g.depth); return yajl_gen_status_ok == yajl_gen_array_open(g); } static int reformat_end_array(void * ctx) { yajl_gen g = (yajl_gen) ctx; //printf("array end depth %d\n",g.depth); return yajl_gen_status_ok == yajl_gen_array_close(g); } static yajl_callbacks callbacks = { reformat_null, reformat_boolean, NULL, NULL, reformat_number, reformat_string, reformat_start_map, reformat_map_key, reformat_end_map, reformat_start_array, reformat_end_array }; */ /* =============== main =================== */ int main(int argc, char **argv) { char *inquery=NULL, *inip=NULL; char *inmethod=NULL, *inpath=NULL; int port=0, cgi=0; thread_data_p tdata; setup_globals(); // set globalptr and its components // Set up abnormal termination handler to clear locks #ifdef CATCH_SIGNALS signal(SIGSEGV,termination_handler); signal(SIGFPE,termination_handler); signal(SIGABRT,termination_handler); signal(SIGTERM,termination_handler); signal(SIGINT,termination_handler); signal(SIGILL,termination_handler); #endif #if _MSC_VER // some signals not used in windows #else signal(SIGPIPE,SIG_IGN); // important for TCP/IP handling #endif // detect calling parameters // process environment and args inmethod=getenv("REQUEST_METHOD"); if (inmethod!=NULL) { // assume cgi call cgi=1; inip=getenv("REMOTE_ADDR"); #ifdef CONF_FILE inpath=CONF_FILE; #endif } else { #ifdef SERVEROPTION if (argc<=1) { // no params #ifdef DEFAULT_PORT // use as server by default port=DEFAULT_PORT; #else print_help(); exit(0); #endif } else if (argc>1) { // command line param given inquery=argv[1]; if (!strcmp(inquery,HELP_PARAM)) { print_help(); exit(0); } port=atoi(inquery); // 0 port means no server if (argc>2) { // conf file given inpath=argv[2]; } } // run either as a server or a command line/cgi program if (port) { #ifdef CONF_FILE if (inpath==NULL) inpath=CONF_FILE; #endif if (inpath!=NULL) { // process conf file load_configuration(inpath,globalptr->conf); //print_conf(globalptr->conf); } run_server(port,globalptr); return 0; } #else if (argc>1) { // command line param given inquery=argv[1]; if (argc>2) { // conf file given inpath=argv[2]; } else { #ifdef CONF_FILE inpath=CONF_FILE; #endif } } else { // no params given print_help(); exit(0); } #endif } if (!port) { // run as command line or cgi #if _MSC_VER // no alarm on windows #else // a timeout for cgi/command line 
signal(SIGALRM,timeout_handler); alarm(TIMEOUT_SECONDS); #endif if (inpath!=NULL) { // process conf file load_configuration(inpath,globalptr->conf); //print_conf(globalptr->conf); } // setup a single tdata block globalptr->maxthreads=1; tdata=&(globalptr->threads_data[0]); tdata->isserver=0; tdata->iscgi=cgi; tdata->ip=inip; tdata->port=0; tdata->method=0; tdata->realthread=0; tdata->format=1; tdata->global=globalptr; tdata->inbuf=NULL; tdata->intype=0; tdata->common=NULL; if (cgi) inquery=get_cgi_query(tdata,inmethod); // actual processing process_query(inquery,tdata); return 0; } return 0; } /* used in cgi case only: get input data, data type and set crucial tdata fields. Returns inquery str if successful and NULL otherwise. */ static char* get_cgi_query(thread_data_p tdata, char* inmethod) { char *inquery=NULL, *inlen=NULL, *intype=NULL; int len=0, type=0, n; if (inmethod!=NULL && !strcmp(inmethod,"GET")) { tdata->method=GET_METHOD_CODE; inquery=getenv("QUERY_STRING"); return inquery; } else if (inmethod!=NULL && !strcmp(inmethod,"POST")) { tdata->method=POST_METHOD_CODE; inlen=getenv("CONTENT_LENGTH"); if (inlen==NULL) return NULL; len=atoi(inlen); if (len<=0) return NULL; if (len>=MAX_MALLOC) return NULL; inquery=malloc(len+10); if (!inquery) return NULL; tdata->inbuf=inquery; intype=getenv("CONTENT_TYPE"); if (intype!=NULL) { if (strstr(intype,"application/x-www-form-urlencoded")!=NULL) type=CONTENT_TYPE_URLENCODED; else if (strstr(intype,"application/json")!=NULL) type=CONTENT_TYPE_JSON; tdata->intype=type; } n=fread(inquery,1,len,stdin); if (n<=0) { return NULL; } if (nconf=malloc(sizeof(struct dserve_conf)); if (globalptr->conf==NULL) {errprint(CANNOT_ALLOC_ERR,NULL); exit(-1);} globalptr->maxthreads=MAX_THREADS; for(i=0;imaxthreads;i++) { globalptr->threads_data[i].db=NULL; globalptr->threads_data[i].inuse=0; } globalptr->conf->default_dbase.size=0; globalptr->conf->default_dbase_size.size=0; globalptr->conf->max_dbase_size.size=0; globalptr->conf->dbases.size=0; globalptr->conf->admin_ips.size=0; globalptr->conf->write_ips.size=0; globalptr->conf->read_ips.size=0; globalptr->conf->admin_tokens.size=0; globalptr->conf->write_tokens.size=0; globalptr->conf->read_tokens.size=0; globalptr->conf->default_dbase.used=0; globalptr->conf->default_dbase_size.used=0; globalptr->conf->max_dbase_size.size=0; globalptr->conf->dbases.used=0; globalptr->conf->admin_ips.used=0; globalptr->conf->write_ips.used=0; globalptr->conf->read_ips.used=0; globalptr->conf->admin_tokens.used=0; globalptr->conf->write_tokens.used=0; globalptr->conf->read_tokens.used=0; } char* process_query(char* inquery, thread_data_p tdata) { int i=0; char *query; char querybuf[MAXQUERYLEN]; int pcount=0; int ql,found; char* res=NULL; char* database=DEFAULT_DATABASE; char* params[MAXPARAMS]; char* values[MAXPARAMS]; // first NULL values potentially left from earlier thread calls #if _MSC_VER #else tdata->db=NULL; #endif tdata->database=NULL; tdata->lock_id=0; tdata->intype=0; tdata->jsonp=NULL; tdata->format=1; tdata->showid=0; tdata->depth=MAX_DEPTH_DEFAULT; tdata->maxdepth=MAX_DEPTH_DEFAULT; tdata->strenc=2; tdata->buf=NULL; tdata->bufptr=NULL; tdata->bufsize=0; // or use your own query string for testing a la // inquery="db=1000&op=search&field=1&value=2&compare=equal&type=record&from=0&count=3"; // parse the query if (inquery==NULL || inquery[0]=='\0') { if (tdata->iscgi && tdata->method==POST_METHOD_CODE) return errhalt(CGI_QUERY_ERR,tdata); else return errhalt(NOQUERY_ERR,tdata); } ql=strlen(inquery); if 
(ql>MAXQUERYLEN) return errhalt(LONGQUERY_ERR,tdata); if (tdata->isserver) query=inquery; else { strcpy((char*)querybuf,inquery); query=(char*)querybuf; } //fprintf(stderr, "query: %s\n", query); pcount=parse_query(query,ql,params,values); if (pcount<=0) return errhalt(MALFQUERY_ERR,tdata); //for(i=0;iglobal)->conf->default_dbase.used>0) database=(tdata->global)->conf->default_dbase.vals[0]; for(i=0;idatabase=database; // try to find jsonp for(i=0;ijsonp=values[i]; break; } } //find the operation and dispatch found=0; for(i=0;iisserver) { if (tdata->inbuf!=NULL) { free(tdata->inbuf); tdata->inbuf=NULL; } return res; } else { print_final(res,tdata); // freeing here is not really necessary and wastes time: process exits anyway if (tdata->inbuf!=NULL) free(tdata->inbuf); if (res!=NULL) free(res); return NULL; } } void print_final(char* str, thread_data_p tdata) { if (str!=NULL) { if (tdata->isserver || tdata->iscgi) { printf(CONTENT_LENGTH,strlen(str)+1); //1 added for puts newline if (tdata->format) printf(JSON_CONTENT_TYPE); else printf(CSV_CONTENT_TYPE); } puts(str); } else { if (tdata->isserver || tdata->iscgi) { if (tdata->format) printf(JSON_CONTENT_TYPE); else printf(CSV_CONTENT_TYPE); } } } /* ============== operations: query parsing and handling ===================== */ /* search from the database, combined with update and delete */ static char* search(thread_data_p tdata, char* inparams[], char* invalues[], int incount, int opcode) { char* database=tdata->database; char *token=NULL; int i,j,x,itmp; wg_int type=0; char* fields[MAXPARAMS]; // search fields char* values[MAXPARAMS]; // search values char* compares[MAXPARAMS]; // search comparisons char* types[MAXPARAMS]; // search value types char* cids=NULL; wg_int ids[MAXIDS]; // select these ids only int fcount=0, vcount=0, ccount=0, tcount=0; // array el counters for above char* sfields[MAXPARAMS]; // set / selected fields char* svalues[MAXPARAMS]; // set field values char* stypes[MAXPARAMS]; // set field types int sfcount; // array el counters for above int from=0; unsigned long count,rcount,gcount,handlecount; void* db=NULL; // actual database pointer void *rec, *oldrec; char* res; wg_query *wgquery; // query datastructure built later wg_query_arg wgargs[MAXPARAMS]; wg_int lock_id=0; // non-0 iff lock set int searchtype=0; // 0: full scan, 1: record ids, 2: by fields char errbuf[ERRBUF_LEN]; // used for building variable-content input param error strings only // default max nr of rows shown/handled if (opcode==COUNT_CODE) count=LONG_MAX; else count=MAXCOUNT; // -------check and parse cgi parameters, attach database ------------ // set params to defaults for(i=0;iformat=1; // 1: json tdata->maxdepth=MAX_DEPTH_DEFAULT; // rec depth limit for printer tdata->showid=0; // add record id as first extra elem: 0: no, 1: yes tdata->strenc=2; // string special chars escaping: 0: just ", 1: urlencode, 2: json, 3: csv // find search parameters for(i=0;i0) ids[x++]=atoi(cids+j); if (x>=MAXIDS) break; for(;jformat==0) { // csv tdata->maxdepth=0; // record structure not printed for csv tdata->strenc=3; // only " replaced with "" } // check search parameters if (cids!=NULL) { // query by record ids if (fcount) return errhalt(RECIDS_COMBINED_ERR,tdata); searchtype=1; } else if (!fcount) { // no search fields given if (vcount || ccount || tcount) return errhalt(NO_FIELD_ERR,tdata); else searchtype=0; // scan everything } else { // search by fields searchtype=2; } // attach to database db=op_attach_database(tdata,database,READ_LEVEL); if (!db) return 
errhalt(DB_ATTACH_ERR,tdata); // database attached OK // create output string buffer (may be reallocated later) tdata->buf=str_new(INITIAL_MALLOC); if (tdata->buf==NULL) return errhalt(MALLOC_ERR,tdata); tdata->bufsize=INITIAL_MALLOC; tdata->bufptr=tdata->buf; // check printing depth if (tdata->maxdepth>MAX_DEPTH_HARD) tdata->maxdepth=MAX_DEPTH_HARD; // initial print if(!op_print_data_start(tdata,opcode==SEARCH_CODE)) return err_clear_detach_halt(MALLOC_ERR,tdata); // zero counters rcount=0; gcount=0; handlecount=0; // actual nr of records handled // get lock if (tdata->realthread && tdata->common->shutdown) return NULL; // for multithreading only lock_id = wg_start_read(db); // get read lock tdata->lock_id=lock_id; tdata->lock_type=READ_LOCK_TYPE; if (!lock_id) return err_clear_detach_halt(LOCK_ERR,tdata); // handle one of the cases if (searchtype==0) { // ------- full scan case --- rec=wg_get_first_record(db); while (rec!=NULL) { if (rcount>=from) { gcount++; if (gcount>count) break; if (opcode==COUNT_CODE) { handlecount++; } else if (opcode==SEARCH_CODE) { itmp=op_print_record(tdata,rec,gcount); if (!itmp) return err_clear_detach_halt(MALLOC_ERR,tdata); } else if (opcode==UPDATE_CODE) { itmp=op_update_record(tdata,db,rec,0,0); if (!itmp) handlecount++; } } oldrec=rec; rec=wg_get_next_record(db,rec); if (opcode==DELETE_CODE) { x=wg_get_record_len(db,oldrec); if (x>0) { itmp=op_delete_record(tdata,oldrec); if (!itmp) handlecount++; //else err_clear_detach_halt(DELETE_ERR,tdata); } } rcount++; } } else if (searchtype==1) { // ------------ search by record ids: ------------ for(j=0; ids[j]!=0 && jcount) break; if (opcode==COUNT_CODE) handlecount++; else if (opcode==SEARCH_CODE) { itmp=op_print_record(tdata,rec,gcount); if (!itmp) return err_clear_detach_halt(MALLOC_ERR,tdata); } else if (opcode==UPDATE_CODE) { itmp=op_update_record(tdata,db,rec,0,0); if (!itmp) handlecount++; } else if (opcode==DELETE_CODE) { // test that db is not null, otherwise we may corrupt the database oldrec=wg_get_first_record(db); if (oldrec!=NULL) { //wg_int objecthead=dbfetch((void*)db,(void*)rec); //printf("isfreeobject %d\n",isfreeobject((int)objecthead)); wg_print_record(db,rec); itmp=op_delete_record(tdata,rec); printf("deletion result %d\n",itmp); if (!itmp) handlecount++; //else return err_clear_detach_halt(DELETE_ERR,tdata); } } } } else if (searchtype==2) { // ------------by field search case: --------- // create a query list datastructure for(i=0;imaxdepth>MAX_DEPTH_HARD) tdata->maxdepth=MAX_DEPTH_HARD; while((rec = wg_fetch(db, wgquery))) { if (rcount>=from) { gcount++; if (opcode==COUNT_CODE) handlecount++; else if (opcode==SEARCH_CODE) { itmp=op_print_record(tdata,rec,gcount); if (!itmp) return err_clear_detach_halt(MALLOC_ERR,tdata); } else if (opcode==UPDATE_CODE) { itmp=op_update_record(tdata,db,rec,0,0); if (!itmp) handlecount++; } else if (opcode==DELETE_CODE) { itmp=op_delete_record(tdata,rec); if (!itmp) handlecount++; //else return err_clear_detach_halt(DELETE_ERR,tdata); } } rcount++; if (gcount>=count) break; } // free query datastructure, for(i=0;ibufptr,MIN_STRLEN,"%lu",handlecount); tdata->bufptr+=itmp; } // release locks and detach if (!wg_end_read(db, lock_id)) { // release read lock return err_clear_detach_halt(LOCK_RELEASE_ERR,tdata); } tdata->lock_id=0; op_detach_database(tdata,db); if(!op_print_data_end(tdata,opcode==SEARCH_CODE)) return err_clear_detach_halt(MALLOC_ERR,tdata); return tdata->buf; } // insert into the database */ static char* insert(thread_data_p tdata, char* 
inparams[], char* invalues[], int incount) { char* database=tdata->database; char *token=NULL; int i,tmp; //int j,x; char* json=NULL; //int count=MAXCOUNT; void* db=NULL; // actual database pointer //void* rec; char* res; wg_int lock_id=0; // non-0 iff lock set char errbuf[ERRBUF_LEN]; // used for building variable-content input param error strings only // yajl /* yajl_handle hand; yajl_gen g; yajl_status stat; size_t rd; int retval = 0; int a = 1; */ // -------check and parse cgi parameters, attach database ------------ // set printing params to defaults tdata->format=1; // 1: json tdata->maxdepth=MAX_DEPTH_DEFAULT; // rec depth limit for printer tdata->showid=0; // add record id as first extra elem: 0: no, 1: yes tdata->strenc=2; // string special chars escaping: 0: just ", 1: urlencode, 2: json, 3: csv // find ids and display format parameters for(i=0;iformat==0) { // csv tdata->maxdepth=0; // record structure not printed for csv tdata->strenc=3; // only " replaced with "" } // attach to database db=op_attach_database(tdata,database,READ_LEVEL); if (!db) { if (!authorize(ADMIN_LEVEL,tdata,database,token)) { return errhalt(NOT_AUTHORIZED_INSERT_CREATE_ERR,tdata); } else { if (tdata->realthread && tdata->common->shutdown) return NULL; // for multithreading only db=op_create_database(tdata,database,0); if (!db) return err_clear_detach_halt(DB_CREATE_ERR,tdata); } } // database attached OK // create output string buffer (may be reallocated later) tdata->buf=str_new(INITIAL_MALLOC); if (tdata->buf==NULL) return errhalt(MALLOC_ERR,tdata); tdata->bufsize=INITIAL_MALLOC; tdata->bufptr=tdata->buf; op_print_data_start(tdata,1); // initialize yajl /* g = yajl_gen_alloc(NULL); yajl_gen_config(g, yajl_gen_beautify, 0); yajl_gen_config(g, yajl_gen_validate_utf8, 1); hand = yajl_alloc(&callbacks, NULL, (void *) g); yajl_config(hand, yajl_allow_comments, 1); */ // take a write lock if (tdata->realthread && tdata->common->shutdown) return NULL; // for multithreading only lock_id = wg_start_write(db); // get write lock tdata->lock_id=lock_id; tdata->lock_type=WRITE_LOCK_TYPE; if (!lock_id) return err_clear_detach_halt(LOCK_ERR,tdata); // start parsing /* rd=strlen(json); stat = yajl_parse(hand, json, rd); if (stat==yajl_status_ok) stat=yajl_complete_parse(hand); if (stat != yajl_status_ok) { unsigned char * str = yajl_get_error(hand, 1, json, rd); fprintf(stderr, "%s", (const char *) str); yajl_free_error(hand, str); retval = 1; } else { // parse succeeded const unsigned char * buf; size_t len; yajl_gen_get_buf(g, &buf, &len); fwrite(buf, 1, len, stdout); yajl_gen_clear(g); } yajl_gen_free(g); yajl_free(hand); */ // parsing ended // parse json and insert tmp=wg_parse_json_document(db,json); if(tmp==-1) { return err_clear_detach_halt(JSON_ERR,tdata); } else if(tmp==-2) { return err_clear_detach_halt(INCONSISTENT_ERR,tdata); } if(!str_guarantee_space(tdata,MIN_STRLEN)) return err_clear_detach_halt(MALLOC_ERR,tdata); strcpy(tdata->bufptr,"1"); tdata->bufptr+=strlen("1"); // end activity if (!wg_end_write(db, lock_id)) { // release write lock return err_clear_detach_halt(LOCK_RELEASE_ERR,tdata); } tdata->lock_id=0; op_detach_database(tdata,db); if(!op_print_data_end(tdata,1)) return err_clear_detach_halt(MALLOC_ERR,tdata); return tdata->buf; } // create a new database static char* create(thread_data_p tdata, char* inparams[], char* invalues[], int incount) { char* database=NULL; char *token=NULL; int i; void* db=NULL; // actual database pointer long size=0; long max_size=0; char *tmps,*res; char 
errbuf[ERRBUF_LEN]; // find and check parameters for(i=0;iglobal)->conf->max_dbase_size.used>0) { tmps=(tdata->global)->conf->max_dbase_size.vals[0]; max_size=atol(tmps); } if(size>max_size) return errhalt(DB_BIG_SIZE_ERR,tdata); } else { // handle generic parameters for all queries: at end of param check res=handle_generic_param(tdata,inparams[i],invalues[i],&token,errbuf); if (res!=NULL) return res; // return error string } } // authorization if (!authorize(ADMIN_LEVEL,tdata,database,token)) { return errhalt(NOT_AUTHORIZED_ERR,tdata); } // all parameters and values were understood // create output string buffer (may be reallocated later) tdata->buf=str_new(INITIAL_MALLOC); if (tdata->buf==NULL) return errhalt(MALLOC_ERR,tdata); tdata->bufsize=INITIAL_MALLOC; tdata->bufptr=tdata->buf; op_print_data_start(tdata,0); // indicate no lock if (tdata->realthread && tdata->common->shutdown) return NULL; // for multithreading only tdata->lock_id=0; // check and create database //db=wg_attach_existing_database(database); //if (db!=NULL) return errhalt(DB_EXISTS_ALREADY_ERR,tdata); db=op_create_database(tdata,database,size); if (db==NULL) return errhalt(DB_CREATE_ERR,tdata); tdata->db=db; // created successfully if(!str_guarantee_space(tdata,MIN_STRLEN)) return err_clear_detach_halt(MALLOC_ERR,tdata); strcpy(tdata->bufptr,"1"); tdata->bufptr+=strlen("1"); // end activity op_detach_database(tdata,db); if(!op_print_data_end(tdata,0)) return err_clear_detach_halt(MALLOC_ERR,tdata); return tdata->buf; } // drop a database static char* drop(thread_data_p tdata, char* inparams[], char* invalues[], int incount) { char* database=tdata->database; char *token=NULL; int i,tmp; void* db=NULL; // actual database pointer char *res; int lock_id=0; int found=0; char errbuf[ERRBUF_LEN]; // find and check parameters for(i=0;ibuf=str_new(INITIAL_MALLOC); if (tdata->buf==NULL) return errhalt(MALLOC_ERR,tdata); tdata->bufsize=INITIAL_MALLOC; tdata->bufptr=tdata->buf; op_print_data_start(tdata,0); // check if access allowed in the conf file if (database!=NULL && (tdata->global)->conf->dbases.used>0) { for(i=0;i<(tdata->global)->conf->dbases.used;i++) { if (!strcmp(database,(tdata->global)->conf->dbases.vals[i])) { found=1; break; } } if (!found) return errhalt(DB_AUTHORIZE_ERR,tdata); } // first try to attach to an existing database db=op_attach_database(tdata,database,ADMIN_LEVEL); if (db==NULL) { return errhalt(DB_NOT_EXISTS_ERR,tdata); } else { // database exists, take lock if (tdata->realthread && tdata->common->shutdown) return NULL; // for multithreading only tdata->db=db; lock_id = wg_start_write(db); // get write lock tdata->lock_id=lock_id; tdata->lock_type=WRITE_LOCK_TYPE; if (!lock_id) return err_clear_detach_halt(LOCK_ERR,tdata); tmp=wg_detach_database(db); // detaches a database: returns 0 if OK if (tmp) return err_clear_detach_halt(DB_DROP_ERR,tdata); tmp=wg_delete_database(database); if (tmp) return errhalt(DB_DROP_ERR,tdata); } // deleted successfully tdata->db=NULL; tdata->lock_id=0; if(!str_guarantee_space(tdata,MIN_STRLEN)) return err_clear_detach_halt(MALLOC_ERR,tdata); strcpy(tdata->bufptr,"1"); tdata->bufptr+=strlen("1"); // end activity if(!op_print_data_end(tdata,0)) return errhalt(MALLOC_ERR,tdata); return tdata->buf; } /* ***** print, delete, update utilities ****** */ // print a single record to output string buffer of tdata // return 1 if ok, 0 if fails static int op_print_record(thread_data_p tdata,void* rec,int gcount) { int res; if (!str_guarantee_space(tdata,MIN_STRLEN)) return 0; if 
(gcount>1 && tdata->format!=0) { // json and not first row snprintf(tdata->bufptr,MIN_STRLEN,",\n"); tdata->bufptr+=2; } res=sprint_record(tdata->db,rec,tdata); if (!res) return 0; if (tdata->format==0) { // csv if(!str_guarantee_space(tdata,MIN_STRLEN)) return 0; snprintf(tdata->bufptr,MIN_STRLEN,"\r\n"); tdata->bufptr+=2; } return 1; } // delete a record // return 0 if ok, 1 if fails (contrary to print above) static int op_delete_record(thread_data_p tdata,void* rec) { return wg_delete_record(tdata->db,rec); } /* ******** query preparation and ending utilities ******** */ // return NULL if parsing ok, errstr otherwise static char* handle_generic_param(thread_data_p tdata,char* key,char* value, char** token,char* errbuf) { if (key==NULL || value==NULL) { return NULL; } else if (strncmp(key,"depth",MAXQUERYLEN)==0) { tdata->maxdepth=atoi(value); } else if (strncmp(key,"showid",MAXQUERYLEN)==0) { if (strncmp(value,"yes",MAXQUERYLEN)==0) tdata->showid=1; else if (strncmp(value,"no",MAXQUERYLEN)==0) tdata->showid=0; else { snprintf(errbuf,ERRBUF_LEN,UNKNOWN_PARAM_VALUE_ERR,value,key); return errhalt(errbuf,tdata); } } else if (strncmp(key,"format",MAXQUERYLEN)==0) { if (strncmp(value,"csv",MAXQUERYLEN)==0) tdata->format=0; else if (strncmp(value,"json",MAXQUERYLEN)==0) tdata->format=1; else { snprintf(errbuf,ERRBUF_LEN,UNKNOWN_PARAM_VALUE_ERR,value,key); return errhalt(errbuf,tdata); } } else if (strncmp(key,"escape",MAXQUERYLEN)==0) { if (strncmp(value,"no",MAXQUERYLEN)==0) tdata->strenc=0; else if (strncmp(value,"url",MAXQUERYLEN)==0) tdata->strenc=1; else if (strncmp(value,"json",MAXQUERYLEN)==0) tdata->strenc=2; else { snprintf(errbuf,ERRBUF_LEN,UNKNOWN_PARAM_VALUE_ERR,value,key); return errhalt(errbuf,tdata); } } else if (strncmp(key,"token",MAXQUERYLEN)==0) { *token=value; } else if (strncmp(key,JSONP_PARAM,MAXQUERYLEN)==0) { //tdata->jsonp=value; } else if (strncmp(key,NOACTION_PARAM,MAXQUERYLEN)==0) { // correct parameter, no action here } else if (strncmp(key,"db",MAXQUERYLEN)==0) { // correct parameter, no action here } else if (strncmp(key,"op",MAXQUERYLEN)==0) { // correct parameter, no action here } else { // incorrect/unrecognized parameter #ifdef ALLOW_UNKNOWN_PARAMS #else snprintf(errbuf,ERRBUF_LEN,UNKNOWN_PARAM_ERR,key); return errhalt(errbuf,tdata); #endif } return NULL; } static char* handle_fld_param(thread_data_p tdata,char* key,char* value, char** sfields, char** svalues, char** stypes, int sfcount, char* errbuf) { if (key==NULL || value==NULL) { return NULL; } *sfields=NULL; *svalues=NULL; *stypes=NULL; return NULL; } // call to print output start // return 1 if successful, 0 if fails static int op_print_data_start(thread_data_p tdata, int listflag) { int itmp; if(!str_guarantee_space(tdata,MIN_STRLEN)) return 0; if (tdata->format!=0) { // json if (tdata->jsonp!=NULL) { if (listflag) itmp=snprintf(tdata->bufptr,MIN_STRLEN,"%s([\n",tdata->jsonp); else itmp=snprintf(tdata->bufptr,MIN_STRLEN,"%s(",tdata->jsonp); tdata->bufptr+=itmp; } else { if (listflag) { itmp=snprintf(tdata->bufptr,MIN_STRLEN,"[\n"); tdata->bufptr+=itmp; } } } return 1; } // call just before finishing an operation to finish output // return 1 if successful, 0 if fails static int op_print_data_end(thread_data_p tdata, int listflag) { int itmp; if(!str_guarantee_space(tdata,MIN_STRLEN)) return 0; if (tdata->format!=0) { // json if (tdata->jsonp!=NULL) { if (listflag) itmp=snprintf(tdata->bufptr,MIN_STRLEN,"\n]);"); else itmp=snprintf(tdata->bufptr,MIN_STRLEN,");"); tdata->bufptr+=itmp; } else { if 
(listflag) { itmp=snprintf(tdata->bufptr,MIN_STRLEN,"\n]"); tdata->bufptr+=itmp; } } } return 1; } // update a record static int op_update_record(thread_data_p tdata,void* db, void* rec, wg_int fld, wg_int value) { wg_set_int_field(db,rec,fld,value); return 0; } // create a new database static void* op_create_database(thread_data_p tdata,char* database,long size) { void* db; char* sizestr; long max_size=0; int i; int found=0; //printf("op_create_database called\n"); if (database==NULL) { #ifdef DEFAULT_DATABASE database=DEFAULT_DATABASE; #endif if ((tdata->global)->conf->default_dbase.used>0) database=(tdata->global)->conf->default_dbase.vals[0]; } if (size<=0) { #ifdef DEFAULT_DATABASE_SIZE size=DEFAULT_DATABASE_SIZE; #endif if ((tdata->global)->conf->default_dbase_size.used>0) { sizestr=(tdata->global)->conf->default_dbase_size.vals[0]; size=atol(sizestr); } } #ifdef MAX_DATABASE_SIZE max_size=MAX_DATABASE_SIZE; #else max_size=LONG_MAX; #endif if ((tdata->global)->conf->max_dbase_size.used>0) { sizestr=(tdata->global)->conf->max_dbase_size.vals[0]; max_size=atol(sizestr); } if (database==NULL) return NULL; // check if access allowed in the conf file if ((tdata->global)->conf->dbases.used>0) { for(i=0;i<(tdata->global)->conf->dbases.used;i++) { if (!strcmp(database,(tdata->global)->conf->dbases.vals[i])) { found=1; break; } } if (!found) return NULL; } if (size<=0) return NULL; if (size>max_size) return NULL; db = wg_attach_database(database,size); return db; } // attach to database void* op_attach_database(thread_data_p tdata,char* database,int accesslevel) { void* db; int i; int found=0; if (database==NULL) { //use default #ifdef DEFAULT_DATABASE database=DEFAULT_DATABASE; #endif if ((tdata->global)->conf->default_dbase.used>0) { database=(tdata->global)->conf->default_dbase.vals[0]; } if (database==NULL) return NULL; } // check if access allowed in the conf file if ((tdata->global)->conf->dbases.used>0) { for(i=0;i<(tdata->global)->conf->dbases.used;i++) { if (!strcmp(database,(tdata->global)->conf->dbases.vals[i])) { found=1; break; } } if (!found) return NULL; } #if _MSC_VER db = tdata->db; #else db = wg_attach_existing_database(database); //db = wg_attach_database(database,100000000); tdata->db=db; #endif return db; } // detach database int op_detach_database(thread_data_p tdata, void* db) { #if _MSC_VER #else if (db!=NULL) wg_detach_database(db); tdata->db=NULL; #endif return 0; } whitedb-0.7.2/Server/dserve.h000066400000000000000000000456741226454622500161050ustar00rootroot00000000000000/* dserve.h is a common header for dserve dserve is a tool for performing REST queries from WhiteDB using a cgi protocol over http(s). Results are given in the json or csv format. See http://whitedb.org/tools.html for a detailed manual. Copyright (c) 2013, Tanel Tammet This software is under MIT licence unless linked with WhiteDB: see dserve.c for details. 
*/ /* ====== select windows/linux dependent includes and options ======= */ #define SERVEROPTION // remove this for cgi/command line only: no need for threads/sockets in this case #if _MSC_VER #include // set this to "../Db/dbapi.h" if whitedb is not installed #define snprintf _snprintf #else #include #endif #ifdef SERVEROPTION #if _MSC_VER // windows, see http://msdn.microsoft.com/en-us/library/windows/desktop/ms738566%28v=vs.85%29.aspx //#define WIN32_LEAN_AND_MEAN //#include // windows.h only with lean_and_mean before it #include #include #include #include #pragma comment (lib, "ws2_32.lib") #pragma comment (lib, "User32.lib") // required by win_err_handle only #define THREADPOOL 0 // threadpools are not implemented for windows #define CLOSE_CHECK_THRESHOLD 0 // always set this to 0 for windows #else // linux #include #define THREADPOOL 1 // set to 0 for no threadpool (instead, new thread for each connection) #define CLOSE_CHECK_THRESHOLD 10000 // close immediately after shutdown for msg len less than this #endif #endif #ifdef USE_OPENSSL #include #include #include // define KEY_FILE and CERT_FILE to overrule values normally given in conf file // create both by: // openssl req -x509 -nodes -days 365 -newkey rsa:2048 // -keyout exampleprivatekey.key -out examplecertificate.crt // or from this by: openssl rsa -in exampleprivatekey.key -out exampleprivatekey.pem // #define KEY_FILE "/home/tanel/whitedb/Server/exampleprivatekey.key" // overrule conf file // #define CERT_FILE "/home/tanel/whitedb/Server/examplecertificate.crt" // overrule conf file #endif /* =============== configuration macros =================== */ #define CONF_FILE "/home/tanel/whitedb/Server/conf.txt" #define DEFAULT_DATABASE "1000" // used if none explicitly given and not overruled by conf file #define DEFAULT_DATABASE_SIZE 10000000 // used if none explicitly given and not overruled by conf file // set DEFAULT_DATABASE_SIZE to 0 to inhibit automatic database creation upon insert #define MAX_DATABASE_SIZE 1000000000 // limit if not overruled by conf file // set MAX_DATABASE_SIZE to 0 to inhibit any database creation // print level #define INFOPRINT // if set, neutral info printed to stderr, no info otherwise #define WARNPRINT // if set, warnings printed to stderr, no warnprint otherwise #define ERRPRINT // if set, errors printed to stderr, no errprint otherwise // server/connection configuration //#define DEFAULT_PORT 8080 // define this to run as a server on that port if no params given //#define USE_OPENSSL // define this to build a https server: normally defined in compiler flags #define MULTI_THREAD // removing this creates a simple iterative server #define MAX_THREADS 8 // size of threadpool and max nr of threads in an always-new-thread model #define QUEUE_SIZE 100 // task queue size for threadpool #define TIMEOUT_SECONDS 2 // used for cgi and command line only #define CATCH_SIGNALS // remove this to leave system error signals unhandled // header row templates: XXXXXXXXXX replaced with actual content length #define JSON_CONTENT_TYPE "Content-Type: application/json\r\n\r\n" #define CSV_CONTENT_TYPE "Content-Type: text/csv\r\n\r\n" #define CONTENT_LENGTH "Content-Length: %d\r\n" #define HEADER_TEMPLATE "HTTP/1.0 200 OK\r\n\ Server: dserve\r\n\ Access-Control-Allow-Origin: *\r\n\ Connection: Close\r\n\ Cache-Control: no-cache, must-revalidate\r\n\ Pragma: no-cache\r\n\ Content-Length: XXXXXXXXXX \r\n\ Content-Type: text/plain\r\n\r\n" // limits #define MAXQUERYLEN 2000 // query string length limit for GET #define 
MAXPARAMS 100 // max number of cgi params in query #define MAXCOUNT 100000 // max number of result records #define MAXIDS 1000 // max number of rec id-s in recids query #define MAXLINE 10000 // server query input buffer and one header line max #define MAXLINES 1000 // server query input: max nr of header lines #define CONF_BUF_SIZE 1000 // initial conf buf size, incremented as necessary #define MAX_CONF_BUF_SIZE 10000000 // max conf file size #define CONF_VALS_SIZE 2 // initial size of conf value array #define MAX_CONF_VALS_SIZE 1000000 // max size of conf value array #define ERRBUF_LEN 200 // limit for simple param-checking error strings // QUERY PARSING #define JSONP_PARAM "jsonp" // a jsonp padding parameter: use as jsonp=mycallback #define NOACTION_PARAM "_" // an allowed additional cgi parameter with no effect: use as _=123 //#define ALLOW_UNKNOWN_PARAMS // define this to allow any unrecognized params // result output/print settings #define HELP_PARAM "--help" // for dserve --help #define INITIAL_MALLOC 1000 // initially malloced result size #define MAX_MALLOC 100000000 // max malloced result size #define MIN_STRLEN 100 // fixed-len obj strlen, add this to strlen for print-space need #define STRLEN_FACTOR 6 // might need 6*strlen for json encoding #define DOUBLE_FORMAT "%g" // snprintf format for printing double #define JS_NULL "[]" #define CSV_SEPARATOR ',' // must be a single char #define MAX_DEPTH_DEFAULT 100 // can be increased #define MAX_DEPTH_HARD 10000 // too deep rec nesting will cause stack overflow in the printer #define HTTP_LISTENQ 1024 // server only: second arg to listen: listening queue length // may want to use SOMAXCONN instead of 1024 in windows #define HTTP_HEADER_SIZE 1000 // server only: buffer size for header #define HTTP_ERR_BUFSIZE 1000 // server only: buffer size for errstr // QUERY PARAMETERS /* #define QUERY_OP_PARAM "op" #define QUERY_DB_PARAM "db" #define QUERY_OP_PARAM "op" #define QUERY_OP_PARAM "op" */ // normal nonterminating error strings #define NOQUERY_ERR "no query" #define LONGQUERY_ERR "too long query" #define MALFQUERY_ERR "malformed query" #define UNKNOWN_PARAM_ERR "unrecognized parameter: %s" #define UNKNOWN_PARAM_VALUE_ERR "unrecognized value %s for parameter %s" #define NO_OP_ERR "no op given: use op=opname for opname in search,insert,..." 
#define UNKNOWN_OP_ERR "unrecognized op: use op=search or op=recids" #define NO_FIELD_ERR "no field given" #define NO_VALUE_ERR "no value given" #define CANNOT_ALLOC_ERR "cannot allocate global data\n" #define DB_PARAM_ERR "use db=name with a numeric name for a concrete database" #define DB_ATTACH_ERR "no database found: use db=name with a numeric name for a concrete database" #define FIELD_ERR "unrecognized field: use an integer starting from 0" #define COND_ERR "unrecognized compare: use equal, not_equal, lessthan, greater, ltequal or gtequal" #define INTYPE_ERR "unrecognized type: use null, int, double, str, char or record " #define INVALUE_ERR "did not find a value to use for comparison" #define INVALUE_TYPE_ERR "value does not match type" #define DECODE_ERR "field data decoding failed" #define DELETE_ERR "record deletion failed" #define DB_NO_SIZE_ERR "database size not given" #define DB_BIG_SIZE_ERR "database size too big" #define DB_EXISTS_ALREADY_ERR "database exists already" #define DB_NOT_EXISTS_ERR "database does not exist" #define DB_CREATE_ERR "database creation failed" #define DB_DROP_ERR "database dropping failed" #define DB_NAME_ERR "incorrect or missing database name" #define DB_AUTHORIZE_ERR "access to database not authorized" #define HTTP_METHOD_ERR "method given in http not implemented: use GET" #define HTTP_REQUEST_ERR "incorrect http request" #define HTTP_NOQUERY_ERR "no query found" #define WRITEN_ERROR "writen error\n" // formatting normal err messages #define JS_TYPE_ERR "\"\"" // currently this will be shown also for empty string //#define NORMAL_ERR_FORMAT "[\"%s\"]" // normal non-terminate error string is put in here #define NORMAL_ERR_FORMAT "\"ERROR: %s\"" // normal non-terminate error string is put in here #define JSONP_ERR_FORMAT "%s(\"ERROR: %s\");" // jsonp non-terminate error string is put in here // normally one request terminating error strings #define MALLOC_ERR "cannot allocate enough memory for result string" #define CGI_QUERY_ERR "cannot get query string: maybe bad/missing content-length?" 
#define NOT_AUTHORIZED_ERR "query not authorized" #define NOT_AUTHORIZED_INSERT_CREATE_ERR "database missing and creation of new database not authorized" #define QUERY_ERR "query creation failed" #define MISSING_JSON_ERR "input json missing" #define JSON_ERR "json parsing failed" #define DB_CREATE_ERR "database creation failed" #define RECIDS_COMBINED_ERR "search by record ids cannot be combined with search by fields" // globally terminating error strings #define TIMEOUT_ERR "timeout" #define INTERNAL_ERR "internal error" #define LOCK_ERR "database locked" #define INCONSISTENT_ERR "database inconsistent" #define LOCK_RELEASE_ERR "releasing read lock failed: database may be in deadlock" #define WSASTART_ERR "WSAStartup failed\n" #define MUTEX_ERROR "Error initializing pthread mutex, cond or attr\n" #define THREAD_CREATE_ERR "Cannot create a thread: %s\n" #define PORT_LISTEN_ERR "Cannot open port for listening: %s\n" #define SETSOCKOPT_READT_ERR "Setsockopt for read timeout failed\n" #define SETSOCKOPT_WRITET_ERR "Setsockopt for write timeout failed\n" #define THREADPOOL_UNLOCK_ERR "Threadpool unlock failure\n" #define COND_WAIT_FAIL_ERR "pthread_cond_wait failure\n" #define THREADPOOL_LOCK_ERR "Threadpool lock failure \n" #define TERMINATE_ERR "dserve terminating\n" #define TERMINATE_NOGLOB_ERR "dserve terminating: no global data found\n" #define CONF_OPEN_ERR "Cannot open configuration file %s \n" #define CONF_MALLOC_ERR "Cannot malloc for configuration file reading\n" #define CONF_READ_ERR "Cannot read from configuration file %s\n" #define CONF_VAL_ERR "Unknown key %s in configuration file\n" #define CONF_SIZE_ERR "MAX_CONF_BUF_SIZE too small to read the the configuration file %s\n" #define CONF_VALNR_ERR "MAX_CONF_VALS_SIZE too small for the list of conf values\n" #define NO_KEY_FILE_ERR "key_file not given in configuration for https\n" #define NO_CERT_FILE_ERR "cert_file not given in configuration for https\n" // warnings and info #define CONN_ACCEPT_WARN "Cannot accept connection: %s.\n" #define SHUTDOWN_WARN "Shutting down.\n" #define SHUTDOWN_THREAD_WARN "Shutting down thread.\n" #define COND_SIGNAL_FAIL_WARN "pthread_cond_signal failure.\n" #define READING_FAILED_WARN "Failed to read input.\n" #define CONTENT_LENGTH_MISSING_WARN "Content-length missing.\n" #define CONTENT_LENGTH_BIG_WARN "Content-length too big.\n" #define THREADPOOL_INFO "Running multithreaded with a threadpool.\n" #define MULTITHREAD_INFO "Running multithreaded without threadpool.\n" // internal values #define READ "r" #define READ_LOCK_TYPE 1 #define WRITE_LOCK_TYPE 2 #define ADMIN_LEVEL 0 #define WRITE_LEVEL 1 #define READ_LEVEL 2 #define CONTENT_TYPE_UNKNOWN 0 // this and following determined by "Content-Type:" #define CONTENT_TYPE_URLENCODED 1 // application/x-www-form-urlencoded #define CONTENT_TYPE_JSON 2 // application/json #define GET_METHOD_CODE 1 // GET request code for tdata->method #define POST_METHOD_CODE 2 // POST request code code for tdata->method #define COUNT_CODE 0 // passed as last arg to generic search #define SEARCH_CODE 1 // passed as last arg to generic search #define DELETE_CODE 2 // passed as last arg to generic search #define UPDATE_CODE 3 // passed as last arg to generic search #define BAD_WG_VALUE WG_ILLEGAL // 0xff used for returning encoding failures // err codes from sysexit.h project #define ERR_EX_NOINPUT 66 // required file was missing or unreadable #define ERR_EX_UNAVAILABLE 69 // an external service or program failed #define ERR_EX_SOFTWARE 70 // hard software errors from 
catching a signal #define ERR_EX_TEMPFAIL 75 // temporary failure, perhaps not really an error #define ERR_EX_CONFIG 78 // configuration errors #define CONF_DEFAULT_DBASE "default_dbase" #define CONF_DEFAULT_DBASE_SIZE "default_dbase_size" #define CONF_MAX_DBASE_SIZE "max_dbase_size" #define CONF_DBASES "dbases" #define CONF_ADMIN_IPS "admin_ips" #define CONF_WRITE_IPS "write_ips" #define CONF_READ_IPS "read_ips" #define CONF_ADMIN_TOKENS "admin_tokens" #define CONF_WRITE_TOKENS "write_tokens" #define CONF_READ_TOKENS "read_tokens" #define CONF_KEY_FILE "key_file" #define CONF_CERT_FILE "cert_file" /* ========== global structures ============= */ // each thread (or a single cgi/command line) has its own thread_data block typedef struct thread_data * thread_data_p; struct thread_data{ // thread type, database, locks int isserver; // 1 if run as a server, 0 if not int iscgi; // 1 if run as a cgi program, 0 if not int realthread; // 1 if thread, 0 if not int thread_id; // 0,1,.. struct common_data *common; // common is shared by all threads struct dserve_global *global; // global is thread-independent void *db; // NULL iff not attached char *database; //database name wg_int lock_id; // 0 iff not locked int lock_type; // 1 read, 2 write int inuse; // 1 if in use, 0 if not (free to reuse) // task details int conn; // actual socket id #ifdef USE_OPENSSL SSL *ssl; #endif char *ip; // request ip int port; // request port int method; // request method code: unknown 0, GET 1, POST 2, ... int res; // stored by thread // input data char *inbuf; // input buffer: used only by post, should be freed int intype; // 0 missing content-type, 1 urlencoded, 2 json // printing char *jsonp; // NULL or jsonp function string int format; // 1 json, 0 csv int showid; // print record id for record: 0 no show, 1 first (extra) elem of record int depth; // limit on records nested via record pointers (0: no nesting) int maxdepth; // limit on printing records nested via record pointers (0: no nesting) int strenc; /* strenc==0: nothing is escaped at all strenc==1: non-ascii chars and % and " urlencoded strenc==2: json utf-8 encoding, not ascii-safe strenc==3: csv encoding, only " replaced for "" */ char *buf; // address of the whole string buffer start (not the start itself) char *bufptr; // address of the next place in buf to write into int bufsize; // buffer length }; // a single dserve_global is created as a global var dsglobal typedef struct dserve_global * dserve_global_p; struct dserve_global{ struct dserve_conf *conf; int maxthreads; struct thread_data threads_data[MAX_THREADS]; }; // configuration data read from file, kept as sized_strlst for each kind // with each sized_strlst containing char* array of conf vals struct sized_strlst{ int size; // vals array size: not all have to be used int used; // nr of used els in vals char** vals; // actual array of char* to vals }; typedef struct dserve_conf * dserve_conf_p; struct dserve_conf{ struct sized_strlst default_dbase; struct sized_strlst default_dbase_size; struct sized_strlst max_dbase_size; struct sized_strlst dbases; struct sized_strlst admin_ips; struct sized_strlst write_ips; struct sized_strlst read_ips; struct sized_strlst admin_tokens; struct sized_strlst write_tokens; struct sized_strlst read_tokens; struct sized_strlst key_file; struct sized_strlst cert_file; }; #ifdef SERVEROPTION #if _MSC_VER // windows #define ssize_t int #define socklen_t int // task queue elements typedef struct { int conn; } common_task_t; // common information pointed to from each 
thread data block: lock, queue, etc struct common_data{ void* *threads; void* mutex; void* cond; common_task_t *queue; int thread_count; int queue_size; int head; int tail; int count; int started; int shutdown; }; #else // linux // task queue elements typedef struct { int conn; #ifdef USE_OPENSSL SSL *ssl; #endif } common_task_t; // common information pointed to from each thread data block: lock, queue, etc struct common_data{ pthread_t *threads; pthread_mutex_t mutex; pthread_cond_t cond; common_task_t *queue; int thread_count; int queue_size; int head; int tail; int count; int started; int shutdown; }; #endif // win or linux server #else // no serveroption typedef struct { int conn; } common_task_t; struct common_data{ void* *threads; void* mutex; void* cond; common_task_t *queue; int thread_count; int queue_size; int head; int tail; int count; int started; int shutdown; }; #endif // serveroption or no serveroption /* =============== global protos =================== */ // in dserve.c: char* process_query(char* inquery, thread_data_p tdata); void print_final(char* str, thread_data_p tdata); void* op_attach_database(thread_data_p tdata,char* database,int accesslevel); int op_detach_database(thread_data_p tdata, void* db); // in dserve_net.c: int run_server(int port, struct dserve_global * globalptr); char* make_http_errstr(char* str, thread_data_p tdata); // in dserve_util.c: wg_int encode_incomp(void* db, char* incomp); wg_int encode_intype(void* db, char* intype); wg_int encode_invalue(void* db, char* invalue, wg_int type); int isint(char* s); int isdbl(char* s); int parse_query(char* query, int ql, char* params[], char* values[]); char* urldecode(char *indst, char *src); int sprint_record(void *db, wg_int *rec, thread_data_p tdata); char* sprint_value(void *db, wg_int enc, thread_data_p tdata); int sprint_string(char* bptr, int limit, char* strdata, int strenc); int sprint_blob(char* bptr, int limit, char* strdata, int strenc); int sprint_append(char** buf, char* str, int l); char* str_new(int len); int str_guarantee_space(thread_data_p tdata, int needed); int load_configuration(char* path, struct dserve_conf *conf); int add_conf_key_val(struct dserve_conf *conf, char* key, char* val); int add_slval(struct sized_strlst *lst, char* val); void print_conf(struct dserve_conf *conf); void print_conf_slval(struct sized_strlst *lst, char* key); int authorize(int level,thread_data_p tdata,char* database,char* token); void print_help(void); void infoprint(char* fmt, char* param); void warnprint(char* fmt, char* param); void errprint(char* fmt, char* param); char* errhalt(char* str, thread_data_p tdata); char* err_clear_detach_halt(char* errstr, thread_data_p tdata); void terminate(void); void termination_handler(int signal); void timeout_handler(int signal); void clear_detach_final(int signal); #if _MSC_VER void usleep(__int64 usec); void win_err_handler(LPTSTR lpszFunction); #endif whitedb-0.7.2/Server/dserve_net.c000066400000000000000000000567621226454622500167460ustar00rootroot00000000000000/* dserve_net.c contains networking functions for dserve.c dserve is a tool for performing REST queries from WhiteDB using a cgi protocol over http(s). Results are given in the json or csv format. See http://whitedb.org/tools.html for a detailed manual. Copyright (c) 2013, Tanel Tammet This software is under MIT licence unless linked with WhiteDB: see dserve.c for details. 
*/ #include "dserve.h" #include #include #include #include // for alarm and termination signal handling #include #include #include // linux nanosleep #if _MSC_VER #else #include #include // inet_ntop #include #include #include // for alarm #ifdef MULTI_THREAD #include #endif #endif /* ============= local protos ============= */ static char* get_post_data(int connsd,char* buf,void* ssl,thread_data_p tdata); int open_listener(int port); void write_header(char* buf); void write_header_clen(char* buf, int clen); int parse_uri(char *uri, char *filename, char *cgiargs); ssize_t readlineb(int fd, void *usrbuf, size_t maxlen, void* sslp); ssize_t readn(int fd, void *usrbuf, size_t n, void* sslp); ssize_t writen(int fd, void *usrbuf, size_t n, void* sslp); #if _MSC_VER DWORD WINAPI handle_http(LPVOID targ); #else void *handle_http(void *targ); #endif #ifdef USE_OPENSSL SSL_CTX *init_openssl(dserve_conf_p conf); void ShowCerts(SSL* ssl); #endif /* ========== structures ============= */ /* ========== globals =========================== */ /* =============== functions =================== */ int run_server(int port, dserve_global_p globalptr) { struct sockaddr_in clientaddr; int rc, sd, connsd, next; thread_data_p tdata; struct common_data *common; long tid, maxtid, tcount, i; size_t clientlen; //struct timeval timeout; #ifdef MULTI_THREAD #if _MSC_VER HANDLE thandle; HANDLE thandlearray[MAX_THREADS]; DWORD threads[MAX_THREADS]; #else pthread_t threads[MAX_THREADS]; pthread_attr_t attr; struct timespec tim, tim2; #endif #ifdef USE_OPENSSL SSL_CTX *ctx; SSL *ssl; #endif #endif #if _MSC_VER void* db=NULL; // actual database pointer WSADATA wsaData; if (WSAStartup(MAKEWORD(2, 0),&wsaData) != 0) { errprint(WSASTART_ERR,NULL); exit(ERR_EX_UNAVAILABLE); } db = wg_attach_existing_database("1000"); //db = wg_attach_database(database,100000000); if (!db) { errprint(DB_ATTACH_ERR,NULL); exit(ERR_EX_UNAVAILABLE); } #else signal(SIGPIPE,SIG_IGN); // important for linux TCP/IP handling #endif tdata=&(globalptr->threads_data[0]); #ifdef MULTI_THREAD #if _MSC_VER #else if (THREADPOOL) { // ---------------- run as server with threadpool ----------- infoprint(THREADPOOL_INFO,NULL); // setup nanosleep for 100 microsec tim.tv_sec = 0; tim.tv_nsec = 100000; #ifdef USE_OPENSSL // prepare openssl ctx=init_openssl(globalptr->conf); #endif // prepare threads common=(struct common_data *)malloc(sizeof(struct common_data)); tid=0; tcount=0; maxtid=0; if (pthread_mutex_init(&(common->mutex),NULL) !=0 || pthread_cond_init(&(common->cond),NULL) != 0 || pthread_attr_init(&attr) !=0) { errprint(MUTEX_ERROR,NULL); exit(ERR_EX_UNAVAILABLE); } common->threads = threads; common->queue = (common_task_t *)malloc(sizeof(common_task_t) * QUEUE_SIZE); common->thread_count = 0; common->queue_size = QUEUE_SIZE; common->head = common->tail = common->count = 0; common->shutdown = common->started = 0; pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE); //PTHREAD_CREATE_DETACHED); // create threads for(tid=0;tidmutex)) != 0) { errprint(THREADPOOL_LOCK_ERR,NULL); exit(ERR_EX_UNAVAILABLE); } #ifdef USE_OPENSSL ssl = SSL_new(ctx); // get new SSL state with context SSL_set_fd(ssl,connsd); #endif // now we have a connection: add to queue next=common->tail+1; next=(next==common->queue_size) ? 0 : next; do { if(common->count==common->queue_size) { // full? 
//fprintf(stderr, "queue full\n"); nanosleep(&tim , &tim2); break; //continue; } if(common->shutdown) { warnprint(SHUTDOWN_WARN,NULL); break; } // add to task queue common->queue[common->tail].conn=connsd; #ifdef USE_OPENSSL common->queue[common->tail].ssl=ssl; #endif common->tail=next; common->count+=1; //printf("next %d\n",next); // broadcast if(pthread_cond_signal(&(common->cond)) != 0) { warnprint(COND_SIGNAL_FAIL_WARN,NULL); break; } } while(0); //fprintf(stderr,"starting to unlock \n"); if(pthread_mutex_unlock(&(common->mutex)) != 0) { errprint(THREADPOOL_UNLOCK_ERR,NULL); exit(ERR_EX_UNAVAILABLE); } } return 0; // never come to this } else #endif // threadpool not implemented on windows version: using a non-threadpool version { // ------------- run as server without threadpool ------------- infoprint(MULTITHREAD_INFO,NULL); // setup nanosleep for 100 microsec #if _MSC_VER #else tim.tv_sec = 0; tim.tv_nsec = 100000; #endif // prepare common block common=(struct common_data *)malloc(sizeof(struct common_data)); common->shutdown=0; // mark thread data blocks free for(i=0;ishutdown==1) break; if (connsd<0) { warnprint(CONN_ACCEPT_WARN, strerror(errno)); continue; } tid=-1; // find first free thread data block // loop until we get a free one while(tid<0) { for(i=0;i=0) break; #if _MSC_VER usleep(1); #else nanosleep(&tim , &tim2); #endif } if (tid>maxtid) maxtid=tid; tcount++; // init thread data block tdata[tid].isserver=1; tdata[tid].thread_id=tid; tdata[tid].realthread=1; tdata[tid].inuse=1; tdata[tid].conn=connsd; tdata[tid].ip=NULL; tdata[tid].port=0; tdata[tid].method=0; tdata[tid].res=0; tdata[tid].global=globalptr; #if _MSC_VER tdata[tid].db=db; thandle=CreateThread(NULL, 0, handle_http, (void *) &tdata[tid], 0, &threads[tid]); if (thandle==NULL) { win_err_handler(TEXT("CreateThread")); ExitProcess(3); } else { thandlearray[tid]=thandle; } #else rc=pthread_create(&threads[tid], &attr, handle_http, (void *) &tdata[tid]); #endif } return 0; // never come to this } #else // ------------ run as an iterative server ------------- sd=open_listener(port); if (sd<0) { errprint(PORT_LISTEN_ERR, strerror(errno)); return -1; } clientlen = sizeof(clientaddr); while (1) { connsd=accept(sd,(struct sockaddr *)&clientaddr, &clientlen); if (connsd<0) { warnprint(CONN_ACCEPT_WARN, strerror(errno)); continue; } tid=0; tdata[tid].isserver=1; tdata[tid].thread_id=tid; tdata[tid].realthread=0; tdata[tid].conn=connsd; tdata[tid].ip=NULL; tdata[tid].port=0; tdata[tid].method=0; tdata[tid].res=0; tdata[tid].global=globalptr; #if _MSC_VER tdata[tid].db=db; #endif handle_http((void *) &tdata[tid]); } return 0; // never come to this #endif } // handle one http request #if _MSC_VER DWORD WINAPI handle_http(LPVOID targ) { #else void *handle_http(void *targ) { #endif int connsd,i,tid,itmp,len=0; char *method=NULL, *uri=NULL, *version=NULL, *query=NULL; char *bp=NULL, *res=NULL; char buf[MAXLINE]; char header[HTTP_HEADER_SIZE]; thread_data_p tdata; struct common_data *common; socklen_t alen; struct sockaddr_storage addr; struct sockaddr_in *s4; struct sockaddr_in6 *s6; char ipstr[INET6_ADDRSTRLEN]; int port; #ifdef USE_OPENSSL SSL* ssl=NULL; #else void* ssl=NULL; #endif #if _MSC_VER #else struct timespec tim, tim2; tim.tv_sec = 0; tim.tv_nsec = 1000; #endif tdata=(thread_data_p) targ; tid=tdata->thread_id; common=tdata->common; while(1) { // infinite loop for threadpool, just once for non-threadpool if ((tdata->realthread)!=2) { // once-run thread connsd=tdata->conn; } else { // threadpool thread #ifdef 
MULTI_THREAD #if THREADPOOL pthread_mutex_lock(&(common->mutex)); while ((common->count==0) && (common->shutdown==0)) { itmp=pthread_cond_wait(&(common->cond),&(common->mutex)); // wait if (itmp) { errprint(COND_WAIT_FAIL_ERR,NULL); exit(ERR_EX_UNAVAILABLE); } } if (common->shutdown) { pthread_mutex_unlock(&(common->mutex)); // ? warnprint(SHUTDOWN_THREAD_WARN,NULL); tdata->inuse=0; pthread_exit((void*) tid); return NULL; } #endif #endif connsd=common->queue[common->head].conn; #ifdef USE_OPENSSL ssl=common->queue[common->head].ssl; if (SSL_accept(ssl)==-1) { SSL_free(ssl); ssl=NULL; //fprintf(stderr,"ssl accept error\n"); //ERR_print_errors_fp(stderr); } //ShowCerts(ssl); #endif common->head+=1; common->head=(common->head == common->queue_size) ? 0 : common->head; common->count-=1; #ifdef MULTI_THREAD #if THREADPOOL pthread_mutex_unlock(&(common->mutex)); #endif #endif } // who is calling? alen = sizeof addr; getpeername(connsd, (struct sockaddr*)&addr, &alen); if (addr.ss_family == AF_INET) { s4 = (struct sockaddr_in *)&addr; port = ntohs(s4->sin_port); #if _MSC_VER InetNtop(AF_INET, &s4->sin_addr, ipstr, sizeof ipstr); } else { // AF_INET6 s6 = (struct sockaddr_in6 *)&addr; port = ntohs(s6->sin6_port); InetNtop(AF_INET6, &s6->sin6_addr, ipstr, sizeof ipstr); } #else inet_ntop(AF_INET, &s4->sin_addr, ipstr, sizeof ipstr); } else { // AF_INET6 s6 = (struct sockaddr_in6 *)&addr; port = ntohs(s6->sin6_port); inet_ntop(AF_INET6, &s6->sin6_addr, ipstr, sizeof ipstr); } #endif tdata->ip=ipstr; tdata->port=port; //printf("Peer IP address: %s\n", ipstr); //printf("Peer port : %d\n", port); #ifdef USE_OPENSSL if (ssl!=NULL) { #else if (1) { #endif // accepted connection: read and process // read and parse request line readlineb(connsd,buf,MAXLINE,ssl); method=buf; for(i=0,bp=buf; *bp!='\0' && i<2; bp++) { if (*bp==' ') { *bp='\0'; if (!i) uri=bp+1; else version=bp+1; ++i; } } if (strcmp(method, "GET") && strcmp(method, "POST")) { //return; res=make_http_errstr(HTTP_METHOD_ERR,NULL); } else if (uri==NULL || version==NULL) { //return; res=make_http_errstr(HTTP_REQUEST_ERR,NULL); } else { if (!strcmp(method, "GET")) { // query follows GET tdata->method=GET_METHOD_CODE; for(bp=uri; *bp!='\0'; bp++) { if (*bp=='?') { *bp='\0'; query=bp+1; break; } } } else { // POST: query after empty line tdata->method=POST_METHOD_CODE; query=get_post_data(connsd,buf,ssl,tdata); } // now we have query for both methods if (query==NULL || *query=='\0') { res=make_http_errstr(HTTP_NOQUERY_ERR,NULL); } else { // compute result #ifdef MULTI_THREAD if (!(common->shutdown)) { res=process_query(query,tdata); //printf("res: %s\n",res); } else { tdata->inuse=0; #if _MSC_VER ExitThread(1); return 0; #else pthread_exit((void*) tid); return NULL; #endif } #else // no multithreading only res=process_query(query,tdata); #endif } } //printf("res: %s\n",res); // make header if (res==NULL) len=0; else len=strlen(res); write_header(header); write_header_clen(header,len); // send result i=writen(connsd,header,strlen(header),ssl); if (res!=NULL) { i=writen(connsd,res,len,ssl); free(res); } #ifdef USE_OPENSSL if (ssl!=NULL) SSL_free(ssl); #endif } // next part is run also for non-accepted connections #if _MSC_VER if (shutdown(connsd,SD_SEND)<0) { // SHUT_BOTH #else if (shutdown(connsd,SHUT_WR)<0) { // SHUT_RDWR #endif // shutdown fails //fprintf(stderr, "Cannot shutdown connection: %s\n", strerror(errno)); #if _MSC_VER if (closesocket(connsd) < 0) { #else if (close(connsd) < 0) { #endif warnprint("Cannot close connection after failed 
shutdown: %s\n", strerror(errno)); } } else { // normal shutdown if (len>=CLOSE_CHECK_THRESHOLD) { for(;;) { itmp=recv(connsd, buf, MAXLINE, 0); if(itmp<0) { fprintf(stderr,"error %d reading after shutdown in thread %d\n",itmp,tid); break; } if(!itmp) break; #if _MSC_VER usleep(1); #else nanosleep(&tim , &tim2); #endif } /* // alternative check i=-1; for(;;) { ioctl(connsd, SIOCOUTQ, &itmp); //if(itmp != i) printf("Outstanding: %d\n", itmp); i = itmp; if(!itmp) break; //usleep(1); nanosleep(&tim , &tim2); } */ } #if _MSC_VER if (closesocket(connsd) < 0) { #else if (close(connsd) < 0) { #endif warnprint("Cannot close connection: %s\n", strerror(errno)); } } // connection is now down tdata->inuse=0; if ((tdata->realthread)==1) { // non-threadpool thread //fprintf(stderr,"exiting thread %d\n",tid); #ifdef MULTI_THREAD #if _MSC_VER ExitThread(1); return 0; #else pthread_exit((void*) tid); return NULL; #endif #endif } else if ((tdata->realthread)==2) { // threadpool thread //fprintf(stderr,"thread %d loop ended\n",tid); } else { // not a thread at all #if _MSC_VER return 0; #else return NULL; #endif } } } // testing with curl: // curl 'http://127.0.0.1:8080' -d 'op=query' // curl -F password=@/etc/passwd www.mypasswords.com // -d --data-urlencode /* Reads posted data and returns the malloced query or NULL in case of error. Sets tdata->inbuf to malloced buffer and tdata->intype to numeric value corresponding to the content-type found. */ static char* get_post_data(int connsd,char* buf,void* ssl,thread_data_p tdata) { char *buffer, *tp; int j,k,chk,nread; int len=0,type=CONTENT_TYPE_UNKNOWN; int ctypelen,clenlen; char bufp[MAXLINE]; // for reading one line at a time //printf("get_post_data called\n"); clenlen=strlen("Content-Length:"); ctypelen=strlen("Content-Type:"); // start reading line by line until empty line is hit for(j=0;j=MAXLINES) return NULL; // empty line never found //printf("len %d type %d\n",len,type); if (len<=0) { warnprint(CONTENT_LENGTH_MISSING_WARN,NULL); return NULL; } if (len>=MAX_MALLOC) { warnprint(CONTENT_LENGTH_BIG_WARN,NULL); return NULL; } buffer=malloc(len+10); // add some just in case if (!buffer) { warnprint(CONTENT_LENGTH_BIG_WARN,NULL); return NULL; } // start reading the whole data nread=readn(connsd, buffer, len, ssl); if (nreadinbuf=buffer; tdata->intype=type; //printf("query:\n%s\n",buffer); return buffer; } void write_header(char* buf) { char *h1; h1=HEADER_TEMPLATE; strcpy(buf,h1); } void write_header_clen(char* buf, int clen) { char* p; int n; p=strchr(buf,'X'); n=sprintf(p,"%d",clen); *(p+n)=' '; for(p=p+n+1;*p=='X';p++) *p=' '; } int open_listener(int port) { int sd, opt=1; struct sockaddr_in saddr; // create socket descriptor sd if ((sd=socket(AF_INET, SOCK_STREAM, 0)) < 0) return -1; // eliminate addr in use error if (setsockopt(sd,SOL_SOCKET,SO_REUSEADDR,(const void *)&opt,sizeof(int))<0) return -1; // all requests to port for this host will be given to sd memset((char *) &saddr,0,sizeof(saddr)); saddr.sin_family=AF_INET; saddr.sin_addr.s_addr=htonl(INADDR_ANY); saddr.sin_port=htons((unsigned short)port); if (bind(sd,(struct sockaddr *)&saddr, sizeof(saddr)) < 0) return -1; // make a listening socket to accept requests if (listen(sd,HTTP_LISTENQ)<0) return -1; return sd; } ssize_t readlineb(int fd, void *usrbuf, size_t maxlen, void* sslp) { int n, rc; char c, *bufp = usrbuf; for (n = 1; n < maxlen; n++) { #if _MSC_VER if ((rc = recv(fd, &c, 1, 0)) == 1) { #else if ((rc = readn(fd, &c, 1, sslp)) == 1) { #endif *bufp++ = c; if (c == '\n') break; } else 
if (rc == 0) { if (n == 1) return 0; // EOF, no data read else break; // EOF, some data was read } else { return -1; // error } } *bufp = 0; return n; } ssize_t readn(int fd, void *usrbuf, size_t n, void* ssl) { size_t nleft = n; ssize_t nread; char *bufp = usrbuf; while (nleft>0) { #ifdef USE_OPENSSL if ((nread=SSL_read((SSL*)ssl, bufp, nleft)) < 0) { #else if ((nread=recv(fd, bufp, nleft, 0)) < 0) { #endif if (errno==EINTR) nread=0;/* interrupted by sig handler return */ /* and call read() again */ else return -1; /* errno set by read() */ } else if (nread==0) break; /* EOF */ nleft -= nread; bufp += nread; } return (n-nleft); /* return >= 0 */ } ssize_t writen(int fd, void *usrbuf, size_t n, void* ssl) { size_t nleft = n; ssize_t nwritten; char *bufp = usrbuf; while (nleft > 0) { #ifdef USE_OPENSSL if ((nwritten = SSL_write((SSL*)ssl, bufp, nleft)) < 0) { #else if ((nwritten = send(fd, bufp, nleft, 0)) <= 0) { #endif if (errno == EINTR) { nwritten = 0; /* interrupted by sig handler return */ /* and call write() again */ } else { warnprint(WRITEN_ERROR,NULL); // EPIPE 32 is likely here return -1; /* errorno set by write() */ } } nleft -= nwritten; bufp += nwritten; } return n; } #ifdef USE_OPENSSL // openssl utilities below inspired by // http://simplestcodings.blogspot.com/2010/08/secure-server-client-using-openssl-in-c.html SSL_CTX *init_openssl(dserve_conf_p conf) { const SSL_METHOD *method; SSL_CTX *ctx; char* KeyFile; char* CertFile; #ifdef KEY_FILE KeyFile=KEY_FILE; #else if ((conf->key_file).used<=0) { errprint(NO_KEY_FILE_ERR,NULL); exit(ERR_EX_NOINPUT); } KeyFile=conf->key_file.vals[0]; #endif #ifdef CERT_FILE CertFile=CERT_FILE; #else if ((conf->cert_file).used<=0) { errprint(NO_CERT_FILE_ERR,NULL); exit(ERR_EX_NOINPUT); } CertFile=conf->cert_file.vals[0]; #endif SSL_load_error_strings(); SSL_library_init(); //OpenSSL_add_all_algorithms(); method = SSLv3_server_method(); ctx = SSL_CTX_new(method); if (ctx==NULL) { fprintf(stderr,"ssl initialization error:\n"); ERR_print_errors_fp(stderr); exit(ERR_EX_UNAVAILABLE); } SSL_CTX_use_certificate_file(ctx, CertFile, SSL_FILETYPE_PEM); if (SSL_CTX_use_certificate_file(ctx, CertFile, SSL_FILETYPE_PEM) <= 0){ fprintf(stderr,"ssl certificate file error:\n"); ERR_print_errors_fp(stderr); exit(ERR_EX_CONFIG); } if (SSL_CTX_use_PrivateKey_file(ctx, KeyFile, SSL_FILETYPE_PEM)<=0) { fprintf(stderr,"ssl private key file error:\n"); ERR_print_errors_fp(stderr); exit(ERR_EX_CONFIG); } if ( !SSL_CTX_check_private_key(ctx) ) { fprintf(stderr,"ssl error checking key\n"); ERR_print_errors_fp(stderr); exit(ERR_EX_CONFIG); } return ctx; } void ShowCerts(SSL* ssl){ // this function is not really needed X509 *cert; char *line; cert = SSL_get_peer_certificate(ssl); /* Get certificates (if available) */ if ( cert != NULL ) { printf("Server certificates:\n"); line = X509_NAME_oneline(X509_get_subject_name(cert), 0, 0); printf("Subject: %s\n", line); free(line); line = X509_NAME_oneline(X509_get_issuer_name(cert), 0, 0); printf("Issuer: %s\n", line); free(line); X509_free(cert); } else { printf("No certificates.\n"); } } #endif /* // setting socket timeout struct timeval timeout; timeout.tv_sec = 1; timeout.tv_usec = 0; if (setsockopt (connsd, SOL_SOCKET, SO_RCVTIMEO, (char *)&timeout, sizeof(timeout)) < 0) error("setsockopt failed\n"); if (setsockopt (connsd, SOL_SOCKET, SO_SNDTIMEO, (char *)&timeout, sizeof(timeout)) < 0) error("setsockopt failed\n"); */ 
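/* Illustrative sketch (added for clarity, not part of the original dserve_net.c):
   how the helpers above combine to send one complete response on a plain
   (non-SSL) connection, mirroring what handle_http does. Here connsd is assumed
   to be an accepted socket and res a NUL-terminated result string; the last
   argument of writen is the ssl pointer, NULL when OpenSSL is not used.

   char header[HTTP_HEADER_SIZE];
   write_header(header);                          // copy HEADER_TEMPLATE into header
   write_header_clen(header, strlen(res));        // patch the XXXXXXXXXX length field
   writen(connsd, header, strlen(header), NULL);  // loops until all header bytes are sent
   writen(connsd, res, strlen(res), NULL);        // same partial-write handling for the body
*/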
whitedb-0.7.2/Server/dserve_util.c000066400000000000000000001045031226454622500171200ustar00rootroot00000000000000/* dserve_util.c contains string and error handling utilities for dserve dserve is a tool for performing REST queries from WhiteDB using a cgi protocol over http. Results are given in the json format. See http://whitedb.org/tools.html for a detailed manual. Copyright (c) 2013, Tanel Tammet This software is under MIT licence unless linked with WhiteDB: see dserve.c for details. */ #include "dserve.h" #include #include #include #include // is(x)digit, isint, isspace #if _MSC_VER #define STDERR_FILENO 2 #else #include // for STDERR_FILENO #endif /* =============== local protos =================== */ static int authorize_aux(char* str, char** lst, int n, int eqflag); static int empty_str(char *s); /* =============== globals =================== */ // used in termination signal handlers extern dserve_global_p globalptr; /* =============== functions =================== */ /* ***** encode cgi params as query vals ****** */ wg_int encode_incomp(void* db, char* incomp) { if (incomp==NULL || incomp=='\0') return WG_COND_EQUAL; else if (!strcmp(incomp,"equal")) return WG_COND_EQUAL; else if (!strcmp(incomp,"not_equal")) return WG_COND_NOT_EQUAL; else if (!strcmp(incomp,"lessthan")) return WG_COND_LESSTHAN; else if (!strcmp(incomp,"greater")) return WG_COND_GREATER; else if (!strcmp(incomp,"ltequal")) return WG_COND_LTEQUAL; else if (!strcmp(incomp,"gtequal")) return WG_COND_GTEQUAL; else return BAD_WG_VALUE; //err_clear_detach_halt(COND_ERR); } wg_int encode_intype(void* db, char* intype) { if (intype==NULL || intype=='\0') return 0; else if (!strcmp(intype,"null")) return WG_NULLTYPE; else if (!strcmp(intype,"int")) return WG_INTTYPE; else if (!strcmp(intype,"record")) return WG_RECORDTYPE; else if (!strcmp(intype,"double")) return WG_DOUBLETYPE; else if (!strcmp(intype,"str")) return WG_STRTYPE; else if (!strcmp(intype,"char")) return WG_CHARTYPE; else return BAD_WG_VALUE; //err_clear_detach_halt(INTYPE_ERR); } wg_int encode_invalue(void* db, char* invalue, wg_int type) { if (invalue==NULL) { return WG_ILLEGAL; } if (type==WG_NULLTYPE) return wg_encode_query_param_null(db,NULL); else if (type==WG_INTTYPE) { if (!isint(invalue)) return WG_ILLEGAL; return wg_encode_query_param_int(db,atoi(invalue)); } else if (type==WG_RECORDTYPE) { if (!isint(invalue)) return WG_ILLEGAL; return (wg_int)atoi(invalue); } else if (type==WG_DOUBLETYPE) { if (!isdbl(invalue)) return WG_ILLEGAL; return wg_encode_query_param_double(db,strtod(invalue,NULL)); } else if (type==WG_STRTYPE) { return wg_encode_query_param_str(db,invalue,NULL); } else if (type==WG_CHARTYPE) { return wg_encode_query_param_char(db,invalue[0]); } else if (type==0 && isint(invalue)) { return wg_encode_query_param_int(db,atoi(invalue)); } else if (type==0 && isdbl(invalue)) { return wg_encode_query_param_double(db,strtod(invalue,NULL)); } else if (type==0) { return wg_encode_query_param_str(db,invalue,NULL); } else { return WG_ILLEGAL; //err_clear_detach_halt(INTYPE_ERR); } } /* ******** cgi query parsing ****** */ /* query parser: split by & and =, urldecode param and value return param count or -1 for error */ int parse_query(char* query, int ql, char* params[], char* values[]) { int count=0; int i,pi,vi; for(i=0;i=ql) return -1; vi=i; for(;i=MAXPARAMS) return -1; params[count]=urldecode(query+pi,query+pi); values[count]=urldecode(query+vi,query+vi); count++; } return count; } /* urldecode used by query parser */ char* urldecode(char 
*indst, char *src) { char a, b; char* endptr; char* dst; dst=indst; if (src==NULL || src[0]=='\0') return indst; endptr=src+strlen(src); while (*src) { if ((*src == '%') && (src+2= 'a') a -= 'A'-'a'; if (a >= 'A') a -= ('A' - 10); else a -= '0'; if (b >= 'a') b -= 'A'-'a'; if (b >= 'A') b -= ('A' - 10); else b -= '0'; *dst++ = 16*a+b; src+=3; } else { *dst++ = *src++; } } *dst++ = '\0'; return indst; } /* ****** guess string datatype ****** */ /* return 1 iff s contains numerals only */ int isint(char* s) { if (s==NULL) return 0; while(*s!='\0') { if (!isdigit(*s)) return 0; s++; } return 1; } /* return 1 iff s contains numerals plus single optional period only */ int isdbl(char* s) { int c=0; if (s==NULL) return 0; while(*s!='\0') { if (!isdigit(*s)) { if (*s=='.') c++; else return 0; if (c>1) return 0; } s++; } return 1; } /* ********* json printing ********* */ /** Print a record, handling records recursively The value is written into a character buffer. db: database pointer rec: rec to be printed buf: address of the whole string buffer start (not the start itself) bufsize: address of the actual pointer to start printing at in buffer bptr: address of the whole string buffer format: 0 csv, 1 json showid: print record id for record: 0 no show, 1 first (extra) elem of record depth: current depth in a nested record tree (increases from initial 0) maxdepth: limit on printing records nested via record pointers (0: no nesting) strenc==0: nothing is escaped at all strenc==1: non-ascii chars and % and " urlencoded strenc==2: json utf-8 encoding, not ascii-safe returns 1 if successful, 0 if failure */ int sprint_record(void *db, wg_int *rec, thread_data_p tdata) { int i,limit; wg_int enc, len; char* tmp; char **bptr=&(tdata->bufptr); #ifdef USE_CHILD_DB void *parent; #endif limit=MIN_STRLEN; if (!str_guarantee_space(tdata, MIN_STRLEN)) return 0; if (rec==NULL) { snprintf(*bptr, limit, JS_NULL); (*bptr)+=strlen(JS_NULL); return 1; } if (tdata->format!=0) { // json **bptr= '['; (*bptr)++; } #ifdef USE_CHILD_DB parent = wg_get_rec_owner(db, rec); #endif if (1) { len = wg_get_record_len(db, rec); if (len<0) return 0; if (tdata->showid) { // add record id (offset) as the first extra elem of record snprintf(*bptr, limit-1, "%d",wg_encode_record(db,rec)); *bptr=*bptr+strlen(*bptr); } for(i=0; ishowid) { if (tdata->format!=0) **bptr = ','; else **bptr = CSV_SEPARATOR; (*bptr)++; } tmp=sprint_value(db, enc, tdata); if (tmp==NULL) return 0; else *bptr=tmp; } } if (tdata->format!=0) { // json if (!str_guarantee_space(tdata, MIN_STRLEN)) return 0; **bptr = ']'; (*bptr)++; } return 1; } /** Print a single encoded value (may recursively contain record(s)). The value is written into a character buffer. 
db: database pointer enc: encoded value to be printed relevant tdata fields: buf: address of the whole string buffer start (not the start itself) bufsize: address of the actual pointer to start printing at in buffer bufptr: address of the whole string buffer format: 0 csv, 1 json showid: print record id for record: 0 no show, 1 first (extra) elem of record depth: limit on records nested via record pointers (0: no nesting) maxdepth: limit on printing records nested via record pointers (0: no nesting) strenc==0: nothing is escaped at all strenc==1: non-ascii chars and % and " urlencoded strenc==2: json utf-8 encoding, not ascii-safe strenc==3: csv encoding, only " replaced for "" if successful, returns pointer to the next byte after printed string else returns NULL */ char* sprint_value(void *db, wg_int enc, thread_data_p tdata) { wg_int *ptrdata; int intdata,strl,strl1,strl2; char *strdata, *exdata; double doubledata; char strbuf[80]; // tmp area for dates int limit=MIN_STRLEN; char **bptr=&(tdata->bufptr); switch(wg_get_encoded_type(db, enc)) { case WG_NULLTYPE: if (!str_guarantee_space(tdata, MIN_STRLEN)) return NULL; if (tdata->format!=0) { // json snprintf(*bptr, limit, JS_NULL); return *bptr+strlen(*bptr); } return *bptr; case WG_RECORDTYPE: if (!str_guarantee_space(tdata, MIN_STRLEN)) return NULL; if (!tdata->format || tdata->depth>=tdata->maxdepth) { snprintf(*bptr, limit,"%d", (int)enc); // record offset (i.e. id) return *bptr+strlen(*bptr); } else { // recursive print ptrdata = wg_decode_record(db, enc); if(ptrdata==0) return NULL; sprint_record(db,ptrdata,tdata); **bptr='\0'; return *bptr; } break; case WG_INTTYPE: intdata = wg_decode_int(db, enc); if (!str_guarantee_space(tdata, MIN_STRLEN)) return NULL; snprintf(*bptr, limit, "%d", intdata); return *bptr+strlen(*bptr); case WG_DOUBLETYPE: doubledata = wg_decode_double(db, enc); if (!str_guarantee_space(tdata, MIN_STRLEN)) return NULL; snprintf(*bptr, limit, DOUBLE_FORMAT, doubledata); return *bptr+strlen(*bptr); case WG_FIXPOINTTYPE: doubledata = wg_decode_fixpoint(db, enc); if (!str_guarantee_space(tdata, MIN_STRLEN)) return NULL; snprintf(*bptr, limit, DOUBLE_FORMAT, doubledata); return *bptr+strlen(*bptr); case WG_STRTYPE: strdata = wg_decode_str(db, enc); exdata = wg_decode_str_lang(db,enc); if (strdata!=NULL) strl1=strlen(strdata); else strl1=0; if (exdata!=NULL) strl2=strlen(exdata); else strl2=0; if (!str_guarantee_space(tdata, MIN_STRLEN+STRLEN_FACTOR*(strl1+strl2))) return NULL; sprint_string(*bptr,(strl1+strl2),strdata,tdata->strenc); if (exdata!=NULL) { snprintf(*bptr+strl1+1,limit,"@%s\"", exdata); } return *bptr+strlen(*bptr); case WG_URITYPE: strdata = wg_decode_uri(db, enc); exdata = wg_decode_uri_prefix(db, enc); if (strdata!=NULL) strl1=strlen(strdata); else strl1=0; if (exdata!=NULL) strl2=strlen(exdata); else strl2=0; limit=MIN_STRLEN+STRLEN_FACTOR*(strl1+strl2); if(!str_guarantee_space(tdata, limit)) return NULL; if (exdata==NULL) snprintf(*bptr, limit, "\"%s\"", strdata); else snprintf(*bptr, limit, "\"%s:%s\"", exdata, strdata); return *bptr+strlen(*bptr); case WG_XMLLITERALTYPE: strdata = wg_decode_xmlliteral(db, enc); exdata = wg_decode_xmlliteral_xsdtype(db, enc); if (strdata!=NULL) strl1=strlen(strdata); else strl1=0; if (exdata!=NULL) strl2=strlen(exdata); else strl2=0; limit=MIN_STRLEN+STRLEN_FACTOR*(strl1+strl2); if(!str_guarantee_space(tdata, limit)) return NULL; snprintf(*bptr, limit, "\"%s:%s\"", exdata, strdata); return *bptr+strlen(*bptr); case WG_CHARTYPE: intdata = wg_decode_char(db, enc); 
if(!str_guarantee_space(tdata, MIN_STRLEN)) return NULL; snprintf(*bptr, limit, "\"%c\"", (char) intdata); return *bptr+strlen(*bptr); case WG_DATETYPE: intdata = wg_decode_date(db, enc); wg_strf_iso_datetime(db,intdata,0,strbuf); strbuf[10]=0; if(!str_guarantee_space(tdata, MIN_STRLEN)) return NULL; snprintf(*bptr, limit, "\"%s\"",strbuf); return *bptr+strlen(*bptr); case WG_TIMETYPE: intdata = wg_decode_time(db, enc); wg_strf_iso_datetime(db,1,intdata,strbuf); if(!str_guarantee_space(tdata, MIN_STRLEN)) return NULL; snprintf(*bptr, limit, "\"%s\"",strbuf+11); return *bptr+strlen(*bptr); case WG_VARTYPE: intdata = wg_decode_var(db, enc); if(!str_guarantee_space(tdata, MIN_STRLEN)) return NULL; snprintf(*bptr, limit, "\"?%d\"", intdata); return *bptr+strlen(*bptr); case WG_BLOBTYPE: strdata = wg_decode_blob(db, enc); strl=wg_decode_blob_len(db, enc); limit=MIN_STRLEN+STRLEN_FACTOR*strlen(strdata); if(!str_guarantee_space(tdata, limit)) return NULL; sprint_blob(*bptr,strl,strdata,tdata->strenc); return *bptr+strlen(*bptr); default: if(!str_guarantee_space(tdata, MIN_STRLEN)) return NULL; snprintf(*bptr, limit, JS_TYPE_ERR); return *bptr+strlen(*bptr); } } /* Print string with several encoding/escaping options. It must be guaranteed beforehand that there is enough room in the buffer. bptr: direct pointer to location in buffer where to start writing limit: max nr of chars traversed (NOT limiting output len) strdata: pointer to printed string strenc==0: nothing is escaped at all strenc==1: non-ascii chars and % and " urlencoded strenc==2: json utf-8 encoding, not ascii-safe strenc==3: csv encoding, only " replaced for "" For proper json tools see: json rfc http://www.ietf.org/rfc/rfc4627.txt ccan json tool http://git.ozlabs.org/?p=ccan;a=tree;f=ccan/json Jansson json tool https://jansson.readthedocs.org/en/latest/ Parser http://linuxprograms.wordpress.com/category/json-c/ */ int sprint_string(char* bptr, int limit, char* strdata, int strenc) { unsigned char c; char *sptr; char *hex_chars="0123456789abcdef"; int i; sptr=strdata; *bptr++='"'; if (sptr==NULL) { *bptr++='"'; *bptr='\0'; return 1; } if (!strenc) { // nothing is escaped at all for(i=0;i126) { *bptr++='%'; *bptr++=hex_chars[c >> 4]; *bptr++=hex_chars[c & 0xf]; } else { *bptr++=c; } } } else { // json encoding; chars before ' ' are are escaped with \u00 sptr=strdata; for(i=0;i> 4]; *bptr++=hex_chars[c & 0xf]; } else { *bptr++=c; } } } } *bptr++='"'; *bptr='\0'; return 1; } int sprint_blob(char* bptr, int limit, char* strdata, int strenc) { unsigned char c; char *sptr; char *hex_chars="0123456789abcdef"; int i; sptr=strdata; *bptr++='"'; if (sptr==NULL) { *bptr++='"'; *bptr='\0'; return 1; } // non-ascii chars and % and " urlencoded for(i=0;i126) { *bptr++='%'; *bptr++=hex_chars[c >> 4]; *bptr++=hex_chars[c & 0xf]; } else { *bptr++=c; } } *bptr++='"'; *bptr='\0'; return 1; } int sprint_append(char** bptr, char* str, int l) { int i; for(i=0;ibuf for output. 
*/ char* str_new(int len) { char* res; res = (char*) malloc(len*sizeof(char)); if (res==NULL) return NULL; else { res[len-1]='\0'; return res; } } /** Guarantee string space: realloc if necessary, change ptrs, set last byte to 0 * relevant tdata fields: buf: address of the whole string buffer start (not the start itself) bufsize: address of the actual pointer to start printing at in buffer bufptr: address of the whole string buffer return 1 if successful, 0 if failure */ int str_guarantee_space(thread_data_p tdata, int needed) { char* tmp; int newlen,used; char** stradr=&(tdata->buf); int* strlenadr=&(tdata->bufsize); char** ptr=&(tdata->bufptr); if (needed>(*strlenadr-(int)((*ptr)-(*stradr)))) { used=(int)((*ptr)-(*stradr)); newlen=(*strlenadr)*2; if (newlenMAX_MALLOC) { //if (*stradr!=NULL) free(*stradr); //err_clear_detach_halt(MALLOC_ERR); return 0; } //printf("needed %d oldlen %d used %d newlen %d \n",needed,*strlenadr,used,newlen); tmp=realloc(*stradr,newlen); if (tmp==NULL) { if (*stradr!=NULL) free(*stradr); //err_clear_detach_halt(MALLOC_ERR); return 0; } tmp[newlen-1]=0; // set last byte to 0 //printf("oldstradr %d newstradr %d oldptr %d newptr %d \n",(int)*stradr,(int)tmp,(int)*ptr,(int)tmp+used); *stradr=tmp; *strlenadr=newlen; *ptr=tmp+used; return 1; } return 1; } /* ************ loading configuration ************* */ int load_configuration(char* path, dserve_conf_p conf) { FILE *fp; char *buf, *bp, *bp2, *bp3, *bend, *key; int bufsize=CONF_BUF_SIZE; int i,n,row; if (path==NULL) { #ifdef CONF_FILE path=CONF_FILE ; #endif } if (path==NULL) return 0; // read configuration file fp=fopen(path,READ); if (fp==NULL) {errprint(CONF_OPEN_ERR,path); exit(ERR_EX_NOINPUT);} for(i=0;i<10;i++) { //printf("i %d bufsize %d\n",i,bufsize); buf=malloc(bufsize); if (buf==NULL) {errprint(CONF_MALLOC_ERR,NULL); return 3;} n=fread(buf,1,bufsize,fp); if (n<=0) {errprint(CONF_READ_ERR,path); free(buf); return 3;} if (n>=bufsize) { // did not manage to read all free(buf); rewind(fp); bufsize=bufsize*2; if (bufsize>MAX_CONF_BUF_SIZE) {errprint(CONF_SIZE_ERR,path); free(buf); return 4;} } else { // reading successful break; } } // parse configuration file bp=buf; bend=buf+n; key=NULL; for(row=0;;row++) { // parse row by row if(bp>=bend) break; if (*bp==' ' || *bp=='\t') { // row starts with whitespace // skip whitespace for(bp2=bp;bp2bp && (*bp3==' ' || *bp3=='\t' || *bp3=='\r'); bp3--) *bp3='\0'; key=bp; //printf("def line |%s|\n",key); // skip whitespace for(bp2++;bp2default_dbase),val); else if (!strcmp(key,CONF_DEFAULT_DBASE_SIZE)) return add_slval(&(conf->default_dbase_size),val); else if (!strcmp(key,CONF_MAX_DBASE_SIZE)) return add_slval(&(conf->max_dbase_size),val); else if (!strcmp(key,CONF_DBASES)) return add_slval(&(conf->dbases),val); else if (!strcmp(key,CONF_ADMIN_IPS)) return add_slval(&(conf->admin_ips),val); else if (!strcmp(key,CONF_WRITE_IPS)) return add_slval(&(conf->write_ips),val); else if (!strcmp(key,CONF_READ_IPS)) return add_slval(&(conf->read_ips),val); else if (!strcmp(key,CONF_ADMIN_TOKENS)) return add_slval(&(conf->admin_tokens),val); else if (!strcmp(key,CONF_WRITE_TOKENS)) return add_slval(&(conf->write_tokens),val); else if (!strcmp(key,CONF_READ_TOKENS)) return add_slval(&(conf->read_tokens),val); else if (!strcmp(key,CONF_KEY_FILE)) return add_slval(&(conf->key_file),val); else if (!strcmp(key,CONF_CERT_FILE)) return add_slval(&(conf->cert_file),val); else {errprint(CONF_VAL_ERR,key); return -1;} } int add_slval(struct sized_strlst *lst, char* val) { char **buf; int 
size; if (lst->used < lst->size) { lst->vals[lst->used]=val; lst->used++; } else if (lst->size==0) { buf=(char**)malloc(CONF_VALS_SIZE*sizeof(char*)); if (buf==NULL) {errprint(CONF_MALLOC_ERR,NULL); return -1; } buf[0]=val; lst->vals=buf; lst->size=CONF_VALS_SIZE; lst->used=1; } else { size=lst->size*2; if (size>MAX_CONF_VALS_SIZE) {errprint(CONF_VALNR_ERR,NULL); return -1; } buf=realloc(lst->vals,size*sizeof(char*)); if (buf==NULL) {errprint(CONF_MALLOC_ERR,NULL); return -1; } lst->vals=buf; lst->size=size; lst->vals[lst->used]=val; lst->used++; } return 0; } void print_conf(dserve_conf_p conf) { print_conf_slval(&(conf->default_dbase),CONF_DEFAULT_DBASE); print_conf_slval(&(conf->default_dbase_size),CONF_DEFAULT_DBASE_SIZE); print_conf_slval(&(conf->max_dbase_size),CONF_MAX_DBASE_SIZE); print_conf_slval(&(conf->dbases),CONF_DBASES); print_conf_slval(&(conf->admin_ips),CONF_ADMIN_IPS); print_conf_slval(&(conf->write_ips),CONF_WRITE_IPS); print_conf_slval(&(conf->read_ips),CONF_READ_IPS); print_conf_slval(&(conf->admin_tokens),CONF_ADMIN_TOKENS); print_conf_slval(&(conf->write_tokens),CONF_WRITE_TOKENS); print_conf_slval(&(conf->read_tokens),CONF_READ_TOKENS); print_conf_slval(&(conf->key_file),CONF_KEY_FILE); print_conf_slval(&(conf->cert_file),CONF_CERT_FILE); } void print_conf_slval(struct sized_strlst *lst, char* key) { int i; printf("%s = # %d %d\n",key,lst->size,lst->used); for(i=0;iused;i++) { printf(" %s\n",lst->vals[i]); } } /* *********** authorization ******** */ // returns 1 if authorized, 0 if not int authorize(int level,thread_data_p tdata, char* database, char* token) { int ok=0; dserve_conf_p conf=(tdata->global)->conf; if (!(tdata->isserver) && !(tdata->iscgi)) return 1; // command line always ok if (database!=NULL && !authorize_aux(database,conf->dbases.vals,conf->dbases.used,1)) return 0; if (level==READ_LEVEL) { if (authorize_aux(tdata->ip,conf->admin_ips.vals,conf->admin_ips.used,0)) ok=1; else if (authorize_aux(tdata->ip,conf->write_ips.vals,conf->write_ips.used,0)) ok=1; else if (authorize_aux(tdata->ip,conf->read_ips.vals,conf->read_ips.used,0)) ok=1; else ok=0; if (!ok) return 0; if (authorize_aux(token,conf->admin_tokens.vals,conf->admin_tokens.used,1)) return 1; if (authorize_aux(token,conf->write_tokens.vals,conf->write_tokens.used,1)) return 1; if (authorize_aux(token,conf->read_tokens.vals,conf->read_tokens.used,1)) return 1; return 0; } else if (level==WRITE_LEVEL) { if (authorize_aux(tdata->ip,conf->admin_ips.vals,conf->admin_ips.used,0)) ok=1; else if (authorize_aux(tdata->ip,conf->write_ips.vals,conf->write_ips.used,0)) ok=1; else ok=0; if (!ok) return 0; if (authorize_aux(token,conf->admin_tokens.vals,conf->admin_tokens.used,1)) return 1; if (authorize_aux(token,conf->write_tokens.vals,conf->write_tokens.used,1)) return 1; return 0; } else if (level==ADMIN_LEVEL) { if (authorize_aux(tdata->ip,conf->admin_ips.vals,conf->admin_ips.used,0)) ok=1; else ok=0; if (!ok) return 0; if (authorize_aux(token,conf->admin_tokens.vals,conf->admin_tokens.used,1)) return 1; return 0; } return 0; } static int authorize_aux(char* str, char** lst, int n, int eqflag) { int i,sl,ll; if(!n) return 1; if (str==NULL) return 0; sl=strlen(str); if (sl<1) return 0; for(i=0;i=ll && !strncmp(lst[i],str,ll)) return 1; } } return 0; } /* *********** windows specific ******** */ #if _MSC_VER void usleep(__int64 usec) { HANDLE timer; LARGE_INTEGER ft; ft.QuadPart = -(10*usec); // Convert to 100 nanosecond interval, negative value indicates relative time timer = CreateWaitableTimer(NULL, 
TRUE, NULL); SetWaitableTimer(timer, &ft, 0, NULL, NULL, 0); WaitForSingleObject(timer, INFINITE); CloseHandle(timer); } void win_err_handler(LPTSTR lpszFunction) { // Retrieve the system error message for the last-error code. LPVOID lpMsgBuf; LPVOID lpDisplayBuf; DWORD dw = GetLastError(); FormatMessage( FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS, NULL, dw, MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), (LPTSTR) &lpMsgBuf, 0, NULL ); // Display the error message. lpDisplayBuf = (LPVOID)LocalAlloc(LMEM_ZEROINIT, (lstrlen((LPCTSTR) lpMsgBuf) + lstrlen((LPCTSTR) lpszFunction) + 40) * sizeof(TCHAR)); StringCchPrintf((LPTSTR)lpDisplayBuf, LocalSize(lpDisplayBuf) / sizeof(TCHAR), TEXT("%s failed with error %d: %s"), lpszFunction, dw, lpMsgBuf); MessageBox(NULL, (LPCTSTR) lpDisplayBuf, TEXT("Error"), MB_OK); // Free error-handling buffer allocations. LocalFree(lpMsgBuf); LocalFree(lpDisplayBuf); } #endif /* ************ help ************* */ void print_help(void) { printf("dserve is a rest server tool for whitedb with a json or csv output\n"); printf("There are three ways to run dserve:\n"); printf(" * command line tool: dserve [optional conffile] like \n"); printf(" dserve 'op=search'\n"); printf(" * cgi program under a web server: copy dserve to the cgi-bin folder,\n"); printf(" optionally set #define CONF_FILE in dserve.h before compiling\n"); printf(" * a standalone server: dserve [optional conffile] like\n"); printf(" dserve 8080 myconf.txt\n"); printf(" or set #define DEFAULT_PORT in dserve.h for startup without args\n"); printf("See http://whitedb.org/server/ for a manual.\n"); } /* ************ message printing to stderr ************* */ void infoprint(char* fmt, char* param) { #ifdef INFOPRINT //fprintf(stderr,"dserve information: "); if (param!=NULL) fprintf(stderr,fmt,param); else fprintf(stderr,fmt,NULL); #endif } void warnprint(char* fmt, char* param) { #ifdef WARNPRINT //fprintf(stderr,"dserve warning: "); if (param!=NULL) fprintf(stderr,fmt,param); else fprintf(stderr,fmt,NULL); #endif } void errprint(char* fmt, char* param) { #ifdef ERRPRINT //fprintf(stderr,"dserve error: "); if (param!=NULL) fprintf(stderr,fmt,param); else fprintf(stderr,fmt,NULL); #endif } /* ************ soft errors not terminating the server ************* */ /* easy user input / nonterminating error processing, in case there is no need to free anything except input buffer: just return errstr or print/exit if not a server. */ char* errhalt(char* str, thread_data_p tdata) { char buf[HTTP_ERR_BUFSIZE]; if (tdata==NULL) { errprint("tdata was NULL in errhalt\n",NULL); terminate(); } if (tdata->isserver) { if (tdata->inbuf!=NULL) { free(tdata->inbuf); tdata->inbuf=NULL; } return make_http_errstr(str,tdata); } else { // freeing tdata->inbuf here is not really necessary if (tdata->inbuf!=NULL) { free(tdata->inbuf); tdata->inbuf=NULL; } if (tdata->jsonp==NULL) snprintf(buf,HTTP_ERR_BUFSIZE,NORMAL_ERR_FORMAT,str); else snprintf(buf,HTTP_ERR_BUFSIZE,JSONP_ERR_FORMAT,tdata->jsonp,str); print_final(buf,tdata); #if _MSC_VER WSACleanup(); #endif exit(0); } } /* normal termination call: db is attached, lock taken, buffers are malloced: need to free all first, then return errstr or print/exit if not a server. 
*/ char* err_clear_detach_halt(char* errstr, thread_data_p tdata) { int r; //printf("err_clear_detach_halt called\n"); // free lock if lock taken if (tdata->db!=NULL && tdata->lock_id) { if (tdata->lock_type==READ_LOCK_TYPE) { r=wg_end_read(tdata->db,tdata->lock_id); if (!r) { errprint("Error releasing readlock in err_clear_detach_halt\n",NULL); terminate(); } tdata->lock_id=0; } else if (tdata->lock_type==WRITE_LOCK_TYPE) { r=wg_end_write(tdata->db,tdata->lock_id); if (!r) { errprint("Error releasing writelock in err_clear_detach_halt\n",NULL); terminate(); } tdata->lock_id=0; } else { errprint("Unrecognized lock type in err_clear_detach_halt\n",NULL); terminate(); } } // detach from db: not critical if (tdata->db!=NULL) { op_detach_database(tdata,tdata->db); } // free string buf if (tdata->buf!=NULL) { free(tdata->buf); } // call the simpler terminator return errhalt(errstr,tdata); } // allocate and create an errstring char* make_http_errstr(char* str, thread_data_p tdata) { char *errstr; errstr=malloc(HTTP_ERR_BUFSIZE); if (!errstr) return NULL; if(tdata!=NULL && tdata->jsonp!=NULL) snprintf(errstr,HTTP_ERR_BUFSIZE,JSONP_ERR_FORMAT,tdata->jsonp,str); else snprintf(errstr,HTTP_ERR_BUFSIZE,NORMAL_ERR_FORMAT,str); return errstr; } /* ************ hard errors terminating the server ************* */ // called from code to terminate with hard error regardless if server or not // just try to release locks and detach void terminate() { termination_handler(0); } /* called in case of internal errors by the signal catcher: it is crucial that the locks are released and db detached */ void termination_handler(int signal) { int n; printf("termination_handler called\n"); clear_detach_final(signal); n=write(STDERR_FILENO, TERMINATE_ERR, strlen(TERMINATE_ERR)); if (n); // to suppress senseless gcc warning #if _MSC_VER WSACleanup(); #endif exit(ERR_EX_SOFTWARE); } /* timeout_handler only used for a cgi program case, not server: it is crucial that the locks are released and db detached */ void timeout_handler(int signal) { int n; printf("timeout_handler called\n"); clear_detach_final(signal); n=write(STDERR_FILENO, TERMINATE_ERR, strlen(TERMINATE_ERR)); if (n); // to suppress senseless gcc warning #if _MSC_VER WSACleanup(); #endif exit(ERR_EX_TEMPFAIL); } void clear_detach_final(int signal) { int i; printf("clear_detach called, maxthreads: %d\n",globalptr->maxthreads); if (globalptr==NULL) { i=write(STDERR_FILENO, TERMINATE_NOGLOB_ERR, strlen(TERMINATE_NOGLOB_ERR)); return; } #ifdef SERVEROPTION // avoid new further threads run and locks taken if (globalptr->maxthreads>0 && globalptr->threads_data[0].common!=NULL) globalptr->threads_data[0].common->shutdown=1; #endif // clear locks for(i=0;(i < globalptr->maxthreads) && (i<1000); i++) { //printf("clearing thread %d locks \n",i); if (globalptr->threads_data[i].db!=NULL && globalptr->threads_data[i].lock_id) { //printf("clear_detach freeing %d\n",i); if (globalptr->threads_data[i].lock_type==READ_LOCK_TYPE) { //printf("clear_detach end_read %d\n",i); wg_end_read(globalptr->threads_data[i].db,globalptr->threads_data[i].lock_id); globalptr->threads_data[i].lock_id=0; } else if (globalptr->threads_data[i].lock_type==WRITE_LOCK_TYPE) { //printf("clear_detach end_write %d\n",i); wg_end_write(globalptr->threads_data[i].db,globalptr->threads_data[i].lock_id); globalptr->threads_data[i].lock_id=0; } } } // detach databases for(i=0;(i < globalptr->maxthreads) && (i<1000); i++) { //printf("detaching thread %d database\n",i); if (globalptr->threads_data[i].db!=NULL) { 
wg_detach_database(globalptr->threads_data[i].db); globalptr->threads_data[i].db=NULL; } } return; } whitedb-0.7.2/Server/examplecertificate.crt000066400000000000000000000026401226454622500207760ustar00rootroot00000000000000-----BEGIN CERTIFICATE----- MIID+zCCAuOgAwIBAgIJALHKYKuIz9gbMA0GCSqGSIb3DQEBBQUAMIGTMQswCQYD VQQGEwJFRTERMA8GA1UECAwISGFyanVtYWExEDAOBgNVBAcMB1RhbGxpbm4xEDAO BgNVBAoMB2V4YW1wbGUxEDAOBgNVBAsMB2V4YW1wbGUxFzAVBgNVBAMMDkV4YW1w bGUgUGVyc29uMSIwIAYJKoZIhvcNAQkBFhNleGFtcGxlQGV4YW1wbGUuY29tMB4X DTE0MDEwMzE0NTMxOVoXDTE1MDEwMzE0NTMxOVowgZMxCzAJBgNVBAYTAkVFMREw DwYDVQQIDAhIYXJqdW1hYTEQMA4GA1UEBwwHVGFsbGlubjEQMA4GA1UECgwHZXhh bXBsZTEQMA4GA1UECwwHZXhhbXBsZTEXMBUGA1UEAwwORXhhbXBsZSBQZXJzb24x IjAgBgkqhkiG9w0BCQEWE2V4YW1wbGVAZXhhbXBsZS5jb20wggEiMA0GCSqGSIb3 DQEBAQUAA4IBDwAwggEKAoIBAQDBkjIsrlNOiW8RMMKmzj7WL1mr0vMrt4Pj/TIj isd+fgQck2v0pulTMv5r3N+u3/8cXjR8QpmcSXjkfYC6xWBwXRALKsVju/O5t3Zj GhHuv7PRtGnNgqiFDDLD2HdhoEojdXrirWnAe7YJDzfz2X28AJmGCH6RrZTL8EWs lgZd8ZA5XxQ3nXMd12lI6MECihbhQ5jc+dj1L7sA/Dd23eznwymQ0SwY+mkhca8D erX651DV3SwrXV0v3BN7gUg23CqKyUj/lgLRerKJb9yMQWdqZM5sTjxPxYr/JUFo pNFbm8hwv5dkpHdXtBrgq6Pxsge3/NIypjeNmnksZSsB4vkpAgMBAAGjUDBOMB0G A1UdDgQWBBRzNaHoRWvyMParvzh9sAZcusMheTAfBgNVHSMEGDAWgBRzNaHoRWvy MParvzh9sAZcusMheTAMBgNVHRMEBTADAQH/MA0GCSqGSIb3DQEBBQUAA4IBAQBL kdZsSHNo8LkRGlxIxK7YEsNgDEL0WChujO3ov05zgK7krLHY04HSCjtsjBobL/Z7 setW70RRZMc6ymikX63SRqy6Dx+G2WJosNkczlLnbhorwpCmRpzf4BOptEpUd5xi fW7DMQVqzr2kND3Sc10HjIMWF+4bq9OAQnAa5HWURPNPZ5fDsnRC/A/TGvJizH6h jzchkYOvNegwQi6qos5MnVAb+LtI0lvUOxnUA1R2F07d+liJgknkt/Rw09IvN9hI PI2UohCghUvYaKSJt+LDUwVo/ihf+pBv/iWD8X7SYVxsmP9pyFGz/I2Q5utnR2yK bjMaHsGJgIcvav/YR4/f -----END CERTIFICATE----- whitedb-0.7.2/Server/exampleprivatekey.key000066400000000000000000000032501226454622500206750ustar00rootroot00000000000000-----BEGIN PRIVATE KEY----- MIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQDBkjIsrlNOiW8R MMKmzj7WL1mr0vMrt4Pj/TIjisd+fgQck2v0pulTMv5r3N+u3/8cXjR8QpmcSXjk fYC6xWBwXRALKsVju/O5t3ZjGhHuv7PRtGnNgqiFDDLD2HdhoEojdXrirWnAe7YJ Dzfz2X28AJmGCH6RrZTL8EWslgZd8ZA5XxQ3nXMd12lI6MECihbhQ5jc+dj1L7sA /Dd23eznwymQ0SwY+mkhca8DerX651DV3SwrXV0v3BN7gUg23CqKyUj/lgLRerKJ b9yMQWdqZM5sTjxPxYr/JUFopNFbm8hwv5dkpHdXtBrgq6Pxsge3/NIypjeNmnks ZSsB4vkpAgMBAAECggEAF3Oi6I7mQOmdrzN9IcBzFHgAITUZiP5e2ExgurWhnc2e qeeieK2QLyhKcr77yrAQtFsleLiI68pq/yPFaNto57QesXupFoA68xErIq6R5Z8M Jif5eZCO4i+sJtYfAJDu6oTdMoFYAp36W/agDMcY2KIp93cn/nZNRLgDePlkJBVe NtHqqzqvn5FOwArBkiU5MmcGdRsWxABDzqTRJ3KHpF0JySdsEd+lT+A2tBWFl2Zz l2S1eBqI+G3KeRkQXITqOzwKctON8awMIQum/uKPGJc+NX5FgQXYPehsDGd+316V Sd+/Y5KP0xnKfjvsLvDrEGDpnwHwPG1iejMT507hUQKBgQDgQrMf04yOGZJZuVjV s6lR2S+Tr/yo+mhnzUSjj+AUDzyrfxoxxcncD6eKg1cMJstrPtRVZnS4TqcNUWjj qXHwpMwDLGZ8YCdZ+TVxsLpA8IZF6IiNo0Q1BbGTqQdtHMiznmqmioqwN5sNYqek N/IBm3vnuZfc9kgXrmOKw0LWdwKBgQDc95H79odz1vjlN0zN3+BGZdSKZQo3njYd My5c9CgMd+guKvQRoFI83gZJgSntw7qPH1feORuuMuIiL6fi46cjlQ257pdRp9CD 3HRzi6Wm020UravKz7v5pE/kRmZ5j8TfrB/D/kcQlUbp5X+Zv8aqX7I+8vG/XgIR zcl1F7d1XwKBgBodql56dFPYBoMMYpwAYCd382JvjCzhfGcaMHQbvSyY2affFV3W ert11zz6Lpjrq6TBnFiVpeIQxsN2R5C7mtk7V8bG1OiHCg4gR2kF+6q0V+6sNbrI 2JiUISng9Uxvna/NMv5SA/ShhRz58Cvfl/837CYAJv9EbwDS/iSauJ3hAoGBAIbq hnEQio3ZMSlLRZLiYd656DcEEGP7LtFPYbyRuy45vEMMKO/mMrBFZBNXURGCk5M1 sQHXXqZTHS2AaYKoO3IHXVUsb6oEy9TnMxclqeQdbZnVnHH9uqlngPxBW+pXNP7Y 6qBRznQ6oQzI+ssWhCecvImg7qhIrvzN6HadH4ADAoGAAOR+db5U7qLVeE1voWpV 7g04EKF0GRsvNOQQ41bmVFaGvnpEGEY7OkytaN/hqomlWC2PHQVD7bUGj/z34eUF iYTx6BDZyTnKohdr/qTLxiO//dlE6/4G/M26A2a3wZFEruZbIrGRUkYLutBls6uX sp5A+W9NEbQAk1vBgNcEtD4= -----END PRIVATE KEY----- 
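/* --- Illustrative sketch, not part of the original dserve_util.c ----------
   Intended use of the cgi query helpers above: parse_query() splits a raw
   "op=search&value=x%20y" style string in place on '&' and '=' and runs
   urldecode() over every piece, filling params[]/values[] with at most
   MAXPARAMS entries (returning -1 on malformed input); isint()/isdbl() can
   then be used to guess each value's type, mirroring what encode_invalue()
   does when no explicit type is given. The query buffer must be writable,
   since decoding happens in place. */
#if 0   /* example only, never compiled into dserve */
static int example_parse(char *rawquery) {
  char *params[MAXPARAMS], *values[MAXPARAMS];
  int i, count;

  count = parse_query(rawquery, (int)strlen(rawquery), params, values);
  if (count < 0) return -1;                      /* bad syntax or too many params */
  for (i = 0; i < count; i++) {
    if (isint(values[i]))       printf("%s -> int %s\n",    params[i], values[i]);
    else if (isdbl(values[i]))  printf("%s -> double %s\n", params[i], values[i]);
    else                        printf("%s -> string %s\n", params[i], values[i]);
  }
  return count;
}
#endif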
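/* --- Illustrative sketch, not part of the original dserve_util.c ----------
   The growable output buffer behind the sprint_* functions above: str_new()
   allocates the initial buffer, and str_guarantee_space() doubles it with
   realloc() whenever more room is needed, updating tdata->buf, tdata->bufsize
   and tdata->bufptr together so the printing code can keep appending at
   tdata->bufptr. The initial size below is an arbitrary value chosen for the
   sketch; the formatting fields are set according to the meanings documented
   in the sprint_record()/sprint_value() comments. */
#if 0   /* example only, never compiled into dserve */
static int example_record_to_json(void *db, void *rec, thread_data_p tdata) {
  int initsize = 1000;                        /* arbitrary starting size    */

  tdata->buf = str_new(initsize);
  if (tdata->buf == NULL) return 0;
  tdata->bufsize = initsize;
  tdata->bufptr = tdata->buf;                 /* next free byte in buf      */
  tdata->format = 1;                          /* 1: json, 0: csv            */
  tdata->showid = 0;                          /* do not prepend record id   */
  tdata->depth = 0;
  tdata->maxdepth = 0;                        /* do not follow record links */
  tdata->strenc = 2;                          /* utf-8 json string escaping */

  if (!sprint_record(db, (wg_int *)rec, tdata)) { free(tdata->buf); return 0; }
  *(tdata->bufptr) = '\0';                    /* terminate the result       */
  printf("%s\n", tdata->buf);
  free(tdata->buf);
  return 1;
}
#endif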
whitedb-0.7.2/Server/nsmeasure.c000066400000000000000000000427221226454622500166010ustar00rootroot00000000000000/* testing the speed of answering queries via net compile by linux: gcc nsmeasure.c -o nsmeasure -O2 -lpthread windows: cl /Ox /I"." Server\nsmeasure.c wgdb.lib run like nsmeasure 'http://127.0.0.1:8080/dserve?op=search' 4 1000 '5689' where: - url must use http - url must contain a numeric IP, not a domain name - url must contain the port number - 4 indicates the number of parallel threads (use 1,2,... etc) - 1000 indicates the number each thread opens the connection - '5689' is a string searched for from each result (thus testing result is ok) you may skip this last string param altogether nsmeasure exits immediately when it sees an error or any result does not contain the string searched for. */ #if _MSC_VER // see http://msdn.microsoft.com/en-us/library/windows/desktop/ms738566%28v=vs.85%29.aspx #define WIN32_LEAN_AND_MEAN #include // windows.h only with lean_and_mean before it #include //#include #include #include #pragma comment (lib, "ws2_32.lib") #pragma comment (lib, "User32.lib") #endif #include #include #include #include #include #include #include // nanosleep for linux #if _MSC_VER #else #include #include #include #endif #define MAX_THREADS 5000 #define INITIALBUFS 1000 #define MAXHEADERS 500 #define MAXBUFS 100000000 #define TRACE #if _MSC_VER #define ssize_t int DWORD WINAPI process(LPVOID targ); void usleep(__int64 usec); void win_err_handler(LPTSTR lpszFunction); #else void *process(void *targ); #endif void errhalt(char *msg); void err_exit(int sockfd, char *buffer, int tid); static ssize_t readn(int fd, void *usrbuf, size_t n); struct thread_data{ int thread_id; // 0,1,.. int maxiter; // how many iterations to run char* ip; // ip to open int port; // port to open char* urlpart; // urlpart to open like /dserve?op=search char* verify; // string to look for int res; // stored by thread }; int main(int argc, char **argv) { int i,tmax,iter,maxiter,rc; char ip[100], urlpart[2000]; int portnr=0; #if _MSC_VER WSADATA wsaData; HANDLE thandle; HANDLE thandlearray[MAX_THREADS]; DWORD threads[MAX_THREADS]; #else pthread_t threads[MAX_THREADS]; pthread_attr_t attr; #endif struct thread_data tdata[MAX_THREADS]; long tid; char* verify=NULL; if (argc<4) errhalt("three obligatory args - url, threads, iter - like: \ 'http://127.0.0.1:8080/dserve?op=search' 2 3 - plus one optional: string to look for in results"); if (sscanf(argv[1], "http://%99[^:]:%i/%999[^\n]", ip, &portnr, urlpart) != 3) { errhalt("could not parse the url"); } if (!portnr) errhalt("nonnumeric or zero port given"); tmax=atoi(argv[2]); if (!tmax) errhalt("nonnumeric or zero thread count given"); maxiter=atoi(argv[3]); if (!maxiter) errhalt("nonnumeric or zero iteration count given"); if (tmax>=MAX_THREADS) errhalt("too many threads to be created"); if (argc==5) { verify=argv[4]; printf ("first line must contain 200 and the body must contain %s\n",verify); } else { verify=NULL; printf("no result verification to be done except 200 check in the first line \n"); } printf("starting %d threads to run %d iterations each\n",tmax,maxiter); // prepare and create threads #if _MSC_VER if (WSAStartup(MAKEWORD(2, 0),&wsaData) != 0) { printf("WSAStartup failed\n"); exit(1); } #else pthread_attr_init(&attr); pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE); #endif for(iter=0; iter<1; iter++) { // run just once for(tid=0;tidthread_id; printf("thread %d starts, ip %s portnr %d url %s 
\n",tid,tdata->ip,tdata->port,tdata->urlpart); buffer=malloc(INITIALBUFS); //pthread_exit((void*) tid); maxiter=tdata->maxiter; clenheader="Content-Length:"; clenheader_len=strlen(clenheader); for(iter=0; iterport); #if _MSC_VER dest.sin_addr.s_addr=inet_addr(tdata->ip); if (dest.sin_addr.s_addr==0 || dest.sin_addr.s_addr==INADDR_NONE ) #else if (inet_addr(tdata->ip, &dest.sin_addr.s_addr)==0) #endif errhalt("inet_addr problem"); // Connect to server ok=0; for(j=0;j<100;j++) { if ( connect(sockfd, (struct sockaddr*)&dest, sizeof(dest)) != 0 ) { // problem here #if _MSC_VER usleep(100); #else nanosleep(&tim , &tim2); #endif } else { ok=1; break; } } if (!ok) { errhalt("connect problem"); } //printf("thread %d connecting iteration %d\n",tid,iter); sprintf(buffer, "GET /%s HTTP/1.0\n\n", tdata->urlpart); send(sockfd, buffer, strlen(buffer),0); len=0; if (tdata->verify==NULL) { // read first line plus a bit more bufp=buffer; //memset(buffer,0,22); while (1) { if ((nread=read(sockfd, bufp, 20)) < 0) { if (errno==EINTR) nread=0;/* interrupted by sig handler return */ else break; } else { break; } } *(bufp+nread)='\0'; if (strstr(buffer,"200")==NULL) { printf("thread %d received a non-200 http result at iter %d, exiting\n",tid,iter); err_exit(sockfd,buffer,tid); } } else { bufp=buffer; // first read the main part of the header toread=INITIALBUFS-10; for(k=0;k<10;k++) { nread=readn(sockfd, bufp, toread); if (nread>0) break; usleep(k*100); printf("thread %d read try %d at iter %d\n",tid,k,iter); } /* printf("nread: %d toread: %d\n",nread,toread); bufp+=nread; *bufp='\0'; printf("%s\n",buffer); exit(0); */ if (nread<0) { printf("thread %d did not succeed to read at all at iter %d, exiting\n",tid,iter); err_exit(sockfd,buffer,tid); } else if (nreadMAXBUFS) { printf("thread %d got a too big Content-Length at iter %d, exiting\n",tid,iter); err_exit(sockfd,buffer,tid); } if (len>(INITIALBUFS-MAXHEADERS)) { j=bufp-buffer; buffer=realloc(buffer,len+MAXHEADERS); if (buffer==NULL) { printf("thread %d failed to alloc enough memory at iter %d, exiting\n",tid,iter); err_exit(sockfd,buffer,tid); } bufp=buffer+j; } //printf("nr %d |%s|\n",len,tp); //printf("buffer:\n----\n%s\n----\n",buffer); loc=strstr(buffer,"\r\n\r\n"); if (loc==NULL) { printf("thread %d did not find empty line at iter %d, exiting\n",tid,iter); err_exit(sockfd,buffer,tid); } // have to read content_length+header_length-read_already toread=len+((loc-buffer)+4)-nread; //printf("toread: %d \n",toread); nread=readn(sockfd, bufp, toread); //printf("nread: %d\n",nread); *(bufp+nread)='\0'; } //printf("j %d buffer:\n----\n%s\n----\n",j,buffer); if (strstr(buffer,"200")==NULL) { printf("thread %d received a non-200 http result at iter %d, exiting\n",tid,iter); printf("j %d buffer:\n----\n%s\n----\n",j,buffer); err_exit(sockfd,buffer,tid); } if (strstr(buffer,tdata->verify)==NULL) { printf("thread %d received a non-verified http result at iter %d, exiting\n",tid,iter); err_exit(sockfd,buffer,tid); } } // Clean up #if _MSC_VER shutdown(sockfd,SD_RECEIVE); closesocket(sockfd); #else shutdown(sockfd,SHUT_RDWR); close(sockfd); #endif //shutdown(sockfd,3); } // end thread tdata->res=i; //printf ("thread %d finishing with res %d \n",tid,i); free(buffer); #if _MSC_VER ExitThread(0); return 0; #else pthread_exit((void*) tid); return NULL; #endif } void err_exit(int sockfd, char *buffer, int tid) { #if _MSC_VER shutdown(sockfd,SD_RECEIVE); closesocket(sockfd); if (buffer!=NULL) free(buffer); ExitThread(1); #else shutdown(sockfd,SHUT_RDWR); close(sockfd); if 
(buffer!=NULL) free(buffer); pthread_exit((void*) tid); #endif } static ssize_t readn(int fd, void *usrbuf, size_t n) { size_t nleft = n; ssize_t nread; char *bufp = usrbuf; while (nleft>0) { if ((nread=recv(fd, bufp, nleft, 0)) < 0) { if (errno==EINTR) nread=0;/* interrupted by sig handler return */ /* and call recv() again */ else { //fprintf(stderr,"read %d err %d \n",nread,errno); return -1; /* errno set by read() */ } } else if (nread==0) break; /* EOF */ nleft -= nread; bufp += nread; } //fprintf(stderr,"readn terminates with %d\n",n-nleft); return (n-nleft); /* return >= 0 */ } void errhalt(char *str) { printf("Error: %s\n",str); #if _MSC_VER WSACleanup(); #endif exit(-1); } #if _MSC_VER void usleep(__int64 usec) { HANDLE timer; LARGE_INTEGER ft; ft.QuadPart = -(10*usec); // Convert to 100 nanosecond interval, negative value indicates relative time timer = CreateWaitableTimer(NULL, TRUE, NULL); SetWaitableTimer(timer, &ft, 0, NULL, NULL, 0); WaitForSingleObject(timer, INFINITE); CloseHandle(timer); } #endif #if _MSC_VER void win_err_handler(LPTSTR lpszFunction) { // Retrieve the system error message for the last-error code. LPVOID lpMsgBuf; LPVOID lpDisplayBuf; DWORD dw = GetLastError(); FormatMessage( FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS, NULL, dw, MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), (LPTSTR) &lpMsgBuf, 0, NULL ); // Display the error message. lpDisplayBuf = (LPVOID)LocalAlloc(LMEM_ZEROINIT, (lstrlen((LPCTSTR) lpMsgBuf) + lstrlen((LPCTSTR) lpszFunction) + 40) * sizeof(TCHAR)); StringCchPrintf((LPTSTR)lpDisplayBuf, LocalSize(lpDisplayBuf) / sizeof(TCHAR), TEXT("%s failed with error %d: %s"), lpszFunction, dw, lpMsgBuf); MessageBox(NULL, (LPCTSTR) lpDisplayBuf, TEXT("Error"), MB_OK); // Free error-handling buffer allocations. 
LocalFree(lpMsgBuf); LocalFree(lpDisplayBuf); } #endif /* Timings for dserve on an i7 laptop: parallel mode ------------ measuring wdb query on localhost: query is dserve?op=search giving all dbase rows a) small: db size is 3 rows of 3 cols of ints, output ca 120 bytes b) medium: db size is 1000 rows of 3 cols of ints, output ca 17 kilobytes c) large: db size is 100.000 rows of 3 cols of ints, output ca 2.3 megabytes running N threads querying over http in parallel, each M iterations Summary - - - - Server, small query, 100000 ops altogether on linux i7 laptop: 100K ops in 1 thread: 8.345s 100K ops spread in 2 threads: 4.631s 100K ops spread in 4 threads: 3.215s 100K ops spread in 8 threads: 3.595s cgi, small query, 1000 ops altogether on linux i7 laptop: 10K ops in 1 thread: 12.1 s 10K ops spread in 2 threads: 6.145s 10K ops spread in 4 threads: 3.349s 10K ops spread in 8 threads: 2.693s 1 thread 10.000 iterations: - - - - - - - - - - - - - - iterative server: small: real 0m0.840s user 0m0.026s sys 0m0.427s medium: real 0m6.793s user 0m0.144s sys 0m0.773s server with 8 threads: small: real 0m1.416s user 0m0.019s sys 0m0.495s medium: real 0m6.777s user 0m0.156s sys 0m0.720s threadpool with 8 threads small without close check real 0m0.840s user 0m0.020s sys 0m0.444s small with close check real 0m1.464s user 0m0.035s sys 0m0.502s 2 threads 10.000 iterations: - - - - - - - - - - - - - - iterative server: small: real 0m0.951s user 0m0.038s sys 0m0.863s medium: real 0m10.622s user 0m0.264s sys 0m1.200s server with 8 threads: small: real 0m2.376s user 0m0.053s sys 0m1.330s medium: real 0m6.698s user 0m0.252s sys 0m1.558s threadpool with 8 threads small without close check: real 0m0.985s user 0m0.053s sys 0m0.990s small with close check: real 0m2.964s user 0m0.053s sys 0m1.116s 3 threads 10.000 iterations: - - - - - - - - - - - - - - iterative server: small: real 0m1.378s user 0m0.050s sys 0m1.303s medium: real 0m16.007s user 0m0.411s sys 0m1.778s server with 8 threads: small: real 0m3.389s user 0m0.100s sys 0m2.213s medium: real 0m7.148s user 0m0.394s sys 0m2.381s threadpool with 8 threads real 0m4.439s user 0m0.088s sys 0m1.795s 4 threads 10.000 iterations: - - - - - - - - - - - - - - iterative server: small: real 0m1.822s user 0m0.077s sys 0m1.723s medium: real 0m21.363s user 0m0.520s sys 0m2.392s server with 8 threads: small: real 0m4.589s user 0m0.140s sys 0m3.112s medium: real 0m9.372s user 0m0.582s sys 0m3.065s threadpool with 8 threads small without closing check real 0m1.398s user 0m0.130s sys 0m2.534s small with closing check real 0m5.913s user 0m0.117s sys 0m2.571s medium: real 0m9.139s user 0m0.472s sys 0m2.933s 8 threads 10.000 iterations: - - - - - - - - - - - - - - iterative server: small: real 0m3.695s user 0m0.138s sys 0m3.514s medium: real 0m42.736s user 0m1.232s sys 0m4.712s server with 8 threads: small: real 0m9.324s user 0m0.272s sys 0m6.227s medium: connection problems occur after a while, server refuses to answer for ca 10 secs, then automagically starts working ok again survives OK with 4 server threads, though: real 0m28.006s user 0m0.309s threadpool with 8 threads small without close check: real 0m3.324s user 0m0.268s sys 0m5.139s small with close check: sys 0m6.928s real 0m11.897s user 0m0.222s sys 0m5.187s medium: real 0m18.915s user 0m0.309s sys 0m6.068s burst mode for ----------- creating 100.000 empty threads (no connections done) sequentially real 0m1.454s user 0m0.461s sys 0m1.156s creating 10 empty thread batches 10.000 times: real 0m1.358s user 0m0.774s sys 
0m1.999s creating 100 empty thread batches 1000 times: real 0m1.544s user 0m0.808s sys 0m2.190s Windows ======== 1 thread 1000 calls with a small dataset: threaded server: 0.5 sec iterative server: 1 sec 8 threads 1000 calls each with a small dataset: threaded server: 2 sec iterative server: 8 sec 8 threads 10 calls each with a large dataset (100K rows, 2.34MB) threaded server: 2 sec iterative server: 7.4 sec */ whitedb-0.7.2/Server/timecmd.bat000077500000000000000000000017011226454622500165400ustar00rootroot00000000000000@echo off @setlocal set start=%time% :: runs your command cmd /c %* set end=%time% set options="tokens=1-4 delims=:." for /f %options% %%a in ("%start%") do set start_h=%%a&set /a start_m=100%%b %% 100&set /a start_s=100%%c %% 100&set /a start_ms=100%%d %% 100 for /f %options% %%a in ("%end%") do set end_h=%%a&set /a end_m=100%%b %% 100&set /a end_s=100%%c %% 100&set /a end_ms=100%%d %% 100 set /a hours=%end_h%-%start_h% set /a mins=%end_m%-%start_m% set /a secs=%end_s%-%start_s% set /a ms=%end_ms%-%start_ms% if %hours% lss 0 set /a hours = 24%hours% if %mins% lss 0 set /a hours = %hours% - 1 & set /a mins = 60%mins% if %secs% lss 0 set /a mins = %mins% - 1 & set /a secs = 60%secs% if %ms% lss 0 set /a secs = %secs% - 1 & set /a ms = 100%ms% if 1%ms% lss 100 set ms=0%ms% :: mission accomplished set /a totalsecs = %hours%*3600 + %mins%*60 + %secs% echo command took %hours%:%mins%:%secs%.%ms% (%totalsecs%.%ms%s total)whitedb-0.7.2/acinclude.m4000066400000000000000000000664541226454622500153660ustar00rootroot00000000000000# =========================================================================== # http://autoconf-archive.cryp.to/acx_pthread.html # =========================================================================== # # SYNOPSIS # # ACX_PTHREAD([ACTION-IF-FOUND[, ACTION-IF-NOT-FOUND]]) # # DESCRIPTION # # This macro figures out how to build C programs using POSIX threads. It # sets the PTHREAD_LIBS output variable to the threads library and linker # flags, and the PTHREAD_CFLAGS output variable to any special C compiler # flags that are needed. (The user can also force certain compiler # flags/libs to be tested by setting these environment variables.) # # Also sets PTHREAD_CC to any special C compiler that is needed for # multi-threaded programs (defaults to the value of CC otherwise). (This # is necessary on AIX to use the special cc_r compiler alias.) # # NOTE: You are assumed to not only compile your program with these flags, # but also link it with them as well. e.g. you should link with # $PTHREAD_CC $CFLAGS $PTHREAD_CFLAGS $LDFLAGS ... $PTHREAD_LIBS $LIBS # # If you are only building threads programs, you may wish to use these # variables in your default LIBS, CFLAGS, and CC: # # LIBS="$PTHREAD_LIBS $LIBS" # CFLAGS="$CFLAGS $PTHREAD_CFLAGS" # CC="$PTHREAD_CC" # # In addition, if the PTHREAD_CREATE_JOINABLE thread-attribute constant # has a nonstandard name, defines PTHREAD_CREATE_JOINABLE to that name # (e.g. PTHREAD_CREATE_UNDETACHED on AIX). # # ACTION-IF-FOUND is a list of shell commands to run if a threads library # is found, and ACTION-IF-NOT-FOUND is a list of commands to run it if it # is not found. If ACTION-IF-FOUND is not specified, the default action # will define HAVE_PTHREAD. # # Please let the authors know if this macro fails on any platform, or if # you have any other suggestions or comments. This macro was based on work # by SGJ on autoconf scripts for FFTW (http://www.fftw.org/) (with help # from M. 
Frigo), as well as ac_pthread and hb_pthread macros posted by # Alejandro Forero Cuervo to the autoconf macro repository. We are also # grateful for the helpful feedback of numerous users. # # LAST MODIFICATION # # 2008-04-12 # # COPYLEFT # # Copyright (c) 2008 Steven G. Johnson # # This program is free software: you can redistribute it and/or modify it # under the terms of the GNU General Public License as published by the # Free Software Foundation, either version 3 of the License, or (at your # option) any later version. # # This program is distributed in the hope that it will be useful, but # WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General # Public License for more details. # # You should have received a copy of the GNU General Public License along # with this program. If not, see . # # As a special exception, the respective Autoconf Macro's copyright owner # gives unlimited permission to copy, distribute and modify the configure # scripts that are the output of Autoconf when processing the Macro. You # need not follow the terms of the GNU General Public License when using # or distributing such scripts, even though portions of the text of the # Macro appear in them. The GNU General Public License (GPL) does govern # all other use of the material that constitutes the Autoconf Macro. # # This special exception to the GPL applies to versions of the Autoconf # Macro released by the Autoconf Macro Archive. When you make and # distribute a modified version of the Autoconf Macro, you may extend this # special exception to the GPL to apply to your modified version as well. AC_DEFUN([ACX_PTHREAD], [ AC_REQUIRE([AC_CANONICAL_HOST]) AC_LANG_SAVE AC_LANG_C acx_pthread_ok=no # We used to check for pthread.h first, but this fails if pthread.h # requires special compiler flags (e.g. on True64 or Sequent). # It gets checked for in the link test anyway. # First of all, check if the user has set any of the PTHREAD_LIBS, # etcetera environment variables, and if threads linking works using # them: if test x"$PTHREAD_LIBS$PTHREAD_CFLAGS" != x; then save_CFLAGS="$CFLAGS" CFLAGS="$CFLAGS $PTHREAD_CFLAGS" save_LIBS="$LIBS" LIBS="$PTHREAD_LIBS $LIBS" AC_MSG_CHECKING([for pthread_join in LIBS=$PTHREAD_LIBS with CFLAGS=$PTHREAD_CFLAGS]) AC_TRY_LINK_FUNC(pthread_join, acx_pthread_ok=yes) AC_MSG_RESULT($acx_pthread_ok) if test x"$acx_pthread_ok" = xno; then PTHREAD_LIBS="" PTHREAD_CFLAGS="" fi LIBS="$save_LIBS" CFLAGS="$save_CFLAGS" fi # We must check for the threads library under a number of different # names; the ordering is very important because some systems # (e.g. DEC) have both -lpthread and -lpthreads, where one of the # libraries is broken (non-POSIX). # Create a list of thread flags to try. Items starting with a "-" are # C compiler flags, and other items are library names, except for "none" # which indicates that we try without any flags at all, and "pthread-config" # which is a program returning the flags for the Pth emulation library. acx_pthread_flags="pthreads none -Kthread -kthread lthread -pthread -pthreads -mthreads pthread --thread-safe -mt pthread-config" # The ordering *is* (sometimes) important. 
Some notes on the # individual items follow: # pthreads: AIX (must check this before -lpthread) # none: in case threads are in libc; should be tried before -Kthread and # other compiler flags to prevent continual compiler warnings # -Kthread: Sequent (threads in libc, but -Kthread needed for pthread.h) # -kthread: FreeBSD kernel threads (preferred to -pthread since SMP-able) # lthread: LinuxThreads port on FreeBSD (also preferred to -pthread) # -pthread: Linux/gcc (kernel threads), BSD/gcc (userland threads) # -pthreads: Solaris/gcc # -mthreads: Mingw32/gcc, Lynx/gcc # -mt: Sun Workshop C (may only link SunOS threads [-lthread], but it # doesn't hurt to check since this sometimes defines pthreads too; # also defines -D_REENTRANT) # ... -mt is also the pthreads flag for HP/aCC # pthread: Linux, etcetera # --thread-safe: KAI C++ # pthread-config: use pthread-config program (for GNU Pth library) case "${host_cpu}-${host_os}" in *solaris*) # On Solaris (at least, for some versions), libc contains stubbed # (non-functional) versions of the pthreads routines, so link-based # tests will erroneously succeed. (We need to link with -pthreads/-mt/ # -lpthread.) (The stubs are missing pthread_cleanup_push, or rather # a function called by this macro, so we could check for that, but # who knows whether they'll stub that too in a future libc.) So, # we'll just look for -pthreads and -lpthread first: acx_pthread_flags="-pthreads pthread -mt -pthread $acx_pthread_flags" ;; esac if test x"$acx_pthread_ok" = xno; then for flag in $acx_pthread_flags; do case $flag in none) AC_MSG_CHECKING([whether pthreads work without any flags]) ;; -*) AC_MSG_CHECKING([whether pthreads work with $flag]) PTHREAD_CFLAGS="$flag" ;; pthread-config) AC_CHECK_PROG(acx_pthread_config, pthread-config, yes, no) if test x"$acx_pthread_config" = xno; then continue; fi PTHREAD_CFLAGS="`pthread-config --cflags`" PTHREAD_LIBS="`pthread-config --ldflags` `pthread-config --libs`" ;; *) AC_MSG_CHECKING([for the pthreads library -l$flag]) PTHREAD_LIBS="-l$flag" ;; esac save_LIBS="$LIBS" save_CFLAGS="$CFLAGS" LIBS="$PTHREAD_LIBS $LIBS" CFLAGS="$CFLAGS $PTHREAD_CFLAGS" # Check for various functions. We must include pthread.h, # since some functions may be macros. (On the Sequent, we # need a special flag -Kthread to make this header compile.) # We check for pthread_join because it is in -lpthread on IRIX # while pthread_create is in libc. We check for pthread_attr_init # due to DEC craziness with -lpthreads. We check for # pthread_cleanup_push because it is one of the few pthread # functions on Solaris that doesn't have a non-functional libc stub. # We try pthread_create on general principles. AC_TRY_LINK([#include ], [pthread_t th; pthread_join(th, 0); pthread_attr_init(0); pthread_cleanup_push(0, 0); pthread_create(0,0,0,0); pthread_cleanup_pop(0); ], [acx_pthread_ok=yes]) LIBS="$save_LIBS" CFLAGS="$save_CFLAGS" AC_MSG_RESULT($acx_pthread_ok) if test "x$acx_pthread_ok" = xyes; then break; fi PTHREAD_LIBS="" PTHREAD_CFLAGS="" done fi # Various other checks: if test "x$acx_pthread_ok" = xyes; then save_LIBS="$LIBS" LIBS="$PTHREAD_LIBS $LIBS" save_CFLAGS="$CFLAGS" CFLAGS="$CFLAGS $PTHREAD_CFLAGS" # Detect AIX lossage: JOINABLE attribute is called UNDETACHED. 
AC_MSG_CHECKING([for joinable pthread attribute]) attr_name=unknown for attr in PTHREAD_CREATE_JOINABLE PTHREAD_CREATE_UNDETACHED; do AC_TRY_LINK([#include ], [int attr=$attr; return attr;], [attr_name=$attr; break]) done AC_MSG_RESULT($attr_name) if test "$attr_name" != PTHREAD_CREATE_JOINABLE; then AC_DEFINE_UNQUOTED(PTHREAD_CREATE_JOINABLE, $attr_name, [Define to necessary symbol if this constant uses a non-standard name on your system.]) fi AC_MSG_CHECKING([if more special flags are required for pthreads]) flag=no case "${host_cpu}-${host_os}" in *-aix* | *-freebsd* | *-darwin*) flag="-D_THREAD_SAFE";; *solaris* | *-osf* | *-hpux*) flag="-D_REENTRANT";; esac AC_MSG_RESULT(${flag}) if test "x$flag" != xno; then PTHREAD_CFLAGS="$flag $PTHREAD_CFLAGS" fi LIBS="$save_LIBS" CFLAGS="$save_CFLAGS" # More AIX lossage: must compile with xlc_r or cc_r if test x"$GCC" != xyes; then AC_CHECK_PROGS(PTHREAD_CC, xlc_r cc_r, ${CC}) else PTHREAD_CC=$CC fi else PTHREAD_CC="$CC" fi AC_SUBST(PTHREAD_LIBS) AC_SUBST(PTHREAD_CFLAGS) AC_SUBST(PTHREAD_CC) # Finally, execute ACTION-IF-FOUND/ACTION-IF-NOT-FOUND: if test x"$acx_pthread_ok" = xyes; then ifelse([$1],,AC_DEFINE(HAVE_PTHREAD,1,[Define if you have POSIX threads libraries and header files.]),[$1]) : else acx_pthread_ok=no $2 fi AC_LANG_RESTORE ])dnl ACX_PTHREAD dnl a macro to check for ability to create python extensions dnl AM_CHECK_PYTHON_HEADERS([ACTION-IF-POSSIBLE], [ACTION-IF-NOT-POSSIBLE]) dnl function also defines PYTHON_INCLUDES AC_DEFUN([AM_CHECK_PYTHON_HEADERS], [AC_REQUIRE([AM_PATH_PYTHON]) AC_MSG_CHECKING(for headers required to compile python extensions) dnl deduce PYTHON_INCLUDES py_prefix=`$PYTHON -c "import sys; sys.stdout.write(sys.prefix)"` py_exec_prefix=`$PYTHON -c "import sys; sys.stdout.write(sys.exec_prefix)"` if test -x "$PYTHON-config"; then PYTHON_INCLUDES=`$PYTHON-config --includes 2>/dev/null` else PYTHON_INCLUDES="-I${py_prefix}/include/python${PYTHON_VERSION}" if test "$py_prefix" != "$py_exec_prefix"; then PYTHON_INCLUDES="$PYTHON_INCLUDES -I${py_exec_prefix}/include/python${PYTHON_VERSION}" fi if test "${IS_WINDOWS}" = "yes"; then PYTHON_INCLUDES=`echo $PYTHON_INCLUDES -I${py_prefix}/include | sed s%\\\\\\\\%/%g` PYTHON_VER=`echo ${PYTHON_VERSION} | sed s:\\\\.::` PYTHON_LIBS=`echo -L${py_prefix}/libs -lpython${PYTHON_VER} | sed s%\\\\\\\\%/%g` fi fi AC_SUBST(PYTHON_INCLUDES) AC_SUBST(PYTHON_LIBS) dnl check if the headers exist: save_CPPFLAGS="$CPPFLAGS" CPPFLAGS="$CPPFLAGS $PYTHON_INCLUDES" AC_TRY_CPP([#include ],dnl [AC_MSG_RESULT(found) $1],dnl [AC_MSG_RESULT(not found) $2]) CPPFLAGS="$save_CPPFLAGS" ]) # =========================================================================== # http://www.gnu.org/software/autoconf-archive/ax_check_compiler_flags.html # =========================================================================== # # SYNOPSIS # # AX_CHECK_COMPILER_FLAGS(FLAGS, [ACTION-SUCCESS], [ACTION-FAILURE]) # # DESCRIPTION # # Check whether the given compiler FLAGS work with the current language's # compiler, or whether they give an error. (Warnings, however, are # ignored.) # # ACTION-SUCCESS/ACTION-FAILURE are shell commands to execute on # success/failure. # # LICENSE # # Copyright (c) 2009 Steven G. Johnson # Copyright (c) 2009 Matteo Frigo # # This program is free software: you can redistribute it and/or modify it # under the terms of the GNU General Public License as published by the # Free Software Foundation, either version 3 of the License, or (at your # option) any later version. 
# # This program is distributed in the hope that it will be useful, but # WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General # Public License for more details. # # You should have received a copy of the GNU General Public License along # with this program. If not, see . # # As a special exception, the respective Autoconf Macro's copyright owner # gives unlimited permission to copy, distribute and modify the configure # scripts that are the output of Autoconf when processing the Macro. You # need not follow the terms of the GNU General Public License when using # or distributing such scripts, even though portions of the text of the # Macro appear in them. The GNU General Public License (GPL) does govern # all other use of the material that constitutes the Autoconf Macro. # # This special exception to the GPL applies to versions of the Autoconf # Macro released by the Autoconf Archive. When you make and distribute a # modified version of the Autoconf Macro, you may extend this special # exception to the GPL to apply to your modified version as well. AC_DEFUN([AX_CHECK_COMPILER_FLAGS], [AC_PREREQ(2.59) dnl for _AC_LANG_PREFIX AC_MSG_CHECKING([whether _AC_LANG compiler accepts $1]) dnl Some hackery here since AC_CACHE_VAL can't handle a non-literal varname: AS_LITERAL_IF([$1], [AC_CACHE_VAL(AS_TR_SH(ax_cv_[]_AC_LANG_ABBREV[]_flags_[$1]), [ ax_save_FLAGS=$[]_AC_LANG_PREFIX[]FLAGS _AC_LANG_PREFIX[]FLAGS="$1" AC_COMPILE_IFELSE([AC_LANG_PROGRAM()], AS_TR_SH(ax_cv_[]_AC_LANG_ABBREV[]_flags_[$1])=yes, AS_TR_SH(ax_cv_[]_AC_LANG_ABBREV[]_flags_[$1])=no) _AC_LANG_PREFIX[]FLAGS=$ax_save_FLAGS])], [ax_save_FLAGS=$[]_AC_LANG_PREFIX[]FLAGS _AC_LANG_PREFIX[]FLAGS="$1" AC_COMPILE_IFELSE([AC_LANG_PROGRAM()], eval AS_TR_SH(ax_cv_[]_AC_LANG_ABBREV[]_flags_[$1])=yes, eval AS_TR_SH(ax_cv_[]_AC_LANG_ABBREV[]_flags_[$1])=no) _AC_LANG_PREFIX[]FLAGS=$ax_save_FLAGS]) eval ax_check_compiler_flags=$AS_TR_SH(ax_cv_[]_AC_LANG_ABBREV[]_flags_[$1]) AC_MSG_RESULT($ax_check_compiler_flags) if test "x$ax_check_compiler_flags" = xyes; then m4_default([$2], :) else m4_default([$3], :) fi ])dnl AX_CHECK_COMPILER_FLAGS # =========================================================================== # http://www.gnu.org/software/autoconf-archive/ax_gcc_x86_cpuid.html # =========================================================================== # # SYNOPSIS # # AX_GCC_X86_CPUID(OP) # # DESCRIPTION # # On Pentium and later x86 processors, with gcc or a compiler that has a # compatible syntax for inline assembly instructions, run a small program # that executes the cpuid instruction with input OP. This can be used to # detect the CPU type. # # On output, the values of the eax, ebx, ecx, and edx registers are stored # as hexadecimal strings as "eax:ebx:ecx:edx" in the cache variable # ax_cv_gcc_x86_cpuid_OP. # # If the cpuid instruction fails (because you are running a # cross-compiler, or because you are not using gcc, or because you are on # a processor that doesn't have this instruction), ax_cv_gcc_x86_cpuid_OP # is set to the string "unknown". # # This macro mainly exists to be used in AX_GCC_ARCHFLAG. # # LICENSE # # Copyright (c) 2008 Steven G. Johnson # Copyright (c) 2008 Matteo Frigo # # This program is free software: you can redistribute it and/or modify it # under the terms of the GNU General Public License as published by the # Free Software Foundation, either version 3 of the License, or (at your # option) any later version. 
# # This program is distributed in the hope that it will be useful, but # WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General # Public License for more details. # # You should have received a copy of the GNU General Public License along # with this program. If not, see . # # As a special exception, the respective Autoconf Macro's copyright owner # gives unlimited permission to copy, distribute and modify the configure # scripts that are the output of Autoconf when processing the Macro. You # need not follow the terms of the GNU General Public License when using # or distributing such scripts, even though portions of the text of the # Macro appear in them. The GNU General Public License (GPL) does govern # all other use of the material that constitutes the Autoconf Macro. # # This special exception to the GPL applies to versions of the Autoconf # Macro released by the Autoconf Archive. When you make and distribute a # modified version of the Autoconf Macro, you may extend this special # exception to the GPL to apply to your modified version as well. AC_DEFUN([AX_GCC_X86_CPUID], [AC_REQUIRE([AC_PROG_CC]) AC_LANG_PUSH([C]) AC_CACHE_CHECK(for x86 cpuid $1 output, ax_cv_gcc_x86_cpuid_$1, [AC_RUN_IFELSE([AC_LANG_PROGRAM([#include ], [ int op = $1, eax, ebx, ecx, edx; FILE *f; __asm__("cpuid" : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx) : "a" (op)); f = fopen("conftest_cpuid", "w"); if (!f) return 1; fprintf(f, "%x:%x:%x:%x\n", eax, ebx, ecx, edx); fclose(f); return 0; ])], [ax_cv_gcc_x86_cpuid_$1=`cat conftest_cpuid`; rm -f conftest_cpuid], [ax_cv_gcc_x86_cpuid_$1=unknown; rm -f conftest_cpuid], [ax_cv_gcc_x86_cpuid_$1=unknown])]) AC_LANG_POP([C]) ]) # =========================================================================== # http://www.gnu.org/software/autoconf-archive/ax_gcc_archflag.html # =========================================================================== # # SYNOPSIS # # AX_GCC_ARCHFLAG([PORTABLE?], [ACTION-SUCCESS], [ACTION-FAILURE]) # # DESCRIPTION # # This macro tries to guess the "native" arch corresponding to the target # architecture for use with gcc's -march=arch or -mtune=arch flags. If # found, the cache variable $ax_cv_gcc_archflag is set to this flag and # ACTION-SUCCESS is executed; otherwise $ax_cv_gcc_archflag is is set to # "unknown" and ACTION-FAILURE is executed. The default ACTION-SUCCESS is # to add $ax_cv_gcc_archflag to the end of $CFLAGS. # # PORTABLE? should be either [yes] (default) or [no]. In the former case, # the flag is set to -mtune (or equivalent) so that the architecture is # only used for tuning, but the instruction set used is still portable. In # the latter case, the flag is set to -march (or equivalent) so that # architecture-specific instructions are enabled. # # The user can specify --with-gcc-arch= in order to override the # macro's choice of architecture, or --without-gcc-arch to disable this. # # When cross-compiling, or if $CC is not gcc, then ACTION-FAILURE is # called unless the user specified --with-gcc-arch manually. # # Requires macros: AX_CHECK_COMPILER_FLAGS, AX_GCC_X86_CPUID # # (The main emphasis here is on recent CPUs, on the principle that doing # high-performance computing on old hardware is uncommon.) # # LICENSE # # Copyright (c) 2008 Steven G. 
Johnson # Copyright (c) 2008 Matteo Frigo # # This program is free software: you can redistribute it and/or modify it # under the terms of the GNU General Public License as published by the # Free Software Foundation, either version 3 of the License, or (at your # option) any later version. # # This program is distributed in the hope that it will be useful, but # WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General # Public License for more details. # # You should have received a copy of the GNU General Public License along # with this program. If not, see . # # As a special exception, the respective Autoconf Macro's copyright owner # gives unlimited permission to copy, distribute and modify the configure # scripts that are the output of Autoconf when processing the Macro. You # need not follow the terms of the GNU General Public License when using # or distributing such scripts, even though portions of the text of the # Macro appear in them. The GNU General Public License (GPL) does govern # all other use of the material that constitutes the Autoconf Macro. # # This special exception to the GPL applies to versions of the Autoconf # Macro released by the Autoconf Archive. When you make and distribute a # modified version of the Autoconf Macro, you may extend this special # exception to the GPL to apply to your modified version as well. AC_DEFUN([AX_GCC_ARCHFLAG], [AC_REQUIRE([AC_PROG_CC]) AC_REQUIRE([AC_CANONICAL_HOST]) AC_ARG_WITH(gcc-arch, [AS_HELP_STRING([--with-gcc-arch=], [use architecture for gcc -march/-mtune, instead of guessing])], ax_gcc_arch=$withval, ax_gcc_arch=yes) AC_MSG_CHECKING([for gcc architecture flag]) AC_MSG_RESULT([]) AC_CACHE_VAL(ax_cv_gcc_archflag, [ ax_cv_gcc_archflag="unknown" if test "$GCC" = yes; then if test "x$ax_gcc_arch" = xyes; then ax_gcc_arch="" if test "$cross_compiling" = no; then case $host_cpu in i[[3456]]86*|x86_64*) # use cpuid codes, in part from x86info-1.7 by D. 
Jones AX_GCC_X86_CPUID(0) AX_GCC_X86_CPUID(1) case $ax_cv_gcc_x86_cpuid_0 in *:756e6547:*:*) # Intel case $ax_cv_gcc_x86_cpuid_1 in *5[[48]]?:*:*:*) ax_gcc_arch="pentium-mmx pentium" ;; *5??:*:*:*) ax_gcc_arch=pentium ;; *6[[3456]]?:*:*:*) ax_gcc_arch="pentium2 pentiumpro" ;; *6a?:*[[01]]:*:*) ax_gcc_arch="pentium2 pentiumpro" ;; *6a?:*[[234]]:*:*) ax_gcc_arch="pentium3 pentiumpro" ;; *6[[9d]]?:*:*:*) ax_gcc_arch="pentium-m pentium3 pentiumpro" ;; *6[[78b]]?:*:*:*) ax_gcc_arch="pentium3 pentiumpro" ;; *6??:*:*:*) ax_gcc_arch=pentiumpro ;; *f3[[347]]:*:*:*|*f4[1347]:*:*:*) case $host_cpu in x86_64*) ax_gcc_arch="nocona pentium4 pentiumpro" ;; *) ax_gcc_arch="prescott pentium4 pentiumpro" ;; esac ;; *f??:*:*:*) ax_gcc_arch="pentium4 pentiumpro";; esac ;; *:68747541:*:*) # AMD case $ax_cv_gcc_x86_cpuid_1 in *5[[67]]?:*:*:*) ax_gcc_arch=k6 ;; *5[[8d]]?:*:*:*) ax_gcc_arch="k6-2 k6" ;; *5[[9]]?:*:*:*) ax_gcc_arch="k6-3 k6" ;; *60?:*:*:*) ax_gcc_arch=k7 ;; *6[[12]]?:*:*:*) ax_gcc_arch="athlon k7" ;; *6[[34]]?:*:*:*) ax_gcc_arch="athlon-tbird k7" ;; *67?:*:*:*) ax_gcc_arch="athlon-4 athlon k7" ;; *6[[68a]]?:*:*:*) AX_GCC_X86_CPUID(0x80000006) # L2 cache size case $ax_cv_gcc_x86_cpuid_0x80000006 in *:*:*[[1-9a-f]]??????:*) # (L2 = ecx >> 16) >= 256 ax_gcc_arch="athlon-xp athlon-4 athlon k7" ;; *) ax_gcc_arch="athlon-4 athlon k7" ;; esac ;; *f[[4cef8b]]?:*:*:*) ax_gcc_arch="athlon64 k8" ;; *f5?:*:*:*) ax_gcc_arch="opteron k8" ;; *f7?:*:*:*) ax_gcc_arch="athlon-fx opteron k8" ;; *f??:*:*:*) ax_gcc_arch="k8" ;; esac ;; *:746e6543:*:*) # IDT case $ax_cv_gcc_x86_cpuid_1 in *54?:*:*:*) ax_gcc_arch=winchip-c6 ;; *58?:*:*:*) ax_gcc_arch=winchip2 ;; *6[[78]]?:*:*:*) ax_gcc_arch=c3 ;; *69?:*:*:*) ax_gcc_arch="c3-2 c3" ;; esac ;; esac if test x"$ax_gcc_arch" = x; then # fallback case $host_cpu in i586*) ax_gcc_arch=pentium ;; i686*) ax_gcc_arch=pentiumpro ;; esac fi ;; sparc*) AC_PATH_PROG([PRTDIAG], [prtdiag], [prtdiag], [$PATH:/usr/platform/`uname -i`/sbin/:/usr/platform/`uname -m`/sbin/]) cputype=`(((grep cpu /proc/cpuinfo | cut -d: -f2) ; ($PRTDIAG -v |grep -i sparc) ; grep -i cpu /var/run/dmesg.boot ) | head -n 1) 2> /dev/null` cputype=`echo "$cputype" | tr -d ' -' |tr $as_cr_LETTERS $as_cr_letters` case $cputype in *ultrasparciv*) ax_gcc_arch="ultrasparc4 ultrasparc3 ultrasparc v9" ;; *ultrasparciii*) ax_gcc_arch="ultrasparc3 ultrasparc v9" ;; *ultrasparc*) ax_gcc_arch="ultrasparc v9" ;; *supersparc*|*tms390z5[[05]]*) ax_gcc_arch="supersparc v8" ;; *hypersparc*|*rt62[[056]]*) ax_gcc_arch="hypersparc v8" ;; *cypress*) ax_gcc_arch=cypress ;; esac ;; alphaev5) ax_gcc_arch=ev5 ;; alphaev56) ax_gcc_arch=ev56 ;; alphapca56) ax_gcc_arch="pca56 ev56" ;; alphapca57) ax_gcc_arch="pca57 pca56 ev56" ;; alphaev6) ax_gcc_arch=ev6 ;; alphaev67) ax_gcc_arch=ev67 ;; alphaev68) ax_gcc_arch="ev68 ev67" ;; alphaev69) ax_gcc_arch="ev69 ev68 ev67" ;; alphaev7) ax_gcc_arch="ev7 ev69 ev68 ev67" ;; alphaev79) ax_gcc_arch="ev79 ev7 ev69 ev68 ev67" ;; powerpc*) cputype=`((grep cpu /proc/cpuinfo | head -n 1 | cut -d: -f2 | cut -d, -f1 | sed 's/ //g') ; /usr/bin/machine ; /bin/machine; grep CPU /var/run/dmesg.boot | head -n 1 | cut -d" " -f2) 2> /dev/null` cputype=`echo $cputype | sed -e 's/ppc//g;s/ *//g'` case $cputype in *750*) ax_gcc_arch="750 G3" ;; *740[[0-9]]*) ax_gcc_arch="$cputype 7400 G4" ;; *74[[4-5]][[0-9]]*) ax_gcc_arch="$cputype 7450 G4" ;; *74[[0-9]][[0-9]]*) ax_gcc_arch="$cputype G4" ;; *970*) ax_gcc_arch="970 G5 power4";; *POWER4*|*power4*|*gq*) ax_gcc_arch="power4 970";; *POWER5*|*power5*|*gr*|*gs*) 
ax_gcc_arch="power5 power4 970";; 603ev|8240) ax_gcc_arch="$cputype 603e 603";; *) ax_gcc_arch=$cputype ;; esac ax_gcc_arch="$ax_gcc_arch powerpc" ;; esac fi # not cross-compiling fi # guess arch if test "x$ax_gcc_arch" != x -a "x$ax_gcc_arch" != xno; then for arch in $ax_gcc_arch; do if test "x[]m4_default([$1],yes)" = xyes; then # if we require portable code flags="-mtune=$arch" # -mcpu=$arch and m$arch generate nonportable code on every arch except # x86. And some other arches (e.g. Alpha) don't accept -mtune. Grrr. case $host_cpu in i*86|x86_64*) flags="$flags -mcpu=$arch -m$arch";; esac else flags="-march=$arch -mcpu=$arch -m$arch" fi for flag in $flags; do AX_CHECK_COMPILER_FLAGS($flag, [ax_cv_gcc_archflag=$flag; break]) done test "x$ax_cv_gcc_archflag" = xunknown || break done fi fi # $GCC=yes ]) AC_MSG_CHECKING([for gcc architecture flag]) AC_MSG_RESULT($ax_cv_gcc_archflag) if test "x$ax_cv_gcc_archflag" = xunknown; then m4_default([$3],:) else m4_default([$2], [CFLAGS="$CFLAGS $ax_cv_gcc_archflag"]) fi ]) whitedb-0.7.2/compile.bat000066400000000000000000000024241226454622500153000ustar00rootroot00000000000000@rem current version does not build reasoner: added later cl /Ox /W3 Main\wgdb.c Db\dbmem.c Db\dballoc.c Db\dbdata.c Db\dblock.c Db\dbtest.c DB\dbdump.c Db\dblog.c Db\dbhash.c Db\dbindex.c Db\dbcompare.c Db\dbquery.c Db\dbutil.c Db\dbmpool.c Db\dbjson.c Db\dbschema.c json\yajl_all.c cl /Ox /W3 Main\indextool.c Db\dbmem.c Db\dballoc.c Db\dbdata.c Db\dblock.c Db\dbtest.c Db\dblog.c Db\dbhash.c Db\dbindex.c Db\dbcompare.c Db\dbquery.c Db\dbutil.c Db\dbmpool.c Db\dbjson.c Db\dbschema.c json\yajl_all.c cl /Ox /W3 Main\stresstest.c Db\dbmem.c Db\dballoc.c Db\dbdata.c Db\dblock.c Db\dblog.c Db\dbhash.c Db\dbindex.c Db\dbcompare.c Db\dbmpool.c @rem build DLL. Currently we are not using it to link executables however. @rem unlike gcc build, it is necessary to have all functions declared in @rem wgdb.def file. 
Make sure it's up to date (should list same functions as @rem Db/dbapi.h) cl /Ox /W3 /MT /Fewgdb /LD Db\dbmem.c Db\dballoc.c Db\dbdata.c Db\dblock.c Db\dbtest.c DB\dbdump.c Db\dblog.c Db\dbhash.c Db\dbindex.c Db\dbcompare.c Db\dbquery.c Db\dbutil.c Db\dbmpool.c Db\dbjson.c Db\dbschema.c json\yajl_all.c /link /def:wgdb.def /incremental:no /MANIFEST:NO @rem Example of linking against wgdb.dll @rem cl /Ox /W3 Main\stresstest.c wgdb.lib cl /Ox /W3 Main\wgdb.c wgdb.lib whitedb-0.7.2/compile.sh000077500000000000000000000020121226454622500151400ustar00rootroot00000000000000#/bin/sh # alternative to compilation with automake/make: just run # current version does not build reasoner: added later [ -f config.h ] || cp config-gcc.h config.h if [ config-gcc.h -nt config.h ]; then echo "Warning: config.h is older than config-gcc.h, consider updating it" fi gcc -O2 -Wall -march=pentium4 -o Main/wgdb Main/wgdb.c Db/dbmem.c \ Db/dballoc.c Db/dbdata.c Db/dblock.c Db/dbindex.c Db/dbtest.c Db/dbdump.c \ Db/dblog.c Db/dbhash.c Db/dbcompare.c Db/dbquery.c Db/dbutil.c Db/dbmpool.c \ Db/dbjson.c Db/dbschema.c json/yajl_all.c -lm gcc -O2 -march=pentium4 -o Main/indextool Main/indextool.c Db/dbmem.c \ Db/dballoc.c Db/dbdata.c Db/dblock.c Db/dbindex.c Db/dbtest.c Db/dblog.c \ Db/dbhash.c Db/dbcompare.c Db/dbquery.c Db/dbutil.c Db/dbmpool.c \ Db/dbjson.c Db/dbschema.c json/yajl_all.c -lm gcc -O3 -Wall -march=pentium4 -pthread -o Main/stresstest Main/stresstest.c \ Db/dbmem.c Db/dballoc.c Db/dbdata.c Db/dblock.c Db/dbindex.c \ Db/dblog.c Db/dbhash.c Db/dbcompare.c Db/dbmpool.c -lm whitedb-0.7.2/config-gcc.h000066400000000000000000000042661226454622500153360ustar00rootroot00000000000000/* * XXX: put this file under license if needed * */ /** @file config-gcc.h * Build configuration for gcc platform. * * Based on auto-generated config.h. Should be manually synced whenever * additional configuration parameters are added. */ /* Use additional validation checks */ #define CHECK 1 /* Journal file directory */ #define DBLOG_DIR "/tmp" /* Select locking protocol (undef to disable locking) * 1 - reader preference spinlock * 2 - writer preference spinlock * 3 - task-fair queued lock */ #define LOCK_PROTO 1 /* Encoded data is 64-bit */ /* #undef HAVE_64BIT_GINT */ /* Define if you have POSIX threads libraries and header files. */ #define HAVE_PTHREAD 1 /* Compile with raptor rdf library */ /* #undef HAVE_RAPTOR */ /* Define to 1 if your C compiler doesn't accept -c and -o together. */ /* #undef NO_MINUS_C_MINUS_O */ /* Name of package */ #define PACKAGE "whitedb" /* Define to the address where bug reports for this package should be sent. */ #define PACKAGE_BUGREPORT "" /* Define to the full name of this package. */ #define PACKAGE_NAME "WhiteDB" /* Define to the full name and version of this package. */ #define PACKAGE_STRING "WhiteDB 0.7-alpha" /* Define to the one symbol short name of this package. */ #define PACKAGE_TARNAME "whitedb" /* Define to the version of this package. */ #define PACKAGE_VERSION "0.7-alpha" /* Define to necessary symbol if this constant uses a non-standard name on your system. 
*/ /* #undef PTHREAD_CREATE_JOINABLE */ /* String hash size (% of db size) */ #define STRHASH_SIZE 2 /* Use chained T-tree index nodes */ #define TTREE_CHAINED_NODES 1 /* Use single-compare T-tree mode */ #define TTREE_SINGLE_COMPARE 1 /* Use record banklinks */ #define USE_BACKLINKING 1 /* Enable child database support */ /* #undef USE_CHILD_DB */ /* Use dblog module for transaction logging */ /* #undef USE_DBLOG */ /* Use match templates for indexes */ #define USE_INDEX_TEMPLATE 1 /* Enable reasoner */ /* #undef USE_REASONER */ /* Version number of package */ #define VERSION "0.7-alpha" /* Package major version */ #define VERSION_MAJOR 0 /* Package minor version */ #define VERSION_MINOR 7 /* Package revision number */ #define VERSION_REV 0 whitedb-0.7.2/config-w32.h000066400000000000000000000043101226454622500152030ustar00rootroot00000000000000/* * XXX: put this file under license if needed * */ /** @file config-w32.h * Build configuration for Win32 platform. * * Based on auto-generated config.h. Should be manually synced whenever * additional configuration parameters are added. */ /* Use additional validation checks */ #define CHECK 1 /* Journal file directory */ #define DBLOG_DIR "c:\\windows\\temp" /* Select locking protocol (undef to disable locking) * 1 - reader preference spinlock * 2 - writer preference spinlock * 3 - task-fair queued lock */ #define LOCK_PROTO 1 /* Encoded data is 64-bit */ /* #undef HAVE_64BIT_GINT */ /* Define if you have POSIX threads libraries and header files. */ /* #undef HAVE_PTHREAD */ /* Compile with raptor rdf library */ /* #undef HAVE_RAPTOR */ /* Define to 1 if your C compiler doesn't accept -c and -o together. */ /* #undef NO_MINUS_C_MINUS_O */ /* Name of package */ #define PACKAGE "whitedb" /* Define to the address where bug reports for this package should be sent. */ #define PACKAGE_BUGREPORT "" /* Define to the full name of this package. */ #define PACKAGE_NAME "WhiteDB" /* Define to the full name and version of this package. */ #define PACKAGE_STRING "WhiteDB 0.7-alpha" /* Define to the one symbol short name of this package. */ #define PACKAGE_TARNAME "whitedb" /* Define to the version of this package. */ #define PACKAGE_VERSION "0.7-alpha" /* Define to necessary symbol if this constant uses a non-standard name on your system. */ /* #undef PTHREAD_CREATE_JOINABLE */ /* String hash size (% of db size) */ #define STRHASH_SIZE 2 /* Use chained T-tree index nodes */ #define TTREE_CHAINED_NODES 1 /* Use single-compare T-tree mode */ #define TTREE_SINGLE_COMPARE 1 /* Use record banklinks */ #define USE_BACKLINKING 1 /* Enable child database support */ /* #undef USE_CHILD_DB */ /* Use dblog module for transaction logging */ /* #undef USE_DBLOG */ /* Use match templates for indexes */ #define USE_INDEX_TEMPLATE 1 /* Enable reasoner */ /* #undef USE_REASONER */ /* Version number of package */ #define VERSION "0.7-alpha" /* Package major version */ #define VERSION_MAJOR 0 /* Package minor version */ #define VERSION_MINOR 7 /* Package revision number */ #define VERSION_REV 0 whitedb-0.7.2/configure.ac000066400000000000000000000203741226454622500154520ustar00rootroot00000000000000# Process this file with autoconf to produce a configure script. 
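# Typical invocation once configure has been generated (a sketch only; the
# option names used here are defined further below in this file and all
# have defaults):
#   ./configure --enable-locking=tfqueue --enable-logging --with-logdir=/tmp --with-python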
# $Id: $ # $Source: $ # ------- Initialisation ------- m4_define([WHITEDB_MAJOR], [0]) m4_define([WHITEDB_MINOR], [7]) m4_define([WHITEDB_REV], [0]) # standard release #m4_define([WHITEDB_VERSION], # m4_defn([WHITEDB_MAJOR]).m4_defn([WHITEDB_MINOR]).m4_defn([WHITEDB_REV])) # release candidate; fill in manually m4_define([WHITEDB_VERSION], [0.7-alpha])) AC_INIT(WhiteDB, [WHITEDB_VERSION]) AC_MSG_NOTICE([====== initialising ======]) # Add new configuration files AC_CONFIG_SRCDIR(Db/dballoc.c) AC_CONFIG_HEADER(config.h) AM_INIT_AUTOMAKE #Initialize libtool AC_PROG_LIBTOOL # ------- Checking ------- AC_MSG_NOTICE([====== checking ======]) AC_PROG_CC AM_PROG_CC_C_O # for per-target flags #Checks for libraries. ACX_PTHREAD() AC_CHECK_LIB([m],[cos]) #Check for programs AC_PROG_INSTALL #Check for Python (optional) AC_ARG_WITH([python], [AS_HELP_STRING([--with-python], [enable building Python bindings])], [], [with_python=no]) if test "x$with_python" != "xno" then if test "x$with_python" != "xyes" then PYTHON="$with_python" fi AM_PATH_PYTHON(,,[]) else PYTHON="" fi if test "x$PYTHON" != "x" then AM_CHECK_PYTHON_HEADERS(,[PYTHON=[]]) fi # If PYTHON is non-empty, contents of Python subdir will be # included in the build. This also implies that the check for # headers was successful and PYTHON_INCLUDES contains something useful. AM_CONDITIONAL(PYTHON, [test "x$PYTHON" != "x"]) #Check for Raptor (optional) AC_CHECK_PROGS(RAPTOR_CONFIG, raptor-config, []) AM_CONDITIONAL(RAPTOR, [test "x$RAPTOR_CONFIG" != "x"]) if test "x$RAPTOR_CONFIG" != "x" then AC_DEFINE([HAVE_RAPTOR], [1], [Compile with raptor rdf library]) fi # lex and bison (needed for reasoner) AC_PROG_LEX AC_CHECK_TOOL([BISON], [bison], [:]) # futex availability AC_CHECK_HEADER(linux/futex.h, [futex=yes], [AC_MSG_RESULT([Futexes not supported, tfqueue locks not available])] ) # Set the journal directory AC_ARG_WITH(logdir, [AC_HELP_STRING([--with-logdir=DIR], [Journal file directory @<:@default=/tmp@:>@])], with_logdir="$withval", with_logdir="/tmp") if test "x$with_logdir" = "xyes" -o "x$with_logdir" = "xno" then # define it anyway, even if the user picked --without LOGDIR="/tmp" else LOGDIR="$with_logdir" fi AC_DEFINE_UNQUOTED([DBLOG_DIR], "$LOGDIR", Journal file directory) # ----------- configuration options ---------- AC_MSG_NOTICE([====== setting configuration options ======]) AC_MSG_CHECKING(for logging) AC_ARG_ENABLE(logging, [AS_HELP_STRING([--enable-logging], [enable transaction logging])], [logging=$enable_logging],logging=no) if test "$logging" != no then AC_DEFINE([USE_DBLOG], [1], [Use dblog module for transaction logging]) AC_MSG_RESULT([enabled, journal directory is $LOGDIR]) else AC_MSG_RESULT(disabled) fi AC_MSG_CHECKING(for locking protocol) AC_ARG_ENABLE(locking, [AS_HELP_STRING([--enable-locking], [select locking protocol (rpspin,wpspin,tfqueue,no) @<:@default=tfqueue@:>@])], [locking=$enable_locking],locking=tfqueue) if test "$locking" == no then AC_MSG_RESULT(disabled) elif test "$locking" == wpspin then AC_DEFINE([LOCK_PROTO], [2], [Select locking protocol: writer-preference spinlock]) AC_MSG_RESULT([wpspin]) elif test "$locking" == tfqueue -a "$futex" == "yes" then AC_DEFINE([LOCK_PROTO], [3], [Select locking protocol: task-fair queued lock]) AC_MSG_RESULT([tfqueue]) else # unknown or unsupported value, revert to default AC_DEFINE([LOCK_PROTO], [1], [Select locking protocol: reader-preference spinlock]) AC_MSG_RESULT([rpspin]) fi AC_MSG_CHECKING(for additional validation checks) AC_ARG_ENABLE(checking, 
[AS_HELP_STRING([--disable-checking], [disable additional validation checks in API layer (small performance gain) ])], [checking=$enable_checking],checking=yes) if test "$checking" != no then AC_DEFINE([CHECK], [1], [Use additional validation checks]) AC_MSG_RESULT(enabled) else AC_MSG_RESULT(disabled) fi AC_MSG_CHECKING(for single-compare T-tree mode) AC_ARG_ENABLE(single_compare, [AS_HELP_STRING([--disable-single-compare], [disable experimental single compare algorithm in T-tree search])], [single_compare=$enable_single_compare],single_compare=yes) if test "$single_compare" != no then AC_DEFINE([TTREE_SINGLE_COMPARE], [1], [Use single-compare T-tree mode]) AC_MSG_RESULT(enabled) else AC_MSG_RESULT(disabled) fi AC_MSG_CHECKING(for chained T-tree nodes algorithm) AC_ARG_ENABLE(tstar, [AS_HELP_STRING([--disable-tstar], [disable experimental chained T-tree (similar to T* tree) index algorithm])], [tstar=$enable_tstar],tstar=yes) if test "$tstar" != no then AC_DEFINE([TTREE_CHAINED_NODES], [1], [Use chained T-tree index nodes]) AC_MSG_RESULT(enabled) else AC_MSG_RESULT(disabled) fi AC_MSG_CHECKING(for backlinking) AC_ARG_ENABLE(backlink, [AS_HELP_STRING([--disable-backlink], [disable record backlinking])], [backlink=$enable_backlink],backlink=yes) if test "$backlink" != no then AC_DEFINE([USE_BACKLINKING], [1], [Use record banklinks]) AC_MSG_RESULT(enabled) else AC_MSG_RESULT(disabled) fi AC_MSG_CHECKING(for child db support) AC_ARG_ENABLE(childdb, [AS_HELP_STRING([--enable-childdb], [enable child database support])], [childdb=$enable_childdb],childdb=no) if test "$childdb" != no then AC_DEFINE([USE_CHILD_DB], [1], [Enable child database support]) AC_MSG_RESULT(enabled) else AC_MSG_RESULT(disabled) fi AC_MSG_CHECKING(for index templates) AC_ARG_ENABLE(index_templates, [AS_HELP_STRING([--disable-index-templates], [disable support for index templates])], [index_templates=$enable_index_templates],index_templates=yes) if test "$index_templates" != no then AC_DEFINE([USE_INDEX_TEMPLATE], [1], [Use match templates for indexes]) AC_MSG_RESULT(enabled) else AC_MSG_RESULT(disabled) fi AC_MSG_CHECKING(for reasoner) AC_ARG_ENABLE(reasoner, [AS_HELP_STRING([--enable-reasoner], [enable reasoner])], [reasoner=$enable_reasoner],reasoner=no) if test "$reasoner" != no then if test "x$LEX" != "x:" -a "x$BISON" != "x:" then AC_DEFINE([USE_REASONER], [1], [Enable reasoner]) AC_MSG_RESULT(enabled) else AC_MSG_RESULT([disabled, bison or lex missing]) reasoner=no fi else AC_MSG_RESULT(disabled) fi AM_CONDITIONAL(REASONER, [test "$reasoner" != no]) AC_MSG_CHECKING(string hash size) AC_ARG_ENABLE(strhash_size, [AS_HELP_STRING([--enable-strhash-size], [set string hash size (% of db size) @<:@default=2@:>@])], [strhash_size=$enable_strhash_size],strhash_size=2) if test "x$strhash_size" = xyes -o "x$strhash_size" = xno -o "x$strhash_size" = x then AC_DEFINE([STRHASH_SIZE], [2], [Default string hash size (2% of db size)]) AC_MSG_RESULT([2]) else AC_DEFINE_UNQUOTED([STRHASH_SIZE], $strhash_size, [String hash size (% of db size)]) AC_MSG_RESULT($strhash_size) fi # ---------- Compiler flags -------- AC_MSG_NOTICE([====== setting compiler flags ======]) auto_cflags="-Wall" AX_GCC_ARCHFLAG([no]) # add -m64 if possible (amd64) restore_cflags="$CFLAGS" CFLAGS="$CFLAGS -m64" AC_MSG_CHECKING([whether CC supports -m64]) AC_COMPILE_IFELSE([AC_LANG_PROGRAM([])], [AC_MSG_RESULT([yes])] [auto_cflags="$auto_cflags -m64"] [AC_DEFINE([HAVE_64BIT_GINT], [1], [Encoded data is 64-bit])], [AC_MSG_RESULT([no])] ) CFLAGS="$restore_cflags" # 
XXX: this is redundant now that we don't clobber CFLAGS anymore, # may be removed any time. AC_MSG_CHECKING(for gprof) AC_ARG_ENABLE(gprof, [AS_HELP_STRING([--enable-gprof], [enable memory profiling with gprof])], gprof=yes,gprof=no) if test "$gprof" != no then auto_cflags="-pg $auto_cflags" AC_MSG_RESULT(enabled) else AC_MSG_RESULT(no) fi AC_SUBST([AM_CFLAGS], ["$auto_cflags"]) # ---------- Final creation -------- AC_MSG_NOTICE([====== final steps ======]) AC_DEFINE([VERSION_MAJOR], [WHITEDB_MAJOR], [Package major version]) AC_DEFINE([VERSION_MINOR], [WHITEDB_MINOR], [Package minor version]) AC_DEFINE([VERSION_REV], [WHITEDB_REV], [Package revision number]) AC_OUTPUT([ Makefile Db/Makefile json/Makefile Main/Makefile Examples/Makefile Python/Makefile Parser/Makefile Printer/Makefile Reasoner/Makefile ]) whitedb-0.7.2/gendoc.sh000077500000000000000000000022341226454622500147550ustar00rootroot00000000000000#!/bin/sh # # run with ASCIIDOC_XSL=/path/to/stylesheets ./gendoc.sh [ destdir ] # # ASCIIDOC_XSL=/usr/share/asciidoc/docbook-xsl/ ./gendoc.sh # default stylesheets path [ "X${ASCIIDOC_XSL}" = "X" ] && ASCIIDOC_XSL="/etc/asciidoc/docbook-xsl" # destination directory DESTDIR=$1 [ "X${DESTDIR}" = "X" ] && DESTDIR=Doc iconv -f latin1 -t utf-8 Doc/Manual.txt |\ sed -f Doc/Manual2html.sed |\ asciidoc -b docbook - |\ xsltproc --nonet ${ASCIIDOC_XSL}/xhtml.xsl - > ${DESTDIR}/Manual.html iconv -f latin1 -t utf-8 Doc/python.txt |\ sed -f Doc/python2html.sed |\ asciidoc -b docbook - |\ xsltproc --nonet ${ASCIIDOC_XSL}/xhtml.xsl - > ${DESTDIR}/python.html iconv -f latin1 -t utf-8 Doc/Tutorial.txt |\ sed -f Doc/Tutorial2html.sed |\ asciidoc -b docbook - |\ xsltproc --nonet ${ASCIIDOC_XSL}/xhtml.xsl - > ${DESTDIR}/Tutorial.html iconv -f latin1 -t utf-8 Doc/Install.txt |\ asciidoc -b docbook - |\ xsltproc --nonet ${ASCIIDOC_XSL}/xhtml.xsl - > ${DESTDIR}/Install.html iconv -f latin1 -t utf-8 Doc/Utilities.txt |\ asciidoc -b docbook - |\ xsltproc --nonet ${ASCIIDOC_XSL}/xhtml.xsl - > ${DESTDIR}/Utilities.html # add Doxygen / PyDoc generation here as needed whitedb-0.7.2/java/000077500000000000000000000000001226454622500140775ustar00rootroot00000000000000whitedb-0.7.2/java/jni/000077500000000000000000000000001226454622500146575ustar00rootroot00000000000000whitedb-0.7.2/java/jni/Readme.txt000066400000000000000000000036311226454622500166200ustar00rootroot00000000000000WhiteDB JNI wrapper =================== The JNI wrapper provides the WhiteDB bindings for Java. It is still under development and not usable for general application. Compiling under Windows environment ----------------------------------- Scripts should be executed in the order given here. Please run each script in the directory they are located. You may need to edit the scripts to adjust the paths to your environment; specific instructions below. `setEnv.bat` This file sets all needed paths and variables for compiling and running all jni classes. Line 1: `javac`, `javah` and `java` location in your computer will be added to path. Line 2: you will be directed to your working directory. Line 3: path to jni library (in windows, dll file) will be added to your path variable. Line 4: 'vcvarsall.bat' run. Its needed to use Visual Studios c compiler 'cl.exe'. `compileJava.bat` This script file compiles 'WhiteDB.java' source file and generates .h file for jni from compiled class file. `compileBridge.bat` This script will compile c source code ('cl.exe' flag '/LD') into dll file including java sdk's jni needed files ('cl.exe' flag '/I'). 
Change path of 'cl.exe' and java sdk includes to paths where they can be found in your computer. Copy 'config-w32.h' to 'java\jni' directory before running this script. `runJava.bat` This script runs WhiteDB class file. Compiling under Unix-like environment ------------------------------------- Scripts should be executed in the order given here. Please run each script in the directory they are located. Set JAVA_HOME before executing the scripts. `compile_java.sh` This script file compiles 'WhiteDB.java' source file and generates the header file for jni from compiled class file. `compile_bridge.sh` Compile the shared library that wraps the WhiteDB functions for Java. `run_java.sh` Runs the tests in the WhiteDB class. whitedb-0.7.2/java/jni/classes/000077500000000000000000000000001226454622500163145ustar00rootroot00000000000000whitedb-0.7.2/java/jni/classes/.gitignore000066400000000000000000000000001226454622500202720ustar00rootroot00000000000000whitedb-0.7.2/java/jni/compileBridge.bat000066400000000000000000000007061226454622500201170ustar00rootroot00000000000000cd library "C:\Program Files\Microsoft Visual Studio 9.0\VC\bin\cl.exe" /LD /I"C:\Program Files\Java\jdk1.7.0_25\include" /I"C:\Program Files\Java\jdk1.7.0_25\include\win32" /W3 ..\src\native\whitedbDriver.c ..\..\..\Db\dbmem.c ..\..\..\Db\dballoc.c ..\..\..\Db\dbdata.c ..\..\..\Db\dbindex.c ..\..\..\Db\dbtest.c ..\..\..\Db\dbquery.c ..\..\..\Db\dbutil.c ..\..\..\Db\dbmpool.c ..\..\..\Db\dblock.c ..\..\..\Db\dbcompare.c ..\..\..\Db\dbhash.c cd .. whitedb-0.7.2/java/jni/compileJava.bat000066400000000000000000000002741226454622500176040ustar00rootroot00000000000000javac -cp ./src -d classes src\whitedb\driver\WhiteDB.java javac -cp ./src -d classes src\whitedb\driver\tests.java javah -classpath ./classes -d src/native -jni whitedb.driver.WhiteDB whitedb-0.7.2/java/jni/compile_bridge.sh000066400000000000000000000007761226454622500201710ustar00rootroot00000000000000#!/bin/sh # [ "X${JAVA_HOME}" = "X" ] && JAVA_HOME=/usr/lib/jvm/java-6-openjdk DBDIR=../../../Db cd library gcc -O2 -march=pentium4 -lm -shared -I${JAVA_HOME}/include ../src/native/whitedbDriver.c ${DBDIR}/dbmem.c ${DBDIR}/dballoc.c ${DBDIR}/dbdata.c ${DBDIR}/dblock.c ${DBDIR}/dbindex.c ${DBDIR}/dbtest.c ${DBDIR}/dblog.c ${DBDIR}/dbhash.c ${DBDIR}/dbcompare.c ${DBDIR}/dbquery.c ${DBDIR}/dbutil.c ${DBDIR}/dbmpool.c ${DBDIR}/dbschema.c ${DBDIR}/dbjson.c ${DBDIR}/../json/yajl_all.c -o libwhitedbDriver.so whitedb-0.7.2/java/jni/compile_java.sh000066400000000000000000000003051226454622500176420ustar00rootroot00000000000000#!/bin/sh javac -cp ./src -d classes src/whitedb/driver/WhiteDB.java javac -cp ./src -d classes src/whitedb/driver/tests.java javah -classpath ./classes -d src/native -jni whitedb.driver.WhiteDB whitedb-0.7.2/java/jni/config.h000077700000000000000000000000001226454622500203342../../config.hustar00rootroot00000000000000whitedb-0.7.2/java/jni/library/000077500000000000000000000000001226454622500163235ustar00rootroot00000000000000whitedb-0.7.2/java/jni/library/.gitignore000066400000000000000000000000001226454622500203010ustar00rootroot00000000000000whitedb-0.7.2/java/jni/runJava.bat000066400000000000000000000000571226454622500167570ustar00rootroot00000000000000java -classpath ./classes whitedb.driver.tests whitedb-0.7.2/java/jni/run_java.sh000066400000000000000000000001341226454622500170160ustar00rootroot00000000000000#!/bin/sh java -classpath ./classes -Djava.library.path=./library \ whitedb.driver.tests 
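# Note (assumption based on the sibling scripts): compile_java.sh must have
# populated ./classes and compile_bridge.sh must have built
# library/libwhitedbDriver.so before this script will run.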
whitedb-0.7.2/java/jni/setEnv.bat000066400000000000000000000002621226454622500166130ustar00rootroot00000000000000PATH=%PATH%;C:\Program Files\Java\jdk1.7.0_25\bin\; REM cd C:\path\to\whitedb\java\jni\ PATH=%path%;./library "C:\Program Files\Microsoft Visual Studio 9.0\VC\vcvarsall.bat" whitedb-0.7.2/java/jni/src/000077500000000000000000000000001226454622500154465ustar00rootroot00000000000000whitedb-0.7.2/java/jni/src/native/000077500000000000000000000000001226454622500167345ustar00rootroot00000000000000whitedb-0.7.2/java/jni/src/native/whitedbDriver.c000066400000000000000000000323601226454622500217060ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Andres Puusepp 2009 * Copyright (c) Priit Järv 2013 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file whitedbDriver.c * JNI native methods for WhiteDB. * */ #include "whitedb_driver_WhiteDB.h" #include "../../../../Db/dballoc.h" #include "../../../../Db/dbmem.h" #include "../../../../Db/dbdata.h" #include "../../../../Db/dbquery.h" #ifdef _WIN32 #include "../../config-w32.h" #else #include "../../config.h" #endif #include #if 0 void* get_database_from_java_object(JNIEnv *env, jobject database) { jclass clazz; jfieldID fieldID; jlong pointer; clazz = (*env)->FindClass(env, "whitedb/holder/Database"); fieldID = (*env)->GetFieldID(env, clazz, "pointer", "J"); pointer = (*env)->GetLongField(env, database, fieldID); return (void*)pointer; } void* get_record_from_java_object(JNIEnv *env, jobject record) { jclass clazz; jfieldID fieldID; jlong pointer; clazz = (*env)->FindClass(env, "whitedb/holder/Record"); fieldID = (*env)->GetFieldID(env, clazz, "pointer", "J"); pointer = (*env)->GetLongField(env, record, fieldID); return (void*)pointer; } #endif jobject create_database_record_for_java(JNIEnv *env, void* recordPointer) { jclass clazz; jmethodID methodID; jobject item; jfieldID fieldID; clazz = (*env)->FindClass(env, "whitedb/holder/Record"); methodID = (*env)->GetMethodID(env, clazz, "", "()V"); item = (*env)->NewObject(env, clazz, methodID, NULL); fieldID = (*env)->GetFieldID(env, clazz, "pointer", "J"); (*env)->SetLongField(env, item, fieldID, (jlong)recordPointer); return item; } JNIEXPORT jobject JNICALL Java_whitedb_driver_WhiteDB_getDatabase(JNIEnv *env, jobject obj, jstring shmname, jint size, jboolean local) { jclass clazz; jmethodID methodID; jobject item; jfieldID fieldID; jlong shmptr; const char *shmnamep = NULL; /* JNI wants const here */ if(local) { shmptr = (jlong) wg_attach_local_database((int) size); } else { if(shmname) shmnamep = (*env)->GetStringUTFChars(env, shmname, 0); shmptr = (jlong) wg_attach_database((char *) shmnamep, (int) size); } clazz = (*env)->FindClass(env, "whitedb/holder/Database"); methodID = (*env)->GetMethodID(env, clazz, "", "()V"); item = (*env)->NewObject(env, clazz, methodID, NULL); fieldID = (*env)->GetFieldID(env, clazz, "pointer", "J"); (*env)->SetLongField(env, item, fieldID, shmptr); if(shmnamep) 
(*env)->ReleaseStringUTFChars(env, shmname, shmnamep); return item; } JNIEXPORT jint JNICALL Java_whitedb_driver_WhiteDB_deleteDatabase(JNIEnv *env, jobject obj, jstring shmname) { jboolean ret; const char *shmnamep = NULL; if(shmname) shmnamep = (*env)->GetStringUTFChars(env, shmname, 0); ret = wg_delete_database((char *) shmnamep); if(shmnamep) (*env)->ReleaseStringUTFChars(env, shmname, shmnamep); return ret; } JNIEXPORT void JNICALL Java_whitedb_driver_WhiteDB_deleteLocalDatabase(JNIEnv *env, jobject obj, jlong dbptr) { wg_delete_local_database((void *) dbptr); } JNIEXPORT jint JNICALL Java_whitedb_driver_WhiteDB_detachDatabase(JNIEnv *env, jobject obj, jlong dbptr ) { return wg_detach_database((void *) dbptr); } JNIEXPORT jobject JNICALL Java_whitedb_driver_WhiteDB_createRecord (JNIEnv *env, jobject obj, jlong dbptr, jint fieldcount) { void* record; record = wg_create_record((void *) dbptr, (int)fieldcount); return create_database_record_for_java(env, record); } JNIEXPORT jobject JNICALL Java_whitedb_driver_WhiteDB_getFirstRecord (JNIEnv *env, jobject obj, jlong dbptr) { void* record; record = wg_get_first_record((void *) dbptr); return create_database_record_for_java(env, record); } JNIEXPORT jobject JNICALL Java_whitedb_driver_WhiteDB_getNextRecord (JNIEnv *env, jobject obj, jlong dbptr, jlong rptr) { void* record; record = wg_get_next_record((void *) dbptr, (void *) rptr); if(record == NULL) { return NULL; } return create_database_record_for_java(env, record); } JNIEXPORT jint JNICALL Java_whitedb_driver_WhiteDB_deleteRecord (JNIEnv *env, jobject obj, jlong dbptr, jlong rptr) { return wg_delete_record((void *) dbptr, (void *) rptr); } JNIEXPORT jint JNICALL Java_whitedb_driver_WhiteDB_getRecordLength (JNIEnv *env, jobject obj, jlong dbptr, jlong rptr) { return wg_get_record_len((void *) dbptr, (void *) rptr); } JNIEXPORT jint JNICALL Java_whitedb_driver_WhiteDB_setRecordIntField (JNIEnv *env, jobject obj, jlong dbptr, jlong rptr, jint field, jint value) { return wg_set_int_field((void *) dbptr, (void *) rptr, (int)field, (int)value); } JNIEXPORT jint JNICALL Java_whitedb_driver_WhiteDB_getIntFieldValue (JNIEnv *env, jobject obj, jlong dbptr, jlong rptr, jint field) { void* database; database = (void *) dbptr; return wg_decode_int(database, wg_get_field(database, (void *) rptr, (int)field)); } JNIEXPORT jint JNICALL Java_whitedb_driver_WhiteDB_setRecordStringField (JNIEnv *env, jobject obj, jlong dbptr, jlong rptr, jint field, jstring value) { int result; const char *valuep = NULL; if(value) valuep = (*env)->GetStringUTFChars(env, value, 0); if(!valuep) return -1; result = wg_set_str_field((void *) dbptr, (void *) rptr, (int)field, (char *)valuep); (*env)->ReleaseStringUTFChars(env, value, valuep); return result; } JNIEXPORT jstring JNICALL Java_whitedb_driver_WhiteDB_getStringFieldValue (JNIEnv *env, jobject obj, jlong dbptr, jlong rptr, jint field) { void* database; gint enc; char* str = NULL; database = (void *) dbptr; enc = wg_get_field(database, (void *) rptr, (int)field); if(enc != WG_ILLEGAL) { str = wg_decode_str(database, enc); } if(str) { return (*env)->NewStringUTF(env, (const char *) str); } else { return NULL; } } JNIEXPORT jint JNICALL Java_whitedb_driver_WhiteDB_setRecordBlobField (JNIEnv *env, jobject obj, jlong dbptr, jlong rptr, jint field, jbyteArray value) { void* database; size_t arraylen, result; gint enc; jbyte *valuep = NULL; if(value) valuep = (*env)->GetByteArrayElements(env, value, 0); if(!valuep) return -1; database = (void *) dbptr; arraylen = 
(*env)->GetArrayLength(env, value); enc = wg_encode_blob(database, (char *) valuep, NULL, arraylen); if(enc != WG_ILLEGAL) { result = wg_set_field(database, (void *) rptr, (int)field, enc); } else { result = -1; } (*env)->ReleaseByteArrayElements(env, value, valuep, 0); return result; } JNIEXPORT jbyteArray JNICALL Java_whitedb_driver_WhiteDB_getBlobFieldValue (JNIEnv *env, jobject obj, jlong dbptr, jlong rptr, jint field) { void* database; size_t arraylen = 0; gint enc; char* str = NULL; jbyteArray result; database = (void *) dbptr; enc = wg_get_field(database, (void *) rptr, (int)field); if(enc != WG_ILLEGAL) { str = wg_decode_blob(database, enc); arraylen = wg_decode_blob_len(database, enc); } if(str) { result = (*env)->NewByteArray(env, arraylen); if(result) (*env)->SetByteArrayRegion(env, result, 0, arraylen, (const jbyte *) str); return result; } else { return NULL; } } gint map_cond(jint cond) { /* Robust method of mapping constants. This way redefining * something on either side doesn't break. */ switch(cond) { case whitedb_driver_WhiteDB_COND_EQUAL: return WG_COND_EQUAL; case whitedb_driver_WhiteDB_COND_NOT_EQUAL: return WG_COND_NOT_EQUAL; case whitedb_driver_WhiteDB_COND_LESSTHAN: return WG_COND_LESSTHAN; case whitedb_driver_WhiteDB_COND_GREATER: return WG_COND_GREATER; case whitedb_driver_WhiteDB_COND_LTEQUAL: return WG_COND_LTEQUAL; case whitedb_driver_WhiteDB_COND_GTEQUAL: return WG_COND_GTEQUAL; default: break; } return -1; } JNIEXPORT jobject JNICALL Java_whitedb_driver_WhiteDB_makeQuery(JNIEnv *env, jobject obj, jlong dbptr, jlong matchrecptr, jobjectArray arglistobj, jlong rowlimit) { jclass clazz; jmethodID methodID; jfieldID fieldID; jobject item = NULL; void *database; wg_query *query; void *matchrec = NULL; wg_query_arg *argv = NULL; int argc = 0, i; database = (void *) dbptr; if(matchrecptr) { matchrec = (void *) matchrecptr; } else if(arglistobj) { jfieldID column_id, cond_id, value_id; argc = (*env)->GetArrayLength(env, arglistobj); argv = malloc(sizeof(wg_query_arg) * argc); if(!argv) { return NULL; } clazz = (*env)->FindClass(env, "whitedb/util/ArgListEntry"); column_id = (*env)->GetFieldID(env, clazz, "column", "I"); cond_id = (*env)->GetFieldID(env, clazz, "cond", "I"); value_id = (*env)->GetFieldID(env, clazz, "value", "I"); for(i=0; iGetObjectArrayElement(env, arglistobj, i); argv[i].column = \ (gint) (*env)->GetIntField(env, argobj, column_id); argv[i].cond = map_cond( (*env)->GetIntField(env, argobj, cond_id)); argv[i].value = wg_encode_query_param_int(database, (gint) (*env)->GetIntField(env, argobj, value_id)); } } if(rowlimit > 0) query = wg_make_query_rc(database, matchrec, 0, argv, argc, rowlimit); else query = wg_make_query(database, matchrec, 0, argv, argc); if(query) { clazz = (*env)->FindClass(env, "whitedb/holder/Query"); methodID = (*env)->GetMethodID(env, clazz, "", "()V"); item = (*env)->NewObject(env, clazz, methodID, NULL); fieldID = (*env)->GetFieldID(env, clazz, "query", "J"); (*env)->SetLongField(env, item, fieldID, (jlong)query); fieldID = (*env)->GetFieldID(env, clazz, "arglist", "J"); (*env)->SetLongField(env, item, fieldID, (jlong)argv); fieldID = (*env)->GetFieldID(env, clazz, "argc", "I"); (*env)->SetIntField(env, item, fieldID, argc); } return item; } JNIEXPORT void JNICALL Java_whitedb_driver_WhiteDB_freeQuery(JNIEnv *env, jobject obj, jlong dbptr, jobject queryobj) { jclass clazz; jfieldID fieldID; jlong pointer; void *database; wg_query *query = NULL; wg_query_arg *arglist = NULL; int argc = 0, i; database = (void *) dbptr; clazz 
= (*env)->FindClass(env, "whitedb/holder/Query"); fieldID = (*env)->GetFieldID(env, clazz, "arglist", "J"); pointer = (*env)->GetLongField(env, queryobj, fieldID); if(pointer) { arglist = (wg_query_arg *) pointer; fieldID = (*env)->GetFieldID(env, clazz, "argc", "I"); argc = (*env)->GetIntField(env, queryobj, fieldID); for(i=0; iGetFieldID(env, clazz, "query", "J"); pointer = (*env)->GetLongField(env, queryobj, fieldID); query = (wg_query *) pointer; if(query) wg_free_query(database, query); } JNIEXPORT jobject JNICALL Java_whitedb_driver_WhiteDB_fetchQuery(JNIEnv *env, jobject obj, jlong dbptr, jlong queryptr) { wg_query *query; void *rec = NULL; query = (wg_query *) queryptr; if(!query) return NULL; rec = wg_fetch((void *) dbptr, query); if(!rec) return NULL; return create_database_record_for_java(env, rec); } JNIEXPORT jlong JNICALL Java_whitedb_driver_WhiteDB_startRead(JNIEnv *env, jobject obj, jlong dbptr) { return (jlong) wg_start_read((void *) dbptr); } JNIEXPORT jlong JNICALL Java_whitedb_driver_WhiteDB_endRead(JNIEnv *env, jobject obj, jlong dbptr, jlong lock) { return (jlong) wg_end_read((void *) dbptr, lock); } JNIEXPORT jlong JNICALL Java_whitedb_driver_WhiteDB_startWrite(JNIEnv *env, jobject obj, jlong dbptr) { return (jlong) wg_start_write((void *) dbptr); } JNIEXPORT jlong JNICALL Java_whitedb_driver_WhiteDB_endWrite(JNIEnv *env, jobject obj, jlong dbptr, jlong lock) { return (jlong) wg_end_write((void *) dbptr, lock); } whitedb-0.7.2/java/jni/src/whitedb/000077500000000000000000000000001226454622500170745ustar00rootroot00000000000000whitedb-0.7.2/java/jni/src/whitedb/driver/000077500000000000000000000000001226454622500203675ustar00rootroot00000000000000whitedb-0.7.2/java/jni/src/whitedb/driver/WhiteDB.java000066400000000000000000000223021226454622500225170ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Andres Puusepp 2009 * Copyright (c) Priit Järv 2013 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file WhiteDB.java * Java API for WhiteDB. 
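 *
 * A minimal usage sketch (assuming the native wrapper library is on
 * java.library.path; see tests.java for a complete example):
 *
 *   WhiteDB db = new WhiteDB(500000, true);      // 500k local database
 *   Record rec = db.createRecord(2);
 *   db.setRecordIntField(rec, 0, 42);
 *   db.setRecordStringField(rec, 1, "hello");
 *   db.close();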
* */ package whitedb.driver; import whitedb.holder.Database; import whitedb.holder.Record; import whitedb.holder.Query; import whitedb.util.FieldComparator; import whitedb.util.ArgListEntry; import java.lang.reflect.Field; import java.util.Collections; import java.util.Arrays; public class WhiteDB { /**************************** Constants ****************************/ public static final int COND_EQUAL = 1; public static final int COND_NOT_EQUAL = 2; public static final int COND_LESSTHAN = 4; public static final int COND_GREATER = 8; public static final int COND_LTEQUAL = 16; public static final int COND_GTEQUAL = 32; /************************** Native methods **************************/ /* * Db connection: encapsulate in class */ private native Database getDatabase(String shmname, int size, boolean local); private native void deleteLocalDatabase(long dbptr); private native int detachDatabase(long dbptr); /* * Db management: public */ public native int deleteDatabase(String shmname); /* * Record handling: wrapped in Java functions */ private native Record createRecord(long dbptr, int fieldCount); private native Record getFirstRecord(long dbptr); private native Record getNextRecord(long dbptr, long rptr); private native int deleteRecord(long dbptr, long rptr); private native int getRecordLength(long dbptr, long rptr); /* * Read/write field data: wrapped in Java functions */ private native int setRecordIntField(long dbptr, long rptr, int field, int value); private native int getIntFieldValue(long dbptr, long rptr, int field); private native int setRecordStringField(long dbptr, long rptr, int field, String value); private native String getStringFieldValue(long dbptr, long rptr, int field); private native int setRecordBlobField(long dbptr, long rptr, int field, byte[] value); private native byte[] getBlobFieldValue(long dbptr, long rptr, int field); /* * Query functions: wrapped. */ private native Query makeQuery(long dbptr, long matchrecptr, ArgListEntry[] arglist, long rowlimit); private native void freeQuery(long dbptr, Query query); private native Record fetchQuery(long dbptr, long queryptr); /* * Locking functions: wrapped. 
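     * startRead()/startWrite() return a lock token; the same token must be
     * passed back to endRead()/endWrite() to release the lock.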
*/ private native long startRead(long dbptr); private native long endRead(long dbptr, long lock); private native long startWrite(long dbptr); private native long endWrite(long dbptr, long lock); static { System.loadLibrary("whitedbDriver"); } /*********************** Connection state ***************************/ private Database database; private boolean local; /****************** Class constructor: connect to db ****************/ public WhiteDB() { this.local = false; this.database = getDatabase(null, 0, false); } public WhiteDB(int size) { this.local = false; this.database = getDatabase(null, size, false); } public WhiteDB(String shmname) { this.local = false; this.database = getDatabase(shmname, 0, false); } public WhiteDB(String shmname, int size) { this.local = false; this.database = getDatabase(shmname, size, false); } public WhiteDB(int size, boolean local) { this.local = local; this.database = getDatabase(null, size, local); } public WhiteDB(boolean local) { this.local = local; this.database = getDatabase(null, 0, local); } public void close() { if(local) { deleteLocalDatabase(database.pointer); } else { detachDatabase(database.pointer); } } /******************** Wrappers for native methods *******************/ public Record createRecord(int fieldCount) { return createRecord(database.pointer, fieldCount); } public Record getFirstRecord() { return getFirstRecord(database.pointer); } public Record getNextRecord(Record record) { return getNextRecord(database.pointer, record.pointer); } public int deleteRecord(Record record) { return deleteRecord(database.pointer, record.pointer); } public int getRecordLength(Record record) { return getRecordLength(database.pointer, record.pointer); } public int setRecordIntField(Record record, int field, int value) { return setRecordIntField(database.pointer, record.pointer, field, value); } public int getIntFieldValue(Record record, int field) { return getIntFieldValue(database.pointer, record.pointer, field); } public int setRecordStringField(Record record, int field, String value) { return setRecordStringField(database.pointer, record.pointer, field, value); } public String getStringFieldValue(Record record, int field) { return getStringFieldValue(database.pointer, record.pointer, field); } public int setRecordBlobField(Record record, int field, byte[] value) { return setRecordBlobField(database.pointer, record.pointer, field, value); } public byte[] getBlobFieldValue(Record record, int field) { return getBlobFieldValue(database.pointer, record.pointer, field); } /****************** Wrappers for query functions ********************/ public Query makeQuery(Record record) { return makeQuery(database.pointer, record.pointer, null, 0); } public Query makeQuery(ArgListEntry[] arglist) { return makeQuery(database.pointer, 0, arglist, 0); } public Query makeQuery(Record record, long rowlimit) { return makeQuery(database.pointer, record.pointer, null, rowlimit); } public Query makeQuery(ArgListEntry[] arglist, long rowlimit) { return makeQuery(database.pointer, 0, arglist, rowlimit); } public void freeQuery(Query query) { freeQuery(database.pointer, query); } public Record fetchQuery(Query query) { return fetchQuery(database.pointer, query.query); } /**************** Wrappers for locking functions ********************/ public long startRead() { return startRead(database.pointer); } public long endRead(long lock) { return endRead(database.pointer, lock); } public long startWrite() { return startWrite(database.pointer); } public long endWrite(long lock) { 
return endWrite(database.pointer, lock); } /********************* ORM support functions ************************/ public void writeObjectToDatabase(Object object) throws IllegalAccessException { Field[] declaredFields = object.getClass().getDeclaredFields(); Arrays.sort(declaredFields, new FieldComparator()); //Performance issue, cache sorted fields Record record = createRecord(database.pointer, declaredFields.length); for (int i = 0; i < declaredFields.length; i++) { Integer value = getFieldValue(object, declaredFields[i]); System.out.println("Writing field: [" + declaredFields[i].getName() + "] with value: " + value); setRecordIntField(database.pointer, record.pointer, i, value); } } public T readObjectFromDatabase(Class objecClass, Record record) throws IllegalAccessException, InstantiationException { Field[] declaredFields = objecClass.getDeclaredFields(); Arrays.sort(declaredFields, new FieldComparator()); //Performance issue, cache sorted fields T object = objecClass.newInstance(); for (int i = 0; i < declaredFields.length; i++) { int value = getIntFieldValue(database.pointer, record.pointer, i); System.out.println("Reading field: [" + declaredFields[i].getName() + "] with value: " + value); setFieldValue(object, declaredFields[i], value); } return object; } public void setFieldValue(Object object, Field field, int value) throws IllegalAccessException { field.set(object, value); } public Integer getFieldValue(Object object, Field field) throws IllegalAccessException { return (Integer) field.get(object); } } whitedb-0.7.2/java/jni/src/whitedb/driver/tests.java000066400000000000000000000103251226454622500223750ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Andres Puusepp 2009 * Copyright (c) Priit Järv 2013 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file tests.java * Java API tests and demos. * */ package whitedb.driver; import whitedb.driver.WhiteDB; import whitedb.holder.Record; import whitedb.holder.SampleObject; import whitedb.holder.Query; import whitedb.util.ArgListEntry; public class tests { public static void main(String[] args) throws IllegalAccessException, InstantiationException { ormDatabaseExample(); basicDatabaseExample(); } /* * Prototype only works with int fields. 
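     * Note: fields are written and read back in alphabetical name order
     * (see FieldComparator, used by WhiteDB.writeObjectToDatabase), so both
     * sides must use the same class definition.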
*/ public static void ormDatabaseExample() throws IllegalAccessException, InstantiationException { SampleObject sampleObject = new SampleObject(); sampleObject.age = 25; sampleObject.weight = 100; WhiteDB db = new WhiteDB(500000, true); db.writeObjectToDatabase(sampleObject); sampleObject = null; Record record = db.getFirstRecord(); sampleObject = db.readObjectFromDatabase(SampleObject.class, record); System.out.println("Object read from database: " + sampleObject); db.close(); } /* * Basic example of native database methods usage */ public static void basicDatabaseExample() { WhiteDB db = new WhiteDB(500000, true); /* local db, 500k */ /* System.out.println("db.database pointer: " + db.database.pointer); */ Record record = db.createRecord(1); System.out.println("Create record 1: " + record.pointer); int result = db.setRecordIntField(record, 0, 108); System.out.println("Inserted record 1 value, result was: " + result); record = db.createRecord(3); System.out.println("Create record 2: " + record.pointer); result = db.setRecordIntField(record, 0, 666); System.out.println("Inserted record 2 field 0, result was: " + result); result = db.setRecordStringField(record, 1, "testval"); System.out.println("Inserted record 2 field 1, result was: " + result); result = db.setRecordBlobField(record, 2, new byte[] {65, 0, -7, 44}); System.out.println("Inserted record 2 field 2, result was: " + result); result = db.getRecordLength(record); System.out.println("Record 2 length reported: " + result); record = db.getFirstRecord(); System.out.println("First record pointer: " + record.pointer); System.out.println("Get field 0 value: " + db.getIntFieldValue(record, 0)); record = db.getNextRecord(record); System.out.println("Next record pointer: " + record.pointer); System.out.println("Get field 0 value: " + db.getIntFieldValue(record, 0)); System.out.println("Get field 1 value: " + db.getStringFieldValue(record, 1)); byte[] val = db.getBlobFieldValue(record, 2); System.out.print("Get field 2 value: "); for(int i=0; i. * */ /** @file Database.java * Database class for WhiteDB Java API. * */ package whitedb.holder; public class Database { public long pointer; } whitedb-0.7.2/java/jni/src/whitedb/holder/Query.java000066400000000000000000000016571226454622500223320ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Priit Järv 2013 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file Query.java * Query class for WhiteDB Java API. 
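 *
 * Holds the native query pointer together with the encoded argument list
 * pointer and its length, so that WhiteDB.freeQuery() can release both.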
* */ package whitedb.holder; public class Query { public long query; public long arglist; public int argc; } whitedb-0.7.2/java/jni/src/whitedb/holder/Record.java000066400000000000000000000016101226454622500224300ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Andres Puusepp 2009 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file Record.java * Record class for WhiteDB Java API. * */ package whitedb.holder; public class Record { public long pointer; } whitedb-0.7.2/java/jni/src/whitedb/holder/SampleObject.java000066400000000000000000000021411226454622500235620ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Andres Puusepp 2009 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file SampleObject.java * ORM demo class for WhiteDB Java API. * */ package whitedb.holder; public class SampleObject { public int weight; public int age; @Override public String toString() { return "SampleObject{" + "weight=" + weight + ", age=" + age + '}'; } } whitedb-0.7.2/java/jni/src/whitedb/util/000077500000000000000000000000001226454622500200515ustar00rootroot00000000000000whitedb-0.7.2/java/jni/src/whitedb/util/ArgListEntry.java000066400000000000000000000022071226454622500233040ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Priit Järv 2013 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file ArgListEntry.java * WhiteDB Query argument helper class. 
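 *
 * One entry describes a single query argument: a column index, a condition
 * (one of the WhiteDB.COND_* constants) and an integer value.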
* */ package whitedb.util; /* XXX: currently only allows int arguments */ public class ArgListEntry { public int column; public int cond; public int value; public ArgListEntry(int column, int cond, int value) { this.column = column; this.cond = cond; this.value = value; } } whitedb-0.7.2/java/jni/src/whitedb/util/FieldComparator.java000066400000000000000000000021271226454622500237710ustar00rootroot00000000000000/* * $Id: $ * $Version: $ * * Copyright (c) Andres Puusepp 2009 * * This file is part of WhiteDB * * WhiteDB is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * WhiteDB is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with WhiteDB. If not, see . * */ /** @file FieldComparator.java * WhiteDB Java API ORM support code. * */ package whitedb.util; import java.util.Comparator; import java.lang.reflect.Field; public class FieldComparator implements Comparator { public int compare(Field field1, Field field2) { return field1.getName().compareTo(field2.getName()); } } whitedb-0.7.2/json/000077500000000000000000000000001226454622500141275ustar00rootroot00000000000000whitedb-0.7.2/json/Makefile.am000066400000000000000000000001761226454622500161670ustar00rootroot00000000000000# # - - - - json parser sources - - - noinst_LTLIBRARIES = libjson.la libjson_la_SOURCES = yajl_api.h yajl_all.h yajl_all.c whitedb-0.7.2/json/yajl_all.c000066400000000000000000002055351226454622500160740ustar00rootroot00000000000000/* * Copyright (c) 2007-2011, Lloyd Hilaiel * * Permission to use, copy, modify, and/or distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include "yajl_api.h" #include "yajl_all.h" #define YAJL_BUF_INIT_SIZE 2048 /* There seem to be some environments where long long is supported but * LLONG_MAX and LLONG_MIN are not defined. This is a safe workaround * (parsing large integers may break however). */ #ifndef LLONG_MAX #define LLONG_MAX LONG_MAX #define LLONG_MIN LONG_MIN #endif #ifdef _WIN32 #define snprintf(s, sz, f, ...) 
_snprintf_s(s, sz+1, sz, f, ## __VA_ARGS__) #endif struct yajl_buf_t { size_t len; size_t used; unsigned char * data; yajl_alloc_funcs * alloc; }; typedef enum { yajl_gen_start, yajl_gen_map_start, yajl_gen_map_key, yajl_gen_map_val, yajl_gen_array_start, yajl_gen_in_array, yajl_gen_complete, yajl_gen_error } yajl_gen_state; struct yajl_gen_t { unsigned int flags; unsigned int depth; const char * indentString; yajl_gen_state state[YAJL_MAX_DEPTH]; yajl_print_t print; void * ctx; /* yajl_buf */ /* memory allocation routines */ yajl_alloc_funcs alloc; }; struct yajl_lexer_t { /* the overal line and char offset into the data */ size_t lineOff; size_t charOff; /* error */ yajl_lex_error error; /* a input buffer to handle the case where a token is spread over * multiple chunks */ yajl_buf buf; /* in the case where we have data in the lexBuf, bufOff holds * the current offset into the lexBuf. */ size_t bufOff; /* are we using the lex buf? */ unsigned int bufInUse; /* shall we allow comments? */ unsigned int allowComments; /* shall we validate utf8 inside strings? */ unsigned int validateUTF8; yajl_alloc_funcs * alloc; }; static void * yajl_internal_malloc(void *ctx, size_t sz) { (void)ctx; return malloc(sz); } static void * yajl_internal_realloc(void *ctx, void * previous, size_t sz) { (void)ctx; return realloc(previous, sz); } static void yajl_internal_free(void *ctx, void * ptr) { (void)ctx; free(ptr); } static void yajl_set_default_alloc_funcs(yajl_alloc_funcs * yaf) { yaf->malloc = yajl_internal_malloc; yaf->free = yajl_internal_free; yaf->realloc = yajl_internal_realloc; yaf->ctx = NULL; } static void yajl_buf_ensure_available(yajl_buf buf, size_t want) { size_t need; assert(buf != NULL); /* first call */ if (buf->data == NULL) { buf->len = YAJL_BUF_INIT_SIZE; buf->data = (unsigned char *) YA_MALLOC(buf->alloc, buf->len); buf->data[0] = 0; } need = buf->len; while (want >= (need - buf->used)) need <<= 1; if (need != buf->len) { buf->data = (unsigned char *) YA_REALLOC(buf->alloc, buf->data, need); buf->len = need; } } static yajl_buf yajl_buf_alloc(yajl_alloc_funcs * alloc) { yajl_buf b = YA_MALLOC(alloc, sizeof(struct yajl_buf_t)); memset((void *) b, 0, sizeof(struct yajl_buf_t)); b->alloc = alloc; return b; } static void yajl_buf_free(yajl_buf buf) { assert(buf != NULL); if (buf->data) YA_FREE(buf->alloc, buf->data); YA_FREE(buf->alloc, buf); } static void yajl_buf_append(yajl_buf buf, const void * data, size_t len) { yajl_buf_ensure_available(buf, len); if (len > 0) { assert(data != NULL); memcpy(buf->data + buf->used, data, len); buf->used += len; buf->data[buf->used] = 0; } } static void yajl_buf_clear(yajl_buf buf) { buf->used = 0; if (buf->data) buf->data[buf->used] = 0; } static const unsigned char * yajl_buf_data(yajl_buf buf) { return buf->data; } static size_t yajl_buf_len(yajl_buf buf) { return buf->used; } const char * yajl_status_to_string(yajl_status stat) { const char * statStr = "unknown"; switch (stat) { case yajl_status_ok: statStr = "ok, no error"; break; case yajl_status_client_canceled: statStr = "client canceled parse"; break; case yajl_status_error: statStr = "parse error"; break; } return statStr; } yajl_handle yajl_alloc(const yajl_callbacks * callbacks, yajl_alloc_funcs * afs, void * ctx) { yajl_handle hand = NULL; yajl_alloc_funcs afsBuffer; /* first order of business is to set up memory allocation routines */ if (afs != NULL) { if (afs->malloc == NULL || afs->realloc == NULL || afs->free == NULL) { return NULL; } } else { 
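        /* no allocator supplied by the caller: fall back to the
         * yajl_internal_malloc/realloc/free wrappers defined above */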
yajl_set_default_alloc_funcs(&afsBuffer); afs = &afsBuffer; } hand = (yajl_handle) YA_MALLOC(afs, sizeof(struct yajl_handle_t)); /* copy in pointers to allocation routines */ memcpy((void *) &(hand->alloc), (void *) afs, sizeof(yajl_alloc_funcs)); hand->callbacks = callbacks; hand->ctx = ctx; hand->lexer = NULL; hand->bytesConsumed = 0; hand->decodeBuf = yajl_buf_alloc(&(hand->alloc)); hand->flags = 0; yajl_bs_init(hand->stateStack, &(hand->alloc)); yajl_bs_push(hand->stateStack, yajl_state_start); return hand; } int yajl_config(yajl_handle h, yajl_option opt, ...) { int rv = 1; va_list ap; va_start(ap, opt); switch(opt) { case yajl_allow_comments: case yajl_dont_validate_strings: case yajl_allow_trailing_garbage: case yajl_allow_multiple_values: case yajl_allow_partial_values: if (va_arg(ap, int)) h->flags |= opt; else h->flags &= ~opt; break; default: rv = 0; } va_end(ap); return rv; } void yajl_free(yajl_handle handle) { yajl_bs_free(handle->stateStack); yajl_buf_free(handle->decodeBuf); if (handle->lexer) { yajl_lex_free(handle->lexer); handle->lexer = NULL; } YA_FREE(&(handle->alloc), handle); } yajl_status yajl_parse(yajl_handle hand, const unsigned char * jsonText, size_t jsonTextLen) { yajl_status status; /* lazy allocation of the lexer */ if (hand->lexer == NULL) { hand->lexer = yajl_lex_alloc(&(hand->alloc), hand->flags & yajl_allow_comments, !(hand->flags & yajl_dont_validate_strings)); } status = yajl_do_parse(hand, jsonText, jsonTextLen); return status; } yajl_status yajl_complete_parse(yajl_handle hand) { /* The lexer is lazy allocated in the first call to parse. if parse is * never called, then no data was provided to parse at all. This is a * "premature EOF" error unless yajl_allow_partial_values is specified. * allocating the lexer now is the simplest possible way to handle this * case while preserving all the other semantics of the parser * (multiple values, partial values, etc). */ if (hand->lexer == NULL) { hand->lexer = yajl_lex_alloc(&(hand->alloc), hand->flags & yajl_allow_comments, !(hand->flags & yajl_dont_validate_strings)); } return yajl_do_finish(hand); } unsigned char * yajl_get_error(yajl_handle hand, int verbose, const unsigned char * jsonText, size_t jsonTextLen) { return yajl_render_error_string(hand, jsonText, jsonTextLen, verbose); } size_t yajl_get_bytes_consumed(yajl_handle hand) { if (!hand) return 0; else return hand->bytesConsumed; } void yajl_free_error(yajl_handle hand, unsigned char * str) { /* use memory allocation functions if set */ YA_FREE(&(hand->alloc), str); } static void CharToHex(unsigned char c, char * hexBuf) { const char * hexchar = "0123456789ABCDEF"; hexBuf[0] = hexchar[c >> 4]; hexBuf[1] = hexchar[c & 0x0F]; } static void yajl_string_encode(const yajl_print_t print, void * ctx, const unsigned char * str, size_t len, int escape_solidus) { size_t beg = 0; size_t end = 0; char hexBuf[7]; hexBuf[0] = '\\'; hexBuf[1] = 'u'; hexBuf[2] = '0'; hexBuf[3] = '0'; hexBuf[6] = 0; while (end < len) { const char * escaped = NULL; switch (str[end]) { case '\r': escaped = "\\r"; break; case '\n': escaped = "\\n"; break; case '\\': escaped = "\\\\"; break; /* it is not required to escape a solidus in JSON: * read sec. 
2.5: http://www.ietf.org/rfc/rfc4627.txt * specifically, this production from the grammar: * unescaped = %x20-21 / %x23-5B / %x5D-10FFFF */ case '/': if (escape_solidus) escaped = "\\/"; break; case '"': escaped = "\\\""; break; case '\f': escaped = "\\f"; break; case '\b': escaped = "\\b"; break; case '\t': escaped = "\\t"; break; default: if ((unsigned char) str[end] < 32) { CharToHex(str[end], hexBuf + 4); escaped = hexBuf; } break; } if (escaped != NULL) { print(ctx, (const char *) (str + beg), end - beg); print(ctx, escaped, (unsigned int)strlen(escaped)); beg = ++end; } else { ++end; } } print(ctx, (const char *) (str + beg), end - beg); } static void hexToDigit(unsigned int * val, const unsigned char * hex) { unsigned int i; for (i=0;i<4;i++) { unsigned char c = hex[i]; if (c >= 'A') c = (c & ~0x20) - 7; c -= '0'; assert(!(c & 0xF0)); *val = (*val << 4) | c; } } static void Utf32toUtf8(unsigned int codepoint, char * utf8Buf) { if (codepoint < 0x80) { utf8Buf[0] = (char) codepoint; utf8Buf[1] = 0; } else if (codepoint < 0x0800) { utf8Buf[0] = (char) ((codepoint >> 6) | 0xC0); utf8Buf[1] = (char) ((codepoint & 0x3F) | 0x80); utf8Buf[2] = 0; } else if (codepoint < 0x10000) { utf8Buf[0] = (char) ((codepoint >> 12) | 0xE0); utf8Buf[1] = (char) (((codepoint >> 6) & 0x3F) | 0x80); utf8Buf[2] = (char) ((codepoint & 0x3F) | 0x80); utf8Buf[3] = 0; } else if (codepoint < 0x200000) { utf8Buf[0] =(char)((codepoint >> 18) | 0xF0); utf8Buf[1] =(char)(((codepoint >> 12) & 0x3F) | 0x80); utf8Buf[2] =(char)(((codepoint >> 6) & 0x3F) | 0x80); utf8Buf[3] =(char)((codepoint & 0x3F) | 0x80); utf8Buf[4] = 0; } else { utf8Buf[0] = '?'; utf8Buf[1] = 0; } } static void yajl_string_decode(yajl_buf buf, const unsigned char * str, size_t len) { size_t beg = 0; size_t end = 0; while (end < len) { if (str[end] == '\\') { char utf8Buf[5]; const char * unescaped = "?"; yajl_buf_append(buf, str + beg, end - beg); switch (str[++end]) { case 'r': unescaped = "\r"; break; case 'n': unescaped = "\n"; break; case '\\': unescaped = "\\"; break; case '/': unescaped = "/"; break; case '"': unescaped = "\""; break; case 'f': unescaped = "\f"; break; case 'b': unescaped = "\b"; break; case 't': unescaped = "\t"; break; case 'u': { unsigned int codepoint = 0; hexToDigit(&codepoint, str + ++end); end+=3; /* check if this is a surrogate */ if ((codepoint & 0xFC00) == 0xD800) { end++; if (str[end] == '\\' && str[end + 1] == 'u') { unsigned int surrogate = 0; hexToDigit(&surrogate, str + end + 2); codepoint = (((codepoint & 0x3F) << 10) | ((((codepoint >> 6) & 0xF) + 1) << 16) | (surrogate & 0x3FF)); end += 5; } else { unescaped = "?"; break; } } Utf32toUtf8(codepoint, utf8Buf); unescaped = utf8Buf; if (codepoint == 0) { yajl_buf_append(buf, unescaped, 1); beg = ++end; continue; } break; } default: assert("this should never happen" == NULL); } yajl_buf_append(buf, unescaped, (unsigned int)strlen(unescaped)); beg = ++end; } else { end++; } } yajl_buf_append(buf, str + beg, end - beg); } #define ADV_PTR s++; if (!(len--)) return 0; static int yajl_string_validate_utf8(const unsigned char * s, size_t len) { if (!len) return 1; if (!s) return 0; while (len--) { /* single byte */ if (*s <= 0x7f) { /* noop */ } /* two byte */ else if ((*s >> 5) == 0x6) { ADV_PTR; if (!((*s >> 6) == 0x2)) return 0; } /* three byte */ else if ((*s >> 4) == 0x0e) { ADV_PTR; if (!((*s >> 6) == 0x2)) return 0; ADV_PTR; if (!((*s >> 6) == 0x2)) return 0; } /* four byte */ else if ((*s >> 3) == 0x1e) { ADV_PTR; if (!((*s >> 6) == 0x2)) return 0; ADV_PTR; if 
(!((*s >> 6) == 0x2)) return 0; ADV_PTR; if (!((*s >> 6) == 0x2)) return 0; } else { return 0; } s++; } return 1; } int yajl_gen_config(yajl_gen g, yajl_gen_option opt, ...) { int rv = 1; va_list ap; va_start(ap, opt); switch(opt) { case yajl_gen_beautify: case yajl_gen_validate_utf8: case yajl_gen_escape_solidus: if (va_arg(ap, int)) g->flags |= opt; else g->flags &= ~opt; break; case yajl_gen_indent_string: { const char *indent = va_arg(ap, const char *); g->indentString = indent; for (; *indent; indent++) { if (*indent != '\n' && *indent != '\v' && *indent != '\f' && *indent != '\t' && *indent != '\r' && *indent != ' ') { g->indentString = NULL; rv = 0; } } break; } case yajl_gen_print_callback: yajl_buf_free(g->ctx); g->print = va_arg(ap, const yajl_print_t); g->ctx = va_arg(ap, void *); break; default: rv = 0; } va_end(ap); return rv; } yajl_gen yajl_gen_alloc(const yajl_alloc_funcs * afs) { yajl_gen g = NULL; yajl_alloc_funcs afsBuffer; /* first order of business is to set up memory allocation routines */ if (afs != NULL) { if (afs->malloc == NULL || afs->realloc == NULL || afs->free == NULL) { return NULL; } } else { yajl_set_default_alloc_funcs(&afsBuffer); afs = &afsBuffer; } g = (yajl_gen) YA_MALLOC(afs, sizeof(struct yajl_gen_t)); if (!g) return NULL; memset((void *) g, 0, sizeof(struct yajl_gen_t)); /* copy in pointers to allocation routines */ memcpy((void *) &(g->alloc), (void *) afs, sizeof(yajl_alloc_funcs)); g->print = (yajl_print_t)&yajl_buf_append; g->ctx = yajl_buf_alloc(&(g->alloc)); g->indentString = " "; return g; } void yajl_gen_free(yajl_gen g) { if (g->print == (yajl_print_t)&yajl_buf_append) yajl_buf_free((yajl_buf)g->ctx); YA_FREE(&(g->alloc), g); } #define INSERT_SEP \ if (g->state[g->depth] == yajl_gen_map_key || \ g->state[g->depth] == yajl_gen_in_array) { \ g->print(g->ctx, ",", 1); \ if ((g->flags & yajl_gen_beautify)) g->print(g->ctx, "\n", 1); \ } else if (g->state[g->depth] == yajl_gen_map_val) { \ g->print(g->ctx, ":", 1); \ if ((g->flags & yajl_gen_beautify)) g->print(g->ctx, " ", 1); \ } #define INSERT_WHITESPACE \ if ((g->flags & yajl_gen_beautify)) { \ if (g->state[g->depth] != yajl_gen_map_val) { \ unsigned int _i; \ for (_i=0;_idepth;_i++) \ g->print(g->ctx, \ g->indentString, \ (unsigned int)strlen(g->indentString)); \ } \ } #define ENSURE_NOT_KEY \ if (g->state[g->depth] == yajl_gen_map_key || \ g->state[g->depth] == yajl_gen_map_start) { \ return yajl_gen_keys_must_be_strings; \ } \ /* check that we're not complete, or in error state. 
in a valid state * to be generating */ #define ENSURE_VALID_STATE \ if (g->state[g->depth] == yajl_gen_error) { \ return yajl_gen_in_error_state;\ } else if (g->state[g->depth] == yajl_gen_complete) { \ return yajl_gen_generation_complete; \ } #define INCREMENT_DEPTH \ if (++(g->depth) >= YAJL_MAX_DEPTH) return yajl_max_depth_exceeded; #define DECREMENT_DEPTH \ if (--(g->depth) >= YAJL_MAX_DEPTH) return yajl_gen_error; #define APPENDED_ATOM \ switch (g->state[g->depth]) { \ case yajl_gen_start: \ g->state[g->depth] = yajl_gen_complete; \ break; \ case yajl_gen_map_start: \ case yajl_gen_map_key: \ g->state[g->depth] = yajl_gen_map_val; \ break; \ case yajl_gen_array_start: \ g->state[g->depth] = yajl_gen_in_array; \ break; \ case yajl_gen_map_val: \ g->state[g->depth] = yajl_gen_map_key; \ break; \ default: \ break; \ } \ #define FINAL_NEWLINE \ if ((g->flags & yajl_gen_beautify) && g->state[g->depth] == yajl_gen_complete) \ g->print(g->ctx, "\n", 1); yajl_gen_status yajl_gen_integer(yajl_gen g, long long int number) { char i[32]; ENSURE_VALID_STATE; ENSURE_NOT_KEY; INSERT_SEP; INSERT_WHITESPACE; snprintf(i, 31, "%lld", number); g->print(g->ctx, i, (unsigned int)strlen(i)); APPENDED_ATOM; FINAL_NEWLINE; return yajl_gen_status_ok; } #if defined(_WIN32) || defined(WIN32) #include #define isnan _isnan #define isinf !_finite #endif yajl_gen_status yajl_gen_double(yajl_gen g, double number) { char i[32]; ENSURE_VALID_STATE; ENSURE_NOT_KEY; if (isnan(number) || isinf(number)) return yajl_gen_invalid_number; INSERT_SEP; INSERT_WHITESPACE; snprintf(i, 31, "%.20g", number); if (strspn(i, "0123456789-") == strlen(i)) { #ifdef _WIN32 strcat_s(i, 32, ".0"); #else strcat(i, ".0"); #endif } g->print(g->ctx, i, (unsigned int)strlen(i)); APPENDED_ATOM; FINAL_NEWLINE; return yajl_gen_status_ok; } yajl_gen_status yajl_gen_number(yajl_gen g, const char * s, size_t l) { ENSURE_VALID_STATE; ENSURE_NOT_KEY; INSERT_SEP; INSERT_WHITESPACE; g->print(g->ctx, s, l); APPENDED_ATOM; FINAL_NEWLINE; return yajl_gen_status_ok; } yajl_gen_status yajl_gen_string(yajl_gen g, const unsigned char * str, size_t len) { // if validation is enabled, check that the string is valid utf8 // XXX: This checking could be done a little faster, in the same pass as // the string encoding if (g->flags & yajl_gen_validate_utf8) { if (!yajl_string_validate_utf8(str, len)) { return yajl_gen_invalid_string; } } ENSURE_VALID_STATE; INSERT_SEP; INSERT_WHITESPACE; g->print(g->ctx, "\"", 1); yajl_string_encode(g->print, g->ctx, str, len, g->flags & yajl_gen_escape_solidus); g->print(g->ctx, "\"", 1); APPENDED_ATOM; FINAL_NEWLINE; return yajl_gen_status_ok; } yajl_gen_status yajl_gen_null(yajl_gen g) { ENSURE_VALID_STATE; ENSURE_NOT_KEY; INSERT_SEP; INSERT_WHITESPACE; g->print(g->ctx, "null", strlen("null")); APPENDED_ATOM; FINAL_NEWLINE; return yajl_gen_status_ok; } yajl_gen_status yajl_gen_bool(yajl_gen g, int boolean) { const char * val = boolean ? 
"true" : "false"; ENSURE_VALID_STATE; ENSURE_NOT_KEY; INSERT_SEP; INSERT_WHITESPACE; g->print(g->ctx, val, (unsigned int)strlen(val)); APPENDED_ATOM; FINAL_NEWLINE; return yajl_gen_status_ok; } yajl_gen_status yajl_gen_map_open(yajl_gen g) { ENSURE_VALID_STATE; ENSURE_NOT_KEY; INSERT_SEP; INSERT_WHITESPACE; INCREMENT_DEPTH; g->state[g->depth] = yajl_gen_map_start; g->print(g->ctx, "{", 1); if ((g->flags & yajl_gen_beautify)) g->print(g->ctx, "\n", 1); FINAL_NEWLINE; return yajl_gen_status_ok; } yajl_gen_status yajl_gen_map_close(yajl_gen g) { ENSURE_VALID_STATE; DECREMENT_DEPTH; if ((g->flags & yajl_gen_beautify)) g->print(g->ctx, "\n", 1); APPENDED_ATOM; INSERT_WHITESPACE; g->print(g->ctx, "}", 1); FINAL_NEWLINE; return yajl_gen_status_ok; } yajl_gen_status yajl_gen_array_open(yajl_gen g) { ENSURE_VALID_STATE; ENSURE_NOT_KEY; INSERT_SEP; INSERT_WHITESPACE; INCREMENT_DEPTH; g->state[g->depth] = yajl_gen_array_start; g->print(g->ctx, "[", 1); if ((g->flags & yajl_gen_beautify)) g->print(g->ctx, "\n", 1); FINAL_NEWLINE; return yajl_gen_status_ok; } yajl_gen_status yajl_gen_array_close(yajl_gen g) { ENSURE_VALID_STATE; DECREMENT_DEPTH; if ((g->flags & yajl_gen_beautify)) g->print(g->ctx, "\n", 1); APPENDED_ATOM; INSERT_WHITESPACE; g->print(g->ctx, "]", 1); FINAL_NEWLINE; return yajl_gen_status_ok; } yajl_gen_status yajl_gen_get_buf(yajl_gen g, const unsigned char ** buf, size_t * len) { if (g->print != (yajl_print_t)&yajl_buf_append) return yajl_gen_no_buf; *buf = yajl_buf_data((yajl_buf)g->ctx); *len = yajl_buf_len((yajl_buf)g->ctx); return yajl_gen_status_ok; } void yajl_gen_clear(yajl_gen g) { if (g->print == (yajl_print_t)&yajl_buf_append) yajl_buf_clear((yajl_buf)g->ctx); } #ifdef YAJL_LEXER_DEBUG static const char * tokToStr(yajl_tok tok) { switch (tok) { case yajl_tok_bool: return "bool"; case yajl_tok_colon: return "colon"; case yajl_tok_comma: return "comma"; case yajl_tok_eof: return "eof"; case yajl_tok_error: return "error"; case yajl_tok_left_brace: return "brace"; case yajl_tok_left_bracket: return "bracket"; case yajl_tok_null: return "null"; case yajl_tok_integer: return "integer"; case yajl_tok_double: return "double"; case yajl_tok_right_brace: return "brace"; case yajl_tok_right_bracket: return "bracket"; case yajl_tok_string: return "string"; case yajl_tok_string_with_escapes: return "string_with_escapes"; } return "unknown"; } #endif /* Impact of the stream parsing feature on the lexer: * * YAJL support stream parsing. That is, the ability to parse the first * bits of a chunk of JSON before the last bits are available (still on * the network or disk). This makes the lexer more complex. The * responsibility of the lexer is to handle transparently the case where * a chunk boundary falls in the middle of a token. This is * accomplished is via a buffer and a character reading abstraction. * * Overview of implementation * * When we lex to end of input string before end of token is hit, we * copy all of the input text composing the token into our lexBuf. * * Every time we read a character, we do so through the readChar function. * readChar's responsibility is to handle pulling all chars from the buffer * before pulling chars from input text */ #define readChar(lxr, txt, off) \ (((lxr)->bufInUse && yajl_buf_len((lxr)->buf) && lxr->bufOff < yajl_buf_len((lxr)->buf)) ? \ (*((const unsigned char *) yajl_buf_data((lxr)->buf) + ((lxr)->bufOff)++)) : \ ((txt)[(*(off))++])) #define unreadChar(lxr, off) ((*(off) > 0) ? 
(*(off))-- : ((lxr)->bufOff--)) static yajl_lexer yajl_lex_alloc(yajl_alloc_funcs * alloc, unsigned int allowComments, unsigned int validateUTF8) { yajl_lexer lxr = (yajl_lexer) YA_MALLOC(alloc, sizeof(struct yajl_lexer_t)); memset((void *) lxr, 0, sizeof(struct yajl_lexer_t)); lxr->buf = yajl_buf_alloc(alloc); lxr->allowComments = allowComments; lxr->validateUTF8 = validateUTF8; lxr->alloc = alloc; return lxr; } static void yajl_lex_free(yajl_lexer lxr) { yajl_buf_free(lxr->buf); YA_FREE(lxr->alloc, lxr); return; } /* a lookup table which lets us quickly determine three things: * VEC - valid escaped control char * note. the solidus '/' may be escaped or not. * IJC - invalid json char * VHC - valid hex char * NFP - needs further processing (from a string scanning perspective) * NUC - needs utf8 checking when enabled (from a string scanning perspective) */ #define VEC 0x01 #define IJC 0x02 #define VHC 0x04 #define NFP 0x08 #define NUC 0x10 static const char charLookupTable[256] = { /*00*/ IJC , IJC , IJC , IJC , IJC , IJC , IJC , IJC , /*08*/ IJC , IJC , IJC , IJC , IJC , IJC , IJC , IJC , /*10*/ IJC , IJC , IJC , IJC , IJC , IJC , IJC , IJC , /*18*/ IJC , IJC , IJC , IJC , IJC , IJC , IJC , IJC , /*20*/ 0 , 0 , NFP|VEC|IJC, 0 , 0 , 0 , 0 , 0 , /*28*/ 0 , 0 , 0 , 0 , 0 , 0 , 0 , VEC , /*30*/ VHC , VHC , VHC , VHC , VHC , VHC , VHC , VHC , /*38*/ VHC , VHC , 0 , 0 , 0 , 0 , 0 , 0 , /*40*/ 0 , VHC , VHC , VHC , VHC , VHC , VHC , 0 , /*48*/ 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , /*50*/ 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , /*58*/ 0 , 0 , 0 , 0 , NFP|VEC|IJC, 0 , 0 , 0 , /*60*/ 0 , VHC , VEC|VHC, VHC , VHC , VHC , VEC|VHC, 0 , /*68*/ 0 , 0 , 0 , 0 , 0 , 0 , VEC , 0 , /*70*/ 0 , 0 , VEC , 0 , VEC , 0 , 0 , 0 , /*78*/ 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC , NUC }; /** process a variable length utf8 encoded codepoint. 
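 *
 * The sequence length is derived from the leading byte and every
 * continuation byte must match 10xxxxxx.  Worked examples of what the
 * checks below accept:
 *
 *   U+00E9 (e with acute)  ->  0xC3 0xA9         (two-byte form)
 *   U+20AC (euro sign)     ->  0xE2 0x82 0xAC    (three-byte form)
 *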
* * returns: * yajl_tok_string - if valid utf8 char was parsed and offset was * advanced * yajl_tok_eof - if end of input was hit before validation could * complete * yajl_tok_error - if invalid utf8 was encountered * * NOTE: on error the offset will point to the first char of the * invalid utf8 */ #define UTF8_CHECK_EOF if (*offset >= jsonTextLen) { return yajl_tok_eof; } static yajl_tok yajl_lex_utf8_char(yajl_lexer lexer, const unsigned char * jsonText, size_t jsonTextLen, size_t * offset, unsigned char curChar) { if (curChar <= 0x7f) { /* single byte */ return yajl_tok_string; } else if ((curChar >> 5) == 0x6) { /* two byte */ UTF8_CHECK_EOF; curChar = readChar(lexer, jsonText, offset); if ((curChar >> 6) == 0x2) return yajl_tok_string; } else if ((curChar >> 4) == 0x0e) { /* three byte */ UTF8_CHECK_EOF; curChar = readChar(lexer, jsonText, offset); if ((curChar >> 6) == 0x2) { UTF8_CHECK_EOF; curChar = readChar(lexer, jsonText, offset); if ((curChar >> 6) == 0x2) return yajl_tok_string; } } else if ((curChar >> 3) == 0x1e) { /* four byte */ UTF8_CHECK_EOF; curChar = readChar(lexer, jsonText, offset); if ((curChar >> 6) == 0x2) { UTF8_CHECK_EOF; curChar = readChar(lexer, jsonText, offset); if ((curChar >> 6) == 0x2) { UTF8_CHECK_EOF; curChar = readChar(lexer, jsonText, offset); if ((curChar >> 6) == 0x2) return yajl_tok_string; } } } return yajl_tok_error; } /* lex a string. input is the lexer, pointer to beginning of * json text, and start of string (offset). * a token is returned which has the following meanings: * yajl_tok_string: lex of string was successful. offset points to * terminating '"'. * yajl_tok_eof: end of text was encountered before we could complete * the lex. * yajl_tok_error: embedded in the string were unallowable chars. offset * points to the offending char */ #define STR_CHECK_EOF \ if (*offset >= jsonTextLen) { \ tok = yajl_tok_eof; \ goto finish_string_lex; \ } /** scan a string for interesting characters that might need further * review. return the number of chars that are uninteresting and can * be skipped. * (lth) hi world, any thoughts on how to make this routine faster? */ static size_t yajl_string_scan(const unsigned char * buf, size_t len, int utf8check) { unsigned char mask = IJC|NFP|(utf8check ? 
NUC : 0); size_t skip = 0; while (skip < len && !(charLookupTable[*buf] & mask)) { skip++; buf++; } return skip; } static yajl_tok yajl_lex_string(yajl_lexer lexer, const unsigned char * jsonText, size_t jsonTextLen, size_t * offset) { yajl_tok tok = yajl_tok_error; int hasEscapes = 0; for (;;) { unsigned char curChar; /* now jump into a faster scanning routine to skip as much * of the buffers as possible */ { const unsigned char * p; size_t len; if ((lexer->bufInUse && yajl_buf_len(lexer->buf) && lexer->bufOff < yajl_buf_len(lexer->buf))) { p = ((const unsigned char *) yajl_buf_data(lexer->buf) + (lexer->bufOff)); len = yajl_buf_len(lexer->buf) - lexer->bufOff; lexer->bufOff += yajl_string_scan(p, len, lexer->validateUTF8); } else if (*offset < jsonTextLen) { p = jsonText + *offset; len = jsonTextLen - *offset; *offset += yajl_string_scan(p, len, lexer->validateUTF8); } } STR_CHECK_EOF; curChar = readChar(lexer, jsonText, offset); /* quote terminates */ if (curChar == '"') { tok = yajl_tok_string; break; } /* backslash escapes a set of control chars, */ else if (curChar == '\\') { hasEscapes = 1; STR_CHECK_EOF; /* special case \u */ curChar = readChar(lexer, jsonText, offset); if (curChar == 'u') { unsigned int i = 0; for (i=0;i<4;i++) { STR_CHECK_EOF; curChar = readChar(lexer, jsonText, offset); if (!(charLookupTable[curChar] & VHC)) { /* back up to offending char */ unreadChar(lexer, offset); lexer->error = yajl_lex_string_invalid_hex_char; goto finish_string_lex; } } } else if (!(charLookupTable[curChar] & VEC)) { /* back up to offending char */ unreadChar(lexer, offset); lexer->error = yajl_lex_string_invalid_escaped_char; goto finish_string_lex; } } /* when not validating UTF8 it's a simple table lookup to determine * if the present character is invalid */ else if(charLookupTable[curChar] & IJC) { /* back up to offending char */ unreadChar(lexer, offset); lexer->error = yajl_lex_string_invalid_json_char; goto finish_string_lex; } /* when in validate UTF8 mode we need to do some extra work */ else if (lexer->validateUTF8) { yajl_tok t = yajl_lex_utf8_char(lexer, jsonText, jsonTextLen, offset, curChar); if (t == yajl_tok_eof) { tok = yajl_tok_eof; goto finish_string_lex; } else if (t == yajl_tok_error) { lexer->error = yajl_lex_string_invalid_utf8; goto finish_string_lex; } } /* accept it, and move on */ } finish_string_lex: /* tell our buddy, the parser, wether he needs to process this string * again */ if (hasEscapes && tok == yajl_tok_string) { tok = yajl_tok_string_with_escapes; } return tok; } #define RETURN_IF_EOF if (*offset >= jsonTextLen) return yajl_tok_eof; static yajl_tok yajl_lex_number(yajl_lexer lexer, const unsigned char * jsonText, size_t jsonTextLen, size_t * offset) { /** XXX: numbers are the only entities in json that we must lex * _beyond_ in order to know that they are complete. There * is an ambiguous case for integers at EOF. 
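 *
 *      For example, if a chunk ends right after the characters "12", the
 *      lexer cannot know whether the value is complete or continues as
 *      "123" in a later chunk; that is why this routine reads one
 *      character past the digits and then backs up with unreadChar().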
*/ unsigned char c; yajl_tok tok = yajl_tok_integer; RETURN_IF_EOF; c = readChar(lexer, jsonText, offset); /* optional leading minus */ if (c == '-') { RETURN_IF_EOF; c = readChar(lexer, jsonText, offset); } /* a single zero, or a series of integers */ if (c == '0') { RETURN_IF_EOF; c = readChar(lexer, jsonText, offset); } else if (c >= '1' && c <= '9') { do { RETURN_IF_EOF; c = readChar(lexer, jsonText, offset); } while (c >= '0' && c <= '9'); } else { unreadChar(lexer, offset); lexer->error = yajl_lex_missing_integer_after_minus; return yajl_tok_error; } /* optional fraction (indicates this is floating point) */ if (c == '.') { int numRd = 0; RETURN_IF_EOF; c = readChar(lexer, jsonText, offset); while (c >= '0' && c <= '9') { numRd++; RETURN_IF_EOF; c = readChar(lexer, jsonText, offset); } if (!numRd) { unreadChar(lexer, offset); lexer->error = yajl_lex_missing_integer_after_decimal; return yajl_tok_error; } tok = yajl_tok_double; } /* optional exponent (indicates this is floating point) */ if (c == 'e' || c == 'E') { RETURN_IF_EOF; c = readChar(lexer, jsonText, offset); /* optional sign */ if (c == '+' || c == '-') { RETURN_IF_EOF; c = readChar(lexer, jsonText, offset); } if (c >= '0' && c <= '9') { do { RETURN_IF_EOF; c = readChar(lexer, jsonText, offset); } while (c >= '0' && c <= '9'); } else { unreadChar(lexer, offset); lexer->error = yajl_lex_missing_integer_after_exponent; return yajl_tok_error; } tok = yajl_tok_double; } /* we always go "one too far" */ unreadChar(lexer, offset); return tok; } static yajl_tok yajl_lex_comment(yajl_lexer lexer, const unsigned char * jsonText, size_t jsonTextLen, size_t * offset) { unsigned char c; yajl_tok tok = yajl_tok_comment; RETURN_IF_EOF; c = readChar(lexer, jsonText, offset); /* either slash or star expected */ if (c == '/') { /* now we throw away until end of line */ do { RETURN_IF_EOF; c = readChar(lexer, jsonText, offset); } while (c != '\n'); } else if (c == '*') { /* now we throw away until end of comment */ for (;;) { RETURN_IF_EOF; c = readChar(lexer, jsonText, offset); if (c == '*') { RETURN_IF_EOF; c = readChar(lexer, jsonText, offset); if (c == '/') { break; } else { unreadChar(lexer, offset); } } } } else { lexer->error = yajl_lex_invalid_char; tok = yajl_tok_error; } return tok; } static yajl_tok yajl_lex_lex(yajl_lexer lexer, const unsigned char * jsonText, size_t jsonTextLen, size_t * offset, const unsigned char ** outBuf, size_t * outLen) { yajl_tok tok = yajl_tok_error; unsigned char c; size_t startOffset = *offset; *outBuf = NULL; *outLen = 0; for (;;) { assert(*offset <= jsonTextLen); if (*offset >= jsonTextLen) { tok = yajl_tok_eof; goto lexed; } c = readChar(lexer, jsonText, offset); switch (c) { case '{': tok = yajl_tok_left_bracket; goto lexed; case '}': tok = yajl_tok_right_bracket; goto lexed; case '[': tok = yajl_tok_left_brace; goto lexed; case ']': tok = yajl_tok_right_brace; goto lexed; case ',': tok = yajl_tok_comma; goto lexed; case ':': tok = yajl_tok_colon; goto lexed; case '\t': case '\n': case '\v': case '\f': case '\r': case ' ': startOffset++; break; case 't': { const char * want = "rue"; do { if (*offset >= jsonTextLen) { tok = yajl_tok_eof; goto lexed; } c = readChar(lexer, jsonText, offset); if (c != *want) { unreadChar(lexer, offset); lexer->error = yajl_lex_invalid_string; tok = yajl_tok_error; goto lexed; } } while (*(++want)); tok = yajl_tok_bool; goto lexed; } case 'f': { const char * want = "alse"; do { if (*offset >= jsonTextLen) { tok = yajl_tok_eof; goto lexed; } c = readChar(lexer, jsonText, 
offset); if (c != *want) { unreadChar(lexer, offset); lexer->error = yajl_lex_invalid_string; tok = yajl_tok_error; goto lexed; } } while (*(++want)); tok = yajl_tok_bool; goto lexed; } case 'n': { const char * want = "ull"; do { if (*offset >= jsonTextLen) { tok = yajl_tok_eof; goto lexed; } c = readChar(lexer, jsonText, offset); if (c != *want) { unreadChar(lexer, offset); lexer->error = yajl_lex_invalid_string; tok = yajl_tok_error; goto lexed; } } while (*(++want)); tok = yajl_tok_null; goto lexed; } case '"': { tok = yajl_lex_string(lexer, (const unsigned char *) jsonText, jsonTextLen, offset); goto lexed; } case '-': case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': { /* integer parsing wants to start from the beginning */ unreadChar(lexer, offset); tok = yajl_lex_number(lexer, (const unsigned char *) jsonText, jsonTextLen, offset); goto lexed; } case '/': /* hey, look, a probable comment! If comments are disabled * it's an error. */ if (!lexer->allowComments) { unreadChar(lexer, offset); lexer->error = yajl_lex_unallowed_comment; tok = yajl_tok_error; goto lexed; } /* if comments are enabled, then we should try to lex * the thing. possible outcomes are * - successful lex (tok_comment, which means continue), * - malformed comment opening (slash not followed by * '*' or '/') (tok_error) * - eof hit. (tok_eof) */ tok = yajl_lex_comment(lexer, (const unsigned char *) jsonText, jsonTextLen, offset); if (tok == yajl_tok_comment) { /* "error" is silly, but that's the initial * state of tok. guilty until proven innocent. */ tok = yajl_tok_error; yajl_buf_clear(lexer->buf); lexer->bufInUse = 0; startOffset = *offset; break; } /* hit error or eof, bail */ goto lexed; default: lexer->error = yajl_lex_invalid_char; tok = yajl_tok_error; goto lexed; } } lexed: /* need to append to buffer if the buffer is in use or * if it's an EOF token */ if (tok == yajl_tok_eof || lexer->bufInUse) { if (!lexer->bufInUse) yajl_buf_clear(lexer->buf); lexer->bufInUse = 1; yajl_buf_append(lexer->buf, jsonText + startOffset, *offset - startOffset); lexer->bufOff = 0; if (tok != yajl_tok_eof) { *outBuf = yajl_buf_data(lexer->buf); *outLen = yajl_buf_len(lexer->buf); lexer->bufInUse = 0; } } else if (tok != yajl_tok_error) { *outBuf = jsonText + startOffset; *outLen = *offset - startOffset; } /* special case for strings. skip the quotes. 
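     * For a lexed token "abc" (5 bytes including the quotes) the adjustment
     * below leaves *outBuf pointing at 'a' and *outLen == 3, so callers see
     * only the string contents.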
*/ if (tok == yajl_tok_string || tok == yajl_tok_string_with_escapes) { assert(*outLen >= 2); (*outBuf)++; *outLen -= 2; } #ifdef YAJL_LEXER_DEBUG if (tok == yajl_tok_error) { printf("lexical error: %s\n", yajl_lex_error_to_string(yajl_lex_get_error(lexer))); } else if (tok == yajl_tok_eof) { printf("EOF hit\n"); } else { printf("lexed %s: '", tokToStr(tok)); fwrite(*outBuf, 1, *outLen, stdout); printf("'\n"); } #endif return tok; } static const char * yajl_lex_error_to_string(yajl_lex_error error) { switch (error) { case yajl_lex_e_ok: return "ok, no error"; case yajl_lex_string_invalid_utf8: return "invalid bytes in UTF8 string."; case yajl_lex_string_invalid_escaped_char: return "inside a string, '\\' occurs before a character " "which it may not."; case yajl_lex_string_invalid_json_char: return "invalid character inside string."; case yajl_lex_string_invalid_hex_char: return "invalid (non-hex) character occurs after '\\u' inside " "string."; case yajl_lex_invalid_char: return "invalid char in json text."; case yajl_lex_invalid_string: return "invalid string in json text."; case yajl_lex_missing_integer_after_exponent: return "malformed number, a digit is required after the exponent."; case yajl_lex_missing_integer_after_decimal: return "malformed number, a digit is required after the " "decimal point."; case yajl_lex_missing_integer_after_minus: return "malformed number, a digit is required after the " "minus sign."; case yajl_lex_unallowed_comment: return "probable comment found in input text, comments are " "not enabled."; } return "unknown error code"; } /** allows access to more specific information about the lexical * error when yajl_lex_lex returns yajl_tok_error. */ static yajl_lex_error yajl_lex_get_error(yajl_lexer lexer) { if (lexer == NULL) return (yajl_lex_error) -1; return lexer->error; } #define MAX_VALUE_TO_MULTIPLY ((LLONG_MAX / 10) + (LLONG_MAX % 10)) /* same semantics as strtol */ static long long yajl_parse_integer(const unsigned char *number, unsigned int length) { long long ret = 0; long sign = 1; const unsigned char *pos = number; if (*pos == '-') { pos++; sign = -1; } if (*pos == '+') { pos++; } while (pos < number + length) { if ( ret > MAX_VALUE_TO_MULTIPLY ) { errno = ERANGE; return sign == 1 ? LLONG_MAX : LLONG_MIN; } ret *= 10; if (LLONG_MAX - ret < (*pos - '0')) { errno = ERANGE; return sign == 1 ? LLONG_MAX : LLONG_MIN; } if (*pos < '0' || *pos > '9') { errno = ERANGE; return sign == 1 ? 
LLONG_MAX : LLONG_MIN; } ret += (*pos++ - '0'); } return sign * ret; } static unsigned char * yajl_render_error_string(yajl_handle hand, const unsigned char * jsonText, size_t jsonTextLen, int verbose) { size_t offset = hand->bytesConsumed; unsigned char * str; const char * errorType = NULL; const char * errorText = NULL; char text[72]; const char * arrow = " (right here) ------^\n"; if (yajl_bs_current(hand->stateStack) == yajl_state_parse_error) { errorType = "parse"; errorText = hand->parseError; } else if (yajl_bs_current(hand->stateStack) == yajl_state_lexical_error) { errorType = "lexical"; errorText = yajl_lex_error_to_string(yajl_lex_get_error(hand->lexer)); } else { errorType = "unknown"; } { size_t memneeded = 0; memneeded += strlen(errorType); memneeded += strlen(" error"); if (errorText != NULL) { memneeded += strlen(": "); memneeded += strlen(errorText); } str = (unsigned char *) YA_MALLOC(&(hand->alloc), memneeded + 2); if (!str) return NULL; str[0] = 0; #ifdef _WIN32 strcat_s((char *) str, memneeded+2, errorType); strcat_s((char *) str, memneeded+2, " error"); #else strcat((char *) str, errorType); strcat((char *) str, " error"); #endif if (errorText != NULL) { #ifdef _WIN32 strcat_s((char *) str, memneeded+2, ": "); strcat_s((char *) str, memneeded+2, errorText); #else strcat((char *) str, ": "); strcat((char *) str, errorText); #endif } #ifdef _WIN32 strcat_s((char *) str, memneeded+2, "\n"); #else strcat((char *) str, "\n"); #endif } /* now we append as many spaces as needed to make sure the error * falls at char 41, if verbose was specified */ if (verbose) { size_t start, end, i; size_t spacesNeeded; spacesNeeded = (offset < 30 ? 40 - offset : 10); start = (offset >= 30 ? offset - 30 : 0); end = (offset + 30 > jsonTextLen ? jsonTextLen : offset + 30); for (i=0;ialloc), memneeded); if (newStr) { newStr[0] = 0; #ifdef _WIN32 strcat_s((char *) newStr, memneeded, (char *) str); strcat_s((char *) newStr, memneeded, text); strcat_s((char *) newStr, memneeded, arrow); #else strcat((char *) newStr, (char *) str); strcat((char *) newStr, text); strcat((char *) newStr, arrow); #endif } YA_FREE(&(hand->alloc), str); str = (unsigned char *) newStr; } } return str; } /* check for client cancelation */ #define _CC_CHK(x) \ if (!(x)) { \ yajl_bs_set(hand->stateStack, yajl_state_parse_error); \ hand->parseError = \ "client cancelled parse via callback return value"; \ return yajl_status_client_canceled; \ } static yajl_status yajl_do_finish(yajl_handle hand) { yajl_status stat; stat = yajl_do_parse(hand,(const unsigned char *) " ",1); if (stat != yajl_status_ok) return stat; switch(yajl_bs_current(hand->stateStack)) { case yajl_state_parse_error: case yajl_state_lexical_error: return yajl_status_error; case yajl_state_got_value: case yajl_state_parse_complete: return yajl_status_ok; default: if (!(hand->flags & yajl_allow_partial_values)) { yajl_bs_set(hand->stateStack, yajl_state_parse_error); hand->parseError = "premature EOF"; return yajl_status_error; } return yajl_status_ok; } } static yajl_status yajl_do_parse(yajl_handle hand, const unsigned char * jsonText, size_t jsonTextLen) { yajl_tok tok; const unsigned char * buf; size_t bufLen; size_t * offset = &(hand->bytesConsumed); *offset = 0; around_again: switch (yajl_bs_current(hand->stateStack)) { case yajl_state_parse_complete: if (hand->flags & yajl_allow_multiple_values) { yajl_bs_set(hand->stateStack, yajl_state_got_value); goto around_again; } if (!(hand->flags & yajl_allow_trailing_garbage)) { if (*offset != jsonTextLen) { 
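                /* trailing-garbage check: unless yajl_allow_trailing_garbage
                 * is set, any non-EOF token after the top-level value (e.g.
                 * the stray x in the input {"a":1} x ) fails the parse below
                 * with "trailing garbage" */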
tok = yajl_lex_lex(hand->lexer, jsonText, jsonTextLen, offset, &buf, &bufLen); if (tok != yajl_tok_eof) { yajl_bs_set(hand->stateStack, yajl_state_parse_error); hand->parseError = "trailing garbage"; } goto around_again; } } return yajl_status_ok; case yajl_state_lexical_error: case yajl_state_parse_error: return yajl_status_error; case yajl_state_start: case yajl_state_got_value: case yajl_state_map_need_val: case yajl_state_array_need_val: case yajl_state_array_start: { /* for arrays and maps, we advance the state for this * depth, then push the state of the next depth. * If an error occurs during the parsing of the nesting * enitity, the state at this level will not matter. * a state that needs pushing will be anything other * than state_start */ yajl_state stateToPush = yajl_state_start; tok = yajl_lex_lex(hand->lexer, jsonText, jsonTextLen, offset, &buf, &bufLen); switch (tok) { case yajl_tok_eof: return yajl_status_ok; case yajl_tok_error: yajl_bs_set(hand->stateStack, yajl_state_lexical_error); goto around_again; case yajl_tok_string: if (hand->callbacks && hand->callbacks->yajl_string) { _CC_CHK(hand->callbacks->yajl_string(hand->ctx, buf, bufLen)); } break; case yajl_tok_string_with_escapes: if (hand->callbacks && hand->callbacks->yajl_string) { yajl_buf_clear(hand->decodeBuf); yajl_string_decode(hand->decodeBuf, buf, bufLen); _CC_CHK(hand->callbacks->yajl_string( hand->ctx, yajl_buf_data(hand->decodeBuf), yajl_buf_len(hand->decodeBuf))); } break; case yajl_tok_bool: if (hand->callbacks && hand->callbacks->yajl_boolean) { _CC_CHK(hand->callbacks->yajl_boolean(hand->ctx, *buf == 't')); } break; case yajl_tok_null: if (hand->callbacks && hand->callbacks->yajl_null) { _CC_CHK(hand->callbacks->yajl_null(hand->ctx)); } break; case yajl_tok_left_bracket: if (hand->callbacks && hand->callbacks->yajl_start_map) { _CC_CHK(hand->callbacks->yajl_start_map(hand->ctx)); } stateToPush = yajl_state_map_start; break; case yajl_tok_left_brace: if (hand->callbacks && hand->callbacks->yajl_start_array) { _CC_CHK(hand->callbacks->yajl_start_array(hand->ctx)); } stateToPush = yajl_state_array_start; break; case yajl_tok_integer: if (hand->callbacks) { if (hand->callbacks->yajl_number) { _CC_CHK(hand->callbacks->yajl_number( hand->ctx,(const char *) buf, bufLen)); } else if (hand->callbacks->yajl_integer) { long long int i = 0; errno = 0; i = yajl_parse_integer(buf, bufLen); if ((i == LLONG_MIN || i == LLONG_MAX) && errno == ERANGE) { yajl_bs_set(hand->stateStack, yajl_state_parse_error); hand->parseError = "integer overflow" ; /* try to restore error offset */ if (*offset >= bufLen) *offset -= bufLen; else *offset = 0; goto around_again; } _CC_CHK(hand->callbacks->yajl_integer(hand->ctx, i)); } } break; case yajl_tok_double: if (hand->callbacks) { if (hand->callbacks->yajl_number) { _CC_CHK(hand->callbacks->yajl_number( hand->ctx, (const char *) buf, bufLen)); } else if (hand->callbacks->yajl_double) { double d = 0.0; yajl_buf_clear(hand->decodeBuf); yajl_buf_append(hand->decodeBuf, buf, bufLen); buf = yajl_buf_data(hand->decodeBuf); errno = 0; d = strtod((char *) buf, NULL); if ((d == HUGE_VAL || d == -HUGE_VAL) && errno == ERANGE) { yajl_bs_set(hand->stateStack, yajl_state_parse_error); hand->parseError = "numeric (floating point) " "overflow"; /* try to restore error offset */ if (*offset >= bufLen) *offset -= bufLen; else *offset = 0; goto around_again; } _CC_CHK(hand->callbacks->yajl_double(hand->ctx, d)); } } break; case yajl_tok_right_brace: { if (yajl_bs_current(hand->stateStack) == 
yajl_state_array_start) { if (hand->callbacks && hand->callbacks->yajl_end_array) { _CC_CHK(hand->callbacks->yajl_end_array(hand->ctx)); } yajl_bs_pop(hand->stateStack); goto around_again; } /* intentional fall-through */ } case yajl_tok_colon: case yajl_tok_comma: case yajl_tok_right_bracket: yajl_bs_set(hand->stateStack, yajl_state_parse_error); hand->parseError = "unallowed token at this point in JSON text"; goto around_again; default: yajl_bs_set(hand->stateStack, yajl_state_parse_error); hand->parseError = "invalid token, internal error"; goto around_again; } /* got a value. transition depends on the state we're in. */ { yajl_state s = yajl_bs_current(hand->stateStack); if (s == yajl_state_start || s == yajl_state_got_value) { yajl_bs_set(hand->stateStack, yajl_state_parse_complete); } else if (s == yajl_state_map_need_val) { yajl_bs_set(hand->stateStack, yajl_state_map_got_val); } else { yajl_bs_set(hand->stateStack, yajl_state_array_got_val); } } if (stateToPush != yajl_state_start) { yajl_bs_push(hand->stateStack, stateToPush); } goto around_again; } case yajl_state_map_start: case yajl_state_map_need_key: { /* only difference between these two states is that in * start '}' is valid, whereas in need_key, we've parsed * a comma, and a string key _must_ follow */ tok = yajl_lex_lex(hand->lexer, jsonText, jsonTextLen, offset, &buf, &bufLen); switch (tok) { case yajl_tok_eof: return yajl_status_ok; case yajl_tok_error: yajl_bs_set(hand->stateStack, yajl_state_lexical_error); goto around_again; case yajl_tok_string_with_escapes: if (hand->callbacks && hand->callbacks->yajl_map_key) { yajl_buf_clear(hand->decodeBuf); yajl_string_decode(hand->decodeBuf, buf, bufLen); buf = yajl_buf_data(hand->decodeBuf); bufLen = yajl_buf_len(hand->decodeBuf); } /* intentional fall-through */ case yajl_tok_string: if (hand->callbacks && hand->callbacks->yajl_map_key) { _CC_CHK(hand->callbacks->yajl_map_key(hand->ctx, buf, bufLen)); } yajl_bs_set(hand->stateStack, yajl_state_map_sep); goto around_again; case yajl_tok_right_bracket: if (yajl_bs_current(hand->stateStack) == yajl_state_map_start) { if (hand->callbacks && hand->callbacks->yajl_end_map) { _CC_CHK(hand->callbacks->yajl_end_map(hand->ctx)); } yajl_bs_pop(hand->stateStack); goto around_again; } default: yajl_bs_set(hand->stateStack, yajl_state_parse_error); hand->parseError = "invalid object key (must be a string)"; goto around_again; } } case yajl_state_map_sep: { tok = yajl_lex_lex(hand->lexer, jsonText, jsonTextLen, offset, &buf, &bufLen); switch (tok) { case yajl_tok_colon: yajl_bs_set(hand->stateStack, yajl_state_map_need_val); goto around_again; case yajl_tok_eof: return yajl_status_ok; case yajl_tok_error: yajl_bs_set(hand->stateStack, yajl_state_lexical_error); goto around_again; default: yajl_bs_set(hand->stateStack, yajl_state_parse_error); hand->parseError = "object key and value must " "be separated by a colon (':')"; goto around_again; } } case yajl_state_map_got_val: { tok = yajl_lex_lex(hand->lexer, jsonText, jsonTextLen, offset, &buf, &bufLen); switch (tok) { case yajl_tok_right_bracket: if (hand->callbacks && hand->callbacks->yajl_end_map) { _CC_CHK(hand->callbacks->yajl_end_map(hand->ctx)); } yajl_bs_pop(hand->stateStack); goto around_again; case yajl_tok_comma: yajl_bs_set(hand->stateStack, yajl_state_map_need_key); goto around_again; case yajl_tok_eof: return yajl_status_ok; case yajl_tok_error: yajl_bs_set(hand->stateStack, yajl_state_lexical_error); goto around_again; default: yajl_bs_set(hand->stateStack, 
yajl_state_parse_error); hand->parseError = "after key and value, inside map, " "I expect ',' or '}'"; /* try to restore error offset */ if (*offset >= bufLen) *offset -= bufLen; else *offset = 0; goto around_again; } } case yajl_state_array_got_val: { tok = yajl_lex_lex(hand->lexer, jsonText, jsonTextLen, offset, &buf, &bufLen); switch (tok) { case yajl_tok_right_brace: if (hand->callbacks && hand->callbacks->yajl_end_array) { _CC_CHK(hand->callbacks->yajl_end_array(hand->ctx)); } yajl_bs_pop(hand->stateStack); goto around_again; case yajl_tok_comma: yajl_bs_set(hand->stateStack, yajl_state_array_need_val); goto around_again; case yajl_tok_eof: return yajl_status_ok; case yajl_tok_error: yajl_bs_set(hand->stateStack, yajl_state_lexical_error); goto around_again; default: yajl_bs_set(hand->stateStack, yajl_state_parse_error); hand->parseError = "after array element, I expect ',' or ']'"; goto around_again; } } } abort(); return yajl_status_error; } whitedb-0.7.2/json/yajl_all.h000066400000000000000000000204461226454622500160750ustar00rootroot00000000000000/* * Copyright (c) 2007-2011, Lloyd Hilaiel * * Permission to use, copy, modify, and/or distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ #ifndef YAJL_ALL_H #define YAJL_ALL_H #ifdef __cplusplus extern "C" { #endif /** * \file yajl_alloc.h * default memory allocation routines for yajl which use malloc/realloc and * free */ #include "yajl_api.h" #define YA_MALLOC(afs, sz) (afs)->malloc((afs)->ctx, (sz)) #define YA_FREE(afs, ptr) (afs)->free((afs)->ctx, (ptr)) #define YA_REALLOC(afs, ptr, sz) (afs)->realloc((afs)->ctx, (ptr), (sz)) static void yajl_set_default_alloc_funcs(yajl_alloc_funcs * yaf); /* * Implementation/performance notes. If this were moved to a header * only implementation using #define's where possible we might be * able to sqeeze a little performance out of the guy by killing function * call overhead. YMMV. */ /** * yajl_buf is a buffer with exponential growth. the buffer ensures that * you are always null padded. */ typedef struct yajl_buf_t * yajl_buf; /* allocate a new buffer */ static yajl_buf yajl_buf_alloc(yajl_alloc_funcs * alloc); /* free the buffer */ static void yajl_buf_free(yajl_buf buf); /* append a number of bytes to the buffer */ static void yajl_buf_append(yajl_buf buf, const void * data, size_t len); /* empty the buffer */ static void yajl_buf_clear(yajl_buf buf); /* get a pointer to the beginning of the buffer */ static const unsigned char * yajl_buf_data(yajl_buf buf); /* get the length of the buffer */ static size_t yajl_buf_len(yajl_buf buf); /* * A header only implementation of a simple stack of bytes, used in YAJL * to maintain parse state. 
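 *
 * Illustrative sketch of how the macros below combine (the parser in
 * yajl_all.c is the real user; 'allocFuncs' is an assumed, caller-provided
 * yajl_alloc_funcs):
 *
 *   yajl_bytestack bs;
 *   yajl_bs_init(bs, &allocFuncs);
 *   yajl_bs_push(bs, yajl_state_start);
 *   if (yajl_bs_current(bs) == yajl_state_start)
 *       yajl_bs_set(bs, yajl_state_got_value);
 *   yajl_bs_pop(bs);
 *   yajl_bs_free(bs);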
*/ #define YAJL_BS_INC 128 typedef struct yajl_bytestack_t { unsigned char * stack; size_t size; size_t used; yajl_alloc_funcs * yaf; } yajl_bytestack; /* initialize a bytestack */ #define yajl_bs_init(obs, _yaf) { \ (obs).stack = NULL; \ (obs).size = 0; \ (obs).used = 0; \ (obs).yaf = (_yaf); \ } \ /* initialize a bytestack */ #define yajl_bs_free(obs) \ if ((obs).stack) (obs).yaf->free((obs).yaf->ctx, (obs).stack); #define yajl_bs_current(obs) \ (assert((obs).used > 0), (obs).stack[(obs).used - 1]) #define yajl_bs_push(obs, byte) { \ if (((obs).size - (obs).used) == 0) { \ (obs).size += YAJL_BS_INC; \ (obs).stack = (obs).yaf->realloc((obs).yaf->ctx,\ (void *) (obs).stack, (obs).size);\ } \ (obs).stack[((obs).used)++] = (byte); \ } /* removes the top item of the stack, returns nothing */ #define yajl_bs_pop(obs) { ((obs).used)--; } #define yajl_bs_set(obs, byte) \ (obs).stack[((obs).used) - 1] = (byte); static void yajl_string_encode(const yajl_print_t printer, void * ctx, const unsigned char * str, size_t length, int escape_solidus); static void yajl_string_decode(yajl_buf buf, const unsigned char * str, size_t length); static int yajl_string_validate_utf8(const unsigned char * s, size_t len); typedef enum { yajl_tok_bool, yajl_tok_colon, yajl_tok_comma, yajl_tok_eof, yajl_tok_error, yajl_tok_left_brace, yajl_tok_left_bracket, yajl_tok_null, yajl_tok_right_brace, yajl_tok_right_bracket, /* we differentiate between integers and doubles to allow the * parser to interpret the number without re-scanning */ yajl_tok_integer, yajl_tok_double, /* we differentiate between strings which require further processing, * and strings that do not */ yajl_tok_string, yajl_tok_string_with_escapes, /* comment tokens are not currently returned to the parser, ever */ yajl_tok_comment } yajl_tok; typedef struct yajl_lexer_t * yajl_lexer; static yajl_lexer yajl_lex_alloc(yajl_alloc_funcs * alloc, unsigned int allowComments, unsigned int validateUTF8); static void yajl_lex_free(yajl_lexer lexer); /** * run/continue a lex. "offset" is an input/output parameter. * It should be initialized to zero for a * new chunk of target text, and upon subsetquent calls with the same * target text should passed with the value of the previous invocation. * * the client may be interested in the value of offset when an error is * returned from the lexer. This allows the client to render useful n * error messages. * * When you pass the next chunk of data, context should be reinitialized * to zero. * * Finally, the output buffer is usually just a pointer into the jsonText, * however in cases where the entity being lexed spans multiple chunks, * the lexer will buffer the entity and the data returned will be * a pointer into that buffer. * * This behavior is abstracted from client code except for the performance * implications which require that the client choose a reasonable chunk * size to get adequate performance. 
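 *
 * Hedged sketch of the offset protocol for one chunk (variable names are
 * illustrative only):
 *
 *   size_t off = 0;                      // reset to 0 for each new chunk
 *   const unsigned char *outBuf; size_t outLen;
 *   yajl_tok t = yajl_lex_lex(lxr, chunk, chunkLen, &off, &outBuf, &outLen);
 *   // yajl_tok_eof means the token spills into the next chunk and has been
 *   // buffered inside the lexer; call again with the following chunk.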
*/ static yajl_tok yajl_lex_lex(yajl_lexer lexer, const unsigned char * jsonText, size_t jsonTextLen, size_t * offset, const unsigned char ** outBuf, size_t * outLen); typedef enum { yajl_lex_e_ok = 0, yajl_lex_string_invalid_utf8, yajl_lex_string_invalid_escaped_char, yajl_lex_string_invalid_json_char, yajl_lex_string_invalid_hex_char, yajl_lex_invalid_char, yajl_lex_invalid_string, yajl_lex_missing_integer_after_decimal, yajl_lex_missing_integer_after_exponent, yajl_lex_missing_integer_after_minus, yajl_lex_unallowed_comment } yajl_lex_error; static const char * yajl_lex_error_to_string(yajl_lex_error error); /** allows access to more specific information about the lexical * error when yajl_lex_lex returns yajl_tok_error. */ static yajl_lex_error yajl_lex_get_error(yajl_lexer lexer); typedef enum { yajl_state_start = 0, yajl_state_parse_complete, yajl_state_parse_error, yajl_state_lexical_error, yajl_state_map_start, yajl_state_map_sep, yajl_state_map_need_val, yajl_state_map_got_val, yajl_state_map_need_key, yajl_state_array_start, yajl_state_array_got_val, yajl_state_array_need_val, yajl_state_got_value, } yajl_state; struct yajl_handle_t { const yajl_callbacks * callbacks; void * ctx; yajl_lexer lexer; const char * parseError; /* the number of bytes consumed from the last client buffer, * in the case of an error this will be an error offset, in the * case of an error this can be used as the error offset */ size_t bytesConsumed; /* temporary storage for decoded strings */ yajl_buf decodeBuf; /* a stack of states. access with yajl_state_XXX routines */ yajl_bytestack stateStack; /* memory allocation routines */ yajl_alloc_funcs alloc; /* bitfield */ unsigned int flags; }; static yajl_status yajl_do_parse(yajl_handle handle, const unsigned char * jsonText, size_t jsonTextLen); static yajl_status yajl_do_finish(yajl_handle handle); static unsigned char * yajl_render_error_string(yajl_handle hand, const unsigned char * jsonText, size_t jsonTextLen, int verbose); /* A little built in integer parsing routine with the same semantics as strtol * that's unaffected by LOCALE. */ static long long yajl_parse_integer(const unsigned char *number, unsigned int length); #ifdef __cplusplus } #endif #endif /* YAJL_ALL_H */ whitedb-0.7.2/json/yajl_api.h000066400000000000000000000360641226454622500161010ustar00rootroot00000000000000/* * Copyright (c) 2007-2011, Lloyd Hilaiel * * Permission to use, copy, modify, and/or distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 
*/ #ifndef YAJL_API_H #define YAJL_API_H #include <stddef.h> #ifdef __cplusplus extern "C" { #endif #define YAJL_MAX_DEPTH 128 #define YAJL_API /** pointer to a malloc function, supporting client overriding memory * allocation routines */ typedef void * (*yajl_malloc_func)(void *ctx, size_t sz); /** pointer to a free function, supporting client overriding memory * allocation routines */ typedef void (*yajl_free_func)(void *ctx, void * ptr); /** pointer to a realloc function which can resize an allocation. */ typedef void * (*yajl_realloc_func)(void *ctx, void * ptr, size_t sz); /** A structure which can be passed to yajl_*_alloc routines to allow the * client to specify memory allocation functions to be used. */ typedef struct { /** pointer to a function that can allocate uninitialized memory */ yajl_malloc_func malloc; /** pointer to a function that can resize memory allocations */ yajl_realloc_func realloc; /** pointer to a function that can free memory allocated using * reallocFunction or mallocFunction */ yajl_free_func free; /** a context pointer that will be passed to above allocation routines */ void * ctx; } yajl_alloc_funcs; /** * \file yajl_parse.h * Interface to YAJL's JSON stream parsing facilities. */ /** error codes returned from this interface */ typedef enum { /** no error was encountered */ yajl_status_ok, /** a client callback returned zero, stopping the parse */ yajl_status_client_canceled, /** An error occurred during the parse. Call yajl_get_error for * more information about the encountered error */ yajl_status_error } yajl_status; /** attain a human readable, English string for an error */ YAJL_API const char * yajl_status_to_string(yajl_status code); /** an opaque handle to a parser */ typedef struct yajl_handle_t * yajl_handle; /** yajl is an event driven parser. This means as json elements are * parsed, you are called back to do something with the data. The * functions in this table indicate the various events for which * you will be called back. Each callback accepts a "context" * pointer; this is a void * that is passed into the yajl_parse * function which the client code may use to pass around context. * * All callbacks return an integer. If non-zero, the parse will * continue. If zero, the parse will be canceled and * yajl_status_client_canceled will be returned from the parse. * * \attention { * A note about the handling of numbers: * * yajl will only convert numbers that can be represented in a * double or a 64 bit (long long) int. All other numbers will * be passed to the client in string form using the yajl_number * callback. Furthermore, if yajl_number is not NULL, it will * always be used to return numbers, that is, yajl_integer and * yajl_double will be ignored. If yajl_number is NULL but one * of yajl_integer or yajl_double is defined, parsing of a * number larger than is representable in a double or 64 bit * integer will result in a parse error. * } */ typedef struct { int (* yajl_null)(void * ctx); int (* yajl_boolean)(void * ctx, int boolVal); int (* yajl_integer)(void * ctx, long long integerVal); int (* yajl_double)(void * ctx, double doubleVal); /** A callback which passes the string representation of the number * back to the client.
Will be used for all numbers when present */ int (* yajl_number)(void * ctx, const char * numberVal, size_t numberLen); /** strings are returned as pointers into the JSON text when * possible; as a result, they are _not_ null padded */ int (* yajl_string)(void * ctx, const unsigned char * stringVal, size_t stringLen); int (* yajl_start_map)(void * ctx); int (* yajl_map_key)(void * ctx, const unsigned char * key, size_t stringLen); int (* yajl_end_map)(void * ctx); int (* yajl_start_array)(void * ctx); int (* yajl_end_array)(void * ctx); } yajl_callbacks; /** allocate a parser handle * \param callbacks a yajl callbacks structure specifying the * functions to call when different JSON entities * are encountered in the input text. May be NULL, * which is only useful for validation. * \param afs memory allocation functions, may be NULL to use * C runtime library routines (malloc and friends) * \param ctx a context pointer that will be passed to callbacks. */ YAJL_API yajl_handle yajl_alloc(const yajl_callbacks * callbacks, yajl_alloc_funcs * afs, void * ctx); /** configuration parameters for the parser; these may be passed to * yajl_config() along with option specific argument(s). In general, * all configuration parameters default to *off*. */ typedef enum { /** Ignore javascript style comments present in * JSON input. Non-standard, but rather fun. * Argument: toggled off with integer zero, on otherwise. * * example: * yajl_config(h, yajl_allow_comments, 1); // turn comment support on */ yajl_allow_comments = 0x01, /** * By default the parser verifies that all strings in JSON input are * valid UTF8 and emits a parse error if this is not so; this validation * makes parsing slightly more expensive (~7% depending * on processor and compiler in use). Setting this option disables the check. * * example: * yajl_config(h, yajl_dont_validate_strings, 1); // disable utf8 checking */ yajl_dont_validate_strings = 0x02, /** * By default, upon calls to yajl_complete_parse(), yajl will * ensure the entire input text was consumed and will raise an error * otherwise. Enabling this flag will cause yajl to disable this * check. This can be useful when parsing json out of a stream that contains more * than a single JSON document. */ yajl_allow_trailing_garbage = 0x04, /** * Allow multiple values to be parsed by a single handle. The * entire text must be valid JSON, and values can be separated * by any kind of whitespace. This flag will change the * behavior of the parser, and cause it to continue parsing after * a value is parsed, rather than transitioning into a * complete state. This option can be useful when parsing multiple * values from an input stream. */ yajl_allow_multiple_values = 0x08, /** * When yajl_complete_parse() is called the parser will * check that the top level value was completely consumed. I.E., * if called whilst in the middle of parsing a value * yajl will enter an error state (premature EOF). Setting this * flag suppresses that check and the corresponding error. */ yajl_allow_partial_values = 0x10 } yajl_option; /** allow the modification of parser options subsequent to handle * allocation (via yajl_alloc) * \returns zero in case of errors, non-zero otherwise */ YAJL_API int yajl_config(yajl_handle h, yajl_option opt, ...); /** free a parser handle */ YAJL_API void yajl_free(yajl_handle handle); /** Parse some json!
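 *
 * Illustrative sketch only (not part of the original header); it assumes a
 * client-defined yajl_callbacks table 'cb' and a client context 'ctx':
 *
 *   yajl_handle h = yajl_alloc(&cb, NULL, &ctx);
 *   yajl_status st = yajl_parse(h, jsonText, jsonTextLength);
 *   if (st == yajl_status_ok) st = yajl_complete_parse(h);
 *   if (st != yajl_status_ok) {
 *       unsigned char *err = yajl_get_error(h, 1, jsonText, jsonTextLength);
 *       // ... report the error message, then release it ...
 *       yajl_free_error(h, err);
 *   }
 *   yajl_free(h);
 *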
* \param hand - a handle to the json parser allocated with yajl_alloc * \param jsonText - a pointer to the UTF8 json text to be parsed * \param jsonTextLength - the length, in bytes, of input text */ YAJL_API yajl_status yajl_parse(yajl_handle hand, const unsigned char * jsonText, size_t jsonTextLength); /** Parse any remaining buffered json. * Since yajl is a stream-based parser, without an explicit end of * input, yajl sometimes can't decide if content at the end of the * stream is valid or not. For example, if "1" has been fed in, * yajl can't know whether another digit is next or some character * that would terminate the integer token. * * \param hand - a handle to the json parser allocated with yajl_alloc */ YAJL_API yajl_status yajl_complete_parse(yajl_handle hand); /** get an error string describing the state of the * parse. * * If verbose is non-zero, the message will include the JSON * text where the error occurred, along with an arrow pointing to * the specific char. * * \returns a dynamically allocated string which should * be freed with yajl_free_error */ YAJL_API unsigned char * yajl_get_error(yajl_handle hand, int verbose, const unsigned char * jsonText, size_t jsonTextLength); /** * get the amount of data consumed from the last chunk passed to YAJL. * * In the case of a successful parse this can help you understand if * the entire buffer was consumed (which will allow you to handle * "junk at end of input"). * * In the event an error is encountered during parsing, this function * affords the client a way to get the offset into the most recent * chunk where the error occurred. 0 will be returned if no error * was encountered. */ YAJL_API size_t yajl_get_bytes_consumed(yajl_handle hand); /** free an error returned from yajl_get_error */ YAJL_API void yajl_free_error(yajl_handle hand, unsigned char * str); /** * \file yajl_gen.h * Interface to YAJL's JSON generation facilities. */ /** generator status codes */ typedef enum { /** no error */ yajl_gen_status_ok = 0, /** at a point where a map key is generated, a function other than * yajl_gen_string was called */ yajl_gen_keys_must_be_strings, /** YAJL's maximum generation depth was exceeded. see * YAJL_MAX_DEPTH */ yajl_max_depth_exceeded, /** A generator function (yajl_gen_XXX) was called while in an error * state */ yajl_gen_in_error_state, /** A complete JSON document has been generated */ yajl_gen_generation_complete, /** yajl_gen_double was passed an invalid floating point value * (infinity or NaN). */ yajl_gen_invalid_number, /** A print callback was passed in, so there is no internal * buffer to get from */ yajl_gen_no_buf, /** returned from yajl_gen_string() when the yajl_gen_validate_utf8 * option is enabled and an invalid string was passed by client code. */ yajl_gen_invalid_string } yajl_gen_status; /** an opaque handle to a generator */ typedef struct yajl_gen_t * yajl_gen; /** a callback used for "printing" the results. */ typedef void (*yajl_print_t)(void * ctx, const char * str, size_t len); /** configuration parameters for the generator; these may be passed to * yajl_gen_config() along with option specific argument(s). In general, * all configuration parameters default to *off*. */ typedef enum { /** generate indented (beautiful) output */ yajl_gen_beautify = 0x01, /** * Set an indent string which is used when yajl_gen_beautify * is enabled. Maybe something like \\t or some number of * spaces. The default is four spaces ' '.
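 *
 * example (illustrative only, mirroring the other option examples; 'g' is
 * a hypothetical generator handle):
 *   yajl_gen_config(g, yajl_gen_beautify, 1);
 *   yajl_gen_config(g, yajl_gen_indent_string, "\t");   // indent with tabs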
*/ yajl_gen_indent_string = 0x02, /** * Set a function and context argument that should be used to * output generated json. The function should conform to the * yajl_print_t prototype while the context argument is a * void * of your choosing. * * example: * yajl_gen_config(g, yajl_gen_print_callback, myFunc, myVoidPtr); */ yajl_gen_print_callback = 0x04, /** * Normally the generator does not validate that strings you * pass to it via yajl_gen_string() are valid UTF8. Enabling * this option will cause it to do so. */ yajl_gen_validate_utf8 = 0x08, /** * the forward solidus (slash or '/' in human terms) is not required to be * escaped in json text. By default, YAJL will not escape it in the * interest of saving bytes. Setting this flag will cause YAJL to * always escape '/' in generated JSON strings. */ yajl_gen_escape_solidus = 0x10 } yajl_gen_option; /** allow the modification of generator options subsequent to handle * allocation (via yajl_gen_alloc) * \returns zero in case of errors, non-zero otherwise */ YAJL_API int yajl_gen_config(yajl_gen g, yajl_gen_option opt, ...); /** allocate a generator handle * \param allocFuncs an optional pointer to a structure which allows * the client to override the memory allocation * used by yajl. May be NULL, in which case * malloc/free/realloc will be used. * * \returns an allocated handle on success, NULL on failure (bad params) */ YAJL_API yajl_gen yajl_gen_alloc(const yajl_alloc_funcs * allocFuncs); /** free a generator handle */ YAJL_API void yajl_gen_free(yajl_gen handle); YAJL_API yajl_gen_status yajl_gen_integer(yajl_gen hand, long long int number); /** generate a floating point number. number may not be infinity or * NaN, as these have no representation in JSON. In these cases the * generator will return 'yajl_gen_invalid_number' */ YAJL_API yajl_gen_status yajl_gen_double(yajl_gen hand, double number); YAJL_API yajl_gen_status yajl_gen_number(yajl_gen hand, const char * num, size_t len); YAJL_API yajl_gen_status yajl_gen_string(yajl_gen hand, const unsigned char * str, size_t len); YAJL_API yajl_gen_status yajl_gen_null(yajl_gen hand); YAJL_API yajl_gen_status yajl_gen_bool(yajl_gen hand, int boolean); YAJL_API yajl_gen_status yajl_gen_map_open(yajl_gen hand); YAJL_API yajl_gen_status yajl_gen_map_close(yajl_gen hand); YAJL_API yajl_gen_status yajl_gen_array_open(yajl_gen hand); YAJL_API yajl_gen_status yajl_gen_array_close(yajl_gen hand); /** access the null terminated generator buffer. If incrementally * outputting JSON, one should call yajl_gen_clear to clear the * buffer. This allows stream generation. */ YAJL_API yajl_gen_status yajl_gen_get_buf(yajl_gen hand, const unsigned char ** buf, size_t * len); /** clear yajl's output buffer, but maintain all internal generation * state. This function will not "reset" the generator state, and is * intended to enable incremental JSON outputting.
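 *
 * Illustrative sketch only (not part of the original header), assuming
 * stdio.h is available for fwrite:
 *
 *   yajl_gen g = yajl_gen_alloc(NULL);
 *   const unsigned char *buf; size_t len;
 *   yajl_gen_map_open(g);
 *   yajl_gen_string(g, (const unsigned char *)"key", 3);
 *   yajl_gen_integer(g, 42);
 *   yajl_gen_get_buf(g, &buf, &len);   // fetch what has been generated so far
 *   fwrite(buf, 1, len, stdout);
 *   yajl_gen_clear(g);                 // drop the buffer, keep generator state
 *   yajl_gen_map_close(g);
 *   yajl_gen_get_buf(g, &buf, &len);
 *   fwrite(buf, 1, len, stdout);
 *   yajl_gen_free(g);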
*/ YAJL_API void yajl_gen_clear(yajl_gen hand); #ifdef __cplusplus } #endif #endif /* YAJL_API_H */ whitedb-0.7.2/wgdb.def000066400000000000000000000065151226454622500145700ustar00rootroot00000000000000; ; Contains all functions exported by wgdb.dll ; this file should list everything declared in Db/dbapi.h ; LIBRARY WGDB EXPORTS wg_attach_database wg_attach_existing_database wg_attach_logged_database wg_detach_database wg_delete_database wg_create_record wg_create_raw_record wg_delete_record wg_get_first_record wg_get_next_record wg_get_record_len wg_get_record_dataarray wg_set_field wg_set_new_field wg_set_int_field wg_set_double_field wg_set_str_field wg_update_atomic_field wg_set_atomic_field wg_add_int_atomic_field wg_get_field wg_get_field_type wg_get_encoded_type wg_free_encoded wg_encode_null wg_decode_null wg_encode_int wg_decode_int wg_encode_double wg_decode_double wg_encode_fixpoint wg_decode_fixpoint wg_encode_date wg_decode_date wg_encode_time wg_decode_time wg_current_utcdate wg_current_localdate wg_current_utctime wg_current_localtime wg_strf_iso_datetime wg_strp_iso_date wg_strp_iso_time wg_ymd_to_date wg_hms_to_time wg_date_to_ymd wg_time_to_hms wg_encode_str wg_decode_str wg_decode_str_lang wg_decode_str_len wg_decode_str_lang_len wg_decode_str_copy wg_decode_str_lang_copy wg_encode_xmlliteral wg_decode_xmlliteral_copy wg_decode_xmlliteral_xsdtype_copy wg_decode_xmlliteral_len wg_decode_xmlliteral_xsdtype_len wg_decode_xmlliteral wg_decode_xmlliteral_xsdtype wg_encode_uri wg_decode_uri_copy wg_decode_uri_prefix_copy wg_decode_uri_len wg_decode_uri_prefix_len wg_decode_uri wg_decode_uri_prefix wg_encode_blob wg_decode_blob_len wg_decode_blob wg_decode_blob_copy wg_decode_blob_type wg_decode_blob_type_copy wg_decode_blob_type_len wg_encode_record wg_decode_record wg_encode_char wg_decode_char wg_encode_var wg_decode_var wg_start_write wg_end_write wg_start_read wg_end_read wg_run_tests wg_dump wg_dump_internal wg_import_dump wg_attach_local_database wg_delete_local_database wg_print_db wg_print_record wg_snprint_value wg_make_query wg_make_query_rc wg_fetch wg_free_query wg_encode_query_param_null wg_encode_query_param_char wg_encode_query_param_fixpoint wg_encode_query_param_date wg_encode_query_param_time wg_encode_query_param_var wg_encode_query_param_int wg_encode_query_param_double wg_encode_query_param_str wg_free_query_param wg_export_db_csv wg_import_db_csv wg_register_external_db wg_encode_external_data wg_create_index wg_drop_index wg_column_to_index_id wg_get_index_type wg_get_index_template wg_get_all_indexes wg_parse_json_file wg_parse_json_document wg_replay_log wg_start_logging wg_stop_logging wg_database_size wg_database_freesize ; non-API functions (not in dbapi.h) needed to link wgdb.exe wg_check_datatype_writeread wg_show_db_memsegment_header wg_genintdata_asc wg_genintdata_desc wg_genintdata_mix wg_parse_and_encode wg_get_rec_owner wg_attach_memsegment wg_check_header_compat wg_print_code_version wg_print_header_version wg_check_dump wg_parse_and_encode_param wg_delete_document wg_parse_json_param wg_make_json_query wg_print_json_document ; end of wgdb.exe related exports
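; Illustrative only and not part of the module definition above: a minimal C
; client of a few of the exports listed (error handling omitted; the database
; name "1000" and size 2000000 are example values, not WhiteDB requirements):
;
;   void *db = wg_attach_database("1000", 2000000);
;   void *rec = wg_create_record(db, 3);
;   wg_set_int_field(db, rec, 0, 42);
;   wg_detach_database(db);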