==> ruby-graffiti-2.2/COPYING <==

GNU GENERAL PUBLIC LICENSE Version 3, 29 June 2007 Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/> Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The GNU General Public License is a free, copyleft license for software and other kinds of works. The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change all versions of a program--to make sure it remains free software for all its users. We, the Free Software Foundation, use the GNU General Public License for most of our software; it applies also to any other work released this way by its authors. You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things. To protect your rights, we need to prevent others from denying you these rights or asking you to surrender the rights. Therefore, you have certain responsibilities if you distribute copies of the software, or if you modify it: responsibilities to respect the freedom of others. For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it. For the developers' and authors' protection, the GPL clearly explains that there is no warranty for this free software. For both users' and authors' sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions. Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. This is fundamentally incompatible with the aim of protecting users' freedom to change the software. The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable. Therefore, we have designed this version of the GPL to prohibit the practice for those products. If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users. Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary. To prevent this, the GPL assures that patents cannot be used to render the program non-free. The precise terms and conditions for copying, distribution and modification follow. TERMS AND CONDITIONS 0. Definitions. "This License" refers to version 3 of the GNU General Public License. "Copyright" also means copyright-like laws that apply to other kinds of works, such as semiconductor masks. "The Program" refers to any copyrightable work licensed under this License. Each licensee is addressed as "you". "Licensees" and "recipients" may be individuals or organizations. To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work. A "covered work" means either the unmodified Program or a work based on the Program. To "propagate" a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well. To "convey" a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying. An interactive user interface displays "Appropriate Legal Notices" to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion. 1. Source Code. The "source code" for a work means the preferred form of the work for making modifications to it. "Object code" means any non-source form of a work. A "Standard Interface" means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language. The "System Libraries" of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A "Major Component", in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it. 
The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work. The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source. The Corresponding Source for a work in source code form is that same work. 2. Basic Permissions. All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law. You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you. Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary. 3. Protecting Users' Legal Rights From Anti-Circumvention Law. No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures. When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures. 4. Conveying Verbatim Copies. 
You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program. You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee. 5. Conveying Modified Source Versions. You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions: a) The work must carry prominent notices stating that you modified it, and giving a relevant date. b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to "keep intact all notices". c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it. d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so. A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate. 6. Conveying Non-Source Forms. You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways: a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange. b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge. 
c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b. d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements. e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d. A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work. A "User Product" is either (1) a "consumer product", which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, "normally used" refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product. "Installation Information" for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made. If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM). 
The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network. Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying. 7. Additional Terms. "Additional permissions" are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions. When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission. Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms: a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or d) Limiting the use for publicity purposes of names of licensors or authors of the material; or e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors. All other non-permissive additional terms are considered "further restrictions" within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying. 
If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms. Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way. 8. Termination. You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11). However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation. Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice. Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10. 9. Acceptance Not Required for Having Copies. You are not required to accept this License in order to receive or run a copy of the Program. Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so. 10. Automatic Licensing of Downstream Recipients. Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License. An "entity transaction" is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts. You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. 
For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it. 11. Patents. A "contributor" is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. The work thus licensed is called the contributor's "contributor version". A contributor's "essential patent claims" are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, "control" includes the right to grant patent sublicenses in a manner consistent with the requirements of this License. Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version. In the following three paragraphs, a "patent license" is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To "grant" such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party. If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. "Knowingly relying" means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid. If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it. A patent license is "discriminatory" if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. 
You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007. Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law. 12. No Surrender of Others' Freedom. If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program. 13. Use with the GNU Affero General Public License. Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such. 14. Revised Versions of this License. The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation. If the Program specifies that a proxy can decide which future versions of the GNU General Public License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Program. Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version. 15. Disclaimer of Warranty. THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 16. Limitation of Liability. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 17. Interpretation of Sections 15 and 16. If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. <one line to give the program's name and a brief idea of what it does.> Copyright (C) <year> <name of author> This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. Also add information on how to contact you by electronic and paper mail. If the program does terminal interaction, make it output a short notice like this when it starts in an interactive mode: <program> Copyright (C) <year> <name of author> This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, your program's commands might be different; for a GUI interface, you would use an "about box". You should also get your employer (if you work as a programmer) or school, if any, to sign a "copyright disclaimer" for the program, if necessary. For more information on this, and how to apply and follow the GNU GPL, see <http://www.gnu.org/licenses/>.
The GNU General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. But first, please read <http://www.gnu.org/philosophy/why-not-lgpl.html>.

==> ruby-graffiti-2.2/ChangeLog <==

commit 903518dc145cf51d8232aaf6150691427cc38d7f (HEAD, tag: v2.2, origin/master, origin/HEAD, master)
Author: Dmitry Borodaenko
Date:   Sat Jun 9 19:04:01 2012 +0300

    force deterministic iteration over hashes

    This allows unit test to fully compare generated SQL queries with
    etalons. Also, more debug points are added.

 lib/graffiti/debug.rb      |   2 +-
 lib/graffiti/sql_mapper.rb |  48 +++++++++++---------
 lib/graffiti/squish.rb     |  14 +++---
 test/ts_graffiti.rb        | 107 +++++++++++++++++++++-----------------------
 4 files changed, 86 insertions(+), 85 deletions(-)

commit 455d1eb517ec049486839c62fc6562cd91838ebd
Author: anonymous
Date:   Mon Jan 30 21:10:59 2012 +0300

    add spec.test_files

 graffiti.gemspec | 1 +
 1 file changed, 1 insertion(+)

commit 0a253fcb561a699b35467a2b11ae1aa15c36c173
Author: Dmitry Borodaenko
Date:   Sun Feb 5 15:15:08 2012 +0300

    better example in README

 README.rdoc | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

commit 7c576111432963bb97f235dd55294346dc9cf8b5
Author: anonymous
Date:   Sun Jan 29 15:23:07 2012 +0300

    initial support for gem creation

 graffiti.gemspec | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

commit 4b2a34274aa8735bdb26ad3609ce132886d4ee18 (tag: v2.1, bejbus/master)
Author: Dmitry Borodaenko
Date:   Sun Dec 25 16:52:34 2011 +0300

    updated README for Sequel, updated copyrights

 README.rdoc                      | 8 ++++----
 lib/graffiti/rdf_config.rb       | 2 +-
 lib/graffiti/rdf_property_map.rb | 2 +-
 lib/graffiti/sql_mapper.rb       | 2 +-
 lib/graffiti/squish.rb           | 2 +-
 lib/graffiti/store.rb            | 2 +-
 6 files changed, 9 insertions(+), 9 deletions(-)

commit e8004f1244745d68758f21cc8820ded433d6faad (tag: v2.0)
Author: Dmitry Borodaenko
Date:   Fri Sep 30 23:02:40 2011 +0300

    migrate from DBI to Sequel

    * Store now expects a Sequel::Database object
    * SquishSelect and SquishAssert refactored
    * SquishQuery now keeps raw unescaped literals in @strings,
      SquishAssert passes these as is to Sequel::Dataset, SquishSelect
      still escapes them back into the SQL query locally
    * validate_expression now returns the validated string for
      chainability
    * substitute_parameters dropped: Sequel understands named parameters
    * SquishSelect#to_sql (and by extention Store#select) now returns
      only SQL query (same params hash can now be passed to Sequel
      verbatim)
    * SquishSelect#to_sql now names columns with corresponding blank
      node names where applicable
    * unit test now runs select and assert on an in-memory Sqlite
      database

 doc/examples/samizdat-rdf-config.yaml    |  70 ++---
 doc/examples/samizdat-triggers-pgsql.sql | 138 ++++------
 lib/graffiti/debug.rb                    |  34 +++
 lib/graffiti/rdf_config.rb               |   1 +
 lib/graffiti/rdf_property_map.rb         |  10 +-
 lib/graffiti/sql_mapper.rb               |  31 +--
 lib/graffiti/squish.rb                   | 416 ++++++++++++++++++------------
 lib/graffiti/store.rb                    |  35 +--
 test/ts_graffiti.rb                      | 237 +++++++++++++----
 9 files changed, 580 insertions(+), 392 deletions(-)

commit ef6717ce1fe9a36bf450866034f915b1d3013e11
Author: Dmitry Borodaenko
Date:   Fri Sep 16 22:18:19 2011 +0300

    old Monotone changelog

 ChangeLog.mtn | 233 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 233 insertions(+)
commit 6b5b8f32ca3b362227f09852da42df4dd56b498e
Author: Dmitry Borodaenko
Date:   Fri Sep 16 22:10:29 2011 +0300

    ordering-agnosting query matching in unit tests

 test/ts_graffiti.rb | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

commit 574f5519a9ed4dbee78b0b5101918db08089ea3e
Author: Dmitry Borodaenko
Date:   Fri Sep 16 22:10:07 2011 +0300

    SqlExpression#to_str for Ruby 1.9

 lib/graffiti/sql_mapper.rb | 2 ++
 1 file changed, 2 insertions(+)

commit cf3325be344735f2bbf0ba7cacaa16467e1148e4
Author: Dmitry Borodaenko
Date:   Sat Jun 4 22:08:18 2011 +0300

    minor rdoc markup fix

 README.rdoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

commit 210ab735c85abc429d9651bd340d7683df595b16
Author: Dmitry Borodaenko
Date:   Sat Jun 4 22:02:51 2011 +0300

    ICIS 2009 paper

    New paper added (On-demand RDF to Relational Query Translation in
    Samizdat RDF Store, ICIS 2009)

 ...df-to-relational-query-translation-icis2009.tex | 936 ++++++++++++++++++++
 1 file changed, 936 insertions(+)

commit 3605566c2a3b251e592f2a78aee0048724f76174 (tag: v1.0)
Author: Dmitry Borodaenko
Date:   Sat Jun 4 22:00:39 2011 +0300

    first commit to github

 COPYING                                  |  676 +++++++++++++++
 README.rdoc                              |  127 +++
 TODO                                     |   30 +
 doc/diagrams/graffiti-classes.svg        |  157 ++++
 doc/diagrams/graffiti-deployment.svg     |  117 +++
 doc/diagrams/graffiti-store-sequence.svg |   69 ++
 doc/diagrams/squish-select-sequence.svg  |  266 ++++++
 doc/examples/samizdat-rdf-config.yaml    |   95 +++
 doc/examples/samizdat-triggers-pgsql.sql |  290 +++++++
 doc/papers/collreif.tex                  |  462 ++++++++++
 doc/papers/rel-rdf.tex                   |  545 ++++++++++++
 doc/rdf-impl-report.txt                  |  126 +++
 lib/graffiti.rb                          |   15 +
 lib/graffiti/exceptions.rb               |   20 +
 lib/graffiti/rdf_config.rb               |   77 ++
 lib/graffiti/rdf_property_map.rb         |   84 ++
 lib/graffiti/sql_mapper.rb               |  927 ++++++++++++++++++++
 lib/graffiti/squish.rb                   |  496 +++++++++++
 lib/graffiti/store.rb                    |   99 +++
 setup.rb                                 | 1360 ++++++++++++++++++++++++++++++
 test/ts_graffiti.rb                      |  321 +++++++
 21 files changed, 6359 insertions(+)

==> ruby-graffiti-2.2/ChangeLog.mtn <==

-----------------------------------------------------------------
Revision: 5f07d75e7786d56d30659b04ac0091fc8bc37fda
Ancestor: e13812ed0f84c7e722284750df4ac1a8ef81a501
Author: angdraug@debian.org
Date: 2009-10-11T12:23:07
Branch: graffiti-head

Deleted entries:
        doc/diagrams/graffiti_classes.dia
Added files:
        doc/diagrams/graffiti-classes.svg
        doc/diagrams/graffiti-deployment.svg
        doc/diagrams/graffiti-store-sequence.svg
        doc/diagrams/squish-select-sequence.svg

ChangeLog:

replaced the old Dia diagram with SVG diagrams produced with BoUML

-----------------------------------------------------------------
Revision: e13812ed0f84c7e722284750df4ac1a8ef81a501
Ancestor: e364f136ecc00b4559a0f0d07f257886f48ae4ef
Author: angdraug@debian.org
Date: 2009-09-06T13:47:40
Branch: graffiti-head

Modified files:
        README.rdoc

ChangeLog:

relational data adaptation instructions added

-----------------------------------------------------------------
Revision: e364f136ecc00b4559a0f0d07f257886f48ae4ef
Ancestor: f2ef2d91fb1e29ee91928b5ebf52c22b09def60b
Author: angdraug@debian.org
Date: 2009-09-06T13:22:12
Branch: graffiti-head

Modified files:
        README.rdoc

ChangeLog:

update language description added

-----------------------------------------------------------------
Revision: f2ef2d91fb1e29ee91928b5ebf52c22b09def60b
Ancestor: c5c5ad50f2f0e1159628804eb141c60977483fb7
Author: angdraug@debian.org
Date: 2009-09-06T13:12:45
Branch: graffiti-head
Modified files:
        README.rdoc

ChangeLog:

query language description added

-----------------------------------------------------------------
Revision: c5c5ad50f2f0e1159628804eb141c60977483fb7
Ancestor: c6389322361c36538d626257f8dd957864e7b85e
Author: angdraug@debian.org
Date: 2009-08-22T12:38:32
Branch: graffiti-head

Modified files:
        TODO
        lib/graffiti/rdf_config.rb
        lib/graffiti/rdf_property_map.rb
        lib/graffiti/sql_mapper.rb

ChangeLog:

move setting of subproperty_of and transitive_closure outside of the
RdfPropertyMap constructor; documented some missing pieces

-----------------------------------------------------------------
Revision: c6389322361c36538d626257f8dd957864e7b85e
Ancestor: 41a4f5d475227a9df9aae268506bcdb2dbf2e6ab
Author: angdraug@debian.org
Date: 2009-08-22T12:37:03
Branch: graffiti-head

Modified files:
        lib/graffiti/squish.rb

ChangeLog:

use @db, db shortcut is no longer available

-----------------------------------------------------------------
Revision: 41a4f5d475227a9df9aae268506bcdb2dbf2e6ab
Ancestor: bed7201b8512540b0c225add168a66b18cfad5f3
Author: angdraug@debian.org
Date: 2009-08-05T13:16:50
Branch: graffiti-head

Modified files:
        lib/graffiti/squish.rb

ChangeLog:

allow PostgreSQL full text search operators in expressions

-----------------------------------------------------------------
Revision: bed7201b8512540b0c225add168a66b18cfad5f3
Ancestor: 3538ff90e492811cd2c022cf6502439090241d8e
Author: angdraug@debian.org
Date: 2009-08-02T14:34:09
Branch: graffiti-head

Modified files:
        lib/graffiti/sql_mapper.rb

ChangeLog:

move SqlNodeBinding and SqlExpression out of SqlMapper namespace;
move exception raising into check_graph

-----------------------------------------------------------------
Revision: 3538ff90e492811cd2c022cf6502439090241d8e
Ancestor: ad944a53e855766391fb1840b09cc47eac78f911
Author: angdraug@debian.org
Date: 2009-07-30T08:50:54
Branch: graffiti-head

Added files:
        lib/graffiti/exceptions.rb
        lib/graffiti/rdf_config.rb
        lib/graffiti/rdf_property_map.rb
        lib/graffiti/sql_mapper.rb
        lib/graffiti/squish.rb
        lib/graffiti/store.rb
Added directories:
        lib/graffiti
Modified files:
        lib/graffiti.rb

ChangeLog:

split Graffiti classes into their own .rb files

-----------------------------------------------------------------
Revision: ad944a53e855766391fb1840b09cc47eac78f911
Ancestor: 35e3b7019eda146e755449c86f2b04f035796608
Author: angdraug@debian.org
Date: 2009-07-28T12:42:47
Branch: graffiti-head

Modified files:
        README.rdoc

ChangeLog:

mention SynCache in README.rdoc

-----------------------------------------------------------------
Revision: 35e3b7019eda146e755449c86f2b04f035796608
Ancestor: 7eeecf40a8f1e0127ecc6d2f818bd6d80e0b1f90
Author: angdraug@debian.org
Date: 2009-07-28T10:56:06
Branch: graffiti-head

Modified files:
        README.rdoc
        lib/graffiti.rb
        test/ts_graffiti.rb

ChangeLog:

module initialization fixes

-----------------------------------------------------------------
Revision: 7eeecf40a8f1e0127ecc6d2f818bd6d80e0b1f90
Ancestor: 35efa8b3fb65bbe7744fc930dc3ceb5a98564a98
Author: angdraug@debian.org
Date: 2009-07-28T10:45:31
Branch: graffiti-head

Modified files:
        README.rdoc

ChangeLog:

minor documentation cleanup

-----------------------------------------------------------------
Revision: 35efa8b3fb65bbe7744fc930dc3ceb5a98564a98
Ancestor: 60fc13b5ffccadc9990d5303de93e71be30128bb
Author: angdraug@debian.org
Date: 2009-07-28T10:38:48
Branch: graffiti-head

Added files:
        doc/diagrams/graffiti_classes.dia
Added directories:
        doc/diagrams
Modified files:
        README.rdoc

ChangeLog:

documentation update
-----------------------------------------------------------------
Revision: 60fc13b5ffccadc9990d5303de93e71be30128bb
Ancestor: abd14a047ad68bc4e92ca80916e71a39f605b14c
Author: angdraug@debian.org
Date: 2009-07-27T19:20:47
Branch: graffiti-head

Renamed entries:
        doc/examples/samizdat-triggers-pgsql-sql to doc/examples/samizdat-triggers-pgsql.sql

ChangeLog:

fixed .sql extension for triggers example

-----------------------------------------------------------------
Revision: abd14a047ad68bc4e92ca80916e71a39f605b14c
Ancestor:
Author: angdraug@debian.org
Date: 2009-07-27T19:16:55
Branch: graffiti-head

Added files:
        COPYING
        README.rdoc
        TODO
        doc/examples/samizdat-rdf-config.yaml
        doc/examples/samizdat-triggers-pgsql-sql
        doc/papers/collreif.tex
        doc/papers/rel-rdf.tex
        doc/rdf-impl-report.txt
        lib/graffiti.rb
        setup.rb
        test/ts_graffiti.rb
Added directories:
        .
        doc
        doc/examples
        doc/papers
        lib
        test

ChangeLog:

initial checkin: Graffiti is a spin-off of storage.rb from Samizdat project

==> ruby-graffiti-2.2/README.rdoc <==

= Graffiti - relational RDF store for Ruby

== Synopsis

  require 'sequel'
  require 'yaml'
  require 'graffiti'

  db = Sequel.connect(:adapter => 'pg', :database => dbname)
  config = File.open('rdf.yaml') {|f| YAML.load(f.read) }
  store = Graffiti::Store.new(db, config)

  data = store.fetch(%{
    SELECT ?date, ?title
    WHERE (dc::date ?r ?date FILTER ?date >= :start)
          (dc::title ?r ?title)
    ORDER BY ?date DESC}, 10, 0, :start => Time.now - 24*3600)

  puts data.first[:title]

== Description

Graffiti is an RDF store based on dynamic translation of RDF queries into
SQL. Graffiti can map any relational database schema onto RDF semantics
and, vice versa, store arbitrary RDF data in a relational database.

== Requirements

Graffiti uses Sequel to connect to the database backend and provides a
DBI-like interface to run RDF queries in the Squish query language from
Ruby applications. The SynCache object cache is used for in-process
caching of translated Squish queries.

== Query Language

Graffiti implements the Squish RDF query language with several
extensions. A query may include the following clauses:

* SELECT: comma-separated list of result expressions, which may be
  variables or aggregate functions.

* WHERE: main graph pattern, described as a list of triple patterns.
  Each triple is enclosed in parentheses and includes predicate, subject
  and object. The predicate must be a URL and may use a shorthand
  notation with namespace id separated by a double colon. The subject
  may be a URL, internal resource id, or variable. The object may be a
  URL, internal resource id, variable, or literal. Values of variables
  bound by the triple pattern may be constrained by an optional FILTER
  expression.

* EXCEPT: negative graph pattern. Solutions that match any part of the
  negative graph pattern are excluded from the result set.

* OPTIONAL: optional graph pattern. Variables defined in the optional
  pattern are included in the result set only for solutions that match
  the corresponding parts of the optional graph pattern.

* LITERAL: global filter expression. Used for expressions that involve
  variables from different triple patterns.

* GROUP BY: result set aggregation criteria.

* ORDER BY: result set ordering criteria.

* USING: namespace definitions. Namespaces defined in the RDF store
  configuration do not have to be repeated here.
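For instance, a query combining several of these clauses could look like
the following sketch. It is illustrative only: it assumes the dc and s
namespaces and the dc::title, dc::date, dc::creator and s::hidden
property mappings from the example configuration in doc/examples/, and
the same fetch signature as in the synopsis above.

  data = store.fetch(%{
    SELECT ?msg, ?title
    WHERE (dc::title ?msg ?title)
          (dc::date ?msg ?date)
    OPTIONAL (dc::creator ?msg ?creator)
    EXCEPT (s::hidden ?msg ?hidden)
    ORDER BY ?date DESC
    USING s FOR http://www.nongnu.org/samizdat/rdf/schema#}, 10, 0)

Here ?msg is bound to each matching resource, hidden messages are
excluded by the EXCEPT pattern, ?creator is filled in only where
present, and the USING clause defines the s namespace inline instead of
relying on the store configuration.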
A basic update language is also implemented. A Squish assert statement
uses the same structure as a query, with the SELECT clause replaced by
either one or both of the following clauses:

* INSERT: list of variables representing new resources to be inserted
  into the RDF graph.

* UPDATE: list of assignments of literal expressions to variables bound
  by a solution.

An assert statement only updates one solution per invocation: if more
than one solution matches the graph pattern, only the first one is
updated.

== Relational Data

Relational data has to be adapted for RDF access using Graffiti. The
adaptation is non-intrusive and will not break compatibility with
existing SQL queries. The following schema changes are required in all
cases:

* Create an rdfs:Resource superclass table with an auto-generated
  primary key.

* Replace primary keys of mapped subclass tables with foreign keys
  referencing the rdfs:Resource table (existing foreign keys may need
  to be updated to reflect this change).

* Register rdfs:subClassOf inference database triggers to update the
  rdfs:Resource table and maintain foreign key integrity on all changes
  in mapped subclass tables.

The following changes may be necessary to support optional RDF mapping
features:

* Register database triggers for other cases of rdfs:subClassOf
  entailment.

* Create a triples table (required to represent non-relational RDF data
  and RDF statement reification).

* Add sub-property qualifier attributes referencing the property URIref
  entry in the rdfs:Resource table for each attribute mapped to a
  super-property.

* Create transitive closure tables, register owl:TransitiveProperty
  inference triggers.

An example of an RDF map and corresponding triggers can be found in
doc/examples/; a minimal schema sketch follows.
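The first two required changes might look like this as a Sequel
migration (a minimal sketch, assuming a single mapped subclass table
named message; the table and column names are taken from the examples in
doc/examples/ and are not mandated by Graffiti):

  Sequel.migration do
    change do
      # rdfs:Resource superclass table with an auto-generated primary key
      create_table :resource do
        primary_key :id
        TrueClass :literal, :default => false
        TrueClass :uriref,  :default => false
        String    :label
      end

      # mapped subclass table: its primary key is a foreign key into the
      # resource table instead of being auto-generated
      create_table :message do
        foreign_key :id, :resource, :primary_key => true
        String :title
        String :content
      end
    end
  end

The rdfs:subClassOf triggers from the third step (see
doc/examples/samizdat-triggers-pgsql.sql) then fill in the resource row
whenever a message row is inserted without an explicit id.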
== Copying

Copyright (c) 2002-2011 Dmitry Borodaenko

This program is free software. You can distribute/modify this program
under the terms of the GNU General Public License version 3 or later.

==> ruby-graffiti-2.2/TODO <==

Graffiti ToDo List
==================

- generalize RDF storage, implement SPARQL
- unit, functional, and performance test suite (in progress)
- separate library for RDF storage (done)
- investigate alternative backends: FramerD, 3store, Redland
  -- depends on separate library for RDF storage
  -- depends on test suite
- security: Squish literal condition safety (done), limited number of
  query clauses (done), dry-run of user-defined query, approvable
  resource usage
- query result set representation
- optional patterns and negation (done)
- parametrized queries (done)
  -- cache prepared statements
- support blob literals
  -- depends on parametrized queries
- vocabulary entailment: RDF, RDFS, OWL
- RDF aggregates storage internalization (Seq, Bag, Alt)
- query introspection
- storage workflow control (triggers)
- transparent (structured) RDF query storage
  -- depends on RDF aggregates storage
  -- depends on storage workflow control
- subqueries (query premise)
  -- depends on transparent query storage
- chain queries
  -- depends on native RDF storage

==> ruby-graffiti-2.2/doc/diagrams/graffiti-classes.svg <==

[UML class diagram: SquishQuery (ns, variables; uri_shrink(), ns_shrink(),
substitute_literals(), validate_expression()); SqlMapper (clauses, nodes,
from, where; bind(), check_graph(), map_predicates(),
label_pattern_components(), map_pattern(), refine_ambiguous_properties(),
define_relation_aliases(), transform(), generate_tables_and_conditions());
SquishSelect (to_sql()); SquishAssert (assert()); SqlExpression (parts;
to_s(), traverse(), rebind(), eql?()); SqlNodeBinding (alias, field;
to_s(), eql?(), hash()); Store (get_property(), select_one(),
select_all(), assert(), select()); RdfConfig (ns, map; ns_expand());
Store refers to its sql_mapper and config.]

==> ruby-graffiti-2.2/doc/diagrams/graffiti-deployment.svg <==

[UML deployment diagram: a Database Server running a Relational DBMS with
the Graffiti Triggers and Relational Data artifacts; an Application
Server running Ruby with the Application, Graffiti Module, and Ruby/DBI
artifacts, connected to the DBMS over SQL; a Cache Server running Ruby
with the SynCache Module, connected over DRuby.]

==> ruby-graffiti-2.2/doc/diagrams/graffiti-store-sequence.svg <==

[UML sequence diagram of a Store query: :Application calls select_all()
on :Store, which calls select() and then select_all() on
:DatabaseHandle.]

==> ruby-graffiti-2.2/doc/diagrams/squish-select-sequence.svg <==

[UML sequence diagram of SquishSelect#to_sql(): :Store creates
:SquishSelect, which drives :SqlMapper through bind(), check_graph(),
map_predicates(), map_pattern(), label_pattern_components(),
map(predicate) on :RdfConfig, refine_ambiguous_properties(), transform(),
define_relation_aliases(), and generate_tables_and_conditions(), creating
:SqlExpression and :SqlNodeBinding objects and calling rebind(); the
where and from clauses are returned to :Store.]

==> ruby-graffiti-2.2/doc/examples/samizdat-rdf-config.yaml <==

---
# rdf.yaml
#
# Defines essential parts of RDF model of a Samizdat site. Don't touch
# it unless you know what you're doing.
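# An illustrative note on reading the property map below: an entry like
#
#   'dc::title': {message: title}
#
# maps the RDF property dc:title onto the "title" column of the
# "message" table, so the Squish pattern (dc::title ?msg ?title) is
# answered from message.title; properties absent from the map are
# reified through the generic statement table via rdf::subject,
# rdf::predicate and rdf::object.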
# Namespaces
#
ns:
  s: 'http://www.nongnu.org/samizdat/rdf/schema#'
  tag: 'http://www.nongnu.org/samizdat/rdf/tag#'
  items: 'http://www.nongnu.org/samizdat/rdf/items#'
  rdf: 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
  dc: 'http://purl.org/dc/elements/1.1/'
  dct: 'http://purl.org/dc/terms/'
  ical: 'http://www.w3.org/2002/12/cal#'

# Mapping of internal RDF properties to tables and fields. Statements
# over properties not listed here or in 'subproperties:' section below
# are reified using standard rdf::subject, rdf::predicate, and
# rdf::object properties, so at least these three and s::id must be
# mapped.
#
map:
  's::id': {resource: id}
  'dc::date': {resource: published_date}
  'dct::isPartOf': {resource: part_of}
  's::isPartOfSubProperty': {resource: part_of_subproperty}
  's::partSequenceNumber': {resource: part_sequence_number}
  'rdf::subject': {statement: subject}
  'rdf::predicate': {statement: predicate}
  'rdf::object': {statement: object}
  's::login': {member: login}
  's::fullName': {member: full_name}
  's::email': {member: email}
  'dc::title': {message: title}
  'dc::creator': {message: creator}
  'dc::format': {message: format}
  'dc::language': {message: language}
  's::openForAll': {message: open}
  's::hidden': {message: hidden}
  's::locked': {message: locked}
  's::content': {message: content}
  's::htmlFull': {message: html_full}
  's::htmlShort': {message: html_short}
  's::rating': {statement: rating}
  's::voteProposition': {vote: proposition}
  's::voteMember': {vote: member}
  's::voteRating': {vote: rating}

# Map of properties into lists of their subproperties. For each property
# listed here, an additional qualifier field named <field>_subproperty
# is defined in the same table (as defined under 'map:' above) referring
# to resource id identifying the subproperty (normally a uriref resource
# holding uriref of the subproperty). Only one level of subproperty
# relation is supported, all subsubproperties must be listed directly
# under root property.
#
subproperties:
  'dct::isPartOf': [ 's::inReplyTo', 'dct::isVersionOf',
                     's::isTranslationOf', 's::subTagOf' ]

# Map of transitive RDF properties into tables that hold their
# transitive closures. The format of the table is as follows: 'resource'
# field refers to the subject resource id, property field (and qualifier
# field in case of subproperty) has the same name as in the main table
# (as defined under 'map:' above) and holds reference to predicate
# object, and 'distance' field holds the distance from subject to object
# in the RDF graph.
#
transitive_closure:
  'dct::isPartOf': part

==> ruby-graffiti-2.2/doc/examples/samizdat-triggers-pgsql.sql <==

-- Samizdat Database Triggers - PostgreSQL
--
-- Copyright (c) 2002-2011 Dmitry Borodaenko
--
-- This program is free software.
-- You can distribute/modify this program under the terms of
-- the GNU General Public License version 3 or later.
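--
-- Overview of the trigger set defined below:
--
-- * insert_resource() / delete_resource() keep the resource superclass
--   table in sync with the statement, member, message and vote tables
--   (the rdfs:subClassOf mapping described in README.rdoc).
-- * select_subproperty() returns a property value only when its
--   subproperty qualifier is set.
-- * calculate_statement_rating() returns the average vote rating of a
--   statement; update_rating() applies it whenever votes change.
-- * update_nrelated() / update_nrelated_if_subtag() maintain per-tag
--   counters of positively rated related resources, including subtags.
-- * before_update_part() / after_update_part() guard against loops and
--   maintain the "part" transitive closure table for dct:isPartOf and
--   its subproperties.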
--

CREATE FUNCTION insert_resource() RETURNS TRIGGER AS $$
BEGIN
    IF NEW.id IS NULL THEN
        INSERT INTO resource (literal, uriref, label)
            VALUES ('false', 'false', TG_ARGV[0]);
        NEW.id := currval('resource_id_seq');
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE 'plpgsql';

CREATE TRIGGER insert_statement BEFORE INSERT ON statement
    FOR EACH ROW EXECUTE PROCEDURE insert_resource('statement');

CREATE TRIGGER insert_member BEFORE INSERT ON member
    FOR EACH ROW EXECUTE PROCEDURE insert_resource('member');

CREATE TRIGGER insert_message BEFORE INSERT ON message
    FOR EACH ROW EXECUTE PROCEDURE insert_resource('message');

CREATE TRIGGER insert_vote BEFORE INSERT ON vote
    FOR EACH ROW EXECUTE PROCEDURE insert_resource('vote');

CREATE FUNCTION delete_resource() RETURNS TRIGGER AS $$
BEGIN
    DELETE FROM resource WHERE id = OLD.id;
    RETURN NULL;
END;
$$ LANGUAGE 'plpgsql';

CREATE TRIGGER delete_statement AFTER DELETE ON statement
    FOR EACH ROW EXECUTE PROCEDURE delete_resource();

CREATE TRIGGER delete_member AFTER DELETE ON member
    FOR EACH ROW EXECUTE PROCEDURE delete_resource();

CREATE TRIGGER delete_message AFTER DELETE ON message
    FOR EACH ROW EXECUTE PROCEDURE delete_resource();

CREATE TRIGGER delete_vote AFTER DELETE ON vote
    FOR EACH ROW EXECUTE PROCEDURE delete_resource();

CREATE FUNCTION select_subproperty(value resource.id%TYPE,
                                   subproperty resource.id%TYPE)
RETURNS resource.id%TYPE AS $$
BEGIN
    IF subproperty IS NULL THEN
        RETURN NULL;
    ELSE
        RETURN value;
    END IF;
END;
$$ LANGUAGE 'plpgsql';

CREATE FUNCTION calculate_statement_rating(statement_id statement.id%TYPE)
RETURNS statement.rating%TYPE AS $$
BEGIN
    RETURN (SELECT AVG(rating) FROM vote WHERE proposition = statement_id);
END;
$$ LANGUAGE 'plpgsql';

CREATE FUNCTION update_nrelated(tag_id resource.id%TYPE) RETURNS VOID AS $$
DECLARE
    dc_relation resource.label%TYPE :=
        'http://purl.org/dc/elements/1.1/relation';
    s_subtag_of resource.label%TYPE :=
        'http://www.nongnu.org/samizdat/rdf/schema#subTagOf';
    s_subtag_of_id resource.id%TYPE;
    n tag.nrelated%TYPE;
    supertag RECORD;
BEGIN
    -- update nrelated
    SELECT COUNT(*) INTO n
        FROM statement s
        INNER JOIN resource p ON s.predicate = p.id
        WHERE p.label = dc_relation AND s.object = tag_id AND s.rating > 0;

    UPDATE tag SET nrelated = n WHERE id = tag_id;
    IF NOT FOUND THEN
        INSERT INTO tag (id, nrelated) VALUES (tag_id, n);
    END IF;

    -- update nrelated_with_subtags for this tag and its supertags
    SELECT id INTO s_subtag_of_id FROM resource WHERE label = s_subtag_of;

    FOR supertag IN (
        SELECT tag_id AS id, 0 AS distance
        UNION
        SELECT part_of AS id, distance FROM part
            WHERE id = tag_id
            AND part_of_subproperty = s_subtag_of_id
        ORDER BY distance ASC)
    LOOP
        UPDATE tag
            SET nrelated_with_subtags = nrelated + COALESCE((
                SELECT SUM(subt.nrelated)
                    FROM part p
                    INNER JOIN tag subt ON subt.id = p.id
                    WHERE p.part_of = supertag.id
                    AND p.part_of_subproperty = s_subtag_of_id), 0)
            WHERE id = supertag.id;
    END LOOP;
END;
$$ LANGUAGE 'plpgsql';

CREATE FUNCTION update_nrelated_if_subtag(tag_id resource.id%TYPE,
                                          property resource.id%TYPE)
RETURNS VOID AS $$
DECLARE
    s_subtag_of resource.label%TYPE :=
        'http://www.nongnu.org/samizdat/rdf/schema#subTagOf';
    s_subtag_of_id resource.id%TYPE;
BEGIN
    SELECT id INTO s_subtag_of_id FROM resource WHERE label = s_subtag_of;

    IF property = s_subtag_of_id THEN
        PERFORM update_nrelated(tag_id);
    END IF;
END;
$$ LANGUAGE 'plpgsql';
    new_rating statement.rating%TYPE;
    tag_id resource.id%TYPE;
    predicate_uriref resource.label%TYPE;
BEGIN
    -- save some values for later reference
    SELECT s.rating, s.object, p.label
        INTO old_rating, tag_id, predicate_uriref
        FROM statement s
        INNER JOIN resource p ON s.predicate = p.id
        WHERE s.id = NEW.proposition;

    -- set new rating of the proposition
    new_rating := calculate_statement_rating(NEW.proposition);
    UPDATE statement SET rating = new_rating WHERE id = NEW.proposition;

    -- check if new rating reverts truth value of the proposition
    IF predicate_uriref = dc_relation
        AND (((old_rating IS NULL OR old_rating <= 0) AND new_rating > 0)
            OR (old_rating > 0 AND new_rating <= 0))
    THEN
        PERFORM update_nrelated(tag_id);
    END IF;

    RETURN NEW;
END;
$$ LANGUAGE 'plpgsql';

CREATE TRIGGER update_rating AFTER INSERT OR UPDATE OR DELETE ON vote
    FOR EACH ROW EXECUTE PROCEDURE update_rating();

CREATE FUNCTION before_update_part() RETURNS TRIGGER AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        IF NEW.part_of IS NULL THEN
            RETURN NEW;
        END IF;
    ELSIF TG_OP = 'UPDATE' THEN
        IF (NEW.part_of IS NULL AND OLD.part_of IS NULL)
            OR ((NEW.part_of = OLD.part_of)
                AND (NEW.part_of_subproperty = OLD.part_of_subproperty))
        THEN
            -- part_of is unchanged, do nothing
            RETURN NEW;
        END IF;
    END IF;

    -- check for loops
    IF NEW.part_of = NEW.id OR NEW.part_of IN (
        SELECT id FROM part WHERE part_of = NEW.id)
    THEN
        -- unset part_of, but don't fail whole query
        NEW.part_of = NULL;
        NEW.part_of_subproperty = NULL;

        IF TG_OP != 'INSERT' THEN
            -- check it was a subtag link
            PERFORM update_nrelated_if_subtag(OLD.id, OLD.part_of_subproperty);
        END IF;

        RETURN NEW;
    END IF;

    RETURN NEW;
END;
$$ LANGUAGE 'plpgsql';

CREATE TRIGGER before_update_part BEFORE INSERT OR UPDATE ON resource
    FOR EACH ROW EXECUTE PROCEDURE before_update_part();

CREATE FUNCTION after_update_part() RETURNS TRIGGER AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        IF NEW.part_of IS NULL THEN
            RETURN NEW;
        END IF;
    ELSIF TG_OP = 'UPDATE' THEN
        IF (NEW.part_of IS NULL AND OLD.part_of IS NULL)
            OR ((NEW.part_of = OLD.part_of)
                AND (NEW.part_of_subproperty = OLD.part_of_subproperty))
        THEN
            -- part_of is unchanged, do nothing
            RETURN NEW;
        END IF;
    END IF;

    IF TG_OP != 'INSERT' THEN
        IF OLD.part_of IS NOT NULL THEN
            -- clean up links generated for old part_of
            DELETE FROM part
                WHERE id IN (
                    -- for old resource...
                    SELECT OLD.id
                    UNION
                    -- ...and all its parts, ...
                    SELECT id FROM part WHERE part_of = OLD.id)
                AND part_of IN (
                    -- ...remove links to all parents of old resource
                    SELECT part_of FROM part WHERE id = OLD.id)
                AND part_of_subproperty = OLD.part_of_subproperty;
        END IF;
    END IF;

    IF TG_OP != 'DELETE' THEN
        IF NEW.part_of IS NOT NULL THEN
            -- generate links to the parent and grand-parents of new resource
            INSERT INTO part (id, part_of, part_of_subproperty, distance)
                SELECT NEW.id, NEW.part_of, NEW.part_of_subproperty, 1
                UNION
                SELECT NEW.id, part_of, NEW.part_of_subproperty, distance + 1
                    FROM part
                    WHERE id = NEW.part_of
                    AND part_of_subproperty = NEW.part_of_subproperty;

            -- generate links from all parts of new resource to all its parents
            INSERT INTO part (id, part_of, part_of_subproperty, distance)
                SELECT child.id, parent.part_of, NEW.part_of_subproperty,
                        child.distance + parent.distance
                    FROM part child
                    INNER JOIN part parent
                        ON parent.id = NEW.id
                        AND parent.part_of_subproperty = NEW.part_of_subproperty
                    WHERE child.part_of = NEW.id
                    AND child.part_of_subproperty = NEW.part_of_subproperty;
        END IF;
    END IF;

    -- check if subtag link was affected
    IF TG_OP != 'DELETE' THEN
        PERFORM update_nrelated_if_subtag(NEW.id, NEW.part_of_subproperty);
    END IF;
    IF TG_OP != 'INSERT' THEN
        PERFORM update_nrelated_if_subtag(OLD.id, OLD.part_of_subproperty);
    END IF;

    RETURN NEW;
END;
$$ LANGUAGE 'plpgsql';

CREATE TRIGGER after_update_part AFTER INSERT OR UPDATE OR DELETE ON resource
    FOR EACH ROW EXECUTE PROCEDURE after_update_part();

ruby-graffiti-2.2/doc/papers/collreif.tex

\documentclass{llncs}
\usepackage{makeidx}  % allows for indexgeneration
\usepackage[pdfpagescrop={92 112 523 778},a4paper=false,
  pdfborder={0 0 0}]{hyperref}
\emergencystretch=8pt
%
\begin{document}
\mainmatter  % start of the contributions
%
\title{Model for Collaborative Decision Making Based on RDF Reification}
\toctitle{Model for Collaborative Decision Making Based on RDF Reification}
\titlerunning{Collaboration and RDF Reification}
%
\author{Dmitry Borodaenko}
\authorrunning{Dmitry Borodaenko}  % abbreviated author list (for running head)
%%%% modified list of authors for the TOC (add the affiliations)
\tocauthor{Dmitry Borodaenko}
%
\institute{\email{angdraug@debian.org}}

\maketitle  % typeset the title of the contribution

\begin{abstract}
This paper presents a novel approach to online collaboration on the Web,
intended as a technical means of making collective decisions in situations
where consensus fails. It is proposed that participants of the process are
allowed to create statements about site resources and, by means of RDF
reification, to assert personal approval of such statements. Arbitrary
algorithms may then be used to determine the validity of a statement in a
given context from the set of approval statements by different participants.
The paper goes on to discuss the applicability of the proposed approach in
the areas of open-source development and independent media, and describes its
implementation in the Samizdat open publishing and collaboration system.
\end{abstract}

\section{Introduction}

The extensive growth of the Internet over the last decades has introduced a new form of human collaboration: online communities. The availability of cheap digital communication media has made it possible to form large distributed projects, bringing together participants who would otherwise be unable to cooperate.
As more and more projects go online and spread across the globe, it becomes apparent that new opportunities in remote cooperation also bring forth new challenges. As observed by Steven Talbott\cite{fdnc}, technological means do not provide a full substitute for real person-to-person relations: ``technology is not a community''. A well-known example of this is the fact that it is vital for an online community to augment indirect and impersonal digital communications with live meetings. However, even regular live meetings do not solve all of the remote cooperation problems, as they are limited in time and scope, and thus cannot happen often enough, nor include all of the interested parties in the communication.

In particular, one of the problems of online communities that is begging for a new and better technical solution is decision making and dispute resolution. While it is most common that online communities are formed by volunteers, their forms of governance are not necessarily democratic and vary widely, from primitive single-person leadership and meritocracy in less formal technical projects to consensus and majority voting in more complicated situations. Usually, decision making in online volunteer projects is carried out via traditional communication means, such as IRC channels, mailing lists, newsgroups, etc., with rare exceptions such as the Debian project, which employs its own Devotee voting system based on PGP authentication and Condorcet vote counting\cite{debian-constitution}, and the Wikipedia project, which relies on a Wiki collaborative publishing system and enforces consensus among its contributors.

The scale and the level of quality achieved by the latter two projects demonstrate that a formalized collaboration process is as important for volunteer projects as elsewhere: while sufficient to determine rough consensus, traditional communications require participants to come up with informal means of dispute resolution, making the whole process overly dependent on interpersonal attitudes and communicative skills within the group. This is not to say that the Debian or Wikipedia processes are perfect and need not be improved: the strict consensus required by the Wikipedia Editors Policy discourages the dissenting minority from participation, while a full-scale voting system like Debian Devotee cannot be used for every minor day-to-day decision because of the high overhead involved and the limits imposed by the ballot form.

This paper describes how RDF statement approval based on reification can be applied to the problem of online decision making in diverse and politically intensive distributed projects, and proposes a generic semantic model which can be used in a wide range of applications involving online collaboration. The proposed model is implemented in the Samizdat open-publishing and collaboration engine, described later in the paper.

\section{Collaboration Model}

The collaboration model implemented by Samizdat revolves around the concept of \emph{open editing}\cite{opened}, which includes the processes of publishing, structuring, and filtering online content. The ``open'' part of open editing implies that the collaboration process is visible to all participants, and that the roles of readers and editors are available equally to everyone.

\emph{Publishing} involves posting new documents, comments, and revised documents. \emph{Structuring} involves categorization and appraisal of publications and other actions of fellow participants.
The \emph{filtering} process is intended to reduce the information flow to a comprehensible level by presenting the user with resources of the highest quality and relevance. Each of these processes requires a fair amount of decision making to be done, which means that its effectiveness can be greatly improved by automating some aspects of the decision making procedure.

\section{Collective Statement Approval}
%
\subsection{Focus-Centered Site Structure}

In the proposed collaboration model, RDF statements are used as a generic mechanism for structuring site content. While it is possible to make any kind of statement about site resources, the most important kind of statement is the one that relates a resource to a so-called ``focus''\cite{concepts}. A \emph{focus} is a kind of resource that, when related by an RDF statement to other resources, makes it possible to group similar resources together and to evaluate them against different criteria. In some sense, all activities of project members are represented as relations between resources and focuses. Dynamically grouping resources around different focuses allows project members to concentrate on the resources that are most relevant to their areas of interest and provide the best quality. Use of RDF for site structure description makes it possible to store and exchange filters for site resource selection in the form of RDF queries, thus allowing participants to share their preferences and ensuring interoperability with RDF-aware agents.

Since any resource can be used as a focus, project members can define their own focuses and relate focuses to one another. In a sufficiently large and intensive project, this feature should help the site structure evolve in accordance with the usage patterns of different groups of users.

\subsection{RDF Reification}

RDF reification provides a mechanism for describing RDF statements. As defined in ``RDF Semantics''\cite{rdf-mt}, assertion of a reification of an RDF statement means that a document exists containing a triple token instantiating the statement. The reified triple is a resource which can be described in the same way as any other resource. It is important to note that there can be several triple tokens with the same subject, object, and predicate, and, according to RDF reification semantics, such tokens should be treated as separate resources, possibly with different composition or provenance information attached to each.

\subsection{Proposition and Vote}

In the proposed model, all statements are reified, and may be voted upon by project members. To distinguish them, statements with attached votes are called ``propositions''. A \emph{proposition} is a subclass of RDF statement which can be approved or disapproved by votes of project members. Accordingly, a \emph{vote} is a record of a vote cast for or against a particular proposition by a particular member, and a \emph{rating} denotes the approval of the proposition as determined from individual votes.

The exact mechanism of rating calculation can be determined by each site, or even each user, individually, according to the average value of votes cast, the level of trust between the user and particular voters, the absolute number of votes cast, etc. Since individual votes are recorded in RDF and are available for later extraction, a rating can be calculated at any time using any formula that suits the end user best. Some users may choose to share their view of the site resources, and publish their filters in the form of RDF queries.
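For illustration, such a shared filter might look as follows in the Squish query notation quoted later in this paper (a hypothetical example; the {\tt dc::relation} and {\tt s::rating} predicates follow the Samizdat schema described below, and the selection criteria are a matter of each user's preference):

\begin{verbatim}
SELECT ?msg
WHERE (rdf::subject ?stmt ?msg)
      (rdf::predicate ?stmt dc::relation)
      (rdf::object ?stmt ?focus)
      (s::rating ?stmt ?rating)
USING PRESET NS
\end{verbatim}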
The default rating system in Samizdat lets the voter select from the ratings ``$-2$'' (no), ``$-1$'' (not likely), ``$0$'' (uncertain), ``$1$'' (likely), ``$2$'' (yes). The total rating of a proposition is equal to the average value of all votes cast for the proposition; resources with a rating below ``$-1$'' are hidden from view.

\section{Target Applications and Use Cases}
%
\subsection{Open Publishing}

While it is vital for any project to come up with fair and predictable methods of decision making, it is hard to find a more typical example than the Indymedia network, an international open publishing project with the aim of providing the public with an unbiased news source\cite{openpub}. Since the main focus of Indymedia is politics, and since it is explicitly open to everyone, independent media centers are used by people from all parts of the political spectrum, and often become a place of heated debate, or even a target of flood attacks. This conflict between fairness and political bias, as well as the sheer amount of information flowing through the news network, creates a need for a more flexible categorization and filtering system that would take the burden and responsibility of moderation off site administrators.

The issue of developing an open editing system was raised by Indymedia project participants in January 2002, but, to date, implementations of this concept are not ready for production use. The Active2 project\cite{active2}, which has set out to fulfil that role, is still in the alpha stage of development, and, unlike Samizdat, limits its use of RDF to describing its resources with Dublin Core meta-data. Implementation of an open editing system was one of the initial goals of the Samizdat project\cite{oscom3}, and deployment of the Samizdat engine by an independent media center would become a decisive trial of the vitality of the proposed collaboration model in a real-world environment.

\subsection{Documentation Development}

The complexity of modern computer systems makes it impossible to develop and operate them without extensive user and developer manuals which document the intended behaviour of a system and describe solutions to typical user problems. Ultimately, such manuals reflect the collective knowledge about a system, and may require input from many different people with different perspectives. On the other hand, in order to be useful to different people, documentation should be well-structured and easy to navigate.

The most popular solution for collaborative documentation development to date is \emph{Wiki}, a combination of very simple hypertext markup and the ability to edit documents within an HTML form. Such simplicity makes Wiki easy to use, but at the same time limits its applicability to large bodies of documentation. Being limited to basic hypertext without categorization and filtering capabilities, Wiki sites require a huge amount of manual editing done by trusted maintainers in order to keep the site structure from falling behind a growing amount of available information, and to protect it from vandals. Although there are successful examples of large Wiki sites (the most prominent being the Wikipedia project), Wiki does not provide sufficient infrastructure for the development and maintenance of complex technical documentation.
Combining the Wiki approach with RDF metadata, along with implementing the proposed collaborative decision making model for determining documentation structure, would allow significant progress in the adoption of open-source software, which often suffers from a lack of comprehensive and up-to-date documentation.

\subsection{Bug Tracking}

Bug-tracking tools have grown to become an essential component of any software development process. However, despite wide adoption, bug-tracking software has not yet reached maturity: interoperability between different tools is missing; incompatible issue classifications and work flows complicate status synchronization between companies collaborating on a single project; lack of integration with time-management, document management, version control, and other kinds of applications increases the amount of routine work done by the project manager.

On the other hand, the development of integrated project management systems shows that the most important problem in project management automation is the convergence of information from all sources in a single focal point. For such convergence to become possible, a unified process flow model, based on open standards such as RDF, should be adopted across all information sources, from source code version control to developer forums. Since strict provenance tracking is a key requirement for such a model, the proposed reification-based approach may be employed to satisfy it.

\section{Samizdat Engine}
%
\subsection{Project Status}

The Samizdat engine is implemented in the Ruby programming language and relies on the PostgreSQL database management system for RDF storage. Other programs required for Samizdat deployment are the Ruby/Postgres, Ruby/DBI, and YAML4R libraries for Ruby, and the Apache web server with the mod\_ruby module. Samizdat is free software and does not require any non-free software to run\cite{impl-report}.

Samizdat project development started in December 2002; the first public release was announced in June 2003. As of the second beta version 0.5.1, released in March 2004, Samizdat provided a basic set of open publishing functionality, including registering site members, publishing and replying to messages, uploading multimedia messages, voting on relations of site focuses to resources, creating and managing new focuses, and hand-editing or using a GUI for constructing and publishing Squish queries that can be used to search and filter site resources. The next major release, 0.6.0, is expected to add collaborative documentation development functionality.

\subsection{Samizdat Schema}

The core representation of Samizdat content is RDF. Any new resource published on a Samizdat site is automatically assigned a unique numeric ID, which, when appended to the base site URL, forms the resource URIref. This ID may be accessed via the {\tt id} property. The publication time stamp is recorded in the {\tt dc:date} property (here and below, the ``{\tt dc:}'' prefix refers to the Dublin Core namespace):

\begin{verbatim}
:id rdfs:domain rdfs:Resource .

dc:date rdfs:domain rdfs:Resource .
\end{verbatim}

A {\tt Member} is a registered user of a Samizdat site (synonyms: poster, visitor, reader, author, creator). Members can post messages, create focuses, relate messages to focuses, vote on relations, view messages, and use and publish filters based on relations between messages and focuses.

\begin{verbatim}
:Member rdfs:subClassOf rdfs:Resource .

:login rdfs:domain :Member ;
    rdfs:range rdfs:Literal .
\end{verbatim}

Resources are related to focuses with the {\tt dc:relation} property:

\begin{verbatim}
:Focus rdfs:subClassOf rdfs:Resource .

dc:relation rdfs:domain rdfs:Resource ;
    rdfs:range :Focus .
\end{verbatim}

A {\tt Proposition} is an RDF statement with a {\tt rating} property. The value of {\tt rating} is calculated from the {\tt voteRating} values of the individual {\tt Vote} resources attached to the proposition via the {\tt voteProposition} property:

\begin{verbatim}
:Proposition rdfs:subClassOf rdf:Statement .

:rating rdfs:domain :Proposition ;
    rdfs:range rdfs:Literal .

:Vote rdfs:subClassOf rdfs:Resource .

:voteProposition rdfs:domain :Vote ;
    rdfs:range :Proposition .

:voteMember rdfs:domain :Vote ;
    rdfs:range :Member .

:voteRating rdfs:domain :Vote ;
    rdfs:range rdfs:Literal .
\end{verbatim}

Parts of the Samizdat schema that are not relevant to the discussed collective decision making model, such as discussion threads, version control, and aggregate messages, have been omitted. The full Samizdat schema in N3 notation can be found in the Samizdat source code package.

\subsection{RDF Storage Implementation}

To address scalability concerns, Samizdat extends the traditional relational representation of RDF as a table of \{subject, object, predicate\} triples with a unique RDF-to-relational query translation technology. The most frequently used RDF properties of the Samizdat schema are mapped into fields of \emph{internal resource tables} corresponding to resource classes, with the id of the record referencing the {\tt Resource} table; all other properties are recorded as triples in the {\tt Statement} table. A detailed explanation of the RDF-to-relational mapping can be found in the ``Samizdat RDF Storage''\cite{rdf-storage} document.

To demonstrate usage of the Samizdat RDF schema described earlier in this section, the excerpt of Ruby code responsible for individual vote rating assignment is quoted below.

\begin{verbatim}
def rating=(value)
  value = Focus.validate_rating(value)
  if value then
    rdf.assert %{
      UPDATE ?rating = '#{value}'
      WHERE (rdf::subject ?stmt #{resource.id})
            (rdf::predicate ?stmt dc::relation)
            (rdf::object ?stmt #{@id})
            (s::voteProposition ?vote ?stmt)
            (s::voteMember ?vote #{session.id})
            (s::voteRating ?vote ?rating)
      USING PRESET NS}
    @rating = nil   # invalidate rating cache
  end
end
\end{verbatim}

In this attribute assignment method of the {\tt Focus} class, an RDF assertion is recorded in extended Squish syntax and populated with variables storing the rating {\tt value}, the resource identifier {\tt resource.id}, the focus identifier {\tt @id}, and the identifier of the registered member {\tt session.id}. When the Samizdat RDF storage layer updates {\tt Vote.voteRating}, the average value of the corresponding {\tt Proposition.rating} is recalculated by a stored procedure.

\section{Conclusions}

Initially started as an RDF-based open-publishing engine, the Samizdat project opens a new approach to online collaboration in general. The proposed model of collective statement approval via RDF reification is applicable in a wide range of problem domains, including documentation development and bug tracking. Implementation of the proposed model in the Samizdat engine proves the viability of RDF not only as a metadata interchange format, but also as a data model that may be employed by software architects in innovative ways. The key role played by RDF reification in the described model shows that this comparatively obscure part of the RDF standard deserves broader mindshare among Semantic Web developers.
% ---- Bibliography ----
%
\begin{thebibliography}{19}
%
\bibitem {openpub} Arnison, Matthew: Open publishing is the same as free software, 2002\\ http://www.cat.org.au/maffew/cat/openpub.html

\bibitem {concepts} Borodaenko, Dmitry: Samizdat Concepts, December 2002\\ http://savannah.nongnu.org/cgi-bin/viewcvs/samizdat/samizdat/doc/\\ concepts.txt

\bibitem {rdf-storage} Borodaenko, Dmitry: Samizdat RDF Storage, December 2002\\ http://savannah.nongnu.org/cgi-bin/viewcvs/samizdat/samizdat/doc/\\ rdf-storage.txt

\bibitem {oscom3} Borodaenko, Dmitry: Samizdat --- RDF model for an open publishing and cooperation engine. Third International OSCOM Conference, Berkman Center for Internet and Society, Harvard Law School, May 2003\\ http://slideml.bitflux.ch/files/slidesets/503/title.html

\bibitem {impl-report} Borodaenko, Dmitry: Samizdat RDF Implementation Report, September 2003\\ http://lists.w3.org/Archives/Public/www-rdf-interest/2003Sep/0043.html

\bibitem {debian-constitution} Debian Constitution. Debian Project, 1999\\ http://www.debian.org/devel/constitution

\bibitem {rdf-mt} Hayes, Patrick: RDF Semantics. W3C, February 2004\\ http://www.w3.org/TR/rdf-mt

\bibitem {opened} Jay, Dru: Three Proposals for Open Publishing --- Towards a transparent, collaborative editorial framework, 2002\\ http://dru.ca/imc/open\_pub.html

\bibitem {fdnc} Talbott, Stephen L.: The Future Does Not Compute. O'Reilly \& Associates, 1995\\ http://www.oreilly.com/\homedir{}stevet/fdnc/

\bibitem {active2} Warren, Mike: Active2 Design. Indymedia, 2002\\ http://docs.indymedia.org/view/Devel/DesignDocument

\end{thebibliography}
\end{document}

ruby-graffiti-2.2/doc/papers/rdf-to-relational-query-translation-icis2009.tex

\documentclass[conference,letterpaper]{IEEEtran}
\usepackage{graphicx}
%\usepackage{multirow}
%\usepackage{ragged2e}
\usepackage{algpseudocode}
\usepackage[cmex10]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{fancyvrb}
\usepackage{pstricks,pst-node}
\usepackage[pdftitle={On-demand RDF to Relational Query Translation in Samizdat RDF Store},
  pdfauthor={Dmitry Borodaenko},
  pdfkeywords={Semantic Web, RDF, relational databases, query language, Samizdat},
  pdfborder={0 0 0}]{hyperref}
%\urlstyle{rm}
\emergencystretch=8pt
\interdisplaylinepenalty=2500
%
\begin{document}
%
\title{On-demand RDF to Relational Query Translation in Samizdat RDF Store}
%
\author{\IEEEauthorblockN{Dmitry Borodaenko}
\IEEEauthorblockA{Belarusian State University of Informatics and Radioelectronics\\
6 Brovki st., Minsk, Belarus\\
Email: angdraug@debian.org}}

\maketitle  % typeset the title of the contribution

\begin{abstract}
This paper presents an algorithm for on-demand translation of RDF queries
that makes it possible to map any relational data structure to the RDF
model, and to perform queries over a combination of mapped relational data
and arbitrary RDF triples with performance comparable to that of relational
systems. Query capabilities implemented by the algorithm include optional
and negative graph patterns, nested sub-patterns, and limited RDFS and OWL
inference backed by database triggers.
\end{abstract}

\section{Introduction}
\label{introduction}

% motivation for the proposed solution
A wide range of solutions that map relational data to the RDF data model has accumulated to date~\cite{triplify}. There are several factors that make integration of RDF and relational data important for the adoption of the Semantic Web.
One reason, shared with RDF stores based on a triples table, is the wide availability of mature relational database implementations which have seen decades of improvements in reliability, scalability, and performance. The second is the fact that most of the structured data available online is backed by relational databases. This data is not likely to be replaced by pure RDF stores in the near future, so it has to be mapped in one way or another to become available to RDF agents. Finally, a properly normalized and indexed application-specific relational database schema allows a DBMS to optimize complex queries in ways that are not possible for a tree of joins over a single triples table~\cite{sp2b}.

% what is unique about the proposed solution
In the Samizdat open publishing engine, most of the data fits into the relational model, with the exception of reified RDF statements which are used in the collaborative decision making process~\cite{samizdat-collreif} and require a more generic triple store. The need for a generic RDF store with performance on par with a relational database is the primary motivation behind the design of the Samizdat RDF storage module, which is different from both triples table based RDF stores and relational to RDF mapping systems. Unlike the former, Samizdat can run optimized SQL queries over application-specific tables, but unlike the latter, it is not limited by the relational database schema and can fall back, within the same query, to a triples table for RDF predicates that are not mapped to the relational model.

% structure of the paper
The following sections of this paper describe: targeted relational data, database triggers required for RDFS and OWL inference, the query translation algorithm, the update request execution algorithm, details of the algorithm's implementation in Samizdat, an analysis of its performance, a comparison with related work, and an outline of future work.

\section{Relational Data}
\label{relational-data}

% formal definition of data targeted for storage
The Samizdat RDF storage module does not impose additional restrictions on the underlying relational database schema beyond the requirements of the SQL standard. Any legacy database may be adapted for RDF access while retaining backwards compatibility with existing SQL queries. The adaptation process involves adding attributes, foreign keys, tables, and triggers to the database to enable RDF query translation and to support optional features of the Samizdat RDF store, such as statement reification and inference for {\em rdfs:sub\-Class\-Of}, {\em rdfs:sub\-Property\-Of}, and {\em owl:Transitive\-Property\/} rules.

The following database schema changes are required in all cases:
\begin{itemize}
\item create an {\em rdfs:Resource\/} superclass table with an autogenerated primary key;
\item replace primary keys of mapped subclass tables with foreign keys referencing the {\em rdfs:Resource\/} table (existing foreign keys may need to be updated to reflect this change);
\item register {\em rdfs:subClassOf\/} inference database triggers to update the Resource table and maintain foreign key integrity on all changes in mapped subclass tables.
\end{itemize}

The following changes may be necessary to support optional RDF mapping features:
\begin{itemize}
\item register database triggers for other cases of {\em rdfs:sub\-Class\-Of\/} entailment;
\item create a triples table (required to represent non-relational RDF data and RDF statement reification);
\item add subproperty qualifier attributes referencing the property URIref entry in the {\em rdfs:Resource\/} table for each attribute mapped to a superproperty;
\item create transitive closure tables, register {\em owl:TransitivePro\-perty\/} inference triggers.
\end{itemize}

\section{Inference and Database Triggers}
\label{inference-triggers}

The Samizdat RDF storage module implements entailment rules for the following RDFS predicates and OWL classes: {\em rdfs:sub\-Class\-Of}, {\em rdfs:sub\-Property\-Of}, {\em owl:Transitive\-Property}. Database triggers are used to minimize the impact of RDFS and OWL inference on query performance:

{\em rdfs:subClassOf\/} inference triggers are invoked on every insert into and delete from a subclass table. When a tuple without a primary key is inserted,\footnote{Insertion into a subclass table with an explicit primary key is used in two-step resource insertion during execution of an RDF update command (described in section~\ref{update-execution}).} a template tuple is inserted into the superclass table and the produced primary key is added to the new subclass tuple. The delete operation is cascaded to all subclass and superclass tables.

{\em rdfs:subPropertyOf\/} inference is performed during query translation, with the help of a stored procedure that returns the attribute value when the subproperty qualifier attribute is set, and NULL otherwise.

{\em owl:TransitiveProperty\/} inference uses a separate transitive closure table for each relational attribute mapped to a transitive property. Transitive closure tables are maintained by triggers invoked on each insert, update, and delete operation involving such an attribute. The transitive closure update algorithm is presented in \figurename~\ref{transitive-closure}. The input to the algorithm is:
\begin{itemize}
\item directed labeled graph $G = \langle N, A \rangle$ where $N$ is a set of nodes representing RDF resources and $A$ is a set of arcs $a = \langle s, p, o \rangle$ representing RDF triples;
\item transitive property $\tau$;
\item subgraph $G_\tau \subseteq G$ such that:
\begin{equation}
a_\tau = \langle s, p, o \rangle \in G_\tau \iff a_\tau \in G \, \wedge \, p = \tau \, ;
\end{equation}
\item graph $G_\tau^+$ containing transitive closure of $G_\tau$;
\item update operation $\omega \in \{insert, update, delete\}$ and its parameters $a_{old} = \langle s_\omega, \tau, o_{old} \rangle$, $a_{new} = \langle s_\omega, \tau, o_{new} \rangle$ such that:
\begin{equation}
G_\tau' = (G_\tau \setminus \{ a_{old} \}) \cup \{ a_{new} \} \, .
\end{equation}
\end{itemize}
The algorithm transforms $G_\tau^+$ into the transitive closure of $G_\tau'$. It assumes that $G_\tau$ is, and should remain, acyclic.
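To illustrate the algorithm on a small example, suppose $G_\tau = \{ \langle a, \tau, b \rangle, \langle b, \tau, c \rangle \}$, so that $G_\tau^+ = G_\tau \cup \{ \langle a, \tau, c \rangle \}$, and let $\omega = insert$ with $a_{new} = \langle c, \tau, d \rangle$. The cycle check passes, since $d \not= c$ and $\langle d, \tau, c \rangle \not\in G_\tau^+$. The insert branch first adds $\langle c, \tau, d \rangle$ (there are no arcs leading out of $d$), and then, since $\langle a, \tau, c \rangle$ and $\langle b, \tau, c \rangle$ are already in $G_\tau^+$, adds $\langle a, \tau, d \rangle$ and $\langle b, \tau, d \rangle$, which yields exactly the transitive closure of the updated graph.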
\begin{figure} \begin{algorithmic}[1] \If {$o_{new} = s_\omega$ or $\langle o_{new}, \tau, s_\omega \rangle \in G_\tau^+$} \State stop \Comment refuse to create a cycle in $G_\tau$ \EndIf \State $G_\tau \gets G_\tau'$ \Comment apply $\omega$ \If {$\omega \in \{update, delete\}$} \State $G_\tau^+ \gets G_\tau^+ \setminus \{ \langle s, \tau, o \rangle \mid (s = s_\omega \, \vee \, \langle s, \tau, s_\omega \rangle \in G_\tau^+) \, \wedge \, \langle s_\omega, \tau, o \rangle \in G_\tau^+ \}$ \Comment remove obsolete arcs from $G_\tau^+$ \EndIf \If {$\omega \in \{insert, update\}$} \Comment add new arcs to $G_\tau^+$ \State $G_\tau^+ \gets G_\tau^+ \cup \{ \langle s_\omega, \tau, o \rangle \mid o = o_{new} \, \vee \, \langle o_{new}, \tau, o \rangle \in G_\tau^+ \}$ \State $G_\tau^+ \gets G_\tau^+ \cup \{ \langle s, \tau, o \rangle \mid \langle s, \tau, s_\omega \rangle \in G_\tau^+ \, \wedge \, \langle s_\omega, \tau, o \rangle \in G_\tau^+ \}$ \EndIf \end{algorithmic} \caption{Update transitive closure} \label{transitive-closure} \end{figure} \section{Query Pattern Translation} \label{query-translation} Class structure of the Samizdat RDF storage module is as follows. External API is provided by the {\tt RDF} class. RDF storage configuration as described in section~\ref{relational-data} is encapsulated in {\tt RDFConfig} class. The concrete syntax of Squish~\cite{samizdat-rel-rdf,squish} and SQL is abstracted into {\tt SquishQuery} and its subclasses. The query pattern translation algorithm is implemented by the {\tt SqlMapper} class. % prerequisites The input to the algorithm is as follows: \begin{itemize} \item mappings $M = \langle M_{rel}, M_{attr}, M_{sub}, M_{trans} \rangle$ where $M_{rel}: P \to R$, $M_{attr}: P \to \Phi$, $M_{sub}: P \to S$, $M_{trans} \to T$; $P$ is a set of mapped RDF properties, $R$ is a set of relations, $\Phi$ is a set of relation attributes, $S \subset P$ is a subset of RDF properties that have configured subproperties, $T \subset R$ is a set of transitive closures (as described in sections~\ref{relational-data} and \ref{inference-triggers}); \item graph pattern $\Psi = \langle \Psi_{nodes}, \Psi_{arcs} \rangle = \Pi \cup N \cup \Omega$, where $\Pi$, $N$, and $\Omega$ are main ("must bind"), negative ("must not bind"), and optional ("may bind") graph patterns respectively, such that $\Pi$, $N$, and $\Omega$ share no arcs, and $\Pi$, $\Pi \cup N$ and $\Pi \cup \Omega$ are joint graphs.\footnote{Arcs with the same subject, object, and predicate but different bind mode are treated as distinct.} \item global filter condition $F_g \in F$ and local filter conditions $F_c: \Psi_{arcs} \to F$ where $F$ is a set of all literal conditions expressible in the query language syntax. \end{itemize} For example, consider the following Squish query and its graph pattern $\Psi$ presented in \figurename~\ref{graph-pattern}. 
\begin{Verbatim}[fontsize=\scriptsize] SELECT ?msg WHERE (rdf::predicate ?stmt dc::relation) (rdf::subject ?stmt ?msg) (rdf::object ?stmt ?tag) (dc::date ?stmt ?date) (s::rating ?stmt ?rating FILTER ?rating >= :threshold) EXCEPT (dct::isPartOf ?msg ?parent) OPTIONAL (dc::language ?msg ?original_lang) (s::isTranslationOf ?msg ?translation) (dc::language ?translation ?translation_lang) LITERAL ?original_lang = :lang OR ?translation_lang = :lang GROUP BY ?msg ORDER BY max(?date) DESC \end{Verbatim} \begin{figure} \centering \psset{unit=3.8mm,labelsep=0.2pt} \begin{pspicture}[showgrid=false](0,0)(23,12) \footnotesize \rput(2.5,5.5){\ovalnode{msg}{\sl ?msg}} \rput(10,8){\ovalnode{stmt}{\sl ?stmt}} \rput(2.5,8){\ovalnode{rel}{\it dc:relation}} \rput(5,10.5){\ovalnode{tag}{\sl ?tag}} \rput(14,10.5){\ovalnode{date}{\sl ?date}} \rput(17,8){\ovalnode{rating}{\sl ?rating}} \rput(14,5.5){\ovalnode{parent}{\sl ?parent}} \rput(8,1){\ovalnode{origlang}{\sl ?original\_lang}} \rput(11.2,3.3){\ovalnode{trans}{\sl ?translation}} \rput(19.2,1){\ovalnode{translang}{\sl ?translation\_lang}} \ncline{<-}{msg}{stmt} \aput{:U}(0.4){\it rdf:subject} \ncline{<-}{rel}{stmt} \aput{:U}{\it rdf:predicate} \ncline{<-}{tag}{stmt} \aput{:U}{\it rdf:object} \ncline{->}{stmt}{date} \aput{:U}{\it dc:date} \ncline{->}{stmt}{rating} \aput{:U}{\it s:rating} \ncline{->}{msg}{parent} \aput{:U}(0.6){\it dct:isPartOf} \ncline{->}{msg}{origlang} \aput{:U}(0.6){\it dc:language} \ncline{<-}{msg}{trans} \aput{:U}(0.65){\it s:isTranslationOf} \ncline{->}{trans}{translang} \aput{:U}(0.6){\it dc:language} \psccurve[curvature=0.75 0.1 0,linestyle=dashed,showpoints=false]% (0.3,5)(0.3,10)(3,11.3)(20,9.5)(20,7)(8.5,7)(2.5,4.5) \rput(18.8,10){$\Pi$} \rput(16.5,5.5){$N$} \rput(12.5,1.5){$\Omega$} \end{pspicture} \caption{Graph pattern $\Psi$ for the example query} \label{graph-pattern} \end{figure} The output of the algorithm is a join expression $F$ and condition $W$ ready for composition into {\tt FROM} and {\tt WHERE} clauses of an SQL {\tt SELECT} statement. In the algorithm description below, $\mathrm{id}(r)$ is used to denote primary key of relation $r \in R$, and $\rho(n)$ is used to denote value of $\mathrm{id}(Resource)$ for non-variable node $n \in \Psi_{nodes}$ where such value is known during query translation.\footnote{E.g. Samizdat uses {\em site-ns/resource-id} notation for internal resource URIrefs.} % the algorithm Key steps of the query pattern translation algorithm correspond to the following private methods of {\tt SqlMapper}: {\tt label\_pattern\_components}: Label every connected component of $\Pi$, $N$, and $\Omega$ with different colors $K$ such that $K_\Pi: \Pi_{nodes} \to \mathbb{K}, K_N: N_{nodes} \to \mathbb{K}, K_\Omega: \Omega_{nodes} \to \mathbb{K}, K(n) = K_\Pi(n) \cup K_N(n) \cup K_\Omega(n)$. The Two-pass Connected Component Labeling algorithm~\cite{shapiro} is used with a special case to exclude nodes present in $\Pi$ from neighbour lists while labeling $N$ and $\Omega$. The special case ensures that parts of $N$ and $\Omega$ which are only connected through a node in $\Pi$ are labeled with different colors. 
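As an illustration of the special case, consider the following Ruby sketch (not the actual {\tt SqlMapper} code; for brevity it substitutes union-find for the cited two-pass labeling algorithm, and the variable names are hypothetical). Because nodes of $\Pi$ do not participate in the union step, subpatterns of $N$ or $\Omega$ that are connected only through a node of $\Pi$ receive distinct colors:

\begin{Verbatim}[fontsize=\scriptsize]
# Group the nodes of `arcs' (an array of [subject, object] pairs)
# into connected components, ignoring arcs incident to nodes listed
# in `exclude'.
def label_components(arcs, exclude = [])
  parent = {}
  find = lambda do |x|            # union-find root lookup
    parent[x] ||= x
    x = parent[x] while parent[x] != x
    x
  end
  arcs.each do |s, o|
    [s, o].each {|x| find.call(x) unless exclude.include?(x) }
    next if exclude.include?(s) or exclude.include?(o)
    parent[find.call(s)] = find.call(o)   # merge the two components
  end
  parent.keys.group_by {|n| find.call(n) }.values
end

pi_components = label_components(pi_arcs)
n_components  = label_components(n_arcs, pi_components.flatten)
\end{Verbatim}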
{\tt map\_predicates}: Map each arc $c = \langle s, p, o \rangle \in \Psi_{arcs}$ to the relational data model according to $M$: define mapping $M_{attr}^{pos}: \Psi_{arcs} \times \Psi_{nodes} \to \Phi$ such that $M_{attr}^{pos}(c, s) = \mathrm{id}( M_{rel}(p) ), M_{attr}^{pos}(c, o) = M_{attr}(p)$; replace each unmapped arc with its reification and map the resulting arcs in the same manner;\footnote{$M$ is expected to map reification properties to the triples table.} for each arc labeled with a subproperty predicate, add an arc mapped to the subproperty qualifier attribute. For each node $n \in \Psi_{nodes}$, find adjacent arcs $\Psi_{nodes}^n = \{\langle s, p, o \rangle \mid n \in \{s, o\}\}$ and determine its binding mode $\beta_{node}: \Psi_{nodes} \to \{ \Pi, N, \Omega \}$ such that $\beta_{node}(n) = max(\beta_{arc}(c) \, \forall c \in \Psi_{nodes}^n)$ where $\beta_{arc}(c)$ reflects which of the graph patterns $\{ \Pi, N, \Omega \}$ contains arc $c$, and the order of precedence used by $max$ is $\Pi > N > \Omega$. {\tt define\_relation\_aliases}: Map each node in $\Psi$ to one or more relation aliases $a \in \mathbb{A}$ according to the algorithm described in \figurename~\ref{define-relation-aliases}. The algorithm produces mapping $C_a: \Psi_{arcs} \to \mathbb{A}$ which links every arc in $\Psi$ to an alias, and mappings $A = \langle A_{rel}, A_{node}, A_\beta, A_{filter} \rangle$ where $A_{rel}: \mathbb{A} \to R$, $A_{node}: \mathbb{A} \to \Psi_{nodes}$, $A_\beta: \mathbb{A} \to \{ \Pi, N, \Omega \}$, $A_{filter}: \mathbb{A} \to F)$ which record relation, node, bind mode, and a filter condition for each alias. \begin{figure} \begin{algorithmic}[1] \ForAll {$n \in \Psi_{nodes}$} \ForAll {$c = \langle s, p, o \rangle \in \Psi_{arcs} \mid s = n \, \wedge \, C_a(c) = \emptyset$} \If {$\exists c' = \langle s', p', o' \rangle \mid n \in \{s', o'\} \, \wedge \, C_a(c') \not= \emptyset \, \wedge \, M_{rel}(p') = M_{rel}(p)$} \State $C_a(c) \gets C_a(c')$ \Comment Reuse the alias assigned to an arc adjacent to $n$ and mapped to the same relation \Else \Comment Create new alias \State $a = max(\mathbb{A}) + 1$; $\mathbb{A} \gets \mathbb{A} \cup \{ a\}$; $C_a(c) \gets a$ \State $A_{node}(a) \gets n$, $A_{filter}(a) \gets \emptyset$ \If {$M_{trans}(p) = \emptyset$} \Comment Use base relation \State $A_{rel}(a) \gets M_{rel}(p)$ \State $A_\beta(a) \gets \beta_{node}(n)$ \Else \Comment Use transitive closure \State $A_{rel}(a) \gets M_{trans}(p)$ \State $A_\beta(a) \gets \beta_{arc}(c)$ \State \Comment Use arc's bind mode instead of node's \EndIf \EndIf \EndFor \EndFor \ForAll {$c \in \Psi_{arcs}$} \State $A_{filter}( C_a(c) ) \gets A_{filter}( C_a(c) ) \cup F_c(c)$ \State \Comment Add arc filter to the linked alias filters \EndFor \end{algorithmic} \caption{Define relation aliases} \label{define-relation-aliases} \end{figure} {\tt transform}: Define bindings $B: \Psi_{nodes} \to \mathbb{B}$ where $\mathbb{B} = \{\{ \langle a, f \rangle \mid a \in \mathbb{A}, f \in \Phi \}\}$ of graph pattern nodes to sets of pairs of relation aliases and attributes, such that \begin{equation} \begin{split} \langle a, f \rangle \in B(n) \iff &\exists c \in \Psi_{arcs}^n \\ &C_a(c) = a, M_{attr}^{pos}(c, n) = f \, . 
\end{split} \end{equation} Transform graph pattern $\Psi$ into relational query graph $Q = \langle \mathbb{A}, J \rangle$ where nodes $\mathbb{A}$ are relation aliases defined earlier and edges $J = \{ \langle b_1, b_2, n \rangle \mid b_1 = \langle a_1, f_1 \rangle \in B(n), b_2 = \langle a_2, f_2 \rangle \in B(n), a_1 \not= a_2 \}$ are join conditions. Ground non-variable nodes according to the algorithm defined in \figurename~\ref{ground-non-variable-nodes}. Record list of grounded nodes $G \subseteq \Psi_{nodes}$ such that \begin{equation} \begin{split} n \in G \iff &n \in F_g \,\vee\, \exists \langle b_1, b_2, n \rangle \in J \\ &\vee\, \exists b \in B(n) \, \exists a \in \mathbb{A} \: b \in A_{filter}(a) \, . \end{split} \end{equation} \begin{figure} \begin{algorithmic}[1] \State $\exists b = \langle a, f \rangle \in B(n)$ \Comment Take any binding of $n$ \If {$n$ is an internal resource and $\rho(n) = i$} \State $A_{filter}(a) \gets A_{filter}(a) \cup (b = i)$ \ElsIf {$n$ is a query parameter or a literal} \State $A_{filter}(a) \gets A_{filter}(a) \cup (b = n)$ \ElsIf {$n$ is a URIref} \Comment Add a join to a URIref tuple in Resource relation \State $\mathbb{A} \gets \mathbb{A} \cup \{ a_r \}$; $A_{node}(a_r) = n$; $A_{rel}(a_r) = Resource$; $A_\beta(a_r) = \beta_{node}(n)$ \State $B(n) \gets B(n) \cup \langle a_r, \mathrm{id}(Resource) \rangle; J \gets J \cup \{ \langle b, \langle a_r, \mathrm{id}(Resource) \rangle, n \rangle \}$ \State $A_{filter}(a_r) = A_{filter}(a_r) \cup ( \langle a_r, literal \rangle = f \wedge \langle a_r, uriref \rangle = t \wedge \langle a_r, label \rangle = n )$ \EndIf \end{algorithmic} \caption{Ground non-variable nodes} \label{ground-non-variable-nodes} \end{figure} Transformation of the example query presented above will result in a relational query graph in \figurename~\ref{join-graph}. 
\begin{figure} \centering \psset{unit=3.8mm,labelsep=0.2pt} \begin{pspicture}[showgrid=false](0,0)(23,13) \footnotesize \rput(1,6){\circlenode{b}{\vphantom{Ij}b}} \rput(6.7,6){\circlenode{a}{\vphantom{Ij}a}} \rput(12.8,6){\circlenode{c}{\vphantom{Ij}c}} \rput(2,11){\circlenode{d}{\vphantom{Ij}d}} \rput(1,1){\circlenode{g}{\vphantom{Ij}g}} \rput(22,11){\circlenode{f}{\vphantom{Ij}f}} \rput(20,1){\circlenode{e}{\vphantom{Ij}e}} \ncline{-}{b}{a} \aput{:U}(0.4){a.id = b.id} \bput{:U}(0.35){?stmt} \ncline{-}{a}{c} \aput{:U}{a.subject = c.id} \bput{:U}{?msg} \ncline{-}{d}{a} \aput{:U}{a.subject = d.id} \bput{:U}(0.4){?msg} \ncline{-}{g}{a} \aput{:U}(0.43){a.predicate = g.id} \bput{:U}{\it dc:relation} \ncline{-}{c}{f} \aput{:U}{c.part\_of\_subproperty = f.id} \bput{:U}{\it s:isTranslationOf} \ncline{-}{c}{e} \aput{:U}{c.part\_of = e.id} \bput{:U}{?translation} \pspolygon[linestyle=dashed,linearc=0.8](0.1,0.1)(0.1,11.9)(14.5,11.9)(14.5,0.1) \rput(13.8,1){$P_1$} \end{pspicture} \caption{Relational query graph $Q$ for the example query} \label{join-graph} \end{figure} {\tt generate\_tables\_and\_conditions}: Produce ordered connected minimum edge-disjoint tree cover $P$ for relational query graph $Q$ such that $\forall P_i \in P$ \, $\forall j = \langle b_{j1}, b_{j2}, n_j \rangle \in P_i$ \, $\forall k = \langle b_{k1}, b_{k2}, n_k \rangle \in P_i$: \begin{gather} K(n_j) \cap K(n_k) \not= \emptyset \, , \\ \beta_{node}(n_j) = \beta_{node}(n_k) = \beta_{tree}(P_i) \, , \end{gather} starting with $P_1$ such that $\beta_{tree}(P_1) = \Pi$ (it follows from definitions of $\Psi$ and {\tt transform} that $P_1$ is the only such tree and covers all join conditions $\langle b_1, b_2, n \rangle \in J$ such that $\beta_{node}(n) = \Pi$). Encode $P_1$ as the root inner join. Encode other trees with at least one edge as subqueries. Left join subqueries and aliases representing roots of zero-length trees into join expression $F$. For each $P_i$ such that $\beta_{tree}(P_i) = N$, find a binding $b = \langle a, f \rangle \in P_i$ such that $a \in P_1 \cap P_i$ and add ($b$ {\tt IS NULL}) condition to $W$. For each non-grounded node $n \not\in G$ such that $\langle a, f \rangle \in B(n) \, \wedge \, a \in P_1$, add ($b$ {\tt IS NOT NULL}) condition to $W$ if $\beta_{node}(n) = \Pi$, or ($b$ {\tt IS NULL}) condition if $\beta_{node}(n) = N$. Add $F_g$ to $W$. Translation of the example query presented earlier will result in the following SQL: \begin{Verbatim}[fontsize=\scriptsize] SELECT DISTINCT a.subject, max(b.published_date) FROM Statement AS a INNER JOIN Resource AS b ON (a.id = b.id) INNER JOIN Resource AS c ON (a.subject = c.id) INNER JOIN Message AS d ON (a.subject = d.id) INNER JOIN Resource AS g ON (a.predicate = g.id) AND (g.literal = 'false' AND g.uriref = 'true' AND g.label = 'http://purl.org/dc/elements/1.1/relation') LEFT JOIN ( SELECT e.language AS _field_b, c.id AS _field_a FROM Message AS e INNER JOIN Resource AS f ON (f.literal = 'false' AND f.uriref = 'true' AND f.label = 'http://www.nongnu.org/samizdat/rdf/schema#isTranslationOf') INNER JOIN Resource AS c ON (c.part_of_subproperty = f.id) AND (c.part_of = e.id) ) AS _subquery_a ON (c.id = _subquery_a._field_a) WHERE (b.published_date IS NOT NULL) AND (a.object IS NOT NULL) AND (a.rating IS NOT NULL) AND (c.part_of IS NULL) AND (a.rating >= ?) AND (d.language = ? OR _subquery_a._field_b = ?) 
GROUP BY a.subject ORDER BY max(b.published_date) DESC \end{Verbatim} \section{Update Command Execution} \label{update-execution} Update command uses the same graph pattern structure as a query, and additionally defines a set $\Delta \subset \Psi_{nodes}$ of variables representing new RDF resources and a mapping $U: \Psi_{nodes} \to \mathbb{L}$ of variables to literal values. Execution of an update command starts with query pattern translation using the algorithm described in section~\ref{query-translation}. The variables $\Psi$, $A$, $Q$, etc. produced by pattern translation are used in the subsequent stages as described below: \begin{enumerate} % node values \item Construct node values mapping $V: \Psi_{nodes} \to \mathbb{L}$ using the algorithm defined in \figurename~\ref{node-values}. Record resources inserted into the database during this stage in $\Delta_{new} \subset \Psi_{nodes}$ (it follows from the algorithm definition that $\Delta \subseteq \Delta_{new}$). \begin{figure} \begin{algorithmic}[1] \ForAll {$n \in \Psi_{nodes}$} \If {$n$ is an internal resource and $\rho(n) = i$} \State $V(n) \gets i$ \ElsIf {$n$ is a query parameter or a literal} \State $V(n) \gets n$ \ElsIf {$n$ is a variable} \If {$\nexists c = \langle n, p, o \rangle \in \Psi_{arcs}$} \State \Comment If found only in object position \State $V(n) \gets U(n)$ \Else \If {$n \not\in \Delta$} \State $V(n) \gets \mathrm{SquishSelect}(n, \Psi^{n*})$ \EndIf \If {$V(n) = \emptyset$} \State Insert $n$ into $Resource$ relation \State $V(n) \gets \rho(n)$ \State $\Delta_{new} \gets \Delta_{new} \cup n$ \EndIf \EndIf \ElsIf {$n$ is a URIref} \State Select $n$ from $Resource$ relation, insert if missing \State $V(n) \gets \rho(n)$ \EndIf \EndFor \end{algorithmic} \caption{Determine node values. $\Psi^{n*}$ is a subgraph of $\Psi$ reachable from $n$. $\mathrm{SquishSelect}(n, \Psi)$ finds a mapping of variable $n$ that satisfies pattern $\Psi$.} \label{node-values} \end{figure} % data assignment \item For each alias $a \in \mathbb{A}$, find a subset of graph pattern $\Psi_{arcs}^a \subseteq \Psi_{arcs}$ such that $c \in \Psi_{arcs}^a \iff C_a(c) = a$, select a key node $k$ such that $\exists c = \langle k, p, o \rangle \in \Psi_{arcs}^a$, and collect a map $D_a: \Phi \to \mathbb{L}$ of fields to values such that $\forall c = \langle s, p, o \rangle \in \Psi_{arcs}^a \; \exists D_a(o) = V(o)$. If $k \in \Delta_{new}$ and $A_{rel}(a) \not= Resource$, transform $D_a$ into an SQL {\tt INSERT} into $A_{rel}(a)$ with explicit primary key assignment $\mathrm{id}_k(A_{rel}(a)) \gets V(k)$. Otherwise, transform $D_a$ into an {\tt UPDATE} statement on the tuple in $A_{rel}(a)$ for which $\mathrm{id}_k(A_{rel}(a)) = V(k)$. % iterative assertions \item Execute the SQL statements produced in the previous stage inside the same transaction in the order that resolves their mutual references. \end{enumerate} \section{Implementation} The algorithms described in previous sections are implemented by the Samizdat RDF storage module, which is used as the primary means of data access in the Samizdat open publishing system. The module is written in Ruby programming language, supported by several triggers written in procedural SQL. The module and the whole Samizdat engine are available under GNU General Public License. Samizdat exposes all RDF resources underpinning the structure and content of the site. HTTP request with a URL of any internal resource yields a page with detailed information about the resource and its relation with other resources. 
Furthermore, Samizdat provides a graphical interface that allows users to compose arbitrary Squish queries.\footnote{The complexity of user queries is limited to a configurable maximum number of triples in the graph pattern to prevent abuse.} Queries may be published so that other users may modify and reuse them; the results of a query may be accessed either as plain HTML or as an RSS feed.

\section{Evaluation of Results}
\label{evaluation}

%\enlargethispage{-1ex}
Samizdat performance was measured using the Berlin SPARQL Benchmark (BSBM)~\cite{bsbm}, with the following variations: a functional equivalent of the BSBM test driver was implemented in Ruby and Squish (instead of Java and SPARQL); the test platform included an Intel Core 2 Duo (instead of Quad) clocked at the same frequency, and 2GB of memory (instead of 8GB). In this environment, Samizdat was able to process 25287 complete query mixes per hour (QMpH) on a dataset with 1M triples, and achieved 18735 QMpH with 25M triples, in both cases exceeding the figures for all RDF stores reported in~\cite{bsbm}.

In production, Samizdat was able to serve without congestion peak loads of up to 5K hits per hour for a site with a dataset sized at 100K triples in a shared VPS environment. Regeneration of the site frontpage on the same dataset executes 997 Squish queries and completes in 7.7s, which is comparable to RDBMS-backed content management systems.

\section{Comparison with Related Work}
\label{related-work}

As mentioned in section~\ref{introduction}, there exists a wide range of solutions for relational to RDF mapping. Besides Samizdat, the approach based on automatic on-demand translation of RDF queries into SQL is also implemented by Federate~\cite{federate}, D2RQ~\cite{d2rq}, and Virtuoso~\cite{virtuoso}.

While being one of the first solutions to provide on-demand relational to RDF mapping, Samizdat remains one of the most advanced in terms of query capabilities. Its single largest drawback is its lack of compatibility with SPARQL; at the same time, in some regards it exceeds the capabilities of other solutions.

The alternative that is closest to Samizdat in terms of query capabilities is Virtuoso RDF Views: it is the only other relational-to-RDF mapping solution that provides partial RDFS and OWL inference, aggregation, and an update language. Still, there are substantial differences between these two projects. First of all, the Samizdat RDF store is a small module (1000 lines of Ruby and 200 lines of SQL) that can be used with a variety of RDBMSes, while Virtuoso RDF Views is tied to its own RDBMS. Virtuoso doesn't support implicit statement reification, although its design is compatible with this feature. Finally, Virtuoso relies on SQL unions for queries with unspecified predicates and for RDFS and OWL inference. While allowing for greater flexibility than the database triggers described in section~\ref{inference-triggers}, the iterative union operation has a considerable impact on query performance.

\section{Future Work}
\label{future-work}

Since the SPARQL Recommendation was published by W3C~\cite{sparql}, SPARQL support has been at the top of the Samizdat RDF store to-do list. SPARQL syntax is considerably more expressive than Squish and will require some effort to implement in Samizdat, but, since the design of the implementation separates the syntactic layer from the query translation logic, the same algorithms as described in this paper can be used to translate SPARQL patterns to SQL with minimal changes.
Most substantial changes are expected to be required for the explicit grouping of optional graph patterns and the associated filter scope issues~\cite{cyganiak}. Samizdat RDF store should be made more adaptable to a wider variety of problem domains. Query translation algorithm should be augmented to translate an ambiguously mapped query (including queries with unspecified predicates) to a union of alternative interpretations. Mapping of relational schema should be generalized, including support for multi-part keys and more generic stored procedures for reification and inference. Standard RDB2RDF mapping should be implemented when W3C publishes a specification to that end. \section{Conclusions} The on-demand RDF to relational query translation algorithm described in this paper utilizes existing relational databases to their full potential, including indexing, transactions, and procedural SQL, to provide efficient access to RDF data. Implementation of this algorithm in Samizdat RDF storage module has been tried in production environment and demonstrated how Semantic Web technologies can be introduced into an application serving thousands of users without imposing additional requirements on hardware resources. \vspace{1ex} % ---- Bibliography ---- % \begin{thebibliography}{19} %\bibitem {expressive-power-of-sparql} %Anglez, R., Gutierrez, C.: %The Expressive Power of SPARQL. In: A. Sheth et al. (Eds.) ISWC 2008. %LNCS, vol. 5318, pp. 82-97. Springer, Heidelberg (2008)\\ %\url{http://www.dcc.uchile.cl/~cgutierr/papers/expPowSPARQL.pdf} \bibitem {triplify} Auer, S., Dietzold, S. Lehman, J., Hellmann, S., Aumueller, D.: Triplify -- Light-Weight Linked Data Publication from Relational Databases. WWW 2009, Madrid, Spain (2009)\\ \url{http://www.informatik.uni-leipzig.de/~auer/publication/triplify.pdf} %\bibitem {swad-storage} %Beckett, Dave: %Semantic Web Scalability and Storage: Survey of Free Software / Open %Source RDF storage systems. SWAD-Europe Deliverable 10.1 (2001)\\ %\url{http://www.w3.org/2001/sw/Europe/reports/rdf\_scalable\_storage\_report} %\bibitem {swad-rdbms-mapping} %Beckett, D., Grant, J.: %Semantic Web Scalability and Storage: Mapping Semantic Web Data with %RDBMSes, SWAD-Europe Deliverable 10.2 (2001)\\ %\url{http://www.w3.org/2001/sw/Europe/reports/scalable\_rdbms\_mapping\_report} %\bibitem {cwm} %Berners-Lee, T., Kolovski, V., Connolly, D., Hendler, J. Scharf, Y.: %A Reasoner for the Web. Theory and Practice of Logic Programming (TPLP), %special issue on Logic Programming and the Web (2000)\\ %\url{http://www.w3.org/2000/10/swap/doc/paper/} \bibitem {bsbm} Bizer, C., Schultz, A.: The Berlin SPARQL Benchmark. International Journal On Semantic Web and Information Systems (IJSWIS), Volume 5, Issue 2 (2009)\\ \url{http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/} \bibitem {d2rq} Bizer, C., Seaborne, A.: D2RQ - Treating non-RDF databases as virtual RDF graphs. In: ISWC 2004 (posters)\\ \url{http://www.wiwiss.fu-berlin.de/bizer/D2RQ/spec/} %\bibitem {samizdat-euruko} %Borodaenko, Dmitry: %RDF storage for Ruby: the case of Samizdat. EuRuKo 2003, Karlsruhe (June %2003)\\ %\url{http://samizdat.nongnu.org/slides/euruko2003\_samizdat.html} %\bibitem {samizdat-impl-report} %Borodaenko, Dmitry: %Samizdat RDF Implementation Report. 
%RDF Interest ML (September 2003)\\
%\url{http://lists.w3.org/Archives/Public/www-rdf-interest/2003Sep/0043.html}

\bibitem {samizdat-rel-rdf}
Borodaenko, D.:
Accessing Relational Data with RDF Queries and Assertions (April 2004)\\
\url{http://samizdat.nongnu.org/papers/rel-rdf.pdf}

\bibitem {samizdat-collreif}
Borodaenko, D.:
Model for Collaborative Decision Making Based on RDF Reification (April
2004)\\
\url{http://samizdat.nongnu.org/papers/collreif.pdf}

\bibitem {cyganiak}
Cyganiak, R.:
A relational algebra for SPARQL. Technical Report HPL-2005-170, HP Labs
(2005)\\
\url{http://www.hpl.hp.com/techreports/2005/HPL-2005-170.html}

\bibitem {virtuoso}
Erling, O., Mikhailov, I.:
RDF support in the Virtuoso DBMS. In: Proceedings of the 1st Conference
on Social Semantic Web, volume P-113 of GI-Edition -- Lecture Notes in
Informatics (LNI), ISSN 1617-5468. Bonner K\"{o}llen Verlag (2007)\\
\url{http://virtuoso.openlinksw.com/dav/wiki/Main/VOSArticleRDF}

%\bibitem {rdf-mt}
%Hayes, P.:
%RDF Semantics. W3C Recommendation (February 2004)\\
%\url{http://www.w3.org/TR/rdf-mt/}

%\bibitem {rdf-syntax-1999}
%Lassila, O., Swick, R.~R.:
%Resource Description Framework (RDF) Model and Syntax Specification, W3C
%Recommendation (February 1999)\\
%\url{http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/}

%\bibitem {rdb2rdf-xg-report}
%Malhotra, A.:
%W3C RDB2RDF Incubator Group Report. W3C Incubator Group Report (January
%2009)\\
%\url{http://www.w3.org/2005/Incubator/rdb2rdf/XGR-rdb2rdf/}

%\bibitem {melnik}
%Melnik, S.:
%Storing RDF in a relational database. Stanford University (2001)\\
%\url{http://infolab.stanford.edu/~melnik/rdf/db.html}

\bibitem {squish}
Miller, L., Seaborne, A., Reggiori, A.:
Three Implementations of SquishQL, a Simple RDF Query Language. In:
Horrocks, I., Hendler, J. (Eds) ISWC 2002. LNCS vol. 2342, pp. 423-435.
Springer, Heidelberg (2002)\\
\url{http://ilrt.org/discovery/2001/02/squish/}

%\bibitem {nuutila}
%Nuutila, E.:
%Efficient Transitive Closure Computation in Large Digraphs. Acta
%Polytechnica Scandinavica, Mathematics and Computing in Engineering
%Series No. 74, Helsinki (1995)\\
%\url{http://www.cs.hut.fi/~enu/thesis.html}

%\bibitem {owl-semantics}
%Patel-Schneider, P.~F., Hayes, P., Horrocks, I.:
%OWL Web Ontology Language Semantics and Abstract Syntax. W3C
%Recommendation (February 2004)\\
%\url{http://www.w3.org/TR/owl-semantics/}

\bibitem {federate}
Prud'hommeaux, E.:
RDF Access to Relational Databases (2003)\\
\url{http://www.w3.org/2003/01/21-RDF-RDB-access/}

\bibitem {sparql}
Prud'hommeaux, E., Seaborne, A.:
SPARQL Query Language for RDF. W3C Recommendation (January 2008)\\
\url{http://www.w3.org/TR/rdf-sparql-query/}

\bibitem {shapiro}
Shapiro, L., Stockman, G.:
Computer Vision, pp. 69-73. Prentice-Hall (2002)\\
\url{http://www.cse.msu.edu/~stockman/Book/2002/Chapters/ch3.pdf}

\bibitem {sp2b}
Schmidt, M., Hornung, T., K\"{u}chlin, N., Lausen, G., Pinkel, C.:
An Experimental Comparison of RDF Data Management Approaches in a SPARQL
Benchmark Scenario. In: A. Sheth et al. (Eds.) ISWC 2008. LNCS vol.
5318, pp. 82-97.
Springer, Heidelberg (2008)\\
\url{http://www.informatik.uni-freiburg.de/~mschmidt/docs/sp2b\_exp.pdf}

%\bibitem {treehugger}
%Steer, D.:
%TreeHugger -- XSLT for RDF (2003)\\
%\url{http://rdfweb.org/people/damian/treehugger/}

\end{thebibliography}

\end{document}

\documentclass{llncs}

\usepackage{makeidx}  % allows for indexgeneration
\usepackage{graphicx}
\usepackage[pdfpagescrop={92 112 523 778},a4paper=false,
            pdfborder={0 0 0}]{hyperref}
\emergencystretch=8pt
%
\begin{document}

\mainmatter  % start of the contributions
%
\title{Accessing Relational Data with RDF Queries and Assertions}
\toctitle{Accessing Relational Data with RDF Queries and Assertions}
\titlerunning{Accessing Relational Data with RDF}
%
\author{Dmitry Borodaenko}
\authorrunning{Dmitry Borodaenko}   % abbreviated author list (for running head)
%%%% modified list of authors for the TOC (add the affiliations)
\tocauthor{Dmitry Borodaenko}
%
\institute{\email{angdraug@debian.org}}

\maketitle              % typeset the title of the contribution

\begin{abstract}
This paper presents a hybrid RDF storage model that combines relational
data with arbitrary RDF meta-data, as implemented in the RDF storage
layer of the Samizdat open publishing and collaboration engine, and
explains the supporting algorithms for online translation of RDF
queries and conditional assertions into their relational equivalents.
The proposed model makes it possible to supplement legacy databases
with RDF meta-data without sacrificing the benefits of RDBMS
technology.
\end{abstract}

\section{Introduction}

The survey of free software / open source RDF storage systems performed
by SWAD-Europe\cite{swad-storage} has found that the most widespread
approach to RDF storage relies on relational databases. As seen from
the companion report on mapping Semantic Web data with
RDBMSes\cite{swad-rdbms-mapping}, the traditional relational
representation of RDF is a triple store, usually evolving around a
central statement table with \{subject, predicate, object\} triples as
its rows and one or more tables storing resource URIrefs, namespaces,
and other supplementary data.

While such a triple store approach serves well to satisfy the open
world assumption of RDF, by abandoning existing relational data models
it fails to take full advantage of RDBMS technology. According
to~\cite{swad-storage}, existing RDF storage tools are still immature;
at the same time, although modern triple stores claim to scale to
millions of triples, ICS-FORTH research\cite{ics-volume} shows that a
schema-specific storage model yields better results with regards to
performance and scalability on large volumes of data.

These concerns are addressed from different angles by the
RSSDB\cite{rssdb}, Federate\cite{ericp-rdf-rdb-access}, and
D2R\cite{d2r} packages. RSSDB splits the single triples table into a
schema-specific set of property tables. In this way, it walks away from
the relational data model, but maintains performance benefits due to
better indexing. Federate takes the most conservative approach and
allows querying a relational database through a restricted
application-specific RDF schema. Conversely, D2R is intended for batch
export of data from an RDBMS to RDF and assumes that subsequent
operation will involve only RDF.
The hybrid RDF storage model presented in this paper attacks this
problem from yet another angle, which can be described as a combination
of Federate's relational-to-RDF mapping and a traditional triple store.
While having the advantage of being designed from the ground up with
the RDF model in mind, the Samizdat RDF
layer\cite{samizdat-rdf-storage} deviated from the common RDF storage
practice in order to use both relational and triple data models and get
the best of both worlds. A hybrid storage model was designed, and
algorithms were implemented that allow access to the data in the hybrid
triple-relational model with RDF queries and conditional assertions in
an extended variant of the Squish\cite{squish} query
language.\footnote{The decision to use Squish over more expressive
languages like RDQL\cite{rdql} and Notation3\cite{notation3} was made
due to its intuitive syntax, which was found more suitable for
Samizdat's query composer GUI intended for end users of an
open-publishing system.} This paper describes the proposed model and
its implementation in the Samizdat engine.

\section{Relational Database Schema}

All content in a Samizdat site is represented internally as RDF. The
canonic URIref for any Samizdat resource is {\tt http://<site-url>/<id>},
where {\tt <site-url>} is the base URL of the site and {\tt <id>} is a
unique (within a single site) numeric identifier of the resource.

The root of the SQL representation of RDF resources is the {\tt
Resource} table, with an {\tt id} primary key field storing {\tt <id>}
and a {\tt label} text field representing the resource label. The
semantics of label values are different for literals, references to
external resources, and internal resources of the site.

A \emph{literal} value (including typed literals) is stored directly in
the {\tt label} field and marked with the {\tt literal} boolean field.

An \emph{external resource} label contains the resource URIref and is
marked with the {\tt uriref} boolean field.

An \emph{internal resource} is mapped into a row in an \emph{internal
resource table} whose name corresponds to the resource class name
stored in the {\tt label} field, with a primary key {\tt id} field
referencing back to the {\tt Resource} table and other fields holding
values of \emph{internal properties} for this resource class,
represented as literals or references to other resources stored in the
{\tt Resource} table. The primary key reference to {\tt Resource.id} is
enforced by PostgreSQL stored procedures.

To determine what information about a resource can be stored in and
extracted from class-specific tables, the RDF storage layer consults a
site-specific mapping
\begin{equation}
M(p) = \{\langle t_{p1},~f_{p1} \rangle, \enspace \dots\} \enspace ,
\end{equation}
which stores a list of possible pairs of SQL table name $t$ and field
name $f$ for each internal property name $p$. The mapping $M$ is read
at runtime from an external YAML\cite{yaml} file of the following form:

\begin{verbatim}
---
ns:
  s: 'http://www.nongnu.org/samizdat/rdf/schema#'
  focus: 'http://www.nongnu.org/samizdat/rdf/focus#'
  items: 'http://www.nongnu.org/samizdat/rdf/items#'
  rdf: 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
  dc: 'http://purl.org/dc/elements/1.1/'
map:
  'dc::date': {Resource: published_date}
  's::id': {Resource: id}
  'rdf::subject': {Statement: subject}
  'rdf::predicate': {Statement: predicate}
  'rdf::object': {Statement: object}
  's::rating': {Statement: rating}
  . . .
\end{verbatim}
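For illustration, such a mapping file can be loaded with a few lines of
Ruby; the sketch below is simplified (variable names are illustrative,
and the production implementation wraps this logic in a configuration
class), but shows how namespace prefixes are expanded and how the
lookup $M$ is built:

\begin{verbatim}
require 'yaml'

config = YAML.load(File.read('rdf.yaml'))
ns = config['ns']

# expand 'dc::date' into 'http://purl.org/dc/elements/1.1/date'
expand = lambda {|name| name.sub(/\A(\S+?)::/) { ns[$1] } }

map = {}   # property uriref => [table, field]
config['map'].each_pair do |property, m|
  table, field = m.to_a.first
  map[expand.call(property)] = [table, field]
end
\end{verbatim}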
\emph{External properties}, i.e.\ properties that are not covered by
$M$, are represented by \{{\tt subject}, {\tt predicate}, {\tt
object}\} triples in the {\tt Statement} table. Every such triple is
treated as a reified statement in RDF semantics and is assigned a {\tt
<id>} and a record in the {\tt Resource} table.

{\tt Resource} and {\tt Statement} are also internal resource tables,
and, as such, have some of their fields mapped by $M$. In particular,
the {\tt subject}, {\tt predicate}, and {\tt object} fields of the {\tt
Statement} table are mapped to the corresponding properties from the
RDF reification vocabulary, and {\tt Resource.id} is mapped to the {\tt
samizdat:id} property from the Samizdat namespace.

An excerpt from the default Samizdat database schema with mapped field
names replaced by predicate QNames is visualized in
Fig.\,\ref{db-schema-figure}. In addition to the {\tt Resource} and
{\tt Statement} tables described above, it shows the {\tt Message}
table representing one of the internal resource classes. Note how the
{\tt dc:date} property is made available to all resource classes, and
how reified statements are allowed to have an optional {\tt
samizdat:rating} property.

\begin{figure}
%\begin{verbatim}
%    +-------------+      +-----------------+
%    |  Resource   |      |    Statement    |
%    +-------------+      +-----------------+
% +->| samizdat:id |<-+---| id              |
% |  | label       |  +---| rdf:subject     |
% |  | literal     |  +---| rdf:predicate   |
% |  | uriref      |  +---| rdf:object      |
% |  | dc:date     |      | samizdat:rating |
% |  +-------------+      +-----------------+
% |
% |  +------------------+
% |  |     Message      |
% |  +------------------+
% +--| id               |
%    | dc:title         |
%    | dc:format        |
%    | samizdat:content |
%    +------------------+
%\end{verbatim}
\begin{center}
\includegraphics[scale=0.6]{fig1.eps}
\end{center}
\caption{Excerpt from Samizdat database schema}
\label{db-schema-figure}
\end{figure}

\section{Query Pattern Translation}
%
\subsection{Prerequisites}

The pattern translation algorithm operates on the pattern section of a
Squish query. The query pattern $\Psi$ is represented as a list of
\emph{pattern clauses}
\begin{equation}
\psi_i = \langle p_i,~s_i,~o_i \rangle \enspace ,
\end{equation}
where $i$ is the position of a clause, $p_i$ is the predicate URIref,
$s_i$ is the subject node and may be a URIref or a blank node, and
$o_i$ is the object node and may be a URIref, a blank node, or a
literal.

\subsection{Predicate Mapping}

For each position $i$, the predicate URIref $p_i$ is looked up in the
map of internal resource properties $M$. All possible mappings are
recorded for all clauses in a list $C$:
\begin{equation}
c_i = \{\langle t_{i1},~f_{i1} \rangle, \enspace
\langle t_{i2},~f_{i2} \rangle, \enspace \dots\} \enspace ,
\end{equation}
where $t_{ij}$ is the table name (same for subject $s_i$ and object
$o_i$) and $f_{ij}$ is the field name (meaningful for the object only,
since the subject is always mapped to the {\tt id} primary key).

In the same iteration, all subject and object positions of nodes are
recorded in the reverse positional mapping
\begin{equation}
R(n) = \{\langle i_1,~m_1 \rangle, \enspace \langle i_2,~m_2 \rangle,
\enspace \dots\} \enspace ,
\end{equation}
where $m$ shows whether node $n$ appears as subject or as object in
clause $i$.

Each ambiguous property mapping is compared with mappings for other
occurrences of the same subject and object nodes in the pattern graph;
whenever a non-empty intersection of mappings for the same node is
found, both subject and object mappings for the ambiguous property are
refined to that intersection.
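As a worked example of the refinement step, consider a deliberately
contrived mapping (not part of the default Samizdat schema) in which a
property $p_1$ maps to both $\langle {\tt Message},~{\tt title} \rangle$
and $\langle {\tt Album},~{\tt title} \rangle$, while a property $p_2$
in another clause sharing the same subject node maps only to
$\langle {\tt Message},~{\tt content} \rangle$. The intersection of
table mappings for the shared node is $\{{\tt Message}\}$, so the
mapping of the $p_1$ clause is refined to
$\langle {\tt Message},~{\tt title} \rangle$ for this query.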
\subsection{Relation Aliases and Join Conditions}

A relation alias $a_i$ is determined for each clause mapping $c_i$,
such that for all subject occurrences of the subject $s_i$ that were
mapped to the same table $t_i$, the alias is the same, and for all
positions with a differing table mapping or subject node, the alias is
different.

For all nodes $n$ that are mapped to more than one
$\langle a_i,~f_i \rangle$ pair in different positions, join conditions
are generated. Additionally, for each external resource, the {\tt
Resource} table is joined by URIref, and for each existential blank
node that isn't already bound by a join, a {\tt NOT NULL} condition is
generated. The resulting set of join conditions $J$ is used to generate
the {\tt WHERE} section of the target SQL query.

\subsection{Example}

The following Squish query selects all messages with a rating of 1 or
higher:

\begin{verbatim}
SELECT ?msg, ?title, ?name, ?date, ?rating
WHERE (dc::title ?msg ?title)
      (dc::creator ?msg ?author)
      (s::fullName ?author ?name)
      (dc::date ?msg ?date)
      (rdf::subject ?stmt ?msg)
      (rdf::predicate ?stmt dc::relation)
      (rdf::object ?stmt focus::Quality)
      (s::rating ?stmt ?rating)
LITERAL ?rating >= 1
ORDER BY ?rating
USING rdf FOR http://www.w3.org/1999/02/22-rdf-syntax-ns#
      dc FOR http://purl.org/dc/elements/1.1/
      s FOR http://www.nongnu.org/samizdat/rdf/schema#
      focus FOR http://www.nongnu.org/samizdat/rdf/focus#
\end{verbatim}

The mappings produced by translation of this query are summarized in
Table~\ref{mappings-table}.

\begin{table}
\caption{Query Translation Mappings}
\label{mappings-table}
\begin{center}
\begin{tabular}{clll}
\hline\noalign{\smallskip}
$i$ & $t_i$ & $f_i$ & $a_i$\\
\noalign{\smallskip}
\hline
\noalign{\smallskip}
1 & {\tt Message}   & {\tt title}           & {\tt b}\\
2 & {\tt Message}   & {\tt creator}         & {\tt b}\\
3 & {\tt Member}    & {\tt full\_name}      & {\tt d}\\
4 & {\tt Resource}  & {\tt published\_date} & {\tt c}\\
5 & {\tt Statement} & {\tt subject}         & {\tt a}\\
6 & {\tt Statement} & {\tt predicate}       & {\tt a}\\
7 & {\tt Statement} & {\tt object}          & {\tt a}\\
8 & {\tt Statement} & {\tt rating}          & {\tt a}\\
\hline
\end{tabular}
\end{center}
\end{table}

As a result of the translation, the following SQL query will be
generated:

\begin{verbatim}
SELECT b.id, b.title, d.full_name, c.published_date, a.rating
FROM Statement a, Message b, Resource c, Member d,
     Resource e, Resource f
WHERE a.id IS NOT NULL
  AND a.object = e.id
  AND e.literal = false AND e.uriref = true
  AND e.label = 'focus::Quality'
  AND a.predicate = f.id
  AND f.literal = false AND f.uriref = true
  AND f.label = 'dc::relation'
  AND a.rating IS NOT NULL
  AND b.creator = d.id
  AND b.id = a.subject
  AND b.id = c.id
  AND b.title IS NOT NULL
  AND c.published_date IS NOT NULL
  AND d.full_name IS NOT NULL
  AND (a.rating >= 1)
ORDER BY a.rating
\end{verbatim}

\subsection{Limitations}

In RDF model theory\cite{rdf-mt}, a resource may belong to more than
one class. In the Samizdat RDF storage model, the resource class
specified in {\tt Resource.label} is treated as the primary class: it
is not possible to have some of the internal properties of a resource
mapped to one table and other internal properties mapped to another.
The only exception to this is, obviously, the {\tt Resource} table,
which is shared by all resource classes.

Predicates with cardinality greater than 1 cannot be mapped to internal
resource tables, and should be recorded as reified statements instead.
RDF properties are allowed to be mapped to more than one internal
resource table, and queries on such ambiguous properties are intended
to select all classes of resources that match the property in
conjunction with the rest of the query. The algorithm described above
assumes that other pattern clauses refine such an ambiguous property
mapping to one internal resource table. Queries that fail this
assumption will be translated incorrectly by the current
implementation: only the resource class from the first remaining
mapping will be matched. This should be taken into account in
site-specific resource maps: ambiguous properties should be avoided
where possible, and their mappings should be listed in descending order
of resource class probability.

It is possible to solve this problem, but any precise solution will add
significant complexity to the resulting query. Solutions that would not
adversely affect performance are still being sought. So far, it is
recommended not to specify more than one mapping per internal property.

\section{Conditional Assertion}
%
\subsection{Prerequisites}

A conditional assertion statement in Samizdat Squish is recorded using
the same syntax as an RDF query, with the {\tt SELECT} section
containing the variables list replaced by an {\tt INSERT} section with
a list of ``don't-bind'' variables and an {\tt UPDATE} section
containing assignments of values to query variables:

\begin{verbatim}
[ INSERT node [, ...] ]
[ UPDATE node = value [, ...] ]
WHERE (predicate subject object) [...]
[ USING prefix FOR namespace [...] ]
\end{verbatim}

Initially, the pattern clauses of an assertion are translated using the
same procedure as for a query. Pattern $\Psi$, clause mapping $C$,
reverse positional mapping $R$, alias list $A$, and join conditions set
$J$ are generated as described in the previous section. After that, the
database update is performed in the two stages described below. Both
stages are executed within a single transaction, rolling back
intermediate inserts and updates in case the assertion fails.

\subsection{Resource Values}

At this stage, the value mapping $V(n)$ is defined for each node $n$,
and the necessary resource insertions are performed:

\begin{enumerate}
\item If $n$ is an internal resource, $V(n)$ is its {\tt id}. If there
is no resource with such an {\tt id} in the database, an error is
raised.
\item If $n$ is a literal, $V(n)$ is the literal value.
\item If $n$ is a blank node and only appears in object position, it is
assigned a value from the {\tt UPDATE} section of the assertion.
\item If $n$ is a blank node and appears in subject position, it is
either looked up in the database or inserted as a new resource. If no
resource in the database matches $n$ (to check that, a subgraph of
$\Psi$ including all pattern nodes and predicates reachable from $n$ is
generated and matched against the database), or if $n$ appears in the
{\tt INSERT} section of the assertion, a new resource is created and
its {\tt id} is assigned to $V(n)$. If a matching resource is found,
$V(n)$ becomes equal to its {\tt id}.
\item If $n$ is an external URIref, it is looked up in the {\tt
Resource} table. As with subject blank nodes, $V(n)$ is the {\tt id} of
a matching or new resource.
\end{enumerate}

All nodes that were inserted during this stage are recorded in the set
of new nodes $N$.
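As a hypothetical example reusing the vocabulary of the query example
above, the following assertion sets the rating of the statement that
classifies a message under the {\tt focus::Quality} focus:

\begin{verbatim}
INSERT ?stmt
UPDATE ?rating = 1.5
WHERE (rdf::subject ?stmt 1234)
      (rdf::predicate ?stmt dc::relation)
      (rdf::object ?stmt focus::Quality)
      (s::rating ?stmt ?rating)
USING rdf FOR http://www.w3.org/1999/02/22-rdf-syntax-ns#
      dc FOR http://purl.org/dc/elements/1.1/
      s FOR http://www.nongnu.org/samizdat/rdf/schema#
      focus FOR http://www.nongnu.org/samizdat/rdf/focus#
\end{verbatim}

Here, {\tt 1234} is an illustrative internal resource id. Because {\tt
?stmt} appears in the {\tt INSERT} section, a new reified statement is
created without matching against the database; with an empty {\tt
INSERT} section, an existing matching statement would be reused
instead. The {\tt ?rating} node appears only in object position and
receives its value from the {\tt UPDATE} section (rule 3 above).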
\subsection{Data Assignment}

For all aliases from $A$, except additional aliases that are defined
for external URIref nodes (which don't have to be looked up, since
their {\tt id}s are recorded in $V$ during the previous stage), the
reverse positional mapping
\begin{equation}
R_\mathrm{A}(a) = \{i_1, \enspace i_2, \enspace \dots\}
\end{equation}
is defined. The key node $K$ is defined as the subject node $s_{i_1}$
from clause $\psi_{i_1}$, and the aliased table $t$ is defined as the
table name $t_{i_1}$ from clause mapping $c_{i_1}$.

For each position $k$ from $R_\mathrm{A}(a)$, a pair
$\langle f_k, V(o_k) \rangle$, where $f_k$ is the field name from $c_k$
and $o_k$ is the object node from $\psi_k$, is added to the data
assignment list $D(K)$ if the node $o_k$ occurs in the new node list
$N$ or in the {\tt UPDATE} section of the assertion statement.

If the key node $K$ occurs in $N$, a new row is inserted into the table
$t$. If $K$ is not in $N$, but $D(K)$ is not empty, an SQL update
statement is generated for the row of $t$ with {\tt id} equal to
$V(K)$. In both cases, assignments are generated from the data
assignment list $D(K)$. The above procedure is repeated for each alias
$a$ included in $R_\mathrm{A}$.

\subsection{Iterative Assertions}

If the assertion pattern matches more than once in the site knowledge
base, the algorithm defined in this section will nevertheless run the
appropriate insertions and updates only once. For an iterative update
of all occurrences of a pattern, the assertion has to be
programmatically wrapped inside an appropriate RDF query.

\section{Implementation Details}

The Samizdat engine\cite{samizdat-impl-report} is written in the Ruby
programming language and uses the PostgreSQL database for storage and
an assortment of Ruby libraries for database access (DBI),
configuration and RDF mapping (YAML), l10n (GetText), and the Pingback
protocol (XML-RPC). It is running on a variety of platforms ranging
from Debian GNU/Linux to Windows 98/Cygwin. Samizdat is free software
and is available under the GNU General Public License, version 2 or
later.

Samizdat project development started in December 2002; the first public
release was announced in June 2003. As of the second beta version
0.5.1, released in March 2004, Samizdat provided a basic set of open
publishing functionality, including registering site members,
publishing and replying to messages, uploading multimedia messages,
voting on the relation of site focuses to resources, creating and
managing new focuses, and hand-editing or using a GUI for constructing
and publishing Squish queries that can be used to search and filter
site resources.

\section{Conclusions}

Wide adoption of the Semantic Web requires interoperability between
relational databases and RDF applications. Existing RDF stores treat
relational data as legacy and require that it be recorded in triples
before being processed, with the exception of the Federate system,
which provides limited direct access to relational data via an
application-specific RDF schema.

The Samizdat RDF storage layer provides an intermediate solution for
this problem by combining relational databases with arbitrary RDF
meta-data. The described approach makes it possible to take advantage
of RDBMS transactions, replication, performance optimizations, etc., in
Semantic Web applications, and reduces the costs of migration from the
relational data model to RDF.

As can be seen from the corresponding sections of this paper, the
current implementation of the proposed approach has several
limitations.
These limitations are not caused by limitations in the approach itself,
but rather reflect the pragmatic decision to only implement the
functionality that is used by the Samizdat engine. As more advanced
collaboration features such as message versioning and aggregation are
added to Samizdat, some of the limitations of its RDF storage layer
will be removed.

% ---- Bibliography ----
%
\begin{thebibliography}{19}
%
\bibitem {ics-volume}
Alexaki, S., Christophides, V., Karvounarakis, G., Plexousakis, D.,
Tolle, K.:
The RDFSuite: Managing Voluminous RDF Description Bases. Technical
report, ICS-FORTH, Heraklion, Greece (2000)\\
http://139.91.183.30:9090/RDF/publications/semweb2001.html

\bibitem {swad-storage}
Beckett, D.:
Semantic Web Scalability and Storage: Survey of Free Software / Open
Source RDF storage systems. SWAD-Europe Deliverable 10.1\\
http://www.w3.org/2001/sw/Europe/reports/rdf\_scalable\_storage\_report

\bibitem {swad-rdbms-mapping}
Beckett, D., Grant, J.:
Semantic Web Scalability and Storage: Mapping Semantic Web Data with
RDBMSes. SWAD-Europe Deliverable 10.2\\
http://www.w3.org/2001/sw/Europe/reports/scalable\_rdbms\_mapping\_report

\bibitem {yaml}
Ben-Kiki, O., Evans, C., Ingerson, B.:
YAML Ain't Markup Language (YAML) 1.0. Working Draft 2004-JAN-29.\\
http://www.yaml.org/spec/

\bibitem {notation3}
Berners-Lee, T.:
Notation3 --- Ideas about Web architecture\\
http://www.w3.org/DesignIssues/Notation3

\bibitem {d2r}
Bizer, C.:
D2R MAP --- Database to RDF Mapping Language and Processor\\
http://www.wiwiss.fu-berlin.de/suhl/bizer/d2rmap/D2Rmap.htm

\bibitem {samizdat-rdf-storage}
Borodaenko, D.:
Samizdat RDF Storage (December 2002)\\
http://savannah.nongnu.org/cgi-bin/viewcvs/samizdat/samizdat/doc/rdf-storage.txt

\bibitem {samizdat-impl-report}
Borodaenko, D.:
Samizdat RDF Implementation Report (September 2003)\\
http://lists.w3.org/Archives/Public/www-rdf-interest/2003Sep/0043.html

\bibitem {rdf-mt}
Hayes, P.:
RDF Semantics. W3C Recommendation (February 2004)\\
http://www.w3.org/TR/rdf-mt

\bibitem {rdql}
Jena Semantic Web Framework: RDQL Grammar\\
http://jena.sf.net/RDQL/rdql\_grammar.html

\bibitem {ericp-rdf-rdb-access}
Prud'hommeaux, E.:
RDF Access to Relational Databases\\
http://www.w3.org/2003/01/21-RDF-RDB-access/

\bibitem {rssdb}
RSSDB --- RDF Schema Specific DataBase (RSSDB). ICS-FORTH (2002)\\
http://139.91.183.30:9090/RDF/RSSDB/

\bibitem {squish}
Miller, L., Seaborne, A., Reggiori, A.:
Three Implementations of SquishQL, a Simple RDF Query Language. 1st
International Semantic Web Conference (ISWC2002), June 9-12, 2002,
Sardinia, Italy.\\
http://ilrt.org/discovery/2001/02/squish/

\end{thebibliography}

\end{document}

Samizdat RDF Implementation Report
==================================

http://lists.w3.org/Archives/Public/www-rdf-interest/2003Sep/0043.html

Implementation
--------------

http://www.nongnu.org/samizdat/

Samizdat is a generic RDF-based engine for building collaboration and
open publishing web sites. Samizdat will let everyone publish, view,
comment, edit, and aggregate text and multimedia resources, vote on
ratings and classifications, filter resources by flexible sets of
criteria, cooperate and coordinate on all kinds of activities (see
Design Goals document). Samizdat intends to promote values of freedom,
openness, equality, and cooperation.
The Samizdat engine is implemented using the Ruby programming language,
the Apache mod_ruby module, and the PostgreSQL RDBMS, and is available
under the GNU General Public License, version 2 or later. Project
development started in December 2002; the first public release was
announced in June 2003. This report refers to Samizdat 0.0.4, released
on 2003-09-01. Functionality covered by this version includes:
registering site members, publishing and replying to messages,
uploading multimedia messages, voting on standard tags on resources;
hand-editing or using the GUI for constructing and publishing Squish
queries that can be used to search and filter site resources.

RDF Schema
----------

Samizdat defines its own RDF schema for description of site members,
published messages, votes, and other site resources (see Concepts
document). One of the outstanding features of the Samizdat schema is
the use of statement reification in approval of content classification
with votes cast by site members. The Samizdat RDF schema uses Dublin
Core metadata where applicable; also, integration of site member
descriptions with FOAF is planned.

One of the problems encountered in Samizdat RDF schema development was
the lack of standard metadata describing discussion threads. While
other properties defined in the Samizdat schema denote
Samizdat-specific concepts, such as "vote" and "rating", it is more
desirable to use commonly agreed metadata for threading structure in
place of the implementation-local "thread" and "inReplyTo" properties.

RDF Import and Export
---------------------

While the Samizdat model follows the RDF Concepts and RDF Semantics
recommendations (with the exceptions noted below), the engine does not
externally interchange RDF data and thus does not use RDF/XML or any
other RDF serialization format. It is assumed that, when the need for
RDF import and export arises, it can be implemented externally on top
of the Samizdat RDF storage module using existing RDF frameworks such
as Redland.

Datatyped Literals
------------------

Samizdat doesn't implement datatyped literals, and relies on the
underlying PostgreSQL capabilities for mapping between literal values
and their string representations. Outside of the SQL context, literals
are interpreted as opaque strings; XML literals are not treated
specially, and datatype information is not preserved. However, support
of XML Schema datatypes is considered necessary in order to untie a
Samizdat knowledge base from the specifics of the underlying RDF
storage, and will be implemented as a prerequisite for migration to a
selection of alternative RDF storage backends (candidates are FramerD,
3store, and Redland).

Language Tags
-------------

Literal language tags are not honoured; the "dc:language" property is
supposed to be used to denote message language.

Entailments
-----------

Samizdat RDF storage only implements simple entailment; vocabulary
entailment is not implemented yet. At the moment, simple entailment
suffices for all features of the Samizdat engine. If and when
vocabulary entailment becomes necessary, it will be implemented in the
Samizdat RDF storage module or relegated to an alternative RDF storage
backend, depending on the status of backend alternatives for Samizdat
at that time.

Query Support
-------------

Samizdat RDF storage implements a translation of RDF query graphs
written in extended Squish into relational SQL queries and allows
purely relational representation of selected properties of site
resources (see RDF Storage and Storage Implementation documents).
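For illustration, an extended Squish query with a literal condition and
result ordering might look like this (the example is adapted from the
Samizdat sources rather than quoted from the original report):

    SELECT ?msg, ?rating
    WHERE (rdf::subject ?stmt ?msg)
          (rdf::predicate ?stmt dc::relation)
          (rdf::object ?stmt focus::Quality)
          (s::rating ?stmt ?rating)
    LITERAL ?rating >= 1
    ORDER BY ?rating DESC
    USING rdf FOR http://www.w3.org/1999/02/22-rdf-syntax-ns#
          dc FOR http://purl.org/dc/elements/1.1/
          s FOR http://www.nongnu.org/samizdat/rdf/schema#
          focus FOR http://www.nongnu.org/samizdat/rdf/focus#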
It must be noted that at the moment, the status of RDF query language
standards is found unsatisfactory. The DAML Query Language abstract
specification provides an excellent formal basis, but does not
encompass all capabilities of existing RDF query languages. Also,
existing query languages are limited in one way or another, are
underformalized (most are defined by a single implementation), and are
often overloaded with baroque syntax.

The two major features that were missed most in existing query
languages at the time of the Samizdat RDF storage implementation were:
knowledge base update, allowing complex constructs to be merged into
the site KB graph (implemented in the Samizdat RDF Data Manipulation
Language), and workflow control, providing at least transaction
rollback (in Samizdat, underlying PostgreSQL transactions are used).
Other Squish extensions implemented in Samizdat are literal conditions
and answer collection ordering (currently relegated to PostgreSQL;
ideally, interpreted according to literal datatypes).

Gem::Specification.new do |spec|
  spec.name = 'graffiti'
  spec.version = '2.1'
  spec.author = 'Dmitry Borodaenko'
  spec.email = 'angdraug@debian.org'
  spec.homepage = 'https://github.com/angdraug/graffiti'
  spec.summary = 'Relational RDF store for Ruby'
  spec.description = <<-EOF
Graffiti is an RDF store based on dynamic translation of RDF queries
into SQL. Graffiti allows one to map any relational database schema
into RDF semantics and vice versa, to store any RDF data in a
relational database.

Graffiti uses Sequel to connect to database backend and provides a
DBI-like interface to run RDF queries in Squish query language from
Ruby applications.
  EOF
  spec.files = `git ls-files`.split "\n"
  spec.test_files = Dir['test/ts_*.rb']
  spec.license = 'GPL3+'
  spec.add_dependency('syncache')
  spec.add_dependency('sequel')
end

# Graffiti RDF Store
# (originally written for Samizdat project)
#
# Copyright (c) 2002-2009  Dmitry Borodaenko <angdraug@debian.org>
#
# This program is free software.
# You can distribute/modify this program under the terms of
# the GNU General Public License version 3 or later.
#
# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
# see doc/storage-impl.txt for explanation of implemented algorithms
#
# vim: et sw=2 sts=2 ts=8 tw=0

require 'graffiti/store'
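A minimal usage sketch based on the gemspec description above (the
Store API shown here is an assumption for illustration only; see
lib/graffiti/store.rb for the actual interface):

    require 'yaml'
    require 'sequel'
    require 'graffiti'

    db = Sequel.connect('postgres://localhost/samizdat')  # any Sequel backend
    config = YAML.load(File.read('rdf.yaml'))   # RDF mapping, see RdfConfig
    store = Graffiti::Store.new(db, config)     # assumed constructor signature

    squish = %{SELECT ?msg
               WHERE (dc::title ?msg ?title)
               USING dc FOR http://purl.org/dc/elements/1.1/}
    store.select_all(squish) {|row| p row }     # assumed query method name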
# Graffiti RDF Store
# (originally written for Samizdat project)
#
# Copyright (c) 2002-2011  Dmitry Borodaenko <angdraug@debian.org>
#
# This program is free software.
# You can distribute/modify this program under the terms of
# the GNU General Public License version 3 or later.
#
# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
# see doc/storage-impl.txt for explanation of implemented algorithms
#
# vim: et sw=2 sts=2 ts=8 tw=0

module Graffiti
  module Debug
    private

    DEBUG = false

    def debug(message = nil)
      return unless DEBUG
      message = yield if block_given?
      log message if message
    end

    def log(message)
      STDERR << 'Graffiti: ' << message.to_s << "\n"
    end
  end
end

# Graffiti RDF Store
# (originally written for Samizdat project)
#
# Copyright (c) 2002-2009  Dmitry Borodaenko <angdraug@debian.org>
#
# This program is free software.
# You can distribute/modify this program under the terms of
# the GNU General Public License version 3 or later.
#
# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
# see doc/storage-impl.txt for explanation of implemented algorithms
#
# vim: et sw=2 sts=2 ts=8 tw=0

module Graffiti
  # raised for syntax errors in Squish statements
  class ProgrammingError < RuntimeError; end
end

# Graffiti RDF Store
# (originally written for Samizdat project)
#
# Copyright (c) 2002-2011  Dmitry Borodaenko <angdraug@debian.org>
#
# This program is free software.
# You can distribute/modify this program under the terms of
# the GNU General Public License version 3 or later.
#
# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
# see doc/storage-impl.txt for explanation of implemented algorithms
#
# vim: et sw=2 sts=2 ts=8 tw=0

require 'graffiti/rdf_property_map'

module Graffiti
  # Configuration of relational RDF storage (see examples)
  #
  class RdfConfig
    def initialize(config)
      @ns = config['ns']

      @map = {}
      config['map'].each_pair do |p, m|
        table, field = m.to_a.first
        p = ns_expand(p)
        @map[p] = RdfPropertyMap.new(p, table, field)
      end

      if config['subproperties'].kind_of? Hash
        config['subproperties'].each_pair do |p, subproperties|
          p = ns_expand(p)
          map = @map[p] or raise RuntimeError,
            "Incorrect RDF storage configuration: superproperty #{p} must be mapped"
          map.superproperty = true

          qualifier = RdfPropertyMap.qualifier_property(p)
          @map[qualifier] = RdfPropertyMap.new(
            qualifier, map.table, RdfPropertyMap.qualifier_field(map.field))

          subproperties.each do |subp|
            subp = ns_expand(subp)
            @map[subp] = RdfPropertyMap.new(subp, map.table, map.field)
            @map[subp].subproperty_of = p
          end
        end
      end

      if config['transitive_closure'].kind_of? Hash
        config['transitive_closure'].each_pair do |p, table|
          @map[ ns_expand(p) ].transitive_closure = table

          if config['subproperties'].kind_of?(Hash) and config['subproperties'][p]
            config['subproperties'][p].each do |subp|
              @map[ ns_expand(subp) ].transitive_closure = table
            end
          end
        end
      end
    end

    # hash of namespaces
    attr_reader :ns

    # map internal property names with expanded namespaces to RdfPropertyMap
    # objects
    #
    attr_reader :map

    def ns_expand(p)
      p and p.sub(/\A(\S+?)::/) { @ns[$1] }
    end
  end
end

# Graffiti RDF Store
# (originally written for Samizdat project)
#
# Copyright (c) 2002-2011  Dmitry Borodaenko <angdraug@debian.org>
#
# This program is free software.
# You can distribute/modify this program under the terms of
# the GNU General Public License version 3 or later.
# # see doc/rdf-storage.txt for introduction and Graffiti Squish definition; # see doc/storage-impl.txt for explanation of implemented algorithms # # vim: et sw=2 sts=2 ts=8 tw=0 module Graffiti # Map of an internal RDF property into relational storage # class RdfPropertyMap # special qualifier map # # ' ' is added to the property name to make sure it can't clash with any # valid property uriref # def RdfPropertyMap.qualifier_property(property, type = 'subproperty') property + ' ' + type end # special qualifier field # def RdfPropertyMap.qualifier_field(field, type = 'subproperty') field + '_' + type end def initialize(property, table, field) # fixme: support ambiguous mappings @property = property @table = table @field = field end # expanded uriref of the mapped property # attr_reader :property # name of the table into which the property is mapped (property domain is an # internal resource class mapped into this table) # attr_reader :table # name of the field into which the property is mapped # # if property range is not a literal, the field is a reference to the # resource table # attr_reader :field # expanded uriref of the property which this property is a subproperty of # # if set, this property maps into the same table and field as its # superproperty, and is qualified by an additional field named # _subproperty which refers to a uriref resource holding uriref of # this subproperty # attr_accessor :subproperty_of attr_writer :superproperty # set to +true+ if this property has subproperties # def superproperty? @superproperty or false end # name of transitive closure table for a transitive property # # the format of a transitive closure table is: # # - 'resource' field refers to the subject resource id # - '' property field and '_subproperty' qualifier field (in # case of subproperty) have the same name as in the main table # - 'distance' field holds the distance from subject to object in the RDF # graph # # the transitive closure table is automatically updated by a trigger on every # update of the main table # attr_accessor :transitive_closure end end ruby-graffiti-2.2/lib/graffiti/sql_mapper.rb000066400000000000000000000623261176467530700212110ustar00rootroot00000000000000# Graffiti RDF Store # (originally written for Samizdat project) # # Copyright (c) 2002-2011 Dmitry Borodaenko # # This program is free software. # You can distribute/modify this program under the terms of # the GNU General Public License version 3 or later. # # see doc/rdf-storage.txt for introduction and Graffiti Squish definition; # see doc/storage-impl.txt for explanation of implemented algorithms # # vim: et sw=2 sts=2 ts=8 tw=0 require 'delegate' require 'uri/common' require 'graffiti/rdf_property_map' require 'graffiti/squish' module Graffiti class SqlNodeBinding def initialize(table_alias, field) @alias = table_alias @field = field end attr_reader :alias, :field def to_s @alias + '.' + @field end alias :inspect :to_s def eql?(binding) @alias == binding.alias and @field == binding.field end alias :'==' :eql? def hash self.to_s.hash end end class SqlExpression < DelegateClass(Array) def initialize(*parts) super parts end def to_s '(' << self.join(' ') << ')' end alias :to_str :to_s def traverse(&block) self.each do |part| case part when SqlExpression part.traverse(&block) else yield end end end def rebind!(rebind, &block) self.each_with_index do |part, i| case part when SqlExpression part.rebind!(rebind, &block) when SqlNodeBinding if rebind[part] self[i] = rebind[part] yield part if block_given? 
end end end end alias :eql? :'==' def hash self.to_s.hash end end # Transform RDF query pattern graph into a relational join expression. # class SqlMapper include Debug def initialize(config, pattern, negative = [], optional = [], global_filter = '') @config = config @global_filter = global_filter check_graph(pattern) negative.empty? or check_graph(pattern + negative) optional.empty? or check_graph(pattern + optional) map_predicates(pattern, negative, optional) transform generate_tables_and_conditions @jc = @aliases = @ac = @global_filter = nil end # map clause position to table, field, and table alias # # position => { # :subject => { # :node => node, # :field => field # }, # :object => { # :node => node, # :field => field # }, # :map => RdfPropertyMap, # :bind_mode => < :must_bind | :may_bind | :must_not_bind >, # :alias => alias # } # attr_reader :clauses # map node to list of positions in clauses # # node => { # :positions => [ # { :clause => position, :role => < :subject | :object > } # ], # :bind_mode => < :must_bind | :may_bind | :must_not_bind >, # :colors => { color1 => bind_mode1, ... }, # :ground => < true | false > # } # attr_reader :nodes # list of tables for FROM clause of SQL query attr_reader :from # conditions for WHERE clause of SQL query attr_reader :where # return node's binding, raise exception if the node isn't bound # def bind(node) (@nodes[node] and @bindings[node] and (binding = @bindings[node].first) ) or raise ProgrammingError, "Node '#{node}' is not bound by the query pattern" @nodes[node][:positions].each do |p| if :object == p[:role] and @clauses[ p[:clause] ][:map].subproperty_of property = @clauses[ p[:clause] ][:map].property return %{select_subproperty(#{binding}, #{bind(property)})} end end binding end private # Check whether pattern is not a disjoint graph (all nodes are # undirectionally reachable from one node). # def check_graph(pattern) nodes = pattern.transpose[1, 2].flatten.uniq # all nodes seen = [ nodes.shift ] found_more = true while found_more and not nodes.empty? found_more = false pattern.each do |predicate, subject, object| if seen.include?(subject) and nodes.include?(object) seen.push(object) nodes.delete(object) found_more = true elsif seen.include?(object) and nodes.include?(subject) seen.push(subject) nodes.delete(subject) found_more = true end end end nodes.empty? or raise ProgrammingError, "Query pattern is a disjoint graph" end # Stage 1: Predicate Mapping (storage-impl.txt). # def map_predicates(pattern, negative, optional) @nodes = {} @clauses = [] map_pattern(pattern, :must_bind) map_pattern(negative, :must_not_bind) map_pattern(optional, :may_bind) @color_counter = @must_bind_nodes = nil refine_ambiguous_properties debug do @nodes.keys.sort.each do |node| n = @nodes[node] debug %{map_predicates #{node}: #{n[:bind_mode]} #{n[:colors].inspect}} end nil end end # Label every connected component of the pattern with a different color. # # Pattern clause positions: # # 0. predicate # 1. subject # 2. object # 3. filter # # Returns hash of node colors. # # Implements the {Two-pass Connected Component Labeling algorithm} # [http://en.wikipedia.org/wiki/Connected_Component_Labeling#Two-pass] # with an added special case to exclude _alien_nodes_ from neighbor lists. # # The special case ensures that parts of a may-bind or must-not-bind # subpattern that are only connected through a must-bind node do not connect. # def label_pattern_components(pattern, alien_nodes, augment_alien_nodes = false) return {} if pattern.empty? 
color = {} color_eq = [] # [ [ smaller, larger ], ... ] nodes = pattern.transpose[1, 2].flatten.uniq alien_nodes_here = nodes & alien_nodes @color_counter = @color_counter ? @color_counter.next : 0 color[ nodes[0] ] = @color_counter # first pass 1.upto(nodes.size - 1) do |i| node = nodes[i] pattern.each do |predicate, subject, object, filter| if node == subject neighbor = object elsif node == object neighbor = subject end next if neighbor.nil? or color[neighbor].nil? or alien_nodes_here.include?(neighbor) if color[node].nil? color[node] = color[neighbor] elsif color[node] != color[neighbor] # record color equivalence color_eq |= [ [ color[node], color[neighbor] ].sort ] end end color[node] ||= (@color_counter += 1) end # second pass nodes.each do |node| while eq = color_eq.rassoc(color[node]) color[node] = eq[0] end end alien_nodes.push(*nodes).uniq! if augment_alien_nodes color end def map_pattern(pattern, bind_mode = :must_bind) pattern = pattern.dup @must_bind_nodes ||= [] color = label_pattern_components(pattern, @must_bind_nodes, :must_bind == bind_mode) pattern.each do |predicate, subject, object, filter, transitive| # validate the triple predicate =~ URI::URI_REF or raise ProgrammingError, "Valid uriref expected in predicate position instead of '#{predicate}'" [subject, object].each do |node| node =~ SquishQuery::INTERNAL or node =~ SquishQuery::BN or node =~ URI::URI_REF or raise ProgrammingError, "Resource or blank node name expected instead of '#{node}'" end # list of possible mappings into internal tables map = @config.map[predicate] if transitive and map.transitive_closure.nil? raise ProgrammingError, "No transitive closure is defined for #{predicate} property" end if map and (subject =~ SquishQuery::BN or subject =~ SquishQuery::INTERNAL or subject =~ SquishQuery::PARAMETER or 'resource' == map.table) # internal predicate and subject is mappable to resource table i = clauses.size @clauses[i] = { :subject => [ { :node => subject, :field => 'id' } ], :object => [ { :node => object, :field => map.field } ], :map => map, :transitive => transitive, :bind_mode => bind_mode } @clauses[i][:filter] = SqlExpression.new(filter) if filter [subject, object].each do |node| if @nodes[node] @nodes[node][:bind_mode] = stronger_bind_mode(@nodes[node][:bind_mode], bind_mode) else @nodes[node] = { :positions => [], :bind_mode => bind_mode, :colors => {} } end # set of node colors, one for each bind_mode @nodes[node][:colors][ color[node] ] = bind_mode end # reverse mapping of the node occurences @nodes[subject][:positions].push( { :clause => i, :role => :subject } ) @nodes[object][:positions].push( { :clause => i, :role => :object } ) if superp = map.subproperty_of # link subproperty qualifier into the pattern pattern.push( [RdfPropertyMap.qualifier_property(superp), subject, predicate]) color[predicate] = color[object] # no need to ground both subproperty and superproperty @nodes[object][:ground] = true end else # assume reification for unmapped predicates: # # | (rdf::predicate ?_stmt_#{i} p) # (p s o) -> | (rdf::subject ?_stmt_#{i} s) # | (rdf::object ?_stmt_#{i} o) # rdf = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#' stmt = "?_stmt_#{i}" pattern.push([rdf + 'predicate', stmt, predicate], [rdf + 'subject', stmt, subject], [rdf + 'object', stmt, object]) color[stmt] = color[predicate] = color[object] end end end # Select strongest of the two bind modes, in the following order of # preference: # # :must_bind -> :must_not_bind -> :may_bind # def stronger_bind_mode(mode1, mode2) if mode1 != mode2 
and (:must_bind == mode2 or :may_bind == mode1) mode2 else mode1 end end # If a node can be mapped to more than one [table, field] pair, see if it can # be refined based on other occurences of this node in other query clauses. # def refine_ambiguous_properties @nodes.keys.sort.each do |node| map = @nodes[node][:positions] map.each_with_index do |p, i| big = @clauses[ p[:clause] ][ p[:role] ] next if big.size <= 1 # no refining needed debug { 'refine_ambiguous_properties ' + @nodes[node] + ': ' + big.inspect } (i + 1).upto(map.size - 1) do |j| small_p = map[j] small = @clauses[ small_p[:clause] ][ small_p[:role] ] refined = big & small if refined.size > 0 and refined.size < big.size # refine the node... @clauses[ p[:clause] ][ p[:role] ] = big = refined # ...and its pair @clauses[ p[:clause] ][ opposite_role(p[:role]) ].collect! {|pair| refined.assoc(pair[0]) ? pair : nil }.compact! end end end end # drop remaining ambiguous mappings # todo: split query for ambiguous mappings @clauses.each do |clause| next if clause.nil? # means it was reified clause[:subject] = clause[:subject].first clause[:object] = clause[:object].first end end def opposite_role(role) :subject == role ? :object : :subject end # Return current value of alias counter, remember which table it was assigned # to, and increment the counter. # def next_alias(table, node, bind_mode = @nodes[node][:bind_mode]) @ac ||= 'a' @aliases ||= {} a = @ac.dup @aliases[a] = { :table => table, :node => node, :bind_mode => bind_mode, :filter => [] } @ac.next! return a end def define_relation_aliases @nodes.keys.sort.each do |node| positions = @nodes[node][:positions] debug { 'define_relation_aliases ' + positions.inspect } # go through all clauses with this node in subject position positions.each_with_index do |p, i| next if :subject != p[:role] or @clauses[ p[:clause] ][:alias] clause = @clauses[ p[:clause] ] map = clause[:map] table = clause[:transitive] ? map.transitive_closure : map.table # see if we've already mapped this node to the same table before 0.upto(i - 1) do |j| similar_clause = @clauses[ positions[j][:clause] ] if similar_clause[:alias] and similar_clause[:map].table == table and similar_clause[:map].field != map.field # same node, same table, different field -> same alias clause[:alias] = similar_clause[:alias] break end end if clause[:alias].nil? clause[:alias] = if clause[:transitive] # transitive clause bind mode overrides a stronger node bind mode # # fixme: generic case for multiple aliases per node next_alias(table, node, clause[:bind_mode]) else next_alias(table, node) end end end end # optimize: unnecessary aliases are generated end def update_alias_filters @clauses.each do |c| if c[:filter] @aliases[ c[:alias] ][:filter].push(c[:filter]) end end end # Stage 2: Relation Aliases and Join Conditions (storage-impl.txt). # # Result is map of aliases in @aliases and list of join conditions in @jc. # def transform define_relation_aliases update_alias_filters # [ [ binding1, binding2 ], ... 
] @jc = [] @bindings = {} @nodes.keys.sort.each do |node| positions = @nodes[node][:positions] # node binding first = positions.first clause = @clauses[ first[:clause] ] a = clause[:alias] binding = SqlNodeBinding.new(a, clause[ first[:role] ][:field]) @bindings[node] = [ binding ] # join conditions 1.upto(positions.size - 1) do |i| p = positions[i] clause2 = @clauses[ p[:clause] ] binding2 = SqlNodeBinding.new(clause2[:alias], clause2[ p[:role] ][:field]) unless @bindings[node].include?(binding2) @bindings[node].push(binding2) @jc.push([binding, binding2, node]) @nodes[node][:ground] = true end end # ground non-blank nodes if node !~ SquishQuery::BN if node =~ SquishQuery::INTERNAL # internal resource id @aliases[a][:filter].push SqlExpression.new(binding, '=', $1) elsif node =~ SquishQuery::PARAMETER or node =~ SquishQuery::LITERAL @aliases[a][:filter].push SqlExpression.new(binding, '=', node) elsif node =~ URI::URI_REF # external resource uriref r = nil positions.each do |p| next unless :subject == p[:role] c = @clauses[ p[:clause] ] if 'resource' == c[:map].table r = c[:alias] # reuse existing mapping to resource table break end end if r.nil? r = next_alias('resource', node) r_binding = SqlNodeBinding.new(r, 'id') @bindings[node].unshift(r_binding) @jc.push([ binding, r_binding, node ]) end @aliases[r][:filter].push SqlExpression.new( SqlNodeBinding.new(r, 'uriref'), '=', "'t'", 'AND', SqlNodeBinding.new(r, 'label'), '=', %{'#{node}'}) else raise RuntimeError, "Invalid node '#{node}' should never occur at this point" end @nodes[node][:ground] = true end end debug do @aliases.keys.sort.each {|a| debug %{transform #{a}: #{@aliases[a].inspect}} } @jc.each {|jc| debug 'transform ' + jc.inspect } nil end end # Produce SQL FROM and WHERE clauses from results of transform(). # def generate_tables_and_conditions main_path, seen = jc_subgraph_path(:must_bind) debug { 'generate_tables_and_conditions ' + main_path.inspect } main_path and not main_path.empty? or raise RuntimeError, 'Failed to find table aliases for main query' @where = ground_dangling_blank_nodes(main_path) joins = '' subquery_count = 'a' [ :must_not_bind, :may_bind ].each do |bind_mode| loop do sub_path, new = jc_subgraph_path(bind_mode, seen) break if sub_path.nil? or sub_path.empty? debug { 'generate_tables_and_conditions ' + sub_path.inspect } sub_query, sub_join = sub_path.partition {|a,| main_path.assoc(a).nil? } # fixme: make sure that sub_join is not empty if 1 == sub_query.size # simplified case: join single table directly without a subquery join_alias, = sub_query.first a = @aliases[join_alias] join_target = a[:table] join_conditions = jc_path_to_join_conditions(sub_join) + a[:filter] else # left join subquery to the main query join_alias = '_subquery_' << subquery_count subquery_count.next! sub_join = subquery_jc_path(sub_join, join_alias) rebind = rebind_subquery(sub_path, join_alias) select_nodes = subquery_select_nodes(rebind, main_path, sub_join) join_conditions = jc_path_to_join_conditions(sub_join, rebind, select_nodes) select_nodes = select_nodes.keys.collect {|b| b.to_s << ' AS ' << rebind[b].field }.join(', ') tables, conditions = jc_path_to_tables_and_conditions(sub_path) join_target = "(\nSELECT #{select_nodes}\nFROM #{tables}" join_target << "\nWHERE " << conditions unless conditions.empty? 
join_target << "\n)" join_target.gsub!(/\n(?!\)\z)/, "\n ") end joins << ("\nLEFT JOIN " + join_target + ' AS ' + join_alias + ' ON ' + join_conditions.uniq.join(' AND ')) if :must_not_bind == bind_mode left_join_is_null(main_path, sub_join) end end end @from, main_where = jc_path_to_tables_and_conditions(main_path) @from << joins @where.push('(' + main_where + ')') unless main_where.empty? @where.push('(' + @global_filter + ')') unless @global_filter.empty? @where = @where.join("\nAND ") end # Produce a subgraph path through join conditions linking all aliases with # given _bind_mode_ that form a same-color connected component of the join # conditions graph and weren't processed yet: # # path = [ [start, []], [ next, [ jc, ... ] ], ... ] # # Update _seen_ hash for all aliases included in the produced path. # def jc_subgraph_path(bind_mode, seen = {}) start = find_alias(bind_mode, seen) return nil if start.nil? new = {} new[start] = true path = [ [start, []] ] colors = @nodes[ @aliases[start][:node] ][:colors].keys loop do # while we can find more connecting joins of the same color join_alias = nil @jc.each do |jc| # use cases: # - seen is empty (composing the must-bind join) # - seen is not empty (composing a subquery) next if (colors & @nodes[ jc[2] ][:colors].keys).empty? 0.upto(1) do |i| a_seen = jc[i].alias a_next = jc[1-i].alias if not new[a_next] and ( ((new[a_seen] or seen[a_seen]) and (@aliases[a_next][:bind_mode] == bind_mode) # connect an untouched node of matching bind mode ) or ( new[a_seen] and seen[a_next] and # connect subquery to the rest of the query... @aliases[a_seen][:bind_mode] == bind_mode # ...but only go one step deep )) join_alias = a_next break end end break if join_alias end break if join_alias.nil? # join it to all seen aliases join_on = @jc.find_all do |jc| a1, a2 = jc[0, 2].collect {|b| b.alias } (new[a1] and a2 == join_alias) or (new[a2] and a1 == join_alias) end new[join_alias] = true path.push([join_alias, join_on]) end seen.merge!(new) [ path, new ] end def find_alias(bind_mode, seen = {}) @aliases.keys.sort.each do |a| next if seen[a] or @aliases[a][:bind_mode] != bind_mode return a end nil end # Ground all must-bind blank nodes that weren't ground elsewhere to an # existential quantifier. # def ground_dangling_blank_nodes(main_path) conditions = [] ground_nodes = @global_filter.scan(SquishQuery::BN_SCAN) @nodes.keys.sort.each do |node| n = @nodes[node] next if (n[:ground] or ground_nodes.include?(node)) expression = case n[:bind_mode] when :must_bind 'IS NOT NULL' when :must_not_bind 'IS NULL' else next end @bindings[node].each do |binding| if main_path.assoc(binding.alias) conditions.push SqlExpression.new(binding, expression) break end end end conditions end # Join a subquery to the main query: for each alias shared between the two, # link 'id' field of the corresponding table within and outside the subquery. # If no node is bound to the 'id' field, create a virtual node bound to it, # so that it can be rebound by rebind_subquery(). # def subquery_jc_path(sub_join, join_alias) sub_join.empty? and raise ProgrammingError, "Unexpected empty subquery, check your RDF storage configuration" # fixme: reify instead of raising an exception sub_join.transpose[0].uniq.collect do |a| binding = SqlNodeBinding.new(a, 'id') exists = false @nodes.each do |node, n| if @bindings[node].include?(binding) exists = true break end end unless exists node = '?' 
+ join_alias + '_' + a @nodes[node] = { :ground => true } @bindings[node] = [ binding ] end [ a, [[ binding, binding ]] ] end end # Generate a hash that maps all bindings that's been wrapped inside the # _sub_query_ (a jc path, see jc_subquery_path()) to rebound bindings based # on the _join_alias_ so that they may still be used in the main query. # def rebind_subquery(sub_path, join_alias) rebind = {} field_count = 'a' wrapped = {} sub_path.each {|a,| wrapped[a] = true } @nodes.keys.sort.each do |node| @bindings[node].each do |b| if wrapped[b.alias] and rebind[b].nil? field = '_field_' << field_count field_count.next! rebind[b] = SqlNodeBinding.new(join_alias, field) end end end rebind end # Go through global filter, filters in the main query, and join conditions # attaching the subquery to the main query, rebind the bindings for nodes # wrapped inside the subquery, and return a hash with keys for all bindings # that should be selected from the subquery. # def subquery_select_nodes(rebind, main_path, sub_join) select_nodes = {} # update the global filter @nodes.keys.sort.each do |node| if r = rebind[ @bindings[node].first ] @global_filter.gsub!(/#{Regexp.escape(node)}\b/) do select_nodes[ @bindings[node].first ] = true r.to_s end end end # update filters in the main query main_path.each do |a,| next if sub_join.assoc(a) @aliases[a][:filter].each do |f| f.rebind!(rebind) do |b| select_nodes[b] = true end end end # update the subquery join path sub_join.each do |a, jcs| jcs.each do |jc| select_nodes[ jc[0] ] = true jc[1] = rebind[ jc[1] ] end end # fixme: update main SELECT list select_nodes end # Transform jc path (see jc_subgraph_path()) into a list of join conditions. # If _rebind_ and _select_nodes_ hashes are defined, conditions will be # rebound accordingly, and _select_nodes_ will be updated to include bindings # used in the conditions. # def jc_path_to_join_conditions(jc_path, rebind = nil, select_nodes = nil) conditions = [] jc_path.each do |a, jcs| jcs.each do |b1, b2, n| conditions.push SqlExpression.new(b1, '=', b2) end end conditions.empty? and raise RuntimeError, "Failed to join subquery to the main query" conditions end # Generate FROM and WHERE clauses from a jc path (see jc_subgraph_path()). # def jc_path_to_tables_and_conditions(path) first, = path[0] a = @aliases[first] tables = a[:table] + ' AS ' + first conditions = a[:filter] path[1, path.size - 1].each do |join_alias, join_on| a = @aliases[join_alias] tables << %{\nINNER JOIN #{a[:table]} AS #{join_alias} ON } << ( join_on.collect {|b1, b2| SqlExpression.new(b1, '=', b2) } + a[:filter] ).uniq.join(' AND ') end [ tables, conditions.uniq.join("\nAND ") ] end # Find and declare as NULL key fields of a must-not-bind subquery. # def left_join_is_null(main_path, sub_join) sub_join.each do |a, jcs| jcs.each do |jc| 0.upto(1) do |i| if main_path.assoc(jc[i].alias).nil? @where.push SqlExpression.new(jc[i], 'IS NULL') break end end end end end end end ruby-graffiti-2.2/lib/graffiti/squish.rb000066400000000000000000000367701176467530700203660ustar00rootroot00000000000000# Graffiti RDF Store # (originally written for Samizdat project) # # Copyright (c) 2002-2011 Dmitry Borodaenko # # This program is free software. # You can distribute/modify this program under the terms of # the GNU General Public License version 3 or later. 
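#
# Example Squish query (adapted from test/ts_graffiti.rb; namespace
# prefixes like dc:: are expanded according to the USING clause):
#
#   SELECT ?msg, ?title
#   WHERE (dc::title ?msg ?title)
#   USING PRESET NS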
# # see doc/rdf-storage.txt for introduction and Graffiti Squish definition; # see doc/storage-impl.txt for explanation of implemented algorithms # # vim: et sw=2 sts=2 ts=8 tw=0 require 'graffiti/exceptions' require 'graffiti/sql_mapper' module Graffiti # parse Squish query and translate triples to relational conditions # # provides access to internal representation of the parsed query and utility # functions to deal with Squish syntax # class SquishQuery include Debug # regexp for internal resource reference INTERNAL = Regexp.new(/\A([[:digit:]]+)\z/).freeze # regexp for blank node mark and name BN = Regexp.new(/\A\?([[:alnum:]_]+)\z/).freeze # regexp for scanning blank nodes inside a string BN_SCAN = Regexp.new(/\?[[:alnum:]_]+?\b/).freeze # regexp for parametrized value PARAMETER = Regexp.new(/\A:([[:alnum:]_]+)\z/).freeze # regexp for replaced string literal LITERAL = Regexp.new(/\A'(\d+)'\z/).freeze # regexp for scanning replaced string literals in a string LITERAL_SCAN = Regexp.new(/'(\d+)'/).freeze # regexp for scanning query parameters inside a string PARAMETER_AND_LITERAL_SCAN = Regexp.new(/\B:([[:alnum:]_]+)|'(\d+)'/).freeze # regexp for number NUMBER = Regexp.new(/\A-?[[:digit:]]+(\.[[:digit:]]+)?\z/).freeze # regexp for operator OPERATOR = Regexp.new(/\A(\+|-|\*|\/|\|\||<|<=|>|>=|=|!=|@@|to_tsvector|to_tsquery|I?LIKE|NOT|AND|OR|IN|IS|NULL)\z/i).freeze # regexp for aggregate function AGGREGATE = Regexp.new(/\A(avg|count|max|min|sum)\z/i).freeze QUERY = Regexp.new(/\A\s*(SELECT|INSERT|UPDATE)\b\s*(.*?)\s* \bWHERE\b\s*(.*?)\s* (?:\bEXCEPT\b\s*(.*?))?\s* (?:\bOPTIONAL\b\s*(.*?))?\s* (?:\bLITERAL\b\s*(.*?))?\s* (?:\bGROUP\s+BY\b\s*(.*?))?\s* (?:\bORDER\s+BY\b\s*(.*?)\s*(ASC|DESC)?)?\s* (?:\bUSING\b\s*(.*?))?\s*\z/mix).freeze # extract common Squish query sections, perform namespace substitution, # generate query pattern graph, call transform_pattern, # determine query type and parse nodes section accordingly # def initialize(config, query) query.nil? and raise ProgrammingError, "SquishQuery: query can't be nil" if query.kind_of? Hash # pre-parsed query (used by SquishAssert) @nodes = query[:nodes] @pattern = query[:pattern] @negative = query[:negative] @optional = query[:optional] @strings = query[:strings] @literal = @group = @order = '' @sql_mapper = SqlMapper.new(config, @pattern) return self elsif not query.kind_of? String raise ProgrammingError, "Bad query initialization parameter class: #{query.class}" end debug { 'SquishQuery ' + query } @query = query # keep original string query = query.dup # replace string literals with 'n' placeholders (also see #substitute_literals) @strings = [] query.gsub!(/'((?:''|[^'])*)'/m) do @strings.push $1.gsub("''", "'") # keep unescaped string "'" + (@strings.size - 1).to_s + "'" end match = QUERY.match(query) or raise ProgrammingError, "Malformed query: are keywords SELECT, INSERT, UPDATE or WHERE missing?" match, @key, @nodes, @pattern, @negative, @optional, @literal, @group, @order, @order_dir, @ns = match.to_a.collect {|m| m.to_s } match = nil @key.upcase! @order_dir.upcase! # namespaces # todo: validate ns @ns = (@ns.empty? or /\APRESET\s+NS\z/ =~ @ns) ? 
config.ns : Hash[*@ns.gsub(/\b(FOR|AS|AND)\b/i, '').scan(/\S+/)] @pattern = parse_pattern(@pattern) @optional = parse_pattern(@optional) @negative = parse_pattern(@negative) # validate SQL expressions validate_expression(@literal) @group.split(/\s*,\s*/).each {|group| validate_expression(group) } validate_expression(@order) @sql_mapper = SqlMapper.new( config, @pattern, @negative, @optional, @literal) # check that all variables can be bound @variables = query.scan(BN_SCAN) @variables.each {|node| @sql_mapper.bind(node) } return self end # blank variables control section attr_reader :nodes # query pattern graph as array of triples [ [p, s, o], ... ] attr_reader :pattern # literal SQL expression attr_reader :literal # SQL GROUP BY expression attr_reader :group # SQL order expression attr_reader :order # direction of order, ASC or DESC attr_reader :order_dir # query namespaces mapping attr_reader :ns # list of variables defined in the query attr_reader :variables # returns original string passed in for parsing # def to_s @query end # replace 'n' substitutions with query string literals (see #new, #LITERAL) # def substitute_literals(s) return s unless s.kind_of? String s.gsub(LITERAL_SCAN) do get_literal_value($1.to_i) end end # replace schema uri with namespace prefix # def SquishQuery.uri_shrink!(uriref, prefix, uri) uriref.gsub!(/\A#{uri}([^\/#]+)\z/) {"#{prefix}::#{$1}"} end # replace schema uri with a prefix from a supplied namespaces hash # def SquishQuery.ns_shrink(uriref, namespaces) u = uriref.dup or return nil namespaces.each {|p, uri| SquishQuery.uri_shrink!(u, p, uri) and break } return u end # replace schema uri with a prefix from query namespaces # def ns_shrink(uriref) SquishQuery.ns_shrink(uriref, @ns) end # validate expression # # expression := value [ operator expression ] # # value := blank_node | literal_string | number | '(' expression ')' # # whitespace between tokens (except inside parentheses) is mandatory # def validate_expression(string) # todo: lexical analyser string.split(/[\s(),]+/).each do |token| case token when '', BN, PARAMETER, LITERAL, NUMBER, OPERATOR, AGGREGATE else raise ProgrammingError, "Bad token '#{token}' in expression" end end string end private PATTERN_SCAN = Regexp.new(/\A\((\S+)\s+(\S+)\s+(.*?)(?:\s+FILTER\b\s*(.*?)\s*)?(?:\s+(TRANSITIVE)\s*)?\)\z/).freeze # parse query pattern graph out of a string, expand URI namespaces # def parse_pattern(pattern) pattern.scan(/\(.*?\)(?=\s*(?:\(|\z))/).collect do |c| match, predicate, subject, object, filter, transitive = c.match(PATTERN_SCAN).to_a match = nil [predicate, subject, object].each do |u| u.sub!(/\A(\S+?)::/) do @ns[$1] or raise ProgrammingError, "Undefined namespace prefix #{$1}" end end validate_expression(filter.to_s) [predicate, subject, object, filter, 'TRANSITIVE' == transitive] end end # replace RDF query parameters with their values # def expression_value(expr, params={}) case expr when 'NULL' nil when PARAMETER get_parameter_value($1, params) when LITERAL @strings[$1.to_i] else expr.gsub(PARAMETER_AND_LITERAL_SCAN) do if $1 # parameter get_parameter_value($1, params) else # literal get_literal_value($2.to_i) end end # fixme: make Sequel treat it as SQL expression, not a string value end end def get_parameter_value(name, params) key = name.to_sym params.has_key?(key) or raise ProgrammingError, 'Unknown parameter :' + name params[key] end def get_literal_value(i) "'" + @strings[i].gsub("'", "''") + "'" end end class SquishSelect < SquishQuery def initialize(config, query) super(config, 
query)

    if @key   # initialized from a String, not a Hash
      'SELECT' == @key or raise ProgrammingError,
        'Wrong query type: SELECT expected instead of ' + @key
      @nodes = @nodes.split(/\s*,\s*/).map {|node| validate_expression(node) }
    end
  end

  # translate Squish SELECT query to SQL
  #
  def to_sql
    where = @sql_mapper.where

    select = @nodes.dup
    select.push(@order) unless @order.empty? or @nodes.include?(@order)

    # now put it all together
    sql = %{\nFROM #{@sql_mapper.from}}
    sql << %{\nWHERE #{where}} unless where.empty?
    sql << %{\nGROUP BY #{@group}} unless @group.empty?
    sql << %{\nORDER BY #{@order} #{@order_dir}} unless @order.empty?

    select = select.map do |expr|
      bind_blank_nodes(expr) + (BN.match(expr) ? (' AS ' + $1) : '')
    end
    sql = 'SELECT DISTINCT ' << select.join(', ') << bind_blank_nodes(sql)

    sql =~ /\?/ and raise ProgrammingError,
      "Unexpected '?' in translated query (probably caused by unmapped blank node): #{sql.gsub(/\s+/, ' ')};"

    substitute_literals(sql)
  end

  private

  # replace blank node names with bindings
  #
  def bind_blank_nodes(sql)
    sql.gsub(BN_SCAN) {|node| @sql_mapper.bind(node) }
  end
end

class SquishAssert < SquishQuery
  def initialize(config, query)
    @config = config
    super(@config, query)

    if 'UPDATE' == @key
      @insert = ''
      @update = @nodes
    elsif 'INSERT' == @key and @nodes =~ /\A\s*(.*?)\s*(?:\bUPDATE\b\s*(.*?))?\s*\z/
      @insert, @update = $1, $2.to_s
    else
      raise ProgrammingError,
        "Wrong query type: INSERT or UPDATE expected instead of " + @key
    end

    @insert = @insert.split(/\s*,\s*/).each {|s|
      s =~ BN or raise ProgrammingError,
        "Blank node expected in INSERT section instead of '#{s}'"
    }
    @update = @update.empty? ? {} : Hash[*@update.split(/\s*,\s*/).collect {|s|
      s.split(/\s*=\s*/)
    }.each {|node, value|
      node =~ BN or raise ProgrammingError,
        "Blank node expected on the left side of UPDATE assignment instead of '#{node}'"
      validate_expression(value)
    }.flatten!]
  end

  def run(db, params={})
    values = resource_values(db, params)

    statements = []
    alias_positions.each do |alias_, clauses|
      statement = SquishAssertStatement.new(clauses, values)
      statements.push(statement) if statement.action
    end
    SquishAssertStatement.run_ordered_statements(db, statements)

    return @insert.collect {|node| values[node].value }
  end

  attr_reader :insert, :update

  private

  def resource_values(db, params)
    values = {}
    @sql_mapper.nodes.each do |node, n|
      new = false

      if node =~ INTERNAL   # internal resource
        value = $1.to_i   # resource id

      elsif node =~ PARAMETER
        value = get_parameter_value($1, params)

      elsif node =~ LITERAL
        value = @strings[$1.to_i]

      elsif node =~ BN
        subject_position = n[:positions].select {|p| :subject == p[:role] }.first

        if subject_position.nil?
# blank node occuring only in object position value = @update[node] or raise ProgrammingError, %{Blank node #{node} is undefined (drop it or set its value in UPDATE section)} value = expression_value(value, params) else # resource blank node unless @insert.include?(node) s = SquishSelect.new( @config, { :nodes => [node], :pattern => subgraph(node), :strings => @strings } ) debug { 'resource_values ' + db[s.to_sql, params].select_sql } found = db.fetch(s.to_sql, params).first end if found value = found.values.first else table = @sql_mapper.clauses[ subject_position[:clause] ][:map].table value = db[:resource].insert(:label => table) debug { 'resource_values ' + db[:resource].insert_sql(:label => table) } new = true unless 'resource' == table end end else # external resource uriref = { :uriref => true, :label => node } found = db[:resource].filter(uriref).first if found value = found[:id] else value = db[:resource].insert(uriref) debug { 'resource_values ' + db[:resource].insert_sql(uriref) } end end debug { 'resource_values ' + node + ' = ' + value.inspect } v = SquishAssertValue.new(value, new, @update.has_key?(node)) values[node] = v end debug { 'resource_values ' + 'resource_values ' + values.inspect } values end def alias_positions a = {} @sql_mapper.clauses.each_with_index do |clause, i| a[ clause[:alias] ] ||= [] a[ clause[:alias] ].push(clause) end a end # calculate subgraph of query pattern that is reachable from _node_ # # fixme: make it work with optional sub-patterns # def subgraph(node) subgraph = [node] w = [] begin stop = true @pattern.each do |triple| if subgraph.include? triple[1] and not w.include? triple subgraph.push triple[2] w.push triple stop = false end end end until stop return w end end class SquishAssertValue def initialize(value, new, updated) @value = value @new = new @updated = updated end attr_reader :value # true if node was inserted into resource during value generation and a # corresponding record should be inserted into an internal resource table # later # def new? @new end # true if the node value is set in the UPDATE section of the Squish statement # def updated? @updated end end class SquishAssertStatement include Debug def initialize(clauses, values) @key_node = clauses.first[:subject][:node] @table = clauses.first[:map].table.to_sym key = values[@key_node] @params = {} @references = [] clauses.each do |clause| node = clause[:object][:node] v = values[node] if key.new? or v.updated? field = clause[:object][:field] @params[field.to_sym] = v.value # when subproperty value is updated, update the qualifier as well map = clause[:map] if map.subproperty_of @params[ RdfPropertyMap.qualifier_field(field).to_sym ] = values[map.property].value elsif map.superproperty? @params[ RdfPropertyMap.qualifier_field(field).to_sym ] = nil end @references.push(node) if v.new? end end if key.new? and @table != :resource # when id is inserted, insert_resource() trigger does nothing @action = :insert @params[:id] = key.value elsif not @params.empty? @action = :update @filter = {:id => key.value} end debug { 'SquishAssertStatement ' + self.inspect } end attr_reader :key_node, :references, :action def run(db) if @action ds = db[@table] ds = ds.filter(@filter) if @filter debug { :insert == @action ? 
ds.insert_sql(@params) : ds.update_sql(@params) } ds.send(@action, @params) end end # make sure mutually referencing records are inserted in the right order # def SquishAssertStatement.run_ordered_statements(db, statements) statements = statements.sort_by {|s| s.references.size } inserted = [] progress = true until statements.empty? or not progress progress = false 0.upto(statements.size - 1) do |i| s = statements[i] if (s.references - inserted).empty? s.run(db) inserted.push(s.key_node) statements.delete_at(i) progress = true break end end end statements.empty? or raise ProgrammingError, "Failed to resolve mutual references of inserted resources: " + statements.collect {|s| s.key_node + ' -- ' + s.references.join(', ') }.join('; ') end end end ruby-graffiti-2.2/lib/graffiti/store.rb000066400000000000000000000046561176467530700202040ustar00rootroot00000000000000# Graffiti RDF Store # (originally written for Samizdat project) # # Copyright (c) 2002-2011 Dmitry Borodaenko # # This program is free software. # You can distribute/modify this program under the terms of # the GNU General Public License version 3 or later. # # see doc/rdf-storage.txt for introduction and Graffiti Squish definition; # see doc/storage-impl.txt for explanation of implemented algorithms # # vim: et sw=2 sts=2 ts=8 tw=0 require 'syncache' require 'graffiti/exceptions' require 'graffiti/debug' require 'graffiti/rdf_config' require 'graffiti/squish' module Graffiti # API for the RDF storage access similar to DBI or Sequel # class Store # initialize class attributes # # _db_ is a Sequel database handle # # _config_ is a hash of configuraiton options for RdfConfig # def initialize(db, config) @db = db @config = RdfConfig.new(config) # cache parsed Squish SELECT queries @select_cache = SynCache::Cache.new(nil, 1000) end # storage configuration in an RdfConfig object # attr_reader :config # replace schema uri with a prefix from the configured namespaces # def ns_shrink(uriref) SquishQuery.ns_shrink(uriref, @config.ns) end # get value of subject's property # def get_property(subject, property) fetch(%{SELECT ?object WHERE (#{property} :subject ?object)}, :subject => subject).get(:object) end def fetch(query, params={}) @db.fetch(select(query), params) end # get one query answer (similar to DBI#select_one) # def select_one(query, params={}) fetch(query, params).first end # get all query answers (similar to DBI#select_all) # def select_all(query, limit=nil, offset=nil, params={}, &p) ds = fetch(query, params).limit(limit, offset) if block_given? ds.all(&p) else ds.all end end # accepts String or pre-parsed SquishQuery object, caches SQL by String # def select(query) query.kind_of?(String) and query = @select_cache.fetch_or_add(query) { SquishSelect.new(@config, query) } query.kind_of?(SquishSelect) or raise ProgrammingError, "String or SquishSelect expected" query.to_sql end # merge Squish query into RDF database # # returns list of new ids assigned to blank nodes listed in INSERT section # def assert(query, params={}) @db.transaction do SquishAssert.new(@config, query).run(@db, params) end end end end ruby-graffiti-2.2/setup.rb000066400000000000000000000711041176467530700156370ustar00rootroot00000000000000# # setup.rb # # Copyright (c) 2000-2004 Minero Aoki # # This program is free software. # You can distribute/modify this program under the terms of # the GNU LGPL, Lesser General Public License version 2.1. 
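#
# Typical invocation, as print_usage below documents:
#
#   $ ruby setup.rb config
#   $ ruby setup.rb setup
#   # ruby setup.rb install   (may require root privilege)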
# unless Enumerable.method_defined?(:map) # Ruby 1.4.6 module Enumerable alias map collect end end unless File.respond_to?(:read) # Ruby 1.6 def File.read(fname) open(fname) {|f| return f.read } end end def File.binread(fname) open(fname, 'rb') {|f| return f.read } end # for corrupted windows stat(2) def File.dir?(path) File.directory?((path[-1,1] == '/') ? path : path + '/') end class SetupError < StandardError; end def setup_rb_error(msg) raise SetupError, msg end # # Config # if arg = ARGV.detect {|arg| /\A--rbconfig=/ =~ arg } ARGV.delete(arg) require arg.split(/=/, 2)[1] $".push 'rbconfig.rb' else require 'rbconfig' end def multipackage_install? FileTest.directory?(File.dirname($0) + '/packages') end class ConfigItem def initialize(name, template, default, desc) @name = name.freeze @template = template @value = default @default = default.dup.freeze @description = desc end attr_reader :name attr_reader :description attr_accessor :default alias help_default default def help_opt "--#{@name}=#{@template}" end def value @value end def eval(table) @value.gsub(%r<\$([^/]+)>) { table[$1] } end def set(val) @value = check(val) end private def check(val) setup_rb_error "config: --#{name} requires argument" unless val val end end class BoolItem < ConfigItem def config_type 'bool' end def help_opt "--#{@name}" end private def check(val) return 'yes' unless val unless /\A(y(es)?|n(o)?|t(rue)?|f(alse))\z/i =~ val setup_rb_error "config: --#{@name} accepts only yes/no for argument" end (/\Ay(es)?|\At(rue)/i =~ value) ? 'yes' : 'no' end end class PathItem < ConfigItem def config_type 'path' end private def check(path) setup_rb_error "config: --#{@name} requires argument" unless path path[0,1] == '$' ? path : File.expand_path(path) end end class ProgramItem < ConfigItem def config_type 'program' end end class SelectItem < ConfigItem def initialize(name, template, default, desc) super @ok = template.split('/') end def config_type 'select' end private def check(val) unless @ok.include?(val.strip) setup_rb_error "config: use --#{@name}=#{@template} (#{val})" end val.strip end end class PackageSelectionItem < ConfigItem def initialize(name, template, default, help_default, desc) super name, template, default, desc @help_default = help_default end attr_reader :help_default def config_type 'package' end private def check(val) unless File.dir?("packages/#{val}") setup_rb_error "config: no such package: #{val}" end val end end class ConfigTable_class def initialize(items) @items = items @table = {} items.each do |i| @table[i.name] = i end ALIASES.each do |ali, name| @table[ali] = @table[name] end end include Enumerable def each(&block) @items.each(&block) end def key?(name) @table.key?(name) end def lookup(name) @table[name] or raise ArgumentError, "no such config item: #{name}" end def add(item) @items.push item @table[item.name] = item end def remove(name) item = lookup(name) @items.delete_if {|i| i.name == name } @table.delete_if {|name, i| i.name == name } item end def new dup() end def savefile '.config' end def load begin t = dup() File.foreach(savefile()) do |line| k, v = *line.split(/=/, 2) t[k] = v.strip end t rescue Errno::ENOENT setup_rb_error $!.message + "#{File.basename($0)} config first" end end def save @items.each {|i| i.value } File.open(savefile(), 'w') {|f| @items.each do |i| f.printf "%s=%s\n", i.name, i.value if i.value end } end def [](key) lookup(key).eval(self) end def []=(key, val) lookup(key).set val end end c = ::Config::CONFIG rubypath = c['bindir'] + '/' + c['ruby_install_name'] 
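# illustrative values (actual ones come from rbconfig): with
# c['bindir'] == '/usr/bin' and c['ruby_install_name'] == 'ruby',
# rubypath becomes '/usr/bin/ruby'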
major = c['MAJOR'].to_i minor = c['MINOR'].to_i teeny = c['TEENY'].to_i version = "#{major}.#{minor}" # ruby ver. >= 1.4.4? newpath_p = ((major >= 2) or ((major == 1) and ((minor >= 5) or ((minor == 4) and (teeny >= 4))))) if c['rubylibdir'] # V < 1.6.3 _stdruby = c['rubylibdir'] _siteruby = c['sitedir'] _siterubyver = c['sitelibdir'] _siterubyverarch = c['sitearchdir'] elsif newpath_p # 1.4.4 <= V <= 1.6.3 _stdruby = "$prefix/lib/ruby/#{version}" _siteruby = c['sitedir'] _siterubyver = "$siteruby/#{version}" _siterubyverarch = "$siterubyver/#{c['arch']}" else # V < 1.4.4 _stdruby = "$prefix/lib/ruby/#{version}" _siteruby = "$prefix/lib/ruby/#{version}/site_ruby" _siterubyver = _siteruby _siterubyverarch = "$siterubyver/#{c['arch']}" end libdir = '-* dummy libdir *-' stdruby = '-* dummy rubylibdir *-' siteruby = '-* dummy site_ruby *-' siterubyver = '-* dummy site_ruby version *-' parameterize = lambda {|path| path.sub(/\A#{Regexp.quote(c['prefix'])}/, '$prefix')\ .sub(/\A#{Regexp.quote(libdir)}/, '$libdir')\ .sub(/\A#{Regexp.quote(stdruby)}/, '$stdruby')\ .sub(/\A#{Regexp.quote(siteruby)}/, '$siteruby')\ .sub(/\A#{Regexp.quote(siterubyver)}/, '$siterubyver') } libdir = parameterize.call(c['libdir']) stdruby = parameterize.call(_stdruby) siteruby = parameterize.call(_siteruby) siterubyver = parameterize.call(_siterubyver) siterubyverarch = parameterize.call(_siterubyverarch) if arg = c['configure_args'].split.detect {|arg| /--with-make-prog=/ =~ arg } makeprog = arg.sub(/'/, '').split(/=/, 2)[1] else makeprog = 'make' end common_conf = [ PathItem.new('prefix', 'path', c['prefix'], 'path prefix of target environment'), PathItem.new('bindir', 'path', parameterize.call(c['bindir']), 'the directory for commands'), PathItem.new('libdir', 'path', libdir, 'the directory for libraries'), PathItem.new('datadir', 'path', parameterize.call(c['datadir']), 'the directory for shared data'), PathItem.new('mandir', 'path', parameterize.call(c['mandir']), 'the directory for man pages'), PathItem.new('sysconfdir', 'path', parameterize.call(c['sysconfdir']), 'the directory for man pages'), PathItem.new('stdruby', 'path', stdruby, 'the directory for standard ruby libraries'), PathItem.new('siteruby', 'path', siteruby, 'the directory for version-independent aux ruby libraries'), PathItem.new('siterubyver', 'path', siterubyver, 'the directory for aux ruby libraries'), PathItem.new('siterubyverarch', 'path', siterubyverarch, 'the directory for aux ruby binaries'), PathItem.new('rbdir', 'path', '$siterubyver', 'the directory for ruby scripts'), PathItem.new('sodir', 'path', '$siterubyverarch', 'the directory for ruby extentions'), PathItem.new('rubypath', 'path', rubypath, 'the path to set to #! line'), ProgramItem.new('rubyprog', 'name', rubypath, 'the ruby program using for installation'), ProgramItem.new('makeprog', 'name', makeprog, 'the make program to compile ruby extentions'), SelectItem.new('shebang', 'all/ruby/never', 'ruby', 'shebang line (#!) 
editing mode'), BoolItem.new('without-ext', 'yes/no', 'no', 'does not compile/install ruby extentions') ] class ConfigTable_class # open again ALIASES = { 'std-ruby' => 'stdruby', 'site-ruby-common' => 'siteruby', # For backward compatibility 'site-ruby' => 'siterubyver', # For backward compatibility 'bin-dir' => 'bindir', 'bin-dir' => 'bindir', 'rb-dir' => 'rbdir', 'so-dir' => 'sodir', 'data-dir' => 'datadir', 'ruby-path' => 'rubypath', 'ruby-prog' => 'rubyprog', 'ruby' => 'rubyprog', 'make-prog' => 'makeprog', 'make' => 'makeprog' } end multipackage_conf = [ PackageSelectionItem.new('with', 'name,name...', '', 'ALL', 'package names that you want to install'), PackageSelectionItem.new('without', 'name,name...', '', 'NONE', 'package names that you do not want to install') ] if multipackage_install? ConfigTable = ConfigTable_class.new(common_conf + multipackage_conf) else ConfigTable = ConfigTable_class.new(common_conf) end module MetaConfigAPI def eval_file_ifexist(fname) instance_eval File.read(fname), fname, 1 if File.file?(fname) end def config_names ConfigTable.map {|i| i.name } end def config?(name) ConfigTable.key?(name) end def bool_config?(name) ConfigTable.lookup(name).config_type == 'bool' end def path_config?(name) ConfigTable.lookup(name).config_type == 'path' end def value_config?(name) case ConfigTable.lookup(name).config_type when 'bool', 'path' true else false end end def add_config(item) ConfigTable.add item end def add_bool_config(name, default, desc) ConfigTable.add BoolItem.new(name, 'yes/no', default ? 'yes' : 'no', desc) end def add_path_config(name, default, desc) ConfigTable.add PathItem.new(name, 'path', default, desc) end def set_config_default(name, default) ConfigTable.lookup(name).default = default end def remove_config(name) ConfigTable.remove(name) end end # # File Operations # module FileOperations def mkdir_p(dirname, prefix = nil) dirname = prefix + File.expand_path(dirname) if prefix $stderr.puts "mkdir -p #{dirname}" if verbose? return if no_harm? # does not check '/'... it's too abnormal case dirs = File.expand_path(dirname).split(%r<(?=/)>) if /\A[a-z]:\z/i =~ dirs[0] disk = dirs.shift dirs[0] = disk + dirs[0] end dirs.each_index do |idx| path = dirs[0..idx].join('') Dir.mkdir path unless File.dir?(path) end end def rm_f(fname) $stderr.puts "rm -f #{fname}" if verbose? return if no_harm? if File.exist?(fname) or File.symlink?(fname) File.chmod 0777, fname File.unlink fname end end def rm_rf(dn) $stderr.puts "rm -rf #{dn}" if verbose? return if no_harm? Dir.chdir dn Dir.foreach('.') do |fn| next if fn == '.' next if fn == '..' if File.dir?(fn) verbose_off { rm_rf fn } else verbose_off { rm_f fn } end end Dir.chdir '..' Dir.rmdir dn end def move_file(src, dest) File.unlink dest if File.exist?(dest) begin File.rename src, dest rescue File.open(dest, 'wb') {|f| f.write File.binread(src) } File.chmod File.stat(src).mode, dest File.unlink src end end def install(from, dest, mode, prefix = nil) $stderr.puts "install #{from} #{dest}" if verbose? return if no_harm? realdest = prefix ? 
prefix + File.expand_path(dest) : dest realdest = File.join(realdest, File.basename(from)) if File.dir?(realdest) str = File.binread(from) if diff?(str, realdest) verbose_off { rm_f realdest if File.exist?(realdest) } File.open(realdest, 'wb') {|f| f.write str } File.chmod mode, realdest File.open("#{objdir_root()}/InstalledFiles", 'a') {|f| if prefix f.puts realdest.sub(prefix, '') else f.puts realdest end } end end def diff?(new_content, path) return true unless File.exist?(path) new_content != File.binread(path) end def command(str) $stderr.puts str if verbose? system str or raise RuntimeError, "'system #{str}' failed" end def ruby(str) command config('rubyprog') + ' ' + str end def make(task = '') command config('makeprog') + ' ' + task end def extdir?(dir) File.exist?(dir + '/MANIFEST') end def all_files_in(dirname) Dir.open(dirname) {|d| return d.select {|ent| File.file?("#{dirname}/#{ent}") } } end REJECT_DIRS = %w( CVS SCCS RCS CVS.adm .svn ) def all_dirs_in(dirname) Dir.open(dirname) {|d| return d.select {|n| File.dir?("#{dirname}/#{n}") } - %w(. ..) - REJECT_DIRS } end end # # Main Installer # module HookUtils def run_hook(name) try_run_hook "#{curr_srcdir()}/#{name}" or try_run_hook "#{curr_srcdir()}/#{name}.rb" end def try_run_hook(fname) return false unless File.file?(fname) begin instance_eval File.read(fname), fname, 1 rescue setup_rb_error "hook #{fname} failed:\n" + $!.message end true end end module HookScriptAPI def get_config(key) @config[key] end alias config get_config def set_config(key, val) @config[key] = val end # # srcdir/objdir (works only in the package directory) # #abstract srcdir_root #abstract objdir_root #abstract relpath def curr_srcdir "#{srcdir_root()}/#{relpath()}" end def curr_objdir "#{objdir_root()}/#{relpath()}" end def srcfile(path) "#{curr_srcdir()}/#{path}" end def srcexist?(path) File.exist?(srcfile(path)) end def srcdirectory?(path) File.dir?(srcfile(path)) end def srcfile?(path) File.file? srcfile(path) end def srcentries(path = '.') Dir.open("#{curr_srcdir()}/#{path}") {|d| return d.to_a - %w(. ..) 
} end def srcfiles(path = '.') srcentries(path).select {|fname| File.file?(File.join(curr_srcdir(), path, fname)) } end def srcdirectories(path = '.') srcentries(path).select {|fname| File.dir?(File.join(curr_srcdir(), path, fname)) } end end class ToplevelInstaller Version = '3.3.1' Copyright = 'Copyright (c) 2000-2004 Minero Aoki' TASKS = [ [ 'all', 'do config, setup, then install' ], [ 'config', 'saves your configurations' ], [ 'show', 'shows current configuration' ], [ 'setup', 'compiles ruby extentions and others' ], [ 'install', 'installs files' ], [ 'clean', "does `make clean' for each extention" ], [ 'distclean',"does `make distclean' for each extention" ] ] def ToplevelInstaller.invoke instance().invoke end @singleton = nil def ToplevelInstaller.instance @singleton ||= new(File.dirname($0)) @singleton end include MetaConfigAPI def initialize(ardir_root) @config = nil @options = { 'verbose' => true } @ardir = File.expand_path(ardir_root) end def inspect "#<#{self.class} #{__id__()}>" end def invoke run_metaconfigs case task = parsearg_global() when nil, 'all' @config = load_config('config') parsearg_config init_installers exec_config exec_setup exec_install else @config = load_config(task) __send__ "parsearg_#{task}" init_installers __send__ "exec_#{task}" end end def run_metaconfigs eval_file_ifexist "#{@ardir}/metaconfig" end def load_config(task) case task when 'config' ConfigTable.new when 'clean', 'distclean' if File.exist?(ConfigTable.savefile) then ConfigTable.load else ConfigTable.new end else ConfigTable.load end end def init_installers @installer = Installer.new(@config, @options, @ardir, File.expand_path('.')) end # # Hook Script API bases # def srcdir_root @ardir end def objdir_root '.' end def relpath '.' end # # Option Parsing # def parsearg_global valid_task = /\A(?:#{TASKS.map {|task,desc| task }.join '|'})\z/ while arg = ARGV.shift case arg when /\A\w+\z/ setup_rb_error "invalid task: #{arg}" unless valid_task =~ arg return arg when '-q', '--quiet' @options['verbose'] = false when '--verbose' @options['verbose'] = true when '-h', '--help' print_usage $stdout exit 0 when '-v', '--version' puts "#{File.basename($0)} version #{Version}" exit 0 when '--copyright' puts Copyright exit 0 else setup_rb_error "unknown global option '#{arg}'" end end nil end def parsearg_no_options unless ARGV.empty? 
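      # note: no 'task' variable or method is defined in this scope, so
      # this error path raises NameError instead of reporting the unknown
      # options (long-standing quirk of setup.rb 3.3.1)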
setup_rb_error "#{task}: unknown options: #{ARGV.join ' '}" end end alias parsearg_show parsearg_no_options alias parsearg_setup parsearg_no_options alias parsearg_clean parsearg_no_options alias parsearg_distclean parsearg_no_options def parsearg_config re = /\A--(#{ConfigTable.map {|i| i.name }.join('|')})(?:=(.*))?\z/ @options['config-opt'] = [] while i = ARGV.shift if /\A--?\z/ =~ i @options['config-opt'] = ARGV.dup break end m = re.match(i) or setup_rb_error "config: unknown option #{i}" name, value = *m.to_a[1,2] @config[name] = value end end def parsearg_install @options['no-harm'] = false @options['install-prefix'] = '' while a = ARGV.shift case a when /\A--no-harm\z/ @options['no-harm'] = true when /\A--prefix=(.*)\z/ path = $1 path = File.expand_path(path) unless path[0,1] == '/' @options['install-prefix'] = path else setup_rb_error "install: unknown option #{a}" end end end def print_usage(out) out.puts 'Typical Installation Procedure:' out.puts " $ ruby #{File.basename $0} config" out.puts " $ ruby #{File.basename $0} setup" out.puts " # ruby #{File.basename $0} install (may require root privilege)" out.puts out.puts 'Detailed Usage:' out.puts " ruby #{File.basename $0} " out.puts " ruby #{File.basename $0} [] []" fmt = " %-24s %s\n" out.puts out.puts 'Global options:' out.printf fmt, '-q,--quiet', 'suppress message outputs' out.printf fmt, ' --verbose', 'output messages verbosely' out.printf fmt, '-h,--help', 'print this message' out.printf fmt, '-v,--version', 'print version and quit' out.printf fmt, ' --copyright', 'print copyright and quit' out.puts out.puts 'Tasks:' TASKS.each do |name, desc| out.printf fmt, name, desc end fmt = " %-24s %s [%s]\n" out.puts out.puts 'Options for CONFIG or ALL:' ConfigTable.each do |item| out.printf fmt, item.help_opt, item.description, item.help_default end out.printf fmt, '--rbconfig=path', 'rbconfig.rb to load',"running ruby's" out.puts out.puts 'Options for INSTALL:' out.printf fmt, '--no-harm', 'only display what to do if given', 'off' out.printf fmt, '--prefix=path', 'install path prefix', '$prefix' out.puts end # # Task Handlers # def exec_config @installer.exec_config @config.save # must be final end def exec_setup @installer.exec_setup end def exec_install @installer.exec_install end def exec_show ConfigTable.each do |i| printf "%-20s %s\n", i.name, i.value end end def exec_clean @installer.exec_clean end def exec_distclean @installer.exec_distclean end end class ToplevelInstallerMulti < ToplevelInstaller include HookUtils include HookScriptAPI include FileOperations def initialize(ardir) super @packages = all_dirs_in("#{@ardir}/packages") raise 'no package exists' if @packages.empty? end def run_metaconfigs eval_file_ifexist "#{@ardir}/metaconfig" @packages.each do |name| eval_file_ifexist "#{@ardir}/packages/#{name}/metaconfig" end end def init_installers @installers = {} @packages.each do |pack| @installers[pack] = Installer.new(@config, @options, "#{@ardir}/packages/#{pack}", "packages/#{pack}") end with = extract_selection(config('with')) without = extract_selection(config('without')) @selected = @installers.keys.select {|name| (with.empty? 
or with.include?(name)) \ and not without.include?(name) } end def extract_selection(list) a = list.split(/,/) a.each do |name| setup_rb_error "no such package: #{name}" unless @installers.key?(name) end a end def print_usage(f) super f.puts 'Inluded packages:' f.puts ' ' + @packages.sort.join(' ') f.puts end # # multi-package metaconfig API # attr_reader :packages def declare_packages(list) raise 'package list is empty' if list.empty? list.each do |name| raise "directory packages/#{name} does not exist"\ unless File.dir?("#{@ardir}/packages/#{name}") end @packages = list end # # Task Handlers # def exec_config run_hook 'pre-config' each_selected_installers {|inst| inst.exec_config } run_hook 'post-config' @config.save # must be final end def exec_setup run_hook 'pre-setup' each_selected_installers {|inst| inst.exec_setup } run_hook 'post-setup' end def exec_install run_hook 'pre-install' each_selected_installers {|inst| inst.exec_install } run_hook 'post-install' end def exec_clean rm_f ConfigTable.savefile run_hook 'pre-clean' each_selected_installers {|inst| inst.exec_clean } run_hook 'post-clean' end def exec_distclean rm_f ConfigTable.savefile run_hook 'pre-distclean' each_selected_installers {|inst| inst.exec_distclean } run_hook 'post-distclean' end # # lib # def each_selected_installers Dir.mkdir 'packages' unless File.dir?('packages') @selected.each do |pack| $stderr.puts "Processing the package `#{pack}' ..." if @options['verbose'] Dir.mkdir "packages/#{pack}" unless File.dir?("packages/#{pack}") Dir.chdir "packages/#{pack}" yield @installers[pack] Dir.chdir '../..' end end def verbose? @options['verbose'] end def no_harm? @options['no-harm'] end end class Installer FILETYPES = %w( bin lib ext data ) include HookScriptAPI include HookUtils include FileOperations def initialize(config, opt, srcroot, objroot) @config = config @options = opt @srcdir = File.expand_path(srcroot) @objdir = File.expand_path(objroot) @currdir = '.' end def inspect "#<#{self.class} #{File.basename(@srcdir)}>" end # # Hook Script API base methods # def srcdir_root @srcdir end def objdir_root @objdir end def relpath @currdir end # # configs/options # def no_harm? @options['no-harm'] end def verbose? @options['verbose'] end def verbose_off begin save, @options['verbose'] = @options['verbose'], false yield ensure @options['verbose'] = save end end # # TASK config # def exec_config exec_task_traverse 'config' end def config_dir_bin(rel) end def config_dir_lib(rel) end def config_dir_ext(rel) extconf if extdir?(curr_srcdir()) end def extconf opt = @options['config-opt'].join(' ') command "#{config('rubyprog')} #{curr_srcdir()}/extconf.rb #{opt}" end def config_dir_data(rel) end # # TASK setup # def exec_setup exec_task_traverse 'setup' end def setup_dir_bin(rel) all_files_in(curr_srcdir()).each do |fname| adjust_shebang "#{curr_srcdir()}/#{fname}" end end def adjust_shebang(path) return if no_harm? tmpfile = File.basename(path) + '.tmp' begin File.open(path, 'rb') {|r| first = r.gets return unless File.basename(config('rubypath')) == 'ruby' return unless File.basename(first.sub(/\A\#!/, '').split[0]) == 'ruby' $stderr.puts "adjusting shebang: #{File.basename(path)}" if verbose? File.open(tmpfile, 'wb') {|w| w.print first.sub(/\A\#!\s*\S+/, '#! 
' + config('rubypath')) w.write r.read } move_file tmpfile, File.basename(path) } ensure File.unlink tmpfile if File.exist?(tmpfile) end end def setup_dir_lib(rel) end def setup_dir_ext(rel) make if extdir?(curr_srcdir()) end def setup_dir_data(rel) end # # TASK install # def exec_install rm_f 'InstalledFiles' exec_task_traverse 'install' end def install_dir_bin(rel) install_files collect_filenames_auto(), "#{config('bindir')}/#{rel}", 0755 end def install_dir_lib(rel) install_files ruby_scripts(), "#{config('rbdir')}/#{rel}", 0644 end def install_dir_ext(rel) return unless extdir?(curr_srcdir()) install_files ruby_extentions('.'), "#{config('sodir')}/#{File.dirname(rel)}", 0555 end def install_dir_data(rel) install_files collect_filenames_auto(), "#{config('datadir')}/#{rel}", 0644 end def install_files(list, dest, mode) mkdir_p dest, @options['install-prefix'] list.each do |fname| install fname, dest, mode, @options['install-prefix'] end end def ruby_scripts collect_filenames_auto().select {|n| /\.rb\z/ =~ n } end # picked up many entries from cvs-1.11.1/src/ignore.c reject_patterns = %w( core RCSLOG tags TAGS .make.state .nse_depinfo #* .#* cvslog.* ,* .del-* *.olb *~ *.old *.bak *.BAK *.orig *.rej _$* *$ *.org *.in .* ) mapping = { '.' => '\.', '$' => '\$', '#' => '\#', '*' => '.*' } REJECT_PATTERNS = Regexp.new('\A(?:' + reject_patterns.map {|pat| pat.gsub(/[\.\$\#\*]/) {|ch| mapping[ch] } }.join('|') + ')\z') def collect_filenames_auto mapdir((existfiles() - hookfiles()).reject {|fname| REJECT_PATTERNS =~ fname }) end def existfiles all_files_in(curr_srcdir()) | all_files_in('.') end def hookfiles %w( pre-%s post-%s pre-%s.rb post-%s.rb ).map {|fmt| %w( config setup install clean ).map {|t| sprintf(fmt, t) } }.flatten end def mapdir(filelist) filelist.map {|fname| if File.exist?(fname) # objdir fname else # srcdir File.join(curr_srcdir(), fname) end } end def ruby_extentions(dir) Dir.open(dir) {|d| ents = d.select {|fname| /\.#{::Config::CONFIG['DLEXT']}\z/ =~ fname } if ents.empty? setup_rb_error "no ruby extention exists: 'ruby #{$0} setup' first" end return ents } end # # TASK clean # def exec_clean exec_task_traverse 'clean' rm_f ConfigTable.savefile rm_f 'InstalledFiles' end def clean_dir_bin(rel) end def clean_dir_lib(rel) end def clean_dir_ext(rel) return unless extdir?(curr_srcdir()) make 'clean' if File.file?('Makefile') end def clean_dir_data(rel) end # # TASK distclean # def exec_distclean exec_task_traverse 'distclean' rm_f ConfigTable.savefile rm_f 'InstalledFiles' end def distclean_dir_bin(rel) end def distclean_dir_lib(rel) end def distclean_dir_ext(rel) return unless extdir?(curr_srcdir()) make 'distclean' if File.file?('Makefile') end # # lib # def exec_task_traverse(task) run_hook "pre-#{task}" FILETYPES.each do |type| if config('without-ext') == 'yes' and type == 'ext' $stderr.puts 'skipping ext/* by user option' if verbose? next end traverse task, type, "#{task}_dir_#{type}" end run_hook "post-#{task}" end def traverse(task, rel, mid) dive_into(rel) { run_hook "pre-#{task}" __send__ mid, rel.sub(%r[\A.*?(?:/|\z)], '') all_dirs_in(curr_srcdir()).each do |d| traverse task, "#{rel}/#{d}", mid end run_hook "post-#{task}" } end def dive_into(rel) return unless File.dir?("#{@srcdir}/#{rel}") dir = File.basename(rel) Dir.mkdir dir unless File.dir?(dir) prevdir = Dir.pwd Dir.chdir dir $stderr.puts '---> ' + rel if verbose? @currdir = rel yield Dir.chdir prevdir $stderr.puts '<--- ' + rel if verbose? 
@currdir = File.dirname(rel) end end if $0 == __FILE__ begin if multipackage_install? ToplevelInstallerMulti.invoke else ToplevelInstaller.invoke end rescue SetupError raise if $DEBUG $stderr.puts $!.message $stderr.puts "Try 'ruby #{$0} --help' for detailed usage." exit 1 end end ruby-graffiti-2.2/test/000077500000000000000000000000001176467530700151265ustar00rootroot00000000000000ruby-graffiti-2.2/test/ts_graffiti.rb000066400000000000000000000330701176467530700177570ustar00rootroot00000000000000#!/usr/bin/env ruby # # Graffiti RDF Store tests # # Copyright (c) 2002-2009 Dmitry Borodaenko # # This program is free software. # You can distribute/modify this program under the terms of # the GNU General Public License version 3 or later. # # vim: et sw=2 sts=2 ts=8 tw=0 require 'test/unit' require 'yaml' require 'sequel' require 'graffiti' include Graffiti class TC_Storage < Test::Unit::TestCase def setup config = File.open( File.join( File.dirname(File.dirname(__FILE__)), 'doc', 'examples', 'samizdat-rdf-config.yaml' ) ) {|f| YAML.load(f.read) } @db = create_mock_db @store = Store.new(@db, config) @ns = @store.config.ns end def test_query_select squish = %{ SELECT ?msg, ?title, ?name, ?date, ?rating WHERE (dc::title ?msg ?title) (dc::creator ?msg ?creator) (s::fullName ?creator ?name) (dc::date ?msg ?date) (rdf::subject ?stmt ?msg) (rdf::predicate ?stmt dc::relation) (rdf::object ?stmt s::Quality) (s::rating ?stmt ?rating) LITERAL ?rating >= -1 ORDER BY ?rating DESC USING PRESET NS} sql = "SELECT DISTINCT b.id AS msg, b.title AS title, a.full_name AS name, c.published_date AS date, d.rating AS rating FROM member AS a INNER JOIN message AS b ON (b.creator = a.id) INNER JOIN resource AS c ON (b.id = c.id) INNER JOIN statement AS d ON (b.id = d.subject) INNER JOIN resource AS e ON (d.predicate = e.id) AND (e.uriref = 't' AND e.label = 'http://purl.org/dc/elements/1.1/relation') INNER JOIN resource AS f ON (d.object = f.id) AND (f.uriref = 't' AND f.label = 'http://www.nongnu.org/samizdat/rdf/schema#Quality') WHERE (c.published_date IS NOT NULL) AND (a.full_name IS NOT NULL) AND (d.id IS NOT NULL) AND (b.title IS NOT NULL) AND (d.rating >= -1) ORDER BY d.rating DESC" test_squish_select(squish, sql) do |query| assert_equal %w[?msg ?title ?name ?date ?rating], query.nodes assert query.pattern.include?(["#{@ns['dc']}title", "?msg", "?title", nil, false]) assert_equal '?rating >= -1', query.literal assert_equal '?rating', query.order assert_equal 'DESC', query.order_dir assert_equal @ns['s'], query.ns['s'] end assert_equal [], @store.select_all(squish) end def test_query_assert # initialize query_text = %{ INSERT ?msg UPDATE ?title = 'Test Message', ?content = 'Some ''text''.' 
WHERE (dc::creator ?msg 1) (dc::title ?msg ?title) (s::content ?msg ?content) USING dc FOR #{@ns['dc']} s FOR #{@ns['s']}} begin query = SquishAssert.new(@store.config, query_text) rescue assert false, "SquishAssert initialization raised #{$!.class}: #{$!}" end # query parser assert_equal ['?msg'], query.insert assert_equal({'?title' => "'0'", '?content' => "'1'"}, query.update) assert query.pattern.include?(["#{@ns['dc']}title", "?msg", "?title", nil, false]) assert_equal @ns['s'], query.ns['s'] assert_equal "'Test Message'", query.substitute_literals("'0'") assert_equal "'Some ''text''.'", query.substitute_literals("'1'") # mock db ids = @store.assert(query_text) assert_equal [1], ids assert_equal 'Test Message', @db[:Message][:id => 1][:title] id2 = @store.assert(query_text) query_text = %{ UPDATE ?rating = :rating WHERE (rdf::subject ?stmt :related) (rdf::predicate ?stmt dc::relation) (rdf::object ?stmt 1) (s::voteProposition ?vote ?stmt) (s::voteMember ?vote :member) (s::voteRating ?vote ?rating)} params = {:rating => -1, :related => 2, :member => 3} ids = @store.assert(query_text, params) assert_equal [], ids assert vote = @db[:vote].order(:id).last assert_equal -1, vote[:rating].to_i params[:rating] = -2 @store.assert(query_text, params) assert vote2 = @db[:vote].order(:id).last assert_equal -2, vote2[:rating].to_i assert_equal vote[:id], vote2[:id] end def test_query_assert_expression query_text = %{ UPDATE ?rating = 2 * :rating WHERE (rdf::subject ?stmt :related) (rdf::predicate ?stmt dc::relation) (rdf::object ?stmt 1) (s::voteProposition ?vote ?stmt) (s::voteMember ?vote :member) (s::voteRating ?vote ?rating)} params = {:rating => -1, :related => 2, :member => 3} @store.assert(query_text, params) assert vote = @db[:vote].order(:id).last assert_equal -2, vote[:rating].to_i end private :test_query_assert_expression def test_dangling_blank_node squish = %{ SELECT ?msg WHERE (s::inReplyTo ?msg ?parent) USING s FOR #{@ns['s']}} sql = "SELECT DISTINCT a.id AS msg FROM resource AS a INNER JOIN resource AS b ON (a.part_of_subproperty = b.id) AND (b.uriref = 't' AND b.label = 'http://www.nongnu.org/samizdat/rdf/schema#inReplyTo') WHERE (a.id IS NOT NULL)" test_squish_select(squish, sql) do |query| assert_equal %w[?msg], query.nodes assert query.pattern.include?(["#{@ns['s']}inReplyTo", "?msg", "?parent", nil, false]) assert_equal @ns['s'], query.ns['s'] end end def test_external_resource_no_self_join squish = %{SELECT ?id WHERE (s::id tag::Translation ?id)} sql = "SELECT DISTINCT a.id AS id FROM resource AS a WHERE (a.id IS NOT NULL) AND ((a.uriref = 't' AND a.label = 'http://www.nongnu.org/samizdat/rdf/tag#Translation'))" test_squish_select(squish, sql) do |query| assert_equal %w[?id], query.nodes assert query.pattern.include?(["#{@ns['s']}id", "#{@ns['tag']}Translation", "?id", nil, false]) assert_equal @ns['s'], query.ns['s'] end end #def test_internal_resource #end #def test_external_subject_internal_property #end def test_except squish = %{ SELECT ?msg WHERE (dc::date ?msg ?date) EXCEPT (s::inReplyTo ?msg ?parent) (dct::isVersionOf ?msg ?version_of) (dc::creator ?version_of 1) ORDER BY ?date DESC} sql = "SELECT DISTINCT a.id AS msg, a.published_date AS date FROM resource AS a LEFT JOIN ( SELECT a.id AS _field_c FROM message AS b INNER JOIN resource AS a ON (a.part_of = b.id) INNER JOIN resource AS c ON (a.part_of_subproperty = c.id) AND (c.uriref = 't' AND c.label = 'http://purl.org/dc/terms/isVersionOf') WHERE (b.creator = 1) ) AS _subquery_a ON (a.id = _subquery_a._field_c) LEFT 
JOIN resource AS d ON (a.part_of_subproperty = d.id) AND (d.uriref = 't' AND d.label = 'http://www.nongnu.org/samizdat/rdf/schema#inReplyTo') WHERE (a.published_date IS NOT NULL) AND (a.id IS NOT NULL) AND (_subquery_a._field_c IS NULL) AND (d.id IS NULL) ORDER BY a.published_date DESC" test_squish_select(squish, sql) end def test_except_group_by squish = %{ SELECT ?msg WHERE (rdf::predicate ?stmt dc::relation) (rdf::subject ?stmt ?msg) (rdf::object ?stmt ?tag) (dc::date ?stmt ?date) (s::rating ?stmt ?rating FILTER ?rating >= 1.5) (s::hidden ?msg ?hidden FILTER ?hidden = 'f') EXCEPT (dct::isPartOf ?msg ?parent) GROUP BY ?msg ORDER BY max(?date) DESC} sql = "SELECT DISTINCT c.subject AS msg, max(d.published_date) FROM message AS a INNER JOIN statement AS c ON (c.subject = a.id) AND (c.rating >= 1.5) INNER JOIN resource AS b ON (c.subject = b.id) INNER JOIN resource AS d ON (c.id = d.id) INNER JOIN resource AS e ON (c.predicate = e.id) AND (e.uriref = 't' AND e.label = 'http://purl.org/dc/elements/1.1/relation') WHERE (d.published_date IS NOT NULL) AND (a.hidden IS NOT NULL) AND (b.part_of IS NULL) AND (c.rating IS NOT NULL) AND (c.object IS NOT NULL) AND ((a.hidden = 'f')) GROUP BY c.subject ORDER BY max(d.published_date) DESC" test_squish_select(squish, sql) end def test_optional squish = %{ SELECT ?date, ?creator, ?lang, ?parent, ?version_of, ?hidden, ?open WHERE (dc::date 1 ?date) OPTIONAL (dc::creator 1 ?creator) (dc::language 1 ?lang) (s::inReplyTo 1 ?parent) (dct::isVersionOf 1 ?version_of) (s::hidden 1 ?hidden) (s::openForAll 1 ?open)} sql = "SELECT DISTINCT a.published_date AS date, b.creator AS creator, b.language AS lang, select_subproperty(a.part_of, d.id) AS parent, select_subproperty(a.part_of, c.id) AS version_of, b.hidden AS hidden, b.open AS open FROM resource AS a INNER JOIN message AS b ON (a.id = b.id) LEFT JOIN resource AS c ON (a.part_of_subproperty = c.id) AND (c.uriref = 't' AND c.label = 'http://purl.org/dc/terms/isVersionOf') LEFT JOIN resource AS d ON (a.part_of_subproperty = d.id) AND (d.uriref = 't' AND d.label = 'http://www.nongnu.org/samizdat/rdf/schema#inReplyTo') WHERE (a.published_date IS NOT NULL) AND ((a.id = 1))" test_squish_select(squish, sql) end def test_except_optional_transitive squish = %{ SELECT ?msg WHERE (rdf::subject ?stmt ?msg) (rdf::predicate ?stmt dc::relation) (rdf::object ?stmt ?tag) (s::rating ?stmt ?rating FILTER ?rating > 0) (dc::date ?msg ?date) EXCEPT (dct::isPartOf ?msg ?parent) OPTIONAL (dct::isPartOf ?tag ?supertag TRANSITIVE) LITERAL ?tag = 1 OR ?supertag = 1 ORDER BY ?date DESC} sql = "SELECT DISTINCT b.subject AS msg, a.published_date AS date FROM resource AS a INNER JOIN statement AS b ON (b.subject = a.id) AND (b.rating > 0) INNER JOIN resource AS d ON (b.predicate = d.id) AND (d.uriref = 't' AND d.label = 'http://purl.org/dc/elements/1.1/relation') LEFT JOIN part AS c ON (b.object = c.id) WHERE (a.published_date IS NOT NULL) AND (a.part_of IS NULL) AND (b.rating IS NOT NULL) AND (b.id IS NOT NULL) AND (b.object = 1 OR c.part_of = 1) ORDER BY a.published_date DESC" test_squish_select(squish, sql) end def test_optional_connect_by_object squish = %{ SELECT ?event WHERE (ical::dtstart ?event ?dtstart FILTER ?dtstart >= 'now') (ical::dtend ?event ?dtend) OPTIONAL (s::rruleEvent ?rrule ?event) (ical::until ?rrule ?until FILTER ?until IS NULL OR ?until > 'now') LITERAL ?dtend > 'now' OR ?rrule IS NOT NULL ORDER BY ?event DESC} sql = "SELECT DISTINCT b.id AS event FROM event AS b LEFT JOIN recurrence AS a ON (b.id = a.event) AND 
(a.until IS NULL OR a.until > 'now') WHERE (b.dtstart IS NOT NULL) AND ((b.dtstart >= 'now')) AND (b.dtend > 'now' OR a.id IS NOT NULL) ORDER BY b.id DESC" test_squish_select(squish, sql) end private :test_optional_connect_by_object def test_many_to_many # pretend that Vote is a many-to-many relation table squish = %{ SELECT ?p, ?date WHERE (s::voteRating ?p ?vote1 FILTER ?vote1 > 0) (s::voteRating ?p ?vote2 FILTER ?vote2 < 0) (dc::date ?p ?date) ORDER BY ?date DESC} sql = "SELECT DISTINCT a.id AS p, c.published_date AS date FROM vote AS a INNER JOIN vote AS b ON (a.id = b.id) AND (b.rating < 0) INNER JOIN resource AS c ON (a.id = c.id) WHERE (c.published_date IS NOT NULL) AND (a.rating IS NOT NULL) AND (b.rating IS NOT NULL) AND ((a.rating > 0)) ORDER BY c.published_date DESC" test_squish_select(squish, sql) end def test_update_null_and_subproperty query_text = %{INSERT ?msg UPDATE ?parent = :parent WHERE (dct::isPartOf ?msg ?parent)} @store.assert(query_text, :id => 1, :parent => 3) assert_equal 3, @db[:resource].filter(:id => 1).get(:part_of) # check that subproperty is set query_text = %{UPDATE ?parent = :parent WHERE (s::subTagOf :id ?parent)} @store.assert(query_text, :id => 1, :parent => 3) assert_equal 3, @db[:resource].filter(:id => 1).get(:part_of) assert_equal 2, @db[:resource].filter(:id => 1).get(:part_of_subproperty) # check that NULL is handled correctly and that subproperty is unset query_text = %{UPDATE ?parent = NULL WHERE (dct::isPartOf :id ?parent)} @store.assert(query_text, :id => 1) assert_equal nil, @db[:resource].filter(:id => 1).get(:part_of) assert_equal nil, @db[:resource].filter(:id => 1).get(:part_of_subproperty) end private def test_squish_select(squish, sql) begin query = SquishSelect.new(@store.config, squish) rescue assert false, "SquishSelect initialization raised #{$!.class}: #{$!}" end yield query if block_given? # query result begin sql1 = @store.select(query) rescue assert false, "select with pre-parsed query raised #{$!.class}: #{$!}" end begin sql2 = @store.select(squish) rescue assert false, "select with query text raised #{$!.class}: #{$!}" end assert sql1 == sql2 # transform result assert_equal normalize(sql), normalize(sql1), "Query doesn't match. Expected:\n#{sql}\nReceived:\n#{sql1}" end def normalize(sql) sql end def create_mock_db db = Sequel.sqlite(:quote_identifiers => false) db.create_table(:resource) do primary_key :id Time :published_date Integer :part_of Integer :part_of_subproperty Integer :part_sequence_number TrueClass :literal TrueClass :uriref String :label end db.create_table(:statement) do primary_key :id Integer :subject Integer :predicate Integer :object BigDecimal :rating, :size => [4, 2] end db.create_table(:member) do primary_key :id String :login String :full_name String :email end db.create_table(:message) do primary_key :id String :title Integer :creator String :format String :language TrueClass :open TrueClass :hidden TrueClass :locked String :content String :html_full String :html_short end db.create_table(:vote) do primary_key :id Integer :proposition Integer :member BigDecimal :rating, :size => 2 end db end def create_mock_member(db) db[:member].insert( :login => 'test', :full_name => 'test', :email => 'test@localhost' ) end end
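# A minimal end-to-end sketch of the public API exercised above (a
# hypothetical session; assumes a database with the schema created by
# create_mock_db and the config loaded in #setup):
#
#   store = Graffiti::Store.new(db, config)
#   id, = store.assert %{INSERT ?msg
#                        UPDATE ?title = 'Hello'
#                        WHERE (dc::title ?msg ?title)
#                        USING PRESET NS}
#   store.select_all %{SELECT ?m WHERE (dc::title ?m ?t) USING PRESET NS}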