ruby-graffiti-2.2/COPYING
GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The GNU General Public License is a free, copyleft license for
software and other kinds of works.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.
Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.
For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.
Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so. This is fundamentally incompatible with the aim of
protecting users' freedom to change the software. The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable. Therefore, we
have designed this version of the GPL to prohibit the practice for those
products. If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.
Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.
The precise terms and conditions for copying, distribution and
modification follow.
TERMS AND CONDITIONS
0. Definitions.
"This License" refers to version 3 of the GNU General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.
To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.
A "covered work" means either the unmodified Program or a work based
on the Program.
To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.
To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.
1. Source Code.
The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.
A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.
The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.
The Corresponding Source for a work in source code form is that
same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.
When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.
b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".
c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.
A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:
a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.
d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.
A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.
A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.
"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.
If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).
The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.
7. Additional Terms.
"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or
e) Declining to grant rights under trademark law for use of some
trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that
material by anyone who conveys the material (or modified versions of
it) with contractual assumptions of liability to the recipient, for
any liability that these contractual assumptions directly impose on
those licensors and authors.
All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).
However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.
An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.
11. Patents.
A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".
A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.
If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.
A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Use with the GNU Affero General Public License.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU Affero General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the special requirements of the GNU Affero General Public License,
section 13, concerning interaction through a network will apply to the
combination as such.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If the program does terminal interaction, make it output a short
notice like this when it starts in an interactive mode:
<program> Copyright (C) <year> <name of author>
This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, your program's commands
might be different; for a GUI interface, you would use an "about box".
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU GPL, see
<http://www.gnu.org/licenses/>.
The GNU General Public License does not permit incorporating your program
into proprietary programs. If your program is a subroutine library, you
may consider it more useful to permit linking proprietary applications with
the library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License. But first, please read
<http://www.gnu.org/licenses/why-not-lgpl.html>.
ruby-graffiti-2.2/ChangeLog

commit 903518dc145cf51d8232aaf6150691427cc38d7f (HEAD, tag: v2.2, origin/master, origin/HEAD, master)
Author: Dmitry Borodaenko
Date: Sat Jun 9 19:04:01 2012 +0300
force deterministic iteration over hashes
This allows unit test to fully compare generated SQL queries with
etalons. Also, more debug points are added.
lib/graffiti/debug.rb | 2 +-
lib/graffiti/sql_mapper.rb | 48 +++++++++++---------
lib/graffiti/squish.rb | 14 +++---
test/ts_graffiti.rb | 107 +++++++++++++++++++++-----------------------
4 files changed, 86 insertions(+), 85 deletions(-)
commit 455d1eb517ec049486839c62fc6562cd91838ebd
Author: anonymous
Date: Mon Jan 30 21:10:59 2012 +0300
add spec.test_files
graffiti.gemspec | 1 +
1 file changed, 1 insertion(+)
commit 0a253fcb561a699b35467a2b11ae1aa15c36c173
Author: Dmitry Borodaenko
Date: Sun Feb 5 15:15:08 2012 +0300
better example in README
README.rdoc | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
commit 7c576111432963bb97f235dd55294346dc9cf8b5
Author: anonymous
Date: Sun Jan 29 15:23:07 2012 +0300
initial support for gem creation
graffiti.gemspec | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
commit 4b2a34274aa8735bdb26ad3609ce132886d4ee18 (tag: v2.1, bejbus/master)
Author: Dmitry Borodaenko
Date: Sun Dec 25 16:52:34 2011 +0300
updated README for Sequel, updated copyrights
README.rdoc | 8 ++++----
lib/graffiti/rdf_config.rb | 2 +-
lib/graffiti/rdf_property_map.rb | 2 +-
lib/graffiti/sql_mapper.rb | 2 +-
lib/graffiti/squish.rb | 2 +-
lib/graffiti/store.rb | 2 +-
6 files changed, 9 insertions(+), 9 deletions(-)
commit e8004f1244745d68758f21cc8820ded433d6faad (tag: v2.0)
Author: Dmitry Borodaenko
Date: Fri Sep 30 23:02:40 2011 +0300
migrate from DBI to Sequel
* Store now expects a Sequel::Database object
* SquishSelect and SquishAssert refactored
* SquishQuery now keeps raw unescaped literals in @strings, SquishAssert
passes these as is to Sequel::Dataset, SquishSelect still escapes them
back into the SQL query locally
* validate_expression now returns the validated string for chainability
* substitute_parameters dropped: Sequel understands named parameters
* SquishSelect#to_sql (and by extention Store#select) now returns only
SQL query (same params hash can now be passed to Sequel verbatim)
* SquishSelect#to_sql now names columns with corresponding blank node
names where applicable
* unit test now runs select and assert on an in-memory Sqlite database
doc/examples/samizdat-rdf-config.yaml | 70 ++---
doc/examples/samizdat-triggers-pgsql.sql | 138 ++++------
lib/graffiti/debug.rb | 34 +++
lib/graffiti/rdf_config.rb | 1 +
lib/graffiti/rdf_property_map.rb | 10 +-
lib/graffiti/sql_mapper.rb | 31 +--
lib/graffiti/squish.rb | 416 ++++++++++++++++++------------
lib/graffiti/store.rb | 35 +--
test/ts_graffiti.rb | 237 +++++++++++++----
9 files changed, 580 insertions(+), 392 deletions(-)
commit ef6717ce1fe9a36bf450866034f915b1d3013e11
Author: Dmitry Borodaenko
Date: Fri Sep 16 22:18:19 2011 +0300
old Monotone changelog
ChangeLog.mtn | 233 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 233 insertions(+)
commit 6b5b8f32ca3b362227f09852da42df4dd56b498e
Author: Dmitry Borodaenko
Date: Fri Sep 16 22:10:29 2011 +0300
ordering-agnosting query matching in unit tests
test/ts_graffiti.rb | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
commit 574f5519a9ed4dbee78b0b5101918db08089ea3e
Author: Dmitry Borodaenko
Date: Fri Sep 16 22:10:07 2011 +0300
SqlExpression#to_str for Ruby 1.9
lib/graffiti/sql_mapper.rb | 2 ++
1 file changed, 2 insertions(+)
commit cf3325be344735f2bbf0ba7cacaa16467e1148e4
Author: Dmitry Borodaenko
Date: Sat Jun 4 22:08:18 2011 +0300
minor rdoc markup fix
README.rdoc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
commit 210ab735c85abc429d9651bd340d7683df595b16
Author: Dmitry Borodaenko
Date: Sat Jun 4 22:02:51 2011 +0300
ICIS 2009 paper
New paper added (On-demand RDF to Relational Query Translation in
Samizdat RDF Store, ICIS 2009)
...df-to-relational-query-translation-icis2009.tex | 936 ++++++++++++++++++++
1 file changed, 936 insertions(+)
commit 3605566c2a3b251e592f2a78aee0048724f76174 (tag: v1.0)
Author: Dmitry Borodaenko
Date: Sat Jun 4 22:00:39 2011 +0300
first commit to github
COPYING | 676 +++++++++++++++
README.rdoc | 127 +++
TODO | 30 +
doc/diagrams/graffiti-classes.svg | 157 ++++
doc/diagrams/graffiti-deployment.svg | 117 +++
doc/diagrams/graffiti-store-sequence.svg | 69 ++
doc/diagrams/squish-select-sequence.svg | 266 ++++++
doc/examples/samizdat-rdf-config.yaml | 95 +++
doc/examples/samizdat-triggers-pgsql.sql | 290 +++++++
doc/papers/collreif.tex | 462 ++++++++++
doc/papers/rel-rdf.tex | 545 ++++++++++++
doc/rdf-impl-report.txt | 126 +++
lib/graffiti.rb | 15 +
lib/graffiti/exceptions.rb | 20 +
lib/graffiti/rdf_config.rb | 77 ++
lib/graffiti/rdf_property_map.rb | 84 ++
lib/graffiti/sql_mapper.rb | 927 ++++++++++++++++++++
lib/graffiti/squish.rb | 496 +++++++++++
lib/graffiti/store.rb | 99 +++
setup.rb | 1360 ++++++++++++++++++++++++++++++
test/ts_graffiti.rb | 321 +++++++
21 files changed, 6359 insertions(+)
ruby-graffiti-2.2/ChangeLog.mtn

-----------------------------------------------------------------
Revision: 5f07d75e7786d56d30659b04ac0091fc8bc37fda
Ancestor: e13812ed0f84c7e722284750df4ac1a8ef81a501
Author: angdraug@debian.org
Date: 2009-10-11T12:23:07
Branch: graffiti-head
Deleted entries:
doc/diagrams/graffiti_classes.dia
Added files:
doc/diagrams/graffiti-classes.svg
doc/diagrams/graffiti-deployment.svg
doc/diagrams/graffiti-store-sequence.svg
doc/diagrams/squish-select-sequence.svg
ChangeLog:
replaced the old Dia diagram with SVG diagrams produced with BoUML
-----------------------------------------------------------------
Revision: e13812ed0f84c7e722284750df4ac1a8ef81a501
Ancestor: e364f136ecc00b4559a0f0d07f257886f48ae4ef
Author: angdraug@debian.org
Date: 2009-09-06T13:47:40
Branch: graffiti-head
Modified files:
README.rdoc
ChangeLog:
relational data adaptation instructions added
-----------------------------------------------------------------
Revision: e364f136ecc00b4559a0f0d07f257886f48ae4ef
Ancestor: f2ef2d91fb1e29ee91928b5ebf52c22b09def60b
Author: angdraug@debian.org
Date: 2009-09-06T13:22:12
Branch: graffiti-head
Modified files:
README.rdoc
ChangeLog:
update language description added
-----------------------------------------------------------------
Revision: f2ef2d91fb1e29ee91928b5ebf52c22b09def60b
Ancestor: c5c5ad50f2f0e1159628804eb141c60977483fb7
Author: angdraug@debian.org
Date: 2009-09-06T13:12:45
Branch: graffiti-head
Modified files:
README.rdoc
ChangeLog:
query language description added
-----------------------------------------------------------------
Revision: c5c5ad50f2f0e1159628804eb141c60977483fb7
Ancestor: c6389322361c36538d626257f8dd957864e7b85e
Author: angdraug@debian.org
Date: 2009-08-22T12:38:32
Branch: graffiti-head
Modified files:
TODO lib/graffiti/rdf_config.rb
lib/graffiti/rdf_property_map.rb lib/graffiti/sql_mapper.rb
ChangeLog:
move setting of subproperty_of and transitive_closure outside of the RdfPropertyMap constructor; documented some missing pieces
-----------------------------------------------------------------
Revision: c6389322361c36538d626257f8dd957864e7b85e
Ancestor: 41a4f5d475227a9df9aae268506bcdb2dbf2e6ab
Author: angdraug@debian.org
Date: 2009-08-22T12:37:03
Branch: graffiti-head
Modified files:
lib/graffiti/squish.rb
ChangeLog:
use @db, db shortcut is no longer available
-----------------------------------------------------------------
Revision: 41a4f5d475227a9df9aae268506bcdb2dbf2e6ab
Ancestor: bed7201b8512540b0c225add168a66b18cfad5f3
Author: angdraug@debian.org
Date: 2009-08-05T13:16:50
Branch: graffiti-head
Modified files:
lib/graffiti/squish.rb
ChangeLog:
allow PostgreSQL full text search operators in expressions
-----------------------------------------------------------------
Revision: bed7201b8512540b0c225add168a66b18cfad5f3
Ancestor: 3538ff90e492811cd2c022cf6502439090241d8e
Author: angdraug@debian.org
Date: 2009-08-02T14:34:09
Branch: graffiti-head
Modified files:
lib/graffiti/sql_mapper.rb
ChangeLog:
move SqlNodeBinding and SqlExpression out of SqlMapper namespace; move exception raising into check_graph
-----------------------------------------------------------------
Revision: 3538ff90e492811cd2c022cf6502439090241d8e
Ancestor: ad944a53e855766391fb1840b09cc47eac78f911
Author: angdraug@debian.org
Date: 2009-07-30T08:50:54
Branch: graffiti-head
Added files:
lib/graffiti/exceptions.rb lib/graffiti/rdf_config.rb
lib/graffiti/rdf_property_map.rb lib/graffiti/sql_mapper.rb
lib/graffiti/squish.rb lib/graffiti/store.rb
Added directories:
lib/graffiti
Modified files:
lib/graffiti.rb
ChangeLog:
split Graffiti classes into their own .rb files
-----------------------------------------------------------------
Revision: ad944a53e855766391fb1840b09cc47eac78f911
Ancestor: 35e3b7019eda146e755449c86f2b04f035796608
Author: angdraug@debian.org
Date: 2009-07-28T12:42:47
Branch: graffiti-head
Modified files:
README.rdoc
ChangeLog:
mention SynCache in README.rdoc
-----------------------------------------------------------------
Revision: 35e3b7019eda146e755449c86f2b04f035796608
Ancestor: 7eeecf40a8f1e0127ecc6d2f818bd6d80e0b1f90
Author: angdraug@debian.org
Date: 2009-07-28T10:56:06
Branch: graffiti-head
Modified files:
README.rdoc lib/graffiti.rb test/ts_graffiti.rb
ChangeLog:
module initialization fixes
-----------------------------------------------------------------
Revision: 7eeecf40a8f1e0127ecc6d2f818bd6d80e0b1f90
Ancestor: 35efa8b3fb65bbe7744fc930dc3ceb5a98564a98
Author: angdraug@debian.org
Date: 2009-07-28T10:45:31
Branch: graffiti-head
Modified files:
README.rdoc
ChangeLog:
minor documentation cleanup
-----------------------------------------------------------------
Revision: 35efa8b3fb65bbe7744fc930dc3ceb5a98564a98
Ancestor: 60fc13b5ffccadc9990d5303de93e71be30128bb
Author: angdraug@debian.org
Date: 2009-07-28T10:38:48
Branch: graffiti-head
Added files:
doc/diagrams/graffiti_classes.dia
Added directories:
doc/diagrams
Modified files:
README.rdoc
ChangeLog:
documentation update
-----------------------------------------------------------------
Revision: 60fc13b5ffccadc9990d5303de93e71be30128bb
Ancestor: abd14a047ad68bc4e92ca80916e71a39f605b14c
Author: angdraug@debian.org
Date: 2009-07-27T19:20:47
Branch: graffiti-head
Renamed entries:
doc/examples/samizdat-triggers-pgsql-sql to doc/examples/samizdat-triggers-pgsql.sql
ChangeLog:
fixed .sql extension for triggers example
-----------------------------------------------------------------
Revision: abd14a047ad68bc4e92ca80916e71a39f605b14c
Ancestor:
Author: angdraug@debian.org
Date: 2009-07-27T19:16:55
Branch: graffiti-head
Added files:
COPYING README.rdoc TODO
doc/examples/samizdat-rdf-config.yaml
doc/examples/samizdat-triggers-pgsql-sql
doc/papers/collreif.tex doc/papers/rel-rdf.tex
doc/rdf-impl-report.txt lib/graffiti.rb setup.rb
test/ts_graffiti.rb
Added directories:
. doc doc/examples doc/papers lib test
ChangeLog:
initial checkin: Graffiti is a spin-off of storage.rb from Samizdat project
ruby-graffiti-2.2/README.rdoc

= Graffiti - relational RDF store for Ruby
== Synopsis
require 'sequel'
require 'yaml'
require 'graffiti'
db = Sequel.connect(:adapter => 'pg', :database => dbname)
config = File.open('rdf.yaml') {|f| YAML.load(f.read) }
store = Graffiti::Store.new(db, config)
data = store.fetch(%{
SELECT ?date, ?title
WHERE (dc::date ?r ?date FILTER ?date >= :start)
(dc::title ?r ?title)
ORDER BY ?date DESC}, 10, 0, :start => Time.now - 24*3600)
puts data.first[:title]
== Description
Graffiti is an RDF store based on dynamic translation of RDF queries
into SQL. Graffiti lets you map any relational database schema into RDF
semantics and, vice versa, store any RDF data in a relational database.
== Requirements
Graffiti uses Sequel to connect to the database backend and provides a
DBI-like interface to run RDF queries in the Squish query language from
Ruby applications. The SynCache object cache is used as an in-process
cache of translated Squish queries.
== Query Language
Graffiti implements the Squish RDF query language with several
extensions. A query may include the following clauses:
* SELECT: comma-separated list of result expressions, which may be
variables or aggregate functions.
* WHERE: main graph pattern, described as a list of triple patterns.
Each triple is enclosed in parentheses and consists of a predicate, a
subject, and an object. The predicate must be a URL and may use a
shorthand notation with the namespace id separated by a double colon.
The subject may be a URL, an internal resource id, or a variable. The
object may be a URL, an internal resource id, a variable, or a literal.
Values of variables bound by the triple pattern may be constrained by
an optional FILTER expression.
* EXCEPT: negative graph pattern. Solutions that match any part of the
negative graph pattern are excluded from the result set.
* OPTIONAL: optional graph pattern. Variables defined in the optional
pattern are included in the result set only for solutions that match
the corresponding parts of the optional graph pattern.
* LITERAL: global filter expression. Used for expressions that involve
variables from different triple patterns.
* GROUP BY: result set aggregation criteria.
* ORDER BY: result set ordering criteria.
* USING: namespace definitions. Namespaces defined in the RDF store
configuration do not have to be repeated here (a combined example
follows this list).
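
The sketch below pulls most of these clauses together. It is only an
illustration: the property names come from the example configuration in
doc/examples/, the exact clause syntax should be checked against the
descriptions above, and the USING clause is shown even though that
configuration already defines the s namespace.

  SELECT ?msg, ?title
  WHERE (dc::title ?msg ?title)
        (dc::date ?msg ?date)
  OPTIONAL (dc::creator ?msg ?author)
  EXCEPT (s::hidden ?msg 'true')
  LITERAL ?author = 1 OR ?date >= :start
  ORDER BY ?date DESC
  USING s FOR http://www.nongnu.org/samizdat/rdf/schema#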
A basic update language is also implemented. A Squish assert statement
uses the same structure as a query, with the SELECT clause replaced by
either one or both of the following clauses:
* INSERT: list of variables representing new resources to be inserted
into the RDF graph.
* UPDATE: list of assignments of literal expressions to variables bound
by a solution.
An assert statement updates only one solution per invocation; if more
than one solution matches the graph pattern, only the first one is
updated.
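
For instance, a vote resource from the example configuration in
doc/examples/ could be asserted roughly as follows. This is a sketch:
the Store#assert entry point and its parameter convention are assumed
by analogy with fetch, and stmt_id and member_id stand for previously
obtained resource ids.

  store.assert(%{
      INSERT ?vote
      UPDATE ?rating = :rating
      WHERE (s::voteProposition ?vote :stmt)
            (s::voteMember ?vote :member)
            (s::voteRating ?vote ?rating)},
    :stmt => stmt_id, :member => member_id, :rating => 2)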
== Relational Data
Relational data has to be adapted for RDF access using Graffiti. The
adaptation is non-intrusive and will not break compatibility with
existing SQL queries.
The following schema changes are required in all cases:
* Create an rdfs:Resource superclass table with an auto-generated
primary key.
* Replace primary keys of mapped subclass tables with foreign keys
referencing the rdfs:Resource table (existing foreign keys may need to
be updated to reflect this change).
* Register rdfs:subClassOf inference database triggers to update the
rdfs:Resource table and maintain foreign key integrity on all changes
in mapped subclass tables.
The following changes may be necessary to support optional RDF mapping
features:
* Register database triggers for other cases of rdfs:subClassOf
entailment.
* Create triples table (required to represent non-relational RDF data
and RDF statement reification).
* Add sub-property qualifier attributes referencing the property URIref
entry in the rdfs:Resource table for each attribute mapped to a
super-property.
* Create transitive closure tables, register owl:TransitiveProperty
inference triggers.
An example RDF map and the corresponding triggers can be found in
doc/examples/.
== Copying
Copyright (c) 2002-2011 Dmitry Borodaenko
This program is free software.
You can distribute/modify this program under the terms of the GNU
General Public License version 3 or later.
ruby-graffiti-2.2/TODO

Graffiti ToDo List
==================
- generalize RDF storage, implement SPARQL
- unit, functional, and performance test suite (in progress)
- separate library for RDF storage (done)
- investigate alternative backends: FramerD, 3store, Redland
-- depends on separate library for RDF storage
-- depends on test suite
- security: Squish literal condition safety (done), limited number of
query clauses (done), dry-run of user-defined query, approvable
resource usage
- query result set representation
- optional patterns and negation (done)
- parametrized queries (done)
-- cache prepared statements
- support blob literals
-- depends on parametrized queries
- vocabulary entailment: RDF, RDFS, OWL
- RDF aggregates storage internalization (Seq, Bag, Alt)
- query introspection
- storage workflow control (triggers)
- transparent (structured) RDF query storage
-- depends on RDF aggregates storage
-- depends on storage workflow control
- subqueries (query premise)
-- depends on transparent query storage
- chain queries
-- depends on native RDF storage
ruby-graffiti-2.2/doc/diagrams/graffiti-classes.svg
ruby-graffiti-2.2/doc/diagrams/graffiti-deployment.svg
ruby-graffiti-2.2/doc/diagrams/graffiti-store-sequence.svg
ruby-graffiti-2.2/doc/diagrams/squish-select-sequence.svg
ruby-graffiti-2.2/doc/examples/samizdat-rdf-config.yaml

---
# rdf.yaml
#
# Defines essential parts of RDF model of a Samizdat site. Don't touch
# it unless you know what you're doing.
# Namespaces
#
ns:
s: 'http://www.nongnu.org/samizdat/rdf/schema#'
tag: 'http://www.nongnu.org/samizdat/rdf/tag#'
items: 'http://www.nongnu.org/samizdat/rdf/items#'
rdf: 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
dc: 'http://purl.org/dc/elements/1.1/'
dct: 'http://purl.org/dc/terms/'
ical: 'http://www.w3.org/2002/12/cal#'
# Mapping of internal RDF properties to tables and fields. Statements
# over properties not listed here or in the 'subproperties:' section below are
# reified using standard rdf::subject, rdf::predicate, and rdf::object
# properties, so at least these three and s::id must be mapped.
#
map:
's::id': {resource: id}
'dc::date': {resource: published_date}
'dct::isPartOf': {resource: part_of}
's::isPartOfSubProperty': {resource: part_of_subproperty}
's::partSequenceNumber': {resource: part_sequence_number}
'rdf::subject': {statement: subject}
'rdf::predicate': {statement: predicate}
'rdf::object': {statement: object}
's::login': {member: login}
's::fullName': {member: full_name}
's::email': {member: email}
'dc::title': {message: title}
'dc::creator': {message: creator}
'dc::format': {message: format}
'dc::language': {message: language}
's::openForAll': {message: open}
's::hidden': {message: hidden}
's::locked': {message: locked}
's::content': {message: content}
's::htmlFull': {message: html_full}
's::htmlShort': {message: html_short}
's::rating': {statement: rating}
's::voteProposition': {vote: proposition}
's::voteMember': {vote: member}
's::voteRating': {vote: rating}
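# Each entry above maps an RDF property to a {table: field} pair; e.g.
# 's::login': {member: login} stores the s:login property in the login
# field of the member table.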
# Map of properties into lists of their subproperties. For each property
# listed here, an additional qualifier field named <field>_subproperty
# is defined in the same table (as defined under 'map:' above) referring
# to resource id identifying the subproperty (normally a uriref resource
# holding uriref of the subproperty). Only one level of subproperty
# relation is supported, all subsubproperties must be listed directly
# under root property.
#
subproperties:
'dct::isPartOf': [ 's::inReplyTo', 'dct::isVersionOf',
's::isTranslationOf', 's::subTagOf' ]
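# For example, per the 'map:' section above, dct::isPartOf is stored in
# the part_of field of the resource table, and its subproperty qualifier
# in the part_of_subproperty field next to it.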
# Map of transitive RDF properties into tables that hold their
# transitive closures. The format of the table is as follows: 'resource'
# field refers to the subject resource id, property field (and qualifier
# field in case of subproperty) has the same name as in the main table
# (as defined under 'map:' above) and holds reference to predicate
# object, and 'distance' field holds the distance from subject to object
# in the RDF graph.
#
transitive_closure:
'dct::isPartOf': part
ruby-graffiti-2.2/doc/examples/samizdat-triggers-pgsql.sql

-- Samizdat Database Triggers - PostgreSQL
--
-- Copyright (c) 2002-2011 Dmitry Borodaenko
--
-- This program is free software.
-- You can distribute/modify this program under the terms of
-- the GNU General Public License version 3 or later.
--
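-- insert_resource: BEFORE INSERT trigger procedure shared by all mapped
-- subclass tables; allocates a row in the rdfs:Resource superclass table
-- and reuses its generated id as the new row's primary key.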
CREATE FUNCTION insert_resource() RETURNS TRIGGER AS $$
BEGIN
IF NEW.id IS NULL THEN
INSERT INTO resource (literal, uriref, label)
VALUES ('false', 'false', TG_ARGV[0]);
NEW.id := currval('resource_id_seq');
END IF;
RETURN NEW;
END;
$$ LANGUAGE 'plpgsql';
CREATE TRIGGER insert_statement BEFORE INSERT ON statement
FOR EACH ROW EXECUTE PROCEDURE insert_resource('statement');
CREATE TRIGGER insert_member BEFORE INSERT ON member
FOR EACH ROW EXECUTE PROCEDURE insert_resource('member');
CREATE TRIGGER insert_message BEFORE INSERT ON message
FOR EACH ROW EXECUTE PROCEDURE insert_resource('message');
CREATE TRIGGER insert_vote BEFORE INSERT ON vote
FOR EACH ROW EXECUTE PROCEDURE insert_resource('vote');
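-- delete_resource: AFTER DELETE counterpart of insert_resource; removes
-- the rdfs:Resource row when a mapped subclass row is deleted.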
CREATE FUNCTION delete_resource() RETURNS TRIGGER AS $$
BEGIN
DELETE FROM resource WHERE id = OLD.id;
RETURN NULL;
END;
$$ LANGUAGE 'plpgsql';
CREATE TRIGGER delete_statement AFTER DELETE ON statement
FOR EACH ROW EXECUTE PROCEDURE delete_resource();
CREATE TRIGGER delete_member AFTER DELETE ON member
FOR EACH ROW EXECUTE PROCEDURE delete_resource();
CREATE TRIGGER delete_message AFTER DELETE ON message
FOR EACH ROW EXECUTE PROCEDURE delete_resource();
CREATE TRIGGER delete_vote AFTER DELETE ON vote
FOR EACH ROW EXECUTE PROCEDURE delete_resource();
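-- select_subproperty: helper for SQL generated from Squish queries;
-- yields the property value only when its subproperty qualifier is set,
-- so that a column mapped to a super-property can also serve the
-- subproperties stored in it.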
CREATE FUNCTION select_subproperty(value resource.id%TYPE, subproperty resource.id%TYPE) RETURNS resource.id%TYPE AS $$
BEGIN
IF subproperty IS NULL THEN
RETURN NULL;
ELSE
RETURN value;
END IF;
END;
$$ LANGUAGE 'plpgsql';
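-- calculate_statement_rating: a statement's rating is the average of
-- all votes on the proposition.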
CREATE FUNCTION calculate_statement_rating(statement_id statement.id%TYPE) RETURNS statement.rating%TYPE AS $$
BEGIN
RETURN (SELECT AVG(rating) FROM vote WHERE proposition = statement_id);
END;
$$ LANGUAGE 'plpgsql';
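-- Recalculate nrelated (the number of positively rated dc:relation
-- statements pointing at a tag), then refresh nrelated_with_subtags
-- for the tag and all of its supertags.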
CREATE FUNCTION update_nrelated(tag_id resource.id%TYPE) RETURNS VOID AS $$
DECLARE
dc_relation resource.label%TYPE := 'http://purl.org/dc/elements/1.1/relation';
s_subtag_of resource.label%TYPE := 'http://www.nongnu.org/samizdat/rdf/schema#subTagOf';
s_subtag_of_id resource.id%TYPE;
n tag.nrelated%TYPE;
supertag RECORD;
BEGIN
-- update nrelated
SELECT COUNT(*) INTO n
FROM statement s
INNER JOIN resource p ON s.predicate = p.id
WHERE p.label = dc_relation AND s.object = tag_id AND s.rating > 0;
UPDATE tag SET nrelated = n WHERE id = tag_id;
IF NOT FOUND THEN
INSERT INTO tag (id, nrelated) VALUES (tag_id, n);
END IF;
-- update nrelated_with_subtags for this tag and its supertags
SELECT id INTO s_subtag_of_id FROM resource
WHERE label = s_subtag_of;
FOR supertag IN (
SELECT tag_id AS id, 0 AS distance
UNION
SELECT part_of AS id, distance FROM part
WHERE id = tag_id
AND part_of_subproperty = s_subtag_of_id
ORDER BY distance ASC)
LOOP
UPDATE tag
SET nrelated_with_subtags = nrelated + COALESCE((
SELECT SUM(subt.nrelated)
FROM part p
INNER JOIN tag subt ON subt.id = p.id
WHERE p.part_of = supertag.id
AND p.part_of_subproperty = s_subtag_of_id), 0)
WHERE id = supertag.id;
END LOOP;
END;
$$ LANGUAGE 'plpgsql';
CREATE FUNCTION update_nrelated_if_subtag(tag_id resource.id%TYPE, property resource.id%TYPE) RETURNS VOID AS $$
DECLARE
s_subtag_of resource.label%TYPE := 'http://www.nongnu.org/samizdat/rdf/schema#subTagOf';
s_subtag_of_id resource.id%TYPE;
BEGIN
SELECT id INTO s_subtag_of_id FROM resource
WHERE label = s_subtag_of;
IF property = s_subtag_of_id THEN
PERFORM update_nrelated(tag_id);
END IF;
END;
$$ LANGUAGE 'plpgsql';
-- Recalculate the rating of a proposition whenever a vote on it is
-- inserted, updated, or deleted.
CREATE FUNCTION update_rating() RETURNS TRIGGER AS $$
DECLARE
dc_relation resource.label%TYPE := 'http://purl.org/dc/elements/1.1/relation';
proposition_id statement.id%TYPE;
old_rating statement.rating%TYPE;
new_rating statement.rating%TYPE;
tag_id resource.id%TYPE;
predicate_uriref resource.label%TYPE;
BEGIN
-- NEW is not assigned for DELETE operations, fall back to OLD
IF TG_OP = 'DELETE' THEN
proposition_id := OLD.proposition;
ELSE
proposition_id := NEW.proposition;
END IF;
-- save some values for later reference
SELECT s.rating, s.object, p.label
INTO old_rating, tag_id, predicate_uriref
FROM statement s
INNER JOIN resource p ON s.predicate = p.id
WHERE s.id = proposition_id;
-- set new rating of the proposition
new_rating := calculate_statement_rating(proposition_id);
UPDATE statement SET rating = new_rating WHERE id = proposition_id;
-- check if new rating reverts truth value of the proposition
IF predicate_uriref = dc_relation
AND (((old_rating IS NULL OR old_rating <= 0) AND new_rating > 0) OR
(old_rating > 0 AND new_rating <= 0))
THEN
PERFORM update_nrelated(tag_id);
END IF;
RETURN NULL;  -- return value of an AFTER row trigger is ignored
END;
$$ LANGUAGE 'plpgsql';
CREATE TRIGGER update_rating AFTER INSERT OR UPDATE OR DELETE ON vote
FOR EACH ROW EXECUTE PROCEDURE update_rating();
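-- Reject changes that would create a cycle in the part_of hierarchy:
-- a looping part_of link is unset instead of failing the whole query.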
CREATE FUNCTION before_update_part() RETURNS TRIGGER AS $$
BEGIN
IF TG_OP = 'INSERT' THEN
IF NEW.part_of IS NULL THEN
RETURN NEW;
END IF;
ELSIF TG_OP = 'UPDATE' THEN
IF (NEW.part_of IS NULL AND OLD.part_of IS NULL) OR
((NEW.part_of = OLD.part_of) AND (NEW.part_of_subproperty = OLD.part_of_subproperty))
THEN
-- part_of is unchanged, do nothing
RETURN NEW;
END IF;
END IF;
-- check for loops
IF NEW.part_of = NEW.id OR NEW.part_of IN (
SELECT id FROM part WHERE part_of = NEW.id)
THEN
-- unset part_of, but don't fail whole query
NEW.part_of = NULL;
NEW.part_of_subproperty = NULL;
IF TG_OP != 'INSERT' THEN
-- check it was a subtag link
PERFORM update_nrelated_if_subtag(OLD.id, OLD.part_of_subproperty);
END IF;
RETURN NEW;
END IF;
RETURN NEW;
END;
$$ LANGUAGE 'plpgsql';
CREATE TRIGGER before_update_part BEFORE INSERT OR UPDATE ON resource
FOR EACH ROW EXECUTE PROCEDURE before_update_part();
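-- Maintain the transitive closure of part_of links in the 'part' table
-- and refresh tag statistics when a subtag link is affected.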
CREATE FUNCTION after_update_part() RETURNS TRIGGER AS $$
BEGIN
IF TG_OP = 'INSERT' THEN
IF NEW.part_of IS NULL THEN
RETURN NEW;
END IF;
ELSIF TG_OP = 'UPDATE' THEN
IF (NEW.part_of IS NULL AND OLD.part_of IS NULL) OR
((NEW.part_of = OLD.part_of) AND (NEW.part_of_subproperty = OLD.part_of_subproperty))
THEN
-- part_of is unchanged, do nothing
RETURN NEW;
END IF;
END IF;
IF TG_OP != 'INSERT' THEN
IF OLD.part_of IS NOT NULL THEN
-- clean up links generated for old part_of
DELETE FROM part
WHERE id IN (
-- for old resource...
SELECT OLD.id
UNION
--...and all its parts, ...
SELECT id FROM part WHERE part_of = OLD.id)
AND part_of IN (
-- ...remove links to all parents of old resource
SELECT part_of FROM part WHERE id = OLD.id)
AND part_of_subproperty = OLD.part_of_subproperty;
END IF;
END IF;
IF TG_OP != 'DELETE' THEN
IF NEW.part_of IS NOT NULL THEN
-- generate links to the parent and grand-parents of new resource
INSERT INTO part (id, part_of, part_of_subproperty, distance)
SELECT NEW.id, NEW.part_of, NEW.part_of_subproperty, 1
UNION
SELECT NEW.id, part_of, NEW.part_of_subproperty, distance + 1
FROM part
WHERE id = NEW.part_of
AND part_of_subproperty = NEW.part_of_subproperty;
-- generate links from all parts of new resource to all its parents
INSERT INTO part (id, part_of, part_of_subproperty, distance)
SELECT child.id, parent.part_of, NEW.part_of_subproperty,
child.distance + parent.distance
FROM part child
INNER JOIN part parent
ON parent.id = NEW.id
AND parent.part_of_subproperty = NEW.part_of_subproperty
WHERE child.part_of = NEW.id
AND child.part_of_subproperty = NEW.part_of_subproperty;
END IF;
END IF;
-- check if subtag link was affected
IF TG_OP != 'DELETE' THEN
PERFORM update_nrelated_if_subtag(NEW.id, NEW.part_of_subproperty);
END IF;
IF TG_OP != 'INSERT' THEN
PERFORM update_nrelated_if_subtag(OLD.id, OLD.part_of_subproperty);
END IF;
RETURN NEW;
END;
$$ LANGUAGE 'plpgsql';
CREATE TRIGGER after_update_part AFTER INSERT OR UPDATE OR DELETE ON resource
FOR EACH ROW EXECUTE PROCEDURE after_update_part();
ruby-graffiti-2.2/doc/papers/ 0000775 0000000 0000000 00000000000 11764675307 0016206 5 ustar 00root root 0000000 0000000 ruby-graffiti-2.2/doc/papers/collreif.tex 0000664 0000000 0000000 00000051000 11764675307 0020523 0 ustar 00root root 0000000 0000000 \documentclass{llncs}
\usepackage{makeidx} % allows for indexgeneration
\usepackage[pdfpagescrop={92 112 523 778},a4paper=false,
pdfborder={0 0 0}]{hyperref}
\emergencystretch=8pt
%
\begin{document}
\mainmatter % start of the contributions
%
\title{Model for Collaborative Decision Making Based on RDF Reification}
\toctitle{Model for Collaborative Decision Making Based on RDF Reification}
\titlerunning{Collaboration and RDF Reification}
%
\author{Dmitry Borodaenko}
\authorrunning{Dmitry Borodaenko} % abbreviated author list (for running head)
%%%% modified list of authors for the TOC (add the affiliations)
\tocauthor{Dmitry Borodaenko}
%
\institute{\email{angdraug@debian.org}}
\maketitle % typeset the title of the contribution
\begin{abstract}
This paper presents a novel approach to online collaboration on the Web,
intended as a technical means to make collective decisions in situations
when consensus fails. It is proposed that participants of the process are
allowed to create statements about site resources and, by means of RDF
reification, to assert personal approval of such statements. Arbitrary
algorithms may then be used to determine the validity of a statement in a
given context from the set of approval statements by different
participants. The paper goes on to discuss the applicability of the
proposed approach in the areas of open-source development and independent
media, and describes its implementation in the Samizdat open publishing
and collaboration system.
\section{Introduction}
Extensive growth of the Internet over the last decades has introduced a
new form of human collaboration: online communities. Availability of
cheap digital communication media has made it possible to form large
distributed projects, bringing together participants who would otherwise
be unable to cooperate. As more and more projects go online and spread
across the globe, it becomes apparent that new opportunities in remote
cooperation also bring forth new challenges. As observed by Stephen
Talbott\cite{fdnc}, technological means do not provide a full substitute
for real person-to-person relations: ``technology is not a community''.
A well-known example of this is the fact that it is vital for an online
community to augment indirect and impersonal digital communications with
live meetings. However, even regular live meetings do not solve all of
the remote cooperation problems, as they are limited in time and scope,
and thus can't happen often enough nor include all of the interested
parties. In particular, one of the problems of online communities that
is begging for a new and better technical solution is decision making
and dispute resolution.
While it is most common that online communities are formed by volunteers,
their forms of governance are not necessarily democratic and vary widely,
from primitive single-person leadership and meritocracy in less formal
technical projects to consensus and majority voting in more complicated
situations. Usually, decision making in online volunteer projects is
carried out via traditional communication means, such as IRC channels,
mailing lists, newsgroups, etc., with rare exceptions such as the Debian
project, which employs its own Devotee voting system based on PGP
authentication and Condorcet vote counting\cite{debian-constitution}, and
the Wikipedia project, which relies on a Wiki collaborative publishing
system and enforces consensus among its contributors. The scale and the
level of quality achieved by the latter two projects demonstrate that a
formalized collaboration process is as important for volunteer projects
as elsewhere: while sufficient to determine rough consensus, traditional
communications require participants to come up with informal means of
dispute resolution, making the whole process overly dependent on
interpersonal attitudes and communicative skills within the group.
This is not to say that the Debian or Wikipedia processes are perfect and
need no improvement. The strict consensus required by the Wikipedia
Editors Policy discourages the dissenting minority from participation,
while a full-scale voting system like Debian Devotee can't be used for
every minor day-to-day decision because of the high overhead involved and
the limits imposed by the ballot form.
This paper describes how RDF statement approval based on reification can be
applied to the problem of online decision making in diverse and politically
intensive distributed projects, and proposes a generic semantic model which
can be used in a wide range of applications involving online collaboration.
The proposed model is implemented in the Samizdat open-publishing and
collaboration engine, described later in the paper.
\section{Collaboration Model}
The collaboration model implemented by Samizdat revolves around the
concept of \emph{open editing}\cite{opened}, which includes the processes
of publishing, structuring, and filtering online content. The ``open''
part of open editing implies that the collaboration process is visible to
all participants, and the roles of readers and editors are available
equally to everyone.
\emph{Publishing} involves posting new documents, comments, and revised
documents. \emph{Structuring} involves categorization and appraisal of
publications and other actions of fellow participants. \emph{Filtering}
is intended to reduce the information flow to a comprehensible level by
presenting a user with resources of the highest quality and relevance.
Each of these processes requires a fair amount of decision making, which
means that their effectiveness can be greatly improved by automating some
aspects of the decision making procedure.
\section{Collective Statement Approval}
%
\subsection{Focus-Centered Site Structure}
In the proposed collaboration model, RDF statements are used as a generic
mechanism for structuring site content. While it is possible to make any
kind of statement about site resources, the most important kind of
statement is the one that relates a resource to a so-called
``focus''\cite{concepts}. A \emph{focus} is a kind of resource that, when
related by an RDF statement to other resources, allows similar resources
to be grouped together and evaluated against different criteria. In some
sense, all activities of project members are represented as relations
between resources and focuses.
Dynamically grouping resources around different focuses allows project
members to concentrate on the resources that are most relevant to their
area of interest and provide the best quality. Use of RDF for site
structure description makes it possible to store and exchange filters for
site resource selection in the form of RDF queries, thus allowing
participants to share their preferences and ensuring interoperability
with RDF-aware agents.
Since any resource can be used as a focus, project members may define
their own focuses and relate focuses one to another. In a sufficiently
large and active project, this feature should help the site structure
evolve in accordance with the usage patterns of different groups of
users.
\subsection{RDF Reification}
RDF reification provides a mechanism for describing RDF statements. As
defined in ``RDF Semantics''\cite{rdf-mt}, assertion of a reification of
an RDF statement means that a document exists containing a triple token
instantiating the statement. The reified triple is a resource which can
be described in the same way as any other resource. It is important to
note that there can be several triple tokens with the same subject,
predicate, and object, and, according to RDF reification semantics, such
tokens should be treated as separate resources, possibly with different
composition or provenance information attached to each.
\subsection{Proposition and Vote}
In the proposed model, all statements are reified and may be voted upon
by project members. To distinguish statements with attached votes, they
are called ``propositions''. A \emph{proposition} is a subclass of RDF
statement which can be approved or disapproved by votes of project
members. Accordingly, a \emph{vote} is a record of a vote cast for or
against a particular proposition by a particular member, and
\emph{rating} is a measure of approval of the proposition as determined
from the individual votes.
The exact mechanism of rating calculation can be determined by each site,
or even each user, individually: according to the average value of votes
cast, the level of trust between the user and particular voters, the
absolute number of votes cast, etc. Since individual votes are recorded
in RDF and are available for later extraction, the rating can be
calculated at any time using any formula that suits the end user best.
Some users may choose to share their view of the site resources and
publish their filters in the form of RDF queries.
The default rating system in Samizdat lets the voter select from ratings
``$-2$'' (no), ``$-1$'' (not likely), ``$0$'' (uncertain), ``$1$''
(likely), ``$2$'' (yes). The total rating of a proposition is equal to
the average value of all votes cast for the proposition; resources with a
rating below ``$-1$'' are hidden from view.
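For example, three votes of ``$2$'', ``$1$'', and ``$-1$'' cast for the
same proposition yield a rating of $(2 + 1 - 1)/3 \approx 0.67$, which is
enough to keep the affected resource visible.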
\section{Target Applications and Use Cases}
%
\subsection{Open Publishing}
While it is vital for any project to come up with fair and predictable
methods of decision making, it's hard to find a more typical example than
the Indymedia network, an international open publishing project with the
aim of providing the public with an unbiased news source\cite{openpub}.
Since the main focus of Indymedia is politics, and since it is explicitly
open to everyone, independent media centers are used by people from all
parts of the political spectrum, and often become a place of heated
debate, or even the target of flood attacks.
This conflict between fairness and political bias, as well as the sheer
amount of information flowing through the news network, creates a need
for a more flexible categorization and filtering system that would take
the burden and responsibility of moderation off site administrators. The
issue of developing an open editing system was raised by Indymedia
project participants in January 2002, but, to date, implementations of
this concept are not ready for production use. The Active2
project\cite{active2}, which has set forth to fulfil that role, is still
in the alpha stage of development, and, unlike Samizdat, limits its use
of RDF to describing its resources with Dublin Core meta-data.
Implementation of an open editing system was one of the initial goals of
the Samizdat project\cite{oscom3}, and deployment of the Samizdat engine
by an independent media center would be a decisive trial of the vitality
of the proposed collaboration model in a real-world environment.
\subsection{Documentation Development}
The complexity of modern computer systems makes it impossible to develop
and operate them without extensive user and developer manuals which
document the intended behaviour of a system and describe solutions to
typical user problems. Ultimately, such manuals reflect collective
knowledge about a system, and may require input from many different
people with different perspectives. On the other hand, in order to be
useful to different people, documentation should be well-structured and
easy to navigate.
The most popular solution for collaborative documentation development to
date is \emph{Wiki}, a combination of very simple hypertext markup and
the ability to edit documents within an HTML form. Such simplicity makes
Wiki easy to use, but at the same time limits its applicability to large
bodies of documentation. Being limited to basic hypertext without
categorization and filtering capabilities, Wiki sites require a huge
amount of manual editing by trusted maintainers in order to keep the site
structure from falling behind the growing amount of available
information, and to protect it from vandals. Although there are
successful examples of large Wiki sites (the most prominent being the
Wikipedia project), Wiki does not provide sufficient infrastructure for
development and maintenance of complex technical documentation.
Combining the Wiki approach with RDF metadata, along with implementing
the proposed collaborative decision making model for determination of
documentation structure, would allow significant progress in the
adoption of open-source software, which often suffers from a lack of
comprehensive and up-to-date documentation.
\subsection{Bug Tracking}
Bug-tracking tools have grown to become an essential component of any
software development process. However, despite wide adoption,
bug-tracking software has not yet reached maturity: interoperability
between different tools is missing; incompatible issue classifications
and work flows complicate status synchronization between companies
collaborating on a single project; lack of integration with
time-management, document management, version control, and other kinds
of applications increases the amount of routine work done by the project
manager.
On the other hand, development of integrated project management systems
shows that the most important problem in project management automation is
the convergence of information from all sources in a single focal point.
For such convergence to become possible, a unified process flow model,
based on open standards such as RDF, should be adopted across all
information sources, from source code version control to developer
forums. Since strict provenance tracking is a key requirement for such a
model, the proposed reification-based approach may be employed to
satisfy it.
\section{Samizdat Engine}
%
\subsection{Project Status}
The Samizdat engine is implemented in the Ruby programming language and
relies on the PostgreSQL database management system for RDF storage.
Other programs required for Samizdat deployment are the Ruby/Postgres,
Ruby/DBI, and YAML4R libraries for Ruby, and the Apache web server with
the mod\_ruby module. Samizdat is free software and does not require any
non-free software to run\cite{impl-report}.
Samizdat project development started in December 2002; the first public
release was announced in June 2003. As of the second beta version 0.5.1,
released in March 2004, Samizdat provided a basic set of open publishing
functionality, including registering site members, publishing and
replying to messages, uploading multimedia messages, voting on relations
of site focuses to resources, creating and managing new focuses, and
hand-editing or using a GUI for constructing and publishing Squish
queries that can be used to search and filter site resources. The next
major release 0.6.0 is expected to add collaborative documentation
development functionality.
\subsection{Samizdat Schema}
The core representation of Samizdat content is RDF. Any new resource
published on a Samizdat site is automatically assigned a unique numeric
ID, which, when appended to the base site URL, forms the resource URIref.
This ID may be accessed via the {\tt id} property. The publication time
stamp is recorded in the {\tt dc:date} property (here and below, the
``{\tt dc:}'' prefix refers to the Dublin Core namespace):
\begin{verbatim}
:id
rdfs:domain rdfs:Resource .
dc:date
rdfs:domain rdfs:Resource .
\end{verbatim}
A {\tt Member} is a registered user of a Samizdat site (synonyms: poster,
visitor, reader, author, creator). Members can post messages, create
focuses, relate messages to focuses, vote on relations, view messages,
and use and publish filters based on relations between messages and
focuses.
\begin{verbatim}
:Member
rdfs:subClassOf rdfs:Resource .
:login
rdfs:domain :Member ;
rdfs:range rdfs:Literal .
\end{verbatim}
Resources are related to focuses with the {\tt dc:relation} property:
\begin{verbatim}
:Focus
rdfs:subClassOf rdfs:Resource .
dc:relation
rdfs:domain rdfs:Resource ;
rdfs:range :Focus .
\end{verbatim}
A {\tt Proposition} is an RDF statement with a {\tt rating} property. The
value of {\tt rating} is calculated from the {\tt voteRating} values of
the individual {\tt Vote} resources attached to the proposition via the
{\tt voteProposition} property:
\begin{verbatim}
:Proposition
rdfs:subClassOf rdf:Statement .
:rating
rdfs:domain :Proposition ;
rdfs:range rdfs:Literal .
:Vote
rdfs:subClassOf rdfs:Resource .
:voteProposition
rdfs:domain :Vote ;
rdfs:range :Proposition .
:voteMember
rdfs:domain :Vote ;
rdfs:range :Member .
:voteRating
rdfs:domain :Vote ;
rdfs:range rdfs:Literal .
\end{verbatim}
Parts of the Samizdat schema that are not relevant to the discussed
collective decision making model, such as discussion threads, version
control, and aggregate messages, were omitted. The full Samizdat schema
in N3 notation can be found in the Samizdat source code package.
\subsection{RDF Storage Implementation}
To address scalability concerns, Samizdat extends the traditional
relational representation of RDF as a table of \{subject, predicate,
object\} triples with a unique RDF-to-relational query translation
technology. The most heavily used RDF properties of the Samizdat schema
are mapped into fields of \emph{internal resource tables} corresponding
to resource classes, with the id of each record referencing the {\tt
Resource} table; all other properties are recorded as triples in the
{\tt Statement} table. A detailed explanation of the RDF-to-relational
mapping can be found in the ``Samizdat RDF
Storage''\cite{rdf-storage} document.
To demonstrate usage of the Samizdat RDF schema described earlier in this
section, an excerpt of the Ruby code responsible for individual vote
rating assignment is quoted below.
\begin{verbatim}
def rating=(value)
value = Focus.validate_rating(value)
if value then
rdf.assert %{
UPDATE ?rating = '#{value}'
WHERE (rdf::subject ?stmt #{resource.id})
(rdf::predicate ?stmt dc::relation)
(rdf::object ?stmt #{@id})
(s::voteProposition ?vote ?stmt)
(s::voteMember ?vote #{session.id})
(s::voteRating ?vote ?rating)
USING PRESET NS}
@rating = nil # invalidate rating cache
end
end
\end{verbatim}
In this attribute assignment method of the {\tt Focus} class, an RDF
assertion is recorded in extended Squish syntax and populated with
variables holding the rating {\tt value}, the resource identifier {\tt
resource.id}, the focus identifier {\tt @id}, and the identifier of the
registered member {\tt session.id}. When the Samizdat RDF storage layer
updates {\tt Vote.voteRating}, the average value of the corresponding
{\tt Proposition.rating} is recalculated by a stored procedure.
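The stored procedure in question reduces the individual votes to their
average; it is quoted below from the PostgreSQL triggers distributed
with Samizdat:
\begin{verbatim}
CREATE FUNCTION calculate_statement_rating(
    statement_id statement.id%TYPE)
    RETURNS statement.rating%TYPE AS $$
BEGIN
    RETURN (SELECT AVG(rating) FROM vote
            WHERE proposition = statement_id);
END;
$$ LANGUAGE 'plpgsql';
\end{verbatim}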
\section{Conclusions}
Initially started as an RDF-based open-publishing engine, the Samizdat
project opens a new approach to online collaboration in general. The
proposed model of collective statement approval via RDF reification is
applicable to a large range of problem domains, including documentation
development and bug tracking.
Implementation of the proposed model in the Samizdat engine proves the
viability of RDF not only as a metadata interchange format, but also as
a data model that may be employed by software architects in innovative
ways. The key role played by RDF reification in the described model
shows that this comparatively obscure part of the RDF standard deserves
broader mindshare among Semantic Web developers.
% ---- Bibliography ----
%
\begin{thebibliography}{19}
%
\bibitem {openpub}
Arnison, Matthew:
Open publishing is the same as free software, 2002\\
http://www.cat.org.au/maffew/cat/openpub.html
\bibitem {concepts}
Borodaenko, Dmitry:
Samizdat Concepts, December 2002\\
http://savannah.nongnu.org/cgi-bin/viewcvs/samizdat/samizdat/doc/\\
concepts.txt
\bibitem {rdf-storage}
Borodaenko, Dmitry:
Samizdat RDF Storage, December 2002\\
http://savannah.nongnu.org/cgi-bin/viewcvs/samizdat/samizdat/doc/\\
rdf-storage.txt
\bibitem {oscom3}
Borodaenko, Dmitry:
Samizdat --- RDF model for an open publishing and cooperation engine. Third
International OSCOM Conference, Berkman Center for Internet and Society,
Harvard Law School, May 2003\\
http://slideml.bitflux.ch/files/slidesets/503/title.html
\bibitem {impl-report}
Borodaenko, Dmitry:
Samizdat RDF Implementation Report, September 2003\\
http://lists.w3.org/Archives/Public/www-rdf-interest/2003Sep/0043.html
\bibitem {debian-constitution}
Debian Constitution. Debian Project, 1999\\
http://www.debian.org/devel/constitution
\bibitem {rdf-mt}
Hayes, Patrick:
RDF Semantics. W3C, February 2004\\
http://www.w3.org/TR/rdf-mt
\bibitem {opened}
Jay, Dru:
Three Proposals for Open Publishing --- Towards a transparent, collaborative
editorial framework, 2002\\
http://dru.ca/imc/open\_pub.html
\bibitem {fdnc}
Talbott, Stephen L.:
The Future Does Not Compute. O'Reilly \& Associates, 1995\\
http://www.oreilly.com/\~{}stevet/fdnc/
\bibitem {active2}
Warren, Mike:
Active2 Design. Indymedia, 2002.\\
http://docs.indymedia.org/view/Devel/DesignDocument
\end{thebibliography}
\end{document}
ruby-graffiti-2.2/doc/papers/rdf-to-relational-query-translation-icis2009.tex 0000664 0000000 0000000 00000110044 11764675307 0027212 0 ustar 00root root 0000000 0000000 \documentclass[conference,letterpaper]{IEEEtran}
\usepackage{graphicx}
%\usepackage{multirow}
%\usepackage{ragged2e}
\usepackage{algpseudocode}
\usepackage[cmex10]{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{fancyvrb}
\usepackage{pstricks,pst-node}
\usepackage[pdftitle={On-demand RDF to Relational Query Translation in
Samizdat RDF Store},
pdfauthor={Dmitry Borodaenko},
pdfkeywords={Semantic Web, RDF, relational databases, query
language, Samizdat},
pdfborder={0 0 0}]{hyperref}
%\urlstyle{rm}
\emergencystretch=8pt
\interdisplaylinepenalty=2500
%
\begin{document}
%
\title{On-demand RDF to Relational Query Translation in Samizdat RDF
Store}
%
\author{\IEEEauthorblockN{Dmitry Borodaenko}
\IEEEauthorblockA{Belarusian State University of Informatics and
Radioelectronics\\
6 Brovki st., Minsk, Belarus\\
Email: angdraug@debian.org}}
\maketitle % typeset the title of the contribution
\begin{abstract}
This paper presents an algorithm for on-demand translation of RDF
queries that allows to map any relational data structure to RDF model,
and to perform queries over a combination of mapped relational data and
arbitrary RDF triples with a performance comparable to that of
relational systems. Query capabilities implemented by the algorithm
include optional and negative graph patterns, nested sub-patterns, and
limited RDFS and OWL inference backed by database triggers.
\end{abstract}
\section{Introduction}
\label{introduction}
% motivation for the proposed solution
A wide range of solutions that map relational data to the RDF data model
has accumulated to date~\cite{triplify}. There are several factors that
make integration of RDF and relational data important for the adoption
of the Semantic Web. One reason, shared with RDF stores based on a
triples table, is the wide availability of mature relational database
implementations which have seen decades of improvements in reliability,
scalability, and performance. Second is the fact that most of the
structured data available online is backed by relational databases. This
data is not likely to be replaced by pure RDF stores in the near future,
so it has to be mapped in one way or another to become available to RDF
agents. Finally, a properly normalized and indexed application-specific
relational database schema allows a DBMS to optimize complex queries in
ways that are not possible for a tree of joins over a single triples
table~\cite{sp2b}.
% what is unique about the proposed solution
In the Samizdat open publishing engine, most of the data fits into the
relational model, with the exception of reified RDF statements, which
are used in the collaborative decision making
process~\cite{samizdat-collreif} and require a more generic triple
store. The need for a generic RDF store with performance on par with a
relational database is the primary motivation behind the design of the
Samizdat RDF storage module, which is different from both triples table
based RDF stores and relational-to-RDF mapping systems. Unlike the
former, Samizdat can run optimized SQL queries over application-specific
tables, but unlike the latter, it is not limited by the relational
database schema and can fall back, within the same query, to a triples
table for RDF predicates that are not mapped to the relational model.
% structure of the paper
The following sections of this paper describe: targeted relational data,
database triggers required for RDFS and OWL inference, the query
translation algorithm, the update request execution algorithm, details
of the algorithm implementation in Samizdat, analysis of its
performance, comparison with related work, and an outline of future
work.
\section{Relational Data}
\label{relational-data}
% formal definition of data targeted for storage
The Samizdat RDF storage module does not impose additional restrictions
on the underlying relational database schema beyond the requirements of
the SQL standard. Any legacy database may be adapted for RDF access
while retaining backwards compatibility with existing SQL queries.
The adaptation process involves adding attributes, foreign keys, tables,
and triggers to the database to enable RDF query translation and to
support optional features of the Samizdat RDF store, such as statement
reification and inference for {\em rdfs:sub\-Class\-Of}, {\em
rdfs:sub\-Property\-Of}, and {\em owl:Transitive\-Property\/} rules.
The following database schema changes are required in all cases (a
sketch of an adapted schema is given after this list):
\begin{itemize}
\item create an {\em rdfs:Resource\/} superclass table with an
autogenerated primary key;
\item replace primary keys of mapped subclass tables with foreign keys
referencing the {\em rdfs:Resource\/} table (existing foreign keys may
need to be updated to reflect this change);
\item register {\em rdfs:subClassOf\/} inference database triggers to
update the Resource table and maintain foreign key integrity on all
changes in mapped subclass tables.
\end{itemize}
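As an illustration of the first two changes, a minimal sketch of an
adapted schema, following the table layout used in the Samizdat database
triggers, may look as follows:
\begin{Verbatim}[fontsize=\scriptsize]
CREATE TABLE Resource (
    id SERIAL PRIMARY KEY,  -- rdfs:Resource superclass key
    literal BOOLEAN,
    uriref BOOLEAN,
    label TEXT              -- class label or URIref
);
CREATE TABLE Message (      -- mapped subclass table
    id INTEGER PRIMARY KEY REFERENCES Resource (id),
    title TEXT,
    content TEXT
);
\end{Verbatim}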
The following changes may be necessary to support optional RDF mapping
features:
\begin{itemize}
\item register database triggers for other cases of {\em
rdfs:sub\-Class\-Of\/} entailment;
\item create triples table (required to represent non-relational RDF
data and RDF statement reification);
\item add subproperty qualifier attributes referencing the property
URIref entry in the {\em rdfs:Resource\/} table for each attribute
mapped to a superproperty;
\item create transitive closure tables, register {\em
owl:TransitivePro\-perty\/} inference triggers.
\end{itemize}
\section{Inference and Database Triggers}
\label{inference-triggers}
The Samizdat RDF storage module implements entailment rules for the
following RDFS predicates and OWL classes: {\em rdfs:sub\-Class\-Of},
{\em rdfs:sub\-Property\-Of}, {\em owl:Transitive\-Property}. Database
triggers are used to minimize the impact of RDFS and OWL inference on
query performance:
{\em rdfs:subClassOf\/} inference triggers are invoked on every insert
into and delete from a subclass table. When a tuple without a primary
key is inserted,\footnote{Insertion into a subclass table with an
explicit primary key is used in two-step resource insertion during
execution of an RDF update command (described in
section~\ref{update-execution}).} a template tuple is inserted into the
superclass table and the produced primary key is added to the new
subclass tuple. The delete operation is cascaded to all subclass and
superclass tables.
{\em rdfs:subPropertyOf\/} inference is performed during query
translation, with the help of a stored procedure that returns the
attribute value when the subproperty qualifier attribute is set, and
NULL otherwise.
{\em owl:TransitiveProperty\/} inference uses a separate transitive
closure table for each relational attribute mapped to a transitive
property. Transitive closure tables are maintained by triggers invoked
on each insert, update, and delete operation involving such an
attribute.
The transitive closure update algorithm is presented in
\figurename~\ref{transitive-closure}. The input to the algorithm is:
\begin{itemize}
\item directed labeled graph $G = \langle N, A \rangle$ where $N$ is a
set of nodes representing RDF resources and $A$ is a set of arcs $a =
\langle s, p, o \rangle$ representing RDF triples;
\item transitive property $\tau$;
\item subgraph $G_\tau \subseteq G$ such that:
\begin{equation}
a_\tau = \langle s, p, o \rangle \in G_\tau \iff
a_\tau \in G \, \wedge \, p = \tau \, ;
\end{equation}
\item graph $G_\tau^+$ containing transitive closure of $G_\tau$;
\item update operation $\omega \in \{insert, update, delete\}$ and its
parameters $a_{old} = \langle s_\omega, \tau, o_{old} \rangle$, $a_{new}
= \langle s_\omega, \tau, o_{new} \rangle$ such that:
\begin{equation}
G_\tau' = (G_\tau \setminus \{ a_{old} \}) \cup \{ a_{new} \} \, .
\end{equation}
\end{itemize}
The algorithm transforms $G_\tau^+$ into a transitive closure of
$G_\tau'$. The algorithm assumes that $G_\tau$ is and should remain
acyclic.
\begin{figure}
\begin{algorithmic}[1]
\If {$o_{new} = s_\omega$ or $\langle o_{new}, \tau, s_\omega \rangle \in G_\tau^+$}
\State stop
\Comment refuse to create a cycle in $G_\tau$
\EndIf
\State $G_\tau \gets G_\tau'$
\Comment apply $\omega$
\If {$\omega \in \{update, delete\}$}
\State $G_\tau^+ \gets G_\tau^+ \setminus
\{ \langle s, \tau, o \rangle \mid
(s = s_\omega \, \vee \,
\langle s, \tau, s_\omega \rangle \in G_\tau^+) \, \wedge \,
\langle s_\omega, \tau, o \rangle \in G_\tau^+ \}$
\Comment remove obsolete arcs from $G_\tau^+$
\EndIf
\If {$\omega \in \{insert, update\}$}
\Comment add new arcs to $G_\tau^+$
\State $G_\tau^+ \gets G_\tau^+ \cup
\{ \langle s_\omega, \tau, o \rangle \mid
o = o_{new} \, \vee \,
\langle o_{new}, \tau, o \rangle \in G_\tau^+ \}$
\State $G_\tau^+ \gets G_\tau^+ \cup
\{ \langle s, \tau, o \rangle \mid
\langle s, \tau, s_\omega \rangle \in G_\tau^+ \, \wedge \,
\langle s_\omega, \tau, o \rangle \in G_\tau^+ \}$
\EndIf
\end{algorithmic}
\caption{Update transitive closure}
\label{transitive-closure}
\end{figure}
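For illustration, the arc-addition step of this algorithm is implemented
for the {\em dct:isPartOf\/} closure table {\tt part} by the following
excerpt from the Samizdat PostgreSQL trigger function {\tt
after\_update\_part}:
\begin{Verbatim}[fontsize=\scriptsize]
-- generate links to the parent and grand-parents of new resource
INSERT INTO part (id, part_of, part_of_subproperty, distance)
SELECT NEW.id, NEW.part_of, NEW.part_of_subproperty, 1
UNION
SELECT NEW.id, part_of, NEW.part_of_subproperty, distance + 1
FROM part
WHERE id = NEW.part_of
AND part_of_subproperty = NEW.part_of_subproperty;
\end{Verbatim}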
\section{Query Pattern Translation}
\label{query-translation}
The class structure of the Samizdat RDF storage module is as follows.
The external API is provided by the {\tt RDF} class. RDF storage
configuration as described in section~\ref{relational-data} is
encapsulated in the {\tt RDFConfig} class. The concrete syntax of
Squish~\cite{samizdat-rel-rdf,squish} and SQL is abstracted into {\tt
SquishQuery} and its subclasses. The query pattern translation algorithm
is implemented by the {\tt SqlMapper} class.
% prerequisites
The input to the algorithm is as follows:
\begin{itemize}
\item mappings $M = \langle M_{rel}, M_{attr}, M_{sub}, M_{trans}
\rangle$ where $M_{rel}: P \to R$, $M_{attr}: P \to \Phi$, $M_{sub}: P
\to S$, $M_{trans}: P \to T$; $P$ is a set of mapped RDF properties, $R$ is
a set of relations, $\Phi$ is a set of relation attributes, $S \subset
P$ is a subset of RDF properties that have configured subproperties, $T
\subset R$ is a set of transitive closures (as described in
sections~\ref{relational-data} and \ref{inference-triggers});
\item graph pattern $\Psi = \langle \Psi_{nodes}, \Psi_{arcs} \rangle =
\Pi \cup N \cup \Omega$, where $\Pi$, $N$, and $\Omega$ are main ("must
bind"), negative ("must not bind"), and optional ("may bind") graph
patterns respectively, such that $\Pi$, $N$, and $\Omega$ share no arcs,
and $\Pi$, $\Pi \cup N$ and $\Pi \cup \Omega$ are joint
graphs.\footnote{Arcs with the same subject, object, and predicate but
different bind mode are treated as distinct.}
\item global filter condition $F_g \in F$ and local filter conditions
$F_c: \Psi_{arcs} \to F$ where $F$ is a set of all literal conditions
expressible in the query language syntax.
\end{itemize}
For example, consider the following Squish query and its graph pattern
$\Psi$ presented in \figurename~\ref{graph-pattern}.
\begin{Verbatim}[fontsize=\scriptsize]
SELECT ?msg
WHERE (rdf::predicate ?stmt dc::relation)
(rdf::subject ?stmt ?msg)
(rdf::object ?stmt ?tag)
(dc::date ?stmt ?date)
(s::rating ?stmt ?rating
FILTER ?rating >= :threshold)
EXCEPT (dct::isPartOf ?msg ?parent)
OPTIONAL (dc::language ?msg ?original_lang)
(s::isTranslationOf ?msg ?translation)
(dc::language ?translation ?translation_lang)
LITERAL ?original_lang = :lang
OR ?translation_lang = :lang
GROUP BY ?msg
ORDER BY max(?date) DESC
\end{Verbatim}
\begin{figure}
\centering
\psset{unit=3.8mm,labelsep=0.2pt}
\begin{pspicture}[showgrid=false](0,0)(23,12)
\footnotesize
\rput(2.5,5.5){\ovalnode{msg}{\sl ?msg}}
\rput(10,8){\ovalnode{stmt}{\sl ?stmt}}
\rput(2.5,8){\ovalnode{rel}{\it dc:relation}}
\rput(5,10.5){\ovalnode{tag}{\sl ?tag}}
\rput(14,10.5){\ovalnode{date}{\sl ?date}}
\rput(17,8){\ovalnode{rating}{\sl ?rating}}
\rput(14,5.5){\ovalnode{parent}{\sl ?parent}}
\rput(8,1){\ovalnode{origlang}{\sl ?original\_lang}}
\rput(11.2,3.3){\ovalnode{trans}{\sl ?translation}}
\rput(19.2,1){\ovalnode{translang}{\sl ?translation\_lang}}
\ncline{<-}{msg}{stmt} \aput{:U}(0.4){\it rdf:subject}
\ncline{<-}{rel}{stmt} \aput{:U}{\it rdf:predicate}
\ncline{<-}{tag}{stmt} \aput{:U}{\it rdf:object}
\ncline{->}{stmt}{date} \aput{:U}{\it dc:date}
\ncline{->}{stmt}{rating} \aput{:U}{\it s:rating}
\ncline{->}{msg}{parent} \aput{:U}(0.6){\it dct:isPartOf}
\ncline{->}{msg}{origlang} \aput{:U}(0.6){\it dc:language}
\ncline{<-}{msg}{trans} \aput{:U}(0.65){\it s:isTranslationOf}
\ncline{->}{trans}{translang} \aput{:U}(0.6){\it dc:language}
\psccurve[curvature=0.75 0.1 0,linestyle=dashed,showpoints=false]%
(0.3,5)(0.3,10)(3,11.3)(20,9.5)(20,7)(8.5,7)(2.5,4.5)
\rput(18.8,10){$\Pi$}
\rput(16.5,5.5){$N$}
\rput(12.5,1.5){$\Omega$}
\end{pspicture}
\caption{Graph pattern $\Psi$ for the example query}
\label{graph-pattern}
\end{figure}
The output of the algorithm is a join expression $F$ and condition $W$
ready for composition into {\tt FROM} and {\tt WHERE} clauses of an SQL
{\tt SELECT} statement.
In the algorithm description below, $\mathrm{id}(r)$ is used to denote
the primary key of relation $r \in R$, and $\rho(n)$ to denote the value
of $\mathrm{id}(Resource)$ for a non-variable node $n \in \Psi_{nodes}$
where such a value is known during query translation.\footnote{E.g.
Samizdat uses {\em site-ns/resource-id} notation for internal resource
URIrefs.}
% the algorithm
Key steps of the query pattern translation algorithm correspond to the
following private methods of {\tt SqlMapper}:
{\tt label\_pattern\_components}: Label every connected component of
$\Pi$, $N$, and $\Omega$ with different colors $K$ such that $K_\Pi:
\Pi_{nodes} \to \mathbb{K}, K_N: N_{nodes} \to \mathbb{K}, K_\Omega:
\Omega_{nodes} \to \mathbb{K}, K(n) = K_\Pi(n) \cup K_N(n) \cup
K_\Omega(n)$. The Two-pass Connected Component Labeling
algorithm~\cite{shapiro} is used with a special case to exclude nodes
present in $\Pi$ from neighbour lists while labeling $N$ and $\Omega$.
The special case ensures that parts of $N$ and $\Omega$ which are only
connected through a node in $\Pi$ are labeled with different colors.
{\tt map\_predicates}: Map each arc $c = \langle s, p, o \rangle \in
\Psi_{arcs}$ to the relational data model according to $M$: define
mapping $M_{attr}^{pos}: \Psi_{arcs} \times \Psi_{nodes} \to \Phi$ such
that $M_{attr}^{pos}(c, s) = \mathrm{id}( M_{rel}(p) ),
M_{attr}^{pos}(c, o) = M_{attr}(p)$; replace each unmapped arc with its
reification and map the resulting arcs in the same manner;\footnote{$M$
is expected to map reification properties to the triples table.} for
each arc labeled with a subproperty predicate, add an arc mapped to the
subproperty qualifier attribute. For each node $n \in \Psi_{nodes}$,
find adjacent arcs $\Psi_{nodes}^n = \{\langle s, p, o \rangle \mid n
\in \{s, o\}\}$ and determine its binding mode $\beta_{node}:
\Psi_{nodes} \to \{ \Pi, N, \Omega \}$ such that $\beta_{node}(n) =
max(\beta_{arc}(c) \, \forall c \in \Psi_{nodes}^n)$ where
$\beta_{arc}(c)$ reflects which of the graph patterns $\{ \Pi, N, \Omega
\}$ contains arc $c$, and the order of precedence used by $max$ is $\Pi
> N > \Omega$.
{\tt define\_relation\_aliases}: Map each node in $\Psi$ to one or more
relation aliases $a \in \mathbb{A}$ according to the algorithm described
in \figurename~\ref{define-relation-aliases}. The algorithm produces
mapping $C_a: \Psi_{arcs} \to \mathbb{A}$ which links every arc in
$\Psi$ to an alias, and mappings $A = \langle A_{rel}, A_{node},
A_\beta, A_{filter} \rangle$ where $A_{rel}: \mathbb{A} \to R$,
$A_{node}: \mathbb{A} \to \Psi_{nodes}$, $A_\beta: \mathbb{A} \to \{
\Pi, N, \Omega \}$, $A_{filter}: \mathbb{A} \to F)$ which record
relation, node, bind mode, and a filter condition for each alias.
\begin{figure}
\begin{algorithmic}[1]
\ForAll {$n \in \Psi_{nodes}$}
\ForAll {$c = \langle s, p, o \rangle \in \Psi_{arcs} \mid s = n \, \wedge \, C_a(c) = \emptyset$}
\If {$\exists c' = \langle s', p', o' \rangle \mid
n \in \{s', o'\} \, \wedge \,
C_a(c') \not= \emptyset \, \wedge \,
M_{rel}(p') = M_{rel}(p)$}
\State $C_a(c) \gets C_a(c')$
\Comment Reuse the alias assigned to an arc adjacent to $n$ and
mapped to the same relation
\Else
\Comment Create new alias
\State $a = max(\mathbb{A}) + 1$;
$\mathbb{A} \gets \mathbb{A} \cup \{ a\}$;
$C_a(c) \gets a$
\State $A_{node}(a) \gets n$, $A_{filter}(a) \gets \emptyset$
\If {$M_{trans}(p) = \emptyset$}
\Comment Use base relation
\State $A_{rel}(a) \gets M_{rel}(p)$
\State $A_\beta(a) \gets \beta_{node}(n)$
\Else
\Comment Use transitive closure
\State $A_{rel}(a) \gets M_{trans}(p)$
\State $A_\beta(a) \gets \beta_{arc}(c)$
\State \Comment Use arc's bind mode instead of node's
\EndIf
\EndIf
\EndFor
\EndFor
\ForAll {$c \in \Psi_{arcs}$}
\State $A_{filter}( C_a(c) ) \gets A_{filter}( C_a(c) ) \cup F_c(c)$
\State \Comment Add arc filter to the linked alias filters
\EndFor
\end{algorithmic}
\caption{Define relation aliases}
\label{define-relation-aliases}
\end{figure}
{\tt transform}: Define bindings $B: \Psi_{nodes} \to \mathbb{B}$ where
$\mathbb{B} = \{\{ \langle a, f \rangle \mid a \in \mathbb{A}, f \in
\Phi \}\}$ of graph pattern nodes to sets of pairs of relation aliases
and attributes, such that
\begin{equation}
\begin{split}
\langle a, f \rangle \in B(n) \iff
&\exists c \in \Psi_{arcs}^n \\
&C_a(c) = a, M_{attr}^{pos}(c, n) = f \, .
\end{split}
\end{equation}
Transform graph pattern $\Psi$ into relational query graph $Q = \langle
\mathbb{A}, J \rangle$ where nodes $\mathbb{A}$ are relation aliases
defined earlier and edges $J = \{ \langle b_1, b_2, n \rangle \mid b_1 =
\langle a_1, f_1 \rangle \in B(n), b_2 = \langle a_2, f_2 \rangle \in
B(n), a_1 \not= a_2 \}$ are join conditions. Ground non-variable nodes
according to the algorithm defined in
\figurename~\ref{ground-non-variable-nodes}. Record list of grounded nodes $G
\subseteq \Psi_{nodes}$ such that
\begin{equation}
\begin{split}
n \in G \iff &n \in F_g
\,\vee\, \exists \langle b_1, b_2, n \rangle \in J \\
&\vee\, \exists b \in B(n) \, \exists a \in \mathbb{A} \:
b \in A_{filter}(a) \, .
\end{split}
\end{equation}
\begin{figure}
\begin{algorithmic}[1]
\State $\exists b = \langle a, f \rangle \in B(n)$
\Comment Take any binding of $n$
\If {$n$ is an internal resource and $\rho(n) = i$}
\State $A_{filter}(a) \gets A_{filter}(a) \cup (b = i)$
\ElsIf {$n$ is a query parameter or a literal}
\State $A_{filter}(a) \gets A_{filter}(a) \cup (b = n)$
\ElsIf {$n$ is a URIref}
\Comment Add a join to a URIref tuple in Resource relation
\State $\mathbb{A} \gets \mathbb{A} \cup \{ a_r \}$;
$A_{node}(a_r) = n$;
$A_{rel}(a_r) = Resource$;
$A_\beta(a_r) = \beta_{node}(n)$
\State $B(n) \gets B(n) \cup \langle a_r, \mathrm{id}(Resource) \rangle;
J \gets J \cup
\{ \langle b, \langle a_r, \mathrm{id}(Resource) \rangle, n \rangle \}$
\State $A_{filter}(a_r) = A_{filter}(a_r) \cup (
\langle a_r, literal \rangle = f \wedge
\langle a_r, uriref \rangle = t \wedge
\langle a_r, label \rangle = n )$
\EndIf
\end{algorithmic}
\caption{Ground non-variable nodes}
\label{ground-non-variable-nodes}
\end{figure}
Transformation of the example query presented above results in the
relational query graph shown in \figurename~\ref{join-graph}.
\begin{figure}
\centering
\psset{unit=3.8mm,labelsep=0.2pt}
\begin{pspicture}[showgrid=false](0,0)(23,13)
\footnotesize
\rput(1,6){\circlenode{b}{\vphantom{Ij}b}}
\rput(6.7,6){\circlenode{a}{\vphantom{Ij}a}}
\rput(12.8,6){\circlenode{c}{\vphantom{Ij}c}}
\rput(2,11){\circlenode{d}{\vphantom{Ij}d}}
\rput(1,1){\circlenode{g}{\vphantom{Ij}g}}
\rput(22,11){\circlenode{f}{\vphantom{Ij}f}}
\rput(20,1){\circlenode{e}{\vphantom{Ij}e}}
\ncline{-}{b}{a} \aput{:U}(0.4){a.id = b.id} \bput{:U}(0.35){?stmt}
\ncline{-}{a}{c} \aput{:U}{a.subject = c.id} \bput{:U}{?msg}
\ncline{-}{d}{a} \aput{:U}{a.subject = d.id} \bput{:U}(0.4){?msg}
\ncline{-}{g}{a} \aput{:U}(0.43){a.predicate = g.id} \bput{:U}{\it dc:relation}
\ncline{-}{c}{f} \aput{:U}{c.part\_of\_subproperty = f.id} \bput{:U}{\it s:isTranslationOf}
\ncline{-}{c}{e} \aput{:U}{c.part\_of = e.id} \bput{:U}{?translation}
\pspolygon[linestyle=dashed,linearc=0.8](0.1,0.1)(0.1,11.9)(14.5,11.9)(14.5,0.1)
\rput(13.8,1){$P_1$}
\end{pspicture}
\caption{Relational query graph $Q$ for the example query}
\label{join-graph}
\end{figure}
{\tt generate\_tables\_and\_conditions}: Produce ordered connected
minimum edge-disjoint tree cover $P$ for relational query graph $Q$ such
that $\forall P_i \in P$ \, $\forall j = \langle b_{j1}, b_{j2}, n_j
\rangle \in P_i$ \, $\forall k = \langle b_{k1}, b_{k2}, n_k \rangle \in
P_i$:
\begin{gather}
K(n_j) \cap K(n_k) \not= \emptyset \, , \\
\beta_{node}(n_j) = \beta_{node}(n_k) = \beta_{tree}(P_i) \, ,
\end{gather}
starting with $P_1$ such that $\beta_{tree}(P_1) = \Pi$ (it follows from
definitions of $\Psi$ and {\tt transform} that $P_1$ is the only such
tree and covers all join conditions $\langle b_1, b_2, n \rangle \in J$
such that $\beta_{node}(n) = \Pi$). Encode $P_1$ as the root inner join.
Encode other trees with at least one edge as subqueries. Left join
subqueries and aliases representing roots of zero-length trees into join
expression $F$. For each $P_i$ such that $\beta_{tree}(P_i) = N$, find a
binding $b = \langle a, f \rangle \in P_i$ such that $a \in P_1 \cap
P_i$ and add ($b$ {\tt IS NULL}) condition to $W$. For each non-grounded
node $n \not\in G$ such that $\langle a, f \rangle \in B(n) \, \wedge \,
a \in P_1$, add ($b$ {\tt IS NOT NULL}) condition to $W$ if
$\beta_{node}(n) = \Pi$, or ($b$ {\tt IS NULL}) condition if
$\beta_{node}(n) = N$. Add $F_g$ to $W$.
Translation of the example query presented earlier will result in the
following SQL:
\begin{Verbatim}[fontsize=\scriptsize]
SELECT DISTINCT a.subject, max(b.published_date)
FROM Statement AS a
INNER JOIN Resource AS b ON (a.id = b.id)
INNER JOIN Resource AS c ON (a.subject = c.id)
INNER JOIN Message AS d ON (a.subject = d.id)
INNER JOIN Resource AS g ON (a.predicate = g.id)
AND (g.literal = 'false' AND g.uriref = 'true'
AND g.label = 'http://purl.org/dc/elements/1.1/relation')
LEFT JOIN (
SELECT e.language AS _field_b, c.id AS _field_a
FROM Message AS e
INNER JOIN Resource AS f ON (f.literal = 'false'
AND f.uriref = 'true' AND f.label =
'http://www.nongnu.org/samizdat/rdf/schema#isTranslationOf')
INNER JOIN Resource AS c ON (c.part_of_subproperty = f.id)
AND (c.part_of = e.id)
) AS _subquery_a ON (c.id = _subquery_a._field_a)
WHERE (b.published_date IS NOT NULL)
AND (a.object IS NOT NULL) AND (a.rating IS NOT NULL)
AND (c.part_of IS NULL) AND (a.rating >= ?)
AND (d.language = ? OR _subquery_a._field_b = ?)
GROUP BY a.subject ORDER BY max(b.published_date) DESC
\end{Verbatim}
\section{Update Command Execution}
\label{update-execution}
Update command uses the same graph pattern structure as a query, and
additionally defines a set $\Delta \subset \Psi_{nodes}$ of variables
representing new RDF resources and a mapping $U: \Psi_{nodes} \to
\mathbb{L}$ of variables to literal values. Execution of an update
command starts with query pattern translation using the algorithm
described in section~\ref{query-translation}. The variables $\Psi$, $A$,
$Q$, etc. produced by pattern translation are used in the subsequent
stages as described below:
\begin{enumerate}
% node values
\item Construct node values mapping $V: \Psi_{nodes} \to \mathbb{L}$
using the algorithm defined in \figurename~\ref{node-values}. Record
resources inserted into the database during this stage in $\Delta_{new}
\subset \Psi_{nodes}$ (it follows from the algorithm definition that
$\Delta \subseteq \Delta_{new}$).
\begin{figure}
\begin{algorithmic}[1]
\ForAll {$n \in \Psi_{nodes}$}
\If {$n$ is an internal resource and $\rho(n) = i$}
\State $V(n) \gets i$
\ElsIf {$n$ is a query parameter or a literal}
\State $V(n) \gets n$
\ElsIf {$n$ is a variable}
\If {$\nexists c = \langle n, p, o \rangle \in \Psi_{arcs}$}
\State \Comment If found only in object position
\State $V(n) \gets U(n)$
\Else
\If {$n \not\in \Delta$}
\State $V(n) \gets \mathrm{SquishSelect}(n, \Psi^{n*})$
\EndIf
\If {$V(n) = \emptyset$}
\State Insert $n$ into $Resource$ relation
\State $V(n) \gets \rho(n)$
\State $\Delta_{new} \gets \Delta_{new} \cup n$
\EndIf
\EndIf
\ElsIf {$n$ is a URIref}
\State Select $n$ from $Resource$ relation, insert if missing
\State $V(n) \gets \rho(n)$
\EndIf
\EndFor
\end{algorithmic}
\caption{Determine node values. $\Psi^{n*}$ is a subgraph of $\Psi$
reachable from $n$. $\mathrm{SquishSelect}(n, \Psi)$ finds a mapping of
variable $n$ that satisfies pattern $\Psi$.}
\label{node-values}
\end{figure}
% data assignment
\item For each alias $a \in \mathbb{A}$, find a subset of graph pattern
$\Psi_{arcs}^a \subseteq \Psi_{arcs}$ such that $c \in \Psi_{arcs}^a
\iff C_a(c) = a$, select a key node $k$ such that $\exists c = \langle
k, p, o \rangle \in \Psi_{arcs}^a$, and collect a map $D_a: \Phi \to
\mathbb{L}$ of fields to values such that $\forall c = \langle s, p, o
\rangle \in \Psi_{arcs}^a \; \exists D_a(o) = V(o)$. If $k \in
\Delta_{new}$ and $A_{rel}(a) \not= Resource$, transform $D_a$ into an
SQL {\tt INSERT} into $A_{rel}(a)$ with explicit primary key assignment
$\mathrm{id}_k(A_{rel}(a)) \gets V(k)$. Otherwise, transform $D_a$
into an {\tt UPDATE} statement on the tuple in $A_{rel}(a)$ for which
$\mathrm{id}_k(A_{rel}(a)) = V(k)$.
% iterative assertions
\item Execute the SQL statements produced in the previous stage inside
the same transaction in the order that resolves their mutual references.
\end{enumerate}
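Taken together, stages 1 and 2 perform the two-step resource insertion
mentioned in section~\ref{inference-triggers}. A sketch of the SQL
statements generated for a new message resource, assuming the trigger
conventions of the Samizdat PostgreSQL schema, may look as follows:
\begin{Verbatim}[fontsize=\scriptsize]
-- step 1: create the superclass tuple, obtain a new resource id
INSERT INTO Resource (literal, uriref, label)
VALUES ('false', 'false', 'message');
-- step 2: insert the subclass tuple with the explicit primary key;
-- the insert_resource trigger sees a non-NULL id and does nothing
INSERT INTO Message (id, title, content)
VALUES (currval('resource_id_seq'), :title, :content);
\end{Verbatim}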
\section{Implementation}
The algorithms described in the previous sections are implemented by the
Samizdat RDF storage module, which is used as the primary means of data
access in the Samizdat open publishing system. The module is written in
the Ruby programming language, supported by several triggers written in
procedural SQL. The module and the whole Samizdat engine are available
under the GNU General Public License.
Samizdat exposes all RDF resources underpinning the structure and
content of the site. An HTTP request with a URL of any internal resource
yields a page with detailed information about the resource and its
relations with other resources. Furthermore, Samizdat provides a
graphical interface for composing arbitrary Squish
queries.\footnote{Complexity of user queries is limited to a
configurable maximum number of triples in the graph pattern to prevent
abuse.} Queries may be published so that other users may modify and
reuse them; results of a query may be accessed either as plain HTML or
as an RSS feed.
\section{Evaluation of Results}
\label{evaluation}
%\enlargethispage{-1ex}
Samizdat performance was measured using the Berlin SPARQL Benchmark
(BSBM)~\cite{bsbm}, with the following variations: a functional
equivalent of the BSBM test driver was implemented in Ruby and Squish
(instead of Java and SPARQL); the test platform included an Intel Core 2
Duo (instead of Quad) clocked at the same frequency, and 2GB of memory
(instead of 8GB). In this environment, Samizdat was able to process
25287 complete query mixes per hour (QMpH) on a dataset with 1M triples,
and achieved 18735 QMpH with 25M triples, in both cases exceeding the
figures for all RDF stores reported in~\cite{bsbm}.
In production, Samizdat was able to serve without congestion peak loads
of up to 5K hits per hour for a site with a dataset sized at 100K
triples in a shared VPS environment. Regeneration of the site frontpage
on the same dataset executes 997 Squish queries and completes in 7.7s,
which is comparable to RDBMS-backed content management systems.
\section{Comparison with Related Work}
\label{related-work}
As mentioned in section~\ref{introduction}, there exists a wide range of
solutions for relational to RDF mapping. Besides Samizdat, the approach
based on automatic on-demand translation of RDF queries into SQL is also
implemented by Federate~\cite{federate}, D2RQ~\cite{d2rq}, and
Virtuoso~\cite{virtuoso}.
While being one of the first solutions to provide on-demand relational
to RDF mapping, Samizdat remains one of the most advanced in terms of
query capabilities. Its single largest drawback is the lack of
compatibility with SPARQL; at the same time, in some regards it exceeds
the capabilities of other solutions.
The alternative that is closest to Samizdat in terms of query
capabilities is Virtuoso RDF Views: it is the only other
relational-to-RDF mapping solution that provides partial RDFS and OWL
inference, aggregation, and an update language. Still, there are
substantial differences between these two projects. First of all, the
Samizdat RDF store is a small module (1000 lines of Ruby and 200 lines
of SQL) that can be used with a variety of RDBMSes, while Virtuoso RDF
Views is tied to its own RDBMS. Virtuoso doesn't support implicit
statement reification, although its design is compatible with this
feature. Finally, Virtuoso relies on SQL unions for queries with
unspecified predicates and for RDFS and OWL inference. While allowing
for greater flexibility than the database triggers described in
section~\ref{inference-triggers}, the iterative union operation has a
considerable impact on query performance.
\section{Future Work}
\label{future-work}
Since the SPARQL Recommendation has been published by W3C~\cite{sparql},
SPARQL support has been at the top of the Samizdat RDF store to-do list.
SPARQL syntax is considerably more expressive than Squish and will
require some effort to implement in Samizdat, but, since the design of
the implementation separates the syntactic layer from the query
translation logic, the same algorithms as described in this paper can be
used to translate SPARQL patterns to SQL with minimal changes. The most
substantial changes are expected to be required for the explicit
grouping of optional graph patterns and the associated filter scope
issues~\cite{cyganiak}.
The Samizdat RDF store should be made more adaptable to a wider variety
of problem domains. The query translation algorithm should be augmented
to translate an ambiguously mapped query (including queries with
unspecified predicates) into a union of alternative interpretations.
Mapping of relational schemata should be generalized, including support
for multi-part keys and more generic stored procedures for reification
and inference. The standard RDB2RDF mapping should be implemented when
W3C publishes a specification to that end.
\section{Conclusions}
The on-demand RDF to relational query translation algorithm described
in this paper utilizes existing relational databases to their full
potential, including indexing, transactions, and procedural SQL, to
provide efficient access to RDF data. The implementation of this algorithm
in the Samizdat RDF storage module has been tried in a production
environment and demonstrates how Semantic Web technologies can be introduced
into an application serving thousands of users without imposing additional
requirements on hardware resources.
\vspace{1ex}
% ---- Bibliography ----
%
\begin{thebibliography}{19}
%\bibitem {expressive-power-of-sparql}
%Angles, R., Gutierrez, C.:
%The Expressive Power of SPARQL. In: A. Sheth et al. (Eds.) ISWC 2008.
%LNCS, vol. 5318, pp. 82-97. Springer, Heidelberg (2008)\\
%\url{http://www.dcc.uchile.cl/~cgutierr/papers/expPowSPARQL.pdf}
\bibitem {triplify}
Auer, S., Dietzold, S., Lehmann, J., Hellmann, S., Aumueller, D.:
Triplify -- Light-Weight Linked Data Publication from Relational
Databases. WWW 2009, Madrid, Spain (2009)\\
\url{http://www.informatik.uni-leipzig.de/~auer/publication/triplify.pdf}
%\bibitem {swad-storage}
%Beckett, Dave:
%Semantic Web Scalability and Storage: Survey of Free Software / Open
%Source RDF storage systems. SWAD-Europe Deliverable 10.1 (2001)\\
%\url{http://www.w3.org/2001/sw/Europe/reports/rdf\_scalable\_storage\_report}
%\bibitem {swad-rdbms-mapping}
%Beckett, D., Grant, J.:
%Semantic Web Scalability and Storage: Mapping Semantic Web Data with
%RDBMSes, SWAD-Europe Deliverable 10.2 (2001)\\
%\url{http://www.w3.org/2001/sw/Europe/reports/scalable\_rdbms\_mapping\_report}
%\bibitem {cwm}
%Berners-Lee, T., Kolovski, V., Connolly, D., Hendler, J., Scharf, Y.:
%A Reasoner for the Web. Theory and Practice of Logic Programming (TPLP),
%special issue on Logic Programming and the Web (2000)\\
%\url{http://www.w3.org/2000/10/swap/doc/paper/}
\bibitem {bsbm}
Bizer, C., Schultz, A.:
The Berlin SPARQL Benchmark. International Journal On Semantic Web and
Information Systems (IJSWIS), Volume 5, Issue 2 (2009)\\
\url{http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/}
\bibitem {d2rq}
Bizer, C., Seaborne, A.:
D2RQ - Treating non-RDF databases as virtual RDF graphs. In: ISWC 2004
(posters)\\
\url{http://www.wiwiss.fu-berlin.de/bizer/D2RQ/spec/}
%\bibitem {samizdat-euruko}
%Borodaenko, Dmitry:
%RDF storage for Ruby: the case of Samizdat. EuRuKo 2003, Karlsruhe (June
%2003)\\
%\url{http://samizdat.nongnu.org/slides/euruko2003\_samizdat.html}
%\bibitem {samizdat-impl-report}
%Borodaenko, Dmitry:
%Samizdat RDF Implementation Report. RDF Interest ML (September 2003)\\
%\url{http://lists.w3.org/Archives/Public/www-rdf-interest/2003Sep/0043.html}
\bibitem {samizdat-rel-rdf}
Borodaenko, Dmitry:
Accessing Relational Data with RDF Queries and Assertions (April 2004)\\
\url{http://samizdat.nongnu.org/papers/rel-rdf.pdf}
\bibitem {samizdat-collreif}
Borodaenko, Dmitry:
Model for Collaborative Decision Making Based on RDF Reification (April
2004)\\
\url{http://samizdat.nongnu.org/papers/collreif.pdf}
\bibitem {cyganiak}
Cyganiak, R.:
A relational algebra for SPARQL. Technical Report HPL-2005-170, HP Labs
(2005)\\
\url{http://www.hpl.hp.com/techreports/2005/HPL-2005-170.html}
\bibitem {virtuoso}
Erling, O., Mikhailov, I.:
RDF support in the Virtuoso DBMS. In: Proceedings of the 1st Conference
on Social Semantic Web, volume P-113 of GI-Edition -- Lecture Notes in
Informatics (LNI), ISSN 1617-5468. Bonner K\"{o}llen Verlag (2007)\\
\url{http://virtuoso.openlinksw.com/dav/wiki/Main/VOSArticleRDF}
%\bibitem {rdf-mt}
%Hayes, Patrick:
%RDF Semantics. W3C Recommendation (February 2004)\\
%\url{http://www.w3.org/TR/rdf-mt/}
%\bibitem {rdf-syntax-1999}
%Lassila, O., Swick, R.~R.:
%Resource Description Framework (RDF) Model and Syntax Specification, W3C
%Recommendation (February 1999)\\
%\url{http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/}
%\bibitem {rdb2rdf-xg-report}
%Malhotra, Ashok:
%W3C RDB2RDF Incubator Group Report. W3C Incubator Group Report (January
%2009)\\
%\url{http://www.w3.org/2005/Incubator/rdb2rdf/XGR-rdb2rdf/}
%\bibitem {melnik}
%Melnik, S.:
%Storing RDF in a relational database. Stanford University (2001)\\
%\url{http://infolab.stanford.edu/~melnik/rdf/db.html}
\bibitem {squish}
Miller, Libby, Seaborne, Andy, Reggiori, Alberto:
Three Implementations of SquishQL, a Simple RDF Query Language. In:
Horrocks, I., Hendler, J. (Eds) ISWC 2002. LNCS vol. 2342, pp. 423-435.
Springer, Heidelberg (2002)\\
\url{http://ilrt.org/discovery/2001/02/squish/}
%\bibitem {nuutila}
%Nuutila, Esko:
%Efficient Transitive Closure Computation in Large Digraphs. Acta
%Polytechnica Scandinavica, Mathematics and Computing in Engineering
%Series No. 74, Helsinki (1995)\\
%\url{http://www.cs.hut.fi/~enu/thesis.html}
%\bibitem {owl-semantics}
%Patel-Schneider, Peter F., Hayes, Patrick, Horrocks, Ian:
%OWL Web Ontology Language Semantics and Abstract Syntax. W3C
%Recommendation (February 2004)\\
%\url{http://www.w3.org/TR/owl-semantics/}
\bibitem {federate}
Prud'hommeaux, Eric:
RDF Access to Relational Databases (2003)\\
\url{http://www.w3.org/2003/01/21-RDF-RDB-access/}
\bibitem {sparql}
Prud'hommeaux, Eric, Seaborne, Andy:
SPARQL Query Language for RDF. W3C Recommendation (January 2008)\\
\url{http://www.w3.org/TR/rdf-sparql-query/}
\bibitem {shapiro}
Shapiro, L., Stockman, G.:
Computer Vision, pp. 69-73. Prentice-Hall (2002)\\
\url{http://www.cse.msu.edu/~stockman/Book/2002/Chapters/ch3.pdf}
\bibitem {sp2b}
Schmidt, M., Hornung, T., K\"{u}chlin, N., Lausen, G., Pinkel, C.:
An Experimental Comparison of RDF Data Management Approaches in a SPARQL
Benchmark Scenario. In: A. Sheth et al. (Eds.) ISWC 2008. LNCS vol.
5318, pp. 82-97. Springer, Heidelberg (2008)\\
\url{http://www.informatik.uni-freiburg.de/~mschmidt/docs/sp2b\_exp.pdf}
%\bibitem {treehugger}
%Steer, D.:
%TreeHugger -- XSLT for RDF (2003)\\
%\url{http://rdfweb.org/people/damian/treehugger/}
\end{thebibliography}
\end{document}
ruby-graffiti-2.2/doc/papers/rel-rdf.tex

\documentclass{llncs}
\usepackage{makeidx} % allows for indexgeneration
\usepackage{graphicx}
\usepackage[pdfpagescrop={92 112 523 778},a4paper=false,
pdfborder={0 0 0}]{hyperref}
\emergencystretch=8pt
%
\begin{document}
\mainmatter % start of the contributions
%
\title{Accessing Relational Data with RDF Queries and Assertions}
\toctitle{Accessing Relational Data with RDF Queries and Assertions}
\titlerunning{Accessing Relational Data with RDF}
%
\author{Dmitry Borodaenko}
\authorrunning{Dmitry Borodaenko} % abbreviated author list (for running head)
%%%% modified list of authors for the TOC (add the affiliations)
\tocauthor{Dmitry Borodaenko}
%
\institute{\email{angdraug@debian.org}}
\maketitle % typeset the title of the contribution
\begin{abstract}
This paper presents a hybrid RDF storage model that combines relational data
with arbitrary RDF meta-data, as implemented in the RDF storage layer of the
Samizdat open publishing and collaboration engine, and explains the supporting
algorithms for online translation of RDF queries and conditional assertions
into their relational equivalents. The proposed model makes it possible to
supplement legacy databases with RDF meta-data without sacrificing the
benefits of RDBMS technology.
\end{abstract}
\section{Introduction}
A survey of free software / open source RDF storage systems performed by
SWAD-Europe\cite{swad-storage} found that the most widespread approach to
RDF storage relies on relational databases. As seen from the companion report
on mapping Semantic Web data with RDBMSes\cite{swad-rdbms-mapping}, the
traditional relational representation of RDF is a triple store, usually
revolving around a central statement table with \{subject, predicate, object\}
triples as its rows and one or more tables storing resource URIrefs,
namespaces, and other supplementary data.
While such a triple store approach serves well to satisfy the open world
assumption of RDF, by abandoning existing relational data models it fails to
take full advantage of RDBMS technology. According to \cite{swad-storage},
existing RDF storage tools are still immature; at the same time, although
modern triple stores claim to scale to millions of triples, ICS-FORTH
research\cite{ics-volume} shows that a schema-specific storage model yields
better performance and scalability on large volumes of data.
These concerns are addressed from different angles by the RSSDB\cite{rssdb},
Federate\cite{ericp-rdf-rdb-access}, and D2R\cite{d2r} packages. RSSDB splits
the single triples table into a schema-specific set of property tables. In
this way, it departs from the relational data model, but maintains
performance benefits due to better indexing. Federate takes the most
conservative approach and allows a relational database to be queried with a
restricted application-specific RDF schema. Conversely, D2R is intended for
batch export of data from an RDBMS to RDF and assumes that subsequent
operation will involve only RDF.
The hybrid RDF storage model presented in this paper attacks this problem
from yet another angle, which can be described as a combination of Federate's
relational-to-RDF mapping and a traditional triple store. While having the
advantage of being designed from the ground up with the RDF model in mind,
the Samizdat RDF layer\cite{samizdat-rdf-storage} deviated from common RDF
storage practice in order to use both relational and triple data models and
get the best of both worlds. A hybrid storage model was designed, and
algorithms were implemented, that allow the data in the hybrid
triple-relational model to be accessed with RDF queries and conditional
assertions in an extended variant of the Squish\cite{squish} query
language.\footnote{The decision to use Squish over more expressive languages
such as RDQL\cite{rdql} and Notation3\cite{notation3} was made due to its
intuitive syntax, which was found more suitable for Samizdat's query composer
GUI intended for end users of an open-publishing system.} This paper
describes the proposed model and its implementation in the Samizdat engine.
\section{Relational Database Schema}
All content in a Samizdat site is represented internally as RDF. The canonic
URIref of any Samizdat resource is {\tt http://<site-url>/<id>}, where
{\tt <site-url>} is the base URL of the site and {\tt <id>} is a numeric
identifier of the resource, unique within a single site.
The root of the SQL representation of RDF resources is the {\tt Resource}
table, with the {\tt id} primary key field storing {\tt <id>} and the {\tt
label} text field representing the resource label. The semantics of label
values differ between literals, references to external resources, and
internal resources of the site.
A \emph{literal} value (including typed literals) is stored directly in the
{\tt label} field and marked with the {\tt literal} boolean field.
An \emph{external resource} label contains the resource URIref and is marked
with the {\tt uriref} boolean field.
An \emph{internal resource} is mapped into a row of an \emph{internal
resource table} whose name corresponds to the resource class name stored in
the {\tt label} field, with the primary key {\tt id} field referencing back
to the {\tt Resource} table, and the other fields holding values of
\emph{internal properties} of this resource class, represented as literals or
references to other resources stored in the {\tt Resource} table. The primary
key reference to {\tt Resource.id} is enforced by PostgreSQL stored
procedures.
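For illustration, the core of this layout can be sketched in SQL DDL, issued
here through a Ruby DBI handle (a simplified sketch: the actual Samizdat
schema has more fields, and enforces the {\tt Resource.id} reference with
stored procedures rather than a foreign key):
\begin{verbatim}
dbh.execute <<SQL
  CREATE TABLE Resource (
    id SERIAL PRIMARY KEY,
    published_date TIMESTAMP,
    literal BOOLEAN DEFAULT false,
    uriref BOOLEAN DEFAULT false,
    label TEXT);
  CREATE TABLE Message (
    id INTEGER PRIMARY KEY,  -- references Resource.id
    title TEXT,
    content TEXT);
SQL
\end{verbatim}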
To determine what information about a resource can be stored in and extracted
from class-specific tables, the RDF storage layer consults a site-specific
mapping
\begin{equation}
M(p) = \{\langle t_{p1},~f_{p1} \rangle, \enspace \dots\} \enspace ,
\end{equation}
which stores a list of possible pairs of SQL table name $t$ and field name
$f$ for each internal property name $p$. The mapping $M$ is read at runtime
from an external YAML\cite{yaml} file of the following form:
\begin{verbatim}
---
ns:
s: 'http://www.nongnu.org/samizdat/rdf/schema#'
focus: 'http://www.nongnu.org/samizdat/rdf/focus#'
items: 'http://www.nongnu.org/samizdat/rdf/items#'
rdf: 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
dc: 'http://purl.org/dc/elements/1.1/'
map:
'dc::date': {Resource: published_date}
's::id': {Resource: id}
'rdf::subject': {Statement: subject}
'rdf::predicate': {Statement: predicate}
'rdf::object': {Statement: object}
's::rating': {Statement: rating}
. . .
\end{verbatim}
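Such a file can be loaded at runtime along the following lines (a sketch;
{\tt RdfConfig} is the class that implements this mapping in the current
version of the module, and the file name is illustrative):
\begin{verbatim}
require 'yaml'
config = RdfConfig.new(
  YAML.load(File.read('rdf-map.yaml')))
config.ns_expand('dc::date')
# => "http://purl.org/dc/elements/1.1/date"
\end{verbatim}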
\emph{External properties}, i.e. properties that are not covered by $M$, are
represented by \{{\tt subject}, {\tt predicate}, {\tt object}\} triples in
the {\tt Statement} table. Every such triple is treated as a reified
statement in RDF semantics and is assigned an {\tt <id>} and a record in the
{\tt Resource} table.
{\tt Resource} and {\tt Statement} are also internal resource tables, and, as
such, have some of their fields mapped by $M$. In particular, the {\tt
subject}, {\tt predicate}, and {\tt object} fields of the {\tt Statement}
table are mapped to the corresponding properties from the RDF reification
vocabulary, and {\tt Resource.id} is mapped to the {\tt samizdat:id} property
from the Samizdat namespace.
An excerpt from the default Samizdat database schema, with mapped field names
replaced by predicate QNames, is visualized in Fig.\,\ref{db-schema-figure}.
In addition to the {\tt Resource} and {\tt Statement} tables described above,
it shows the {\tt Message} table representing one of the internal resource
classes. Note how the {\tt dc:date} property is made available to all
resource classes, and how reified statements are allowed to have an optional
{\tt samizdat:rating} property.
\begin{figure}
%\begin{verbatim}
% +-------------+ +-----------------+
% | Resource | | Statement |
% +-------------+ +-----------------+
% +->| samizdat:id |<-+-| id |
% | | label | +-| rdf:subject |
% | | literal | +-| rdf:predicate |
% | | uriref | +-| rdf:object |
% | | dc:date | | samizdat:rating |
% | +-------------+ +-----------------+
% |
% | +------------------+
% | | Message |
% | +------------------+
% +--| id |
% | dc:title |
% | dc:format |
% | samizdat:content |
% +------------------+
%\end{verbatim}
\begin{center}
\includegraphics[scale=0.6]{fig1.eps}
\end{center}
\caption{Excerpt from Samizdat database schema}
\label{db-schema-figure}
\end{figure}
\section{Query Pattern Translation}
%
\subsection{Prerequisites}
The pattern translation algorithm operates on the pattern section of a Squish
query. The query pattern $\Psi$ is represented as a list of \emph{pattern
clauses}
\begin{equation}
\psi_i = \langle p_i,~s_i,~o_i \rangle \enspace ,
\end{equation}
where $i$ is the position of a clause, $p_i$ is the predicate URIref, $s_i$
is the subject node and may be a URIref or a blank node, and $o_i$ is the
object node and may be a URIref, a blank node, or a literal.
\subsection{Predicate Mapping}
For each position $i$, the predicate URIref $p_i$ is looked up in the map of
internal resource properties $M$. All possible mappings are recorded for all
clauses in a list $C$:
\begin{equation}
c_i = \{\langle t_{i1},~f_{i1} \rangle, \enspace \langle t_{i2},~f_{i2}
\rangle, \enspace \dots\} \enspace ,
\end{equation}
where $t_{ij}$ is the table name (same for subject $s_i$ and object $o_i$) and
$f_{ij}$ is the field name (meaningful for object only, since subject is
always mapped to the {\tt id} primary key). In the same iteration, all subject
and object positions of nodes are recorded in the reverse positional mapping
\begin{equation}
R(n) = \{\langle i_1,~m_1 \rangle, \enspace \langle i_2,~m_2 \rangle, \enspace
\dots\} \enspace ,
\end{equation}
where $m$ shows whether node $n$ appears as subject or as object in the clause
$i$.
Each ambiguous property mapping is compared with the mappings for other
occurrences of the same subject and object nodes in the pattern graph;
whenever a non-empty intersection of mappings for the same node is found,
both subject and object mappings for the ambiguous property are refined to
that intersection.
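A toy illustration of this refinement step in Ruby (the candidate mappings
and table names are hypothetical):
\begin{verbatim}
c1 = [%w(Message title), %w(Vote title)]  # mappings for ?m
c2 = [%w(Message creator)]                # ?m in other clause
common = c1.map {|t, f| t } & c2.map {|t, f| t }
c1 = c1.select {|t, f| common.include?(t) }
# => [["Message", "title"]]
\end{verbatim}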
\subsection{Relation Aliases and Join Conditions}
A relation alias $a_i$ is determined for each clause mapping $c_i$, such that
for all occurrences of the subject $s_i$ that were mapped to the same table
$t_i$ the alias is the same, and for all positions with a differing table
mapping or subject node the alias is different.
For all nodes $n$ that are mapped to more than one $\langle a_i,~f_i \rangle$
pair in different positions, join conditions are generated. Additionally, for
each external resource, the {\tt Resource} table is joined by URIref, and for
each existential blank node that isn't already bound by a join, a {\tt NOT
NULL} condition is generated. The resulting set of join conditions $J$ is
used to generate the {\tt WHERE} section of the target SQL query.
\subsection{Example}
The following Squish query selects all messages with a rating of at least 1:
\begin{verbatim}
SELECT ?msg, ?title, ?name, ?date, ?rating
WHERE (dc::title ?msg ?title)
(dc::creator ?msg ?author)
(s::fullName ?author ?name)
(dc::date ?msg ?date)
(rdf::subject ?stmt ?msg)
(rdf::predicate ?stmt dc::relation)
(rdf::object ?stmt focus::Quality)
(s::rating ?stmt ?rating)
LITERAL ?rating >= 1
ORDER BY ?rating
USING rdf FOR http://www.w3.org/1999/02/22-rdf-syntax-ns#
dc FOR http://purl.org/dc/elements/1.1/
s FOR http://www.nongnu.org/samizdat/rdf/schema#
focus FOR http://www.nongnu.org/samizdat/rdf/focus#
\end{verbatim}
The mappings produced by translation of this query are summarized in
Table~\ref{mappings-table}.
\begin{table}
\caption{Query Translation Mappings}
\label{mappings-table}
\begin{center}
\begin{tabular}{clll}
\hline\noalign{\smallskip}
$i$ & $t_i$ & $f_i$ & $a_i$\\
\noalign{\smallskip}
\hline
\noalign{\smallskip}
1 & {\tt Message} & {\tt title} & {\tt b}\\
2 & {\tt Message} & {\tt creator} & {\tt b}\\
3 & {\tt Member} & {\tt full\_name} & {\tt d}\\
4 & {\tt Resource} & {\tt published\_date} & {\tt c}\\
5 & {\tt Statement} & {\tt subject} & {\tt a}\\
6 & {\tt Statement} & {\tt predicate} & {\tt a}\\
7 & {\tt Statement} & {\tt object} & {\tt a}\\
8 & {\tt Statement} & {\tt rating} & {\tt a}\\
\hline
\end{tabular}
\end{center}
\end{table}
As a result of the translation, the following SQL query will be generated:
\begin{verbatim}
SELECT b.id, b.title, d.full_name, c.published_date, a.rating
FROM Statement a, Message b, Resource c, Member d,
Resource e, Resource f
WHERE a.id IS NOT NULL
AND a.object = e.id AND e.literal = false
AND e.uriref = true AND e.label = 'focus::Quality'
AND a.predicate = f.id AND f.literal = false
AND f.uriref = true AND f.label = 'dc::relation'
AND a.rating IS NOT NULL
AND b.creator = d.id
AND b.id = a.subject
AND b.id = c.id
AND b.title IS NOT NULL
AND c.published_date IS NOT NULL
AND d.full_name IS NOT NULL
AND (a.rating >= 1)
ORDER BY a.rating
\end{verbatim}
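In application code, both the Squish query and the generated SQL are hidden
behind a DBI-like interface (a sketch; the method name is illustrative):
\begin{verbatim}
store.select_all(squish_query) do |msg, title, name, date, rating|
  # render one row of the search result
end
\end{verbatim}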
\subsection{Limitations}
In RDF model theory\cite{rdf-mt}, a resource may belong to more than one
class. In the Samizdat RDF storage model, the resource class specified in
{\tt Resource.label} is treated as the primary class: it is not possible to
have some of the internal properties of a resource mapped to one table and
other internal properties mapped to another. The only exception to this is,
obviously, the {\tt Resource} table, which is shared by all resource classes.
Predicates with cardinality greater than 1 cannot be mapped to internal
resource tables and should be recorded as reified statements instead.
RDF properties are allowed to be mapped to more than one internal resource
table, and queries on such ambiguous properties are intended to select all
classes of resources that match this property in conjunction with the rest
of the query.
The algorithm described above assumes that other pattern clauses refine such
an ambiguous property mapping to a single internal resource table. Queries
that fail this assumption will be translated incorrectly by the current
implementation: only the resource class from the first remaining mapping will
be matched. This should be taken into account in site-specific resource maps:
ambiguous properties should be avoided where possible, and their mappings
should be listed in descending order of resource class probability.
It is possible to solve this problem, but any precise solution will add
significant complexity to the resulting query. Solutions that would not
adversely affect performance are still being sought. So far, it is
recommended not to specify more than one mapping per internal property.
\section{Conditional Assertion}
%
\subsection{Prerequisites}
A conditional assertion statement in Samizdat Squish is recorded using the
same syntax as an RDF query, with the {\tt SELECT} section containing a list
of variables replaced by an {\tt INSERT} section with a list of
``don't-bind'' variables and an {\tt UPDATE} section containing assignments
of values to query variables:
\begin{verbatim}
[ INSERT node [, ...] ]
[ UPDATE node = value [, ...] ]
WHERE (predicate subject object) [...]
[ USING prefix FOR namespace [...] ]
\end{verbatim}
Initially, pattern clauses in an assertion are translated using the same
procedure as for a query. The pattern $\Psi$, clause mapping $C$, reverse
positional mapping $R$, alias list $A$, and join conditions set $J$ are
generated as described in the previous section.
After that, the database update is performed in the two stages described
below. Both stages are executed within a single transaction, rolling back
intermediate inserts and updates if the assertion fails.
\subsection{Resource Values}
At this stage, the value mapping $V(n)$ is defined for each node $n$, and the
necessary resource insertions are performed:
\begin{enumerate}
\item If $n$ is an internal resource, $V(n)$ is its {\tt id}. If there is no
resource with such an {\tt id} in the database, an error is raised.
\item If $n$ is a literal, $V(n)$ is the literal value.
\item If $n$ is a blank node and only appears in object position, it is
assigned a value from the {\tt UPDATE} section of the assertion.
\item If $n$ is a blank node and appears in subject position, it is either
looked up in the database or inserted as a new resource. If no resource in
the database matches $n$ (to check this, a subgraph of $\Psi$ including all
pattern nodes and predicates reachable from $n$ is generated and matched
against the database), or if $n$ appears in the {\tt INSERT} section of the
assertion, a new resource is created and its {\tt id} is assigned to $V(n)$.
If a matching resource is found, $V(n)$ becomes equal to its {\tt id}.
\item If $n$ is an external URIref, it is looked up in the {\tt Resource}
table. As with subject blank nodes, $V(n)$ is the {\tt id} of a matching or
new resource.
\end{enumerate}
All nodes that were inserted during this stage are recorded in the set of new
nodes $N$.
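In pseudo-Ruby, the value mapping can be compressed as follows (a sketch:
the regular expressions are those used by the current implementation to
classify nodes; {\tt existing\_id!}, {\tt match}, and {\tt insert} are
hypothetical helpers, and an object-only blank node takes its value from the
{\tt UPDATE} section instead of being matched or inserted):
\begin{verbatim}
def resource_value(n)
  case n
  when SquishQuery::INTERNAL then existing_id!(n)
  when SquishQuery::LITERAL  then n
  when SquishQuery::BN       then match(n) || insert(n)
  when URI::URI_REF          then match(n) || insert(n)
  end
end
\end{verbatim}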
\subsection{Data Assignment}
For all aliases from $A$, except the additional aliases defined for external
URIref nodes (which don't have to be looked up, since their {\tt id}s are
recorded in $V$ during the previous stage), a reverse positional mapping
\begin{equation}
R_\mathrm{A}(a) = \{i_1, \enspace i_2, \enspace \dots\}
\end{equation}
is defined. The key node $K$ is defined as the subject node $s_{i_1}$ from
clause $\psi_{i_1}$, and the aliased table $t$ is defined as the table name
$t_{i_1}$ from the clause mapping $c_{i_1}$.
For each position $k$ from $R_\mathrm{A}(a)$, a pair $\langle f_k, V(o_k)
\rangle$, where $f_k$ is the field name from $c_k$ and $o_k$ is the object
node from $\psi_k$, is added to the data assignment list $D(K)$ if the node
$o_k$ occurs in the new node list $N$ or in the {\tt UPDATE} section of the
assertion statement.
If the key node $K$ occurs in $N$, a new row is inserted into the table $t$.
If $K$ is not in $N$, but $D(K)$ is not empty, an SQL update statement is
generated for the row of $t$ with {\tt id} equal to $V(K)$. In both cases,
assignments are generated from the data assignment list $D(K)$.
The above procedure is repeated for each alias $a$ included in
$R_\mathrm{A}$.
\subsection{Iterative Assertions}
If the assertion pattern matches more than once in the site knowledge base,
the algorithm defined in this section will nevertheless run the appropriate
insertions and updates only once. For an iterative update of all occurrences
of the pattern, the assertion has to be programmatically wrapped inside an
appropriate RDF query.
\section{Implementation Details}
The Samizdat engine\cite{samizdat-impl-report} is written in the Ruby
programming language and uses the PostgreSQL database for storage, along
with an assortment of Ruby libraries for database access (DBI),
configuration and RDF mapping (YAML), l10n (GetText), and the Pingback
protocol (XML-RPC). It runs on a variety of platforms ranging from Debian
GNU/Linux to Windows 98/Cygwin. Samizdat is free software and is available
under the GNU General Public License, version 2 or later.
Samizdat project development started in December 2002; the first public
release was announced in June 2003. As of the second beta version 0.5.1,
released in March 2004, Samizdat provided a basic set of open publishing
functionality, including registering site members, publishing and replying
to messages, uploading multimedia messages, voting on the relation of site
focuses to resources, creating and managing new focuses, and hand-editing or
using a GUI for constructing and publishing Squish queries that can be used
to search and filter site resources.
\section{Conclusions}
Wide adoption of the Semantic Web requires interoperability between
relational databases and RDF applications. Existing RDF stores treat
relational data as legacy and require that it be recorded in triples before
being processed, with the exception of the Federate system, which provides
limited direct access to relational data via an application-specific RDF
schema.
The Samizdat RDF storage layer provides an intermediate solution to this
problem by combining relational databases with arbitrary RDF meta-data. The
described approach makes it possible to take advantage of RDBMS
transactions, replication, performance optimizations, etc., in Semantic Web
applications, and reduces the cost of migration from the relational data
model to RDF.
As can be seen from the corresponding sections of this paper, the current
implementation of the proposed approach has several limitations. These
limitations are not caused by limitations of the approach itself; rather,
they reflect the pragmatic decision to only implement the functionality that
is used by the Samizdat engine. As more advanced collaboration features such
as message versioning and aggregation are added to Samizdat, some of the
limitations of its RDF storage layer will be removed.
% ---- Bibliography ----
%
\begin{thebibliography}{19}
%
\bibitem {ics-volume}
Alexaki, S., Christophides, V., Karvounarakis, G., Plexousakis D., Tolle, K.:
The RDFSuite: Managing Voluminous RDF Description Bases, Technical report,
ICS-FORTH, Heraklion, Greece, 2000.\\
http://139.91.183.30:9090/RDF/publications/semweb2001.html
\bibitem {swad-storage}
Beckett, Dave:
Semantic Web Scalability and Storage: Survey of Free Software / Open Source
RDF storage systems, SWAD-Europe Deliverable 10.1\\
http://www.w3.org/2001/sw/Europe/reports/rdf\_scalable\_storage\_report
\bibitem {swad-rdbms-mapping}
Beckett, D., Grant, J.:
Semantic Web Scalability and Storage: Mapping Semantic Web Data with RDBMSes,
SWAD-Europe Deliverable 10.2\\
http://www.w3.org/2001/sw/Europe/reports/scalable\_rdbms\_mapping\_report
\bibitem{yaml}
Ben-Kiki, O., Evans, C., Ingerson, B.:
YAML Ain't Markup Language (YAML) 1.0. Working Draft 2004-JAN-29.\\
http://www.yaml.org/spec/
\bibitem {notation3}
Berners-Lee, Tim:
Notation3 --- Ideas about Web architecture\\
http://www.w3.org/DesignIssues/Notation3
\bibitem {d2r}
Bizer, Chris:
D2R MAP --- Database to RDF Mapping Language and Processor\\
http://www.wiwiss.fu-berlin.de/suhl/bizer/d2rmap/D2Rmap.htm
\bibitem {samizdat-rdf-storage}
Borodaenko, Dmitry:
Samizdat RDF Storage, December 2002\\
http://savannah.nongnu.org/cgi-bin/viewcvs/samizdat/samizdat/doc/rdf-storage.txt
\bibitem {samizdat-impl-report}
Borodaenko, Dmitry:
Samizdat RDF Implementation Report, September 2003\\
http://lists.w3.org/Archives/Public/www-rdf-interest/2003Sep/0043.html
\bibitem {rdf-mt}
Hayes, Patrick:
RDF Semantics. W3C, February 2004\\
http://www.w3.org/TR/rdf-mt
\bibitem {rdql}
Jena Semantic Web Framework:
RDQL Grammar\\
http://jena.sf.net/RDQL/rdql\_grammar.html
\bibitem {ericp-rdf-rdb-access}
Prud'hommeaux, Eric:
RDF Access to Relational Databases\\
http://www.w3.org/2003/01/21-RDF-RDB-access/
\bibitem {rssdb}
RSSDB --- RDF Schema Specific DataBase (RSSDB), ICS-FORTH, 2002\\
http://139.91.183.30:9090/RDF/RSSDB/
\bibitem {squish}
Miller, Libby, Seaborne, Andy, Reggiori, Alberto:
Three Implementations of SquishQL, a Simple RDF Query Language. 1st
International Semantic Web Conference (ISWC2002), June 9-12, 2002. Sardinia,
Italy.\\
http://ilrt.org/discovery/2001/02/squish/
\end{thebibliography}
\end{document}
ruby-graffiti-2.2/doc/rdf-impl-report.txt

Samizdat RDF Implementation Report
==================================
http://lists.w3.org/Archives/Public/www-rdf-interest/2003Sep/0043.html
Implementation
--------------
http://www.nongnu.org/samizdat/
Samizdat is a generic RDF-based engine for building collaboration and
open publishing web sites. Samizdat will let everyone publish, view,
comment, edit, and aggregate text and multimedia resources, vote on
ratings and classifications, filter resources by flexible sets of
criteria, and cooperate and coordinate on all kinds of activities (see
the Design Goals document). Samizdat intends to promote the values of
freedom, openness, equality, and cooperation.
The Samizdat engine is implemented using the Ruby programming language,
the Apache mod_ruby module, and the PostgreSQL RDBMS, and is available
under the GNU General Public License, version 2 or later.
Project development started in December 2002; the first public release
was announced in June 2003. This report refers to Samizdat 0.0.4,
released on 2003-09-01.
Functionality covered by this version includes: registering site
members, publishing and replying to messages, uploading multimedia
messages, voting on standard tags on resources, and hand-editing or
using a GUI for constructing and publishing Squish queries that can be
used to search and filter site resources.
RDF Schema
----------
Samizdat defines its own RDF schema for the description of site members,
published messages, votes, and other site resources (see the Concepts
document). One of the outstanding features of the Samizdat schema is the
use of statement reification in the approval of content classification
with votes cast by site members.
The Samizdat RDF schema uses Dublin Core metadata where applicable;
also, integration of site member descriptions with FOAF is planned.
One of the problems encountered in Samizdat RDF schema development was
the lack of standard metadata describing discussion threads. While other
properties defined in the Samizdat schema denote Samizdat-specific
concepts, such as "vote" and "rating", it would be more desirable to use
commonly agreed metadata for the threading structure in place of the
implementation-local "thread" and "inReplyTo" properties.
RDF Import and Export
---------------------
While the Samizdat model follows the RDF Concepts and RDF Semantics
recommendations (with the exceptions noted below), the engine does not
externally interchange RDF data and thus does not use RDF/XML or any
other RDF serialization format. It is assumed that, when the need for
RDF import and export arises, it can be implemented externally on top of
the Samizdat RDF storage module using existing RDF frameworks such as
Redland.
Datatyped Literals
------------------
Samizdat doesn't implement datatyped literals, and relies on the
underlying PostgreSQL capabilities for mapping between literal values
and their string representations. Outside of the SQL context, literals
are interpreted as opaque strings; XML literals are not treated
specially, and datatype information is not preserved.
However, support for XML Schema datatypes is considered necessary in
order to untie a Samizdat knowledge base from the specifics of the
underlying RDF storage, and will be implemented as a prerequisite for
migration to a selection of alternative RDF storage backends (candidates
are FramerD, 3store, and Redland).
Language Tags
-------------
Literal language tags are not honoured; the "dc:language" property is
supposed to be used to denote message language.
Entailments
-----------
Samizdat RDF storage only implements simple entailment; vocabulary
entailment is not implemented yet. At the moment, simple entailment
suffices for all features of the Samizdat engine. If and when vocabulary
entailment becomes necessary, it will be implemented in the Samizdat RDF
storage module or relegated to an alternative RDF storage backend,
depending on the status of backend alternatives for Samizdat at that
time.
Query Support
-------------
Samizdat RDF storage implements translation of RDF query graphs written
in extended Squish into relational SQL queries, and allows a purely
relational representation of selected properties of site resources (see
the RDF Storage and Storage Implementation documents).
It must be noted that, at the moment, the status of RDF query language
standards is unsatisfactory.
The DAML Query Language abstract specification provides an excellent
formal basis, but does not encompass all capabilities of existing RDF
query languages. Also, existing query languages are limited in one way
or another, are underformalized (most are defined by a single
implementation), and are often overloaded with baroque syntax.
The two major features missed the most in existing query languages at
the time of the Samizdat RDF storage implementation were: knowledge base
update allowing complex constructs to be merged into the site KB graph
(implemented in the Samizdat RDF Data Manipulation Language), and
workflow control providing at least transaction rollback (in Samizdat,
the underlying PostgreSQL transactions are used). Other Squish
extensions implemented in Samizdat are literal conditions and answer
collection ordering (currently relegated to PostgreSQL; ideally,
interpreted according to literal datatypes).
ruby-graffiti-2.2/graffiti.gemspec

Gem::Specification.new do |spec|
spec.name = 'graffiti'
spec.version = '2.1'
spec.author = 'Dmitry Borodaenko'
spec.email = 'angdraug@debian.org'
spec.homepage = 'https://github.com/angdraug/graffiti'
spec.summary = 'Relational RDF store for Ruby'
spec.description = <<-EOF
Graffiti is an RDF store based on dynamic translation of RDF queries into SQL.
Graffiti allows one to map any relational database schema into RDF semantics
and vice versa, to store any RDF data in a relational database.
Graffiti uses Sequel to connect to database backend and provides a DBI-like
interface to run RDF queries in Squish query language from Ruby applications.
EOF
spec.files = `git ls-files`.split "\n"
spec.test_files = Dir['test/ts_*.rb']
spec.license = 'GPL3+'
spec.add_dependency('syncache')
spec.add_dependency('sequel')
end
ruby-graffiti-2.2/lib/graffiti.rb

# Graffiti RDF Store
# (originally written for Samizdat project)
#
# Copyright (c) 2002-2009 Dmitry Borodaenko
#
# This program is free software.
# You can distribute/modify this program under the terms of
# the GNU General Public License version 3 or later.
#
# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
# see doc/storage-impl.txt for explanation of implemented algorithms
#
# vim: et sw=2 sts=2 ts=8 tw=0
require 'graffiti/store'
ruby-graffiti-2.2/lib/graffiti/ 0000775 0000000 0000000 00000000000 11764675307 0016510 5 ustar 00root root 0000000 0000000 ruby-graffiti-2.2/lib/graffiti/debug.rb 0000664 0000000 0000000 00000001337 11764675307 0020127 0 ustar 00root root 0000000 0000000 # Graffiti RDF Store
# (originally written for Samizdat project)
#
# Copyright (c) 2002-2011 Dmitry Borodaenko
#
# This program is free software.
# You can distribute/modify this program under the terms of
# the GNU General Public License version 3 or later.
#
# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
# see doc/storage-impl.txt for explanation of implemented algorithms
#
# vim: et sw=2 sts=2 ts=8 tw=0
module Graffiti
module Debug
private
DEBUG = false
def debug(message = nil)
return unless DEBUG
message = yield if block_given?
log message if message
end
def log(message)
STDERR << 'Graffiti: ' << message.to_s << "\n"
end
end
end
ruby-graffiti-2.2/lib/graffiti/exceptions.rb 0000664 0000000 0000000 00000001107 11764675307 0021215 0 ustar 00root root 0000000 0000000 # Graffiti RDF Store
# (originally written for Samizdat project)
#
# Copyright (c) 2002-2009 Dmitry Borodaenko
#
# This program is free software.
# You can distribute/modify this program under the terms of
# the GNU General Public License version 3 or later.
#
# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
# see doc/storage-impl.txt for explanation of implemented algorithms
#
# vim: et sw=2 sts=2 ts=8 tw=0
module Graffiti
# raised for syntax errors in Squish statements
class ProgrammingError < RuntimeError; end
end
ruby-graffiti-2.2/lib/graffiti/rdf_config.rb 0000664 0000000 0000000 00000004141 11764675307 0021135 0 ustar 00root root 0000000 0000000 # Graffiti RDF Store
# (originally written for Samizdat project)
#
# Copyright (c) 2002-2011 Dmitry Borodaenko
#
# This program is free software.
# You can distribute/modify this program under the terms of
# the GNU General Public License version 3 or later.
#
# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
# see doc/storage-impl.txt for explanation of implemented algorithms
#
# vim: et sw=2 sts=2 ts=8 tw=0
require 'graffiti/rdf_property_map'
module Graffiti
# Configuration of relational RDF storage (see examples)
#
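# Example (illustrative):
#
#   config = RdfConfig.new(
#     'ns'  => {'dc' => 'http://purl.org/dc/elements/1.1/'},
#     'map' => {'dc::date' => {'resource' => 'published_date'}})
#
#   config.ns_expand('dc::date')
#   # => "http://purl.org/dc/elements/1.1/date"
#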
class RdfConfig
def initialize(config)
@ns = config['ns']
@map = {}
config['map'].each_pair do |p, m|
table, field = m.to_a.first
p = ns_expand(p)
@map[p] = RdfPropertyMap.new(p, table, field)
end
if config['subproperties'].kind_of? Hash
config['subproperties'].each_pair do |p, subproperties|
p = ns_expand(p)
map = @map[p] or raise RuntimeError,
"Incorrect RDF storage configuration: superproperty #{p} must be mapped"
map.superproperty = true
qualifier = RdfPropertyMap.qualifier_property(p)
@map[qualifier] = RdfPropertyMap.new(
qualifier, map.table, RdfPropertyMap.qualifier_field(map.field))
subproperties.each do |subp|
subp = ns_expand(subp)
@map[subp] = RdfPropertyMap.new(subp, map.table, map.field)
@map[subp].subproperty_of = p
end
end
end
if config['transitive_closure'].kind_of? Hash
config['transitive_closure'].each_pair do |p, table|
@map[ ns_expand(p) ].transitive_closure = table
if config['subproperties'].kind_of?(Hash) and config['subproperties'][p]
config['subproperties'][p].each do |subp|
@map[ ns_expand(subp) ].transitive_closure = table
end
end
end
end
end
# hash of namespaces
attr_reader :ns
# map internal property names with expanded namespaces to RdfPropertyMap
# objects
#
attr_reader :map
def ns_expand(p)
p and p.sub(/\A(\S+?)::/) { @ns[$1] }
end
end
end
ruby-graffiti-2.2/lib/graffiti/rdf_property_map.rb 0000664 0000000 0000000 00000005000 11764675307 0022404 0 ustar 00root root 0000000 0000000 # Graffiti RDF Store
# (originally written for Samizdat project)
#
# Copyright (c) 2002-2011 Dmitry Borodaenko
#
# This program is free software.
# You can distribute/modify this program under the terms of
# the GNU General Public License version 3 or later.
#
# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
# see doc/storage-impl.txt for explanation of implemented algorithms
#
# vim: et sw=2 sts=2 ts=8 tw=0
module Graffiti
# Map of an internal RDF property into relational storage
#
class RdfPropertyMap
# special qualifier map
#
# ' ' is added to the property name to make sure it can't clash with any
# valid property uriref
#
def RdfPropertyMap.qualifier_property(property, type = 'subproperty')
property + ' ' + type
end
# special qualifier field
#
def RdfPropertyMap.qualifier_field(field, type = 'subproperty')
field + '_' + type
end
def initialize(property, table, field)
# fixme: support ambiguous mappings
@property = property
@table = table
@field = field
end
# expanded uriref of the mapped property
#
attr_reader :property
# name of the table into which the property is mapped (property domain is an
# internal resource class mapped into this table)
#
attr_reader :table
# name of the field into which the property is mapped
#
# if property range is not a literal, the field is a reference to the
# resource table
#
attr_reader :field
# expanded uriref of the property which this property is a subproperty of
#
# if set, this property maps into the same table and field as its
# superproperty, and is qualified by an additional field named
# <field>_subproperty which refers to a uriref resource holding the uriref
# of this subproperty
#
attr_accessor :subproperty_of
attr_writer :superproperty
# set to +true+ if this property has subproperties
#
def superproperty?
@superproperty or false
end
# name of transitive closure table for a transitive property
#
# the format of a transitive closure table is:
#
# - 'resource' field refers to the subject resource id
# - '' property field and '_subproperty' qualifier field (in
# case of subproperty) have the same name as in the main table
# - 'distance' field holds the distance from subject to object in the RDF
# graph
#
# the transitive closure table is automatically updated by a trigger on every
# update of the main table
#
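# illustrative example (table and field names are hypothetical):
#
#   CREATE TABLE part_of_closure (
#       resource INTEGER,   -- subject resource id
#       part_of INTEGER,    -- same field name as in the main table
#       distance INTEGER    -- distance from subject to object
#   );
#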
attr_accessor :transitive_closure
end
end
ruby-graffiti-2.2/lib/graffiti/sql_mapper.rb 0000664 0000000 0000000 00000062326 11764675307 0021211 0 ustar 00root root 0000000 0000000 # Graffiti RDF Store
# (originally written for Samizdat project)
#
# Copyright (c) 2002-2011 Dmitry Borodaenko
#
# This program is free software.
# You can distribute/modify this program under the terms of
# the GNU General Public License version 3 or later.
#
# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
# see doc/storage-impl.txt for explanation of implemented algorithms
#
# vim: et sw=2 sts=2 ts=8 tw=0
require 'delegate'
require 'uri/common'
require 'graffiti/rdf_property_map'
require 'graffiti/squish'
module Graffiti
class SqlNodeBinding
def initialize(table_alias, field)
@alias = table_alias
@field = field
end
attr_reader :alias, :field
def to_s
@alias + '.' + @field
end
alias :inspect :to_s
def eql?(binding)
@alias == binding.alias and @field == binding.field
end
alias :'==' :eql?
def hash
self.to_s.hash
end
end
class SqlExpression < DelegateClass(Array)
def initialize(*parts)
super parts
end
def to_s
'(' << self.join(' ') << ')'
end
alias :to_str :to_s
def traverse(&block)
self.each do |part|
case part
when SqlExpression
part.traverse(&block)
else
yield part
end
end
end
def rebind!(rebind, &block)
self.each_with_index do |part, i|
case part
when SqlExpression
part.rebind!(rebind, &block)
when SqlNodeBinding
if rebind[part]
self[i] = rebind[part]
yield part if block_given?
end
end
end
end
alias :eql? :'=='
def hash
self.to_s.hash
end
end
# Transform RDF query pattern graph into a relational join expression.
#
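# Usage sketch (illustrative; the pattern is a list of
# [predicate, subject, object] triples as produced by the Squish parser):
#
#   mapper = SqlMapper.new(config, [
#     ['http://purl.org/dc/elements/1.1/title', '?msg', '?title'],
#     ['http://purl.org/dc/elements/1.1/date',  '?msg', '?date'] ])
#   mapper.from   # SQL FROM clause
#   mapper.where  # SQL WHERE conditions
#   mapper.bind('?msg')  # SQL binding for the ?msg node
#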
class SqlMapper
include Debug
def initialize(config, pattern, negative = [], optional = [], global_filter = '')
@config = config
@global_filter = global_filter
check_graph(pattern)
negative.empty? or check_graph(pattern + negative)
optional.empty? or check_graph(pattern + optional)
map_predicates(pattern, negative, optional)
transform
generate_tables_and_conditions
@jc = @aliases = @ac = @global_filter = nil
end
# map clause position to table, field, and table alias
#
# position => {
# :subject => {
# :node => node,
# :field => field
# },
# :object => {
# :node => node,
# :field => field
# },
# :map => RdfPropertyMap,
# :bind_mode => < :must_bind | :may_bind | :must_not_bind >,
# :alias => alias
# }
#
attr_reader :clauses
# map node to list of positions in clauses
#
# node => {
# :positions => [
# { :clause => position, :role => < :subject | :object > }
# ],
# :bind_mode => < :must_bind | :may_bind | :must_not_bind >,
# :colors => { color1 => bind_mode1, ... },
# :ground => < true | false >
# }
#
attr_reader :nodes
# list of tables for FROM clause of SQL query
attr_reader :from
# conditions for WHERE clause of SQL query
attr_reader :where
# return node's binding, raise exception if the node isn't bound
#
def bind(node)
(@nodes[node] and @bindings[node] and (binding = @bindings[node].first)
) or raise ProgrammingError,
"Node '#{node}' is not bound by the query pattern"
@nodes[node][:positions].each do |p|
if :object == p[:role] and @clauses[ p[:clause] ][:map].subproperty_of
property = @clauses[ p[:clause] ][:map].property
return %{select_subproperty(#{binding}, #{bind(property)})}
end
end
binding
end
private
# Check whether pattern is not a disjoint graph (all nodes are
# undirectionally reachable from one node).
#
def check_graph(pattern)
nodes = pattern.transpose[1, 2].flatten.uniq # all nodes
seen = [ nodes.shift ]
found_more = true
while found_more and not nodes.empty?
found_more = false
pattern.each do |predicate, subject, object|
if seen.include?(subject) and nodes.include?(object)
seen.push(object)
nodes.delete(object)
found_more = true
elsif seen.include?(object) and nodes.include?(subject)
seen.push(subject)
nodes.delete(subject)
found_more = true
end
end
end
nodes.empty? or raise ProgrammingError, "Query pattern is a disjoint graph"
end
# Stage 1: Predicate Mapping (storage-impl.txt).
#
def map_predicates(pattern, negative, optional)
@nodes = {}
@clauses = []
map_pattern(pattern, :must_bind)
map_pattern(negative, :must_not_bind)
map_pattern(optional, :may_bind)
@color_counter = @must_bind_nodes = nil
refine_ambiguous_properties
debug do
@nodes.keys.sort.each do |node|
n = @nodes[node]
debug %{map_predicates #{node}: #{n[:bind_mode]} #{n[:colors].inspect}}
end
nil
end
end
# Label every connected component of the pattern with a different color.
#
# Pattern clause positions:
#
# 0. predicate
# 1. subject
# 2. object
# 3. filter
#
# Returns hash of node colors.
#
# Implements the {Two-pass Connected Component Labeling algorithm}
# [http://en.wikipedia.org/wiki/Connected_Component_Labeling#Two-pass]
# with an added special case to exclude _alien_nodes_ from neighbor lists.
#
# The special case ensures that parts of a may-bind or must-not-bind
# subpattern that are only connected through a must-bind node do not connect.
#
def label_pattern_components(pattern, alien_nodes, augment_alien_nodes = false)
return {} if pattern.empty?
color = {}
color_eq = [] # [ [ smaller, larger ], ... ]
nodes = pattern.transpose[1, 2].flatten.uniq
alien_nodes_here = nodes & alien_nodes
@color_counter = @color_counter ? @color_counter.next : 0
color[ nodes[0] ] = @color_counter
# first pass
1.upto(nodes.size - 1) do |i|
node = nodes[i]
pattern.each do |predicate, subject, object, filter|
if node == subject
neighbor = object
elsif node == object
neighbor = subject
end
next if neighbor.nil? or color[neighbor].nil? or
alien_nodes_here.include?(neighbor)
if color[node].nil?
color[node] = color[neighbor]
elsif color[node] != color[neighbor] # record color equivalence
color_eq |= [ [ color[node], color[neighbor] ].sort ]
end
end
color[node] ||= (@color_counter += 1)
end
# second pass
nodes.each do |node|
while eq = color_eq.rassoc(color[node])
color[node] = eq[0]
end
end
alien_nodes.push(*nodes).uniq! if augment_alien_nodes
color
end
def map_pattern(pattern, bind_mode = :must_bind)
pattern = pattern.dup
@must_bind_nodes ||= []
color = label_pattern_components(pattern, @must_bind_nodes, :must_bind == bind_mode)
pattern.each do |predicate, subject, object, filter, transitive|
# validate the triple
predicate =~ URI::URI_REF or raise ProgrammingError,
"Valid uriref expected in predicate position instead of '#{predicate}'"
[subject, object].each do |node|
node =~ SquishQuery::INTERNAL or
node =~ SquishQuery::BN or
node =~ URI::URI_REF or
raise ProgrammingError,
"Resource or blank node name expected instead of '#{node}'"
end
# list of possible mappings into internal tables
map = @config.map[predicate]
if transitive and map.transitive_closure.nil?
raise ProgrammingError,
"No transitive closure is defined for #{predicate} property"
end
if map and
(subject =~ SquishQuery::BN or
subject =~ SquishQuery::INTERNAL or
subject =~ SquishQuery::PARAMETER or
'resource' == map.table)
# internal predicate and subject is mappable to resource table
i = clauses.size
@clauses[i] = {
:subject => [ { :node => subject, :field => 'id' } ],
:object => [ { :node => object, :field => map.field } ],
:map => map,
:transitive => transitive,
:bind_mode => bind_mode
}
@clauses[i][:filter] = SqlExpression.new(filter) if filter
[subject, object].each do |node|
if @nodes[node]
@nodes[node][:bind_mode] =
stronger_bind_mode(@nodes[node][:bind_mode], bind_mode)
else
@nodes[node] = { :positions => [], :bind_mode => bind_mode, :colors => {} }
end
# set of node colors, one for each bind_mode
@nodes[node][:colors][ color[node] ] = bind_mode
end
# reverse mapping of the node occurrences
@nodes[subject][:positions].push( { :clause => i, :role => :subject } )
@nodes[object][:positions].push( { :clause => i, :role => :object } )
if superp = map.subproperty_of
# link subproperty qualifier into the pattern
pattern.push(
[RdfPropertyMap.qualifier_property(superp), subject, predicate])
color[predicate] = color[object]
# no need to ground both subproperty and superproperty
@nodes[object][:ground] = true
end
else
# assume reification for unmapped predicates:
#
# | (rdf::predicate ?_stmt_#{i} p)
# (p s o) -> | (rdf::subject ?_stmt_#{i} s)
# | (rdf::object ?_stmt_#{i} o)
#
rdf = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
stmt = "?_stmt_#{i}"
pattern.push([rdf + 'predicate', stmt, predicate],
[rdf + 'subject', stmt, subject],
[rdf + 'object', stmt, object])
color[stmt] = color[predicate] = color[object]
end
end
end
# Select strongest of the two bind modes, in the following order of
# preference:
#
# :must_bind -> :must_not_bind -> :may_bind
#
def stronger_bind_mode(mode1, mode2)
if mode1 != mode2 and (:must_bind == mode2 or :may_bind == mode1)
mode2
else
mode1
end
end
# If a node can be mapped to more than one [table, field] pair, see if it can
# be refined based on other occurences of this node in other query clauses.
#
def refine_ambiguous_properties
@nodes.keys.sort.each do |node|
map = @nodes[node][:positions]
map.each_with_index do |p, i|
big = @clauses[ p[:clause] ][ p[:role] ]
next if big.size <= 1 # no refining needed
debug { 'refine_ambiguous_properties ' + node.to_s + ': ' + big.inspect }
(i + 1).upto(map.size - 1) do |j|
small_p = map[j]
small = @clauses[ small_p[:clause] ][ small_p[:role] ]
refined = big & small
if refined.size > 0 and refined.size < big.size
# refine the node...
@clauses[ p[:clause] ][ p[:role] ] = big = refined
# ...and its pair
@clauses[ p[:clause] ][ opposite_role(p[:role]) ].collect! {|pair|
refined.assoc(pair[0]) ? pair : nil
}.compact!
end
end
end
end
# drop remaining ambiguous mappings
# todo: split query for ambiguous mappings
@clauses.each do |clause|
next if clause.nil? # means it was reified
clause[:subject] = clause[:subject].first
clause[:object] = clause[:object].first
end
end
def opposite_role(role)
:subject == role ? :object : :subject
end
# Return current value of alias counter, remember which table it was assigned
# to, and increment the counter.
#
def next_alias(table, node, bind_mode = @nodes[node][:bind_mode])
@ac ||= 'a'
@aliases ||= {}
a = @ac.dup
@aliases[a] = {
:table => table,
:node => node,
:bind_mode => bind_mode,
:filter => []
}
@ac.next!
return a
end
def define_relation_aliases
@nodes.keys.sort.each do |node|
positions = @nodes[node][:positions]
debug { 'define_relation_aliases ' + positions.inspect }
# go through all clauses with this node in subject position
positions.each_with_index do |p, i|
next if :subject != p[:role] or @clauses[ p[:clause] ][:alias]
clause = @clauses[ p[:clause] ]
map = clause[:map]
table = clause[:transitive] ? map.transitive_closure : map.table
# see if we've already mapped this node to the same table before
0.upto(i - 1) do |j|
similar_clause = @clauses[ positions[j][:clause] ]
if similar_clause[:alias] and
similar_clause[:map].table == table and
similar_clause[:map].field != map.field
# same node, same table, different field -> same alias
clause[:alias] = similar_clause[:alias]
break
end
end
if clause[:alias].nil?
clause[:alias] =
if clause[:transitive]
# transitive clause bind mode overrides a stronger node bind mode
#
# fixme: generic case for multiple aliases per node
next_alias(table, node, clause[:bind_mode])
else
next_alias(table, node)
end
end
end
end # optimize: unnecessary aliases are generated
end
def update_alias_filters
@clauses.each do |c|
if c[:filter]
@aliases[ c[:alias] ][:filter].push(c[:filter])
end
end
end
# Stage 2: Relation Aliases and Join Conditions (storage-impl.txt).
#
# Result is map of aliases in @aliases and list of join conditions in @jc.
#
def transform
define_relation_aliases
update_alias_filters
# [ [ binding1, binding2 ], ... ]
@jc = []
@bindings = {}
@nodes.keys.sort.each do |node|
positions = @nodes[node][:positions]
# node binding
first = positions.first
clause = @clauses[ first[:clause] ]
a = clause[:alias]
binding = SqlNodeBinding.new(a, clause[ first[:role] ][:field])
@bindings[node] = [ binding ]
# join conditions
1.upto(positions.size - 1) do |i|
p = positions[i]
clause2 = @clauses[ p[:clause] ]
binding2 = SqlNodeBinding.new(clause2[:alias], clause2[ p[:role] ][:field])
unless @bindings[node].include?(binding2)
@bindings[node].push(binding2)
@jc.push([binding, binding2, node])
@nodes[node][:ground] = true
end
end
# ground non-blank nodes
if node !~ SquishQuery::BN
if node =~ SquishQuery::INTERNAL # internal resource id
@aliases[a][:filter].push SqlExpression.new(binding, '=', $1)
elsif node =~ SquishQuery::PARAMETER or node =~ SquishQuery::LITERAL
@aliases[a][:filter].push SqlExpression.new(binding, '=', node)
elsif node =~ URI::URI_REF # external resource uriref
r = nil
positions.each do |p|
next unless :subject == p[:role]
c = @clauses[ p[:clause] ]
if 'resource' == c[:map].table
r = c[:alias] # reuse existing mapping to resource table
break
end
end
if r.nil?
r = next_alias('resource', node)
r_binding = SqlNodeBinding.new(r, 'id')
@bindings[node].unshift(r_binding)
@jc.push([ binding, r_binding, node ])
end
@aliases[r][:filter].push SqlExpression.new(
SqlNodeBinding.new(r, 'uriref'), '=', "'t'", 'AND',
SqlNodeBinding.new(r, 'label'), '=', %{'#{node}'})
else
raise RuntimeError,
"Invalid node '#{node}' should never occur at this point"
end
@nodes[node][:ground] = true
end
end
debug do
@aliases.keys.sort.each {|a| debug %{transform #{a}: #{@aliases[a].inspect}} }
@jc.each {|jc| debug 'transform ' + jc.inspect }
nil
end
end
# Produce SQL FROM and WHERE clauses from results of transform().
#
def generate_tables_and_conditions
main_path, seen = jc_subgraph_path(:must_bind)
debug { 'generate_tables_and_conditions ' + main_path.inspect }
main_path and not main_path.empty? or raise RuntimeError,
'Failed to find table aliases for main query'
@where = ground_dangling_blank_nodes(main_path)
joins = ''
subquery_count = 'a'
[ :must_not_bind, :may_bind ].each do |bind_mode|
loop do
sub_path, new = jc_subgraph_path(bind_mode, seen)
break if sub_path.nil? or sub_path.empty?
debug { 'generate_tables_and_conditions ' + sub_path.inspect }
sub_query, sub_join = sub_path.partition {|a,| main_path.assoc(a).nil? }
# fixme: make sure that sub_join is not empty
if 1 == sub_query.size
# simplified case: join single table directly without a subquery
join_alias, = sub_query.first
a = @aliases[join_alias]
join_target = a[:table]
join_conditions = jc_path_to_join_conditions(sub_join) + a[:filter]
else
# left join subquery to the main query
join_alias = '_subquery_' << subquery_count
subquery_count.next!
sub_join = subquery_jc_path(sub_join, join_alias)
rebind = rebind_subquery(sub_path, join_alias)
select_nodes = subquery_select_nodes(rebind, main_path, sub_join)
join_conditions = jc_path_to_join_conditions(sub_join, rebind,
select_nodes)
select_nodes = select_nodes.keys.collect {|b|
b.to_s << ' AS ' << rebind[b].field
}.join(', ')
tables, conditions = jc_path_to_tables_and_conditions(sub_path)
join_target = "(\nSELECT #{select_nodes}\nFROM #{tables}"
join_target << "\nWHERE " << conditions unless conditions.empty?
join_target << "\n)"
join_target.gsub!(/\n(?!\)\z)/, "\n ")
end
joins << ("\nLEFT JOIN " + join_target + ' AS ' + join_alias + ' ON ' +
join_conditions.uniq.join(' AND '))
if :must_not_bind == bind_mode
left_join_is_null(main_path, sub_join)
end
end
end
@from, main_where = jc_path_to_tables_and_conditions(main_path)
@from << joins
@where.push('(' + main_where + ')') unless main_where.empty?
@where.push('(' + @global_filter + ')') unless @global_filter.empty?
@where = @where.join("\nAND ")
end
# Produce a subgraph path through join conditions linking all aliases with
# given _bind_mode_ that form a same-color connected component of the join
# conditions graph and weren't processed yet:
#
# path = [ [start, []], [ next, [ jc, ... ] ], ... ]
#
# Update _seen_ hash for all aliases included in the produced path.
#
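# For illustration (hypothetical aliases and node), a path connecting
# aliases 'a' and 'b' through the blank node '?msg' could look like:
#
# [ [ 'a', [] ],
# [ 'b', [ [ SqlNodeBinding.new('a', 'id'), SqlNodeBinding.new('b', 'creator'), '?msg' ] ] ] ]
#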
def jc_subgraph_path(bind_mode, seen = {})
start = find_alias(bind_mode, seen)
return nil if start.nil?
new = {}
new[start] = true
path = [ [start, []] ]
colors = @nodes[ @aliases[start][:node] ][:colors].keys
loop do # while we can find more connecting joins of the same color
join_alias = nil
@jc.each do |jc|
# use cases:
# - seen is empty (composing the must-bind join)
# - seen is not empty (composing a subquery)
next if (colors & @nodes[ jc[2] ][:colors].keys).empty?
0.upto(1) do |i|
a_seen = jc[i].alias
a_next = jc[1-i].alias
if not new[a_next] and (
((new[a_seen] or seen[a_seen]) and
(@aliases[a_next][:bind_mode] == bind_mode)
# connect an untouched node of matching bind mode
) or (
new[a_seen] and seen[a_next] and
# connect subquery to the rest of the query...
@aliases[a_seen][:bind_mode] == bind_mode
# ...but only go one step deep
))
join_alias = a_next
break
end
end
break if join_alias
end
break if join_alias.nil?
# join it to all seen aliases
join_on = @jc.find_all do |jc|
a1, a2 = jc[0, 2].collect {|b| b.alias }
(new[a1] and a2 == join_alias) or (new[a2] and a1 == join_alias)
end
new[join_alias] = true
path.push([join_alias, join_on])
end
seen.merge!(new)
[ path, new ]
end
def find_alias(bind_mode, seen = {})
@aliases.keys.sort.each do |a|
next if seen[a] or @aliases[a][:bind_mode] != bind_mode
return a
end
nil
end
# Ground all blank nodes that weren't grounded elsewhere with an explicit
# existence condition: IS NOT NULL for must-bind nodes, IS NULL for
# must-not-bind nodes.
#
def ground_dangling_blank_nodes(main_path)
conditions = []
ground_nodes = @global_filter.scan(SquishQuery::BN_SCAN)
@nodes.keys.sort.each do |node|
n = @nodes[node]
next if (n[:ground] or ground_nodes.include?(node))
expression =
case n[:bind_mode]
when :must_bind
'IS NOT NULL'
when :must_not_bind
'IS NULL'
else
next
end
@bindings[node].each do |binding|
if main_path.assoc(binding.alias)
conditions.push SqlExpression.new(binding, expression)
break
end
end
end
conditions
end
# Join a subquery to the main query: for each alias shared between the two,
# link 'id' field of the corresponding table within and outside the subquery.
# If no node is bound to the 'id' field, create a virtual node bound to it,
# so that it can be rebound by rebind_subquery().
#
def subquery_jc_path(sub_join, join_alias)
sub_join.empty? and raise ProgrammingError,
"Unexpected empty subquery, check your RDF storage configuration"
# fixme: reify instead of raising an exception
sub_join.transpose[0].uniq.collect do |a|
binding = SqlNodeBinding.new(a, 'id')
exists = false
@nodes.each do |node, n|
if @bindings[node].include?(binding)
exists = true
break
end
end
unless exists
node = '?' + join_alias + '_' + a
@nodes[node] = { :ground => true }
@bindings[node] = [ binding ]
end
[ a, [[ binding, binding ]] ]
end
end
# Generate a hash that maps all bindings that have been wrapped inside the
# subquery (_sub_path_, a jc path, see jc_subgraph_path()) to rebound
# bindings based on the _join_alias_, so that they may still be used in the
# main query.
#
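# For illustration (hypothetical alias 'b' wrapped into subquery
# '_subquery_a'), the returned hash could look like:
#
# { SqlNodeBinding.new('b', 'id') => SqlNodeBinding.new('_subquery_a', '_field_a') }
#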
def rebind_subquery(sub_path, join_alias)
rebind = {}
field_count = 'a'
wrapped = {}
sub_path.each {|a,| wrapped[a] = true }
@nodes.keys.sort.each do |node|
@bindings[node].each do |b|
if wrapped[b.alias] and rebind[b].nil?
field = '_field_' << field_count
field_count.next!
rebind[b] = SqlNodeBinding.new(join_alias, field)
end
end
end
rebind
end
# Go through the global filter, the filters in the main query, and the join
# conditions attaching the subquery to the main query; rebind the bindings
# for nodes wrapped inside the subquery, and return a hash whose keys are
# all bindings that should be selected from the subquery.
#
def subquery_select_nodes(rebind, main_path, sub_join)
select_nodes = {}
# update the global filter
@nodes.keys.sort.each do |node|
if r = rebind[ @bindings[node].first ]
@global_filter.gsub!(/#{Regexp.escape(node)}\b/) do
select_nodes[ @bindings[node].first ] = true
r.to_s
end
end
end
# update filters in the main query
main_path.each do |a,|
next if sub_join.assoc(a)
@aliases[a][:filter].each do |f|
f.rebind!(rebind) do |b|
select_nodes[b] = true
end
end
end
# update the subquery join path
sub_join.each do |a, jcs|
jcs.each do |jc|
select_nodes[ jc[0] ] = true
jc[1] = rebind[ jc[1] ]
end
end
# fixme: update main SELECT list
select_nodes
end
# Transform jc path (see jc_subgraph_path()) into a list of join conditions.
# If _rebind_ and _select_nodes_ hashes are defined, conditions will be
# rebound accordingly, and _select_nodes_ will be updated to include bindings
# used in the conditions.
#
def jc_path_to_join_conditions(jc_path, rebind = nil, select_nodes = nil)
conditions = []
jc_path.each do |a, jcs|
jcs.each do |b1, b2, n|
conditions.push SqlExpression.new(b1, '=', b2)
end
end
conditions.empty? and raise RuntimeError,
"Failed to join subquery to the main query"
conditions
end
# Generate FROM and WHERE clauses from a jc path (see jc_subgraph_path()).
#
def jc_path_to_tables_and_conditions(path)
first, = path[0]
a = @aliases[first]
tables = a[:table] + ' AS ' + first
conditions = a[:filter]
path[1, path.size - 1].each do |join_alias, join_on|
a = @aliases[join_alias]
tables <<
%{\nINNER JOIN #{a[:table]} AS #{join_alias} ON } <<
(
join_on.collect {|b1, b2| SqlExpression.new(b1, '=', b2) } +
a[:filter]
).uniq.join(' AND ')
end
[ tables, conditions.uniq.join("\nAND ") ]
end
# Find the key fields of a must-not-bind subquery and constrain them to be NULL.
#
def left_join_is_null(main_path, sub_join)
sub_join.each do |a, jcs|
jcs.each do |jc|
0.upto(1) do |i|
if main_path.assoc(jc[i].alias).nil?
@where.push SqlExpression.new(jc[i], 'IS NULL')
break
end
end
end
end
end
end
end
ruby-graffiti-2.2/lib/graffiti/squish.rb 0000664 0000000 0000000 00000036770 11764675307 0020366 0 ustar 00root root 0000000 0000000 # Graffiti RDF Store
# (originally written for Samizdat project)
#
# Copyright (c) 2002-2011 Dmitry Borodaenko
#
# This program is free software.
# You can distribute/modify this program under the terms of
# the GNU General Public License version 3 or later.
#
# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
# see doc/storage-impl.txt for explanation of implemented algorithms
#
# vim: et sw=2 sts=2 ts=8 tw=0
require 'graffiti/exceptions'
require 'graffiti/sql_mapper'
module Graffiti
# parse Squish query and translate triples to relational conditions
#
# provides access to internal representation of the parsed query and utility
# functions to deal with Squish syntax
#
class SquishQuery
include Debug
# regexp for internal resource reference
INTERNAL = Regexp.new(/\A([[:digit:]]+)\z/).freeze
# regexp for blank node mark and name
BN = Regexp.new(/\A\?([[:alnum:]_]+)\z/).freeze
# regexp for scanning blank nodes inside a string
BN_SCAN = Regexp.new(/\?[[:alnum:]_]+?\b/).freeze
# regexp for parametrized value
PARAMETER = Regexp.new(/\A:([[:alnum:]_]+)\z/).freeze
# regexp for replaced string literal
LITERAL = Regexp.new(/\A'(\d+)'\z/).freeze
# regexp for scanning replaced string literals in a string
LITERAL_SCAN = Regexp.new(/'(\d+)'/).freeze
# regexp for scanning query parameters inside a string
PARAMETER_AND_LITERAL_SCAN = Regexp.new(/\B:([[:alnum:]_]+)|'(\d+)'/).freeze
# regexp for number
NUMBER = Regexp.new(/\A-?[[:digit:]]+(\.[[:digit:]]+)?\z/).freeze
# regexp for operator
OPERATOR = Regexp.new(/\A(\+|-|\*|\/|\|\||<|<=|>|>=|=|!=|@@|to_tsvector|to_tsquery|I?LIKE|NOT|AND|OR|IN|IS|NULL)\z/i).freeze
# regexp for aggregate function
AGGREGATE = Regexp.new(/\A(avg|count|max|min|sum)\z/i).freeze
QUERY = Regexp.new(/\A\s*(SELECT|INSERT|UPDATE)\b\s*(.*?)\s*
\bWHERE\b\s*(.*?)\s*
(?:\bEXCEPT\b\s*(.*?))?\s*
(?:\bOPTIONAL\b\s*(.*?))?\s*
(?:\bLITERAL\b\s*(.*?))?\s*
(?:\bGROUP\s+BY\b\s*(.*?))?\s*
(?:\bORDER\s+BY\b\s*(.*?)\s*(ASC|DESC)?)?\s*
(?:\bUSING\b\s*(.*?))?\s*\z/mix).freeze
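# Example of a query shape accepted by this regexp (adapted from
# test/ts_graffiti.rb):
#
# SELECT ?msg, ?date
# WHERE (dc::title ?msg ?title)
# (dc::date ?msg ?date)
# ORDER BY ?date DESC
# USING PRESET NS
#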
# extract common Squish query sections, perform namespace substitution,
# generate query pattern graph, call transform_pattern,
# determine query type and parse nodes section accordingly
#
def initialize(config, query)
query.nil? and raise ProgrammingError, "SquishQuery: query can't be nil"
if query.kind_of? Hash # pre-parsed query (used by SquishAssert)
@nodes = query[:nodes]
@pattern = query[:pattern]
@negative = query[:negative]
@optional = query[:optional]
@strings = query[:strings]
@literal = @group = @order = ''
@sql_mapper = SqlMapper.new(config, @pattern)
return self
elsif not query.kind_of? String
raise ProgrammingError,
"Bad query initialization parameter class: #{query.class}"
end
debug { 'SquishQuery ' + query }
@query = query # keep original string
query = query.dup
# replace string literals with 'n' placeholders (also see #substitute_literals)
@strings = []
query.gsub!(/'((?:''|[^'])*)'/m) do
@strings.push $1.gsub("''", "'") # keep unescaped string
"'" + (@strings.size - 1).to_s + "'"
end
match = QUERY.match(query) or raise ProgrammingError,
"Malformed query: are keywords SELECT, INSERT, UPDATE or WHERE missing?"
match, @key, @nodes, @pattern, @negative, @optional, @literal,
@group, @order, @order_dir, @ns = match.to_a.collect {|m| m.to_s }
match = nil
@key.upcase!
@order_dir.upcase!
# namespaces
# todo: validate ns
@ns = (@ns.empty? or /\APRESET\s+NS\z/ =~ @ns) ? config.ns :
Hash[*@ns.gsub(/\b(FOR|AS|AND)\b/i, '').scan(/\S+/)]
@pattern = parse_pattern(@pattern)
@optional = parse_pattern(@optional)
@negative = parse_pattern(@negative)
# validate SQL expressions
validate_expression(@literal)
@group.split(/\s*,\s*/).each {|group| validate_expression(group) }
validate_expression(@order)
@sql_mapper = SqlMapper.new(
config, @pattern, @negative, @optional, @literal)
# check that all variables can be bound
@variables = query.scan(BN_SCAN)
@variables.each {|node| @sql_mapper.bind(node) }
return self
end
# blank variables control section
attr_reader :nodes
# query pattern graph as array of triples [ [p, s, o], ... ]
attr_reader :pattern
# literal SQL expression
attr_reader :literal
# SQL GROUP BY expression
attr_reader :group
# SQL order expression
attr_reader :order
# direction of order, ASC or DESC
attr_reader :order_dir
# query namespaces mapping
attr_reader :ns
# list of variables defined in the query
attr_reader :variables
# returns original string passed in for parsing
#
def to_s
@query
end
# replace 'n' substitutions with query string literals (see #new, #LITERAL)
#
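# e.g. if @strings[0] is "Test Message" (captured by #initialize),
# substitute_literals("'0'") returns "'Test Message'"
#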
def substitute_literals(s)
return s unless s.kind_of? String
s.gsub(LITERAL_SCAN) do
get_literal_value($1.to_i)
end
end
# replace schema uri with namespace prefix
#
def SquishQuery.uri_shrink!(uriref, prefix, uri)
uriref.gsub!(/\A#{uri}([^\/#]+)\z/) {"#{prefix}::#{$1}"}
end
# replace schema uri with a prefix from a supplied namespaces hash
#
def SquishQuery.ns_shrink(uriref, namespaces)
u = uriref.dup or return nil
namespaces.each {|p, uri| SquishQuery.uri_shrink!(u, p, uri) and break }
return u
end
# replace schema uri with a prefix from query namespaces
#
def ns_shrink(uriref)
SquishQuery.ns_shrink(uriref, @ns)
end
# validate expression
#
# expression := value [ operator expression ]
#
# value := blank_node | literal_string | number | '(' expression ')'
#
# whitespace between tokens (except inside parentheses) is mandatory
#
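# e.g. "?rating >= -1" and "max(?date)" pass validation, while
# "?rating ~ 'spam'" raises ProgrammingError ('~' is not a valid operator)
#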
def validate_expression(string)
# todo: lexical analyser
string.split(/[\s(),]+/).each do |token|
case token
when '', BN, PARAMETER, LITERAL, NUMBER, OPERATOR, AGGREGATE
else
raise ProgrammingError, "Bad token '#{token}' in expression"
end
end
string
end
private
PATTERN_SCAN = Regexp.new(/\A\((\S+)\s+(\S+)\s+(.*?)(?:\s+FILTER\b\s*(.*?)\s*)?(?:\s+(TRANSITIVE)\s*)?\)\z/).freeze
# parse query pattern graph out of a string, expand URI namespaces
#
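# For example, assuming the dc prefix expands to
# http://purl.org/dc/elements/1.1/,
#
# parse_pattern("(dc::title ?msg ?title)")
#
# returns
#
# [ ["http://purl.org/dc/elements/1.1/title", "?msg", "?title", nil, false] ]
#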
def parse_pattern(pattern)
pattern.scan(/\(.*?\)(?=\s*(?:\(|\z))/).collect do |c|
match, predicate, subject, object, filter, transitive = c.match(PATTERN_SCAN).to_a
match = nil
[predicate, subject, object].each do |u|
u.sub!(/\A(\S+?)::/) do
@ns[$1] or raise ProgrammingError, "Undefined namespace prefix #{$1}"
end
end
validate_expression(filter.to_s)
[predicate, subject, object, filter, 'TRANSITIVE' == transitive]
end
end
# replace RDF query parameters with their values
#
def expression_value(expr, params={})
case expr
when 'NULL'
nil
when PARAMETER
get_parameter_value($1, params)
when LITERAL
@strings[$1.to_i]
else
expr.gsub(PARAMETER_AND_LITERAL_SCAN) do
if $1 # parameter
get_parameter_value($1, params)
else # literal
get_literal_value($2.to_i)
end
end
# fixme: make Sequel treat it as SQL expression, not a string value
end
end
def get_parameter_value(name, params)
key = name.to_sym
params.has_key?(key) or raise ProgrammingError,
'Unknown parameter :' + name
params[key]
end
def get_literal_value(i)
"'" + @strings[i].gsub("'", "''") + "'"
end
end
class SquishSelect < SquishQuery
def initialize(config, query)
super(config, query)
if @key # initialized from a String, not a Hash
'SELECT' == @key or raise ProgrammingError,
'Wrong query type: SELECT expected instead of ' + @key
@nodes = @nodes.split(/\s*,\s*/).map {|node|
validate_expression(node)
}
end
end
# translate Squish SELECT query to SQL
#
def to_sql
where = @sql_mapper.where
select = @nodes.dup
select.push(@order) unless @order.empty? or @nodes.include?(@order)
# now put it all together
sql = %{\nFROM #{@sql_mapper.from}}
sql << %{\nWHERE #{where}} unless where.empty?
sql << %{\nGROUP BY #{@group}} unless @group.empty?
sql << %{\nORDER BY #{@order} #{@order_dir}} unless @order.empty?
select = select.map do |expr|
bind_blank_nodes(expr) + (BN.match(expr) ? (' AS ' + $1) : '')
end
sql = 'SELECT DISTINCT ' << select.join(', ') << bind_blank_nodes(sql)
sql =~ /\?/ and raise ProgrammingError,
"Unexpected '?' in translated query (probably, caused by unmapped blank node): #{sql.gsub(/\s+/, ' ')};"
substitute_literals(sql)
end
private
# replace blank node names with bindings
#
def bind_blank_nodes(sql)
sql.gsub(BN_SCAN) {|node| @sql_mapper.bind(node) }
end
end
class SquishAssert < SquishQuery
def initialize(config, query)
@config = config
super(@config, query)
if 'UPDATE' == @key
@insert = ''
@update = @nodes
elsif 'INSERT' == @key and @nodes =~ /\A\s*(.*?)\s*(?:\bUPDATE\b\s*(.*?))?\s*\z/
@insert, @update = $1, $2.to_s
else
raise ProgrammingError,
"Wrong query type: INSERT or UPDATE expected instead of " + @key
end
@insert = @insert.split(/\s*,\s*/).each {|s|
s =~ BN or raise ProgrammingError,
"Blank node expected in INSERT section instead of '#{s}'"
}
@update = @update.empty? ? {} : Hash[*@update.split(/\s*,\s*/).collect {|s|
s.split(/\s*=\s*/)
}.each {|node, value|
node =~ BN or raise ProgrammingError,
"Blank node expected on the left side of UPDATE assignment instead of '#{bn}'"
validate_expression(value)
}.flatten!]
end
def run(db, params={})
values = resource_values(db, params)
statements = []
alias_positions.each do |alias_, clauses|
statement = SquishAssertStatement.new(clauses, values)
statements.push(statement) if statement.action
end
SquishAssertStatement.run_ordered_statements(db, statements)
return @insert.collect {|node| values[node].value }
end
attr_reader :insert, :update
private
def resource_values(db, params)
values = {}
@sql_mapper.nodes.each do |node, n|
new = false
if node =~ INTERNAL # internal resource
value = $1.to_i # resource id
elsif node =~ PARAMETER
value = get_parameter_value($1, params)
elsif node =~ LITERAL
value = @strings[$1.to_i]
elsif node =~ BN
subject_position = n[:positions].select {|p| :subject == p[:role] }.first
if subject_position.nil? # blank node occurring only in object position
value = @update[node] or raise ProgrammingError,
%{Blank node #{node} is undefined (drop it or set its value in UPDATE section)}
value = expression_value(value, params)
else # resource blank node
unless @insert.include?(node)
s = SquishSelect.new(
@config, {
:nodes => [node],
:pattern => subgraph(node),
:strings => @strings
}
)
debug { 'resource_values ' + db[s.to_sql, params].select_sql }
found = db.fetch(s.to_sql, params).first
end
if found
value = found.values.first
else
table = @sql_mapper.clauses[ subject_position[:clause] ][:map].table
value = db[:resource].insert(:label => table)
debug { 'resource_values ' + db[:resource].insert_sql(:label => table) }
new = true unless 'resource' == table
end
end
else # external resource
uriref = { :uriref => true, :label => node }
found = db[:resource].filter(uriref).first
if found
value = found[:id]
else
value = db[:resource].insert(uriref)
debug { 'resource_values ' + db[:resource].insert_sql(uriref) }
end
end
debug { 'resource_values ' + node + ' = ' + value.inspect }
v = SquishAssertValue.new(value, new, @update.has_key?(node))
values[node] = v
end
debug { 'resource_values ' + values.inspect }
values
end
def alias_positions
a = {}
@sql_mapper.clauses.each_with_index do |clause, i|
a[ clause[:alias] ] ||= []
a[ clause[:alias] ].push(clause)
end
a
end
# calculate subgraph of query pattern that is reachable from _node_
#
# fixme: make it work with optional sub-patterns
#
def subgraph(node)
subgraph = [node]
w = []
begin
stop = true
@pattern.each do |triple|
if subgraph.include? triple[1] and not w.include? triple
subgraph.push triple[2]
w.push triple
stop = false
end
end
end until stop
return w
end
end
class SquishAssertValue
def initialize(value, new, updated)
@value = value
@new = new
@updated = updated
end
attr_reader :value
# true if node was inserted into resource during value generation and a
# corresponding record should be inserted into an internal resource table
# later
#
def new?
@new
end
# true if the node value is set in the UPDATE section of the Squish statement
#
def updated?
@updated
end
end
class SquishAssertStatement
include Debug
def initialize(clauses, values)
@key_node = clauses.first[:subject][:node]
@table = clauses.first[:map].table.to_sym
key = values[@key_node]
@params = {}
@references = []
clauses.each do |clause|
node = clause[:object][:node]
v = values[node]
if key.new? or v.updated?
field = clause[:object][:field]
@params[field.to_sym] = v.value
# when subproperty value is updated, update the qualifier as well
map = clause[:map]
if map.subproperty_of
@params[ RdfPropertyMap.qualifier_field(field).to_sym ] = values[map.property].value
elsif map.superproperty?
@params[ RdfPropertyMap.qualifier_field(field).to_sym ] = nil
end
@references.push(node) if v.new?
end
end
if key.new? and @table != :resource
# when id is inserted, insert_resource() trigger does nothing
@action = :insert
@params[:id] = key.value
elsif not @params.empty?
@action = :update
@filter = {:id => key.value}
end
debug { 'SquishAssertStatement ' + self.inspect }
end
attr_reader :key_node, :references, :action
def run(db)
if @action
ds = db[@table]
ds = ds.filter(@filter) if @filter
debug { :insert == @action ? ds.insert_sql(@params) : ds.update_sql(@params) }
ds.send(@action, @params)
end
end
# make sure mutually referencing records are inserted in the right order
#
def SquishAssertStatement.run_ordered_statements(db, statements)
statements = statements.sort_by {|s| s.references.size }
inserted = []
progress = true
until statements.empty? or not progress
progress = false
0.upto(statements.size - 1) do |i|
s = statements[i]
if (s.references - inserted).empty?
s.run(db)
inserted.push(s.key_node)
statements.delete_at(i)
progress = true
break
end
end
end
statements.empty? or raise ProgrammingError,
"Failed to resolve mutual references of inserted resources: " +
statements.collect {|s| s.key_node + ' -- ' + s.references.join(', ') }.join('; ')
end
end
end
ruby-graffiti-2.2/lib/graffiti/store.rb 0000664 0000000 0000000 00000004656 11764675307 0020204 0 ustar 00root root 0000000 0000000 # Graffiti RDF Store
# (originally written for Samizdat project)
#
# Copyright (c) 2002-2011 Dmitry Borodaenko
#
# This program is free software.
# You can distribute/modify this program under the terms of
# the GNU General Public License version 3 or later.
#
# see doc/rdf-storage.txt for introduction and Graffiti Squish definition;
# see doc/storage-impl.txt for explanation of implemented algorithms
#
# vim: et sw=2 sts=2 ts=8 tw=0
require 'syncache'
require 'graffiti/exceptions'
require 'graffiti/debug'
require 'graffiti/rdf_config'
require 'graffiti/squish'
module Graffiti
# API for RDF storage access, similar to DBI or Sequel
#
class Store
# initialize class attributes
#
# _db_ is a Sequel database handle
#
# _config_ is a hash of configuration options for RdfConfig
#
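# A minimal usage sketch (assumes a Sequel handle and a configuration hash
# loaded from a YAML file, as in test/ts_graffiti.rb; the file name below
# is hypothetical):
#
# db = Sequel.sqlite
# store = Graffiti::Store.new(db, YAML.load(File.read('rdf-config.yaml')))
#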
def initialize(db, config)
@db = db
@config = RdfConfig.new(config)
# cache parsed Squish SELECT queries
@select_cache = SynCache::Cache.new(nil, 1000)
end
# storage configuration in an RdfConfig object
#
attr_reader :config
# replace schema uri with a prefix from the configured namespaces
#
def ns_shrink(uriref)
SquishQuery.ns_shrink(uriref, @config.ns)
end
# get value of subject's property
#
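# e.g. store.get_property(1, 'dc::title') runs the Squish query
# SELECT ?object WHERE (dc::title :subject ?object) with :subject => 1
# (the resource id 1 is hypothetical)
#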
def get_property(subject, property)
fetch(%{SELECT ?object WHERE (#{property} :subject ?object)},
:subject => subject).get(:object)
end
def fetch(query, params={})
@db.fetch(select(query), params)
end
# get one query answer (similar to DBI#select_one)
#
def select_one(query, params={})
fetch(query, params).first
end
# get all query answers (similar to DBI#select_all)
#
def select_all(query, limit=nil, offset=nil, params={}, &p)
ds = fetch(query, params).limit(limit, offset)
if block_given?
ds.all(&p)
else
ds.all
end
end
# accepts String or pre-parsed SquishQuery object, caches SQL by String
#
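# e.g. sql = store.select(%{SELECT ?msg WHERE (dc::title ?msg ?title)})
# returns the translated SQL string and caches the parsed query for reuse
#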
def select(query)
query.kind_of?(String) and
query = @select_cache.fetch_or_add(query) { SquishSelect.new(@config, query) }
query.kind_of?(SquishSelect) or raise ProgrammingError,
"String or SquishSelect expected"
query.to_sql
end
# merge Squish query into RDF database
#
# returns list of new ids assigned to blank nodes listed in INSERT section
#
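# e.g. (adapted from test/ts_graffiti.rb; returns the id of the new ?msg):
#
# store.assert(%{INSERT ?msg
# UPDATE ?title = 'Test Message'
# WHERE (dc::creator ?msg 1)
# (dc::title ?msg ?title)})
#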
def assert(query, params={})
@db.transaction do
SquishAssert.new(@config, query).run(@db, params)
end
end
end
end
ruby-graffiti-2.2/setup.rb 0000664 0000000 0000000 00000071104 11764675307 0015637 0 ustar 00root root 0000000 0000000 #
# setup.rb
#
# Copyright (c) 2000-2004 Minero Aoki
#
# This program is free software.
# You can distribute/modify this program under the terms of
# the GNU LGPL, Lesser General Public License version 2.1.
#
unless Enumerable.method_defined?(:map) # Ruby 1.4.6
module Enumerable
alias map collect
end
end
unless File.respond_to?(:read) # Ruby 1.6
def File.read(fname)
open(fname) {|f|
return f.read
}
end
end
def File.binread(fname)
open(fname, 'rb') {|f|
return f.read
}
end
# for corrupted windows stat(2)
def File.dir?(path)
File.directory?((path[-1,1] == '/') ? path : path + '/')
end
class SetupError < StandardError; end
def setup_rb_error(msg)
raise SetupError, msg
end
#
# Config
#
if arg = ARGV.detect {|arg| /\A--rbconfig=/ =~ arg }
ARGV.delete(arg)
require arg.split(/=/, 2)[1]
$".push 'rbconfig.rb'
else
require 'rbconfig'
end
def multipackage_install?
FileTest.directory?(File.dirname($0) + '/packages')
end
class ConfigItem
def initialize(name, template, default, desc)
@name = name.freeze
@template = template
@value = default
@default = default.dup.freeze
@description = desc
end
attr_reader :name
attr_reader :description
attr_accessor :default
alias help_default default
def help_opt
"--#{@name}=#{@template}"
end
def value
@value
end
def eval(table)
@value.gsub(%r<\$([^/]+)>) { table[$1] }
end
def set(val)
@value = check(val)
end
private
def check(val)
setup_rb_error "config: --#{name} requires argument" unless val
val
end
end
class BoolItem < ConfigItem
def config_type
'bool'
end
def help_opt
"--#{@name}"
end
private
def check(val)
return 'yes' unless val
unless /\A(y(es)?|n(o)?|t(rue)?|f(alse)?)\z/i =~ val
setup_rb_error "config: --#{@name} accepts only yes/no for argument"
end
(/\Ay(es)?|\At(rue)/i =~ val) ? 'yes' : 'no'
end
end
class PathItem < ConfigItem
def config_type
'path'
end
private
def check(path)
setup_rb_error "config: --#{@name} requires argument" unless path
path[0,1] == '$' ? path : File.expand_path(path)
end
end
class ProgramItem < ConfigItem
def config_type
'program'
end
end
class SelectItem < ConfigItem
def initialize(name, template, default, desc)
super
@ok = template.split('/')
end
def config_type
'select'
end
private
def check(val)
unless @ok.include?(val.strip)
setup_rb_error "config: use --#{@name}=#{@template} (#{val})"
end
val.strip
end
end
class PackageSelectionItem < ConfigItem
def initialize(name, template, default, help_default, desc)
super name, template, default, desc
@help_default = help_default
end
attr_reader :help_default
def config_type
'package'
end
private
def check(val)
unless File.dir?("packages/#{val}")
setup_rb_error "config: no such package: #{val}"
end
val
end
end
class ConfigTable_class
def initialize(items)
@items = items
@table = {}
items.each do |i|
@table[i.name] = i
end
ALIASES.each do |ali, name|
@table[ali] = @table[name]
end
end
include Enumerable
def each(&block)
@items.each(&block)
end
def key?(name)
@table.key?(name)
end
def lookup(name)
@table[name] or raise ArgumentError, "no such config item: #{name}"
end
def add(item)
@items.push item
@table[item.name] = item
end
def remove(name)
item = lookup(name)
@items.delete_if {|i| i.name == name }
@table.delete_if {|name, i| i.name == name }
item
end
def new
dup()
end
def savefile
'.config'
end
def load
begin
t = dup()
File.foreach(savefile()) do |line|
k, v = *line.split(/=/, 2)
t[k] = v.strip
end
t
rescue Errno::ENOENT
setup_rb_error $!.message + "\n#{File.basename($0)} config first"
end
end
def save
@items.each {|i| i.value }
File.open(savefile(), 'w') {|f|
@items.each do |i|
f.printf "%s=%s\n", i.name, i.value if i.value
end
}
end
def [](key)
lookup(key).eval(self)
end
def []=(key, val)
lookup(key).set val
end
end
c = ::Config::CONFIG
rubypath = c['bindir'] + '/' + c['ruby_install_name']
major = c['MAJOR'].to_i
minor = c['MINOR'].to_i
teeny = c['TEENY'].to_i
version = "#{major}.#{minor}"
# ruby ver. >= 1.4.4?
newpath_p = ((major >= 2) or
((major == 1) and
((minor >= 5) or
((minor == 4) and (teeny >= 4)))))
if c['rubylibdir']
# V < 1.6.3
_stdruby = c['rubylibdir']
_siteruby = c['sitedir']
_siterubyver = c['sitelibdir']
_siterubyverarch = c['sitearchdir']
elsif newpath_p
# 1.4.4 <= V <= 1.6.3
_stdruby = "$prefix/lib/ruby/#{version}"
_siteruby = c['sitedir']
_siterubyver = "$siteruby/#{version}"
_siterubyverarch = "$siterubyver/#{c['arch']}"
else
# V < 1.4.4
_stdruby = "$prefix/lib/ruby/#{version}"
_siteruby = "$prefix/lib/ruby/#{version}/site_ruby"
_siterubyver = _siteruby
_siterubyverarch = "$siterubyver/#{c['arch']}"
end
libdir = '-* dummy libdir *-'
stdruby = '-* dummy rubylibdir *-'
siteruby = '-* dummy site_ruby *-'
siterubyver = '-* dummy site_ruby version *-'
parameterize = lambda {|path|
path.sub(/\A#{Regexp.quote(c['prefix'])}/, '$prefix')\
.sub(/\A#{Regexp.quote(libdir)}/, '$libdir')\
.sub(/\A#{Regexp.quote(stdruby)}/, '$stdruby')\
.sub(/\A#{Regexp.quote(siteruby)}/, '$siteruby')\
.sub(/\A#{Regexp.quote(siterubyver)}/, '$siterubyver')
}
libdir = parameterize.call(c['libdir'])
stdruby = parameterize.call(_stdruby)
siteruby = parameterize.call(_siteruby)
siterubyver = parameterize.call(_siterubyver)
siterubyverarch = parameterize.call(_siterubyverarch)
if arg = c['configure_args'].split.detect {|arg| /--with-make-prog=/ =~ arg }
makeprog = arg.sub(/'/, '').split(/=/, 2)[1]
else
makeprog = 'make'
end
common_conf = [
PathItem.new('prefix', 'path', c['prefix'],
'path prefix of target environment'),
PathItem.new('bindir', 'path', parameterize.call(c['bindir']),
'the directory for commands'),
PathItem.new('libdir', 'path', libdir,
'the directory for libraries'),
PathItem.new('datadir', 'path', parameterize.call(c['datadir']),
'the directory for shared data'),
PathItem.new('mandir', 'path', parameterize.call(c['mandir']),
'the directory for man pages'),
PathItem.new('sysconfdir', 'path', parameterize.call(c['sysconfdir']),
'the directory for system configuration files'),
PathItem.new('stdruby', 'path', stdruby,
'the directory for standard ruby libraries'),
PathItem.new('siteruby', 'path', siteruby,
'the directory for version-independent aux ruby libraries'),
PathItem.new('siterubyver', 'path', siterubyver,
'the directory for aux ruby libraries'),
PathItem.new('siterubyverarch', 'path', siterubyverarch,
'the directory for aux ruby binaries'),
PathItem.new('rbdir', 'path', '$siterubyver',
'the directory for ruby scripts'),
PathItem.new('sodir', 'path', '$siterubyverarch',
'the directory for ruby extensions'),
PathItem.new('rubypath', 'path', rubypath,
'the path to set to #! line'),
ProgramItem.new('rubyprog', 'name', rubypath,
'the ruby program used for installation'),
ProgramItem.new('makeprog', 'name', makeprog,
'the make program to compile ruby extensions'),
SelectItem.new('shebang', 'all/ruby/never', 'ruby',
'shebang line (#!) editing mode'),
BoolItem.new('without-ext', 'yes/no', 'no',
'does not compile/install ruby extensions')
]
class ConfigTable_class # open again
ALIASES = {
'std-ruby' => 'stdruby',
'site-ruby-common' => 'siteruby', # For backward compatibility
'site-ruby' => 'siterubyver', # For backward compatibility
'bin-dir' => 'bindir',
'rb-dir' => 'rbdir',
'so-dir' => 'sodir',
'data-dir' => 'datadir',
'ruby-path' => 'rubypath',
'ruby-prog' => 'rubyprog',
'ruby' => 'rubyprog',
'make-prog' => 'makeprog',
'make' => 'makeprog'
}
end
multipackage_conf = [
PackageSelectionItem.new('with', 'name,name...', '', 'ALL',
'package names that you want to install'),
PackageSelectionItem.new('without', 'name,name...', '', 'NONE',
'package names that you do not want to install')
]
if multipackage_install?
ConfigTable = ConfigTable_class.new(common_conf + multipackage_conf)
else
ConfigTable = ConfigTable_class.new(common_conf)
end
module MetaConfigAPI
def eval_file_ifexist(fname)
instance_eval File.read(fname), fname, 1 if File.file?(fname)
end
def config_names
ConfigTable.map {|i| i.name }
end
def config?(name)
ConfigTable.key?(name)
end
def bool_config?(name)
ConfigTable.lookup(name).config_type == 'bool'
end
def path_config?(name)
ConfigTable.lookup(name).config_type == 'path'
end
def value_config?(name)
case ConfigTable.lookup(name).config_type
when 'bool', 'path'
true
else
false
end
end
def add_config(item)
ConfigTable.add item
end
def add_bool_config(name, default, desc)
ConfigTable.add BoolItem.new(name, 'yes/no', default ? 'yes' : 'no', desc)
end
def add_path_config(name, default, desc)
ConfigTable.add PathItem.new(name, 'path', default, desc)
end
def set_config_default(name, default)
ConfigTable.lookup(name).default = default
end
def remove_config(name)
ConfigTable.remove(name)
end
end
#
# File Operations
#
module FileOperations
def mkdir_p(dirname, prefix = nil)
dirname = prefix + File.expand_path(dirname) if prefix
$stderr.puts "mkdir -p #{dirname}" if verbose?
return if no_harm?
# does not check '/'... it's too abnormal a case
dirs = File.expand_path(dirname).split(%r<(?=/)>)
if /\A[a-z]:\z/i =~ dirs[0]
disk = dirs.shift
dirs[0] = disk + dirs[0]
end
dirs.each_index do |idx|
path = dirs[0..idx].join('')
Dir.mkdir path unless File.dir?(path)
end
end
def rm_f(fname)
$stderr.puts "rm -f #{fname}" if verbose?
return if no_harm?
if File.exist?(fname) or File.symlink?(fname)
File.chmod 0777, fname
File.unlink fname
end
end
def rm_rf(dn)
$stderr.puts "rm -rf #{dn}" if verbose?
return if no_harm?
Dir.chdir dn
Dir.foreach('.') do |fn|
next if fn == '.'
next if fn == '..'
if File.dir?(fn)
verbose_off {
rm_rf fn
}
else
verbose_off {
rm_f fn
}
end
end
Dir.chdir '..'
Dir.rmdir dn
end
def move_file(src, dest)
File.unlink dest if File.exist?(dest)
begin
File.rename src, dest
rescue
File.open(dest, 'wb') {|f| f.write File.binread(src) }
File.chmod File.stat(src).mode, dest
File.unlink src
end
end
def install(from, dest, mode, prefix = nil)
$stderr.puts "install #{from} #{dest}" if verbose?
return if no_harm?
realdest = prefix ? prefix + File.expand_path(dest) : dest
realdest = File.join(realdest, File.basename(from)) if File.dir?(realdest)
str = File.binread(from)
if diff?(str, realdest)
verbose_off {
rm_f realdest if File.exist?(realdest)
}
File.open(realdest, 'wb') {|f|
f.write str
}
File.chmod mode, realdest
File.open("#{objdir_root()}/InstalledFiles", 'a') {|f|
if prefix
f.puts realdest.sub(prefix, '')
else
f.puts realdest
end
}
end
end
def diff?(new_content, path)
return true unless File.exist?(path)
new_content != File.binread(path)
end
def command(str)
$stderr.puts str if verbose?
system str or raise RuntimeError, "'system #{str}' failed"
end
def ruby(str)
command config('rubyprog') + ' ' + str
end
def make(task = '')
command config('makeprog') + ' ' + task
end
def extdir?(dir)
File.exist?(dir + '/MANIFEST')
end
def all_files_in(dirname)
Dir.open(dirname) {|d|
return d.select {|ent| File.file?("#{dirname}/#{ent}") }
}
end
REJECT_DIRS = %w(
CVS SCCS RCS CVS.adm .svn
)
def all_dirs_in(dirname)
Dir.open(dirname) {|d|
return d.select {|n| File.dir?("#{dirname}/#{n}") } - %w(. ..) - REJECT_DIRS
}
end
end
#
# Main Installer
#
module HookUtils
def run_hook(name)
try_run_hook "#{curr_srcdir()}/#{name}" or
try_run_hook "#{curr_srcdir()}/#{name}.rb"
end
def try_run_hook(fname)
return false unless File.file?(fname)
begin
instance_eval File.read(fname), fname, 1
rescue
setup_rb_error "hook #{fname} failed:\n" + $!.message
end
true
end
end
module HookScriptAPI
def get_config(key)
@config[key]
end
alias config get_config
def set_config(key, val)
@config[key] = val
end
#
# srcdir/objdir (works only in the package directory)
#
#abstract srcdir_root
#abstract objdir_root
#abstract relpath
def curr_srcdir
"#{srcdir_root()}/#{relpath()}"
end
def curr_objdir
"#{objdir_root()}/#{relpath()}"
end
def srcfile(path)
"#{curr_srcdir()}/#{path}"
end
def srcexist?(path)
File.exist?(srcfile(path))
end
def srcdirectory?(path)
File.dir?(srcfile(path))
end
def srcfile?(path)
File.file? srcfile(path)
end
def srcentries(path = '.')
Dir.open("#{curr_srcdir()}/#{path}") {|d|
return d.to_a - %w(. ..)
}
end
def srcfiles(path = '.')
srcentries(path).select {|fname|
File.file?(File.join(curr_srcdir(), path, fname))
}
end
def srcdirectories(path = '.')
srcentries(path).select {|fname|
File.dir?(File.join(curr_srcdir(), path, fname))
}
end
end
class ToplevelInstaller
Version = '3.3.1'
Copyright = 'Copyright (c) 2000-2004 Minero Aoki'
TASKS = [
[ 'all', 'do config, setup, then install' ],
[ 'config', 'saves your configurations' ],
[ 'show', 'shows current configuration' ],
[ 'setup', 'compiles ruby extensions and others' ],
[ 'install', 'installs files' ],
[ 'clean', "does `make clean' for each extension" ],
[ 'distclean', "does `make distclean' for each extension" ]
]
def ToplevelInstaller.invoke
instance().invoke
end
@singleton = nil
def ToplevelInstaller.instance
@singleton ||= new(File.dirname($0))
@singleton
end
include MetaConfigAPI
def initialize(ardir_root)
@config = nil
@options = { 'verbose' => true }
@ardir = File.expand_path(ardir_root)
end
def inspect
"#<#{self.class} #{__id__()}>"
end
def invoke
run_metaconfigs
case task = parsearg_global()
when nil, 'all'
@config = load_config('config')
parsearg_config
init_installers
exec_config
exec_setup
exec_install
else
@config = load_config(task)
__send__ "parsearg_#{task}"
init_installers
__send__ "exec_#{task}"
end
end
def run_metaconfigs
eval_file_ifexist "#{@ardir}/metaconfig"
end
def load_config(task)
case task
when 'config'
ConfigTable.new
when 'clean', 'distclean'
if File.exist?(ConfigTable.savefile)
then ConfigTable.load
else ConfigTable.new
end
else
ConfigTable.load
end
end
def init_installers
@installer = Installer.new(@config, @options, @ardir, File.expand_path('.'))
end
#
# Hook Script API bases
#
def srcdir_root
@ardir
end
def objdir_root
'.'
end
def relpath
'.'
end
#
# Option Parsing
#
def parsearg_global
valid_task = /\A(?:#{TASKS.map {|task,desc| task }.join '|'})\z/
while arg = ARGV.shift
case arg
when /\A\w+\z/
setup_rb_error "invalid task: #{arg}" unless valid_task =~ arg
return arg
when '-q', '--quiet'
@options['verbose'] = false
when '--verbose'
@options['verbose'] = true
when '-h', '--help'
print_usage $stdout
exit 0
when '-v', '--version'
puts "#{File.basename($0)} version #{Version}"
exit 0
when '--copyright'
puts Copyright
exit 0
else
setup_rb_error "unknown global option '#{arg}'"
end
end
nil
end
def parsearg_no_options
unless ARGV.empty?
setup_rb_error "#{task}: unknown options: #{ARGV.join ' '}"
end
end
alias parsearg_show parsearg_no_options
alias parsearg_setup parsearg_no_options
alias parsearg_clean parsearg_no_options
alias parsearg_distclean parsearg_no_options
def parsearg_config
re = /\A--(#{ConfigTable.map {|i| i.name }.join('|')})(?:=(.*))?\z/
@options['config-opt'] = []
while i = ARGV.shift
if /\A--?\z/ =~ i
@options['config-opt'] = ARGV.dup
break
end
m = re.match(i) or setup_rb_error "config: unknown option #{i}"
name, value = *m.to_a[1,2]
@config[name] = value
end
end
def parsearg_install
@options['no-harm'] = false
@options['install-prefix'] = ''
while a = ARGV.shift
case a
when /\A--no-harm\z/
@options['no-harm'] = true
when /\A--prefix=(.*)\z/
path = $1
path = File.expand_path(path) unless path[0,1] == '/'
@options['install-prefix'] = path
else
setup_rb_error "install: unknown option #{a}"
end
end
end
def print_usage(out)
out.puts 'Typical Installation Procedure:'
out.puts " $ ruby #{File.basename $0} config"
out.puts " $ ruby #{File.basename $0} setup"
out.puts " # ruby #{File.basename $0} install (may require root privilege)"
out.puts
out.puts 'Detailed Usage:'
out.puts " ruby #{File.basename $0} "
out.puts " ruby #{File.basename $0} [] []"
fmt = " %-24s %s\n"
out.puts
out.puts 'Global options:'
out.printf fmt, '-q,--quiet', 'suppress message outputs'
out.printf fmt, ' --verbose', 'output messages verbosely'
out.printf fmt, '-h,--help', 'print this message'
out.printf fmt, '-v,--version', 'print version and quit'
out.printf fmt, ' --copyright', 'print copyright and quit'
out.puts
out.puts 'Tasks:'
TASKS.each do |name, desc|
out.printf fmt, name, desc
end
fmt = " %-24s %s [%s]\n"
out.puts
out.puts 'Options for CONFIG or ALL:'
ConfigTable.each do |item|
out.printf fmt, item.help_opt, item.description, item.help_default
end
out.printf fmt, '--rbconfig=path', 'rbconfig.rb to load',"running ruby's"
out.puts
out.puts 'Options for INSTALL:'
out.printf fmt, '--no-harm', 'only display what to do if given', 'off'
out.printf fmt, '--prefix=path', 'install path prefix', '$prefix'
out.puts
end
#
# Task Handlers
#
def exec_config
@installer.exec_config
@config.save # must be final
end
def exec_setup
@installer.exec_setup
end
def exec_install
@installer.exec_install
end
def exec_show
ConfigTable.each do |i|
printf "%-20s %s\n", i.name, i.value
end
end
def exec_clean
@installer.exec_clean
end
def exec_distclean
@installer.exec_distclean
end
end
class ToplevelInstallerMulti < ToplevelInstaller
include HookUtils
include HookScriptAPI
include FileOperations
def initialize(ardir)
super
@packages = all_dirs_in("#{@ardir}/packages")
raise 'no package exists' if @packages.empty?
end
def run_metaconfigs
eval_file_ifexist "#{@ardir}/metaconfig"
@packages.each do |name|
eval_file_ifexist "#{@ardir}/packages/#{name}/metaconfig"
end
end
def init_installers
@installers = {}
@packages.each do |pack|
@installers[pack] = Installer.new(@config, @options,
"#{@ardir}/packages/#{pack}",
"packages/#{pack}")
end
with = extract_selection(config('with'))
without = extract_selection(config('without'))
@selected = @installers.keys.select {|name|
(with.empty? or with.include?(name)) \
and not without.include?(name)
}
end
def extract_selection(list)
a = list.split(/,/)
a.each do |name|
setup_rb_error "no such package: #{name}" unless @installers.key?(name)
end
a
end
def print_usage(f)
super
f.puts 'Included packages:'
f.puts ' ' + @packages.sort.join(' ')
f.puts
end
#
# multi-package metaconfig API
#
attr_reader :packages
def declare_packages(list)
raise 'package list is empty' if list.empty?
list.each do |name|
raise "directory packages/#{name} does not exist"\
unless File.dir?("#{@ardir}/packages/#{name}")
end
@packages = list
end
#
# Task Handlers
#
def exec_config
run_hook 'pre-config'
each_selected_installers {|inst| inst.exec_config }
run_hook 'post-config'
@config.save # must be final
end
def exec_setup
run_hook 'pre-setup'
each_selected_installers {|inst| inst.exec_setup }
run_hook 'post-setup'
end
def exec_install
run_hook 'pre-install'
each_selected_installers {|inst| inst.exec_install }
run_hook 'post-install'
end
def exec_clean
rm_f ConfigTable.savefile
run_hook 'pre-clean'
each_selected_installers {|inst| inst.exec_clean }
run_hook 'post-clean'
end
def exec_distclean
rm_f ConfigTable.savefile
run_hook 'pre-distclean'
each_selected_installers {|inst| inst.exec_distclean }
run_hook 'post-distclean'
end
#
# lib
#
def each_selected_installers
Dir.mkdir 'packages' unless File.dir?('packages')
@selected.each do |pack|
$stderr.puts "Processing the package `#{pack}' ..." if @options['verbose']
Dir.mkdir "packages/#{pack}" unless File.dir?("packages/#{pack}")
Dir.chdir "packages/#{pack}"
yield @installers[pack]
Dir.chdir '../..'
end
end
def verbose?
@options['verbose']
end
def no_harm?
@options['no-harm']
end
end
class Installer
FILETYPES = %w( bin lib ext data )
include HookScriptAPI
include HookUtils
include FileOperations
def initialize(config, opt, srcroot, objroot)
@config = config
@options = opt
@srcdir = File.expand_path(srcroot)
@objdir = File.expand_path(objroot)
@currdir = '.'
end
def inspect
"#<#{self.class} #{File.basename(@srcdir)}>"
end
#
# Hook Script API base methods
#
def srcdir_root
@srcdir
end
def objdir_root
@objdir
end
def relpath
@currdir
end
#
# configs/options
#
def no_harm?
@options['no-harm']
end
def verbose?
@options['verbose']
end
def verbose_off
begin
save, @options['verbose'] = @options['verbose'], false
yield
ensure
@options['verbose'] = save
end
end
#
# TASK config
#
def exec_config
exec_task_traverse 'config'
end
def config_dir_bin(rel)
end
def config_dir_lib(rel)
end
def config_dir_ext(rel)
extconf if extdir?(curr_srcdir())
end
def extconf
opt = @options['config-opt'].join(' ')
command "#{config('rubyprog')} #{curr_srcdir()}/extconf.rb #{opt}"
end
def config_dir_data(rel)
end
#
# TASK setup
#
def exec_setup
exec_task_traverse 'setup'
end
def setup_dir_bin(rel)
all_files_in(curr_srcdir()).each do |fname|
adjust_shebang "#{curr_srcdir()}/#{fname}"
end
end
def adjust_shebang(path)
return if no_harm?
tmpfile = File.basename(path) + '.tmp'
begin
File.open(path, 'rb') {|r|
first = r.gets
return unless File.basename(config('rubypath')) == 'ruby'
return unless File.basename(first.sub(/\A\#!/, '').split[0]) == 'ruby'
$stderr.puts "adjusting shebang: #{File.basename(path)}" if verbose?
File.open(tmpfile, 'wb') {|w|
w.print first.sub(/\A\#!\s*\S+/, '#! ' + config('rubypath'))
w.write r.read
}
move_file tmpfile, File.basename(path)
}
ensure
File.unlink tmpfile if File.exist?(tmpfile)
end
end
def setup_dir_lib(rel)
end
def setup_dir_ext(rel)
make if extdir?(curr_srcdir())
end
def setup_dir_data(rel)
end
#
# TASK install
#
def exec_install
rm_f 'InstalledFiles'
exec_task_traverse 'install'
end
def install_dir_bin(rel)
install_files collect_filenames_auto(), "#{config('bindir')}/#{rel}", 0755
end
def install_dir_lib(rel)
install_files ruby_scripts(), "#{config('rbdir')}/#{rel}", 0644
end
def install_dir_ext(rel)
return unless extdir?(curr_srcdir())
install_files ruby_extentions('.'),
"#{config('sodir')}/#{File.dirname(rel)}",
0555
end
def install_dir_data(rel)
install_files collect_filenames_auto(), "#{config('datadir')}/#{rel}", 0644
end
def install_files(list, dest, mode)
mkdir_p dest, @options['install-prefix']
list.each do |fname|
install fname, dest, mode, @options['install-prefix']
end
end
def ruby_scripts
collect_filenames_auto().select {|n| /\.rb\z/ =~ n }
end
# picked up many entries from cvs-1.11.1/src/ignore.c
reject_patterns = %w(
core RCSLOG tags TAGS .make.state
.nse_depinfo #* .#* cvslog.* ,* .del-* *.olb
*~ *.old *.bak *.BAK *.orig *.rej _$* *$
*.org *.in .*
)
mapping = {
'.' => '\.',
'$' => '\$',
'#' => '\#',
'*' => '.*'
}
REJECT_PATTERNS = Regexp.new('\A(?:' +
reject_patterns.map {|pat|
pat.gsub(/[\.\$\#\*]/) {|ch| mapping[ch] }
}.join('|') +
')\z')
def collect_filenames_auto
mapdir((existfiles() - hookfiles()).reject {|fname|
REJECT_PATTERNS =~ fname
})
end
def existfiles
all_files_in(curr_srcdir()) | all_files_in('.')
end
def hookfiles
%w( pre-%s post-%s pre-%s.rb post-%s.rb ).map {|fmt|
%w( config setup install clean ).map {|t| sprintf(fmt, t) }
}.flatten
end
def mapdir(filelist)
filelist.map {|fname|
if File.exist?(fname) # objdir
fname
else # srcdir
File.join(curr_srcdir(), fname)
end
}
end
def ruby_extentions(dir)
Dir.open(dir) {|d|
ents = d.select {|fname| /\.#{::Config::CONFIG['DLEXT']}\z/ =~ fname }
if ents.empty?
setup_rb_error "no ruby extention exists: 'ruby #{$0} setup' first"
end
return ents
}
end
#
# TASK clean
#
def exec_clean
exec_task_traverse 'clean'
rm_f ConfigTable.savefile
rm_f 'InstalledFiles'
end
def clean_dir_bin(rel)
end
def clean_dir_lib(rel)
end
def clean_dir_ext(rel)
return unless extdir?(curr_srcdir())
make 'clean' if File.file?('Makefile')
end
def clean_dir_data(rel)
end
#
# TASK distclean
#
def exec_distclean
exec_task_traverse 'distclean'
rm_f ConfigTable.savefile
rm_f 'InstalledFiles'
end
def distclean_dir_bin(rel)
end
def distclean_dir_lib(rel)
end
def distclean_dir_ext(rel)
return unless extdir?(curr_srcdir())
make 'distclean' if File.file?('Makefile')
end
#
# lib
#
def exec_task_traverse(task)
run_hook "pre-#{task}"
FILETYPES.each do |type|
if config('without-ext') == 'yes' and type == 'ext'
$stderr.puts 'skipping ext/* by user option' if verbose?
next
end
traverse task, type, "#{task}_dir_#{type}"
end
run_hook "post-#{task}"
end
def traverse(task, rel, mid)
dive_into(rel) {
run_hook "pre-#{task}"
__send__ mid, rel.sub(%r[\A.*?(?:/|\z)], '')
all_dirs_in(curr_srcdir()).each do |d|
traverse task, "#{rel}/#{d}", mid
end
run_hook "post-#{task}"
}
end
def dive_into(rel)
return unless File.dir?("#{@srcdir}/#{rel}")
dir = File.basename(rel)
Dir.mkdir dir unless File.dir?(dir)
prevdir = Dir.pwd
Dir.chdir dir
$stderr.puts '---> ' + rel if verbose?
@currdir = rel
yield
Dir.chdir prevdir
$stderr.puts '<--- ' + rel if verbose?
@currdir = File.dirname(rel)
end
end
if $0 == __FILE__
begin
if multipackage_install?
ToplevelInstallerMulti.invoke
else
ToplevelInstaller.invoke
end
rescue SetupError
raise if $DEBUG
$stderr.puts $!.message
$stderr.puts "Try 'ruby #{$0} --help' for detailed usage."
exit 1
end
end
ruby-graffiti-2.2/test/ 0000775 0000000 0000000 00000000000 11764675307 0015126 5 ustar 00root root 0000000 0000000 ruby-graffiti-2.2/test/ts_graffiti.rb 0000664 0000000 0000000 00000033070 11764675307 0017757 0 ustar 00root root 0000000 0000000 #!/usr/bin/env ruby
#
# Graffiti RDF Store tests
#
# Copyright (c) 2002-2009 Dmitry Borodaenko
#
# This program is free software.
# You can distribute/modify this program under the terms of
# the GNU General Public License version 3 or later.
#
# vim: et sw=2 sts=2 ts=8 tw=0
require 'test/unit'
require 'yaml'
require 'sequel'
require 'graffiti'
include Graffiti
class TC_Storage < Test::Unit::TestCase
def setup
config = File.open(
File.join(
File.dirname(File.dirname(__FILE__)),
'doc', 'examples', 'samizdat-rdf-config.yaml'
)
) {|f| YAML.load(f.read) }
@db = create_mock_db
@store = Store.new(@db, config)
@ns = @store.config.ns
end
def test_query_select
squish = %{
SELECT ?msg, ?title, ?name, ?date, ?rating
WHERE (dc::title ?msg ?title)
(dc::creator ?msg ?creator)
(s::fullName ?creator ?name)
(dc::date ?msg ?date)
(rdf::subject ?stmt ?msg)
(rdf::predicate ?stmt dc::relation)
(rdf::object ?stmt s::Quality)
(s::rating ?stmt ?rating)
LITERAL ?rating >= -1
ORDER BY ?rating DESC
USING PRESET NS}
sql = "SELECT DISTINCT b.id AS msg, b.title AS title, a.full_name AS name, c.published_date AS date, d.rating AS rating
FROM member AS a
INNER JOIN message AS b ON (b.creator = a.id)
INNER JOIN resource AS c ON (b.id = c.id)
INNER JOIN statement AS d ON (b.id = d.subject)
INNER JOIN resource AS e ON (d.predicate = e.id) AND (e.uriref = 't' AND e.label = 'http://purl.org/dc/elements/1.1/relation')
INNER JOIN resource AS f ON (d.object = f.id) AND (f.uriref = 't' AND f.label = 'http://www.nongnu.org/samizdat/rdf/schema#Quality')
WHERE (c.published_date IS NOT NULL)
AND (a.full_name IS NOT NULL)
AND (d.id IS NOT NULL)
AND (b.title IS NOT NULL)
AND (d.rating >= -1)
ORDER BY d.rating DESC"
test_squish_select(squish, sql) do |query|
assert_equal %w[?msg ?title ?name ?date ?rating], query.nodes
assert query.pattern.include?(["#{@ns['dc']}title", "?msg", "?title", nil, false])
assert_equal '?rating >= -1', query.literal
assert_equal '?rating', query.order
assert_equal 'DESC', query.order_dir
assert_equal @ns['s'], query.ns['s']
end
assert_equal [], @store.select_all(squish)
end
def test_query_assert
# initialize
query_text = %{
INSERT ?msg
UPDATE ?title = 'Test Message', ?content = 'Some ''text''.'
WHERE (dc::creator ?msg 1)
(dc::title ?msg ?title)
(s::content ?msg ?content)
USING dc FOR #{@ns['dc']}
s FOR #{@ns['s']}}
begin
query = SquishAssert.new(@store.config, query_text)
rescue
assert false, "SquishAssert initialization raised #{$!.class}: #{$!}"
end
# query parser
assert_equal ['?msg'], query.insert
assert_equal({'?title' => "'0'", '?content' => "'1'"}, query.update)
assert query.pattern.include?(["#{@ns['dc']}title", "?msg", "?title", nil, false])
assert_equal @ns['s'], query.ns['s']
assert_equal "'Test Message'", query.substitute_literals("'0'")
assert_equal "'Some ''text''.'", query.substitute_literals("'1'")
# mock db
ids = @store.assert(query_text)
assert_equal [1], ids
assert_equal 'Test Message', @db[:Message][:id => 1][:title]
id2 = @store.assert(query_text)
query_text = %{
UPDATE ?rating = :rating
WHERE (rdf::subject ?stmt :related)
(rdf::predicate ?stmt dc::relation)
(rdf::object ?stmt 1)
(s::voteProposition ?vote ?stmt)
(s::voteMember ?vote :member)
(s::voteRating ?vote ?rating)}
params = {:rating => -1, :related => 2, :member => 3}
ids = @store.assert(query_text, params)
assert_equal [], ids
assert vote = @db[:vote].order(:id).last
assert_equal -1, vote[:rating].to_i
params[:rating] = -2
@store.assert(query_text, params)
assert vote2 = @db[:vote].order(:id).last
assert_equal -2, vote2[:rating].to_i
assert_equal vote[:id], vote2[:id]
end
def test_query_assert_expression
query_text = %{
UPDATE ?rating = 2 * :rating
WHERE (rdf::subject ?stmt :related)
(rdf::predicate ?stmt dc::relation)
(rdf::object ?stmt 1)
(s::voteProposition ?vote ?stmt)
(s::voteMember ?vote :member)
(s::voteRating ?vote ?rating)}
params = {:rating => -1, :related => 2, :member => 3}
@store.assert(query_text, params)
assert vote = @db[:vote].order(:id).last
assert_equal -2, vote[:rating].to_i
end
private :test_query_assert_expression
def test_dangling_blank_node
squish = %{
SELECT ?msg
WHERE (s::inReplyTo ?msg ?parent)
USING s FOR #{@ns['s']}}
sql = "SELECT DISTINCT a.id AS msg
FROM resource AS a
INNER JOIN resource AS b ON (a.part_of_subproperty = b.id) AND (b.uriref = 't' AND b.label = 'http://www.nongnu.org/samizdat/rdf/schema#inReplyTo')
WHERE (a.id IS NOT NULL)"
test_squish_select(squish, sql) do |query|
assert_equal %w[?msg], query.nodes
assert query.pattern.include?(["#{@ns['s']}inReplyTo", "?msg", "?parent", nil, false])
assert_equal @ns['s'], query.ns['s']
end
end
def test_external_resource_no_self_join
squish = %{SELECT ?id WHERE (s::id tag::Translation ?id)}
sql = "SELECT DISTINCT a.id AS id
FROM resource AS a
WHERE (a.id IS NOT NULL)
AND ((a.uriref = 't' AND a.label = 'http://www.nongnu.org/samizdat/rdf/tag#Translation'))"
test_squish_select(squish, sql) do |query|
assert_equal %w[?id], query.nodes
assert query.pattern.include?(["#{@ns['s']}id", "#{@ns['tag']}Translation", "?id", nil, false])
assert_equal @ns['s'], query.ns['s']
end
end
#def test_internal_resource
#end
#def test_external_subject_internal_property
#end
def test_except
squish = %{
SELECT ?msg
WHERE (dc::date ?msg ?date)
EXCEPT (s::inReplyTo ?msg ?parent)
(dct::isVersionOf ?msg ?version_of)
(dc::creator ?version_of 1)
ORDER BY ?date DESC}
sql = "SELECT DISTINCT a.id AS msg, a.published_date AS date
FROM resource AS a
LEFT JOIN (
SELECT a.id AS _field_c
FROM message AS b
INNER JOIN resource AS a ON (a.part_of = b.id)
INNER JOIN resource AS c ON (a.part_of_subproperty = c.id) AND (c.uriref = 't' AND c.label = 'http://purl.org/dc/terms/isVersionOf')
WHERE (b.creator = 1)
) AS _subquery_a ON (a.id = _subquery_a._field_c)
LEFT JOIN resource AS d ON (a.part_of_subproperty = d.id) AND (d.uriref = 't' AND d.label = 'http://www.nongnu.org/samizdat/rdf/schema#inReplyTo')
WHERE (a.published_date IS NOT NULL)
AND (a.id IS NOT NULL)
AND (_subquery_a._field_c IS NULL)
AND (d.id IS NULL)
ORDER BY a.published_date DESC"
test_squish_select(squish, sql)
end
def test_except_group_by
squish = %{
SELECT ?msg
WHERE (rdf::predicate ?stmt dc::relation)
(rdf::subject ?stmt ?msg)
(rdf::object ?stmt ?tag)
(dc::date ?stmt ?date)
(s::rating ?stmt ?rating FILTER ?rating >= 1.5)
(s::hidden ?msg ?hidden FILTER ?hidden = 'f')
EXCEPT (dct::isPartOf ?msg ?parent)
GROUP BY ?msg
ORDER BY max(?date) DESC}
sql = "SELECT DISTINCT c.subject AS msg, max(d.published_date)
FROM message AS a
INNER JOIN statement AS c ON (c.subject = a.id) AND (c.rating >= 1.5)
INNER JOIN resource AS b ON (c.subject = b.id)
INNER JOIN resource AS d ON (c.id = d.id)
INNER JOIN resource AS e ON (c.predicate = e.id) AND (e.uriref = 't' AND e.label = 'http://purl.org/dc/elements/1.1/relation')
WHERE (d.published_date IS NOT NULL)
AND (a.hidden IS NOT NULL)
AND (b.part_of IS NULL)
AND (c.rating IS NOT NULL)
AND (c.object IS NOT NULL)
AND ((a.hidden = 'f'))
GROUP BY c.subject
ORDER BY max(d.published_date) DESC"
test_squish_select(squish, sql)
end
def test_optional
squish = %{
SELECT ?date, ?creator, ?lang, ?parent, ?version_of, ?hidden, ?open
WHERE (dc::date 1 ?date)
OPTIONAL (dc::creator 1 ?creator)
(dc::language 1 ?lang)
(s::inReplyTo 1 ?parent)
(dct::isVersionOf 1 ?version_of)
(s::hidden 1 ?hidden)
(s::openForAll 1 ?open)}
sql = "SELECT DISTINCT a.published_date AS date, b.creator AS creator, b.language AS lang, select_subproperty(a.part_of, d.id) AS parent, select_subproperty(a.part_of, c.id) AS version_of, b.hidden AS hidden, b.open AS open
FROM resource AS a
INNER JOIN message AS b ON (a.id = b.id)
LEFT JOIN resource AS c ON (a.part_of_subproperty = c.id) AND (c.uriref = 't' AND c.label = 'http://purl.org/dc/terms/isVersionOf')
LEFT JOIN resource AS d ON (a.part_of_subproperty = d.id) AND (d.uriref = 't' AND d.label = 'http://www.nongnu.org/samizdat/rdf/schema#inReplyTo')
WHERE (a.published_date IS NOT NULL)
AND ((a.id = 1))"
test_squish_select(squish, sql)
end
def test_except_optional_transitive
squish = %{
SELECT ?msg
WHERE (rdf::subject ?stmt ?msg)
(rdf::predicate ?stmt dc::relation)
(rdf::object ?stmt ?tag)
(s::rating ?stmt ?rating FILTER ?rating > 0)
(dc::date ?msg ?date)
EXCEPT (dct::isPartOf ?msg ?parent)
OPTIONAL (dct::isPartOf ?tag ?supertag TRANSITIVE)
LITERAL ?tag = 1 OR ?supertag = 1
ORDER BY ?date DESC}
sql = "SELECT DISTINCT b.subject AS msg, a.published_date AS date
FROM resource AS a
INNER JOIN statement AS b ON (b.subject = a.id) AND (b.rating > 0)
INNER JOIN resource AS d ON (b.predicate = d.id) AND (d.uriref = 't' AND d.label = 'http://purl.org/dc/elements/1.1/relation')
LEFT JOIN part AS c ON (b.object = c.id)
WHERE (a.published_date IS NOT NULL)
AND (a.part_of IS NULL)
AND (b.rating IS NOT NULL)
AND (b.id IS NOT NULL)
AND (b.object = 1 OR c.part_of = 1)
ORDER BY a.published_date DESC"
test_squish_select(squish, sql)
end
def test_optional_connect_by_object
squish = %{
SELECT ?event
WHERE (ical::dtstart ?event ?dtstart FILTER ?dtstart >= 'now')
(ical::dtend ?event ?dtend)
OPTIONAL (s::rruleEvent ?rrule ?event)
(ical::until ?rrule ?until FILTER ?until IS NULL OR ?until > 'now')
LITERAL ?dtend > 'now' OR ?rrule IS NOT NULL
ORDER BY ?event DESC}
sql = "SELECT DISTINCT b.id AS event
FROM event AS b
LEFT JOIN recurrence AS a ON (b.id = a.event) AND (a.until IS NULL OR a.until > 'now')
WHERE (b.dtstart IS NOT NULL)
AND ((b.dtstart >= 'now'))
AND (b.dtend > 'now' OR a.id IS NOT NULL)
ORDER BY b.id DESC"
test_squish_select(squish, sql)
end
private :test_optional_connect_by_object
def test_many_to_many
# pretend that Vote is a many-to-many relation table
squish = %{
SELECT ?p, ?date
WHERE (s::voteRating ?p ?vote1 FILTER ?vote1 > 0)
(s::voteRating ?p ?vote2 FILTER ?vote2 < 0)
(dc::date ?p ?date)
ORDER BY ?date DESC}
sql = "SELECT DISTINCT a.id AS p, c.published_date AS date
FROM vote AS a
INNER JOIN vote AS b ON (a.id = b.id) AND (b.rating < 0)
INNER JOIN resource AS c ON (a.id = c.id)
WHERE (c.published_date IS NOT NULL)
AND (a.rating IS NOT NULL)
AND (b.rating IS NOT NULL)
AND ((a.rating > 0))
ORDER BY c.published_date DESC"
test_squish_select(squish, sql)
end
def test_update_null_and_subproperty
query_text =
%{INSERT ?msg
UPDATE ?parent = :parent
WHERE (dct::isPartOf ?msg ?parent)}
@store.assert(query_text, :id => 1, :parent => 3)
assert_equal 3, @db[:resource].filter(:id => 1).get(:part_of)
# check that subproperty is set
query_text =
%{UPDATE ?parent = :parent
WHERE (s::subTagOf :id ?parent)}
@store.assert(query_text, :id => 1, :parent => 3)
assert_equal 3, @db[:resource].filter(:id => 1).get(:part_of)
assert_equal 2, @db[:resource].filter(:id => 1).get(:part_of_subproperty)
# check that NULL is handled correctly and that subproperty is unset
query_text =
%{UPDATE ?parent = NULL
WHERE (dct::isPartOf :id ?parent)}
@store.assert(query_text, :id => 1)
assert_equal nil, @db[:resource].filter(:id => 1).get(:part_of)
assert_equal nil, @db[:resource].filter(:id => 1).get(:part_of_subproperty)
end
private
def test_squish_select(squish, sql)
begin
query = SquishSelect.new(@store.config, squish)
rescue
assert false, "SquishSelect initialization raised #{$!.class}: #{$!}"
end
yield query if block_given?
# query result
begin
sql1 = @store.select(query)
rescue
assert false, "select with pre-parsed query raised #{$!.class}: #{$!}"
end
begin
sql2 = @store.select(squish)
rescue
assert false, "select with query text raised #{$!.class}: #{$!}"
end
assert sql1 == sql2
# transform result
assert_equal normalize(sql), normalize(sql1),
"Query doesn't match. Expected:\n#{sql}\nReceived:\n#{sql1}"
end
def normalize(sql)
sql
end
def create_mock_db
db = Sequel.sqlite(:quote_identifiers => false)
db.create_table(:resource) do
primary_key :id
Time :published_date
Integer :part_of
Integer :part_of_subproperty
Integer :part_sequence_number
TrueClass :literal
TrueClass :uriref
String :label
end
db.create_table(:statement) do
primary_key :id
Integer :subject
Integer :predicate
Integer :object
BigDecimal :rating, :size => [4, 2]
end
db.create_table(:member) do
primary_key :id
String :login
String :full_name
String :email
end
db.create_table(:message) do
primary_key :id
String :title
Integer :creator
String :format
String :language
TrueClass :open
TrueClass :hidden
TrueClass :locked
String :content
String :html_full
String :html_short
end
db.create_table(:vote) do
primary_key :id
Integer :proposition
Integer :member
BigDecimal :rating, :size => 2
end
db
end
def create_mock_member(db)
db[:member].insert(
:login => 'test',
:full_name => 'test',
:email => 'test@localhost'
)
end
end