3.9.7 (2010-09-28)
==================

Bugs Fixed
----------

- Changes in the way that garbage collection treats dictionaries in Python
  2.7 broke the object/connection cache implementation.
  (https://bugs.launchpad.net/zodb/+bug/641481)

  Python 2.7 wasn't officially supported, but we were releasing binaries
  for it, so ...

- Log rotation/reopening via a SIGUSR2 signal wasn't implemented.
  (https://bugs.launchpad.net/zodb/+bug/143600)

- When using multi-databases, cache-management operations on a connection
  (cacheMinimize and cacheGC) weren't applied to subconnections.

3.9.6 (2010-09-21)
==================

Bugs Fixed
----------

- Updating blobs in savepoints could cause spurious "invalidations out of
  order" errors.  https://bugs.launchpad.net/zodb/+bug/509801
  (Thanks to Christian Zagrodnick for chasing this down.)

- If a ZEO client process was restarted while invalidating a ZEO cache
  entry, the cache could be left in a state in which data marked current
  should have been invalidated, leading to persistent conflict errors.

- Corrupted or invalid cache files prevented ZEO clients from starting.
  Now, bad cache files are moved aside.

- Invalidations of object records in ZEO caches, where the invalidation
  transaction ids matched the cached transaction ids, should have been
  ignored.

- Shutting down a process while committing a transaction or processing
  invalidations from the server could cause ZEO persistent client caches
  to have invalid data.  This, in turn, caused stale data to remain in the
  cache until it was updated.

- Conflict errors didn't invalidate ZEO cache entries.

- When objects were added in savepoints and either the savepoint was
  rolled back (https://bugs.launchpad.net/zodb/+bug/143560) or the
  transaction was aborted
  (https://mail.zope.org/pipermail/zodb-dev/2010-June/013488.html), the
  objects' _p_oid and _p_jar variables weren't cleared, leading to
  surprising errors.

- Objects added in transactions that were later aborted could have
  _p_changed still set (https://bugs.launchpad.net/zodb/+bug/615758).

- ZEO extension methods failed when a client reconnected to a storage.
  (https://bugs.launchpad.net/zodb/+bug/143344)

- On Mac OS X, clients that connected and disconnected quickly could cause
  a ZEO server to stop accepting connections, due to a failure to catch
  errors in the initial part of the connection process.  The failure to
  properly handle exceptions while accepting connections is potentially
  problematic on other platforms.  Fixes:
  https://bugs.launchpad.net/zodb/+bug/135108

- Passing keys or values outside the range of 32-bit ints on 64-bit
  platforms led to undetected overflow errors.  Now these cases cause
  TypeErrors to be raised.  https://bugs.launchpad.net/zodb/+bug/143237

- BTree sets and tree sets didn't correctly check values passed to update
  or to constructors, causing Python to exit under certain circumstances.

- The verbose mode of fstest was broken.
  (https://bugs.launchpad.net/zodb/+bug/475996)

3.9.5 (2010-04-23)
==================

Bugs Fixed
----------

- Fixed a bug in cPickleCache's byte-size estimation logic.
  (https://bugs.launchpad.net/zodb/+bug/533015)

- Fixed a serious bug that caused cache failures when run with Python
  optimization turned on.  https://bugs.launchpad.net/zodb/+bug/544305

- Fixed a bug that caused savepoint rollback to not properly set object
  state when objects implemented _p_invalidate methods that reloaded their
  state (unghostifiable objects).
  https://bugs.launchpad.net/zodb/+bug/428039

- Cross-database weakrefs weren't handled correctly.
  https://bugs.launchpad.net/zodb/+bug/435547

- The mkzeoinst script was fixed to tell people to install and use the
  mkzeoinstance script. :)

3.9.4 (2009-12-14)
==================

Bugs Fixed
----------

- A ZEO threading bug could cause transactions to read inconsistent data.
  (This sometimes caused an AssertionError in
  Connection._setstate_noncurrent.)

- DemoStorage.loadBefore sometimes returned invalid data which would
  trigger AssertionErrors in ZODB.Connection.

- History support was broken when using storages that work with both
  ZODB 3.8 and 3.9.

- zope.testing was an unnecessary non-testing dependency.

- Internal ZEO errors were logged at the INFO level, rather than at the
  error level.

- The FileStorage backup and restore script, repozo, gave a deprecation
  warning under Python 2.6.

- C header files weren't installed correctly.

- The undo implementation was incorrect in ways that could cause subtle
  misbehaviors.

3.9.3 (2009-10-23)
==================

Bugs Fixed
----------

- Two BTree bugs, introduced by a bug fix in 3.9.0c2, sometimes caused
  deletion of keys to be improperly handled, resulting in data being
  available via iteration but not via item access.

3.9.2 (2009-10-13)
==================

Bugs Fixed
----------

- ZEO manages a separate thread for client network IO.  It created this
  thread on import, which caused problems for applications that
  implemented daemon behavior by forking.  Now, the client thread isn't
  created until needed.

- File-storage pack clean-up tasks that can take a long time unnecessarily
  blocked other activity.

- In certain rare situations, ZEO client connections would hang during the
  initial connection setup.

3.9.1 (2009-10-01)
==================

Bugs Fixed
----------

- Conflict errors committing blobs caused ZEO servers to stop committing
  transactions.

3.9.0 (2009-09-08)
==================

New Features (in more or less reverse chronological order)
----------------------------------------------------------

- The Database class now has an ``xrefs`` keyword argument and a
  corresponding allow-implicit-cross-references configuration option,
  which default to true.
When set to false, cross-database references are disallowed. - Added support for RelStorage. - As a convenience, the connection root method for returning the root object can now *also* be used as an object with attributes mapped to the root-object keys. - Databases have a new method, ``transaction``, that can be used with the Python (2.5 and later) ``with`` statement:: db = ZODB.DB(...) with db.transaction() as conn: # ... do stuff with conn This uses a private transaction manager for the connection. If control exits the block without an error, the transaction is committed, otherwise, it is aborted. - Convenience functions ZODB.connection and ZEO.connection provide a convenient way to open a connection to a database. They open a database and return a connection to it. When the connection is closed, the database is closed as well. - The ZODB.config databaseFrom... methods now support multi-databases. If multiple zodb sections are used to define multiple databases, the databases are connected in a multi-database arrangement and the first of the defined databases is returned. - The zeopack script has gotten a number of improvements: - Simplified command-line interface. (The old interface is still supported, except that support for ZEO version 1 servers has been dropped.) - Multiple storages can be packed in sequence. - This simplifies pack scheduling on servers serving multiple databases. - All storages are packed to the same time. - You can now specify a time of day to pack to. - The script will now time out if it can't connect to s storage in 60 seconds. - The connection now estimates the object size based on its pickle size and informs the cache about size changes. The database got additional configurations options (`cache-size-bytes` and `historical-cache-size-bytes`) to limit the cache size based on the estimated total size of cached objects. The default values are 0 which has the interpretation "do not limit based on the total estimated size". There are corresponding methods to read and set the new configuration parameters. - Connections now have a public ``opened`` attribute that is true when the connection is open, and false otherwise. When true, it is the seconds since the epoch (time.time()) when the connection was opened. This is a renaming of the previous ``_opened`` private variable. - FileStorage now supports blobs directly. - You can now control whether FileStorages keep .old files when packing. - POSKeyErrors are no longer logged by ZEO servers, because they are really client errors. - A new storage interface, IExternalGC, to support external garbage collection, http://wiki.zope.org/ZODB/ExternalGC, has been defined and implemented for FileStorage and ClientStorage. - As a small convenience (mainly for tests), you can now specify initial data as a string argument to the Blob constructor. - ZEO Servers now provide an option, invalidation-age, that allows quick verification of ZEO clients have been disconnected for less than a given time even if the number of transactions the client hasn't seen exceeds the invalidation queue size. This is only recommended if the storage being served supports efficient iteration from a point near the end of the transaction history. - The FileStorage iterator now handles large files better. When iterating from a starting transaction near the end of the file, the iterator will scan backward from the end of the file to find the starting point. This enhancement makes it practical to take advantage of the new storage server invalidation-age option. 
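  As an illustrative sketch of the improved iterator described above (not
  part of the original notes; the file name and starting transaction id
  are hypothetical)::

    from ZODB.FileStorage import FileStorage
    from ZODB.utils import p64

    # A transaction id near the end of the file; the iterator now finds it
    # by scanning backward from the end instead of from the beginning.
    start_tid = p64(0x03b9f7a2d4c10000)

    fs = FileStorage('Data.fs', read_only=True)
    for txn in fs.iterator(start=start_tid):
        # Each transaction record carries tid, user, and description.
        print('%r %s' % (txn.tid, txn.description))
    fs.close()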
- Previously, database connections were managed as a stack.  This tended
  to cause the same connection(s) to be used over and over.  For example,
  the most-used connection would typically be the only connection used.
  In some rare situations, extra connections could be opened and end up on
  the top of the stack, causing extreme memory wastage.  Now, when
  connections are placed on the stack, they sink below existing
  connections that have more active objects.

- There is a new pool-timeout database configuration option to specify
  that connections unused after the given time interval should be garbage
  collected.  This provides a means of dealing with extra connections that
  are created in rare circumstances and that would consume an unreasonable
  amount of memory.

- The Blob open method now supports a new mode, 'c', to open committed
  data for reading as an ordinary file, rather than as a blob file.  The
  ordinary file may be used outside the current transaction and even after
  the blob's database connection has been closed.

- ClientStorage now provides blob cache management.  When using non-shared
  blob directories, you can set a target cache size and the cache will
  periodically be reduced to try to keep it below the target size.

  The client blob directory layout has changed.  If you have existing
  non-shared blob directories, you will have to remove them.

- ZODB 3.9 ZEO clients can connect to ZODB 3.8 servers.  ZODB ZEO clients
  from ZODB 3.2 on can connect to ZODB 3.9 servers.

- When a ZEO cache is stale and would need verification, a
  ZEO.interfaces.StaleCache event is published (to zope.event).
  Applications may handle this event and take action such as exiting the
  application without verifying the cache or starting cold.

- There's a new convenience function, ZEO.DB, for creating databases using
  ZEO Client Storages.  Just call ZEO.DB with the same arguments you would
  otherwise pass to ZEO.ClientStorage.ClientStorage::

    import ZEO
    db = ZEO.DB(('some_host', 8200))

- Object saves are a little faster.

- When configuring storages in a storage server, the storage name now
  defaults to "1".  In the overwhelmingly common case that a single
  storage is used, the name can now be omitted.

- FileStorage now provides optional garbage collection.  A 'gc' keyword
  option can be passed to the pack method.  A false value prevents garbage
  collection.

- The FileStorage constructor now provides a boolean pack_gc option, which
  defaults to True, to control whether garbage collection is performed
  when packing by default.  This can be overridden with the gc option to
  the pack method.

  The ZConfig configuration for FileStorage now includes a pack-gc option,
  corresponding to the pack_gc constructor argument.

- The FileStorage constructor now has a packer keyword argument that
  allows an alternative packer to be supplied.

  The ZConfig configuration for FileStorage now includes a packer option,
  corresponding to the packer constructor argument.

- MappingStorage now supports multi-version concurrency control and
  iteration and provides a better storage implementation example.

- DemoStorage has a number of new features:

  - The ability to use a separate storage, such as a file storage, to
    store changes

  - Blob support

  - Multi-version concurrency control and iteration

  - Explicit support for demo-storage stacking via push and pop methods.

- When calling ZODB.DB to create a database, you can now pass a file name,
  rather than a storage, to use a file storage, as in the sketch below.
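  The following is an illustrative sketch only (the file name and root key
  are made up)::

    import ZODB
    import transaction

    db = ZODB.DB('Data.fs')       # a FileStorage is created from the path
    conn = db.open()
    conn.root()['counter'] = 0    # ordinary root-object access
    transaction.commit()
    conn.close()
    db.close()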
- Added support for copying and recovery of blob storages: - Added a helper function, ZODB.blob.is_blob_record for testing whether a data record is for a blob. This can be used when iterating over a storage to detect blob records so that blob data can be copied. In the future, we may want to build this into a blob-aware iteration interface, so that records get blob file attributes automatically. - Added the IBlobStorageRestoreable interfaces for blob storages that support recovery via a restoreBlob method. - Updated ZODB.blob.BlobStorage to implement IBlobStorageRestoreable and to have a copyTransactionsFrom method that also copies blob data. - New `ClientStorage` configuration option `drop_cache_rather_verify`. If this option is true then the ZEO client cache is dropped instead of the long (unoptimized) verification. For large caches, setting this option can avoid effective down times in the order of hours when the connection to the ZEO server was interrupted for a longer time. - Cleaned-up the storage iteration API and provided an iterator implementation for ZEO. - Versions are no-longer supported. - Document conflict resolution (see ZODB/ConflictResolution.txt). - Support multi-database references in conflict resolution. - Make it possible to examine oid and (in some situations) database name of persistent object references during conflict resolution. - Moved the 'transaction' module out of ZODB. ZODB depends upon this module, but it must be installed separately. - ZODB installation now requires setuptools. - Added `offset` information to output of `fstail` script. Added test harness for this script. - Added support for read-only, historical connections based on datetimes or serials (TIDs). See src/ZODB/historical_connections.txt. - Removed the ThreadedAsync module. - Now depend on zc.lockfile Bugs Fixed ---------- - CVE-2009-2701: Fixed a vulnerability in ZEO storage servers when blobs are available. Someone with write access to a ZEO server configured to support blobs could read any file on the system readable by the server process and remove any file removable by the server process. - BTrees (and TreeSets) kept references to internal keys. https://bugs.launchpad.net/zope3/+bug/294788 - BTree Sets and TreeSets don't support the standard set add method. (Now either add or the original insert method can be used to add an object to a BTree-based set.) - The runzeo script didn't work without a configuration file. (https://bugs.launchpad.net/zodb/+bug/410571) - Officially deprecated PersistentDict (https://bugs.launchpad.net/zodb/+bug/400775) - Calling __setstate__ on a persistent object could under certain uncommon cause the process to crash. (https://bugs.launchpad.net/zodb/+bug/262158) - When committing transactions involving blobs to ClientStorages with non-shared blob directories, a failure could occur in tpc_finish if there was insufficient disk space to copy the blob file or if the file wasn't available. https://bugs.launchpad.net/zodb/+bug/224169 - Savepoint blob data wasn't properly isolated. If multiple simultaneous savepoints in separate transactions modified the same blob, data from one savepoint would overwrite data for another. - Savepoint blob data wasn't cleaned up after a transaction abort. https://bugs.launchpad.net/zodb/+bug/323067 - Opening a blob with modes 'r+' or 'a' would fail when the blob had no committed changes. - PersistentList's sort method did not allow passing of keyword parameters. 
Changed its sort parameter list to match that of its (Python 2.4+) UserList base class. - Certain ZEO server errors could cause a client to get into a state where it couldn't commit transactions. https://bugs.launchpad.net/zodb/+bug/374737 - Fixed vulnerabilities in the ZEO network protocol that allow: - CVE-2009-0668 Arbitrary Python code execution in ZODB ZEO storage servers - CVE-2009-0669 Authentication bypass in ZODB ZEO storage servers The vulnerabilities only apply if you are using ZEO to share a database among multiple applications or application instances and if untrusted clients are able to connect to your ZEO servers. - Fixed the setup test command. It previously depended on private functions in zope.testing.testrunner that don't exist any more. - ZEO client threads were unnamed, making it hard to debug thread management. - ZEO protocol 2 support was broken. This caused very old clients to be unable to use new servers. - zeopack was less flexible than it was before. -h should default to local host. - The "lawn" layout was being selected by default if the root of the blob directory happened to contain a hidden file or directory such as ".svn". Now hidden files and directories are ignored when choosing the default layout. - BlobStorage was not compatible with MVCC storages because the wrappers were being removed by each database connection. Fixed. - Saving indexes for large file storages failed (with the error: RuntimeError: maximum recursion depth exceeded). This can cause a FileStorage to fail to start because it gets an error trying to save its index. - Sizes of new objects weren't added to the object cache size estimation, causing the object-cache size limiting feature to let the cache grow too large when many objects were added. - Deleted records weren't removed when packing file storages. - Fixed analyze.py and added test. - fixed Python 2.6 compatibility issue with ZEO/zeoserverlog.py - using hashlib.sha1 if available in order to avoid DeprecationWarning under Python 2.6 - made runzeo -h work - The monitor server didn't correctly report the actual number of clients. - Packing could return spurious errors due to errors notifying disconnected clients of new database size statistics. - Undo sometimes failed for FileStorages configured to support blobs. - Starting ClientStorages sometimes failed with non-new but empty cache files. - The history method on ZEO clients failed. - Fix for bug #251037: Make packing of blob storages non-blocking. - Fix for bug #220856: Completed implementation of ZEO authentication. - Fix for bug #184057: Make initialisation of small ZEO client file cache sizes not fail. - Fix for bug #184054: MappingStorage used to raise a KeyError during `load` instead of a POSKeyError. - Fixed bug in Connection.TmpStore: load() would not defer to the backend storage for loading blobs. - Fix for bug #181712: Make ClientStorage update `lastTransaction` directly after connecting to a server, even when no cache verification is necessary. - Fixed bug in blob filesystem helper: the `isSecure` check was inverted. - Fixed bug in transaction buffer: a tuple was unpacked incorrectly in `clear`. - Bugfix the situation in which comparing persistent objects (for instance, as members in BTree set or keys of BTree) might cause data inconsistency during conflict resolution. - Fixed bug 153316: persistent and BTrees were using `int` for memory sizes which caused errors on x86_64 Intel Xeon machines (using 64-bit Linux). 
- Fixed small bug that the Connection.isReadOnly method didn't work after a savepoint. - Bug #98275: Made ZEO cache more tolerant when invalidating current versions of objects. - Fixed a serious bug that could cause client I/O to stop (hang). This was accompanied by a critical log message along the lines of: "RuntimeError: dictionary changed size during iteration". - Fixed bug #127182: Blobs were subclassable which was not desired. - Fixed bug #126007: tpc_abort had untested code path that was broken. - Fixed bug #129921: getSize() function in BlobStorage could not deal with garbage files - Fixed bug in which MVCC would not work for blobs. - Fixed bug in ClientCache that occurred with objects larger than the total cache size. - When an error occured attempting to lock a file and logging of said error was enabled. - FileStorages previously saved indexes after a certain number of writes. This was done during the last phase of two-phase commit, which made this critical phase more subject to errors than it should have been. Also, for large databases, saves were done so infrequently as to be useless. The feature was removed to reduce the chance for errors during the last phase of two-phase commit. - File storages previously kept an internal object id to transaction id mapping as an optimization. This mapping caused excessive memory usage and failures during the last phase of two-phase commit. This optimization has been removed. - Refactored handling of invalidations on ZEO clients to fix a possible ordering problem for invalidation messages. - On many systems, it was impossible to create more than 32K blobs. Added a new blob-directory layout to work around this limitation. - Fixed bug that could lead to memory errors due to the use of a Python dictionary for a mapping that can grow large. - Fixed bug #251037: Made packing of blob storages non-blocking. - Fixed a bug that could cause InvalidObjectReference errors for objects that were explicitly added to a database if the object was modified after a savepoint that added the object. - Fixed several bugs that caused ZEO cache corruption when connecting to servers. These bugs affected both persistent and non-persistent caches. - Improved the the ZEO client shutdown support to try to avoid spurious errors on exit, especially for scripts, such as zeopack. - Packing failed for databases containing cross-database references. - Cross-database references to databases with empty names weren't constructed properly. - The zeo client cache used an excessive amount of memory, causing applications with large caches to exhaust available memory. - Fixed a number of bugs in the handling of persistent ZEO caches: - Cache records are written in several steps. If a process exits after writing begins and before it is finishes, the cache will be corrupt on restart. The way records are written was changed to make cache record updates atomic. - There was no lock file to prevent opening a cache multiple times at once, which would lead to corruption. Persistent caches now use lock files, in the same way that file storages do. - A bug in the cache-opening logic led to cache failure in the unlikely event that a cache has no free blocks. - When using ZEO Client Storages, Errors occured when trying to store objects too big to fit in the ZEO cache file. - Fixed bug in blob filesystem helper: the `isSecure` check was inverted. - Fixed bug in transaction buffer: a tuple was unpacked incorrectly in `clear`. 
- Fixed bug in Connection.TmpStore: load() would not defer to the back-end storage for loading blobs. - Fixed bug #190884: Wrong reference to `POSKeyError` caused NameError. - Completed implementation of ZEO authentication. This fixes issue 220856. What's new in ZODB 3.8.0 ======================== General ------- - (unreleased) Fixed setup.py use of setuptools vs distutils, so .c and .h files are included in the bdist_egg. - The ZODB Storage APIs have been documented and cleaned up. - ZODB versions are now officially deprecated and support for them will be removed in ZODB 3.9. (They have been widely recognized as deprecated for quite a while.) - Changed the automatic garbage collection when opening a connection to only apply the garbage collections on those connections in the pool that are closed. (This fixed issue 113923.) ZEO --- - (3.8a1) ZEO's strategoes for avoiding client cache verification were improved in the case that servers are restarted. Before, if transactions were committed after the restart, clients that were up to date or nearly up to date at the time of the restart and then connected had to verify their caches. Now, it is far more likely that a client that reconnects soon after a server restart won't have to verify its cache. - (3.8a1) Fixed a serious bug that could cause clients that disconnect from and reconnect to a server to get bad invalidation data if the server serves multiple storages with active writes. - (3.8a1) It is now theoretically possible to use a ClientStorage in a storage server. This might make it possible to offload read load from a storage server at the cost of increasing write latency. This should increase write throughput by offloading reads from the final storage server. This feature is somewhat experimental. It has tests, but hasn't been used in production. Transactions ------------ - (3.8a1) Add a doom() and isDoomed() interface to the transaction module. First step towards the resolution of http://www.zope.org/Collectors/Zope3-dev/655 A doomed transaction behaves exactly the same way as an active transaction but raises an error on any attempt to commit it, thus forcing an abort. Doom is useful in places where abort is unsafe and an exception cannot be raised. This occurs when the programmer wants the code following the doom to run but not commit. It is unsafe to abort in these circumstances as a following get() may implicitly open a new transaction. Any attempt to commit a doomed transaction will raise a DoomedTransaction exception. - (3.8a1) Clean up the ZODB imports in transaction. Clean up weird import dance with ZODB. This is unnecessary since the transaction module stopped being imported in ZODB/__init__.py in rev 39622. - (3.8a1) Support for subtransactions has been removed in favor of save points. Blobs ----- - (3.8b1) Updated the Blob implementation in a number of ways. Some of these are backward incompatible with 3.8a1: o The Blob class now lives in ZODB.blob o The blob openDetached method has been replaced by the committed method. - (3.8a1) Added new blob feature. See the ZODB/Blobs directory for documentation. ZODB now handles (reasonably) large binary objects efficiently. Useful to use from a few kilobytes to at least multiple hundred megabytes. BTrees ------ - (3.8a1) Added support for 64-bit integer BTrees as separate types. (For now, we're retaining compile-time support for making the regular integer BTrees 64-bit.) 
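  As a quick, illustrative sketch of the separate 64-bit integer types
  (the key and value below are arbitrary examples)::

    from BTrees.LLBTree import LLBTree   # 64-bit integer keys and values

    t = LLBTree()
    t[2 ** 40] = 2 ** 41                 # beyond the 32-bit range is fine here
    assert t[2 ** 40] == 2 ** 41
    assert list(t.keys()) == [2 ** 40]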
- (3.8a1) Normalize names in modules so that BTrees, Buckets, Sets, and TreeSets can all be accessed with those names in the modules (e.g., BTrees.IOBTree.BTree). This is in addition to the older names (e.g., BTrees.IOBTree.IOBTree). This allows easier drop-in replacement, which can especially be simplify code for packages that want to support both 32-bit and 64-bit BTrees. - (3.8a1) Describe the interfaces for each module and actually declare the interfaces for each. - (3.8a1) Fix module references so klass.__module__ points to the Python wrapper module, not the C extension. - (3.8a1) introduce module families, to group all 32-bit and all 64-bit modules. What's new in ZODB3 3.7.0 ========================== Release date: 2007-04-20 Packaging --------- - (3.7.0b3) ZODB is now packaged without it's dependencies ZODB no longer includes copies of dependencies such as ZConfig, zope.interface and so on. It now treats these as dependencies. If ZODB is installed with easy_install or zc.buildout, the dependencies will be installed automatically. - (3.7.0b3) ZODB is now a buildout ZODB checkouts are now built and tested using zc.buildout. - (3.7b4) Added logic to avoid spurious errors from the logging system on exit. - (3.7b2) Removed the "sync" mode for ClientStorage. Previously, a ClientStorage could be in either "sync" mode or "async" mode. Now there is just "async" mode. There is now a dedicicated asyncore main loop dedicated to ZEO clients. Applications no-longer need to run an asyncore main loop to cause client storages to run in async mode. Even if an application runs an asyncore main loop, it is independent of the loop used by client storages. This addresses a test failure on Mac OS X, http://www.zope.org/Collectors/Zope3-dev/650, that I believe was due to a bug in sync mode. Some asyncore-based code was being called from multiple threads that didn't expect to be. Converting to always-async mode revealed some bugs that weren't caught before because the tests ran in sync mode. These problems could explain some problems we've seen at times with clients taking a long time to reconnect after a disconnect. Added a partial heart beat to try to detect lost connections that aren't otherwise caught, http://mail.zope.org/pipermail/zodb-dev/2005-June/008951.html, by perioidically writing to all connections during periods of inactivity. Connection management --------------------- - (3.7a1) When more than ``pool_size`` connections have been closed, ``DB`` forgets the excess (over ``pool_size``) connections closed first. Python's cyclic garbage collection can take "a long time" to reclaim them (and may in fact never reclaim them if application code keeps strong references to them), but such forgotten connections can never be opened again, so their caches are now cleared at the time ``DB`` forgets them. Most applications won't notice a difference, but applications that open many connections, and/or store many large objects in connection caches, and/or store limited resources (such as RDB connections) in connection caches may benefit. BTrees ------ - Support for 64-bit integer keys and values has been provided as a compile-time option for the "I" BTrees (e.g. IIBTree). Documentation ------------- - (3.7a1) Thanks to Stephan Richter for converting many of the doctest files to ReST format. These are now chapters in the Zope 3 apidoc too. IPersistent ----------- - (3.7a1) The documentation for ``_p_oid`` now specifies the concrete type of oids (in short, an oid is either None or a non-empty string). 
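  A small sketch of that documented contract (the storage choice and key
  name are illustrative)::

    import transaction
    from ZODB import DB
    from ZODB.MappingStorage import MappingStorage
    from persistent.mapping import PersistentMapping

    db = DB(MappingStorage())
    conn = db.open()

    obj = PersistentMapping()
    assert obj._p_oid is None          # not yet part of any database

    conn.root()['example'] = obj
    transaction.commit()               # committing assigns the oid

    assert obj._p_oid                  # now a non-empty string
    conn.close()
    db.close()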
Testing ------- - (3.7b2) Fixed test-runner output truncation. A bug was fixed in the test runner that caused result summaries to be omitted when running on Windows. Tools ----- - (3.7a1) The changeover from zLOG to the logging module means that some tools need to perform minimal logging configuration themselves. Changed the zeoup script to do so and thus enable it to emit error messages. BTrees ------ - (3.7a1) Suppressed warnings about signedness of characters when compiling under GCC 4.0.x. See http://www.zope.org/Collectors/Zope/2027. Connection ---------- - (3.7a1) An optimization for loading non-current data (MVCC) was inadvertently disabled in ``_setstate()``; this has been repaired. persistent ---------- - (3.7a1) Suppressed warnings about signedness of characters when compiling under GCC 4.0.x. See http://www.zope.org/Collectors/Zope/2027. - (3.7a1) PersistentMapping was inadvertently pickling volatile attributes (http://www.zope.org/Collectors/Zope/2052). After Commit hooks ------------------ - (3.7a1) Transaction objects have a new method, ``addAfterCommitHook(hook, *args, **kws)``. Hook functions registered with a transaction are called after the transaction commits or aborts. For example, one might want to launch non transactional or asynchrnonous code after a successful, or aborted, commit. See ``test_afterCommitHook()`` in ``transaction/tests/test_transaction.py`` for a tutorial doctest, and the ``ITransaction`` interface for details. What's new in ZODB3 3.6.2? ========================== Release date: 15-July-2006 DemoStorage ----------- - (3.6.2) DemoStorage was unable to wrap base storages who did not have an '_oid' attribute: most notably, ZEO.ClientStorage (http://www.zope.org/Collectors/Zope/2016). Following is combined news from internal releases (to support ongoing Zope2 / Zope3 development). These are the dates of the internal releases: - 3.6.1 27-Mar-2006 - 3.6.0 05-Jan-2006 - 3.6b6 01-Jan-2006 - 3.6b5 18-Dec-2005 - 3.6b4 04-Dec-2005 - 3.6b3 06-Nov-2005 - 3.6b2 25-Oct-2005 - 3.6b1 24-Oct-2005 - 3.6a4 07-Oct-2005 - 3.6a3 07-Sep-2005 - 3.6a2 06-Sep-2005 - 3.6a1 04-Sep-2005 Removal of Features Deprecated in ZODB 3.4 ------------------------------------------ (3.6b2) ZODB 3.6 no longer contains features officially deprecated in the ZODB 3.4 release. These include: - ``get_transaction()``. Use ``transaction.get()`` instead. ``transaction.commit()`` is a shortcut spelling of ``transaction.get().commit()``, and ``transaction.abort()`` of ``transaction.get().abort()``. Note that importing ZODB no longer installs ``get_transaction`` as a name in Python's ``__builtin__`` module either. - The ``begin()`` method of ``Transaction`` objects. Use the ``begin()`` method of a transaction manager instead. ``transaction.begin()`` is a shortcut spelling to call the default transaction manager's ``begin()`` method. - The ``dt`` argument to ``Connection.cacheMinimize()``. - The ``Connection.cacheFullSweep()`` method. Use ``cacheMinimize()`` instead. - The ``Connection.getTransaction()`` method. Pass a transaction manager to ``DB.open()`` instead. - The ``Connection.getLocalTransaction()`` method. Pass a transaction manager to ``DB.open()`` instead. - The ``cache_deactivate_after`` and ``version_cache_deactivate_after`` arguments to the ``DB`` constructor. - The ``temporary``, ``force``, and ``waitflag`` arguments to ``DB.open()``. ``DB.open()`` no longer blocks (there's no longer a fixed limit on the number of open connections). 
- The ``transaction`` and ``txn_mgr``arguments to ``DB.open()``. Use the ``transaction_manager`` argument instead. - The ``getCacheDeactivateAfter``, ``setCacheDeactivateAfter``, ``getVersionCacheDeactivateAfter`` and ``setVersionCacheDeactivateAfter`` methods of ``DB``. Persistent ---------- - (3.6.1) Suppressed warnings about signedness of characters when compiling under GCC 4.0.x. See http://www.zope.org/Collectors/Zope/2027. - (3.6a4) ZODB 3.6 introduces a change to the basic behavior of Persistent objects in a particular end case. Before ZODB 3.6, setting ``obj._p_changed`` to a true value when ``obj`` was a ghost was ignored: ``obj`` remained a ghost, and getting ``obj._p_changed`` continued to return ``None``. Starting with ZODB 3.6, ``obj`` is activated instead (unghostified), and its state is changed from the ghost state to the changed state. The new behavior is less surprising and more robust. - (3.6b5) The documentation for ``_p_oid`` now specifies the concrete type of oids (in short, an oid is either None or a non-empty string). Commit hooks ------------ - (3.6a1) The ``beforeCommitHook()`` method has been replaced by the new ``addBeforeCommitHook()`` method, with a more-robust signature. ``beforeCommitHook()`` is now deprecated, and will be removed in ZODB 3.8. Thanks to Julien Anguenot for contributing code and tests. Connection management --------------------- - (3.6b6) When more than ``pool_size`` connections have been closed, ``DB`` forgets the excess (over ``pool_size``) connections closed first. Python's cyclic garbage collection can take "a long time" to reclaim them (and may in fact never reclaim them if application code keeps strong references to them), but such forgotten connections can never be opened again, so their caches are now cleared at the time ``DB`` forgets them. Most applications won't notice a difference, but applications that open many connections, and/or store many large objects in connection caches, and/or store limited resources (such as RDB connections) in connection caches may benefit. ZEO --- - (3.6a4) Collector 1900. In some cases of pickle exceptions raised by low-level ZEO communication code, callers of ``marshal.encode()`` could attempt to catch an exception that didn't actually exist, leading to an erroneous ``AttributeError`` exception. Thanks to Tres Seaver for the diagnosis. BaseStorage ----------- - (3.6a4) Nothing done by ``tpc_abort()`` should raise an exception. However, if something does (an error case), ``BaseStorage.tpc_abort()`` left the commit lock in the acquired state, causing any later attempt to commit changes hang. Multidatabase ------------- - (3.6b1) The ``database_name`` for a database in a multidatabase collection can now be specified in a config file's ```` section, as the value of the optional new ``database_name`` key. The ``.databases`` attribute cannot be specified in a config file, but can be passed as the optional new ``databases`` argument to the ``open()`` method of a ZConfig factory for type ``ZODBDatabase``. For backward compatibility, Zope 2.9 continues to allow using the name in its ```` config section as the database name (note that ```` is defined by Zope, not by ZODB -- it's a Zope-specific extension of ZODB's ```` section). PersistentMapping ----------------- - (3.6.1) PersistentMapping was inadvertently pickling volatile attributes (http://www.zope.org/Collectors/Zope/2052). - (3.6b4) ``PersistentMapping`` makes changes by a ``pop()`` method call persistent now (http://www.zope.org/Collectors/Zope/2036). 
- (3.6a1) The ``PersistentMapping`` class has an ``__iter__()`` method now, so that objects of this type work well with Python's iteration protocol. For example, if ``x`` is a ``PersistentMapping`` (or Python dictionary, or BTree, or ``PersistentDict``, ...), then ``for key in x:`` iterates over the keys of ``x``, ``list(x)`` creates a list containing ``x``'s keys, ``iter(x)`` creates an iterator for ``x``'s keys, and so on. Tools ----- - (3.6b5) The changeover from zLOG to the logging module means that some tools need to perform minimal logging configuration themselves. Changed the zeoup script to do so and thus enable it to emit error messages. BTrees ------ - (3.6.1) Suppressed warnings about signedness of characters when compiling under GCC 4.0.x. See http://www.zope.org/Collectors/Zope/2027. - (3.6a1) BTrees and Buckets now implement the ``setdefault()`` and ``pop()`` methods. These are exactly like Python's dictionary methods of the same names, except that ``setdefault()`` requires both arguments (and Python is likely to change to require both arguments too -- defaulting the ``default`` argument to ``None`` has no viable use cases). Thanks to Ruslan Spivak for contributing code, tests, and documentation. - (3.6a1) Collector 1873. It wasn't possible to construct a BTree or Bucket from, or apply their update() methods to, a PersistentMapping or PersistentDict. This works now. ZopeUndo -------- - (3.6a4) Collector 1810. A previous bugfix (#1726) broke listing undoable transactions for users defined in a non-root acl_users folder. Zope logs a acl_users path together with a username (separated by a space) and this previous fix failed to take this into account. Connection ---------- - (3.6b5) An optimization for loading non-current data (MVCC) was inadvertently disabled in ``_setstate()``; this has been repaired. Documentation ------------- - (3.6b3) Thanks to Stephan Richter for converting many of the doctest files to ReST format. These are now chapters in the Zope 3 apidoc too. - (3.6b4) Several misspellings of "occurred" were repaired. Development ----------- - (3.6a1) The source code for the old ExtensionClass-based Persistence package moved, from ZODB to the Zope 2.9 development tree. ZODB 3.5 makes no use of Persistence, and, indeed, the Persistence package could not be compiled from a ZODB release, since some of the C header files needed appear only in Zope. - (3.6a3) Re-added the ``zeoctl`` module, for the same reasons ``mkzeoinst`` was re-added (see below). - (3.6a2) The ``mkzeoinst`` module was re-added to ZEO, because Zope3 has a script that expects to import it from there. ZODB's ``mkzeoinst`` script was rewritten to invoke the ``mkzeoinst`` module. ``transact`` ------------ - (3.6b4) Collector 1959: The undocumented ``transact`` module no longer worked. It remains undocumented and untested, but thanks to Janko Hauser it's possible that it works again ;-). What's new in ZODB3 3.5.1? ========================== Release date: 26-Sep-2005 Following is combined news from internal releases (to support ongoing Zope3 development). These are the dates of the internal releases: - 3.5.1b2 07-Sep-2005 - 3.5.1b1 06-Sep-2005 Build ----- - (3.5.1b2) Re-added the ``zeoctl`` module, for the same reasons ``mkzeoinst`` was re-added (see below). - (3.5.1b1) The ``mkzeoinst`` module was re-added to ZEO, because Zope3 has a script that expects to import it from there. ZODB's ``mkzeoinst`` script was rewritten to invoke the ``mkzeoinst`` module. ZopeUndo -------- - (3.5.1) Collector 1810. 
A previous bugfix (#1726) broke listing undoable transactions for users defined in a non-root acl_users folder. Zope logs a acl_users path together with a username (separated by a space) and this previous fix failed to take this into account. What's new in ZODB3 3.5.0? ========================== Release date: 31-Aug-2005 Following is combined news from internal releases (to support ongoing Zope3 development). These are the dates of the internal releases: - 3.5a7 11-Aug-2005 - 3.5a6 04-Aug-2005 - 3.5a5 19-Jul-2005 - 3.5a4 14-Jul-2005 - 3.5a3 17-Jun-2005 - 3.5a2 16-Jun-2005 - 3.5a1 10-Jun-2005 Savepoints ---------- - (3.5.0) As for deprecated subtransaction commits, the intent was that making a savepoint would invoke incremental garbage collection on Connection memory caches, to try to reduce the number of objects in cache to the configured cache size. Due to an oversight, this didn't happen, and stopped happening for subtransaction commits too. Making a savepoint (or doing a subtransaction commit) does invoke cache gc now. - (3.5a3) When a savepoint is made, the states of objects modified so far are saved to a temporary storage (an instance of class ``TmpStore``, although that's an internal implementation detail). That storage needs to implement the full storage API too, but was missing the ``loadBefore()`` method needed for MVCC to retrieve non-current revisions of objects. This could cause spurious errors if a transaction with a pending savepoint needed to fetch an older revision of some object. - (3.5a4) The ``ISavepoint`` interface docs said you could roll back to a given savepoint any number of times (until the transaction ends, or until you roll back to an earlier savepoint's state), but the implementation marked a savepoint as invalid after its first use. The implementation has been repaired, to match the docs. ZEO client cache ---------------- - (3.5a6) Two memory leaks in the ZEO client cache were repaired, a major one involving ``ZEO.cache.Entry`` objects, and a minor one involving empty lists. Subtransactions are deprecated ------------------------------ - (3.5a4) Subtransactions are deprecated, and will be removed in ZODB 3.7. Use savepoints instead. Savepoints are more powerful, and code using subtransactions does not mix well with code using savepoints (a subtransaction commit forces all current savepoints to become unusable, so code using subtransactions can hurt newer code trying to use savepoints). In general, a subtransaction commit done just to free memory can be changed from:: transaction.commit(1) to:: transaction.savepoint(True) That is, make a savepoint, and forget it. As shown, it's best to pass ``True`` for the optional ``optimistic`` argument in this case: because there's no possibility of asking for a rollback later, there's no need to insist that all data managers support rollback. In rarer cases, a subtransaction commit is followed later by a subtransaction abort. In that case, change the initial:: transaction.commit(1) to:: sp = transaction.savepoint() and in place of the subtransaction abort:: transaction.abort(1) roll back the savepoint instead:: sp.rollback() - (3.5a4) Internal uses of subtransactions (transaction ``commit()`` or ``abort()`` passing a true argument) were rewritten to use savepoints instead. Multi-database -------------- - (3.5a1) Preliminary support for persistent cross-database references has been added. See ``ZODB/cross-database-references.txt`` for an introduction. Tools ----- - (3.5a6, 3.5a7) Collector #1847. 
The ZEO client cache tracing and simulation tools weren't updated to work with ZODB 3.3, and the introduction of MVCC required major reworking of the tracing and simulation code. These tools are in a working state again, although so far lightly tested on just a few applications. In ``doc/ZEO/``, see the heavily revised ``trace.txt`` and ``cache.txt``. - (3.5a5) Collector #1846: If an uncommitted transaction was found, fsrecover.py fell into an infinite loop. Windows ------- - (3.5a6) As developed in a long thread starting at http://mail.zope.org/pipermail/zope/2005-July/160433.html there appears to be a race bug in the Microsoft Windows socket implementation, rarely visible in ZEO when multiple processes try to create an "asyncore trigger" simultaneously. Windows-specific code in ``ZEO/zrpc/trigger.py`` changed to work around this bug when it occurs. ThreadedAsync.LoopCallback -------------------------- - (3.5a5) This once again physically replaces Python's ``asyncore.loop`` function with its own loop function, because it turns out Zope relied on the seemingly unused ``LoopCallback.exit_status`` global, which was removed in the change described below. Python's ``asyncore.loop`` is again not invoked, so any breakpoints or debugging prints added to that are again "lost". - (3.5a4) This replaces Python's ``asyncore.loop`` function with its own, in order to get notified when ``loop()`` is first called. The signature of ``asyncore.loop`` changed in Python 2.4, but ``LoopCallback.loop``'s signature didn't change to match. The code here was repaired to be compatible with both old and new signatures, and also repaired to invoke Python's ``asyncore.loop()`` instead of replacing it entirely (so, for example, debugging prints added to Python's ``asyncore.loop`` won't be lost anymore). FileStorage ----------- - (3.5a4) Collector #1830. In some error cases when reading a FileStorage index, the code referenced an undefined global. - (3.5a4) Collector #1822. The ``undoLog()`` and ``undoInfo()`` methods were changed in 3.4a9 to return the documented results. Alas, some pieces of (non-ZODB) code relied on the actual behavior. When the ``first`` and ``last`` arguments are both >= 0, these methods now treat them as if they were Python slice indices, including the `first` index but excluding the ``last`` index. This matches former behavior, although it contradicts older ZODB UML documentation. The documentation in ``ZODB.interfaces.IStorageUndoable`` was changed to match the new intent. - (3.5a2) The ``_readnext()`` method now returns the transaction size as the value of the "size" key. Thanks to Dieter Maurer for the patch, from http://mail.zope.org/pipermail/zodb-dev/2003-October/006157.html. "This is very valuable when you want to spot strange transaction sizes via Zope's 'Undo' tab". BTrees ------ - (3.5.a5) Collector 1843. When a non-integer was passed to a method like ``keys()`` of a Bucket or Set with integer keys, an internal error code was overlooked, leading to everything from "delayed errors" to segfaults. Such cases raise TypeError now, as intended. - (3.5a4) Collector 1831. The BTree ``minKey()`` and ``maxKey()`` methods gave a misleading message if no key satisfying the constraints existed in a non-empty tree. - (3.5a4) Collector 1829. Clarified that the ``minKey()`` and ``maxKey()`` methods raise an exception if no key exists satsifying the constraints. - (3.5a4) The ancient ``convert.py`` script was removed. 
It was intended to convert "old" BTrees to "new" BTrees, but the "old" BTree implementation was removed from ZODB years ago. What's new in ZODB3 3.4.1? ========================== Release date: 09-Aug-2005 Following are dates of internal releases (to support ongoing Zope 2 development) since ZODB 3.4's last public release: - 3.4.1b5 08-Aug-2005 - 3.4.1b4 07-Aug-2005 - 3.4.1b3 04-Aug-2005 - 3.4.1b2 02-Aug-2005 - 3.4.1b1 26-Jul-2005 - 3.4.1a6 19-Jul-2005 - 3.4.1a5 12-Jul-2005 - 3.4.1a4 08-Jul-2005 - 3.4.1a3 02-Jul-2005 - 3.4.1a2 29-Jun-2005 - 3.4.1a1 27-Jun-2005 Savepoints ---------- - (3.4.1a1) When a savepoint is made, the states of objects modified so far are saved to a temporary storage (an instance of class ``TmpStore``, although that's an internal implementation detail). That storage needs to implement the full storage API too, but was missing the ``loadBefore()`` method needed for MVCC to retrieve non-current revisions of objects. This could cause spurious errors if a transaction with a pending savepoint needed to fetch an older revision of some object. - (3.4.1a5) The ``ISavepoint`` interface docs said you could roll back to a given savepoint any number of times (until the transaction ends, or until you roll back to an earlier savepoint's state), but the implementation marked a savepoint as invalid after its first use. The implementation has been repaired, to match the docs. - (3.4.1b4) Collector 1860: use an optimistic savepoint in ExportImport (there's no possiblity of rollback here, so no need to insist that the data manager support rollbacks). ZEO client cache ---------------- - (3.4.1b3) Two memory leaks in the ZEO client cache were repaired, a major one involving ``ZEO.cache.Entry`` objects, and a minor one involving empty lists. Subtransactions --------------- - (3.4.1a5) Internal uses of subtransactions (transaction ``commit()`` or ``abort()`` passing a true argument) were rewritten to use savepoints instead. Application code is strongly encouraged to do this too: subtransactions are weaker, will be deprecated soon, and do not mix well with savepoints (when you do a subtransaction commit, all current savepoints are made unusable). In general, a subtransaction commit done just to free memory can be changed from:: transaction.commit(1) to:: transaction.savepoint(True) That is, make a savepoint, and forget it. As shown, it's best to pass ``True`` for the optional ``optimistic`` argument in this case: because there's no possibility of asking for a rollback later, there's no need to insist that all data managers support rollback. In rarer cases, a subtransaction commit is followed later by a subtransaction abort. In that case, change the initial:: transaction.commit(1) to:: sp = transaction.savepoint() and in place of the subtransaction abort:: transaction.abort(1) roll back the savepoint instead:: sp.rollback() FileStorage ----------- - (3.4.1a3) Collector #1830. In some error cases when reading a FileStorage index, the code referenced an undefined global. - (3.4.1a2) Collector #1822. The ``undoLog()`` and ``undoInfo()`` methods were changed in 3.4a9 to return the documented results. Alas, some pieces of (non-ZODB) code relied on the actual behavior. When the `first` and `last` arguments are both >= 0, these methods now treat them as if they were Python slice indices, including the `first` index but excluding the `last` index. This matches former behavior, although it contradicts older ZODB UML documentation. 
The documentation in ``ZODB.interfaces.IStorageUndoable`` was changed to match the new intent. - (3.4.1a1) The ``UndoSearch._readnext()`` method now returns the transaction size as the value of the "size" key. Thanks to Dieter Maurer for the patch, from http://mail.zope.org/pipermail/zodb-dev/2003-October/006157.html. "This is very valuable when you want to spot strange transaction sizes via Zope's 'Undo' tab". ThreadedAsync.LoopCallback -------------------------- - (3.4.1a6) This once again physically replaces Python's ``asyncore.loop`` function with its own loop function, because it turns out Zope relied on the seemingly unused ``LoopCallback.exit_status`` global, which was removed in the change described below. Python's ``asyncore.loop`` is again not invoked, so any breakpoints or debugging prints added to that are again "lost". - (3.4.1a1) This replaces Python's ``asyncore.loop`` function with its own, in order to get notified when ``loop()`` is first called. The signature of ``asyncore.loop`` changed in Python 2.4, but ``LoopCallback.loop``'s signature didn't change to match. The code here was repaired to be compatible with both old and new signatures, and also repaired to invoke Python's ``asyncore.loop()`` instead of replacing it entirely (so, for example, debugging prints added to Python's ``asyncore.loop`` won't be lost anymore). Windows ------- - (3.4.1b2) As developed in a long thread starting at http://mail.zope.org/pipermail/zope/2005-July/160433.html there appears to be a race bug in the Microsoft Windows socket implementation, rarely visible in ZEO when multiple processes try to create an "asyncore trigger" simultaneously. Windows-specific code in ``ZEO/zrpc/trigger.py`` changed to work around this bug when it occurs. Tools ----- - (3.4.1b1 thru 3.4.1b5) Collector #1847. The ZEO client cache tracing and simulation tools weren't updated to work with ZODB 3.3, and the introduction of MVCC required major reworking of the tracing and simulation code. These tools are in a working state again, although so far lightly tested on just a few applications. In ``doc/ZEO/``, see the heavily revised ``trace.txt`` and ``cache.txt``. - (3.4.1a6) Collector #1846: If an uncommitted transaction was found, fsrecover.py fell into an infinite loop. DemoStorage ----------- - (3.4.1a1) The implementation of ``undoLog()`` was wrong in several ways; repaired. BTrees ------ - (3.4.1a6) Collector 1843. When a non-integer was passed to a method like ``keys()`` of a Bucket or Set with integer keys, an internal error code was overlooked, leading to everything from "delayed errors" to segfaults. Such cases raise TypeError now, as intended. - (3.4.1a4) Collector 1831. The BTree ``minKey()`` and ``maxKey()`` methods gave a misleading message if no key satisfying the constraints existed in a non-empty tree. - (3.4.1a3) Collector 1829. Clarified that the ``minKey()`` and ``maxKey()`` methods raise an exception if no key exists satsifying the constraints. What's new in ZODB3 3.4? ======================== Release date: 09-Jun-2005 Following is combined news from the "internal releases" (to support ongoing Zope 2.8 and Zope3 development) since the last public ZODB 3.4 release. These are the dates of the internal releases: - 3.4c2 06-Jun-2005 - 3.4c1 03-Jun-2005 - 3.4b3 27-May-2005 - 3.4b2 26-May-2005 Connection, DB -------------- - (3.4b3) ``.transaction_manager`` is now a public attribute of IDataManager, and is the instance of ITransactionManager used by the data manager as its transaction manager. 
There was previously no way to ask a data manager which transaction manager it was using. It's intended that ``transaction_manager`` be treated as read-only. - (3.4b3) For sanity, the ``txn_mgr`` argument to ``DB.open()``, ``Connection.__init__()``, and ``Connection._setDB()`` has been renamed to ``transaction_manager``. ``txn_mgr`` is still accepted, but is deprecated and will be removed in ZODB 3.6. Any code that was using the private ``._txn_mgr`` attribute of ``Connection`` will break immediately. Development ----------- - (3.4b2) ZODB's ``test.py`` is now a small driver for the shared ``zope.testing.testrunner``. See the latter's documentation for command-line arguments. Error reporting --------------- - (3.4c1) In the unlikely event that ``referencesf()`` reports an unpickling error (for example, a corrupt database can cause this), the message it produces no longer contains unprintable characters. Tests ----- - (3.4c2) ``checkCrossDBInvalidations`` suffered spurious failures too often on slow and/or busy machines. The test is willing to wait longer for success now. What's new in ZODB3 3.4b1? ========================== Release date: 19-May-2005 What follows is combined news from the "internal releases" (to support ongoing Zope 2.8 and Zope3 development) since the last public ZODB 3.4 release. These are the dates of the internal releases: - 3.4b1 19-May-2005 - 3.4a9 12-May-2005 - 3.4a8 09-May-2005 - 3.4a7 06-May-2005 - 3.4a6 05-May-2005 - 3.4a5 25-Apr-2005 - 3.4a4 23-Apr-2005 - 3.4a3 13-Apr-2005 - 3.4a2 03-Apr-2005 transaction ----------- - (3.4a7) If the first activity seen by a new ``ThreadTransactionManager`` was an explicit ``begin()`` call, then synchronizers registered after that (but still during the first transaction) were not communicated to the transaction object. As a result, the ``afterCompletion()`` methods of registered synchronizers weren't called when the first transaction ended. - (3.4a6) Doing a subtransaction commit erroneously processed invalidations, which could lead to an inconsistent view of the database. For example, let T be the transaction of which the subtransaction commit was a part. If T read a persistent object O's state before the subtransaction commit, did not commit new state of its own for O during its subtransaction commit, and O was modified before the subtransaction commit by a different transaction, then the subtransaction commit processed an invalidation for O, and the state T read for O originally was discarded in T. If T went on to access O again, it saw the newly committed (by a different transaction) state for O:: o_attr = O.some_attribute get_transaction().commit(True) assert o_attr == O.some_attribute could fail, and despite that T never modifed O. - (3.4a4) Transactions now support savepoints. Savepoints allow changes to be periodically checkpointed within a transaction. You can then rollback to a previously created savepoint. See ``transaction/savepoint.txt``. - (3.4a6) A ``getBeforeCommitHooks()`` method was added. It returns an iterable producing the registered beforeCommit hooks. - (3.4a6) The ``ISynchronizer`` interface has a new ``newTransaction()`` method. This is invoked whenever a transaction manager's ``begin()`` method is called. (Note that a transaction object's (as opposed to a transaction manager's) ``begin()`` method is deprecated, and ``newTransaction()`` is not called when using the deprecated method.) 
- (3.4a6) Relatedly, ``Connection`` implements ``ISynchronizer``, and ``Connection``'s ``afterCompletion()`` and ``newTransaction()`` methods now call ``sync()`` on the underlying storage (if the underlying storage has such a method), in addition to processing invalidations. The practical implication is that storage synchronization will be done automatically now, whenever a transaction is explicitly started, and after top-level transaction commit or abort. As a result, ``Connection.sync()`` should virtually never be needed anymore, and will eventually be deprecated. - (3.4a3) Transaction objects have a new method, ``beforeCommitHook(hook, *args, **kws)``. Hook functions registered with a transaction are called at the start of a top-level commit, before any of the work is begun, so a hook function can perform any database operations it likes. See ``test_beforeCommitHook()`` in ``transaction/tests/test_transaction.py`` for a tutorial doctest, and the ``ITransaction`` interface for details. Thanks to Florent Guillaume for contributing code and tests. - (3.4a3) Clarifications were made to transaction interfaces. Support for ZODB4 savepoint-aware data managers has been dropped ---------------------------------------------------------------- - (3.4a4) In adding savepoint support, we dropped the attempted support for ZODB4 data managers that support savepoints. We don't think that this will affect anyone. ZEO --- - (3.4a4) The ZODB and ZEO version numbers are now the same. Concretely:: import ZODB, ZEO assert ZODB.__version__ == ZEO.version no longer fails. If interested, see the README file for details about earlier version numbering schemes. - (3.4b1) ZConfig version 2.3 adds new socket address types, for smoother default behavior across platforms. The hostname portion of socket-binding-address defaults to an empty string, which acts like INADDR_ANY on Windows and Linux (bind to any interface). The hostname portion of socket-connection-address defaults to "127.0.0.1" (aka "localhost"). In config files, the types of ``zeo`` section keys ``address`` and ``monitor-address`` changed to socket-binding-address, and the type of the ``zeoclient`` section key ``server`` changed to socket-connection-address. - (3.4a4) The default logging setup in ``runzeo.py`` was broken. It was changed so that running ``runzeo.py`` from a command line now, and without using a config file, prints output to the console much as ZODB 3.2 did. ZEO on Windows -------------- Thanks to Mark Hammond for these ``runzeo.py`` enhancements on Windows: - (3.4b1) Collector 1788: Repair one of the new features below. - (3.4a4) A pid file (containing the process id as a decimal string) is created now for a ZEO server started via ``runzeo.py``. External programs can read the pid from this file and derive a "signal name" used in a new signal-emulation scheme for Windows. This is only necessary on Windows, but the pid file is created on all platforms that implement ``os.getpid()``, as long as the ``pid-filename`` option is set, or environment variable ``INSTANCE_HOME`` is defined. The ``pid-filename`` option can be set in a ZEO config file, or passed as the new ``--pid-file`` argument to ``runzeo.py``. - (3.4a4) If available, ``runzeo.py`` now uses Zope's new 'Signal' mechanism for Windows, to implement clean shutdown and log rotation handlers for Windows. Note that the Python in use on the ZEO server must also have the Python Win32 extensions installed for this to be useful. 
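To illustrate the savepoint support added in 3.4a4 (described above), here is a minimal sketch; ``root`` stands for any persistent mapping (such as a connection's root object), and ``transaction/savepoint.txt`` remains the authoritative description::

    import transaction

    txn = transaction.get()        # the current transaction
    root['counter'] = 1
    sp = txn.savepoint()           # checkpoint the work done so far
    root['counter'] = 99           # further changes ...
    sp.rollback()                  # ... can be discarded back to the savepoint
    assert root['counter'] == 1
    txn.commit()                   # commits the state as of the savepoint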
Tools ----- - (3.4a4) ``fsdump.py`` now displays the size (in bytes) of data records. This actually went in several months ago, but wasn't noted here at the time. Thanks to Dmitry Vasiliev for contributing code and tests. FileStorage ----------- - (3.4a9) The ``undoLog()`` and ``undoInfo()`` methods almost always returned a wrong number of results, one too many if ``last < 0`` (the default is such a case), or one too few if ``last >= 0``. These have been repaired, new tests were added, and these methods are now documented in ``ZODB.interfaces.IStorageUndoable``. - (3.4a2) A ``pdb.set_trace()`` call was mistakenly left in method ``FileStorage.modifiedInVersion()``. ZConfig ------- - (3.4b1) The "standalone" release of ZODB now includes ZConfig version 2.3. DemoStorage ----------- - (3.4a4) Appropriate implementations of the storage API's ``registerDB()`` and ``new_oid()`` methods were added, delegating to the base storage. This was needed to support wrapping a ZEO client storage as a ``DemoStorage`` base storage, as some new Zope tests want to do. BaseStorage ----------- - (3.4a4) ``new_oid()``'s undocumented ``last=`` argument was removed. It was used only for internal recursion, and injured code sanity elsewhere because not all storages included it in their ``new_oid()``'s signature. Straightening this out required adding ``last=`` everywhere, or removing it everywhere. Since recursion isn't actually needed, and there was no other use for ``last=``, removing it everywhere was the obvious choice. Tests ----- - (3.4a3) The various flavors of the ``check2ZODBThreads`` and ``check7ZODBThreads`` tests are much less likely to suffer sporadic failures now. - (3.4a2) The test ``checkOldStyleRoot`` failed in Zope3, because of an obscure dependence on the ``Persistence`` package (which Zope3 doesn't use). ZApplication ------------ - (3.4a8) The file ``ZApplication.py`` was moved, from ZODB to Zope(2). ZODB and Zope3 don't use it, but Zope2 does. - (3.4a7) The ``__call__`` method didn't work if a non-None ``connection`` string argument was passed. Thanks to Stefan Holek for noticing. What's new in ZODB3 3.4a1? ========================== Release date: 01-Apr-2005 transaction ----------- - ``get_transaction()`` is officially deprecated now, and will be removed in ZODB 3.6. Use the ``transaction`` package instead. For example, instead of::

    import ZODB
    ...
    get_transaction().commit()

do::

    import transaction
    ...
    transaction.commit()

DB -- - There is no longer a hard limit on the number of connections that ``DB.open()`` will create. In other words, ``DB.open()`` never blocks anymore waiting for an earlier connection to close, and ``DB.open()`` always returns a connection now (while it wasn't documented, it was possible for ``DB.open()`` to return ``None`` before). ``pool_size`` continues to default to 7, but its meaning has changed: if more than ``pool_size`` connections are obtained from ``DB.open()`` and not closed, a warning is logged; if more than twice ``pool_size``, a critical problem is logged. ``pool_size`` should be set to the maximum number of connections from the ``DB`` instance you expect to have open simultaneously. In addition, if a connection obtained from ``DB.open()`` becomes unreachable without having been explicitly closed, when Python's garbage collection reclaims that connection it no longer counts against the ``pool_size`` thresholds for logging messages. The following optional arguments to ``DB.open()`` are deprecated: ``transaction``, ``waitflag``, ``force`` and ``temporary``.
If one is specified, its value is ignored, and ``DeprecationWarning`` is raised. In ZODB 3.6, these optional arguments will be removed. - Lightweight support for "multi-databases" is implemented. These are collections of named DB objects and associated open Connections, such that the Connection for any DB in the collection can be obtained from a Connection from any other DB in the collection. See the new test file ZODB/tests/multidb.txt for a tutorial doctest. Thanks to Christian Theune for his work on this during the PyCon 2005 ZODB sprint. ZEO compatibility ----------------- There are severe restrictions on using ZEO servers and clients at or after ZODB 3.3 with ZEO servers and clients from ZODB versions before 3.3. See the reworked ``Compatibility`` section in ``README.txt`` for details. If possible, it will be easiest to move clients and servers to 3.3+ simultaneously. With care, it's possible to use a 3.3+ ZEO server with pre-3.3 ZEO clients, but not possible to use a pre-3.3 ZEO server with 3.3+ ZEO clients. BTrees ------ - A new family of BTree types, in the ``IFBTree`` module, map signed integers (32 bits) to C floats (also 32 bits). The intended use is to help construct search indices, where, e.g., integer word or document identifiers map to scores of some kind. This is easier than trying to work with scaled integer scores in an ``IIBTree``, and Zope3 has moved to ``IFBTrees`` for these purposes in its search code. FileStorage ----------- - Added a record iteration protocol to FileStorage. You can use the record iterator to iterate over all current revisions of data pickles in the storage. In order to support calling via ZEO, we don't implement this as an actual iterator. An example of using the record iterator protocol is as follows::

    storage = FileStorage('anexisting.fs')
    next_oid = None
    while True:
        oid, tid, data, next_oid = storage.record_iternext(next_oid)
        # do something with oid, tid and data
        if next_oid is None:
            break

The behavior of the iteration protocol is now to iterate over all current records in the database in ascending oid order, although this is not a promise to do so in the future. Tools ----- New tool fsoids.py, for heavy debugging of FileStorages; shows all uses of specified oids in the entire database (e.g., suppose oid 0x345620 is missing -- did it ever exist? if so, when? who referenced it? when was the last transaction that modified an object that referenced it? which objects did it reference? what kind of object was it?). ZODB/test/testfsoids.py is a tutorial doctest. fsIndex ------- Efficient, general implementations of ``minKey()`` and ``maxKey()`` methods were added. ``fsIndex`` is a special hybrid kind of BTree used to implement FileStorage indices. Thanks to Chris McDonough for code and tests. What's new in ZODB3 3.3.1? ========================== Release date: DD-MMM-2005 Tests ----- The various flavors of the ``check2ZODBThreads`` and ``check7ZODBThreads`` tests are much less likely to suffer sporadic failures now. What's new in ZODB3 3.3.1c1? ============================ Release date: 01-Apr-2005 BTrees ------ Collector #1734: BTrees conflict resolution leads to index inconsistencies. Silent data loss could occur due to BTree conflict resolution when one transaction T1 added a new key to a BTree containing at least three buckets, and a concurrent transaction T2 deleted all keys in the bucket to which the new key was added.
Conflict resolution then created a bucket containing the newly added key, but the bucket remained isolated, disconnected from the BTree. In other words, the committed BTree didn't contain the new key added by T1. Conflict resolution doesn't have enough information to repair this, so ``ConflictError`` is now raised in such cases. ZEO --- Repaired subtle race conditions in establishing ZEO connections, both client- and server-side. These account for intermittent cases where ZEO failed to make a connection (or reconnection), accompanied by a log message showing an error caught in ``asyncore`` and having a traceback ending with: ``UnpicklingError: invalid load key, 'Z'.`` or: ``ZRPCError: bad handshake '(K\x00K\x00U\x0fgetAuthProtocol)t.'`` or: ``error: (9, 'Bad file descriptor')`` or an ``AttributeError``. These were exacerbated when running the test suite, because of an unintended busy loop in the test scaffolding, which could starve the thread trying to make a connection. The ZEO reconnection tests may run much faster now, depending on platform, and should suffer far fewer (if any) intermittent "timed out waiting for storage to connect" failures. ZEO protocol and compatibility ------------------------------ ZODB 3.3 introduced multiversion concurrency control (MVCC), which required changes to the ZEO protocol. The first 3.3 release should have increased the internal ZEO protocol version number (used by ZEO protocol negotiation when a client connects), but neglected to. This has been repaired. Compatibility between pre-3.3 and post-3.3 ZEO clients and servers remains very limited. See the newly updated ``Compatibility`` section in ``README.txt`` for details. FileStorage ----------- - The ``.store()`` and ``.restore()`` methods didn't update the storage's belief about the largest oid in use when passed an oid larger than the largest oid the storage already knew about. Because ``.restore()`` in particular is used by ``copyTransactionsFrom()``, and by the first stage of ZRS recovery, a large database could be created that believed the only oid in use was oid 0 (the special oid reserved for the root object). In rare cases, it could go on from there assigning duplicate oids to new objects, starting over from oid 1 again. This has been repaired. A new ``set_max_oid()`` method was added to the ``BaseStorage`` class so that derived storages can update the largest oid in use in a threadsafe way. - A FileStorage's index file tried to maintain the index's largest oid as a separate piece of data, incrementally updated over the storage's lifetime. This scheme was more complicated than necessary, so was also more brittle and slower than necessary. It indirectly participated in a rare but critical bug: when a FileStorage was created via ``copyTransactionsFrom()``, the "maximum oid" saved in the index file was always 0. Use that FileStorage, and it could then create "new" oids starting over at 0 again, despite that those oids were already in use by old objects in the database. Packing a FileStorage has no reason to try to update the maximum oid in the index file either, so this kind of damage could (and did) persist even across packing. The index file's maximum-oid data is ignored now, but is still written out so that ``.index`` files can be read by older versions of ZODB. Finding the true maximum oid is done now by exploiting that the main index is really a kind of BTree (long ago, this wasn't true), and finding the largest key in a BTree is inexpensive. 
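To see why finding the largest key is cheap, here is a small sketch using an ``OOBTree`` as a stand-in for the index (real ``fsIndex`` keys are 8-byte oid strings; the file positions below are invented)::

    from BTrees.OOBTree import OOBTree

    index = OOBTree()
    index['\x00' * 7 + '\x01'] = 4      # oid 1 -> file position (made up)
    index['\x00' * 7 + '\x07'] = 812    # oid 7 -> file position (made up)

    largest_oid = index.maxKey()        # walks the rightmost edge, O(log n)
    smallest_oid = index.minKey()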
- A FileStorage's index file could be updated on disk even if the storage was opened in read-only mode. That bug has been repaired. - An efficient ``maxKey()`` implementation was added to class ``fsIndex``. Pickle (in-memory Connection) Cache ----------------------------------- You probably never saw this exception: ``ValueError: Can not re-register object under a different oid`` It's been changed to say what it meant: ``ValueError: A different object already has the same oid`` This happens if an attempt is made to add distinct objects to the cache that have the same oid (object identifier). ZODB should never do this, but it's possible for application code to force such an attempt. PersistentMapping and PersistentList ------------------------------------ Backward compatibility code has been added so that the sanest of the ZODB 3.2 dotted paths for ``PersistentMapping`` and ``PersistentList`` resolve. These are still preferred: - ``from persistent.list import PersistentList`` - ``from persistent.mapping import PersistentMapping`` but these work again too: - ``from ZODB.PersistentList import PersistentList`` - ``from ZODB.PersistentMapping import PersistentMapping`` BTrees ------ The BTrees interface file neglected to document the optional ``excludemin`` and ``excludemax`` arguments to the ``keys()``, ``values()`` and ``items()`` methods. Appropriate changes were merged in from the ZODB4 BTrees interface file. Tools ----- - ``mkzeoinst.py``'s default port number changed from 9999 to 8100, to match the example in Zope's ``zope.conf``. fsIndex ------- An efficient ``maxKey()`` method was implemented for the ``fsIndex`` class. This makes it possible to determine the largest oid in a ``FileStorage`` index efficiently, directly, and reliably, replacing a more delicate scheme that tried to keep track of this by saving an oid high water mark in the index file and incrementally updating it. What's new in ZODB3 3.3.1a1? ============================ Release date: 11-Jan-2005 ZEO client cache ---------------- - Collector 1536: The ``cache-size`` configuration option for ZEO clients was being ignored. Worse, the client cache size was only one megabyte, much smaller than the advertised default of 20MB. Note that the default is carried over from a time when gigabyte disks were expensive and rare; 20MB is also too small on most modern machines. - Fixed a nasty bug in cache verification. A persistent ZEO cache uses a disk file, and, when active, has some in-memory data structures too to speed operation. Invalidations processed as part of startup cache verification were reflected in the in-memory data structures, but not correctly in the disk file. So if an object revision was invalidated as part of verification, the object wasn't loaded again before the connection was closed, and the object revision remained in the cache file until the connection was closed, then the next time the cache file was opened it could believe that the stale object revision in the file was actually current. - Fixed a bug wherein an object removed from the client cache didn't properly mark the file slice it occupied as being available for reuse. ZEO --- Collector 1503: excessive logging. It was possible for a ZEO client to log "waiting for cache verification to finish" messages at a very high rate, producing gigabytes of such messages in short order. ``ClientStorage._wait_sync()`` was changed to log no more than one such message per 5 minutes.
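To make the now-documented ``excludemin``/``excludemax`` arguments mentioned above concrete, a small sketch (the ``IIBTree`` contents are invented)::

    from BTrees.IIBTree import IIBTree

    t = IIBTree()
    t.update({1: 10, 2: 20, 3: 30, 4: 40})

    list(t.keys(min=1, max=3))                     # [1, 2, 3]
    list(t.keys(min=1, max=3, excludemax=True))    # [1, 2]
    list(t.keys(min=1, max=3, excludemin=True, excludemax=True))    # [2]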
persistent ---------- Collector #1350: ZODB has a default one-thread-per-connection model, and two threads should never do operations on a single connection simultaneously. However, ZODB can't detect violations, and this happened in an early stage of Zope 2.8 development. The low-level ``ghostify()`` and ``unghostify()`` routines in ``cPerisistence.c`` were changed to give some help in detecting this when it happens. In a debug build, both abort the process if thread interference is detected. This is extreme, but impossible to overlook. In a release build, ``unghostify()`` raises ``SystemError`` if thread damage is detected; ``ghostify()`` ignores the problem in a release build (``ghostify()`` is supposed to be so simple that it "can't fail"). ConflictError ------------- New in 3.3, a ``ConflictError`` exception may attempt to insert the path to the object's class in its message. However, a ZEO server may not have access to application class implementations, and then the attempt by the server to raise ``ConflictError`` could raise ``ImportError`` instead while trying to determine the object's class path. This was confusing. The code has been changed to obtain the class path from the object's pickle, without trying to import application modules or classes. FileStorage ----------- Collector 1581: When an attempt to pack a corrupted ``Data.fs`` file was made, it was possible for the pack routine to die with a reference to an undefined global while it was trying to raise ``CorruptedError``. It raises ``CorruptedError``, as it always intended, in these cases now. Install ------- The C header file ``ring.h`` is now installed. Tools ----- - ``BTrees.check.display()`` now displays the oids (if any) of the BTree's or TreeSet's constituent objects. What's new in ZODB3 3.3? ======================== Release date: 06-Oct-2004 ZEO --- The encoding of RPC calls between server and client was being done with protocol 0 ("text mode") pickles, which could require sending four times as many bytes as necessary. Protocol 1 pickles are used now. Thanks to Andreas Jung for the diagnosis and cure. ZODB/component.xml ------------------ ``cache-size`` parameters were changed from type ``integer`` to type ``byte-size``. This allows you to specify, for example, "``cache-size 20MB``" to get a 20 megabyte cache. transaction ----------- The deprecation warning for ``Transaction.begin()`` was changed to point to the caller, instead of to ``Transaction.begin()`` itself. Connection ---------- Restored Connection's private ``_opened`` attribute. This was still referenced by ``DB.connectionDebugInfo()``, and Zope 2 calls the latter. FileStorage ----------- Collector #1517: History tab for ZPT does not work. ``FileStorage.history()`` was reading the user, description, and extension fields out of the object pickle, due to starting the read at a wrong location. Looked like cut-and-paste repetition of the same bug in ``FileStorage.FileIterator`` noted in the news for 3.3c1. What's new in ZODB3 3.3 release candidate 1? ============================================ Release date: 14-Sep-2004 Connection ---------- ZODB intends to raise ``ConnnectionStateError`` if an attempt is made to close a connection while modifications are pending (the connection is involved in a transaction that hasn't been ``abort()``'ed or ``commit()``'ed). It was missing the case where the only pending modifications were made in subtransactions. This has been fixed. 
If an attempt to close a connection with pending subtransactions is made now::

    ConnnectionStateError: Cannot close a connection with a pending subtransaction

is raised. transaction ----------- - Transactions have new, backward-incompatible behavior in one respect: if a ``Transaction.commit()``, ``Transaction.commit(False)``, or ``Transaction.commit(True)`` raised an exception, prior behavior was that the transaction effectively aborted, and a new transaction began. A primary bad consequence was that, if in a sequence of subtransaction commits, one of the commits failed but the exception was suppressed, all changes up to and including the failing commit were lost, but later subtransaction commits in the sequence got no indication that something had gone wrong, nor did the final (top level) commit. This could easily lead to inconsistent data being committed, from the application's point of view. The new behavior is that a failing commit "sticks" until explicitly cleared. Now if an exception is raised by a ``commit()`` call (whether subtransaction or top level) on a Transaction object ``T``: - Pending changes are aborted, exactly as they were for a failing commit before. - But ``T`` remains the current transaction object (if ``tm`` is ``T``'s transaction manager, ``tm.get()`` continues to return ``T``). - All subsequent attempts to do ``T.commit()``, ``T.join()``, or ``T.register()`` raise the new ``TransactionFailedError`` exception. Note that if you try to modify a persistent object, that object's resource manager (usually a ``Connection`` object) will attempt to ``join()`` the failed transaction, and ``TransactionFailedError`` will be raised right away. So after a transaction or subtransaction commit fails, that must be explicitly cleared now, either by invoking ``abort()`` on the transaction object, or by invoking ``begin()`` on its transaction manager. - Some explanations of new transaction features in the 3.3a3 news were incorrect, and this news file has been retroactively edited to repair that. See news for 3.3a3 below. - If ReadConflictError was raised by an attempt to load an object with a ``_p_independent()`` method that returned false, attempting to commit the transaction failed to (re)raise ReadConflictError for that object. Note that ZODB intends to prevent committing a transaction in which a ReadConflictError occurred; this was an obscure case it missed. - Growing pains: ZODB 3.2 had a bug wherein ``Transaction.begin()`` didn't abort the current transaction if the only pending changes were in a subtransaction. In ZODB 3.3, it's intended that a transaction manager be used to effect ``begin()`` (instead of invoking ``Transaction.begin()``), and calling ``begin()`` on a transaction manager didn't have this old bug. However, ``Transaction.begin()`` still exists in 3.3, and it had a worse bug: it never aborted the transaction (not even if changes were pending outside of subtransactions). ``Transaction.begin()`` has been changed to abort the transaction. ``Transaction.begin()`` is also deprecated. Don't use it. Use ``begin()`` on the relevant transaction manager instead. For example,

>>> import transaction
>>> txn = transaction.begin()  # start a txn using the default TM

if using the default ``ThreadTransactionManager`` (see news for 3.3a3 below). In 3.3, it's intended that a single ``Transaction`` object is used for exactly one transaction.
So, unlike as in 3.2, when sometimes ``Transaction`` objects were reused across transactions, but sometimes weren't, when you do ``Transaction.begin()`` in 3.3 a brand new transaction object is created. That's why this use is deprecated. Code of the form:

>>> txn = transaction.get()
>>> ...
>>> txn.begin()
>>> ...
>>> txn.commit()

can't work as intended in 3.3, because ``txn`` is no longer the current ``Transaction`` object the instant ``txn.begin()`` returns. BTrees ------ The BTrees __init__.py file is now just a comment. It had been trying to set up support for (long gone) "int sets", and to import an old version of Zope's Interface package, which doesn't even ship with ZODB. The latter in particular created problems, at least clashing with PythonCAD's Interface package. POSException ------------ Collector #1488 (TemporaryStorage -- going backward in time). This confusion was really because the detail on a ConflictError exception didn't make sense. It called the current revision "was", and the old revision "now". The detail is much more informative now. For example, if the exception said::

    ConflictError: database conflict error (oid 0xcb22, serial was 0x03441422948b4399, now 0x034414228c3728d5)

before, it now says::

    ConflictError: database conflict error (oid 0xcb22, serial this txn started with 0x034414228c3728d5 2002-04-14 20:50:32.863000, serial currently committed 0x03441422948b4399 2002-04-14 20:50:34.815000)

ConflictError ------------- The undocumented ``get_old_serial()`` and ``get_new_serial()`` methods were swapped (the first returned the new serial, and the second returned the old serial). Tools ----- ``FileStorage.FileIterator`` was confused about how to read a transaction's user and description fields, which caused several tools to display binary gibberish for these values. ``ZODB.utils.oid_repr()`` changed to add a leading "0x", and to strip leading zeroes. This is used, e.g., in the detail of a ``POSKeyError`` exception, to identify the missing oid. Before, the output was ambiguous. For example, oid 17 was displayed as 0000000000000011. As a Python integer, that's octal 9. Or was it meant to be decimal 11? Or was it meant to be hex? Now it displays as 0x11. fsrefs.py: When run with ``-v``, produced tracebacks for objects whose creation was merely undone. This was confusing. Tracebacks are now produced only if there's "a real" problem loading an oid. If the current revision of object O refers to an object P whose creation has been undone, this is now identified as a distinct case. Captured and ignored most attempts to stop it via Ctrl+C. Repaired. Now makes two passes, so that an accurate report can be given of all invalid references. ``analyze.py`` produced spurious "len of unsized object" messages when finding a data record for an object uncreation or version abort. These no longer appear. ``fsdump.py``'s ``get_pickle_metadata()`` function (which is used by several tools) was confused about what to do when the ZODB pickle started with a pickle ``GLOBAL`` opcode. It actually loaded the class then, which it intends never to do, leading to stray messages on stdout when the class wasn't available, and leading to a strange return value even when it was available (the repr of the type object was returned as "the module name", and an empty string was returned as "the class name"). This has been repaired.
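A tiny illustration of the ``oid_repr()`` change noted above, using the ``p64()`` helper from ``ZODB.utils`` to pack an integer into an 8-byte oid::

    from ZODB.utils import p64, oid_repr

    print oid_repr(p64(17))    # prints '0x11' instead of the old ambiguous '0000000000000011'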
What's new in ZODB3 3.3 beta 2 ============================== Release date: 13-Aug-2004 Transaction Managers -------------------- Zope3-dev Collector #139: Memory leak involving buckets and connections The transaction manager internals effectively made every Connection object immortal, except for those explicitly closed. Since typical practice is not to close connections explicitly (and closing a DB happens not to close the connections to it -- although that may change), this caused massive memory leaks when many connections were opened. The transaction manager internals were reworked to use weak references instead, so that connection memory (and other registered synch objects) now get cleaned up when nothing other than the transaction manager knows about them. Storages -------- Collector #1327: FileStorage init confused by time travel If the system clock "went backwards" a long time between the times a FileStorage was closed and reopened, new transaction ids could be smaller than transaction ids already in the storage, violating a key invariant. Now transaction ids are guaranteed to be increasing even when this happens. If time appears to have run backwards at all when a FileStorage is opened, a new message saying so is logged at warning level; if time appears to have run backwards at least 30 minutes, the message is logged at critical level (and you should investigate to find and repair the true cause). Tools ----- repozo.py: Thanks to a suggestion from Toby Dickenson, backups (whether incremental or full) are first written to a temp file now, which is fsync'ed at the end, and only after that succeeds is the file renamed to YYYY-MM-DD-HH-MM-SS.ext form. In case of a system crash during a repozo backup, this at least makes it much less likely that a backup file with incomplete or incorrect data will be left behind. fsrefs.py: Fleshed out the module docstring, and repaired a bug wherein spurious error msgs could be produced after reporting a problem with an unloadable object. Test suite ---------- Collector #1397: testTimeStamp fails on FreeBSD The BSD distributions are unique in that their mktime() implementation usually ignores the input tm_isdst value. Test checkFullTimeStamp() was sensitive to this platform quirk. Reworked the way some of the ZEO tests use threads, so that unittest is more likely to notice the real cause of a failure (which usually occurs in a thread), and less likely to latch on to spurious problems resulting from the real failure. What's new in ZODB3 3.3 beta 1 ============================== Release date: 07-Jun-2004 3.3b1 is the first ZODB release built using the new zpkg tools: http://zope.org/Members/fdrake/zpkgtools/ This appears to have worked very well. The structure of the tarball release differs from previous releases because of it, and the set of installed files includes some that were not installed in previous releases. That shouldn't create problems, so let us know if it does! We'll fine-tune this for the next release. BTrees ------ Fixed bug indexing BTreeItems objects with negative indexes. This caused reverse iteration to return each item twice. Thanks to Casey Duncan for the fix. ZODB ---- Methods removed from the database (ZODB.DB.DB) class: cacheStatistics(), cacheMeanAge(), cacheMeanDeac(), and cacheMeanDeal(). These were undocumented, untested, and unused. The first always returned an empty tuple, and the rest always returned None. 
When trying to do recovery to a time earlier than that of the most recent full backup, repozo.py failed to find the appropriate files, erroneously claiming "No files in repository before ". This has been repaired. Collector #1330: repozo.py -R can create corrupt .fs. When looking for the backup files needed to recreate a Data.fs file, repozo could (unintentionally) include its meta .dat files in the list, or random files of any kind created by the user in the backup directory. These would then get copied verbatim into the reconstructed file, filling parts with junk. Repaired by filtering the file list to include only files with the data extensions repozo.py creates (.fs, .fsz, .deltafs, and .deltafsz). Thanks to James Henderson for the diagnosis. fsrecover.py couldn't work, because it referenced attributes that no longer existed after the MVCC changes. Repaired that, and added new tests to ensure it continues working. Collector #1309: The reference counts reported by DB.cacheExtremeDetails() for ghosts were one too small. Thanks to Dieter Maurer for the diagnosis. Collector #1208: Infinite loop in cPickleCache. If a persistent object had a __del__ method (probably not a good idea regardless, but we don't prevent it) that referenced an attribute of self, the code to deactivate objects in the cache could get into an infinite loop: ghostifying the object could lead to calling its __del__ method, the latter would load the object into cache again to satisfy the attribute reference, the cache would again decide that the object should be ghostified, and so on. The infinite loop no longer occurs, but note that objects of this kind still aren't sensible (they're effectively immortal). Thanks to Toby Dickenson for suggesting a nice cure. What's new in ZODB3 3.3 alpha 3 =============================== Release date: 16-Apr-2004 transaction ----------- There is a new transaction package, which provides new interfaces for application code and for the interaction between transactions and resource managers. The top-level transaction package has functions ``commit()``, ``abort()``, ``get()``, and ``begin()``. They should be used instead of the magic ``get_transaction()`` builtin, which will be deprecated. For example:

>>> get_transaction().commit()

should now be written as

>>> import transaction
>>> transaction.commit()

The new API provides explicit transaction manager objects. A transaction manager (TM) is responsible for associating resource managers with a "current" transaction. The default TM, implemented by class ``ThreadedTransactionManager``, assigns each thread its own current transaction. This default TM is available as ``transaction.manager``. The ``TransactionManager`` class assigns all threads to the same transaction, and is an explicit replacement for the ``Connection.setLocalTransaction()`` method: A transaction manager instance can be passed as the transaction_manager argument to ``DB.open()``. If you do, the connection will use the specified transaction manager instead of the default TM. The current transaction is obtained by calling ``get()`` on a TM. For example:

>>> tm = transaction.TransactionManager()
>>> cn = db.open(transaction_manager=tm)
[...]
>>> tm.get().commit()

The ``setLocalTransaction()`` and ``getTransaction()`` methods of Connection are deprecated. Use an explicit TM passed via ``transaction_manager=`` to ``DB.open()`` instead. The ``setLocalTransaction()`` method still works, but it returns a TM instead of a Transaction.
A TM creates Transaction objects, which are used for exactly one transaction. Transaction objects still have ``commit()``, ``abort()``, ``note()``, ``setUser()``, and ``setExtendedInfo()`` methods. Resource managers, e.g. Connection or RDB adapter, should use a Transaction's ``join()`` method instead of its ``register()`` method. An object that calls ``join()`` manages its own resources. An object that calls ``register()`` expects the TM to manage the objects. Data managers written against the ZODB 4 transaction API are now supported in ZODB 3. persistent ---------- A database can now contain persistent weak references. An object that is only reachable from persistent weak references will be removed by pack(). The persistence API now distinguishes between deactivation and invalidation. This change is intended to support objects that can't be ghosts, like persistent classes. Deactivation occurs when a user calls _p_deactivate() or when the cache evicts objects because it is full. Invalidation occurs when a transaction updates the object. An object that can't be a ghost must load new state when it is invalidated, but can ignore deactivation. Persistent objects can implement a __getnewargs__() method that will be used to provide arguments that should be passed to __new__() when instances (including ghosts) are created. An object that implements __getnewargs__() must be loaded from storage even to create a ghost. There is new support for writing hooks like __getattr__ and __getattribute__. The new hooks require that user code call special persistence methods like _p_getattr() inside their hook. See the ZODB programming guide for details. The format of serialized persistent references has changed; that is, the on-disk format for references has changed. The old format is still supported, but earlier versions of ZODB will not be able to read the new format. ZODB ---- Closing a ZODB Connection while it is registered with a transaction, e.g. has pending modifications, will raise a ConnnectionStateError. Trying to load objects from or store objects to a closed connection will also raise a ConnnectionStateError. ZODB connections are synchronized on commit, even when they didn't modify objects. This feature assumes that the thread that opened the connection is also the thread that uses it. If not, this feature will cause problems. It can be disabled by passing synch=False to open(). New broken object support. New add() method on Connection. User code should not assign the _p_jar attribute of a new persistent object directly; a deprecation warning is issued in this case. Added a get() method to Connection as a preferred synonym for __getitem__(). Several methods and/or specific optional arguments of methods have been deprecated. The cache_deactivate_after argument used by DB() and Connection() is deprecated. The DB methods getCacheDeactivateAfter(), getVersionCacheDeactivateAfter(), setCacheDeactivateAfter(), and setVersionCacheDeactivateAfter() are also deprecated. The old-style undo() method was removed from the storage API, and transactionalUndo() was renamed to undo(). The BDBStorages are no longer distributed with ZODB. Fixed a serious bug in the new pack implementation. If pack was called on the storage and passed a time earlier than a previous pack time, data could be lost. In other words, if there are any two pack calls, where the time argument passed to the second call was earlier than the first call, data loss could occur. 
The bug was fixed by causing the second call to raise a StorageError before performing any work. Fixed a rare bug in pack: if a pack started during a small window of time near the end of a concurrent transaction's commit, it was possible for the pack attempt to raise a spurious CorruptedError: ... transaction with checkpoint flag set exception. This did no damage to the database, or to the transaction in progress, but no pack was performed then. By popular demand, FileStorage.pack() no longer propagates a FileStorageError: The database has already been packed to a later time or no changes have been made since the last pack exception. Instead that message is logged (at INFO level), and the pack attempt simply returns then (no pack is performed). ZEO --- Fixed a bug that prevented the -m / --monitor argument from working. zdaemon ------- Added a -m / --mask option that controls the umask of the subprocess. zLOG ---- The zLOG backend has been removed. zLOG is now just a facade over the standard Python logging package. Environment variables like STUPID_LOG_FILE are no longer honored. To configure logging, you need to follow the directions in the logging package documentation. The process is currently more complicated than configured zLOG. See test.py for an example. ZConfig ------- This release of ZODB contains ZConfig 2.1. More documentation has been written. Make sure keys specified as attributes of the element are converted by the appropriate key type, and are re-checked for derived sections. Refactored the ZConfig.components.logger schema components so that a schema can import just one of the "eventlog" or "logger" sections if desired. This can be helpful to avoid naming conflicts. Added a reopen() method to the logger factories. Always use an absolute pathname when opening a FileHandler. Miscellaneous ------------- The layout of the ZODB source release has changed. All the source code is contained in a src subdirectory. The primary motivation for this change was to avoid confusion caused by installing ZODB and then testing it interactively from the source directory; the interpreter would find the uncompiled ZODB package in the source directory and report an import error. A reference-counting bug was fixed, in the logic calling a modified persistent object's data manager's register() method. The primary symptom was rare assertion failures in Python's cyclic garbage collection. The Connection class's onCommitAction() method was removed. Some of the doc strings in ZODB are now written for processing by epydoc. Several new test suites were written using doctest instead of the standard unittest TestCase framework. MappingStorage now implements getTid(). ThreadedAsync: Provide a way to shutdown the servers using an exit status. The mkzeoinstance script looks for a ZODB installation, not a Zope installation. The received wisdom is that running a ZEO server without access to the appserver code avoids many mysterious problems. What's new in ZODB3 3.3 alpha 2 =============================== Release date: 06-Jan-2004 This release contains a major overhaul of the persistence machinery, including some user-visible changes. The Persistent base class is now a new-style class instead of an ExtensionClass. The change enables the use of features like properties with persistent object classes. The Persistent base class is now contained in the persistent package. The Persistence package is included for backwards compatibility. 
The Persistence package is used by Zope to provide special ExtensionClass-compatibility features like a non-C3 MRO and an __of__ method. ExtensionClass is not included with this release of ZODB3. If you use the Persistence package, it will print a warning and import Persistent from persistent. In short, the new persistent package is recommended for non-Zope applications. The following dotted class names are now preferred over earlier names: - persistent.Persistent - persistent.list.PersistentList - persistent.mapping.PersistentMapping - persistent.TimeStamp The in-memory, per-connection object cache (pickle cache) was changed to participate in garbage collection. This should reduce the number of memory leaks, although we are still tracking a few problems. Multi-version concurrency control --------------------------------- ZODB now supports multi-version concurrency control (MVCC) for storages that support multiple revisions. FileStorage and BDBFullStorage both support MVCC. In short, MVCC means that read conflicts should almost never occur. When an object is modified in one transaction, other concurrent transactions read old revisions of the object to preserve consistency. In earlier versions of ZODB, any access of the modified object would raise a ReadConflictError. The ZODB internals changed significantly to accommodate MVCC. There are relatively few user visible changes, aside from the lack of read conflicts. It is possible to disable the MVCC feature using the mvcc keyword argument to the DB open() method, ex.: db.open(mvcc=False). ZEO --- Changed the ZEO server and control process to work with a single configuration file; this is now the default way to configure these processes. (It's still possible to use separate configuration files.) The ZEO configuration file can now include a "runner" section used by the control process and ignored by the ZEO server process itself. If present, the control process can use the same configuration file. Fixed a performance problem in the logging code for the ZEO protocol. The logging code could call repr() on arbitrarily long lists, even though it only logged the first 60 bytes; worse, it called repr() even if logging was currently disabled. Fixed to call repr() on individual elements until the limit is reached. Fixed a bug in zrpc (when using authentication) where the MAC header wasn't being read for large messages, generating errors while unpickling commands sent over the wire. Also fixed the zeopasswd.py script, added testcases and provided a more complete commandline interface. Fixed a misuse of the _map variable in zrpc Connection objects, which are also asyncore.dispatcher objects. This allows ZEO to work with CVS Python (2.4). _map is used to indicate whether the dispatcher uses the default socket_map or a custom socket_map. A recent change to asyncore caused it to use _map in its add_channel() and del_channel() methods, which is presumably a bug fix (may get ported to 2.3). That causes our dubious use of _map to be a problem, because we also put the Connections in the global socket_map. The new asyncore won't remove it from the global socket map, because it has a custom _map. The prefix used for log messages from runzeo.py was changed from RUNSVR to RUNZEO. Miscellaneous ------------- ReadConflictError objects now have an ignore() method. Normally, a transaction that causes a read conflict can't be committed. If the exception is caught and its ignore() method called, the transaction can be committed.
Application code may need this in advanced applications. What's new in ZODB3 3.3 alpha 1 =============================== Release date: 17-Jul-2003 New features of Persistence --------------------------- The Persistent base class is a regular Python type implemented in C. It should be possible to create new-style classes that inherit from Persistent, and, thus, use all the new Python features introduced in Python 2.2 and 2.3. The __changed__() method on Persistent objects is no longer supported. New features in BTrees ---------------------- BTree, Bucket, TreeSet and Set objects are now iterable objects, playing nicely with the iteration protocol introduced in Python 2.2, and can be used in any context that accepts an iterable object. As for Python dicts, the iterator constructed for BTrees and Buckets iterates over the keys. >>> from BTrees.OOBTree import OOBTree >>> b = OOBTree({"one": 1, "two": 2, "three": 3, "four": 4}) >>> for key in b: # iterates over the keys ... print key four one three two >>> list(enumerate(b)) [(0, 'four'), (1, 'one'), (2, 'three'), (3, 'two')] >>> i = iter(b) >>> i.next() 'four' >>> i.next() 'one' >>> i.next() 'three' >>> i.next() 'two' >>> As for Python dicts in 2.2, BTree and Bucket objects have new .iterkeys(), .iteritems(), and .itervalues() methods. TreeSet and Set objects have a new .iterkeys() method. Unlike as for Python dicts, these new methods accept optional min and max arguments to effect range searches. While Bucket.keys() produces a list, Bucket.iterkeys() produces an iterator, and similarly for Bucket values() versus itervalues(), Bucket items() versus iteritems(), and Set keys() versus iterkeys(). The iter{keys,values,items} methods of BTrees and the iterkeys() method of Treesets also produce iterators, while their keys() (etc) methods continue to produce BTreeItems objects (a form of "lazy" iterator that predates Python 2.2's iteration protocol). >>> sum(b.itervalues()) 10 >>> zip(b.itervalues(), b.iterkeys()) [(4, 'four'), (1, 'one'), (3, 'three'), (2, 'two')] >>> BTree, Bucket, TreeSet and Set objects also implement the __contains__ method new in Python 2.2, which means that testing for key membership can be done directly now via the "in" and "not in" operators: >>> "won" in b False >>> "won" not in b True >>> "one" in b True >>> All old and new range-search methods now accept keyword arguments, and new optional excludemin and excludemax keyword arguments. The new keyword arguments allow doing a range search that's exclusive at one or both ends (doesn't include min, and/or doesn't include max). >>> list(b.keys()) ['four', 'one', 'three', 'two'] >>> list(b.keys(max='three')) ['four', 'one', 'three'] >>> list(b.keys(max='three', excludemax=True)) ['four', 'one'] >>> Other improvements ------------------ The exceptions generated by write conflicts now contain the name of the conflicted object's class. This feature requires support for the storage. All the standard storages support it. What's new in ZODB3 3.2 ======================== Release date: 08-Oct-2003 Nothing has changed since release candidate 1. What's new in ZODB3 3.2 release candidate 1 =========================================== Release date: 01-Oct-2003 Added a summary to the Doc directory. There are several new documents in the 3.2 release, including "Using zdctl and zdrun to manage server processes" and "Running a ZEO Server HOWTO." Fixed ZEO's protocol negotiation mechanism so that a client ZODB 3.1 can talk to a ZODB 3.2 server. Fixed a memory leak in the ZEO server. 
The server was leaking a few KB of memory per connection. Fixed a memory leak in the ZODB object cache (cPickleCache). The cache did not release two references to its Connection, causing a large cycle of objects to leak when a database was closed. Fixed a bug in the ZEO code that caused it to leak socket objects on Windows. Specifically, fix the trigger mechanism so that both sockets created for a trigger are closed. Fixed a bug in the ZEO storage server that caused it to leave temp files behind. The CommitLog class contains a temp file, but it was not closing the file. Changed the order of setuid() and setgid() calls in zdrun, so that setgid() is called first. Added a timeout to the ZEO test suite that prevents hangs. The test suite creates ZEO servers with randomly assigned ports. If the port happens to be in use, the test suite would hang because the ZEO client would never stop trying to connect. The fix will cause the test to fail after a minute, but should prevent the test runner from hanging. The logging package was updated to include the latest version of the logging package from Python CVS. Note that this package is only installed for Python 2.2. In later versions of Python, it is available in the Python standard library. The ZEO1 directory was removed from the source distribution. ZEO1 is not supported, and we never intended to include it in the release. What's new in ZODB3 3.2 beta 3 ============================== Release date: 23-Sep-2003 Note: The changes listed for this release include changes also made in ZODB 3.1.x releases and ported to the 3.2 release. This version of ZODB 3.2 is not compatible with Python 2.1. Early versions were explicitly designed to be compatible with Zope 2.6. That plan has been dropped, because Zope 2.7 is already in beta release. Several of the classes in ZEO and ZODB now inherit from object, making them new-style classes. The primary motivation for the change was to make it easier to debug memory leaks. We don't expect any behavior to change as a result. A new feature to allow removal of connection pools for versions was ported from Zope 2.6. This feature is needed by Zope to avoid denial of service attacks that allow a client to create an arbitrary number of version pools. Fixed several critical ZEO bugs. - If several client transactions were blocked waiting for the storage and one of the blocked clients disconnected, the server would attempt to restart one of the other waiting clients. Since the disconnected client did not have the storage lock, this could lead to deadlock. It could also cause the assertion "self._client is None" to fail. - If a storage server fails or times out between the vote and the finish, the ZEO cache could get populated with objects that didn't make it to the storage server. - If a client loses its connection to the server near the end of a transaction, it is now guaranteed to get a ClientDisconnected error even if it reconnects before the transaction finishes. This is necessary because the server will always abort the transaction. In some cases, the client would never see an error for the aborted transaction. - In tpc_finish(), reordered the calls so that the server's tpc_finish() is called (and must succeed) before we update the ZEO client cache. - The storage name is now prepended to the sort key, to ensure a unique global sort order if storages are named uniquely. This can prevent deadlock in some unusual cases. Fixed several serious flaws in the implementation of the ZEO authentication protocol. 
- The smac layer would accept a message without a MAC even after the session key was established. - The client never initialized its session key, so it never checked incoming messages or created MACs for outgoing messages. - The smac layer used a single HMAC instance for sending and receiving messages. This approach could only work if client and server were guaranteed to process all messages in the same total order, which could only happen in simple scenarios like unit tests. Fixed a bug in ExtensionClass when comparing ExtensionClass instances. The code could raise RuntimeWarning under Python 2.3, and produce incorrect results on 64-bit platforms. Fixed bug in BDBStorage that could lead to DBRunRecoveryErrors when a transaction was aborted after performing operations like commit version or undo that create new references to existing pickles. Fixed a bug in Connection.py that caused it to fail with an AttributeError if close() was called after the database was closed. The test suite leaves fewer log files behind, although it still leaves a lot of junk. The test.py script puts each test's temp files in a separate directory, so it is easier to see which tests are causing problems. Unfortunately, it is still too tedious to figure out why the identified tests are leaving files behind. This release contains the latest and greatest version of the BDBStorage. This storage has still not seen testing in a production environment, but it represents the current best design and most recent code culled from various branches where development has occurred. The Tools directory contains a number of small improvements, a few new tools, and README.txt that catalogs the tools. Many of the tools are installed by setup.py; those scripts will now have a #! line set automatically on Unix. Fixed bugs in Tools/repozo.py, including a timing-dependent one that could cause the following invocation of repozo to do a full backup when an incremental backup would have sufficed. A pair of new scripts from Jim Fulton can be used to synthesize workloads and measure ZEO performance: see zodbload.py and zeoserverlog.py in the Tools directory. Note that these require Zope. Tools/checkbtrees.py was strengthened in two ways: - In addition to running the _check() method on each BTree B found, BTrees.check.check(B) is also run. The check() function was written after checkbtrees.py, and identifies kinds of damage B._check() cannot find. - Cycles in the object graph no longer lead to unbounded output. Note that preventing this requires remembering the oid of each persistent object found, which increases the memory needed by the script. What's new in ZODB3 3.2 beta 2 ============================== Release date: 16-Jun-2003 Fixed critical race conditions in ZEO's cache consistency code that could cause invalidations to be lost or stale data to be written to the cache. These bugs can lead to data loss or data corruption. These bugs are relatively unlikely to be provoked in sites with few conflicts, but the possibility of failure existed any time an object was loaded and stored concurrently. Fixed a bug in conflict resolution that failed to ghostify an object if it was involved in a conflict. (This code may be redundant, but it has been fixed regardless.) The ZEO server was fixed so that it does not perform any I/O until all of a transaction's invalidations are queued. If it performs I/O in the middle of sending invalidations, it would be possible to overlap a load from a client with the invalidation being sent to it.
The ZEO cache now handles invalidations atomically. This is the same sort of bug that is described in the 3.1.2b1 section below, but it affects the ZEO cache. Fixed several serious bugs in fsrecover that caused it to fail catastrophically in certain cases because it thought it had found a checkpoint (status "c") record when it was in the middle of the file. Two new features snuck into this beta release. The ZODB.transact module provides a helper function that converts a regular function or method into a transactional one. The ZEO client cache now supports Adaptable Persistence (APE). The cache used to expect that all OIDs were eight bytes long. What's new in ZODB3 3.2 beta 1 ============================== Release date: 30-May-2003 ZODB ---- Invalidations are now processed atomically. Each transaction will see all the changes caused by an earlier transaction or none of them. Before this patch, it was possible for a transaction to see invalid data because it saw only a subset of the invalidations. This is the most likely cause of reported BTrees corruption, where keys were stored in the wrong bucket. When a BTree bucket splits, the bucket and the bucket's parent are both modified. If a transaction sees the invalidation for the bucket but not the parent, the BTree in memory will be internally inconsistent and keys can be put in the wrong bucket. The atomic invalidation fix prevents this problem. A number of minor reference count problems in the object cache were fixed. That's the cPickleCache.c file. It was possible for a transaction that failed in tpc_finish() to lose the traceback that caused the failure. The transaction code was fixed to report the original error as well as any errors that occur while trying to recover from the original error. The "other" argument to copyTransactionsFrom() only needs to have an .iterator() method. For convenience, change FileStorage's and BDBFullStorage's iterator to have this method, which just returns self. Mount points are now visible from mounted objects. Fixed memory leak involving database connections and caches. When a connection or database was closed, the cache and database leaked, because of a circular reference involving the cache. Fixed the cache to explicitly clear out its contents when its connection is closed. The ZODB cache has fewer methods. It used to expose methods that could mutate the dictionary, which allowed users to violate internal invariants. ZConfig ------- It is now possible to configure ZODB databases and storages and ZEO servers using ZConfig. ZEO & zdaemon ------------- ZEO now supports authenticated client connections. The default authentication protocol uses a hash-based challenge-response protocol to prove identity and establish a session key for message authentication. The architecture is pluggable to allow third parties to develop better authentication protocols. There is a new HOWTO for running a ZEO server. The draft in this release is incomplete, but provides more guidance than previous releases. See the file Doc/ZEO/howto.txt. The ZEO storage server's transaction timeout feature was refactored and made slightly more robust. A new ZEO utility script, ZEO/mkzeoinst.py, was added. This creates a standard directory structure and writes a configuration file with mostly default values, and a bootstrap script that can be used to manage and monitor the server using zdctl.py (see below). Much work was done to improve zdaemon's zdctl.py and zdrun.py scripts.
(In the alpha 1 release, zdrun.py was called zdaemon.py, but installing it in /bin caused much breakage due to the name conflict with the zdaemon package.) Together with the new mkzeoinst.py script, this makes controlling a ZEO server a breeze. A ZEO client will not read from its cache during cache verification. This fix was necessary to prevent the client from reading inconsistent data. The isReadOnly() method of a ZEO client was fixed to return false when the client is connected to a read-only fallback server. The sync() method of ClientStorage and the pending() method of a zrpc connection now do both input and output. The short_repr() function used to generate log messages was fixed so that it does not blow up creating a repr of very long tuples. Storages -------- FileStorage has a new pack() implementation that fixes several reported problems that could lead to data loss. Two small bugs were fixed in DemoStorage. undoLog() did not handle its arguments correctly and pack() could accidentally delete objects created in versions. Fixed trivial bug in fsrecover that prevented it from working at all. FileStorage will use fsync() on Windows starting with Python 2.2.3. FileStorage's commit version was fixed. It used to stop after the first object, leaving all the other objects in the version. BTrees ------ Trying to store an object of a non-integer type into an IIBTree or OIBTree could leave the bucket in a variety of insane states. For example, trying b[obj] = "I'm a string, not an integer" where b is an OIBTree. This manifested as a refcount leak in the test suite, but could have been much worse (most likely in real life is that a seemingly arbitrary existing key would "go missing"). When deleting the first child of a BTree node with more than one child, a reference to the second child leaked. This could cause the entire bucket chain to leak (not be collected as garbage despite not being referenced anymore). Other minor BTree leak scenarios were also fixed. Tools ----- New tool zeoqueue.py for parsing ZEO log files, looking for blocked transactions. New tool repozo.py (originally by Anthony Baxter) for performing incremental backups of Data.fs files. The fsrecover.py script now does a better job of recovering from errors that occur in the middle of a transaction record. Fixed several bugs that caused partial or total failures in earlier versions. What's new in ZODB3 3.2 alpha 1 =============================== Release date: 17-Jan-2003 Most of the changes in this release are performance and stability improvements to ZEO. A major packaging change is that there won't be a separate ZEO release. The new ZConfig is a noteworthy addition (see below). ZODB ---- An experimental new transaction API was added. The Connection class has a new method, setLocalTransaction(). ZODB applications can call this method to bind transactions to connections rather than threads. This is especially useful for GUI applications, which often have only one thread but multiple independent activities within that thread (generally one per window). Thanks to Christian Reis for championing this feature. Applications that take advantage of this feature should not use the get_transaction() function. Until now, ZODB itself sometimes assumed get_transaction() was the only way to get the transaction. Minor corrections have been added. The ZODB test suite, on the other hand, can continue to use get_transaction(), since it is free to assume that transactions are bound to threads. 
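A rough sketch of how this experimental per-connection transaction API was meant to be used (illustrative only; it assumes setLocalTransaction() returns the connection's transaction object, and uses a MappingStorage purely for brevity)::

    from ZODB import DB
    from ZODB.MappingStorage import MappingStorage

    db = DB(MappingStorage())
    conn = db.open()

    # Bind a transaction to this connection rather than to the thread.
    txn = conn.setLocalTransaction()

    root = conn.root()
    root['counter'] = root.get('counter', 0) + 1

    # Commit only the work done through this connection; other
    # connections in the same thread are unaffected.
    txn.commit()

    conn.close()
    db.close()
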
ZEO --- There is a new recommended script for starting a storage server. We recommend using ZEO/runzeo.py instead of ZEO/start.py. The start.py script is still available in this release, but it will no longer be maintained and will eventually be removed. There is a new zdaemon implementation. This version is a separate script that runs an arbitrary daemon. To run the ZEO server as a daemon, you would run "zdrun.py runzeo.py". There is also a simple shell, zdctl.py, that can be used to manage a daemon. Try "zdctl.py -p runzeo.py". There is a new version of the ZEO protocol in this release and a first stab at protocol negotiation. (It's a first stab because the protocol checking support in ZODB 3.1 was too primitive to support anything better.) A ZODB 3.2 ZEO client can talk to an old server, but a ZODB 3.2 server can't talk to an old client. It's safe to upgrade all the clients first and upgrade the server last. The ZEO client cache format changed, so you'll need to delete persistent caches before restarting clients. The ZEO cache verification protocol was revised to require many fewer messages in cases where a client or server restarts quickly. The performance of full cache verification has improved dramatically. Measurements from Jim were somewhere in 2x-5x. The implementation was fixed to use the very-fast getSerial() method on the storage instead of the comparatively slow load(). The ZEO server has an optional timeout feature that will abort a connection that does not commit within a certain amount of time. The timeout works by closing the socket the client is using, causing both client and server to abort the transaction and continue. This is a drastic step, but can be useful to prevent a hung client or other bug from blocking a server indefinitely. A bug was fixed in the ZEO protocol that allowed clients to read stale cache data while cache verification was being performed. The fixed version prevents the client from using the storage until after verification completes. The ZEO server has an experimental monitoring interface that reports usage statistics for the storage server including number of connected clients and number of transactions active and committed. It can be enabled by passing the -m flag to runsvr.py. The ZEO ClientStorage no longer supports the environment variables CLIENT_HOME, INSTANCE_HOME, or ZEO_CLIENT. The ZEO1 package is still included with this release, but there is no longer an option to install it. BTrees ------ The BTrees package now has a check module that inspects a BTree to check internal invariants. Bugs in older versions of the code could leave a BTree in an inconsistent state. Calling BTrees.check.check() on a BTree object should verify its consistency. (See the NEWS section for 3.1 beta 1 below for the old BTrees bugs.) Fixed a rare conflict resolution problem in the BTrees that could cause a segfault when the conflict resolution resulted in an empty bucket. Installation ------------ The distutils setup now installs several Python scripts. The runzeo.py and zdrun.py scripts mentioned above and several fsXXX.py scripts from the Tools directory. The test.py script does not run all the ZEO tests by default, because the ZEO tests take a long time to run. Use --all to run all the tests. Otherwise a subset of the tests, mostly using MappingStorage, are run. Storages -------- There are two new storages based on Sleepycat's BerkeleyDB in the BDBStorage package. 
Barry will have to write more here, because I don't know how different they are from the old bsddb3Storage storages. See Doc/BDBStorage.txt for more information. It now takes less time to open an existing FileStorage. The FileStorage uses a BTree-based index that is faster to pickle and unpickle. It also saves the index periodically so that subsequent opens will go fast even if the storage was not closed cleanly. Misc ---- The new ZConfig package, which will be used by Zope and ZODB, is included. ZConfig provides a configuration syntax, similar to Apache's syntax. The package can be used to configure the ZEO server and ZODB databases. See the module ZODB.config for functions to open the database from configuration. See ZConfig/doc for more info. The zLOG package now uses the logging package by Vinay Sajip, which will be included in Python 2.3. The Sync extension was removed from ExtensionClass, because it was not used by ZODB. What's new in ZODB3 3.1.4? ========================== Release date: 11-Sep-2003 A new feature to allow removal of connection pools for versions was ported from Zope 2.6. This feature is needed by Zope to avoid denial of service attacks that allow a client to create an arbitrary number of version pools. A pair of new scripts from Jim Fulton can be used to synthesize workloads and measure ZEO performance: see zodbload.py and zeoserverlog.py in the Tools directory. Note that these require Zope. Tools/checkbtrees.py was strengthened in two ways: - In addition to running the _check() method on each BTree B found, BTrees.check.check(B) is also run. The check() function was written after checkbtrees.py, and identifies kinds of damage B._check() cannot find. - Cycles in the object graph no longer lead to unbounded output. Note that preventing this requires remembering the oid of each persistent object found, which increases the memory needed by the script. What's new in ZODB3 3.1.3? ========================== Release date: 18-Aug-2003 Fixed several critical ZEO bugs. - If a storage server fails or times out between the vote and the finish, the ZEO cache could get populated with objects that didn't make it to the storage server. - If a client loses its connection to the server near the end of a transaction, it is now guaranteed to get a ClientDisconnected error even if it reconnects before the transaction finishes. This is necessary because the server will always abort the transaction. In some cases, the client would never see an error for the aborted transaction. - In tpc_finish(), reordered the calls so that the server's tpc_finish() is called (and must succeed) before we update the ZEO client cache. - The storage name is now prepended to the sort key, to ensure a unique global sort order if storages are named uniquely. This can prevent deadlock in some unusual cases. A variety of fixes and improvements to Berkeley storage (aka BDBStorage) were back-ported from ZODB 4. This release now contains the most current version of the Berkeley storage code. Many tests have been back-ported, but not all. Modified the Windows tests to wait longer at the end of ZEO tests for the server to shut down. Before Python 2.3, there is no waitpid() on Windows, and, thus, no way to know if the server has shut down. The change makes the Windows ZEO tests much less likely to fail or hang, at the cost of increasing the time needed to run the tests. Fixed a bug in ExtensionClass when comparing ExtensionClass instances. 
The code could raise RuntimeWarning under Python 2.3, and produce incorrect results on 64-bit platforms. Fixed bugs in Tools/repozo.py, including a timing-dependent one that could cause the following invocation of repozo to do a full backup when an incremental backup would have sufficed. Added Tools/README.txt that explains what each of the scripts in the Tools directory does. There were many small changes and improvements to the test suite. What's new in ZODB3 3.1.2 final? ================================ Fixed bug in FileStorage pack that caused it to fail if it encountered an old undo record (status "u"). Fixed several bugs in FileStorage pack that could cause OverflowErrors for storages > 2 GB. Fixed memory leak in TimeStamp.laterThan() that only occurred when it had to create a new TimeStamp. Fixed two BTree bugs that were fixed on the head a while ago: - bug in fsBTree that would cause byValue searches to end early. (fsBTrees are never used this way, but it was still a bug.) - bug that lead to segfault if BTree was mutated via deletion while it was being iterated over. What's new in ZODB3 3.1.2 beta 2? ================================= Fixed critical race conditions in ZEO's cache consistency code that could cause invalidations to be lost or stale data to be written to the cache. These bugs can lead to data loss or data corruption. These bugs are relatively unlikely to be provoked in sites with few conflicts, but the possibility of failure existed any time an object was loaded and stored concurrently. Fixed a bug in conflict resolution that failed to ghostify an object if it was involved in a conflict. (This code may be redundant, but it has been fixed regardless.) The ZEO server was fixed so that it does not perform any I/O until all of a transactions' invalidations are queued. If it performs I/O in the middle of sending invalidations, it would be possible to overlap a load from a client with the invalidation being sent to it. The ZEO cache now handles invalidations atomically. This is the same sort of bug that is described in the 3.1.2b1 section below, but it affects the ZEO cache. Fixed several serious bugs in fsrecover that caused it to fail catastrophically in certain cases because it thought it had found a checkpoint (status "c") record when it was in the middle of the file. What's new in ZODB3 3.1.2 beta 1? ================================= ZODB ---- Invalidations are now processed atomically. Each transaction will see all the changes caused by an earlier transaction or none of them. Before this patch, it was possible for a transaction to see invalid data because it saw only a subset of the invalidations. This is the most likely cause of reported BTrees corruption, where keys were stored in the wrong bucket. When a BTree bucket splits, the bucket and the bucket's parent are both modified. If a transaction sees the invalidation for the bucket but not the parent, the BTree in memory will be internally inconsistent and keys can be put in the wrong bucket. The atomic invalidation fix prevents this problem. A number of minor reference count fixes in the object cache were fixed. That's the cPickleCache.c file. It was possible for a transaction that failed in tpc_finish() to lose the traceback that caused the failure. The transaction code was fixed to report the original error as well as any errors that occur while trying to recover from the original error. ZEO --- A ZEO client will not read from its cache during cache verification. 
This fix was necessary to prevent the client from reading inconsistent data. The isReadOnly() method of a ZEO client was fixed to return the false when the client is connected to a read-only fallback server. The sync() method of ClientStorage and the pending() method of a zrpc connection now do both input and output. The short_repr() function used to generate log messages was fixed so that it does not blow up creating a repr of very long tuples. Storages -------- FileStorage has a new pack() implementation that fixes several reported problems that could lead to data loss. Two small bugs were fixed in DemoStorage. undoLog() did not handle its arguments correctly and pack() could accidentally delete objects created in versions. Fixed trivial bug in fsrecover that prevented it from working at all. FileStorage will use fsync() on Windows starting with Python 2.2.3. FileStorage's commit version was fixed. It used to stop after the first object, leaving all the other objects in the version. BTrees ------ Trying to store an object of a non-integer type into an IIBTree or OIBTree could leave the bucket in a variety of insane states. For example, trying b[obj] = "I'm a string, not an integer" where b is an OIBTree. This manifested as a refcount leak in the test suite, but could have been much worse (most likely in real life is that a seemingly arbitrary existing key would "go missing"). When deleting the first child of a BTree node with more than one child, a reference to the second child leaked. This could cause the entire bucket chain to leak (not be collected as garbage despite not being referenced anymore). Other minor BTree leak scenarios were also fixed. Other ----- Comparing a Missing.Value object to a C type that provide its own comparison operation could lead to a segfault when the Missing.Value was on the right-hand side of the comparison operator. The Missing class was fixed so that its coercion and comparison operations are safe. Tools ----- Four tools are now installed by setup.py: fsdump.py, fstest.py, repozo.py, and zeopack.py. What's new in ZODB3 3.1.1 final? ================================ Release date: 11-Feb-2003 Tools ----- Updated repozo.py tool What's new in ZODB3 3.1.1 beta 2? ================================= Release date: 03-Feb-2003 The Transaction "hosed" feature is disabled in this release. If a transaction fails during the tpc_finish() it is not possible, in general, to know whether the storage is in a consistent state. For example, a ZEO server may commit the data and then fail before sending confirmation of the commit to the client. If multiple storages are involved in a transaction, the problem is exacerbated: One storage may commit the data while another fails to commit. In previous versions of ZODB, the database would set a global "hosed" flag that prevented any other transaction from committing until an administrator could check the status of the various failed storages and ensure that the database is in a consistent state. This approach favors data consistency over availability. The new approach is to log a panic but continue. In practice, availability seems to be more important than consistency. The failure mode is exceedingly rare in either case. The BTrees-based fsIndex for FileStorage is enabled. This version of the index is faster to load and store via pickle and uses less memory to store keys. We had intended to enable this feature in an earlier release, but failed to actually do it; thus, it's getting enabled as a bug fix now. 
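The fsIndex mentioned above is a mapping from 8-byte oids to file positions; a minimal sketch of that dictionary-like interface (illustrative values only, using ZODB.utils.p64 to build oids)::

    from ZODB.fsIndex import fsIndex
    from ZODB.utils import p64

    index = fsIndex()

    # Map an oid (an 8-byte string) to the file position of its data record.
    index[p64(42)] = 12345

    assert index[p64(42)] == 12345
    assert index.get(p64(999)) is None
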
Two rare bugs were fixed in BTrees conflict resolution. The most probable symptom of the bug would have been a segfault. The bugs were found via synthetic stress tests rather than bug reports. A value-based consistency checker for BTrees was added. See the module BTrees.check for the checker and other utilities for working with BTrees. A new script called repozo.py was added. This script, originally written by Anthony Baxter, provides an incremental backup scheme for FileStorage based storages. zeopack.py has been fixed to use a read-only connection. Various small autopack-related race conditions have been fixed in the Berkeley storage implementations. There have been some table changes to the Berkeley storages so any storage you created in 3.1.1b1 may not work. Part of these changes was to add a storage version number to the schema so these types of incompatible changes can be avoided in the future. Removed the chance of bogus warnings in the FileStorage iterator. ZEO --- The ZEO version number was bumped to 2.0.2 on account of the following minor feature additions. The performance of full cache verification has improved dramatically. Measurements from Jim were somewhere in 2x-5x. The implementation was fixed to use the very-fast getSerial() method on the storage instead of the comparatively slow load(). The ZEO server has an optional timeout feature that will abort a connection that does not commit within a certain amount of time. The timeout works by closing the socket the client is using, causing both client and server to abort the transaction and continue. This is a drastic step, but can be useful to prevent a hung client or other bug from blocking a server indefinitely. If a client was disconnected during a transaction, the tpc_abort() call did not properly reset the internal state about the transaction. The bug caused the next transaction to fail in its tpc_finish(). Also, any ClientDisconnected exceptions raised during tpc_abort() are ignored. ZEO logging has been improved by adding more logging for important events, and changing the logging level for existing messages to a more appropriate level (usually lower). What's new in ZODB3 3.1.1 beta 1? ================================= Release date: 10-Dev-2002 It was possible for earlier versions of ZODB to deadlock when using multiple storages. If multiple transactions committed concurrently and both transactions involved two or more shared storages, deadlock was possible. This problem has been fixed by introducing a sortKey() method to the transaction and storage APIs that is used to define an ordering on transaction participants. This solution will prevent deadlocks provided that all transaction participants that use locks define a valid sortKey() method. A warning is raised if a participant does not define sortKey(). For backwards compatibility, BaseStorage provides a sortKey() that uses __name__. Added code to ThreadedAsync/LoopCallback.py to work around a bug in asyncore.py: a handled signal can cause unwanted reads to happen. A bug in FileStorage related to object uncreation was fixed. If an a transaction that created an object was undone, FileStorage could write a bogus data record header that could lead to strange errors if the object was loaded. An attempt to load an uncreated object now raises KeyError, as expected. The restore() implementation in FileStorage wrote incorrect backpointers for a few corner cases involving versions and undo. It also failed if the backpointer pointed to a record that was before the pack time. 
These specific bugs have been fixed and new test cases were added to cover them. A bug was fixed in conflict resolution that raised a NameError when a class involved in a conflict could not be loaded. The bug did not affect correctness, but prevent ZODB from caching the fact that the class was unloadable. A related bug prevented spurious AttributeErrors when a class could not be loaded. It was also fixed. The script Tools/zeopack.py was fixed to work with ZEO 2. It was untested and had two silly bugs. Some C extensions included standard header files before including Python.h, which is not allowed. They now include Python.h first, which eliminates compiler warnings in certain configurations. The BerkeleyDB based storages have been merged from the trunk, providing a much more robust version of the storages. They are not backwards compatible with the old storages, but the decision was made to update them in this micro release because the old storages did not work for all practical purposes. For details, see Doc/BDBStorage.txt. What's new in ZODB3 3.1 final? =============================== Release date: 28-Oct-2002 If an error occurs during conflict resolution, the store will silently catch the error, log it, and continue as if the conflict was unresolvable. ZODB used to behave this way, and the change to catch only ConflictError was causing problems in deployed systems. There are a lot of legitimate errors that should be caught, but it's too close to the final release to make the substantial changes needed to correct this. What's new in ZODB3 3.1 beta 3? =============================== Release date: 21-Oct-2002 A small extension was made to the iterator protocol. The Record objects, which are returned by the per-transaction iterators, contain a new `data_txn` attribute. It is None, unless the data contained in the record is a logical copy of an earlier transaction's data. For example, when transactional undo modifies an object, it creates a logical copy of the earlier transaction's data. Note that this provide a stronger statement about consistency than whether the data in two records is the same; it's possible for two different updates to an object to coincidentally have the same data. The restore() method was extended to take the data_txn attribute mentioned above as an argument. FileStorage uses the new argument to write a backpointer if possible. A few bugs were fixed. The setattr slot of the cPersistence C API was being initialized to NULL. The proper initialization was restored, preventing crashes in some applications with C extensions that used persistence. The return value of TimeStamp's __cmp__ method was clipped to return only 1, 0, -1. The restore() method was fixed to write a valid backpointer if the update being restored is in a version. Several bugs and improvements were made to zdaemon, which can be used to run the ZEO server. The parent now forwards signals to the child as intended. Pidfile handling was improved and the trailing newline was omitted. What's new in ZODB3 3.1 beta 2? =============================== Release date: 4-Oct-2002 A few bugs have been fixed, some that were found with the help of Neal Norwitz's PyChecker. The zeoup.py tool has been fixed to allow connecting to a read-only storage, when the --nowrite option is given. Casey Duncan fixed a few bugs in the recent changes to undoLog(). The fstest.py script no longer checks that each object modified in a transaction has a serial number that matches the transaction id. 
This invariant is no longer maintained; several new features in the 3.1 release depend on it. The ZopeUndo package was added. If ZODB3 is being used to run a ZEO server that will be used with Zope, it is usually best if the server and the Zope client don't share any software. The Zope undo framework, however, requires that a Prefix object be passed between client and server. To support this use, ZopeUndo was created to hold the Prefix object. Many bugs were fixed in ZEO, and a couple of features added. See `ZEO-NEWS.txt` for details. The ZODB guide included in the Doc directory has been updated. It is still incomplete, but most of the references to old ZODB packages have been removed. There is a new section that briefly explains how to use BTrees. The zeoup.py tool connects using a read-only connection when --nowrite is specifified. This feature is useful for checking on read-only ZEO servers. What's new in ZODB3 3.1 beta 1? =============================== Release date: 12-Sep-2002 We've changed the name and version number of the project, but it's still the same old ZODB. There have been a lot of changes since the last release. New ZODB cache -------------- Toby Dickenson implemented a new Connection cache for ZODB. The cache is responsible for pointer swizzling (translating between oids and Python objects) and for keeping recently used objects in memory. The new cache is a big improvement over the old cache. It strictly honors its size limit, where size is specified in number of objects, and it evicts objects in least recently used (LRU) order. Users should take care when setting the cache size, which has a default value of 400 objects. The old version of the cache often held many more objects than its specified size. An application may not perform as well with a small cache size, because the cache no longer exceeds the limit. Storages -------- The index used by FileStorage was reimplemented using a custom BTrees object. The index maps oids to file offsets, and is kept in memory at all times. The new index uses about 1/4 the memory of the old, dictionary-based index. See the module ZODB.fsIndex for details. A security flaw was corrected in transactionalUndo(). The transaction ids returned by undoLog() and used for transactionalUndo() contained a file offset. An attacker could construct a pickle with a bogus transaction record in its binary data, deduce the position of the pickle in the file from the undo log, then submit an undo with a bogus file position that caused the pickle to get written as a regular data record. The implementation was fixed so that file offsets are not included in the transaction ids. Several storages now have an explicit read-only mode. For example, passing the keyword argument read_only=1 to FileStorage will make it read-only. If a write operation is performed on a read-only storage, a ReadOnlyError will be raised. The storage API was extended with new methods that support the Zope Replication Service (ZRS), a proprietary Zope Corp product. We expect these methods to be generally useful. The methods are: - restore(oid, serialno, data, version, transaction) Perform a store without doing consistency checks. A client can use this method to provide a new current revision of an object. The ``serialno`` argument is the new serialno to use for the object, not the serialno of the previous revision. - lastTransaction() Returns the transaction id of the last committed transaction. - lastSerial(oid) Return the current serialno for ``oid`` or None. 
- iterator(start=None, stop=None) The iterator method isn't new, but the optional ``start`` and ``stop`` arguments are. These arguments can be used to specify the range of the iterator -- an inclusive range [start, stop]. FileStorage is now more cautious about creating a new file when it believes a file does not exist. This change is a workaround for a bug in Python versions up to and including 2.1.3. If the interpreter was built without large file support but the platform had it, os.path.exists() would return false for large files. The fix is to try to open the file first, and decide whether to create a new file based on errno. The undoLog() and undoInfo() methods of FileStorage can run concurrently with other methods. The internal storage lock is released periodically to give other threads a chance to run. This should increase responsiveness of ZEO clients when used with ZEO 2. New serial numbers are assigned consistently for abortVersion() and commitVersion(). When a version is committed, the non-version data gets a new serial number. When a version is aborted, the serial number for non-version data does not change. This means that the abortVersion() transaction record has the unique property that its transaction id is not the serial number of the data records. Berkeley Storages ----------------- Berkeley storage constructors now take an optional `config` argument, which is an instance whose attributes can be used to configure such BerkeleyDB policies as an automatic checkpointing interval, lock table sizing, and the log directory. See bsddb3Storage/BerkeleyBase.py for details. A getSize() method has been added to all Berkeley storages. Berkeley storages open their environments with the DB_THREAD flag. Some performance optimizations have been implemented in Full storage, including the addition of a helper C extension when used with Python 2.2. More performance improvements will be added for the ZODB 3.1 final release. A new experimental Autopack storage was added which keeps only a certain amount of old revision information. The concepts in this storage will be folded into Full and Autopack will likely go away in ZODB 3.1 final. ZODB 3.1 final will also have much improved Minimal and Full storages, which eliminate Berkeley lock exhaustion problems, reduce memory use, and improve performance. It is recommended that you use BerkeleyDB 4.0.14 and PyBSDDB 3.4.0 with the Berkeley storages. See bsddb3Storage/README.txt for details. BTrees ------ BTrees no longer ignore exceptions raised when two keys are compared. Tim Peters fixed several endcase bugs in the BTrees code. Most importantly, after a mix of inserts and deletes in a BTree or TreeSet, it was possible (but unlikely) for the internal state of the object to become inconsistent. Symptoms then varied; most often this manifested as a mysterious failure to find a key that you knew was present, or that tree.keys() would yield an object that disagreed with the tree about how many keys there were. If you suspect such a problem, BTrees and TreeSets now support a ._check() method, which does a thorough job of examining the internal tree pointers for consistency. It raises AssertionError if it finds any problems, else returns None. If ._check() raises an exception, the object is damaged, and rebuilding the object is the best solution. All known ways for a BTree or TreeSet object to become internally inconsistent have been repaired. 
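A quick sketch of the two consistency checks described above, using an OOBTree (both are read-only diagnostics; a healthy tree passes silently)::

    from BTrees.OOBTree import OOBTree
    from BTrees.check import check

    tree = OOBTree()
    for i in range(1000):
        tree[i] = str(i)

    # Fast check of the internal tree pointers; raises AssertionError if
    # the object's internal state is inconsistent.
    tree._check()

    # The slower, more thorough checker from BTrees.check; it identifies
    # kinds of damage _check() cannot find.
    check(tree)
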
Other fixes include: - Many fixes for range search endcases, including the "range search bug:" If the smallest key S in a bucket in a BTree was deleted, doing a range search on the BTree with S on the high end could claim that the range was empty even when it wasn't. - Zope Collector #419: repaired off-by-1 errors and IndexErrors when slicing BTree-based data structures. For example, an_IIBTree.items()[0:0] had length 1 (should be empty) if the tree wasn't empty. - The BTree module functions weightedIntersection() and weightedUnion() now treat negative weights as documented. It's hard to explain what their effects were before this fix, as the sign bits were getting confused with an internal distinction between whether the result should be a set or a mapping. ZEO ---- For news about ZEO2, see the file ZEO-NEWS.txt. This version of ZODB ships with two different versions of ZEO. It includes ZEO 2.0 beta 1, the recommended new version. (ZEO 2 will reach final release before ZODB3.) The ZEO 2.0 protocol is not compatible with ZEO 1.0, so we have also included ZEO 1.0 to support people already using ZEO 1.0. Other features -------------- When a ConflictError is raised, the exception object now has a sensible structure, thanks to a patch from Greg Ward. The exception now uses the following standard attributes: oid, class_name, message, serials. See the ZODB.POSException.ConflictError doc string for details. It is now easier to customize the registration of persistent objects with a transaction. The low-level persistence mechanism in cPersistence.c registers with the object's jar instead of with the current transaction. The jar (Connection) then registers with the transaction. This redirection would allow specialized Connections to change the default policy on how the transaction manager is selected without hacking the Transaction module. Empty transactions can be committed without interacting with the storage. It is possible for registration to occur unintentionally and for a persistent object to compensate by making itself as unchanged. When this happens, it's possible to commit a transaction with no modified objects. The change allows such transactions to finish even on a read-only storage. Two new tools were added to the Tools directory. The ``analyze.py`` script, based on a tool by Matt Kromer, prints a summary of space usage in a FileStorage Data.fs. The ``checkbtrees.py`` script scans a FileStorage Data.fs. When it finds a BTrees object, it loads the object and calls the ``_check`` method. It prints warning messages for any corrupt BTrees objects found. Documentation ------------- The user's guide included with this release is still woefully out of date. Other bugs fixed ---------------- If an exception occurs inside an _p_deactivate() method, a traceback is printed on stderr. Previous versions of ZODB silently cleared the exception. ExtensionClass and ZODB now work correctly with a Python debug build. All C code has been fixed to use a consistent set of functions from the Python memory API. This allows ZODB to be used in conjunction with pymalloc, the default allocator in Python 2.3. zdaemon, which can be used to run a ZEO server, more clearly reports the exit status of its child processes. The ZEO server will reinitialize zLOG when it receives a SIGHUP. This allows log file rotation without restarting the server. What's new in StandaloneZODB 1.0 final? 
======================================= Release date: 08-Feb-2002 All copyright notices have been updated to reflect the fact that the ZPL 2.0 covers this release. Added a cleanroom PersistentList.py implementation, which multiply inherits from UserDict and Persistent. Some improvements in setup.py and test.py for sites that don't have the Berkeley libraries installed. A new program, zeoup.py was added which simply verifies that a ZEO server is reachable. Also, a new program zeopack.py was added which connects to a ZEO server and packs it. What's new in StandaloneZODB 1.0 c1? ==================================== Release Date: 25-Jan-2002 This was the first public release of the StandaloneZODB from Zope Corporation. Everything's new! :) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/LICENSE.txt000066400000000000000000000040261230730566700207350ustar00rootroot00000000000000Zope Public License (ZPL) Version 2.1 A copyright notice accompanies this license document that identifies the copyright holders. This license has been certified as open source. It has also been designated as GPL compatible by the Free Software Foundation (FSF). Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions in source code must retain the accompanying copyright notice, this list of conditions, and the following disclaimer. 2. Redistributions in binary form must reproduce the accompanying copyright notice, this list of conditions, and the following disclaimer in the documentation and/or other materials provided with the distribution. 3. Names of the copyright holders must not be used to endorse or promote products derived from this software without prior written permission from the copyright holders. 4. The right to distribute this software or to use it for any purpose does not give you the right to use Servicemarks (sm) or Trademarks (tm) of the copyright holders. Use of them is covered by separate agreement with the copyright holders. 5. If any files are modified, you must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. Disclaimer THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/MANIFEST.in000066400000000000000000000005071230730566700206500ustar00rootroot00000000000000include *.py include *.txt include COPYING include buildout.cfg include log.ini recursive-include doc *.pdf recursive-include doc *.txt recursive-include src *.c recursive-include src *.fs recursive-include src *.h recursive-include src *.py recursive-include src *.test recursive-include src *.txt recursive-include src *.xml ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/README.txt000066400000000000000000000156501230730566700206150ustar00rootroot00000000000000==== ZODB ==== Introduction ============ The ZODB package provides a set of tools for using the Zope Object Database (ZODB). The components you get with the ZODB release are as follows: - Core ZODB, including the persistence machinery - Standard storages such as FileStorage - The persistent BTrees modules - ZEO, for scalability needs - documentation (needs a lot more work) Our primary development platforms are Linux, Mac OS X, and Windows XP. The test suite should pass without error on all of these platforms, although it can take a long time on Windows -- longer if you use ZoneAlarm. Many particularly slow tests are skipped unless you pass --all as an argument to test.py. Compatibility ============= ZODB 3.10 requires Python 2.5 or later. Note -- When using ZEO and upgrading from Python 2.4, you need to upgrade clients and servers at the same time, or upgrade clients first and then servers. Clients running Python 2.5 or 2.6 will work with servers running Python 2.4. Clients running Python 2.4 won't work properly with servers running Python 2.5 or later due to changes in the way Python implements exceptions. ZODB ZEO clients from ZODB 3.2 on can talk to ZODB 3.10 servers. ZODB ZEO 3.10 Clients can talk to ZODB 3.8, 3.9, and 3.10 ZEO servers. Note -- ZEO 3.10 servers don't support undo for older clients. Prerequisites ============= You must have Python installed. If you're using a system Python install, make sure development support is installed too. You also need the transaction, zc.lockfile, ZConfig, zdaemon, zope.event, zope.interface, zope.proxy and zope.testing packages. If you don't have them and you can connect to the Python Package Index, then these will be installed for you if you don't have them. Installation ============ ZODB is released as a distutils package. The easiest ways to build and install it are to use `easy_install `_, or `zc.buildout `_. To install by hand, first install the dependencies, ZConfig, zdaemon, zope.interface, zope.proxy and zope.testing. These can be found in the `Python Package Index `_. To run the tests, use the test setup command:: python setup.py test It will download dependencies if needed. If this happens, ou may get an import error when the test command gets to looking for tests. Try running the test command a second time and you should see the tests run. :: python setup.py test To install, use the install command:: python setup.py install Testing for Developers ====================== The ZODB checkouts are `buildouts `_. When working from a ZODB checkout, first run the bootstrap.py script to initialize the buildout: % python bootstrap.py and then use the buildout script to build ZODB and gather the dependencies: % bin/buildout This creates a test script: % bin/test -v This command will run all the tests, printing a single dot for each test. When it finishes, it will print a test summary. 
The exact number of tests can vary depending on platform and available third-party libraries.:: Ran 1182 tests in 241.269s OK The test script has many more options. Use the ``-h`` or ``--help`` options to see a full list of options. The default test suite omits several tests that depend on third-party software or that take a long time to run. To run all the available tests use the ``--all`` option. Running all the tests takes much longer.:: Ran 1561 tests in 1461.557s OK Maintenance scripts ------------------- Several scripts are provided with the ZODB and can help with analyzing, debugging, checking for consistency, summarizing content, reporting space used by objects, doing backups, artificial load testing, etc. Look at the ZODB/script directory for more information. History ======= The historical version numbering schemes for ZODB and ZEO are complicated. Starting with ZODB 3.4, the ZODB and ZEO version numbers are the same. In the ZODB 3.1 through 3.3 lines, the ZEO version number was "one smaller" than the ZODB version number; e.g., ZODB 3.2.7 included ZEO 2.2.7. ZODB and ZEO were distinct releases prior to ZODB 3.1, and had independent version numbers. Historically, ZODB was distributed as a part of the Zope application server. Jim Fulton's paper at the Python conference in 2000 described a version of ZODB he called ZODB 3, based on an earlier persistent object system called BoboPOS. The earliest versions of ZODB 3 were released with Zope 2.0. Andrew Kuchling extracted ZODB from Zope 2.4.1 and packaged it for use by standalone Python programs. He called this version "StandaloneZODB". Andrew's guide to using ZODB is included in the Doc directory. This version of ZODB was hosted at http://sf.net/projects/zodb. It supported Python 1.5.2, and might still be of interest to users of this very old Python version. Zope Corporation released a version of ZODB called "StandaloneZODB 1.0" in Feb. 2002. This release was based on Andrew's packaging, but built from the same CVS repository as Zope. It is roughly equivalent to the ZODB in Zope 2.5. Why not call the current release StandaloneZODB? The name StandaloneZODB is a bit of a mouthful. The standalone part of the name suggests that the Zope version is the real version and that this is an afterthought, which isn't the case. So we're calling this release "ZODB". We also worked on a ZODB4 package for a while and made a couple of alpha releases. We've now abandoned that effort, because we didn't have the resources to pursue it while also maintaining ZODB(3). License ======= ZODB is distributed under the Zope Public License, an OSI-approved open source license. Please see the LICENSE.txt file for terms and conditions. The ZODB/ZEO Programming Guide included in the documentation is a modified version of Andrew Kuchling's original guide, provided under the terms of the GNU Free Documentation License. More information ================ We maintain a Wiki page about all things ZODB, including status on future directions for ZODB. Please see http://wiki.zope.org/ZODB/FrontPage and feel free to contribute your comments. There is a Mailman mailing list in place to discuss all issues related to ZODB. You can send questions to zodb-dev@zope.org or subscribe at http://lists.zope.org/mailman/listinfo/zodb-dev and view its archives at http://lists.zope.org/pipermail/zodb-dev Note that Zope Corp mailing lists have a subscriber-only posting policy. Andrew's ZODB Programmers Guide is made available in several forms, including DVI and HTML. 
To view it online, point your browser at the file Doc/guide/zodb/index.html Bugs and Patches ================ Bug reports and patches should be added to the Launchpad: https://launchpad.net/zodb .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End: ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/bootstrap.py000066400000000000000000000041041230730566700214760ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2006 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Bootstrap a buildout-based project Simply run this script in a directory containing a buildout.cfg. The script accepts buildout command-line options, so you can use the -c option to specify an alternate configuration file. """ import os, shutil, sys, tempfile, urllib2 tmpeggs = tempfile.mkdtemp() ez = {} exec urllib2.urlopen('http://peak.telecommunity.com/dist/ez_setup.py' ).read() in ez ez['use_setuptools'](to_dir=tmpeggs, download_delay=0) import pkg_resources is_jython = sys.platform.startswith('java') if is_jython: import subprocess cmd = 'from setuptools.command.easy_install import main; main()' if sys.platform == 'win32': cmd = '"%s"' % cmd # work around spawn lamosity on windows ws = pkg_resources.working_set if is_jython: assert subprocess.Popen( [sys.executable] + ['-c', cmd, '-mqNxd', tmpeggs, 'zc.buildout'], env = dict(os.environ, PYTHONPATH= ws.find(pkg_resources.Requirement.parse('setuptools')).location ), ).wait() == 0 else: assert os.spawnle( os.P_WAIT, sys.executable, sys.executable, '-c', cmd, '-mqNxd', tmpeggs, 'zc.buildout', dict(os.environ, PYTHONPATH= ws.find(pkg_resources.Requirement.parse('setuptools')).location ), ) == 0 ws.add_entry(tmpeggs) ws.require('zc.buildout') import zc.buildout.buildout zc.buildout.buildout.main(sys.argv[1:] + ['bootstrap']) shutil.rmtree(tmpeggs) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/buildout.cfg000066400000000000000000000010001230730566700214070ustar00rootroot00000000000000[buildout] develop = . 
parts = test scripts versions = versions [versions] zc.recipe.testrunner = 1.3.0 zope.event = 3.5.2 zope.exceptions = 3.7.1 zope.interface = 3.8.0 [test] recipe = zc.recipe.testrunner eggs = ZODB3 [test] initialization = import os, tempfile try: os.mkdir('tmp') except: pass tempfile.tempdir = os.path.abspath('tmp') defaults = ['--all'] [scripts] recipe = zc.recipe.egg eggs = ZODB3 [test] interpreter = py [omelette] recipe = collective.recipe.omelette eggs = ${test:eggs} ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/doc/000077500000000000000000000000001230730566700176555ustar00rootroot00000000000000ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/doc/HOWTO-Blobs-NFS.txt000066400000000000000000000054501230730566700230050ustar00rootroot00000000000000=========================================== How to use NFS to make Blobs more efficient =========================================== :Author: Christian Theune Overview ======== When handling blobs, the biggest goal is to avoid writing operations that require the blob data to be transferred using up IO resources. When bringing a blob into the system, at least one O(N) operation has to happen, e.g. when the blob is uploaded via a network server. The blob should be extracted as a file on the final storage volume as early as possible, avoiding further copies. In a ZEO setup, all data is stored on a networked server and passed to it using zrpc. This is a major problem for handling blobs, because it will lock all transactions from committing when storing a single large blob. As a default, this mechanism works but is not recommended for high-volume installations. Shared filesystem ================= The solution for the transfer problem is to setup various storage parameters so that blobs are always handled on a single volume that is shared via network between ZEO servers and clients. Step 1: Setup a writable shared filesystem for ZEO server and client -------------------------------------------------------------------- On the ZEO server, create two directories on the volume that will be used by this setup (assume the volume is accessible via $SERVER/): - $SERVER/blobs - $SERVER/tmp Then export the $SERVER directory using a shared network filesystem like NFS. Make sure it's writable by the ZEO clients. Assume the exported directory is available on the client as $CLIENT. Step 2: Application temporary directories ----------------------------------------- Applications (i.e. Zope) will put uploaded data in a temporary directory first. Adjust your TMPDIR, TMP or TEMP environment variable to point to the shared filesystem: $ export TMPDIR=$CLIENT/tmp Step 3: ZEO client caches ------------------------- Edit the file `zope.conf` on the ZEO client and adjust the configuration of the `zeoclient` storage with two new variables:: blob-dir = $CLIENT/blobs blob-cache-writable = yes Step 4: ZEO server ------------------ Edit the file `zeo.conf` on the ZEO server to configure the blob directory. Assuming the published storage of the ZEO server is a file storage, then the configuration should look like this:: path $INSTANCE/var/Data.fs blob-dir $SERVER/blobs (Remember to manually replace $SERVER and $CLIENT with the exported directory as accessible by either the ZEO server or the ZEO client.) Conclusion ---------- At this point, after restarting your ZEO server and clients, the blob directory will be shared and a minimum amount of IO will occur when working with blobs. 
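To make the pieces above concrete, here is a minimal Python sketch of storing a blob against a blob-enabled FileStorage (paths are illustrative; in the setup above the blob directory would be the shared $CLIENT/blobs / $SERVER/blobs volume, and the storage would normally be a ZEO client storage rather than a local FileStorage)::

    import transaction
    from ZODB import DB
    from ZODB.FileStorage import FileStorage
    from ZODB.blob import Blob

    # blob_dir plays the role of the shared blob directory described above.
    storage = FileStorage('Data.fs', blob_dir='blobs')
    db = DB(storage)
    conn = db.open()
    root = conn.root()

    # Write the blob data once; at commit time the data file is moved into
    # the blob directory rather than being copied through the object pickle.
    blob = Blob()
    f = blob.open('w')
    f.write('a large payload')
    f.close()
    root['attachment'] = blob

    transaction.commit()
    conn.close()
    db.close()
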
ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/doc/storage.pdf000066400000000000000000000613621230730566700220240ustar00rootroot00000000000000[binary PDF: storage interface documentation; outline: 1 Concepts, 1.1 Versions, 2 Storage Interface, 3 ZODB.BaseStorage Implementation, 4 Notes for Storage Implementors, 5 Distributed Storage Interface]
ŒH‹}e´×þb‡_[±ƒ9Kmpm“óê¶æÉ=úlqz°ÐÕÙ_®&mxUÑq ‚xm6NçÀ–#6 zÎ$ÇâœV®:|[g.›œP¹Ì¯i5^(I ;‡N{cS*ýÂ’¡"wÔ7ýtƒÕžH‰È1Ñ™zi&Uý±—@:Zà§ -9[àBÙ¿!W†æš¬Ëµ§_§`dÜ”ÍØúüªÍW²6êt?VMnÝÖØÃCÀÑÓÑ%C¯¶EÒ~\EuÑÐÝU6áRühwK]"gnÛsÃZÌmWlµ¡Îm"`ua"À‹'MÊnqØÅ¯Ž³°Máê_äwïî0œø»RÑ£]Vßýv÷鳿ËïüÝOw>Ыx÷mŠîõcnWwï~E@Æarÿ/ðtbu (Gì´A ¶XÿÃ÷ÚÛendstream endobj 62 0 obj << /Type /Page /Contents 63 0 R /Resources 61 0 R /MediaBox [0 0 612 792] /Parent 22 0 R >> endobj 24 0 obj << /D [62 0 R /XYZ 72 744.907 null] >> endobj 61 0 obj << /Font << /F32 10 0 R /F38 21 0 R /F43 31 0 R /F42 29 0 R /F34 13 0 R /F29 6 0 R >> /ProcSet [ /PDF /Text ] >> endobj 64 0 obj << /Type /Encoding /Differences [ 0 /minus/periodcentered/multiply/asteriskmath/divide/diamondmath/plusminus/minusplus/circleplus/circleminus/circlemultiply/circledivide/circledot/circlecopyrt/openbullet/bullet/equivasymptotic/equivalence/reflexsubset/reflexsuperset/lessequal/greaterequal/precedesequal/followsequal/similar/approxequal/propersubset/propersuperset/lessmuch/greatermuch/precedes/follows/arrowleft/arrowright/arrowup/arrowdown/arrowboth/arrownortheast/arrowsoutheast/similarequal/arrowdblleft/arrowdblright/arrowdblup/arrowdbldown/arrowdblboth/arrownorthwest/arrowsouthwest/proportional/prime/infinity/element/owner/triangle/triangleinv/negationslash/mapsto/universal/existential/logicalnot/emptyset/Rfractur/Ifractur/latticetop/perpendicular/aleph/A/B/C/D/E/F/G/H/I/J/K/L/M/N/O/P/Q/R/S/T/U/V/W/X/Y/Z/union/intersection/unionmulti/logicaland/logicalor/turnstileleft/turnstileright/floorleft/floorright/ceilingleft/ceilingright/braceleft/braceright/angbracketleft/angbracketright/bar/bardbl/arrowbothv/arrowdblbothv/backslash/wreathproduct/radical/coproduct/nabla/integral/unionsq/intersectionsq/subsetsqequal/supersetsqequal/section/dagger/daggerdbl/paragraph/club/diamond/heart/spade/arrowleft 129/.notdef 161/minus/periodcentered/multiply/asteriskmath/divide/diamondmath/plusminus/minusplus/circleplus/circleminus 171/.notdef 173/circlemultiply/circledivide/circledot/circlecopyrt/openbullet/bullet/equivasymptotic/equivalence/reflexsubset/reflexsuperset/lessequal/greaterequal/precedesequal/followsequal/similar/approxequal/propersubset/propersuperset/lessmuch/greatermuch/precedes/follows/arrowleft/spade 197/.notdef] >> endobj 33 0 obj << /Length1 772 /Length2 576 /Length3 532 /Length 1127 /Filter /FlateDecode >> stream xÚSU ÖuLÉOJuËÏ+Ñ5Ô3´Rpö Ž44P0Ô3àRUu.JM,ÉÌÏsI,IµR0´´4Tp,MW04U00·22°25çRUpÎ/¨,ÊLÏ(QÐpÖ)2WpÌM-ÊLNÌSðM,ÉHÍš‘œ˜£œŸœ™ZR©§à˜“£ÒQ¬”ZœZT–š¢Çeh¨’™\¢”šž™Ç¥r‘g^Z¾‚9D8¥´&U–ZT t”‚Бš @'¦äçåT*¤¤¦qéûåíJº„ŽB7Ü­4'Ç/1d<8”0äs3s*¡*òs JKR‹|óSR‹òЕ†§B盚’Yš‹.ëY’˜“™ì˜—ž“ª kh¢g`l ‘È,vˬHM È,IÎPHKÌ)N‹§æ¥ ;|`‡èGºúD„kCã,˜™WRYª`€P æ"øÀP*ʬPˆ6Ð300*B+Í2×¼äü”̼t#S3…Ä¢¢ÄJ.` òLª 2óRR+R+€.Ö×ËË/jQM­BZ~(ZÉI? ´©% q.L89åWTëY*èZš 644S077­EUš—YXšêé¢`j```añYriQQj^ 8 ÆOËljjEj2×ÍkùÉÖ-YÓ·µ­¬s]|a«>çÏk_Þd?±£nvfJm°é¼@Åô’%¯>ÚÚwX<û¢„W²õTá¢-’½~=q_ ¯ÙÚµ`YÄ„Óýz7‚Å+›»¦ñþÓVåy¸0lÆœÖGÒVû‹ÏêTÖ¹ùE¹þϼ”NQ‹÷}¿w[H+h’–’”ùÍìwÅÄ+ï>¿,ÿiGýôã¶ÉïÎÞòñ /vëR¿˜fÇô%ñۮش²‹µŸ9¼òâQ¹DÊÿžýÑod;”ÚU? 
^Vñµ«Nºúú©vñK¯{~­ñçäÚ/ëtôî…Ã-Çé÷7¸ï“õ‘9ñØ8ã·Ô m¿i"é÷Œ™6=Û!y:ëIèÆõ†íÿ_°K-­û±,1{Îö)².oª —ï¶ý*Þ[«ç½mFäû%»s_Û-j(lå¦sÿÏùœ~gغŒ|K·~›¶#£ïµ¾øÓ·&g®]p_ò¸!—GrnM`ìv®^ÿD·l½ŸÞë>Z`.x‹“Yh—ý.Ž#ÁÇ8©¯Øw6O~¡—5“{Þ„U7¶ð807ì™õ…ûk4鹇Wñ»5þô öŒïùfÕŸ”ÛV¼ RÅ—÷mõ‰_A¢ëX¦¼OïjW;[Ã(Ï´ÿÇê¼uï,¥n˜ q(ï»°õÆA®æ…Ëü+Ì»·3z^›"_Õöûÿ‘Ù“O:†~ýUûI¯H$P†kR¦½ÏíÏ-‚©¢áúº×y'y=øØ'sµñó‰BˉêÿcÙWdtDÇ?û:`‡‘µªñ½w¦[K½ð_°È€BÀ5jÀ°0 9'5±¨$?7±(› _Úx¦endstream endobj 34 0 obj << /Type /Font /Subtype /Type1 /Encoding 64 0 R /FirstChar 15 /LastChar 15 /Widths 65 0 R /BaseFont /YELXPW+CMSY10 /FontDescriptor 32 0 R >> endobj 32 0 obj << /Ascent 750 /CapHeight 683 /Descent -194 /FontName /YELXPW+CMSY10 /ItalicAngle -14 /StemV 85 /XHeight 431 /FontBBox [-29 -960 1116 775] /Flags 4 /CharSet (/bullet) /FontFile 33 0 R >> endobj 65 0 obj [500 ] endobj 66 0 obj << /Type /Encoding /Differences [ 0 /.notdef 1/dotaccent/fi/fl/fraction/hungarumlaut/Lslash/lslash/ogonek/ring 10/.notdef 11/breve/minus 13/.notdef 14/Zcaron/zcaron/caron/dotlessi/dotlessj/ff/ffi/ffl 22/.notdef 30/grave/quotesingle/space/exclam/quotedbl/numbersign/dollar/percent/ampersand/quoteright/parenleft/parenright/asterisk/plus/comma/hyphen/period/slash/zero/one/two/three/four/five/six/seven/eight/nine/colon/semicolon/less/equal/greater/question/at/A/B/C/D/E/F/G/H/I/J/K/L/M/N/O/P/Q/R/S/T/U/V/W/X/Y/Z/bracketleft/backslash/bracketright/asciicircum/underscore/quoteleft/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/braceleft/bar/braceright/asciitilde 127/.notdef 128/Euro 129/.notdef 130/quotesinglbase/florin/quotedblbase/ellipsis/dagger/daggerdbl/circumflex/perthousand/Scaron/guilsinglleft/OE 141/.notdef 147/quotedblleft/quotedblright/bullet/endash/emdash/tilde/trademark/scaron/guilsinglright/oe 157/.notdef 159/Ydieresis 160/.notdef 161/exclamdown/cent/sterling/currency/yen/brokenbar/section/dieresis/copyright/ordfeminine/guillemotleft/logicalnot/hyphen/registered/macron/degree/plusminus/twosuperior/threesuperior/acute/mu/paragraph/periodcentered/cedilla/onesuperior/ordmasculine/guillemotright/onequarter/onehalf/threequarters/questiondown/Agrave/Aacute/Acircumflex/Atilde/Adieresis/Aring/AE/Ccedilla/Egrave/Eacute/Ecircumflex/Edieresis/Igrave/Iacute/Icircumflex/Idieresis/Eth/Ntilde/Ograve/Oacute/Ocircumflex/Otilde/Odieresis/multiply/Oslash/Ugrave/Uacute/Ucircumflex/Udieresis/Yacute/Thorn/germandbls/agrave/aacute/acircumflex/atilde/adieresis/aring/ae/ccedilla/egrave/eacute/ecircumflex/edieresis/igrave/iacute/icircumflex/idieresis/eth/ntilde/ograve/oacute/ocircumflex/otilde/odieresis/divide/oslash/ugrave/uacute/ucircumflex/udieresis/yacute/thorn/ydieresis] >> endobj 31 0 obj << /Type /Font /Subtype /Type1 /Encoding 66 0 R /BaseFont /Courier-Bold >> endobj 29 0 obj << /Type /Font /Subtype /Type1 /Encoding 66 0 R /BaseFont /Courier >> endobj 21 0 obj << /Type /Font /Subtype /Type1 /Encoding 66 0 R /BaseFont /Times-Italic >> endobj 13 0 obj << /Type /Font /Subtype /Type1 /Encoding 66 0 R /BaseFont /Times-Bold >> endobj 10 0 obj << /Type /Font /Subtype /Type1 /Encoding 66 0 R /BaseFont /Times-Roman >> endobj 8 0 obj << /Type /Font /Subtype /Type1 /Encoding 66 0 R /BaseFont /Helvetica-Oblique >> endobj 6 0 obj << /Type /Font /Subtype /Type1 /Encoding 66 0 R /BaseFont /Helvetica >> endobj 22 0 obj << /Type /Pages /Count 5 /Kids [2 0 R 26 0 R 36 0 R 40 0 R 62 0 R] >> endobj 67 0 obj << /Type /Outlines /First 44 0 R /Last 59 0 R /Count 5 >> endobj 59 0 obj << /Title 60 0 R /A 58 0 R /Parent 67 0 R /Prev 56 0 R >> endobj 56 0 obj << /Title 57 0 R /A 55 0 R /Parent 67 0 R /Prev 53 0 R /Next 59 0 R >> endobj 53 0 obj 
<< /Title 54 0 R /A 52 0 R /Parent 67 0 R /Prev 50 0 R /Next 56 0 R >> endobj 50 0 obj << /Title 51 0 R /A 49 0 R /Parent 67 0 R /Prev 44 0 R /Next 53 0 R >> endobj 47 0 obj << /Title 48 0 R /A 46 0 R /Parent 44 0 R >> endobj 44 0 obj << /Title 45 0 R /A 43 0 R /Parent 67 0 R /Next 50 0 R /First 47 0 R /Last 47 0 R /Count -1 >> endobj 68 0 obj << /Names [(page001) 4 0 R (page002) 23 0 R (page003) 38 0 R (page004) 42 0 R (page005) 24 0 R] /Limits [(page001) (page005)] >> endobj 69 0 obj << /Kids [68 0 R] >> endobj 70 0 obj << /Dests 69 0 R >> endobj 71 0 obj << /Type /Catalog /Pages 22 0 R /Outlines 67 0 R /Names 70 0 R /PageMode /UseOutlines /PTEX.Fullbanner (This is pdfTeX, Version 3.14159-1.10b) >> endobj 72 0 obj << /Producer (pdfTeX-1.10b) /Author (Zope Corporation) /Title (ZODB Storage API) /Creator (TeX) /CreationDate (D:20040620214400) >> endobj xref 0 73 0000000005 65535 f 0000003156 00000 n 0000002020 00000 n 0000000009 00000 n 0000003103 00000 n 0000000007 00000 f 0000022505 00000 n 0000000009 00000 f 0000022410 00000 n 0000000012 00000 f 0000022320 00000 n 0000002186 00000 n 0000000020 00000 f 0000022231 00000 n 0000002309 00000 n 0000002440 00000 n 0000002576 00000 n 0000002708 00000 n 0000002839 00000 n 0000002971 00000 n 0000000028 00000 f 0000022140 00000 n 0000022592 00000 n 0000006610 00000 n 0000016685 00000 n 0000006665 00000 n 0000006502 00000 n 0000003271 00000 n 0000000030 00000 f 0000022054 00000 n 0000000000 00000 f 0000021963 00000 n 0000019927 00000 n 0000018525 00000 n 0000019770 00000 n 0000010018 00000 n 0000009855 00000 n 0000006806 00000 n 0000009963 00000 n 0000012984 00000 n 0000012821 00000 n 0000010147 00000 n 0000012929 00000 n 0000013113 00000 n 0000023148 00000 n 0000013157 00000 n 0000013186 00000 n 0000023087 00000 n 0000013230 00000 n 0000013261 00000 n 0000023000 00000 n 0000013305 00000 n 0000013343 00000 n 0000022913 00000 n 0000013387 00000 n 0000013439 00000 n 0000022826 00000 n 0000013483 00000 n 0000013534 00000 n 0000022752 00000 n 0000013578 00000 n 0000016740 00000 n 0000016577 00000 n 0000013628 00000 n 0000016869 00000 n 0000020130 00000 n 0000020153 00000 n 0000022678 00000 n 0000023259 00000 n 0000023404 00000 n 0000023441 00000 n 0000023477 00000 n 0000023639 00000 n trailer << /Size 73 /Root 71 0 R /Info 72 0 R >> startxref 23789 %%EOF ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/doc/zeo-client-cache-tracing.txt000066400000000000000000000166131230730566700251640ustar00rootroot00000000000000ZEO Client Cache Tracing ======================== An important question for ZEO users is: how large should the ZEO client cache be? ZEO 2 (as of ZEO 2.0b2) has a new feature that lets you collect a trace of cache activity and tools to analyze this trace, enabling you to make an informed decision about the cache size. Don't confuse the ZEO client cache with the Zope object cache. The ZEO client cache is only used when an object is not in the Zope object cache; the ZEO client cache avoids roundtrips to the ZEO server. Enabling Cache Tracing ---------------------- To enable cache tracing, you must use a persistent cache (specify a ``client`` name), and set the environment variable ZEO_CACHE_TRACE to a non-empty value. The path to the trace file is derived from the path to the persistent cache file by appending ".trace". If the file doesn't exist, ZEO will try to create it. If the file does exist, it's opened for appending (previous trace information is not overwritten). If there are problems with the file, a warning message is logged. 
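For example, tracing could be switched on like this before the storage is opened; this is only a minimal sketch, and the server address, client name, and directory are illustrative, not prescribed values::

    import os
    from ZEO.ClientStorage import ClientStorage

    # Tracing requires a persistent cache, so a ``client`` name is given.
    # The trace is then written next to the cache file, with ".trace" appended.
    os.environ['ZEO_CACHE_TRACE'] = '1'
    storage = ClientStorage(('localhost', 8090), client='8881', var='/var/tmp')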
To start or stop tracing, the ZEO client process (typically a Zope application server) must be restarted. The trace file can grow pretty quickly; on a moderately loaded server, we observed it growing by 7 MB per hour. The file consists of binary records, each 34 bytes long if 8-byte oids are in use; a detailed description of the record lay-out is given in stats.py. No sensitive data is logged: data record sizes (but not data records), and binary object and transaction ids are logged, but no object pickles, object types or names, user names, transaction comments, access paths, or machine information (such as machine name or IP address) are logged. Analyzing a Cache Trace ----------------------- The stats.py command-line tool is the first-line tool to analyze a cache trace. Its default output consists of two parts: a one-line summary of essential statistics for each segment of 15 minutes, interspersed with lines indicating client restarts, followed by a more detailed summary of overall statistics. The most important statistic is the "hit rate", a percentage indicating how many requests to load an object could be satisfied from the cache. Hit rates around 70% are good. 90% is excellent. If you see a hit rate under 60% you can probably improve the cache performance (and hence your Zope application server's performance) by increasing the ZEO cache size. This is normally configured using key ``cache_size`` in the ``zeoclient`` section of your configuration file. The default cache size is 20 MB, which is small. The stats.py tool shows its command line syntax when invoked without arguments. The tracefile argument can be a gzipped file if it has a .gz extension. It will be read from stdin (assuming uncompressed data) if the tracefile argument is '-'. Simulating Different Cache Sizes -------------------------------- Based on a cache trace file, you can make a prediction of how well the cache might do with a different cache size. The simul.py tool runs a simulation of the ZEO client cache implementation based upon the events read from a trace file. A new simulation is started each time the trace file records a client restart event; if a trace file contains more than one restart event, a separate line is printed for each simulation, and a line with overall statistics is added at the end. Example, assuming the trace file is in /tmp/cachetrace.log:: $ python simul.py -s 4 /tmp/cachetrace.log CircularCacheSimulation, cache size 4,194,304 bytes START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE Jul 22 22:22 39:09 3218856 1429329 24046 41517 44.4% 40776 99.8 This shows that with a 4 MB cache size, the cache hit rate is 44.4%, the percentage 1429329 (number of cache hits) is of 3218856 (number of load requests). The cache simulated 40776 evictions, to make room for new object states. At the end, 99.8% of the bytes reserved for the cache file were in use to hold object state (the remaining 0.2% consists of "holes", bytes freed by object eviction and not yet reused to hold another object's state). Let's try this again with an 8 MB cache:: $ python simul.py -s 8 /tmp/cachetrace.log CircularCacheSimulation, cache size 8,388,608 bytes START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE Jul 22 22:22 39:09 3218856 2182722 31315 41517 67.8% 40016 100.0 That's a huge improvement in hit rate, which isn't surprising since these are very small cache sizes. 
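The HITRATE column is simply HITS divided by LOADS; a quick sanity check of the two runs above::

    # Hit rate = cache hits / load requests, using the simul.py numbers above.
    for cache_mb, loads, hits in [(4, 3218856, 1429329),
                                  (8, 3218856, 2182722)]:
        print "%d MB cache: %.1f%% hit rate" % (cache_mb, 100.0 * hits / loads)
    # 4 MB cache: 44.4% hit rate
    # 8 MB cache: 67.8% hit rate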
The default cache size is 20 MB, which is still on the small side:: $ python simul.py /tmp/cachetrace.log CircularCacheSimulation, cache size 20,971,520 bytes START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE Jul 22 22:22 39:09 3218856 2982589 37922 41517 92.7% 37761 99.9 Again a very nice improvement in hit rate, and there's not a lot of room left for improvement. Let's try 100 MB:: $ python simul.py -s 100 /tmp/cachetrace.log CircularCacheSimulation, cache size 104,857,600 bytes START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE Jul 22 22:22 39:09 3218856 3218741 39572 41517 100.0% 22778 100.0 It's very unusual to see a hit rate so high. The application here frequently modified a very large BTree, so given enough cache space to hold the entire BTree it rarely needed to ask the ZEO server for data: this application reused the same objects over and over. More typical is that a substantial number of objects will be referenced only once. Whenever an object turns out to be loaded only once, it's a pure loss for the cache: the first (and only) load is a cache miss; storing the object evicts other objects, possibly causing more cache misses; and the object is never loaded again. If, for example, a third of the objects are loaded only once, it's quite possible for the theoretical maximum hit rate to be 67%, no matter how large the cache. The simul.py script also contains code to simulate different cache strategies. Since none of these are implemented, and only the default cache strategy's code has been updated to be aware of MVCC, these are not further documented here. Simulation Limitations ---------------------- The cache simulation is an approximation, and actual hit rate may be higher or lower than the simulated result. These are some factors that inhibit exact simulation: - The simulator doesn't try to emulate versions. If the trace file contains loads and stores of objects in versions, the simulator treats them as if they were loads and stores of non-version data. - Each time a load of an object O in the trace file was a cache hit, but the simulated cache has evicted O, the simulated cache has no way to repair its knowledge about O. This is more frequent when simulating caches smaller than the cache used to produce the trace file. When a real cache suffers a cache miss, it asks the ZEO server for the needed information about O, and saves O in the client cache. The simulated cache doesn't have a ZEO server to ask, and O continues to be absent in the simulated cache. Further requests for O will continue to be simulated cache misses, although in a real cache they'll likely be cache hits. On the other hand, the simulated cache doesn't need to evict any objects to make room for O, so it may enjoy further cache hits on objects a real cache would have evicted. ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/doc/zeo-client-cache.txt000066400000000000000000000036021230730566700235310ustar00rootroot00000000000000ZEO Client Cache The client cache provides a disk based cache for each ZEO client. The client cache allows reads to be done from local disk rather than by remote access to the storage server. The cache may be persistent or transient. If the cache is persistent, then the cache file is retained for use after process restarts. A non- persistent cache uses a temporary file. The client cache is managed in a single file, of the specified size. 
The life of the cache is as follows: - The cache file is opened (if it already exists), or created and set to the specified size. - Cache records are written to the cache file, as transactions commit locally, and as data are loaded from the server. - Writes are to "the current file position". This is a pointer that travels around the file, circularly. After a record is written, the pointer advances to just beyond it. Objects starting at the current file position are evicted, as needed, to make room for the next record written. A distinct index file is not created, although indexing structures are maintained in memory while a ClientStorage is running. When a persistent client cache file is reopened, these indexing structures are recreated by analyzing the file contents. Persistent cache files are created in the directory named in the ``var`` argument to the ClientStorage, or if ``var`` is None, in the current working directory. Persistent cache files have names of the form:: client-storage.zec where: client -- the client name, as given by the ClientStorage's ``client`` argument storage -- the storage name, as given by the ClientStorage's ``storage`` argument; this is typically a string denoting a small integer, "1" by default. For example, the cache file for client '8881' and storage 'spam' is named "8881-spam.zec".
ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/doc/zeo.txt000066400000000000000000000403341230730566700212170ustar00rootroot00000000000000========================== Running a ZEO Server HOWTO ========================== Introduction ------------ ZEO (Zope Enterprise Objects) is a client-server system for sharing a single storage among many clients. Normally, a ZODB storage can only be used by a single process. When you use ZEO, the storage is opened in the ZEO server process. Client programs connect to this process using a ZEO ClientStorage. ZEO provides a consistent view of the database to all clients. The ZEO client and server communicate using a custom RPC protocol layered on top of TCP. There are several configuration options that affect the behavior of a ZEO server. This section describes how a few of these features work. Subsequent sections describe how to configure every option. Client cache ~~~~~~~~~~~~ Each ZEO client keeps an on-disk cache of recently used objects to avoid fetching those objects from the server each time they are requested. It is usually faster to read the objects from disk than it is to fetch them over the network. The cache can also provide read-only copies of objects during server outages. The cache may be persistent or transient. If the cache is persistent, then the cache files are retained for use after process restarts. A non-persistent cache uses temporary files that are removed when the client storage is closed. The client cache size is configured when the ClientStorage is created. The default size is 20MB, but the right size depends entirely on the particular database. Setting the cache size too small can hurt performance, but in most cases making it too big just wastes disk space. The document "Client cache tracing" describes how to collect a cache trace that can be used to determine a good cache size. ZEO uses invalidations for cache consistency. Every time an object is modified, the server sends a message to each client informing it of the change. The client will discard the object from its cache when it receives an invalidation. These invalidations are often batched.
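As noted above, the cache size is fixed when the ClientStorage is constructed. A minimal sketch, assuming the constructor's ``cache_size`` keyword is the Python spelling of the ``cache-size`` option described later, and with a made-up server address::

    from ZEO.ClientStorage import ClientStorage

    # Ask for a 100 MB client cache instead of the 20 MB default.
    storage = ClientStorage(('zeo.example.com', 8090),
                            cache_size=100 * 1024 * 1024)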
Each time a client connects to a server, it must verify that its cache contents are still valid. (It did not receive any invalidation messages while it was disconnected.) There are several mechanisms used to perform cache verification. In the worst case, the client sends the server a list of all objects in its cache along with their timestamps; the server sends back an invalidation message for each stale object. The cost of verification is one drawback to making the cache too large. Note that every time a client crashes or disconnects, it must verify its cache. Every time a server crashes, all of its clients must verify their caches. The cache verification process is optimized in two ways to eliminate costs when restarting clients and servers. Each client keeps the timestamp of the last invalidation message it has seen. When it connects to the server, it checks to see if any invalidation messages were sent after that timestamp. If not, then the cache is up-to-date and no further verification occurs. The other optimization is the invalidation queue, described below. Invalidation queue ~~~~~~~~~~~~~~~~~~ The ZEO server keeps a queue of recent invalidation messages in memory. When a client connects to the server, it sends the timestamp of the most recent invalidation message it has received. If that message is still in the invalidation queue, then the server sends the client all the missing invalidations. This is often cheaper than performing full cache verification. The default size of the invalidation queue is 100. If the invalidation queue is larger, it will be more likely that a client that reconnects will be able to verify its cache using the queue. On the other hand, a large queue uses more memory on the server to store the messages. Invalidation messages tend to be small, perhaps a few hundred bytes each on average; it depends on the number of objects modified by a transaction. Transaction timeouts ~~~~~~~~~~~~~~~~~~~~ A ZEO server can be configured to time out a transaction if it takes too long to complete. Only a single transaction can commit at a time; so if one transaction takes too long, all other clients will be delayed waiting for it. In the extreme, a client can hang during the commit process. If the client hangs, the server will be unable to commit other transactions until it restarts. A well-behaved client will not hang, but the server can be configured with a transaction timeout to guard against bugs that cause a client to hang. If any transaction exceeds the timeout threshold, the client's connection to the server will be closed and the transaction aborted. Once the transaction is aborted, the server can start processing other clients' requests. Most transactions should take very little time to commit. The timer begins for a transaction after all the data has been sent to the server. At this point, the cost of commit should be dominated by the cost of writing data to disk; it should be unusual for a commit to take longer than 1 second. A transaction timeout of 30 seconds should tolerate heavy load and slow communications between client and server, while guarding against hung servers. When a transaction times out, the client can be left in an awkward position. If the timeout occurs during the second phase of the two phase commit, the client will log a panic message. This should only cause problems if the client transaction involved multiple storages. If it did, it is possible that some storages committed the client changes and others did not.
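The queue behaviour described above can be pictured with a small sketch. This is only an illustration of the idea (a bounded list of per-transaction invalidations), not ZEO's actual implementation::

    class InvalidationQueue:
        """Toy model of the server's invalidation queue; not ZEO's real code."""

        def __init__(self, size=100):
            self.size = size
            self.entries = []               # (tid, oids) pairs, oldest first

        def note(self, tid, oids):
            # Record the objects invalidated by a committed transaction.
            self.entries.append((tid, oids))
            del self.entries[:-self.size]   # keep only the newest `size` entries

        def since(self, last_tid):
            # A reconnecting client sends the id of the last invalidation it
            # saw.  If that id is still queued, return what the client missed;
            # otherwise return None, meaning full cache verification is needed.
            tids = [tid for tid, _ in self.entries]
            if last_tid not in tids:
                return None
            return self.entries[tids.index(last_tid) + 1:]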
Connection management ~~~~~~~~~~~~~~~~~~~~~ A ZEO client manages its connection to the ZEO server. If it loses the connection, it attempts to reconnect. While it is disconnected, it can satisfy some reads by using its cache. The client can be configured to wait for a connection when it is created or to return immediately and provide data from its persistent cache. It usually simplifies programming to have the client wait for a connection on startup. When the client is disconnected, it polls periodically to see if the server is available. The rate at which it polls is configurable. The client can be configured with multiple server addresses. In this case, it assumes that each server has identical content and will use any server that is available. It is possible to configure the client to accept a read-only connection to one of these servers if no read-write connection is available. If it has a read-only connection, it will continue to poll for a read-write connection. This feature supports the Zope Replication Services product, http://www.zope.com/Products/ZopeProducts/ZRS. In general, it could be used with a system that arranges to provide hot backups of servers in the case of failure. If a single address resolves to multiple IPv4 or IPv6 addresses, the client will connect to an arbitrary one of these addresses. Authentication ~~~~~~~~~~~~~~ ZEO supports optional authentication of client and server using a password scheme similar to HTTP digest authentication (RFC 2069). It is a simple challenge-response protocol that does not send passwords in the clear, but does not offer strong security. The RFC discusses many of the limitations of this kind of protocol. Note that this feature provides authentication only. It does not provide encryption or confidentiality. The challenge-response also produces a session key that is used to generate message authentication codes for each ZEO message. This should prevent session hijacking. Guard the password database as if it contained plaintext passwords. It stores the hash of a username and password. This does not expose the plaintext password, but it is sensitive nonetheless. An attacker with the hash can impersonate the real user. This is a limitation of the simple digest scheme. The authentication framework allows third-party developers to provide new authentication modules. Installing software ------------------- ZEO is distributed as part of the ZODB3 package and with Zope, starting with Zope 2.7. You can download it from http://pypi.python.org/pypi/ZODB3. Configuring server ------------------ The script runzeo.py runs the ZEO server. The server can be configured using command-line arguments or a config file. This document only describes the config file. Run runzeo.py -h to see the list of command-line arguments. The runzeo.py script imports the ZEO package. ZEO must either be installed in Python's site-packages directory or be in a directory on PYTHONPATH. The configuration file specifies the underlying storage the server uses, the address it binds, and a few other optional parameters. An example is:: <zeo> address zeo.example.com:8090 monitor-address zeo.example.com:8091 </zeo> <filestorage> path /var/tmp/Data.fs </filestorage> <eventlog> <logfile> path /var/tmp/zeo.log format %(asctime)s %(message)s </logfile> </eventlog> This file configures a server to use a FileStorage from /var/tmp/Data.fs. The server listens on port 8090 of zeo.example.com. It also starts a monitor server that listens on port 8091. The ZEO server writes its log file to /var/tmp/zeo.log and uses a custom format for each line.
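The ``format`` value is a standard Python logging format string. A small sketch of the kind of line it produces (the logger name and message are invented for the example)::

    import logging
    import sys

    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(logging.Formatter('%(asctime)s %(message)s'))
    logger = logging.getLogger('ZEO.demo')
    logger.addHandler(handler)
    logger.warning('a sample event log line')
    # prints something like: 2004-06-20 21:44:00,123 a sample event log line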
Assuming the example configuration is stored in zeo.config, you can run a server by typing:: python /usr/local/bin/runzeo.py -C zeo.config A configuration file consists of a <zeo> section and a storage section, where the storage section can use any of the valid ZODB storage types. It may also contain an eventlog configuration. See the document "Configuring a ZODB database" for more information about configuring storages and eventlogs. The zeo section must list the address. All the other keys are optional. address The address at which the server should listen. This can be in the form 'host:port' to signify a TCP/IP connection or a pathname string to signify a Unix domain socket connection (at least one '/' is required). A hostname may be a DNS name or a dotted IP address. If the hostname is omitted, the platform's default behavior is used when binding the listening socket ('' is passed to socket.bind() as the hostname portion of the address). read-only Flag indicating whether the server should operate in read-only mode. Defaults to false. Note that even if the server is operating in writable mode, individual storages may still be read-only. But if the server is in read-only mode, no write operations are allowed, even if the storages are writable. Note that pack() is considered a read-only operation. invalidation-queue-size The storage server keeps a queue of the objects modified by the last N transactions, where N == invalidation_queue_size. This queue is used to speed client cache verification when a client disconnects for a short period of time. monitor-address The address at which the monitor server should listen. If specified, a monitor server is started. The monitor server provides server statistics in a simple text format. This can be in the form 'host:port' to signify a TCP/IP connection or a pathname string to signify a Unix domain socket connection (at least one '/' is required). A hostname may be a DNS name or a dotted IP address. If the hostname is omitted, the platform's default behavior is used when binding the listening socket ('' is passed to socket.bind() as the hostname portion of the address). transaction-timeout The maximum amount of time to wait for a transaction to commit after acquiring the storage lock, specified in seconds. If the transaction takes too long, the client connection will be closed and the transaction aborted. authentication-protocol The name of the protocol used for authentication. The only protocol provided with ZEO is "digest," but extensions may provide other protocols. authentication-database The path of the database containing authentication credentials. authentication-realm The authentication realm of the server. Some authentication schemes use a realm to identify the logical set of usernames that are accepted by this server. Configuring clients ------------------- The ZEO client can also be configured using ZConfig. The ZODB.config module provides several functions for opening a storage based on its configuration. - ZODB.config.storageFromString() - ZODB.config.storageFromFile() - ZODB.config.storageFromURL() The ZEO client configuration requires the server address be specified. Everything else is optional. An example configuration is:: <zeoclient> server zeo.example.com:8090 </zeoclient> The other configuration options are listed below. storage The name of the storage that the client wants to use. If the ZEO server serves more than one storage, the client selects the storage it wants to use by name. The default name is '1', which is also the default name for the ZEO server.
cache-size The maximum size of the client cache, in bytes. name The storage name. If unspecified, the address of the server will be used as the name. client Enables persistent cache files. The string passed here is used to construct the cache filenames. If it is not specified, the client creates a temporary cache that will only be used by the current object. var The directory where persistent cache files are stored. By default cache files, if they are persistent, are stored in the current directory. min-disconnect-poll The minimum delay, in seconds, between attempts to connect to the server. Defaults to 5 seconds. max-disconnect-poll The maximum delay, in seconds, between attempts to connect to the server. Defaults to 300 seconds. wait A boolean indicating whether the constructor should wait for the client to connect to the server and verify the cache before returning. The default is true. read-only A flag indicating whether this should be a read-only storage, defaulting to false (i.e. writing is allowed by default). read-only-fallback A flag indicating whether a read-only remote storage should be acceptable as a fallback when no writable storages are available. Defaults to false. At most one of read_only and read_only_fallback should be true. realm The authentication realm of the server. Some authentication schemes use a realm to identify the logical set of usernames that are accepted by this server. A ZEO client can also be created by calling the ClientStorage constructor explicitly. For example:: from ZEO.ClientStorage import ClientStorage storage = ClientStorage(("zeo.example.com", 8090)) Running the ZEO server as a daemon ---------------------------------- In an operational setting, you will want to run the ZEO server as a daemon process that is restarted when it dies. The zdaemon package provides two tools for running daemons: zdrun.py and zdctl.py. You can find zdaemon and its documentation at http://pypi.python.org/pypi/zdaemon. Rotating log files ~~~~~~~~~~~~~~~~~~ ZEO will re-initialize its logging subsystem when it receives a SIGUSR2 signal. If you are using the standard event logger, you should first rename the log file and then send the signal to the server. The server will continue writing to the renamed log file until it receives the signal. After it receives the signal, the server will create a new file with the old name and write to it. Tools ----- There are a few scripts that may help running a ZEO server. The zeopack.py script connects to a server and packs the storage. It can be run as a cron job. The zeoup.py script attempts to connect to a ZEO server and verify that it is functioning. The zeopasswd.py script manages a ZEO server's password database. Diagnosing problems ------------------- If an exception occurs on the server, the server will log a traceback and send an exception to the client. The traceback on the client will show a ZEO protocol library as the source of the error. If you need to diagnose the problem, you will have to look in the server log for the rest of the traceback.
ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/doc/zodb-guide.txt000066400000000000000000000001741230730566700224510ustar00rootroot00000000000000The ZODB/ZEO Programming Guide has been moved into its own package (zodbguide) and published at http://docs.zope.org/zodb.
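Returning to the ZEO client configuration options above: instead of calling the ClientStorage constructor directly, a client storage can also be opened from configuration text with ZODB.config.storageFromString(). The sketch below assumes the standard ``zeoclient`` section and uses made-up values for the server address, cache name, and cache size::

    import ZODB.config

    zeoclient_config = """
    <zeoclient>
      server zeo.example.com:8090
      storage 1
      client 8881
      var /var/tmp
      cache-size 100000000
    </zeoclient>
    """
    # Opens (and, by default, waits to connect) a ClientStorage built from
    # the configuration string; equivalent to passing the same options to
    # the constructor.
    storage = ZODB.config.storageFromString(zeoclient_config)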
ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/ez_setup.py000066400000000000000000000227641230730566700213330ustar00rootroot00000000000000#!python """Bootstrap setuptools installation If you want to use setuptools in your package's setup.py, just include this file in the same directory with it, and add this to the top of your setup.py:: from ez_setup import use_setuptools use_setuptools() If you want to require a specific version of setuptools, set a download mirror, or use an alternate download directory, you can do so by supplying the appropriate options to ``use_setuptools()``. This file can also be run as a script to install or upgrade setuptools. """ import sys DEFAULT_VERSION = "0.6c9" DEFAULT_URL = "http://pypi.python.org/packages/%s/s/setuptools/" % sys.version[:3] md5_data = { 'setuptools-0.6b1-py2.3.egg': '8822caf901250d848b996b7f25c6e6ca', 'setuptools-0.6b1-py2.4.egg': 'b79a8a403e4502fbb85ee3f1941735cb', 'setuptools-0.6b2-py2.3.egg': '5657759d8a6d8fc44070a9d07272d99b', 'setuptools-0.6b2-py2.4.egg': '4996a8d169d2be661fa32a6e52e4f82a', 'setuptools-0.6b3-py2.3.egg': 'bb31c0fc7399a63579975cad9f5a0618', 'setuptools-0.6b3-py2.4.egg': '38a8c6b3d6ecd22247f179f7da669fac', 'setuptools-0.6b4-py2.3.egg': '62045a24ed4e1ebc77fe039aa4e6f7e5', 'setuptools-0.6b4-py2.4.egg': '4cb2a185d228dacffb2d17f103b3b1c4', 'setuptools-0.6c1-py2.3.egg': 'b3f2b5539d65cb7f74ad79127f1a908c', 'setuptools-0.6c1-py2.4.egg': 'b45adeda0667d2d2ffe14009364f2a4b', 'setuptools-0.6c2-py2.3.egg': 'f0064bf6aa2b7d0f3ba0b43f20817c27', 'setuptools-0.6c2-py2.4.egg': '616192eec35f47e8ea16cd6a122b7277', 'setuptools-0.6c3-py2.3.egg': 'f181fa125dfe85a259c9cd6f1d7b78fa', 'setuptools-0.6c3-py2.4.egg': 'e0ed74682c998bfb73bf803a50e7b71e', 'setuptools-0.6c3-py2.5.egg': 'abef16fdd61955514841c7c6bd98965e', 'setuptools-0.6c4-py2.3.egg': 'b0b9131acab32022bfac7f44c5d7971f', 'setuptools-0.6c4-py2.4.egg': '2a1f9656d4fbf3c97bf946c0a124e6e2', 'setuptools-0.6c4-py2.5.egg': '8f5a052e32cdb9c72bcf4b5526f28afc', 'setuptools-0.6c5-py2.3.egg': 'ee9fd80965da04f2f3e6b3576e9d8167', 'setuptools-0.6c5-py2.4.egg': 'afe2adf1c01701ee841761f5bcd8aa64', 'setuptools-0.6c5-py2.5.egg': 'a8d3f61494ccaa8714dfed37bccd3d5d', 'setuptools-0.6c6-py2.3.egg': '35686b78116a668847237b69d549ec20', 'setuptools-0.6c6-py2.4.egg': '3c56af57be3225019260a644430065ab', 'setuptools-0.6c6-py2.5.egg': 'b2f8a7520709a5b34f80946de5f02f53', 'setuptools-0.6c7-py2.3.egg': '209fdf9adc3a615e5115b725658e13e2', 'setuptools-0.6c7-py2.4.egg': '5a8f954807d46a0fb67cf1f26c55a82e', 'setuptools-0.6c7-py2.5.egg': '45d2ad28f9750e7434111fde831e8372', 'setuptools-0.6c8-py2.3.egg': '50759d29b349db8cfd807ba8303f1902', 'setuptools-0.6c8-py2.4.egg': 'cba38d74f7d483c06e9daa6070cce6de', 'setuptools-0.6c8-py2.5.egg': '1721747ee329dc150590a58b3e1ac95b', 'setuptools-0.6c9-py2.3.egg': 'a83c4020414807b496e4cfbe08507c03', 'setuptools-0.6c9-py2.4.egg': '260a2be2e5388d66bdaee06abec6342a', 'setuptools-0.6c9-py2.5.egg': 'fe67c3e5a17b12c0e7c541b7ea43a8e6', 'setuptools-0.6c9-py2.6.egg': 'ca37b1ff16fa2ede6e19383e7b59245a', } import sys, os try: from hashlib import md5 except ImportError: from md5 import md5 def _validate_md5(egg_name, data): if egg_name in md5_data: digest = md5(data).hexdigest() if digest != md5_data[egg_name]: print >>sys.stderr, ( "md5 validation of %s failed! 
(Possible download problem?)" % egg_name ) sys.exit(2) return data def use_setuptools( version=DEFAULT_VERSION, download_base=DEFAULT_URL, to_dir=os.curdir, download_delay=15 ): """Automatically find/download setuptools and make it available on sys.path `version` should be a valid setuptools version number that is available as an egg for download under the `download_base` URL (which should end with a '/'). `to_dir` is the directory where setuptools will be downloaded, if it is not already available. If `download_delay` is specified, it should be the number of seconds that will be paused before initiating a download, should one be required. If an older version of setuptools is installed, this routine will print a message to ``sys.stderr`` and raise SystemExit in an attempt to abort the calling script. """ was_imported = 'pkg_resources' in sys.modules or 'setuptools' in sys.modules def do_download(): egg = download_setuptools(version, download_base, to_dir, download_delay) sys.path.insert(0, egg) import setuptools; setuptools.bootstrap_install_from = egg try: import pkg_resources except ImportError: return do_download() try: pkg_resources.require("setuptools>="+version); return except pkg_resources.VersionConflict, e: if was_imported: print >>sys.stderr, ( "The required version of setuptools (>=%s) is not available, and\n" "can't be installed while this script is running. Please install\n" " a more recent version first, using 'easy_install -U setuptools'." "\n\n(Currently using %r)" ) % (version, e.args[0]) sys.exit(2) else: del pkg_resources, sys.modules['pkg_resources'] # reload ok return do_download() except pkg_resources.DistributionNotFound: return do_download() def download_setuptools( version=DEFAULT_VERSION, download_base=DEFAULT_URL, to_dir=os.curdir, delay = 15 ): """Download setuptools from a specified location and return its filename `version` should be a valid setuptools version number that is available as an egg for download under the `download_base` URL (which should end with a '/'). `to_dir` is the directory where the egg will be downloaded. `delay` is the number of seconds to pause before an actual download attempt. """ import urllib2, shutil egg_name = "setuptools-%s-py%s.egg" % (version,sys.version[:3]) url = download_base + egg_name saveto = os.path.join(to_dir, egg_name) src = dst = None if not os.path.exists(saveto): # Avoid repeated downloads try: from distutils import log if delay: log.warn(""" --------------------------------------------------------------------------- This script requires setuptools version %s to run (even to display help). I will attempt to download it for you (from %s), but you may need to enable firewall access for this script first. I will start the download in %d seconds. (Note: if this machine does not have network access, please obtain the file %s and place it in this directory before rerunning this script.) ---------------------------------------------------------------------------""", version, download_base, delay, url ); from time import sleep; sleep(delay) log.warn("Downloading %s", url) src = urllib2.urlopen(url) # Read/write all in one block, so we don't create a corrupt file # if the download is interrupted. 
data = _validate_md5(egg_name, src.read()) dst = open(saveto,"wb"); dst.write(data) finally: if src: src.close() if dst: dst.close() return os.path.realpath(saveto) def main(argv, version=DEFAULT_VERSION): """Install or upgrade setuptools and EasyInstall""" try: import setuptools except ImportError: egg = None try: egg = download_setuptools(version, delay=0) sys.path.insert(0,egg) from setuptools.command.easy_install import main return main(list(argv)+[egg]) # we're done here finally: if egg and os.path.exists(egg): os.unlink(egg) else: if setuptools.__version__ == '0.0.1': print >>sys.stderr, ( "You have an obsolete version of setuptools installed. Please\n" "remove it from your system entirely before rerunning this script." ) sys.exit(2) req = "setuptools>="+version import pkg_resources try: pkg_resources.require(req) except pkg_resources.VersionConflict: try: from setuptools.command.easy_install import main except ImportError: from easy_install import main main(list(argv)+[download_setuptools(delay=0)]) sys.exit(0) # try to force an exit else: if argv: from setuptools.command.easy_install import main main(argv) else: print "Setuptools version",version,"or greater has been installed." print '(Run "ez_setup.py -U setuptools" to reinstall or upgrade.)' def update_md5(filenames): """Update our built-in md5 registry""" import re for name in filenames: base = os.path.basename(name) f = open(name,'rb') md5_data[base] = md5(f.read()).hexdigest() f.close() data = [" %r: %r,\n" % it for it in md5_data.items()] data.sort() repl = "".join(data) import inspect srcfile = inspect.getsourcefile(sys.modules[__name__]) f = open(srcfile, 'rb'); src = f.read(); f.close() match = re.search("\nmd5_data = {\n([^}]+)}", src) if not match: print >>sys.stderr, "Internal error!" sys.exit(2) src = src[:match.start(1)] + repl + src[match.end(1):] f = open(srcfile,'w') f.write(src) f.close() if __name__=='__main__': if len(sys.argv)>2 and sys.argv[1]=='--md5update': update_md5(sys.argv[2:]) else: main(sys.argv[1:]) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/log.ini000066400000000000000000000011121230730566700203650ustar00rootroot00000000000000# This file configures the logging module for the test harness: # critical errors are logged to testing.log; everything else is # ignored. # Documentation for the file format is at # http://www.red-dove.com/python_logging.html#config [logger_root] level=CRITICAL handlers=normal [handler_normal] class=FileHandler level=NOTSET formatter=common args=('testing.log', 'a') filename=testing.log mode=a [formatter_common] format=------ %(asctime)s %(levelname)s %(name)s %(message)s datefmt=%Y-%m-%dT%H:%M:%S [loggers] keys=root [handlers] keys=normal [formatters] keys=common ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/release.py000066400000000000000000000044611230730566700211070ustar00rootroot00000000000000#! /usr/bin/env python """Update version numbers and release dates for the next release. usage: release.py version date version should be a string like "3.2.0c1" date should be a string like "23-Sep-2003" The following files are updated: - setup.py - NEWS.txt - doc/guide/zodb.tex - src/ZEO/__init__.py - src/ZEO/version.txt - src/ZODB/__init__.py """ import fileinput import os import re # In file filename, replace the first occurrence of regexp pat with # string repl. 
def replace(filename, pat, repl): from sys import stderr as e # fileinput hijacks sys.stdout foundone = False for line in fileinput.input([filename], inplace=True, backup="~"): if foundone: print line, else: match = re.search(pat, line) if match is not None: foundone = True new = re.sub(pat, repl, line) print new, print >> e, "In %s, replaced:" % filename print >> e, " ", repr(line) print >> e, " ", repr(new) else: print line, if not foundone: print >> e, "*" * 60, "Oops!" print >> e, " Failed to find %r in %r" % (pat, filename) # Nothing in our codebase cares about ZEO/version.txt. Jeremy said # someone asked for it so that a shell script could read up the ZEO # version easily. # Before ZODB 3.4, the ZEO version was one smaller than the ZODB version; # e.g., ZEO 2.2.7 shipped with ZODB 3.2.7. Now ZEO and ZODB share their # version number. def write_zeoversion(path, version): f = open(path, "w") print >> f, version f.close() def main(args): version, date = args replace("setup.py", r'^VERSION = "\S+"$', 'VERSION = "%s"' % version) replace("src/ZODB/__init__.py", r'__version__ = "\S+"', '__version__ = "%s"' % version) replace("src/ZEO/__init__.py", r'version = "\S+"', 'version = "%s"' % version) write_zeoversion("src/ZEO/version.txt", version) replace("NEWS.txt", r"^Release date: .*", "Release date: %s" % date) replace("doc/guide/zodb.tex", r"release{\S+}", "release{%s}" % version) if __name__ == "__main__": import sys main(sys.argv[1:]) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/setup.py000066400000000000000000000167741230730566700206410ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2002, 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Zope Object Database: object database and persistence The Zope Object Database provides an object-oriented database for Python that provides a high-degree of transparency. Applications can take advantage of object database features with few, if any, changes to application logic. ZODB includes features such as a plugable storage interface, rich transaction support, and undo. """ VERSION = "3.10dev" from ez_setup import use_setuptools use_setuptools() from setuptools import setup, find_packages from setuptools.extension import Extension import os import sys if sys.version_info < (2, 5): print "This version of ZODB requires Python 2.5 or higher" sys.exit(0) if sys.version_info < (2, 6): transaction_version = 'transaction == 1.1.1' manuel_version = 'manuel < 1.6dev' else: transaction_version = 'transaction >= 1.1.0' manuel_version = 'manuel' # The (non-obvious!) 
choices for the Trove Development Status line: # Development Status :: 5 - Production/Stable # Development Status :: 4 - Beta # Development Status :: 3 - Alpha classifiers = """\ Intended Audience :: Developers License :: OSI Approved :: Zope Public License Programming Language :: Python Topic :: Database Topic :: Software Development :: Libraries :: Python Modules Operating System :: Microsoft :: Windows Operating System :: Unix Framework :: ZODB """ # Include directories for C extensions include = ['src'] # Set up dependencies for the BTrees package base_btrees_depends = [ "src/BTrees/BTreeItemsTemplate.c", "src/BTrees/BTreeModuleTemplate.c", "src/BTrees/BTreeTemplate.c", "src/BTrees/BucketTemplate.c", "src/BTrees/MergeTemplate.c", "src/BTrees/SetOpTemplate.c", "src/BTrees/SetTemplate.c", "src/BTrees/TreeSetTemplate.c", "src/BTrees/sorters.c", "src/persistent/cPersistence.h", ] _flavors = {"O": "object", "I": "int", "F": "float", 'L': 'int'} KEY_H = "src/BTrees/%skeymacros.h" VALUE_H = "src/BTrees/%svaluemacros.h" def BTreeExtension(flavor): key = flavor[0] value = flavor[1] name = "BTrees._%sBTree" % flavor sources = ["src/BTrees/_%sBTree.c" % flavor] kwargs = {"include_dirs": include} if flavor != "fs": kwargs["depends"] = (base_btrees_depends + [KEY_H % _flavors[key], VALUE_H % _flavors[value]]) else: kwargs["depends"] = base_btrees_depends if key != "O": kwargs["define_macros"] = [('EXCLUDE_INTSET_SUPPORT', None)] return Extension(name, sources, **kwargs) exts = [BTreeExtension(flavor) for flavor in ("OO", "IO", "OI", "II", "IF", "fs", "LO", "OL", "LL", "LF", )] cPersistence = Extension(name = 'persistent.cPersistence', include_dirs = include, sources= ['src/persistent/cPersistence.c', 'src/persistent/ring.c'], depends = ['src/persistent/cPersistence.h', 'src/persistent/ring.h', 'src/persistent/ring.c'] ) cPickleCache = Extension(name = 'persistent.cPickleCache', include_dirs = include, sources= ['src/persistent/cPickleCache.c', 'src/persistent/ring.c'], depends = ['src/persistent/cPersistence.h', 'src/persistent/ring.h', 'src/persistent/ring.c'] ) TimeStamp = Extension(name = 'persistent.TimeStamp', include_dirs = include, sources= ['src/persistent/TimeStamp.c'] ) exts += [cPersistence, cPickleCache, TimeStamp, ] def _modname(path, base, name=''): if path == base: return name dirname, basename = os.path.split(path) return _modname(dirname, base, basename + '.' + name) def alltests(): import logging import pkg_resources import unittest import ZEO.ClientStorage class NullHandler(logging.Handler): level = 50 def emit(self, record): pass logging.getLogger().addHandler(NullHandler()) suite = unittest.TestSuite() base = pkg_resources.working_set.find( pkg_resources.Requirement.parse('ZODB3')).location for dirpath, dirnames, filenames in os.walk(base): if os.path.basename(dirpath) == 'tests': for filename in filenames: if filename != 'testZEO.py': continue if filename.endswith('.py') and filename.startswith('test'): mod = __import__( _modname(dirpath, base, os.path.splitext(filename)[0]), {}, {}, ['*']) suite.addTest(mod.test_suite()) elif 'tests.py' in filenames: continue mod = __import__(_modname(dirpath, base, 'tests'), {}, {}, ['*']) suite.addTest(mod.test_suite()) return suite doclines = __doc__.split("\n") def read_file(*path): base_dir = os.path.dirname(__file__) file_path = (base_dir, ) + tuple(path) return file(os.path.join(*file_path)).read() long_description = str( ("\n".join(doclines[2:]) + "\n\n" + ".. 
contents::\n\n" + read_file("README.txt") + "\n\n" + read_file("src", "CHANGES.txt") ).decode('latin-1').replace(u'L\xf6wis', '|Lowis|') )+ '''\n\n.. |Lowis| unicode:: L \\xf6 wis\n''' setup(name="ZODB3", version=VERSION, maintainer="Zope Foundation and Contributors", maintainer_email="zodb-dev@zope.org", packages = find_packages('src'), package_dir = {'': 'src'}, ext_modules = exts, headers = ['src/persistent/cPersistence.h', 'src/persistent/py24compat.h', 'src/persistent/ring.h'], license = "ZPL 2.1", platforms = ["any"], description = doclines[0], classifiers = filter(None, classifiers.split("\n")), long_description = long_description, test_suite="__main__.alltests", # to support "setup.py test" tests_require = ['zope.testing', manuel_version], extras_require = dict(test=['zope.testing', manuel_version]), install_requires = [ transaction_version, 'zc.lockfile', 'ZConfig', 'zdaemon', 'zope.event', 'zope.interface', ], zip_safe = False, entry_points = """ [console_scripts] fsdump = ZODB.FileStorage.fsdump:main fsoids = ZODB.scripts.fsoids:main fsrefs = ZODB.scripts.fsrefs:main fstail = ZODB.scripts.fstail:Main repozo = ZODB.scripts.repozo:main zeopack = ZEO.scripts.zeopack:main runzeo = ZEO.runzeo:main zeopasswd = ZEO.zeopasswd:main zeoctl = ZEO.zeoctl:main """, include_package_data = True, ) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/000077500000000000000000000000001230730566700176775ustar00rootroot00000000000000ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/000077500000000000000000000000001230730566700210635ustar00rootroot00000000000000ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/BTreeItemsTemplate.c000066400000000000000000000477161230730566700247450ustar00rootroot00000000000000/***************************************************************************** Copyright (c) 2001, 2002 Zope Foundation and Contributors. All Rights Reserved. This software is subject to the provisions of the Zope Public License, Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE ****************************************************************************/ #define BTREEITEMSTEMPLATE_C "$Id$\n" /* A BTreeItems struct is returned from calling .items(), .keys() or * .values() on a BTree-based data structure, and is also the result of * taking slices of those. It represents a contiguous slice of a BTree. * * The start of the slice is in firstbucket, at offset first. The end of * the slice is in lastbucket, at offset last. Both endpoints are inclusive. * It must possible to get from firstbucket to lastbucket via following * bucket 'next' pointers zero or more times. firstbucket, first, lastbucket, * and last are readonly after initialization. An empty slice is represented * by firstbucket == lastbucket == currentbucket == NULL. * * 'kind' determines whether this slice represents 'k'eys alone, 'v'alues * alone, or 'i'items (key+value pairs). 'kind' is also readonly after * initialization. * * The combination of currentbucket, currentoffset and pseudoindex acts as * a search finger. Offset currentoffset in bucket currentbucket is at index * pseudoindex, where pseudoindex==0 corresponds to offset first in bucket * firstbucket, and pseudoindex==-1 corresponds to offset last in bucket * lastbucket. 
The function BTreeItems_seek() can be used to set this combo * correctly for any in-bounds index, and uses this combo on input to avoid * needing to search from the start (or end) on each call. Calling * BTreeItems_seek() with consecutive larger positions is very efficent. * Calling it with consecutive smaller positions is more efficient than if * a search finger weren't being used at all, but is still quadratic time * in the number of buckets in the slice. */ typedef struct { PyObject_HEAD Bucket *firstbucket; /* First bucket */ Bucket *currentbucket; /* Current bucket (search finger) */ Bucket *lastbucket; /* Last bucket */ int currentoffset; /* Offset in currentbucket */ int pseudoindex; /* search finger index */ int first; /* Start offset in firstbucket */ int last; /* End offset in lastbucket */ char kind; /* 'k', 'v', 'i' */ } BTreeItems; #define ITEMS(O)((BTreeItems*)(O)) static PyObject * newBTreeItems(char kind, Bucket *lowbucket, int lowoffset, Bucket *highbucket, int highoffset); static void BTreeItems_dealloc(BTreeItems *self) { Py_XDECREF(self->firstbucket); Py_XDECREF(self->lastbucket); Py_XDECREF(self->currentbucket); PyObject_DEL(self); } static Py_ssize_t BTreeItems_length_or_nonzero(BTreeItems *self, int nonzero) { Py_ssize_t r; Bucket *b, *next; b = self->firstbucket; if (b == NULL) return 0; r = self->last + 1 - self->first; if (nonzero && r > 0) /* Short-circuit if all we care about is nonempty */ return 1; if (b == self->lastbucket) return r; Py_INCREF(b); PER_USE_OR_RETURN(b, -1); while ((next = b->next)) { r += b->len; if (nonzero && r > 0) /* Short-circuit if all we care about is nonempty */ break; if (next == self->lastbucket) break; /* we already counted the last bucket */ Py_INCREF(next); PER_UNUSE(b); Py_DECREF(b); b = next; PER_USE_OR_RETURN(b, -1); } PER_UNUSE(b); Py_DECREF(b); return r >= 0 ? r : 0; } static Py_ssize_t BTreeItems_length(BTreeItems *self) { return BTreeItems_length_or_nonzero(self, 0); } /* ** BTreeItems_seek ** ** Find the ith position in the BTreeItems. ** ** Arguments: self The BTree ** i the index to seek to, in 0 .. len(self)-1, or in ** -len(self) .. -1, as for indexing a Python sequence. ** ** ** Returns 0 if successful, -1 on failure to seek (like out-of-bounds). ** Upon successful return, index i is at offset self->currentoffset in bucket ** self->currentbucket. */ static int BTreeItems_seek(BTreeItems *self, Py_ssize_t i) { int delta, pseudoindex, currentoffset; Bucket *b, *currentbucket; int error; pseudoindex = self->pseudoindex; currentoffset = self->currentoffset; currentbucket = self->currentbucket; if (currentbucket == NULL) goto no_match; delta = i - pseudoindex; while (delta > 0) { /* move right */ int max; /* Want to move right delta positions; the most we can move right in * this bucket is currentbucket->len - currentoffset - 1 positions. */ PER_USE_OR_RETURN(currentbucket, -1); max = currentbucket->len - currentoffset - 1; b = currentbucket->next; PER_UNUSE(currentbucket); if (delta <= max) { currentoffset += delta; pseudoindex += delta; if (currentbucket == self->lastbucket && currentoffset > self->last) goto no_match; break; } /* Move to start of next bucket. */ if (currentbucket == self->lastbucket || b == NULL) goto no_match; currentbucket = b; pseudoindex += max + 1; delta -= max + 1; currentoffset = 0; } while (delta < 0) { /* move left */ int status; /* Want to move left -delta positions; the most we can move left in * this bucket is currentoffset positions. 
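 * For example, with currentoffset == 2 and delta == -5, only two positions
 * are available in this bucket, so the code below steps back via
 * PreviousBucket(), adjusts pseudoindex by -(currentoffset + 1), leaves
 * delta == -2, and resumes from the last offset of the preceding bucket.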
*/ if ((-delta) <= currentoffset) { currentoffset += delta; pseudoindex += delta; if (currentbucket == self->firstbucket && currentoffset < self->first) goto no_match; break; } /* Move to end of previous bucket. */ if (currentbucket == self->firstbucket) goto no_match; status = PreviousBucket(¤tbucket, self->firstbucket); if (status == 0) goto no_match; else if (status < 0) return -1; pseudoindex -= currentoffset + 1; delta += currentoffset + 1; PER_USE_OR_RETURN(currentbucket, -1); currentoffset = currentbucket->len - 1; PER_UNUSE(currentbucket); } assert(pseudoindex == i); /* Alas, the user may have mutated the bucket since the last time we * were called, and if they deleted stuff, we may be pointing into * trash memory now. */ PER_USE_OR_RETURN(currentbucket, -1); error = currentoffset < 0 || currentoffset >= currentbucket->len; PER_UNUSE(currentbucket); if (error) { PyErr_SetString(PyExc_RuntimeError, "the bucket being iterated changed size"); return -1; } Py_INCREF(currentbucket); Py_DECREF(self->currentbucket); self->currentbucket = currentbucket; self->currentoffset = currentoffset; self->pseudoindex = pseudoindex; return 0; no_match: IndexError(i); return -1; } /* Return the right kind ('k','v','i') of entry from bucket b at offset i. * b must be activated. Returns NULL on error. */ static PyObject * getBucketEntry(Bucket *b, int i, char kind) { PyObject *result = NULL; assert(b); assert(0 <= i && i < b->len); switch (kind) { case 'k': COPY_KEY_TO_OBJECT(result, b->keys[i]); break; case 'v': COPY_VALUE_TO_OBJECT(result, b->values[i]); break; case 'i': { PyObject *key; PyObject *value;; COPY_KEY_TO_OBJECT(key, b->keys[i]); if (!key) break; COPY_VALUE_TO_OBJECT(value, b->values[i]); if (!value) { Py_DECREF(key); break; } result = PyTuple_New(2); if (result) { PyTuple_SET_ITEM(result, 0, key); PyTuple_SET_ITEM(result, 1, value); } else { Py_DECREF(key); Py_DECREF(value); } break; } default: PyErr_SetString(PyExc_AssertionError, "getBucketEntry: unknown kind"); break; } return result; } /* ** BTreeItems_item ** ** Arguments: self a BTreeItems structure ** i Which item to inspect ** ** Returns: the BTreeItems_item_BTree of self->kind, i ** (ie pulls the ith item out) */ static PyObject * BTreeItems_item(BTreeItems *self, Py_ssize_t i) { PyObject *result; if (BTreeItems_seek(self, i) < 0) return NULL; PER_USE_OR_RETURN(self->currentbucket, NULL); result = getBucketEntry(self->currentbucket, self->currentoffset, self->kind); PER_UNUSE(self->currentbucket); return result; } /* ** BTreeItems_slice ** ** Creates a new BTreeItems structure representing the slice ** between the low and high range ** ** Arguments: self The old BTreeItems structure ** ilow The start index ** ihigh The end index ** ** Returns: BTreeItems item */ static PyObject * BTreeItems_slice(BTreeItems *self, Py_ssize_t ilow, Py_ssize_t ihigh) { Bucket *lowbucket; Bucket *highbucket; int lowoffset; int highoffset; Py_ssize_t length = -1; /* len(self), but computed only if needed */ /* Complications: * A Python slice never raises IndexError, but BTreeItems_seek does. * Python did only part of index normalization before calling this: * ilow may be < 0 now, and ihigh may be arbitrarily large. It's * our responsibility to clip them. * A Python slice is exclusive of the high index, but a BTreeItems * struct is inclusive on both ends. */ /* First adjust ilow and ihigh to be legit endpoints in the Python * sense (ilow inclusive, ihigh exclusive). 
This block duplicates the * logic from Python's list_slice function (slicing for builtin lists). */ if (ilow < 0) ilow = 0; else { if (length < 0) length = BTreeItems_length(self); if (ilow > length) ilow = length; } if (ihigh < ilow) ihigh = ilow; else { if (length < 0) length = BTreeItems_length(self); if (ihigh > length) ihigh = length; } assert(0 <= ilow && ilow <= ihigh); assert(length < 0 || ihigh <= length); /* Now adjust for that our struct is inclusive on both ends. This is * easy *except* when the slice is empty: there's no good way to spell * that in an inclusive-on-both-ends scheme. For example, if the * slice is btree.items([:0]), ilow == ihigh == 0 at this point, and if * we were to subtract 1 from ihigh that would get interpreted by * BTreeItems_seek as meaning the *entire* set of items. Setting ilow==1 * and ihigh==0 doesn't work either, as BTreeItems_seek raises IndexError * if we attempt to seek to ilow==1 when the underlying sequence is empty. * It seems simplest to deal with empty slices as a special case here. */ if (ilow == ihigh) { /* empty slice */ lowbucket = highbucket = NULL; lowoffset = 1; highoffset = 0; } else { assert(ilow < ihigh); --ihigh; /* exclusive -> inclusive */ if (BTreeItems_seek(self, ilow) < 0) return NULL; lowbucket = self->currentbucket; lowoffset = self->currentoffset; if (BTreeItems_seek(self, ihigh) < 0) return NULL; highbucket = self->currentbucket; highoffset = self->currentoffset; } return newBTreeItems(self->kind, lowbucket, lowoffset, highbucket, highoffset); } static PySequenceMethods BTreeItems_as_sequence = { (lenfunc) BTreeItems_length, (binaryfunc)0, (ssizeargfunc)0, (ssizeargfunc) BTreeItems_item, (ssizessizeargfunc) BTreeItems_slice, }; /* Number Method items (just for nb_nonzero!) */ static int BTreeItems_nonzero(BTreeItems *self) { return BTreeItems_length_or_nonzero(self, 1); } static PyNumberMethods BTreeItems_as_number_for_nonzero = { 0,0,0,0,0,0,0,0,0,0, (inquiry)BTreeItems_nonzero}; static PyTypeObject BTreeItemsType = { PyObject_HEAD_INIT(NULL) 0, /*ob_size*/ MOD_NAME_PREFIX "BTreeItems", /*tp_name*/ sizeof(BTreeItems), /*tp_basicsize*/ 0, /*tp_itemsize*/ /* methods */ (destructor) BTreeItems_dealloc, /*tp_dealloc*/ (printfunc)0, /*tp_print*/ (getattrfunc)0, /*obsolete tp_getattr*/ (setattrfunc)0, /*obsolete tp_setattr*/ (cmpfunc)0, /*tp_compare*/ (reprfunc)0, /*tp_repr*/ &BTreeItems_as_number_for_nonzero, /*tp_as_number*/ &BTreeItems_as_sequence, /*tp_as_sequence*/ 0, /*tp_as_mapping*/ (hashfunc)0, /*tp_hash*/ (ternaryfunc)0, /*tp_call*/ (reprfunc)0, /*tp_str*/ 0, /*tp_getattro*/ 0, /*tp_setattro*/ /* Space for future expansion */ 0L,0L, "Sequence type used to iterate over BTree items." /* Documentation string */ }; /* Returns a new BTreeItems object representing the contiguous slice from * offset lowoffset in bucket lowbucket through offset highoffset in bucket * highbucket, inclusive. Pass lowbucket == NULL for an empty slice. * The currentbucket is set to lowbucket, currentoffset ot lowoffset, and * pseudoindex to 0. kind is 'k', 'v' or 'i' (see BTreeItems struct docs). */ static PyObject * newBTreeItems(char kind, Bucket *lowbucket, int lowoffset, Bucket *highbucket, int highoffset) { BTreeItems *self; UNLESS (self = PyObject_NEW(BTreeItems, &BTreeItemsType)) return NULL; self->kind=kind; self->first=lowoffset; self->last=highoffset; if (! lowbucket || ! 
highbucket || (lowbucket == highbucket && lowoffset > highoffset)) { self->firstbucket = 0; self->lastbucket = 0; self->currentbucket = 0; } else { Py_INCREF(lowbucket); self->firstbucket = lowbucket; Py_INCREF(highbucket); self->lastbucket = highbucket; Py_INCREF(lowbucket); self->currentbucket = lowbucket; } self->currentoffset = lowoffset; self->pseudoindex = 0; return OBJECT(self); } static int nextBTreeItems(SetIteration *i) { if (i->position >= 0) { if (i->position) { DECREF_KEY(i->key); DECREF_VALUE(i->value); } if (BTreeItems_seek(ITEMS(i->set), i->position) >= 0) { Bucket *currentbucket; currentbucket = BUCKET(ITEMS(i->set)->currentbucket); UNLESS(PER_USE(currentbucket)) { /* Mark iteration terminated, so that finiSetIteration doesn't * try to redundantly decref the key and value */ i->position = -1; return -1; } COPY_KEY(i->key, currentbucket->keys[ITEMS(i->set)->currentoffset]); INCREF_KEY(i->key); COPY_VALUE(i->value, currentbucket->values[ITEMS(i->set)->currentoffset]); INCREF_VALUE(i->value); i->position ++; PER_UNUSE(currentbucket); } else { i->position = -1; PyErr_Clear(); } } return 0; } static int nextTreeSetItems(SetIteration *i) { if (i->position >= 0) { if (i->position) { DECREF_KEY(i->key); } if (BTreeItems_seek(ITEMS(i->set), i->position) >= 0) { Bucket *currentbucket; currentbucket = BUCKET(ITEMS(i->set)->currentbucket); UNLESS(PER_USE(currentbucket)) { /* Mark iteration terminated, so that finiSetIteration doesn't * try to redundantly decref the key and value */ i->position = -1; return -1; } COPY_KEY(i->key, currentbucket->keys[ITEMS(i->set)->currentoffset]); INCREF_KEY(i->key); i->position ++; PER_UNUSE(currentbucket); } else { i->position = -1; PyErr_Clear(); } } return 0; } /* Support for the iteration protocol new in Python 2.2. */ static PyTypeObject BTreeIter_Type; /* The type of iterator objects, returned by e.g. iter(IIBTree()). */ typedef struct { PyObject_HEAD /* We use a BTreeItems object because it's convenient and flexible. * We abuse it two ways: * 1. We set currentbucket to NULL when the iteration is finished. * 2. We don't bother keeping pseudoindex in synch. */ BTreeItems *pitems; } BTreeIter; /* Return a new iterator object, to traverse the keys and/or values * represented by pitems. pitems must not be NULL. Returns NULL if error. */ static BTreeIter * BTreeIter_new(BTreeItems *pitems) { BTreeIter *result; assert(pitems != NULL); result = PyObject_New(BTreeIter, &BTreeIter_Type); if (result) { Py_INCREF(pitems); result->pitems = pitems; } return result; } /* The iterator's tp_dealloc slot. */ static void BTreeIter_dealloc(BTreeIter *bi) { Py_DECREF(bi->pitems); PyObject_Del(bi); } /* The implementation of the iterator's tp_iternext slot. Returns "the next" * item; returns NULL if error; returns NULL without setting an error if the * iteration is exhausted (that's the way to terminate the iteration protocol). */ static PyObject * BTreeIter_next(BTreeIter *bi, PyObject *args) { PyObject *result = NULL; /* until proven innocent */ BTreeItems *items = bi->pitems; int i = items->currentoffset; Bucket *bucket = items->currentbucket; if (bucket == NULL) /* iteration termination is sticky */ return NULL; PER_USE_OR_RETURN(bucket, NULL); if (i >= bucket->len) { /* We never leave this routine normally with i >= len: somebody * else mutated the current bucket. */ PyErr_SetString(PyExc_RuntimeError, "the bucket being iterated changed size"); /* Arrange for that this error is sticky too. 
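 * Pinning currentoffset at INT_MAX below means every later next() call
 * sees i >= bucket->len again and re-raises, rather than resuming
 * iteration over a bucket that mutated underneath us.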
*/ items->currentoffset = INT_MAX; goto Done; } /* Build the result object, from bucket at offset i. */ result = getBucketEntry(bucket, i, items->kind); /* Advance position for next call. */ if (bucket == items->lastbucket && i >= items->last) { /* Next call should terminate the iteration. */ Py_DECREF(items->currentbucket); items->currentbucket = NULL; } else { ++i; if (i >= bucket->len) { Py_XINCREF(bucket->next); items->currentbucket = bucket->next; Py_DECREF(bucket); i = 0; } items->currentoffset = i; } Done: PER_UNUSE(bucket); return result; } static PyObject * BTreeIter_getiter(PyObject *it) { Py_INCREF(it); return it; } static PyTypeObject BTreeIter_Type = { PyObject_HEAD_INIT(NULL) 0, /* ob_size */ MOD_NAME_PREFIX "-iterator", /* tp_name */ sizeof(BTreeIter), /* tp_basicsize */ 0, /* tp_itemsize */ /* methods */ (destructor)BTreeIter_dealloc, /* tp_dealloc */ 0, /* tp_print */ 0, /* tp_getattr */ 0, /* tp_setattr */ 0, /* tp_compare */ 0, /* tp_repr */ 0, /* tp_as_number */ 0, /* tp_as_sequence */ 0, /* tp_as_mapping */ 0, /* tp_hash */ 0, /* tp_call */ 0, /* tp_str */ 0, /*PyObject_GenericGetAttr,*/ /* tp_getattro */ 0, /* tp_setattro */ 0, /* tp_as_buffer */ Py_TPFLAGS_DEFAULT, /* tp_flags */ 0, /* tp_doc */ 0, /* tp_traverse */ 0, /* tp_clear */ 0, /* tp_richcompare */ 0, /* tp_weaklistoffset */ (getiterfunc)BTreeIter_getiter, /* tp_iter */ (iternextfunc)BTreeIter_next, /* tp_iternext */ 0, /* tp_methods */ 0, /* tp_members */ 0, /* tp_getset */ 0, /* tp_base */ 0, /* tp_dict */ 0, /* tp_descr_get */ 0, /* tp_descr_set */ }; ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/BTreeModuleTemplate.c000066400000000000000000000420321230730566700250730ustar00rootroot00000000000000/***************************************************************************** Copyright (c) 2001, 2002 Zope Foundation and Contributors. All Rights Reserved. This software is subject to the provisions of the Zope Public License, Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE ****************************************************************************/ #include "Python.h" /* include structmember.h for offsetof */ #include "structmember.h" #ifdef PERSISTENT #include "persistent/cPersistence.h" #else #define PER_USE_OR_RETURN(self, NULL) #define PER_ALLOW_DEACTIVATION(self) #define PER_PREVENT_DEACTIVATION(self) #define PER_DEL(self) #define PER_USE(O) 1 #define PER_ACCESSED(O) 1 #endif #include "py24compat.h" /* So sue me. This pair gets used all over the place, so much so that it * interferes with understanding non-persistence parts of algorithms. * PER_UNUSE can be used after a successul PER_USE or PER_USE_OR_RETURN. * It allows the object to become ghostified, and tells the persistence * machinery that the object's fields were used recently. */ #define PER_UNUSE(OBJ) do { \ PER_ALLOW_DEACTIVATION(OBJ); \ PER_ACCESSED(OBJ); \ } while (0) /* The tp_name slots of the various BTree types contain the fully * qualified names of the types, e.g. zodb.btrees.OOBTree.OOBTree. * The full name is usd to support pickling and because it is not * possible to modify the __module__ slot of a type dynamically. (This * may be a bug in Python 2.2). * * The MODULE_NAME here used to be "BTrees._". 
We actually want the module * name to point to the Python module rather than the C, so the underline * is now removed. */ #define MODULE_NAME "BTrees." MOD_NAME_PREFIX "BTree." static PyObject *sort_str, *reverse_str, *__setstate___str, *_bucket_type_str; static PyObject *ConflictError = NULL; static void PyVar_Assign(PyObject **v, PyObject *e) { Py_XDECREF(*v); *v=e;} #define ASSIGN(V,E) PyVar_Assign(&(V),(E)) #define UNLESS(E) if (!(E)) #define OBJECT(O) ((PyObject*)(O)) #define MIN_BUCKET_ALLOC 16 #define MAX_BTREE_SIZE(B) DEFAULT_MAX_BTREE_SIZE #define MAX_BUCKET_SIZE(B) DEFAULT_MAX_BUCKET_SIZE #define SameType_Check(O1, O2) ((O1)->ob_type==(O2)->ob_type) #define ASSERT(C, S, R) if (! (C)) { \ PyErr_SetString(PyExc_AssertionError, (S)); return (R); } #ifdef NEED_LONG_LONG_SUPPORT /* Helper code used to support long long instead of int. */ #ifndef PY_LONG_LONG #error "PY_LONG_LONG required but not defined" #endif static int longlong_check(PyObject *ob) { if (PyInt_Check(ob)) return 1; if (PyLong_Check(ob)) { /* check magnitude */ PY_LONG_LONG val = PyLong_AsLongLong(ob); if (val == -1 && PyErr_Occurred()) return 0; return 1; } return 0; } static PyObject * longlong_as_object(PY_LONG_LONG val) { static PY_LONG_LONG maxint = 0; if (maxint == 0) maxint = PyInt_GetMax(); if ((val > maxint) || (val < (-maxint-1))) return PyLong_FromLongLong(val); return PyInt_FromLong((long)val); } #endif /* Various kinds of BTree and Bucket structs are instances of * "sized containers", and have a common initial layout: * The stuff needed for all Python objects, or all Persistent objects. * int size: The maximum number of things that could be contained * without growing the container. * int len: The number of things currently contained. * * Invariant: 0 <= len <= size. * * A sized container typically goes on to declare one or more pointers * to contiguous arrays with 'size' elements each, the initial 'len' of * which are currently in use. */ #ifdef PERSISTENT #define sizedcontainer_HEAD \ cPersistent_HEAD \ int size; \ int len; #else #define sizedcontainer_HEAD \ PyObject_HEAD \ int size; \ int len; #endif /* Nothing is actually of type Sized, but (pointers to) BTree nodes and * Buckets can be cast to Sized* in contexts that only need to examine * the members common to all sized containers. */ typedef struct Sized_s { sizedcontainer_HEAD } Sized; #define SIZED(O) ((Sized*)(O)) /* A Bucket wraps contiguous vectors of keys and values. Keys are unique, * and stored in sorted order. The 'values' pointer may be NULL if the * Bucket is used to implement a set. Buckets serving as leafs of BTrees * are chained together via 'next', so that the entire BTree contents * can be traversed in sorted order quickly and easily. */ typedef struct Bucket_s { sizedcontainer_HEAD struct Bucket_s *next; /* the bucket with the next-larger keys */ KEY_TYPE *keys; /* 'len' keys, in increasing order */ VALUE_TYPE *values; /* 'len' corresponding values; NULL if a set */ } Bucket; #define BUCKET(O) ((Bucket*)(O)) /* A BTree is complicated. See Maintainer.txt. */ typedef struct BTreeItem_s { KEY_TYPE key; Sized *child; /* points to another BTree, or to a Bucket of some sort */ } BTreeItem; typedef struct BTree_s { sizedcontainer_HEAD /* firstbucket points to the bucket containing the smallest key in * the BTree. This is found by traversing leftmost child pointers * (data[0].child) until reaching a Bucket. */ Bucket *firstbucket; /* The BTree points to 'len' children, via the "child" fields of the data * array. 
There are len-1 keys in the 'key' fields, stored in increasing * order. data[0].key is unused. For i in 0 .. len-1, all keys reachable * from data[i].child are >= data[i].key and < data[i+1].key, at the * endpoints pretending that data[0].key is minus infinity and * data[len].key is positive infinity. */ BTreeItem *data; } BTree; static PyTypeObject BTreeType; static PyTypeObject BucketType; #define BTREE(O) ((BTree*)(O)) /* Use BTREE_SEARCH to find which child pointer to follow. * RESULT An int lvalue to hold the index i such that SELF->data[i].child * is the correct node to search next. * SELF A pointer to a BTree node. * KEY The key you're looking for, of type KEY_TYPE. * ONERROR What to do if key comparison raises an exception; for example, * perhaps 'return NULL'. * * See Maintainer.txt for discussion: this is optimized in subtle ways. * It's recommended that you call this at the start of a routine, waiting * to check for self->len == 0 after. */ #define BTREE_SEARCH(RESULT, SELF, KEY, ONERROR) { \ int _lo = 0; \ int _hi = (SELF)->len; \ int _i, _cmp; \ for (_i = _hi >> 1; _i > _lo; _i = (_lo + _hi) >> 1) { \ TEST_KEY_SET_OR(_cmp, (SELF)->data[_i].key, (KEY)) \ ONERROR; \ if (_cmp < 0) _lo = _i; \ else if (_cmp > 0) _hi = _i; \ else /* equal */ break; \ } \ (RESULT) = _i; \ } /* SetIteration structs are used in the internal set iteration protocol. * When you want to iterate over a set or bucket or BTree (even an * individual key!), * 1. Declare a new iterator: * SetIteration si = {0,0,0}; * Using "{0,0,0}" or "{0,0}" appear most common. Only one {0} is * necssary. At least one must be given so that finiSetIteration() works * correctly even if you don't get around to calling initSetIteration(). * 2. Initialize it via * initSetIteration(&si, PyObject *s, useValues) * It's an error if that returns an int < 0. In case of error on the * init call, calling finiSetIteration(&si) is optional. But if the * init call succeeds, you must eventually call finiSetIteration(), * and whether or not subsequent calls to si.next() fail. * 3. Get the first element: * if (si.next(&si) < 0) { there was an error } * If the set isn't empty, this sets si.position to an int >= 0, * si.key to the element's key (of type KEY_TYPE), and maybe si.value to * the element's value (of type VALUE_TYPE). si.value is defined * iff si.usesValue is true. * 4. Process all the elements: * while (si.position >= 0) { * do something with si.key and/or si.value; * if (si.next(&si) < 0) { there was an error; } * } * 5. Finalize the SetIterator: * finiSetIteration(&si); * This is mandatory! si may contain references to iterator objects, * keys and values, and they must be cleaned up else they'll leak. If * this were C++ we'd hide that in the destructor, but in C you have to * do it by hand. */ typedef struct SetIteration_s { PyObject *set; /* the set, bucket, BTree, ..., being iterated */ int position; /* initialized to 0; set to -1 by next() when done */ int usesValue; /* true iff 'set' has values & we iterate them */ KEY_TYPE key; /* next() sets to next key */ VALUE_TYPE value; /* next() may set to next value */ int (*next)(struct SetIteration_s*); /* function to get next key+value */ } SetIteration; /* Finish the set iteration protocol. This MUST be called by everyone * who starts a set iteration, unless the initial call to initSetIteration * failed; in that case, and only that case, calling finiSetIteration is * optional. 
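 * Calling finiSetIteration more than once is harmless: the first call
 * clears i->set to NULL and any later call returns immediately.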
*/ static void finiSetIteration(SetIteration *i) { assert(i != NULL); if (i->set == NULL) return; Py_DECREF(i->set); i->set = NULL; /* so it doesn't hurt to call this again */ if (i->position > 0) { /* next() was called at least once, but didn't finish iterating * (else position would be negative). So the cached key and * value need to be cleaned up. */ DECREF_KEY(i->key); if (i->usesValue) { DECREF_VALUE(i->value); } } i->position = -1; /* stop any stray next calls from doing harm */ } static PyObject * IndexError(int i) { PyObject *v; v = PyInt_FromLong(i); if (!v) { v = Py_None; Py_INCREF(v); } PyErr_SetObject(PyExc_IndexError, v); Py_DECREF(v); return NULL; } /* Search for the bucket immediately preceding *current, in the bucket chain * starting at first. current, *current and first must not be NULL. * * Return: * 1 *current holds the correct bucket; this is a borrowed reference * 0 no such bucket exists; *current unaltered * -1 error; *current unaltered */ static int PreviousBucket(Bucket **current, Bucket *first) { Bucket *trailing = NULL; /* first travels; trailing follows it */ int result = 0; assert(current && *current && first); if (first == *current) return 0; do { trailing = first; PER_USE_OR_RETURN(first, -1); first = first->next; ((trailing)->state==cPersistent_STICKY_STATE && ((trailing)->state=cPersistent_UPTODATE_STATE)); PER_ACCESSED(trailing); if (first == *current) { *current = trailing; result = 1; break; } } while (first); return result; } static void * BTree_Malloc(size_t sz) { void *r; ASSERT(sz > 0, "non-positive size malloc", NULL); r = malloc(sz); if (r) return r; PyErr_NoMemory(); return NULL; } static void * BTree_Realloc(void *p, size_t sz) { void *r; ASSERT(sz > 0, "non-positive size realloc", NULL); if (p) r = realloc(p, sz); else r = malloc(sz); UNLESS (r) PyErr_NoMemory(); return r; } /* Shared keyword-argument list for BTree/Bucket * (iter)?(keys|values|items) */ static char *search_keywords[] = {"min", "max", "excludemin", "excludemax", 0}; #include "BTreeItemsTemplate.c" #include "BucketTemplate.c" #include "SetTemplate.c" #include "BTreeTemplate.c" #include "TreeSetTemplate.c" #include "SetOpTemplate.c" #include "MergeTemplate.c" static struct PyMethodDef module_methods[] = { {"difference", (PyCFunction) difference_m, METH_VARARGS, "difference(o1, o2) -- " "compute the difference between o1 and o2" }, {"union", (PyCFunction) union_m, METH_VARARGS, "union(o1, o2) -- compute the union of o1 and o2\n" }, {"intersection", (PyCFunction) intersection_m, METH_VARARGS, "intersection(o1, o2) -- " "compute the intersection of o1 and o2" }, #ifdef MERGE {"weightedUnion", (PyCFunction) wunion_m, METH_VARARGS, "weightedUnion(o1, o2 [, w1, w2]) -- compute the union of o1 and o2\n" "\nw1 and w2 are weights." }, {"weightedIntersection", (PyCFunction) wintersection_m, METH_VARARGS, "weightedIntersection(o1, o2 [, w1, w2]) -- " "compute the intersection of o1 and o2\n" "\nw1 and w2 are weights." }, #endif #ifdef MULTI_INT_UNION {"multiunion", (PyCFunction) multiunion_m, METH_VARARGS, "multiunion(seq) -- compute union of a sequence of integer sets.\n" "\n" "Each element of seq must be an integer set, or convertible to one\n" "via the set iteration protocol. The union returned is an IISet." 
}, #endif {NULL, NULL} /* sentinel */ }; static char BTree_module_documentation[] = "\n" MASTER_ID BTREEITEMSTEMPLATE_C "$Id$\n" BTREETEMPLATE_C BUCKETTEMPLATE_C KEYMACROS_H MERGETEMPLATE_C SETOPTEMPLATE_C SETTEMPLATE_C TREESETTEMPLATE_C VALUEMACROS_H BTREEITEMSTEMPLATE_C ; int init_persist_type(PyTypeObject *type) { type->ob_type = &PyType_Type; type->tp_base = cPersistenceCAPI->pertype; if (PyType_Ready(type) < 0) return 0; return 1; } void INITMODULE (void) { PyObject *m, *d, *c; sort_str = PyString_InternFromString("sort"); if (!sort_str) return; reverse_str = PyString_InternFromString("reverse"); if (!reverse_str) return; __setstate___str = PyString_InternFromString("__setstate__"); if (!__setstate___str) return; _bucket_type_str = PyString_InternFromString("_bucket_type"); if (!_bucket_type_str) return; /* Grab the ConflictError class */ m = PyImport_ImportModule("ZODB.POSException"); if (m != NULL) { c = PyObject_GetAttrString(m, "BTreesConflictError"); if (c != NULL) ConflictError = c; Py_DECREF(m); } if (ConflictError == NULL) { Py_INCREF(PyExc_ValueError); ConflictError=PyExc_ValueError; } /* Initialize the PyPersist_C_API and the type objects. */ cPersistenceCAPI = PyCObject_Import("persistent.cPersistence", "CAPI"); if (cPersistenceCAPI == NULL) return; BTreeItemsType.ob_type = &PyType_Type; BTreeIter_Type.ob_type = &PyType_Type; BTreeIter_Type.tp_getattro = PyObject_GenericGetAttr; BucketType.tp_new = PyType_GenericNew; SetType.tp_new = PyType_GenericNew; BTreeType.tp_new = PyType_GenericNew; TreeSetType.tp_new = PyType_GenericNew; if (!init_persist_type(&BucketType)) return; if (!init_persist_type(&BTreeType)) return; if (!init_persist_type(&SetType)) return; if (!init_persist_type(&TreeSetType)) return; if (PyDict_SetItem(BTreeType.tp_dict, _bucket_type_str, (PyObject *)&BucketType) < 0) { fprintf(stderr, "btree failed\n"); return; } if (PyDict_SetItem(TreeSetType.tp_dict, _bucket_type_str, (PyObject *)&SetType) < 0) { fprintf(stderr, "bucket failed\n"); return; } /* Create the module and add the functions */ m = Py_InitModule4("_" MOD_NAME_PREFIX "BTree", module_methods, BTree_module_documentation, (PyObject *)NULL, PYTHON_API_VERSION); /* Add some symbolic constants to the module */ d = PyModule_GetDict(m); if (PyDict_SetItemString(d, MOD_NAME_PREFIX "Bucket", (PyObject *)&BucketType) < 0) return; if (PyDict_SetItemString(d, MOD_NAME_PREFIX "BTree", (PyObject *)&BTreeType) < 0) return; if (PyDict_SetItemString(d, MOD_NAME_PREFIX "Set", (PyObject *)&SetType) < 0) return; if (PyDict_SetItemString(d, MOD_NAME_PREFIX "TreeSet", (PyObject *)&TreeSetType) < 0) return; if (PyDict_SetItemString(d, MOD_NAME_PREFIX "TreeIterator", (PyObject *)&BTreeIter_Type) < 0) return; /* We also want to be able to access these constants without the prefix * so that code can more easily exchange modules (particularly the integer * and long modules, but also others). The TreeIterator is only internal, * so we don't bother to expose that. 
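 * With the unprefixed aliases in place, application code can presumably
 * write, e.g., "from BTrees.IIBTree import BTree" and later switch to the
 * LLBTree flavor without renaming the classes it uses.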
*/ if (PyDict_SetItemString(d, "Bucket", (PyObject *)&BucketType) < 0) return; if (PyDict_SetItemString(d, "BTree", (PyObject *)&BTreeType) < 0) return; if (PyDict_SetItemString(d, "Set", (PyObject *)&SetType) < 0) return; if (PyDict_SetItemString(d, "TreeSet", (PyObject *)&TreeSetType) < 0) return; #if defined(ZODB_64BIT_INTS) && defined(NEED_LONG_LONG_SUPPORT) if (PyDict_SetItemString(d, "using64bits", Py_True) < 0) return; #else if (PyDict_SetItemString(d, "using64bits", Py_False) < 0) return; #endif } ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/BTreeTemplate.c000066400000000000000000001676751230730566700237520ustar00rootroot00000000000000/***************************************************************************** Copyright (c) 2001, 2002 Zope Foundation and Contributors. All Rights Reserved. This software is subject to the provisions of the Zope Public License, Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE ****************************************************************************/ #define BTREETEMPLATE_C "$Id$\n" /* Sanity-check a BTree. This is a private helper for BTree_check. Return: * -1 Error. If it's an internal inconsistency in the BTree, * AssertionError is set. * 0 No problem found. * * nextbucket is the bucket "one beyond the end" of the BTree; the last bucket * directly reachable from following right child pointers *should* be linked * to nextbucket (and this is checked). */ static int BTree_check_inner(BTree *self, Bucket *nextbucket) { int i; Bucket *bucketafter; Sized *child; char *errormsg = "internal error"; /* someone should have overriden */ Sized *activated_child = NULL; int result = -1; /* until proved innocent */ #define CHECK(CONDITION, ERRORMSG) \ if (!(CONDITION)) { \ errormsg = (ERRORMSG); \ goto Error; \ } PER_USE_OR_RETURN(self, -1); CHECK(self->len >= 0, "BTree len < 0"); CHECK(self->len <= self->size, "BTree len > size"); if (self->len == 0) { /* Empty BTree. */ CHECK(self->firstbucket == NULL, "Empty BTree has non-NULL firstbucket"); result = 0; goto Done; } /* Non-empty BTree. */ CHECK(self->firstbucket != NULL, "Non-empty BTree has NULL firstbucket"); /* Obscure: The first bucket is pointed to at least by self->firstbucket * and data[0].child of whichever BTree node it's a child of. However, * if persistence is enabled then the latter BTree node may be a ghost * at this point, and so its pointers "don't count": we can only rely * on self's pointers being intact. */ #ifdef PERSISTENT CHECK(self->firstbucket->ob_refcnt >= 1, "Non-empty BTree firstbucket has refcount < 1"); #else CHECK(self->firstbucket->ob_refcnt >= 2, "Non-empty BTree firstbucket has refcount < 2"); #endif for (i = 0; i < self->len; ++i) { CHECK(self->data[i].child != NULL, "BTree has NULL child"); } if (SameType_Check(self, self->data[0].child)) { /* Our children are also BTrees. 
*/ child = self->data[0].child; UNLESS (PER_USE(child)) goto Done; activated_child = child; CHECK(self->firstbucket == BTREE(child)->firstbucket, "BTree has firstbucket different than " "its first child's firstbucket"); PER_ALLOW_DEACTIVATION(child); activated_child = NULL; for (i = 0; i < self->len; ++i) { child = self->data[i].child; CHECK(SameType_Check(self, child), "BTree children have different types"); if (i == self->len - 1) bucketafter = nextbucket; else { BTree *child2 = BTREE(self->data[i+1].child); UNLESS (PER_USE(child2)) goto Done; bucketafter = child2->firstbucket; PER_ALLOW_DEACTIVATION(child2); } if (BTree_check_inner(BTREE(child), bucketafter) < 0) goto Done; } } else { /* Our children are buckets. */ CHECK(self->firstbucket == BUCKET(self->data[0].child), "Bottom-level BTree node has inconsistent firstbucket belief"); for (i = 0; i < self->len; ++i) { child = self->data[i].child; UNLESS (PER_USE(child)) goto Done; activated_child = child; CHECK(!SameType_Check(self, child), "BTree children have different types"); CHECK(child->len >= 1, "Bucket length < 1"); /* no empty buckets! */ CHECK(child->len <= child->size, "Bucket len > size"); #ifdef PERSISTENT CHECK(child->ob_refcnt >= 1, "Bucket has refcount < 1"); #else CHECK(child->ob_refcnt >= 2, "Bucket has refcount < 2"); #endif if (i == self->len - 1) bucketafter = nextbucket; else bucketafter = BUCKET(self->data[i+1].child); CHECK(BUCKET(child)->next == bucketafter, "Bucket next pointer is damaged"); PER_ALLOW_DEACTIVATION(child); activated_child = NULL; } } result = 0; goto Done; Error: PyErr_SetString(PyExc_AssertionError, errormsg); result = -1; Done: /* No point updating access time -- this isn't a "real" use. */ PER_ALLOW_DEACTIVATION(self); if (activated_child) { PER_ALLOW_DEACTIVATION(activated_child); } return result; #undef CHECK } /* Sanity-check a BTree. This is the ._check() method. Return: * NULL Error. If it's an internal inconsistency in the BTree, * AssertionError is set. * Py_None No problem found. */ static PyObject* BTree_check(BTree *self) { PyObject *result = NULL; int i = BTree_check_inner(self, NULL); if (i >= 0) { result = Py_None; Py_INCREF(result); } return result; } /* ** _BTree_get ** ** Search a BTree. ** ** Arguments ** self a pointer to a BTree ** keyarg the key to search for, as a Python object ** has_key true/false; when false, try to return the associated ** value; when true, return a boolean ** Return ** When has_key false: ** If key exists, its associated value. ** If key doesn't exist, NULL and KeyError is set. ** When has_key true: ** A Python int is returned in any case. ** If key exists, the depth of the bucket in which it was found. ** If key doesn't exist, 0. 
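**      Note: whatever nonzero has_key value the caller passes in is
**      incremented once per BTree level descended (has_key += has_key != 0
**      below), which is how the returned depth is accumulated.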
*/ static PyObject * _BTree_get(BTree *self, PyObject *keyarg, int has_key) { KEY_TYPE key; PyObject *result = NULL; /* guilty until proved innocent */ int copied = 1; COPY_KEY_FROM_ARG(key, keyarg, copied); UNLESS (copied) return NULL; PER_USE_OR_RETURN(self, NULL); if (self->len == 0) { /* empty BTree */ if (has_key) result = PyInt_FromLong(0); else PyErr_SetObject(PyExc_KeyError, keyarg); } else { for (;;) { int i; Sized *child; BTREE_SEARCH(i, self, key, goto Done); child = self->data[i].child; has_key += has_key != 0; /* bump depth counter, maybe */ if (SameType_Check(self, child)) { PER_UNUSE(self); self = BTREE(child); PER_USE_OR_RETURN(self, NULL); } else { result = _bucket_get(BUCKET(child), keyarg, has_key); break; } } } Done: PER_UNUSE(self); return result; } static PyObject * BTree_get(BTree *self, PyObject *key) { return _BTree_get(self, key, 0); } /* Create a new bucket for the BTree or TreeSet using the class attribute _bucket_type, which is normally initialized to BucketType or SetType as appropriate. */ static Sized * BTree_newBucket(BTree *self) { PyObject *factory; Sized *result; /* _bucket_type_str defined in BTreeModuleTemplate.c */ factory = PyObject_GetAttr((PyObject *)self->ob_type, _bucket_type_str); if (factory == NULL) return NULL; /* TODO: Should we check that the factory actually returns something of the appropriate type? How? The C code here is going to depend on any custom bucket type having the same layout at the C level. */ result = SIZED(PyObject_CallObject(factory, NULL)); Py_DECREF(factory); return result; } /* * Move data from the current BTree, from index onward, to the newly created * BTree 'next'. self and next must both be activated. If index is OOB (< 0 * or >= self->len), use self->len / 2 as the index (i.e., split at the * midpoint). self must have at least 2 children on entry, and index must * be such that self and next each have at least one child at exit. self's * accessed time is updated. * * Return: * -1 error * 0 OK */ static int BTree_split(BTree *self, int index, BTree *next) { int next_size; Sized *child; if (index < 0 || index >= self->len) index = self->len / 2; next_size = self->len - index; ASSERT(index > 0, "split creates empty tree", -1); ASSERT(next_size > 0, "split creates empty tree", -1); next->data = BTree_Malloc(sizeof(BTreeItem) * next_size); if (!next->data) return -1; memcpy(next->data, self->data + index, sizeof(BTreeItem) * next_size); next->size = next_size; /* but don't set len until we succeed */ /* Set next's firstbucket. self->firstbucket is still correct. */ child = next->data[0].child; if (SameType_Check(self, child)) { PER_USE_OR_RETURN(child, -1); next->firstbucket = BTREE(child)->firstbucket; PER_UNUSE(child); } else next->firstbucket = BUCKET(child); Py_INCREF(next->firstbucket); next->len = next_size; self->len = index; return PER_CHANGED(self) >= 0 ? 0 : -1; } /* Fwd decl -- BTree_grow and BTree_split_root reference each other. */ static int BTree_grow(BTree *self, int index, int noval); /* Split the root. This is a little special because the root isn't a child * of anything else, and the root needs to retain its object identity. So * this routine moves the root's data into a new child, and splits the * latter. This leaves the root with two children. * * Return: * 0 OK * -1 error * * CAUTION: The caller must call PER_CHANGED on self. */ static int BTree_split_root(BTree *self, int noval) { BTree *child; BTreeItem *d; /* Create a child BTree, and a new data vector for self. 
*/ child = BTREE(PyObject_CallObject(OBJECT(self->ob_type), NULL)); if (!child) return -1; d = BTree_Malloc(sizeof(BTreeItem) * 2); if (!d) { Py_DECREF(child); return -1; } /* Move our data to new BTree. */ child->size = self->size; child->len = self->len; child->data = self->data; child->firstbucket = self->firstbucket; Py_INCREF(child->firstbucket); /* Point self to child and split the child. */ self->data = d; self->len = 1; self->size = 2; self->data[0].child = SIZED(child); /* transfers reference ownership */ return BTree_grow(self, 0, noval); } /* ** BTree_grow ** ** Grow a BTree ** ** Arguments: self The BTree ** index self->data[index].child needs to be split. index ** must be 0 if self is empty (len == 0), and a new ** empty bucket is created then. ** noval Boolean; is this a set (true) or mapping (false)? ** ** Returns: 0 on success ** -1 on failure ** ** CAUTION: If self is empty on entry, this routine adds an empty bucket. ** That isn't a legitimate BTree; if the caller doesn't put something in ** in the bucket (say, because of a later error), the BTree must be cleared ** to get rid of the empty bucket. */ static int BTree_grow(BTree *self, int index, int noval) { int i; Sized *v, *e = 0; BTreeItem *d; if (self->len == self->size) { if (self->size) { d = BTree_Realloc(self->data, sizeof(BTreeItem) * self->size * 2); if (d == NULL) return -1; self->data = d; self->size *= 2; } else { d = BTree_Malloc(sizeof(BTreeItem) * 2); if (d == NULL) return -1; self->data = d; self->size = 2; } } if (self->len) { d = self->data + index; v = d->child; /* Create a new object of the same type as the target value */ e = (Sized *)PyObject_CallObject((PyObject *)v->ob_type, NULL); if (e == NULL) return -1; UNLESS(PER_USE(v)) { Py_DECREF(e); return -1; } /* Now split between the original (v) and the new (e) at the midpoint*/ if (SameType_Check(self, v)) i = BTree_split((BTree *)v, -1, (BTree *)e); else i = bucket_split((Bucket *)v, -1, (Bucket *)e); PER_ALLOW_DEACTIVATION(v); if (i < 0) { Py_DECREF(e); assert(PyErr_Occurred()); return -1; } index++; d++; if (self->len > index) /* Shift up the old values one array slot */ memmove(d+1, d, sizeof(BTreeItem)*(self->len-index)); if (SameType_Check(self, v)) { COPY_KEY(d->key, BTREE(e)->data->key); /* We take the unused reference from e, so there's no reason to INCREF! */ /* INCREF_KEY(self->data[1].key); */ } else { COPY_KEY(d->key, BUCKET(e)->keys[0]); INCREF_KEY(d->key); } d->child = e; self->len++; if (self->len >= MAX_BTREE_SIZE(self) * 2) /* the root is huge */ return BTree_split_root(self, noval); } else { /* The BTree is empty. Create an empty bucket. See CAUTION in * the comments preceding. */ assert(index == 0); d = self->data; d->child = BTree_newBucket(self); if (d->child == NULL) return -1; self->len = 1; Py_INCREF(d->child); self->firstbucket = (Bucket *)d->child; } return 0; } /* Return the rightmost bucket reachable from following child pointers * from self. The caller gets a new reference to this bucket. Note that * bucket 'next' pointers are not followed: if self is an interior node * of a BTree, this returns the rightmost bucket in that node's subtree. * In case of error, returns NULL. * * self must not be a ghost; this isn't checked. The result may be a ghost. * * Pragmatics: Note that the rightmost bucket's last key is the largest * key in self's subtree. */ static Bucket * BTree_lastBucket(BTree *self) { Sized *pchild; Bucket *result; UNLESS (self->data && self->len) { IndexError(-1); /* is this the best action to take? 
*/ return NULL; } pchild = self->data[self->len - 1].child; if (SameType_Check(self, pchild)) { self = BTREE(pchild); PER_USE_OR_RETURN(self, NULL); result = BTree_lastBucket(self); PER_UNUSE(self); } else { Py_INCREF(pchild); result = BUCKET(pchild); } return result; } static int BTree_deleteNextBucket(BTree *self) { Bucket *b; UNLESS (PER_USE(self)) return -1; b = BTree_lastBucket(self); if (b == NULL) goto err; if (Bucket_deleteNextBucket(b) < 0) goto err; Py_DECREF(b); PER_UNUSE(self); return 0; err: Py_XDECREF(b); PER_ALLOW_DEACTIVATION(self); return -1; } /* ** _BTree_clear ** ** Clears out all of the values in the BTree (firstbucket, keys, and children); ** leaving self an empty BTree. ** ** Arguments: self The BTree ** ** Returns: 0 on success ** -1 on failure ** ** Internal: Deallocation order is important. The danger is that a long ** list of buckets may get freed "at once" via decref'ing the first bucket, ** in which case a chain of consequenct Py_DECREF calls may blow the stack. ** Luckily, every bucket has a refcount of at least two, one due to being a ** BTree node's child, and another either because it's not the first bucket in ** the chain (so the preceding bucket points to it), or because firstbucket ** points to it. By clearing in the natural depth-first, left-to-right ** order, the BTree->bucket child pointers prevent Py_DECREF(bucket->next) ** calls from freeing bucket->next, and the maximum stack depth is equal ** to the height of the tree. **/ static int _BTree_clear(BTree *self) { const int len = self->len; if (self->firstbucket) { /* Obscure: The first bucket is pointed to at least by * self->firstbucket and data[0].child of whichever BTree node it's * a child of. However, if persistence is enabled then the latter * BTree node may be a ghost at this point, and so its pointers "don't * count": we can only rely on self's pointers being intact. */ #ifdef PERSISTENT ASSERT(self->firstbucket->ob_refcnt > 0, "Invalid firstbucket pointer", -1); #else ASSERT(self->firstbucket->ob_refcnt > 1, "Invalid firstbucket pointer", -1); #endif Py_DECREF(self->firstbucket); self->firstbucket = NULL; } if (self->data) { int i; if (len > 0) { /* 0 is special because key 0 is trash */ Py_DECREF(self->data[0].child); } for (i = 1; i < len; i++) { #ifdef KEY_TYPE_IS_PYOBJECT DECREF_KEY(self->data[i].key); #endif Py_DECREF(self->data[i].child); } free(self->data); self->data = NULL; } self->len = self->size = 0; return 0; } /* Set (value != 0) or delete (value=0) a tree item. If unique is non-zero, then only change if the key is new. If noval is non-zero, then don't set a value (the tree is a set). Return: -1 error 0 successful, and number of entries didn't change >0 successful, and number of entries did change Internal There are two distinct return values > 0: 1 Successful, number of entries changed, but firstbucket did not go away. 2 Successful, number of entries changed, firstbucket did go away. This can only happen on a delete (value == NULL). The caller may need to change its own firstbucket pointer, and in any case *someone* needs to adjust the 'next' pointer of the bucket immediately preceding the bucket that went away (it needs to point to the bucket immediately following the bucket that went away). */ static int _BTree_set(BTree *self, PyObject *keyarg, PyObject *value, int unique, int noval) { int changed = 0; /* did I mutate? 
*/ int min; /* index of child I searched */ BTreeItem *d; /* self->data[min] */ int childlength; /* len(self->data[min].child) */ int status; /* our return value; and return value from callee */ int self_was_empty; /* was self empty at entry? */ KEY_TYPE key; int copied = 1; COPY_KEY_FROM_ARG(key, keyarg, copied); if (!copied) return -1; PER_USE_OR_RETURN(self, -1); self_was_empty = self->len == 0; if (self_was_empty) { /* We're empty. Make room. */ if (value) { if (BTree_grow(self, 0, noval) < 0) goto Error; } else { /* Can't delete a key from an empty BTree. */ PyErr_SetObject(PyExc_KeyError, keyarg); goto Error; } } /* Find the right child to search, and hand the work off to it. */ BTREE_SEARCH(min, self, key, goto Error); d = self->data + min; #ifdef PERSISTENT PER_READCURRENT(self, goto Error); #endif if (SameType_Check(self, d->child)) status = _BTree_set(BTREE(d->child), keyarg, value, unique, noval); else { int bucket_changed = 0; status = _bucket_set(BUCKET(d->child), keyarg, value, unique, noval, &bucket_changed); #ifdef PERSISTENT /* If a BTree contains only a single bucket, BTree.__getstate__() * includes the bucket's entire state, and the bucket doesn't get * an oid of its own. So if we have a single oid-less bucket that * changed, it's *our* oid that should be marked as changed -- the * bucket doesn't have one. */ if (bucket_changed && self->len == 1 && self->data[0].child->oid == NULL) { changed = 1; } #endif } if (status == 0) goto Done; if (status < 0) goto Error; assert(status == 1 || status == 2); /* The child changed size. Get its new size. Note that since the tree * rooted at the child changed size, so did the tree rooted at self: * our status must be >= 1 too. */ UNLESS(PER_USE(d->child)) goto Error; childlength = d->child->len; PER_UNUSE(d->child); if (value) { /* A bucket got bigger -- if it's "too big", split it. */ int toobig; assert(status == 1); /* can be 2 only on deletes */ if (SameType_Check(self, d->child)) toobig = childlength > MAX_BTREE_SIZE(d->child); else toobig = childlength > MAX_BUCKET_SIZE(d->child); if (toobig) { if (BTree_grow(self, min, noval) < 0) goto Error; changed = 1; /* BTree_grow mutated self */ } goto Done; /* and status still == 1 */ } /* A bucket got smaller. This is much harder, and despite that we * don't try to rebalance the tree. */ if (min && childlength) { /* We removed a key. but the node child is non-empty. If the deleted key is the node key, then update the node key using the smallest key of the node child. This doesn't apply to the 0th node, whos key is unused. */ int _cmp = 1; TEST_KEY_SET_OR(_cmp, key, d->key) goto Error; if (_cmp == 0) { /* Need to replace key with first key from child */ Bucket *bucket; if (SameType_Check(self, d->child)) { UNLESS(PER_USE(d->child)) goto Error; bucket = BTREE(d->child)->firstbucket; PER_UNUSE(d->child); } else bucket = BUCKET(d->child); UNLESS(PER_USE(bucket)) goto Error; DECREF_KEY(d->key); COPY_KEY(d->key, bucket->keys[0]); INCREF_KEY(d->key); PER_UNUSE(bucket); if (PER_CHANGED(self) < 0) goto Error; } } if (status == 2) { /* The child must be a BTree because bucket.set never returns 2 */ /* Two problems to solve: May have to adjust our own firstbucket, * and the bucket that went away needs to get unlinked. */ if (min) { /* This wasn't our firstbucket, so no need to adjust ours (note * that it can't be the firstbucket of any node above us either). * Tell "the tree to the left" to do the unlinking. 
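 * d[-1].child is the subtree immediately to our left; its rightmost
 * bucket is the one whose 'next' pointer still references the bucket
 * that went away, so that is where the unlink must happen.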
*/ if (BTree_deleteNextBucket(BTREE(d[-1].child)) < 0) goto Error; status = 1; /* we solved the child's firstbucket problem */ } else { /* This was our firstbucket. Update to new firstbucket value. */ Bucket *nextbucket; UNLESS(PER_USE(d->child)) goto Error; nextbucket = BTREE(d->child)->firstbucket; PER_UNUSE(d->child); Py_XINCREF(nextbucket); Py_DECREF(self->firstbucket); self->firstbucket = nextbucket; changed = 1; /* The caller has to do the unlinking -- we can't. Also, since * it was our firstbucket, it may also be theirs. */ assert(status == 2); } } /* If the child isn't empty, we're done! We did all that was possible for * us to do with the firstbucket problems the child gave us, and since the * child isn't empty don't create any new firstbucket problems of our own. */ if (childlength) goto Done; /* The child became empty: we need to remove it from self->data. * But first, if we're a bottom-level node, we've got more bucket-fiddling * to set up. */ if (! SameType_Check(self, d->child)) { /* We're about to delete a bucket, so need to adjust bucket pointers. */ if (min) { /* It's not our first bucket, so we can tell the previous * bucket to adjust its reference to it. It can't be anyone * else's first bucket either, so the caller needn't do anything. */ if (Bucket_deleteNextBucket(BUCKET(d[-1].child)) < 0) goto Error; /* status should be 1, and already is: if it were 2, the * block above would have set it to 1 in its min != 0 branch. */ assert(status == 1); } else { Bucket *nextbucket; /* It's our first bucket. We can't unlink it directly. */ /* 'changed' will be set true by the deletion code following. */ UNLESS(PER_USE(d->child)) goto Error; nextbucket = BUCKET(d->child)->next; PER_UNUSE(d->child); Py_XINCREF(nextbucket); Py_DECREF(self->firstbucket); self->firstbucket = nextbucket; status = 2; /* we're giving our caller a new firstbucket problem */ } } /* Remove the child from self->data. */ Py_DECREF(d->child); #ifdef KEY_TYPE_IS_PYOBJECT if (min) { DECREF_KEY(d->key); } else if (self->len > 1) { /* We're deleting the first child of a BTree with more than one * child. The key at d+1 is about to be shifted into slot 0, * and hence never to be referenced again (the key in slot 0 is * trash). */ DECREF_KEY((d+1)->key); } /* Else min==0 and len==1: we're emptying the BTree entirely, and * there is no key in need of decrefing. */ #endif --self->len; if (min < self->len) memmove(d, d+1, (self->len - min) * sizeof(BTreeItem)); changed = 1; Done: #ifdef PERSISTENT if (changed) { if (PER_CHANGED(self) < 0) goto Error; } #endif PER_UNUSE(self); return status; Error: assert(PyErr_Occurred()); if (self_was_empty) { /* BTree_grow may have left the BTree in an invalid state. Make * sure the tree is a legitimate empty tree. 
*/ _BTree_clear(self); } PER_UNUSE(self); return -1; } /* ** BTree_setitem ** ** wrapper for _BTree_set ** ** Arguments: self The BTree ** key The key to insert ** v The value to insert ** ** Returns -1 on failure ** 0 on success */ static int BTree_setitem(BTree *self, PyObject *key, PyObject *v) { if (_BTree_set(self, key, v, 0, 0) < 0) return -1; return 0; } #ifdef PERSISTENT static PyObject * BTree__p_deactivate(BTree *self, PyObject *args, PyObject *keywords) { int ghostify = 1; PyObject *force = NULL; if (args && PyTuple_GET_SIZE(args) > 0) { PyErr_SetString(PyExc_TypeError, "_p_deactivate takes not positional arguments"); return NULL; } if (keywords) { int size = PyDict_Size(keywords); force = PyDict_GetItemString(keywords, "force"); if (force) size--; if (size) { PyErr_SetString(PyExc_TypeError, "_p_deactivate only accepts keyword arg force"); return NULL; } } if (self->jar && self->oid) { ghostify = self->state == cPersistent_UPTODATE_STATE; if (!ghostify && force) { if (PyObject_IsTrue(force)) ghostify = 1; if (PyErr_Occurred()) return NULL; } if (ghostify) { if (_BTree_clear(self) < 0) return NULL; PER_GHOSTIFY(self); } } Py_INCREF(Py_None); return Py_None; } #endif static PyObject * BTree_clear(BTree *self) { UNLESS (PER_USE(self)) return NULL; if (self->len) { if (_BTree_clear(self) < 0) goto err; if (PER_CHANGED(self) < 0) goto err; } PER_UNUSE(self); Py_INCREF(Py_None); return Py_None; err: PER_UNUSE(self); return NULL; } /* * Return: * * For an empty BTree (self->len == 0), None. * * For a BTree with one child (self->len == 1), and that child is a bucket, * and that bucket has a NULL oid, a one-tuple containing a one-tuple * containing the bucket's state: * * ( * ( * child[0].__getstate__(), * ), * ) * * Else a two-tuple. The first element is a tuple interleaving the BTree's * keys and direct children, of size 2*self->len - 1 (key[0] is unused and * is not saved). The second element is the firstbucket: * * ( * (child[0], key[1], child[1], key[2], child[2], ..., * key[len-1], child[len-1]), * self->firstbucket * ) * * In the above, key[i] means self->data[i].key, and similarly for child[i]. */ static PyObject * BTree_getstate(BTree *self) { PyObject *r = NULL; PyObject *o; int i, l; UNLESS (PER_USE(self)) return NULL; if (self->len) { r = PyTuple_New(self->len * 2 - 1); if (r == NULL) goto err; if (self->len == 1 && self->data->child->ob_type != self->ob_type #ifdef PERSISTENT && BUCKET(self->data->child)->oid == NULL #endif ) { /* We have just one bucket. Save its data directly. */ o = bucket_getstate((Bucket *)self->data->child); if (o == NULL) goto err; PyTuple_SET_ITEM(r, 0, o); ASSIGN(r, Py_BuildValue("(O)", r)); } else { for (i=0, l=0; i < self->len; i++) { if (i) { COPY_KEY_TO_OBJECT(o, self->data[i].key); PyTuple_SET_ITEM(r, l, o); l++; } o = (PyObject *)self->data[i].child; Py_INCREF(o); PyTuple_SET_ITEM(r,l,o); l++; } ASSIGN(r, Py_BuildValue("OO", r, self->firstbucket)); } } else { r = Py_None; Py_INCREF(r); } PER_UNUSE(self); return r; err: PER_UNUSE(self); Py_XDECREF(r); return NULL; } static int _BTree_setstate(BTree *self, PyObject *state, int noval) { PyObject *items, *firstbucket = NULL; BTreeItem *d; int len, l, i, copied=1; if (_BTree_clear(self) < 0) return -1; /* The state of a BTree can be one of the following: None -- an empty BTree A one-tuple -- a single bucket btree A two-tuple -- a BTree with more than one bucket See comments for BTree_getstate() for the details. 
*/ if (state == Py_None) return 0; if (!PyArg_ParseTuple(state, "O|O:__setstate__", &items, &firstbucket)) return -1; if (!PyTuple_Check(items)) { PyErr_SetString(PyExc_TypeError, "tuple required for first state element"); return -1; } len = PyTuple_Size(items); if (len < 0) return -1; len = (len + 1) / 2; assert(len > 0); /* If the BTree is empty, it's state is None. */ assert(self->size == 0); /* We called _BTree_clear(). */ self->data = BTree_Malloc(sizeof(BTreeItem) * len); if (self->data == NULL) return -1; self->size = len; for (i = 0, d = self->data, l = 0; i < len; i++, d++) { PyObject *v; if (i) { /* skip the first key slot */ COPY_KEY_FROM_ARG(d->key, PyTuple_GET_ITEM(items, l), copied); l++; if (!copied) return -1; INCREF_KEY(d->key); } v = PyTuple_GET_ITEM(items, l); if (PyTuple_Check(v)) { /* Handle the special case in __getstate__() for a BTree with a single bucket. */ d->child = BTree_newBucket(self); if (!d->child) return -1; if (noval) { if (_set_setstate(BUCKET(d->child), v) < 0) return -1; } else { if (_bucket_setstate(BUCKET(d->child), v) < 0) return -1; } } else { d->child = (Sized *)v; Py_INCREF(v); } l++; } if (!firstbucket) firstbucket = (PyObject *)self->data->child; if (!PyObject_IsInstance(firstbucket, (PyObject *) (noval ? &SetType : &BucketType))) { PyErr_SetString(PyExc_TypeError, "No firstbucket in non-empty BTree"); return -1; } self->firstbucket = BUCKET(firstbucket); Py_INCREF(firstbucket); #ifndef PERSISTENT /* firstbucket is also the child of some BTree node, but that node may * be a ghost if persistence is enabled. */ assert(self->firstbucket->ob_refcnt > 1); #endif self->len = len; return 0; } static PyObject * BTree_setstate(BTree *self, PyObject *arg) { int r; PER_PREVENT_DEACTIVATION(self); r = _BTree_setstate(self, arg, 0); PER_UNUSE(self); if (r < 0) return NULL; Py_INCREF(Py_None); return Py_None; } #ifdef PERSISTENT /* Recognize the special cases of a BTree that's empty or contains a single * bucket. In the former case, return a borrowed reference to Py_None. * In this single-bucket case, the bucket state is embedded directly in the * BTree state, like so: * * ( * ( * thebucket.__getstate__(), * ), * ) * * When this obtains, return a borrowed reference to thebucket.__getstate__(). * Else return NULL with an exception set. The exception should always be * ConflictError then, but may be TypeError if the state makes no sense at all * for a BTree (corrupted or hostile state). */ PyObject * get_bucket_state(PyObject *t) { if (t == Py_None) return Py_None; /* an empty BTree */ if (! PyTuple_Check(t)) { PyErr_SetString(PyExc_TypeError, "_p_resolveConflict: expected tuple or None for state"); return NULL; } if (PyTuple_GET_SIZE(t) == 2) { /* A non-degenerate BTree. */ return merge_error(-1, -1, -1, 11); } /* We're in the one-bucket case. */ if (PyTuple_GET_SIZE(t) != 1) { PyErr_SetString(PyExc_TypeError, "_p_resolveConflict: expected 1- or 2-tuple for state"); return NULL; } t = PyTuple_GET_ITEM(t, 0); if (! PyTuple_Check(t) || PyTuple_GET_SIZE(t) != 1) { PyErr_SetString(PyExc_TypeError, "_p_resolveConflict: expected 1-tuple containing " "bucket state"); return NULL; } t = PyTuple_GET_ITEM(t, 0); if (! PyTuple_Check(t)) { PyErr_SetString(PyExc_TypeError, "_p_resolveConflict: expected tuple for bucket state"); return NULL; } return t; } /* Tricky. 
The only kind of BTree conflict we can actually potentially * resolve is the special case of a BTree containing a single bucket, * in which case this becomes a fancy way of calling the bucket conflict * resolution code. */ static PyObject * BTree__p_resolveConflict(BTree *self, PyObject *args) { PyObject *s[3]; PyObject *x, *y, *z; if (!PyArg_ParseTuple(args, "OOO", &x, &y, &z)) return NULL; s[0] = get_bucket_state(x); if (s[0] == NULL) return NULL; s[1] = get_bucket_state(y); if (s[1] == NULL) return NULL; s[2] = get_bucket_state(z); if (s[2] == NULL) return NULL; if (PyObject_IsInstance((PyObject *)self, (PyObject *)&BTreeType)) x = _bucket__p_resolveConflict(OBJECT(&BucketType), s); else x = _bucket__p_resolveConflict(OBJECT(&SetType), s); if (x == NULL) return NULL; return Py_BuildValue("((N))", x); } #endif /* BTree_findRangeEnd -- Find one end, expressed as a bucket and position, for a range search. If low, return bucket and index of the smallest item >= key, otherwise return bucket and index of the largest item <= key. If exclude_equal, exact matches aren't acceptable; if one is found, move right if low, or left if !low (this is for range searches exclusive of an endpoint). Return: -1 Error; offset and bucket unchanged 0 Not found; offset and bucket unchanged 1 Correct bucket and offset stored; the caller owns a new reference to the bucket. Internal: We do binary searches in BTree nodes downward, at each step following C(i) where K(i) <= key < K(i+1). As always, K(i) <= C(i) < K(i+1) too. (See Maintainer.txt for the meaning of that notation.) That eventually leads to a bucket where we do Bucket_findRangeEnd. That usually works, but there are two cases where it can fail to find the correct answer: 1. On a low search, we find a bucket with keys >= K(i), but that doesn't imply there are keys in the bucket >= key. For example, suppose a bucket has keys in 1..100, its successor's keys are in 200..300, and we're doing a low search on 150. We'll end up in the first bucket, but there are no keys >= 150 in it. K(i+1) > key, though, and all the keys in C(i+1) >= K(i+1) > key, so the first key in the next bucket (if any) is the correct result. This is easy to find by following the bucket 'next' pointer. 2. On a high search, again that the keys in the bucket are >= K(i) doesn't imply that any key in the bucket is <= key, but it's harder for this to fail (and an earlier version of this routine didn't catch it): if K(i) itself is in the bucket, it works (then K(i) <= key is *a* key in the bucket that's in the desired range). But when keys get deleted from buckets, they aren't also deleted from BTree nodes, so there's no guarantee that K(i) is in the bucket. For example, delete the smallest key S from some bucket, and S remains in the interior BTree nodes. Do a high search for S, and the BTree nodes direct the search to the bucket S used to be in, but all keys remaining in that bucket are > S. The largest key in the *preceding* bucket (if any) is < K(i), though, and K(i) <= key, so the largest key in the preceding bucket is < key and so is the proper result. This is harder to get at efficiently, as buckets are linked only in the increasing direction. While we're searching downward, deepest_smaller is set to the node deepest in the tree where we *could* have gone to the left of C(i). The rightmost bucket in deepest_smaller's subtree is the bucket preceding the bucket we find at first. This is clumsy to get at, but efficient. 
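    For a concrete (hypothetical) picture: a high search for 10 through a node
    with K(1)=5 and K(2)=20 descends into C(1) and records C(0) as the
    deepest_smaller candidate at that level; a node deeper in the tree may
    replace it if the search can still move left there.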
*/ static int BTree_findRangeEnd(BTree *self, PyObject *keyarg, int low, int exclude_equal, Bucket **bucket, int *offset) { Sized *deepest_smaller = NULL; /* last possibility to move left */ int deepest_smaller_is_btree = 0; /* Boolean; if false, it's a bucket */ Bucket *pbucket; int self_got_rebound = 0; /* Boolean; when true, deactivate self */ int result = -1; /* Until proven innocent */ int i; KEY_TYPE key; int copied = 1; COPY_KEY_FROM_ARG(key, keyarg, copied); UNLESS (copied) return -1; /* We don't need to: PER_USE_OR_RETURN(self, -1); because the caller does. */ UNLESS (self->data && self->len) return 0; /* Search downward until hitting a bucket, stored in pbucket. */ for (;;) { Sized *pchild; int pchild_is_btree; BTREE_SEARCH(i, self, key, goto Done); pchild = self->data[i].child; pchild_is_btree = SameType_Check(self, pchild); if (i) { deepest_smaller = self->data[i-1].child; deepest_smaller_is_btree = pchild_is_btree; } if (pchild_is_btree) { if (self_got_rebound) { PER_UNUSE(self); } self = BTREE(pchild); self_got_rebound = 1; PER_USE_OR_RETURN(self, -1); } else { pbucket = BUCKET(pchild); break; } } /* Search the bucket for a suitable key. */ i = Bucket_findRangeEnd(pbucket, keyarg, low, exclude_equal, offset); if (i < 0) goto Done; if (i > 0) { Py_INCREF(pbucket); *bucket = pbucket; result = 1; goto Done; } /* This may be one of the two difficult cases detailed in the comments. */ if (low) { Bucket *next; UNLESS(PER_USE(pbucket)) goto Done; next = pbucket->next; if (next) { result = 1; Py_INCREF(next); *bucket = next; *offset = 0; } else result = 0; PER_UNUSE(pbucket); } /* High-end search: if it's possible to go left, do so. */ else if (deepest_smaller) { if (deepest_smaller_is_btree) { UNLESS(PER_USE(deepest_smaller)) goto Done; /* We own the reference this returns. */ pbucket = BTree_lastBucket(BTREE(deepest_smaller)); PER_UNUSE(deepest_smaller); if (pbucket == NULL) goto Done; /* error */ } else { pbucket = BUCKET(deepest_smaller); Py_INCREF(pbucket); } UNLESS(PER_USE(pbucket)) goto Done; result = 1; *bucket = pbucket; /* transfer ownership to caller */ *offset = pbucket->len - 1; PER_UNUSE(pbucket); } else result = 0; /* simply not found */ Done: if (self_got_rebound) { PER_UNUSE(self); } return result; } static PyObject * BTree_maxminKey(BTree *self, PyObject *args, int min) { PyObject *key=0; Bucket *bucket = NULL; int offset, rc; int empty_tree = 1; UNLESS (PyArg_ParseTuple(args, "|O", &key)) return NULL; UNLESS (PER_USE(self)) return NULL; UNLESS (self->data && self->len) goto empty; /* Find the range */ if (key) { if ((rc = BTree_findRangeEnd(self, key, min, 0, &bucket, &offset)) <= 0) { if (rc < 0) goto err; empty_tree = 0; goto empty; } PER_UNUSE(self); UNLESS (PER_USE(bucket)) { Py_DECREF(bucket); return NULL; } } else if (min) { bucket = self->firstbucket; PER_UNUSE(self); PER_USE_OR_RETURN(bucket, NULL); Py_INCREF(bucket); offset = 0; } else { bucket = BTree_lastBucket(self); PER_UNUSE(self); UNLESS (PER_USE(bucket)) { Py_DECREF(bucket); return NULL; } assert(bucket->len); offset = bucket->len - 1; } COPY_KEY_TO_OBJECT(key, bucket->keys[offset]); PER_UNUSE(bucket); Py_DECREF(bucket); return key; empty: PyErr_SetString(PyExc_ValueError, empty_tree ? 
"empty tree" : "no key satisfies the conditions"); err: PER_UNUSE(self); if (bucket) { PER_UNUSE(bucket); Py_DECREF(bucket); } return NULL; } static PyObject * BTree_minKey(BTree *self, PyObject *args) { return BTree_maxminKey(self, args, 1); } static PyObject * BTree_maxKey(BTree *self, PyObject *args) { return BTree_maxminKey(self, args, 0); } /* ** BTree_rangeSearch ** ** Generates a BTreeItems object based on the two indexes passed in, ** being the range between them. ** */ static PyObject * BTree_rangeSearch(BTree *self, PyObject *args, PyObject *kw, char type) { PyObject *min = Py_None; PyObject *max = Py_None; int excludemin = 0; int excludemax = 0; int rc; Bucket *lowbucket = NULL; Bucket *highbucket = NULL; int lowoffset; int highoffset; PyObject *result; if (args) { if (! PyArg_ParseTupleAndKeywords(args, kw, "|OOii", search_keywords, &min, &max, &excludemin, &excludemax)) return NULL; } UNLESS (PER_USE(self)) return NULL; UNLESS (self->data && self->len) goto empty; /* Find the low range */ if (min != Py_None) { if ((rc = BTree_findRangeEnd(self, min, 1, excludemin, &lowbucket, &lowoffset)) <= 0) { if (rc < 0) goto err; goto empty; } } else { lowbucket = self->firstbucket; lowoffset = 0; if (excludemin) { int bucketlen; UNLESS (PER_USE(lowbucket)) goto err; bucketlen = lowbucket->len; PER_UNUSE(lowbucket); if (bucketlen > 1) lowoffset = 1; else if (self->len < 2) goto empty; else { /* move to first item in next bucket */ Bucket *next; UNLESS (PER_USE(lowbucket)) goto err; next = lowbucket->next; PER_UNUSE(lowbucket); assert(next != NULL); lowbucket = next; /* and lowoffset is still 0 */ assert(lowoffset == 0); } } Py_INCREF(lowbucket); } /* Find the high range */ if (max != Py_None) { if ((rc = BTree_findRangeEnd(self, max, 0, excludemax, &highbucket, &highoffset)) <= 0) { Py_DECREF(lowbucket); if (rc < 0) goto err; goto empty; } } else { int bucketlen; highbucket = BTree_lastBucket(self); assert(highbucket != NULL); /* we know self isn't empty */ UNLESS (PER_USE(highbucket)) goto err_and_decref_buckets; bucketlen = highbucket->len; PER_UNUSE(highbucket); highoffset = bucketlen - 1; if (excludemax) { if (highoffset > 0) --highoffset; else if (self->len < 2) goto empty_and_decref_buckets; else { /* move to last item of preceding bucket */ int status; assert(highbucket != self->firstbucket); Py_DECREF(highbucket); status = PreviousBucket(&highbucket, self->firstbucket); if (status < 0) { Py_DECREF(lowbucket); goto err; } assert(status > 0); Py_INCREF(highbucket); UNLESS (PER_USE(highbucket)) goto err_and_decref_buckets; highoffset = highbucket->len - 1; PER_UNUSE(highbucket); } } assert(highoffset >= 0); } /* It's still possible that the range is empty, even if min < max. For * example, if min=3 and max=4, and 3 and 4 aren't in the BTree, but 2 and * 5 are, then the low position points to the 5 now and the high position * points to the 2 now. They're not necessarily even in the same bucket, * so there's no trick we can play with pointer compares to get out * cheap in general. */ if (lowbucket == highbucket && lowoffset > highoffset) goto empty_and_decref_buckets; /* definitely empty */ /* The buckets differ, or they're the same and the offsets show a non- * empty range. */ if (min != Py_None && max != Py_None && /* both args user-supplied */ lowbucket != highbucket) /* and different buckets */ { KEY_TYPE first; KEY_TYPE last; int cmp; /* Have to check the hard way: see how the endpoints compare. 
*/ UNLESS (PER_USE(lowbucket)) goto err_and_decref_buckets; COPY_KEY(first, lowbucket->keys[lowoffset]); PER_UNUSE(lowbucket); UNLESS (PER_USE(highbucket)) goto err_and_decref_buckets; COPY_KEY(last, highbucket->keys[highoffset]); PER_UNUSE(highbucket); TEST_KEY_SET_OR(cmp, first, last) goto err_and_decref_buckets; if (cmp > 0) goto empty_and_decref_buckets; } PER_UNUSE(self); result = newBTreeItems(type, lowbucket, lowoffset, highbucket, highoffset); Py_DECREF(lowbucket); Py_DECREF(highbucket); return result; err_and_decref_buckets: Py_DECREF(lowbucket); Py_DECREF(highbucket); err: PER_UNUSE(self); return NULL; empty_and_decref_buckets: Py_DECREF(lowbucket); Py_DECREF(highbucket); empty: PER_UNUSE(self); return newBTreeItems(type, 0, 0, 0, 0); } /* ** BTree_keys */ static PyObject * BTree_keys(BTree *self, PyObject *args, PyObject *kw) { return BTree_rangeSearch(self, args, kw, 'k'); } /* ** BTree_values */ static PyObject * BTree_values(BTree *self, PyObject *args, PyObject *kw) { return BTree_rangeSearch(self, args, kw, 'v'); } /* ** BTree_items */ static PyObject * BTree_items(BTree *self, PyObject *args, PyObject *kw) { return BTree_rangeSearch(self, args, kw, 'i'); } static PyObject * BTree_byValue(BTree *self, PyObject *omin) { PyObject *r=0, *o=0, *item=0; VALUE_TYPE min; VALUE_TYPE v; int copied=1; SetIteration it = {0, 0, 1}; UNLESS (PER_USE(self)) return NULL; COPY_VALUE_FROM_ARG(min, omin, copied); UNLESS(copied) return NULL; UNLESS (r=PyList_New(0)) goto err; it.set=BTree_rangeSearch(self, NULL, NULL, 'i'); UNLESS(it.set) goto err; if (nextBTreeItems(&it) < 0) goto err; while (it.position >= 0) { if (TEST_VALUE(it.value, min) >= 0) { UNLESS (item = PyTuple_New(2)) goto err; COPY_KEY_TO_OBJECT(o, it.key); UNLESS (o) goto err; PyTuple_SET_ITEM(item, 1, o); COPY_VALUE(v, it.value); NORMALIZE_VALUE(v, min); COPY_VALUE_TO_OBJECT(o, v); DECREF_VALUE(v); UNLESS (o) goto err; PyTuple_SET_ITEM(item, 0, o); if (PyList_Append(r, item) < 0) goto err; Py_DECREF(item); item = 0; } if (nextBTreeItems(&it) < 0) goto err; } item=PyObject_GetAttr(r,sort_str); UNLESS (item) goto err; ASSIGN(item, PyObject_CallObject(item, NULL)); UNLESS (item) goto err; ASSIGN(item, PyObject_GetAttr(r, reverse_str)); UNLESS (item) goto err; ASSIGN(item, PyObject_CallObject(item, NULL)); UNLESS (item) goto err; Py_DECREF(item); finiSetIteration(&it); PER_UNUSE(self); return r; err: PER_UNUSE(self); Py_XDECREF(r); finiSetIteration(&it); Py_XDECREF(item); return NULL; } /* ** BTree_getm */ static PyObject * BTree_getm(BTree *self, PyObject *args) { PyObject *key, *d=Py_None, *r; UNLESS (PyArg_ParseTuple(args, "O|O", &key, &d)) return NULL; if ((r=_BTree_get(self, key, 0))) return r; UNLESS (PyErr_ExceptionMatches(PyExc_KeyError)) return NULL; PyErr_Clear(); Py_INCREF(d); return d; } static PyObject * BTree_has_key(BTree *self, PyObject *key) { return _BTree_get(self, key, 1); } static PyObject * BTree_setdefault(BTree *self, PyObject *args) { PyObject *key; PyObject *failobj; /* default */ PyObject *value; /* return value */ if (! PyArg_UnpackTuple(args, "setdefault", 2, 2, &key, &failobj)) return NULL; value = _BTree_get(self, key, 0); if (value != NULL) return value; /* The key isn't in the tree. If that's not due to a KeyError exception, * pass back the unexpected exception. */ if (! PyErr_ExceptionMatches(PyExc_KeyError)) return NULL; PyErr_Clear(); /* Associate `key` with `failobj` in the tree, and return `failobj`. 
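     * For example (hypothetical key): if 42 is not yet in the tree,
     * t.setdefault(42, default) stores 42 with that default and returns it;
     * if 42 is already present, its existing value is returned unchanged.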
*/ value = failobj; if (_BTree_set(self, key, failobj, 0, 0) < 0) value = NULL; Py_XINCREF(value); return value; } /* forward declaration */ static Py_ssize_t BTree_length_or_nonzero(BTree *self, int nonzero); static PyObject * BTree_pop(BTree *self, PyObject *args) { PyObject *key; PyObject *failobj = NULL; /* default */ PyObject *value; /* return value */ if (! PyArg_UnpackTuple(args, "pop", 1, 2, &key, &failobj)) return NULL; value = _BTree_get(self, key, 0); if (value != NULL) { /* Delete key and associated value. */ if (_BTree_set(self, key, NULL, 0, 0) < 0) { Py_DECREF(value); return NULL;; } return value; } /* The key isn't in the tree. If that's not due to a KeyError exception, * pass back the unexpected exception. */ if (! PyErr_ExceptionMatches(PyExc_KeyError)) return NULL; if (failobj != NULL) { /* Clear the KeyError and return the explicit default. */ PyErr_Clear(); Py_INCREF(failobj); return failobj; } /* No default given. The only difference in this case is the error * message, which depends on whether the tree is empty. */ if (BTree_length_or_nonzero(self, 1) == 0) /* tree is empty */ PyErr_SetString(PyExc_KeyError, "pop(): BTree is empty"); return NULL; } /* Search BTree self for key. This is the sq_contains slot of the * PySequenceMethods. * * Return: * -1 error * 0 not found * 1 found */ static int BTree_contains(BTree *self, PyObject *key) { PyObject *asobj = _BTree_get(self, key, 1); int result = -1; if (asobj != NULL) { result = PyInt_AsLong(asobj) ? 1 : 0; Py_DECREF(asobj); } return result; } static PyObject * BTree_addUnique(BTree *self, PyObject *args) { int grew; PyObject *key, *v; UNLESS (PyArg_ParseTuple(args, "OO", &key, &v)) return NULL; if ((grew=_BTree_set(self, key, v, 1, 0)) < 0) return NULL; return PyInt_FromLong(grew); } /**************************************************************************/ /* Iterator support. */ /* A helper to build all the iterators for BTrees and TreeSets. * If args is NULL, the iterator spans the entire structure. Else it's an * argument tuple, with optional low and high arguments. * kind is 'k', 'v' or 'i'. * Returns a BTreeIter object, or NULL if error. */ static PyObject * buildBTreeIter(BTree *self, PyObject *args, PyObject *kw, char kind) { BTreeIter *result = NULL; BTreeItems *items = (BTreeItems *)BTree_rangeSearch(self, args, kw, kind); if (items) { result = BTreeIter_new(items); Py_DECREF(items); } return (PyObject *)result; } /* The implementation of iter(BTree_or_TreeSet); the BTree tp_iter slot. */ static PyObject * BTree_getiter(BTree *self) { return buildBTreeIter(self, NULL, NULL, 'k'); } /* The implementation of BTree.iterkeys(). */ static PyObject * BTree_iterkeys(BTree *self, PyObject *args, PyObject *kw) { return buildBTreeIter(self, args, kw, 'k'); } /* The implementation of BTree.itervalues(). */ static PyObject * BTree_itervalues(BTree *self, PyObject *args, PyObject *kw) { return buildBTreeIter(self, args, kw, 'v'); } /* The implementation of BTree.iteritems(). */ static PyObject * BTree_iteritems(BTree *self, PyObject *args, PyObject *kw) { return buildBTreeIter(self, args, kw, 'i'); } /* End of iterator support. */ /* Caution: Even though the _firstbucket attribute is read-only, a program could do arbitrary damage to the btree internals. For example, it could call clear() on a bucket inside a BTree. We need to decide if the convenience for inspecting BTrees is worth the risk. 
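  (For read-only inspection, a debugging tool might start at _firstbucket and
  follow each bucket's _next pointer to visit the leaf chain in key order; the
  danger described above is mutation, not traversal.)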
*/ static struct PyMemberDef BTree_members[] = { {"_firstbucket", T_OBJECT, offsetof(BTree, firstbucket), RO}, {NULL} }; static struct PyMethodDef BTree_methods[] = { {"__getstate__", (PyCFunction) BTree_getstate, METH_NOARGS, "__getstate__() -> state\n\n" "Return the picklable state of the BTree."}, {"__setstate__", (PyCFunction) BTree_setstate, METH_O, "__setstate__(state)\n\n" "Set the state of the BTree."}, {"has_key", (PyCFunction) BTree_has_key, METH_O, "has_key(key)\n\n" "Return true if the BTree contains the given key."}, {"keys", (PyCFunction) BTree_keys, METH_KEYWORDS, "keys([min, max]) -> list of keys\n\n" "Returns the keys of the BTree. If min and max are supplied, only\n" "keys greater than min and less than max are returned."}, {"values", (PyCFunction) BTree_values, METH_KEYWORDS, "values([min, max]) -> list of values\n\n" "Returns the values of the BTree. If min and max are supplied, only\n" "values corresponding to keys greater than min and less than max are\n" "returned."}, {"items", (PyCFunction) BTree_items, METH_KEYWORDS, "items([min, max]) -> -- list of key, value pairs\n\n" "Returns the items of the BTree. If min and max are supplied, only\n" "items with keys greater than min and less than max are returned."}, {"byValue", (PyCFunction) BTree_byValue, METH_O, "byValue(min) -> list of value, key pairs\n\n" "Returns list of value, key pairs where the value is >= min. The\n" "list is sorted by value. Note that items() returns keys in the\n" "opposite order."}, {"get", (PyCFunction) BTree_getm, METH_VARARGS, "get(key[, default=None]) -> Value for key or default\n\n" "Return the value or the default if the key is not found."}, {"setdefault", (PyCFunction) BTree_setdefault, METH_VARARGS, "D.setdefault(k, d) -> D.get(k, d), also set D[k]=d if k not in D.\n\n" "Return the value like get() except that if key is missing, d is both\n" "returned and inserted into the BTree as the value of k."}, {"pop", (PyCFunction) BTree_pop, METH_VARARGS, "D.pop(k[, d]) -> v, remove key and return the corresponding value.\n\n" "If key is not found, d is returned if given, otherwise KeyError\n" "is raised."}, {"maxKey", (PyCFunction) BTree_maxKey, METH_VARARGS, "maxKey([max]) -> key\n\n" "Return the largest key in the BTree. If max is specified, return\n" "the largest key <= max."}, {"minKey", (PyCFunction) BTree_minKey, METH_VARARGS, "minKey([mi]) -> key\n\n" "Return the smallest key in the BTree. If min is specified, return\n" "the smallest key >= min."}, {"clear", (PyCFunction) BTree_clear, METH_NOARGS, "clear()\n\nRemove all of the items from the BTree."}, {"insert", (PyCFunction)BTree_addUnique, METH_VARARGS, "insert(key, value) -> 0 or 1\n\n" "Add an item if the key is not already used. 
Return 1 if the item was\n" "added, or 0 otherwise."}, {"update", (PyCFunction) Mapping_update, METH_O, "update(collection)\n\n Add the items from the given collection."}, {"iterkeys", (PyCFunction) BTree_iterkeys, METH_KEYWORDS, "B.iterkeys([min[,max]]) -> an iterator over the keys of B"}, {"itervalues", (PyCFunction) BTree_itervalues, METH_KEYWORDS, "B.itervalues([min[,max]]) -> an iterator over the values of B"}, {"iteritems", (PyCFunction) BTree_iteritems, METH_KEYWORDS, "B.iteritems([min[,max]]) -> an iterator over the (key, value) items of B"}, {"_check", (PyCFunction) BTree_check, METH_NOARGS, "Perform sanity check on BTree, and raise exception if flawed."}, #ifdef PERSISTENT {"_p_resolveConflict", (PyCFunction) BTree__p_resolveConflict, METH_VARARGS, "_p_resolveConflict() -- Reinitialize from a newly created copy"}, {"_p_deactivate", (PyCFunction) BTree__p_deactivate, METH_KEYWORDS, "_p_deactivate()\n\nReinitialize from a newly created copy."}, #endif {NULL, NULL} }; static int BTree_init(PyObject *self, PyObject *args, PyObject *kwds) { PyObject *v = NULL; if (!PyArg_ParseTuple(args, "|O:" MOD_NAME_PREFIX "BTree", &v)) return -1; if (v) return update_from_seq(self, v); else return 0; } static void BTree_dealloc(BTree *self) { if (self->state != cPersistent_GHOST_STATE) _BTree_clear(self); cPersistenceCAPI->pertype->tp_dealloc((PyObject *)self); } static int BTree_traverse(BTree *self, visitproc visit, void *arg) { int err = 0; int i, len; #define VISIT(SLOT) \ if (SLOT) { \ err = visit((PyObject *)(SLOT), arg); \ if (err) \ goto Done; \ } if (self->ob_type == &BTreeType) assert(self->ob_type->tp_dictoffset == 0); /* Call our base type's traverse function. Because BTrees are * subclasses of Peristent, there must be one. */ err = cPersistenceCAPI->pertype->tp_traverse((PyObject *)self, visit, arg); if (err) goto Done; /* If this is registered with the persistence system, cleaning up cycles * is the database's problem. It would be horrid to unghostify BTree * nodes here just to chase pointers every time gc runs. */ if (self->state == cPersistent_GHOST_STATE) goto Done; len = self->len; #ifdef KEY_TYPE_IS_PYOBJECT /* Keys are Python objects so need to be traversed. Note that the * key 0 slot is unused and should not be traversed. */ for (i = 1; i < len; i++) VISIT(self->data[i].key); #endif /* Children are always pointers, and child 0 is legit. */ for (i = 0; i < len; i++) VISIT(self->data[i].child); VISIT(self->firstbucket); Done: return err; #undef VISIT } static int BTree_tp_clear(BTree *self) { if (self->state != cPersistent_GHOST_STATE) _BTree_clear(self); return 0; } /* * Return the number of elements in a BTree. nonzero is a Boolean, and * when true requests just a non-empty/empty result. Testing for emptiness * is efficient (constant-time). Getting the true length takes time * proportional to the number of leaves (buckets). * * Return: * When nonzero true: * -1 error * 0 empty * 1 not empty * When nonzero false (possibly expensive!): * -1 error * >= 0 number of elements. 
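 * In Python terms this is the difference between bool(tree), which only has
 * to check whether a first bucket exists, and len(tree), which walks the
 * whole bucket chain adding up bucket lengths.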
*/ static Py_ssize_t BTree_length_or_nonzero(BTree *self, int nonzero) { int result; Bucket *b; Bucket *next; PER_USE_OR_RETURN(self, -1); b = self->firstbucket; PER_UNUSE(self); if (nonzero) return b != NULL; result = 0; while (b) { PER_USE_OR_RETURN(b, -1); result += b->len; next = b->next; PER_UNUSE(b); b = next; } return result; } static Py_ssize_t BTree_length(BTree *self) { return BTree_length_or_nonzero(self, 0); } static PyMappingMethods BTree_as_mapping = { (lenfunc)BTree_length, /*mp_length*/ (binaryfunc)BTree_get, /*mp_subscript*/ (objobjargproc)BTree_setitem, /*mp_ass_subscript*/ }; static PySequenceMethods BTree_as_sequence = { (lenfunc)0, /* sq_length */ (binaryfunc)0, /* sq_concat */ (ssizeargfunc)0, /* sq_repeat */ (ssizeargfunc)0, /* sq_item */ (ssizessizeargfunc)0, /* sq_slice */ (ssizeobjargproc)0, /* sq_ass_item */ (ssizessizeobjargproc)0, /* sq_ass_slice */ (objobjproc)BTree_contains, /* sq_contains */ 0, /* sq_inplace_concat */ 0, /* sq_inplace_repeat */ }; static Py_ssize_t BTree_nonzero(BTree *self) { return BTree_length_or_nonzero(self, 1); } static PyNumberMethods BTree_as_number_for_nonzero = { 0,0,0,0,0,0,0,0,0,0, (inquiry)BTree_nonzero}; static PyTypeObject BTreeType = { PyObject_HEAD_INIT(NULL) /* PyPersist_Type */ 0, /* ob_size */ MODULE_NAME MOD_NAME_PREFIX "BTree",/* tp_name */ sizeof(BTree), /* tp_basicsize */ 0, /* tp_itemsize */ (destructor)BTree_dealloc, /* tp_dealloc */ 0, /* tp_print */ 0, /* tp_getattr */ 0, /* tp_setattr */ 0, /* tp_compare */ 0, /* tp_repr */ &BTree_as_number_for_nonzero, /* tp_as_number */ &BTree_as_sequence, /* tp_as_sequence */ &BTree_as_mapping, /* tp_as_mapping */ 0, /* tp_hash */ 0, /* tp_call */ 0, /* tp_str */ 0, /* tp_getattro */ 0, /* tp_setattro */ 0, /* tp_as_buffer */ Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC | Py_TPFLAGS_BASETYPE, /* tp_flags */ 0, /* tp_doc */ (traverseproc)BTree_traverse, /* tp_traverse */ (inquiry)BTree_tp_clear, /* tp_clear */ 0, /* tp_richcompare */ 0, /* tp_weaklistoffset */ (getiterfunc)BTree_getiter, /* tp_iter */ 0, /* tp_iternext */ BTree_methods, /* tp_methods */ BTree_members, /* tp_members */ 0, /* tp_getset */ 0, /* tp_base */ 0, /* tp_dict */ 0, /* tp_descr_get */ 0, /* tp_descr_set */ 0, /* tp_dictoffset */ BTree_init, /* tp_init */ 0, /* tp_alloc */ 0, /*PyType_GenericNew,*/ /* tp_new */ }; ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/BucketTemplate.c000066400000000000000000001367441230730566700241570ustar00rootroot00000000000000/***************************************************************************** Copyright (c) 2001, 2002 Zope Foundation and Contributors. All Rights Reserved. This software is subject to the provisions of the Zope Public License, Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE ****************************************************************************/ #define BUCKETTEMPLATE_C "$Id$\n" /* Use BUCKET_SEARCH to find the index at which a key belongs. * INDEX An int lvalue to hold the index i such that KEY belongs at * SELF->keys[i]. Note that this will equal SELF->len if KEY * is larger than the bucket's largest key. Else it's the * smallest i such that SELF->keys[i] >= KEY. * ABSENT An int lvalue to hold a Boolean result, true (!= 0) if the * key is absent, false (== 0) if the key is at INDEX. 
* SELF A pointer to a Bucket node. * KEY The key you're looking for, of type KEY_TYPE. * ONERROR What to do if key comparison raises an exception; for example, * perhaps 'return NULL'. * * See Maintainer.txt for discussion: this is optimized in subtle ways. * It's recommended that you call this at the start of a routine, waiting * to check for self->len == 0 after (if an empty bucket is special in * context; INDEX becomes 0 and ABSENT becomes true if this macro is run * with an empty SELF, and that may be all the invoker needs to know). */ #define BUCKET_SEARCH(INDEX, ABSENT, SELF, KEY, ONERROR) { \ int _lo = 0; \ int _hi = (SELF)->len; \ int _i; \ int _cmp = 1; \ for (_i = _hi >> 1; _lo < _hi; _i = (_lo + _hi) >> 1) { \ TEST_KEY_SET_OR(_cmp, (SELF)->keys[_i], (KEY)) \ ONERROR; \ if (_cmp < 0) _lo = _i + 1; \ else if (_cmp == 0) break; \ else _hi = _i; \ } \ (INDEX) = _i; \ (ABSENT) = _cmp; \ } /* ** _bucket_get ** ** Search a bucket for a given key. ** ** Arguments ** self The bucket ** keyarg The key to look for ** has_key Boolean; if true, return a true/false result; else return ** the value associated with the key. ** ** Return ** If has_key: ** Returns the Python int 0 if the key is absent, else returns ** has_key itself as a Python int. A BTree caller generally passes ** the depth of the bucket for has_key, so a true result returns ** the bucket depth then. ** Note that has_key should be true when searching set buckets. ** If not has_key: ** If the key is present, returns the associated value, and the ** caller owns the reference. Else returns NULL and sets KeyError. ** Whether or not has_key: ** If a comparison sets an exception, returns NULL. */ static PyObject * _bucket_get(Bucket *self, PyObject *keyarg, int has_key) { int i, cmp; KEY_TYPE key; PyObject *r = NULL; int copied = 1; COPY_KEY_FROM_ARG(key, keyarg, copied); UNLESS (copied) return NULL; UNLESS (PER_USE(self)) return NULL; BUCKET_SEARCH(i, cmp, self, key, goto Done); if (has_key) r = PyInt_FromLong(cmp ? 0 : has_key); else { if (cmp == 0) { COPY_VALUE_TO_OBJECT(r, self->values[i]); } else PyErr_SetObject(PyExc_KeyError, keyarg); } Done: PER_UNUSE(self); return r; } static PyObject * bucket_getitem(Bucket *self, PyObject *key) { return _bucket_get(self, key, 0); } /* ** Bucket_grow ** ** Resize a bucket. ** ** Arguments: self The bucket. ** newsize The new maximum capacity. If < 0, double the ** current size unless the bucket is currently empty, ** in which case use MIN_BUCKET_ALLOC. 
** noval Boolean; if true, allocate only key space and not ** value space ** ** Returns: -1 on error, and MemoryError exception is set ** 0 on success */ static int Bucket_grow(Bucket *self, int newsize, int noval) { KEY_TYPE *keys; VALUE_TYPE *values; if (self->size) { if (newsize < 0) newsize = self->size * 2; if (newsize < 0) /* int overflow */ goto Overflow; UNLESS (keys = BTree_Realloc(self->keys, sizeof(KEY_TYPE) * newsize)) return -1; UNLESS (noval) { values = BTree_Realloc(self->values, sizeof(VALUE_TYPE) * newsize); if (values == NULL) { free(keys); return -1; } self->values = values; } self->keys = keys; } else { if (newsize < 0) newsize = MIN_BUCKET_ALLOC; UNLESS (self->keys = BTree_Malloc(sizeof(KEY_TYPE) * newsize)) return -1; UNLESS (noval) { self->values = BTree_Malloc(sizeof(VALUE_TYPE) * newsize); if (self->values == NULL) { free(self->keys); self->keys = NULL; return -1; } } } self->size = newsize; return 0; Overflow: PyErr_NoMemory(); return -1; } /* So far, bucket_append is called only by multiunion_m(), so is called * only when MULTI_INT_UNION is defined. Flavors of BTree/Bucket that * don't support MULTI_INT_UNION don't call bucket_append (yet), and * gcc complains if bucket_append is compiled in those cases. So only * compile bucket_append if it's going to be used. */ #ifdef MULTI_INT_UNION /* * Append a slice of the "from" bucket to self. * * self Append (at least keys) to this bucket. self must be activated * upon entry, and remains activated at exit. If copyValues * is true, self must be empty or already have a non-NULL values * pointer. self's access and modification times aren't updated. * from The bucket from which to take keys, and possibly values. from * must be activated upon entry, and remains activated at exit. * If copyValues is true, from must have a non-NULL values * pointer. self and from must not be the same. from's access * time isn't updated. * i, n The slice from[i : i+n] is appended to self. Must have * i >= 0, n > 0 and i+n <= from->len. * copyValues Boolean. If true, copy values from the slice as well as keys. * In this case, from must have a non-NULL values pointer, and * self must too (unless self is empty, in which case a values * vector will be allocated for it). * overallocate Boolean. If self doesn't have enough room upon entry to hold * all the appended stuff, then if overallocate is false exactly * enough room will be allocated to hold the new stuff, else if * overallocate is true an excess will be allocated. overallocate * may be a good idea if you expect to append more stuff to self * later; else overallocate should be false. * * CAUTION: If self is empty upon entry (self->size == 0), and copyValues is * false, then no space for values will get allocated. This can be a trap if * the caller intends to copy values itself. * * Return * -1 Error. * 0 OK. */ static int bucket_append(Bucket *self, Bucket *from, int i, int n, int copyValues, int overallocate) { int newlen; assert(self && from && self != from); assert(i >= 0); assert(n > 0); assert(i+n <= from->len); /* Make room. */ newlen = self->len + n; if (newlen > self->size) { int newsize = newlen; if (overallocate) /* boost by 25% -- pretty arbitrary */ newsize += newsize >> 2; if (Bucket_grow(self, newsize, ! copyValues) < 0) return -1; } assert(newlen <= self->size); /* Copy stuff. 
*/ memcpy(self->keys + self->len, from->keys + i, n * sizeof(KEY_TYPE)); if (copyValues) { assert(self->values); assert(from->values); memcpy(self->values + self->len, from->values + i, n * sizeof(VALUE_TYPE)); } self->len = newlen; /* Bump refcounts. */ #ifdef KEY_TYPE_IS_PYOBJECT { int j; PyObject **p = from->keys + i; for (j = 0; j < n; ++j, ++p) { Py_INCREF(*p); } } #endif #ifdef VALUE_TYPE_IS_PYOBJECT if (copyValues) { int j; PyObject **p = from->values + i; for (j = 0; j < n; ++j, ++p) { Py_INCREF(*p); } } #endif return 0; } #endif /* MULTI_INT_UNION */ /* ** _bucket_set: Assign a value to a key in a bucket, delete a key+value ** pair, or just insert a key. ** ** Arguments ** self The bucket ** keyarg The key to look for ** v The value to associate with key; NULL means delete the key. ** If NULL, it's an error (KeyError) if the key isn't present. ** Note that if this is a set bucket, and you want to insert ** a new set element, v must be non-NULL although its exact ** value will be ignored. Passing Py_None is good for this. ** unique Boolean; when true, don't replace the value if the key is ** already present. ** noval Boolean; when true, operate on keys only (ignore values) ** changed ignored on input ** ** Return ** -1 on error ** 0 on success and the # of bucket entries didn't change ** 1 on success and the # of bucket entries did change ** *changed If non-NULL, set to 1 on any mutation of the bucket. */ static int _bucket_set(Bucket *self, PyObject *keyarg, PyObject *v, int unique, int noval, int *changed) { int i, cmp; KEY_TYPE key; /* Subtle: there may or may not be a value. If there is, we need to * check its type early, so that in case of error we can get out before * mutating the bucket. But because value isn't used on all paths, if * we don't initialize value then gcc gives a nuisance complaint that * value may be used initialized (it can't be, but gcc doesn't know * that). So we initialize it. However, VALUE_TYPE can be various types, * including int, PyObject*, and char[6], so it's a puzzle to spell * initialization. It so happens that {0} is a valid initializer for all * these types. */ VALUE_TYPE value = {0}; /* squash nuisance warning */ int result = -1; /* until proven innocent */ int copied = 1; COPY_KEY_FROM_ARG(key, keyarg, copied); UNLESS(copied) return -1; /* Copy the value early (if needed), so that in case of error a * pile of bucket mutations don't need to be undone. */ if (v && !noval) { COPY_VALUE_FROM_ARG(value, v, copied); UNLESS(copied) return -1; } UNLESS (PER_USE(self)) return -1; BUCKET_SEARCH(i, cmp, self, key, goto Done); if (cmp == 0) { /* The key exists, at index i. */ if (v) { /* The key exists at index i, and there's a new value. * If unique, we're not supposed to replace it. If noval, or this * is a set bucket (self->values is NULL), there's nothing to do. */ if (unique || noval || self->values == NULL) { result = 0; goto Done; } /* The key exists at index i, and we need to replace the value. */ #ifdef VALUE_SAME /* short-circuit if no change */ if (VALUE_SAME(self->values[i], value)) { result = 0; goto Done; } #endif if (changed) *changed = 1; DECREF_VALUE(self->values[i]); COPY_VALUE(self->values[i], value); INCREF_VALUE(self->values[i]); if (PER_CHANGED(self) >= 0) result = 0; goto Done; } /* The key exists at index i, and should be deleted. 
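     * (The arrays are closed up with memmove below; if this removes the last
     * entry, the key and value storage is freed and size is reset to 0.)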
*/ DECREF_KEY(self->keys[i]); self->len--; if (i < self->len) memmove(self->keys + i, self->keys + i+1, sizeof(KEY_TYPE)*(self->len - i)); if (self->values) { DECREF_VALUE(self->values[i]); if (i < self->len) memmove(self->values + i, self->values + i+1, sizeof(VALUE_TYPE)*(self->len - i)); } if (! self->len) { self->size = 0; free(self->keys); self->keys = NULL; if (self->values) { free(self->values); self->values = NULL; } } if (changed) *changed = 1; if (PER_CHANGED(self) >= 0) result = 1; goto Done; } /* The key doesn't exist, and belongs at index i. */ if (!v) { /* Can't delete a non-existent key. */ PyErr_SetObject(PyExc_KeyError, keyarg); goto Done; } /* The key doesn't exist and should be inserted at index i. */ if (self->len == self->size && Bucket_grow(self, -1, noval) < 0) goto Done; if (self->len > i) { memmove(self->keys + i + 1, self->keys + i, sizeof(KEY_TYPE) * (self->len - i)); if (self->values) { memmove(self->values + i + 1, self->values + i, sizeof(VALUE_TYPE) * (self->len - i)); } } COPY_KEY(self->keys[i], key); INCREF_KEY(self->keys[i]); if (! noval) { COPY_VALUE(self->values[i], value); INCREF_VALUE(self->values[i]); } self->len++; if (changed) *changed = 1; if (PER_CHANGED(self) >= 0) result = 1; Done: PER_UNUSE(self); return result; } /* ** bucket_setitem ** ** wrapper for _bucket_setitem (eliminates +1 return code) ** ** Arguments: self The bucket ** key The key to insert under ** v The value to insert ** ** Returns 0 on success ** -1 on failure */ static int bucket_setitem(Bucket *self, PyObject *key, PyObject *v) { if (_bucket_set(self, key, v, 0, 0, 0) < 0) return -1; return 0; } /** ** Accepts a sequence of 2-tuples, or any object with an items() ** method that returns an iterable object producing 2-tuples. */ static int update_from_seq(PyObject *map, PyObject *seq) { PyObject *iter, *o, *k, *v; int err = -1; /* One path creates a new seq object. The other path has an INCREF of the seq argument. So seq must always be DECREFed on the way out. */ /* Use items() if it's not a sequence. Alas, PySequence_Check() * returns true for a PeristentMapping or PersistentDict, and we * want to use items() in those cases too. 
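 * Either kind of argument works here (hypothetical examples): a plain
 * sequence of pairs such as [(1, 'a'), (2, 'b')], or a mapping-like object
 * (a dict, another BTree, a PersistentMapping) whose items() yields pairs.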
*/ if (!PySequence_Check(seq) || /* or it "looks like a dict" */ PyObject_HasAttrString(seq, "iteritems")) { PyObject *items; items = PyObject_GetAttrString(seq, "items"); if (items == NULL) return -1; seq = PyObject_CallObject(items, NULL); Py_DECREF(items); if (seq == NULL) return -1; } else Py_INCREF(seq); iter = PyObject_GetIter(seq); if (iter == NULL) goto err; while (1) { o = PyIter_Next(iter); if (o == NULL) { if (PyErr_Occurred()) goto err; else break; } if (!PyTuple_Check(o) || PyTuple_GET_SIZE(o) != 2) { Py_DECREF(o); PyErr_SetString(PyExc_TypeError, "Sequence must contain 2-item tuples"); goto err; } k = PyTuple_GET_ITEM(o, 0); v = PyTuple_GET_ITEM(o, 1); if (PyObject_SetItem(map, k, v) < 0) { Py_DECREF(o); goto err; } Py_DECREF(o); } err = 0; err: Py_DECREF(iter); Py_DECREF(seq); return err; } static PyObject * Mapping_update(PyObject *self, PyObject *seq) { if (update_from_seq(self, seq) < 0) return NULL; Py_INCREF(Py_None); return Py_None; } /* ** bucket_split ** ** Splits one bucket into two ** ** Arguments: self The bucket ** index the index of the key to split at (O.O.B use midpoint) ** next the new bucket to split into ** ** Returns: 0 on success ** -1 on failure */ static int bucket_split(Bucket *self, int index, Bucket *next) { int next_size; ASSERT(self->len > 1, "split of empty bucket", -1); if (index < 0 || index >= self->len) index = self->len / 2; next_size = self->len - index; next->keys = BTree_Malloc(sizeof(KEY_TYPE) * next_size); if (!next->keys) return -1; memcpy(next->keys, self->keys + index, sizeof(KEY_TYPE) * next_size); if (self->values) { next->values = BTree_Malloc(sizeof(VALUE_TYPE) * next_size); if (!next->values) { free(next->keys); next->keys = NULL; return -1; } memcpy(next->values, self->values + index, sizeof(VALUE_TYPE) * next_size); } next->size = next_size; next->len = next_size; self->len = index; next->next = self->next; Py_INCREF(next); self->next = next; if (PER_CHANGED(self) < 0) return -1; return 0; } /* Set self->next to self->next->next, i.e. unlink self's successor from * the chain. * * Return: * -1 error * 0 OK */ static int Bucket_deleteNextBucket(Bucket *self) { int result = -1; /* until proven innocent */ Bucket *successor; PER_USE_OR_RETURN(self, -1); successor = self->next; if (successor) { Bucket *next; /* Before: self -> successor -> next * After: self --------------> next */ UNLESS (PER_USE(successor)) goto Done; next = successor->next; PER_UNUSE(successor); Py_XINCREF(next); /* it may be NULL, of course */ self->next = next; Py_DECREF(successor); if (PER_CHANGED(self) < 0) goto Done; } result = 0; Done: PER_UNUSE(self); return result; } /* Bucket_findRangeEnd -- Find the index of a range endpoint (possibly) contained in a bucket. Arguments: self The bucket keyarg The key to match against low Boolean; true for low end of range, false for high exclude_equal Boolean; if true, don't accept an exact match, and if there is one then move right if low and left if !low. offset The output offset If low true, *offset <- index of the smallest item >= key, if low false the index of the largest item <= key. In either case, if there is no such index, *offset is left alone and 0 is returned. Return: 0 No suitable index exists; *offset has not been changed 1 The correct index was stored into *offset -1 Error Example: Suppose the keys are [2, 4], and exclude_equal is false. Searching for 2 sets *offset to 0 and returns 1 regardless of low. Searching for 4 sets *offset to 1 and returns 1 regardless of low. 
Searching for 1: If low true, sets *offset to 0, returns 1. If low false, returns 0. Searching for 3: If low true, sets *offset to 1, returns 1. If low false, sets *offset to 0, returns 1. Searching for 5: If low true, returns 0. If low false, sets *offset to 1, returns 1. The 1, 3 and 5 examples are the same when exclude_equal is true. */ static int Bucket_findRangeEnd(Bucket *self, PyObject *keyarg, int low, int exclude_equal, int *offset) { int i, cmp; int result = -1; /* until proven innocent */ KEY_TYPE key; int copied = 1; COPY_KEY_FROM_ARG(key, keyarg, copied); UNLESS (copied) return -1; UNLESS (PER_USE(self)) return -1; BUCKET_SEARCH(i, cmp, self, key, goto Done); if (cmp == 0) { /* exact match at index i */ if (exclude_equal) { /* but we don't want an exact match */ if (low) ++i; else --i; } } /* Else keys[i-1] < key < keys[i], picturing infinities at OOB indices, * and i has the smallest item > key, which is correct for low. */ else if (! low) /* i-1 has the largest item < key (unless i-1 is 0OB) */ --i; result = 0 <= i && i < self->len; if (result) *offset = i; Done: PER_UNUSE(self); return result; } static PyObject * Bucket_maxminKey(Bucket *self, PyObject *args, int min) { PyObject *key=0; int rc, offset = 0; int empty_bucket = 1; if (args && ! PyArg_ParseTuple(args, "|O", &key)) return NULL; PER_USE_OR_RETURN(self, NULL); UNLESS (self->len) goto empty; /* Find the low range */ if (key) { if ((rc = Bucket_findRangeEnd(self, key, min, 0, &offset)) <= 0) { if (rc < 0) return NULL; empty_bucket = 0; goto empty; } } else if (min) offset = 0; else offset = self->len -1; COPY_KEY_TO_OBJECT(key, self->keys[offset]); PER_UNUSE(self); return key; empty: PyErr_SetString(PyExc_ValueError, empty_bucket ? "empty bucket" : "no key satisfies the conditions"); PER_UNUSE(self); return NULL; } static PyObject * Bucket_minKey(Bucket *self, PyObject *args) { return Bucket_maxminKey(self, args, 1); } static PyObject * Bucket_maxKey(Bucket *self, PyObject *args) { return Bucket_maxminKey(self, args, 0); } static int Bucket_rangeSearch(Bucket *self, PyObject *args, PyObject *kw, int *low, int *high) { PyObject *min = Py_None; PyObject *max = Py_None; int excludemin = 0; int excludemax = 0; int rc; if (args) { if (! PyArg_ParseTupleAndKeywords(args, kw, "|OOii", search_keywords, &min, &max, &excludemin, &excludemax)) return -1; } UNLESS (self->len) goto empty; /* Find the low range */ if (min != Py_None) { rc = Bucket_findRangeEnd(self, min, 1, excludemin, low); if (rc < 0) return -1; if (rc == 0) goto empty; } else { *low = 0; if (excludemin) { if (self->len < 2) goto empty; ++*low; } } /* Find the high range */ if (max != Py_None) { rc = Bucket_findRangeEnd(self, max, 0, excludemax, high); if (rc < 0) return -1; if (rc == 0) goto empty; } else { *high = self->len - 1; if (excludemax) { if (self->len < 2) goto empty; --*high; } } /* If min < max to begin with, it's quite possible that low > high now. 
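     Concretely, with keys [2, 6], min=3 and max=5 leave *low at 1 (the 6) and
     *high at 0 (the 2), so the test below reports an empty range.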
*/ if (*low <= *high) return 0; empty: *low = 0; *high = -1; return 0; } /* ** bucket_keys ** ** Generate a list of all keys in the bucket ** ** Arguments: self The Bucket ** args (unused) ** ** Returns: list of bucket keys */ static PyObject * bucket_keys(Bucket *self, PyObject *args, PyObject *kw) { PyObject *r = NULL, *key; int i, low, high; PER_USE_OR_RETURN(self, NULL); if (Bucket_rangeSearch(self, args, kw, &low, &high) < 0) goto err; r = PyList_New(high-low+1); if (r == NULL) goto err; for (i=low; i <= high; i++) { COPY_KEY_TO_OBJECT(key, self->keys[i]); if (PyList_SetItem(r, i-low , key) < 0) goto err; } PER_UNUSE(self); return r; err: PER_UNUSE(self); Py_XDECREF(r); return NULL; } /* ** bucket_values ** ** Generate a list of all values in the bucket ** ** Arguments: self The Bucket ** args (unused) ** ** Returns list of values */ static PyObject * bucket_values(Bucket *self, PyObject *args, PyObject *kw) { PyObject *r=0, *v; int i, low, high; PER_USE_OR_RETURN(self, NULL); if (Bucket_rangeSearch(self, args, kw, &low, &high) < 0) goto err; UNLESS (r=PyList_New(high-low+1)) goto err; for (i=low; i <= high; i++) { COPY_VALUE_TO_OBJECT(v, self->values[i]); UNLESS (v) goto err; if (PyList_SetItem(r, i-low, v) < 0) goto err; } PER_UNUSE(self); return r; err: PER_UNUSE(self); Py_XDECREF(r); return NULL; } /* ** bucket_items ** ** Returns a list of all items in a bucket ** ** Arguments: self The Bucket ** args (unused) ** ** Returns: list of all items in the bucket */ static PyObject * bucket_items(Bucket *self, PyObject *args, PyObject *kw) { PyObject *r=0, *o=0, *item=0; int i, low, high; PER_USE_OR_RETURN(self, NULL); if (Bucket_rangeSearch(self, args, kw, &low, &high) < 0) goto err; UNLESS (r=PyList_New(high-low+1)) goto err; for (i=low; i <= high; i++) { UNLESS (item = PyTuple_New(2)) goto err; COPY_KEY_TO_OBJECT(o, self->keys[i]); UNLESS (o) goto err; PyTuple_SET_ITEM(item, 0, o); COPY_VALUE_TO_OBJECT(o, self->values[i]); UNLESS (o) goto err; PyTuple_SET_ITEM(item, 1, o); if (PyList_SetItem(r, i-low, item) < 0) goto err; item = 0; } PER_UNUSE(self); return r; err: PER_UNUSE(self); Py_XDECREF(r); Py_XDECREF(item); return NULL; } static PyObject * bucket_byValue(Bucket *self, PyObject *omin) { PyObject *r=0, *o=0, *item=0; VALUE_TYPE min; VALUE_TYPE v; int i, l, copied=1; PER_USE_OR_RETURN(self, NULL); COPY_VALUE_FROM_ARG(min, omin, copied); UNLESS(copied) return NULL; for (i=0, l=0; i < self->len; i++) if (TEST_VALUE(self->values[i], min) >= 0) l++; UNLESS (r=PyList_New(l)) goto err; for (i=0, l=0; i < self->len; i++) { if (TEST_VALUE(self->values[i], min) < 0) continue; UNLESS (item = PyTuple_New(2)) goto err; COPY_KEY_TO_OBJECT(o, self->keys[i]); UNLESS (o) goto err; PyTuple_SET_ITEM(item, 1, o); COPY_VALUE(v, self->values[i]); NORMALIZE_VALUE(v, min); COPY_VALUE_TO_OBJECT(o, v); DECREF_VALUE(v); UNLESS (o) goto err; PyTuple_SET_ITEM(item, 0, o); if (PyList_SetItem(r, l, item) < 0) goto err; l++; item = 0; } item=PyObject_GetAttr(r,sort_str); UNLESS (item) goto err; ASSIGN(item, PyObject_CallObject(item, NULL)); UNLESS (item) goto err; ASSIGN(item, PyObject_GetAttr(r, reverse_str)); UNLESS (item) goto err; ASSIGN(item, PyObject_CallObject(item, NULL)); UNLESS (item) goto err; Py_DECREF(item); PER_UNUSE(self); return r; err: PER_UNUSE(self); Py_XDECREF(r); Py_XDECREF(item); return NULL; } static int _bucket_clear(Bucket *self) { const int len = self->len; /* Don't declare i at this level. 
If neither keys nor values are * PyObject*, i won't be referenced, and you'll get a nuisance compiler * wng for declaring it here. */ self->len = self->size = 0; if (self->next) { Py_DECREF(self->next); self->next = NULL; } /* Silence compiler warning about unused variable len for the case when neither key nor value is an object, i.e. II. */ (void)len; if (self->keys) { #ifdef KEY_TYPE_IS_PYOBJECT int i; for (i = 0; i < len; ++i) DECREF_KEY(self->keys[i]); #endif free(self->keys); self->keys = NULL; } if (self->values) { #ifdef VALUE_TYPE_IS_PYOBJECT int i; for (i = 0; i < len; ++i) DECREF_VALUE(self->values[i]); #endif free(self->values); self->values = NULL; } return 0; } #ifdef PERSISTENT static PyObject * bucket__p_deactivate(Bucket *self, PyObject *args, PyObject *keywords) { int ghostify = 1; PyObject *force = NULL; if (args && PyTuple_GET_SIZE(args) > 0) { PyErr_SetString(PyExc_TypeError, "_p_deactivate takes not positional arguments"); return NULL; } if (keywords) { int size = PyDict_Size(keywords); force = PyDict_GetItemString(keywords, "force"); if (force) size--; if (size) { PyErr_SetString(PyExc_TypeError, "_p_deactivate only accepts keyword arg force"); return NULL; } } if (self->jar && self->oid) { ghostify = self->state == cPersistent_UPTODATE_STATE; if (!ghostify && force) { if (PyObject_IsTrue(force)) ghostify = 1; if (PyErr_Occurred()) return NULL; } if (ghostify) { if (_bucket_clear(self) < 0) return NULL; PER_GHOSTIFY(self); } } Py_INCREF(Py_None); return Py_None; } #endif static PyObject * bucket_clear(Bucket *self, PyObject *args) { PER_USE_OR_RETURN(self, NULL); if (self->len) { if (_bucket_clear(self) < 0) return NULL; if (PER_CHANGED(self) < 0) goto err; } PER_UNUSE(self); Py_INCREF(Py_None); return Py_None; err: PER_UNUSE(self); return NULL; } /* * Return: * * For a set bucket (self->values is NULL), a one-tuple or two-tuple. The * first element is a tuple of keys, of length self->len. The second element * is the next bucket, present if and only if next is non-NULL: * * ( * (keys[0], keys[1], ..., keys[len-1]), * next iff non-NULL> * ) * * For a mapping bucket (self->values is not NULL), a one-tuple or two-tuple. * The first element is a tuple interleaving keys and values, of length * 2 * self->len. 
The second element is the next bucket, present iff next is * non-NULL: * * ( * (keys[0], values[0], keys[1], values[1], ..., * keys[len-1], values[len-1]), * next iff non-NULL> * ) */ static PyObject * bucket_getstate(Bucket *self) { PyObject *o = NULL, *items = NULL, *state; int i, len, l; PER_USE_OR_RETURN(self, NULL); len = self->len; if (self->values) { /* Bucket */ items = PyTuple_New(len * 2); if (items == NULL) goto err; for (i = 0, l = 0; i < len; i++) { COPY_KEY_TO_OBJECT(o, self->keys[i]); if (o == NULL) goto err; PyTuple_SET_ITEM(items, l, o); l++; COPY_VALUE_TO_OBJECT(o, self->values[i]); if (o == NULL) goto err; PyTuple_SET_ITEM(items, l, o); l++; } } else { /* Set */ items = PyTuple_New(len); if (items == NULL) goto err; for (i = 0; i < len; i++) { COPY_KEY_TO_OBJECT(o, self->keys[i]); if (o == NULL) goto err; PyTuple_SET_ITEM(items, i, o); } } if (self->next) state = Py_BuildValue("OO", items, self->next); else state = Py_BuildValue("(O)", items); Py_DECREF(items); PER_UNUSE(self); return state; err: PER_UNUSE(self); Py_XDECREF(items); return NULL; } static int _bucket_setstate(Bucket *self, PyObject *state) { PyObject *k, *v, *items; Bucket *next = NULL; int i, l, len, copied=1; KEY_TYPE *keys; VALUE_TYPE *values; if (!PyArg_ParseTuple(state, "O|O:__setstate__", &items, &next)) return -1; if (!PyTuple_Check(items)) { PyErr_SetString(PyExc_TypeError, "tuple required for first state element"); return -1; } len = PyTuple_Size(items); if (len < 0) return -1; len /= 2; for (i = self->len; --i >= 0; ) { DECREF_KEY(self->keys[i]); DECREF_VALUE(self->values[i]); } self->len = 0; if (self->next) { Py_DECREF(self->next); self->next = NULL; } if (len > self->size) { keys = BTree_Realloc(self->keys, sizeof(KEY_TYPE)*len); if (keys == NULL) return -1; values = BTree_Realloc(self->values, sizeof(VALUE_TYPE)*len); if (values == NULL) return -1; self->keys = keys; self->values = values; self->size = len; } for (i=0, l=0; i < len; i++) { k = PyTuple_GET_ITEM(items, l); l++; v = PyTuple_GET_ITEM(items, l); l++; COPY_KEY_FROM_ARG(self->keys[i], k, copied); if (!copied) return -1; COPY_VALUE_FROM_ARG(self->values[i], v, copied); if (!copied) return -1; INCREF_KEY(self->keys[i]); INCREF_VALUE(self->values[i]); } self->len = len; if (next) { self->next = next; Py_INCREF(next); } return 0; } static PyObject * bucket_setstate(Bucket *self, PyObject *state) { int r; PER_PREVENT_DEACTIVATION(self); r = _bucket_setstate(self, state); PER_UNUSE(self); if (r < 0) return NULL; Py_INCREF(Py_None); return Py_None; } static PyObject * bucket_has_key(Bucket *self, PyObject *key) { return _bucket_get(self, key, 1); } static PyObject * bucket_setdefault(Bucket *self, PyObject *args) { PyObject *key; PyObject *failobj; /* default */ PyObject *value; /* return value */ int dummy_changed; /* in order to call _bucket_set */ if (! PyArg_UnpackTuple(args, "setdefault", 2, 2, &key, &failobj)) return NULL; value = _bucket_get(self, key, 0); if (value != NULL) return value; /* The key isn't in the bucket. If that's not due to a KeyError exception, * pass back the unexpected exception. */ if (! PyErr_ExceptionMatches(PyExc_KeyError)) return NULL; PyErr_Clear(); /* Associate `key` with `failobj` in the bucket, and return `failobj`. 
*/ value = failobj; if (_bucket_set(self, key, failobj, 0, 0, &dummy_changed) < 0) value = NULL; Py_XINCREF(value); return value; } /* forward declaration */ static int Bucket_length(Bucket *self); static PyObject * bucket_pop(Bucket *self, PyObject *args) { PyObject *key; PyObject *failobj = NULL; /* default */ PyObject *value; /* return value */ int dummy_changed; /* in order to call _bucket_set */ if (! PyArg_UnpackTuple(args, "pop", 1, 2, &key, &failobj)) return NULL; value = _bucket_get(self, key, 0); if (value != NULL) { /* Delete key and associated value. */ if (_bucket_set(self, key, NULL, 0, 0, &dummy_changed) < 0) { Py_DECREF(value); return NULL; } return value; } /* The key isn't in the bucket. If that's not due to a KeyError exception, * pass back the unexpected exception. */ if (! PyErr_ExceptionMatches(PyExc_KeyError)) return NULL; if (failobj != NULL) { /* Clear the KeyError and return the explicit default. */ PyErr_Clear(); Py_INCREF(failobj); return failobj; } /* No default given. The only difference in this case is the error * message, which depends on whether the bucket is empty. */ if (Bucket_length(self) == 0) PyErr_SetString(PyExc_KeyError, "pop(): Bucket is empty"); return NULL; } /* Search bucket self for key. This is the sq_contains slot of the * PySequenceMethods. * * Return: * -1 error * 0 not found * 1 found */ static int bucket_contains(Bucket *self, PyObject *key) { PyObject *asobj = _bucket_get(self, key, 1); int result = -1; if (asobj != NULL) { result = PyInt_AsLong(asobj) ? 1 : 0; Py_DECREF(asobj); } return result; } /* ** bucket_getm ** */ static PyObject * bucket_getm(Bucket *self, PyObject *args) { PyObject *key, *d=Py_None, *r; if (!PyArg_ParseTuple(args, "O|O:get", &key, &d)) return NULL; r = _bucket_get(self, key, 0); if (r) return r; if (!PyErr_ExceptionMatches(PyExc_KeyError)) return NULL; PyErr_Clear(); Py_INCREF(d); return d; } /**************************************************************************/ /* Iterator support. */ /* A helper to build all the iterators for Buckets and Sets. * If args is NULL, the iterator spans the entire structure. Else it's an * argument tuple, with optional low and high arguments. * kind is 'k', 'v' or 'i'. * Returns a BTreeIter object, or NULL if error. */ static PyObject * buildBucketIter(Bucket *self, PyObject *args, PyObject *kw, char kind) { BTreeItems *items; int lowoffset, highoffset; BTreeIter *result = NULL; PER_USE_OR_RETURN(self, NULL); if (Bucket_rangeSearch(self, args, kw, &lowoffset, &highoffset) < 0) goto Done; items = (BTreeItems *)newBTreeItems(kind, self, lowoffset, self, highoffset); if (items == NULL) goto Done; result = BTreeIter_new(items); /* win or lose, we're done */ Py_DECREF(items); Done: PER_UNUSE(self); return (PyObject *)result; } /* The implementation of iter(Bucket_or_Set); the Bucket tp_iter slot. */ static PyObject * Bucket_getiter(Bucket *self) { return buildBucketIter(self, NULL, NULL, 'k'); } /* The implementation of Bucket.iterkeys(). */ static PyObject * Bucket_iterkeys(Bucket *self, PyObject *args, PyObject *kw) { return buildBucketIter(self, args, kw, 'k'); } /* The implementation of Bucket.itervalues(). */ static PyObject * Bucket_itervalues(Bucket *self, PyObject *args, PyObject *kw) { return buildBucketIter(self, args, kw, 'v'); } /* The implementation of Bucket.iteritems(). */ static PyObject * Bucket_iteritems(Bucket *self, PyObject *args, PyObject *kw) { return buildBucketIter(self, args, kw, 'i'); } /* End of iterator support. 
*/ #ifdef PERSISTENT static PyObject *merge_error(int p1, int p2, int p3, int reason); static PyObject *bucket_merge(Bucket *s1, Bucket *s2, Bucket *s3); static PyObject * _bucket__p_resolveConflict(PyObject *ob_type, PyObject *s[3]) { PyObject *result = NULL; /* guilty until proved innocent */ Bucket *b[3] = {NULL, NULL, NULL}; PyObject *meth = NULL; PyObject *a = NULL; int i; for (i = 0; i < 3; i++) { PyObject *r; b[i] = (Bucket*)PyObject_CallObject((PyObject *)ob_type, NULL); if (b[i] == NULL) goto Done; if (s[i] == Py_None) /* None is equivalent to empty, for BTrees */ continue; meth = PyObject_GetAttr((PyObject *)b[i], __setstate___str); if (meth == NULL) goto Done; a = PyTuple_New(1); if (a == NULL) goto Done; PyTuple_SET_ITEM(a, 0, s[i]); Py_INCREF(s[i]); r = PyObject_CallObject(meth, a); /* b[i].__setstate__(s[i]) */ if (r == NULL) goto Done; Py_DECREF(r); Py_DECREF(a); Py_DECREF(meth); a = meth = NULL; } if (b[0]->next != b[1]->next || b[0]->next != b[2]->next) merge_error(-1, -1, -1, 0); else result = bucket_merge(b[0], b[1], b[2]); Done: Py_XDECREF(meth); Py_XDECREF(a); Py_XDECREF(b[0]); Py_XDECREF(b[1]); Py_XDECREF(b[2]); return result; } static PyObject * bucket__p_resolveConflict(Bucket *self, PyObject *args) { PyObject *s[3]; if (!PyArg_ParseTuple(args, "OOO", &s[0], &s[1], &s[2])) return NULL; return _bucket__p_resolveConflict((PyObject *)self->ob_type, s); } #endif /* Caution: Even though the _next attribute is read-only, a program could do arbitrary damage to the btree internals. For example, it could call clear() on a bucket inside a BTree. We need to decide if the convenience for inspecting BTrees is worth the risk. */ static struct PyMemberDef Bucket_members[] = { {"_next", T_OBJECT, offsetof(Bucket, next)}, {NULL} }; static struct PyMethodDef Bucket_methods[] = { {"__getstate__", (PyCFunction) bucket_getstate, METH_NOARGS, "__getstate__() -- Return the picklable state of the object"}, {"__setstate__", (PyCFunction) bucket_setstate, METH_O, "__setstate__() -- Set the state of the object"}, {"keys", (PyCFunction) bucket_keys, METH_KEYWORDS, "keys([min, max]) -- Return the keys"}, {"has_key", (PyCFunction) bucket_has_key, METH_O, "has_key(key) -- Test whether the bucket contains the given key"}, {"clear", (PyCFunction) bucket_clear, METH_VARARGS, "clear() -- Remove all of the items from the bucket"}, {"update", (PyCFunction) Mapping_update, METH_O, "update(collection) -- Add the items from the given collection"}, {"maxKey", (PyCFunction) Bucket_maxKey, METH_VARARGS, "maxKey([key]) -- Find the maximum key\n\n" "If an argument is given, find the maximum <= the argument"}, {"minKey", (PyCFunction) Bucket_minKey, METH_VARARGS, "minKey([key]) -- Find the minimum key\n\n" "If an argument is given, find the minimum >= the argument"}, {"values", (PyCFunction) bucket_values, METH_KEYWORDS, "values([min, max]) -- Return the values"}, {"items", (PyCFunction) bucket_items, METH_KEYWORDS, "items([min, max])) -- Return the items"}, {"byValue", (PyCFunction) bucket_byValue, METH_O, "byValue(min) -- " "Return value-keys with values >= min and reverse sorted by values"}, {"get", (PyCFunction) bucket_getm, METH_VARARGS, "get(key[,default]) -- Look up a value\n\n" "Return the default (or None) if the key is not found."}, {"setdefault", (PyCFunction) bucket_setdefault, METH_VARARGS, "D.setdefault(k, d) -> D.get(k, d), also set D[k]=d if k not in D.\n\n" "Return the value like get() except that if key is missing, d is both\n" "returned and inserted into the bucket as the value of k."}, 
{"pop", (PyCFunction) bucket_pop, METH_VARARGS, "D.pop(k[, d]) -> v, remove key and return the corresponding value.\n\n" "If key is not found, d is returned if given, otherwise KeyError\n" "is raised."}, {"iterkeys", (PyCFunction) Bucket_iterkeys, METH_KEYWORDS, "B.iterkeys([min[,max]]) -> an iterator over the keys of B"}, {"itervalues", (PyCFunction) Bucket_itervalues, METH_KEYWORDS, "B.itervalues([min[,max]]) -> an iterator over the values of B"}, {"iteritems", (PyCFunction) Bucket_iteritems, METH_KEYWORDS, "B.iteritems([min[,max]]) -> an iterator over the (key, value) items of B"}, #ifdef EXTRA_BUCKET_METHODS EXTRA_BUCKET_METHODS #endif #ifdef PERSISTENT {"_p_resolveConflict", (PyCFunction) bucket__p_resolveConflict, METH_VARARGS, "_p_resolveConflict() -- Reinitialize from a newly created copy"}, {"_p_deactivate", (PyCFunction) bucket__p_deactivate, METH_KEYWORDS, "_p_deactivate() -- Reinitialize from a newly created copy"}, #endif {NULL, NULL} }; static int Bucket_init(PyObject *self, PyObject *args, PyObject *kwds) { PyObject *v = NULL; if (!PyArg_ParseTuple(args, "|O:" MOD_NAME_PREFIX "Bucket", &v)) return -1; if (v) return update_from_seq(self, v); else return 0; } static void bucket_dealloc(Bucket *self) { if (self->state != cPersistent_GHOST_STATE) _bucket_clear(self); cPersistenceCAPI->pertype->tp_dealloc((PyObject *)self); } static int bucket_traverse(Bucket *self, visitproc visit, void *arg) { int err = 0; int i, len; #define VISIT(SLOT) \ if (SLOT) { \ err = visit((PyObject *)(SLOT), arg); \ if (err) \ goto Done; \ } /* Call our base type's traverse function. Because buckets are * subclasses of Peristent, there must be one. */ err = cPersistenceCAPI->pertype->tp_traverse((PyObject *)self, visit, arg); if (err) goto Done; /* If this is registered with the persistence system, cleaning up cycles * is the database's problem. It would be horrid to unghostify buckets * here just to chase pointers every time gc runs. */ if (self->state == cPersistent_GHOST_STATE) goto Done; len = self->len; (void)i; /* if neither keys nor values are PyObject*, "i" is otherwise unreferenced and we get a nuisance compiler wng */ #ifdef KEY_TYPE_IS_PYOBJECT /* Keys are Python objects so need to be traversed. */ for (i = 0; i < len; i++) VISIT(self->keys[i]); #endif #ifdef VALUE_TYPE_IS_PYOBJECT if (self->values != NULL) { /* self->values exists (this is a mapping bucket, not a set bucket), * and are Python objects, so need to be traversed. 
*/ for (i = 0; i < len; i++) VISIT(self->values[i]); } #endif VISIT(self->next); Done: return err; #undef VISIT } static int bucket_tp_clear(Bucket *self) { if (self->state != cPersistent_GHOST_STATE) _bucket_clear(self); return 0; } /* Code to access Bucket objects as mappings */ static int Bucket_length( Bucket *self) { int r; UNLESS (PER_USE(self)) return -1; r = self->len; PER_UNUSE(self); return r; } static PyMappingMethods Bucket_as_mapping = { (lenfunc)Bucket_length, /*mp_length*/ (binaryfunc)bucket_getitem, /*mp_subscript*/ (objobjargproc)bucket_setitem, /*mp_ass_subscript*/ }; static PySequenceMethods Bucket_as_sequence = { (lenfunc)0, /* sq_length */ (binaryfunc)0, /* sq_concat */ (ssizeargfunc)0, /* sq_repeat */ (ssizeargfunc)0, /* sq_item */ (ssizessizeargfunc)0, /* sq_slice */ (ssizeobjargproc)0, /* sq_ass_item */ (ssizessizeobjargproc)0, /* sq_ass_slice */ (objobjproc)bucket_contains, /* sq_contains */ 0, /* sq_inplace_concat */ 0, /* sq_inplace_repeat */ }; static PyObject * bucket_repr(Bucket *self) { PyObject *i, *r; char repr[10000]; int rv; i = bucket_items(self, NULL, NULL); if (!i) return NULL; r = PyObject_Repr(i); Py_DECREF(i); if (!r) { return NULL; } rv = PyOS_snprintf(repr, sizeof(repr), "%s(%s)", self->ob_type->tp_name, PyString_AS_STRING(r)); if (rv > 0 && rv < sizeof(repr)) { Py_DECREF(r); return PyString_FromStringAndSize(repr, strlen(repr)); } else { /* The static buffer wasn't big enough */ int size; PyObject *s; /* 3 for the parens and the null byte */ size = strlen(self->ob_type->tp_name) + PyString_GET_SIZE(r) + 3; s = PyString_FromStringAndSize(NULL, size); if (!s) { Py_DECREF(r); return r; } PyOS_snprintf(PyString_AS_STRING(s), size, "%s(%s)", self->ob_type->tp_name, PyString_AS_STRING(r)); Py_DECREF(r); return s; } } static PyTypeObject BucketType = { PyObject_HEAD_INIT(NULL) /* PyPersist_Type */ 0, /* ob_size */ MODULE_NAME MOD_NAME_PREFIX "Bucket",/* tp_name */ sizeof(Bucket), /* tp_basicsize */ 0, /* tp_itemsize */ (destructor)bucket_dealloc, /* tp_dealloc */ 0, /* tp_print */ 0, /* tp_getattr */ 0, /* tp_setattr */ 0, /* tp_compare */ (reprfunc)bucket_repr, /* tp_repr */ 0, /* tp_as_number */ &Bucket_as_sequence, /* tp_as_sequence */ &Bucket_as_mapping, /* tp_as_mapping */ 0, /* tp_hash */ 0, /* tp_call */ 0, /* tp_str */ 0, /* tp_getattro */ 0, /* tp_setattro */ 0, /* tp_as_buffer */ Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC | Py_TPFLAGS_BASETYPE, /* tp_flags */ 0, /* tp_doc */ (traverseproc)bucket_traverse, /* tp_traverse */ (inquiry)bucket_tp_clear, /* tp_clear */ 0, /* tp_richcompare */ 0, /* tp_weaklistoffset */ (getiterfunc)Bucket_getiter, /* tp_iter */ 0, /* tp_iternext */ Bucket_methods, /* tp_methods */ Bucket_members, /* tp_members */ 0, /* tp_getset */ 0, /* tp_base */ 0, /* tp_dict */ 0, /* tp_descr_get */ 0, /* tp_descr_set */ 0, /* tp_dictoffset */ Bucket_init, /* tp_init */ 0, /* tp_alloc */ 0, /*PyType_GenericNew,*/ /* tp_new */ }; static int nextBucket(SetIteration *i) { if (i->position >= 0) { UNLESS(PER_USE(BUCKET(i->set))) return -1; if (i->position) { DECREF_KEY(i->key); DECREF_VALUE(i->value); } if (i->position < BUCKET(i->set)->len) { COPY_KEY(i->key, BUCKET(i->set)->keys[i->position]); INCREF_KEY(i->key); COPY_VALUE(i->value, BUCKET(i->set)->values[i->position]); INCREF_VALUE(i->value); i->position ++; } else { i->position = -1; PER_ACCESSED(BUCKET(i->set)); } PER_ALLOW_DEACTIVATION(BUCKET(i->set)); } return 0; } 
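The bucket routines above are what give BTrees buckets their dict-like Python behaviour. As a point of reference, here is a minimal illustrative sketch (not part of the source tree) of how those slots and methods surface at the Python level; the OO flavor and the sample keys are arbitrary choices::

    from BTrees.OOBTree import OOBucket

    b = OOBucket()
    b.update({'apple': 1, 'pear': 2, 'plum': 3})   # Mapping_update

    len(b)                      # Bucket_length via mp_length
    'pear' in b                 # bucket_contains via sq_contains
    b.get('quince', 0)          # bucket_getm: key missing, returns the default 0
    b.setdefault('fig', 9)      # inserts 9 and returns it (d is required)
    b.pop('plum')               # removes the key and returns 3
    list(b.iterkeys('b', 'p'))  # buildBucketIter with a min/max range
    print(b)                    # bucket_repr, e.g. OOBucket([...])
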
ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/Development.txt000066400000000000000000000407431230730566700241160ustar00rootroot00000000000000===================== Developer Information ===================== This document provides information for developers who maintain or extend `BTrees`. Macros ====== `BTrees` are defined using a "template", roughly akin to a C++ template. To create a new family of `BTrees`, create a source file that defines macros used to handle differences in key and value types: Configuration Macros -------------------- ``MASTER_ID`` A string to hold an RCS/CVS Id key to be included in compiled binaries. ``MOD_NAME_PREFIX`` A string (like "IO" or "OO") that provides the prefix used for the module. This gets used to generate type names and the internal module name string. ``DEFAULT_MAX_BUCKET_SIZE`` An int giving the maximum bucket size (number of key/value pairs). When a bucket gets larger than this due to an insertion *into a BTREE*, it splits. Inserting into a bucket directly doesn't split, and functions that produce a bucket output (e.g., ``union()``) also have no bound on how large a bucket may get. Someday this will be tunable on `BTree`. instances. ``DEFAULT_MAX_BTREE_SIZE`` An ``int`` giving the maximum size (number of children) of an internal btree node. Someday this will be tunable on ``BTree`` instances. Macros for Keys --------------- ``KEY_TYPE`` The C type declaration for keys (e.g., ``int`` or ``PyObject*``). ``KEY_TYPE_IS_PYOBJECT`` Define if ``KEY_TYPE`` is a ``PyObject*`, else ``undef``. ``KEY_CHECK(K)`` Tests whether the ``PyObject* K`` can be converted to the (``C``) key type (``KEY_TYPE``). The macro should return a boolean (zero for false, non-zero for true). When it returns false, its caller should probably set a ``TypeError`` exception. ``TEST_KEY_SET_OR(V, K, T)`` Like Python's ``cmp()``. Compares K(ey) to T(arget), where ``K`` and ``T`` are ``C`` values of type `KEY_TYPE`. ``V`` is assigned an `int` value depending on the outcome:: < 0 if K < T == 0 if K == T > 0 if K > T This macro acts like an ``if``, where the following statement is executed only if a Python exception has been raised because the values could not be compared. ``DECREF_KEY(K)`` ``K`` is a value of ``KEY_TYPE``. If ``KEY_TYPE`` is a flavor of ``PyObject*``, write this to do ``Py_DECREF(K)``. Else (e.g., ``KEY_TYPE`` is ``int``) make it a nop. ``INCREF_KEY(K)`` ``K`` is a value of `KEY_TYPE`. If `KEY_TYPE` is a flavor of ``PyObject*``, write this to do ``Py_INCREF(K)``. Else (e.g., `KEY_TYPE` is ``int``) make it a nop. ``COPY_KEY(K, E)`` Like ``K=E``. Copy a key from ``E`` to ``K``, both of ``KEY_TYPE``. Note that this doesn't ``decref K`` or ``incref E`` when ``KEY_TYPE`` is a ``PyObject*``; the caller is responsible for keeping refcounts straight. ``COPY_KEY_TO_OBJECT(O, K)`` Roughly like ``O=K``. ``O`` is a ``PyObject*``, and the macro must build a Python object form of ``K``, assign it to ``O``, and ensure that ``O`` owns the reference to its new value. It may do this by creating a new Python object based on ``K`` (e.g., ``PyInt_FromLong(K)`` when ``KEY_TYPE`` is ``int``), or simply by doing ``Py_INCREF(K)`` if ``KEY_TYPE`` is a ``PyObject*``. ``COPY_KEY_FROM_ARG(TARGET, ARG, STATUS)`` Copy an argument to the target without creating a new reference to ``ARG``. ``ARG`` is a ``PyObject*``, and ``TARGET`` is of type ``KEY_TYPE``. If this can't be done (for example, ``KEY_CHECK(ARG)`` returns false), set a Python error and set status to ``0``. 
If there is no error, leave status alone. Macros for Values ----------------- ``VALUE_TYPE`` The C type declaration for values (e.g., ``int`` or ``PyObject*``). ``VALUE_TYPE_IS_PYOBJECT`` Define if ``VALUE_TYPE`` is a ``PyObject*``, else ``undef``. ``TEST_VALUE(X, Y)`` Like Python's ``cmp()``. Compares ``X`` to ``Y``, where ``X`` & ``Y`` are ``C`` values of type ``VALUE_TYPE``. The macro returns an ``int``, with value:: < 0 if X < Y == 0 if X == Y > 0 if X > Y Bug: There is no provision for determining whether the comparison attempt failed (set a Python exception). ``DECREF_VALUE(K)`` Like ``DECREF_KEY``, except applied to values of ``VALUE_TYPE``. ``INCREF_VALUE(K)`` Like ``INCREF_KEY``, except applied to values of ``VALUE_TYPE``. ``COPY_VALUE(K, E)`` Like ``COPY_KEY``, except applied to values of ``VALUE_TYPE``. ``COPY_VALUE_TO_OBJECT(O, K)`` Like ``COPY_KEY_TO_OBJECT``, except applied to values of ``VALUE_TYPE``. ``COPY_VALUE_FROM_ARG(TARGET, ARG, STATUS)`` Like ``COPY_KEY_FROM_ARG``, except applied to values of ``VALUE_TYPE``. ``NORMALIZE_VALUE(V, MIN)`` Normalize the value, ``V``, using the parameter ``MIN``. This is almost certainly a YAGNI. It is a no-op for most types. For integers, ``V`` is replaced by ``V/MIN`` only if ``MIN > 0``. Macros for Set Operations ------------------------- ``MERGE_DEFAULT`` A value of ``VALUE_TYPE`` specifying the value to associate with set elements when sets are merged with mappings via weighed union or weighted intersection. ``MERGE(O1, w1, O2, w2)`` Performs a weighted merge of two values, ``O1`` and ``O2``, using weights ``w1`` and ``w2``. The result must be of ``VALUE_TYPE``. Note that weighted unions and weighted intersections are not enabled if this macro is left undefined. ``MERGE_WEIGHT(O, w)`` Computes a weighted value for ``O``. The result must be of ``VALUE_TYPE``. This is used for "filling out" weighted unions, i.e. to compute a weighted value for keys that appear in only one of the input mappings. If left undefined, ``MERGE_WEIGHT`` defaults to:: #define MERGE_WEIGHT(O, w) (O) ``MULTI_INT_UNION`` The value doesn't matter. If defined, `SetOpTemplate.c` compiles code for a ``multiunion()`` function (compute a union of many input sets at high speed). This currently makes sense only for structures with integer keys. BTree Clues =========== More or less random bits of helpful info. + In papers and textbooks, this flavor of BTree is usually called a B+-Tree, where "+" is a superscript. + All keys and all values live in the bucket leaf nodes. Keys in interior (BTree) nodes merely serve to guide a search efficiently toward the correct leaf. + When a key is deleted, it's physically removed from the bucket it's in, but this doesn't propagate back up the tree: since keys in interior nodes only serve to guide searches, it's OK-- and saves time --to leave "stale" keys in interior nodes. + No attempt is made to rebalance the tree after a deletion, unless a bucket thereby becomes entirely empty. "Classic BTrees" do rebalance, keeping all buckets at least half full (provided there are enough keys in the entire tree to fill half a bucket). The tradeoffs are murky. Pathological cases in the presence of deletion do exist. Pathologies include trees tending toward only one key per bucket, and buckets at differing depths (all buckets are at the same depth in a classic BTree). + ``DEFAULT_MAX_BUCKET_SIZE`` and ``DEFAULT_MAX_BTREE_SIZE`` are chosen mostly to "even out" pickle sizes in storage. 
That's why, e.g., an `IIBTree` has larger values than an `OOBTree`: pickles store ints more efficiently than they can store arbitrary Python objects. + In a non-empty BTree, every bucket node contains at least one key, and every BTree node contains at least one child and a non-NULL firstbucket pointer. However, a BTree node may not contain any keys. + An empty BTree consists solely of a BTree node with ``len==0`` and ``firstbucket==NULL``. + Although a BTree can become unbalanced under a mix of inserts and deletes (meaning both that there's nothing stronger that can be said about buckets than that they're not empty, and that buckets can appear at different depths), a BTree node always has children of the same kind: they're all buckets, or they're all BTree nodes. The ``BTREE_SEARCH`` Macro ========================== For notational ease, consider a fixed BTree node ``x``, and let :: K(i) mean x->data.key[i] C(i) mean all the keys reachable from x->data.child[i] For each ``i`` in ``0`` to ``x->len-1`` inclusive, :: K(i) <= C(i) < K(i+1) is a BTree node invariant, where we pretend that ``K(0)`` holds a key smaller than any possible key, and ``K(x->len)`` holds a key larger than any possible key. (Note that ``K(x->len)`` doesn't actually exist, and ``K(0)`` is never used although space for it exists in non-empty BTree nodes.) When searching for a key ``k``, then, the child pointer we want to follow is the one at index ``i`` such that ``K(i) <= k < K(i+1)``. There can be at most one such ``i``, since the ``K(i)`` are strictly increasing. And there is at least one such ``i`` provided the tree isn't empty (so that ``0 < len``). For the moment, assume the tree isn't empty (we'll get back to that later). The macro's chief loop invariant is :: K(lo) < k < K(hi) This holds trivially at the start, since ``lo`` is set to ``0``, and ``hi`` to ``x->len``, and we pretend ``K(0)`` is minus infinity and ``K(len)`` is plus infinity. Inside the loop, if ``K(i) < k`` we set ``lo`` to ``i``, and if ``K(i) > k`` we set ``hi`` to ``i``. These obviously preserve the invariant. If ``K(i) == k``, the loop breaks and sets the result to ``i``, and since ``K(i) == k`` in that case ``i`` is obviously the correct result. Other cases depend on how ``i = floor((lo + hi)/2)`` works, exactly. Suppose ``lo + d = hi`` for some ``d >= 0``. Then ``i = floor((lo + lo + d)/2) = floor(lo + d/2) = lo + floor(d/2)``. So: a. ``[d == 0] (lo == i == hi)`` if and only if ``(lo == hi)``. b. ``[d == 1] (lo == i < hi)`` if and only if ``(lo+1 == hi)``. c. ``[d > 1] (lo < i < hi)`` if and only if ``(lo+1 < hi)``. If the node is empty ``(x->len == 0)``, then ``lo==i==hi==0`` at the start, and the loop exits immediately (the first ``i > lo`` test fails), without entering the body. Else ``lo < hi`` at the start, and the invariant ``K(lo) < k < K(hi)`` holds. If ``lo+1 < hi``, we're in case (c): ``i`` is strictly between ``lo`` and ``hi``, so the loop body is entered, and regardless of whether the body sets the new ``lo`` or the new ``hi`` to ``i``, the new ``lo`` is strictly less than the new ``hi``, and the difference between the new ``lo`` and new ``hi`` is strictly less than the difference between the old ``lo`` and old ``hi``. So long as the new ``lo + 1`` remains < the new ``hi``, we stay in this case. We can't stay in this case forever, though: because ``hi-lo`` decreases on each trip but remains > ``0``, ``lo+1 == hi`` must eventually become true. 
(In fact, it becomes true quickly, in about ``log2(x->len)`` trips; the point is more that ``lo`` doesn't equal ``hi`` when the loop ends, it has to end with ``lo+1==hi`` and ``i==lo``). Then we're in case (b): ``i==lo==hi-1`` then, and the loop exits. The invariant still holds, with ``lo==i`` and ``hi==lo+1==i+1``:: K(i) < k < K(i+1) so ``i`` is again the correct answer. Optimization points: -------------------- + Division by 2 is done via shift rather via "/2". These are signed ints, and almost all C compilers treat signed int division as truncating, and shifting is not the same as truncation for signed int division. The compiler has no way to know these values aren't negative, so has to generate longer-winded code for "/2". But we know these values aren't negative, and exploit it. + The order of _cmp comparisons matters. We're in an interior BTree node, and are looking at only a tiny fraction of all the keys that exist. So finding the key exactly in this node is unlikely, and checking ``_cmp == 0`` is a waste of time to the same extent. It doesn't matter whether we check for ``_cmp < 0`` or ``_cmp > 0`` first, so long as we do both before worrying about equality. + At the start of a routine, it's better to run this macro even if ``x->len`` is ``0`` (check for that afterwards). We just called a function and so probably drained the pipeline. If the first thing we do then is read up ``self->len`` and check it against ``0``, we just sit there waiting for the data to get read up, and then another immediate test-and-branch, and for a very unlikely case (BTree nodes are rarely empty). It's better to get into the loop right away so the normal case makes progress ASAP. The ``BUCKET_SEARCH`` Macro =========================== This has a different job than ``BTREE_SEARCH``: the key ``0`` slot is legitimate in a bucket, and we want to find the index at which the key belongs. If the key is larger than the bucket's largest key, a new slot at index len is where it belongs, else it belongs at the smallest ``i`` with ``keys[i]`` >= the key we're looking for. We also need to know whether or not the key is present (``BTREE_SEARCH`` didn't care; it only wanted to find the next node to search). The mechanics of the search are quite similar, though. The primary loop invariant changes to (say we're searching for key ``k``):: K(lo-1) < k < K(hi) where ``K(i)`` means ``keys[i]``, and we pretend ``K(-1)`` is minus infinity and ``K(len)`` is plus infinity. If the bucket is empty, ``lo=hi=i=0`` at the start, the loop body is never entered, and the macro sets ``INDEX`` to 0 and ``ABSENT`` to true. That's why ``_cmp`` is initialized to 1 (``_cmp`` becomes ``ABSENT``). Else the bucket is not empty, lok``, ``hi`` is set to ``i``, preserving that ``K[hi] = K[i] > k``. If the loop exits after either of those, ``_cmp != 0``, so ``ABSENT`` becomes true. If ``K[i]=k``, the loop breaks, so that ``INDEX`` becomes ``i``, and ``ABSENT`` becomes false (``_cmp=0`` in this case). The same case analysis for ``BTREE_SEARCH`` on ``lo`` and ``hi`` holds here: a. ``(lo == i == hi)`` if and only if ``(lo == hi)``. b. ``(lo == i < hi)`` if and only if ``(lo+1 == hi)``. c. ``(lo < i < hi)`` if and only if ``(lo+1 < hi)``. So long as ``lo+1 < hi``, we're in case (c), and either break with equality (in which case the right results are obviously computed) or narrow the range. If equality doesn't obtain, the range eventually narrows to cases (a) or (b). 
To go from (c) to (a), we must have ``lo+2==hi`` at the start, and ``K[i]=K[lo+1] key``), because when it pays it narrows the range more (we get a little boost from setting ``lo=i+1`` in this case; the other case sets ``hi=i``, which isn't as much of a narrowing). ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/IFBTree.py000066400000000000000000000015011230730566700226520ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import zope.interface import BTrees.Interfaces # hack to overcome dynamic-linking headache. from _IFBTree import * zope.interface.moduleProvides(BTrees.Interfaces.IIntegerFloatBTreeModule) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/IIBTree.py000066400000000000000000000015031230730566700226570ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import zope.interface import BTrees.Interfaces # hack to overcome dynamic-linking headache. from _IIBTree import * zope.interface.moduleProvides(BTrees.Interfaces.IIntegerIntegerBTreeModule) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/IOBTree.py000066400000000000000000000015021230730566700226640ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import zope.interface import BTrees.Interfaces # hack to overcome dynamic-linking headache. from _IOBTree import * zope.interface.moduleProvides(BTrees.Interfaces.IIntegerObjectBTreeModule) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/Interfaces.py000066400000000000000000000423371230730566700235310ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. 
# # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## from zope.interface import Interface, Attribute class ICollection(Interface): def clear(): """Remove all of the items from the collection.""" def __nonzero__(): """Check if the collection is non-empty. Return a true value if the collection is non-empty and a false value otherwise. """ class IReadSequence(Interface): def __getitem__(index): """Return the value at the given index. An IndexError is raised if the index cannot be found. """ def __getslice__(index1, index2): """Return a subsequence from the original sequence. The subsequence includes the items from index1 up to, but not including, index2. """ class IKeyed(ICollection): def has_key(key): """Check whether the object has an item with the given key. Return a true value if the key is present, else a false value. """ def keys(min=None, max=None, excludemin=False, excludemax=False): """Return an IReadSequence containing the keys in the collection. The type of the IReadSequence is not specified. It could be a list or a tuple or some other type. All arguments are optional, and may be specified as keyword arguments, or by position. If a min is specified, then output is constrained to keys greater than or equal to the given min, and, if excludemin is specified and true, is further constrained to keys strictly greater than min. A min value of None is ignored. If min is None or not specified, and excludemin is true, the smallest key is excluded. If a max is specified, then output is constrained to keys less than or equal to the given max, and, if excludemax is specified and true, is further constrained to keys strictly less than max. A max value of None is ignored. If max is None or not specified, and excludemax is true, the largest key is excluded. """ def maxKey(key=None): """Return the maximum key. If a key argument if provided and not None, return the largest key that is less than or equal to the argument. Raise an exception if no such key exists. """ def minKey(key=None): """Return the minimum key. If a key argument if provided and not None, return the smallest key that is greater than or equal to the argument. Raise an exception if no such key exists. """ class ISetMutable(IKeyed): def insert(key): """Add the key (value) to the set. If the key was already in the set, return 0, otherwise return 1. """ def remove(key): """Remove the key from the set. Raises KeyError if key is not in the set. """ def update(seq): """Add the items from the given sequence to the set.""" class ISized(Interface): """An object that supports __len__.""" def __len__(): """Return the number of items in the container.""" class IKeySequence(IKeyed, ISized): def __getitem__(index): """Return the key in the given index position. This allows iteration with for loops and use in functions, like map and list, that read sequences. """ class ISet(IKeySequence, ISetMutable): pass class ITreeSet(IKeyed, ISetMutable): pass class IMinimalDictionary(ISized, IKeyed): def get(key, default): """Get the value associated with the given key. Return the default if has_key(key) is false. 
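        An illustrative sketch of the intended behaviour (the integer-keyed
        flavor is used only for concreteness)::

            from BTrees.IIBTree import IIBTree
            t = IIBTree()
            t.update({1: 10, 2: 20})
            t.get(2, -1)    # -> 20, the key is present
            t.get(5, -1)    # -> -1, the default; no KeyError is raised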
""" def __getitem__(key): """Get the value associated with the given key. Raise KeyError if has_key(key) is false. """ def __setitem__(key, value): """Set the value associated with the given key.""" def __delitem__(key): """Delete the value associated with the given key. Raise KeyError if has_key(key) is false. """ def values(min=None, max=None, excludemin=False, excludemax=False): """Return an IReadSequence containing the values in the collection. The type of the IReadSequence is not specified. It could be a list or a tuple or some other type. All arguments are optional, and may be specified as keyword arguments, or by position. If a min is specified, then output is constrained to values whose keys are greater than or equal to the given min, and, if excludemin is specified and true, is further constrained to values whose keys are strictly greater than min. A min value of None is ignored. If min is None or not specified, and excludemin is true, the value corresponding to the smallest key is excluded. If a max is specified, then output is constrained to values whose keys are less than or equal to the given max, and, if excludemax is specified and true, is further constrained to values whose keys are strictly less than max. A max value of None is ignored. If max is None or not specified, and excludemax is true, the value corresponding to the largest key is excluded. """ def items(min=None, max=None, excludemin=False, excludemax=False): """Return an IReadSequence containing the items in the collection. An item is a 2-tuple, a (key, value) pair. The type of the IReadSequence is not specified. It could be a list or a tuple or some other type. All arguments are optional, and may be specified as keyword arguments, or by position. If a min is specified, then output is constrained to items whose keys are greater than or equal to the given min, and, if excludemin is specified and true, is further constrained to items whose keys are strictly greater than min. A min value of None is ignored. If min is None or not specified, and excludemin is true, the item with the smallest key is excluded. If a max is specified, then output is constrained to items whose keys are less than or equal to the given max, and, if excludemax is specified and true, is further constrained to items whose keys are strictly less than max. A max value of None is ignored. If max is None or not specified, and excludemax is true, the item with the largest key is excluded. """ class IDictionaryIsh(IMinimalDictionary): def update(collection): """Add the items from the given collection object to the collection. The input collection must be a sequence of (key, value) 2-tuples, or an object with an 'items' method that returns a sequence of (key, value) pairs. """ def byValue(minValue): """Return a sequence of (value, key) pairs, sorted by value. Values < minValue are omitted and other values are "normalized" by the minimum value. This normalization may be a noop, but, for integer values, the normalization is division. """ def setdefault(key, d): """D.setdefault(k, d) -> D.get(k, d), also set D[k]=d if k not in D. Return the value like get() except that if key is missing, d is both returned and inserted into the dictionary as the value of k. Note that, unlike as for Python's dict.setdefault(), d is not optional. Python defaults d to None, but that doesn't make sense for mappings that can't have None as a value (for example, an IIBTree can have only integers as values). 
""" def pop(key, d): """D.pop(k[, d]) -> v, remove key and return the corresponding value. If key is not found, d is returned if given, otherwise KeyError is raised. """ class IBTree(IDictionaryIsh): def insert(key, value): """Insert a key and value into the collection. If the key was already in the collection, then there is no change and 0 is returned. If the key was not already in the collection, then the item is added and 1 is returned. This method is here to allow one to generate random keys and to insert and test whether the key was there in one operation. A standard idiom for generating new keys will be:: key = generate_key() while not t.insert(key, value): key=generate_key() """ class IMerge(Interface): """Object with methods for merging sets, buckets, and trees. These methods are supplied in modules that define collection classes with particular key and value types. The operations apply only to collections from the same module. For example, the IIBTree.union can only be used with IIBTree.IIBTree, IIBTree.IIBucket, IIBTree.IISet, and IIBTree.IITreeSet. The implementing module has a value type. The IOBTree and OOBTree modules have object value type. The IIBTree and OIBTree modules have integer value types. Other modules may be defined in the future that have other value types. The individual types are classified into set (Set and TreeSet) and mapping (Bucket and BTree) types. """ def difference(c1, c2): """Return the keys or items in c1 for which there is no key in c2. If c1 is None, then None is returned. If c2 is None, then c1 is returned. If neither c1 nor c2 is None, the output is a Set if c1 is a Set or TreeSet, and is a Bucket if c1 is a Bucket or BTree. """ def union(c1, c2): """Compute the Union of c1 and c2. If c1 is None, then c2 is returned, otherwise, if c2 is None, then c1 is returned. The output is a Set containing keys from the input collections. """ def intersection(c1, c2): """Compute the intersection of c1 and c2. If c1 is None, then c2 is returned, otherwise, if c2 is None, then c1 is returned. The output is a Set containing matching keys from the input collections. """ class IBTreeModule(Interface): """These are available in all modules (IOBTree, OIBTree, OOBTree, IIBTree, IFBTree, LFBTree, LOBTree, OLBTree, and LLBTree). """ BTree = Attribute( """The IBTree for this module. Also available as [prefix]BTree, as in IOBTree.""") Bucket = Attribute( """The leaf-node data buckets used by the BTree. (IBucket is not currently defined in this file, but is essentially IDictionaryIsh, with the exception of __nonzero__, as of this writing.) Also available as [prefix]Bucket, as in IOBucket.""") TreeSet = Attribute( """The ITreeSet for this module. Also available as [prefix]TreeSet, as in IOTreeSet.""") Set = Attribute( """The ISet for this module: the leaf-node data buckets used by the TreeSet. Also available as [prefix]BTree, as in IOSet.""") class IIMerge(IMerge): """Merge collections with integer value type. A primary intent is to support operations with no or integer values, which are used as "scores" to rate indiviual keys. That is, in this context, a BTree or Bucket is viewed as a set with scored keys, using integer scores. """ def weightedUnion(c1, c2, weight1=1, weight2=1): """Compute the weighted union of c1 and c2. If c1 and c2 are None, the output is (0, None). If c1 is None and c2 is not None, the output is (weight2, c2). If c1 is not None and c2 is None, the output is (weight1, c1). Else, and hereafter, c1 is not None and c2 is not None. 
If c1 and c2 are both sets, the output is 1 and the (unweighted) union of the sets. Else the output is 1 and a Bucket whose keys are the union of c1 and c2's keys, and whose values are:: v1*weight1 + v2*weight2 where: v1 is 0 if the key is not in c1 1 if the key is in c1 and c1 is a set c1[key] if the key is in c1 and c1 is a mapping v2 is 0 if the key is not in c2 1 if the key is in c2 and c2 is a set c2[key] if the key is in c2 and c2 is a mapping Note that c1 and c2 must be collections. """ def weightedIntersection(c1, c2, weight1=1, weight2=1): """Compute the weighted intersection of c1 and c2. If c1 and c2 are None, the output is (0, None). If c1 is None and c2 is not None, the output is (weight2, c2). If c1 is not None and c2 is None, the output is (weight1, c1). Else, and hereafter, c1 is not None and c2 is not None. If c1 and c2 are both sets, the output is the sum of the weights and the (unweighted) intersection of the sets. Else the output is 1 and a Bucket whose keys are the intersection of c1 and c2's keys, and whose values are:: v1*weight1 + v2*weight2 where: v1 is 1 if c1 is a set c1[key] if c1 is a mapping v2 is 1 if c2 is a set c2[key] if c2 is a mapping Note that c1 and c2 must be collections. """ class IMergeIntegerKey(IMerge): """IMerge-able objects with integer keys. Concretely, this means the types in IOBTree and IIBTree. """ def multiunion(seq): """Return union of (zero or more) integer sets, as an integer set. seq is a sequence of objects each convertible to an integer set. These objects are convertible to an integer set: + An integer, which is added to the union. + A Set or TreeSet from the same module (for example, an IIBTree.TreeSet for IIBTree.multiunion()). The elements of the set are added to the union. + A Bucket or BTree from the same module (for example, an IOBTree.IOBTree for IOBTree.multiunion()). The keys of the mapping are added to the union. The union is returned as a Set from the same module (for example, IIBTree.multiunion() returns an IIBTree.IISet). The point to this method is that it can run much faster than doing a sequence of two-input union() calls. Under the covers, all the integers in all the inputs are sorted via a single linear-time radix sort, then duplicates are removed in a second linear-time pass. """ class IBTreeFamily(Interface): """the 64-bit or 32-bit family""" IO = Attribute('The IIntegerObjectBTreeModule for this family') OI = Attribute('The IObjectIntegerBTreeModule for this family') II = Attribute('The IIntegerIntegerBTreeModule for this family') IF = Attribute('The IIntegerFloatBTreeModule for this family') OO = Attribute('The IObjectObjectBTreeModule for this family') maxint = Attribute('The maximum integer storable in this family') minint = Attribute('The minimum integer storable in this family') class IIntegerObjectBTreeModule(IBTreeModule, IMerge): """keys, or set values, are integers; values are objects. describes IOBTree and LOBTree""" family = Attribute('The IBTreeFamily of this module') class IObjectIntegerBTreeModule(IBTreeModule, IIMerge): """keys, or set values, are objects; values are integers. Object keys (and set values) must sort reliably (for instance, *not* on object id)! Homogenous key types recommended. describes OIBTree and LOBTree""" family = Attribute('The IBTreeFamily of this module') class IIntegerIntegerBTreeModule(IBTreeModule, IIMerge, IMergeIntegerKey): """keys, or set values, are integers; values are also integers. 
describes IIBTree and LLBTree""" family = Attribute('The IBTreeFamily of this module') class IObjectObjectBTreeModule(IBTreeModule, IMerge): """keys, or set values, are objects; values are also objects. Object keys (and set values) must sort reliably (for instance, *not* on object id)! Homogenous key types recommended. describes OOBTree""" # Note that there's no ``family`` attribute; all families include # the OO flavor of BTrees. class IIntegerFloatBTreeModule(IBTreeModule, IMerge): """keys, or set values, are integers; values are floats. describes IFBTree and LFBTree""" family = Attribute('The IBTreeFamily of this module') ############################################################### # IMPORTANT NOTE # # Getting the length of a BTree, TreeSet, or output of keys, # values, or items of same is expensive. If you need to get the # length, you need to maintain this separately. # # Eventually, I need to express this through the interfaces. # ################################################################ ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/LFBTree.py000066400000000000000000000015011230730566700226550ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import zope.interface import BTrees.Interfaces # hack to overcome dynamic-linking headache. from _LFBTree import * zope.interface.moduleProvides(BTrees.Interfaces.IIntegerFloatBTreeModule) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/LLBTree.py000066400000000000000000000015031230730566700226650ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import zope.interface import BTrees.Interfaces # hack to overcome dynamic-linking headache. from _LLBTree import * zope.interface.moduleProvides(BTrees.Interfaces.IIntegerIntegerBTreeModule) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/LOBTree.py000066400000000000000000000015021230730566700226670ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import zope.interface import BTrees.Interfaces # hack to overcome dynamic-linking headache. from _LOBTree import * zope.interface.moduleProvides(BTrees.Interfaces.IIntegerObjectBTreeModule) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/Length.py000066400000000000000000000032411230730566700226560ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import persistent class Length(persistent.Persistent): """BTree lengths are often too expensive to compute. Objects that use BTrees need to keep track of lengths themselves. This class provides an object for doing this. As a bonus, the object support application-level conflict resolution. It is tempting to to assign length objects to __len__ attributes to provide instance-specific __len__ methods. However, this no longer works as expected, because new-style classes cache class-defined slot methods (like __len__) in C type slots. Thus, instance-defined slot fillers are ignored. """ value = 0 def __init__(self, v=0): self.value = v def __getstate__(self): return self.value def __setstate__(self, v): self.value = v def set(self, v): self.value = v def _p_resolveConflict(self, old, s1, s2): return s1 + s2 - old def change(self, delta): self.value += delta def __call__(self, *args): return self.value ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/MergeTemplate.c000066400000000000000000000274651230730566700240000ustar00rootroot00000000000000/***************************************************************************** Copyright (c) 2001, 2002 Zope Foundation and Contributors. All Rights Reserved. This software is subject to the provisions of the Zope Public License, Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE ****************************************************************************/ #define MERGETEMPLATE_C "$Id$\n" /**************************************************************************** Set operations ****************************************************************************/ static int merge_output(Bucket *r, SetIteration *i, int mapping) { if (r->len >= r->size && Bucket_grow(r, -1, !mapping) < 0) return -1; COPY_KEY(r->keys[r->len], i->key); INCREF_KEY(r->keys[r->len]); if (mapping) { COPY_VALUE(r->values[r->len], i->value); INCREF_VALUE(r->values[r->len]); } r->len++; return 0; } /* The "reason" argument is a little integer giving "a reason" for the * error. In the Zope3 codebase, these are mapped to explanatory strings * via zodb/btrees/interfaces.py. */ static PyObject * merge_error(int p1, int p2, int p3, int reason) { PyObject *r; UNLESS (r=Py_BuildValue("iiii", p1, p2, p3, reason)) r=Py_None; if (ConflictError == NULL) { ConflictError = PyExc_ValueError; Py_INCREF(ConflictError); } PyErr_SetObject(ConflictError, r); if (r != Py_None) { Py_DECREF(r); } return NULL; } /* It's hard to explain "the rules" for bucket_merge, in large part because * any automatic conflict-resolution scheme is going to be incorrect for * some endcases of *some* app. The scheme here is pretty conservative, * and should be OK for most apps. It's easier to explain what the code * allows than what it forbids: * * Leaving things alone: it's OK if both s2 and s3 leave a piece of s1 * alone (don't delete the key, and don't change the value). * * Key deletion: a transaction (s2 or s3) can delete a key (from s1), but * only if the other transaction (of s2 and s3) doesn't delete the same key. * However, it's not OK for s2 and s3 to, between them, end up deleting all * the keys. This is a higher-level constraint, due to that the caller of * bucket_merge() doesn't have enough info to unlink the resulting empty * bucket from its BTree correctly. It's also not OK if s2 or s3 are empty, * because the transaction that emptied the bucket unlinked the bucket from * the tree, and nothing we do here can get it linked back in again. * * Key insertion: s2 or s3 can add a new key, provided the other transaction * doesn't insert the same key. It's not OK even if they insert the same * pair. * * Mapping value modification: s2 or s3 can modify the value associated * with a key in s1, provided the other transaction doesn't make a * modification of the same key to a different value. It's OK if s2 and s3 * both give the same new value to the key while it's hard to be precise about * why, this doesn't seem consistent with that it's *not* OK for both to add * a new key mapping to the same value). */ static PyObject * bucket_merge(Bucket *s1, Bucket *s2, Bucket *s3) { Bucket *r=0; PyObject *s; SetIteration i1 = {0,0,0}, i2 = {0,0,0}, i3 = {0,0,0}; int cmp12, cmp13, cmp23, mapping, set; /* If either "after" bucket is empty, punt. 
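An empty s2 or s3 means that transaction emptied the bucket and therefore unlinked it from its BTree (see the rules comment above); nothing done here could link it back in, so the case is reported as a conflict (reason 12).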
*/ if (s2->len == 0 || s3->len == 0) { merge_error(-1, -1, -1, 12); goto err; } if (initSetIteration(&i1, OBJECT(s1), 1) < 0) goto err; if (initSetIteration(&i2, OBJECT(s2), 1) < 0) goto err; if (initSetIteration(&i3, OBJECT(s3), 1) < 0) goto err; mapping = i1.usesValue | i2.usesValue | i3.usesValue; set = !mapping; if (mapping) r = (Bucket *)PyObject_CallObject((PyObject *)&BucketType, NULL); else r = (Bucket *)PyObject_CallObject((PyObject *)&SetType, NULL); if (r == NULL) goto err; if (i1.next(&i1) < 0) goto err; if (i2.next(&i2) < 0) goto err; if (i3.next(&i3) < 0) goto err; /* Consult zodb/btrees/interfaces.py for the meaning of the last * argument passed to merge_error(). */ /* TODO: This isn't passing on errors raised by value comparisons. */ while (i1.position >= 0 && i2.position >= 0 && i3.position >= 0) { TEST_KEY_SET_OR(cmp12, i1.key, i2.key) goto err; TEST_KEY_SET_OR(cmp13, i1.key, i3.key) goto err; if (cmp12==0) { if (cmp13==0) { if (set || (TEST_VALUE(i1.value, i2.value) == 0)) { /* change in i3 value or all same */ if (merge_output(r, &i3, mapping) < 0) goto err; } else if (set || (TEST_VALUE(i1.value, i3.value) == 0)) { /* change in i2 value */ if (merge_output(r, &i2, mapping) < 0) goto err; } else { /* conflicting value changes in i2 and i3 */ merge_error(i1.position, i2.position, i3.position, 1); goto err; } if (i1.next(&i1) < 0) goto err; if (i2.next(&i2) < 0) goto err; if (i3.next(&i3) < 0) goto err; } else if (cmp13 > 0) { /* insert i3 */ if (merge_output(r, &i3, mapping) < 0) goto err; if (i3.next(&i3) < 0) goto err; } else if (set || (TEST_VALUE(i1.value, i2.value) == 0)) { /* deleted in i3 */ if (i3.position == 1) { /* Deleted the first item. This will modify the parent node, so we don't know if merging will be safe */ merge_error(i1.position, i2.position, i3.position, 13); goto err; } if (i1.next(&i1) < 0) goto err; if (i2.next(&i2) < 0) goto err; } else { /* conflicting del in i3 and change in i2 */ merge_error(i1.position, i2.position, i3.position, 2); goto err; } } else if (cmp13 == 0) { if (cmp12 > 0) { /* insert i2 */ if (merge_output(r, &i2, mapping) < 0) goto err; if (i2.next(&i2) < 0) goto err; } else if (set || (TEST_VALUE(i1.value, i3.value) == 0)) { /* deleted in i2 */ if (i2.position == 1) { /* Deleted the first item. 
This will modify the parent node, so we don't know if merging will be safe */ merge_error(i1.position, i2.position, i3.position, 13); goto err; } if (i1.next(&i1) < 0) goto err; if (i3.next(&i3) < 0) goto err; } else { /* conflicting del in i2 and change in i3 */ merge_error(i1.position, i2.position, i3.position, 3); goto err; } } else { /* Both keys changed */ TEST_KEY_SET_OR(cmp23, i2.key, i3.key) goto err; if (cmp23==0) { /* dueling inserts or deletes */ merge_error(i1.position, i2.position, i3.position, 4); goto err; } if (cmp12 > 0) { /* insert i2 */ if (cmp23 > 0) { /* insert i3 first */ if (merge_output(r, &i3, mapping) < 0) goto err; if (i3.next(&i3) < 0) goto err; } else { /* insert i2 first */ if (merge_output(r, &i2, mapping) < 0) goto err; if (i2.next(&i2) < 0) goto err; } } else if (cmp13 > 0) { /* Insert i3 */ if (merge_output(r, &i3, mapping) < 0) goto err; if (i3.next(&i3) < 0) goto err; } else { /* 1<2 and 1<3: both deleted 1.key */ merge_error(i1.position, i2.position, i3.position, 5); goto err; } } } while (i2.position >= 0 && i3.position >= 0) { /* New inserts */ TEST_KEY_SET_OR(cmp23, i2.key, i3.key) goto err; if (cmp23==0) { /* dueling inserts */ merge_error(i1.position, i2.position, i3.position, 6); goto err; } if (cmp23 > 0) { /* insert i3 */ if (merge_output(r, &i3, mapping) < 0) goto err; if (i3.next(&i3) < 0) goto err; } else { /* insert i2 */ if (merge_output(r, &i2, mapping) < 0) goto err; if (i2.next(&i2) < 0) goto err; } } while (i1.position >= 0 && i2.position >= 0) { /* remainder of i1 deleted in i3 */ TEST_KEY_SET_OR(cmp12, i1.key, i2.key) goto err; if (cmp12 > 0) { /* insert i2 */ if (merge_output(r, &i2, mapping) < 0) goto err; if (i2.next(&i2) < 0) goto err; } else if (cmp12==0 && (set || (TEST_VALUE(i1.value, i2.value) == 0))) { /* delete i3 */ if (i1.next(&i1) < 0) goto err; if (i2.next(&i2) < 0) goto err; } else { /* Dueling deletes or delete and change */ merge_error(i1.position, i2.position, i3.position, 7); goto err; } } while (i1.position >= 0 && i3.position >= 0) { /* remainder of i1 deleted in i2 */ TEST_KEY_SET_OR(cmp13, i1.key, i3.key) goto err; if (cmp13 > 0) { /* insert i3 */ if (merge_output(r, &i3, mapping) < 0) goto err; if (i3.next(&i3) < 0) goto err; } else if (cmp13==0 && (set || (TEST_VALUE(i1.value, i3.value) == 0))) { /* delete i2 */ if (i1.next(&i1) < 0) goto err; if (i3.next(&i3) < 0) goto err; } else { /* Dueling deletes or delete and change */ merge_error(i1.position, i2.position, i3.position, 8); goto err; } } if (i1.position >= 0) { /* Dueling deletes */ merge_error(i1.position, i2.position, i3.position, 9); goto err; } while (i2.position >= 0) { /* Inserting i2 at end */ if (merge_output(r, &i2, mapping) < 0) goto err; if (i2.next(&i2) < 0) goto err; } while (i3.position >= 0) { /* Inserting i3 at end */ if (merge_output(r, &i3, mapping) < 0) goto err; if (i3.next(&i3) < 0) goto err; } /* If the output bucket is empty, conflict resolution doesn't have * enough info to unlink it from its containing BTree correctly. 
*/ if (r->len == 0) { merge_error(-1, -1, -1, 10); goto err; } finiSetIteration(&i1); finiSetIteration(&i2); finiSetIteration(&i3); if (s1->next) { Py_INCREF(s1->next); r->next = s1->next; } s = bucket_getstate(r); Py_DECREF(r); return s; err: finiSetIteration(&i1); finiSetIteration(&i2); finiSetIteration(&i3); Py_XDECREF(r); return NULL; } ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/OIBTree.py000066400000000000000000000015021230730566700226640ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import zope.interface import BTrees.Interfaces # hack to overcome dynamic-linking headache. from _OIBTree import * zope.interface.moduleProvides(BTrees.Interfaces.IObjectIntegerBTreeModule) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/OLBTree.py000066400000000000000000000015021230730566700226670ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import zope.interface import BTrees.Interfaces # hack to overcome dynamic-linking headache. from _OLBTree import * zope.interface.moduleProvides(BTrees.Interfaces.IObjectIntegerBTreeModule) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/OOBTree.py000066400000000000000000000015011230730566700226710ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import zope.interface import BTrees.Interfaces # hack to overcome dynamic-linking headache. 
from _OOBTree import * zope.interface.moduleProvides(BTrees.Interfaces.IObjectObjectBTreeModule) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/SetOpTemplate.c000066400000000000000000000362661230730566700237720ustar00rootroot00000000000000/***************************************************************************** Copyright (c) 2001, 2002 Zope Foundation and Contributors. All Rights Reserved. This software is subject to the provisions of the Zope Public License, Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE ****************************************************************************/ /**************************************************************************** Set operations ****************************************************************************/ #define SETOPTEMPLATE_C "$Id$\n" #ifdef KEY_CHECK static int nextKeyAsSet(SetIteration *i) { if (i->position >= 0) { if (i->position) { DECREF_KEY(i->key); i->position = -1; } else i->position = 1; } return 0; } #endif /* initSetIteration * * Start the set iteration protocol. See the comments at struct SetIteration. * * Arguments * i The address of a SetIteration control struct. * s The address of the set, bucket, BTree, ..., to be iterated. * useValues Boolean; if true, and s has values (is a mapping), copy * them into i->value each time i->next() is called; else * ignore s's values even if s is a mapping. * * Return * 0 on success; -1 and an exception set if error. * i.usesValue is set to 1 (true) if s has values and useValues was * true; else usesValue is set to 0 (false). * i.set gets a new reference to s, or to some other object used to * iterate over s. * i.position is set to 0. * i.next is set to an appropriate iteration function. * i.key and i.value are left alone. * * Internal * i.position < 0 means iteration terminated. * i.position = 0 means iteration hasn't yet begun (next() hasn't * been called yet). * In all other cases, i.key, and possibly i.value, own references. * These must be cleaned up, either by next() routines, or by * finiSetIteration. * next() routines must ensure the above. They should return without * doing anything when i.position < 0. * It's the responsibility of {init, fini}setIteration to clean up * the reference in i.set, and to ensure that no stale references * live in i.key or i.value if iteration terminates abnormally. * A SetIteration struct has been cleaned up iff i.set is NULL. 
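 *
 * The merge routines that consume this protocol (set_operation() below,
 * copyRemaining(), and bucket_merge() in MergeTemplate.c) drive it the
 * same way: initSetIteration() to set up, one i.next() call to advance
 * onto the first element, a loop that runs while i.position >= 0
 * (consuming i.key, and i.value when i.usesValue is true) with i.next()
 * at the bottom of each pass, and finally finiSetIteration() to clean
 * up the struct.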
*/ static int initSetIteration(SetIteration *i, PyObject *s, int useValues) { i->set = NULL; i->position = -1; /* set to 0 only on normal return */ i->usesValue = 0; /* assume it's a set or that values aren't iterated */ if (PyObject_IsInstance(s, (PyObject *)&BucketType)) { i->set = s; Py_INCREF(s); if (useValues) { i->usesValue = 1; i->next = nextBucket; } else i->next = nextSet; } else if (PyObject_IsInstance(s, (PyObject *)&SetType)) { i->set = s; Py_INCREF(s); i->next = nextSet; } else if (PyObject_IsInstance(s, (PyObject *)&BTreeType)) { i->set = BTree_rangeSearch(BTREE(s), NULL, NULL, 'i'); UNLESS(i->set) return -1; if (useValues) { i->usesValue = 1; i->next = nextBTreeItems; } else i->next = nextTreeSetItems; } else if (PyObject_IsInstance(s, (PyObject *)&TreeSetType)) { i->set = BTree_rangeSearch(BTREE(s), NULL, NULL, 'k'); UNLESS(i->set) return -1; i->next = nextTreeSetItems; } #ifdef KEY_CHECK else if (KEY_CHECK(s)) { int copied = 1; COPY_KEY_FROM_ARG(i->key, s, copied); UNLESS (copied) return -1; INCREF_KEY(i->key); i->set = s; Py_INCREF(s); i->next = nextKeyAsSet; } #endif else { PyErr_SetString(PyExc_TypeError, "invalid argument"); return -1; } i->position = 0; return 0; } #ifndef MERGE_WEIGHT #define MERGE_WEIGHT(O, w) (O) #endif static int copyRemaining(Bucket *r, SetIteration *i, int merge, /* See comment # 42 */ #ifdef MERGE VALUE_TYPE w) #else int w) #endif { while (i->position >= 0) { if(r->len >= r->size && Bucket_grow(r, -1, ! merge) < 0) return -1; COPY_KEY(r->keys[r->len], i->key); INCREF_KEY(r->keys[r->len]); if (merge) { COPY_VALUE(r->values[r->len], MERGE_WEIGHT(i->value, w)); INCREF_VALUE(r->values[r->len]); } r->len++; if (i->next(i) < 0) return -1; } return 0; } /* This is the workhorse for all set merge operations: the weighted and * unweighted flavors of union and intersection, and set difference. The * algorithm is conceptually simple but the code is complicated due to all * the options. * * s1, s2 * The input collections to be merged. * * usevalues1, usevalues2 * Booleans. In the output, should values from s1 (or s2) be used? This * only makes sense when an operation intends to support mapping outputs; * these should both be false for operations that want pure set outputs. * * w1, w2 * If usevalues1(2) are true, these are the weights to apply to the * input values. * * c1 * Boolean. Should keys that appear in c1 but not c2 appear in the output? * c12 * Boolean. Should keys that appear in both inputs appear in the output? * c2 * Boolean. Should keys that appear in c2 but not c1 appear in the output? * * Returns NULL if error, else a Set or Bucket, depending on whether a set or * mapping was requested. */ static PyObject * set_operation(PyObject *s1, PyObject *s2, int usevalues1, int usevalues2, /* Comment # 42 The following ifdef works around a template/type problem Weights are passed as integers. In particular, the weight passed by difference is one. This works fine in the int value and float value cases but makes no sense in the object value case. In the object value case, we don't do merging, so we don't use the weights, so it doesn't matter what they are. 
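 At the Python level these flag combinations surface through the
 module-level wrappers built on set_operation further below (difference,
 union, intersection and the weighted variants). A small illustrative
 sketch, using the int-int flavor and made-up values:

     from BTrees.IIBTree import IISet, IIBucket
     from BTrees.IIBTree import union, intersection, difference, weightedUnion

     s1 = IISet([1, 2, 3])
     s2 = IISet([2, 3, 4])
     list(union(s1, s2))         # [1, 2, 3, 4]  -- c1, c12 and c2 all true
     list(intersection(s1, s2))  # [2, 3]        -- only c12 true
     list(difference(s1, s2))    # [1]           -- only c1 true

     b1 = IIBucket(); b1[1] = 10; b1[2] = 20
     b2 = IIBucket(); b2[2] = 1;  b2[3] = 5
     # weightedUnion returns (weight, result); keys present in both inputs
     # get w1*v1 + w2*v2, keys present in only one input get w*v.
     weightedUnion(b1, b2, 1, 2)   # -> (1, bucket mapping {1: 10, 2: 22, 3: 10})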
*/ #ifdef MERGE VALUE_TYPE w1, VALUE_TYPE w2, #else int w1, int w2, #endif int c1, int c12, int c2) { Bucket *r=0; SetIteration i1 = {0,0,0}, i2 = {0,0,0}; int cmp, merge; if (initSetIteration(&i1, s1, usevalues1) < 0) goto err; if (initSetIteration(&i2, s2, usevalues2) < 0) goto err; merge = i1.usesValue | i2.usesValue; if (merge) { #ifndef MERGE if (c12 && i1.usesValue && i2.usesValue) goto invalid_set_operation; #endif if (! i1.usesValue&& i2.usesValue) { SetIteration t; int i; /* See comment # 42 above */ #ifdef MERGE VALUE_TYPE v; #else int v; #endif t=i1; i1=i2; i2=t; i=c1; c1=c2; c2=i; v=w1; w1=w2; w2=v; } #ifdef MERGE_DEFAULT i1.value=MERGE_DEFAULT; i2.value=MERGE_DEFAULT; #else if (i1.usesValue) { if (! i2.usesValue && c2) goto invalid_set_operation; } else { if (c1 || c12) goto invalid_set_operation; } #endif UNLESS(r=BUCKET(PyObject_CallObject(OBJECT(&BucketType), NULL))) goto err; } else { UNLESS(r=BUCKET(PyObject_CallObject(OBJECT(&SetType), NULL))) goto err; } if (i1.next(&i1) < 0) goto err; if (i2.next(&i2) < 0) goto err; while (i1.position >= 0 && i2.position >= 0) { TEST_KEY_SET_OR(cmp, i1.key, i2.key) goto err; if(cmp < 0) { if(c1) { if(r->len >= r->size && Bucket_grow(r, -1, ! merge) < 0) goto err; COPY_KEY(r->keys[r->len], i1.key); INCREF_KEY(r->keys[r->len]); if (merge) { COPY_VALUE(r->values[r->len], MERGE_WEIGHT(i1.value, w1)); INCREF_VALUE(r->values[r->len]); } r->len++; } if (i1.next(&i1) < 0) goto err; } else if(cmp==0) { if(c12) { if(r->len >= r->size && Bucket_grow(r, -1, ! merge) < 0) goto err; COPY_KEY(r->keys[r->len], i1.key); INCREF_KEY(r->keys[r->len]); if (merge) { #ifdef MERGE r->values[r->len] = MERGE(i1.value, w1, i2.value, w2); #else COPY_VALUE(r->values[r->len], i1.value); INCREF_VALUE(r->values[r->len]); #endif } r->len++; } if (i1.next(&i1) < 0) goto err; if (i2.next(&i2) < 0) goto err; } else { if(c2) { if(r->len >= r->size && Bucket_grow(r, -1, ! 
merge) < 0) goto err; COPY_KEY(r->keys[r->len], i2.key); INCREF_KEY(r->keys[r->len]); if (merge) { COPY_VALUE(r->values[r->len], MERGE_WEIGHT(i2.value, w2)); INCREF_VALUE(r->values[r->len]); } r->len++; } if (i2.next(&i2) < 0) goto err; } } if(c1 && copyRemaining(r, &i1, merge, w1) < 0) goto err; if(c2 && copyRemaining(r, &i2, merge, w2) < 0) goto err; finiSetIteration(&i1); finiSetIteration(&i2); return OBJECT(r); #ifndef MERGE_DEFAULT invalid_set_operation: PyErr_SetString(PyExc_TypeError, "invalid set operation"); #endif err: finiSetIteration(&i1); finiSetIteration(&i2); Py_XDECREF(r); return NULL; } static PyObject * difference_m(PyObject *ignored, PyObject *args) { PyObject *o1, *o2; UNLESS(PyArg_ParseTuple(args, "OO", &o1, &o2)) return NULL; if (o1 == Py_None || o2 == Py_None) { /* difference(None, X) -> None; difference(X, None) -> X */ Py_INCREF(o1); return o1; } return set_operation(o1, o2, 1, 0, /* preserve values from o1, ignore o2's */ 1, 0, /* o1's values multiplied by 1 */ 1, 0, 0); /* take only keys unique to o1 */ } static PyObject * union_m(PyObject *ignored, PyObject *args) { PyObject *o1, *o2; UNLESS(PyArg_ParseTuple(args, "OO", &o1, &o2)) return NULL; if (o1 == Py_None) { Py_INCREF(o2); return o2; } else if (o2 == Py_None) { Py_INCREF(o1); return o1; } return set_operation(o1, o2, 0, 0, /* ignore values in both */ 1, 1, /* the weights are irrelevant */ 1, 1, 1); /* take all keys */ } static PyObject * intersection_m(PyObject *ignored, PyObject *args) { PyObject *o1, *o2; UNLESS(PyArg_ParseTuple(args, "OO", &o1, &o2)) return NULL; if (o1 == Py_None) { Py_INCREF(o2); return o2; } else if (o2 == Py_None) { Py_INCREF(o1); return o1; } return set_operation(o1, o2, 0, 0, /* ignore values in both */ 1, 1, /* the weights are irrelevant */ 0, 1, 0); /* take only keys common to both */ } #ifdef MERGE static PyObject * wunion_m(PyObject *ignored, PyObject *args) { PyObject *o1, *o2; VALUE_TYPE w1 = 1, w2 = 1; UNLESS(PyArg_ParseTuple(args, "OO|" VALUE_PARSE VALUE_PARSE, &o1, &o2, &w1, &w2) ) return NULL; if (o1 == Py_None) return Py_BuildValue(VALUE_PARSE "O", (o2 == Py_None ? 0 : w2), o2); else if (o2 == Py_None) return Py_BuildValue(VALUE_PARSE "O", w1, o1); o1 = set_operation(o1, o2, 1, 1, w1, w2, 1, 1, 1); if (o1) ASSIGN(o1, Py_BuildValue(VALUE_PARSE "O", (VALUE_TYPE)1, o1)); return o1; } static PyObject * wintersection_m(PyObject *ignored, PyObject *args) { PyObject *o1, *o2; VALUE_TYPE w1 = 1, w2 = 1; UNLESS(PyArg_ParseTuple(args, "OO|" VALUE_PARSE VALUE_PARSE, &o1, &o2, &w1, &w2) ) return NULL; if (o1 == Py_None) return Py_BuildValue(VALUE_PARSE "O", (o2 == Py_None ? 0 : w2), o2); else if (o2 == Py_None) return Py_BuildValue(VALUE_PARSE "O", w1, o1); o1 = set_operation(o1, o2, 1, 1, w1, w2, 0, 1, 0); if (o1) ASSIGN(o1, Py_BuildValue(VALUE_PARSE "O", ((o1->ob_type == (PyTypeObject*)(&SetType)) ? w2+w1 : 1), o1)); return o1; } #endif #ifdef MULTI_INT_UNION #include "sorters.c" /* Input is a sequence of integer sets (or convertible to sets by the set iteration protocol). Output is the union of the sets. The point is to run much faster than doing pairs of unions. */ static PyObject * multiunion_m(PyObject *ignored, PyObject *args) { PyObject *seq; /* input sequence */ int n; /* length of input sequence */ PyObject *set = NULL; /* an element of the input sequence */ Bucket *result; /* result set */ SetIteration setiter = {0}; int i; UNLESS(PyArg_ParseTuple(args, "O", &seq)) return NULL; n = PyObject_Length(seq); if (n < 0) return NULL; /* Construct an empty result set. 
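 As context, the Python-level call that this function implements looks
 roughly like the following sketch (illustrative values; the int-int
 flavor is shown):

     from BTrees.IIBTree import IISet, multiunion

     parts = [IISet([1, 2]), IISet([2, 3]), IISet([3, 4])]
     list(multiunion(parts))               # [1, 2, 3, 4], a single IISet
     # A bare integer is also accepted as a one-element input set:
     list(multiunion([IISet([10]), 11]))   # [10, 11]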
*/ result = BUCKET(PyObject_CallObject(OBJECT(&SetType), NULL)); if (result == NULL) return NULL; /* For each set in the input sequence, append its elements to the result set. At this point, we ignore the possibility of duplicates. */ for (i = 0; i < n; ++i) { set = PySequence_GetItem(seq, i); if (set == NULL) goto Error; /* If set is a bucket, do a straight resize + memcpy. */ if (set->ob_type == (PyTypeObject*)&SetType || set->ob_type == (PyTypeObject*)&BucketType) { Bucket *b = BUCKET(set); int status = 0; UNLESS (PER_USE(b)) goto Error; if (b->len) status = bucket_append(result, b, 0, b->len, 0, i < n-1); PER_UNUSE(b); if (status < 0) goto Error; } else { /* No cheap way: iterate over set's elements one at a time. */ if (initSetIteration(&setiter, set, 0) < 0) goto Error; if (setiter.next(&setiter) < 0) goto Error; while (setiter.position >= 0) { if (result->len >= result->size && Bucket_grow(result, -1, 1) < 0) goto Error; COPY_KEY(result->keys[result->len], setiter.key); ++result->len; /* We know the key is an int, so no need to incref it. */ if (setiter.next(&setiter) < 0) goto Error; } finiSetIteration(&setiter); } Py_DECREF(set); set = NULL; } /* Combine, sort, remove duplicates, and reset the result's len. If the set shrinks (which happens if and only if there are duplicates), no point to realloc'ing the set smaller, as we expect the result set to be short-lived. */ if (result->len > 0) { size_t newlen; /* number of elements in final result set */ newlen = sort_int_nodups(result->keys, (size_t)result->len); result->len = (int)newlen; } return (PyObject *)result; Error: Py_DECREF(result); Py_XDECREF(set); finiSetIteration(&setiter); return NULL; } #endif ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/SetTemplate.c000066400000000000000000000207141230730566700234620ustar00rootroot00000000000000/***************************************************************************** Copyright (c) 2001, 2002 Zope Foundation and Contributors. All Rights Reserved. This software is subject to the provisions of the Zope Public License, Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE ****************************************************************************/ #define SETTEMPLATE_C "$Id$\n" static PyObject * Set_insert(Bucket *self, PyObject *args) { PyObject *key; int i; UNLESS (PyArg_ParseTuple(args, "O", &key)) return NULL; if ( (i=_bucket_set(self, key, Py_None, 1, 1, 0)) < 0) return NULL; return PyInt_FromLong(i); } /* _Set_update and _TreeSet_update are identical except for the function they call to add the element to the set. 
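 At the Python level the methods built from these helpers behave roughly as
 sketched below (illustrative; the int flavor is shown, the other flavors
 behave the same way):

     from BTrees.IIBTree import IISet, IITreeSet

     s = IISet([3, 1, 2])
     s.insert(4)            # 1  (key was added)
     s.insert(1)            # 0  (key was already present)
     s.update([5, 1])       # 1  (number of keys actually added)
     s.remove(3)
     list(s)                # [1, 2, 4, 5]
     2 in s, len(s), s[0]   # (True, 4, 1)

     t = IITreeSet(s)       # a TreeSet can be built from any iterable
     t.minKey(), t.maxKey(), t.maxKey(3)   # (1, 5, 2)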
*/ static int _Set_update(Bucket *self, PyObject *seq) { int n=0, ind=0; PyObject *iter, *v; iter = PyObject_GetIter(seq); if (iter == NULL) return -1; while (1) { v = PyIter_Next(iter); if (v == NULL) { if (PyErr_Occurred()) goto err; else break; } ind = _bucket_set(self, v, Py_None, 1, 1, 0); Py_DECREF(v); if (ind < 0) goto err; else n += ind; } err: Py_DECREF(iter); if (ind < 0) return -1; return n; } static PyObject * Set_update(Bucket *self, PyObject *args) { PyObject *seq = NULL; int n = 0; if (!PyArg_ParseTuple(args, "|O:update", &seq)) return NULL; if (seq) { n = _Set_update(self, seq); if (n < 0) return NULL; } return PyInt_FromLong(n); } static PyObject * Set_remove(Bucket *self, PyObject *args) { PyObject *key; UNLESS (PyArg_ParseTuple(args, "O", &key)) return NULL; if (_bucket_set(self, key, NULL, 0, 1, 0) < 0) return NULL; Py_INCREF(Py_None); return Py_None; } static int _set_setstate(Bucket *self, PyObject *args) { PyObject *k, *items; Bucket *next=0; int i, l, copied=1; KEY_TYPE *keys; UNLESS (PyArg_ParseTuple(args, "O|O", &items, &next)) return -1; if (!PyTuple_Check(items)) { PyErr_SetString(PyExc_TypeError, "tuple required for first state element"); return -1; } if ((l=PyTuple_Size(items)) < 0) return -1; for (i=self->len; --i >= 0; ) { DECREF_KEY(self->keys[i]); } self->len=0; if (self->next) { Py_DECREF(self->next); self->next=0; } if (l > self->size) { UNLESS (keys=BTree_Realloc(self->keys, sizeof(KEY_TYPE)*l)) return -1; self->keys=keys; self->size=l; } for (i=0; i<l; i++) { k=PyTuple_GET_ITEM(items, i); COPY_KEY_FROM_ARG(self->keys[i], k, copied); UNLESS (copied) return -1; INCREF_KEY(self->keys[i]); } self->len=l; if (next) { self->next=next; Py_INCREF(next); } return 0; } static PyObject * set_setstate(Bucket *self, PyObject *args) { int r; UNLESS (PyArg_ParseTuple(args, "O", &args)) return NULL; PER_PREVENT_DEACTIVATION(self); r=_set_setstate(self, args); PER_UNUSE(self); if (r < 0) return NULL; Py_INCREF(Py_None); return Py_None; } static struct PyMethodDef Set_methods[] = { {"__getstate__", (PyCFunction) bucket_getstate, METH_VARARGS, "__getstate__() -- Return the picklable state of the object"}, {"__setstate__", (PyCFunction) set_setstate, METH_VARARGS, "__setstate__() -- Set the state of the object"}, {"keys", (PyCFunction) bucket_keys, METH_KEYWORDS, "keys() -- Return the keys"}, {"has_key", (PyCFunction) bucket_has_key, METH_O, "has_key(key) -- Test whether the bucket contains the given key"}, {"clear", (PyCFunction) bucket_clear, METH_VARARGS, "clear() -- Remove all of the items from the bucket"}, {"maxKey", (PyCFunction) Bucket_maxKey, METH_VARARGS, "maxKey([key]) -- Find the maximum key\n\n" "If an argument is given, find the maximum <= the argument"}, {"minKey", (PyCFunction) Bucket_minKey, METH_VARARGS, "minKey([key]) -- Find the minimum key\n\n" "If an argument is given, find the minimum >= the argument"}, #ifdef PERSISTENT {"_p_resolveConflict", (PyCFunction) bucket__p_resolveConflict, METH_VARARGS, "_p_resolveConflict() -- Reinitialize from a newly created copy"}, {"_p_deactivate", (PyCFunction) bucket__p_deactivate, METH_KEYWORDS, "_p_deactivate() -- Reinitialize from a newly created copy"}, #endif {"add", (PyCFunction)Set_insert, METH_VARARGS, "add(id) -- Add a key to the set"}, {"insert", (PyCFunction)Set_insert, METH_VARARGS, "insert(id) -- Add a key to the set"}, {"update", (PyCFunction)Set_update, METH_VARARGS, "update(seq) -- Add the items from the given sequence to the set"}, {"remove", (PyCFunction)Set_remove, METH_VARARGS, "remove(id) -- Remove an id from the set"}, {NULL, NULL} /* sentinel */ }; static
int Set_init(PyObject *self, PyObject *args, PyObject *kwds) { PyObject *v = NULL; if (!PyArg_ParseTuple(args, "|O:" MOD_NAME_PREFIX "Set", &v)) return -1; if (v) return _Set_update((Bucket *)self, v); else return 0; } static PyObject * set_repr(Bucket *self) { static PyObject *format; PyObject *r, *t; if (!format) format = PyString_FromString(MOD_NAME_PREFIX "Set(%s)"); UNLESS (t = PyTuple_New(1)) return NULL; UNLESS (r = bucket_keys(self, NULL, NULL)) goto err; PyTuple_SET_ITEM(t, 0, r); r = t; ASSIGN(r, PyString_Format(format, r)); return r; err: Py_DECREF(t); return NULL; } static Py_ssize_t set_length(Bucket *self) { int r; PER_USE_OR_RETURN(self, -1); r = self->len; PER_UNUSE(self); return r; } static PyObject * set_item(Bucket *self, Py_ssize_t index) { PyObject *r=0; PER_USE_OR_RETURN(self, NULL); if (index >= 0 && index < self->len) { COPY_KEY_TO_OBJECT(r, self->keys[index]); } else IndexError(index); PER_UNUSE(self); return r; } static PySequenceMethods set_as_sequence = { (lenfunc)set_length, /* sq_length */ (binaryfunc)0, /* sq_concat */ (ssizeargfunc)0, /* sq_repeat */ (ssizeargfunc)set_item, /* sq_item */ (ssizessizeargfunc)0, /* sq_slice */ (ssizeobjargproc)0, /* sq_ass_item */ (ssizessizeobjargproc)0, /* sq_ass_slice */ (objobjproc)bucket_contains, /* sq_contains */ 0, /* sq_inplace_concat */ 0, /* sq_inplace_repeat */ }; static PyTypeObject SetType = { PyObject_HEAD_INIT(NULL) /* PyPersist_Type */ 0, /* ob_size */ MODULE_NAME MOD_NAME_PREFIX "Set", /* tp_name */ sizeof(Bucket), /* tp_basicsize */ 0, /* tp_itemsize */ (destructor)bucket_dealloc, /* tp_dealloc */ 0, /* tp_print */ 0, /* tp_getattr */ 0, /* tp_setattr */ 0, /* tp_compare */ (reprfunc)set_repr, /* tp_repr */ 0, /* tp_as_number */ &set_as_sequence, /* tp_as_sequence */ 0, /* tp_as_mapping */ 0, /* tp_hash */ 0, /* tp_call */ 0, /* tp_str */ 0, /* tp_getattro */ 0, /* tp_setattro */ 0, /* tp_as_buffer */ Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC | Py_TPFLAGS_BASETYPE, /* tp_flags */ 0, /* tp_doc */ (traverseproc)bucket_traverse, /* tp_traverse */ (inquiry)bucket_tp_clear, /* tp_clear */ 0, /* tp_richcompare */ 0, /* tp_weaklistoffset */ (getiterfunc)Bucket_getiter, /* tp_iter */ 0, /* tp_iternext */ Set_methods, /* tp_methods */ Bucket_members, /* tp_members */ 0, /* tp_getset */ 0, /* tp_base */ 0, /* tp_dict */ 0, /* tp_descr_get */ 0, /* tp_descr_set */ 0, /* tp_dictoffset */ Set_init, /* tp_init */ 0, /* tp_alloc */ 0, /*PyType_GenericNew,*/ /* tp_new */ }; static int nextSet(SetIteration *i) { if (i->position >= 0) { UNLESS(PER_USE(BUCKET(i->set))) return -1; if (i->position) { DECREF_KEY(i->key); } if (i->position < BUCKET(i->set)->len) { COPY_KEY(i->key, BUCKET(i->set)->keys[i->position]); INCREF_KEY(i->key); i->position ++; } else { i->position = -1; PER_ACCESSED(BUCKET(i->set)); } PER_ALLOW_DEACTIVATION(BUCKET(i->set)); } return 0; } ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/TreeSetTemplate.c000066400000000000000000000152631230730566700243050ustar00rootroot00000000000000/***************************************************************************** Copyright (c) 2001, 2002 Zope Foundation and Contributors. All Rights Reserved. This software is subject to the provisions of the Zope Public License, Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE ****************************************************************************/ #define TREESETTEMPLATE_C "$Id$\n" static PyObject * TreeSet_insert(BTree *self, PyObject *args) { PyObject *key; int i; if (!PyArg_ParseTuple(args, "O:insert", &key)) return NULL; i = _BTree_set(self, key, Py_None, 1, 1); if (i < 0) return NULL; return PyInt_FromLong(i); } /* _Set_update and _TreeSet_update are identical except for the function they call to add the element to the set. */ static int _TreeSet_update(BTree *self, PyObject *seq) { int n=0, ind=0; PyObject *iter, *v; iter = PyObject_GetIter(seq); if (iter == NULL) return -1; while (1) { v = PyIter_Next(iter); if (v == NULL) { if (PyErr_Occurred()) goto err; else break; } ind = _BTree_set(self, v, Py_None, 1, 1); Py_DECREF(v); if (ind < 0) goto err; else n += ind; } err: Py_DECREF(iter); if (ind < 0) return -1; return n; } static PyObject * TreeSet_update(BTree *self, PyObject *args) { PyObject *seq = NULL; int n = 0; if (!PyArg_ParseTuple(args, "|O:update", &seq)) return NULL; if (seq) { n = _TreeSet_update(self, seq); if (n < 0) return NULL; } return PyInt_FromLong(n); } static PyObject * TreeSet_remove(BTree *self, PyObject *args) { PyObject *key; UNLESS (PyArg_ParseTuple(args, "O", &key)) return NULL; if (_BTree_set(self, key, NULL, 0, 1) < 0) return NULL; Py_INCREF(Py_None); return Py_None; } static PyObject * TreeSet_setstate(BTree *self, PyObject *args) { int r; if (!PyArg_ParseTuple(args,"O",&args)) return NULL; PER_PREVENT_DEACTIVATION(self); r=_BTree_setstate(self, args, 1); PER_UNUSE(self); if (r < 0) return NULL; Py_INCREF(Py_None); return Py_None; } static struct PyMethodDef TreeSet_methods[] = { {"__getstate__", (PyCFunction) BTree_getstate, METH_NOARGS, "__getstate__() -> state\n\n" "Return the picklable state of the TreeSet."}, {"__setstate__", (PyCFunction) TreeSet_setstate, METH_VARARGS, "__setstate__(state)\n\n" "Set the state of the TreeSet."}, {"has_key", (PyCFunction) BTree_has_key, METH_O, "has_key(key)\n\n" "Return true if the TreeSet contains the given key."}, {"keys", (PyCFunction) BTree_keys, METH_KEYWORDS, "keys([min, max]) -> list of keys\n\n" "Returns the keys of the TreeSet. If min and max are supplied, only\n" "keys greater than min and less than max are returned."}, {"maxKey", (PyCFunction) BTree_maxKey, METH_VARARGS, "maxKey([max]) -> key\n\n" "Return the largest key in the BTree. If max is specified, return\n" "the largest key <= max."}, {"minKey", (PyCFunction) BTree_minKey, METH_VARARGS, "minKey([mi]) -> key\n\n" "Return the smallest key in the BTree. 
If min is specified, return\n" "the smallest key >= min."}, {"clear", (PyCFunction) BTree_clear, METH_NOARGS, "clear()\n\nRemove all of the items from the BTree."}, {"add", (PyCFunction)TreeSet_insert, METH_VARARGS, "add(id) -- Add an item to the set"}, {"insert", (PyCFunction)TreeSet_insert, METH_VARARGS, "insert(id) -- Add an item to the set"}, {"update", (PyCFunction)TreeSet_update, METH_VARARGS, "update(collection)\n\n Add the items from the given collection."}, {"remove", (PyCFunction)TreeSet_remove, METH_VARARGS, "remove(id) -- Remove a key from the set"}, {"_check", (PyCFunction) BTree_check, METH_NOARGS, "Perform sanity check on TreeSet, and raise exception if flawed."}, #ifdef PERSISTENT {"_p_resolveConflict", (PyCFunction) BTree__p_resolveConflict, METH_VARARGS, "_p_resolveConflict() -- Reinitialize from a newly created copy"}, {"_p_deactivate", (PyCFunction) BTree__p_deactivate, METH_KEYWORDS, "_p_deactivate()\n\nReinitialize from a newly created copy."}, #endif {NULL, NULL} /* sentinel */ }; static PyMappingMethods TreeSet_as_mapping = { (lenfunc)BTree_length, /*mp_length*/ }; static PySequenceMethods TreeSet_as_sequence = { (lenfunc)0, /* sq_length */ (binaryfunc)0, /* sq_concat */ (ssizeargfunc)0, /* sq_repeat */ (ssizeargfunc)0, /* sq_item */ (ssizessizeargfunc)0, /* sq_slice */ (ssizeobjargproc)0, /* sq_ass_item */ (ssizessizeobjargproc)0, /* sq_ass_slice */ (objobjproc)BTree_contains, /* sq_contains */ 0, /* sq_inplace_concat */ 0, /* sq_inplace_repeat */ }; static int TreeSet_init(PyObject *self, PyObject *args, PyObject *kwds) { PyObject *v = NULL; if (!PyArg_ParseTuple(args, "|O:" MOD_NAME_PREFIX "TreeSet", &v)) return -1; if (v) return _TreeSet_update((BTree *)self, v); else return 0; } static PyTypeObject TreeSetType = { PyObject_HEAD_INIT(NULL) /* PyPersist_Type */ 0, /* ob_size */ MODULE_NAME MOD_NAME_PREFIX "TreeSet",/* tp_name */ sizeof(BTree), /* tp_basicsize */ 0, /* tp_itemsize */ (destructor)BTree_dealloc, /* tp_dealloc */ 0, /* tp_print */ 0, /* tp_getattr */ 0, /* tp_setattr */ 0, /* tp_compare */ 0, /* tp_repr */ &BTree_as_number_for_nonzero, /* tp_as_number */ &TreeSet_as_sequence, /* tp_as_sequence */ &TreeSet_as_mapping, /* tp_as_mapping */ 0, /* tp_hash */ 0, /* tp_call */ 0, /* tp_str */ 0, /* tp_getattro */ 0, /* tp_setattro */ 0, /* tp_as_buffer */ Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC | Py_TPFLAGS_BASETYPE, /* tp_flags */ 0, /* tp_doc */ (traverseproc)BTree_traverse, /* tp_traverse */ (inquiry)BTree_tp_clear, /* tp_clear */ 0, /* tp_richcompare */ 0, /* tp_weaklistoffset */ (getiterfunc)BTree_getiter, /* tp_iter */ 0, /* tp_iternext */ TreeSet_methods, /* tp_methods */ BTree_members, /* tp_members */ 0, /* tp_getset */ 0, /* tp_base */ 0, /* tp_dict */ 0, /* tp_descr_get */ 0, /* tp_descr_set */ 0, /* tp_dictoffset */ TreeSet_init, /* tp_init */ 0, /* tp_alloc */ 0, /*PyType_GenericNew,*/ /* tp_new */ }; ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/_IFBTree.c000066400000000000000000000020351230730566700226060ustar00rootroot00000000000000/*############################################################################ # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################*/ #define MASTER_ID "$Id$\n" /* IFBTree - int key, float value BTree Implements a collection using int type keys and float type values */ /* Setup template macros */ #define PERSISTENT #define MOD_NAME_PREFIX "IF" #define INITMODULE init_IFBTree #define DEFAULT_MAX_BUCKET_SIZE 120 #define DEFAULT_MAX_BTREE_SIZE 500 #include "intkeymacros.h" #include "floatvaluemacros.h" #include "BTreeModuleTemplate.c" ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/_IIBTree.c000066400000000000000000000020271230730566700226120ustar00rootroot00000000000000/*############################################################################ # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################*/ #define MASTER_ID "$Id$\n" /* IIBTree - int key, int value BTree Implements a collection using int type keys and int type values */ /* Setup template macros */ #define PERSISTENT #define MOD_NAME_PREFIX "II" #define INITMODULE init_IIBTree #define DEFAULT_MAX_BUCKET_SIZE 120 #define DEFAULT_MAX_BTREE_SIZE 500 #include "intkeymacros.h" #include "intvaluemacros.h" #include "BTreeModuleTemplate.c" ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/_IOBTree.c000066400000000000000000000020421230730566700226150ustar00rootroot00000000000000/*############################################################################ # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################*/ #define MASTER_ID "$Id$\n" /* IOBTree - int key, object value BTree Implements a collection using int type keys and object type values */ #define PERSISTENT #define MOD_NAME_PREFIX "IO" #define DEFAULT_MAX_BUCKET_SIZE 60 #define DEFAULT_MAX_BTREE_SIZE 500 #define INITMODULE init_IOBTree #include "intkeymacros.h" #include "objectvaluemacros.h" #include "BTreeModuleTemplate.c" ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/_LFBTree.c000066400000000000000000000021451230730566700226130ustar00rootroot00000000000000/*############################################################################ # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################*/ #define MASTER_ID "$Id: _IFBTree.c 67074 2006-04-17 19:13:39Z fdrake $\n" /* IFBTree - int key, float value BTree Implements a collection using int type keys and float type values */ /* Setup template macros */ #define PERSISTENT #define MOD_NAME_PREFIX "LF" #define INITMODULE init_LFBTree #define DEFAULT_MAX_BUCKET_SIZE 120 #define DEFAULT_MAX_BTREE_SIZE 500 #define ZODB_64BIT_INTS #include "intkeymacros.h" #include "floatvaluemacros.h" #include "BTreeModuleTemplate.c" ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/_LLBTree.c000066400000000000000000000021341230730566700226170ustar00rootroot00000000000000/*############################################################################ # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################*/ #define MASTER_ID "$Id: _IIBTree.c 25186 2004-06-02 15:07:33Z jim $\n" /* IIBTree - int key, int value BTree Implements a collection using int type keys and int type values */ /* Setup template macros */ #define PERSISTENT #define MOD_NAME_PREFIX "LL" #define INITMODULE init_LLBTree #define DEFAULT_MAX_BUCKET_SIZE 120 #define DEFAULT_MAX_BTREE_SIZE 500 #define ZODB_64BIT_INTS #include "intkeymacros.h" #include "intvaluemacros.h" #include "BTreeModuleTemplate.c" ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/_LOBTree.c000066400000000000000000000021071230730566700226220ustar00rootroot00000000000000/*############################################################################ # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################*/ #define MASTER_ID "$Id: _IOBTree.c 25186 2004-06-02 15:07:33Z jim $\n" /* IOBTree - int key, object value BTree Implements a collection using int type keys and object type values */ #define PERSISTENT #define MOD_NAME_PREFIX "LO" #define DEFAULT_MAX_BUCKET_SIZE 60 #define DEFAULT_MAX_BTREE_SIZE 500 #define INITMODULE init_LOBTree #define ZODB_64BIT_INTS #include "intkeymacros.h" #include "objectvaluemacros.h" #include "BTreeModuleTemplate.c" ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/_OIBTree.c000066400000000000000000000020421230730566700226150ustar00rootroot00000000000000/*############################################################################ # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################*/ #define MASTER_ID "$Id$\n" /* OIBTree - object key, int value BTree Implements a collection using object type keys and int type values */ #define PERSISTENT #define MOD_NAME_PREFIX "OI" #define INITMODULE init_OIBTree #define DEFAULT_MAX_BUCKET_SIZE 60 #define DEFAULT_MAX_BTREE_SIZE 250 #include "objectkeymacros.h" #include "intvaluemacros.h" #include "BTreeModuleTemplate.c" ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/_OLBTree.c000066400000000000000000000021071230730566700226220ustar00rootroot00000000000000/*############################################################################ # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################*/ #define MASTER_ID "$Id: _OIBTree.c 25186 2004-06-02 15:07:33Z jim $\n" /* OIBTree - object key, int value BTree Implements a collection using object type keys and int type values */ #define PERSISTENT #define MOD_NAME_PREFIX "OL" #define INITMODULE init_OLBTree #define DEFAULT_MAX_BUCKET_SIZE 60 #define DEFAULT_MAX_BTREE_SIZE 250 #define ZODB_64BIT_INTS #include "objectkeymacros.h" #include "intvaluemacros.h" #include "BTreeModuleTemplate.c" ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/_OOBTree.c000066400000000000000000000020531230730566700226250ustar00rootroot00000000000000/*############################################################################ # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################*/ #define MASTER_ID "$Id$\n" /* OOBTree - object key, object value BTree Implements a collection using object type keys and object type values */ #define PERSISTENT #define MOD_NAME_PREFIX "OO" #define INITMODULE init_OOBTree #define DEFAULT_MAX_BUCKET_SIZE 30 #define DEFAULT_MAX_BTREE_SIZE 250 #include "objectkeymacros.h" #include "objectvaluemacros.h" #include "BTreeModuleTemplate.c" ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/__init__.py000066400000000000000000000035211230730566700231750ustar00rootroot00000000000000############################################################################# # # Copyright (c) 2007 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################# import zope.interface import BTrees.Interfaces class _Family(object): zope.interface.implements(BTrees.Interfaces.IBTreeFamily) from BTrees import OOBTree as OO class _Family32(_Family): from BTrees import OIBTree as OI from BTrees import IIBTree as II from BTrees import IOBTree as IO from BTrees import IFBTree as IF maxint = int(2**31-1) minint = -maxint - 1 def __reduce__(self): return _family32, () class _Family64(_Family): from BTrees import OLBTree as OI from BTrees import LLBTree as II from BTrees import LOBTree as IO from BTrees import LFBTree as IF maxint = 2**63-1 minint = -maxint - 1 def __reduce__(self): return _family64, () def _family32(): return family32 _family32.__safe_for_unpickling__ = True def _family64(): return family64 _family64.__safe_for_unpickling__ = True family32 = _Family32() family64 = _Family64() BTrees.family64.IO.family = family64 BTrees.family64.OI.family = family64 BTrees.family64.IF.family = family64 BTrees.family64.II.family = family64 BTrees.family32.IO.family = family32 BTrees.family32.OI.family = family32 BTrees.family32.IF.family = family32 BTrees.family32.II.family = family32 ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/_fsBTree.c000066400000000000000000000106201230730566700227170ustar00rootroot00000000000000/*############################################################################ # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
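 The family32 / family64 machinery defined in BTrees/__init__.py above is
 easiest to see in use. A rough sketch (illustrative; it assumes the
 unprefixed BTree/Bucket/Set/TreeSet aliases that each flavor module
 exposes):

     import BTrees

     t32 = BTrees.family32.IO.BTree()     # 32-bit integer keys
     t64 = BTrees.family64.IO.BTree()     # 64-bit integer keys
     t64[2**40] = 'fine in the 64-bit family'
     BTrees.family64.maxint == 2**63 - 1            # True
     BTrees.family32.IO.family is BTrees.family32   # True

     try:
         t32[2**40] = 'too big for 32-bit keys'
     except TypeError:
         pass   # keys outside the 32-bit range are rejected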
# ############################################################################*/ #define MASTER_ID "$Id$\n" /* fsBTree - FileStorage index BTree This BTree implements a mapping from 2-character strings to six-character strings. This allows us to efficiently store a FileStorage index as a nested mapping of 6-character oid prefix to mapping of 2-character oid suffix to 6-character (byte) file positions. */ typedef unsigned char char2[2]; typedef unsigned char char6[6]; /* Setup template macros */ #define PERSISTENT #define MOD_NAME_PREFIX "fs" #define INITMODULE init_fsBTree #define DEFAULT_MAX_BUCKET_SIZE 500 #define DEFAULT_MAX_BTREE_SIZE 500 /*#include "intkeymacros.h"*/ #define KEYMACROS_H "$Id$\n" #define KEY_TYPE char2 #undef KEY_TYPE_IS_PYOBJECT #define KEY_CHECK(K) (PyString_Check(K) && PyString_GET_SIZE(K)==2) #define TEST_KEY_SET_OR(V, K, T) if ( ( (V) = ((*(K) < *(T) || (*(K) == *(T) && (K)[1] < (T)[1])) ? -1 : ((*(K) == *(T) && (K)[1] == (T)[1]) ? 0 : 1)) ), 0 ) #define DECREF_KEY(KEY) #define INCREF_KEY(k) #define COPY_KEY(KEY, E) (*(KEY)=*(E), (KEY)[1]=(E)[1]) #define COPY_KEY_TO_OBJECT(O, K) O=PyString_FromStringAndSize((const char*)K,2) #define COPY_KEY_FROM_ARG(TARGET, ARG, STATUS) \ if (KEY_CHECK(ARG)) memcpy(TARGET, PyString_AS_STRING(ARG), 2); else { \ PyErr_SetString(PyExc_TypeError, "expected two-character string key"); \ (STATUS)=0; } /*#include "intvaluemacros.h"*/ #define VALUEMACROS_H "$Id$\n" #define VALUE_TYPE char6 #undef VALUE_TYPE_IS_PYOBJECT #define TEST_VALUE(K, T) memcmp(K,T,6) #define DECREF_VALUE(k) #define INCREF_VALUE(k) #define COPY_VALUE(V, E) (memcpy(V, E, 6)) #define COPY_VALUE_TO_OBJECT(O, K) O=PyString_FromStringAndSize((const char*)K,6) #define COPY_VALUE_FROM_ARG(TARGET, ARG, STATUS) \ if ((PyString_Check(ARG) && PyString_GET_SIZE(ARG)==6)) \ memcpy(TARGET, PyString_AS_STRING(ARG), 6); else { \ PyErr_SetString(PyExc_TypeError, "expected six-character string key"); \ (STATUS)=0; } #define NORMALIZE_VALUE(V, MIN) #include "Python.h" static PyObject *bucket_toString(PyObject *self); static PyObject *bucket_fromString(PyObject *self, PyObject *state); #define EXTRA_BUCKET_METHODS \ {"toString", (PyCFunction) bucket_toString, METH_NOARGS, \ "toString() -- Return the state as a string"}, \ {"fromString", (PyCFunction) bucket_fromString, METH_O, \ "fromString(s) -- Set the state of the object from a string"}, \ #include "BTreeModuleTemplate.c" static PyObject * bucket_toString(PyObject *oself) { Bucket *self = (Bucket *)oself; PyObject *items = NULL; int len; PER_USE_OR_RETURN(self, NULL); len = self->len; items = PyString_FromStringAndSize(NULL, len*8); if (items == NULL) goto err; memcpy(PyString_AS_STRING(items), self->keys, len*2); memcpy(PyString_AS_STRING(items)+len*2, self->values, len*6); PER_UNUSE(self); return items; err: PER_UNUSE(self); Py_XDECREF(items); return NULL; } static PyObject * bucket_fromString(PyObject *oself, PyObject *state) { Bucket *self = (Bucket *)oself; int len; KEY_TYPE *keys; VALUE_TYPE *values; len = PyString_Size(state); if (len < 0) return NULL; if (len%8) { PyErr_SetString(PyExc_ValueError, "state string of wrong size"); return NULL; } len /= 8; if (self->next) { Py_DECREF(self->next); self->next = NULL; } if (len > self->size) { keys = BTree_Realloc(self->keys, sizeof(KEY_TYPE)*len); if (keys == NULL) return NULL; values = BTree_Realloc(self->values, sizeof(VALUE_TYPE)*len); if (values == NULL) return NULL; self->keys = keys; self->values = values; self->size = len; } memcpy(self->keys, PyString_AS_STRING(state), 
len*2); memcpy(self->values, PyString_AS_STRING(state)+len*2, len*6); self->len = len; Py_INCREF(self); return (PyObject *)self; } ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/check.py000066400000000000000000000340771230730566700225250ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """ Utilities for working with BTrees (TreeSets, Buckets, and Sets) at a low level. The primary function is check(btree), which performs value-based consistency checks of a kind btree._check() does not perform. See the function docstring for details. display(btree) displays the internal structure of a BTree (TreeSet, etc) to stdout. CAUTION: When a BTree node has only a single bucket child, it can be impossible to get at the bucket from Python code (__getstate__() may squash the bucket object out of existence, as a pickling storage optimization). In such a case, the code here synthesizes a temporary bucket with the same keys (and values, if the bucket is of a mapping type). This has no first-order consequences, but can mislead if you pay close attention to reported object addresses and/or object identity (the synthesized bucket has an address that doesn't exist in the actual BTree). """ from types import TupleType from BTrees.OOBTree import OOBTree, OOBucket, OOSet, OOTreeSet from BTrees.OIBTree import OIBTree, OIBucket, OISet, OITreeSet from BTrees.IOBTree import IOBTree, IOBucket, IOSet, IOTreeSet from BTrees.IIBTree import IIBTree, IIBucket, IISet, IITreeSet from BTrees.IFBTree import IFBTree, IFBucket, IFSet, IFTreeSet from BTrees.OLBTree import OLBTree, OLBucket, OLSet, OLTreeSet from BTrees.LOBTree import LOBTree, LOBucket, LOSet, LOTreeSet from BTrees.LLBTree import LLBTree, LLBucket, LLSet, LLTreeSet from BTrees.LFBTree import LFBTree, LFBucket, LFSet, LFTreeSet from ZODB.utils import positive_id, oid_repr TYPE_UNKNOWN, TYPE_BTREE, TYPE_BUCKET = range(3) _type2kind = {} for kv in ('OO', 'II', 'IO', 'OI', 'IF', 'LL', 'LO', 'OL', 'LF', ): for name, kind in ( ('BTree', (TYPE_BTREE, True)), ('Bucket', (TYPE_BUCKET, True)), ('TreeSet', (TYPE_BTREE, False)), ('Set', (TYPE_BUCKET, False)), ): _type2kind[globals()[kv+name]] = kind # Return pair # # TYPE_BTREE or TYPE_BUCKET, is_mapping def classify(obj): return _type2kind[type(obj)] BTREE_EMPTY, BTREE_ONE, BTREE_NORMAL = range(3) # If the BTree is empty, returns # # BTREE_EMPTY, [], [] # # If the BTree has only one bucket, sometimes returns # # BTREE_ONE, bucket_state, None # # Else returns # # BTREE_NORMAL, list of keys, list of kids # # and the list of kids has one more entry than the list of keys. # # BTree.__getstate__() docs: # # For an empty BTree (self->len == 0), None. # # For a BTree with one child (self->len == 1), and that child is a bucket, # and that bucket has a NULL oid, a one-tuple containing a one-tuple # containing the bucket's state: # # ( # ( # child[0].__getstate__(), # ), # ) # # Else a two-tuple. 
The first element is a tuple interleaving the BTree's # keys and direct children, of size 2*self->len - 1 (key[0] is unused and # is not saved). The second element is the firstbucket: # # ( # (child[0], key[1], child[1], key[2], child[2], ..., # key[len-1], child[len-1]), # self->firstbucket # ) _btree2bucket = {} for kv in ('OO', 'II', 'IO', 'OI', 'IF', 'LL', 'LO', 'OL', 'LF', ): _btree2bucket[globals()[kv+'BTree']] = globals()[kv+'Bucket'] _btree2bucket[globals()[kv+'TreeSet']] = globals()[kv+'Set'] def crack_btree(t, is_mapping): state = t.__getstate__() if state is None: return BTREE_EMPTY, [], [] assert isinstance(state, TupleType) if len(state) == 1: state = state[0] assert isinstance(state, TupleType) and len(state) == 1 state = state[0] return BTREE_ONE, state, None assert len(state) == 2 data, firstbucket = state n = len(data) assert n & 1 kids = [] keys = [] i = 0 for x in data: if i & 1: keys.append(x) else: kids.append(x) i += 1 return BTREE_NORMAL, keys, kids # Returns # # keys, values # for a mapping; len(keys) == len(values) in this case # or # keys, [] # for a set # # bucket.__getstate__() docs: # # For a set bucket (self->values is NULL), a one-tuple or two-tuple. The # first element is a tuple of keys, of length self->len. The second element # is the next bucket, present if and only if next is non-NULL: # # ( # (keys[0], keys[1], ..., keys[len-1]), # next iff non-NULL> # ) # # For a mapping bucket (self->values is not NULL), a one-tuple or two-tuple. # The first element is a tuple interleaving keys and values, of length # 2 * self->len. The second element is the next bucket, present iff next is # non-NULL: # # ( # (keys[0], values[0], keys[1], values[1], ..., # keys[len-1], values[len-1]), # next iff non-NULL> # ) def crack_bucket(b, is_mapping): state = b.__getstate__() assert isinstance(state, TupleType) assert 1 <= len(state) <= 2 data = state[0] if not is_mapping: return data, [] keys = [] values = [] n = len(data) assert n & 1 == 0 i = 0 for x in data: if i & 1: values.append(x) else: keys.append(x) i += 1 return keys, values def type_and_adr(obj): if hasattr(obj, '_p_oid'): oid = oid_repr(obj._p_oid) else: oid = 'None' return "%s (0x%x oid=%s)" % (type(obj).__name__, positive_id(obj), oid) # Walker implements a depth-first search of a BTree (or TreeSet or Set or # Bucket). Subclasses must implement the visit_btree() and visit_bucket() # methods, and arrange to call the walk() method. walk() calls the # visit_XYZ() methods once for each node in the tree, in depth-first # left-to-right order. class Walker: def __init__(self, obj): self.obj = obj # obj is the BTree (BTree or TreeSet). # path is a list of indices, from the root. For example, if a BTree node # is child[5] of child[3] of the root BTree, [3, 5]. # parent is the parent BTree object, or None if this is the root BTree. # is_mapping is True for a BTree and False for a TreeSet. # keys is a list of the BTree's internal keys. # kids is a list of the BTree's children. # If the BTree is an empty root node, keys == kids == []. # Else len(kids) == len(keys) + 1. # lo and hi are slice bounds on the values the elements of keys *should* # lie in (lo inclusive, hi exclusive). lo is None if there is no lower # bound known, and hi is None if no upper bound is known. def visit_btree(self, obj, path, parent, is_mapping, keys, kids, lo, hi): raise NotImplementedError # obj is the bucket (Bucket or Set). # path is a list of indices, from the root. 
For example, if a bucket # node is child[5] of child[3] of the root BTree, [3, 5]. # parent is the parent BTree object. # is_mapping is True for a Bucket and False for a Set. # keys is a list of the bucket's keys. # values is a list of the bucket's values. # If is_mapping is false, values == []. Else len(keys) == len(values). # lo and hi are slice bounds on the values the elements of keys *should* # lie in (lo inclusive, hi exclusive). lo is None if there is no lower # bound known, and hi is None if no upper bound is known. def visit_bucket(self, obj, path, parent, is_mapping, keys, values, lo, hi): raise NotImplementedError def walk(self): obj = self.obj path = [] stack = [(obj, path, None, None, None)] while stack: obj, path, parent, lo, hi = stack.pop() kind, is_mapping = classify(obj) if kind is TYPE_BTREE: bkind, keys, kids = crack_btree(obj, is_mapping) if bkind is BTREE_NORMAL: # push the kids, in reverse order (so they're popped off # the stack in forward order) n = len(kids) for i in range(len(kids)-1, -1, -1): newlo, newhi = lo, hi if i < n-1: newhi = keys[i] if i > 0: newlo = keys[i-1] stack.append((kids[i], path + [i], obj, newlo, newhi)) elif bkind is BTREE_EMPTY: pass else: assert bkind is BTREE_ONE # Yuck. There isn't a bucket object to pass on, as # the bucket state is embedded directly in the BTree # state. Synthesize a bucket. assert kids is None # and "keys" is really the bucket # state bucket = _btree2bucket[type(obj)]() bucket.__setstate__(keys) stack.append((bucket, path + [0], obj, lo, hi)) keys = [] kids = [bucket] self.visit_btree(obj, path, parent, is_mapping, keys, kids, lo, hi) else: assert kind is TYPE_BUCKET keys, values = crack_bucket(obj, is_mapping) self.visit_bucket(obj, path, parent, is_mapping, keys, values, lo, hi) class Checker(Walker): def __init__(self, obj): Walker.__init__(self, obj) self.errors = [] def check(self): self.walk() if self.errors: s = "Errors found in %s:" % type_and_adr(self.obj) self.errors.insert(0, s) s = "\n".join(self.errors) raise AssertionError(s) def visit_btree(self, obj, path, parent, is_mapping, keys, kids, lo, hi): self.check_sorted(obj, path, keys, lo, hi) def visit_bucket(self, obj, path, parent, is_mapping, keys, values, lo, hi): self.check_sorted(obj, path, keys, lo, hi) def check_sorted(self, obj, path, keys, lo, hi): i, n = 0, len(keys) for x in keys: if lo is not None and not lo <= x: s = "key %r < lower bound %r at index %d" % (x, lo, i) self.complain(s, obj, path) if hi is not None and not x < hi: s = "key %r >= upper bound %r at index %d" % (x, hi, i) self.complain(s, obj, path) if i < n-1 and not x < keys[i+1]: s = "key %r at index %d >= key %r at index %d" % ( x, i, keys[i+1], i+1) self.complain(s, obj, path) i += 1 def complain(self, msg, obj, path): s = "%s, in %s, path from root %s" % ( msg, type_and_adr(obj), ".".join(map(str, path))) self.errors.append(s) class Printer(Walker): def __init__(self, obj): Walker.__init__(self, obj) def display(self): self.walk() def visit_btree(self, obj, path, parent, is_mapping, keys, kids, lo, hi): indent = " " * len(path) print "%s%s %s with %d children" % ( indent, ".".join(map(str, path)), type_and_adr(obj), len(kids)) indent += " " n = len(keys) for i in range(n): print "%skey %d: %r" % (indent, i, keys[i]) def visit_bucket(self, obj, path, parent, is_mapping, keys, values, lo, hi): indent = " " * len(path) print "%s%s %s with %d keys" % ( indent, ".".join(map(str, path)), type_and_adr(obj), len(keys)) indent += " " n = len(keys) for i in range(n): print "%skey %d: 
%r" % (indent, i, keys[i]), if is_mapping: print "value %r" % (values[i],) def check(btree): """Check internal value-based invariants in a BTree or TreeSet. The btree._check() method checks internal C-level pointer consistency. The check() function here checks value-based invariants: whether the keys in leaf bucket and internal nodes are in strictly increasing order, and whether they all lie in their expected range. The latter is a subtle invariant that can't be checked locally -- it requires propagating range info down from the root of the tree, and modifying it at each level for each child. Raises AssertionError if anything is wrong, with a string detail explaining the problems. The entire tree is checked before AssertionError is raised, and the string detail may be large (depending on how much went wrong). """ Checker(btree).check() def display(btree): "Display the internal structure of a BTree, Bucket, TreeSet or Set." Printer(btree).display() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/floatvaluemacros.h000066400000000000000000000016061230730566700246060ustar00rootroot00000000000000 #define VALUEMACROS_H "$Id$\n" #define VALUE_TYPE float #undef VALUE_TYPE_IS_PYOBJECT #define TEST_VALUE(K, T) (((K) < (T)) ? -1 : (((K) > (T)) ? 1: 0)) #define VALUE_SAME(VALUE, TARGET) ( (VALUE) == (TARGET) ) #define DECLARE_VALUE(NAME) VALUE_TYPE NAME #define VALUE_PARSE "f" #define DECREF_VALUE(k) #define INCREF_VALUE(k) #define COPY_VALUE(V, E) (V=(E)) #define COPY_VALUE_TO_OBJECT(O, K) O=PyFloat_FromDouble(K) #define COPY_VALUE_FROM_ARG(TARGET, ARG, STATUS) \ if (PyFloat_Check(ARG)) TARGET = (float)PyFloat_AsDouble(ARG); \ else if (PyInt_Check(ARG)) TARGET = (float)PyInt_AsLong(ARG); \ else { \ PyErr_SetString(PyExc_TypeError, "expected float or int value"); \ (STATUS)=0; (TARGET)=0; } #define NORMALIZE_VALUE(V, MIN) ((MIN) > 0) ? ((V)/=(MIN)) : 0 #define MERGE_DEFAULT 1.0f #define MERGE(O1, w1, O2, w2) ((O1)*(w1)+(O2)*(w2)) #define MERGE_WEIGHT(O, w) ((O)*(w)) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/fsBTree.py000066400000000000000000000015011230730566700227640ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## # fsBTrees are data structures used for ZODB FileStorage. They are not # expected to be "public" excpect to FileStorage. # hack to overcome dynamic-linking headache. 
from _fsBTree import * ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/intkeymacros.h000066400000000000000000000033751230730566700237540ustar00rootroot00000000000000 #define KEYMACROS_H "$Id$\n" #ifdef ZODB_64BIT_INTS /* PY_LONG_LONG as key */ #define NEED_LONG_LONG_SUPPORT #define KEY_TYPE PY_LONG_LONG #define KEY_CHECK longlong_check #define COPY_KEY_TO_OBJECT(O, K) O=longlong_as_object(K) #define COPY_KEY_FROM_ARG(TARGET, ARG, STATUS) \ if (PyInt_Check(ARG)) TARGET=PyInt_AS_LONG(ARG); else \ if (longlong_check(ARG)) TARGET=PyLong_AsLongLong(ARG); else \ if (PyLong_Check(ARG)) { \ PyErr_SetString(PyExc_ValueError, "long integer out of range"); \ (STATUS)=0; (TARGET)=0; } \ else { \ PyErr_SetString(PyExc_TypeError, "expected integer key"); \ (STATUS)=0; (TARGET)=0; } #else /* C int as key */ #define KEY_TYPE int #define KEY_CHECK PyInt_Check #define COPY_KEY_TO_OBJECT(O, K) O=PyInt_FromLong(K) #define COPY_KEY_FROM_ARG(TARGET, ARG, STATUS) \ if (PyInt_Check(ARG)) { \ long vcopy = PyInt_AS_LONG(ARG); \ if ((int)vcopy != vcopy) { \ PyErr_SetString(PyExc_TypeError, "integer out of range"); \ (STATUS)=0; (TARGET)=0; \ } \ else TARGET = vcopy; \ } else { \ PyErr_SetString(PyExc_TypeError, "expected integer key"); \ (STATUS)=0; (TARGET)=0; } #endif #undef KEY_TYPE_IS_PYOBJECT #define TEST_KEY_SET_OR(V, K, T) if ( ( (V) = (((K) < (T)) ? -1 : (((K) > (T)) ? 1: 0)) ) , 0 ) #define DECREF_KEY(KEY) #define INCREF_KEY(k) #define COPY_KEY(KEY, E) (KEY=(E)) #define MULTI_INT_UNION 1 ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/intvaluemacros.h000066400000000000000000000036651230730566700243020ustar00rootroot00000000000000 #define VALUEMACROS_H "$Id$\n" #ifdef ZODB_64BIT_INTS #define NEED_LONG_LONG_SUPPORT #define VALUE_TYPE PY_LONG_LONG #define VALUE_PARSE "L" #define COPY_VALUE_TO_OBJECT(O, K) O=longlong_as_object(K) #define COPY_VALUE_FROM_ARG(TARGET, ARG, STATUS) \ if (PyInt_Check(ARG)) TARGET=PyInt_AS_LONG(ARG); else \ if (longlong_check(ARG)) TARGET=PyLong_AsLongLong(ARG); else \ if (PyLong_Check(ARG)) { \ PyErr_SetString(PyExc_ValueError, "long integer out of range"); \ (STATUS)=0; (TARGET)=0; } \ else { \ PyErr_SetString(PyExc_TypeError, "expected integer value"); \ (STATUS)=0; (TARGET)=0; } #else #define VALUE_TYPE int #define VALUE_PARSE "i" #define COPY_VALUE_TO_OBJECT(O, K) O=PyInt_FromLong(K) #define COPY_VALUE_FROM_ARG(TARGET, ARG, STATUS) \ if (PyInt_Check(ARG)) { \ long vcopy = PyInt_AS_LONG(ARG); \ if ((int)vcopy != vcopy) { \ PyErr_SetString(PyExc_TypeError, "integer out of range"); \ (STATUS)=0; (TARGET)=0; \ } \ else TARGET = vcopy; \ } else { \ PyErr_SetString(PyExc_TypeError, "expected integer key"); \ (STATUS)=0; (TARGET)=0; } #endif #undef VALUE_TYPE_IS_PYOBJECT #define TEST_VALUE(K, T) (((K) < (T)) ? -1 : (((K) > (T)) ? 1: 0)) #define VALUE_SAME(VALUE, TARGET) ( (VALUE) == (TARGET) ) #define DECLARE_VALUE(NAME) VALUE_TYPE NAME #define DECREF_VALUE(k) #define INCREF_VALUE(k) #define COPY_VALUE(V, E) (V=(E)) #define NORMALIZE_VALUE(V, MIN) ((MIN) > 0) ? 
((V)/=(MIN)) : 0 #define MERGE_DEFAULT 1 #define MERGE(O1, w1, O2, w2) ((O1)*(w1)+(O2)*(w2)) #define MERGE_WEIGHT(O, w) ((O)*(w)) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/objectkeymacros.h000066400000000000000000000006311230730566700244200ustar00rootroot00000000000000#define KEYMACROS_H "$Id$\n" #define KEY_TYPE PyObject * #define KEY_TYPE_IS_PYOBJECT #define TEST_KEY_SET_OR(V, KEY, TARGET) if ( ( (V) = PyObject_Compare((KEY),(TARGET)) ), PyErr_Occurred() ) #define INCREF_KEY(k) Py_INCREF(k) #define DECREF_KEY(KEY) Py_DECREF(KEY) #define COPY_KEY(KEY, E) KEY=(E) #define COPY_KEY_TO_OBJECT(O, K) O=(K); Py_INCREF(O) #define COPY_KEY_FROM_ARG(TARGET, ARG, S) TARGET=(ARG) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/objectvaluemacros.h000066400000000000000000000007241230730566700247470ustar00rootroot00000000000000 #define VALUEMACROS_H "$Id$\n" #define VALUE_TYPE PyObject * #define VALUE_TYPE_IS_PYOBJECT #define TEST_VALUE(VALUE, TARGET) PyObject_Compare((VALUE),(TARGET)) #define DECLARE_VALUE(NAME) VALUE_TYPE NAME #define INCREF_VALUE(k) Py_INCREF(k) #define DECREF_VALUE(k) Py_DECREF(k) #define COPY_VALUE(k,e) k=(e) #define COPY_VALUE_TO_OBJECT(O, K) O=(K); Py_INCREF(O) #define COPY_VALUE_FROM_ARG(TARGET, ARG, S) TARGET=(ARG) #define NORMALIZE_VALUE(V, MIN) Py_INCREF(V) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/py24compat.h000066400000000000000000000010271230730566700232360ustar00rootroot00000000000000/* Backport type definitions from Python 2.5's object.h */ #ifndef BTREE_PY24COMPATH_H #define BTREE_PY24COMPAT_H #if PY_VERSION_HEX < 0x02050000 typedef Py_ssize_t (*lenfunc)(PyObject *); typedef PyObject *(*ssizeargfunc)(PyObject *, Py_ssize_t); typedef PyObject *(*ssizessizeargfunc)(PyObject *, Py_ssize_t, Py_ssize_t); typedef int(*ssizeobjargproc)(PyObject *, Py_ssize_t, PyObject *); typedef int(*ssizessizeobjargproc)(PyObject *, Py_ssize_t, Py_ssize_t, PyObject *); #endif /* PY_VERSION_HEX */ #endif /* BTREE_PY24COMPAT_H */ ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/sorters.c000066400000000000000000000355571230730566700227470ustar00rootroot00000000000000/***************************************************************************** Copyright (c) 2002 Zope Foundation and Contributors. All Rights Reserved. This software is subject to the provisions of the Zope Public License, Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE ****************************************************************************/ /* Revision information: $Id$ */ /* The only routine here intended to be used outside the file is size_t sort_int_nodups(int *p, size_t n) Sort the array of n ints pointed at by p, in place, and also remove duplicates. Return the number of unique elements remaining, which occupy a contiguous and monotonically increasing slice of the array starting at p. Example: If the input array is [3, 1, 2, 3, 1, 5, 2], sort_int_nodups returns 4, and the first 4 elements of the array are changed to [1, 2, 3, 5]. The content of the remaining array positions is not defined. Notes: + This is specific to n-byte signed ints, with endianness natural to the platform. `n` is determined based on ZODB_64BIT_INTS. 
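 + (added note) The scratch buffer is allocated by sort_int_nodups below, and
   only when n exceeds QUICKSORT_BEATS_RADIXSORT; if that malloc fails, the
   code falls back to an in-place quicksort.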
+ 4*n bytes of available heap memory are required for best speed (8*n when ZODB_64BIT_INTS is defined). */ #include #include #include #include #include /* The type of array elements to be sorted. Most of the routines don't care about the type, and will work fine for any scalar C type (provided they're recompiled with element_type appropriately redefined). However, the radix sort has to know everything about the type's internal representation. */ typedef KEY_TYPE element_type; /* The radixsort is faster than the quicksort for large arrays, but radixsort has high fixed overhead, making it a poor choice for small arrays. The crossover point isn't critical, and is sensitive to things like compiler and machine cache structure, so don't worry much about this. */ #define QUICKSORT_BEATS_RADIXSORT 800U /* In turn, the quicksort backs off to an insertion sort for very small slices. MAX_INSERTION is the largest slice quicksort leaves entirely to insertion. Because this version of quicksort uses a median-of-3 rule for selecting a pivot, MAX_INSERTION must be at least 2 (so that quicksort has at least 3 values to look at in a slice). Again, the exact value here isn't critical. */ #define MAX_INSERTION 25U #if MAX_INSERTION < 2U # error "MAX_INSERTION must be >= 2" #endif /* LSB-first radix sort of the n elements in 'in'. 'work' is work storage at least as large as 'in'. Depending on how many swaps are done internally, the final result may come back in 'in' or 'work'; and that pointer is returned. radixsort_int is specific to signed n-byte ints, with natural machine endianness. `n` is determined based on ZODB_64BIT_INTS. */ static element_type* radixsort_int(element_type *in, element_type *work, size_t n) { /* count[i][j] is the number of input elements that have byte value j in byte position i, where byte position 0 is the LSB. Note that holding i fixed, the sum of count[i][j] over all j in range(256) is n. */ #ifdef ZODB_64BIT_INTS size_t count[8][256]; #else size_t count[4][256]; #endif size_t i; int offset, offsetinc; /* Which byte position are we working on now? 0=LSB, 1, 2, ... */ int bytenum; #ifdef ZODB_64BIT_INTS assert(sizeof(element_type) == 8); #else assert(sizeof(element_type) == 4); #endif assert(in); assert(work); /* Compute all of count in one pass. */ memset(count, 0, sizeof(count)); for (i = 0; i < n; ++i) { element_type const x = in[i]; ++count[0][(x ) & 0xff]; ++count[1][(x >> 8) & 0xff]; ++count[2][(x >> 16) & 0xff]; ++count[3][(x >> 24) & 0xff]; #ifdef ZODB_64BIT_INTS ++count[4][(x >> 32) & 0xff]; ++count[5][(x >> 40) & 0xff]; ++count[6][(x >> 48) & 0xff]; ++count[7][(x >> 56) & 0xff]; #endif } /* For p an element_type* cast to char*, offset is how much farther we have to go to get to the LSB of the element; this is 0 for little- endian boxes and sizeof(element_type)-1 for big-endian. offsetinc is 1 or -1, respectively, telling us which direction to go from p+offset to get to the element's more-significant bytes. */ { element_type one = 1; if (*(char*)&one) { /* Little endian. */ offset = 0; offsetinc = 1; } else { /* Big endian. */ offset = sizeof(element_type) - 1; offsetinc = -1; } } /* The radix sort. */ for (bytenum = 0; bytenum < sizeof(element_type); ++bytenum, offset += offsetinc) { /* Do a stable distribution sort on byte position bytenum, from in to work. index[i] tells us the work index at which to store the next in element with byte value i. pinbyte points to the correct byte in the input array. 
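       Illustrative example (added): if the byte values at this position are
       2, 0, 2, 1, then pcount holds {0: 1, 1: 1, 2: 2}, the prefix sums give
       index = {0: 0, 1: 1, 2: 2}, and the four elements are copied to
       work[2], work[0], work[3], work[1] in turn -- a stable counting sort
       on this one byte.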
*/ size_t index[256]; unsigned char* pinbyte; size_t total = 0; size_t *pcount = count[bytenum]; /* Compute the correct output starting index for each possible byte value. */ if (bytenum < sizeof(element_type) - 1) { for (i = 0; i < 256; ++i) { const size_t icount = pcount[i]; index[i] = total; total += icount; if (icount == n) break; } if (i < 256) { /* All bytes in the current position have value i, so there's nothing to do on this pass. */ continue; } } else { /* The MSB of signed ints needs to be distributed differently than the other bytes, in order 0x80, 0x81, ... 0xff, 0x00, 0x01, ... 0x7f */ for (i = 128; i < 256; ++i) { const size_t icount = pcount[i]; index[i] = total; total += icount; if (icount == n) break; } if (i < 256) continue; for (i = 0; i < 128; ++i) { const size_t icount = pcount[i]; index[i] = total; total += icount; if (icount == n) break; } if (i < 128) continue; } assert(total == n); /* Distribute the elements according to byte value. Note that this is where most of the time is spent. Note: The loop is unrolled 4x by hand, for speed. This may be a pessimization someday, but was a significant win on my MSVC 6.0 timing tests. */ pinbyte = (unsigned char *)in + offset; i = 0; /* Reduce number of elements to copy to a multiple of 4. */ while ((n - i) & 0x3) { unsigned char byte = *pinbyte; work[index[byte]++] = in[i]; ++i; pinbyte += sizeof(element_type); } for (; i < n; i += 4, pinbyte += 4 * sizeof(element_type)) { unsigned char byte1 = *(pinbyte ); unsigned char byte2 = *(pinbyte + sizeof(element_type)); unsigned char byte3 = *(pinbyte + 2 * sizeof(element_type)); unsigned char byte4 = *(pinbyte + 3 * sizeof(element_type)); element_type in1 = in[i ]; element_type in2 = in[i+1]; element_type in3 = in[i+2]; element_type in4 = in[i+3]; work[index[byte1]++] = in1; work[index[byte2]++] = in2; work[index[byte3]++] = in3; work[index[byte4]++] = in4; } /* Swap in and work (just a pointer swap). */ { element_type *temp = in; in = work; work = temp; } } return in; } /* Remove duplicates from sorted array in, storing exactly one of each distinct element value into sorted array out. It's OK (and expected!) for in == out, but otherwise the n elements beginning at in must not overlap with the n beginning at out. Return the number of elements in out. */ static size_t uniq(element_type *out, element_type *in, size_t n) { size_t i; element_type lastelt; element_type *pout; assert(out); assert(in); if (n == 0) return 0; /* i <- first index in 'in' that contains a duplicate. in[0], in[1], ... in[i-1] are unique, but in[i-1] == in[i]. Set i to n if everything is unique. */ for (i = 1; i < n; ++i) { if (in[i-1] == in[i]) break; } /* in[:i] is unique; copy to out[:i] if needed. */ assert(i > 0); if (in != out) memcpy(out, in, i * sizeof(element_type)); pout = out + i; lastelt = in[i-1]; /* safe even when i == n */ for (++i; i < n; ++i) { element_type elt = in[i]; if (elt != lastelt) *pout++ = lastelt = elt; } return pout - out; } #if 0 /* insertionsort is no longer referenced directly, but I'd like to keep * the code here just in case. */ /* Straight insertion sort of the n elements starting at 'in'. */ static void insertionsort(element_type *in, size_t n) { element_type *p, *q; element_type minimum; /* smallest seen so far */ element_type *plimit = in + n; assert(in); if (n < 2) return; minimum = *in; for (p = in+1; p < plimit; ++p) { /* *in <= *(in+1) <= ... <= *(p-1). Slide *p into place. */ element_type thiselt = *p; if (thiselt < minimum) { /* This is a new minimum. 
This saves p-in compares when it happens, but should happen so rarely that it's not worth checking for its own sake: the point is that the far more popular 'else' branch can exploit that thiselt is *not* the smallest so far. */ memmove(in+1, in, (p - in) * sizeof(*in)); *in = minimum = thiselt; } else { /* thiselt >= minimum, so the loop will find a q with *q <= thiselt. This saves testing q >= in on each trip. It's such a simple loop that saving a per-trip test is a major speed win. */ for (q = p-1; *q > thiselt; --q) *(q+1) = *q; *(q+1) = thiselt; } } } #endif /* The maximum number of elements in the pending-work stack quicksort maintains. The maximum stack depth is approximately log2(n), so arrays of size up to approximately MAX_INSERTION * 2**STACKSIZE can be sorted. The memory burden for the stack is small, so better safe than sorry. */ #define STACKSIZE 60 /* A _stacknode remembers a contiguous slice of an array that needs to sorted. lo must be <= hi, and, unlike Python array slices, this includes both ends. */ struct _stacknode { element_type *lo; element_type *hi; }; static void quicksort(element_type *plo, size_t n) { element_type *phi; /* Swap two array elements. */ element_type _temp; #define SWAP(P, Q) (_temp = *(P), *(P) = *(Q), *(Q) = _temp) /* Stack of pending array slices to be sorted. */ struct _stacknode stack[STACKSIZE]; struct _stacknode *stackfree = stack; /* available stack slot */ /* Push an array slice on the pending-work stack. */ #define PUSH(PLO, PHI) \ do { \ assert(stackfree - stack < STACKSIZE); \ assert((PLO) <= (PHI)); \ stackfree->lo = (PLO); \ stackfree->hi = (PHI); \ ++stackfree; \ } while(0) assert(plo); phi = plo + n - 1; for (;;) { element_type pivot; element_type *pi, *pj; assert(plo <= phi); n = phi - plo + 1; if (n <= MAX_INSERTION) { /* Do a small insertion sort. Contra Knuth, we do this now instead of waiting until the end, because this little slice is likely still in cache now. */ element_type *p, *q; element_type minimum = *plo; for (p = plo+1; p <= phi; ++p) { /* *plo <= *(plo+1) <= ... <= *(p-1). Slide *p into place. */ element_type thiselt = *p; if (thiselt < minimum) { /* New minimum. */ memmove(plo+1, plo, (p - plo) * sizeof(*p)); *plo = minimum = thiselt; } else { /* thiselt >= minimum, so the loop will find a q with *q <= thiselt. */ for (q = p-1; *q > thiselt; --q) *(q+1) = *q; *(q+1) = thiselt; } } /* Pop another slice off the stack. */ if (stack == stackfree) break; /* no more slices -- we're done */ --stackfree; plo = stackfree->lo; phi = stackfree->hi; continue; } /* Parition the slice. For pivot, take the median of the leftmost, rightmost, and middle elements. First sort those three; then the median is the middle one. For technical reasons, the middle element is swapped to plo+1 first (see Knuth Vol 3 Ed 2 section 5.2.2 exercise 55 -- reverse-sorted arrays can take quadratic time otherwise!). */ { element_type *plop1 = plo + 1; element_type *pmid = plo + (n >> 1); assert(plo < pmid && pmid < phi); SWAP(plop1, pmid); /* Sort plo, plop1, phi. */ /* Smaller of rightmost two -> middle. */ if (*plop1 > *phi) SWAP(plop1, phi); /* Smallest of all -> left; if plo is already the smallest, the sort is complete. */ if (*plo > *plop1) { SWAP(plo, plop1); /* Largest of all -> right. */ if (*plop1 > *phi) SWAP(plop1, phi); } pivot = *plop1; pi = plop1; } assert(*plo <= pivot); assert(*pi == pivot); assert(*phi >= pivot); pj = phi; /* Partition wrt pivot. 
This is the time-critical part, and nearly every decision in the routine aims at making this loop as fast as possible -- even small points like arranging that all loop tests can be done correctly at the bottoms of loops instead of the tops, and that pointers can be derefenced directly as-is (without fiddly +1 or -1). The aim is to make the C here so simple that a compiler has a good shot at doing as well as hand-crafted assembler. */ for (;;) { /* Invariants: 1. pi < pj. 2. All elements at plo, plo+1 .. pi are <= pivot. 3. All elements at pj, pj+1 .. phi are >= pivot. 4. There is an element >= pivot to the right of pi. 5. There is an element <= pivot to the left of pj. Note that #4 and #5 save us from needing to check that the pointers stay in bounds. */ assert(pi < pj); do { ++pi; } while (*pi < pivot); assert(pi <= pj); do { --pj; } while (*pj > pivot); assert(pj >= pi - 1); if (pi < pj) SWAP(pi, pj); else break; } assert(plo+1 < pi && pi <= phi); assert(plo < pj && pj < phi); assert(*pi >= pivot); assert( (pi == pj && *pj == pivot) || (pj + 1 == pi && *pj <= pivot) ); /* Swap pivot into its final position, pj. */ assert(plo[1] == pivot); plo[1] = *pj; *pj = pivot; /* Subfiles are from plo to pj-1 inclusive, and pj+1 to phi inclusive. Push the larger one, and loop back to do the smaller one directly. */ if (pj - plo >= phi - pj) { PUSH(plo, pj-1); plo = pj+1; } else { PUSH(pj+1, phi); phi = pj-1; } } #undef PUSH #undef SWAP } /* Sort p and remove duplicates, as fast as we can. */ static size_t sort_int_nodups(KEY_TYPE *p, size_t n) { size_t nunique; element_type *work; assert(sizeof(KEY_TYPE) == sizeof(element_type)); assert(p); /* Use quicksort if the array is small, OR if malloc can't find enough temp memory for radixsort. */ work = NULL; if (n > QUICKSORT_BEATS_RADIXSORT) work = (element_type *)malloc(n * sizeof(element_type)); if (work) { element_type *out = radixsort_int(p, work, n); nunique = uniq(p, out, n); free(work); } else { quicksort(p, n); nunique = uniq(p, p, n); } return nunique; } ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/tests/000077500000000000000000000000001230730566700222255ustar00rootroot00000000000000ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/tests/__init__.py000066400000000000000000000000641230730566700243360ustar00rootroot00000000000000# If tests is a package, debugging is a bit easier. ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/tests/testBTrees.py000066400000000000000000002120211230730566700246610ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import gc import pickle import random import StringIO from unittest import TestCase, TestSuite, TextTestRunner, makeSuite from types import ClassType import zope.interface.verify from BTrees.OOBTree import OOBTree, OOBucket, OOSet, OOTreeSet from BTrees.IOBTree import IOBTree, IOBucket, IOSet, IOTreeSet from BTrees.IIBTree import IIBTree, IIBucket, IISet, IITreeSet from BTrees.IFBTree import IFBTree, IFBucket, IFSet, IFTreeSet from BTrees.OIBTree import OIBTree, OIBucket, OISet, OITreeSet from BTrees.LOBTree import LOBTree, LOBucket, LOSet, LOTreeSet from BTrees.LLBTree import LLBTree, LLBucket, LLSet, LLTreeSet from BTrees.LFBTree import LFBTree, LFBucket, LFSet, LFTreeSet from BTrees.OLBTree import OLBTree, OLBucket, OLSet, OLTreeSet import BTrees from BTrees.IIBTree import using64bits from BTrees.check import check import transaction from ZODB import DB from ZODB.MappingStorage import MappingStorage class Base(TestCase): """ Tests common to all types: sets, buckets, and BTrees """ db = None def setUp(self): self.t = self.t_class() def tearDown(self): if self.db is not None: self.db.close() self.t = None del self.t def _getRoot(self): if self.db is None: # Unclear: On the next line, the ZODB4 flavor of this routine # [asses a cache_size argument: # self.db = DB(MappingStorage(), cache_size=1) # If that's done here, though, testLoadAndStore() and # testGhostUnghost() both nail the CPU and seemingly # never finish. self.db = DB(MappingStorage()) return self.db.open().root() def _closeRoot(self, root): root._p_jar.close() def testLoadAndStore(self): for i in 0, 10, 1000: t = self.t.__class__() self._populate(t, i) root = None root = self._getRoot() root[i] = t transaction.commit() root2 = self._getRoot() if hasattr(t, 'items'): self.assertEqual(list(root2[i].items()) , list(t.items())) else: self.assertEqual(list(root2[i].keys()) , list(t.keys())) self._closeRoot(root) self._closeRoot(root2) def testSetstateArgumentChecking(self): try: self.t.__class__().__setstate__(('',)) except TypeError, v: self.assertEqual(str(v), 'tuple required for first state element') else: raise AssertionError("Expected exception") def testGhostUnghost(self): for i in 0, 10, 1000: t = self.t.__class__() self._populate(t, i) root = self._getRoot() root[i] = t transaction.commit() root2 = self._getRoot() root2[i]._p_deactivate() transaction.commit() if hasattr(t, 'items'): self.assertEqual(list(root2[i].items()) , list(t.items())) else: self.assertEqual(list(root2[i].keys()) , list(t.keys())) self._closeRoot(root) self._closeRoot(root2) def testSimpleExclusiveKeyRange(self): t = self.t.__class__() self.assertEqual(list(t.keys()), []) self.assertEqual(list(t.keys(excludemin=True)), []) self.assertEqual(list(t.keys(excludemax=True)), []) self.assertEqual(list(t.keys(excludemin=True, excludemax=True)), []) self._populate(t, 1) self.assertEqual(list(t.keys()), [0]) self.assertEqual(list(t.keys(excludemin=True)), []) self.assertEqual(list(t.keys(excludemax=True)), []) self.assertEqual(list(t.keys(excludemin=True, excludemax=True)), []) t.clear() self._populate(t, 2) self.assertEqual(list(t.keys()), [0, 1]) self.assertEqual(list(t.keys(excludemin=True)), [1]) 
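        # excludemin drops the smallest key from the result and excludemax the
        # largest; with the two keys [0, 1] that leaves [1] and [0]
        # respectively, and nothing at all when both flags are given.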
self.assertEqual(list(t.keys(excludemax=True)), [0]) self.assertEqual(list(t.keys(excludemin=True, excludemax=True)), []) t.clear() self._populate(t, 3) self.assertEqual(list(t.keys()), [0, 1, 2]) self.assertEqual(list(t.keys(excludemin=True)), [1, 2]) self.assertEqual(list(t.keys(excludemax=True)), [0, 1]) self.assertEqual(list(t.keys(excludemin=True, excludemax=True)), [1]) self.assertEqual(list(t.keys(-1, 3, excludemin=True, excludemax=True)), [0, 1, 2]) self.assertEqual(list(t.keys(0, 3, excludemin=True, excludemax=True)), [1, 2]) self.assertEqual(list(t.keys(-1, 2, excludemin=True, excludemax=True)), [0, 1]) self.assertEqual(list(t.keys(0, 2, excludemin=True, excludemax=True)), [1]) def testUpdatesDoReadChecksOnInternalNodes(self): t = self.t if not hasattr(t, '_firstbucket'): return self._populate(t, 1000) store = MappingStorage() db = DB(store) conn = db.open() conn.root.t = t transaction.commit() read = [] def readCurrent(ob): read.append(ob) conn.__class__.readCurrent(conn, ob) return 1 conn.readCurrent = readCurrent try: add = t.add remove = t.remove except AttributeError: def add(i): t[i] = i def remove(i): del t[i] # Modifying a thing remove(100) self.assert_(t in read) del read[:] add(100) self.assert_(t in read) del read[:] transaction.abort() conn.cacheMinimize() list(t) self.assert_(100 in t) self.assert_(not read) class MappingBase(Base): """ Tests common to mappings (buckets, btrees) """ def _populate(self, t, l): # Make some data for i in range(l): t[i]=i def testRepr(self): # test the repr because buckets have a complex repr implementation # internally the cutoff from a stack allocated buffer to a heap # allocated buffer is 10000. for i in range(1000): self.t[i] = i r = repr(self.t) # Make sure the repr is 10000 bytes long for a bucket. 
# But since the test is also run for btrees, skip the length # check if the repr starts with '<' if not r.startswith('<'): self.assert_(len(r) > 10000) def testGetItemFails(self): self.assertRaises(KeyError, self._getitemfail) def _getitemfail(self): return self.t[1] def testGetReturnsDefault(self): self.assertEqual(self.t.get(1) , None) self.assertEqual(self.t.get(1, 'foo') , 'foo') def testSetItemGetItemWorks(self): self.t[1] = 1 a = self.t[1] self.assertEqual(a , 1, `a`) def testReplaceWorks(self): self.t[1] = 1 self.assertEqual(self.t[1] , 1, self.t[1]) self.t[1] = 2 self.assertEqual(self.t[1] , 2, self.t[1]) def testLen(self): added = {} r = range(1000) for x in r: k = random.choice(r) self.t[k] = x added[k] = x addl = added.keys() self.assertEqual(len(self.t) , len(addl), len(self.t)) def testHasKeyWorks(self): self.t[1] = 1 self.assert_(self.t.has_key(1)) self.assert_(1 in self.t) self.assert_(0 not in self.t) self.assert_(2 not in self.t) def testValuesWorks(self): for x in range(100): self.t[x] = x*x v = self.t.values() for i in range(100): self.assertEqual(v[i], i*i) self.assertRaises(IndexError, lambda: v[i+1]) i = 0 for value in self.t.itervalues(): self.assertEqual(value, i*i) i += 1 def testValuesWorks1(self): for x in range(100): self.t[99-x] = x for x in range(40): lst = list(self.t.values(0+x,99-x)) lst.sort() self.assertEqual(lst,range(0+x,99-x+1)) lst = list(self.t.values(max=99-x, min=0+x)) lst.sort() self.assertEqual(lst,range(0+x,99-x+1)) def testValuesNegativeIndex(self): L = [-3, 6, -11, 4] for i in L: self.t[i] = i L.sort() vals = self.t.values() for i in range(-1, -5, -1): self.assertEqual(vals[i], L[i]) self.assertRaises(IndexError, lambda: vals[-5]) def testKeysWorks(self): for x in range(100): self.t[x] = x v = self.t.keys() i = 0 for x in v: self.assertEqual(x,i) i = i + 1 self.assertRaises(IndexError, lambda: v[i]) for x in range(40): lst = self.t.keys(0+x,99-x) self.assertEqual(list(lst), range(0+x, 99-x+1)) lst = self.t.keys(max=99-x, min=0+x) self.assertEqual(list(lst), range(0+x, 99-x+1)) self.assertEqual(len(v), 100) def testKeysNegativeIndex(self): L = [-3, 6, -11, 4] for i in L: self.t[i] = i L.sort() keys = self.t.keys() for i in range(-1, -5, -1): self.assertEqual(keys[i], L[i]) self.assertRaises(IndexError, lambda: keys[-5]) def testItemsWorks(self): for x in range(100): self.t[x] = 2*x v = self.t.items() i = 0 for x in v: self.assertEqual(x[0], i) self.assertEqual(x[1], 2*i) i += 1 self.assertRaises(IndexError, lambda: v[i+1]) i = 0 for x in self.t.iteritems(): self.assertEqual(x, (i, 2*i)) i += 1 items = list(self.t.items(min=12, max=20)) self.assertEqual(items, zip(range(12, 21), range(24, 43, 2))) items = list(self.t.iteritems(min=12, max=20)) self.assertEqual(items, zip(range(12, 21), range(24, 43, 2))) def testItemsNegativeIndex(self): L = [-3, 6, -11, 4] for i in L: self.t[i] = i L.sort() items = self.t.items() for i in range(-1, -5, -1): self.assertEqual(items[i], (L[i], L[i])) self.assertRaises(IndexError, lambda: items[-5]) def testDeleteInvalidKeyRaisesKeyError(self): self.assertRaises(KeyError, self._deletefail) def _deletefail(self): del self.t[1] def testMaxKeyMinKey(self): self.t[7] = 6 self.t[3] = 10 self.t[8] = 12 self.t[1] = 100 self.t[5] = 200 self.t[10] = 500 self.t[6] = 99 self.t[4] = 150 del self.t[7] t = self.t self.assertEqual(t.maxKey(), 10) self.assertEqual(t.maxKey(6), 6) self.assertEqual(t.maxKey(9), 8) self.assertEqual(t.minKey(), 1) self.assertEqual(t.minKey(3), 3) self.assertEqual(t.minKey(9), 10) try: 
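            # maxKey with a bound below the smallest key (and, below, minKey
            # with a bound above the largest) has no possible answer, so it
            # should raise ValueError.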
t.maxKey(t.minKey() - 1) except ValueError, err: self.assertEqual(str(err), "no key satisfies the conditions") else: self.fail("expected ValueError") try: t.minKey(t.maxKey() + 1) except ValueError, err: self.assertEqual(str(err), "no key satisfies the conditions") else: self.fail("expected ValueError") def testClear(self): r = range(100) for x in r: rnd = random.choice(r) self.t[rnd] = 0 self.t.clear() diff = lsubtract(list(self.t.keys()), []) self.assertEqual(diff, []) def testUpdate(self): d={} l=[] for i in range(10000): k=random.randrange(-2000, 2001) d[k]=i l.append((k, i)) items=d.items() items.sort() self.t.update(d) self.assertEqual(list(self.t.items()), items) self.t.clear() self.assertEqual(list(self.t.items()), []) self.t.update(l) self.assertEqual(list(self.t.items()), items) # Before ZODB 3.4.2, update/construction from PersistentMapping failed. def testUpdateFromPersistentMapping(self): from persistent.mapping import PersistentMapping pm = PersistentMapping({1: 2}) self.t.update(pm) self.assertEqual(list(self.t.items()), [(1, 2)]) # Construction goes thru the same internals as .update(). t = self.t.__class__(pm) self.assertEqual(list(t.items()), [(1, 2)]) def testEmptyRangeSearches(self): t = self.t t.update([(1,1), (5,5), (9,9)]) self.assertEqual(list(t.keys(-6,-4)), [], list(t.keys(-6,-4))) self.assertEqual(list(t.keys(2,4)), [], list(t.keys(2,4))) self.assertEqual(list(t.keys(6,8)), [], list(t.keys(6,8))) self.assertEqual(list(t.keys(10,12)), [], list(t.keys(10,12))) self.assertEqual(list(t.keys(9, 1)), [], list(t.keys(9, 1))) # For IITreeSets, this one was returning 31 for len(keys), and # list(keys) produced a list with 100 elements. t.clear() t.update(zip(range(300), range(300))) keys = t.keys(200, 50) self.assertEqual(len(keys), 0) self.assertEqual(list(keys), []) self.assertEqual(list(t.iterkeys(200, 50)), []) keys = t.keys(max=50, min=200) self.assertEqual(len(keys), 0) self.assertEqual(list(keys), []) self.assertEqual(list(t.iterkeys(max=50, min=200)), []) def testSlicing(self): # Test that slicing of .keys()/.values()/.items() works exactly the # same way as slicing a Python list with the same contents. # This tests fixes to several bugs in this area, starting with # http://collector.zope.org/Zope/419, # "BTreeItems slice contains 1 too many elements". t = self.t for n in range(10): t.clear() self.assertEqual(len(t), 0) keys = [] values = [] items = [] for key in range(n): value = -2 * key t[key] = value keys.append(key) values.append(value) items.append((key, value)) self.assertEqual(len(t), n) kslice = t.keys() vslice = t.values() islice = t.items() self.assertEqual(len(kslice), n) self.assertEqual(len(vslice), n) self.assertEqual(len(islice), n) # Test whole-structure slices. x = kslice[:] self.assertEqual(list(x), keys[:]) x = vslice[:] self.assertEqual(list(x), values[:]) x = islice[:] self.assertEqual(list(x), items[:]) for lo in range(-2*n, 2*n+1): # Test one-sided slices. x = kslice[:lo] self.assertEqual(list(x), keys[:lo]) x = kslice[lo:] self.assertEqual(list(x), keys[lo:]) x = vslice[:lo] self.assertEqual(list(x), values[:lo]) x = vslice[lo:] self.assertEqual(list(x), values[lo:]) x = islice[:lo] self.assertEqual(list(x), items[:lo]) x = islice[lo:] self.assertEqual(list(x), items[lo:]) for hi in range(-2*n, 2*n+1): # Test two-sided slices. 
x = kslice[lo:hi] self.assertEqual(list(x), keys[lo:hi]) x = vslice[lo:hi] self.assertEqual(list(x), values[lo:hi]) x = islice[lo:hi] self.assertEqual(list(x), items[lo:hi]) # The specific test case from Zope collector 419. t.clear() for i in xrange(100): t[i] = 1 tslice = t.items()[20:80] self.assertEqual(len(tslice), 60) self.assertEqual(list(tslice), zip(range(20, 80), [1]*60)) def testIterators(self): t = self.t for keys in [], [-2], [1, 4], range(-170, 2000, 6): t.clear() for k in keys: t[k] = -3 * k self.assertEqual(list(t), keys) x = [] for k in t: x.append(k) self.assertEqual(x, keys) it = iter(t) self.assert_(it is iter(it)) x = [] try: while 1: x.append(it.next()) except StopIteration: pass self.assertEqual(x, keys) self.assertEqual(list(t.iterkeys()), keys) self.assertEqual(list(t.itervalues()), list(t.values())) self.assertEqual(list(t.iteritems()), list(t.items())) def testRangedIterators(self): t = self.t for keys in [], [-2], [1, 4], range(-170, 2000, 13): t.clear() values = [] for k in keys: value = -3 * k t[k] = value values.append(value) items = zip(keys, values) self.assertEqual(list(t.iterkeys()), keys) self.assertEqual(list(t.itervalues()), values) self.assertEqual(list(t.iteritems()), items) if not keys: continue min_mid_max = (keys[0], keys[len(keys) >> 1], keys[-1]) for key1 in min_mid_max: for lo in range(key1 - 1, key1 + 2): # Test one-sided range iterators. goodkeys = [k for k in keys if lo <= k] got = t.iterkeys(lo) self.assertEqual(goodkeys, list(got)) goodvalues = [t[k] for k in goodkeys] got = t.itervalues(lo) self.assertEqual(goodvalues, list(got)) gooditems = zip(goodkeys, goodvalues) got = t.iteritems(lo) self.assertEqual(gooditems, list(got)) for key2 in min_mid_max: for hi in range(key2 - 1, key2 + 2): goodkeys = [k for k in keys if lo <= k <= hi] got = t.iterkeys(min=lo, max=hi) self.assertEqual(goodkeys, list(got)) goodvalues = [t[k] for k in goodkeys] got = t.itervalues(lo, max=hi) self.assertEqual(goodvalues, list(got)) gooditems = zip(goodkeys, goodvalues) got = t.iteritems(max=hi, min=lo) self.assertEqual(gooditems, list(got)) def testBadUpdateTupleSize(self): # This one silently ignored the excess in Zope3. try: self.t.update([(1, 2, 3)]) except TypeError: pass else: self.fail("update() with 3-tuple didn't complain") # This one dumped core in Zope3. try: self.t.update([(1,)]) except TypeError: pass else: self.fail("update() with 1-tuple didn't complain") # This one should simply succeed. 
self.t.update([(1, 2)]) self.assertEqual(list(self.t.items()), [(1, 2)]) def testSimpleExclusivRanges(self): def identity(x): return x def dup(x): return [(y, y) for y in x] for methodname, f in (("keys", identity), ("values", identity), ("items", dup), ("iterkeys", identity), ("itervalues", identity), ("iteritems", dup)): t = self.t.__class__() meth = getattr(t, methodname, None) if meth is None: continue self.assertEqual(list(meth()), []) self.assertEqual(list(meth(excludemin=True)), []) self.assertEqual(list(meth(excludemax=True)), []) self.assertEqual(list(meth(excludemin=True, excludemax=True)), []) self._populate(t, 1) self.assertEqual(list(meth()), f([0])) self.assertEqual(list(meth(excludemin=True)), []) self.assertEqual(list(meth(excludemax=True)), []) self.assertEqual(list(meth(excludemin=True, excludemax=True)), []) t.clear() self._populate(t, 2) self.assertEqual(list(meth()), f([0, 1])) self.assertEqual(list(meth(excludemin=True)), f([1])) self.assertEqual(list(meth(excludemax=True)), f([0])) self.assertEqual(list(meth(excludemin=True, excludemax=True)), []) t.clear() self._populate(t, 3) self.assertEqual(list(meth()), f([0, 1, 2])) self.assertEqual(list(meth(excludemin=True)), f([1, 2])) self.assertEqual(list(meth(excludemax=True)), f([0, 1])) self.assertEqual(list(meth(excludemin=True, excludemax=True)), f([1])) self.assertEqual(list(meth(-1, 3, excludemin=True, excludemax=True)), f([0, 1, 2])) self.assertEqual(list(meth(0, 3, excludemin=True, excludemax=True)), f([1, 2])) self.assertEqual(list(meth(-1, 2, excludemin=True, excludemax=True)), f([0, 1])) self.assertEqual(list(meth(0, 2, excludemin=True, excludemax=True)), f([1])) def testSetdefault(self): t = self.t self.assertEqual(t.setdefault(1, 2), 2) # That should also have associated 1 with 2 in the tree. self.assert_(1 in t) self.assertEqual(t[1], 2) # And trying to change it again should have no effect. self.assertEqual(t.setdefault(1, 666), 2) self.assertEqual(t[1], 2) # Not enough arguments. self.assertRaises(TypeError, t.setdefault) self.assertRaises(TypeError, t.setdefault, 1) # Too many arguments. self.assertRaises(TypeError, t.setdefault, 1, 2, 3) def testPop(self): t = self.t # Empty container. # If no default given, raises KeyError. self.assertRaises(KeyError, t.pop, 1) # But if default given, returns that instead. self.assertEqual(t.pop(1, 42), 42) t[1] = 3 # KeyError when key is not in container and default is not passed. self.assertRaises(KeyError, t.pop, 5) self.assertEqual(list(t.items()), [(1, 3)]) # If key is in container, returns the value and deletes the key. self.assertEqual(t.pop(1), 3) self.assertEqual(len(t), 0) # If key is present, return value bypassing default. t[1] = 3 self.assertEqual(t.pop(1, 7), 3) self.assertEqual(len(t), 0) # Pop only one item. t[1] = 3 t[2] = 4 self.assertEqual(len(t), 2) self.assertEqual(t.pop(1), 3) self.assertEqual(len(t), 1) self.assertEqual(t[2], 4) self.assertEqual(t.pop(1, 3), 3) # Too few arguments. self.assertRaises(TypeError, t.pop) # Too many arguments. 
self.assertRaises(TypeError, t.pop, 1, 2, 3) class NormalSetTests(Base): """ Test common to all set types """ def _populate(self, t, l): # Make some data t.update(range(l)) def testInsertReturnsValue(self): t = self.t self.assertEqual(t.insert(5) , 1) self.assertEqual(t.add(4) , 1) def testDuplicateInsert(self): t = self.t t.insert(5) self.assertEqual(t.insert(5) , 0) self.assertEqual(t.add(5) , 0) def testInsert(self): t = self.t t.insert(1) self.assert_(t.has_key(1)) self.assert_(1 in t) self.assert_(2 not in t) def testBigInsert(self): t = self.t r = xrange(10000) for x in r: t.insert(x) for x in r: self.assert_(t.has_key(x)) self.assert_(x in t) def testRemoveSucceeds(self): t = self.t r = xrange(10000) for x in r: t.insert(x) for x in r: t.remove(x) def testRemoveFails(self): self.assertRaises(KeyError, self._removenonexistent) def _removenonexistent(self): self.t.remove(1) def testHasKeyFails(self): t = self.t self.assert_(not t.has_key(1)) self.assert_(1 not in t) def testKeys(self): t = self.t r = xrange(1000) for x in r: t.insert(x) diff = lsubtract(t.keys(), r) self.assertEqual(diff, []) def testClear(self): t = self.t r = xrange(1000) for x in r: t.insert(x) t.clear() diff = lsubtract(t.keys(), []) self.assertEqual(diff , [], diff) def testMaxKeyMinKey(self): t = self.t t.insert(1) t.insert(2) t.insert(3) t.insert(8) t.insert(5) t.insert(10) t.insert(6) t.insert(4) self.assertEqual(t.maxKey() , 10) self.assertEqual(t.maxKey(6) , 6) self.assertEqual(t.maxKey(9) , 8) self.assertEqual(t.minKey() , 1) self.assertEqual(t.minKey(3) , 3) self.assertEqual(t.minKey(9) , 10) self.assert_(t.minKey() in t) self.assert_(t.minKey()-1 not in t) self.assert_(t.maxKey() in t) self.assert_(t.maxKey()+1 not in t) try: t.maxKey(t.minKey() - 1) except ValueError, err: self.assertEqual(str(err), "no key satisfies the conditions") else: self.fail("expected ValueError") try: t.minKey(t.maxKey() + 1) except ValueError, err: self.assertEqual(str(err), "no key satisfies the conditions") else: self.fail("expected ValueError") def testUpdate(self): d={} l=[] for i in range(10000): k=random.randrange(-2000, 2001) d[k]=i l.append(k) items = d.keys() items.sort() self.t.update(l) self.assertEqual(list(self.t.keys()), items) def testEmptyRangeSearches(self): t = self.t t.update([1, 5, 9]) self.assertEqual(list(t.keys(-6,-4)), [], list(t.keys(-6,-4))) self.assertEqual(list(t.keys(2,4)), [], list(t.keys(2,4))) self.assertEqual(list(t.keys(6,8)), [], list(t.keys(6,8))) self.assertEqual(list(t.keys(10,12)), [], list(t.keys(10,12))) self.assertEqual(list(t.keys(9,1)), [], list(t.keys(9,1))) # For IITreeSets, this one was returning 31 for len(keys), and # list(keys) produced a list with 100 elements. t.clear() t.update(range(300)) keys = t.keys(200, 50) self.assertEqual(len(keys), 0) self.assertEqual(list(keys), []) keys = t.keys(max=50, min=200) self.assertEqual(len(keys), 0) self.assertEqual(list(keys), []) def testSlicing(self): # Test that slicing of .keys() works exactly the same way as slicing # a Python list with the same contents. t = self.t for n in range(10): t.clear() self.assertEqual(len(t), 0) keys = range(10*n, 11*n) t.update(keys) self.assertEqual(len(t), n) kslice = t.keys() self.assertEqual(len(kslice), n) # Test whole-structure slices. x = kslice[:] self.assertEqual(list(x), keys[:]) for lo in range(-2*n, 2*n+1): # Test one-sided slices. x = kslice[:lo] self.assertEqual(list(x), keys[:lo]) x = kslice[lo:] self.assertEqual(list(x), keys[lo:]) for hi in range(-2*n, 2*n+1): # Test two-sided slices. 
x = kslice[lo:hi] self.assertEqual(list(x), keys[lo:hi]) def testIterator(self): t = self.t for keys in [], [-2], [1, 4], range(-170, 2000, 6): t.clear() t.update(keys) self.assertEqual(list(t), keys) x = [] for k in t: x.append(k) self.assertEqual(x, keys) it = iter(t) self.assert_(it is iter(it)) x = [] try: while 1: x.append(it.next()) except StopIteration: pass self.assertEqual(x, keys) class ExtendedSetTests(NormalSetTests): def testLen(self): t = self.t r = xrange(10000) for x in r: t.insert(x) self.assertEqual(len(t) , 10000, len(t)) def testGetItem(self): t = self.t r = xrange(10000) for x in r: t.insert(x) for x in r: self.assertEqual(t[x] , x) class BTreeTests(MappingBase): """ Tests common to all BTrees """ def tearDown(self): self.t._check() check(self.t) MappingBase.tearDown(self) def testDeleteNoChildrenWorks(self): self.t[5] = 6 self.t[2] = 10 self.t[6] = 12 self.t[1] = 100 self.t[3] = 200 self.t[10] = 500 self.t[4] = 99 del self.t[4] diff = lsubtract(self.t.keys(), [1,2,3,5,6,10]) self.assertEqual(diff , [], diff) def testDeleteOneChildWorks(self): self.t[5] = 6 self.t[2] = 10 self.t[6] = 12 self.t[1] = 100 self.t[3] = 200 self.t[10] = 500 self.t[4] = 99 del self.t[3] diff = lsubtract(self.t.keys(), [1,2,4,5,6,10]) self.assertEqual(diff , [], diff) def testDeleteTwoChildrenNoInorderSuccessorWorks(self): self.t[5] = 6 self.t[2] = 10 self.t[6] = 12 self.t[1] = 100 self.t[3] = 200 self.t[10] = 500 self.t[4] = 99 del self.t[2] diff = lsubtract(self.t.keys(), [1,3,4,5,6,10]) self.assertEqual(diff , [], diff) def testDeleteTwoChildrenInorderSuccessorWorks(self): # 7, 3, 8, 1, 5, 10, 6, 4 -- del 3 self.t[7] = 6 self.t[3] = 10 self.t[8] = 12 self.t[1] = 100 self.t[5] = 200 self.t[10] = 500 self.t[6] = 99 self.t[4] = 150 del self.t[3] diff = lsubtract(self.t.keys(), [1,4,5,6,7,8,10]) self.assertEqual(diff , [], diff) def testDeleteRootWorks(self): # 7, 3, 8, 1, 5, 10, 6, 4 -- del 7 self.t[7] = 6 self.t[3] = 10 self.t[8] = 12 self.t[1] = 100 self.t[5] = 200 self.t[10] = 500 self.t[6] = 99 self.t[4] = 150 del self.t[7] diff = lsubtract(self.t.keys(), [1,3,4,5,6,8,10]) self.assertEqual(diff , [], diff) def testRandomNonOverlappingInserts(self): added = {} r = range(100) for x in r: k = random.choice(r) if not added.has_key(k): self.t[k] = x added[k] = 1 addl = added.keys() addl.sort() diff = lsubtract(list(self.t.keys()), addl) self.assertEqual(diff , [], (diff, addl, list(self.t.keys()))) def testRandomOverlappingInserts(self): added = {} r = range(100) for x in r: k = random.choice(r) self.t[k] = x added[k] = 1 addl = added.keys() addl.sort() diff = lsubtract(self.t.keys(), addl) self.assertEqual(diff , [], diff) def testRandomDeletes(self): r = range(1000) added = [] for x in r: k = random.choice(r) self.t[k] = x added.append(k) deleted = [] for x in r: k = random.choice(r) if self.t.has_key(k): self.assert_(k in self.t) del self.t[k] deleted.append(k) if self.t.has_key(k): self.fail( "had problems deleting %s" % k ) badones = [] for x in deleted: if self.t.has_key(x): badones.append(x) self.assertEqual(badones , [], (badones, added, deleted)) def testTargetedDeletes(self): r = range(1000) for x in r: k = random.choice(r) self.t[k] = x for x in r: try: del self.t[x] except KeyError: pass self.assertEqual(realseq(self.t.keys()) , [], realseq(self.t.keys())) def testPathologicalRightBranching(self): r = range(1000) for x in r: self.t[x] = 1 self.assertEqual(realseq(self.t.keys()) , r, realseq(self.t.keys())) for x in r: del self.t[x] self.assertEqual(realseq(self.t.keys()) , [], 
realseq(self.t.keys())) def testPathologicalLeftBranching(self): r = range(1000) revr = r[:] revr.reverse() for x in revr: self.t[x] = 1 self.assertEqual(realseq(self.t.keys()) , r, realseq(self.t.keys())) for x in revr: del self.t[x] self.assertEqual(realseq(self.t.keys()) , [], realseq(self.t.keys())) def testSuccessorChildParentRewriteExerciseCase(self): add_order = [ 85, 73, 165, 273, 215, 142, 233, 67, 86, 166, 235, 225, 255, 73, 175, 171, 285, 162, 108, 28, 283, 258, 232, 199, 260, 298, 275, 44, 261, 291, 4, 181, 285, 289, 216, 212, 129, 243, 97, 48, 48, 159, 22, 285, 92, 110, 27, 55, 202, 294, 113, 251, 193, 290, 55, 58, 239, 71, 4, 75, 129, 91, 111, 271, 101, 289, 194, 218, 77, 142, 94, 100, 115, 101, 226, 17, 94, 56, 18, 163, 93, 199, 286, 213, 126, 240, 245, 190, 195, 204, 100, 199, 161, 292, 202, 48, 165, 6, 173, 40, 218, 271, 228, 7, 166, 173, 138, 93, 22, 140, 41, 234, 17, 249, 215, 12, 292, 246, 272, 260, 140, 58, 2, 91, 246, 189, 116, 72, 259, 34, 120, 263, 168, 298, 118, 18, 28, 299, 192, 252, 112, 60, 277, 273, 286, 15, 263, 141, 241, 172, 255, 52, 89, 127, 119, 255, 184, 213, 44, 116, 231, 173, 298, 178, 196, 89, 184, 289, 98, 216, 115, 35, 132, 278, 238, 20, 241, 128, 179, 159, 107, 206, 194, 31, 260, 122, 56, 144, 118, 283, 183, 215, 214, 87, 33, 205, 183, 212, 221, 216, 296, 40, 108, 45, 188, 139, 38, 256, 276, 114, 270, 112, 214, 191, 147, 111, 299, 107, 101, 43, 84, 127, 67, 205, 251, 38, 91, 297, 26, 165, 187, 19, 6, 73, 4, 176, 195, 90, 71, 30, 82, 139, 210, 8, 41, 253, 127, 190, 102, 280, 26, 233, 32, 257, 194, 263, 203, 190, 111, 218, 199, 29, 81, 207, 18, 180, 157, 172, 192, 135, 163, 275, 74, 296, 298, 265, 105, 191, 282, 277, 83, 188, 144, 259, 6, 173, 81, 107, 292, 231, 129, 65, 161, 113, 103, 136, 255, 285, 289, 1 ] delete_order = [ 276, 273, 12, 275, 2, 286, 127, 83, 92, 33, 101, 195, 299, 191, 22, 232, 291, 226, 110, 94, 257, 233, 215, 184, 35, 178, 18, 74, 296, 210, 298, 81, 265, 175, 116, 261, 212, 277, 260, 234, 6, 129, 31, 4, 235, 249, 34, 289, 105, 259, 91, 93, 119, 7, 183, 240, 41, 253, 290, 136, 75, 292, 67, 112, 111, 256, 163, 38, 126, 139, 98, 56, 282, 60, 26, 55, 245, 225, 32, 52, 40, 271, 29, 252, 239, 89, 87, 205, 213, 180, 97, 108, 120, 218, 44, 187, 196, 251, 202, 203, 172, 28, 188, 77, 90, 199, 297, 282, 141, 100, 161, 216, 73, 19, 17, 189, 30, 258 ] for x in add_order: self.t[x] = 1 for x in delete_order: try: del self.t[x] except KeyError: if self.t.has_key(x): self.assertEqual(1,2,"failed to delete %s" % x) def testRangeSearchAfterSequentialInsert(self): r = range(100) for x in r: self.t[x] = 0 diff = lsubtract(list(self.t.keys(0, 100)), r) self.assertEqual(diff , [], diff) def testRangeSearchAfterRandomInsert(self): r = range(100) a = {} for x in r: rnd = random.choice(r) self.t[rnd] = 0 a[rnd] = 0 diff = lsubtract(list(self.t.keys(0, 100)), a.keys()) self.assertEqual(diff , [], diff) def testPathologicalRangeSearch(self): t = self.t # Build a 2-level tree with at least two buckets. for i in range(200): t[i] = i items, dummy = t.__getstate__() self.assert_(len(items) > 2) # at least two buckets and a key # All values in the first bucket are < firstkey. All in the # second bucket are >= firstkey, and firstkey is the first key in # the second bucket. firstkey = items[1] therange = t.keys(-1, firstkey) self.assertEqual(len(therange), firstkey + 1) self.assertEqual(list(therange), range(firstkey + 1)) # Now for the tricky part. If we delete firstkey, the second bucket # loses its smallest key, but firstkey remains in the BTree node. 
# If we then do a high-end range search on firstkey, the BTree node # directs us to look in the second bucket, but there's no longer any # key <= firstkey in that bucket. The correct answer points to the # end of the *first* bucket. The algorithm has to be smart enough # to "go backwards" in the BTree then; if it doesn't, it will # erroneously claim that the range is empty. del t[firstkey] therange = t.keys(min=-1, max=firstkey) self.assertEqual(len(therange), firstkey) self.assertEqual(list(therange), range(firstkey)) def testInsertMethod(self): t = self.t t[0] = 1 self.assertEqual(t.insert(0, 1) , 0) self.assertEqual(t.insert(1, 1) , 1) self.assertEqual(lsubtract(list(t.keys()), [0,1]) , []) def testDamagedIterator(self): # A cute one from Steve Alexander. This caused the BTreeItems # object to go insane, accessing memory beyond the allocated part # of the bucket. If it fails, the symptom is either a C-level # assertion error (if the BTree code was compiled without NDEBUG), # or most likely a segfault (if the BTree code was compiled with # NDEBUG). t = self.t.__class__() self._populate(t, 10) # In order for this to fail, it's important that k be a "lazy" # iterator, referring to the BTree by indirect position (index) # instead of a fully materialized list. Then the position can # end up pointing into trash memory, if the bucket pointed to # shrinks. k = t.keys() for dummy in range(20): try: del t[k[0]] except RuntimeError, detail: self.assertEqual(str(detail), "the bucket being iterated " "changed size") break LARGEST_32_BITS = 2147483647 SMALLEST_32_BITS = -LARGEST_32_BITS - 1 SMALLEST_POSITIVE_33_BITS = LARGEST_32_BITS + 1 LARGEST_NEGATIVE_33_BITS = SMALLEST_32_BITS - 1 LARGEST_64_BITS = 0x7fffffffffffffff SMALLEST_64_BITS = -LARGEST_64_BITS - 1 SMALLEST_POSITIVE_65_BITS = LARGEST_64_BITS + 1 LARGEST_NEGATIVE_65_BITS = SMALLEST_64_BITS - 1 class TestLongIntSupport: def getTwoValues(self): """Return two distinct values; these must compare as un-equal. These values must be usable as values. """ return object(), object() def getTwoKeys(self): """Return two distinct values, these must compare as un-equal. These values must be usable as keys. 
""" return 0, 1 def _set_value(self, key, value): self.t[key] = value class TestLongIntKeys(TestLongIntSupport): def testLongIntKeysWork(self): o1, o2 = self.getTwoValues() assert o1 != o2 # Test some small key values first: self.t[0L] = o1 self.assertEqual(self.t[0], o1) self.t[0] = o2 self.assertEqual(self.t[0L], o2) self.assertEqual(list(self.t.keys()), [0]) # Test some large key values too: k1 = SMALLEST_POSITIVE_33_BITS k2 = LARGEST_64_BITS k3 = SMALLEST_64_BITS self.t[k1] = o1 self.t[k2] = o2 self.t[k3] = o1 self.assertEqual(self.t[k1], o1) self.assertEqual(self.t[k2], o2) self.assertEqual(self.t[k3], o1) self.assertEqual(list(self.t.keys()), [k3, 0, k1, k2]) def testLongIntKeysOutOfRange(self): o1, o2 = self.getTwoValues() self.assertRaises( ValueError, self._set_value, SMALLEST_POSITIVE_65_BITS, o1) self.assertRaises( ValueError, self._set_value, LARGEST_NEGATIVE_65_BITS, o1) class TestLongIntValues(TestLongIntSupport): def testLongIntValuesWork(self): keys = list(self.getTwoKeys()) keys.sort() k1, k2 = keys assert k1 != k2 # This is the smallest positive integer that requires 33 bits: v1 = SMALLEST_POSITIVE_33_BITS v2 = v1 + 1 self.t[k1] = v1 self.t[k2] = v2 self.assertEqual(self.t[k1], v1) self.assertEqual(self.t[k2], v2) self.assertEqual(list(self.t.values()), [v1, v2]) def testLongIntValuesOutOfRange(self): k1, k2 = self.getTwoKeys() self.assertRaises( ValueError, self._set_value, k1, SMALLEST_POSITIVE_65_BITS) self.assertRaises( ValueError, self._set_value, k1, LARGEST_NEGATIVE_65_BITS) if not using64bits: # We're not using 64-bit ints in this build, so we don't expect # the long-integer tests to pass. class TestLongIntKeys: pass class TestLongIntValues: pass # tests of various type errors class TypeTest(TestCase): def testBadTypeRaises(self): self.assertRaises(TypeError, self._stringraises) self.assertRaises(TypeError, self._floatraises) self.assertRaises(TypeError, self._noneraises) class TestIOBTrees(TypeTest): def setUp(self): self.t = IOBTree() def _stringraises(self): self.t['c'] = 1 def _floatraises(self): self.t[2.5] = 1 def _noneraises(self): self.t[None] = 1 class TestOIBTrees(TypeTest): def setUp(self): self.t = OIBTree() def _stringraises(self): self.t[1] = 'c' def _floatraises(self): self.t[1] = 1.4 def _noneraises(self): self.t[1] = None def testEmptyFirstBucketReportedByGuido(self): b = self.t for i in xrange(29972): # reduce to 29971 and it works b[i] = i for i in xrange(30): # reduce to 29 and it works del b[i] b[i+40000] = i self.assertEqual(b.keys()[0], 30) class TestIIBTrees(TestCase): def setUp(self): self.t = IIBTree() def testNonIntegerKeyRaises(self): self.assertRaises(TypeError, self._stringraiseskey) self.assertRaises(TypeError, self._floatraiseskey) self.assertRaises(TypeError, self._noneraiseskey) def testNonIntegerValueRaises(self): self.assertRaises(TypeError, self._stringraisesvalue) self.assertRaises(TypeError, self._floatraisesvalue) self.assertRaises(TypeError, self._noneraisesvalue) def _stringraiseskey(self): self.t['c'] = 1 def _floatraiseskey(self): self.t[2.5] = 1 def _noneraiseskey(self): self.t[None] = 1 def _stringraisesvalue(self): self.t[1] = 'c' def _floatraisesvalue(self): self.t[1] = 1.4 def _noneraisesvalue(self): self.t[1] = None class TestIFBTrees(TestCase): def setUp(self): self.t = IFBTree() def testNonIntegerKeyRaises(self): self.assertRaises(TypeError, self._stringraiseskey) self.assertRaises(TypeError, self._floatraiseskey) self.assertRaises(TypeError, self._noneraiseskey) def testNonNumericValueRaises(self): 
self.assertRaises(TypeError, self._stringraisesvalue) self.assertRaises(TypeError, self._noneraisesvalue) self.t[1] = 1 self.t[1] = 1.0 def _stringraiseskey(self): self.t['c'] = 1 def _floatraiseskey(self): self.t[2.5] = 1 def _noneraiseskey(self): self.t[None] = 1 def _stringraisesvalue(self): self.t[1] = 'c' def _floatraisesvalue(self): self.t[1] = 1.4 def _noneraisesvalue(self): self.t[1] = None class TestI_Sets(TestCase): def testBadBadKeyAfterFirst(self): self.assertRaises(TypeError, self.t.__class__, [1, '']) self.assertRaises(TypeError, self.t.update, [1, '']) del self.t def testNonIntegerInsertRaises(self): self.assertRaises(TypeError,self._insertstringraises) self.assertRaises(TypeError,self._insertfloatraises) self.assertRaises(TypeError,self._insertnoneraises) def _insertstringraises(self): self.t.insert('a') def _insertfloatraises(self): self.t.insert(1.4) def _insertnoneraises(self): self.t.insert(None) class TestIOSets(TestI_Sets): def setUp(self): self.t = IOSet() class TestIOTreeSets(TestI_Sets): def setUp(self): self.t = IOTreeSet() class TestIISets(TestI_Sets): def setUp(self): self.t = IISet() class TestIITreeSets(TestI_Sets): def setUp(self): self.t = IITreeSet() class TestLOSets(TestI_Sets): def setUp(self): self.t = LOSet() class TestLOTreeSets(TestI_Sets): def setUp(self): self.t = LOTreeSet() class TestLLSets(TestI_Sets): def setUp(self): self.t = LLSet() class TestLLTreeSets(TestI_Sets): def setUp(self): self.t = LLTreeSet() class DegenerateBTree(TestCase): # Build a degenerate tree (set). Boxes are BTree nodes. There are # 5 leaf buckets, each containing a single int. Keys in the BTree # nodes don't appear in the buckets. Seven BTree nodes are purely # indirection nodes (no keys). Buckets aren't all at the same depth: # # +------------------------+ # | 4 | # +------------------------+ # | | # | v # | +-+ # | | | # | +-+ # | | # v v # +-------+ +-------------+ # | 2 | | 6 10 | # +-------+ +-------------+ # | | | | | # v v v v v # +-+ +-+ +-+ +-+ +-+ # | | | | | | | | | | # +-+ +-+ +-+ +-+ +-+ # | | | | | # v v v v v # 1 3 +-+ 7 11 # | | # +-+ # | # v # 5 # # This is nasty for many algorithms. Consider a high-end range search # for 4. The BTree nodes direct it to the 5 bucket, but the correct # answer is the 3 bucket, which requires going in a different direction # at the very top node already. Consider a low-end range search for # 9. The BTree nodes direct it to the 7 bucket, but the correct answer # is the 11 bucket. This is also a nasty-case tree for deletions. def _build_degenerate_tree(self): # Build the buckets and chain them together. bucket11 = IISet([11]) bucket7 = IISet() bucket7.__setstate__(((7,), bucket11)) bucket5 = IISet() bucket5.__setstate__(((5,), bucket7)) bucket3 = IISet() bucket3.__setstate__(((3,), bucket5)) bucket1 = IISet() bucket1.__setstate__(((1,), bucket3)) # Build the deepest layers of indirection nodes. ts = IITreeSet tree1 = ts() tree1.__setstate__(((bucket1,), bucket1)) tree3 = ts() tree3.__setstate__(((bucket3,), bucket3)) tree5lower = ts() tree5lower.__setstate__(((bucket5,), bucket5)) tree5 = ts() tree5.__setstate__(((tree5lower,), bucket5)) tree7 = ts() tree7.__setstate__(((bucket7,), bucket7)) tree11 = ts() tree11.__setstate__(((bucket11,), bucket11)) # Paste together the middle layers. tree13 = ts() tree13.__setstate__(((tree1, 2, tree3), bucket1)) tree5711lower = ts() tree5711lower.__setstate__(((tree5, 6, tree7, 10, tree11), bucket5)) tree5711 = ts() tree5711.__setstate__(((tree5711lower,), bucket5)) # One more. 
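        # The root ties the two halves pictured above together: its children
        # are (tree13, 4, tree5711) and its firstbucket is bucket1.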
t = ts() t.__setstate__(((tree13, 4, tree5711), bucket1)) t._check() check(t) return t, [1, 3, 5, 7, 11] def testBasicOps(self): t, keys = self._build_degenerate_tree() self.assertEqual(len(t), len(keys)) self.assertEqual(list(t.keys()), keys) # has_key actually returns the depth of a bucket. self.assertEqual(t.has_key(1), 4) self.assertEqual(t.has_key(3), 4) self.assertEqual(t.has_key(5), 6) self.assertEqual(t.has_key(7), 5) self.assertEqual(t.has_key(11), 5) for i in 0, 2, 4, 6, 8, 9, 10, 12: self.assert_(i not in t) def _checkRanges(self, tree, keys): self.assertEqual(len(tree), len(keys)) sorted_keys = keys[:] sorted_keys.sort() self.assertEqual(list(tree.keys()), sorted_keys) for k in keys: self.assert_(k in tree) if keys: lokey = sorted_keys[0] hikey = sorted_keys[-1] self.assertEqual(lokey, tree.minKey()) self.assertEqual(hikey, tree.maxKey()) else: lokey = hikey = 42 # Try all range searches. for lo in range(lokey - 1, hikey + 2): for hi in range(lo - 1, hikey + 2): for skipmin in False, True: for skipmax in False, True: wantlo, wanthi = lo, hi if skipmin: wantlo += 1 if skipmax: wanthi -= 1 want = [k for k in keys if wantlo <= k <= wanthi] got = list(tree.keys(lo, hi, skipmin, skipmax)) self.assertEqual(want, got) def testRanges(self): t, keys = self._build_degenerate_tree() self._checkRanges(t, keys) def testDeletes(self): # Delete keys in all possible orders, checking each tree along # the way. # This is a tough test. Previous failure modes included: # 1. A variety of assertion failures in _checkRanges. # 2. Assorted "Invalid firstbucket pointer" failures at # seemingly random times, coming out of the BTree destructor. # 3. Under Python 2.3 CVS, some baffling # RuntimeWarning: tp_compare didn't return -1 or -2 for exception # warnings, possibly due to memory corruption after a BTree # goes insane. t, keys = self._build_degenerate_tree() for oneperm in permutations(keys): t, keys = self._build_degenerate_tree() for key in oneperm: t.remove(key) keys.remove(key) t._check() check(t) self._checkRanges(t, keys) # We removed all the keys, so the tree should be empty now. self.assertEqual(t.__getstate__(), None) # A damaged tree may trigger an "invalid firstbucket pointer" # failure at the time its destructor is invoked. Try to force # that to happen now, so it doesn't look like a baffling failure # at some unrelated line. del t # trigger destructor LP294788_ids = {} class ToBeDeleted(object): def __init__(self, id): assert type(id) is int #we don't want to store any object ref here self.id = id global LP294788_ids LP294788_ids[id] = 1 def __del__(self): global LP294788_ids LP294788_ids.pop(self.id, None) def __cmp__(self, other): return cmp(self.id, other.id) def __hash__(self): return hash(self.id) class BugFixes(TestCase): # Collector 1843. Error returns were effectively ignored in # Bucket_rangeSearch(), leading to "delayed" errors, or worse. def testFixed1843(self): t = IISet() t.insert(1) # This one used to fail to raise the TypeError when it occurred. self.assertRaises(TypeError, t.keys, "") # This one used to segfault. 
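        # (same root cause as the class comment notes: the error return from
        # Bucket_rangeSearch() was effectively ignored, so the bad ``max``
        # argument wasn't caught before the search ran)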
self.assertRaises(TypeError, t.keys, 0, "") def test_LP294788(self): # https://bugs.launchpad.net/bugs/294788 # BTree keeps some deleted objects referenced # The logic here together with the ToBeDeleted class is that # a separate reference dict is populated on object creation # and removed in __del__ # That means what's left in the reference dict is never GC'ed # therefore referenced somewhere # To simulate real life, some random data is used to exercise the tree t = OOBTree() trandom = random.Random('OOBTree') global LP294788_ids # /// BTree keys are integers, value is an object LP294788_ids = {} ids = {} for i in xrange(1024): if trandom.random() > 0.1: #add id = None while id is None or id in ids: id = trandom.randint(0,1000000) ids[id] = 1 t[id] = ToBeDeleted(id) else: #del id = trandom.choice(ids.keys()) del t[id] del ids[id] ids = ids.keys() trandom.shuffle(ids) for id in ids: del t[id] ids = None #to be on the safe side run a full GC gc.collect() #print LP294788_ids self.assertEqual(len(t), 0) self.assertEqual(len(LP294788_ids), 0) # \\\ # /// BTree keys are integers, value is a tuple having an object LP294788_ids = {} ids = {} for i in xrange(1024): if trandom.random() > 0.1: #add id = None while id is None or id in ids: id = trandom.randint(0,1000000) ids[id] = 1 t[id] = (id, ToBeDeleted(id), u'somename') else: #del id = trandom.choice(ids.keys()) del t[id] del ids[id] ids = ids.keys() trandom.shuffle(ids) for id in ids: del t[id] ids = None #to be on the safe side run a full GC gc.collect() #print LP294788_ids self.assertEqual(len(t), 0) self.assertEqual(len(LP294788_ids), 0) # \\\ # /// BTree keys are objects, value is an int t = OOBTree() LP294788_ids = {} ids = {} for i in xrange(1024): if trandom.random() > 0.1: #add id = None while id is None or id in ids: id = ToBeDeleted(trandom.randint(0,1000000)) ids[id] = 1 t[id] = 1 else: #del id = trandom.choice(ids.keys()) del ids[id] del t[id] ids = ids.keys() trandom.shuffle(ids) for id in ids: del t[id] #release all refs ids = obj = id = None #to be on the safe side run a full GC gc.collect() #print LP294788_ids self.assertEqual(len(t), 0) self.assertEqual(len(LP294788_ids), 0) # /// BTree keys are tuples having objects, value is an int t = OOBTree() LP294788_ids = {} ids = {} for i in xrange(1024): if trandom.random() > 0.1: #add id = None while id is None or id in ids: id = trandom.randint(0,1000000) id = (id, ToBeDeleted(id), u'somename') ids[id] = 1 t[id] = 1 else: #del id = trandom.choice(ids.keys()) del ids[id] del t[id] ids = ids.keys() trandom.shuffle(ids) for id in ids: del t[id] #release all refs ids = id = obj = key = None #to be on the safe side run a full GC gc.collect() #print LP294788_ids self.assertEqual(len(t), 0) self.assertEqual(len(LP294788_ids), 0) class IIBTreeTest(BTreeTests): def setUp(self): self.t = IIBTree() def testIIBTreeOverflow(self): good = set() b = self.t def trial(i): i = int(i) try: b[i] = 0 except TypeError: self.assertRaises(TypeError, b.__setitem__, 0, i) else: good.add(i) b[0] = i self.assertEqual(b[0], i) for i in range((1<<31) - 3, (1<<31) + 3): trial(i) trial(-i) del b[0] self.assertEqual(sorted(good), sorted(b)) class IFBTreeTest(BTreeTests): def setUp(self): self.t = IFBTree() class IOBTreeTest(BTreeTests): def setUp(self): self.t = IOBTree() class OIBTreeTest(BTreeTests): def setUp(self): self.t = OIBTree() class OOBTreeTest(BTreeTests): def setUp(self): self.t = OOBTree() if using64bits: class IIBTreeTest(BTreeTests, TestLongIntKeys, TestLongIntValues): def setUp(self): self.t = 
IIBTree() def getTwoValues(self): return 1, 2 class IFBTreeTest(BTreeTests, TestLongIntKeys): def setUp(self): self.t = IFBTree() def getTwoValues(self): return 0.5, 1.5 class IOBTreeTest(BTreeTests, TestLongIntKeys): def setUp(self): self.t = IOBTree() class OIBTreeTest(BTreeTests, TestLongIntValues): def setUp(self): self.t = OIBTree() def getTwoKeys(self): return object(), object() class LLBTreeTest(BTreeTests, TestLongIntKeys, TestLongIntValues): def setUp(self): self.t = LLBTree() def getTwoValues(self): return 1, 2 class LFBTreeTest(BTreeTests, TestLongIntKeys): def setUp(self): self.t = LFBTree() def getTwoValues(self): return 0.5, 1.5 class LOBTreeTest(BTreeTests, TestLongIntKeys): def setUp(self): self.t = LOBTree() class OLBTreeTest(BTreeTests, TestLongIntValues): def setUp(self): self.t = OLBTree() def getTwoKeys(self): return object(), object() class OOBTreeTest(BTreeTests): def setUp(self): self.t = OOBTree() # cmp error propagation tests class DoesntLikeBeingCompared: def __cmp__(self,other): raise ValueError('incomparable') class TestCmpError(TestCase): def testFoo(self): t = OOBTree() t['hello world'] = None try: t[DoesntLikeBeingCompared()] = None except ValueError,e: self.assertEqual(str(e), 'incomparable') else: self.fail('incomarable objects should not be allowed into ' 'the tree') # test for presence of generic names in module class ModuleTest(TestCase): module = None prefix = None iface = None def testNames(self): for name in ('Bucket', 'BTree', 'Set', 'TreeSet'): klass = getattr(self.module, name) self.assertEqual(klass.__module__, self.module.__name__) self.assert_(klass is getattr(self.module, self.prefix + name)) def testModuleProvides(self): self.assert_( zope.interface.verify.verifyObject(self.iface, self.module)) def testFamily(self): if self.prefix == 'OO': self.assert_( getattr(self.module, 'family', self) is self) elif 'L' in self.prefix: self.assert_(self.module.family is BTrees.family64) elif 'I' in self.prefix: self.assert_(self.module.family is BTrees.family32) class FamilyTest(TestCase): def test32(self): self.assert_( zope.interface.verify.verifyObject( BTrees.Interfaces.IBTreeFamily, BTrees.family32)) self.assertEquals( BTrees.family32.IO, BTrees.IOBTree) self.assertEquals( BTrees.family32.OI, BTrees.OIBTree) self.assertEquals( BTrees.family32.II, BTrees.IIBTree) self.assertEquals( BTrees.family32.IF, BTrees.IFBTree) self.assertEquals( BTrees.family32.OO, BTrees.OOBTree) s = IOTreeSet() s.insert(BTrees.family32.maxint) self.assert_(BTrees.family32.maxint in s) s = IOTreeSet() s.insert(BTrees.family32.minint) self.assert_(BTrees.family32.minint in s) s = IOTreeSet() # this next bit illustrates an, um, "interesting feature". If # the characteristics change to match the 64 bit version, please # feel free to change. 
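# (The "feature": out-of-range keys raise TypeError for the 32-bit family,
# whereas test64 below expects ValueError from the 64-bit family for the
# same situation.)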
big = BTrees.family32.maxint + 1 self.assertRaises(TypeError, s.insert, big) self.assertRaises(TypeError, s.insert, BTrees.family32.minint - 1) self.check_pickling(BTrees.family32) def test64(self): self.assert_( zope.interface.verify.verifyObject( BTrees.Interfaces.IBTreeFamily, BTrees.family64)) self.assertEquals( BTrees.family64.IO, BTrees.LOBTree) self.assertEquals( BTrees.family64.OI, BTrees.OLBTree) self.assertEquals( BTrees.family64.II, BTrees.LLBTree) self.assertEquals( BTrees.family64.IF, BTrees.LFBTree) self.assertEquals( BTrees.family64.OO, BTrees.OOBTree) s = LOTreeSet() s.insert(BTrees.family64.maxint) self.assert_(BTrees.family64.maxint in s) s = LOTreeSet() s.insert(BTrees.family64.minint) self.assert_(BTrees.family64.minint in s) s = LOTreeSet() self.assertRaises(ValueError, s.insert, BTrees.family64.maxint + 1) self.assertRaises(ValueError, s.insert, BTrees.family64.minint - 1) self.check_pickling(BTrees.family64) def check_pickling(self, family): # The "family" objects are singletons; they can be pickled and # unpickled, and the same instances will always be returned on # unpickling, whether from the same unpickler or different # unpicklers. s = pickle.dumps((family, family)) (f1, f2) = pickle.loads(s) self.failUnless(f1 is family) self.failUnless(f2 is family) # Using a single memo across multiple pickles: sio = StringIO.StringIO() p = pickle.Pickler(sio) p.dump(family) p.dump([family]) u = pickle.Unpickler(StringIO.StringIO(sio.getvalue())) f1 = u.load() f2, = u.load() self.failUnless(f1 is family) self.failUnless(f2 is family) # Using separate memos for each pickle: sio = StringIO.StringIO() p = pickle.Pickler(sio) p.dump(family) p.clear_memo() p.dump([family]) u = pickle.Unpickler(StringIO.StringIO(sio.getvalue())) f1 = u.load() f2, = u.load() self.failUnless(f1 is family) self.failUnless(f2 is family) class InternalKeysMappingTest(TestCase): """There must not be any internal keys not in the BTree """ def add_key(self, tree, key): tree[key] = key def test_internal_keys_after_deletion(self): """Make sure when a key's deleted, it's not an internal key We'll leverage __getstate__ to introspect the internal structures. We need to check BTrees with BTree children as well as BTrees with bucket children. 
""" from ZODB.MappingStorage import DB db = DB() conn = db.open() tree = conn.root.tree = self.t_class() i = 0 # Grow the btree until we have multiple buckets while 1: i += 1 self.add_key(tree, i) data = tree.__getstate__()[0] if len(data) >= 3: break transaction.commit() # Now, delete the internal key and make sure it's really gone key = data[1] del tree[key] data = tree.__getstate__()[0] self.assert_(data[1] != key) # The tree should have changed: self.assert_(tree._p_changed) # Grow the btree until we have multiple levels while 1: i += 1 self.add_key(tree, i) data = tree.__getstate__()[0] if data[0].__class__ == tree.__class__: assert len(data[2].__getstate__()[0]) >= 3 break # Now, delete the internal key and make sure it's really gone key = data[1] del tree[key] data = tree.__getstate__()[0] self.assert_(data[1] != key) transaction.abort() db.close() class InternalKeysSetTest: """There must not be any internal keys not in the TreeSet """ def add_key(self, tree, key): tree.add(key) def test_suite(): s = TestSuite() for kv in ('OO', 'II', 'IO', 'OI', 'IF', 'LL', 'LO', 'OL', 'LF', ): for name, bases in ( ('BTree', (InternalKeysMappingTest,)), ('TreeSet', (InternalKeysSetTest,)), ): klass = ClassType(kv + name + 'InternalKeyTest', bases, dict(t_class=globals()[kv+name])) s.addTest(makeSuite(klass)) for kv in ('OO', 'II', 'IO', 'OI', 'IF', 'LL', 'LO', 'OL', 'LF', ): for name, bases in ( ('Bucket', (MappingBase,)), ('TreeSet', (NormalSetTests,)), ('Set', (ExtendedSetTests,)), ): klass = ClassType(kv + name + 'Test', bases, dict(t_class=globals()[kv+name])) s.addTest(makeSuite(klass)) for kv, iface in ( ('OO', BTrees.Interfaces.IObjectObjectBTreeModule), ('IO', BTrees.Interfaces.IIntegerObjectBTreeModule), ('LO', BTrees.Interfaces.IIntegerObjectBTreeModule), ('OI', BTrees.Interfaces.IObjectIntegerBTreeModule), ('OL', BTrees.Interfaces.IObjectIntegerBTreeModule), ('II', BTrees.Interfaces.IIntegerIntegerBTreeModule), ('LL', BTrees.Interfaces.IIntegerIntegerBTreeModule), ('IF', BTrees.Interfaces.IIntegerFloatBTreeModule), ('LF', BTrees.Interfaces.IIntegerFloatBTreeModule)): s.addTest( makeSuite( ClassType( kv + 'ModuleTest', (ModuleTest,), dict( prefix=kv, module=getattr(BTrees, kv + 'BTree'), iface=iface)))) for klass in ( IIBTreeTest, IFBTreeTest, IOBTreeTest, OIBTreeTest, LLBTreeTest, LFBTreeTest, LOBTreeTest, OLBTreeTest, OOBTreeTest, # Note: there is no TestOOBTrees. The next three are # checking for assorted TypeErrors, and when both keys # and values are objects (OO), there's nothing to test. TestIIBTrees, TestIFBTrees, TestIOBTrees, TestOIBTrees, TestIOSets, TestIOTreeSets, TestIISets, TestIITreeSets, TestLOSets, TestLOTreeSets, TestLLSets, TestLLTreeSets, DegenerateBTree, TestCmpError, BugFixes, FamilyTest, ): s.addTest(makeSuite(klass)) return s ## utility functions def lsubtract(l1, l2): l1 = list(l1) l2 = list(l2) l = filter(lambda x, l1=l1: x not in l1, l2) l = l + filter(lambda x, l2=l2: x not in l2, l1) return l def realseq(itemsob): return [x for x in itemsob] def permutations(x): # Return a list of all permutations of list x. n = len(x) if n <= 1: return [x] result = [] x0 = x[0] for i in range(n): # Build the (n-1)! permutations with x[i] in the first position. 
xcopy = x[:] first, xcopy[i] = xcopy[i], x0 result.extend([[first] + p for p in permutations(xcopy[1:])]) return result def main(): TextTestRunner().run(test_suite()) if __name__ == '__main__': main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/tests/testBTreesUnicode.py000066400000000000000000000046521230730566700262010ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import unittest from BTrees.OOBTree import OOBTree # When an OOBtree contains unicode strings as keys, # it is neccessary accessing non-unicode strings are # either ascii strings or encoded as unicoded using the # corresponding encoding encoding = 'ISO-8859-1' class TestBTreesUnicode(unittest.TestCase): """ test unicode""" def setUp(self): """setup an OOBTree with some unicode strings""" self.s = unicode('dreit\xe4gigen', 'latin1') self.data = [('alien', 1), ('k\xf6nnten', 2), ('fox', 3), ('future', 4), ('quick', 5), ('zerst\xf6rt', 6), (unicode('dreit\xe4gigen','latin1'), 7), ] self.tree = OOBTree() for k, v in self.data: if isinstance(k, str): k = unicode(k, 'latin1') self.tree[k] = v def testAllKeys(self): # check every item of the tree for k, v in self.data: if isinstance(k, str): k = unicode(k, encoding) self.assert_(self.tree.has_key(k)) self.assertEqual(self.tree[k], v) def testUnicodeKeys(self): # try to access unicode keys in tree k, v = self.data[-1] self.assertEqual(k, self.s) self.assertEqual(self.tree[k], v) self.assertEqual(self.tree[self.s], v) def testAsciiKeys(self): # try to access some "plain ASCII" keys in the tree for k, v in self.data[0], self.data[2]: self.assert_(isinstance(k, str)) self.assertEqual(self.tree[k], v) def test_suite(): return unittest.makeSuite(TestBTreesUnicode) def main(): unittest.TextTestRunner().run(test_suite()) if __name__ == '__main__': main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/tests/testConflict.py000066400000000000000000000675111230730566700252520ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import os from unittest import TestCase, TestSuite, makeSuite from types import ClassType from BTrees.OOBTree import OOBTree, OOBucket, OOSet, OOTreeSet from BTrees.IOBTree import IOBTree, IOBucket, IOSet, IOTreeSet from BTrees.IIBTree import IIBTree, IIBucket, IISet, IITreeSet from BTrees.IFBTree import IFBTree, IFBucket, IFSet, IFTreeSet from BTrees.OIBTree import OIBTree, OIBucket, OISet, OITreeSet from BTrees.LOBTree import LOBTree, LOBucket, LOSet, LOTreeSet from BTrees.LLBTree import LLBTree, LLBucket, LLSet, LLTreeSet from BTrees.LFBTree import LFBTree, LFBucket, LFSet, LFTreeSet from BTrees.OLBTree import OLBTree, OLBucket, OLSet, OLTreeSet import transaction from ZODB.POSException import ConflictError class Base: """ Tests common to all types: sets, buckets, and BTrees """ storage = None def setUp(self): self.t = self.t_type() def tearDown(self): transaction.abort() del self.t if self.storage is not None: self.storage.close() self.storage.cleanup() def openDB(self): from ZODB.FileStorage import FileStorage from ZODB.DB import DB n = 'fs_tmp__%s' % os.getpid() self.storage = FileStorage(n) self.db = DB(self.storage) return self.db class MappingBase(Base): """ Tests common to mappings (buckets, btrees) """ def _deletefail(self): del self.t[1] def _setupConflict(self): l=[ -5124, -7377, 2274, 8801, -9901, 7327, 1565, 17, -679, 3686, -3607, 14, 6419, -5637, 6040, -4556, -8622, 3847, 7191, -4067] e1=[(-1704, 0), (5420, 1), (-239, 2), (4024, 3), (-6984, 4)] e2=[(7745, 0), (4868, 1), (-2548, 2), (-2711, 3), (-3154, 4)] base=self.t base.update([(i, i*i) for i in l[:20]]) b1=base.__class__(base) b2=base.__class__(base) bm=base.__class__(base) items=base.items() return base, b1, b2, bm, e1, e2, items def testSimpleConflict(self): # Unlike all the other tests, invoke conflict resolution # by committing a transaction and catching a conflict # in the storage. 
self.openDB() r1 = self.db.open().root() r1["t"] = self.t transaction.commit() r2 = self.db.open().root() copy = r2["t"] list(copy) # unghostify self.assertEqual(self.t._p_serial, copy._p_serial) self.t.update({1:2, 2:3}) transaction.commit() copy.update({3:4}) transaction.commit() def testMergeDelete(self): base, b1, b2, bm, e1, e2, items = self._setupConflict() del b1[items[1][0]] del b2[items[5][0]] del b1[items[-1][0]] del b2[items[-2][0]] del bm[items[1][0]] del bm[items[5][0]] del bm[items[-1][0]] del bm[items[-2][0]] test_merge(base, b1, b2, bm, 'merge delete') def testMergeDeleteAndUpdate(self): base, b1, b2, bm, e1, e2, items = self._setupConflict() del b1[items[1][0]] b2[items[5][0]]=1 del b1[items[-1][0]] b2[items[-2][0]]=2 del bm[items[1][0]] bm[items[5][0]]=1 del bm[items[-1][0]] bm[items[-2][0]]=2 test_merge(base, b1, b2, bm, 'merge update and delete') def testMergeUpdate(self): base, b1, b2, bm, e1, e2, items = self._setupConflict() b1[items[0][0]]=1 b2[items[5][0]]=2 b1[items[-1][0]]=3 b2[items[-2][0]]=4 bm[items[0][0]]=1 bm[items[5][0]]=2 bm[items[-1][0]]=3 bm[items[-2][0]]=4 test_merge(base, b1, b2, bm, 'merge update') def testFailMergeDelete(self): base, b1, b2, bm, e1, e2, items = self._setupConflict() del b1[items[0][0]] del b2[items[0][0]] test_merge(base, b1, b2, bm, 'merge conflicting delete', should_fail=1) def testFailMergeUpdate(self): base, b1, b2, bm, e1, e2, items = self._setupConflict() b1[items[0][0]]=1 b2[items[0][0]]=2 test_merge(base, b1, b2, bm, 'merge conflicting update', should_fail=1) def testFailMergeDeleteAndUpdate(self): base, b1, b2, bm, e1, e2, items = self._setupConflict() del b1[items[0][0]] b2[items[0][0]]=-9 test_merge(base, b1, b2, bm, 'merge conflicting update and delete', should_fail=1) def testMergeInserts(self): base, b1, b2, bm, e1, e2, items = self._setupConflict() b1[-99999]=-99999 b1[e1[0][0]]=e1[0][1] b2[99999]=99999 b2[e1[2][0]]=e1[2][1] bm[-99999]=-99999 bm[e1[0][0]]=e1[0][1] bm[99999]=99999 bm[e1[2][0]]=e1[2][1] test_merge(base, b1, b2, bm, 'merge insert') def testMergeInsertsFromEmpty(self): base, b1, b2, bm, e1, e2, items = self._setupConflict() base.clear() b1.clear() b2.clear() bm.clear() b1.update(e1) bm.update(e1) b2.update(e2) bm.update(e2) test_merge(base, b1, b2, bm, 'merge insert from empty') def testFailMergeEmptyAndFill(self): base, b1, b2, bm, e1, e2, items = self._setupConflict() b1.clear() bm.clear() b2.update(e2) bm.update(e2) test_merge(base, b1, b2, bm, 'merge insert from empty', should_fail=1) def testMergeEmpty(self): base, b1, b2, bm, e1, e2, items = self._setupConflict() b1.clear() bm.clear() test_merge(base, b1, b2, bm, 'empty one and not other', should_fail=1) def testFailMergeInsert(self): base, b1, b2, bm, e1, e2, items = self._setupConflict() b1[-99999]=-99999 b1[e1[0][0]]=e1[0][1] b2[99999]=99999 b2[e1[0][0]]=e1[0][1] test_merge(base, b1, b2, bm, 'merge conflicting inserts', should_fail=1) class SetTests(Base): "Set (as opposed to TreeSet) specific tests." 
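# These mirror the MappingBase merge scenarios above, but drive conflict
# resolution through the set API (insert/remove/update of keys) rather than
# item assignment.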
def _setupConflict(self): l=[ -5124, -7377, 2274, 8801, -9901, 7327, 1565, 17, -679, 3686, -3607, 14, 6419, -5637, 6040, -4556, -8622, 3847, 7191, -4067] e1=[-1704, 5420, -239, 4024, -6984] e2=[7745, 4868, -2548, -2711, -3154] base=self.t base.update(l) b1=base.__class__(base) b2=base.__class__(base) bm=base.__class__(base) items=base.keys() return base, b1, b2, bm, e1, e2, items def testMergeDelete(self): base, b1, b2, bm, e1, e2, items = self._setupConflict() b1.remove(items[1]) b2.remove(items[5]) b1.remove(items[-1]) b2.remove(items[-2]) bm.remove(items[1]) bm.remove(items[5]) bm.remove(items[-1]) bm.remove(items[-2]) test_merge(base, b1, b2, bm, 'merge delete') def testFailMergeDelete(self): base, b1, b2, bm, e1, e2, items = self._setupConflict() b1.remove(items[0]) b2.remove(items[0]) test_merge(base, b1, b2, bm, 'merge conflicting delete', should_fail=1) def testMergeInserts(self): base, b1, b2, bm, e1, e2, items = self._setupConflict() b1.insert(-99999) b1.insert(e1[0]) b2.insert(99999) b2.insert(e1[2]) bm.insert(-99999) bm.insert(e1[0]) bm.insert(99999) bm.insert(e1[2]) test_merge(base, b1, b2, bm, 'merge insert') def testMergeInsertsFromEmpty(self): base, b1, b2, bm, e1, e2, items = self._setupConflict() base.clear() b1.clear() b2.clear() bm.clear() b1.update(e1) bm.update(e1) b2.update(e2) bm.update(e2) test_merge(base, b1, b2, bm, 'merge insert from empty') def testFailMergeEmptyAndFill(self): base, b1, b2, bm, e1, e2, items = self._setupConflict() b1.clear() bm.clear() b2.update(e2) bm.update(e2) test_merge(base, b1, b2, bm, 'merge insert from empty', should_fail=1) def testMergeEmpty(self): base, b1, b2, bm, e1, e2, items = self._setupConflict() b1.clear() bm.clear() test_merge(base, b1, b2, bm, 'empty one and not other', should_fail=1) def testFailMergeInsert(self): base, b1, b2, bm, e1, e2, items = self._setupConflict() b1.insert(-99999) b1.insert(e1[0]) b2.insert(99999) b2.insert(e1[0]) test_merge(base, b1, b2, bm, 'merge conflicting inserts', should_fail=1) def test_merge(o1, o2, o3, expect, message='failed to merge', should_fail=0): s1 = o1.__getstate__() s2 = o2.__getstate__() s3 = o3.__getstate__() expected = expect.__getstate__() if expected is None: expected = ((((),),),) if should_fail: try: merged = o1._p_resolveConflict(s1, s2, s3) except ConflictError, err: pass else: assert 0, message else: merged = o1._p_resolveConflict(s1, s2, s3) assert merged == expected, message class NastyConfict(Base, TestCase): t_type = OOBTree # This tests a problem that cropped up while trying to write # testBucketSplitConflict (below): conflict resolution wasn't # working at all in non-trivial cases. Symptoms varied from # strange complaints about pickling (despite that the test isn't # doing any *directly*), thru SystemErrors from Python and # AssertionErrors inside the BTree code. def testResolutionBlowsUp(self): b = self.t for i in range(0, 200, 4): b[i] = i # bucket 0 has 15 values: 0, 4 .. 56 # bucket 1 has 15 values: 60, 64 .. 116 # bucket 2 has 20 values: 120, 124 .. 196 state = b.__getstate__() # Looks like: ((bucket0, 60, bucket1, 120, bucket2), firstbucket) # If these fail, the *preconditions* for running the test aren't # satisfied -- the test itself hasn't been run yet. self.assertEqual(len(state), 2) self.assertEqual(len(state[0]), 5) self.assertEqual(state[0][1], 60) self.assertEqual(state[0][3], 120) # Invoke conflict resolution by committing a transaction. 
self.openDB() r1 = self.db.open().root() r1["t"] = self.t transaction.commit() r2 = self.db.open().root() copy = r2["t"] # Make sure all of copy is loaded. list(copy.values()) self.assertEqual(self.t._p_serial, copy._p_serial) self.t.update({1:2, 2:3}) transaction.commit() copy.update({3:4}) transaction.commit() # if this doesn't blow up list(copy.values()) # and this doesn't either, then fine def testBucketSplitConflict(self): # Tests that a bucket split is viewed as a conflict. # It's (almost necessarily) a white-box test, and sensitive to # implementation details. b = self.t for i in range(0, 200, 4): b[i] = i # bucket 0 has 15 values: 0, 4 .. 56 # bucket 1 has 15 values: 60, 64 .. 116 # bucket 2 has 20 values: 120, 124 .. 196 state = b.__getstate__() # Looks like: ((bucket0, 60, bucket1, 120, bucket2), firstbucket) # If these fail, the *preconditions* for running the test aren't # satisfied -- the test itself hasn't been run yet. self.assertEqual(len(state), 2) self.assertEqual(len(state[0]), 5) self.assertEqual(state[0][1], 60) self.assertEqual(state[0][3], 120) # Invoke conflict resolution by committing a transaction. self.openDB() tm1 = transaction.TransactionManager() r1 = self.db.open(transaction_manager=tm1).root() r1["t"] = self.t tm1.commit() tm2 = transaction.TransactionManager() r2 = self.db.open(transaction_manager=tm2).root() copy = r2["t"] # Make sure all of copy is loaded. list(copy.values()) self.assertEqual(self.t._p_serial, copy._p_serial) # In one transaction, add 16 new keys to bucket1, to force a bucket # split. b = self.t numtoadd = 16 candidate = 60 while numtoadd: if not b.has_key(candidate): b[candidate] = candidate numtoadd -= 1 candidate += 1 # bucket 0 has 15 values: 0, 4 .. 56 # bucket 1 has 15 values: 60, 61 .. 74 # bucket 2 has 16 values: [75, 76 .. 81] + [84, 88 ..116] # bucket 3 has 20 values: 120, 124 .. 196 state = b.__getstate__() # Looks like: ((b0, 60, b1, 75, b2, 120, b3), firstbucket) # The next block is still verifying preconditions. self.assertEqual(len(state) , 2) self.assertEqual(len(state[0]), 7) self.assertEqual(state[0][1], 60) self.assertEqual(state[0][3], 75) self.assertEqual(state[0][5], 120) tm1.commit() # In the other transaction, add 3 values near the tail end of bucket1. # This doesn't cause a split. b = copy for i in range(112, 116): b[i] = i # bucket 0 has 15 values: 0, 4 .. 56 # bucket 1 has 18 values: 60, 64 .. 112, 113, 114, 115, 116 # bucket 2 has 20 values: 120, 124 .. 196 state = b.__getstate__() # Looks like: ((bucket0, 60, bucket1, 120, bucket2), firstbucket) # The next block is still verifying preconditions. self.assertEqual(len(state), 2) self.assertEqual(len(state[0]), 5) self.assertEqual(state[0][1], 60) self.assertEqual(state[0][3], 120) self.assertRaises(ConflictError, tm2.commit) def testEmptyBucketConflict(self): # Tests that an emptied bucket *created by* conflict resolution is # viewed as a conflict: conflict resolution doesn't have enough # info to unlink the empty bucket from the BTree correctly. b = self.t for i in range(0, 200, 4): b[i] = i # bucket 0 has 15 values: 0, 4 .. 56 # bucket 1 has 15 values: 60, 64 .. 116 # bucket 2 has 20 values: 120, 124 .. 196 state = b.__getstate__() # Looks like: ((bucket0, 60, bucket1, 120, bucket2), firstbucket) # If these fail, the *preconditions* for running the test aren't # satisfied -- the test itself hasn't been run yet. 
self.assertEqual(len(state), 2) self.assertEqual(len(state[0]), 5) self.assertEqual(state[0][1], 60) self.assertEqual(state[0][3], 120) # Invoke conflict resolution by committing a transaction. self.openDB() tm1 = transaction.TransactionManager() r1 = self.db.open(transaction_manager=tm1).root() r1["t"] = self.t tm1.commit() tm2 = transaction.TransactionManager() r2 = self.db.open(transaction_manager=tm2).root() copy = r2["t"] # Make sure all of copy is loaded. list(copy.values()) self.assertEqual(self.t._p_serial, copy._p_serial) # In one transaction, delete half of bucket 1. b = self.t for k in 60, 64, 68, 72, 76, 80, 84, 88: del b[k] # bucket 0 has 15 values: 0, 4 .. 56 # bucket 1 has 7 values: 92, 96, 100, 104, 108, 112, 116 # bucket 2 has 20 values: 120, 124 .. 196 state = b.__getstate__() # Looks like: ((bucket0, 60, bucket1, 120, bucket2), firstbucket) # The next block is still verifying preconditions. self.assertEqual(len(state) , 2) self.assertEqual(len(state[0]), 5) self.assertEqual(state[0][1], 92) self.assertEqual(state[0][3], 120) tm1.commit() # In the other transaction, delete the other half of bucket 1. b = copy for k in 92, 96, 100, 104, 108, 112, 116: del b[k] # bucket 0 has 15 values: 0, 4 .. 56 # bucket 1 has 8 values: 60, 64, 68, 72, 76, 80, 84, 88 # bucket 2 has 20 values: 120, 124 .. 196 state = b.__getstate__() # Looks like: ((bucket0, 60, bucket1, 120, bucket2), firstbucket) # The next block is still verifying preconditions. self.assertEqual(len(state), 2) self.assertEqual(len(state[0]), 5) self.assertEqual(state[0][1], 60) self.assertEqual(state[0][3], 120) # Conflict resolution empties bucket1 entirely. This used to # create an "insane" BTree (a legit BTree cannot contain an empty # bucket -- it contains NULL pointers the BTree code doesn't # expect, and segfaults result). self.assertRaises(ConflictError, tm2.commit) def testEmptyBucketNoConflict(self): # Tests that a plain empty bucket (on input) is not viewed as a # conflict. b = self.t for i in range(0, 200, 4): b[i] = i # bucket 0 has 15 values: 0, 4 .. 56 # bucket 1 has 15 values: 60, 64 .. 116 # bucket 2 has 20 values: 120, 124 .. 196 state = b.__getstate__() # Looks like: ((bucket0, 60, bucket1, 120, bucket2), firstbucket) # If these fail, the *preconditions* for running the test aren't # satisfied -- the test itself hasn't been run yet. self.assertEqual(len(state), 2) self.assertEqual(len(state[0]), 5) self.assertEqual(state[0][1], 60) self.assertEqual(state[0][3], 120) # Invoke conflict resolution by committing a transaction. self.openDB() r1 = self.db.open().root() r1["t"] = self.t transaction.commit() r2 = self.db.open().root() copy = r2["t"] # Make sure all of copy is loaded. list(copy.values()) self.assertEqual(self.t._p_serial, copy._p_serial) # In one transaction, just add a key. b = self.t b[1] = 1 # bucket 0 has 16 values: [0, 1] + [4, 8 .. 56] # bucket 1 has 15 values: 60, 64 .. 116 # bucket 2 has 20 values: 120, 124 .. 196 state = b.__getstate__() # Looks like: ((bucket0, 60, bucket1, 120, bucket2), firstbucket) # The next block is still verifying preconditions. self.assertEqual(len(state), 2) self.assertEqual(len(state[0]), 5) self.assertEqual(state[0][1], 60) self.assertEqual(state[0][3], 120) transaction.commit() # In the other transaction, delete bucket 2. b = copy for k in range(120, 200, 4): del b[k] # bucket 0 has 15 values: 0, 4 .. 56 # bucket 1 has 15 values: 60, 64 .. 
116 state = b.__getstate__() # Looks like: ((bucket0, 60, bucket1), firstbucket) # The next block is still verifying preconditions. self.assertEqual(len(state), 2) self.assertEqual(len(state[0]), 3) self.assertEqual(state[0][1], 60) # This shouldn't create a ConflictError. transaction.commit() # And the resulting BTree shouldn't have internal damage. b._check() # The snaky control flow in _bucket__p_resolveConflict ended up trying # to decref a NULL pointer if conflict resolution was fed 3 empty # buckets. http://collector.zope.org/Zope/553 def testThreeEmptyBucketsNoSegfault(self): self.t[1] = 1 bucket = self.t._firstbucket del self.t[1] state1 = bucket.__getstate__() state2 = bucket.__getstate__() state3 = bucket.__getstate__() self.assert_(state2 is not state1 and state2 is not state3 and state3 is not state1) self.assert_(state2 == state1 and state3 == state1) self.assertRaises(ConflictError, bucket._p_resolveConflict, state1, state2, state3) # When an empty BTree resolves conflicts, it computes the # bucket state as None, so... self.assertRaises(ConflictError, bucket._p_resolveConflict, None, None, None) def testCantResolveBTreeConflict(self): # Test that a conflict involving two different changes to # an internal BTree node is unresolvable. An internal node # only changes when there are enough additions or deletions # to a child bucket that the bucket is split or removed. # It's (almost necessarily) a white-box test, and sensitive to # implementation details. b = self.t for i in range(0, 200, 4): b[i] = i # bucket 0 has 15 values: 0, 4 .. 56 # bucket 1 has 15 values: 60, 64 .. 116 # bucket 2 has 20 values: 120, 124 .. 196 state = b.__getstate__() # Looks like: ((bucket0, 60, bucket1, 120, bucket2), firstbucket) # If these fail, the *preconditions* for running the test aren't # satisfied -- the test itself hasn't been run yet. self.assertEqual(len(state), 2) self.assertEqual(len(state[0]), 5) self.assertEqual(state[0][1], 60) self.assertEqual(state[0][3], 120) # Set up database connections to provoke conflict. self.openDB() tm1 = transaction.TransactionManager() r1 = self.db.open(transaction_manager=tm1).root() r1["t"] = self.t tm1.commit() tm2 = transaction.TransactionManager() r2 = self.db.open(transaction_manager=tm2).root() copy = r2["t"] # Make sure all of copy is loaded. list(copy.values()) self.assertEqual(self.t._p_serial, copy._p_serial) # Now one transaction should add enough keys to cause a split, # and another should remove all the keys in one bucket. for k in range(200, 300, 4): self.t[k] = k tm1.commit() for k in range(0, 60, 4): del copy[k] try: tm2.commit() except ConflictError, detail: self.assert_(str(detail).startswith('database conflict error')) else: self.fail("expected ConflictError") def testConflictWithOneEmptyBucket(self): # If one transaction empties a bucket, while another adds an item # to the bucket, all the changes "look resolvable": bucket conflict # resolution returns a bucket containing (only) the item added by # the latter transaction, but changes from the former transaction # removing the bucket are uncontested: the bucket is removed from # the BTree despite that resolution thinks it's non-empty! This # was first reported by Dieter Maurer, to zodb-dev on 22 Mar 2005. b = self.t for i in range(0, 200, 4): b[i] = i # bucket 0 has 15 values: 0, 4 .. 56 # bucket 1 has 15 values: 60, 64 .. 116 # bucket 2 has 20 values: 120, 124 .. 
196 state = b.__getstate__() # Looks like: ((bucket0, 60, bucket1, 120, bucket2), firstbucket) # If these fail, the *preconditions* for running the test aren't # satisfied -- the test itself hasn't been run yet. self.assertEqual(len(state), 2) self.assertEqual(len(state[0]), 5) self.assertEqual(state[0][1], 60) self.assertEqual(state[0][3], 120) # Set up database connections to provoke conflict. self.openDB() tm1 = transaction.TransactionManager() r1 = self.db.open(transaction_manager=tm1).root() r1["t"] = self.t tm1.commit() tm2 = transaction.TransactionManager() r2 = self.db.open(transaction_manager=tm2).root() copy = r2["t"] # Make sure all of copy is loaded. list(copy.values()) self.assertEqual(self.t._p_serial, copy._p_serial) # Now one transaction empties the first bucket, and another adds a # key to the first bucket. for k in range(0, 60, 4): del self.t[k] tm1.commit() copy[1] = 1 try: tm2.commit() except ConflictError, detail: self.assert_(str(detail).startswith('database conflict error')) else: self.fail("expected ConflictError") # Same thing, except commit the transactions in the opposite order. b = OOBTree() for i in range(0, 200, 4): b[i] = i tm1 = transaction.TransactionManager() r1 = self.db.open(transaction_manager=tm1).root() r1["t"] = b tm1.commit() tm2 = transaction.TransactionManager() r2 = self.db.open(transaction_manager=tm2).root() copy = r2["t"] # Make sure all of copy is loaded. list(copy.values()) self.assertEqual(b._p_serial, copy._p_serial) # Now one transaction empties the first bucket, and another adds a # key to the first bucket. b[1] = 1 tm1.commit() for k in range(0, 60, 4): del copy[k] try: tm2.commit() except ConflictError, detail: self.assert_(str(detail).startswith('database conflict error')) else: self.fail("expected ConflictError") def testConflictOfInsertAndDeleteOfFirstBucketItem(self): """ Recently, BTrees became careful about removing internal keys (keys in internal aka BTree nodes) when they were deleted from buckets. This poses a problem for conflict resolution. We want to guard against a case in which the first key in a bucket is removed in one transaction while a key is added after that key but before the next key in another transaction with the result that the added key is unreachble original: Bucket(...), k1, Bucket((k1, v1), (k3, v3), ...) tran1 Bucket(...), k3, Bucket(k3, v3), ...) tran2 Bucket(...), k1, Bucket((k1, v1), (k2, v2), (k3, v3), ...) where k1 < k2 < k3 We don't want: Bucket(...), k3, Bucket((k2, v2), (k3, v3), ...) as k2 would be unfindable, so we want a conflict. 
""" mytype = self.t_type db = self.openDB() tm1 = transaction.TransactionManager() conn1 = db.open(tm1) conn1.root.t = t = mytype() for i in range(0, 200, 2): t[i] = i tm1.commit() k = t.__getstate__()[0][1] assert t.__getstate__()[0][2].keys()[0] == k tm2 = transaction.TransactionManager() conn2 = db.open(tm2) t[k+1] = k+1 del conn2.root.t[k] for i in range(200,300): conn2.root.t[i] = i tm1.commit() self.assertRaises(ConflictError, tm2.commit) tm2.abort() k = t.__getstate__()[0][1] t[k+1] = k+1 del conn2.root.t[k] tm2.commit() self.assertRaises(ConflictError, tm1.commit) tm1.abort() def test_suite(): suite = TestSuite() for kv in ('OO', 'II', 'IO', 'OI', 'IF', 'LL', 'LO', 'OL', 'LF', ): for name, bases in (('BTree', (MappingBase, TestCase)), ('Bucket', (MappingBase, TestCase)), ('TreeSet', (SetTests, TestCase)), ('Set', (SetTests, TestCase)), ): klass = ClassType(kv + name + 'Test', bases, dict(t_type=globals()[kv+name])) suite.addTest(makeSuite(klass)) suite.addTest(makeSuite(NastyConfict)) return suite ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/tests/testLength.py000066400000000000000000000032731230730566700247250ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2008 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """\ Test for BTrees.Length module. """ __docformat__ = "reStructuredText" import BTrees.Length import copy import sys import unittest class LengthTestCase(unittest.TestCase): def test_length_overflows_to_long(self): length = BTrees.Length.Length(sys.maxint) self.assertEqual(length(), sys.maxint) self.assert_(type(length()) is int) length.change(+1) self.assertEqual(length(), sys.maxint + 1) self.assert_(type(length()) is long) def test_length_underflows_to_long(self): minint = (-sys.maxint) - 1 length = BTrees.Length.Length(minint) self.assertEqual(length(), minint) self.assert_(type(length()) is int) length.change(-1) self.assertEqual(length(), minint - 1) self.assert_(type(length()) is long) def test_copy(self): # Test for https://bugs.launchpad.net/zodb/+bug/516653 length = BTrees.Length.Length() other = copy.copy(length) self.assertEqual(other(), 0) def test_suite(): return unittest.makeSuite(LengthTestCase) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/tests/testSetOps.py000066400000000000000000000515221230730566700247210ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## from unittest import TestCase, TestSuite, TextTestRunner, makeSuite from BTrees.OOBTree import OOBTree, OOBucket, OOSet, OOTreeSet from BTrees.IOBTree import IOBTree, IOBucket, IOSet, IOTreeSet from BTrees.IFBTree import IFBTree, IFBucket, IFSet, IFTreeSet from BTrees.IIBTree import IIBTree, IIBucket, IISet, IITreeSet from BTrees.OIBTree import OIBTree, OIBucket, OISet, OITreeSet from BTrees.LOBTree import LOBTree, LOBucket, LOSet, LOTreeSet from BTrees.LFBTree import LFBTree, LFBucket, LFSet, LFTreeSet from BTrees.LLBTree import LLBTree, LLBucket, LLSet, LLTreeSet from BTrees.OLBTree import OLBTree, OLBucket, OLSet, OLTreeSet # Subclasses have to set up: # builders - functions to build inputs, taking an optional keys arg # intersection, union, difference - set to the type-correct versions class SetResult(TestCase): def setUp(self): self.Akeys = [1, 3, 5, 6 ] self.Bkeys = [ 2, 3, 4, 6, 7] self.As = [makeset(self.Akeys) for makeset in self.builders] self.Bs = [makeset(self.Bkeys) for makeset in self.builders] self.emptys = [makeset() for makeset in self.builders] # Slow but obviously correct Python implementations of basic ops. def _union(self, x, y): result = list(x) for e in y: if e not in result: result.append(e) result.sort() return result def _intersection(self, x, y): result = [] for e in x: if e in y: result.append(e) return result def _difference(self, x, y): result = list(x) for e in y: if e in result: result.remove(e) # Difference preserves LHS values. if hasattr(x, "values"): result = [(k, x[k]) for k in result] return result def testNone(self): for op in self.union, self.intersection, self.difference: C = op(None, None) self.assert_(C is None) for op in self.union, self.intersection, self.difference: for A in self.As: C = op(A, None) self.assert_(C is A) C = op(None, A) if op is self.difference: self.assert_(C is None) else: self.assert_(C is A) def testEmptyUnion(self): for A in self.As: for E in self.emptys: C = self.union(A, E) self.assert_(not hasattr(C, "values")) self.assertEqual(list(C), self.Akeys) C = self.union(E, A) self.assert_(not hasattr(C, "values")) self.assertEqual(list(C), self.Akeys) def testEmptyIntersection(self): for A in self.As: for E in self.emptys: C = self.intersection(A, E) self.assert_(not hasattr(C, "values")) self.assertEqual(list(C), []) C = self.intersection(E, A) self.assert_(not hasattr(C, "values")) self.assertEqual(list(C), []) def testEmptyDifference(self): for A in self.As: for E in self.emptys: C = self.difference(A, E) # Difference preserves LHS values. 
self.assertEqual(hasattr(C, "values"), hasattr(A, "values")) if hasattr(A, "values"): self.assertEqual(list(C.items()), list(A.items())) else: self.assertEqual(list(C), self.Akeys) C = self.difference(E, A) self.assertEqual(hasattr(C, "values"), hasattr(E, "values")) self.assertEqual(list(C), []) def testUnion(self): inputs = self.As + self.Bs for A in inputs: for B in inputs: C = self.union(A, B) self.assert_(not hasattr(C, "values")) self.assertEqual(list(C), self._union(A, B)) def testIntersection(self): inputs = self.As + self.Bs for A in inputs: for B in inputs: C = self.intersection(A, B) self.assert_(not hasattr(C, "values")) self.assertEqual(list(C), self._intersection(A, B)) def testDifference(self): inputs = self.As + self.Bs for A in inputs: for B in inputs: C = self.difference(A, B) # Difference preserves LHS values. self.assertEqual(hasattr(C, "values"), hasattr(A, "values")) want = self._difference(A, B) if hasattr(A, "values"): self.assertEqual(list(C.items()), want) else: self.assertEqual(list(C), want) def testLargerInputs(self): from random import randint MAXSIZE = 200 MAXVAL = 400 for i in range(3): n = randint(0, MAXSIZE) Akeys = [randint(1, MAXVAL) for j in range(n)] As = [makeset(Akeys) for makeset in self.builders] Akeys = IISet(Akeys) n = randint(0, MAXSIZE) Bkeys = [randint(1, MAXVAL) for j in range(n)] Bs = [makeset(Bkeys) for makeset in self.builders] Bkeys = IISet(Bkeys) for op, simulator in ((self.union, self._union), (self.intersection, self._intersection), (self.difference, self._difference)): for A in As: for B in Bs: got = op(A, B) want = simulator(Akeys, Bkeys) self.assertEqual(list(got), want, (A, B, Akeys, Bkeys, list(got), want)) # Given a mapping builder (IIBTree, OOBucket, etc), return a function # that builds an object of that type given only a list of keys. 
def makeBuilder(mapbuilder): def result(keys=[], mapbuilder=mapbuilder): return mapbuilder(zip(keys, keys)) return result class PureOO(SetResult): from BTrees.OOBTree import union, intersection, difference builders = OOSet, OOTreeSet, makeBuilder(OOBTree), makeBuilder(OOBucket) class PureII(SetResult): from BTrees.IIBTree import union, intersection, difference builders = IISet, IITreeSet, makeBuilder(IIBTree), makeBuilder(IIBucket) class PureIO(SetResult): from BTrees.IOBTree import union, intersection, difference builders = IOSet, IOTreeSet, makeBuilder(IOBTree), makeBuilder(IOBucket) class PureIF(SetResult): from BTrees.IFBTree import union, intersection, difference builders = IFSet, IFTreeSet, makeBuilder(IFBTree), makeBuilder(IFBucket) class PureOI(SetResult): from BTrees.OIBTree import union, intersection, difference builders = OISet, OITreeSet, makeBuilder(OIBTree), makeBuilder(OIBucket) class PureLL(SetResult): from BTrees.LLBTree import union, intersection, difference builders = LLSet, LLTreeSet, makeBuilder(LLBTree), makeBuilder(LLBucket) class PureLO(SetResult): from BTrees.LOBTree import union, intersection, difference builders = LOSet, LOTreeSet, makeBuilder(LOBTree), makeBuilder(LOBucket) class PureLF(SetResult): from BTrees.LFBTree import union, intersection, difference builders = LFSet, LFTreeSet, makeBuilder(LFBTree), makeBuilder(LFBucket) class PureOL(SetResult): from BTrees.OLBTree import union, intersection, difference builders = OLSet, OLTreeSet, makeBuilder(OLBTree), makeBuilder(OLBucket) # Subclasses must set up (as class variables): # multiunion, union # mkset, mktreeset # mkbucket, mkbtree class MultiUnion(TestCase): def testEmpty(self): self.assertEqual(len(self.multiunion([])), 0) def testOne(self): for sequence in [3], range(20), range(-10, 0, 2) + range(1, 10, 2): seq1 = sequence[:] seq2 = sequence[:] seq2.reverse() seqsorted = sequence[:] seqsorted.sort() for seq in seq1, seq2, seqsorted: for builder in self.mkset, self.mktreeset: input = builder(seq) output = self.multiunion([input]) self.assertEqual(len(seq), len(output)) self.assertEqual(seqsorted, list(output)) def testValuesIgnored(self): for builder in self.mkbucket, self.mkbtree: input = builder([(1, 2), (3, 4), (5, 6)]) output = self.multiunion([input]) self.assertEqual([1, 3, 5], list(output)) def testBigInput(self): N = 100000 input = self.mkset(range(N)) output = self.multiunion([input] * 10) self.assertEqual(len(output), N) self.assertEqual(output.minKey(), 0) self.assertEqual(output.maxKey(), N-1) self.assertEqual(list(output), range(N)) def testLotsOfLittleOnes(self): from random import shuffle N = 5000 inputs = [] mkset, mktreeset = self.mkset, self.mktreeset for i in range(N): base = i * 4 - N inputs.append(mkset([base, base+1])) inputs.append(mktreeset([base+2, base+3])) shuffle(inputs) output = self.multiunion(inputs) self.assertEqual(len(output), N*4) self.assertEqual(list(output), range(-N, 3*N)) def testFunkyKeyIteration(self): # The internal set iteration protocol allows "iterating over" a # a single key as if it were a set. 
N = 100 union, mkset = self.union, self.mkset slow = mkset() for i in range(N): slow = union(slow, mkset([i])) fast = self.multiunion(range(N)) # acts like N distinct singleton sets self.assertEqual(len(slow), N) self.assertEqual(len(fast), N) self.assertEqual(list(slow), list(fast)) self.assertEqual(list(fast), range(N)) class TestIIMultiUnion(MultiUnion): from BTrees.IIBTree import multiunion, union from BTrees.IIBTree import IISet as mkset, IITreeSet as mktreeset from BTrees.IIBTree import IIBucket as mkbucket, IIBTree as mkbtree class TestIOMultiUnion(MultiUnion): from BTrees.IOBTree import multiunion, union from BTrees.IOBTree import IOSet as mkset, IOTreeSet as mktreeset from BTrees.IOBTree import IOBucket as mkbucket, IOBTree as mkbtree class TestIFMultiUnion(MultiUnion): from BTrees.IFBTree import multiunion, union from BTrees.IFBTree import IFSet as mkset, IFTreeSet as mktreeset from BTrees.IFBTree import IFBucket as mkbucket, IFBTree as mkbtree class TestLLMultiUnion(MultiUnion): from BTrees.LLBTree import multiunion, union from BTrees.LLBTree import LLSet as mkset, LLTreeSet as mktreeset from BTrees.LLBTree import LLBucket as mkbucket, LLBTree as mkbtree class TestLOMultiUnion(MultiUnion): from BTrees.LOBTree import multiunion, union from BTrees.LOBTree import LOSet as mkset, LOTreeSet as mktreeset from BTrees.LOBTree import LOBucket as mkbucket, LOBTree as mkbtree class TestLFMultiUnion(MultiUnion): from BTrees.LFBTree import multiunion, union from BTrees.LFBTree import LFSet as mkset, LFTreeSet as mktreeset from BTrees.LFBTree import LFBucket as mkbucket, LFBTree as mkbtree # Check that various special module functions are and aren't imported from # the expected BTree modules. class TestImports(TestCase): def testWeightedUnion(self): from BTrees.IIBTree import weightedUnion from BTrees.OIBTree import weightedUnion try: from BTrees.IOBTree import weightedUnion except ImportError: pass else: self.fail("IOBTree shouldn't have weightedUnion") from BTrees.LLBTree import weightedUnion from BTrees.OLBTree import weightedUnion try: from BTrees.LOBTree import weightedUnion except ImportError: pass else: self.fail("LOBTree shouldn't have weightedUnion") try: from BTrees.OOBTree import weightedUnion except ImportError: pass else: self.fail("OOBTree shouldn't have weightedUnion") def testWeightedIntersection(self): from BTrees.IIBTree import weightedIntersection from BTrees.OIBTree import weightedIntersection try: from BTrees.IOBTree import weightedIntersection except ImportError: pass else: self.fail("IOBTree shouldn't have weightedIntersection") from BTrees.LLBTree import weightedIntersection from BTrees.OLBTree import weightedIntersection try: from BTrees.LOBTree import weightedIntersection except ImportError: pass else: self.fail("LOBTree shouldn't have weightedIntersection") try: from BTrees.OOBTree import weightedIntersection except ImportError: pass else: self.fail("OOBTree shouldn't have weightedIntersection") def testMultiunion(self): from BTrees.IIBTree import multiunion from BTrees.IOBTree import multiunion try: from BTrees.OIBTree import multiunion except ImportError: pass else: self.fail("OIBTree shouldn't have multiunion") from BTrees.LLBTree import multiunion from BTrees.LOBTree import multiunion try: from BTrees.OLBTree import multiunion except ImportError: pass else: self.fail("OLBTree shouldn't have multiunion") try: from BTrees.OOBTree import multiunion except ImportError: pass else: self.fail("OOBTree shouldn't have multiunion") # Subclasses must set up (as class 
variables): # weightedUnion, weightedIntersection # builders -- sequence of constructors, taking items # union, intersection -- the module routines of those names # mkbucket -- the module bucket builder class Weighted(TestCase): def setUp(self): self.Aitems = [(1, 10), (3, 30), (5, 50), (6, 60)] self.Bitems = [(2, 21), (3, 31), (4, 41), (6, 61), (7, 71)] self.As = [make(self.Aitems) for make in self.builders] self.Bs = [make(self.Bitems) for make in self.builders] self.emptys = [make([]) for make in self.builders] weights = [] for w1 in -3, -1, 0, 1, 7: for w2 in -3, -1, 0, 1, 7: weights.append((w1, w2)) self.weights = weights def testBothNone(self): for op in self.weightedUnion, self.weightedIntersection: w, C = op(None, None) self.assert_(C is None) self.assertEqual(w, 0) w, C = op(None, None, 42, 666) self.assert_(C is None) self.assertEqual(w, 0) def testLeftNone(self): for op in self.weightedUnion, self.weightedIntersection: for A in self.As + self.emptys: w, C = op(None, A) self.assert_(C is A) self.assertEqual(w, 1) w, C = op(None, A, 42, 666) self.assert_(C is A) self.assertEqual(w, 666) def testRightNone(self): for op in self.weightedUnion, self.weightedIntersection: for A in self.As + self.emptys: w, C = op(A, None) self.assert_(C is A) self.assertEqual(w, 1) w, C = op(A, None, 42, 666) self.assert_(C is A) self.assertEqual(w, 42) # If obj is a set, return a bucket with values all 1; else return obj. def _normalize(self, obj): if isaset(obj): obj = self.mkbucket(zip(obj, [1] * len(obj))) return obj # Python simulation of weightedUnion. def _wunion(self, A, B, w1=1, w2=1): if isaset(A) and isaset(B): return 1, self.union(A, B).keys() A = self._normalize(A) B = self._normalize(B) result = [] for key in self.union(A, B): v1 = A.get(key, 0) v2 = B.get(key, 0) result.append((key, v1*w1 + v2*w2)) return 1, result def testUnion(self): inputs = self.As + self.Bs + self.emptys for A in inputs: for B in inputs: want_w, want_s = self._wunion(A, B) got_w, got_s = self.weightedUnion(A, B) self.assertEqual(got_w, want_w) if isaset(got_s): self.assertEqual(got_s.keys(), want_s) else: self.assertEqual(got_s.items(), want_s) for w1, w2 in self.weights: want_w, want_s = self._wunion(A, B, w1, w2) got_w, got_s = self.weightedUnion(A, B, w1, w2) self.assertEqual(got_w, want_w) if isaset(got_s): self.assertEqual(got_s.keys(), want_s) else: self.assertEqual(got_s.items(), want_s) # Python simulation weightedIntersection. def _wintersection(self, A, B, w1=1, w2=1): if isaset(A) and isaset(B): return w1 + w2, self.intersection(A, B).keys() A = self._normalize(A) B = self._normalize(B) result = [] for key in self.intersection(A, B): result.append((key, A[key]*w1 + B[key]*w2)) return 1, result def testIntersection(self): inputs = self.As + self.Bs + self.emptys for A in inputs: for B in inputs: want_w, want_s = self._wintersection(A, B) got_w, got_s = self.weightedIntersection(A, B) self.assertEqual(got_w, want_w) if isaset(got_s): self.assertEqual(got_s.keys(), want_s) else: self.assertEqual(got_s.items(), want_s) for w1, w2 in self.weights: want_w, want_s = self._wintersection(A, B, w1, w2) got_w, got_s = self.weightedIntersection(A, B, w1, w2) self.assertEqual(got_w, want_w) if isaset(got_s): self.assertEqual(got_s.keys(), want_s) else: self.assertEqual(got_s.items(), want_s) # Given a set builder (like OITreeSet or OISet), return a function that # takes a list of (key, value) pairs and builds a set out of the keys. 
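# e.g. itemsToSet(OISet)([('a', 1), ('b', 2)]) builds OISet(['a', 'b']),
# discarding the values.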
def itemsToSet(setbuilder): def result(items, setbuilder=setbuilder): return setbuilder([key for key, value in items]) return result class TestWeightedII(Weighted): from BTrees.IIBTree import weightedUnion, weightedIntersection from BTrees.IIBTree import union, intersection from BTrees.IIBTree import IIBucket as mkbucket builders = IIBucket, IIBTree, itemsToSet(IISet), itemsToSet(IITreeSet) class TestWeightedOI(Weighted): from BTrees.OIBTree import weightedUnion, weightedIntersection from BTrees.OIBTree import union, intersection from BTrees.OIBTree import OIBucket as mkbucket builders = OIBucket, OIBTree, itemsToSet(OISet), itemsToSet(OITreeSet) class TestWeightedLL(Weighted): from BTrees.LLBTree import weightedUnion, weightedIntersection from BTrees.LLBTree import union, intersection from BTrees.LLBTree import LLBucket as mkbucket builders = LLBucket, LLBTree, itemsToSet(LLSet), itemsToSet(LLTreeSet) class TestWeightedOL(Weighted): from BTrees.OLBTree import weightedUnion, weightedIntersection from BTrees.OLBTree import union, intersection from BTrees.OLBTree import OLBucket as mkbucket builders = OLBucket, OLBTree, itemsToSet(OLSet), itemsToSet(OLTreeSet) # 'thing' is a bucket, btree, set or treeset. Return true iff it's one of the # latter two. def isaset(thing): return not hasattr(thing, 'values') def test_suite(): s = TestSuite() for klass in ( TestIIMultiUnion, TestIOMultiUnion, TestIFMultiUnion, TestLLMultiUnion, TestLOMultiUnion, TestLFMultiUnion, TestImports, PureOO, PureII, PureIO, PureIF, PureOI, PureLL, PureLO, PureLF, PureOL, TestWeightedII, TestWeightedOI, TestWeightedLL, TestWeightedOL, ): s.addTest(makeSuite(klass)) return s def main(): TextTestRunner().run(test_suite()) if __name__ == '__main__': main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/tests/test_btreesubclass.py000066400000000000000000000026601230730566700265030ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## from BTrees.OOBTree import OOBTree, OOBucket class B(OOBucket): pass class T(OOBTree): _bucket_type = B import unittest class SubclassTest(unittest.TestCase): def testSubclass(self): # test that a subclass that defines _bucket_type gets buckets # of that type t = T() # There's no good way to get a bucket at the moment. # __getstate__() is as good as it gets, but the default # getstate explicitly includes the pickle of the bucket # for small trees, so we have to be clever :-( # make sure there is more than one bucket in the tree for i in range(1000): t[i] = i state = t.__getstate__() self.assert_(state[0][0].__class__ is B) def test_suite(): return unittest.makeSuite(SubclassTest) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/tests/test_check.py000066400000000000000000000066501230730566700247220ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. 
# All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Test the BTree check.check() function.""" import unittest from BTrees.OOBTree import OOBTree from BTrees.check import check class CheckTest(unittest.TestCase): def setUp(self): self.t = t = OOBTree() for i in range(31): t[i] = 2*i self.state = t.__getstate__() def testNormal(self): s = self.state # Looks like (state, first_bucket) # where state looks like (bucket0, 15, bucket1). self.assertEqual(len(s), 2) self.assertEqual(len(s[0]), 3) self.assertEqual(s[0][1], 15) self.t._check() # shouldn't blow up check(self.t) # shouldn't blow up def testKeyTooLarge(self): # Damage an invariant by dropping the BTree key to 14. s = self.state news = (s[0][0], 14, s[0][2]), s[1] self.t.__setstate__(news) self.t._check() # not caught try: # Expecting "... key %r >= upper bound %r at index %d" check(self.t) except AssertionError, detail: self.failUnless(str(detail).find(">= upper bound") > 0) else: self.fail("expected self.t_check() to catch the problem") def testKeyTooSmall(self): # Damage an invariant by bumping the BTree key to 16. s = self.state news = (s[0][0], 16, s[0][2]), s[1] self.t.__setstate__(news) self.t._check() # not caught try: # Expecting "... key %r < lower bound %r at index %d" check(self.t) except AssertionError, detail: self.failUnless(str(detail).find("< lower bound") > 0) else: self.fail("expected self.t_check() to catch the problem") def testKeysSwapped(self): # Damage an invariant by swapping two key/value pairs. s = self.state # Looks like (state, first_bucket) # where state looks like (bucket0, 15, bucket1). (b0, num, b1), firstbucket = s self.assertEqual(b0[4], 8) self.assertEqual(b0[5], 10) b0state = b0.__getstate__() self.assertEqual(len(b0state), 2) # b0state looks like # ((k0, v0, k1, v1, ...), nextbucket) pairs, nextbucket = b0state self.assertEqual(pairs[8], 4) self.assertEqual(pairs[9], 8) self.assertEqual(pairs[10], 5) self.assertEqual(pairs[11], 10) newpairs = pairs[:8] + (5, 10, 4, 8) + pairs[12:] b0.__setstate__((newpairs, nextbucket)) self.t._check() # not caught try: check(self.t) except AssertionError, detail: self.failUnless(str(detail).find( "key 5 at index 4 >= key 4 at index 5") > 0) else: self.fail("expected self.t_check() to catch the problem") def test_suite(): return unittest.makeSuite(CheckTest) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/tests/test_compare.py000066400000000000000000000053441230730566700252720ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## """Test errors during comparison of BTree keys.""" import unittest from BTrees.OOBTree import OOBucket as Bucket, OOSet as Set import transaction from ZODB.MappingStorage import MappingStorage from ZODB.DB import DB class CompareTest(unittest.TestCase): s = "A string with hi-bit-set characters: \700\701" u = u"A unicode string" def setUp(self): # These defaults only make sense if the default encoding # prevents s from being promoted to Unicode. self.assertRaises(UnicodeError, unicode, self.s) # An object needs to be added to the database to self.db = DB(MappingStorage()) root = self.db.open().root() self.bucket = root["bucket"] = Bucket() self.set = root["set"] = Set() transaction.commit() def tearDown(self): self.assert_(self.bucket._p_changed != 2) self.assert_(self.set._p_changed != 2) transaction.abort() def assertUE(self, callable, *args): self.assertRaises(UnicodeError, callable, *args) def testBucketGet(self): import sys import warnings _warnlog = [] def _showwarning(*args, **kw): _warnlog.append((args, kw)) warnings.showwarning, _before = _showwarning, warnings.showwarning try: self.bucket[self.s] = 1 self.assertUE(self.bucket.get, self.u) finally: warnings.showwarning = _before if sys.version_info >= (2, 6): self.assertEqual(len(_warnlog), 1) def testSetGet(self): self.set.insert(self.s) self.assertUE(self.set.remove, self.u) def testBucketSet(self): self.bucket[self.s] = 1 self.assertUE(self.bucket.__setitem__, self.u, 1) def testSetSet(self): self.set.insert(self.s) self.assertUE(self.set.insert, self.u) def testBucketMinKey(self): self.bucket[self.s] = 1 self.assertUE(self.bucket.minKey, self.u) def testSetMinKey(self): self.set.insert(self.s) self.assertUE(self.set.minKey, self.u) def test_suite(): return unittest.makeSuite(CompareTest) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/BTrees/tests/test_fsBTree.py000066400000000000000000000022441230730566700251720ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2010 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## import doctest import unittest def test_fsbucket_string_conversion(): """ fsBuckets have toString and fromString methods that can be used to get and set their state very efficiently: >>> from BTrees.fsBTree import fsBucket >>> b = fsBucket([(c*2, c*6) for c in 'abcdef']) >>> import pprint >>> b.toString() 'aabbccddeeffaaaaaabbbbbbccccccddddddeeeeeeffffff' >>> b2 = fsBucket().fromString(b.toString()) >>> b.__getstate__() == b2.__getstate__() True """ def test_suite(): return doctest.DocTestSuite() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/CHANGES.txt000066400000000000000000000310521230730566700215110ustar00rootroot00000000000000================ Change History ================ 3.10.6 (unreleased) =================== Bugs Fixed ---------- - POSKeyError during transaction.commit when after savepoint.rollback - Ensured that the export file and index file created by ``repozo`` share the same timestamp. https://bugs.launchpad.net/zodb/+bug/993350 - Pinned the ``transaction`` and ``manuel`` dependencies to Python 2.5- compatible versions when installing under Python 2.5. 3.10.5 (2011-11-19) =================== Bugs Fixed ---------- - Conflict resolution failed when state included cross-database persistent references with classes that couldn't be imported. 3.10.4 (2011-11-17) =================== Bugs Fixed ---------- - Conflict resolution failed when state included persistent references with classes that couldn't be imported. 3.10.3 (2011-04-12) =================== Bugs Fixed ---------- - "activity monitor not updated for subconnections when connection returned to pool" https://bugs.launchpad.net/zodb/+bug/737198 - "Blob temp file get's removed before it should", https://bugs.launchpad.net/zodb/+bug/595378 A way this to happen is that a transaction is aborted after the commit process has started. I don't know how this would happen in the wild. In 3.10.3, the ZEO tpc_abort call to the server is changed to be synchronous, which should address this case. Maybe there's another case. Performance enhancements ------------------------ - Improved ZEO client cache implementation to make it less likely to evict objects that are being used. - Small (possibly negligable) reduction in CPU in ZEO storage servers to service object loads and in networking code. 3.10.2 (2011-02-12) =================== Bugs Fixed ---------- - 3.10 introduced an optimization to try to address BTree conflict errors arrising for basing BTree keys on object ids. The optimization caused object ids allocated in aborted transactions to be reused. Unfortunately, this optimzation led to some rather severe failures in some applications. The symptom is a conflict error in which one of the serials mentioned is zero. This optimization has been removed. See (for example): https://bugs.launchpad.net/zodb/+bug/665452 - ZEO server transaction timeouts weren't logged as critical. https://bugs.launchpad.net/zodb/+bug/670986 3.10.1 (2010-10-27) =================== Bugs Fixed ---------- - When a transaction rolled back a savepoint after adding objects and subsequently added more objects and committed, an error could be raised "ValueError: A different object already has the same oid" causing the transaction to fail. Worse, this could leave a database in a state where subsequent transactions in the same process would fail. https://bugs.launchpad.net/zodb/+bug/665452 - Unix domain sockets didn't work for ZEO (since the addition of IPv6 support). 
  https://bugs.launchpad.net/zodb/+bug/663259

- Removed a misfeature that can cause performance problems when using
  an external garbage collector with ZEO. When objects were deleted
  from a storage, invalidations were sent to clients. This makes no
  sense. It's wildly unlikely that the other connections/clients have
  copies of the garbage. In normal storage garbage collection, we
  don't send invalidations. There's no reason to send them when an
  external garbage collector is used.

- ZEO client cache simulation mishandled invalidations, causing
  incorrect statistics and errors.

3.10.0 (2010-10-08)
===================

New Features
------------

- There are a number of performance enhancements for ZEO storage
  servers.

- FileStorage indexes use a new format. They are saved and loaded much
  faster and take less space. Old indexes can still be read, but new
  indexes won't be readable by older versions of ZODB.

- The API for undoing multiple transactions has changed. To undo
  multiple transactions in a single transaction, pass a list of
  transaction identifiers to a database's undoMultiple method. Calling
  a database's undo method multiple times in the same transaction now
  raises an exception.

- The ZEO protocol for undo has changed. The only user-visible
  consequence of this is that ZODB 3.10 ZEO servers won't support undo
  for older clients.

- The storage API (IStorage) has been tightened. Now, storages should
  raise a StorageTransactionError when invalid transactions are passed
  to tpc_begin, tpc_vote, or tpc_finish.

- ZEO clients (``ClientStorage`` instances) now work in forked
  processes, including those created via ``multiprocessing.Process``
  instances.

- Broken objects now provide the IBroken interface.

- As a convenience, you can now pass an integer port as an address to
  the ZEO ClientStorage constructor.

- As a convenience, there's a new ``client`` function in the ZEO
  package for constructing a ClientStorage instance. It takes the same
  arguments as the ClientStorage constructor.

- DemoStorages now accept constructor arguments, close_base_on_close
  and close_changes_on_close, to control whether underlying storages
  are closed when the DemoStorage is closed.

  https://bugs.launchpad.net/zodb/+bug/118512

- Removed the dependency on zope.proxy.

- Removed support for the _p_independent mini framework, which was
  made moot by the introduction of multi-version concurrency control
  several years ago.

- Added support for the transaction retry convenience
  (transaction-manager attempts method) introduced in the
  ``transaction`` 1.1.0 release.

- Enhanced the database opening conveniences:

  - You can now pass storage keyword arguments to ZODB.DB and
    ZODB.connection.

  - You can now pass None (rather than a storage or file name) to get
    a database with a mapping storage.

- Databases now warn when committing very large records (> 16MB). This
  is to try to warn people of likely design mistakes. There is a new
  option (large_record_size/large-record-size) to control the record
  size at which the warning is issued.

- Added support for wrapper storages that transform pickle data.
  Applications for this include compression and encryption. An example
  wrapper storage implementation, ZODB.tests.hexstorage, was included
  for testing.

  It is important that storage implementations not assume that
  storages contain pickles. Renamed IStorageDB to IStorageWrapper and
  expanded it to provide methods for transforming and untransforming
  data records. Storage implementations should use these methods to
  get pickle data from stored records. A minimal, hypothetical sketch
  of such a wrapper appears below.
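  As a rough illustration only (this is not the bundled
  ZODB.tests.hexstorage code, and the method names
  transform_record_data / untransform_record_data are assumed from
  IStorageWrapper), a hex-encoding transformer might look like::

    import binascii

    class HexRecordTransformer(object):
        # Hypothetical sketch: encode record data on the way into the
        # underlying storage and decode it on the way back out.

        def transform_record_data(self, data):
            # Prefix the hex-encoded record so transformed records can
            # be recognized later.
            return '.h' + binascii.hexlify(data)

        def untransform_record_data(self, data):
            # Undo the transformation; pass untransformed records
            # through unchanged.
            if data[:2] == '.h':
                return binascii.unhexlify(data[2:])
            return data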
- Deprecated ZODB.interfaces.StorageStopIteration. Storage iterator
  implementations should just raise StopIteration, which means they
  can now be implemented as generators.

- The filestorage packer configuration option now accepts values of
  the form ``modname:expression``, allowing the use of packer
  factories with options.

- Added a new API that allows applications to make sure that current
  data are read. For example, with::

    self._p_jar.readCurrent(ob)

  A conflict error will be raised if the version of ob read by the
  transaction isn't current when the transaction is committed.

  Normally, ZODB only assures that objects read are consistent, but
  not necessarily up to date. Checking whether an object is up to date
  is important when information read from one object is used to update
  another.

  BTrees are an important case of reading one object to update
  another. Internal nodes are read to decide which leaf nodes are
  updated when a BTree is updated. BTrees now use this new API to make
  sure that internal nodes are up to date on updates.

- When transactions are aborted, new object ids allocated during the
  transaction are saved and used in subsequent transactions. This can
  help in situations where object ids are used as BTree keys and the
  sequential allocation of object ids leads to conflict errors.

- ZEO servers now support a server_status method for getting
  information on the number of clients, lock requests and general
  statistics.

- ZEO clients now support a client_label constructor argument and
  client-label configuration-file option to specify a label for a
  client in server logs. This makes it easier to identify specific
  clients corresponding to server log entries, especially when there
  are multiple clients originating from the same machine.

- Improved ZEO server commit lock logging. Now, locking activity is
  logged at the debug level until the number of waiting lock requests
  gets above 3. Log at the critical level when the number of waiting
  lock requests gets above 9.

- The file-storage backup script, repozo, will now create a backup
  index file if an output file name is given via the --output/-o
  option.

- Added a '--kill-old-on-full' argument to the repozo backup options:
  if passed, remove any older full or incremental backup files from
  the repository after doing a full backup.
  (https://bugs.launchpad.net/zope2/+bug/143158)

- The mkzeoinst script has been moved to a separate project:
  http://pypi.python.org/pypi/zope.mkzeoinstance and is no longer
  included with ZODB.

- Removed the untested, unsupported dbmstorage fossil.

- ZEO servers no longer log their pids in every log message. It's just
  not interesting. :)

Bugs fixed
----------

- When a pool timeout was specified for a database and old connections
  were removed due to timing out, an error occurred due to a bug in
  the connection cleanup logic.

- When multi-database connections were no longer used and cleaned up,
  their subconnections weren't cleaned up properly.

- ZEO didn't work with IPv6 addresses. Added IPv6 support contributed
  by Martin v. Löwis.

- A file storage bug could cause ZEO clients to have incorrect
  information about current object revisions after reconnecting to a
  database server.

- Updated the 'repozo --kill-old-on-full' option to remove any
  '.index' files corresponding to backups being removed.

- ZEO extension methods failed when a client reconnected to a storage.
  (https://bugs.launchpad.net/zodb/+bug/143344)

- Clarified the return value for lastTransaction in the case when
  there aren't any transactions.
Now a string of 8 nulls (aka "z64") is specified. - Setting _p_changed on a blob wo actually writing anything caused an error. (https://bugs.launchpad.net/zodb/+bug/440234) - The verbose mode of the fstest was broken. (https://bugs.launchpad.net/zodb/+bug/475996) - Object ids created in a savepoint that is rolled back wren't being reused. (https://bugs.launchpad.net/zodb/+bug/588389) - Database connections didn't invalidate cache entries when conflict errors were raised in response to checkCurrentSerialInTransaction errors. Normally, this shouldn't be a problem, since there should be pending invalidations for these oids which will cause the object to be invalidated. There have been issues with ZEO persistent cache management that have caused out of date data to remain in the cache. (It's possible that the last of these were addressed in the 3.10.0b5.) Invalidating read data when there is a conflict error provides some extra insurance. - The interface, ZODB.interfaces.IStorage was incorrect. The store method should never return a sequence of oid and serial pairs. - When a demo storage push method was used to create a new demo storage and the new storage was closed, the original was (incorrectly) closed. - There were numerous bugs in the ZEO cache tracing and analysis code. Cache simulation, while not perfect, seems to be much more accurate now than it was before. The ZEO cache trace statistics and simulation scripts have been given more descriptive names and moved to the ZEO scripts package. - BTree sets and tree sets didn't correctly check values passed to update or to constructors, causing Python to exit under certain circumstances. - Fixed bug in copying a BTrees.Length instance. (https://bugs.launchpad.net/zodb/+bug/516653) - Fixed a serious bug that caused cache failures when run with Python optimization turned on. https://bugs.launchpad.net/zodb/+bug/544305 - When using using a ClientStorage in a Storage server, there was a threading bug that caused clients to get disconnected. - On Mac OS X, clients that connected and disconnected quickly could cause a ZEO server to stop accepting connections, due to a failure to catch errors in the initial part of the connection process. The failure to properly handle exceptions while accepting connections is potentially problematic on other platforms. Fixes: https://bugs.launchpad.net/zodb/+bug/135108 - Object state management wasn't done correctly when classes implemented custom _p_deavtivate methods. (https://bugs.launchpad.net/zodb/+bug/185066) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/000077500000000000000000000000001230730566700203345ustar00rootroot00000000000000ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/ClientStorage.py000066400000000000000000001760141230730566700234620ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002, 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """The ClientStorage class and the exceptions that it may raise. 
Public contents of this module: ClientStorage -- the main class, implementing the Storage API """ from persistent.TimeStamp import TimeStamp from ZEO.auth import get_module from ZEO.cache import ClientCache from ZEO.Exceptions import ClientStorageError, ClientDisconnected, AuthError from ZEO import ServerStub from ZEO.TransactionBuffer import TransactionBuffer from ZEO.zrpc.client import ConnectionManager from ZODB import POSException from ZODB import utils import BTrees.IOBTree import cPickle import logging import os import re import socket import stat import sys import tempfile import thread import threading import time import types import weakref import zc.lockfile import ZEO.interfaces import ZODB import ZODB.BaseStorage import ZODB.interfaces import zope.event import zope.interface logger = logging.getLogger(__name__) try: from ZODB.ConflictResolution import ResolvedSerial except ImportError: ResolvedSerial = 'rs' def tid2time(tid): return str(TimeStamp(tid)) def get_timestamp(prev_ts=None): """Internal helper to return a unique TimeStamp instance. If the optional argument is not None, it must be a TimeStamp; the return value is then guaranteed to be at least 1 microsecond later the argument. """ t = time.time() t = TimeStamp(*time.gmtime(t)[:5] + (t % 60,)) if prev_ts is not None: t = t.laterThan(prev_ts) return t class DisconnectedServerStub: """Internal helper class used as a faux RPC stub when disconnected. This raises ClientDisconnected on all attribute accesses. This is a singleton class -- there should be only one instance, the global disconnected_stub, so it can be tested by identity. """ def __getattr__(self, attr): raise ClientDisconnected() # Singleton instance of DisconnectedServerStub disconnected_stub = DisconnectedServerStub() MB = 1024**2 class ClientStorage(object): """A storage class that is a network client to a remote storage. This is a faithful implementation of the Storage API. This class is thread-safe; transactions are serialized in tpc_begin(). """ # ClientStorage does not declare any interfaces here. Interfaces are # declared according to the server's storage once a connection is # established. # Classes we instantiate. A subclass might override. TransactionBufferClass = TransactionBuffer ClientCacheClass = ClientCache ConnectionManagerClass = ConnectionManager StorageServerStubClass = ServerStub.stub def __init__(self, addr, storage='1', cache_size=20 * MB, name='', client=None, var=None, min_disconnect_poll=1, max_disconnect_poll=30, wait_for_server_on_startup=None, # deprecated alias for wait wait=None, wait_timeout=None, read_only=0, read_only_fallback=0, drop_cache_rather_verify=False, username='', password='', realm=None, blob_dir=None, shared_blob_dir=False, blob_cache_size=None, blob_cache_size_check=10, client_label=None, ): """ClientStorage constructor. This is typically invoked from a custom_zodb.py file. All arguments except addr should be keyword arguments. Arguments: addr The server address(es). This is either a list of addresses or a single address. Each address can be a (hostname, port) tuple to signify a TCP/IP connection or a pathname string to signify a Unix domain socket connection. A hostname may be a DNS name or a dotted IP address. Required. storage The storage name, defaulting to '1'. The name must match one of the storage names supported by the server(s) specified by the addr argument. The storage name is displayed in the Zope control panel. cache_size The disk cache size, defaulting to 20 megabytes. 
This is passed to the ClientCache constructor. name The storage name, defaulting to ''. If this is false, str(addr) is used as the storage name. client A name used to construct persistent cache filenames. Defaults to None, in which case the cache is not persistent. See ClientCache for more info. var When client is not None, this specifies the directory where the persistent cache files are created. It defaults to None, in whichcase the current directory is used. min_disconnect_poll The minimum delay in seconds between attempts to connect to the server, in seconds. Defaults to 5 seconds. max_disconnect_poll The maximum delay in seconds between attempts to connect to the server, in seconds. Defaults to 300 seconds. wait_for_server_on_startup A backwards compatible alias for the wait argument. wait A flag indicating whether to wait until a connection with a server is made, defaulting to true. wait_timeout Maximum time to wait for a connection before giving up. Only meaningful if wait is True. read_only A flag indicating whether this should be a read-only storage, defaulting to false (i.e. writing is allowed by default). read_only_fallback A flag indicating whether a read-only remote storage should be acceptable as a fallback when no writable storages are available. Defaults to false. At most one of read_only and read_only_fallback should be true. username string with username to be used when authenticating. These only need to be provided if you are connecting to an authenticated server storage. password string with plaintext password to be used when authenticated. realm not documented. drop_cache_rather_verify a flag indicating that the cache should be dropped rather than expensively verified. blob_dir directory path for blob data. 'blob data' is data that is retrieved via the loadBlob API. shared_blob_dir Flag whether the blob_dir is a server-shared filesystem that should be used instead of transferring blob data over zrpc. blob_cache_size Maximum size of the ZEO blob cache, in bytes. If not set, then the cache size isn't checked and the blob directory will grow without bound. This option is ignored if shared_blob_dir is true. blob_cache_size_check ZEO check size as percent of blob_cache_size. The ZEO cache size will be checked when this many bytes have been loaded into the cache. Defaults to 10% of the blob cache size. This option is ignored if shared_blob_dir is true. client_label A label to include in server log messages for the client. Note that the authentication protocol is defined by the server and is detected by the ClientStorage upon connecting (see testConnection() and doAuth() for details). 
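        A typical invocation from a custom_zodb.py file (the address,
        cache size, and paths below are illustrative examples chosen
        for this sketch, not defaults) looks something like::

            from ZEO.ClientStorage import ClientStorage

            storage = ClientStorage(
                ('zeo.example.com', 8100),     # (hostname, port) of the ZEO server
                storage='1',                   # storage name on the server
                cache_size=200 * 1024 * 1024,  # 200 MB client cache
                blob_dir='/var/zeo/blobs',     # local blob cache directory
                wait=True,                     # block until connected
                )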
""" if isinstance(addr, int): addr = '127.0.0.1', addr self.__name__ = name or str(addr) # Standard convention for storages logger.info( "%s %s (pid=%d) created %s/%s for storage: %r", self.__name__, self.__class__.__name__, os.getpid(), read_only and "RO" or "RW", read_only_fallback and "fallback" or "normal", storage, ) self._drop_cache_rather_verify = drop_cache_rather_verify # wait defaults to True, but wait_for_server_on_startup overrides # if not None if wait_for_server_on_startup is not None: if wait is not None and wait != wait_for_server_on_startup: logger.warning( "%s ClientStorage(): conflicting values for wait and " "wait_for_server_on_startup; wait prevails", self.__name__) else: logger.info( "%s ClientStorage(): wait_for_server_on_startup " "is deprecated; please use wait instead", self.__name__) wait = wait_for_server_on_startup elif wait is None: wait = 1 self._addr = addr # For tests # A ZEO client can run in disconnected mode, using data from # its cache, or in connected mode. Several instance variables # are related to whether the client is connected. # _server: All method calls are invoked through the server # stub. When not connect, set to disconnected_stub an # object that raises ClientDisconnected errors. # _ready: A threading Event that is set only if _server # is set to a real stub. # _connection: The current zrpc connection or None. # _connection is set as soon as a connection is established, # but _server is set only after cache verification has finished # and clients can safely use the server. _pending_server holds # a server stub while it is being verified. self._server = disconnected_stub self._connection = None self._pending_server = None self._ready = threading.Event() # _is_read_only stores the constructor argument self._is_read_only = read_only self._storage = storage self._read_only_fallback = read_only_fallback self._username = username self._password = password self._realm = realm self._iterators = weakref.WeakValueDictionary() self._iterator_ids = set() # Flag tracking disconnections in the middle of a transaction. This # is reset in tpc_begin() and set in notifyDisconnected(). self._midtxn_disconnect = 0 # _server_addr is used by sortKey() self._server_addr = None self._client_label = client_label self._pickler = self._tfile = None self._info = {'length': 0, 'size': 0, 'name': 'ZEO Client', 'supportsUndo': 0, 'interfaces': ()} self._tbuf = self.TransactionBufferClass() self._db = None self._ltid = None # the last committed transaction # _serials: stores (oid, serialno) as returned by server # _seriald: _check_serials() moves from _serials to _seriald, # which maps oid to serialno # TODO: If serial number matches transaction id, then there is # no need to have all this extra infrastructure for handling # serial numbers. The vote call can just return the tid. # If there is a conflict error, we can't have a special method # called just to propagate the error. self._serials = [] self._seriald = {} # A ClientStorage only allows one thread to commit at a time. # Mutual exclusion is achieved using _tpc_cond, which # protects _transaction. A thread that wants to assign to # self._transaction must acquire _tpc_cond first. A thread # that decides it's done with a transaction (whether via success # or failure) must set _transaction to None and do # _tpc_cond.notify() before releasing _tpc_cond. self._tpc_cond = threading.Condition() self._transaction = None # Prevent multiple new_oid calls from going out. 
The _oids # variable should only be modified while holding the # _oid_lock. self._oid_lock = threading.Lock() self._oids = [] # Object ids retrieved from new_oids() # load() and tpc_finish() must be serialized to guarantee # that cache modifications from each occur atomically. # It also prevents multiple load calls occuring simultaneously, # which simplifies the cache logic. self._load_lock = threading.Lock() # _load_oid and _load_status are protected by _lock self._load_oid = None self._load_status = None # Can't read data in one thread while writing data # (tpc_finish) in another thread. In general, the lock # must prevent access to the cache while _update_cache # is executing. self._lock = threading.Lock() # XXX need to check for POSIX-ness here self.blob_dir = blob_dir self.shared_blob_dir = shared_blob_dir if blob_dir is not None: # Avoid doing this import unless we need it, as it # currently requires pywin32 on Windows. import ZODB.blob if shared_blob_dir: self.fshelper = ZODB.blob.FilesystemHelper(blob_dir) else: if 'zeocache' not in ZODB.blob.LAYOUTS: ZODB.blob.LAYOUTS['zeocache'] = BlobCacheLayout() self.fshelper = ZODB.blob.FilesystemHelper( blob_dir, layout_name='zeocache') self.fshelper.create() self.fshelper.checkSecure() else: self.fshelper = None if client is not None: dir = var or os.getcwd() cache_path = os.path.join(dir, "%s-%s.zec" % (client, storage)) else: cache_path = None self._cache = self.ClientCacheClass(cache_path, size=cache_size) self._blob_cache_size = blob_cache_size self._blob_data_bytes_loaded = 0 if blob_cache_size is not None: assert blob_cache_size_check < 100 self._blob_cache_size_check = ( blob_cache_size * blob_cache_size_check / 100) self._check_blob_size() self._rpc_mgr = self.ConnectionManagerClass(addr, self, tmin=min_disconnect_poll, tmax=max_disconnect_poll) if wait: self._wait(wait_timeout) else: # attempt_connect() will make an attempt that doesn't block # "too long," for a very vague notion of too long. If that # doesn't succeed, call connect() to start a thread. if not self._rpc_mgr.attempt_connect(): self._rpc_mgr.connect() def _wait(self, timeout=None): if timeout is not None: deadline = time.time() + timeout logger.debug("%s Setting deadline to %f", self.__name__, deadline) else: deadline = None # Wait for a connection to be established. self._rpc_mgr.connect(sync=1) # When a synchronous connect() call returns, there is # a valid _connection object but cache validation may # still be going on. This code must wait until validation # finishes, but if the connection isn't a zrpc async # connection it also needs to poll for input. while 1: self._ready.wait(30) if self._ready.isSet(): break if timeout and time.time() > deadline: logger.warning("%s Timed out waiting for connection", self.__name__) break logger.info("%s Waiting for cache verification to finish", self.__name__) def close(self): "Storage API: finalize the storage, releasing external resources." _rpc_mgr = self._rpc_mgr self._rpc_mgr = None if _rpc_mgr is None: return # already closed if self._connection is not None: self._connection.register_object(None) # Don't call me! 
self._connection = None _rpc_mgr.close() self._tbuf.close() if self._cache is not None: self._cache.close() self._cache = None if self._tfile is not None: self._tfile.close() if self._check_blob_size_thread is not None: self._check_blob_size_thread.join() _check_blob_size_thread = None def _check_blob_size(self, bytes=None): if self._blob_cache_size is None: return if self.shared_blob_dir or not self.blob_dir: return if (bytes is not None) and (bytes < self._blob_cache_size_check): return self._blob_data_bytes_loaded = 0 target = max(self._blob_cache_size - self._blob_cache_size_check, 0) check_blob_size_thread = threading.Thread( target=_check_blob_cache_size, args=(self.blob_dir, target), ) check_blob_size_thread.setDaemon(True) check_blob_size_thread.start() self._check_blob_size_thread = check_blob_size_thread def registerDB(self, db): """Storage API: register a database for invalidation messages. This is called by ZODB.DB (and by some tests). The storage isn't really ready to use until after this call. """ self._db = db def is_connected(self): """Return whether the storage is currently connected to a server.""" # This function is used by clients, so we only report that a # connection exists when the connection is ready to use. return self._ready.isSet() def sync(self): # The separate async thread should keep us up to date pass def doAuth(self, protocol, stub): if not (self._username and self._password): raise AuthError("empty username or password") module = get_module(protocol) if not module: logger.error("%s %s: no such an auth protocol: %s", self.__name__, self.__class__.__name__, protocol) return storage_class, client, db_class = module if not client: logger.error( "%s %s: %s isn't a valid protocol, must have a Client class", self.__name__, self.__class__.__name__, protocol) raise AuthError("invalid protocol") c = client(stub) # Initiate authentication, returns boolean specifying whether OK return c.start(self._username, self._realm, self._password) def testConnection(self, conn): """Internal: test the given connection. This returns: 1 if the connection is an optimal match, 0 if it is a suboptimal but acceptable match. It can also raise DisconnectedError or ReadOnlyError. This is called by ZEO.zrpc.ConnectionManager to decide which connection to use in case there are multiple, and some are read-only and others are read-write. This works by calling register() on the server. In read-only mode, register() is called with the read_only flag set. In writable mode and in read-only fallback mode, register() is called with the read_only flag cleared. In read-only fallback mode only, if the register() call raises ReadOnlyError, it is retried with the read-only flag set, and if this succeeds, this is deemed a suboptimal match. In all other cases, a succeeding register() call is deemed an optimal match, and any exception raised by register() is passed through. """ logger.info("%s Testing connection %r", self.__name__, conn) # TODO: Should we check the protocol version here? 
conn._is_read_only = self._is_read_only stub = self.StorageServerStubClass(conn) auth = stub.getAuthProtocol() logger.info("%s Server authentication protocol %r", self.__name__, auth) if auth: skey = self.doAuth(auth, stub) if skey: logger.info("%s Client authentication successful", self.__name__) conn.setSessionKey(skey) else: logger.info("%s Authentication failed", self.__name__) raise AuthError("Authentication failed") try: stub.register(str(self._storage), self._is_read_only) return 1 except POSException.ReadOnlyError: if not self._read_only_fallback: raise logger.info("%s Got ReadOnlyError; trying again with read_only=1", self.__name__) stub.register(str(self._storage), read_only=1) conn._is_read_only = True return 0 def notifyConnected(self, conn): """Internal: start using the given connection. This is called by ConnectionManager after it has decided which connection should be used. """ if self._cache is None: # the storage was closed, but the connect thread called # this method before it was stopped. return if self._connection is not None: # If we are upgrading from a read-only fallback connection, # we must close the old connection to prevent it from being # used while the cache is verified against the new connection. self._connection.register_object(None) # Don't call me! self._connection.close() self._connection = None self._ready.clear() reconnect = 1 else: reconnect = 0 self.set_server_addr(conn.get_addr()) self._connection = conn # invalidate our db cache if self._db is not None: self._db.invalidateCache() if reconnect: logger.info("%s Reconnected to storage: %s", self.__name__, self._server_addr) else: logger.info("%s Connected to storage: %s", self.__name__, self._server_addr) stub = self.StorageServerStubClass(conn) if self._client_label and conn.peer_protocol_version >= "Z310": stub.set_client_label(self._client_label) if conn.peer_protocol_version < "Z3101": logger.warning("Old server doesn't suppport " "checkCurrentSerialInTransaction") self.checkCurrentSerialInTransaction = lambda *args: None self._oids = [] self.verify_cache(stub) # It's important to call get_info after calling verify_cache. # If we end up doing a full-verification, we need to wait till # it's done. By doing a synchonous call, we are guarenteed # that the verification will be done because operations are # handled in order. self._info.update(stub.get_info()) self._handle_extensions() for iface in ( ZODB.interfaces.IStorageRestoreable, ZODB.interfaces.IStorageIteration, ZODB.interfaces.IStorageUndoable, ZODB.interfaces.IStorageCurrentRecordIteration, ZODB.interfaces.IBlobStorage, ZODB.interfaces.IExternalGC, ): if (iface.__module__, iface.__name__) in self._info.get( 'interfaces', ()): zope.interface.alsoProvides(self, iface) def _handle_extensions(self): for name in self.getExtensionMethods().keys(): if not hasattr(self, name): def mklambda(mname): return (lambda *args, **kw: self._server.rpc.call(mname, *args, **kw)) setattr(self, name, mklambda(name)) def set_server_addr(self, addr): # Normalize server address and convert to string if isinstance(addr, types.StringType): self._server_addr = addr else: assert isinstance(addr, types.TupleType) # If the server is on a remote host, we need to guarantee # that all clients used the same name for the server. If # they don't, the sortKey() may be different for each client. # The best solution seems to be the official name reported # by gethostbyaddr(). 
host = addr[0] try: canonical, aliases, addrs = socket.gethostbyaddr(host) except socket.error, err: logger.debug("%s Error resolving host: %s (%s)", self.__name__, host, err) canonical = host self._server_addr = str((canonical, addr[1])) def sortKey(self): # If the client isn't connected to anything, it can't have a # valid sortKey(). Raise an error to stop the transaction early. if self._server_addr is None: raise ClientDisconnected else: return '%s:%s' % (self._storage, self._server_addr) ### Is there a race condition between notifyConnected and ### notifyDisconnected? In Particular, what if we get ### notifyDisconnected in the middle of notifyConnected? ### The danger is that we'll proceed as if we were connected ### without worrying if we were, but this would happen any way if ### notifyDisconnected had to get the instance lock. There's ### nothing to gain by getting the instance lock. def notifyDisconnected(self): """Internal: notify that the server connection was terminated. This is called by ConnectionManager when the connection is closed or when certain problems with the connection occur. """ logger.info("%s Disconnected from storage: %r", self.__name__, self._server_addr) self._connection = None self._ready.clear() self._server = disconnected_stub self._midtxn_disconnect = 1 self._iterator_gc(True) def __len__(self): """Return the size of the storage.""" # TODO: Is this method used? return self._info['length'] def getName(self): """Storage API: return the storage name as a string. The return value consists of two parts: the name as determined by the name and addr argments to the ClientStorage constructor, and the string 'connected' or 'disconnected' in parentheses indicating whether the storage is (currently) connected. """ return "%s (%s)" % ( self.__name__, self.is_connected() and "connected" or "disconnected") def getSize(self): """Storage API: an approximate size of the database, in bytes.""" return self._info['size'] def getExtensionMethods(self): """getExtensionMethods This returns a dictionary whose keys are names of extra methods provided by this storage. Storage proxies (such as ZEO) should call this method to determine the extra methods that they need to proxy in addition to the standard storage methods. Dictionary values should be None; this will be a handy place for extra marshalling information, should we need it """ return self._info.get('extensionMethods', {}) def supportsUndo(self): """Storage API: return whether we support undo.""" return self._info['supportsUndo'] def isReadOnly(self): """Storage API: return whether we are in read-only mode.""" if self._is_read_only: return True else: # If the client is configured for a read-write connection # but has a read-only fallback connection, conn._is_read_only # will be True. If self._connection is None, we'll behave as # read_only try: return self._connection._is_read_only except AttributeError: return True def _check_trans(self, trans): """Internal helper to check a transaction argument for sanity.""" if self._is_read_only: raise POSException.ReadOnlyError() if self._transaction is not trans: raise POSException.StorageTransactionError(self._transaction, trans) def history(self, oid, size=1): """Storage API: return a sequence of HistoryEntry objects. """ return self._server.history(oid, size) def record_iternext(self, next=None): """Storage API: get the next database record. This is part of the conversion-support API. 
""" return self._server.record_iternext(next) def getTid(self, oid): """Storage API: return current serial number for oid.""" return self._server.getTid(oid) def loadSerial(self, oid, serial): """Storage API: load a historical revision of an object.""" return self._server.loadSerial(oid, serial) def load(self, oid, version=''): """Storage API: return the data for a given object. This returns the pickle data and serial number for the object specified by the given object id, if they exist; otherwise a KeyError is raised. """ self._lock.acquire() # for atomic processing of invalidations try: t = self._cache.load(oid) if t: return t finally: self._lock.release() if self._server is None: raise ClientDisconnected() self._load_lock.acquire() try: self._lock.acquire() try: self._load_oid = oid self._load_status = 1 finally: self._lock.release() data, tid = self._server.loadEx(oid) self._lock.acquire() # for atomic processing of invalidations try: if self._load_status: self._cache.store(oid, tid, None, data) self._load_oid = None finally: self._lock.release() finally: self._load_lock.release() return data, tid def loadBefore(self, oid, tid): self._lock.acquire() try: t = self._cache.loadBefore(oid, tid) if t is not None: return t finally: self._lock.release() t = self._server.loadBefore(oid, tid) if t is None: return None data, start, end = t if end is None: # This method should not be used to get current data. It # doesn't use the _load_lock, so it is possble to overlap # this load with an invalidation for the same object. # If we call again, we're guaranteed to get the # post-invalidation data. But if the data is still # current, we'll still get end == None. # Maybe the best thing to do is to re-run the test with # the load lock in the case. That's slow performance, but # I don't think real application code will ever care about # it. return data, start, end self._lock.acquire() try: self._cache.store(oid, start, end, data) finally: self._lock.release() return data, start, end def new_oid(self): """Storage API: return a new object identifier.""" if self._is_read_only: raise POSException.ReadOnlyError() # avoid multiple oid requests to server at the same time self._oid_lock.acquire() try: if not self._oids: self._oids = self._server.new_oids() self._oids.reverse() return self._oids.pop() finally: self._oid_lock.release() def pack(self, t=None, referencesf=None, wait=1, days=0): """Storage API: pack the storage. Deviations from the Storage API: the referencesf argument is ignored; two additional optional arguments wait and days are provided: wait -- a flag indicating whether to wait for the pack to complete; defaults to true. days -- a number of days to subtract from the pack time; defaults to zero. """ # TODO: Is it okay that read-only connections allow pack()? # rf argument ignored; server will provide its own implementation if t is None: t = time.time() t = t - (days * 86400) return self._server.pack(t, wait) def _check_serials(self): """Internal helper to move data from _serials to _seriald.""" # serials are always going to be the same, the only # question is whether an exception has been raised. 
if self._serials: l = len(self._serials) r = self._serials[:l] del self._serials[:l] for oid, s in r: if isinstance(s, Exception): self._cache.invalidate(oid, None) raise s self._seriald[oid] = s return r def store(self, oid, serial, data, version, txn): """Storage API: store data for an object.""" assert not version self._check_trans(txn) self._server.storea(oid, serial, data, id(txn)) self._tbuf.store(oid, data) return self._check_serials() def checkCurrentSerialInTransaction(self, oid, serial, transaction): self._check_trans(transaction) self._server.checkCurrentSerialInTransaction(oid, serial, id(transaction)) def storeBlob(self, oid, serial, data, blobfilename, version, txn): """Storage API: store a blob object.""" assert not version # Grab the file right away. That way, if we don't have enough # room for a copy, we'll know now rather than in tpc_finish. # Also, this releaves the client of having to manage the file # (or the directory contianing it). self.fshelper.getPathForOID(oid, create=True) fd, target = self.fshelper.blob_mkstemp(oid, serial) os.close(fd) # It's a bit odd (and impossible on windows) to rename over # an existing file. We'll use the temporary file name as a base. target += '-' ZODB.blob.rename_or_copy_blob(blobfilename, target) os.remove(target[:-1]) serials = self.store(oid, serial, data, '', txn) if self.shared_blob_dir: self._server.storeBlobShared( oid, serial, data, os.path.basename(target), id(txn)) else: self._server.storeBlob(oid, serial, data, target, txn) self._tbuf.storeBlob(oid, target) return serials def receiveBlobStart(self, oid, serial): blob_filename = self.fshelper.getBlobFilename(oid, serial) assert not os.path.exists(blob_filename) lockfilename = os.path.join(os.path.dirname(blob_filename), '.lock') assert os.path.exists(lockfilename) blob_filename += '.dl' assert not os.path.exists(blob_filename) f = open(blob_filename, 'wb') f.close() def receiveBlobChunk(self, oid, serial, chunk): blob_filename = self.fshelper.getBlobFilename(oid, serial)+'.dl' assert os.path.exists(blob_filename) f = open(blob_filename, 'r+b') f.seek(0, 2) f.write(chunk) f.close() self._blob_data_bytes_loaded += len(chunk) self._check_blob_size(self._blob_data_bytes_loaded) def receiveBlobStop(self, oid, serial): blob_filename = self.fshelper.getBlobFilename(oid, serial) os.rename(blob_filename+'.dl', blob_filename) os.chmod(blob_filename, stat.S_IREAD) def deleteObject(self, oid, serial, txn): self._check_trans(txn) self._server.deleteObject(oid, serial, id(txn)) self._tbuf.store(oid, None) def loadBlob(self, oid, serial): # Load a blob. If it isn't present and we have a shared blob # directory, then assume that it doesn't exist on the server # and return None. if self.fshelper is None: raise POSException.Unsupported("No blob cache directory is " "configured.") blob_filename = self.fshelper.getBlobFilename(oid, serial) if self.shared_blob_dir: if os.path.exists(blob_filename): return blob_filename else: # We're using a server shared cache. If the file isn't # here, it's not anywhere. raise POSException.POSKeyError("No blob file", oid, serial) if os.path.exists(blob_filename): return _accessed(blob_filename) # First, we'll create the directory for this oid, if it doesn't exist. self.fshelper.createPathForOID(oid) # OK, it's not here and we (or someone) needs to get it. We # want to avoid getting it multiple times. We want to avoid # getting it multiple times even accross separate client # processes on the same machine. We'll use file locking. 
lock = _lock_blob(blob_filename) try: # We got the lock, so it's our job to download it. First, # we'll double check that someone didn't download it while we # were getting the lock: if os.path.exists(blob_filename): return _accessed(blob_filename) # Ask the server to send it to us. When this function # returns, it will have been sent. (The recieving will # have been handled by the asyncore thread.) self._server.sendBlob(oid, serial) if os.path.exists(blob_filename): return _accessed(blob_filename) raise POSException.POSKeyError("No blob file", oid, serial) finally: lock.close() def openCommittedBlobFile(self, oid, serial, blob=None): blob_filename = self.loadBlob(oid, serial) try: if blob is None: return open(blob_filename, 'rb') else: return ZODB.blob.BlobFile(blob_filename, 'r', blob) except (IOError): # The file got removed while we were opening. # Fall through and try again with the protection of the lock. pass lock = _lock_blob(blob_filename) try: blob_filename = self.fshelper.getBlobFilename(oid, serial) if not os.path.exists(blob_filename): if self.shared_blob_dir: # We're using a server shared cache. If the file isn't # here, it's not anywhere. raise POSException.POSKeyError("No blob file", oid, serial) self._server.sendBlob(oid, serial) if not os.path.exists(blob_filename): raise POSException.POSKeyError("No blob file", oid, serial) _accessed(blob_filename) if blob is None: return open(blob_filename, 'rb') else: return ZODB.blob.BlobFile(blob_filename, 'r', blob) finally: lock.close() def temporaryDirectory(self): return self.fshelper.temp_dir def tpc_vote(self, txn): """Storage API: vote on a transaction.""" if txn is not self._transaction: raise POSException.StorageTransactionError( "tpc_vote called with wrong transaction") self._server.vote(id(txn)) return self._check_serials() def tpc_transaction(self): return self._transaction def tpc_begin(self, txn, tid=None, status=' '): """Storage API: begin a transaction.""" if self._is_read_only: raise POSException.ReadOnlyError() self._tpc_cond.acquire() self._midtxn_disconnect = 0 while self._transaction is not None: # It is allowable for a client to call two tpc_begins in a # row with the same transaction, and the second of these # must be ignored. if self._transaction == txn: self._tpc_cond.release() raise POSException.StorageTransactionError( "Duplicate tpc_begin calls for same transaction") self._tpc_cond.wait(30) self._transaction = txn self._tpc_cond.release() try: self._server.tpc_begin(id(txn), txn.user, txn.description, txn._extension, tid, status) except: # Client may have disconnected during the tpc_begin(). if self._server is not disconnected_stub: self.end_transaction() raise self._tbuf.clear() self._seriald.clear() del self._serials[:] def end_transaction(self): """Internal helper to end a transaction.""" # the right way to set self._transaction to None # calls notify() on _tpc_cond in case there are waiting threads self._tpc_cond.acquire() self._transaction = None self._tpc_cond.notify() self._tpc_cond.release() def lastTransaction(self): return self._cache.getLastTid() def tpc_abort(self, txn): """Storage API: abort a transaction.""" if txn is not self._transaction: return try: # Caution: Are there any exceptions that should prevent an # abort from occurring? It seems wrong to swallow them # all, yet you want to be sure that other abort logic is # executed regardless. 
try: self._server.tpc_abort(id(txn)) except ClientDisconnected: logger.debug("%s ClientDisconnected in tpc_abort() ignored", self.__name__) finally: self._tbuf.clear() self._seriald.clear() del self._serials[:] self._iterator_gc() self.end_transaction() def tpc_finish(self, txn, f=None): """Storage API: finish a transaction.""" if txn is not self._transaction: raise POSException.StorageTransactionError( "tpc_finish called with wrong transaction") self._load_lock.acquire() try: if self._midtxn_disconnect: raise ClientDisconnected( 'Calling tpc_finish() on a disconnected transaction') finished = 0 try: self._lock.acquire() # for atomic processing of invalidations try: tid = self._server.tpc_finish(id(txn)) finished = 1 self._update_cache(tid) if f is not None: f(tid) finally: self._lock.release() r = self._check_serials() assert r is None or len(r) == 0, "unhandled serialnos: %s" % r except: if finished: # The server successfully committed. If we get a failure # here, our own state will be in question, so reconnect. self._connection.close() raise self.end_transaction() finally: self._load_lock.release() self._iterator_gc() def _update_cache(self, tid): """Internal helper to handle objects modified by a transaction. This iterates over the objects in the transaction buffer and update or invalidate the cache. """ # Must be called with _lock already acquired. # Not sure why _update_cache() would be called on a closed storage. if self._cache is None: return for oid, _ in self._seriald.iteritems(): self._cache.invalidate(oid, tid) for oid, data in self._tbuf: # If data is None, we just invalidate. if data is not None: s = self._seriald[oid] if s != ResolvedSerial: assert s == tid, (s, tid) self._cache.store(oid, s, None, data) else: # object deletion self._cache.invalidate(oid, tid) if self.fshelper is not None: blobs = self._tbuf.blobs had_blobs = False while blobs: oid, blobfilename = blobs.pop() self._blob_data_bytes_loaded += os.stat(blobfilename).st_size targetpath = self.fshelper.getPathForOID(oid, create=True) target_blob_file_name = self.fshelper.getBlobFilename(oid, tid) lock = _lock_blob(target_blob_file_name) try: ZODB.blob.rename_or_copy_blob( blobfilename, target_blob_file_name, ) finally: lock.close() had_blobs = True if had_blobs: self._check_blob_size(self._blob_data_bytes_loaded) self._cache.setLastTid(tid) self._tbuf.clear() def undo(self, trans_id, txn): """Storage API: undo a transaction. This is executed in a transactional context. It has no effect until the transaction is committed. It can be undone itself. Zope uses this to implement undo unless it is not supported by a storage. """ self._check_trans(txn) self._server.undoa(trans_id, id(txn)) def undoInfo(self, first=0, last=-20, specification=None): """Storage API: return undo information.""" return self._server.undoInfo(first, last, specification) def undoLog(self, first=0, last=-20, filter=None): """Storage API: return a sequence of TransactionDescription objects. The filter argument should be None or left unspecified, since it is impossible to pass the filter function to the server to be executed there. If filter is not None, an empty sequence is returned. """ if filter is not None: return [] return self._server.undoLog(first, last) # Recovery support def copyTransactionsFrom(self, other, verbose=0): """Copy transactions from another storage. This is typically used for converting data from one storage to another. `other` must have an .iterator() method. 
""" ZODB.BaseStorage.copy(other, self, verbose) def restore(self, oid, serial, data, version, prev_txn, transaction): """Write data already committed in a separate database.""" assert not version self._check_trans(transaction) self._server.restorea(oid, serial, data, prev_txn, id(transaction)) # Don't update the transaction buffer, because current data are # unaffected. return self._check_serials() # Below are methods invoked by the StorageServer def serialnos(self, args): """Server callback to pass a list of changed (oid, serial) pairs.""" self._serials.extend(args) def info(self, dict): """Server callback to update the info dictionary.""" self._info.update(dict) def verify_cache(self, server): """Internal routine called to verify the cache. The return value (indicating which path we took) is used by the test suite. """ self._pending_server = server # setup tempfile to hold zeoVerify results and interim # invalidation results self._tfile = tempfile.TemporaryFile(suffix=".inv") self._pickler = cPickle.Pickler(self._tfile, 1) self._pickler.fast = 1 # Don't use the memo if self._connection.peer_protocol_version < 'Z309': client = ClientStorage308Adapter(self) else: client = self # allow incoming invalidations: self._connection.register_object(client) # If verify_cache() finishes the cache verification process, # it should set self._server. If it goes through full cache # verification, then endVerify() should self._server. server_tid = server.lastTransaction() if not self._cache: logger.info("%s No verification necessary -- empty cache", self.__name__) if server_tid != utils.z64: self._cache.setLastTid(server_tid) self.finish_verification() return "empty cache" cache_tid = self._cache.getLastTid() if cache_tid != utils.z64: if server_tid == cache_tid: logger.info( "%s No verification necessary" " (cache_tid up-to-date %r)", self.__name__, server_tid) self.finish_verification() return "no verification" elif server_tid < cache_tid: message = ("%s Client has seen newer transactions than server!" % self.__name__) logger.critical(message) raise ClientStorageError(message) # log some hints about last transaction logger.info("%s last inval tid: %r %s\n", self.__name__, cache_tid, tid2time(cache_tid)) logger.info("%s last transaction: %r %s", self.__name__, server_tid, server_tid and tid2time(server_tid)) pair = server.getInvalidations(cache_tid) if pair is not None: logger.info("%s Recovering %d invalidations", self.__name__, len(pair[1])) self.finish_verification(pair) return "quick verification" elif server_tid != utils.z64: # Hm, to have gotten here, the cache is non-empty, but # it has no last tid. This doesn't seem like good situation. # We'll have to verify the cache, if we're willing. self._cache.setLastTid(server_tid) zope.event.notify(ZEO.interfaces.StaleCache(self)) # From this point on, we do not have complete information about # the missed transactions. The reason is that cache # verification only checks objects in the client cache and # there may be objects in the object caches that aren't in the # client cach that would need verification too. We avoid that # problem by just invalidating the objects in the object caches. 
if self._db is not None: self._db.invalidateCache() if self._cache and self._drop_cache_rather_verify: logger.critical("%s dropping stale cache", self.__name__) self._cache.clear() if server_tid: self._cache.setLastTid(server_tid) self.finish_verification() return "cache dropped" logger.info("%s Verifying cache", self.__name__) for oid, tid in self._cache.contents(): server.verify(oid, tid) server.endZeoVerify() return "full verification" def invalidateVerify(self, oid): """Server callback to invalidate an oid pair. This is called as part of cache validation. """ # Invalidation as result of verify_cache(). # Queue an invalidate for the end the verification procedure. if self._pickler is None: # This should never happen. logger.error("%s invalidateVerify with no _pickler", self.__name__) return self._pickler.dump((None, [oid])) def endVerify(self): """Server callback to signal end of cache validation.""" logger.info("%s endVerify finishing", self.__name__) self.finish_verification() logger.info("%s endVerify finished", self.__name__) def finish_verification(self, catch_up=None): self._lock.acquire() try: if catch_up: # process catch-up invalidations self._process_invalidations(*catch_up) if self._pickler is None: return # write end-of-data marker self._pickler.dump((None, None)) self._pickler = None self._tfile.seek(0) unpickler = cPickle.Unpickler(self._tfile) min_tid = self._cache.getLastTid() while 1: tid, oids = unpickler.load() logger.debug('pickled inval %r %r', tid, min_tid) if oids is None: break if ((tid is None) or (min_tid is None) or (tid > min_tid) ): self._process_invalidations(tid, oids) self._tfile.close() self._tfile = None finally: self._lock.release() self._server = self._pending_server self._ready.set() self._pending_server = None def invalidateTransaction(self, tid, oids): """Server callback: Invalidate objects modified by tid.""" self._lock.acquire() try: if self._pickler is not None: logger.debug( "%s Transactional invalidation during cache verification", self.__name__) self._pickler.dump((tid, oids)) else: self._process_invalidations(tid, oids) finally: self._lock.release() def _process_invalidations(self, tid, oids): for oid in oids: if oid == self._load_oid: self._load_status = 0 self._cache.invalidate(oid, tid) self._cache.setLastTid(tid) if self._db is not None: self._db.invalidate(tid, oids) # The following are for compatibility with protocol version 2.0.0 def invalidateTrans(self, oids): return self.invalidateTransaction(None, oids) invalidate = invalidateVerify end = endVerify Invalidate = invalidateTrans # IStorageIteration def iterator(self, start=None, stop=None): """Return an IStorageTransactionInformation iterator.""" # iids are "iterator IDs" that can be used to query an iterator whose # status is held on the server. 
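        # The iid handed back below indexes that server-side state.
        # _setup_iterator() records it locally so that _iterator_gc() can
        # later tell the server to drop iterators this client no longer
        # tracks (see the set arithmetic on _iterator_ids below).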
iid = self._server.iterator_start(start, stop) return self._setup_iterator(TransactionIterator, iid) def _setup_iterator(self, factory, iid, *args): self._iterators[iid] = iterator = factory(self, iid, *args) self._iterator_ids.add(iid) return iterator def _forget_iterator(self, iid): self._iterators.pop(iid, None) self._iterator_ids.remove(iid) def _iterator_gc(self, disconnected=False): if not self._iterator_ids: return if disconnected: for i in self._iterators.values(): i._iid = -1 self._iterators.clear() self._iterator_ids.clear() return iids = self._iterator_ids - set(self._iterators) if iids: try: self._server.iterator_gc(list(iids)) except ClientDisconnected: # If we get disconnected, all of the iterators on the # server are thrown away. We should clear ours too: return self._iterator_gc(True) self._iterator_ids -= iids def server_status(self): return self._server.server_status() class TransactionIterator(object): def __init__(self, storage, iid, *args): self._storage = storage self._iid = iid self._ended = False def __iter__(self): return self def next(self): if self._ended: raise StopIteration() if self._iid < 0: raise ClientDisconnected("Disconnected iterator") tx_data = self._storage._server.iterator_next(self._iid) if tx_data is None: # The iterator is exhausted, and the server has already # disposed it. self._ended = True self._storage._forget_iterator(self._iid) raise StopIteration() return ClientStorageTransactionInformation( self._storage, self, *tx_data) class ClientStorageTransactionInformation(ZODB.BaseStorage.TransactionRecord): def __init__(self, storage, txiter, tid, status, user, description, extension): self._storage = storage self._txiter = txiter self._completed = False self._riid = None self.tid = tid self.status = status self.user = user self.description = description self.extension = extension def __iter__(self): riid = self._storage._server.iterator_record_start(self._txiter._iid, self.tid) return self._storage._setup_iterator(RecordIterator, riid) class RecordIterator(object): def __init__(self, storage, riid): self._riid = riid self._completed = False self._storage = storage def __iter__(self): return self def next(self): if self._completed: # We finished iteration once already and the server can't know # about the iteration anymore. raise StopIteration() item = self._storage._server.iterator_record_next(self._riid) if item is None: # The iterator is exhausted, and the server has already # disposed it. self._completed = True raise StopIteration() return ZODB.BaseStorage.DataRecord(*item) class ClientStorage308Adapter: def __init__(self, client): self.client = client def invalidateTransaction(self, tid, args): self.client.invalidateTransaction(tid, [arg[0] for arg in args]) def invalidateVerify(self, arg): self.client.invalidateVerify(arg[0]) def __getattr__(self, name): return getattr(self.client, name) class BlobCacheLayout(object): size = 997 def oid_to_path(self, oid): return str(utils.u64(oid) % self.size) def getBlobFilePath(self, oid, tid): base, rem = divmod(utils.u64(oid), self.size) return os.path.join( str(rem), "%s.%s%s" % (base, tid.encode('hex'), ZODB.blob.BLOB_SUFFIX) ) def _accessed(filename): try: os.utime(filename, (time.time(), os.stat(filename).st_mtime)) except OSError: pass # We tried. 
:) return filename cache_file_name = re.compile(r'\d+$').match def _check_blob_cache_size(blob_dir, target): logger = logging.getLogger(__name__+'.check_blob_cache') layout = open(os.path.join(blob_dir, ZODB.blob.LAYOUT_MARKER) ).read().strip() if not layout == 'zeocache': logger.critical("Invalid blob directory layout %s", layout) raise ValueError("Invalid blob directory layout", layout) attempt_path = os.path.join(blob_dir, 'check_size.attempt') try: check_lock = zc.lockfile.LockFile( os.path.join(blob_dir, 'check_size.lock')) except zc.lockfile.LockError: try: time.sleep(1) check_lock = zc.lockfile.LockFile( os.path.join(blob_dir, 'check_size.lock')) except zc.lockfile.LockError: # Someone is already cleaning up, so don't bother logger.debug("%s Another thread is checking the blob cache size.", thread.get_ident()) open(attempt_path, 'w').close() # Mark that we tried return logger.debug("%s Checking blob cache size. (target: %s)", thread.get_ident(), target) try: while 1: size = 0 blob_suffix = ZODB.blob.BLOB_SUFFIX files_by_atime = BTrees.OOBTree.BTree() for dirname in os.listdir(blob_dir): if not cache_file_name(dirname): continue base = os.path.join(blob_dir, dirname) if not os.path.isdir(base): continue for file_name in os.listdir(base): if not file_name.endswith(blob_suffix): continue file_path = os.path.join(base, file_name) if not os.path.isfile(file_path): continue stat = os.stat(file_path) size += stat.st_size t = stat.st_atime if t not in files_by_atime: files_by_atime[t] = [] files_by_atime[t].append(os.path.join(dirname, file_name)) logger.debug("%s blob cache size: %s", thread.get_ident(), size) if size <= target: if os.path.isfile(attempt_path): try: os.remove(attempt_path) except OSError: pass # Sigh, windows continue logger.debug("%s -->", thread.get_ident()) break while size > target and files_by_atime: for file_name in files_by_atime.pop(files_by_atime.minKey()): file_name = os.path.join(blob_dir, file_name) lockfilename = os.path.join(os.path.dirname(file_name), '.lock') try: lock = zc.lockfile.LockFile(lockfilename) except zc.lockfile.LockError: logger.debug("%s Skipping locked %s", thread.get_ident(), os.path.basename(file_name)) continue # In use, skip try: fsize = os.stat(file_name).st_size try: ZODB.blob.remove_committed(file_name) except OSError, v: pass # probably open on windows else: size -= fsize finally: lock.close() if size <= target: break logger.debug("%s reduced blob cache size: %s", thread.get_ident(), size) finally: check_lock.close() def check_blob_size_script(args=None): if args is None: args = sys.argv[1:] blob_dir, target = args _check_blob_cache_size(blob_dir, int(target)) def _lock_blob(path): lockfilename = os.path.join(os.path.dirname(path), '.lock') n = 0 while 1: try: return zc.lockfile.LockFile(lockfilename) except zc.lockfile.LockError: time.sleep(0.01) n += 1 if n > 60000: raise else: break ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/Exceptions.py000066400000000000000000000021411230730566700230250ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Exceptions for ZEO.""" from ZODB.POSException import StorageError class ClientStorageError(StorageError): """An error occurred in the ZEO Client Storage.""" class UnrecognizedResult(ClientStorageError): """A server call returned an unrecognized result.""" class ClientDisconnected(ClientStorageError): """The database storage is disconnected from the storage.""" class AuthError(StorageError): """The client provided invalid authentication credentials.""" ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/ServerStub.py000066400000000000000000000321461230730566700230200ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002, 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """RPC stubs for interface exported by StorageServer.""" import time from ZODB.utils import z64 ## # ZEO storage server. #

# Remote method calls can be synchronous or asynchronous. If the call # is synchronous, the client thread blocks until the call returns. A # single client can only have one synchronous request outstanding. If # several threads share a single client, threads other than the caller # will block only if the attempt to make another synchronous call. # An asynchronous call does not cause the client thread to block. An # exception raised by an asynchronous method is logged on the server, # but is not returned to the client. class StorageServer: """An RPC stub class for the interface exported by ClientStorage. This is the interface presented by the StorageServer to the ClientStorage; i.e. the ClientStorage calls these methods and they are executed in the StorageServer. See the StorageServer module for documentation on these methods, with the exception of _update(), which is documented here. """ def __init__(self, rpc): """Constructor. The argument is a connection: an instance of the zrpc.connection.Connection class. """ self.rpc = rpc def extensionMethod(self, name): return ExtensionMethodWrapper(self.rpc, name).call ## # Register current connection with a storage and a mode. # In effect, it is like an open call. # @param storage_name a string naming the storage. This argument # is primarily for backwards compatibility with servers # that supported multiple storages. # @param read_only boolean # @exception ValueError unknown storage_name or already registered # @exception ReadOnlyError storage is read-only and a read-write # connectio was requested def register(self, storage_name, read_only): self.rpc.call('register', storage_name, read_only) ## # Return dictionary of meta-data about the storage. # @defreturn dict def get_info(self): return self.rpc.call('get_info') ## # Check whether the server requires authentication. Returns # the name of the protocol. # @defreturn string def getAuthProtocol(self): return self.rpc.call('getAuthProtocol') ## # Return id of the last committed transaction # @defreturn string def lastTransaction(self): # Not in protocol version 2.0.0; see __init__() return self.rpc.call('lastTransaction') or z64 ## # Return invalidations for all transactions after tid. # @param tid transaction id # @defreturn 2-tuple, (tid, list) # @return tuple containing the last committed transaction # and a list of oids that were invalidated. Returns # None and an empty list if the server does not have # the list of oids available. def getInvalidations(self, tid): # Not in protocol version 2.0.0; see __init__() return self.rpc.call('getInvalidations', tid) ## # Check whether a serial number is current for oid. # If the serial number is not current, the # server will make an asynchronous invalidateVerify() call. # @param oid object id # @param s serial number # @defreturn async def zeoVerify(self, oid, s): self.rpc.callAsync('zeoVerify', oid, s) ## # Check whether current serial number is valid for oid. # If the serial number is not current, the server will make an # asynchronous invalidateVerify() call. # @param oid object id # @param serial client's current serial number # @defreturn async def verify(self, oid, serial): self.rpc.callAsync('verify', oid, serial) ## # Signal to the server that cache verification is done. # @defreturn async def endZeoVerify(self): self.rpc.callAsync('endZeoVerify') ## # Generate a new set of oids. 
# @param n number of new oids to return # @defreturn list # @return list of oids def new_oids(self, n=None): if n is None: return self.rpc.call('new_oids') else: return self.rpc.call('new_oids', n) ## # Pack the storage. # @param t pack time # @param wait optional, boolean. If true, the call will not # return until the pack is complete. def pack(self, t, wait=None): if wait is None: self.rpc.call('pack', t) else: self.rpc.call('pack', t, wait) ## # Return current data for oid. # @param oid object id # @defreturn 2-tuple # @return 2-tuple, current non-version data, serial number # @exception KeyError if oid is not found def zeoLoad(self, oid): return self.rpc.call('zeoLoad', oid)[:2] ## # Return current data for oid, and the tid of the # transaction that wrote the most recent revision. # @param oid object id # @defreturn 2-tuple # @return data, transaction id # @exception KeyError if oid is not found def loadEx(self, oid): return self.rpc.call("loadEx", oid) ## # Return non-current data along with transaction ids that identify # the lifetime of the specific revision. # @param oid object id # @param tid a transaction id that provides an upper bound on # the lifetime of the revision. That is, loadBefore # returns the revision that was current before tid committed. # @defreturn 4-tuple # @return data, serial numbr, start transaction id, end transaction id def loadBefore(self, oid, tid): return self.rpc.call("loadBefore", oid, tid) ## # Storage new revision of oid. # @param oid object id # @param serial serial number that this transaction read # @param data new data record for oid # @param id id of current transaction # @defreturn async def storea(self, oid, serial, data, id): self.rpc.callAsync('storea', oid, serial, data, id) def checkCurrentSerialInTransaction(self, oid, serial, id): self.rpc.callAsync('checkCurrentSerialInTransaction', oid, serial, id) def restorea(self, oid, serial, data, prev_txn, id): self.rpc.callAsync('restorea', oid, serial, data, prev_txn, id) def storeBlob(self, oid, serial, data, blobfilename, txn): # Store a blob to the server. We don't want to real all of # the data into memory, so we use a message iterator. This # allows us to read the blob data as needed. def store(): yield ('storeBlobStart', ()) f = open(blobfilename, 'rb') while 1: chunk = f.read(59000) if not chunk: break yield ('storeBlobChunk', (chunk, )) f.close() yield ('storeBlobEnd', (oid, serial, data, id(txn))) self.rpc.callAsyncIterator(store()) def storeBlobShared(self, oid, serial, data, filename, id): self.rpc.callAsync('storeBlobShared', oid, serial, data, filename, id) def deleteObject(self, oid, serial, id): self.rpc.callAsync('deleteObject', oid, serial, id) ## # Start two-phase commit for a transaction # @param id id used by client to identify current transaction. The # only purpose of this argument is to distinguish among multiple # threads using a single ClientStorage. # @param user name of user committing transaction (can be "") # @param description string containing transaction metadata (can be "") # @param ext dictionary of extended metadata (?) 
# @param tid optional explicit tid to pass to underlying storage # @param status optional status character, e.g "p" for pack # @defreturn async def tpc_begin(self, id, user, descr, ext, tid, status): self.rpc.callAsync('tpc_begin', id, user, descr, ext, tid, status) def vote(self, trans_id): return self.rpc.call('vote', trans_id) def tpc_finish(self, id): return self.rpc.call('tpc_finish', id) def tpc_abort(self, id): self.rpc.call('tpc_abort', id) def history(self, oid, length=None): if length is None: return self.rpc.call('history', oid) else: return self.rpc.call('history', oid, length) def record_iternext(self, next): return self.rpc.call('record_iternext', next) def sendBlob(self, oid, serial): return self.rpc.call('sendBlob', oid, serial) def getTid(self, oid): return self.rpc.call('getTid', oid) def loadSerial(self, oid, serial): return self.rpc.call('loadSerial', oid, serial) def new_oid(self): return self.rpc.call('new_oid') def undoa(self, trans_id, trans): self.rpc.callAsync('undoa', trans_id, trans) def undoLog(self, first, last): return self.rpc.call('undoLog', first, last) def undoInfo(self, first, last, spec): return self.rpc.call('undoInfo', first, last, spec) def iterator_start(self, start, stop): return self.rpc.call('iterator_start', start, stop) def iterator_next(self, iid): return self.rpc.call('iterator_next', iid) def iterator_record_start(self, txn_iid, tid): return self.rpc.call('iterator_record_start', txn_iid, tid) def iterator_record_next(self, iid): return self.rpc.call('iterator_record_next', iid) def iterator_gc(self, iids): return self.rpc.callAsync('iterator_gc', iids) def server_status(self): return self.rpc.call("server_status") def set_client_label(self, label): return self.rpc.callAsync('set_client_label', label) class StorageServer308(StorageServer): def __init__(self, rpc): if rpc.peer_protocol_version == 'Z200': self.lastTransaction = lambda: z64 self.getInvalidations = lambda tid: None self.getAuthProtocol = lambda: None StorageServer.__init__(self, rpc) def history(self, oid, length=None): if length is None: return self.rpc.call('history', oid, '') else: return self.rpc.call('history', oid, '', length) def getInvalidations(self, tid): # Not in protocol version 2.0.0; see __init__() result = self.rpc.call('getInvalidations', tid) if result is not None: result = result[0], [oid for (oid, version) in result[1]] return result def verify(self, oid, serial): self.rpc.callAsync('verify', oid, '', serial) def loadEx(self, oid): return self.rpc.call("loadEx", oid, '')[:2] def storea(self, oid, serial, data, id): self.rpc.callAsync('storea', oid, serial, data, '', id) def storeBlob(self, oid, serial, data, blobfilename, txn): # Store a blob to the server. We don't want to real all of # the data into memory, so we use a message iterator. This # allows us to read the blob data as needed. 
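        # Each yielded (method, args) pair below is sent as one async call:
        # storeBlobStart opens a temporary file on the server,
        # storeBlobChunk appends the 59000-byte chunks, and storeBlobEnd
        # (with the extra empty-version argument this 3.0.8 protocol
        # expects) queues the blob for the transaction.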
def store(): yield ('storeBlobStart', ()) f = open(blobfilename, 'rb') while 1: chunk = f.read(59000) if not chunk: break yield ('storeBlobChunk', (chunk, )) f.close() yield ('storeBlobEnd', (oid, serial, data, '', id(txn))) self.rpc.callAsyncIterator(store()) def storeBlobShared(self, oid, serial, data, filename, id): self.rpc.callAsync('storeBlobShared', oid, serial, data, filename, '', id) def zeoVerify(self, oid, s): self.rpc.callAsync('zeoVerify', oid, s, None) def iterator_start(self, start, stop): raise NotImplementedError def iterator_next(self, iid): raise NotImplementedError def iterator_record_start(self, txn_iid, tid): raise NotImplementedError def iterator_record_next(self, iid): raise NotImplementedError def iterator_gc(self, iids): raise NotImplementedError def stub(client, connection): start = time.time() # Wait until we know what version the other side is using. while connection.peer_protocol_version is None: if time.time()-start > 10: raise ValueError("Timeout waiting for protocol handshake") time.sleep(0.1) if connection.peer_protocol_version < 'Z309': return StorageServer308(connection) return StorageServer(connection) class ExtensionMethodWrapper: def __init__(self, rpc, name): self.rpc = rpc self.name = name def call(self, *a, **kwa): return self.rpc.call(self.name, *a, **kwa) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/StorageServer.py000066400000000000000000001576531230730566700235220ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002, 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """The StorageServer class and the exception that it may raise. This server acts as a front-end for one or more real storages, like file storage or Berkeley storage. TODO: Need some basic access control-- a declaration of the methods exported for invocation by the server. 
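A hedged construction sketch (the address, the storage name '1' and the file
name are illustrative; servers are normally started from the start.py script
rather than by hand, and the asyncore loop still has to be run to serve
clients)::

    import ZODB.FileStorage
    from ZEO.StorageServer import StorageServer

    fs = ZODB.FileStorage.FileStorage('Data.fs')
    server = StorageServer(('', 8100), storages={'1': fs})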
""" from __future__ import with_statement from ZEO.Exceptions import AuthError from ZEO.monitor import StorageStats, StatsServer from ZEO.zrpc.connection import ManagedServerConnection, Delay, MTDelay, Result from ZEO.zrpc.server import Dispatcher from ZODB.ConflictResolution import ResolvedSerial from ZODB.loglevels import BLATHER from ZODB.POSException import StorageError, StorageTransactionError from ZODB.POSException import TransactionError, ReadOnlyError, ConflictError from ZODB.serialize import referencesf from ZODB.utils import oid_repr, p64, u64, z64 import asyncore import cPickle import itertools import logging import os import sys import tempfile import threading import time import transaction import warnings import ZEO.zrpc.error import ZODB.blob import ZODB.serialize import ZODB.TimeStamp import zope.interface logger = logging.getLogger('ZEO.StorageServer') def log(message, level=logging.INFO, label='', exc_info=False): """Internal helper to log a message.""" if label: message = "(%s) %s" % (label, message) logger.log(level, message, exc_info=exc_info) class StorageServerError(StorageError): """Error reported when an unpicklable exception is raised.""" class ZEOStorage: """Proxy to underlying storage for a single remote client.""" # A list of extension methods. A subclass with extra methods # should override. extensions = [] def __init__(self, server, read_only=0, auth_realm=None): self.server = server # timeout and stats will be initialized in register() self.stats = None self.connection = None self.client = None self.storage = None self.storage_id = "uninitialized" self.transaction = None self.read_only = read_only self.log_label = 'unconnected' self.locked = False # Don't have storage lock self.verifying = 0 self.store_failed = 0 self.authenticated = 0 self.auth_realm = auth_realm self.blob_tempfile = None # The authentication protocol may define extra methods. self._extensions = {} for func in self.extensions: self._extensions[func.func_name] = None self._iterators = {} self._iterator_ids = itertools.count() # Stores the last item that was handed out for a # transaction iterator. self._txn_iterators_last = {} def _finish_auth(self, authenticated): if not self.auth_realm: return 1 self.authenticated = authenticated return authenticated def set_database(self, database): self.database = database def notifyConnected(self, conn): self.connection = conn assert conn.peer_protocol_version is not None if conn.peer_protocol_version < 'Z309': self.client = ClientStub308(conn) conn.register_object(ZEOStorage308Adapter(self)) else: self.client = ClientStub(conn) self.log_label = _addr_label(conn.addr) def notifyDisconnected(self): # When this storage closes, we must ensure that it aborts # any pending transaction. 
if self.transaction is not None: self.log("disconnected during %s transaction" % (self.locked and 'locked' or 'unlocked')) self.tpc_abort(self.transaction.id) else: self.log("disconnected") self.connection = None def __repr__(self): tid = self.transaction and repr(self.transaction.id) if self.storage: stid = (self.tpc_transaction() and repr(self.tpc_transaction().id)) else: stid = None name = self.__class__.__name__ return "<%s %X trans=%s s_trans=%s>" % (name, id(self), tid, stid) def log(self, msg, level=logging.INFO, exc_info=False): log(msg, level=level, label=self.log_label, exc_info=exc_info) def setup_delegation(self): """Delegate several methods to the storage """ # Called from register storage = self.storage info = self.get_info() if not info['supportsUndo']: self.undoLog = self.undoInfo = lambda *a,**k: () self.getTid = storage.getTid self.load = storage.load self.loadSerial = storage.loadSerial record_iternext = getattr(storage, 'record_iternext', None) if record_iternext is not None: self.record_iternext = record_iternext try: fn = storage.getExtensionMethods except AttributeError: pass # no extension methods else: d = fn() self._extensions.update(d) for name in d: assert not hasattr(self, name) setattr(self, name, getattr(storage, name)) self.lastTransaction = storage.lastTransaction try: self.tpc_transaction = storage.tpc_transaction except AttributeError: if hasattr(storage, '_transaction'): log("Storage %r doesn't have a tpc_transaction method.\n" "See ZEO.interfaces.IServeable." "Falling back to using _transaction attribute, which\n." "is icky.", logging.ERROR) self.tpc_transaction = lambda : storage._transaction else: raise def history(self,tid,size=1): # This caters for storages which still accept # a version parameter. return self.storage.history(tid,size=size) def _check_tid(self, tid, exc=None): if self.read_only: raise ReadOnlyError() if self.transaction is None: caller = sys._getframe().f_back.f_code.co_name self.log("no current transaction: %s()" % caller, level=logging.WARNING) if exc is not None: raise exc(None, tid) else: return 0 if self.transaction.id != tid: caller = sys._getframe().f_back.f_code.co_name self.log("%s(%s) invalid; current transaction = %s" % (caller, repr(tid), repr(self.transaction.id)), logging.WARNING) if exc is not None: raise exc(self.transaction.id, tid) else: return 0 return 1 def getAuthProtocol(self): """Return string specifying name of authentication module to use. The module name should be auth_%s where %s is auth_protocol.""" protocol = self.server.auth_protocol if not protocol or protocol == 'none': return None return protocol def register(self, storage_id, read_only): """Select the storage that this client will use This method must be the first one called by the client. For authenticated storages this method will be called by the client immediately after authentication is finished. 
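        The corresponding client-side call goes through the ServerStub,
        roughly (a hedged sketch; '1' is just the conventional storage
        name)::

            server = ZEO.ServerStub.stub(client, connection)
            server.register('1', read_only=False)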
""" if self.auth_realm and not self.authenticated: raise AuthError("Client was never authenticated with server!") if self.storage is not None: self.log("duplicate register() call") raise ValueError("duplicate register() call") storage = self.server.storages.get(storage_id) if storage is None: self.log("unknown storage_id: %s" % storage_id) raise ValueError("unknown storage: %s" % storage_id) if not read_only and (self.read_only or storage.isReadOnly()): raise ReadOnlyError() self.read_only = self.read_only or read_only self.storage_id = storage_id self.storage = storage self.setup_delegation() self.stats = self.server.register_connection(storage_id, self) def get_info(self): storage = self.storage supportsUndo = (getattr(storage, 'supportsUndo', lambda : False)() and self.connection.peer_protocol_version >= 'Z310') # Communicate the backend storage interfaces to the client storage_provides = zope.interface.providedBy(storage) interfaces = [] for candidate in storage_provides.__iro__: interfaces.append((candidate.__module__, candidate.__name__)) return {'length': len(storage), 'size': storage.getSize(), 'name': storage.getName(), 'supportsUndo': supportsUndo, 'extensionMethods': self.getExtensionMethods(), 'supports_record_iternext': hasattr(self, 'record_iternext'), 'interfaces': tuple(interfaces), } def get_size_info(self): return {'length': len(self.storage), 'size': self.storage.getSize(), } def getExtensionMethods(self): return self._extensions def loadEx(self, oid): self.stats.loads += 1 return self.storage.load(oid, '') def loadBefore(self, oid, tid): self.stats.loads += 1 return self.storage.loadBefore(oid, tid) def getInvalidations(self, tid): invtid, invlist = self.server.get_invalidations(self.storage_id, tid) if invtid is None: return None self.log("Return %d invalidations up to tid %s" % (len(invlist), u64(invtid))) return invtid, invlist def verify(self, oid, tid): try: t = self.getTid(oid) except KeyError: self.client.invalidateVerify(oid) else: if tid != t: self.client.invalidateVerify(oid) def zeoVerify(self, oid, s): if not self.verifying: self.verifying = 1 self.stats.verifying_clients += 1 try: os = self.getTid(oid) except KeyError: self.client.invalidateVerify((oid, '')) # It's not clear what we should do now. The KeyError # could be caused by an object uncreation, in which case # invalidation is right. It could be an application bug # that left a dangling reference, in which case it's bad. else: if s != os: self.client.invalidateVerify((oid, '')) def endZeoVerify(self): if self.verifying: self.stats.verifying_clients -= 1 self.verifying = 0 self.client.endVerify() def pack(self, time, wait=1): # Yes, you can pack a read-only server or storage! if wait: return run_in_thread(self._pack_impl, time) else: # If the client isn't waiting for a reply, start a thread # and forget about it. t = threading.Thread(target=self._pack_impl, args=(time,)) t.start() return None def _pack_impl(self, time): self.log("pack(time=%s) started..." 
% repr(time)) self.storage.pack(time, referencesf) self.log("pack(time=%s) complete" % repr(time)) # Broadcast new size statistics self.server.invalidate(0, self.storage_id, None, (), self.get_size_info()) def new_oids(self, n=100): """Return a sequence of n new oids, where n defaults to 100""" n = min(n, 100) if self.read_only: raise ReadOnlyError() if n <= 0: n = 1 return [self.storage.new_oid() for i in range(n)] # undoLog and undoInfo are potentially slow methods def undoInfo(self, first, last, spec): return run_in_thread(self.storage.undoInfo, first, last, spec) def undoLog(self, first, last): return run_in_thread(self.storage.undoLog, first, last) def tpc_begin(self, id, user, description, ext, tid=None, status=" "): if self.read_only: raise ReadOnlyError() if self.transaction is not None: if self.transaction.id == id: self.log("duplicate tpc_begin(%s)" % repr(id)) return else: raise StorageTransactionError("Multiple simultaneous tpc_begin" " requests from one client.") t = transaction.Transaction() t.id = id t.user = user t.description = description t._extension = ext self.serials = [] self.invalidated = [] self.txnlog = CommitLog() self.blob_log = [] self.tid = tid self.status = status self.store_failed = 0 self.stats.active_txns += 1 # Assign the transaction attribute last. This is so we don't # think we've entered TPC until everything is set. Why? # Because if we have an error after this, the server will # think it is in TPC and the client will think it isn't. At # that point, the client will keep trying to enter TPC and # server won't let it. Errors *after* the tpc_begin call will # cause the client to abort the transaction. # (Also see https://bugs.launchpad.net/zodb/+bug/374737.) self.transaction = t def tpc_finish(self, id): if not self._check_tid(id): return assert self.locked, "finished called wo lock" self.stats.commits += 1 self.storage.tpc_finish(self.transaction, self._invalidate) # Note that the tid is still current because we still hold the # commit lock. We'll relinquish it in _clear_transaction. tid = self.storage.lastTransaction() # Return the tid, for cache invalidation optimization return Result(tid, self._clear_transaction) def _invalidate(self, tid): if self.invalidated: self.server.invalidate(self, self.storage_id, tid, self.invalidated, self.get_size_info()) def tpc_abort(self, tid): if not self._check_tid(tid): return self.stats.aborts += 1 self.storage.tpc_abort(self.transaction) self._clear_transaction() def _clear_transaction(self): # Common code at end of tpc_finish() and tpc_abort() if self.locked: self.server.unlock_storage(self) self.locked = 0 if self.transaction is not None: self.server.stop_waiting(self) self.transaction = None self.stats.active_txns -= 1 if self.txnlog is not None: self.txnlog.close() self.txnlog = None for oid, oldserial, data, blobfilename in self.blob_log: ZODB.blob.remove_committed(blobfilename) del self.blob_log def vote(self, tid): self._check_tid(tid, exc=StorageTransactionError) if self.locked or self.server.already_waiting(self): raise StorageTransactionError( 'Already voting (%s)' % (self.locked and 'locked' or 'waiting') ) return self._try_to_vote() def _try_to_vote(self, delay=None): if self.connection is None: return # We're disconnected if delay is not None and delay.sent: # as a consequence of the unlocking strategy, _try_to_vote # may be called multiple times for delayed # transactions. The first call will mark the delay as # sent. We should skip if the delay was already sent. 
return self.locked, delay = self.server.lock_storage(self, delay) if self.locked: try: self.log( "Preparing to commit transaction: %d objects, %d bytes" % (self.txnlog.stores, self.txnlog.size()), level=BLATHER) if (self.tid is not None) or (self.status != ' '): self.storage.tpc_begin(self.transaction, self.tid, self.status) else: self.storage.tpc_begin(self.transaction) for op, args in self.txnlog: if not getattr(self, op)(*args): break # Blob support while self.blob_log and not self.store_failed: oid, oldserial, data, blobfilename = self.blob_log.pop() self._store(oid, oldserial, data, blobfilename) if not self.store_failed: # Only call tpc_vote of no store call failed, # otherwise the serialnos() call will deliver an # exception that will be handled by the client in # its tpc_vote() method. serials = self.storage.tpc_vote(self.transaction) if serials: self.serials.extend(serials) self.client.serialnos(self.serials) except Exception: self.storage.tpc_abort(self.transaction) self._clear_transaction() if delay is not None: delay.error() else: raise else: if delay is not None: delay.reply(None) else: return None else: return delay def _unlock_callback(self, delay): connection = self.connection if connection is None: self.server.stop_waiting(self) else: connection.call_from_thread(self._try_to_vote, delay) # The public methods of the ZEO client API do not do the real work. # They defer work until after the storage lock has been acquired. # Most of the real implementations are in methods beginning with # an _. def deleteObject(self, oid, serial, id): self._check_tid(id, exc=StorageTransactionError) self.stats.stores += 1 self.txnlog.delete(oid, serial) def storea(self, oid, serial, data, id): self._check_tid(id, exc=StorageTransactionError) self.stats.stores += 1 self.txnlog.store(oid, serial, data) def checkCurrentSerialInTransaction(self, oid, serial, id): self._check_tid(id, exc=StorageTransactionError) self.txnlog.checkread(oid, serial) def restorea(self, oid, serial, data, prev_txn, id): self._check_tid(id, exc=StorageTransactionError) self.stats.stores += 1 self.txnlog.restore(oid, serial, data, prev_txn) def storeBlobStart(self): assert self.blob_tempfile is None self.blob_tempfile = tempfile.mkstemp( dir=self.storage.temporaryDirectory()) def storeBlobChunk(self, chunk): os.write(self.blob_tempfile[0], chunk) def storeBlobEnd(self, oid, serial, data, id): self._check_tid(id, exc=StorageTransactionError) assert self.txnlog is not None # effectively not allowed after undo fd, tempname = self.blob_tempfile self.blob_tempfile = None os.close(fd) self.blob_log.append((oid, serial, data, tempname)) def storeBlobShared(self, oid, serial, data, filename, id): self._check_tid(id, exc=StorageTransactionError) assert self.txnlog is not None # effectively not allowed after undo # Reconstruct the full path from the filename in the OID directory if (os.path.sep in filename or not (filename.endswith('.tmp') or filename[:-1].endswith('.tmp') ) ): logger.critical( "We're under attack! 
(bad filename to storeBlobShared, %r)", filename) raise ValueError(filename) filename = os.path.join(self.storage.fshelper.getPathForOID(oid), filename) self.blob_log.append((oid, serial, data, filename)) def sendBlob(self, oid, serial): self.client.storeBlob(oid, serial, self.storage.loadBlob(oid, serial)) def undo(*a, **k): raise NotImplementedError def undoa(self, trans_id, tid): self._check_tid(tid, exc=StorageTransactionError) self.txnlog.undo(trans_id) def _op_error(self, oid, err, op): self.store_failed = 1 if isinstance(err, ConflictError): self.stats.conflicts += 1 self.log("conflict error oid=%s msg=%s" % (oid_repr(oid), str(err)), BLATHER) if not isinstance(err, TransactionError): # Unexpected errors are logged and passed to the client self.log("%s error: %s, %s" % ((op,)+ sys.exc_info()[:2]), logging.ERROR, exc_info=True) err = self._marshal_error(err) # The exception is reported back as newserial for this oid self.serials.append((oid, err)) def _delete(self, oid, serial): err = None try: self.storage.deleteObject(oid, serial, self.transaction) except (SystemExit, KeyboardInterrupt): raise except Exception, err: self._op_error(oid, err, 'delete') return err is None def _checkread(self, oid, serial): err = None try: self.storage.checkCurrentSerialInTransaction( oid, serial, self.transaction) except (SystemExit, KeyboardInterrupt): raise except Exception, err: self._op_error(oid, err, 'checkCurrentSerialInTransaction') return err is None def _store(self, oid, serial, data, blobfile=None): err = None try: if blobfile is None: newserial = self.storage.store( oid, serial, data, '', self.transaction) else: newserial = self.storage.storeBlob( oid, serial, data, blobfile, '', self.transaction) except (SystemExit, KeyboardInterrupt): raise except Exception, err: self._op_error(oid, err, 'store') else: if serial != "\0\0\0\0\0\0\0\0": self.invalidated.append(oid) if isinstance(newserial, str): newserial = [(oid, newserial)] for oid, s in newserial or (): if s == ResolvedSerial: self.stats.conflicts_resolved += 1 self.log("conflict resolved oid=%s" % oid_repr(oid), BLATHER) self.serials.append((oid, s)) return err is None def _restore(self, oid, serial, data, prev_txn): err = None try: self.storage.restore(oid, serial, data, '', prev_txn, self.transaction) except (SystemExit, KeyboardInterrupt): raise except Exception, err: self._op_error(oid, err, 'restore') return err is None def _undo(self, trans_id): err = None try: tid, oids = self.storage.undo(trans_id, self.transaction) except (SystemExit, KeyboardInterrupt): raise except Exception, err: self._op_error(z64, err, 'undo') else: self.invalidated.extend(oids) self.serials.extend((oid, ResolvedSerial) for oid in oids) return err is None def _marshal_error(self, error): # Try to pickle the exception. If it can't be pickled, # the RPC response would fail, so use something that can be pickled. 
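        # (The substituted StorageServerError still reaches the client: see
        # _op_error() above, which appends (oid, err) to self.serials so the
        # error is reported back as the "new serial" for that oid.)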
pickler = cPickle.Pickler() pickler.fast = 1 try: pickler.dump(error, 1) except: msg = "Couldn't pickle storage exception: %s" % repr(error) self.log(msg, logging.ERROR) error = StorageServerError(msg) return error # IStorageIteration support def iterator_start(self, start, stop): iid = self._iterator_ids.next() self._iterators[iid] = iter(self.storage.iterator(start, stop)) return iid def iterator_next(self, iid): iterator = self._iterators[iid] try: info = iterator.next() except StopIteration: del self._iterators[iid] item = None if iid in self._txn_iterators_last: del self._txn_iterators_last[iid] else: item = (info.tid, info.status, info.user, info.description, info.extension) # Keep a reference to the last iterator result to allow starting a # record iterator off it. self._txn_iterators_last[iid] = info return item def iterator_record_start(self, txn_iid, tid): record_iid = self._iterator_ids.next() txn_info = self._txn_iterators_last[txn_iid] if txn_info.tid != tid: raise Exception( 'Out-of-order request for record iterator for transaction %r' % tid) self._iterators[record_iid] = iter(txn_info) return record_iid def iterator_record_next(self, iid): iterator = self._iterators[iid] try: info = iterator.next() except StopIteration: del self._iterators[iid] item = None else: item = (info.oid, info.tid, info.data, info.data_txn) return item def iterator_gc(self, iids): for iid in iids: self._iterators.pop(iid, None) def server_status(self): return self.server.server_status(self) def set_client_label(self, label): self.log_label = str(label)+' '+_addr_label(self.connection.addr) class StorageServerDB: def __init__(self, server, storage_id): self.server = server self.storage_id = storage_id self.references = ZODB.serialize.referencesf def invalidate(self, tid, oids, version=''): if version: raise StorageServerError("Versions aren't supported.") storage_id = self.storage_id self.server.invalidate(None, storage_id, tid, oids) def invalidateCache(self): self.server._invalidateCache(self.storage_id) transform_record_data = untransform_record_data = lambda self, data: data class StorageServer: """The server side implementation of ZEO. The StorageServer is the 'manager' for incoming connections. Each connection is associated with its own ZEOStorage instance (defined below). The StorageServer may handle multiple storages; each ZEOStorage instance only handles a single storage. """ # Classes we instantiate. A subclass might override. DispatcherClass = Dispatcher ZEOStorageClass = ZEOStorage ManagedServerConnectionClass = ManagedServerConnection def __init__(self, addr, storages, read_only=0, invalidation_queue_size=100, invalidation_age=None, transaction_timeout=None, monitor_address=None, auth_protocol=None, auth_database=None, auth_realm=None): """StorageServer constructor. This is typically invoked from the start.py script. Arguments (the first two are required and positional): addr -- the address at which the server should listen. This can be a tuple (host, port) to signify a TCP/IP connection or a pathname string to signify a Unix domain socket connection. A hostname may be a DNS name or a dotted IP address. storages -- a dictionary giving the storage(s) to handle. The keys are the storage names, the values are the storage instances, typically FileStorage or Berkeley storage instances. By convention, storage names are typically strings representing small integers starting at '1'. read_only -- an optional flag saying whether the server should operate in read-only mode. Defaults to false. 
Note that even if the server is operating in writable mode, individual storages may still be read-only. But if the server is in read-only mode, no write operations are allowed, even if the storages are writable. Note that pack() is considered a read-only operation. invalidation_queue_size -- The storage server keeps a queue of the objects modified by the last N transactions, where N == invalidation_queue_size. This queue is used to speed client cache verification when a client disconnects for a short period of time. invalidation_age -- If the invalidation queue isn't big enough to support a quick verification, but the last transaction seen by a client is younger than the invalidation age, then invalidations will be computed by iterating over transactions later than the given transaction. transaction_timeout -- The maximum amount of time to wait for a transaction to commit after acquiring the storage lock. If the transaction takes too long, the client connection will be closed and the transaction aborted. monitor_address -- The address at which the monitor server should listen. If specified, a monitor server is started. The monitor server provides server statistics in a simple text format. auth_protocol -- The name of the authentication protocol to use. Examples are "digest" and "srp". auth_database -- The name of the password database filename. It should be in a format compatible with the authentication protocol used; for instance, "sha" and "srp" require different formats. Note that to implement an authentication protocol, a server and client authentication mechanism must be implemented in a auth_* module, which should be stored inside the "auth" subdirectory. This module may also define a DatabaseClass variable that should indicate what database should be used by the authenticator. """ self.addr = addr self.storages = storages msg = ", ".join( ["%s:%s:%s" % (name, storage.isReadOnly() and "RO" or "RW", storage.getName()) for name, storage in storages.items()]) log("%s created %s with storages: %s" % (self.__class__.__name__, read_only and "RO" or "RW", msg)) self._lock = threading.Lock() self._commit_locks = {} self._waiting = dict((name, []) for name in storages) self.read_only = read_only self.auth_protocol = auth_protocol self.auth_database = auth_database self.auth_realm = auth_realm self.database = None if auth_protocol: self._setup_auth(auth_protocol) # A list, by server, of at most invalidation_queue_size invalidations. # The list is kept in sorted order with the most recent # invalidation at the front. The list never has more than # self.invq_bound elements. self.invq_bound = invalidation_queue_size self.invq = {} for name, storage in storages.items(): self._setup_invq(name, storage) storage.registerDB(StorageServerDB(self, name)) self.invalidation_age = invalidation_age self.connections = {} self.dispatcher = self.DispatcherClass(addr, factory=self.new_connection) self.stats = {} self.timeouts = {} for name in self.storages.keys(): self.connections[name] = [] self.stats[name] = StorageStats(self.connections[name]) if transaction_timeout is None: # An object with no-op methods timeout = StubTimeoutThread() else: timeout = TimeoutThread(transaction_timeout) timeout.start() self.timeouts[name] = timeout if monitor_address: warnings.warn( "The monitor server is deprecated. 
Use the server_status\n" "ZEO method instead.", DeprecationWarning) self.monitor = StatsServer(monitor_address, self.stats) else: self.monitor = None def _setup_invq(self, name, storage): lastInvalidations = getattr(storage, 'lastInvalidations', None) if lastInvalidations is None: # Using None below doesn't look right, but the first # element in invq is never used. See get_invalidations. # (If it was used, it would generate an error, which would # be good. :) Doing this allows clients that were up to # date when a server was restarted to pick up transactions # it subsequently missed. self.invq[name] = [(storage.lastTransaction() or z64, None)] else: self.invq[name] = list(lastInvalidations(self.invq_bound)) self.invq[name].reverse() def _setup_auth(self, protocol): # Can't be done in global scope, because of cyclic references from ZEO.auth import get_module name = self.__class__.__name__ module = get_module(protocol) if not module: log("%s: no such an auth protocol: %s" % (name, protocol)) return storage_class, client, db_class = module if not storage_class or not issubclass(storage_class, ZEOStorage): log(("%s: %s isn't a valid protocol, must have a StorageClass" % (name, protocol))) self.auth_protocol = None return self.ZEOStorageClass = storage_class log("%s: using auth protocol: %s" % (name, protocol)) # We create a Database instance here for use with the authenticator # modules. Having one instance allows it to be shared between multiple # storages, avoiding the need to bloat each with a new authenticator # Database that would contain the same info, and also avoiding any # possibly synchronization issues between them. self.database = db_class(self.auth_database) if self.database.realm != self.auth_realm: raise ValueError("password database realm %r " "does not match storage realm %r" % (self.database.realm, self.auth_realm)) def new_connection(self, sock, addr): """Internal: factory to create a new connection. This is called by the Dispatcher class in ZEO.zrpc.server whenever accept() returns a socket for a new incoming connection. """ if self.auth_protocol and self.database: zstorage = self.ZEOStorageClass(self, self.read_only, auth_realm=self.auth_realm) zstorage.set_database(self.database) else: zstorage = self.ZEOStorageClass(self, self.read_only) c = self.ManagedServerConnectionClass(sock, addr, zstorage, self) log("new connection %s: %s" % (addr, repr(c))) return c def register_connection(self, storage_id, conn): """Internal: register a connection with a particular storage. This is called by ZEOStorage.register(). The dictionary self.connections maps each storage name to a list of current connections for that storage; this information is needed to handle invalidation. This function updates this dictionary. Returns the timeout and stats objects for the appropriate storage. """ self.connections[storage_id].append(conn) return self.stats[storage_id] def _invalidateCache(self, storage_id): """We need to invalidate any caches we have. This basically means telling our clients to invalidate/revalidate their caches. We do this by closing them and making them reconnect. """ # This method can be called from foreign threads. We have to # worry about interaction with the main thread. # 1. We modify self.invq which is read by get_invalidations # below. This is why get_invalidations makes a copy of # self.invq. # 2. We access connections. There are two dangers: # # a. We miss a new connection. 
This is not a problem because # if a client connects after we get the list of connections, # then it will have to read the invalidation queue, which # has already been reset. # # b. A connection is closes while we are iterating. This # doesn't matter, bacause we can call should_close on a closed # connection. # Rebuild invq self._setup_invq(storage_id, self.storages[storage_id]) # Make a copy since we are going to be mutating the # connections indirectoy by closing them. We don't care about # later transactions since they will have to validate their # caches anyway. for p in self.connections[storage_id][:]: try: p.connection.should_close() p.connection.trigger.pull_trigger() except ZEO.zrpc.error.DisconnectedError: pass def invalidate(self, conn, storage_id, tid, invalidated=(), info=None): """Internal: broadcast info and invalidations to clients. This is called from several ZEOStorage methods. invalidated is a sequence of oids. This can do three different things: - If the invalidated argument is non-empty, it broadcasts invalidateTransaction() messages to all clients of the given storage except the current client (the conn argument). - If the invalidated argument is empty and the info argument is a non-empty dictionary, it broadcasts info() messages to all clients of the given storage, including the current client. - If both the invalidated argument and the info argument are non-empty, it broadcasts invalidateTransaction() messages to all clients except the current, and sends an info() message to the current client. """ # This method can be called from foreign threads. We have to # worry about interaction with the main thread. # 1. We modify self.invq which is read by get_invalidations # below. This is why get_invalidations makes a copy of # self.invq. # 2. We access connections. There are two dangers: # # a. We miss a new connection. This is not a problem because # we are called while the storage lock is held. A new # connection that tries to read data won't read committed # data without first recieving an invalidation. Also, if a # client connects after getting the list of connections, # then it will have to read the invalidation queue, which # has been updated to reflect the invalidations. # # b. A connection is closes while we are iterating. We'll need # to cactch and ignore Disconnected errors. if invalidated: invq = self.invq[storage_id] if len(invq) >= self.invq_bound: invq.pop() invq.insert(0, (tid, invalidated)) for p in self.connections[storage_id]: try: if invalidated and p is not conn: p.client.invalidateTransaction(tid, invalidated) elif info is not None: p.client.info(info) except ZEO.zrpc.error.DisconnectedError: pass def get_invalidations(self, storage_id, tid): """Return a tid and list of all objects invalidation since tid. The tid is the most recent transaction id seen by the client. Returns None if it is unable to provide a complete list of invalidations for tid. In this case, client should do full cache verification. """ # We make a copy of invq because it might be modified by a # foreign (other than main thread) calling invalidate above. 
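        # Strategy: if the oldest queued entry is <= tid, the queue covers
        # everything the client missed, so collect oids from it.  Otherwise,
        # if invalidation_age permits, rebuild the list by iterating the
        # storage from tid forward.  Failing both, leave latest_tid as None
        # so callers fall back to full cache verification.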
invq = self.invq[storage_id][:] oids = set() latest_tid = None if invq and invq[-1][0] <= tid: # We have needed data in the queue for _tid, L in invq: if _tid <= tid: break oids.update(L) latest_tid = invq[0][0] elif (self.invalidation_age and (self.invalidation_age > (time.time()-ZODB.TimeStamp.TimeStamp(tid).timeTime()) ) ): for t in self.storages[storage_id].iterator(p64(u64(tid)+1)): for r in t: oids.add(r.oid) latest_tid = t.tid elif not invq: log("invq empty") else: log("tid to old for invq %s < %s" % (u64(tid), u64(invq[-1][0]))) return latest_tid, list(oids) def close_server(self): """Close the dispatcher so that there are no new connections. This is only called from the test suite, AFAICT. """ self.dispatcher.close() if self.monitor is not None: self.monitor.close() # Force the asyncore mainloop to exit by hackery, i.e. close # every socket in the map. loop() will return when the map is # empty. for s in asyncore.socket_map.values(): try: s.close() except: pass asyncore.socket_map.clear() for storage in self.storages.values(): storage.close() def close_conn(self, conn): """Internal: remove the given connection from self.connections. This is the inverse of register_connection(). """ for cl in self.connections.values(): if conn.obj in cl: cl.remove(conn.obj) def lock_storage(self, zeostore, delay): storage_id = zeostore.storage_id waiting = self._waiting[storage_id] with self._lock: if storage_id in self._commit_locks: # The lock is held by another zeostore locked = self._commit_locks[storage_id] assert locked is not zeostore, (storage_id, delay) if locked.connection is None: locked.log("Still locked after disconnected. Unlocking.", logging.CRITICAL) if locked.transaction: locked.storage.tpc_abort(locked.transaction) del self._commit_locks[storage_id] # yuck: have to manipulate lock to appease with :( self._lock.release() try: return self.lock_storage(zeostore, delay) finally: self._lock.acquire() if delay is None: # New request, queue it assert not [i for i in waiting if i[0] is zeostore ], "already waiting" delay = Delay() waiting.append((zeostore, delay)) zeostore.log("(%r) queue lock: transactions waiting: %s" % (storage_id, len(waiting)), _level_for_waiting(waiting) ) return False, delay else: self._commit_locks[storage_id] = zeostore self.timeouts[storage_id].begin(zeostore) self.stats[storage_id].lock_time = time.time() if delay is not None: # we were waiting, stop waiting[:] = [i for i in waiting if i[0] is not zeostore] zeostore.log("(%r) lock: transactions waiting: %s" % (storage_id, len(waiting)), _level_for_waiting(waiting) ) return True, delay def unlock_storage(self, zeostore): storage_id = zeostore.storage_id waiting = self._waiting[storage_id] with self._lock: assert self._commit_locks[storage_id] is zeostore del self._commit_locks[storage_id] self.timeouts[storage_id].end(zeostore) self.stats[storage_id].lock_time = None callbacks = waiting[:] if callbacks: assert not [i for i in waiting if i[0] is zeostore ], "waiting while unlocking" zeostore.log("(%r) unlock: transactions waiting: %s" % (storage_id, len(callbacks)), _level_for_waiting(callbacks) ) for zeostore, delay in callbacks: try: zeostore._unlock_callback(delay) except (SystemExit, KeyboardInterrupt): raise except Exception: logger.exception("Calling unlock callback") def stop_waiting(self, zeostore): storage_id = zeostore.storage_id waiting = self._waiting[storage_id] with self._lock: new_waiting = [i for i in waiting if i[0] is not zeostore] if len(new_waiting) == len(waiting): return waiting[:] = 
new_waiting zeostore.log("(%r) dequeue lock: transactions waiting: %s" % (storage_id, len(waiting)), _level_for_waiting(waiting) ) def already_waiting(self, zeostore): storage_id = zeostore.storage_id waiting = self._waiting[storage_id] with self._lock: return bool([i for i in waiting if i[0] is zeostore]) def server_status(self, zeostore): storage_id = zeostore.storage_id status = self.stats[storage_id].__dict__.copy() status['connections'] = len(status['connections']) status['waiting'] = len(self._waiting[storage_id]) status['timeout-thread-is-alive'] = self.timeouts[storage_id].isAlive() return status def _level_for_waiting(waiting): if len(waiting) > 9: return logging.CRITICAL if len(waiting) > 3: return logging.WARNING else: return logging.DEBUG class StubTimeoutThread: def begin(self, client): pass def end(self, client): pass isAlive = lambda self: 'stub' class TimeoutThread(threading.Thread): """Monitors transaction progress and generates timeouts.""" # There is one TimeoutThread per storage, because there's one # transaction lock per storage. def __init__(self, timeout): threading.Thread.__init__(self) self.setDaemon(1) self._timeout = timeout self._client = None self._deadline = None self._cond = threading.Condition() # Protects _client and _deadline def begin(self, client): # Called from the restart code the "main" thread, whenever the # storage lock is being acquired. (Serialized by asyncore.) with self._cond: assert self._client is None self._client = client self._deadline = time.time() + self._timeout self._cond.notify() def end(self, client): # Called from the "main" thread whenever the storage lock is # being released. (Serialized by asyncore.) with self._cond: assert self._client is not None assert self._client is client self._client = None self._deadline = None def run(self): # Code running in the thread. while 1: with self._cond: while self._deadline is None: self._cond.wait() howlong = self._deadline - time.time() if howlong <= 0: # Prevent reporting timeout more than once self._deadline = None client = self._client # For the howlong <= 0 branch below if howlong <= 0: client.log("Transaction timeout after %s seconds" % self._timeout, logging.CRITICAL) try: client.connection.call_from_thread(client.connection.close) except: client.log("Timeout failure", logging.CRITICAL, exc_info=sys.exc_info()) self.end(client) else: time.sleep(howlong) def run_in_thread(method, *args): t = SlowMethodThread(method, args) t.start() return t.delay class SlowMethodThread(threading.Thread): """Thread to run potentially slow storage methods. Clients can use the delay attribute to access the MTDelay object used to send a zrpc response at the right time. """ # Some storage methods can take a long time to complete. If we # run these methods via a standard asyncore read handler, they # will block all other server activity until they complete. To # avoid blocking, we spawn a separate thread, return an MTDelay() # object, and have the thread reply() when it finishes. 
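# (Illustrative usage, not from the original source; the method name is
# hypothetical:
#     delay = run_in_thread(self._some_slow_storage_method, arg)
#     return delay
# The asyncore loop keeps servicing other clients while the worker thread
# runs; when the call finishes, run() below answers the pending zrpc
# request via self.delay.reply(result) or self.delay.error(exc_info).)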
def __init__(self, method, args): threading.Thread.__init__(self) self._method = method self._args = args self.delay = MTDelay() def run(self): try: result = self._method(*self._args) except (SystemExit, KeyboardInterrupt): raise except Exception: self.delay.error(sys.exc_info()) else: self.delay.reply(result) class ClientStub: def __init__(self, rpc): self.rpc = rpc def beginVerify(self): self.rpc.callAsync('beginVerify') def invalidateVerify(self, args): self.rpc.callAsync('invalidateVerify', args) def endVerify(self): self.rpc.callAsync('endVerify') def invalidateTransaction(self, tid, args): # Note that this method is *always* called from a different # thread than self.rpc's async thread. It is the only method # for which this is true and requires special consideration! # callAsyncNoSend is important here because: # - callAsyncNoPoll isn't appropriate because # the network thread may not wake up for a long time, # delaying invalidations for too long. (This is demonstrateed # by a test failure.) # - callAsync isn't appropriate because (on the server) it tries # to write to the socket. If self.rpc's network thread also # tries to write at the ame time, we can run into problems # because handle_write isn't thread safe. self.rpc.callAsyncNoSend('invalidateTransaction', tid, args) def serialnos(self, arg): self.rpc.callAsyncNoPoll('serialnos', arg) def info(self, arg): self.rpc.callAsyncNoPoll('info', arg) def storeBlob(self, oid, serial, blobfilename): def store(): yield ('receiveBlobStart', (oid, serial)) f = open(blobfilename, 'rb') while 1: chunk = f.read(59000) if not chunk: break yield ('receiveBlobChunk', (oid, serial, chunk, )) f.close() yield ('receiveBlobStop', (oid, serial)) self.rpc.callAsyncIterator(store()) class ClientStub308(ClientStub): def invalidateTransaction(self, tid, args): ClientStub.invalidateTransaction( self, tid, [(arg, '') for arg in args]) def invalidateVerify(self, oid): ClientStub.invalidateVerify(self, (oid, '')) class ZEOStorage308Adapter: def __init__(self, storage): self.storage = storage def __eq__(self, other): return self is other or self.storage is other def getSerial(self, oid): return self.storage.loadEx(oid)[1] # Z200 def history(self, oid, version, size=1): if version: raise ValueError("Versions aren't supported.") return self.storage.history(oid, size=size) def getInvalidations(self, tid): result = self.storage.getInvalidations(tid) if result is not None: result = result[0], [(oid, '') for oid in result[1]] return result def verify(self, oid, version, tid): if version: raise StorageServerError("Versions aren't supported.") return self.storage.verify(oid, tid) def loadEx(self, oid, version=''): if version: raise StorageServerError("Versions aren't supported.") data, serial = self.storage.loadEx(oid) return data, serial, '' def storea(self, oid, serial, data, version, id): if version: raise StorageServerError("Versions aren't supported.") self.storage.storea(oid, serial, data, id) def storeBlobEnd(self, oid, serial, data, version, id): if version: raise StorageServerError("Versions aren't supported.") self.storage.storeBlobEnd(oid, serial, data, id) def storeBlobShared(self, oid, serial, data, filename, version, id): if version: raise StorageServerError("Versions aren't supported.") self.storage.storeBlobShared(oid, serial, data, filename, id) def getInfo(self): result = self.storage.getInfo() result['supportsVersions'] = False return result def zeoVerify(self, oid, s, sv=None): if sv: raise StorageServerError("Versions aren't supported.") 
self.storage.zeoVerify(oid, s) def modifiedInVersion(self, oid): return '' def versions(self): return () def versionEmpty(self, version): return True def commitVersion(self, *a, **k): raise NotImplementedError abortVersion = commitVersion def zeoLoad(self, oid): # Z200 p, s = self.storage.loadEx(oid) return p, s, '', None, None def __getattr__(self, name): return getattr(self.storage, name) def _addr_label(addr): if isinstance(addr, type("")): return addr else: host, port = addr return str(host) + ":" + str(port) class CommitLog: def __init__(self): self.file = tempfile.TemporaryFile(suffix=".comit-log") self.pickler = cPickle.Pickler(self.file, 1) self.pickler.fast = 1 self.stores = 0 def size(self): return self.file.tell() def delete(self, oid, serial): self.pickler.dump(('_delete', (oid, serial))) self.stores += 1 def checkread(self, oid, serial): self.pickler.dump(('_checkread', (oid, serial))) self.stores += 1 def store(self, oid, serial, data): self.pickler.dump(('_store', (oid, serial, data))) self.stores += 1 def restore(self, oid, serial, data, prev_txn): self.pickler.dump(('_restore', (oid, serial, data, prev_txn))) self.stores += 1 def undo(self, transaction_id): self.pickler.dump(('_undo', (transaction_id, ))) self.stores += 1 def __iter__(self): self.file.seek(0) unpickler = cPickle.Unpickler(self.file) for i in range(self.stores): yield unpickler.load() def close(self): if self.file: self.file.close() self.file = None ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/TransactionBuffer.py000066400000000000000000000116551230730566700243350ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """A TransactionBuffer store transaction updates until commit or abort. A transaction may generate enough data that it is not practical to always hold pending updates in memory. Instead, a TransactionBuffer is used to store the data until a commit or abort. """ # A faster implementation might store trans data in memory until it # reaches a certain size. from threading import Lock import os import cPickle import tempfile import ZODB.blob class TransactionBuffer: # Valid call sequences: # # ((store | invalidate)* begin_iterate next* clear)* close # # get_size can be called any time # The TransactionBuffer is used by client storage to hold update # data until the tpc_finish(). It is normally used by a single # thread, because only one thread can be in the two-phase commit # at one time. # It is possible, however, for one thread to close the storage # while another thread is in the two-phase commit. We must use # a lock to guard against this race, because unpredictable things # can happen in Python if one thread closes a file that another # thread is reading. In a debug build, an assert() can fail. # Caution: If an operation is performed on a closed TransactionBuffer, # it has no effect and does not raise an exception. 
The only time # this should occur is when a ClientStorage is closed in one # thread while another thread is in its tpc_finish(). It's not # clear what should happen in this case. If the tpc_finish() # completes without error, the Connection using it could have # inconsistent data. This should have minimal effect, though, # because the Connection is connected to a closed storage. def __init__(self): self.file = tempfile.TemporaryFile(suffix=".tbuf") self.lock = Lock() self.closed = 0 self.count = 0 self.size = 0 self.blobs = [] # It's safe to use a fast pickler because the only objects # stored are builtin types -- strings or None. self.pickler = cPickle.Pickler(self.file, 1) self.pickler.fast = 1 def close(self): self.clear() self.lock.acquire() try: self.closed = 1 try: self.file.close() except OSError: pass finally: self.lock.release() def store(self, oid, data): """Store oid, version, data for later retrieval""" self.lock.acquire() try: if self.closed: return self.pickler.dump((oid, data)) self.count += 1 # Estimate per-record cache size self.size = self.size + (data and len(data) or 0) + 31 finally: self.lock.release() def storeBlob(self, oid, blobfilename): self.blobs.append((oid, blobfilename)) def invalidate(self, oid): self.lock.acquire() try: if self.closed: return self.pickler.dump((oid, None)) self.count += 1 finally: self.lock.release() def clear(self): """Mark the buffer as empty""" self.lock.acquire() try: if self.closed: return self.file.seek(0) self.count = 0 self.size = 0 while self.blobs: oid, blobfilename = self.blobs.pop() if os.path.exists(blobfilename): ZODB.blob.remove_committed(blobfilename) finally: self.lock.release() def __iter__(self): self.lock.acquire() try: if self.closed: return self.file.flush() self.file.seek(0) return TBIterator(self.file, self.count) finally: self.lock.release() class TBIterator(object): def __init__(self, f, count): self.file = f self.count = count self.unpickler = cPickle.Unpickler(f) def __iter__(self): return self def next(self): """Return next tuple of data or None if EOF""" if self.count == 0: self.file.seek(0) self.size = 0 raise StopIteration oid_ver_data = self.unpickler.load() self.count -= 1 return oid_ver_data ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/__init__.py000066400000000000000000000060671230730566700224560ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """ZEO -- Zope Enterprise Objects. See the file README.txt in this directory for an overview. 
ZEO is now part of ZODB; ZODB's home on the web is http://wiki.zope.org/ZODB """ def DB(*args, **kw): import ZEO.ClientStorage, ZODB return ZODB.DB(ZEO.ClientStorage.ClientStorage(*args, **kw)) def connection(*args, **kw): db = DB(*args, **kw) conn = db.open() conn.onCloseCallback(db.close) return conn def client(*args, **kw): import ZEO.ClientStorage return ZEO.ClientStorage.ClientStorage(*args, **kw) def server(path=None, blob_dir=None, storage_conf=None, zeo_conf=None, port=None): """Convenience function to start a server for interactive exploration This fuction starts a ZEO server, given a storage configuration or a file-storage path and blob directory. You can also supply a ZEO configuration string or a port. If neither a ZEO port or configuration is supplied, a port is chosen randomly. The server address and a stop function are returned. The address can be passed to ZEO.ClientStorage.ClientStorage or ZEO.DB to create a client to the server. The stop function can be called without arguments to stop the server. Arguments: path A file-storage path. This argument is ignored if a storage configuration is supplied. blob_dir A blob directory path. This argument is ignored if a storage configuration is supplied. storage_conf A storage configuration string. If none is supplied, then at least a file-storage path must be supplied and the storage configuration will be generated from the file-storage path and the blob directory. zeo_conf A ZEO server configuration string. port If no ZEO configuration is supplied, the one will be computed from the port. If no port is supplied, one will be chosedn randomly. """ import os, ZEO.tests.forker if storage_conf is None and path is None: storage_conf = '\n' if port is None and zeo_conf is None: port = ZEO.tests.forker.get_port() addr, admin, pid, config = ZEO.tests.forker.start_zeo_server( storage_conf, zeo_conf, port, keep=True, path=path, blob_dir=blob_dir, suicide=False) os.remove(config) def stop_server(): ZEO.tests.forker.shutdown_zeo_server(admin) os.waitpid(pid, 0) return addr, stop_server ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/auth/000077500000000000000000000000001230730566700212755ustar00rootroot00000000000000ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/auth/__init__.py000066400000000000000000000023061230730566700234070ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
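# (Hedged usage sketch for the convenience helpers defined in
# ZEO/__init__.py above; the file-storage path below is hypothetical:
#     addr, stop = ZEO.server(path='/tmp/Data.fs')
#     conn = ZEO.connection(addr)
#     conn.root()['key'] = 'value'
#     import transaction; transaction.commit()
#     conn.close()
#     stop()
# ZEO.client(addr) returns just the ClientStorage, for use with an
# existing ZODB.DB.)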
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## _auth_modules = {} def get_module(name): if name == 'sha': from auth_sha import StorageClass, SHAClient, Database return StorageClass, SHAClient, Database elif name == 'digest': from auth_digest import StorageClass, DigestClient, DigestDatabase return StorageClass, DigestClient, DigestDatabase else: return _auth_modules.get(name) def register_module(name, storage_class, client, db): if _auth_modules.has_key(name): raise TypeError("%s is already registred" % name) _auth_modules[name] = storage_class, client, db ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/auth/auth_digest.py000066400000000000000000000123531230730566700241530ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Digest authentication for ZEO This authentication mechanism follows the design of HTTP digest authentication (RFC 2069). It is a simple challenge-response protocol that does not send passwords in the clear, but does not offer strong security. The RFC discusses many of the limitations of this kind of protocol. Guard the password database as if it contained plaintext passwords. It stores the hash of a username and password. This does not expose the plaintext password, but it is sensitive nonetheless. An attacker with the hash can impersonate the real user. This is a limitation of the simple digest scheme. HTTP is a stateless protocol, and ZEO is a stateful protocol. The security requirements are quite different as a result. The HTTP protocol uses a nonce as a challenge. The ZEO protocol requires a separate session key that is used for message authentication. We generate a second nonce for this purpose; the hash of nonce and user/realm/password is used as the session key. TODO: I'm not sure if this is a sound approach; SRP would be preferred. """ import os import random import struct import time from ZEO.auth.base import Database, Client from ZEO.StorageServer import ZEOStorage from ZEO.Exceptions import AuthError from ZEO.hash import sha1 def get_random_bytes(n=8): if os.path.exists("/dev/urandom"): f = open("/dev/urandom") s = f.read(n) f.close() else: L = [chr(random.randint(0, 255)) for i in range(n)] s = "".join(L) return s def hexdigest(s): return sha1(s).hexdigest() class DigestDatabase(Database): def __init__(self, filename, realm=None): Database.__init__(self, filename, realm) # Initialize a key used to build the nonce for a challenge. # We need one key for the lifetime of the server, so it # is convenient to store in on the database. 
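# (Illustrative summary of the challenge/response scheme described in the
# module docstring above; the local names here are hypothetical:
#     h_up     = hexdigest('%s:%s:%s' % (user, realm, password))  # stored in the db
#     response = hexdigest('%s:%s' % (h_up, challenge))           # sent by the client
#     key      = session_key(h_up, nonce)                         # installed on a match
# Only h_up is kept in this database, never the plaintext password.)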
self.noncekey = get_random_bytes(8) def _store_password(self, username, password): dig = hexdigest("%s:%s:%s" % (username, self.realm, password)) self._users[username] = dig def session_key(h_up, nonce): # The hash itself is a bit too short to be a session key. # HMAC wants a 64-byte key. We don't want to use h_up # directly because it would never change over time. Instead # use the hash plus part of h_up. return sha1("%s:%s" % (h_up, nonce)).digest() + h_up[:44] class StorageClass(ZEOStorage): def set_database(self, database): assert isinstance(database, DigestDatabase) self.database = database self.noncekey = database.noncekey def _get_time(self): # Return a string representing the current time. t = int(time.time()) return struct.pack("i", t) def _get_nonce(self): # RFC 2069 recommends a nonce of the form # H(client-IP ":" time-stamp ":" private-key) dig = sha1() dig.update(str(self.connection.addr)) dig.update(self._get_time()) dig.update(self.noncekey) return dig.hexdigest() def auth_get_challenge(self): """Return realm, challenge, and nonce.""" self._challenge = self._get_nonce() self._key_nonce = self._get_nonce() return self.auth_realm, self._challenge, self._key_nonce def auth_response(self, resp): # verify client response user, challenge, response = resp # Since zrpc is a stateful protocol, we just store the nonce # we sent to the client. It will need to generate a new # nonce for a new connection anyway. if self._challenge != challenge: raise ValueError("invalid challenge") # lookup user in database h_up = self.database.get_password(user) # regeneration resp from user, password, and nonce check = hexdigest("%s:%s" % (h_up, challenge)) if check == response: self.connection.setSessionKey(session_key(h_up, self._key_nonce)) return self._finish_auth(check == response) extensions = [auth_get_challenge, auth_response] class DigestClient(Client): extensions = ["auth_get_challenge", "auth_response"] def start(self, username, realm, password): _realm, challenge, nonce = self.stub.auth_get_challenge() if _realm != realm: raise AuthError("expected realm %r, got realm %r" % (_realm, realm)) h_up = hexdigest("%s:%s:%s" % (username, realm, password)) resp_dig = hexdigest("%s:%s" % (h_up, challenge)) result = self.stub.auth_response((username, challenge, resp_dig)) if result: return session_key(h_up, nonce) else: return None ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/auth/base.py000066400000000000000000000102451230730566700225630ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Base classes for defining an authentication protocol. Database -- abstract base class for password database Client -- abstract base class for authentication client """ import os from ZEO.hash import sha1 class Client: # Subclass should override to list the names of methods that # will be called on the server. 
extensions = [] def __init__(self, stub): self.stub = stub for m in self.extensions: setattr(self.stub, m, self.stub.extensionMethod(m)) def sort(L): """Sort a list in-place and return it.""" L.sort() return L class Database: """Abstracts a password database. This class is used both in the authentication process (via get_password()) and by client scripts that manage the password database file. The password file is a simple, colon-separated text file mapping usernames to password hashes. The hashes are SHA hex digests produced from the password string. """ realm = None def __init__(self, filename, realm=None): """Creates a new Database filename: a string containing the full pathname of the password database file. Must be readable by the user running ZEO. Must be writeable by any client script that accesses the database. realm: the realm name (a string) """ self._users = {} self.filename = filename self.load() if realm: if self.realm and self.realm != realm: raise ValueError("Specified realm %r differs from database " "realm %r" % (realm or '', self.realm)) else: self.realm = realm def save(self, fd=None): filename = self.filename if not fd: fd = open(filename, 'w') if self.realm: print >> fd, "realm", self.realm for username in sort(self._users.keys()): print >> fd, "%s: %s" % (username, self._users[username]) def load(self): filename = self.filename if not filename: return if not os.path.exists(filename): return fd = open(filename) L = fd.readlines() if not L: return if L[0].startswith("realm "): line = L.pop(0).strip() self.realm = line[len("realm "):] for line in L: username, hash = line.strip().split(":", 1) self._users[username] = hash.strip() def _store_password(self, username, password): self._users[username] = self.hash(password) def get_password(self, username): """Returns password hash for specified username. Callers must check for LookupError, which is raised in the case of a non-existent user specified.""" if not self._users.has_key(username): raise LookupError("No such user: %s" % username) return self._users[username] def hash(self, s): return sha1(s).hexdigest() def add_user(self, username, password): if self._users.has_key(username): raise LookupError("User %s already exists" % username) self._store_password(username, password) def del_user(self, username): if not self._users.has_key(username): raise LookupError("No such user: %s" % username) del self._users[username] def change_password(self, username, password): if not self._users.has_key(username): raise LookupError("No such user: %s" % username) self._store_password(username, password) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/auth/hmac.py000066400000000000000000000057301230730566700225640ustar00rootroot00000000000000"""HMAC (Keyed-Hashing for Message Authentication) Python module. Implements the HMAC algorithm as described by RFC 2104. """ def _strxor(s1, s2): """Utility method. XOR the two strings s1 and s2 (must have same length). """ return "".join(map(lambda x, y: chr(ord(x) ^ ord(y)), s1, s2)) # The size of the digests returned by HMAC depends on the underlying # hashing module used. digest_size = None class HMAC: """RFC2104 HMAC class. This supports the API for Cryptographic Hash Functions (PEP 247). """ def __init__(self, key, msg = None, digestmod = None): """Create a new HMAC object. key: key for the keyed hash object. msg: Initial input for the hash, if provided. digestmod: A module supporting PEP 247. Defaults to the md5 module. 
""" if digestmod is None: import md5 digestmod = md5 self.digestmod = digestmod self.outer = digestmod.new() self.inner = digestmod.new() self.digest_size = digestmod.digest_size blocksize = 64 ipad = "\x36" * blocksize opad = "\x5C" * blocksize if len(key) > blocksize: key = digestmod.new(key).digest() key = key + chr(0) * (blocksize - len(key)) self.outer.update(_strxor(key, opad)) self.inner.update(_strxor(key, ipad)) if msg is not None: self.update(msg) ## def clear(self): ## raise NotImplementedError("clear() method not available in HMAC.") def update(self, msg): """Update this hashing object with the string msg. """ self.inner.update(msg) def copy(self): """Return a separate copy of this hashing object. An update to this copy won't affect the original object. """ other = HMAC("") other.digestmod = self.digestmod other.inner = self.inner.copy() other.outer = self.outer.copy() return other def digest(self): """Return the hash value of this hashing object. This returns a string containing 8-bit data. The object is not altered in any way by this function; you can continue updating the object after calling this function. """ h = self.outer.copy() h.update(self.inner.digest()) return h.digest() def hexdigest(self): """Like digest(), but returns a string of hexadecimal digits instead. """ return "".join([hex(ord(x))[2:].zfill(2) for x in tuple(self.digest())]) def new(key, msg = None, digestmod = None): """Create a new hashing object and return it. key: The starting key for the hash. msg: if available, will immediately be hashed into the object's starting state. You can now feed arbitrary strings into the object using its update() method, and can ask for the hash value at any time by calling its digest() method. """ return HMAC(key, msg, digestmod) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/cache.py000066400000000000000000000723461230730566700217650ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Disk-based client cache for ZEO. ClientCache exposes an API used by the ZEO client storage. FileCache stores objects on disk using a 2-tuple of oid and tid as key. ClientCache's API is similar to a storage API, with methods like load(), store(), and invalidate(). It manages in-memory data structures that allow it to map this richer API onto the simple key-based API of the lower-level FileCache. """ from struct import pack, unpack import BTrees.LLBTree import BTrees.LOBTree import logging import os import tempfile import threading import time import ZODB.fsIndex import zc.lockfile from ZODB.utils import p64, u64, z64 logger = logging.getLogger("ZEO.cache") # A disk-based cache for ZEO clients. # # This class provides an interface to a persistent, disk-based cache # used by ZEO clients to store copies of database records from the # server. # # The details of the constructor as unspecified at this point. # # Each entry in the cache is valid for a particular range of transaction # ids. 
The lower bound is the transaction that wrote the data. The # upper bound is the next transaction that wrote a revision of the # object. If the data is current, the upper bound is stored as None; # the data is considered current until an invalidate() call is made. # # It is an error to call store() twice with the same object without an # intervening invalidate() to set the upper bound on the first cache # entry. Perhaps it will be necessary to have a call the removes # something from the cache outright, without keeping a non-current # entry. # Cache verification # # When the client is connected to the server, it receives # invalidations every time an object is modified. When the client is # disconnected then reconnects, it must perform cache verification to make # sure its cached data is synchronized with the storage's current state. # # quick verification # full verification # # FileCache stores a cache in a single on-disk file. # # On-disk cache structure. # # The file begins with a 12-byte header. The first four bytes are the # file's magic number - ZEC3 - indicating zeo cache version 4. The # next eight bytes are the last transaction id. magic = "ZEC3" ZEC_HEADER_SIZE = 12 # Maximum block size. Note that while we are doing a store, we may # need to write a free block that is almost twice as big. If we die # in the middle of a store, then we need to split the large free records # while opening. max_block_size = (1<<31) - 1 # After the header, the file contains a contiguous sequence of blocks. All # blocks begin with a one-byte status indicator: # # 'a' # Allocated. The block holds an object; the next 4 bytes are >I # format total block size. # # 'f' # Free. The block is free; the next 4 bytes are >I format total # block size. # # '1', '2', '3', '4' # The block is free, and consists of 1, 2, 3 or 4 bytes total. # # "Total" includes the status byte, and size bytes. There are no # empty (size 0) blocks. # Allocated blocks have more structure: # # 1 byte allocation status ('a'). # 4 bytes block size, >I format. # 8 byte oid # 8 byte start_tid # 8 byte end_tid # 2 byte version length must be 0 # 4 byte data size # data # 8 byte redundant oid for error detection. allocated_record_overhead = 43 # The cache's currentofs goes around the file, circularly, forever. # It's always the starting offset of some block. # # When a new object is added to the cache, it's stored beginning at # currentofs, and currentofs moves just beyond it. As many contiguous # blocks needed to make enough room for the new object are evicted, # starting at currentofs. Exception: if currentofs is close enough # to the end of the file that the new object can't fit in one # contiguous chunk, currentofs is reset to ZEC_HEADER_SIZE first. class locked(object): def __init__(self, func): self.func = func def __get__(self, inst, class_): if inst is None: return self def call(*args, **kw): inst._lock.acquire() try: return self.func(inst, *args, **kw) finally: inst._lock.release() return call class ClientCache(object): """A simple in-memory cache.""" # The default size of 200MB makes a lot more sense than the traditional # default of 20MB. The default here is misleading, though, since # ClientStorage is the only user of ClientCache, and it always passes an # explicit size of its own choosing. 
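    # (Illustrative arithmetic, not in the original source: the 43-byte
    # allocated_record_overhead defined above is the sum of the fixed
    # fields of an allocated block described in the module comments:
    #     1 (status 'a') + 4 (block size) + 8 (oid) + 8 (start_tid)
    #     + 8 (end_tid) + 2 (version length) + 4 (data size)
    #     + 8 (trailing redundant oid) = 43
    # so a record storing `data` occupies 43 + len(data) bytes of the file.)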
def __init__(self, path=None, size=200*1024**2, rearrange=.8): # - `path`: filepath for the cache file, or None (in which case # a temp file will be created) self.path = path # - `maxsize`: total size of the cache file # We set to the minimum size of less than the minimum. size = max(size, ZEC_HEADER_SIZE) self.maxsize = size # rearrange: if we read a current record and it's more than # rearrange*size from the end, then copy it forward to keep it # from being evicted. self.rearrange = rearrange * size # The number of records in the cache. self._len = 0 # {oid -> pos} self.current = ZODB.fsIndex.fsIndex() # {oid -> {tid->pos}} # Note that caches in the wild seem to have very little non-current # data, so this would seem to have little impact on memory consumption. # I wonder if we even need to store non-current data in the cache. self.noncurrent = BTrees.LOBTree.LOBTree() # tid for the most recent transaction we know about. This is also # stored near the start of the file. self.tid = z64 # Always the offset into the file of the start of a block. # New and relocated objects are always written starting at # currentofs. self.currentofs = ZEC_HEADER_SIZE # self.f is the open file object. # When we're not reusing an existing file, self.f is left None # here -- the scan() method must be called then to open the file # (and it sets self.f). fsize = ZEC_HEADER_SIZE if path: self._lock_file = zc.lockfile.LockFile(path + '.lock') if not os.path.exists(path): # Create a small empty file. We'll make it bigger in _initfile. self.f = open(path, 'wb+') self.f.write(magic+z64) logger.info("created persistent cache file %r", path) else: fsize = os.path.getsize(self.path) self.f = open(path, 'rb+') logger.info("reusing persistent cache file %r", path) else: # Create a small empty file. We'll make it bigger in _initfile. self.f = tempfile.TemporaryFile() self.f.write(magic+z64) logger.info("created temporary cache file %r", self.f.name) try: self._initfile(fsize) except: self.f.close() if not path: raise # unrecoverable temp file error :( badpath = path+'.bad' if os.path.exists(badpath): logger.critical( 'Removing bad cache file: %r (prev bad exists).', path, exc_info=1) os.remove(path) else: logger.critical('Moving bad cache file to %r.', badpath, exc_info=1) os.rename(path, badpath) self.f = open(path, 'wb+') self.f.write(magic+z64) self._initfile(ZEC_HEADER_SIZE) # Statistics: _n_adds, _n_added_bytes, # _n_evicts, _n_evicted_bytes, # _n_accesses self.clearStats() self._setup_trace(path) self._lock = threading.RLock() # Backward compatibility. Client code used to have to use the fc # attr to get to the file cache to get cache stats. @property def fc(self): return self def clear(self): self.f.seek(ZEC_HEADER_SIZE) self.f.truncate() self._initfile(ZEC_HEADER_SIZE) ## # Scan the current contents of the cache file, calling `install` # for each object found in the cache. This method should only # be called once to initialize the cache from disk. def _initfile(self, fsize): maxsize = self.maxsize f = self.f read = f.read seek = f.seek write = f.write seek(0) if read(4) != magic: seek(0) raise ValueError("unexpected magic number: %r" % read(4)) self.tid = read(8) if len(self.tid) != 8: raise ValueError("cache file too small -- no tid at start") # Populate .filemap and .key2entry to reflect what's currently in the # file, and tell our parent about it too (via the `install` callback). # Remember the location of the largest free block. That seems a # decent place to start currentofs. 
self.current = ZODB.fsIndex.fsIndex() self.noncurrent = BTrees.LOBTree.LOBTree() l = 0 last = ofs = ZEC_HEADER_SIZE first_free_offset = 0 current = self.current status = ' ' while ofs < fsize: seek(ofs) status = read(1) if status == 'a': size, oid, start_tid, end_tid, lver = unpack( ">I8s8s8sH", read(30)) if ofs+size <= maxsize: if end_tid == z64: assert oid not in current, (ofs, f.tell()) current[oid] = ofs else: assert start_tid < end_tid, (ofs, f.tell()) self._set_noncurrent(oid, start_tid, ofs) assert lver == 0, "Versions aren't supported" l += 1 else: # free block if first_free_offset == 0: first_free_offset = ofs if status == 'f': size, = unpack(">I", read(4)) if size > max_block_size: # Oops, we either have an old cache, or a we # crashed while storing. Split this block into two. assert size <= max_block_size*2 seek(ofs+max_block_size) write('f'+pack(">I", size-max_block_size)) seek(ofs) write('f'+pack(">I", max_block_size)) sync(f) elif status in '1234': size = int(status) else: raise ValueError("unknown status byte value %s in client " "cache file" % 0, hex(ord(status))) last = ofs ofs += size if ofs >= maxsize: # Oops, the file was bigger before. if ofs > maxsize: # The last record is too big. Replace it with a smaller # free record size = maxsize-last seek(last) if size > 4: write('f'+pack(">I", size)) else: write("012345"[size]) sync(f) ofs = maxsize break if fsize < maxsize: assert ofs==fsize # Make sure the OS really saves enough bytes for the file. seek(self.maxsize - 1) write('x') # add as many free blocks as are needed to fill the space seek(ofs) nfree = maxsize - ZEC_HEADER_SIZE for i in range(0, nfree, max_block_size): block_size = min(max_block_size, nfree-i) write('f' + pack(">I", block_size)) seek(block_size-5, 1) sync(self.f) # There is always data to read and assert last and status in ' f1234' first_free_offset = last else: assert ofs==maxsize if maxsize < fsize: seek(maxsize) f.truncate() # We use the first_free_offset because it is most likelyt the # place where we last wrote. self.currentofs = first_free_offset or ZEC_HEADER_SIZE self._len = l def _set_noncurrent(self, oid, tid, ofs): noncurrent_for_oid = self.noncurrent.get(u64(oid)) if noncurrent_for_oid is None: noncurrent_for_oid = BTrees.LLBTree.LLBucket() self.noncurrent[u64(oid)] = noncurrent_for_oid noncurrent_for_oid[u64(tid)] = ofs def _del_noncurrent(self, oid, tid): try: noncurrent_for_oid = self.noncurrent[u64(oid)] del noncurrent_for_oid[u64(tid)] if not noncurrent_for_oid: del self.noncurrent[u64(oid)] except KeyError: logger.error("Couldn't find non-current %r", (oid, tid)) def clearStats(self): self._n_adds = self._n_added_bytes = 0 self._n_evicts = self._n_evicted_bytes = 0 self._n_accesses = 0 def getStats(self): return (self._n_adds, self._n_added_bytes, self._n_evicts, self._n_evicted_bytes, self._n_accesses ) ## # The number of objects currently in the cache. def __len__(self): return self._len ## # Close the underlying file. No methods accessing the cache should be # used after this. def close(self): self._unsetup_trace() f = self.f self.f = None if f is not None: sync(f) f.close() if hasattr(self,'_lock_file'): self._lock_file.close() ## # Evict objects as necessary to free up at least nbytes bytes, # starting at currentofs. If currentofs is closer than nbytes to # the end of the file, currentofs is reset to ZEC_HEADER_SIZE first. # The number of bytes actually freed may be (and probably will be) # greater than nbytes, and is _makeroom's return value. 
The file is not # altered by _makeroom. filemap and key2entry are updated to reflect the # evictions, and it's the caller's responsibility both to fiddle # the file, and to update filemap, to account for all the space # freed (starting at currentofs when _makeroom returns, and # spanning the number of bytes retured by _makeroom). def _makeroom(self, nbytes): assert 0 < nbytes <= self.maxsize - ZEC_HEADER_SIZE, ( nbytes, self.maxsize) if self.currentofs + nbytes > self.maxsize: self.currentofs = ZEC_HEADER_SIZE ofs = self.currentofs seek = self.f.seek read = self.f.read current = self.current while nbytes > 0: seek(ofs) status = read(1) if status == 'a': size, oid, start_tid, end_tid = unpack(">I8s8s8s", read(28)) self._n_evicts += 1 self._n_evicted_bytes += size if end_tid == z64: del current[oid] else: self._del_noncurrent(oid, start_tid) self._len -= 1 else: if status == 'f': size = unpack(">I", read(4))[0] else: assert status in '1234' size = int(status) ofs += size nbytes -= size return ofs - self.currentofs ## # Update our idea of the most recent tid. This is stored in the # instance, and also written out near the start of the cache file. The # new tid must be strictly greater than our current idea of the most # recent tid. @locked def setLastTid(self, tid): if (not tid) or (tid == z64): return if (tid <= self.tid) and self._len: if tid == self.tid: return # Be a little forgiving raise ValueError("new last tid (%s) must be greater than " "previous one (%s)" % (u64(tid), u64(self.tid))) assert isinstance(tid, str) and len(tid) == 8, tid self.tid = tid self.f.seek(len(magic)) self.f.write(tid) self.f.flush() ## # Return the last transaction seen by the cache. # @return a transaction id # @defreturn string, or 8 nulls if no transaction is yet known def getLastTid(self): return self.tid ## # Return the current data record for oid. # @param oid object id # @return (data record, serial number, tid), or None if the object is not # in the cache # @defreturn 3-tuple: (string, string, string) @locked def load(self, oid): ofs = self.current.get(oid) if ofs is None: self._trace(0x20, oid) return None self.f.seek(ofs) read = self.f.read status = read(1) assert status == 'a', (ofs, self.f.tell(), oid) size, saved_oid, tid, end_tid, lver, ldata = unpack( ">I8s8s8sHI", read(34)) assert saved_oid == oid, (ofs, self.f.tell(), oid, saved_oid) assert end_tid == z64, (ofs, self.f.tell(), oid, tid, end_tid) assert lver == 0, "Versions aren't supported" data = read(ldata) assert len(data) == ldata, (ofs, self.f.tell(), oid, len(data), ldata) # WARNING: The following assert changes the file position. # We must not depend on this below or we'll fail in optimized mode. assert read(8) == oid, (ofs, self.f.tell(), oid) self._n_accesses += 1 self._trace(0x22, oid, tid, end_tid, ldata) ofsofs = self.currentofs - ofs if ofsofs < 0: ofsofs += self.maxsize if (ofsofs > self.rearrange and self.maxsize > 10*len(data) and size > 4): # The record is far back and might get evicted, but it's # valuable, so move it forward. # Remove fromn old loc: del self.current[oid] self.f.seek(ofs) self.f.write('f'+pack(">I", size)) # Write to new location: self._store(oid, tid, None, data, size) return data, tid ## # Return a non-current revision of oid that was current before tid. 
# @param oid object id # @param tid id of transaction that wrote next revision of oid # @return data record, serial number, start tid, and end tid # @defreturn 4-tuple: (string, string, string, string) @locked def loadBefore(self, oid, before_tid): noncurrent_for_oid = self.noncurrent.get(u64(oid)) if noncurrent_for_oid is None: self._trace(0x24, oid, "", before_tid) return None items = noncurrent_for_oid.items(None, u64(before_tid)-1) if not items: self._trace(0x24, oid, "", before_tid) return None tid, ofs = items[-1] self.f.seek(ofs) read = self.f.read status = read(1) assert status == 'a', (ofs, self.f.tell(), oid, before_tid) size, saved_oid, saved_tid, end_tid, lver, ldata = unpack( ">I8s8s8sHI", read(34)) assert saved_oid == oid, (ofs, self.f.tell(), oid, saved_oid) assert saved_tid == p64(tid), (ofs, self.f.tell(), oid, saved_tid, tid) assert end_tid != z64, (ofs, self.f.tell(), oid) assert lver == 0, "Versions aren't supported" data = read(ldata) assert len(data) == ldata, (ofs, self.f.tell()) # WARNING: The following assert changes the file position. # We must not depend on this below or we'll fail in optimized mode. assert read(8) == oid, (ofs, self.f.tell(), oid) if end_tid < before_tid: self._trace(0x24, oid, "", before_tid) return None self._n_accesses += 1 self._trace(0x26, oid, "", saved_tid) return data, saved_tid, end_tid ## # Store a new data record in the cache. # @param oid object id # @param start_tid the id of the transaction that wrote this revision # @param end_tid the id of the transaction that created the next # revision of oid. If end_tid is None, the data is # current. # @param data the actual data @locked def store(self, oid, start_tid, end_tid, data): seek = self.f.seek if end_tid is None: ofs = self.current.get(oid) if ofs: seek(ofs) read = self.f.read status = read(1) assert status == 'a', (ofs, self.f.tell(), oid) size, saved_oid, saved_tid, end_tid = unpack( ">I8s8s8s", read(28)) assert saved_oid == oid, (ofs, self.f.tell(), oid, saved_oid) assert end_tid == z64, (ofs, self.f.tell(), oid) if saved_tid == start_tid: return raise ValueError("already have current data for oid") else: noncurrent_for_oid = self.noncurrent.get(u64(oid)) if noncurrent_for_oid and (u64(start_tid) in noncurrent_for_oid): return size = allocated_record_overhead + len(data) # A number of cache simulation experiments all concluded that the # 2nd-level ZEO cache got a much higher hit rate if "very large" # objects simply weren't cached. For now, we ignore the request # only if the entire cache file is too small to hold the object. if size >= min(max_block_size, self.maxsize - ZEC_HEADER_SIZE): return self._n_adds += 1 self._n_added_bytes += size self._len += 1 self._store(oid, start_tid, end_tid, data, size) if end_tid: self._trace(0x54, oid, start_tid, end_tid, dlen=len(data)) else: self._trace(0x52, oid, start_tid, dlen=len(data)) def _store(self, oid, start_tid, end_tid, data, size): # Low-level store used by store and load # In the next line, we ask for an extra to make sure we always # have a free block after the new alocated block. This free # block acts as a ring pointer, so that on restart, we start # where we left off. nfreebytes = self._makeroom(size+1) assert size <= nfreebytes, (size, nfreebytes) excess = nfreebytes - size # If there's any excess (which is likely), we need to record a # free block following the end of the data record. That isn't # expensive -- it's all a contiguous write. 
if excess == 0: extra = '' elif excess < 5: extra = "01234"[excess] else: extra = 'f' + pack(">I", excess) ofs = self.currentofs seek = self.f.seek seek(ofs) write = self.f.write # Before writing data, we'll write a free block for the space freed. # We'll come back with a last atomic write to rewrite the start of the # allocated-block header. write('f'+pack(">I", nfreebytes)) # Now write the rest of the allocation block header and object data. write(pack(">8s8s8sHI", oid, start_tid, end_tid or z64, 0, len(data))) write(data) write(oid) write(extra) # Now, we'll go back and rewrite the beginning of the # allocated block header. seek(ofs) write('a'+pack(">I", size)) if end_tid: self._set_noncurrent(oid, start_tid, ofs) else: self.current[oid] = ofs self.currentofs += size ## # If `tid` is None, # forget all knowledge of `oid`. (`tid` can be None only for # invalidations generated by startup cache verification.) If `tid` # isn't None, and we had current # data for `oid`, stop believing we have current data, and mark the # data we had as being valid only up to `tid`. In all other cases, do # nothing. # # Paramters: # # - oid object id # - tid the id of the transaction that wrote a new revision of oid, # or None to forget all cached info about oid. @locked def invalidate(self, oid, tid): ofs = self.current.get(oid) if ofs is None: # 0x10 == invalidate (miss) self._trace(0x10, oid, tid) return self.f.seek(ofs) read = self.f.read status = read(1) assert status == 'a', (ofs, self.f.tell(), oid) size, saved_oid, saved_tid, end_tid = unpack(">I8s8s8s", read(28)) assert saved_oid == oid, (ofs, self.f.tell(), oid, saved_oid) assert end_tid == z64, (ofs, self.f.tell(), oid) del self.current[oid] if tid is None: self.f.seek(ofs) self.f.write('f'+pack(">I", size)) # 0x1E = invalidate (hit, discarding current or non-current) self._trace(0x1E, oid, tid) self._len -= 1 else: if tid == saved_tid: logger.warning("Ignoring invalidation with same tid as current") return self.f.seek(ofs+21) self.f.write(tid) self._set_noncurrent(oid, saved_tid, ofs) # 0x1C = invalidate (hit, saving non-current) self._trace(0x1C, oid, tid) ## # Generates (oid, serial) oairs for all objects in the # cache. This generator is used by cache verification. def contents(self): # May need to materialize list instead of iterating; # depends on whether the caller may change the cache. seek = self.f.seek read = self.f.read for oid, ofs in self.current.iteritems(): self._lock.acquire() try: seek(ofs) status = read(1) assert status == 'a', (ofs, self.f.tell(), oid) size, saved_oid, tid, end_tid = unpack(">I8s8s8s", read(28)) assert saved_oid == oid, (ofs, self.f.tell(), oid, saved_oid) assert end_tid == z64, (ofs, self.f.tell(), oid) result = oid, tid finally: self._lock.release() yield result def dump(self): from ZODB.utils import oid_repr print "cache size", len(self) L = list(self.contents()) L.sort() for oid, tid in L: print oid_repr(oid), oid_repr(tid) print "dll contents" L = list(self) L.sort(lambda x, y: cmp(x.key, y.key)) for x in L: end_tid = x.end_tid or z64 print oid_repr(x.key[0]), oid_repr(x.key[1]), oid_repr(end_tid) print # If `path` isn't None (== we're using a persistent cache file), and # envar ZEO_CACHE_TRACE is set to a non-empty value, try to open # path+'.trace' as a trace file, and store the file object in # self._tracefile. If not, or we can't write to the trace file, disable # tracing by setting self._trace to a dummy function, and set # self._tracefile to None. 
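    # (Illustrative note, not in the original file: with a persistent cache
    # path, tracing is enabled by setting the environment variable before
    # the client starts, e.g.
    #     os.environ['ZEO_CACHE_TRACE'] = '1'   # or: export ZEO_CACHE_TRACE=1
    # after which _setup_trace() below appends binary trace records to
    # path + '.trace'.)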
_tracefile = None def _trace(self, *a, **kw): pass def _setup_trace(self, path): _tracefile = None if path and os.environ.get("ZEO_CACHE_TRACE"): tfn = path + ".trace" try: _tracefile = open(tfn, "ab") except IOError, msg: logger.warning("cannot write tracefile %r (%s)", tfn, msg) else: logger.info("opened tracefile %r", tfn) if _tracefile is None: return now = time.time def _trace(code, oid="", tid=z64, end_tid=z64, dlen=0): # The code argument is two hex digits; bits 0 and 7 must be zero. # The first hex digit shows the operation, the second the outcome. # This method has been carefully tuned to be as fast as possible. # Note: when tracing is disabled, this method is hidden by a dummy. encoded = (dlen << 8) + code if tid is None: tid = z64 if end_tid is None: end_tid = z64 try: _tracefile.write( pack(">iiH8s8s", now(), encoded, len(oid), tid, end_tid) + oid, ) except: print `tid`, `end_tid` raise self._trace = _trace self._tracefile = _tracefile _trace(0x00) def _unsetup_trace(self): if self._tracefile is not None: del self._trace self._tracefile.close() del self._tracefile def sync(f): f.flush() if hasattr(os, 'fsync'): def sync(f): f.flush() os.fsync(f.fileno()) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/component.xml000066400000000000000000000115071230730566700230640ustar00rootroot00000000000000 The content of a ZEO section describe operational parameters of a ZEO server except for the storage(s) to be served. The address at which the server should listen. This can be in the form 'host:port' to signify a TCP/IP connection or a pathname string to signify a Unix domain socket connection (at least one '/' is required). A hostname may be a DNS name or a dotted IP address. If the hostname is omitted, the platform's default behavior is used when binding the listening socket ('' is passed to socket.bind() as the hostname portion of the address). Flag indicating whether the server should operate in read-only mode. Defaults to false. Note that even if the server is operating in writable mode, individual storages may still be read-only. But if the server is in read-only mode, no write operations are allowed, even if the storages are writable. Note that pack() is considered a read-only operation. The storage server keeps a queue of the objects modified by the last N transactions, where N == invalidation_queue_size. This queue is used to speed client cache verification when a client disconnects for a short period of time. The maximum age of a client for which quick-verification invalidations will be provided by iterating over the served storage. This option should only be used if the served storage supports efficient iteration from a starting point near the end of the transaction history (e.g. end of file). The address at which the monitor server should listen. If specified, a monitor server is started. The monitor server provides server statistics in a simple text format. This can be in the form 'host:port' to signify a TCP/IP connection or a pathname string to signify a Unix domain socket connection (at least one '/' is required). A hostname may be a DNS name or a dotted IP address. If the hostname is omitted, the platform's default behavior is used when binding the listening socket ('' is passed to socket.bind() as the hostname portion of the address). The maximum amount of time to wait for a transaction to commit after acquiring the storage lock, specified in seconds. If the transaction takes too long, the client connection will be closed and the transaction aborted. 
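By way of illustration, a minimal server configuration using some of the options described in this schema might look like the following sketch (the address, path, and values are hypothetical):

  <zeo>
    address localhost:8100
    read-only false
    invalidation-queue-size 100
    transaction-timeout 30
  </zeo>

  <filestorage 1>
    path /var/zodb/Data.fs
  </filestorage>

The storage section itself is configured separately, since this schema covers the server's operational parameters but not the storage(s) to be served.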
The name of the protocol used for authentication. The only protocol provided with ZEO is "digest," but extensions may provide other protocols. The path of the database containing authentication credentials. The authentication realm of the server. Some authentication schemes use a realm to identify the logical set of usernames that are accepted by this server. The full path to the file in which to write the ZEO server's Process ID at startup. If omitted, $INSTANCE/var/ZEO.pid is used. $INSTANCE/var/ZEO.pid (or $clienthome/ZEO.pid) indicates that the cache should be dropped rather than verified when the verification optimization is not available (e.g. when the ZEO server restarted). ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/hash.py000066400000000000000000000017041230730566700216330ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2008 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """In Python 2.6, the "sha" and "md5" modules have been deprecated in favor of using hashlib for both. This class allows for compatibility between versions.""" try: import hashlib sha1 = hashlib.sha1 new = sha1 except ImportError: import sha sha1 = sha.new new = sha1 digest_size = sha.digest_size ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/interfaces.py000066400000000000000000000034731230730566700230400ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2006 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## import zope.interface class StaleCache(object): """A ZEO cache is stale and requires verification. """ def __init__(self, storage): self.storage = storage class IServeable(zope.interface.Interface): """Interface provided by storages that can be served by ZEO """ def getTid(oid): """The last transaction to change an object Return the transaction id of the last transaction that committed a change to an object with the given object id. """ def tpc_transaction(): """The current transaction being committed. If a storage is participating in a two-phase commit, then return the transaction (object) being committed. Otherwise return None. """ def lastInvalidations(size): """Get recent transaction invalidations This method is optional and is used to get invalidations performed by the most recent transactions. 
An iterable of up to size entries must be returned, where each entry is a transaction id and a sequence of object-id/empty-string pairs describing the objects written by the transaction, in chronological order. """ ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/monitor.py000066400000000000000000000125031230730566700223760ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Monitor behavior of ZEO server and record statistics. $Id$ """ import asyncore import socket import time import types import logging zeo_version = 'unknown' try: import pkg_resources except ImportError: pass else: zeo_dist = pkg_resources.working_set.find( pkg_resources.Requirement.parse('ZODB3') ) if zeo_dist is not None: zeo_version = zeo_dist.version class StorageStats: """Per-storage usage statistics.""" def __init__(self, connections=None): self.connections = connections self.loads = 0 self.stores = 0 self.commits = 0 self.aborts = 0 self.active_txns = 0 self.verifying_clients = 0 self.lock_time = None self.conflicts = 0 self.conflicts_resolved = 0 self.start = time.ctime() @property def clients(self): return len(self.connections) def parse(self, s): # parse the dump format lines = s.split("\n") for line in lines: field, value = line.split(":", 1) if field == "Server started": self.start = value elif field == "Clients": # Hack because we use this both on the server and on # the client where there are no connections. self.connections = [0] * int(value) elif field == "Clients verifying": self.verifying_clients = int(value) elif field == "Active transactions": self.active_txns = int(value) elif field == "Commit lock held for": # This assumes self.lock_time = time.time() - int(value) elif field == "Commits": self.commits = int(value) elif field == "Aborts": self.aborts = int(value) elif field == "Loads": self.loads = int(value) elif field == "Stores": self.stores = int(value) elif field == "Conflicts": self.conflicts = int(value) elif field == "Conflicts resolved": self.conflicts_resolved = int(value) def dump(self, f): print >> f, "Server started:", self.start print >> f, "Clients:", self.clients print >> f, "Clients verifying:", self.verifying_clients print >> f, "Active transactions:", self.active_txns if self.lock_time: howlong = time.time() - self.lock_time print >> f, "Commit lock held for:", int(howlong) print >> f, "Commits:", self.commits print >> f, "Aborts:", self.aborts print >> f, "Loads:", self.loads print >> f, "Stores:", self.stores print >> f, "Conflicts:", self.conflicts print >> f, "Conflicts resolved:", self.conflicts_resolved class StatsClient(asyncore.dispatcher): def __init__(self, sock, addr): asyncore.dispatcher.__init__(self, sock) self.buf = [] self.closed = 0 def close(self): self.closed = 1 # The socket is closed after all the data is written. # See handle_write(). 
def write(self, s): self.buf.append(s) def writable(self): return len(self.buf) def readable(self): return 0 def handle_write(self): s = "".join(self.buf) self.buf = [] n = self.socket.send(s) if n < len(s): self.buf.append(s[:n]) if self.closed and not self.buf: asyncore.dispatcher.close(self) class StatsServer(asyncore.dispatcher): StatsConnectionClass = StatsClient def __init__(self, addr, stats): asyncore.dispatcher.__init__(self) self.addr = addr self.stats = stats if type(self.addr) == types.TupleType: self.create_socket(socket.AF_INET, socket.SOCK_STREAM) else: self.create_socket(socket.AF_UNIX, socket.SOCK_STREAM) self.set_reuse_addr() logger = logging.getLogger('ZEO.monitor') logger.info("listening on %s", repr(self.addr)) self.bind(self.addr) self.listen(5) def writable(self): return 0 def readable(self): return 1 def handle_accept(self): try: sock, addr = self.accept() except socket.error: return f = self.StatsConnectionClass(sock, addr) self.dump(f) f.close() def dump(self, f): print >> f, "ZEO monitor server version %s" % zeo_version print >> f, time.ctime() print >> f L = self.stats.keys() L.sort() for k in L: stats = self.stats[k] print >> f, "Storage:", k stats.dump(f) print >> f ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/protocol.txt000066400000000000000000000044331230730566700227420ustar00rootroot00000000000000ZEO Network Protocol (sans authentication) ========================================== This document describes the ZEO network protocol. It assumes that the optional authentication protocol isn't used. At the lowest level, the protocol consists of sized messages. All communication between the client and server consists of sized messages. A sized message consists of a 4-byte unsigned big-endian content length, followed by the content. There are two subprotocols, for protocol negotiation, and for normal operation. The normal operation protocol is a basic RPC protocol. In the protocol negotiation phase, the server sends a protocol identifier to the client. The client chooses a protocol to use to the server. The client or the server can fail if it doesn't like the protocol string sent by the other party. After sending their protocol strings, the client and server switch to RPC mode. The RPC protocol uses messages that are pickled tuples consisting of: message_id The message id is used to match replies with requests, allowing multiple outstanding synchronous requests. async_flag An integer 0 for a regular (2-way) request and 1 for a one-way request. Two-way requests have a reply. One way requests don't. ZRS tries to use as many one-way requests as possible to avoid network round trips. name The name of a method to call. If this is the special string ".reply", then the message is interpreted as a return from a synchronous call. args A tuple of positional arguments or returned values. After making a connection and negotiating the protocol, the following interactions occur: - The client requests the authentication protocol by calling getAuthProtocol. For this discussion, we'll assume the server returns None. Note that if the server doesn't require authentication, this step is optional. - The client calls register passing a storage identifier and a read-only flag. The server doesn't return a value, but it may raise an exception either if the storage doesn't exist, or if the storage is readonly and the read-only flag passed by the client is false. At this point, the client and server send each other messages as needed. 
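For illustration, here is a minimal sketch of the framing described above, written in Python 2 style to match the rest of this tree. It is not the actual ZEO networking code, and the helper names are invented; it only shows the 4-byte unsigned big-endian length prefix and the pickled (message_id, async_flag, name, args) tuple.

    import cPickle
    import struct

    def send_message(sock, message_id, async_flag, name, args):
        # Pickle the RPC tuple, then prefix it with its content length.
        content = cPickle.dumps((message_id, async_flag, name, args), 1)
        sock.sendall(struct.pack(">I", len(content)) + content)

    def recv_message(sock):
        # Read the 4-byte unsigned big-endian length, then that many bytes.
        size = struct.unpack(">I", recv_exactly(sock, 4))[0]
        return cPickle.loads(recv_exactly(sock, size))

    def recv_exactly(sock, n):
        data = ""
        while len(data) < n:
            chunk = sock.recv(n - len(data))
            if not chunk:
                raise EOFError("connection closed mid-message")
            data += chunk
        return data

A reply reuses the same layout with the special name ".reply", and the receiving side matches it to the original request by message_id.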
The client may make regular or one-way calls to the server. The server sends replies and one-way calls to the client. ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/runzeo.py000066400000000000000000000334421230730566700222360ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002, 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Start the ZEO storage server. Usage: %s [-C URL] [-a ADDRESS] [-f FILENAME] [-h] Options: -C/--configuration URL -- configuration file or URL -a/--address ADDRESS -- server address of the form PORT, HOST:PORT, or PATH (a PATH must contain at least one "/") -f/--filename FILENAME -- filename for FileStorage -t/--timeout TIMEOUT -- transaction timeout in seconds (default no timeout) -h/--help -- print this usage message and exit -m/--monitor ADDRESS -- address of monitor server ([HOST:]PORT or PATH) --pid-file PATH -- relative path to output file containing this process's pid; default $(INSTANCE_HOME)/var/ZEO.pid but only if envar INSTANCE_HOME is defined Unless -C is specified, -a and -f are required. """ # The code here is designed to be reused by other, similar servers. # For the forseeable future, it must work under Python 2.1 as well as # 2.2 and above. import asyncore import os import sys import signal import socket import logging import ZConfig.datatypes import ZEO from zdaemon.zdoptions import ZDOptions logger = logging.getLogger('ZEO.runzeo') _pid = str(os.getpid()) def log(msg, level=logging.INFO, exc_info=False): """Internal: generic logging function.""" message = "(%s) %s" % (_pid, msg) logger.log(level, message, exc_info=exc_info) def parse_binding_address(arg): # Caution: Not part of the official ZConfig API. obj = ZConfig.datatypes.SocketBindingAddress(arg) return obj.family, obj.address def windows_shutdown_handler(): # Called by the signal mechanism on Windows to perform shutdown. import asyncore asyncore.close_all() class ZEOOptionsMixin: storages = None def handle_address(self, arg): self.family, self.address = parse_binding_address(arg) def handle_monitor_address(self, arg): self.monitor_family, self.monitor_address = parse_binding_address(arg) def handle_filename(self, arg): from ZODB.config import FileStorage # That's a FileStorage *opener*! 
class FSConfig: def __init__(self, name, path): self._name = name self.path = path self.stop = None def getSectionName(self): return self._name if not self.storages: self.storages = [] name = str(1 + len(self.storages)) conf = FileStorage(FSConfig(name, arg)) self.storages.append(conf) testing_exit_immediately = False def handle_test(self, *args): self.testing_exit_immediately = True def add_zeo_options(self): self.add(None, None, None, "test", self.handle_test) self.add(None, None, "a:", "address=", self.handle_address) self.add(None, None, "f:", "filename=", self.handle_filename) self.add("family", "zeo.address.family") self.add("address", "zeo.address.address", required="no server address specified; use -a or -C") self.add("read_only", "zeo.read_only", default=0) self.add("invalidation_queue_size", "zeo.invalidation_queue_size", default=100) self.add("invalidation_age", "zeo.invalidation_age") self.add("transaction_timeout", "zeo.transaction_timeout", "t:", "timeout=", float) self.add("monitor_address", "zeo.monitor_address.address", "m:", "monitor=", self.handle_monitor_address) self.add('auth_protocol', 'zeo.authentication_protocol', None, 'auth-protocol=', default=None) self.add('auth_database', 'zeo.authentication_database', None, 'auth-database=') self.add('auth_realm', 'zeo.authentication_realm', None, 'auth-realm=') self.add('pid_file', 'zeo.pid_filename', None, 'pid-file=') class ZEOOptions(ZDOptions, ZEOOptionsMixin): __doc__ = __doc__ logsectionname = "eventlog" schemadir = os.path.dirname(ZEO.__file__) def __init__(self): ZDOptions.__init__(self) self.add_zeo_options() self.add("storages", "storages", required="no storages specified; use -f or -C") def realize(self, *a, **k): ZDOptions.realize(self, *a, **k) nunnamed = [s for s in self.storages if s.name is None] if nunnamed: if len(nunnamed) > 1: return self.usage("No more than one storage may be unnamed.") if [s for s in self.storages if s.name == '1']: return self.usage( "Can't have an unnamed storage and a storage named 1.") for s in self.storages: if s.name is None: s.name = '1' break class ZEOServer: def __init__(self, options): self.options = options def main(self): self.setup_default_logging() self.check_socket() self.clear_socket() self.make_pidfile() try: self.open_storages() self.setup_signals() self.create_server() self.loop_forever() finally: self.close_storages() self.clear_socket() self.remove_pidfile() def setup_default_logging(self): if self.options.config_logger is not None: return # No log file is configured; default to stderr. root = logging.getLogger() root.setLevel(logging.INFO) fmt = logging.Formatter( "------\n%(asctime)s %(levelname)s %(name)s %(message)s", "%Y-%m-%dT%H:%M:%S") handler = logging.StreamHandler() handler.setFormatter(fmt) root.addHandler(handler) def check_socket(self): if self.can_connect(self.options.family, self.options.address): self.options.usage("address %s already in use" % repr(self.options.address)) def can_connect(self, family, address): s = socket.socket(family, socket.SOCK_STREAM) try: s.connect(address) except socket.error: return 0 else: s.close() return 1 def clear_socket(self): if isinstance(self.options.address, type("")): try: os.unlink(self.options.address) except os.error: pass def open_storages(self): self.storages = {} for opener in self.options.storages: log("opening storage %r using %s" % (opener.name, opener.__class__.__name__)) self.storages[opener.name] = opener.open() def setup_signals(self): """Set up signal handlers. 
The signal handler for SIGFOO is a method handle_sigfoo(). If no handler method is defined for a signal, the signal action is not changed from its initial value. The handler method is called without additional arguments. """ if os.name != "posix": if os.name == "nt": self.setup_win32_signals() return if hasattr(signal, 'SIGXFSZ'): signal.signal(signal.SIGXFSZ, signal.SIG_IGN) # Special case init_signames() for sig, name in signames.items(): method = getattr(self, "handle_" + name.lower(), None) if method is not None: def wrapper(sig_dummy, frame_dummy, method=method): method() signal.signal(sig, wrapper) def setup_win32_signals(self): # Borrow the Zope Signals package win32 support, if available. # Signals does a check/log for the availability of pywin32. try: import Signals.Signals except ImportError: logger.debug("Signals package not found. " "Windows-specific signal handler " "will *not* be installed.") return SignalHandler = Signals.Signals.SignalHandler if SignalHandler is not None: # may be None if no pywin32. SignalHandler.registerHandler(signal.SIGTERM, windows_shutdown_handler) SignalHandler.registerHandler(signal.SIGINT, windows_shutdown_handler) SIGUSR2 = 12 # not in signal module on Windows. SignalHandler.registerHandler(SIGUSR2, self.handle_sigusr2) def create_server(self): self.server = create_server(self.storages, self.options) def loop_forever(self): if self.options.testing_exit_immediately: print "testing exit immediately" else: asyncore.loop() def handle_sigterm(self): log("terminated by SIGTERM") sys.exit(0) def handle_sigint(self): log("terminated by SIGINT") sys.exit(0) def handle_sighup(self): log("restarted by SIGHUP") sys.exit(1) def handle_sigusr2(self): # log rotation signal - do the same as Zope 2.7/2.8... if self.options.config_logger is None or os.name not in ("posix", "nt"): log("received SIGUSR2, but it was not handled!", level=logging.WARNING) return loggers = [self.options.config_logger] if os.name == "posix": for l in loggers: l.reopen() log("Log files reopened successfully", level=logging.INFO) else: # nt - same rotation code as in Zope's Signals/Signals.py for l in loggers: for f in l.handler_factories: handler = f() if hasattr(handler, 'rotate') and callable(handler.rotate): handler.rotate() log("Log files rotation complete", level=logging.INFO) def close_storages(self): for name, storage in self.storages.items(): log("closing storage %r" % name) try: storage.close() except: # Keep going log("failed to close storage %r" % name, level=logging.ERROR, exc_info=True) def _get_pidfile(self): pidfile = self.options.pid_file # 'pidfile' is marked as not required. if not pidfile: # Try to find a reasonable location if the pidfile is not # set. If we are running in a Zope environment, we can # safely assume INSTANCE_HOME. instance_home = os.environ.get("INSTANCE_HOME") if not instance_home: # If all our attempts failed, just log a message and # proceed. logger.debug("'pidfile' option not set, and 'INSTANCE_HOME' " "environment variable could not be found. 
" "Cannot guess pidfile location.") return self.options.pid_file = os.path.join(instance_home, "var", "ZEO.pid") def make_pidfile(self): if not self.options.read_only: self._get_pidfile() pidfile = self.options.pid_file if pidfile is None: return pid = os.getpid() try: if os.path.exists(pidfile): os.unlink(pidfile) f = open(pidfile, 'w') print >> f, pid f.close() log("created PID file '%s'" % pidfile) except IOError: logger.error("PID file '%s' cannot be opened" % pidfile) def remove_pidfile(self): if not self.options.read_only: pidfile = self.options.pid_file if pidfile is None: return try: if os.path.exists(pidfile): os.unlink(pidfile) log("removed PID file '%s'" % pidfile) except IOError: logger.error("PID file '%s' could not be removed" % pidfile) def create_server(storages, options): from ZEO.StorageServer import StorageServer return StorageServer( options.address, storages, read_only = options.read_only, invalidation_queue_size = options.invalidation_queue_size, invalidation_age = options.invalidation_age, transaction_timeout = options.transaction_timeout, monitor_address = options.monitor_address, auth_protocol = options.auth_protocol, auth_database = options.auth_database, auth_realm = options.auth_realm, ) # Signal names signames = None def signame(sig): """Return a symbolic name for a signal. Return "signal NNN" if there is no corresponding SIG name in the signal module. """ if signames is None: init_signames() return signames.get(sig) or "signal %d" % sig def init_signames(): global signames signames = {} for name, sig in signal.__dict__.items(): k_startswith = getattr(name, "startswith", None) if k_startswith is None: continue if k_startswith("SIG") and not k_startswith("SIG_"): signames[sig] = name # Main program def main(args=None): options = ZEOOptions() options.realize(args) s = ZEOServer(options) s.main() if __name__ == "__main__": main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/schema.xml000066400000000000000000000022311230730566700223140ustar00rootroot00000000000000 This schema describes the configuration of the ZEO storage server process.

One or more storages that are provided by the ZEO server. The section names are used as the storage names, and must be unique within each ZEO storage server. Traditionally, these names represent small integers starting at '1'.
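For illustration only (the storage type, names, and paths below are examples, not defaults), a server serving two storages might declare them as follows; clients then select a storage by passing the matching section name when they register:

    <filestorage 1>
      path /var/zodb/main.fs
    </filestorage>

    <filestorage 2>
      path /var/zodb/catalog.fs
    </filestorage>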
ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/scripts/000077500000000000000000000000001230730566700220235ustar00rootroot00000000000000ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/scripts/README.txt000066400000000000000000000036131230730566700235240ustar00rootroot00000000000000This directory contains a collection of utilities for working with ZEO. Some are more useful than others. If you install ZODB using distutils ("python setup.py install"), some of these will be installed. Unless otherwise noted, these scripts are invoked with the name of the Data.fs file as their only argument. Example: checkbtrees.py data.fs. parsezeolog.py -- parse BLATHER logs from ZEO server This script may be obsolete. It has not been tested against the current log output of the ZEO server. Reports on the time and size of transactions committed by a ZEO server, by inspecting log messages at BLATHER level. timeout.py -- script to test transaction timeout usage: timeout.py address delay [storage-name] This script connects to a storage, begins a transaction, calls store() and tpc_vote(), and then sleeps forever. This should trigger the transaction timeout feature of the server. zeopack.py -- pack a ZEO server The script connects to a server and calls pack() on a specific storage. See the script for usage details. zeoreplay.py -- experimental script to replay transactions from a ZEO log Like parsezeolog.py, this may be obsolete because it was written against an earlier version of the ZEO server. See the script for usage details. zeoup.py usage: zeoup.py [options] The test will connect to a ZEO server, load the root object, and attempt to update the zeoup counter in the root. It will report success if it updates to counter or if it gets a ConflictError. A ConflictError is considered a success, because the client was able to start a transaction. See the script for details about the options. zeoserverlog.py -- analyze ZEO server log for performance statistics See the module docstring for details; there are a large number of options. New in ZODB3 3.1.4. zeoqueue.py -- report number of clients currently waiting in the ZEO queue See the module docstring for details. ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/scripts/__init__.py000066400000000000000000000000021230730566700241240ustar00rootroot00000000000000# ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/scripts/cache_simul.py000077500000000000000000000474201230730566700246630ustar00rootroot00000000000000#! /usr/bin/env python ############################################################################## # # Copyright (c) 2001-2005 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Cache simulation. Usage: simul.py [-s size] tracefile Options: -s size: cache size in MB (default 20 MB) -i: summarizing interval in minutes (default 15; max 60) -r: rearrange factor Note: - The simulation isn't perfect. 
- The simulation will be far off if the trace file was created starting with a non-empty cache """ import bisect import getopt import struct import re import sys import ZEO.cache from ZODB.utils import z64, u64 # we assign ctime locally to facilitate test replacement! from time import ctime def usage(msg): print >> sys.stderr, msg print >> sys.stderr, __doc__ def main(args=None): if args is None: args = sys.argv[1:] # Parse options. MB = 1<<20 cachelimit = 20*MB rearrange = 0.8 simclass = CircularCacheSimulation interval_step = 15 try: opts, args = getopt.getopt(args, "s:i:r:") except getopt.error, msg: usage(msg) return 2 for o, a in opts: if o == '-s': cachelimit = int(float(a)*MB) elif o == '-i': interval_step = int(a) elif o == '-r': rearrange = float(a) else: assert False, (o, a) interval_step *= 60 if interval_step <= 0: interval_step = 60 elif interval_step > 3600: interval_step = 3600 if len(args) != 1: usage("exactly one file argument required") return 2 filename = args[0] # Open file. if filename.endswith(".gz"): # Open gzipped file. try: import gzip except ImportError: print >> sys.stderr, "can't read gzipped files (no module gzip)" return 1 try: f = gzip.open(filename, "rb") except IOError, msg: print >> sys.stderr, "can't open %s: %s" % (filename, msg) return 1 elif filename == "-": # Read from stdin. f = sys.stdin else: # Open regular file. try: f = open(filename, "rb") except IOError, msg: print >> sys.stderr, "can't open %s: %s" % (filename, msg) return 1 # Create simulation object. sim = simclass(cachelimit, rearrange) interval_sim = simclass(cachelimit, rearrange) # Print output header. sim.printheader() # Read trace file, simulating cache behavior. f_read = f.read unpack = struct.unpack FMT = ">iiH8s8s" FMT_SIZE = struct.calcsize(FMT) assert FMT_SIZE == 26 last_interval = None while 1: # Read a record and decode it. r = f_read(FMT_SIZE) if len(r) < FMT_SIZE: break ts, code, oidlen, start_tid, end_tid = unpack(FMT, r) if ts == 0: # Must be a misaligned record caused by a crash; skip 8 bytes # and try again. Why 8? Lost in the mist of history. f.seek(f.tell() - FMT_SIZE + 8) continue oid = f_read(oidlen) if len(oid) < oidlen: break # Decode the code. dlen, version, code = ((code & 0x7fffff00) >> 8, code & 0x80, code & 0x7e) # And pass it to the simulation. this_interval = int(ts)/interval_step if this_interval != last_interval: if last_interval is not None: interval_sim.report() interval_sim.restart() if not interval_sim.warm: sim.restart() last_interval = this_interval sim.event(ts, dlen, version, code, oid, start_tid, end_tid) interval_sim.event(ts, dlen, version, code, oid, start_tid, end_tid) f.close() # Finish simulation. interval_sim.report() sim.finish() class Simulation(object): """Base class for simulations. The driver program calls: event(), printheader(), finish(). The standard event() method calls these additional methods: write(), load(), inval(), report(), restart(); the standard finish() method also calls report(). """ def __init__(self, cachelimit, rearrange): self.cachelimit = cachelimit self.rearrange = rearrange # Initialize global statistics. self.epoch = None self.warm = False self.total_loads = 0 self.total_hits = 0 # subclass must increment self.total_invals = 0 # subclass must increment self.total_writes = 0 if not hasattr(self, "extras"): self.extras = (self.extraname,) self.format = self.format + " %7s" * len(self.extras) # Reset per-run statistics and set up simulation data. self.restart() def restart(self): # Reset per-run statistics. 
self.loads = 0 self.hits = 0 # subclass must increment self.invals = 0 # subclass must increment self.writes = 0 self.ts0 = None def event(self, ts, dlen, _version, code, oid, start_tid, end_tid): # Record first and last timestamp seen. if self.ts0 is None: self.ts0 = ts if self.epoch is None: self.epoch = ts self.ts1 = ts # Simulate cache behavior. Caution: the codes in the trace file # record whether the actual cache missed or hit on each load, but # that bears no necessary relationship to whether the simulated cache # will hit or miss. Relatedly, if the actual cache needed to store # an object, the simulated cache may not need to (it may already # have the data). action = code & 0x70 if action & 0x20: # Load. self.loads += 1 self.total_loads += 1 # Asserting that dlen is 0 iff it's a load miss. # assert (dlen == 0) == (code in (0x20, 0x24)) self.load(oid, dlen, start_tid, code) elif action & 0x40: # Store. assert dlen self.write(oid, dlen, start_tid, end_tid) elif action & 0x10: # Invalidate. self.inval(oid, start_tid) elif action == 0x00: # Restart. self.restart() else: raise ValueError("unknown trace code 0x%x" % code) def write(self, oid, size, start_tid, end_tid): pass def load(self, oid, size, start_tid, code): # Must increment .hits and .total_hits as appropriate. pass def inval(self, oid, start_tid): # Must increment .invals and .total_invals as appropriate. pass format = "%12s %6s %7s %7s %6s %6s %7s" # Subclass should override extraname to name known instance variables; # if extraname is 'foo', both self.foo and self.total_foo must exist: extraname = "*** please override ***" def printheader(self): print "%s, cache size %s bytes" % (self.__class__.__name__, addcommas(self.cachelimit)) self.extraheader() extranames = tuple([s.upper() for s in self.extras]) args = ("START TIME", "DUR.", "LOADS", "HITS", "INVALS", "WRITES", "HITRATE") + extranames print self.format % args def extraheader(self): pass nreports = 0 def report(self): if not hasattr(self, 'ts1'): return self.nreports += 1 args = (ctime(self.ts0)[4:-8], duration(self.ts1 - self.ts0), self.loads, self.hits, self.invals, self.writes, hitrate(self.loads, self.hits)) args += tuple([getattr(self, name) for name in self.extras]) print self.format % args def finish(self): # Make sure that the last line of output ends with "OVERALL". This # makes it much easier for another program parsing the output to # find summary statistics. print '-'*74 if self.nreports < 2: self.report() else: self.report() args = ( ctime(self.epoch)[4:-8], duration(self.ts1 - self.epoch), self.total_loads, self.total_hits, self.total_invals, self.total_writes, hitrate(self.total_loads, self.total_hits)) args += tuple([getattr(self, "total_" + name) for name in self.extras]) print self.format % args # For use in CircularCacheSimulation. class CircularCacheEntry(object): __slots__ = ( # object key: an (oid, start_tid) pair, where start_tid is the # tid of the transaction that created this revision of oid 'key', # tid of transaction that created the next revision; z64 iff # this is the current revision 'end_tid', # Offset from start of file to the object's data record; this # includes all overhead bytes (status byte, size bytes, etc). 
'offset', ) def __init__(self, key, end_tid, offset): self.key = key self.end_tid = end_tid self.offset = offset from ZEO.cache import ZEC_HEADER_SIZE class CircularCacheSimulation(Simulation): """Simulate the ZEO 3.0 cache.""" # The cache is managed as a single file with a pointer that # goes around the file, circularly, forever. New objects # are written at the current pointer, evicting whatever was # there previously. extras = "evicts", "inuse" evicts = 0 def __init__(self, cachelimit, rearrange): from ZEO import cache Simulation.__init__(self, cachelimit, rearrange) self.total_evicts = 0 # number of cache evictions # Current offset in file. self.offset = ZEC_HEADER_SIZE # Map offset in file to (size, CircularCacheEntry) pair, or to # (size, None) if the offset starts a free block. self.filemap = {ZEC_HEADER_SIZE: (self.cachelimit - ZEC_HEADER_SIZE, None)} # Map key to CircularCacheEntry. A key is an (oid, tid) pair. self.key2entry = {} # Map oid to tid of current revision. self.current = {} # Map oid to list of (start_tid, end_tid) pairs in sorted order. # Used to find matching key for load of non-current data. self.noncurrent = {} # The number of overhead bytes needed to store an object pickle # on disk (all bytes beyond those needed for the object pickle). self.overhead = ZEO.cache.allocated_record_overhead # save evictions so we can replay them, if necessary self.evicted = {} def restart(self): Simulation.restart(self) if self.evicts: self.warm = True self.evicts = 0 self.evicted_hit = self.evicted_miss = 0 evicted_hit = evicted_miss = 0 def load(self, oid, size, tid, code): if (code == 0x20) or (code == 0x22): # Trying to load current revision. if oid in self.current: # else it's a cache miss self.hits += 1 self.total_hits += 1 tid = self.current[oid] entry = self.key2entry[(oid, tid)] offset_offset = self.offset - entry.offset if offset_offset < 0: offset_offset += self.cachelimit assert offset_offset >= 0 if offset_offset > self.rearrange * self.cachelimit: # we haven't accessed it in a while. Move it forward size = self.filemap[entry.offset][0] self._remove(*entry.key) self.add(oid, size, tid) elif oid in self.evicted: size, e = self.evicted[oid] self.write(oid, size, e.key[1], z64, 1) self.evicted_hit += 1 else: self.evicted_miss += 1 return # May or may not be trying to load current revision. cur_tid = self.current.get(oid) if cur_tid == tid: self.hits += 1 self.total_hits += 1 return # It's a load for non-current data. Do we know about this oid? L = self.noncurrent.get(oid) if L is None: return # cache miss i = bisect.bisect_left(L, (tid, None)) if i == 0: # This tid is smaller than any we know about -- miss. return lo, hi = L[i-1] assert lo < tid if tid > hi: # No data in the right tid range -- miss. return # Cache hit. self.hits += 1 self.total_hits += 1 # (oid, tid) is in the cache. Remove it: take it out of key2entry, # and in `filemap` mark the space it occupied as being free. The # caller is responsible for removing it from `current` or `noncurrent`. 
def _remove(self, oid, tid): key = oid, tid e = self.key2entry.pop(key) pos = e.offset size, _e = self.filemap[pos] assert e is _e self.filemap[pos] = size, None def _remove_noncurrent_revisions(self, oid): noncurrent_list = self.noncurrent.get(oid) if noncurrent_list: self.invals += len(noncurrent_list) self.total_invals += len(noncurrent_list) for start_tid, end_tid in noncurrent_list: self._remove(oid, start_tid) del self.noncurrent[oid] def inval(self, oid, tid): if tid == z64: # This is part of startup cache verification: forget everything # about this oid. self._remove_noncurrent_revisions(oid) if oid in self.evicted: del self.evicted[oid] cur_tid = self.current.get(oid) if cur_tid is None: # We don't have current data, so nothing more to do. return # We had current data for oid, but no longer. self.invals += 1 self.total_invals += 1 del self.current[oid] if tid == z64: # Startup cache verification: forget this oid entirely. self._remove(oid, cur_tid) return # Our current data becomes non-current data. # Add the validity range to the list of non-current data for oid. assert cur_tid < tid L = self.noncurrent.setdefault(oid, []) bisect.insort_left(L, (cur_tid, tid)) # Update the end of oid's validity range in its CircularCacheEntry. e = self.key2entry[oid, cur_tid] assert e.end_tid == z64 e.end_tid = tid def write(self, oid, size, start_tid, end_tid, evhit=0): if end_tid == z64: # Storing current revision. if oid in self.current: # we already have it in cache if evhit: import pdb; pdb.set_trace() raise ValueError('WTF') return self.current[oid] = start_tid self.writes += 1 self.total_writes += 1 self.add(oid, size, start_tid) return if evhit: import pdb; pdb.set_trace() raise ValueError('WTF') # Storing non-current revision. L = self.noncurrent.setdefault(oid, []) p = start_tid, end_tid if p in L: return # we already have it in cache bisect.insort_left(L, p) self.writes += 1 self.total_writes += 1 self.add(oid, size, start_tid, end_tid) # Add `oid` to the cache, evicting objects as needed to make room. # This updates `filemap` and `key2entry`; it's the caller's # responsibilty to update `current` or `noncurrent` appropriately. def add(self, oid, size, start_tid, end_tid=z64): key = oid, start_tid assert key not in self.key2entry size += self.overhead avail = self.makeroom(size+1) # see cache.py e = CircularCacheEntry(key, end_tid, self.offset) self.filemap[self.offset] = size, e self.key2entry[key] = e self.offset += size # All the space made available must be accounted for in filemap. excess = avail - size if excess: self.filemap[self.offset] = excess, None # Evict enough objects to make at least `need` contiguous bytes, starting # at `self.offset`, available. Evicted objects are removed from # `filemap`, `key2entry`, `current` and `noncurrent`. The caller is # responsible for adding new entries to `filemap` to account for all # the freed bytes, and for advancing `self.offset`. The number of bytes # freed is the return value, and will be >= need. 
def makeroom(self, need): if self.offset + need > self.cachelimit: self.offset = ZEC_HEADER_SIZE pos = self.offset while need > 0: assert pos < self.cachelimit size, e = self.filemap.pop(pos) if e: # there is an object here (else it's already free space) self.evicts += 1 self.total_evicts += 1 assert pos == e.offset _e = self.key2entry.pop(e.key) assert e is _e oid, start_tid = e.key if e.end_tid == z64: del self.current[oid] self.evicted[oid] = size-self.overhead, e else: L = self.noncurrent[oid] L.remove((start_tid, e.end_tid)) need -= size pos += size return pos - self.offset # total number of bytes freed def report(self): self.check() free = used = total = 0 for size, e in self.filemap.itervalues(): total += size if e: used += size else: free += size self.inuse = round(100.0 * used / total, 1) self.total_inuse = self.inuse Simulation.report(self) #print self.evicted_hit, self.evicted_miss def check(self): oidcount = 0 pos = ZEC_HEADER_SIZE while pos < self.cachelimit: size, e = self.filemap[pos] if e: oidcount += 1 assert self.key2entry[e.key].offset == pos pos += size assert oidcount == len(self.key2entry) assert pos == self.cachelimit def dump(self): print len(self.filemap) L = list(self.filemap) L.sort() for k in L: v = self.filemap[k] print k, v[0], repr(v[1]) def roundup(size): k = MINSIZE while k < size: k += k return k def hitrate(loads, hits): if loads < 1: return 'n/a' return "%5.1f%%" % (100.0 * hits / loads) def duration(secs): mm, ss = divmod(secs, 60) hh, mm = divmod(mm, 60) if hh: return "%d:%02d:%02d" % (hh, mm, ss) if mm: return "%d:%02d" % (mm, ss) return "%d" % ss nre = re.compile('([=-]?)(\d+)([.]\d*)?').match def addcommas(n): sign, s, d = nre(str(n)).group(1, 2, 3) if d == '.0': d = '' result = s[-3:] s = s[:-3] while s: result = s[-3:]+','+result s = s[:-3] return (sign or '') + result + (d or '') import random def maybe(f, p=0.5): if random.random() < p: f() if __name__ == "__main__": sys.exit(main()) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/scripts/cache_stats.py000077500000000000000000000311061230730566700246620ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Trace file statistics analyzer. Usage: stats.py [-h] [-i interval] [-q] [-s] [-S] [-v] [-X] tracefile -h: print histogram of object load frequencies -i: summarizing interval in minutes (default 15; max 60) -q: quiet; don't print summaries -s: print histogram of object sizes -S: don't print statistics -v: verbose; print each record -X: enable heuristic checking for misaligned records: oids > 2**32 will be rejected; this requires the tracefile to be seekable """ """File format: Each record is 26 bytes, plus a variable number of bytes to store an oid, with the following layout. Numbers are big-endian integers. 
Offset Size Contents 0 4 timestamp (seconds since 1/1/1970) 4 3 data size, in 256-byte increments, rounded up 7 1 code (see below) 8 2 object id length 10 8 start tid 18 8 end tid 26 variable object id The code at offset 7 packs three fields: Mask bits Contents 0x80 1 set if there was a non-empty version string 0x7e 6 function and outcome code 0x01 1 current cache file (0 or 1) The "current cache file" bit is no longer used; it refers to a 2-file cache scheme used before ZODB 3.3. The function and outcome codes are documented in detail at the end of this file in the 'explain' dictionary. Note that the keys there (and also the arguments to _trace() in ClientStorage.py) are 'code & 0x7e', i.e. the low bit is always zero. """ import sys import time import getopt import struct from types import StringType # we assign ctime locally to facilitate test replacement! from time import ctime def usage(msg): print >> sys.stderr, msg print >> sys.stderr, __doc__ def main(args=None): if args is None: args = sys.argv[1:] # Parse options verbose = False quiet = False dostats = True print_size_histogram = False print_histogram = False interval = 15*60 # Every 15 minutes heuristic = False try: opts, args = getopt.getopt(args, "hi:qsSvX") except getopt.error, msg: usage(msg) return 2 for o, a in opts: if o == '-h': print_histogram = True elif o == "-i": interval = int(60 * float(a)) if interval <= 0: interval = 60 elif interval > 3600: interval = 3600 elif o == "-q": quiet = True verbose = False elif o == "-s": print_size_histogram = True elif o == "-S": dostats = False elif o == "-v": verbose = True elif o == '-X': heuristic = True else: assert False, (o, opts) if len(args) != 1: usage("exactly one file argument required") return 2 filename = args[0] # Open file if filename.endswith(".gz"): # Open gzipped file try: import gzip except ImportError: print >> sys.stderr, "can't read gzipped files (no module gzip)" return 1 try: f = gzip.open(filename, "rb") except IOError, msg: print >> sys.stderr, "can't open %s: %s" % (filename, msg) return 1 elif filename == '-': # Read from stdin f = sys.stdin else: # Open regular file try: f = open(filename, "rb") except IOError, msg: print >> sys.stderr, "can't open %s: %s" % (filename, msg) return 1 rt0 = time.time() bycode = {} # map code to count of occurrences byinterval = {} # map code to count in current interval records = 0 # number of trace records read versions = 0 # number of trace records with versions datarecords = 0 # number of records with dlen set datasize = 0L # sum of dlen across records with dlen set oids = {} # map oid to number of times it was loaded bysize = {} # map data size to number of loads bysizew = {} # map data size to number of writes total_loads = 0 t0 = None # first timestamp seen te = None # most recent timestamp seen h0 = None # timestamp at start of current interval he = None # timestamp at end of current interval thisinterval = None # generally te//interval f_read = f.read unpack = struct.unpack FMT = ">iiH8s8s" FMT_SIZE = struct.calcsize(FMT) assert FMT_SIZE == 26 # Read file, gathering statistics, and printing each record if verbose. print ' '*16, "%7s %7s %7s %7s" % ('loads', 'hits', 'inv(h)', 'writes'), print 'hitrate' try: while 1: r = f_read(FMT_SIZE) if len(r) < FMT_SIZE: break ts, code, oidlen, start_tid, end_tid = unpack(FMT, r) if ts == 0: # Must be a misaligned record caused by a crash. 
if not quiet: print "Skipping 8 bytes at offset", f.tell() - FMT_SIZE f.seek(f.tell() - FMT_SIZE + 8) continue oid = f_read(oidlen) if len(oid) < oidlen: break records += 1 if t0 is None: t0 = ts thisinterval = t0 // interval h0 = he = ts te = ts if ts // interval != thisinterval: if not quiet: dumpbyinterval(byinterval, h0, he) byinterval = {} thisinterval = ts // interval h0 = ts he = ts dlen, code = (code & 0x7fffff00) >> 8, code & 0xff if dlen: datarecords += 1 datasize += dlen if code & 0x80: version = 'V' versions += 1 else: version = '-' code &= 0x7e bycode[code] = bycode.get(code, 0) + 1 byinterval[code] = byinterval.get(code, 0) + 1 if dlen: if code & 0x70 == 0x20: # All loads bysize[dlen] = d = bysize.get(dlen) or {} d[oid] = d.get(oid, 0) + 1 elif code & 0x70 == 0x50: # All stores bysizew[dlen] = d = bysizew.get(dlen) or {} d[oid] = d.get(oid, 0) + 1 if verbose: print "%s %02x %s %016x %016x %c%s" % ( ctime(ts)[4:-5], code, oid_repr(oid), U64(start_tid), U64(end_tid), version, dlen and (' '+str(dlen)) or "") if code & 0x70 == 0x20: oids[oid] = oids.get(oid, 0) + 1 total_loads += 1 elif code == 0x00: # restart if not quiet: dumpbyinterval(byinterval, h0, he) byinterval = {} thisinterval = ts // interval h0 = he = ts if not quiet: print ctime(ts)[4:-5], print '='*20, "Restart", '='*20 except KeyboardInterrupt: print "\nInterrupted. Stats so far:\n" end_pos = f.tell() f.close() rte = time.time() if not quiet: dumpbyinterval(byinterval, h0, he) # Error if nothing was read if not records: print >> sys.stderr, "No records processed" return 1 # Print statistics if dostats: print print "Read %s trace records (%s bytes) in %.1f seconds" % ( addcommas(records), addcommas(end_pos), rte-rt0) print "Versions: %s records used a version" % addcommas(versions) print "First time: %s" % ctime(t0) print "Last time: %s" % ctime(te) print "Duration: %s seconds" % addcommas(te-t0) print "Data recs: %s (%.1f%%), average size %d bytes" % ( addcommas(datarecords), 100.0 * datarecords / records, datasize / datarecords) print "Hit rate: %.1f%% (load hits / loads)" % hitrate(bycode) print codes = bycode.keys() codes.sort() print "%13s %4s %s" % ("Count", "Code", "Function (action)") for code in codes: print "%13s %02x %s" % ( addcommas(bycode.get(code, 0)), code, explain.get(code) or "*** unknown code ***") # Print histogram. if print_histogram: print print "Histogram of object load frequency" total = len(oids) print "Unique oids: %s" % addcommas(total) print "Total loads: %s" % addcommas(total_loads) s = addcommas(total) width = max(len(s), len("objects")) fmt = "%5d %" + str(width) + "s %5.1f%% %5.1f%% %5.1f%%" hdr = "%5s %" + str(width) + "s %6s %6s %6s" print hdr % ("loads", "objects", "%obj", "%load", "%cum") cum = 0.0 for binsize, count in histogram(oids): obj_percent = 100.0 * count / total load_percent = 100.0 * count * binsize / total_loads cum += load_percent print fmt % (binsize, addcommas(count), obj_percent, load_percent, cum) # Print size histogram. 
if print_size_histogram: print print "Histograms of object sizes" print dumpbysize(bysizew, "written", "writes") dumpbysize(bysize, "loaded", "loads") def dumpbysize(bysize, how, how2): print print "Unique sizes %s: %s" % (how, addcommas(len(bysize))) print "%10s %6s %6s" % ("size", "objs", how2) sizes = bysize.keys() sizes.sort() for size in sizes: loads = 0 for n in bysize[size].itervalues(): loads += n print "%10s %6d %6d" % (addcommas(size), len(bysize.get(size, "")), loads) def dumpbyinterval(byinterval, h0, he): loads = hits = invals = writes = 0 for code in byinterval: if code & 0x20: n = byinterval[code] loads += n if code in (0x22, 0x26): hits += n elif code & 0x40: writes += byinterval[code] elif code & 0x10: if code != 0x10: invals += byinterval[code] if loads: hr = "%5.1f%%" % (100.0 * hits / loads) else: hr = 'n/a' print "%s-%s %7s %7s %7s %7s %7s" % ( ctime(h0)[4:-8], ctime(he)[14:-8], loads, hits, invals, writes, hr) def hitrate(bycode): loads = hits = 0 for code in bycode: if code & 0x70 == 0x20: n = bycode[code] loads += n if code in (0x22, 0x26): hits += n if loads: return 100.0 * hits / loads else: return 0.0 def histogram(d): bins = {} for v in d.itervalues(): bins[v] = bins.get(v, 0) + 1 L = bins.items() L.sort() return L def U64(s): return struct.unpack(">Q", s)[0] def oid_repr(oid): if isinstance(oid, StringType) and len(oid) == 8: return '%16x' % U64(oid) else: return repr(oid) def addcommas(n): sign, s = '', str(n) if s[0] == '-': sign, s = '-', s[1:] i = len(s) - 3 while i > 0: s = s[:i] + ',' + s[i:] i -= 3 return sign + s explain = { # The first hex digit shows the operation, the second the outcome. # If the second digit is in "02468" then it is a 'miss'. # If it is in "ACE" then it is a 'hit'. 0x00: "_setup_trace (initialization)", 0x10: "invalidate (miss)", 0x1A: "invalidate (hit, version)", 0x1C: "invalidate (hit, saving non-current)", # 0x1E can occur during startup verification. 0x1E: "invalidate (hit, discarding current or non-current)", 0x20: "load (miss)", 0x22: "load (hit)", 0x24: "load (non-current, miss)", 0x26: "load (non-current, hit)", 0x50: "store (version)", 0x52: "store (current, non-version)", 0x54: "store (non-current)", } if __name__ == "__main__": sys.exit(main()) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/scripts/parsezeolog.py000066400000000000000000000064761230730566700247440ustar00rootroot00000000000000#!/usr/bin/env python2.3 """Parse the BLATHER logging generated by ZEO2. 
An example of the log format is: 2002-04-15T13:05:29 BLATHER(-100) ZEO Server storea(3235680, [714], 235339406490168806) ('10.0.26.30', 45514) """ import re import time rx_time = re.compile('(\d\d\d\d-\d\d-\d\d)T(\d\d:\d\d:\d\d)') def parse_time(line): """Return the time portion of a zLOG line in seconds or None.""" mo = rx_time.match(line) if mo is None: return None date, time_ = mo.group(1, 2) date_l = [int(elt) for elt in date.split('-')] time_l = [int(elt) for elt in time_.split(':')] return int(time.mktime(date_l + time_l + [0, 0, 0])) rx_meth = re.compile("zrpc:\d+ calling (\w+)\((.*)") def parse_method(line): pass def parse_line(line): """Parse a log entry and return time, method info, and client.""" t = parse_time(line) if t is None: return None, None mo = rx_meth.search(line) if mo is None: return None, None meth_name = mo.group(1) meth_args = mo.group(2).strip() if meth_args.endswith(')'): meth_args = meth_args[:-1] meth_args = [s.strip() for s in meth_args.split(",")] m = meth_name, tuple(meth_args) return t, m class TStats: counter = 1 def __init__(self): self.id = TStats.counter TStats.counter += 1 fields = ("time", "vote", "done", "user", "path") fmt = "%-24s %5s %5s %-15s %s" hdr = fmt % fields def report(self): """Print a report about the transaction""" t = time.ctime(self.begin) if hasattr(self, "vote"): d_vote = self.vote - self.begin else: d_vote = "*" if hasattr(self, "finish"): d_finish = self.finish - self.begin else: d_finish = "*" print self.fmt % (time.ctime(self.begin), d_vote, d_finish, self.user, self.url) class TransactionParser: def __init__(self): self.txns = {} self.skipped = 0 def parse(self, line): t, m = parse_line(line) if t is None: return name = m[0] meth = getattr(self, name, None) if meth is not None: meth(t, m[1]) def tpc_begin(self, time, args): t = TStats() t.begin = time t.user = args[1] t.url = args[2] t.objects = [] tid = eval(args[0]) self.txns[tid] = t def get_txn(self, args): tid = eval(args[0]) try: return self.txns[tid] except KeyError: print "uknown tid", repr(tid) return None def tpc_finish(self, time, args): t = self.get_txn(args) if t is None: return t.finish = time def vote(self, time, args): t = self.get_txn(args) if t is None: return t.vote = time def get_txns(self): L = [(t.id, t) for t in self.txns.values()] L.sort() return [t for (id, t) in L] if __name__ == "__main__": import fileinput p = TransactionParser() i = 0 for line in fileinput.input(): i += 1 try: p.parse(line) except: print "line", i raise print "Transaction: %d" % len(p.txns) print TStats.hdr for txn in p.get_txns(): txn.report() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/scripts/tests.py000066400000000000000000000020201230730566700235310ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## import doctest, re, unittest from zope.testing import renormalizing def test_suite(): return unittest.TestSuite(( doctest.DocFileSuite( 'zeopack.test', checker=renormalizing.RENormalizing([ (re.compile('usage: Usage: '), 'Usage: '), # Py 2.4 (re.compile('options:'), 'Options:'), # Py 2.4 ]) ), )) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/scripts/timeout.py000077500000000000000000000032251230730566700240700ustar00rootroot00000000000000#!/usr/bin/env python2.3 """Transaction timeout test script. This script connects to a storage, begins a transaction, calls store() and tpc_vote(), and then sleeps forever. This should trigger the transaction timeout feature of the server. usage: timeout.py address delay [storage-name] """ import sys import time from ZODB.Transaction import Transaction from ZODB.tests.MinPO import MinPO from ZODB.tests.StorageTestBase import zodb_pickle from ZEO.ClientStorage import ClientStorage ZERO = '\0'*8 def main(): if len(sys.argv) not in (3, 4): sys.stderr.write("Usage: timeout.py address delay [storage-name]\n" % sys.argv[0]) sys.exit(2) hostport = sys.argv[1] delay = float(sys.argv[2]) if sys.argv[3:]: name = sys.argv[3] else: name = "1" if "/" in hostport: address = hostport else: if ":" in hostport: i = hostport.index(":") host, port = hostport[:i], hostport[i+1:] else: host, port = "", hostport port = int(port) address = (host, port) print "Connecting to %s..." % repr(address) storage = ClientStorage(address, name) print "Connected. Now starting a transaction..." oid = storage.new_oid() revid = ZERO data = MinPO("timeout.py") pickled_data = zodb_pickle(data) t = Transaction() t.user = "timeout.py" storage.tpc_begin(t) storage.store(oid, revid, pickled_data, '', t) print "Stored. Now voting..." storage.tpc_vote(t) print "Voted; now sleeping %s..." % delay time.sleep(delay) print "Done." if __name__ == "__main__": main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/scripts/zeopack.py000077500000000000000000000117741230730566700240460ustar00rootroot00000000000000#!/usr/bin/env python2.3 import logging import optparse import socket import sys import time import traceback import ZEO.ClientStorage usage = """Usage: %prog [options] [servers] Pack one or more storages hosted by ZEO servers. The positional arguments specify 0 or more tcp servers to pack, where each is of the form: host:port[:name] """ WAIT = 10 # wait no more than 10 seconds for client to connect def _main(args=None, prog=None): if args is None: args = sys.argv[1:] parser = optparse.OptionParser(usage, prog=prog) parser.add_option( "-d", "--days", dest="days", type='int', default=0, help=("Pack objects that are older than this number of days") ) parser.add_option( "-t", "--time", dest="time", help=("Time of day to pack to of the form: HH[:MM[:SS]]. " "Defaults to current time.") ) parser.add_option( "-u", "--unix", dest="unix_sockets", action="append", help=("A unix-domain-socket server to connect to, of the form: " "path[:name]") ) parser.remove_option('-h') parser.add_option( "-h", dest="host", help=("Deprecated: " "Used with the -p and -S options, specified the host to " "connect to.") ) parser.add_option( "-p", type="int", dest="port", help=("Deprecated: " "Used with the -h and -S options, specifies " "the port to connect to.") ) parser.add_option( "-S", dest="name", default='1', help=("Deprecated: Used with the -h and -p, options, or with the " "-U option specified the storage name to use. 
Defaults to 1.") ) parser.add_option( "-U", dest="unix", help=("Deprecated: Used with the -S option, " "Unix-domain socket to connect to.") ) if not args: parser.print_help() return def error(message): sys.stderr.write("Error:\n%s\n" % message) sys.exit(1) options, args = parser.parse_args(args) packt = time.time() if options.time: time_ = map(int, options.time.split(':')) if len(time_) == 1: time_ += (0, 0) elif len(time_) == 2: time_ += (0,) elif len(time_) > 3: error("Invalid time value: %r" % options.time) packt = time.localtime(packt) packt = time.mktime(packt[:3]+tuple(time_)+packt[6:]) packt -= options.days * 86400 servers = [] if options.host: if not options.port: error("If host (-h) is specified then a port (-p) must be " "specified as well.") servers.append(((options.host, options.port), options.name)) elif options.port: servers.append(((socket.gethostname(), options.port), options.name)) if options.unix: servers.append((options.unix, options.name)) for server in args: data = server.split(':') if len(data) in (2, 3): host = data[0] try: port = int(data[1]) except ValueError: error("Invalid port in server specification: %r" % server) addr = host, port if len(data) == 2: name = '1' else: name = data[2] else: error("Invalid server specification: %r" % server) servers.append((addr, name)) for server in options.unix_sockets or (): data = server.split(':') if len(data) == 1: addr = data[0] name = '1' elif len(data) == 2: addr = data[0] name = data[1] else: error("Invalid server specification: %r" % server) servers.append((addr, name)) if not servers: error("No servers specified.") for addr, name in servers: try: cs = ZEO.ClientStorage.ClientStorage( addr, storage=name, wait=False, read_only=1) for i in range(60): if cs.is_connected(): break time.sleep(1) else: sys.stderr.write("Couldn't connect to: %r\n" % ((addr, name), )) cs.close() continue cs.pack(packt, wait=True) cs.close() except: traceback.print_exception(*(sys.exc_info()+(99, sys.stderr))) error("Error packing storage %s in %r" % (name, addr)) def main(*args): root_logger = logging.getLogger() old_level = root_logger.getEffectiveLevel() logging.getLogger().setLevel(logging.WARNING) handler = logging.StreamHandler(sys.stdout) handler.setFormatter(logging.Formatter( "%(name)s %(levelname)s %(message)s")) logging.getLogger().addHandler(handler) try: _main(*args) finally: logging.getLogger().setLevel(old_level) logging.getLogger().removeHandler(handler) if __name__ == "__main__": main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/scripts/zeopack.test000066400000000000000000000211571230730566700243660ustar00rootroot00000000000000zeopack ======= The zeopack script can be used to pack one or more storages. It uses ClientStorage to do this. To test it's behavior, we'll replace the normal ClientStorage with a fake one that echos information we'll want for our test: >>> class ClientStorage: ... connect_wait = 0 ... def __init__(self, *args, **kw): ... if args[0] == 'bad': ... import logging ... logging.getLogger('test.ClientStorage').error( ... "I hate this address, %r", args[0]) ... raise ValueError("Bad address") ... print "ClientStorage(%s %s)" % ( ... repr(args)[1:-1], ... ', '.join("%s=%r" % i for i in sorted(kw.items())), ... ) ... def pack(self, t=None, *args, **kw): ... now = time.localtime(time.time()) ... local_midnight = time.mktime(now[:3]+(0, 0, 0)+now[6:]) ... t -= local_midnight # adjust for tz ... t += 86400*7 # add a week to make sure we're positive ... print "pack(%r,%s %s)" % ( ... t, repr(args)[1:-1], ... 
', '.join("%s=%r" % i for i in sorted(kw.items())), ... ) ... def is_connected(self): ... self.connect_wait -= 1 ... print 'is_connected', self.connect_wait < 0 ... return self.connect_wait < 0 ... def close(self): ... print "close()" >>> import ZEO >>> ClientStorage_orig = ZEO.ClientStorage.ClientStorage >>> ZEO.ClientStorage.ClientStorage = ClientStorage Now, we're ready to try the script: >>> from ZEO.scripts.zeopack import main If we call it with no arguments, we get help: >>> import os; os.environ['COLUMNS'] = '80' # for consistent optparse output >>> main([], 'zeopack') Usage: zeopack [options] [servers] Pack one or more storages hosted by ZEO servers. The positional arguments specify 0 or more tcp servers to pack, where each is of the form: host:port[:name] Options: -d DAYS, --days=DAYS Pack objects that are older than this number of days -t TIME, --time=TIME Time of day to pack to of the form: HH[:MM[:SS]]. Defaults to current time. -u UNIX_SOCKETS, --unix=UNIX_SOCKETS A unix-domain-socket server to connect to, of the form: path[:name] -h HOST Deprecated: Used with the -p and -S options, specified the host to connect to. -p PORT Deprecated: Used with the -h and -S options, specifies the port to connect to. -S NAME Deprecated: Used with the -h and -p, options, or with the -U option specified the storage name to use. Defaults to 1. -U UNIX Deprecated: Used with the -S option, Unix-domain socket to connect to. Since packing involves time, we'd better have our way with it. Replace time.time() with a function that always returns the same value. The value is timezone dependent. >>> import time >>> time_orig = time.time >>> time.time = lambda : time.mktime((2009, 3, 24, 10, 55, 17, 1, 83, -1)) >>> sleep_orig = time.sleep >>> def sleep(t): ... print 'sleep(%r)' % t >>> time.sleep = sleep Normally, we pass one or more TCP server specifications: >>> main(["host1:8100", "host1:8100:2"]) ClientStorage(('host1', 8100), read_only=1, storage='1', wait=False) is_connected True pack(644117.0, wait=True) close() ClientStorage(('host1', 8100), read_only=1, storage='2', wait=False) is_connected True pack(644117.0, wait=True) close() We can also pass unix-domain-sockey servers using the -u option: >>> main(["-ufoo", "-ubar:spam", "host1:8100", "host1:8100:2"]) ClientStorage(('host1', 8100), read_only=1, storage='1', wait=False) is_connected True pack(644117.0, wait=True) close() ClientStorage(('host1', 8100), read_only=1, storage='2', wait=False) is_connected True pack(644117.0, wait=True) close() ClientStorage('foo', read_only=1, storage='1', wait=False) is_connected True pack(644117.0, wait=True) close() ClientStorage('bar', read_only=1, storage='spam', wait=False) is_connected True pack(644117.0, wait=True) close() The -d option causes a pack time the given number of days earlier to be used: >>> main(["-ufoo", "-ubar:spam", "-d3", "host1:8100", "host1:8100:2"]) ClientStorage(('host1', 8100), read_only=1, storage='1', wait=False) is_connected True pack(384917.0, wait=True) close() ClientStorage(('host1', 8100), read_only=1, storage='2', wait=False) is_connected True pack(384917.0, wait=True) close() ClientStorage('foo', read_only=1, storage='1', wait=False) is_connected True pack(384917.0, wait=True) close() ClientStorage('bar', read_only=1, storage='spam', wait=False) is_connected True pack(384917.0, wait=True) close() The -t option allows us to control the time of day: >>> main(["-ufoo", "-d3", "-t1:30", "host1:8100:2"]) ClientStorage(('host1', 8100), read_only=1, storage='2', wait=False) 
is_connected True pack(351000.0, wait=True) close() ClientStorage('foo', read_only=1, storage='1', wait=False) is_connected True pack(351000.0, wait=True) close() Connection timeout ------------------ The zeopack script tells ClientStorage not to wait for connections before returning from the constructor, but will time out after 60 seconds of waiting for a connect. >>> ClientStorage.connect_wait = 3 >>> main(["-d3", "-t1:30", "host1:8100:2"]) ClientStorage(('host1', 8100), read_only=1, storage='2', wait=False) is_connected False sleep(1) is_connected False sleep(1) is_connected False sleep(1) is_connected True pack(351000.0, wait=True) close() >>> def call_main(args): ... import sys ... old_stderr = sys.stderr ... sys.stderr = sys.stdout ... try: ... try: ... main(args) ... except SystemExit, v: ... print "Exited", v ... finally: ... sys.stderr = old_stderr >>> ClientStorage.connect_wait = 999 >>> call_main(["-d3", "-t1:30", "host1:8100", "host1:8100:2"]) ... # doctest: +ELLIPSIS ClientStorage(('host1', 8100), read_only=1, storage='1', wait=False) is_connected False sleep(1) ... is_connected False sleep(1) Couldn't connect to: (('host1', 8100), '1') close() ClientStorage(('host1', 8100), read_only=1, storage='2', wait=False) is_connected False sleep(1) ... is_connected False sleep(1) Couldn't connect to: (('host1', 8100), '2') close() >>> ClientStorage.connect_wait = 0 Legacy support -------------- >>> main(["-d3", "-h", "host1", "-p", "8100", "-S", "2"]) ClientStorage(('host1', 8100), read_only=1, storage='2', wait=False) is_connected True pack(384917.0, wait=True) close() >>> import socket >>> old_gethostname = socket.gethostname >>> socket.gethostname = lambda : 'test.host.com' >>> main(["-d3", "-p", "8100"]) ClientStorage(('test.host.com', 8100), read_only=1, storage='1', wait=False) is_connected True pack(384917.0, wait=True) close() >>> socket.gethostname = old_gethostname >>> main(["-d3", "-U", "foo/bar", "-S", "2"]) ClientStorage('foo/bar', read_only=1, storage='2', wait=False) is_connected True pack(384917.0, wait=True) close() Error handling -------------- >>> call_main(["-d3"]) Error: No servers specified. Exited 1 >>> call_main(["-d3", "a"]) Error: Invalid server specification: 'a' Exited 1 >>> call_main(["-d3", "a:b:c:d"]) Error: Invalid server specification: 'a:b:c:d' Exited 1 >>> call_main(["-d3", "a:b:2"]) Error: Invalid port in server specification: 'a:b:2' Exited 1 >>> call_main(["-d3", "-u", "a:b:2"]) Error: Invalid server specification: 'a:b:2' Exited 1 >>> call_main(["-d3", "-u", "bad"]) # doctest: +ELLIPSIS test.ClientStorage ERROR I hate this address, 'bad' Traceback (most recent call last): ... ValueError: Bad address Error: Error packing storage 1 in 'bad' Exited 1 Note that in the previous example, the first line was output through logging. .. tear down >>> ZEO.ClientStorage.ClientStorage = ClientStorage_orig >>> time.time = time_orig >>> time.sleep = sleep_orig ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/scripts/zeoqueue.py000077500000000000000000000255351230730566700242540ustar00rootroot00000000000000#!/usr/bin/env python2.3 """Report on the number of currently waiting clients in the ZEO queue. Usage: %(PROGRAM)s [options] logfile Options: -h / --help Print this help text and exit. -v / --verbose Verbose output -f file --file file Use the specified file to store the incremental state as a pickle. If not given, %(STATEFILE)s is used. -r / --reset Reset the state of the tool. 
This blows away any existing state pickle file and then exits -- it does not parse the file. Use this when you rotate log files so that the next run will parse from the beginning of the file. """ import os import re import sys import time import errno import getopt import cPickle as pickle COMMASPACE = ', ' STATEFILE = 'zeoqueue.pck' PROGRAM = sys.argv[0] try: True, False except NameError: True = 1 False = 0 tcre = re.compile(r""" (?P \d{4}- # year \d{2}- # month \d{2}) # day T # separator (?P \d{2}: # hour \d{2}: # minute \d{2}) # second """, re.VERBOSE) ccre = re.compile(r""" zrpc-conn:(?P\d+.\d+.\d+.\d+:\d+)\s+ calling\s+ (?P \w+) # the method \( # args open paren \' # string quote start (?P \S+) # first argument -- usually the tid \' # end of string (?P .*) # rest of line """, re.VERBOSE) wcre = re.compile(r'Clients waiting: (?P\d+)') def parse_time(line): """Return the time portion of a zLOG line in seconds or None.""" mo = tcre.match(line) if mo is None: return None date, time_ = mo.group('ymd', 'hms') date_l = [int(elt) for elt in date.split('-')] time_l = [int(elt) for elt in time_.split(':')] return int(time.mktime(date_l + time_l + [0, 0, 0])) class Txn: """Track status of single transaction.""" def __init__(self, tid): self.tid = tid self.hint = None self.begin = None self.vote = None self.abort = None self.finish = None self.voters = [] def isactive(self): if self.begin and not (self.abort or self.finish): return True else: return False class Status: """Track status of ZEO server by replaying log records. We want to keep track of several events: - The last committed transaction. - The last committed or aborted transaction. - The last transaction that got the lock but didn't finish. - The client address doing the first vote of a transaction. - The number of currently active transactions. - The number of reported queued transactions. - Client restarts. - Number of current connections (but this might not be useful). We can observe these events by reading the following sorts of log entries: 2002-12-16T06:16:05 BLATHER(-100) zrpc:12649 calling tpc_begin('\x03I\x90((\xdbp\xd5', '', 'QueueCatal... 2002-12-16T06:16:06 BLATHER(-100) zrpc:12649 calling vote('\x03I\x90((\xdbp\xd5') 2002-12-16T06:16:06 BLATHER(-100) zrpc:12649 calling tpc_finish('\x03I\x90((\xdbp\xd5') 2002-12-16T10:46:10 INFO(0) ZSS:12649:1 Transaction blocked waiting for storage. Clients waiting: 1. 2002-12-16T06:15:57 BLATHER(-100) zrpc:12649 connect from ('10.0.26.54', 48983): 2002-12-16T10:30:09 INFO(0) ZSS:12649:1 disconnected """ def __init__(self): self.lineno = 0 self.pos = 0 self.reset() def reset(self): self.commit = None self.commit_or_abort = None self.last_unfinished = None self.n_active = 0 self.n_blocked = 0 self.n_conns = 0 self.t_restart = None self.txns = {} def iscomplete(self): # The status report will always be complete if we encounter an # explicit restart. if self.t_restart is not None: return True # If we haven't seen a restart, assume that seeing a finished # transaction is good enough. 
return self.commit is not None def process_file(self, fp): if self.pos: if VERBOSE: print 'seeking to file position', self.pos fp.seek(self.pos) while True: line = fp.readline() if not line: break self.lineno += 1 self.process(line) self.pos = fp.tell() def process(self, line): if line.find("calling") != -1: self.process_call(line) elif line.find("connect") != -1: self.process_connect(line) # test for "locked" because word may start with "B" or "b" elif line.find("locked") != -1: self.process_block(line) elif line.find("Starting") != -1: self.process_start(line) def process_call(self, line): mo = ccre.search(line) if mo is None: return called_method = mo.group('method') # Exit early if we've got zeoLoad, because it's the most # frequently called method and we don't use it. if called_method == "zeoLoad": return t = parse_time(line) meth = getattr(self, "call_%s" % called_method, None) if meth is None: return client = mo.group('addr') tid = mo.group('tid') rest = mo.group('rest') meth(t, client, tid, rest) def process_connect(self, line): pass def process_block(self, line): mo = wcre.search(line) if mo is None: # assume that this was a restart message for the last blocked # transaction. self.n_blocked = 0 else: self.n_blocked = int(mo.group('num')) def process_start(self, line): if line.find("Starting ZEO server") != -1: self.reset() self.t_restart = parse_time(line) def call_tpc_begin(self, t, client, tid, rest): txn = Txn(tid) txn.begin = t if rest[0] == ',': i = 1 while rest[i].isspace(): i += 1 rest = rest[i:] txn.hint = rest self.txns[tid] = txn self.n_active += 1 self.last_unfinished = txn def call_vote(self, t, client, tid, rest): txn = self.txns.get(tid) if txn is None: print "Oops!" txn = self.txns[tid] = Txn(tid) txn.vote = t txn.voters.append(client) def call_tpc_abort(self, t, client, tid, rest): txn = self.txns.get(tid) if txn is None: print "Oops!" txn = self.txns[tid] = Txn(tid) txn.abort = t txn.voters = [] self.n_active -= 1 if self.commit_or_abort: # delete the old transaction try: del self.txns[self.commit_or_abort.tid] except KeyError: pass self.commit_or_abort = txn def call_tpc_finish(self, t, client, tid, rest): txn = self.txns.get(tid) if txn is None: print "Oops!" 
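            # tpc_finish seen for a tid with no recorded tpc_begin (the log
            # may start mid-transaction); create a placeholder Txn so the
            # finish is still counted.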
txn = self.txns[tid] = Txn(tid) txn.finish = t txn.voters = [] self.n_active -= 1 if self.commit: # delete the old transaction try: del self.txns[self.commit.tid] except KeyError: pass if self.commit_or_abort: # delete the old transaction try: del self.txns[self.commit_or_abort.tid] except KeyError: pass self.commit = self.commit_or_abort = txn def report(self): print "Blocked transactions:", self.n_blocked if not VERBOSE: return if self.t_restart: print "Server started:", time.ctime(self.t_restart) if self.commit is not None: t = self.commit_or_abort.finish if t is None: t = self.commit_or_abort.abort print "Last finished transaction:", time.ctime(t) # the blocked transaction should be the first one that calls vote L = [(txn.begin, txn) for txn in self.txns.values()] L.sort() for x, txn in L: if txn.isactive(): began = txn.begin if txn.voters: print "Blocked client (first vote):", txn.voters[0] print "Blocked transaction began at:", time.ctime(began) print "Hint:", txn.hint print "Idle time: %d sec" % int(time.time() - began) break def usage(code, msg=''): print >> sys.stderr, __doc__ % globals() if msg: print >> sys.stderr, msg sys.exit(code) def main(): global VERBOSE VERBOSE = 0 file = STATEFILE reset = False # -0 is a secret option used for testing purposes only seek = True try: opts, args = getopt.getopt(sys.argv[1:], 'vhf:r0', ['help', 'verbose', 'file=', 'reset']) except getopt.error, msg: usage(1, msg) for opt, arg in opts: if opt in ('-h', '--help'): usage(0) elif opt in ('-v', '--verbose'): VERBOSE += 1 elif opt in ('-f', '--file'): file = arg elif opt in ('-r', '--reset'): reset = True elif opt == '-0': seek = False if reset: # Blow away the existing state file and exit try: os.unlink(file) if VERBOSE: print 'removing pickle state file', file except OSError, e: if e.errno <> errno.ENOENT: raise return if not args: usage(1, 'logfile is required') if len(args) > 1: usage(1, 'too many arguments: %s' % COMMASPACE.join(args)) path = args[0] # Get the previous status object from the pickle file, if it is available # and if the --reset flag wasn't given. status = None try: statefp = open(file, 'rb') try: status = pickle.load(statefp) if VERBOSE: print 'reading status from file', file finally: statefp.close() except IOError, e: if e.errno <> errno.ENOENT: raise if status is None: status = Status() if VERBOSE: print 'using new status' if not seek: status.pos = 0 fp = open(path, 'rb') try: status.process_file(fp) finally: fp.close() # Save state statefp = open(file, 'wb') pickle.dump(status, statefp, 1) statefp.close() # Print the report and return the number of blocked clients in the exit # status code. status.report() sys.exit(status.n_blocked) if __name__ == "__main__": main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/scripts/zeoreplay.py000066400000000000000000000204571230730566700244170ustar00rootroot00000000000000#!/usr/bin/env python2.3 """Parse the BLATHER logging generated by ZEO, and optionally replay it. Usage: zeointervals.py [options] Options: --help / -h Print this message and exit. --replay=storage -r storage Replay the parsed transactions through the new storage --maxtxn=count -m count Parse no more than count transactions. --report / -p Print a report as we're parsing. Unlike parsezeolog.py, this script generates timestamps for each transaction, and sub-command in the transaction. We can use this to compare timings with synthesized data. 
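For example (the log and storage paths below are purely illustrative), to
replay at most 1000 parsed transactions from a BLATHER log on stdin into a
scratch FileStorage while printing a per-transaction report:

    zeoreplay.py --replay=/tmp/replay.fs --report -m 1000 < zeo-blather.log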
""" import re import sys import time import getopt import operator # ZEO logs measure wall-clock time so for consistency we need to do the same #from time import clock as now from time import time as now from ZODB.FileStorage import FileStorage #from BDBStorage.BDBFullStorage import BDBFullStorage #from Standby.primary import PrimaryStorage #from Standby.config import RS_PORT from ZODB.Transaction import Transaction from ZODB.utils import p64 datecre = re.compile('(\d\d\d\d-\d\d-\d\d)T(\d\d:\d\d:\d\d)') methcre = re.compile("ZEO Server (\w+)\((.*)\) \('(.*)', (\d+)") class StopParsing(Exception): pass def usage(code, msg=''): print __doc__ if msg: print msg sys.exit(code) def parse_time(line): """Return the time portion of a zLOG line in seconds or None.""" mo = datecre.match(line) if mo is None: return None date, time_ = mo.group(1, 2) date_l = [int(elt) for elt in date.split('-')] time_l = [int(elt) for elt in time_.split(':')] return int(time.mktime(date_l + time_l + [0, 0, 0])) def parse_line(line): """Parse a log entry and return time, method info, and client.""" t = parse_time(line) if t is None: return None, None, None mo = methcre.search(line) if mo is None: return None, None, None meth_name = mo.group(1) meth_args = mo.group(2) meth_args = [s.strip() for s in meth_args.split(',')] m = meth_name, tuple(meth_args) c = mo.group(3), mo.group(4) return t, m, c class StoreStat: def __init__(self, when, oid, size): self.when = when self.oid = oid self.size = size # Crufty def __getitem__(self, i): if i == 0: return self.oid if i == 1: return self.size raise IndexError class TxnStat: def __init__(self): self._begintime = None self._finishtime = None self._aborttime = None self._url = None self._objects = [] def tpc_begin(self, when, args, client): self._begintime = when # args are txnid, user, description (looks like it's always a url) self._url = args[2] def storea(self, when, args, client): oid = int(args[0]) # args[1] is "[numbytes]" size = int(args[1][1:-1]) s = StoreStat(when, oid, size) self._objects.append(s) def tpc_abort(self, when): self._aborttime = when def tpc_finish(self, when): self._finishtime = when # Mapping oid -> revid _revids = {} class ReplayTxn(TxnStat): def __init__(self, storage): self._storage = storage self._replaydelta = 0 TxnStat.__init__(self) def replay(self): ZERO = '\0'*8 t0 = now() t = Transaction() self._storage.tpc_begin(t) for obj in self._objects: oid = obj.oid revid = _revids.get(oid, ZERO) # BAW: simulate a pickle of the given size data = 'x' * obj.size # BAW: ignore versions for now newrevid = self._storage.store(p64(oid), revid, data, '', t) _revids[oid] = newrevid if self._aborttime: self._storage.tpc_abort(t) origdelta = self._aborttime - self._begintime else: self._storage.tpc_vote(t) self._storage.tpc_finish(t) origdelta = self._finishtime - self._begintime t1 = now() # Shows how many seconds behind (positive) or ahead (negative) of the # original reply our local update took self._replaydelta = t1 - t0 - origdelta class ZEOParser: def __init__(self, maxtxns=-1, report=1, storage=None): self.__txns = [] self.__curtxn = {} self.__skipped = 0 self.__maxtxns = maxtxns self.__finishedtxns = 0 self.__report = report self.__storage = storage def parse(self, line): t, m, c = parse_line(line) if t is None: # Skip this line return name = m[0] meth = getattr(self, name, None) if meth is not None: meth(t, m[1], c) def tpc_begin(self, when, args, client): txn = ReplayTxn(self.__storage) self.__curtxn[client] = txn meth = getattr(txn, 'tpc_begin', None) if 
meth is not None: meth(when, args, client) def storea(self, when, args, client): txn = self.__curtxn.get(client) if txn is None: self.__skipped += 1 return meth = getattr(txn, 'storea', None) if meth is not None: meth(when, args, client) def tpc_finish(self, when, args, client): txn = self.__curtxn.get(client) if txn is None: self.__skipped += 1 return meth = getattr(txn, 'tpc_finish', None) if meth is not None: meth(when) if self.__report: self.report(txn) self.__txns.append(txn) self.__curtxn[client] = None self.__finishedtxns += 1 if self.__maxtxns > 0 and self.__finishedtxns >= self.__maxtxns: raise StopParsing def report(self, txn): """Print a report about the transaction""" if txn._objects: bytes = reduce(operator.add, [size for oid, size in txn._objects]) else: bytes = 0 print '%s %s %4d %10d %s %s' % ( txn._begintime, txn._finishtime - txn._begintime, len(txn._objects), bytes, time.ctime(txn._begintime), txn._url) def replay(self): for txn in self.__txns: txn.replay() # How many fell behind? slower = [] faster = [] for txn in self.__txns: if txn._replaydelta > 0: slower.append(txn) else: faster.append(txn) print len(slower), 'laggards,', len(faster), 'on-time or faster' # Find some averages if slower: sum = reduce(operator.add, [txn._replaydelta for txn in slower], 0) print 'average slower txn was:', float(sum) / len(slower) if faster: sum = reduce(operator.add, [txn._replaydelta for txn in faster], 0) print 'average faster txn was:', float(sum) / len(faster) def main(): try: opts, args = getopt.getopt( sys.argv[1:], 'hr:pm:', ['help', 'replay=', 'report', 'maxtxns=']) except getopt.error, e: usage(1, e) if args: usage(1) replay = 0 maxtxns = -1 report = 0 storagefile = None for opt, arg in opts: if opt in ('-h', '--help'): usage(0) elif opt in ('-r', '--replay'): replay = 1 storagefile = arg elif opt in ('-p', '--report'): report = 1 elif opt in ('-m', '--maxtxns'): try: maxtxns = int(arg) except ValueError: usage(1, 'Bad -m argument: %s' % arg) if replay: storage = FileStorage(storagefile) #storage = BDBFullStorage(storagefile) #storage = PrimaryStorage('yyz', storage, RS_PORT) t0 = now() p = ZEOParser(maxtxns, report, storage) i = 0 while 1: line = sys.stdin.readline() if not line: break i += 1 try: p.parse(line) except StopParsing: break except: print 'input file line:', i raise t1 = now() print 'total parse time:', t1-t0 t2 = now() if replay: p.replay() t3 = now() print 'total replay time:', t3-t2 print 'total time:', t3-t0 if __name__ == '__main__': main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/scripts/zeoserverlog.py000066400000000000000000000346611230730566700251350ustar00rootroot00000000000000#!/usr/bin/env python2.3 ############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Tools for analyzing ZEO Server logs. This script contains a number of commands, implemented by command functions. To run a command, give the command name and it's arguments as arguments to this script. 
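For example (the log file name is illustrative), to report server calls that
took at least a tenth of a second:

    zeoserverlog.py time_calls zeo.log 0.1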
Commands: blocked_times file threshold Output a summary of episodes where thransactions were blocked when the episode lasted at least threshold seconds. The file may be a file name or - to read from standard input. The file may also be a command: script blocked_times 'bunzip2 = 0 if blocking and waiting == 1: t1 = time(line) t2 = t1 if not blocking and last_blocking: last_wait = 0 t2 = time(line) cid = idre.search(line).group(1) if waiting == 0: d = sub(t1, time(line)) if d >= thresh: print t1, sub(t1, t2), cid, d t1 = t2 = cid = blocking = waiting = last_wait = max_wait = 0 last_blocking = blocking connidre = re.compile(r' zrpc-conn:(\d+.\d+.\d+.\d+:\d+) ') def time_calls(f): f, thresh = f if f == '-': f = sys.stdin else: f = xopen(f) thresh = float(thresh) t1 = None maxd = 0 for line in f: line = line.strip() if ' calling ' in line: t1 = time(line) elif ' returns ' in line and t1 is not None: d = sub(t1, time(line)) if d >= thresh: print t1, d, connidre.search(line).group(1) maxd = max(maxd, d) t1 = None print maxd def xopen(f): if f == '-': return sys.stdin if ' ' in f: return os.popen(f, 'r') return open(f) def time_tpc(f): f, thresh = f if f == '-': f = sys.stdin else: f = xopen(f) thresh = float(thresh) transactions = {} for line in f: line = line.strip() if ' calling vote(' in line: cid = connidre.search(line).group(1) transactions[cid] = time(line), elif ' vote returns None' in line: cid = connidre.search(line).group(1) transactions[cid] += time(line), 'n' elif ' vote() raised' in line: cid = connidre.search(line).group(1) transactions[cid] += time(line), 'e' elif ' vote returns ' in line: # delayed, skip cid = connidre.search(line).group(1) transactions[cid] += time(line), 'd' elif ' calling tpc_abort(' in line: cid = connidre.search(line).group(1) if cid in transactions: t1, t2, vs = transactions[cid] t = time(line) d = sub(t1, t) if d >= thresh: print 'a', t1, cid, sub(t1, t2), vs, sub(t2, t) del transactions[cid] elif ' calling tpc_finish(' in line: if cid in transactions: cid = connidre.search(line).group(1) transactions[cid] += time(line), elif ' tpc_finish returns ' in line: if cid in transactions: t1, t2, vs, t3 = transactions[cid] t = time(line) d = sub(t1, t) if d >= thresh: print 'c', t1, cid, sub(t1, t2), vs, sub(t2, t3), sub(t3, t) del transactions[cid] newobre = re.compile(r"storea\(.*, '\\x00\\x00\\x00\\x00\\x00") def time_trans(f): f, thresh = f if f == '-': f = sys.stdin else: f = xopen(f) thresh = float(thresh) transactions = {} for line in f: line = line.strip() if ' calling tpc_begin(' in line: cid = connidre.search(line).group(1) transactions[cid] = time(line), [0, 0] if ' calling storea(' in line: cid = connidre.search(line).group(1) if cid in transactions: transactions[cid][1][0] += 1 if not newobre.search(line): transactions[cid][1][1] += 1 elif ' calling vote(' in line: cid = connidre.search(line).group(1) if cid in transactions: transactions[cid] += time(line), elif ' vote returns None' in line: cid = connidre.search(line).group(1) if cid in transactions: transactions[cid] += time(line), 'n' elif ' vote() raised' in line: cid = connidre.search(line).group(1) if cid in transactions: transactions[cid] += time(line), 'e' elif ' vote returns ' in line: # delayed, skip cid = connidre.search(line).group(1) if cid in transactions: transactions[cid] += time(line), 'd' elif ' calling tpc_abort(' in line: cid = connidre.search(line).group(1) if cid in transactions: try: t0, (stores, old), t1, t2, vs = transactions[cid] except ValueError: pass else: t = time(line) d 
= sub(t1, t) if d >= thresh: print t1, cid, "%s/%s" % (stores, old), \ sub(t0, t1), sub(t1, t2), vs, \ sub(t2, t), 'abort' del transactions[cid] elif ' calling tpc_finish(' in line: if cid in transactions: cid = connidre.search(line).group(1) transactions[cid] += time(line), elif ' tpc_finish returns ' in line: if cid in transactions: t0, (stores, old), t1, t2, vs, t3 = transactions[cid] t = time(line) d = sub(t1, t) if d >= thresh: print t1, cid, "%s/%s" % (stores, old), \ sub(t0, t1), sub(t1, t2), vs, \ sub(t2, t3), sub(t3, t) del transactions[cid] def minute(f, slice=16, detail=1, summary=1): f, = f if f == '-': f = sys.stdin else: f = xopen(f) cols = ["time", "reads", "stores", "commits", "aborts", "txns"] fmt = "%18s %6s %6s %7s %6s %6s" print fmt % cols print fmt % ["-"*len(col) for col in cols] mlast = r = s = c = a = cl = None rs = [] ss = [] cs = [] aborts = [] ts = [] cls = [] for line in f: line = line.strip() if (line.find('returns') > 0 or line.find('storea') > 0 or line.find('tpc_abort') > 0 ): client = connidre.search(line).group(1) m = line[:slice] if m != mlast: if mlast: if detail: print fmt % (mlast, len(cl), r, s, c, a, a+c) cls.append(len(cl)) rs.append(r) ss.append(s) cs.append(c) aborts.append(a) ts.append(c+a) mlast = m r = s = c = a = 0 cl = {} if line.find('zeoLoad') > 0: r += 1 cl[client] = 1 elif line.find('storea') > 0: s += 1 cl[client] = 1 elif line.find('tpc_finish') > 0: c += 1 cl[client] = 1 elif line.find('tpc_abort') > 0: a += 1 cl[client] = 1 if mlast: if detail: print fmt % (mlast, len(cl), r, s, c, a, a+c) cls.append(len(cl)) rs.append(r) ss.append(s) cs.append(c) aborts.append(a) ts.append(c+a) if summary: print print 'Summary: \t', '\t'.join(('min', '10%', '25%', 'med', '75%', '90%', 'max', 'mean')) print "n=%6d\t" % len(cls), '-'*62 print 'Clients: \t', '\t'.join(map(str,stats(cls))) print 'Reads: \t', '\t'.join(map(str,stats(rs))) print 'Stores: \t', '\t'.join(map(str,stats(ss))) print 'Commits: \t', '\t'.join(map(str,stats(cs))) print 'Aborts: \t', '\t'.join(map(str,stats(aborts))) print 'Trans: \t', '\t'.join(map(str,stats(ts))) def stats(s): s.sort() min = s[0] max = s[-1] n = len(s) out = [min] ni = n + 1 for p in .1, .25, .5, .75, .90: lp = ni*p l = int(lp) if lp < 1 or lp > n: out.append('-') elif abs(lp-l) < .00001: out.append(s[l-1]) else: out.append(int(s[l-1] + (lp - l) * (s[l] - s[l-1]))) mean = 0.0 for v in s: mean += v out.extend([max, int(mean/n)]) return out def minutes(f): minute(f, 16, detail=0) def hour(f): minute(f, 13) def day(f): minute(f, 10) def hours(f): minute(f, 13, detail=0) def days(f): minute(f, 10, detail=0) new_connection_idre = re.compile(r"new connection \('(\d+.\d+.\d+.\d+)', (\d+)\):") def verify(f): f, = f if f == '-': f = sys.stdin else: f = xopen(f) t1 = None nv = {} for line in f: if line.find('new connection') > 0: m = new_connection_idre.search(line) cid = "%s:%s" % (m.group(1), m.group(2)) nv[cid] = [time(line), 0] elif line.find('calling zeoVerify(') > 0: cid = connidre.search(line).group(1) nv[cid][1] += 1 elif line.find('calling endZeoVerify()') > 0: cid = connidre.search(line).group(1) t1, n = nv[cid] if n: d = sub(t1, time(line)) print cid, t1, n, d, n and (d*1000.0/n) or '-' def recovery(f): f, = f if f == '-': f = sys.stdin else: f = xopen(f) last = '' trans = [] n = 0 for line in f: n += 1 if line.find('RecoveryServer') < 0: continue l = line.find('sending transaction ') if l > 0 and last.find('sending transaction ') > 0: trans.append(line[l+20:].strip()) else: if trans: if len(trans) > 1: print " 
... %s similar records skipped ..." % ( len(trans) - 1) print n, last.strip() trans=[] print n, line.strip() last = line if len(trans) > 1: print " ... %s similar records skipped ..." % ( len(trans) - 1) print n, last.strip() if __name__ == '__main__': globals()[sys.argv[1]](sys.argv[2:]) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/scripts/zeoup.py000077500000000000000000000100121230730566700235340ustar00rootroot00000000000000#!/usr/bin/env python2.3 """Make sure a ZEO server is running. usage: zeoup.py [options] The test will connect to a ZEO server, load the root object, and attempt to update the zeoup counter in the root. It will report success if it updates the counter or if it gets a ConflictError. A ConflictError is considered a success, because the client was able to start a transaction. Options: -p port -- port to connect to -h host -- host to connect to (default is current host) -S storage -- storage name (default '1') -U path -- Unix-domain socket to connect to --nowrite -- Do not update the zeoup counter. -1 -- Connect to a ZEO 1.0 server. You must specify either -p and -h or -U. """ import getopt import logging import socket import sys import time from persistent.mapping import PersistentMapping import transaction import ZODB from ZODB.POSException import ConflictError from ZODB.tests.MinPO import MinPO from ZEO.ClientStorage import ClientStorage ZEO_VERSION = 2 def setup_logging(): # Set up logging to stderr which will show messages originating # at severity ERROR or higher. root = logging.getLogger() root.setLevel(logging.ERROR) fmt = logging.Formatter( "------\n%(asctime)s %(levelname)s %(name)s %(message)s", "%Y-%m-%dT%H:%M:%S") handler = logging.StreamHandler() handler.setFormatter(fmt) root.addHandler(handler) def check_server(addr, storage, write): t0 = time.time() if ZEO_VERSION == 2: # TODO: should do retries w/ exponential backoff. cs = ClientStorage(addr, storage=storage, wait=0, read_only=(not write)) else: cs = ClientStorage(addr, storage=storage, debug=1, wait_for_server_on_startup=1) # _startup() is an artifact of the way ZEO 1.0 works. The # ClientStorage doesn't get fully initialized until registerDB() # is called. The only thing we care about, though, is that # registerDB() calls _startup(). if write: db = ZODB.DB(cs) cn = db.open() root = cn.root() try: # We store the data in a special `monitor' dict under the root, # where other tools may also store such heartbeat and bookkeeping # type data. 
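            # Get or create root['monitor'], bump its 'zeoup' counter, and
            # commit; a ConflictError here still counts as success because it
            # shows the client could start a transaction.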
monitor = root.get('monitor') if monitor is None: monitor = root['monitor'] = PersistentMapping() obj = monitor['zeoup'] = monitor.get('zeoup', MinPO(0)) obj.value += 1 transaction.commit() except ConflictError: pass cn.close() db.close() else: data, serial = cs.load("\0\0\0\0\0\0\0\0", "") cs.close() t1 = time.time() print "Elapsed time: %.2f" % (t1 - t0) def usage(exit=1): print __doc__ print " ".join(sys.argv) sys.exit(exit) def main(): host = None port = None unix = None write = 1 storage = '1' try: opts, args = getopt.getopt(sys.argv[1:], 'p:h:U:S:1', ['nowrite']) for o, a in opts: if o == '-p': port = int(a) elif o == '-h': host = a elif o == '-U': unix = a elif o == '-S': storage = a elif o == '--nowrite': write = 0 elif o == '-1': ZEO_VERSION = 1 except Exception, err: s = str(err) if s: s = ": " + s print err.__class__.__name__ + s usage() if unix is not None: addr = unix else: if host is None: host = socket.gethostname() if port is None: usage() addr = host, port setup_logging() check_server(addr, storage, write) if __name__ == "__main__": try: main() except SystemExit: raise except Exception, err: s = str(err) if s: s = ": " + s print err.__class__.__name__ + s sys.exit(1) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/000077500000000000000000000000001230730566700214765ustar00rootroot00000000000000ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/Cache.py000066400000000000000000000034761230730566700230650ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Tests of the ZEO cache""" from ZODB.tests.MinPO import MinPO from ZODB.tests.StorageTestBase import zodb_unpickle from transaction import Transaction class TransUndoStorageWithCache: def checkUndoInvalidation(self): oid = self._storage.new_oid() revid = self._dostore(oid, data=MinPO(23)) revid = self._dostore(oid, revid=revid, data=MinPO(24)) revid = self._dostore(oid, revid=revid, data=MinPO(25)) info = self._storage.undoInfo() if not info: # Preserved this comment, but don't understand it: # "Perhaps we have an old storage implementation that # does do the negative nonsense." info = self._storage.undoInfo(0, 20) tid = info[0]['id'] # Now start an undo transaction t = Transaction() t.note('undo1') oids = self._begin_undos_vote(t, tid) # Make sure this doesn't load invalid data into the cache self._storage.load(oid, '') self._storage.tpc_finish(t) assert len(oids) == 1 assert oids[0] == oid data, revid = self._storage.load(oid, '') obj = zodb_unpickle(data) assert obj == MinPO(24) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/CommitLockTests.py000066400000000000000000000132341230730566700251370ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). 
A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Tests of the distributed commit lock.""" import threading import time from persistent.TimeStamp import TimeStamp import transaction from ZODB.tests.StorageTestBase import zodb_pickle, MinPO import ZEO.ClientStorage from ZEO.Exceptions import ClientDisconnected from ZEO.tests.TestThread import TestThread ZERO = '\0'*8 class DummyDB: def invalidate(self, *args, **kwargs): pass class WorkerThread(TestThread): # run the entire test in a thread so that the blocking call for # tpc_vote() doesn't hang the test suite. def __init__(self, test, storage, trans): self.storage = storage self.trans = trans self.ready = threading.Event() TestThread.__init__(self, test) def testrun(self): try: self.storage.tpc_begin(self.trans) oid = self.storage.new_oid() p = zodb_pickle(MinPO("c")) self.storage.store(oid, ZERO, p, '', self.trans) oid = self.storage.new_oid() p = zodb_pickle(MinPO("c")) self.storage.store(oid, ZERO, p, '', self.trans) self.myvote() self.storage.tpc_finish(self.trans) except ClientDisconnected: pass def myvote(self): # The vote() call is synchronous, which makes it difficult to # coordinate the action of multiple threads that all call # vote(). This method sends the vote call, then sets the # event saying vote was called, then waits for the vote # response. It digs deep into the implementation of the client. # This method is a replacement for: # self.ready.set() # self.storage.tpc_vote(self.trans) rpc = self.storage._server.rpc msgid = rpc._deferred_call('vote', id(self.trans)) self.ready.set() rpc._deferred_wait(msgid) self.storage._check_serials() class CommitLockTests: NUM_CLIENTS = 5 # The commit lock tests verify that the storage successfully # blocks and restarts transactions when there is contention for a # single storage. There are a lot of cases to cover. # The general flow of these tests is to start a transaction by # getting far enough into 2PC to acquire the commit lock. Then # begin one or more other connections that also want to commit. # This causes the commit lock code to be exercised. Once the # other connections are started, the first transaction completes. def _cleanup(self): for store, trans in self._storages: store.tpc_abort(trans) store.close() self._storages = [] def _start_txn(self): txn = transaction.Transaction() self._storage.tpc_begin(txn) oid = self._storage.new_oid() self._storage.store(oid, ZERO, zodb_pickle(MinPO(1)), '', txn) return oid, txn def _begin_threads(self): # Start a second transaction on a different connection without # blocking the test thread. Returns only after each thread has # set it's ready event. self._storages = [] self._threads = [] for i in range(self.NUM_CLIENTS): storage = self._duplicate_client() txn = transaction.Transaction() tid = self._get_timestamp() t = WorkerThread(self, storage, txn) self._threads.append(t) t.start() t.ready.wait() # Close one of the connections abnormally to test server response if i == 0: storage.close() else: self._storages.append((storage, txn)) def _finish_threads(self): for t in self._threads: t.cleanup() def _duplicate_client(self): "Open another ClientStorage to the same server." # It's hard to find the actual address. 
# The rpc mgr addr attribute is a list. Each element in the # list is a socket domain (AF_INET, AF_UNIX, etc.) and an # address. addr = self._storage._addr new = ZEO.ClientStorage.ClientStorage(addr, wait=1) new.registerDB(DummyDB()) return new def _get_timestamp(self): t = time.time() t = TimeStamp(*time.gmtime(t)[:5]+(t%60,)) return `t` class CommitLockVoteTests(CommitLockTests): def checkCommitLockVoteFinish(self): oid, txn = self._start_txn() self._storage.tpc_vote(txn) self._begin_threads() self._storage.tpc_finish(txn) self._storage.load(oid, '') self._finish_threads() self._dostore() self._cleanup() def checkCommitLockVoteAbort(self): oid, txn = self._start_txn() self._storage.tpc_vote(txn) self._begin_threads() self._storage.tpc_abort(txn) self._finish_threads() self._dostore() self._cleanup() def checkCommitLockVoteClose(self): oid, txn = self._start_txn() self._storage.tpc_vote(txn) self._begin_threads() self._storage.close() self._finish_threads() self._cleanup() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/ConnectionTests.py000066400000000000000000001323371230730566700252030ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import os import time import socket import asyncore import threading import logging import ZEO.ServerStub from ZEO.ClientStorage import ClientStorage from ZEO.Exceptions import ClientDisconnected from ZEO.zrpc.marshal import encode from ZEO.tests import forker from ZODB.DB import DB from ZODB.POSException import ReadOnlyError, ConflictError from ZODB.tests.StorageTestBase import StorageTestBase from ZODB.tests.MinPO import MinPO from ZODB.tests.StorageTestBase \ import zodb_pickle, zodb_unpickle, handle_all_serials, handle_serials import ZODB.tests.util import transaction from transaction import Transaction logger = logging.getLogger('ZEO.tests.ConnectionTests') ZERO = '\0'*8 class TestServerStub(ZEO.ServerStub.StorageServer): __super_getInvalidations = ZEO.ServerStub.StorageServer.getInvalidations def getInvalidations(self, tid): # squirrel the results away for inspection by test case self._last_invals = self.__super_getInvalidations(tid) return self._last_invals class TestClientStorage(ClientStorage): test_connection = False StorageServerStubClass = TestServerStub connection_count_for_tests = 0 def notifyConnected(self, conn): ClientStorage.notifyConnected(self, conn) self.connection_count_for_tests += 1 def verify_cache(self, stub): self.end_verify = threading.Event() self.verify_result = ClientStorage.verify_cache(self, stub) def endVerify(self): ClientStorage.endVerify(self) self.end_verify.set() def testConnection(self, conn): try: return ClientStorage.testConnection(self, conn) finally: self.test_connection = True class DummyDB: def invalidate(self, *args, **kwargs): pass def invalidateCache(self): pass class CommonSetupTearDown(StorageTestBase): """Common boilerplate""" __super_setUp = StorageTestBase.setUp __super_tearDown = 
StorageTestBase.tearDown keep = 0 invq = None timeout = None monitor = 0 db_class = DummyDB def setUp(self, before=None): """Test setup for connection tests. This starts only one server; a test may start more servers by calling self._newAddr() and then self.startServer(index=i) for i in 1, 2, ... """ self.__super_setUp() logging.info("setUp() %s", self.id()) self.file = 'storage_conf' self.addr = [] self._pids = [] self._servers = [] self.conf_paths = [] self.caches = [] self._newAddr() self.startServer() # self._old_log_level = logging.getLogger().getEffectiveLevel() # logging.getLogger().setLevel(logging.WARNING) # self._log_handler = logging.StreamHandler() # logging.getLogger().addHandler(self._log_handler) def tearDown(self): """Try to cause the tests to halt""" # logging.getLogger().setLevel(self._old_log_level) # logging.getLogger().removeHandler(self._log_handler) # logging.info("tearDown() %s" % self.id()) for p in self.conf_paths: os.remove(p) if getattr(self, '_storage', None) is not None: self._storage.close() if hasattr(self._storage, 'cleanup'): logging.debug("cleanup storage %s" % self._storage.__name__) self._storage.cleanup() for adminaddr in self._servers: if adminaddr is not None: forker.shutdown_zeo_server(adminaddr) for pid in self._pids: try: os.waitpid(pid, 0) except OSError: pass # The subprocess module may already have waited for c in self.caches: for i in 0, 1: for ext in "", ".trace", ".lock": path = "%s-%s.zec%s" % (c, "1", ext) # On Windows before 2.3, we don't have a way to wait for # the spawned server(s) to close, and they inherited # file descriptors for our open files. So long as those # processes are alive, we can't delete the files. Try # a few times then give up. need_to_delete = False if os.path.exists(path): need_to_delete = True for dummy in range(5): try: os.unlink(path) except: time.sleep(0.5) else: need_to_delete = False break if need_to_delete: os.unlink(path) # sometimes this is just gonna fail self.__super_tearDown() def _newAddr(self): self.addr.append(self._getAddr()) def _getAddr(self): return 'localhost', forker.get_port(self) def getConfig(self, path, create, read_only): raise NotImplementedError cache_id = 1 def openClientStorage(self, cache=None, cache_size=200000, wait=1, read_only=0, read_only_fallback=0, username=None, password=None, realm=None): if cache is None: cache = str(self.__class__.cache_id) self.__class__.cache_id += 1 self.caches.append(cache) storage = TestClientStorage(self.addr, client=cache, var='.', cache_size=cache_size, wait=wait, min_disconnect_poll=0.1, read_only=read_only, read_only_fallback=read_only_fallback, username=username, password=password, realm=realm) storage.registerDB(DummyDB()) return storage def getServerConfig(self, addr, ro_svr): zconf = forker.ZEOConfig(addr) if ro_svr: zconf.read_only = 1 if self.monitor: zconf.monitor_address = ("", 42000) if self.invq: zconf.invalidation_queue_size = self.invq if self.timeout: zconf.transaction_timeout = self.timeout return zconf def startServer(self, create=1, index=0, read_only=0, ro_svr=0, keep=None, path=None): addr = self.addr[index] logging.info("startServer(create=%d, index=%d, read_only=%d) @ %s" % (create, index, read_only, addr)) if path is None: path = "%s.%d" % (self.file, index) sconf = self.getConfig(path, create, read_only) zconf = self.getServerConfig(addr, ro_svr) if keep is None: keep = self.keep zeoport, adminaddr, pid, path = forker.start_zeo_server( sconf, zconf, addr[1], keep) self.conf_paths.append(path) self._pids.append(pid) 
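        # Remember the admin address so shutdownServer() and tearDown() can
        # stop this server later.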
self._servers.append(adminaddr) def shutdownServer(self, index=0): logging.info("shutdownServer(index=%d) @ %s" % (index, self._servers[index])) adminaddr = self._servers[index] if adminaddr is not None: forker.shutdown_zeo_server(adminaddr) self._servers[index] = None def pollUp(self, timeout=30.0, storage=None): if storage is None: storage = self._storage # Poll until we're connected. now = time.time() giveup = now + timeout while not storage.is_connected(): asyncore.poll(0.1) now = time.time() if now > giveup: self.fail("timed out waiting for storage to connect") # When the socket map is empty, poll() returns immediately, # and this is a pure busy-loop then. At least on some Linux # flavors, that can starve the thread trying to connect, # leading to grossly increased runtime (typical) or bogus # "timed out" failures. A little sleep here cures both. time.sleep(0.1) def pollDown(self, timeout=30.0): # Poll until we're disconnected. now = time.time() giveup = now + timeout while self._storage.is_connected(): asyncore.poll(0.1) now = time.time() if now > giveup: self.fail("timed out waiting for storage to disconnect") # See pollUp() for why we sleep a little here. time.sleep(0.1) class ConnectionTests(CommonSetupTearDown): """Tests that explicitly manage the server process. To test the cache or re-connection, these test cases explicit start and stop a ZEO storage server. """ def checkMultipleAddresses(self): for i in range(4): self._newAddr() self._storage = self.openClientStorage('test', 100000) oid = self._storage.new_oid() obj = MinPO(12) self._dostore(oid, data=obj) self._storage.close() def checkReadOnlyClient(self): # Open a read-only client to a read-write server; stores fail # Start a read-only client for a read-write server self._storage = self.openClientStorage(read_only=1) # Stores should fail here self.assertRaises(ReadOnlyError, self._dostore) self._storage.close() def checkReadOnlyServer(self): # Open a read-only client to a read-only *server*; stores fail # We don't want the read-write server created by setUp() self.shutdownServer() self._servers = [] # Start a read-only server self.startServer(create=0, index=0, ro_svr=1) # Start a read-only client self._storage = self.openClientStorage(read_only=1) # Stores should fail here self.assertRaises(ReadOnlyError, self._dostore) self._storage.close() # Get rid of the 'test left new threads behind' warning time.sleep(0.1) def checkReadOnlyFallbackWritable(self): # Open a fallback client to a read-write server; stores succeed # Start a read-only-fallback client for a read-write server self._storage = self.openClientStorage(read_only_fallback=1) # Stores should succeed here self._dostore() self._storage.close() def checkReadOnlyFallbackReadOnlyServer(self): # Open a fallback client to a read-only *server*; stores fail # We don't want the read-write server created by setUp() self.shutdownServer() self._servers = [] # Start a read-only server self.startServer(create=0, index=0, ro_svr=1) # Start a read-only-fallback client self._storage = self.openClientStorage(read_only_fallback=1) self.assert_(self._storage.isReadOnly()) # Stores should fail here self.assertRaises(ReadOnlyError, self._dostore) self._storage.close() def checkDisconnectionError(self): # Make sure we get a ClientDisconnected when we try to read an # object when we're not connected to a storage server and the # object is not in the cache. 
self.shutdownServer() self._storage = self.openClientStorage('test', 1000, wait=0) self.assertRaises(ClientDisconnected, self._storage.load, 'fredwash', '') self._storage.close() def checkBasicPersistence(self): # Verify cached data persists across client storage instances. # To verify that the cache is being used, the test closes the # server and then starts a new client with the server down. # When the server is down, a load() gets the data from its cache. self._storage = self.openClientStorage('test', 100000) oid = self._storage.new_oid() obj = MinPO(12) revid1 = self._dostore(oid, data=obj) self._storage.close() self.shutdownServer() self._storage = self.openClientStorage('test', 100000, wait=0) data, revid2 = self._storage.load(oid, '') self.assertEqual(zodb_unpickle(data), MinPO(12)) self.assertEqual(revid1, revid2) self._storage.close() def checkDisconnectedCacheWorks(self): # Check that the cache works when the client is disconnected. self._storage = self.openClientStorage('test') oid1 = self._storage.new_oid() obj1 = MinPO("1" * 500) self._dostore(oid1, data=obj1) oid2 = self._storage.new_oid() obj2 = MinPO("2" * 500) self._dostore(oid2, data=obj2) expected1 = self._storage.load(oid1, '') expected2 = self._storage.load(oid2, '') # Shut it all down, and try loading from the persistent cache file # without a server present. self._storage.close() self.shutdownServer() self._storage = self.openClientStorage('test', wait=False) self.assertEqual(expected1, self._storage.load(oid1, '')) self.assertEqual(expected2, self._storage.load(oid2, '')) self._storage.close() def checkDisconnectedCacheFails(self): # Like checkDisconnectedCacheWorks above, except the cache # file is so small that only one object can be remembered. self._storage = self.openClientStorage('test', cache_size=900) oid1 = self._storage.new_oid() obj1 = MinPO("1" * 500) self._dostore(oid1, data=obj1) oid2 = self._storage.new_oid() obj2 = MinPO("2" * 500) # The cache file is so small that adding oid2 will evict oid1. self._dostore(oid2, data=obj2) expected2 = self._storage.load(oid2, '') # Shut it all down, and try loading from the persistent cache file # without a server present. self._storage.close() self.shutdownServer() self._storage = self.openClientStorage('test', cache_size=900, wait=False) # oid2 should still be in cache. self.assertEqual(expected2, self._storage.load(oid2, '')) # But oid1 should have been purged, so that trying to load it will # try to fetch it from the (non-existent) ZEO server. self.assertRaises(ClientDisconnected, self._storage.load, oid1, '') self._storage.close() def checkVerificationInvalidationPersists(self): # This tests a subtle invalidation bug from ZODB 3.3: # invalidations processed as part of ZEO cache verification acted # kinda OK wrt the in-memory cache structures, but had no effect # on the cache file. So opening the file cache again could # incorrectly believe that a previously invalidated object was # still current. This takes some effort to set up. # First, using a persistent cache ('test'), create an object # MinPO(13). We used to see this again at the end of this test, # despite that we modify it, and despite that it gets invalidated # in 'test', before the end. self._storage = self.openClientStorage('test') oid = self._storage.new_oid() obj = MinPO(13) self._dostore(oid, data=obj) self._storage.close() # Now modify obj via a temp connection. `test` won't learn about # this until we open a connection using `test` again. 
self._storage = self.openClientStorage() pickle, rev = self._storage.load(oid, '') newobj = zodb_unpickle(pickle) self.assertEqual(newobj, obj) newobj.value = 42 # .value *should* be 42 forever after now, not 13 self._dostore(oid, data=newobj, revid=rev) self._storage.close() # Open 'test' again. `oid` in this cache should be (and is) # invalidated during cache verification. The bug was that it # got invalidated (kinda) in memory, but not in the cache file. self._storage = self.openClientStorage('test') # The invalidation happened already. Now create and store a new # object before closing this storage: this is so `test` believes # it's seen transactions beyond the one that invalidated `oid`, so # that the *next* time we open `test` it doesn't process another # invalidation for `oid`. It's also important that we not try to # load `oid` now: because it's been (kinda) invalidated in the # cache's memory structures, loading it now would fetch the # current revision from the server, thus hiding the bug. obj2 = MinPO(666) oid2 = self._storage.new_oid() self._dostore(oid2, data=obj2) self._storage.close() # Finally, open `test` again and load `oid`. `test` believes # it's beyond the transaction that modified `oid`, so its view # of whether it has an up-to-date `oid` comes solely from the disk # file, unaffected by cache verification. self._storage = self.openClientStorage('test') pickle, rev = self._storage.load(oid, '') newobj_copy = zodb_unpickle(pickle) # This used to fail, with # AssertionError: MinPO(13) != MinPO(42) # That is, `test` retained a stale revision of the object on disk. self.assertEqual(newobj_copy, newobj) self._storage.close() def checkBadMessage1(self): # not even close to a real message self._bad_message("salty") def checkBadMessage2(self): # just like a real message, but with an unpicklable argument global Hack class Hack: pass msg = encode(1, 0, "foo", (Hack(),)) self._bad_message(msg) del Hack def _bad_message(self, msg): # Establish a connection, then send the server an ill-formatted # request. Verify that the connection is closed and that it is # possible to establish a new connection. self._storage = self.openClientStorage() self._dostore() # break into the internals to send a bogus message zrpc_conn = self._storage._server.rpc zrpc_conn.message_output(msg) try: self._dostore() except ClientDisconnected: pass else: self._storage.close() self.fail("Server did not disconnect after bogus message") self._storage.close() self._storage = self.openClientStorage() self._dostore() self._storage.close() # Test case for multiple storages participating in a single # transaction. This is not really a connection test, but it needs # about the same infrastructure (several storage servers). # TODO: with the current ZEO code, this occasionally fails. # That's the point of this test. 
:-) def NOcheckMultiStorageTransaction(self): # Configuration parameters (larger values mean more likely deadlocks) N = 2 # These don't *have* to be all the same, but it's convenient this way self.nservers = N self.nthreads = N self.ntrans = N self.nobj = N # Start extra servers for i in range(1, self.nservers): self._newAddr() self.startServer(index=i) # Spawn threads that each do some transactions on all storages threads = [] try: for i in range(self.nthreads): t = MSTThread(self, "T%d" % i) threads.append(t) t.start() # Wait for all threads to finish for t in threads: t.join(60) self.failIf(t.isAlive(), "%s didn't die" % t.getName()) finally: for t in threads: t.closeclients() def checkCrossDBInvalidations(self): db1 = DB(self.openClientStorage()) c1 = db1.open() r1 = c1.root() r1["a"] = MinPO("a") transaction.commit() self.assertEqual(r1._p_state, 0) # up-to-date db2 = DB(self.openClientStorage()) r2 = db2.open().root() self.assertEqual(r2["a"].value, "a") r2["b"] = MinPO("b") transaction.commit() # Make sure the invalidation is received in the other client. # We've had problems with this timing out on "slow" and/or "very # busy" machines, so we increase the sleep time on each trip, and # are willing to wait quite a long time. for i in range(20): c1.sync() if r1._p_state == -1: break time.sleep(i / 10.0) self.assertEqual(r1._p_state, -1) # ghost r1.keys() # unghostify self.assertEqual(r1._p_serial, r2._p_serial) self.assertEqual(r1["b"].value, "b") db2.close() db1.close() def checkCheckForOutOfDateServer(self): # We don't want to connect a client to a server if the client # has seen newer transactions. self._storage = self.openClientStorage() self._dostore() self.shutdownServer() self.assertRaises(ClientDisconnected, self._storage.load, '\0'*8, '') self.startServer() # No matter how long we wait, the client won't reconnect: time.sleep(2) self.assertRaises(ClientDisconnected, self._storage.load, '\0'*8, '') class InvqTests(CommonSetupTearDown): invq = 3 def checkQuickVerificationWith2Clients(self): perstorage = self.openClientStorage(cache="test", cache_size=4000) self.assertEqual(perstorage.verify_result, "empty cache") self._storage = self.openClientStorage() oid = self._storage.new_oid() oid2 = self._storage.new_oid() # When we create a new storage, it should always do a full # verification self.assertEqual(self._storage.verify_result, "empty cache") # do two storages of the object to make sure an invalidation # message is generated revid = self._dostore(oid) revid = self._dostore(oid, revid) # Create a second object and revision to guarantee it doesn't # show up in the list of invalidations sent when perstore restarts. 
revid2 = self._dostore(oid2) revid2 = self._dostore(oid2, revid2) perstorage.load(oid, '') perstorage.close() forker.wait_until(lambda : os.path.exists('test-1.zec')) revid = self._dostore(oid, revid) perstorage = self.openClientStorage(cache="test") forker.wait_until( (lambda : perstorage.verify_result == "quick verification"), onfail=(lambda : None)) self.assertEqual(perstorage.verify_result, "quick verification") self.assertEqual(perstorage._server._last_invals, (revid, [oid])) self.assertEqual(perstorage.load(oid, ''), self._storage.load(oid, '')) perstorage.close() def checkVerificationWith2ClientsInvqOverflow(self): perstorage = self.openClientStorage(cache="test") self.assertEqual(perstorage.verify_result, "empty cache") self._storage = self.openClientStorage() oid = self._storage.new_oid() # When we create a new storage, it should always do a full # verification self.assertEqual(self._storage.verify_result, "empty cache") # do two storages of the object to make sure an invalidation # message is generated revid = self._dostore(oid) revid = self._dostore(oid, revid) forker.wait_until( "Client has seen all of the transactions from the server", lambda : perstorage.lastTransaction() == self._storage.lastTransaction() ) perstorage.load(oid, '') perstorage.close() # the test code sets invq bound to 2 for i in range(5): revid = self._dostore(oid, revid) perstorage = self.openClientStorage(cache="test") self.assertEqual(perstorage.verify_result, "full verification") t = time.time() + 30 while not perstorage.end_verify.isSet(): perstorage.sync() if time.time() > t: self.fail("timed out waiting for endVerify") self.assertEqual(self._storage.load(oid, '')[1], revid) self.assertEqual(perstorage.load(oid, ''), self._storage.load(oid, '')) perstorage.close() class ReconnectionTests(CommonSetupTearDown): # The setUp() starts a server automatically. In order for its # state to persist, we set the class variable keep to 1. In # order for its state to be cleaned up, the last startServer() # call in the test must pass keep=0. keep = 1 invq = 2 def checkReadOnlyStorage(self): # Open a read-only client to a read-only *storage*; stores fail # We don't want the read-write server created by setUp() self.shutdownServer() self._servers = [] # Start a read-only server self.startServer(create=0, index=0, read_only=1, keep=0) # Start a read-only client self._storage = self.openClientStorage(read_only=1) # Stores should fail here self.assertRaises(ReadOnlyError, self._dostore) def checkReadOnlyFallbackReadOnlyStorage(self): # Open a fallback client to a read-only *storage*; stores fail # We don't want the read-write server created by setUp() self.shutdownServer() self._servers = [] # Start a read-only server self.startServer(create=0, index=0, read_only=1, keep=0) # Start a read-only-fallback client self._storage = self.openClientStorage(read_only_fallback=1) # Stores should fail here self.assertRaises(ReadOnlyError, self._dostore) # TODO: Compare checkReconnectXXX() here to checkReconnection() # further down. Is the code here hopelessly naive, or is # checkReconnection() overwrought? 
def checkReconnectWritable(self): # A read-write client reconnects to a read-write server # Start a client self._storage = self.openClientStorage() # Stores should succeed here self._dostore() # Shut down the server self.shutdownServer() self._servers = [] # Poll until the client disconnects self.pollDown() # Stores should fail now self.assertRaises(ClientDisconnected, self._dostore) # Restart the server self.startServer(create=0) # Poll until the client connects self.pollUp() # Stores should succeed here self._dostore() self._storage.close() def checkReconnectReadOnly(self): # A read-only client reconnects from a read-write to a # read-only server # Start a client self._storage = self.openClientStorage(read_only=1) # Stores should fail here self.assertRaises(ReadOnlyError, self._dostore) # Shut down the server self.shutdownServer() self._servers = [] # Poll until the client disconnects self.pollDown() # Stores should still fail self.assertRaises(ReadOnlyError, self._dostore) # Restart the server self.startServer(create=0, read_only=1, keep=0) # Poll until the client connects self.pollUp() # Stores should still fail self.assertRaises(ReadOnlyError, self._dostore) def checkReconnectFallback(self): # A fallback client reconnects from a read-write to a # read-only server # Start a client in fallback mode self._storage = self.openClientStorage(read_only_fallback=1) # Stores should succeed here self._dostore() # Shut down the server self.shutdownServer() self._servers = [] # Poll until the client disconnects self.pollDown() # Stores should fail now self.assertRaises(ClientDisconnected, self._dostore) # Restart the server self.startServer(create=0, read_only=1, keep=0) # Poll until the client connects self.pollUp() # Stores should fail here self.assertRaises(ReadOnlyError, self._dostore) def checkReconnectUpgrade(self): # A fallback client reconnects from a read-only to a # read-write server # We don't want the read-write server created by setUp() self.shutdownServer() self._servers = [] # Start a read-only server self.startServer(create=0, read_only=1) # Start a client in fallback mode self._storage = self.openClientStorage(read_only_fallback=1) # Stores should fail here self.assertRaises(ReadOnlyError, self._dostore) # Shut down the server self.shutdownServer() self._servers = [] # Poll until the client disconnects self.pollDown() # Stores should fail now self.assertRaises(ClientDisconnected, self._dostore) # Restart the server, this time read-write self.startServer(create=0, keep=0) # Poll until the client sconnects self.pollUp() # Stores should now succeed self._dostore() def checkReconnectSwitch(self): # A fallback client initially connects to a read-only server, # then discovers a read-write server and switches to that # We don't want the read-write server created by setUp() self.shutdownServer() self._servers = [] # Allocate a second address (for the second server) self._newAddr() # Start a read-only server self.startServer(create=0, index=0, read_only=1, keep=0) # Start a client in fallback mode self._storage = self.openClientStorage(read_only_fallback=1) # Stores should fail here self.assertRaises(ReadOnlyError, self._dostore) # Start a read-write server self.startServer(index=1, read_only=0, keep=0) # After a while, stores should work for i in range(300): # Try for 30 seconds try: self._dostore() break except (ClientDisconnected, ReadOnlyError): # If the client isn't connected at all, sync() returns # quickly and the test fails because it doesn't wait # long enough for the client. 
time.sleep(0.1) else: self.fail("Couldn't store after starting a read-write server") def checkNoVerificationOnServerRestart(self): self._storage = self.openClientStorage() # When we create a new storage, it should always do a full # verification self.assertEqual(self._storage.verify_result, "empty cache") self._dostore() self.shutdownServer() self.pollDown() self._storage.verify_result = None self.startServer(create=0, keep=0) self.pollUp() # There were no transactions committed, so no verification # should be needed. self.assertEqual(self._storage.verify_result, "no verification") def checkNoVerificationOnServerRestartWith2Clients(self): perstorage = self.openClientStorage(cache="test") self.assertEqual(perstorage.verify_result, "empty cache") self._storage = self.openClientStorage() oid = self._storage.new_oid() # When we create a new storage, it should always do a full # verification self.assertEqual(self._storage.verify_result, "empty cache") # do two storages of the object to make sure an invalidation # message is generated revid = self._dostore(oid) revid = self._dostore(oid, revid) forker.wait_until( "Client has seen all of the transactions from the server", lambda : perstorage.lastTransaction() == self._storage.lastTransaction() ) perstorage.load(oid, '') self.shutdownServer() self.pollDown() self._storage.verify_result = None perstorage.verify_result = None logging.info('2ALLBEEF') self.startServer(create=0, keep=0) self.pollUp() self.pollUp(storage=perstorage) # There were no transactions committed, so no verification # should be needed. self.assertEqual(self._storage.verify_result, "no verification") self.assertEqual(perstorage.verify_result, "no verification") perstorage.close() self._storage.close() def checkDisconnectedAbort(self): self._storage = self.openClientStorage() self._dostore() oids = [self._storage.new_oid() for i in range(5)] txn = Transaction() self._storage.tpc_begin(txn) for oid in oids: data = zodb_pickle(MinPO(oid)) self._storage.store(oid, None, data, '', txn) self.shutdownServer() self.assertRaises(ClientDisconnected, self._storage.tpc_vote, txn) self._storage.tpc_abort(txn) self.startServer(create=0) self._storage._wait() self._dostore() # This test is supposed to cover the following error, although # I don't have much confidence that it does. The likely # explanation for the error is that the _tbuf contained # objects that weren't in the _seriald, because the client was # interrupted waiting for tpc_vote() to return. When the next # transaction committed, it tried to do something with the # bogus _tbuf entries. The explanation is wrong/incomplete, # because tpc_begin() should clear the _tbuf. # 2003-01-15T15:44:19 ERROR(200) ZODB A storage error occurred # in the last phase of a two-phase commit. This shouldn't happen. # Traceback (innermost last): # Module ZODB.Transaction, line 359, in _finish_one # Module ZODB.Connection, line 691, in tpc_finish # Module ZEO.ClientStorage, line 679, in tpc_finish # Module ZEO.ClientStorage, line 709, in _update_cache # KeyError: ... def checkReconnection(self): # Check that the client reconnects when a server restarts. 
self._storage = self.openClientStorage() oid = self._storage.new_oid() obj = MinPO(12) self._dostore(oid, data=obj) logging.info("checkReconnection(): About to shutdown server") self.shutdownServer() logging.info("checkReconnection(): About to restart server") self.startServer(create=0) forker.wait_until('reconnect', self._storage.is_connected) oid = self._storage.new_oid() obj = MinPO(12) while 1: try: self._dostore(oid, data=obj) break except ClientDisconnected: # Maybe the exception mess is better now logging.info("checkReconnection(): Error after" " server restart; retrying.", exc_info=True) transaction.abort() # Give the other thread a chance to run. time.sleep(0.1) logging.info("checkReconnection(): finished") self._storage.close() def checkMultipleServers(self): # Crude test-- just start two servers and do a commit at each one. self._newAddr() self._storage = self.openClientStorage('test', 100000) self._dostore() self.shutdownServer(index=0) # When we start the second server, we use file data file from # the original server so tha the new server is a replica of # the original. We need this becaise ClientStorage won't use # a server if the server's last transaction is earlier than # what the client has seen. self.startServer(index=1, path=self.file+'.0', create=False) # If we can still store after shutting down one of the # servers, we must be reconnecting to the other server. did_a_store = 0 for i in range(10): try: self._dostore() did_a_store = 1 break except ClientDisconnected: time.sleep(0.5) self.assert_(did_a_store) self._storage.close() class TimeoutTests(CommonSetupTearDown): timeout = 1 def checkTimeout(self): storage = self.openClientStorage() txn = Transaction() storage.tpc_begin(txn) storage.tpc_vote(txn) time.sleep(2) self.assertRaises(ClientDisconnected, storage.tpc_finish, txn) # Make sure it's logged as CRITICAL for line in open("server-%s.log" % self.addr[0][1]): if (('Transaction timeout after' in line) and ('CRITICAL ZEO.StorageServer' in line) ): break else: self.assert_(False, 'bad logging') storage.close() def checkTimeoutOnAbort(self): storage = self.openClientStorage() txn = Transaction() storage.tpc_begin(txn) storage.tpc_vote(txn) storage.tpc_abort(txn) storage.close() def checkTimeoutOnAbortNoLock(self): storage = self.openClientStorage() txn = Transaction() storage.tpc_begin(txn) storage.tpc_abort(txn) storage.close() def checkTimeoutAfterVote(self): self._storage = storage = self.openClientStorage() # Assert that the zeo cache is empty self.assert_(not list(storage._cache.contents())) # Create the object oid = storage.new_oid() obj = MinPO(7) # Now do a store, sleeping before the finish so as to cause a timeout t = Transaction() old_connection_count = storage.connection_count_for_tests storage.tpc_begin(t) revid1 = storage.store(oid, ZERO, zodb_pickle(obj), '', t) storage.tpc_vote(t) # Now sleep long enough for the storage to time out time.sleep(3) self.assert_( (not storage.is_connected()) or (storage.connection_count_for_tests > old_connection_count) ) storage._wait() self.assert_(storage.is_connected()) # We expect finish to fail self.assertRaises(ClientDisconnected, storage.tpc_finish, t) # The cache should still be empty self.assert_(not list(storage._cache.contents())) # Load should fail since the object should not be in either the cache # or the server. self.assertRaises(KeyError, storage.load, oid, '') def checkTimeoutProvokingConflicts(self): self._storage = storage = self.openClientStorage() # Assert that the zeo cache is empty. 
self.assert_(not list(storage._cache.contents())) # Create the object oid = storage.new_oid() obj = MinPO(7) # We need to successfully commit an object now so we have something to # conflict about. t = Transaction() storage.tpc_begin(t) revid1a = storage.store(oid, ZERO, zodb_pickle(obj), '', t) revid1b = storage.tpc_vote(t) revid1 = handle_serials(oid, revid1a, revid1b) storage.tpc_finish(t) # Now do a store, sleeping before the finish so as to cause a timeout. obj.value = 8 t = Transaction() old_connection_count = storage.connection_count_for_tests storage.tpc_begin(t) revid2a = storage.store(oid, revid1, zodb_pickle(obj), '', t) revid2b = storage.tpc_vote(t) revid2 = handle_serials(oid, revid2a, revid2b) # Now sleep long enough for the storage to time out. # This used to sleep for 3 seconds, and sometimes (but very rarely) # failed then. Now we try for a minute. It typically succeeds # on the second time thru the loop, and, since self.timeout is 1, # it's typically faster now (2/1.8 ~= 1.11 seconds sleeping instead # of 3). deadline = time.time() + 60 # wait up to a minute while time.time() < deadline: if (storage.is_connected() and (storage.connection_count_for_tests == old_connection_count) ): time.sleep(self.timeout / 1.8) else: break self.assert_( (not storage.is_connected()) or (storage.connection_count_for_tests > old_connection_count) ) storage._wait() self.assert_(storage.is_connected()) # We expect finish to fail. self.assertRaises(ClientDisconnected, storage.tpc_finish, t) storage.tpc_abort(t) # Now we think we've committed the second transaction, but we really # haven't. A third one should produce a POSKeyError on the server, # which manifests as a ConflictError on the client. obj.value = 9 t = Transaction() storage.tpc_begin(t) storage.store(oid, revid2, zodb_pickle(obj), '', t) self.assertRaises(ConflictError, storage.tpc_vote, t) # Even aborting won't help. storage.tpc_abort(t) self.assertRaises(ZODB.POSException.StorageTransactionError, storage.tpc_finish, t) # Try again. obj.value = 10 t = Transaction() storage.tpc_begin(t) storage.store(oid, revid2, zodb_pickle(obj), '', t) # Even aborting won't help. self.assertRaises(ConflictError, storage.tpc_vote, t) # Abort this one and try a transaction that should succeed. storage.tpc_abort(t) # Now do a store. obj.value = 11 t = Transaction() storage.tpc_begin(t) revid2a = storage.store(oid, revid1, zodb_pickle(obj), '', t) revid2b = storage.tpc_vote(t) revid2 = handle_serials(oid, revid2a, revid2b) storage.tpc_finish(t) # Now load the object and verify that it has a value of 11. data, revid = storage.load(oid, '') self.assertEqual(zodb_unpickle(data), MinPO(11)) self.assertEqual(revid, revid2) class MSTThread(threading.Thread): __super_init = threading.Thread.__init__ def __init__(self, testcase, name): self.__super_init(name=name) self.testcase = testcase self.clients = [] def run(self): tname = self.getName() testcase = self.testcase # Create client connections to each server clients = self.clients for i in range(len(testcase.addr)): c = testcase.openClientStorage(addr=testcase.addr[i]) c.__name = "C%d" % i clients.append(c) for i in range(testcase.ntrans): # Because we want a transaction spanning all storages, # we can't use _dostore(). This is several _dostore() calls # expanded in-line (mostly). 
# Create oid->serial mappings for c in clients: c.__oids = [] c.__serials = {} # Begin a transaction t = Transaction() for c in clients: #print "%s.%s.%s begin\n" % (tname, c.__name, i), c.tpc_begin(t) for j in range(testcase.nobj): for c in clients: # Create and store a new object on each server oid = c.new_oid() c.__oids.append(oid) data = MinPO("%s.%s.t%d.o%d" % (tname, c.__name, i, j)) #print data.value data = zodb_pickle(data) s = c.store(oid, ZERO, data, '', t) c.__serials.update(handle_all_serials(oid, s)) # Vote on all servers and handle serials for c in clients: #print "%s.%s.%s vote\n" % (tname, c.__name, i), s = c.tpc_vote(t) c.__serials.update(handle_all_serials(None, s)) # Finish on all servers for c in clients: #print "%s.%s.%s finish\n" % (tname, c.__name, i), c.tpc_finish(t) for c in clients: # Check that we got serials for all oids for oid in c.__oids: testcase.failUnless(c.__serials.has_key(oid)) # Check that we got serials for no other oids for oid in c.__serials.keys(): testcase.failUnless(oid in c.__oids) def closeclients(self): # Close clients opened by run() for c in self.clients: try: c.close() except: pass # Run IPv6 tests if V6 sockets are supported try: socket.socket(socket.AF_INET6, socket.SOCK_STREAM) except (socket.error, AttributeError): pass else: class V6Setup: def _getAddr(self): return '::1', forker.get_port(self) _g = globals() for name, value in _g.items(): if isinstance(value, type) and issubclass(value, CommonSetupTearDown): _g[name+"V6"] = type(name+"V6", (V6Setup, value), {}) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/InvalidationTests.py000066400000000000000000000406001230730566700255140ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import threading import time from random import Random import transaction from BTrees.check import check, display from BTrees.OOBTree import OOBTree from ZEO.tests.TestThread import TestThread from ZODB.DB import DB from ZODB.POSException import ReadConflictError, ConflictError # The tests here let several threads have a go at one or more database # instances simultaneously. Each thread appends a disjoint (from the # other threads) sequence of increasing integers to an OOBTree, one at # at time (per thread). This provokes lots of conflicts, and BTrees # work hard at conflict resolution too. An OOBTree is used because # that flavor has the smallest maximum bucket size, and so splits buckets # more often than other BTree flavors. # # When these tests were first written, they provoked an amazing number # of obscure timing-related bugs in cache consistency logic, revealed # by failure of the BTree to pass internal consistency checks at the end, # and/or by failure of the BTree to contain all the keys the threads # thought they added (i.e., the keys for which transaction.commit() # did not raise any exception). 
class FailableThread(TestThread): # mixin class # subclass must provide # - self.stop attribute (an event) # - self._testrun() method # TestThread.run() invokes testrun(). def testrun(self): try: self._testrun() except: # Report the failure here to all the other threads, so # that they stop quickly. self.stop.set() raise class StressTask: # Append integers startnum, startnum + step, startnum + 2*step, ... # to 'tree'. If sleep is given, sleep # that long after each append. At the end, instance var .added_keys # is a list of the ints the thread believes it added successfully. def __init__(self, db, threadnum, startnum, step=2, sleep=None): self.db = db self.threadnum = threadnum self.startnum = startnum self.step = step self.sleep = sleep self.added_keys = [] self.tm = transaction.TransactionManager() self.cn = self.db.open(transaction_manager=self.tm) self.cn.sync() def doStep(self): tree = self.cn.root()["tree"] key = self.startnum tree[key] = self.threadnum def commit(self): cn = self.cn key = self.startnum self.tm.get().note("add key %s" % key) try: self.tm.get().commit() except ConflictError, msg: self.tm.abort() else: if self.sleep: time.sleep(self.sleep) self.added_keys.append(key) self.startnum += self.step def cleanup(self): self.tm.get().abort() self.cn.close() def _runTasks(rounds, *tasks): '''run *task* interleaved for *rounds* rounds.''' def commit(run, actions): actions.append(':') for t in run: t.commit() del run[:] r = Random() r.seed(1064589285) # make it deterministic run = [] actions = [] try: for i in range(rounds): t = r.choice(tasks) if t in run: commit(run, actions) run.append(t) t.doStep() actions.append(`t.startnum`) commit(run,actions) # stderr.write(' '.join(actions)+'\n') finally: for t in tasks: t.cleanup() class StressThread(FailableThread): # Append integers startnum, startnum + step, startnum + 2*step, ... # to 'tree' until Event stop is set. If sleep is given, sleep # that long after each append. At the end, instance var .added_keys # is a list of the ints the thread believes it added successfully. def __init__(self, testcase, db, stop, threadnum, commitdict, startnum, step=2, sleep=None): TestThread.__init__(self, testcase) self.db = db self.stop = stop self.threadnum = threadnum self.startnum = startnum self.step = step self.sleep = sleep self.added_keys = [] self.commitdict = commitdict def _testrun(self): tm = transaction.TransactionManager() cn = self.db.open(transaction_manager=tm) while not self.stop.isSet(): try: tree = cn.root()["tree"] break except (ConflictError, KeyError): tm.abort() key = self.startnum while not self.stop.isSet(): try: tree[key] = self.threadnum tm.get().note("add key %s" % key) tm.commit() self.commitdict[self] = 1 if self.sleep: time.sleep(self.sleep) except (ReadConflictError, ConflictError), msg: tm.abort() else: self.added_keys.append(key) key += self.step cn.close() class LargeUpdatesThread(FailableThread): # A thread that performs a lot of updates. It attempts to modify # more than 25 objects so that it can test code that runs vote # in a separate thread when it modifies more than 25 objects. 
def __init__(self, test, db, stop, threadnum, commitdict, startnum, step=2, sleep=None): TestThread.__init__(self, test) self.db = db self.stop = stop self.threadnum = threadnum self.startnum = startnum self.step = step self.sleep = sleep self.added_keys = [] self.commitdict = commitdict def _testrun(self): cn = self.db.open() while not self.stop.isSet(): try: tree = cn.root()["tree"] break except (ConflictError, KeyError): # print "%d getting tree abort" % self.threadnum transaction.abort() keys_added = {} # set of keys we commit tkeys = [] while not self.stop.isSet(): # The test picks 50 keys spread across many buckets. # self.startnum and self.step ensure that all threads use # disjoint key sets, to minimize conflict errors. nkeys = len(tkeys) if nkeys < 50: tkeys = range(self.startnum, 3000, self.step) nkeys = len(tkeys) step = max(int(nkeys / 50), 1) keys = [tkeys[i] for i in range(0, nkeys, step)] for key in keys: try: tree[key] = self.threadnum except (ReadConflictError, ConflictError), msg: # print "%d setting key %s" % (self.threadnum, msg) transaction.abort() break else: # print "%d set #%d" % (self.threadnum, len(keys)) transaction.get().note("keys %s" % ", ".join(map(str, keys))) try: transaction.commit() self.commitdict[self] = 1 if self.sleep: time.sleep(self.sleep) except ConflictError, msg: # print "%d commit %s" % (self.threadnum, msg) transaction.abort() continue for k in keys: tkeys.remove(k) keys_added[k] = 1 self.added_keys = keys_added.keys() cn.close() class InvalidationTests: # Minimum # of seconds the main thread lets the workers run. The # test stops as soon as this much time has elapsed, and all threads # have managed to commit a change. MINTIME = 10 # Maximum # of seconds the main thread lets the workers run. We # stop after this long has elapsed regardless of whether all threads # have managed to commit a change. MAXTIME = 300 StressThread = StressThread def _check_tree(self, cn, tree): # Make sure the BTree is sane at the C level. retries = 3 while retries: retries -= 1 try: check(tree) tree._check() except ReadConflictError: if retries: transaction.abort() else: raise except: display(tree) raise def _check_threads(self, tree, *threads): # Make sure the thread's view of the world is consistent with # the actual database state. expected_keys = [] errormsgs = [] err = errormsgs.append for t in threads: if not t.added_keys: err("thread %d didn't add any keys" % t.threadnum) expected_keys.extend(t.added_keys) expected_keys.sort() for i in range(100): tree._p_jar.sync() actual_keys = list(tree.keys()) if expected_keys == actual_keys: break time.sleep(.1) else: err("expected keys != actual keys") for k in expected_keys: if k not in actual_keys: err("key %s expected but not in tree" % k) for k in actual_keys: if k not in expected_keys: err("key %s in tree but not expected" % k) self.fail('\n'.join(errormsgs)) def go(self, stop, commitdict, *threads): # Run the threads for t in threads: t.start() delay = self.MINTIME start = time.time() while time.time() - start <= self.MAXTIME: stop.wait(delay) if stop.isSet(): # Some thread failed. Stop right now. break delay = 2.0 if len(commitdict) >= len(threads): break # Some thread still hasn't managed to commit anything. stop.set() # Give all the threads some time to stop before trying to clean up. 
# cleanup() will cause the test to fail if some thread ended with # an uncaught exception, and unittest will call the base class # tearDown then immediately, but if other threads are still # running that can lead to a cascade of spurious exceptions. for t in threads: t.join(30) for t in threads: t.cleanup(10) def checkConcurrentUpdates2Storages_emulated(self): self._storage = storage1 = self.openClientStorage() storage2 = self.openClientStorage() db1 = DB(storage1) db2 = DB(storage2) cn = db1.open() tree = cn.root()["tree"] = OOBTree() transaction.commit() # DM: allow time for invalidations to come in and process them time.sleep(0.1) # Run two threads that update the BTree t1 = StressTask(db1, 1, 1,) t2 = StressTask(db2, 2, 2,) _runTasks(100, t1, t2) cn.sync() self._check_tree(cn, tree) self._check_threads(tree, t1, t2) cn.close() db1.close() db2.close() def checkConcurrentUpdates2Storages(self): self._storage = storage1 = self.openClientStorage() storage2 = self.openClientStorage() db1 = DB(storage1) db2 = DB(storage2) stop = threading.Event() cn = db1.open() tree = cn.root()["tree"] = OOBTree() transaction.commit() cn.close() # Run two threads that update the BTree cd = {} t1 = self.StressThread(self, db1, stop, 1, cd, 1) t2 = self.StressThread(self, db2, stop, 2, cd, 2) self.go(stop, cd, t1, t2) while db1.lastTransaction() != db2.lastTransaction(): db1._storage.sync() db2._storage.sync() cn = db1.open() tree = cn.root()["tree"] self._check_tree(cn, tree) self._check_threads(tree, t1, t2) cn.close() db1.close() db2.close() def checkConcurrentUpdates19Storages(self): n = 19 dbs = [DB(self.openClientStorage()) for i in range(n)] self._storage = dbs[0].storage stop = threading.Event() cn = dbs[0].open() tree = cn.root()["tree"] = OOBTree() transaction.commit() cn.close() # Run threads that update the BTree cd = {} threads = [self.StressThread(self, dbs[i], stop, i, cd, i, n) for i in range(n)] self.go(stop, cd, *threads) while len(set(db.lastTransaction() for db in dbs)) > 1: _ = [db._storage.sync() for db in dbs] cn = dbs[0].open() tree = cn.root()["tree"] self._check_tree(cn, tree) self._check_threads(tree, *threads) cn.close() _ = [db.close() for db in dbs] def checkConcurrentUpdates1Storage(self): self._storage = storage1 = self.openClientStorage() db1 = DB(storage1) stop = threading.Event() cn = db1.open() tree = cn.root()["tree"] = OOBTree() transaction.commit() cn.close() # Run two threads that update the BTree cd = {} t1 = self.StressThread(self, db1, stop, 1, cd, 1, sleep=0.01) t2 = self.StressThread(self, db1, stop, 2, cd, 2, sleep=0.01) self.go(stop, cd, t1, t2) cn = db1.open() tree = cn.root()["tree"] self._check_tree(cn, tree) self._check_threads(tree, t1, t2) cn.close() db1.close() def checkConcurrentUpdates2StoragesMT(self): self._storage = storage1 = self.openClientStorage() db1 = DB(storage1) db2 = DB(self.openClientStorage()) stop = threading.Event() cn = db1.open() tree = cn.root()["tree"] = OOBTree() transaction.commit() cn.close() # Run three threads that update the BTree. # Two of the threads share a single storage so that it # is possible for both threads to read the same object # at the same time. 
cd = {} t1 = self.StressThread(self, db1, stop, 1, cd, 1, 3) t2 = self.StressThread(self, db2, stop, 2, cd, 2, 3, 0.01) t3 = self.StressThread(self, db2, stop, 3, cd, 3, 3, 0.01) self.go(stop, cd, t1, t2, t3) while db1.lastTransaction() != db2.lastTransaction(): time.sleep(.1) time.sleep(.1) cn = db1.open() tree = cn.root()["tree"] self._check_tree(cn, tree) self._check_threads(tree, t1, t2, t3) cn.close() db1.close() db2.close() def checkConcurrentLargeUpdates(self): # Use 3 threads like the 2StorageMT test above. self._storage = storage1 = self.openClientStorage() db1 = DB(storage1) db2 = DB(self.openClientStorage()) stop = threading.Event() cn = db1.open() tree = cn.root()["tree"] = OOBTree() for i in range(0, 3000, 2): tree[i] = 0 transaction.commit() cn.close() # Run three threads that update the BTree. # Two of the threads share a single storage so that it # is possible for both threads to read the same object # at the same time. cd = {} t1 = LargeUpdatesThread(self, db1, stop, 1, cd, 1, 3, 0.02) t2 = LargeUpdatesThread(self, db2, stop, 2, cd, 2, 3, 0.01) t3 = LargeUpdatesThread(self, db2, stop, 3, cd, 3, 3, 0.01) self.go(stop, cd, t1, t2, t3) while db1.lastTransaction() != db2.lastTransaction(): db1._storage.sync() db2._storage.sync() cn = db1.open() tree = cn.root()["tree"] self._check_tree(cn, tree) # Purge the tree of the dummy entries mapping to 0. losers = [k for k, v in tree.items() if v == 0] for k in losers: del tree[k] transaction.commit() self._check_threads(tree, t1, t2, t3) cn.close() db1.close() db2.close() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/IterationTests.py000066400000000000000000000152331230730566700250350ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2008 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """ZEO iterator protocol tests.""" import transaction class IterationTests: def checkIteratorGCProtocol(self): # Test garbage collection on protocol level. server = self._storage._server iid = server.iterator_start(None, None) # None signals the end of iteration. self.assertEquals(None, server.iterator_next(iid)) # The server has disposed the iterator already. self.assertRaises(KeyError, server.iterator_next, iid) iid = server.iterator_start(None, None) # This time, we tell the server to throw the iterator away. server.iterator_gc([iid]) self.assertRaises(KeyError, server.iterator_next, iid) def checkIteratorExhaustionStorage(self): # Test the storage's garbage collection mechanism. self._dostore() iterator = self._storage.iterator() # At this point, a wrapping iterator might not have called the CS # iterator yet. We'll consume one item to make sure this happens. iterator.next() self.assertEquals(1, len(self._storage._iterator_ids)) iid = list(self._storage._iterator_ids)[0] self.assertEquals([], list(iterator)) self.assertEquals(0, len(self._storage._iterator_ids)) # The iterator has run through, so the server has already disposed it. 
self.assertRaises(KeyError, self._storage._server.iterator_next, iid) def checkIteratorGCSpanTransactions(self): # Keep a hard reference to the iterator so it won't be automatically # garbage collected at the transaction boundary. self._dostore() iterator = self._storage.iterator() self._dostore() # As the iterator was not garbage collected, we can still use it. (We # don't see the transaction we just wrote being picked up, because # iterators only see the state from the point in time when they were # created.) self.assert_(list(iterator)) def checkIteratorGCStorageCommitting(self): # We want the iterator to be garbage-collected, so we don't keep any # hard references to it. The storage tracks its ID, though. # The odd little jig we do below arises from the fact that the # CS iterator may not be constructed right away if the CS is wrapped. # We need to actually do some iteration to get the iterator created. # We do a store to make sure the iterator isn't exhausted right away. self._dostore() self._storage.iterator().next() self.assertEquals(1, len(self._storage._iterator_ids)) iid = list(self._storage._iterator_ids)[0] # GC happens at the transaction boundary. After that, both the storage # and the server have forgotten the iterator. self._dostore() self.assertEquals(0, len(self._storage._iterator_ids)) self.assertRaises(KeyError, self._storage._server.iterator_next, iid) def checkIteratorGCStorageTPCAborting(self): # The odd little jig we do below arises from the fact that the # CS iterator may not be constructed right away if the CS is wrapped. # We need to actually do some iteration to get the iterator created. # We do a store to make sure the iterator isn't exhausted right away. self._dostore() self._storage.iterator().next() iid = list(self._storage._iterator_ids)[0] t = transaction.Transaction() self._storage.tpc_begin(t) self._storage.tpc_abort(t) self.assertEquals(0, len(self._storage._iterator_ids)) self.assertRaises(KeyError, self._storage._server.iterator_next, iid) def checkIteratorGCStorageDisconnect(self): # The odd little jig we do below arises from the fact that the # CS iterator may not be constructed right away if the CS is wrapped. # We need to actually do some iteration to get the iterator created. # We do a store to make sure the iterator isn't exhausted right away. self._dostore() self._storage.iterator().next() iid = list(self._storage._iterator_ids)[0] t = transaction.Transaction() self._storage.tpc_begin(t) # Show that after disconnecting, the client side GCs the iterators # as well. I'm calling this directly to avoid accidentally # calling tpc_abort implicitly. self._storage.notifyDisconnected() self.assertEquals(0, len(self._storage._iterator_ids)) def checkIteratorParallel(self): self._dostore() self._dostore() iter1 = self._storage.iterator() iter2 = self._storage.iterator() txn_info1 = iter1.next() txn_info2 = iter2.next() self.assertEquals(txn_info1.tid, txn_info2.tid) txn_info1 = iter1.next() txn_info2 = iter2.next() self.assertEquals(txn_info1.tid, txn_info2.tid) self.assertRaises(StopIteration, iter1.next) self.assertRaises(StopIteration, iter2.next) def iterator_sane_after_reconnect(): r"""Make sure that iterators are invalidated on disconnect. Start a server: >>> addr, adminaddr = start_server( ... '\npath fs\n', keep=1) Open a client storage to it and commit a some transactions: >>> import ZEO, transaction >>> db = ZEO.DB(addr) >>> conn = db.open() >>> for i in range(10): ... conn.root().i = i ... 
transaction.commit() Create an iterator: >>> it = conn._storage.iterator() >>> tid1 = it.next().tid Restart the storage: >>> stop_server(adminaddr) >>> wait_disconnected(conn._storage) >>> _ = start_server('\npath fs\n', addr=addr) >>> wait_connected(conn._storage) Now, we'll create a second iterator: >>> it2 = conn._storage.iterator() If we try to advance the first iterator, we should get an error: >>> it.next().tid > tid1 Traceback (most recent call last): ... ClientDisconnected: Disconnected iterator The second iterator should be peachy: >>> it2.next().tid == tid1 True Cleanup: >>> db.close() """ ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/TestThread.py000066400000000000000000000042701230730566700241220ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """A Thread base class for use with unittest.""" import threading import sys class TestThread(threading.Thread): """Base class for defining threads that run from unittest. The subclass should define a testrun() method instead of a run() method. Call cleanup() when the test is done with the thread, instead of join(). If the thread exits with an uncaught exception, it's captured and re-raised when cleanup() is called. cleanup() should be called by the main thread! Trying to tell unittest that a test failed from another thread creates a nightmare of timing-depending cascading failures and missed errors (tracebacks that show up on the screen, but don't cause unittest to believe the test failed). cleanup() also joins the thread. If the thread ended without raising an uncaught exception, and the join doesn't succeed in the timeout period, then the test is made to fail with a "Thread still alive" message. """ def __init__(self, testcase): threading.Thread.__init__(self) # In case this thread hangs, don't stop Python from exiting. self.setDaemon(1) self._exc_info = None self._testcase = testcase def run(self): try: self.testrun() except: self._exc_info = sys.exc_info() def cleanup(self, timeout=15): self.join(timeout) if self._exc_info: raise self._exc_info[0], self._exc_info[1], self._exc_info[2] if self.isAlive(): self._testcase.fail("Thread did not finish: %s" % self) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/ThreadTests.py000066400000000000000000000120201230730566700242750ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Compromising positions involving threads.""" import threading import transaction from ZODB.tests.StorageTestBase import zodb_pickle, MinPO import ZEO.ClientStorage ZERO = '\0'*8 class BasicThread(threading.Thread): def __init__(self, storage, doNextEvent, threadStartedEvent): self.storage = storage self.trans = transaction.Transaction() self.doNextEvent = doNextEvent self.threadStartedEvent = threadStartedEvent self.gotValueError = 0 self.gotDisconnected = 0 threading.Thread.__init__(self) self.setDaemon(1) def join(self): threading.Thread.join(self, 10) assert not self.isAlive() class GetsThroughVoteThread(BasicThread): # This thread gets partially through a transaction before it turns # execution over to another thread. We're trying to establish that a # tpc_finish() after a storage has been closed by another thread will get # a ClientStorageError error. # # This class gets does a tpc_begin(), store(), tpc_vote() and is waiting # to do the tpc_finish() when the other thread closes the storage. def run(self): self.storage.tpc_begin(self.trans) oid = self.storage.new_oid() self.storage.store(oid, ZERO, zodb_pickle(MinPO("c")), '', self.trans) self.storage.tpc_vote(self.trans) self.threadStartedEvent.set() self.doNextEvent.wait(10) try: self.storage.tpc_finish(self.trans) except ZEO.ClientStorage.ClientStorageError: self.gotValueError = 1 self.storage.tpc_abort(self.trans) class GetsThroughBeginThread(BasicThread): # This class is like the above except that it is intended to be run when # another thread is already in a tpc_begin(). Thus, this thread will # block in the tpc_begin until another thread closes the storage. When # that happens, this one will get disconnected too. def run(self): try: self.storage.tpc_begin(self.trans) except ZEO.ClientStorage.ClientStorageError: self.gotValueError = 1 class ThreadTests: # Thread 1 should start a transaction, but not get all the way through it. # Main thread should close the connection. Thread 1 should then get # disconnected. def checkDisconnectedOnThread2Close(self): doNextEvent = threading.Event() threadStartedEvent = threading.Event() thread1 = GetsThroughVoteThread(self._storage, doNextEvent, threadStartedEvent) thread1.start() threadStartedEvent.wait(10) self._storage.close() doNextEvent.set() thread1.join() self.assertEqual(thread1.gotValueError, 1) # Thread 1 should start a transaction, but not get all the way through # it. While thread 1 is in the middle of the transaction, a second thread # should start a transaction, and it will block in the tcp_begin() -- # because thread 1 has acquired the lock in its tpc_begin(). Now the main # thread closes the storage and both sub-threads should get disconnected. 
def checkSecondBeginFails(self): doNextEvent = threading.Event() threadStartedEvent = threading.Event() thread1 = GetsThroughVoteThread(self._storage, doNextEvent, threadStartedEvent) thread2 = GetsThroughBeginThread(self._storage, doNextEvent, threadStartedEvent) thread1.start() threadStartedEvent.wait(1) thread2.start() self._storage.close() doNextEvent.set() thread1.join() thread2.join() self.assertEqual(thread1.gotValueError, 1) self.assertEqual(thread2.gotValueError, 1) # Run a bunch of threads doing small and large stores in parallel def checkMTStores(self): threads = [] for i in range(5): t = threading.Thread(target=self.mtstorehelper) threads.append(t) t.start() for t in threads: t.join(30) for i in threads: self.failUnless(not t.isAlive()) # Helper for checkMTStores def mtstorehelper(self): name = threading.currentThread().getName() objs = [] for i in range(10): objs.append(MinPO("X" * 200000)) objs.append(MinPO("X")) for obj in objs: self._dostore(data=obj) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/__init__.py000066400000000000000000000012011230730566700236010ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/auth_plaintext.py000066400000000000000000000040441230730566700251030ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Implements plaintext password authentication. The password is stored in an SHA hash in the Database. The client sends over the plaintext password, and the SHA hashing is done on the server side. This mechanism offers *no network security at all*; the only security is provided by not storing plaintext passwords on disk. 
""" from ZEO.hash import sha1 from ZEO.StorageServer import ZEOStorage from ZEO.auth import register_module from ZEO.auth.base import Client, Database def session_key(username, realm, password): return sha1("%s:%s:%s" % (username, realm, password)).hexdigest() class StorageClass(ZEOStorage): def auth(self, username, password): try: dbpw = self.database.get_password(username) except LookupError: return 0 password_dig = sha1(password).hexdigest() if dbpw == password_dig: self.connection.setSessionKey(session_key(username, self.database.realm, password)) return self._finish_auth(dbpw == password_dig) class PlaintextClient(Client): extensions = ["auth"] def start(self, username, realm, password): if self.stub.auth(username, password): return session_key(username, realm, password) else: return None register_module("plaintext", StorageClass, PlaintextClient, Database) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/client-config.test000066400000000000000000000041641230730566700251250ustar00rootroot00000000000000ZEO Client Configuration ======================== Here we'll describe (and test) the various ZEO Client configuration options. To facilitate this, we'l start a server that our client can connect to: >>> addr, _ = start_server(blob_dir='server-blobs') The simplest client configuration specified a server address: >>> import ZODB.config >>> storage = ZODB.config.storageFromString(""" ... ... server %s:%s ... ... """ % addr) >>> storage.getName(), storage.__class__.__name__ ... # doctest: +ELLIPSIS ("[('localhost', ...)] (connected)", 'ClientStorage') >>> storage.blob_dir >>> storage._storage '1' >>> storage._cache.maxsize 20971520 >>> storage._cache.path >>> storage._rpc_mgr.tmin 5 >>> storage._rpc_mgr.tmax 300 >>> storage._is_read_only False >>> storage._read_only_fallback False >>> storage._drop_cache_rather_verify False >>> storage._blob_cache_size >>> storage.close() >>> storage = ZODB.config.storageFromString(""" ... ... server %s:%s ... blob-dir blobs ... storage 2 ... cache-size 100 ... name bob ... client cache ... min-disconnect-poll 1 ... max-disconnect-poll 5 ... read-only true ... drop-cache-rather-verify true ... blob-cache-size 1000MB ... blob-cache-size-check 10 ... wait false ... ... """ % addr) >>> storage.getName(), storage.__class__.__name__ ('bob (disconnected)', 'ClientStorage') >>> storage.blob_dir 'blobs' >>> storage._storage '2' >>> storage._cache.maxsize 100 >>> import os >>> storage._cache.path == os.path.abspath('cache-2.zec') True >>> storage._rpc_mgr.tmin 1 >>> storage._rpc_mgr.tmax 5 >>> storage._is_read_only True >>> storage._read_only_fallback False >>> storage._drop_cache_rather_verify True >>> storage._blob_cache_size 1048576000 >>> print storage._blob_cache_size_check 104857600 >>> storage.close() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/drop_cache_rather_than_verify.txt000066400000000000000000000133331230730566700302740ustar00rootroot00000000000000Avoiding cache verifification ============================= For large databases it is common to also use very large ZEO cache files. If a client has beed disconnected for too long, cache verification might be necessary, but cache verification can be very hard on the storage server. When verification is needed, a ZEO.interfaces.StaleCache event is published. Applications may handle this event to perform actions such as exiting the process to avoid a cold restart. ClientStorage provides an option to drop it's cache rather than doing verification. 
When this option is used, and verification would be necessary, after
publishing the event, ClientStorage:

- Invalidates all object caches

- Drops or clears its client cache. (The end result is that the cache
  is working but empty.)

- Logs a CRITICAL message.

Here's an example that shows that this is actually what happens.

Start a server, create a client to it, and commit some data:

    >>> addr, admin = start_server(keep=1)
    >>> import ZEO, transaction
    >>> db = ZEO.DB(addr, drop_cache_rather_verify=True, client='cache',
    ...             name='test')
    >>> wait_connected(db.storage)
    >>> conn = db.open()
    >>> conn.root()[1] = conn.root().__class__()
    >>> conn.root()[1].x = 1
    >>> transaction.commit()
    >>> len(db.storage._cache)
    3

Now, we'll stop the server and restart with a different address:

    >>> stop_server(admin)
    >>> addr2, admin = start_server(keep=1)

And create another client and write some data to it:

    >>> db2 = ZEO.DB(addr2)
    >>> wait_connected(db2.storage)
    >>> conn2 = db2.open()
    >>> for i in range(5):
    ...     conn2.root()[1].x += 1
    ...     transaction.commit()
    >>> db2.close()
    >>> stop_server(admin)

Now, we'll restart the server.  Before we do that, we'll capture logging
and event data:

    >>> import logging, zope.testing.loggingsupport, zope.event
    >>> handler = zope.testing.loggingsupport.InstalledHandler(
    ...    'ZEO.ClientStorage', level=logging.ERROR)
    >>> events = []
    >>> def event_handler(e):
    ...     events.append((
    ...         len(e.storage._cache), str(handler), e.__class__.__name__))
    >>> zope.event.subscribers.append(event_handler)

Note that the event handler is saving away the length of the cache and
the state of the log handler.  We'll use this to show that the event is
generated before the cache is dropped or the message is logged.

Now, we'll restart the server on the original address:

    >>> _, admin = start_server(zeo_conf=dict(invalidation_queue_size=1),
    ...                         addr=addr, keep=1)
    >>> wait_connected(db.storage)

Now, let's verify our assertions above:

- Publishes a stale-cache event.

    >>> for e in events:
    ...     print e
    (3, '', 'StaleCache')

  Note that the length of the cache when the event handler was called
  was non-zero.  This is because the cache wasn't cleared yet.
  Similarly, the dropping-cache message hasn't been logged yet.

    >>> del events[:]

- Drops or clears its client cache. (The end result is that the cache
  is working but empty.)

    >>> len(db.storage._cache)
    0

- Invalidates all object caches

    >>> transaction.abort()
    >>> conn.root()._p_changed

- Logs a CRITICAL message.

    >>> print handler
    ZEO.ClientStorage CRITICAL
      test dropping stale cache
    >>> handler.clear()

If we access the root object, it'll be loaded from the server:

    >>> conn.root()[1].x
    6
    >>> len(db.storage._cache)
    2

Similarly, if we simply disconnect the client, and write data from
another client:

    >>> db.close()

    >>> db2 = ZEO.DB(addr)
    >>> wait_connected(db2.storage)
    >>> conn2 = db2.open()
    >>> for i in range(5):
    ...     conn2.root()[1].x += 1
    ...     transaction.commit()
    >>> db2.close()

    >>> db = ZEO.DB(addr, drop_cache_rather_verify=True, client='cache',
    ...             name='test')
    >>> wait_connected(db.storage)

- Drops or clears its client cache. (The end result is that the cache
  is working but empty.)

    >>> len(db.storage._cache)
    1

  (When a database is created, it checks to make sure the root object
  is in the database, which is why we get 1, rather than 0 objects in
  the cache.)

- Publishes a stale-cache event.

    >>> for e in events:
    ...     print e
    (2, '', 'StaleCache')

    >>> del events[:]

- Logs a CRITICAL message.
>>> print handler ZEO.ClientStorage CRITICAL test dropping stale cache >>> handler.clear() If we access the root object, it'll be loaded from the server: >>> conn = db.open() >>> conn.root()[1].x 11 Finally, let's look at what happens without the drop_cache_rather_verify option: >>> db.close() >>> db = ZEO.DB(addr, client='cache') >>> wait_connected(db.storage) >>> conn = db.open() >>> conn.root()[1].x 11 >>> conn.root()[2] = conn.root().__class__() >>> transaction.commit() >>> len(db.storage._cache) 4 >>> stop_server(admin) >>> addr2, admin = start_server(keep=1) >>> db2 = ZEO.DB(addr2) >>> wait_connected(db2.storage) >>> conn2 = db2.open() >>> for i in range(5): ... conn2.root()[1].x += 1 ... transaction.commit() >>> db2.close() >>> stop_server(admin) >>> _, admin = start_server(zeo_conf=dict(invalidation_queue_size=1), ... addr=addr) >>> wait_connected(db.storage) >>> for e in events: ... print e (4, '', 'StaleCache') >>> print handler >>> len(db.storage._cache) 3 Here we see the cache wasn't dropped, although one of the records was invalidated during verification. .. Cleanup >>> db.close() >>> handler.uninstall() >>> zope.event.subscribers.remove(event_handler) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/forker.py000066400000000000000000000276031230730566700233500ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Library for forking storage server and connecting client storage""" import os import random import sys import time import errno import socket import subprocess import logging import StringIO import tempfile import logging import ZODB.tests.util import zope.testing.setupstack logger = logging.getLogger('ZEO.tests.forker') class ZEOConfig: """Class to generate ZEO configuration file. 
""" def __init__(self, addr): if isinstance(addr, str): self.logpath = addr+'.log' else: self.logpath = 'server-%s.log' % addr[1] addr = '%s:%s' % addr self.address = addr self.read_only = None self.invalidation_queue_size = None self.invalidation_age = None self.monitor_address = None self.transaction_timeout = None self.authentication_protocol = None self.authentication_database = None self.authentication_realm = None self.loglevel = 'INFO' def dump(self, f): print >> f, "" print >> f, "address " + self.address if self.read_only is not None: print >> f, "read-only", self.read_only and "true" or "false" if self.invalidation_queue_size is not None: print >> f, "invalidation-queue-size", self.invalidation_queue_size if self.invalidation_age is not None: print >> f, "invalidation-age", self.invalidation_age if self.monitor_address is not None: print >> f, "monitor-address %s:%s" % self.monitor_address if self.transaction_timeout is not None: print >> f, "transaction-timeout", self.transaction_timeout if self.authentication_protocol is not None: print >> f, "authentication-protocol", self.authentication_protocol if self.authentication_database is not None: print >> f, "authentication-database", self.authentication_database if self.authentication_realm is not None: print >> f, "authentication-realm", self.authentication_realm print >> f, "" print >> f, """ level %s path %s """ % (self.loglevel, self.logpath) def __str__(self): f = StringIO.StringIO() self.dump(f) return f.getvalue() def encode_format(fmt): # The list of replacements mirrors # ZConfig.components.logger.handlers._control_char_rewrites for xform in (("\n", r"\n"), ("\t", r"\t"), ("\b", r"\b"), ("\f", r"\f"), ("\r", r"\r")): fmt = fmt.replace(*xform) return fmt def start_zeo_server(storage_conf=None, zeo_conf=None, port=None, keep=False, path='Data.fs', protocol=None, blob_dir=None, suicide=True, debug=False): """Start a ZEO server in a separate process. Takes two positional arguments a string containing the storage conf and a ZEOConfig object. Returns the ZEO address, the test server address, the pid, and the path to the config file. """ if not storage_conf: storage_conf = '\npath %s\n' % path if blob_dir: storage_conf = '\nblob-dir %s\n%s\n' % ( blob_dir, storage_conf) if port is None: raise AssertionError("The port wasn't specified") if isinstance(port, int): addr = 'localhost', port adminaddr = 'localhost', port+1 else: addr = port adminaddr = port+'-test' if zeo_conf is None or isinstance(zeo_conf, dict): z = ZEOConfig(addr) if zeo_conf: z.__dict__.update(zeo_conf) zeo_conf = z # Store the config info in a temp file. tmpfile = tempfile.mktemp(".conf", dir=os.getcwd()) fp = open(tmpfile, 'w') zeo_conf.dump(fp) fp.write(storage_conf) fp.close() # Find the zeoserver script import ZEO.tests.zeoserver script = ZEO.tests.zeoserver.__file__ if script.endswith('.pyc'): script = script[:-1] # Create a list of arguments, which we'll tuplify below qa = _quote_arg args = [qa(sys.executable), qa(script), '-C', qa(tmpfile)] if keep: args.append("-k") if debug: args.append("-d") if not suicide: args.append("-S") if protocol: args.extend(["-v", protocol]) d = os.environ.copy() d['PYTHONPATH'] = os.pathsep.join(sys.path) if sys.platform.startswith('win'): pid = os.spawnve(os.P_NOWAIT, sys.executable, tuple(args), d) else: pid = subprocess.Popen(args, env=d, close_fds=True).pid # We need to wait until the server starts, but not forever. # 30 seconds is a somewhat arbitrary upper bound. 
A BDBStorage # takes a long time to open -- more than 10 seconds on occasion. for i in range(300): time.sleep(0.1) try: if isinstance(adminaddr, str) and not os.path.exists(adminaddr): continue logger.debug('connect %s', i) if isinstance(adminaddr, str): s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) else: s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) s.connect(adminaddr) ack = s.recv(1024) s.close() logging.debug('acked: %s' % ack) break except socket.error, e: if e[0] not in (errno.ECONNREFUSED, errno.ECONNRESET): raise s.close() else: logging.debug('boo hoo') raise return addr, adminaddr, pid, tmpfile if sys.platform[:3].lower() == "win": def _quote_arg(s): return '"%s"' % s else: def _quote_arg(s): return s def shutdown_zeo_server(adminaddr): # Do this in a loop to guard against the possibility that the # client failed to connect to the adminaddr earlier. That really # only requires two iterations, but do a third for pure # superstition. for i in range(3): if isinstance(adminaddr, str): s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) else: s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) s.settimeout(.3) try: s.connect(adminaddr) except socket.timeout: # On FreeBSD 5.3 the connection just timed out if i > 0: break raise except socket.error, e: if (e[0] == errno.ECONNREFUSED or # MAC OS X uses EINVAL when connecting to a port # that isn't being listened on. (sys.platform == 'darwin' and e[0] == errno.EINVAL) ) and i > 0: break raise try: ack = s.recv(1024) except socket.error, e: ack = 'no ack received' logger.debug('shutdown_zeo_server(): acked: %s' % ack) s.close() def get_port(test=None): """Return a port that is not in use. Checks if a port is in use by trying to connect to it. Assumes it is not in use if connect raises an exception. We actually look for 2 consective free ports because most of the clients of this function will use the returned port and the next one. Raises RuntimeError after 10 tries. """ if test is not None: return get_port2(test) for i in range(10): port = random.randrange(20000, 30000) s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) s1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM) try: try: s.connect(('localhost', port)) except socket.error: pass # Perhaps we should check value of error too. else: continue try: s1.connect(('localhost', port+1)) except socket.error: pass # Perhaps we should check value of error too. else: continue return port finally: s.close() s1.close() raise RuntimeError("Can't find port") def get_port2(test): for i in range(10): while 1: port = random.randrange(20000, 30000) if port%3 == 0: break s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) try: s.bind(('localhost', port+2)) except socket.error, e: if e[0] != errno.EADDRINUSE: raise continue if not (can_connect(port) or can_connect(port+1)): zope.testing.setupstack.register(test, s.close) return port s.close() raise RuntimeError("Can't find port") def can_connect(port): c = socket.socket(socket.AF_INET, socket.SOCK_STREAM) try: c.connect(('localhost', port)) except socket.error: return False # Perhaps we should check value of error too. else: c.close() return True def setUp(test): ZODB.tests.util.setUp(test) servers = {} def start_server(storage_conf=None, zeo_conf=None, port=None, keep=False, addr=None, path='Data.fs', protocol=None, blob_dir=None, suicide=True, debug=False): """Start a ZEO server. Return the server and admin addresses. 
""" if port is None: if addr is None: port = get_port2(test) else: port = addr[1] elif addr is not None: raise TypeError("Can't specify port and addr") addr, adminaddr, pid, config_path = start_zeo_server( storage_conf, zeo_conf, port, keep, path, protocol, blob_dir, suicide, debug) os.remove(config_path) servers[adminaddr] = pid return addr, adminaddr test.globs['start_server'] = start_server def get_port(): return get_port2(test) test.globs['get_port'] = get_port def stop_server(adminaddr): pid = servers.pop(adminaddr) shutdown_zeo_server(adminaddr) os.waitpid(pid, 0) test.globs['stop_server'] = stop_server def cleanup_servers(): for adminaddr in list(servers): stop_server(adminaddr) zope.testing.setupstack.register(test, cleanup_servers) test.globs['wait_until'] = wait_until test.globs['wait_connected'] = wait_connected test.globs['wait_disconnected'] = wait_disconnected def wait_until(label=None, func=None, timeout=30, onfail=None): if label is None: if func is not None: label = func.__name__ elif not isinstance(label, basestring) and func is None: func = label label = func.__name__ if func is None: def wait_decorator(f): wait_until(label, f, timeout, onfail) return wait_decorator giveup = time.time() + timeout while not func(): if time.time() > giveup: if onfail is None: raise AssertionError("Timed out waiting for: ", label) else: return onfail() time.sleep(0.01) def wait_connected(storage): wait_until("storage is connected", storage.is_connected) def wait_disconnected(storage): wait_until("storage is disconnected", lambda : not storage.is_connected()) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/invalidation-age.txt000066400000000000000000000114241230730566700254540ustar00rootroot00000000000000Invalidation age ================ When a ZEO client with a non-empty cache connects to the server, it needs to verify whether the data in its cache is current. It does this in one of 2 ways: quick verification It gets a list of invalidations from the server since the last transaction the client has seen and applies those to it's disk and in-memory caches. This is only possible if there haven't been too many transactions since the client was last connected. full verification If quick verification isn't possible, the client iterates through it's disk cache asking the server to verify whether each current entry is valid. Unfortunately, for large caches, full verification is soooooo not quick that it is impractical. Quick verificatioin is highly desireable. To support quick verification, the server keeps a list of recent invalidations. The size of this list is controlled by the invalidation_queue_size parameter. If there is a lot of database activity, the size might need to be quite large to support having clients be disconnected for more than a few minutes. A very large invalidation queue size can use a lot of memory. To suppliment the invalidation queue, you can also specify an invalidation_age parameter. When a client connects and presents the last transaction id it has seen, we first check to see if the invalidation queue has that transaction id. It it does, then we send all transactions since that id. Otherwise, we check to see if the difference between storage's last transaction id and the given id is less than or equal to the invalidation age. If it is, then we iterate over the storage, starting with the given id, to get the invalidations since the given id. NOTE: This assumes that iterating from a point near the "end" of a database is inexpensive. 
Don't use this option for a storage for which that is not the case. Here's an example. We set up a server, using an invalidation-queue-size of 5: >>> addr, admin = start_server(zeo_conf=dict(invalidation_queue_size=5), ... keep=True) Now, we'll open a client with a persistent cache, set up some data, and then close client: >>> import ZEO, transaction >>> db = ZEO.DB(addr, client='test') >>> conn = db.open() >>> for i in range(9): ... conn.root()[i] = conn.root().__class__() ... conn.root()[i].x = 0 >>> transaction.commit() >>> db.close() We'll open another client, and commit some transactions: >>> db = ZEO.DB(addr) >>> conn = db.open() >>> import transaction >>> for i in range(2): ... conn.root()[i].x = 1 ... transaction.commit() >>> db.close() If we reopen the first client, we'll do quick verification. We'll turn on logging so we can see this: >>> import logging, sys >>> old_logging_level = logging.getLogger().getEffectiveLevel() >>> logging.getLogger().setLevel(logging.INFO) >>> handler = logging.StreamHandler(sys.stdout) >>> logging.getLogger().addHandler(handler) >>> db = ZEO.DB(addr, client='test') # doctest: +ELLIPSIS ('localhost', ... ('localhost', ...) Recovering 2 invalidations >>> logging.getLogger().removeHandler(handler) >>> [v.x for v in db.open().root().values()] [1, 1, 0, 0, 0, 0, 0, 0, 0] Now, if we disconnect and commit more than 5 transactions, we'll see that verification is necessary: >>> db.close() >>> db = ZEO.DB(addr) >>> conn = db.open() >>> import transaction >>> for i in range(9): ... conn.root()[i].x = 2 ... transaction.commit() >>> db.close() >>> logging.getLogger().addHandler(handler) >>> db = ZEO.DB(addr, client='test') # doctest: +ELLIPSIS ('localhost', ... ('localhost', ...) Verifying cache ('localhost', ...) endVerify finishing ('localhost', ...) endVerify finished >>> logging.getLogger().removeHandler(handler) >>> [v.x for v in db.open().root().values()] [2, 2, 2, 2, 2, 2, 2, 2, 2] >>> db.close() But if we restart the server with invalidation-age set, we can do quick verification: >>> stop_server(admin) >>> addr, admin = start_server(zeo_conf=dict(invalidation_queue_size=5, ... invalidation_age=100)) >>> db = ZEO.DB(addr) >>> conn = db.open() >>> import transaction >>> for i in range(9): ... conn.root()[i].x = 3 ... transaction.commit() >>> db.close() >>> logging.getLogger().addHandler(handler) >>> db = ZEO.DB(addr, client='test') # doctest: +ELLIPSIS ('localhost', ... ('localhost', ...) Recovering 9 invalidations >>> logging.getLogger().removeHandler(handler) >>> [v.x for v in db.open().root().values()] [3, 3, 3, 3, 3, 3, 3, 3, 3] >>> db.close() >>> logging.getLogger().setLevel(old_logging_level) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/protocols.test000066400000000000000000000117531230730566700244320ustar00rootroot00000000000000Test that multiple protocols are supported ========================================== A full test of all protocols isn't practical. But we'll do a limited test that at least the current and previous protocols are supported in both directions. Let's start a Z308 server >>> storage_conf = ''' ... ... blob-dir server-blobs ... ... path Data.fs ... ... ... ''' >>> addr, admin = start_server( ... 
storage_conf, dict(invalidation_queue_size=5), protocol='Z308') A current client should be able to connect to a old server: >>> import ZEO, ZODB.blob, transaction >>> db = ZEO.DB(addr, client='client', blob_dir='blobs') >>> wait_connected(db.storage) >>> db.storage._connection.peer_protocol_version 'Z308' >>> conn = db.open() >>> conn.root().x = 0 >>> transaction.commit() >>> len(db.history(conn.root()._p_oid, 99)) 2 >>> conn.root()['blob1'] = ZODB.blob.Blob() >>> conn.root()['blob1'].open('w').write('blob data 1') >>> transaction.commit() >>> db2 = ZEO.DB(addr, blob_dir='server-blobs', shared_blob_dir=True) >>> wait_connected(db2.storage) >>> conn2 = db2.open() >>> for i in range(5): ... conn2.root().x += 1 ... transaction.commit() >>> conn2.root()['blob2'] = ZODB.blob.Blob() >>> conn2.root()['blob2'].open('w').write('blob data 2') >>> transaction.commit() >>> @wait_until("Get the new data") ... def f(): ... conn.sync() ... return conn.root().x == 5 >>> db.close() >>> for i in range(2): ... conn2.root().x += 1 ... transaction.commit() >>> db = ZEO.DB(addr, client='client', blob_dir='blobs') >>> wait_connected(db.storage) >>> conn = db.open() >>> conn.root().x 7 >>> db.close() >>> for i in range(10): ... conn2.root().x += 1 ... transaction.commit() >>> db = ZEO.DB(addr, client='client', blob_dir='blobs') >>> wait_connected(db.storage) >>> conn = db.open() >>> conn.root().x 17 >>> conn.root()['blob1'].open().read() 'blob data 1' >>> conn.root()['blob2'].open().read() 'blob data 2' Note that when taking to a 3.8 server, iteration won't work: >>> db.storage.iterator() Traceback (most recent call last): ... NotImplementedError >>> db2.close() >>> db.close() >>> stop_server(admin) >>> import os, zope.testing.setupstack >>> os.remove('client-1.zec') >>> zope.testing.setupstack.rmtree('blobs') >>> zope.testing.setupstack.rmtree('server-blobs') And the other way around: >>> addr, _ = start_server(storage_conf, dict(invalidation_queue_size=5)) Note that we'll have to pull some hijinks: >>> import ZEO.zrpc.connection >>> old_current_protocol = ZEO.zrpc.connection.Connection.current_protocol >>> ZEO.zrpc.connection.Connection.current_protocol = 'Z308' >>> db = ZEO.DB(addr, client='client', blob_dir='blobs') >>> db.storage._connection.peer_protocol_version 'Z308' >>> wait_connected(db.storage) >>> conn = db.open() >>> conn.root().x = 0 >>> transaction.commit() >>> len(db.history(conn.root()._p_oid, 99)) 2 >>> conn.root()['blob1'] = ZODB.blob.Blob() >>> conn.root()['blob1'].open('w').write('blob data 1') >>> transaction.commit() >>> db2 = ZEO.DB(addr, blob_dir='server-blobs', shared_blob_dir=True) >>> wait_connected(db2.storage) >>> conn2 = db2.open() >>> for i in range(5): ... conn2.root().x += 1 ... transaction.commit() >>> conn2.root()['blob2'] = ZODB.blob.Blob() >>> conn2.root()['blob2'].open('w').write('blob data 2') >>> transaction.commit() >>> @wait_until() ... def x_to_be_5(): ... conn.sync() ... return conn.root().x == 5 >>> db.close() >>> for i in range(2): ... conn2.root().x += 1 ... transaction.commit() >>> db = ZEO.DB(addr, client='client', blob_dir='blobs') >>> wait_connected(db.storage) >>> conn = db.open() >>> conn.root().x 7 >>> db.close() >>> for i in range(10): ... conn2.root().x += 1 ... 
transaction.commit()

    >>> db = ZEO.DB(addr, client='client', blob_dir='blobs')
    >>> wait_connected(db.storage)
    >>> conn = db.open()
    >>> conn.root().x
    17

    >>> conn.root()['blob1'].open().read()
    'blob data 1'
    >>> conn.root()['blob2'].open().read()
    'blob data 2'

Make some old protocol calls:

    >>> db.storage._server.rpc.call('getSerial', conn.root()._p_oid
    ...         ) == conn.root()._p_serial
    True

    >>> p, s, v, x, y = db.storage._server.rpc.call('zeoLoad',
    ...                                             conn.root()._p_oid)
    >>> (v, x, y) == ('', None, None)
    True
    >>> db.storage.load(conn.root()._p_oid) == (p, s)
    True

    >>> db2.close()
    >>> db.close()

Undo the hijinks:

    >>> ZEO.zrpc.connection.Connection.current_protocol = old_current_protocol
ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/registerDB.test000066400000000000000000000075401230730566700244370ustar00rootroot00000000000000Storage Servers should call registerDB on storages to propagate invalidations
==============================================================================

Storage servers propagate invalidations from their storages.  Among other
things, this allows client storages to be used in storage servers, allowing
storage-server fan out, spreading read load over multiple storage servers.

We'll create a Faux storage that has a registerDB method.

    >>> class FauxStorage:
    ...     invalidations = [('trans0', ['ob0']),
    ...                      ('trans1', ['ob0', 'ob1']),
    ...                      ]
    ...     def registerDB(self, db):
    ...         self.db = db
    ...     def isReadOnly(self):
    ...         return False
    ...     def getName(self):
    ...         return 'faux'
    ...     def lastTransaction(self):
    ...         return self.invq[0][0]
    ...     def lastInvalidations(self, size):
    ...         return list(self.invalidations)

We don't want the storage server to try to bind to a socket.  We'll
subclass it and give it a do-nothing dispatcher "class":

    >>> import ZEO.StorageServer
    >>> class StorageServer(ZEO.StorageServer.StorageServer):
    ...     DispatcherClass = lambda *a, **k: None

We'll create a storage instance and a storage server using it:

    >>> storage = FauxStorage()
    >>> server = StorageServer('addr', dict(t=storage))

Our storage now has a db attribute that provides IStorageDB.  Its
references method is just the referencesf function from ZODB.serialize.

    >>> import ZODB.serialize
    >>> storage.db.references is ZODB.serialize.referencesf
    True

To see the effects of the invalidation messages, we'll create a client
stub that implements the client invalidation calls:

    >>> class Client:
    ...     def __init__(self, name):
    ...         self.name = name
    ...     def invalidateTransaction(self, tid, invalidated):
    ...         print 'invalidateTransaction', tid, self.name
    ...         print invalidated

    >>> class Connection:
    ...     def __init__(self, mgr, obj):
    ...         self.mgr = mgr
    ...         self.obj = obj
    ...     def should_close(self):
    ...         print 'closed', self.obj.name
    ...         self.mgr.close_conn(self)
    ...     def poll(self):
    ...         pass
    ...
    ...     @property
    ...     def trigger(self):
    ...         return self
    ...
    ...     def pull_trigger(self):
    ...         pass

    >>> class ZEOStorage:
    ...     def __init__(self, server, name):
    ...         self.name = name
    ...         self.connection = Connection(server, self)
    ...         self.client = Client(name)

Now, we'll register the client with the storage server:

    >>> _ = server.register_connection('t', ZEOStorage(server, 1))
    >>> _ = server.register_connection('t', ZEOStorage(server, 2))

Now, if we call invalidate, we'll see it propagate to the clients:

    >>> storage.db.invalidate('trans2', ['ob1', 'ob2'])
    invalidateTransaction trans2 1
    ['ob1', 'ob2']
    invalidateTransaction trans2 2
    ['ob1', 'ob2']

    >>> storage.db.invalidate('trans3', ['ob1', 'ob2'])
    invalidateTransaction trans3 1
    ['ob1', 'ob2']
    invalidateTransaction trans3 2
    ['ob1', 'ob2']

The storage server's queue will reflect the invalidations:

    >>> for tid, invalidated in server.invq['t']:
    ...     print repr(tid), invalidated
    'trans3' ['ob1', 'ob2']
    'trans2' ['ob1', 'ob2']
    'trans1' ['ob0', 'ob1']
    'trans0' ['ob0']

If we call invalidateCache, the storage server will close each of its
connections:

    >>> storage.db.invalidateCache()
    closed 1
    closed 2

The connections will then reopen and revalidate their caches.

The server's invalidation queue will get reset:

    >>> for tid, invalidated in server.invq['t']:
    ...     print repr(tid), invalidated
    'trans1' ['ob0', 'ob1']
    'trans0' ['ob0']
ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/servertesting.py000066400000000000000000000051131230730566700247540ustar00rootroot00000000000000##############################################################################
#
# Copyright Zope Foundation and Contributors.
# All Rights Reserved.
#
# This software is subject to the provisions of the Zope Public License,
# Version 2.1 (ZPL).  A copy of the ZPL should accompany this distribution.
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED
# WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS
# FOR A PARTICULAR PURPOSE.
#
##############################################################################

# Testing the current ZEO implementation is rather hard due to the
# architecture, which mixes concerns, especially between application
# and networking.  Still, it's not as bad as it could be.

# The 2 most important classes in the architecture are ZEOStorage and
# StorageServer.  A ZEOStorage is created for each client connection.
# The StorageServer maintains data shared or needed for coordination
# among clients.

# The other important part of the architecture is connections.
# Connections are used by ZEOStorages to send messages or return data
# to clients.

# Here, we'll try to provide some testing infrastructure to isolate
# servers from the network.
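# As a rough usage sketch (hypothetical, not part of this module): build an
# in-memory StorageServer, attach a fake client connection with the client()
# helper defined below, and call methods on the resulting ZEOStorage
# directly, without any sockets:
#
#     import ZEO.tests.servertesting
#
#     server = ZEO.tests.servertesting.StorageServer()
#     zs = ZEO.tests.servertesting.client(server, 'test-client')
#     info = zs.get_info()   # talk to the server side as the network would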
import ZEO.StorageServer import ZEO.zrpc.connection import ZEO.zrpc.error import ZODB.MappingStorage class StorageServer(ZEO.StorageServer.StorageServer): def __init__(self, addr='test_addr', storages=None, **kw): if storages is None: storages = {'1': ZODB.MappingStorage.MappingStorage()} ZEO.StorageServer.StorageServer.__init__(self, addr, storages, **kw) def DispatcherClass(*args, **kw): pass class Connection: peer_protocol_version = ZEO.zrpc.connection.Connection.current_protocol connected = True def __init__(self, name='connection', addr=''): name = str(name) self.name = name self.addr = addr or 'test-addr-'+name def close(self): print self.name, 'closed' self.connected = False def poll(self): if not self.connected: raise ZEO.zrpc.error.DisconnectedError() def callAsync(self, meth, *args): print self.name, 'callAsync', meth, repr(args) callAsyncNoPoll = callAsync def call_from_thread(self, *args): if args: args[0](*args[1:]) def send_reply(self, *args): pass def client(server, name='client', addr=''): zs = ZEO.StorageServer.ZEOStorage(server) zs.notifyConnected(Connection(name, addr)) zs.register('1', 0) return zs ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/speed.py000066400000000000000000000137701230730566700231600ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## usage="""Test speed of a ZODB storage Options: -d file The data file to use as input. The default is this script. -n n The number of repititions -s module A module that defines a 'Storage' attribute, which is an open storage. If not specified, a FileStorage will ne used. -z Test compressing data -D Run in debug mode -L Test loads as well as stores by minimizing the cache after eachrun -M Output means only -C Run with a persistent client cache -U Run ZEO using a Unix domain socket -t n Number of concurrent threads to run. 
""" import asyncore import sys, os, getopt, time ##sys.path.insert(0, os.getcwd()) import persistent import transaction import ZODB from ZODB.POSException import ConflictError from ZEO.tests import forker class P(persistent.Persistent): pass fs_name = "zeo-speed.fs" class ZEOExit(asyncore.file_dispatcher): """Used to exit ZEO.StorageServer when run is done""" def writable(self): return 0 def readable(self): return 1 def handle_read(self): buf = self.recv(4) assert buf == "done" self.delete_fs() os._exit(0) def handle_close(self): print "Parent process exited unexpectedly" self.delete_fs() os._exit(0) def delete_fs(self): os.unlink(fs_name) os.unlink(fs_name + ".lock") os.unlink(fs_name + ".tmp") def work(db, results, nrep, compress, data, detailed, minimize, threadno=None): for j in range(nrep): for r in 1, 10, 100, 1000: t = time.time() conflicts = 0 jar = db.open() while 1: try: transaction.begin() rt = jar.root() key = 's%s' % r if rt.has_key(key): p = rt[key] else: rt[key] = p =P() for i in range(r): v = getattr(p, str(i), P()) if compress is not None: v.d = compress(data) else: v.d = data setattr(p, str(i), v) transaction.commit() except ConflictError: conflicts = conflicts + 1 else: break jar.close() t = time.time() - t if detailed: if threadno is None: print "%s\t%s\t%.4f\t%d" % (j, r, t, conflicts) else: print "%s\t%s\t%.4f\t%d\t%d" % (j, r, t, conflicts, threadno) results[r].append((t, conflicts)) rt=d=p=v=None # release all references if minimize: time.sleep(3) jar.cacheMinimize() def main(args): opts, args = getopt.getopt(args, 'zd:n:Ds:LMt:U') s = None compress = None data=sys.argv[0] nrep=5 minimize=0 detailed=1 cache = None domain = 'AF_INET' threads = 1 for o, v in opts: if o=='-n': nrep = int(v) elif o=='-d': data = v elif o=='-s': s = v elif o=='-z': import zlib compress = zlib.compress elif o=='-L': minimize=1 elif o=='-M': detailed=0 elif o=='-D': global debug os.environ['STUPID_LOG_FILE']='' os.environ['STUPID_LOG_SEVERITY']='-999' debug = 1 elif o == '-C': cache = 'speed' elif o == '-U': domain = 'AF_UNIX' elif o == '-t': threads = int(v) zeo_pipe = None if s: s = __import__(s, globals(), globals(), ('__doc__',)) s = s.Storage server = None else: s, server, pid = forker.start_zeo("FileStorage", (fs_name, 1), domain=domain) data=open(data).read() db=ZODB.DB(s, # disable cache deactivation cache_size=4000, cache_deactivate_after=6000,) print "Beginning work..." results={1:[], 10:[], 100:[], 1000:[]} if threads > 1: import threading l = [] for i in range(threads): t = threading.Thread(target=work, args=(db, results, nrep, compress, data, detailed, minimize, i)) l.append(t) for t in l: t.start() for t in l: t.join() else: work(db, results, nrep, compress, data, detailed, minimize) if server is not None: server.close() os.waitpid(pid, 0) if detailed: print '-'*24 print "num\tmean\tmin\tmax" for r in 1, 10, 100, 1000: times = [] for time, conf in results[r]: times.append(time) t = mean(times) print "%d\t%.4f\t%.4f\t%.4f" % (r, t, min(times), max(times)) def mean(l): tot = 0 for v in l: tot = tot + v return tot / len(l) ##def compress(s): ## c = zlib.compressobj() ## o = c.compress(s) ## return o + c.flush() if __name__=='__main__': main(sys.argv[1:]) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/stress.py000066400000000000000000000073031230730566700233760ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. 
# # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """A ZEO client-server stress test to look for leaks. The stress test should run in an infinite loop and should involve multiple connections. """ # TODO: This code is currently broken. import transaction import ZODB from ZODB.MappingStorage import MappingStorage from ZODB.tests import MinPO from ZEO.ClientStorage import ClientStorage from ZEO.tests import forker import os import random import types NUM_TRANSACTIONS_PER_CONN = 10 NUM_CONNECTIONS = 10 NUM_ROOTS = 20 MAX_DEPTH = 20 MIN_OBJSIZE = 128 MAX_OBJSIZE = 2048 def an_object(): """Return an object suitable for a PersistentMapping key""" size = random.randrange(MIN_OBJSIZE, MAX_OBJSIZE) if os.path.exists("/dev/urandom"): f = open("/dev/urandom") buf = f.read(size) f.close() return buf else: f = open(MinPO.__file__) l = list(f.read(size)) f.close() random.shuffle(l) return "".join(l) def setup(cn): """Initialize the database with some objects""" root = cn.root() for i in range(NUM_ROOTS): prev = an_object() for j in range(random.randrange(1, MAX_DEPTH)): o = MinPO.MinPO(prev) prev = o root[an_object()] = o transaction.commit() cn.close() def work(cn): """Do some work with a transaction""" cn.sync() root = cn.root() obj = random.choice(root.values()) # walk down to the bottom while not isinstance(obj.value, types.StringType): obj = obj.value obj.value = an_object() transaction.commit() def main(): # Yuck! Need to cleanup forker so that the API is consistent # across Unix and Windows, at least if that's possible. if os.name == "nt": zaddr, tport, pid = forker.start_zeo_server('MappingStorage', ()) def exitserver(): import socket s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) s.connect(tport) s.close() else: zaddr = '', random.randrange(20000, 30000) pid, exitobj = forker.start_zeo_server(MappingStorage(), zaddr) def exitserver(): exitobj.close() while 1: pid = start_child(zaddr) print "started", pid os.waitpid(pid, 0) exitserver() def start_child(zaddr): pid = os.fork() if pid != 0: return pid try: _start_child(zaddr) finally: os._exit(0) def _start_child(zaddr): storage = ClientStorage(zaddr, debug=1, min_disconnect_poll=0.5, wait=1) db = ZODB.DB(storage, pool_size=NUM_CONNECTIONS) setup(db.open()) conns = [] conn_count = 0 for i in range(NUM_CONNECTIONS): c = db.open() c.__count = 0 conns.append(c) conn_count += 1 while conn_count < 25: c = random.choice(conns) if c.__count > NUM_TRANSACTIONS_PER_CONN: conns.remove(c) c.close() conn_count += 1 c = db.open() c.__count = 0 conns.append(c) else: c.__count += 1 work(c) if __name__ == "__main__": main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/testAuth.py000066400000000000000000000115231230730566700236530ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Test suite for AuthZEO.""" import os import tempfile import time import unittest from ZEO import zeopasswd from ZEO.Exceptions import ClientDisconnected from ZEO.tests.ConnectionTests import CommonSetupTearDown class AuthTest(CommonSetupTearDown): __super_getServerConfig = CommonSetupTearDown.getServerConfig __super_setUp = CommonSetupTearDown.setUp __super_tearDown = CommonSetupTearDown.tearDown realm = None def setUp(self): fd, self.pwfile = tempfile.mkstemp('pwfile') os.close(fd) if self.realm: self.pwdb = self.dbclass(self.pwfile, self.realm) else: self.pwdb = self.dbclass(self.pwfile) self.pwdb.add_user("foo", "bar") self.pwdb.save() self._checkZEOpasswd() self.__super_setUp() def _checkZEOpasswd(self): args = ["-f", self.pwfile, "-p", self.protocol] if self.protocol == "plaintext": from ZEO.auth.base import Database zeopasswd.main(args + ["-d", "foo"], Database) zeopasswd.main(args + ["foo", "bar"], Database) else: zeopasswd.main(args + ["-d", "foo"]) zeopasswd.main(args + ["foo", "bar"]) def tearDown(self): os.remove(self.pwfile) self.__super_tearDown() def getConfig(self, path, create, read_only): return "" def getServerConfig(self, addr, ro_svr): zconf = self.__super_getServerConfig(addr, ro_svr) zconf.authentication_protocol = self.protocol zconf.authentication_database = self.pwfile zconf.authentication_realm = self.realm return zconf def wait(self): for i in range(25): time.sleep(0.1) if self._storage.test_connection: return self.fail("Timed out waiting for client to authenticate") def testOK(self): # Sleep for 0.2 seconds to give the server some time to start up # seems to be needed before and after creating the storage self._storage = self.openClientStorage(wait=0, username="foo", password="bar", realm=self.realm) self.wait() self.assert_(self._storage._connection) self._storage._connection.poll() self.assert_(self._storage.is_connected()) # Make a call to make sure the mechanism is working self._storage.undoInfo() def testNOK(self): self._storage = self.openClientStorage(wait=0, username="foo", password="noogie", realm=self.realm) self.wait() # If the test established a connection, then it failed. self.failIf(self._storage._connection) def testUnauthenticatedMessage(self): # Test that an unauthenticated message is rejected by the server # if it was sent after the connection was authenticated. self._storage = self.openClientStorage(wait=0, username="foo", password="bar", realm=self.realm) # Sleep for 0.2 seconds to give the server some time to start up # seems to be needed before and after creating the storage self.wait() self._storage.undoInfo() # Manually clear the state of the hmac connection self._storage._connection._SizedMessageAsyncConnection__hmac_send = None # Once the client stops using the hmac, it should be disconnected. 
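        # undoInfo is now sent without a valid HMAC digest, so the server
        # drops the connection and the call raises ClientDisconnected.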
self.assertRaises(ClientDisconnected, self._storage.undoInfo) class PlainTextAuth(AuthTest): import ZEO.tests.auth_plaintext protocol = "plaintext" database = "authdb.sha" dbclass = ZEO.tests.auth_plaintext.Database realm = "Plaintext Realm" class DigestAuth(AuthTest): import ZEO.auth.auth_digest protocol = "digest" database = "authdb.digest" dbclass = ZEO.auth.auth_digest.DigestDatabase realm = "Digest Realm" test_classes = [PlainTextAuth, DigestAuth] def test_suite(): suite = unittest.TestSuite() for klass in test_classes: sub = unittest.makeSuite(klass) suite.addTest(sub) return suite if __name__ == "__main__": unittest.main(defaultTest='test_suite') ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/testConnection.py000066400000000000000000000213041230730566700250470ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Test setup for ZEO connection logic. The actual tests are in ConnectionTests.py; this file provides the platform-dependent scaffolding. """ from __future__ import with_statement from ZEO.tests import ConnectionTests, InvalidationTests from zope.testing import setupstack import os if os.environ.get('USE_ZOPE_TESTING_DOCTEST'): from zope.testing import doctest else: import doctest import unittest import ZEO.tests.forker import ZEO.tests.testMonitor import ZEO.zrpc.connection import ZODB.tests.util class FileStorageConfig: def getConfig(self, path, create, read_only): return """\ path %s create %s read-only %s """ % (path, create and 'yes' or 'no', read_only and 'yes' or 'no') class MappingStorageConfig: def getConfig(self, path, create, read_only): return """""" class FileStorageConnectionTests( FileStorageConfig, ConnectionTests.ConnectionTests, InvalidationTests.InvalidationTests ): """FileStorage-specific connection tests.""" class FileStorageReconnectionTests( FileStorageConfig, ConnectionTests.ReconnectionTests, ): """FileStorage-specific re-connection tests.""" # Run this at level 1 because MappingStorage can't do reconnection tests class FileStorageInvqTests( FileStorageConfig, ConnectionTests.InvqTests ): """FileStorage-specific invalidation queue tests.""" class FileStorageTimeoutTests( FileStorageConfig, ConnectionTests.TimeoutTests ): pass class MappingStorageConnectionTests( MappingStorageConfig, ConnectionTests.ConnectionTests ): """Mapping storage connection tests.""" # The ReconnectionTests can't work with MappingStorage because it's only an # in-memory storage and has no persistent state. class MappingStorageTimeoutTests( MappingStorageConfig, ConnectionTests.TimeoutTests ): pass class MonitorTests(ZEO.tests.testMonitor.MonitorTests): def check_connection_management(self): # Open and close a few connections, making sure that # the resulting number of clients is 0. 
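        # Each openClientStorage() call makes a separate client connection,
        # so the monitor should report three clients until they are closed.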
s1 = self.openClientStorage() s2 = self.openClientStorage() s3 = self.openClientStorage() stats = self.parse(self.get_monitor_output())[1] self.assertEqual(stats.clients, 3) s1.close() s3.close() s2.close() ZEO.tests.forker.wait_until( "Number of clients shown in monitor drops to 0", lambda : self.parse(self.get_monitor_output())[1].clients == 0 ) def check_connection_management_with_old_client(self): # Check that connection management works even when using an # older protcool that requires a connection adapter. test_protocol = "Z303" current_protocol = ZEO.zrpc.connection.Connection.current_protocol ZEO.zrpc.connection.Connection.current_protocol = test_protocol ZEO.zrpc.connection.Connection.servers_we_can_talk_to.append( test_protocol) try: self.check_connection_management() finally: ZEO.zrpc.connection.Connection.current_protocol = current_protocol ZEO.zrpc.connection.Connection.servers_we_can_talk_to.pop() test_classes = [FileStorageConnectionTests, FileStorageReconnectionTests, FileStorageInvqTests, FileStorageTimeoutTests, MappingStorageConnectionTests, MappingStorageTimeoutTests, MonitorTests, ] def invalidations_while_connecting(): r""" As soon as a client registers with a server, it will recieve invalidations from the server. The client must be careful to queue these invalidations until it is ready to deal with them. At the time of the writing of this test, clients weren't careful enough about queing invalidations. This led to cache corruption in the form of both low-level file corruption as well as out-of-date records marked as current. This tests tries to provoke this bug by: - starting a server >>> addr, _ = start_server() - opening a client to the server that writes some objects, filling it's cache at the same time, >>> import ZODB.tests.MinPO, transaction >>> db = ZEO.DB(addr, client='x') >>> conn = db.open() >>> nobs = 1000 >>> for i in range(nobs): ... conn.root()[i] = ZODB.tests.MinPO.MinPO(0) >>> transaction.commit() >>> import zope.testing.loggingsupport, logging >>> handler = zope.testing.loggingsupport.InstalledHandler( ... 'ZEO', level=logging.INFO) # >>> logging.getLogger('ZEO').debug( # ... 'Initial tid %r' % conn.root()._p_serial) - disconnecting the first client (closing it with a persistent cache), >>> db.close() - starting a second client that writes objects more or less constantly, >>> import random, threading, time >>> stop = False >>> db2 = ZEO.DB(addr) >>> tm = transaction.TransactionManager() >>> conn2 = db2.open(transaction_manager=tm) >>> random = random.Random(0) >>> lock = threading.Lock() >>> def run(): ... while 1: ... i = random.randint(0, nobs-1) ... if stop: ... return ... with lock: ... conn2.root()[i].value += 1 ... tm.commit() ... #logging.getLogger('ZEO').debug( ... # 'COMMIT %s %s %r' % ( ... # i, conn2.root()[i].value, conn2.root()[i]._p_serial)) ... time.sleep(0) >>> thread = threading.Thread(target=run) >>> thread.setDaemon(True) >>> thread.start() - restarting the first client, and - testing for cache validity. >>> bad = False >>> try: ... for c in range(10): ... time.sleep(.1) ... db = ZODB.DB(ZEO.ClientStorage.ClientStorage(addr, client='x')) ... with lock: ... #logging.getLogger('ZEO').debug('Locked %s' % c) ... @wait_until("connected and we have caught up", timeout=199) ... def _(): ... if (db.storage.is_connected() ... and db.storage.lastTransaction() ... == db.storage._server.lastTransaction() ... ): ... #logging.getLogger('ZEO').debug( ... # 'Connected %r' % db.storage.lastTransaction()) ... return True ... ... conn = db.open() ... 
for i in range(1000): ... if conn.root()[i].value != conn2.root()[i].value: ... print 'bad', c, i, conn.root()[i].value, ... print conn2.root()[i].value ... bad = True ... print 'client debug log with lock held' ... while handler.records: ... record = handler.records.pop(0) ... print record.name, record.levelname, ... print handler.format(record) ... if bad: ... print open('server-%s.log' % addr[1]).read() ... #else: ... # logging.getLogger('ZEO').debug('GOOD %s' % c) ... db.close() ... finally: ... stop = True ... thread.join(10) >>> thread.isAlive() False >>> for record in handler.records: ... if record.levelno < logging.ERROR: ... continue ... print record.name, record.levelname ... print handler.format(record) >>> handler.uninstall() >>> db.close() >>> db2.close() """ def test_suite(): suite = unittest.TestSuite() for klass in test_classes: sub = unittest.makeSuite(klass, 'check') suite.addTest(sub) suite.addTest(doctest.DocTestSuite( setUp=ZEO.tests.forker.setUp, tearDown=setupstack.tearDown, )) suite.layer = ZODB.tests.util.MininalTestLayer('ZEO Connection Tests') return suite ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/testConversionSupport.py000066400000000000000000000126641230730566700265030ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2006 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## import doctest import unittest class FakeStorageBase: def __getattr__(self, name): if name in ('getTid', 'history', 'load', 'loadSerial', 'lastTransaction', 'getSize', 'getName', 'supportsUndo', 'tpc_transaction'): return lambda *a, **k: None raise AttributeError(name) def isReadOnly(self): return False def __len__(self): return 4 class FakeStorage(FakeStorageBase): def record_iternext(self, next=None): if next == None: next = '0' next = str(int(next) + 1) oid = next if next == '4': next = None return oid, oid*8, 'data ' + oid, next class FakeServer: storages = { '1': FakeStorage(), '2': FakeStorageBase(), } def register_connection(*args): return None, None def test_server_record_iternext(): """ On the server, record_iternext calls are simply delegated to the underlying storage. >>> import ZEO.StorageServer >>> zeo = ZEO.StorageServer.ZEOStorage(FakeServer(), False) >>> zeo.register('1', False) >>> next = None >>> while 1: ... oid, serial, data, next = zeo.record_iternext(next) ... print oid ... if next is None: ... break 1 2 3 4 The storage info also reflects the fact that record_iternext is supported. >>> zeo.get_info()['supports_record_iternext'] True >>> zeo = ZEO.StorageServer.ZEOStorage(FakeServer(), False) >>> zeo.register('2', False) >>> zeo.get_info()['supports_record_iternext'] False """ def test_client_record_iternext(): """\ The client simply delegates record_iternext calls to it's server stub. There's really no decent way to test ZEO without running too much crazy stuff. I'd rather do a lame test than a really lame test, so here goes. 
First, fake out the connection manager so we can make a connection: >>> import ZEO.ClientStorage >>> from ZEO.ClientStorage import ClientStorage >>> oldConnectionManagerClass = ClientStorage.ConnectionManagerClass >>> class FauxConnectionManagerClass: ... def __init__(*a, **k): ... pass ... def attempt_connect(self): ... return True >>> ClientStorage.ConnectionManagerClass = FauxConnectionManagerClass >>> client = ClientStorage('', wait=False) >>> ClientStorage.ConnectionManagerClass = oldConnectionManagerClass Now we'll have our way with it's private _server attr: >>> client._server = FakeStorage() >>> next = None >>> while 1: ... oid, serial, data, next = client.record_iternext(next) ... print oid ... if next is None: ... break 1 2 3 4 """ def test_server_stub_record_iternext(): """\ The server stub simply delegates record_iternext calls to it's rpc. There's really no decent way to test ZEO without running to much crazy stuff. I'd rather do a lame test than a really lame test, so here goes. >>> class FauxRPC: ... storage = FakeStorage() ... def call(self, meth, *args): ... return getattr(self.storage, meth)(*args) ... peer_protocol_version = 1 >>> import ZEO.ServerStub >>> stub = ZEO.ServerStub.StorageServer(FauxRPC()) >>> next = None >>> while 1: ... oid, serial, data, next = stub.record_iternext(next) ... print oid ... if next is None: ... break 1 2 3 4 """ def history_to_version_compatible_storage(): """ Some storages work under ZODB <= 3.8 and ZODB >= 3.9. This means they have a history method that accepts a version parameter: >>> class VersionCompatibleStorage(FakeStorageBase): ... def history(self,oid,version='',size=1): ... return oid,version,size A ZEOStorage such as the following should support this type of storage: >>> class OurFakeServer(FakeServer): ... storages = {'1':VersionCompatibleStorage()} >>> import ZEO.StorageServer >>> zeo = ZEO.StorageServer.ZEOStorage(OurFakeServer(), False) >>> zeo.register('1', False) The ZEOStorage should sort out the following call such that the storage gets the correct parameters and so should return the parameters it was called with: >>> zeo.history('oid',99) ('oid', '', 99) The same problem occurs when a Z308 client connects to a Z309 server, but different code is executed: >>> from ZEO.StorageServer import ZEOStorage308Adapter >>> zeo = ZEOStorage308Adapter(VersionCompatibleStorage()) The history method should still return the parameters it was called with: >>> zeo.history('oid','',99) ('oid', '', 99) """ def test_suite(): return doctest.DocTestSuite() if __name__ == '__main__': unittest.main(defaultTest='test_suite') ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/testMonitor.py000066400000000000000000000053171230730566700244050ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Test that the monitor produce sensible results. 
$Id$ """ import socket import unittest from ZEO.tests.ConnectionTests import CommonSetupTearDown from ZEO.monitor import StorageStats class MonitorTests(CommonSetupTearDown): monitor = 1 def get_monitor_output(self): s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) s.connect(('localhost', 42000)) L = [] while 1: buf = s.recv(8192) if buf: L.append(buf) else: break s.close() return "".join(L) def parse(self, s): # Return a list of StorageStats, one for each storage. lines = s.split("\n") self.assert_(lines[0].startswith("ZEO monitor server")) # lines[1] is a date # Break up rest of lines into sections starting with Storage: # and ending with a blank line. sections = [] cur = None for line in lines[2:]: if line.startswith("Storage:"): cur = [line] elif line: cur.append(line) else: if cur is not None: sections.append(cur) cur = None assert cur is None # bug in the test code if this fails d = {} for sect in sections: hdr = sect[0] key, value = hdr.split(":") storage = int(value) s = d[storage] = StorageStats() s.parse("\n".join(sect[1:])) return d def getConfig(self, path, create, read_only): return """""" def testMonitor(self): # Just open a client to know that the server is up and running # TODO: should put this in setUp. self.storage = self.openClientStorage() s = self.get_monitor_output() self.storage.close() self.assert_(s.find("monitor") != -1) d = self.parse(s) stats = d[1] self.assertEqual(stats.clients, 1) self.assertEqual(stats.commits, 0) def test_suite(): return unittest.makeSuite(MonitorTests) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/testTransactionBuffer.py000066400000000000000000000042571230730566700263770ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import random import unittest from ZEO.TransactionBuffer import TransactionBuffer def random_string(size): """Return a random string of size size.""" l = [chr(random.randrange(256)) for i in range(size)] return "".join(l) def new_store_data(): """Return arbitrary data to use as argument to store() method.""" return random_string(8), random_string(random.randrange(1000)) def new_invalidate_data(): """Return arbitrary data to use as argument to invalidate() method.""" return random_string(8) class TransBufTests(unittest.TestCase): def checkTypicalUsage(self): tbuf = TransactionBuffer() tbuf.store(*new_store_data()) tbuf.invalidate(new_invalidate_data()) for o in tbuf: pass def doUpdates(self, tbuf): data = [] for i in range(10): d = new_store_data() tbuf.store(*d) data.append(d) d = new_invalidate_data() tbuf.invalidate(d) data.append(d) for i, x in enumerate(tbuf): if x[1] is None: # the tbuf add a dummy None to invalidates x = x[0] self.assertEqual(x, data[i]) def checkOrderPreserved(self): tbuf = TransactionBuffer() self.doUpdates(tbuf) def checkReusable(self): tbuf = TransactionBuffer() self.doUpdates(tbuf) tbuf.clear() self.doUpdates(tbuf) tbuf.clear() self.doUpdates(tbuf) def test_suite(): return unittest.makeSuite(TransBufTests, 'check') ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/testZEO.py000066400000000000000000001542241230730566700234150ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Test suite for ZEO based on ZODB.tests.""" from ZEO.ClientStorage import ClientStorage from ZEO.tests.forker import get_port from ZEO.tests import forker, Cache, CommitLockTests, ThreadTests from ZEO.tests import IterationTests from ZEO.zrpc.error import DisconnectedError from ZODB.tests import StorageTestBase, BasicStorage, \ TransactionalUndoStorage, \ PackableStorage, Synchronization, ConflictResolution, RevisionStorage, \ MTStorage, ReadOnlyStorage, IteratorStorage, RecoveryStorage from ZODB.tests.MinPO import MinPO from ZODB.tests.StorageTestBase import zodb_unpickle from ZODB.utils import p64, u64 from zope.testing import renormalizing import doctest import logging import os import persistent import re import shutil import signal import stat import sys import tempfile import threading import time import transaction import unittest import ZEO.ServerStub import ZEO.StorageServer import ZEO.tests.ConnectionTests import ZEO.zrpc.connection import ZODB import ZODB.blob import ZODB.tests.hexstorage import ZODB.tests.testblob import ZODB.tests.util import ZODB.utils import zope.testing.setupstack logger = logging.getLogger('ZEO.tests.testZEO') class DummyDB: def invalidate(self, *args): pass def invalidateCache(*unused): pass transform_record_data = untransform_record_data = lambda self, v: v class CreativeGetState(persistent.Persistent): def __getstate__(self): self.name = 'me' return super(CreativeGetState, self).__getstate__() class MiscZEOTests: """ZEO tests that don't fit in elsewhere.""" def checkCreativeGetState(self): # This test covers persistent objects that provide their own # __getstate__ which modifies the state of the object. # For details see bug #98275 db = ZODB.DB(self._storage) cn = db.open() rt = cn.root() m = CreativeGetState() m.attr = 'hi' rt['a'] = m # This commit used to fail because of the `Mine` object being put back # into `changed` state although it was already stored causing the ZEO # cache to bail out. transaction.commit() cn.close() def checkLargeUpdate(self): obj = MinPO("X" * (10 * 128 * 1024)) self._dostore(data=obj) def checkZEOInvalidation(self): addr = self._storage._addr storage2 = self._wrap_client( ClientStorage(addr, wait=1, min_disconnect_poll=0.1)) try: oid = self._storage.new_oid() ob = MinPO('first') revid1 = self._dostore(oid, data=ob) data, serial = storage2.load(oid, '') self.assertEqual(zodb_unpickle(data), MinPO('first')) self.assertEqual(serial, revid1) revid2 = self._dostore(oid, data=MinPO('second'), revid=revid1) # Now, storage 2 should eventually get the new data. It # will take some time, although hopefully not much. # We'll poll till we get it and whine if we time out: for n in range(30): time.sleep(.1) data, serial = storage2.load(oid, '') if (serial == revid2 and zodb_unpickle(data) == MinPO('second') ): break else: raise AssertionError('Invalidation message was not sent!') finally: storage2.close() def checkVolatileCacheWithImmediateLastTransaction(self): # Earlier, a ClientStorage would not have the last transaction id # available right after successful connection, this is required now. 
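        # Before any commit the reported last transaction is the all-zero
        # tid (z64); after a store it must be a real 8-byte tid.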
addr = self._storage._addr storage2 = ClientStorage(addr) self.assert_(storage2.is_connected()) self.assertEquals(ZODB.utils.z64, storage2.lastTransaction()) storage2.close() self._dostore() storage3 = ClientStorage(addr) self.assert_(storage3.is_connected()) self.assertEquals(8, len(storage3.lastTransaction())) self.assertNotEquals(ZODB.utils.z64, storage3.lastTransaction()) storage3.close() class ConfigurationTests(unittest.TestCase): def checkDropCacheRatherVerifyConfiguration(self): from ZODB.config import storageFromString # the default is to do verification and not drop the cache cs = storageFromString(''' server localhost:9090 wait false ''') self.assertEqual(cs._drop_cache_rather_verify, False) cs.close() # now for dropping cs = storageFromString(''' server localhost:9090 wait false drop-cache-rather-verify true ''') self.assertEqual(cs._drop_cache_rather_verify, True) cs.close() class GenericTests( # Base class for all ZODB tests StorageTestBase.StorageTestBase, # ZODB test mixin classes (in the same order as imported) BasicStorage.BasicStorage, PackableStorage.PackableStorage, Synchronization.SynchronizedStorage, MTStorage.MTStorage, ReadOnlyStorage.ReadOnlyStorage, # ZEO test mixin classes (in the same order as imported) CommitLockTests.CommitLockVoteTests, ThreadTests.ThreadTests, # Locally defined (see above) MiscZEOTests, ): """Combine tests from various origins in one class.""" shared_blob_dir = False blob_cache_dir = None def setUp(self): StorageTestBase.StorageTestBase.setUp(self) logger.info("setUp() %s", self.id()) port = get_port(self) zconf = forker.ZEOConfig(('', port)) zport, adminaddr, pid, path = forker.start_zeo_server(self.getConfig(), zconf, port) self._pids = [pid] self._servers = [adminaddr] self._conf_path = path if not self.blob_cache_dir: # This is the blob cache for ClientStorage self.blob_cache_dir = tempfile.mkdtemp( 'blob_cache', dir=os.path.abspath(os.getcwd())) self._storage = self._wrap_client(ClientStorage( zport, '1', cache_size=20000000, min_disconnect_poll=0.5, wait=1, wait_timeout=60, blob_dir=self.blob_cache_dir, shared_blob_dir=self.shared_blob_dir)) self._storage.registerDB(DummyDB()) def _wrap_client(self, client): return client def tearDown(self): self._storage.close() for server in self._servers: forker.shutdown_zeo_server(server) if hasattr(os, 'waitpid'): # Not in Windows Python until 2.3 for pid in self._pids: os.waitpid(pid, 0) StorageTestBase.StorageTestBase.tearDown(self) def runTest(self): try: super(GenericTests, self).runTest() except: self._failed = True raise else: self._failed = False def open(self, read_only=0): # Needed to support ReadOnlyStorage tests. Ought to be a # cleaner way. addr = self._storage._addr self._storage.close() self._storage = ClientStorage(addr, read_only=read_only, wait=1) def checkWriteMethods(self): # ReadOnlyStorage defines checkWriteMethods. The decision # about where to raise the read-only error was changed after # Zope 2.5 was released. So this test needs to detect Zope # of the 2.5 vintage and skip the test. # The __version__ attribute was not present in Zope 2.5. 
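        # Only run the write-method checks when __version__ is present,
        # i.e. on post-2.5 vintages.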
if hasattr(ZODB, "__version__"): ReadOnlyStorage.ReadOnlyStorage.checkWriteMethods(self) def checkSortKey(self): key = '%s:%s' % (self._storage._storage, self._storage._server_addr) self.assertEqual(self._storage.sortKey(), key) def _do_store_in_separate_thread(self, oid, revid, voted): def do_store(): store = ZEO.ClientStorage.ClientStorage(self._storage._addr) try: t = transaction.get() store.tpc_begin(t) store.store(oid, revid, 'x', '', t) store.tpc_vote(t) store.tpc_finish(t) except Exception, v: import traceback print 'E'*70 print v traceback.print_exception(*sys.exc_info()) finally: store.close() thread = threading.Thread(name='T2', target=do_store) thread.setDaemon(True) thread.start() thread.join(voted and .1 or 9) return thread class FullGenericTests( GenericTests, Cache.TransUndoStorageWithCache, ConflictResolution.ConflictResolvingStorage, ConflictResolution.ConflictResolvingTransUndoStorage, PackableStorage.PackableUndoStorage, RevisionStorage.RevisionStorage, TransactionalUndoStorage.TransactionalUndoStorage, IteratorStorage.IteratorStorage, IterationTests.IterationTests, ): """Extend GenericTests with tests that MappingStorage can't pass.""" class FileStorageRecoveryTests(StorageTestBase.StorageTestBase, RecoveryStorage.RecoveryStorage): def getConfig(self): return """\ path %s """ % tempfile.mktemp(dir='.') def _new_storage(self): port = get_port(self) zconf = forker.ZEOConfig(('', port)) zport, adminaddr, pid, path = forker.start_zeo_server(self.getConfig(), zconf, port) self._pids.append(pid) self._servers.append(adminaddr) blob_cache_dir = tempfile.mkdtemp(dir='.') storage = ClientStorage( zport, '1', cache_size=20000000, min_disconnect_poll=0.5, wait=1, wait_timeout=60, blob_dir=blob_cache_dir) storage.registerDB(DummyDB()) return storage def setUp(self): StorageTestBase.StorageTestBase.setUp(self) self._pids = [] self._servers = [] self._storage = self._new_storage() self._dst = self._new_storage() def tearDown(self): self._storage.close() self._dst.close() for server in self._servers: forker.shutdown_zeo_server(server) if hasattr(os, 'waitpid'): # Not in Windows Python until 2.3 for pid in self._pids: os.waitpid(pid, 0) StorageTestBase.StorageTestBase.tearDown(self) def new_dest(self): return self._new_storage() class FileStorageTests(FullGenericTests): """Test ZEO backed by a FileStorage.""" def getConfig(self): return """\ path Data.fs """ _expected_interfaces = ( ('ZODB.interfaces', 'IStorageRestoreable'), ('ZODB.interfaces', 'IStorageIteration'), ('ZODB.interfaces', 'IStorageUndoable'), ('ZODB.interfaces', 'IStorageCurrentRecordIteration'), ('ZODB.interfaces', 'IExternalGC'), ('ZODB.interfaces', 'IStorage'), ('zope.interface', 'Interface'), ) def checkInterfaceFromRemoteStorage(self): # ClientStorage itself doesn't implement IStorageIteration, but the # FileStorage on the other end does, and thus the ClientStorage # instance that is connected to it reflects this. 
self.failIf(ZODB.interfaces.IStorageIteration.implementedBy( ZEO.ClientStorage.ClientStorage)) self.failUnless(ZODB.interfaces.IStorageIteration.providedBy( self._storage)) # This is communicated using ClientStorage's _info object: self.assertEquals(self._expected_interfaces, self._storage._info['interfaces'] ) class FileStorageHexTests(FileStorageTests): _expected_interfaces = ( ('ZODB.interfaces', 'IStorageRestoreable'), ('ZODB.interfaces', 'IStorageIteration'), ('ZODB.interfaces', 'IStorageUndoable'), ('ZODB.interfaces', 'IStorageCurrentRecordIteration'), ('ZODB.interfaces', 'IExternalGC'), ('ZODB.interfaces', 'IStorage'), ('ZODB.interfaces', 'IStorageWrapper'), ('zope.interface', 'Interface'), ) def getConfig(self): return """\ %import ZODB.tests path Data.fs """ class FileStorageClientHexTests(FileStorageHexTests): def getConfig(self): return """\ %import ZODB.tests path Data.fs """ def _wrap_client(self, client): return ZODB.tests.hexstorage.HexStorage(client) class MappingStorageTests(GenericTests): """ZEO backed by a Mapping storage.""" def getConfig(self): return """""" def checkSimpleIteration(self): # The test base class IteratorStorage assumes that we keep undo data # to construct our iterator, which we don't, so we disable this test. pass def checkUndoZombie(self): # The test base class IteratorStorage assumes that we keep undo data # to construct our iterator, which we don't, so we disable this test. pass class DemoStorageTests( GenericTests, ): def getConfig(self): return """ path Data.fs """ def checkUndoZombie(self): # The test base class IteratorStorage assumes that we keep undo data # to construct our iterator, which we don't, so we disable this test. pass def checkPackWithMultiDatabaseReferences(self): pass # DemoStorage pack doesn't do gc checkPackAllRevisions = checkPackWithMultiDatabaseReferences class HeartbeatTests(ZEO.tests.ConnectionTests.CommonSetupTearDown): """Make sure a heartbeat is being sent and that it does no harm This is really hard to test properly because we can't see the data flow between the client and server and we can't really tell what's going on in the server very well. :( """ def setUp(self): # Crank down the select frequency self.__old_client_timeout = ZEO.zrpc.client.client_timeout ZEO.zrpc.client.client_timeout = self.__client_timeout ZEO.tests.ConnectionTests.CommonSetupTearDown.setUp(self) __client_timeouts = 0 def __client_timeout(self): self.__client_timeouts += 1 return .1 def tearDown(self): ZEO.zrpc.client.client_timeout = self.__old_client_timeout ZEO.tests.ConnectionTests.CommonSetupTearDown.tearDown(self) def getConfig(self, path, create, read_only): return """""" def checkHeartbeatWithServerClose(self): # This is a minimal test that mainly tests that the heartbeat # function does no harm. 
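        # The steps below: wait until at least one more client-side timeout
        # has fired (each timeout is the client's opportunity to send a
        # heartbeat), do a store to show the connection still works, then
        # kill the server and wait for the client to notice the disconnect.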
self._storage = self.openClientStorage() client_timeouts = self.__client_timeouts forker.wait_until('got a timeout', lambda : self.__client_timeouts > client_timeouts ) self._dostore() if hasattr(os, 'kill') and hasattr(signal, 'SIGKILL'): # Kill server violently, in hopes of provoking problem os.kill(self._pids[0], signal.SIGKILL) self._servers[0] = None else: self.shutdownServer() forker.wait_until('disconnected', lambda : not self._storage.is_connected() ) self._storage.close() class ZRPCConnectionTests(ZEO.tests.ConnectionTests.CommonSetupTearDown): def getConfig(self, path, create, read_only): return """""" def checkCatastrophicClientLoopFailure(self): # Test what happens when the client loop falls over self._storage = self.openClientStorage() class Evil: def writable(self): raise SystemError("I'm evil") import zope.testing.loggingsupport handler = zope.testing.loggingsupport.InstalledHandler( 'ZEO.zrpc.client') self._storage._rpc_mgr.map[None] = Evil() try: self._storage._rpc_mgr.trigger.pull_trigger() except DisconnectedError: pass forker.wait_until( 'disconnected', lambda : not self._storage.is_connected() ) log = str(handler) handler.uninstall() self.assert_("ZEO client loop failed" in log) self.assert_("Couldn't close a dispatcher." in log) def checkExceptionLogsAtError(self): # Test the exceptions are logged at error self._storage = self.openClientStorage() conn = self._storage._connection # capture logging log = [] conn.logger.log = ( lambda l, m, *a, **kw: log.append((l,m % a, kw)) ) # This is a deliberately bogus call to get an exception # logged self._storage._connection.handle_request( 'foo', 0, 'history', (1, 2, 3, 4)) # test logging for level, message, kw in log: if message.endswith( ') history() raised exception: history() takes at' ' most 3 arguments (5 given)' ): self.assertEqual(level,logging.ERROR) self.assertEqual(kw,{'exc_info':True}) break else: self.fail("error not in log") # cleanup del conn.logger.log def checkConnectionInvalidationOnReconnect(self): storage = ClientStorage(self.addr, wait=1, min_disconnect_poll=0.1) self._storage = storage # and we'll wait for the storage to be reconnected: for i in range(100): if storage.is_connected(): break time.sleep(0.1) else: raise AssertionError("Couldn't connect to server") class DummyDB: _invalidatedCache = 0 def invalidateCache(self): self._invalidatedCache += 1 def invalidate(*a, **k): pass db = DummyDB() storage.registerDB(db) base = db._invalidatedCache # Now we'll force a disconnection and reconnection storage._connection.close() # and we'll wait for the storage to be reconnected: for i in range(100): if storage.is_connected(): break time.sleep(0.1) else: raise AssertionError("Couldn't connect to server") # Now, the root object in the connection should have been invalidated: self.assertEqual(db._invalidatedCache, base+1) class CommonBlobTests: def getConfig(self): return """ blob-dir blobs path Data.fs """ blobdir = 'blobs' blob_cache_dir = 'blob_cache' def checkStoreBlob(self): from ZODB.utils import oid_repr, tid_repr from ZODB.blob import Blob, BLOB_SUFFIX from ZODB.tests.StorageTestBase import zodb_pickle, ZERO, \ handle_serials import transaction somedata = 'a' * 10 blob = Blob() bd_fh = blob.open('w') bd_fh.write(somedata) bd_fh.close() tfname = bd_fh.name oid = self._storage.new_oid() data = zodb_pickle(blob) self.assert_(os.path.exists(tfname)) t = transaction.Transaction() try: self._storage.tpc_begin(t) r1 = self._storage.storeBlob(oid, ZERO, data, tfname, '', t) r2 = self._storage.tpc_vote(t) revid = 
handle_serials(oid, r1, r2) self._storage.tpc_finish(t) except: self._storage.tpc_abort(t) raise self.assert_(not os.path.exists(tfname)) filename = self._storage.fshelper.getBlobFilename(oid, revid) self.assert_(os.path.exists(filename)) self.assertEqual(somedata, open(filename).read()) def checkStoreBlob_wrong_partition(self): os_rename = os.rename try: def fail(*a): raise OSError os.rename = fail self.checkStoreBlob() finally: os.rename = os_rename def checkLoadBlob(self): from ZODB.blob import Blob from ZODB.tests.StorageTestBase import zodb_pickle, ZERO, \ handle_serials import transaction somedata = 'a' * 10 blob = Blob() bd_fh = blob.open('w') bd_fh.write(somedata) bd_fh.close() tfname = bd_fh.name oid = self._storage.new_oid() data = zodb_pickle(blob) t = transaction.Transaction() try: self._storage.tpc_begin(t) r1 = self._storage.storeBlob(oid, ZERO, data, tfname, '', t) r2 = self._storage.tpc_vote(t) serial = handle_serials(oid, r1, r2) self._storage.tpc_finish(t) except: self._storage.tpc_abort(t) raise filename = self._storage.loadBlob(oid, serial) self.assertEquals(somedata, open(filename, 'rb').read()) self.assert_(not(os.stat(filename).st_mode & stat.S_IWRITE)) self.assert_((os.stat(filename).st_mode & stat.S_IREAD)) def checkTemporaryDirectory(self): self.assertEquals(os.path.join(self.blob_cache_dir, 'tmp'), self._storage.temporaryDirectory()) def checkTransactionBufferCleanup(self): oid = self._storage.new_oid() open('blob_file', 'w').write('I am a happy blob.') t = transaction.Transaction() self._storage.tpc_begin(t) self._storage.storeBlob( oid, ZODB.utils.z64, 'foo', 'blob_file', '', t) self._storage.close() class BlobAdaptedFileStorageTests(FullGenericTests, CommonBlobTests): """ZEO backed by a BlobStorage-adapted FileStorage.""" def checkStoreAndLoadBlob(self): from ZODB.utils import oid_repr, tid_repr from ZODB.blob import Blob, BLOB_SUFFIX from ZODB.tests.StorageTestBase import zodb_pickle, ZERO, \ handle_serials import transaction somedata_path = os.path.join(self.blob_cache_dir, 'somedata') somedata = open(somedata_path, 'w+b') for i in range(1000000): somedata.write("%s\n" % i) somedata.seek(0) blob = Blob() bd_fh = blob.open('w') ZODB.utils.cp(somedata, bd_fh) bd_fh.close() tfname = bd_fh.name oid = self._storage.new_oid() data = zodb_pickle(blob) self.assert_(os.path.exists(tfname)) t = transaction.Transaction() try: self._storage.tpc_begin(t) r1 = self._storage.storeBlob(oid, ZERO, data, tfname, '', t) r2 = self._storage.tpc_vote(t) revid = handle_serials(oid, r1, r2) self._storage.tpc_finish(t) except: self._storage.tpc_abort(t) raise # The uncommitted data file should have been removed self.assert_(not os.path.exists(tfname)) def check_data(path): self.assert_(os.path.exists(path)) f = open(path, 'rb') somedata.seek(0) d1 = d2 = 1 while d1 or d2: d1 = f.read(8096) d2 = somedata.read(8096) self.assertEqual(d1, d2) # The file should be in the cache ... filename = self._storage.fshelper.getBlobFilename(oid, revid) check_data(filename) # ... and on the server server_filename = os.path.join( self.blobdir, ZODB.blob.BushyLayout().getBlobFilePath(oid, revid), ) self.assert_(server_filename.startswith(self.blobdir)) check_data(server_filename) # If we remove it from the cache and call loadBlob, it should # come back. We can do this in many threads. We'll instrument # the method that is used to request data from teh server to # verify that it is only called once. 
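        # The instrumentation below keeps a reference to the original
        # ServerStub sendBlob and wraps it so each (oid, serial) request is
        # recorded before being passed through to the real implementation.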
sendBlob_org = ZEO.ServerStub.StorageServer.sendBlob calls = [] def sendBlob(self, oid, serial): calls.append((oid, serial)) sendBlob_org(self, oid, serial) ZODB.blob.remove_committed(filename) returns = [] threads = [ threading.Thread( target=lambda : returns.append(self._storage.loadBlob(oid, revid)) ) for i in range(10) ] [thread.start() for thread in threads] [thread.join() for thread in threads] [self.assertEqual(r, filename) for r in returns] check_data(filename) class BlobWritableCacheTests(FullGenericTests, CommonBlobTests): blob_cache_dir = 'blobs' shared_blob_dir = True class FauxConn: addr = 'x' peer_protocol_version = ZEO.zrpc.connection.Connection.current_protocol class StorageServerClientWrapper: def __init__(self): self.serials = [] def serialnos(self, serials): self.serials.extend(serials) def info(self, info): pass class StorageServerWrapper: def __init__(self, server, storage_id): self.storage_id = storage_id self.server = ZEO.StorageServer.ZEOStorage(server, server.read_only) self.server.notifyConnected(FauxConn()) self.server.register(storage_id, False) self.server.client = StorageServerClientWrapper() def sortKey(self): return self.storage_id def __getattr__(self, name): return getattr(self.server, name) def registerDB(self, *args): pass def supportsUndo(self): return False def new_oid(self): return self.server.new_oids(1)[0] def tpc_begin(self, transaction): self.server.tpc_begin(id(transaction), '', '', {}, None, ' ') def tpc_vote(self, transaction): vote_result = self.server.vote(id(transaction)) assert vote_result is None result = self.server.client.serials[:] del self.server.client.serials[:] return result def store(self, oid, serial, data, version_ignored, transaction): self.server.storea(oid, serial, data, id(transaction)) def send_reply(self, *args): # Masquerade as conn pass def tpc_abort(self, transaction): self.server.tpc_abort(id(transaction)) def tpc_finish(self, transaction, func = lambda: None): self.server.tpc_finish(id(transaction)).set_sender(0, self) def multiple_storages_invalidation_queue_is_not_insane(): """ >>> from ZEO.StorageServer import StorageServer, ZEOStorage >>> from ZODB.FileStorage import FileStorage >>> from ZODB.DB import DB >>> from persistent.mapping import PersistentMapping >>> from transaction import commit >>> fs1 = FileStorage('t1.fs') >>> fs2 = FileStorage('t2.fs') >>> server = StorageServer(('', get_port()), dict(fs1=fs1, fs2=fs2)) >>> s1 = StorageServerWrapper(server, 'fs1') >>> s2 = StorageServerWrapper(server, 'fs2') >>> db1 = DB(s1); conn1 = db1.open() >>> db2 = DB(s2); conn2 = db2.open() >>> commit() >>> o1 = conn1.root() >>> for i in range(10): ... o1.x = PersistentMapping(); o1 = o1.x ... commit() >>> last = fs1.lastTransaction() >>> for i in range(5): ... o1.x = PersistentMapping(); o1 = o1.x ... commit() >>> o2 = conn2.root() >>> for i in range(20): ... o2.x = PersistentMapping(); o2 = o2.x ... commit() >>> trans, oids = s1.getInvalidations(last) >>> from ZODB.utils import u64 >>> sorted([int(u64(oid)) for oid in oids]) [10, 11, 12, 13, 14] >>> server.close_server() """ def getInvalidationsAfterServerRestart(): """ Clients were often forced to verify their caches after a server restart even if there weren't many transactions between the server restart and the client connect. 
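    The idea of the fix, shown here only as a rough sketch (this is not the
    real server code; the function and argument names are invented for
    illustration), is that the server answers a reconnecting client's
    "what changed since tid X?" question from a small queue of recent
    invalidations, and only when that queue doesn't reach back far enough
    does the client fall back to verifying its whole cache:

        def invalidations_for_client(invalidation_queue, server_tid, client_tid):
            # Sketch only; not ZEO's actual implementation.
            # invalidation_queue: recent (tid, oids) pairs, oldest first.
            if client_tid == server_tid:
                return server_tid, []          # client is fully up to date
            if not invalidation_queue or invalidation_queue[0][0] > client_tid:
                return None                    # too old: client must verify
            oids = []
            for tid, tid_oids in invalidation_queue:
                if tid > client_tid:
                    oids.extend(tid_oids)
            return server_tid, oids            # targeted invalidations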
Let's create a file storage and stuff some data into it: >>> from ZEO.StorageServer import StorageServer, ZEOStorage >>> from ZODB.FileStorage import FileStorage >>> from ZODB.DB import DB >>> from persistent.mapping import PersistentMapping >>> fs = FileStorage('t.fs') >>> db = DB(fs) >>> conn = db.open() >>> from transaction import commit >>> last = [] >>> for i in range(100): ... conn.root()[i] = PersistentMapping() ... commit() ... last.append(fs.lastTransaction()) >>> db.close() Now we'll open a storage server on the data, simulating a restart: >>> fs = FileStorage('t.fs') >>> sv = StorageServer(('', get_port()), dict(fs=fs)) >>> s = ZEOStorage(sv, sv.read_only) >>> s.notifyConnected(FauxConn()) >>> s.register('fs', False) If we ask for the last transaction, we should get the last transaction we saved: >>> s.lastTransaction() == last[-1] True If a storage implements the method lastInvalidations, as FileStorage does, then the stroage server will populate its invalidation data structure using lastTransactions. >>> tid, oids = s.getInvalidations(last[-10]) >>> tid == last[-1] True >>> from ZODB.utils import u64 >>> sorted([int(u64(oid)) for oid in oids]) [0, 92, 93, 94, 95, 96, 97, 98, 99, 100] (Note that the fact that we get oids for 92-100 is actually an artifact of the fact that the FileStorage lastInvalidations method returns all OIDs written by transactions, even if the OIDs were created and not modified. FileStorages don't record whether objects were created rather than modified. Objects that are just created don't need to be invalidated. This means we'll invalidate objects that dont' need to be invalidated, however, that's better than verifying caches.) >>> sv.close_server() >>> fs.close() If a storage doesn't implement lastInvalidations, a client can still avoid verifying its cache if it was up to date when the server restarted. To illustrate this, we'll create a subclass of FileStorage without this method: >>> class FS(FileStorage): ... lastInvalidations = property() >>> fs = FS('t.fs') >>> sv = StorageServer(('', get_port()), dict(fs=fs)) >>> st = StorageServerWrapper(sv, 'fs') >>> s = st.server Now, if we ask for the invalidations since the last committed transaction, we'll get a result: >>> tid, oids = s.getInvalidations(last[-1]) >>> tid == last[-1] True >>> oids [] >>> db = DB(st); conn = db.open() >>> ob = conn.root() >>> for i in range(5): ... ob.x = PersistentMapping(); ob = ob.x ... commit() ... last.append(fs.lastTransaction()) >>> ntid, oids = s.getInvalidations(tid) >>> ntid == last[-1] True >>> sorted([int(u64(oid)) for oid in oids]) [0, 101, 102, 103, 104] >>> fs.close() """ def tpc_finish_error(): r"""Server errors in tpc_finish weren't handled properly. >>> import ZEO.ClientStorage, ZEO.zrpc.connection >>> class Connection: ... peer_protocol_version = ( ... ZEO.zrpc.connection.Connection.current_protocol) ... def __init__(self, client): ... self.client = client ... def get_addr(self): ... return 'server' ... def is_async(self): ... return True ... def register_object(self, ob): ... pass ... def close(self): ... print 'connection closed' ... trigger = property(lambda self: self) ... pull_trigger = lambda self, func, *args: func(*args) >>> class ConnectionManager: ... def __init__(self, addr, client, tmin, tmax): ... self.client = client ... def connect(self, sync=1): ... self.client.notifyConnected(Connection(self.client)) ... def close(self): ... pass >>> class StorageServer: ... should_fail = True ... def __init__(self, conn): ... self.conn = conn ... 
self.t = None ... def get_info(self): ... return {} ... def endZeoVerify(self): ... self.conn.client.endVerify() ... def lastTransaction(self): ... return '\0'*8 ... def tpc_begin(self, t, *args): ... if self.t is not None: ... raise TypeError('already trans') ... self.t = t ... print 'begin', args ... def vote(self, t): ... if self.t != t: ... raise TypeError('bad trans') ... print 'vote' ... def tpc_finish(self, *args): ... if self.should_fail: ... raise TypeError() ... print 'finish' ... def tpc_abort(self, t): ... if self.t != t: ... raise TypeError('bad trans') ... self.t = None ... print 'abort' ... def iterator_gc(*args): ... pass >>> class ClientStorage(ZEO.ClientStorage.ClientStorage): ... ConnectionManagerClass = ConnectionManager ... StorageServerStubClass = StorageServer >>> class Transaction: ... user = 'test' ... description = '' ... _extension = {} >>> cs = ClientStorage(('', '')) >>> t1 = Transaction() >>> cs.tpc_begin(t1) begin ('test', '', {}, None, ' ') >>> cs.tpc_vote(t1) vote >>> cs.tpc_finish(t1) Traceback (most recent call last): ... TypeError >>> cs.tpc_abort(t1) abort >>> t2 = Transaction() >>> cs.tpc_begin(t2) begin ('test', '', {}, None, ' ') >>> cs.tpc_vote(t2) vote If client storage has an internal error after the storage finish succeeeds, it will close the connection, which will force a restart and reverification. >>> StorageServer.should_fail = False >>> cs._update_cache = lambda : None >>> try: cs.tpc_finish(t2) ... except: pass ... else: print "Should have failed" finish connection closed >>> cs.close() """ def client_has_newer_data_than_server(): """It is bad if a client has newer data than the server. >>> db = ZODB.DB('Data.fs') >>> db.close() >>> shutil.copyfile('Data.fs', 'Data.save') >>> addr, admin = start_server(keep=1) >>> db = ZEO.DB(addr, name='client', max_disconnect_poll=.01) >>> wait_connected(db.storage) >>> conn = db.open() >>> conn.root().x = 1 >>> transaction.commit() OK, we've added some data to the storage and the client cache has the new data. Now, we'll stop the server, put back the old data, and see what happens. :) >>> stop_server(admin) >>> shutil.copyfile('Data.save', 'Data.fs') >>> import zope.testing.loggingsupport >>> handler = zope.testing.loggingsupport.InstalledHandler( ... 'ZEO', level=logging.ERROR) >>> formatter = logging.Formatter('%(name)s %(levelname)s %(message)s') >>> _, admin = start_server(addr=addr) >>> for i in range(1000): ... while len(handler.records) < 5: ... time.sleep(.01) >>> db.close() >>> for record in handler.records[:5]: ... print formatter.format(record) ... # doctest: +ELLIPSIS +NORMALIZE_WHITESPACE ZEO.ClientStorage CRITICAL client Client has seen newer transactions than server! ZEO.zrpc ERROR (...) CW: error in notifyConnected (('127.0.0.1', ...)) Traceback (most recent call last): ... ClientStorageError: client Client has seen newer transactions than server! ZEO.ClientStorage CRITICAL client Client has seen newer transactions than server! ZEO.zrpc ERROR (...) CW: error in notifyConnected (('127.0.0.1', ...)) Traceback (most recent call last): ... ClientStorageError: client Client has seen newer transactions than server! ... Note that the errors repeat because the client keeps on trying to connect. 
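    The check behind the error is just a comparison of transaction ids;
    roughly (a self-contained sketch, not the actual ClientStorage code,
    with an invented exception class):

        class ClientHasNewerData(Exception):
            pass

        def check_tids(server_last_tid, cache_last_tid):
            # sketch only; ClientStorage raises ClientStorageError here
            if cache_last_tid and cache_last_tid > server_last_tid:
                raise ClientHasNewerData(
                    'Client has seen newer transactions than server!')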
>>> handler.uninstall() >>> stop_server(admin) """ def history_over_zeo(): """ >>> addr, _ = start_server() >>> db = ZEO.DB(addr) >>> wait_connected(db.storage) >>> conn = db.open() >>> conn.root().x = 0 >>> transaction.commit() >>> len(db.history(conn.root()._p_oid, 99)) 2 >>> db.close() """ def dont_log_poskeyerrors_on_server(): """ >>> addr, admin = start_server() >>> cs = ClientStorage(addr) >>> cs.load(ZODB.utils.p64(1)) Traceback (most recent call last): ... POSKeyError: 0x01 >>> cs.close() >>> stop_server(admin) >>> 'POSKeyError' in open('server-%s.log' % addr[1]).read() False """ def open_convenience(): """Often, we just want to open a single connection. >>> addr, _ = start_server(path='data.fs') >>> conn = ZEO.connection(addr) >>> conn.root() {} >>> conn.root()['x'] = 1 >>> transaction.commit() >>> conn.close() Let's make sure the database was cloased when we closed the connection, and that the data is there. >>> db = ZEO.DB(addr) >>> conn = db.open() >>> conn.root() {'x': 1} >>> db.close() """ def client_asyncore_thread_has_name(): """ >>> addr, _ = start_server() >>> db = ZEO.DB(addr) >>> len([t for t in threading.enumerate() ... if ' zeo client networking thread' in t.getName()]) 1 >>> db.close() """ def runzeo_without_configfile(): """ >>> open('runzeo', 'w').write(''' ... import sys ... sys.path[:] = %r ... import ZEO.runzeo ... ZEO.runzeo.main(sys.argv[1:]) ... ''' % sys.path) >>> import subprocess, re >>> print re.sub('\d\d+|[:]', '', subprocess.Popen( ... [sys.executable, 'runzeo', '-a:%s' % get_port(), '-ft', '--test'], ... stdout=subprocess.PIPE, stderr=subprocess.STDOUT, ... ).stdout.read()), # doctest: +ELLIPSIS +NORMALIZE_WHITESPACE ------ --T INFO ZEO.runzeo () opening storage '1' using FileStorage ------ --T INFO ZEO.StorageServer StorageServer created RW with storages 1RWt ------ --T INFO ZEO.zrpc () listening on ... ------ --T INFO ZEO.runzeo () closing storage '1' testing exit immediately """ def close_client_storage_w_invalidations(): r""" Invalidations could cause errors when closing client storages, >>> addr, _ = start_server() >>> writing = threading.Event() >>> def mad_write_thread(): ... global writing ... conn = ZEO.connection(addr) ... writing.set() ... while writing.isSet(): ... conn.root.x = 1 ... transaction.commit() ... conn.close() >>> thread = threading.Thread(target=mad_write_thread) >>> thread.setDaemon(True) >>> thread.start() >>> _ = writing.wait() >>> time.sleep(.01) >>> for i in range(10): ... conn = ZEO.connection(addr) ... _ = conn._storage.load('\0'*8) ... conn.close() >>> writing.clear() >>> thread.join(1) """ def convenient_to_pass_port_to_client_and_ZEO_dot_client(): """Jim hates typing >>> addr, _ = start_server() >>> client = ZEO.client(addr[1]) >>> client.__name__ == "('127.0.0.1', %s)" % addr[1] True >>> client.close() """ def test_server_status(): """ You can get server status using the server_status method. >>> addr, _ = start_server(zeo_conf=dict(transaction_timeout=1)) >>> db = ZEO.DB(addr) >>> import pprint >>> pprint.pprint(db.storage.server_status(), width=1) {'aborts': 0, 'active_txns': 0, 'commits': 1, 'conflicts': 0, 'conflicts_resolved': 0, 'connections': 1, 'loads': 1, 'lock_time': None, 'start': 'Tue May 4 10:55:20 2010', 'stores': 1, 'timeout-thread-is-alive': True, 'verifying_clients': 0, 'waiting': 0} >>> db.close() """ def client_labels(): """ When looking at server logs, for servers with lots of clients coming from the same machine, it can be very difficult to correlate server log entries with actual clients. 
It's possible, sort of, but tedious. You can make this easier by passing a label to the ClientStorage constructor. >>> addr, _ = start_server() >>> db = ZEO.DB(addr, client_label='test-label-1') >>> db.close() >>> @wait_until ... def check_for_test_label_1(): ... for line in open('server-%s.log' % addr[1]): ... if 'test-label-1' in line: ... print line.split()[1:4] ... return True ['INFO', 'ZEO.StorageServer', '(test-label-1'] You can specify the client label via a configuration file as well: >>> import ZODB.config >>> db = ZODB.config.databaseFromString(''' ... ... ... server :%s ... client-label test-label-2 ... ... ... ''' % addr[1]) >>> db.close() >>> @wait_until ... def check_for_test_label_2(): ... for line in open('server-%s.log' % addr[1]): ... if 'test-label-2' in line: ... print line.split()[1:4] ... return True ['INFO', 'ZEO.StorageServer', '(test-label-2'] """ def invalidate_client_cache_entry_on_server_commit_error(): """ When the serials returned during commit includes an error, typically a conflict error, invalidate the cache entry. This is important when the cache is messed up. >>> addr, _ = start_server() >>> conn1 = ZEO.connection(addr) >>> conn1.root.x = conn1.root().__class__() >>> transaction.commit() >>> conn1.root.x {} >>> cs = ZEO.ClientStorage.ClientStorage(addr, client='cache') >>> conn2 = ZODB.connection(cs) >>> conn2.root.x {} >>> conn2.close() >>> cs.close() >>> conn1.root.x['x'] = 1 >>> transaction.commit() >>> conn1.root.x {'x': 1} Now, let's screw up the cache by making it have a last tid that is later than the root serial. >>> import ZEO.cache >>> cache = ZEO.cache.ClientCache('cache-1.zec') >>> cache.setLastTid(p64(u64(conn1.root.x._p_serial)+1)) >>> cache.close() We'll also update the server so that it's last tid is newer than the cache's: >>> conn1.root.y = 1 >>> transaction.commit() >>> conn1.root.y = 2 >>> transaction.commit() Now, if we reopen the client storage, we'll get the wrong root: >>> cs = ZEO.ClientStorage.ClientStorage(addr, client='cache') >>> conn2 = ZODB.connection(cs) >>> conn2.root.x {} And, we'll get a conflict error if we try to modify it: >>> conn2.root.x['y'] = 1 >>> transaction.commit() # doctest: +ELLIPSIS Traceback (most recent call last): ... ConflictError: ... But, if we abort, we'll get up to date data and we'll see the changes. >>> transaction.abort() >>> conn2.root.x {'x': 1} >>> conn2.root.x['y'] = 1 >>> transaction.commit() >>> sorted(conn2.root.x.items()) [('x', 1), ('y', 1)] >>> cs.close() >>> conn1.close() """ script_template = """ import sys sys.path[:] = %(path)r %(src)s """ def generate_script(name, src): open(name, 'w').write(script_template % dict( exe=sys.executable, path=sys.path, src=src, )) def runzeo_logrotate_on_sigusr2(): """ >>> port = get_port() >>> open('c', 'w').write(''' ... ... address %s ... ... ... ... ... ... path l ... ... ... ''' % port) >>> generate_script('s', ''' ... import ZEO.runzeo ... ZEO.runzeo.main() ... ''') >>> import subprocess, signal >>> p = subprocess.Popen([sys.executable, 's', '-Cc'], close_fds=True) >>> wait_until('started', ... lambda : os.path.exists('l') and ('listening on' in open('l').read()) ... 
) >>> oldlog = open('l').read() >>> os.rename('l', 'o') >>> os.kill(p.pid, signal.SIGUSR2) >>> wait_until('new file', lambda : os.path.exists('l')) >>> s = ClientStorage(port) >>> s.close() >>> wait_until('See logging', lambda : ('Log files ' in open('l').read())) >>> open('o').read() == oldlog # No new data in old log True # Cleanup: >>> os.kill(p.pid, signal.SIGKILL) >>> _ = p.wait() """ def unix_domain_sockets(): """Make sure unix domain sockets work >>> addr, _ = start_server(port='./sock') >>> c = ZEO.connection(addr) >>> c.root.x = 1 >>> transaction.commit() >>> c.close() """ def gracefully_handle_abort_while_storing_many_blobs(): r""" >>> import logging, sys >>> old_level = logging.getLogger().getEffectiveLevel() >>> logging.getLogger().setLevel(logging.ERROR) >>> handler = logging.StreamHandler(sys.stdout) >>> logging.getLogger().addHandler(handler) >>> addr, _ = start_server(blob_dir='blobs') >>> c = ZEO.connection(addr, blob_dir='cblobs') >>> c.root.x = ZODB.blob.Blob('z'*(1<<20)) >>> c.root.y = ZODB.blob.Blob('z'*(1<<2)) >>> t = c.transaction_manager.get() >>> c.tpc_begin(t) >>> c.commit(t) We've called commit, but the blob sends are queued. We'll call abort right away, which will delete the temporary blob files. The queued iterators will try to open these files. >>> c.tpc_abort(t) Now we'll try to use the connection, mainly to wait for everything to get processed. Before we fixed this by making tpc_finish a synchronous call to the server. we'd get some sort of error here. >>> _ = c._storage._server.loadEx('\0'*8) >>> c.close() >>> logging.getLogger().removeHandler(handler) >>> logging.getLogger().setLevel(old_level) """ if sys.platform.startswith('win'): del runzeo_logrotate_on_sigusr2 del unix_domain_sockets if sys.version_info >= (2, 6): import multiprocessing def work_with_multiprocessing_process(name, addr, q): conn = ZEO.connection(addr) q.put((name, conn.root.x)) conn.close() class MultiprocessingTests(unittest.TestCase): layer = ZODB.tests.util.MininalTestLayer('work_with_multiprocessing') def test_work_with_multiprocessing(self): "Client storage should work with multi-processing." # Gaaa, zope.testing.runner.FakeInputContinueGenerator has no close if not hasattr(sys.stdin, 'close'): sys.stdin.close = lambda : None if not hasattr(sys.stdin, 'fileno'): sys.stdin.fileno = lambda : -1 self.globs = {} forker.setUp(self) addr, adminaddr = self.globs['start_server']() conn = ZEO.connection(addr) conn.root.x = 1 transaction.commit() q = multiprocessing.Queue() processes = [multiprocessing.Process( target=work_with_multiprocessing_process, args=(i, addr, q)) for i in range(3)] _ = [p.start() for p in processes] self.assertEqual(sorted(q.get(timeout=300) for p in processes), [(0, 1), (1, 1), (2, 1)]) _ = [p.join(30) for p in processes] conn.close() zope.testing.setupstack.tearDown(self) else: class MultiprocessingTests(unittest.TestCase): pass def quick_close_doesnt_kill_server(): r""" Start a server: >>> addr, _ = start_server() Now connect and immediately disconnect. This caused the server to die in the past: >>> import socket, struct >>> for i in range(5): ... s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) ... s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, ... struct.pack('ii', 1, 0)) ... s.connect(addr) ... 
s.close() Now we should be able to connect as normal: >>> db = ZEO.DB(addr) >>> db.storage.is_connected() True >>> db.close() """ def sync_connect_doesnt_hang(): r""" >>> import threading >>> import ZEO.zrpc.client >>> ConnectThread = ZEO.zrpc.client.ConnectThread >>> ZEO.zrpc.client.ConnectThread = lambda *a, **kw: threading.Thread() >>> class CM(ZEO.zrpc.client.ConnectionManager): ... sync_wait = 1 ... _start_asyncore_loop = lambda self: None >>> cm = CM(('', 0), object()) Calling connect results in an exception being raised, instead of hanging indefinitely when the thread dies without setting up the connection. >>> cm.connect(sync=1) Traceback (most recent call last): ... AssertionError >>> cm.thread.isAlive() False >>> ZEO.zrpc.client.ConnectThread = ConnectThread """ def lp143344_extension_methods_not_lost_on_server_restart(): r""" Make sure we don't lose exension methods on server restart. >>> addr, adminaddr = start_server(keep=True) >>> conn = ZEO.connection(addr) >>> conn.root.x = 1 >>> transaction.commit() >>> conn.db().storage.answer_to_the_ultimate_question() 42 >>> stop_server(adminaddr) >>> wait_until('not connected', ... lambda : not conn.db().storage.is_connected()) >>> _ = start_server(addr=addr) >>> wait_until('connected', conn.db().storage.is_connected) >>> conn.root.x 1 >>> conn.db().storage.answer_to_the_ultimate_question() 42 >>> conn.close() """ def can_use_empty_string_for_local_host_on_client(): """We should be able to spell localhost with ''. >>> (_, port), _ = start_server() >>> conn = ZEO.connection(('', port)) >>> conn.root() {} >>> conn.root.x = 1 >>> transaction.commit() >>> conn.close() """ slow_test_classes = [ BlobAdaptedFileStorageTests, BlobWritableCacheTests, MappingStorageTests, DemoStorageTests, FileStorageTests, FileStorageHexTests, FileStorageClientHexTests, ] quick_test_classes = [ FileStorageRecoveryTests, ConfigurationTests, HeartbeatTests, ZRPCConnectionTests, ] class ServerManagingClientStorage(ClientStorage): class StorageServerStubClass(ZEO.ServerStub.StorageServer): # Wait for abort for the benefit of blob_transaction.txt def tpc_abort(self, id): self.rpc.call('tpc_abort', id) def __init__(self, name, blob_dir, shared=False, extrafsoptions=''): if shared: server_blob_dir = blob_dir else: server_blob_dir = 'server-'+blob_dir self.globs = {} port = forker.get_port2(self) addr, admin, pid, config = forker.start_zeo_server( """ blob-dir %s path %s %s """ % (server_blob_dir, name+'.fs', extrafsoptions), port=port, ) os.remove(config) zope.testing.setupstack.register(self, os.waitpid, pid, 0) zope.testing.setupstack.register( self, forker.shutdown_zeo_server, admin) if shared: ClientStorage.__init__(self, addr, blob_dir=blob_dir, shared_blob_dir=True) else: ClientStorage.__init__(self, addr, blob_dir=blob_dir) def close(self): ClientStorage.close(self) zope.testing.setupstack.tearDown(self) def create_storage_shared(name, blob_dir): return ServerManagingClientStorage(name, blob_dir, True) class ServerManagingClientStorageForIExternalGCTest( ServerManagingClientStorage): def pack(self, t=None, referencesf=None): ServerManagingClientStorage.pack(self, t, referencesf, wait=True) # Packing doesn't clear old versions out of zeo client caches, # so we'll clear the caches. 
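        # Below, clear() empties the client object cache and
        # _check_blob_cache_size() prunes the blob cache directory down
        # toward the given target size (0 here).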
self._cache.clear() ZEO.ClientStorage._check_blob_cache_size(self.blob_dir, 0) def test_suite(): suite = unittest.TestSuite() # Collect misc tests into their own layer to educe size of # unit test layer zeo = unittest.TestSuite() zeo.addTest(unittest.makeSuite(ZODB.tests.util.AAAA_Test_Runner_Hack)) zeo.addTest(doctest.DocTestSuite( setUp=forker.setUp, tearDown=zope.testing.setupstack.tearDown, checker=renormalizing.RENormalizing([ (re.compile(r"'start': '[^\n]+'"), 'start'), ]), )) zeo.addTest(doctest.DocTestSuite(ZEO.tests.IterationTests, setUp=forker.setUp, tearDown=zope.testing.setupstack.tearDown)) zeo.addTest(doctest.DocFileSuite('registerDB.test')) zeo.addTest( doctest.DocFileSuite( 'zeo-fan-out.test', 'zdoptions.test', 'drop_cache_rather_than_verify.txt', 'client-config.test', 'protocols.test', 'zeo_blob_cache.test', 'invalidation-age.txt', setUp=forker.setUp, tearDown=zope.testing.setupstack.tearDown, ), ) zeo.addTest(PackableStorage.IExternalGC_suite( lambda : ServerManagingClientStorageForIExternalGCTest( 'data.fs', 'blobs', extrafsoptions='pack-gc false') )) for klass in quick_test_classes: zeo.addTest(unittest.makeSuite(klass, "check")) zeo.layer = ZODB.tests.util.MininalTestLayer('testZeo-misc') suite.addTest(zeo) suite.addTest(unittest.makeSuite(MultiprocessingTests)) # Put the heavyweights in their own layers for klass in slow_test_classes: sub = unittest.makeSuite(klass, "check") sub.layer = ZODB.tests.util.MininalTestLayer(klass.__name__) suite.addTest(sub) suite.addTest(ZODB.tests.testblob.storage_reusable_suite( 'ClientStorageNonSharedBlobs', ServerManagingClientStorage)) suite.addTest(ZODB.tests.testblob.storage_reusable_suite( 'ClientStorageSharedBlobs', create_storage_shared)) return suite if __name__ == "__main__": unittest.main(defaultTest="test_suite") ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/testZEO2.py000066400000000000000000000372671230730566700235060ustar00rootroot00000000000000############################################################################## # # Copyright Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## from zope.testing import setupstack, renormalizing import doctest import logging import pprint import re import sys import transaction import unittest import ZEO.StorageServer import ZEO.tests.servertesting import ZODB.blob import ZODB.FileStorage import ZODB.tests.util import ZODB.utils def proper_handling_of_blob_conflicts(): r""" Conflict errors weren't properly handled when storing blobs, the result being that the storage was left in a transaction. We originally saw this when restarting a block transaction, although it doesn't really matter. Set up the storage with some initial blob data. >>> fs = ZODB.FileStorage.FileStorage('t.fs', blob_dir='t.blobs') >>> db = ZODB.DB(fs) >>> conn = db.open() >>> conn.root.b = ZODB.blob.Blob('x') >>> transaction.commit() Get the iod and first serial. We'll use the serial later to provide out-of-date data. 
>>> oid = conn.root.b._p_oid >>> serial = conn.root.b._p_serial >>> conn.root.b.open('w').write('y') >>> transaction.commit() >>> data = fs.load(oid)[0] Create the server: >>> server = ZEO.tests.servertesting.StorageServer('x', {'1': fs}) And an initial client. >>> zs1 = ZEO.StorageServer.ZEOStorage(server) >>> conn1 = ZEO.tests.servertesting.Connection(1) >>> zs1.notifyConnected(conn1) >>> zs1.register('1', 0) >>> zs1.tpc_begin('0', '', '', {}) >>> zs1.storea(ZODB.utils.p64(99), ZODB.utils.z64, 'x', '0') >>> _ = zs1.vote('0') # doctest: +ELLIPSIS 1 callAsync serialnos ... In a second client, we'll try to commit using the old serial. This will conflict. It will be blocked at the vote call. >>> zs2 = ZEO.StorageServer.ZEOStorage(server) >>> conn2 = ZEO.tests.servertesting.Connection(2) >>> zs2.notifyConnected(conn2) >>> zs2.register('1', 0) >>> zs2.tpc_begin('1', '', '', {}) >>> zs2.storeBlobStart() >>> zs2.storeBlobChunk('z') >>> zs2.storeBlobEnd(oid, serial, data, '1') >>> delay = zs2.vote('1') >>> class Sender: ... def send_reply(self, id, reply): ... print 'reply', id, reply >>> delay.set_sender(1, Sender()) >>> logger = logging.getLogger('ZEO') >>> handler = logging.StreamHandler(sys.stdout) >>> logger.setLevel(logging.INFO) >>> logger.addHandler(handler) Now, when we abort the transaction for the first client. the second client will be restarted. It will get a conflict error, that is handled correctly: >>> zs1.tpc_abort('0') # doctest: +ELLIPSIS 2 callAsync serialnos ... reply 1 None >>> fs.tpc_transaction() is not None True >>> conn2.connected True >>> logger.setLevel(logging.NOTSET) >>> logger.removeHandler(handler) >>> zs2.tpc_abort('1') >>> fs.close() """ def proper_handling_of_errors_in_restart(): r""" It's critical that if there is an error in vote that the storage isn't left in tpc. >>> fs = ZODB.FileStorage.FileStorage('t.fs', blob_dir='t.blobs') >>> server = ZEO.tests.servertesting.StorageServer('x', {'1': fs}) And an initial client. >>> zs1 = ZEO.StorageServer.ZEOStorage(server) >>> conn1 = ZEO.tests.servertesting.Connection(1) >>> zs1.notifyConnected(conn1) >>> zs1.register('1', 0) >>> zs1.tpc_begin('0', '', '', {}) >>> zs1.storea(ZODB.utils.p64(99), ZODB.utils.z64, 'x', '0') Intentionally break zs1: >>> zs1._store = lambda : None >>> _ = zs1.vote('0') # doctest: +ELLIPSIS Traceback (most recent call last): ... TypeError: () takes no arguments (3 given) We're not in a transaction: >>> fs.tpc_transaction() is None True We can start another client and get the storage lock. >>> zs1 = ZEO.StorageServer.ZEOStorage(server) >>> conn1 = ZEO.tests.servertesting.Connection(1) >>> zs1.notifyConnected(conn1) >>> zs1.register('1', 0) >>> zs1.tpc_begin('1', '', '', {}) >>> zs1.storea(ZODB.utils.p64(99), ZODB.utils.z64, 'x', '1') >>> _ = zs1.vote('1') # doctest: +ELLIPSIS 1 callAsync serialnos ... >>> zs1.tpc_finish('1').set_sender(0, conn1) >>> fs.close() """ def errors_in_vote_should_clear_lock(): """ So, we arrange to get an error in vote: >>> import ZODB.MappingStorage >>> vote_should_fail = True >>> class MappingStorage(ZODB.MappingStorage.MappingStorage): ... def tpc_vote(*args): ... if vote_should_fail: ... raise ValueError ... return ZODB.MappingStorage.MappingStorage.tpc_vote(*args) >>> server = ZEO.tests.servertesting.StorageServer( ... 
'x', {'1': MappingStorage()}) >>> zs = ZEO.StorageServer.ZEOStorage(server) >>> conn = ZEO.tests.servertesting.Connection(1) >>> zs.notifyConnected(conn) >>> zs.register('1', 0) >>> zs.tpc_begin('0', '', '', {}) >>> zs.storea(ZODB.utils.p64(99), ZODB.utils.z64, 'x', '0') >>> zs.vote('0') Traceback (most recent call last): ... ValueError When we do, the storage server's transaction lock shouldn't be held: >>> '1' in server._commit_locks False Of course, if vote succeeds, the lock will be held: >>> vote_should_fail = False >>> zs.tpc_begin('1', '', '', {}) >>> zs.storea(ZODB.utils.p64(99), ZODB.utils.z64, 'x', '1') >>> _ = zs.vote('1') # doctest: +ELLIPSIS 1 callAsync serialnos ... >>> '1' in server._commit_locks True >>> zs.tpc_abort('1') """ def some_basic_locking_tests(): r""" >>> itid = 0 >>> def start_trans(zs): ... global itid ... itid += 1 ... tid = str(itid) ... zs.tpc_begin(tid, '', '', {}) ... zs.storea(ZODB.utils.p64(99), ZODB.utils.z64, 'x', tid) ... return tid >>> server = ZEO.tests.servertesting.StorageServer() >>> handler = logging.StreamHandler(sys.stdout) >>> handler.setFormatter(logging.Formatter( ... '%(name)s %(levelname)s\n%(message)s')) >>> logging.getLogger('ZEO').addHandler(handler) >>> logging.getLogger('ZEO').setLevel(logging.DEBUG) We start a transaction and vote; this leads to getting the lock. >>> zs1 = ZEO.tests.servertesting.client(server, '1') >>> tid1 = start_trans(zs1) >>> zs1.vote(tid1) # doctest: +ELLIPSIS ZEO.StorageServer DEBUG (test-addr-1) ('1') lock: transactions waiting: 0 ZEO.StorageServer BLATHER (test-addr-1) Preparing to commit transaction: 1 objects, 36 bytes 1 callAsync serialnos ... If another client tries to vote, its lock request will be queued and a delay will be returned: >>> zs2 = ZEO.tests.servertesting.client(server, '2') >>> tid2 = start_trans(zs2) >>> delay = zs2.vote(tid2) ZEO.StorageServer DEBUG (test-addr-2) ('1') queue lock: transactions waiting: 1 >>> delay.set_sender(0, zs2.connection) When we end the first transaction, the queued vote gets the lock. >>> zs1.tpc_abort(tid1) # doctest: +ELLIPSIS ZEO.StorageServer DEBUG (test-addr-1) ('1') unlock: transactions waiting: 1 ZEO.StorageServer DEBUG (test-addr-2) ('1') lock: transactions waiting: 0 ZEO.StorageServer BLATHER (test-addr-2) Preparing to commit transaction: 1 objects, 36 bytes 2 callAsync serialnos ... Let's try again with the first client. The vote will be queued: >>> tid1 = start_trans(zs1) >>> delay = zs1.vote(tid1) ZEO.StorageServer DEBUG (test-addr-1) ('1') queue lock: transactions waiting: 1 If the queued transaction is aborted, it will be dequeued: >>> zs1.tpc_abort(tid1) # doctest: +ELLIPSIS ZEO.StorageServer DEBUG (test-addr-1) ('1') dequeue lock: transactions waiting: 0 BTW, voting multiple times will error: >>> zs2.vote(tid2) Traceback (most recent call last): ... StorageTransactionError: Already voting (locked) >>> tid1 = start_trans(zs1) >>> delay = zs1.vote(tid1) ZEO.StorageServer DEBUG (test-addr-1) ('1') queue lock: transactions waiting: 1 >>> delay.set_sender(0, zs1.connection) >>> zs1.vote(tid1) Traceback (most recent call last): ... StorageTransactionError: Already voting (waiting) Note that the locking activity is logged at debug level to avoid cluttering log files; however, as the number of waiting votes increases, so does the logging level: >>> clients = [] >>> for i in range(9): ... client = ZEO.tests.servertesting.client(server, str(i+10)) ... tid = start_trans(client) ... delay = client.vote(tid) ... 
clients.append(client) ZEO.StorageServer DEBUG (test-addr-10) ('1') queue lock: transactions waiting: 2 ZEO.StorageServer DEBUG (test-addr-11) ('1') queue lock: transactions waiting: 3 ZEO.StorageServer WARNING (test-addr-12) ('1') queue lock: transactions waiting: 4 ZEO.StorageServer WARNING (test-addr-13) ('1') queue lock: transactions waiting: 5 ZEO.StorageServer WARNING (test-addr-14) ('1') queue lock: transactions waiting: 6 ZEO.StorageServer WARNING (test-addr-15) ('1') queue lock: transactions waiting: 7 ZEO.StorageServer WARNING (test-addr-16) ('1') queue lock: transactions waiting: 8 ZEO.StorageServer WARNING (test-addr-17) ('1') queue lock: transactions waiting: 9 ZEO.StorageServer CRITICAL (test-addr-18) ('1') queue lock: transactions waiting: 10 If a client with the transaction lock disconnects, it will abort and release the lock and one of the waiting clients will get the lock. >>> zs2.notifyDisconnected() # doctest: +ELLIPSIS ZEO.StorageServer INFO (test-addr-2) disconnected during locked transaction ZEO.StorageServer CRITICAL (test-addr-2) ('1') unlock: transactions waiting: 10 ZEO.StorageServer WARNING (test-addr-1) ('1') lock: transactions waiting: 9 ZEO.StorageServer BLATHER (test-addr-1) Preparing to commit transaction: 1 objects, 36 bytes 1 callAsync serialnos ... (In practice, waiting clients won't necessarily get the lock in order.) We can find out about the current lock state, and get other server statistics using the server_status method: >>> pprint.pprint(zs1.server_status(), width=1) {'aborts': 3, 'active_txns': 10, 'commits': 0, 'conflicts': 0, 'conflicts_resolved': 0, 'connections': 11, 'loads': 0, 'lock_time': 1272653598.693882, 'start': 'Fri Apr 30 14:53:18 2010', 'stores': 13, 'timeout-thread-is-alive': 'stub', 'verifying_clients': 0, 'waiting': 9} (Note that the connections count above is off by 1 due to the way the test infrastructure works.) If clients disconnect while waiting, they will be dequeued: >>> for client in clients: ... 
client.notifyDisconnected() ZEO.StorageServer INFO (test-addr-10) disconnected during unlocked transaction ZEO.StorageServer WARNING (test-addr-10) ('1') dequeue lock: transactions waiting: 8 ZEO.StorageServer INFO (test-addr-11) disconnected during unlocked transaction ZEO.StorageServer WARNING (test-addr-11) ('1') dequeue lock: transactions waiting: 7 ZEO.StorageServer INFO (test-addr-12) disconnected during unlocked transaction ZEO.StorageServer WARNING (test-addr-12) ('1') dequeue lock: transactions waiting: 6 ZEO.StorageServer INFO (test-addr-13) disconnected during unlocked transaction ZEO.StorageServer WARNING (test-addr-13) ('1') dequeue lock: transactions waiting: 5 ZEO.StorageServer INFO (test-addr-14) disconnected during unlocked transaction ZEO.StorageServer WARNING (test-addr-14) ('1') dequeue lock: transactions waiting: 4 ZEO.StorageServer INFO (test-addr-15) disconnected during unlocked transaction ZEO.StorageServer DEBUG (test-addr-15) ('1') dequeue lock: transactions waiting: 3 ZEO.StorageServer INFO (test-addr-16) disconnected during unlocked transaction ZEO.StorageServer DEBUG (test-addr-16) ('1') dequeue lock: transactions waiting: 2 ZEO.StorageServer INFO (test-addr-17) disconnected during unlocked transaction ZEO.StorageServer DEBUG (test-addr-17) ('1') dequeue lock: transactions waiting: 1 ZEO.StorageServer INFO (test-addr-18) disconnected during unlocked transaction ZEO.StorageServer DEBUG (test-addr-18) ('1') dequeue lock: transactions waiting: 0 >>> zs1.tpc_abort(tid1) >>> logging.getLogger('ZEO').setLevel(logging.NOTSET) >>> logging.getLogger('ZEO').removeHandler(handler) """ def lock_sanity_check(): r""" On one occasion with 3.10.0a1 in production, we had a case where a transaction lock wasn't released properly. One possibility, from scant log information, is that the server and ZEOStorage had different ideas about whether the ZEOStorage was locked. The timeout thread properly closed the ZEOStorage's connection, but the ZEOStorage didn't release its lock, presumably because it thought it wasn't locked. I'm not sure why this happened. I've refactored the logic quite a bit to try to deal with this, but the consequences of this failure are so severe, I'm adding some sanity checking when queueing lock requests. Helper to manage transactions: >>> itid = 0 >>> def start_trans(zs): ... global itid ... itid += 1 ... tid = str(itid) ... zs.tpc_begin(tid, '', '', {}) ... zs.storea(ZODB.utils.p64(99), ZODB.utils.z64, 'x', tid) ... return tid Set up server and logging: >>> server = ZEO.tests.servertesting.StorageServer() >>> handler = logging.StreamHandler(sys.stdout) >>> handler.setFormatter(logging.Formatter( ... '%(name)s %(levelname)s\n%(message)s')) >>> logging.getLogger('ZEO').addHandler(handler) >>> logging.getLogger('ZEO').setLevel(logging.DEBUG) Now, we'll start a transaction, get the lock and then mark the ZEOStorage as closed and see if trying to get a lock cleans it up: >>> zs1 = ZEO.tests.servertesting.client(server, '1') >>> tid1 = start_trans(zs1) >>> zs1.vote(tid1) # doctest: +ELLIPSIS ZEO.StorageServer DEBUG (test-addr-1) ('1') lock: transactions waiting: 0 ZEO.StorageServer BLATHER (test-addr-1) Preparing to commit transaction: 1 objects, 36 bytes 1 callAsync serialnos ... >>> zs1.connection = None >>> zs2 = ZEO.tests.servertesting.client(server, '2') >>> tid2 = start_trans(zs2) >>> zs2.vote(tid2) # doctest: +ELLIPSIS ZEO.StorageServer CRITICAL (test-addr-1) Still locked after disconnected. Unlocking. 
ZEO.StorageServer DEBUG (test-addr-2) ('1') lock: transactions waiting: 0 ZEO.StorageServer BLATHER (test-addr-2) Preparing to commit transaction: 1 objects, 36 bytes 2 callAsync serialnos ... >>> zs1.txnlog.close() >>> zs2.tpc_abort(tid2) >>> logging.getLogger('ZEO').setLevel(logging.NOTSET) >>> logging.getLogger('ZEO').removeHandler(handler) """ def test_suite(): return unittest.TestSuite(( doctest.DocTestSuite( setUp=ZODB.tests.util.setUp, tearDown=setupstack.tearDown, checker=renormalizing.RENormalizing([ (re.compile('\d+/test-addr'), ''), (re.compile("'lock_time': \d+.\d+"), 'lock_time'), (re.compile(r"'start': '[^\n]+'"), 'start'), ]), ), )) if __name__ == '__main__': unittest.main(defaultTest='test_suite') ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/testZEOOptions.py000066400000000000000000000074041230730566700247660ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Test suite for ZEO.runzeo.ZEOOptions.""" import os import tempfile import unittest import ZODB.config from ZEO.runzeo import ZEOOptions from zdaemon.tests.testzdoptions import TestZDOptions # When a hostname isn't specified in a socket binding address, ZConfig # supplies the empty string. 
DEFAULT_BINDING_HOST = "" class TestZEOOptions(TestZDOptions): OptionsClass = ZEOOptions input_args = ["-f", "Data.fs", "-a", "5555"] output_opts = [("-f", "Data.fs"), ("-a", "5555")] output_args = [] configdata = """ address 5555 path Data.fs """ def setUp(self): self.tempfilename = tempfile.mktemp() f = open(self.tempfilename, "w") f.write(self.configdata) f.close() def tearDown(self): try: os.remove(self.tempfilename) except os.error: pass def test_configure(self): # Hide the base class test_configure pass def test_default_help(self): pass # disable silly test w spurious failures def test_defaults_with_schema(self): options = self.OptionsClass() options.realize(["-C", self.tempfilename]) self.assertEqual(options.address, (DEFAULT_BINDING_HOST, 5555)) self.assertEqual(len(options.storages), 1) opener = options.storages[0] self.assertEqual(opener.name, "fs") self.assertEqual(opener.__class__, ZODB.config.FileStorage) self.assertEqual(options.read_only, 0) self.assertEqual(options.transaction_timeout, None) self.assertEqual(options.invalidation_queue_size, 100) def test_defaults_without_schema(self): options = self.OptionsClass() options.realize(["-a", "5555", "-f", "Data.fs"]) self.assertEqual(options.address, (DEFAULT_BINDING_HOST, 5555)) self.assertEqual(len(options.storages), 1) opener = options.storages[0] self.assertEqual(opener.name, "1") self.assertEqual(opener.__class__, ZODB.config.FileStorage) self.assertEqual(opener.config.path, "Data.fs") self.assertEqual(options.read_only, 0) self.assertEqual(options.transaction_timeout, None) self.assertEqual(options.invalidation_queue_size, 100) def test_commandline_overrides(self): options = self.OptionsClass() options.realize(["-C", self.tempfilename, "-a", "6666", "-f", "Wisdom.fs"]) self.assertEqual(options.address, (DEFAULT_BINDING_HOST, 6666)) self.assertEqual(len(options.storages), 1) opener = options.storages[0] self.assertEqual(opener.__class__, ZODB.config.FileStorage) self.assertEqual(opener.config.path, "Wisdom.fs") self.assertEqual(options.read_only, 0) self.assertEqual(options.transaction_timeout, None) self.assertEqual(options.invalidation_queue_size, 100) def test_suite(): suite = unittest.TestSuite() for cls in [TestZEOOptions]: suite.addTest(unittest.makeSuite(cls)) return suite if __name__ == "__main__": unittest.main(defaultTest='test_suite') ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/test_cache.py000066400000000000000000001354541230730566700241660ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## """Basic unit tests for a client cache.""" from ZODB.utils import p64, repr_to_oid import doctest import os import re import string import struct import sys import tempfile import unittest import ZEO.cache import ZODB.tests.util import zope.testing.setupstack import zope.testing.renormalizing import ZEO.cache from ZODB.utils import p64, u64, z64 n1 = p64(1) n2 = p64(2) n3 = p64(3) n4 = p64(4) n5 = p64(5) def hexprint(file): file.seek(0) data = file.read() offset = 0 while data: line, data = data[:16], data[16:] printable = "" hex = "" for character in line: if (character in string.printable and not ord(character) in [12,13,9]): printable += character else: printable += '.' hex += character.encode('hex') + ' ' hex = hex[:24] + ' ' + hex[24:] hex = hex.ljust(49) printable = printable.ljust(16) print '%08x %s |%s|' % (offset, hex, printable) offset += 16 def oid(o): repr = '%016x' % o return repr_to_oid(repr) tid = oid class CacheTests(ZODB.tests.util.TestCase): def setUp(self): # The default cache size is much larger than we need here. Since # testSerialization reads the entire file into a string, it's not # good to leave it that big. ZODB.tests.util.TestCase.setUp(self) self.cache = ZEO.cache.ClientCache(size=1024**2) def tearDown(self): self.cache.close() if self.cache.path: os.remove(self.cache.path) ZODB.tests.util.TestCase.tearDown(self) def testLastTid(self): self.assertEqual(self.cache.getLastTid(), z64) self.cache.setLastTid(n2) self.assertEqual(self.cache.getLastTid(), n2) self.assertEqual(self.cache.getLastTid(), n2) self.cache.setLastTid(n3) self.assertEqual(self.cache.getLastTid(), n3) # Check that setting tids out of order gives an error: # the cache complains only when it's non-empty self.cache.store(n1, n3, None, 'x') self.assertRaises(ValueError, self.cache.setLastTid, n2) def testLoad(self): data1 = "data for n1" self.assertEqual(self.cache.load(n1), None) self.cache.store(n1, n3, None, data1) self.assertEqual(self.cache.load(n1), (data1, n3)) def testInvalidate(self): data1 = "data for n1" self.cache.store(n1, n3, None, data1) self.cache.invalidate(n2, n2) self.cache.invalidate(n1, n4) self.assertEqual(self.cache.load(n1), None) self.assertEqual(self.cache.loadBefore(n1, n4), (data1, n3, n4)) def testNonCurrent(self): data1 = "data for n1" data2 = "data for n2" self.cache.store(n1, n4, None, data1) self.cache.store(n1, n2, n3, data2) # can't say anything about state before n2 self.assertEqual(self.cache.loadBefore(n1, n2), None) # n3 is the upper bound of non-current record n2 self.assertEqual(self.cache.loadBefore(n1, n3), (data2, n2, n3)) # no data for between n2 and n3 self.assertEqual(self.cache.loadBefore(n1, n4), None) self.cache.invalidate(n1, n5) self.assertEqual(self.cache.loadBefore(n1, n5), (data1, n4, n5)) self.assertEqual(self.cache.loadBefore(n2, n4), None) def testException(self): self.cache.store(n1, n2, None, "data") self.cache.store(n1, n2, None, "data") self.assertRaises(ValueError, self.cache.store, n1, n3, None, "data") def testEviction(self): # Manually override the current maxsize cache = ZEO.cache.ClientCache(None, 3395) # Trivial test of eviction code. Doesn't test non-current # eviction. data = ["z" * i for i in range(100)] for i in range(50): n = p64(i) cache.store(n, n, None, data[i]) self.assertEquals(len(cache), i + 1) # The cache is now almost full. The next insert # should delete some objects. 
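        # Storing one more record than fits forces eviction, so afterwards
        # fewer than the 51 objects we attempted to cache remain.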
n = p64(50) cache.store(n, n, None, data[51]) self.assert_(len(cache) < 51) # TODO: Need to make sure eviction of non-current data # are handled correctly. def testSerialization(self): self.cache.store(n1, n2, None, "data for n1") self.cache.store(n3, n3, n4, "non-current data for n3") self.cache.store(n3, n4, n5, "more non-current data for n3") path = tempfile.mktemp() # Copy data from self.cache into path, reaching into the cache # guts to make the copy. dst = open(path, "wb+") src = self.cache.f src.seek(0) dst.write(src.read(self.cache.maxsize)) dst.close() copy = ZEO.cache.ClientCache(path) # Verify that internals of both objects are the same. # Could also test that external API produces the same results. eq = self.assertEqual eq(copy.getLastTid(), self.cache.getLastTid()) eq(len(copy), len(self.cache)) eq(dict(copy.current), dict(self.cache.current)) eq(dict([(k, dict(v)) for (k, v) in copy.noncurrent.items()]), dict([(k, dict(v)) for (k, v) in self.cache.noncurrent.items()]), ) def testCurrentObjectLargerThanCache(self): if self.cache.path: os.remove(self.cache.path) self.cache = ZEO.cache.ClientCache(size=50) # We store an object that is a bit larger than the cache can handle. self.cache.store(n1, n2, None, "x"*64) # We can see that it was not stored. self.assertEquals(None, self.cache.load(n1)) # If an object cannot be stored in the cache, it must not be # recorded as current. self.assert_(n1 not in self.cache.current) # Regression test: invalidation must still work. self.cache.invalidate(n1, n2) def testOldObjectLargerThanCache(self): if self.cache.path: os.remove(self.cache.path) cache = ZEO.cache.ClientCache(size=50) # We store an object that is a bit larger than the cache can handle. cache.store(n1, n2, n3, "x"*64) # We can see that it was not stored. self.assertEquals(None, cache.load(n1)) # If an object cannot be stored in the cache, it must not be # recorded as non-current. self.assert_(1 not in cache.noncurrent) def testVeryLargeCaches(self): cache = ZEO.cache.ClientCache('cache', size=(1<<32)+(1<<20)) cache.store(n1, n2, None, "x") cache.close() cache = ZEO.cache.ClientCache('cache', size=(1<<33)+(1<<20)) self.assertEquals(cache.load(n1), ('x', n2)) cache.close() def testConversionOfLargeFreeBlocks(self): f = open('cache', 'wb') f.write(ZEO.cache.magic+ '\0'*8 + 'f'+struct.pack(">I", (1<<32)-12) ) f.seek((1<<32)-1) f.write('x') f.close() cache = ZEO.cache.ClientCache('cache', size=1<<32) cache.close() cache = ZEO.cache.ClientCache('cache', size=1<<32) cache.close() f = open('cache', 'rb') f.seek(12) self.assertEquals(f.read(1), 'f') self.assertEquals(struct.unpack(">I", f.read(4))[0], ZEO.cache.max_block_size) f.close() if not sys.platform.startswith('linux'): # On platforms without sparse files, these tests are just way # too hard on the disk and take too long (especially in a windows # VM). 
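        # Deleting these names while still inside the CacheTests class body
        # removes the two sparse-file-dependent test methods from the class
        # namespace on non-Linux platforms, so the test runner never
        # collects (or runs) them there.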
del testVeryLargeCaches del testConversionOfLargeFreeBlocks def test_clear_zeo_cache(self): cache = self.cache for i in range(10): cache.store(p64(i), n2, None, str(i)) cache.store(p64(i), n1, n2, str(i)+'old') self.assertEqual(len(cache), 20) self.assertEqual(cache.load(n3), ('3', n2)) self.assertEqual(cache.loadBefore(n3, n2), ('3old', n1, n2)) cache.clear() self.assertEqual(len(cache), 0) self.assertEqual(cache.load(n3), None) self.assertEqual(cache.loadBefore(n3, n2), None) def testChangingCacheSize(self): # start with a small cache data = 'x' recsize = ZEO.cache.allocated_record_overhead+len(data) for extra in (2, recsize-2): cache = ZEO.cache.ClientCache( 'cache', size=ZEO.cache.ZEC_HEADER_SIZE+100*recsize+extra) for i in range(100): cache.store(p64(i), n1, None, data) self.assertEquals(len(cache), 100) self.assertEquals(os.path.getsize( 'cache'), ZEO.cache.ZEC_HEADER_SIZE+100*recsize+extra) # Now make it smaller cache.close() small = 50 cache = ZEO.cache.ClientCache( 'cache', size=ZEO.cache.ZEC_HEADER_SIZE+small*recsize+extra) self.assertEquals(len(cache), small) self.assertEquals(os.path.getsize( 'cache'), ZEO.cache.ZEC_HEADER_SIZE+small*recsize+extra) self.assertEquals(set(u64(oid) for (oid, tid) in cache.contents()), set(range(small))) for i in range(100, 110): cache.store(p64(i), n1, None, data) # We use small-1 below because an extra object gets # evicted because of the optimization to assure that we # always get a free block after a new allocated block. expected_len = small - 1 self.assertEquals(len(cache), expected_len) expected_oids = set(range(11, 50)+range(100, 110)) self.assertEquals( set(u64(oid) for (oid, tid) in cache.contents()), expected_oids) # Make sure we can reopen with same size cache.close() cache = ZEO.cache.ClientCache( 'cache', size=ZEO.cache.ZEC_HEADER_SIZE+small*recsize+extra) self.assertEquals(len(cache), expected_len) self.assertEquals(set(u64(oid) for (oid, tid) in cache.contents()), expected_oids) # Now make it bigger cache.close() large = 150 cache = ZEO.cache.ClientCache( 'cache', size=ZEO.cache.ZEC_HEADER_SIZE+large*recsize+extra) self.assertEquals(len(cache), expected_len) self.assertEquals(os.path.getsize( 'cache'), ZEO.cache.ZEC_HEADER_SIZE+large*recsize+extra) self.assertEquals(set(u64(oid) for (oid, tid) in cache.contents()), expected_oids) for i in range(200, 305): cache.store(p64(i), n1, None, data) # We use large-2 for the same reason we used small-1 above. expected_len = large-2 self.assertEquals(len(cache), expected_len) expected_oids = set(range(11, 50)+range(106, 110)+range(200, 305)) self.assertEquals(set(u64(oid) for (oid, tid) in cache.contents()), expected_oids) # Make sure we can reopen with same size cache.close() cache = ZEO.cache.ClientCache( 'cache', size=ZEO.cache.ZEC_HEADER_SIZE+large*recsize+extra) self.assertEquals(len(cache), expected_len) self.assertEquals(set(u64(oid) for (oid, tid) in cache.contents()), expected_oids) # Cleanup cache.close() os.remove('cache') def testSetAnyLastTidOnEmptyCache(self): self.cache.setLastTid(p64(5)) self.cache.setLastTid(p64(5)) self.cache.setLastTid(p64(3)) self.cache.setLastTid(p64(4)) def kill_does_not_cause_cache_corruption(): r""" If we kill a process while a cache is being written to, the cache isn't corrupted. To see this, we'll write a little script that writes records to a cache file repeatedly. >>> import os, random, sys, time >>> open('t', 'w').write(''' ... import os, random, sys, thread, time ... sys.path = %r ... ... def suicide(): ... time.sleep(random.random()/10) ... 
os._exit(0) ... ... import ZEO.cache ... from ZODB.utils import p64 ... cache = ZEO.cache.ClientCache('cache') ... oid = 0 ... t = 0 ... thread.start_new_thread(suicide, ()) ... while 1: ... oid += 1 ... t += 1 ... data = 'X' * random.randint(5000,25000) ... cache.store(p64(oid), p64(t), None, data) ... ... ''' % sys.path) >>> for i in range(10): ... _ = os.spawnl(os.P_WAIT, sys.executable, sys.executable, 't') ... if os.path.exists('cache'): ... cache = ZEO.cache.ClientCache('cache') ... cache.close() ... os.remove('cache') ... os.remove('cache.lock') """ def full_cache_is_valid(): r""" If we fill up the cache without any free space, the cache can still be used. >>> import ZEO.cache >>> cache = ZEO.cache.ClientCache('cache', 1000) >>> data = 'X' * (1000 - ZEO.cache.ZEC_HEADER_SIZE - 41) >>> cache.store(p64(1), p64(1), None, data) >>> cache.close() >>> cache = ZEO.cache.ClientCache('cache', 1000) >>> cache.store(p64(2), p64(2), None, 'XXX') >>> cache.close() """ def cannot_open_same_cache_file_twice(): r""" >>> import ZEO.cache >>> cache = ZEO.cache.ClientCache('cache', 1000) >>> cache2 = ZEO.cache.ClientCache('cache', 1000) Traceback (most recent call last): ... LockError: Couldn't lock 'cache.lock' >>> cache.close() """ def thread_safe(): r""" >>> import ZEO.cache, ZODB.utils >>> cache = ZEO.cache.ClientCache('cache', 1000000) >>> for i in range(100): ... cache.store(ZODB.utils.p64(i), ZODB.utils.p64(1), None, '0') >>> import random, sys, threading >>> random = random.Random(0) >>> stop = False >>> read_failure = None >>> def read_thread(): ... def pick_oid(): ... return ZODB.utils.p64(random.randint(0,99)) ... ... try: ... while not stop: ... cache.load(pick_oid()) ... cache.loadBefore(pick_oid(), ZODB.utils.p64(2)) ... except: ... global read_failure ... read_failure = sys.exc_info() >>> thread = threading.Thread(target=read_thread) >>> thread.start() >>> for tid in range(2,10): ... for oid in range(100): ... oid = ZODB.utils.p64(oid) ... cache.invalidate(oid, ZODB.utils.p64(tid)) ... cache.store(oid, ZODB.utils.p64(tid), None, str(tid)) >>> stop = True >>> thread.join() >>> if read_failure: ... print 'Read failure:' ... import traceback ... traceback.print_exception(*read_failure) >>> expected = '9', ZODB.utils.p64(9) >>> for oid in range(100): ... loaded = cache.load(ZODB.utils.p64(oid)) ... if loaded != expected: ... print oid, loaded >>> cache.close() """ def broken_non_current(): r""" In production, we saw a situation where an _del_noncurrent raused a key error when trying to free space, causing the cache to become unusable. I can't see why this would occur, but added a logging exception handler so, in the future, we'll still see cases in the log, but will ignore the error and keep going. >>> import ZEO.cache, ZODB.utils, logging, sys >>> logger = logging.getLogger('ZEO.cache') >>> logger.setLevel(logging.ERROR) >>> handler = logging.StreamHandler(sys.stdout) >>> logger.addHandler(handler) >>> cache = ZEO.cache.ClientCache('cache', 1000) >>> cache.store(ZODB.utils.p64(1), ZODB.utils.p64(1), None, '0') >>> cache.invalidate(ZODB.utils.p64(1), ZODB.utils.p64(2)) >>> cache._del_noncurrent(ZODB.utils.p64(1), ZODB.utils.p64(2)) ... # doctest: +NORMALIZE_WHITESPACE Couldn't find non-current ('\x00\x00\x00\x00\x00\x00\x00\x01', '\x00\x00\x00\x00\x00\x00\x00\x02') >>> cache._del_noncurrent(ZODB.utils.p64(1), ZODB.utils.p64(1)) >>> cache._del_noncurrent(ZODB.utils.p64(1), ZODB.utils.p64(1)) # ... 
# doctest: +NORMALIZE_WHITESPACE Couldn't find non-current ('\x00\x00\x00\x00\x00\x00\x00\x01', '\x00\x00\x00\x00\x00\x00\x00\x01') >>> logger.setLevel(logging.NOTSET) >>> logger.removeHandler(handler) >>> cache.close() """ # def bad_magic_number(): See rename_bad_cache_file def cache_trace_analysis(): r""" Check to make sure the cache analysis scripts work. >>> import time >>> timetime = time.time >>> now = 1278864701.5 >>> time.time = lambda : now >>> os.environ["ZEO_CACHE_TRACE"] = 'yes' >>> import random >>> random = random.Random(42) >>> history = [] >>> serial = 1 >>> for i in range(1000): ... serial += 1 ... oid = random.randint(i+1000, i+6000) ... history.append(('s', p64(oid), p64(serial), ... 'x'*random.randint(200,2000))) ... for j in range(10): ... oid = random.randint(i+1000, i+6000) ... history.append(('l', p64(oid), p64(serial), ... 'x'*random.randint(200,2000))) >>> def cache_run(name, size): ... serial = 1 ... random.seed(42) ... global now ... now = 1278864701.5 ... cache = ZEO.cache.ClientCache(name, size*(1<<20)) ... for action, oid, serial, data in history: ... now += 1 ... if action == 's': ... cache.invalidate(oid, serial) ... cache.store(oid, serial, None, data) ... else: ... v = cache.load(oid) ... if v is None: ... cache.store(oid, serial, None, data) ... cache.close() >>> cache_run('cache', 2) >>> import ZEO.scripts.cache_stats, ZEO.scripts.cache_simul >>> def ctime(t): ... return time.asctime(time.gmtime(t-3600*4)) >>> ZEO.scripts.cache_stats.ctime = ctime >>> ZEO.scripts.cache_simul.ctime = ctime ############################################################ Stats >>> ZEO.scripts.cache_stats.main(['cache.trace']) loads hits inv(h) writes hitrate Jul 11 12:11-11 0 0 0 0 n/a Jul 11 12:11:41 ==================== Restart ==================== Jul 11 12:11-14 180 1 2 197 0.6% Jul 11 12:15-29 818 107 9 793 13.1% Jul 11 12:30-44 818 213 22 687 26.0% Jul 11 12:45-59 818 291 19 609 35.6% Jul 11 13:00-14 818 295 36 605 36.1% Jul 11 13:15-29 818 277 31 623 33.9% Jul 11 13:30-44 819 276 29 624 33.7% Jul 11 13:45-59 818 251 25 649 30.7% Jul 11 14:00-14 818 295 27 605 36.1% Jul 11 14:15-29 818 262 33 638 32.0% Jul 11 14:30-44 818 297 32 603 36.3% Jul 11 14:45-59 819 268 23 632 32.7% Jul 11 15:00-14 818 291 30 609 35.6% Jul 11 15:15-15 2 1 0 1 50.0% Read 18,876 trace records (641,776 bytes) in 0.0 seconds Versions: 0 records used a version First time: Sun Jul 11 12:11:41 2010 Last time: Sun Jul 11 15:15:01 2010 Duration: 11,000 seconds Data recs: 11,000 (58.3%), average size 1108 bytes Hit rate: 31.2% (load hits / loads) Count Code Function (action) 1 00 _setup_trace (initialization) 682 10 invalidate (miss) 318 1c invalidate (hit, saving non-current) 6,875 20 load (miss) 3,125 22 load (hit) 7,875 52 store (current, non-version) >>> ZEO.scripts.cache_stats.main('-q cache.trace'.split()) loads hits inv(h) writes hitrate Read 18,876 trace records (641,776 bytes) in 0.0 seconds Versions: 0 records used a version First time: Sun Jul 11 12:11:41 2010 Last time: Sun Jul 11 15:15:01 2010 Duration: 11,000 seconds Data recs: 11,000 (58.3%), average size 1108 bytes Hit rate: 31.2% (load hits / loads) Count Code Function (action) 1 00 _setup_trace (initialization) 682 10 invalidate (miss) 318 1c invalidate (hit, saving non-current) 6,875 20 load (miss) 3,125 22 load (hit) 7,875 52 store (current, non-version) >>> ZEO.scripts.cache_stats.main('-v cache.trace'.split()) ... 
# doctest: +ELLIPSIS loads hits inv(h) writes hitrate Jul 11 12:11:41 00 '' 0000000000000000 0000000000000000 - Jul 11 12:11-11 0 0 0 0 n/a Jul 11 12:11:41 ==================== Restart ==================== Jul 11 12:11:42 10 1065 0000000000000002 0000000000000000 - Jul 11 12:11:42 52 1065 0000000000000002 0000000000000000 - 245 Jul 11 12:11:43 20 947 0000000000000000 0000000000000000 - Jul 11 12:11:43 52 947 0000000000000002 0000000000000000 - 602 Jul 11 12:11:44 20 124b 0000000000000000 0000000000000000 - Jul 11 12:11:44 52 124b 0000000000000002 0000000000000000 - 1418 ... Jul 11 15:14:55 52 10cc 00000000000003e9 0000000000000000 - 1306 Jul 11 15:14:56 20 18a7 0000000000000000 0000000000000000 - Jul 11 15:14:56 52 18a7 00000000000003e9 0000000000000000 - 1610 Jul 11 15:14:57 22 18b5 000000000000031d 0000000000000000 - 1636 Jul 11 15:14:58 20 b8a 0000000000000000 0000000000000000 - Jul 11 15:14:58 52 b8a 00000000000003e9 0000000000000000 - 838 Jul 11 15:14:59 22 1085 0000000000000357 0000000000000000 - 217 Jul 11 15:00-14 818 291 30 609 35.6% Jul 11 15:15:00 22 1072 000000000000037e 0000000000000000 - 204 Jul 11 15:15:01 20 16c5 0000000000000000 0000000000000000 - Jul 11 15:15:01 52 16c5 00000000000003e9 0000000000000000 - 1712 Jul 11 15:15-15 2 1 0 1 50.0% Read 18,876 trace records (641,776 bytes) in 0.0 seconds Versions: 0 records used a version First time: Sun Jul 11 12:11:41 2010 Last time: Sun Jul 11 15:15:01 2010 Duration: 11,000 seconds Data recs: 11,000 (58.3%), average size 1108 bytes Hit rate: 31.2% (load hits / loads) Count Code Function (action) 1 00 _setup_trace (initialization) 682 10 invalidate (miss) 318 1c invalidate (hit, saving non-current) 6,875 20 load (miss) 3,125 22 load (hit) 7,875 52 store (current, non-version) >>> ZEO.scripts.cache_stats.main('-h cache.trace'.split()) loads hits inv(h) writes hitrate Jul 11 12:11-11 0 0 0 0 n/a Jul 11 12:11:41 ==================== Restart ==================== Jul 11 12:11-14 180 1 2 197 0.6% Jul 11 12:15-29 818 107 9 793 13.1% Jul 11 12:30-44 818 213 22 687 26.0% Jul 11 12:45-59 818 291 19 609 35.6% Jul 11 13:00-14 818 295 36 605 36.1% Jul 11 13:15-29 818 277 31 623 33.9% Jul 11 13:30-44 819 276 29 624 33.7% Jul 11 13:45-59 818 251 25 649 30.7% Jul 11 14:00-14 818 295 27 605 36.1% Jul 11 14:15-29 818 262 33 638 32.0% Jul 11 14:30-44 818 297 32 603 36.3% Jul 11 14:45-59 819 268 23 632 32.7% Jul 11 15:00-14 818 291 30 609 35.6% Jul 11 15:15-15 2 1 0 1 50.0% Read 18,876 trace records (641,776 bytes) in 0.0 seconds Versions: 0 records used a version First time: Sun Jul 11 12:11:41 2010 Last time: Sun Jul 11 15:15:01 2010 Duration: 11,000 seconds Data recs: 11,000 (58.3%), average size 1108 bytes Hit rate: 31.2% (load hits / loads) Count Code Function (action) 1 00 _setup_trace (initialization) 682 10 invalidate (miss) 318 1c invalidate (hit, saving non-current) 6,875 20 load (miss) 3,125 22 load (hit) 7,875 52 store (current, non-version) Histogram of object load frequency Unique oids: 4,585 Total loads: 10,000 loads objects %obj %load %cum 1 1,645 35.9% 16.4% 16.4% 2 1,465 32.0% 29.3% 45.8% 3 809 17.6% 24.3% 70.0% 4 430 9.4% 17.2% 87.2% 5 167 3.6% 8.3% 95.6% 6 49 1.1% 2.9% 98.5% 7 12 0.3% 0.8% 99.3% 8 7 0.2% 0.6% 99.9% 9 1 0.0% 0.1% 100.0% >>> ZEO.scripts.cache_stats.main('-s cache.trace'.split()) ... 
# doctest: +ELLIPSIS loads hits inv(h) writes hitrate Jul 11 12:11-11 0 0 0 0 n/a Jul 11 12:11:41 ==================== Restart ==================== Jul 11 12:11-14 180 1 2 197 0.6% Jul 11 12:15-29 818 107 9 793 13.1% Jul 11 12:30-44 818 213 22 687 26.0% Jul 11 12:45-59 818 291 19 609 35.6% Jul 11 13:00-14 818 295 36 605 36.1% Jul 11 13:15-29 818 277 31 623 33.9% Jul 11 13:30-44 819 276 29 624 33.7% Jul 11 13:45-59 818 251 25 649 30.7% Jul 11 14:00-14 818 295 27 605 36.1% Jul 11 14:15-29 818 262 33 638 32.0% Jul 11 14:30-44 818 297 32 603 36.3% Jul 11 14:45-59 819 268 23 632 32.7% Jul 11 15:00-14 818 291 30 609 35.6% Jul 11 15:15-15 2 1 0 1 50.0% Read 18,876 trace records (641,776 bytes) in 0.0 seconds Versions: 0 records used a version First time: Sun Jul 11 12:11:41 2010 Last time: Sun Jul 11 15:15:01 2010 Duration: 11,000 seconds Data recs: 11,000 (58.3%), average size 1108 bytes Hit rate: 31.2% (load hits / loads) Count Code Function (action) 1 00 _setup_trace (initialization) 682 10 invalidate (miss) 318 1c invalidate (hit, saving non-current) 6,875 20 load (miss) 3,125 22 load (hit) 7,875 52 store (current, non-version) Histograms of object sizes Unique sizes written: 1,782 size objs writes 200 5 5 201 4 4 202 4 4 203 1 1 204 1 1 205 6 6 206 8 8 ... 1,995 1 2 1,996 2 2 1,997 1 1 1,998 2 2 1,999 2 4 2,000 1 1 >>> ZEO.scripts.cache_stats.main('-S cache.trace'.split()) loads hits inv(h) writes hitrate Jul 11 12:11-11 0 0 0 0 n/a Jul 11 12:11:41 ==================== Restart ==================== Jul 11 12:11-14 180 1 2 197 0.6% Jul 11 12:15-29 818 107 9 793 13.1% Jul 11 12:30-44 818 213 22 687 26.0% Jul 11 12:45-59 818 291 19 609 35.6% Jul 11 13:00-14 818 295 36 605 36.1% Jul 11 13:15-29 818 277 31 623 33.9% Jul 11 13:30-44 819 276 29 624 33.7% Jul 11 13:45-59 818 251 25 649 30.7% Jul 11 14:00-14 818 295 27 605 36.1% Jul 11 14:15-29 818 262 33 638 32.0% Jul 11 14:30-44 818 297 32 603 36.3% Jul 11 14:45-59 819 268 23 632 32.7% Jul 11 15:00-14 818 291 30 609 35.6% Jul 11 15:15-15 2 1 0 1 50.0% >>> ZEO.scripts.cache_stats.main('-X cache.trace'.split()) loads hits inv(h) writes hitrate Jul 11 12:11-11 0 0 0 0 n/a Jul 11 12:11:41 ==================== Restart ==================== Jul 11 12:11-14 180 1 2 197 0.6% Jul 11 12:15-29 818 107 9 793 13.1% Jul 11 12:30-44 818 213 22 687 26.0% Jul 11 12:45-59 818 291 19 609 35.6% Jul 11 13:00-14 818 295 36 605 36.1% Jul 11 13:15-29 818 277 31 623 33.9% Jul 11 13:30-44 819 276 29 624 33.7% Jul 11 13:45-59 818 251 25 649 30.7% Jul 11 14:00-14 818 295 27 605 36.1% Jul 11 14:15-29 818 262 33 638 32.0% Jul 11 14:30-44 818 297 32 603 36.3% Jul 11 14:45-59 819 268 23 632 32.7% Jul 11 15:00-14 818 291 30 609 35.6% Jul 11 15:15-15 2 1 0 1 50.0% Read 18,876 trace records (641,776 bytes) in 0.0 seconds Versions: 0 records used a version First time: Sun Jul 11 12:11:41 2010 Last time: Sun Jul 11 15:15:01 2010 Duration: 11,000 seconds Data recs: 11,000 (58.3%), average size 1108 bytes Hit rate: 31.2% (load hits / loads) Count Code Function (action) 1 00 _setup_trace (initialization) 682 10 invalidate (miss) 318 1c invalidate (hit, saving non-current) 6,875 20 load (miss) 3,125 22 load (hit) 7,875 52 store (current, non-version) >>> ZEO.scripts.cache_stats.main('-i 5 cache.trace'.split()) loads hits inv(h) writes hitrate Jul 11 12:11-11 0 0 0 0 n/a Jul 11 12:11:41 ==================== Restart ==================== Jul 11 12:11-14 180 1 2 197 0.6% Jul 11 12:15-19 272 19 2 281 7.0% Jul 11 12:20-24 273 35 5 265 12.8% Jul 11 12:25-29 273 53 2 247 19.4% Jul 11 12:30-34 272 
60 8 240 22.1% Jul 11 12:35-39 273 68 6 232 24.9% Jul 11 12:40-44 273 85 8 215 31.1% Jul 11 12:45-49 273 84 6 216 30.8% Jul 11 12:50-54 272 104 9 196 38.2% Jul 11 12:55-59 273 103 4 197 37.7% Jul 11 13:00-04 273 92 12 208 33.7% Jul 11 13:05-09 273 103 8 197 37.7% Jul 11 13:10-14 272 100 16 200 36.8% Jul 11 13:15-19 273 91 11 209 33.3% Jul 11 13:20-24 273 96 9 204 35.2% Jul 11 13:25-29 272 90 11 210 33.1% Jul 11 13:30-34 273 82 14 218 30.0% Jul 11 13:35-39 273 102 9 198 37.4% Jul 11 13:40-44 273 92 6 208 33.7% Jul 11 13:45-49 272 82 6 218 30.1% Jul 11 13:50-54 273 83 8 217 30.4% Jul 11 13:55-59 273 86 11 214 31.5% Jul 11 14:00-04 273 95 11 205 34.8% Jul 11 14:05-09 272 91 10 209 33.5% Jul 11 14:10-14 273 109 6 191 39.9% Jul 11 14:15-19 273 89 9 211 32.6% Jul 11 14:20-24 272 84 16 216 30.9% Jul 11 14:25-29 273 89 8 211 32.6% Jul 11 14:30-34 273 97 12 203 35.5% Jul 11 14:35-39 273 93 10 207 34.1% Jul 11 14:40-44 272 107 10 193 39.3% Jul 11 14:45-49 273 80 8 220 29.3% Jul 11 14:50-54 273 100 8 200 36.6% Jul 11 14:55-59 273 88 7 212 32.2% Jul 11 15:00-04 272 99 8 201 36.4% Jul 11 15:05-09 273 95 11 205 34.8% Jul 11 15:10-14 273 97 11 203 35.5% Jul 11 15:15-15 2 1 0 1 50.0% Read 18,876 trace records (641,776 bytes) in 0.0 seconds Versions: 0 records used a version First time: Sun Jul 11 12:11:41 2010 Last time: Sun Jul 11 15:15:01 2010 Duration: 11,000 seconds Data recs: 11,000 (58.3%), average size 1108 bytes Hit rate: 31.2% (load hits / loads) Count Code Function (action) 1 00 _setup_trace (initialization) 682 10 invalidate (miss) 318 1c invalidate (hit, saving non-current) 6,875 20 load (miss) 3,125 22 load (hit) 7,875 52 store (current, non-version) >>> ZEO.scripts.cache_simul.main('-s 2 -i 5 cache.trace'.split()) CircularCacheSimulation, cache size 2,097,152 bytes START TIME DUR. 
LOADS HITS INVALS WRITES HITRATE EVICTS INUSE Jul 11 12:11 3:17 180 1 2 197 0.6% 0 10.7 Jul 11 12:15 4:59 272 19 2 281 7.0% 0 26.4 Jul 11 12:20 4:59 273 35 5 265 12.8% 0 40.4 Jul 11 12:25 4:59 273 53 2 247 19.4% 0 54.8 Jul 11 12:30 4:59 272 60 8 240 22.1% 0 67.1 Jul 11 12:35 4:59 273 68 6 232 24.9% 0 79.8 Jul 11 12:40 4:59 273 85 8 215 31.1% 0 91.4 Jul 11 12:45 4:59 273 84 6 216 30.8% 77 99.1 Jul 11 12:50 4:59 272 104 9 196 38.2% 196 98.9 Jul 11 12:55 4:59 273 104 4 196 38.1% 188 99.1 Jul 11 13:00 4:59 273 92 12 208 33.7% 213 99.3 Jul 11 13:05 4:59 273 103 8 197 37.7% 190 99.0 Jul 11 13:10 4:59 272 100 16 200 36.8% 203 99.2 Jul 11 13:15 4:59 273 91 11 209 33.3% 222 98.7 Jul 11 13:20 4:59 273 96 9 204 35.2% 210 99.2 Jul 11 13:25 4:59 272 89 11 211 32.7% 212 99.1 Jul 11 13:30 4:59 273 82 14 218 30.0% 220 99.1 Jul 11 13:35 4:59 273 101 9 199 37.0% 191 99.5 Jul 11 13:40 4:59 273 92 6 208 33.7% 214 99.4 Jul 11 13:45 4:59 272 80 6 220 29.4% 217 99.3 Jul 11 13:50 4:59 273 81 8 219 29.7% 214 99.2 Jul 11 13:55 4:59 273 86 11 214 31.5% 208 98.8 Jul 11 14:00 4:59 273 95 11 205 34.8% 188 99.3 Jul 11 14:05 4:59 272 93 10 207 34.2% 207 99.3 Jul 11 14:10 4:59 273 110 6 190 40.3% 198 98.8 Jul 11 14:15 4:59 273 91 9 209 33.3% 209 99.1 Jul 11 14:20 4:59 272 85 16 215 31.2% 210 99.3 Jul 11 14:25 4:59 273 89 8 211 32.6% 226 99.3 Jul 11 14:30 4:59 273 96 12 204 35.2% 214 99.3 Jul 11 14:35 4:59 273 90 10 210 33.0% 213 99.3 Jul 11 14:40 4:59 272 106 10 194 39.0% 196 98.8 Jul 11 14:45 4:59 273 80 8 220 29.3% 230 99.0 Jul 11 14:50 4:59 273 99 8 201 36.3% 202 99.0 Jul 11 14:55 4:59 273 87 8 213 31.9% 205 99.4 Jul 11 15:00 4:59 272 98 8 202 36.0% 211 99.3 Jul 11 15:05 4:59 273 93 11 207 34.1% 198 99.2 Jul 11 15:10 4:59 273 96 11 204 35.2% 184 99.2 Jul 11 15:15 1 2 1 0 1 50.0% 1 99.2 -------------------------------------------------------------------------- Jul 11 12:45 2:30:01 8184 2794 286 6208 34.1% 6067 99.2 >>> cache_run('cache4', 4) >>> ZEO.scripts.cache_stats.main('cache4.trace'.split()) loads hits inv(h) writes hitrate Jul 11 12:11-11 0 0 0 0 n/a Jul 11 12:11:41 ==================== Restart ==================== Jul 11 12:11-14 180 1 2 197 0.6% Jul 11 12:15-29 818 107 9 793 13.1% Jul 11 12:30-44 818 213 22 687 26.0% Jul 11 12:45-59 818 322 23 578 39.4% Jul 11 13:00-14 818 381 43 519 46.6% Jul 11 13:15-29 818 450 44 450 55.0% Jul 11 13:30-44 819 503 47 397 61.4% Jul 11 13:45-59 818 496 49 404 60.6% Jul 11 14:00-14 818 516 48 384 63.1% Jul 11 14:15-29 818 532 59 368 65.0% Jul 11 14:30-44 818 516 51 384 63.1% Jul 11 14:45-59 819 529 53 371 64.6% Jul 11 15:00-14 818 515 49 385 63.0% Jul 11 15:15-15 2 2 0 0 100.0% Read 16,918 trace records (575,204 bytes) in 0.0 seconds Versions: 0 records used a version First time: Sun Jul 11 12:11:41 2010 Last time: Sun Jul 11 15:15:01 2010 Duration: 11,000 seconds Data recs: 11,000 (65.0%), average size 1104 bytes Hit rate: 50.8% (load hits / loads) Count Code Function (action) 1 00 _setup_trace (initialization) 501 10 invalidate (miss) 499 1c invalidate (hit, saving non-current) 4,917 20 load (miss) 5,083 22 load (hit) 5,917 52 store (current, non-version) >>> ZEO.scripts.cache_simul.main('-s 4 cache.trace'.split()) CircularCacheSimulation, cache size 4,194,304 bytes START TIME DUR. 
LOADS HITS INVALS WRITES HITRATE EVICTS INUSE Jul 11 12:11 3:17 180 1 2 197 0.6% 0 5.4 Jul 11 12:15 14:59 818 107 9 793 13.1% 0 27.4 Jul 11 12:30 14:59 818 213 22 687 26.0% 0 45.7 Jul 11 12:45 14:59 818 322 23 578 39.4% 0 61.4 Jul 11 13:00 14:59 818 381 43 519 46.6% 0 75.8 Jul 11 13:15 14:59 818 450 44 450 55.0% 0 88.2 Jul 11 13:30 14:59 819 503 47 397 61.4% 36 98.2 Jul 11 13:45 14:59 818 496 49 404 60.6% 388 98.5 Jul 11 14:00 14:59 818 515 48 385 63.0% 376 98.3 Jul 11 14:15 14:59 818 529 58 371 64.7% 391 98.1 Jul 11 14:30 14:59 818 511 51 389 62.5% 376 98.5 Jul 11 14:45 14:59 819 529 53 371 64.6% 410 97.9 Jul 11 15:00 14:59 818 512 49 388 62.6% 379 97.7 Jul 11 15:15 1 2 2 0 0 100.0% 0 97.7 -------------------------------------------------------------------------- Jul 11 13:30 1:45:01 5730 3597 355 2705 62.8% 2356 97.7 >>> cache_run('cache1', 1) >>> ZEO.scripts.cache_stats.main('cache1.trace'.split()) loads hits inv(h) writes hitrate Jul 11 12:11-11 0 0 0 0 n/a Jul 11 12:11:41 ==================== Restart ==================== Jul 11 12:11-14 180 1 2 197 0.6% Jul 11 12:15-29 818 107 9 793 13.1% Jul 11 12:30-44 818 160 16 740 19.6% Jul 11 12:45-59 818 158 8 742 19.3% Jul 11 13:00-14 818 141 21 759 17.2% Jul 11 13:15-29 818 128 17 772 15.6% Jul 11 13:30-44 819 151 13 749 18.4% Jul 11 13:45-59 818 120 17 780 14.7% Jul 11 14:00-14 818 159 17 741 19.4% Jul 11 14:15-29 818 141 13 759 17.2% Jul 11 14:30-44 818 157 16 743 19.2% Jul 11 14:45-59 819 133 13 767 16.2% Jul 11 15:00-14 818 158 10 742 19.3% Jul 11 15:15-15 2 1 0 1 50.0% Read 20,286 trace records (689,716 bytes) in 0.0 seconds Versions: 0 records used a version First time: Sun Jul 11 12:11:41 2010 Last time: Sun Jul 11 15:15:01 2010 Duration: 11,000 seconds Data recs: 11,000 (54.2%), average size 1105 bytes Hit rate: 17.1% (load hits / loads) Count Code Function (action) 1 00 _setup_trace (initialization) 828 10 invalidate (miss) 172 1c invalidate (hit, saving non-current) 8,285 20 load (miss) 1,715 22 load (hit) 9,285 52 store (current, non-version) >>> ZEO.scripts.cache_simul.main('-s 1 cache.trace'.split()) CircularCacheSimulation, cache size 1,048,576 bytes START TIME DUR. LOADS HITS INVALS WRITES HITRATE EVICTS INUSE Jul 11 12:11 3:17 180 1 2 197 0.6% 0 21.5 Jul 11 12:15 14:59 818 107 9 793 13.1% 96 99.6 Jul 11 12:30 14:59 818 160 16 740 19.6% 724 99.6 Jul 11 12:45 14:59 818 158 8 742 19.3% 741 99.2 Jul 11 13:00 14:59 818 140 21 760 17.1% 771 99.5 Jul 11 13:15 14:59 818 125 17 775 15.3% 781 99.6 Jul 11 13:30 14:59 819 147 13 753 17.9% 748 99.5 Jul 11 13:45 14:59 818 120 17 780 14.7% 763 99.5 Jul 11 14:00 14:59 818 159 17 741 19.4% 728 99.4 Jul 11 14:15 14:59 818 141 13 759 17.2% 787 99.6 Jul 11 14:30 14:59 818 150 15 750 18.3% 755 99.2 Jul 11 14:45 14:59 819 132 13 768 16.1% 771 99.5 Jul 11 15:00 14:59 818 154 10 746 18.8% 723 99.2 Jul 11 15:15 1 2 1 0 1 50.0% 0 99.3 -------------------------------------------------------------------------- Jul 11 12:15 3:00:01 9820 1694 169 9108 17.3% 8388 99.3 Cleanup: >>> del os.environ["ZEO_CACHE_TRACE"] >>> time.time = timetime >>> ZEO.scripts.cache_stats.ctime = time.ctime >>> ZEO.scripts.cache_simul.ctime = time.ctime """ def cache_simul_properly_handles_load_miss_after_eviction_and_inval(): r""" Set up evicted and then invalidated oid >>> os.environ["ZEO_CACHE_TRACE"] = 'yes' >>> cache = ZEO.cache.ClientCache('cache', 1<<21) >>> cache.store(p64(1), p64(1), None, 'x') >>> for i in range(10): ... 
cache.store(p64(2+i), p64(1), None, 'x'*(1<<19)) # Evict 1 >>> cache.store(p64(1), p64(1), None, 'x') >>> cache.invalidate(p64(1), p64(2)) >>> cache.load(p64(1)) >>> cache.close() Now try to do simulation: >>> import ZEO.scripts.cache_simul >>> ZEO.scripts.cache_simul.main('-s 1 cache.trace'.split()) ... # doctest: +ELLIPSIS +NORMALIZE_WHITESPACE CircularCacheSimulation, cache size 1,048,576 bytes START TIME DUR. LOADS HITS INVALS WRITES HITRATE EVICTS INUSE ... 1 0 1 12 0.0% 10 50.0 -------------------------------------------------------------------------- ... 1 0 1 12 0.0% 10 50.0 >>> del os.environ["ZEO_CACHE_TRACE"] """ def invalidations_with_current_tid_dont_wreck_cache(): """ >>> cache = ZEO.cache.ClientCache('cache', 1000) >>> cache.store(p64(1), p64(1), None, 'data') >>> import logging, sys >>> handler = logging.StreamHandler(sys.stdout) >>> logging.getLogger().addHandler(handler) >>> old_level = logging.getLogger().getEffectiveLevel() >>> logging.getLogger().setLevel(logging.WARNING) >>> cache.invalidate(p64(1), p64(1)) Ignoring invalidation with same tid as current >>> cache.close() >>> cache = ZEO.cache.ClientCache('cache', 1000) >>> cache.close() >>> logging.getLogger().removeHandler(handler) >>> logging.getLogger().setLevel(old_level) """ def rename_bad_cache_file(): """ An attempt to open a bad cache file will cause it to be dropped and recreated. >>> open('cache', 'w').write('x'*100) >>> import logging, sys >>> handler = logging.StreamHandler(sys.stdout) >>> logging.getLogger().addHandler(handler) >>> old_level = logging.getLogger().getEffectiveLevel() >>> logging.getLogger().setLevel(logging.WARNING) >>> cache = ZEO.cache.ClientCache('cache', 1000) # doctest: +ELLIPSIS Moving bad cache file to 'cache.bad'. Traceback (most recent call last): ... ValueError: unexpected magic number: 'xxxx' >>> cache.store(p64(1), p64(1), None, 'data') >>> cache.close() >>> f = open('cache') >>> f.seek(0, 2) >>> print f.tell() 1000 >>> f.close() >>> open('cache', 'w').write('x'*200) >>> cache = ZEO.cache.ClientCache('cache', 1000) # doctest: +ELLIPSIS Removing bad cache file: 'cache' (prev bad exists). Traceback (most recent call last): ... ValueError: unexpected magic number: 'xxxx' >>> cache.store(p64(1), p64(1), None, 'data') >>> cache.close() >>> f = open('cache') >>> f.seek(0, 2) >>> print f.tell() 1000 >>> f.close() >>> f = open('cache.bad') >>> f.seek(0, 2) >>> print f.tell() 100 >>> f.close() >>> logging.getLogger().removeHandler(handler) >>> logging.getLogger().setLevel(old_level) """ def test_suite(): suite = unittest.TestSuite() suite.addTest(unittest.makeSuite(CacheTests)) suite.addTest( doctest.DocTestSuite( setUp=zope.testing.setupstack.setUpDirectory, tearDown=zope.testing.setupstack.tearDown, checker=zope.testing.renormalizing.RENormalizing([ (re.compile(r'31\.3%'), '31.2%'), ]), ) ) return suite ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/zdoptions.test000066400000000000000000000066211230730566700244350ustar00rootroot00000000000000Minimal test of Server Options Handling ======================================= This is initially motivated by a desire to remove the requirement of specifying a storage name when there is only one storage. Storage Names ------------- It is an error not to specify any storages: >>> import StringIO, sys, ZEO.runzeo >>> stderr = sys.stderr >>> open('config', 'w').write(""" ... ... address 8100 ... ... 
""") >>> sys.stderr = StringIO.StringIO() >>> options = ZEO.runzeo.ZEOOptions() >>> options.realize('-C config'.split()) Traceback (most recent call last): ... SystemExit: 2 >>> print sys.stderr.getvalue() # doctest: +ELLIPSIS +NORMALIZE_WHITESPACE Error: not enough values for section type 'zodb.storage'; 0 found, 1 required ... But we can specify a storage without a name: >>> open('config', 'w').write(""" ... ... address 8100 ... ... ... ... """) >>> options = ZEO.runzeo.ZEOOptions() >>> options.realize('-C config'.split()) >>> [storage.name for storage in options.storages] ['1'] We can't have multiple unnamed storages: >>> sys.stderr = StringIO.StringIO() >>> open('config', 'w').write(""" ... ... address 8100 ... ... ... ... ... ... """) >>> options = ZEO.runzeo.ZEOOptions() >>> options.realize('-C config'.split()) Traceback (most recent call last): ... SystemExit: 2 >>> print sys.stderr.getvalue() # doctest: +ELLIPSIS +NORMALIZE_WHITESPACE Error: No more than one storage may be unnamed. ... Or an unnamed storage and one named '1': >>> sys.stderr = StringIO.StringIO() >>> open('config', 'w').write(""" ... ... address 8100 ... ... ... ... ... ... """) >>> options = ZEO.runzeo.ZEOOptions() >>> options.realize('-C config'.split()) Traceback (most recent call last): ... SystemExit: 2 >>> print sys.stderr.getvalue() # doctest: +ELLIPSIS +NORMALIZE_WHITESPACE Error: Can't have an unnamed storage and a storage named 1. ... But we can have multiple storages: >>> open('config', 'w').write(""" ... ... address 8100 ... ... ... ... ... ... """) >>> options = ZEO.runzeo.ZEOOptions() >>> options.realize('-C config'.split()) >>> [storage.name for storage in options.storages] ['x', 'y'] As long as the names are unique: >>> sys.stderr = StringIO.StringIO() >>> open('config', 'w').write(""" ... ... address 8100 ... ... ... ... ... ... """) >>> options = ZEO.runzeo.ZEOOptions() >>> options.realize('-C config'.split()) Traceback (most recent call last): ... SystemExit: 2 >>> print sys.stderr.getvalue() # doctest: +ELLIPSIS +NORMALIZE_WHITESPACE Error: section names must not be re-used within the same container:'1' ... .. Cleanup ===================================================== >>> sys.stderr = stderr ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/zeo-fan-out.test000066400000000000000000000073661230730566700245570ustar00rootroot00000000000000ZEO Fan Out =========== We should be able to set up ZEO servers with ZEO clients. Let's see if we can make it work. We'll use some helper functions. The first is a helper that starts ZEO servers for us and another one that picks ports. We'll start the first server: >>> (_, port0), adminaddr0 = start_server( ... '\npath fs\nblob-dir blobs\n', keep=1) Then we'll start 2 others that use this one: >>> addr1, _ = start_server( ... '\nserver %s\nblob-dir b1\n' % port0) >>> addr2, _ = start_server( ... 
'\nserver %s\nblob-dir b2\n' % port0) Now, let's create some client storages that connect to these: >>> import os, ZEO, ZODB.blob, ZODB.POSException, transaction >>> db0 = ZEO.DB(port0, blob_dir='cb0') >>> db1 = ZEO.DB(addr1, blob_dir='cb1') >>> tm1 = transaction.TransactionManager() >>> c1 = db1.open(transaction_manager=tm1) >>> r1 = c1.root() >>> r1 {} >>> db2 = ZEO.DB(addr2, blob_dir='cb2') >>> tm2 = transaction.TransactionManager() >>> c2 = db2.open(transaction_manager=tm2) >>> r2 = c2.root() >>> r2 {} If we update c1, we'll eventually see the change in c2: >>> import persistent.mapping >>> r1[1] = persistent.mapping.PersistentMapping() >>> r1[1].v = 1000 >>> r1[2] = persistent.mapping.PersistentMapping() >>> r1[2].v = -1000 >>> r1[3] = ZODB.blob.Blob('x'*4111222) >>> for i in range(1000, 2000): ... r1[i] = persistent.mapping.PersistentMapping() ... r1[i].v = 0 >>> tm1.commit() >>> blob_id = r1[3]._p_oid, r1[1]._p_serial >>> import time >>> for i in range(100): ... t = tm2.begin() ... if 1 in r2: ... break ... time.sleep(0.01) >>> tm2.abort() >>> r2[1].v 1000 >>> r2[2].v -1000 Now, let's see if we can break it. :) >>> def f(): ... c = db1.open(transaction.TransactionManager()) ... r = c.root() ... i = 0 ... while i < 100: ... r[1].v -= 1 ... r[2].v += 1 ... try: ... c.transaction_manager.commit() ... i += 1 ... except ZODB.POSException.ConflictError: ... c.transaction_manager.abort() ... c.close() >>> import threading >>> threadf = threading.Thread(target=f) >>> threadg = threading.Thread(target=f) >>> threadf.start() >>> threadg.start() >>> s2 = db2.storage >>> start_time = time.time() >>> while time.time() - start_time < 999: ... t = tm2.begin() ... if r2[1].v + r2[2].v: ... print 'oops', r2[1], r2[2] ... if r2[1].v == 800: ... break # we caught up ... path = s2.fshelper.getBlobFilename(*blob_id) ... if os.path.exists(path): ... ZODB.blob.remove_committed(path) ... s2._server.sendBlob(*blob_id) ... else: print 'Dang' >>> threadf.join() >>> threadg.join() If we shutdown and restart the source server, the variables will be invalidated: >>> stop_server(adminaddr0) >>> _ = start_server('\npath fs\n\n', ... port=port0) >>> for i in range(1000): ... c1.sync() ... c2.sync() ... if ( ... (r1[1]._p_changed is None) ... and ... (r1[2]._p_changed is None) ... and ... (r2[1]._p_changed is None) ... and ... (r2[2]._p_changed is None) ... ): ... print 'Cool' ... break ... time.sleep(0.01) ... else: ... print 'Dang' Cool Cleanup: >>> db0.close() >>> db1.close() >>> db2.close() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/zeo_blob_cache.test000066400000000000000000000115771230730566700253300ustar00rootroot00000000000000ZEO caching of blob data ======================== ZEO supports 2 modes for providing clients access to blob data: shared Blob data are shared via a network file system. The client shares a common blob directory with the server. non-shared Blob data are loaded from the storage server and cached locally. A maximum size for the blob data can be set and data are removed when the size is exceeded. In this test, we'll demonstrate that blobs data are removed from a ZEO cache when the amount of data stored exceeds a given limit. Let's start by setting up some data: >>> addr, _ = start_server(blob_dir='server-blobs') We'll also create a client. >>> import ZEO >>> db = ZEO.DB(addr, blob_dir='blobs', blob_cache_size=3000) Here, we passed a blob_cache_size parameter, which specifies a target blob cache size. This is not a hard limit, but rather a target. 
It defaults to a very large value. We also passed a blob_cache_size_check option. The blob_cache_size_check option specifies the number of bytes, as a percent of the target that can be written or downloaded from the server before the cache size is checked. The blob_cache_size_check option defaults to 100. We passed 10, to check after writing 10% of the target size. .. We're going to wait for any threads we started to finish, so... >>> import threading >>> old_threads = list(threading.enumerate()) We want to check for name collections in the blob cache dir. We'll try to provoke name collections by reducing the number of cache directory subdirectories. >>> import ZEO.ClientStorage >>> orig_blob_cache_layout_size = ZEO.ClientStorage.BlobCacheLayout.size >>> ZEO.ClientStorage.BlobCacheLayout.size = 11 Now, let's write some data: >>> import ZODB.blob, transaction, time >>> conn = db.open() >>> for i in range(1, 101): ... conn.root()[i] = ZODB.blob.Blob() ... conn.root()[i].open('w').write(chr(i)*100) >>> transaction.commit() We've committed 10000 bytes of data, but our target size is 3000. We expect to have not much more than the target size in the cache blob directory. >>> import os >>> def cache_size(d): ... size = 0 ... for base, dirs, files in os.walk(d): ... for f in files: ... if f.endswith('.blob'): ... try: ... size += os.stat(os.path.join(base, f)).st_size ... except OSError: ... if os.path.exists(os.path.join(base, f)): ... raise ... return size >>> def check(): ... return cache_size('blobs') < 5000 >>> def onfail(): ... return cache_size('blobs') >>> from ZEO.tests.forker import wait_until >>> wait_until("size is reduced", check, 99, onfail) If we read all of the blobs, data will be downloaded again, as necessary, but the cache size will remain not much bigger than the target: >>> for i in range(1, 101): ... data = conn.root()[i].open().read() ... if data != chr(i)*100: ... print 'bad data', `chr(i)`, `data` >>> wait_until("size is reduced", check, 99, onfail) >>> for i in range(1, 101): ... data = conn.root()[i].open().read() ... if data != chr(i)*100: ... print 'bad data', `chr(i)`, `data` >>> for i in range(1, 101): ... data = conn.root()[i].open('c').read() ... if data != chr(i)*100: ... print 'bad data', `chr(i)`, `data` >>> wait_until("size is reduced", check, 99, onfail) Now let see if we can stress things a bit. We'll create many clients and get them to pound on the blobs all at once to see if we can provoke problems: >>> import threading, random >>> def run(): ... db = ZEO.DB(addr, blob_dir='blobs', blob_cache_size=4000) ... conn = db.open() ... for i in range(300): ... time.sleep(0) ... i = random.randint(1, 100) ... data = conn.root()[i].open().read() ... if data != chr(i)*100: ... print 'bad data', `chr(i)`, `data` ... i = random.randint(1, 100) ... data = conn.root()[i].open('c').read() ... if data != chr(i)*100: ... print 'bad data', `chr(i)`, `data` ... db.close() >>> threads = [threading.Thread(target=run) for i in range(10)] >>> for thread in threads: ... thread.setDaemon(True) >>> for thread in threads: ... thread.start() >>> for thread in threads: ... thread.join(99) ... if thread.isAlive(): ... print "Can't join thread." >>> wait_until("size is reduced", check, 99, onfail) .. cleanup >>> for thread in threading.enumerate(): ... if thread not in old_threads: ... 
thread.join(33) >>> db.close() >>> ZEO.ClientStorage.BlobCacheLayout.size = orig_blob_cache_layout_size ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/tests/zeoserver.py000066400000000000000000000161021230730566700240740ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Helper file used to launch a ZEO server cross platform""" import asyncore import errno import getopt import logging import os import signal import socket import sys import threading import time import ZEO.runzeo import ZEO.zrpc.connection def cleanup(storage): # FileStorage and the Berkeley storages have this method, which deletes # all files and directories used by the storage. This prevents @-files # from clogging up /tmp try: storage.cleanup() except AttributeError: pass logger = logging.getLogger('ZEO.tests.zeoserver') def log(label, msg, *args): message = "(%s) %s" % (label, msg) logger.debug(message, *args) class ZEOTestServer(asyncore.dispatcher): """A server for killing the whole process at the end of a test. The first time we connect to this server, we write an ack character down the socket. The other end should block on a recv() of the socket so it can guarantee the server has started up before continuing on. The second connect to the port immediately exits the process, via os._exit(), without writing data on the socket. It does close and clean up the storage first. The other end will get the empty string from its recv() which will be enough to tell it that the server has exited. I think this should prevent us from ever getting a legitimate addr-in-use error. """ __super_init = asyncore.dispatcher.__init__ def __init__(self, addr, server, keep): self.__super_init() self._server = server self._sockets = [self] self._keep = keep # Count down to zero, the number of connects self._count = 1 self._label ='%d @ %s' % (os.getpid(), addr) if isinstance(addr, str): self.create_socket(socket.AF_UNIX, socket.SOCK_STREAM) else: self.create_socket(socket.AF_INET, socket.SOCK_STREAM) # Some ZEO tests attempt a quick start of the server using the same # port so we have to set the reuse flag. self.set_reuse_addr() try: self.bind(addr) except: # We really want to see these exceptions import traceback traceback.print_exc() raise self.listen(5) self.log('bound and listening') def log(self, msg, *args): log(self._label, msg, *args) def handle_accept(self): sock, addr = self.accept() self.log('in handle_accept()') # When we're done with everything, close the storage. Do not write # the ack character until the storage is finished closing. if self._count <= 0: self.log('closing the storage') self._server.close_server() if not self._keep: for storage in self._server.storages.values(): cleanup(storage) self.log('exiting') # Close all the other sockets so that we don't have to wait # for os._exit() to get to it before starting the next # server process. 
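            # self._sockets starts out holding just this dispatcher; the test
            # harness can add more (e.g. the ZEO server's own dispatcher) via
            # register_socket() below, and all of them are closed here before
            # os._exit().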
for s in self._sockets: s.close() # Now explicitly close the socket returned from accept(), # since it didn't go through the wrapper. sock.close() os._exit(0) self.log('continuing') sock.send('X') self._count -= 1 def register_socket(self, sock): # Register a socket to be closed when server shutsdown. self._sockets.append(sock) class Suicide(threading.Thread): def __init__(self, addr): threading.Thread.__init__(self) self._adminaddr = addr def run(self): # If this process doesn't exit in 330 seconds, commit suicide. # The client threads in the ConcurrentUpdate tests will run for # as long as 300 seconds. Set this timeout to 330 to minimize # chance that the server gives up before the clients. time.sleep(999) log(str(os.getpid()), "suicide thread invoking shutdown") # If the server hasn't shut down yet, the client may not be # able to connect to it. If so, try to kill the process to # force it to shutdown. if hasattr(os, "kill"): os.kill(pid, signal.SIGTERM) time.sleep(5) os.kill(pid, signal.SIGKILL) else: from ZEO.tests.forker import shutdown_zeo_server # Nott: If the -k option was given to zeoserver, then the # process will go away but the temp files won't get # cleaned up. shutdown_zeo_server(self._adminaddr) def main(): global pid pid = os.getpid() label = str(pid) log(label, "starting") # We don't do much sanity checking of the arguments, since if we get it # wrong, it's a bug in the test suite. keep = 0 configfile = None suicide = True # Parse the arguments and let getopt.error percolate opts, args = getopt.getopt(sys.argv[1:], 'dkSC:v:') for opt, arg in opts: if opt == '-k': keep = 1 if opt == '-d': ZEO.zrpc.connection.debug_zrpc = True elif opt == '-C': configfile = arg elif opt == '-S': suicide = False elif opt == '-v': ZEO.zrpc.connection.Connection.current_protocol = arg zo = ZEO.runzeo.ZEOOptions() zo.realize(["-C", configfile]) addr = zo.address if zo.auth_protocol == "plaintext": __import__('ZEO.tests.auth_plaintext') if isinstance(addr, tuple): test_addr = addr[0], addr[1]+1 else: test_addr = addr + '-test' log(label, 'creating the storage server') storage = zo.storages[0].open() mon_addr = None if zo.monitor_address: mon_addr = zo.monitor_address server = ZEO.runzeo.create_server({"1": storage}, zo) try: log(label, 'creating the test server, keep: %s', keep) t = ZEOTestServer(test_addr, server, keep) except socket.error, e: if e[0] != errno.EADDRINUSE: raise log(label, 'addr in use, closing and exiting') storage.close() cleanup(storage) sys.exit(2) t.register_socket(server.dispatcher) if suicide: # Create daemon suicide thread d = Suicide(test_addr) d.setDaemon(1) d.start() # Loop for socket events log(label, 'entering asyncore loop') asyncore.loop() if __name__ == '__main__': import warnings warnings.simplefilter('ignore') main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/util.py000066400000000000000000000035061230730566700216670ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## """Utilities for setting up the server environment.""" import os def parentdir(p, n=1): """Return the ancestor of p from n levels up.""" d = p while n: d = os.path.dirname(d) if not d or d == '.': d = os.getcwd() n -= 1 return d class Environment: """Determine location of the Data.fs & ZEO_SERVER.pid files. Pass the argv[0] used to start ZEO to the constructor. Use the zeo_pid and fs attributes to get the filenames. """ def __init__(self, argv0): v = os.environ.get("INSTANCE_HOME") if v is None: # looking for a Zope/var directory assuming that this code # is installed in Zope/lib/python/ZEO p = parentdir(argv0, 4) if os.path.isdir(os.path.join(p, "var")): v = p else: v = os.getcwd() self.home = v self.var = os.path.join(v, "var") if not os.path.isdir(self.var): self.var = self.home pid = os.environ.get("ZEO_SERVER_PID") if pid is None: pid = os.path.join(self.var, "ZEO_SERVER.pid") self.zeo_pid = pid self.fs = os.path.join(self.var, "Data.fs") ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/version.txt000066400000000000000000000000101230730566700225510ustar00rootroot000000000000003.7.0b3 ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/zeoctl.py000066400000000000000000000020021230730566700222000ustar00rootroot00000000000000#!/usr/bin/env python2.3 ############################################################################## # # Copyright (c) 2005 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Wrapper script for zdctl.py that causes it to use the ZEO schema.""" import os import ZEO import zdaemon.zdctl # Main program def main(args=None): options = zdaemon.zdctl.ZDCtlOptions() options.schemadir = os.path.dirname(ZEO.__file__) options.schemafile = "zeoctl.xml" zdaemon.zdctl.main(args, options) if __name__ == "__main__": main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/zeoctl.xml000066400000000000000000000015051230730566700223570ustar00rootroot00000000000000 This schema describes the configuration of the ZEO storage server controller. It differs from the schema for the storage server only in that the "runner" section is required.
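zeoctl.py above simply hands control to zdaemon's zdctl with the
zeoctl.xml schema, so the usual zdctl subcommands apply.  A usage sketch
(the configuration path is hypothetical, and the subcommands shown are
the standard zdctl ones rather than anything defined in this file):

    python zeoctl.py -C /path/to/zeo.conf start
    python zeoctl.py -C /path/to/zeo.conf status
    python zeoctl.py -C /path/to/zeo.conf stop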
ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/zeopasswd.py000066400000000000000000000103051230730566700227240ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Update a user's authentication tokens for a ZEO server. usage: python zeopasswd.py [options] username [password] Specify either a configuration file: -C/--configuration -- ZConfig configuration file or the individual options: -f/--filename -- authentication database filename -p/--protocol -- authentication protocol name -r/--realm -- authentication database realm Additional options: -d/--delete -- delete user instead of updating password """ import getopt import getpass import sys import os import ZConfig import ZEO def usage(msg): print __doc__ print msg sys.exit(2) def options(args): """Password-specific options loaded from regular ZEO config file.""" try: opts, args = getopt.getopt(args, "dr:p:f:C:", ["configure=", "protocol=", "filename=", "realm"]) except getopt.error, msg: usage(msg) config = None delete = 0 auth_protocol = None auth_db = "" auth_realm = None for k, v in opts: if k == '-C' or k == '--configure': schemafile = os.path.join(os.path.dirname(ZEO.__file__), "schema.xml") schema = ZConfig.loadSchema(schemafile) config, nil = ZConfig.loadConfig(schema, v) if k == '-d' or k == '--delete': delete = 1 if k == '-p' or k == '--protocol': auth_protocol = v if k == '-f' or k == '--filename': auth_db = v if k == '-r' or k == '--realm': auth_realm = v if config is not None: if auth_protocol or auth_db: usage("Error: Conflicting options; use either -C *or* -p and -f") auth_protocol = config.zeo.authentication_protocol auth_db = config.zeo.authentication_database auth_realm = config.zeo.authentication_realm elif not (auth_protocol and auth_db): usage("Error: Must specifiy configuration file or protocol and database") password = None if delete: if not args: usage("Error: Must specify a username to delete") elif len(args) > 1: usage("Error: Too many arguments") username = args[0] else: if not args: usage("Error: Must specify a username") elif len(args) > 2: usage("Error: Too many arguments") elif len(args) == 1: username = args[0] else: username, password = args return auth_protocol, auth_db, auth_realm, delete, username, password def main(args=None, dbclass=None): if args is None: args = sys.argv[1:] p, auth_db, auth_realm, delete, username, password = options(args) if p is None: usage("Error: configuration does not specify auth protocol") if p == "digest": from ZEO.auth.auth_digest import DigestDatabase as Database elif p == "srp": from ZEO.auth.auth_srp import SRPDatabase as Database elif dbclass: # dbclass is used for testing tests.auth_plaintext, see testAuth.py Database = dbclass else: raise ValueError("Unknown database type %r" % p) if auth_db is None: usage("Error: configuration does not specify auth database") db = Database(auth_db, auth_realm) if delete: db.del_user(username) else: if password is None: 
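            # No password was supplied on the command line, so prompt for it
            # interactively (getpass suppresses echo on a terminal).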
password = getpass.getpass("Enter password: ") db.add_user(username, password) db.save() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/zrpc/000077500000000000000000000000001230730566700213125ustar00rootroot00000000000000ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/zrpc/__init__.py000066400000000000000000000021161230730566700234230ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## # zrpc is a package with the following modules # client -- manages connection creation to remote server # connection -- object dispatcher # log -- logging helper # error -- exceptions raised by zrpc # marshal -- internal, handles basic protocol issues # server -- manages incoming connections from remote clients # smac -- sized message async connections # trigger -- medusa's trigger # zrpc is not an advertised subpackage of ZEO; its interfaces are internal ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/zrpc/_hmac.py000066400000000000000000000064041230730566700227370ustar00rootroot00000000000000# This file is a slightly modified copy of Python 2.3's Lib/hmac.py. # This file is under the Python Software Foundation (PSF) license. """HMAC (Keyed-Hashing for Message Authentication) Python module. Implements the HMAC algorithm as described by RFC 2104. """ def _strxor(s1, s2): """Utility method. XOR the two strings s1 and s2 (must have same length). """ return "".join(map(lambda x, y: chr(ord(x) ^ ord(y)), s1, s2)) # The size of the digests returned by HMAC depends on the underlying # hashing module used. digest_size = None class HMAC: """RFC2104 HMAC class. This supports the API for Cryptographic Hash Functions (PEP 247). """ def __init__(self, key, msg = None, digestmod = None): """Create a new HMAC object. key: key for the keyed hash object. msg: Initial input for the hash, if provided. digestmod: A module supporting PEP 247. Defaults to the md5 module. """ if digestmod is None: import md5 digestmod = md5 self.digestmod = digestmod self.outer = digestmod.new() self.inner = digestmod.new() # Python 2.1 and 2.2 differ about the correct spelling try: self.digest_size = digestmod.digestsize except AttributeError: self.digest_size = digestmod.digest_size blocksize = 64 ipad = "\x36" * blocksize opad = "\x5C" * blocksize if len(key) > blocksize: key = digestmod.new(key).digest() key = key + chr(0) * (blocksize - len(key)) self.outer.update(_strxor(key, opad)) self.inner.update(_strxor(key, ipad)) if msg is not None: self.update(msg) ## def clear(self): ## raise NotImplementedError("clear() method not available in HMAC.") def update(self, msg): """Update this hashing object with the string msg. """ self.inner.update(msg) def copy(self): """Return a separate copy of this hashing object. An update to this copy won't affect the original object. 
""" other = HMAC("") other.digestmod = self.digestmod other.inner = self.inner.copy() other.outer = self.outer.copy() return other def digest(self): """Return the hash value of this hashing object. This returns a string containing 8-bit data. The object is not altered in any way by this function; you can continue updating the object after calling this function. """ h = self.outer.copy() h.update(self.inner.digest()) return h.digest() def hexdigest(self): """Like digest(), but returns a string of hexadecimal digits instead. """ return "".join([hex(ord(x))[2:].zfill(2) for x in tuple(self.digest())]) def new(key, msg = None, digestmod = None): """Create a new hashing object and return it. key: The starting key for the hash. msg: if available, will immediately be hashed into the object's starting state. You can now feed arbitrary strings into the object using its update() method, and can ask for the hash value at any time by calling its digest() method. """ return HMAC(key, msg, digestmod) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/zrpc/client.py000066400000000000000000000557101230730566700231520ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import asyncore import errno import logging import select import socket import sys import threading import time import types import ZEO.zrpc.trigger from ZEO.zrpc.connection import ManagedClientConnection from ZEO.zrpc.log import log from ZEO.zrpc.error import DisconnectedError from ZODB.POSException import ReadOnlyError from ZODB.loglevels import BLATHER def client_timeout(): return 30.0 def client_loop(map): read = asyncore.read write = asyncore.write _exception = asyncore._exception while map: try: # The next two lines intentionally don't use # iterators. Other threads can close dispatchers, causeing # the socket map to shrink. r = e = map.keys() w = [fd for (fd, obj) in map.items() if obj.writable()] try: r, w, e = select.select(r, w, e, client_timeout()) except select.error, err: if err[0] != errno.EINTR: if err[0] == errno.EBADF: # If a connection is closed while we are # calling select on it, we can get a bad # file-descriptor error. We'll check for this # case by looking for entries in r and w that # are not in the socket map. if [fd for fd in r if fd not in map]: continue if [fd for fd in w if fd not in map]: continue raise else: continue if not map: break if not (r or w or e): # The line intentionally doesn't use iterators. Other # threads can close dispatchers, causeing the socket # map to shrink. for obj in map.values(): if isinstance(obj, ManagedClientConnection): # Send a heartbeat message as a reply to a # non-existent message id. 
try: obj.send_reply(-1, None) except DisconnectedError: pass continue for fd in r: obj = map.get(fd) if obj is None: continue read(obj) for fd in w: obj = map.get(fd) if obj is None: continue write(obj) for fd in e: obj = map.get(fd) if obj is None: continue _exception(obj) except: if map: try: logging.getLogger(__name__+'.client_loop').critical( 'A ZEO client loop failed.', exc_info=sys.exc_info()) except: pass for fd, obj in map.items(): if not hasattr(obj, 'mgr'): continue try: obj.mgr.client.close() except: map.pop(fd, None) try: logging.getLogger(__name__+'.client_loop' ).critical( "Couldn't close a dispatcher.", exc_info=sys.exc_info()) except: pass class ConnectionManager(object): """Keeps a connection up over time""" sync_wait = 30 def __init__(self, addrs, client, tmin=1, tmax=180): self.client = client self._start_asyncore_loop() self.addrlist = self._parse_addrs(addrs) self.tmin = min(tmin, tmax) self.tmax = tmax self.cond = threading.Condition(threading.Lock()) self.connection = None # Protected by self.cond self.closed = 0 # If thread is not None, then there is a helper thread # attempting to connect. self.thread = None # Protected by self.cond def _start_asyncore_loop(self): self.map = {} self.trigger = ZEO.zrpc.trigger.trigger(self.map) self.loop_thread = threading.Thread( name="%s zeo client networking thread" % self.client.__name__, target=client_loop, args=(self.map,)) self.loop_thread.setDaemon(True) self.loop_thread.start() def __repr__(self): return "<%s for %s>" % (self.__class__.__name__, self.addrlist) def _parse_addrs(self, addrs): # Return a list of (addr_type, addr) pairs. # For backwards compatibility (and simplicity?) the # constructor accepts a single address in the addrs argument -- # a string for a Unix domain socket or a 2-tuple with a # hostname and port. It can also accept a list of such addresses. addr_type = self._guess_type(addrs) if addr_type is not None: return [(addr_type, addrs)] else: addrlist = [] for addr in addrs: addr_type = self._guess_type(addr) if addr_type is None: raise ValueError("unknown address in list: %s" % repr(addr)) addrlist.append((addr_type, addr)) return addrlist def _guess_type(self, addr): if isinstance(addr, types.StringType): return socket.AF_UNIX if (len(addr) == 2 and isinstance(addr[0], types.StringType) and isinstance(addr[1], types.IntType)): return socket.AF_INET # also denotes IPv6 # not anything I know about return None def close(self): """Prevent ConnectionManager from opening new connections""" self.closed = 1 self.cond.acquire() try: t = self.thread self.thread = None finally: self.cond.release() if t is not None: log("CM.close(): stopping and joining thread") t.stop() t.join(30) if t.isAlive(): log("CM.close(): self.thread.join() timed out", level=logging.WARNING) for fd, obj in self.map.items(): if obj is not self.trigger: try: obj.close() except: logging.getLogger(__name__+'.'+self.__class__.__name__ ).critical( "Couldn't close a dispatcher.", exc_info=sys.exc_info()) self.map.clear() self.trigger.pull_trigger() try: self.loop_thread.join(9) except RuntimeError: pass # we are the thread :) self.trigger.close() def attempt_connect(self): """Attempt a connection to the server without blocking too long. There isn't a crisp definition for too long. When a ClientStorage is created, it attempts to connect to the server. If the server isn't immediately available, it can operate from the cache. This method will start the background connection thread and wait a little while to see if it finishes quickly. 
""" # Will a single attempt take too long? # Answer: it depends -- normally, you'll connect or get a # connection refused error very quickly. Packet-eating # firewalls and other mishaps may cause the connect to take a # long time to time out though. It's also possible that you # connect quickly to a slow server, and the attempt includes # at least one roundtrip to the server (the register() call). # But that's as fast as you can expect it to be. self.connect() self.cond.acquire() try: t = self.thread conn = self.connection finally: self.cond.release() if t is not None and conn is None: event = t.one_attempt event.wait() self.cond.acquire() try: conn = self.connection finally: self.cond.release() return conn is not None def connect(self, sync=0): self.cond.acquire() try: if self.connection is not None: return t = self.thread if t is None: log("CM.connect(): starting ConnectThread") self.thread = t = ConnectThread(self, self.client, self.addrlist, self.tmin, self.tmax) t.setDaemon(1) t.start() if sync: while self.connection is None and t.isAlive(): self.cond.wait(self.sync_wait) if self.connection is None: log("CM.connect(sync=1): still waiting...") assert self.connection is not None finally: self.cond.release() def connect_done(self, conn, preferred): # Called by ConnectWrapper.notify_client() after notifying the client log("CM.connect_done(preferred=%s)" % preferred) self.cond.acquire() try: self.connection = conn if preferred: self.thread = None self.cond.notifyAll() # Wake up connect(sync=1) finally: self.cond.release() def close_conn(self, conn): # Called by the connection when it is closed self.cond.acquire() try: if conn is not self.connection: # Closing a non-current connection log("CM.close_conn() non-current", level=BLATHER) return log("CM.close_conn()") self.connection = None finally: self.cond.release() self.client.notifyDisconnected() if not self.closed: self.connect() def is_connected(self): self.cond.acquire() try: return self.connection is not None finally: self.cond.release() # When trying to do a connect on a non-blocking socket, some outcomes # are expected. Set _CONNECT_IN_PROGRESS to the errno value(s) expected # when an initial connect can't complete immediately. Set _CONNECT_OK # to the errno value(s) expected if the connect succeeds *or* if it's # already connected (our code can attempt redundant connects). if hasattr(errno, "WSAEWOULDBLOCK"): # Windows # Caution: The official Winsock docs claim that WSAEALREADY should be # treated as yet another "in progress" indicator, but we've never # seen this. _CONNECT_IN_PROGRESS = (errno.WSAEWOULDBLOCK,) # Win98: WSAEISCONN; Win2K: WSAEINVAL _CONNECT_OK = (0, errno.WSAEISCONN, errno.WSAEINVAL) else: # Unix _CONNECT_IN_PROGRESS = (errno.EINPROGRESS,) _CONNECT_OK = (0, errno.EISCONN) class ConnectThread(threading.Thread): """Thread that tries to connect to server given one or more addresses. The thread is passed a ConnectionManager and the manager's client as arguments. It calls testConnection() on the client when a socket connects; that should return 1 or 0 indicating whether this is a preferred or a fallback connection. It may also raise an exception, in which case the connection is abandoned. The thread will continue to run, attempting connections, until a preferred connection is seen and successfully handed over to the manager and client. 
As soon as testConnection() finds a preferred connection, or after all sockets have been tried and at least one fallback connection has been seen, notifyConnected(connection) is called on the client and connect_done() on the manager. If this was a preferred connection, the thread then exits; otherwise, it keeps trying until it gets a preferred connection, and then reconnects the client using that connection. """ __super_init = threading.Thread.__init__ # We don't expect clients to call any methods of this Thread other # than close() and those defined by the Thread API. def __init__(self, mgr, client, addrlist, tmin, tmax): self.__super_init(name="Connect(%s)" % addrlist) self.mgr = mgr self.client = client self.addrlist = addrlist self.tmin = tmin self.tmax = tmax self.stopped = 0 self.one_attempt = threading.Event() # A ConnectThread keeps track of whether it has finished a # call to try_connecting(). This allows the ConnectionManager # to make an attempt to connect right away, but not block for # too long if the server isn't immediately available. def stop(self): self.stopped = 1 def run(self): delay = self.tmin success = 0 # Don't wait too long the first time. # TODO: make timeout configurable? attempt_timeout = 5 while not self.stopped: success = self.try_connecting(attempt_timeout) if not self.one_attempt.isSet(): self.one_attempt.set() attempt_timeout = 75 if success > 0: break time.sleep(delay) if self.mgr.is_connected(): log("CT: still trying to replace fallback connection", level=logging.INFO) delay = min(delay*2, self.tmax) log("CT: exiting thread: %s" % self.getName()) def try_connecting(self, timeout): """Try connecting to all self.addrlist addresses. Return 1 if a preferred connection was found; 0 if no connection was found; and -1 if a fallback connection was found. If no connection is found within timeout seconds, return 0. """ log("CT: attempting to connect on %d sockets" % len(self.addrlist)) deadline = time.time() + timeout wrappers = self._create_wrappers() for wrap in wrappers.keys(): if wrap.state == "notified": return 1 try: if time.time() > deadline: return 0 r = self._connect_wrappers(wrappers, deadline) if r is not None: return r if time.time() > deadline: return 0 r = self._fallback_wrappers(wrappers, deadline) if r is not None: return r # Alas, no luck. assert not wrappers finally: for wrap in wrappers.keys(): wrap.close() del wrappers return 0 def _expand_addrlist(self): for domain, addr in self.addrlist: # AF_INET really means either IPv4 or IPv6, possibly # indirected by DNS. By design, DNS lookup is deferred # until connections get established, so that DNS # reconfiguration can affect failover if domain == socket.AF_INET: host, port = addr for (family, socktype, proto, cannoname, sockaddr ) in socket.getaddrinfo(host or 'localhost', port): # for IPv6, drop flowinfo, and restrict addresses # to [host]:port yield family, sockaddr[:2] else: yield domain, addr def _create_wrappers(self): # Create socket wrappers wrappers = {} # keys are active wrappers for domain, addr in self._expand_addrlist(): wrap = ConnectWrapper(domain, addr, self.mgr, self.client) wrap.connect_procedure() if wrap.state == "notified": for w in wrappers.keys(): w.close() return {wrap: wrap} if wrap.state != "closed": wrappers[wrap] = wrap return wrappers def _connect_wrappers(self, wrappers, deadline): # Next wait until they all actually connect (or fail) # The deadline is necessary, because we'd wait forever if a # sockets never connects or fails. 
while wrappers: if self.stopped: for wrap in wrappers.keys(): wrap.close() return 0 # Select connecting wrappers connecting = [wrap for wrap in wrappers.keys() if wrap.state == "connecting"] if not connecting: break if time.time() > deadline: break try: r, w, x = select.select([], connecting, connecting, 1.0) log("CT: select() %d, %d, %d" % tuple(map(len, (r,w,x)))) except select.error, msg: log("CT: select failed; msg=%s" % str(msg), level=logging.WARNING) continue # Exceptable wrappers are in trouble; close these suckers for wrap in x: log("CT: closing troubled socket %s" % str(wrap.addr)) del wrappers[wrap] wrap.close() # Writable sockets are connected for wrap in w: wrap.connect_procedure() if wrap.state == "notified": del wrappers[wrap] # Don't close this one for wrap in wrappers.keys(): wrap.close() return 1 if wrap.state == "closed": del wrappers[wrap] def _fallback_wrappers(self, wrappers, deadline): # If we've got wrappers left at this point, they're fallback # connections. Try notifying them until one succeeds. for wrap in wrappers.keys(): assert wrap.state == "tested" and wrap.preferred == 0 if self.mgr.is_connected(): wrap.close() else: wrap.notify_client() if wrap.state == "notified": del wrappers[wrap] # Don't close this one for wrap in wrappers.keys(): wrap.close() return -1 assert wrap.state == "closed" del wrappers[wrap] # TODO: should check deadline class ConnectWrapper: """An object that handles the connection procedure for one socket. This is a little state machine with states: closed opened connecting connected tested notified """ def __init__(self, domain, addr, mgr, client): """Store arguments and create non-blocking socket.""" self.domain = domain self.addr = addr self.mgr = mgr self.client = client # These attributes are part of the interface self.state = "closed" self.sock = None self.conn = None self.preferred = 0 log("CW: attempt to connect to %s" % repr(addr)) try: self.sock = socket.socket(domain, socket.SOCK_STREAM) except socket.error, err: log("CW: can't create socket, domain=%s: %s" % (domain, err), level=logging.ERROR) self.close() return self.sock.setblocking(0) self.state = "opened" def connect_procedure(self): """Call sock.connect_ex(addr) and interpret result.""" if self.state in ("opened", "connecting"): try: err = self.sock.connect_ex(self.addr) except socket.error, msg: log("CW: connect_ex(%r) failed: %s" % (self.addr, msg), level=logging.ERROR) self.close() return log("CW: connect_ex(%s) returned %s" % (self.addr, errno.errorcode.get(err) or str(err))) if err in _CONNECT_IN_PROGRESS: self.state = "connecting" return if err not in _CONNECT_OK: log("CW: error connecting to %s: %s" % (self.addr, errno.errorcode.get(err) or str(err)), level=logging.WARNING) self.close() return self.state = "connected" if self.state == "connected": self.test_connection() def test_connection(self): """Establish and test a connection at the zrpc level. Call the client's testConnection(), giving the client a chance to do app-level check of the connection. 
""" self.conn = ManagedClientConnection(self.sock, self.addr, self.mgr) self.sock = None # The socket is now owned by the connection try: self.preferred = self.client.testConnection(self.conn) self.state = "tested" except ReadOnlyError: log("CW: ReadOnlyError in testConnection (%s)" % repr(self.addr)) self.close() return except: log("CW: error in testConnection (%s)" % repr(self.addr), level=logging.ERROR, exc_info=True) self.close() return if self.preferred: self.notify_client() def notify_client(self): """Call the client's notifyConnected(). If this succeeds, call the manager's connect_done(). If the client is already connected, we assume it's a fallback connection, and the new connection must be a preferred connection. The client will close the old connection. """ try: self.client.notifyConnected(self.conn) except: log("CW: error in notifyConnected (%s)" % repr(self.addr), level=logging.ERROR, exc_info=True) self.close() return self.state = "notified" self.mgr.connect_done(self.conn, self.preferred) def close(self): """Close the socket and reset everything.""" self.state = "closed" self.mgr = self.client = None self.preferred = 0 if self.conn is not None: # Closing the ZRPC connection will eventually close the # socket, somewhere in asyncore. Guido asks: Why do we care? self.conn.close() self.conn = None if self.sock is not None: self.sock.close() self.sock = None def fileno(self): return self.sock.fileno() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/zrpc/connection.py000066400000000000000000000761761230730566700240440ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import asyncore import sys import threading import logging import ZEO.zrpc.marshal import ZEO.zrpc.trigger from ZEO.zrpc import smac from ZEO.zrpc.error import ZRPCError, DisconnectedError from ZEO.zrpc.log import short_repr, log from ZODB.loglevels import BLATHER, TRACE import ZODB.POSException REPLY = ".reply" # message name used for replies exception_type_type = type(Exception) debug_zrpc = False class Delay: """Used to delay response to client for synchronous calls. When a synchronous call is made and the original handler returns without handling the call, it returns a Delay object that prevents the mainloop from sending a response. 
""" msgid = conn = sent = None def set_sender(self, msgid, conn): self.msgid = msgid self.conn = conn def reply(self, obj): self.sent = 'reply' self.conn.send_reply(self.msgid, obj) def error(self, exc_info): self.sent = 'error' log("Error raised in delayed method", logging.ERROR, exc_info=True) self.conn.return_error(self.msgid, *exc_info[:2]) def __repr__(self): return "%s[%s, %r, %r, %r]" % ( self.__class__.__name__, id(self), self.msgid, self.conn, self.sent) class Result(Delay): def __init__(self, *args): self.args = args def set_sender(self, msgid, conn): reply, callback = self.args conn.send_reply(msgid, reply, False) callback() class MTDelay(Delay): def __init__(self): self.ready = threading.Event() def set_sender(self, *args): Delay.set_sender(self, *args) self.ready.set() def reply(self, obj): self.ready.wait() self.conn.call_from_thread(self.conn.send_reply, self.msgid, obj) def error(self, exc_info): self.ready.wait() self.conn.call_from_thread(Delay.error, self, exc_info) # PROTOCOL NEGOTIATION # # The code implementing protocol version 2.0.0 (which is deployed # in the field and cannot be changed) *only* talks to peers that # send a handshake indicating protocol version 2.0.0. In that # version, both the client and the server immediately send out # their protocol handshake when a connection is established, # without waiting for their peer, and disconnect when a different # handshake is receive. # # The new protocol uses this to enable new clients to talk to # 2.0.0 servers. In the new protocol: # # The server sends its protocol handshake to the client at once. # # The client waits until it receives the server's protocol handshake # before sending its own handshake. The client sends the lower of its # own protocol version and the server protocol version, allowing it to # talk to servers using later protocol versions (2.0.2 and higher) as # well: the effective protocol used will be the lower of the client # and server protocol. However, this changed in ZODB 3.3.1 (and # should have changed in ZODB 3.3) because an older server doesn't # support MVCC methods required by 3.3 clients. # # [Ugly details: In order to treat the first received message (protocol # handshake) differently than all later messages, both client and server # start by patching their message_input() method to refer to their # recv_handshake() method instead. In addition, the client has to arrange # to queue (delay) outgoing messages until it receives the server's # handshake, so that the first message the client sends to the server is # the client's handshake. This multiply-special treatment of the first # message is delicate, and several asyncore and thread subtleties were # handled unsafely before ZODB 3.2.6. # ] # # The ZEO modules ClientStorage and ServerStub have backwards # compatibility code for dealing with the previous version of the # protocol. The client accepts the old version of some messages, # and will not send new messages when talking to an old server. # # As long as the client hasn't sent its handshake, it can't send # anything else; output messages are queued during this time. # (Output can happen because the connection testing machinery can # start sending requests before the handshake is received.) # # UPGRADING FROM ZEO 2.0.0 TO NEWER VERSIONS: # # Because a new client can talk to an old server, but not vice # versa, all clients should be upgraded before upgrading any # servers. 
Protocol upgrades beyond 2.0.1 will not have this # restriction, because clients using protocol 2.0.1 or later can # talk to both older and newer servers. # # No compatibility with protocol version 1 is provided. # Connection is abstract (it must be derived from). ManagedServerConnection # and ManagedClientConnection are the concrete subclasses. They need to # supply a handshake() method appropriate for their role in protocol # negotiation. class Connection(smac.SizedMessageAsyncConnection, object): """Dispatcher for RPC on object on both sides of socket. The connection supports synchronous calls, which expect a return, and asynchronous calls, which do not. It uses the Marshaller class to handle encoding and decoding of method calls and arguments. Marshaller uses pickle to encode arbitrary Python objects. The code here doesn't ever see the wire format. A Connection is designed for use in a multithreaded application, where a synchronous call must block until a response is ready. A socket connection between a client and a server allows either side to invoke methods on the other side. The processes on each end of the socket use a Connection object to manage communication. The Connection deals with decoded RPC messages. They are represented as four-tuples containing: msgid, flags, method name, and a tuple of method arguments. The msgid starts at zero and is incremented by one each time a method call message is sent. Each side of the connection has a separate msgid state. When one side of the connection (the client) calls a method, it sends a message with a new msgid. The other side (the server), replies with a message that has the same msgid, the string ".reply" (the global variable REPLY) as the method name, and the actual return value in the args position. Note that each side of the Connection can initiate a call, in which case it will be the client for that particular call. The protocol also supports asynchronous calls. The client does not wait for a return value for an asynchronous call. If a method call raises an Exception, the exception is propagated back to the client via the REPLY message. The client side will raise any exception it receives instead of returning the value to the caller. """ __super_init = smac.SizedMessageAsyncConnection.__init__ __super_close = smac.SizedMessageAsyncConnection.close __super_setSessionKey = smac.SizedMessageAsyncConnection.setSessionKey # Protocol history: # # Z200 -- Original ZEO 2.0 protocol # # Z201 -- Added invalidateTransaction() to client. # Renamed several client methods. # Added several sever methods: # lastTransaction() # getAuthProtocol() and scheme-specific authentication methods # getExtensionMethods(). # getInvalidations(). # # Z303 -- named after the ZODB release 3.3 # Added methods for MVCC: # loadBefore() # A Z303 client cannot talk to a Z201 server, because the latter # doesn't support MVCC. A Z201 client can talk to a Z303 server, # but because (at least) the type of the root object changed # from ZODB.PersistentMapping to persistent.mapping, the older # client can't actually make progress if a Z303 client created, # or ever modified, the root. 
# # Z308 -- named after the ZODB release 3.8 # Added blob-support server methods: # sendBlob # storeBlobStart # storeBlobChunk # storeBlobEnd # storeBlobShared # Added blob-support client methods: # receiveBlobStart # receiveBlobChunk # receiveBlobStop # # Z309 -- named after the ZODB release 3.9 # New server methods: # restorea, iterator_start, iterator_next, # iterator_record_start, iterator_record_next, # iterator_gc # # Z310 -- named after the ZODB release 3.10 # New server methods: # undoa # Doesn't support undo for older clients. # Undone oid info returned by vote. # # Z3101 -- checkCurrentSerialInTransaction # Protocol variables: # Our preferred protocol. current_protocol = "Z3101" # If we're a client, an exhaustive list of the server protocols we # can accept. servers_we_can_talk_to = ["Z308", "Z309", "Z310", current_protocol] # If we're a server, an exhaustive list of the client protocols we # can accept. clients_we_can_talk_to = [ "Z200", "Z201", "Z303", "Z308", "Z309", "Z310", current_protocol] # This is pretty excruciating. Details: # # 3.3 server 3.2 client # server sends Z303 to client # client computes min(Z303, Z201) == Z201 as the protocol to use # client sends Z201 to server # OK, because Z201 is in the server's clients_we_can_talk_to # # 3.2 server 3.3 client # server sends Z201 to client # client computes min(Z303, Z201) == Z201 as the protocol to use # Z201 isn't in the client's servers_we_can_talk_to, so client # raises exception # # 3.3 server 3.3 client # server sends Z303 to client # client computes min(Z303, Z303) == Z303 as the protocol to use # Z303 is in the client's servers_we_can_talk_to, so client # sends Z303 to server # OK, because Z303 is in the server's clients_we_can_talk_to # Exception types that should not be logged: unlogged_exception_types = () # Client constructor passes 'C' for tag, server constructor 'S'. This # is used in log messages, and to determine whether we can speak with # our peer. def __init__(self, sock, addr, obj, tag, map=None): self.obj = None self.decode = ZEO.zrpc.marshal.decode self.encode = ZEO.zrpc.marshal.encode self.fast_encode = ZEO.zrpc.marshal.fast_encode self.closed = False self.peer_protocol_version = None # set in recv_handshake() assert tag in "CS" self.tag = tag self.logger = logging.getLogger('ZEO.zrpc.Connection(%c)' % tag) if isinstance(addr, tuple): self.log_label = "(%s:%d) " % addr else: self.log_label = "(%s) " % addr # Supply our own socket map, so that we don't get registered with # the asyncore socket map just yet. The initial protocol messages # are treated very specially, and we dare not get invoked by asyncore # before that special-case setup is complete. Some of that setup # occurs near the end of this constructor, and the rest is done by # a concrete subclass's handshake() method. Unfortunately, because # we ultimately derive from asyncore.dispatcher, it's not possible # to invoke the superclass constructor without asyncore stuffing # us into _some_ socket map. ourmap = {} self.__super_init(sock, addr, map=ourmap) # The singleton dict is used in synchronous mode when a method # needs to call into asyncore to try to force some I/O to occur. # The singleton dict is a socket map containing only this object. self._singleton = {self._fileno: self} # waiting_for_reply is used internally to indicate whether # a call is in progress. setting a session key is deferred # until after the call returns. 
self.waiting_for_reply = False self.delay_sesskey = None self.register_object(obj) # The first message we see is a protocol handshake. message_input() # is temporarily replaced by recv_handshake() to treat that message # specially. revc_handshake() does "del self.message_input", which # uncovers the normal message_input() method thereafter. self.message_input = self.recv_handshake # Server and client need to do different things for protocol # negotiation, and handshake() is implemented differently in each. self.handshake() # Now it's safe to register with asyncore's socket map; it was not # safe before message_input was replaced, or before handshake() was # invoked. # Obscure: in Python 2.4, the base asyncore.dispatcher class grew # a ._map attribute, which is used instead of asyncore's global # socket map when ._map isn't None. Because we passed `ourmap` to # the base class constructor above, in 2.4 asyncore believes we want # to use `ourmap` instead of the global socket map -- but we don't. # So we have to replace our ._map with the global socket map, and # update the global socket map with `ourmap`. Replacing our ._map # isn't necessary before Python 2.4, but doesn't hurt then (it just # gives us an unused attribute in 2.3); updating the global socket # map is necessary regardless of Python version. if map is None: map = asyncore.socket_map self._map = map map.update(ourmap) def __repr__(self): return "<%s %s>" % (self.__class__.__name__, self.addr) __str__ = __repr__ # Defeat asyncore's dreaded __getattr__ def log(self, message, level=BLATHER, exc_info=False): self.logger.log(level, self.log_label + message, exc_info=exc_info) def close(self): self.mgr.close_conn(self) if self.closed: return self._singleton.clear() self.closed = True self.__super_close() self.trigger.pull_trigger() def register_object(self, obj): """Register obj as the true object to invoke methods on.""" self.obj = obj # Subclass must implement. handshake() is called by the constructor, # near its end, but before self is added to asyncore's socket map. # When a connection is created the first message sent is a 4-byte # protocol version. This allows the protocol to evolve over time, and # lets servers handle clients using multiple versions of the protocol. # In general, the server's handshake() just needs to send the server's # preferred protocol; the client's also needs to queue (delay) outgoing # messages until it sees the handshake from the server. def handshake(self): raise NotImplementedError # Replaces message_input() for the first message received. Records the # protocol sent by the peer in `peer_protocol_version`, restores the # normal message_input() method, and raises an exception if the peer's # protocol is unacceptable. That's all the server needs to do. The # client needs to do additional work in response to the server's # handshake, and extends this method. def recv_handshake(self, proto): # Extended by ManagedClientConnection. 
del self.message_input # uncover normal-case message_input() self.peer_protocol_version = proto if self.tag == 'C': good_protos = self.servers_we_can_talk_to else: assert self.tag == 'S' good_protos = self.clients_we_can_talk_to if proto in good_protos: self.log("received handshake %r" % proto, level=logging.INFO) else: self.log("bad handshake %s" % short_repr(proto), level=logging.ERROR) raise ZRPCError("bad handshake %r" % proto) def message_input(self, message): """Decode an incoming message and dispatch it""" # If something goes wrong during decoding, the marshaller # will raise an exception. The exception will ultimately # result in asycnore calling handle_error(), which will # close the connection. msgid, async, name, args = self.decode(message) if debug_zrpc: self.log("recv msg: %s, %s, %s, %s" % (msgid, async, name, short_repr(args)), level=TRACE) if name == 'loadEx': # Special case and inline the heck out of load case: try: ret = self.obj.loadEx(*args) except (SystemExit, KeyboardInterrupt): raise except Exception, msg: if not isinstance(msg, self.unlogged_exception_types): self.log("%s() raised exception: %s" % (name, msg), logging.ERROR, exc_info=True) self.return_error(msgid, *sys.exc_info()[:2]) else: try: self.message_output(self.fast_encode(msgid, 0, REPLY, ret)) self.poll() except: # Fall back to normal version for better error handling self.send_reply(msgid, ret) elif name == REPLY: assert not async self.handle_reply(msgid, args) else: self.handle_request(msgid, async, name, args) def handle_request(self, msgid, async, name, args): obj = self.obj if name.startswith('_') or not hasattr(obj, name): if obj is None: if debug_zrpc: self.log("no object calling %s%s" % (name, short_repr(args)), level=logging.DEBUG) return msg = "Invalid method name: %s on %s" % (name, repr(obj)) raise ZRPCError(msg) if debug_zrpc: self.log("calling %s%s" % (name, short_repr(args)), level=logging.DEBUG) meth = getattr(obj, name) try: self.waiting_for_reply = True try: ret = meth(*args) finally: self.waiting_for_reply = False except (SystemExit, KeyboardInterrupt): raise except Exception, msg: if not isinstance(msg, self.unlogged_exception_types): self.log("%s() raised exception: %s" % (name, msg), logging.ERROR, exc_info=True) error = sys.exc_info()[:2] if async: self.log("Asynchronous call raised exception: %s" % self, level=logging.ERROR, exc_info=True) else: self.return_error(msgid, *error) return if async: if ret is not None: raise ZRPCError("async method %s returned value %s" % (name, short_repr(ret))) else: if debug_zrpc: self.log("%s returns %s" % (name, short_repr(ret)), logging.DEBUG) if isinstance(ret, Delay): ret.set_sender(msgid, self) else: self.send_reply(msgid, ret, not self.delay_sesskey) if self.delay_sesskey: self.__super_setSessionKey(self.delay_sesskey) self.delay_sesskey = None def return_error(self, msgid, err_type, err_value): # Note that, ideally, this should be defined soley for # servers, but a test arranges to get it called on # a client. Too much trouble to fix it now. :/ if not isinstance(err_value, Exception): err_value = err_type, err_value # encode() can pass on a wide variety of exceptions from cPickle. # While a bare `except` is generally poor practice, in this case # it's acceptable -- we really do want to catch every exception # cPickle may raise. 
try: msg = self.encode(msgid, 0, REPLY, (err_type, err_value)) except: # see above try: r = short_repr(err_value) except: r = "" err = ZRPCError("Couldn't pickle error %.100s" % r) msg = self.encode(msgid, 0, REPLY, (ZRPCError, err)) self.message_output(msg) self.poll() def handle_error(self): if sys.exc_info()[0] == SystemExit: raise sys.exc_info() self.log("Error caught in asyncore", level=logging.ERROR, exc_info=True) self.close() def setSessionKey(self, key): if self.waiting_for_reply: self.delay_sesskey = key else: self.__super_setSessionKey(key) def send_call(self, method, args, async=False): # send a message and return its msgid if async: msgid = 0 else: msgid = self._new_msgid() if debug_zrpc: self.log("send msg: %d, %d, %s, ..." % (msgid, async, method), level=TRACE) buf = self.encode(msgid, async, method, args) self.message_output(buf) return msgid def callAsync(self, method, *args): if self.closed: raise DisconnectedError() self.send_call(method, args, 1) self.poll() def callAsyncNoPoll(self, method, *args): # Like CallAsync but doesn't poll. This exists so that we can # send invalidations atomically to all clients without # allowing any client to sneak in a load request. if self.closed: raise DisconnectedError() self.send_call(method, args, 1) def callAsyncNoSend(self, method, *args): # Like CallAsync but doesn't poll. This exists so that we can # send invalidations atomically to all clients without # allowing any client to sneak in a load request. if self.closed: raise DisconnectedError() self.send_call(method, args, 1) self.call_from_thread() def callAsyncIterator(self, iterator): """Queue a sequence of calls using an iterator The calls will not be interleaved with other calls from the same client. """ self.message_output(self.encode(0, 1, method, args) for method, args in iterator) def handle_reply(self, msgid, ret): assert msgid == -1 and ret is None def poll(self): """Invoke asyncore mainloop to get pending message out.""" if debug_zrpc: self.log("poll()", level=TRACE) self.trigger.pull_trigger() # import cProfile, time class ManagedServerConnection(Connection): """Server-side Connection subclass.""" # Exception types that should not be logged: unlogged_exception_types = (ZODB.POSException.POSKeyError, ) def __init__(self, sock, addr, obj, mgr): self.mgr = mgr map = {} Connection.__init__(self, sock, addr, obj, 'S', map=map) self.decode = ZEO.zrpc.marshal.server_decode self.trigger = ZEO.zrpc.trigger.trigger(map) self.call_from_thread = self.trigger.pull_trigger t = threading.Thread(target=server_loop, args=(map,)) t.setDaemon(True) t.start() # self.profile = cProfile.Profile() # def message_input(self, message): # self.profile.enable() # try: # Connection.message_input(self, message) # finally: # self.profile.disable() def handshake(self): # Send the server's preferred protocol to the client. self.message_output(self.current_protocol) def recv_handshake(self, proto): Connection.recv_handshake(self, proto) self.obj.notifyConnected(self) def close(self): self.obj.notifyDisconnected() Connection.close(self) # self.profile.dump_stats(str(time.time())+'.stats') def send_reply(self, msgid, ret, immediately=True): # encode() can pass on a wide variety of exceptions from cPickle. # While a bare `except` is generally poor practice, in this case # it's acceptable -- we really do want to catch every exception # cPickle may raise. 
try: msg = self.encode(msgid, 0, REPLY, ret) except: # see above try: r = short_repr(ret) except: r = "" err = ZRPCError("Couldn't pickle return %.100s" % r) msg = self.encode(msgid, 0, REPLY, (ZRPCError, err)) self.message_output(msg) if immediately: self.poll() poll = smac.SizedMessageAsyncConnection.handle_write def server_loop(map): while len(map) > 1: asyncore.poll(30.0, map) for o in map.values(): o.close() class ManagedClientConnection(Connection): """Client-side Connection subclass.""" __super_init = Connection.__init__ base_message_output = Connection.message_output def __init__(self, sock, addr, mgr): self.mgr = mgr # We can't use the base smac's message_output directly because the # client needs to queue outgoing messages until it's seen the # initial protocol handshake from the server. So we have our own # message_ouput() method, and support for initial queueing. This is # a delicate design, requiring an output mutex to be wholly # thread-safe. # Caution: we must set this up before calling the base class # constructor, because the latter registers us with asyncore; # we need to guarantee that we'll queue outgoing messages before # asyncore learns about us. self.output_lock = threading.Lock() self.queue_output = True self.queued_messages = [] # msgid_lock guards access to msgid self.msgid = 0 self.msgid_lock = threading.Lock() # replies_cond is used to block when a synchronous call is # waiting for a response self.replies_cond = threading.Condition() self.replies = {} self.__super_init(sock, addr, None, tag='C', map=mgr.map) self.trigger = mgr.trigger self.call_from_thread = self.trigger.pull_trigger self.call_from_thread() def close(self): Connection.close(self) self.replies_cond.acquire() self.replies_cond.notifyAll() self.replies_cond.release() # Our message_ouput() queues messages until recv_handshake() gets the # protocol handshake from the server. def message_output(self, message): self.output_lock.acquire() try: if self.queue_output: self.queued_messages.append(message) else: assert not self.queued_messages self.base_message_output(message) finally: self.output_lock.release() def handshake(self): # The client waits to see the server's handshake. Outgoing messages # are queued for the duration. The client will send its own # handshake after the server's handshake is seen, in recv_handshake() # below. It will then send any messages queued while waiting. assert self.queue_output # the constructor already set this def recv_handshake(self, proto): # The protocol to use is the older of our and the server's preferred # protocols. proto = min(proto, self.current_protocol) # Restore the normal message_input method, and raise an exception # if the protocol version is too old. Connection.recv_handshake(self, proto) # Tell the server the protocol in use, then send any messages that # were queued while waiting to hear the server's protocol, and stop # queueing messages. 
self.output_lock.acquire() try: self.base_message_output(proto) for message in self.queued_messages: self.base_message_output(message) self.queued_messages = [] self.queue_output = False finally: self.output_lock.release() def _new_msgid(self): self.msgid_lock.acquire() try: msgid = self.msgid self.msgid = self.msgid + 1 return msgid finally: self.msgid_lock.release() def call(self, method, *args): if self.closed: raise DisconnectedError() msgid = self.send_call(method, args) r_args = self.wait(msgid) if (isinstance(r_args, tuple) and len(r_args) > 1 and type(r_args[0]) == exception_type_type and issubclass(r_args[0], Exception)): inst = r_args[1] raise inst # error raised by server else: return r_args def wait(self, msgid): """Invoke asyncore mainloop and wait for reply.""" if debug_zrpc: self.log("wait(%d)" % msgid, level=TRACE) self.trigger.pull_trigger() self.replies_cond.acquire() try: while 1: if self.closed: raise DisconnectedError() reply = self.replies.get(msgid, self) if reply is not self: del self.replies[msgid] if debug_zrpc: self.log("wait(%d): reply=%s" % (msgid, short_repr(reply)), level=TRACE) return reply self.replies_cond.wait() finally: self.replies_cond.release() # For testing purposes, it is useful to begin a synchronous call # but not block waiting for its response. def _deferred_call(self, method, *args): if self.closed: raise DisconnectedError() msgid = self.send_call(method, args) self.trigger.pull_trigger() return msgid def _deferred_wait(self, msgid): r_args = self.wait(msgid) if (isinstance(r_args, tuple) and type(r_args[0]) == exception_type_type and issubclass(r_args[0], Exception)): inst = r_args[1] raise inst # error raised by server else: return r_args def handle_reply(self, msgid, args): if debug_zrpc: self.log("recv reply: %s, %s" % (msgid, short_repr(args)), level=TRACE) self.replies_cond.acquire() try: self.replies[msgid] = args self.replies_cond.notifyAll() finally: self.replies_cond.release() def send_reply(self, msgid, ret): # Whimper. Used to send heartbeat assert msgid == -1 and ret is None self.message_output('(J\xff\xff\xff\xffK\x00U\x06.replyNt.') ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/zrpc/error.py000066400000000000000000000020721230730566700230160ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## from ZODB import POSException from ZEO.Exceptions import ClientDisconnected class ZRPCError(POSException.StorageError): pass class DisconnectedError(ZRPCError, ClientDisconnected): """The database storage is disconnected from the storage server. The error occurred because a problem in the low-level RPC connection, or because the connection was closed. """ # This subclass is raised when zrpc catches the error. 
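# A minimal standalone sketch of the wire format used above: the hard-coded
# heartbeat string emitted by ManagedClientConnection.send_reply() is an
# ordinary zrpc message, i.e. a binary (protocol 1) pickle of the
# (msgid, flags, name, args) 4-tuple described in Connection's docstring.
# It can be inspected with the standard library alone; ZEO.zrpc.marshal's
# decode() does the same unpickling with a restricted find_global.  The
# name HEARTBEAT below is purely illustrative.

import cPickle

HEARTBEAT = '(J\xff\xff\xff\xffK\x00U\x06.replyNt.'

msgid, flags, name, args = cPickle.loads(HEARTBEAT)
assert (msgid, flags, name, args) == (-1, 0, '.reply', None)

# msgid -1 refers to a call that was never made, so the receiving side's
# base Connection.handle_reply() simply checks for exactly this pair and
# drops it; the message only serves as a keep-alive.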
ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/zrpc/log.py000066400000000000000000000046771230730566700224630ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import os import threading import logging from ZODB.loglevels import BLATHER LOG_THREAD_ID = 0 # Set this to 1 during heavy debugging logger = logging.getLogger('ZEO.zrpc') _label = "%s" % os.getpid() def new_label(): global _label _label = str(os.getpid()) def log(message, level=BLATHER, label=None, exc_info=False): label = label or _label if LOG_THREAD_ID: label = label + ':' + threading.currentThread().getName() logger.log(level, '(%s) %s' % (label, message), exc_info=exc_info) REPR_LIMIT = 60 def short_repr(obj): "Return an object repr limited to REPR_LIMIT bytes." # Some of the objects being repr'd are large strings. A lot of memory # would be wasted to repr them and then truncate, so they are treated # specially in this function. # Also handle short repr of a tuple containing a long string. # This strategy works well for arguments to StorageServer methods. # The oid is usually first and will get included in its entirety. # The pickle is near the beginning, too, and you can often fit the # module name in the pickle. if isinstance(obj, str): if len(obj) > REPR_LIMIT: r = repr(obj[:REPR_LIMIT]) else: r = repr(obj) if len(r) > REPR_LIMIT: r = r[:REPR_LIMIT-4] + '...' + r[-1] return r elif isinstance(obj, (list, tuple)): elts = [] size = 0 for elt in obj: r = short_repr(elt) elts.append(r) size += len(r) if size > REPR_LIMIT: break if isinstance(obj, tuple): r = "(%s)" % (", ".join(elts)) else: r = "[%s]" % (", ".join(elts)) else: r = repr(obj) if len(r) > REPR_LIMIT: return r[:REPR_LIMIT] + '...' else: return r ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/zrpc/marshal.py000066400000000000000000000071611230730566700233200ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## from cPickle import Unpickler, Pickler from cStringIO import StringIO import logging from ZEO.zrpc.error import ZRPCError from ZEO.zrpc.log import log, short_repr def encode(*args): # args: (msgid, flags, name, args) # (We used to have a global pickler, but that's not thread-safe. :-( ) # It's not thread safe if, in the couse of pickling, we call the # Python interpeter, which releases the GIL. 
# Note that args may contain very large binary pickles already; for # this reason, it's important to use proto 1 (or higher) pickles here # too. For a long time, this used proto 0 pickles, and that can # bloat our pickle to 4x the size (due to high-bit and control bytes # being represented by \xij escapes in proto 0). # Undocumented: cPickle.Pickler accepts a lone protocol argument; # pickle.py does not. pickler = Pickler(1) pickler.fast = 1 return pickler.dump(args, 1) @apply def fast_encode(): # Only use in cases where you *know* the data contains only basic # Python objects pickler = Pickler(1) pickler.fast = 1 dump = pickler.dump def fast_encode(*args): return dump(args, 1) return fast_encode def decode(msg): """Decodes msg and returns its parts""" unpickler = Unpickler(StringIO(msg)) unpickler.find_global = find_global try: return unpickler.load() # msgid, flags, name, args except: log("can't decode message: %s" % short_repr(msg), level=logging.ERROR) raise def server_decode(msg): """Decodes msg and returns its parts""" unpickler = Unpickler(StringIO(msg)) unpickler.find_global = server_find_global try: return unpickler.load() # msgid, flags, name, args except: log("can't decode message: %s" % short_repr(msg), level=logging.ERROR) raise _globals = globals() _silly = ('__doc__',) exception_type_type = type(Exception) def find_global(module, name): """Helper for message unpickler""" try: m = __import__(module, _globals, _globals, _silly) except ImportError, msg: raise ZRPCError("import error %s: %s" % (module, msg)) try: r = getattr(m, name) except AttributeError: raise ZRPCError("module %s has no global %s" % (module, name)) safe = getattr(r, '__no_side_effects__', 0) if safe: return r # TODO: is there a better way to do this? if type(r) == exception_type_type and issubclass(r, Exception): return r raise ZRPCError("Unsafe global: %s.%s" % (module, name)) def server_find_global(module, name): """Helper for message unpickler""" try: if module != 'ZopeUndo.Prefix': raise ImportError m = __import__(module, _globals, _globals, _silly) except ImportError, msg: raise ZRPCError("import error %s: %s" % (module, msg)) try: r = getattr(m, name) except AttributeError: raise ZRPCError("module %s has no global %s" % (module, name)) return r ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/zrpc/server.py000066400000000000000000000077401230730566700232020ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import asyncore import socket import types # _has_dualstack: True if the dual-stack sockets are supported try: # Check whether IPv6 sockets can be created s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM) except (socket.error, AttributeError): _has_dualstack = False else: # Check whether enabling dualstack (disabling v6only) works try: s.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, False) except (socket.error, AttributeError): _has_dualstack = False else: _has_dualstack = True s.close() del s from ZEO.zrpc.connection import Connection from ZEO.zrpc.log import log import ZEO.zrpc.log import logging # Export the main asyncore loop loop = asyncore.loop class Dispatcher(asyncore.dispatcher): """A server that accepts incoming RPC connections""" __super_init = asyncore.dispatcher.__init__ def __init__(self, addr, factory=Connection): self.__super_init() self.addr = addr self.factory = factory self._open_socket() def _open_socket(self): if type(self.addr) == types.TupleType: if self.addr[0] == '' and _has_dualstack: # Wildcard listen on all interfaces, both IPv4 and # IPv6 if possible self.create_socket(socket.AF_INET6, socket.SOCK_STREAM) self.socket.setsockopt( socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, False) elif ':' in self.addr[0]: self.create_socket(socket.AF_INET6, socket.SOCK_STREAM) if _has_dualstack: # On Linux, IPV6_V6ONLY is off by default. # If the user explicitly asked for IPv6, don't bind to IPv4 self.socket.setsockopt( socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, True) else: self.create_socket(socket.AF_INET, socket.SOCK_STREAM) else: self.create_socket(socket.AF_UNIX, socket.SOCK_STREAM) self.set_reuse_addr() log("listening on %s" % str(self.addr), logging.INFO) self.bind(self.addr) self.listen(5) def writable(self): return 0 def readable(self): return 1 def handle_accept(self): try: sock, addr = self.accept() except socket.error, msg: log("accepted failed: %s" % msg) return # We could short-circuit the attempt below in some edge cases # and avoid a log message by checking for addr being None. # Unfortunately, our test for the code below, # quick_close_doesnt_kill_server, causes addr to be None and # we'd have to write a test for the non-None case, which is # *even* harder to provoke. :/ So we'll leave things as they # are for now. # It might be better to check whether the socket has been # closed, but I don't see a way to do that. :( # Drop flow-info from IPv6 addresses if addr: # Sometimes None on Mac. See above. addr = addr[:2] try: c = self.factory(sock, addr) except: if sock.fileno() in asyncore.socket_map: del asyncore.socket_map[sock.fileno()] ZEO.zrpc.log.logger.exception("Error in handle_accept") else: log("connect from %s: %s" % (repr(addr), c)) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/zrpc/smac.py000066400000000000000000000302731230730566700226140ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Sized Message Async Connections. This class extends the basic asyncore layer with a record-marking layer. The message_output() method accepts an arbitrary sized string as its argument. It sends over the wire the length of the string encoded using struct.pack('>I') and the string itself. The receiver passes the original string to message_input(). This layer also supports an optional message authentication code (MAC). If a session key is present, it uses HMAC-SHA-1 to generate a 20-byte MAC. If a MAC is present, the high-order bit of the length is set to 1 and the MAC immediately follows the length. """ import asyncore import errno try: import hmac except ImportError: import _hmac as hmac import socket import struct import threading from types import StringType from ZEO.zrpc.log import log from ZEO.zrpc.error import DisconnectedError import ZEO.hash # Use the dictionary to make sure we get the minimum number of errno # entries. We expect that EWOULDBLOCK == EAGAIN on most systems -- # or that only one is actually used. tmp_dict = {errno.EWOULDBLOCK: 0, errno.EAGAIN: 0, errno.EINTR: 0, } expected_socket_read_errors = tuple(tmp_dict.keys()) tmp_dict = {errno.EAGAIN: 0, errno.EWOULDBLOCK: 0, errno.ENOBUFS: 0, errno.EINTR: 0, } expected_socket_write_errors = tuple(tmp_dict.keys()) del tmp_dict # We chose 60000 as the socket limit by looking at the largest strings # that we could pass to send() without blocking. SEND_SIZE = 60000 MAC_BIT = 0x80000000L _close_marker = object() class SizedMessageAsyncConnection(asyncore.dispatcher): __super_init = asyncore.dispatcher.__init__ __super_close = asyncore.dispatcher.close __closed = True # Marker indicating that we're closed socket = None # to outwit Sam's getattr def __init__(self, sock, addr, map=None): self.addr = addr # __input_lock protects __inp, __input_len, __state, __msg_size self.__input_lock = threading.Lock() self.__inp = None # None, a single String, or a list self.__input_len = 0 # Instance variables __state, __msg_size and __has_mac work together: # when __state == 0: # __msg_size == 4, and the next thing read is a message size; # __has_mac is set according to the MAC_BIT in the header # when __state == 1: # __msg_size is variable, and the next thing read is a message. # __has_mac indicates if we're in MAC mode or not (and # therefore, if we need to check the mac header) # The next thing read is always of length __msg_size. # The state alternates between 0 and 1. self.__state = 0 self.__has_mac = 0 self.__msg_size = 4 self.__output_messages = [] self.__output = [] self.__closed = False # Each side of the connection sends and receives messages. A # MAC is generated for each message and depends on each # previous MAC; the state of the MAC generator depends on the # history of operations it has performed. So the MACs must be # generated in the same order they are verified. # Each side is guaranteed to receive messages in the order # they are sent, but there is no ordering constraint between # message sends and receives. If the two sides are A and B # and message An indicates the nth message sent by A, then # A1 A2 B1 B2 and A1 B1 B2 A2 are both legitimate total # orderings of the messages. 
# As a result, there must be seperate MAC generators for each # side of the connection. If not, the generator state would # be different after A1 A2 B1 B2 than it would be after # A1 B1 B2 A2; if the generator state was different, the MAC # could not be verified. self.__hmac_send = None self.__hmac_recv = None self.__super_init(sock, map) # asyncore overwrites addr with the getpeername result # restore our value self.addr = addr def setSessionKey(self, sesskey): log("set session key %r" % sesskey) # Low-level construction is now delayed until data are sent. # This is to allow use of iterators that generate messages # only when we're ready to do I/O so that we can effeciently # transmit large files. Because we delay messages, we also # have to delay setting the session key to retain proper # ordering. # The low-level output queue supports strings, a special close # marker, and iterators. It doesn't support callbacks. We # can create a allback by providing an iterator that doesn't # yield anything. # The hack fucntion below is a callback in iterator's # clothing. :) It never yields anything, but is a generator # and thus iterator, because it contains a yield statement. def hack(): self.__hmac_send = hmac.HMAC(sesskey, digestmod=ZEO.hash) self.__hmac_recv = hmac.HMAC(sesskey, digestmod=ZEO.hash) if False: yield '' self.message_output(hack()) def get_addr(self): return self.addr # TODO: avoid expensive getattr calls? Can't remember exactly what # this comment was supposed to mean, but it has something to do # with the way asyncore uses getattr and uses if sock: def __nonzero__(self): return 1 def handle_read(self): self.__input_lock.acquire() try: # Use a single __inp buffer and integer indexes to make this fast. try: d = self.recv(8192) except socket.error, err: if err[0] in expected_socket_read_errors: return raise if not d: return input_len = self.__input_len + len(d) msg_size = self.__msg_size state = self.__state has_mac = self.__has_mac inp = self.__inp if msg_size > input_len: if inp is None: self.__inp = d elif type(self.__inp) is StringType: self.__inp = [self.__inp, d] else: self.__inp.append(d) self.__input_len = input_len return # keep waiting for more input # load all previous input and d into single string inp if isinstance(inp, StringType): inp = inp + d elif inp is None: inp = d else: inp.append(d) inp = "".join(inp) offset = 0 while (offset + msg_size) <= input_len: msg = inp[offset:offset + msg_size] offset = offset + msg_size if not state: msg_size = struct.unpack(">I", msg)[0] has_mac = msg_size & MAC_BIT if has_mac: msg_size ^= MAC_BIT msg_size += 20 elif self.__hmac_send: raise ValueError("Received message without MAC") state = 1 else: msg_size = 4 state = 0 # Obscure: We call message_input() with __input_lock # held!!! And message_input() may end up calling # message_output(), which has its own lock. But # message_output() cannot call message_input(), so # the locking order is always consistent, which # prevents deadlock. Also, message_input() may # take a long time, because it can cause an # incoming call to be handled. During all this # time, the __input_lock is held. That's a good # thing, because it serializes incoming calls. 
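                    # At this point msg holds one complete frame body.  For a
                    # MAC-bearing frame, the first 20 bytes are the peer's
                    # HMAC-SHA-1 digest and the remainder is the payload that
                    # is handed to message_input() below.  A hedged sketch of
                    # the overall wire layout (see the module docstring and
                    # __message_output; the sample values are illustrative
                    # only):
                    #
                    #   no session key:
                    #       struct.pack(">I", len(payload)) + payload
                    #   with a session key:
                    #       struct.pack(">I", len(payload) | MAC_BIT)
                    #           + hmac_sha1_digest + payload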
if has_mac: mac = msg[:20] msg = msg[20:] if self.__hmac_recv: self.__hmac_recv.update(msg) _mac = self.__hmac_recv.digest() if mac != _mac: raise ValueError("MAC failed: %r != %r" % (_mac, mac)) else: log("Received MAC but no session key set") elif self.__hmac_send: raise ValueError("Received message without MAC") self.message_input(msg) self.__state = state self.__has_mac = has_mac self.__msg_size = msg_size self.__inp = inp[offset:] self.__input_len = input_len - offset finally: self.__input_lock.release() def readable(self): return True def writable(self): return bool(self.__output_messages or self.__output) def should_close(self): self.__output_messages.append(_close_marker) def handle_write(self): output = self.__output messages = self.__output_messages while output or messages: # Process queued messages until we have enough output size = sum((len(s) for s in output)) while (size <= SEND_SIZE) and messages: message = messages[0] if message.__class__ is str: size += self.__message_output(messages.pop(0), output) elif message is _close_marker: del messages[:] del output[:] return self.close() else: try: message = message.next() except StopIteration: messages.pop(0) else: size += self.__message_output(message, output) v = "".join(output) del output[:] try: n = self.send(v) except socket.error, err: # Fix for https://bugs.launchpad.net/zodb/+bug/182833 # ensure the above mentioned "output" invariant output.insert(0, v) if err[0] in expected_socket_write_errors: break # we couldn't write anything raise if n < len(v): output.append(v[n:]) break # we can't write any more def handle_close(self): self.close() def message_output(self, message): if self.__closed: raise DisconnectedError( "This action is temporarily unavailable.
") self.__output_messages.append(message) def __message_output(self, message, output): # do two separate appends to avoid copying the message string size = 4 if self.__hmac_send: output.append(struct.pack(">I", len(message) | MAC_BIT)) self.__hmac_send.update(message) output.append(self.__hmac_send.digest()) size += 20 else: output.append(struct.pack(">I", len(message))) if len(message) <= SEND_SIZE: output.append(message) else: for i in range(0, len(message), SEND_SIZE): output.append(message[i:i+SEND_SIZE]) return size + len(message) def close(self): if not self.__closed: self.__closed = True self.__super_close() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZEO/zrpc/trigger.py000066400000000000000000000214261230730566700233340ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001-2005 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## from __future__ import with_statement import asyncore import os import socket import thread import errno from ZODB.utils import positive_id # Original comments follow; they're hard to follow in the context of # ZEO's use of triggers. TODO: rewrite from a ZEO perspective. # Wake up a call to select() running in the main thread. # # This is useful in a context where you are using Medusa's I/O # subsystem to deliver data, but the data is generated by another # thread. Normally, if Medusa is in the middle of a call to # select(), new output data generated by another thread will have # to sit until the call to select() either times out or returns. # If the trigger is 'pulled' by another thread, it should immediately # generate a READ event on the trigger object, which will force the # select() invocation to return. # # A common use for this facility: letting Medusa manage I/O for a # large number of connections; but routing each request through a # thread chosen from a fixed-size thread pool. When a thread is # acquired, a transaction is performed, but output data is # accumulated into buffers that will be emptied more efficiently # by Medusa. [picture a server that can process database queries # rapidly, but doesn't want to tie up threads waiting to send data # to low-bandwidth connections] # # The other major feature provided by this class is the ability to # move work back into the main thread: if you call pull_trigger() # with a thunk argument, when select() wakes up and receives the # event it will call your thunk from within that thread. The main # purpose of this is to remove the need to wrap thread locks around # Medusa's data structures, which normally do not need them. 
[To see # why this is true, imagine this scenario: A thread tries to push some # new data onto a channel's outgoing data queue at the same time that # the main thread is trying to remove some] class _triggerbase(object): """OS-independent base class for OS-dependent trigger class.""" kind = None # subclass must set to "pipe" or "loopback"; used by repr def __init__(self): self._closed = False # `lock` protects the `thunks` list from being traversed and # appended to simultaneously. self.lock = thread.allocate_lock() # List of no-argument callbacks to invoke when the trigger is # pulled. These run in the thread running the asyncore mainloop, # regardless of which thread pulls the trigger. self.thunks = [] def readable(self): return 1 def writable(self): return 0 def handle_connect(self): pass def handle_close(self): self.close() # Override the asyncore close() method, because it doesn't know about # (so can't close) all the gimmicks we have open. Subclass must # supply a _close() method to do platform-specific closing work. _close() # will be called iff we're not already closed. def close(self): if not self._closed: self._closed = True self.del_channel() self._close() # subclass does OS-specific stuff def _close(self): # see close() above; subclass must supply raise NotImplementedError def pull_trigger(self, *thunk): if thunk: with self.lock: self.thunks.append(thunk) try: self._physical_pull() except Exception: if not self._closed: raise # Subclass must supply _physical_pull, which does whatever the OS # needs to do to provoke the "write" end of the trigger. def _physical_pull(self): raise NotImplementedError def handle_read(self): try: self.recv(8192) except socket.error: return while 1: with self.lock: if self.thunks: thunk = self.thunks.pop(0) else: return try: thunk[0](*thunk[1:]) except: nil, t, v, tbinfo = asyncore.compact_traceback() print ('exception in trigger thunk:' ' (%s:%s %s)' % (t, v, tbinfo)) def __repr__(self): return '' % (self.kind, positive_id(self)) if os.name == 'posix': class trigger(_triggerbase, asyncore.file_dispatcher): kind = "pipe" def __init__(self, map=None): _triggerbase.__init__(self) r, self.trigger = os.pipe() asyncore.file_dispatcher.__init__(self, r, map) if self.fd != r: # Starting in Python 2.6, the descriptor passed to # file_dispatcher gets duped and assigned to # self.fd. This breals the instantiation semantics and # is a bug imo. I dount it will get fixed, but maybe # it will. Who knows. For that reason, we test for the # fd changing rather than just checking the Python version. os.close(r) def _close(self): os.close(self.trigger) asyncore.file_dispatcher.close(self) def _physical_pull(self): os.write(self.trigger, 'x') else: # Windows version; uses just sockets, because a pipe isn't select'able # on Windows. class BindError(Exception): pass class trigger(_triggerbase, asyncore.dispatcher): kind = "loopback" def __init__(self, map=None): _triggerbase.__init__(self) # Get a pair of connected sockets. The trigger is the 'w' # end of the pair, which is connected to 'r'. 'r' is put # in the asyncore socket map. "pulling the trigger" then # means writing something on w, which will wake up r. w = socket.socket() # Disable buffering -- pulling the trigger sends 1 byte, # and we want that sent immediately, to wake up asyncore's # select() ASAP. w.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1) count = 0 while 1: count += 1 # Bind to a local port; for efficiency, let the OS pick # a free port for us. 
# Unfortunately, stress tests showed that we may not # be able to connect to that port ("Address already in # use") despite that the OS picked it. This appears # to be a race bug in the Windows socket implementation. # So we loop until a connect() succeeds (almost always # on the first try). See the long thread at # http://mail.zope.org/pipermail/zope/2005-July/160433.html # for hideous details. a = socket.socket() a.bind(("127.0.0.1", 0)) connect_address = a.getsockname() # assigned (host, port) pair a.listen(1) try: w.connect(connect_address) break # success except socket.error, detail: if detail[0] != errno.WSAEADDRINUSE: # "Address already in use" is the only error # I've seen on two WinXP Pro SP2 boxes, under # Pythons 2.3.5 and 2.4.1. raise # (10048, 'Address already in use') # assert count <= 2 # never triggered in Tim's tests if count >= 10: # I've never seen it go above 2 a.close() w.close() raise BindError("Cannot bind trigger!") # Close `a` and try again. Note: I originally put a short # sleep() here, but it didn't appear to help or hurt. a.close() r, addr = a.accept() # r becomes asyncore's (self.)socket a.close() self.trigger = w asyncore.dispatcher.__init__(self, r, map) def _close(self): # self.socket is r, and self.trigger is w, from __init__ self.socket.close() self.trigger.close() def _physical_pull(self): self.trigger.send('x') ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/000077500000000000000000000000001230730566700204355ustar00rootroot00000000000000ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/ActivityMonitor.py000066400000000000000000000070621230730566700241600ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """ZODB transfer activity monitoring $Id$""" import threading import time class ActivityMonitor: """ZODB load/store activity monitor This simple implementation just keeps a small log in memory and iterates over the log when getActivityAnalysis() is called. It assumes that log entries are added in chronological sequence. 
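    A minimal usage sketch, assuming the database object provides the
    setActivityMonitor()/getActivityMonitor() hooks (the file name and
    division count are illustrative only):

        import ZODB
        db = ZODB.DB('Data.fs')
        db.setActivityMonitor(ActivityMonitor(history_length=3600))
        # ... open connections, do work, and close them; the monitor
        # records loads/stores as each connection is closed ...
        for slot in db.getActivityMonitor().getActivityAnalysis(divisions=6):
            print slot['start'], slot['end'], slot['loads'], slot['stores']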
""" def __init__(self, history_length=3600): self.history_length = history_length # Number of seconds self.log = [] # [(time, loads, stores)] self.trim_lock = threading.Lock() def closedConnection(self, conn): log = self.log now = time.time() loads, stores = conn.getTransferCounts(1) log.append((now, loads, stores)) self.trim(now) def trim(self, now): self.trim_lock.acquire() log = self.log cutoff = now - self.history_length n = 0 loglen = len(log) while n < loglen and log[n][0] < cutoff: n = n + 1 if n: del log[:n] self.trim_lock.release() def setHistoryLength(self, history_length): self.history_length = history_length self.trim(time.time()) def getHistoryLength(self): return self.history_length def getActivityAnalysis(self, start=0, end=0, divisions=10): res = [] now = time.time() if start == 0: start = now - self.history_length if end == 0: end = now for n in range(divisions): res.append({ 'start': start + (end - start) * n / divisions, 'end': start + (end - start) * (n + 1) / divisions, 'loads': 0, 'stores': 0, 'connections': 0, }) div = res[0] div_end = div['end'] div_index = 0 connections = 0 total_loads = 0 total_stores = 0 for t, loads, stores in self.log: if t < start: # We could use a binary search to find the start. continue elif t > end: # We could use a binary search to find the end also. break while t > div_end: div['loads'] = total_loads div['stores'] = total_stores div['connections'] = connections total_loads = 0 total_stores = 0 connections = 0 div_index = div_index + 1 if div_index < divisions: div = res[div_index] div_end = div['end'] connections = connections + 1 total_loads = total_loads + loads total_stores = total_stores + stores div['stores'] = div['stores'] + total_stores div['loads'] = div['loads'] + total_loads div['connections'] = div['connections'] + connections return res ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/BaseStorage.py000066400000000000000000000357311230730566700232170ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Storage base class that is mostly a mistake The base class here is tightly coupled with its subclasses and its use is not recommended. It's still here for historical reasons. """ from __future__ import with_statement import cPickle import threading import time import logging from struct import pack as _structpack, unpack as _structunpack import zope.interface from persistent.TimeStamp import TimeStamp import ZODB.interfaces from ZODB import POSException from ZODB.utils import z64, oid_repr from ZODB.UndoLogCompatible import UndoLogCompatible log = logging.getLogger("ZODB.BaseStorage") import sys class BaseStorage(UndoLogCompatible): """Base class that supports storage implementations. XXX Base classes like this are an attractive nuisance. They often introduce more complexity than they save. 
While important logic is implemented here, we should consider exposing it as utility functions or as objects that can be used through composition. A subclass must define the following methods: load() store() close() cleanup() lastTransaction() It must override these hooks: _begin() _vote() _abort() _finish() _clear_temp() If it stores multiple revisions, it should implement loadSerial() loadBefore() Each storage will have two locks that are accessed via lock acquire and release methods bound to the instance. (Yuck.) _lock_acquire / _lock_release (reentrant) _commit_lock_acquire / _commit_lock_release The commit lock is acquired in tpc_begin() and released in tpc_abort() and tpc_finish(). It is never acquired with the other lock held. The other lock appears to protect _oid and _transaction and perhaps other things. It is always held when load() is called, so presumably the load() implementation should also acquire the lock. """ _transaction=None # Transaction that is being committed _tstatus=' ' # Transaction status, used for copying data _is_read_only = False def __init__(self, name, base=None): self.__name__= name log.debug("create storage %s", self.__name__) # Allocate locks: self._lock = threading.RLock() self.__commit_lock = threading.Lock() # Comment out the following 4 lines to debug locking: self._lock_acquire = self._lock.acquire self._lock_release = self._lock.release self._commit_lock_acquire = self.__commit_lock.acquire self._commit_lock_release = self.__commit_lock.release t = time.time() t = self._ts = TimeStamp(*(time.gmtime(t)[:5] + (t%60,))) self._tid = repr(t) # ._oid is the highest oid in use (0 is always in use -- it's # a reserved oid for the root object). Our new_oid() method # increments it by 1, and returns the result. It's really a # 64-bit integer stored as an 8-byte big-endian string. oid = getattr(base, '_oid', None) if oid is None: self._oid = z64 else: self._oid = oid ######################################################################## # The following methods are normally overridden on instances, # except when debugging: def _lock_acquire(self, *args): f = sys._getframe(1) sys.stdout.write("[la(%s:%s)\n" % (f.f_code.co_filename, f.f_lineno)) sys.stdout.flush() self._lock.acquire(*args) sys.stdout.write("la(%s:%s)]\n" % (f.f_code.co_filename, f.f_lineno)) sys.stdout.flush() def _lock_release(self, *args): f = sys._getframe(1) sys.stdout.write("[lr(%s:%s)\n" % (f.f_code.co_filename, f.f_lineno)) sys.stdout.flush() self._lock.release(*args) sys.stdout.write("lr(%s:%s)]\n" % (f.f_code.co_filename, f.f_lineno)) sys.stdout.flush() def _commit_lock_acquire(self, *args): f = sys._getframe(1) sys.stdout.write("[ca(%s:%s)\n" % (f.f_code.co_filename, f.f_lineno)) sys.stdout.flush() self.__commit_lock.acquire(*args) sys.stdout.write("ca(%s:%s)]\n" % (f.f_code.co_filename, f.f_lineno)) sys.stdout.flush() def _commit_lock_release(self, *args): f = sys._getframe(1) sys.stdout.write("[cr(%s:%s)\n" % (f.f_code.co_filename, f.f_lineno)) sys.stdout.flush() self.__commit_lock.release(*args) sys.stdout.write("cr(%s:%s)]\n" % (f.f_code.co_filename, f.f_lineno)) sys.stdout.flush() # ######################################################################## def sortKey(self): """Return a string that can be used to sort storage instances. The key must uniquely identify a storage and must be the same across multiple instantiations of the same storage. """ # name may not be sufficient, e.g. ZEO has a user-definable name. 
return self.__name__ def getName(self): return self.__name__ def getSize(self): return len(self)*300 # WAG! def history(self, oid, version, length=1, filter=None): return () def new_oid(self): if self._is_read_only: raise POSException.ReadOnlyError() self._lock_acquire() try: last = self._oid d = ord(last[-1]) if d < 255: # fast path for the usual case last = last[:-1] + chr(d+1) else: # there's a carry out of the last byte last_as_long, = _structunpack(">Q", last) last = _structpack(">Q", last_as_long + 1) self._oid = last return last finally: self._lock_release() # Update the maximum oid in use, under protection of a lock. The # maximum-in-use attribute is changed only if possible_new_max_oid is # larger than its current value. def set_max_oid(self, possible_new_max_oid): self._lock_acquire() try: if possible_new_max_oid > self._oid: self._oid = possible_new_max_oid finally: self._lock_release() def registerDB(self, db): pass # we don't care def isReadOnly(self): return self._is_read_only def tpc_abort(self, transaction): self._lock_acquire() try: if transaction is not self._transaction: return try: self._abort() self._clear_temp() self._transaction = None finally: self._commit_lock_release() finally: self._lock_release() def _abort(self): """Subclasses should redefine this to supply abort actions""" pass def tpc_begin(self, transaction, tid=None, status=' '): if self._is_read_only: raise POSException.ReadOnlyError() self._lock_acquire() try: if self._transaction is transaction: raise POSException.StorageTransactionError( "Duplicate tpc_begin calls for same transaction") self._lock_release() self._commit_lock_acquire() self._lock_acquire() self._transaction = transaction self._clear_temp() user = transaction.user desc = transaction.description ext = transaction._extension if ext: ext = cPickle.dumps(ext, 1) else: ext = "" self._ude = user, desc, ext if tid is None: now = time.time() t = TimeStamp(*(time.gmtime(now)[:5] + (now % 60,))) self._ts = t = t.laterThan(self._ts) self._tid = repr(t) else: self._ts = TimeStamp(tid) self._tid = tid self._tstatus = status self._begin(self._tid, user, desc, ext) finally: self._lock_release() def tpc_transaction(self): return self._transaction def _begin(self, tid, u, d, e): """Subclasses should redefine this to supply transaction start actions. """ pass def tpc_vote(self, transaction): self._lock_acquire() try: if transaction is not self._transaction: raise POSException.StorageTransactionError( "tpc_vote called with wrong transaction") self._vote() finally: self._lock_release() def _vote(self): """Subclasses should redefine this to supply transaction vote actions. """ pass def tpc_finish(self, transaction, f=None): # It's important that the storage calls the function we pass # while it still has its lock. We don't want another thread # to be able to read any updated data until we've had a chance # to send an invalidation message to all of the other # connections! 
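        # Lock ordering note (see tpc_begin/tpc_abort above): the commit
        # lock has been held since tpc_begin().  Below we re-acquire the
        # instance lock, run the callback and _finish() with both locks
        # held, then release the commit lock in the inner finally block and
        # the instance lock in the outer one.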
self._lock_acquire() try: if transaction is not self._transaction: raise POSException.StorageTransactionError( "tpc_finish called with wrong transaction") try: if f is not None: f(self._tid) u, d, e = self._ude self._finish(self._tid, u, d, e) self._clear_temp() finally: self._ude = None self._transaction = None self._commit_lock_release() finally: self._lock_release() def _finish(self, tid, u, d, e): """Subclasses should redefine this to supply transaction finish actions """ pass def lastTransaction(self): with self._lock: return self._ltid def getTid(self, oid): self._lock_acquire() try: v = '' try: supportsVersions = self.supportsVersions except AttributeError: pass else: if supportsVersions(): v = self.modifiedInVersion(oid) pickledata, serial = self.load(oid, v) return serial finally: self._lock_release() def loadSerial(self, oid, serial): raise POSException.Unsupported( "Retrieval of historical revisions is not supported") def loadBefore(self, oid, tid): """Return most recent revision of oid before tid committed.""" return None def copyTransactionsFrom(self, other, verbose=0): """Copy transactions from another storage. This is typically used for converting data from one storage to another. `other` must have an .iterator() method. """ copy(other, self, verbose) def copy(source, dest, verbose=0): """Copy transactions from a source to a destination storage This is typically used for converting data from one storage to another. `source` must have an .iterator() method. """ _ts = None ok = 1 preindex = {}; preget = preindex.get # restore() is a new storage API method which has an identical # signature to store() except that it does not return anything. # Semantically, restore() is also identical to store() except that it # doesn't do the ConflictError or VersionLockError consistency # checks. The reason to use restore() over store() in this method is # that store() cannot be used to copy transactions spanning a version # commit or abort, or over transactional undos. # # We'll use restore() if it's available, otherwise we'll fall back to # using store(). However, if we use store, then # copyTransactionsFrom() may fail with VersionLockError or # ConflictError. restoring = hasattr(dest, 'restore') fiter = source.iterator() for transaction in fiter: tid = transaction.tid if _ts is None: _ts = TimeStamp(tid) else: t = TimeStamp(tid) if t <= _ts: if ok: print ('Time stamps out of order %s, %s' % (_ts, t)) ok = 0 _ts = t.laterThan(_ts) tid = `_ts` else: _ts = t if not ok: print ('Time stamps back in order %s' % (t)) ok = 1 if verbose: print _ts dest.tpc_begin(transaction, tid, transaction.status) for r in transaction: oid = r.oid if verbose: print oid_repr(oid), r.version, len(r.data) if restoring: dest.restore(oid, r.tid, r.data, r.version, r.data_txn, transaction) else: pre = preget(oid, None) s = dest.store(oid, pre, r.data, r.version, transaction) preindex[oid] = s dest.tpc_vote(transaction) dest.tpc_finish(transaction) # defined outside of BaseStorage to facilitate independent reuse. # just depends on _transaction attr and getTid method. 
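# A hedged usage sketch for copy()/copyTransactionsFrom() above; the file
# names are illustrative, and the snippet is kept as a comment so nothing
# runs at import time:
#
#     import ZODB.FileStorage
#     src = ZODB.FileStorage.FileStorage('Data.fs', read_only=True)
#     dst = ZODB.FileStorage.FileStorage('Data-copy.fs')
#     copy(src, dst, verbose=1)      # or: dst.copyTransactionsFrom(src)
#     dst.close()
#     src.close()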
def checkCurrentSerialInTransaction(self, oid, serial, transaction): if transaction is not self._transaction: raise POSException.StorageTransactionError(self, transaction) committed_tid = self.getTid(oid) if committed_tid != serial: raise POSException.ReadConflictError( oid=oid, serials=(committed_tid, serial)) BaseStorage.checkCurrentSerialInTransaction = checkCurrentSerialInTransaction class TransactionRecord(object): """Abstract base class for iterator protocol""" zope.interface.implements(ZODB.interfaces.IStorageTransactionInformation) def __init__(self, tid, status, user, description, extension): self.tid = tid self.status = status self.user = user self.description = description self.extension = extension # XXX This is a workaround to make the TransactionRecord compatible with a # transaction object because it is passed to tpc_begin(). def _ext_set(self, value): self.extension = value def _ext_get(self): return self.extension _extension = property(fset=_ext_set, fget=_ext_get) class DataRecord(object): """Abstract base class for iterator protocol""" zope.interface.implements(ZODB.interfaces.IStorageRecordInformation) version = '' def __init__(self, oid, tid, data, prev): self.oid = oid self.tid = tid self.data = data self.data_txn = prev ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/ConflictResolution.py000066400000000000000000000235701230730566700246430ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import logging from cStringIO import StringIO from cPickle import Unpickler, Pickler from pickle import PicklingError import zope.interface from ZODB.POSException import ConflictError from ZODB.loglevels import BLATHER logger = logging.getLogger('ZODB.ConflictResolution') ResolvedSerial = 'rs' class BadClassName(Exception): pass class BadClass(object): def __init__(self, *args): self.args = args def __reduce__(self): raise BadClassName(*self.args) _class_cache = {} _class_cache_get = _class_cache.get def find_global(*args): cls = _class_cache_get(args, 0) if cls == 0: # Not cached. 
Try to import try: module = __import__(args[0], {}, {}, ['cluck']) except ImportError: cls = 1 else: cls = getattr(module, args[1], 1) _class_cache[args] = cls if cls == 1: logger.log(BLATHER, "Unable to load class", exc_info=True) if cls == 1: # Not importable if (isinstance(args, tuple) and len(args) == 2 and isinstance(args[0], basestring) and isinstance(args[1], basestring) ): return BadClass(*args) else: raise BadClassName(*args) return cls def state(self, oid, serial, prfactory, p=''): p = p or self.loadSerial(oid, serial) p = self._crs_untransform_record_data(p) file = StringIO(p) unpickler = Unpickler(file) unpickler.find_global = find_global unpickler.persistent_load = prfactory.persistent_load unpickler.load() # skip the class tuple return unpickler.load() class IPersistentReference(zope.interface.Interface): '''public contract for references to persistent objects from an object with conflicts.''' oid = zope.interface.Attribute( 'The oid of the persistent object that this reference represents') database_name = zope.interface.Attribute( '''The name of the database of the reference, *if* different. If not different, None.''') klass = zope.interface.Attribute( '''class meta data. Presence is not reliable.''') weak = zope.interface.Attribute( '''bool: whether this reference is weak''') def __cmp__(other): '''if other is equivalent reference, return 0; else raise ValueError. Equivalent in this case means that oid and database_name are the same. If either is a weak reference, we only support `is` equivalence, and otherwise raise a ValueError even if the datbase_names and oids are the same, rather than guess at the correct semantics. It is impossible to sort reliably, since the actual persistent class may have its own comparison, and we have no idea what it is. We assert that it is reasonably safe to assume that an object is equivalent to itself, but that's as much as we can say. We don't compare on 'is other', despite the PersistentReferenceFactory.data cache, because it is possible to have two references to the same object that are spelled with different data (for instance, one with a class and one without).''' class PersistentReference(object): zope.interface.implements(IPersistentReference) weak = False oid = database_name = klass = None def __init__(self, data): self.data = data # see serialize.py, ObjectReader._persistent_load if isinstance(data, tuple): self.oid, klass = data if isinstance(klass, BadClass): # We can't use the BadClass directly because, if # resolution succeeds, there's no good way to pickle # it. Fortunately, a class reference in a persistent # reference is allowed to be a module+name tuple. 
self.data = self.oid, klass.args elif isinstance(data, str): self.oid = data else: # a list reference_type = data[0] # 'm' = multi_persistent: (database_name, oid, klass) # 'n' = multi_oid: (database_name, oid) # 'w' = persistent weakref: (oid) # or persistent weakref: (oid, database_name) # else it is a weakref: reference_type if reference_type == 'm': self.database_name, self.oid, klass = data[1] if isinstance(klass, BadClass): # see above wrt BadClass data[1] = self.database_name, self.oid, klass.args elif reference_type == 'n': self.database_name, self.oid = data[1] elif reference_type == 'w': try: self.oid, = data[1] except ValueError: self.oid, self.database_name = data[1] self.weak = True else: assert len(data) == 1, 'unknown reference format' self.oid = data[0] self.weak = True def __cmp__(self, other): if self is other or ( isinstance(other, PersistentReference) and self.oid == other.oid and self.database_name == other.database_name and not self.weak and not other.weak): return 0 else: raise ValueError( "can't reliably compare against different " "PersistentReferences") def __repr__(self): return "PR(%s %s)" % (id(self), self.data) def __getstate__(self): raise PicklingError("Can't pickle PersistentReference") @property def klass(self): # for tests data = self.data if isinstance(data, tuple): return data[1] elif isinstance(data, list) and data[0] == 'm': return data[1][2] class PersistentReferenceFactory: data = None def persistent_load(self, ref): if self.data is None: self.data = {} key = tuple(ref) # lists are not hashable; formats are different enough # even after eliminating list/tuple distinction r = self.data.get(key, None) if r is None: r = PersistentReference(ref) self.data[key] = r return r def persistent_id(object): if getattr(object, '__class__', 0) is not PersistentReference: return None return object.data _unresolvable = {} def tryToResolveConflict(self, oid, committedSerial, oldSerial, newpickle, committedData=''): # class_tuple, old, committed, newstate = ('',''), 0, 0, 0 try: prfactory = PersistentReferenceFactory() newpickle = self._crs_untransform_record_data(newpickle) file = StringIO(newpickle) unpickler = Unpickler(file) unpickler.find_global = find_global unpickler.persistent_load = prfactory.persistent_load meta = unpickler.load() if isinstance(meta, tuple): klass = meta[0] newargs = meta[1] or () if isinstance(klass, tuple): klass = find_global(*klass) else: klass = meta newargs = () if klass in _unresolvable: raise ConflictError inst = klass.__new__(klass, *newargs) try: resolve = inst._p_resolveConflict except AttributeError: _unresolvable[klass] = 1 raise ConflictError oldData = self.loadSerial(oid, oldSerial) if not committedData: committedData = self.loadSerial(oid, committedSerial) if newpickle == oldData: # old -> new diff is empty, so merge is trivial return committedData if committedData == oldData: # old -> committed diff is empty, so merge is trivial return newpickle newstate = unpickler.load() old = state(self, oid, oldSerial, prfactory, oldData) committed = state(self, oid, committedSerial, prfactory, committedData) resolved = resolve(old, committed, newstate) file = StringIO() pickler = Pickler(file,1) pickler.inst_persistent_id = persistent_id pickler.dump(meta) pickler.dump(resolved) return self._crs_transform_record_data(file.getvalue(1)) except (ConflictError, BadClassName): pass except: # If anything else went wrong, catch it here and avoid passing an # arbitrary exception back to the client. 
The error here will mask # the original ConflictError. A client can recover from a # ConflictError, but not necessarily from other errors. But log # the error so that any problems can be fixed. logger.error("Unexpected error", exc_info=True) raise ConflictError(oid=oid, serials=(committedSerial, oldSerial), data=newpickle) class ConflictResolvingStorage(object): "Mix-in class that provides conflict resolution handling for storages" tryToResolveConflict = tryToResolveConflict _crs_transform_record_data = _crs_untransform_record_data = ( lambda self, o: o) def registerDB(self, wrapper): self._crs_untransform_record_data = wrapper.untransform_record_data self._crs_transform_record_data = wrapper.transform_record_data super(ConflictResolvingStorage, self).registerDB(wrapper) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/ConflictResolution.txt000066400000000000000000000470061230730566700250320ustar00rootroot00000000000000=================== Conflict Resolution =================== Overview ======== Conflict resolution is a way to resolve transaction conflicts that would otherwise abort a transaction. As such, it risks data integrity in order to try to avoid throwing away potentially computationally expensive transactions. The risk of harming data integrity should not be underestimated. Writing conflict resolution code takes some responsibility for transactional integrity away from the ZODB, and puts it in the hands of the developer writing the conflict resolution code. The current conflict resolution code is implemented with a storage mix-in found in ZODB/ConflictResolution.py. The idea's proposal, and an explanation of the interface, can be found here: http://www.zope.org/Members/jim/ZODB/ApplicationLevelConflictResolution Here is the most pertinent section, somewhat modified for this document's use: A new interface is proposed to allow object authors to provide a method for resolving conflicts. When a conflict is detected, then the database checks to see if the class of the object being saved defines the method, _p_resolveConflict. If the method is defined, then the method is called on the object. If the method succeeds, then the object change can be committed, otherwise a ConflictError is raised as usual. def _p_resolveConflict(oldState, savedState, newState): Return the state of the object after resolving different changes. Arguments: oldState The state of the object that the changes made by the current transaction were based on. The method is permitted to modify this value. savedState The state of the object that is currently stored in the database. This state was written after oldState and reflects changes made by a transaction that committed before the current transaction. The method is permitted to modify this value. newState The state after changes made by the current transaction. The method is not permitted to modify this value. This method should compute a new state by merging changes reflected in savedState and newState, relative to oldState. If the method cannot resolve the changes, then it should raise ZODB.POSException.ConflictError. Consider an extremely simple example, a counter:: from persistent import Persistent class PCounter(Persistent): '`value` is readonly; increment it with `inc`.' _val = 0 def inc(self): self._val += 1 @property def value(self): return self._val def _p_resolveConflict(self, oldState, savedState, newState): oldState['_val'] = ( savedState.get('_val', 0) + newState.get('_val', 0) - oldState.get('_val', 0)) return oldState .. 
-> src >>> import ConflictResolution_txt >>> exec src in ConflictResolution_txt.__dict__ >>> PCounter = ConflictResolution_txt.PCounter >>> PCounter.__module__ = 'ConflictResolution_txt' By "state", the excerpt above means the value used by __getstate__ and __setstate__: a dictionary, in most cases. We'll look at more details below, but let's continue the example above with a simple successful resolution story. First we create a storage and a database, and put a PCounter in the database. >>> import ZODB >>> db = ZODB.DB('Data.fs') >>> import transaction >>> tm_A = transaction.TransactionManager() >>> conn_A = db.open(transaction_manager=tm_A) >>> p_A = conn_A.root()['p'] = PCounter() >>> p_A.value 0 >>> tm_A.commit() Now get another copy of 'p' so we can make a conflict. Think of `conn_A` (connection A) as one thread, and `conn_B` (connection B) as a concurrent thread. `p_A` is a view on the object in the first connection, and `p_B` is a view on *the same persistent object* in the second connection. >>> tm_B = transaction.TransactionManager() >>> conn_B = db.open(transaction_manager=tm_B) >>> p_B = conn_B.root()['p'] >>> p_B.value 0 >>> p_A._p_oid == p_B._p_oid True Now we can make a conflict, and see it resolved. >>> p_A.inc() >>> p_A.value 1 >>> p_B.inc() >>> p_B.value 1 >>> tm_B.commit() >>> p_B.value 1 >>> tm_A.commit() >>> p_A.value 2 We need to synchronize connection B, in any of a variety of ways, to see the change from connection A. >>> p_B.value 1 >>> trans = tm_B.begin() >>> p_B.value 2 A very similar class found in real world use is BTrees.Length.Length. This conflict resolution approach is simple, yet powerful. However, it has a few caveats and rough edges in practice. The simplicity, then, is a bit of a disguise. Again, be warned, writing conflict resolution code means that you claim significant responsibilty for your data integrity. Because of the rough edges, the current conflict resolution approach is slated for change (as of this writing, according to Jim Fulton, the ZODB primary author and maintainer). Others have talked about different approaches as well (see, for instance, http://www.python.org/~jeremy/weblog/031031c.html). But for now, the _p_resolveConflict method is what we have. Caveats and Dangers =================== Here are caveats for working with this conflict resolution approach. Each sub-section has a "DANGERS" section that outlines what might happen if you ignore the warning. We work from the least danger to the most. Conflict Resolution Is on the Server ------------------------------------ If you are using ZEO or ZRS, be aware that the classes for which you have conflict resolution code *and* the classes of the non-persistent objects they reference must be available to import by the *server* (or ZRS primary). DANGERS: You think you are going to get conflict resolution, but you won't. Ignore `self` ------------- Even though the _p_resolveConflict method has a "self", ignore it. Don't change it. You make changes by returning the state. This is effectively a class method. DANGERS: The changes you make to the instance will be discarded. The instance is not initialized, so other methods that depend on instance attributes will not work. Here's an example of a broken _p_resolveConflict method:: class PCounter2(PCounter): def __init__(self): self.data = [] def _p_resolveConflict(self, oldState, savedState, newState): self.data.append('bad idea') return super(PCounter2, self)._p_resolveConflict( oldState, savedState, newState) .. 
-> src >>> exec src in ConflictResolution_txt.__dict__ >>> PCounter2 = ConflictResolution_txt.PCounter2 >>> PCounter2.__module__ = 'ConflictResolution_txt' Now we'll prepare for the conflict again. >>> p2_A = conn_A.root()['p2'] = PCounter2() >>> p2_A.value 0 >>> tm_A.commit() >>> trans = tm_B.begin() # sync >>> p2_B = conn_B.root()['p2'] >>> p2_B.value 0 >>> p2_A._p_oid == p2_B._p_oid True And now we will make a conflict. >>> p2_A.inc() >>> p2_A.value 1 >>> p2_B.inc() >>> p2_B.value 1 >>> tm_B.commit() >>> p2_B.value 1 >>> tm_A.commit() # doctest: +ELLIPSIS Traceback (most recent call last): ... ConflictError: database conflict error... oops! >>> tm_A.abort() >>> p2_A.value 1 >>> trans = tm_B.begin() >>> p2_B.value 1 Watch Out for Persistent Objects in the State --------------------------------------------- If the object state has a reference to Persistent objects (instances of classes that inherit from persistent.Persistent) then these references *will not be loaded and are inaccessible*. Instead, persistent objects in the state dictionary are ZODB.ConflictResolution.PersistentReference instances. These objects have the following interface:: class IPersistentReference(zope.interface.Interface): '''public contract for references to persistent objects from an object with conflicts.''' oid = zope.interface.Attribute( 'The oid of the persistent object that this reference represents') database_name = zope.interface.Attribute( '''The name of the database of the reference, *if* different. If not different, None.''') klass = zope.interface.Attribute( '''class meta data. Presence is not reliable.''') weak = zope.interface.Attribute( '''bool: whether this reference is weak''') def __cmp__(other): '''if other is equivalent reference, return 0; else raise ValueError. Equivalent in this case means that oid and database_name are the same. If either is a weak reference, we only support `is` equivalence, and otherwise raise a ValueError even if the datbase_names and oids are the same, rather than guess at the correct semantics. It is impossible to sort reliably, since the actual persistent class may have its own comparison, and we have no idea what it is. We assert that it is reasonably safe to assume that an object is equivalent to itself, but that's as much as we can say. We don't compare on 'is other', despite the PersistentReferenceFactory.data cache, because it is possible to have two references to the same object that are spelled with different data (for instance, one with a class and one without).''' So let's look at one of these. Let's assume we have three, `old`, `saved`, and `new`, each representing a persistent reference to the same object within a _p_resolveConflict call from the oldState, savedState, and newState [#get_persistent_reference]_. They have an oid, `weak` is False, and `database_name` is None. `klass` happens to be set but this is not always the case. >>> isinstance(new.oid, str) True >>> new.weak False >>> print new.database_name None >>> new.klass is PCounter True There are a few subtleties to highlight here. First, notice that the database_name is only present if this is a cross-database reference (see cross-database-references.txt in this directory, and examples below). 
The database name and oid is sometimes a reasonable way to reliably sort Persistent objects (see zope.app.keyreference, for instance) but if your code compares one PersistentReference with a database_name and another without, you need to refuse to give an answer and raise an exception, because you can't know how the unknown database_name sorts. We already saw a persistent reference with a database_name of None. Now let's suppose `new` is an example of a cross-database reference from a database named '2' [#cross-database]_. >>> new.database_name '2' As seen, the database_name is available for this cross-database reference, and not for others. References to persistent objects, as defined in seialize.py, have other variations, such as weak references, which are handled but not discussed here [#instantiation_test]_ Second, notice the __cmp__ behavior [#cmp_test]_. This is new behavior after ZODB 3.8 and addresses a serious problem for when persistent objects are compared in an _p_resolveConflict, such as that in the ZODB BTrees code. Prior to this change, it was not safe to use Persistent objects as keys in a BTree. You needed to define a __cmp__ for them to be sorted reliably out of the context of conflict resolution, but then during conflict resolution the sorting would be arbitrary, on the basis of the persistent reference's memory location. This could have lead to inconsistent state for BTrees (or BTree module buckets or tree sets or sets). Here's an example of how the new behavior stops potentially incorrect resolution. >>> import BTrees >>> treeset_A = conn_A.root()['treeset'] = BTrees.family32.OI.TreeSet() >>> tm_A.commit() >>> trans = tm_B.begin() # sync >>> treeset_B = conn_B.root()['treeset'] >>> treeset_A.insert(PCounter()) 1 >>> treeset_B.insert(PCounter()) 1 >>> tm_B.commit() >>> tm_A.commit() # doctest: +ELLIPSIS Traceback (most recent call last): ... ConflictError: database conflict error... >>> tm_A.abort() Third, note that, even if the persistent object to which the reference refers changes in the same transaction, the reference is still the same. DANGERS: subtle and potentially serious. Beyond the two subtleties above, which should now be addressed, there is a general problem for objects that are composites of smaller persistent objects--for instance, a BTree, in which the BTree and each bucket is a persistent object; or a zc.queue.CompositePersistentQueue, which is a persistent queue of persistent queues. Consider the following situation. It is actually solved, but it is a concrete example of what might go wrong. A BTree (persistent object) has a two buckets (persistent objects). The second bucket has one persistent object in it. Concurrently, one thread deletes the one object in the second bucket, which causes the BTree to dump the bucket; and another thread puts an object in the second bucket. What happens during conflict resolution? Remember, each persistent object cannot see the other. From the perspective of the BTree object, it has no conflicts: one transaction modified it, causing it to lose a bucket; and the other transaction did not change it. From the perspective of the bucket, one transaction deleted an object and the other added it: it will resolve conflicts and say that the bucket has the new object and not the old one. However, it will be garbage collected, and effectively the addition of the new object will be lost. As mentioned, this story is actually solved for BTrees. 
As BTrees/MergeTemplate.c explains, whenever savedState or newState for a bucket shows an empty bucket, the code refuses to resolve the conflict: this avoids the situation above. >>> bucket_A = conn_A.root()['bucket'] = BTrees.family32.II.Bucket() >>> bucket_A[0] = 255 >>> tm_A.commit() >>> trans = tm_B.begin() # sync >>> bucket_B = conn_B.root()['bucket'] >>> bucket_B[1] = 254 >>> del bucket_A[0] >>> tm_B.commit() >>> tm_A.commit() # doctest: +ELLIPSIS Traceback (most recent call last): ... ConflictError: database conflict error... >>> tm_A.abort() However, the story highlights the kinds of subtle problems that units made up of multiple composite Persistent objects need to contemplate. Any structure made up of objects that contain persistent objects with conflict resolution code, as a catalog index is made up of multiple BTree Buckets and Sets, each with conflict resolution, needs to think through these kinds of problems or be faced with potential data integrity issues. .. cleanup >>> db.close() >>> db1.close() >>> db2.close() .. ......... .. .. FOOTNOTES .. .. ......... .. .. [#get_persistent_reference] We'll catch persistent references with a class mutable. :: class PCounter3(PCounter): data = [] def _p_resolveConflict(self, oldState, savedState, newState): PCounter3.data.append( (oldState.get('other'), savedState.get('other'), newState.get('other'))) return super(PCounter3, self)._p_resolveConflict( oldState, savedState, newState) .. -> src >>> exec src in ConflictResolution_txt.__dict__ >>> PCounter3 = ConflictResolution_txt.PCounter3 >>> PCounter3.__module__ = 'ConflictResolution_txt' >>> p3_A = conn_A.root()['p3'] = PCounter3() >>> p3_A.other = conn_A.root()['p'] >>> tm_A.commit() >>> trans = tm_B.begin() # sync >>> p3_B = conn_B.root()['p3'] >>> p3_A.inc() >>> p3_B.inc() >>> tm_B.commit() >>> tm_A.commit() >>> old, saved, new = PCounter3.data[-1] .. [#cross-database] We need a whole different set of databases for this. See cross-database-references.txt in this directory for a discussion of what is going on here. >>> databases = {} >>> db1 = ZODB.DB('1', databases=databases, database_name='1') >>> db2 = ZODB.DB('2', databases=databases, database_name='2') >>> tm_multi_A = transaction.TransactionManager() >>> conn_1A = db1.open(transaction_manager=tm_multi_A) >>> conn_2A = conn_1A.get_connection('2') >>> p4_1A = conn_1A.root()['p4'] = PCounter3() >>> p5_2A = conn_2A.root()['p5'] = PCounter3() >>> conn_2A.add(p5_2A) >>> p4_1A.other = p5_2A >>> tm_multi_A.commit() >>> tm_multi_B = transaction.TransactionManager() >>> conn_1B = db1.open(transaction_manager=tm_multi_B) >>> p4_1B = conn_1B.root()['p4'] >>> p4_1A.inc() >>> p4_1B.inc() >>> tm_multi_B.commit() >>> tm_multi_A.commit() >>> old, saved, new = PCounter3.data[-1] .. [#instantiation_test] We'll simply instantiate PersistentReferences with examples of types described in ZODB/serialize.py. 
>>> from ZODB.ConflictResolution import PersistentReference >>> ref1 = PersistentReference('my_oid') >>> ref1.oid 'my_oid' >>> print ref1.klass None >>> print ref1.database_name None >>> ref1.weak False >>> ref2 = PersistentReference(('my_oid', 'my_class')) >>> ref2.oid 'my_oid' >>> ref2.klass 'my_class' >>> print ref2.database_name None >>> ref2.weak False >>> ref3 = PersistentReference(['w', ('my_oid',)]) >>> ref3.oid 'my_oid' >>> print ref3.klass None >>> print ref3.database_name None >>> ref3.weak True >>> ref3a = PersistentReference(['w', ('my_oid', 'other_db')]) >>> ref3a.oid 'my_oid' >>> print ref3a.klass None >>> ref3a.database_name 'other_db' >>> ref3a.weak True >>> ref4 = PersistentReference(['m', ('other_db', 'my_oid', 'my_class')]) >>> ref4.oid 'my_oid' >>> ref4.klass 'my_class' >>> ref4.database_name 'other_db' >>> ref4.weak False >>> ref5 = PersistentReference(['n', ('other_db', 'my_oid')]) >>> ref5.oid 'my_oid' >>> print ref5.klass None >>> ref5.database_name 'other_db' >>> ref5.weak False >>> ref6 = PersistentReference(['my_oid']) # legacy >>> ref6.oid 'my_oid' >>> print ref6.klass None >>> print ref6.database_name None >>> ref6.weak True .. [#cmp_test] All references are equal to themselves. >>> ref1 == ref1 and ref2 == ref2 and ref4 == ref4 and ref5 == ref5 True >>> ref3 == ref3 and ref3a == ref3a and ref6 == ref6 # weak references True Non-weak references with the same oid and database_name are equal. >>> ref1 == ref2 and ref4 == ref5 True Everything else raises a ValueError: weak references with the same oid and database, and references with a different database_name or oid. >>> ref3 == ref6 Traceback (most recent call last): ... ValueError: can't reliably compare against different PersistentReferences >>> ref1 == PersistentReference(('another_oid', 'my_class')) Traceback (most recent call last): ... ValueError: can't reliably compare against different PersistentReferences >>> ref4 == PersistentReference( ... ['m', ('another_db', 'my_oid', 'my_class')]) Traceback (most recent call last): ... ValueError: can't reliably compare against different PersistentReferences ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/Connection.py000066400000000000000000001501261230730566700231130ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Database connection support $Id$""" import logging import sys import tempfile import threading import warnings import os import time from persistent import PickleCache # interfaces from persistent.interfaces import IPersistentDataManager from ZODB.interfaces import IConnection from ZODB.interfaces import IBlobStorage from ZODB.interfaces import IMVCCStorage from ZODB.blob import Blob, rename_or_copy_blob, remove_committed_dir from transaction.interfaces import ISavepointDataManager from transaction.interfaces import IDataManagerSavepoint from transaction.interfaces import ISynchronizer from zope.interface import implements import transaction import ZODB from ZODB.blob import SAVEPOINT_SUFFIX from ZODB.ConflictResolution import ResolvedSerial from ZODB.ExportImport import ExportImport from ZODB import POSException from ZODB.POSException import InvalidObjectReference, ConnectionStateError from ZODB.POSException import ConflictError, ReadConflictError from ZODB.POSException import Unsupported, ReadOnlyHistoryError from ZODB.POSException import POSKeyError from ZODB.serialize import ObjectWriter, ObjectReader from ZODB.utils import p64, u64, z64, oid_repr, positive_id from ZODB import utils global_reset_counter = 0 def resetCaches(): """Causes all connection caches to be reset as connections are reopened. Zope's refresh feature uses this. When you reload Python modules, instances of classes continue to use the old class definitions. To use the new code immediately, the refresh feature asks ZODB to clear caches by calling resetCaches(). When the instances are loaded by subsequent connections, they will use the new class definitions. """ global global_reset_counter global_reset_counter += 1 class Connection(ExportImport, object): """Connection to ZODB for loading and storing objects.""" implements(IConnection, ISavepointDataManager, IPersistentDataManager, ISynchronizer) _code_timestamp = 0 ########################################################################## # Connection methods, ZODB.IConnection def __init__(self, db, cache_size=400, before=None, cache_size_bytes=0): """Create a new Connection.""" self._log = logging.getLogger('ZODB.Connection') self._debug_info = () self._db = db self.large_record_size = db.large_record_size # historical connection self.before = before # Multi-database support self.connections = {self._db.database_name: self} storage = db.storage if IMVCCStorage.providedBy(storage): # Use a connection-specific storage instance. self._mvcc_storage = True storage = storage.new_instance() else: self._mvcc_storage = False self._normal_storage = self._storage = storage self.new_oid = db.new_oid self._savepoint_storage = None # Do we need to join a txn manager? self._needs_to_join = True self.transaction_manager = None self.opened = None # time.time() when DB.open() opened us self._reset_counter = global_reset_counter self._load_count = 0 # Number of objects unghosted self._store_count = 0 # Number of objects stored # Cache which can ghostify (forget the state of) objects not # recently used. Its API is roughly that of a dict, with # additional gc-related and invalidation-related methods. 
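        # cache_size is a target count of unghosted objects and
        # cache_size_bytes a target for their total estimated byte size;
        # the precise eviction policy lives in persistent.PickleCache.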
self._cache = PickleCache(self, cache_size, cache_size_bytes) # The pre-cache is used by get to avoid infinite loops when # objects immediately load their state whern they get their # persistent data set. self._pre_cache = {} # List of all objects (not oids) registered as modified by the # persistence machinery, or by add(), or whose access caused a # ReadConflictError (just to be able to clean them up from the # cache on abort with the other modified objects). All objects # of this list are either in _cache or in _added. self._registered_objects = [] # ids and serials of objects for which readCurrent was called # in a transaction. self._readCurrent = {} # Dict of oid->obj added explicitly through add(). Used as a # preliminary cache until commit time when objects are all moved # to the real _cache. The objects are moved to _creating at # commit time. self._added = {} # During commit this is turned into a list, which receives # objects added as a side-effect of storing a modified object. self._added_during_commit = None # During commit, all objects go to either _modified or _creating: # Dict of oid->flag of new objects (without serial), either # added by add() or implicitly added (discovered by the # serializer during commit). The flag is True for implicit # adding. Used during abort to remove created objects from the # _cache, and by persistent_id to check that a new object isn't # reachable from multiple databases. self._creating = {} # List of oids of modified objects, which have to be invalidated # in the cache on abort and in other connections on finish. self._modified = [] # _invalidated queues invalidate messages delivered from the DB # _inv_lock prevents one thread from modifying the set while # another is processing invalidations. All the invalidations # from a single transaction should be applied atomically, so # the lock must be held when reading _invalidated. # It sucks that we have to hold the lock to read _invalidated. # Normally, _invalidated is written by calling dict.update, which # will execute atomically by virtue of the GIL. But some storage # might generate oids where hash or compare invokes Python code. In # that case, the GIL can't save us. # Note: since that was written, it was officially declared that the # type of an oid is str. TODO: remove the related now-unnecessary # critical sections (if any -- this needs careful thought). self._inv_lock = threading.Lock() self._invalidated = set() # Flag indicating whether the cache has been invalidated: self._invalidatedCache = False # We intend to prevent committing a transaction in which # ReadConflictError occurs. _conflicts is the set of oids that # experienced ReadConflictError. Any time we raise ReadConflictError, # the oid should be added to this set, and we should be sure that the # object is registered. Because it's registered, Connection.commit() # will raise ReadConflictError again (because the oid is in # _conflicts). self._conflicts = {} # _txn_time stores the upper bound on transactions visible to # this connection. That is, all object revisions must be # written before _txn_time. If it is None, then the current # revisions are acceptable. self._txn_time = None # To support importFile(), implemented in the ExportImport base # class, we need to run _importDuringCommit() from our commit() # method. If _import is not None, it is a two-tuple of arguments # to pass to _importDuringCommit(). 
self._import = None self._reader = ObjectReader(self, self._cache, self._db.classFactory) def add(self, obj): """Add a new object 'obj' to the database and assign it an oid.""" if self.opened is None: raise ConnectionStateError("The database connection is closed") marker = object() oid = getattr(obj, "_p_oid", marker) if oid is marker: raise TypeError("Only first-class persistent objects may be" " added to a Connection.", obj) elif obj._p_jar is None: assert obj._p_oid is None oid = obj._p_oid = self.new_oid() obj._p_jar = self if self._added_during_commit is not None: self._added_during_commit.append(obj) self._register(obj) # Add to _added after calling register(), so that _added # can be used as a test for whether the object has been # registered with the transaction. self._added[oid] = obj elif obj._p_jar is not self: raise InvalidObjectReference(obj, obj._p_jar) def get(self, oid): """Return the persistent object with oid 'oid'.""" if self.opened is None: raise ConnectionStateError("The database connection is closed") obj = self._cache.get(oid, None) if obj is not None: return obj obj = self._added.get(oid, None) if obj is not None: return obj obj = self._pre_cache.get(oid, None) if obj is not None: return obj p, serial = self._storage.load(oid, '') obj = self._reader.getGhost(p) # Avoid infiniate loop if obj tries to load its state before # it is added to the cache and it's state refers to it. # (This will typically be the case for non-ghostifyable objects, # like persistent caches.) self._pre_cache[oid] = obj self._cache.new_ghost(oid, obj) self._pre_cache.pop(oid) return obj def cacheMinimize(self): """Deactivate all unmodified objects in the cache. """ for connection in self.connections.itervalues(): connection._cache.minimize() # TODO: we should test what happens when cacheGC is called mid-transaction. def cacheGC(self): """Reduce cache size to target size. """ for connection in self.connections.itervalues(): connection._cache.incrgc() __onCloseCallbacks = None def onCloseCallback(self, f): """Register a callable, f, to be called by close().""" if self.__onCloseCallbacks is None: self.__onCloseCallbacks = [] self.__onCloseCallbacks.append(f) def close(self, primary=True): """Close the Connection.""" if not self._needs_to_join: # We're currently joined to a transaction. raise ConnectionStateError("Cannot close a connection joined to " "a transaction") if self._cache is not None: self._cache.incrgc() # This is a good time to do some GC # Call the close callbacks. if self.__onCloseCallbacks is not None: for f in self.__onCloseCallbacks: try: f() except: # except what? f = getattr(f, 'im_self', f) self._log.error("Close callback failed for %s", f, exc_info=sys.exc_info()) self.__onCloseCallbacks = None self._debug_info = () if self.opened: self.transaction_manager.unregisterSynch(self) if self._mvcc_storage: self._storage.sync(force=False) if primary: for connection in self.connections.values(): if connection is not self: connection.close(False) # Return the connection to the pool. if self.opened is not None: self._db._returnToPool(self) # _returnToPool() set self.opened to None. # However, we can't assert that here, because self may # have been reused (by another thread) by the time we # get back here. 
else: self.opened = None am = self._db._activity_monitor if am is not None: am.closedConnection(self) def db(self): """Returns a handle to the database this connection belongs to.""" return self._db def isReadOnly(self): """Returns True if this connection is read only.""" if self.opened is None: raise ConnectionStateError("The database connection is closed") return self.before is not None or self._storage.isReadOnly() def invalidate(self, tid, oids): """Notify the Connection that transaction 'tid' invalidated oids.""" if self.before is not None: # This is a historical connection. Invalidations are irrelevant. return self._inv_lock.acquire() try: if self._txn_time is None: self._txn_time = tid elif (tid < self._txn_time) and (tid is not None): raise AssertionError("invalidations out of order, %r < %r" % (tid, self._txn_time)) self._invalidated.update(oids) finally: self._inv_lock.release() def invalidateCache(self): self._inv_lock.acquire() try: self._invalidatedCache = True finally: self._inv_lock.release() @property def root(self): """Return the database root object.""" return RootConvenience(self.get(z64)) def get_connection(self, database_name): """Return a Connection for the named database.""" connection = self.connections.get(database_name) if connection is None: new_con = self._db.databases[database_name].open( transaction_manager=self.transaction_manager, before=self.before, ) self.connections.update(new_con.connections) new_con.connections = self.connections connection = new_con return connection def _implicitlyAdding(self, oid): """Are we implicitly adding an object within the current transaction This is used in a check to avoid implicitly adding an object to a database in a multi-database situation. See serialize.ObjectWriter.persistent_id. """ return (self._creating.get(oid, 0) or ((self._savepoint_storage is not None) and self._savepoint_storage.creating.get(oid, 0) ) ) def sync(self): """Manually update the view on the database.""" self.transaction_manager.abort() self._storage_sync() def getDebugInfo(self): """Returns a tuple with different items for debugging the connection. """ return self._debug_info def setDebugInfo(self, *args): """Add the given items to the debug information of this connection.""" self._debug_info = self._debug_info + args def getTransferCounts(self, clear=False): """Returns the number of objects loaded and stored.""" res = self._load_count, self._store_count if clear: self._load_count = 0 self._store_count = 0 return res # Connection methods ########################################################################## ########################################################################## # Data manager (ISavepointDataManager) methods def abort(self, transaction): """Abort a transaction and forget all changes.""" # The order is important here. We want to abort registered # objects before we process the cache. Otherwise, we may un-add # objects added in savepoints. If they've been modified since # the savepoint, then they won't have _p_oid or _p_jar after # they've been unadded. This will make the code in _abort # confused. 
self._abort() if self._savepoint_storage is not None: self._abort_savepoint() self._invalidate_creating() self._tpc_cleanup() def _abort(self): """Abort a transaction and forget all changes.""" for obj in self._registered_objects: oid = obj._p_oid assert oid is not None if oid in self._added: del self._added[oid] if self._cache.get(oid) is not None: del self._cache[oid] del obj._p_jar del obj._p_oid if obj._p_changed: obj._p_changed = False else: # Note: If we invalidate a non-ghostifiable object # (i.e. a persistent class), the object will # immediately reread its state. That means that the # following call could result in a call to # self.setstate, which, of course, must succeed. # In general, it would be better if the read could be # delayed until the start of the next transaction. If # we read at the end of a transaction and if the # object was invalidated during this transaction, then # we'll read non-current data, which we'll discard # later in transaction finalization. Unfortnately, we # can only delay the read if this abort corresponds to # a top-level-transaction abort. We can't tell if # this is a top-level-transaction abort, so we have to # go ahead and invalidate now. Fortunately, it's # pretty unlikely that the object we are invalidating # was invalidated by another thread, so the risk of a # reread is pretty low. self._cache.invalidate(oid) def _tpc_cleanup(self): """Performs cleanup operations to support tpc_finish and tpc_abort.""" self._conflicts.clear() self._needs_to_join = True self._registered_objects = [] self._creating.clear() # Process pending invalidations. def _flush_invalidations(self): if self._mvcc_storage: # Poll the storage for invalidations. invalidated = self._storage.poll_invalidations() if invalidated is None: # special value: the transaction is so old that # we need to flush the whole cache. self._cache.invalidate(self._cache.cache_data.keys()) elif invalidated: self._cache.invalidate(invalidated) self._inv_lock.acquire() try: # Non-ghostifiable objects may need to read when they are # invalidated, so we'll quickly just replace the # invalidating dict with a new one. We'll then process # the invalidations after freeing the lock *and* after # resetting the time. This means that invalidations will # happen after the start of the transactions. They are # subject to conflict errors and to reading old data. # TODO: There is a potential problem lurking for persistent # classes. Suppose we have an invalidation of a persistent # class and of an instance. If the instance is # invalidated first and if the invalidation logic uses # data read from the class, then the invalidation could # be performed with stale data. Or, suppose that there # are instances of the class that are freed as a result of # invalidating some object. Perhaps code in their __del__ # uses class data. Really, the only way to properly fix # this is to, in fact, make classes ghostifiable. Then # we'd have to reimplement attribute lookup to check the # class state and, if necessary, activate the class. It's # much worse than that though, because we'd also need to # deal with slots. When a class is ghostified, we'd need # to replace all of the slot operations with versions that # reloaded the object when called. It's hard to say which # is better or worse. For now, it seems the risk of # using a class while objects are being invalidated seems # small enough to be acceptable. 
invalidated = dict.fromkeys(self._invalidated) self._invalidated = set() self._txn_time = None if self._invalidatedCache: self._invalidatedCache = False invalidated = self._cache.cache_data.copy() finally: self._inv_lock.release() self._cache.invalidate(invalidated) # Now is a good time to collect some garbage. self._cache.incrgc() def tpc_begin(self, transaction): """Begin commit of a transaction, starting the two-phase commit.""" self._modified = [] # _creating is a list of oids of new objects, which is used to # remove them from the cache if a transaction aborts. self._creating.clear() self._normal_storage.tpc_begin(transaction) def commit(self, transaction): """Commit changes to an object""" if self._savepoint_storage is not None: # We first checkpoint the current changes to the savepoint self.savepoint() # then commit all of the savepoint changes at once self._commit_savepoint(transaction) # No need to call _commit since savepoint did. else: self._commit(transaction) for oid, serial in self._readCurrent.iteritems(): try: self._storage.checkCurrentSerialInTransaction( oid, serial, transaction) except ConflictError: self._cache.invalidate(oid) raise def _commit(self, transaction): """Commit changes to an object""" if self.before is not None: raise ReadOnlyHistoryError() if self._import: # We are importing an export file. We alsways do this # while making a savepoint so we can copy export data # directly to our storage, typically a TmpStore. self._importDuringCommit(transaction, *self._import) self._import = None # Just in case an object is added as a side-effect of storing # a modified object. If, for example, a __getstate__() method # calls add(), the newly added objects will show up in # _added_during_commit. This sounds insane, but has actually # happened. self._added_during_commit = [] if self._invalidatedCache: raise ConflictError() for obj in self._registered_objects: oid = obj._p_oid assert oid if oid in self._conflicts: raise ReadConflictError(object=obj) if obj._p_jar is not self: raise InvalidObjectReference(obj, obj._p_jar) elif oid in self._added: assert obj._p_serial == z64 elif obj._p_changed: if oid in self._invalidated: resolve = getattr(obj, "_p_resolveConflict", None) if resolve is None: raise ConflictError(object=obj) self._modified.append(oid) else: # Nothing to do. It's been said that it's legal, e.g., for # an object to set _p_changed to false after it's been # changed and registered. continue self._store_objects(ObjectWriter(obj), transaction) for obj in self._added_during_commit: self._store_objects(ObjectWriter(obj), transaction) self._added_during_commit = None def _store_objects(self, writer, transaction): for obj in writer: oid = obj._p_oid serial = getattr(obj, "_p_serial", z64) if ((serial == z64) and ((self._savepoint_storage is None) or (oid not in self._savepoint_storage.creating) or self._savepoint_storage.creating[oid] ) ): # obj is a new object # Because obj was added, it is now in _creating, so it # can be removed from _added. If oid wasn't in # adding, then we are adding it implicitly. 
implicitly_adding = self._added.pop(oid, None) is None self._creating[oid] = implicitly_adding else: if (oid in self._invalidated and not hasattr(obj, '_p_resolveConflict')): raise ConflictError(object=obj) self._modified.append(oid) p = writer.serialize(obj) # This calls __getstate__ of obj if len(p) >= self.large_record_size: warnings.warn(large_object_message % (obj.__class__, len(p))) if isinstance(obj, Blob): if not IBlobStorage.providedBy(self._storage): raise Unsupported( "Storing Blobs in %s is not supported." % repr(self._storage)) if obj.opened(): raise ValueError("Can't commit with opened blobs.") blobfilename = obj._uncommitted() if blobfilename is None: assert serial is not None # See _uncommitted self._modified.pop() # not modified continue s = self._storage.storeBlob(oid, serial, p, blobfilename, '', transaction) # we invalidate the object here in order to ensure # that that the next attribute access of its name # unghostify it, which will cause its blob data # to be reattached "cleanly" obj._p_invalidate() else: s = self._storage.store(oid, serial, p, '', transaction) self._store_count += 1 # Put the object in the cache before handling the # response, just in case the response contains the # serial number for a newly created object try: self._cache[oid] = obj except: # Dang, I bet it's wrapped: # TODO: Deprecate, then remove, this. if hasattr(obj, 'aq_base'): self._cache[oid] = obj.aq_base else: raise self._cache.update_object_size_estimation(oid, len(p)) obj._p_estimated_size = len(p) self._handle_serial(oid, s) def _handle_serial(self, oid, serial, change=True): # if we write an object, we don't want to check if it was read # while current. This is a convenient choke point to do this. self._readCurrent.pop(oid, None) if not serial: return if not isinstance(serial, str): raise serial obj = self._cache.get(oid, None) if obj is None: return if serial == ResolvedSerial: del obj._p_changed # transition from changed to ghost else: if change: obj._p_changed = 0 # transition from changed to up-to-date obj._p_serial = serial def tpc_abort(self, transaction): if self._import: self._import = None if self._savepoint_storage is not None: self._abort_savepoint() self._storage.tpc_abort(transaction) # Note: If we invalidate a non-ghostifiable object (i.e. a # persistent class), the object will immediately reread its # state. That means that the following call could result in a # call to self.setstate, which, of course, must succeed. In # general, it would be better if the read could be delayed # until the start of the next transaction. If we read at the # end of a transaction and if the object was invalidated # during this transaction, then we'll read non-current data, # which we'll discard later in transaction finalization. We # could, theoretically queue this invalidation by calling # self.invalidate. Unfortunately, attempts to make that # change resulted in mysterious test failures. It's pretty # unlikely that the object we are invalidating was invalidated # by another thread, so the risk of a reread is pretty low. # It's really not worth the effort to pursue this. 
self._cache.invalidate(self._modified) self._invalidate_creating() while self._added: oid, obj = self._added.popitem() if obj._p_changed: obj._p_changed = False del obj._p_oid del obj._p_jar self._tpc_cleanup() def _invalidate_creating(self, creating=None): """Disown any objects newly saved in an uncommitted transaction.""" if creating is None: creating = self._creating self._creating = {} for oid in creating: o = self._cache.get(oid) if o is not None: del self._cache[oid] if o._p_changed: o._p_changed = False del o._p_jar del o._p_oid def tpc_vote(self, transaction): """Verify that a data manager can commit the transaction.""" try: vote = self._storage.tpc_vote except AttributeError: return try: s = vote(transaction) except ReadConflictError, v: if v.oid: self._cache.invalidate(v.oid) raise if s: for oid, serial in s: self._handle_serial(oid, serial) def tpc_finish(self, transaction): """Indicate confirmation that the transaction is done.""" def callback(tid): if self._mvcc_storage: # Inter-connection invalidation is not needed when the # storage provides MVCC. return d = dict.fromkeys(self._modified) self._db.invalidate(tid, d, self) # It's important that the storage calls the passed function # while it still has its lock. We don't want another thread # to be able to read any updated data until we've had a chance # to send an invalidation message to all of the other # connections! self._storage.tpc_finish(transaction, callback) self._tpc_cleanup() def sortKey(self): """Return a consistent sort key for this connection.""" return "%s:%s" % (self._storage.sortKey(), id(self)) # Data manager (ISavepointDataManager) methods ########################################################################## ########################################################################## # Transaction-manager synchronization -- ISynchronizer def beforeCompletion(self, txn): # We don't do anything before a commit starts. pass # Call the underlying storage's sync() method (if any), and process # pending invalidations regardless. Of course this should only be # called at transaction boundaries. def _storage_sync(self, *ignored): self._readCurrent.clear() sync = getattr(self._storage, 'sync', 0) if sync: sync() self._flush_invalidations() afterCompletion = _storage_sync newTransaction = _storage_sync # Transaction-manager synchronization -- ISynchronizer ########################################################################## ########################################################################## # persistent.interfaces.IPersistentDatamanager def oldstate(self, obj, tid): """Return copy of 'obj' that was written by transaction 'tid'.""" assert obj._p_jar is self p = self._storage.loadSerial(obj._p_oid, tid) return self._reader.getState(p) def setstate(self, obj): """Turns the ghost 'obj' into a real object by loading its state from the database.""" oid = obj._p_oid if self.opened is None: msg = ("Shouldn't load state for %s " "when the connection is closed" % oid_repr(oid)) self._log.error(msg) raise ConnectionStateError(msg) try: self._setstate(obj) except ConflictError: raise except: self._log.error("Couldn't load state for %s", oid_repr(oid), exc_info=sys.exc_info()) raise def _setstate(self, obj): # Helper for setstate(), which provides logging of failures. # The control flow is complicated here to avoid loading an # object revision that we are sure we aren't going to use. As # a result, invalidation tests occur before and after the # load. We can only be sure about invalidations after the # load. 
# If an object has been invalidated, among the cases to consider: # - Try MVCC # - Raise ConflictError. if self.before is not None: # Load data that was current before the time we have. before = self.before t = self._storage.loadBefore(obj._p_oid, before) if t is None: raise POSKeyError() # historical connection! p, serial, end = t else: # There is a harmless data race with self._invalidated. A # dict update could go on in another thread, but we don't care # because we have to check again after the load anyway. if self._invalidatedCache: raise ReadConflictError() if (obj._p_oid in self._invalidated): self._load_before_or_conflict(obj) return p, serial = self._storage.load(obj._p_oid, '') self._load_count += 1 self._inv_lock.acquire() try: invalid = obj._p_oid in self._invalidated finally: self._inv_lock.release() if invalid: self._load_before_or_conflict(obj) return self._reader.setGhostState(obj, p) obj._p_serial = serial self._cache.update_object_size_estimation(obj._p_oid, len(p)) obj._p_estimated_size = len(p) # Blob support if isinstance(obj, Blob): obj._p_blob_uncommitted = None obj._p_blob_committed = self._storage.loadBlob(obj._p_oid, serial) def _load_before_or_conflict(self, obj): """Load non-current state for obj or raise ReadConflictError.""" if not self._setstate_noncurrent(obj): self._register(obj) self._conflicts[obj._p_oid] = True raise ReadConflictError(object=obj) def _setstate_noncurrent(self, obj): """Set state using non-current data. Return True if state was available, False if not. """ try: # Load data that was current before the commit at txn_time. t = self._storage.loadBefore(obj._p_oid, self._txn_time) except KeyError: return False if t is None: return False data, start, end = t # The non-current transaction must have been written before # txn_time. It must be current at txn_time, but could have # been modified at txn_time. assert start < self._txn_time, (u64(start), u64(self._txn_time)) assert end is not None assert self._txn_time <= end, (u64(self._txn_time), u64(end)) self._reader.setGhostState(obj, data) obj._p_serial = start # MVCC Blob support if isinstance(obj, Blob): obj._p_blob_uncommitted = None obj._p_blob_committed = self._storage.loadBlob(obj._p_oid, start) return True def register(self, obj): """Register obj with the current transaction manager. A subclass could override this method to customize the default policy of one transaction manager for each thread. obj must be an object loaded from this Connection. """ assert obj._p_jar is self if obj._p_oid is None: # The actual complaint here is that an object without # an oid is being registered. I can't think of any way to # achieve that without assignment to _p_jar. If there is # a way, this will be a very confusing exception. raise ValueError("assigning to _p_jar is not supported") elif obj._p_oid in self._added: # It was registered before it was added to _added. return self._register(obj) def _register(self, obj=None): # The order here is important. We need to join before # registering the object, because joining may take a # savepoint, and the savepoint should not reflect the change # to the object. 
if self._needs_to_join: self.transaction_manager.get().join(self) self._needs_to_join = False if obj is not None: self._registered_objects.append(obj) def readCurrent(self, ob): assert ob._p_jar is self assert ob._p_oid is not None and ob._p_serial is not None if ob._p_serial != z64: self._readCurrent[ob._p_oid] = ob._p_serial # persistent.interfaces.IPersistentDatamanager ########################################################################## ########################################################################## # PROTECTED stuff (used by e.g. ZODB.DB.DB) def _cache_items(self): # find all items on the lru list items = self._cache.lru_items() # fine everything. some on the lru list, some not everything = self._cache.cache_data # remove those items that are on the lru list for k,v in items: del everything[k] # return a list of [ghosts....not recently used.....recently used] return everything.items() + items def open(self, transaction_manager=None, delegate=True): """Register odb, the DB that this Connection uses. This method is called by the DB every time a Connection is opened. Any invalidations received while the Connection was closed will be processed. If the global module function resetCaches() was called, the cache will be cleared. Parameters: odb: database that owns the Connection transaction_manager: transaction manager to use. None means use the default transaction manager. register for afterCompletion() calls. """ self.opened = time.time() if transaction_manager is None: transaction_manager = transaction.manager self.transaction_manager = transaction_manager if self._reset_counter != global_reset_counter: # New code is in place. Start a new cache. self._resetCache() else: self._flush_invalidations() transaction_manager.registerSynch(self) if self._cache is not None: self._cache.incrgc() # This is a good time to do some GC if delegate: # delegate open to secondary connections for connection in self.connections.values(): if connection is not self: connection.open(transaction_manager, False) def _resetCache(self): """Creates a new cache, discarding the old one. See the docstring for the resetCaches() function. """ self._reset_counter = global_reset_counter self._invalidated.clear() self._invalidatedCache = False cache_size = self._cache.cache_size cache_size_bytes = self._cache.cache_size_bytes self._cache = cache = PickleCache(self, cache_size, cache_size_bytes) if getattr(self, '_reader', None) is not None: self._reader._cache = cache def _release_resources(self): for c in self.connections.itervalues(): if c._mvcc_storage: c._storage.release() c._storage = c._normal_storage = None c._cache = PickleCache(self, 0, 0) ########################################################################## # Python protocol def __repr__(self): return '' % (positive_id(self),) # Python protocol ########################################################################## ########################################################################## # DEPRECATION candidates __getitem__ = get def exchange(self, old, new): # called by a ZClasses method that isn't executed by the test suite oid = old._p_oid new._p_oid = oid new._p_jar = self new._p_changed = 1 self._register(new) self._cache[oid] = new # DEPRECATION candidates ########################################################################## ########################################################################## # DEPRECATED methods # None at present. 
# DEPRECATED methods ########################################################################## ##################################################################### # Savepoint support def savepoint(self): if self._savepoint_storage is None: tmpstore = TmpStore(self._normal_storage) self._savepoint_storage = tmpstore self._storage = self._savepoint_storage self._creating.clear() self._commit(None) self._storage.creating.update(self._creating) self._creating.clear() self._registered_objects = [] state = (self._storage.position, self._storage.index.copy(), self._storage.creating.copy(), ) result = Savepoint(self, state) # While the interface doesn't guarantee this, savepoints are # sometimes used just to "break up" very long transactions, and as # a pragmatic matter this is a good time to reduce the cache # memory burden. self.cacheGC() return result def _rollback(self, state): self._abort() self._registered_objects = [] src = self._storage # Invalidate objects created *after* the savepoint. self._invalidate_creating((oid for oid in src.creating if oid not in state[2])) index = src.index src.reset(*state) self._cache.invalidate(index) def _commit_savepoint(self, transaction): """Commit all changes made in savepoints and begin 2-phase commit """ src = self._savepoint_storage self._storage = self._normal_storage self._savepoint_storage = None try: self._log.debug("Committing savepoints of size %s", src.getSize()) oids = src.index.keys() # Copy invalidating and creating info from temporary storage: self._modified.extend(oids) self._creating.update(src.creating) for oid in oids: data, serial = src.load(oid, src) obj = self._cache.get(oid, None) if obj is not None: self._cache.update_object_size_estimation( obj._p_oid, len(data)) obj._p_estimated_size = len(data) if isinstance(self._reader.getGhost(data), Blob): blobfilename = src.loadBlob(oid, serial) s = self._storage.storeBlob( oid, serial, data, blobfilename, '', transaction) # we invalidate the object here in order to ensure # that that the next attribute access of its name # unghostify it, which will cause its blob data # to be reattached "cleanly" self.invalidate(None, (oid, )) else: s = self._storage.store(oid, serial, data, '', transaction) self._handle_serial(oid, s, change=False) finally: src.close() def _abort_savepoint(self): """Discard all savepoint data.""" src = self._savepoint_storage self._invalidate_creating(src.creating) self._storage = self._normal_storage self._savepoint_storage = None # Note: If we invalidate a non-ghostifiable object (i.e. a # persistent class), the object will immediately reread it's # state. That means that the following call could result in a # call to self.setstate, which, of course, must succeed. In # general, it would be better if the read could be delayed # until the start of the next transaction. If we read at the # end of a transaction and if the object was invalidated # during this transaction, then we'll read non-current data, # which we'll discard later in transaction finalization. We # could, theoretically queue this invalidation by calling # self.invalidate. Unfortunately, attempts to make that # change resulted in mysterious test failures. It's pretty # unlikely that the object we are invalidating was invalidated # by another thread, so the risk of a reread is pretty low. # It's really not worth the effort to pursue this. # Note that we do this *after* reseting the storage so that, if # data are read, we read it from the reset storage! 
self._cache.invalidate(src.index) src.close() # Savepoint support ##################################################################### class Savepoint: implements(IDataManagerSavepoint) def __init__(self, datamanager, state): self.datamanager = datamanager self.state = state def rollback(self): self.datamanager._rollback(self.state) class TmpStore: """A storage-like thing to support savepoints.""" implements(IBlobStorage) def __init__(self, storage): self._storage = storage for method in ( 'getName', 'new_oid', 'getSize', 'sortKey', 'loadBefore', 'isReadOnly' ): setattr(self, method, getattr(storage, method)) self._file = tempfile.TemporaryFile() # position: current file position # _tpos: file position at last commit point self.position = 0L # index: map oid to pos of last committed version self.index = {} self.creating = {} self._blob_dir = None def __len__(self): return len(self.index) def close(self): self._file.close() if self._blob_dir is not None: remove_committed_dir(self._blob_dir) self._blob_dir = None def load(self, oid, version): pos = self.index.get(oid) if pos is None: return self._storage.load(oid, '') self._file.seek(pos) h = self._file.read(8) oidlen = u64(h) read_oid = self._file.read(oidlen) if read_oid != oid: raise POSException.StorageSystemError('Bad temporary storage') h = self._file.read(16) size = u64(h[8:]) serial = h[:8] return self._file.read(size), serial def store(self, oid, serial, data, version, transaction): # we have this funny signature so we can reuse the normal non-commit # commit logic assert version == '' self._file.seek(self.position) l = len(data) if serial is None: serial = z64 header = p64(len(oid)) + oid + serial + p64(l) self._file.write(header) self._file.write(data) self.index[oid] = self.position self.position += l + len(header) return serial def storeBlob(self, oid, serial, data, blobfilename, version, transaction): assert version == '' serial = self.store(oid, serial, data, '', transaction) targetpath = self._getBlobPath() if not os.path.exists(targetpath): os.makedirs(targetpath, 0700) targetname = self._getCleanFilename(oid, serial) rename_or_copy_blob(blobfilename, targetname, chmod=False) def loadBlob(self, oid, serial): """Return the filename where the blob file can be found. """ if not IBlobStorage.providedBy(self._storage): raise Unsupported( "Blobs are not supported by the underlying storage %r." % self._storage) filename = self._getCleanFilename(oid, serial) if not os.path.exists(filename): return self._storage.loadBlob(oid, serial) return filename def openCommittedBlobFile(self, oid, serial, blob=None): blob_filename = self.loadBlob(oid, serial) if blob is None: return open(blob_filename, 'rb') else: return ZODB.blob.BlobFile(blob_filename, 'r', blob) def _getBlobPath(self): blob_dir = self._blob_dir if blob_dir is None: blob_dir = tempfile.mkdtemp(dir=self.temporaryDirectory(), prefix='savepoints') self._blob_dir = blob_dir return blob_dir def _getCleanFilename(self, oid, tid): return os.path.join( self._getBlobPath(), "%s-%s%s" % (utils.oid_repr(oid), utils.tid_repr(tid), SAVEPOINT_SUFFIX,) ) def temporaryDirectory(self): return self._storage.temporaryDirectory() def reset(self, position, index, creating): self._file.truncate(position) self.position = position # Caution: We're typically called as part of a savepoint rollback. # Other machinery remembers the index to restore, and passes it to # us. 
If we simply bind self.index to `index`, then if the caller # didn't pass a copy of the index, the caller's index will mutate # when self.index mutates. This can be a disaster if the caller is a # savepoint to which the user rolls back again later (the savepoint # loses the original index it passed). Therefore, to be safe, we make # a copy of the index here. An alternative would be to ensure that # all callers pass copies. As is, our callers do not make copies. self.index = index.copy() self.creating = creating class RootConvenience(object): def __init__(self, root): self.__dict__['_root'] = root def __getattr__(self, name): try: return self._root[name] except KeyError: raise AttributeError(name) def __setattr__(self, name, v): self._root[name] = v def __delattr__(self, name): try: del self._root[name] except KeyError: raise AttributeError(name) def __call__(self): return self._root def __repr__(self): names = " ".join(sorted(self._root)) if len(names) > 60: names = names[:57].rsplit(' ', 1)[0] + ' ...' return "" % names large_object_message = """The %s object you're saving is large. (%s bytes.) Perhaps you're storing media which should be stored in blobs. Perhaps you're using a non-scalable data structure, such as a PersistentMapping or PersistentList. Perhaps you're storing data in objects that aren't persistent at all. In cases like that, the data is stored in the record of the containing persistent object. In any case, storing records this big is probably a bad idea. If you insist and want to get rid of this warning, use the large_record_size option of the ZODB.DB constructor (or the large-record-size option in a configuration file) to specify a larger size. """ ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/DB.py000066400000000000000000001034341230730566700213010ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Database objects """ import cPickle import cStringIO import sys import threading import logging import datetime import time import warnings from ZODB.broken import find_global from ZODB.utils import z64 from ZODB.Connection import Connection import ZODB.serialize import transaction.weakset from zope.interface import implements from ZODB.interfaces import IDatabase from ZODB.interfaces import IMVCCStorage import transaction from persistent.TimeStamp import TimeStamp logger = logging.getLogger('ZODB.DB') class AbstractConnectionPool(object): """Manage a pool of connections. CAUTION: Methods should be called under the protection of a lock. This class does no locking of its own. There's no limit on the number of connections this can keep track of, but a warning is logged if there are more than pool_size active connections, and a critical problem if more than twice pool_size. New connections are registered via push(). This will log a message if "too many" connections are active. When a connection is explicitly closed, tell the pool via repush(). 
That adds the connection to a stack of connections available for reuse, and throws away the oldest stack entries if the stack is too large. pop() pops this stack. When a connection is obtained via pop(), the pool holds only a weak reference to it thereafter. It's not necessary to inform the pool if the connection goes away. A connection handed out by pop() counts against pool_size only so long as it exists, and provided it isn't repush()'ed. A weak reference is retained so that DB methods like connectionDebugInfo() can still gather statistics. """ def __init__(self, size, timeout): # The largest # of connections we expect to see alive simultaneously. self._size = size # The minimum number of seconds that an available connection should # be kept, or None. self._timeout = timeout # A weak set of all connections we've seen. A connection vanishes # from this set if pop() hands it out, it's not reregistered via # repush(), and it becomes unreachable. self.all = transaction.weakset.WeakSet() def setSize(self, size): """Change our belief about the expected maximum # of live connections. If the pool_size is smaller than the current value, this may discard the oldest available connections. """ self._size = size self._reduce_size() def setTimeout(self, timeout): old = self._timeout self._timeout = timeout if timeout < old: self._reduce_size() def getSize(self): return self._size def getTimeout(self): return self._timeout timeout = property(getTimeout, lambda self, v: self.setTimeout(v)) size = property(getSize, lambda self, v: self.setSize(v)) class ConnectionPool(AbstractConnectionPool): def __init__(self, size, timeout=1<<31): super(ConnectionPool, self).__init__(size, timeout) # A stack of connections available to hand out. This is a subset # of self.all. push() and repush() add to this, and may remove # the oldest available connections if the pool is too large. # pop() pops this stack. There are never more than size entries # in this stack. self.available = [] def _append(self, c): available = self.available cactive = c._cache.cache_non_ghost_count if (available and (available[-1][1]._cache.cache_non_ghost_count > cactive) ): i = len(available) - 1 while (i and (available[i-1][1]._cache.cache_non_ghost_count > cactive) ): i -= 1 available.insert(i, (time.time(), c)) else: available.append((time.time(), c)) def push(self, c): """Register a new available connection. We must not know about c already. c will be pushed onto the available stack even if we're over the pool size limit. """ assert c not in self.all assert c not in self.available self._reduce_size(strictly_less=True) self.all.add(c) self._append(c) n = len(self.all) limit = self.size if n > limit: reporter = logger.warn if n > 2 * limit: reporter = logger.critical reporter("DB.open() has %s open connections with a pool_size " "of %s", n, limit) def repush(self, c): """Reregister an available connection formerly obtained via pop(). This pushes it on the stack of available connections, and may discard older available connections. """ assert c in self.all assert c not in self.available self._reduce_size(strictly_less=True) self._append(c) def _reduce_size(self, strictly_less=False): """Throw away the oldest available connections until we're under our target size (strictly_less=False, the default) or no more than that (strictly_less=True). 
""" threshhold = time.time() - self.timeout target = self.size if strictly_less: target -= 1 available = self.available while ( (len(available) > target) or (available and available[0][0] < threshhold) ): t, c = available.pop(0) self.all.remove(c) c._release_resources() def reduce_size(self): self._reduce_size() def pop(self): """Pop an available connection and return it. Return None if none are available - in this case, the caller should create a new connection, register it via push(), and call pop() again. The caller is responsible for serializing this sequence. """ result = None if self.available: _, result = self.available.pop() # Leave it in self.all, so we can still get at it for statistics # while it's alive. assert result in self.all return result def map(self, f): """For every live connection c, invoke f(c).""" self.all.map(f) def availableGC(self): """Perform garbage collection on available connections. If a connection is no longer viable because it has timed out, it is garbage collected.""" threshhold = time.time() - self.timeout to_remove = () for (t, c) in self.available: if t < threshhold: to_remove += (c,) self.all.remove(c) c._release_resources() else: c.cacheGC() if to_remove: self.available[:] = [i for i in self.available if i[1] not in to_remove] class KeyedConnectionPool(AbstractConnectionPool): # this pool keeps track of keyed connections all together. It makes # it possible to make assertions about total numbers of keyed connections. # The keys in this case are "before" TIDs, but this is used by other # packages as well. # see the comments in ConnectionPool for method descriptions. def __init__(self, size, timeout=1<<31): super(KeyedConnectionPool, self).__init__(size, timeout) self.pools = {} def setSize(self, v): self._size = v for pool in self.pools.values(): pool.setSize(v) def setTimeout(self, v): self._timeout = v for pool in self.pools.values(): pool.setTimeout(v) def push(self, c, key): pool = self.pools.get(key) if pool is None: pool = self.pools[key] = ConnectionPool(self.size, self.timeout) pool.push(c) def repush(self, c, key): self.pools[key].repush(c) def _reduce_size(self, strictly_less=False): for key, pool in list(self.pools.items()): pool._reduce_size(strictly_less) if not pool.all: del self.pools[key] def reduce_size(self): self._reduce_size() def pop(self, key): pool = self.pools.get(key) if pool is not None: return pool.pop() def map(self, f): for pool in self.pools.itervalues(): pool.map(f) def availableGC(self): for key, pool in self.pools.items(): pool.availableGC() if not pool.all: del self.pools[key] @property def test_all(self): result = set() for pool in self.pools.itervalues(): result.update(pool.all) return frozenset(result) @property def test_available(self): result = [] for pool in self.pools.itervalues(): result.extend(pool.available) return tuple(result) def toTimeStamp(dt): utc_struct = dt.utctimetuple() # if this is a leapsecond, this will probably fail. That may be a good # thing: leapseconds are not really accounted for with serials. 
args = utc_struct[:5]+(utc_struct[5] + dt.microsecond/1000000.0,) return TimeStamp(*args) def getTID(at, before): if at is not None: if before is not None: raise ValueError('can only pass zero or one of `at` and `before`') if isinstance(at, datetime.datetime): at = toTimeStamp(at) else: at = TimeStamp(at) before = repr(at.laterThan(at)) elif before is not None: if isinstance(before, datetime.datetime): before = repr(toTimeStamp(before)) else: before = repr(TimeStamp(before)) return before class DB(object): """The Object Database ------------------- The DB class coordinates the activities of multiple database Connection instances. Most of the work is done by the Connections created via the open method. The DB instance manages a pool of connections. If a connection is closed, it is returned to the pool and its object cache is preserved. A subsequent call to open() will reuse the connection. There is no hard limit on the pool size. If more than `pool_size` connections are opened, a warning is logged, and if more than twice that many, a critical problem is logged. The class variable 'klass' is used by open() to create database connections. It is set to Connection, but a subclass could override it to provide a different connection implementation. The database provides a few methods intended for application code -- open, close, undo, and pack -- and a large collection of methods for inspecting the database and its connections' caches. :Cvariables: - `klass`: Class used by L{open} to create database connections :Groups: - `User Methods`: __init__, open, close, undo, pack, classFactory - `Inspection Methods`: getName, getSize, objectCount, getActivityMonitor, setActivityMonitor - `Connection Pool Methods`: getPoolSize, getHistoricalPoolSize, setPoolSize, setHistoricalPoolSize, getHistoricalTimeout, setHistoricalTimeout - `Transaction Methods`: invalidate - `Other Methods`: lastTransaction, connectionDebugInfo - `Cache Inspection Methods`: cacheDetail, cacheExtremeDetail, cacheFullSweep, cacheLastGCTime, cacheMinimize, cacheSize, cacheDetailSize, getCacheSize, getHistoricalCacheSize, setCacheSize, setHistoricalCacheSize """ implements(IDatabase) klass = Connection # Class to use for connections _activity_monitor = next = previous = None def __init__(self, storage, pool_size=7, pool_timeout=1<<31, cache_size=400, cache_size_bytes=0, historical_pool_size=3, historical_cache_size=1000, historical_cache_size_bytes=0, historical_timeout=300, database_name='unnamed', databases=None, xrefs=True, large_record_size=1<<24, **storage_args): """Create an object database. :Parameters: - `storage`: the storage used by the database, e.g. FileStorage - `pool_size`: expected maximum number of open connections - `cache_size`: target size of Connection object cache - `cache_size_bytes`: target size measured in total estimated size of objects in the Connection object cache. "0" means unlimited. - `historical_pool_size`: expected maximum number of total historical connections - `historical_cache_size`: target size of Connection object cache for historical (`at` or `before`) connections - `historical_cache_size_bytes` -- similar to `cache_size_bytes` for the historical connection. - `historical_timeout`: minimum number of seconds that an unused historical connection will be kept, or None. 
- `xrefs` - Boolian flag indicating whether implicit cross-database references are allowed """ if isinstance(storage, basestring): from ZODB import FileStorage storage = ZODB.FileStorage.FileStorage(storage, **storage_args) elif storage is None: from ZODB import MappingStorage storage = ZODB.MappingStorage.MappingStorage(**storage_args) # Allocate lock. x = threading.RLock() self._a = x.acquire self._r = x.release # pools and cache sizes self.pool = ConnectionPool(pool_size, pool_timeout) self.historical_pool = KeyedConnectionPool(historical_pool_size, historical_timeout) self._cache_size = cache_size self._cache_size_bytes = cache_size_bytes self._historical_cache_size = historical_cache_size self._historical_cache_size_bytes = historical_cache_size_bytes # Setup storage self.storage = storage self.references = ZODB.serialize.referencesf try: storage.registerDB(self) except TypeError: storage.registerDB(self, None) # Backward compat if (not hasattr(storage, 'tpc_vote')) and not storage.isReadOnly(): warnings.warn( "Storage doesn't have a tpc_vote and this violates " "the storage API. Violently monkeypatching in a do-nothing " "tpc_vote.", DeprecationWarning, 2) storage.tpc_vote = lambda *args: None if IMVCCStorage.providedBy(storage): temp_storage = storage.new_instance() else: temp_storage = storage try: try: temp_storage.load(z64, '') except KeyError: # Create the database's root in the storage if it doesn't exist from persistent.mapping import PersistentMapping root = PersistentMapping() # Manually create a pickle for the root to put in the storage. # The pickle must be in the special ZODB format. file = cStringIO.StringIO() p = cPickle.Pickler(file, 1) p.dump((root.__class__, None)) p.dump(root.__getstate__()) t = transaction.Transaction() t.description = 'initial database creation' temp_storage.tpc_begin(t) temp_storage.store(z64, None, file.getvalue(), '', t) temp_storage.tpc_vote(t) temp_storage.tpc_finish(t) finally: if IMVCCStorage.providedBy(temp_storage): temp_storage.release() # Multi-database setup. if databases is None: databases = {} self.databases = databases self.database_name = database_name if database_name in databases: raise ValueError("database_name %r already in databases" % database_name) databases[database_name] = self self.xrefs = xrefs self.large_record_size = large_record_size @property def _storage(self): # Backward compatibility return self.storage # This is called by Connection.close(). def _returnToPool(self, connection): """Return a connection to the pool. connection._db must be self on entry. """ self._a() try: assert connection._db is self connection.opened = None if connection.before: self.historical_pool.repush(connection, connection.before) else: self.pool.repush(connection) finally: self._r() def _connectionMap(self, f): """Call f(c) for all connections c in all pools, live and historical. """ self._a() try: self.pool.map(f) self.historical_pool.map(f) finally: self._r() def cacheDetail(self): """Return information on objects in the various caches Organized by class. """ detail = {} def f(con, detail=detail): for oid, ob in con._cache.items(): module = getattr(ob.__class__, '__module__', '') module = module and '%s.' 
% module or '' c = "%s%s" % (module, ob.__class__.__name__) if c in detail: detail[c] += 1 else: detail[c] = 1 self._connectionMap(f) detail = detail.items() detail.sort() return detail def cacheExtremeDetail(self): detail = [] conn_no = [0] # A mutable reference to a counter def f(con, detail=detail, rc=sys.getrefcount, conn_no=conn_no): conn_no[0] += 1 cn = conn_no[0] for oid, ob in con._cache_items(): id = '' if hasattr(ob, '__dict__'): d = ob.__dict__ if d.has_key('id'): id = d['id'] elif d.has_key('__name__'): id = d['__name__'] module = getattr(ob.__class__, '__module__', '') module = module and ('%s.' % module) or '' # What refcount ('rc') should we return? The intent is # that we return the true Python refcount, but as if the # cache didn't exist. This routine adds 3 to the true # refcount: 1 for binding to name 'ob', another because # ob lives in the con._cache_items() list we're iterating # over, and calling sys.getrefcount(ob) boosts ob's # count by 1 too. So the true refcount is 3 less than # sys.getrefcount(ob) returns. But, in addition to that, # the cache holds an extra reference on non-ghost objects, # and we also want to pretend that doesn't exist. detail.append({ 'conn_no': cn, 'oid': oid, 'id': id, 'klass': "%s%s" % (module, ob.__class__.__name__), 'rc': rc(ob) - 3 - (ob._p_changed is not None), 'state': ob._p_changed, #'references': con.references(oid), }) self._connectionMap(f) return detail def cacheFullSweep(self): self._connectionMap(lambda c: c._cache.full_sweep()) def cacheLastGCTime(self): m = [0] def f(con, m=m): t = con._cache.cache_last_gc_time if t > m[0]: m[0] = t self._connectionMap(f) return m[0] def cacheMinimize(self): self._connectionMap(lambda c: c._cache.minimize()) def cacheSize(self): m = [0] def f(con, m=m): m[0] += con._cache.cache_non_ghost_count self._connectionMap(f) return m[0] def cacheDetailSize(self): m = [] def f(con, m=m): m.append({'connection': repr(con), 'ngsize': con._cache.cache_non_ghost_count, 'size': len(con._cache)}) self._connectionMap(f) m.sort() return m def close(self): """Close the database and its underlying storage. It is important to close the database, because the storage may flush in-memory data structures to disk when it is closed. Leaving the storage open with the process exits can cause the next open to be slow. What effect does closing the database have on existing connections? Technically, they remain open, but their storage is closed, so they stop behaving usefully. Perhaps close() should also close all the Connections. """ noop = lambda *a: None self.close = noop @self._connectionMap def _(c): c.transaction_manager.abort() c.afterCompletion = c.newTransaction = c.close = noop c._release_resources() self.storage.close() del self.storage def getCacheSize(self): return self._cache_size def getCacheSizeBytes(self): return self._cache_size_bytes def lastTransaction(self): return self.storage.lastTransaction() def getName(self): return self.storage.getName() def getPoolSize(self): return self.pool.size def getSize(self): return self.storage.getSize() def getHistoricalCacheSize(self): return self._historical_cache_size def getHistoricalCacheSizeBytes(self): return self._historical_cache_size_bytes def getHistoricalPoolSize(self): return self.historical_pool.size def getHistoricalTimeout(self): return self.historical_pool.timeout def invalidate(self, tid, oids, connection=None, version=''): """Invalidate references to a given oid. This is used to indicate that one of the connections has committed a change to the object. 
The connection commiting the change should be passed in to prevent useless (but harmless) messages to the connection. """ # Storages, esp. ZEO tests, need the version argument still. :-/ assert version=='' # Notify connections. def inval(c): if c is not connection: c.invalidate(tid, oids) self._connectionMap(inval) def invalidateCache(self): """Invalidate each of the connection caches """ self._connectionMap(lambda c: c.invalidateCache()) transform_record_data = untransform_record_data = lambda self, data: data def objectCount(self): return len(self.storage) def open(self, transaction_manager=None, at=None, before=None): """Return a database Connection for use by application code. Note that the connection pool is managed as a stack, to increase the likelihood that the connection's stack will include useful objects. :Parameters: - `transaction_manager`: transaction manager to use. None means use the default transaction manager. - `at`: a datetime.datetime or 8 character transaction id of the time to open the database with a read-only connection. Passing both `at` and `before` raises a ValueError, and passing neither opens a standard writable transaction of the newest state. A timezone-naive datetime.datetime is treated as a UTC value. - `before`: like `at`, but opens the readonly state before the tid or datetime. """ # `at` is normalized to `before`, since we use storage.loadBefore # as the underlying implementation of both. before = getTID(at, before) if (before is not None and before > self.lastTransaction() and before > getTID(self.lastTransaction(), None)): raise ValueError( 'cannot open an historical connection in the future.') if isinstance(transaction_manager, basestring): if transaction_manager: raise TypeError("Versions aren't supported.") warnings.warn( "A version string was passed to open.\n" "The first argument is a transaction manager.", DeprecationWarning, 2) transaction_manager = None self._a() try: # result <- a connection if before is not None: result = self.historical_pool.pop(before) if result is None: c = self.klass(self, self._historical_cache_size, before, self._historical_cache_size_bytes, ) self.historical_pool.push(c, before) result = self.historical_pool.pop(before) else: result = self.pool.pop() if result is None: c = self.klass(self, self._cache_size, None, self._cache_size_bytes, ) self.pool.push(c) result = self.pool.pop() assert result is not None # open the connection. result.open(transaction_manager) # A good time to do some cache cleanup. # (note we already have the lock) self.pool.availableGC() self.historical_pool.availableGC() return result finally: self._r() def connectionDebugInfo(self): result = [] t = time.time() def get_info(c): # `result`, `time` and `before` are lexically inherited. o = c.opened d = c.getDebugInfo() if d: if len(d) == 1: d = d[0] else: d = '' d = "%s (%s)" % (d, len(c._cache)) # output UTC time with the standard Z time zone indicator result.append({ 'opened': o and ("%s (%.2fs)" % ( time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(o)), t-o)), 'info': d, 'before': c.before, }) self._connectionMap(get_info) return result def getActivityMonitor(self): return self._activity_monitor def pack(self, t=None, days=0): """Pack the storage, deleting unused object revisions. A pack is always performed relative to a particular time, by default the current time. All object revisions that are not reachable as of the pack time are deleted from the storage. The cost of this operation varies by storage, but it is usually an expensive operation. 
There are two optional arguments that can be used to set the pack time: t, pack time in seconds since the epcoh, and days, the number of days to subtract from t or from the current time if t is not specified. """ if t is None: t = time.time() t -= days * 86400 try: self.storage.pack(t, self.references) except: logger.error("packing", exc_info=True) raise def setActivityMonitor(self, am): self._activity_monitor = am def classFactory(self, connection, modulename, globalname): # Zope will rebind this method to arbitrary user code at runtime. return find_global(modulename, globalname) def setCacheSize(self, size): self._a() try: self._cache_size = size def setsize(c): c._cache.cache_size = size self.pool.map(setsize) finally: self._r() def setCacheSizeBytes(self, size): self._a() try: self._cache_size_bytes = size def setsize(c): c._cache.cache_size_bytes = size self.pool.map(setsize) finally: self._r() def setHistoricalCacheSize(self, size): self._a() try: self._historical_cache_size = size def setsize(c): c._cache.cache_size = size self.historical_pool.map(setsize) finally: self._r() def setHistoricalCacheSizeBytes(self, size): self._a() try: self._historical_cache_size_bytes = size def setsize(c): c._cache.cache_size_bytes = size self.historical_pool.map(setsize) finally: self._r() def setPoolSize(self, size): self._a() try: self.pool.size = size finally: self._r() def setHistoricalPoolSize(self, size): self._a() try: self.historical_pool.size = size finally: self._r() def setHistoricalTimeout(self, timeout): self._a() try: self.historical_pool.timeout = timeout finally: self._r() def history(self, *args, **kw): return self.storage.history(*args, **kw) def supportsUndo(self): try: f = self.storage.supportsUndo except AttributeError: return False return f() def undoLog(self, *args, **kw): if not self.supportsUndo(): return () return self.storage.undoLog(*args, **kw) def undoInfo(self, *args, **kw): if not self.supportsUndo(): return () return self.storage.undoInfo(*args, **kw) def undoMultiple(self, ids, txn=None): """Undo multiple transactions identified by ids. A transaction can be undone if all of the objects involved in the transaction were not modified subsequently, if any modifications can be resolved by conflict resolution, or if subsequent changes resulted in the same object state. The values in ids should be generated by calling undoLog() or undoInfo(). The value of ids are not the same as a transaction ids used by other methods; they are unique to undo(). :Parameters: - `ids`: a sequence of storage-specific transaction identifiers - `txn`: transaction context to use for undo(). By default, uses the current transaction. """ if not self.supportsUndo(): raise NotImplementedError if txn is None: txn = transaction.get() if isinstance(ids, basestring): ids = [ids] txn.join(TransactionalUndo(self, ids)) def undo(self, id, txn=None): """Undo a transaction identified by id. A transaction can be undone if all of the objects involved in the transaction were not modified subsequently, if any modifications can be resolved by conflict resolution, or if subsequent changes resulted in the same object state. The value of id should be generated by calling undoLog() or undoInfo(). The value of id is not the same as a transaction id used by other methods; it is unique to undo(). :Parameters: - `id`: a transaction identifier - `txn`: transaction context to use for undo(). By default, uses the current transaction. 
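        The undo does not take effect until the enclosing transaction
        commits; for example, db.undo(db.undoLog()[0]['id']) followed by
        transaction.commit() reverses the most recent undoable transaction.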
""" self.undoMultiple([id], txn) def transaction(self): return ContextManager(self) def new_oid(self): return self.storage.new_oid() class ContextManager: """PEP 343 context manager """ def __init__(self, db): self.db = db def __enter__(self): self.tm = transaction.TransactionManager() self.conn = self.db.open(self.tm) return self.conn def __exit__(self, t, v, tb): if t is None: self.tm.commit() else: self.tm.abort() self.conn.close() resource_counter_lock = threading.Lock() resource_counter = 0 class TransactionalUndo(object): def __init__(self, db, tids): self._db = db self._storage = db.storage self._tids = tids self._oids = set() def abort(self, transaction): pass def tpc_begin(self, transaction): self._storage.tpc_begin(transaction) def commit(self, transaction): for tid in self._tids: result = self._storage.undo(tid, transaction) if result: self._oids.update(result[1]) def tpc_vote(self, transaction): for oid, _ in self._storage.tpc_vote(transaction) or (): self._oids.add(oid) def tpc_finish(self, transaction): self._storage.tpc_finish( transaction, lambda tid: self._db.invalidate(tid, self._oids) ) def tpc_abort(self, transaction): self._storage.tpc_abort(transaction) def sortKey(self): return "%s:%s" % (self._storage.sortKey(), id(self)) def connection(*args, **kw): db = DB(*args, **kw) conn = db.open() conn.onCloseCallback(db.close) return conn ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/DemoStorage.py000066400000000000000000000305051230730566700232230ustar00rootroot00000000000000############################################################################## # # Copyright (c) Zope Corporation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Demo ZODB storage A demo storage supports demos by allowing a volatile changed database to be layered over a base database. The base storage must not change. 
""" import os import random import weakref import tempfile import threading import ZODB.BaseStorage import ZODB.blob import ZODB.interfaces import ZODB.MappingStorage import ZODB.POSException import ZODB.utils import zope.interface class DemoStorage(object): zope.interface.implements( ZODB.interfaces.IStorage, ZODB.interfaces.IStorageIteration, ) def __init__(self, name=None, base=None, changes=None, close_base_on_close=None, close_changes_on_close=None): if close_base_on_close is None: if base is None: base = ZODB.MappingStorage.MappingStorage() close_base_on_close = False else: close_base_on_close = True self.base = base self.close_base_on_close = close_base_on_close if changes is None: self._temporary_changes = True changes = ZODB.MappingStorage.MappingStorage() zope.interface.alsoProvides(self, ZODB.interfaces.IBlobStorage) if close_changes_on_close is None: close_changes_on_close = False else: if ZODB.interfaces.IBlobStorage.providedBy(changes): zope.interface.alsoProvides(self, ZODB.interfaces.IBlobStorage) if close_changes_on_close is None: close_changes_on_close = True self.changes = changes self.close_changes_on_close = close_changes_on_close self._issued_oids = set() self._stored_oids = set() self._commit_lock = threading.Lock() self._transaction = None if name is None: name = 'DemoStorage(%r, %r)' % (base.getName(), changes.getName()) self.__name__ = name self._copy_methods_from_changes(changes) self._next_oid = random.randint(1, 1<<62) def _blobify(self): if (self._temporary_changes and isinstance(self.changes, ZODB.MappingStorage.MappingStorage) ): blob_dir = tempfile.mkdtemp('.demoblobs') _temporary_blobdirs[ weakref.ref(self, cleanup_temporary_blobdir) ] = blob_dir self.changes = ZODB.blob.BlobStorage(blob_dir, self.changes) self._copy_methods_from_changes(self.changes) return True def cleanup(self): self.base.cleanup() self.changes.cleanup() __opened = True def opened(self): return self.__opened def close(self): self.__opened = False if self.close_base_on_close: self.base.close() if self.close_changes_on_close: self.changes.close() def _copy_methods_from_changes(self, changes): for meth in ( '_lock_acquire', '_lock_release', 'getSize', 'history', 'isReadOnly', 'registerDB', 'sortKey', 'tpc_transaction', 'tpc_vote', ): setattr(self, meth, getattr(changes, meth)) supportsUndo = getattr(changes, 'supportsUndo', None) if supportsUndo is not None and supportsUndo(): for meth in ('supportsUndo', 'undo', 'undoLog', 'undoInfo'): setattr(self, meth, getattr(changes, meth)) zope.interface.alsoProvides(self, ZODB.interfaces.IStorageUndoable) lastInvalidations = getattr(changes, 'lastInvalidations', None) if lastInvalidations is not None: self.lastInvalidations = lastInvalidations def getName(self): return self.__name__ __repr__ = getName def getTid(self, oid): try: return self.changes.getTid(oid) except ZODB.POSException.POSKeyError: return self.base.getTid(oid) def iterator(self, start=None, end=None): for t in self.base.iterator(start, end): yield t for t in self.changes.iterator(start, end): yield t def lastTransaction(self): t = self.changes.lastTransaction() if t == ZODB.utils.z64: t = self.base.lastTransaction() return t def __len__(self): return len(self.changes) def load(self, oid, version=''): try: return self.changes.load(oid, version) except ZODB.POSException.POSKeyError: return self.base.load(oid, version) def loadBefore(self, oid, tid): try: result = self.changes.loadBefore(oid, tid) except ZODB.POSException.POSKeyError: # The oid isn't in the changes, so defer to base 
return self.base.loadBefore(oid, tid) if result is None: # The oid *was* in the changes, but there aren't any # earlier records. Maybe there are in the base. try: result = self.base.loadBefore(oid, tid) except ZODB.POSException.POSKeyError: # The oid isn't in the base, so None will be the right result pass else: if result and not result[-1]: end_tid = None t = self.changes.load(oid) while t: end_tid = t[1] t = self.changes.loadBefore(oid, end_tid) result = result[:2] + (end_tid,) return result def loadBlob(self, oid, serial): try: return self.changes.loadBlob(oid, serial) except ZODB.POSException.POSKeyError: try: return self.base.loadBlob(oid, serial) except AttributeError: if not ZODB.interfaces.IBlobStorage.providedBy(self.base): raise ZODB.POSException.POSKeyError(oid, serial) raise except AttributeError: if self._blobify(): return self.loadBlob(oid, serial) raise def openCommittedBlobFile(self, oid, serial, blob=None): try: return self.changes.openCommittedBlobFile(oid, serial, blob) except ZODB.POSException.POSKeyError: try: return self.base.openCommittedBlobFile(oid, serial, blob) except AttributeError: if not ZODB.interfaces.IBlobStorage.providedBy(self.base): raise ZODB.POSException.POSKeyError(oid, serial) raise except AttributeError: if self._blobify(): return self.openCommittedBlobFile(oid, serial, blob) raise def loadSerial(self, oid, serial): try: return self.changes.loadSerial(oid, serial) except ZODB.POSException.POSKeyError: return self.base.loadSerial(oid, serial) @ZODB.utils.locked def new_oid(self): while 1: oid = ZODB.utils.p64(self._next_oid ) if oid not in self._issued_oids: try: self.changes.load(oid, '') except ZODB.POSException.POSKeyError: try: self.base.load(oid, '') except ZODB.POSException.POSKeyError: self._next_oid += 1 self._issued_oids.add(oid) return oid self._next_oid = random.randint(1, 1<<62) def pack(self, t, referencesf, gc=None): if gc is None: if self._temporary_changes: return self.changes.pack(t, referencesf) elif self._temporary_changes: return self.changes.pack(t, referencesf, gc=gc) elif gc: raise TypeError( "Garbage collection isn't supported" " when there is a base storage.") try: self.changes.pack(t, referencesf, gc=False) except TypeError, v: if 'gc' in str(v): pass # The gc arg isn't supported. Don't pack raise def pop(self): self.changes.close() return self.base def push(self, changes=None): return self.__class__(base=self, changes=changes, close_base_on_close=False) def store(self, oid, serial, data, version, transaction): assert version=='', "versions aren't supported" if transaction is not self._transaction: raise ZODB.POSException.StorageTransactionError(self, transaction) # Since the OID is being used, we don't have to keep up with it any # more. Save it now so we can forget it later. :) self._stored_oids.add(oid) # See if we already have changes for this oid try: old = self.changes.load(oid, '')[1] except ZODB.POSException.POSKeyError: try: old = self.base.load(oid, '')[1] except ZODB.POSException.POSKeyError: old = serial if old != serial: raise ZODB.POSException.ConflictError( oid=oid, serials=(old, serial)) # XXX untested branch return self.changes.store(oid, serial, data, '', transaction) def storeBlob(self, oid, oldserial, data, blobfilename, version, transaction): assert version=='', "versions aren't supported" if transaction is not self._transaction: raise ZODB.POSException.StorageTransactionError(self, transaction) # Since the OID is being used, we don't have to keep up with it any # more. 
Save it now so we can forget it later. :) self._stored_oids.add(oid) try: return self.changes.storeBlob( oid, oldserial, data, blobfilename, '', transaction) except AttributeError: if self._blobify(): return self.changes.storeBlob( oid, oldserial, data, blobfilename, '', transaction) raise checkCurrentSerialInTransaction = ( ZODB.BaseStorage.checkCurrentSerialInTransaction) def temporaryDirectory(self): try: return self.changes.temporaryDirectory() except AttributeError: if self._blobify(): return self.changes.temporaryDirectory() raise @ZODB.utils.locked def tpc_abort(self, transaction): if transaction is not self._transaction: return self._stored_oids = set() self._transaction = None self.changes.tpc_abort(transaction) self._commit_lock.release() @ZODB.utils.locked def tpc_begin(self, transaction, *a, **k): # The tid argument exists to support testing. if transaction is self._transaction: raise ZODB.POSException.StorageTransactionError( "Duplicate tpc_begin calls for same transaction") self._lock_release() self._commit_lock.acquire() self._lock_acquire() self.changes.tpc_begin(transaction, *a, **k) self._transaction = transaction self._stored_oids = set() @ZODB.utils.locked def tpc_finish(self, transaction, func = lambda tid: None): if (transaction is not self._transaction): raise ZODB.POSException.StorageTransactionError( "tpc_finish called with wrong transaction") self._issued_oids.difference_update(self._stored_oids) self._stored_oids = set() self._transaction = None self.changes.tpc_finish(transaction, func) self._commit_lock.release() _temporary_blobdirs = {} def cleanup_temporary_blobdir( ref, _temporary_blobdirs=_temporary_blobdirs, # Make sure it stays around ): blob_dir = _temporary_blobdirs.pop(ref, None) if blob_dir and os.path.exists(blob_dir): ZODB.blob.remove_committed_dir(blob_dir) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/DemoStorage.test000066400000000000000000000254441230730566700235600ustar00rootroot00000000000000========================== DemoStorage demo (doctest) ========================== DemoStorages provide a way to provide incremental updates to an existing, base, storage without updating the storage. .. We need to mess with time to prevent spurious test failures on windows >>> now = 1231019584.0 >>> def faux_time_time(): ... global now ... now += .1 ... 
return now >>> import time >>> real_time_time = time.time >>> time.time = faux_time_time To see how this works, we'll start by creating a base storage and puting an object (in addition to the root object) in it: >>> from ZODB.FileStorage import FileStorage >>> base = FileStorage('base.fs') >>> from ZODB.DB import DB >>> db = DB(base) >>> from persistent.mapping import PersistentMapping >>> conn = db.open() >>> conn.root()['1'] = PersistentMapping({'a': 1, 'b':2}) >>> import transaction >>> transaction.commit() >>> db.close() >>> import os >>> original_size = os.path.getsize('base.fs') Now, lets reopen the base storage in read-only mode: >>> base = FileStorage('base.fs', read_only=True) And open a new storage to store changes: >>> changes = FileStorage('changes.fs') and combine the 2 in a demofilestorage: >>> from ZODB.DemoStorage import DemoStorage >>> storage = DemoStorage(base=base, changes=changes) If there are no transactions, the storage reports the lastTransaction of the base database: >>> storage.lastTransaction() == base.lastTransaction() True Let's add some data: >>> db = DB(storage) >>> conn = db.open() >>> items = conn.root()['1'].items() >>> items.sort() >>> items [('a', 1), ('b', 2)] >>> conn.root()['2'] = PersistentMapping({'a': 3, 'b':4}) >>> transaction.commit() >>> conn.root()['2']['c'] = 5 >>> transaction.commit() Here we can see that we haven't modified the base storage: >>> original_size == os.path.getsize('base.fs') True But we have modified the changes database: >>> len(changes) 2 Our lastTransaction reflects the lastTransaction of the changes: >>> storage.lastTransaction() > base.lastTransaction() True >>> storage.lastTransaction() == changes.lastTransaction() True Let's walk over some of the methods so we can see how we delegate to the new underlying storages: >>> from ZODB.utils import p64, u64 >>> storage.load(p64(0), '') == changes.load(p64(0), '') True >>> storage.load(p64(0), '') == base.load(p64(0), '') False >>> storage.load(p64(1), '') == base.load(p64(1), '') True >>> serial = base.getTid(p64(0)) >>> storage.loadSerial(p64(0), serial) == base.loadSerial(p64(0), serial) True >>> serial = changes.getTid(p64(0)) >>> storage.loadSerial(p64(0), serial) == changes.loadSerial(p64(0), ... serial) True The object id of the new object is quite random, and typically large: >>> print u64(conn.root()['2']._p_oid) 3553260803050964942 Let's look at some other methods: >>> storage.getName() "DemoStorage('base.fs', 'changes.fs')" >>> storage.sortKey() == changes.sortKey() True >>> storage.getSize() == changes.getSize() True >>> len(storage) == len(changes) True Undo methods are simply copied from the changes storage: >>> [getattr(storage, name) == getattr(changes, name) ... for name in ('supportsUndo', 'undo', 'undoLog', 'undoInfo') ... ] [True, True, True, True] >>> db.close() Closing demo storages ===================== Normally, when a demo storage is closed, it's base and changes storage are closed: >>> from ZODB.MappingStorage import MappingStorage >>> demo = DemoStorage(base=MappingStorage(), changes=MappingStorage()) >>> demo.close() >>> demo.base.opened(), demo.changes.opened() (False, False) You can pass constructor arguments to control whether the base and changes storages should be closed when the demo storage is closed: >>> demo = DemoStorage( ... base=MappingStorage(), changes=MappingStorage(), ... close_base_on_close=False, close_changes_on_close=False, ... 
) >>> demo.close() >>> demo.base.opened(), demo.changes.opened() (True, True) Storage Stacking ================ A common use case is to stack demo storages. DemoStorage provides some helper functions to help with this. The push method, just creates a new demo storage who's base is the original demo storage: >>> demo = DemoStorage() >>> demo2 = demo.push() >>> demo2.base is demo True We can also supply an explicit changes storage, if we wish: >>> changes = MappingStorage() >>> demo3 = demo2.push(changes) >>> demo3.changes is changes, demo3.base is demo2 (True, True) The pop method closes the changes storage and returns the base *without* closing it: >>> demo3.pop() is demo2 True >>> changes.opened() False If storage returned by push is closed, the original storage isn't: >>> demo3.push().close() >>> demo2.opened() True Blob Support ============ DemoStorage supports Blobs if the changes database supports blobs. >>> import ZODB.blob >>> base = ZODB.blob.BlobStorage('base', FileStorage('base.fs')) >>> db = DB(base) >>> conn = db.open() >>> conn.root()['blob'] = ZODB.blob.Blob() >>> conn.root()['blob'].open('w').write('state 1') >>> transaction.commit() >>> db.close() >>> base = ZODB.blob.BlobStorage('base', ... FileStorage('base.fs', read_only=True)) >>> changes = ZODB.blob.BlobStorage('changes', ... FileStorage('changes.fs', create=True)) >>> storage = DemoStorage(base=base, changes=changes) >>> db = DB(storage) >>> conn = db.open() >>> conn.root()['blob'].open().read() 'state 1' >>> _ = transaction.begin() >>> conn.root()['blob'].open('w').write('state 2') >>> transaction.commit() >>> conn.root()['blob'].open().read() 'state 2' >>> storage.temporaryDirectory() == changes.temporaryDirectory() True >>> db.close() It isn't necessary for the base database to support blobs. >>> base = FileStorage('base.fs', read_only=True) >>> changes = ZODB.blob.BlobStorage('changes', FileStorage('changes.fs')) >>> storage = DemoStorage(base=base, changes=changes) >>> db = DB(storage) >>> conn = db.open() >>> conn.root()['blob'].open().read() 'state 2' >>> _ = transaction.begin() >>> conn.root()['blob2'] = ZODB.blob.Blob() >>> conn.root()['blob2'].open('w').write('state 1') >>> conn.root()['blob2'].open().read() 'state 1' >>> db.close() If the changes database is created implicitly, it will get a blob storage wrapped around it when necessary: >>> base = ZODB.blob.BlobStorage('base', ... FileStorage('base.fs', read_only=True)) >>> storage = DemoStorage(base=base) >>> type(storage.changes).__name__ 'MappingStorage' >>> db = DB(storage) >>> conn = db.open() >>> conn.root()['blob'].open().read() 'state 1' >>> type(storage.changes).__name__ 'BlobStorage' >>> _ = transaction.begin() >>> conn.root()['blob'].open('w').write('state 2') >>> transaction.commit() >>> conn.root()['blob'].open().read() 'state 2' >>> storage.temporaryDirectory() == storage.changes.temporaryDirectory() True >>> db.close() .. Check that the temporary directory is gone For now, it won't go until the storage does. >>> transaction.abort() >>> blobdir = storage.temporaryDirectory() >>> del storage, _ >>> import gc >>> _ = gc.collect() >>> import os >>> os.path.exists(blobdir) False ZConfig support =============== You can configure demo storages using ZConfig, using name, changes, and base options: >>> import ZODB.config >>> storage = ZODB.config.storageFromString(""" ... ... ... """) >>> storage.getName() "DemoStorage('MappingStorage', 'MappingStorage')" >>> storage = ZODB.config.storageFromString(""" ... ... ... path base.fs ... ... ... ... 
path changes.fs ... ... ... """) >>> storage.getName() "DemoStorage('base.fs', 'changes.fs')" >>> storage.close() >>> storage = ZODB.config.storageFromString(""" ... ... name bob ... ... path base.fs ... ... ... ... path changes.fs ... ... ... """) >>> storage.getName() 'bob' >>> storage.base.getName() 'base.fs' >>> storage.close() Generating OIDs =============== When asked for a new OID DemoStorage chooses a value and then verifies that neither the base or changes storages already contain that OID. It chooses values sequentially from random starting points, picking new starting points whenever a chosen value us already in the changes or base. Under rare circumstances an OID can be chosen that has already been handed out, but which hasn't yet been comitted. Lets verify that if the same OID is chosen twice during a transaction that everything will still work. To test this, we need to hack random.randint a bit. >>> import random >>> randint = random.randint >>> rv = 42 >>> def faux_randint(min, max): ... print 'called randint' ... global rv ... rv += 1000 ... return rv >>> random.randint = faux_randint Now, we create a demostorage. >>> storage = DemoStorage() called randint If we ask for an oid, we'll get 1042. >>> u64(storage.new_oid()) 1042 oids are allocated seuentially: >>> u64(storage.new_oid()) 1043 Now, we'll save 1044 in changes so that it has to pick a new one randomly. >>> t = transaction.get() >>> ZODB.tests.util.store(storage.changes, 1044) >>> u64(storage.new_oid()) called randint 2042 Now, we hack rv to 1042 is given out again and we'll save 2043 in base to force another attempt: >>> rv -= 1000 >>> ZODB.tests.util.store(storage.changes, 2043) >>> oid = storage.new_oid() called randint called randint >>> u64(oid) 3042 DemoStorage keeps up with the issued OIDs to know when not to reissue them... >>> oid in storage._issued_oids True ...but once data is stored with a given OID... >>> ZODB.tests.util.store(storage, oid) ...there's no need to remember it any longer: >>> oid in storage._issued_oids False >>> storage.close() .. restore randint >>> random.randint = randint .. restore time >>> time.time = real_time_time ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/ExportImport.py000066400000000000000000000143161230730566700234700ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Support for database export and import.""" import os from cStringIO import StringIO from cPickle import Pickler, Unpickler from tempfile import TemporaryFile import logging from ZODB.blob import Blob from ZODB.interfaces import IBlobStorage from ZODB.POSException import ExportError from ZODB.serialize import referencesf from ZODB.utils import p64, u64, cp, mktemp logger = logging.getLogger('ZODB.ExportImport') class ExportImport: def exportFile(self, oid, f=None): if f is None: f = TemporaryFile() elif isinstance(f, str): f = open(f,'w+b') f.write('ZEXP') oids = [oid] done_oids = {} done=done_oids.has_key load=self._storage.load supports_blobs = IBlobStorage.providedBy(self._storage) while oids: oid = oids.pop(0) if oid in done_oids: continue done_oids[oid] = True try: p, serial = load(oid, '') except: logger.debug("broken reference for oid %s", repr(oid), exc_info=True) else: referencesf(p, oids) f.writelines([oid, p64(len(p)), p]) if supports_blobs: if not isinstance(self._reader.getGhost(p), Blob): continue # not a blob blobfilename = self._storage.loadBlob(oid, serial) f.write(blob_begin_marker) f.write(p64(os.stat(blobfilename).st_size)) blobdata = open(blobfilename, "rb") cp(blobdata, f) blobdata.close() f.write(export_end_marker) return f def importFile(self, f, clue='', customImporters=None): # This is tricky, because we need to work in a transaction! if isinstance(f, str): f = open(f, 'rb') magic = f.read(4) if magic != 'ZEXP': if customImporters and customImporters.has_key(magic): f.seek(0) return customImporters[magic](self, f, clue) raise ExportError("Invalid export header") t = self.transaction_manager.get() if clue: t.note(clue) return_oid_list = [] self._import = f, return_oid_list self._register() t.savepoint(optimistic=True) # Return the root imported object. if return_oid_list: return self.get(return_oid_list[0]) else: return None def _importDuringCommit(self, transaction, f, return_oid_list): """Import data during two-phase commit. Invoked by the transaction manager mid commit. Appends one item, the OID of the first object created, to return_oid_list. """ oids = {} # IMPORTANT: This code should be consistent with the code in # serialize.py. It is currently out of date and doesn't handle # weak references. 
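        # persistent_load (below) remaps each oid found in the export data
        # to a freshly allocated oid in this storage and returns a Ghost
        # placeholder; persistent_id turns those Ghosts back into plain
        # oids when the records are re-pickled for storing.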
def persistent_load(ooid): """Remap a persistent id to a new ID and create a ghost for it.""" klass = None if isinstance(ooid, tuple): ooid, klass = ooid if ooid in oids: oid = oids[ooid] else: if klass is None: oid = self._storage.new_oid() else: oid = self._storage.new_oid(), klass oids[ooid] = oid return Ghost(oid) while 1: header = f.read(16) if header == export_end_marker: break if len(header) != 16: raise ExportError("Truncated export file") # Extract header information ooid = header[:8] length = u64(header[8:16]) data = f.read(length) if len(data) != length: raise ExportError("Truncated export file") if oids: oid = oids[ooid] if isinstance(oid, tuple): oid = oid[0] else: oids[ooid] = oid = self._storage.new_oid() return_oid_list.append(oid) # Blob support blob_begin = f.read(len(blob_begin_marker)) if blob_begin == blob_begin_marker: # Copy the blob data to a temporary file # and remember the name blob_len = u64(f.read(8)) blob_filename = mktemp() blob_file = open(blob_filename, "wb") cp(f, blob_file, blob_len) blob_file.close() else: f.seek(-len(blob_begin_marker),1) blob_filename = None pfile = StringIO(data) unpickler = Unpickler(pfile) unpickler.persistent_load = persistent_load newp = StringIO() pickler = Pickler(newp, 1) pickler.inst_persistent_id = persistent_id pickler.dump(unpickler.load()) pickler.dump(unpickler.load()) data = newp.getvalue() if blob_filename is not None: self._storage.storeBlob(oid, None, data, blob_filename, '', transaction) else: self._storage.store(oid, None, data, '', transaction) export_end_marker = '\377'*16 blob_begin_marker = '\000BLOBSTART' class Ghost(object): __slots__ = ("oid",) def __init__(self, oid): self.oid = oid def persistent_id(obj): if isinstance(obj, Ghost): return obj.oid ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/FileStorage/000077500000000000000000000000001230730566700226415ustar00rootroot00000000000000ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/FileStorage/FileStorage.py000066400000000000000000002130371230730566700254250ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Storage implementation using a log written to a single file. 
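Transactions are appended to the end of a single data file. An index
mapping oids to file positions is kept in memory and saved to a .index
file so that later opens can avoid rescanning the whole file.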
""" from __future__ import with_statement from cPickle import Pickler, loads from persistent.TimeStamp import TimeStamp from struct import pack, unpack from zc.lockfile import LockFile from ZODB.FileStorage.format import CorruptedError, CorruptedDataError from ZODB.FileStorage.format import FileStorageFormatter, DataHeader from ZODB.FileStorage.format import TRANS_HDR, TRANS_HDR_LEN from ZODB.FileStorage.format import TxnHeader, DATA_HDR, DATA_HDR_LEN from ZODB.FileStorage.fspack import FileStoragePacker from ZODB.fsIndex import fsIndex from ZODB import BaseStorage, ConflictResolution, POSException from ZODB.POSException import UndoError, POSKeyError, MultipleUndoErrors from ZODB.utils import p64, u64, z64 import base64 import contextlib import errno import logging import os import threading import time import ZODB.blob import ZODB.interfaces import zope.interface import ZODB.utils # Not all platforms have fsync fsync = getattr(os, "fsync", None) packed_version = "FS21" logger = logging.getLogger('ZODB.FileStorage') def panic(message, *data): logger.critical(message, *data) raise CorruptedTransactionError(message % data) class FileStorageError(POSException.StorageError): pass class PackError(FileStorageError): pass class FileStorageFormatError(FileStorageError): """Invalid file format The format of the given file is not valid. """ class CorruptedFileStorageError(FileStorageError, POSException.StorageSystemError): """Corrupted file storage.""" class CorruptedTransactionError(CorruptedFileStorageError): pass class FileStorageQuotaError(FileStorageError, POSException.StorageSystemError): """File storage quota exceeded.""" # Intended to be raised only in fspack.py, and ignored here. class RedundantPackWarning(FileStorageError): pass class TempFormatter(FileStorageFormatter): """Helper class used to read formatted FileStorage data.""" def __init__(self, afile): self._file = afile class FileStorage( FileStorageFormatter, ZODB.blob.BlobStorageMixin, ConflictResolution.ConflictResolvingStorage, BaseStorage.BaseStorage, ): zope.interface.implements( ZODB.interfaces.IStorage, ZODB.interfaces.IStorageRestoreable, ZODB.interfaces.IStorageIteration, ZODB.interfaces.IStorageUndoable, ZODB.interfaces.IStorageCurrentRecordIteration, ZODB.interfaces.IExternalGC, ) # Set True while a pack is in progress; undo is blocked for the duration. _pack_is_in_progress = False def __init__(self, file_name, create=False, read_only=False, stop=None, quota=None, pack_gc=True, pack_keep_old=True, packer=None, blob_dir=None): if read_only: self._is_read_only = True if create: raise ValueError("can't create a read-only file") elif stop is not None: raise ValueError("time-travel only supported in read-only mode") if stop is None: stop='\377'*8 # Lock the database and set up the temp file. if not read_only: # Create the lock file self._lock_file = LockFile(file_name + '.lock') self._tfile = open(file_name + '.tmp', 'w+b') self._tfmt = TempFormatter(self._tfile) else: self._tfile = None self._file_name = os.path.abspath(file_name) self._pack_gc = pack_gc self.pack_keep_old = pack_keep_old if packer is not None: self.packer = packer BaseStorage.BaseStorage.__init__(self, file_name) index, tindex = self._newIndexes() self._initIndex(index, tindex) # Now open the file self._file = None if not create: try: self._file = open(file_name, read_only and 'rb' or 'r+b') except IOError, exc: if exc.errno == errno.EFBIG: # The file is too big to open. Fail visibly. raise if exc.errno == errno.ENOENT: # The file doesn't exist. Create it. 
create = 1 # If something else went wrong, it's hard to guess # what the problem was. If the file does not exist, # create it. Otherwise, fail. if os.path.exists(file_name): raise else: create = 1 if self._file is None and create: if os.path.exists(file_name): os.remove(file_name) self._file = open(file_name, 'w+b') self._file.write(packed_version) self._files = FilePool(self._file_name) r = self._restore_index() if r is not None: self._used_index = 1 # Marker for testing index, start, ltid = r self._initIndex(index, tindex) self._pos, self._oid, tid = read_index( self._file, file_name, index, tindex, stop, ltid=ltid, start=start, read_only=read_only, ) else: self._used_index = 0 # Marker for testing self._pos, self._oid, tid = read_index( self._file, file_name, index, tindex, stop, read_only=read_only, ) self._save_index() self._ltid = tid # self._pos should always point just past the last # transaction. During 2PC, data is written after _pos. # invariant is restored at tpc_abort() or tpc_finish(). self._ts = tid = TimeStamp(tid) t = time.time() t = TimeStamp(*time.gmtime(t)[:5] + (t % 60,)) if tid > t: seconds = tid.timeTime() - t.timeTime() complainer = logger.warning if seconds > 30 * 60: # 30 minutes -- way screwed up complainer = logger.critical complainer("%s Database records %d seconds in the future", file_name, seconds) self._quota = quota if blob_dir: self.blob_dir = os.path.abspath(blob_dir) if create and os.path.exists(self.blob_dir): ZODB.blob.remove_committed_dir(self.blob_dir) self._blob_init(blob_dir) zope.interface.alsoProvides(self, ZODB.interfaces.IBlobStorageRestoreable) else: self.blob_dir = None self._blob_init_no_blobs() def copyTransactionsFrom(self, other): if self.blob_dir: return ZODB.blob.BlobStorageMixin.copyTransactionsFrom(self, other) else: return BaseStorage.BaseStorage.copyTransactionsFrom(self, other) def _initIndex(self, index, tindex): self._index=index self._tindex=tindex self._index_get=index.get def __len__(self): return len(self._index) def _newIndexes(self): # hook to use something other than builtin dict return fsIndex(), {} _saved = 0 def _save_index(self): """Write the database index to a file to support quick startup.""" if self._is_read_only: return index_name = self.__name__ + '.index' tmp_name = index_name + '.index_tmp' self._index.save(self._pos, tmp_name) try: try: os.remove(index_name) except OSError: pass os.rename(tmp_name, index_name) except: pass self._saved += 1 def _clear_index(self): index_name = self.__name__ + '.index' if os.path.exists(index_name): try: os.remove(index_name) except OSError: pass def _sane(self, index, pos): """Sanity check saved index data by reading the last undone trans Basically, we read the last not undone transaction and check to see that the included records are consistent with the index. Any invalid record records or inconsistent object positions cause zero to be returned. 
""" r = self._check_sanity(index, pos) if not r: logger.warning("Ignoring index for %s", self._file_name) return r def _check_sanity(self, index, pos): if pos < 100: return 0 # insane self._file.seek(0, 2) if self._file.tell() < pos: return 0 # insane ltid = None max_checked = 5 checked = 0 while checked < max_checked: self._file.seek(pos - 8) rstl = self._file.read(8) tl = u64(rstl) pos = pos - tl - 8 if pos < 4: return 0 # insane h = self._read_txn_header(pos) if not ltid: ltid = h.tid if h.tlen != tl: return 0 # inconsistent lengths if h.status == 'u': continue # undone trans, search back if h.status not in ' p': return 0 # insane if tl < h.headerlen(): return 0 # insane tend = pos + tl opos = pos + h.headerlen() if opos == tend: continue # empty trans while opos < tend and checked < max_checked: # Read the data records for this transaction h = self._read_data_header(opos) if opos + h.recordlen() > tend or h.tloc != pos: return 0 if index.get(h.oid, 0) != opos: return 0 # insane checked += 1 opos = opos + h.recordlen() return ltid def _restore_index(self): """Load database index to support quick startup.""" # Returns (index, pos, tid), or None in case of error. # The index returned is always an instance of fsIndex. If the # index cached in the file is a Python dict, it's converted to # fsIndex here, and, if we're not in read-only mode, the .index # file is rewritten with the converted fsIndex so we don't need to # convert it again the next time. file_name=self.__name__ index_name=file_name+'.index' if os.path.exists(index_name): try: info = fsIndex.load(index_name) except: logger.exception('loading index') return None else: return None index = info.get('index') pos = info.get('pos') if index is None or pos is None: return None pos = long(pos) if (isinstance(index, dict) or (isinstance(index, fsIndex) and isinstance(index._data, dict))): # Convert dictionary indexes to fsIndexes *or* convert fsIndexes # which have a dict `_data` attribute to a new fsIndex (newer # fsIndexes have an OOBTree as `_data`). newindex = fsIndex() newindex.update(index) index = newindex if not self._is_read_only: # Save the converted index. f = open(index_name, 'wb') p = Pickler(f, 1) info['index'] = index p.dump(info) f.close() # Now call this method again to get the new data. return self._restore_index() tid = self._sane(index, pos) if not tid: return None return index, pos, tid def close(self): self._file.close() self._files.close() if hasattr(self,'_lock_file'): self._lock_file.close() if self._tfile: self._tfile.close() try: self._save_index() except: # Log the error and continue logger.error("Error saving index on close()", exc_info=True) def getSize(self): return self._pos def _lookup_pos(self, oid): try: return self._index[oid] except KeyError: raise POSKeyError(oid) except TypeError: raise TypeError("invalid oid %r" % (oid,)) def load(self, oid, version=''): """Return pickle data and serial number.""" assert not version with self._files.get() as _file: pos = self._lookup_pos(oid) h = self._read_data_header(pos, oid, _file) if h.plen: data = _file.read(h.plen) return data, h.tid elif h.back: # Get the data from the backpointer, but tid from # current txn. 
data = self._loadBack_impl(oid, h.back, _file=_file)[0] return data, h.tid else: raise POSKeyError(oid) def loadSerial(self, oid, serial): with self._lock: pos = self._lookup_pos(oid) while 1: h = self._read_data_header(pos, oid) if h.tid == serial: break pos = h.prev if not pos: raise POSKeyError(oid) if h.plen: return self._file.read(h.plen) else: return self._loadBack_impl(oid, h.back)[0] def loadBefore(self, oid, tid): with self._files.get() as _file: pos = self._lookup_pos(oid) end_tid = None while True: h = self._read_data_header(pos, oid, _file) if h.tid < tid: break pos = h.prev end_tid = h.tid if not pos: return None if h.back: data, _, _, _ = self._loadBack_impl(oid, h.back, _file=_file) return data, h.tid, end_tid else: return _file.read(h.plen), h.tid, end_tid def store(self, oid, oldserial, data, version, transaction): if self._is_read_only: raise POSException.ReadOnlyError() if transaction is not self._transaction: raise POSException.StorageTransactionError(self, transaction) assert not version with self._lock: if oid > self._oid: self.set_max_oid(oid) old = self._index_get(oid, 0) committed_tid = None pnv = None if old: h = self._read_data_header(old, oid) committed_tid = h.tid if oldserial != committed_tid: data = self.tryToResolveConflict(oid, committed_tid, oldserial, data) pos = self._pos here = pos + self._tfile.tell() + self._thl self._tindex[oid] = here new = DataHeader(oid, self._tid, old, pos, 0, len(data)) self._tfile.write(new.asString()) self._tfile.write(data) # Check quota if self._quota is not None and here > self._quota: raise FileStorageQuotaError( "The storage quota has been exceeded.") if old and oldserial != committed_tid: return ConflictResolution.ResolvedSerial else: return self._tid def deleteObject(self, oid, oldserial, transaction): if self._is_read_only: raise POSException.ReadOnlyError() if transaction is not self._transaction: raise POSException.StorageTransactionError(self, transaction) with self._lock: old = self._index_get(oid, 0) if not old: raise POSException.POSKeyError(oid) h = self._read_data_header(old, oid) committed_tid = h.tid if oldserial != committed_tid: raise POSException.ConflictError( oid=oid, serials=(committed_tid, oldserial)) pos = self._pos here = pos + self._tfile.tell() + self._thl self._tindex[oid] = here new = DataHeader(oid, self._tid, old, pos, 0, 0) self._tfile.write(new.asString()) self._tfile.write(z64) # Check quota if self._quota is not None and here > self._quota: raise FileStorageQuotaError( "The storage quota has been exceeded.") def _data_find(self, tpos, oid, data): # Return backpointer for oid. Must call with the lock held. # This is a file offset to oid's data record if found, else 0. # The data records in the transaction at tpos are searched for oid. # If a data record for oid isn't found, returns 0. # Else if oid's data record contains a backpointer, that # backpointer is returned. # Else oid's data record contains the data, and the file offset of # oid's data record is returned. This data record should contain # a pickle identical to the 'data' argument. # Unclear: If the length of the stored data doesn't match len(data), # an exception is raised. If the lengths match but the data isn't # the same, 0 is returned. Why the discrepancy? 
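        # (As implemented below, a length mismatch is logged and 0 is
        # returned rather than raising, so the suspect backpointer is simply
        # not used.)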
self._file.seek(tpos) h = self._file.read(TRANS_HDR_LEN) tid, tl, status, ul, dl, el = unpack(TRANS_HDR, h) self._file.read(ul + dl + el) tend = tpos + tl + 8 pos = self._file.tell() while pos < tend: h = self._read_data_header(pos) if h.oid == oid: # Make sure this looks like the right data record if h.plen == 0: # This is also a backpointer. Gotta trust it. return pos if h.plen != len(data): # The expected data doesn't match what's in the # backpointer. Something is wrong. logger.error("Mismatch between data and" " backpointer at %d", pos) return 0 _data = self._file.read(h.plen) if data != _data: return 0 return pos pos += h.recordlen() self._file.seek(pos) return 0 def restore(self, oid, serial, data, version, prev_txn, transaction): # A lot like store() but without all the consistency checks. This # should only be used when we /know/ the data is good, hence the # method name. While the signature looks like store() there are some # differences: # # - serial is the serial number of /this/ revision, not of the # previous revision. It is used instead of self._tid, which is # ignored. # # - Nothing is returned # # - data can be None, which indicates a George Bailey object # (i.e. one who's creation has been transactionally undone). # # prev_txn is a backpointer. In the original database, it's possible # that the data was actually living in a previous transaction. This # can happen for transactional undo and other operations, and is used # as a space saving optimization. Under some circumstances the # prev_txn may not actually exist in the target database (i.e. self) # for example, if it's been packed away. In that case, the prev_txn # should be considered just a hint, and is ignored if the transaction # doesn't exist. if self._is_read_only: raise POSException.ReadOnlyError() if transaction is not self._transaction: raise POSException.StorageTransactionError(self, transaction) if version: raise TypeError("Versions are no-longer supported") with self._lock: if oid > self._oid: self.set_max_oid(oid) prev_pos = 0 if prev_txn is not None: prev_txn_pos = self._txn_find(prev_txn, 0) if prev_txn_pos: prev_pos = self._data_find(prev_txn_pos, oid, data) old = self._index_get(oid, 0) # Calculate the file position in the temporary file here = self._pos + self._tfile.tell() + self._thl # And update the temp file index self._tindex[oid] = here if prev_pos: # If there is a valid prev_pos, don't write data. data = None if data is None: dlen = 0 else: dlen = len(data) # Write the recovery data record new = DataHeader(oid, serial, old, self._pos, 0, dlen) self._tfile.write(new.asString()) # Finally, write the data or a backpointer. if data is None: if prev_pos: self._tfile.write(p64(prev_pos)) else: # Write a zero backpointer, which indicates an # un-creation transaction. self._tfile.write(z64) else: self._tfile.write(data) def supportsUndo(self): return 1 def _clear_temp(self): self._tindex.clear() if self._tfile is not None: self._tfile.seek(0) def _begin(self, tid, u, d, e): self._nextpos = 0 self._thl = TRANS_HDR_LEN + len(u) + len(d) + len(e) if self._thl > 65535: # one of u, d, or e may be > 65535 # We have to check lengths here because struct.pack # doesn't raise an exception on overflow! 
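            # (The transaction header records the user, description and
            # extension lengths as unsigned 16-bit fields, so each is capped
            # at 65535 bytes.)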
if len(u) > 65535: raise FileStorageError('user name too long') if len(d) > 65535: raise FileStorageError('description too long') if len(e) > 65535: raise FileStorageError('too much extension data') def tpc_vote(self, transaction): with self._lock: if transaction is not self._transaction: raise POSException.StorageTransactionError( "tpc_vote called with wrong transaction") dlen = self._tfile.tell() if not dlen: return # No data in this trans self._tfile.seek(0) user, descr, ext = self._ude self._file.seek(self._pos) tl = self._thl + dlen try: h = TxnHeader(self._tid, tl, "c", len(user), len(descr), len(ext)) h.user = user h.descr = descr h.ext = ext self._file.write(h.asString()) ZODB.utils.cp(self._tfile, self._file, dlen) self._file.write(p64(tl)) self._file.flush() except: # Hm, an error occurred writing out the data. Maybe the # disk is full. We don't want any turd at the end. self._file.truncate(self._pos) raise self._nextpos = self._pos + (tl + 8) def tpc_finish(self, transaction, f=None): with self._files.write_lock(): with self._lock: if transaction is not self._transaction: raise POSException.StorageTransactionError( "tpc_finish called with wrong transaction") try: if f is not None: f(self._tid) u, d, e = self._ude self._finish(self._tid, u, d, e) self._clear_temp() finally: self._ude = None self._transaction = None self._commit_lock_release() def _finish(self, tid, u, d, e): # If self._nextpos is 0, then the transaction didn't write any # data, so we don't bother writing anything to the file. if self._nextpos: # Clear the checkpoint flag self._file.seek(self._pos+16) self._file.write(self._tstatus) try: # At this point, we may have committed the data to disk. # If we fail from here, we're in bad shape. self._finish_finish(tid) except: # Ouch. This is bad. Let's try to get back to where we were # and then roll over and die logger.critical("Failure in _finish. Closing.", exc_info=True) self.close() raise def _finish_finish(self, tid): # This is a separate method to allow tests to replace it with # something broken. :) self._file.flush() if fsync is not None: fsync(self._file.fileno()) self._pos = self._nextpos self._index.update(self._tindex) self._ltid = tid self._blob_tpc_finish() def _abort(self): if self._nextpos: self._file.truncate(self._pos) self._nextpos=0 self._blob_tpc_abort() def _undoDataInfo(self, oid, pos, tpos): """Return the tid, data pointer, and data for the oid record at pos """ if tpos: pos = tpos - self._pos - self._thl tpos = self._tfile.tell() h = self._tfmt._read_data_header(pos, oid) afile = self._tfile else: h = self._read_data_header(pos, oid) afile = self._file if h.oid != oid: raise UndoError("Invalid undo transaction id", oid) if h.plen: data = afile.read(h.plen) else: data = '' pos = h.back if tpos: self._tfile.seek(tpos) # Restore temp file to end return h.tid, pos, data def getTid(self, oid): with self._lock: pos = self._lookup_pos(oid) h = self._read_data_header(pos, oid) if h.plen == 0 and h.back == 0: # Undone creation raise POSKeyError(oid) return h.tid def _transactionalUndoRecord(self, oid, pos, tid, pre): """Get the undo information for a data record 'pos' points to the data header for 'oid' in the transaction being undone. 'tid' refers to the transaction being undone. 'pre' is the 'prev' field of the same data header. Return a 3-tuple consisting of a pickle, data pointer, and current position. If the pickle is true, then the data pointer must be 0, but the pickle can be empty *and* the pointer 0. 
""" copy = 1 # Can we just copy a data pointer # First check if it is possible to undo this record. tpos = self._tindex.get(oid, 0) ipos = self._index.get(oid, 0) tipos = tpos or ipos if tipos != pos: # Eek, a later transaction modified the data, but, # maybe it is pointing at the same data we are. ctid, cdataptr, cdata = self._undoDataInfo(oid, ipos, tpos) if cdataptr != pos: # We aren't sure if we are talking about the same data try: if ( # The current record wrote a new pickle cdataptr == tipos or # Backpointers are different self._loadBackPOS(oid, pos) != self._loadBackPOS(oid, cdataptr) ): if pre and not tpos: copy = 0 # we'll try to do conflict resolution else: # We bail if: # - We don't have a previous record, which should # be impossible. raise UndoError("no previous record", oid) except KeyError: # LoadBack gave us a key error. Bail. raise UndoError("_loadBack() failed", oid) # Return the data that should be written in the undo record. if not pre: # There is no previous revision, because the object creation # is being undone. return "", 0, ipos if copy: # we can just copy our previous-record pointer forward return "", pre, ipos try: bdata = self._loadBack_impl(oid, pre)[0] except KeyError: # couldn't find oid; what's the real explanation for this? raise UndoError("_loadBack() failed for %s", oid) try: data = self.tryToResolveConflict(oid, ctid, tid, bdata, cdata) return data, 0, ipos except POSException.ConflictError: pass raise UndoError("Some data were modified by a later transaction", oid) # undoLog() returns a description dict that includes an id entry. # The id is opaque to the client, but contains the transaction id. # The transactionalUndo() implementation does a simple linear # search through the file (from the end) to find the transaction. def undoLog(self, first=0, last=-20, filter=None): if last < 0: # -last is supposed to be the max # of transactions. Convert to # a positive index. Should have x - first = -last, which # means x = first - last. This is spelled out here because # the normalization code was incorrect for years (used +1 # instead -- off by 1), until ZODB 3.4. last = first - last with self._lock: if self._pack_is_in_progress: raise UndoError( 'Undo is currently disabled for database maintenance.
<br>' 'Please try again later.
') us = UndoSearch(self._file, self._pos, first, last, filter) while not us.finished(): # Hold lock for batches of 20 searches, so default search # parameters will finish without letting another thread run. for i in range(20): if us.finished(): break us.search() # Give another thread a chance, so that a long undoLog() # operation doesn't block all other activity. self._lock_release() self._lock_acquire() return us.results def undo(self, transaction_id, transaction): """Undo a transaction, given by transaction_id. Do so by writing new data that reverses the action taken by the transaction. Usually, we can get by with just copying a data pointer, by writing a file position rather than a pickle. Sometimes, we may do conflict resolution, in which case we actually copy new data that results from resolution. """ if self._is_read_only: raise POSException.ReadOnlyError() if transaction is not self._transaction: raise POSException.StorageTransactionError(self, transaction) with self._lock: # Find the right transaction to undo and call _txn_undo_write(). tid = base64.decodestring(transaction_id + '\n') assert len(tid) == 8 tpos = self._txn_find(tid, 1) tindex = self._txn_undo_write(tpos) self._tindex.update(tindex) return self._tid, tindex.keys() def _txn_find(self, tid, stop_at_pack): pos = self._pos while pos > 39: self._file.seek(pos - 8) pos = pos - u64(self._file.read(8)) - 8 self._file.seek(pos) h = self._file.read(TRANS_HDR_LEN) _tid = h[:8] if _tid == tid: return pos if stop_at_pack: # check the status field of the transaction header if h[16] == 'p': break raise UndoError("Invalid transaction id") def _txn_undo_write(self, tpos): # a helper function to write the data records for transactional undo otloc = self._pos here = self._pos + self._tfile.tell() + self._thl base = here - self._tfile.tell() # Let's move the file pointer back to the start of the txn record. th = self._read_txn_header(tpos) if th.status != " ": raise UndoError('non-undoable transaction') tend = tpos + th.tlen pos = tpos + th.headerlen() tindex = {} # keep track of failures, cause we may succeed later failures = {} # Read the data records for this transaction while pos < tend: h = self._read_data_header(pos) if h.oid in failures: del failures[h.oid] # second chance! assert base + self._tfile.tell() == here, (here, base, self._tfile.tell()) try: p, prev, ipos = self._transactionalUndoRecord( h.oid, pos, h.tid, h.prev) except UndoError, v: # Don't fail right away. We may be redeemed later! failures[h.oid] = v else: if self.blob_dir and not p and prev: try: up, userial = self._loadBackTxn(h.oid, prev) except ZODB.POSException.POSKeyError: pass # It was removed, so no need to copy data else: if self.is_blob_record(up): # We're undoing a blob modification operation. # We have to copy the blob data tmp = ZODB.utils.mktemp(dir=self.fshelper.temp_dir) ZODB.utils.cp( self.openCommittedBlobFile(h.oid, userial), open(tmp, 'wb')) self._blob_storeblob(h.oid, self._tid, tmp) new = DataHeader(h.oid, self._tid, ipos, otloc, 0, len(p)) # TODO: This seek shouldn't be necessary, but some other # bit of code is messing with the file pointer. 
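            # (The assert below re-checks the invariant that 'here' equals
            # 'base' plus the temp file's current write position before each
            # undo record is written.)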
assert self._tfile.tell() == here - base, (here, base, self._tfile.tell()) self._tfile.write(new.asString()) if p: self._tfile.write(p) else: self._tfile.write(p64(prev)) tindex[h.oid] = here here += new.recordlen() pos += h.recordlen() if pos > tend: raise UndoError("non-undoable transaction") if failures: raise MultipleUndoErrors(failures.items()) return tindex def history(self, oid, size=1, filter=None): with self._lock: r = [] pos = self._lookup_pos(oid) while 1: if len(r) >= size: return r h = self._read_data_header(pos) th = self._read_txn_header(h.tloc) if th.ext: d = loads(th.ext) else: d = {} d.update({"time": TimeStamp(h.tid).timeTime(), "user_name": th.user, "description": th.descr, "tid": h.tid, "size": h.plen, }) if filter is None or filter(d): r.append(d) if h.prev: pos = h.prev else: return r def _redundant_pack(self, file, pos): assert pos > 8, pos file.seek(pos - 8) p = u64(file.read(8)) file.seek(pos - p + 8) return file.read(1) not in ' u' @staticmethod def packer(storage, referencesf, stop, gc): # Our default packer is built around the original packer. We # simply adapt the old interface to the new. We don't really # want to invest much in the old packer, at least for now. assert referencesf is not None p = FileStoragePacker(storage, referencesf, stop, gc) opos = p.pack() if opos is None: return None return opos, p.index def pack(self, t, referencesf, gc=None): """Copy data from the current database file to a packed file Non-current records from transactions with time-stamp strings less than packtss are ommitted. As are all undone records. Also, data back pointers that point before packtss are resolved and the associated data are copied, since the old records are not copied. """ if self._is_read_only: raise POSException.ReadOnlyError() stop=`TimeStamp(*time.gmtime(t)[:5]+(t%60,))` if stop==z64: raise FileStorageError('Invalid pack time') # If the storage is empty, there's nothing to do. if not self._index: return with self._lock: if self._pack_is_in_progress: raise FileStorageError('Already packing') self._pack_is_in_progress = True if gc is None: gc = self._pack_gc oldpath = self._file_name + ".old" if os.path.exists(oldpath): os.remove(oldpath) if self.blob_dir and os.path.exists(self.blob_dir + ".old"): ZODB.blob.remove_committed_dir(self.blob_dir + ".old") cleanup = [] have_commit_lock = False try: pack_result = None try: pack_result = self.packer(self, referencesf, stop, gc) except RedundantPackWarning, detail: logger.info(str(detail)) if pack_result is None: return have_commit_lock = True opos, index = pack_result with self._files.write_lock(): with self._lock: self._files.empty() self._file.close() try: os.rename(self._file_name, oldpath) except Exception: self._file = open(self._file_name, 'r+b') raise # OK, we're beyond the point of no return os.rename(self._file_name + '.pack', self._file_name) self._file = open(self._file_name, 'r+b') self._initIndex(index, self._tindex) self._pos = opos # We're basically done. Now we need to deal with removed # blobs and removing the .old file (see further down). 
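            # (Blob cleanup below happens after the commit lock is released,
            # so new commits aren't blocked while tagged blob files are moved
            # or removed.)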
if self.blob_dir: self._commit_lock_release() have_commit_lock = False self._remove_blob_files_tagged_for_removal_during_pack() finally: if have_commit_lock: self._commit_lock_release() with self._lock: self._pack_is_in_progress = False if not self.pack_keep_old: os.remove(oldpath) with self._lock: self._save_index() def _remove_blob_files_tagged_for_removal_during_pack(self): lblob_dir = len(self.blob_dir) fshelper = self.fshelper old = self.blob_dir+'.old' link_or_copy = ZODB.blob.link_or_copy # Helper to clean up dirs left empty after moving things to old def maybe_remove_empty_dir_containing(path, level=0): path = os.path.dirname(path) if len(path) <= lblob_dir or os.listdir(path): return # Path points to an empty dir. There may be a race. We # might have just removed the dir for an oid (or a parent # dir) and while we're cleaning up it's parent, another # thread is adding a new entry to it. # We don't have to worry about level 0, as this is just a # directory containing an object's revisions. If it is # enmpty, the object must have been garbage. # If the level is 1 or higher, we need to be more # careful. We'll get the storage lock and double check # that the dir is still empty before removing it. removed = False if level: self._lock_acquire() try: if not os.listdir(path): os.rmdir(path) removed = True finally: if level: self._lock_release() if removed: maybe_remove_empty_dir_containing(path, level+1) if self.pack_keep_old: # Helpers that move oid dir or revision file to the old dir. os.mkdir(old, 0777) link_or_copy(os.path.join(self.blob_dir, '.layout'), os.path.join(old, '.layout')) def handle_file(path): newpath = old+path[lblob_dir:] dest = os.path.dirname(newpath) if not os.path.exists(dest): os.makedirs(dest, 0700) os.rename(path, newpath) handle_dir = handle_file else: # Helpers that remove an oid dir or revision file. handle_file = ZODB.blob.remove_committed handle_dir = ZODB.blob.remove_committed_dir # Fist step: move or remove oids or revisions for line in open(os.path.join(self.blob_dir, '.removed')): line = line.strip().decode('hex') if len(line) == 8: # oid is garbage, re/move dir path = fshelper.getPathForOID(line) if not os.path.exists(path): # Hm, already gone. Odd. continue handle_dir(path) maybe_remove_empty_dir_containing(path, 1) continue if len(line) != 16: raise ValueError("Bad record in ", self.blob_dir, '.removed') oid, tid = line[:8], line[8:] path = fshelper.getBlobFilename(oid, tid) if not os.path.exists(path): # Hm, already gone. Odd. continue handle_file(path) assert not os.path.exists(path) maybe_remove_empty_dir_containing(path) os.remove(os.path.join(self.blob_dir, '.removed')) if not self.pack_keep_old: return # Second step, copy remaining files. for path, dir_names, file_names in os.walk(self.blob_dir): for file_name in file_names: if not file_name.endswith('.blob'): continue file_path = os.path.join(path, file_name) dest = os.path.dirname(old+file_path[lblob_dir:]) if not os.path.exists(dest): os.makedirs(dest, 0700) link_or_copy(file_path, old+file_path[lblob_dir:]) def iterator(self, start=None, stop=None): return FileIterator(self._file_name, start, stop) def lastInvalidations(self, count): file = self._file seek = file.seek read = file.read with self._lock: pos = self._pos while count > 0 and pos > 4: count -= 1 seek(pos-8) pos = pos - 8 - u64(read(8)) seek(0) return [(trans.tid, [r.oid for r in trans]) for trans in FileIterator(self._file_name, pos=pos)] def lastTid(self, oid): """Return last serialno committed for object oid. 
If there is no serialno for this oid -- which can only occur if it is a new object -- return None. """ try: return self.getTid(oid) except KeyError: return None def cleanup(self): """Remove all files created by this storage.""" for ext in '', '.old', '.tmp', '.lock', '.index', '.pack': try: os.remove(self._file_name + ext) except OSError, e: if e.errno != errno.ENOENT: raise def record_iternext(self, next=None): index = self._index oid = index.minKey(next) oid_as_long, = unpack(">Q", oid) next_oid = pack(">Q", oid_as_long + 1) try: next_oid = index.minKey(next_oid) except ValueError: # "empty tree" error next_oid = None data, tid = self.load(oid, "") return oid, tid, data, next_oid ###################################################################### # The following 2 methods are for testing a ZEO extension mechanism def getExtensionMethods(self): return dict(answer_to_the_ultimate_question=None) def answer_to_the_ultimate_question(self): return 42 # ###################################################################### def shift_transactions_forward(index, tindex, file, pos, opos): """Copy transactions forward in the data file This might be done as part of a recovery effort """ # Cache a bunch of methods seek=file.seek read=file.read write=file.write index_get=index.get # Initialize, pv=z64 p1=opos p2=pos offset=p2-p1 # Copy the data in two stages. In the packing stage, # we skip records that are non-current or that are for # unreferenced objects. We also skip undone transactions. # # After the packing stage, we copy everything but undone # transactions, however, we have to update various back pointers. # We have to have the storage lock in the second phase to keep # data from being changed while we're copying. pnv=None while 1: # Read the transaction record seek(pos) h=read(TRANS_HDR_LEN) if len(h) < TRANS_HDR_LEN: break tid, stl, status, ul, dl, el = unpack(TRANS_HDR,h) if status=='c': break # Oops. we found a checkpoint flag. tl=u64(stl) tpos=pos tend=tpos+tl otpos=opos # start pos of output trans thl=ul+dl+el h2=read(thl) if len(h2) != thl: raise PackError(opos) # write out the transaction record seek(opos) write(h) write(h2) thl=TRANS_HDR_LEN+thl pos=tpos+thl opos=otpos+thl while pos < tend: # Read the data records for this transaction seek(pos) h=read(DATA_HDR_LEN) oid,serial,sprev,stloc,vlen,splen = unpack(DATA_HDR, h) assert not vlen plen=u64(splen) dlen=DATA_HDR_LEN+(plen or 8) tindex[oid]=opos if plen: p=read(plen) else: p=read(8) p=u64(p) if p >= p2: p=p-offset elif p >= p1: # Ick, we're in trouble. 
Let's bail # to the index and hope for the best p=index_get(oid, 0) p=p64(p) # WRITE seek(opos) sprev=p64(index_get(oid, 0)) write(pack(DATA_HDR, oid, serial, sprev, p64(otpos), 0, splen)) write(p) opos=opos+dlen pos=pos+dlen # skip the (intentionally redundant) transaction length pos=pos+8 if status != 'u': index.update(tindex) # Record the position tindex.clear() write(stl) opos=opos+8 return opos def search_back(file, pos): seek=file.seek read=file.read seek(0,2) s=p=file.tell() while p > pos: seek(p-8) l=u64(read(8)) if l <= 0: break p=p-l-8 return p, s def recover(file_name): file=open(file_name, 'r+b') index={} tindex={} pos, oid, tid = read_index(file, file_name, index, tindex, recover=1) if oid is not None: print "Nothing to recover" return opos=pos pos, sz = search_back(file, pos) if pos < sz: npos = shift_transactions_forward(index, tindex, file, pos, opos) file.truncate(npos) print "Recovered file, lost %s, ended up with %s bytes" % ( pos-opos, npos) def read_index(file, name, index, tindex, stop='\377'*8, ltid=z64, start=4L, maxoid=z64, recover=0, read_only=0): """Scan the file storage and update the index. Returns file position, max oid, and last transaction id. It also stores index information in the three dictionary arguments. Arguments: file -- a file object (the Data.fs) name -- the name of the file (presumably file.name) index -- fsIndex, oid -> data record file offset tindex -- dictionary, oid -> data record offset tindex is cleared before return There are several default arguments that affect the scan or the return values. TODO: document them. start -- the file position at which to start scanning for oids added beyond the ones the passed-in indices know about. The .index file caches the highest ._pos FileStorage knew about when the the .index file was last saved, and that's the intended value to pass in for start; accept the default (and pass empty indices) to recreate the index from scratch maxoid -- ignored (it meant something prior to ZODB 3.2.6; the argument still exists just so the signature of read_index() stayed the same) The file position returned is the position just after the last valid transaction record. The oid returned is the maximum object id in `index`, or z64 if the index is empty. The transaction id is the tid of the last transaction, or ltid if the index is empty. """ read = file.read seek = file.seek seek(0, 2) file_size = file.tell() fmt = TempFormatter(file) if file_size: if file_size < start: raise FileStorageFormatError(file.name) seek(0) if read(4) != packed_version: raise FileStorageFormatError(name) else: if not read_only: file.write(packed_version) return 4L, z64, ltid index_get = index.get pos = start seek(start) tid = '\0' * 7 + '\1' while 1: # Read the transaction record h = read(TRANS_HDR_LEN) if not h: break if len(h) != TRANS_HDR_LEN: if not read_only: logger.warning('%s truncated at %s', name, pos) seek(pos) file.truncate() break tid, tl, status, ul, dl, el = unpack(TRANS_HDR, h) if tid <= ltid: logger.warning("%s time-stamp reduction at %s", name, pos) ltid = tid if pos+(tl+8) > file_size or status=='c': # Hm, the data were truncated or the checkpoint flag wasn't # cleared. They may also be corrupted, # in which case, we don't want to totally lose the data. 
if not read_only: logger.warning("%s truncated, possibly due to damaged" " records at %s", name, pos) _truncate(file, name, pos) break if status not in ' up': logger.warning('%s has invalid status, %s, at %s', name, status, pos) if tl < TRANS_HDR_LEN + ul + dl + el: # We're in trouble. Find out if this is bad data in the # middle of the file, or just a turd that Win 9x dropped # at the end when the system crashed. # Skip to the end and read what should be the transaction length # of the last transaction. seek(-8, 2) rtl = u64(read(8)) # Now check to see if the redundant transaction length is # reasonable: if file_size - rtl < pos or rtl < TRANS_HDR_LEN: logger.critical('%s has invalid transaction header at %s', name, pos) if not read_only: logger.warning( "It appears that there is invalid data at the end of " "the file, possibly due to a system crash. %s " "truncated to recover from bad data at end." % name) _truncate(file, name, pos) break else: if recover: return pos, None, None panic('%s has invalid transaction header at %s', name, pos) if tid >= stop: break tpos = pos tend = tpos + tl if status == 'u': # Undone transaction, skip it seek(tend) h = u64(read(8)) if h != tl: if recover: return tpos, None, None panic('%s has inconsistent transaction length at %s', name, pos) pos = tend + 8 continue pos = tpos + TRANS_HDR_LEN + ul + dl + el while pos < tend: # Read the data records for this transaction h = fmt._read_data_header(pos) dlen = h.recordlen() tindex[h.oid] = pos if pos + dlen > tend or h.tloc != tpos: if recover: return tpos, None, None panic("%s data record exceeds transaction record at %s", name, pos) if index_get(h.oid, 0) != h.prev: if h.prev: if recover: return tpos, None, None logger.error("%s incorrect previous pointer at %s", name, pos) else: logger.warning("%s incorrect previous pointer at %s", name, pos) pos += dlen if pos != tend: if recover: return tpos, None, None panic("%s data records don't add up at %s",name,tpos) # Read the (intentionally redundant) transaction length seek(pos) h = u64(read(8)) if h != tl: if recover: return tpos, None, None panic("%s redundant transaction length check failed at %s", name, pos) pos += 8 index.update(tindex) tindex.clear() # Caution: fsIndex doesn't have an efficient __nonzero__ or __len__. # That's why we do try/except instead. fsIndex.maxKey() is fast. try: maxoid = index.maxKey() except ValueError: # The index is empty. maxoid == z64 return pos, maxoid, ltid def _truncate(file, name, pos): file.seek(0, 2) file_size = file.tell() try: i = 0 while 1: oname='%s.tr%s' % (name, i) if os.path.exists(oname): i += 1 else: logger.warning("Writing truncated data from %s to %s", name, oname) o = open(oname,'wb') file.seek(pos) ZODB.utils.cp(file, o, file_size-pos) o.close() break except: logger.error("couldn\'t write truncated data for %s", name, exc_info=True) raise POSException.StorageSystemError("Couldn't save truncated data") file.seek(pos) file.truncate() class FileIterator(FileStorageFormatter): """Iterate over the transactions in a FileStorage file. 
""" _ltid = z64 _file = None def __init__(self, filename, start=None, stop=None, pos=4L): assert isinstance(filename, str) file = open(filename, 'rb') self._file = file self._file_name = filename if file.read(4) != packed_version: raise FileStorageFormatError(file.name) file.seek(0,2) self._file_size = file.tell() if (pos < 4) or pos > self._file_size: raise ValueError("Given position is greater than the file size", pos, self._file_size) self._pos = pos assert start is None or isinstance(start, str) assert stop is None or isinstance(stop, str) self._start = start self._stop = stop if start: if self._file_size <= 4: return self._skip_to_start(start) def __len__(self): # Define a bogus __len__() to make the iterator work # with code like builtin list() and tuple() in Python 2.1. # There's a lot of C code that expects a sequence to have # an __len__() but can cope with any sort of mistake in its # implementation. So just return 0. return 0 # This allows us to pass an iterator as the `other' argument to # copyTransactionsFrom() in BaseStorage. The advantage here is that we # can create the iterator manually, e.g. setting start and stop, and then # just let copyTransactionsFrom() do its thing. def iterator(self): return self def close(self): file = self._file if file is not None: self._file = None file.close() def _skip_to_start(self, start): file = self._file pos1 = self._pos file.seek(pos1) tid1 = file.read(8) if len(tid1) < 8: raise CorruptedError("Couldn't read tid.") if start < tid1: pos2 = pos1 tid2 = tid1 file.seek(4) tid1 = file.read(8) if start <= tid1: self._pos = 4 return pos1 = 4 else: if start == tid1: return # Try to read the last transaction. We could be unlucky and # opened the file while committing a transaction. In that # case, we'll just scan from the beginning if the file is # small enough, otherwise we'll fail. file.seek(self._file_size-8) l = u64(file.read(8)) if not (l + 12 <= self._file_size and self._read_num(self._file_size-l) == l): if self._file_size < (1<<20): return self._scan_foreward(start) raise ValueError("Can't find last transaction in large file") pos2 = self._file_size-l-8 file.seek(pos2) tid2 = file.read(8) if tid2 < tid1: raise CorruptedError("Tids out of order.") if tid2 <= start: if tid2 == start: self._pos = pos2 else: self._pos = self._file_size return t1 = ZODB.TimeStamp.TimeStamp(tid1).timeTime() t2 = ZODB.TimeStamp.TimeStamp(tid2).timeTime() ts = ZODB.TimeStamp.TimeStamp(start).timeTime() if (ts - t1) < (t2 - ts): return self._scan_forward(pos1, start) else: return self._scan_backward(pos2, start) def _scan_forward(self, pos, start): logger.debug("Scan forward %s:%s looking for %r", self._file_name, pos, start) file = self._file while 1: # Read the transaction record h = self._read_txn_header(pos) if h.tid >= start: self._pos = pos return pos += h.tlen + 8 def _scan_backward(self, pos, start): logger.debug("Scan backward %s:%s looking for %r", self._file_name, pos, start) file = self._file seek = file.seek read = file.read while 1: pos -= 8 seek(pos) tlen = ZODB.utils.u64(read(8)) pos -= tlen h = self._read_txn_header(pos) if h.tid <= start: if h.tid == start: self._pos = pos else: self._pos = pos + tlen + 8 return # Iterator protocol def __iter__(self): return self def next(self): if self._file is None: raise StopIteration() pos = self._pos while True: # Read the transaction record try: h = self._read_txn_header(pos) except CorruptedDataError, err: # If buf is empty, we've reached EOF. 
if not err.buf: break raise if h.tid <= self._ltid: logger.warning("%s time-stamp reduction at %s", self._file.name, pos) self._ltid = h.tid if self._stop is not None and h.tid > self._stop: break if h.status == "c": # Assume we've hit the last, in-progress transaction break if pos + h.tlen + 8 > self._file_size: # Hm, the data were truncated or the checkpoint flag wasn't # cleared. They may also be corrupted, # in which case, we don't want to totally lose the data. logger.warning("%s truncated, possibly due to" " damaged records at %s", self._file.name, pos) break if h.status not in " up": logger.warning('%s has invalid status,' ' %s, at %s', self._file.name, h.status, pos) if h.tlen < h.headerlen(): # We're in trouble. Find out if this is bad data in # the middle of the file, or just a turd that Win 9x # dropped at the end when the system crashed. Skip to # the end and read what should be the transaction # length of the last transaction. self._file.seek(-8, 2) rtl = u64(self._file.read(8)) # Now check to see if the redundant transaction length is # reasonable: if self._file_size - rtl < pos or rtl < TRANS_HDR_LEN: logger.critical("%s has invalid transaction header at %s", self._file.name, pos) logger.warning( "It appears that there is invalid data at the end of " "the file, possibly due to a system crash. %s " "truncated to recover from bad data at end." % self._file.name) break else: logger.warning("%s has invalid transaction header at %s", self._file.name, pos) break tpos = pos tend = tpos + h.tlen if h.status != "u": pos = tpos + h.headerlen() e = {} if h.elen: try: e = loads(h.ext) except: pass result = TransactionRecord(h.tid, h.status, h.user, h.descr, e, pos, tend, self._file, tpos) # Read the (intentionally redundant) transaction length self._file.seek(tend) rtl = u64(self._file.read(8)) if rtl != h.tlen: logger.warning("%s redundant transaction length check" " failed at %s", self._file.name, tend) break self._pos = tend + 8 return result self.close() raise StopIteration() class TransactionRecord(BaseStorage.TransactionRecord): def __init__(self, tid, status, user, desc, ext, pos, tend, file, tpos): BaseStorage.TransactionRecord.__init__( self, tid, status, user, desc, ext) self._pos = pos self._tend = tend self._file = file self._tpos = tpos def __iter__(self): return TransactionRecordIterator(self) class TransactionRecordIterator(FileStorageFormatter): """Iterate over the transactions in a FileStorage file.""" def __init__(self, record): self._file = record._file self._pos = record._pos self._tpos = record._tpos self._tend = record._tend def __iter__(self): return self def next(self): pos = self._pos while pos < self._tend: # Read the data records for this transaction h = self._read_data_header(pos) dlen = h.recordlen() if pos + dlen > self._tend or h.tloc != self._tpos: logger.warning("%s data record exceeds transaction" " record at %s", file.name, pos) break self._pos = pos + dlen prev_txn = None if h.plen: data = self._file.read(h.plen) else: if h.back == 0: # If the backpointer is 0, then this transaction # undoes the object creation. It undid the # transaction that created it. Return None # instead of a pickle to indicate this. data = None else: data, tid = self._loadBackTxn(h.oid, h.back, False) # Caution: :ooks like this only goes one link back. # Should it go to the original data like BDBFullStorage? 
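                    # (getTxnFromData() reads only the data header at h.back
                    # and returns its tid -- a single hop -- whereas the
                    # _loadBackTxn() call above keeps following backpointers
                    # until it reaches the record that actually holds the
                    # pickle.)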
prev_txn = self.getTxnFromData(h.oid, h.back) return Record(h.oid, h.tid, data, prev_txn, pos) raise StopIteration() class Record(BaseStorage.DataRecord): def __init__(self, oid, tid, data, prev, pos): super(Record, self).__init__(oid, tid, data, prev) self.pos = pos class UndoSearch: def __init__(self, file, pos, first, last, filter=None): self.file = file self.pos = pos self.first = first self.last = last self.filter = filter # self.i is the index of the transaction we're _going_ to find # next. When it reaches self.first, we should start appending # to self.results. When it reaches self.last, we're done # (although we may finish earlier). self.i = 0 self.results = [] self.stop = False def finished(self): """Return True if UndoSearch has found enough records.""" # BAW: Why 39 please? This makes no sense (see also below). return self.i >= self.last or self.pos < 39 or self.stop def search(self): """Search for another record.""" dict = self._readnext() if dict is not None and (self.filter is None or self.filter(dict)): if self.i >= self.first: self.results.append(dict) self.i += 1 def _readnext(self): """Read the next record from the storage.""" self.file.seek(self.pos - 8) self.pos -= u64(self.file.read(8)) + 8 self.file.seek(self.pos) h = self.file.read(TRANS_HDR_LEN) tid, tl, status, ul, dl, el = unpack(TRANS_HDR, h) if status == 'p': self.stop = 1 return None if status != ' ': return None d = u = '' if ul: u = self.file.read(ul) if dl: d = self.file.read(dl) e = {} if el: try: e = loads(self.file.read(el)) except: pass d = {'id': base64.encodestring(tid).rstrip(), 'time': TimeStamp(tid).timeTime(), 'user_name': u, 'size': tl, 'description': d} d.update(e) return d class FilePool: closed = False writing = False writers = 0 def __init__(self, file_name): self.name = file_name self._files = [] self._out = [] self._cond = threading.Condition() @contextlib.contextmanager def write_lock(self): with self._cond: self.writers += 1 while self.writing or self._out: self._cond.wait() if self.closed: raise ValueError('closed') self.writing = True try: yield None finally: with self._cond: self.writing = False if self.writers > 0: self.writers -= 1 self._cond.notifyAll() @contextlib.contextmanager def get(self): with self._cond: while self.writers: self._cond.wait() assert not self.writing if self.closed: raise ValueError('closed') try: f = self._files.pop() except IndexError: f = open(self.name, 'rb') self._out.append(f) try: yield f finally: self._out.remove(f) self._files.append(f) if not self._out: with self._cond: if self.writers and not self._out: self._cond.notifyAll() def empty(self): while self._files: self._files.pop().close() def close(self): with self._cond: self.closed = True while self._out: self._out.pop().close() self.empty() self.writing = self.writers = 0 ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/FileStorage/__init__.py000066400000000000000000000003561230730566700247560ustar00rootroot00000000000000# this is a package from ZODB.FileStorage.FileStorage import FileStorage, TransactionRecord from ZODB.FileStorage.FileStorage import FileIterator, Record, packed_version # BBB Alias for compatibility RecordIterator = TransactionRecord ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/FileStorage/format.py000066400000000000000000000221621230730566700245060ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. 
# # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## # # File-based ZODB storage # # Files are arranged as follows. # # - The first 4 bytes are a file identifier. # # - The rest of the file consists of a sequence of transaction # "records". # # A transaction record consists of: # # - 8-byte transaction id, which is also a time stamp. # # - 8-byte transaction record length - 8. # # - 1-byte status code # ' ' (a blank) completed transaction that hasn't been packed # 'p' completed transaction that has been packed # 'c' checkpoint -- a transaction in progress, at the end of the file; # it's been thru vote() but not finish(); if finish() completes # normally, it will be overwritten with a blank; if finish() dies # (e.g., out of disk space), cleanup code will try to truncate # the file to chop off this incomplete transaction # 'u' uncertain; no longer used; was previously used to record something # about non-transactional undo # # - 2-byte length of user name # # - 2-byte length of description # # - 2-byte length of extension attributes # # - user name # # - description # # - extension attributes # # * A sequence of data records # # - 8-byte redundant transaction length -8 # # A data record consists of # # - 8-byte oid. # # - 8-byte tid, which matches the transaction id in the transaction record. # # - 8-byte previous-record file-position. # # - 8-byte beginning of transaction record file position. # # - 2-bytes with zero values. (Was version length.) # # - 8-byte data length # # ? data # (data length > 0) # # ? 8-byte position of data record containing data # (data length == 0) # # Note that the lengths and positions are all big-endian. # Also, the object ids time stamps are big-endian, so comparisons # are meaningful. # # Backpointers # # When we undo a record, we don't copy (or delete) # data. Instead, we write records with back pointers. import struct import logging from ZODB.POSException import POSKeyError from ZODB.utils import u64, oid_repr class CorruptedError(Exception): pass class CorruptedDataError(CorruptedError): def __init__(self, oid=None, buf=None, pos=None): self.oid = oid self.buf = buf self.pos = pos def __str__(self): if self.oid: msg = "Error reading oid %s. Found %r" % (oid_repr(self.oid), self.buf) else: msg = "Error reading unknown oid. Found %r" % self.buf if self.pos: msg += " at %d" % self.pos return msg # the struct formats for the headers TRANS_HDR = ">8sQcHHH" DATA_HDR = ">8s8sQQHQ" # constants to support various header sizes TRANS_HDR_LEN = 23 DATA_HDR_LEN = 42 assert struct.calcsize(TRANS_HDR) == TRANS_HDR_LEN assert struct.calcsize(DATA_HDR) == DATA_HDR_LEN logger = logging.getLogger('ZODB.FileStorage.format') class FileStorageFormatter(object): """Mixin class that can read and write the low-level format.""" # subclasses must provide _file _metadata_size = 4L _format_version = "21" def _read_num(self, pos): """Read an 8-byte number.""" self._file.seek(pos) return u64(self._file.read(8)) def _read_data_header(self, pos, oid=None, _file=None): """Return a DataHeader object for data record at pos. 
If ois is not None, raise CorruptedDataError if oid passed does not match oid in file. """ if _file is None: _file = self._file _file.seek(pos) s = _file.read(DATA_HDR_LEN) if len(s) != DATA_HDR_LEN: raise CorruptedDataError(oid, s, pos) h = DataHeaderFromString(s) if oid is not None and oid != h.oid: raise CorruptedDataError(oid, s, pos) if not h.plen: h.back = u64(_file.read(8)) return h def _read_txn_header(self, pos, tid=None): self._file.seek(pos) s = self._file.read(TRANS_HDR_LEN) if len(s) != TRANS_HDR_LEN: raise CorruptedDataError(tid, s, pos) h = TxnHeaderFromString(s) if tid is not None and tid != h.tid: raise CorruptedDataError(tid, s, pos) h.user = self._file.read(h.ulen) h.descr = self._file.read(h.dlen) h.ext = self._file.read(h.elen) return h def _loadBack_impl(self, oid, back, fail=True, _file=None): # shared implementation used by various _loadBack methods # # If the backpointer ultimately resolves to 0: # If fail is True, raise KeyError for zero backpointer. # If fail is False, return the empty data from the record # with no backpointer. if _file is None: _file = self._file while 1: if not back: # If backpointer is 0, object does not currently exist. raise POSKeyError(oid) h = self._read_data_header(back, _file=_file) if h.plen: return _file.read(h.plen), h.tid, back, h.tloc if h.back == 0 and not fail: return None, h.tid, back, h.tloc back = h.back def _loadBackTxn(self, oid, back, fail=True): """Return data and txn id for backpointer.""" return self._loadBack_impl(oid, back, fail)[:2] def _loadBackPOS(self, oid, back): return self._loadBack_impl(oid, back)[2] def getTxnFromData(self, oid, back): """Return transaction id for data at back.""" h = self._read_data_header(back, oid) return h.tid def fail(self, pos, msg, *args): s = ("%s:%s:" + msg) % ((self._name, pos) + args) logger.error(s) raise CorruptedError(s) def checkTxn(self, th, pos): if th.tid <= self.ltid: self.fail(pos, "time-stamp reduction: %s <= %s", oid_repr(th.tid), oid_repr(self.ltid)) self.ltid = th.tid if th.status == "c": self.fail(pos, "transaction with checkpoint flag set") if not th.status in " pu": # recognize " ", "p", and "u" as valid self.fail(pos, "invalid transaction status: %r", th.status) if th.tlen < th.headerlen(): self.fail(pos, "invalid transaction header: " "txnlen (%d) < headerlen(%d)", th.tlen, th.headerlen()) def checkData(self, th, tpos, dh, pos): if dh.tloc != tpos: self.fail(pos, "data record does not point to transaction header" ": %d != %d", dh.tloc, tpos) if pos + dh.recordlen() > tpos + th.tlen: self.fail(pos, "data record size exceeds transaction size: " "%d > %d", pos + dh.recordlen(), tpos + th.tlen) if dh.prev >= pos: self.fail(pos, "invalid previous pointer: %d", dh.prev) if dh.back: if dh.back >= pos: self.fail(pos, "invalid back pointer: %d", dh.prev) if dh.plen: self.fail(pos, "data record has back pointer and data") def DataHeaderFromString(s): return DataHeader(*struct.unpack(DATA_HDR, s)) class DataHeader(object): """Header for a data record.""" __slots__ = ("oid", "tid", "prev", "tloc", "plen", "back") def __init__(self, oid, tid, prev, tloc, vlen, plen): if vlen: raise ValueError( "Non-zero version length. 
Versions aren't supported.") self.oid = oid self.tid = tid self.prev = prev self.tloc = tloc self.plen = plen self.back = 0 # default def asString(self): return struct.pack(DATA_HDR, self.oid, self.tid, self.prev, self.tloc, 0, self.plen) def recordlen(self): return DATA_HDR_LEN + (self.plen or 8) def TxnHeaderFromString(s): return TxnHeader(*struct.unpack(TRANS_HDR, s)) class TxnHeader(object): """Header for a transaction record.""" __slots__ = ("tid", "tlen", "status", "user", "descr", "ext", "ulen", "dlen", "elen") def __init__(self, tid, tlen, status, ulen, dlen, elen): self.tid = tid self.tlen = tlen self.status = status self.ulen = ulen self.dlen = dlen self.elen = elen assert elen >= 0 def asString(self): s = struct.pack(TRANS_HDR, self.tid, self.tlen, self.status, self.ulen, self.dlen, self.elen) return "".join(map(str, [s, self.user, self.descr, self.ext])) def headerlen(self): return TRANS_HDR_LEN + self.ulen + self.dlen + self.elen ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/FileStorage/fsdump.py000066400000000000000000000110111230730566700245030ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## import struct from ZODB.FileStorage import FileIterator from ZODB.FileStorage.format import TRANS_HDR, TRANS_HDR_LEN from ZODB.FileStorage.format import DATA_HDR, DATA_HDR_LEN from ZODB.TimeStamp import TimeStamp from ZODB.utils import u64, get_pickle_metadata def fsdump(path, file=None, with_offset=1): iter = FileIterator(path) for i, trans in enumerate(iter): if with_offset: print >> file, ("Trans #%05d tid=%016x time=%s offset=%d" % (i, u64(trans.tid), TimeStamp(trans.tid), trans._pos)) else: print >> file, ("Trans #%05d tid=%016x time=%s" % (i, u64(trans.tid), TimeStamp(trans.tid))) print >> file, (" status=%r user=%r description=%r" % (trans.status, trans.user, trans.description)) for j, rec in enumerate(trans): if rec.data is None: fullclass = "undo or abort of object creation" size = "" else: modname, classname = get_pickle_metadata(rec.data) size = " size=%d" % len(rec.data) fullclass = "%s.%s" % (modname, classname) if rec.data_txn: # It would be nice to print the transaction number # (i) but it would be expensive to keep track of. bp = " bp=%016x" % u64(rec.data_txn) else: bp = "" print >> file, (" data #%05d oid=%016x%s class=%s%s" % (j, u64(rec.oid), size, fullclass, bp)) iter.close() def fmt(p64): # Return a nicely formatted string for a packaged 64-bit value return "%016x" % u64(p64) class Dumper: """A very verbose dumper for debuggin FileStorage problems.""" # TODO: Should revise this class to use FileStorageFormatter. 
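    # A minimal usage sketch, not part of the module ('Data.fs' is just a
    # placeholder path): both fsdump() above and this Dumper class read a
    # FileStorage file directly, e.g.
    #
    #     from ZODB.FileStorage.fsdump import fsdump, Dumper
    #     fsdump('Data.fs')           # per-transaction / per-record summary
    #     Dumper('Data.fs').dump()    # field-by-field dump of every header
    #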
def __init__(self, path, dest=None): self.file = open(path, "rb") self.dest = dest def dump(self): fid = self.file.read(4) print >> self.dest, "*" * 60 print >> self.dest, "file identifier: %r" % fid while self.dump_txn(): pass def dump_txn(self): pos = self.file.tell() h = self.file.read(TRANS_HDR_LEN) if not h: return False tid, tlen, status, ul, dl, el = struct.unpack(TRANS_HDR, h) end = pos + tlen print >> self.dest, "=" * 60 print >> self.dest, "offset: %d" % pos print >> self.dest, "end pos: %d" % end print >> self.dest, "transaction id: %s" % fmt(tid) print >> self.dest, "trec len: %d" % tlen print >> self.dest, "status: %r" % status user = descr = extra = "" if ul: user = self.file.read(ul) if dl: descr = self.file.read(dl) if el: extra = self.file.read(el) print >> self.dest, "user: %r" % user print >> self.dest, "description: %r" % descr print >> self.dest, "len(extra): %d" % el while self.file.tell() < end: self.dump_data(pos) stlen = self.file.read(8) print >> self.dest, "redundant trec len: %d" % u64(stlen) return 1 def dump_data(self, tloc): pos = self.file.tell() h = self.file.read(DATA_HDR_LEN) assert len(h) == DATA_HDR_LEN oid, revid, prev, tloc, vlen, dlen = struct.unpack(DATA_HDR, h) print >> self.dest, "-" * 60 print >> self.dest, "offset: %d" % pos print >> self.dest, "oid: %s" % fmt(oid) print >> self.dest, "revid: %s" % fmt(revid) print >> self.dest, "previous record offset: %d" % prev print >> self.dest, "transaction offset: %d" % tloc assert not vlen print >> self.dest, "len(data): %d" % dlen self.file.read(dlen) if not dlen: sbp = self.file.read(8) print >> self.dest, "backpointer: %d" % u64(sbp) def main(): import sys fsdump(sys.argv[1]) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/FileStorage/fsoids.py000066400000000000000000000174631230730566700245150ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import ZODB.FileStorage from ZODB.utils import get_pickle_metadata, p64, oid_repr, tid_repr from ZODB.serialize import get_refs from ZODB.TimeStamp import TimeStamp # Extract module.class string from pickle. def get_class(pickle): return "%s.%s" % get_pickle_metadata(pickle) # Shorten a string for display. def shorten(s, size=50): if len(s) <= size: return s # Stick ... in the middle. navail = size - 5 nleading = navail // 2 ntrailing = size - nleading return s[:nleading] + " ... " + s[-ntrailing:] class Tracer(object): """Trace all occurrences of a set of oids in a FileStorage. Create passing a path to an existing FileStorage. Call register_oids(oid, ...) one or more times to specify which oids to investigate. Call run() to do the analysis. This isn't swift -- it has to read every byte in the database, in order to find all references. Call report() to display the results. 
""" def __init__(self, path): import os if not os.path.isfile(path): raise ValueError("must specify an existing FileStorage") self.path = path # Map an interesting tid to (status, user, description, pos). self.tid2info = {} # List of messages. Each is a tuple of the form # (oid, tid, string) # The order in the tuple is important, because it defines the # sort order for grouping. self.msgs = [] # The set of interesting oids, specified by register_oid() calls. # Maps oid to # of revisions. self.oids = {} # Maps interesting oid to its module.class name. If a creation # record for an interesting oid is never seen, it won't appear # in this mapping. self.oid2name = {} def register_oids(self, *oids): """ Declare that oids (0 or more) are "interesting". An oid can be given as a native 8-byte string, or as an integer. Info will be gathered about all appearances of this oid in the entire database, including references. """ for oid in oids: if isinstance(oid, str): assert len(oid) == 8 else: oid = p64(oid) self.oids[oid] = 0 # 0 revisions seen so far def _msg(self, oid, tid, *args): args = map(str, args) self.msgs.append( (oid, tid, ' '.join(args)) ) self._produced_msg = True def report(self): """Show all msgs, grouped by oid and sub-grouped by tid.""" msgs = self.msgs oids = self.oids oid2name = self.oid2name # First determine which oids weren't seen at all, and synthesize msgs # for them. NOT_SEEN = "this oid was not defined (no data record for it found)" for oid in oids: if oid not in oid2name: msgs.append( (oid, None, NOT_SEEN) ) msgs.sort() # oids are primary key, tids secondary current_oid = current_tid = None for oid, tid, msg in msgs: if oid != current_oid: nrev = oids[oid] revision = "revision" + (nrev != 1 and 's' or '') name = oid2name.get(oid, "") print "oid", oid_repr(oid), name, nrev, revision current_oid = oid current_tid = None if msg is NOT_SEEN: assert tid is None print " ", msg continue if tid != current_tid: current_tid = tid status, user, description, pos = self.tid2info[tid] print " tid %s offset=%d %s" % (tid_repr(tid), pos, TimeStamp(tid)) print " tid user=%r" % shorten(user) print " tid description=%r" % shorten(description) print " ", msg # Do the analysis. def run(self): """Find all occurrences of the registered oids in the database.""" # Maps oid of a reference to its module.class name. self._ref2name = {} for txn in ZODB.FileStorage.FileIterator(self.path): self._check_trec(txn) # Process next transaction record. def _check_trec(self, txn): # txn has members tid, status, user, description, # _extension, _pos, _tend, _file, _tpos self._produced_msg = False # Map and list for save data records for current transaction. self._records_map = {} self._records = [] for drec in txn: self._save_references(drec) for drec in self._records: self._check_drec(drec) if self._produced_msg: # Copy txn info for later output. self.tid2info[txn.tid] = (txn.status, txn.user, txn.description, txn._tpos) def _save_references(self, drec): # drec has members oid, tid, data, data_txn tid, oid, pick, pos = drec.tid, drec.oid, drec.data, drec.pos if pick: if oid in self.oids: klass = get_class(pick) self._msg(oid, tid, "new revision", klass, "at", pos) self.oids[oid] += 1 self.oid2name[oid] = self._ref2name[oid] = klass self._records_map[oid] = drec self._records.append(drec) elif oid in self.oids: self._msg(oid, tid, "creation undo at", pos) # Process next data record. If a message is produced, self._produced_msg # will be set True. 
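    # A minimal usage sketch, not part of the module ('Data.fs' and the oids
    # below are placeholders). register_oids() accepts oids either as 8-byte
    # strings or as integers, which it converts with p64():
    #
    #     t = Tracer('Data.fs')
    #     t.register_oids(0x11, 0x2d)   # oids to investigate
    #     t.run()                       # reads every record in the file
    #     t.report()                    # messages grouped by oid, then tid
    #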
def _check_drec(self, drec): # drec has members oid, tid, data, data_txn tid, oid, pick, pos = drec.tid, drec.oid, drec.data, drec.pos ref2name = self._ref2name ref2name_get = ref2name.get records_map_get = self._records_map.get if pick: oid_in_oids = oid in self.oids for ref, klass in get_refs(pick): if ref in self.oids: oidclass = ref2name_get(oid, None) if oidclass is None: ref2name[oid] = oidclass = get_class(pick) self._msg(ref, tid, "referenced by", oid_repr(oid), oidclass, "at", pos) if oid_in_oids: if klass is None: klass = ref2name_get(ref, None) if klass is None: r = records_map_get(ref, None) # For save memory we only save references # seen in one transaction with interesting # objects changes. So in some circumstances # we may still got "" class name. if r is None: klass = "" else: ref2name[ref] = klass = get_class(r.data) elif isinstance(klass, tuple): ref2name[ref] = klass = "%s.%s" % klass self._msg(oid, tid, "references", oid_repr(ref), klass, "at", pos) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/FileStorage/fspack.py000066400000000000000000000557411230730566700244760ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """FileStorage helper to perform pack. A storage contains an ordered set of object revisions. When a storage is packed, object revisions that are not reachable as of the pack time are deleted. The notion of reachability is complicated by backpointers -- object revisions that point to earlier revisions of the same object. An object revisions is reachable at a certain time if it is reachable from the revision of the root at that time or if it is reachable from a backpointer after that time. """ from ZODB.FileStorage.format import DataHeader, TRANS_HDR_LEN from ZODB.FileStorage.format import FileStorageFormatter, CorruptedDataError from ZODB.utils import p64, u64, z64 import logging import os import ZODB.fsIndex import ZODB.POSException logger = logging.getLogger(__name__) class PackError(ZODB.POSException.POSError): pass class PackCopier(FileStorageFormatter): def __init__(self, f, index, tindex): self._file = f self._index = index self._tindex = tindex self._pos = None def _txn_find(self, tid, stop_at_pack): # _pos always points just past the last transaction pos = self._pos while pos > 4: self._file.seek(pos - 8) pos = pos - u64(self._file.read(8)) - 8 self._file.seek(pos) h = self._file.read(TRANS_HDR_LEN) _tid = h[:8] if _tid == tid: return pos if stop_at_pack: if h[16] == 'p': break raise PackError("Invalid backpointer transaction id") def _data_find(self, tpos, oid, data): # Return backpointer for oid. Must call with the lock held. # This is a file offset to oid's data record if found, else 0. # The data records in the transaction at tpos are searched for oid. # If a data record for oid isn't found, returns 0. # Else if oid's data record contains a backpointer, that # backpointer is returned. 
# Else oid's data record contains the data, and the file offset of # oid's data record is returned. This data record should contain # a pickle identical to the 'data' argument. # Unclear: If the length of the stored data doesn't match len(data), # an exception is raised. If the lengths match but the data isn't # the same, 0 is returned. Why the discrepancy? h = self._read_txn_header(tpos) tend = tpos + h.tlen pos = self._file.tell() while pos < tend: h = self._read_data_header(pos) if h.oid == oid: # Make sure this looks like the right data record if h.plen == 0: # This is also a backpointer. Gotta trust it. return pos if h.plen != len(data): # The expected data doesn't match what's in the # backpointer. Something is wrong. logger.error("Mismatch between data and backpointer at %d", pos) return 0 _data = self._file.read(h.plen) if data != _data: return 0 return pos pos += h.recordlen() return 0 def copy(self, oid, serial, data, prev_txn, txnpos, datapos): prev_pos = self._resolve_backpointer(prev_txn, oid, data) old = self._index.get(oid, 0) # Calculate the pos the record will have in the storage. here = datapos # And update the temp file index self._tindex[oid] = here if prev_pos: # If there is a valid prev_pos, don't write data. data = None if data is None: dlen = 0 else: dlen = len(data) # Write the recovery data record h = DataHeader(oid, serial, old, txnpos, 0, dlen) self._file.write(h.asString()) # Write the data or a backpointer if data is None: if prev_pos: self._file.write(p64(prev_pos)) else: # Write a zero backpointer, which indicates an # un-creation transaction. self._file.write(z64) else: self._file.write(data) def setTxnPos(self, pos): self._pos = pos def _resolve_backpointer(self, prev_txn, oid, data): pos = self._file.tell() try: prev_pos = 0 if prev_txn is not None: prev_txn_pos = self._txn_find(prev_txn, 0) if prev_txn_pos: prev_pos = self._data_find(prev_txn_pos, oid, data) return prev_pos finally: self._file.seek(pos) class GC(FileStorageFormatter): def __init__(self, file, eof, packtime, gc, referencesf): self._file = file self._name = file.name self.eof = eof self.packtime = packtime self.gc = gc # packpos: position of first txn header after pack time self.packpos = None # {oid -> current data record position}: self.oid2curpos = ZODB.fsIndex.fsIndex() # The set of reachable revisions of each object. # # This set as managed using two data structures. The first is # an fsIndex mapping oids to one data record pos. Since only # a few objects will have more than one revision, we use this # efficient data structure to handle the common case. The # second is a dictionary mapping objects to lists of # positions; it is used to handle the same number of objects # for which we must keep multiple revisions. self.reachable = ZODB.fsIndex.fsIndex() self.reach_ex = {} # keep ltid for consistency checks during initial scan self.ltid = z64 self.referencesf = referencesf def isReachable(self, oid, pos): """Return 1 if revision of `oid` at `pos` is reachable.""" rpos = self.reachable.get(oid) if rpos is None: return 0 if rpos == pos: return 1 return pos in self.reach_ex.get(oid, []) def findReachable(self): self.buildPackIndex() if self.gc: self.findReachableAtPacktime([z64]) self.findReachableFromFuture() # These mappings are no longer needed and may consume a lot of # space. 
del self.oid2curpos else: self.reachable = self.oid2curpos def buildPackIndex(self): pos = 4L # We make the initial assumption that the database has been # packed before and set unpacked to True only after seeing the # first record with a status == " ". If we get to the packtime # and unpacked is still False, we need to watch for a redundant # pack. unpacked = False while pos < self.eof: th = self._read_txn_header(pos) if th.tid > self.packtime: break self.checkTxn(th, pos) if th.status != "p": unpacked = True tpos = pos end = pos + th.tlen pos += th.headerlen() while pos < end: dh = self._read_data_header(pos) self.checkData(th, tpos, dh, pos) if dh.plen or dh.back: self.oid2curpos[dh.oid] = pos else: if dh.oid in self.oid2curpos: del self.oid2curpos[dh.oid] pos += dh.recordlen() tlen = self._read_num(pos) if tlen != th.tlen: self.fail(pos, "redundant transaction length does not " "match initial transaction length: %d != %d", tlen, th.tlen) pos += 8 self.packpos = pos if unpacked: return # check for a redundant pack. If the first record following # the newly computed packpos has status 'p', then it was # packed earlier and the current pack is redudant. try: th = self._read_txn_header(pos) except CorruptedDataError, err: if err.buf != "": raise if th.status == 'p': # Delayed import to cope with circular imports. # TODO: put exceptions in a separate module. from ZODB.FileStorage.FileStorage import RedundantPackWarning raise RedundantPackWarning( "The database has already been packed to a later time" " or no changes have been made since the last pack") def findReachableAtPacktime(self, roots): """Mark all objects reachable from the oids in roots as reachable.""" reachable = self.reachable oid2curpos = self.oid2curpos todo = list(roots) while todo: oid = todo.pop() if oid in reachable: continue try: pos = oid2curpos[oid] except KeyError: if oid == z64 and len(oid2curpos) == 0: # special case, pack to before creation time continue raise reachable[oid] = pos for oid in self.findrefs(pos): if oid not in reachable: todo.append(oid) def findReachableFromFuture(self): # In this pass, the roots are positions of object revisions. # We add a pos to extra_roots when there is a backpointer to a # revision that was not current at the packtime. The # non-current revision could refer to objects that were # otherwise unreachable at the packtime. extra_roots = [] pos = self.packpos while pos < self.eof: th = self._read_txn_header(pos) self.checkTxn(th, pos) tpos = pos end = pos + th.tlen pos += th.headerlen() while pos < end: dh = self._read_data_header(pos) self.checkData(th, tpos, dh, pos) if dh.back and dh.back < self.packpos: if self.reachable.has_key(dh.oid): L = self.reach_ex.setdefault(dh.oid, []) if dh.back not in L: L.append(dh.back) extra_roots.append(dh.back) else: self.reachable[dh.oid] = dh.back pos += dh.recordlen() tlen = self._read_num(pos) if tlen != th.tlen: self.fail(pos, "redundant transaction length does not " "match initial transaction length: %d != %d", tlen, th.tlen) pos += 8 for pos in extra_roots: refs = self.findrefs(pos) self.findReachableAtPacktime(refs) def findrefs(self, pos): """Return a list of oids referenced as of packtime.""" dh = self._read_data_header(pos) # Chase backpointers until we get to the record with the refs while dh.back: dh = self._read_data_header(dh.back) if dh.plen: return self.referencesf(self._file.read(dh.plen)) else: return [] class FileStoragePacker(FileStorageFormatter): # path is the storage file path. # stop is the pack time, as a TimeStamp. 
# la and lr are the acquire() and release() methods of the storage's lock. # cla and clr similarly, for the storage's commit lock. # current_size is the storage's _pos. All valid data at the start # lives before that offset (there may be a checkpoint transaction in # progress after it). def __init__(self, storage, referencesf, stop, gc=True): self._storage = storage if storage.blob_dir: self.pack_blobs = True self.blob_removed = open( os.path.join(storage.blob_dir, '.removed'), 'w') else: self.pack_blobs = False path = storage._file.name self._name = path # We open our own handle on the storage so that much of pack can # proceed in parallel. It's important to close this file at every # return point, else on Windows the caller won't be able to rename # or remove the storage file. self._file = open(path, "rb") self._path = path self._stop = stop self.locked = False self.file_end = storage.getSize() self.gc = GC(self._file, self.file_end, self._stop, gc, referencesf) # The packer needs to acquire the parent's commit lock # during the copying stage, so the two sets of lock acquire # and release methods are passed to the constructor. self._lock_acquire = storage._lock_acquire self._lock_release = storage._lock_release self._commit_lock_acquire = storage._commit_lock_acquire self._commit_lock_release = storage._commit_lock_release # The packer will use several indexes. # index: oid -> pos # tindex: oid -> pos, for current txn # oid2tid: not used by the packer self.index = ZODB.fsIndex.fsIndex() self.tindex = {} self.oid2tid = {} self.toid2tid = {} self.toid2tid_delete = {} def pack(self): # Pack copies all data reachable at the pack time or later. # # Copying occurs in two phases. In the first phase, txns # before the pack time are copied if the contain any reachable # data. In the second phase, all txns after the pack time # are copied. # # Txn and data records contain pointers to previous records. # Because these pointers are stored as file offsets, they # must be updated when we copy data. # TODO: Should add sanity checking to pack. self.gc.findReachable() # Setup the destination file and copy the metadata. # TODO: rename from _tfile to something clearer. self._tfile = open(self._name + ".pack", "w+b") self._file.seek(0) self._tfile.write(self._file.read(self._metadata_size)) self._copier = PackCopier(self._tfile, self.index, self.tindex) ipos, opos = self.copyToPacktime() assert ipos == self.gc.packpos if ipos == opos: # pack didn't free any data. there's no point in continuing. self._tfile.close() self._file.close() os.remove(self._name + ".pack") return None self._commit_lock_acquire() self.locked = True try: self._lock_acquire() try: # Re-open the file in unbuffered mode. # The main thread may write new transactions to the # file, which creates the possibility that we will # read a status 'c' transaction into the pack thread's # stdio buffer even though we're acquiring the commit # lock. Transactions can still be in progress # throughout much of packing, and are written to the # same physical file but via a distinct Python file # object. The code used to leave off the trailing 0 # argument, and then on every platform except native # Windows it was observed that we could read stale # data from the tail end of the file. self._file.close() # else self.gc keeps the original # alive & open self._file = open(self._path, "rb", 0) self._file.seek(0, 2) self.file_end = self._file.tell() finally: self._lock_release() if ipos < self.file_end: self.copyRest(ipos) # OK, we've copied everything. 
Now we need to wrap things up. pos = self._tfile.tell() self._tfile.flush() self._tfile.close() self._file.close() return pos except: if self.locked: self._commit_lock_release() raise def copyToPacktime(self): offset = 0L # the amount of space freed by packing pos = self._metadata_size new_pos = pos while pos < self.gc.packpos: th = self._read_txn_header(pos) new_tpos, pos = self.copyDataRecords(pos, th) if new_tpos: new_pos = self._tfile.tell() + 8 tlen = new_pos - new_tpos - 8 # Update the transaction length self._tfile.seek(new_tpos + 8) self._tfile.write(p64(tlen)) self._tfile.seek(new_pos - 8) self._tfile.write(p64(tlen)) tlen = self._read_num(pos) if tlen != th.tlen: self.fail(pos, "redundant transaction length does not " "match initial transaction length: %d != %d", tlen, th.tlen) pos += 8 return pos, new_pos def copyDataRecords(self, pos, th): """Copy any current data records between pos and tend. Returns position of txn header in output file and position of next record in the input file. If any data records are copied, also write txn header (th). """ copy = 0 new_tpos = 0L tend = pos + th.tlen pos += th.headerlen() while pos < tend: h = self._read_data_header(pos) if not self.gc.isReachable(h.oid, pos): if self.pack_blobs: # We need to find out if this is a blob, so get the data: if h.plen: data = self._file.read(h.plen) else: data = self.fetchDataViaBackpointer(h.oid, h.back) if data and self._storage.is_blob_record(data): # We need to remove the blob record. Maybe we # need to remove oid: # But first, we need to make sure the record # we're looking at isn't a dup of the current # record. There's a bug in ZEO blob support that causes # duplicate data records. rpos = self.gc.reachable.get(h.oid) is_dup = (rpos and self._read_data_header(rpos).tid == h.tid) if not is_dup: if h.oid not in self.gc.reachable: self.blob_removed.write( h.oid.encode('hex')+'\n') else: self.blob_removed.write( (h.oid+h.tid).encode('hex')+'\n') pos += h.recordlen() continue pos += h.recordlen() # If we are going to copy any data, we need to copy # the transaction header. Note that we will need to # patch up the transaction length when we are done. if not copy: th.status = "p" s = th.asString() new_tpos = self._tfile.tell() self._tfile.write(s) new_pos = new_tpos + len(s) copy = 1 if h.plen: data = self._file.read(h.plen) else: data = self.fetchDataViaBackpointer(h.oid, h.back) self.writePackedDataRecord(h, data, new_tpos) new_pos = self._tfile.tell() return new_tpos, pos def fetchDataViaBackpointer(self, oid, back): """Return the data for oid via backpointer back If `back` is 0 or ultimately resolves to 0, return None. In this case, the transaction undoes the object creation. """ if back == 0: return None data, tid = self._loadBackTxn(oid, back, 0) return data def writePackedDataRecord(self, h, data, new_tpos): # Update the header to reflect current information, then write # it to the output file. if data is None: data = "" h.prev = 0 h.back = 0 h.plen = len(data) h.tloc = new_tpos pos = self._tfile.tell() self.index[h.oid] = pos self._tfile.write(h.asString()) self._tfile.write(data) if not data: # Packed records never have backpointers (?). # If there is no data, write a z64 backpointer. # This is a George Bailey event. self._tfile.write(z64) def copyRest(self, ipos): # After the pack time, all data records are copied. # Copy one txn at a time, using copy() for data. 
try: while 1: ipos = self.copyOne(ipos) except CorruptedDataError, err: # The last call to copyOne() will raise # CorruptedDataError, because it will attempt to read past # the end of the file. Double-check that the exception # occurred for this reason. self._file.seek(0, 2) endpos = self._file.tell() if endpos != err.pos: raise def copyOne(self, ipos): # The call below will raise CorruptedDataError at EOF. th = self._read_txn_header(ipos) # Release commit lock while writing to pack file self._commit_lock_release() self.locked = False pos = self._tfile.tell() self._copier.setTxnPos(pos) self._tfile.write(th.asString()) tend = ipos + th.tlen ipos += th.headerlen() while ipos < tend: h = self._read_data_header(ipos) ipos += h.recordlen() prev_txn = None if h.plen: data = self._file.read(h.plen) else: data = self.fetchDataViaBackpointer(h.oid, h.back) if h.back: prev_txn = self.getTxnFromData(h.oid, h.back) self._copier.copy(h.oid, h.tid, data, prev_txn, pos, self._tfile.tell()) tlen = self._tfile.tell() - pos assert tlen == th.tlen self._tfile.write(p64(tlen)) ipos += 8 self.index.update(self.tindex) self.tindex.clear() self._commit_lock_acquire() self.locked = True return ipos ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/FileStorage/interfaces.py000066400000000000000000000045621230730566700253450ustar00rootroot00000000000000############################################################################## # # Copyright (c) Zope Corporation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## import zope.interface class IFileStoragePacker(zope.interface.Interface): def __call__(storage, referencesf, stop, gc): """Pack the file storage into a new file The new file will have the same name as the old file with '.pack' appended. (The packer can get the old file name via storage._file.name.) If blobs are supported, if the storages blob_dir attribute is not None or empty, then a .removed file most be created in the blob directory. This file contains of the form: (oid+serial).encode('hex')+'\n' or, of the form: oid.encode('hex')+'\n' If packing is unnecessary, or would not change the file, then no pack or removed files are created None is returned, otherwise a tuple is returned with: - the size of the packed file, and - the packed index If and only if packing was necessary (non-None) and there was no error, then the commit lock must be acquired. In addition, it is up to FileStorage to: - Rename the .pack file, and - process the blob_dir/.removed file by removing the blobs corresponding to the file records. """ class IFileStorage(zope.interface.Interface): packer = zope.interface.Attribute( "The IFileStoragePacker to be used for packing." ) _file = zope.interface.Attribute( "The file object used to access the underlying data." 
) def _lock_acquire(): "Acquire the storage lock" def _lock_release(): "Release the storage lock" def _commit_lock_acquire(): "Acquire the storage commit lock" def _commit_lock_release(): "Release the storage commit lock" ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/FileStorage/iterator.test000066400000000000000000000104651230730566700254010ustar00rootroot00000000000000FileStorage-specific iterator tests =================================== The FileStorage iterator has some special features that deserve some special tests. We'll make some assertions about time, so we'll take it over: >>> now = 1229959248 >>> def faux_time(): ... global now ... now += 0.1 ... return now >>> import time >>> time_time = time.time >>> time.time = faux_time Commit a bunch of transactions: >>> import ZODB.FileStorage, transaction >>> db = ZODB.DB('data.fs') >>> tids = [db.storage.lastTransaction()] >>> poss = [db.storage._pos] >>> conn = db.open() >>> for i in range(100): ... conn.root()[i] = conn.root().__class__() ... transaction.commit() ... tids.append(db.storage.lastTransaction()) ... poss.append(db.storage._pos) Deciding where to start ----------------------- By default, we start at the beginning: >>> it = ZODB.FileStorage.FileIterator('data.fs') >>> it.next().tid == tids[0] True The file iterator has an optimization to deal with large files. It can serarch from either the front or the back of the file, depending on the starting transaction given. To see this, we'll turn on debug logging: >>> import logging, sys >>> old_log_level = logging.getLogger().getEffectiveLevel() >>> logging.getLogger().setLevel(logging.DEBUG) >>> handler = logging.StreamHandler(sys.stdout) >>> logging.getLogger().addHandler(handler) If we specify a start transaction, we'll scan forward or backward, as seems best and set the next record to that: >>> it = ZODB.FileStorage.FileIterator('data.fs', tids[0]) >>> it.next().tid == tids[0] True >>> it = ZODB.FileStorage.FileIterator('data.fs', tids[1]) Scan forward data.fs:4 looking for '\x03z\xbd\xd8\xd06\x9c\xcc' >>> it.next().tid == tids[1] True >>> it = ZODB.FileStorage.FileIterator('data.fs', tids[30]) Scan forward data.fs:4 looking for '\x03z\xbd\xd8\xdc\x96.\xcc' >>> it.next().tid == tids[30] True >>> it = ZODB.FileStorage.FileIterator('data.fs', tids[70]) Scan backward data.fs:117080 looking for '\x03z\xbd\xd8\xed\xa7>\xcc' >>> it.next().tid == tids[70] True >>> it = ZODB.FileStorage.FileIterator('data.fs', tids[-2]) Scan backward data.fs:117080 looking for '\x03z\xbd\xd8\xfa\x06\xd0\xcc' >>> it.next().tid == tids[-2] True >>> it = ZODB.FileStorage.FileIterator('data.fs', tids[-1]) >>> it.next().tid == tids[-1] True We can also supply a file position. 
This can speed up finding the starting point, or just pick up where another iterator left off: >>> it = ZODB.FileStorage.FileIterator('data.fs', pos=poss[50]) >>> it.next().tid == tids[51] True >>> it = ZODB.FileStorage.FileIterator('data.fs', tids[0], pos=4) >>> it.next().tid == tids[0] True >>> it = ZODB.FileStorage.FileIterator('data.fs', tids[-1], pos=poss[-2]) >>> it.next().tid == tids[-1] True >>> it = ZODB.FileStorage.FileIterator('data.fs', tids[50], pos=poss[50]) Scan backward data.fs:35936 looking for '\x03z\xbd\xd8\xe5\x1e\xb6\xcc' >>> it.next().tid == tids[50] True >>> it = ZODB.FileStorage.FileIterator('data.fs', tids[49], pos=poss[50]) Scan backward data.fs:35936 looking for '\x03z\xbd\xd8\xe4\xb1|\xcc' >>> it.next().tid == tids[49] True >>> it = ZODB.FileStorage.FileIterator('data.fs', tids[51], pos=poss[50]) >>> it.next().tid == tids[51] True >>> logging.getLogger().setLevel(old_log_level) >>> logging.getLogger().removeHandler(handler) If a starting transaction is before the first transaction in the file, then the first transaction is returned. >>> from ZODB.utils import p64, u64 >>> it = ZODB.FileStorage.FileIterator('data.fs', p64(u64(tids[0])-1)) >>> it.next().tid == tids[0] True If it is after the last transaction, then iteration be empty: >>> it = ZODB.FileStorage.FileIterator('data.fs', p64(u64(tids[-1])+1)) >>> list(it) [] Even if we write more transactions: >>> it = ZODB.FileStorage.FileIterator('data.fs', p64(u64(tids[-1])+1)) >>> for i in range(10): ... conn.root()[i] = conn.root().__class__() ... transaction.commit() >>> list(it) [] .. Cleanup >>> time.time = time_time >>> it.close() >>> db.close() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/FileStorage/tests.py000066400000000000000000000124641230730566700243640ustar00rootroot00000000000000############################################################################## # # Copyright (c) Zope Corporation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## import doctest import os import time import transaction import unittest import ZODB.blob import ZODB.FileStorage import ZODB.tests.util def pack_keep_old(): """Should a copy of the database be kept? The pack_keep_old constructor argument controls whether a .old file (and .old directory for blobs is kept.) >>> fs = ZODB.FileStorage.FileStorage('data.fs', blob_dir='blobs') >>> db = ZODB.DB(fs) >>> conn = db.open() >>> import ZODB.blob >>> conn.root()[1] = ZODB.blob.Blob() >>> conn.root()[1].open('w').write('some data') >>> conn.root()[2] = ZODB.blob.Blob() >>> conn.root()[2].open('w').write('some data') >>> transaction.commit() >>> conn.root()[1].open('w').write('some other data') >>> del conn.root()[2] >>> transaction.commit() >>> old_size = os.stat('data.fs').st_size >>> def get_blob_size(d): ... result = 0 ... for path, dirs, file_names in os.walk(d): ... for file_name in file_names: ... result += os.stat(os.path.join(path, file_name)).st_size ... 
return result >>> blob_size = get_blob_size('blobs') >>> db.pack(time.time()+1) >>> packed_size = os.stat('data.fs').st_size >>> packed_size < old_size True >>> os.stat('data.fs.old').st_size == old_size True >>> packed_blob_size = get_blob_size('blobs') >>> packed_blob_size < blob_size True >>> get_blob_size('blobs.old') == blob_size True >>> db.close() >>> fs = ZODB.FileStorage.FileStorage('data.fs', blob_dir='blobs', ... create=True, pack_keep_old=False) >>> db = ZODB.DB(fs) >>> conn = db.open() >>> conn.root()[1] = ZODB.blob.Blob() >>> conn.root()[1].open('w').write('some data') >>> conn.root()[2] = ZODB.blob.Blob() >>> conn.root()[2].open('w').write('some data') >>> transaction.commit() >>> conn.root()[1].open('w').write('some other data') >>> del conn.root()[2] >>> transaction.commit() >>> db.pack(time.time()+1) >>> os.stat('data.fs').st_size == packed_size True >>> os.path.exists('data.fs.old') False >>> get_blob_size('blobs') == packed_blob_size True >>> os.path.exists('blobs.old') False >>> db.close() """ def pack_with_repeated_blob_records(): """ There is a bug in ZEO that causes duplicate bloc database records to be written in a blob store operation. (Maybe this has been fixed by the time you read this, but there might still be transactions in the wild that have duplicate records. >>> fs = ZODB.FileStorage.FileStorage('t', blob_dir='bobs') >>> db = ZODB.DB(fs) >>> conn = db.open() >>> conn.root()[1] = ZODB.blob.Blob() >>> transaction.commit() >>> tm = transaction.TransactionManager() >>> oid = conn.root()[1]._p_oid >>> blob_record, oldserial = fs.load(oid) Now, create a transaction with multiple saves: >>> trans = tm.begin() >>> fs.tpc_begin(trans) >>> open('ablob', 'w').write('some data') >>> _ = fs.store(oid, oldserial, blob_record, '', trans) >>> _ = fs.storeBlob(oid, oldserial, blob_record, 'ablob', '', trans) >>> fs.tpc_vote(trans) >>> fs.tpc_finish(trans) >>> time.sleep(.01) >>> db.pack() >>> conn.sync() >>> conn.root()[1].open().read() 'some data' >>> db.close() """ def _save_index(): """ _save_index can fail for large indexes. >>> import ZODB.utils >>> fs = ZODB.FileStorage.FileStorage('data.fs') >>> t = transaction.begin() >>> fs.tpc_begin(t) >>> oid = 0 >>> for i in range(5000): ... oid += (1<<16) ... _ = fs.store(ZODB.utils.p64(oid), ZODB.utils.z64, 'x', '', t) >>> fs.tpc_vote(t) >>> fs.tpc_finish(t) >>> import sys >>> old_limit = sys.getrecursionlimit() >>> sys.setrecursionlimit(50) >>> fs._save_index() Make sure we can restore: >>> import logging >>> handler = logging.StreamHandler(sys.stdout) >>> logger = logging.getLogger('ZODB.FileStorage') >>> logger.setLevel(logging.DEBUG) >>> logger.addHandler(handler) >>> index, pos, tid = fs._restore_index() >>> index.items() == fs._index.items() True >>> pos, tid = fs._pos, fs._tid cleanup >>> fs.close() >>> logger.setLevel(logging.NOTSET) >>> logger.removeHandler(handler) >>> sys.setrecursionlimit(old_limit) """ def test_suite(): return unittest.TestSuite(( doctest.DocFileSuite( 'zconfig.txt', 'iterator.test', setUp=ZODB.tests.util.setUp, tearDown=ZODB.tests.util.tearDown, ), doctest.DocTestSuite( setUp=ZODB.tests.util.setUp, tearDown=ZODB.tests.util.tearDown, ), )) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/FileStorage/zconfig.txt000066400000000000000000000126241230730566700250460ustar00rootroot00000000000000Defining FileStorages using ZConfig =================================== ZODB provides support for defining many storages, including FileStorages, using ZConfig. 
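For orientation, this is the general shape of a filestorage section as
it appears in a ZConfig configuration string or file.  This is only a
sketch; the path value is a placeholder::

    <filestorage>
        path Data.fs
    </filestorage>

The examples below build such sections as strings and hand them to
ZODB.config.storageFromString.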
To define a FileStorage, you use a filestorage section, and define a path: >>> import ZODB.config >>> fs = ZODB.config.storageFromString(""" ... ... path my.fs ... ... """) >>> fs._file.name 'my.fs' >>> fs.close() There are a number of options we can provide: blob-dir If supplied, the file storage will provide blob support and this is the name of a directory to hold blob data. The directory will be created if it doeesn't exist. If no value (or an empty value) is provided, then no blob support will be provided. (You can still use a BlobStorage to provide blob support.) >>> fs = ZODB.config.storageFromString(""" ... ... path my.fs ... blob-dir blobs ... ... """) >>> fs._file.name 'my.fs' >>> import os >>> os.path.basename(fs.blob_dir) 'blobs' create Flag that indicates whether the storage should be truncated if it already exists. To demonstrate this, we'll first write some data: >>> db = ZODB.DB(fs) >>> conn = db.open() >>> import ZODB.blob, transaction >>> conn.root()[1] = ZODB.blob.Blob() >>> transaction.commit() >>> db.close() Then reopen with the create option: >>> fs = ZODB.config.storageFromString(""" ... ... path my.fs ... blob-dir blobs ... create true ... ... """) Because the file was truncated, we no-longer have object 0: >>> fs.load('\0'*8) Traceback (most recent call last): ... POSKeyError: 0x00 >>> sorted(os.listdir('blobs')) ['.layout', 'tmp'] >>> fs.close() read-only If true, only reads may be executed against the storage. Note that the "pack" operation is not considered a write operation and is still allowed on a read-only filestorage. >>> fs = ZODB.config.storageFromString(""" ... ... path my.fs ... read-only true ... ... """) >>> fs.isReadOnly() True >>> fs.close() quota Maximum allowed size of the storage file. Operations which would cause the size of the storage to exceed the quota will result in a ZODB.FileStorage.FileStorageQuotaError being raised. >>> fs = ZODB.config.storageFromString(""" ... ... path my.fs ... quota 10 ... ... """) >>> db = ZODB.DB(fs) # writes object 0 Traceback (most recent call last): ... FileStorageQuotaError: The storage quota has been exceeded. >>> fs.close() packer The dotten name (dotten module name and object name) of a packer object. This is used to provide an alternative pack implementation. To demonstrate this, we'll create a null packer that just prints some information about it's arguments: >>> def packer(storage, referencesf, stop, gc): ... print referencesf, storage is fs, gc, storage.pack_keep_old >>> ZODB.FileStorage.config_demo_printing_packer = packer >>> fs = ZODB.config.storageFromString(""" ... ... path my.fs ... packer ZODB.FileStorage.config_demo_printing_packer ... ... """) >>> import time >>> db = ZODB.DB(fs) # writes object 0 >>> fs.pack(time.time(), 42) 42 True True True >>> fs.close() If the packer contains a ':', then the text after the first ':' is interpreted as an expression. This is handy to pass limited configuration information to the packer: >>> def packer_factory(name): ... def packer(storage, referencesf, stop, gc): ... print repr(name), referencesf, storage is fs, gc, ... print storage.pack_keep_old ... return packer >>> ZODB.FileStorage.config_demo_printing_packer_factory = packer_factory >>> fs = ZODB.config.storageFromString(""" ... ... path my.fs ... packer ZODB.FileStorage:config_demo_printing_packer_factory('bob ') ... ... 
""") >>> import time >>> db = ZODB.DB(fs) # writes object 0 >>> fs.pack(time.time(), 42) 'bob ' 42 True True True >>> fs.close() pack-gc If false, then no garbage collection will be performed when packing. This can make packing go much faster and can avoid problems when objects are referenced only from other databases. >>> fs = ZODB.config.storageFromString(""" ... ... path my.fs ... packer ZODB.FileStorage.config_demo_printing_packer ... pack-gc false ... ... """) >>> fs.pack(time.time(), 42) 42 True False True Note that if we pass the gc option to pack, then this will override the value set in the configuration: >>> fs.pack(time.time(), 42, gc=True) 42 True True True >>> fs.close() pack-keep-old If false, then old files aren't kept when packing >>> fs = ZODB.config.storageFromString(""" ... ... path my.fs ... packer ZODB.FileStorage.config_demo_printing_packer ... pack-keep-old false ... ... """) >>> fs.pack(time.time(), 42) 42 True True False >>> fs.close() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/MappingStorage.py000066400000000000000000000263551230730566700237420ustar00rootroot00000000000000############################################################################## # # Copyright (c) Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """A simple in-memory mapping-based ZODB storage This storage provides an example implementation of a fairly full storage without distracting storage details. 
""" import BTrees import time import threading import ZODB.BaseStorage import ZODB.interfaces import ZODB.POSException import ZODB.TimeStamp import ZODB.utils import zope.interface class MappingStorage(object): zope.interface.implements( ZODB.interfaces.IStorage, ZODB.interfaces.IStorageIteration, ) def __init__(self, name='MappingStorage'): self.__name__ = name self._data = {} # {oid->{tid->pickle}} self._transactions = BTrees.OOBTree.OOBTree() # {tid->TransactionRecord} self._ltid = ZODB.utils.z64 self._last_pack = None _lock = threading.RLock() self._lock_acquire = _lock.acquire self._lock_release = _lock.release self._commit_lock = threading.Lock() self._opened = True self._transaction = None self._oid = 0 ###################################################################### # Preconditions: def opened(self): """The storage is open """ return self._opened def not_in_transaction(self): """The storage is not committing a transaction """ return self._transaction is None # ###################################################################### # testing framework (lame) def cleanup(self): pass # ZODB.interfaces.IStorage @ZODB.utils.locked def close(self): self._opened = False # ZODB.interfaces.IStorage def getName(self): return self.__name__ # ZODB.interfaces.IStorage @ZODB.utils.locked(opened) def getSize(self): size = 0 for oid, tid_data in self._data.items(): size += 50 for tid, pickle in tid_data.items(): size += 100+len(pickle) return size # ZEO.interfaces.IServeable @ZODB.utils.locked(opened) def getTid(self, oid): tid_data = self._data.get(oid) if tid_data: return tid_data.maxKey() raise ZODB.POSException.POSKeyError(oid) # ZODB.interfaces.IStorage @ZODB.utils.locked(opened) def history(self, oid, size=1): tid_data = self._data.get(oid) if not tid_data: raise ZODB.POSException.POSKeyError(oid) tids = tid_data.keys()[-size:] tids.reverse() return [ dict( time = ZODB.TimeStamp.TimeStamp(tid), tid = tid, serial = tid, user_name = self._transactions[tid].user, description = self._transactions[tid].description, extension = self._transactions[tid].extension, size = len(tid_data[tid]) ) for tid in tids] # ZODB.interfaces.IStorage def isReadOnly(self): return False # ZODB.interfaces.IStorageIteration def iterator(self, start=None, end=None): for transaction_record in self._transactions.values(start, end): yield transaction_record # ZODB.interfaces.IStorage @ZODB.utils.locked(opened) def lastTransaction(self): return self._ltid # ZODB.interfaces.IStorage @ZODB.utils.locked(opened) def __len__(self): return len(self._data) # ZODB.interfaces.IStorage @ZODB.utils.locked(opened) def load(self, oid, version=''): assert not version, "Versions are not supported" tid_data = self._data.get(oid) if tid_data: tid = tid_data.maxKey() return tid_data[tid], tid raise ZODB.POSException.POSKeyError(oid) # ZODB.interfaces.IStorage @ZODB.utils.locked(opened) def loadBefore(self, oid, tid): tid_data = self._data.get(oid) if tid_data: before = ZODB.utils.u64(tid) if not before: return None before = ZODB.utils.p64(before-1) tids_before = tid_data.keys(None, before) if tids_before: tids_after = tid_data.keys(tid, None) tid = tids_before[-1] return (tid_data[tid], tid, (tids_after and tids_after[0] or None) ) else: raise ZODB.POSException.POSKeyError(oid) # ZODB.interfaces.IStorage @ZODB.utils.locked(opened) def loadSerial(self, oid, serial): tid_data = self._data.get(oid) if tid_data: try: return tid_data[serial] except KeyError: pass raise ZODB.POSException.POSKeyError(oid, serial) # ZODB.interfaces.IStorage 
@ZODB.utils.locked(opened) def new_oid(self): self._oid += 1 return ZODB.utils.p64(self._oid) # ZODB.interfaces.IStorage @ZODB.utils.locked(opened) def pack(self, t, referencesf, gc=True): if not self._data: return stop = `ZODB.TimeStamp.TimeStamp(*time.gmtime(t)[:5]+(t%60,))` if self._last_pack is not None and self._last_pack >= stop: if self._last_pack == stop: return raise ValueError("Already packed to a later time") self._last_pack = stop transactions = self._transactions # Step 1, remove old non-current records for oid, tid_data in self._data.items(): tids_to_remove = tid_data.keys(None, stop) if tids_to_remove: tids_to_remove.pop() # Keep the last, if any if tids_to_remove: for tid in tids_to_remove: del tid_data[tid] if transactions[tid].pack(oid): del transactions[tid] if gc: # Step 2, GC. A simple sweep+copy new_data = BTrees.OOBTree.OOBTree() to_copy = set([ZODB.utils.z64]) while to_copy: oid = to_copy.pop() tid_data = self._data.pop(oid) new_data[oid] = tid_data for pickle in tid_data.values(): for oid in referencesf(pickle): if oid in new_data: continue to_copy.add(oid) # Remove left over data from transactions for oid, tid_data in self._data.items(): for tid in tid_data: if transactions[tid].pack(oid): del transactions[tid] self._data.clear() self._data.update(new_data) # ZODB.interfaces.IStorage def registerDB(self, db): pass # ZODB.interfaces.IStorage def sortKey(self): return self.__name__ # ZODB.interfaces.IStorage @ZODB.utils.locked(opened) def store(self, oid, serial, data, version, transaction): assert not version, "Versions are not supported" if transaction is not self._transaction: raise ZODB.POSException.StorageTransactionError(self, transaction) old_tid = None tid_data = self._data.get(oid) if tid_data: old_tid = tid_data.maxKey() if serial != old_tid: raise ZODB.POSException.ConflictError( oid=oid, serials=(old_tid, serial), data=data) self._tdata[oid] = data return self._tid checkCurrentSerialInTransaction = ( ZODB.BaseStorage.checkCurrentSerialInTransaction) # ZODB.interfaces.IStorage @ZODB.utils.locked(opened) def tpc_abort(self, transaction): if transaction is not self._transaction: return self._transaction = None self._commit_lock.release() # ZODB.interfaces.IStorage @ZODB.utils.locked(opened) def tpc_begin(self, transaction, tid=None): # The tid argument exists to support testing. 
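        # Note the lock handling below: the instance lock is released
        # while this thread blocks on the commit lock, so other
        # (locked) storage methods can still run, and it is re-acquired
        # once the commit lock is held.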
if transaction is self._transaction: raise ZODB.POSException.StorageTransactionError( "Duplicate tpc_begin calls for same transaction") self._lock_release() self._commit_lock.acquire() self._lock_acquire() self._transaction = transaction self._tdata = {} if tid is None: if self._transactions: old_tid = self._transactions.maxKey() else: old_tid = None tid = ZODB.utils.newTid(old_tid) self._tid = tid # ZODB.interfaces.IStorage @ZODB.utils.locked(opened) def tpc_finish(self, transaction, func = lambda tid: None): if (transaction is not self._transaction): raise ZODB.POSException.StorageTransactionError( "tpc_finish called with wrong transaction") tid = self._tid func(tid) tdata = self._tdata for oid in tdata: tid_data = self._data.get(oid) if tid_data is None: tid_data = BTrees.OOBTree.OOBucket() self._data[oid] = tid_data tid_data[tid] = tdata[oid] self._ltid = tid self._transactions[tid] = TransactionRecord(tid, transaction, tdata) self._transaction = None del self._tdata self._commit_lock.release() # ZEO.interfaces.IServeable @ZODB.utils.locked(opened) def tpc_transaction(self): return self._transaction # ZODB.interfaces.IStorage def tpc_vote(self, transaction): if transaction is not self._transaction: raise ZODB.POSException.StorageTransactionError( "tpc_vote called with wrong transaction") class TransactionRecord: status = ' ' def __init__(self, tid, transaction, data): self.tid = tid self.user = transaction.user self.description = transaction.description extension = transaction._extension self.extension = extension self.data = data _extension = property(lambda self: self._extension, lambda self, v: setattr(self, '_extension', v), ) def __iter__(self): for oid, data in self.data.items(): yield DataRecord(oid, self.tid, data) def pack(self, oid): self.status = 'p' del self.data[oid] return not self.data class DataRecord(object): """Abstract base class for iterator protocol""" zope.interface.implements(ZODB.interfaces.IStorageRecordInformation) version = '' data_txn = None def __init__(self, oid, tid, data): self.oid = oid self.tid = tid self.data = data def DB(*args, **kw): return ZODB.DB(MappingStorage(), *args, **kw) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/POSException.py000066400000000000000000000272161230730566700233370ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """ZODB-defined exceptions $Id$""" import sys from ZODB.utils import oid_repr, readable_tid_repr # BBB: We moved the two transactions to the transaction package from transaction.interfaces import TransactionError, TransactionFailedError import transaction.interfaces def _fmt_undo(oid, reason): s = reason and (": %s" % reason) or "" return "Undo error %s%s" % (oid_repr(oid), s) def _recon(class_, state): err = class_.__new__(class_) err.__setstate__(state) return err _recon.__no_side_effects__ = True class POSError(StandardError): """Persistent object system error.""" if sys.version_info[:2] == (2, 6): # The 'message' attribute was deprecated for BaseException with # Python 2.6; here we create descriptor properties to continue using it def __set_message(self, v): self.__dict__['message'] = v def __get_message(self): return self.__dict__['message'] def __del_message(self): del self.__dict__['message'] message = property(__get_message, __set_message, __del_message) if sys.version_info[:2] >= (2, 5): def __reduce__(self): # Copy extra data from internal structures state = self.__dict__.copy() if sys.version_info[:2] == (2, 5): state['message'] = self.message state['args'] = self.args return (_recon, (self.__class__, state)) class POSKeyError(POSError, KeyError): """Key not found in database.""" def __str__(self): return oid_repr(self.args[0]) class ConflictError(POSError, transaction.interfaces.TransientError): """Two transactions tried to modify the same object at once. This transaction should be resubmitted. Instance attributes: oid : string the OID (8-byte packed string) of the object in conflict class_name : string the fully-qualified name of that object's class message : string a human-readable explanation of the error serials : (string, string) a pair of 8-byte packed strings; these are the serial numbers related to conflict. The first is the revision of object that is in conflict, the currently committed serial. The second is the revision the current transaction read when it started. data : string The database record that failed to commit, used to put the class name in the error message. The caller should pass either object or oid as a keyword argument, but not both of them. If object is passed, it should be a persistent object with an _p_oid attribute. """ def __init__(self, message=None, object=None, oid=None, serials=None, data=None): if message is None: self.message = "database conflict error" else: self.message = message if object is None: self.oid = None self.class_name = None else: self.oid = object._p_oid klass = object.__class__ self.class_name = klass.__module__ + "." 
+ klass.__name__ if oid is not None: assert self.oid is None self.oid = oid if data is not None: # avoid circular import chain from ZODB.utils import get_pickle_metadata self.class_name = "%s.%s" % get_pickle_metadata(data) ## else: ## if message != "data read conflict error": ## raise RuntimeError self.serials = serials def __str__(self): extras = [] if self.oid: extras.append("oid %s" % oid_repr(self.oid)) if self.class_name: extras.append("class %s" % self.class_name) if self.serials: current, old = self.serials extras.append("serial this txn started with %s" % readable_tid_repr(old)) extras.append("serial currently committed %s" % readable_tid_repr(current)) if extras: return "%s (%s)" % (self.message, ", ".join(extras)) else: return self.message def get_oid(self): return self.oid def get_class_name(self): return self.class_name def get_old_serial(self): return self.serials[1] def get_new_serial(self): return self.serials[0] def get_serials(self): return self.serials class ReadConflictError(ConflictError): """Conflict detected when object was loaded. An attempt was made to read an object that has changed in another transaction (eg. another thread or process). """ def __init__(self, message=None, object=None, serials=None, **kw): if message is None: message = "database read conflict error" ConflictError.__init__(self, message=message, object=object, serials=serials, **kw) class BTreesConflictError(ConflictError): """A special subclass for BTrees conflict errors.""" msgs = [# 0; i2 or i3 bucket split; positions are all -1 'Conflicting bucket split', # 1; keys the same, but i2 and i3 values differ, and both values # differ from i1's value 'Conflicting changes', # 2; i1's value changed in i2, but key+value deleted in i3 'Conflicting delete and change', # 3; i1's value changed in i3, but key+value deleted in i2 'Conflicting delete and change', # 4; i1 and i2 both added the same key, or both deleted the # same key 'Conflicting inserts or deletes', # 5; i2 and i3 both deleted the same key 'Conflicting deletes', # 6; i2 and i3 both added the same key 'Conflicting inserts', # 7; i2 and i3 both deleted the same key, or i2 changed the value # associated with a key and i3 deleted that key 'Conflicting deletes, or delete and change', # 8; i2 and i3 both deleted the same key, or i3 changed the value # associated with a key and i2 deleted that key 'Conflicting deletes, or delete and change', # 9; i2 and i3 both deleted the same key 'Conflicting deletes', # 10; i2 and i3 deleted all the keys, and didn't insert any, # leaving an empty bucket; conflict resolution doesn't have # enough info to unlink an empty bucket from its containing # BTree correctly 'Empty bucket from deleting all keys', # 11; conflicting changes in an internal BTree node 'Conflicting changes in an internal BTree node', # 12; i2 or i3 was empty 'Empty bucket in a transaction', # 13; delete of first key, which causes change to parent node 'Delete of first key', ] def __init__(self, p1, p2, p3, reason): self.p1 = p1 self.p2 = p2 self.p3 = p3 self.reason = reason def __repr__(self): return "BTreesConflictError(%d, %d, %d, %d)" % (self.p1, self.p2, self.p3, self.reason) def __str__(self): return "BTrees conflict error at %d/%d/%d: %s" % ( self.p1, self.p2, self.p3, self.msgs[self.reason]) class DanglingReferenceError(POSError, transaction.interfaces.TransactionError): """An object has a persistent reference to a missing object. 
If an object is stored and it has a reference to another object that does not exist (for example, it was deleted by pack), this exception may be raised. Whether a storage supports this feature, it a quality of implementation issue. Instance attributes: referer: oid of the object being written missing: referenced oid that does not have a corresponding object """ def __init__(self, Aoid, Boid): self.referer = Aoid self.missing = Boid def __str__(self): return "from %s to %s" % (oid_repr(self.referer), oid_repr(self.missing)) ############################################################################ # Only used in storages; versions are no longer supported. class VersionError(POSError): """An error in handling versions occurred.""" class VersionCommitError(VersionError): """An invalid combination of versions was used in a version commit.""" class VersionLockError(VersionError, transaction.interfaces.TransactionError): """Modification to an object modified in an unsaved version. An attempt was made to modify an object that has been modified in an unsaved version. """ ############################################################################ class UndoError(POSError): """An attempt was made to undo a non-undoable transaction.""" def __init__(self, reason, oid=None): self._reason = reason self._oid = oid def __str__(self): return _fmt_undo(self._oid, self._reason) class MultipleUndoErrors(UndoError): """Several undo errors occurred during a single transaction.""" def __init__(self, errs): # provide a reason and oid for clients that only look at that UndoError.__init__(self, *errs[0]) self._errs = errs def __str__(self): return "\n".join([_fmt_undo(*pair) for pair in self._errs]) class StorageError(POSError): """Base class for storage based exceptions.""" class StorageTransactionError(StorageError): """An operation was invoked for an invalid transaction or state.""" class StorageSystemError(StorageError): """Panic! Internal storage error!""" class MountedStorageError(StorageError): """Unable to access mounted storage.""" class ReadOnlyError(StorageError): """Unable to modify objects in a read-only storage.""" class TransactionTooLargeError(StorageTransactionError): """The transaction exhausted some finite storage resource.""" class ExportError(POSError): """An export file doesn't have the right format.""" class Unsupported(POSError): """A feature was used that is not supported by the storage.""" class ReadOnlyHistoryError(POSError): """Unable to add or modify objects in an historical connection.""" class InvalidObjectReference(POSError): """An object contains an invalid reference to another object. An invalid reference may be one of: o A reference to a wrapped persistent object. o A reference to an object in a different database connection. TODO: The exception ought to have a member that is the invalid object. """ class ConnectionStateError(POSError): """A Connection isn't in the required state for an operation. o An operation such as a load is attempted on a closed connection. o An attempt to close a connection is made while the connection is still joined to a transaction (for example, a transaction is in progress, with uncommitted modifications in the connection). """ ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/UndoLogCompatible.py000066400000000000000000000030041230730566700243530ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. 
# # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Provide backward compatibility with storages that only have undoLog().""" class UndoLogCompatible: def undoInfo(self, first=0, last=-20, specification=None): if specification: # filter(desc) returns true iff `desc` is a "superdict" # of `specification`, meaning that `desc` contains the same # (key, value) pairs as `specification`, and possibly additional # (key, value) pairs. Another way to do this might be # d = desc.copy() # d.update(specification) # return d == desc def filter(desc, spec=specification.items()): get = desc.get for k, v in spec: if get(k, None) != v: return 0 return 1 else: filter = None return self.undoLog(first, last, filter) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/__init__.py000066400000000000000000000020271230730566700225470ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import sys from persistent import TimeStamp from persistent import list from persistent import mapping # Backward compat for old imports. sys.modules['ZODB.TimeStamp'] = sys.modules['persistent.TimeStamp'] sys.modules['ZODB.PersistentMapping'] = sys.modules['persistent.mapping'] sys.modules['ZODB.PersistentList'] = sys.modules['persistent.list'] del mapping, list, sys from DB import DB, connection ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/blob.py000066400000000000000000001023651230730566700217340ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2005-2006 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Blobs """ import cPickle import cStringIO import base64 import binascii import logging import os import re import shutil import stat import sys import tempfile import weakref import zope.interface import ZODB.interfaces from ZODB.interfaces import BlobError from ZODB import utils from ZODB.POSException import POSKeyError import persistent logger = logging.getLogger('ZODB.blob') BLOB_SUFFIX = ".blob" SAVEPOINT_SUFFIX = ".spb" LAYOUT_MARKER = '.layout' LAYOUTS = {} valid_modes = 'r', 'w', 'r+', 'a', 'c' # Threading issues: # We want to support closing blob files when they are destroyed. # This introduces a threading issue, since a blob file may be destroyed # via GC in any thread. class Blob(persistent.Persistent): """A BLOB supports efficient handling of large data within ZODB.""" zope.interface.implements(ZODB.interfaces.IBlob) _p_blob_uncommitted = None # Filename of the uncommitted (dirty) data _p_blob_committed = None # Filename of the committed data readers = writers = None def __init__(self, data=None): # Raise exception if Blobs are getting subclassed # refer to ZODB-Bug No.127182 by Jim Fulton on 2007-07-20 if (self.__class__ is not Blob): raise TypeError('Blobs do not support subclassing.') self.__setstate__() if data is not None: self.open('w').write(data) def __setstate__(self, state=None): # we use lists here because it will allow us to add and remove # atomically self.readers = [] self.writers = [] def __getstate__(self): return None def _p_deactivate(self): # Only ghostify if we are unopened. if self.readers or self.writers: return super(Blob, self)._p_deactivate() def _p_invalidate(self): # Force-close any open readers or writers, # XXX should we warn of this? Maybe? if self._p_changed is None: return for ref in (self.readers or [])+(self.writers or []): f = ref() if f is not None: f.close() if (self._p_blob_uncommitted): os.remove(self._p_blob_uncommitted) super(Blob, self)._p_invalidate() def opened(self): return bool(self.readers or self.writers) def closed(self, f): # We use try/except below because another thread might remove # the ref after we check it if the file is GCed. 
for file_refs in (self.readers, self.writers): for ref in file_refs: if ref() is f: try: file_refs.remove(ref) except ValueError: pass return def open(self, mode="r"): if mode not in valid_modes: raise ValueError("invalid mode", mode) if mode == 'c': if (self._p_blob_uncommitted or not self._p_blob_committed or self._p_blob_committed.endswith(SAVEPOINT_SUFFIX) ): raise BlobError('Uncommitted changes') return self._p_jar._storage.openCommittedBlobFile( self._p_oid, self._p_serial) if self.writers: raise BlobError("Already opened for writing.") if self.readers is None: self.readers = [] if mode == 'r': result = None to_open = self._p_blob_uncommitted if not to_open: to_open = self._p_blob_committed if to_open: result = self._p_jar._storage.openCommittedBlobFile( self._p_oid, self._p_serial, self) else: self._create_uncommitted_file() to_open = self._p_blob_uncommitted assert to_open if result is None: result = BlobFile(to_open, mode, self) def destroyed(ref, readers=self.readers): try: readers.remove(ref) except ValueError: pass self.readers.append(weakref.ref(result, destroyed)) else: if self.readers: raise BlobError("Already opened for reading.") if mode == 'w': if self._p_blob_uncommitted is None: self._create_uncommitted_file() result = BlobFile(self._p_blob_uncommitted, mode, self) else: # 'r+' and 'a' if self._p_blob_uncommitted is None: # Create a new working copy self._create_uncommitted_file() result = BlobFile(self._p_blob_uncommitted, mode, self) if self._p_blob_committed: utils.cp(open(self._p_blob_committed), result) if mode == 'r+': result.seek(0) else: # Re-use existing working copy result = BlobFile(self._p_blob_uncommitted, mode, self) def destroyed(ref, writers=self.writers): try: writers.remove(ref) except ValueError: pass self.writers.append(weakref.ref(result, destroyed)) self._p_changed = True return result def committed(self): if (self._p_blob_uncommitted or not self._p_blob_committed or self._p_blob_committed.endswith(SAVEPOINT_SUFFIX) ): raise BlobError('Uncommitted changes') result = self._p_blob_committed # We do this to make sure we have the file and to let the # storage know we're accessing the file. n = self._p_jar._storage.loadBlob(self._p_oid, self._p_serial) assert result == n, (result, n) return result def consumeFile(self, filename): """Will replace the current data of the blob with the file given under filename. """ if self.writers: raise BlobError("Already opened for writing.") if self.readers: raise BlobError("Already opened for reading.") previous_uncommitted = bool(self._p_blob_uncommitted) if previous_uncommitted: # If we have uncommitted data, we move it aside for now # in case the consumption doesn't work. target = self._p_blob_uncommitted target_aside = target+".aside" os.rename(target, target_aside) else: target = self._create_uncommitted_file() # We need to unlink the freshly created target again # to allow link() to do its job os.remove(target) try: rename_or_copy_blob(filename, target, chmod=False) except: # Recover from the failed consumption: First remove the file, it # might exist and mark the pointer to the uncommitted file. self._p_blob_uncommitted = None if os.path.exists(target): os.remove(target) # If there was a file moved aside, bring it back including the # pointer to the uncommitted file. if previous_uncommitted: os.rename(target_aside, target) self._p_blob_uncommitted = target # Re-raise the exception to make the application aware of it. 
raise else: if previous_uncommitted: # The relinking worked so we can remove the data that we had # set aside. os.remove(target_aside) # We changed the blob state and have to make sure we join the # transaction. self._p_changed = True # utility methods def _create_uncommitted_file(self): assert self._p_blob_uncommitted is None, ( "Uncommitted file already exists.") if self._p_jar: tempdir = self._p_jar.db()._storage.temporaryDirectory() else: tempdir = tempfile.gettempdir() filename = utils.mktemp(dir=tempdir) self._p_blob_uncommitted = filename def cleanup(ref): if os.path.exists(filename): os.remove(filename) self._p_blob_ref = weakref.ref(self, cleanup) return filename def _uncommitted(self): # hand uncommitted data to connection, relinquishing responsibility # for it. filename = self._p_blob_uncommitted if filename is None and self._p_blob_committed is None: filename = self._create_uncommitted_file() self._p_blob_uncommitted = self._p_blob_ref = None return filename class BlobFile(file): """A BlobFile that holds a file handle to actual blob data. It is a file that can be used within a transaction boundary; a BlobFile is just a Python file object, we only override methods which cause a change to blob data in order to call methods on our 'parent' persistent blob object signifying that the change happened. """ # XXX these files should be created in the same partition as # the storage later puts them to avoid copying them ... def __init__(self, name, mode, blob): super(BlobFile, self).__init__(name, mode+'b') self.blob = blob def close(self): self.blob.closed(self) file.close(self) _pid = str(os.getpid()) def log(msg, level=logging.INFO, subsys=_pid, exc_info=False): message = "(%s) %s" % (subsys, msg) logger.log(level, message, exc_info=exc_info) class FilesystemHelper: # Storages that implement IBlobStorage can choose to use this # helper class to generate and parse blob filenames. This is not # a set-in-stone interface for all filesystem operations dealing # with blobs and storages needn't indirect through this if they # want to perform blob storage differently. def __init__(self, base_dir, layout_name='automatic'): self.base_dir = os.path.abspath(base_dir) + os.path.sep self.temp_dir = os.path.join(base_dir, 'tmp') if layout_name == 'automatic': layout_name = auto_layout_select(base_dir) if layout_name == 'lawn': log('The `lawn` blob directory layout is deprecated due to ' 'scalability issues on some file systems, please consider ' 'migrating to the `bushy` layout.', level=logging.WARN) self.layout_name = layout_name self.layout = LAYOUTS[layout_name] def create(self): if not os.path.exists(self.base_dir): os.makedirs(self.base_dir, 0700) log("Blob directory '%s' does not exist. " "Created new directory." % self.base_dir) if not os.path.exists(self.temp_dir): os.makedirs(self.temp_dir, 0700) log("Blob temporary directory '%s' does not exist. " "Created new directory." 
% self.temp_dir) if not os.path.exists(os.path.join(self.base_dir, LAYOUT_MARKER)): layout_marker = open( os.path.join(self.base_dir, LAYOUT_MARKER), 'wb') layout_marker.write(self.layout_name) else: layout = open(os.path.join(self.base_dir, LAYOUT_MARKER), 'rb' ).read().strip() if layout != self.layout_name: raise ValueError( "Directory layout `%s` selected for blob directory %s, but " "marker found for layout `%s`" % (self.layout_name, self.base_dir, layout)) def isSecure(self, path): """Ensure that (POSIX) path mode bits are 0700.""" return (os.stat(path).st_mode & 077) == 0 def checkSecure(self): if not self.isSecure(self.base_dir): log('Blob dir %s has insecure mode setting' % self.base_dir, level=logging.WARNING) def getPathForOID(self, oid, create=False): """Given an OID, return the path on the filesystem where the blob data relating to that OID is stored. If the create flag is given, the path is also created if it didn't exist already. """ # OIDs are numbers and sometimes passed around as integers. For our # computations we rely on the 64-bit packed string representation. if isinstance(oid, int): oid = utils.p64(oid) path = self.layout.oid_to_path(oid) path = os.path.join(self.base_dir, path) if create and not os.path.exists(path): try: os.makedirs(path, 0700) except OSError: # We might have lost a race. If so, the directory # must exist now assert os.path.exists(path) return path def getOIDForPath(self, path): """Given a path, return an OID, if the path is a valid path for an OID. The inverse function to `getPathForOID`. Raises ValueError if the path is not valid for an OID. """ path = path[len(self.base_dir):] return self.layout.path_to_oid(path) def createPathForOID(self, oid): """Given an OID, creates a directory on the filesystem where the blob data relating to that OID is stored, if it doesn't exist. """ return self.getPathForOID(oid, create=True) def getBlobFilename(self, oid, tid): """Given an oid and a tid, return the full filename of the 'committed' blob file related to that oid and tid. """ # TIDs are numbers and sometimes passed around as integers. For our # computations we rely on the 64-bit packed string representation if isinstance(oid, int): oid = utils.p64(oid) if isinstance(tid, int): tid = utils.p64(tid) return os.path.join(self.base_dir, self.layout.getBlobFilePath(oid, tid), ) def blob_mkstemp(self, oid, tid): """Given an oid and a tid, return a temporary file descriptor and a related filename. The file is guaranteed to exist on the same partition as committed data, which is important for being able to rename the file without a copy operation. The directory in which the file will be placed, which is the return value of self.getPathForOID(oid), must exist before this method may be called successfully. """ oidpath = self.getPathForOID(oid) fd, name = tempfile.mkstemp(suffix='.tmp', prefix=utils.tid_repr(tid), dir=oidpath) return fd, name def splitBlobFilename(self, filename): """Returns the oid and tid for a given blob filename. If the filename cannot be recognized as a blob filename, (None, None) is returned. """ if not filename.endswith(BLOB_SUFFIX): return None, None path, filename = os.path.split(filename) oid = self.getOIDForPath(path) serial = filename[:-len(BLOB_SUFFIX)] serial = utils.repr_to_oid(serial) return oid, serial def getOIDsForSerial(self, search_serial): """Return all oids related to a particular tid that exist in blob data. 
""" oids = [] for oid, oidpath in self.listOIDs(): for filename in os.listdir(oidpath): blob_path = os.path.join(oidpath, filename) oid, serial = self.splitBlobFilename(blob_path) if search_serial == serial: oids.append(oid) return oids def listOIDs(self): """Iterates over all paths under the base directory that contain blob files. """ for path, dirs, files in os.walk(self.base_dir): # Make sure we traverse in a stable order. This is mainly to make # testing predictable. dirs.sort() files.sort() try: oid = self.getOIDForPath(path) except ValueError: continue yield oid, path class NoBlobsFileSystemHelper: @property def temp_dir(self): raise TypeError("Blobs are not supported") getPathForOID = getBlobFilename = temp_dir class BlobStorageError(Exception): """The blob storage encountered an invalid state.""" def auto_layout_select(path): # A heuristic to look at a path and determine which directory layout to # use. layout_marker = os.path.join(path, LAYOUT_MARKER) if os.path.exists(layout_marker): layout = open(layout_marker, 'rb').read() layout = layout.strip() log('Blob directory `%s` has layout marker set. ' 'Selected `%s` layout. ' % (path, layout), level=logging.DEBUG) elif not os.path.exists(path): log('Blob directory %s does not exist. ' 'Selected `bushy` layout. ' % path) layout = 'bushy' else: # look for a non-hidden file in the directory has_files = False for name in os.listdir(path): if not name.startswith('.'): has_files = True break if not has_files: log('Blob directory `%s` is unused and has no layout marker set. ' 'Selected `bushy` layout. ' % path) layout = 'bushy' else: log('Blob directory `%s` is used but has no layout marker set. ' 'Selected `lawn` layout. ' % path) layout = 'lawn' return layout class BushyLayout(object): """A bushy directory layout for blob directories. Creates an 8-level directory structure (one level per byte) in big-endian order from the OID of an object. """ blob_path_pattern = re.compile( r'(0x[0-9a-f]{1,2}\%s){7,7}0x[0-9a-f]{1,2}$' % os.path.sep) def oid_to_path(self, oid): directories = [] # Create the bushy directory structure with the least significant byte # first for byte in str(oid): directories.append('0x%s' % binascii.hexlify(byte)) return os.path.sep.join(directories) def path_to_oid(self, path): if self.blob_path_pattern.match(path) is None: raise ValueError("Not a valid OID path: `%s`" % path) path = path.split(os.path.sep) # Each path segment stores a byte in hex representation. Turn it into # an int and then get the character for our byte string. oid = ''.join(binascii.unhexlify(byte[2:]) for byte in path) return oid def getBlobFilePath(self, oid, tid): """Given an oid and a tid, return the full filename of the 'committed' blob file related to that oid and tid. """ oid_path = self.oid_to_path(oid) filename = "%s%s" % (utils.tid_repr(tid), BLOB_SUFFIX) return os.path.join(oid_path, filename) LAYOUTS['bushy'] = BushyLayout() class LawnLayout(BushyLayout): """A shallow directory layout for blob directories. Creates a single level of directories (one for each oid). """ def oid_to_path(self, oid): return utils.oid_repr(oid) def path_to_oid(self, path): try: if path == '': # This is a special case where repr_to_oid converts '' to the # OID z64. 
raise TypeError() return utils.repr_to_oid(path) except TypeError: raise ValueError('Not a valid OID path: `%s`' % path) LAYOUTS['lawn'] = LawnLayout() class BlobStorageMixin(object): """A mix-in to help storages support blobs.""" def _blob_init(self, blob_dir, layout='automatic'): # XXX Log warning if storage is ClientStorage self.fshelper = FilesystemHelper(blob_dir, layout) self.fshelper.create() self.fshelper.checkSecure() self.dirty_oids = [] def _blob_init_no_blobs(self): self.fshelper = NoBlobsFileSystemHelper() self.dirty_oids = [] def _blob_tpc_abort(self): """Blob cleanup to be called from subclass tpc_abort """ while self.dirty_oids: oid, serial = self.dirty_oids.pop() clean = self.fshelper.getBlobFilename(oid, serial) if os.path.exists(clean): remove_committed(clean) def _blob_tpc_finish(self): """Blob cleanup to be called from subclass tpc_finish """ self.dirty_oids = [] def registerDB(self, db): self.__untransform_record_data = db.untransform_record_data try: m = super(BlobStorageMixin, self).registerDB except AttributeError: pass else: m(db) def __untransform_record_data(self, record): return record def is_blob_record(self, record): if record: return is_blob_record(self.__untransform_record_data(record)) def copyTransactionsFrom(self, other): copyTransactionsFromTo(other, self) def loadBlob(self, oid, serial): """Return the filename where the blob file can be found. """ filename = self.fshelper.getBlobFilename(oid, serial) if not os.path.exists(filename): raise POSKeyError("No blob file", oid, serial) return filename def openCommittedBlobFile(self, oid, serial, blob=None): blob_filename = self.loadBlob(oid, serial) if blob is None: return open(blob_filename, 'rb') else: return BlobFile(blob_filename, 'r', blob) def restoreBlob(self, oid, serial, data, blobfilename, prev_txn, transaction): """Write blob data already committed in a separate database """ self.restore(oid, serial, data, '', prev_txn, transaction) self._blob_storeblob(oid, serial, blobfilename) return self._tid def _blob_storeblob(self, oid, serial, blobfilename): self._lock_acquire() try: self.fshelper.getPathForOID(oid, create=True) targetname = self.fshelper.getBlobFilename(oid, serial) rename_or_copy_blob(blobfilename, targetname) # if oid already in there, something is really hosed. # The underlying storage should have complained anyway self.dirty_oids.append((oid, serial)) finally: self._lock_release() def storeBlob(self, oid, oldserial, data, blobfilename, version, transaction): """Stores data that has a BLOB attached.""" assert not version, "Versions aren't supported." serial = self.store(oid, oldserial, data, '', transaction) self._blob_storeblob(oid, serial, blobfilename) return self._tid def temporaryDirectory(self): return self.fshelper.temp_dir class BlobStorage(BlobStorageMixin): """A wrapper/proxy storage to support blobs. 
""" zope.interface.implements(ZODB.interfaces.IBlobStorage) def __init__(self, base_directory, storage, layout='automatic'): assert not ZODB.interfaces.IBlobStorage.providedBy(storage) self.__storage = storage self._blob_init(base_directory, layout) try: supportsUndo = storage.supportsUndo except AttributeError: supportsUndo = False else: supportsUndo = supportsUndo() self.__supportsUndo = supportsUndo self._blobs_pack_is_in_progress = False if ZODB.interfaces.IStorageRestoreable.providedBy(storage): iblob = ZODB.interfaces.IBlobStorageRestoreable else: iblob = ZODB.interfaces.IBlobStorage zope.interface.directlyProvides( self, iblob, zope.interface.providedBy(storage)) def __getattr__(self, name): return getattr(self.__storage, name) def __len__(self): return len(self.__storage) def __repr__(self): normal_storage = self.__storage return '' % (normal_storage, hex(id(self))) def tpc_finish(self, *arg, **kw): # We need to override the base storage's tpc_finish instead of # providing a _finish method because methods found on the proxied # object aren't rebound to the proxy self.__storage.tpc_finish(*arg, **kw) self._blob_tpc_finish() def tpc_abort(self, *arg, **kw): # We need to override the base storage's abort instead of # providing an _abort method because methods found on the proxied object # aren't rebound to the proxy self.__storage.tpc_abort(*arg, **kw) self._blob_tpc_abort() def _packUndoing(self, packtime, referencesf): # Walk over all existing revisions of all blob files and check # if they are still needed by attempting to load the revision # of that object from the database. This is maybe the slowest # possible way to do this, but it's safe. for oid, oid_path in self.fshelper.listOIDs(): files = os.listdir(oid_path) for filename in files: filepath = os.path.join(oid_path, filename) whatever, serial = self.fshelper.splitBlobFilename(filepath) try: self.loadSerial(oid, serial) except POSKeyError: remove_committed(filepath) if not os.listdir(oid_path): shutil.rmtree(oid_path) def _packNonUndoing(self, packtime, referencesf): for oid, oid_path in self.fshelper.listOIDs(): exists = True try: self.load(oid, None) # no version support except (POSKeyError, KeyError): exists = False if exists: files = os.listdir(oid_path) files.sort() latest = files[-1] # depends on ever-increasing tids files.remove(latest) for file in files: remove_committed(os.path.join(oid_path, file)) else: remove_committed_dir(oid_path) continue if not os.listdir(oid_path): shutil.rmtree(oid_path) def pack(self, packtime, referencesf): """Remove all unused OID/TID combinations.""" self._lock_acquire() try: if self._blobs_pack_is_in_progress: raise BlobStorageError('Already packing') self._blobs_pack_is_in_progress = True finally: self._lock_release() try: # Pack the underlying storage, which will allow us to determine # which serials are current. unproxied = self.__storage result = unproxied.pack(packtime, referencesf) # Perform a pack on the blob data. if self.__supportsUndo: self._packUndoing(packtime, referencesf) else: self._packNonUndoing(packtime, referencesf) finally: self._lock_acquire() self._blobs_pack_is_in_progress = False self._lock_release() return result def undo(self, serial_id, transaction): undo_serial, keys = self.__storage.undo(serial_id, transaction) # serial_id is the transaction id of the txn that we wish to undo. # "undo_serial" is the transaction id of txn in which the undo is # performed. "keys" is the list of oids that are involved in the # undo transaction. 
# The serial_id is assumed to be given to us base-64 encoded # (belying the web UI legacy of the ZODB code :-() serial_id = base64.decodestring(serial_id+'\n') self._lock_acquire() try: # we get all the blob oids on the filesystem related to the # transaction we want to undo. for oid in self.fshelper.getOIDsForSerial(serial_id): # we want to find the serial id of the previous revision # of this blob object. load_result = self.loadBefore(oid, serial_id) if load_result is None: # There was no previous revision of this blob # object. The blob was created in the transaction # represented by serial_id. We copy the blob data # to a new file that references the undo # transaction in case a user wishes to undo this # undo. It would be nice if we had some way to # link to old blobs. orig_fn = self.fshelper.getBlobFilename(oid, serial_id) new_fn = self.fshelper.getBlobFilename(oid, undo_serial) else: # A previous revision of this blob existed before the # transaction implied by "serial_id". We copy the blob # data to a new file that references the undo transaction # in case a user wishes to undo this undo. data, serial_before, serial_after = load_result orig_fn = self.fshelper.getBlobFilename(oid, serial_before) new_fn = self.fshelper.getBlobFilename(oid, undo_serial) orig = open(orig_fn, "r") new = open(new_fn, "wb") utils.cp(orig, new) orig.close() new.close() self.dirty_oids.append((oid, undo_serial)) finally: self._lock_release() return undo_serial, keys def new_instance(self): """Implementation of IMVCCStorage.new_instance. This method causes all storage instances to be wrapped with a blob storage wrapper. """ base_dir = self.fshelper.base_dir s = self.__storage.new_instance() res = BlobStorage(base_dir, s) return res copied = logging.getLogger('ZODB.blob.copied').debug def rename_or_copy_blob(f1, f2, chmod=True): """Try to rename f1 to f2, fallback to copy. Under certain conditions a rename might not work, e.g. because the target directory is on a different partition. In this case we try to copy the data and remove the old file afterwards. """ try: os.rename(f1, f2) except OSError: copied("Copied blob file %r to %r.", f1, f2) file1 = open(f1, 'rb') file2 = open(f2, 'wb') try: utils.cp(file1, file2) finally: file1.close() file2.close() remove_committed(f1) if chmod: os.chmod(f2, stat.S_IREAD) if sys.platform == 'win32': # On Windows, you can't remove read-only files, so make the # file writable first. def remove_committed(filename): os.chmod(filename, stat.S_IWRITE) os.remove(filename) def remove_committed_dir(path): for (dirpath, dirnames, filenames) in os.walk(path): for filename in filenames: filename = os.path.join(dirpath, filename) remove_committed(filename) shutil.rmtree(path) link_or_copy = shutil.copy else: remove_committed = os.remove remove_committed_dir = shutil.rmtree link_or_copy = os.link def find_global_Blob(module, class_): if module == 'ZODB.blob' and class_ == 'Blob': return Blob def is_blob_record(record): """Check whether a database record is a blob record. This is primarily intended to be used when copying data from one storage to another. 
""" if record and ('ZODB.blob' in record): unpickler = cPickle.Unpickler(cStringIO.StringIO(record)) unpickler.find_global = find_global_Blob try: return unpickler.load() is Blob except (MemoryError, KeyboardInterrupt, SystemExit): raise except Exception: pass return False def copyTransactionsFromTo(source, destination): for trans in source.iterator(): destination.tpc_begin(trans, trans.tid, trans.status) for record in trans: blobfilename = None if is_blob_record(record.data): try: blobfilename = source.loadBlob(record.oid, record.tid) except POSKeyError: pass if blobfilename is not None: fd, name = tempfile.mkstemp( suffix='.tmp', dir=destination.fshelper.temp_dir) os.close(fd) utils.cp(open(blobfilename, 'rb'), open(name, 'wb')) destination.restoreBlob(record.oid, record.tid, record.data, name, record.data_txn, trans) else: destination.restore(record.oid, record.tid, record.data, '', record.data_txn, trans) destination.tpc_vote(trans) destination.tpc_finish(trans) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/broken.py000066400000000000000000000232221230730566700222700ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Broken object support $Id$ """ import sys import persistent import zope.interface import ZODB.interfaces broken_cache = {} class Broken(object): """Broken object base class Broken objects are placeholders for objects that can no longer be created because their class has gone away. Broken objects don't really do much of anything, except hold their state. The Broken class is used as a base class for creating classes in leu of missing classes:: >>> Atall = type('Atall', (Broken, ), {'__module__': 'not.there'}) The only thing the class can be used for is to create new objects:: >>> Atall() >>> Atall().__Broken_newargs__ () >>> Atall().__Broken_initargs__ () >>> Atall(1, 2).__Broken_newargs__ (1, 2) >>> Atall(1, 2).__Broken_initargs__ (1, 2) >>> a = Atall.__new__(Atall, 1, 2) >>> a >>> a.__Broken_newargs__ (1, 2) >>> a.__Broken_initargs__ You can't modify broken objects:: >>> a.x = 1 Traceback (most recent call last): ... 
BrokenModified: Can't change broken objects But you can set their state:: >>> a.__setstate__({'x': 1, }) You can pickle broken objects:: >>> r = a.__reduce__() >>> len(r) 3 >>> r[0] is rebuild True >>> r[1] ('not.there', 'Atall', 1, 2) >>> r[2] {'x': 1} >>> import cPickle >>> a2 = cPickle.loads(cPickle.dumps(a, 1)) >>> a2 >>> a2.__Broken_newargs__ (1, 2) >>> a2.__Broken_initargs__ >>> a2.__Broken_state__ {'x': 1} Cleanup:: >>> broken_cache.clear() """ zope.interface.implements(ZODB.interfaces.IBroken) __Broken_state__ = __Broken_initargs__ = None __name__ = 'broken object' def __new__(class_, *args): result = object.__new__(class_) result.__dict__['__Broken_newargs__'] = args return result def __init__(self, *args): self.__dict__['__Broken_initargs__'] = args def __reduce__(self): """We pickle broken objects in hope of being able to fix them later """ return (rebuild, ((self.__class__.__module__, self.__class__.__name__) + self.__Broken_newargs__), self.__Broken_state__, ) def __setstate__(self, state): self.__dict__['__Broken_state__'] = state def __repr__(self): return "" % ( self.__class__.__module__, self.__class__.__name__) def __setattr__(self, name, value): raise BrokenModified("Can't change broken objects") def find_global(modulename, globalname, # These are *not* optimizations. Callers can override these. Broken=Broken, type=type, ): """Find a global object, returning a broken class if it can't be found. This function looks up global variable in modules:: >>> import sys >>> find_global('sys', 'path') is sys.path True If an object can't be found, a broken class is returned:: >>> broken = find_global('ZODB.not.there', 'atall') >>> issubclass(broken, Broken) True >>> broken.__module__ 'ZODB.not.there' >>> broken.__name__ 'atall' Broken classes are cached:: >>> find_global('ZODB.not.there', 'atall') is broken True If we "repair" a missing global:: >>> class ZODBnotthere: ... atall = [] >>> sys.modules['ZODB.not'] = ZODBnotthere >>> sys.modules['ZODB.not.there'] = ZODBnotthere we can then get the repaired value:: >>> find_global('ZODB.not.there', 'atall') is ZODBnotthere.atall True Of course, if we beak it again:: >>> del sys.modules['ZODB.not'] >>> del sys.modules['ZODB.not.there'] we get the broken value:: >>> find_global('ZODB.not.there', 'atall') is broken True Cleanup:: >>> broken_cache.clear() """ # short circuit common case: try: return getattr(sys.modules[modulename], globalname) except (AttributeError, KeyError): pass try: __import__(modulename) except ImportError: pass else: module = sys.modules[modulename] try: return getattr(module, globalname) except AttributeError: pass try: return broken_cache[(modulename, globalname)] except KeyError: pass class_ = type(globalname, (Broken, ), {'__module__': modulename}) broken_cache[(modulename, globalname)] = class_ return class_ def rebuild(modulename, globalname, *args): """Recreate a broken object, possibly recreating the missing class This functions unpickles broken objects:: >>> broken = rebuild('ZODB.notthere', 'atall', 1, 2) >>> broken >>> broken.__Broken_newargs__ (1, 2) If we "repair" the brokenness:: >>> class notthere: # fake notthere module ... class atall(object): ... def __new__(self, *args): ... ob = object.__new__(self) ... ob.args = args ... return ob ... def __repr__(self): ... 
return 'atall %s %s' % self.args >>> sys.modules['ZODB.notthere'] = notthere >>> rebuild('ZODB.notthere', 'atall', 1, 2) atall 1 2 >>> del sys.modules['ZODB.notthere'] Cleanup:: >>> broken_cache.clear() """ class_ = find_global(modulename, globalname) return class_.__new__(class_, *args) class BrokenModified(TypeError): """Attempt to modify a broken object """ class PersistentBroken(Broken, persistent.Persistent): r"""Persistent broken objects Persistent broken objects are used for broken objects that are also persistent. In addition to having to track the original object data, they need to handle persistent meta data. Persistent broken classes are created from existing broken classes using the persistentBroken, function:: >>> Atall = type('Atall', (Broken, ), {'__module__': 'not.there'}) >>> PAtall = persistentBroken(Atall) (Note that we always get the *same* persistent broken class for a given broken class:: >>> persistentBroken(Atall) is PAtall True ) Persistent broken classes work a lot like broken classes:: >>> a = PAtall.__new__(PAtall, 1, 2) >>> a >>> a.__Broken_newargs__ (1, 2) >>> a.__Broken_initargs__ >>> a.x = 1 Traceback (most recent call last): ... BrokenModified: Can't change broken objects Unlike regular broken objects, persistent broken objects keep track of persistence meta data: >>> a._p_oid = '\0\0\0\0****' >>> a and persistent broken objects aren't directly picklable: >>> a.__reduce__() # doctest: +NORMALIZE_WHITESPACE Traceback (most recent call last): ... BrokenModified: but you can get their state: >>> a.__setstate__({'y': 2}) >>> a.__getstate__() {'y': 2} Cleanup:: >>> broken_cache.clear() """ def __new__(class_, *args): result = persistent.Persistent.__new__(class_) result.__dict__['__Broken_newargs__'] = args return result def __reduce__(self, *args): raise BrokenModified(self) def __getstate__(self): return self.__Broken_state__ def __setattr__(self, name, value): if name.startswith('_p_'): persistent.Persistent.__setattr__(self, name, value) else: raise BrokenModified("Can't change broken objects") def __repr__(self): return "" % ( self.__class__.__module__, self.__class__.__name__, self._p_oid) def __getnewargs__(self): return self.__Broken_newargs__ def persistentBroken(class_): try: return class_.__dict__['__Broken_Persistent__'] except KeyError: class_.__Broken_Persistent__ = ( type(class_.__name__, (PersistentBroken, class_), {'__module__': class_.__module__}, ) ) return class_.__dict__['__Broken_Persistent__'] ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/collaborations.txt000066400000000000000000000135001230730566700242100ustar00rootroot00000000000000======================= Collabortation Diagrams ======================= This file contains several collaboration diagrams for the ZODB. Simple fetch, modify, commit ============================ Participants ------------ - ``DB``: ``ZODB.DB.DB`` - ``C``: ``ZODB.Connection.Connection`` - ``S``: ``ZODB.FileStorage.FileStorage`` - ``T``: ``transaction.interfaces.ITransaction`` - ``TM``: ``transaction.interfaces.ITransactionManager`` - ``o1``, ``o2``, ...: pre-existing persistent objects Scenario -------- :: DB.open() create C TM.registerSynch(C) TM.begin() create T C.get(1) # fetches o1 C.get(2) # fetches o2 C.get(3) # fetches o3 o1.modify() # anything that modifies o1 C.register(o1) T.join(C) o2.modify() C.register(o2) # T.join(C) does not happen again o1.modify() # C.register(o1) doesn't happen again, because o1 was already # in the changed state. 
T.commit() C.beforeCompletion(T) C.tpc_begin(T) S.tpc_begin(T) C.commit(T) S.store(1, ..., T) S.store(2, ..., T) # o3 is not stored, because it wasn't modified C.tpc_vote(T) S.tpc_vote(T) C.tpc_finish(T) S.tpc_finish(T, f) # f is a callback function, which arranges # to call DB.invalidate (next) DB.invalidate(tid, {1: 1, 2: 1}, C) C2.invalidate(tid, {1: 1, 2: 1}) # for all connections # C2 to DB, where C2 # is not C TM.free(T) C.afterCompletion(T) C._flush_invalidations() # Processes invalidations that may have come in from other # transactions. Simple fetch, modify, abort =========================== Participants ------------ - ``DB``: ``ZODB.DB.DB`` - ``C``: ``ZODB.Connection.Connection`` - ``S``: ``ZODB.FileStorage.FileStorage`` - ``T``: ``transaction.interfaces.ITransaction`` - ``TM``: ``transaction.interfaces.ITransactionManager`` - ``o1``, ``o2``, ...: pre-existing persistent objects Scenario -------- :: DB.open() create C TM.registerSynch(C) TM.begin() create T C.get(1) # fetches o1 C.get(2) # fetches o2 C.get(3) # fetches o3 o1.modify() # anything that modifies o1 C.register(o1) T.join(C) o2.modify() C.register(o2) # T.join(C) does not happen again o1.modify() # C.register(o1) doesn't happen again, because o1 was already # in the changed state. T.abort() C.beforeCompletion(T) C.abort(T) C._cache.invalidate(1) # toss changes to o1 C._cache.invalidate(2) # toss changes to o2 # o3 wasn't modified, and its cache entry isn't invalidated. TM.free(T) C.afterCompletion(T) C._flush_invalidations() # Processes invalidations that may have come in from other # transactions. Rollback of a savepoint ======================= Participants ------------ - ``T``: ``transaction.interfaces.ITransaction`` - ``o1``, ``o2``, ``o3``: some persistent objects - ``C1``, ``C2``, ``C3``: resource managers - ``S1``, ``S2``: Transaction savepoint objects - ``s11``, ``s21``, ``s22``: resource-manager savepoints Scenario -------- :: create T o1.modify() C1.regisiter(o1) T.join(C1) T.savepoint() C1.savepoint() return s11 return S1 = Savepoint(T, [r11]) o1.modify() C1.regisiter(o1) o2.modify() C2.regisiter(o2) T.join(C2) T.savepoint() C1.savepoint() return s21 C2.savepoint() return s22 return S2 = Savepoint(T, [r21, r22]) o3.modify() C3.regisiter(o3) T.join(C3) S1.rollback() S2.rollback() T.discard() C1.discard() C2.discard() C3.discard() o3.invalidate() S2.discard() s21.discard() # roll back changes since previous, which is r11 C1.discard(s21) o1.invalidate() # truncates temporary storage to s21's position s22.discard() # roll back changes since previous, which is r11 C1.discard(s22) o2.invalidate() # truncates temporary storage to beginning, because # s22 was the first savepoint. (Perhaps conection # savepoints record the log position before the # data were written, which is 0 in this case. 
T.commit() C1.beforeCompletion(T) C2.beforeCompletion(T) C3.beforeCompletion(T) C1.tpc_begin(T) S1.tpc_begin(T) C2.tpc_begin(T) C3.tpc_begin(T) C1.commit(T) S1.store(1, ..., T) C2.commit(T) C3.commit(T) C1.tpc_vote(T) S1.tpc_vote(T) C2.tpc_vote(T) C3.tpc_vote(T) C1.tpc_finish(T) S1.tpc_finish(T, f) # f is a callback function, which arranges c# to call DB.invalidate (next) DB.invalidate(tid, {1: 1}, C) TM.free(T) C1.afterCompletion(T) C1._flush_invalidations() C2.afterCompletion(T) C2._flush_invalidations() C3.afterCompletion(T) C3._flush_invalidations() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/component.xml000066400000000000000000000301641230730566700231650ustar00rootroot00000000000000 Path name to the main storage file. The names for supplemental files, including index and lock files, will be computed from this. If supplied, the file storage will provide blob support and this is the name of a directory to hold blob data. The directory will be created if it doeesn't exist. If no value (or an empty value) is provided, then no blob support will be provided. (You can still use a BlobStorage to provide blob support.) Flag that indicates whether the storage should be truncated if it already exists. If true, only reads may be executed against the storage. Note that the "pack" operation is not considered a write operation and is still allowed on a read-only filestorage. Maximum allowed size of the storage file. Operations which would cause the size of the storage to exceed the quota will result in a ZODB.FileStorage.FileStorageQuotaError being raised. The dotted name (dotted module name and object name) of a packer object. This is used to provide an alternative pack implementation. If false, then no garbage collection will be performed when packing. This can make packing go much faster and can avoid problems when objects are referenced only from other databases. If true, a copy of the database before packing is kept in a ".old" file. Path name to the blob cache directory. Tells whether the cache is a shared writable directory and that the ZEO protocol should not transfer the file but only the filename when committing. Maximum size of the ZEO blob cache, in bytes. If not set, then the cache size isn't checked and the blob directory will grow without bound. This option is ignored if shared_blob_dir is true. ZEO check size as percent of blob_cache_size. The ZEO cache size will be checked when this many bytes have been loaded into the cache. Defaults to 10% of the blob cache size. This option is ignored if shared_blob_dir is true. The name of the storage that the client wants to use. If the ZEO server serves more than one storage, the client selects the storage it wants to use by name. The default name is '1', which is also the default name for the ZEO server. The maximum size of the client cache, in bytes, KB or MB. The storage name. If unspecified, the address of the server will be used as the name. Enables persistent cache files. The string passed here is used to construct the cache filenames. If it is not specified, the client creates a temporary cache that will only be used by the current object. The directory where persistent cache files are stored. By default cache files, if they are persistent, are stored in the current directory. The minimum delay in seconds between attempts to connect to the server, in seconds. Defaults to 5 seconds. The maximum delay in seconds between attempts to connect to the server, in seconds. Defaults to 300 seconds. 
A boolean indicating whether the constructor should wait for the client to connect to the server and verify the cache before returning. The default is true. A flag indicating whether this should be a read-only storage, defaulting to false (i.e. writing is allowed by default). A flag indicating whether a read-only remote storage should be acceptable as a fallback when no writable storages are available. Defaults to false. At most one of read_only and read_only_fallback should be true. The authentication username of the server. The authentication password of the server. The authentication realm of the server. Some authentication schemes use a realm to identify the logic set of usernames that are accepted by this server. A flag indicating whether the client cache should be dropped instead of an expensive verification. A label for the client in server logs
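For example (a sketch; the address, storage name, and cache size are illustrative, and a ZEO server is assumed to be listening at the given address), several of the client options described above map onto keys of a zeoclient section, which can be opened with ``ZODB.config.storageFromString``::

    import ZODB.config

    storage = ZODB.config.storageFromString("""
    <zeoclient>
      server localhost:8100
      storage 1
      cache-size 20MB
      read-only false
      wait false
    </zeoclient>
    """)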

Target size, in number of objects, of each connection's object cache. Target size, in total estimated size for objects, of each connection's object cache. "0" means no limit. The expected maximum number of simultaneously open connections. There is no hard limit (as many connections as are requested will be opened, until system resources are exhausted). Exceeding pool-size connections causes a warning message to be logged, and exceeding twice pool-size connections causes a critical message to be logged. The minimum interval that an unused (non-historical) connection should be kept. The expected maximum total number of historical connections simultaneously open. Target size, in number of objects, of each historical connection's object cache. Target size, in total estimated size of objects, of each historical connection's object cache. The minimum interval that an unused historical connection should be kept. When multidatabases are in use, this is the name given to this database in the collection. The name must be unique across all databases in the collection. The collection must also be given a mapping from its databases' names to their databases, but that cannot be specified in a ZODB config file. Applications using multidatabases typical supply a way to configure the mapping in their own config files, using the "databases" parameter of a DB constructor. If set to false, implicit cross references (the only kind currently possible) are disallowed. Path name to the blob storage directory.
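For example (a sketch; the path and sizes are illustrative), the database and storage options described above combine into a configuration that can be opened with ``ZODB.config.databaseFromString``::

    import ZODB.config

    db = ZODB.config.databaseFromString("""
    <zodb>
      cache-size 5000
      pool-size 7
      <filestorage>
        path Data.fs
      </filestorage>
    </zodb>
    """)
    conn = db.open()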
ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/config.py000066400000000000000000000172741230730566700222670ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Open database and storage from a configuration.""" import os from cStringIO import StringIO import ZConfig import ZODB db_schema_path = os.path.join(ZODB.__path__[0], "config.xml") _db_schema = None s_schema_path = os.path.join(ZODB.__path__[0], "storage.xml") _s_schema = None def getDbSchema(): global _db_schema if _db_schema is None: _db_schema = ZConfig.loadSchema(db_schema_path) return _db_schema def getStorageSchema(): global _s_schema if _s_schema is None: _s_schema = ZConfig.loadSchema(s_schema_path) return _s_schema def databaseFromString(s): return databaseFromFile(StringIO(s)) def databaseFromFile(f): config, handle = ZConfig.loadConfigFile(getDbSchema(), f) return databaseFromConfig(config.database) def databaseFromURL(url): config, handler = ZConfig.loadConfig(getDbSchema(), url) return databaseFromConfig(config.database) def databaseFromConfig(database_factories): databases = {} first = None for factory in database_factories: db = factory.open(databases) if first is None: first = db return first def storageFromString(s): return storageFromFile(StringIO(s)) def storageFromFile(f): config, handle = ZConfig.loadConfigFile(getStorageSchema(), f) return storageFromConfig(config.storage) def storageFromURL(url): config, handler = ZConfig.loadConfig(getStorageSchema(), url) return storageFromConfig(config.storage) def storageFromConfig(section): return section.open() class BaseConfig: """Object representing a configured storage or database. 
Methods: open() -- open and return the configured object Attributes: name -- name of the storage """ def __init__(self, config): self.config = config self.name = config.getSectionName() def open(self, database_name='unnamed', databases=None): """Open and return the storage object.""" raise NotImplementedError class ZODBDatabase(BaseConfig): def open(self, databases=None): section = self.config storage = section.storage.open() options = {} def _option(name, oname=None): v = getattr(section, name) if v is not None: if oname is None: oname = name options[oname] = v _option('pool_timeout') _option('allow_implicit_cross_references', 'xrefs') _option('large_record_size') try: return ZODB.DB( storage, pool_size=section.pool_size, cache_size=section.cache_size, cache_size_bytes=section.cache_size_bytes, historical_pool_size=section.historical_pool_size, historical_cache_size=section.historical_cache_size, historical_cache_size_bytes=section.historical_cache_size_bytes, historical_timeout=section.historical_timeout, database_name=section.database_name or self.name or '', databases=databases, **options) except: storage.close() raise class MappingStorage(BaseConfig): def open(self): from ZODB.MappingStorage import MappingStorage return MappingStorage(self.config.name) class DemoStorage(BaseConfig): def open(self): base = changes = None for factory in self.config.factories: if factory.name == 'changes': changes = factory.open() else: if base is None: base = factory.open() else: raise ValueError("Too many base storages defined!") from ZODB.DemoStorage import DemoStorage return DemoStorage(self.config.name, base=base, changes=changes) class FileStorage(BaseConfig): def open(self): from ZODB.FileStorage import FileStorage config = self.config options = {} if getattr(config, 'packer', None): packer = config.packer if ':' in packer: m, expr = packer.split(':', 1) m = __import__(m, {}, {}, ['*']) options['packer'] = eval(expr, m.__dict__) else: m, name = config.packer.rsplit('.', 1) m = __import__(m, {}, {}, ['*']) options['packer'] = getattr(m, name) for name in ('blob_dir', 'create', 'read_only', 'quota', 'pack_gc', 'pack_keep_old'): v = getattr(config, name, self) if v is not self: options[name] = v return FileStorage(config.path, **options) class BlobStorage(BaseConfig): def open(self): from ZODB.blob import BlobStorage base = self.config.base.open() return BlobStorage(self.config.blob_dir, base) class ZEOClient(BaseConfig): def open(self): from ZEO.ClientStorage import ClientStorage # config.server is a multikey of socket-connection-address values # where the value is a socket family, address tuple. 
L = [server.address for server in self.config.server] options = {} if self.config.blob_cache_size is not None: options['blob_cache_size'] = self.config.blob_cache_size if self.config.blob_cache_size_check is not None: options['blob_cache_size_check'] = self.config.blob_cache_size_check if self.config.client_label is not None: options['client_label'] = self.config.client_label return ClientStorage( L, blob_dir=self.config.blob_dir, shared_blob_dir=self.config.shared_blob_dir, storage=self.config.storage, cache_size=self.config.cache_size, name=self.config.name, client=self.config.client, var=self.config.var, min_disconnect_poll=self.config.min_disconnect_poll, max_disconnect_poll=self.config.max_disconnect_poll, wait=self.config.wait, read_only=self.config.read_only, read_only_fallback=self.config.read_only_fallback, drop_cache_rather_verify=self.config.drop_cache_rather_verify, username=self.config.username, password=self.config.password, realm=self.config.realm, **options) class BDBStorage(BaseConfig): def open(self): from BDBStorage.BerkeleyBase import BerkeleyConfig storageclass = self.get_storageclass() bconf = BerkeleyConfig() for name in dir(BerkeleyConfig): if name.startswith('_'): continue setattr(bconf, name, getattr(self.config, name)) return storageclass(self.config.envdir, config=bconf) class BDBMinimalStorage(BDBStorage): def get_storageclass(self): import BDBStorage.BDBMinimalStorage return BDBStorage.BDBMinimalStorage.BDBMinimalStorage class BDBFullStorage(BDBStorage): def get_storageclass(self): import BDBStorage.BDBFullStorage return BDBStorage.BDBFullStorage.BDBFullStorage ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/config.xml000066400000000000000000000002531230730566700224240ustar00rootroot00000000000000 ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/conversionhack.py000066400000000000000000000020131230730566700240170ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import persistent.mapping class fixer: def __of__(self, parent): def __setstate__(state, self=parent): self._container=state del self.__setstate__ return __setstate__ fixer=fixer() class hack: pass hack=hack() def __basicnew__(): r=persistent.mapping.PersistentMapping() r.__setstate__=fixer return r hack.__basicnew__=__basicnew__ ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/cross-database-references.txt000066400000000000000000000142051230730566700262120ustar00rootroot00000000000000========================= Cross-Database References ========================= Persistent references to objects in different databases within a multi-database are allowed. 
Lets set up a multi-database with 2 databases: >>> import ZODB.tests.util, transaction, persistent >>> databases = {} >>> db1 = ZODB.tests.util.DB(databases=databases, database_name='1') >>> db2 = ZODB.tests.util.DB(databases=databases, database_name='2') And create a persistent object in the first database: >>> tm = transaction.TransactionManager() >>> conn1 = db1.open(transaction_manager=tm) >>> p1 = MyClass() >>> conn1.root()['p'] = p1 >>> tm.commit() First, we get a connection to the second database. We get the second connection using the first connection's `get_connection` method. This is important. When using multiple databases, we need to make sure we use a consistent set of connections so that the objects in the connection caches are connected in a consistent manner. >>> conn2 = conn1.get_connection('2') Now, we'll create a second persistent object in the second database. We'll have a reference to the first object: >>> p2 = MyClass() >>> conn2.root()['p'] = p2 >>> p2.p1 = p1 >>> tm.commit() Now, let's open a separate connection to database 2. We use it to read `p2`, use `p2` to get to `p1`, and verify that it is in database 1: >>> conn = db2.open() >>> p2x = conn.root()['p'] >>> p1x = p2x.p1 >>> p2x is p2, p2x._p_oid == p2._p_oid, p2x._p_jar.db() is db2 (False, True, True) >>> p1x is p1, p1x._p_oid == p1._p_oid, p1x._p_jar.db() is db1 (False, True, True) It isn't valid to create references outside a multi database: >>> db3 = ZODB.tests.util.DB() >>> conn3 = db3.open(transaction_manager=tm) >>> p3 = MyClass() >>> conn3.root()['p'] = p3 >>> tm.commit() >>> p2.p3 = p3 >>> tm.commit() # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS Traceback (most recent call last): ... InvalidObjectReference: ('Attempt to store an object from a foreign database connection', , ) >>> tm.abort() Databases for new objects ------------------------- Objects are normally added to a database by making them reachable from an object already in the database. This is unambiguous when there is only one database. With multiple databases, it is not so clear what happens. Consider: >>> p4 = MyClass() >>> p1.p4 = p4 >>> p2.p4 = p4 In this example, the new object is reachable from both `p1` in database 1 and `p2` in database 2. If we commit, which database should `p4` end up in? This sort of ambiguity could lead to subtle bugs. For that reason, an error is generated if we commit changes when new objects are reachable from multiple databases: >>> tm.commit() # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS Traceback (most recent call last): ... InvalidObjectReference: ("A new object is reachable from multiple databases. Won't try to guess which one was correct!", , ) >>> tm.abort() To resolve this ambiguity, we can commit before an object becomes reachable from multiple databases. >>> p4 = MyClass() >>> p1.p4 = p4 >>> tm.commit() >>> p2.p4 = p4 >>> tm.commit() >>> p4._p_jar.db().database_name '1' This doesn't work with a savepoint: >>> p5 = MyClass() >>> p1.p5 = p5 >>> s = tm.savepoint() >>> p2.p5 = p5 >>> tm.commit() # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS Traceback (most recent call last): ... InvalidObjectReference: ("A new object is reachable from multiple databases. Won't try to guess which one was correct!", , ) >>> tm.abort() (Maybe it should.) 
We can disambiguate this situation by using the connection add method to explicitly say what database an object belongs to: >>> p5 = MyClass() >>> p1.p5 = p5 >>> p2.p5 = p5 >>> conn1.add(p5) >>> tm.commit() >>> p5._p_jar.db().database_name '1' This the most explicit and thus the best way, when practical, to avoid the ambiguity. Dissallowing implicit cross-database references ----------------------------------------------- The database contructor accepts a xrefs keyword argument that defaults to True. If False is passed, the implicit cross database references are disallowed. (Note that currently, implicit cross references are the only kind of cross references allowed.) >>> databases = {} >>> db1 = ZODB.tests.util.DB(databases=databases, database_name='1') >>> db2 = ZODB.tests.util.DB(databases=databases, database_name='2', ... xrefs=False) In this example, we allow cross-references from db1 to db2, but not the other way around. >>> c1 = db1.open() >>> c2 = c1.get_connection('2') >>> c1.root.x = c2.root() >>> transaction.commit() >>> c2.root.x = c1.root() >>> transaction.commit() # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS Traceback (most recent call last): ... InvalidObjectReference: ("Database '2' doesn't allow implicit cross-database references", , {'x': {}}) >>> transaction.abort() NOTE ---- This implementation is incomplete. It allows creating and using cross-database references, however, there are a number of facilities missing: cross-database garbage collection Garbage collection is done on a database by database basis. If an object on a database only has references to it from other databases, then the object will be garbage collected when its database is packed. The cross-database references to it will be broken. cross-database undo Undo is only applied to a single database. Fixing this for multiple databases is going to be extremely difficult. Undo currently poses consistency problems, so it is not (or should not be) widely used. Cross-database aware (tolerant) export/import The export/import facility needs to be aware, at least, of cross-database references. ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/fsIndex.py000066400000000000000000000173221230730566700224140ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Implement an OID to File-position (long integer) mapping.""" # To save space, we do two things: # # 1. We split the keys (OIDS) into 6-byte prefixes and 2-byte suffixes. # We use the prefixes as keys in a mapping from prefix to mappings # of suffix to data: # # data is {prefix -> {suffix -> data}} # # 2. We limit the data size to 48 bits. This should allow databases # as large as 256 terabytes. # # Most of the space is consumed by items in the mappings from 2-byte # suffix to 6-byte data. This should reduce the overall memory usage to # 8-16 bytes per OID. 
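#
# For example (a sketch with illustrative values): the 8-byte OID
# '\x00\x00\x00\x00\x00\x01\x02\x03' is stored using its 6-byte prefix
# '\x00\x00\x00\x00\x00\x01' as the key into the outer mapping and its
# 2-byte suffix '\x02\x03' as the key into the per-prefix bucket, so OIDs
# that share their first six bytes share a single inner mapping.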
# # Because # - the mapping from suffix to data contains at most 65535 entries, # - this is an in-memory data structure # - new keys are inserted sequentially, # we use a BTree bucket instead of a full BTree to store the results. # # We use p64 to convert integers to 8-byte strings and lop off the two # high-order bytes when saving. On loading data, we add the leading # bytes back before using u64 to convert the data back to (long) # integers. from __future__ import with_statement import cPickle import struct from BTrees._fsBTree import fsBucket from BTrees.OOBTree import OOBTree # convert between numbers and six-byte strings def num2str(n): return struct.pack(">Q", n)[2:] def str2num(s): return struct.unpack(">Q", "\000\000" + s)[0] def prefix_plus_one(s): num = str2num(s) return num2str(num + 1) def prefix_minus_one(s): num = str2num(s) return num2str(num - 1) class fsIndex(object): def __init__(self, data=None): self._data = OOBTree() if data: self.update(data) def __getstate__(self): return dict( state_version = 1, _data = [(k, v.toString()) for (k, v) in self._data.iteritems() ] ) def __setstate__(self, state): version = state.pop('state_version', 0) getattr(self, '_setstate_%s' % version)(state) def _setstate_0(self, state): self.__dict__.clear() self.__dict__.update(state) def _setstate_1(self, state): self._data = OOBTree([ (k, fsBucket().fromString(v)) for (k, v) in state['_data'] ]) def __getitem__(self, key): return str2num(self._data[key[:6]][key[6:]]) def save(self, pos, fname): with open(fname, 'wb') as f: pickler = cPickle.Pickler(f, 1) pickler.fast = True pickler.dump(pos) for k, v in self._data.iteritems(): pickler.dump((k, v.toString())) pickler.dump(None) @classmethod def load(class_, fname): with open(fname, 'rb') as f: unpickler = cPickle.Unpickler(f) pos = unpickler.load() if not isinstance(pos, (int, long)): return pos # Old format index = class_() data = index._data while 1: v = unpickler.load() if not v: break k, v = v data[k] = fsBucket().fromString(v) return dict(pos=pos, index=index) def get(self, key, default=None): tree = self._data.get(key[:6], default) if tree is default: return default v = tree.get(key[6:], default) if v is default: return default return str2num(v) def __setitem__(self, key, value): value = num2str(value) treekey = key[:6] tree = self._data.get(treekey) if tree is None: tree = fsBucket() self._data[treekey] = tree tree[key[6:]] = value def __delitem__(self, key): treekey = key[:6] tree = self._data.get(treekey) if tree is None: raise KeyError, key del tree[key[6:]] if not tree: del self._data[treekey] def __len__(self): r = 0 for tree in self._data.itervalues(): r += len(tree) return r def update(self, mapping): for k, v in mapping.items(): self[k] = v def has_key(self, key): v = self.get(key, self) return v is not self def __contains__(self, key): tree = self._data.get(key[:6]) if tree is None: return False v = tree.get(key[6:], None) if v is None: return False return True def clear(self): self._data.clear() def __iter__(self): for prefix, tree in self._data.iteritems(): for suffix in tree: yield prefix + suffix iterkeys = __iter__ def keys(self): return list(self.iterkeys()) def iteritems(self): for prefix, tree in self._data.iteritems(): for suffix, value in tree.iteritems(): yield (prefix + suffix, str2num(value)) def items(self): return list(self.iteritems()) def itervalues(self): for tree in self._data.itervalues(): for value in tree.itervalues(): yield str2num(value) def values(self): return list(self.itervalues()) # Comment below 
applies for the following minKey and maxKey methods # # Obscure: what if `tree` is actually empty? We're relying here on # that this class doesn't implement __delitem__: once a key gets # into an fsIndex, the only way it can go away is by invoking # clear(). Therefore nothing in _data.values() is ever empty. # # Note that because `tree` is an fsBTree, its minKey()/maxKey() methods are # very efficient. def minKey(self, key=None): if key is None: smallest_prefix = self._data.minKey() else: smallest_prefix = self._data.minKey(key[:6]) tree = self._data[smallest_prefix] assert tree if key is None: smallest_suffix = tree.minKey() else: try: smallest_suffix = tree.minKey(key[6:]) except ValueError: # 'empty tree' (no suffix >= arg) next_prefix = prefix_plus_one(smallest_prefix) smallest_prefix = self._data.minKey(next_prefix) tree = self._data[smallest_prefix] assert tree smallest_suffix = tree.minKey() return smallest_prefix + smallest_suffix def maxKey(self, key=None): if key is None: biggest_prefix = self._data.maxKey() else: biggest_prefix = self._data.maxKey(key[:6]) tree = self._data[biggest_prefix] assert tree if key is None: biggest_suffix = tree.maxKey() else: try: biggest_suffix = tree.maxKey(key[6:]) except ValueError: # 'empty tree' (no suffix <= arg) next_prefix = prefix_minus_one(biggest_prefix) biggest_prefix = self._data.maxKey(next_prefix) tree = self._data[biggest_prefix] assert tree biggest_suffix = tree.maxKey() return biggest_prefix + biggest_suffix ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/fsrecover.py000066400000000000000000000241331230730566700230100ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Simple script for repairing damaged FileStorage files. Usage: %s [-f] [-v level] [-p] [-P seconds] input output Recover data from a FileStorage data file, skipping over damaged data. Any damaged data will be lost. This could lead to useless output if critical data is lost. Options: -f Overwrite output file even if it exists. -v level Set the verbosity level: 0 -- show progress indicator (default) 1 -- show transaction times and sizes 2 -- show transaction times and sizes, and show object (record) ids, versions, and sizes -p Copy partial transactions. If a data record in the middle of a transaction is bad, the data up to the bad data are packed. The output record is marked as packed. If this option is not used, transactions with any bad data are skipped. -P t Pack data to t seconds in the past. Note that if the "-p" option is used, then t should be 0. Important: The ZODB package must be importable. You may need to adjust PYTHONPATH accordingly. 
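Example invocation (the file names are illustrative):

    python fsrecover.py -v 1 -p Data.fs Recovered.fs

This recovers Data.fs into Recovered.fs, showing transaction times and sizes
and copying partial transactions where possible.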
""" # Algorithm: # # position to start of input # while 1: # if end of file: # break # try: # copy_transaction # except: # scan for transaction # continue import sys import os import getopt import time from struct import unpack from cPickle import loads try: import ZODB except ImportError: if os.path.exists('ZODB'): sys.path.append('.') elif os.path.exists('FileStorage.py'): sys.path.append('..') import ZODB import ZODB.FileStorage from ZODB.utils import u64 from ZODB.FileStorage import TransactionRecord from persistent.TimeStamp import TimeStamp def die(mess='', show_docstring=False): if mess: print >> sys.stderr, mess + '\n' if show_docstring: print >> sys.stderr, __doc__ % sys.argv[0] sys.exit(1) class ErrorFound(Exception): pass def error(mess, *args): raise ErrorFound(mess % args) def read_txn_header(f, pos, file_size, outp, ltid): # Read the transaction record f.seek(pos) h = f.read(23) if len(h) < 23: raise EOFError tid, stl, status, ul, dl, el = unpack(">8s8scHHH",h) tl = u64(stl) if pos + (tl + 8) > file_size: error("bad transaction length at %s", pos) if tl < (23 + ul + dl + el): error("invalid transaction length, %s, at %s", tl, pos) if ltid and tid < ltid: error("time-stamp reducation %s < %s, at %s", u64(tid), u64(ltid), pos) if status == "c": truncate(f, pos, file_size, outp) raise EOFError if status not in " up": error("invalid status, %r, at %s", status, pos) tpos = pos tend = tpos + tl if status == "u": # Undone transaction, skip it f.seek(tend) h = f.read(8) if h != stl: error("inconsistent transaction length at %s", pos) pos = tend + 8 return pos, None, tid pos = tpos+(23+ul+dl+el) user = f.read(ul) description = f.read(dl) if el: try: e=loads(f.read(el)) except: e={} else: e={} result = TransactionRecord(tid, status, user, description, e, pos, tend, f, tpos) pos = tend # Read the (intentionally redundant) transaction length f.seek(pos) h = f.read(8) if h != stl: error("redundant transaction length check failed at %s", pos) pos += 8 return pos, result, tid def truncate(f, pos, file_size, outp): """Copy data from pos to end of f to a .trNNN file.""" # _trname is global so that the test suite can know the path too (in # order to delete the file when the test ends). global _trname i = 0 while 1: _trname = outp + ".tr%d" % i if os.path.exists(_trname): i += 1 else: break tr = open(_trname, "wb") copy(f, tr, file_size - pos) f.seek(pos) tr.close() def copy(src, dst, n): while n: buf = src.read(8096) if not buf: break if len(buf) > n: buf = buf[:n] dst.write(buf) n -= len(buf) def scan(f, pos): """Return a potential transaction location following pos in f. This routine scans forward from pos looking for the last data record in a transaction. A period '.' always occurs at the end of a pickle, and an 8-byte transaction length follows the last pickle. If a period is followed by a plausible 8-byte transaction length, assume that we have found the end of a transaction. The caller should try to verify that the returned location is actually a transaction header. """ while 1: f.seek(pos) data = f.read(8096) if not data: return 0 s = 0 while 1: l = data.find(".", s) if l < 0: pos += len(data) break # If we are less than 8 bytes from the end of the # string, we need to read more data. 
s = l + 1 if s > len(data) - 8: pos += l break tl = u64(data[s:s+8]) if tl < pos: return pos + s + 8 def iprogress(i): if i % 2: print ".", else: print (i/2) % 10, sys.stdout.flush() def progress(p): for i in range(p): iprogress(i) def main(): try: opts, args = getopt.getopt(sys.argv[1:], "fv:pP:") except getopt.error, msg: die(str(msg), show_docstring=True) if len(args) != 2: die("two positional arguments required", show_docstring=True) inp, outp = args force = partial = False verbose = 0 pack = None for opt, v in opts: if opt == "-v": verbose = int(v) elif opt == "-p": partial = True elif opt == "-f": force = True elif opt == "-P": pack = time.time() - float(v) recover(inp, outp, verbose, partial, force, pack) def recover(inp, outp, verbose=0, partial=False, force=False, pack=None): print "Recovering", inp, "into", outp if os.path.exists(outp) and not force: die("%s exists" % outp) f = open(inp, "rb") if f.read(4) != ZODB.FileStorage.packed_version: die("input is not a file storage") f.seek(0,2) file_size = f.tell() ofs = ZODB.FileStorage.FileStorage(outp, create=1) _ts = None ok = 1 prog1 = 0 undone = 0 pos = 4L ltid = None while pos: try: npos, txn, tid = read_txn_header(f, pos, file_size, outp, ltid) except EOFError: break except (KeyboardInterrupt, SystemExit): raise except Exception, err: print "error reading txn header:", err if not verbose: progress(prog1) pos = scan(f, pos) if verbose > 1: print "looking for valid txn header at", pos continue ltid = tid if txn is None: undone = undone + npos - pos pos = npos continue else: pos = npos tid = txn.tid if _ts is None: _ts = TimeStamp(tid) else: t = TimeStamp(tid) if t <= _ts: if ok: print ("Time stamps out of order %s, %s" % (_ts, t)) ok = 0 _ts = t.laterThan(_ts) tid = `_ts` else: _ts = t if not ok: print ("Time stamps back in order %s" % (t)) ok = 1 ofs.tpc_begin(txn, tid, txn.status) if verbose: print "begin", pos, _ts, if verbose > 1: print sys.stdout.flush() nrec = 0 try: for r in txn: if verbose > 1: if r.data is None: l = "bp" else: l = len(r.data) print "%7d %s %s" % (u64(r.oid), l) ofs.restore(r.oid, r.tid, r.data, '', r.data_txn, txn) nrec += 1 except (KeyboardInterrupt, SystemExit): raise except Exception, err: if partial and nrec: ofs._status = "p" ofs.tpc_vote(txn) ofs.tpc_finish(txn) if verbose: print "partial" else: ofs.tpc_abort(txn) print "error copying transaction:", err if not verbose: progress(prog1) pos = scan(f, pos) if verbose > 1: print "looking for valid txn header at", pos else: ofs.tpc_vote(txn) ofs.tpc_finish(txn) if verbose: print "finish" sys.stdout.flush() if not verbose: prog = pos * 20l / file_size while prog > prog1: prog1 = prog1 + 1 iprogress(prog1) bad = file_size - undone - ofs._pos print "\n%s bytes removed during recovery" % bad if undone: print "%s bytes of undone transaction data were skipped" % undone if pack is not None: print "Packing ..." from ZODB.serialize import referencesf ofs.pack(pack, referencesf) ofs.close() if __name__ == "__main__": main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/fstools.py000066400000000000000000000112151230730566700225000ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Tools for using FileStorage data files. TODO: This module needs tests. Caution: This file needs to be kept in sync with FileStorage.py. """ import cPickle import struct from ZODB.FileStorage.format import TRANS_HDR, DATA_HDR, TRANS_HDR_LEN from ZODB.FileStorage.format import DATA_HDR_LEN from ZODB.utils import u64 from persistent.TimeStamp import TimeStamp class TxnHeader: """Object representing a transaction record header. Attribute Position Value --------- -------- ----- tid 0- 8 transaction id length 8-16 length of entire transaction record - 8 status 16-17 status of transaction (' ', 'u', 'p'?) user_len 17-19 length of user field (pack code H) descr_len 19-21 length of description field (pack code H) ext_len 21-23 length of extensions (pack code H) """ def __init__(self, file, pos): self._file = file self._pos = pos self._read_header() def _read_header(self): self._file.seek(self._pos) self._hdr = self._file.read(TRANS_HDR_LEN) (self.tid, self.length, self.status, self.user_len, self.descr_len, self.ext_len) = struct.unpack(TRANS_HDR, self._hdr) def read_meta(self): """Load user, descr, and ext attributes.""" self.user = "" self.descr = "" self.ext = {} if not (self.user_len or self.descr_len or self.ext_len): return self._file.seek(self._pos + TRANS_HDR_LEN) if self.user_len: self.user = self._file.read(self.user_len) if self.descr_len: self.descr = self._file.read(self.descr_len) if self.ext_len: self._ext = self._file.read(self.ext_len) self.ext = cPickle.loads(self._ext) def get_data_offset(self): return (self._pos + TRANS_HDR_LEN + self.user_len + self.descr_len + self.ext_len) def get_timestamp(self): return TimeStamp(self.tid) def get_raw_data(self): data_off = self.get_data_offset() data_len = self.length - (data_off - self._pos) self._file.seek(data_off) return self._file.read(data_len) def next_txn(self): off = self._pos + self.length + 8 self._file.seek(off) s = self._file.read(8) if not s: return None return TxnHeader(self._file, off) def prev_txn(self): if self._pos == 4: return None self._file.seek(self._pos - 8) tlen = u64(self._file.read(8)) return TxnHeader(self._file, self._pos - (tlen + 8)) class DataHeader: """Object representing a data record header. 
Attribute Position Value --------- -------- ----- oid 0- 8 object id serial 8-16 object serial numver prev_rec_pos 16-24 position of previous data record for object txn_pos 24-32 position of txn header version_len 32-34 length of version (always 0) data_len 34-42 length of data """ def __init__(self, file, pos): self._file = file self._pos = pos self._read_header() def _read_header(self): self._file.seek(self._pos) self._hdr = self._file.read(DATA_HDR_LEN) # always read the longer header, just in case (self.oid, self.serial, prev_rec_pos, txn_pos, vlen, data_len ) = struct.unpack(DATA_HDR, self._hdr[:DATA_HDR_LEN]) assert not vlen self.prev_rec_pos = u64(prev_rec_pos) self.txn_pos = u64(txn_pos) self.data_len = u64(data_len) def next_offset(self): """Return offset of next record.""" off = self._pos + self.data_len off += DATA_HDR_LEN if self.data_len == 0: off += 8 # backpointer return off def prev_txn(f): """Return transaction located before current file position.""" f.seek(-8, 1) tlen = u64(f.read(8)) + 8 return TxnHeader(f, f.tell() - tlen) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/historical_connections.txt000066400000000000000000000226621230730566700257510ustar00rootroot00000000000000====================== Historical Connections ====================== Usage ===== A database can be opened with a read-only, historical connection when given a specific transaction or datetime. This can enable full-context application level conflict resolution, historical exploration and preparation for reverts, or even the use of a historical database revision as "production" while development continues on a "development" head. A database can be opened historically ``at`` or ``before`` a given transaction serial or datetime. Here's a simple example. It should work with any storage that supports ``loadBefore``. We'll begin our example with a fairly standard set up. We - make a storage and a database; - open a normal connection; - modify the database through the connection; - commit a transaction, remembering the time in UTC; - modify the database again; and - commit a transaction. >>> import ZODB.MappingStorage >>> db = ZODB.MappingStorage.DB() >>> conn = db.open() >>> import persistent.mapping >>> conn.root()['first'] = persistent.mapping.PersistentMapping(count=0) >>> import transaction >>> transaction.commit() We wait for some time to pass, record he time, and then make some other changes. >>> import time >>> time.sleep(.01) >>> import datetime >>> now = datetime.datetime.utcnow() >>> time.sleep(.01) >>> root = conn.root() >>> root['second'] = persistent.mapping.PersistentMapping() >>> root['first']['count'] += 1 >>> transaction.commit() Now we will show a historical connection. We'll open one using the ``now`` value we generated above, and then demonstrate that the state of the original connection, at the mutable head of the database, is different than the historical state. >>> transaction1 = transaction.TransactionManager() >>> historical_conn = db.open(transaction_manager=transaction1, at=now) >>> sorted(conn.root().keys()) ['first', 'second'] >>> conn.root()['first']['count'] 1 >>> historical_conn.root().keys() ['first'] >>> historical_conn.root()['first']['count'] 0 Moreover, the historical connection cannot commit changes. >>> historical_conn.root()['first']['count'] += 1 >>> historical_conn.root()['first']['count'] 1 >>> transaction1.commit() Traceback (most recent call last): ... 
ReadOnlyHistoryError >>> transaction1.abort() >>> historical_conn.root()['first']['count'] 0 (It is because of the mutable behavior outside of transactional semantics that we must have a separate connection, and associated object cache, per thread, even though the semantics should be readonly.) As demonstrated, a timezone-naive datetime will be interpreted as UTC. You can also pass a timezone-aware datetime or a serial (transaction id). Here's opening with a serial--the serial of the root at the time of the first commit. >>> historical_serial = historical_conn.root()._p_serial >>> historical_conn.close() >>> historical_conn = db.open(transaction_manager=transaction1, ... at=historical_serial) >>> historical_conn.root().keys() ['first'] >>> historical_conn.root()['first']['count'] 0 >>> historical_conn.close() We've shown the ``at`` argument. You can also ask to look ``before`` a datetime or serial. (It's an error to pass both [#not_both]_) In this example, we're looking at the database immediately prior to the most recent change to the root. >>> serial = conn.root()._p_serial >>> historical_conn = db.open( ... transaction_manager=transaction1, before=serial) >>> historical_conn.root().keys() ['first'] >>> historical_conn.root()['first']['count'] 0 In fact, ``at`` arguments are translated into ``before`` values because the underlying mechanism is a storage's loadBefore method. When you look at a connection's ``before`` attribute, it is normalized into a ``before`` serial, no matter what you pass into ``db.open``. >>> print conn.before None >>> historical_conn.before == serial True >>> conn.close() Configuration ============= Like normal connections, the database lets you set how many total historical connections can be active without generating a warning, and how many objects should be kept in each historical connection's object cache. >>> db.getHistoricalPoolSize() 3 >>> db.setHistoricalPoolSize(4) >>> db.getHistoricalPoolSize() 4 >>> db.getHistoricalCacheSize() 1000 >>> db.setHistoricalCacheSize(2000) >>> db.getHistoricalCacheSize() 2000 In addition, you can specify the minimum number of seconds that an unused historical connection should be kept. >>> db.getHistoricalTimeout() 300 >>> db.setHistoricalTimeout(400) >>> db.getHistoricalTimeout() 400 All three of these values can be specified in a ZConfig file. >>> import ZODB.config >>> db2 = ZODB.config.databaseFromString(''' ... ... ... historical-pool-size 3 ... historical-cache-size 1500 ... historical-timeout 6m ... ... ''') >>> db2.getHistoricalPoolSize() 3 >>> db2.getHistoricalCacheSize() 1500 >>> db2.getHistoricalTimeout() 360 The pool lets us reuse connections. To see this, we'll open some connections, close them, and then open them again: >>> conns1 = [db2.open(before=serial) for i in range(4)] >>> _ = [c.close() for c in conns1] >>> conns2 = [db2.open(before=serial) for i in range(4)] Now let's look at what we got. The first connection in conns 2 is the last connection in conns1, because it was the last connection closed. >>> conns2[0] is conns1[-1] True Also for the next two: >>> (conns2[1] is conns1[-2]), (conns2[2] is conns1[-3]) (True, True) But not for the last: >>> conns2[3] is conns1[-4] False Because the pool size was set to 3. Connections are also discarded if they haven't been used in a while. 
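The same three limits can also be set up front, when the database is created, instead of with the setters used above. The following is only a sketch (it is not executed as part of this document) and assumes the DB constructor in your ZODB version accepts these keyword arguments, mirroring the ZConfig options shown earlier::

    import ZODB, ZODB.MappingStorage
    db3 = ZODB.DB(ZODB.MappingStorage.MappingStorage(),
                  historical_pool_size=3,      # like historical-pool-size
                  historical_cache_size=1500,  # like historical-cache-size
                  historical_timeout=360)      # seconds, like historical-timeout 6m

However the limits are configured, historical connections that sit unused in the pool for longer than the timeout are discarded rather than reused.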
To see this, let's close two of the connections: >>> conns2[0].close(); conns2[1].close() We'l also set the historical timeout to be very low: >>> db2.setHistoricalTimeout(.01) >>> time.sleep(.1) >>> conns2[2].close(); conns2[3].close() Now, when we open 4 connections: >>> conns1 = [db2.open(before=serial) for i in range(4)] We'll see that only the last 2 connections from conn2 are in the result: >>> [c in conns1 for c in conns2] [False, False, True, True] If you change the historical cache size, that changes the size of the persistent cache on our connection. >>> historical_conn._cache.cache_size 2000 >>> db.setHistoricalCacheSize(1500) >>> historical_conn._cache.cache_size 1500 Invalidations ============= Invalidations are ignored for historical connections. This is another white box test. >>> historical_conn = db.open( ... transaction_manager=transaction1, at=serial) >>> conn = db.open() >>> sorted(conn.root().keys()) ['first', 'second'] >>> conn.root()['first']['count'] 1 >>> sorted(historical_conn.root().keys()) ['first', 'second'] >>> historical_conn.root()['first']['count'] 1 >>> conn.root()['first']['count'] += 1 >>> conn.root()['third'] = persistent.mapping.PersistentMapping() >>> transaction.commit() >>> len(historical_conn._invalidated) 0 >>> historical_conn.close() Note that if you try to open an historical connection to a time in the future, you will get an error. >>> historical_conn = db.open( ... at=datetime.datetime.utcnow()+datetime.timedelta(1)) Traceback (most recent call last): ... ValueError: cannot open an historical connection in the future. Warnings ======== First, if you use datetimes to get a historical connection, be aware that the conversion from datetime to transaction id has some pitfalls. Generally, the transaction ids in the database are only as time-accurate as the system clock was when the transaction id was created. Moreover, leap seconds are handled somewhat naively in the ZODB (largely because they are handled naively in Unix/ POSIX time) so any minute that contains a leap second may contain serials that are a bit off. This is not generally a problem for the ZODB, because serials are guaranteed to increase, but it does highlight the fact that serials are not guaranteed to be accurately connected to time. Generally, they are about as reliable as time.time. Second, historical connections currently introduce potentially wide variance in memory requirements for the applications. Since you can open up many connections to different serials, and each gets their own pool, you may collect quite a few connections. For now, at least, if you use this feature you need to be particularly careful of your memory usage. Get rid of pools when you know you can, and reuse the exact same values for ``at`` or ``before`` when possible. If historical connections are used for conflict resolution, these connections will probably be temporary--not saved in a pool--so that the extra memory usage would also be brief and unlikely to overlap. .. cleanup >>> db.close() >>> db2.close() .. ......... .. .. Footnotes .. .. ......... .. .. [#not_both] It is an error to try and pass both `at` and `before`. >>> historical_conn = db.open( ... transaction_manager=transaction1, at=now, before=historical_serial) Traceback (most recent call last): ... 
ValueError: can only pass zero or one of `at` and `before` ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/interfaces.py000066400000000000000000001375111230730566700231420ustar00rootroot00000000000000############################################################################## # # Copyright (c) Zope Corporation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## from zope.interface import Interface, Attribute class IConnection(Interface): """Connection to ZODB for loading and storing objects. The Connection object serves as a data manager. The root() method on a Connection returns the root object for the database. This object and all objects reachable from it are associated with the Connection that loaded them. When a transaction commits, it uses the Connection to store modified objects. Typical use of ZODB is for each thread to have its own Connection and that no thread should have more than one Connection to the same database. A thread is associated with a Connection by loading objects from that Connection. Objects loaded by one thread should not be used by another thread. A Connection can be frozen to a serial--a transaction id, a single point in history-- when it is created. By default, a Connection is not associated with a serial; it uses current data. A Connection frozen to a serial is read-only. Each Connection provides an isolated, consistent view of the database, by managing independent copies of objects in the database. At transaction boundaries, these copies are updated to reflect the current state of the database. You should not instantiate this class directly; instead call the open() method of a DB instance. In many applications, root() is the only method of the Connection that you will need to use. Synchronization --------------- A Connection instance is not thread-safe. It is designed to support a thread model where each thread has its own transaction. If an application has more than one thread that uses the connection or the transaction the connection is registered with, the application should provide locking. The Connection manages movement of objects in and out of object storage. TODO: We should document an intended API for using a Connection via multiple threads. TODO: We should explain that the Connection has a cache and that multiple calls to get() will return a reference to the same object, provided that one of the earlier objects is still referenced. Object identity is preserved within a connection, but not across connections. TODO: Mention the database pool. A database connection always presents a consistent view of the objects in the database, although it may not always present the most current revision of any particular object. Modifications made by concurrent transactions are not visible until the next transaction boundary (abort or commit). Two options affect consistency. By default, the mvcc and synch options are enabled by default. If you pass mvcc=False to db.open(), the Connection will never read non-current revisions of an object. 
Instead it will raise a ReadConflictError to indicate that the current revision is unavailable because it was written after the current transaction began. The logic for handling modifications assumes that the thread that opened a Connection (called db.open()) is the thread that will use the Connection. If this is not true, you should pass synch=False to db.open(). When the synch option is disabled, some transaction boundaries will be missed by the Connection; in particular, if a transaction does not involve any modifications to objects loaded from the Connection and synch is disabled, the Connection will miss the transaction boundary. Two examples of this behavior are db.undo() and read-only transactions. Groups of methods: User Methods: root, get, add, close, db, sync, isReadOnly, cacheGC, cacheFullSweep, cacheMinimize Experimental Methods: onCloseCallbacks Database Invalidation Methods: invalidate Other Methods: exchange, getDebugInfo, setDebugInfo, getTransferCounts """ def add(ob): """Add a new object 'obj' to the database and assign it an oid. A persistent object is normally added to the database and assigned an oid when it becomes reachable to an object already in the database. In some cases, it is useful to create a new object and use its oid (_p_oid) in a single transaction. This method assigns a new oid regardless of whether the object is reachable. The object is added when the transaction commits. The object must implement the IPersistent interface and must not already be associated with a Connection. Parameters: obj: a Persistent object Raises TypeError if obj is not a persistent object. Raises InvalidObjectReference if obj is already associated with another connection. Raises ConnectionStateError if the connection is closed. """ def get(oid): """Return the persistent object with oid 'oid'. If the object was not in the cache and the object's class is ghostable, then a ghost will be returned. If the object is already in the cache, a reference to the cached object will be returned. Applications seldom need to call this method, because objects are loaded transparently during attribute lookup. Parameters: oid: an object id Raises KeyError if oid does not exist. It is possible that an object does not exist as of the current transaction, but existed in the past. It may even exist again in the future, if the transaction that removed it is undone. Raises ConnectionStateError if the connection is closed. """ def cacheMinimize(): """Deactivate all unmodified objects in the cache. Call _p_deactivate() on each cached object, attempting to turn it into a ghost. It is possible for individual objects to remain active. """ def cacheGC(): """Reduce cache size to target size. Call _p_deactivate() on cached objects until the cache size falls under the target size. """ def onCloseCallback(f): """Register a callable, f, to be called by close(). f will be called with no arguments before the Connection is closed. Parameters: f: method that will be called on `close` """ def close(): """Close the Connection. When the Connection is closed, all callbacks registered by onCloseCallback() are invoked and the cache is garbage collected. A closed Connection should not be used by client code. It can't load or store objects. Objects in the cache are not freed, because Connections are re-used and the cache is expected to be useful to the next client. 
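        A typical shutdown sequence in application code (a sketch, not a
        requirement of this interface) finishes or abandons the
        transaction before closing:

          transaction.abort()   # or transaction.commit()
          connection.close()
          db.close()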
""" def db(): """Returns a handle to the database this connection belongs to.""" def isReadOnly(): """Returns True if the storage for this connection is read only.""" def invalidate(tid, oids): """Notify the Connection that transaction 'tid' invalidated oids. When the next transaction boundary is reached, objects will be invalidated. If any of the invalidated objects are accessed by the current transaction, the revision written before Connection.tid will be used. The DB calls this method, even when the Connection is closed. Parameters: tid: the storage-level id of the transaction that committed oids: oids is an iterable of oids. """ def root(): """Return the database root object. The root is a persistent.mapping.PersistentMapping. """ # Multi-database support. connections = Attribute( """A mapping from database name to a Connection to that database. In multi-database use, the Connections of all members of a database collection share the same .connections object. In single-database use, of course this mapping contains a single entry. """) # TODO: should this accept all the arguments one may pass to DB.open()? def get_connection(database_name): """Return a Connection for the named database. This is intended to be called from an open Connection associated with a multi-database. In that case, database_name must be the name of a database within the database collection (probably the name of a different database than is associated with the calling Connection instance, but it's fine to use the name of the calling Connection object's database). A Connection for the named database is returned. If no connection to that database is already open, a new Connection is opened. So long as the multi-database remains open, passing the same name to get_connection() multiple times returns the same Connection object each time. """ def sync(): """Manually update the view on the database. This includes aborting the current transaction, getting a fresh and consistent view of the data (synchronizing with the storage if possible) and calling cacheGC() for this connection. This method was especially useful in ZODB 3.2 to better support read-only connections that were affected by a couple of problems. """ # Debug information def getDebugInfo(): """Returns a tuple with different items for debugging the connection. Debug information can be added to a connection by using setDebugInfo. """ def setDebugInfo(*items): """Add the given items to the debug information of this connection.""" def getTransferCounts(clear=False): """Returns the number of objects loaded and stored. If clear is True, reset the counters. """ def invalidateCache(): """Invalidate the connection cache This invalidates *all* objects in the cache. If the connection is open, subsequent reads will fail until a new transaction begins or until the connection os reopned. """ def readCurrent(obj): """Make sure an object being read is current This is used when applications want to ensure a higher level of consistency for some operations. This should be called when an object is read and the information read is used to write a separate object. """ class IStorageWrapper(Interface): """Storage wrapper interface This interface provides 3 facilities: - Out-of-band invalidation support A storage can notify it's wrapper of object invalidations that don't occur due to direct operations on the storage. Currently this is only used by ZEO client storages to pass invalidation messages sent from a server. 
- Record-reference extraction The references method can be used to extract referenced object IDs from a database record. This can be used by storages to provide more advanced garbage collection. A wrapper storage that transforms data will provide a references method that untransforms data passed to it and then pass the data to the layer above it. - Record transformation A storage wrapper may transform data, for example for compression or encryption. Methods are provided to transform or untransform data. This interface may be implemented by storage adapters or other intermediaries. For example, a storage adapter that provides encryption and/or compresssion will apply record transformations in it's references method. """ def invalidateCache(): """Discard all cached data This can be necessary if there have been major changes to stored data and it is either impractical to enumerate them or there would be so many that it would be inefficient to do so. """ def invalidate(transaction_id, oids, version=''): """Invalidate object ids committed by the given transaction The oids argument is an iterable of object identifiers. The version argument is provided for backward compatibility. If passed, it must be an empty string. """ def references(record, oids=None): """Scan the given record for object ids A list of object ids is returned. If a list is passed in, then it will be used and augmented. Otherwise, a new list will be created and returned. """ def transform_record_data(data): """Return transformed data """ def untransform_record_data(data): """Return untransformed data """ IStorageDB = IStorageWrapper # for backward compatibility class IDatabase(IStorageDB): """ZODB DB. """ # TODO: This interface is incomplete. # XXX how is it incomplete? databases = Attribute( """A mapping from database name to DB (database) object. In multi-database use, all DB members of a database collection share the same .databases object. In single-database use, of course this mapping contains a single entry. """) storage = Attribute( """The object that provides storage for the database This attribute is useful primarily for tests. Normal application code should rarely, if ever, have a need to use this attribute. """) def open(transaction_manager=None, serial=''): """Return an IConnection object for use by application code. transaction_manager: transaction manager to use. None means use the default transaction manager. serial: the serial (transaction id) of the database to open. An empty string (the default) means to open it to the newest serial. Specifying a serial results in a read-only historical connection. Note that the connection pool is managed as a stack, to increase the likelihood that the connection's stack will include useful objects. """ # TODO: Should this method be moved into some subinterface? def pack(t=None, days=0): """Pack the storage, deleting unused object revisions. A pack is always performed relative to a particular time, by default the current time. All object revisions that are not reachable as of the pack time are deleted from the storage. The cost of this operation varies by storage, but it is usually an expensive operation. There are two optional arguments that can be used to set the pack time: t, pack time in seconds since the epcoh, and days, the number of days to subtract from t or from the current time if t is not specified. """ # TODO: Should this method be moved into some subinterface? def undo(id, txn=None): """Undo a transaction identified by id. 
A transaction can be undone if all of the objects involved in the transaction were not modified subsequently, if any modifications can be resolved by conflict resolution, or if subsequent changes resulted in the same object state. The value of id should be generated by calling undoLog() or undoInfo(). The value of id is not the same as a transaction id used by other methods; it is unique to undo(). id: a storage-specific transaction identifier txn: transaction context to use for undo(). By default, uses the current transaction. """ def close(): """Close the database and its underlying storage. It is important to close the database, because the storage may flush in-memory data structures to disk when it is closed. Leaving the storage open with the process exits can cause the next open to be slow. What effect does closing the database have on existing connections? Technically, they remain open, but their storage is closed, so they stop behaving usefully. Perhaps close() should also close all the Connections. """ class IStorage(Interface): """A storage is responsible for storing and retrieving data of objects. Consistency and locking ----------------------- When transactions are committed, a storage assigns monotonically increasing transaction identifiers (tids) to the transactions and to the object versions written by the transactions. ZODB relies on this to decide if data in object caches are up to date and to implement multi-version concurrency control. There are methods in IStorage and in derived interfaces that provide information about the current revisions (tids) for objects or for the database as a whole. It is critical for the proper working of ZODB that the resulting tids are increasing with respect to the object identifier given or to the databases. That is, if there are 2 results for an object or for the database, R1 and R2, such that R1 is returned before R2, then the tid returned by R2 must be greater than or equal to the tid returned by R1. (When thinking about results for the database, think of these as results for all objects in the database.) This implies some sort of locking strategy. The key method is tcp_finish, which causes new tids to be generated and also, through the callback passed to it, returns new current tids for the objects stored in a transaction and for the database as a whole. The IStorage methods affected are lastTransaction, load, store, and tpc_finish. Derived interfaces may introduce additional methods. """ def close(): """Close the storage. Finalize the storage, releasing any external resources. The storage should not be used after this method is called. """ def getName(): """The name of the storage The format and interpretation of this name is storage dependent. It could be a file name, a database name, etc.. This is used soley for informational purposes. """ def getSize(): """An approximate size of the database, in bytes. This is used soley for informational purposes. """ def history(oid, size=1): """Return a sequence of history information dictionaries. Up to size objects (including no objects) may be returned. The information provides a log of the changes made to the object. Data are reported in reverse chronological order. Each dictionary has the following keys: time UTC seconds since the epoch (as in time.time) that the object revision was committed. tid The transaction identifier of the transaction that committed the version. serial An alias for tid, which expected by older clients. 
user_name The user identifier, if any (or an empty string) of the user on whos behalf the revision was committed. description The transaction description for the transaction that committed the revision. size The size of the revision data record. If the transaction had extension items, then these items are also included if they don't conflict with the keys above. """ def isReadOnly(): """Test whether a storage allows committing new transactions For a given storage instance, this method always returns the same value. Read-only-ness is a static property of a storage. """ # XXX Note that this method doesn't really buy us much, # especially since we have to account for the fact that a # ostensibly non-read-only storage may be read-only # transiently. It would be better to just have read-only errors. def lastTransaction(): """Return the id of the last committed transaction. If no transactions have been committed, return a string of 8 null (0) characters. """ def __len__(): """The approximate number of objects in the storage This is used soley for informational purposes. """ def load(oid, version): """Load data for an object id The version argumement should always be an empty string. It exists soley for backward compatibility with older storage implementations. A data record and serial are returned. The serial is a transaction identifier of the transaction that wrote the data record. A POSKeyError is raised if there is no record for the object id. """ def loadBefore(oid, tid): """Load the object data written before a transaction id If there isn't data before the object before the given transaction, then None is returned, otherwise three values are returned: - The data record - The transaction id of the data record - The transaction id of the following revision, if any, or None. If the object id isn't in the storage, then POSKeyError is raised. """ def loadSerial(oid, serial): """Load the object record for the give transaction id If a matching data record can be found, it is returned, otherwise, POSKeyError is raised. """ # The following two methods are effectively part of the interface, # as they are generally needed when one storage wraps # another. This deserves some thought, at probably debate, before # adding them. # # def _lock_acquire(): # """Acquire the storage lock # """ # def _lock_release(): # """Release the storage lock # """ def new_oid(): """Allocate a new object id. The object id returned is reserved at least as long as the storage is opened. The return value is a string. """ def pack(pack_time, referencesf): """Pack the storage It is up to the storage to interpret this call, however, the general idea is that the storage free space by: - discarding object revisions that were old and not current as of the given pack time. - garbage collecting objects that aren't reachable from the root object via revisions remaining after discarding revisions that were not current as of the pack time. The pack time is given as a UTC time in seconds since the epoch. The second argument is a function that should be used to extract object references from database records. This is needed to determine which objects are referenced from object revisions. """ def registerDB(wrapper): """Register a storage wrapper IStorageWrapper. The passed object is a wrapper object that provides an upcall interface to support composition. Note that, for historical reasons, an implementation may require a second argument, however, if required, the None will be passed as the second argument. 
Also, for historical reasons, this is called registerDB rather than register_wrapper. """ def sortKey(): """Sort key used to order distributed transactions When a transaction involved multiple storages, 2-phase commit operations are applied in sort-key order. This must be unique among storages used in a transaction. Obviously, the storage can't assure this, but it should construct the sort key so it has a reasonable chance of being unique. The result must be a string. """ def store(oid, serial, data, version, transaction): """Store data for the object id, oid. Arguments: oid The object identifier. This is either a string consisting of 8 nulls or a string previously returned by new_oid. serial The serial of the data that was read when the object was loaded from the database. If the object was created in the current transaction this will be a string consisting of 8 nulls. data The data record. This is opaque to the storage. version This must be an empty string. It exists for backward compatibility. transaction A transaction object. This should match the current transaction for the storage, set by tpc_begin. The new serial for the object is returned, but not necessarily immediately. It may be returned directly, or on a subsequent store or tpc_vote call. The return value may be: - None, or - A new serial (string) for the object If None is returned, then a new serial (or other special values) must ve returned in tpc_vote results. A serial, returned as a string, may be the special value ZODB.ConflictResolution.ResolvedSerial to indicate that a conflict occured and that the object should be invalidated. Several different exceptions may be raised when an error occurs. ConflictError is raised when serial does not match the most recent serial number for object oid and the conflict was not resolved by the storage. StorageTransactionError is raised when transaction does not match the current transaction. StorageError or, more often, a subclass of it is raised when an internal error occurs while the storage is handling the store() call. """ def tpc_abort(transaction): """Abort the transaction. Any changes made by the transaction are discarded. This call is ignored is the storage is not participating in two-phase commit or if the given transaction is not the same as the transaction the storage is commiting. """ def tpc_begin(transaction): """Begin the two-phase commit process. If storage is already participating in a two-phase commit using the same transaction, a StorageTransactionError is raised. If the storage is already participating in a two-phase commit using a different transaction, the call blocks until the current transaction ends (commits or aborts). """ def tpc_finish(transaction, func = lambda tid: None): """Finish the transaction, making any transaction changes permanent. Changes must be made permanent at this point. This call raises a StorageTransactionError if the storage isn't participating in two-phase commit or if it is committing a different transaction. Failure of this method is extremely serious. The second argument is a call-back function that must be called while the storage transaction lock is held. It takes the new transaction id generated by the transaction. """ def tpc_vote(transaction): """Provide a storage with an opportunity to veto a transaction This call raises a StorageTransactionError if the storage isn't participating in two-phase commit or if it is commiting a different transaction. If a transaction can be committed by a storage, then the method should return. 
If a transaction cannot be committed, then an exception should be raised. If this method returns without an error, then there must not be an error if tpc_finish or tpc_abort is called subsequently. The return value can be either None or a sequence of object-id and serial pairs giving new serials for objects who's ids were passed to previous store calls in the same transaction. After the tpc_vote call, new serials must have been returned, either from tpc_vote or store for objects passed to store. A serial returned in a sequence of oid/serial pairs, may be the special value ZODB.ConflictResolution.ResolvedSerial to indicate that a conflict occured and that the object should be invalidated. """ class IStorageRestoreable(IStorage): """Copying Transactions The IStorageRestoreable interface supports copying already-committed transactions from one storage to another. This is typically done for replication or for moving data from one storage implementation to another. """ def tpc_begin(transaction, tid=None): """Begin the two-phase commit process. If storage is already participating in a two-phase commit using the same transaction, the call is ignored. If the storage is already participating in a two-phase commit using a different transaction, the call blocks until the current transaction ends (commits or aborts). If a transaction id is given, then the transaction will use the given id rather than generating a new id. This is used when copying already committed transactions from another storage. """ # Note that the current implementation also accepts a status. # This is an artifact of: # - Earlier use of an undo status to undo revisions in place, # and, # - Incorrect pack garbage-collection algorithms (possibly # including the existing FileStorage implementation), that # failed to take into account records after the pack time. def restore(oid, serial, data, version, prev_txn, transaction): """Write data already committed in a separate database The restore method is used when copying data from one database to a replica of the database. It differs from store in that the data have already been committed, so there is no check for conflicts and no new transaction is is used for the data. Arguments: oid The object id for the record serial The transaction identifier that originally committed this object. data The record data. This will be None if the transaction undid the creation of the object. prev_txn The identifier of a previous transaction that held the object data. The target storage can sometimes use this as a hint to save space. transaction The current transaction. Nothing is returned. """ class IStorageRecordInformation(Interface): """Provide information about a single storage record """ oid = Attribute("The object id") tid = Attribute("The transaction id") data = Attribute("The data record") version = Attribute("The version id") data_txn = Attribute("The previous transaction id") class IStorageTransactionInformation(Interface): """Provide information about a storage transaction. Can be iterated over to retrieve the records modified in the transaction. """ tid = Attribute("Transaction id") status = Attribute("Transaction Status") # XXX what are valid values? user = Attribute("Transaction user") description = Attribute("Transaction Description") extension = Attribute( "A dictionary carrying the transaction's extension data") def __iter__(): """Iterate over the transaction's records given as IStorageRecordInformation objects. 
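        Together with IStorageIteration (defined below), a whole-storage
        scan typically looks like this sketch, where process() is a
        hypothetical callback:

          for txn_info in storage.iterator():
              for record in txn_info:
                  process(record.oid, record.tid, record.data)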
""" class IStorageIteration(Interface): """API for iterating over the contents of a storage.""" def iterator(start=None, stop=None): """Return an IStorageTransactionInformation iterator. If the start argument is not None, then iteration will start with the first transaction whose identifier is greater than or equal to start. If the stop argument is not None, then iteration will end with the last transaction whose identifier is less than or equal to stop. The iterator provides access to the data as available at the time when the iterator was retrieved. """ class IStorageUndoable(IStorage): """A storage supporting transactional undo. """ def supportsUndo(): """Return True, indicating that the storage supports undo. """ def undo(transaction_id, transaction): """Undo the transaction corresponding to the given transaction id. The transaction id is a value returned from undoInfo or undoLog, which may not be a stored transaction identifier as used elsewhere in the storage APIs. This method must only be called in the first phase of two-phase commit (after tpc_begin but before tpc_vote). It returns a serial (transaction id) and a sequence of object ids for objects affected by the transaction. """ # Used by DB (Actually, by TransactionalUndo) def undoLog(first, last, filter=None): """Return a sequence of descriptions for undoable transactions. Application code should call undoLog() on a DB instance instead of on the storage directly. A transaction description is a mapping with at least these keys: "time": The time, as float seconds since the epoch, when the transaction committed. "user_name": The value of the `.user` attribute on that transaction. "description": The value of the `.description` attribute on that transaction. "id`" A string uniquely identifying the transaction to the storage. If it's desired to undo this transaction, this is the `transaction_id` to pass to `undo()`. In addition, if any name+value pairs were added to the transaction by `setExtendedInfo()`, those may be added to the transaction description mapping too (for example, FileStorage's `undoLog()` does this). `filter` is a callable, taking one argument. A transaction description mapping is passed to `filter` for each potentially undoable transaction. The sequence returned by `undoLog()` excludes descriptions for which `filter` returns a false value. By default, `filter` always returns a true value. ZEO note: Arbitrary callables cannot be passed from a ZEO client to a ZEO server, and a ZEO client's implementation of `undoLog()` ignores any `filter` argument that may be passed. ZEO clients should use the related `undoInfo()` method instead (if they want to do filtering). Now picture a list containing descriptions of all undoable transactions that pass the filter, most recent transaction first (at index 0). The `first` and `last` arguments specify the slice of this (conceptual) list to be returned: `first`: This is the index of the first transaction description in the slice. It must be >= 0. `last`: If >= 0, first:last acts like a Python slice, selecting the descriptions at indices `first`, first+1, ..., up to but not including index `last`. At most last-first descriptions are in the slice, and `last` should be at least as large as `first` in this case. If `last` is less than 0, then abs(last) is taken to be the maximum number of descriptions in the slice (which still begins at index `first`). When `last` < 0, the same effect could be gotten by passing the positive first-last for `last` instead. 
""" # DB pass through def undoInfo(first=0, last=-20, specification=None): """Return a sequence of descriptions for undoable transactions. This is like `undoLog()`, except for the `specification` argument. If given, `specification` is a dictionary, and `undoInfo()` synthesizes a `filter` function `f` for `undoLog()` such that `f(desc)` returns true for a transaction description mapping `desc` if and only if `desc` maps each key in `specification` to the same value `specification` maps that key to. In other words, only extensions (or supersets) of `specification` match. ZEO note: `undoInfo()` passes the `specification` argument from a ZEO client to its ZEO server (while a ZEO client ignores any `filter` argument passed to `undoLog()`). """ # DB pass-through class IMVCCStorage(IStorage): """A storage that provides MVCC semantics internally. MVCC (multi-version concurrency control) means each user of a database has a snapshot view of the database. The snapshot view does not change, even if concurrent connections commit transactions, until a transaction boundary. Relational databases that support serializable transaction isolation provide MVCC. Storages that implement IMVCCStorage, such as RelStorage, provide MVCC semantics at the ZODB storage layer. When ZODB.Connection uses a storage that implements IMVCCStorage, each connection uses a connection-specific storage instance, and that storage instance provides a snapshot of the database. By contrast, storages that do not implement IMVCCStorage, such as FileStorage, rely on ZODB.Connection to provide MVCC semantics, so in that case, one storage instance is shared by many ZODB.Connections. Applications that use ZODB.Connection always have a snapshot view of the database; IMVCCStorage only modifies which layer of ZODB provides MVCC. Furthermore, IMVCCStorage changes the way object invalidation works. An essential feature of ZODB is the propagation of object invalidation messages to keep in-memory caches up to date. Storages like FileStorage and ZEO.ClientStorage send invalidation messages to all other Connection instances at transaction commit time. Storages that implement IMVCCStorage, on the other hand, expect the ZODB.Connection to poll for a list of invalidated objects. Certain methods of IMVCCStorage implementations open persistent back end database sessions and retain the sessions even after the method call finishes:: load loadEx loadSerial loadBefore store restore new_oid history tpc_begin tpc_vote tpc_abort tpc_finish If you know that the storage instance will no longer be used after calling any of these methods, you should call the release method to release the persistent sessions. The persistent sessions will be reopened as necessary if you call one of those methods again. Other storage methods open short lived back end sessions and close the back end sessions before returning. These include:: __len__ getSize undoLog undo pack iterator These methods do not provide MVCC semantics, so these methods operate on the most current view of the database, rather than the snapshot view that the other methods use. """ def new_instance(): """Creates and returns another storage instance. The returned instance provides IMVCCStorage and connects to the same back-end database. The database state visible by the instance will be a snapshot that varies independently of other storage instances. """ def release(): """Release all persistent sessions used by this storage instance. 
After this call, the storage instance can still be used; calling methods that use persistent sessions will cause the persistent sessions to be reopened. """ def poll_invalidations(): """Poll the storage for external changes. Returns either a sequence of OIDs that have changed, or None. When a sequence is returned, the corresponding objects should be removed from the ZODB in-memory cache. When None is returned, the storage is indicating that so much time has elapsed since the last poll that it is no longer possible to enumerate all of the changed OIDs, since the previous transaction seen by the connection has already been packed. In that case, the ZODB in-memory cache should be cleared. """ def sync(force=True): """Updates the internal snapshot to the current state of the database. If the force parameter is False, the storage may choose to ignore this call. By ignoring this call, a storage can reduce the frequency of database polls, thus reducing database load. """ class IStorageCurrentRecordIteration(IStorage): def record_iternext(next=None): """Iterate over the records in a storage Use like this: >>> next = None >>> while 1: ... oid, tid, data, next = storage.record_iternext(next) ... # do things with oid, tid, and data ... if next is None: ... break """ class IExternalGC(IStorage): def deleteObject(oid, serial, transaction): """Mark an object as deleted This method marks an object as deleted via a new object revision. Subsequent attempts to load current data for the object will fail with a POSKeyError, but loads for non-current data will suceed if there are previous non-delete records. The object will be removed from the storage when all not-delete records are removed. The serial argument must match the most recently committed serial for the object. This is a seat belt. This method can only be called in the first phase of 2-phase commit. """ class ReadVerifyingStorage(IStorage): def checkCurrentSerialInTransaction(oid, serial, transaction): """Check whether the given serial number is current. The method is called during the first phase of 2-phase commit to verify that data read in a transaction is current. The storage should raise a ReadConflictError if the serial is not current, although it may raise the exception later, in a call to store or in a call to tpc_vote. If no exception is raised, then the serial must remain current through the end of the transaction. """ class IBlob(Interface): """A BLOB supports efficient handling of large data within ZODB.""" def open(mode): """Open a blob Returns a file(-like) object for handling the blob data. mode: Mode to open the file with. Possible values: r,w,r+,a,c The mode 'c' is similar to 'r', except that an orinary file object is returned and may be used in a separate transaction and after the blob's database connection has been closed. """ def committed(): """Return a file name for committed data. The returned file name may be opened for reading or handed to other processes for reading. The file name isn't guarenteed to be valid indefinately. The file may be removed in the future as a result of garbage collection depending on system configuration. A BlobError will be raised if the blob has any uncommitted data. """ def consumeFile(filename): """Consume a file. Replace the current data of the blob with the file given under filename. The blob must not be opened for reading or writing when consuming a file. The blob will take over ownership of the file and will either rename or copy and remove it. The file must not be open. 
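        A typical use, with a hypothetical already-written temporary
        file, is simply this sketch:

          import ZODB.blob
          blob = ZODB.blob.Blob()
          blob.consumeFile('/tmp/uploaded-data.bin')  # illustrative path only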
""" class IBlobStorage(Interface): """A storage supporting BLOBs.""" def storeBlob(oid, oldserial, data, blobfilename, version, transaction): """Stores data that has a BLOB attached. The blobfilename argument names a file containing blob data. The storage will take ownership of the file and will rename it (or copy and remove it) immediately, or at transaction-commit time. The file must not be open. The new serial for the object is returned, but not necessarily immediately. It may be returned directly, or on a subsequent store or tpc_vote call. The return value may be: - None - A new serial (string) for the object, or - An iterable of object-id and serial pairs giving new serials for objects. A serial, returned as a string or in a sequence of oid/serial pairs, may be the special value ZODB.ConflictResolution.ResolvedSerial to indicate that a conflict occured and that the object should be invalidated. Several different exceptions may be raised when an error occurs. ConflictError is raised when serial does not match the most recent serial number for object oid and the conflict was not resolved by the storage. StorageTransactionError is raised when transaction does not match the current transaction. StorageError or, more often, a subclass of it is raised when an internal error occurs while the storage is handling the store() call. """ def loadBlob(oid, serial): """Return the filename of the Blob data for this OID and serial. Returns a filename. Raises POSKeyError if the blobfile cannot be found. """ def openCommittedBlobFile(oid, serial, blob=None): """Return a file for committed data for the given object id and serial If a blob is provided, then a BlobFile object is returned, otherwise, an ordinary file is returned. In either case, the file is opened for binary reading. This method is used to allow storages that cache blob data to make sure that data are available at least long enough for the file to be opened. """ def temporaryDirectory(): """Return a directory that should be used for uncommitted blob data. If Blobs use this, then commits can be performed with a simple rename. """ class IBlobStorageRestoreable(IBlobStorage, IStorageRestoreable): def restoreBlob(oid, serial, data, blobfilename, prev_txn, transaction): """Write blob data already committed in a separate database See the restore and storeBlob methods. """ class IBroken(Interface): """Broken objects are placeholders for objects that can no longer be created because their class has gone away. They cannot be modified, but they retain their state. This allows them to be rebuild should the missing class be found again. A broken object's __class__ can be used to determine the original class' name (__name__) and module (__module__). The original object's state and initialization arguments are available in broken object attributes to aid analysis and reconstruction. """ def __setattr__(name, value): """You cannot modify broken objects. This will raise a ZODB.broken.BrokenModified exception. """ __Broken_newargs__ = Attribute("Arguments passed to __new__.") __Broken_initargs__ = Attribute("Arguments passed to __init__.") __Broken_state__ = Attribute("Value passed to __setstate__.") class BlobError(Exception): pass class StorageStopIteration(IndexError, StopIteration): """A combination of StopIteration and IndexError to provide a backwards-compatible exception. 
""" ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/loglevels.py000066400000000000000000000033601230730566700230050ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Supplies custom logging levels BLATHER and TRACE. $Revision: 1.1 $ """ import logging __all__ = ["BLATHER", "TRACE"] # In the days of zLOG, there were 7 standard log levels, and ZODB/ZEO used # all of them. Here's how they map to the logging package's 5 standard # levels: # # zLOG logging # ------------- --------------- # PANIC (300) FATAL, CRITICAL (50) # ERROR (200) ERROR (40) # WARNING, PROBLEM (100) WARN (30) # INFO (0) INFO (20) # BLATHER (-100) none -- defined here as BLATHER (15) # DEBUG (-200) DEBUG (10) # TRACE (-300) none -- defined here as TRACE (5) # # TRACE is used by ZEO for extremely verbose trace output, enabled only # when chasing bottom-level communications bugs. It really should be at # a lower level than DEBUG. # # BLATHER is a harder call, and various instances could probably be folded # into INFO or DEBUG without real harm. BLATHER = 15 TRACE = 5 logging.addLevelName("BLATHER", BLATHER) logging.addLevelName("TRACE", TRACE) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/persistentclass.py000066400000000000000000000146311230730566700242420ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Persistent Class Support $Id$ """ # Notes: # # Persistent classes are non-ghostable. This has some interesting # ramifications: # # - When an object is invalidated, it must reload it's state # # - When an object is loaded from the database, it's state must be # loaded. Unfortunately, there isn't a clear signal when an object is # loaded from the database. This should probably be fixed. # # In the mean time, we need to infer. This should be viewed as a # short term hack. # # Here's the strategy we'll use: # # - We'll have a need to be loaded flag that we'll set in # __new__, through an extra argument. # # - When setting _p_oid and _p_jar, if both are set and we need to be # loaded, then we'll load out state. # # - We'll use _p_changed is None to indicate that we're in this state. # class _p_DataDescr(object): # Descr used as base for _p_ data. Data are stored in # _p_class_dict. 
def __init__(self, name): self.__name__ = name def __get__(self, inst, cls): if inst is None: return self if '__global_persistent_class_not_stored_in_DB__' in inst.__dict__: raise AttributeError(self.__name__) return inst._p_class_dict.get(self.__name__) def __set__(self, inst, v): inst._p_class_dict[self.__name__] = v def __delete__(self, inst): raise AttributeError(self.__name__) class _p_oid_or_jar_Descr(_p_DataDescr): # Special descr for _p_oid and _p_jar that loads # state when set if both are set and and _p_changed is None # # See notes above def __set__(self, inst, v): get = inst._p_class_dict.get if v == get(self.__name__): return inst._p_class_dict[self.__name__] = v jar = get('_p_jar') if (jar is not None and get('_p_oid') is not None and get('_p_changed') is None ): jar.setstate(inst) class _p_ChangedDescr(object): # descriptor to handle special weird emantics of _p_changed def __get__(self, inst, cls): if inst is None: return self return inst._p_class_dict['_p_changed'] def __set__(self, inst, v): if v is None: return inst._p_class_dict['_p_changed'] = bool(v) def __delete__(self, inst): inst._p_invalidate() class _p_MethodDescr(object): """Provide unassignable class attributes """ def __init__(self, func): self.func = func def __get__(self, inst, cls): if inst is None: return cls return self.func.__get__(inst, cls) def __set__(self, inst, v): raise AttributeError(self.__name__) def __delete__(self, inst): raise AttributeError(self.__name__) special_class_descrs = '__dict__', '__weakref__' class PersistentMetaClass(type): _p_jar = _p_oid_or_jar_Descr('_p_jar') _p_oid = _p_oid_or_jar_Descr('_p_oid') _p_changed = _p_ChangedDescr() _p_serial = _p_DataDescr('_p_serial') def __new__(self, name, bases, cdict, _p_changed=False): cdict = dict([(k, v) for (k, v) in cdict.items() if not k.startswith('_p_')]) cdict['_p_class_dict'] = {'_p_changed': _p_changed} return super(PersistentMetaClass, self).__new__( self, name, bases, cdict) def __getnewargs__(self): return self.__name__, self.__bases__, {}, None __getnewargs__ = _p_MethodDescr(__getnewargs__) def _p_maybeupdate(self, name): get = self._p_class_dict.get data_manager = get('_p_jar') if ( (data_manager is not None) and (get('_p_oid') is not None) and (get('_p_changed') == False) ): self._p_changed = True data_manager.register(self) def __setattr__(self, name, v): if not ((name.startswith('_p_') or name.startswith('_v'))): self._p_maybeupdate(name) super(PersistentMetaClass, self).__setattr__(name, v) def __delattr__(self, name): if not ((name.startswith('_p_') or name.startswith('_v'))): self._p_maybeupdate(name) super(PersistentMetaClass, self).__delattr__(name) def _p_deactivate(self): # persistent classes can't be ghosts pass _p_deactivate = _p_MethodDescr(_p_deactivate) def _p_invalidate(self): # reset state self._p_class_dict['_p_changed'] = None self._p_jar.setstate(self) _p_invalidate = _p_MethodDescr(_p_invalidate) def __getstate__(self): return (self.__bases__, dict([(k, v) for (k, v) in self.__dict__.items() if not (k.startswith('_p_') or k.startswith('_v_') or k in special_class_descrs ) ]), ) __getstate__ = _p_MethodDescr(__getstate__) def __setstate__(self, state): self.__bases__, cdict = state cdict = dict([(k, v) for (k, v) in cdict.items() if not k.startswith('_p_')]) _p_class_dict = self._p_class_dict self._p_class_dict = {} to_remove = [k for k in self.__dict__ if ((k not in cdict) and (k not in special_class_descrs) and (k != '_p_class_dict') )] for k in to_remove: delattr(self, k) for k, v in cdict.items(): 
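            # Reapply the saved attributes with setattr; _p_class_dict was
            # temporarily emptied above, so _p_maybeupdate finds no jar and
            # the class is not re-registered as changed while its state is
            # restored.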
setattr(self, k, v) self._p_class_dict = _p_class_dict self._p_changed = False __setstate__ = _p_MethodDescr(__setstate__) def _p_activate(self): self._p_jar.setstate(self) _p_activate = _p_MethodDescr(_p_activate) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/persistentclass.txt000066400000000000000000000166331230730566700244350ustar00rootroot00000000000000================== Persistent Classes ================== NOTE: persistent classes are EXPERIMENTAL and, in some sense, incomplete. This module exists largely to test changes made to support Zope 2 ZClasses, with their historical flaws. The persistentclass module provides a meta class that can be used to implement persistent classes. Persistent classes have the following properties: - They cannot be turned into ghosts - They can only contain picklable subobjects - They don't live in regular file-system modules Let's look at an example: >>> def __init__(self, name): ... self.name = name >>> def foo(self): ... return self.name, self.kind >>> import ZODB.persistentclass >>> class C: ... __metaclass__ = ZODB.persistentclass.PersistentMetaClass ... __init__ = __init__ ... __module__ = '__zodb__' ... foo = foo ... kind = 'sample' This example is obviously a bit contrived. In particular, we defined the methods outside of the class. Why? Because all of the items in a persistent class must be picklable. We defined the methods as global functions to make them picklable. Also note that we explictly set the module. Persistent classes don't live in normal Python modules. Rather, they live in the database. We use information in ``__module__`` to record where in the database. When we want to use a database, we will need to supply a custom class factory to load instances of the class. The class we created works a lot like other persistent objects. It has standard standard persistent attributes: >>> C._p_oid >>> C._p_jar >>> C._p_serial >>> C._p_changed False Because we haven't saved the object, the jar, oid, and serial are all None and it's not changed. We can create and use instances of the class: >>> c = C('first') >>> c.foo() ('first', 'sample') We can modify the class and none of the persistent attributes will change because the object hasn't been saved. >>> def bar(self): ... print 'bar', self.name >>> C.bar = bar >>> c.bar() bar first >>> C._p_oid >>> C._p_jar >>> C._p_serial >>> C._p_changed False Now, we can store the class in a database. We're going to use an explicit transaction manager so that we can show parallel transactions without having to use threads. >>> import transaction >>> tm = transaction.TransactionManager() >>> connection = some_database.open(transaction_manager=tm) >>> connection.root()['C'] = C >>> tm.commit() Now, if we look at the persistence variables, we'll see that they have values: >>> C._p_oid '\x00\x00\x00\x00\x00\x00\x00\x01' >>> C._p_jar is not None True >>> C._p_serial is not None True >>> C._p_changed False Now, if we modify the class: >>> def baz(self): ... print 'baz', self.name >>> C.baz = baz >>> c.baz() baz first We'll see that the class has changed: >>> C._p_changed True If we abort the transaction: >>> tm.abort() Then the class will return to it's prior state: >>> c.baz() Traceback (most recent call last): ... AttributeError: 'C' object has no attribute 'baz' >>> c.bar() bar first We can open another connection and access the class there. 
    >>> tm2 = transaction.TransactionManager()
    >>> connection2 = some_database.open(transaction_manager=tm2)

    >>> C2 = connection2.root()['C']
    >>> c2 = C2('other')
    >>> c2.bar()
    bar other

If we make changes without committing them:

    >>> C.bar = baz
    >>> c.bar()
    baz first

    >>> C is C2
    False

Other connections are unaffected:

    >>> connection2.sync()
    >>> c2.bar()
    bar other

Until we commit:

    >>> tm.commit()
    >>> connection2.sync()
    >>> c2.bar()
    baz other

Similarly, we don't see changes made in other connections:

    >>> C2.color = 'red'
    >>> tm2.commit()
    >>> c.color
    Traceback (most recent call last):
    ...
    AttributeError: 'C' object has no attribute 'color'

until we sync:

    >>> connection.sync()
    >>> c.color
    'red'

Instances of Persistent Classes
-------------------------------

We can, of course, store instances of persistent classes in the database:

    >>> c.color = 'blue'
    >>> connection.root()['c'] = c
    >>> tm.commit()
    >>> connection2.sync()
    >>> connection2.root()['c'].color
    'blue'

NOTE: If a non-persistent instance of a persistent class is copied, the
class may be copied as well. This is usually not the desired result.

Persistent instances of persistent classes
------------------------------------------

Persistent instances of persistent classes are handled differently than
normal instances.  When we copy a persistent instance of a persistent
class, we want to avoid copying the class.

Let's create a persistent class that subclasses Persistent:

    >>> import persistent
    >>> class P(persistent.Persistent, C):
    ...     __module__ = '__zodb__'
    ...     color = 'green'

    >>> connection.root()['P'] = P

    >>> import persistent.mapping
    >>> connection.root()['obs'] = persistent.mapping.PersistentMapping()
    >>> p = P('p')
    >>> connection.root()['obs']['p'] = p
    >>> tm.commit()

You might be wondering why we didn't just stick 'p' into the root object.
We created an intermediate persistent object instead.  We are storing
persistent classes in the root object.  To create a ghost for a persistent
instance of a persistent class, we need to be able to access the root
object, and it must be loaded first.  If the instance was in the root
object, we'd be unable to create it while loading the root object.

Now, if we try to load it, we get a broken object:

    >>> connection2.sync()
    >>> connection2.root()['obs']['p']

because the module `__zodb__` can't be loaded.  We need to provide a
class factory that knows about this special module.  Here we'll supply a
sample class factory that looks up a class name in the database root if
the module is `__zodb__`.  It falls back to the normal class lookup for
other modules:

    >>> from ZODB.broken import find_global
    >>> def classFactory(connection, modulename, globalname):
    ...     if modulename == '__zodb__':
    ...         return connection.root()[globalname]
    ...     return find_global(modulename, globalname)

    >>> some_database.classFactory = classFactory

Normally, the classFactory should be set before a database is opened.
We'll reopen the connections we're using.
We'll assign the old connections to a variable first to prevent getting them from the connection pool: >>> old = connection, connection2 >>> connection = some_database.open(transaction_manager=tm) >>> connection2 = some_database.open(transaction_manager=tm2) Now, we can read the object: >>> connection2.root()['obs']['p'].color 'green' >>> connection2.root()['obs']['p'].color = 'blue' >>> tm2.commit() >>> connection.sync() >>> p = connection.root()['obs']['p'] >>> p.color 'blue' Copying ------- If we copy an instance via export/import, the copy and the original share the same class: >>> file = connection.exportFile(p._p_oid) >>> file.seek(0) >>> cp = connection.importFile(file) >>> file.close() >>> cp.color 'blue' >>> cp is not p True >>> cp.__class__ is p.__class__ True >>> tm.abort() XXX test abort of import ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/000077500000000000000000000000001230730566700221245ustar00rootroot00000000000000ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/README.txt000066400000000000000000000073511230730566700236300ustar00rootroot00000000000000This directory contains a collection of utilities for managing ZODB databases. Some are more useful than others. If you install ZODB using distutils ("python setup.py install"), a few of these will be installed. Unless otherwise noted, these scripts are invoked with the name of the Data.fs file as their only argument. Example: checkbtrees.py data.fs. analyze.py -- a transaction analyzer for FileStorage Reports on the data in a FileStorage. The report is organized by class. It shows total data, as well as separate reports for current and historical revisions of objects. checkbtrees.py -- checks BTrees in a FileStorage for corruption Attempts to find all the BTrees contained in a Data.fs, calls their _check() methods, and runs them through BTrees.check.check(). fsdump.py -- summarize FileStorage contents, one line per revision Prints a report of FileStorage contents, with one line for each transaction and one line for each data record in that transaction. Includes time stamps, file positions, and class names. fsoids.py -- trace all uses of specified oids in a FileStorage For heavy debugging. A set of oids is specified by text file listing and/or command line. A report is generated showing all uses of these oids in the database: all new-revision creation/modifications, all references from all revisions of other objects, and all creation undos. fstest.py -- simple consistency checker for FileStorage usage: fstest.py [-v] data.fs The fstest tool will scan all the data in a FileStorage and report an error if it finds any corrupt transaction data. The tool will print a message when the first error is detected an exit. The tool accepts one or more -v arguments. If a single -v is used, it will print a line of text for each transaction record it encounters. If two -v arguments are used, it will also print a line of text for each object. The objects for a transaction will be printed before the transaction itself. Note: It does not check the consistency of the object pickles. It is possible for the damage to occur only in the part of the file that stores object pickles. Those errors will go undetected. space.py -- report space used by objects in a FileStorage usage: space.py [-v] data.fs This ignores revisions and versions. 
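Most of the FileStorage analysis tools in this directory walk the storage
with its transaction iterator.  A minimal sketch of that pattern, for
orientation only (the Data.fs path is illustrative; run it against a copy,
not a live database):

    from ZODB.FileStorage import FileStorage

    fs = FileStorage('Data.fs', read_only=1)
    records = 0
    pickle_bytes = 0
    for txn in fs.iterator():        # one entry per transaction
        for rec in txn:              # one data record per object revision
            if rec.data is None:     # undo of object creation; no pickle
                continue
            records += 1
            pickle_bytes += len(rec.data)
    fs.close()
    print "%d records, %d bytes of pickle data" % (records, pickle_bytes)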
netspace.py -- hackish attempt to report on size of objects usage: netspace.py [-P | -v] data.fs -P: do a pack first -v: print info for all objects, even if a traversal path isn't found Traverses objects from the database root and attempts to calculate size of object, including all reachable subobjects. parsezeolog.py -- parse BLATHER logs from ZEO server This script may be obsolete. It has not been tested against the current log output of the ZEO server. Reports on the time and size of transactions committed by a ZEO server, by inspecting log messages at BLATHER level. repozo.py -- incremental backup utility for FileStorage Run the script with the -h option to see usage details. timeout.py -- script to test transaction timeout usage: timeout.py address delay [storage-name] This script connects to a storage, begins a transaction, calls store() and tpc_vote(), and then sleeps forever. This should trigger the transaction timeout feature of the server. zodbload.py -- exercise ZODB under a heavy synthesized Zope-like load See the module docstring for details. Note that this script requires Zope. New in ZODB3 3.1.4. fsrefs.py -- check FileStorage for dangling references fstail.py -- display the most recent transactions in a FileStorage usage: fstail.py [-n nxtn] data.fs The most recent ntxn transactions are displayed, to stdout. Optional argument -n specifies ntxn, and defaults to 10. migrate.py -- do a storage migration and gather statistics See the module docstring for details. ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/__init__.py000066400000000000000000000000021230730566700242250ustar00rootroot00000000000000# ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/analyze.py000077500000000000000000000104361230730566700241500ustar00rootroot00000000000000#!/usr/bin/env python2.4 # Based on a transaction analyzer by Matt Kromer. import pickle import sys import types from ZODB.FileStorage import FileStorage from cStringIO import StringIO class FakeError(Exception): def __init__(self, module, name): Exception.__init__(self) self.module = module self.name = name class FakeUnpickler(pickle.Unpickler): def find_class(self, module, name): raise FakeError(module, name) class Report: def __init__(self): self.OIDMAP = {} self.TYPEMAP = {} self.TYPESIZE = {} self.FREEMAP = {} self.USEDMAP = {} self.TIDS = 0 self.OIDS = 0 self.DBYTES = 0 self.COIDS = 0 self.CBYTES = 0 self.FOIDS = 0 self.FBYTES = 0 def shorten(s, n): l = len(s) if l <= n: return s while len(s) + 3 > n: # account for ... i = s.find(".") if i == -1: # In the worst case, just return the rightmost n bytes return s[-n:] else: s = s[i + 1:] l = len(s) return "..." 
+ s def report(rep): print "Processed %d records in %d transactions" % (rep.OIDS, rep.TIDS) print "Average record size is %7.2f bytes" % (rep.DBYTES * 1.0 / rep.OIDS) print ("Average transaction size is %7.2f bytes" % (rep.DBYTES * 1.0 / rep.TIDS)) print "Types used:" fmt = "%-46s %7s %9s %6s %7s" fmtp = "%-46s %7d %9d %5.1f%% %7.2f" # per-class format fmts = "%46s %7d %8dk %5.1f%% %7.2f" # summary format print fmt % ("Class Name", "Count", "TBytes", "Pct", "AvgSize") print fmt % ('-'*46, '-'*7, '-'*9, '-'*5, '-'*7) typemap = rep.TYPEMAP.keys() typemap.sort() cumpct = 0.0 for t in typemap: pct = rep.TYPESIZE[t] * 100.0 / rep.DBYTES cumpct += pct print fmtp % (shorten(t, 46), rep.TYPEMAP[t], rep.TYPESIZE[t], pct, rep.TYPESIZE[t] * 1.0 / rep.TYPEMAP[t]) print fmt % ('='*46, '='*7, '='*9, '='*5, '='*7) print "%46s %7d %9s %6s %6.2fk" % ('Total Transactions', rep.TIDS, ' ', ' ', rep.DBYTES * 1.0 / rep.TIDS / 1024.0) print fmts % ('Total Records', rep.OIDS, rep.DBYTES / 1024.0, cumpct, rep.DBYTES * 1.0 / rep.OIDS) print fmts % ('Current Objects', rep.COIDS, rep.CBYTES / 1024.0, rep.CBYTES * 100.0 / rep.DBYTES, rep.CBYTES * 1.0 / rep.COIDS) if rep.FOIDS: print fmts % ('Old Objects', rep.FOIDS, rep.FBYTES / 1024.0, rep.FBYTES * 100.0 / rep.DBYTES, rep.FBYTES * 1.0 / rep.FOIDS) def analyze(path): fs = FileStorage(path, read_only=1) fsi = fs.iterator() report = Report() for txn in fsi: analyze_trans(report, txn) return report def analyze_trans(report, txn): report.TIDS += 1 for rec in txn: analyze_rec(report, rec) def get_type(record): try: unpickled = FakeUnpickler(StringIO(record.data)).load() except FakeError, err: return "%s.%s" % (err.module, err.name) except: raise classinfo = unpickled[0] if isinstance(classinfo, types.TupleType): mod, klass = classinfo return "%s.%s" % (mod, klass) else: return str(classinfo) def analyze_rec(report, record): oid = record.oid report.OIDS += 1 if record.data is None: # No pickle -- aborted version or undo of object creation. return try: size = len(record.data) # Ignores various overhead report.DBYTES += size if oid not in report.OIDMAP: type = get_type(record) report.OIDMAP[oid] = type report.USEDMAP[oid] = size report.COIDS += 1 report.CBYTES += size else: type = report.OIDMAP[oid] fsize = report.USEDMAP[oid] report.FREEMAP[oid] = report.FREEMAP.get(oid, 0) + fsize report.USEDMAP[oid] = size report.FOIDS += 1 report.FBYTES += fsize report.CBYTES += size - fsize report.TYPEMAP[type] = report.TYPEMAP.get(type, 0) + 1 report.TYPESIZE[type] = report.TYPESIZE.get(type, 0) + size except Exception, err: print err if __name__ == "__main__": path = sys.argv[1] report(analyze(path)) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/checkbtrees.py000077500000000000000000000061011230730566700247610ustar00rootroot00000000000000#!/usr/bin/env python2.3 """Check the consistency of BTrees in a Data.fs usage: checkbtrees.py data.fs Try to find all the BTrees in a Data.fs, call their _check() methods, and run them through BTrees.check.check(). """ from types import IntType import ZODB from ZODB.FileStorage import FileStorage from BTrees.check import check # Set of oids we've already visited. Since the object structure is # a general graph, this is needed to prevent unbounded paths in the # presence of cycles. It's also helpful in eliminating redundant # checking when a BTree is pointed to by many objects. oids_seen = {} # Append (obj, path) to L if and only if obj is a persistent object # and we haven't seen it before. 
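# (The getattr(obj, '_', None) calls below are a cheap way to force a
#  ghost's state to load: touching any attribute, even a nonexistent one,
#  activates the object so its __dict__ and items are available.)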
def add_if_new_persistent(L, obj, path): global oids_seen getattr(obj, '_', None) # unghostify if hasattr(obj, '_p_oid'): oid = obj._p_oid if not oids_seen.has_key(oid): L.append((obj, path)) oids_seen[oid] = 1 def get_subobjects(obj): getattr(obj, '_', None) # unghostify sub = [] try: attrs = obj.__dict__.items() except AttributeError: attrs = () for pair in attrs: sub.append(pair) # what if it is a mapping? try: items = obj.items() except AttributeError: items = () for k, v in items: if not isinstance(k, IntType): sub.append(("", k)) if not isinstance(v, IntType): sub.append(("[%s]" % repr(k), v)) # what if it is a sequence? i = 0 while 1: try: elt = obj[i] except: break sub.append(("[%d]" % i, elt)) i += 1 return sub def main(fname=None): if fname is None: import sys try: fname, = sys.argv[1:] except: print __doc__ sys.exit(2) fs = FileStorage(fname, read_only=1) cn = ZODB.DB(fs).open() rt = cn.root() todo = [] add_if_new_persistent(todo, rt, '') found = 0 while todo: obj, path = todo.pop(0) found += 1 if not path: print "", repr(obj) else: print path, repr(obj) mod = str(obj.__class__.__module__) if mod.startswith("BTrees"): if hasattr(obj, "_check"): try: obj._check() except AssertionError, msg: print "*" * 60 print msg print "*" * 60 try: check(obj) except AssertionError, msg: print "*" * 60 print msg print "*" * 60 if found % 100 == 0: cn.cacheMinimize() for k, v in get_subobjects(obj): if k.startswith('['): # getitem newpath = "%s%s" % (path, k) else: newpath = "%s.%s" % (path, k) add_if_new_persistent(todo, v, newpath) print "total", len(fs._index), "found", found if __name__ == "__main__": main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/fsoids.py000066400000000000000000000044621230730566700237730ustar00rootroot00000000000000#!/usr/bin/env python2.3 ############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """FileStorage oid-tracer. usage: fsoids.py [-f oid_file] Data.fs [oid]... Display information about all occurrences of specified oids in a FileStorage. This is meant for heavy debugging. This includes all revisions of the oids, all objects referenced by the oids, and all revisions of all objects referring to the oids. If specified, oid_file is an input text file, containing one oid per line. oids are specified as integers, in any of Python's integer notations (typically like 0x341a). One or more oids can also be specified on the command line. The output is grouped by oid, from smallest to largest, and sub-grouped by transaction, from oldest to newest. This will not alter the FileStorage, but running against a live FileStorage is not recommended (spurious error messages may result). See testfsoids.py for a tutorial doctest. 
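Example (the path and oids are illustrative):

    fsoids.py -f oids.txt Data.fs 0x1 0x341a

reports on oids 0x1 and 0x341a plus every oid listed in oids.txt, with the
output grouped by oid as described above.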
""" import sys from ZODB.FileStorage.fsoids import Tracer def usage(): print __doc__ def main(): import getopt try: opts, args = getopt.getopt(sys.argv[1:], 'f:') if not args: usage() raise ValueError("Must specify a FileStorage") path = None for k, v in opts: if k == '-f': path = v except (getopt.error, ValueError): usage() raise c = Tracer(args[0]) for oid in args[1:]: as_int = int(oid, 0) # 0 == auto-detect base c.register_oids(as_int) if path is not None: for line in open(path): as_int = int(line, 0) c.register_oids(as_int) if not c.oids: raise ValueError("no oids specified") c.run() c.report() if __name__ == "__main__": main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/fsrefs.py000066400000000000000000000134641230730566700237760ustar00rootroot00000000000000#!/usr/bin/env python2.3 ############################################################################## # # Copyright (c) 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Check FileStorage for dangling references. usage: fsrefs.py [-v] data.fs fsrefs.py checks object sanity by trying to load the current revision of every object O in the database, and also verifies that every object directly reachable from each such O exists in the database. It's hard to explain exactly what it does because it relies on undocumented features in Python's cPickle module: many of the crucial steps of loading an object are taken, but application objects aren't actually created. This saves a lot of time, and allows fsrefs to be run even if the code implementing the object classes isn't available. A read-only connection to the specified FileStorage is made, but it is not recommended to run fsrefs against a live FileStorage. Because a live FileStorage is mutating while fsrefs runs, it's not possible for fsrefs to get a wholly consistent view of the database across the entire time fsrefs is running; spurious error messages may result. fsrefs doesn't normally produce any output. If an object fails to load, the oid of the object is given in a message saying so, and if -v was specified then the traceback corresponding to the load failure is also displayed (this is the only effect of the -v flag). Three other kinds of errors are also detected, when an object O loads OK, and directly refers to a persistent object P but there's a problem with P: - If P doesn't exist in the database, a message saying so is displayed. The unsatisifiable reference to P is often called a "dangling reference"; P is called "missing" in the error output. - If the current state of the database is such that P's creation has been undone, then P can't be loaded either. This is also a kind of dangling reference, but is identified as "object creation was undone". - If P can't be loaded (but does exist in the database), a message saying that O refers to an object that can't be loaded is displayed. fsrefs also (indirectly) checks that the .index file is sane, because fsrefs uses the index to get its idea of what constitutes "all the objects in the database". 
Note these limitations: because fsrefs only looks at the current revision of objects, it does not attempt to load objects in versions, or non-current revisions of objects; therefore fsrefs cannot find problems in versions or in non-current revisions. """ import traceback import types from ZODB.FileStorage import FileStorage from ZODB.TimeStamp import TimeStamp from ZODB.utils import u64, oid_repr, get_pickle_metadata from ZODB.serialize import get_refs from ZODB.POSException import POSKeyError VERBOSE = 0 # There's a problem with oid. 'data' is its pickle, and 'serial' its # serial number. 'missing' is a list of (oid, class, reason) triples, # explaining what the problem(s) is(are). def report(oid, data, serial, missing): from_mod, from_class = get_pickle_metadata(data) if len(missing) > 1: plural = "s" else: plural = "" ts = TimeStamp(serial) print "oid %s %s.%s" % (hex(u64(oid)), from_mod, from_class) print "last updated: %s, tid=%s" % (ts, hex(u64(serial))) print "refers to invalid object%s:" % plural for oid, info, reason in missing: if isinstance(info, types.TupleType): description = "%s.%s" % info else: description = str(info) print "\toid %s %s: %r" % (oid_repr(oid), reason, description) print def main(path=None): if path is None: import sys import getopt opts, args = getopt.getopt(sys.argv[1:], "v") for k, v in opts: if k == "-v": VERBOSE += 1 path, = args fs = FileStorage(path, read_only=1) # Set of oids in the index that failed to load due to POSKeyError. # This is what happens if undo is applied to the transaction creating # the object (the oid is still in the index, but its current data # record has a backpointer of 0, and POSKeyError is raised then # because of that backpointer). undone = {} # Set of oids that were present in the index but failed to load. # This does not include oids in undone. noload = {} for oid in fs._index.keys(): try: data, serial = fs.load(oid, "") except (KeyboardInterrupt, SystemExit): raise except POSKeyError: undone[oid] = 1 except: if VERBOSE: traceback.print_exc() noload[oid] = 1 inactive = noload.copy() inactive.update(undone) for oid in fs._index.keys(): if oid in inactive: continue data, serial = fs.load(oid, "") refs = get_refs(data) missing = [] # contains 3-tuples of oid, klass-metadata, reason for ref, klass in refs: if klass is None: klass = '' if ref not in fs._index: missing.append((ref, klass, "missing")) if ref in noload: missing.append((ref, klass, "failed to load")) if ref in undone: missing.append((ref, klass, "object creation was undone")) if missing: report(oid, data, serial, missing) if __name__ == "__main__": main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/fsstats.py000077500000000000000000000131311230730566700241670ustar00rootroot00000000000000#!/usr/bin/env python2.3 """Print details statistics from fsdump output.""" import re import sys rx_txn = re.compile("tid=([0-9a-f]+).*size=(\d+)") rx_data = re.compile("oid=([0-9a-f]+) class=(\S+) size=(\d+)") def sort_byhsize(seq, reverse=False): L = [(v.size(), k, v) for k, v in seq] L.sort() if reverse: L.reverse() return [(k, v) for n, k, v in L] class Histogram(dict): def add(self, size): self[size] = self.get(size, 0) + 1 def size(self): return sum(self.itervalues()) def mean(self): product = sum([k * v for k, v in self.iteritems()]) return product / self.size() def median(self): # close enough? 
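        # Walk the bucket keys from smallest to largest, subtracting each
        # bucket's count from half the total; return the key of the bucket
        # whose count exceeds what's left -- a rough median.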
n = self.size() / 2 L = self.keys() L.sort() L.reverse() while 1: k = L.pop() if self[k] > n: return k n -= self[k] def mode(self): mode = 0 value = 0 for k, v in self.iteritems(): if v > value: value = v mode = k return mode def make_bins(self, binsize): maxkey = max(self.iterkeys()) self.binsize = binsize self.bins = [0] * (1 + maxkey / binsize) for k, v in self.iteritems(): b = k / binsize self.bins[b] += v def report(self, name, binsize=50, usebins=False, gaps=True, skip=True): if usebins: # Use existing bins with whatever size they have binsize = self.binsize else: # Make new bins self.make_bins(binsize) maxval = max(self.bins) # Print up to 40 dots for a value dot = max(maxval / 40, 1) tot = sum(self.bins) print name print "Total", tot, print "Median", self.median(), print "Mean", self.mean(), print "Mode", self.mode(), print "Max", max(self) print "One * represents", dot gap = False cum = 0 for i, n in enumerate(self.bins): if gaps and (not n or (skip and not n / dot)): if not gap: print " ..." gap = True continue gap = False p = 100 * n / tot cum += n pc = 100 * cum / tot print "%6d %6d %3d%% %3d%% %s" % ( i * binsize, n, p, pc, "*" * (n / dot)) print def class_detail(class_size): # summary of classes fmt = "%5s %6s %6s %6s %-50.50s" labels = ["num", "median", "mean", "mode", "class"] print fmt % tuple(labels) print fmt % tuple(["-" * len(s) for s in labels]) for klass, h in sort_byhsize(class_size.iteritems()): print fmt % (h.size(), h.median(), h.mean(), h.mode(), klass) print # per class details for klass, h in sort_byhsize(class_size.iteritems(), reverse=True): h.make_bins(50) if len(filter(None, h.bins)) == 1: continue h.report("Object size for %s" % klass, usebins=True) def revision_detail(lifetimes, classes): # Report per-class details for any object modified more than once for name, oids in classes.iteritems(): h = Histogram() keep = False for oid in dict.fromkeys(oids, 1): L = lifetimes.get(oid) n = len(L) h.add(n) if n > 1: keep = True if keep: h.report("Number of revisions for %s" % name, binsize=10) def main(path=None): if path is None: path = sys.argv[1] txn_objects = Histogram() # histogram of txn size in objects txn_bytes = Histogram() # histogram of txn size in bytes obj_size = Histogram() # histogram of object size n_updates = Histogram() # oid -> num updates n_classes = Histogram() # class -> num objects lifetimes = {} # oid -> list of tids class_size = {} # class -> histogram of object size classes = {} # class -> list of oids MAX = 0 objects = 0 tid = None f = open(path, "rb") for i, line in enumerate(f): if MAX and i > MAX: break if line.startswith(" data"): m = rx_data.search(line) if not m: continue oid, klass, size = m.groups() size = int(size) obj_size.add(size) n_updates.add(oid) n_classes.add(klass) h = class_size.get(klass) if h is None: h = class_size[klass] = Histogram() h.add(size) L = lifetimes.setdefault(oid, []) L.append(tid) L = classes.setdefault(klass, []) L.append(oid) objects += 1 elif line.startswith("Trans"): if tid is not None: txn_objects.add(objects) m = rx_txn.search(line) if not m: continue tid, size = m.groups() size = int(size) objects = 0 txn_bytes.add(size) f.close() print "Summary: %d txns, %d objects, %d revisions" % ( txn_objects.size(), len(n_updates), n_updates.size()) print txn_bytes.report("Transaction size (bytes)", binsize=1024) txn_objects.report("Transaction size (objects)", binsize=10) obj_size.report("Object size", binsize=128) # object lifetime info h = Histogram() for k, v in lifetimes.items(): h.add(len(v)) 
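    # h now maps revision-count -> number of oids modified that many times;
    # report it in buckets of 10.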
h.report("Number of revisions", binsize=10, skip=False) # details about revisions revision_detail(lifetimes, classes) class_detail(class_size) if __name__ == "__main__": main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/fstail.py000066400000000000000000000031651230730566700237650ustar00rootroot00000000000000#!/usr/bin/env python2.3 ############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Tool to dump the last few transactions from a FileStorage.""" from ZODB.fstools import prev_txn import binascii import getopt import sys try: from hashlib import sha1 except ImportError: from sha import sha as sha1 def main(path, ntxn): f = open(path, "rb") f.seek(0, 2) th = prev_txn(f) i = ntxn while th and i > 0: hash = sha1(th.get_raw_data()).digest() l = len(str(th.get_timestamp())) + 1 th.read_meta() print "%s: hash=%s" % (th.get_timestamp(), binascii.hexlify(hash)) print ("user=%r description=%r length=%d offset=%d" % (th.user, th.descr, th.length, th.get_data_offset())) print th = th.prev_txn() i -= 1 def Main(): ntxn = 10 opts, args = getopt.getopt(sys.argv[1:], "n:") path, = args for k, v in opts: if k == '-n': ntxn = int(v) main(path, ntxn) if __name__ == "__main__": Main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/fstest.py000066400000000000000000000151621230730566700240130ustar00rootroot00000000000000#!/usr/bin/env python2.3 ############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Simple consistency checker for FileStorage. usage: fstest.py [-v] data.fs The fstest tool will scan all the data in a FileStorage and report an error if it finds any corrupt transaction data. The tool will print a message when the first error is detected, then exit. The tool accepts one or more -v arguments. If a single -v is used, it will print a line of text for each transaction record it encounters. If two -v arguments are used, it will also print a line of text for each object. The objects for a transaction will be printed before the transaction itself. Note: It does not check the consistency of the object pickles. It is possible for the damage to occur only in the part of the file that stores object pickles. Those errors will go undetected. """ # The implementation is based closely on the read_index() function in # ZODB.FileStorage. If anything about the FileStorage layout changes, # this file will need to be udpated. 
import string import struct import sys class FormatError(ValueError): """There is a problem with the format of the FileStorage.""" class Status: checkpoint = 'c' undone = 'u' packed_version = 'FS21' TREC_HDR_LEN = 23 DREC_HDR_LEN = 42 VERBOSE = 0 def hexify(s): """Format an 8-bite string as hex""" l = [] for c in s: h = hex(ord(c)) if h[:2] == '0x': h = h[2:] if len(h) == 1: l.append("0") l.append(h) return "0x" + string.join(l, '') def chatter(msg, level=1): if VERBOSE >= level: sys.stdout.write(msg) def U64(v): """Unpack an 8-byte string as a 64-bit long""" h, l = struct.unpack(">II", v) if h: return (h << 32) + l else: return l def check(path): file = open(path, 'rb') file.seek(0, 2) file_size = file.tell() if file_size == 0: raise FormatError("empty file") file.seek(0) if file.read(4) != packed_version: raise FormatError("invalid file header") pos = 4L tid = '\000' * 8 # lowest possible tid to start i = 0 while pos: _pos = pos pos, tid = check_trec(path, file, pos, tid, file_size) if tid is not None: chatter("%10d: transaction tid %s #%d \n" % (_pos, hexify(tid), i)) i = i + 1 def check_trec(path, file, pos, ltid, file_size): """Read an individual transaction record from file. Returns the pos of the next transaction and the transaction id. It also leaves the file pointer set to pos. The path argument is used for generating error messages. """ h = file.read(TREC_HDR_LEN) if not h: return None, None if len(h) != TREC_HDR_LEN: raise FormatError("%s truncated at %s" % (path, pos)) tid, stl, status, ul, dl, el = struct.unpack(">8s8scHHH", h) tmeta_len = TREC_HDR_LEN + ul + dl + el if tid <= ltid: raise FormatError("%s time-stamp reduction at %s: %s <= %s" % (path, pos, hexify(tid), hexify(ltid))) ltid = tid tl = U64(stl) # transaction record length - 8 if pos + tl + 8 > file_size: raise FormatError("%s truncated possibly because of" " damaged records at %s" % (path, pos)) if status == Status.checkpoint: raise FormatError("%s checkpoint flag was not cleared at %s" % (path, pos)) if status not in ' up': raise FormatError("%s has invalid status '%s' at %s" % (path, status, pos)) if tmeta_len > tl: raise FormatError("%s has an invalid transaction header" " at %s" % (path, pos)) tpos = pos tend = tpos + tl if status != Status.undone: pos = tpos + tmeta_len file.read(ul + dl + el) # skip transaction metadata i = 0 while pos < tend: _pos = pos pos, oid = check_drec(path, file, pos, tpos, tid) if pos > tend: raise FormatError("%s has data records that extend beyond" " the transaction record; end at %s" % (path, pos)) chatter("%10d: object oid %s #%d\n" % (_pos, hexify(oid), i), level=2) i = i + 1 file.seek(tend) rtl = file.read(8) if rtl != stl: raise FormatError("%s has inconsistent transaction length" " for undone transaction at %s" % (path, pos)) pos = tend + 8 return pos, tid def check_drec(path, file, pos, tpos, tid): """Check a data record for the current transaction record""" h = file.read(DREC_HDR_LEN) if len(h) != DREC_HDR_LEN: raise FormatError("%s truncated at %s" % (path, pos)) oid, serial, _prev, _tloc, vlen, _plen = ( struct.unpack(">8s8s8s8sH8s", h)) prev = U64(_prev) tloc = U64(_tloc) plen = U64(_plen) dlen = DREC_HDR_LEN + (plen or 8) if vlen: dlen = dlen + 16 + vlen file.seek(8, 1) pv = U64(file.read(8)) file.seek(vlen, 1) # skip the version data if tloc != tpos: raise FormatError("%s data record exceeds transaction record " "at %s: tloc %d != tpos %d" % (path, pos, tloc, tpos)) pos = pos + dlen if plen: file.seek(plen, 1) else: file.seek(8, 1) # _loadBack() ? 
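        # (A zero pickle length means this record stores an 8-byte
        #  backpointer to an earlier data record instead of a pickle, which
        #  is what the 8-byte seek above skips.)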
return pos, oid def usage(): print __doc__ sys.exit(-1) def main(args=None): if args is None: args = sys.argv[1:] import getopt global VERBOSE try: opts, args = getopt.getopt(args, 'v') if len(args) != 1: raise ValueError("expected one argument") for k, v in opts: if k == '-v': VERBOSE = VERBOSE + 1 except (getopt.error, ValueError): usage() try: check(args[0]) except FormatError, msg: print msg sys.exit(-1) chatter("no errors detected") if __name__ == "__main__": main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/manual_tests/000077500000000000000000000000001230730566700246235ustar00rootroot00000000000000ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/manual_tests/test-checker.fs000066400000000000000000000014421230730566700275370ustar00rootroot00000000000000FS21?ž4\B3“ initial database creation?ž4\B39(cPersistence PersistentMapping qNt.}qU _containerq}s.“?ž4™"£™š ?ž4™"£™4ŸY((U PersistenceqUPersistentMappingqtqNt.}qU _containerq}qUaUThis is a test.qss.š?ž5±\w1 ?ž5±\w¶A„((U PersistenceqUPersistentMappingqtqNt.}qU _containerq}q(UaUThis is a test.qK(Uq(hUPersistentMappingq ttQus.?ž5±\wAB((U PersistenceqUPersistentMappingqtqNt.}qU _containerq}qs.1?ž5.5°w  ?ž5.5°wz_((U PersistenceqUPersistentMappingqtqNt.}qU _containerq}qUaUThis is another test.qss. ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/manual_tests/testfstest.py000066400000000000000000000125571230730566700274170ustar00rootroot00000000000000"""Verify that fstest.py can find errors. Note: To run this test script fstest.py must be on your PYTHONPATH. """ from cStringIO import StringIO import re import struct import unittest import ZODB.tests.util import fstest from fstest import FormatError, U64 class TestCorruptedFS(ZODB.tests.util.TestCase): f = open('test-checker.fs', 'rb') datafs = f.read() f.close() del f def setUp(self): ZODB.tests.util.TestCase.setUp(self) self._temp = 'Data.fs' self._file = open(self._temp, 'wb') def tearDown(self): if not self._file.closed: self._file.close() ZODB.tests.util.TestCase.tearDown(self) def noError(self): if not self._file.closed: self._file.close() fstest.check(self._temp) def detectsError(self, rx): if not self._file.closed: self._file.close() try: fstest.check(self._temp) except FormatError, msg: mo = re.search(rx, str(msg)) self.failIf(mo is None, "unexpected error: %s" % msg) else: self.fail("fstest did not detect corruption") def getHeader(self): buf = self._datafs.read(16) if not buf: return 0, '' tl = U64(buf[8:]) return tl, buf def copyTransactions(self, n): """Copy at most n transactions from the good data""" f = self._datafs = StringIO(self.datafs) self._file.write(f.read(4)) for i in range(n): tl, data = self.getHeader() if not tl: return self._file.write(data) rec = f.read(tl - 8) self._file.write(rec) def testGood(self): self._file.write(self.datafs) self.noError() def testTwoTransactions(self): self.copyTransactions(2) self.noError() def testEmptyFile(self): self.detectsError("empty file") def testInvalidHeader(self): self._file.write('SF12') self.detectsError("invalid file header") def testTruncatedTransaction(self): self._file.write(self.datafs[:4+22]) self.detectsError("truncated") def testCheckpointFlag(self): self.copyTransactions(2) tl, data = self.getHeader() assert tl > 0, "ran out of good transaction data" self._file.write(data) self._file.write('c') self._file.write(self._datafs.read(tl - 9)) self.detectsError("checkpoint flag") def testInvalidStatus(self): self.copyTransactions(2) tl, data = self.getHeader() assert tl > 0, 
"ran out of good transaction data" self._file.write(data) self._file.write('Z') self._file.write(self._datafs.read(tl - 9)) self.detectsError("invalid status") def testTruncatedRecord(self): self.copyTransactions(3) tl, data = self.getHeader() assert tl > 0, "ran out of good transaction data" self._file.write(data) buf = self._datafs.read(tl / 2) self._file.write(buf) self.detectsError("truncated possibly") def testBadLength(self): self.copyTransactions(2) tl, data = self.getHeader() assert tl > 0, "ran out of good transaction data" self._file.write(data) buf = self._datafs.read(tl - 8) self._file.write(buf[0]) assert tl <= 1<<16, "can't use this transaction for this test" self._file.write("\777\777") self._file.write(buf[3:]) self.detectsError("invalid transaction header") def testDecreasingTimestamps(self): self.copyTransactions(0) tl, data = self.getHeader() buf = self._datafs.read(tl - 8) t1 = data + buf tl, data = self.getHeader() buf = self._datafs.read(tl - 8) t2 = data + buf self._file.write(t2[:8] + t1[8:]) self._file.write(t1[:8] + t2[8:]) self.detectsError("time-stamp") def testTruncatedData(self): # This test must re-write the transaction header length in # order to trigger the error in check_drec(). If it doesn't, # the truncated data record would also caught a truncated # transaction record. self.copyTransactions(1) tl, data = self.getHeader() pos = self._file.tell() self._file.write(data) buf = self._datafs.read(tl - 8) hdr = buf[:15] ul, dl, el = struct.unpack(">HHH", hdr[-6:]) self._file.write(buf[:15 + ul + dl + el]) data = buf[15 + ul + dl + el:] self._file.write(data[:24]) self._file.seek(pos + 8, 0) newlen = struct.pack(">II", 0, tl - (len(data) - 24)) self._file.write(newlen) self.detectsError("truncated at") def testBadDataLength(self): self.copyTransactions(1) tl, data = self.getHeader() self._file.write(data) buf = self._datafs.read(tl - 8) hdr = buf[:7] # write the transaction meta data ul, dl, el = struct.unpack(">HHH", hdr[-6:]) self._file.write(buf[:7 + ul + dl + el]) # write the first part of the data header data = buf[7 + ul + dl + el:] self._file.write(data[:24]) self._file.write("\000" * 4 + "\077" + "\000" * 3) self._file.write(data[32:]) self.detectsError("record exceeds transaction") if __name__ == "__main__": unittest.main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/migrate.py000077500000000000000000000257561230730566700241500ustar00rootroot00000000000000#!/usr/bin/env python2.3 ############################################################################## # # Copyright (c) 2001, 2002, 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """A script to gather statistics while doing a storage migration. This is very similar to a standard storage's copyTransactionsFrom() method, except that it's geared to run as a script, and it collects useful pieces of information as it's working. This script can be used to stress test a storage since it blasts transactions at it as fast as possible. 
You can get a good sense of the performance of a storage by running this script. Actually it just counts the size of pickles in the transaction via the iterator protocol, so storage overheads aren't counted. Usage: %(PROGRAM)s [options] [source-storage-args] [destination-storage-args] Options: -S sourcetype --stype=sourcetype This is the name of a recognized type for the source database. Use -T to print out the known types. Defaults to "file". -D desttype --dtype=desttype This is the name of the recognized type for the destination database. Use -T to print out the known types. Defaults to "file". -o filename --output=filename Print results in filename, otherwise stdout. -m txncount --max=txncount Stop after committing txncount transactions. -k txncount --skip=txncount Skip the first txncount transactions. -p/--profile Turn on specialized profiling. -t/--timestamps Print tids as timestamps. -T/--storage_types Print all the recognized storage types and exit. -v/--verbose Turns on verbose output. Multiple -v options increase the verbosity. -h/--help Print this message and exit. Positional arguments: source-storage-args: Semicolon separated list of arguments for the source storage, as key=val pairs. E.g. "file_name=Data.fs;read_only=1" destination-storage-args: Comma separated list of arguments for the source storage, as key=val pairs. E.g. "name=full;frequency=3600" """ import re import sys import time import getopt import marshal import profile from ZODB import utils from ZODB import StorageTypes from ZODB.TimeStamp import TimeStamp PROGRAM = sys.argv[0] ZERO = '\0'*8 try: True, False except NameError: True = 1 False = 0 def usage(code, msg=''): print >> sys.stderr, __doc__ % globals() if msg: print >> sys.stderr, msg sys.exit(code) def error(code, msg): print >> sys.stderr, msg print "use --help for usage message" sys.exit(code) def main(): try: opts, args = getopt.getopt( sys.argv[1:], 'hvo:pm:k:D:S:Tt', ['help', 'verbose', 'output=', 'profile', 'storage_types', 'max=', 'skip=', 'dtype=', 'stype=', 'timestamps']) except getopt.error, msg: error(2, msg) class Options: stype = 'FileStorage' dtype = 'FileStorage' verbose = 0 outfile = None profilep = False maxtxn = -1 skiptxn = -1 timestamps = False options = Options() for opt, arg in opts: if opt in ('-h', '--help'): usage(0) elif opt in ('-v', '--verbose'): options.verbose += 1 elif opt in ('-T', '--storage_types'): print_types() sys.exit(0) elif opt in ('-S', '--stype'): options.stype = arg elif opt in ('-D', '--dtype'): options.dtype = arg elif opt in ('-o', '--output'): options.outfile = arg elif opt in ('-p', '--profile'): options.profilep = True elif opt in ('-m', '--max'): options.maxtxn = int(arg) elif opt in ('-k', '--skip'): options.skiptxn = int(arg) elif opt in ('-t', '--timestamps'): options.timestamps = True if len(args) > 2: error(2, "too many arguments") srckws = {} if len(args) > 0: srcargs = args[0] for kv in re.split(r';\s*', srcargs): key, val = kv.split('=') srckws[key] = val destkws = {} if len(args) > 1: destargs = args[1] for kv in re.split(r';\s*', destargs): key, val = kv.split('=') destkws[key] = val if options.stype not in StorageTypes.storage_types.keys(): usage(2, 'Source database type must be provided') if options.dtype not in StorageTypes.storage_types.keys(): usage(2, 'Destination database type must be provided') # Open the output file if options.outfile is None: options.outfp = sys.stdout options.outclosep = False else: options.outfp = open(options.outfile, 'w') options.outclosep = True if options.verbose > 
0: print 'Opening source database...' modname, sconv = StorageTypes.storage_types[options.stype] kw = sconv(**srckws) __import__(modname) sclass = getattr(sys.modules[modname], options.stype) srcdb = sclass(**kw) if options.verbose > 0: print 'Opening destination database...' modname, dconv = StorageTypes.storage_types[options.dtype] kw = dconv(**destkws) __import__(modname) dclass = getattr(sys.modules[modname], options.dtype) dstdb = dclass(**kw) try: t0 = time.time() doit(srcdb, dstdb, options) t1 = time.time() if options.verbose > 0: print 'Migration time: %8.3f' % (t1-t0) finally: # Done srcdb.close() dstdb.close() if options.outclosep: options.outfp.close() def doit(srcdb, dstdb, options): outfp = options.outfp profilep = options.profilep verbose = options.verbose # some global information largest_pickle = 0 largest_txn_in_size = 0 largest_txn_in_objects = 0 total_pickle_size = 0L total_object_count = 0 # Ripped from BaseStorage.copyTransactionsFrom() ts = None ok = True prevrevids = {} counter = 0 skipper = 0 if options.timestamps: print "%4s. %26s %6s %8s %5s %5s %5s %5s %5s" % ( "NUM", "TID AS TIMESTAMP", "OBJS", "BYTES", # Does anybody know what these times mean? "t4-t0", "t1-t0", "t2-t1", "t3-t2", "t4-t3") else: print "%4s. %20s %6s %8s %6s %6s %6s %6s %6s" % ( "NUM", "TRANSACTION ID", "OBJS", "BYTES", # Does anybody know what these times mean? "t4-t0", "t1-t0", "t2-t1", "t3-t2", "t4-t3") for txn in srcdb.iterator(): skipper += 1 if skipper <= options.skiptxn: continue counter += 1 if counter > options.maxtxn >= 0: break tid = txn.tid if ts is None: ts = TimeStamp(tid) else: t = TimeStamp(tid) if t <= ts: if ok: print >> sys.stderr, ( 'Time stamps are out of order %s, %s' % (ts, t)) ok = False ts = t.laterThan(ts) tid = `ts` else: ts = t if not ok: print >> sys.stderr, ( 'Time stamps are back in order %s' % t) ok = True if verbose > 1: print ts prof = None if profilep and (counter % 100) == 0: prof = profile.Profile() objects = 0 size = 0 newrevids = RevidAccumulator() t0 = time.time() dstdb.tpc_begin(txn, tid, txn.status) t1 = time.time() for r in txn: oid = r.oid objects += 1 thissize = len(r.data) size += thissize if thissize > largest_pickle: largest_pickle = thissize if verbose > 1: if not r.version: vstr = 'norev' else: vstr = r.version print utils.U64(oid), vstr, len(r.data) oldrevid = prevrevids.get(oid, ZERO) result = dstdb.store(oid, oldrevid, r.data, r.version, txn) newrevids.store(oid, result) t2 = time.time() result = dstdb.tpc_vote(txn) t3 = time.time() newrevids.tpc_vote(result) prevrevids.update(newrevids.get_dict()) # Profile every 100 transactions if prof: prof.runcall(dstdb.tpc_finish, txn) else: dstdb.tpc_finish(txn) t4 = time.time() # record the results if objects > largest_txn_in_objects: largest_txn_in_objects = objects if size > largest_txn_in_size: largest_txn_in_size = size if options.timestamps: tidstr = str(TimeStamp(tid)) format = "%4d. %26s %6d %8d %5.3f %5.3f %5.3f %5.3f %5.3f" else: tidstr = utils.U64(tid) format = "%4d. 
%20s %6d %8d %6.4f %6.4f %6.4f %6.4f %6.4f" print >> outfp, format % (skipper, tidstr, objects, size, t4-t0, t1-t0, t2-t1, t3-t2, t4-t3) total_pickle_size += size total_object_count += objects if prof: prof.create_stats() fp = open('profile-%02d.txt' % (counter / 100), 'wb') marshal.dump(prof.stats, fp) fp.close() print >> outfp, "Largest pickle: %8d" % largest_pickle print >> outfp, "Largest transaction: %8d" % largest_txn_in_size print >> outfp, "Largest object count: %8d" % largest_txn_in_objects print >> outfp, "Total pickle size: %14d" % total_pickle_size print >> outfp, "Total object count: %8d" % total_object_count # helper to deal with differences between old-style store() return and # new-style store() return that supports ZEO import types class RevidAccumulator: def __init__(self): self.data = {} def _update_from_list(self, list): for oid, serial in list: if not isinstance(serial, types.StringType): raise serial self.data[oid] = serial def store(self, oid, result): if isinstance(result, types.StringType): self.data[oid] = result elif result is not None: self._update_from_list(result) def tpc_vote(self, result): if result is not None: self._update_from_list(result) def get_dict(self): return self.data if __name__ == '__main__': main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/migrateblobs.py000066400000000000000000000052371230730566700251570ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2008 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """A script to migrate a blob directory into a different layout. """ import logging import optparse import os import shutil from ZODB.blob import FilesystemHelper from ZODB.utils import oid_repr def link_or_copy(f1, f2): try: os.link(f1, f2) except OSError: shutil.copy(f1, f2) # Check if we actually have link try: os.link except AttributeError: link_or_copy = shutil.copy def migrate(source, dest, layout): source_fsh = FilesystemHelper(source) source_fsh.create() dest_fsh = FilesystemHelper(dest, layout) dest_fsh.create() print "Migrating blob data from `%s` (%s) to `%s` (%s)" % ( source, source_fsh.layout_name, dest, dest_fsh.layout_name) for oid, path in source_fsh.listOIDs(): dest_path = dest_fsh.getPathForOID(oid, create=True) files = os.listdir(path) for file in files: source_file = os.path.join(path, file) dest_file = os.path.join(dest_path, file) link_or_copy(source_file, dest_file) print "\tOID: %s - %s files " % (oid_repr(oid), len(files)) def main(source=None, dest=None, layout="bushy"): usage = "usage: %prog [options] " description = ("Create the new directory and migrate all blob " "data to while using the new for " "") parser = optparse.OptionParser(usage=usage, description=description) parser.add_option("-l", "--layout", default=layout, type='choice', choices=['bushy', 'lawn'], help="Define the layout to use for the new directory " "(bushy or lawn). 
Default: %default") options, args = parser.parse_args() if not len(args) == 2: parser.error("source and destination must be given") logging.getLogger().addHandler(logging.StreamHandler()) logging.getLogger().setLevel(0) source, dest = args migrate(source, dest, options.layout) if __name__ == '__main__': main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/netspace.py000066400000000000000000000064201230730566700243020ustar00rootroot00000000000000#!/usr/bin/env python2.3 """Report on the net size of objects counting subobjects. usage: netspace.py [-P | -v] data.fs -P: do a pack first -v: print info for all objects, even if a traversal path isn't found """ import ZODB from ZODB.FileStorage import FileStorage from ZODB.utils import U64, get_pickle_metadata from ZODB.referencesf import referencesf def find_paths(root, maxdist): """Find Python attribute traversal paths for objects to maxdist distance. Starting at a root object, traverse attributes up to distance levels from the root, looking for persistent objects. Return a dict mapping oids to traversal paths. TODO: Assumes that the keys of the root are not themselves persistent objects. TODO: Doesn't traverse containers. """ paths = {} # Handle the root as a special case because it's a dict objs = [] for k, v in root.items(): oid = getattr(v, '_p_oid', None) objs.append((k, v, oid, 0)) for path, obj, oid, dist in objs: if oid is not None: paths[oid] = path if dist < maxdist: getattr(obj, 'foo', None) # unghostify try: items = obj.__dict__.items() except AttributeError: continue for k, v in items: oid = getattr(v, '_p_oid', None) objs.append(("%s.%s" % (path, k), v, oid, dist + 1)) return paths def main(path): fs = FileStorage(path, read_only=1) if PACK: fs.pack() db = ZODB.DB(fs) rt = db.open().root() paths = find_paths(rt, 3) def total_size(oid): cache = {} cache_size = 1000 def _total_size(oid, seen): v = cache.get(oid) if v is not None: return v data, serialno = fs.load(oid, '') size = len(data) for suboid in referencesf(data): if seen.has_key(suboid): continue seen[suboid] = 1 size += _total_size(suboid, seen) cache[oid] = size if len(cache) == cache_size: cache.popitem() return size return _total_size(oid, {}) keys = fs._index.keys() keys.sort() keys.reverse() if not VERBOSE: # If not running verbosely, don't print an entry for an object # unless it has an entry in paths. keys = filter(paths.has_key, keys) fmt = "%8s %5d %8d %s %s.%s" for oid in keys: data, serialno = fs.load(oid, '') mod, klass = get_pickle_metadata(data) refs = referencesf(data) path = paths.get(oid, '-') print fmt % (U64(oid), len(data), total_size(oid), path, mod, klass) def Main(): import sys import getopt global PACK global VERBOSE PACK = 0 VERBOSE = 0 try: opts, args = getopt.getopt(sys.argv[1:], 'Pv') path, = args except getopt.error, err: print err print __doc__ sys.exit(2) except ValueError: print "expected one argument, got", len(args) print __doc__ sys.exit(2) for o, v in opts: if o == '-P': PACK = 1 if o == '-v': VERBOSE += 1 main(path) if __name__ == "__main__": Main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/referrers.py000066400000000000000000000017361230730566700245040ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2005 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Compute a table of object id referrers $Id$ """ from ZODB.serialize import referencesf def referrers(storage): result = {} for transaction in storage.iterator(): for record in transaction: for oid in referencesf(record.data): result.setdefault(oid, []).append((record.oid, record.tid)) return result ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/repozo.py000077500000000000000000000507061230730566700240270ustar00rootroot00000000000000#!/usr/bin/env python2.3 # repozo.py -- incremental and full backups of a Data.fs file. # # Originally written by Anthony Baxter # Significantly modified by Barry Warsaw """repozo.py -- incremental and full backups of a Data.fs file and index. Usage: %(program)s [options] Where: Exactly one of -B or -R must be specified: -B / --backup Backup current ZODB file. -R / --recover Restore a ZODB file from a backup. -v / --verbose Verbose mode. -h / --help Print this text and exit. -r dir --repository=dir Repository directory containing the backup files. This argument is required. The directory must already exist. You should not edit the files in this directory, or add your own files to it. Options for -B/--backup: -f file --file=file Source Data.fs file. This argument is required. -F / --full Force a full backup. By default, an incremental backup is made if possible (e.g., if a pack has occurred since the last incremental backup, a full backup is necessary). -Q / --quick Verify via md5 checksum only the last incremental written. This significantly reduces the disk i/o at the (theoretical) cost of inconsistency. This is a probabilistic way of determining whether a full backup is necessary. -z / --gzip Compress with gzip the backup files. Uses the default zlib compression level. By default, gzip compression is not used. -k / --kill-old-on-full If a full backup is created, remove any prior full or incremental backup files (and associated metadata files) from the repository directory. Options for -R/--recover: -D str --date=str Recover state as of this date. Specify UTC (not local) time. yyyy-mm-dd[-hh[-mm[-ss]]] By default, current time is used. -o filename --output=filename Write recovered ZODB to given file. By default, the file is written to stdout. Note: for the stdout case, the index file will **not** be restored automatically. 
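Examples (paths are illustrative):

    Make a backup (full or incremental, as appropriate) of /var/zodb/Data.fs:

        %(program)s -B -r /var/backups/zodb -f /var/zodb/Data.fs

    Recover the most recent backed-up state into a new file:

        %(program)s -R -r /var/backups/zodb -o /var/zodb/Recovered.fs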
""" import os import shutil import sys try: # the hashlib package is available from Python 2.5 from hashlib import md5 except ImportError: # the md5 package is deprecated in Python 2.6 from md5 import new as md5 import gzip import time import errno import getopt from ZODB.FileStorage import FileStorage program = sys.argv[0] BACKUP = 1 RECOVER = 2 COMMASPACE = ', ' READCHUNK = 16 * 1024 VERBOSE = False class WouldOverwriteFiles(Exception): pass class NoFiles(Exception): pass def usage(code, msg=''): outfp = sys.stderr if code == 0: outfp = sys.stdout print >> outfp, __doc__ % globals() if msg: print >> outfp, msg sys.exit(code) def log(msg, *args): if VERBOSE: # Use stderr here so that -v flag works with -R and no -o print >> sys.stderr, msg % args def parseargs(argv): global VERBOSE try: opts, args = getopt.getopt(argv, 'BRvhr:f:FQzkD:o:', ['backup', 'recover', 'verbose', 'help', 'repository=', 'file=', 'full', 'quick', 'gzip', 'kill-old-on-full', 'date=', 'output=', ]) except getopt.error, msg: usage(1, msg) class Options: mode = None # BACKUP or RECOVER file = None # name of input Data.fs file repository = None # name of directory holding backups full = False # True forces full backup date = None # -D argument, if any output = None # where to write recovered data; None = stdout quick = False # -Q flag state gzip = False # -z flag state killold = False # -k flag state options = Options() for opt, arg in opts: if opt in ('-h', '--help'): usage(0) elif opt in ('-v', '--verbose'): VERBOSE = True elif opt in ('-R', '--recover'): if options.mode is not None: usage(1, '-B and -R are mutually exclusive') options.mode = RECOVER elif opt in ('-B', '--backup'): if options.mode is not None: usage(1, '-B and -R are mutually exclusive') options.mode = BACKUP elif opt in ('-Q', '--quick'): options.quick = True elif opt in ('-f', '--file'): options.file = arg elif opt in ('-r', '--repository'): options.repository = arg elif opt in ('-F', '--full'): options.full = True elif opt in ('-D', '--date'): options.date = arg elif opt in ('-o', '--output'): options.output = arg elif opt in ('-z', '--gzip'): options.gzip = True elif opt in ('-k', '--kill-old-on-full'): options.killold = True else: assert False, (opt, arg) # Any other arguments are invalid if args: usage(1, 'Invalid arguments: ' + COMMASPACE.join(args)) # Sanity checks if options.mode is None: usage(1, 'Either --backup or --recover is required') if options.repository is None: usage(1, '--repository is required') if options.mode == BACKUP: if options.date is not None: log('--date option is ignored in backup mode') options.date = None if options.output is not None: log('--output option is ignored in backup mode') options.output = None else: assert options.mode == RECOVER if options.file is not None: log('--file option is ignored in recover mode') options.file = None if options.killold is not None: log('--kill-old-on-full option is ignored in recover mode') options.killold = None return options # afile is a Python file object, or created by gzip.open(). The latter # doesn't have a fileno() method, so to fsync it we need to reach into # its underlying file object. def fsync(afile): afile.flush() fileobject = getattr(afile, 'fileobj', afile) os.fsync(fileobject.fileno()) # Read bytes (no more than n, or to EOF if n is None) in chunks from the # current position in file fp. Pass each chunk as an argument to func(). # Return the total number of bytes read == the total number of bytes # passed in all to func(). 
Leaves the file position just after the # last byte read. def dofile(func, fp, n=None): bytesread = 0L while n is None or n > 0: if n is None: todo = READCHUNK else: todo = min(READCHUNK, n) data = fp.read(todo) if not data: break func(data) nread = len(data) bytesread += nread if n is not None: n -= nread return bytesread def checksum(fp, n): # Checksum the first n bytes of the specified file sum = md5() def func(data): sum.update(data) dofile(func, fp, n) return sum.hexdigest() def copyfile(options, dst, start, n): # Copy bytes from file src, to file dst, starting at offset start, for n # length of bytes. For robustness, we first write, flush and fsync # to a temp file, then rename the temp file at the end. sum = md5() ifp = open(options.file, 'rb') ifp.seek(start) tempname = os.path.join(os.path.dirname(dst), 'tmp.tmp') if options.gzip: ofp = gzip.open(tempname, 'wb') else: ofp = open(tempname, 'wb') def func(data): sum.update(data) ofp.write(data) ndone = dofile(func, ifp, n) assert ndone == n ifp.close() fsync(ofp) ofp.close() os.rename(tempname, dst) return sum.hexdigest() def concat(files, ofp=None): # Concatenate a bunch of files from the repository, output to `outfile' if # given. Return the number of bytes written and the md5 checksum of the # bytes. sum = md5() def func(data): sum.update(data) if ofp: ofp.write(data) bytesread = 0 for f in files: # Auto uncompress if f.endswith('fsz'): ifp = gzip.open(f, 'rb') else: ifp = open(f, 'rb') bytesread += dofile(func, ifp) ifp.close() if ofp: ofp.close() return bytesread, sum.hexdigest() def gen_filedate(options): return getattr(options, 'test_now', time.gmtime()[:6]) def gen_filename(options, ext=None, now=None): if ext is None: if options.full: ext = '.fs' else: ext = '.deltafs' if options.gzip: ext += 'z' # Hook for testing if now is None: now = gen_filedate(options) t = now + (ext,) return '%04d-%02d-%02d-%02d-%02d-%02d%s' % t # Return a list of files needed to reproduce state at time options.date. # This is a list, in chronological order, of the .fs[z] and .deltafs[z] # files, from the time of the most recent full backup preceding # options.date, up to options.date. import re is_data_file = re.compile(r'\d{4}(?:-\d\d){5}\.(?:delta)?fsz?$').match del re def find_files(options): when = options.date if not when: when = gen_filename(options, ext='') log('looking for files between last full backup and %s...', when) all = filter(is_data_file, os.listdir(options.repository)) all.sort() all.reverse() # newest file first # Find the last full backup before date, then include all the # incrementals between that full backup and "when". needed = [] for fname in all: root, ext = os.path.splitext(fname) if root <= when: needed.append(fname) if ext in ('.fs', '.fsz'): break # Make the file names relative to the repository directory needed = [os.path.join(options.repository, f) for f in needed] # Restore back to chronological order needed.reverse() if needed: log('files needed to recover state as of %s:', when) for f in needed: log('\t%s', f) else: log('no files found') return needed # Scan the .dat file corresponding to the last full backup performed. # Return # # filename, startpos, endpos, checksum # # of the last incremental. 
If there is no .dat file, or the .dat file # is empty, return # # None, None, None, None def scandat(repofiles): fullfile = repofiles[0] datfile = os.path.splitext(fullfile)[0] + '.dat' fn = startpos = endpos = sum = None # assume .dat file missing or empty try: fp = open(datfile) except IOError, e: if e.errno <> errno.ENOENT: raise else: # We only care about the last one. lines = fp.readlines() fp.close() if lines: fn, startpos, endpos, sum = lines[-1].split() startpos = long(startpos) endpos = long(endpos) return fn, startpos, endpos, sum def delete_old_backups(options): # Delete all full backup files except for the most recent full backup file all = filter(is_data_file, os.listdir(options.repository)) all.sort() deletable = [] full = [] for fname in all: root, ext = os.path.splitext(fname) if ext in ('.fs', '.fsz'): full.append(fname) if ext in ('.fs', '.fsz', '.deltafs', '.deltafsz'): deletable.append(fname) # keep most recent full if not full: return recentfull = full.pop(-1) deletable.remove(recentfull) root, ext = os.path.splitext(recentfull) dat = root + '.dat' if dat in deletable: deletable.remove(dat) index = root + '.index' if index in deletable: deletable.remove(index) for fname in deletable: log('removing old backup file %s (and .dat / .index)', fname) root, ext = os.path.splitext(fname) try: os.unlink(os.path.join(options.repository, root + '.dat')) except OSError: pass try: os.unlink(os.path.join(options.repository, root + '.index')) except OSError: pass os.unlink(os.path.join(options.repository, fname)) def do_full_backup(options): options.full = True tnow = gen_filedate(options) dest = os.path.join(options.repository, gen_filename(options, now=tnow)) if os.path.exists(dest): raise WouldOverwriteFiles('Cannot overwrite existing file: %s' % dest) # Find the file position of the last completed transaction. fs = FileStorage(options.file, read_only=True) # Note that the FileStorage ctor calls read_index() which scans the file # and returns "the position just after the last valid transaction record". # getSize() then returns this position, which is exactly what we want, # because we only want to copy stuff from the beginning of the file to the # last valid transaction record. pos = fs.getSize() # Save the storage index into the repository index_file = os.path.join(options.repository, gen_filename(options, '.index', tnow)) log('writing index') fs._index.save(pos, index_file) fs.close() log('writing full backup: %s bytes to %s', pos, dest) sum = copyfile(options, dest, 0, pos) # Write the data file for this full backup datfile = os.path.splitext(dest)[0] + '.dat' fp = open(datfile, 'w') print >> fp, dest, 0, pos, sum fp.flush() os.fsync(fp.fileno()) fp.close() if options.killold: delete_old_backups(options) def do_incremental_backup(options, reposz, repofiles): options.full = False tnow = gen_filedate(options) dest = os.path.join(options.repository, gen_filename(options, now=tnow)) if os.path.exists(dest): raise WouldOverwriteFiles('Cannot overwrite existing file: %s' % dest) # Find the file position of the last completed transaction. fs = FileStorage(options.file, read_only=True) # Note that the FileStorage ctor calls read_index() which scans the file # and returns "the position just after the last valid transaction record". # getSize() then returns this position, which is exactly what we want, # because we only want to copy stuff from the beginning of the file to the # last valid transaction record. 
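# (Concretely, the code below copies only the byte range [reposz, pos),
# i.e. the data appended to the source file since the previous backup.)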
pos = fs.getSize() log('writing index') index_file = os.path.join(options.repository, gen_filename(options, '.index', tnow)) fs._index.save(pos, index_file) fs.close() log('writing incremental: %s bytes to %s', pos-reposz, dest) sum = copyfile(options, dest, reposz, pos - reposz) # The first file in repofiles points to the last full backup. Use this to # get the .dat file and append the information for this incrementatl to # that file. fullfile = repofiles[0] datfile = os.path.splitext(fullfile)[0] + '.dat' # This .dat file better exist. Let the exception percolate if not. fp = open(datfile, 'a') print >> fp, dest, reposz, pos, sum fp.flush() os.fsync(fp.fileno()) fp.close() def do_backup(options): repofiles = find_files(options) # See if we need to do a full backup if options.full or not repofiles: log('doing a full backup') do_full_backup(options) return srcsz = os.path.getsize(options.file) if options.quick: fn, startpos, endpos, sum = scandat(repofiles) # If the .dat file was missing, or was empty, do a full backup if (fn, startpos, endpos, sum) == (None, None, None, None): log('missing or empty .dat file (full backup)') do_full_backup(options) return # Has the file shrunk, possibly because of a pack? if srcsz < endpos: log('file shrunk, possibly because of a pack (full backup)') do_full_backup(options) return # Now check the md5 sum of the source file, from the last # incremental's start and stop positions. srcfp = open(options.file, 'rb') srcfp.seek(startpos) srcsum = checksum(srcfp, endpos-startpos) srcfp.close() log('last incremental file: %s', fn) log('last incremental checksum: %s', sum) log('source checksum range: [%s..%s], sum: %s', startpos, endpos, srcsum) if sum == srcsum: if srcsz == endpos: log('No changes, nothing to do') return log('doing incremental, starting at: %s', endpos) do_incremental_backup(options, endpos, repofiles) return else: # This was is much slower, and more disk i/o intensive, but it's also # more accurate since it checks the actual existing files instead of # the information in the .dat file. # # See if we can do an incremental, based on the files that already # exist. This call of concat() will not write an output file. reposz, reposum = concat(repofiles) log('repository state: %s bytes, md5: %s', reposz, reposum) # Get the md5 checksum of the source file, up to two file positions: # the entire size of the file, and up to the file position of the last # incremental backup. srcfp = open(options.file, 'rb') srcsum = checksum(srcfp, srcsz) srcfp.seek(0) srcsum_backedup = checksum(srcfp, reposz) srcfp.close() log('current state : %s bytes, md5: %s', srcsz, srcsum) log('backed up state : %s bytes, md5: %s', reposz, srcsum_backedup) # Has nothing changed? if srcsz == reposz and srcsum == reposum: log('No changes, nothing to do') return # Has the file shrunk, probably because of a pack? if srcsz < reposz: log('file shrunk, possibly because of a pack (full backup)') do_full_backup(options) return # The source file is larger than the repository. If the md5 checksums # match, then we know we can do an incremental backup. If they don't, # then perhaps the file was packed at some point (or a # non-transactional undo was performed, but this is deprecated). Only # do a full backup if forced to. if reposum == srcsum_backedup: log('doing incremental, starting at: %s', reposz) do_incremental_backup(options, reposz, repofiles) return # The checksums don't match, meaning the front of the source file has # changed. We'll need to do a full backup in that case. 
log('file changed, possibly because of a pack (full backup)') do_full_backup(options) def do_recover(options): # Find the first full backup at or before the specified date repofiles = find_files(options) if not repofiles: if options.date: raise NoFiles('No files in repository before %s', options.date) else: raise NoFiles('No files in repository') if options.output is None: log('Recovering file to stdout') outfp = sys.stdout else: log('Recovering file to %s', options.output) outfp = open(options.output, 'wb') reposz, reposum = concat(repofiles, outfp) if outfp <> sys.stdout: outfp.close() log('Recovered %s bytes, md5: %s', reposz, reposum) if options.output is not None: last_base = os.path.splitext(repofiles[-1])[0] source_index = '%s.index' % last_base target_index = '%s.index' % options.output if os.path.exists(source_index): log('Restoring index file %s to %s', source_index, target_index) shutil.copyfile(source_index, target_index) else: log('No index file to restore: %s', source_index) def main(argv=None): if argv is None: argv = sys.argv[1:] options = parseargs(argv) if options.mode == BACKUP: try: do_backup(options) except WouldOverwriteFiles, e: print >> sys.stderr, str(e) sys.exit(1) else: assert options.mode == RECOVER try: do_recover(options) except NoFiles, e: print >> sys.stderr, str(e) sys.exit(1) if __name__ == '__main__': main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/space.py000066400000000000000000000030371230730566700235740ustar00rootroot00000000000000#!/usr/bin/env python2.3 """Report on the space used by objects in a storage. usage: space.py data.fs The current implementation only supports FileStorage. Current limitations / simplifications: Ignores revisions and versions. """ from ZODB.FileStorage import FileStorage from ZODB.utils import U64, get_pickle_metadata def run(path, v=0): fs = FileStorage(path, read_only=1) # break into the file implementation if hasattr(fs._index, 'iterkeys'): iter = fs._index.iterkeys() else: iter = fs._index.keys() totals = {} for oid in iter: data, serialno = fs.load(oid, '') mod, klass = get_pickle_metadata(data) key = "%s.%s" % (mod, klass) bytes, count = totals.get(key, (0, 0)) bytes += len(data) count += 1 totals[key] = bytes, count if v: print "%8s %5d %s" % (U64(oid), len(data), key) L = totals.items() L.sort(lambda a, b: cmp(a[1], b[1])) L.reverse() print "Totals per object class:" for key, (bytes, count) in L: print "%8d %8d %s" % (count, bytes, key) def main(): import sys import getopt try: opts, args = getopt.getopt(sys.argv[1:], "v") except getopt.error, msg: print msg print "usage: space.py [-v] Data.fs" sys.exit(2) if len(args) != 1: print "usage: space.py [-v] Data.fs" sys.exit(2) v = 0 for o, a in opts: if o == "-v": v += 1 path = args[0] run(path, v) if __name__ == "__main__": main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/tests/000077500000000000000000000000001230730566700232665ustar00rootroot00000000000000ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/tests/__init__.py000066400000000000000000000000001230730566700253650ustar00rootroot00000000000000ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/tests/fstail.txt000066400000000000000000000022731230730566700253150ustar00rootroot00000000000000==================== The `fstail` utility ==================== The `fstail` utility shows information for a FileStorage about the last `n` transactions: We have to prepare a FileStorage first: >>> from ZODB.FileStorage import FileStorage >>> from ZODB.DB import 
DB >>> import transaction >>> from tempfile import mktemp >>> storagefile = mktemp() >>> base_storage = FileStorage(storagefile) >>> database = DB(base_storage) >>> connection1 = database.open() >>> root = connection1.root() >>> root['foo'] = 1 >>> transaction.commit() Now lets have a look at the last transactions of this FileStorage: >>> from ZODB.scripts.fstail import main >>> main(storagefile, 5) 2007-11-10 15:18:48.543001: hash=b16422d09fabdb45d4e4325e4b42d7d6f021d3c3 user='' description='' length=132 offset=185 2007-11-10 15:18:48.543001: hash=b16422d09fabdb45d4e4325e4b42d7d6f021d3c3 user='' description='initial database creation' length=150 offset=52 Now clean up the storage again: >>> import os >>> base_storage.close() >>> os.unlink(storagefile) >>> os.unlink(storagefile+'.index') >>> os.unlink(storagefile+'.lock') >>> os.unlink(storagefile+'.tmp') ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/tests/referrers.txt000066400000000000000000000025041230730566700260270ustar00rootroot00000000000000Getting Object Referrers ======================== The referrers module provides a way to get object referrers. It provides a referrers method that takes an iterable storage object. It returns a dictionary mapping object ids to lists of referrer object versions, which each version is a tuple an object id nd serial nummber. To see how this works, we'll create a small database: >>> import transaction >>> from persistent.mapping import PersistentMapping >>> from ZODB.FileStorage import FileStorage >>> from ZODB.DB import DB >>> import os, tempfile >>> dest = tempfile.mkdtemp() >>> fs = FileStorage(os.path.join(dest, 'Data.fs')) >>> db = DB(fs) >>> conn = db.open() >>> conn.root()['a'] = PersistentMapping() >>> conn.root()['b'] = PersistentMapping() >>> transaction.commit() >>> roid = conn.root()._p_oid >>> aoid = conn.root()['a']._p_oid >>> boid = conn.root()['b']._p_oid >>> s1 = conn.root()['b']._p_serial >>> conn.root()['a']['b'] = conn.root()['b'] >>> transaction.commit() >>> s2 = conn.root()['a']._p_serial Now we'll get the storage and compute the referrers: >>> import ZODB.scripts.referrers >>> referrers = ZODB.scripts.referrers.referrers(fs) >>> referrers[boid] == [(roid, s1), (aoid, s2)] True .. Cleanup >>> db.close() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/tests/test_doc.py000066400000000000000000000023421230730566700254450ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## import doctest import re import unittest import ZODB.tests.util import zope.testing.renormalizing checker = zope.testing.renormalizing.RENormalizing([ (re.compile( '[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]+'), '2007-11-10 15:18:48.543001'), (re.compile('hash=[0-9a-f]{40}'), 'hash=b16422d09fabdb45d4e4325e4b42d7d6f021d3c3')]) def test_suite(): return unittest.TestSuite(( doctest.DocFileSuite( 'referrers.txt', 'fstail.txt', setUp=ZODB.tests.util.setUp, tearDown=ZODB.tests.util.tearDown, checker=checker), )) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/tests/test_fstest.py000066400000000000000000000026301230730566700262100ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2010 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## from zope.testing import setupstack import doctest import ZODB def test_fstest_verbose(): r""" >>> db = ZODB.DB('data.fs') >>> db.close() >>> import ZODB.scripts.fstest >>> ZODB.scripts.fstest.main(['data.fs']) >>> ZODB.scripts.fstest.main(['data.fs']) >>> ZODB.scripts.fstest.main(['-v', 'data.fs']) ... # doctest: +ELLIPSIS +NORMALIZE_WHITESPACE 4: transaction tid ... #0 no errors detected >>> ZODB.scripts.fstest.main(['-vvv', 'data.fs']) ... # doctest: +ELLIPSIS +NORMALIZE_WHITESPACE 52: object oid 0x0000000000000000 #0 4: transaction tid ... #0 no errors detected """ def test_suite(): return doctest.DocTestSuite( setUp=setupstack.setUpDirectory, tearDown=setupstack.tearDown) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/tests/test_repozo.py000066400000000000000000001033231230730566700262170ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004-2009 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## import unittest import os try: # the hashlib package is available from Python 2.5 from hashlib import md5 except ImportError: # the md5 package is deprecated in Python 2.6 from md5 import new as md5 import ZODB.tests.util # layer used at class scope _NOISY = os.environ.get('NOISY_REPOZO_TEST_OUTPUT') class OurDB: _file_name = None def __init__(self, dir): from BTrees.OOBTree import OOBTree import transaction self.dir = dir self.getdb() conn = self.db.open() conn.root()['tree'] = OOBTree() transaction.commit() self.pos = self.db.storage._pos self.close() def getdb(self): from ZODB import DB from ZODB.FileStorage import FileStorage self._file_name = storage_filename = os.path.join(self.dir, 'Data.fs') storage = FileStorage(storage_filename) self.db = DB(storage) def gettree(self): self.getdb() conn = self.db.open() return conn.root()['tree'] def pack(self): self.getdb() self.db.pack() def close(self): if self.db is not None: self.db.close() self.db = None def mutate(self): # Make random mutations to the btree in the database. import random import transaction tree = self.gettree() for dummy in range(100): if random.random() < 0.6: tree[random.randrange(100000)] = random.randrange(100000) else: keys = tree.keys() if keys: del tree[keys[0]] transaction.commit() self.pos = self.db.storage._pos self.maxkey = self.db.storage._oid self.close() class FileopsBase: def _makeChunks(self): from ZODB.scripts.repozo import READCHUNK return ['x' * READCHUNK, 'y' * READCHUNK, 'z'] def _makeFile(self, text=None): from StringIO import StringIO if text is None: text = ''.join(self._makeChunks()) return StringIO(text) class Test_dofile(unittest.TestCase, FileopsBase): def _callFUT(self, func, fp, n): from ZODB.scripts.repozo import dofile return dofile(func, fp, n) def test_empty_read_all(self): chunks = [] file = self._makeFile('') bytes = self._callFUT(chunks.append, file, None) self.assertEqual(bytes, 0) self.assertEqual(chunks, []) def test_empty_read_count(self): chunks = [] file = self._makeFile('') bytes = self._callFUT(chunks.append, file, 42) self.assertEqual(bytes, 0) self.assertEqual(chunks, []) def test_nonempty_read_all(self): chunks = [] file = self._makeFile() bytes = self._callFUT(chunks.append, file, None) self.assertEqual(bytes, file.tell()) self.assertEqual(chunks, self._makeChunks()) def test_nonempty_read_count(self): chunks = [] file = self._makeFile() bytes = self._callFUT(chunks.append, file, 42) self.assertEqual(bytes, 42) self.assertEqual(chunks, ['x' * 42]) class Test_checksum(unittest.TestCase, FileopsBase): def _callFUT(self, fp, n): from ZODB.scripts.repozo import checksum return checksum(fp, n) def test_empty_read_all(self): file = self._makeFile('') sum = self._callFUT(file, None) self.assertEqual(sum, md5('').hexdigest()) def test_empty_read_count(self): file = self._makeFile('') sum = self._callFUT(file, 42) self.assertEqual(sum, md5('').hexdigest()) def test_nonempty_read_all(self): file = self._makeFile() sum = self._callFUT(file, None) self.assertEqual(sum, md5(''.join(self._makeChunks())).hexdigest()) def test_nonempty_read_count(self): chunks = [] file = self._makeFile() sum = self._callFUT(file, 42) self.assertEqual(sum, md5('x' * 42).hexdigest()) class OptionsTestBase: _repository_directory = None _data_directory = None def tearDown(self): if self._repository_directory is not None: from shutil import rmtree rmtree(self._repository_directory) if self._data_directory is not None: from shutil 
import rmtree rmtree(self._data_directory) def _makeOptions(self, **kw): import tempfile self._repository_directory = tempfile.mkdtemp() class Options(object): repository = self._repository_directory def __init__(self, **kw): self.__dict__.update(kw) return Options(**kw) class Test_copyfile(OptionsTestBase, unittest.TestCase): def _callFUT(self, options, dest, start, n): from ZODB.scripts.repozo import copyfile return copyfile(options, dest, start, n) def test_no_gzip(self): options = self._makeOptions(gzip=False) source = options.file = os.path.join(self._repository_directory, 'source.txt') f = open(source, 'wb') f.write('x' * 1000) f.close() target = os.path.join(self._repository_directory, 'target.txt') sum = self._callFUT(options, target, 0, 100) self.assertEqual(sum, md5('x' * 100).hexdigest()) self.assertEqual(open(target, 'rb').read(), 'x' * 100) def test_w_gzip(self): import gzip options = self._makeOptions(gzip=True) source = options.file = os.path.join(self._repository_directory, 'source.txt') f = open(source, 'wb') f.write('x' * 1000) f.close() target = os.path.join(self._repository_directory, 'target.txt') sum = self._callFUT(options, target, 0, 100) self.assertEqual(sum, md5('x' * 100).hexdigest()) self.assertEqual(gzip.open(target, 'rb').read(), 'x' * 100) class Test_concat(OptionsTestBase, unittest.TestCase): def _callFUT(self, files, ofp): from ZODB.scripts.repozo import concat return concat(files, ofp) def _makeFile(self, name, text, gzip_file=False): import gzip import tempfile if self._repository_directory is None: self._repository_directory = tempfile.mkdtemp() fqn = os.path.join(self._repository_directory, name) if gzip_file: f = gzip.open(fqn, 'wb') else: f = open(fqn, 'wb') f.write(text) f.flush() f.close() return fqn def test_empty_list_no_ofp(self): bytes, sum = self._callFUT([], None) self.assertEqual(bytes, 0) self.assertEqual(sum, md5('').hexdigest()) def test_w_plain_files_no_ofp(self): files = [self._makeFile(x, x, False) for x in 'ABC'] bytes, sum = self._callFUT(files, None) self.assertEqual(bytes, 3) self.assertEqual(sum, md5('ABC').hexdigest()) def test_w_gzipped_files_no_ofp(self): files = [self._makeFile('%s.fsz' % x, x, True) for x in 'ABC'] bytes, sum = self._callFUT(files, None) self.assertEqual(bytes, 3) self.assertEqual(sum, md5('ABC').hexdigest()) def test_w_ofp(self): class Faux: _closed = False def __init__(self): self._written = [] def write(self, data): self._written.append(data) def close(self): self._closed = True files = [self._makeFile(x, x, False) for x in 'ABC'] ofp = Faux() bytes, sum = self._callFUT(files, ofp) self.assertEqual(ofp._written, [x for x in 'ABC']) self.failUnless(ofp._closed) _marker = object() class Test_gen_filename(OptionsTestBase, unittest.TestCase): def _callFUT(self, options, ext=_marker): from ZODB.scripts.repozo import gen_filename if ext is _marker: return gen_filename(options) return gen_filename(options, ext) def test_explicit_ext(self): options = self._makeOptions(test_now = (2010, 5, 14, 12, 52, 31)) fn = self._callFUT(options, '.txt') self.assertEqual(fn, '2010-05-14-12-52-31.txt') def test_full_no_gzip(self): options = self._makeOptions(test_now = (2010, 5, 14, 12, 52, 31), full = True, gzip = False, ) fn = self._callFUT(options) self.assertEqual(fn, '2010-05-14-12-52-31.fs') def test_full_w_gzip(self): options = self._makeOptions(test_now = (2010, 5, 14, 12, 52, 31), full = True, gzip = True, ) fn = self._callFUT(options) self.assertEqual(fn, '2010-05-14-12-52-31.fsz') def test_incr_no_gzip(self): options = 
self._makeOptions(test_now = (2010, 5, 14, 12, 52, 31), full = False, gzip = False, ) fn = self._callFUT(options) self.assertEqual(fn, '2010-05-14-12-52-31.deltafs') def test_incr_w_gzip(self): options = self._makeOptions(test_now = (2010, 5, 14, 12, 52, 31), full = False, gzip = True, ) fn = self._callFUT(options) self.assertEqual(fn, '2010-05-14-12-52-31.deltafsz') class Test_find_files(OptionsTestBase, unittest.TestCase): def _callFUT(self, options): from ZODB.scripts.repozo import find_files return find_files(options) def _makeFile(self, hour, min, sec, ext): # call _makeOptions first! name = '2010-05-14-%02d-%02d-%02d%s' % (hour, min, sec, ext) fqn = os.path.join(self._repository_directory, name) f = open(fqn, 'wb') f.write(name) f.flush() f.close() return fqn def test_no_files(self): options = self._makeOptions(date='2010-05-14-13-30-57') found = self._callFUT(options) self.assertEqual(found, []) def test_explicit_date(self): options = self._makeOptions(date='2010-05-14-13-30-57') files = [] for h, m, s, e in [(2, 13, 14, '.fs'), (2, 13, 14, '.dat'), (3, 14, 15, '.deltafs'), (4, 14, 15, '.deltafs'), (5, 14, 15, '.deltafs'), (12, 13, 14, '.fs'), (12, 13, 14, '.dat'), (13, 14, 15, '.deltafs'), (14, 15, 16, '.deltafs'), ]: files.append(self._makeFile(h, m, s, e)) found = self._callFUT(options) # Older files, .dat file not included self.assertEqual(found, [files[5], files[7]]) def test_using_gen_filename(self): options = self._makeOptions(date=None, test_now=(2010, 5, 14, 13, 30, 57)) files = [] for h, m, s, e in [(2, 13, 14, '.fs'), (2, 13, 14, '.dat'), (3, 14, 15, '.deltafs'), (4, 14, 15, '.deltafs'), (5, 14, 15, '.deltafs'), (12, 13, 14, '.fs'), (12, 13, 14, '.dat'), (13, 14, 15, '.deltafs'), (14, 15, 16, '.deltafs'), ]: files.append(self._makeFile(h, m, s, e)) found = self._callFUT(options) # Older files, .dat file not included self.assertEqual(found, [files[5], files[7]]) class Test_scandat(OptionsTestBase, unittest.TestCase): def _callFUT(self, repofiles): from ZODB.scripts.repozo import scandat return scandat(repofiles) def test_no_dat_file(self): options = self._makeOptions() fsfile = os.path.join(self._repository_directory, 'foo.fs') fn, startpos, endpos, sum = self._callFUT([fsfile]) self.assertEqual(fn, None) self.assertEqual(startpos, None) self.assertEqual(endpos, None) self.assertEqual(sum, None) def test_empty_dat_file(self): options = self._makeOptions() fsfile = os.path.join(self._repository_directory, 'foo.fs') datfile = os.path.join(self._repository_directory, 'foo.dat') open(datfile, 'wb').close() fn, startpos, endpos, sum = self._callFUT([fsfile]) self.assertEqual(fn, None) self.assertEqual(startpos, None) self.assertEqual(endpos, None) self.assertEqual(sum, None) def test_single_line(self): options = self._makeOptions() fsfile = os.path.join(self._repository_directory, 'foo.fs') datfile = os.path.join(self._repository_directory, 'foo.dat') f = open(datfile, 'wb') f.write('foo.fs 0 123 ABC\n') f.flush() f.close() fn, startpos, endpos, sum = self._callFUT([fsfile]) self.assertEqual(fn, 'foo.fs') self.assertEqual(startpos, 0) self.assertEqual(endpos, 123) self.assertEqual(sum, 'ABC') def test_multiple_lines(self): options = self._makeOptions() fsfile = os.path.join(self._repository_directory, 'foo.fs') datfile = os.path.join(self._repository_directory, 'foo.dat') f = open(datfile, 'wb') f.write('foo.fs 0 123 ABC\n') f.write('bar.deltafs 123 456 DEF\n') f.flush() f.close() fn, startpos, endpos, sum = self._callFUT([fsfile]) self.assertEqual(fn, 'bar.deltafs') 
self.assertEqual(startpos, 123) self.assertEqual(endpos, 456) self.assertEqual(sum, 'DEF') class Test_delete_old_backups(OptionsTestBase, unittest.TestCase): def _makeOptions(self, filenames=()): options = super(Test_delete_old_backups, self)._makeOptions() for filename in filenames: fqn = os.path.join(options.repository, filename) f = open(fqn, 'wb') f.write('testing delete_old_backups') f.close() return options def _callFUT(self, options=None, filenames=()): from ZODB.scripts.repozo import delete_old_backups if options is None: options = self._makeOptions(filenames) return delete_old_backups(options) def test_empty_dir_doesnt_raise(self): self._callFUT() self.assertEqual(len(os.listdir(self._repository_directory)), 0) def test_no_repozo_files_doesnt_raise(self): FILENAMES = ['bogus.txt', 'not_a_repozo_file'] self._callFUT(filenames=FILENAMES) remaining = os.listdir(self._repository_directory) self.assertEqual(len(remaining), len(FILENAMES)) for name in FILENAMES: fqn = os.path.join(self._repository_directory, name) self.failUnless(os.path.isfile(fqn)) def test_doesnt_remove_current_repozo_files(self): FILENAMES = ['2009-12-20-10-08-03.fs', '2009-12-20-10-08-03.dat', '2009-12-20-10-08-03.index', ] self._callFUT(filenames=FILENAMES) remaining = os.listdir(self._repository_directory) self.assertEqual(len(remaining), len(FILENAMES)) for name in FILENAMES: fqn = os.path.join(self._repository_directory, name) self.failUnless(os.path.isfile(fqn)) def test_removes_older_repozo_files(self): OLDER_FULL = ['2009-12-20-00-01-03.fs', '2009-12-20-00-01-03.dat', '2009-12-20-00-01-03.index', ] DELTAS = ['2009-12-21-00-00-01.deltafs', '2009-12-21-00-00-01.index', '2009-12-22-00-00-01.deltafs', '2009-12-22-00-00-01.index', ] CURRENT_FULL = ['2009-12-23-00-00-01.fs', '2009-12-23-00-00-01.dat', '2009-12-23-00-00-01.index', ] FILENAMES = OLDER_FULL + DELTAS + CURRENT_FULL self._callFUT(filenames=FILENAMES) remaining = os.listdir(self._repository_directory) self.assertEqual(len(remaining), len(CURRENT_FULL)) for name in OLDER_FULL: fqn = os.path.join(self._repository_directory, name) self.failIf(os.path.isfile(fqn)) for name in DELTAS: fqn = os.path.join(self._repository_directory, name) self.failIf(os.path.isfile(fqn)) for name in CURRENT_FULL: fqn = os.path.join(self._repository_directory, name) self.failUnless(os.path.isfile(fqn)) def test_removes_older_repozo_files_zipped(self): OLDER_FULL = ['2009-12-20-00-01-03.fsz', '2009-12-20-00-01-03.dat', '2009-12-20-00-01-03.index', ] DELTAS = ['2009-12-21-00-00-01.deltafsz', '2009-12-21-00-00-01.index', '2009-12-22-00-00-01.deltafsz', '2009-12-22-00-00-01.index', ] CURRENT_FULL = ['2009-12-23-00-00-01.fsz', '2009-12-23-00-00-01.dat', '2009-12-23-00-00-01.index', ] FILENAMES = OLDER_FULL + DELTAS + CURRENT_FULL self._callFUT(filenames=FILENAMES) remaining = os.listdir(self._repository_directory) self.assertEqual(len(remaining), len(CURRENT_FULL)) for name in OLDER_FULL: fqn = os.path.join(self._repository_directory, name) self.failIf(os.path.isfile(fqn)) for name in DELTAS: fqn = os.path.join(self._repository_directory, name) self.failIf(os.path.isfile(fqn)) for name in CURRENT_FULL: fqn = os.path.join(self._repository_directory, name) self.failUnless(os.path.isfile(fqn)) class Test_do_full_backup(OptionsTestBase, unittest.TestCase): def _callFUT(self, options): from ZODB.scripts.repozo import do_full_backup return do_full_backup(options) def _makeDB(self): import tempfile datadir = self._data_directory = tempfile.mkdtemp() return OurDB(self._data_directory) def 
test_dont_overwrite_existing_file(self): from ZODB.scripts.repozo import WouldOverwriteFiles from ZODB.scripts.repozo import gen_filename db = self._makeDB() options = self._makeOptions(full=True, file=db._file_name, gzip=False, test_now = (2010, 5, 14, 10, 51, 22), ) f = open(os.path.join(self._repository_directory, gen_filename(options)), 'w') f.write('TESTING') f.flush() f.close() self.assertRaises(WouldOverwriteFiles, self._callFUT, options) def test_empty(self): import struct from ZODB.scripts.repozo import gen_filename from ZODB.fsIndex import fsIndex db = self._makeDB() options = self._makeOptions(file=db._file_name, gzip=False, killold=False, test_now = (2010, 5, 14, 10, 51, 22), ) self._callFUT(options) target = os.path.join(self._repository_directory, gen_filename(options)) original = open(db._file_name, 'rb').read() self.assertEqual(open(target, 'rb').read(), original) datfile = os.path.join(self._repository_directory, gen_filename(options, '.dat')) self.assertEqual(open(datfile).read(), '%s 0 %d %s\n' % (target, len(original), md5(original).hexdigest())) ndxfile = os.path.join(self._repository_directory, gen_filename(options, '.index')) ndx_info = fsIndex.load(ndxfile) self.assertEqual(ndx_info['pos'], len(original)) index = ndx_info['index'] pZero = struct.pack(">Q", 0) pOne = struct.pack(">Q", 1) self.assertEqual(index.minKey(), pZero) self.assertEqual(index.maxKey(), pOne) class Test_do_incremental_backup(OptionsTestBase, unittest.TestCase): def _callFUT(self, options, reposz, repofiles): from ZODB.scripts.repozo import do_incremental_backup return do_incremental_backup(options, reposz, repofiles) def _makeDB(self): import tempfile datadir = self._data_directory = tempfile.mkdtemp() return OurDB(self._data_directory) def test_dont_overwrite_existing_file(self): from ZODB.scripts.repozo import WouldOverwriteFiles from ZODB.scripts.repozo import gen_filename from ZODB.scripts.repozo import find_files db = self._makeDB() options = self._makeOptions(full=False, file=db._file_name, gzip=False, test_now = (2010, 5, 14, 10, 51, 22), date = None, ) f = open(os.path.join(self._repository_directory, gen_filename(options)), 'w') f.write('TESTING') f.flush() f.close() repofiles = find_files(options) self.assertRaises(WouldOverwriteFiles, self._callFUT, options, 0, repofiles) def test_no_changes(self): import struct from ZODB.scripts.repozo import gen_filename from ZODB.fsIndex import fsIndex db = self._makeDB() oldpos = db.pos options = self._makeOptions(file=db._file_name, gzip=False, killold=False, test_now = (2010, 5, 14, 10, 51, 22), date = None, ) fullfile = os.path.join(self._repository_directory, '2010-05-14-00-00-00.fs') original = open(db._file_name, 'rb').read() last = len(original) f = open(fullfile, 'wb') f.write(original) f.flush() f.close() datfile = os.path.join(self._repository_directory, '2010-05-14-00-00-00.dat') repofiles = [fullfile, datfile] self._callFUT(options, oldpos, repofiles) target = os.path.join(self._repository_directory, gen_filename(options)) self.assertEqual(open(target, 'rb').read(), '') self.assertEqual(open(datfile).read(), '%s %d %d %s\n' % (target, oldpos, oldpos, md5('').hexdigest())) ndxfile = os.path.join(self._repository_directory, gen_filename(options, '.index')) ndx_info = fsIndex.load(ndxfile) self.assertEqual(ndx_info['pos'], oldpos) index = ndx_info['index'] pZero = struct.pack(">Q", 0) pOne = struct.pack(">Q", 1) self.assertEqual(index.minKey(), pZero) self.assertEqual(index.maxKey(), pOne) def test_w_changes(self): import struct from 
ZODB.scripts.repozo import gen_filename from ZODB.fsIndex import fsIndex db = self._makeDB() oldpos = db.pos options = self._makeOptions(file=db._file_name, gzip=False, killold=False, test_now = (2010, 5, 14, 10, 51, 22), date = None, ) fullfile = os.path.join(self._repository_directory, '2010-05-14-00-00-00.fs') original = open(db._file_name, 'rb').read() f = open(fullfile, 'wb') f.write(original) f.flush() f.close() datfile = os.path.join(self._repository_directory, '2010-05-14-00-00-00.dat') repofiles = [fullfile, datfile] db.mutate() newpos = db.pos self._callFUT(options, oldpos, repofiles) target = os.path.join(self._repository_directory, gen_filename(options)) f = open(db._file_name, 'rb') f.seek(oldpos) increment = f.read() self.assertEqual(open(target, 'rb').read(), increment) self.assertEqual(open(datfile).read(), '%s %d %d %s\n' % (target, oldpos, newpos, md5(increment).hexdigest())) ndxfile = os.path.join(self._repository_directory, gen_filename(options, '.index')) ndx_info = fsIndex.load(ndxfile) self.assertEqual(ndx_info['pos'], newpos) index = ndx_info['index'] pZero = struct.pack(">Q", 0) self.assertEqual(index.minKey(), pZero) self.assertEqual(index.maxKey(), db.maxkey) class Test_do_recover(OptionsTestBase, unittest.TestCase): def _callFUT(self, options): from ZODB.scripts.repozo import do_recover return do_recover(options) def _makeFile(self, hour, min, sec, ext, text=None): # call _makeOptions first! name = '2010-05-14-%02d-%02d-%02d%s' % (hour, min, sec, ext) if text is None: text = name fqn = os.path.join(self._repository_directory, name) f = open(fqn, 'wb') f.write(text) f.flush() f.close() return fqn def test_no_files(self): from ZODB.scripts.repozo import NoFiles options = self._makeOptions(date=None, test_now=(2010, 5, 15, 13, 30, 57)) self.assertRaises(NoFiles, self._callFUT, options) def test_no_files_before_explicit_date(self): from ZODB.scripts.repozo import NoFiles options = self._makeOptions(date='2010-05-13-13-30-57') files = [] for h, m, s, e in [(2, 13, 14, '.fs'), (2, 13, 14, '.dat'), (3, 14, 15, '.deltafs'), (4, 14, 15, '.deltafs'), (5, 14, 15, '.deltafs'), (12, 13, 14, '.fs'), (12, 13, 14, '.dat'), (13, 14, 15, '.deltafs'), (14, 15, 16, '.deltafs'), ]: files.append(self._makeFile(h, m, s, e)) self.assertRaises(NoFiles, self._callFUT, options) def test_w_full_backup_latest_no_index(self): import tempfile dd = self._data_directory = tempfile.mkdtemp() output = os.path.join(dd, 'Data.fs') index = os.path.join(dd, 'Data.fs.index') options = self._makeOptions(date='2010-05-15-13-30-57', output=output) self._makeFile(2, 3, 4, '.fs', 'AAA') self._makeFile(4, 5, 6, '.fs', 'BBB') self._callFUT(options) self.assertEqual(open(output, 'rb').read(), 'BBB') def test_w_full_backup_latest_index(self): import tempfile dd = self._data_directory = tempfile.mkdtemp() output = os.path.join(dd, 'Data.fs') index = os.path.join(dd, 'Data.fs.index') options = self._makeOptions(date='2010-05-15-13-30-57', output=output) self._makeFile(2, 3, 4, '.fs', 'AAA') self._makeFile(4, 5, 6, '.fs', 'BBB') self._makeFile(4, 5, 6, '.index', 'CCC') self._callFUT(options) self.assertEqual(open(output, 'rb').read(), 'BBB') self.assertEqual(open(index, 'rb').read(), 'CCC') def test_w_incr_backup_latest_no_index(self): import tempfile dd = self._data_directory = tempfile.mkdtemp() output = os.path.join(dd, 'Data.fs') index = os.path.join(dd, 'Data.fs.index') options = self._makeOptions(date='2010-05-15-13-30-57', output=output) self._makeFile(2, 3, 4, '.fs', 'AAA') self._makeFile(4, 5, 6, 
'.deltafs', 'BBB') self._callFUT(options) self.assertEqual(open(output, 'rb').read(), 'AAABBB') def test_w_incr_backup_latest_index(self): import tempfile dd = self._data_directory = tempfile.mkdtemp() output = os.path.join(dd, 'Data.fs') index = os.path.join(dd, 'Data.fs.index') options = self._makeOptions(date='2010-05-15-13-30-57', output=output) self._makeFile(2, 3, 4, '.fs', 'AAA') self._makeFile(4, 5, 6, '.deltafs', 'BBB') self._makeFile(4, 5, 6, '.index', 'CCC') self._callFUT(options) self.assertEqual(open(output, 'rb').read(), 'AAABBB') self.assertEqual(open(index, 'rb').read(), 'CCC') class MonteCarloTests(unittest.TestCase): layer = ZODB.tests.util.MininalTestLayer('repozo') def setUp(self): # compute directory names import tempfile self.basedir = tempfile.mkdtemp() self.backupdir = os.path.join(self.basedir, 'backup') self.datadir = os.path.join(self.basedir, 'data') self.restoredir = os.path.join(self.basedir, 'restore') self.copydir = os.path.join(self.basedir, 'copy') self.currdir = os.getcwd() # create empty directories os.mkdir(self.backupdir) os.mkdir(self.datadir) os.mkdir(self.restoredir) os.mkdir(self.copydir) os.chdir(self.datadir) self.db = OurDB(self.datadir) def tearDown(self): os.chdir(self.currdir) import shutil shutil.rmtree(self.basedir) def _callRepozoMain(self, argv): from ZODB.scripts.repozo import main main(argv) def test_via_monte_carlo(self): self.saved_snapshots = [] # list of (name, time) pairs for copies. for i in range(100): self.mutate_pack_backup(i) # Verify snapshots can be reproduced exactly. for copyname, copytime in self.saved_snapshots: if _NOISY: print "Checking that", copyname, print "at", copytime, "is reproducible." self.assertRestored(copyname, copytime) def mutate_pack_backup(self, i): import random from shutil import copyfile from time import gmtime from time import sleep self.db.mutate() # Pack about each tenth time. if random.random() < 0.1: if _NOISY: print "packing" self.db.pack() self.db.close() # Make an incremental backup, half the time with gzip (-z). argv = ['-BQr', self.backupdir, '-f', 'Data.fs'] if _NOISY: argv.insert(0, '-v') if random.random() < 0.5: argv.insert(0, '-z') self._callRepozoMain(argv) # Save snapshots to assert that dated restores are possible if i % 9 == 0: srcname = os.path.join(self.datadir, 'Data.fs') copytime = '%04d-%02d-%02d-%02d-%02d-%02d' % (gmtime()[:6]) copyname = os.path.join(self.copydir, "Data%d.fs" % i) copyfile(srcname, copyname) self.saved_snapshots.append((copyname, copytime)) # Make sure the clock moves at least a second. sleep(1.01) # Verify current Data.fs can be reproduced exactly. self.assertRestored() def assertRestored(self, correctpath='Data.fs', when=None): # Do recovery to time 'when', and check that it's identical to correctpath. 
# restore to Restored.fs restoredfile = os.path.join(self.restoredir, 'Restored.fs') argv = ['-Rr', self.backupdir, '-o', restoredfile] if _NOISY: argv.insert(0, '-v') if when is not None: argv.append('-D') argv.append(when) self._callRepozoMain(argv) # check restored file content is equal to file that was backed up f = file(correctpath, 'rb') g = file(restoredfile, 'rb') fguts = f.read() gguts = g.read() f.close() g.close() msg = ("guts don't match\ncorrectpath=%r when=%r\n cmd=%r" % (correctpath, when, ' '.join(argv))) self.assertEquals(fguts, gguts, msg) def test_suite(): return unittest.TestSuite([ unittest.makeSuite(Test_dofile), unittest.makeSuite(Test_checksum), unittest.makeSuite(Test_copyfile), unittest.makeSuite(Test_concat), unittest.makeSuite(Test_gen_filename), unittest.makeSuite(Test_find_files), unittest.makeSuite(Test_scandat), unittest.makeSuite(Test_delete_old_backups), unittest.makeSuite(Test_do_full_backup), unittest.makeSuite(Test_do_incremental_backup), #unittest.makeSuite(Test_do_backup), #TODO unittest.makeSuite(Test_do_recover), # N.B.: this test take forever to run (~40sec on a fast laptop), # *and* it is non-deterministic. unittest.makeSuite(MonteCarloTests), ]) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/scripts/zodbload.py000066400000000000000000000741461230730566700243100ustar00rootroot00000000000000#!/usr/bin/env python2.3 ############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Test script for testing ZODB under a heavy zope-like load. Note that, to be as realistic as possible with ZEO, you should run this script multiple times, to simulate multiple clients. Here's how this works. The script starts some number of threads. Each thread, sequentially executes jobs. There is a job producer that produces jobs. Input data are provided by a mail producer that hands out message from a mailbox. Execution continues until there is an error, which will normally occur when the mailbox is exhausted. Command-line options are used to provide job definitions. Job definitions have perameters of the form name=value. Jobs have 2 standard parameters: frequency=integer The frequency of the job. The default is 1. sleep=float The number os seconds to sleep before performing the job. The default is 0. Usage: loadmail2 [options] Options: -edit [frequency=integer] [sleep=float] Define an edit job. An edit job edits a random already-saved email message, deleting and inserting a random number of words. After editing the message, the message is (re)cataloged. -insert [number=int] [frequency=integer] [sleep=float] Insert some number of email messages. -index [number=int] [frequency=integer] [sleep=float] Insert and index (catalog) some number of email messages. -search [terms='word1 word2 ...'] [frequency=integer] [sleep=float] Search the catalog. A query is givem with one or more terms as would be entered into a typical seach box. 
If no query is given, then queries will be randomly selected based on a set of built-in word list. -setup Set up the database. This will delete any existing Data.fs file. (Of course, this may have no effect, if there is a custom_zodb that defined a different storage.) It also adds a mail folder and a catalog. -options file Read options from the given file. Th efile should be a python source file that defines a sequence of options named 'options'. -threads n Specify the number of threads to execute. If not specified (< 2), then jobs are run in a single (main) thread. -mbox filename Specify the mailbox for getting input data. There is a (lame) syntax for providing options within the filename. The filename may be followed by up to 3 integers, min, max, and start: -mbox 'foo.mbox 0 100 10000' The messages from min to max will be read from the mailbox. They will be assigned message numbers starting with start. So, in the example above, we read the first hundred messages and assign thgem message numbers starting with 10001. The maxmum can be given as a negative number, in which case, it specifies the number of messages to read. The start defaults to the minimum. The following two options: -mbox 'foo.mbox 300 400 300' and -mbox 'foo.mbox 300 -100' are equivalent $Id$ """ import mailbox import math import os import random import re import sys import threading import time import transaction class JobProducer: def __init__(self): self.jobs = [] def add(self, callable, frequency, sleep, repeatp=0): self.jobs.extend([(callable, sleep, repeatp)] * int(frequency)) random.shuffle(self.jobs) def next(self): factory, sleep, repeatp = random.choice(self.jobs) time.sleep(sleep) callable, args = factory.create() return factory, callable, args, repeatp def __nonzero__(self): return not not self.jobs class MBox: def __init__(self, filename): if ' ' in filename: filename = filename.split() if len(filename) < 4: filename += [0, 0, -1][-(4-len(filename)):] filename, min, max, start = filename min = int(min) max = int(max) start = int(start) if start < 0: start = min if max < 0: # negative max is treated as a count self._max = start - max elif max > 0: self._max = start + max - min else: self._max = 0 else: self._max = 0 min = start = 0 if filename.endswith('.bz2'): f = os.popen("bunzip2 <"+filename, 'r') filename = filename[-4:] else: f = open(filename) self._mbox = mb = mailbox.UnixMailbox(f) self.number = start while min: mb.next() min -= 1 self._lock = threading.Lock() self.__name__ = os.path.splitext(os.path.split(filename)[1])[0] self._max = max def next(self): self._lock.acquire() try: if self._max > 0 and self.number >= self._max: raise IndexError(self.number + 1) message = self._mbox.next() message.body = message.fp.read() message.headers = list(message.headers) self.number += 1 message.number = self.number message.mbox = self.__name__ return message finally: self._lock.release() bins = 9973 #bins = 11 def mailfolder(app, mboxname, number): mail = getattr(app, mboxname, None) if mail is None: app.manage_addFolder(mboxname) mail = getattr(app, mboxname) from BTrees.Length import Length mail.length = Length() for i in range(bins): mail.manage_addFolder('b'+str(i)) bin = hash(str(number))%bins return getattr(mail, 'b'+str(bin)) def VmSize(): try: f = open('/proc/%s/status' % os.getpid()) except: return 0 else: l = filter(lambda l: l[:7] == 'VmSize:', f.readlines()) if l: l = l[0][7:].strip().split()[0] return int(l) return 0 def setup(lib_python): try: os.remove(os.path.join(lib_python, '..', '..', 'var', 
'Data.fs')) except: pass import Zope2 import Products import AccessControl.SecurityManagement app=Zope2.app() Products.ZCatalog.ZCatalog.manage_addZCatalog(app, 'cat', '') from Products.ZCTextIndex.ZCTextIndex import PLexicon from Products.ZCTextIndex.Lexicon import Splitter, CaseNormalizer app.cat._setObject('lex', PLexicon('lex', '', Splitter(), CaseNormalizer()) ) class extra: doc_attr = 'PrincipiaSearchSource' lexicon_id = 'lex' index_type = 'Okapi BM25 Rank' app.cat.addIndex('PrincipiaSearchSource', 'ZCTextIndex', extra) transaction.commit() system = AccessControl.SpecialUsers.system AccessControl.SecurityManagement.newSecurityManager(None, system) app._p_jar.close() def do(db, f, args): """Do something in a transaction, retrying of necessary Measure the speed of both the compurartion and the commit """ from ZODB.POSException import ConflictError wcomp = ccomp = wcommit = ccommit = 0.0 rconflicts = wconflicts = 0 start = time.time() while 1: connection = db.open() try: transaction.begin() t=time.time() c=time.clock() try: try: r = f(connection, *args) except ConflictError: rconflicts += 1 transaction.abort() continue finally: wcomp += time.time() - t ccomp += time.clock() - c t=time.time() c=time.clock() try: try: transaction.commit() break except ConflictError: wconflicts += 1 transaction.abort() continue finally: wcommit += time.time() - t ccommit += time.clock() - c finally: connection.close() return start, wcomp, ccomp, rconflicts, wconflicts, wcommit, ccommit, r def run1(tid, db, factory, job, args): (start, wcomp, ccomp, rconflicts, wconflicts, wcommit, ccommit, r ) = do(db, job, args) start = "%.4d-%.2d-%.2d %.2d:%.2d:%.2d" % time.localtime(start)[:6] print "%s %s %8.3g %8.3g %s %s\t%8.3g %8.3g %s %r" % ( start, tid, wcomp, ccomp, rconflicts, wconflicts, wcommit, ccommit, factory.__name__, r) def run(jobs, tid=''): import Zope2 while 1: factory, job, args, repeatp = jobs.next() run1(tid, Zope2.DB, factory, job, args) if repeatp: while 1: i = random.randint(0,100) if i > repeatp: break run1(tid, Zope2.DB, factory, job, args) def index(connection, messages, catalog, max): app = connection.root()['Application'] for message in messages: mail = mailfolder(app, message.mbox, message.number) if max: # Cheat and use folder implementation secrets # to avoid having to read the old data _objects = mail._objects if len(_objects) >= max: for d in _objects[:len(_objects)-max+1]: del mail.__dict__[d['id']] mail._objects = _objects[len(_objects)-max+1:] docid = 'm'+str(message.number) mail.manage_addDTMLDocument(docid, file=message.body) # increment counted getattr(app, message.mbox).length.change(1) doc = mail[docid] for h in message.headers: h = h.strip() l = h.find(':') if l <= 0: continue name = h[:l].lower() if name=='subject': name='title' v = h[l+1:].strip() type='string' if name=='title': doc.manage_changeProperties(title=h) else: try: doc.manage_addProperty(name, v, type) except: pass if catalog: app.cat.catalog_object(doc) return message.number class IndexJob: needs_mbox = 1 catalog = 1 prefix = 'index' def __init__(self, mbox, number=1, max=0): self.__name__ = "%s%s_%s" % (self.prefix, number, mbox.__name__) self.mbox, self.number, self.max = mbox, int(number), int(max) def create(self): messages = [self.mbox.next() for i in range(self.number)] return index, (messages, self.catalog, self.max) class InsertJob(IndexJob): catalog = 0 prefix = 'insert' wordre = re.compile(r'(\w{3,20})') stop = 'and', 'not' def edit(connection, mbox, catalog=1): app = connection.root()['Application'] 
mail = getattr(app, mbox.__name__, None) if mail is None: time.sleep(1) return "No mailbox %s" % mbox.__name__ nmessages = mail.length() if nmessages < 2: time.sleep(1) return "No messages to edit in %s" % mbox.__name__ # find a message to edit: while 1: number = random.randint(1, nmessages-1) did = 'm' + str(number) mail = mailfolder(app, mbox.__name__, number) doc = getattr(mail, did, None) if doc is not None: break text = doc.raw.split() norig = len(text) if norig > 10: ndel = int(math.exp(random.randint(0, int(math.log(norig))))) nins = int(math.exp(random.randint(0, int(math.log(norig))))) else: ndel = 0 nins = 10 for j in range(ndel): j = random.randint(0,len(text)-1) word = text[j] m = wordre.search(word) if m: word = m.group(1).lower() if (not wordsd.has_key(word)) and word not in stop: words.append(word) wordsd[word] = 1 del text[j] for j in range(nins): word = random.choice(words) text.append(word) doc.raw = ' '.join(text) if catalog: app.cat.catalog_object(doc) return norig, ndel, nins class EditJob: needs_mbox = 1 prefix = 'edit' catalog = 1 def __init__(self, mbox): self.__name__ = "%s_%s" % (self.prefix, mbox.__name__) self.mbox = mbox def create(self): return edit, (self.mbox, self.catalog) class ModifyJob(EditJob): prefix = 'modify' catalog = 0 def search(connection, terms, number): app = connection.root()['Application'] cat = app.cat n = 0 for i in number: term = random.choice(terms) results = cat(PrincipiaSearchSource=term) n += len(results) for result in results: obj = result.getObject() # Apparently, there is a bug in Zope that leads obj to be None # on occasion. if obj is not None: obj.getId() return n class SearchJob: def __init__(self, terms='', number=10): if terms: terms = terms.split() self.__name__ = "search_" + '_'.join(terms) self.terms = terms else: self.__name__ = 'search' self.terms = words number = min(int(number), len(self.terms)) self.number = range(number) def create(self): return search, (self.terms, self.number) words=['banishment', 'indirectly', 'imprecise', 'peeks', 'opportunely', 'bribe', 'sufficiently', 'Occidentalized', 'elapsing', 'fermenting', 'listen', 'orphanage', 'younger', 'draperies', 'Ida', 'cuttlefish', 'mastermind', 'Michaels', 'populations', 'lent', 'cater', 'attentional', 'hastiness', 'dragnet', 'mangling', 'scabbards', 'princely', 'star', 'repeat', 'deviation', 'agers', 'fix', 'digital', 'ambitious', 'transit', 'jeeps', 'lighted', 'Prussianizations', 'Kickapoo', 'virtual', 'Andrew', 'generally', 'boatsman', 'amounts', 'promulgation', 'Malay', 'savaging', 'courtesan', 'nursed', 'hungered', 'shiningly', 'ship', 'presides', 'Parke', 'moderns', 'Jonas', 'unenlightening', 'dearth', 'deer', 'domesticates', 'recognize', 'gong', 'penetrating', 'dependents', 'unusually', 'complications', 'Dennis', 'imbalances', 'nightgown', 'attached', 'testaments', 'congresswoman', 'circuits', 'bumpers', 'braver', 'Boreas', 'hauled', 'Howe', 'seethed', 'cult', 'numismatic', 'vitality', 'differences', 'collapsed', 'Sandburg', 'inches', 'head', 'rhythmic', 'opponent', 'blanketer', 'attorneys', 'hen', 'spies', 'indispensably', 'clinical', 'redirection', 'submit', 'catalysts', 'councilwoman', 'kills', 'topologies', 'noxious', 'exactions', 'dashers', 'balanced', 'slider', 'cancerous', 'bathtubs', 'legged', 'respectably', 'crochets', 'absenteeism', 'arcsine', 'facility', 'cleaners', 'bobwhite', 'Hawkins', 'stockade', 'provisional', 'tenants', 'forearms', 'Knowlton', 'commit', 'scornful', 'pediatrician', 'greets', 'clenches', 'trowels', 'accepts', 'Carboloy', 
'Glenn', 'Leigh', 'enroll', 'Madison', 'Macon', 'oiling', 'entertainingly', 'super', 'propositional', 'pliers', 'beneficiary', 'hospitable', 'emigration', 'sift', 'sensor', 'reserved', 'colonization', 'shrilled', 'momentously', 'stevedore', 'Shanghaiing', 'schoolmasters', 'shaken', 'biology', 'inclination', 'immoderate', 'stem', 'allegory', 'economical', 'daytime', 'Newell', 'Moscow', 'archeology', 'ported', 'scandals', 'Blackfoot', 'leery', 'kilobit', 'empire', 'obliviousness', 'productions', 'sacrificed', 'ideals', 'enrolling', 'certainties', 'Capsicum', 'Brookdale', 'Markism', 'unkind', 'dyers', 'legislates', 'grotesquely', 'megawords', 'arbitrary', 'laughing', 'wildcats', 'thrower', 'sex', 'devils', 'Wehr', 'ablates', 'consume', 'gossips', 'doorways', 'Shari', 'advanced', 'enumerable', 'existentially', 'stunt', 'auctioneers', 'scheduler', 'blanching', 'petulance', 'perceptibly', 'vapors', 'progressed', 'rains', 'intercom', 'emergency', 'increased', 'fluctuating', 'Krishna', 'silken', 'reformed', 'transformation', 'easter', 'fares', 'comprehensible', 'trespasses', 'hallmark', 'tormenter', 'breastworks', 'brassiere', 'bladders', 'civet', 'death', 'transformer', 'tolerably', 'bugle', 'clergy', 'mantels', 'satin', 'Boswellizes', 'Bloomington', 'notifier', 'Filippo', 'circling', 'unassigned', 'dumbness', 'sentries', 'representativeness', 'souped', 'Klux', 'Kingstown', 'gerund', 'Russell', 'splices', 'bellow', 'bandies', 'beefers', 'cameramen', 'appalled', 'Ionian', 'butterball', 'Portland', 'pleaded', 'admiringly', 'pricks', 'hearty', 'corer', 'deliverable', 'accountably', 'mentors', 'accorded', 'acknowledgement', 'Lawrenceville', 'morphology', 'eucalyptus', 'Rena', 'enchanting', 'tighter', 'scholars', 'graduations', 'edges', 'Latinization', 'proficiency', 'monolithic', 'parenthesizing', 'defy', 'shames', 'enjoyment', 'Purdue', 'disagrees', 'barefoot', 'maims', 'flabbergast', 'dishonorable', 'interpolation', 'fanatics', 'dickens', 'abysses', 'adverse', 'components', 'bowl', 'belong', 'Pipestone', 'trainees', 'paw', 'pigtail', 'feed', 'whore', 'conditioner', 'Volstead', 'voices', 'strain', 'inhabits', 'Edwin', 'discourses', 'deigns', 'cruiser', 'biconvex', 'biking', 'depreciation', 'Harrison', 'Persian', 'stunning', 'agar', 'rope', 'wagoner', 'elections', 'reticulately', 'Cruz', 'pulpits', 'wilt', 'peels', 'plants', 'administerings', 'deepen', 'rubs', 'hence', 'dissension', 'implored', 'bereavement', 'abyss', 'Pennsylvania', 'benevolent', 'corresponding', 'Poseidon', 'inactive', 'butchers', 'Mach', 'woke', 'loading', 'utilizing', 'Hoosier', 'undo', 'Semitization', 'trigger', 'Mouthe', 'mark', 'disgracefully', 'copier', 'futility', 'gondola', 'algebraic', 'lecturers', 'sponged', 'instigators', 'looted', 'ether', 'trust', 'feeblest', 'sequencer', 'disjointness', 'congresses', 'Vicksburg', 'incompatibilities', 'commend', 'Luxembourg', 'reticulation', 'instructively', 'reconstructs', 'bricks', 'attache', 'Englishman', 'provocation', 'roughen', 'cynic', 'plugged', 'scrawls', 'antipode', 'injected', 'Daedalus', 'Burnsides', 'asker', 'confronter', 'merriment', 'disdain', 'thicket', 'stinker', 'great', 'tiers', 'oust', 'antipodes', 'Macintosh', 'tented', 'packages', 'Mediterraneanize', 'hurts', 'orthodontist', 'seeder', 'readying', 'babying', 'Florida', 'Sri', 'buckets', 'complementary', 'cartographer', 'chateaus', 'shaves', 'thinkable', 'Tehran', 'Gordian', 'Angles', 'arguable', 'bureau', 'smallest', 'fans', 'navigated', 'dipole', 'bootleg', 'distinctive', 'minimization', 'absorbed', 'surmised', 
'Malawi', 'absorbent', 'close', 'conciseness', 'hopefully', 'declares', 'descent', 'trick', 'portend', 'unable', 'mildly', 'Morse', 'reference', 'scours', 'Caribbean', 'battlers', 'astringency', 'likelier', 'Byronizes', 'econometric', 'grad', 'steak', 'Austrian', 'ban', 'voting', 'Darlington', 'bison', 'Cetus', 'proclaim', 'Gilbertson', 'evictions', 'submittal', 'bearings', 'Gothicizer', 'settings', 'McMahon', 'densities', 'determinants', 'period', 'DeKastere', 'swindle', 'promptness', 'enablers', 'wordy', 'during', 'tables', 'responder', 'baffle', 'phosgene', 'muttering', 'limiters', 'custodian', 'prevented', 'Stouffer', 'waltz', 'Videotex', 'brainstorms', 'alcoholism', 'jab', 'shouldering', 'screening', 'explicitly', 'earner', 'commandment', 'French', 'scrutinizing', 'Gemma', 'capacitive', 'sheriff', 'herbivore', 'Betsey', 'Formosa', 'scorcher', 'font', 'damming', 'soldiers', 'flack', 'Marks', 'unlinking', 'serenely', 'rotating', 'converge', 'celebrities', 'unassailable', 'bawling', 'wording', 'silencing', 'scotch', 'coincided', 'masochists', 'graphs', 'pernicious', 'disease', 'depreciates', 'later', 'torus', 'interject', 'mutated', 'causer', 'messy', 'Bechtel', 'redundantly', 'profoundest', 'autopsy', 'philosophic', 'iterate', 'Poisson', 'horridly', 'silversmith', 'millennium', 'plunder', 'salmon', 'missioner', 'advances', 'provers', 'earthliness', 'manor', 'resurrectors', 'Dahl', 'canto', 'gangrene', 'gabler', 'ashore', 'frictionless', 'expansionism', 'emphasis', 'preservations', 'Duane', 'descend', 'isolated', 'firmware', 'dynamites', 'scrawled', 'cavemen', 'ponder', 'prosperity', 'squaw', 'vulnerable', 'opthalmic', 'Simms', 'unite', 'totallers', 'Waring', 'enforced', 'bridge', 'collecting', 'sublime', 'Moore', 'gobble', 'criticizes', 'daydreams', 'sedate', 'apples', 'Concordia', 'subsequence', 'distill', 'Allan', 'seizure', 'Isadore', 'Lancashire', 'spacings', 'corresponded', 'hobble', 'Boonton', 'genuineness', 'artifact', 'gratuities', 'interviewee', 'Vladimir', 'mailable', 'Bini', 'Kowalewski', 'interprets', 'bereave', 'evacuated', 'friend', 'tourists', 'crunched', 'soothsayer', 'fleetly', 'Romanizations', 'Medicaid', 'persevering', 'flimsy', 'doomsday', 'trillion', 'carcasses', 'guess', 'seersucker', 'ripping', 'affliction', 'wildest', 'spokes', 'sheaths', 'procreate', 'rusticates', 'Schapiro', 'thereafter', 'mistakenly', 'shelf', 'ruination', 'bushel', 'assuredly', 'corrupting', 'federation', 'portmanteau', 'wading', 'incendiary', 'thing', 'wanderers', 'messages', 'Paso', 'reexamined', 'freeings', 'denture', 'potting', 'disturber', 'laborer', 'comrade', 'intercommunicating', 'Pelham', 'reproach', 'Fenton', 'Alva', 'oasis', 'attending', 'cockpit', 'scout', 'Jude', 'gagging', 'jailed', 'crustaceans', 'dirt', 'exquisitely', 'Internet', 'blocker', 'smock', 'Troutman', 'neighboring', 'surprise', 'midscale', 'impart', 'badgering', 'fountain', 'Essen', 'societies', 'redresses', 'afterwards', 'puckering', 'silks', 'Blakey', 'sequel', 'greet', 'basements', 'Aubrey', 'helmsman', 'album', 'wheelers', 'easternmost', 'flock', 'ambassadors', 'astatine', 'supplant', 'gird', 'clockwork', 'foxes', 'rerouting', 'divisional', 'bends', 'spacer', 'physiologically', 'exquisite', 'concerts', 'unbridled', 'crossing', 'rock', 'leatherneck', 'Fortescue', 'reloading', 'Laramie', 'Tim', 'forlorn', 'revert', 'scarcer', 'spigot', 'equality', 'paranormal', 'aggrieves', 'pegs', 'committeewomen', 'documented', 'interrupt', 'emerald', 'Battelle', 'reconverted', 'anticipated', 'prejudices', 'drowsiness', 
'trivialities', 'food', 'blackberries', 'Cyclades', 'tourist', 'branching', 'nugget', 'Asilomar', 'repairmen', 'Cowan', 'receptacles', 'nobler', 'Nebraskan', 'territorial', 'chickadee', 'bedbug', 'darted', 'vigilance', 'Octavia', 'summands', 'policemen', 'twirls', 'style', 'outlawing', 'specifiable', 'pang', 'Orpheus', 'epigram', 'Babel', 'butyrate', 'wishing', 'fiendish', 'accentuate', 'much', 'pulsed', 'adorned', 'arbiters', 'counted', 'Afrikaner', 'parameterizes', 'agenda', 'Americanism', 'referenda', 'derived', 'liquidity', 'trembling', 'lordly', 'Agway', 'Dillon', 'propellers', 'statement', 'stickiest', 'thankfully', 'autograph', 'parallel', 'impulse', 'Hamey', 'stylistic', 'disproved', 'inquirer', 'hoisting', 'residues', 'variant', 'colonials', 'dequeued', 'especial', 'Samoa', 'Polaris', 'dismisses', 'surpasses', 'prognosis', 'urinates', 'leaguers', 'ostriches', 'calculative', 'digested', 'divided', 'reconfigurer', 'Lakewood', 'illegalities', 'redundancy', 'approachability', 'masterly', 'cookery', 'crystallized', 'Dunham', 'exclaims', 'mainline', 'Australianizes', 'nationhood', 'pusher', 'ushers', 'paranoia', 'workstations', 'radiance', 'impedes', 'Minotaur', 'cataloging', 'bites', 'fashioning', 'Alsop', 'servants', 'Onondaga', 'paragraph', 'leadings', 'clients', 'Latrobe', 'Cornwallis', 'excitingly', 'calorimetric', 'savior', 'tandem', 'antibiotics', 'excuse', 'brushy', 'selfish', 'naive', 'becomes', 'towers', 'popularizes', 'engender', 'introducing', 'possession', 'slaughtered', 'marginally', 'Packards', 'parabola', 'utopia', 'automata', 'deterrent', 'chocolates', 'objectives', 'clannish', 'aspirin', 'ferociousness', 'primarily', 'armpit', 'handfuls', 'dangle', 'Manila', 'enlivened', 'decrease', 'phylum', 'hardy', 'objectively', 'baskets', 'chaired', 'Sepoy', 'deputy', 'blizzard', 'shootings', 'breathtaking', 'sticking', 'initials', 'epitomized', 'Forrest', 'cellular', 'amatory', 'radioed', 'horrified', 'Neva', 'simultaneous', 'delimiter', 'expulsion', 'Himmler', 'contradiction', 'Remus', 'Franklinizations', 'luggage', 'moisture', 'Jews', 'comptroller', 'brevity', 'contradictions', 'Ohio', 'active', 'babysit', 'China', 'youngest', 'superstition', 'clawing', 'raccoons', 'chose', 'shoreline', 'helmets', 'Jeffersonian', 'papered', 'kindergarten', 'reply', 'succinct', 'split', 'wriggle', 'suitcases', 'nonce', 'grinders', 'anthem', 'showcase', 'maimed', 'blue', 'obeys', 'unreported', 'perusing', 'recalculate', 'rancher', 'demonic', 'Lilliputianize', 'approximation', 'repents', 'yellowness', 'irritates', 'Ferber', 'flashlights', 'booty', 'Neanderthal', 'someday', 'foregoes', 'lingering', 'cloudiness', 'guy', 'consumer', 'Berkowitz', 'relics', 'interpolating', 'reappearing', 'advisements', 'Nolan', 'turrets', 'skeletal', 'skills', 'mammas', 'Winsett', 'wheelings', 'stiffen', 'monkeys', 'plainness', 'braziers', 'Leary', 'advisee', 'jack', 'verb', 'reinterpret', 'geometrical', 'trolleys', 'arboreal', 'overpowered', 'Cuzco', 'poetical', 'admirations', 'Hobbes', 'phonemes', 'Newsweek', 'agitator', 'finally', 'prophets', 'environment', 'easterners', 'precomputed', 'faults', 'rankly', 'swallowing', 'crawl', 'trolley', 'spreading', 'resourceful', 'go', 'demandingly', 'broader', 'spiders', 'Marsha', 'debris', 'operates', 'Dundee', 'alleles', 'crunchier', 'quizzical', 'hanging', 'Fisk'] wordsd = {} for word in words: wordsd[word] = 1 def collect_options(args, jobs, options): while args: arg = args.pop(0) if arg.startswith('-'): name = arg[1:] if name == 'options': fname = args.pop(0) d = {} 
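                # The named file is executed as Python source; it must define
                # an 'options' sequence, which is processed recursively as if
                # its entries had been given on the command line.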
execfile(fname, d) collect_options(list(d['options']), jobs, options) elif options.has_key(name): v = args.pop(0) if options[name] != None: raise ValueError( "Duplicate values for %s, %s and %s" % (name, v, options[name]) ) options[name] = v elif name == 'setup': options['setup'] = 1 elif globals().has_key(name.capitalize()+'Job'): job = name kw = {} while args and args[0].find("=") > 0: arg = args.pop(0).split('=') name, v = arg[0], '='.join(arg[1:]) if kw.has_key(name): raise ValueError( "Duplicate parameter %s for job %s" % (name, job) ) kw[name]=v if kw.has_key('frequency'): frequency = kw['frequency'] del kw['frequency'] else: frequency = 1 if kw.has_key('sleep'): sleep = float(kw['sleep']) del kw['sleep'] else: sleep = 0.0001 if kw.has_key('repeat'): repeatp = float(kw['repeat']) del kw['repeat'] else: repeatp = 0 jobs.append((job, kw, frequency, sleep, repeatp)) else: raise ValueError("not an option or job", name) else: raise ValueError("Expected an option", arg) def find_lib_python(): for b in os.getcwd(), os.path.split(sys.argv[0])[0]: for i in range(6): d = ['..']*i + ['lib', 'python'] p = os.path.join(b, *d) if os.path.isdir(p): return p raise ValueError("Couldn't find lib/python") def main(args=None): lib_python = find_lib_python() sys.path.insert(0, lib_python) if args is None: args = sys.argv[1:] if not args: print __doc__ sys.exit(0) print args random.seed(hash(tuple(args))) # always use the same for the given args options = {"mbox": None, "threads": None} jobdefs = [] collect_options(args, jobdefs, options) mboxes = {} if options["mbox"]: mboxes[options["mbox"]] = MBox(options["mbox"]) # Perform a ZConfig-based Zope initialization: zetup(os.path.join(lib_python, '..', '..', 'etc', 'zope.conf')) if options.has_key('setup'): setup(lib_python) else: import Zope2 Zope2.startup() jobs = JobProducer() for job, kw, frequency, sleep, repeatp in jobdefs: Job = globals()[job.capitalize()+'Job'] if getattr(Job, 'needs_mbox', 0): if not kw.has_key("mbox"): if not options["mbox"]: raise ValueError( "no mailbox (mbox option) file specified") kw['mbox'] = mboxes[options["mbox"]] else: if not mboxes.has_key[kw["mbox"]]: mboxes[kw['mbox']] = MBox[kw['mbox']] kw["mbox"] = mboxes[kw['mbox']] jobs.add(Job(**kw), frequency, sleep, repeatp) if not jobs: print "No jobs to execute" return threads = int(options['threads'] or '0') if threads > 1: threads = [threading.Thread(target=run, args=(jobs, i), name=str(i)) for i in range(threads)] for thread in threads: thread.start() for thread in threads: thread.join() else: run(jobs) def zetup(configfile_name): from Zope.Startup.options import ZopeOptions from Zope.Startup import handlers as h from App import config opts = ZopeOptions() opts.configfile = configfile_name opts.realize(args=[]) h.handleConfig(opts.configroot, opts.confighandlers) config.setConfiguration(opts.configroot) from Zope.Startup import dropPrivileges dropPrivileges(opts.configroot) if __name__ == '__main__': main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/serialize.py000066400000000000000000000547321230730566700230110ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Support for ZODB object serialization. ZODB serializes objects using a custom format based on Python pickles. When an object is unserialized, it can be loaded as either a ghost or a real object. A ghost is a persistent object of the appropriate type but without any state. The first time a ghost is accessed, the persistence machinery traps access and loads the actual state. A ghost allows many persistent objects to be loaded while minimizing the memory consumption of referenced but otherwise unused objects. Pickle format ------------- ZODB stores serialized objects using a custom format based on pickle. Each serialized object has two parts: the class description and the object state. The class description must provide enough information to call the class's ``__new__`` and create an empty object. Once the object exists as a ghost, its state is passed to ``__setstate__``. The class description can be in a variety of formats, in part to provide backwards compatibility with earlier versions of Zope. The four current formats for class description are: 1. type(obj) 2. type(obj), obj.__getnewargs__() 3. (module name, class name), None 7. (module name, class name), obj.__getnewargs__() The second of these options is used if the object has a __getnewargs__() method. It is intended to support objects like persistent classes that have custom C layouts that are determined by arguments to __new__(). The third and fourth (#3 & #7) apply to instances of a persistent class (which means the class itself is persistent, not that it's a subclass of Persistent). The type object is usually stored using the standard pickle mechanism, which involves the pickle GLOBAL opcode (giving the type's module and name as strings). The type may itself be a persistent object, in which case a persistent reference (see below) is used. It's unclear what "usually" means in the last paragraph. There are two useful places to concentrate confusion about exactly which formats exist: - ObjectReader.getClassName() below returns a dotted "module.class" string, via actually loading a pickle. This requires that the implementation of application objects be available. - ZODB/utils.py's get_pickle_metadata() tries to return the module and class names (as strings) without importing any application modules or classes, via analyzing the pickle. Earlier versions of Zope supported several other kinds of class descriptions. The current serialization code reads these descriptions, but does not write them. The three earlier formats are: 4. (module name, class name), __getinitargs__() 5. class, None 6. class, __getinitargs__() Formats 4 and 6 are used only if the class defines a __getinitargs__() method, but we really can't tell them apart from formats 7 and 2 (respectively). Formats 5 and 6 are used if the class does not have a __module__ attribute (I'm not sure when this applies, but I think it occurs for some but not all ZClasses). Persistent references --------------------- When one persistent object pickle refers to another persistent object, the database uses a persistent reference. ZODB persistent references are of the form:: oid A simple object reference. 
(oid, class meta data) A persistent object reference [reference_type, args] An extended reference Extension references come in a number of subforms, based on the reference types. The following reference types are defined: 'w' Persistent weak reference. The arguments consist of an oid and optionally a database name. The following are planned for the future: 'n' Multi-database simple object reference. The arguments consist of a database name, and an object id. 'm' Multi-database persistent object reference. The arguments consist of a database name, an object id, and class meta data. The following legacy format is also supported. [oid] A persistent weak reference Because the persistent object reference forms include class information, it is not possible to change the class of a persistent object for which this form is used. If a transaction changed the class of an object, a new record with new class metadata would be written but all the old references would still use the old class. (It is possible that we could deal with this limitation in the future.) An object id is used alone when a class requires arguments to it's __new__ method, which is signalled by the class having a __getnewargs__ attribute. A number of legacyforms are defined: """ import cPickle import cStringIO import logging from persistent import Persistent from persistent.wref import WeakRefMarker, WeakRef from ZODB import broken from ZODB.broken import Broken from ZODB.POSException import InvalidObjectReference _oidtypes = str, type(None) # Might to update or redo coptimizations to reflect weakrefs: # from ZODB.coptimizations import new_persistent_id def myhasattr(obj, name, _marker=object()): """Make sure we don't mask exceptions like hasattr(). We don't want exceptions other than AttributeError to be masked, since that too often masks other programming errors. Three-argument getattr() doesn't mask those, so we use that to implement our own hasattr() replacement. """ return getattr(obj, name, _marker) is not _marker class ObjectWriter: """Serializes objects for storage in the database. The ObjectWriter creates object pickles in the ZODB format. It also detects new persistent objects reachable from the current object. """ _jar = None def __init__(self, obj=None): self._file = cStringIO.StringIO() self._p = cPickle.Pickler(self._file, 1) self._p.inst_persistent_id = self.persistent_id self._stack = [] if obj is not None: self._stack.append(obj) jar = obj._p_jar assert myhasattr(jar, "new_oid") self._jar = jar def persistent_id(self, obj): """Return the persistent id for obj. >>> from ZODB.tests.util import P >>> class DummyJar: ... xrefs = True ... def new_oid(self): ... return 42 ... def db(self): ... return self ... databases = {} >>> jar = DummyJar() >>> class O: ... _p_jar = jar >>> writer = ObjectWriter(O) Normally, object references include the oid and a cached named reference to the class. Having the class information available allows fast creation of the ghost, avoiding requiring an additional database lookup. >>> bob = P('bob') >>> oid, cls = writer.persistent_id(bob) >>> oid 42 >>> cls is P True If a persistent object does not already have an oid and jar, these will be assigned by persistent_id(): >>> bob._p_oid 42 >>> bob._p_jar is jar True If the object already has a persistent id, the id is not changed: >>> bob._p_oid = 24 >>> oid, cls = writer.persistent_id(bob) >>> oid 24 >>> cls is P True If the jar doesn't match that of the writer, an error is raised: >>> bob._p_jar = DummyJar() >>> writer.persistent_id(bob) ... 
# doctest: +NORMALIZE_WHITESPACE +ELLIPSIS Traceback (most recent call last): ... InvalidObjectReference: ('Attempt to store an object from a foreign database connection', , P(bob)) Constructor arguments used by __new__(), as returned by __getnewargs__(), can affect memory allocation, but may also change over the life of the object. This makes it useless to cache even the object's class. >>> class PNewArgs(P): ... def __getnewargs__(self): ... return () >>> sam = PNewArgs('sam') >>> writer.persistent_id(sam) 42 >>> sam._p_oid 42 >>> sam._p_jar is jar True Check that simple objects don't get accused of persistence: >>> writer.persistent_id(42) >>> writer.persistent_id(object()) Check that a classic class doesn't get identified improperly: >>> class ClassicClara: ... pass >>> clara = ClassicClara() >>> writer.persistent_id(clara) """ # Most objects are not persistent. The following cheap test # identifies most of them. For these, we return None, # signalling that the object should be pickled normally. if not isinstance(obj, (Persistent, type, WeakRef)): # Not persistent, pickle normally return None # Any persistent object must have an oid: try: oid = obj._p_oid except AttributeError: # Not persistent, pickle normally return None if not (oid is None or isinstance(oid, str)): # Deserves a closer look: # Make sure it's not a descriptor if hasattr(oid, '__get__'): # The oid is a descriptor. That means obj is a non-persistent # class whose instances are persistent, so ... # Not persistent, pickle normally return None if oid is WeakRefMarker: # we have a weakref, see weakref.py oid = obj.oid if oid is None: target = obj() # get the referenced object oid = target._p_oid if oid is None: # Here we are causing the object to be saved in # the database. One could argue that we shouldn't # do this, because a weakref should not cause an object # to be added. We'll be optimistic, though, and # assume that the object will be added eventually. oid = self._jar.new_oid() target._p_jar = self._jar target._p_oid = oid self._stack.append(target) obj.oid = oid obj.dm = target._p_jar obj.database_name = obj.dm.db().database_name if obj.dm is self._jar: return ['w', (oid, )] else: return ['w', (oid, obj.database_name)] # Since we have an oid, we have either a persistent instance # (an instance of Persistent), or a persistent class. # NOTE! Persistent classes don't (and can't) subclass persistent. database_name = None if oid is None: oid = obj._p_oid = self._jar.new_oid() obj._p_jar = self._jar self._stack.append(obj) elif obj._p_jar is not self._jar: if not self._jar.db().xrefs: raise InvalidObjectReference( "Database %r doesn't allow implicit cross-database " "references" % self._jar.db().database_name, self._jar, obj) try: otherdb = obj._p_jar.db() database_name = otherdb.database_name except AttributeError: otherdb = self if self._jar.db().databases.get(database_name) is not otherdb: raise InvalidObjectReference( "Attempt to store an object from a foreign " "database connection", self._jar, obj, ) if self._jar.get_connection(database_name) is not obj._p_jar: raise InvalidObjectReference( "Attempt to store a reference to an object from " "a separate connection to the same database or " "multidatabase", self._jar, obj, ) # OK, we have an object from another database. # Lets make sure the object ws not *just* loaded. if obj._p_jar._implicitlyAdding(oid): raise InvalidObjectReference( "A new object is reachable from multiple databases. 
" "Won't try to guess which one was correct!", self._jar, obj, ) klass = type(obj) if hasattr(klass, '__getnewargs__'): # We don't want to save newargs in object refs. # It's possible that __getnewargs__ is degenerate and # returns (), but we don't want to have to deghostify # the object to find out. # Note that this has the odd effect that, if the class has # __getnewargs__ of its own, we'll lose the optimization # of caching the class info. if database_name is not None: return ['n', (database_name, oid)] return oid # Note that we never get here for persistent classes. # We'll use direct refs for normal classes. if database_name is not None: return ['m', (database_name, oid, klass)] return oid, klass def serialize(self, obj): # We don't use __class__ here, because obj could be a persistent proxy. # We don't want to be fooled by proxies. klass = type(obj) # We want to serialize persistent classes by name if they have # a non-None non-empty module so as not to have a direct # ref. This is important when copying. We probably want to # revisit this in the future. newargs = getattr(obj, "__getnewargs__", None) if (isinstance(getattr(klass, '_p_oid', 0), _oidtypes) and klass.__module__): # This is a persistent class with a non-empty module. This # uses pickle format #3 or #7. klass = klass.__module__, klass.__name__ if newargs is None: meta = klass, None else: meta = klass, newargs() elif newargs is None: # Pickle format #1. meta = klass else: # Pickle format #2. meta = klass, newargs() return self._dump(meta, obj.__getstate__()) def _dump(self, classmeta, state): # To reuse the existing cStringIO object, we must reset # the file position to 0 and truncate the file after the # new pickle is written. self._file.seek(0) self._p.clear_memo() self._p.dump(classmeta) self._p.dump(state) self._file.truncate() return self._file.getvalue() def __iter__(self): return NewObjectIterator(self._stack) class NewObjectIterator: # The pickler is used as a forward iterator when the connection # is looking for new objects to pickle. def __init__(self, stack): self._stack = stack def __iter__(self): return self def next(self): if self._stack: elt = self._stack.pop() return elt else: raise StopIteration class ObjectReader: def __init__(self, conn=None, cache=None, factory=None): self._conn = conn self._cache = cache self._factory = factory def _get_class(self, module, name): return self._factory(self._conn, module, name) def _get_unpickler(self, pickle): file = cStringIO.StringIO(pickle) unpickler = cPickle.Unpickler(file) unpickler.persistent_load = self._persistent_load factory = self._factory conn = self._conn def find_global(modulename, name): return factory(conn, modulename, name) unpickler.find_global = find_global return unpickler loaders = {} def _persistent_load(self, reference): if isinstance(reference, tuple): return self.load_persistent(*reference) elif isinstance(reference, str): return self.load_oid(reference) else: try: reference_type, args = reference except ValueError: # weakref return self.loaders['w'](self, *reference) else: return self.loaders[reference_type](self, *args) def load_persistent(self, oid, klass): # Quick instance reference. We know all we need to know # to create the instance w/o hitting the db, so go for it! obj = self._cache.get(oid, None) if obj is not None: return obj if isinstance(klass, tuple): klass = self._get_class(*klass) if issubclass(klass, Broken): # We got a broken class. 
We might need to make it # PersistentBroken if not issubclass(klass, broken.PersistentBroken): klass = broken.persistentBroken(klass) try: obj = klass.__new__(klass) except TypeError: # Couldn't create the instance. Maybe there's more # current data in the object's actual record! return self._conn.get(oid) # TODO: should be done by connection self._cache.new_ghost(oid, obj) return obj def load_multi_persistent(self, database_name, oid, klass): conn = self._conn.get_connection(database_name) # TODO, make connection _cache attr public reader = ObjectReader(conn, conn._cache, self._factory) return reader.load_persistent(oid, klass) loaders['m'] = load_multi_persistent def load_persistent_weakref(self, oid, database_name=None): obj = WeakRef.__new__(WeakRef) obj.oid = oid if database_name is None: obj.dm = self._conn else: obj.database_name = database_name try: obj.dm = self._conn.get_connection(database_name) except KeyError: # XXX Not sure what to do here. It seems wrong to # fail since this is a weak reference. For now we'll # just pretend that the target object has gone. pass return obj loaders['w'] = load_persistent_weakref def load_oid(self, oid): obj = self._cache.get(oid, None) if obj is not None: return obj return self._conn.get(oid) def load_multi_oid(self, database_name, oid): conn = self._conn.get_connection(database_name) # TODO, make connection _cache attr public reader = ObjectReader(conn, conn._cache, self._factory) return reader.load_oid(oid) loaders['n'] = load_multi_oid def getClassName(self, pickle): unpickler = self._get_unpickler(pickle) klass = unpickler.load() if isinstance(klass, tuple): klass, args = klass if isinstance(klass, tuple): # old style reference return "%s.%s" % klass return "%s.%s" % (klass.__module__, klass.__name__) def getGhost(self, pickle): unpickler = self._get_unpickler(pickle) klass = unpickler.load() if isinstance(klass, tuple): # Here we have a separate class and args. # This could be an old record, so the class module ne a named # refernce klass, args = klass if isinstance(klass, tuple): # Old module_name, class_name tuple klass = self._get_class(*klass) if args is None: args = () else: # Definitely new style direct class reference args = () if issubclass(klass, Broken): # We got a broken class. We might need to make it # PersistentBroken if not issubclass(klass, broken.PersistentBroken): klass = broken.persistentBroken(klass) return klass.__new__(klass, *args) def getState(self, pickle): unpickler = self._get_unpickler(pickle) try: unpickler.load() # skip the class metadata return unpickler.load() except EOFError, msg: log = logging.getLogger("ZODB.serialize") log.exception("Unpickling error: %r", pickle) raise def setGhostState(self, obj, pickle): state = self.getState(pickle) obj.__setstate__(state) def referencesf(p, oids=None): """Return a list of object ids found in a pickle A list may be passed in, in which case, information is appended to it. Only ordinary internal references are included. Weak and multi-database references are not included. """ refs = [] u = cPickle.Unpickler(cStringIO.StringIO(p)) u.persistent_load = refs u.noload() u.noload() # Now we have a list of referencs. 
Need to convert to list of # oids: if oids is None: oids = [] for reference in refs: if isinstance(reference, tuple): oid = reference[0] elif isinstance(reference, str): oid = reference else: assert isinstance(reference, list) continue oids.append(oid) return oids oid_klass_loaders = { 'w': lambda oid, database_name=None: None, } def get_refs(a_pickle): """Return oid and class information for references in a pickle The result of a list of oid and class information tuples. If the reference doesn't contain class information, then the klass information is None. """ refs = [] u = cPickle.Unpickler(cStringIO.StringIO(a_pickle)) u.persistent_load = refs u.noload() u.noload() # Now we have a list of referencs. Need to convert to list of # oids and class info: result = [] for reference in refs: if isinstance(reference, tuple): data = reference elif isinstance(reference, str): data = reference, None else: assert isinstance(reference, list) continue result.append(data) return result ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/storage.xml000066400000000000000000000001501230730566700226170ustar00rootroot00000000000000
ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/subtransactions.txt000066400000000000000000000027161230730566700244260ustar00rootroot00000000000000========================= Subtransactions in ZODB 3 ========================= ZODB 3 provides limited support for subtransactions. Subtransactions are nested to *one* level. There are top-level transactions and subtransactions. When a transaction is committed, a flag is passed indicating whether it is a subtransaction or a top-level transaction. Consider the following exampler commit calls: - ``commit()`` A regular top-level transaction is committed. - ``commit(1)`` A subtransaction is committed. There is now one subtransaction of the current top-level transaction. - ``commit(1)`` A subtransaction is committed. There are now two subtransactions of the current top-level transaction. - ``abort(1)`` A subtransaction is aborted. There are still two subtransactions of the current top-level transaction; work done since the last ``commit(1)`` call is discarded. - ``commit()`` We now commit a top-level transaction. The work done in the previous two subtransactions *plus* work done since the last ``abort(1)`` call is saved. - ``commit(1)`` A subtransaction is committed. There is now one subtransaction of the current top-level transaction. - ``commit(1)`` A subtransaction is committed. There are now two subtransactions of the current top-level transaction. - ``abort()`` We now abort a top-level transaction. We discard the work done in the previous two subtransactions *plus* work done since the last ``commit(1)`` call. ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/000077500000000000000000000000001230730566700215775ustar00rootroot00000000000000ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/BasicStorage.py000066400000000000000000000345301230730566700245240ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Run the basic tests for a storage as described in the official storage API The most complete and most out-of-date description of the interface is: http://www.zope.org/Documentation/Developer/Models/ZODB/ZODB_Architecture_Storage_Interface_Info.html All storages should be able to pass these tests. 
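The tests drive storages directly through the two-phase commit API. As a
rough sketch (illustrative only; the tests below spell out the exact calls
and arguments)::

    t = transaction.Transaction()
    storage.tpc_begin(t)
    storage.store(oid, serial, data, '', t)
    storage.tpc_vote(t)
    storage.tpc_finish(t)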
""" from __future__ import with_statement from ZODB import POSException from ZODB.tests.MinPO import MinPO from ZODB.tests.StorageTestBase import zodb_unpickle, zodb_pickle from ZODB.tests.StorageTestBase import handle_serials import threading import time import transaction import zope.interface import zope.interface.verify ZERO = '\0'*8 class BasicStorage: def checkBasics(self): self.assertEqual(self._storage.lastTransaction(), '\0'*8) t = transaction.Transaction() self._storage.tpc_begin(t) self.assertRaises(POSException.StorageTransactionError, self._storage.tpc_begin, t) # Aborting is easy self._storage.tpc_abort(t) # Test a few expected exceptions when we're doing operations giving a # different Transaction object than the one we've begun on. self._storage.tpc_begin(t) self.assertRaises( POSException.StorageTransactionError, self._storage.store, ZERO, ZERO, '', '', transaction.Transaction()) self.assertRaises( POSException.StorageTransactionError, self._storage.store, ZERO, 1, '2', '', transaction.Transaction()) self.assertRaises( POSException.StorageTransactionError, self._storage.tpc_vote, transaction.Transaction()) self._storage.tpc_abort(t) def checkSerialIsNoneForInitialRevision(self): eq = self.assertEqual oid = self._storage.new_oid() txn = transaction.Transaction() self._storage.tpc_begin(txn) # Use None for serial. Don't use _dostore() here because that coerces # serial=None to serial=ZERO. r1 = self._storage.store(oid, None, zodb_pickle(MinPO(11)), '', txn) r2 = self._storage.tpc_vote(txn) self._storage.tpc_finish(txn) newrevid = handle_serials(oid, r1, r2) data, revid = self._storage.load(oid, '') value = zodb_unpickle(data) eq(value, MinPO(11)) eq(revid, newrevid) def checkStore(self): revid = ZERO newrevid = self._dostore(revid=None) # Finish the transaction. 
self.assertNotEqual(newrevid, revid) def checkStoreAndLoad(self): eq = self.assertEqual oid = self._storage.new_oid() self._dostore(oid=oid, data=MinPO(7)) data, revid = self._storage.load(oid, '') value = zodb_unpickle(data) eq(value, MinPO(7)) # Now do a bunch of updates to an object for i in range(13, 22): revid = self._dostore(oid, revid=revid, data=MinPO(i)) # Now get the latest revision of the object data, revid = self._storage.load(oid, '') eq(zodb_unpickle(data), MinPO(21)) def checkConflicts(self): oid = self._storage.new_oid() revid1 = self._dostore(oid, data=MinPO(11)) self._dostore(oid, revid=revid1, data=MinPO(12)) self.assertRaises(POSException.ConflictError, self._dostore, oid, revid=revid1, data=MinPO(13)) def checkWriteAfterAbort(self): oid = self._storage.new_oid() t = transaction.Transaction() self._storage.tpc_begin(t) self._storage.store(oid, ZERO, zodb_pickle(MinPO(5)), '', t) # Now abort this transaction self._storage.tpc_abort(t) # Now start all over again oid = self._storage.new_oid() self._dostore(oid=oid, data=MinPO(6)) def checkAbortAfterVote(self): oid1 = self._storage.new_oid() revid1 = self._dostore(oid=oid1, data=MinPO(-2)) oid = self._storage.new_oid() t = transaction.Transaction() self._storage.tpc_begin(t) self._storage.store(oid, ZERO, zodb_pickle(MinPO(5)), '', t) # Now abort this transaction self._storage.tpc_vote(t) self._storage.tpc_abort(t) # Now start all over again oid = self._storage.new_oid() revid = self._dostore(oid=oid, data=MinPO(6)) for oid, revid in [(oid1, revid1), (oid, revid)]: data, _revid = self._storage.load(oid, '') self.assertEqual(revid, _revid) def checkStoreTwoObjects(self): noteq = self.assertNotEqual p31, p32, p51, p52 = map(MinPO, (31, 32, 51, 52)) oid1 = self._storage.new_oid() oid2 = self._storage.new_oid() noteq(oid1, oid2) revid1 = self._dostore(oid1, data=p31) revid2 = self._dostore(oid2, data=p51) noteq(revid1, revid2) revid3 = self._dostore(oid1, revid=revid1, data=p32) revid4 = self._dostore(oid2, revid=revid2, data=p52) noteq(revid3, revid4) def checkGetTid(self): if not hasattr(self._storage, 'getTid'): return eq = self.assertEqual p41, p42 = map(MinPO, (41, 42)) oid = self._storage.new_oid() self.assertRaises(KeyError, self._storage.getTid, oid) # Now store a revision revid1 = self._dostore(oid, data=p41) eq(revid1, self._storage.getTid(oid)) # And another one revid2 = self._dostore(oid, revid=revid1, data=p42) eq(revid2, self._storage.getTid(oid)) def checkLen(self): # len(storage) reports the number of objects. # check it is zero when empty self.assertEqual(len(self._storage),0) # check it is correct when the storage contains two object. # len may also be zero, for storages that do not keep track # of this number self._dostore(data=MinPO(22)) self._dostore(data=MinPO(23)) self.assert_(len(self._storage) in [0,2]) def checkGetSize(self): self._dostore(data=MinPO(25)) size = self._storage.getSize() # The storage API doesn't make any claims about what size # means except that it ought to be printable. 
str(size) def checkNote(self): oid = self._storage.new_oid() t = transaction.Transaction() self._storage.tpc_begin(t) t.note('this is a test') self._storage.store(oid, ZERO, zodb_pickle(MinPO(5)), '', t) self._storage.tpc_vote(t) self._storage.tpc_finish(t) def checkInterfaces(self): for iface in zope.interface.providedBy(self._storage): zope.interface.verify.verifyObject(iface, self._storage) def checkMultipleEmptyTransactions(self): # There was a bug in handling empty transactions in mapping # storage that caused the commit lock not to be released. :( transaction.begin() t = transaction.get() self._storage.tpc_begin(t) self._storage.tpc_vote(t) self._storage.tpc_finish(t) t.commit() transaction.begin() t = transaction.get() self._storage.tpc_begin(t) # Hung here before self._storage.tpc_vote(t) self._storage.tpc_finish(t) t.commit() def _do_store_in_separate_thread(self, oid, revid, voted): # We'll run the competing trans in a separate thread: thread = threading.Thread(name='T2', target=self._dostore, args=(oid,), kwargs=dict(revid=revid)) thread.setDaemon(True) thread.start() thread.join(.1) return thread def check_checkCurrentSerialInTransaction(self): oid = '\0\0\0\0\0\0\0\xf0' tid = self._dostore(oid) tid2 = self._dostore(oid, revid=tid) data = 'cpersistent\nPersistent\nq\x01.N.' # a simple persistent obj #---------------------------------------------------------------------- # stale read transaction.begin() t = transaction.get() self._storage.tpc_begin(t) try: self._storage.store('\0\0\0\0\0\0\0\xf1', '\0\0\0\0\0\0\0\0', data, '', t) self._storage.checkCurrentSerialInTransaction(oid, tid, t) self._storage.tpc_vote(t) except POSException.ReadConflictError, v: self.assert_(v.oid) == oid self.assert_(v.serials == (tid2, tid)) else: if 0: self.assert_(False, "No conflict error") self._storage.tpc_abort(t) #---------------------------------------------------------------------- # non-stale read, no stress. :) transaction.begin() t = transaction.get() self._storage.tpc_begin(t) self._storage.store('\0\0\0\0\0\0\0\xf2', '\0\0\0\0\0\0\0\0', data, '', t) self._storage.checkCurrentSerialInTransaction(oid, tid2, t) self._storage.tpc_vote(t) self._storage.tpc_finish(t) #---------------------------------------------------------------------- # non-stale read, competition after vote. The competing # transaction must produce a tid > this transaction's tid transaction.begin() t = transaction.get() self._storage.tpc_begin(t) self._storage.store('\0\0\0\0\0\0\0\xf3', '\0\0\0\0\0\0\0\0', data, '', t) self._storage.checkCurrentSerialInTransaction(oid, tid2, t) self._storage.tpc_vote(t) # We'll run the competing trans in a separate thread: thread = self._do_store_in_separate_thread(oid, tid2, True) self._storage.tpc_finish(t) thread.join(33) tid3 = self._storage.load(oid)[1] self.assert_(tid3 > self._storage.load('\0\0\0\0\0\0\0\xf3')[1]) #---------------------------------------------------------------------- # non-stale competing trans after checkCurrentSerialInTransaction transaction.begin() t = transaction.get() self._storage.tpc_begin(t) self._storage.store('\0\0\0\0\0\0\0\xf4', '\0\0\0\0\0\0\0\0', data, '', t) self._storage.checkCurrentSerialInTransaction(oid, tid3, t) thread = self._do_store_in_separate_thread(oid, tid3, False) # There are 2 possibilities: # 1. The store happens before this transaction completes, # in which case, the vote below fails. # 2. The store happens after this trans, in which case, the # tid of the object is greater than this transaction's tid. 
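        # Either outcome is acceptable here; when the vote succeeds, the code
        # below verifies the tid ordering.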
try: self._storage.tpc_vote(t) except POSException.ReadConflictError: thread.join() # OK :) else: self._storage.tpc_finish(t) thread.join() tid4 = self._storage.load(oid)[1] self.assert_(tid4 > self._storage.load('\0\0\0\0\0\0\0\xf4')[1]) def check_tid_ordering_w_commit(self): # It's important that storages always give a consistent # ordering for revisions, tids. This is most likely to fail # around commit. Here we'll do some basic tests to check this. # We'll use threads to arrange for ordering to go wrong and # verify that a storage gets it right. # First, some initial data. t = transaction.get() self._storage.tpc_begin(t) self._storage.store(ZERO, ZERO, 'x', '', t) self._storage.tpc_vote(t) tids = [] self._storage.tpc_finish(t, lambda tid: tids.append(tid)) # OK, now we'll start a new transaction, take it to finish, # and then block finish while we do some other operations. t = transaction.get() self._storage.tpc_begin(t) self._storage.store(ZERO, tids[0], 'y', '', t) self._storage.tpc_vote(t) to_join = [] def run_in_thread(func): t = threading.Thread(target=func) t.setDaemon(True) t.start() to_join.append(t) started = threading.Event() finish = threading.Event() @run_in_thread def commit(): def callback(tid): started.set() tids.append(tid) finish.wait() self._storage.tpc_finish(t, callback) results = {} started.wait() attempts = [] attempts_cond = threading.Condition() def update_attempts(): with attempts_cond: attempts.append(1) attempts_cond.notifyAll() @run_in_thread def lastTransaction(): update_attempts() results['lastTransaction'] = self._storage.lastTransaction() @run_in_thread def load(): update_attempts() results['load'] = self._storage.load(ZERO, '')[1] expected_attempts = 2 if hasattr(self._storage, 'getTid'): expected_attempts += 1 @run_in_thread def getTid(): update_attempts() results['getTid'] = self._storage.getTid(ZERO) if hasattr(self._storage, 'lastInvalidations'): expected_attempts += 1 @run_in_thread def lastInvalidations(): update_attempts() invals = self._storage.lastInvalidations(1) if invals: results['lastInvalidations'] = invals[0][0] with attempts_cond: while len(attempts) < expected_attempts: attempts_cond.wait() time.sleep(.01) # for good measure :) finish.set() for t in to_join: t.join(1) self.assertEqual(results.pop('load'), tids[1]) self.assertEqual(results.pop('lastTransaction'), tids[1]) for m, tid in results.items(): self.assertEqual(tid, tids[1]) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/ConflictResolution.py000066400000000000000000000142751230730566700260070ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## """Tests for application-level conflict resolution.""" from ZODB.POSException import ConflictError, UndoError from persistent import Persistent from transaction import Transaction from ZODB.tests.StorageTestBase import zodb_unpickle, zodb_pickle class PCounter(Persistent): _value = 0 def __repr__(self): return "" % self._value def inc(self): self._value = self._value + 1 def _p_resolveConflict(self, oldState, savedState, newState): savedDiff = savedState['_value'] - oldState['_value'] newDiff = newState['_value'] - oldState['_value'] oldState['_value'] = oldState['_value'] + savedDiff + newDiff return oldState # Insecurity: What if _p_resolveConflict _thinks_ it resolved the # conflict, but did something wrong? class PCounter2(PCounter): def _p_resolveConflict(self, oldState, savedState, newState): raise ConflictError class PCounter3(PCounter): def _p_resolveConflict(self, oldState, savedState, newState): raise AttributeError("no attribute (testing conflict resolution)") class PCounter4(PCounter): def _p_resolveConflict(self, oldState, savedState): raise RuntimeError("Can't get here; not enough args") class ConflictResolvingStorage: def checkResolve(self): obj = PCounter() obj.inc() oid = self._storage.new_oid() revid1 = self._dostoreNP(oid, data=zodb_pickle(obj)) obj.inc() obj.inc() # The effect of committing two transactions with the same # pickle is to commit two different transactions relative to # revid1 that add two to _value. revid2 = self._dostoreNP(oid, revid=revid1, data=zodb_pickle(obj)) revid3 = self._dostoreNP(oid, revid=revid1, data=zodb_pickle(obj)) data, serialno = self._storage.load(oid, '') inst = zodb_unpickle(data) self.assertEqual(inst._value, 5) def checkUnresolvable(self): obj = PCounter2() obj.inc() oid = self._storage.new_oid() revid1 = self._dostoreNP(oid, data=zodb_pickle(obj)) obj.inc() obj.inc() # The effect of committing two transactions with the same # pickle is to commit two different transactions relative to # revid1 that add two to _value. revid2 = self._dostoreNP(oid, revid=revid1, data=zodb_pickle(obj)) try: self._dostoreNP(oid, revid=revid1, data=zodb_pickle(obj)) except ConflictError, err: self.assert_("PCounter2" in str(err)) else: self.fail("Expected ConflictError") def checkZClassesArentResolved(self): from ZODB.ConflictResolution import find_global, BadClassName dummy_class_tuple = ('*foobar', ()) self.assertRaises(BadClassName, find_global, '*foobar', ()) def checkBuggyResolve1(self): obj = PCounter3() obj.inc() oid = self._storage.new_oid() revid1 = self._dostoreNP(oid, data=zodb_pickle(obj)) obj.inc() obj.inc() # The effect of committing two transactions with the same # pickle is to commit two different transactions relative to # revid1 that add two to _value. revid2 = self._dostoreNP(oid, revid=revid1, data=zodb_pickle(obj)) self.assertRaises(ConflictError, self._dostoreNP, oid, revid=revid1, data=zodb_pickle(obj)) def checkBuggyResolve2(self): obj = PCounter4() obj.inc() oid = self._storage.new_oid() revid1 = self._dostoreNP(oid, data=zodb_pickle(obj)) obj.inc() obj.inc() # The effect of committing two transactions with the same # pickle is to commit two different transactions relative to # revid1 that add two to _value. 
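        # PCounter4._p_resolveConflict has the wrong signature (and raises if
        # reached), so resolution fails; the storage must surface this as a
        # ConflictError rather than crashing.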
revid2 = self._dostoreNP(oid, revid=revid1, data=zodb_pickle(obj)) self.assertRaises(ConflictError, self._dostoreNP, oid, revid=revid1, data=zodb_pickle(obj)) class ConflictResolvingTransUndoStorage: def checkUndoConflictResolution(self): # This test is based on checkNotUndoable in the # TransactionalUndoStorage test suite. Except here, conflict # resolution should allow us to undo the transaction anyway. obj = PCounter() obj.inc() oid = self._storage.new_oid() revid_a = self._dostore(oid, data=obj) obj.inc() revid_b = self._dostore(oid, revid=revid_a, data=obj) obj.inc() revid_c = self._dostore(oid, revid=revid_b, data=obj) # Start the undo info = self._storage.undoInfo() tid = info[1]['id'] t = Transaction() self._storage.tpc_begin(t) self._storage.undo(tid, t) self._storage.tpc_vote(t) self._storage.tpc_finish(t) def checkUndoUnresolvable(self): # This test is based on checkNotUndoable in the # TransactionalUndoStorage test suite. Except here, conflict # resolution should allow us to undo the transaction anyway. obj = PCounter2() obj.inc() oid = self._storage.new_oid() revid_a = self._dostore(oid, data=obj) obj.inc() revid_b = self._dostore(oid, revid=revid_a, data=obj) obj.inc() revid_c = self._dostore(oid, revid=revid_b, data=obj) # Start the undo info = self._storage.undoInfo() tid = info[1]['id'] t = Transaction() self.assertRaises(UndoError, self._begin_undos_vote, t, tid) self._storage.tpc_abort(t) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/Corruption.py000066400000000000000000000043701230730566700243210ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## """Do some minimal tests of data corruption""" import os import random import stat import ZODB.FileStorage from StorageTestBase import StorageTestBase class FileStorageCorruptTests(StorageTestBase): def setUp(self): StorageTestBase.setUp(self) self._storage = ZODB.FileStorage.FileStorage('Data.fs', create=1) def _do_stores(self): oids = [] for i in range(5): oid = self._storage.new_oid() revid = self._dostore(oid) oids.append((oid, revid)) return oids def _check_stores(self, oids): for oid, revid in oids: data, s_revid = self._storage.load(oid, '') self.assertEqual(s_revid, revid) def checkTruncatedIndex(self): oids = self._do_stores() self._close() # truncation the index file self.failUnless(os.path.exists('Data.fs.index')) f = open('Data.fs.index', 'r+') f.seek(0, 2) size = f.tell() f.seek(size / 2) f.truncate() f.close() self._storage = ZODB.FileStorage.FileStorage('Data.fs') self._check_stores(oids) def checkCorruptedIndex(self): oids = self._do_stores() self._close() # truncation the index file self.failUnless(os.path.exists('Data.fs.index')) size = os.stat('Data.fs.index')[stat.ST_SIZE] f = open('Data.fs.index', 'r+') while f.tell() < size: f.seek(random.randrange(1, size / 10), 1) f.write('\000') f.close() self._storage = ZODB.FileStorage.FileStorage('Data.fs') self._check_stores(oids) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/HistoryStorage.py000066400000000000000000000041311230730566700251360ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Run the history() related tests for a storage. Any storage that supports the history() method should be able to pass all these tests. 
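history() returns a sequence of dictionaries describing an object's most
recent revisions, newest first, with ``size`` capping how many are
returned. A rough sketch of the usage exercised below (illustrative
only)::

    h = storage.history(oid, size=3)
    h[0]['tid']   # tid of the most recent revision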
""" from ZODB.tests.MinPO import MinPO class HistoryStorage: def checkSimpleHistory(self): eq = self.assertEqual # Store a couple of revisions of the object oid = self._storage.new_oid() self.assertRaises(KeyError,self._storage.history,oid) revid1 = self._dostore(oid, data=MinPO(11)) revid2 = self._dostore(oid, revid=revid1, data=MinPO(12)) revid3 = self._dostore(oid, revid=revid2, data=MinPO(13)) # Now get various snapshots of the object's history h = self._storage.history(oid, size=1) eq(len(h), 1) d = h[0] eq(d['tid'], revid3) # Try to get 2 historical revisions h = self._storage.history(oid, size=2) eq(len(h), 2) d = h[0] eq(d['tid'], revid3) d = h[1] eq(d['tid'], revid2) # Try to get all 3 historical revisions h = self._storage.history(oid, size=3) eq(len(h), 3) d = h[0] eq(d['tid'], revid3) d = h[1] eq(d['tid'], revid2) d = h[2] eq(d['tid'], revid1) # There should be no more than 3 revisions h = self._storage.history(oid, size=4) eq(len(h), 3) d = h[0] eq(d['tid'], revid3) d = h[1] eq(d['tid'], revid2) d = h[2] eq(d['tid'], revid1) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/IExternalGC.test000066400000000000000000000070141230730566700246070ustar00rootroot00000000000000Storage Support for external GC =============================== A storage that provides IExternalGC supports external garbage collectors by providing a deleteObject method that transactionally deletes an object. A create_storage function is provided that creates a storage. >>> storage = create_storage() >>> import ZODB.blob, transaction >>> db = ZODB.DB(storage) >>> conn = db.open() >>> conn.root()[0] = conn.root().__class__() >>> conn.root()[1] = ZODB.blob.Blob('some data') >>> transaction.commit() >>> oid0 = conn.root()[0]._p_oid >>> oid1 = conn.root()[1]._p_oid >>> del conn.root()[0] >>> del conn.root()[1] >>> transaction.commit() At this point, object 0 and 1 is garbage, but it's still in the storage: >>> p0, s0 = storage.load(oid0, '') >>> p1, s1 = storage.load(oid1, '') The storage is configured not to gc on pack, so even if we pack, these objects won't go away: >>> len(storage) 3 >>> import time >>> db.pack(time.time()+1) >>> len(storage) 3 >>> p0, s0 = storage.load(oid0, '') >>> p1, s1 = storage.load(oid1, '') Now we'll use the new deleteObject API to delete the objects. We can't go through the database to do this, so we'll have to manage the transaction ourselves. >>> txn = transaction.begin() >>> storage.tpc_begin(txn) >>> storage.deleteObject(oid0, s0, txn) >>> storage.deleteObject(oid1, s1, txn) >>> storage.tpc_vote(txn) >>> storage.tpc_finish(txn) >>> tid = storage.lastTransaction() Now if we try to load data for the objects, we get a POSKeyError: >>> storage.load(oid0, '') # doctest: +ELLIPSIS Traceback (most recent call last): ... POSKeyError: ... >>> storage.load(oid1, '') # doctest: +ELLIPSIS Traceback (most recent call last): ... POSKeyError: ... We can still get the data if we load before the time we deleted. >>> storage.loadBefore(oid0, conn.root()._p_serial) == (p0, s0, tid) True >>> storage.loadBefore(oid1, conn.root()._p_serial) == (p1, s1, tid) True >>> open(storage.loadBlob(oid1, s1)).read() 'some data' If we pack, however, the old data will be removed and the data will be gone: >>> db.pack(time.time()+1) >>> len(db.storage) 1 >>> time.sleep(.1) >>> storage.load(oid0, '') # doctest: +ELLIPSIS Traceback (most recent call last): ... POSKeyError: ... >>> storage.load(oid1, '') # doctest: +ELLIPSIS Traceback (most recent call last): ... POSKeyError: ... 
>>> storage.loadBefore(oid0, conn.root()._p_serial) # doctest: +ELLIPSIS Traceback (most recent call last): ... POSKeyError: ... >>> storage.loadBefore(oid1, conn.root()._p_serial) # doctest: +ELLIPSIS Traceback (most recent call last): ... POSKeyError: ... >>> storage.loadBlob(oid1, s1) # doctest: +ELLIPSIS Traceback (most recent call last): ... POSKeyError: ... A conflict error is raised if the serial we provide to deleteObject isn't current: >>> conn.root()[0] = conn.root().__class__() >>> transaction.commit() >>> oid = conn.root()[0]._p_oid >>> bad_serial = conn.root()[0]._p_serial >>> conn.root()[0].x = 1 >>> transaction.commit() >>> txn = transaction.begin() >>> storage.tpc_begin(txn) >>> storage.deleteObject(oid, bad_serial, txn); storage.tpc_vote(txn) ... # doctest: +ELLIPSIS Traceback (most recent call last): ... ConflictError: database conflict error ... >>> storage.tpc_abort(txn) >>> storage.close() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/IteratorStorage.py000066400000000000000000000227131230730566700252740ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Run tests against the iterator() interface for storages. Any storage that supports the iterator() method should be able to pass all these tests. """ from ZODB.tests.MinPO import MinPO from ZODB.tests.StorageTestBase import zodb_pickle, zodb_unpickle from ZODB.utils import U64, p64 from transaction import Transaction import itertools import ZODB.blob class IteratorCompare: def iter_verify(self, txniter, revids, val0): eq = self.assertEqual oid = self._oid val = val0 for reciter, revid in itertools.izip(txniter, revids + [None]): eq(reciter.tid, revid) for rec in reciter: eq(rec.oid, oid) eq(rec.tid, revid) eq(zodb_unpickle(rec.data), MinPO(val)) val = val + 1 eq(val, val0 + len(revids)) class IteratorStorage(IteratorCompare): def checkSimpleIteration(self): # Store a bunch of revisions of a single object self._oid = oid = self._storage.new_oid() revid1 = self._dostore(oid, data=MinPO(11)) revid2 = self._dostore(oid, revid=revid1, data=MinPO(12)) revid3 = self._dostore(oid, revid=revid2, data=MinPO(13)) # Now iterate over all the transactions and compare carefully txniter = self._storage.iterator() self.iter_verify(txniter, [revid1, revid2, revid3], 11) def checkUndoZombie(self): oid = self._storage.new_oid() revid = self._dostore(oid, data=MinPO(94)) # Get the undo information info = self._storage.undoInfo() tid = info[0]['id'] # Undo the creation of the object, rendering it a zombie t = Transaction() self._storage.tpc_begin(t) oids = self._storage.undo(tid, t) self._storage.tpc_vote(t) self._storage.tpc_finish(t) # Now attempt to iterator over the storage iter = self._storage.iterator() for txn in iter: for rec in txn: pass # The last transaction performed an undo of the transaction that # created object oid. (As Barry points out, the object is now in the # George Bailey state.) 
Assert that the final data record contains # None in the data attribute. self.assertEqual(rec.oid, oid) self.assertEqual(rec.data, None) def checkTransactionExtensionFromIterator(self): oid = self._storage.new_oid() revid = self._dostore(oid, data=MinPO(1)) iter = self._storage.iterator() count = 0 for txn in iter: self.assertEqual(txn.extension, {}) count +=1 self.assertEqual(count, 1) def checkIterationIntraTransaction(self): # TODO: Try this test with logging enabled. If you see something # like # # ZODB FS FS21 warn: FileStorageTests.fs truncated, possibly due to # damaged records at 4 # # Then the code in FileIterator.next() hasn't yet been fixed. # Should automate that check. oid = self._storage.new_oid() t = Transaction() data = zodb_pickle(MinPO(0)) try: self._storage.tpc_begin(t) self._storage.store(oid, '\0'*8, data, '', t) self._storage.tpc_vote(t) # Don't do tpc_finish yet it = self._storage.iterator() for x in it: pass finally: self._storage.tpc_finish(t) def checkLoad_was_checkLoadEx(self): oid = self._storage.new_oid() self._dostore(oid, data=42) data, tid = self._storage.load(oid, "") self.assertEqual(zodb_unpickle(data), MinPO(42)) match = False for txn in self._storage.iterator(): for rec in txn: if rec.oid == oid and rec.tid == tid: self.assertEqual(txn.tid, tid) match = True if not match: self.fail("Could not find transaction with matching id") def checkIterateRepeatedly(self): self._dostore() transactions = self._storage.iterator() self.assertEquals(1, len(list(transactions))) # The iterator can only be consumed once: self.assertEquals(0, len(list(transactions))) def checkIterateRecordsRepeatedly(self): self._dostore() tinfo = self._storage.iterator().next() self.assertEquals(1, len(list(tinfo))) self.assertEquals(1, len(list(tinfo))) def checkIterateWhileWriting(self): self._dostore() iterator = self._storage.iterator() # We have one transaction with 1 modified object. txn_1 = iterator.next() self.assertEquals(1, len(list(txn_1))) # We store another transaction with 1 object, the already running # iterator does not pick this up. self._dostore() self.assertRaises(StopIteration, iterator.next) class ExtendedIteratorStorage(IteratorCompare): def checkExtendedIteration(self): # Store a bunch of revisions of a single object self._oid = oid = self._storage.new_oid() revid1 = self._dostore(oid, data=MinPO(11)) revid2 = self._dostore(oid, revid=revid1, data=MinPO(12)) revid3 = self._dostore(oid, revid=revid2, data=MinPO(13)) revid4 = self._dostore(oid, revid=revid3, data=MinPO(14)) # Note that the end points are included # Iterate over all of the transactions with explicit start/stop txniter = self._storage.iterator(revid1, revid4) self.iter_verify(txniter, [revid1, revid2, revid3, revid4], 11) # Iterate over some of the transactions with explicit start txniter = self._storage.iterator(revid3) self.iter_verify(txniter, [revid3, revid4], 13) # Iterate over some of the transactions with explicit stop txniter = self._storage.iterator(None, revid2) self.iter_verify(txniter, [revid1, revid2], 11) # Iterate over some of the transactions with explicit start+stop txniter = self._storage.iterator(revid2, revid3) self.iter_verify(txniter, [revid2, revid3], 12) # Specify an upper bound somewhere in between values revid3a = p64((U64(revid3) + U64(revid4)) / 2) txniter = self._storage.iterator(revid2, revid3a) self.iter_verify(txniter, [revid2, revid3], 12) # Specify a lower bound somewhere in between values. # revid2 == revid1+1 is very likely on Windows. 
Adding 1 before # dividing ensures that "the midpoint" we compute is strictly larger # than revid1. revid1a = p64((U64(revid1) + 1 + U64(revid2)) / 2) assert revid1 < revid1a txniter = self._storage.iterator(revid1a, revid3a) self.iter_verify(txniter, [revid2, revid3], 12) # Specify an empty range txniter = self._storage.iterator(revid3, revid2) self.iter_verify(txniter, [], 13) # Specify a singleton range txniter = self._storage.iterator(revid3, revid3) self.iter_verify(txniter, [revid3], 13) class IteratorDeepCompare: def compare(self, storage1, storage2): eq = self.assertEqual iter1 = storage1.iterator() iter2 = storage2.iterator() for txn1, txn2 in itertools.izip(iter1, iter2): eq(txn1.tid, txn2.tid) eq(txn1.status, txn2.status) eq(txn1.user, txn2.user) eq(txn1.description, txn2.description) eq(txn1.extension, txn2.extension) itxn1 = iter(txn1) itxn2 = iter(txn2) for rec1, rec2 in itertools.izip(itxn1, itxn2): eq(rec1.oid, rec2.oid) eq(rec1.tid, rec2.tid) eq(rec1.data, rec2.data) if ZODB.blob.is_blob_record(rec1.data): try: fn1 = storage1.loadBlob(rec1.oid, rec1.tid) except ZODB.POSException.POSKeyError: self.assertRaises( ZODB.POSException.POSKeyError, storage2.loadBlob, rec1.oid, rec1.tid) else: fn2 = storage2.loadBlob(rec1.oid, rec1.tid) self.assert_(fn1 != fn2) eq(open(fn1, 'rb').read(), open(fn2, 'rb').read()) # Make sure there are no more records left in rec1 and rec2, # meaning they were the same length. # Additionally, check that we're backwards compatible to the # IndexError we used to raise before. self.assertRaises(StopIteration, itxn1.next) self.assertRaises(StopIteration, itxn2.next) # Make sure ther are no more records left in txn1 and txn2, meaning # they were the same length self.assertRaises(StopIteration, iter1.next) self.assertRaises(StopIteration, iter2.next) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/MTStorage.py000066400000000000000000000162111230730566700240170ustar00rootroot00000000000000import random import sys import threading import time from persistent.mapping import PersistentMapping import transaction import ZODB from ZODB.tests.StorageTestBase import zodb_pickle, zodb_unpickle from ZODB.tests.StorageTestBase import handle_serials from ZODB.tests.MinPO import MinPO from ZODB.POSException import ConflictError SHORT_DELAY = 0.01 def sort(l): "Sort a list in place and return it." l.sort() return l class TestThread(threading.Thread): """Base class for defining threads that run from unittest. If the thread exits with an uncaught exception, catch it and re-raise it when the thread is joined. The re-raise will cause the test to fail. The subclass should define a runtest() method instead of a run() method. 
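
    A minimal usage sketch (illustrative only; do_some_storage_work is a
    hypothetical test body)::

        class Worker(TestThread):
            def runtest(self):
                do_some_storage_work()  # an uncaught exception here is saved

        t = Worker()
        t.start()
        t.join()  # the saved exception is re-raised here, failing the test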
""" def __init__(self): threading.Thread.__init__(self) self._exc_info = None def run(self): try: self.runtest() except: self._exc_info = sys.exc_info() def join(self, timeout=None): threading.Thread.join(self, timeout) if self._exc_info: raise self._exc_info[0], self._exc_info[1], self._exc_info[2] class ZODBClientThread(TestThread): __super_init = TestThread.__init__ def __init__(self, db, test, commits=10, delay=SHORT_DELAY): self.__super_init() self.setDaemon(1) self.db = db self.test = test self.commits = commits self.delay = delay def runtest(self): conn = self.db.open() conn.sync() root = conn.root() d = self.get_thread_dict(root) if d is None: self.test.fail() else: for i in range(self.commits): self.commit(d, i) self.test.assertEqual(sort(d.keys()), range(self.commits)) def commit(self, d, num): d[num] = time.time() time.sleep(self.delay) transaction.commit() time.sleep(self.delay) # Return a new PersistentMapping, and store it on the root object under # the name (.getName()) of the current thread. def get_thread_dict(self, root): # This is vicious: multiple threads are slamming changes into the # root object, then trying to read the root object, simultaneously # and without any coordination. Conflict errors are rampant. It # used to go around at most 10 times, but that fairly often failed # to make progress in the 7-thread tests on some test boxes. Going # around (at most) 1000 times was enough so that a 100-thread test # reliably passed on Tim's hyperthreaded WinXP box (but at the # original 10 retries, the same test reliably failed with 15 threads). name = self.getName() MAXRETRIES = 1000 for i in range(MAXRETRIES): try: root[name] = PersistentMapping() transaction.commit() break except ConflictError: root._p_jar.sync() else: raise ConflictError("Exceeded %d attempts to store" % MAXRETRIES) for j in range(MAXRETRIES): try: return root.get(name) except ConflictError: root._p_jar.sync() raise ConflictError("Exceeded %d attempts to read" % MAXRETRIES) class StorageClientThread(TestThread): __super_init = TestThread.__init__ def __init__(self, storage, test, commits=10, delay=SHORT_DELAY): self.__super_init() self.storage = storage self.test = test self.commits = commits self.delay = delay self.oids = {} def runtest(self): for i in range(self.commits): self.dostore(i) self.check() def check(self): for oid, revid in self.oids.items(): data, serial = self.storage.load(oid, '') self.test.assertEqual(serial, revid) obj = zodb_unpickle(data) self.test.assertEqual(obj.value[0], self.getName()) def pause(self): time.sleep(self.delay) def oid(self): oid = self.storage.new_oid() self.oids[oid] = None return oid def dostore(self, i): data = zodb_pickle(MinPO((self.getName(), i))) t = transaction.Transaction() oid = self.oid() self.pause() self.storage.tpc_begin(t) self.pause() # Always create a new object, signified by None for revid r1 = self.storage.store(oid, None, data, '', t) self.pause() r2 = self.storage.tpc_vote(t) self.pause() self.storage.tpc_finish(t) self.pause() revid = handle_serials(oid, r1, r2) self.oids[oid] = revid class ExtStorageClientThread(StorageClientThread): def runtest(self): # pick some other storage ops to execute, depending in part # on the features provided by the storage. 
names = ["do_load"] storage = self.storage try: supportsUndo = storage.supportsUndo except AttributeError: pass else: if supportsUndo(): names += ["do_loadSerial", "do_undoLog", "do_iterator"] ops = [getattr(self, meth) for meth in names] assert ops, "Didn't find an storage ops in %s" % self.storage # do a store to guarantee there's at least one oid in self.oids self.dostore(0) for i in range(self.commits - 1): meth = random.choice(ops) meth() self.dostore(i) self.check() def pick_oid(self): return random.choice(self.oids.keys()) def do_load(self): oid = self.pick_oid() self.storage.load(oid, '') def do_loadSerial(self): oid = self.pick_oid() self.storage.loadSerial(oid, self.oids[oid]) def do_undoLog(self): self.storage.undoLog(0, -20) def do_iterator(self): try: iter = self.storage.iterator() except AttributeError: # It's hard to detect that a ZEO ClientStorage # doesn't have this method, but does have all the others. return for obj in iter: pass class MTStorage: "Test a storage with multiple client threads executing concurrently." def _checkNThreads(self, n, constructor, *args): threads = [constructor(*args) for i in range(n)] for t in threads: t.start() for t in threads: t.join(60) for t in threads: self.failIf(t.isAlive(), "thread failed to finish in 60 seconds") def check2ZODBThreads(self): db = ZODB.DB(self._storage) self._checkNThreads(2, ZODBClientThread, db, self) db.close() def check7ZODBThreads(self): db = ZODB.DB(self._storage) self._checkNThreads(7, ZODBClientThread, db, self) db.close() def check2StorageThreads(self): self._checkNThreads(2, StorageClientThread, self._storage, self) def check7StorageThreads(self): self._checkNThreads(7, StorageClientThread, self._storage, self) def check4ExtStorageThread(self): self._checkNThreads(4, ExtStorageClientThread, self._storage, self) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/MVCCMappingStorage.py000066400000000000000000000113741230730566700255500ustar00rootroot00000000000000############################################################################## # # Copyright (c) Zope Corporation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """An extension of MappingStorage that depends on polling. Each Connection has its own view of the database. Polling updates each connection's view. """ import ZODB.utils import ZODB.POSException from ZODB.interfaces import IMVCCStorage from ZODB.MappingStorage import MappingStorage from zope.interface import implements class MVCCMappingStorage(MappingStorage): implements(IMVCCStorage) def __init__(self, name="MVCC Mapping Storage"): MappingStorage.__init__(self, name=name) # _polled_tid contains the transaction ID at the last poll. self._polled_tid = '' self._data_snapshot = None # {oid->(state, tid)} self._main_lock_acquire = self._lock_acquire self._main_lock_release = self._lock_release def new_instance(self): """Returns a storage instance that is a view of the same data. """ inst = MVCCMappingStorage(name=self.__name__) # All instances share the same OID data, transaction log, commit lock, # and OID sequence. 
inst._data = self._data inst._transactions = self._transactions inst._commit_lock = self._commit_lock inst.new_oid = self.new_oid inst.pack = self.pack inst._main_lock_acquire = self._lock_acquire inst._main_lock_release = self._lock_release return inst @ZODB.utils.locked(MappingStorage.opened) def sync(self, force=False): self._data_snapshot = None def release(self): pass @ZODB.utils.locked(MappingStorage.opened) def load(self, oid, version=''): assert not version, "Versions are not supported" if self._data_snapshot is None: self.poll_invalidations() info = self._data_snapshot.get(oid) if info: return info raise ZODB.POSException.POSKeyError(oid) def poll_invalidations(self): """Poll the storage for changes by other connections. """ # prevent changes to _transactions and _data during analysis self._main_lock_acquire() try: if self._transactions: new_tid = self._transactions.maxKey() else: new_tid = '' # Copy the current data into a snapshot. This is obviously # very inefficient for large storages, but it's good for # tests. self._data_snapshot = {} for oid, tid_data in self._data.items(): if tid_data: tid = tid_data.maxKey() self._data_snapshot[oid] = tid_data[tid], tid if self._polled_tid: if not self._transactions.has_key(self._polled_tid): # This connection is so old that we can no longer enumerate # all the changes. self._polled_tid = new_tid return None changed_oids = set() for tid, txn in self._transactions.items( self._polled_tid, new_tid, excludemin=True, excludemax=False): if txn.status == 'p': # This transaction has been packed, so it is no longer # possible to enumerate all changed oids. self._polled_tid = new_tid return None if tid == self._ltid: # ignore the transaction committed by this connection continue changed_oids.update(txn.data.keys()) finally: self._main_lock_release() self._polled_tid = new_tid return list(changed_oids) def tpc_finish(self, transaction, func = lambda tid: None): self._data_snapshot = None MappingStorage.tpc_finish(self, transaction, func) def tpc_abort(self, transaction): self._data_snapshot = None MappingStorage.tpc_abort(self, transaction) def pack(self, t, referencesf, gc=True): # prevent all concurrent commits during packing self._commit_lock.acquire() try: MappingStorage.pack(self, t, referencesf, gc) finally: self._commit_lock.release() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/MinPO.py000066400000000000000000000016771230730566700231460ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## """A minimal persistent object to use for tests""" from persistent import Persistent class MinPO(Persistent): def __init__(self, value=None): self.value = value def __cmp__(self, aMinPO): return cmp(self.value, aMinPO.value) def __repr__(self): return "MinPO(%s)" % self.value ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/PackableStorage.py000066400000000000000000000675201230730566700252120ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Run some tests relevant for storages that support pack().""" from cStringIO import StringIO from persistent import Persistent from persistent.mapping import PersistentMapping from ZODB import DB from ZODB.POSException import ConflictError, StorageError from ZODB.serialize import referencesf from ZODB.tests.MinPO import MinPO from ZODB.tests.MTStorage import TestThread from ZODB.tests.StorageTestBase import snooze import cPickle import doctest import time import transaction import ZODB.interfaces import ZODB.tests.util import zope.testing.setupstack ZERO = '\0'*8 # This class is for the root object. It must not contain a getoid() method # (really, attribute). The persistent pickling machinery -- in the dumps() # function below -- will pickle Root objects as normal, but any attributes # which reference persistent Object instances will get pickled as persistent # ids, not as the object's state. This makes the referencesf stuff work, # because it pickle sniffs for persistent ids (so we have to get those # persistent ids into the root object's pickle). class Root: pass # This is the persistent Object class. Because it has a getoid() method, the # persistent pickling machinery -- in the dumps() function below -- will # pickle the oid string instead of the object's actual state. Yee haw, this # stuff is deep. ;) class Object(object): def __init__(self, oid): self._oid = oid def getoid(self): return self._oid class C(Persistent): pass # Here's where all the magic occurs. Sadly, the pickle module is a bit # underdocumented, but here's what happens: by setting the persistent_id # attribute to getpersid() on the pickler, that function gets called for every # object being pickled. By returning None when the object has no getoid # attribute, it signals pickle to serialize the object as normal. That's how # the Root instance gets pickled correctly. But, if the object has a getoid # attribute, then by returning that method's value, we tell pickle to # serialize the persistent id of the object instead of the object's state. # That sets the pickle up for proper sniffing by the referencesf machinery. # Fun, huh? 
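# A rough usage sketch of the helpers defined below (illustrative only;
# some_oid stands for an oid previously handed out by the storage):
#
#     root = Root()
#     root.obj = Object(some_oid)
#     data = dumps(root)   # root pickled normally, root.obj as just its oid
#     referencesf(data)    # pack()'s reference sniffing can now see some_oid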
def dumps(obj): def getpersid(obj): if hasattr(obj, 'getoid'): return obj.getoid() return None s = StringIO() p = cPickle.Pickler(s, 1) p.inst_persistent_id = getpersid p.dump(obj) p.dump(None) return s.getvalue() def pdumps(obj): s = StringIO() p = cPickle.Pickler(s) p.dump(obj) p.dump(None) return s.getvalue() class PackableStorageBase: # We keep a cache of object ids to instances so that the unpickler can # easily return any persistent object. @property def _cache(self): try: return self.__cache except AttributeError: self.__cache = {} return self.__cache def _newobj(self): # This is a convenience method to create a new persistent Object # instance. It asks the storage for a new object id, creates the # instance with the given oid, populates the cache and returns the # object. oid = self._storage.new_oid() obj = Object(oid) self._cache[obj.getoid()] = obj return obj def _makeloader(self): # This is the other side of the persistent pickling magic. We need a # custom unpickler to mirror our custom pickler above. By setting the # persistent_load function of the unpickler to self._cache.get(), # whenever a persistent id is unpickled, it will actually return the # Object instance out of the cache. As far as returning a function # with an argument bound to an instance attribute method, we do it # this way because it makes the code in the tests more succinct. # # BUT! Be careful in your use of loads() vs. cPickle.loads(). loads() # should only be used on the Root object's pickle since it's the only # special one. All the Object instances should use cPickle.loads(). def loads(str, persfunc=self._cache.get): fp = StringIO(str) u = cPickle.Unpickler(fp) u.persistent_load = persfunc return u.load() return loads def _initroot(self): try: self._storage.load(ZERO, '') except KeyError: from transaction import Transaction file = StringIO() p = cPickle.Pickler(file, 1) p.dump((PersistentMapping, None)) p.dump({'_container': {}}) t=Transaction() t.description='initial database creation' self._storage.tpc_begin(t) self._storage.store(ZERO, None, file.getvalue(), '', t) self._storage.tpc_vote(t) self._storage.tpc_finish(t) def _sanity_check(self): # Iterate over the storage to make sure it's sane. if not ZODB.interfaces.IStorageIteration.providedBy(self._storage): return it = self._storage.iterator() for txn in it: for data in txn: pass class PackableStorage(PackableStorageBase): def checkPackEmptyStorage(self): self._storage.pack(time.time(), referencesf) def checkPackTomorrow(self): self._initroot() self._storage.pack(time.time() + 10000, referencesf) def checkPackYesterday(self): self._initroot() self._storage.pack(time.time() - 10000, referencesf) def _PackWhileWriting(self, pack_now): # A storage should allow some reading and writing during # a pack. This test attempts to exercise locking code # in the storage to test that it is safe. It generates # a lot of revisions, so that pack takes a long time. db = DB(self._storage) conn = db.open() root = conn.root() for i in range(10): root[i] = MinPO(i) transaction.commit() snooze() packt = time.time() choices = range(10) for dummy in choices: for i in choices: root[i].value = MinPO(i) transaction.commit() # How many client threads should we run, and how long should we # wait for them to finish? Hard to say. Running 4 threads and # waiting 30 seconds too often left a thread still alive on Tim's # Win98SE box, during ZEO flavors of this test. Those tend to # run one thread at a time to completion, and take about 10 seconds # per thread. 
There doesn't appear to be a compelling reason to # run that many threads. Running 3 threads and waiting up to a # minute seems to work well in practice. The ZEO tests normally # finish faster than that, and the non-ZEO tests very much faster # than that. NUM_LOOP_TRIP = 50 timer = ElapsedTimer(time.time()) threads = [ClientThread(db, choices, NUM_LOOP_TRIP, timer, i) for i in range(3)] for t in threads: t.start() if pack_now: db.pack(time.time()) else: db.pack(packt) for t in threads: t.join(60) liveness = [t.isAlive() for t in threads] if True in liveness: # They should have finished by now. print 'Liveness:', liveness # Combine the outcomes, and sort by start time. outcomes = [] for t in threads: outcomes.extend(t.outcomes) # each outcome list has as many of these as a loop trip got thru: # thread_id # elapsed millis at loop top # elapsed millis at attempt to assign to self.root[index] # index into self.root getting replaced # elapsed millis when outcome known # 'OK' or 'Conflict' # True if we got beyond this line, False if it raised an # exception (one possible Conflict cause): # self.root[index].value = MinPO(j) def cmp_by_time(a, b): return cmp((a[1], a[0]), (b[1], b[0])) outcomes.sort(cmp_by_time) counts = [0] * 4 for outcome in outcomes: n = len(outcome) assert n >= 2 tid = outcome[0] print 'tid:%d top:%5d' % (tid, outcome[1]), if n > 2: print 'commit:%5d' % outcome[2], if n > 3: print 'index:%2d' % outcome[3], if n > 4: print 'known:%5d' % outcome[4], if n > 5: print '%8s' % outcome[5], if n > 6: print 'assigned:%5s' % outcome[6], counts[tid] += 1 if counts[tid] == NUM_LOOP_TRIP: print 'thread %d done' % tid, print self.fail('a thread is still alive') self._sanity_check() def checkPackWhileWriting(self): self._PackWhileWriting(pack_now=False) def checkPackNowWhileWriting(self): self._PackWhileWriting(pack_now=True) def checkPackLotsWhileWriting(self): # This is like the other pack-while-writing tests, except it packs # repeatedly until the client thread is done. At the time it was # introduced, it reliably provoked # CorruptedError: ... transaction with checkpoint flag set # in the ZEO flavor of the FileStorage tests. 
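        # The loop further below keeps re-packing with a fresh pack time until
        # the single writer thread finishes, maximizing the overlap between
        # pack work and concurrent commits.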
db = DB(self._storage) conn = db.open() root = conn.root() choices = range(10) for i in choices: root[i] = MinPO(i) transaction.commit() snooze() packt = time.time() for dummy in choices: for i in choices: root[i].value = MinPO(i) transaction.commit() NUM_LOOP_TRIP = 100 timer = ElapsedTimer(time.time()) thread = ClientThread(db, choices, NUM_LOOP_TRIP, timer, 0) thread.start() while thread.isAlive(): db.pack(packt) snooze() packt = time.time() thread.join() self._sanity_check() def checkPackWithMultiDatabaseReferences(self): databases = {} db = DB(self._storage, databases=databases, database_name='') otherdb = ZODB.tests.util.DB(databases=databases, database_name='o') conn = db.open() root = conn.root() root[1] = C() transaction.commit() del root[1] transaction.commit() root[2] = conn.get_connection('o').root() transaction.commit() db.pack(time.time()+1) # some valid storages always return 0 for len() self.assertTrue(len(self._storage) in (0, 1)) def checkPackAllRevisions(self): self._initroot() eq = self.assertEqual raises = self.assertRaises # Create a `persistent' object obj = self._newobj() oid = obj.getoid() obj.value = 1 # Commit three different revisions revid1 = self._dostoreNP(oid, data=pdumps(obj)) obj.value = 2 revid2 = self._dostoreNP(oid, revid=revid1, data=pdumps(obj)) obj.value = 3 revid3 = self._dostoreNP(oid, revid=revid2, data=pdumps(obj)) # Now make sure all three revisions can be extracted data = self._storage.loadSerial(oid, revid1) pobj = cPickle.loads(data) eq(pobj.getoid(), oid) eq(pobj.value, 1) data = self._storage.loadSerial(oid, revid2) pobj = cPickle.loads(data) eq(pobj.getoid(), oid) eq(pobj.value, 2) data = self._storage.loadSerial(oid, revid3) pobj = cPickle.loads(data) eq(pobj.getoid(), oid) eq(pobj.value, 3) # Now pack all transactions; need to sleep a second to make # sure that the pack time is greater than the last commit time. now = packtime = time.time() while packtime <= now: packtime = time.time() self._storage.pack(packtime, referencesf) # All revisions of the object should be gone, since there is no # reference from the root object to this object. raises(KeyError, self._storage.loadSerial, oid, revid1) raises(KeyError, self._storage.loadSerial, oid, revid2) raises(KeyError, self._storage.loadSerial, oid, revid3) def checkPackJustOldRevisions(self): eq = self.assertEqual raises = self.assertRaises loads = self._makeloader() # Create a root object. This can't be an instance of Object, # otherwise the pickling machinery will serialize it as a persistent # id and not as an object that contains references (persistent ids) to # other objects. root = Root() # Create a persistent object, with some initial state obj = self._newobj() oid = obj.getoid() # Link the root object to the persistent object, in order to keep the # persistent object alive. Store the root object. 
root.obj = obj root.value = 0 revid0 = self._dostoreNP(ZERO, data=dumps(root)) # Make sure the root can be retrieved data, revid = self._storage.load(ZERO, '') eq(revid, revid0) eq(loads(data).value, 0) # Commit three different revisions of the other object obj.value = 1 revid1 = self._dostoreNP(oid, data=pdumps(obj)) obj.value = 2 revid2 = self._dostoreNP(oid, revid=revid1, data=pdumps(obj)) obj.value = 3 revid3 = self._dostoreNP(oid, revid=revid2, data=pdumps(obj)) # Now make sure all three revisions can be extracted data = self._storage.loadSerial(oid, revid1) pobj = cPickle.loads(data) eq(pobj.getoid(), oid) eq(pobj.value, 1) data = self._storage.loadSerial(oid, revid2) pobj = cPickle.loads(data) eq(pobj.getoid(), oid) eq(pobj.value, 2) data = self._storage.loadSerial(oid, revid3) pobj = cPickle.loads(data) eq(pobj.getoid(), oid) eq(pobj.value, 3) # Now pack just revisions 1 and 2. The object's current revision # should stay alive because it's pointed to by the root. now = packtime = time.time() while packtime <= now: packtime = time.time() self._storage.pack(packtime, referencesf) # Make sure the revisions are gone, but that object zero and revision # 3 are still there and correct data, revid = self._storage.load(ZERO, '') eq(revid, revid0) eq(loads(data).value, 0) raises(KeyError, self._storage.loadSerial, oid, revid1) raises(KeyError, self._storage.loadSerial, oid, revid2) data = self._storage.loadSerial(oid, revid3) pobj = cPickle.loads(data) eq(pobj.getoid(), oid) eq(pobj.value, 3) data, revid = self._storage.load(oid, '') eq(revid, revid3) pobj = cPickle.loads(data) eq(pobj.getoid(), oid) eq(pobj.value, 3) def checkPackOnlyOneObject(self): eq = self.assertEqual raises = self.assertRaises loads = self._makeloader() # Create a root object. This can't be an instance of Object, # otherwise the pickling machinery will serialize it as a persistent # id and not as an object that contains references (persistent ids) to # other objects. root = Root() # Create a persistent object, with some initial state obj1 = self._newobj() oid1 = obj1.getoid() # Create another persistent object, with some initial state. obj2 = self._newobj() oid2 = obj2.getoid() # Link the root object to the persistent objects, in order to keep # them alive. Store the root object. root.obj1 = obj1 root.obj2 = obj2 root.value = 0 revid0 = self._dostoreNP(ZERO, data=dumps(root)) # Make sure the root can be retrieved data, revid = self._storage.load(ZERO, '') eq(revid, revid0) eq(loads(data).value, 0) # Commit three different revisions of the first object obj1.value = 1 revid1 = self._dostoreNP(oid1, data=pdumps(obj1)) obj1.value = 2 revid2 = self._dostoreNP(oid1, revid=revid1, data=pdumps(obj1)) obj1.value = 3 revid3 = self._dostoreNP(oid1, revid=revid2, data=pdumps(obj1)) # Now make sure all three revisions can be extracted data = self._storage.loadSerial(oid1, revid1) pobj = cPickle.loads(data) eq(pobj.getoid(), oid1) eq(pobj.value, 1) data = self._storage.loadSerial(oid1, revid2) pobj = cPickle.loads(data) eq(pobj.getoid(), oid1) eq(pobj.value, 2) data = self._storage.loadSerial(oid1, revid3) pobj = cPickle.loads(data) eq(pobj.getoid(), oid1) eq(pobj.value, 3) # Now commit a revision of the second object obj2.value = 11 revid4 = self._dostoreNP(oid2, data=pdumps(obj2)) # And make sure the revision can be extracted data = self._storage.loadSerial(oid2, revid4) pobj = cPickle.loads(data) eq(pobj.getoid(), oid2) eq(pobj.value, 11) # Now pack just revisions 1 and 2 of object1. 
Object1's current # revision should stay alive because it's pointed to by the root, as # should Object2's current revision. now = packtime = time.time() while packtime <= now: packtime = time.time() self._storage.pack(packtime, referencesf) # Make sure the revisions are gone, but that object zero, object2, and # revision 3 of object1 are still there and correct. data, revid = self._storage.load(ZERO, '') eq(revid, revid0) eq(loads(data).value, 0) raises(KeyError, self._storage.loadSerial, oid1, revid1) raises(KeyError, self._storage.loadSerial, oid1, revid2) data = self._storage.loadSerial(oid1, revid3) pobj = cPickle.loads(data) eq(pobj.getoid(), oid1) eq(pobj.value, 3) data, revid = self._storage.load(oid1, '') eq(revid, revid3) pobj = cPickle.loads(data) eq(pobj.getoid(), oid1) eq(pobj.value, 3) data, revid = self._storage.load(oid2, '') eq(revid, revid4) eq(loads(data).value, 11) data = self._storage.loadSerial(oid2, revid4) pobj = cPickle.loads(data) eq(pobj.getoid(), oid2) eq(pobj.value, 11) class PackableStorageWithOptionalGC(PackableStorage): def checkPackAllRevisionsNoGC(self): self._initroot() eq = self.assertEqual raises = self.assertRaises # Create a `persistent' object obj = self._newobj() oid = obj.getoid() obj.value = 1 # Commit three different revisions revid1 = self._dostoreNP(oid, data=pdumps(obj)) obj.value = 2 revid2 = self._dostoreNP(oid, revid=revid1, data=pdumps(obj)) obj.value = 3 revid3 = self._dostoreNP(oid, revid=revid2, data=pdumps(obj)) # Now make sure all three revisions can be extracted data = self._storage.loadSerial(oid, revid1) pobj = cPickle.loads(data) eq(pobj.getoid(), oid) eq(pobj.value, 1) data = self._storage.loadSerial(oid, revid2) pobj = cPickle.loads(data) eq(pobj.getoid(), oid) eq(pobj.value, 2) data = self._storage.loadSerial(oid, revid3) pobj = cPickle.loads(data) eq(pobj.getoid(), oid) eq(pobj.value, 3) # Now pack all transactions; need to sleep a second to make # sure that the pack time is greater than the last commit time. now = packtime = time.time() while packtime <= now: packtime = time.time() self._storage.pack(packtime, referencesf, gc=False) # Only old revisions of the object should be gone. We don't gc raises(KeyError, self._storage.loadSerial, oid, revid1) raises(KeyError, self._storage.loadSerial, oid, revid2) self._storage.loadSerial(oid, revid3) class PackableUndoStorage(PackableStorageBase): def checkPackUnlinkedFromRoot(self): eq = self.assertEqual db = DB(self._storage) conn = db.open() root = conn.root() txn = transaction.get() txn.note('root') txn.commit() now = packtime = time.time() while packtime <= now: packtime = time.time() obj = C() obj.value = 7 root['obj'] = obj txn = transaction.get() txn.note('root -> o1') txn.commit() del root['obj'] txn = transaction.get() txn.note('root -x-> o1') txn.commit() self._storage.pack(packtime, referencesf) log = self._storage.undoLog() tid = log[0]['id'] db.undo(tid) txn = transaction.get() txn.note('undo root -x-> o1') txn.commit() conn.sync() eq(root['obj'].value, 7) def checkRedundantPack(self): # It is an error to perform a pack with a packtime earlier # than a previous packtime. The storage can't do a full # traversal as of the packtime, because the previous pack may # have removed revisions necessary for a full traversal. # It should be simple to test that a storage error is raised, # but this test case goes to the trouble of constructing a # scenario that would lose data if the earlier packtime was # honored. 
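        # Sequence used below: commit a still-reachable object before packt1,
        # commit another object, pack at packt2, then attempt a second pack at
        # the earlier packt1.  That second pack must either raise StorageError
        # or leave the reachable object loadable.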
self._initroot() db = DB(self._storage) conn = db.open() root = conn.root() root["d"] = d = PersistentMapping() transaction.commit() snooze() obj = d["obj"] = C() obj.value = 1 transaction.commit() snooze() packt1 = time.time() lost_oid = obj._p_oid obj = d["anotherobj"] = C() obj.value = 2 transaction.commit() snooze() packt2 = time.time() db.pack(packt2) # BDBStorage allows the second pack, but doesn't lose data. try: db.pack(packt1) except StorageError: pass # This object would be removed by the second pack, even though # it is reachable. self._storage.load(lost_oid, "") def checkPackUndoLog(self): self._initroot() # Create a `persistent' object obj = self._newobj() oid = obj.getoid() obj.value = 1 # Commit two different revisions revid1 = self._dostoreNP(oid, data=pdumps(obj)) obj.value = 2 snooze() packtime = time.time() snooze() self._dostoreNP(oid, revid=revid1, data=pdumps(obj)) # Now pack the first transaction self.assertEqual(3, len(self._storage.undoLog())) self._storage.pack(packtime, referencesf) # The undo log contains only the most resent transaction self.assertEqual(1,len(self._storage.undoLog())) def dont_checkPackUndoLogUndoable(self): # A disabled test. I wanted to test that the content of the # undo log was consistent, but every storage appears to # include something slightly different. If the result of this # method is only used to fill a GUI then this difference # doesnt matter. Perhaps re-enable this test once we agree # what should be asserted. self._initroot() # Create two `persistent' object obj1 = self._newobj() oid1 = obj1.getoid() obj1.value = 1 obj2 = self._newobj() oid2 = obj2.getoid() obj2.value = 2 # Commit the first revision of each of them revid11 = self._dostoreNP(oid1, data=pdumps(obj1), description="1-1") revid22 = self._dostoreNP(oid2, data=pdumps(obj2), description="2-2") # remember the time. everything above here will be packed away snooze() packtime = time.time() snooze() # Commit two revisions of the first object obj1.value = 3 revid13 = self._dostoreNP(oid1, revid=revid11, data=pdumps(obj1), description="1-3") obj1.value = 4 self._dostoreNP(oid1, revid=revid13, data=pdumps(obj1), description="1-4") # Commit one revision of the second object obj2.value = 5 self._dostoreNP(oid2, revid=revid22, data=pdumps(obj2), description="2-5") # Now pack self.assertEqual(6,len(self._storage.undoLog())) print '\ninitial undoLog was' for r in self._storage.undoLog(): print r self._storage.pack(packtime, referencesf) # The undo log contains only two undoable transaction. print '\nafter packing undoLog was' for r in self._storage.undoLog(): print r # what can we assert about that? # A number of these threads are kicked off by _PackWhileWriting(). Their # purpose is to abuse the database passed to the constructor with lots of # random write activity while the main thread is packing it. 
class ClientThread(TestThread): def __init__(self, db, choices, loop_trip, timer, thread_id): TestThread.__init__(self) self.root = db.open().root() self.choices = choices self.loop_trip = loop_trip self.millis = timer.elapsed_millis self.thread_id = thread_id # list of lists; each list has as many of these as a loop trip # got thru: # thread_id # elapsed millis at loop top # elapsed millis at attempt # index into self.root getting replaced # elapsed millis when outcome known # 'OK' or 'Conflict' # True if we got beyond this line, False if it raised an exception: # self.root[index].value = MinPO(j) self.outcomes = [] def runtest(self): from random import choice for j in range(self.loop_trip): assign_worked = False alist = [self.thread_id, self.millis()] self.outcomes.append(alist) try: index = choice(self.choices) alist.extend([self.millis(), index]) self.root[index].value = MinPO(j) assign_worked = True transaction.commit() alist.append(self.millis()) alist.append('OK') except ConflictError: alist.append(self.millis()) alist.append('Conflict') transaction.abort() alist.append(assign_worked) class ElapsedTimer: def __init__(self, start_time): self.start_time = start_time def elapsed_millis(self): return int((time.time() - self.start_time) * 1000) def IExternalGC_suite(factory): """Return a test suite for a generic . Pass a factory taking a name and a blob directory name. """ def setup(test): ZODB.tests.util.setUp(test) test.globs['create_storage'] = factory return doctest.DocFileSuite( 'IExternalGC.test', setUp=setup, tearDown=zope.testing.setupstack.tearDown) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/PersistentStorage.py000066400000000000000000000032311230730566700256350ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Test that a storage's values persist across open and close.""" class PersistentStorage: def checkUpdatesPersist(self): oids = [] def new_oid_wrapper(l=oids, new_oid=self._storage.new_oid): oid = new_oid() l.append(oid) return oid self._storage.new_oid = new_oid_wrapper self._dostore() oid = self._storage.new_oid() revid = self._dostore(oid) oid = self._storage.new_oid() revid = self._dostore(oid, data=1) revid = self._dostore(oid, revid, data=2) self._dostore(oid, revid, data=3) # keep copies of all the objects objects = [] for oid in oids: p, s = self._storage.load(oid, '') objects.append((oid, '', p, s)) self._storage.close() self.open() # keep copies of all the objects for oid, ver, p, s in objects: _p, _s = self._storage.load(oid, ver) self.assertEquals(p, _p) self.assertEquals(s, _s) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/ReadOnlyStorage.py000066400000000000000000000041321230730566700252130ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. 
# # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## from ZODB.POSException import ReadOnlyError, Unsupported import transaction class ReadOnlyStorage: def _create_data(self): # test a read-only storage that already has some data self.oids = {} for i in range(10): oid = self._storage.new_oid() revid = self._dostore(oid) self.oids[oid] = revid def _make_readonly(self): self._storage.close() self.open(read_only=True) self.assert_(self._storage.isReadOnly()) def checkReadMethods(self): self._create_data() self._make_readonly() # Note that this doesn't check _all_ read methods. for oid in self.oids.keys(): data, revid = self._storage.load(oid, '') self.assertEqual(revid, self.oids[oid]) # Storages without revisions may not have loadSerial(). try: _data = self._storage.loadSerial(oid, revid) self.assertEqual(data, _data) except Unsupported: pass def checkWriteMethods(self): self._make_readonly() self.assertRaises(ReadOnlyError, self._storage.new_oid) t = transaction.Transaction() self.assertRaises(ReadOnlyError, self._storage.tpc_begin, t) self.assertRaises(ReadOnlyError, self._storage.store, '\000' * 8, None, '', '', t) self.assertRaises(ReadOnlyError, self._storage.undo, '\000' * 8, t) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/RecoveryStorage.py000066400000000000000000000170451230730566700253030ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """More recovery and iterator tests.""" import transaction from transaction import Transaction from ZODB.tests.IteratorStorage import IteratorDeepCompare from ZODB.tests.StorageTestBase import MinPO, snooze from ZODB import DB from ZODB.serialize import referencesf import time class RecoveryStorage(IteratorDeepCompare): # Requires a setUp() that creates a self._dst destination storage def checkSimpleRecovery(self): oid = self._storage.new_oid() revid = self._dostore(oid, data=11) revid = self._dostore(oid, revid=revid, data=12) revid = self._dostore(oid, revid=revid, data=13) self._dst.copyTransactionsFrom(self._storage) self.compare(self._storage, self._dst) def checkRestoreAcrossPack(self): db = DB(self._storage) c = db.open() r = c.root() obj = r["obj1"] = MinPO(1) transaction.commit() obj = r["obj2"] = MinPO(1) transaction.commit() self._dst.copyTransactionsFrom(self._storage) self._dst.pack(time.time(), referencesf) self._undo(self._storage.undoInfo()[0]['id']) # copy the final transaction manually. even though there # was a pack, the restore() ought to succeed. 
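        # The manual copy below follows the restore() protocol: tpc_begin()
        # with the original tid and status, restore() of each data record
        # verbatim (including its data_txn backpointer), then vote and finish.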
it = self._storage.iterator() # Get the last transaction and its record iterator. Record iterators # can't be accessed out-of-order, so we need to do this in a bit # complicated way: for final in it: records = list(final) self._dst.tpc_begin(final, final.tid, final.status) for r in records: self._dst.restore(r.oid, r.tid, r.data, '', r.data_txn, final) self._dst.tpc_vote(final) self._dst.tpc_finish(final) def checkPackWithGCOnDestinationAfterRestore(self): raises = self.assertRaises db = DB(self._storage) conn = db.open() root = conn.root() root.obj = obj1 = MinPO(1) txn = transaction.get() txn.note('root -> obj') txn.commit() root.obj.obj = obj2 = MinPO(2) txn = transaction.get() txn.note('root -> obj -> obj') txn.commit() del root.obj txn = transaction.get() txn.note('root -X->') txn.commit() # Now copy the transactions to the destination self._dst.copyTransactionsFrom(self._storage) # Now pack the destination. snooze() self._dst.pack(time.time(), referencesf) # And check to see that the root object exists, but not the other # objects. data, serial = self._dst.load(root._p_oid, '') raises(KeyError, self._dst.load, obj1._p_oid, '') raises(KeyError, self._dst.load, obj2._p_oid, '') def checkRestoreWithMultipleObjectsInUndoRedo(self): from ZODB.FileStorage import FileStorage # Undo creates backpointers in (at least) FileStorage. ZODB 3.2.1 # FileStorage._data_find() had an off-by-8 error, neglecting to # account for the size of the backpointer when searching a # transaction with multiple data records. The results were # unpredictable. For example, it could raise a Python exception # due to passing a negative offset to file.seek(), or could # claim that a transaction didn't have data for an oid despite # that it actually did. # # The former failure mode was seen in real life, in a ZRS secondary # doing recovery. On my box today, the second failure mode is # what happens in this test (with an unpatched _data_find, of # course). Note that the error can only "bite" if more than one # data record is in a transaction, and the oid we're looking for # follows at least one data record with a backpointer. # # Unfortunately, _data_find() is a low-level implementation detail, # and this test does some horrid white-box abuse to test it. is_filestorage = isinstance(self._storage, FileStorage) db = DB(self._storage) c = db.open() r = c.root() # Create some objects. r["obj1"] = MinPO(1) r["obj2"] = MinPO(1) transaction.commit() # Add x attributes to them. r["obj1"].x = 'x1' r["obj2"].x = 'x2' transaction.commit() r = db.open().root() self.assertEquals(r["obj1"].x, 'x1') self.assertEquals(r["obj2"].x, 'x2') # Dirty tricks. if is_filestorage: obj1_oid = r["obj1"]._p_oid obj2_oid = r["obj2"]._p_oid # This will be the offset of the next transaction, which # will contain two backpointers. pos = self._storage.getSize() # Undo the attribute creation. info = self._storage.undoInfo() tid = info[0]['id'] t = Transaction() self._storage.tpc_begin(t) oids = self._storage.undo(tid, t) self._storage.tpc_vote(t) self._storage.tpc_finish(t) r = db.open().root() self.assertRaises(AttributeError, getattr, r["obj1"], 'x') self.assertRaises(AttributeError, getattr, r["obj2"], 'x') if is_filestorage: # _data_find should find data records for both objects in that # transaction. Without the patch, the second assert failed # (it claimed it couldn't find a data record for obj2) on my # box, but other failure modes were possible. 
self.assert_(self._storage._data_find(pos, obj1_oid, '') > 0) self.assert_(self._storage._data_find(pos, obj2_oid, '') > 0) # The offset of the next ("redo") transaction. pos = self._storage.getSize() # Undo the undo (restore the attributes). info = self._storage.undoInfo() tid = info[0]['id'] t = Transaction() self._storage.tpc_begin(t) oids = self._storage.undo(tid, t) self._storage.tpc_vote(t) self._storage.tpc_finish(t) r = db.open().root() self.assertEquals(r["obj1"].x, 'x1') self.assertEquals(r["obj2"].x, 'x2') if is_filestorage: # Again _data_find should find both objects in this txn, and # again the second assert failed on my box. self.assert_(self._storage._data_find(pos, obj1_oid, '') > 0) self.assert_(self._storage._data_find(pos, obj2_oid, '') > 0) # Indirectly provoke .restore(). .restore in turn indirectly # provokes _data_find too, but not usefully for the purposes of # the specific bug this test aims at: copyTransactionsFrom() uses # storage iterators that chase backpointers themselves, and # return the data they point at instead. The result is that # _data_find didn't actually see anything dangerous in this # part of the test. self._dst.copyTransactionsFrom(self._storage) self.compare(self._storage, self._dst) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/RevisionStorage.py000066400000000000000000000147771230730566700253140ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Check loadSerial() on storages that support historical revisions.""" from ZODB.tests.MinPO import MinPO from ZODB.tests.StorageTestBase import zodb_unpickle, zodb_pickle, snooze from ZODB.tests.StorageTestBase import handle_serials from ZODB.utils import p64, u64 import transaction ZERO = '\0'*8 class RevisionStorage: def checkLoadSerial(self): oid = self._storage.new_oid() revid = ZERO revisions = {} for i in range(31, 38): revid = self._dostore(oid, revid=revid, data=MinPO(i)) revisions[revid] = MinPO(i) # Now make sure all the revisions have the correct value for revid, value in revisions.items(): data = self._storage.loadSerial(oid, revid) self.assertEqual(zodb_unpickle(data), value) def checkLoadBefore(self): # Store 10 revisions of one object and then make sure that we # can get all the non-current revisions back. oid = self._storage.new_oid() revs = [] revid = None for i in range(10): # We need to ensure that successive timestamps are at least # two apart, so that a timestamp exists that's unambiguously # between successive timestamps. Each call to snooze() # guarantees that the next timestamp will be at least one # larger (and probably much more than that) than the previous # one. 
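            # For example, if one revision lands at timestamp 10, the next
            # must land at 12 or later so that 11 lies strictly between them;
            # with consecutive timestamps (10 and 11) there is no in-between
            # value to probe with loadBefore().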
snooze() snooze() revid = self._dostore(oid, revid, data=MinPO(i)) revs.append(self._storage.load(oid, "")) prev = u64(revs[0][1]) for i in range(1, 10): tid = revs[i][1] cur = u64(tid) middle = prev + (cur - prev) // 2 assert prev < middle < cur # else the snooze() trick failed prev = cur t = self._storage.loadBefore(oid, p64(middle)) self.assert_(t is not None) data, start, end = t self.assertEqual(revs[i-1][0], data) self.assertEqual(tid, end) def checkLoadBeforeEdges(self): # Check the edges cases for a non-current load. oid = self._storage.new_oid() self.assertRaises(KeyError, self._storage.loadBefore, oid, p64(0)) revid1 = self._dostore(oid, data=MinPO(1)) self.assertEqual(self._storage.loadBefore(oid, p64(0)), None) self.assertEqual(self._storage.loadBefore(oid, revid1), None) cur = p64(u64(revid1) + 1) data, start, end = self._storage.loadBefore(oid, cur) self.assertEqual(zodb_unpickle(data), MinPO(1)) self.assertEqual(start, revid1) self.assertEqual(end, None) revid2 = self._dostore(oid, revid=revid1, data=MinPO(2)) data, start, end = self._storage.loadBefore(oid, cur) self.assertEqual(zodb_unpickle(data), MinPO(1)) self.assertEqual(start, revid1) self.assertEqual(end, revid2) def checkLoadBeforeOld(self): # Look for a very old revision. With the BaseStorage implementation # this should require multple history() calls. oid = self._storage.new_oid() revs = [] revid = None for i in range(50): revid = self._dostore(oid, revid, data=MinPO(i)) revs.append(revid) data, start, end = self._storage.loadBefore(oid, revs[12]) self.assertEqual(zodb_unpickle(data), MinPO(11)) self.assertEqual(start, revs[11]) self.assertEqual(end, revs[12]) # Unsure: Is it okay to assume everyone testing against RevisionStorage # implements undo? def checkLoadBeforeUndo(self): # Do several transactions then undo them. oid = self._storage.new_oid() revid = None for i in range(5): revid = self._dostore(oid, revid, data=MinPO(i)) revs = [] for i in range(4): info = self._storage.undoInfo() tid = info[0]["id"] # Always undo the most recent txn, so the value will # alternate between 3 and 4. self._undo(tid, note="undo %d" % i) revs.append(self._storage.load(oid, "")) prev_tid = None for i, (data, tid) in enumerate(revs): t = self._storage.loadBefore(oid, p64(u64(tid) + 1)) self.assertEqual(data, t[0]) self.assertEqual(tid, t[1]) if prev_tid: self.assert_(prev_tid < t[1]) prev_tid = t[1] if i < 3: self.assertEqual(revs[i+1][1], t[2]) else: self.assertEqual(None, t[2]) def checkLoadBeforeConsecutiveTids(self): eq = self.assertEqual oid = self._storage.new_oid() def helper(tid, revid, x): data = zodb_pickle(MinPO(x)) t = transaction.Transaction() try: self._storage.tpc_begin(t, p64(tid)) r1 = self._storage.store(oid, revid, data, '', t) # Finish the transaction r2 = self._storage.tpc_vote(t) newrevid = handle_serials(oid, r1, r2) self._storage.tpc_finish(t) except: self._storage.tpc_abort(t) raise return newrevid revid1 = helper(1, None, 1) revid2 = helper(2, revid1, 2) revid3 = helper(3, revid2, 3) data, start_tid, end_tid = self._storage.loadBefore(oid, p64(2)) eq(zodb_unpickle(data), MinPO(1)) eq(u64(start_tid), 1) eq(u64(end_tid), 2) def checkLoadBeforeCreation(self): eq = self.assertEqual oid1 = self._storage.new_oid() oid2 = self._storage.new_oid() revid1 = self._dostore(oid1) revid2 = self._dostore(oid2) results = self._storage.loadBefore(oid2, revid2) eq(results, None) # TODO: There are other edge cases to handle, including pack. 
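# For reference, the loadBefore() behavior relied on above (a sketch drawn
# from these tests, not a normative specification): loadBefore(oid, tid)
# returns (data, start_tid, end_tid) for the revision current just before tid,
# with end_tid None when that revision is still the current one; it returns
# None when the object did not yet exist at tid, and raises KeyError for an
# oid that was never stored at all.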
ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/StorageTestBase.py000066400000000000000000000161251230730566700252150ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Provide a mixin base class for storage tests. The StorageTestBase class provides basic setUp() and tearDown() semantics (which you can override), and it also provides a helper method _dostore() which performs a complete store transaction for a single object revision. """ import sys import time from cPickle import Pickler, Unpickler from cStringIO import StringIO import transaction from ZODB.utils import u64 from ZODB.tests.MinPO import MinPO import ZODB.tests.util ZERO = '\0'*8 def snooze(): # In Windows, it's possible that two successive time.time() calls return # the same value. Tim guarantees that time never runs backwards. You # usually want to call this before you pack a storage, or must make other # guarantees about increasing timestamps. now = time.time() while now == time.time(): time.sleep(0.1) def _persistent_id(obj): oid = getattr(obj, "_p_oid", None) if getattr(oid, "__get__", None) is not None: return None else: return oid def zodb_pickle(obj): """Create a pickle in the format expected by ZODB.""" f = StringIO() p = Pickler(f, 1) p.inst_persistent_id = _persistent_id klass = obj.__class__ assert not hasattr(obj, '__getinitargs__'), "not ready for constructors" args = None mod = getattr(klass, '__module__', None) if mod is not None: klass = mod, klass.__name__ state = obj.__getstate__() p.dump((klass, args)) p.dump(state) return f.getvalue(1) def persistent_load(pid): # helper for zodb_unpickle return "ref to %s.%s oid=%s" % (pid[1][0], pid[1][1], u64(pid[0])) def zodb_unpickle(data): """Unpickle an object stored using the format expected by ZODB.""" f = StringIO(data) u = Unpickler(f) u.persistent_load = persistent_load klass_info = u.load() if isinstance(klass_info, tuple): if isinstance(klass_info[0], type): # Unclear: what is the second part of klass_info? klass, xxx = klass_info assert not xxx else: if isinstance(klass_info[0], tuple): modname, klassname = klass_info[0] else: modname, klassname = klass_info if modname == "__main__": ns = globals() else: mod = import_helper(modname) ns = mod.__dict__ try: klass = ns[klassname] except KeyError: print >> sys.stderr, "can't find %s in %r" % (klassname, ns) inst = klass() else: raise ValueError("expected class info: %s" % repr(klass_info)) state = u.load() inst.__setstate__(state) return inst def handle_all_serials(oid, *args): """Return dict of oid to serialno from store() and tpc_vote(). Raises an exception if one of the calls raised an exception. The storage interface got complicated when ZEO was introduced. Any individual store() call can return None or a sequence of 2-tuples where the 2-tuple is either oid, serialno or an exception to be raised by the client. The original interface just returned the serialno for the object. 
""" d = {} for arg in args: if isinstance(arg, str): d[oid] = arg elif arg is None: pass else: for oid, serial in arg: if not isinstance(serial, str): raise serial # error from ZEO server d[oid] = serial return d def handle_serials(oid, *args): """Return the serialno for oid based on multiple return values. A helper for function _handle_all_serials(). """ return handle_all_serials(oid, *args)[oid] def import_helper(name): __import__(name) return sys.modules[name] class StorageTestBase(ZODB.tests.util.TestCase): # It would be simpler if concrete tests didn't need to extend # setUp() and tearDown(). _storage = None def _close(self): # You should override this if closing your storage requires additional # shutdown operations. if self._storage is not None: self._storage.close() def tearDown(self): self._close() ZODB.tests.util.TestCase.tearDown(self) def _dostore(self, oid=None, revid=None, data=None, already_pickled=0, user=None, description=None): """Do a complete storage transaction. The defaults are: - oid=None, ask the storage for a new oid - revid=None, use a revid of ZERO - data=None, pickle up some arbitrary data (the integer 7) Returns the object's new revision id. """ if oid is None: oid = self._storage.new_oid() if revid is None: revid = ZERO if data is None: data = MinPO(7) if type(data) == int: data = MinPO(data) if not already_pickled: data = zodb_pickle(data) # Begin the transaction t = transaction.Transaction() if user is not None: t.user = user if description is not None: t.description = description try: self._storage.tpc_begin(t) # Store an object r1 = self._storage.store(oid, revid, data, '', t) # Finish the transaction r2 = self._storage.tpc_vote(t) revid = handle_serials(oid, r1, r2) self._storage.tpc_finish(t) except: self._storage.tpc_abort(t) raise return revid def _dostoreNP(self, oid=None, revid=None, data=None, user=None, description=None): return self._dostore(oid, revid, data, 1, user, description) # The following methods depend on optional storage features. def _undo(self, tid, expected_oids=None, note=None): # Undo a tid that affects a single object (oid). # This is very specialized. t = transaction.Transaction() t.note(note or "undo") self._storage.tpc_begin(t) undo_result = self._storage.undo(tid, t) vote_result = self._storage.tpc_vote(t) self._storage.tpc_finish(t) if expected_oids is not None: oids = undo_result and undo_result[1] or [] oids.extend(oid for (oid, _) in vote_result or ()) self.assertEqual(len(oids), len(expected_oids), repr(oids)) for oid in expected_oids: self.assert_(oid in oids) return self._storage.lastTransaction() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/Synchronization.py000066400000000000000000000102631230730566700253540ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Test the storage's implemenetation of the storage synchronization spec. 
The Synchronization spec http://www.zope.org/Documentation/Developer/Models/ZODB/ ZODB_Architecture_Storage_Interface_State_Synchronization_Diag.html It specifies two states committing and non-committing. A storage starts in the non-committing state. tpc_begin() transfers to the committting state; tpc_abort() and tpc_finish() transfer back to non-committing. Several other methods are only allowed in one state or another. Many methods allowed only in the committing state require that they apply to the currently committing transaction. The spec is silent on a variety of methods that don't appear to modify the state, e.g. load(), undoLog(), pack(). It's unclear whether there is a separate set of synchronization rules that apply to these methods or if the synchronization is implementation dependent, i.e. only what is need to guarantee a corrected implementation. The synchronization spec is also silent on whether there is any contract implied with the caller. If the storage can assume that a single client is single-threaded and that it will not call, e.g., store() until after it calls tpc_begin(), the implementation can be substantially simplified. New and/or unspecified methods: tpc_vote(): handled like tpc_abort undo(): how's that handled? Methods that have nothing to do with committing/non-committing: load(), loadSerial(), getName(), getSize(), __len__(), history(), undoLog(), pack(). Specific questions: The spec & docs say that undo() takes three arguments, the second being a transaction. If the specified arg isn't the current transaction, the undo() should raise StorageTransactionError. This isn't implemented anywhere. It looks like undo can be called at anytime. FileStorage does not allow undo() during a pack. How should this be tested? Is it a general restriction? """ from transaction import Transaction from ZODB.POSException import StorageTransactionError OID = "\000" * 8 SERIALNO = "\000" * 8 TID = "\000" * 8 class SynchronizedStorage: def verifyNotCommitting(self, callable, *args): self.assertRaises(StorageTransactionError, callable, *args) def verifyWrongTrans(self, callable, *args): t = Transaction() self._storage.tpc_begin(t) self.assertRaises(StorageTransactionError, callable, *args) self._storage.tpc_abort(t) def checkStoreNotCommitting(self): self.verifyNotCommitting(self._storage.store, OID, SERIALNO, "", "", Transaction()) def checkStoreWrongTrans(self): self.verifyWrongTrans(self._storage.store, OID, SERIALNO, "", "", Transaction()) def checkAbortNotCommitting(self): self._storage.tpc_abort(Transaction()) def checkAbortWrongTrans(self): t = Transaction() self._storage.tpc_begin(t) self._storage.tpc_abort(Transaction()) self._storage.tpc_abort(t) def checkFinishNotCommitting(self): t = Transaction() self.assertRaises(StorageTransactionError, self._storage.tpc_finish, t) self._storage.tpc_abort(t) def checkFinishWrongTrans(self): t = Transaction() self._storage.tpc_begin(t) self.assertRaises(StorageTransactionError, self._storage.tpc_finish, Transaction()) self._storage.tpc_abort(t) def checkBeginCommitting(self): t = Transaction() self._storage.tpc_begin(t) self._storage.tpc_abort(t) # TODO: how to check undo? ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/TransactionalUndoStorage.py000066400000000000000000000667711230730566700271470ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. 
# # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Check undo(). Any storage that supports undo() must pass these tests. """ import time import types from persistent import Persistent import transaction from transaction import Transaction from ZODB import POSException from ZODB.serialize import referencesf from ZODB.utils import p64 from ZODB import DB from ZODB.tests.MinPO import MinPO from ZODB.tests.StorageTestBase import zodb_pickle, zodb_unpickle ZERO = '\0'*8 class C(Persistent): pass def snooze(): # In Windows, it's possible that two successive time.time() calls return # the same value. Tim guarantees that time never runs backwards. You # usually want to call this before you pack a storage, or must make other # guarantees about increasing timestamps. now = time.time() while now == time.time(): time.sleep(0.1) def listeq(L1, L2): """Return True if L1.sort() == L2.sort()""" c1 = L1[:] c2 = L2[:] c1.sort() c2.sort() return c1 == c2 class TransactionalUndoStorage: def _transaction_begin(self): self.__serials = {} def _transaction_store(self, oid, rev, data, vers, trans): r = self._storage.store(oid, rev, data, vers, trans) if r: if type(r) == types.StringType: self.__serials[oid] = r else: for oid, serial in r: self.__serials[oid] = serial def _transaction_vote(self, trans): r = self._storage.tpc_vote(trans) if r: for oid, serial in r: self.__serials[oid] = serial def _transaction_newserial(self, oid): return self.__serials[oid] def _multi_obj_transaction(self, objs): newrevs = {} t = Transaction() self._storage.tpc_begin(t) self._transaction_begin() for oid, rev, data in objs: self._transaction_store(oid, rev, data, '', t) newrevs[oid] = None self._transaction_vote(t) self._storage.tpc_finish(t) for oid in newrevs.keys(): newrevs[oid] = self._transaction_newserial(oid) return newrevs def _iterate(self): """Iterate over the storage in its final state.""" # This is testing that the iterator() code works correctly. # The hasattr() guards against ZEO, which doesn't support iterator. 
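        # (Each transaction record yielded by iterator() is itself
        # iterable and yields the data records -- with oid, tid, data and
        # data_txn attributes -- written by that transaction, so simply
        # walking them all exercises the iteration machinery.)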
if not hasattr(self._storage, "iterator"): return iter = self._storage.iterator() for txn in iter: for rec in txn: pass def _begin_undos_vote(self, t, *tids): self._storage.tpc_begin(t) oids = [] for tid in tids: undo_result = self._storage.undo(tid, t) if undo_result: oids.extend(undo_result[1]) oids.extend(oid for (oid, _) in self._storage.tpc_vote(t) or ()) return oids def undo(self, tid, note): t = Transaction() t.note(note) oids = self._begin_undos_vote(t, tid) self._storage.tpc_finish(t) return oids def checkSimpleTransactionalUndo(self): eq = self.assertEqual oid = self._storage.new_oid() revid = self._dostore(oid, data=MinPO(23)) revid = self._dostore(oid, revid=revid, data=MinPO(24)) revid = self._dostore(oid, revid=revid, data=MinPO(25)) info = self._storage.undoInfo() # Now start an undo transaction self._undo(info[0]["id"], [oid], note="undo1") data, revid = self._storage.load(oid, '') eq(zodb_unpickle(data), MinPO(24)) # Do another one info = self._storage.undoInfo() self._undo(info[2]["id"], [oid], note="undo2") data, revid = self._storage.load(oid, '') eq(zodb_unpickle(data), MinPO(23)) # Try to undo the first record info = self._storage.undoInfo() self._undo(info[4]["id"], [oid], note="undo3") # This should fail since we've undone the object's creation self.assertRaises(KeyError, self._storage.load, oid, '') # And now let's try to redo the object's creation info = self._storage.undoInfo() self._undo(info[0]["id"], [oid]) data, revid = self._storage.load(oid, '') eq(zodb_unpickle(data), MinPO(23)) self._iterate() def checkCreationUndoneGetTid(self): # create an object oid = self._storage.new_oid() self._dostore(oid, data=MinPO(23)) # undo its creation info = self._storage.undoInfo() tid = info[0]['id'] t = Transaction() t.note('undo1') self._begin_undos_vote(t, tid) self._storage.tpc_finish(t) # Check that calling getTid on an uncreated object raises a KeyError # The current version of FileStorage fails this test self.assertRaises(KeyError, self._storage.getTid, oid) def checkUndoCreationBranch1(self): eq = self.assertEqual oid = self._storage.new_oid() revid = self._dostore(oid, data=MinPO(11)) revid = self._dostore(oid, revid=revid, data=MinPO(12)) # Undo the last transaction info = self._storage.undoInfo() self._undo(info[0]['id'], [oid]) data, revid = self._storage.load(oid, '') eq(zodb_unpickle(data), MinPO(11)) # Now from here, we can either redo the last undo, or undo the object # creation. Let's undo the object creation. info = self._storage.undoInfo() self._undo(info[2]['id'], [oid]) self.assertRaises(KeyError, self._storage.load, oid, '') self._iterate() def checkUndoCreationBranch2(self): eq = self.assertEqual oid = self._storage.new_oid() revid = self._dostore(oid, data=MinPO(11)) revid = self._dostore(oid, revid=revid, data=MinPO(12)) # Undo the last transaction info = self._storage.undoInfo() self._undo(info[0]['id'], [oid]) data, revid = self._storage.load(oid, '') eq(zodb_unpickle(data), MinPO(11)) # Now from here, we can either redo the last undo, or undo the object # creation. 
Let's redo the last undo info = self._storage.undoInfo() self._undo(info[0]['id'], [oid]) data, revid = self._storage.load(oid, '') eq(zodb_unpickle(data), MinPO(12)) self._iterate() def checkTwoObjectUndo(self): eq = self.assertEqual # Convenience p31, p32, p51, p52 = map(zodb_pickle, map(MinPO, (31, 32, 51, 52))) oid1 = self._storage.new_oid() oid2 = self._storage.new_oid() revid1 = revid2 = ZERO # Store two objects in the same transaction t = Transaction() self._storage.tpc_begin(t) self._transaction_begin() self._transaction_store(oid1, revid1, p31, '', t) self._transaction_store(oid2, revid2, p51, '', t) # Finish the transaction self._transaction_vote(t) revid1 = self._transaction_newserial(oid1) revid2 = self._transaction_newserial(oid2) self._storage.tpc_finish(t) eq(revid1, revid2) # Update those same two objects t = Transaction() self._storage.tpc_begin(t) self._transaction_begin() self._transaction_store(oid1, revid1, p32, '', t) self._transaction_store(oid2, revid2, p52, '', t) # Finish the transaction self._transaction_vote(t) revid1 = self._transaction_newserial(oid1) revid2 = self._transaction_newserial(oid2) self._storage.tpc_finish(t) eq(revid1, revid2) # Make sure the objects have the current value data, revid1 = self._storage.load(oid1, '') eq(zodb_unpickle(data), MinPO(32)) data, revid2 = self._storage.load(oid2, '') eq(zodb_unpickle(data), MinPO(52)) # Now attempt to undo the transaction containing two objects info = self._storage.undoInfo() self._undo(info[0]['id'], [oid1, oid2]) data, revid1 = self._storage.load(oid1, '') eq(zodb_unpickle(data), MinPO(31)) data, revid2 = self._storage.load(oid2, '') eq(zodb_unpickle(data), MinPO(51)) self._iterate() def checkTwoObjectUndoAtOnce(self): # Convenience eq = self.assertEqual unless = self.failUnless p30, p31, p32, p50, p51, p52 = map(zodb_pickle, map(MinPO, (30, 31, 32, 50, 51, 52))) oid1 = self._storage.new_oid() oid2 = self._storage.new_oid() revid1 = revid2 = ZERO # Store two objects in the same transaction d = self._multi_obj_transaction([(oid1, revid1, p30), (oid2, revid2, p50), ]) eq(d[oid1], d[oid2]) # Update those same two objects d = self._multi_obj_transaction([(oid1, d[oid1], p31), (oid2, d[oid2], p51), ]) eq(d[oid1], d[oid2]) # Update those same two objects d = self._multi_obj_transaction([(oid1, d[oid1], p32), (oid2, d[oid2], p52), ]) eq(d[oid1], d[oid2]) revid1 = self._transaction_newserial(oid1) revid2 = self._transaction_newserial(oid2) eq(revid1, revid2) # Make sure the objects have the current value data, revid1 = self._storage.load(oid1, '') eq(zodb_unpickle(data), MinPO(32)) data, revid2 = self._storage.load(oid2, '') eq(zodb_unpickle(data), MinPO(52)) # Now attempt to undo the transaction containing two objects info = self._storage.undoInfo() tid = info[0]['id'] tid1 = info[1]['id'] t = Transaction() oids = self._begin_undos_vote(t, tid, tid1) self._storage.tpc_finish(t) # We get the finalization stuff called an extra time: eq(len(oids), 4) unless(oid1 in oids) unless(oid2 in oids) data, revid1 = self._storage.load(oid1, '') eq(zodb_unpickle(data), MinPO(30)) data, revid2 = self._storage.load(oid2, '') eq(zodb_unpickle(data), MinPO(50)) # Now try to undo the one we just did to undo, whew info = self._storage.undoInfo() self._undo(info[0]['id'], [oid1, oid2]) data, revid1 = self._storage.load(oid1, '') eq(zodb_unpickle(data), MinPO(32)) data, revid2 = self._storage.load(oid2, '') eq(zodb_unpickle(data), MinPO(52)) self._iterate() def checkTwoObjectUndoAgain(self): eq = self.assertEqual p31, p32, p33, 
p51, p52, p53 = map( zodb_pickle, map(MinPO, (31, 32, 33, 51, 52, 53))) # Like the above, but the first revision of the objects are stored in # different transactions. oid1 = self._storage.new_oid() oid2 = self._storage.new_oid() revid1 = self._dostore(oid1, data=p31, already_pickled=1) revid2 = self._dostore(oid2, data=p51, already_pickled=1) # Update those same two objects t = Transaction() self._storage.tpc_begin(t) self._transaction_begin() self._transaction_store(oid1, revid1, p32, '', t) self._transaction_store(oid2, revid2, p52, '', t) # Finish the transaction self._transaction_vote(t) self._storage.tpc_finish(t) revid1 = self._transaction_newserial(oid1) revid2 = self._transaction_newserial(oid2) eq(revid1, revid2) # Now attempt to undo the transaction containing two objects info = self._storage.undoInfo() self._undo(info[0]["id"], [oid1, oid2]) data, revid1 = self._storage.load(oid1, '') eq(zodb_unpickle(data), MinPO(31)) data, revid2 = self._storage.load(oid2, '') eq(zodb_unpickle(data), MinPO(51)) # Like the above, but this time, the second transaction contains only # one object. t = Transaction() self._storage.tpc_begin(t) self._transaction_begin() self._transaction_store(oid1, revid1, p33, '', t) self._transaction_store(oid2, revid2, p53, '', t) # Finish the transaction self._transaction_vote(t) self._storage.tpc_finish(t) revid1 = self._transaction_newserial(oid1) revid2 = self._transaction_newserial(oid2) eq(revid1, revid2) # Update in different transactions revid1 = self._dostore(oid1, revid=revid1, data=MinPO(34)) revid2 = self._dostore(oid2, revid=revid2, data=MinPO(54)) # Now attempt to undo the transaction containing two objects info = self._storage.undoInfo() tid = info[1]['id'] t = Transaction() oids = self._begin_undos_vote(t, tid) self._storage.tpc_finish(t) eq(len(oids), 1) self.failUnless(oid1 in oids) self.failUnless(not oid2 in oids) data, revid1 = self._storage.load(oid1, '') eq(zodb_unpickle(data), MinPO(33)) data, revid2 = self._storage.load(oid2, '') eq(zodb_unpickle(data), MinPO(54)) self._iterate() def checkNotUndoable(self): eq = self.assertEqual # Set things up so we've got a transaction that can't be undone oid = self._storage.new_oid() revid_a = self._dostore(oid, data=MinPO(51)) revid_b = self._dostore(oid, revid=revid_a, data=MinPO(52)) revid_c = self._dostore(oid, revid=revid_b, data=MinPO(53)) # Start the undo info = self._storage.undoInfo() tid = info[1]['id'] t = Transaction() self.assertRaises(POSException.UndoError, self._begin_undos_vote, t, tid) self._storage.tpc_abort(t) # Now have more fun: object1 and object2 are in the same transaction, # which we'll try to undo to, but one of them has since modified in # different transaction, so the undo should fail. 
oid1 = oid revid1 = revid_c oid2 = self._storage.new_oid() revid2 = ZERO p81, p82, p91, p92 = map(zodb_pickle, map(MinPO, (81, 82, 91, 92))) t = Transaction() self._storage.tpc_begin(t) self._transaction_begin() self._transaction_store(oid1, revid1, p81, '', t) self._transaction_store(oid2, revid2, p91, '', t) self._transaction_vote(t) self._storage.tpc_finish(t) revid1 = self._transaction_newserial(oid1) revid2 = self._transaction_newserial(oid2) eq(revid1, revid2) # Make sure the objects have the expected values data, revid_11 = self._storage.load(oid1, '') eq(zodb_unpickle(data), MinPO(81)) data, revid_22 = self._storage.load(oid2, '') eq(zodb_unpickle(data), MinPO(91)) eq(revid_11, revid1) eq(revid_22, revid2) # Now modify oid2 revid2 = self._dostore(oid2, revid=revid2, data=MinPO(92)) self.assertNotEqual(revid1, revid2) self.assertNotEqual(revid2, revid_22) info = self._storage.undoInfo() tid = info[1]['id'] t = Transaction() self.assertRaises(POSException.UndoError, self._begin_undos_vote, t, tid) self._storage.tpc_abort(t) self._iterate() def checkTransactionalUndoAfterPack(self): # bwarsaw Date: Thu Mar 28 21:04:43 2002 UTC # This is a test which should provoke the underlying bug in # transactionalUndo() on a standby storage. If our hypothesis # is correct, the bug is in FileStorage, and is caused by # encoding the file position in the `id' field of the undoLog # information. Note that Full just encodes the tid, but this # is a problem for FileStorage (we have a strategy for fixing # this). # So, basically, this makes sure that undo info doesn't depend # on file positions. We change the file positions in an undo # record by packing. # Add a few object revisions oid = '\0'*8 revid0 = self._dostore(oid, data=MinPO(50)) revid1 = self._dostore(oid, revid=revid0, data=MinPO(51)) snooze() packtime = time.time() snooze() # time.time() now distinct from packtime revid2 = self._dostore(oid, revid=revid1, data=MinPO(52)) self._dostore(oid, revid=revid2, data=MinPO(53)) # Now get the undo log info = self._storage.undoInfo() self.assertEqual(len(info), 4) tid = info[0]['id'] # Now pack just the initial revision of the object. We need the # second revision otherwise we won't be able to undo the third # revision! 
self._storage.pack(packtime, referencesf) # Make some basic assertions about the undo information now info2 = self._storage.undoInfo() self.assertEqual(len(info2), 2) # And now attempt to undo the last transaction t = Transaction() oids = self._begin_undos_vote(t, tid) self._storage.tpc_finish(t) self.assertEqual(len(oids), 1) self.assertEqual(oids[0], oid) data, revid = self._storage.load(oid, '') # The object must now be at the second state self.assertEqual(zodb_unpickle(data), MinPO(52)) self._iterate() def checkTransactionalUndoAfterPackWithObjectUnlinkFromRoot(self): eq = self.assertEqual db = DB(self._storage) conn = db.open() root = conn.root() o1 = C() o2 = C() root['obj'] = o1 o1.obj = o2 txn = transaction.get() txn.note('o1 -> o2') txn.commit() now = packtime = time.time() while packtime <= now: packtime = time.time() o3 = C() o2.obj = o3 txn = transaction.get() txn.note('o1 -> o2 -> o3') txn.commit() o1.obj = o3 txn = transaction.get() txn.note('o1 -> o3') txn.commit() log = self._storage.undoLog() eq(len(log), 4) for entry in zip(log, ('o1 -> o3', 'o1 -> o2 -> o3', 'o1 -> o2', 'initial database creation')): eq(entry[0]['description'], entry[1]) self._storage.pack(packtime, referencesf) log = self._storage.undoLog() for entry in zip(log, ('o1 -> o3', 'o1 -> o2 -> o3')): eq(entry[0]['description'], entry[1]) tid = log[0]['id'] db.undo(tid) txn = transaction.get() txn.note('undo') txn.commit() # undo does a txn-undo, but doesn't invalidate conn.sync() log = self._storage.undoLog() for entry in zip(log, ('undo', 'o1 -> o3', 'o1 -> o2 -> o3')): eq(entry[0]['description'], entry[1]) eq(o1.obj, o2) eq(o1.obj.obj, o3) self._iterate() def checkPackAfterUndoDeletion(self): db = DB(self._storage) cn = db.open() root = cn.root() pack_times = [] def set_pack_time(): pack_times.append(time.time()) snooze() root["key0"] = MinPO(0) root["key1"] = MinPO(1) root["key2"] = MinPO(2) txn = transaction.get() txn.note("create 3 keys") txn.commit() set_pack_time() del root["key1"] txn = transaction.get() txn.note("delete 1 key") txn.commit() set_pack_time() root._p_deactivate() cn.sync() self.assert_(listeq(root.keys(), ["key0", "key2"])) L = db.undoInfo() db.undo(L[0]["id"]) txn = transaction.get() txn.note("undo deletion") txn.commit() set_pack_time() root._p_deactivate() cn.sync() self.assert_(listeq(root.keys(), ["key0", "key1", "key2"])) for t in pack_times: self._storage.pack(t, referencesf) root._p_deactivate() cn.sync() self.assert_(listeq(root.keys(), ["key0", "key1", "key2"])) for i in range(3): obj = root["key%d" % i] self.assertEqual(obj.value, i) root.items() self._inter_pack_pause() def checkPackAfterUndoManyTimes(self): db = DB(self._storage) cn = db.open() rt = cn.root() rt["test"] = MinPO(1) transaction.commit() rt["test2"] = MinPO(2) transaction.commit() rt["test"] = MinPO(3) txn = transaction.get() txn.note("root of undo") txn.commit() packtimes = [] for i in range(10): L = db.undoInfo() db.undo(L[0]["id"]) txn = transaction.get() txn.note("undo %d" % i) txn.commit() rt._p_deactivate() cn.sync() self.assertEqual(rt["test"].value, i % 2 and 3 or 1) self.assertEqual(rt["test2"].value, 2) packtimes.append(time.time()) snooze() for t in packtimes: self._storage.pack(t, referencesf) cn.sync() # TODO: Is _cache supposed to have a clear() method, or not? # cn._cache.clear() # The last undo set the value to 3 and pack should # never change that. 
self.assertEqual(rt["test"].value, 3) self.assertEqual(rt["test2"].value, 2) self._inter_pack_pause() def _inter_pack_pause(self): # DirectoryStorage needs a pause between packs, # most other storages dont. pass def checkTransactionalUndoIterator(self): # check that data_txn set in iterator makes sense if not hasattr(self._storage, "iterator"): return s = self._storage BATCHES = 4 OBJECTS = 4 orig = [] for i in range(BATCHES): t = Transaction() tid = p64(i + 1) s.tpc_begin(t, tid) for j in range(OBJECTS): oid = s.new_oid() obj = MinPO(i * OBJECTS + j) s.store(oid, None, zodb_pickle(obj), '', t) orig.append((tid, oid)) s.tpc_vote(t) s.tpc_finish(t) orig = [(tid, oid, s.getTid(oid)) for tid, oid in orig] i = 0 for tid, oid, revid in orig: self._dostore(oid, revid=revid, data=MinPO(revid), description="update %s" % i) # Undo the OBJECTS transactions that modified objects created # in the ith original transaction. def undo(i): info = s.undoInfo() t = Transaction() s.tpc_begin(t) base = i * OBJECTS + i for j in range(OBJECTS): tid = info[base + j]['id'] s.undo(tid, t) s.tpc_vote(t) s.tpc_finish(t) for i in range(BATCHES): undo(i) # There are now (2 + OBJECTS) * BATCHES transactions: # BATCHES original transactions, followed by # OBJECTS * BATCHES modifications, followed by # BATCHES undos transactions = s.iterator() eq = self.assertEqual for i in range(BATCHES): txn = transactions.next() tid = p64(i + 1) eq(txn.tid, tid) L1 = [(rec.oid, rec.tid, rec.data_txn) for rec in txn] L2 = [(oid, revid, None) for _tid, oid, revid in orig if _tid == tid] eq(L1, L2) for i in range(BATCHES * OBJECTS): txn = transactions.next() eq(len([rec for rec in txn if rec.data_txn is None]), 1) for i in range(BATCHES): txn = transactions.next() # The undos are performed in reverse order. otid = p64(BATCHES - i) L1 = [(rec.oid, rec.data_txn) for rec in txn] L2 = [(oid, otid) for _tid, oid, revid in orig if _tid == otid] L1.sort() L2.sort() eq(L1, L2) self.assertRaises(StopIteration, transactions.next) def checkUndoLogMetadata(self): # test that the metadata is correct in the undo log t = transaction.get() t.note('t1') t.setExtendedInfo('k2','this is transaction metadata') t.setUser('u3',path='p3') db = DB(self._storage) conn = db.open() root = conn.root() o1 = C() root['obj'] = o1 txn = transaction.get() txn.commit() l = self._storage.undoLog() self.assertEqual(len(l),2) d = l[0] self.assertEqual(d['description'],'t1') self.assertEqual(d['k2'],'this is transaction metadata') self.assertEqual(d['user_name'],'p3 u3') # A common test body for index tests on undoInfo and undoLog. Before # ZODB 3.4, they always returned a wrong number of results (one too # few _or_ too many, depending on how they were called). def _exercise_info_indices(self, method_name): db = DB(self._storage) info_func = getattr(db, method_name) cn = db.open() rt = cn.root() # Do some transactions. for key in "abcdefghijklmnopqrstuvwxyz": rt[key] = ord(key) transaction.commit() # 26 letters = 26 transactions, + the hidden transaction to make # the root object, == 27 expected. allofem = info_func(0, 100000) self.assertEqual(len(allofem), 27) # Asking for no more than 100000 should do the same. redundant = info_func(last=-1000000) self.assertEqual(allofem, redundant) # By default, we should get only 20 back. default = info_func() self.assertEqual(len(default), 20) # And they should be the most recent 20. self.assertEqual(default, allofem[:20]) # If we ask for only one, we should get only the most recent. 
fresh = info_func(last=1) self.assertEqual(len(fresh), 1) self.assertEqual(fresh[0], allofem[0]) # Another way of asking for only the most recent. redundant = info_func(last=-1) self.assertEqual(fresh, redundant) # Try a slice that doesn't start at 0. oddball = info_func(first=11, last=17) self.assertEqual(len(oddball), 17-11) self.assertEqual(oddball, allofem[11 : 11+len(oddball)]) # And another way to spell the same thing. redundant = info_func(first=11, last=-6) self.assertEqual(oddball, redundant) cn.close() # Caution: don't close db; the framework does that. If you close # it here, the ZODB tests still work, but the ZRS RecoveryStorageTests # fail (closing the DB here in those tests closes the ZRS primary # before a ZRS secondary even starts, and then the latter can't # find a server to recover from). def checkIndicesInUndoInfo(self): self._exercise_info_indices("undoInfo") def checkIndicesInUndoLog(self): self._exercise_info_indices("undoLog") ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/__init__.py000066400000000000000000000000461230730566700237100ustar00rootroot00000000000000# Having this makes debugging better. ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/blob_basic.txt000066400000000000000000000111531230730566700244200ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2005 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## ZODB Blob support ================= You create a blob like this:: >>> from ZODB.blob import Blob >>> myblob = Blob() A blob implements the IBlob interface:: >>> from ZODB.interfaces import IBlob >>> IBlob.providedBy(myblob) True We can open a new blob file for reading, but it won't have any data:: >>> myblob.open("r").read() '' But we can write data to a new Blob by opening it for writing:: >>> f = myblob.open("w") >>> f.write("Hi, Blob!") If we try to open a Blob again while it is open for writing, we get an error:: >>> myblob.open("r") Traceback (most recent call last): ... BlobError: Already opened for writing. We can close the file:: >>> f.close() Now we can open it for reading:: >>> f2 = myblob.open("r") And we get the data back:: >>> f2.read() 'Hi, Blob!' If we want to, we can open it again:: >>> f3 = myblob.open("r") >>> f3.read() 'Hi, Blob!' But we can't open it for writing, while it is opened for reading:: >>> myblob.open("a") Traceback (most recent call last): ... BlobError: Already opened for reading. Before we can write, we have to close the readers:: >>> f2.close() >>> f3.close() Now we can open it for writing again and e.g. append data:: >>> f4 = myblob.open("a") >>> f4.write("\nBlob is fine.") We can't open a blob while it is open for writing: >>> myblob.open("w") Traceback (most recent call last): ... BlobError: Already opened for writing. >>> myblob.open("r") Traceback (most recent call last): ... BlobError: Already opened for writing. >>> f4.close() Now we can read it:: >>> f4a = myblob.open("r") >>> f4a.read() 'Hi, Blob!\nBlob is fine.' 
>>> f4a.close() You shouldn't need to explicitly close a blob unless you hold a reference to it via a name. If the first line in the following test kept a reference around via a name, the second call to open it in a writable mode would fail with a BlobError, but it doesn't:: >>> myblob.open("r+").read() 'Hi, Blob!\nBlob is fine.' >>> f4b = myblob.open("a") >>> f4b.close() We can read lines out of the blob too:: >>> f5 = myblob.open("r") >>> f5.readline() 'Hi, Blob!\n' >>> f5.readline() 'Blob is fine.' >>> f5.close() We can seek to certain positions in a blob and read portions of it:: >>> f6 = myblob.open('r') >>> f6.seek(4) >>> int(f6.tell()) 4 >>> f6.read(5) 'Blob!' >>> f6.close() We can use the object returned by a blob open call as an iterable:: >>> f7 = myblob.open('r') >>> for line in f7: ... print line Hi, Blob! Blob is fine. >>> f7.close() We can truncate a blob:: >>> f8 = myblob.open('a') >>> f8.truncate(0) >>> f8.close() >>> f8 = myblob.open('r') >>> f8.read() '' >>> f8.close() Blobs are always opened in binary mode:: >>> f9 = myblob.open("r") >>> f9.mode 'rb' >>> f9.close() Blobs that have not been committed can be opened using any mode, except for "c":: >>> from ZODB.blob import BlobError, valid_modes >>> for mode in valid_modes: ... try: ... f10 = Blob().open(mode) ... except BlobError: ... print 'open failed with mode "%s"' % mode ... else: ... f10.close() open failed with mode "c" Some cleanup in this test is needed:: >>> import transaction >>> transaction.get().abort() Subclassing Blobs ----------------- Blobs are not subclassable:: >>> class SubBlob(Blob): ... pass >>> my_sub_blob = SubBlob() Traceback (most recent call last): ... TypeError: Blobs do not support subclassing. Passing data to the blob constructor ------------------------------------ If you have a small amount of data, you can pass it to the blob constructor. (This is a convenience, mostly for writing tests.) >>> myblob = Blob('some data') >>> myblob.open().read() 'some data' ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/blob_connection.txt000066400000000000000000000053621230730566700255030ustar00rootroot00000000000000Connection support for Blobs tests ================================== Connections handle Blobs specially. To demonstrate that, we first need a Blob with some data: >>> from ZODB.interfaces import IBlob >>> from ZODB.blob import Blob >>> import transaction >>> blob = Blob() >>> data = blob.open("w") >>> data.write("I'm a happy Blob.") >>> data.close() We also need a database with a blob supporting storage. (We're going to use FileStorage rather than MappingStorage here because we will want ``loadBefore`` for one of our examples.) >>> blob_storage = create_storage() >>> from ZODB.DB import DB >>> database = DB(blob_storage) Putting a Blob into a Connection works like every other object: >>> connection = database.open() >>> root = connection.root() >>> root['myblob'] = blob >>> transaction.commit() We can also commit a transaction that seats a blob into place without calling the blob's open method: >>> nothing = transaction.begin() >>> anotherblob = Blob() >>> root['anotherblob'] = anotherblob >>> nothing = transaction.commit() Getting stuff out of there works similarly: >>> transaction2 = transaction.TransactionManager() >>> connection2 = database.open(transaction_manager=transaction2) >>> root = connection2.root() >>> blob2 = root['myblob'] >>> IBlob.providedBy(blob2) True >>> blob2.open("r").read() "I'm a happy Blob." >>> transaction2.abort() MVCC also works. 
>>> transaction3 = transaction.TransactionManager() >>> connection3 = database.open(transaction_manager=transaction3) >>> f = connection.root()['myblob'].open('w') >>> f.write('I am an ecstatic Blob.') >>> f.close() >>> transaction.commit() >>> connection3.root()['myblob'].open('r').read() "I'm a happy Blob." >>> transaction2.abort() >>> transaction3.abort() >>> connection2.close() >>> connection3.close() You can't put blobs into a database that has uses a Non-Blob-Storage, though: >>> from ZODB.MappingStorage import MappingStorage >>> no_blob_storage = MappingStorage() >>> database2 = DB(no_blob_storage) >>> connection2 = database2.open(transaction_manager=transaction2) >>> root = connection2.root() >>> root['myblob'] = Blob() >>> transaction2.commit() # doctest: +ELLIPSIS Traceback (most recent call last): ... Unsupported: Storing Blobs in is not supported. >>> transaction2.abort() >>> connection2.close() After testing this, we don't need the storage directory and databases anymore: >>> transaction.abort() >>> connection.close() >>> database.close() >>> database2.close() >>> blob_storage.close() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/blob_consume.txt000066400000000000000000000072171230730566700250160ustar00rootroot00000000000000Consuming existing files ======================== The ZODB Blob implementation allows to import existing files as Blobs within an O(1) operation we call `consume`:: Let's create a file:: >>> to_import = open('to_import', 'wb') >>> to_import.write("I'm a Blob and I feel fine.") The file *must* be closed before giving it to consumeFile: >>> to_import.close() Now, let's consume this file in a blob by specifying it's name:: >>> from ZODB.blob import Blob >>> blob = Blob() >>> blob.consumeFile('to_import') After the consumeFile operation, the original file has been removed: >>> import os >>> os.path.exists('to_import') False We now can call open on the blob and read and write the data:: >>> blob_read = blob.open('r') >>> blob_read.read() "I'm a Blob and I feel fine." >>> blob_read.close() >>> blob_write = blob.open('w') >>> blob_write.write('I was changed.') >>> blob_write.close() We can not consume a file when there is a reader or writer around for a blob already:: >>> open('to_import', 'wb').write('I am another blob.') >>> blob_read = blob.open('r') >>> blob.consumeFile('to_import') Traceback (most recent call last): BlobError: Already opened for reading. >>> blob_read.close() >>> blob_write = blob.open('w') >>> blob.consumeFile('to_import') Traceback (most recent call last): BlobError: Already opened for writing. >>> blob_write.close() Now, after closing all readers and writers we can consume files again:: >>> blob.consumeFile('to_import') >>> blob_read = blob.open('r') >>> blob_read.read() 'I am another blob.' >>> blob_read.close() Edge cases ========== There are some edge cases what happens when the link() operation fails. We simulate this in different states: Case 1: We don't have uncommitted data, but the link operation fails. We fall back to try a copy/remove operation that is successfull:: >>> open('to_import', 'wb').write('Some data.') >>> def failing_rename(f1, f2): ... import exceptions ... if f1 == 'to_import': ... raise exceptions.OSError("I can't link.") ... os_rename(f1, f2) >>> blob = Blob() >>> os_rename = os.rename >>> os.rename = failing_rename >>> blob.consumeFile('to_import') The blob did not have data before, so it shouldn't have data now:: >>> blob.open('r').read() 'Some data.' 
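The behaviour exercised here boils down to a rename-with-fallback pattern.
A minimal sketch of that pattern (not the actual ``consumeFile``
implementation; ``shutil.copyfile`` merely stands in for whatever copy
helper is used)::

    import os
    import shutil

    def consume_with_fallback(source, destination):
        # Prefer the O(1) rename; it only works within one filesystem.
        try:
            os.rename(source, destination)
        except OSError:
            # Fall back to copying the data and removing the original.
            shutil.copyfile(source, destination)
            os.remove(source)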
Case 2: We don't have uncommitted data and both the link operation and the copy fail. The exception will be re-raised and the target file will not exist:: >>> blob = Blob() >>> import ZODB.utils >>> utils_cp = ZODB.utils.cp >>> def failing_copy(f1, f2): ... import exceptions ... raise exceptions.OSError("I can't copy.") >>> ZODB.utils.cp = failing_copy >>> open('to_import', 'wb').write('Some data.') >>> blob.consumeFile('to_import') Traceback (most recent call last): OSError: I can't copy. The blob did not have data before, so it shouldn't have data now:: >>> blob.open('r').read() '' Case 3: We have uncommitted data, but the link and the copy operations fail. The exception will be re-raised and the target file will exist with the previous uncomitted data:: >>> blob = Blob() >>> blob_writing = blob.open('w') >>> blob_writing.write('Uncommitted data') >>> blob_writing.close() >>> blob.consumeFile('to_import') Traceback (most recent call last): OSError: I can't copy. The blob did existed before and had uncommitted data, this shouldn't have changed:: >>> blob.open('r').read() 'Uncommitted data' >>> os.rename = os_rename >>> ZODB.utils.cp = utils_cp ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/blob_importexport.txt000066400000000000000000000045751230730566700261250ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2005 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## Import/export support for blob data =================================== Set up: >>> import ZODB.blob, transaction >>> from persistent.mapping import PersistentMapping We need an database with an undoing blob supporting storage: >>> database1 = ZODB.DB(create_storage('1')) >>> database2 = ZODB.DB(create_storage('2')) Create our root object for database1: >>> connection1 = database1.open() >>> root1 = connection1.root() Put a couple blob objects in our database1 and on the filesystem: >>> import time, os >>> nothing = transaction.begin() >>> data1 = 'x'*100000 >>> blob1 = ZODB.blob.Blob() >>> blob1.open('w').write(data1) >>> data2 = 'y'*100000 >>> blob2 = ZODB.blob.Blob() >>> blob2.open('w').write(data2) >>> d = PersistentMapping({'blob1':blob1, 'blob2':blob2}) >>> root1['blobdata'] = d >>> transaction.commit() Export our blobs from a database1 connection: >>> conn = root1['blobdata']._p_jar >>> oid = root1['blobdata']._p_oid >>> exportfile = 'export' >>> nothing = connection1.exportFile(oid, exportfile) Import our exported data into database2: >>> connection2 = database2.open() >>> root2 = connection2.root() >>> nothing = transaction.begin() >>> data = root2._p_jar.importFile(exportfile) >>> root2['blobdata'] = data >>> transaction.commit() Make sure our data exists: >>> items1 = root1['blobdata'] >>> items2 = root2['blobdata'] >>> bool(items1.keys() == items2.keys()) True >>> items1['blob1'].open().read() == items2['blob1'].open().read() True >>> items1['blob2'].open().read() == items2['blob2'].open().read() True >>> transaction.get().abort() .. 
cleanup >>> database1.close() >>> database2.close() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/blob_layout.txt000066400000000000000000000236551230730566700246660ustar00rootroot00000000000000====================== Blob directory layouts ====================== The internal structure of the blob directories is governed by so called `layouts`. The current default layout is called `bushy`. The original blob implementation used a layout that we now call `lawn` and which is still available for backwards compatibility. Layouts implement two methods: one for computing a relative path for an OID and one for turning a relative path back into an OID. Our terminology is roughly the same as used in `DirectoryStorage`. The `bushy` layout ================== The bushy layout splits the OID into the 8 byte parts, reverses them and creates one directory level for each part, named by the hexlified representation of the byte value. This results in 8 levels of directories, the leaf directories being used for the revisions of the blobs and at most 256 entries per directory level: >>> from ZODB.blob import BushyLayout >>> bushy = BushyLayout() >>> bushy.oid_to_path('\x00\x00\x00\x00\x00\x00\x00\x00') '0x00/0x00/0x00/0x00/0x00/0x00/0x00/0x00' >>> bushy.oid_to_path('\x00\x00\x00\x00\x00\x00\x00\x01') '0x00/0x00/0x00/0x00/0x00/0x00/0x00/0x01' >>> import os >>> bushy.path_to_oid(os.path.join( ... '0x01', '0x00', '0x00', '0x00', '0x00', '0x00', '0x00', '0x00')) '\x01\x00\x00\x00\x00\x00\x00\x00' >>> bushy.path_to_oid(os.path.join( ... '0xff', '0x00', '0x00', '0x00', '0x00', '0x00', '0x00', '0x00')) '\xff\x00\x00\x00\x00\x00\x00\x00' Paths that do not represent an OID will cause a ValueError: >>> bushy.path_to_oid('tmp') Traceback (most recent call last): ValueError: Not a valid OID path: `tmp` The `lawn` layout ================= The lawn layout creates on directory for each blob named by the blob's hex representation of its OID. This has some limitations on various file systems like performance penalties or the inability to store more than a given number of blobs at the same time (e.g. 32k on ext3). >>> from ZODB.blob import LawnLayout >>> lawn = LawnLayout() >>> lawn.oid_to_path('\x00\x00\x00\x00\x00\x00\x00\x00') '0x00' >>> lawn.oid_to_path('\x00\x00\x00\x00\x00\x00\x00\x01') '0x01' >>> lawn.path_to_oid('0x01') '\x00\x00\x00\x00\x00\x00\x00\x01' Paths that do not represent an OID will cause a ValueError: >>> lawn.path_to_oid('tmp') Traceback (most recent call last): ValueError: Not a valid OID path: `tmp` >>> lawn.path_to_oid('') Traceback (most recent call last): ValueError: Not a valid OID path: `` Auto-detecting the layout of a directory ======================================== To allow easier migration, we provide an auto-detection feature that analyses a blob directory and decides for a strategy to use. In general it prefers to choose the `bushy` layout, except if it determines that the directory has already been used to create a lawn structure. >>> from ZODB.blob import auto_layout_select 1. Non-existing directories will trigger a bushy layout: >>> import os, shutil >>> auto_layout_select('blobs') 'bushy' 2. Empty directories will trigger a bushy layout too: >>> os.mkdir('blobs') >>> auto_layout_select('blobs') 'bushy' 3. 
If the directory contains a marker for the strategy it will be used: >>> from ZODB.blob import LAYOUT_MARKER >>> import os.path >>> open(os.path.join('blobs', LAYOUT_MARKER), 'wb').write('bushy') >>> auto_layout_select('blobs') 'bushy' >>> open(os.path.join('blobs', LAYOUT_MARKER), 'wb').write('lawn') >>> auto_layout_select('blobs') 'lawn' >>> shutil.rmtree('blobs') 4. If the directory does not contain a marker but other files that are not hidden, we assume that it was created with an earlier version of the blob implementation and uses our `lawn` layout: >>> os.mkdir('blobs') >>> open(os.path.join('blobs', '0x0101'), 'wb').write('foo') >>> auto_layout_select('blobs') 'lawn' >>> shutil.rmtree('blobs') 5. If the directory contains only hidden files, use the bushy layout: >>> os.mkdir('blobs') >>> open(os.path.join('blobs', '.svn'), 'wb').write('blah') >>> auto_layout_select('blobs') 'bushy' >>> shutil.rmtree('blobs') Directory layout markers ======================== When the file system helper (FSH) is asked to create the directory structure, it will leave a marker with the choosen layout if no marker exists yet: >>> from ZODB.blob import FilesystemHelper >>> blobs = 'blobs' >>> fsh = FilesystemHelper(blobs) >>> fsh.layout_name 'bushy' >>> fsh.create() >>> open(os.path.join(blobs, LAYOUT_MARKER), 'rb').read() 'bushy' If the FSH finds a marker, then it verifies whether its content matches the strategy that was chosen. It will raise an exception if we try to work with a directory that has a different marker than the chosen strategy: >>> fsh = FilesystemHelper(blobs, 'lawn') >>> fsh.layout_name 'lawn' >>> fsh.create() # doctest: +ELLIPSIS Traceback (most recent call last): ValueError: Directory layout `lawn` selected for blob directory .../blobs/, but marker found for layout `bushy` >>> rmtree(blobs) This function interacts with the automatic detection in the way, that an unmarked directory will be marked the first time when it is auto-guessed and the marker will be used in the future: >>> import ZODB.FileStorage >>> from ZODB.blob import BlobStorage >>> datafs = 'data.fs' >>> base_storage = ZODB.FileStorage.FileStorage(datafs) >>> os.mkdir(blobs) >>> open(os.path.join(blobs, 'foo'), 'wb').write('foo') >>> blob_storage = BlobStorage(blobs, base_storage) >>> blob_storage.fshelper.layout_name 'lawn' >>> open(os.path.join(blobs, LAYOUT_MARKER), 'rb').read() 'lawn' >>> blob_storage = BlobStorage('blobs', base_storage, layout='bushy') ... # doctest: +ELLIPSIS Traceback (most recent call last): ValueError: Directory layout `bushy` selected for blob directory .../blobs/, but marker found for layout `lawn` >>> base_storage.close() >>> rmtree('blobs') Migrating between directory layouts =================================== A script called `migrateblobs.py` is distributed with the ZODB for offline migration capabilities between different directory layouts. It can migrate any blob directory layout to any other layout. It leaves the original blob directory untouched (except from eventually creating a temporary directory and the storage layout marker). 
The migration is accessible as a library function: >>> from ZODB.scripts.migrateblobs import migrate Create a `lawn` directory structure and migrate it to the new `bushy` one: >>> from ZODB.blob import FilesystemHelper >>> d = 'd' >>> os.mkdir(d) >>> old = os.path.join(d, 'old') >>> old_fsh = FilesystemHelper(old, 'lawn') >>> old_fsh.create() >>> blob1 = old_fsh.getPathForOID(7039, create=True) >>> blob2 = old_fsh.getPathForOID(10, create=True) >>> blob3 = old_fsh.getPathForOID(7034, create=True) >>> open(os.path.join(blob1, 'foo'), 'wb').write('foo') >>> open(os.path.join(blob1, 'foo2'), 'wb').write('bar') >>> open(os.path.join(blob2, 'foo3'), 'wb').write('baz') >>> open(os.path.join(blob2, 'foo4'), 'wb').write('qux') >>> open(os.path.join(blob3, 'foo5'), 'wb').write('quux') >>> open(os.path.join(blob3, 'foo6'), 'wb').write('corge') Committed blobs have their permissions set to 000 The migration function is called with the old and the new path and the layout that shall be used for the new directory: >>> bushy = os.path.join(d, 'bushy') >>> migrate(old, bushy, 'bushy') # doctest: +ELLIPSIS +NORMALIZE_WHITESPACE Migrating blob data from `.../old` (lawn) to `.../bushy` (bushy) OID: 0x0a - 2 files OID: 0x1b7a - 2 files OID: 0x1b7f - 2 files The new directory now contains the same files in different directories, but with the same sizes and permissions: >>> lawn_files = {} >>> for base, dirs, files in os.walk(old): ... for file_name in files: ... lawn_files[file_name] = os.path.join(base, file_name) >>> bushy_files = {} >>> for base, dirs, files in os.walk(bushy): ... for file_name in files: ... bushy_files[file_name] = os.path.join(base, file_name) >>> len(lawn_files) == len(bushy_files) True >>> for file_name, lawn_path in sorted(lawn_files.items()): ... if file_name == '.layout': ... continue ... lawn_stat = os.stat(lawn_path) ... bushy_path = bushy_files[file_name] ... bushy_stat = os.stat(bushy_path) ... print lawn_path, '-->', bushy_path ... if ((lawn_stat.st_mode, lawn_stat.st_size) != ... (bushy_stat.st_mode, bushy_stat.st_size)): ... print 'oops' old/0x1b7f/foo --> bushy/0x00/0x00/0x00/0x00/0x00/0x00/0x1b/0x7f/foo old/0x1b7f/foo2 --> bushy/0x00/0x00/0x00/0x00/0x00/0x00/0x1b/0x7f/foo2 old/0x0a/foo3 --> bushy/0x00/0x00/0x00/0x00/0x00/0x00/0x00/0x0a/foo3 old/0x0a/foo4 --> bushy/0x00/0x00/0x00/0x00/0x00/0x00/0x00/0x0a/foo4 old/0x1b7a/foo5 --> bushy/0x00/0x00/0x00/0x00/0x00/0x00/0x1b/0x7a/foo5 old/0x1b7a/foo6 --> bushy/0x00/0x00/0x00/0x00/0x00/0x00/0x1b/0x7a/foo6 We can also migrate the bushy layout back to the lawn layout: >>> lawn = os.path.join(d, 'lawn') >>> migrate(bushy, lawn, 'lawn') Migrating blob data from `.../bushy` (bushy) to `.../lawn` (lawn) OID: 0x0a - 2 files OID: 0x1b7a - 2 files OID: 0x1b7f - 2 files >>> lawn_files = {} >>> for base, dirs, files in os.walk(lawn): ... for file_name in files: ... lawn_files[file_name] = os.path.join(base, file_name) >>> len(lawn_files) == len(bushy_files) True >>> for file_name, lawn_path in sorted(lawn_files.items()): ... if file_name == '.layout': ... continue ... lawn_stat = os.stat(lawn_path) ... bushy_path = bushy_files[file_name] ... bushy_stat = os.stat(bushy_path) ... print bushy_path, '-->', lawn_path ... if ((lawn_stat.st_mode, lawn_stat.st_size) != ... (bushy_stat.st_mode, bushy_stat.st_size)): ... 
print 'oops' bushy/0x00/0x00/0x00/0x00/0x00/0x00/0x1b/0x7f/foo --> lawn/0x1b7f/foo bushy/0x00/0x00/0x00/0x00/0x00/0x00/0x1b/0x7f/foo2 --> lawn/0x1b7f/foo2 bushy/0x00/0x00/0x00/0x00/0x00/0x00/0x00/0x0a/foo3 --> lawn/0x0a/foo3 bushy/0x00/0x00/0x00/0x00/0x00/0x00/0x00/0x0a/foo4 --> lawn/0x0a/foo4 bushy/0x00/0x00/0x00/0x00/0x00/0x00/0x1b/0x7a/foo5 --> lawn/0x1b7a/foo5 bushy/0x00/0x00/0x00/0x00/0x00/0x00/0x1b/0x7a/foo6 --> lawn/0x1b7a/foo6 >>> rmtree(d) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/blob_packing.txt000066400000000000000000000072721230730566700247620ustar00rootroot00000000000000Packing support for blob data ============================= Set up: >>> from ZODB.serialize import referencesf >>> from ZODB.blob import Blob >>> from ZODB import utils >>> from ZODB.DB import DB >>> import transaction A helper method to assure a unique timestamp across multiple platforms: >>> from ZODB.tests.testblob import new_time UNDOING ======= We need a database with an undoing blob supporting storage: >>> blob_storage = create_storage() >>> database = DB(blob_storage) Create our root object: >>> connection1 = database.open() >>> root = connection1.root() Put some revisions of a blob object in our database and on the filesystem: >>> import os >>> tids = [] >>> times = [] >>> nothing = transaction.begin() >>> times.append(new_time()) >>> blob = Blob() >>> blob.open('w').write('this is blob data 0') >>> root['blob'] = blob >>> transaction.commit() >>> tids.append(blob._p_serial) >>> nothing = transaction.begin() >>> times.append(new_time()) >>> root['blob'].open('w').write('this is blob data 1') >>> transaction.commit() >>> tids.append(blob._p_serial) >>> nothing = transaction.begin() >>> times.append(new_time()) >>> root['blob'].open('w').write('this is blob data 2') >>> transaction.commit() >>> tids.append(blob._p_serial) >>> nothing = transaction.begin() >>> times.append(new_time()) >>> root['blob'].open('w').write('this is blob data 3') >>> transaction.commit() >>> tids.append(blob._p_serial) >>> nothing = transaction.begin() >>> times.append(new_time()) >>> root['blob'].open('w').write('this is blob data 4') >>> transaction.commit() >>> tids.append(blob._p_serial) >>> oid = root['blob']._p_oid >>> fns = [ blob_storage.fshelper.getBlobFilename(oid, x) for x in tids ] >>> [ os.path.exists(x) for x in fns ] [True, True, True, True, True] Do a pack to the slightly before the first revision was written: >>> packtime = times[0] >>> blob_storage.pack(packtime, referencesf) >>> [ os.path.exists(x) for x in fns ] [True, True, True, True, True] Do a pack to the slightly before the second revision was written: >>> packtime = times[1] >>> blob_storage.pack(packtime, referencesf) >>> [ os.path.exists(x) for x in fns ] [True, True, True, True, True] Do a pack to the slightly before the third revision was written: >>> packtime = times[2] >>> blob_storage.pack(packtime, referencesf) >>> [ os.path.exists(x) for x in fns ] [False, True, True, True, True] Do a pack to the slightly before the fourth revision was written: >>> packtime = times[3] >>> blob_storage.pack(packtime, referencesf) >>> [ os.path.exists(x) for x in fns ] [False, False, True, True, True] Do a pack to the slightly before the fifth revision was written: >>> packtime = times[4] >>> blob_storage.pack(packtime, referencesf) >>> [ os.path.exists(x) for x in fns ] [False, False, False, True, True] Do a pack to now: >>> packtime = new_time() >>> blob_storage.pack(packtime, referencesf) >>> [ os.path.exists(x) for x in fns ] [False, False, 
False, False, True] Delete the object and do a pack, it should get rid of the most current revision as well as the entire directory: >>> nothing = transaction.begin() >>> del root['blob'] >>> transaction.commit() >>> packtime = new_time() >>> blob_storage.pack(packtime, referencesf) >>> [ os.path.exists(x) for x in fns ] [False, False, False, False, False] >>> os.path.exists(os.path.split(fns[0])[0]) False Clean up our blob directory and database: >>> blob_storage.close() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/blob_tempdir.txt000066400000000000000000000030061230730566700250010ustar00rootroot00000000000000======================================= Temporary directory handling with blobs ======================================= When creating uncommitted data files for a blob (e.g. by calling `blob.open('w')`) we need to decide where to create them. The decision depends on whether the blob is already stored in a database or not. Case 1: Blobs that are not in a database yet ============================================ Let's create a new blob and open it for writing:: >>> from ZODB.blob import Blob >>> b = Blob() >>> w = b.open('w') The created file is in the default temporary directory:: >>> import tempfile >>> w.name.startswith(tempfile.gettempdir()) True >>> w.close() Case 2: Blobs that are in a database ==================================== For this case we instanciate a blob and add it to a database immediately. First, we need a datatabase with blob support:: >>> from ZODB.MappingStorage import MappingStorage >>> from ZODB.blob import BlobStorage >>> from ZODB.DB import DB >>> import os.path >>> base_storage = MappingStorage('test') >>> blob_dir = os.path.abspath('blobs') >>> blob_storage = BlobStorage(blob_dir, base_storage) >>> database = DB(blob_storage) Now we create a blob and put it in the database. After that we open it for writing and expect the file to be in the blob temporary directory:: >>> blob = Blob() >>> connection = database.open() >>> connection.add(blob) >>> w = blob.open('w') >>> w.name.startswith(os.path.join(blob_dir, 'tmp')) True >>> w.close() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/blob_transaction.txt000066400000000000000000000274061230730566700256740ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2005-2007 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## Transaction support for Blobs ============================= We need a database with a blob supporting storage:: >>> import ZODB.blob, transaction >>> blob_dir = 'blobs' >>> blob_storage = create_storage(blob_dir=blob_dir) >>> database = ZODB.DB(blob_storage) >>> connection1 = database.open() >>> root1 = connection1.root() Putting a Blob into a Connection works like any other Persistent object:: >>> blob1 = ZODB.blob.Blob() >>> blob1.open('w').write('this is blob 1') >>> root1['blob1'] = blob1 >>> 'blob1' in root1 True Aborting a blob add leaves the blob unchanged: >>> transaction.abort() >>> 'blob1' in root1 False >>> blob1._p_oid >>> blob1._p_jar >>> blob1.open().read() 'this is blob 1' It doesn't clear the file because there is no previously committed version: >>> fname = blob1._p_blob_uncommitted >>> import os >>> os.path.exists(fname) True Let's put the blob back into the root and commit the change: >>> root1['blob1'] = blob1 >>> transaction.commit() Now, if we make a change and abort it, we'll return to the committed state: >>> os.path.exists(fname) False >>> blob1._p_blob_uncommitted >>> blob1.open('w').write('this is new blob 1') >>> blob1.open().read() 'this is new blob 1' >>> fname = blob1._p_blob_uncommitted >>> os.path.exists(fname) True >>> transaction.abort() >>> os.path.exists(fname) False >>> blob1._p_blob_uncommitted >>> blob1.open().read() 'this is blob 1' Opening a blob gives us a filehandle. Getting data out of the resulting filehandle is accomplished via the filehandle's read method:: >>> connection2 = database.open() >>> root2 = connection2.root() >>> blob1a = root2['blob1'] >>> blob1afh1 = blob1a.open("r") >>> blob1afh1.read() 'this is blob 1' Let's make another filehandle for read only to blob1a. Aach file handle has a reference to the (same) underlying blob:: >>> blob1afh2 = blob1a.open("r") >>> blob1afh2.blob is blob1afh1.blob True Let's close the first filehandle we got from the blob:: >>> blob1afh1.close() Let's abort this transaction, and ensure that the filehandles that we opened are still open:: >>> transaction.abort() >>> blob1afh2.read() 'this is blob 1' >>> blob1afh2.close() If we open a blob for append, writing any number of bytes to the blobfile should result in the blob being marked "dirty" in the connection (we just aborted above, so the object should be "clean" when we start):: >>> bool(blob1a._p_changed) False >>> blob1a.open('r').read() 'this is blob 1' >>> blob1afh3 = blob1a.open('a') >>> bool(blob1a._p_changed) True >>> blob1afh3.write('woot!') >>> blob1afh3.close() We can open more than one blob object during the course of a single transaction:: >>> blob2 = ZODB.blob.Blob() >>> blob2.open('w').write('this is blob 3') >>> root2['blob2'] = blob2 >>> transaction.commit() Since we committed the current transaction above, the aggregate changes we've made to blob, blob1a (these refer to the same object) and blob2 (a different object) should be evident:: >>> blob1.open('r').read() 'this is blob 1woot!' >>> blob1a.open('r').read() 'this is blob 1woot!' >>> blob2.open('r').read() 'this is blob 3' We shouldn't be able to persist a blob filehandle at commit time (although the exception which is raised when an object cannot be pickled appears to be particulary unhelpful for casual users at the moment):: >>> root1['wontwork'] = blob1.open('r') >>> transaction.commit() Traceback (most recent call last): ... 
TypeError: coercing to Unicode: need string or buffer, BlobFile found Abort for good measure:: >>> transaction.abort() Attempting to change a blob simultaneously from two different connections should result in a write conflict error:: >>> tm1 = transaction.TransactionManager() >>> tm2 = transaction.TransactionManager() >>> root3 = database.open(transaction_manager=tm1).root() >>> root4 = database.open(transaction_manager=tm2).root() >>> blob1c3 = root3['blob1'] >>> blob1c4 = root4['blob1'] >>> blob1c3fh1 = blob1c3.open('a').write('this is from connection 3') >>> blob1c4fh1 = blob1c4.open('a').write('this is from connection 4') >>> tm1.commit() >>> root3['blob1'].open('r').read() 'this is blob 1woot!this is from connection 3' >>> tm2.commit() Traceback (most recent call last): ... ConflictError: database conflict error (oid 0x01, class ZODB.blob.Blob...) After the conflict, the winning transaction's result is visible on both connections:: >>> root3['blob1'].open('r').read() 'this is blob 1woot!this is from connection 3' >>> tm2.abort() >>> root4['blob1'].open('r').read() 'this is blob 1woot!this is from connection 3' You can't commit a transaction while blob files are open: >>> f = root3['blob1'].open('w') >>> tm1.commit() Traceback (most recent call last): ... ValueError: Can't commit with opened blobs. >>> f.close() >>> tm1.abort() >>> f = root3['blob1'].open('w') >>> f.close() >>> f = root3['blob1'].open('r') >>> tm1.commit() Traceback (most recent call last): ... ValueError: Can't commit with opened blobs. >>> f.close() >>> tm1.abort() Savepoints and Blobs -------------------- We do support optimistic savepoints: >>> connection5 = database.open() >>> root5 = connection5.root() >>> blob = ZODB.blob.Blob() >>> blob_fh = blob.open("w") >>> blob_fh.write("I'm a happy blob.") >>> blob_fh.close() >>> root5['blob'] = blob >>> transaction.commit() >>> root5['blob'].open("r").read() "I'm a happy blob." >>> blob_fh = root5['blob'].open("a") >>> blob_fh.write(" And I'm singing.") >>> blob_fh.close() >>> root5['blob'].open("r").read() "I'm a happy blob. And I'm singing." >>> savepoint = transaction.savepoint(optimistic=True) >>> root5['blob'].open("r").read() "I'm a happy blob. And I'm singing." Savepoints store the blobs in temporary directories in the temporary directory of the blob storage: >>> len([name for name in os.listdir(os.path.join(blob_dir, 'tmp')) ... if name.startswith('savepoint')]) 1 After committing the transaction, the temporary savepoint files are moved to the committed location again: >>> transaction.commit() >>> len([name for name in os.listdir(os.path.join(blob_dir, 'tmp')) ... if name.startswith('savepoint')]) 0 We support non-optimistic savepoints too: >>> root5['blob'].open("a").write(" And I'm dancing.") >>> root5['blob'].open("r").read() "I'm a happy blob. And I'm singing. And I'm dancing." >>> savepoint = transaction.savepoint() Again, the savepoint creates a new savepoints directory: >>> len([name for name in os.listdir(os.path.join(blob_dir, 'tmp')) ... if name.startswith('savepoint')]) 1 >>> root5['blob'].open("w").write(" And the weather is beautiful.") >>> savepoint.rollback() >>> root5['blob'].open("r").read() "I'm a happy blob. And I'm singing. And I'm dancing." >>> transaction.abort() The savepoint blob directory gets cleaned up on an abort: >>> len([name for name in os.listdir(os.path.join(blob_dir, 'tmp')) ... 
if name.startswith('savepoint')]) 0 Reading Blobs outside of a transaction -------------------------------------- If you want to read from a Blob outside of transaction boundaries (e.g. to stream a file to the browser), committed method to get the name of a file that can be opened. >>> connection6 = database.open() >>> root6 = connection6.root() >>> blob = ZODB.blob.Blob() >>> blob_fh = blob.open("w") >>> blob_fh.write("I'm a happy blob.") >>> blob_fh.close() >>> root6['blob'] = blob >>> transaction.commit() >>> open(blob.committed()).read() "I'm a happy blob." We can also read committed data by calling open with a 'c' flag: >>> f = blob.open('c') This just returns a regular file object: >>> type(f) and doesn't prevent us from opening the blob for writing: >>> blob.open('w').write('x') >>> blob.open().read() 'x' >>> f.read() "I'm a happy blob." >>> f.close() >>> transaction.abort() An exception is raised if we call committed on a blob that has uncommitted changes: >>> blob = ZODB.blob.Blob() >>> blob.committed() Traceback (most recent call last): ... BlobError: Uncommitted changes >>> blob.open('c') Traceback (most recent call last): ... BlobError: Uncommitted changes >>> blob.open('w').write("I'm a happy blob.") >>> root6['blob6'] = blob >>> blob.committed() Traceback (most recent call last): ... BlobError: Uncommitted changes >>> blob.open('c') Traceback (most recent call last): ... BlobError: Uncommitted changes >>> s = transaction.savepoint() >>> blob.committed() Traceback (most recent call last): ... BlobError: Uncommitted changes >>> blob.open('c') Traceback (most recent call last): ... BlobError: Uncommitted changes >>> transaction.commit() >>> open(blob.committed()).read() "I'm a happy blob." You can't open a committed blob file for writing: >>> open(blob.committed(), 'w') # doctest: +ELLIPSIS Traceback (most recent call last): ... IOError: ... tpc_abort --------- If a transaction is aborted in the middle of 2-phase commit, any data stored are discarded. >>> olddata, oldserial = blob_storage.load(blob._p_oid, '') >>> t = transaction.get() >>> blob_storage.tpc_begin(t) >>> open('blobfile', 'w').write('This data should go away') >>> s1 = blob_storage.storeBlob(blob._p_oid, oldserial, olddata, 'blobfile', ... '', t) >>> new_oid = blob_storage.new_oid() >>> open('blobfile2', 'w').write('This data should go away too') >>> s2 = blob_storage.storeBlob(new_oid, '\0'*8, olddata, 'blobfile2', ... '', t) >>> serials = blob_storage.tpc_vote(t) >>> if s1 is None: ... s1 = [s for (oid, s) in serials if oid == blob._p_oid][0] >>> if s2 is None: ... s2 = [s for (oid, s) in serials if oid == new_oid][0] >>> blob_storage.tpc_abort(t) Now, the serial for the existing blob should be the same: >>> blob_storage.load(blob._p_oid, '') == (olddata, oldserial) True And we shouldn't be able to read the data that we saved: >>> blob_storage.loadBlob(blob._p_oid, s1) Traceback (most recent call last): ... POSKeyError: 'No blob file' Of course the old data should be unaffected: >>> open(blob_storage.loadBlob(blob._p_oid, oldserial)).read() "I'm a happy blob." Similarly, the new object wasn't added to the storage: >>> blob_storage.load(new_oid, '') Traceback (most recent call last): ... POSKeyError: 0x... >>> blob_storage.loadBlob(blob._p_oid, s2) Traceback (most recent call last): ... POSKeyError: 'No blob file' .. 
clean up >>> tm1.abort() >>> tm2.abort() >>> database.close() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/blobstorage_packing.txt000066400000000000000000000116661230730566700263510ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2005 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## Packing support for blob data ============================= Set up: >>> from ZODB.MappingStorage import MappingStorage >>> from ZODB.serialize import referencesf >>> from ZODB.blob import Blob, BlobStorage >>> from ZODB import utils >>> from ZODB.DB import DB >>> import transaction >>> storagefile = 'Data.fs' >>> blob_dir = 'blobs' A helper method to assure a unique timestamp across multiple platforms: >>> from ZODB.tests.testblob import new_time UNDOING ======= See blob_packing.txt. NON-UNDOING =========== We need an database with a NON-undoing blob supporting storage: >>> base_storage = MappingStorage('storage') >>> blob_storage = BlobStorage(blob_dir, base_storage) >>> database = DB(blob_storage) Create our root object: >>> connection1 = database.open() >>> root = connection1.root() Put some revisions of a blob object in our database and on the filesystem: >>> import time, os >>> tids = [] >>> times = [] >>> nothing = transaction.begin() >>> times.append(new_time()) >>> blob = Blob() >>> blob.open('w').write('this is blob data 0') >>> root['blob'] = blob >>> transaction.commit() >>> tids.append(blob_storage._tid) >>> nothing = transaction.begin() >>> times.append(new_time()) >>> root['blob'].open('w').write('this is blob data 1') >>> transaction.commit() >>> tids.append(blob_storage._tid) >>> nothing = transaction.begin() >>> times.append(new_time()) >>> root['blob'].open('w').write('this is blob data 2') >>> transaction.commit() >>> tids.append(blob_storage._tid) >>> nothing = transaction.begin() >>> times.append(new_time()) >>> root['blob'].open('w').write('this is blob data 3') >>> transaction.commit() >>> tids.append(blob_storage._tid) >>> nothing = transaction.begin() >>> times.append(new_time()) >>> root['blob'].open('w').write('this is blob data 4') >>> transaction.commit() >>> tids.append(blob_storage._tid) >>> oid = root['blob']._p_oid >>> fns = [ blob_storage.fshelper.getBlobFilename(oid, x) for x in tids ] >>> [ os.path.exists(x) for x in fns ] [True, True, True, True, True] Get our blob filenames for this oid. 
>>> fns = [ blob_storage.fshelper.getBlobFilename(oid, x) for x in tids ] Do a pack to the slightly before the first revision was written: >>> packtime = times[0] >>> blob_storage.pack(packtime, referencesf) >>> [ os.path.exists(x) for x in fns ] [False, False, False, False, True] Do a pack to now: >>> packtime = new_time() >>> blob_storage.pack(packtime, referencesf) >>> [ os.path.exists(x) for x in fns ] [False, False, False, False, True] Delete the object and do a pack, it should get rid of the most current revision as well as the entire directory: >>> nothing = transaction.begin() >>> del root['blob'] >>> transaction.commit() >>> packtime = new_time() >>> blob_storage.pack(packtime, referencesf) >>> [ os.path.exists(x) for x in fns ] [False, False, False, False, False] >>> os.path.exists(os.path.split(fns[0])[0]) False Avoiding parallel packs ======================= Blob packing (similar to FileStorage) can only be run once at a time. For this, a flag (_blobs_pack_is_in_progress) is set. If the pack method is called while this flag is set, it will refuse to perform another pack, until the flag is reset: >>> blob_storage._blobs_pack_is_in_progress False >>> blob_storage._blobs_pack_is_in_progress = True >>> blob_storage.pack(packtime, referencesf) Traceback (most recent call last): BlobStorageError: Already packing >>> blob_storage._blobs_pack_is_in_progress = False >>> blob_storage.pack(packtime, referencesf) We can also see, that the flag is set during the pack, by leveraging the knowledge that the underlying storage's pack method is also called: >>> def dummy_pack(time, ref): ... print "_blobs_pack_is_in_progress =", ... print blob_storage._blobs_pack_is_in_progress ... return base_pack(time, ref) >>> base_pack = base_storage.pack >>> base_storage.pack = dummy_pack >>> blob_storage.pack(packtime, referencesf) _blobs_pack_is_in_progress = True >>> blob_storage._blobs_pack_is_in_progress False >>> base_storage.pack = base_pack ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/component.xml000066400000000000000000000007331230730566700243260ustar00rootroot00000000000000
ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/dangle.py000077500000000000000000000033421230730566700234100ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Functional test to produce a dangling reference.""" import time import transaction from ZODB.FileStorage import FileStorage from ZODB import DB from persistent import Persistent class P(Persistent): pass def create_dangling_ref(db): rt = db.open().root() rt[1] = o1 = P() transaction.get().note("create o1") transaction.commit() rt[2] = o2 = P() transaction.get().note("create o2") transaction.commit() c = o1.child = P() transaction.get().note("set child on o1") transaction.commit() o1.child = P() transaction.get().note("replace child on o1") transaction.commit() time.sleep(2) # The pack should remove the reference to c, because it is no # longer referenced from o1. But the object still exists and has # an oid, so a new commit of it won't create a new object. db.pack() print repr(c._p_oid) o2.child = c transaction.get().note("set child on o2") transaction.commit() def main(): fs = FileStorage("dangle.fs") db = DB(fs) create_dangling_ref(db) db.close() if __name__ == "__main__": main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/dbopen.txt000066400000000000000000000240631230730566700236140ustar00rootroot00000000000000===================== Connection Management ===================== Here we exercise the connection management done by the DB class. >>> from ZODB import DB >>> from ZODB.MappingStorage import MappingStorage as Storage Capturing log messages from DB is important for some of the examples: >>> from zope.testing.loggingsupport import InstalledHandler >>> handler = InstalledHandler('ZODB.DB') Create a storage, and wrap it in a DB wrapper: >>> st = Storage() >>> db = DB(st) By default, we can open 7 connections without any log messages: >>> conns = [db.open() for dummy in range(7)] >>> handler.records [] Open one more, and we get a warning: >>> conns.append(db.open()) >>> len(handler.records) 1 >>> msg = handler.records[0] >>> print msg.name, msg.levelname, msg.getMessage() ZODB.DB WARNING DB.open() has 8 open connections with a pool_size of 7 Open 6 more, and we get 6 more warnings: >>> conns.extend([db.open() for dummy in range(6)]) >>> len(conns) 14 >>> len(handler.records) 7 >>> msg = handler.records[-1] >>> print msg.name, msg.levelname, msg.getMessage() ZODB.DB WARNING DB.open() has 14 open connections with a pool_size of 7 Add another, so that it's more than twice the default, and the level rises to critical: >>> conns.append(db.open()) >>> len(conns) 15 >>> len(handler.records) 8 >>> msg = handler.records[-1] >>> print msg.name, msg.levelname, msg.getMessage() ZODB.DB CRITICAL DB.open() has 15 open connections with a pool_size of 7 While it's boring, it's important to verify that the same relationships hold if the default pool size is overridden. 
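(As an illustrative aside, added here and not part of the original test run, the relationship exercised above can be summarized by a small helper; ``expected_level`` is a hypothetical name used only for this sketch:)

    >>> def expected_level(n_open, pool_size):
    ...     # warn once the pool size is exceeded, escalate past twice the pool size
    ...     if n_open <= pool_size:
    ...         return None
    ...     elif n_open <= 2 * pool_size:
    ...         return 'WARNING'
    ...     return 'CRITICAL'
    >>> expected_level(8, 7), expected_level(15, 7)
    ('WARNING', 'CRITICAL')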
>>> handler.clear() >>> st.close() >>> st = Storage() >>> PS = 2 # smaller pool size >>> db = DB(st, pool_size=PS) >>> conns = [db.open() for dummy in range(PS)] >>> handler.records [] A warning for opening one more: >>> conns.append(db.open()) >>> len(handler.records) 1 >>> msg = handler.records[0] >>> print msg.name, msg.levelname, msg.getMessage() ZODB.DB WARNING DB.open() has 3 open connections with a pool_size of 2 More warnings through 4 connections: >>> conns.extend([db.open() for dummy in range(PS-1)]) >>> len(conns) 4 >>> len(handler.records) 2 >>> msg = handler.records[-1] >>> print msg.name, msg.levelname, msg.getMessage() ZODB.DB WARNING DB.open() has 4 open connections with a pool_size of 2 And critical for going beyond that: >>> conns.append(db.open()) >>> len(conns) 5 >>> len(handler.records) 3 >>> msg = handler.records[-1] >>> print msg.name, msg.levelname, msg.getMessage() ZODB.DB CRITICAL DB.open() has 5 open connections with a pool_size of 2 We can change the pool size on the fly: >>> handler.clear() >>> db.setPoolSize(6) >>> conns.append(db.open()) >>> handler.records # no log msg -- the pool is bigger now [] >>> conns.append(db.open()) # but one more and there's a warning again >>> len(handler.records) 1 >>> msg = handler.records[0] >>> print msg.name, msg.levelname, msg.getMessage() ZODB.DB WARNING DB.open() has 7 open connections with a pool_size of 6 Enough of that. >>> handler.clear() >>> st.close() More interesting is the stack-like nature of connection reuse. So long as we keep opening new connections, and keep them alive, all connections returned are distinct: >>> st = Storage() >>> db = DB(st) >>> c1 = db.open() >>> c2 = db.open() >>> c3 = db.open() >>> c1 is c2 or c1 is c3 or c2 is c3 False Let's put some markers on the connections, so we can identify these specific objects later: >>> c1.MARKER = 'c1' >>> c2.MARKER = 'c2' >>> c3.MARKER = 'c3' Now explicitly close c1 and c2: >>> c1.close() >>> c2.close() Reaching into the internals, we can see that db's connection pool now has two connections available for reuse, and knows about three connections in all: >>> pool = db.pool >>> len(pool.available) 2 >>> len(pool.all) 3 Since we closed c2 last, it's at the top of the available stack, so will be reused by the next open(): >>> c1 = db.open() >>> c1.MARKER 'c2' >>> len(pool.available), len(pool.all) (1, 3) >>> c3.close() # now the stack has c3 on top, then c1 >>> c2 = db.open() >>> c2.MARKER 'c3' >>> len(pool.available), len(pool.all) (1, 3) >>> c3 = db.open() >>> c3.MARKER 'c1' >>> len(pool.available), len(pool.all) (0, 3) It's a bit more complicated though. The connection pool tries to keep connections with larger caches at the top of the stack. It does this by having connections with smaller caches "sink" below connections with larger caches when they are closed. To see this, we'll add some objects to the caches: >>> for i in range(10): ... c1.root()[i] = c1.root().__class__() >>> import transaction >>> transaction.commit() >>> c1._cache.cache_non_ghost_count 11 >>> for i in range(5): ... _ = len(c2.root()[i]) >>> c2._cache.cache_non_ghost_count 6 Now, we'll close the connections and get them back: >>> c1.close() >>> c2.close() >>> c3.close() We closed c3 last, but c1 is the biggest, so we get c1 on the next open: >>> db.open() is c1 True Similarly, c2 is the next buggest, so we get that next: >>> db.open() is c2 True and finally c3: >>> db.open() is c3 True What about the 3 in pool.all? 
We've seen that closing connections doesn't reduce pool.all, and it would be bad if DB kept connections alive forever. In fact pool.all is a "weak set" of connections -- it holds weak references to connections. That alone doesn't keep connection objects alive. The weak set allows DB's statistics methods to return info about connections that are still alive. >>> len(db.cacheDetailSize()) # one result for each connection's cache 3 If a connection object is abandoned (it becomes unreachable), then it will vanish from pool.all automatically. However, connections are involved in cycles, so exactly when a connection vanishes from pool.all isn't predictable. It can be forced by running gc.collect(): >>> import gc >>> dummy = gc.collect() >>> len(pool.all) 3 >>> c3 = None >>> dummy = gc.collect() # removes c3 from pool.all >>> len(pool.all) 2 Note that c3 is really gone; in particular it didn't get added back to the stack of available connections by magic: >>> len(pool.available) 0 Nothing in that last block should have logged any msgs: >>> handler.records [] If "too many" connections are open, then closing one may kick an older closed one out of the available connection stack. >>> st.close() >>> st = Storage() >>> db = DB(st, pool_size=3) >>> conns = [db.open() for dummy in range(6)] >>> len(handler.records) # 3 warnings for the "excess" connections 3 >>> pool = db.pool >>> len(pool.available), len(pool.all) (0, 6) Let's mark them: >>> for i, c in enumerate(conns): ... c.MARKER = i Closing connections adds them to the stack: >>> for i in range(3): ... conns[i].close() >>> len(pool.available), len(pool.all) (3, 6) >>> del conns[:3] # leave the ones with MARKERs 3, 4 and 5 Closing another one will purge the one with MARKER 0 from the stack (since it was the first added to the stack): >>> [c.MARKER for (t, c) in pool.available] [0, 1, 2] >>> conns[0].close() # MARKER 3 >>> len(pool.available), len(pool.all) (3, 5) >>> [c.MARKER for (t, c) in pool.available] [1, 2, 3] Similarly for the other two: >>> conns[1].close(); conns[2].close() >>> len(pool.available), len(pool.all) (3, 3) >>> [c.MARKER for (t, c) in pool.available] [3, 4, 5] Reducing the pool size may also purge the oldest closed connections: >>> db.setPoolSize(2) # gets rid of MARKER 3 >>> len(pool.available), len(pool.all) (2, 2) >>> [c.MARKER for (t, c) in pool.available] [4, 5] Since MARKER 5 is still the last one added to the stack, it will be the first popped: >>> c1 = db.open(); c2 = db.open() >>> c1.MARKER, c2.MARKER (5, 4) >>> len(pool.available), len(pool.all) (0, 2) Next: when a closed Connection is removed from .available due to exceeding pool_size, that Connection's cache is cleared (this behavior was new in ZODB 3.6b6). While user code may still hold a reference to that Connection, once it vanishes from .available it's really not usable for anything sensible (it can never be in the open state again). Waiting for gc to reclaim the Connection and its cache eventually works, but that can take "a long time" and caches can hold on to many objects, and limited resources (like RDB connections), for the duration. 
>>> st.close() >>> st = Storage() >>> db = DB(st, pool_size=2) >>> conn0 = db.open() >>> len(conn0._cache) # empty now 0 >>> import transaction >>> conn0.root()['a'] = 1 >>> transaction.commit() >>> len(conn0._cache) # but now the cache holds the root object 1 Now open more connections so that the total exceeds pool_size (2): >>> conn1 = db.open(); _ = conn1.root()['a'] >>> conn2 = db.open(); _ = conn2.root()['a'] Note that we accessed the objects in the new connections so they would be of the same size, so that when they get closed, they don't sink below conn0. >>> pool = db.pool >>> len(pool.all), len(pool.available) # all Connections are in use (3, 0) Return pool_size (2) Connections to the pool: >>> conn0.close() >>> conn1.close() >>> len(pool.all), len(pool.available) (3, 2) >>> len(conn0._cache) # nothing relevant has changed yet 1 When we close the third connection, conn0 will be booted from .all, and we expect its cache to be cleared then: >>> conn2.close() >>> len(pool.all), len(pool.available) (2, 2) >>> len(conn0._cache) # conn0's cache is empty again 0 >>> del conn0, conn1, conn2 Clean up. >>> st.close() >>> handler.uninstall() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/hexstorage.py000066400000000000000000000124311230730566700243230ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2010 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## import ZODB.blob import ZODB.interfaces import zope.interface class HexStorage(object): zope.interface.implements(ZODB.interfaces.IStorageWrapper) copied_methods = ( 'close', 'getName', 'getSize', 'history', 'isReadOnly', 'lastTransaction', 'new_oid', 'sortKey', 'tpc_abort', 'tpc_begin', 'tpc_finish', 'tpc_vote', 'loadBlob', 'openCommittedBlobFile', 'temporaryDirectory', 'supportsUndo', 'undo', 'undoLog', 'undoInfo', ) def __init__(self, base): self.base = base base.registerDB(self) for name in self.copied_methods: v = getattr(base, name, None) if v is not None: setattr(self, name, v) zope.interface.directlyProvides(self, zope.interface.providedBy(base)) def __getattr__(self, name): return getattr(self.base, name) def __len__(self): return len(self.base) def load(self, oid, version=''): data, serial = self.base.load(oid, version) return data[2:].decode('hex'), serial def loadBefore(self, oid, tid): r = self.base.loadBefore(oid, tid) if r is not None: data, serial, after = r return data[2:].decode('hex'), serial, after else: return r def loadSerial(self, oid, serial): return self.base.loadSerial(oid, serial)[2:].decode('hex') def pack(self, pack_time, referencesf, gc=True): def refs(p, oids=None): return referencesf(p[2:].decode('hex'), oids) return self.base.pack(pack_time, refs, gc) def registerDB(self, db): self.db = db self._db_transform = db.transform_record_data self._db_untransform = db.untransform_record_data _db_transform = _db_untransform = lambda self, data: data def store(self, oid, serial, data, version, transaction): return self.base.store( oid, serial, '.h'+data.encode('hex'), version, transaction) def restore(self, oid, serial, data, version, prev_txn, transaction): return self.base.restore( oid, serial, data and ('.h'+data.encode('hex')), version, prev_txn, transaction) def iterator(self, start=None, stop=None): for t in self.base.iterator(start, stop): yield Transaction(self, t) def storeBlob(self, oid, oldserial, data, blobfilename, version, transaction): return self.base.storeBlob(oid, oldserial, '.h'+data.encode('hex'), blobfilename, version, transaction) def restoreBlob(self, oid, serial, data, blobfilename, prev_txn, transaction): return self.base.restoreBlob(oid, serial, data and ('.h'+data.encode('hex')), blobfilename, prev_txn, transaction) def invalidateCache(self): return self.db.invalidateCache() def invalidate(self, transaction_id, oids, version=''): return self.db.invalidate(transaction_id, oids, version) def references(self, record, oids=None): return self.db.references(record[2:].decode('hex'), oids) def transform_record_data(self, data): return '.h'+self._db_transform(data).encode('hex') def untransform_record_data(self, data): return self._db_untransform(data[2:].decode('hex')) def record_iternext(self, next=None): oid, tid, data, next = self.base.record_iternext(next) return oid, tid, data[2:].decode('hex'), next def copyTransactionsFrom(self, other): ZODB.blob.copyTransactionsFromTo(other, self) class ServerHexStorage(HexStorage): """Use on ZEO storage server when Hex is used on client Don't do conversion as part of load/store, but provide pickle decoding. 
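As an illustrative aside (not part of the original docstring): the record format used by both wrappers is simply the original pickle, hex-encoded and prefixed with '.h', for example:

    >>> '.h' + 'data'.encode('hex')
    '.h64617461'
    >>> '.h64617461'[2:].decode('hex')
    'data'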
""" copied_methods = HexStorage.copied_methods + ( 'load', 'loadBefore', 'loadSerial', 'store', 'restore', 'iterator', 'storeBlob', 'restoreBlob', 'record_iternext', ) class Transaction(object): def __init__(self, store, trans): self.__store = store self.__trans = trans def __iter__(self): for r in self.__trans: if r.data: r.data = self.__store.untransform_record_data(r.data) yield r def __getattr__(self, name): return getattr(self.__trans, name) class ZConfigHex: _factory = HexStorage def __init__(self, config): self.config = config self.name = config.getSectionName() def open(self): base = self.config.base.open() return self._factory(base) class ZConfigServerHex(ZConfigHex): _factory = ServerHexStorage ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/loggingsupport.py000066400000000000000000000064371230730566700252460ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Support for testing logging code If you want to test that your code generates proper log output, you can create and install a handler that collects output: >>> handler = InstalledHandler('foo.bar') The handler is installed into loggers for all of the names passed. In addition, the logger level is set to 1, which means, log everything. If you want to log less than everything, you can provide a level keyword argument. The level setting effects only the named loggers. Then, any log output is collected in the handler: >>> logging.getLogger('foo.bar').exception('eek') >>> logging.getLogger('foo.bar').info('blah blah') >>> for record in handler.records: ... print record.name, record.levelname ... print ' ', record.getMessage() foo.bar ERROR eek foo.bar INFO blah blah A similar effect can be gotten by just printing the handler: >>> print handler foo.bar ERROR eek foo.bar INFO blah blah After checking the log output, you need to uninstall the handler: >>> handler.uninstall() At which point, the handler won't get any more log output. 
Let's clear the handler: >>> handler.clear() >>> handler.records [] And then log something: >>> logging.getLogger('foo.bar').info('blah') and, sure enough, we still have no output: >>> handler.records [] $Id: loggingsupport.py 28349 2004-11-06 00:10:32Z tim_one $ """ import logging class Handler(logging.Handler): def __init__(self, *names, **kw): logging.Handler.__init__(self) self.names = names self.records = [] self.setLoggerLevel(**kw) def setLoggerLevel(self, level=1): self.level = level self.oldlevels = {} def emit(self, record): self.records.append(record) def clear(self): del self.records[:] def install(self): for name in self.names: logger = logging.getLogger(name) self.oldlevels[name] = logger.level logger.setLevel(self.level) logger.addHandler(self) def uninstall(self): for name in self.names: logger = logging.getLogger(name) logger.setLevel(self.oldlevels[name]) logger.removeHandler(self) def __str__(self): return '\n'.join( [("%s %s\n %s" % (record.name, record.levelname, '\n'.join([line for line in record.getMessage().split('\n') if line.strip()]) ) ) for record in self.records] ) class InstalledHandler(Handler): def __init__(self, *names): Handler.__init__(self, *names) self.install() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/multidb.txt000066400000000000000000000132721230730566700240050ustar00rootroot00000000000000================== Multiple Databases ================== Multi-database support adds the ability to tie multiple databases into a collection. The original proposal is in the fishbowl: http://www.zope.org/Wikis/ZODB/MultiDatabases/ It was implemented during the PyCon 2005 sprints, but in a simpler form, by Jim Fulton, Christian Theune, and Tim Peters. Overview: No private attributes were added, and one new method was introduced. ``DB``: - a new ``.database_name`` attribute holds the name of this database. - a new ``.databases`` attribute maps from database name to ``DB`` object; all databases in a multi-database collection share the same ``.databases`` object - the ``DB`` constructor has new optional arguments with the same names (``database_name=`` and ``databases=``). ``Connection``: - a new ``.connections`` attribute maps from database name to a ``Connection`` for the database with that name; the ``.connections`` mapping object is also shared among databases in a collection. - a new ``.get_connection(database_name)`` method returns a ``Connection`` for a database in the collection; if a connection is already open, it's returned (this is the value ``.connections[database_name]``), else a new connection is opened (and stored as ``.connections[database_name]``) Creating a multi-database starts with creating a named ``DB``: >>> from ZODB.tests.test_storage import MinimalMemoryStorage >>> from ZODB import DB >>> dbmap = {} >>> db = DB(MinimalMemoryStorage(), database_name='root', databases=dbmap) The database name is accessible afterwards and in a newly created collection: >>> db.database_name 'root' >>> db.databases # doctest: +ELLIPSIS {'root': } >>> db.databases is dbmap True Adding another database to the collection works like this: >>> db2 = DB(MinimalMemoryStorage(), ... database_name='notroot', ... 
databases=dbmap) The new ``db2`` now shares the ``databases`` dictionary with db and has two entries: >>> db2.databases is db.databases is dbmap True >>> len(db2.databases) 2 >>> names = dbmap.keys(); names.sort(); print names ['notroot', 'root'] It's an error to try to insert a database with a name already in use: >>> db3 = DB(MinimalMemoryStorage(), ... database_name='root', ... databases=dbmap) Traceback (most recent call last): ... ValueError: database_name 'root' already in databases Because that failed, ``db.databases`` wasn't changed: >>> len(db.databases) # still 2 2 You can (still) get a connection to a database this way: >>> import transaction >>> tm = transaction.TransactionManager() >>> cn = db.open(transaction_manager=tm) >>> cn # doctest: +ELLIPSIS This is the only connection in this collection right now: >>> cn.connections # doctest: +ELLIPSIS {'root': } Getting a connection to a different database from an existing connection in the same database collection (this enables 'connection binding' within a given thread/transaction/context ...): >>> cn2 = cn.get_connection('notroot') >>> cn2 # doctest: +ELLIPSIS The second connection gets the same transaction manager as the first: >>> cn2.transaction_manager is tm True Now there are two connections in that collection: >>> cn2.connections is cn.connections True >>> len(cn2.connections) 2 >>> names = cn.connections.keys(); names.sort(); print names ['notroot', 'root'] So long as this database group remains open, the same ``Connection`` objects are returned: >>> cn.get_connection('root') is cn True >>> cn.get_connection('notroot') is cn2 True >>> cn2.get_connection('root') is cn True >>> cn2.get_connection('notroot') is cn2 True Of course trying to get a connection for a database not in the group raises an exception: >>> cn.get_connection('no way') Traceback (most recent call last): ... KeyError: 'no way' Clean up: >>> for a_db in dbmap.values(): ... a_db.close() Configuration from File ----------------------- The database name can also be specified in a config file, starting in ZODB 3.6: >>> from ZODB.config import databaseFromString >>> config = """ ... ... ... database-name this_is_the_name ... ... """ >>> db = databaseFromString(config) >>> print db.database_name this_is_the_name >>> db.databases.keys() ['this_is_the_name'] However, the ``.databases`` attribute cannot be configured from file. It can be passed to the `ZConfig` factory. I'm not sure of the clearest way to test that here; this is ugly: >>> from ZODB.config import getDbSchema >>> import ZConfig >>> from cStringIO import StringIO Derive a new `config2` string from the `config` string, specifying a different database_name: >>> config2 = config.replace("this_is_the_name", "another_name") Now get a `ZConfig` factory from `config2`: >>> f = StringIO(config2) >>> zconfig, handle = ZConfig.loadConfigFile(getDbSchema(), f) >>> factory = zconfig.database The desired ``databases`` mapping can be passed to this factory: >>> db2 = factory[0].open(databases=db.databases) >>> print db2.database_name # has the right name another_name >>> db.databases is db2.databases # shares .databases with `db` True >>> all = db2.databases.keys() >>> all.sort() >>> all # and db.database_name & db2.database_name are the keys ['another_name', 'this_is_the_name'] Cleanup. 
>>> db.close() >>> db2.close() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/sampledm.py000066400000000000000000000247041230730566700237620ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Sample objects for use in tests """ class DataManager(object): """Sample data manager This class provides a trivial data-manager implementation and doc strings to illustrate the the protocol and to provide a tool for writing tests. Our sample data manager has state that is updated through an inc method and through transaction operations. When we create a sample data manager: >>> dm = DataManager() It has two bits of state, state: >>> dm.state 0 and delta: >>> dm.delta 0 Both of which are initialized to 0. state is meant to model committed state, while delta represents tentative changes within a transaction. We change the state by calling inc: >>> dm.inc() which updates delta: >>> dm.delta 1 but state isn't changed until we commit the transaction: >>> dm.state 0 To commit the changes, we use 2-phase commit. We execute the first stage by calling prepare. We need to pass a transation. Our sample data managers don't really use the transactions for much, so we'll be lazy and use strings for transactions: >>> t1 = '1' >>> dm.prepare(t1) The sample data manager updates the state when we call prepare: >>> dm.state 1 >>> dm.delta 1 This is mainly so we can detect some affect of calling the methods. Now if we call commit: >>> dm.commit(t1) Our changes are"permanent". The state reflects the changes and the delta has been reset to 0. >>> dm.state 1 >>> dm.delta 0 """ def __init__(self): self.state = 0 self.sp = 0 self.transaction = None self.delta = 0 self.prepared = False def inc(self, n=1): self.delta += n def prepare(self, transaction): """Prepare to commit data >>> dm = DataManager() >>> dm.inc() >>> t1 = '1' >>> dm.prepare(t1) >>> dm.commit(t1) >>> dm.state 1 >>> dm.inc() >>> t2 = '2' >>> dm.prepare(t2) >>> dm.abort(t2) >>> dm.state 1 It is en error to call prepare more than once without an intervening commit or abort: >>> dm.prepare(t1) >>> dm.prepare(t1) Traceback (most recent call last): ... TypeError: Already prepared >>> dm.prepare(t2) Traceback (most recent call last): ... 
TypeError: Already prepared >>> dm.abort(t1) If there was a preceeding savepoint, the transaction must match: >>> rollback = dm.savepoint(t1) >>> dm.prepare(t2) Traceback (most recent call last): ,,, TypeError: ('Transaction missmatch', '2', '1') >>> dm.prepare(t1) """ if self.prepared: raise TypeError('Already prepared') self._checkTransaction(transaction) self.prepared = True self.transaction = transaction self.state += self.delta def _checkTransaction(self, transaction): if (transaction is not self.transaction and self.transaction is not None): raise TypeError("Transaction missmatch", transaction, self.transaction) def abort(self, transaction): """Abort a transaction The abort method can be called before two-phase commit to throw away work done in the transaction: >>> dm = DataManager() >>> dm.inc() >>> dm.state, dm.delta (0, 1) >>> t1 = '1' >>> dm.abort(t1) >>> dm.state, dm.delta (0, 0) The abort method also throws away work done in savepoints: >>> dm.inc() >>> r = dm.savepoint(t1) >>> dm.inc() >>> r = dm.savepoint(t1) >>> dm.state, dm.delta (0, 2) >>> dm.abort(t1) >>> dm.state, dm.delta (0, 0) If savepoints are used, abort must be passed the same transaction: >>> dm.inc() >>> r = dm.savepoint(t1) >>> t2 = '2' >>> dm.abort(t2) Traceback (most recent call last): ... TypeError: ('Transaction missmatch', '2', '1') >>> dm.abort(t1) The abort method is also used to abort a two-phase commit: >>> dm.inc() >>> dm.state, dm.delta (0, 1) >>> dm.prepare(t1) >>> dm.state, dm.delta (1, 1) >>> dm.abort(t1) >>> dm.state, dm.delta (0, 0) Of course, the transactions passed to prepare and abort must match: >>> dm.prepare(t1) >>> dm.abort(t2) Traceback (most recent call last): ... TypeError: ('Transaction missmatch', '2', '1') >>> dm.abort(t1) """ self._checkTransaction(transaction) if self.transaction is not None: self.transaction = None if self.prepared: self.state -= self.delta self.prepared = False self.delta = 0 def commit(self, transaction): """Complete two-phase commit >>> dm = DataManager() >>> dm.state 0 >>> dm.inc() We start two-phase commit by calling prepare: >>> t1 = '1' >>> dm.prepare(t1) We complete it by calling commit: >>> dm.commit(t1) >>> dm.state 1 It is an error ro call commit without calling prepare first: >>> dm.inc() >>> t2 = '2' >>> dm.commit(t2) Traceback (most recent call last): ... TypeError: Not prepared to commit >>> dm.prepare(t2) >>> dm.commit(t2) If course, the transactions given to prepare and commit must be the same: >>> dm.inc() >>> t3 = '3' >>> dm.prepare(t3) >>> dm.commit(t2) Traceback (most recent call last): ... TypeError: ('Transaction missmatch', '2', '3') """ if not self.prepared: raise TypeError('Not prepared to commit') self._checkTransaction(transaction) self.delta = 0 self.transaction = None self.prepared = False def savepoint(self, transaction): """Provide the ability to rollback transaction state Savepoints provide a way to: - Save partial transaction work. For some data managers, this could allow resources to be used more efficiently. - Provide the ability to revert state to a point in a transaction without aborting the entire transaction. In other words, savepoints support partial aborts. Savepoints don't use two-phase commit. If there are errors in setting or rolling back to savepoints, the application should abort the containing transaction. This is *not* the responsibility of the data manager. Savepoints are always associated with a transaction. 
Any work done in a savepoint's transaction is tentative until the transaction is committed using two-phase commit. >>> dm = DataManager() >>> dm.inc() >>> t1 = '1' >>> r = dm.savepoint(t1) >>> dm.state, dm.delta (0, 1) >>> dm.inc() >>> dm.state, dm.delta (0, 2) >>> r.rollback() >>> dm.state, dm.delta (0, 1) >>> dm.prepare(t1) >>> dm.commit(t1) >>> dm.state, dm.delta (1, 0) Savepoints must have the same transaction: >>> r1 = dm.savepoint(t1) >>> dm.state, dm.delta (1, 0) >>> dm.inc() >>> dm.state, dm.delta (1, 1) >>> t2 = '2' >>> r2 = dm.savepoint(t2) Traceback (most recent call last): ... TypeError: ('Transaction missmatch', '2', '1') >>> r2 = dm.savepoint(t1) >>> dm.inc() >>> dm.state, dm.delta (1, 2) If we rollback to an earlier savepoint, we discard all work done later: >>> r1.rollback() >>> dm.state, dm.delta (1, 0) and we can no longer rollback to the later savepoint: >>> r2.rollback() Traceback (most recent call last): ... TypeError: ('Attempt to roll back to invalid save point', 3, 2) We can roll back to a savepoint as often as we like: >>> r1.rollback() >>> r1.rollback() >>> r1.rollback() >>> dm.state, dm.delta (1, 0) >>> dm.inc() >>> dm.inc() >>> dm.inc() >>> dm.state, dm.delta (1, 3) >>> r1.rollback() >>> dm.state, dm.delta (1, 0) But we can't rollback to a savepoint after it has been committed: >>> dm.prepare(t1) >>> dm.commit(t1) >>> r1.rollback() Traceback (most recent call last): ... TypeError: Attempt to rollback stale rollback """ if self.prepared: raise TypeError("Can't get savepoint during two-phase commit") self._checkTransaction(transaction) self.transaction = transaction self.sp += 1 return Rollback(self) class Rollback(object): def __init__(self, dm): self.dm = dm self.sp = dm.sp self.delta = dm.delta self.transaction = dm.transaction def rollback(self): if self.transaction is not self.dm.transaction: raise TypeError("Attempt to rollback stale rollback") if self.dm.sp < self.sp: raise TypeError("Attempt to roll back to invalid save point", self.sp, self.dm.sp) self.dm.sp = self.sp self.dm.delta = self.delta def test_suite(): from doctest import DocTestSuite return DocTestSuite() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/speed.py000066400000000000000000000066731230730566700232650ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## usage="""Test speed of a ZODB storage Options: -d file The data file to use as input. The default is this script. -n n The number of repititions -s module A module that defines a 'Storage' attribute, which is an open storage. If not specified, a FileStorage will ne used. 
-z Test compressing data -D Run in debug mode -L Test loads as well as stores by minimizing the cache after eachrun -M Output means only """ import sys, os, getopt, string, time sys.path.insert(0, os.getcwd()) import ZODB, ZODB.FileStorage import persistent import transaction class P(persistent.Persistent): pass def main(args): opts, args = getopt.getopt(args, 'zd:n:Ds:LM') z=s=None data=sys.argv[0] nrep=5 minimize=0 detailed=1 for o, v in opts: if o=='-n': nrep=string.atoi(v) elif o=='-d': data=v elif o=='-s': s=v elif o=='-z': global zlib import zlib z=compress elif o=='-L': minimize=1 elif o=='-M': detailed=0 elif o=='-D': global debug os.environ['STUPID_LOG_FILE']='' os.environ['STUPID_LOG_SEVERITY']='-999' if s: s=__import__(s, globals(), globals(), ('__doc__',)) s=s.Storage else: s=ZODB.FileStorage.FileStorage('zeo_speed.fs', create=1) data=open(data).read() db=ZODB.DB(s, # disable cache deactivation cache_size=4000, cache_deactivate_after=6000,) results={1:0, 10:0, 100:0, 1000:0} for j in range(nrep): for r in 1, 10, 100, 1000: t=time.time() jar=db.open() transaction.begin() rt=jar.root() key='s%s' % r if rt.has_key(key): p=rt[key] else: rt[key]=p=P() for i in range(r): if z is not None: d=z(data) else: d=data v=getattr(p, str(i), P()) v.d=d setattr(p,str(i),v) transaction.commit() jar.close() t=time.time()-t if detailed: sys.stderr.write("%s\t%s\t%.4f\n" % (j, r, t)) sys.stdout.flush() results[r]=results[r]+t rt=d=p=v=None # release all references if minimize: time.sleep(3) jar.cacheMinimize(3) if detailed: print '-'*24 for r in 1, 10, 100, 1000: t=results[r]/nrep sys.stderr.write("mean:\t%s\t%.4f\t%.4f (s/o)\n" % (r, t, t/r)) db.close() def compress(s): c=zlib.compressobj() o=c.compress(s) return o+c.flush() if __name__=='__main__': main(sys.argv[1:]) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/synchronizers.txt000066400000000000000000000061461230730566700252670ustar00rootroot00000000000000============= Synchronizers ============= Here are some tests that storage ``sync()`` methods get called at appropriate times in the life of a transaction. The tested behavior is new in ZODB 3.4. First define a lightweight storage with a ``sync()`` method: >>> import ZODB >>> from ZODB.MappingStorage import MappingStorage >>> import transaction >>> class SimpleStorage(MappingStorage): ... sync_called = False ... ... def sync(self, *args): ... self.sync_called = True Make a change locally: >>> st = SimpleStorage() >>> db = ZODB.DB(st) >>> cn = db.open() >>> rt = cn.root() >>> rt['a'] = 1 Sync should not have been called yet. >>> st.sync_called # False before 3.4 False ``sync()`` is called by the Connection's ``afterCompletion()`` hook after the commit completes. >>> transaction.commit() >>> st.sync_called # False before 3.4 True ``sync()`` is also called by the ``afterCompletion()`` hook after an abort. >>> st.sync_called = False >>> rt['b'] = 2 >>> transaction.abort() >>> st.sync_called # False before 3.4 True And ``sync()`` is called whenever we explicitly start a new transaction, via the ``newTransaction()`` hook. >>> st.sync_called = False >>> dummy = transaction.begin() >>> st.sync_called # False before 3.4 True Clean up. Closing db isn't enough -- closing a DB doesn't close its `Connections`. Leaving our `Connection` open here can cause the ``SimpleStorage.sync()`` method to get called later, during another test, and our doctest-synthesized module globals no longer exist then. You get a weird traceback then ;-) >>> cn.close() One more, very obscure. 
It was the case that if the first action a new threaded transaction manager saw was a ``begin()`` call, then synchronizers registered after that in the same transaction weren't communicated to the `Transaction` object, and so the synchronizers' ``afterCompletion()`` hooks weren't called when the transaction commited. None of the test suites (ZODB's, Zope 2.8's, or Zope3's) caught that, but apparently Zope 3 takes this path at some point when serving pages. >>> tm = transaction.ThreadTransactionManager() >>> st.sync_called = False >>> dummy = tm.begin() # we're doing this _before_ opening a connection >>> cn = db.open(transaction_manager=tm) >>> rt = cn.root() # make a change >>> rt['c'] = 3 >>> st.sync_called False Now ensure that ``cn.afterCompletion() -> st.sync()`` gets called by commit despite that the `Connection` registered after the transaction began: >>> tm.commit() >>> st.sync_called True And try the same thing with a non-threaded transaction manager: >>> cn.close() >>> tm = transaction.TransactionManager() >>> st.sync_called = False >>> dummy = tm.begin() # we're doing this _before_ opening a connection >>> cn = db.open(transaction_manager=tm) >>> rt = cn.root() # make a change >>> rt['d'] = 4 >>> st.sync_called False >>> tm.commit() >>> st.sync_called True >>> cn.close() >>> db.close() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testActivityMonitor.py000066400000000000000000000062171230730566700262230ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Tests of the default activity monitor. 
See ZODB/ActivityMonitor.py $Id$ """ import unittest import time from ZODB.ActivityMonitor import ActivityMonitor class FakeConnection: loads = 0 stores = 0 def _transferred(self, loads, stores): self.loads = self.loads + loads self.stores = self.stores + stores def getTransferCounts(self, clear=0): res = self.loads, self.stores if clear: self.loads = self.stores = 0 return res class Tests(unittest.TestCase): def testAddLogEntries(self): am = ActivityMonitor(history_length=3600) self.assertEqual(len(am.log), 0) c = FakeConnection() c._transferred(1, 2) am.closedConnection(c) c._transferred(3, 7) am.closedConnection(c) self.assertEqual(len(am.log), 2) def testTrim(self): am = ActivityMonitor(history_length=0.1) c = FakeConnection() c._transferred(1, 2) am.closedConnection(c) time.sleep(0.2) c._transferred(3, 7) am.closedConnection(c) self.assert_(len(am.log) <= 1) def testSetHistoryLength(self): am = ActivityMonitor(history_length=3600) c = FakeConnection() c._transferred(1, 2) am.closedConnection(c) time.sleep(0.2) c._transferred(3, 7) am.closedConnection(c) self.assertEqual(len(am.log), 2) am.setHistoryLength(0.1) self.assertEqual(am.getHistoryLength(), 0.1) self.assert_(len(am.log) <= 1) def testActivityAnalysis(self): am = ActivityMonitor(history_length=3600) c = FakeConnection() c._transferred(1, 2) am.closedConnection(c) c._transferred(3, 7) am.closedConnection(c) res = am.getActivityAnalysis(start=0, end=0, divisions=10) lastend = 0 for n in range(9): div = res[n] self.assertEqual(div['stores'], 0) self.assertEqual(div['loads'], 0) self.assert_(div['start'] > 0) self.assert_(div['start'] >= lastend) self.assert_(div['start'] < div['end']) lastend = div['end'] div = res[9] self.assertEqual(div['stores'], 9) self.assertEqual(div['loads'], 4) self.assert_(div['start'] > 0) self.assert_(div['start'] >= lastend) self.assert_(div['start'] < div['end']) def test_suite(): return unittest.makeSuite(Tests) if __name__=='__main__': unittest.main(defaultTest='test_suite') ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testBroken.py000066400000000000000000000051031230730566700242700ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Test broken-object suppport """ import sys import unittest import persistent import transaction import os if os.environ.get('USE_ZOPE_TESTING_DOCTEST'): from zope.testing.doctest import DocTestSuite else: from doctest import DocTestSuite from ZODB.tests.util import DB def test_integration(): r"""Test the integration of broken object support with the databse: >>> db = DB() We'll create a fake module with a class: >>> class NotThere: ... Atall = type('Atall', (persistent.Persistent, ), ... 
{'__module__': 'ZODB.not.there'}) And stuff this into sys.modules to simulate a regular module: >>> sys.modules['ZODB.not.there'] = NotThere >>> sys.modules['ZODB.not'] = NotThere Now, we'll create and save an instance, and make sure we can load it in another connection: >>> a = NotThere.Atall() >>> a.x = 1 >>> conn1 = db.open() >>> conn1.root()['a'] = a >>> transaction.commit() >>> conn2 = db.open() >>> a2 = conn2.root()['a'] >>> a2.__class__ is a.__class__ True >>> a2.x 1 Now, we'll uninstall the module, simulating having the module go away: >>> del sys.modules['ZODB.not.there'] and we'll try to load the object in another connection: >>> conn3 = db.open() >>> a3 = conn3.root()['a'] >>> a3 # doctest: +NORMALIZE_WHITESPACE >>> a3.__Broken_state__ {'x': 1} Broken objects provide an interface: >>> from ZODB.interfaces import IBroken >>> IBroken.providedBy(a3) True Let's clean up: >>> db.close() >>> del sys.modules['ZODB.not'] Cleanup: >>> import ZODB.broken >>> ZODB.broken.broken_cache.clear() """ def test_suite(): return unittest.TestSuite(( DocTestSuite('ZODB.broken'), DocTestSuite(), )) if __name__ == '__main__': unittest.main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testCache.py000066400000000000000000000367371230730566700240740ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """A few simple tests of the public cache API. Each DB Connection has a separate PickleCache. The Cache serves two purposes. It acts like a memo for unpickling. It also keeps recent objects in memory under the assumption that they may be used again. """ from persistent.cPickleCache import PickleCache from persistent import Persistent from persistent.mapping import PersistentMapping from ZODB.tests.MinPO import MinPO from ZODB.utils import p64 import doctest import gc import sys import threading import transaction import unittest import ZODB import ZODB.MappingStorage import ZODB.tests.util class CacheTestBase(ZODB.tests.util.TestCase): def setUp(self): ZODB.tests.util.TestCase.setUp(self) store = ZODB.MappingStorage.MappingStorage() self.db = ZODB.DB(store, cache_size = self.CACHE_SIZE) self.conns = [] def tearDown(self): self.db.close() ZODB.tests.util.TestCase.tearDown(self) CACHE_SIZE = 20 def noodle_new_connection(self): """Do some reads and writes on a new connection.""" c = self.db.open() self.conns.append(c) self.noodle_connection(c) def noodle_connection(self, c): r = c.root() i = len(self.conns) d = r.get(i) if d is None: d = r[i] = PersistentMapping() transaction.commit() for i in range(15): o = d.get(i) if o is None: o = d[i] = MinPO(i) o.value += 1 transaction.commit() # CantGetRidOfMe is used by checkMinimizeTerminates. 
make_trouble = True class CantGetRidOfMe(MinPO): def __init__(self, value): MinPO.__init__(self, value) self.an_attribute = 42 def __del__(self): # Referencing an attribute of self causes self to be # loaded into the cache again, which also resurrects # self. if make_trouble: self.an_attribute class DBMethods(CacheTestBase): def setUp(self): CacheTestBase.setUp(self) for i in range(4): self.noodle_new_connection() def checkCacheDetail(self): for name, count in self.db.cacheDetail(): self.assert_(isinstance(name, str)) self.assert_(isinstance(count, int)) def checkCacheExtremeDetail(self): expected = ['conn_no', 'id', 'oid', 'rc', 'klass', 'state'] for dict in self.db.cacheExtremeDetail(): for k, v in dict.items(): self.assert_(k in expected) # TODO: not really sure how to do a black box test of the cache. # Should the full sweep and minimize calls always remove things? def checkFullSweep(self): old_size = self.db.cacheSize() self.db.cacheFullSweep() new_size = self.db.cacheSize() self.assert_(new_size < old_size, "%s < %s" % (old_size, new_size)) def checkMinimize(self): old_size = self.db.cacheSize() self.db.cacheMinimize() new_size = self.db.cacheSize() self.assert_(new_size < old_size, "%s < %s" % (old_size, new_size)) def checkMinimizeTerminates(self): # This is tricky. cPickleCache had a case where it could get into # an infinite loop, but we don't want the test suite to hang # if this bug reappears. So this test spawns a thread to run the # dangerous operation, and the main thread complains if the worker # thread hasn't finished in 30 seconds (arbitrary, but way more # than enough). In that case, the worker thread will continue # running forever (until killed externally), but at least the # test suite will move on. # # The bug was triggered by having a persistent object whose __del__ # method references an attribute of the object. An attempt to # ghostify such an object will clear the attribute, and if the # cache also releases the last Python reference to the object then # (due to ghostifying it), the __del__ method gets invoked. # Referencing the attribute loads the object again, and also # puts it back into the cPickleCache. If the cache implementation # isn't looking out for this, it can get into an infinite loop # then, endlessly trying to ghostify an object that in turn keeps # unghostifying itself again. class Worker(threading.Thread): def __init__(self, testcase): threading.Thread.__init__(self) self.testcase = testcase def run(self): global make_trouble # Make CantGetRidOfMe.__del__ dangerous. make_trouble = True conn = self.testcase.conns[0] r = conn.root() d = r[1] for i in range(len(d)): d[i] = CantGetRidOfMe(i) transaction.commit() self.testcase.db.cacheMinimize() # Defang the nasty objects. Else, because they're # immortal now, they hang around and create trouble # for subsequent tests. make_trouble = False self.testcase.db.cacheMinimize() w = Worker(self) w.start() w.join(30) if w.isAlive(): self.fail("cacheMinimize still running after 30 seconds -- " "almost certainly in an infinite loop") # TODO: don't have an explicit test for incrgc, because the # connection and database call it internally. # Same for the get and invalidate methods. 
def checkLRUitems(self): # get a cache c = self.conns[0]._cache c.lru_items() def checkClassItems(self): c = self.conns[0]._cache c.klass_items() class LRUCacheTests(CacheTestBase): def checkLRU(self): # verify the LRU behavior of the cache dataset_size = 5 CACHE_SIZE = dataset_size*2+1 # a cache big enough to hold the objects added in two # transactions, plus the root object self.db.setCacheSize(CACHE_SIZE) c = self.db.open() r = c.root() l = {} # the root is the only thing in the cache, because all the # other objects are new self.assertEqual(len(c._cache), 1) # run several transactions for t in range(5): for i in range(dataset_size): l[(t,i)] = r[i] = MinPO(i) transaction.commit() # commit() will register the objects, placing them in the # cache. at the end of commit, the cache will be reduced # down to CACHE_SIZE items if len(l)>CACHE_SIZE: self.assertEqual(c._cache.ringlen(), CACHE_SIZE) for i in range(dataset_size): # Check objects added in the first two transactions. # They must all be ghostified. self.assertEqual(l[(0,i)]._p_changed, None) self.assertEqual(l[(1,i)]._p_changed, None) # Check objects added in the last two transactions. # They must all still exist in memory, but have # had their changes flushed self.assertEqual(l[(3,i)]._p_changed, 0) self.assertEqual(l[(4,i)]._p_changed, 0) # Of the objects added in the middle transaction, most # will have been ghostified. There is one cache slot # that may be occupied by either one of those objects or # the root, depending on precise order of access. We do # not bother to check this def checkSize(self): self.assertEqual(self.db.cacheSize(), 0) self.assertEqual(self.db.cacheDetailSize(), []) CACHE_SIZE = 10 self.db.setCacheSize(CACHE_SIZE) CONNS = 3 for i in range(CONNS): self.noodle_new_connection() self.assertEquals(self.db.cacheSize(), CACHE_SIZE * CONNS) details = self.db.cacheDetailSize() self.assertEquals(len(details), CONNS) for d in details: self.assertEquals(d['ngsize'], CACHE_SIZE) # The assertion below is non-sensical # The (poorly named) cache size is a target for non-ghosts. # The cache *usually* contains non-ghosts, so that the # size normally exceeds the target size. #self.assertEquals(d['size'], CACHE_SIZE) def checkDetail(self): CACHE_SIZE = 10 self.db.setCacheSize(CACHE_SIZE) CONNS = 3 for i in range(CONNS): self.noodle_new_connection() gc.collect() # Obscure: The above gc.collect call is necessary to make this test # pass. # # This test then only works because the order of computations # and object accesses in the "noodle" calls is such that the # persistent mapping containing the MinPO objects is # deactivated before the MinPO objects. # # - Without the gc call, the cache will contain ghost MinPOs # and the check of the MinPO count below will fail. That's # because the counts returned by cacheDetail include ghosts. # # - If the mapping object containing the MinPOs isn't # deactivated, there will be one fewer non-ghost MinPO and # the test will fail anyway. # # This test really needs to be thought through and documented # better. 
for klass, count in self.db.cacheDetail(): if klass.endswith('MinPO'): self.assertEqual(count, CONNS * CACHE_SIZE) if klass.endswith('PersistentMapping'): # one root per connection self.assertEqual(count, CONNS) for details in self.db.cacheExtremeDetail(): # one 'details' dict per object if details['klass'].endswith('PersistentMapping'): self.assertEqual(details['state'], None) else: self.assert_(details['klass'].endswith('MinPO')) self.assertEqual(details['state'], 0) # The cache should never hold an unreferenced ghost. if details['state'] is None: # i.e., it's a ghost self.assert_(details['rc'] > 0) class StubDataManager: def setklassstate(self, object): pass class StubObject(Persistent): pass class CacheErrors(unittest.TestCase): def setUp(self): self.jar = StubDataManager() self.cache = PickleCache(self.jar) def checkGetBogusKey(self): self.assertEqual(self.cache.get(p64(0)), None) try: self.cache[12] except KeyError: pass else: self.fail("expected KeyError") try: self.cache[12] = 12 except TypeError: pass else: self.fail("expected TyepError") try: del self.cache[12] except TypeError: pass else: self.fail("expected TypeError") def checkBogusObject(self): def add(key, obj): self.cache[key] = obj nones = sys.getrefcount(None) key = p64(2) # value isn't persistent self.assertRaises(TypeError, add, key, 12) o = StubObject() # o._p_oid == None self.assertRaises(TypeError, add, key, o) o._p_oid = p64(3) self.assertRaises(ValueError, add, key, o) o._p_oid = key # o._p_jar == None self.assertRaises(Exception, add, key, o) o._p_jar = self.jar self.cache[key] = o # make sure it can be added multiple times self.cache[key] = o # same object, different keys self.assertRaises(ValueError, add, p64(0), o) self.assertEqual(sys.getrefcount(None), nones) def checkTwoCaches(self): jar2 = StubDataManager() cache2 = PickleCache(jar2) o = StubObject() key = o._p_oid = p64(1) o._p_jar = jar2 cache2[key] = o try: self.cache[key] = o except ValueError: pass else: self.fail("expected ValueError because object already in cache") def checkReadOnlyAttrsWhenCached(self): o = StubObject() key = o._p_oid = p64(1) o._p_jar = self.jar self.cache[key] = o try: o._p_oid = p64(2) except ValueError: pass else: self.fail("expect that you can't change oid of cached object") try: del o._p_jar except ValueError: pass else: self.fail("expect that you can't delete jar of cached object") def checkTwoObjsSameOid(self): # Try to add two distinct objects with the same oid to the cache. # This has always been an error, but the error message prior to # ZODB 3.2.6 didn't make sense. This test verifies that (a) an # exception is raised; and, (b) the error message is the intended # one. 
obj1 = StubObject() key = obj1._p_oid = p64(1) obj1._p_jar = self.jar self.cache[key] = obj1 obj2 = StubObject() obj2._p_oid = key obj2._p_jar = self.jar try: self.cache[key] = obj2 except ValueError, detail: self.assertEqual(str(detail), "A different object already has the same oid") else: self.fail("two objects with the same oid should have failed") def check_basic_cache_size_estimation(): """Make sure the basic accounting is correct: >>> import ZODB.MappingStorage >>> db = ZODB.MappingStorage.DB() >>> conn = db.open() The cache is empty initially: >>> conn._cache.total_estimated_size 0 We force the root to be loaded and the cache grows: >>> getattr(conn.root, 'z', None) >>> conn._cache.total_estimated_size 64 We add some data and the cache grows: >>> conn.root.z = ZODB.tests.util.P('x'*100) >>> import transaction >>> transaction.commit() >>> conn._cache.total_estimated_size 320 Loading the objects in another connection gets the same sizes: >>> conn2 = db.open() >>> conn2._cache.total_estimated_size 0 >>> getattr(conn2.root, 'x', None) >>> conn2._cache.total_estimated_size 128 >>> _ = conn2.root.z.name >>> conn2._cache.total_estimated_size 320 If we deactivate, the size goes down: >>> conn2.root.z._p_deactivate() >>> conn2._cache.total_estimated_size 128 Loading data directly, rather than through traversal updates the cache size correctly: >>> conn3 = db.open() >>> _ = conn3.get(conn2.root.z._p_oid).name >>> conn3._cache.total_estimated_size 192 """ def test_suite(): s = unittest.makeSuite(DBMethods, 'check') s.addTest(unittest.makeSuite(LRUCacheTests, 'check')) s.addTest(unittest.makeSuite(CacheErrors, 'check')) s.addTest(doctest.DocTestSuite()) return s ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testConfig.py000066400000000000000000000146161230730566700242660ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## import doctest import tempfile import unittest import transaction import ZODB.config import ZODB.tests.util from ZODB.POSException import ReadOnlyError class ConfigTestBase(ZODB.tests.util.TestCase): def _opendb(self, s): return ZODB.config.databaseFromString(s) def tearDown(self): ZODB.tests.util.TestCase.tearDown(self) if getattr(self, "storage", None) is not None: self.storage.cleanup() def _test(self, s): db = self._opendb(s) self.storage = db._storage # Do something with the database to make sure it works cn = db.open() rt = cn.root() rt["test"] = 1 transaction.commit() db.close() class ZODBConfigTest(ConfigTestBase): def test_map_config1(self): self._test( """ """) def test_map_config2(self): self._test( """ cache-size 1000 """) def test_file_config1(self): path = tempfile.mktemp() self._test( """ path %s """ % path) def test_file_config2(self): path = tempfile.mktemp() cfg = """ path %s create false read-only true """ % path self.assertRaises(ReadOnlyError, self._test, cfg) def test_demo_config(self): cfg = """ name foo """ self._test(cfg) class ZEOConfigTest(ConfigTestBase): def test_zeo_config(self): # We're looking for a port that doesn't exist so a # connection attempt will fail. Instead of elaborate # logic to loop over a port calculation, we'll just pick a # simple "random", likely to not-exist port number and add # an elaborate comment explaining this instead. Go ahead, # grep for 9. from ZEO.ClientStorage import ClientDisconnected import ZConfig from ZODB.config import getDbSchema from StringIO import StringIO cfg = """ server localhost:56897 wait false """ config, handle = ZConfig.loadConfigFile(getDbSchema(), StringIO(cfg)) self.assertEqual(config.database[0].config.storage.config.blob_dir, None) self.assertRaises(ClientDisconnected, self._test, cfg) cfg = """ blob-dir blobs server localhost:56897 wait false """ config, handle = ZConfig.loadConfigFile(getDbSchema(), StringIO(cfg)) self.assertEqual(config.database[0].config.storage.config.blob_dir, 'blobs') self.assertRaises(ClientDisconnected, self._test, cfg) def database_xrefs_config(): r""" >>> db = ZODB.config.databaseFromString( ... "\n\n\n\n") >>> db.xrefs True >>> db = ZODB.config.databaseFromString( ... "\nallow-implicit-cross-references true\n" ... "\n\n\n") >>> db.xrefs True >>> db = ZODB.config.databaseFromString( ... "\nallow-implicit-cross-references false\n" ... "\n\n\n") >>> db.xrefs False """ def multi_atabases(): r"""If there are multiple codb sections -> multidatabase >>> db = ZODB.config.databaseFromString(''' ... ... ... ... ... ... ... ... ... ... database-name Bar ... ... ... ... ''') >>> sorted(db.databases) ['', 'Bar', 'foo'] >>> db.database_name '' >>> db.databases[db.database_name] is db True >>> db.databases['foo'] is not db True >>> db.databases['Bar'] is not db True >>> db.databases['Bar'] is not db.databases['foo'] True Can't have repeats: >>> ZODB.config.databaseFromString(''' ... ... ... ... ... ... ... ... ... ... ... ... ... ''') # doctest: +NORMALIZE_WHITESPACE Traceback (most recent call last): ... ConfigurationSyntaxError: section names must not be re-used within the same container:'1' (line 9) >>> ZODB.config.databaseFromString(''' ... ... ... ... ... ... ... ... ... ''') # doctest: +NORMALIZE_WHITESPACE Traceback (most recent call last): ... 
ValueError: database_name '' already in databases """ def test_suite(): suite = unittest.TestSuite() suite.addTest(doctest.DocTestSuite( setUp=ZODB.tests.util.setUp, tearDown=ZODB.tests.util.tearDown)) suite.addTest(unittest.makeSuite(ZODBConfigTest)) suite.addTest(unittest.makeSuite(ZEOConfigTest)) return suite if __name__ == '__main__': unittest.main(defaultTest='test_suite') ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testConnection.py000066400000000000000000001141221230730566700251510ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Unit tests for the Connection class.""" from __future__ import with_statement import doctest import unittest from persistent import Persistent import transaction from ZODB.config import databaseFromString from ZODB.utils import p64 from zope.interface.verify import verifyObject import ZODB.tests.util class ConnectionDotAdd(ZODB.tests.util.TestCase): def setUp(self): ZODB.tests.util.TestCase.setUp(self) from ZODB.Connection import Connection self.db = StubDatabase() self.datamgr = Connection(self.db) self.datamgr.open() self.transaction = StubTransaction() def check_add(self): from ZODB.POSException import InvalidObjectReference obj = StubObject() self.assert_(obj._p_oid is None) self.assert_(obj._p_jar is None) self.datamgr.add(obj) self.assert_(obj._p_oid is not None) self.assert_(obj._p_jar is self.datamgr) self.assert_(self.datamgr.get(obj._p_oid) is obj) # Only first-class persistent objects may be added. self.assertRaises(TypeError, self.datamgr.add, object()) # Adding to the same connection does not fail. Object keeps the # same oid. oid = obj._p_oid self.datamgr.add(obj) self.assertEqual(obj._p_oid, oid) # Cannot add an object from a different connection. obj2 = StubObject() obj2._p_jar = object() self.assertRaises(InvalidObjectReference, self.datamgr.add, obj2) def checkResetOnAbort(self): # Check that _p_oid and _p_jar are reset when a transaction is # aborted. obj = StubObject() self.datamgr.add(obj) oid = obj._p_oid self.datamgr.abort(self.transaction) self.assert_(obj._p_oid is None) self.assert_(obj._p_jar is None) self.assertRaises(KeyError, self.datamgr.get, oid) def checkResetOnTpcAbort(self): obj = StubObject() self.datamgr.add(obj) oid = obj._p_oid # Simulate an error while committing some other object. self.datamgr.tpc_begin(self.transaction) # Let's pretend something bad happens here. # Call tpc_abort, clearing everything. self.datamgr.tpc_abort(self.transaction) self.assert_(obj._p_oid is None) self.assert_(obj._p_jar is None) self.assertRaises(KeyError, self.datamgr.get, oid) def checkTpcAbortAfterCommit(self): obj = StubObject() self.datamgr.add(obj) oid = obj._p_oid self.datamgr.tpc_begin(self.transaction) self.datamgr.commit(self.transaction) # Let's pretend something bad happened here. 
self.datamgr.tpc_abort(self.transaction) self.assert_(obj._p_oid is None) self.assert_(obj._p_jar is None) self.assertRaises(KeyError, self.datamgr.get, oid) self.assertEquals(self.db.storage._stored, [oid]) def checkCommit(self): obj = StubObject() self.datamgr.add(obj) oid = obj._p_oid self.datamgr.tpc_begin(self.transaction) self.datamgr.commit(self.transaction) self.datamgr.tpc_finish(self.transaction) self.assert_(obj._p_oid is oid) self.assert_(obj._p_jar is self.datamgr) # This next assert_ is covered by an assert in tpc_finish. ##self.assert_(not self.datamgr._added) self.assertEquals(self.db.storage._stored, [oid]) self.assertEquals(self.db.storage._finished, [oid]) def checkModifyOnGetstate(self): member = StubObject() subobj = StubObject() subobj.member = member obj = ModifyOnGetStateObject(subobj) self.datamgr.add(obj) self.datamgr.tpc_begin(self.transaction) self.datamgr.commit(self.transaction) self.datamgr.tpc_finish(self.transaction) storage = self.db.storage self.assert_(obj._p_oid in storage._stored, "object was not stored") self.assert_(subobj._p_oid in storage._stored, "subobject was not stored") self.assert_(member._p_oid in storage._stored, "member was not stored") self.assert_(self.datamgr._added_during_commit is None) def checkUnusedAddWorks(self): # When an object is added, but not committed, it shouldn't be stored, # but also it should be an error. obj = StubObject() self.datamgr.add(obj) self.datamgr.tpc_begin(self.transaction) self.datamgr.tpc_finish(self.transaction) self.assert_(obj._p_oid not in self.datamgr._storage._stored) def check__resetCacheResetsReader(self): # https://bugs.launchpad.net/zodb/+bug/142667 old_cache = self.datamgr._cache self.datamgr._resetCache() new_cache = self.datamgr._cache self.failIf(new_cache is old_cache) self.failUnless(self.datamgr._reader._cache is new_cache) class UserMethodTests(unittest.TestCase): # add isn't tested here, because there are a bunch of traditional # unit tests for it. def test_root(self): r"""doctest of root() method The root() method is simple, and the tests are pretty minimal. Ensure that a new database has a root and that it is a PersistentMapping. >>> db = databaseFromString("\n\n") >>> cn = db.open() >>> root = cn.root() >>> type(root).__name__ 'PersistentMapping' >>> root._p_oid '\x00\x00\x00\x00\x00\x00\x00\x00' >>> root._p_jar is cn True >>> db.close() """ def test_get(self): r"""doctest of get() method The get() method return the persistent object corresponding to an oid. >>> db = databaseFromString("\n\n") >>> cn = db.open() >>> obj = cn.get(p64(0)) >>> obj._p_oid '\x00\x00\x00\x00\x00\x00\x00\x00' The object is a ghost. >>> obj._p_state -1 And multiple calls with the same oid, return the same object. >>> obj2 = cn.get(p64(0)) >>> obj is obj2 True If all references to the object are released, then a new object will be returned. The cache doesn't keep unreferenced ghosts alive. (The next object returned my still have the same id, because Python may re-use the same memory.) >>> del obj, obj2 >>> cn._cache.get(p64(0), None) If the object is unghosted, then it will stay in the cache after the last reference is released. (This is true only if there is room in the cache and the object is recently used.) >>> obj = cn.get(p64(0)) >>> obj._p_activate() >>> y = id(obj) >>> del obj >>> obj = cn.get(p64(0)) >>> id(obj) == y True >>> obj._p_state 0 A request for an object that doesn't exist will raise a POSKeyError. >>> cn.get(p64(1)) Traceback (most recent call last): ... 
POSKeyError: 0x01 """ def test_close(self): r"""doctest of close() method This is a minimal test, because most of the interesting effects on closing a connection involve its interaction with the database and the transaction. >>> db = databaseFromString("\n\n") >>> cn = db.open() It's safe to close a connection multiple times. >>> cn.close() >>> cn.close() >>> cn.close() It's not possible to load or store objects once the storage is closed. >>> cn.get(p64(0)) Traceback (most recent call last): ... ConnectionStateError: The database connection is closed >>> p = Persistent() >>> cn.add(p) Traceback (most recent call last): ... ConnectionStateError: The database connection is closed """ def test_close_with_pending_changes(self): r"""doctest to ensure close() w/ pending changes complains >>> import transaction Just opening and closing is fine. >>> db = databaseFromString("\n\n") >>> cn = db.open() >>> cn.close() Opening, making a change, committing, and closing is fine. >>> cn = db.open() >>> cn.root()['a'] = 1 >>> transaction.commit() >>> cn.close() Opening, making a change, and aborting is fine. >>> cn = db.open() >>> cn.root()['a'] = 1 >>> transaction.abort() >>> cn.close() But trying to close with a change pending complains. >>> cn = db.open() >>> cn.root()['a'] = 10 >>> cn.close() Traceback (most recent call last): ... ConnectionStateError: Cannot close a connection joined to a transaction This leaves the connection as it was, so we can still commit the change. >>> transaction.commit() >>> cn2 = db.open() >>> cn2.root()['a'] 10 >>> cn.close(); cn2.close() >>> db.close() """ def test_onCloseCallbacks(self): r"""doctest of onCloseCallback() method >>> db = databaseFromString("\n\n") >>> cn = db.open() Every function registered is called, even if it raises an exception. They are only called once. >>> L = [] >>> def f(): ... L.append("f") >>> def g(): ... L.append("g") ... return 1 / 0 >>> cn.onCloseCallback(g) >>> cn.onCloseCallback(f) >>> cn.close() >>> L ['g', 'f'] >>> del L[:] >>> cn.close() >>> L [] The implementation keeps a list of callbacks that is reset to a class variable (which is bound to None) after the connection is closed. >>> cn._Connection__onCloseCallbacks """ def test_close_dispatches_to_activity_monitors(self): r"""doctest that connection close updates activity monitors Set up a multi-database: >>> db1 = ZODB.DB(None) >>> db2 = ZODB.DB(None, databases=db1.databases, database_name='2', ... cache_size=10) >>> conn1 = db1.open() >>> conn2 = conn1.get_connection('2') Add activity monitors to both dbs: >>> from ZODB.ActivityMonitor import ActivityMonitor >>> db1.setActivityMonitor(ActivityMonitor()) >>> db2.setActivityMonitor(ActivityMonitor()) Commit a transaction that affects both connections: >>> conn1.root()[0] = conn1.root().__class__() >>> conn2.root()[0] = conn2.root().__class__() >>> transaction.commit() After closing the primary connection, both monitors should be up to date: >>> conn1.close() >>> len(db1.getActivityMonitor().log) 1 >>> len(db2.getActivityMonitor().log) 1 """ def test_db(self): r"""doctest of db() method >>> db = databaseFromString("\n\n") >>> cn = db.open() >>> cn.db() is db True >>> cn.close() >>> cn.db() is db True """ def test_isReadOnly(self): r"""doctest of isReadOnly() method >>> db = databaseFromString("\n\n") >>> cn = db.open() >>> cn.isReadOnly() False >>> cn.close() >>> cn.isReadOnly() Traceback (most recent call last): ... 
ConnectionStateError: The database connection is closed An expedient way to create a read-only storage: >>> db.storage.isReadOnly = lambda: True >>> cn = db.open() >>> cn.isReadOnly() True """ def test_cache(self): r"""doctest of cacheMinimize(). Thus test us minimal, just verifying that the method can be called and has some effect. We need other tests that verify the cache works as intended. >>> db = databaseFromString("\n\n") >>> cn = db.open() >>> r = cn.root() >>> cn.cacheMinimize() >>> r._p_state -1 >>> r._p_activate() >>> r._p_state # up to date 0 >>> cn.cacheMinimize() >>> r._p_state # ghost again -1 """ def test_transaction_retry_convenience(): """ Simple test to verify integration with the transaction retry helper my verifying that we can raise ConflictError and have it handled properly. This is an adaptation of the convenience tests in transaction. >>> db = ZODB.tests.util.DB() >>> conn = db.open() >>> dm = conn.root() >>> ntry = 0 >>> with transaction.manager: ... dm['ntry'] = 0 >>> import ZODB.POSException >>> for attempt in transaction.manager.attempts(): ... with attempt as t: ... t.note('test') ... print dm['ntry'], ntry ... ntry += 1 ... dm['ntry'] = ntry ... if ntry % 3: ... raise ZODB.POSException.ConflictError() 0 0 0 1 0 2 """ class InvalidationTests(unittest.TestCase): # It's harder to write serious tests, because some of the critical # correctness issues relate to concurrency. We'll have to depend # on the various concurrent updates and NZODBThreads tests to # handle these. def test_invalidate(self): r""" This test initializes the database with several persistent objects, then manually delivers invalidations and verifies that they have the expected effect. >>> db = databaseFromString("\n\n") >>> cn = db.open() >>> p1 = Persistent() >>> p2 = Persistent() >>> p3 = Persistent() >>> r = cn.root() >>> r.update(dict(p1=p1, p2=p2, p3=p3)) >>> transaction.commit() Transaction ids are 8-byte strings, just like oids; p64() will create one from an int. >>> cn.invalidate(p64(1), {p1._p_oid: 1}) >>> cn._txn_time '\x00\x00\x00\x00\x00\x00\x00\x01' >>> p1._p_oid in cn._invalidated True >>> p2._p_oid in cn._invalidated False >>> cn.invalidate(p64(10), {p2._p_oid: 1, p64(76): 1}) >>> cn._txn_time '\x00\x00\x00\x00\x00\x00\x00\x01' >>> p1._p_oid in cn._invalidated True >>> p2._p_oid in cn._invalidated True Calling invalidate() doesn't affect the object state until a transaction boundary. >>> p1._p_state 0 >>> p2._p_state 0 >>> p3._p_state 0 The sync() method will abort the current transaction and process any pending invalidations. >>> cn.sync() >>> p1._p_state -1 >>> p2._p_state -1 >>> p3._p_state 0 >>> cn._invalidated set([]) """ def test_invalidateCache(): """The invalidateCache method invalidates a connection's cache. It also prevents reads until the end of a transaction:: >>> from ZODB.tests.util import DB >>> import transaction >>> db = DB() >>> tm = transaction.TransactionManager() >>> connection = db.open(transaction_manager=tm) >>> connection.root()['a'] = StubObject() >>> connection.root()['a'].x = 1 >>> connection.root()['b'] = StubObject() >>> connection.root()['b'].x = 1 >>> connection.root()['c'] = StubObject() >>> connection.root()['c'].x = 1 >>> tm.commit() >>> connection.root()['b']._p_deactivate() >>> connection.root()['c'].x = 2 So we have a connection and an active transaction with some modifications. 
Lets call invalidateCache: >>> connection.invalidateCache() Now, if we try to load an object, we'll get a read conflict: >>> connection.root()['b'].x Traceback (most recent call last): ... ReadConflictError: database read conflict error If we try to commit the transaction, we'll get a conflict error: >>> tm.commit() Traceback (most recent call last): ... ConflictError: database conflict error and the cache will have been cleared: >>> print connection.root()['a']._p_changed None >>> print connection.root()['b']._p_changed None >>> print connection.root()['c']._p_changed None But we'll be able to access data again: >>> connection.root()['b'].x 1 Aborting a transaction after a read conflict also lets us read data and go on about our business: >>> connection.invalidateCache() >>> connection.root()['c'].x Traceback (most recent call last): ... ReadConflictError: database read conflict error >>> tm.abort() >>> connection.root()['c'].x 1 >>> connection.root()['c'].x = 2 >>> tm.commit() >>> db.close() """ def connection_root_convenience(): """Connection root attributes can now be used as objects with attributes >>> db = ZODB.tests.util.DB() >>> conn = db.open() >>> conn.root.x Traceback (most recent call last): ... AttributeError: x >>> del conn.root.x Traceback (most recent call last): ... AttributeError: x >>> conn.root()['x'] = 1 >>> conn.root.x 1 >>> conn.root.y = 2 >>> sorted(conn.root().items()) [('x', 1), ('y', 2)] >>> conn.root >>> del conn.root.x >>> sorted(conn.root().items()) [('y', 2)] >>> conn.root.rather_long_name = 1 >>> conn.root.rather_long_name2 = 1 >>> conn.root.rather_long_name4 = 1 >>> conn.root.rather_long_name5 = 1 >>> conn.root """ class proper_ghost_initialization_with_empty__p_deactivate_class(Persistent): def _p_deactivate(self): pass def proper_ghost_initialization_with_empty__p_deactivate(): """ See https://bugs.launchpad.net/zodb/+bug/185066 >>> db = ZODB.tests.util.DB() >>> conn = db.open() >>> C = proper_ghost_initialization_with_empty__p_deactivate_class >>> conn.root.x = x = C() >>> conn.root.x.y = 1 >>> transaction.commit() >>> conn2 = db.open() >>> conn2.root.x._p_changed >>> conn2.root.x.y 1 """ def readCurrent(): r""" The connection's readCurrent method is called to provide a higher level of consistency in cases where an object if read to compute an update to a separate object. When this is used, the checkCurrentSerialInTransaction method on the storage is called in 2-phase commit. To demonstrate this, we'll create a storage and give it a test implementation of checkCurrentSerialInTransaction. >>> import ZODB.MappingStorage >>> store = ZODB.MappingStorage.MappingStorage() >>> from ZODB.POSException import ReadConflictError >>> bad = set() >>> def checkCurrentSerialInTransaction(oid, serial, trans): ... print 'checkCurrentSerialInTransaction', `oid` ... if not trans == transaction.get(): print 'oops' ... if oid in bad: ... raise ReadConflictError(oid=oid) >>> store.checkCurrentSerialInTransaction = checkCurrentSerialInTransaction Now, we'll use the storage as usual. checkCurrentSerialInTransaction won't normally be called: >>> db = ZODB.DB(store) >>> conn = db.open() >>> conn.root.a = ZODB.tests.util.P('a') >>> conn.root.b = ZODB.tests.util.P('b') >>> transaction.commit() If we call readCurrent for an object and we modify another object, then checkCurrentSerialInTransaction will be called for the object readCurrent was called on. 
>>> conn.readCurrent(conn.root.a) >>> conn.root.b.x = 0 >>> transaction.commit() checkCurrentSerialInTransaction '\x00\x00\x00\x00\x00\x00\x00\x01' It doesn't matter how often we call readCurrent, checkCurrentSerialInTransaction will be called only once: >>> conn.readCurrent(conn.root.a) >>> conn.readCurrent(conn.root.a) >>> conn.readCurrent(conn.root.a) >>> conn.readCurrent(conn.root.a) >>> conn.root.b.x += 1 >>> transaction.commit() checkCurrentSerialInTransaction '\x00\x00\x00\x00\x00\x00\x00\x01' checkCurrentSerialInTransaction won't be called if another object isn't modified: >>> conn.readCurrent(conn.root.a) >>> transaction.commit() Or if the object it was called on is modified: >>> conn.readCurrent(conn.root.a) >>> conn.root.a.x = 0 >>> conn.root.b.x += 1 >>> transaction.commit() If the storage raises a conflict error, it'll be propigated: >>> _ = str(conn.root.a) # do read >>> bad.add(conn.root.a._p_oid) >>> conn.readCurrent(conn.root.a) >>> conn.root.b.x += 1 >>> transaction.commit() Traceback (most recent call last): ... ReadConflictError: database read conflict error (oid 0x01) >>> transaction.abort() The conflict error will cause the affected object to be invalidated: >>> conn.root.a._p_changed The storage may raise it later: >>> def checkCurrentSerialInTransaction(oid, serial, trans): ... if not trans == transaction.get(): print 'oops' ... print 'checkCurrentSerialInTransaction', `oid` ... store.badness = ReadConflictError(oid=oid) >>> def tpc_vote(t): ... if store.badness: ... badness = store.badness ... store.badness = None ... raise badness >>> store.checkCurrentSerialInTransaction = checkCurrentSerialInTransaction >>> store.badness = None >>> store.tpc_vote = tpc_vote It will still be propigated: >>> _ = str(conn.root.a) # do read >>> conn.readCurrent(conn.root.a) >>> conn.root.b.x = +1 >>> transaction.commit() Traceback (most recent call last): ... ReadConflictError: database read conflict error (oid 0x01) >>> transaction.abort() The conflict error will cause the affected object to be invalidated: >>> conn.root.a._p_changed Read checks don't leak accross transactions: >>> conn.readCurrent(conn.root.a) >>> transaction.commit() >>> conn.root.b.x = +1 >>> transaction.commit() Read checks to work accross savepoints. >>> conn.readCurrent(conn.root.a) >>> conn.root.b.x = +1 >>> _ = transaction.savepoint() >>> transaction.commit() Traceback (most recent call last): ... ReadConflictError: database read conflict error (oid 0x01) >>> transaction.abort() >>> conn.readCurrent(conn.root.a) >>> _ = transaction.savepoint() >>> conn.root.b.x = +1 >>> transaction.commit() Traceback (most recent call last): ... ReadConflictError: database read conflict error (oid 0x01) >>> transaction.abort() """ def cache_management_of_subconnections(): """Make that cache management works for subconnections. When we use multi-databases, we open a connection in one database and access connections to other databases through it. This test verifies thatcache management is applied to all of the connections. Set up a multi-database: >>> db1 = ZODB.DB(None) >>> db2 = ZODB.DB(None, databases=db1.databases, database_name='2', ... cache_size=10) >>> conn1 = db1.open() >>> conn2 = conn1.get_connection('2') Populate it with some data, more than will fit in the cache: >>> for i in range(100): ... conn2.root()[i] = conn2.root().__class__() Upon commit, the cache is reduced to the cache size: >>> transaction.commit() >>> conn2._cache.cache_non_ghost_count 10 Fill it back up: >>> for i in range(100): ... 
_ = str(conn2.root()[i]) >>> conn2._cache.cache_non_ghost_count 101 Doing cache GC on the primary also does it on the secondary: >>> conn1.cacheGC() >>> conn2._cache.cache_non_ghost_count 10 Ditto for cache minimize: >>> conn1.cacheMinimize() >>> conn2._cache.cache_non_ghost_count 0 Fill it back up: >>> for i in range(100): ... _ = str(conn2.root()[i]) >>> conn2._cache.cache_non_ghost_count 101 GC is done on reopen: >>> conn1.close() >>> db1.open() is conn1 True >>> conn2 is conn1.get_connection('2') True >>> conn2._cache.cache_non_ghost_count 10 """ class C_invalidations_of_new_objects_work_after_savepoint(Persistent): def __init__(self): self.settings = 1 def _p_invalidate(self): print 'INVALIDATE', self.settings Persistent._p_invalidate(self) print self.settings # POSKeyError here def abort_of_savepoint_creating_new_objects_w_exotic_invalidate_doesnt_break(): r""" Before, the following would fail with a POSKeyError, which was somewhat surprizing, in a very edgy sort of way. :) Really, when an object add is aborted, the object should be "removed" from the db and its invalidatuon method shouldm't even be called: >>> conn = ZODB.connection(None) >>> conn.root.x = x = C_invalidations_of_new_objects_work_after_savepoint() >>> _ = transaction.savepoint() >>> x._p_oid '\x00\x00\x00\x00\x00\x00\x00\x01' >>> x._p_jar is conn True >>> transaction.abort() After the abort, the oid and jar are None: >>> x._p_oid >>> x._p_jar """ class Clp9460655(Persistent): def __init__(self, word, id): super(Clp9460655, self).__init__() self.id = id self._word = word def lp9460655(): r""" >>> conn = ZODB.connection(None) >>> root = conn.root() >>> Word = Clp9460655 >>> from BTrees.OOBTree import OOBTree >>> data = root['data'] = OOBTree() >>> commonWords = [] >>> count = "0" >>> for x in ('hello', 'world', 'how', 'are', 'you'): ... commonWords.append(Word(x, count)) ... count = str(int(count) + 1) >>> sv = transaction.savepoint() >>> for word in commonWords: ... sv2 = transaction.savepoint() ... data[word.id] = word >>> sv.rollback() >>> print commonWords[1].id # raises POSKeyError 1 """ def lp615758_transaction_abort_Incomplete_cleanup_for_new_objects(): r""" As the following"DocTest" demonstrates, "abort" forgets to reset "_p_changed" for new (i.e. "added") objects. >>> class P(Persistent): pass ... >>> c = ZODB.connection(None) >>> obj = P() >>> c.add(obj) >>> obj.x = 1 >>> obj._p_changed True >>> transaction.abort() >>> obj._p_changed False >>> c.close() """ class Clp485456_setattr_in_getstate_doesnt_cause_multiple_stores(Persistent): def __getstate__(self): self.got = 1 return self.__dict__.copy() def lp485456_setattr_in_setstate_doesnt_cause_multiple_stores(): r""" >>> C = Clp485456_setattr_in_getstate_doesnt_cause_multiple_stores >>> conn = ZODB.connection(None) >>> oldstore = conn._storage.store >>> def store(oid, *args): ... print 'storing', repr(oid) ... return oldstore(oid, *args) >>> conn._storage.store = store When we commit a change, we only get a single store call >>> conn.root.x = C() >>> transaction.commit() storing '\x00\x00\x00\x00\x00\x00\x00\x00' storing '\x00\x00\x00\x00\x00\x00\x00\x01' >>> conn.add(C()) >>> transaction.commit() storing '\x00\x00\x00\x00\x00\x00\x00\x02' We still see updates: >>> conn.root.x.y = 1 >>> transaction.commit() storing '\x00\x00\x00\x00\x00\x00\x00\x01' Not not non-updates: >>> transaction.commit() Let's try some combinations with savepoints: >>> conn.root.n = 0 >>> _ = transaction.savepoint() >>> oldspstore = conn._storage.store >>> def store(oid, *args): ... 
print 'savepoint storing', repr(oid) ... return oldspstore(oid, *args) >>> conn._storage.store = store >>> conn.root.y = C() >>> _ = transaction.savepoint() savepoint storing '\x00\x00\x00\x00\x00\x00\x00\x00' savepoint storing '\x00\x00\x00\x00\x00\x00\x00\x03' >>> conn.root.y.x = 1 >>> _ = transaction.savepoint() savepoint storing '\x00\x00\x00\x00\x00\x00\x00\x03' >>> transaction.commit() storing '\x00\x00\x00\x00\x00\x00\x00\x00' storing '\x00\x00\x00\x00\x00\x00\x00\x03' >>> conn.close() """ class _PlayPersistent(Persistent): def setValueWithSize(self, size=0): self.value = size*' ' __init__ = setValueWithSize class EstimatedSizeTests(ZODB.tests.util.TestCase): """check that size estimations are handled correctly.""" def setUp(self): ZODB.tests.util.TestCase.setUp(self) self.db = db = databaseFromString("\n\n") self.conn = c = db.open() self.obj = obj = _PlayPersistent() c.root()['obj'] = obj transaction.commit() def test_size_set_on_write_commit(self): obj, cache = self.obj, self.conn._cache # we have just written "obj". Its size should not be zero size, cache_size = obj._p_estimated_size, cache.total_estimated_size self.assert_(size > 0) self.assert_(cache_size > size) # increase the size, write again and check that the size changed obj.setValueWithSize(1000) transaction.commit() new_size = obj._p_estimated_size self.assert_(new_size > size) self.assertEqual(cache.total_estimated_size, cache_size + new_size - size) def test_size_set_on_write_savepoint(self): obj, cache = self.obj, self.conn._cache # we have just written "obj". Its size should not be zero size, cache_size = obj._p_estimated_size, cache.total_estimated_size # increase the size, write again and check that the size changed obj.setValueWithSize(1000) transaction.savepoint() new_size = obj._p_estimated_size self.assert_(new_size > size) self.assertEqual(cache.total_estimated_size, cache_size + new_size - size) def test_size_set_on_load(self): c = self.db.open() # new connection obj = c.root()['obj'] # the object is still a ghost and '_p_estimated_size' not yet set # access to unghost cache = c._cache cache_size = cache.total_estimated_size obj.value size = obj._p_estimated_size self.assert_(size > 0) self.assertEqual(cache.total_estimated_size, cache_size + size) # we test here as well that the deactivation works reduced the cache # size obj._p_deactivate() self.assertEqual(cache.total_estimated_size, cache_size) def test_configuration(self): # verify defaults .... expected = 0 # ... on db db = self.db self.assertEqual(db.getCacheSizeBytes(), expected) self.assertEqual(db.getHistoricalCacheSizeBytes(), expected) # ... on connection conn = self.conn self.assertEqual(conn._cache.cache_size_bytes, expected) # verify explicit setting ... expected = 10000 # ... on db db = databaseFromString("\n" " cache-size-bytes %d\n" " historical-cache-size-bytes %d\n" " \n" "" % (expected, expected+1) ) self.assertEqual(db.getCacheSizeBytes(), expected) self.assertEqual(db.getHistoricalCacheSizeBytes(), expected+1) # ... 
on connectionB conn = db.open() self.assertEqual(conn._cache.cache_size_bytes, expected) # test huge (larger than 4 byte) size limit db = databaseFromString("\n" " cache-size-bytes 8GB\n" " \n" "" ) self.assertEqual(db.getCacheSizeBytes(), 0x1L << 33) def test_cache_garbage_collection(self): db = self.db # activate size based cache garbage collection db.setCacheSizeBytes(1) conn = self.conn cache = conn._cache # verify the change worked as expected self.assertEqual(cache.cache_size_bytes, 1) # verify our entrance assumption is fullfilled self.assert_(cache.total_estimated_size > 1) conn.cacheGC() self.assert_(cache.total_estimated_size <= 1) # sanity check self.assert_(cache.total_estimated_size >= 0) def test_cache_garbage_collection_shrinking_object(self): db = self.db # activate size based cache garbage collection db.setCacheSizeBytes(1000) obj, conn, cache = self.obj, self.conn, self.conn._cache # verify the change worked as expected self.assertEqual(cache.cache_size_bytes, 1000) # verify our entrance assumption is fullfilled self.assert_(cache.total_estimated_size > 1) # give the objects some size obj.setValueWithSize(500) transaction.savepoint() self.assert_(cache.total_estimated_size > 500) # make the object smaller obj.setValueWithSize(100) transaction.savepoint() # make sure there was no overflow self.assert_(cache.total_estimated_size != 0) # the size is not larger than the allowed maximum self.assert_(cache.total_estimated_size <= 1000) # ---- stubs class StubObject(Persistent): pass class StubTransaction: pass class ErrorOnGetstateException(Exception): pass class ErrorOnGetstateObject(Persistent): def __getstate__(self): raise ErrorOnGetstateException class ModifyOnGetStateObject(Persistent): def __init__(self, p): self._v_p = p def __getstate__(self): self._p_jar.add(self._v_p) self.p = self._v_p return Persistent.__getstate__(self) class StubStorage: """Very simple in-memory storage that does *just* enough to support tests. Only one concurrent transaction is supported. Voting is not supported. Inspect self._stored and self._finished to see how the storage has been used during a unit test. Whenever an object is stored in the store() method, its oid is appended to self._stored. When a transaction is finished, the oids that have been stored during the transaction are appended to self._finished. 
""" # internal _oid = 1 _transaction = None def __init__(self): # internal self._stored = [] self._finished = [] self._data = {} self._transdata = {} self._transstored = [] def new_oid(self): oid = str(self._oid) self._oid += 1 return oid def sortKey(self): return 'StubStorage sortKey' def tpc_begin(self, transaction): if transaction is None: raise TypeError('transaction may not be None') elif self._transaction is None: self._transaction = transaction elif self._transaction != transaction: raise RuntimeError( 'StubStorage uses only one transaction at a time') def tpc_abort(self, transaction): if transaction is None: raise TypeError('transaction may not be None') elif self._transaction != transaction: raise RuntimeError( 'StubStorage uses only one transaction at a time') del self._transaction self._transdata.clear() def tpc_finish(self, transaction, callback): if transaction is None: raise TypeError('transaction may not be None') elif self._transaction != transaction: raise RuntimeError( 'StubStorage uses only one transaction at a time') self._finished.extend(self._transstored) self._data.update(self._transdata) callback(transaction) del self._transaction self._transdata.clear() self._transstored = [] def load(self, oid, version=''): if version != '': raise TypeError('StubStorage does not support versions.') return self._data[oid] def store(self, oid, serial, p, version, transaction): if version != '': raise TypeError('StubStorage does not support versions.') if transaction is None: raise TypeError('transaction may not be None') elif self._transaction != transaction: raise RuntimeError( 'StubStorage uses only one transaction at a time') self._stored.append(oid) self._transstored.append(oid) self._transdata[oid] = (p, serial) # Explicitly returing None, as we're not pretending to be a ZEO # storage return None class TestConnectionInterface(unittest.TestCase): def test_connection_interface(self): from ZODB.interfaces import IConnection db = databaseFromString("\n\n") cn = db.open() verifyObject(IConnection, cn) class StubDatabase: def __init__(self): self.storage = StubStorage() self.new_oid = self.storage.new_oid classFactory = None database_name = 'stubdatabase' databases = {'stubdatabase': database_name} def invalidate(self, transaction, dict_with_oid_keys, connection): pass large_record_size = 1<<30 def test_suite(): s = unittest.makeSuite(ConnectionDotAdd, 'check') s.addTest(doctest.DocTestSuite()) s.addTest(unittest.makeSuite(TestConnectionInterface)) s.addTest(unittest.makeSuite(EstimatedSizeTests)) return s ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testConnectionSavepoint.py000066400000000000000000000135121230730566700270430ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## import doctest import persistent.mapping import transaction import unittest import ZODB.tests.util def testAddingThenModifyThenAbort(): """\ We ran into a problem in which abort failed after adding an object in a savepoint and then modifying the object. The problem was that, on commit, the savepoint was aborted before the modifications were aborted. Because the object was added in the savepoint, its _p_oid and _p_jar were cleared when the savepoint was aborted. The object was in the registered-object list. There's an invariant for this list that states that all objects in the list should have an oid and (correct) jar. The fix was to abort work done after the savepoint before aborting the savepoint. >>> import ZODB.tests.util >>> db = ZODB.tests.util.DB() >>> connection = db.open() >>> root = connection.root() >>> ob = persistent.mapping.PersistentMapping() >>> root['ob'] = ob >>> sp = transaction.savepoint() >>> ob.x = 1 >>> transaction.abort() """ def testModifyThenSavePointThenModifySomeMoreThenCommit(): """\ We got conflict errors when we committed after we modified an object in a savepoint, and then modified it some more after the last savepoint. The problem was that we were effectively commiting the object twice -- when commiting the current data and when committing the savepoint. The fix was to first make a new savepoint to move new changes to the savepoint storage and *then* to commit the savepoint storage. >>> import ZODB.tests.util >>> db = ZODB.tests.util.DB() >>> connection = db.open() >>> root = connection.root() >>> sp = transaction.savepoint() >>> root['a'] = 1 >>> sp = transaction.savepoint() >>> root['a'] = 2 >>> transaction.commit() """ def testCantCloseConnectionWithActiveSavepoint(): """ >>> import ZODB.tests.util >>> db = ZODB.tests.util.DB() >>> connection = db.open() >>> root = connection.root() >>> root['a'] = 1 >>> sp = transaction.savepoint() >>> connection.close() Traceback (most recent call last): ... ConnectionStateError: Cannot close a connection joined to a transaction >>> db.close() """ def testSavepointDoesCacheGC(): """\ Although the interface doesn't guarantee this internal detail, making a savepoint should do incremental gc on connection memory caches. Indeed, one traditional use for savepoints is simply to free memory space midstream during a long transaction. Before ZODB 3.4.2, making a savepoint failed to trigger cache gc, and this test verifies that it now does. >>> import ZODB >>> from ZODB.tests.MinPO import MinPO >>> from ZODB.MappingStorage import MappingStorage >>> import transaction >>> CACHESIZE = 5 # something tiny >>> LOOPCOUNT = CACHESIZE * 10 >>> st = MappingStorage("Test") >>> db = ZODB.DB(st, cache_size=CACHESIZE) >>> cn = db.open() >>> rt = cn.root() Now attach substantially more than CACHESIZE persistent objects to the root: >>> for i in range(LOOPCOUNT): ... rt[i] = MinPO(i) >>> transaction.commit() Now modify all of them; the cache should contain LOOPCOUNT MinPO objects then, + 1 for the root object: >>> for i in range(LOOPCOUNT): ... obj = rt[i] ... obj.value = -i >>> len(cn._cache) == LOOPCOUNT + 1 True Making a savepoint at this time used to leave the cache holding the same number of objects. Make sure the cache shrinks now instead. >>> dummy = transaction.savepoint() >>> len(cn._cache) <= CACHESIZE + 1 True Verify all the values are as expected: >>> failures = [] >>> for i in range(LOOPCOUNT): ... obj = rt[i] ... if obj.value != -i: ... 
failures.append(obj) >>> failures [] >>> transaction.abort() >>> db.close() """ def testIsReadonly(): """\ The connection isReadonly method relies on the _storage to have an isReadOnly. We simply rely on the underlying storage method. >>> import ZODB.tests.util >>> db = ZODB.tests.util.DB() >>> connection = db.open() >>> root = connection.root() >>> root['a'] = 1 >>> sp = transaction.savepoint() >>> connection.isReadOnly() False """ class SelfActivatingObject(persistent.Persistent): def _p_invalidate(self): super(SelfActivatingObject, self)._p_invalidate() self._p_activate() def testInvalidateAfterRollback(): """\ The rollback used to invalidate objects before resetting the TmpStore. This caused problems for custom _p_invalidate methods that would load the wrong state. >>> import ZODB.tests.util >>> db = ZODB.tests.util.DB() >>> connection = db.open() >>> root = connection.root() >>> root['p'] = p = SelfActivatingObject() >>> transaction.commit() >>> p.foo = 1 >>> sp = transaction.savepoint() >>> p.foo = 2 >>> sp2 = transaction.savepoint() >>> sp.rollback() >>> p.foo # This used to wrongly return 2 1 """ def tearDown(test): transaction.abort() def test_suite(): return unittest.TestSuite(( doctest.DocFileSuite('testConnectionSavepoint.txt', tearDown=tearDown), doctest.DocTestSuite(tearDown=tearDown), )) if __name__ == '__main__': unittest.main(defaultTest='test_suite') ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testConnectionSavepoint.txt000066400000000000000000000117761230730566700272440ustar00rootroot00000000000000========== Savepoints ========== Savepoints provide a way to save to disk intermediate work done during a transaction allowing: - partial transaction (subtransaction) rollback (abort) - state of saved objects to be freed, freeing on-line memory for other uses Savepoints make it possible to write atomic subroutines that don't make top-level transaction commitments. Applications ------------ To demonstrate how savepoints work with transactions, we'll show an example. >>> import ZODB.tests.util >>> db = ZODB.tests.util.DB() >>> connection = db.open() >>> root = connection.root() >>> root['name'] = 'bob' As with other data managers, we can commit changes: >>> import transaction >>> transaction.commit() >>> root['name'] 'bob' and abort changes: >>> root['name'] = 'sally' >>> root['name'] 'sally' >>> transaction.abort() >>> root['name'] 'bob' Now, let's look at an application that manages funds for people. It allows deposits and debits to be entered for multiple people. It accepts a sequence of entries and generates a sequence of status messages. For each entry, it applies the change and then validates the user's account. If the user's account is invalid, we roll back the change for that entry. The success or failure of an entry is indicated in the output status. First we'll initialize some accounts: >>> root['bob-balance'] = 0.0 >>> root['bob-credit'] = 0.0 >>> root['sally-balance'] = 0.0 >>> root['sally-credit'] = 100.0 >>> transaction.commit() Now, we'll define a validation function to validate an account: >>> def validate_account(name): ... if root[name+'-balance'] + root[name+'-credit'] < 0: ... raise ValueError('Overdrawn', name) And a function to apply entries. If the function fails in some unexpected way, it rolls back all of its changes and prints the error: >>> def apply_entries(entries): ... savepoint = transaction.savepoint() ... try: ... for name, amount in entries: ... entry_savepoint = transaction.savepoint() ... try: ... 
root[name+'-balance'] += amount ... validate_account(name) ... except ValueError, error: ... entry_savepoint.rollback() ... print 'Error', str(error) ... else: ... print 'Updated', name ... except Exception, error: ... savepoint.rollback() ... print 'Unexpected exception', error Now let's try applying some entries: >>> apply_entries([ ... ('bob', 10.0), ... ('sally', 10.0), ... ('bob', 20.0), ... ('sally', 10.0), ... ('bob', -100.0), ... ('sally', -100.0), ... ]) Updated bob Updated sally Updated bob Updated sally Error ('Overdrawn', 'bob') Updated sally >>> root['bob-balance'] 30.0 >>> root['sally-balance'] -80.0 If we provide entries that cause an unexpected error: >>> apply_entries([ ... ('bob', 10.0), ... ('sally', 10.0), ... ('bob', '20.0'), ... ('sally', 10.0), ... ]) Updated bob Updated sally Unexpected exception unsupported operand type(s) for +=: 'float' and 'str' Because the apply_entries used a savepoint for the entire function, it was able to rollback the partial changes without rolling back changes made in the previous call to ``apply_entries``: >>> root['bob-balance'] 30.0 >>> root['sally-balance'] -80.0 If we now abort the outer transactions, the earlier changes will go away: >>> transaction.abort() >>> root['bob-balance'] 0.0 >>> root['sally-balance'] 0.0 Savepoint invalidation ---------------------- A savepoint can be used any number of times: >>> root['bob-balance'] = 100.0 >>> root['bob-balance'] 100.0 >>> savepoint = transaction.savepoint() >>> root['bob-balance'] = 200.0 >>> root['bob-balance'] 200.0 >>> savepoint.rollback() >>> root['bob-balance'] 100.0 >>> savepoint.rollback() # redundant, but should be harmless >>> root['bob-balance'] 100.0 >>> root['bob-balance'] = 300.0 >>> root['bob-balance'] 300.0 >>> savepoint.rollback() >>> root['bob-balance'] 100.0 However, using a savepoint invalidates any savepoints that come after it: >>> root['bob-balance'] = 200.0 >>> root['bob-balance'] 200.0 >>> savepoint1 = transaction.savepoint() >>> root['bob-balance'] = 300.0 >>> root['bob-balance'] 300.0 >>> savepoint2 = transaction.savepoint() >>> savepoint.rollback() >>> root['bob-balance'] 100.0 >>> savepoint2.rollback() # doctest: +ELLIPSIS Traceback (most recent call last): ... InvalidSavepointRollbackError... >>> savepoint1.rollback() # doctest: +ELLIPSIS Traceback (most recent call last): ... InvalidSavepointRollbackError... >>> transaction.abort() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testDB.py000066400000000000000000000223461230730566700233450ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## from ZODB.tests.MinPO import MinPO import doctest import os import sys import time import transaction import unittest import ZODB import ZODB.tests.util # Return total number of connections across all pools in a db._pools. 
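# (Descriptive note, inferred from the helper just below: each pool object
# tracks the connections it manages in its ``all`` attribute, so summing
# ``len(pool.all)`` over ``pools.values()`` gives the total connection count.
# This is purely a counting helper for the tests that follow.)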
def nconn(pools): return sum([len(pool.all) for pool in pools.values()]) class DBTests(ZODB.tests.util.TestCase): def setUp(self): ZODB.tests.util.TestCase.setUp(self) self.db = ZODB.DB('test.fs') def tearDown(self): self.db.close() ZODB.tests.util.TestCase.tearDown(self) def dowork(self): c = self.db.open() r = c.root() o = r[time.time()] = MinPO(0) transaction.commit() for i in range(25): o.value = MinPO(i) transaction.commit() o = o.value serial = o._p_serial root_serial = r._p_serial c.close() return serial, root_serial # make sure the basic methods are callable def testSets(self): self.db.setCacheSize(15) self.db.setHistoricalCacheSize(15) def test_references(self): # TODO: For now test that we're using referencesf. We really should # have tests of referencesf. import ZODB.serialize self.assert_(self.db.references is ZODB.serialize.referencesf) def test_invalidateCache(): """The invalidateCache method invalidates a connection caches for all of the connections attached to a database:: >>> from ZODB.tests.util import DB >>> import transaction >>> db = DB() >>> tm1 = transaction.TransactionManager() >>> c1 = db.open(transaction_manager=tm1) >>> c1.root()['a'] = MinPO(1) >>> tm1.commit() >>> tm2 = transaction.TransactionManager() >>> c2 = db.open(transaction_manager=tm2) >>> c1.root()['a']._p_deactivate() >>> tm3 = transaction.TransactionManager() >>> c3 = db.open(transaction_manager=tm3) >>> c3.root()['a'].value 1 >>> c3.close() >>> db.invalidateCache() >>> c1.root()['a'].value Traceback (most recent call last): ... ReadConflictError: database read conflict error >>> c2.root()['a'].value Traceback (most recent call last): ... ReadConflictError: database read conflict error >>> c3 is db.open(transaction_manager=tm3) True >>> print c3.root()['a']._p_changed None >>> db.close() """ def connectionDebugInfo(): r"""DB.connectionDebugInfo provides information about connections. >>> import time >>> now = 1228423244.5 >>> def faux_time(): ... global now ... now += .1 ... return now >>> real_time = time.time >>> time.time = faux_time >>> from ZODB.tests.util import DB >>> import transaction >>> db = DB() >>> c1 = db.open() >>> c1.setDebugInfo('test info') >>> c1.root()['a'] = MinPO(1) >>> transaction.commit() >>> c2 = db.open() >>> _ = c1.root()['a'] >>> c2.close() >>> c3 = db.open(before=c1.root()._p_serial) >>> info = db.connectionDebugInfo() >>> import pprint >>> pprint.pprint(sorted(info, key=lambda i: str(i['opened'])), width=1) [{'before': None, 'info': 'test info (2)', 'opened': '2008-12-04T20:40:44Z (1.40s)'}, {'before': '\x03zY\xd8\xc0m9\xdd', 'info': ' (0)', 'opened': '2008-12-04T20:40:45Z (0.30s)'}, {'before': None, 'info': ' (0)', 'opened': None}] >>> time.time = real_time """ def passing_a_file_name_to_DB(): """You can pass a file-storage file name to DB. (Also note that we can access DB in ZODB.) >>> db = ZODB.DB('data.fs') >>> db.storage # doctest: +ELLIPSIS >> os.path.exists('data.fs') True >>> db.close() """ def passing_None_to_DB(): """You can pass None DB to get a MappingStorage. (Also note that we can access DB in ZODB.) >>> db = ZODB.DB(None) >>> db.storage # doctest: +ELLIPSIS >> db.close() """ def open_convenience(): """Often, we just want to open a single connection. >>> conn = ZODB.connection('data.fs') >>> conn.root() {} >>> conn.root()['x'] = 1 >>> transaction.commit() >>> conn.close() Let's make sure the database was cloased when we closed the connection, and that the data is there. 
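(Presumably closing a connection obtained from ``ZODB.connection`` also
closes the database it created under the hood; that is why we can re-open
``data.fs`` with a fresh ``ZODB.DB`` below and find the committed data.)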
>>> db = ZODB.DB('data.fs') >>> conn = db.open() >>> conn.root() {'x': 1} >>> db.close() We can pass storage-specific arguments if they don't conflict with DB arguments. >>> conn = ZODB.connection('data.fs', blob_dir='blobs') >>> conn.root()['b'] = ZODB.blob.Blob('test') >>> transaction.commit() >>> conn.close() >>> db = ZODB.DB('data.fs', blob_dir='blobs') >>> conn = db.open() >>> conn.root()['b'].open().read() 'test' >>> db.close() """ if sys.version_info >= (2, 6): def db_with_transaction(): """Using databases with with The transaction method returns a context manager that when entered starts a transaction with a private transaction manager. To illustrate this, we start a trasnaction using a regular connection and see that it isn't automatically committed or aborted as we use the transaction context manager. >>> db = ZODB.tests.util.DB() >>> conn = db.open() >>> conn.root()['x'] = conn.root().__class__() >>> transaction.commit() >>> conn.root()['x']['x'] = 1 >>> with db.transaction() as conn2: ... conn2.root()['y'] = 1 >>> conn2.opened Now, we'll open a 3rd connection a verify that >>> conn3 = db.open() >>> conn3.root()['x'] {} >>> conn3.root()['y'] 1 >>> conn3.close() Let's try again, but this time, we'll have an exception: >>> with db.transaction() as conn2: ... conn2.root()['y'] = 2 ... XXX Traceback (most recent call last): ... NameError: name 'XXX' is not defined >>> conn2.opened >>> conn3 = db.open() >>> conn3.root()['x'] {} >>> conn3.root()['y'] 1 >>> conn3.close() >>> transaction.commit() >>> conn3 = db.open() >>> conn3.root()['x'] {'x': 1} >>> db.close() """ def connection_allows_empty_version_for_idiots(): r""" >>> db = ZODB.DB('t.fs') >>> c = ZODB.tests.util.assert_deprecated( ... (lambda : db.open('')), ... 'A version string was passed to open') >>> c.root() {} >>> db.close() """ def warn_when_data_records_are_big(): """ When data records are large, a warning is issued to try to prevent new users from shooting themselves in the foot. >>> db = ZODB.DB('t.fs', create=True) >>> conn = db.open() >>> conn.root.x = 'x'*(1<<24) >>> ZODB.tests.util.assert_warning(UserWarning, transaction.commit, ... "object you're saving is large.") >>> db.close() The large_record_size option can be used to control the record size: >>> db = ZODB.DB('t.fs', create=True, large_record_size=999) >>> conn = db.open() >>> conn.root.x = 'x' >>> transaction.commit() >>> conn.root.x = 'x'*999 >>> ZODB.tests.util.assert_warning(UserWarning, transaction.commit, ... "object you're saving is large.") >>> db.close() We can also specify it using a configuration option: >>> import ZODB.config >>> db = ZODB.config.databaseFromString(''' ... ... large-record-size 1MB ... ... path t.fs ... create true ... ... ... ''') >>> conn = db.open() >>> conn.root.x = 'x' >>> transaction.commit() >>> conn.root.x = 'x'*(1<<20) >>> ZODB.tests.util.assert_warning(UserWarning, transaction.commit, ... "object you're saving is large.") >>> db.close() """ # ' def minimally_test_connection_timeout(): """There's a mechanism to discard old connections. Make sure it doesn't error. 
:) >>> db = ZODB.DB(None, pool_timeout=.01) >>> c1 = db.open() >>> c2 = db.open() >>> c1.close() >>> c2.close() >>> time.sleep(.02) >>> db.open() is c2 True >>> db.pool.available [] """ def test_suite(): s = unittest.makeSuite(DBTests) s.addTest(doctest.DocTestSuite( setUp=ZODB.tests.util.setUp, tearDown=ZODB.tests.util.tearDown, )) return s ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testDemoStorage.py000066400000000000000000000176441230730566700252760ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## from ZODB.DB import DB from ZODB.tests import ( BasicStorage, HistoryStorage, IteratorStorage, MTStorage, PackableStorage, RevisionStorage, StorageTestBase, Synchronization, ) import os if os.environ.get('USE_ZOPE_TESTING_DOCTEST'): from zope.testing import doctest else: import doctest import random import transaction import unittest import ZODB.DemoStorage import ZODB.tests.hexstorage import ZODB.tests.util import ZODB.utils class DemoStorageTests( StorageTestBase.StorageTestBase, BasicStorage.BasicStorage, HistoryStorage.HistoryStorage, IteratorStorage.ExtendedIteratorStorage, IteratorStorage.IteratorStorage, MTStorage.MTStorage, PackableStorage.PackableStorage, RevisionStorage.RevisionStorage, Synchronization.SynchronizedStorage, ): def setUp(self): StorageTestBase.StorageTestBase.setUp(self) self._storage = ZODB.DemoStorage.DemoStorage() def checkOversizeNote(self): # This base class test checks for the common case where a storage # doesnt support huge transaction metadata. This storage doesnt # have this limit, so we inhibit this test here. pass def checkLoadDelegation(self): # Minimal test of loadEX w/o version -- ironically db = DB(self._storage) # creates object 0. :) s2 = ZODB.DemoStorage.DemoStorage(base=self._storage) self.assertEqual(s2.load(ZODB.utils.z64, ''), self._storage.load(ZODB.utils.z64, '')) def checkLengthAndBool(self): self.assertEqual(len(self._storage), 0) self.assert_(not self._storage) db = DB(self._storage) # creates object 0. 
:) self.assertEqual(len(self._storage), 1) self.assert_(self._storage) conn = db.open() for i in range(10): conn.root()[i] = conn.root().__class__() transaction.commit() self.assertEqual(len(self._storage), 11) self.assert_(self._storage) def checkLoadBeforeUndo(self): pass # we don't support undo yet checkUndoZombie = checkLoadBeforeUndo class DemoStorageHexTests(DemoStorageTests): def setUp(self): StorageTestBase.StorageTestBase.setUp(self) self._storage = ZODB.tests.hexstorage.HexStorage( ZODB.DemoStorage.DemoStorage()) class DemoStorageWrappedBase(DemoStorageTests): def setUp(self): StorageTestBase.StorageTestBase.setUp(self) self._base = self._makeBaseStorage() self._storage = ZODB.DemoStorage.DemoStorage(base=self._base) def tearDown(self): self._base.close() StorageTestBase.StorageTestBase.tearDown(self) def _makeBaseStorage(self): raise NotImplementedError def checkPackOnlyOneObject(self): pass # Wrapping demo storages don't do gc def checkPackWithMultiDatabaseReferences(self): pass # we never do gc checkPackAllRevisions = checkPackWithMultiDatabaseReferences class DemoStorageWrappedAroundMappingStorage(DemoStorageWrappedBase): def _makeBaseStorage(self): from ZODB.MappingStorage import MappingStorage return MappingStorage() class DemoStorageWrappedAroundFileStorage(DemoStorageWrappedBase): def _makeBaseStorage(self): from ZODB.FileStorage import FileStorage return FileStorage('FileStorageTests.fs') class DemoStorageWrappedAroundHexMappingStorage(DemoStorageWrappedBase): def _makeBaseStorage(self): from ZODB.MappingStorage import MappingStorage return ZODB.tests.hexstorage.HexStorage(MappingStorage()) def setUp(test): random.seed(0) ZODB.tests.util.setUp(test) def testSomeDelegation(): r""" >>> class S: ... def __init__(self, name): ... self.name = name ... def registerDB(self, db): ... print self.name, db ... def close(self): ... print self.name, 'closed' ... sortKey = getSize = __len__ = history = getTid = None ... tpc_finish = tpc_vote = tpc_transaction = None ... _lock_acquire = _lock_release = lambda self: None ... getName = lambda self: 'S' ... isReadOnly = tpc_transaction = None ... supportsUndo = undo = undoLog = undoInfo = None ... supportsTransactionalUndo = None ... def new_oid(self): ... return '\0' * 8 ... def tpc_begin(self, t, tid, status): ... print 'begin', tid, status ... def tpc_abort(self, t): ... pass >>> from ZODB.DemoStorage import DemoStorage >>> storage = DemoStorage(base=S(1), changes=S(2)) >>> storage.registerDB(1) 2 1 >>> storage.close() 1 closed 2 closed >>> storage.tpc_begin(1, 2, 3) begin 2 3 >>> storage.tpc_abort(1) """ def blob_pos_key_error_with_non_blob_base(): """ >>> storage = ZODB.DemoStorage.DemoStorage() >>> storage.loadBlob(ZODB.utils.p64(1), ZODB.utils.p64(1)) Traceback (most recent call last): ... POSKeyError: 0x01 >>> storage.openCommittedBlobFile(ZODB.utils.p64(1), ZODB.utils.p64(1)) Traceback (most recent call last): ... POSKeyError: 0x01 """ def load_before_base_storage_current(): """ Here we'll exercise that DemoStorage's loadBefore method works properly when deferring to a record that is current in the base storage. 
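(As a reminder of the interface being exercised here: ``loadBefore(oid, tid)``
returns a ``(data, tid, next_tid)`` triple, where a ``next_tid`` of ``None``
means the record is current.  When the base storage reports "current" but the
wrapping DemoStorage holds a newer revision in its changes storage, the
``next_tid`` handed back to the caller should come from changes -- that is
exactly what the checks below verify.)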
>>> import time >>> import transaction >>> import ZODB.DB >>> import ZODB.DemoStorage >>> import ZODB.MappingStorage >>> import ZODB.utils >>> base = ZODB.MappingStorage.MappingStorage() >>> basedb = ZODB.DB(base) >>> conn = basedb.open() >>> conn.root()['foo'] = 'bar' >>> transaction.commit() >>> conn.close() >>> storage = ZODB.DemoStorage.DemoStorage(base=base) >>> db = ZODB.DB(storage) >>> conn = db.open() >>> conn.root()['foo'] = 'baz' >>> time.sleep(.1) # Windows has a low-resolution clock >>> transaction.commit() >>> oid = ZODB.utils.z64 >>> base_current = storage.base.load(oid) >>> tid = ZODB.utils.p64(ZODB.utils.u64(base_current[1]) + 1) >>> base_record = storage.base.loadBefore(oid, tid) >>> base_record[-1] is None True >>> base_current == base_record[:2] True >>> t = storage.loadBefore(oid, tid) The data and tid are the values from the base storage, but the next tid is from changes. >>> t[:2] == base_record[:2] True >>> t[-1] == storage.changes.load(oid)[1] True >>> conn.close() >>> db.close() >>> base.close() """ def test_suite(): suite = unittest.TestSuite(( doctest.DocTestSuite( setUp=setUp, tearDown=ZODB.tests.util.tearDown, ), doctest.DocFileSuite( '../DemoStorage.test', setUp=setUp, tearDown=ZODB.tests.util.tearDown, ), )) suite.addTest(unittest.makeSuite(DemoStorageTests, 'check')) suite.addTest(unittest.makeSuite(DemoStorageHexTests, 'check')) suite.addTest(unittest.makeSuite(DemoStorageWrappedAroundFileStorage, 'check')) suite.addTest(unittest.makeSuite(DemoStorageWrappedAroundMappingStorage, 'check')) suite.addTest(unittest.makeSuite(DemoStorageWrappedAroundHexMappingStorage, 'check')) return suite ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testFileStorage.py000066400000000000000000000554171230730566700252710ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## import cPickle import doctest import os if os.environ.get('USE_ZOPE_TESTING_DOCTEST'): from zope.testing import doctest import unittest import transaction import ZODB.FileStorage import ZODB.tests.hexstorage import ZODB.tests.testblob import ZODB.tests.util import zope.testing.setupstack from ZODB import POSException from ZODB import DB from ZODB.fsIndex import fsIndex from ZODB.tests import StorageTestBase, BasicStorage, TransactionalUndoStorage from ZODB.tests import PackableStorage, Synchronization, ConflictResolution from ZODB.tests import HistoryStorage, IteratorStorage, Corruption from ZODB.tests import RevisionStorage, PersistentStorage, MTStorage from ZODB.tests import ReadOnlyStorage, RecoveryStorage from ZODB.tests.StorageTestBase import MinPO, zodb_pickle class FileStorageTests( StorageTestBase.StorageTestBase, BasicStorage.BasicStorage, TransactionalUndoStorage.TransactionalUndoStorage, RevisionStorage.RevisionStorage, PackableStorage.PackableStorageWithOptionalGC, PackableStorage.PackableUndoStorage, Synchronization.SynchronizedStorage, ConflictResolution.ConflictResolvingStorage, ConflictResolution.ConflictResolvingTransUndoStorage, HistoryStorage.HistoryStorage, IteratorStorage.IteratorStorage, IteratorStorage.ExtendedIteratorStorage, PersistentStorage.PersistentStorage, MTStorage.MTStorage, ReadOnlyStorage.ReadOnlyStorage ): def open(self, **kwargs): self._storage = ZODB.FileStorage.FileStorage('FileStorageTests.fs', **kwargs) def setUp(self): StorageTestBase.StorageTestBase.setUp(self) self.open(create=1) def checkLongMetadata(self): s = "X" * 75000 try: self._dostore(user=s) except POSException.StorageError: pass else: self.fail("expect long user field to raise error") try: self._dostore(description=s) except POSException.StorageError: pass else: self.fail("expect long user field to raise error") def check_use_fsIndex(self): self.assertEqual(self._storage._index.__class__, fsIndex) # A helper for checking that when an .index contains a dict for the # index, it's converted to an fsIndex when the file is opened. def convert_index_to_dict(self): # Convert the index in the current .index file to a Python dict. # Return the index originally found. data = fsIndex.load('FileStorageTests.fs.index') index = data['index'] newindex = dict(index) data['index'] = newindex cPickle.dump(data, open('FileStorageTests.fs.index', 'wb'), 1) return index def check_conversion_to_fsIndex(self, read_only=False): from ZODB.fsIndex import fsIndex # Create some data, and remember the index. for i in range(10): self._dostore() oldindex_as_dict = dict(self._storage._index) # Save the index. self._storage.close() # Convert it to a dict. old_index = self.convert_index_to_dict() self.assert_(isinstance(old_index, fsIndex)) new_index = self.convert_index_to_dict() self.assert_(isinstance(new_index, dict)) # Verify it's converted to fsIndex in memory upon open. self.open(read_only=read_only) self.assert_(isinstance(self._storage._index, fsIndex)) # Verify it has the right content. newindex_as_dict = dict(self._storage._index) self.assertEqual(oldindex_as_dict, newindex_as_dict) # Check that the type on disk has changed iff read_only is False. 
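# (Presumably a read-only open must not rewrite the .index file, so the plain
# dict we planted on disk stays a dict; a read-write open is free to save the
# converted fsIndex back out.  The assertions below distinguish exactly these
# two cases.)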
self._storage.close() current_index = self.convert_index_to_dict() if read_only: self.assert_(isinstance(current_index, dict)) else: self.assert_(isinstance(current_index, fsIndex)) def check_conversion_to_fsIndex_readonly(self): # Same thing, but the disk .index should continue to hold a # Python dict. self.check_conversion_to_fsIndex(read_only=True) def check_conversion_from_dict_to_btree_data_in_fsIndex(self): # To support efficient range searches on its keys as part of # implementing a record iteration protocol in FileStorage, we # converted the fsIndex class from using a dictionary as its # self._data attribute to using an OOBTree in its stead. from ZODB.fsIndex import fsIndex from BTrees.OOBTree import OOBTree # Create some data, and remember the index. for i in range(10): self._dostore() data_dict = dict(self._storage._index._data) # Replace the OOBTree with a dictionary and commit it. self._storage._index._data = data_dict transaction.commit() # Save the index. self._storage.close() # Verify it's converted to fsIndex in memory upon open. self.open() self.assert_(isinstance(self._storage._index, fsIndex)) self.assert_(isinstance(self._storage._index._data, OOBTree)) # Verify it has the right content. new_data_dict = dict(self._storage._index._data) self.assertEqual(len(data_dict), len(new_data_dict)) for k in data_dict: old_tree = data_dict[k] new_tree = new_data_dict[k] self.assertEqual(list(old_tree.items()), list(new_tree.items())) def check_save_after_load_with_no_index(self): for i in range(10): self._dostore() self._storage.close() os.remove('FileStorageTests.fs.index') self.open() self.assertEqual(self._storage._saved, 1) def checkStoreBumpsOid(self): # If .store() is handed an oid bigger than the storage knows # about already, it's crucial that the storage bump its notion # of the largest oid in use. t = transaction.Transaction() self._storage.tpc_begin(t) giant_oid = '\xee' * 8 # Store an object. # oid, serial, data, version, transaction r1 = self._storage.store(giant_oid, '\0'*8, 'data', '', t) # Finish the transaction. r2 = self._storage.tpc_vote(t) self._storage.tpc_finish(t) # Before ZODB 3.2.6, this failed, with ._oid == z64. self.assertEqual(self._storage._oid, giant_oid) def checkRestoreBumpsOid(self): # As above, if .restore() is handed an oid bigger than the storage # knows about already, it's crucial that the storage bump its notion # of the largest oid in use. Because copyTransactionsFrom(), and # ZRS recovery, use the .restore() method, this is plain critical. t = transaction.Transaction() self._storage.tpc_begin(t) giant_oid = '\xee' * 8 # Store an object. # oid, serial, data, version, prev_txn, transaction r1 = self._storage.restore(giant_oid, '\0'*8, 'data', '', None, t) # Finish the transaction. r2 = self._storage.tpc_vote(t) self._storage.tpc_finish(t) # Before ZODB 3.2.6, this failed, with ._oid == z64. self.assertEqual(self._storage._oid, giant_oid) def checkCorruptionInPack(self): # This sets up a corrupt .fs file, with a redundant transaction # length mismatch. The implementation of pack in many releases of # ZODB blew up if the .fs file had such damage: it detected the # damage, but the code to raise CorruptedError referenced an undefined # global. import time from ZODB.utils import U64, p64 from ZODB.FileStorage.format import CorruptedError from ZODB.serialize import referencesf db = DB(self._storage) conn = db.open() conn.root()['xyz'] = 1 transaction.commit() # Ensure it's all on disk. db.close() self._storage.close() # Reopen before damaging. 
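# (The damage below relies on FileStorage recording each transaction's length
# twice -- once in the transaction header and once, redundantly, at the end of
# the record.  Corrupting the trailing copy creates the "redundant transaction
# length does not match initial transaction length" condition that pack is
# expected to report.)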
self.open() # Open .fs directly, and damage content. f = open('FileStorageTests.fs', 'r+b') f.seek(0, 2) pos2 = f.tell() - 8 f.seek(pos2) tlen2 = U64(f.read(8)) # length-8 of the last transaction pos1 = pos2 - tlen2 + 8 # skip over the tid at the start f.seek(pos1) tlen1 = U64(f.read(8)) # should be redundant length-8 self.assertEqual(tlen1, tlen2) # verify that it is redundant # Now damage the second copy. f.seek(pos2) f.write(p64(tlen2 - 1)) f.close() # Try to pack. This used to yield # NameError: global name 's' is not defined try: self._storage.pack(time.time(), referencesf) except CorruptedError, detail: self.assert_("redundant transaction length does not match " "initial transaction length" in str(detail)) else: self.fail("expected CorruptedError") def check_record_iternext(self): db = DB(self._storage) conn = db.open() conn.root()['abc'] = MinPO('abc') conn.root()['xyz'] = MinPO('xyz') transaction.commit() # Ensure it's all on disk. db.close() self._storage.close() self.open() key = None for x in ('\000', '\001', '\002'): oid, tid, data, next_oid = self._storage.record_iternext(key) self.assertEqual(oid, ('\000' * 7) + x) key = next_oid expected_data, expected_tid = self._storage.load(oid, '') self.assertEqual(expected_data, data) self.assertEqual(expected_tid, tid) if x == '\002': self.assertEqual(next_oid, None) else: self.assertNotEqual(next_oid, None) class FileStorageHexTests(FileStorageTests): def open(self, **kwargs): self._storage = ZODB.tests.hexstorage.HexStorage( ZODB.FileStorage.FileStorage('FileStorageTests.fs',**kwargs)) class FileStorageTestsWithBlobsEnabled(FileStorageTests): def open(self, **kwargs): if 'blob_dir' not in kwargs: kwargs = kwargs.copy() kwargs['blob_dir'] = 'blobs' FileStorageTests.open(self, **kwargs) class FileStorageHexTestsWithBlobsEnabled(FileStorageTests): def open(self, **kwargs): if 'blob_dir' not in kwargs: kwargs = kwargs.copy() kwargs['blob_dir'] = 'blobs' FileStorageTests.open(self, **kwargs) self._storage = ZODB.tests.hexstorage.HexStorage(self._storage) class FileStorageRecoveryTest( StorageTestBase.StorageTestBase, RecoveryStorage.RecoveryStorage, ): def setUp(self): StorageTestBase.StorageTestBase.setUp(self) self._storage = ZODB.FileStorage.FileStorage("Source.fs", create=True) self._dst = ZODB.FileStorage.FileStorage("Dest.fs", create=True) def tearDown(self): self._dst.close() StorageTestBase.StorageTestBase.tearDown(self) def new_dest(self): return ZODB.FileStorage.FileStorage('Dest.fs') class FileStorageHexRecoveryTest(FileStorageRecoveryTest): def setUp(self): StorageTestBase.StorageTestBase.setUp(self) self._storage = ZODB.tests.hexstorage.HexStorage( ZODB.FileStorage.FileStorage("Source.fs", create=True)) self._dst = ZODB.tests.hexstorage.HexStorage( ZODB.FileStorage.FileStorage("Dest.fs", create=True)) class FileStorageNoRestore(ZODB.FileStorage.FileStorage): @property def restore(self): raise Exception class FileStorageNoRestoreRecoveryTest(FileStorageRecoveryTest): # This test actually verifies a code path of # BaseStorage.copyTransactionsFrom. For simplicity of implementation, we # use a FileStorage deprived of its restore method. def setUp(self): StorageTestBase.StorageTestBase.setUp(self) self._storage = FileStorageNoRestore("Source.fs", create=True) self._dst = FileStorageNoRestore("Dest.fs", create=True) def new_dest(self): return FileStorageNoRestore('Dest.fs') def checkRestoreAcrossPack(self): # Skip this check as it calls restore directly. 
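# (FileStorageNoRestore turns any access to ``restore`` into an exception, so
# a base-class test that calls restore directly cannot run here; the remaining
# recovery tests exercise BaseStorage.copyTransactionsFrom's non-restore
# fallback path instead.)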
pass class AnalyzeDotPyTest(StorageTestBase.StorageTestBase): def setUp(self): StorageTestBase.StorageTestBase.setUp(self) self._storage = ZODB.FileStorage.FileStorage("Source.fs", create=True) def checkanalyze(self): import new, sys, pickle from BTrees.OOBTree import OOBTree from ZODB.scripts import analyze # Set up a module to act as a broken import module_name = 'brokenmodule' module = new.module(module_name) sys.modules[module_name] = module class Broken(MinPO): __module__ = module_name module.Broken = Broken oids = [[self._storage.new_oid(), None] for i in range(3)] for i in range(2): t = transaction.Transaction() self._storage.tpc_begin(t) # sometimes data is in this format j = 0 oid, revid = oids[j] serial = self._storage.store( oid, revid, pickle.dumps(OOBTree, 1), "", t) oids[j][1] = serial # and it could be from a broken module j = 1 oid, revid = oids[j] serial = self._storage.store( oid, revid, pickle.dumps(Broken, 1), "", t) oids[j][1] = serial # but mostly it looks like this j = 2 o = MinPO(j) oid, revid = oids[j] serial = self._storage.store(oid, revid, zodb_pickle(o), "", t) oids[j][1] = serial self._storage.tpc_vote(t) self._storage.tpc_finish(t) # now break the import of the Broken class del sys.modules[module_name] # from ZODB.scripts.analyze.analyze fsi = self._storage.iterator() rep = analyze.Report() for txn in fsi: analyze.analyze_trans(rep, txn) # from ZODB.scripts.analyze.report typemap = rep.TYPEMAP.keys() typemap.sort() cumpct = 0.0 for t in typemap: pct = rep.TYPESIZE[t] * 100.0 / rep.DBYTES cumpct += pct self.assertAlmostEqual(cumpct, 100.0, 0, "Failed to analyze some records") # Raise an exception if the tids in FileStorage fs aren't # strictly increasing. def checkIncreasingTids(fs): lasttid = '\0' * 8 for txn in fs.iterator(): if lasttid >= txn.tid: raise ValueError("tids out of order %r >= %r" % (lasttid, txn.tid)) lasttid = txn.tid # Return a TimeStamp object 'minutes' minutes in the future. def timestamp(minutes): import time from persistent.TimeStamp import TimeStamp t = time.time() + 60 * minutes return TimeStamp(*time.gmtime(t)[:5] + (t % 60,)) def testTimeTravelOnOpen(): """ >>> from ZODB.FileStorage import FileStorage >>> from zope.testing.loggingsupport import InstalledHandler Arrange to capture log messages -- they're an important part of this test! >>> handler = InstalledHandler('ZODB.FileStorage') Create a new file storage. >>> st = FileStorage('temp.fs', create=True) >>> db = DB(st) >>> db.close() First check the normal case: transactions are recorded with increasing tids, and time doesn't run backwards. >>> st = FileStorage('temp.fs') >>> db = DB(st) >>> conn = db.open() >>> conn.root()['xyz'] = 1 >>> transaction.get().commit() >>> checkIncreasingTids(st) >>> db.close() >>> st.cleanup() # remove .fs, .index, etc files >>> handler.records # i.e., no log messages [] Now force the database to have transaction records with tids from the future. >>> st = FileStorage('temp.fs', create=True) >>> st._ts = timestamp(15) # 15 minutes in the future >>> db = DB(st) >>> db.close() >>> st = FileStorage('temp.fs') # this should log a warning >>> db = DB(st) >>> conn = db.open() >>> conn.root()['xyz'] = 1 >>> transaction.get().commit() >>> checkIncreasingTids(st) >>> db.close() >>> st.cleanup() >>> [record.levelname for record in handler.records] ['WARNING'] >>> handler.clear() And one more time, with transaction records far in the future. We expect to log a critical error then, as a time so far in the future probably indicates a real problem with the system. 
Shorter spans may be due to clock drift. >>> st = FileStorage('temp.fs', create=True) >>> st._ts = timestamp(60) # an hour in the future >>> db = DB(st) >>> db.close() >>> st = FileStorage('temp.fs') # this should log a critical error >>> db = DB(st) >>> conn = db.open() >>> conn.root()['xyz'] = 1 >>> transaction.get().commit() >>> checkIncreasingTids(st) >>> db.close() >>> st.cleanup() >>> [record.levelname for record in handler.records] ['CRITICAL'] >>> handler.clear() >>> handler.uninstall() """ def lastInvalidations(): """ The last invalidations method is used by a storage server to populate it's data structure of recent invalidations. The lastInvalidations method is passed a count and must return up to count number of the most recent transactions. We'll create a FileStorage and populate it with some data, keeping track of the transactions along the way: >>> fs = ZODB.FileStorage.FileStorage('t.fs', create=True) >>> db = DB(fs) >>> conn = db.open() >>> from persistent.mapping import PersistentMapping >>> last = [] >>> for i in range(100): ... conn.root()[i] = PersistentMapping() ... transaction.commit() ... last.append(fs.lastTransaction()) Now, we can call lastInvalidations on it: >>> invalidations = fs.lastInvalidations(10) >>> [t for (t, oids) in invalidations] == last[-10:] True >>> from ZODB.utils import u64 >>> [[int(u64(oid)) for oid in oids] ... for (i, oids) in invalidations] ... # doctest: +NORMALIZE_WHITESPACE [[0, 91], [0, 92], [0, 93], [0, 94], [0, 95], [0, 96], [0, 97], [0, 98], [0, 99], [0, 100]] If we ask for more transactions than there are, we'll get as many as there are: >>> len(fs.lastInvalidations(1000)) 101 Of course, calling lastInvalidations on an empty storage refturns no data: >>> db.close() >>> fs = ZODB.FileStorage.FileStorage('t.fs', create=True) >>> list(fs.lastInvalidations(10)) [] >>> fs.close() """ def deal_with_finish_failures(): r""" It's really bad to get errors in FileStorage's _finish method, as that can cause the file storage to be in an inconsistent state. The data file will be fine, but the internal data structures might be hosed. For this reason, FileStorage will close if there is an error after it has finished writing transaction data. It bothers to do very little after writing this data, so this should rarely, if ever, happen. >>> fs = ZODB.FileStorage.FileStorage('data.fs') >>> db = DB(fs) >>> conn = db.open() >>> conn.root()[1] = 1 >>> transaction.commit() Now, we'll indentially break the file storage. It provides a hook for this purpose. :) >>> fs._finish_finish = lambda : None >>> conn.root()[1] = 1 >>> import zope.testing.loggingsupport >>> handler = zope.testing.loggingsupport.InstalledHandler( ... 'ZODB.FileStorage') >>> transaction.commit() Traceback (most recent call last): ... TypeError: () takes no arguments (1 given) >>> print handler ZODB.FileStorage CRITICAL Failure in _finish. Closing. >>> handler.uninstall() >>> fs.load('\0'*8, '') # doctest: +ELLIPSIS Traceback (most recent call last): ... ValueError: ... >>> db.close() >>> fs = ZODB.FileStorage.FileStorage('data.fs') >>> db = DB(fs) >>> conn = db.open() >>> conn.root() {1: 1} >>> transaction.abort() >>> db.close() """ def pack_with_open_blob_files(): """ Make sure packing works while there are open blob files. 
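(The scenario below keeps a read handle open on an already-committed blob,
runs ``db.pack()``, and then checks that the open handle still yields the
original data -- packing should not remove a blob file out from under a
reader that already has it open.)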
>>> fs = ZODB.FileStorage.FileStorage('data.fs', blob_dir='blobs') >>> db = ZODB.DB(fs) >>> tm1 = transaction.TransactionManager() >>> conn1 = db.open(tm1) >>> import ZODB.blob >>> conn1.root()[1] = ZODB.blob.Blob() >>> conn1.add(conn1.root()[1]) >>> conn1.root()[1].open('w').write('some data') >>> tm1.commit() >>> tm2 = transaction.TransactionManager() >>> conn2 = db.open(tm2) >>> f = conn1.root()[1].open() >>> conn1.root()[2] = ZODB.blob.Blob() >>> conn1.add(conn1.root()[2]) >>> conn1.root()[2].open('w').write('some more data') >>> db.pack() >>> f.read() 'some data' >>> f.close() >>> tm1.commit() >>> conn2.sync() >>> conn2.root()[2].open().read() 'some more data' >>> db.close() """ def test_suite(): suite = unittest.TestSuite() for klass in [ FileStorageTests, FileStorageHexTests, Corruption.FileStorageCorruptTests, FileStorageRecoveryTest, FileStorageHexRecoveryTest, FileStorageNoRestoreRecoveryTest, FileStorageTestsWithBlobsEnabled, FileStorageHexTestsWithBlobsEnabled, AnalyzeDotPyTest, ]: suite.addTest(unittest.makeSuite(klass, "check")) suite.addTest(doctest.DocTestSuite( setUp=zope.testing.setupstack.setUpDirectory, tearDown=zope.testing.setupstack.tearDown)) suite.addTest(ZODB.tests.testblob.storage_reusable_suite( 'BlobFileStorage', lambda name, blob_dir: ZODB.FileStorage.FileStorage('%s.fs' % name, blob_dir=blob_dir), test_blob_storage_recovery=True, test_packing=True, )) suite.addTest(ZODB.tests.testblob.storage_reusable_suite( 'BlobFileHexStorage', lambda name, blob_dir: ZODB.tests.hexstorage.HexStorage( ZODB.FileStorage.FileStorage('%s.fs' % name, blob_dir=blob_dir)), test_blob_storage_recovery=True, test_packing=True, )) suite.addTest(PackableStorage.IExternalGC_suite( lambda : ZODB.FileStorage.FileStorage( 'data.fs', blob_dir='blobs', pack_gc=False))) suite.layer = ZODB.tests.util.MininalTestLayer('testFileStorage') return suite if __name__=='__main__': unittest.main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testMVCCMappingStorage.py000066400000000000000000000143761230730566700264550ustar00rootroot00000000000000############################################################################## # # Copyright (c) Zope Corporation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## import unittest from persistent.mapping import PersistentMapping import transaction from ZODB.DB import DB from ZODB.tests.MVCCMappingStorage import MVCCMappingStorage import ZODB.blob import ZODB.tests.testblob from ZODB.tests import ( BasicStorage, HistoryStorage, IteratorStorage, MTStorage, PackableStorage, RevisionStorage, StorageTestBase, Synchronization, ) class MVCCTests: def checkCrossConnectionInvalidation(self): # Verify connections see updated state at txn boundaries. # This will fail if the Connection doesn't poll for changes. 
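# (Rough shape of what is verified: each connection works against its own
# snapshot, and only at a transaction boundary or an explicit ``sync()`` does
# it poll the storage for invalidations and see the other connection's commit.
# Hence the three-step check below: not visible before the commit, still not
# visible right after it, visible once ``c2.sync()`` has polled.)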
db = DB(self._storage) try: c1 = db.open(transaction.TransactionManager()) r1 = c1.root() r1['myobj'] = 'yes' c2 = db.open(transaction.TransactionManager()) r2 = c2.root() self.assert_('myobj' not in r2) c1.transaction_manager.commit() self.assert_('myobj' not in r2) c2.sync() self.assert_('myobj' in r2) self.assert_(r2['myobj'] == 'yes') finally: db.close() def checkCrossConnectionIsolation(self): # Verify MVCC isolates connections. # This will fail if Connection doesn't poll for changes. db = DB(self._storage) try: c1 = db.open() r1 = c1.root() r1['alpha'] = PersistentMapping() r1['gamma'] = PersistentMapping() transaction.commit() # Open a second connection but don't load root['alpha'] yet c2 = db.open() r2 = c2.root() r1['alpha']['beta'] = 'yes' storage = c1._storage t = transaction.Transaction() t.description = 'isolation test 1' storage.tpc_begin(t) c1.commit(t) storage.tpc_vote(t) storage.tpc_finish(t) # The second connection will now load root['alpha'], but due to # MVCC, it should continue to see the old state. self.assert_(r2['alpha']._p_changed is None) # A ghost self.assert_(not r2['alpha']) self.assert_(r2['alpha']._p_changed == 0) # make root['alpha'] visible to the second connection c2.sync() # Now it should be in sync self.assert_(r2['alpha']._p_changed is None) # A ghost self.assert_(r2['alpha']) self.assert_(r2['alpha']._p_changed == 0) self.assert_(r2['alpha']['beta'] == 'yes') # Repeat the test with root['gamma'] r1['gamma']['delta'] = 'yes' storage = c1._storage t = transaction.Transaction() t.description = 'isolation test 2' storage.tpc_begin(t) c1.commit(t) storage.tpc_vote(t) storage.tpc_finish(t) # The second connection will now load root[3], but due to MVCC, # it should continue to see the old state. self.assert_(r2['gamma']._p_changed is None) # A ghost self.assert_(not r2['gamma']) self.assert_(r2['gamma']._p_changed == 0) # make root[3] visible to the second connection c2.sync() # Now it should be in sync self.assert_(r2['gamma']._p_changed is None) # A ghost self.assert_(r2['gamma']) self.assert_(r2['gamma']._p_changed == 0) self.assert_(r2['gamma']['delta'] == 'yes') finally: db.close() class MVCCMappingStorageTests( StorageTestBase.StorageTestBase, BasicStorage.BasicStorage, HistoryStorage.HistoryStorage, IteratorStorage.ExtendedIteratorStorage, IteratorStorage.IteratorStorage, MTStorage.MTStorage, PackableStorage.PackableStorageWithOptionalGC, RevisionStorage.RevisionStorage, Synchronization.SynchronizedStorage, MVCCTests ): def setUp(self): self._storage = MVCCMappingStorage() def tearDown(self): self._storage.close() def checkLoadBeforeUndo(self): pass # we don't support undo yet checkUndoZombie = checkLoadBeforeUndo def checkTransactionIdIncreases(self): import time from ZODB.utils import newTid from ZODB.TimeStamp import TimeStamp t = transaction.Transaction() self._storage.tpc_begin(t) self._storage.tpc_vote(t) self._storage.tpc_finish(t) # Add a fake transaction transactions = self._storage._transactions self.assertEqual(1, len(transactions)) fake_timestamp = 'zzzzzzzy' # the year 5735 ;-) transactions[fake_timestamp] = transactions.values()[0] # Verify the next transaction comes after the fake transaction t = transaction.Transaction() self._storage.tpc_begin(t) self.assertEqual(self._storage._tid, 'zzzzzzzz') def create_blob_storage(name, blob_dir): s = MVCCMappingStorage(name) return ZODB.blob.BlobStorage(blob_dir, s) def test_suite(): suite = unittest.makeSuite(MVCCMappingStorageTests, 'check') # Note: test_packing doesn't work because even though 
MVCCMappingStorage # retains history, it does not provide undo methods, so the # BlobStorage wrapper calls _packNonUndoing instead of _packUndoing, # causing blobs to get deleted even though object states are retained. suite.addTest(ZODB.tests.testblob.storage_reusable_suite( 'MVCCMapping', create_blob_storage, test_undo=False, )) return suite if __name__ == "__main__": loader = unittest.TestLoader() loader.testMethodPrefix = "check" unittest.main(testLoader=loader) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testMappingStorage.py000066400000000000000000000044211230730566700257720ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## import ZODB.MappingStorage import unittest import ZODB.tests.hexstorage from ZODB.tests import ( BasicStorage, HistoryStorage, IteratorStorage, MTStorage, PackableStorage, RevisionStorage, StorageTestBase, Synchronization, ) class MappingStorageTests( StorageTestBase.StorageTestBase, BasicStorage.BasicStorage, HistoryStorage.HistoryStorage, IteratorStorage.ExtendedIteratorStorage, IteratorStorage.IteratorStorage, MTStorage.MTStorage, PackableStorage.PackableStorageWithOptionalGC, RevisionStorage.RevisionStorage, Synchronization.SynchronizedStorage, ): def setUp(self): StorageTestBase.StorageTestBase.setUp(self, ) self._storage = ZODB.MappingStorage.MappingStorage() def checkOversizeNote(self): # This base class test checks for the common case where a storage # doesnt support huge transaction metadata. This storage doesnt # have this limit, so we inhibit this test here. pass def checkLoadBeforeUndo(self): pass # we don't support undo yet checkUndoZombie = checkLoadBeforeUndo class MappingStorageHexTests(MappingStorageTests): def setUp(self): StorageTestBase.StorageTestBase.setUp(self, ) self._storage = ZODB.tests.hexstorage.HexStorage( ZODB.MappingStorage.MappingStorage()) def test_suite(): suite = unittest.makeSuite(MappingStorageTests, 'check') suite = unittest.makeSuite(MappingStorageHexTests, 'check') return suite if __name__ == "__main__": loader = unittest.TestLoader() loader.testMethodPrefix = "check" unittest.main(testLoader=loader) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testPersistentList.py000066400000000000000000000135071230730566700260530ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## """Test the list interface to PersistentList """ import unittest from persistent.list import PersistentList l0 = [] l1 = [0] l2 = [0, 1] class TestPList(unittest.TestCase): def checkTheWorld(self): # Test constructors u = PersistentList() u0 = PersistentList(l0) u1 = PersistentList(l1) u2 = PersistentList(l2) uu = PersistentList(u) uu0 = PersistentList(u0) uu1 = PersistentList(u1) uu2 = PersistentList(u2) v = PersistentList(tuple(u)) class OtherList: def __init__(self, initlist): self.__data = initlist def __len__(self): return len(self.__data) def __getitem__(self, i): return self.__data[i] v0 = PersistentList(OtherList(u0)) vv = PersistentList("this is also a sequence") # Test __repr__ eq = self.assertEqual eq(str(u0), str(l0), "str(u0) == str(l0)") eq(repr(u1), repr(l1), "repr(u1) == repr(l1)") eq(`u2`, `l2`, "`u2` == `l2`") # Test __cmp__ and __len__ def mycmp(a, b): r = cmp(a, b) if r < 0: return -1 if r > 0: return 1 return r all = [l0, l1, l2, u, u0, u1, u2, uu, uu0, uu1, uu2] for a in all: for b in all: eq(mycmp(a, b), mycmp(len(a), len(b)), "mycmp(a, b) == mycmp(len(a), len(b))") # Test __getitem__ for i in range(len(u2)): eq(u2[i], i, "u2[i] == i") # Test __setitem__ uu2[0] = 0 uu2[1] = 100 try: uu2[2] = 200 except IndexError: pass else: raise TestFailed("uu2[2] shouldn't be assignable") # Test __delitem__ del uu2[1] del uu2[0] try: del uu2[0] except IndexError: pass else: raise TestFailed("uu2[0] shouldn't be deletable") # Test __getslice__ for i in range(-3, 4): eq(u2[:i], l2[:i], "u2[:i] == l2[:i]") eq(u2[i:], l2[i:], "u2[i:] == l2[i:]") for j in range(-3, 4): eq(u2[i:j], l2[i:j], "u2[i:j] == l2[i:j]") # Test __setslice__ for i in range(-3, 4): u2[:i] = l2[:i] eq(u2, l2, "u2 == l2") u2[i:] = l2[i:] eq(u2, l2, "u2 == l2") for j in range(-3, 4): u2[i:j] = l2[i:j] eq(u2, l2, "u2 == l2") uu2 = u2[:] uu2[:0] = [-2, -1] eq(uu2, [-2, -1, 0, 1], "uu2 == [-2, -1, 0, 1]") uu2[0:] = [] eq(uu2, [], "uu2 == []") # Test __contains__ for i in u2: self.failUnless(i in u2, "i in u2") for i in min(u2)-1, max(u2)+1: self.failUnless(i not in u2, "i not in u2") # Test __delslice__ uu2 = u2[:] del uu2[1:2] del uu2[0:1] eq(uu2, [], "uu2 == []") uu2 = u2[:] del uu2[1:] del uu2[:1] eq(uu2, [], "uu2 == []") # Test __add__, __radd__, __mul__ and __rmul__ #self.failUnless(u1 + [] == [] + u1 == u1, "u1 + [] == [] + u1 == u1") self.failUnless(u1 + [1] == u2, "u1 + [1] == u2") #self.failUnless([-1] + u1 == [-1, 0], "[-1] + u1 == [-1, 0]") self.failUnless(u2 == u2*1 == 1*u2, "u2 == u2*1 == 1*u2") self.failUnless(u2+u2 == u2*2 == 2*u2, "u2+u2 == u2*2 == 2*u2") self.failUnless(u2+u2+u2 == u2*3 == 3*u2, "u2+u2+u2 == u2*3 == 3*u2") # Test append u = u1[:] u.append(1) eq(u, u2, "u == u2") # Test insert u = u2[:] u.insert(0, -1) eq(u, [-1, 0, 1], "u == [-1, 0, 1]") # Test pop u = PersistentList([0, -1, 1]) u.pop() eq(u, [0, -1], "u == [0, -1]") u.pop(0) eq(u, [-1], "u == [-1]") # Test remove u = u2[:] u.remove(1) eq(u, u1, "u == u1") # Test count u = u2*3 eq(u.count(0), 3, "u.count(0) == 3") eq(u.count(1), 3, "u.count(1) == 3") eq(u.count(2), 0, "u.count(2) == 0") # Test index eq(u2.index(0), 0, "u2.index(0) == 0") eq(u2.index(1), 1, "u2.index(1) == 1") try: u2.index(2) except ValueError: pass else: raise TestFailed("expected ValueError") # Test reverse u = u2[:] u.reverse() eq(u, [1, 0], "u == [1, 0]") u.reverse() eq(u, u2, "u == u2") # Test sort u = PersistentList([1, 0]) u.sort() eq(u, u2, "u == u2") # Test extend u = u1[:] 
u.extend(u2) eq(u, u1 + u2, "u == u1 + u2") def checkBackwardCompat(self): # Verify that the sanest of the ZODB 3.2 dotted paths still works. from ZODB.PersistentList import PersistentList as oldPath self.assert_(oldPath is PersistentList) def test_suite(): return unittest.makeSuite(TestPList, 'check') if __name__ == "__main__": loader = unittest.TestLoader() loader.testMethodPrefix = "check" unittest.main(testLoader=loader) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testPersistentMapping.py000066400000000000000000000141051230730566700265260ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Verify that PersistentMapping works with old versions of Zope. The comments in PersistentMapping.py address the issue in some detail. The pickled form of a PersistentMapping must use _container to store the actual mapping, because old versions of Zope used this attribute. If the new code doesn't generate pickles that are consistent with the old code, developers will have a hard time testing the new code. """ import unittest import transaction from transaction import Transaction import ZODB from ZODB.MappingStorage import MappingStorage import cPickle import cStringIO import sys # This pickle contains a persistent mapping pickle created from the # old code. pickle = ('((U\x0bPersistenceq\x01U\x11PersistentMappingtq\x02Nt.}q\x03U\n' '_containerq\x04}q\x05U\x07versionq\x06U\x03oldq\x07ss.\n') class PMTests(unittest.TestCase): def checkOldStyleRoot(self): # The Persistence module doesn't exist in Zope3's idea of what ZODB # is, but the global `pickle` references it explicitly. So just # bail if Persistence isn't available. try: import Persistence except ImportError: return # insert the pickle in place of the root s = MappingStorage() t = Transaction() s.tpc_begin(t) s.store('\000' * 8, None, pickle, '', t) s.tpc_vote(t) s.tpc_finish(t) db = ZODB.DB(s) # If the root can be loaded successfully, we should be okay. r = db.open().root() # But make sure it looks like a new mapping self.assert_(hasattr(r, 'data')) self.assert_(not hasattr(r, '_container')) # TODO: This test fails in ZODB 3.3a1. It's making some assumption(s) # about pickles that aren't true. Hard to say when it stopped working, # because this entire test suite hasn't been run for a long time, due to # a mysterious "return None" at the start of the test_suite() function # below. I noticed that when the new checkBackwardCompat() test wasn't # getting run. def TODO_checkNewPicklesAreSafe(self): s = MappingStorage() db = ZODB.DB(s) r = db.open().root() r[1] = 1 r[2] = 2 r[3] = r transaction.commit() # MappingStorage stores serialno + pickle in its _index. 
root_pickle = s._index['\000' * 8][8:] f = cStringIO.StringIO(root_pickle) u = cPickle.Unpickler(f) klass_info = u.load() klass = find_global(*klass_info[0]) inst = klass.__new__(klass) state = u.load() inst.__setstate__(state) self.assert_(hasattr(inst, '_container')) self.assert_(not hasattr(inst, 'data')) def checkBackwardCompat(self): # Verify that the sanest of the ZODB 3.2 dotted paths still works. from persistent.mapping import PersistentMapping as newPath from ZODB.PersistentMapping import PersistentMapping as oldPath self.assert_(oldPath is newPath) def checkBasicOps(self): from persistent.mapping import PersistentMapping m = PersistentMapping({'x': 1}, a=2, b=3) m['name'] = 'bob' self.assertEqual(m['name'], "bob") self.assertEqual(m.get('name', 42), "bob") self.assert_('name' in m) try: m['fred'] except KeyError: pass else: self.fail("expected KeyError") self.assert_('fred' not in m) self.assertEqual(m.get('fred'), None) self.assertEqual(m.get('fred', 42), 42) keys = m.keys() keys.sort() self.assertEqual(keys, ['a', 'b', 'name', 'x']) values = m.values() values.sort() self.assertEqual(values, [1, 2, 3, 'bob']) items = m.items() items.sort() self.assertEqual(items, [('a', 2), ('b', 3), ('name', 'bob'), ('x', 1)]) keys = list(m.iterkeys()) keys.sort() self.assertEqual(keys, ['a', 'b', 'name', 'x']) values = list(m.itervalues()) values.sort() self.assertEqual(values, [1, 2, 3, 'bob']) items = list(m.iteritems()) items.sort() self.assertEqual(items, [('a', 2), ('b', 3), ('name', 'bob'), ('x', 1)]) # PersistentMapping didn't have an __iter__ method before ZODB 3.4.2. # Check that it plays well now with the Python iteration protocol. def checkIteration(self): from persistent.mapping import PersistentMapping m = PersistentMapping({'x': 1}, a=2, b=3) m['name'] = 'bob' def check(keylist): keylist.sort() self.assertEqual(keylist, ['a', 'b', 'name', 'x']) check(list(m)) check([key for key in m]) i = iter(m) keylist = [] while 1: try: key = i.next() except StopIteration: break keylist.append(key) check(keylist) def find_global(modulename, classname): """Helper for this test suite to get special PersistentMapping""" if classname == "PersistentMapping": class PersistentMapping(object): def __setstate__(self, state): self.__dict__.update(state) return PersistentMapping else: __import__(modulename) mod = sys.modules[modulename] return getattr(mod, classname) def test_suite(): return unittest.makeSuite(PMTests, 'check') if __name__ == "__main__": unittest.main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testRecover.py000066400000000000000000000155701230730566700244660ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## """Tests of the file storage recovery script.""" import base64 import os import random import sys import unittest import StringIO import ZODB import ZODB.tests.util from ZODB.FileStorage import FileStorage import ZODB.fsrecover from persistent.mapping import PersistentMapping import transaction class RecoverTest(ZODB.tests.util.TestCase): path = None def setUp(self): ZODB.tests.util.TestCase.setUp(self) self.path = 'source.fs' self.storage = FileStorage(self.path) self.populate() self.dest = 'dest.fs' self.recovered = None def tearDown(self): self.storage.close() if self.recovered is not None: self.recovered.close() temp = FileStorage(self.dest) temp.close() ZODB.tests.util.TestCase.tearDown(self) def populate(self): db = ZODB.DB(self.storage) cn = db.open() rt = cn.root() # Create a bunch of objects; the Data.fs is about 100KB. for i in range(50): d = rt[i] = PersistentMapping() transaction.commit() for j in range(50): d[j] = "a" * j transaction.commit() def damage(self, num, size): self.storage.close() # Drop size null bytes into num random spots. for i in range(num): offset = random.randint(0, self.storage._pos - size) f = open(self.path, "a+b") f.seek(offset) f.write("\0" * size) f.close() ITERATIONS = 5 # Run recovery, from self.path to self.dest. Return whatever # recovery printed to stdout, as a string. def recover(self): orig_stdout = sys.stdout faux_stdout = StringIO.StringIO() try: sys.stdout = faux_stdout try: ZODB.fsrecover.recover(self.path, self.dest, verbose=0, partial=True, force=False, pack=1) except SystemExit: raise RuntimeError("recover tried to exit") finally: sys.stdout = orig_stdout return faux_stdout.getvalue() # Caution: because recovery is robust against many kinds of damage, # it's almost impossible for a call to self.recover() to raise an # exception. As a result, these tests may pass even if fsrecover.py # is broken badly. testNoDamage() tries to ensure that at least # recovery doesn't produce any error msgs if the input .fs is in # fact not damaged. def testNoDamage(self): output = self.recover() self.assert_('error' not in output, output) self.assert_('\n0 bytes removed during recovery' in output, output) # Verify that the recovered database is identical to the original. before = file(self.path, 'rb') before_guts = before.read() before.close() after = file(self.dest, 'rb') after_guts = after.read() after.close() self.assertEqual(before_guts, after_guts, "recovery changed a non-damaged .fs file") def testOneBlock(self): for i in range(self.ITERATIONS): self.damage(1, 1024) output = self.recover() self.assert_('error' in output, output) self.recovered = FileStorage(self.dest) self.recovered.close() os.remove(self.path) os.rename(self.dest, self.path) def testFourBlocks(self): for i in range(self.ITERATIONS): self.damage(4, 512) output = self.recover() self.assert_('error' in output, output) self.recovered = FileStorage(self.dest) self.recovered.close() os.remove(self.path) os.rename(self.dest, self.path) def testBigBlock(self): for i in range(self.ITERATIONS): self.damage(1, 32 * 1024) output = self.recover() self.assert_('error' in output, output) self.recovered = FileStorage(self.dest) self.recovered.close() os.remove(self.path) os.rename(self.dest, self.path) def testBadTransaction(self): # Find transaction headers and blast them. 
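        # undoLog() returns recent transaction records whose ids are
        # base64-encoded tids; _txn_find() maps a decoded tid to its file
        # offset, giving positions whose headers we can then overwrite.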
L = self.storage.undoLog() r = L[3] tid = base64.decodestring(r["id"] + "\n") pos1 = self.storage._txn_find(tid, 0) r = L[8] tid = base64.decodestring(r["id"] + "\n") pos2 = self.storage._txn_find(tid, 0) self.storage.close() # Overwrite the entire header. f = open(self.path, "a+b") f.seek(pos1 - 50) f.write("\0" * 100) f.close() output = self.recover() self.assert_('error' in output, output) self.recovered = FileStorage(self.dest) self.recovered.close() os.remove(self.path) os.rename(self.dest, self.path) # Overwrite part of the header. f = open(self.path, "a+b") f.seek(pos2 + 10) f.write("\0" * 100) f.close() output = self.recover() self.assert_('error' in output, output) self.recovered = FileStorage(self.dest) self.recovered.close() # Issue 1846: When a transaction had 'c' status (not yet committed), # the attempt to open a temp file to write the trailing bytes fell # into an infinite loop. def testUncommittedAtEnd(self): # Find a transaction near the end. L = self.storage.undoLog() r = L[1] tid = base64.decodestring(r["id"] + "\n") pos = self.storage._txn_find(tid, 0) # Overwrite its status with 'c'. f = open(self.path, "r+b") f.seek(pos + 16) current_status = f.read(1) self.assertEqual(current_status, ' ') f.seek(pos + 16) f.write('c') f.close() # Try to recover. The original bug was that this never completed -- # infinite loop in fsrecover.py. Also, in the ZODB 3.2 line, # reference to an undefined global masked the infinite loop. self.recover() # Verify the destination got truncated. self.assertEqual(os.path.getsize(self.dest), pos) # Get rid of the temp file holding the truncated bytes. os.remove(ZODB.fsrecover._trname) def test_suite(): return unittest.makeSuite(RecoverTest) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testSerialize.py000066400000000000000000000103031230730566700247750ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## import doctest import cPickle import cStringIO as StringIO import sys import unittest from ZODB import serialize class ClassWithNewargs(int): def __new__(cls, value): return int.__new__(cls, value) def __getnewargs__(self): return int(self), class ClassWithoutNewargs(object): def __init__(self, value): self.value = value def make_pickle(ob): sio = StringIO.StringIO() p = cPickle.Pickler(sio, 1) p.dump(ob) return sio.getvalue() def test_factory(conn, module_name, name): return globals()[name] class SerializerTestCase(unittest.TestCase): # old format: (module, name), None old_style_without_newargs = make_pickle( ((__name__, "ClassWithoutNewargs"), None)) # old format: (module, name), argtuple old_style_with_newargs = make_pickle( ((__name__, "ClassWithNewargs"), (1,))) # new format: klass new_style_without_newargs = make_pickle( ClassWithoutNewargs) # new format: klass, argtuple new_style_with_newargs = make_pickle( (ClassWithNewargs, (1,))) def test_getClassName(self): r = serialize.ObjectReader(factory=test_factory) eq = self.assertEqual eq(r.getClassName(self.old_style_with_newargs), __name__ + ".ClassWithNewargs") eq(r.getClassName(self.new_style_with_newargs), __name__ + ".ClassWithNewargs") eq(r.getClassName(self.old_style_without_newargs), __name__ + ".ClassWithoutNewargs") eq(r.getClassName(self.new_style_without_newargs), __name__ + ".ClassWithoutNewargs") def test_getGhost(self): # Use a TestObjectReader since we need _get_class() to be # implemented; otherwise this is just a BaseObjectReader. class TestObjectReader(serialize.ObjectReader): # A production object reader would optimize this, but we # don't need to in a test def _get_class(self, module, name): __import__(module) return getattr(sys.modules[module], name) r = TestObjectReader(factory=test_factory) g = r.getGhost(self.old_style_with_newargs) self.assert_(isinstance(g, ClassWithNewargs)) self.assertEqual(g, 1) g = r.getGhost(self.old_style_without_newargs) self.assert_(isinstance(g, ClassWithoutNewargs)) g = r.getGhost(self.new_style_with_newargs) self.assert_(isinstance(g, ClassWithNewargs)) g = r.getGhost(self.new_style_without_newargs) self.assert_(isinstance(g, ClassWithoutNewargs)) def test_myhasattr(self): class OldStyle: bar = "bar" def __getattr__(self, name): if name == "error": raise ValueError("whee!") else: raise AttributeError(name) class NewStyle(object): bar = "bar" def _raise(self): raise ValueError("whee!") error = property(_raise) self.assertRaises(ValueError, serialize.myhasattr, OldStyle(), "error") self.assertRaises(ValueError, serialize.myhasattr, NewStyle(), "error") self.assert_(serialize.myhasattr(OldStyle(), "bar")) self.assert_(serialize.myhasattr(NewStyle(), "bar")) self.assert_(not serialize.myhasattr(OldStyle(), "rat")) self.assert_(not serialize.myhasattr(NewStyle(), "rat")) def test_suite(): suite = unittest.makeSuite(SerializerTestCase) suite.addTest(doctest.DocTestSuite("ZODB.serialize")) return suite ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testTimeStamp.py000066400000000000000000000123541230730566700247610ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Test the TimeStamp utility type""" import time import unittest from persistent.TimeStamp import TimeStamp EPSILON = 0.000001 class TimeStampTests(unittest.TestCase): def checkYMDTimeStamp(self): self._check_ymd(2001, 6, 3) def _check_ymd(self, yr, mo, dy): ts = TimeStamp(yr, mo, dy) self.assertEqual(ts.year(), yr) self.assertEqual(ts.month(), mo) self.assertEqual(ts.day(), dy) self.assertEquals(ts.hour(), 0) self.assertEquals(ts.minute(), 0) self.assertEquals(ts.second(), 0) t = time.gmtime(ts.timeTime()) self.assertEquals(yr, t[0]) self.assertEquals(mo, t[1]) self.assertEquals(dy, t[2]) def checkFullTimeStamp(self): native_ts = int(time.time()) # fractional seconds get in the way t = time.gmtime(native_ts) # the corresponding GMT struct tm ts = TimeStamp(*t[:6]) # Seconds are stored internally via (conceptually) multiplying by # 2**32 then dividing by 60, ending up with a 32-bit integer. # While this gives a lot of room for cramming many distinct # TimeStamps into a second, it's not good at roundtrip accuracy. # For example, 1 second is stored as int(2**32/60) == 71582788. # Converting back gives 71582788*60.0/2**32 == 0.9999999962747097. # In general, we can lose up to 0.999... to truncation during # storing, creating an absolute error up to about 1*60.0/2**32 == # 0.000000014 on the seconds value we get back. This is so even # when we have an exact integral second value going in (as we # do in this test), so we can't expect equality in any comparison # involving seconds. Minutes (etc) are stored exactly, so we # can expect equality for those. 
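        # A quick numeric sketch of the arithmetic described above (these
        # figures only restate the comment; they are not extra assertions):
        #     int(1 * 2 ** 32 / 60)      == 71582788            # stored
        #     71582788 * 60.0 / 2 ** 32  == 0.9999999962747097  # recovered
        #     60.0 / 2 ** 32             ~= 1.4e-08   (well below EPSILON)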
self.assert_(abs(ts.timeTime() - native_ts) < EPSILON) self.assertEqual(ts.year(), t[0]) self.assertEqual(ts.month(), t[1]) self.assertEqual(ts.day(), t[2]) self.assertEquals(ts.hour(), t[3]) self.assertEquals(ts.minute(), t[4]) self.assert_(abs(ts.second() - t[5]) < EPSILON) def checkRawTimestamp(self): t = time.gmtime() ts1 = TimeStamp(*t[:6]) ts2 = TimeStamp(`ts1`) self.assertEquals(ts1, ts2) self.assertEquals(ts1.timeTime(), ts2.timeTime()) self.assertEqual(ts1.year(), ts2.year()) self.assertEqual(ts1.month(), ts2.month()) self.assertEqual(ts1.day(), ts2.day()) self.assertEquals(ts1.hour(), ts2.hour()) self.assertEquals(ts1.minute(), ts2.minute()) self.assert_(abs(ts1.second() - ts2.second()) < EPSILON) def checkDictKey(self): t = time.gmtime() ts1 = TimeStamp(*t[:6]) ts2 = TimeStamp(2000, *t[1:6]) d = {} d[ts1] = 1 d[ts2] = 2 self.assertEquals(len(d), 2) def checkCompare(self): ts1 = TimeStamp(1972, 6, 27) ts2 = TimeStamp(1971, 12, 12) self.assert_(ts1 > ts2) self.assert_(ts2 <= ts1) def checkLaterThan(self): t = time.gmtime() ts = TimeStamp(*t[:6]) ts2 = ts.laterThan(ts) self.assert_(ts2 > ts) # TODO: should test for bogus inputs to TimeStamp constructor def checkTimeStamp(self): # Alternate test suite t = TimeStamp(2002, 1, 23, 10, 48, 5) # GMT self.assertEquals(str(t), '2002-01-23 10:48:05.000000') self.assertEquals(repr(t), '\x03B9H\x15UUU') self.assertEquals(TimeStamp('\x03B9H\x15UUU'), t) self.assertEquals(t.year(), 2002) self.assertEquals(t.month(), 1) self.assertEquals(t.day(), 23) self.assertEquals(t.hour(), 10) self.assertEquals(t.minute(), 48) self.assertEquals(round(t.second()), 5) self.assertEquals(t.timeTime(), 1011782885) t1 = TimeStamp(2002, 1, 23, 10, 48, 10) self.assertEquals(str(t1), '2002-01-23 10:48:10.000000') self.assert_(t == t) self.assert_(t != t1) self.assert_(t < t1) self.assert_(t <= t1) self.assert_(t1 >= t) self.assert_(t1 > t) self.failIf(t == t1) self.failIf(t != t) self.failIf(t > t1) self.failIf(t >= t1) self.failIf(t1 < t) self.failIf(t1 <= t) self.assertEquals(cmp(t, t), 0) self.assertEquals(cmp(t, t1), -1) self.assertEquals(cmp(t1, t), 1) self.assertEquals(t1.laterThan(t), t1) self.assert_(t.laterThan(t1) > t1) self.assertEquals(TimeStamp(2002,1,23), TimeStamp(2002,1,23,0,0,0)) def test_suite(): return unittest.makeSuite(TimeStampTests, 'check') ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testUtils.py000066400000000000000000000073261230730566700241610ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## """Test the routines to convert between long and 64-bit strings""" from persistent import Persistent import doctest import random import unittest NUM = 100 from ZODB.utils import U64, p64, u64 class TestUtils(unittest.TestCase): small = [random.randrange(1, 1<<32) for i in range(NUM)] large = [random.randrange(1<<32, 1<<64) for i in range(NUM)] all = small + large def checkLongToStringToLong(self): for num in self.all: s = p64(num) n = U64(s) self.assertEquals(num, n, "U64() failed") n2 = u64(s) self.assertEquals(num, n2, "u64() failed") def checkKnownConstants(self): self.assertEquals("\000\000\000\000\000\000\000\001", p64(1)) self.assertEquals("\000\000\000\001\000\000\000\000", p64(1L<<32)) self.assertEquals(u64("\000\000\000\000\000\000\000\001"), 1) self.assertEquals(U64("\000\000\000\000\000\000\000\001"), 1) self.assertEquals(u64("\000\000\000\001\000\000\000\000"), 1L<<32) self.assertEquals(U64("\000\000\000\001\000\000\000\000"), 1L<<32) def checkPersistentIdHandlesDescriptor(self): from ZODB.serialize import ObjectWriter class P(Persistent): pass writer = ObjectWriter(None) self.assertEqual(writer.persistent_id(P), None) # It's hard to know where to put this test. We're checking that the # ConflictError constructor uses utils.py's get_pickle_metadata() to # deduce the class path from a pickle, instead of actually loading # the pickle (and so also trying to import application module and # class objects, which isn't a good idea on a ZEO server when avoidable). def checkConflictErrorDoesntImport(self): from ZODB.serialize import ObjectWriter from ZODB.POSException import ConflictError from ZODB.tests.MinPO import MinPO import cPickle as pickle obj = MinPO() data = ObjectWriter().serialize(obj) # The pickle contains a GLOBAL ('c') opcode resolving to MinPO's # module and class. self.assert_('cZODB.tests.MinPO\nMinPO\n' in data) # Fiddle the pickle so it points to something "impossible" instead. data = data.replace('cZODB.tests.MinPO\nMinPO\n', 'cpath.that.does.not.exist\nlikewise.the.class\n') # Pickle can't resolve that GLOBAL opcode -- gets ImportError. self.assertRaises(ImportError, pickle.loads, data) # Verify that building ConflictError doesn't get ImportError. try: raise ConflictError(object=obj, data=data) except ConflictError, detail: # And verify that the msg names the impossible path. self.assert_('path.that.does.not.exist.likewise.the.class' in str(detail)) else: self.fail("expected ConflictError, but no exception raised") def test_suite(): suite = unittest.TestSuite() suite.addTest(unittest.makeSuite(TestUtils, 'check')) suite.addTest(doctest.DocFileSuite('../utils.txt')) return suite ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testZODB.py000066400000000000000000000532021230730566700236110ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## from persistent import Persistent from persistent.mapping import PersistentMapping from ZODB.POSException import ReadConflictError from ZODB.POSException import TransactionFailedError from BTrees.OOBTree import OOBTree import transaction import unittest import ZODB import ZODB.FileStorage import ZODB.MappingStorage import ZODB.tests.util class P(Persistent): pass class ZODBTests(ZODB.tests.util.TestCase): def setUp(self): ZODB.tests.util.TestCase.setUp(self) self._storage = ZODB.FileStorage.FileStorage( 'ZODBTests.fs', create=1) self._db = ZODB.DB(self._storage) def tearDown(self): self._db.close() ZODB.tests.util.TestCase.tearDown(self) def populate(self): transaction.begin() conn = self._db.open() root = conn.root() root['test'] = pm = PersistentMapping() for n in range(100): pm[n] = PersistentMapping({0: 100 - n}) transaction.get().note('created test data') transaction.commit() conn.close() def checkExportImport(self, abort_it=False): self.populate() conn = self._db.open() try: self.duplicate(conn, abort_it) finally: conn.close() conn = self._db.open() try: self.verify(conn, abort_it) finally: conn.close() def duplicate(self, conn, abort_it): transaction.begin() transaction.get().note('duplication') root = conn.root() ob = root['test'] assert len(ob) > 10, 'Insufficient test data' try: import tempfile f = tempfile.TemporaryFile() ob._p_jar.exportFile(ob._p_oid, f) assert f.tell() > 0, 'Did not export correctly' f.seek(0) new_ob = ob._p_jar.importFile(f) self.assertEqual(new_ob, ob) root['dup'] = new_ob f.close() if abort_it: transaction.abort() else: transaction.commit() except: transaction.abort() raise def verify(self, conn, abort_it): transaction.begin() root = conn.root() ob = root['test'] try: ob2 = root['dup'] except KeyError: if abort_it: # Passed the test. return else: raise else: self.failUnless(not abort_it, 'Did not abort duplication') l1 = list(ob.items()) l1.sort() l2 = list(ob2.items()) l2.sort() l1 = map(lambda (k, v): (k, v[0]), l1) l2 = map(lambda (k, v): (k, v[0]), l2) self.assertEqual(l1, l2) self.assert_(ob._p_oid != ob2._p_oid) self.assertEqual(ob._p_jar, ob2._p_jar) oids = {} for v in ob.values(): oids[v._p_oid] = 1 for v in ob2.values(): assert not oids.has_key(v._p_oid), ( 'Did not fully separate duplicate from original') transaction.commit() def checkExportImportAborted(self): self.checkExportImport(abort_it=True) def checkResetCache(self): # The cache size after a reset should be 0. Note that # _resetCache is not a public API, but the resetCaches() # function is, and resetCaches() causes _resetCache() to be # called. self.populate() conn = self._db.open() conn.root() self.assert_(len(conn._cache) > 0) # Precondition conn._resetCache() self.assertEqual(len(conn._cache), 0) def checkResetCachesAPI(self): # Checks the resetCaches() API. # (resetCaches used to be called updateCodeTimestamp.) self.populate() conn = self._db.open() conn.root() self.assert_(len(conn._cache) > 0) # Precondition ZODB.Connection.resetCaches() conn.close() self.assert_(len(conn._cache) > 0) # Still not flushed conn.open() # simulate the connection being reopened self.assertEqual(len(conn._cache), 0) def checkExplicitTransactionManager(self): # Test of transactions that apply to only the connection, # not the thread. 
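        # Each connection is opened with its own TransactionManager, so
        # changes committed through tm1 stay invisible to conn2 until
        # conn2.sync() establishes a new transaction boundary there.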
tm1 = transaction.TransactionManager() conn1 = self._db.open(transaction_manager=tm1) tm2 = transaction.TransactionManager() conn2 = self._db.open(transaction_manager=tm2) try: r1 = conn1.root() r2 = conn2.root() if r1.has_key('item'): del r1['item'] tm1.get().commit() r1.get('item') r2.get('item') r1['item'] = 1 tm1.get().commit() self.assertEqual(r1['item'], 1) # r2 has not seen a transaction boundary, # so it should be unchanged. self.assertEqual(r2.get('item'), None) conn2.sync() # Now r2 is updated. self.assertEqual(r2['item'], 1) # Now, for good measure, send an update in the other direction. r2['item'] = 2 tm2.get().commit() self.assertEqual(r1['item'], 1) self.assertEqual(r2['item'], 2) conn1.sync() conn2.sync() self.assertEqual(r1['item'], 2) self.assertEqual(r2['item'], 2) finally: conn1.close() conn2.close() def checkSavepointDoesntGetInvalidations(self): # Prior to ZODB 3.2.9 and 3.4, Connection.tpc_finish() processed # invalidations even for a subtxn commit. This could make # inconsistent state visible after a subtxn commit. There was a # suspicion that POSKeyError was possible as a result, but I wasn't # able to construct a case where that happened. # Subtxns are deprecated now, but it's good to check that the # same kind of thing doesn't happen when making savepoints either. # Set up the database, to hold # root --> "p" -> value = 1 # --> "q" -> value = 2 tm1 = transaction.TransactionManager() conn = self._db.open(transaction_manager=tm1) r1 = conn.root() p = P() p.value = 1 r1["p"] = p q = P() q.value = 2 r1["q"] = q tm1.commit() # Now txn T1 changes p.value to 3 locally (subtxn commit). p.value = 3 tm1.savepoint() # Start new txn T2 with a new connection. tm2 = transaction.TransactionManager() cn2 = self._db.open(transaction_manager=tm2) r2 = cn2.root() p2 = r2["p"] self.assertEqual(p._p_oid, p2._p_oid) # T2 shouldn't see T1's change of p.value to 3, because T1 didn't # commit yet. self.assertEqual(p2.value, 1) # Change p.value to 4, and q.value to 5. Neither should be visible # to T1, because T1 is still in progress. p2.value = 4 q2 = r2["q"] self.assertEqual(q._p_oid, q2._p_oid) self.assertEqual(q2.value, 2) q2.value = 5 tm2.commit() # Back to T1. p and q still have the expected values. rt = conn.root() self.assertEqual(rt["p"].value, 3) self.assertEqual(rt["q"].value, 2) # Now make another savepoint in T1. This shouldn't change what # T1 sees for p and q. rt["r"] = P() tm1.savepoint() # Making that savepoint in T1 should not process invalidations # from T2's commit. p.value should still be 3 here (because that's # what T1 savepointed earlier), and q.value should still be 2. # Prior to ZODB 3.2.9 and 3.4, q.value was 5 here. rt = conn.root() try: self.assertEqual(rt["p"].value, 3) self.assertEqual(rt["q"].value, 2) finally: tm1.abort() def checkTxnBeginImpliesAbort(self): # begin() should do an abort() first, if needed. cn = self._db.open() rt = cn.root() rt['a'] = 1 transaction.begin() # should abort adding 'a' to the root rt = cn.root() self.assertRaises(KeyError, rt.__getitem__, 'a') transaction.begin() rt = cn.root() self.assertRaises(KeyError, rt.__getitem__, 'a') # One more time. transaction.begin() rt = cn.root() rt['a'] = 3 transaction.begin() rt = cn.root() self.assertRaises(KeyError, rt.__getitem__, 'a') self.assertRaises(KeyError, rt.__getitem__, 'b') # That used methods of the default transaction *manager*. 
Alas, # that's not necessarily the same as using methods of the current # transaction, and, in fact, when this test was written, # Transaction.begin() didn't do anything (everything from here # down failed). # Later (ZODB 3.6): Transaction.begin() no longer exists, so the # rest of this test was tossed. def checkFailingCommitSticks(self): # See also checkFailingSavepointSticks. cn = self._db.open() rt = cn.root() rt['a'] = 1 # Arrange for commit to fail during tpc_vote. poisoned = PoisonedObject(PoisonedJar(break_tpc_vote=True)) transaction.get().register(poisoned) self.assertRaises(PoisonedError, transaction.get().commit) # Trying to commit again fails too. self.assertRaises(TransactionFailedError, transaction.commit) self.assertRaises(TransactionFailedError, transaction.commit) self.assertRaises(TransactionFailedError, transaction.commit) # The change to rt['a'] is lost. self.assertRaises(KeyError, rt.__getitem__, 'a') # Trying to modify an object also fails, because Transaction.join() # also raises TransactionFailedError. self.assertRaises(TransactionFailedError, rt.__setitem__, 'b', 2) # Clean up via abort(), and try again. transaction.abort() rt['a'] = 1 transaction.commit() self.assertEqual(rt['a'], 1) # Cleaning up via begin() should also work. rt['a'] = 2 transaction.get().register(poisoned) self.assertRaises(PoisonedError, transaction.commit) self.assertRaises(TransactionFailedError, transaction.commit) # The change to rt['a'] is lost. self.assertEqual(rt['a'], 1) # Trying to modify an object also fails. self.assertRaises(TransactionFailedError, rt.__setitem__, 'b', 2) # Clean up via begin(), and try again. transaction.begin() rt['a'] = 2 transaction.commit() self.assertEqual(rt['a'], 2) cn.close() def checkSavepointRollbackAndReadCurrent(self): ''' savepoint rollback after readcurrent was called on a new object should not raise POSKeyError ''' cn = self._db.open() try: transaction.begin() root = cn.root() added_before_savepoint = P() root['added_before_savepoint'] = added_before_savepoint sp = transaction.savepoint() added_before_savepoint.btree = new_btree = OOBTree() cn.add(new_btree) new_btree['change_to_trigger_read_current'] = P() sp.rollback() transaction.commit() self.assertTrue('added_before_savepoint' in root) finally: transaction.abort() cn.close() def checkFailingSavepointSticks(self): cn = self._db.open() rt = cn.root() rt['a'] = 1 transaction.savepoint() self.assertEqual(rt['a'], 1) rt['b'] = 2 # Make a jar that raises PoisonedError when making a savepoint. poisoned = PoisonedJar(break_savepoint=True) transaction.get().join(poisoned) self.assertRaises(PoisonedError, transaction.savepoint) # Trying to make a savepoint again fails too. self.assertRaises(TransactionFailedError, transaction.savepoint) self.assertRaises(TransactionFailedError, transaction.savepoint) # Top-level commit also fails. self.assertRaises(TransactionFailedError, transaction.commit) # The changes to rt['a'] and rt['b'] are lost. self.assertRaises(KeyError, rt.__getitem__, 'a') self.assertRaises(KeyError, rt.__getitem__, 'b') # Trying to modify an object also fails, because Transaction.join() # also raises TransactionFailedError. self.assertRaises(TransactionFailedError, rt.__setitem__, 'b', 2) # Clean up via abort(), and try again. transaction.abort() rt['a'] = 1 transaction.commit() self.assertEqual(rt['a'], 1) # Cleaning up via begin() should also work. 
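        # (As checkTxnBeginImpliesAbort showed, begin() first aborts the
        # failed transaction, which is why the retry at the end of this
        # block can succeed.)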
rt['a'] = 2 transaction.get().join(poisoned) self.assertRaises(PoisonedError, transaction.savepoint) # Trying to make a savepoint again fails too. self.assertRaises(TransactionFailedError, transaction.savepoint) # The change to rt['a'] is lost. self.assertEqual(rt['a'], 1) # Trying to modify an object also fails. self.assertRaises(TransactionFailedError, rt.__setitem__, 'b', 2) # Clean up via begin(), and try again. transaction.begin() rt['a'] = 2 transaction.savepoint() self.assertEqual(rt['a'], 2) transaction.commit() cn2 = self._db.open() rt = cn.root() self.assertEqual(rt['a'], 2) cn.close() cn2.close() def checkMultipleUndoInOneTransaction(self): # Verify that it's possible to perform multiple undo # operations within a transaction. If ZODB performs the undo # operations in a nondeterministic order, this test will often # fail. conn = self._db.open() try: root = conn.root() # Add transactions that set root["state"] to (0..5) for state_num in range(6): transaction.begin() root['state'] = state_num transaction.get().note('root["state"] = %d' % state_num) transaction.commit() # Undo all but the first. Note that no work is actually # performed yet. transaction.begin() log = self._db.undoLog() self._db.undoMultiple([log[i]['id'] for i in range(5)]) transaction.get().note('undo states 1 through 5') # Now attempt all those undo operations. transaction.commit() # Sanity check: we should be back to the first state. self.assertEqual(root['state'], 0) finally: transaction.abort() conn.close() class ReadConflictTests(ZODB.tests.util.TestCase): def setUp(self): ZODB.tests.utils.TestCase.setUp(self) self._storage = ZODB.MappingStorage.MappingStorage() def readConflict(self, shouldFail=True): # Two transactions run concurrently. Each reads some object, # then one commits and the other tries to read an object # modified by the first. This read should fail with a conflict # error because the object state read is not necessarily # consistent with the objects read earlier in the transaction. tm1 = transaction.TransactionManager() conn = self._db.open(transaction_manager=tm1) r1 = conn.root() r1["p"] = self.obj self.obj.child1 = P() tm1.get().commit() # start a new transaction with a new connection tm2 = transaction.TransactionManager() cn2 = self._db.open(transaction_manager=tm2) # start a new transaction with the other connection r2 = cn2.root() self.assertEqual(r1._p_serial, r2._p_serial) self.obj.child2 = P() tm1.get().commit() # resume the transaction using cn2 obj = r2["p"] # An attempt to access obj should fail, because r2 was read # earlier in the transaction and obj was modified by the othe # transaction. if shouldFail: self.assertRaises(ReadConflictError, lambda: obj.child1) # And since ReadConflictError was raised, attempting to commit # the transaction should re-raise it. checkNotIndependent() # failed this part of the test for a long time. self.assertRaises(ReadConflictError, tm2.get().commit) # And since that commit failed, trying to commit again should # fail again. self.assertRaises(TransactionFailedError, tm2.get().commit) # And again. self.assertRaises(TransactionFailedError, tm2.get().commit) # Etc. self.assertRaises(TransactionFailedError, tm2.get().commit) else: # make sure that accessing the object succeeds obj.child1 tm2.get().abort() def checkReadConflict(self): self.obj = P() self.readConflict() def checkReadConflictIgnored(self): # Test that an application that catches a read conflict and # continues can not commit the transaction later. 
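        # Scenario: populate "real_data" and an "index" mapping from one
        # connection, change them, then trigger (and deliberately swallow)
        # a ReadConflictError in a second connection; that second
        # transaction must still refuse to commit.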
root = self._db.open().root() root["real_data"] = real_data = PersistentMapping() root["index"] = index = PersistentMapping() real_data["a"] = PersistentMapping({"indexed_value": 0}) real_data["b"] = PersistentMapping({"indexed_value": 1}) index[1] = PersistentMapping({"b": 1}) index[0] = PersistentMapping({"a": 1}) transaction.commit() # load some objects from one connection tm = transaction.TransactionManager() cn2 = self._db.open(transaction_manager=tm) r2 = cn2.root() real_data2 = r2["real_data"] index2 = r2["index"] real_data["b"]["indexed_value"] = 0 del index[1]["b"] index[0]["b"] = 1 transaction.commit() del real_data2["a"] try: del index2[0]["a"] except ReadConflictError: # This is the crux of the text. Ignore the error. pass else: self.fail("No conflict occurred") # real_data2 still ready to commit self.assert_(real_data2._p_changed) # index2 values not ready to commit self.assert_(not index2._p_changed) self.assert_(not index2[0]._p_changed) self.assert_(not index2[1]._p_changed) self.assertRaises(ReadConflictError, tm.get().commit) self.assertRaises(TransactionFailedError, tm.get().commit) tm.get().abort() def checkReadConflictErrorClearedDuringAbort(self): # When a transaction is aborted, the "memory" of which # objects were the cause of a ReadConflictError during # that transaction should be cleared. root = self._db.open().root() data = PersistentMapping({'d': 1}) root["data"] = data transaction.commit() # Provoke a ReadConflictError. tm2 = transaction.TransactionManager() cn2 = self._db.open(transaction_manager=tm2) r2 = cn2.root() data2 = r2["data"] data['d'] = 2 transaction.commit() try: data2['d'] = 3 except ReadConflictError: pass else: self.fail("No conflict occurred") # Explicitly abort cn2's transaction. tm2.get().abort() # cn2 should retain no memory of the read conflict after an abort(), # but 3.2.3 had a bug wherein it did. data_conflicts = data._p_jar._conflicts data2_conflicts = data2._p_jar._conflicts self.failIf(data_conflicts) self.failIf(data2_conflicts) # this used to fail # And because of that, we still couldn't commit a change to data2['d'] # in the new transaction. cn2.sync() # process the invalidation for data2['d'] data2['d'] = 3 tm2.get().commit() # 3.2.3 used to raise ReadConflictError cn2.close() class PoisonedError(Exception): pass # PoisonedJar arranges to raise PoisonedError from interesting places. class PoisonedJar: def __init__(self, break_tpc_begin=False, break_tpc_vote=False, break_savepoint=False): self.break_tpc_begin = break_tpc_begin self.break_tpc_vote = break_tpc_vote self.break_savepoint = break_savepoint def sortKey(self): return str(id(self)) def tpc_begin(self, *args): if self.break_tpc_begin: raise PoisonedError("tpc_begin fails") # A way to poison a top-level commit. def tpc_vote(self, *args): if self.break_tpc_vote: raise PoisonedError("tpc_vote fails") # A way to poison a savepoint -- also a way to poison a subtxn commit. 
def savepoint(self): if self.break_savepoint: raise PoisonedError("savepoint fails") def commit(*args): pass def abort(*self): pass class PoisonedObject: def __init__(self, poisonedjar): self._p_jar = poisonedjar def test_suite(): suite = unittest.makeSuite(ZODBTests, 'check') return suite if __name__ == "__main__": unittest.main(defaultTest="test_suite") ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/test_cache.py000066400000000000000000000155661230730566700242700ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Test behavior of Connection plus cPickleCache.""" from persistent import Persistent from ZODB.config import databaseFromString import doctest import transaction class RecalcitrantObject(Persistent): """A Persistent object that will not become a ghost.""" deactivations = 0 def _p_deactivate(self): self.__class__.deactivations += 1 def init(cls): cls.deactivations = 0 init = classmethod(init) class RegularObject(Persistent): deactivations = 0 invalidations = 0 def _p_deactivate(self): self.__class__.deactivations += 1 super(RegularObject, self)._p_deactivate() def _p_invalidate(self): self.__class__.invalidations += 1 super(RegularObject, self)._p_invalidate() def init(cls): cls.deactivations = 0 cls.invalidations = 0 init = classmethod(init) class PersistentObject(Persistent): pass class CacheTests: def test_cache(self): r"""Test basic cache methods. Let's start with a clean transaction >>> transaction.abort() >>> RegularObject.init() >>> db = databaseFromString("\n" ... "cache-size 4\n" ... "\n" ... "") >>> cn = db.open() >>> r = cn.root() >>> L = [] >>> for i in range(5): ... o = RegularObject() ... L.append(o) ... r[i] = o >>> transaction.commit() After committing a transaction and calling cacheGC(), there should be cache-size (4) objects in the cache. One of the RegularObjects was deactivated. >>> cn._cache.ringlen() 4 >>> RegularObject.deactivations 1 If we explicitly activate the objects again, the ringlen should go back up to 5. >>> for o in L: ... o._p_activate() >>> cn._cache.ringlen() 5 >>> cn.cacheGC() >>> cn._cache.ringlen() 4 >>> RegularObject.deactivations 2 >>> cn.cacheMinimize() >>> cn._cache.ringlen() 0 >>> RegularObject.deactivations 6 If we activate all the objects again and mark one as modified, then the one object should not be deactivated even by a minimize. >>> for o in L: ... o._p_activate() >>> o.attr = 1 >>> cn._cache.ringlen() 5 >>> cn.cacheMinimize() >>> cn._cache.ringlen() 1 >>> RegularObject.deactivations 10 Clean up >>> transaction.abort() """ def test_cache_gc_recalcitrant(self): r"""Test that a cacheGC() call will return. It's possible for a particular object to ignore the _p_deactivate() call. We want to check several things in this case. The cache should called the real _p_deactivate() method not the one provided by Persistent. 
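        (RecalcitrantObject, defined above, overrides _p_deactivate() just
        to count calls and never actually becomes a ghost.)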
The cacheGC() call should also return when it's looked at each item, regardless of whether it became a ghost. >>> RecalcitrantObject.init() >>> db = databaseFromString("\n" ... "cache-size 4\n" ... "\n" ... "") >>> cn = db.open() >>> r = cn.root() >>> L = [] >>> for i in range(5): ... o = RecalcitrantObject() ... L.append(o) ... r[i] = o >>> transaction.commit() >>> [o._p_state for o in L] [0, 0, 0, 0, 0] The Connection calls cacheGC() after it commits a transaction. Since the cache will now have more objects that it's target size, it will call _p_deactivate() on each RecalcitrantObject. >>> RecalcitrantObject.deactivations 5 >>> [o._p_state for o in L] [0, 0, 0, 0, 0] An explicit call to cacheGC() has the same effect. >>> cn.cacheGC() >>> RecalcitrantObject.deactivations 10 >>> [o._p_state for o in L] [0, 0, 0, 0, 0] """ def test_cache_on_abort(self): r"""Test that the cache handles transaction abort correctly. >>> RegularObject.init() >>> db = databaseFromString("\n" ... "cache-size 4\n" ... "\n" ... "") >>> cn = db.open() >>> r = cn.root() >>> L = [] >>> for i in range(5): ... o = RegularObject() ... L.append(o) ... r[i] = o >>> transaction.commit() >>> RegularObject.deactivations 1 Modify three of the objects and verify that they are deactivated when the transaction aborts. >>> for i in range(0, 5, 2): ... L[i].attr = i >>> [L[i]._p_state for i in range(0, 5, 2)] [1, 1, 1] >>> cn._cache.ringlen() 5 >>> transaction.abort() >>> cn._cache.ringlen() 2 >>> RegularObject.deactivations 4 """ def test_gc_on_open_connections(self): r"""Test that automatic GC is not applied to open connections. This test (and the corresponding fix) was introduced because of bug report 113923. We start with a persistent object and add a list attribute:: >>> db = databaseFromString("\n" ... "cache-size 0\n" ... "\n" ... "") >>> cn1 = db.open() >>> r = cn1.root() >>> r['ob'] = PersistentObject() >>> r['ob'].l = [] >>> transaction.commit() Now, let's modify the object in a way that doesn't get noticed. Then, we open another connection which triggers automatic garbage connection. After that, the object should not have been ghostified:: >>> r['ob'].l.append(1) >>> cn2 = db.open() >>> r['ob'].l [1] """ def test_suite(): return doctest.DocTestSuite() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/test_datamanageradapter.py000066400000000000000000000121751230730566700270230ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## import unittest from doctest import DocTestSuite from transaction._transaction import DataManagerAdapter from ZODB.tests.sampledm import DataManager def test_normal_commit(): """ So, we have a data manager: >>> dm = DataManager() and we do some work that modifies uncommited state: >>> dm.inc() >>> dm.state, dm.delta (0, 1) Now we'll commit the changes. When the data manager joins a transaction, the transaction will create an adapter. 
>>> dma = DataManagerAdapter(dm) and register it as a modified object. At commit time, the transaction will get the "jar" like this: >>> jar = getattr(dma, '_p_jar', dma) and, of course, the jar and the adapter will be the same: >>> jar is dma True The transaction will call tpc_begin: >>> t1 = '1' >>> jar.tpc_begin(t1) Then the transaction will call commit on the jar: >>> jar.commit(t1) This doesn't actually do anything. :) >>> dm.state, dm.delta (0, 1) The transaction will then call tpc_vote: >>> jar.tpc_vote(t1) This prepares the data manager: >>> dm.state, dm.delta (1, 1) >>> dm.prepared True Finally, tpc_finish is called: >>> jar.tpc_finish(t1) and the data manager finishes the two-phase commit: >>> dm.state, dm.delta (1, 0) >>> dm.prepared False """ def test_abort(): """ So, we have a data manager: >>> dm = DataManager() and we do some work that modifies uncommited state: >>> dm.inc() >>> dm.state, dm.delta (0, 1) When the data manager joins a transaction, the transaction will create an adapter. >>> dma = DataManagerAdapter(dm) and register it as a modified object. Now we'll abort the transaction. The transaction will get the "jar" like this: >>> jar = getattr(dma, '_p_jar', dma) and, of course, the jar and the adapter will be the same: >>> jar is dma True Then the transaction will call abort on the jar: >>> t1 = '1' >>> jar.abort(t1) Which aborts the changes in the data manager: >>> dm.state, dm.delta (0, 0) """ def test_tpc_abort_phase1(): """ So, we have a data manager: >>> dm = DataManager() and we do some work that modifies uncommited state: >>> dm.inc() >>> dm.state, dm.delta (0, 1) Now we'll commit the changes. When the data manager joins a transaction, the transaction will create an adapter. >>> dma = DataManagerAdapter(dm) and register it as a modified object. At commit time, the transaction will get the "jar" like this: >>> jar = getattr(dma, '_p_jar', dma) and, of course, the jar and the adapter will be the same: >>> jar is dma True The transaction will call tpc_begin: >>> t1 = '1' >>> jar.tpc_begin(t1) Then the transaction will call commit on the jar: >>> jar.commit(t1) This doesn't actually do anything. :) >>> dm.state, dm.delta (0, 1) At this point, the transaction decides to abort. It calls tpc_abort: >>> jar.tpc_abort(t1) Which causes the state of the data manager to be restored: >>> dm.state, dm.delta (0, 0) """ def test_tpc_abort_phase2(): """ So, we have a data manager: >>> dm = DataManager() and we do some work that modifies uncommited state: >>> dm.inc() >>> dm.state, dm.delta (0, 1) Now we'll commit the changes. When the data manager joins a transaction, the transaction will create an adapter. >>> dma = DataManagerAdapter(dm) and register it as a modified object. At commit time, the transaction will get the "jar" like this: >>> jar = getattr(dma, '_p_jar', dma) and, of course, the jar and the adapter will be the same: >>> jar is dma True The transaction will call tpc_begin: >>> t1 = '1' >>> jar.tpc_begin(t1) Then the transaction will call commit on the jar: >>> jar.commit(t1) This doesn't actually do anything. :) >>> dm.state, dm.delta (0, 1) The transaction calls vote: >>> jar.tpc_vote(t1) This prepares the data manager: >>> dm.state, dm.delta (1, 1) >>> dm.prepared True At this point, the transaction decides to abort. 
It calls tpc_abort: >>> jar.tpc_abort(t1) Which causes the state of the data manager to be restored: >>> dm.state, dm.delta (0, 0) >>> dm.prepared False """ def test_suite(): return DocTestSuite() if __name__ == '__main__': unittest.main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/test_doctest_files.py000066400000000000000000000033561230730566700260460ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## import doctest import unittest __test__ = dict( cross_db_refs_to_blank_db_name = """ There was a bug that caused bad refs to be generated is a database name was blank. >>> import ZODB.tests.util, persistent.mapping, transaction >>> dbs = {} >>> db1 = ZODB.tests.util.DB(database_name='', databases=dbs) >>> db2 = ZODB.tests.util.DB(database_name='2', databases=dbs) >>> conn1 = db1.open() >>> conn2 = conn1.get_connection('2') >>> for i in range(10): ... conn1.root()[i] = persistent.mapping.PersistentMapping() ... transaction.commit() >>> conn2.root()[0] = conn1.root()[9] >>> transaction.commit() >>> conn2.root()._p_deactivate() >>> conn2.root()[0] is conn1.root()[9] True >>> list(conn2.root()[0].keys()) [] """, ) def test_suite(): suite = unittest.TestSuite() suite.addTest(doctest.DocFileSuite("dbopen.txt", "multidb.txt", "synchronizers.txt", )) suite.addTest(doctest.DocTestSuite()) return suite ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/test_fsdump.py000066400000000000000000000044451230730566700245150ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2005 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## r""" fsdump test =========== Let's get a path to work with first. >>> path = 'Data.fs' More imports. >>> import ZODB >>> from ZODB.FileStorage import FileStorage >>> import transaction as txn >>> from BTrees.OOBTree import OOBTree >>> from ZODB.FileStorage.fsdump import fsdump # we're testing this Create an empty FileStorage. >>> st = FileStorage(path) For empty DB fsdump() output definitely empty: >>> fsdump(path) Create a root object and try again: >>> db = ZODB.DB(st) # yes, that creates a root object! >>> fsdump(path) #doctest: +ELLIPSIS Trans #00000 tid=... time=... offset=52 status=' ' user='' description='initial database creation' data #00000 oid=0000000000000000 size=60 class=persistent.mapping.PersistentMapping Now we see first transaction with root object. 
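Each "Trans" line shows the transaction's tid, time, file offset, status,
user and description; each "data" line shows one object record's oid,
size and class.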
Let's add a BTree: >>> root = db.open().root() >>> root['tree'] = OOBTree() >>> txn.get().note('added an OOBTree') >>> txn.get().commit() >>> fsdump(path) #doctest: +ELLIPSIS Trans #00000 tid=... time=... offset=52 status=' ' user='' description='initial database creation' data #00000 oid=0000000000000000 size=60 class=persistent.mapping.PersistentMapping Trans #00001 tid=... time=... offset=201 status=' ' user='' description='added an OOBTree' data #00000 oid=0000000000000000 size=107 class=persistent.mapping.PersistentMapping data #00001 oid=0000000000000001 size=29 class=BTrees.OOBTree.OOBTree Now we see two transactions and two changed objects. Clean up. >>> st.close() """ import doctest import zope.testing.setupstack def test_suite(): return doctest.DocTestSuite( setUp=zope.testing.setupstack.setUpDirectory, tearDown=zope.testing.setupstack.tearDown) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/test_storage.py000066400000000000000000000114151230730566700246560ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """A storage used for unittests. The primary purpose of this module is to have a minimal multi-version storage to use for unit tests. MappingStorage isn't sufficient. Since even a minimal storage has some complexity, we run standard storage tests against the test storage. """ from __future__ import with_statement import bisect import unittest from ZODB.BaseStorage import BaseStorage from ZODB import POSException from ZODB.utils import z64 from ZODB.tests import StorageTestBase from ZODB.tests import BasicStorage, MTStorage, Synchronization from ZODB.tests import RevisionStorage class Transaction(object): """Hold data for current transaction for MinimalMemoryStorage.""" def __init__(self, tid): self.index = {} self.tid = tid def store(self, oid, data): self.index[(oid, self.tid)] = data def cur(self): return dict.fromkeys([oid for oid, tid in self.index.keys()], self.tid) class MinimalMemoryStorage(BaseStorage, object): """Simple in-memory storage that supports revisions. This storage is needed to test multi-version concurrency control. It is similar to MappingStorage, but keeps multiple revisions. It does not support versions. It doesn't implement operations like pack(), because they aren't necessary for testing. 
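    Internally, _index maps (oid, tid) pairs to data records and _cur maps
    each oid to its current tid.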
""" def __init__(self): super(MinimalMemoryStorage, self).__init__("name") # _index maps oid, tid pairs to data records self._index = {} # _cur maps oid to current tid self._cur = {} self._ltid = z64 def isCurrent(self, oid, serial): return serial == self._cur[oid] def hook(self, oid, tid, version): # A hook for testing pass def __len__(self): return len(self._index) def _clear_temp(self): pass def load(self, oid, version=''): assert version == '' with self._lock: assert not version tid = self._cur[oid] self.hook(oid, tid, '') return self._index[(oid, tid)], tid def _begin(self, tid, u, d, e): self._txn = Transaction(tid) def store(self, oid, serial, data, v, txn): if txn is not self._transaction: raise POSException.StorageTransactionError(self, txn) assert not v if self._cur.get(oid) != serial: if not (serial is None or self._cur.get(oid) in [None, z64]): raise POSException.ConflictError( oid=oid, serials=(self._cur.get(oid), serial), data=data) self._txn.store(oid, data) return self._tid def _abort(self): del self._txn def _finish(self, tid, u, d, e): with self._lock: self._index.update(self._txn.index) self._cur.update(self._txn.cur()) self._ltid = self._tid def loadBefore(self, the_oid, the_tid): # It's okay if loadBefore() is really expensive, because this # storage is just used for testing. with self._lock: tids = [tid for oid, tid in self._index if oid == the_oid] if not tids: raise KeyError(the_oid) tids.sort() i = bisect.bisect_left(tids, the_tid) - 1 if i == -1: return None tid = tids[i] j = i + 1 if j == len(tids): end_tid = None else: end_tid = tids[j] return self._index[(the_oid, tid)], tid, end_tid def loadSerial(self, oid, serial): return self._index[(oid, serial)] def close(self): pass cleanup = close class MinimalTestSuite(StorageTestBase.StorageTestBase, BasicStorage.BasicStorage, MTStorage.MTStorage, Synchronization.SynchronizedStorage, RevisionStorage.RevisionStorage, ): def setUp(self): StorageTestBase.StorageTestBase.setUp(self) self._storage = MinimalMemoryStorage() # we don't implement undo def checkLoadBeforeUndo(self): pass def test_suite(): return unittest.makeSuite(MinimalTestSuite, "check") ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testblob.py000066400000000000000000000516441230730566700240010ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## from pickle import Pickler from pickle import Unpickler from StringIO import StringIO from ZODB.blob import Blob from ZODB.DB import DB from ZODB.FileStorage import FileStorage from ZODB.tests.testConfig import ConfigTestBase import os if os.environ.get('USE_ZOPE_TESTING_DOCTEST'): from zope.testing import doctest else: import doctest import os import random import re import struct import sys import time import transaction import unittest import ZConfig import ZODB.blob import ZODB.interfaces import ZODB.tests.IteratorStorage import ZODB.tests.StorageTestBase import ZODB.tests.util import zope.testing.renormalizing def new_time(): """Create a _new_ time stamp. This method also makes sure that after retrieving a timestamp that was *before* a transaction was committed, that at least one second passes so the packing time actually is before the commit time. """ now = new_time = time.time() while new_time <= now: new_time = time.time() time.sleep(1) return new_time class ZODBBlobConfigTest(ConfigTestBase): def test_map_config1(self): self._test( """ blob-dir blobs """) def test_file_config1(self): self._test( """ blob-dir blobs path Data.fs """) def test_blob_dir_needed(self): self.assertRaises(ZConfig.ConfigurationSyntaxError, self._test, """ """) class BlobCloneTests(ZODB.tests.util.TestCase): def testDeepCopyCanInvalidate(self): """ Tests regression for invalidation problems related to missing readers and writers values in cloned objects (see http://mail.zope.org/pipermail/zodb-dev/2008-August/012054.html) """ import ZODB.MappingStorage database = DB(ZODB.blob.BlobStorage( 'blobs', ZODB.MappingStorage.MappingStorage())) connection = database.open() root = connection.root() transaction.begin() root['blob'] = Blob() transaction.commit() stream = StringIO() p = Pickler(stream, 1) p.dump(root['blob']) u = Unpickler(stream) stream.seek(0) clone = u.load() clone._p_invalidate() # it should also be possible to open the cloned blob # (even though it won't contain the original data) clone.open() # tearDown database.close() class BlobTestBase(ZODB.tests.StorageTestBase.StorageTestBase): def setUp(self): ZODB.tests.StorageTestBase.StorageTestBase.setUp(self) self._storage = self.create_storage() class BlobUndoTests(BlobTestBase): def testUndoWithoutPreviousVersion(self): database = DB(self._storage) connection = database.open() root = connection.root() transaction.begin() root['blob'] = Blob() transaction.commit() database.undo(database.undoLog(0, 1)[0]['id']) transaction.commit() # the blob footprint object should exist no longer self.assertRaises(KeyError, root.__getitem__, 'blob') database.close() def testUndo(self): database = DB(self._storage) connection = database.open() root = connection.root() transaction.begin() blob = Blob() blob.open('w').write('this is state 1') root['blob'] = blob transaction.commit() transaction.begin() blob = root['blob'] blob.open('w').write('this is state 2') transaction.commit() database.undo(database.undoLog(0, 1)[0]['id']) transaction.commit() self.assertEqual(blob.open('r').read(), 'this is state 1') database.close() def testUndoAfterConsumption(self): database = DB(self._storage) connection = database.open() root = connection.root() transaction.begin() open('consume1', 'w').write('this is state 1') blob = Blob() blob.consumeFile('consume1') root['blob'] = blob transaction.commit() transaction.begin() blob = root['blob'] open('consume2', 'w').write('this is state 2') 
blob.consumeFile('consume2') transaction.commit() database.undo(database.undoLog(0, 1)[0]['id']) transaction.commit() self.assertEqual(blob.open('r').read(), 'this is state 1') database.close() def testRedo(self): database = DB(self._storage) connection = database.open() root = connection.root() blob = Blob() transaction.begin() blob.open('w').write('this is state 1') root['blob'] = blob transaction.commit() transaction.begin() blob = root['blob'] blob.open('w').write('this is state 2') transaction.commit() database.undo(database.undoLog(0, 1)[0]['id']) transaction.commit() self.assertEqual(blob.open('r').read(), 'this is state 1') database.undo(database.undoLog(0, 1)[0]['id']) transaction.commit() self.assertEqual(blob.open('r').read(), 'this is state 2') database.close() def testRedoOfCreation(self): database = DB(self._storage) connection = database.open() root = connection.root() blob = Blob() transaction.begin() blob.open('w').write('this is state 1') root['blob'] = blob transaction.commit() database.undo(database.undoLog(0, 1)[0]['id']) transaction.commit() self.assertRaises(KeyError, root.__getitem__, 'blob') database.undo(database.undoLog(0, 1)[0]['id']) transaction.commit() self.assertEqual(blob.open('r').read(), 'this is state 1') database.close() class RecoveryBlobStorage(BlobTestBase, ZODB.tests.IteratorStorage.IteratorDeepCompare): def setUp(self): BlobTestBase.setUp(self) self._dst = self.create_storage('dest') def tearDown(self): self._dst.close() BlobTestBase.tearDown(self) # Requires a setUp() that creates a self._dst destination storage def testSimpleBlobRecovery(self): self.assert_( ZODB.interfaces.IBlobStorageRestoreable.providedBy(self._storage) ) db = DB(self._storage) conn = db.open() conn.root()[1] = ZODB.blob.Blob() transaction.commit() conn.root()[2] = ZODB.blob.Blob() conn.root()[2].open('w').write('some data') transaction.commit() conn.root()[3] = ZODB.blob.Blob() conn.root()[3].open('w').write( (''.join(struct.pack(">I", random.randint(0, (1<<32)-1)) for i in range(random.randint(10000,20000))) )[:-random.randint(1,4)] ) transaction.commit() conn.root()[2] = ZODB.blob.Blob() conn.root()[2].open('w').write('some other data') transaction.commit() self._dst.copyTransactionsFrom(self._storage) self.compare(self._storage, self._dst) def gc_blob_removes_uncommitted_data(): """ >>> blob = Blob() >>> blob.open('w').write('x') >>> fname = blob._p_blob_uncommitted >>> os.path.exists(fname) True >>> blob = None >>> os.path.exists(fname) False """ def commit_from_wrong_partition(): """ It should be possible to commit changes even when a blob is on a different partition. We can simulate this by temporarily breaking os.rename. :) >>> def fail(*args): ... raise OSError >>> os_rename = os.rename >>> os.rename = fail >>> import logging >>> logger = logging.getLogger('ZODB.blob.copied') >>> handler = logging.StreamHandler(sys.stdout) >>> logger.propagate = False >>> logger.setLevel(logging.DEBUG) >>> logger.addHandler(handler) >>> blob_storage = create_storage() >>> database = DB(blob_storage) >>> connection = database.open() >>> root = connection.root() >>> from ZODB.blob import Blob >>> root['blob'] = Blob() >>> root['blob'].open('w').write('test') >>> transaction.commit() # doctest: +ELLIPSIS Copied blob file ... >>> root['blob'].open().read() 'test' Works with savepoints too: >>> root['blob2'] = Blob() >>> root['blob2'].open('w').write('test2') >>> _ = transaction.savepoint() # doctest: +ELLIPSIS Copied blob file ... 
>>> transaction.commit() # doctest: +ELLIPSIS Copied blob file ... >>> root['blob2'].open().read() 'test2' >>> os.rename = os_rename >>> logger.propagate = True >>> logger.setLevel(0) >>> logger.removeHandler(handler) >>> handler.close() >>> database.close() """ def packing_with_uncommitted_data_non_undoing(): """ This covers regression for bug #130459. When uncommitted data exists it formerly was written to the root of the blob_directory and confused our packing strategy. We now use a separate temporary directory that is ignored while packing. >>> import transaction >>> from ZODB.DB import DB >>> from ZODB.serialize import referencesf >>> blob_storage = create_storage() >>> database = DB(blob_storage) >>> connection = database.open() >>> root = connection.root() >>> from ZODB.blob import Blob >>> root['blob'] = Blob() >>> connection.add(root['blob']) >>> root['blob'].open('w').write('test') >>> blob_storage.pack(new_time(), referencesf) Clean up: >>> database.close() """ def packing_with_uncommitted_data_undoing(): """ This covers regression for bug #130459. When uncommitted data exists it formerly was written to the root of the blob_directory and confused our packing strategy. We now use a separate temporary directory that is ignored while packing. >>> from ZODB.serialize import referencesf >>> blob_storage = create_storage() >>> database = DB(blob_storage) >>> connection = database.open() >>> root = connection.root() >>> from ZODB.blob import Blob >>> root['blob'] = Blob() >>> connection.add(root['blob']) >>> root['blob'].open('w').write('test') >>> blob_storage.pack(new_time(), referencesf) Clean up: >>> database.close() """ def secure_blob_directory(): """ This is a test for secure creation and verification of secure settings of blob directories. >>> blob_storage = create_storage(blob_dir='blobs') Two directories are created: >>> os.path.isdir('blobs') True >>> tmp_dir = os.path.join('blobs', 'tmp') >>> os.path.isdir(tmp_dir) True They are only accessible by the owner: >>> oct(os.stat('blobs').st_mode) '040700' >>> oct(os.stat(tmp_dir).st_mode) '040700' These settings are recognized as secure: >>> blob_storage.fshelper.isSecure('blobs') True >>> blob_storage.fshelper.isSecure(tmp_dir) True After making the permissions of tmp_dir more liberal, the directory is recognized as insecure: >>> os.chmod(tmp_dir, 040711) >>> blob_storage.fshelper.isSecure(tmp_dir) False Clean up: >>> blob_storage.close() """ # On windows, we can't create secure blob directories, at least not # with APIs in the standard library, so there's no point in testing # this. if sys.platform == 'win32': del secure_blob_directory def loadblob_tmpstore(): """ This is a test for assuring that the TmpStore's loadBlob implementation falls back correctly to loadBlob on the backend. 
First, let's set up a regular database and store a blob: >>> blob_storage = create_storage() >>> database = DB(blob_storage) >>> connection = database.open() >>> root = connection.root() >>> from ZODB.blob import Blob >>> root['blob'] = Blob() >>> connection.add(root['blob']) >>> root['blob'].open('w').write('test') >>> import transaction >>> transaction.commit() >>> blob_oid = root['blob']._p_oid >>> tid = connection._storage.lastTransaction() Now we open a database with a TmpStore in front: >>> database.close() >>> from ZODB.Connection import TmpStore >>> tmpstore = TmpStore(blob_storage) We can access the blob correctly: >>> tmpstore.loadBlob(blob_oid, tid) == blob_storage.loadBlob(blob_oid, tid) True Clean up: >>> tmpstore.close() >>> database.close() """ def is_blob_record(): r""" >>> bs = create_storage() >>> db = DB(bs) >>> conn = db.open() >>> conn.root()['blob'] = ZODB.blob.Blob() >>> transaction.commit() >>> ZODB.blob.is_blob_record(bs.load(ZODB.utils.p64(0), '')[0]) False >>> ZODB.blob.is_blob_record(bs.load(ZODB.utils.p64(1), '')[0]) True An invalid pickle yields a false value: >>> ZODB.blob.is_blob_record("Hello world!") False >>> ZODB.blob.is_blob_record('c__main__\nC\nq\x01.') False >>> ZODB.blob.is_blob_record('cWaaaa\nC\nq\x01.') False As does None, which may occur in delete records: >>> ZODB.blob.is_blob_record(None) False >>> db.close() """ def do_not_depend_on_cwd(): """ >>> bs = create_storage() >>> here = os.getcwd() >>> os.mkdir('evil') >>> os.chdir('evil') >>> db = DB(bs) >>> conn = db.open() >>> conn.root()['blob'] = ZODB.blob.Blob() >>> conn.root()['blob'].open('w').write('data') >>> transaction.commit() >>> os.chdir(here) >>> conn.root()['blob'].open().read() 'data' >>> bs.close() """ def savepoint_isolation(): """Make sure savepoint data is distinct across transactions >>> bs = create_storage() >>> db = DB(bs) >>> conn = db.open() >>> conn.root.b = ZODB.blob.Blob('initial') >>> transaction.commit() >>> conn.root.b.open('w').write('1') >>> _ = transaction.savepoint() >>> tm = transaction.TransactionManager() >>> conn2 = db.open(transaction_manager=tm) >>> conn2.root.b.open('w').write('2') >>> _ = tm.savepoint() >>> conn.root.b.open().read() '1' >>> conn2.root.b.open().read() '2' >>> transaction.abort() >>> tm.commit() >>> conn.sync() >>> conn.root.b.open().read() '2' >>> db.close() """ def savepoint_commits_without_invalidations_out_of_order(): """Make sure transactions with blobs can be committed without the invalidations out of order error (LP #509801) >>> bs = create_storage() >>> db = DB(bs) >>> tm1 = transaction.TransactionManager() >>> conn1 = db.open(transaction_manager=tm1) >>> conn1.root.b = ZODB.blob.Blob('initial') >>> tm1.commit() >>> conn1.root.b.open('w').write('1') >>> _ = tm1.savepoint() >>> tm2 = transaction.TransactionManager() >>> conn2 = db.open(transaction_manager=tm2) >>> conn2.root.b.open('w').write('2') >>> _ = tm1.savepoint() >>> conn1.root.b.open().read() '1' >>> conn2.root.b.open().read() '2' >>> tm2.commit() >>> tm1.commit() # doctest: +IGNORE_EXCEPTION_DETAIL Traceback (most recent call last): ... ConflictError: database conflict error... >>> tm1.abort() >>> db.close() """ def savepoint_cleanup(): """Make sure savepoint data gets cleaned up. 
>>> bs = create_storage() >>> tdir = bs.temporaryDirectory() >>> os.listdir(tdir) [] >>> db = DB(bs) >>> conn = db.open() >>> conn.root.b = ZODB.blob.Blob('initial') >>> _ = transaction.savepoint() >>> len(os.listdir(tdir)) 1 >>> transaction.abort() >>> os.listdir(tdir) [] >>> conn.root.b = ZODB.blob.Blob('initial') >>> transaction.commit() >>> conn.root.b.open('w').write('1') >>> _ = transaction.savepoint() >>> transaction.abort() >>> os.listdir(tdir) [] >>> db.close() """ def lp440234_Setting__p_changed_of_a_Blob_w_no_uncomitted_changes_is_noop(): r""" >>> conn = ZODB.connection('data.fs', blob_dir='blobs') >>> blob = ZODB.blob.Blob('blah') >>> conn.add(blob) >>> transaction.commit() >>> old_serial = blob._p_serial >>> blob._p_changed = True >>> transaction.commit() >>> blob.open().read() 'blah' >>> old_serial == blob._p_serial True >>> conn.close() """ def setUp(test): ZODB.tests.util.setUp(test) test.globs['rmtree'] = zope.testing.setupstack.rmtree def setUpBlobAdaptedFileStorage(test): setUp(test) def create_storage(name='data', blob_dir=None): if blob_dir is None: blob_dir = '%s.bobs' % name return ZODB.blob.BlobStorage(blob_dir, FileStorage('%s.fs' % name)) test.globs['create_storage'] = create_storage def storage_reusable_suite(prefix, factory, test_blob_storage_recovery=False, test_packing=False, test_undo=True, ): """Return a test suite for a generic IBlobStorage. Pass a factory taking a name and a blob directory name. """ def setup(test): setUp(test) def create_storage(name='data', blob_dir=None): if blob_dir is None: blob_dir = '%s.bobs' % name return factory(name, blob_dir) test.globs['create_storage'] = create_storage suite = unittest.TestSuite() suite.addTest(doctest.DocFileSuite( "blob_connection.txt", "blob_importexport.txt", "blob_transaction.txt", setUp=setup, tearDown=zope.testing.setupstack.tearDown, optionflags=doctest.ELLIPSIS, )) if test_packing: suite.addTest(doctest.DocFileSuite( "blob_packing.txt", setUp=setup, tearDown=zope.testing.setupstack.tearDown, )) suite.addTest(doctest.DocTestSuite( setUp=setup, tearDown=zope.testing.setupstack.tearDown, checker = zope.testing.renormalizing.RENormalizing([ (re.compile(r'\%(sep)s\%(sep)s' % dict(sep=os.path.sep)), '/'), (re.compile(r'\%(sep)s' % dict(sep=os.path.sep)), '/'), ]), )) def create_storage(self, name='data', blob_dir=None): if blob_dir is None: blob_dir = '%s.bobs' % name return factory(name, blob_dir) def add_test_based_on_test_class(class_): new_class = class_.__class__( prefix+class_.__name__, (class_, ), dict(create_storage=create_storage), ) suite.addTest(unittest.makeSuite(new_class)) if test_blob_storage_recovery: add_test_based_on_test_class(RecoveryBlobStorage) if test_undo: add_test_based_on_test_class(BlobUndoTests) suite.layer = ZODB.tests.util.MininalTestLayer(prefix+'BlobTests') return suite def test_suite(): suite = unittest.TestSuite() suite.addTest(unittest.makeSuite(ZODBBlobConfigTest)) suite.addTest(unittest.makeSuite(BlobCloneTests)) suite.addTest(doctest.DocFileSuite( "blob_basic.txt", "blob_consume.txt", "blob_tempdir.txt", "blobstorage_packing.txt", setUp=setUp, tearDown=zope.testing.setupstack.tearDown, optionflags=doctest.ELLIPSIS, )) suite.addTest(doctest.DocFileSuite( "blob_layout.txt", optionflags=doctest.ELLIPSIS|doctest.NORMALIZE_WHITESPACE, setUp=setUp, tearDown=zope.testing.setupstack.tearDown, checker = zope.testing.renormalizing.RENormalizing([ (re.compile(r'\%(sep)s\%(sep)s' % dict(sep=os.path.sep)), '/'), (re.compile(r'\%(sep)s' % dict(sep=os.path.sep)), '/'), 
(re.compile(r'\S+/((old|bushy|lawn)/\S+/foo[23456]?)'), r'\1'), ]), )) suite.addTest(storage_reusable_suite( 'BlobAdaptedFileStorage', lambda name, blob_dir: ZODB.blob.BlobStorage(blob_dir, FileStorage('%s.fs' % name)), test_blob_storage_recovery=True, test_packing=True, )) return suite if __name__ == '__main__': unittest.main(defaultTest = 'test_suite') ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testconflictresolution.py000066400000000000000000000207401230730566700270010ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2007 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## import manuel.doctest import manuel.footnote import doctest import manuel.capture import manuel.testing import persistent import transaction import unittest import ZODB.ConflictResolution import ZODB.tests.util import ZODB.POSException import zope.testing.module def setUp(test): ZODB.tests.util.setUp(test) zope.testing.module.setUp(test, 'ConflictResolution_txt') ZODB.ConflictResolution._class_cache.clear() ZODB.ConflictResolution._unresolvable.clear() def tearDown(test): zope.testing.module.tearDown(test) ZODB.tests.util.tearDown(test) ZODB.ConflictResolution._class_cache.clear() ZODB.ConflictResolution._unresolvable.clear() class ResolveableWhenStateDoesNotChange(persistent.Persistent): def _p_resolveConflict(old, committed, new): raise ZODB.POSException.ConflictError class Unresolvable(persistent.Persistent): pass def succeed_with_resolution_when_state_is_unchanged(): """ If a conflicting change doesn't change the state, then don't even bother calling _p_resolveConflict >>> db = ZODB.DB('t.fs') # FileStorage! >>> storage = db.storage >>> conn = db.open() >>> conn.root.x = ResolveableWhenStateDoesNotChange() >>> conn.root.x.v = 1 >>> transaction.commit() >>> serial1 = conn.root.x._p_serial >>> conn.root.x.v = 2 >>> transaction.commit() >>> serial2 = conn.root.x._p_serial >>> oid = conn.root.x._p_oid So, let's try resolving when the old and committed states are the same but the new state (pickle) is different: >>> p = storage.tryToResolveConflict( ... oid, serial1, serial1, storage.loadSerial(oid, serial2)) >>> p == storage.loadSerial(oid, serial2) True And when the old and new states are the same but the committed state is different: >>> p = storage.tryToResolveConflict( ... oid, serial2, serial1, storage.loadSerial(oid, serial1)) >>> p == storage.loadSerial(oid, serial2) True But we still conflict if both the committed and new are different than the original: >>> p = storage.tryToResolveConflict( ... oid, serial2, serial1, storage.loadSerial(oid, serial2)) ... # doctest: +ELLIPSIS Traceback (most recent call last): ... ConflictError: database conflict error (oid 0x01, ... Of course, none of this applies if content doesn't support conflict resolution. >>> conn.root.y = Unresolvable() >>> conn.root.y.v = 1 >>> transaction.commit() >>> oid = conn.root.y._p_oid >>> serial = conn.root.y._p_serial >>> p = storage.tryToResolveConflict( ... 
oid, serial, serial, storage.loadSerial(oid, serial)) ... # doctest: +ELLIPSIS Traceback (most recent call last): ... ConflictError: database conflict error (oid 0x02, ... >>> db.close() """ class Resolveable(persistent.Persistent): def _p_resolveConflict(self, old, committed, new): resolved = {} for k in old: if k not in committed: if k in new and new[k] == old[k]: continue raise ZODB.POSException.ConflictError if k not in new: if k in committed and committed[k] == old[k]: continue raise ZODB.POSException.ConflictError if committed[k] != old[k]: if new[k] == old[k]: resolved[k] = committed[k] continue raise ZODB.POSException.ConflictError if new[k] != old[k]: if committed[k] == old[k]: resolved[k] = new[k] continue raise ZODB.POSException.ConflictError resolved[k] = old[k] for k in new: if k in old: continue if k in committed: raise ZODB.POSException.ConflictError resolved[k] = new[k] for k in committed: if k in old: continue if k in new: raise ZODB.POSException.ConflictError resolved[k] = committed[k] return resolved def resolve_even_when_referenced_classes_are_absent(): """ We often want to be able to resolve even when there are persistent references to classes that can't be imported. >>> class P(persistent.Persistent): ... pass >>> db = ZODB.DB('t.fs') # FileStorage! >>> storage = db.storage >>> conn = db.open() >>> conn.root.x = Resolveable() >>> transaction.commit() >>> oid = conn.root.x._p_oid >>> serial = conn.root.x._p_serial >>> conn.root.x.a = P() >>> transaction.commit() >>> aid = conn.root.x.a._p_oid >>> serial1 = conn.root.x._p_serial >>> del conn.root.x.a >>> conn.root.x.b = P() >>> transaction.commit() >>> serial2 = conn.root.x._p_serial Bwahaha: >>> P_aside = P >>> del P Now, even though we can't import P, we can still resolve the conflict: >>> p = storage.tryToResolveConflict( ... oid, serial1, serial, storage.loadSerial(oid, serial2)) And load the pickle: >>> conn2 = db.open() >>> P = P_aside >>> p = conn2._reader.getState(p) >>> sorted(p), p['a'] is conn2.get(aid), p['b'] is conn2.root.x.b (['a', 'b'], True, True) >>> isinstance(p['a'], P) and isinstance(p['b'], P) True Of course, this won't work if the subobjects aren't persistent: >>> class NP: ... pass >>> conn.root.x = Resolveable() >>> transaction.commit() >>> oid = conn.root.x._p_oid >>> serial = conn.root.x._p_serial >>> conn.root.x.a = a = NP() >>> transaction.commit() >>> serial1 = conn.root.x._p_serial >>> del conn.root.x.a >>> conn.root.x.b = b = NP() >>> transaction.commit() >>> serial2 = conn.root.x._p_serial Bwahaha: >>> del NP >>> storage.tryToResolveConflict( ... oid, serial1, serial, storage.loadSerial(oid, serial2)) ... # doctest: +ELLIPSIS Traceback (most recent call last): ... ConflictError: database conflict error (oid ... >>> db.close() """ def resolve_even_when_xdb_referenced_classes_are_absent(): """Cross-database persistent refs! >>> class P(persistent.Persistent): ... 
pass >>> databases = {} >>> db = ZODB.DB('t.fs', databases=databases, database_name='') >>> db2 = ZODB.DB('o.fs', databases=databases, database_name='o') >>> storage = db.storage >>> conn = db.open() >>> conn.root.x = Resolveable() >>> transaction.commit() >>> oid = conn.root.x._p_oid >>> serial = conn.root.x._p_serial >>> p = P(); conn.get_connection('o').add(p) >>> conn.root.x.a = p >>> transaction.commit() >>> aid = conn.root.x.a._p_oid >>> serial1 = conn.root.x._p_serial >>> del conn.root.x.a >>> p = P(); conn.get_connection('o').add(p) >>> conn.root.x.b = p >>> transaction.commit() >>> serial2 = conn.root.x._p_serial >>> del p Bwahaha: >>> P_aside = P >>> del P Now, even though we can't import P, we can still resolve the conflict: >>> p = storage.tryToResolveConflict( ... oid, serial1, serial, storage.loadSerial(oid, serial2)) And load the pickle: >>> conn2 = db.open() >>> conn2o = conn2.get_connection('o') >>> P = P_aside >>> p = conn2._reader.getState(p) >>> sorted(p), p['a'] is conn2o.get(aid), p['b'] is conn2.root.x.b (['a', 'b'], True, True) >>> isinstance(p['a'], P) and isinstance(p['b'], P) True >>> db.close() >>> db2.close() """ def test_suite(): return unittest.TestSuite([ manuel.testing.TestSuite( manuel.doctest.Manuel() + manuel.footnote.Manuel() + manuel.capture.Manuel(), '../ConflictResolution.txt', setUp=setUp, tearDown=tearDown, ), doctest.DocTestSuite( setUp=setUp, tearDown=tearDown), ]) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testcrossdatabasereferences.py000066400000000000000000000134001230730566700277270ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2005 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## import doctest import persistent import unittest class MyClass(persistent.Persistent): pass class MyClass_w_getnewargs(persistent.Persistent): def __getnewargs__(self): return () def test_must_use_consistent_connections(): """ It's important to use consistent connections. References to separate connections to the same database or multi-database won't work. For example, it's tempting to open a second database using the database open function, but this doesn't work: >>> import ZODB.tests.util, transaction, persistent >>> databases = {} >>> db1 = ZODB.tests.util.DB(databases=databases, database_name='1') >>> db2 = ZODB.tests.util.DB(databases=databases, database_name='2') >>> tm = transaction.TransactionManager() >>> conn1 = db1.open(transaction_manager=tm) >>> p1 = MyClass() >>> conn1.root()['p'] = p1 >>> tm.commit() >>> conn2 = db2.open(transaction_manager=tm) >>> p2 = MyClass() >>> conn2.root()['p'] = p2 >>> p2.p1 = p1 >>> tm.commit() # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS Traceback (most recent call last): ... InvalidObjectReference: ('Attempt to store a reference to an object from a separate connection to the same database or multidatabase', , ) >>> tm.abort() Even without multi-databases, a common mistake is to mix objects in different connections to the same database. 
>>> conn2 = db1.open(transaction_manager=tm) >>> p2 = MyClass() >>> conn2.root()['p'] = p2 >>> p2.p1 = p1 >>> tm.commit() # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS Traceback (most recent call last): ... InvalidObjectReference: ('Attempt to store a reference to an object from a separate connection to the same database or multidatabase', , ) >>> tm.abort() """ def test_connection_management_doesnt_get_caching_wrong(): """ If a connection participates in a multidatabase, then its connections must remain so that references between its cached objects remain sane. >>> import ZODB.tests.util, transaction, persistent >>> databases = {} >>> db1 = ZODB.tests.util.DB(databases=databases, database_name='1') >>> db2 = ZODB.tests.util.DB(databases=databases, database_name='2') >>> tm = transaction.TransactionManager() >>> conn1 = db1.open(transaction_manager=tm) >>> conn2 = conn1.get_connection('2') >>> z = MyClass() >>> conn2.root()['z'] = z >>> tm.commit() >>> x = MyClass() >>> x.z = z >>> conn1.root()['x'] = x >>> y = MyClass() >>> y.z = z >>> conn1.root()['y'] = y >>> tm.commit() >>> conn1.root()['x'].z is conn1.root()['y'].z True So, we have 2 objects in conn1 that point to the same object in conn2. Now, we'll deactivate one, close and reopen the connection, and see if we get the same objects: >>> x._p_deactivate() >>> conn1.close() >>> conn1 = db1.open(transaction_manager=tm) >>> conn1.root()['x'].z is conn1.root()['y'].z True >>> db1.close() >>> db2.close() """ def test_explicit_adding_with_savepoint(): """ >>> import ZODB.tests.util, transaction, persistent >>> databases = {} >>> db1 = ZODB.tests.util.DB(databases=databases, database_name='1') >>> db2 = ZODB.tests.util.DB(databases=databases, database_name='2') >>> tm = transaction.TransactionManager() >>> conn1 = db1.open(transaction_manager=tm) >>> conn2 = conn1.get_connection('2') >>> z = MyClass() >>> conn1.root()['z'] = z >>> conn1.add(z) >>> s = tm.savepoint() >>> conn2.root()['z'] = z >>> tm.commit() >>> z._p_jar.db().database_name '1' >>> db1.close() >>> db2.close() """ def test_explicit_adding_with_savepoint2(): """ >>> import ZODB.tests.util, transaction, persistent >>> databases = {} >>> db1 = ZODB.tests.util.DB(databases=databases, database_name='1') >>> db2 = ZODB.tests.util.DB(databases=databases, database_name='2') >>> tm = transaction.TransactionManager() >>> conn1 = db1.open(transaction_manager=tm) >>> conn2 = conn1.get_connection('2') >>> z = MyClass() >>> conn1.root()['z'] = z >>> conn1.add(z) >>> s = tm.savepoint() >>> conn2.root()['z'] = z >>> z.x = 1 >>> tm.commit() >>> z._p_jar.db().database_name '1' >>> db1.close() >>> db2.close() """ def tearDownDbs(test): test.globs['db1'].close() test.globs['db2'].close() def test_suite(): return unittest.TestSuite(( doctest.DocFileSuite('../cross-database-references.txt', globs=dict(MyClass=MyClass), tearDown=tearDownDbs, ), doctest.DocFileSuite('../cross-database-references.txt', globs=dict(MyClass=MyClass_w_getnewargs), tearDown=tearDownDbs, ), doctest.DocTestSuite(), )) if __name__ == '__main__': unittest.main(defaultTest='test_suite') ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testfsIndex.py000066400000000000000000000150701230730566700244540ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). 
A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## import doctest import random import unittest from ZODB.fsIndex import fsIndex from ZODB.utils import p64, z64 from ZODB.tests.util import setUp, tearDown class Test(unittest.TestCase): def setUp(self): self.index = fsIndex() for i in range(200): self.index[p64(i * 1000)] = (i * 1000L + 1) def test__del__(self): index = self.index self.assert_(p64(1000) in index) self.assert_(p64(100*1000) in index) del self.index[p64(1000)] del self.index[p64(100*1000)] self.assert_(p64(1000) not in index) self.assert_(p64(100*1000) not in index) for key in list(self.index): del index[key] self.assert_(not index) # Whitebox. Make sure empty buckets are removed self.assert_(not index._data) def testInserts(self): index = self.index for i in range(0,200): self.assertEqual((i,index[p64(i*1000)]), (i,(i*1000L+1))) self.assertEqual(len(index), 200) key=p64(2000) self.assertEqual(index.get(key), 2001) key=p64(2001) self.assertEqual(index.get(key), None) self.assertEqual(index.get(key, ''), '') # self.failUnless(len(index._data) > 1) def testUpdate(self): index = self.index d={} for i in range(200): d[p64(i*1000)]=(i*1000L+1) index.update(d) for i in range(400,600): d[p64(i*1000)]=(i*1000L+1) index.update(d) for i in range(100, 500): d[p64(i*1000)]=(i*1000L+2) index.update(d) self.assertEqual(index.get(p64(2000)), 2001) self.assertEqual(index.get(p64(599000)), 599001) self.assertEqual(index.get(p64(399000)), 399002) self.assertEqual(len(index), 600) def testKeys(self): keys = list(iter(self.index)) keys.sort() for i, k in enumerate(keys): self.assertEqual(k, p64(i * 1000)) keys = list(self.index.iterkeys()) keys.sort() for i, k in enumerate(keys): self.assertEqual(k, p64(i * 1000)) keys = self.index.keys() keys.sort() for i, k in enumerate(keys): self.assertEqual(k, p64(i * 1000)) def testValues(self): values = list(self.index.itervalues()) values.sort() for i, v in enumerate(values): self.assertEqual(v, (i * 1000L + 1)) values = self.index.values() values.sort() for i, v in enumerate(values): self.assertEqual(v, (i * 1000L + 1)) def testItems(self): items = list(self.index.iteritems()) items.sort() for i, item in enumerate(items): self.assertEqual(item, (p64(i * 1000), (i * 1000L + 1))) items = self.index.items() items.sort() for i, item in enumerate(items): self.assertEqual(item, (p64(i * 1000), (i * 1000L + 1))) def testMaxKey(self): index = self.index index.clear() # An empty index should complain. self.assertRaises(ValueError, index.maxKey) # Now build up a tree with random values, and check maxKey at each # step. 
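# (maxKey() follows the usual BTree convention: with no argument it returns the largest key in the index, and with an upper bound it returns the largest key not exceeding that bound, as exercised with the a/b/c/d keys below.)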
correct_max = "" # smaller than anything we'll add for i in range(1000): key = p64(random.randrange(100000000)) index[key] = i correct_max = max(correct_max, key) index_max = index.maxKey() self.assertEqual(index_max, correct_max) index.clear() a = '\000\000\000\000\000\001\000\000' b = '\000\000\000\000\000\002\000\000' c = '\000\000\000\000\000\003\000\000' d = '\000\000\000\000\000\004\000\000' index[a] = 1 index[c] = 2 self.assertEqual(index.maxKey(b), a) self.assertEqual(index.maxKey(d), c) self.assertRaises(ValueError, index.maxKey, z64) def testMinKey(self): index = self.index index.clear() # An empty index should complain. self.assertRaises(ValueError, index.minKey) # Now build up a tree with random values, and check minKey at each # step. correct_min = "\xff" * 8 # bigger than anything we'll add for i in range(1000): key = p64(random.randrange(100000000)) index[key] = i correct_min = min(correct_min, key) index_min = index.minKey() self.assertEqual(index_min, correct_min) index.clear() a = '\000\000\000\000\000\001\000\000' b = '\000\000\000\000\000\002\000\000' c = '\000\000\000\000\000\003\000\000' d = '\000\000\000\000\000\004\000\000' index[a] = 1 index[c] = 2 self.assertEqual(index.minKey(b), c) self.assertRaises(ValueError, index.minKey, d) def fsIndex_save_and_load(): """ fsIndex objects now have save methods for saving them to disk in a new format. The fsIndex class has a load class method that can load data. Let's start by creating an fsIndex. We'll bother to allocate the object ids to get multiple buckets: >>> index = fsIndex(dict((p64(i), i) for i in xrange(0, 1<<28, 1<<15))) >>> len(index._data) 4096 Now, we'll save the data to disk and then load it: >>> index.save(42, 'index') Note that we pass a file position, which gets saved with the index data. >>> info = fsIndex.load('index') >>> info['pos'] 42 >>> info['index'].__getstate__() == index.__getstate__() True If we save the data in the old format, we can still read it: >>> import cPickle >>> cPickle.dump(dict(pos=42, index=index), open('old', 'wb'), 1) >>> info = fsIndex.load('old') >>> info['pos'] 42 >>> info['index'].__getstate__() == index.__getstate__() True """ def test_suite(): suite = unittest.TestSuite() suite.addTest(unittest.makeSuite(Test)) suite.addTest(doctest.DocTestSuite(setUp=setUp, tearDown=tearDown)) return suite ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testfsoids.py000066400000000000000000000137571230730566700243550ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## r""" fsoids test, of the workhorse fsoids.Trace class ================================================ Let's get a path to work with first. >>> path = 'Data.fs' More imports. >>> import ZODB >>> from ZODB.FileStorage import FileStorage >>> import transaction as txn >>> from BTrees.OOBTree import OOBTree >>> from ZODB.FileStorage.fsoids import Tracer # we're testing this Create an empty FileStorage. 
>>> st = FileStorage(path) There's not a lot interesting in an empty DB! >>> t = Tracer(path) >>> t.register_oids(0x123456) >>> t.register_oids(1) >>> t.register_oids(0) >>> t.run() >>> t.report() oid 0x00 0 revisions this oid was not defined (no data record for it found) oid 0x01 0 revisions this oid was not defined (no data record for it found) oid 0x123456 0 revisions this oid was not defined (no data record for it found) That didn't tell us much, but does show that the specified oids are sorted into increasing order. Create a root object and try again: >>> db = ZODB.DB(st) # yes, that creates a root object! >>> t = Tracer(path) >>> t.register_oids(0, 1) >>> t.run(); t.report() #doctest: +ELLIPSIS oid 0x00 persistent.mapping.PersistentMapping 1 revision tid 0x... offset=4 ... tid user='' tid description='initial database creation' new revision persistent.mapping.PersistentMapping at 52 oid 0x01 0 revisions this oid was not defined (no data record for it found) So we see oid 0 has been used in our one transaction, and that it was created there, and is a PersistentMapping. 4 is the file offset to the start of the transaction record, and 52 is the file offset to the start of the data record for oid 0 within this transaction. Because tids are timestamps too, the "..." parts vary across runs. The initial line for a tid actually looks like this: tid 0x035748597843b877 offset=4 2004-08-20 20:41:28.187000 Let's add a BTree and try again: >>> root = db.open().root() >>> root['tree'] = OOBTree() >>> txn.get().note('added an OOBTree') >>> txn.get().commit() >>> t = Tracer(path) >>> t.register_oids(0, 1) >>> t.run(); t.report() #doctest: +ELLIPSIS oid 0x00 persistent.mapping.PersistentMapping 2 revisions tid 0x... offset=4 ... tid user='' tid description='initial database creation' new revision persistent.mapping.PersistentMapping at 52 tid 0x... offset=162 ... tid user='' tid description='added an OOBTree' new revision persistent.mapping.PersistentMapping at 201 references 0x01 BTrees.OOBTree.OOBTree at 201 oid 0x01 BTrees.OOBTree.OOBTree 1 revision tid 0x... offset=162 ... tid user='' tid description='added an OOBTree' new revision BTrees.OOBTree.OOBTree at 350 referenced by 0x00 persistent.mapping.PersistentMapping at 201 So there are two revisions of oid 0 now, and the second references oid 1. One more, storing a reference in the BTree back to the root object: >>> tree = root['tree'] >>> tree['root'] = root >>> txn.get().note('circling back to the root') >>> txn.get().commit() >>> t = Tracer(path) >>> t.register_oids(0, 1, 2) >>> t.run(); t.report() #doctest: +ELLIPSIS oid 0x00 persistent.mapping.PersistentMapping 2 revisions tid 0x... offset=4 ... tid user='' tid description='initial database creation' new revision persistent.mapping.PersistentMapping at 52 tid 0x... offset=162 ... tid user='' tid description='added an OOBTree' new revision persistent.mapping.PersistentMapping at 201 references 0x01 BTrees.OOBTree.OOBTree at 201 tid 0x... offset=429 ... tid user='' tid description='circling back to the root' referenced by 0x01 BTrees.OOBTree.OOBTree at 477 oid 0x01 BTrees.OOBTree.OOBTree 2 revisions tid 0x... offset=162 ... tid user='' tid description='added an OOBTree' new revision BTrees.OOBTree.OOBTree at 350 referenced by 0x00 persistent.mapping.PersistentMapping at 201 tid 0x... offset=429 ... 
tid user='' tid description='circling back to the root' new revision BTrees.OOBTree.OOBTree at 477 references 0x00 persistent.mapping.PersistentMapping at 477 oid 0x02 0 revisions this oid was not defined (no data record for it found) Note that we didn't create any new object there (oid 2 is still unused), we just made oid 1 refer to oid 0. Therefore there's a new "new revision" line in the output for oid 1. Note that there's also new output for oid 0, even though the root object didn't change: we got new output for oid 0 because it's a traced oid and the new transaction made a new reference *to* it. Since the Trace constructor takes only one argument, the only sane thing you can do to make it fail is to give it a path to a file that doesn't exist: >>> Tracer('/eiruowieuu/lsijflfjlsijflsdf/eurowiurowioeuri/908479287.fs') Traceback (most recent call last): ... ValueError: must specify an existing FileStorage You get the same kind of exception if you pass it a path to an existing directory (the path must be to a file, not a directory): >>> import os >>> Tracer(os.path.dirname(__file__)) Traceback (most recent call last): ... ValueError: must specify an existing FileStorage Clean up. >>> st.close() >>> st.cleanup() # remove .fs, .index, etc """ import doctest def test_suite(): return doctest.DocTestSuite() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testhistoricalconnections.py000066400000000000000000000017051230730566700274600ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2007 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## import manuel.doctest import manuel.footnote import manuel.testing import ZODB.tests.util def test_suite(): return manuel.testing.TestSuite( manuel.doctest.Manuel() + manuel.footnote.Manuel(), '../historical_connections.txt', setUp=ZODB.tests.util.setUp, tearDown=ZODB.tests.util.tearDown, ) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testmvcc.py000066400000000000000000000271651230730566700240140ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## r""" Multi-version concurrency control tests ======================================= Multi-version concurrency control (MVCC) exploits storages that store multiple revisions of an object to avoid read conflicts. Normally when an object is read from the storage, its most recent revision is read. 
Under MVCC, an older revision may be read so that the transaction sees a consistent view of the database. ZODB guarantees execution-time consistency: A single transaction will always see a consistent view of the database while it is executing. If transaction A is running, has already read an object O1, and a different transaction B modifies object O2, then transaction A can no longer read the current revision of O2. It must either read the version of O2 that is consistent with O1 or raise a ReadConflictError. When MVCC is in use, A will do the former. This note includes doctests that explain how MVCC is implemented (and test that the implementation is correct). The tests use a MinimalMemoryStorage that implements MVCC support, but not much else. >>> from ZODB.tests.test_storage import MinimalMemoryStorage >>> from ZODB import DB >>> db = DB(MinimalMemoryStorage()) We will use two different connections with different transaction managers to make sure that the connections act independently, even though they'll be run from a single thread. >>> import transaction >>> tm1 = transaction.TransactionManager() >>> cn1 = db.open(transaction_manager=tm1) The test will just use some MinPO objects. The next few lines just setup an initial database state. >>> from ZODB.tests.MinPO import MinPO >>> r = cn1.root() >>> r["a"] = MinPO(1) >>> r["b"] = MinPO(1) >>> tm1.get().commit() Now open a second connection. >>> tm2 = transaction.TransactionManager() >>> cn2 = db.open(transaction_manager=tm2) Connection high-water mark -------------------------- The ZODB Connection tracks a transaction high-water mark, which bounds the latest transaction id that can be read by the current transaction and still present a consistent view of the database. Transactions with ids up to but not including the high-water mark are OK to read. When a transaction commits, the database sends invalidations to all the other connections; the invalidation contains the transaction id and the oids of modified objects. The Connection stores the high-water mark in _txn_time, which is set to None until an invalidation arrives. >>> cn = db.open() >>> print cn._txn_time None >>> cn.invalidate(100, dict.fromkeys([1, 2])) >>> cn._txn_time 100 >>> cn.invalidate(200, dict.fromkeys([1, 2])) >>> cn._txn_time 100 A connection's high-water mark is set to the transaction id taken from the first invalidation processed by the connection. Transaction ids are monotonically increasing, so the first one seen during the current transaction remains the high-water mark for the duration of the transaction. We'd like simple abort and commit calls to make txn boundaries, but that doesn't work unless an object is modified. sync() will abort a transaction and process invalidations. >>> cn.sync() >>> print cn._txn_time # the high-water mark got reset to None None Basic functionality ------------------- The next bit of code includes a simple MVCC test. One transaction will modify "a." The other transaction will then modify "b" and commit. >>> r1 = cn1.root() >>> r1["a"].value = 2 >>> tm1.get().commit() >>> txn = db.lastTransaction() The second connection has its high-water mark set now. >>> cn2._txn_time == txn True It is safe to read "b," because it was not modified by the concurrent transaction. >>> r2 = cn2.root() >>> r2["b"]._p_serial < cn2._txn_time True >>> r2["b"].value 1 >>> r2["b"].value = 2 It is not safe, however, to read the current revision of "a" because it was modified at the high-water mark. If we read it, we'll get a non-current version. 
>>> r2["a"].value 1 >>> r2["a"]._p_serial < cn2._txn_time True We can confirm that we have a non-current revision by asking the storage. >>> db.storage.isCurrent(r2["a"]._p_oid, r2["a"]._p_serial) False It's possible to modify "a", but we get a conflict error when we commit the transaction. >>> r2["a"].value = 3 >>> tm2.get().commit() Traceback (most recent call last): ... ConflictError: database conflict error (oid 0x01, class ZODB.tests.MinPO.MinPO) >>> tm2.get().abort() This example will demonstrate that we can commit a transaction if we only modify current revisions. >>> print cn2._txn_time None >>> r1 = cn1.root() >>> r1["a"].value = 3 >>> tm1.get().commit() >>> txn = db.lastTransaction() >>> cn2._txn_time == txn True >>> r2["b"].value = r2["a"].value + 1 >>> r2["b"].value 3 >>> tm2.get().commit() >>> print cn2._txn_time None Object cache ------------ A Connection keeps objects in its cache so that multiple database references will always point to the same Python object. At transaction boundaries, objects modified by other transactions are ghostified so that the next transaction doesn't see stale state. We need to be sure the non-current objects loaded by MVCC are always ghosted. It should be trivial, because MVCC is only used when an invalidation has been received for an object. First get the database back in an initial state. >>> cn1.sync() >>> r1["a"].value = 0 >>> r1["b"].value = 0 >>> tm1.get().commit() >>> cn2.sync() >>> r2["a"].value 0 >>> r2["b"].value = 1 >>> tm2.get().commit() >>> r1["b"].value 0 >>> cn1.sync() # cn2 modified 'b', so cn1 should get a ghost for b >>> r1["b"]._p_state # -1 means GHOST -1 Closing the connection, committing a transaction, and aborting a transaction, should all have the same effect on non-current objects in cache. >>> def testit(): ... cn1.sync() ... r1["a"].value = 0 ... r1["b"].value = 0 ... tm1.get().commit() ... cn2.sync() ... r2["b"].value = 1 ... tm2.get().commit() >>> testit() >>> r1["b"]._p_state # 0 means UPTODATE, although note it's an older revision 0 >>> r1["b"].value 0 >>> r1["a"].value = 1 >>> tm1.get().commit() >>> r1["b"]._p_state -1 When a connection is closed, it is saved by the database. It will be reused by the next open() call (along with its object cache). >>> testit() >>> r1["a"].value = 1 >>> tm1.get().abort() >>> cn1.close() >>> cn3 = db.open() >>> cn1 is cn3 True >>> r1 = cn1.root() Although "b" is a ghost in cn1 at this point (because closing a connection has the same effect on non-current objects in the connection's cache as committing a transaction), not every object is a ghost. The root was in the cache and was current, so our first reference to it doesn't return a ghost. >>> r1._p_state # UPTODATE 0 >>> r1["b"]._p_state # GHOST -1 Interaction with Savepoints --------------------------- Basically, making a savepoint shouldn't have any effect on what a thread sees. Before ZODB 3.4.1, the internal TmpStore used when savepoints are pending didn't delegate all the methods necessary to make this work, so we'll do a quick test of that here. First get a clean slate: >>> cn1.close(); cn2.close() >>> cn1 = db.open(transaction_manager=tm1) >>> r1 = cn1.root() >>> r1["a"].value = 0 >>> r1["b"].value = 1 >>> tm1.commit() Now modify "a", but not "b", and make a savepoint. >>> r1["a"].value = 42 >>> sp = cn1.savepoint() Over in the other connection, modify "b" and commit it. This makes the first connection's state for b "old". 
>>> cn2 = db.open(transaction_manager=tm2) >>> r2 = cn2.root() >>> r2["a"].value, r2["b"].value # shouldn't see the change to "a" (0, 1) >>> r2["b"].value = 43 >>> tm2.commit() >>> r2["a"].value, r2["b"].value (0, 43) Now deactivate "b" in the first connection, and (re)fetch it. The first connection should still see 1, due to MVCC, but to get this old state TmpStore needs to handle the loadBefore() method. >>> r1["b"]._p_deactivate() Before 3.4.1, the next line died with AttributeError: TmpStore instance has no attribute 'loadBefore' >>> r1["b"]._p_state # ghost -1 >>> r1["b"].value 1 Just for fun, finish the commit and make sure both connections see the same things now. >>> tm1.commit() >>> cn1.sync(); cn2.sync() >>> r1["a"].value, r1["b"].value (42, 43) >>> r2["a"].value, r2["b"].value (42, 43) Late invalidation ----------------- The combination of ZEO and MVCC adds more complexity. Since invalidations are delivered asynchronously by ZEO, it is possible for an invalidation to arrive just after a request to load the invalidated object is sent. The connection can't use the just-loaded data, because the invalidation arrived first. The complexity for MVCC is that it must check for invalidated objects after it has loaded them, just in case. Rather than add all the complexity of ZEO to these tests, the MinimalMemoryStorage has a hook. We'll write a subclass that will deliver an invalidation when it loads an object. The hook allows us to test the Connection code. >>> class TestStorage(MinimalMemoryStorage): ... def __init__(self): ... self.hooked = {} ... self.count = 0 ... super(TestStorage, self).__init__() ... def registerDB(self, db): ... self.db = db ... def hook(self, oid, tid, version): ... if oid in self.hooked: ... self.db.invalidate(tid, {oid:1}) ... self.count += 1 We can execute this test with a single connection, because we're synthesizing the invalidation that is normally generated by the second connection. We need to create two revisions so that there is a non-current revision to load. >>> ts = TestStorage() >>> db = DB(ts) >>> cn1 = db.open(transaction_manager=tm1) >>> r1 = cn1.root() >>> r1["a"] = MinPO(0) >>> r1["b"] = MinPO(0) >>> tm1.get().commit() >>> r1["b"].value = 1 >>> tm1.get().commit() >>> cn1.cacheMinimize() # makes everything in cache a ghost >>> oid = r1["b"]._p_oid >>> ts.hooked[oid] = 1 Once the oid is hooked, an invalidation will be delivered the next time it is activated. The code below activates the object, then confirms that the hook worked and that the old state was retrieved. >>> oid in cn1._invalidated False >>> r1["b"]._p_state -1 >>> r1["b"]._p_activate() >>> oid in cn1._invalidated True >>> ts.count 1 >>> r1["b"].value 0 No earlier revision available ----------------------------- We'll reuse the code from the example above, except that there will only be a single revision of "b." As a result, the attempt to activate "b" will result in a ReadConflictError. >>> ts = TestStorage() >>> db = DB(ts) >>> cn1 = db.open(transaction_manager=tm1) >>> r1 = cn1.root() >>> r1["a"] = MinPO(0) >>> r1["b"] = MinPO(0) >>> tm1.get().commit() >>> cn1.cacheMinimize() # makes everything in cache a ghost >>> oid = r1["b"]._p_oid >>> ts.hooked[oid] = 1 Again, once the oid is hooked, an invalidation will be delivered the next time it is activated. The code below activates the object, but unlike the section above, there is no older state to retrieve. >>> oid in cn1._invalidated False >>> r1["b"]._p_state -1 >>> r1["b"]._p_activate() Traceback (most recent call last): ... 
ReadConflictError: database read conflict error (oid 0x02, class ZODB.tests.MinPO.MinPO) >>> oid in cn1._invalidated True >>> ts.count 1 """ import doctest def test_suite(): return doctest.DocTestSuite() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/testpersistentclass.py000066400000000000000000000037541230730566700263060ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## import doctest import sys import transaction import unittest import ZODB.persistentclass import ZODB.tests.util def class_with_circular_ref_to_self(): """ It should be possible for a class to refer to itself. >>> class C: ... __metaclass__ = ZODB.persistentclass.PersistentMetaClass >>> C.me = C >>> db = ZODB.tests.util.DB() >>> conn = db.open() >>> conn.root()['C'] = C >>> transaction.commit() >>> conn2 = db.open() >>> C2 = conn2.root()['C'] >>> c = C2() >>> c.__class__.__name__ 'C' """ # XXX need to update files to get newer testing package class FakeModule: def __init__(self, name, dict): self.__dict__ = dict self.__name__ = name def setUp(test): ZODB.tests.util.setUp(test) test.globs['some_database'] = ZODB.tests.util.DB() module = FakeModule('ZODB.persistentclass_txt', test.globs) sys.modules[module.__name__] = module def tearDown(test): test.globs['some_database'].close() del sys.modules['ZODB.persistentclass_txt'] ZODB.tests.util.tearDown(test) def test_suite(): return unittest.TestSuite(( doctest.DocFileSuite("../persistentclass.txt", setUp=setUp, tearDown=tearDown), doctest.DocTestSuite(setUp=setUp, tearDown=tearDown), )) if __name__ == '__main__': unittest.main(defaultTest='test_suite') ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/util.py000066400000000000000000000103521230730566700231270ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## """Convenience function for creating test databases """ from __future__ import with_statement from ZODB.MappingStorage import DB import atexit import os import persistent import sys import tempfile import time import transaction import unittest import warnings import ZODB.utils import zope.testing.setupstack def setUp(test, name='test'): transaction.abort() d = tempfile.mkdtemp(prefix=name) zope.testing.setupstack.register(test, zope.testing.setupstack.rmtree, d) zope.testing.setupstack.register( test, setattr, tempfile, 'tempdir', tempfile.tempdir) tempfile.tempdir = d zope.testing.setupstack.register(test, os.chdir, os.getcwd()) os.chdir(d) zope.testing.setupstack.register(test, transaction.abort) tearDown = zope.testing.setupstack.tearDown class TestCase(unittest.TestCase): def setUp(self): self.globs = {} name = self.__class__.__name__ mname = getattr(self, '_TestCase__testMethodName', '') if mname: name += '-' + mname setUp(self, name) tearDown = tearDown def pack(db): db.pack(time.time()+1) class P(persistent.Persistent): def __init__(self, name=None): self.name = name def __repr__(self): return 'P(%s)' % self.name class MininalTestLayer: __bases__ = () __module__ = '' def __init__(self, name): self.__name__ = name def setUp(self): self.here = os.getcwd() self.tmp = tempfile.mkdtemp(self.__name__, dir=os.getcwd()) os.chdir(self.tmp) # sigh. tearDown isn't called when a layer is run in a sub-process. atexit.register(clean, self.tmp) def tearDown(self): os.chdir(self.here) zope.testing.setupstack.rmtree(self.tmp) testSetUp = testTearDown = lambda self: None def clean(tmp): if os.path.isdir(tmp): zope.testing.setupstack.rmtree(tmp) class AAAA_Test_Runner_Hack(unittest.TestCase): """Hack to work around a bug in the test runner. The first layer (lex sorted) is run first in the foreground """ layer = MininalTestLayer('!no tests here!') def testNothing(self): pass def assert_warning(category, func, warning_text=''): if sys.version_info < (2, 6): return func() # Can't use catch_warnings :( with warnings.catch_warnings(record=True) as w: warnings.simplefilter('default') result = func() for warning in w: if ((warning.category is category) and (warning_text in str(warning.message))): return result raise AssertionError(w) def assert_deprecated(func, warning_text=''): return assert_warning(DeprecationWarning, func, warning_text) def wait(func=None, timeout=30): if func is None: return lambda f: wait(f, timeout) for i in xrange(int(timeout*100)): if func(): return time.sleep(.01) raise AssertionError def store(storage, oid, value='x', serial=ZODB.utils.z64): if not isinstance(oid, str): oid = ZODB.utils.p64(oid) if not isinstance(serial, str): serial = ZODB.utils.p64(serial) t = transaction.get() storage.tpc_begin(t) storage.store(oid, serial, value, '', t) storage.tpc_vote(t) storage.tpc_finish(t) def mess_with_time(test=None, globs=None, now=1278864701.5): now = [now] def faux_time(): now[0] += 1 return now[0] if test is None and globs is not None: # sigh faux_time.globs = globs test = faux_time import time zope.testing.setupstack.register(test, setattr, time, 'time', time.time) time.time = faux_time ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/tests/warnhook.py000066400000000000000000000041021230730566700237760ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. 
# # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## import warnings class WarningsHook: """Hook to capture warnings generated by Python. The function warnings.showwarning() is designed to be hooked by application code, allowing the application to customize the way it handles warnings. This hook captures the unformatted warning information and stores it in a list. A test can inspect this list after the test is over. Issues: The warnings module has lots of delicate internal state. If a warning has been reported once, it won't be reported again. It may be necessary to extend this class with a mechanism for modifying the internal state so that we can be guaranteed a warning will be reported. If Python is run with a warnings filter, e.g. python -Werror, then a test that is trying to inspect a particular warning will fail. Perhaps this class can be extended to install more-specific filters the test to work anyway. """ def __init__(self): self.original = None self.warnings = [] def install(self): self.original = warnings.showwarning warnings.showwarning = self.showwarning def uninstall(self): assert self.original is not None warnings.showwarning = self.original self.original = None def showwarning(self, message, category, filename, lineno): self.warnings.append((str(message), category, filename, lineno)) def clear(self): self.warnings = [] ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/transact.py000066400000000000000000000037071230730566700226350ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Tools to simplify transactions within applications.""" from ZODB.POSException import ReadConflictError, ConflictError import transaction def _commit(note): t = transaction.get() if note: t.note(note) t.commit() def transact(f, note=None, retries=5): """Returns transactional version of function argument f. Higher-order function that converts a regular function into a transactional function. The transactional function will retry up to retries time before giving up. If note, it will be added to the transaction metadata when it commits. The retries occur on ConflictErrors. If some other TransactionError occurs, the transaction will not be retried. """ # TODO: deal with ZEO disconnected errors? 
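    # A minimal usage sketch (illustrative only -- ``update_counter`` and
    # ``conn`` are hypothetical names, not part of this module):
    #
    #     def update_counter(root, key):
    #         root[key] = root.get(key, 0) + 1
    #
    #     safe_update = transact(update_counter, note='bump counter')
    #     safe_update(conn.root(), 'hits')  # retried on conflict errors,
    #                                       # up to ``retries`` times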
def g(*args, **kwargs): n = retries while n: n -= 1 try: r = f(*args, **kwargs) except ReadConflictError, msg: transaction.abort() if not n: raise continue try: _commit(note) except ConflictError, msg: transaction.abort() if not n: raise continue return r raise RuntimeError("couldn't commit transaction") return g ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/utils.py000066400000000000000000000204421230730566700221510ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## import sys import time import struct from struct import pack, unpack from binascii import hexlify, unhexlify import cPickle as pickle from cStringIO import StringIO import warnings from tempfile import mkstemp import os from persistent.TimeStamp import TimeStamp __all__ = ['z64', 'p64', 'u64', 'U64', 'cp', 'newTid', 'oid_repr', 'serial_repr', 'tid_repr', 'positive_id', 'readable_tid_repr', 'DEPRECATED_ARGUMENT', 'deprecated37', 'deprecated38', 'get_pickle_metadata', 'locked', ] # A unique marker to give as the default value for a deprecated argument. # The method should then do a # # if that_arg is not DEPRECATED_ARGUMENT: # complain # # dance. DEPRECATED_ARGUMENT = object() # Raise DeprecationWarning, noting that the deprecated thing will go # away in ZODB 3.7. Point to the caller of our caller (i.e., at the # code using the deprecated thing). def deprecated37(msg): warnings.warn("This will be removed in ZODB 3.7:\n%s" % msg, DeprecationWarning, stacklevel=3) # Raise DeprecationWarning, noting that the deprecated thing will go # away in ZODB 3.8. Point to the caller of our caller (i.e., at the # code using the deprecated thing). def deprecated38(msg): warnings.warn("This will be removed in ZODB 3.8:\n%s" % msg, DeprecationWarning, stacklevel=3) z64 = '\0'*8 assert sys.hexversion >= 0x02030000 # The distinction between ints and longs is blurred in Python 2.2, # so u64() are U64() really the same. def p64(v): """Pack an integer or long into a 8-byte string""" return pack(">Q", v) def u64(v): """Unpack an 8-byte string into a 64-bit long integer.""" return unpack(">Q", v)[0] U64 = u64 def cp(f1, f2, length=None): """Copy all data from one file to another. It copies the data from the current position of the input file (f1) appending it to the current position of the output file (f2). It copies at most 'length' bytes. If 'length' isn't given, it copies until the end of the input file. """ read = f1.read write = f2.write n = 8192 if length is None: old_pos = f1.tell() f1.seek(0,2) length = f1.tell() f1.seek(old_pos) while length > 0: if n > length: n = length data = read(n) if not data: break write(data) length -= len(data) def newTid(old): t = time.time() ts = TimeStamp(*time.gmtime(t)[:5]+(t%60,)) if old is not None: ts = ts.laterThan(TimeStamp(old)) return `ts` def oid_repr(oid): if isinstance(oid, str) and len(oid) == 8: # Convert to hex and strip leading zeroes. 
as_hex = hexlify(oid).lstrip('0') # Ensure two characters per input byte. if len(as_hex) & 1: as_hex = '0' + as_hex elif as_hex == '': as_hex = '00' return '0x' + as_hex else: return repr(oid) def repr_to_oid(repr): if repr.startswith("0x"): repr = repr[2:] as_bin = unhexlify(repr) as_bin = "\x00"*(8-len(as_bin)) + as_bin return as_bin serial_repr = oid_repr tid_repr = serial_repr # For example, produce # '0x03441422948b4399 2002-04-14 20:50:34.815000' # for 8-byte string tid '\x03D\x14"\x94\x8bC\x99'. def readable_tid_repr(tid): result = tid_repr(tid) if isinstance(tid, str) and len(tid) == 8: result = "%s %s" % (result, TimeStamp(tid)) return result # Addresses can "look negative" on some boxes, some of the time. If you # feed a "negative address" to an %x format, Python 2.3 displays it as # unsigned, but produces a FutureWarning, because Python 2.4 will display # it as signed. So when you want to prodce an address, use positive_id() to # obtain it. # _ADDRESS_MASK is 2**(number_of_bits_in_a_native_pointer). Adding this to # a negative address gives a positive int with the same hex representation as # the significant bits in the original. _ADDRESS_MASK = 256 ** struct.calcsize('P') def positive_id(obj): """Return id(obj) as a non-negative integer.""" result = id(obj) if result < 0: result += _ADDRESS_MASK assert result > 0 return result # Given a ZODB pickle, return pair of strings (module_name, class_name). # Do this without importing the module or class object. # See ZODB/serialize.py's module docstring for the only docs that exist about # ZODB pickle format. If the code here gets smarter, please update those # docs to be at least as smart. The code here doesn't appear to make sense # for what serialize.py calls formats 5 and 6. def get_pickle_metadata(data): # ZODB's data records contain two pickles. The first is the class # of the object, the second is the object. We're only trying to # pick apart the first here, to extract the module and class names. if data.startswith('(c'): # pickle MARK GLOBAL opcode sequence global_prefix = 2 elif data.startswith('c'): # pickle GLOBAL opcode global_prefix = 1 else: global_prefix = 0 if global_prefix: # Formats 1 and 2. # Don't actually unpickle a class, because it will attempt to # load the class. Just break open the pickle and get the # module and class from it. The module and class names are given by # newline-terminated strings following the GLOBAL opcode. modname, classname, rest = data.split('\n', 2) modname = modname[global_prefix:] # strip GLOBAL opcode return modname, classname # Else there are a bunch of other possible formats. f = StringIO(data) u = pickle.Unpickler(f) try: class_info = u.load() except Exception, err: return '', '' if isinstance(class_info, tuple): if isinstance(class_info[0], tuple): # Formats 3 and 4. modname, classname = class_info[0] else: # Formats 5 and 6 (probably) end up here. modname, classname = class_info else: # This isn't a known format. 
modname = repr(class_info) classname = '' return modname, classname def mktemp(dir=None): """Create a temp file, known by name, in a semi-secure manner.""" handle, filename = mkstemp(dir=dir) os.close(handle) return filename class Locked(object): def __init__(self, func, inst=None, class_=None, preconditions=()): self.im_func = func self.im_self = inst self.im_class = class_ self.preconditions = preconditions def __get__(self, inst, class_): return self.__class__(self.im_func, inst, class_, self.preconditions) def __call__(self, *args, **kw): inst = self.im_self if inst is None: inst = args[0] func = self.im_func.__get__(self.im_self, self.im_class) inst._lock_acquire() try: for precondition in self.preconditions: if not precondition(inst): raise AssertionError( "Failed precondition: ", precondition.__doc__.strip()) return func(*args, **kw) finally: inst._lock_release() class locked(object): def __init__(self, *preconditions): self.preconditions = preconditions def __get__(self, inst, class_): # We didn't get any preconditions, so we have a single "precondition", # which is actually the function to call. func, = self.preconditions return Locked(func, inst, class_) def __call__(self, func): return Locked(func, preconditions=self.preconditions) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/ZODB/utils.txt000066400000000000000000000130131230730566700223340ustar00rootroot00000000000000ZODB Utilits Module =================== The ZODB.utils module provides a number of helpful, somewhat random :), utility functions. >>> import ZODB.utils This document documents a few of them. Over time, it may document more. 64-bit integers and strings --------------------------------- ZODB uses 64-bit transaction ids that are typically represented as strings, but are sometimes manipulated as integers. Object ids are strings too and it is common to ise 64-bit strings that are just packed integers. Functions p64 and u64 pack and unpack integers as strings: >>> ZODB.utils.p64(250347764455111456) '\x03yi\xf7"\xa8\xfb ' >>> print ZODB.utils.u64('\x03yi\xf7"\xa8\xfb ') 250347764455111456 The contant z64 has zero packed as a 64-bit string: >>> ZODB.utils.z64 '\x00\x00\x00\x00\x00\x00\x00\x00' Transaction id generation ------------------------- Storages assign transaction ids as transactions are committed. These are based on UTC time, but must be strictly increasing. The newTid function akes this pretty easy. To see this work (in a predictable way), we'll first hack time.time: >>> import time >>> old_time = time.time >>> time.time = lambda : 1224825068.12 Now, if we ask for a new time stamp, we'll get one based on our faux time: >>> tid = ZODB.utils.newTid(None) >>> tid '\x03yi\xf7"\xa54\x88' newTid requires an old tid as an argument. The old tid may be None, if we don't have a previous transaction id. This time is based on the current time, which we can see by converting it to a time stamp. >>> import ZODB.TimeStamp >>> print ZODB.TimeStamp.TimeStamp(tid) 2008-10-24 05:11:08.120000 To assure that we get a new tid that is later than the old, we can pass an existing tid. Let's pass the tid we just got. >>> tid2 = ZODB.utils.newTid(tid) >>> long(ZODB.utils.u64(tid)), long(ZODB.utils.u64(tid2)) (250347764454864008L, 250347764454864009L) Here, since we called it at the same time, we got a time stamp that was only slightly larger than the previos one. 
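Because transaction ids are packed big-endian, comparing the 8-byte strings
directly gives the same ordering as comparing the unpacked integers.  A quick
illustrative check:

>>> tid2 > tid
True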
Of course, at a later time, the time stamp we get will be based on the time: >>> time.time = lambda : 1224825069.12 >>> tid = ZODB.utils.newTid(tid2) >>> print ZODB.TimeStamp.TimeStamp(tid) 2008-10-24 05:11:09.120000 >>> time.time = old_time Locking support --------------- Storages are required to be thread safe. The locking descriptor helps automate that. It arranges for a lock to be acquired when a function is called and released when a function exits. To demonstrate this, we'll create a "lock" type that simply prints when it is called: >>> class Lock: ... def acquire(self): ... print 'acquire' ... def release(self): ... print 'release' Now we'll demonstrate the descriptor: >>> class C: ... _lock = Lock() ... _lock_acquire = _lock.acquire ... _lock_release = _lock.release ... ... @ZODB.utils.locked ... def meth(self, *args, **kw): ... print 'meth', args, kw The descriptor expects the instance it wraps to have a '_lock attribute. >>> C().meth(1, 2, a=3) acquire meth (1, 2) {'a': 3} release .. Edge cases We can get the method from the class: >>> C.meth # doctest: +ELLIPSIS >>> C.meth(C()) acquire meth () {} release >>> class C2: ... _lock = Lock() ... _lock_acquire = _lock.acquire ... _lock_release = _lock.release >>> C.meth(C2()) # doctest: +NORMALIZE_WHITESPACE Traceback (most recent call last): ... TypeError: unbound method meth() must be called with C instance as first argument (got C2 instance instead) Preconditions ------------- Often, we want to supply method preconditions. The locking descriptor supports optional method preconditions [1]_. >>> class C: ... def __init__(self): ... _lock = Lock() ... self._lock_acquire = _lock.acquire ... self._lock_release = _lock.release ... self._opened = True ... self._transaction = None ... ... def opened(self): ... """The object is open ... """ ... print 'checking if open' ... return self._opened ... ... def not_in_transaction(self): ... """The object is not in a transaction ... """ ... print 'checking if in a transaction' ... return self._transaction is None ... ... @ZODB.utils.locked(opened, not_in_transaction) ... def meth(self, *args, **kw): ... print 'meth', args, kw >>> c = C() >>> c.meth(1, 2, a=3) acquire checking if open checking if in a transaction meth (1, 2) {'a': 3} release >>> c._transaction = 1 >>> c.meth(1, 2, a=3) # doctest: +NORMALIZE_WHITESPACE Traceback (most recent call last): ... AssertionError: ('Failed precondition: ', 'The object is not in a transaction') >>> c._opened = False >>> c.meth(1, 2, a=3) # doctest: +NORMALIZE_WHITESPACE Traceback (most recent call last): ... AssertionError: ('Failed precondition: ', 'The object is open') .. [1] Arguably, preconditions should be handled via separate descriptors, but for ZODB storages, almost all methods need to be locked. Combining preconditions with locking provides both efficiency and concise expressions. A more general-purpose facility would almost certainly provide separate descriptors for preconditions. ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/000077500000000000000000000000001230730566700220775ustar00rootroot00000000000000ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/README.txt000066400000000000000000000013421230730566700235750ustar00rootroot00000000000000=================== Persistence support =================== (This document is under construction. More basic documentation will eventually appear here.) 
Overriding `__getattr__`, `__getattribute__`, `__setattr__`, and `__delattr__` ------------------------------------------------------------------------------ Subclasses can override the attribute-management methods. For the `__getattr__` method, the behavior is like that for regular Python classes and for earlier versions of ZODB 3. For `__getattribute__`, __setattr__`, and `__delattr__`, it is necessary to call certain methods defined by `persistent.Persistent`. Detailed examples and documentation is provided in the test module, `persistent.tests.test_overriding_attrs`. ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/TimeStamp.c000066400000000000000000000244001230730566700241460ustar00rootroot00000000000000/***************************************************************************** Copyright (c) 2001, 2004 Zope Foundation and Contributors. All Rights Reserved. This software is subject to the provisions of the Zope Public License, Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE ****************************************************************************/ #include "Python.h" #include PyObject *TimeStamp_FromDate(int, int, int, int, int, double); PyObject *TimeStamp_FromString(const char *); static char TimeStampModule_doc[] = "A 64-bit TimeStamp used as a ZODB serial number.\n" "\n" "$Id$\n"; typedef struct { PyObject_HEAD unsigned char data[8]; } TimeStamp; /* The first dimension of the arrays below is non-leapyear / leapyear */ static char month_len[2][12]={ {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}, {31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31} }; static short joff[2][12] = { {0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334}, {0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335} }; static double gmoff=0; /* TODO: May be better (faster) to store in a file static. 
*/ #define SCONV ((double)60) / ((double)(1<<16)) / ((double)(1<<16)) static int leap(int year) { return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0); } static int days_in_month(int year, int month) { return month_len[leap(year)][month]; } static double TimeStamp_yad(int y) { double d, s; y -= 1900; d = (y - 1) * 365; if (y > 0) { s = 1.0; y -= 1; } else { s = -1.0; y = -y; } return d + s * (y / 4 - y / 100 + (y + 300) / 400); } static double TimeStamp_abst(int y, int mo, int d, int m, int s) { return (TimeStamp_yad(y) + joff[leap(y)][mo] + d) * 86400 + m * 60 + s; } static int TimeStamp_init_gmoff(void) { struct tm *t; time_t z=0; t = gmtime(&z); if (t == NULL) { PyErr_SetString(PyExc_SystemError, "gmtime failed"); return -1; } gmoff = TimeStamp_abst(t->tm_year+1900, t->tm_mon, t->tm_mday - 1, t->tm_hour * 60 + t->tm_min, t->tm_sec); return 0; } static void TimeStamp_dealloc(TimeStamp *ts) { PyObject_Del(ts); } static int TimeStamp_compare(TimeStamp *v, TimeStamp *w) { int cmp = memcmp(v->data, w->data, 8); if (cmp < 0) return -1; if (cmp > 0) return 1; return 0; } static long TimeStamp_hash(TimeStamp *self) { register unsigned char *p = (unsigned char *)self->data; register int len = 8; register long x = *p << 7; while (--len >= 0) x = (1000003*x) ^ *p++; x ^= 8; if (x == -1) x = -2; return x; } typedef struct { /* TODO: reverse-engineer what's in these things and comment them */ int y; int m; int d; int mi; } TimeStampParts; static void TimeStamp_unpack(TimeStamp *self, TimeStampParts *p) { unsigned long v; v = (self->data[0] * 16777216 + self->data[1] * 65536 + self->data[2] * 256 + self->data[3]); p->y = v / 535680 + 1900; p->m = (v % 535680) / 44640 + 1; p->d = (v % 44640) / 1440 + 1; p->mi = v % 1440; } static double TimeStamp_sec(TimeStamp *self) { unsigned int v; v = (self->data[4] * 16777216 + self->data[5] * 65536 + self->data[6] * 256 + self->data[7]); return SCONV * v; } static PyObject * TimeStamp_year(TimeStamp *self) { TimeStampParts p; TimeStamp_unpack(self, &p); return PyInt_FromLong(p.y); } static PyObject * TimeStamp_month(TimeStamp *self) { TimeStampParts p; TimeStamp_unpack(self, &p); return PyInt_FromLong(p.m); } static PyObject * TimeStamp_day(TimeStamp *self) { TimeStampParts p; TimeStamp_unpack(self, &p); return PyInt_FromLong(p.d); } static PyObject * TimeStamp_hour(TimeStamp *self) { TimeStampParts p; TimeStamp_unpack(self, &p); return PyInt_FromLong(p.mi / 60); } static PyObject * TimeStamp_minute(TimeStamp *self) { TimeStampParts p; TimeStamp_unpack(self, &p); return PyInt_FromLong(p.mi % 60); } static PyObject * TimeStamp_second(TimeStamp *self) { return PyFloat_FromDouble(TimeStamp_sec(self)); } static PyObject * TimeStamp_timeTime(TimeStamp *self) { TimeStampParts p; TimeStamp_unpack(self, &p); return PyFloat_FromDouble(TimeStamp_abst(p.y, p.m - 1, p.d - 1, p.mi, 0) + TimeStamp_sec(self) - gmoff); } static PyObject * TimeStamp_raw(TimeStamp *self) { return PyString_FromStringAndSize((const char*)self->data, 8); } static PyObject * TimeStamp_str(TimeStamp *self) { char buf[128]; TimeStampParts p; int len; TimeStamp_unpack(self, &p); len =sprintf(buf, "%4.4d-%2.2d-%2.2d %2.2d:%2.2d:%09.6f", p.y, p.m, p.d, p.mi / 60, p.mi % 60, TimeStamp_sec(self)); return PyString_FromStringAndSize(buf, len); } static PyObject * TimeStamp_laterThan(TimeStamp *self, PyObject *obj) { TimeStamp *o = NULL; TimeStampParts p; unsigned char new[8]; int i; if (obj->ob_type != self->ob_type) { PyErr_SetString(PyExc_TypeError, "expected TimeStamp object"); return NULL; } o = 
(TimeStamp *)obj; if (memcmp(self->data, o->data, 8) > 0) { Py_INCREF(self); return (PyObject *)self; } memcpy(new, o->data, 8); for (i = 7; i > 3; i--) { if (new[i] == 255) new[i] = 0; else { new[i]++; return TimeStamp_FromString((const char*)new); } } /* All but the first two bytes are the same. Need to increment the year, month, and day explicitly. */ TimeStamp_unpack(o, &p); if (p.mi >= 1439) { p.mi = 0; if (p.d == month_len[leap(p.y)][p.m - 1]) { p.d = 1; if (p.m == 12) { p.m = 1; p.y++; } else p.m++; } else p.d++; } else p.mi++; return TimeStamp_FromDate(p.y, p.m, p.d, p.mi / 60, p.mi % 60, 0); } static struct PyMethodDef TimeStamp_methods[] = { {"year", (PyCFunction)TimeStamp_year, METH_NOARGS}, {"minute", (PyCFunction)TimeStamp_minute, METH_NOARGS}, {"month", (PyCFunction)TimeStamp_month, METH_NOARGS}, {"day", (PyCFunction)TimeStamp_day, METH_NOARGS}, {"hour", (PyCFunction)TimeStamp_hour, METH_NOARGS}, {"second", (PyCFunction)TimeStamp_second, METH_NOARGS}, {"timeTime",(PyCFunction)TimeStamp_timeTime, METH_NOARGS}, {"laterThan", (PyCFunction)TimeStamp_laterThan, METH_O}, {"raw", (PyCFunction)TimeStamp_raw, METH_NOARGS}, {NULL, NULL}, }; static PyTypeObject TimeStamp_type = { PyObject_HEAD_INIT(NULL) 0, "persistent.TimeStamp", sizeof(TimeStamp), 0, (destructor)TimeStamp_dealloc, /* tp_dealloc */ 0, /* tp_print */ 0, /* tp_getattr */ 0, /* tp_setattr */ (cmpfunc)TimeStamp_compare, /* tp_compare */ (reprfunc)TimeStamp_raw, /* tp_repr */ 0, /* tp_as_number */ 0, /* tp_as_sequence */ 0, /* tp_as_mapping */ (hashfunc)TimeStamp_hash, /* tp_hash */ 0, /* tp_call */ (reprfunc)TimeStamp_str, /* tp_str */ 0, /* tp_getattro */ 0, /* tp_setattro */ 0, /* tp_as_buffer */ Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /* tp_flags */ 0, /* tp_doc */ 0, /* tp_traverse */ 0, /* tp_clear */ 0, /* tp_richcompare */ 0, /* tp_weaklistoffset */ 0, /* tp_iter */ 0, /* tp_iternext */ TimeStamp_methods, /* tp_methods */ 0, /* tp_members */ 0, /* tp_getset */ 0, /* tp_base */ 0, /* tp_dict */ 0, /* tp_descr_get */ 0, /* tp_descr_set */ }; PyObject * TimeStamp_FromString(const char *buf) { /* buf must be exactly 8 characters */ TimeStamp *ts = (TimeStamp *)PyObject_New(TimeStamp, &TimeStamp_type); memcpy(ts->data, buf, 8); return (PyObject *)ts; } #define CHECK_RANGE(VAR, LO, HI) if ((VAR) < (LO) || (VAR) > (HI)) { \ return PyErr_Format(PyExc_ValueError, \ # VAR " must be between %d and %d: %d", \ (LO), (HI), (VAR)); \ } PyObject * TimeStamp_FromDate(int year, int month, int day, int hour, int min, double sec) { TimeStamp *ts = NULL; int d; unsigned int v; if (year < 1900) return PyErr_Format(PyExc_ValueError, "year must be greater than 1900: %d", year); CHECK_RANGE(month, 1, 12); d = days_in_month(year, month - 1); if (day < 1 || day > d) return PyErr_Format(PyExc_ValueError, "day must be between 1 and %d: %d", d, day); CHECK_RANGE(hour, 0, 23); CHECK_RANGE(min, 0, 59); /* Seconds are allowed to be anything, so chill If we did want to be pickly, 60 would be a better choice. 
if (sec < 0 || sec > 59) return PyErr_Format(PyExc_ValueError, "second must be between 0 and 59: %f", sec); */ ts = (TimeStamp *)PyObject_New(TimeStamp, &TimeStamp_type); v = (((year - 1900) * 12 + month - 1) * 31 + day - 1); v = (v * 24 + hour) * 60 + min; ts->data[0] = v / 16777216; ts->data[1] = (v % 16777216) / 65536; ts->data[2] = (v % 65536) / 256; ts->data[3] = v % 256; sec /= SCONV; v = (unsigned int)sec; ts->data[4] = v / 16777216; ts->data[5] = (v % 16777216) / 65536; ts->data[6] = (v % 65536) / 256; ts->data[7] = v % 256; return (PyObject *)ts; } PyObject * TimeStamp_TimeStamp(PyObject *obj, PyObject *args) { char *buf = NULL; int len = 0, y, mo, d, h = 0, m = 0; double sec = 0; if (PyArg_ParseTuple(args, "s#:TimeStamp", &buf, &len)) { if (len != 8) { PyErr_SetString(PyExc_ValueError, "8-character string expected"); return NULL; } return TimeStamp_FromString(buf); } PyErr_Clear(); if (!PyArg_ParseTuple(args, "iii|iid", &y, &mo, &d, &h, &m, &sec)) return NULL; return TimeStamp_FromDate(y, mo, d, h, m, sec); } static PyMethodDef TimeStampModule_functions[] = { {"TimeStamp", TimeStamp_TimeStamp, METH_VARARGS}, {NULL, NULL}, }; void initTimeStamp(void) { PyObject *m; if (TimeStamp_init_gmoff() < 0) return; m = Py_InitModule4("TimeStamp", TimeStampModule_functions, TimeStampModule_doc, NULL, PYTHON_API_VERSION); if (m == NULL) return; TimeStamp_type.ob_type = &PyType_Type; TimeStamp_type.tp_getattro = PyObject_GenericGetAttr; } ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/__init__.py000066400000000000000000000022151230730566700242100ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Provide access to Persistent and PersistentMapping. $Id$ """ from cPersistence import Persistent, GHOST, UPTODATE, CHANGED, STICKY from cPickleCache import PickleCache from cPersistence import simple_new import copy_reg copy_reg.constructor(simple_new) # Make an interface declaration for Persistent, # if zope.interface is available. try: from zope.interface import classImplements except ImportError: pass else: from persistent.interfaces import IPersistent classImplements(Persistent, IPersistent) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/cPersistence.c000066400000000000000000001031631230730566700246760ustar00rootroot00000000000000/***************************************************************************** Copyright (c) 2001, 2002 Zope Foundation and Contributors. All Rights Reserved. This software is subject to the provisions of the Zope Public License, Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE ****************************************************************************/ static char cPersistence_doc_string[] = "Defines Persistent mixin class for persistent objects.\n" "\n" "$Id$\n"; #include "cPersistence.h" #include "structmember.h" struct ccobject_head_struct { CACHE_HEAD }; /* These two objects are initialized when the module is loaded */ static PyObject *TimeStamp, *py_simple_new; /* Strings initialized by init_strings() below. */ static PyObject *py_keys, *py_setstate, *py___dict__, *py_timeTime; static PyObject *py__p_changed, *py__p_deactivate; static PyObject *py___getattr__, *py___setattr__, *py___delattr__; static PyObject *py___slotnames__, *copy_reg_slotnames, *__newobj__; static PyObject *py___getnewargs__, *py___getstate__; static int init_strings(void) { #define INIT_STRING(S) \ if (!(py_ ## S = PyString_InternFromString(#S))) \ return -1; INIT_STRING(keys); INIT_STRING(setstate); INIT_STRING(timeTime); INIT_STRING(__dict__); INIT_STRING(_p_changed); INIT_STRING(_p_deactivate); INIT_STRING(__getattr__); INIT_STRING(__setattr__); INIT_STRING(__delattr__); INIT_STRING(__slotnames__); INIT_STRING(__getnewargs__); INIT_STRING(__getstate__); #undef INIT_STRING return 0; } #ifdef Py_DEBUG static void fatal_1350(cPersistentObject *self, const char *caller, const char *detail) { char buf[1000]; PyOS_snprintf(buf, sizeof(buf), "cPersistence.c %s(): object at %p with type %.200s\n" "%s.\n" "The only known cause is multiple threads trying to ghost and\n" "unghost the object simultaneously.\n" "That's not legal, but ZODB can't stop it.\n" "See Collector #1350.\n", caller, self, self->ob_type->tp_name, detail); Py_FatalError(buf); } #endif static void ghostify(cPersistentObject*); /* Load the state of the object, unghostifying it. Upon success, return 1. * If an error occurred, re-ghostify the object and return -1. */ static int unghostify(cPersistentObject *self) { if (self->state < 0 && self->jar) { PyObject *r; /* Is it ever possible to not have a cache? */ if (self->cache) { /* Create a node in the ring for this unghostified object. */ self->cache->non_ghost_count++; self->cache->total_estimated_size += _estimated_size_in_bytes(self->estimated_size); ring_add(&self->cache->ring_home, &self->ring); Py_INCREF(self); } /* set state to CHANGED while setstate() call is in progress to prevent a recursive call to _PyPersist_Load(). */ self->state = cPersistent_CHANGED_STATE; /* Call the object's __setstate__() */ r = PyObject_CallMethod(self->jar, "setstate", "O", (PyObject *)self); if (r == NULL) { ghostify(self); return -1; } self->state = cPersistent_UPTODATE_STATE; Py_DECREF(r); if (self->cache && self->ring.r_next == NULL) { #ifdef Py_DEBUG fatal_1350(self, "unghostify", "is not in the cache despite that we just " "unghostified it"); #else PyErr_Format(PyExc_SystemError, "object at %p with type " "%.200s not in the cache despite that we just " "unghostified it", self, self->ob_type->tp_name); return -1; #endif } } return 1; } /****************************************************************************/ static PyTypeObject Pertype; static void accessed(cPersistentObject *self) { /* Do nothing unless the object is in a cache and not a ghost. 
*/ if (self->cache && self->state >= 0 && self->ring.r_next) ring_move_to_head(&self->cache->ring_home, &self->ring); } static void ghostify(cPersistentObject *self) { PyObject **dictptr; /* are we already a ghost? */ if (self->state == cPersistent_GHOST_STATE) return; /* Is it ever possible to not have a cache? */ if (self->cache == NULL) { self->state = cPersistent_GHOST_STATE; return; } if (self->ring.r_next == NULL) { /* There's no way to raise an error in this routine. */ #ifdef Py_DEBUG fatal_1350(self, "ghostify", "claims to be in a cache but isn't"); #else return; #endif } /* If we're ghostifying an object, we better have some non-ghosts. */ assert(self->cache->non_ghost_count > 0); self->cache->non_ghost_count--; self->cache->total_estimated_size -= _estimated_size_in_bytes(self->estimated_size); ring_del(&self->ring); self->state = cPersistent_GHOST_STATE; dictptr = _PyObject_GetDictPtr((PyObject *)self); if (dictptr && *dictptr) { Py_DECREF(*dictptr); *dictptr = NULL; } /* We remove the reference to the just ghosted object that the ring * holds. Note that the dictionary of oids->objects has an uncounted * reference, so if the ring's reference was the only one, this frees * the ghost object. Note further that the object's dealloc knows to * inform the dictionary that it is going away. */ Py_DECREF(self); } static int changed(cPersistentObject *self) { if ((self->state == cPersistent_UPTODATE_STATE || self->state == cPersistent_STICKY_STATE) && self->jar) { PyObject *meth, *arg, *result; static PyObject *s_register; if (s_register == NULL) s_register = PyString_InternFromString("register"); meth = PyObject_GetAttr((PyObject *)self->jar, s_register); if (meth == NULL) return -1; arg = PyTuple_New(1); if (arg == NULL) { Py_DECREF(meth); return -1; } Py_INCREF(self); PyTuple_SET_ITEM(arg, 0, (PyObject *)self); result = PyEval_CallObject(meth, arg); Py_DECREF(arg); Py_DECREF(meth); if (result == NULL) return -1; Py_DECREF(result); self->state = cPersistent_CHANGED_STATE; } return 0; } static int readCurrent(cPersistentObject *self) { if ((self->state == cPersistent_UPTODATE_STATE || self->state == cPersistent_STICKY_STATE) && self->jar && self->oid) { static PyObject *s_readCurrent=NULL; PyObject *r; if (s_readCurrent == NULL) s_readCurrent = PyString_InternFromString("readCurrent"); r = PyObject_CallMethodObjArgs(self->jar, s_readCurrent, self, NULL); if (r == NULL) return -1; Py_DECREF(r); } return 0; } static PyObject * Per__p_deactivate(cPersistentObject *self) { if (self->state == cPersistent_UPTODATE_STATE && self->jar) { PyObject **dictptr = _PyObject_GetDictPtr((PyObject *)self); if (dictptr && *dictptr) { Py_DECREF(*dictptr); *dictptr = NULL; } /* Note that we need to set to ghost state unless we are called directly. Methods that override this need to do the same! 
*/ ghostify(self); } Py_INCREF(Py_None); return Py_None; } static PyObject * Per__p_activate(cPersistentObject *self) { if (unghostify(self) < 0) return NULL; Py_INCREF(Py_None); return Py_None; } static int Per_set_changed(cPersistentObject *self, PyObject *v); static PyObject * Per__p_invalidate(cPersistentObject *self) { signed char old_state = self->state; if (old_state != cPersistent_GHOST_STATE) { if (Per_set_changed(self, NULL) < 0) return NULL; ghostify(self); } Py_INCREF(Py_None); return Py_None; } static PyObject * pickle_slotnames(PyTypeObject *cls) { PyObject *slotnames; slotnames = PyDict_GetItem(cls->tp_dict, py___slotnames__); if (slotnames) { int n = PyObject_Not(slotnames); if (n < 0) return NULL; if (n) slotnames = Py_None; Py_INCREF(slotnames); return slotnames; } slotnames = PyObject_CallFunctionObjArgs(copy_reg_slotnames, (PyObject*)cls, NULL); if (slotnames && !(slotnames == Py_None || PyList_Check(slotnames))) { PyErr_SetString(PyExc_TypeError, "copy_reg._slotnames didn't return a list or None"); Py_DECREF(slotnames); return NULL; } return slotnames; } static PyObject * pickle_copy_dict(PyObject *state) { PyObject *copy, *key, *value; char *ckey; Py_ssize_t pos = 0; copy = PyDict_New(); if (!copy) return NULL; if (!state) return copy; while (PyDict_Next(state, &pos, &key, &value)) { if (key && PyString_Check(key)) { ckey = PyString_AS_STRING(key); if (*ckey == '_' && (ckey[1] == 'v' || ckey[1] == 'p') && ckey[2] == '_') /* skip volatile and persistent */ continue; } if (PyObject_SetItem(copy, key, value) < 0) goto err; } return copy; err: Py_DECREF(copy); return NULL; } static char pickle___getstate__doc[] = "Get the object serialization state\n" "\n" "If the object has no assigned slots and has no instance dictionary, then \n" "None is returned.\n" "\n" "If the object has no assigned slots and has an instance dictionary, then \n" "the a copy of the instance dictionary is returned. The copy has any items \n" "with names starting with '_v_' or '_p_' ommitted.\n" "\n" "If the object has assigned slots, then a two-element tuple is returned. \n" "The first element is either None or a copy of the instance dictionary, \n" "as described above. The second element is a dictionary with items \n" "for each of the assigned slots.\n" ; static PyObject * pickle___getstate__(PyObject *self) { PyObject *slotnames=NULL, *slots=NULL, *state=NULL; PyObject **dictp; int n=0; slotnames = pickle_slotnames(self->ob_type); if (!slotnames) return NULL; dictp = _PyObject_GetDictPtr(self); if (dictp) state = pickle_copy_dict(*dictp); else { state = Py_None; Py_INCREF(state); } if (slotnames != Py_None) { int i; slots = PyDict_New(); if (!slots) goto end; for (i = 0; i < PyList_GET_SIZE(slotnames); i++) { PyObject *name, *value; char *cname; name = PyList_GET_ITEM(slotnames, i); if (PyString_Check(name)) { cname = PyString_AS_STRING(name); if (*cname == '_' && (cname[1] == 'v' || cname[1] == 'p') && cname[2] == '_') /* skip volatile and persistent */ continue; } /* Unclear: Will this go through our getattr hook? 
*/ value = PyObject_GetAttr(self, name); if (value == NULL) PyErr_Clear(); else { int err = PyDict_SetItem(slots, name, value); Py_DECREF(value); if (err < 0) goto end; n++; } } } if (n) state = Py_BuildValue("(NO)", state, slots); end: Py_XDECREF(slotnames); Py_XDECREF(slots); return state; } static int pickle_setattrs_from_dict(PyObject *self, PyObject *dict) { PyObject *key, *value; Py_ssize_t pos = 0; if (!PyDict_Check(dict)) { PyErr_SetString(PyExc_TypeError, "Expected dictionary"); return -1; } while (PyDict_Next(dict, &pos, &key, &value)) { if (PyObject_SetAttr(self, key, value) < 0) return -1; } return 0; } static char pickle___setstate__doc[] = "Set the object serialization state\n\n" "The state should be in one of 3 forms:\n\n" "- None\n\n" " Ignored\n\n" "- A dictionary\n\n" " In this case, the object's instance dictionary will be cleared and \n" " updated with the new state.\n\n" "- A two-tuple with a string as the first element. \n\n" " In this case, the method named by the string in the first element will\n" " be called with the second element.\n\n" " This form supports migration of data formats.\n\n" "- A two-tuple with None or a Dictionary as the first element and\n" " with a dictionary as the second element.\n\n" " If the first element is not None, then the object's instance dictionary \n" " will be cleared and updated with the value.\n\n" " The items in the second element will be assigned as attributes.\n" ; static PyObject * pickle___setstate__(PyObject *self, PyObject *state) { PyObject *slots=NULL; if (PyTuple_Check(state)) { if (!PyArg_ParseTuple(state, "OO:__setstate__", &state, &slots)) return NULL; } if (state != Py_None) { PyObject **dict; dict = _PyObject_GetDictPtr(self); if (!dict) { PyErr_SetString(PyExc_TypeError, "this object has no instance dictionary"); return NULL; } if (!*dict) { *dict = PyDict_New(); if (!*dict) return NULL; } PyDict_Clear(*dict); if (PyDict_Update(*dict, state) < 0) return NULL; } if (slots && pickle_setattrs_from_dict(self, slots) < 0) return NULL; Py_INCREF(Py_None); return Py_None; } static char pickle___reduce__doc[] = "Reduce an object to contituent parts for serialization\n" ; static PyObject * pickle___reduce__(PyObject *self) { PyObject *args=NULL, *bargs=NULL, *state=NULL, *getnewargs=NULL; int l, i; getnewargs = PyObject_GetAttr(self, py___getnewargs__); if (getnewargs) { bargs = PyObject_CallFunctionObjArgs(getnewargs, NULL); Py_DECREF(getnewargs); if (!bargs) return NULL; l = PyTuple_Size(bargs); if (l < 0) goto end; } else { PyErr_Clear(); l = 0; } args = PyTuple_New(l+1); if (args == NULL) goto end; Py_INCREF(self->ob_type); PyTuple_SET_ITEM(args, 0, (PyObject*)(self->ob_type)); for (i = 0; i < l; i++) { Py_INCREF(PyTuple_GET_ITEM(bargs, i)); PyTuple_SET_ITEM(args, i+1, PyTuple_GET_ITEM(bargs, i)); } state = PyObject_CallMethodObjArgs(self, py___getstate__, NULL); if (!state) goto end; state = Py_BuildValue("(OON)", __newobj__, args, state); end: Py_XDECREF(bargs); Py_XDECREF(args); return state; } /* Return the object's state, a dict or None. If the object has no dict, it's state is None. Otherwise, return a dict containing all the attributes that don't start with "_v_". The caller should not modify this dict, as it may be a reference to the object's __dict__. */ static PyObject * Per__getstate__(cPersistentObject *self) { /* TODO: Should it be an error to call __getstate__() on a ghost? */ if (unghostify(self) < 0) return NULL; /* TODO: should we increment stickyness? Tim doesn't understand that question. 
S*/ return pickle___getstate__((PyObject*)self); } /* The Persistent base type provides a traverse function, but not a clear function. An instance of a Persistent subclass will have its dict cleared through subtype_clear(). There is always a cycle between a persistent object and its cache. When the cycle becomes unreachable, the clear function for the cache will break the cycle. Thus, the persistent object need not have a clear function. It would be complex to write a clear function for the objects, if we needed one, because of the reference count tricks done by the cache. */ static void Per_dealloc(cPersistentObject *self) { if (self->state >= 0) { /* If the cache has been cleared, then a non-ghost object isn't in the ring any longer. */ if (self->ring.r_next != NULL) { /* if we're ghostifying an object, we better have some non-ghosts */ assert(self->cache->non_ghost_count > 0); self->cache->non_ghost_count--; self->cache->total_estimated_size -= _estimated_size_in_bytes(self->estimated_size); ring_del(&self->ring); } } if (self->cache) cPersistenceCAPI->percachedel(self->cache, self->oid); Py_XDECREF(self->cache); Py_XDECREF(self->jar); Py_XDECREF(self->oid); self->ob_type->tp_free(self); } static int Per_traverse(cPersistentObject *self, visitproc visit, void *arg) { int err; #define VISIT(SLOT) \ if (SLOT) { \ err = visit((PyObject *)(SLOT), arg); \ if (err) \ return err; \ } VISIT(self->jar); VISIT(self->oid); VISIT(self->cache); #undef VISIT return 0; } /* convert_name() returns a new reference to a string name or sets an exception and returns NULL. */ static PyObject * convert_name(PyObject *name) { #ifdef Py_USING_UNICODE /* The Unicode to string conversion is done here because the existing tp_setattro slots expect a string object as name and we wouldn't want to break those. */ if (PyUnicode_Check(name)) { name = PyUnicode_AsEncodedString(name, NULL, NULL); } else #endif if (!PyString_Check(name)) { PyErr_SetString(PyExc_TypeError, "attribute name must be a string"); return NULL; } else Py_INCREF(name); return name; } /* Returns true if the object requires unghostification. There are several special attributes that we allow access to without requiring that the object be unghostified: __class__ __del__ __dict__ __of__ __setstate__ */ static int unghost_getattr(const char *s) { if (*s++ != '_') return 1; if (*s == 'p') { s++; if (*s == '_') return 0; /* _p_ */ else return 1; } else if (*s == '_') { s++; switch (*s) { case 'c': return strcmp(s, "class__"); case 'd': s++; if (!strcmp(s, "el__")) return 0; /* __del__ */ if (!strcmp(s, "ict__")) return 0; /* __dict__ */ return 1; case 'o': return strcmp(s, "of__"); case 's': return strcmp(s, "setstate__"); default: return 1; } } return 1; } static PyObject* Per_getattro(cPersistentObject *self, PyObject *name) { PyObject *result = NULL; /* guilty until proved innocent */ char *s; name = convert_name(name); if (!name) goto Done; s = PyString_AS_STRING(name); if (unghost_getattr(s)) { if (unghostify(self) < 0) goto Done; accessed(self); } result = PyObject_GenericGetAttr((PyObject *)self, name); Done: Py_XDECREF(name); return result; } /* Exposed as _p_getattr method. 
Test whether base getattr should be used */ static PyObject * Per__p_getattr(cPersistentObject *self, PyObject *name) { PyObject *result = NULL; /* guilty until proved innocent */ char *s; name = convert_name(name); if (!name) goto Done; s = PyString_AS_STRING(name); if (*s != '_' || unghost_getattr(s)) { if (unghostify(self) < 0) goto Done; accessed(self); result = Py_False; } else result = Py_True; Py_INCREF(result); Done: Py_XDECREF(name); return result; } /* TODO: we should probably not allow assignment of __class__ and __dict__. */ static int Per_setattro(cPersistentObject *self, PyObject *name, PyObject *v) { int result = -1; /* guilty until proved innocent */ char *s; name = convert_name(name); if (!name) goto Done; s = PyString_AS_STRING(name); if (strncmp(s, "_p_", 3) != 0) { if (unghostify(self) < 0) goto Done; accessed(self); if (strncmp(s, "_v_", 3) != 0 && self->state != cPersistent_CHANGED_STATE) { if (changed(self) < 0) goto Done; } } result = PyObject_GenericSetAttr((PyObject *)self, name, v); Done: Py_XDECREF(name); return result; } static int Per_p_set_or_delattro(cPersistentObject *self, PyObject *name, PyObject *v) { int result = -1; /* guilty until proved innocent */ char *s; name = convert_name(name); if (!name) goto Done; s = PyString_AS_STRING(name); if (strncmp(s, "_p_", 3)) { if (unghostify(self) < 0) goto Done; accessed(self); result = 0; } else { if (PyObject_GenericSetAttr((PyObject *)self, name, v) < 0) goto Done; result = 1; } Done: Py_XDECREF(name); return result; } static PyObject * Per__p_setattr(cPersistentObject *self, PyObject *args) { PyObject *name, *v, *result; int r; if (!PyArg_ParseTuple(args, "OO:_p_setattr", &name, &v)) return NULL; r = Per_p_set_or_delattro(self, name, v); if (r < 0) return NULL; result = r ? Py_True : Py_False; Py_INCREF(result); return result; } static PyObject * Per__p_delattr(cPersistentObject *self, PyObject *name) { int r; PyObject *result; r = Per_p_set_or_delattro(self, name, NULL); if (r < 0) return NULL; result = r ? Py_True : Py_False; Py_INCREF(result); return result; } static PyObject * Per_get_changed(cPersistentObject *self) { if (self->state < 0) { Py_INCREF(Py_None); return Py_None; } return PyBool_FromLong(self->state == cPersistent_CHANGED_STATE); } static int Per_set_changed(cPersistentObject *self, PyObject *v) { int deactivate = 0; int true; if (!v) { /* delattr is used to invalidate an object even if it has changed. */ if (self->state != cPersistent_GHOST_STATE) self->state = cPersistent_UPTODATE_STATE; deactivate = 1; } else if (v == Py_None) deactivate = 1; if (deactivate) { PyObject *res, *meth; meth = PyObject_GetAttr((PyObject *)self, py__p_deactivate); if (meth == NULL) return -1; res = PyObject_CallObject(meth, NULL); if (res) Py_DECREF(res); else { /* an error occured in _p_deactivate(). It's not clear what we should do here. The code is obviously ignoring the exception, but it shouldn't return 0 for a getattr and set an exception. The simplest change is to clear the exception, but that simply masks the error. This prints an error to stderr just like exceptions in __del__(). It would probably be better to log it but that would be painful from C. */ PyErr_WriteUnraisable(meth); } Py_DECREF(meth); return 0; } /* !deactivate. If passed a true argument, mark self as changed (starting * with ZODB 3.6, that includes activating the object if it's a ghost). * If passed a false argument, and the object isn't a ghost, set the * state as up-to-date. 
*/ true = PyObject_IsTrue(v); if (true == -1) return -1; if (true) { if (self->state < 0) { if (unghostify(self) < 0) return -1; } return changed(self); } /* We were passed a false, non-None argument. If we're not a ghost, * mark self as up-to-date. */ if (self->state >= 0) self->state = cPersistent_UPTODATE_STATE; return 0; } static PyObject * Per_get_oid(cPersistentObject *self) { PyObject *oid = self->oid ? self->oid : Py_None; Py_INCREF(oid); return oid; } static int Per_set_oid(cPersistentObject *self, PyObject *v) { if (self->cache) { int result; if (v == NULL) { PyErr_SetString(PyExc_ValueError, "can't delete _p_oid of cached object"); return -1; } if (PyObject_Cmp(self->oid, v, &result) < 0) return -1; if (result) { PyErr_SetString(PyExc_ValueError, "can not change _p_oid of cached object"); return -1; } } Py_XDECREF(self->oid); Py_XINCREF(v); self->oid = v; return 0; } static PyObject * Per_get_jar(cPersistentObject *self) { PyObject *jar = self->jar ? self->jar : Py_None; Py_INCREF(jar); return jar; } static int Per_set_jar(cPersistentObject *self, PyObject *v) { if (self->cache) { int result; if (v == NULL) { PyErr_SetString(PyExc_ValueError, "can't delete _p_jar of cached object"); return -1; } if (PyObject_Cmp(self->jar, v, &result) < 0) return -1; if (result) { PyErr_SetString(PyExc_ValueError, "can not change _p_jar of cached object"); return -1; } } Py_XDECREF(self->jar); Py_XINCREF(v); self->jar = v; return 0; } static PyObject * Per_get_serial(cPersistentObject *self) { return PyString_FromStringAndSize(self->serial, 8); } static int Per_set_serial(cPersistentObject *self, PyObject *v) { if (v) { if (PyString_Check(v) && PyString_GET_SIZE(v) == 8) memcpy(self->serial, PyString_AS_STRING(v), 8); else { PyErr_SetString(PyExc_ValueError, "_p_serial must be an 8-character string"); return -1; } } else memset(self->serial, 0, 8); return 0; } static PyObject * Per_get_mtime(cPersistentObject *self) { PyObject *t, *v; if (unghostify(self) < 0) return NULL; accessed(self); if (memcmp(self->serial, "\0\0\0\0\0\0\0\0", 8) == 0) { Py_INCREF(Py_None); return Py_None; } t = PyObject_CallFunction(TimeStamp, "s#", self->serial, 8); if (!t) return NULL; v = PyObject_CallMethod(t, "timeTime", ""); Py_DECREF(t); return v; } static PyObject * Per_get_state(cPersistentObject *self) { return PyInt_FromLong(self->state); } static PyObject * Per_get_estimated_size(cPersistentObject *self) { return PyInt_FromLong(_estimated_size_in_bytes(self->estimated_size)); } static int Per_set_estimated_size(cPersistentObject *self, PyObject *v) { if (v) { if (PyInt_Check(v)) { long lv = PyInt_AS_LONG(v); if (lv < 0) { PyErr_SetString(PyExc_ValueError, "_p_estimated_size must not be negative"); return -1; } self->estimated_size = _estimated_size_in_24_bits(lv); } else { PyErr_SetString(PyExc_ValueError, "_p_estimated_size must be an integer"); return -1; } } else self->estimated_size = 0; return 0; } static PyGetSetDef Per_getsets[] = { {"_p_changed", (getter)Per_get_changed, (setter)Per_set_changed}, {"_p_jar", (getter)Per_get_jar, (setter)Per_set_jar}, {"_p_mtime", (getter)Per_get_mtime}, {"_p_oid", (getter)Per_get_oid, (setter)Per_set_oid}, {"_p_serial", (getter)Per_get_serial, (setter)Per_set_serial}, {"_p_state", (getter)Per_get_state}, {"_p_estimated_size", (getter)Per_get_estimated_size, (setter)Per_set_estimated_size }, {NULL} }; static struct PyMethodDef Per_methods[] = { {"_p_deactivate", (PyCFunction)Per__p_deactivate, METH_NOARGS, "_p_deactivate() -- Deactivate the object"}, {"_p_activate", 
(PyCFunction)Per__p_activate, METH_NOARGS, "_p_activate() -- Activate the object"}, {"_p_invalidate", (PyCFunction)Per__p_invalidate, METH_NOARGS, "_p_invalidate() -- Invalidate the object"}, {"_p_getattr", (PyCFunction)Per__p_getattr, METH_O, "_p_getattr(name) -- Test whether the base class must handle the name\n" "\n" "The method unghostifies the object, if necessary.\n" "The method records the object access, if necessary.\n" "\n" "This method should be called by subclass __getattribute__\n" "implementations before doing anything else. If the method\n" "returns True, then __getattribute__ implementations must delegate\n" "to the base class, Persistent.\n" }, {"_p_setattr", (PyCFunction)Per__p_setattr, METH_VARARGS, "_p_setattr(name, value) -- Save persistent meta data\n" "\n" "This method should be called by subclass __setattr__ implementations\n" "before doing anything else. If it returns true, then the attribute\n" "was handled by the base class.\n" "\n" "The method unghostifies the object, if necessary.\n" "The method records the object access, if necessary.\n" }, {"_p_delattr", (PyCFunction)Per__p_delattr, METH_O, "_p_delattr(name) -- Delete persistent meta data\n" "\n" "This method should be called by subclass __delattr__ implementations\n" "before doing anything else. If it returns true, then the attribute\n" "was handled by the base class.\n" "\n" "The method unghostifies the object, if necessary.\n" "The method records the object access, if necessary.\n" }, {"__getstate__", (PyCFunction)Per__getstate__, METH_NOARGS, pickle___getstate__doc }, {"__setstate__", (PyCFunction)pickle___setstate__, METH_O, pickle___setstate__doc}, {"__reduce__", (PyCFunction)pickle___reduce__, METH_NOARGS, pickle___reduce__doc}, {NULL, NULL} /* sentinel */ }; /* This module is compiled as a shared library. Some compilers don't allow addresses of Python objects defined in other libraries to be used in static initializers here. The DEFERRED_ADDRESS macro is used to tag the slots where such addresses appear; the module init function must fill in the tagged slots at runtime. The argument is for documentation -- the macro ignores it. 
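   For example, the Pertype definition below tags its metatype slot with
   DEFERRED_ADDRESS(&PyPersist_MetaType); initcPersistence() fills the slot in
   at runtime (Pertype.ob_type = &PyType_Type) before calling PyType_Ready().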
*/ #define DEFERRED_ADDRESS(ADDR) 0 static PyTypeObject Pertype = { PyObject_HEAD_INIT(DEFERRED_ADDRESS(&PyPersist_MetaType)) 0, /* ob_size */ "persistent.Persistent", /* tp_name */ sizeof(cPersistentObject), /* tp_basicsize */ 0, /* tp_itemsize */ (destructor)Per_dealloc, /* tp_dealloc */ 0, /* tp_print */ 0, /* tp_getattr */ 0, /* tp_setattr */ 0, /* tp_compare */ 0, /* tp_repr */ 0, /* tp_as_number */ 0, /* tp_as_sequence */ 0, /* tp_as_mapping */ 0, /* tp_hash */ 0, /* tp_call */ 0, /* tp_str */ (getattrofunc)Per_getattro, /* tp_getattro */ (setattrofunc)Per_setattro, /* tp_setattro */ 0, /* tp_as_buffer */ Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC, /* tp_flags */ 0, /* tp_doc */ (traverseproc)Per_traverse, /* tp_traverse */ 0, /* tp_clear */ 0, /* tp_richcompare */ 0, /* tp_weaklistoffset */ 0, /* tp_iter */ 0, /* tp_iternext */ Per_methods, /* tp_methods */ 0, /* tp_members */ Per_getsets, /* tp_getset */ }; /* End of code for Persistent objects */ /* -------------------------------------------------------- */ typedef int (*intfunctionwithpythonarg)(PyObject*); /* Load the object's state if necessary and become sticky */ static int Per_setstate(cPersistentObject *self) { if (unghostify(self) < 0) return -1; self->state = cPersistent_STICKY_STATE; return 0; } static PyObject * simple_new(PyObject *self, PyObject *type_object) { if (!PyType_Check(type_object)) { PyErr_SetString(PyExc_TypeError, "simple_new argument must be a type object."); return NULL; } return PyType_GenericNew((PyTypeObject *)type_object, NULL, NULL); } static PyMethodDef cPersistence_methods[] = { {"simple_new", simple_new, METH_O, "Create an object by simply calling a class's __new__ method without " "arguments."}, {NULL, NULL} }; static cPersistenceCAPIstruct truecPersistenceCAPI = { &Pertype, (getattrofunc)Per_getattro, /*tp_getattr with object key*/ (setattrofunc)Per_setattro, /*tp_setattr with object key*/ changed, accessed, ghostify, (intfunctionwithpythonarg)Per_setstate, NULL, /* The percachedel slot is initialized in cPickleCache.c when the module is loaded. It uses a function in a different shared library. 
*/ readCurrent }; void initcPersistence(void) { PyObject *m, *s; PyObject *copy_reg; if (init_strings() < 0) return; m = Py_InitModule3("cPersistence", cPersistence_methods, cPersistence_doc_string); Pertype.ob_type = &PyType_Type; Pertype.tp_new = PyType_GenericNew; if (PyType_Ready(&Pertype) < 0) return; if (PyModule_AddObject(m, "Persistent", (PyObject *)&Pertype) < 0) return; cPersistenceCAPI = &truecPersistenceCAPI; s = PyCObject_FromVoidPtr(cPersistenceCAPI, NULL); if (!s) return; if (PyModule_AddObject(m, "CAPI", s) < 0) return; if (PyModule_AddIntConstant(m, "GHOST", cPersistent_GHOST_STATE) < 0) return; if (PyModule_AddIntConstant(m, "UPTODATE", cPersistent_UPTODATE_STATE) < 0) return; if (PyModule_AddIntConstant(m, "CHANGED", cPersistent_CHANGED_STATE) < 0) return; if (PyModule_AddIntConstant(m, "STICKY", cPersistent_STICKY_STATE) < 0) return; py_simple_new = PyObject_GetAttrString(m, "simple_new"); if (!py_simple_new) return; copy_reg = PyImport_ImportModule("copy_reg"); if (!copy_reg) return; copy_reg_slotnames = PyObject_GetAttrString(copy_reg, "_slotnames"); if (!copy_reg_slotnames) { Py_DECREF(copy_reg); return; } __newobj__ = PyObject_GetAttrString(copy_reg, "__newobj__"); if (!__newobj__) { Py_DECREF(copy_reg); return; } if (!TimeStamp) { m = PyImport_ImportModule("persistent.TimeStamp"); if (!m) return; TimeStamp = PyObject_GetAttrString(m, "TimeStamp"); Py_DECREF(m); /* fall through to immediate return on error */ } } ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/cPersistence.h000066400000000000000000000117721230730566700247070ustar00rootroot00000000000000/***************************************************************************** Copyright (c) 2001, 2002 Zope Foundation and Contributors. All Rights Reserved. This software is subject to the provisions of the Zope Public License, Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE ****************************************************************************/ #ifndef CPERSISTENCE_H #define CPERSISTENCE_H #include "Python.h" #include "py24compat.h" #include "ring.h" #define CACHE_HEAD \ PyObject_HEAD \ CPersistentRing ring_home; \ int non_ghost_count; \ PY_LONG_LONG total_estimated_size; struct ccobject_head_struct; typedef struct ccobject_head_struct PerCache; /* How big is a persistent object? 12 PyGC_Head is two pointers and an int 8 PyObject_HEAD is an int and a pointer 12 jar, oid, cache pointers 8 ring struct 8 serialno 4 state + extra 4 size info (56) so far 4 dict ptr 4 weaklist ptr ------------------------- 68 only need 62, but obmalloc rounds up to multiple of eight Even a ghost requires 64 bytes. It's possible to make a persistent instance with slots and no dict, which changes the storage needed. */ #define cPersistent_HEAD \ PyObject_HEAD \ PyObject *jar; \ PyObject *oid; \ PerCache *cache; \ CPersistentRing ring; \ char serial[8]; \ signed state:8; \ unsigned estimated_size:24; /* We recently added estimated_size. We originally added it as a new unsigned long field after a signed char state field and a 3-character reserved field. This didn't work because there are packages in the wild that have their own copies of cPersistence.h that didn't see the update. 
To get around this, we used the reserved space by making estimated_size a 24-bit bit field in the space occupied by the old 3-character reserved field. To fit in 24 bits, we made the units of estimated_size 64-character blocks. This allows is to handle up to a GB. We should never see that, but to be paranoid, we also truncate sizes greater than 1GB. We also set the minimum size to 64 bytes. We use the _estimated_size_in_24_bits and _estimated_size_in_bytes macros both to avoid repetition and to make intent a little clearer. */ #define _estimated_size_in_24_bits(I) ((I) > 1073741696 ? 16777215 : (I)/64+1) #define _estimated_size_in_bytes(I) ((I)*64) #define cPersistent_GHOST_STATE -1 #define cPersistent_UPTODATE_STATE 0 #define cPersistent_CHANGED_STATE 1 #define cPersistent_STICKY_STATE 2 typedef struct { cPersistent_HEAD } cPersistentObject; typedef void (*percachedelfunc)(PerCache *, PyObject *); typedef struct { PyTypeObject *pertype; getattrofunc getattro; setattrofunc setattro; int (*changed)(cPersistentObject*); void (*accessed)(cPersistentObject*); void (*ghostify)(cPersistentObject*); int (*setstate)(PyObject*); percachedelfunc percachedel; int (*readCurrent)(cPersistentObject*); } cPersistenceCAPIstruct; #define cPersistenceType cPersistenceCAPI->pertype #ifndef DONT_USE_CPERSISTENCECAPI static cPersistenceCAPIstruct *cPersistenceCAPI; #endif #define cPersistanceModuleName "cPersistence" #define PER_TypeCheck(O) PyObject_TypeCheck((O), cPersistenceCAPI->pertype) #define PER_USE_OR_RETURN(O,R) {if((O)->state==cPersistent_GHOST_STATE && cPersistenceCAPI->setstate((PyObject*)(O)) < 0) return (R); else if ((O)->state==cPersistent_UPTODATE_STATE) (O)->state=cPersistent_STICKY_STATE;} #define PER_CHANGED(O) (cPersistenceCAPI->changed((cPersistentObject*)(O))) #define PER_READCURRENT(O, E) \ if (cPersistenceCAPI->readCurrent((cPersistentObject*)(O)) < 0) { E; } #define PER_GHOSTIFY(O) (cPersistenceCAPI->ghostify((cPersistentObject*)(O))) /* If the object is sticky, make it non-sticky, so that it can be ghostified. The value is not meaningful */ #define PER_ALLOW_DEACTIVATION(O) ((O)->state==cPersistent_STICKY_STATE && ((O)->state=cPersistent_UPTODATE_STATE)) #define PER_PREVENT_DEACTIVATION(O) ((O)->state==cPersistent_UPTODATE_STATE && ((O)->state=cPersistent_STICKY_STATE)) /* Make a persistent object usable from C by: - Making sure it is not a ghost - Making it sticky. IMPORTANT: If you call this and don't call PER_ALLOW_DEACTIVATION, your object will not be ghostified. PER_USE returns a 1 on success and 0 failure, where failure means error. */ #define PER_USE(O) \ (((O)->state != cPersistent_GHOST_STATE \ || (cPersistenceCAPI->setstate((PyObject*)(O)) >= 0)) \ ? (((O)->state==cPersistent_UPTODATE_STATE) \ ? ((O)->state=cPersistent_STICKY_STATE) : 1) : 0) #define PER_ACCESSED(O) (cPersistenceCAPI->accessed((cPersistentObject*)(O))) #endif ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/cPickleCache.c000066400000000000000000001125631230730566700245510ustar00rootroot00000000000000/***************************************************************************** Copyright (c) 2001, 2002 Zope Foundation and Contributors. All Rights Reserved. This software is subject to the provisions of the Zope Public License, Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE ****************************************************************************/ /* Objects are stored under three different regimes: Regime 1: Persistent Classes Persistent Classes are part of ZClasses. They are stored in the self->data dictionary, and are never garbage collected. The klass_items() method returns a sequence of (oid,object) tuples for every Persistent Class, which should make it possible to implement garbage collection in Python if necessary. Regime 2: Ghost Objects There is no benefit to keeping a ghost object which has no external references, therefore a weak reference scheme is used to ensure that ghost objects are removed from memory as soon as possible, when the last external reference is lost. Ghost objects are stored in the self->data dictionary. Normally a dictionary keeps a strong reference on its values, however this reference count is 'stolen'. This weak reference scheme leaves a dangling reference, in the dictionary, when the last external reference is lost. To clean up this dangling reference the persistent object dealloc function calls self->cache->_oid_unreferenced(self->oid). The cache looks up the oid in the dictionary, ensures it points to an object whose reference count is zero, then removes it from the dictionary. Before removing the object from the dictionary it must temporarily resurrect the object in much the same way that class instances are resurrected before their __del__ is called. Since ghost objects are stored under a different regime to non-ghost objects, an extra ghostify function in cPersistenceAPI replaces self->state=GHOST_STATE assignments that were common in other persistent classes (such as BTrees). Regime 3: Non-Ghost Objects Non-ghost objects are stored in two data structures: the dictionary mapping oids to objects and a doubly-linked list that encodes the order in which the objects were accessed. The dictionary reference is borrowed, as it is for ghosts. The list reference is a new reference; the list stores recently used objects, even if they are otherwise unreferenced, to avoid loading the object from the database again. The doubly-link-list nodes contain next and previous pointers linking together the cache and all non-ghost persistent objects. The node embedded in the cache is the home position. On every attribute access a non-ghost object will relink itself just behind the home position in the ring. Objects accessed least recently will eventually find themselves positioned after the home position. Occasionally other nodes are temporarily inserted in the ring as position markers. The cache contains a ring_lock flag which must be set and unset before and after doing so. Only if the flag is unset can the cache assume that all nodes are either his own home node, or nodes from persistent objects. This assumption is useful during the garbage collection process. The number of non-ghost objects is counted in self->non_ghost_count. The garbage collection process consists of traversing the ring, and deactivating (that is, turning into a ghost) every object until self->non_ghost_count is down to the target size, or until it reaches the home position again. Note that objects in the sticky or changed states are still kept in the ring, however they can not be deactivated. 
The garbage collection process must skip such objects, rather than deactivating them. */ static char cPickleCache_doc_string[] = "Defines the PickleCache used by ZODB Connection objects.\n" "\n" "$Id$\n"; #define DONT_USE_CPERSISTENCECAPI #include "cPersistence.h" #include "structmember.h" #include #include #undef Py_FindMethod /* Python 2.4 backward compat */ #if PY_MAJOR_VERSION <= 2 && PY_MINOR_VERSION < 5 #define Py_ssize_t int typedef Py_ssize_t (*lenfunc)(PyObject *); #endif /* Python string objects to speed lookups; set by module init. */ static PyObject *py__p_changed; static PyObject *py__p_deactivate; static PyObject *py__p_jar; static PyObject *py__p_oid; static cPersistenceCAPIstruct *capi; /* This object is the pickle cache. The CACHE_HEAD macro guarantees that layout of this struct is the same as the start of ccobject_head in cPersistence.c */ typedef struct { CACHE_HEAD int klass_count; /* count of persistent classes */ PyObject *data; /* oid -> object dict */ PyObject *jar; /* Connection object */ int cache_size; /* target number of items in cache */ PY_LONG_LONG cache_size_bytes; /* target total estimated size of items in cache */ /* Most of the time the ring contains only: * many nodes corresponding to persistent objects * one 'home' node from the cache. In some cases it is handy to temporarily add other types of node into the ring as placeholders. 'ring_lock' is a boolean indicating that someone has already done this. Currently this is only used by the garbage collection code. */ int ring_lock; /* 'cache_drain_resistance' controls how quickly the cache size will drop when it is smaller than the configured size. A value of zero means it will not drop below the configured size (suitable for most caches). Otherwise, it will remove cache_non_ghost_count/cache_drain_resistance items from the cache every time (suitable for rarely used caches, such as those associated with Zope versions. */ int cache_drain_resistance; } ccobject; static int cc_ass_sub(ccobject *self, PyObject *key, PyObject *v); /* ---------------------------------------------------------------- */ #define OBJECT_FROM_RING(SELF, HERE) \ ((cPersistentObject *)(((char *)here) - offsetof(cPersistentObject, ring))) /* Insert self into the ring, following after. */ static void insert_after(CPersistentRing *self, CPersistentRing *after) { assert(self != NULL); assert(after != NULL); self->r_prev = after; self->r_next = after->r_next; after->r_next->r_prev = self; after->r_next = self; } /* Remove self from the ring. */ static void unlink_from_ring(CPersistentRing *self) { assert(self != NULL); self->r_prev->r_next = self->r_next; self->r_next->r_prev = self->r_prev; } static int scan_gc_items(ccobject *self, int target, PY_LONG_LONG target_bytes) { /* This function must only be called with the ring lock held, because it places non-object placeholders in the ring. */ cPersistentObject *object; CPersistentRing *here; CPersistentRing before_original_home; int result = -1; /* guilty until proved innocent */ /* Scan the ring, from least to most recently used, deactivating * up-to-date objects, until we either find the ring_home again or * or we've ghosted enough objects to reach the target size. * Tricky: __getattr__ and __del__ methods can do anything, and in * particular if we ghostify an object with a __del__ method, that method * can load the object again, putting it back into the MRU part of the * ring. Waiting to find ring_home again can thus cause an infinite * loop (Collector #1208). 
So before_original_home records the MRU * position we start with, and we stop the scan when we reach that. */ insert_after(&before_original_home, self->ring_home.r_prev); here = self->ring_home.r_next; /* least recently used object */ while (here != &before_original_home && (self->non_ghost_count > target || (target_bytes && self->total_estimated_size > target_bytes) ) ) { assert(self->ring_lock); assert(here != &self->ring_home); /* At this point we know that the ring only contains nodes from persistent objects, plus our own home node. We know this because the ring lock is held. We can safely assume the current ring node is a persistent object now we know it is not the home */ object = OBJECT_FROM_RING(self, here); if (object->state == cPersistent_UPTODATE_STATE) { CPersistentRing placeholder; PyObject *method; PyObject *temp; int error_occurred = 0; /* deactivate it. This is the main memory saver. */ /* Add a placeholder, a dummy node in the ring. We need to do this to mark our position in the ring. It is possible that the PyObject_GetAttr() call below will invoke a __getattr__() hook in Python. Also possible that deactivation will lead to a __del__ method call. So another thread might run, and mutate the ring as a side effect of object accesses. There's no predicting then where in the ring here->next will point after that. The placeholder won't move as a side effect of calling Python code. */ insert_after(&placeholder, here); method = PyObject_GetAttr((PyObject *)object, py__p_deactivate); if (method == NULL) error_occurred = 1; else { temp = PyObject_CallObject(method, NULL); Py_DECREF(method); if (temp == NULL) error_occurred = 1; else Py_DECREF(temp); } here = placeholder.r_next; unlink_from_ring(&placeholder); if (error_occurred) goto Done; } else here = here->r_next; } result = 0; Done: unlink_from_ring(&before_original_home); return result; } static PyObject * lockgc(ccobject *self, int target_size, PY_LONG_LONG target_size_bytes) { /* This is thread-safe because of the GIL, and there's nothing * in between checking the ring_lock and acquiring it that calls back * into Python. */ if (self->ring_lock) { Py_INCREF(Py_None); return Py_None; } self->ring_lock = 1; if (scan_gc_items(self, target_size, target_size_bytes) < 0) { self->ring_lock = 0; return NULL; } self->ring_lock = 0; Py_INCREF(Py_None); return Py_None; } static PyObject * cc_incrgc(ccobject *self, PyObject *args) { int obsolete_arg = -999; int starting_size = self->non_ghost_count; int target_size = self->cache_size; PY_LONG_LONG target_size_bytes = self->cache_size_bytes; if (self->cache_drain_resistance >= 1) { /* This cache will gradually drain down to a small size. 
Check a (small) number of objects proportional to the current size */ int target_size_2 = (starting_size - 1 - starting_size / self->cache_drain_resistance); if (target_size_2 < target_size) target_size = target_size_2; } if (!PyArg_ParseTuple(args, "|i:incrgc", &obsolete_arg)) return NULL; if (obsolete_arg != -999 && (PyErr_Warn(PyExc_DeprecationWarning, "No argument expected") < 0)) return NULL; return lockgc(self, target_size, target_size_bytes); } static PyObject * cc_full_sweep(ccobject *self, PyObject *args) { int dt = -999; /* TODO: This should be deprecated; */ if (!PyArg_ParseTuple(args, "|i:full_sweep", &dt)) return NULL; if (dt == -999) return lockgc(self, 0, 0); else return cc_incrgc(self, args); } static PyObject * cc_minimize(ccobject *self, PyObject *args) { int ignored = -999; if (!PyArg_ParseTuple(args, "|i:minimize", &ignored)) return NULL; if (ignored != -999 && (PyErr_Warn(PyExc_DeprecationWarning, "No argument expected") < 0)) return NULL; return lockgc(self, 0, 0); } static int _invalidate(ccobject *self, PyObject *key) { static PyObject *_p_invalidate = NULL; PyObject *meth, *v; v = PyDict_GetItem(self->data, key); if (v == NULL) return 0; if (_p_invalidate == NULL) { _p_invalidate = PyString_InternFromString("_p_invalidate"); if (_p_invalidate == NULL) { /* It doesn't make any sense to ignore this error, but the caller ignores all errors. TODO: and why does it do that? This should be fixed */ return -1; } } if (v->ob_refcnt <= 1 && PyType_Check(v)) { /* This looks wrong, but it isn't. We use strong references to types because they don't have the ring members. The result is that we *never* remove classes unless they are modified. We can fix this by using wekrefs uniformly. */ self->klass_count--; return PyDict_DelItem(self->data, key); } meth = PyObject_GetAttr(v, _p_invalidate); if (meth == NULL) return -1; v = PyObject_CallObject(meth, NULL); Py_DECREF(meth); if (v == NULL) return -1; Py_DECREF(v); return 0; } static PyObject * cc_invalidate(ccobject *self, PyObject *inv) { PyObject *key, *v; Py_ssize_t i = 0; if (PyDict_Check(inv)) { while (PyDict_Next(inv, &i, &key, &v)) { if (_invalidate(self, key) < 0) return NULL; } PyDict_Clear(inv); } else { if (PyString_Check(inv)) { if (_invalidate(self, inv) < 0) return NULL; } else { int l, r; l = PyObject_Length(inv); if (l < 0) return NULL; for (i=l; --i >= 0; ) { key = PySequence_GetItem(inv, i); if (!key) return NULL; r = _invalidate(self, key); Py_DECREF(key); if (r < 0) return NULL; } /* Dubious: modifying the input may be an unexpected side effect. 
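        (The dictionary branch above has the same behavior via PyDict_Clear:
        whatever collection of oids the caller passes to invalidate() is
        emptied as a side effect, so it cannot be reused afterwards.)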
*/ PySequence_DelSlice(inv, 0, l); } } Py_INCREF(Py_None); return Py_None; } static PyObject * cc_get(ccobject *self, PyObject *args) { PyObject *r, *key, *d = NULL; if (!PyArg_ParseTuple(args, "O|O:get", &key, &d)) return NULL; r = PyDict_GetItem(self->data, key); if (!r) { if (d) r = d; else r = Py_None; } Py_INCREF(r); return r; } static PyObject * cc_items(ccobject *self) { return PyObject_CallMethod(self->data, "items", ""); } static PyObject * cc_klass_items(ccobject *self) { PyObject *l,*k,*v; Py_ssize_t p = 0; l = PyList_New(0); if (l == NULL) return NULL; while (PyDict_Next(self->data, &p, &k, &v)) { if(PyType_Check(v)) { v = Py_BuildValue("OO", k, v); if (v == NULL) { Py_DECREF(l); return NULL; } if (PyList_Append(l, v) < 0) { Py_DECREF(v); Py_DECREF(l); return NULL; } Py_DECREF(v); } } return l; } static PyObject * cc_debug_info(ccobject *self) { PyObject *l,*k,*v; Py_ssize_t p = 0; l = PyList_New(0); if (l == NULL) return NULL; while (PyDict_Next(self->data, &p, &k, &v)) { if (v->ob_refcnt <= 0) v = Py_BuildValue("Oi", k, v->ob_refcnt); else if (! PyType_Check(v) && (v->ob_type->tp_basicsize >= sizeof(cPersistentObject)) ) v = Py_BuildValue("Oisi", k, v->ob_refcnt, v->ob_type->tp_name, ((cPersistentObject*)v)->state); else v = Py_BuildValue("Ois", k, v->ob_refcnt, v->ob_type->tp_name); if (v == NULL) goto err; if (PyList_Append(l, v) < 0) goto err; } return l; err: Py_DECREF(l); return NULL; } static PyObject * cc_lru_items(ccobject *self) { PyObject *l; CPersistentRing *here; if (self->ring_lock) { /* When the ring lock is held, we have no way of know which ring nodes belong to persistent objects, and which a placeholders. */ PyErr_SetString(PyExc_ValueError, ".lru_items() is unavailable during garbage collection"); return NULL; } l = PyList_New(0); if (l == NULL) return NULL; here = self->ring_home.r_next; while (here != &self->ring_home) { PyObject *v; cPersistentObject *object = OBJECT_FROM_RING(self, here); if (object == NULL) { Py_DECREF(l); return NULL; } v = Py_BuildValue("OO", object->oid, object); if (v == NULL) { Py_DECREF(l); return NULL; } if (PyList_Append(l, v) < 0) { Py_DECREF(v); Py_DECREF(l); return NULL; } Py_DECREF(v); here = here->r_next; } return l; } static void cc_oid_unreferenced(ccobject *self, PyObject *oid) { /* This is called by the persistent object deallocation function when the reference count on a persistent object reaches zero. We need to fix up our dictionary; its reference is now dangling because we stole its reference count. Be careful to not release the global interpreter lock until this is complete. */ PyObject *v; /* If the cache has been cleared by GC, data will be NULL. */ if (!self->data) return; v = PyDict_GetItem(self->data, oid); assert(v); assert(v->ob_refcnt == 0); /* Need to be very hairy here because a dictionary is about to decref an already deleted object. */ #ifdef Py_TRACE_REFS /* This is called from the deallocation function after the interpreter has untracked the reference. Track it again. */ _Py_NewReference(v); /* Don't increment total refcount as a result of the shenanigans played in this function. The _Py_NewReference() call above creates artificial references to v. */ _Py_RefTotal--; assert(v->ob_type); #else Py_INCREF(v); #endif assert(v->ob_refcnt == 1); /* Incremement the refcount again, because delitem is going to DECREF it. If it's refcount reached zero again, we'd call back to the dealloc function that called us. */ Py_INCREF(v); /* TODO: Should we call _Py_ForgetReference() on error exit? 
*/ if (PyDict_DelItem(self->data, oid) < 0) return; Py_DECREF((ccobject *)((cPersistentObject *)v)->cache); ((cPersistentObject *)v)->cache = NULL; assert(v->ob_refcnt == 1); /* Undo the temporary resurrection. Don't DECREF the object, because this function is called from the object's dealloc function. If the refcnt reaches zero, it will all be invoked recursively. */ _Py_ForgetReference(v); } static PyObject * cc_ringlen(ccobject *self) { CPersistentRing *here; int c = 0; for (here = self->ring_home.r_next; here != &self->ring_home; here = here->r_next) c++; return PyInt_FromLong(c); } static PyObject * cc_update_object_size_estimation(ccobject *self, PyObject *args) { PyObject *oid; cPersistentObject *v; unsigned int new_size; if (!PyArg_ParseTuple(args, "OI:updateObjectSizeEstimation", &oid, &new_size)) return NULL; /* Note: reference borrowed */ v = (cPersistentObject *)PyDict_GetItem(self->data, oid); if (v) { /* we know this object -- update our "total_size_estimation" we must only update when the object is in the ring */ if (v->ring.r_next) { self->total_estimated_size += _estimated_size_in_bytes( (int)(_estimated_size_in_24_bits(new_size)) - (int)(v->estimated_size) ); /* we do this in "Connection" as we need it even when the object is not in the cache (or not the ring) */ /* v->estimated_size = new_size; */ } } Py_RETURN_NONE; } static PyObject* cc_new_ghost(ccobject *self, PyObject *args) { PyObject *tmp, *key, *v; if (!PyArg_ParseTuple(args, "OO:new_ghost", &key, &v)) return NULL; /* Sanity check the value given to make sure it is allowed in the cache */ if (PyType_Check(v)) { /* Its a persistent class, such as a ZClass. Thats ok. */ } else if (v->ob_type->tp_basicsize < sizeof(cPersistentObject)) { /* If it's not an instance of a persistent class, (ie Python classes that derive from persistent.Persistent, BTrees, etc), report an error. TODO: checking sizeof() seems a poor test. */ PyErr_SetString(PyExc_TypeError, "Cache values must be persistent objects."); return NULL; } /* Can't access v->oid directly because the object might be a * persistent class. 
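     * A persistent class is a type object without the cPersistentObject
     * layout, so its _p_oid can only be reached through ordinary
     * attribute access.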
*/ tmp = PyObject_GetAttr(v, py__p_oid); if (tmp == NULL) return NULL; Py_DECREF(tmp); if (tmp != Py_None) { PyErr_SetString(PyExc_AssertionError, "New ghost object must not have an oid"); return NULL; } /* useful sanity check, but not strictly an invariant of this class */ tmp = PyObject_GetAttr(v, py__p_jar); if (tmp == NULL) return NULL; Py_DECREF(tmp); if (tmp != Py_None) { PyErr_SetString(PyExc_AssertionError, "New ghost object must not have a jar"); return NULL; } tmp = PyDict_GetItem(self->data, key); if (tmp) { Py_DECREF(tmp); PyErr_SetString(PyExc_AssertionError, "The given oid is already in the cache"); return NULL; } if (PyType_Check(v)) { if (PyObject_SetAttr(v, py__p_jar, self->jar) < 0) return NULL; if (PyObject_SetAttr(v, py__p_oid, key) < 0) return NULL; if (PyDict_SetItem(self->data, key, v) < 0) return NULL; PyObject_GC_UnTrack((void *)self->data); self->klass_count++; } else { cPersistentObject *p = (cPersistentObject *)v; if(p->cache != NULL) { PyErr_SetString(PyExc_AssertionError, "Already in a cache"); return NULL; } if (PyDict_SetItem(self->data, key, v) < 0) return NULL; /* the dict should have a borrowed reference */ PyObject_GC_UnTrack((void *)self->data); Py_DECREF(v); Py_INCREF(self); p->cache = (PerCache *)self; Py_INCREF(self->jar); p->jar = self->jar; Py_INCREF(key); p->oid = key; p->state = cPersistent_GHOST_STATE; } Py_RETURN_NONE; } static struct PyMethodDef cc_methods[] = { {"items", (PyCFunction)cc_items, METH_NOARGS, "Return list of oid, object pairs for all items in cache."}, {"lru_items", (PyCFunction)cc_lru_items, METH_NOARGS, "List (oid, object) pairs from the lru list, as 2-tuples."}, {"klass_items", (PyCFunction)cc_klass_items, METH_NOARGS, "List (oid, object) pairs of cached persistent classes."}, {"full_sweep", (PyCFunction)cc_full_sweep, METH_VARARGS, "full_sweep() -- Perform a full sweep of the cache."}, {"minimize", (PyCFunction)cc_minimize, METH_VARARGS, "minimize([ignored]) -- Remove as many objects as possible\n\n" "Ghostify all objects that are not modified. Takes an optional\n" "argument, but ignores it."}, {"incrgc", (PyCFunction)cc_incrgc, METH_VARARGS, "incrgc() -- Perform incremental garbage collection\n\n" "This method had been depricated!" 
"Some other implementations support an optional parameter 'n' which\n" "indicates a repetition count; this value is ignored."}, {"invalidate", (PyCFunction)cc_invalidate, METH_O, "invalidate(oids) -- invalidate one, many, or all ids"}, {"get", (PyCFunction)cc_get, METH_VARARGS, "get(key [, default]) -- get an item, or a default"}, {"ringlen", (PyCFunction)cc_ringlen, METH_NOARGS, "ringlen() -- Returns number of non-ghost items in cache."}, {"debug_info", (PyCFunction)cc_debug_info, METH_NOARGS, "debug_info() -- Returns debugging data about objects in the cache."}, {"update_object_size_estimation", (PyCFunction)cc_update_object_size_estimation, METH_VARARGS, "update_object_size_estimation(oid, new_size) -- " "update the caches size estimation for *oid* " "(if this is known to the cache)."}, {"new_ghost", (PyCFunction)cc_new_ghost, METH_VARARGS, "new_ghost() -- Initialize a ghost and add it to the cache."}, {NULL, NULL} /* sentinel */ }; static int cc_init(ccobject *self, PyObject *args, PyObject *kwds) { int cache_size = 100; PY_LONG_LONG cache_size_bytes = 0; PyObject *jar; if (!PyArg_ParseTuple(args, "O|iL", &jar, &cache_size, &cache_size_bytes)) return -1; self->jar = NULL; self->data = PyDict_New(); if (self->data == NULL) { Py_DECREF(self); return -1; } /* Untrack the dict mapping oids to objects. The dict contains uncounted references to ghost objects, so it isn't safe for GC to visit it. If GC finds an object with more referents that refcounts, it will die with an assertion failure. When the cache participates in GC, it will need to traverse the objects in the doubly-linked list, which will account for all the non-ghost objects. */ PyObject_GC_UnTrack((void *)self->data); self->jar = jar; Py_INCREF(jar); self->cache_size = cache_size; self->cache_size_bytes = cache_size_bytes; self->non_ghost_count = 0; self->total_estimated_size = 0; self->klass_count = 0; self->cache_drain_resistance = 0; self->ring_lock = 0; self->ring_home.r_next = &self->ring_home; self->ring_home.r_prev = &self->ring_home; return 0; } static void cc_dealloc(ccobject *self) { Py_XDECREF(self->data); Py_XDECREF(self->jar); PyObject_GC_Del(self); } static int cc_clear(ccobject *self) { Py_ssize_t pos = 0; PyObject *k, *v; /* Clearing the cache is delicate. A non-ghost object will show up in the ring and in the dict. If we deallocating the dict before clearing the ring, the GC will decref each object in the dict. Since the dict references are uncounted, this will lead to objects having negative refcounts. Freeing the non-ghost objects should eliminate many objects from the cache, but there may still be ghost objects left. It's not safe to decref the dict until it's empty, so we need to manually clear those out of the dict, too. We accomplish that by replacing all the ghost objects with None. */ /* We don't need to lock the ring, because the cache is unreachable. It should be impossible for anyone to be modifying the cache. */ assert(! 
self->ring_lock); while (self->ring_home.r_next != &self->ring_home) { CPersistentRing *here = self->ring_home.r_next; cPersistentObject *o = OBJECT_FROM_RING(self, here); if (o->cache) { Py_INCREF(o); /* account for uncounted reference */ if (PyDict_DelItem(self->data, o->oid) < 0) return -1; } o->cache = NULL; Py_DECREF(self); self->ring_home.r_next = here->r_next; o->ring.r_prev = NULL; o->ring.r_next = NULL; Py_DECREF(o); here = here->r_next; } Py_XDECREF(self->jar); while (PyDict_Next(self->data, &pos, &k, &v)) { Py_INCREF(v); if (PyDict_SetItem(self->data, k, Py_None) < 0) return -1; } Py_XDECREF(self->data); self->data = NULL; self->jar = NULL; return 0; } static int cc_traverse(ccobject *self, visitproc visit, void *arg) { int err; CPersistentRing *here; /* If we're in the midst of cleaning up old objects, the ring contains * assorted junk we must not pass on to the visit() callback. This * should be rare (our cleanup code would need to have called back * into Python, which in turn triggered Python's gc). When it happens, * simply don't chase any pointers. The cache will appear to be a * source of external references then, and at worst we miss cleaning * up a dead cycle until the next time Python's gc runs. */ if (self->ring_lock) return 0; #define VISIT(SLOT) \ if (SLOT) { \ err = visit((PyObject *)(SLOT), arg); \ if (err) \ return err; \ } VISIT(self->jar); here = self->ring_home.r_next; /* It is possible that an object is traversed after it is cleared. In that case, there is no ring. */ if (!here) return 0; while (here != &self->ring_home) { cPersistentObject *o = OBJECT_FROM_RING(self, here); VISIT(o); here = here->r_next; } #undef VISIT return 0; } static Py_ssize_t cc_length(ccobject *self) { return PyObject_Length(self->data); } static PyObject * cc_subscript(ccobject *self, PyObject *key) { PyObject *r; r = PyDict_GetItem(self->data, key); if (r == NULL) { PyErr_SetObject(PyExc_KeyError, key); return NULL; } Py_INCREF(r); return r; } static int cc_add_item(ccobject *self, PyObject *key, PyObject *v) { int result; PyObject *oid, *object_again, *jar; cPersistentObject *p; /* Sanity check the value given to make sure it is allowed in the cache */ if (PyType_Check(v)) { /* Its a persistent class, such as a ZClass. Thats ok. */ } else if (v->ob_type->tp_basicsize < sizeof(cPersistentObject)) { /* If it's not an instance of a persistent class, (ie Python classes that derive from persistent.Persistent, BTrees, etc), report an error. TODO: checking sizeof() seems a poor test. */ PyErr_SetString(PyExc_TypeError, "Cache values must be persistent objects."); return -1; } /* Can't access v->oid directly because the object might be a * persistent class. */ oid = PyObject_GetAttr(v, py__p_oid); if (oid == NULL) return -1; if (! PyString_Check(oid)) { Py_DECREF(oid); PyErr_Format(PyExc_TypeError, "Cached object oid must be a string, not a %s", oid->ob_type->tp_name); return -1; } /* we know they are both strings. * now check if they are the same string. 
*/ result = PyObject_Compare(key, oid); if (PyErr_Occurred()) { Py_DECREF(oid); return -1; } Py_DECREF(oid); if (result) { PyErr_SetString(PyExc_ValueError, "Cache key does not match oid"); return -1; } /* useful sanity check, but not strictly an invariant of this class */ jar = PyObject_GetAttr(v, py__p_jar); if (jar == NULL) return -1; if (jar==Py_None) { Py_DECREF(jar); PyErr_SetString(PyExc_ValueError, "Cached object jar missing"); return -1; } Py_DECREF(jar); object_again = PyDict_GetItem(self->data, key); if (object_again) { if (object_again != v) { PyErr_SetString(PyExc_ValueError, "A different object already has the same oid"); return -1; } else { /* re-register under the same oid - no work needed */ return 0; } } if (PyType_Check(v)) { if (PyDict_SetItem(self->data, key, v) < 0) return -1; PyObject_GC_UnTrack((void *)self->data); self->klass_count++; return 0; } else { PerCache *cache = ((cPersistentObject *)v)->cache; if (cache) { if (cache != (PerCache *)self) /* This object is already in a different cache. */ PyErr_SetString(PyExc_ValueError, "Cache values may only be in one cache."); return -1; } /* else: This object is already one of ours, which is ok. It would be very strange if someone was trying to register the same object under a different key. */ } if (PyDict_SetItem(self->data, key, v) < 0) return -1; /* the dict should have a borrowed reference */ PyObject_GC_UnTrack((void *)self->data); Py_DECREF(v); p = (cPersistentObject *)v; Py_INCREF(self); p->cache = (PerCache *)self; if (p->state >= 0) { /* insert this non-ghost object into the ring just behind the home position. */ self->non_ghost_count++; ring_add(&self->ring_home, &p->ring); /* this list should have a new reference to the object */ Py_INCREF(v); } return 0; } static int cc_del_item(ccobject *self, PyObject *key) { PyObject *v; cPersistentObject *p; /* unlink this item from the ring */ v = PyDict_GetItem(self->data, key); if (v == NULL) { PyErr_SetObject(PyExc_KeyError, key); return -1; } if (PyType_Check(v)) { self->klass_count--; } else { p = (cPersistentObject *)v; if (p->state >= 0) { self->non_ghost_count--; ring_del(&p->ring); /* The DelItem below will account for the reference held by the list. */ } else { /* This is a ghost object, so we haven't kept a reference count on it. For it have stayed alive this long someone else must be keeping a reference to it. 
Therefore we need to temporarily give it back a reference count before calling DelItem below */ Py_INCREF(v); } Py_DECREF((PyObject *)p->cache); p->cache = NULL; } if (PyDict_DelItem(self->data, key) < 0) { PyErr_SetString(PyExc_RuntimeError, "unexpectedly couldn't remove key in cc_ass_sub"); return -1; } return 0; } static int cc_ass_sub(ccobject *self, PyObject *key, PyObject *v) { if (!PyString_Check(key)) { PyErr_Format(PyExc_TypeError, "cPickleCache key must be a string, not a %s", key->ob_type->tp_name); return -1; } if (v) return cc_add_item(self, key, v); else return cc_del_item(self, key); } static PyMappingMethods cc_as_mapping = { (lenfunc)cc_length, /*mp_length*/ (binaryfunc)cc_subscript, /*mp_subscript*/ (objobjargproc)cc_ass_sub, /*mp_ass_subscript*/ }; static PyObject * cc_cache_data(ccobject *self, void *context) { return PyDict_Copy(self->data); } static PyGetSetDef cc_getsets[] = { {"cache_data", (getter)cc_cache_data}, {NULL} }; static PyMemberDef cc_members[] = { {"cache_size", T_INT, offsetof(ccobject, cache_size)}, {"cache_size_bytes", T_LONG, offsetof(ccobject, cache_size_bytes)}, {"total_estimated_size", T_LONG, offsetof(ccobject, total_estimated_size), RO}, {"cache_drain_resistance", T_INT, offsetof(ccobject, cache_drain_resistance)}, {"cache_non_ghost_count", T_INT, offsetof(ccobject, non_ghost_count), RO}, {"cache_klass_count", T_INT, offsetof(ccobject, klass_count), RO}, {NULL} }; /* This module is compiled as a shared library. Some compilers don't allow addresses of Python objects defined in other libraries to be used in static initializers here. The DEFERRED_ADDRESS macro is used to tag the slots where such addresses appear; the module init function must fill in the tagged slots at runtime. The argument is for documentation -- the macro ignores it. 
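   For example, initcPickleCache() below fills in the slots tagged this
   way at runtime with:

       Cctype.ob_type = &PyType_Type;
       Cctype.tp_new = &PyType_GenericNew;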
*/ #define DEFERRED_ADDRESS(ADDR) 0 static PyTypeObject Cctype = { PyObject_HEAD_INIT(DEFERRED_ADDRESS(&PyType_Type)) 0, /* ob_size */ "persistent.PickleCache", /* tp_name */ sizeof(ccobject), /* tp_basicsize */ 0, /* tp_itemsize */ (destructor)cc_dealloc, /* tp_dealloc */ 0, /* tp_print */ 0, /* tp_getattr */ 0, /* tp_setattr */ 0, /* tp_compare */ 0, /* tp_repr */ 0, /* tp_as_number */ 0, /* tp_as_sequence */ &cc_as_mapping, /* tp_as_mapping */ 0, /* tp_hash */ 0, /* tp_call */ 0, /* tp_str */ 0, /* tp_getattro */ 0, /* tp_setattro */ 0, /* tp_as_buffer */ Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC, /* tp_flags */ 0, /* tp_doc */ (traverseproc)cc_traverse, /* tp_traverse */ (inquiry)cc_clear, /* tp_clear */ 0, /* tp_richcompare */ 0, /* tp_weaklistoffset */ 0, /* tp_iter */ 0, /* tp_iternext */ cc_methods, /* tp_methods */ cc_members, /* tp_members */ cc_getsets, /* tp_getset */ 0, /* tp_base */ 0, /* tp_dict */ 0, /* tp_descr_get */ 0, /* tp_descr_set */ 0, /* tp_dictoffset */ (initproc)cc_init, /* tp_init */ }; void initcPickleCache(void) { PyObject *m; Cctype.ob_type = &PyType_Type; Cctype.tp_new = &PyType_GenericNew; if (PyType_Ready(&Cctype) < 0) { return; } m = Py_InitModule3("cPickleCache", NULL, cPickleCache_doc_string); capi = (cPersistenceCAPIstruct *)PyCObject_Import( "persistent.cPersistence", "CAPI"); if (!capi) return; capi->percachedel = (percachedelfunc)cc_oid_unreferenced; py__p_changed = PyString_InternFromString("_p_changed"); if (!py__p_changed) return; py__p_deactivate = PyString_InternFromString("_p_deactivate"); if (!py__p_deactivate) return; py__p_jar = PyString_InternFromString("_p_jar"); if (!py__p_jar) return; py__p_oid = PyString_InternFromString("_p_oid"); if (!py__p_oid) return; if (PyModule_AddStringConstant(m, "cache_variant", "stiff/c") < 0) return; /* This leaks a reference to Cctype, but it doesn't matter. */ if (PyModule_AddObject(m, "PickleCache", (PyObject *)&Cctype) < 0) return; } ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/dict.py000066400000000000000000000013571230730566700234020ustar00rootroot00000000000000############################################################################## # # Copyright Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## # persistent.dict is deprecated. Use persistent.mapping from persistent.mapping import PersistentMapping as PersistentDict ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/interfaces.py000066400000000000000000000237061230730566700246040ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Persistence Interfaces $Id$ """ from zope.interface import Interface from zope.interface import Attribute class IPersistent(Interface): """Python persistent interface A persistent object can be in one of several states: - Unsaved The object has been created but not saved in a data manager. In this state, the _p_changed attribute is non-None and false and the _p_jar attribute is None. - Saved The object has been saved and has not been changed since it was saved. In this state, the _p_changed attribute is non-None and false and the _p_jar attribute is set to a data manager. - Sticky This state is identical to the saved state except that the object cannot transition to the ghost state. This is a special state used by C methods of persistent objects to make sure that state is not unloaded in the middle of computation. In this state, the _p_changed attribute is non-None and false and the _p_jar attribute is set to a data manager. There is no Python API for detecting whether an object is in the sticky state. - Changed The object has been changed. In this state, the _p_changed attribute is true and the _p_jar attribute is set to a data manager. - Ghost the object is in memory but its state has not been loaded from the database (or its state has been unloaded). In this state, the object doesn't contain any application data. In this state, the _p_changed attribute is None, and the _p_jar attribute is set to the data manager from which the object was obtained. In all the above, _p_oid (the persistent object id) is set when _p_jar first gets set. The following state transitions are possible: - Unsaved -> Saved This transition occurs when an object is saved in the database. This usually happens when an unsaved object is added to (e.g. as an attribute or item of) a saved (or changed) object and the transaction is committed. - Saved -> Changed Sticky -> Changed Ghost -> Changed This transition occurs when someone sets an attribute or sets _p_changed to a true value on a saved, sticky or ghost object. When the transition occurs, the persistent object is required to call the register() method on its data manager, passing itself as the only argument. Prior to ZODB 3.6, setting _p_changed to a true value on a ghost object was ignored (the object remained a ghost, and getting its _p_changed attribute continued to return None). - Saved -> Sticky This transition occurs when C code marks the object as sticky to prevent its deactivation. - Saved -> Ghost This transition occurs when a saved object is deactivated or invalidated. See discussion below. - Sticky -> Saved This transition occurs when C code unmarks the object as sticky to allow its deactivation. - Changed -> Saved This transition occurs when a transaction is committed. After saving the state of a changed object during transaction commit, the data manager sets the object's _p_changed to a non-None false value. - Changed -> Ghost This transition occurs when a transaction is aborted. All changed objects are invalidated by the data manager by an abort. - Ghost -> Saved This transition occurs when an attribute or operation of a ghost is accessed and the object's state is loaded from the database. 
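    As a rough sketch of the common life cycle (``P`` stands for any
    Persistent subclass; ``root`` and ``transaction`` are the usual ZODB
    database root and transaction module, used here only for illustration):

        p = P()               # unsaved: _p_jar is None, _p_changed false
        root['p'] = p
        transaction.commit()  # saved:   _p_jar set, _p_changed false
        p.x = 1               # changed: _p_changed true, register() called
        transaction.commit()  # saved again
        p._p_deactivate()     # ghost:   _p_changed is None (may be ignored)
        p.x                   # access reloads state: ghost -> saved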
Note that there is a separate C API that is not included here. The C API requires a specific data layout and defines the sticky state. About Invalidation, Deactivation and the Sticky & Ghost States The sticky state is intended to be a short-lived state, to prevent an object's state from being discarded while we're in C routines. It is an error to invalidate an object in the sticky state. Deactivation is a request that an object discard its state (become a ghost). Deactivation is an optimization, and a request to deactivate may be ignored. There are two equivalent ways to request deactivation: - call _p_deactivate() - set _p_changed to None There are two ways to invalidate an object: call the _p_invalidate() method (preferred) or delete its _p_changed attribute. This cannot be ignored, and is used when semantics require invalidation. Normally, an invalidated object transitions to the ghost state. However, some objects cannot be ghosts. When these objects are invalidated, they immediately reload their state from their data manager, and are then in the saved state. """ _p_jar = Attribute( """The data manager for the object. The data manager implements the IPersistentDataManager interface. If there is no data manager, then this is None. """) _p_oid = Attribute( """The object id. It is up to the data manager to assign this. The special value None is reserved to indicate that an object id has not been assigned. Non-None object ids must be non-empty strings. The 8-byte string '\0'*8 (8 NUL bytes) is reserved to identify the database root object. """) _p_changed = Attribute( """The persistent state of the object. This is one of: None -- The object is a ghost. false but not None -- The object is saved (or has never been saved). true -- The object has been modified since it was last saved. The object state may be changed by assigning or deleting this attribute; however, assigning None is ignored if the object is not in the saved state, and may be ignored even if the object is in the saved state. At and after ZODB 3.6, setting _p_changed to a true value for a ghost object activates the object; prior to 3.6, setting _p_changed to a true value on a ghost object was ignored. Note that an object can transition to the changed state only if it has a data manager. When such a state change occurs, the 'register' method of the data manager must be called, passing the persistent object. Deleting this attribute forces invalidation independent of existing state, although it is an error if the sticky state is current. """) _p_serial = Attribute( """The object serial number. This member is used by the data manager to distiguish distinct revisions of a given persistent object. This is an 8-byte string (not Unicode). """) def __getstate__(): """Get the object data. The state should not include persistent attributes ("_p_name"). The result must be picklable. """ def __setstate__(state): """Set the object data. """ def _p_activate(): """Activate the object. Change the object to the saved state if it is a ghost. """ def _p_deactivate(): """Deactivate the object. Possibly change an object in the saved state to the ghost state. It may not be possible to make some persistent objects ghosts, and, for optimization reasons, the implementation may choose to keep an object in the saved state. """ def _p_invalidate(): """Invalidate the object. Invalidate the object. This causes any data to be thrown away, even if the object is in the changed state. 
The object is moved to the ghost state; further accesses will cause object data to be reloaded. """ # TODO: document conflict resolution. class IPersistentDataManager(Interface): """Provide services for managing persistent state. This interface is used by a persistent object to interact with its data manager in the context of a transaction. """ def setstate(object): """Load the state for the given object. The object should be in the ghost state. The object's state will be set and the object will end up in the saved state. The object must provide the IPersistent interface. """ def oldstate(obj, tid): """Return copy of 'obj' that was written by transaction 'tid'. The returned object does not have the typical metadata (_p_jar, _p_oid, _p_serial) set. I'm not sure how references to other peristent objects are handled. Parameters obj: a persistent object from this Connection. tid: id of a transaction that wrote an earlier revision. Raises KeyError if tid does not exist or if tid deleted a revision of obj. """ def register(object): """Register an IPersistent with the current transaction. This method must be called when the object transitions to the changed state. A subclass could override this method to customize the default policy of one transaction manager for each thread. """ # Maybe later: ## def mtime(object): ## """Return the modification time of the object. ## The modification time may not be known, in which case None ## is returned. If non-None, the return value is the kind of ## timestamp supplied by Python's time.time(). ## """ ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/list.py000066400000000000000000000055331230730566700234320ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Python implementation of persistent list. 
$Id$""" import persistent from UserList import UserList class PersistentList(UserList, persistent.Persistent): __super_setitem = UserList.__setitem__ __super_delitem = UserList.__delitem__ __super_setslice = UserList.__setslice__ __super_delslice = UserList.__delslice__ __super_iadd = UserList.__iadd__ __super_imul = UserList.__imul__ __super_append = UserList.append __super_insert = UserList.insert __super_pop = UserList.pop __super_remove = UserList.remove __super_reverse = UserList.reverse __super_sort = UserList.sort __super_extend = UserList.extend def __setitem__(self, i, item): self.__super_setitem(i, item) self._p_changed = 1 def __delitem__(self, i): self.__super_delitem(i) self._p_changed = 1 def __setslice__(self, i, j, other): self.__super_setslice(i, j, other) self._p_changed = 1 def __delslice__(self, i, j): self.__super_delslice(i, j) self._p_changed = 1 def __iadd__(self, other): L = self.__super_iadd(other) self._p_changed = 1 return L def __imul__(self, n): L = self.__super_imul(n) self._p_changed = 1 return L def append(self, item): self.__super_append(item) self._p_changed = 1 def insert(self, i, item): self.__super_insert(i, item) self._p_changed = 1 def pop(self, i=-1): rtn = self.__super_pop(i) self._p_changed = 1 return rtn def remove(self, item): self.__super_remove(item) self._p_changed = 1 def reverse(self): self.__super_reverse() self._p_changed = 1 def sort(self, *args, **kwargs): self.__super_sort(*args, **kwargs) self._p_changed = 1 def extend(self, other): self.__super_extend(other) self._p_changed = 1 # This works around a bug in Python 2.1.x (up to 2.1.2 at least) where the # __cmp__ bogusly raises a RuntimeError, and because this is an extension # class, none of the rich comparison stuff works anyway. def __cmp__(self, other): return cmp(self.data, self._UserList__cast(other)) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/mapping.py000066400000000000000000000066411230730566700241130ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE # ############################################################################## """Python implementation of persistent base types $Id$""" import persistent import UserDict class default(object): def __init__(self, func): self.func = func def __get__(self, inst, class_): if inst is None: return self return self.func(inst) class PersistentMapping(UserDict.IterableUserDict, persistent.Persistent): """A persistent wrapper for mapping objects. This class allows wrapping of mapping objects so that object changes are registered. As a side effect, mapping objects may be subclassed. A subclass of PersistentMapping or any code that adds new attributes should not create an attribute named _container. This is reserved for backwards compatibility reasons. """ # UserDict provides all of the mapping behavior. The # PersistentMapping class is responsible marking the persistent # state as changed when a method actually changes the state. 
At # the mapping API evolves, we may need to add more methods here. __super_delitem = UserDict.IterableUserDict.__delitem__ __super_setitem = UserDict.IterableUserDict.__setitem__ __super_clear = UserDict.IterableUserDict.clear __super_update = UserDict.IterableUserDict.update __super_setdefault = UserDict.IterableUserDict.setdefault __super_pop = UserDict.IterableUserDict.pop __super_popitem = UserDict.IterableUserDict.popitem def __delitem__(self, key): self.__super_delitem(key) self._p_changed = 1 def __setitem__(self, key, v): self.__super_setitem(key, v) self._p_changed = 1 def clear(self): self.__super_clear() self._p_changed = 1 def update(self, b): self.__super_update(b) self._p_changed = 1 def setdefault(self, key, failobj=None): # We could inline all of UserDict's implementation into the # method here, but I'd rather not depend at all on the # implementation in UserDict (simple as it is). if not self.has_key(key): self._p_changed = 1 return self.__super_setdefault(key, failobj) def pop(self, key, *args): self._p_changed = 1 return self.__super_pop(key, *args) def popitem(self): self._p_changed = 1 return self.__super_popitem() # Old implementations used _container rather than data. # Use a descriptor to provide data when we have _container instead @default def data(self): # We don't want to cause a write on read, so wer're careful not to # do anything that would cause us to become marked as changed, however, # if we're modified, then the saved record will have data, not # _container. data = self.__dict__.pop('_container') self.__dict__['data'] = data return data ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/py24compat.h000066400000000000000000000004621230730566700242540ustar00rootroot00000000000000/* Backport type definitions from Python 2.5's object.h */ #ifndef PERSISTENT_PY24COMPAT_H #define PERSISTENT_PY24COMPAT_H #if PY_VERSION_HEX < 0x02050000 typedef int Py_ssize_t; #define PY_SSIZE_T_MAX INT_MAX #define PY_SSIZE_T_MIN INT_MIN #endif /* PY_VERSION_HEX */ #endif /* PERSISTENT_PY24COMPAT_H */ ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/ring.c000066400000000000000000000034131230730566700232030ustar00rootroot00000000000000/***************************************************************************** Copyright (c) 2003 Zope Foundation and Contributors. All Rights Reserved. This software is subject to the provisions of the Zope Public License, Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE ****************************************************************************/ #define RING_C "$Id$\n" /* Support routines for the doubly-linked list of cached objects. The cache stores a doubly-linked list of persistent objects, with space for the pointers allocated in the objects themselves. The cache stores the distinguished head of the list, which is not a valid persistent object. The next pointers traverse the ring in order starting with the least recently used object. The prev pointers traverse the ring in order starting with the most recently used object. 
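   As a rough sketch of how the cache uses these primitives: ring_add()
   links an object in as most recently used when it enters the cache as a
   non-ghost (see cc_add_item in cPickleCache.c), ring_del() unlinks it
   when it is ghostified or removed (see cc_del_item), and
   ring_move_to_head() relinks an already-linked object as most recently
   used, which is how a non-ghost object moves toward the MRU end of the
   ring when it is accessed.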
*/ #include "Python.h" #include "ring.h" void ring_add(CPersistentRing *ring, CPersistentRing *elt) { assert(!elt->r_next); elt->r_next = ring; elt->r_prev = ring->r_prev; ring->r_prev->r_next = elt; ring->r_prev = elt; } void ring_del(CPersistentRing *elt) { elt->r_next->r_prev = elt->r_prev; elt->r_prev->r_next = elt->r_next; elt->r_next = NULL; elt->r_prev = NULL; } void ring_move_to_head(CPersistentRing *ring, CPersistentRing *elt) { elt->r_prev->r_next = elt->r_next; elt->r_next->r_prev = elt->r_prev; elt->r_next = ring; elt->r_prev = ring->r_prev; ring->r_prev->r_next = elt; ring->r_prev = elt; } ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/ring.h000066400000000000000000000051171230730566700232130ustar00rootroot00000000000000/***************************************************************************** Copyright (c) 2003 Zope Foundation and Contributors. All Rights Reserved. This software is subject to the provisions of the Zope Public License, Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE ****************************************************************************/ /* Support routines for the doubly-linked list of cached objects. The cache stores a headed, doubly-linked, circular list of persistent objects, with space for the pointers allocated in the objects themselves. The cache stores the distinguished head of the list, which is not a valid persistent object. The other list members are non-ghost persistent objects, linked in LRU (least-recently used) order. The r_next pointers traverse the ring starting with the least recently used object. The r_prev pointers traverse the ring starting with the most recently used object. Obscure: While each object is pointed at twice by list pointers (once by its predecessor's r_next, again by its successor's r_prev), the refcount on the object is bumped only by 1. This leads to some possibly surprising sequences of incref and decref code. Note that since the refcount is bumped at least once, the list does hold a strong reference to each object in it. */ typedef struct CPersistentRing_struct { struct CPersistentRing_struct *r_prev; struct CPersistentRing_struct *r_next; } CPersistentRing; /* The list operations here take constant time independent of the * number of objects in the list: */ /* Add elt as the most recently used object. elt must not already be * in the list, although this isn't checked. */ void ring_add(CPersistentRing *ring, CPersistentRing *elt); /* Remove elt from the list. elt must already be in the list, although * this isn't checked. */ void ring_del(CPersistentRing *elt); /* elt must already be in the list, although this isn't checked. It's * unlinked from its current position, and relinked into the list as the * most recently used object (which is arguably the tail of the list * instead of the head -- but the name of this function could be argued * either way). This is equivalent to * * ring_del(elt); * ring_add(ring, elt); * * but may be a little quicker. 
*/ void ring_move_to_head(CPersistentRing *ring, CPersistentRing *elt); ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/tests/000077500000000000000000000000001230730566700232415ustar00rootroot00000000000000ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/tests/__init__.py000066400000000000000000000000121230730566700253430ustar00rootroot00000000000000# package ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/tests/persistent.txt000066400000000000000000000254761230730566700262200ustar00rootroot00000000000000Tests for `persistent.Persistent` ================================= This document is an extended doc test that covers the basics of the Persistent base class. The test expects a class named `P` to be provided in its globals. The `P` class implements the `Persistent` interface. Test framework -------------- The class `P` needs to behave like `ExampleP`. (Note that the code below is *not* part of the tests.) :: class ExampleP(Persistent): def __init__(self): self.x = 0 def inc(self): self.x += 1 The tests use stub data managers. A data manager is responsible for loading and storing the state of a persistent object. It's stored in the ``_p_jar`` attribute of a persistent object. >>> class DM: ... def __init__(self): ... self.called = 0 ... def register(self, ob): ... self.called += 1 ... def setstate(self, ob): ... ob.__setstate__({'x': 42}) >>> class BrokenDM(DM): ... def register(self,ob): ... self.called += 1 ... raise NotImplementedError ... def setstate(self,ob): ... raise NotImplementedError >>> from persistent import Persistent Test Persistent without Data Manager ------------------------------------ First do some simple tests of a Persistent instance that does not have a data manager (``_p_jar``). >>> p = P() >>> p.x 0 >>> p._p_changed False >>> p._p_state 0 >>> p._p_jar >>> p._p_oid Verify that modifications have no effect on ``_p_state`` of ``_p_changed``. >>> p.inc() >>> p.inc() >>> p.x 2 >>> p._p_changed False >>> p._p_state 0 Try all sorts of different ways to change the object's state. >>> p._p_deactivate() >>> p._p_state 0 >>> p._p_changed = True >>> p._p_state 0 >>> del p._p_changed >>> p._p_changed False >>> p._p_state 0 >>> p.x 2 We can store a size estimation in ``_p_estimated_size``. Its default is 0. The size estimation can be used by a cache associated with the data manager to help in the implementation of its replacement strategy or its size bounds. Of course, the estimated size must not be negative. >>> p._p_estimated_size 0 >>> p._p_estimated_size = 1000 >>> p._p_estimated_size 1024 Huh? Why is the estimated size coming out different than what we put in? The reason is that the size isn't stored exactly. For backward compatibility reasons, the size needs to fit in 24 bits, so, internally, it is adjusted somewhat. >>> p._p_estimated_size = -1 Traceback (most recent call last): .... ValueError: _p_estimated_size must not be negative Test Persistent with Data Manager --------------------------------- Next try some tests of an object with a data manager. The `DM` class is a simple testing stub. >>> p = P() >>> dm = DM() >>> p._p_oid = "00000012" >>> p._p_jar = dm >>> p._p_changed 0 >>> dm.called 0 Modifying the object marks it as changed and registers it with the data manager. Subsequent modifications don't have additional side-effects. >>> p.inc() >>> p._p_changed 1 >>> dm.called 1 >>> p.inc() >>> p._p_changed 1 >>> dm.called 1 It's not possible to deactivate a modified object. 
>>> p._p_deactivate() >>> p._p_changed 1 It is possible to invalidate it. That's the key difference between deactivation and invalidation. >>> p._p_invalidate() >>> p._p_state -1 Now that the object is a ghost, any attempt to modify it will require that it be unghosted first. The test data manager has the odd property that it sets the object's ``x`` attribute to ``42`` when it is unghosted. >>> p.inc() >>> p.x 43 >>> dm.called 2 You can manually reset the changed field to ``False``, although it's not clear why you would want to do that. The object changes to the ``UPTODATE`` state but retains its modifications. >>> p._p_changed = False >>> p._p_state 0 >>> p._p_changed False >>> p.x 43 >>> p.inc() >>> p._p_changed True >>> dm.called 3 ``__getstate__()`` and ``__setstate__()`` ----------------------------------------- The next several tests cover the ``__getstate__()`` and ``__setstate__()`` implementations. >>> p = P() >>> state = p.__getstate__() >>> isinstance(state, dict) True >>> state['x'] 0 >>> p._p_state 0 Calling setstate always leaves the object in the uptodate state? (I'm not entirely clear on this one.) >>> p.__setstate__({'x': 5}) >>> p._p_state 0 Assigning to a volatile attribute has no effect on the object state. >>> p._v_foo = 2 >>> p.__getstate__() {'x': 5} >>> p._p_state 0 The ``_p_serial`` attribute is not affected by calling setstate. >>> p._p_serial = "00000012" >>> p.__setstate__(p.__getstate__()) >>> p._p_serial '00000012' Change Ghost test ----------------- If an object is a ghost and its ``_p_changed`` is set to ``True`` (any true value), it should activate (unghostify) the object. This behavior is new in ZODB 3.6; before then, an attempt to do ``ghost._p_changed = True`` was ignored. >>> p = P() >>> p._p_jar = DM() >>> p._p_oid = 1 >>> p._p_deactivate() >>> p._p_changed # None >>> p._p_state # ghost state -1 >>> p._p_changed = True >>> p._p_changed 1 >>> p._p_state # changed state 1 >>> p.x 42 Activate, deactivate, and invalidate ------------------------------------ Some of these tests are redundant, but are included to make sure there are explicit and simple tests of ``_p_activate()``, ``_p_deactivate()``, and ``_p_invalidate()``. >>> p = P() >>> p._p_oid = 1 >>> p._p_jar = DM() >>> p._p_deactivate() >>> p._p_state -1 >>> p._p_activate() >>> p._p_state 0 >>> p.x 42 >>> p.inc() >>> p.x 43 >>> p._p_state 1 >>> p._p_invalidate() >>> p._p_state -1 >>> p.x 42 Test failures ------------- The following tests cover various errors cases. When an object is modified, it registers with its data manager. If that registration fails, the exception is propagated and the object stays in the up-to-date state. It shouldn't change to the modified state, because it won't be saved when the transaction commits. >>> p = P() >>> p._p_oid = 1 >>> p._p_jar = BrokenDM() >>> p._p_state 0 >>> p._p_jar.called 0 >>> p._p_changed = 1 Traceback (most recent call last): ... NotImplementedError >>> p._p_jar.called 1 >>> p._p_state 0 Make sure that exceptions that occur inside the data manager's ``setstate()`` method propagate out to the caller. >>> p = P() >>> p._p_oid = 1 >>> p._p_jar = BrokenDM() >>> p._p_deactivate() >>> p._p_state -1 >>> p._p_activate() Traceback (most recent call last): ... NotImplementedError >>> p._p_state -1 Special test to cover layout of ``__dict__`` -------------------------------------------- We once had a bug in the `Persistent` class that calculated an incorrect offset for the ``__dict__`` attribute. 
It assigned ``__dict__`` and ``_p_jar`` to the same location in memory. This is a simple test to make sure they have different locations. >>> p = P() >>> p.inc() >>> p.inc() >>> 'x' in p.__dict__ True >>> p._p_jar Inheritance and metaclasses --------------------------- Simple tests to make sure it's possible to inherit from the `Persistent` base class multiple times. There used to be metaclasses involved in `Persistent` that probably made this a more interesting test. >>> class A(Persistent): ... pass >>> class B(Persistent): ... pass >>> class C(A, B): ... pass >>> class D(object): ... pass >>> class E(D, B): ... pass >>> a = A() >>> b = B() >>> c = C() >>> d = D() >>> e = E() Also make sure that it's possible to define `Persistent` classes that have a custom metaclass. >>> class alternateMeta(type): ... type >>> class alternate(object): ... __metaclass__ = alternateMeta >>> class mixedMeta(alternateMeta, type): ... pass >>> class mixed(alternate, Persistent): ... pass >>> class mixed(Persistent, alternate): ... pass Basic type structure -------------------- >>> Persistent.__dictoffset__ 0 >>> Persistent.__weakrefoffset__ 0 >>> Persistent.__basicsize__ > object.__basicsize__ True >>> P.__dictoffset__ > 0 True >>> P.__weakrefoffset__ > 0 True >>> P.__dictoffset__ < P.__weakrefoffset__ True >>> P.__basicsize__ > Persistent.__basicsize__ True Slots ----- These are some simple tests of classes that have an ``__slots__`` attribute. Some of the classes should have slots, others shouldn't. >>> class noDict(object): ... __slots__ = ['foo'] >>> class p_noDict(Persistent): ... __slots__ = ['foo'] >>> class p_shouldHaveDict(p_noDict): ... pass >>> p_noDict.__dictoffset__ 0 >>> x = p_noDict() >>> x.foo = 1 >>> x.foo 1 >>> x.bar = 1 Traceback (most recent call last): ... AttributeError: 'p_noDict' object has no attribute 'bar' >>> x._v_bar = 1 Traceback (most recent call last): ... AttributeError: 'p_noDict' object has no attribute '_v_bar' >>> x.__dict__ Traceback (most recent call last): ... AttributeError: 'p_noDict' object has no attribute '__dict__' The various _p_ attributes are unaffected by slots. >>> p._p_oid >>> p._p_jar >>> p._p_state 0 If the most-derived class does not specify >>> p_shouldHaveDict.__dictoffset__ > 0 True >>> x = p_shouldHaveDict() >>> isinstance(x.__dict__, dict) True Pickling -------- There's actually a substantial effort involved in making subclasses of `Persistent` work with plain-old pickle. The ZODB serialization layer never calls pickle on an object; it pickles the object's class description and its state as two separate pickles. >>> import pickle >>> p = P() >>> p.inc() >>> p2 = pickle.loads(pickle.dumps(p)) >>> p2.__class__ is P True >>> p2.x == p.x True We should also test that pickle works with custom getstate and setstate. Perhaps even reduce. The problem is that pickling depends on finding the class in a particular module, and classes defined here won't appear in any module. We could require each user of the tests to define a base class, but that might be tedious. Interfaces ---------- Some versions of Zope and ZODB have the `zope.interface` package available. If it is available, then persistent will be associated with several interfaces. It's hard to write a doctest test that runs the tests only if `zope.interface` is available, so this test looks a little unusual. One problem is that the assert statements won't do anything if you run with `-O`. >>> try: ... import zope.interface ... except ImportError: ... pass ... else: ... 
from persistent.interfaces import IPersistent ... assert IPersistent.implementedBy(Persistent) ... p = Persistent() ... assert IPersistent.providedBy(p) ... assert IPersistent.implementedBy(P) ... p = P() ... assert IPersistent.providedBy(p) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/tests/testPersistent.py000066400000000000000000000217631230730566700266640ustar00rootroot00000000000000############################################################################# # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## import unittest Picklable = None # avoid global import of Persistent; updated later class PersistenceTest(unittest.TestCase): def _makeOne(self): from persistent import Persistent class P(Persistent): pass return P() def _makeJar(self): from persistent.tests.utils import ResettingJar return ResettingJar() def test_oid_initial_value(self): obj = self._makeOne() self.assertEqual(obj._p_oid, None) def test_oid_mutable_and_deletable_when_no_jar(self): obj = self._makeOne() obj._p_oid = 12 self.assertEqual(obj._p_oid, 12) del obj._p_oid def test_oid_immutable_when_in_jar(self): obj = self._makeOne() jar = self._makeJar() jar.add(obj) # Can't change oid of cache object. def deloid(): del obj._p_oid self.assertRaises(ValueError, deloid) def setoid(): obj._p_oid = 12 self.assertRaises(ValueError, setoid) # The value returned for _p_changed can be one of: # 0 -- it is not changed # 1 -- it is changed # None -- it is a ghost def test_change_via_setattr(self): from persistent import CHANGED obj = self._makeOne() jar = self._makeJar() jar.add(obj) obj.x = 1 self.assertEqual(obj._p_changed, 1) self.assertEqual(obj._p_state, CHANGED) self.assert_(obj in jar.registered) def test_setattr_then_mark_uptodate(self): from persistent import UPTODATE obj = self._makeOne() jar = self._makeJar() jar.add(obj) obj.x = 1 obj._p_changed = 0 self.assertEqual(obj._p_changed, 0) self.assertEqual(obj._p_state, UPTODATE) def test_set_changed_directly(self): from persistent import CHANGED obj = self._makeOne() jar = self._makeJar() jar.add(obj) obj._p_changed = 1 self.assertEqual(obj._p_changed, 1) self.assertEqual(obj._p_state, CHANGED) self.assert_(obj in jar.registered) def test_cant_ghostify_if_changed(self): from persistent import CHANGED obj = self._makeOne() jar = self._makeJar() jar.add(obj) # setting obj._p_changed to None ghostifies if the # object is in the up-to-date state, but not otherwise. obj.x = 1 obj._p_changed = None self.assertEqual(obj._p_changed, 1) self.assertEqual(obj._p_state, CHANGED) def test_can_ghostify_if_uptodate(self): from persistent import GHOST obj = self._makeOne() jar = self._makeJar() jar.add(obj) obj.x = 1 obj._p_changed = 0 obj._p_changed = None self.assertEqual(obj._p_changed, None) self.assertEqual(obj._p_state, GHOST) def test_can_ghostify_if_changed_but_del__p_changed(self): from persistent import GHOST obj = self._makeOne() jar = self._makeJar() jar.add(obj) # You can transition directly from modified to ghost if # you delete the _p_changed attribute. 
obj.x = 1 del obj._p_changed self.assertEqual(obj._p_changed, None) self.assertEqual(obj._p_state, GHOST) def test__p_state_immutable(self): from persistent import CHANGED from persistent import GHOST from persistent import STICKY from persistent import UPTODATE # make sure we can't write to _p_state; we don't want yet # another way to change state! obj = self._makeOne() def setstate(value): obj._p_state = value self.assertRaises(Exception, setstate, GHOST) self.assertRaises(Exception, setstate, UPTODATE) self.assertRaises(Exception, setstate, CHANGED) self.assertRaises(Exception, setstate, STICKY) def test_invalidate(self): from persistent import GHOST from persistent import UPTODATE obj = self._makeOne() jar = self._makeJar() jar.add(obj) self.assertEqual(obj._p_changed, 0) self.assertEqual(obj._p_state, UPTODATE) obj._p_invalidate() self.assertEqual(obj._p_changed, None) self.assertEqual(obj._p_state, GHOST) def test_invalidate_activate_invalidate(self): from persistent import GHOST obj = self._makeOne() jar = self._makeJar() jar.add(obj) obj._p_invalidate() obj._p_activate() obj.x = 1 obj._p_invalidate() self.assertEqual(obj._p_changed, None) self.assertEqual(obj._p_state, GHOST) def test_initial_serial(self): NOSERIAL = "\000" * 8 obj = self._makeOne() self.assertEqual(obj._p_serial, NOSERIAL) def test_setting_serial_w_invalid_types_raises(self): # Serial must be an 8-digit string obj = self._makeOne() def set(val): obj._p_serial = val self.assertRaises(ValueError, set, 1) self.assertRaises(ValueError, set, "0123") self.assertRaises(ValueError, set, "012345678") self.assertRaises(ValueError, set, u"01234567") def test_del_serial_returns_to_initial(self): NOSERIAL = "\000" * 8 obj = self._makeOne() obj._p_serial = "01234567" del obj._p_serial self.assertEqual(obj._p_serial, NOSERIAL) def test_initial_mtime(self): obj = self._makeOne() self.assertEqual(obj._p_mtime, None) def test_setting_serial_sets_mtime_to_now(self): import time from persistent.TimeStamp import TimeStamp obj = self._makeOne() t = int(time.time()) ts = TimeStamp(*time.gmtime(t)[:6]) # XXX: race? obj._p_serial = repr(ts) # why repr it? self.assertEqual(obj._p_mtime, t) self.assert_(isinstance(obj._p_mtime, float)) def test_pickle_unpickle(self): import pickle from persistent import Persistent # see above: class must be at module scope to be pickled. global Picklable class Picklable(Persistent): pass obj = Picklable() obj.attr = "test" s = pickle.dumps(obj) obj2 = pickle.loads(s) self.assertEqual(obj.attr, obj2.attr) def test___getattr__(self): from persistent import CHANGED from persistent import Persistent class H1(Persistent): def __init__(self): self.n = 0 def __getattr__(self, attr): self.n += 1 return self.n obj = H1() self.assertEqual(obj.larry, 1) self.assertEqual(obj.curly, 2) self.assertEqual(obj.moe, 3) jar = self._makeJar() jar.add(obj) obj._p_deactivate() # The simple Jar used for testing re-initializes the object. self.assertEqual(obj.larry, 1) # The getattr hook modified the object, so it should now be # in the changed state. 
self.assertEqual(obj._p_changed, 1) self.assertEqual(obj._p_state, CHANGED) self.assertEqual(obj.curly, 2) self.assertEqual(obj.moe, 3) def test___getattribute__(self): from persistent import CHANGED from persistent import Persistent class H2(Persistent): def __init__(self): self.n = 0 def __getattribute__(self, attr): supergetattr = super(H2, self).__getattribute__ try: return supergetattr(attr) except AttributeError: n = supergetattr("n") self.n = n + 1 return n + 1 obj = H2() self.assertEqual(obj.larry, 1) self.assertEqual(obj.curly, 2) self.assertEqual(obj.moe, 3) jar = self._makeJar() jar.add(obj) obj._p_deactivate() # The simple Jar used for testing re-initializes the object. self.assertEqual(obj.larry, 1) # The getattr hook modified the object, so it should now be # in the changed state. self.assertEqual(obj._p_changed, 1) self.assertEqual(obj._p_state, CHANGED) self.assertEqual(obj.curly, 2) self.assertEqual(obj.moe, 3) # TODO: Need to decide how __setattr__ and __delattr__ should work, # then write tests. def test_suite(): return unittest.makeSuite(PersistenceTest) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/tests/test_PickleCache.py000066400000000000000000000077041230730566700270150ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## class DummyConnection: def setklassstate(self, obj): """Method used by PickleCache.""" def test_delitem(): """ >>> from persistent import PickleCache >>> conn = DummyConnection() >>> cache = PickleCache(conn) >>> del cache[''] Traceback (most recent call last): ... KeyError: '' >>> from persistent import Persistent >>> p = Persistent() >>> p._p_oid = 'foo' >>> p._p_jar = conn >>> cache['foo'] = p >>> del cache['foo'] """ def new_ghost(): """ Creating ghosts (from scratch, as opposed to ghostifying a non-ghost) in the curremt implementation is rather tricky. IPeristent doesn't really provide the right interface given that: - _p_deactivate and _p_invalidate are overridable and could assume that the object's state is properly initialized. - Assigning _p_changed to None or deleting it just calls _p_deactivate or _p_invalidate. The current cache implementation is intimately tied up with the persistence implementation and has internal access to the persistence state. The cache implementation can update the persistence state for newly created and ininitialized objects directly. The future persistence and cache implementations will be far more decoupled. The persistence implementation will only manage object state and generate object-usage events. The cache implemnentation(s) will be rersponsible for managing persistence-related (meta-)state, such as _p_state, _p_changed, _p_oid, etc. So in that future implemention, the cache will be more central to managing object persistence information. Caches have a new_ghost method that: - adds an object to the cache, and - initializes its persistence data. 
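For contrast, the naive hand-rolled version that new_ghost replaces looks roughly like this (a sketch only, using the names from the example below, and unreliable for exactly the reasons listed above):

    ob = C.__new__(C)      # build an instance without running __init__
    ob._p_oid = oid
    ob._p_jar = jar
    ob._p_deactivate()     # overridable; may assume fully initialized state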
>>> import persistent >>> class C(persistent.Persistent): ... pass >>> jar = object() >>> cache = persistent.PickleCache(jar, 10, 100) >>> ob = C.__new__(C) >>> cache.new_ghost('1', ob) >>> ob._p_changed >>> ob._p_jar is jar True >>> ob._p_oid '1' >>> cache.cache_non_ghost_count, cache.total_estimated_size (0, 0) Peristent meta classes work too: >>> import ZODB.persistentclass >>> class PC: ... __metaclass__ = ZODB.persistentclass.PersistentMetaClass >>> PC._p_oid >>> PC._p_jar >>> PC._p_serial >>> PC._p_changed False >>> cache.new_ghost('2', PC) >>> PC._p_oid '2' >>> PC._p_jar is jar True >>> PC._p_serial >>> PC._p_changed False """ def cache_invalidate_and_minimize_used_to_leak_None_ref(): """Persistent weak references >>> import transaction >>> import ZODB.tests.util >>> db = ZODB.tests.util.DB() >>> conn = db.open() >>> conn.root.p = p = conn.root().__class__() >>> transaction.commit() >>> import sys >>> old = sys.getrefcount(None) >>> conn._cache.invalidate(p._p_oid) >>> sys.getrefcount(None) - old 0 >>> _ = conn.root.p.keys() >>> old = sys.getrefcount(None) >>> conn._cache.minimize() >>> sys.getrefcount(None) - old 0 >>> db.close() """ import os if os.environ.get('USE_ZOPE_TESTING_DOCTEST'): from zope.testing.doctest import DocTestSuite else: from doctest import DocTestSuite import unittest def test_suite(): return unittest.TestSuite(( DocTestSuite(), )) if __name__ == '__main__': unittest.main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/tests/test_list.py000066400000000000000000000146201230730566700256300ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2001, 2002 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. 
# ############################################################################## """Tests for PersistentList """ import unittest l0 = [] l1 = [0] l2 = [0, 1] class OtherList: def __init__(self, initlist): self.__data = initlist def __len__(self): return len(self.__data) def __getitem__(self, i): return self.__data[i] class TestPList(unittest.TestCase): def _getTargetClass(self): from persistent.list import PersistentList return PersistentList def test_volatile_attributes_not_persisted(self): # http://www.zope.org/Collectors/Zope/2052 m = self._getTargetClass()() m.foo = 'bar' m._v_baz = 'qux' state = m.__getstate__() self.failUnless('foo' in state) self.failIf('_v_baz' in state) def testTheWorld(self): # Test constructors pl = self._getTargetClass() u = pl() u0 = pl(l0) u1 = pl(l1) u2 = pl(l2) uu = pl(u) uu0 = pl(u0) uu1 = pl(u1) uu2 = pl(u2) v = pl(tuple(u)) v0 = pl(OtherList(u0)) vv = pl("this is also a sequence") # Test __repr__ eq = self.assertEqual eq(str(u0), str(l0), "str(u0) == str(l0)") eq(repr(u1), repr(l1), "repr(u1) == repr(l1)") eq(`u2`, `l2`, "`u2` == `l2`") # Test __cmp__ and __len__ def mycmp(a, b): r = cmp(a, b) if r < 0: return -1 if r > 0: return 1 return r all = [l0, l1, l2, u, u0, u1, u2, uu, uu0, uu1, uu2] for a in all: for b in all: eq(mycmp(a, b), mycmp(len(a), len(b)), "mycmp(a, b) == mycmp(len(a), len(b))") # Test __getitem__ for i in range(len(u2)): eq(u2[i], i, "u2[i] == i") # Test __setitem__ uu2[0] = 0 uu2[1] = 100 try: uu2[2] = 200 except IndexError: pass else: raise TestFailed("uu2[2] shouldn't be assignable") # Test __delitem__ del uu2[1] del uu2[0] try: del uu2[0] except IndexError: pass else: raise TestFailed("uu2[0] shouldn't be deletable") # Test __getslice__ for i in range(-3, 4): eq(u2[:i], l2[:i], "u2[:i] == l2[:i]") eq(u2[i:], l2[i:], "u2[i:] == l2[i:]") for j in range(-3, 4): eq(u2[i:j], l2[i:j], "u2[i:j] == l2[i:j]") # Test __setslice__ for i in range(-3, 4): u2[:i] = l2[:i] eq(u2, l2, "u2 == l2") u2[i:] = l2[i:] eq(u2, l2, "u2 == l2") for j in range(-3, 4): u2[i:j] = l2[i:j] eq(u2, l2, "u2 == l2") uu2 = u2[:] uu2[:0] = [-2, -1] eq(uu2, [-2, -1, 0, 1], "uu2 == [-2, -1, 0, 1]") uu2[0:] = [] eq(uu2, [], "uu2 == []") # Test __contains__ for i in u2: self.failUnless(i in u2, "i in u2") for i in min(u2)-1, max(u2)+1: self.failUnless(i not in u2, "i not in u2") # Test __delslice__ uu2 = u2[:] del uu2[1:2] del uu2[0:1] eq(uu2, [], "uu2 == []") uu2 = u2[:] del uu2[1:] del uu2[:1] eq(uu2, [], "uu2 == []") # Test __add__, __radd__, __mul__ and __rmul__ #self.failUnless(u1 + [] == [] + u1 == u1, "u1 + [] == [] + u1 == u1") self.failUnless(u1 + [1] == u2, "u1 + [1] == u2") #self.failUnless([-1] + u1 == [-1, 0], "[-1] + u1 == [-1, 0]") self.failUnless(u2 == u2*1 == 1*u2, "u2 == u2*1 == 1*u2") self.failUnless(u2+u2 == u2*2 == 2*u2, "u2+u2 == u2*2 == 2*u2") self.failUnless(u2+u2+u2 == u2*3 == 3*u2, "u2+u2+u2 == u2*3 == 3*u2") # Test append u = u1[:] u.append(1) eq(u, u2, "u == u2") # Test insert u = u2[:] u.insert(0, -1) eq(u, [-1, 0, 1], "u == [-1, 0, 1]") # Test pop u = pl([0, -1, 1]) u.pop() eq(u, [0, -1], "u == [0, -1]") u.pop(0) eq(u, [-1], "u == [-1]") # Test remove u = u2[:] u.remove(1) eq(u, u1, "u == u1") # Test count u = u2*3 eq(u.count(0), 3, "u.count(0) == 3") eq(u.count(1), 3, "u.count(1) == 3") eq(u.count(2), 0, "u.count(2) == 0") # Test index eq(u2.index(0), 0, "u2.index(0) == 0") eq(u2.index(1), 1, "u2.index(1) == 1") try: u2.index(2) except ValueError: pass else: raise TestFailed("expected ValueError") # Test reverse u = u2[:] u.reverse() eq(u, 
[1, 0], "u == [1, 0]") u.reverse() eq(u, u2, "u == u2") # Test sort u = pl([1, 0]) u.sort() eq(u, u2, "u == u2") # Test keyword arguments to sort u.sort(cmp=lambda x,y: cmp(y, x)) eq(u, [1, 0], "u == [1, 0]") u.sort(key=lambda x:-x) eq(u, [1, 0], "u == [1, 0]") u.sort(reverse=True) eq(u, [1, 0], "u == [1, 0]") # Passing any other keyword arguments results in a TypeError try: u.sort(blah=True) except TypeError: pass else: raise TestFailed("expected TypeError") # Test extend u = u1[:] u.extend(u2) eq(u, u1 + u2, "u == u1 + u2") # Test iadd u = u1[:] u += u2 eq(u, u1 + u2, "u == u1 + u2") # Test imul u = u1[:] u *= 3 eq(u, u1 + u1 + u1, "u == u1 + u1 + u1") def test_suite(): return unittest.makeSuite(TestPList) if __name__ == "__main__": loader = unittest.TestLoader() unittest.main(testLoader=loader) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/tests/test_mapping.py000066400000000000000000000135411230730566700263110ustar00rootroot00000000000000############################################################################## # # Copyright (c) Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## import doctest import unittest from zope.testing import setupstack def test_suite(): return unittest.TestSuite(( doctest.DocFileSuite('README.txt'), )) l0 = {} l1 = {0:0} l2 = {0:0, 1:1} class MappingTests(unittest.TestCase): def _getTargetClass(self): from persistent.mapping import PersistentMapping return PersistentMapping def test_volatile_attributes_not_persisted(self): # http://www.zope.org/Collectors/Zope/2052 m = self._getTargetClass()() m.foo = 'bar' m._v_baz = 'qux' state = m.__getstate__() self.failUnless('foo' in state) self.failIf('_v_baz' in state) def testTheWorld(self): # Test constructors pm = self._getTargetClass() u = pm() u0 = pm(l0) u1 = pm(l1) u2 = pm(l2) uu = pm(u) uu0 = pm(u0) uu1 = pm(u1) uu2 = pm(u2) class OtherMapping: def __init__(self, initmapping): self.__data = initmapping def items(self): return self.__data.items() v0 = pm(OtherMapping(u0)) vv = pm([(0, 0), (1, 1)]) # Test __repr__ eq = self.assertEqual eq(str(u0), str(l0), "str(u0) == str(l0)") eq(repr(u1), repr(l1), "repr(u1) == repr(l1)") eq(`u2`, `l2`, "`u2` == `l2`") # Test __cmp__ and __len__ def mycmp(a, b): r = cmp(a, b) if r < 0: return -1 if r > 0: return 1 return r all = [l0, l1, l2, u, u0, u1, u2, uu, uu0, uu1, uu2] for a in all: for b in all: eq(mycmp(a, b), mycmp(len(a), len(b)), "mycmp(a, b) == mycmp(len(a), len(b))") # Test __getitem__ for i in range(len(u2)): eq(u2[i], i, "u2[i] == i") # Test get for i in range(len(u2)): eq(u2.get(i), i, "u2.get(i) == i") eq(u2.get(i, 5), i, "u2.get(i, 5) == i") for i in min(u2)-1, max(u2)+1: eq(u2.get(i), None, "u2.get(i) == None") eq(u2.get(i, 5), 5, "u2.get(i, 5) == 5") # Test __setitem__ uu2[0] = 0 uu2[1] = 100 uu2[2] = 200 # Test __delitem__ del uu2[1] del uu2[0] try: del uu2[0] except KeyError: pass else: raise TestFailed("uu2[0] shouldn't be deletable") # Test __contains__ for i in u2: self.failUnless(i in u2, "i in u2") for i in min(u2)-1, max(u2)+1: self.failUnless(i not in u2, "i 
not in u2") # Test update l = {"a":"b"} u = pm(l) u.update(u2) for i in u: self.failUnless(i in l or i in u2, "i in l or i in u2") for i in l: self.failUnless(i in u, "i in u") for i in u2: self.failUnless(i in u, "i in u") # Test setdefault x = u2.setdefault(0, 5) eq(x, 0, "u2.setdefault(0, 5) == 0") x = u2.setdefault(5, 5) eq(x, 5, "u2.setdefault(5, 5) == 5") self.failUnless(5 in u2, "5 in u2") # Test pop x = u2.pop(1) eq(x, 1, "u2.pop(1) == 1") self.failUnless(1 not in u2, "1 not in u2") try: u2.pop(1) except KeyError: pass else: raise TestFailed("1 should not be poppable from u2") x = u2.pop(1, 7) eq(x, 7, "u2.pop(1, 7) == 7") # Test popitem items = u2.items() key, value = u2.popitem() self.failUnless((key, value) in items, "key, value in items") self.failUnless(key not in u2, "key not in u2") # Test clear u2.clear() eq(u2, {}, "u2 == {}") def test_legacy_data(): """ We've deprecated PersistentDict. If you import persistent.dict.PersistentDict, you'll get persistent.mapping.PersistentMapping. >>> import persistent.dict, persistent.mapping >>> persistent.dict.PersistentDict is persistent.mapping.PersistentMapping True PersistentMapping uses a data attribute for it's mapping data: >>> m = persistent.mapping.PersistentMapping() >>> m.__dict__ {'data': {}} In the past, it used a _container attribute. For some time, the implementation continued to use a _container attribute in pickles (__get/setstate__) to be compatible with older releases. This isn't really necessary any more. In fact, releases for which this might matter can no longer share databases with current releases. Because releases as recent as 3.9.0b5 still use _container in saved state, we need to accept such state, but we stop producing it. If we reset it's __dict__ with legacy data: >>> m.__dict__.clear() >>> m.__dict__['_container'] = {'a': 1} >>> m.__dict__ {'_container': {'a': 1}} >>> m._p_changed = 0 But when we perform any operations on it, the data will be converted without marking the object as changed: >>> m {'a': 1} >>> m.__dict__ {'data': {'a': 1}} >>> m._p_changed 0 >>> m.__getstate__() {'data': {'a': 1}} """ def test_suite(): return unittest.TestSuite(( doctest.DocTestSuite(), unittest.makeSuite(MappingTests), )) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/tests/test_overriding_attrs.py000066400000000000000000000253041230730566700302430ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2004 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """Overriding attr methods This module tests and documents, through example, overriding attribute access methods. """ from persistent import Persistent # ouch! 
def _resettingJar(): from persistent.tests.utils import ResettingJar return ResettingJar() def _rememberingJar(): from persistent.tests.utils import RememberingJar return RememberingJar() class SampleOverridingGetattr(Persistent): """Example of overriding __getattr__ """ def __getattr__(self, name): """Get attributes that can't be gotten the usual way The __getattr__ method works pretty much the same for persistent classes as it does for other classes. No special handling is needed. If an object is a ghost, then it will be activated before __getattr__ is called. In this example, our objects returns a tuple with the attribute name, converted to upper case and the value of _p_changed, for any attribute that isn't handled by the default machinery. >>> o = SampleOverridingGetattr() >>> o._p_changed False >>> o._p_oid >>> o._p_jar >>> o.spam ('SPAM', False) >>> o.spam = 1 >>> o.spam 1 We'll save the object, so it can be deactivated: >>> jar = _resettingJar() >>> jar.add(o) >>> o._p_deactivate() >>> o._p_changed And now, if we ask for an attribute it doesn't have, >>> o.eggs ('EGGS', False) And we see that the object was activated before calling the __getattr__ method. """ # Don't pretend we have any special attributes. if name.startswith("__") and name.endswrith("__"): raise AttributeError(name) else: return name.upper(), self._p_changed class SampleOverridingGetattributeSetattrAndDelattr(Persistent): """Example of overriding __getattribute__, __setattr__, and __delattr__ In this example, we'll provide an example that shows how to override the __getattribute__, __setattr__, and __delattr__ methods. We'll create a class that stores it's attributes in a secret dictionary within it's instance dictionary. The class will have the policy that variables with names starting with 'tmp_' will be volatile. """ def __init__(self, **kw): self.__dict__['__secret__'] = kw.copy() def __getattribute__(self, name): """Get an attribute value The __getattribute__ method is called for all attribute accesses. It overrides the attribute access support inherited from Persistent. Our sample class let's us provide initial values as keyword arguments to the constructor: >>> o = SampleOverridingGetattributeSetattrAndDelattr(x=1) >>> o._p_changed 0 >>> o._p_oid >>> o._p_jar >>> o.x 1 >>> o.y Traceback (most recent call last): ... AttributeError: y Next, we'll save the object in a database so that we can deactivate it: >>> jar = _rememberingJar() >>> jar.add(o) >>> o._p_deactivate() >>> o._p_changed And we'll get some data: >>> o.x 1 which activates the object: >>> o._p_changed 0 It works for missing attribes too: >>> o._p_deactivate() >>> o._p_changed >>> o.y Traceback (most recent call last): ... AttributeError: y >>> o._p_changed 0 See the very important note in the comment below! """ ################################################################# # IMPORTANT! READ THIS! 8-> # # We *always* give Persistent a chance first. # Persistent handles certain special attributes, like _p_ # attributes. In particular, the base class handles __dict__ # and __class__. # # We call _p_getattr. If it returns True, then we have to # use Persistent.__getattribute__ to get the value. 
# ################################################################# if Persistent._p_getattr(self, name): return Persistent.__getattribute__(self, name) # Data should be in our secret dictionary: secret = self.__dict__['__secret__'] if name in secret: return secret[name] # Maybe it's a method: meth = getattr(self.__class__, name, None) if meth is None: raise AttributeError(name) return meth.__get__(self, self.__class__) def __setattr__(self, name, value): """Set an attribute value The __setattr__ method is called for all attribute assignments. It overrides the attribute assignment support inherited from Persistent. Implementors of __setattr__ methods: 1. Must call Persistent._p_setattr first to allow it to handle some attributes and to make sure that the object is activated if necessary, and 2. Must set _p_changed to mark objects as changed. See the comments in the source below. >>> o = SampleOverridingGetattributeSetattrAndDelattr() >>> o._p_changed 0 >>> o._p_oid >>> o._p_jar >>> o.x Traceback (most recent call last): ... AttributeError: x >>> o.x = 1 >>> o.x 1 Because the implementation doesn't store attributes directly in the instance dictionary, we don't have a key for the attribute: >>> 'x' in o.__dict__ False Next, we'll give the object a "remembering" jar so we can deactivate it: >>> jar = _rememberingJar() >>> jar.add(o) >>> o._p_deactivate() >>> o._p_changed We'll modify an attribute >>> o.y = 2 >>> o.y 2 which reactivates it, and markes it as modified, because our implementation marked it as modified: >>> o._p_changed 1 Now, if fake a commit: >>> jar.fake_commit() >>> o._p_changed 0 And deactivate the object: >>> o._p_deactivate() >>> o._p_changed and then set a variable with a name starting with 'tmp_', The object will be activated, but not marked as modified, because our __setattr__ implementation doesn't mark the object as changed if the name starts with 'tmp_': >>> o.tmp_foo = 3 >>> o._p_changed 0 >>> o.tmp_foo 3 """ ################################################################# # IMPORTANT! READ THIS! 8-> # # We *always* give Persistent a chance first. # Persistent handles certain special attributes, like _p_ # attributes. # # We call _p_setattr. If it returns True, then we are done. # It has already set the attribute. # ################################################################# if Persistent._p_setattr(self, name, value): return self.__dict__['__secret__'][name] = value if not name.startswith('tmp_'): self._p_changed = 1 def __delattr__(self, name): """Delete an attribute value The __delattr__ method is called for all attribute deletions. It overrides the attribute deletion support inherited from Persistent. Implementors of __delattr__ methods: 1. Must call Persistent._p_delattr first to allow it to handle some attributes and to make sure that the object is activated if necessary, and 2. Must set _p_changed to mark objects as changed. See the comments in the source below. >>> o = SampleOverridingGetattributeSetattrAndDelattr( ... x=1, y=2, tmp_z=3) >>> o._p_changed 0 >>> o._p_oid >>> o._p_jar >>> o.x 1 >>> del o.x >>> o.x Traceback (most recent call last): ... AttributeError: x Next, we'll save the object in a jar so that we can deactivate it: >>> jar = _rememberingJar() >>> jar.add(o) >>> o._p_deactivate() >>> o._p_changed If we delete an attribute: >>> del o.y The object is activated. It is also marked as changed because our implementation marked it as changed. >>> o._p_changed 1 >>> o.y Traceback (most recent call last): ... 
AttributeError: y >>> o.tmp_z 3 Now, if fake a commit: >>> jar.fake_commit() >>> o._p_changed 0 And deactivate the object: >>> o._p_deactivate() >>> o._p_changed and then delete a variable with a name starting with 'tmp_', The object will be activated, but not marked as modified, because our __delattr__ implementation doesn't mark the object as changed if the name starts with 'tmp_': >>> del o.tmp_z >>> o._p_changed 0 >>> o.tmp_z Traceback (most recent call last): ... AttributeError: tmp_z """ ################################################################# # IMPORTANT! READ THIS! 8-> # # We *always* give Persistent a chance first. # Persistent handles certain special attributes, like _p_ # attributes. # # We call _p_delattr. If it returns True, then we are done. # It has already deleted the attribute. # ################################################################# if Persistent._p_delattr(self, name): return del self.__dict__['__secret__'][name] if not name.startswith('tmp_'): self._p_changed = 1 def test_suite(): import os if os.environ.get('USE_ZOPE_TESTING_DOCTEST'): from zope.testing.doctest import DocTestSuite else: from doctest import DocTestSuite return DocTestSuite() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/tests/test_persistent.py000066400000000000000000000031651230730566700270570ustar00rootroot00000000000000############################################################################## # # Copyright (c) Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## from persistent import Persistent, simple_new import os if os.environ.get('USE_ZOPE_TESTING_DOCTEST'): from zope.testing import doctest else: import doctest import unittest class P(Persistent): def __init__(self): self.x = 0 def inc(self): self.x += 1 def cpersistent_setstate_pointer_sanity(): """ >>> Persistent().__setstate__({}) Traceback (most recent call last): ... TypeError: this object has no instance dictionary >>> class C(Persistent): __slots__ = 'x', 'y' >>> C().__setstate__(({}, {})) Traceback (most recent call last): ... TypeError: this object has no instance dictionary """ def cpersistent_simple_new_invalid_argument(): """ >>> simple_new('') Traceback (most recent call last): ... TypeError: simple_new argument must be a type object. """ def test_suite(): return unittest.TestSuite(( doctest.DocFileSuite("persistent.txt", globs={"P": P}), doctest.DocTestSuite(), )) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/tests/test_pickle.py000066400000000000000000000143661230730566700261330ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. 
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## from persistent import Persistent import pickle def print_dict(d): d = d.items() d.sort() print '{%s}' % (', '.join( [('%r: %r' % (k, v)) for (k, v) in d] )) def cmpattrs(self, other, *attrs): for attr in attrs: if attr[:3] in ('_v_', '_p_'): continue c = cmp(getattr(self, attr, None), getattr(other, attr, None)) if c: return c return 0 class Simple(Persistent): def __init__(self, name, **kw): self.__name__ = name self.__dict__.update(kw) self._v_favorite_color = 'blue' self._p_foo = 'bar' def __cmp__(self, other): return cmpattrs(self, other, '__class__', *(self.__dict__.keys())) def test_basic_pickling(): """ >>> x = Simple('x', aaa=1, bbb='foo') >>> print_dict(x.__getstate__()) {'__name__': 'x', 'aaa': 1, 'bbb': 'foo'} >>> f, (c,), state = x.__reduce__() >>> f.__name__ '__newobj__' >>> f.__module__ 'copy_reg' >>> c.__name__ 'Simple' >>> print_dict(state) {'__name__': 'x', 'aaa': 1, 'bbb': 'foo'} >>> pickle.loads(pickle.dumps(x)) == x 1 >>> pickle.loads(pickle.dumps(x, 0)) == x 1 >>> pickle.loads(pickle.dumps(x, 1)) == x 1 >>> pickle.loads(pickle.dumps(x, 2)) == x 1 >>> x.__setstate__({'z': 1}) >>> x.__dict__ {'z': 1} """ class Custom(Simple): def __new__(cls, x, y): r = Persistent.__new__(cls) r.x, r.y = x, y return r def __init__(self, x, y): self.a = 42 def __getnewargs__(self): return self.x, self.y def __getstate__(self): return self.a def __setstate__(self, a): self.a = a def test_pickling_w_overrides(): """ >>> x = Custom('x', 'y') >>> x.a = 99 >>> (f, (c, ax, ay), a) = x.__reduce__() >>> f.__name__ '__newobj__' >>> f.__module__ 'copy_reg' >>> c.__name__ 'Custom' >>> ax, ay, a ('x', 'y', 99) >>> pickle.loads(pickle.dumps(x)) == x 1 >>> pickle.loads(pickle.dumps(x, 0)) == x 1 >>> pickle.loads(pickle.dumps(x, 1)) == x 1 >>> pickle.loads(pickle.dumps(x, 2)) == x 1 """ class Slotted(Persistent): __slots__ = 's1', 's2', '_p_splat', '_v_eek' def __init__(self, s1, s2): self.s1, self.s2 = s1, s2 self._v_eek = 1 self._p_splat = 2 class SubSlotted(Slotted): __slots__ = 's3', 's4' def __init__(self, s1, s2, s3): Slotted.__init__(self, s1, s2) self.s3 = s3 def __cmp__(self, other): return cmpattrs(self, other, '__class__', 's1', 's2', 's3', 's4') def test_pickling_w_slots_only(): """ >>> x = SubSlotted('x', 'y', 'z') >>> d, s = x.__getstate__() >>> d >>> print_dict(s) {'s1': 'x', 's2': 'y', 's3': 'z'} >>> pickle.loads(pickle.dumps(x)) == x 1 >>> pickle.loads(pickle.dumps(x, 0)) == x 1 >>> pickle.loads(pickle.dumps(x, 1)) == x 1 >>> pickle.loads(pickle.dumps(x, 2)) == x 1 >>> x.s4 = 'spam' >>> d, s = x.__getstate__() >>> d >>> print_dict(s) {'s1': 'x', 's2': 'y', 's3': 'z', 's4': 'spam'} >>> pickle.loads(pickle.dumps(x)) == x 1 >>> pickle.loads(pickle.dumps(x, 0)) == x 1 >>> pickle.loads(pickle.dumps(x, 1)) == x 1 >>> pickle.loads(pickle.dumps(x, 2)) == x 1 """ class SubSubSlotted(SubSlotted): def __init__(self, s1, s2, s3, **kw): SubSlotted.__init__(self, s1, s2, s3) self.__dict__.update(kw) self._v_favorite_color = 'blue' self._p_foo = 'bar' def __cmp__(self, other): return cmpattrs(self, other, '__class__', 's1', 's2', 's3', 's4', *(self.__dict__.keys())) def test_pickling_w_slots(): """ >>> x = SubSubSlotted('x', 'y', 'z', aaa=1, bbb='foo') >>> d, s = x.__getstate__() 
>>> print_dict(d) {'aaa': 1, 'bbb': 'foo'} >>> print_dict(s) {'s1': 'x', 's2': 'y', 's3': 'z'} >>> pickle.loads(pickle.dumps(x)) == x 1 >>> pickle.loads(pickle.dumps(x, 0)) == x 1 >>> pickle.loads(pickle.dumps(x, 1)) == x 1 >>> pickle.loads(pickle.dumps(x, 2)) == x 1 >>> x.s4 = 'spam' >>> d, s = x.__getstate__() >>> print_dict(d) {'aaa': 1, 'bbb': 'foo'} >>> print_dict(s) {'s1': 'x', 's2': 'y', 's3': 'z', 's4': 'spam'} >>> pickle.loads(pickle.dumps(x)) == x 1 >>> pickle.loads(pickle.dumps(x, 0)) == x 1 >>> pickle.loads(pickle.dumps(x, 1)) == x 1 >>> pickle.loads(pickle.dumps(x, 2)) == x 1 """ def test_pickling_w_slots_w_empty_dict(): """ >>> x = SubSubSlotted('x', 'y', 'z') >>> d, s = x.__getstate__() >>> print_dict(d) {} >>> print_dict(s) {'s1': 'x', 's2': 'y', 's3': 'z'} >>> pickle.loads(pickle.dumps(x)) == x 1 >>> pickle.loads(pickle.dumps(x, 0)) == x 1 >>> pickle.loads(pickle.dumps(x, 1)) == x 1 >>> pickle.loads(pickle.dumps(x, 2)) == x 1 >>> x.s4 = 'spam' >>> d, s = x.__getstate__() >>> print_dict(d) {} >>> print_dict(s) {'s1': 'x', 's2': 'y', 's3': 'z', 's4': 'spam'} >>> pickle.loads(pickle.dumps(x)) == x 1 >>> pickle.loads(pickle.dumps(x, 0)) == x 1 >>> pickle.loads(pickle.dumps(x, 1)) == x 1 >>> pickle.loads(pickle.dumps(x, 2)) == x 1 """ import os if os.environ.get('USE_ZOPE_TESTING_DOCTEST'): from zope.testing.doctest import DocTestSuite else: from doctest import DocTestSuite import unittest def test_suite(): return unittest.TestSuite(( DocTestSuite(), )) if __name__ == '__main__': unittest.main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/tests/test_wref.py000066400000000000000000000016201230730566700256140ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## import unittest import os if os.environ.get('USE_ZOPE_TESTING_DOCTEST'): from zope.testing.doctest import DocTestSuite else: from doctest import DocTestSuite def test_suite(): return DocTestSuite('persistent.wref') if __name__ == '__main__': unittest.main() ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/tests/utils.py000066400000000000000000000043171230730566700247600ustar00rootroot00000000000000 class ResettingJar(object): """Testing stub for _p_jar attribute. """ def __init__(self): from persistent.cPickleCache import PickleCache # XXX stub it! self.cache = PickleCache(self) self.oid = 1 self.registered = {} def add(self, obj): import struct obj._p_oid = struct.pack(">Q", self.oid) self.oid += 1 obj._p_jar = self self.cache[obj._p_oid] = obj def close(self): pass # the following methods must be implemented to be a jar def setklassstate(self): # I don't know what this method does, but the pickle cache # constructor calls it. pass def register(self, obj): self.registered[obj] = 1 def setstate(self, obj): # Trivial setstate() implementation that just re-initializes # the object. This isn't what setstate() is supposed to do, # but it suffices for the tests. 
obj.__class__.__init__(obj) class RememberingJar(object): """Testing stub for _p_jar attribute. """ def __init__(self): from persistent.cPickleCache import PickleCache # XXX stub it! self.cache = PickleCache(self) self.oid = 1 self.registered = {} def add(self, obj): import struct obj._p_oid = struct.pack(">Q", self.oid) self.oid += 1 obj._p_jar = self self.cache[obj._p_oid] = obj # Remember object's state for later. self.obj = obj self.remembered = obj.__getstate__() def close(self): pass def fake_commit(self): self.remembered = self.obj.__getstate__() self.obj._p_changed = 0 # the following methods must be implemented to be a jar def setklassstate(self): # I don't know what this method does, but the pickle cache # constructor calls it. pass def register(self, obj): self.registered[obj] = 1 def setstate(self, obj): # Trivial setstate() implementation that resets the object's # state as of the time it was added to the jar. # This isn't what setstate() is supposed to do, # but it suffices for the tests. obj.__setstate__(self.remembered) ZODB-b28a24c423bed62a194c6508cf08c05f32656a54/src/persistent/wref.py000066400000000000000000000231761230730566700234250ustar00rootroot00000000000000############################################################################## # # Copyright (c) 2003 Zope Foundation and Contributors. # All Rights Reserved. # # This software is subject to the provisions of the Zope Public License, # Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. # THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED # WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED # WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS # FOR A PARTICULAR PURPOSE. # ############################################################################## """ZODB-based persistent weakrefs $Id$ """ __docformat__ = "reStructuredText" from persistent import Persistent import transaction WeakRefMarker = object() class WeakRef(object): """Persistent weak references Persistent weak references are used much like Python weak references. The major difference is that you can't specify an object to be called when the object is removed from the database. Here's an example. We'll start by creating a persistent object and a reference to it: >>> import persistent, ZODB.tests.MinPO >>> import ZODB.tests.util >>> ob = ZODB.tests.MinPO.MinPO() >>> ref = WeakRef(ob) >>> ref() is ob True The hash of the ref if the same as the hash of the referenced object: >>> hash(ref) == hash(ob) True Two refs to the same object are equal: >>> WeakRef(ob) == ref True >>> ob2 = ZODB.tests.MinPO.MinPO(1) >>> WeakRef(ob2) == ref False Lets save the reference and the referenced object in a database: >>> db = ZODB.tests.util.DB() >>> conn1 = db.open() >>> conn1.root()['ob'] = ob >>> conn1.root()['ref'] = ref >>> transaction.commit() If we open a new connection, we can use the reference: >>> conn2 = db.open() >>> conn2.root()['ref']() is conn2.root()['ob'] True >>> hash(conn2.root()['ref']) == hash(conn2.root()['ob']) True But if we delete the referenced object and pack: >>> del conn2.root()['ob'] >>> transaction.commit() >>> ZODB.tests.util.pack(db) And then look in a new connection: >>> conn3 = db.open() >>> conn3.root()['ob'] Traceback (most recent call last): ... KeyError: 'ob' Trying to dereference the reference returns None: >>> conn3.root()['ref']() Trying to get a hash, raises a type error: >>> hash(conn3.root()['ref']) Traceback (most recent call last): ... 
TypeError: Weakly-referenced object has gone away Always explicitly close databases: :) >>> db.close() >>> del ob, ref, db, conn1, conn2, conn3 When multiple databases are in use, a weakref in one database may point to an object in a different database. Let's create two new databases to demonstrate this. >>> dbA = ZODB.tests.util.DB( ... database_name = 'dbA', ... ) >>> dbB = ZODB.tests.util.DB( ... database_name = 'dbB', ... databases = dbA.databases, ... ) >>> connA1 = dbA.open() >>> connB1 = connA1.get_connection('dbB') Now create and add a new object and a weak reference, and add them to different databases. >>> ob = ZODB.tests.MinPO.MinPO() >>> ref = WeakRef(ob) >>> connA1.root()['ob'] = ob >>> connA1.add(ob) >>> connB1.root()['ref'] = ref >>> transaction.commit() After a succesful commit, the reference should know the oid, database name and connection of the object. >>> ref.oid == ob._p_oid True >>> ref.database_name == 'dbA' True >>> ref.dm is ob._p_jar is connA1 True If we open new connections, we should be able to use the reference. >>> connA2 = dbA.open() >>> connB2 = connA2.get_connection('dbB') >>> ref2 = connB2.root()['ref'] >>> ob2 = connA2.root()['ob'] >>> ref2() is ob2 True >>> ref2.oid == ob2._p_oid True >>> ref2.database_name == 'dbA' True >>> ref2.dm is ob2._p_jar is connA2 True Always explicitly close databases: :) >>> dbA.close() >>> dbB.close() """ # We set _p_oid to a marker so that the serialization system can # provide special handling of weakrefs. _p_oid = WeakRefMarker def __init__(self, ob): self._v_ob = ob self.oid = ob._p_oid self.dm = ob._p_jar if self.dm is not None: self.database_name = self.dm.db().database_name def __call__(self): try: return self._v_ob except AttributeError: try: self._v_ob = self.dm[self.oid] except (KeyError, AttributeError): return None return self._v_ob def __hash__(self): self = self() if self is None: raise TypeError('Weakly-referenced object has gone away') return hash(self) def __eq__(self, other): self = self() if self is None: raise TypeError('Weakly-referenced object has gone away') other = other() if other is None: raise TypeError('Weakly-referenced object has gone away') return self == other class PersistentWeakKeyDictionary(Persistent): """Persistent weak key dictionary This is akin to WeakKeyDictionaries. Note, however, that removal of items is extremely lazy. See below. We'll start by creating a PersistentWeakKeyDictionary and adding some persistent objects to it. 
>>> d = PersistentWeakKeyDictionary() >>> import ZODB.tests.util >>> p1 = ZODB.tests.util.P('p1') >>> p2 = ZODB.tests.util.P('p2') >>> p3 = ZODB.tests.util.P('p3') >>> d[p1] = 1 >>> d[p2] = 2 >>> d[p3] = 3 We'll create an extra persistent object that's not in the dict: >>> p4 = ZODB.tests.util.P('p4') Now we'll excercise iteration and item access: >>> l = [(str(k), d[k], d.get(k)) for k in d] >>> l.sort() >>> l [('P(p1)', 1, 1), ('P(p2)', 2, 2), ('P(p3)', 3, 3)] And the containment operator: >>> [p in d for p in [p1, p2, p3, p4]] [True, True, True, False] We can add the dict and the referenced objects to a database: >>> db = ZODB.tests.util.DB() >>> conn1 = db.open() >>> conn1.root()['p1'] = p1 >>> conn1.root()['d'] = d >>> conn1.root()['p2'] = p2 >>> conn1.root()['p3'] = p3 >>> transaction.commit() And things still work, as before: >>> l = [(str(k), d[k], d.get(k)) for k in d] >>> l.sort() >>> l [('P(p1)', 1, 1), ('P(p2)', 2, 2), ('P(p3)', 3, 3)] >>> [p in d for p in [p1, p2, p3, p4]] [True, True, True, False] Likewise, we can read the objects from another connection and things still work. >>> conn2 = db.open() >>> d = conn2.root()['d'] >>> p1 = conn2.root()['p1'] >>> p2 = conn2.root()['p2'] >>> p3 = conn2.root()['p3'] >>> l = [(str(k), d[k], d.get(k)) for k in d] >>> l.sort() >>> l [('P(p1)', 1, 1), ('P(p2)', 2, 2), ('P(p3)', 3, 3)] >>> [p in d for p in [p1, p2, p3, p4]] [True, True, True, False] Now, we'll delete one of the objects from the database, but *not* from the dictionary: >>> del conn2.root()['p2'] >>> transaction.commit() And pack the database, so that the no-longer referenced p2 is actually removed from the database. >>> ZODB.tests.util.pack(db) Now if we access the dictionary in a new connection, it no longer has p2: >>> conn3 = db.open() >>> d = conn3.root()['d'] >>> l = [(str(k), d[k], d.get(k)) for k in d] >>> l.sort() >>> l [('P(p1)', 1, 1), ('P(p3)', 3, 3)] It's worth nothing that that the versions of the dictionary in conn1 and conn2 still have p2, because p2 is still in the caches for those connections. Always explicitly close databases: :) >>> db.close() """ # TODO: It's expensive trying to load dead objects from the database. # It would be helpful if the data manager/connection cached these. def __init__(self, adict=None, **kwargs): self.data = {} if adict is not None: keys = getattr(adict, "keys", None) if keys is None: adict = dict(adict) self.update(adict) if kwargs: self.update(kwargs) def __getstate__(self): state = Persistent.__getstate__(self) state['data'] = state['data'].items() return state def __setstate__(self, state): state['data'] = dict([ (k, v) for (k, v) in state['data'] if k() is not None ]) Persistent.__setstate__(self, state) def __setitem__(self, key, value): self.data[WeakRef(key)] = value def __getitem__(self, key): return self.data[WeakRef(key)] def __delitem__(self, key): del self.data[WeakRef(key)] def get(self, key, default=None): """D.get(k[, d]) -> D[k] if k in D, else d. 
>>> import ZODB.tests.util >>> key = ZODB.tests.util.P("key") >>> missing = ZODB.tests.util.P("missing") >>> d = PersistentWeakKeyDictionary([(key, 1)]) >>> d.get(key) 1 >>> d.get(missing) >>> d.get(missing, 12) 12 """ return self.data.get(WeakRef(key), default) def __contains__(self, key): return WeakRef(key) in self.data def __iter__(self): for k in self.data: yield k() def update(self, adict): if isinstance(adict, PersistentWeakKeyDictionary): self.data.update(adict.data) else: for k, v in adict.items(): self.data[WeakRef(k)] = v # TODO: May need more methods, and tests.
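# A compact usage sketch distilled from the doctests above.  It is illustrative
# only; it leans on the same test helpers the doctests use (ZODB.tests.util.DB,
# ZODB.tests.util.P) and is not part of the wref API.
if __name__ == '__main__':
    import ZODB.tests.util

    db = ZODB.tests.util.DB()
    conn = db.open()
    key = ZODB.tests.util.P('key')

    d = PersistentWeakKeyDictionary()
    d[key] = 1                        # stored under WeakRef(key)
    conn.root()['d'] = d
    conn.root()['key'] = key          # keep a strong reference in the database
    transaction.commit()

    print d.get(key)                  # -> 1 while the key is still reachable

    # Once the strong reference is removed and the database packed, a new
    # connection sees the entry disappear; see the doctests above for the
    # full demonstration of that lazy removal.
    db.close()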